Commun. Math. Phys. 194, 1 – 45 (1997)
Communications in
Mathematical Physics © Springer-Verlag 1998
Modified Prüfer and EFGP Transforms and the Spectral Analysis of One-Dimensional Schrödinger Operators

Alexander Kiselev^{1,*}, Yoram Last^{2}, Barry Simon^{2,**}

^1 Department of Mathematics, University of Chicago, Chicago, IL 60637, USA
^2 Division of Physics, Mathematics, and Astronomy, California Institute of Technology 253-37, Pasadena, CA 91125, USA
Received: 8 April 1997 / Accepted: 19 June 1997
Abstract: Using control of the growth of the transfer matrices, we discuss the spectral analysis of continuum and discrete half-line Schrödinger operators with slowly decaying potentials. Among our results we show if $V(x) = \sum_{n=1}^{\infty} a_n W(x - x_n)$, where $W$ has compact support and $x_n/x_{n+1} \to 0$, then $H$ has purely a.c. (resp. purely s.c.) spectrum on $(0,\infty)$ if $\sum a_n^2 < \infty$ (resp. $\sum a_n^2 = \infty$). For $\lambda n^{-1/2} a_n$ potentials, where $a_n$ are independent, identically distributed random variables with $E(a_n) = 0$, $E(a_n^2) = 1$, and $\lambda < 2$, we find singular continuous spectrum with explicitly computable fractional Hausdorff dimension.
1. Introduction

In this paper, we will study continuum and discrete Schrödinger operators on the half-line (while we don't always make them explicit, given theory in [10, 26, 32], many of our results extend to suitable whole-line problems). Explicitly, we are interested in the spectral analysis of operators $H$ on $L^2(0,\infty; dx)$ and on $\ell^2([1,\infty))$ given by

$$(Hu)(x) = \left(-\frac{d^2}{dx^2} + V(x)\right)u(x) \tag{1.1}$$

in the continuum case and

$$(Hu)(n) = u(n+1) + u(n-1) + V(n)u(n) \tag{1.2}$$

in the discrete case.

* Research supported in part by NSF Grant No. DMS-9022140.
** This material is based upon work supported by the National Science Foundation under Grant No. DMS-9401491. The Government has certain rights in this material.
Suitable boundary conditions are set at $x$ (or $n$) $= 0$, so that $H$ is self-adjoint (since in all our examples $V$ is limit point at infinity, a boundary condition is not needed there). We are interested in the spectral properties of such operators in situations where $|V(n)| \to 0$ as $n \to \infty$, but so slowly that the usual scattering theory will not apply.

Our main theme in this paper is that there are perturbation techniques of remarkable power for such operators based on two ideas. The first is that one can use the transfer matrix to analyze spectral properties. The transfer or fundamental matrix is a $2 \times 2$ unimodular matrix defined in the continuum case for any $E$ by

$$T_E(x,0)\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} u'(x) \\ u(x) \end{pmatrix}, \tag{1.3}$$

where $-u'' + Vu = Eu$, $u'(0) = a$, $u(0) = b$. In the discrete case

$$T_E(n,0)\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} u(n+1) \\ u(n) \end{pmatrix}, \tag{1.4}$$

where $u(1) = a$, $u(0) = b$, and $u(n+1) + u(n-1) + V(n)u(n) = Eu(n)$.

The second idea is that one can control the transfer matrix by controlling the norms of two solutions of $-u'' + Vu = Eu$ and that the critical equations are ones that involve those norms.

Two of us [22] have recently found several new criteria for singular or absolutely continuous spectra in terms of transfer matrices, and these criteria will make some of our results here possible. The perturbation equations we will use have not been systematically used in this context except in the paper of Kotani-Ushiroya [21] whose techniques have some overlap with this paper. But they didn't control the discrete case and their method is so entwined with certain martingale inequalities that it is unclear how to use them in other contexts. While we were writing up the work for this paper, we received a preprint from Remling [29] that uses this two-pronged approach and has considerable overlap with our Sects. 5 and 6. We will discuss the connection shortly.

Here are some of the theorems that we will use that relate spectral properties to behavior of $T(n)$. The first is from [22]:

Theorem 1.1. Suppose that there is a fixed sequence $n_i \to \infty$ and $S$ is a subset of $\mathbb{R}$ so that for a.e. $E \in S$, $\lim_{i\to\infty} \|T_E(n_i,0)\| = \infty$. Then $\mu_{ac}(S) = 0$, where $\mu_{ac}$ is the absolutely continuous part of the spectral measure for $H$.

Remarks. 1. The interesting aspect of this theorem is that $n_i$ is arbitrary. The result actually allows a more general sequence $\|T_E(n_i, m_i)\|$.
2. In typical applications, $S$ is an interval in the essential spectrum.

In the other direction, one has the following pair of results:

Theorem 1.2. Suppose $S$ is a set so that for a.e. $E \in S$, $\overline{\lim}_{x\to\infty} \|T_E(x,0)\| < \infty$. Then $\mu_{ac}(Q) > 0$ for any $Q \subset S$ with $|Q| > 0$, where $|\cdot|$ = Lebesgue measure.

Theorem 1.3. Suppose there is a sequence $n_i \to \infty$ so that $\int_a^b \|T_E(n_i,0)\|^4\,dE < \infty$. Then $(a,b) \subset \operatorname{spec}(H)$ and the spectral measure is purely absolutely continuous on $(a,b)$ and $\mu_{ac}(Q) > 0$ for any $Q$ with $|Q \cap (a,b)| > 0$.
Remarks. 1. That Theorem 1.2 is implied by the Gilbert-Pearson [11] theory was noted by Stolz [36]. A simple proof can be found in [33]. Last-Simon [22] prove a stronger variant in which $\|T_E(x,0)\|$ is replaced by $\int_{-1}^{1} \|T_E(x+y,0)\|\,dy$ and $\overline{\lim}$ by $\underline{\lim}$. (The discrete analog holds with $\underline{\lim}$ and without any local integration.)
2. Theorem 1.3 is from Last-Simon [22] although the method used there is not much more than what is in Carmona [1].

As to distinguishing dense pure point from singular continuous spectrum, in one direction we have the following elementary result from Simon-Stolz [35].

Theorem 1.4. If $\sum_n \|T_E(n,0)\|^{-2} = \infty$ in the discrete case or $\int_0^\infty \|T_E(x,0)\|^{-2}\,dx = \infty$ in the continuum case, then $Hu = Eu$ has no solution which is $L^2$ at infinity.

The paradigm of results that guarantees a solution $L^2$ at $\infty$ is Ruelle's proof [30] of Oseledec's theorem. His argument is abstracted in [22]. We will need the following in Sect. 8:

Theorem 1.5. If $\lim_{n\to\infty} [\log \|T_E(n,0)\| / n^\alpha]$ exists and lies in $(0,\infty)$ for some $\alpha > 0$, then $Hu = Eu$ has an $L^2$ solution.

[22] also has a general abstract result on power decay which, to get an $L^2$ solution, requires

$$\lim_{n\to\infty} \frac{\log \|T_E(n,0)\|}{\log n} > \frac{3}{2}.$$

[22] also has an example where the limit is $\frac{3}{2}$ and there is no $\ell^2$ solution. But there are stronger results that hold a.e. in certain probabilistic situations, so we won't discuss the power decay result here. In Sect. 8, we will discuss the probabilistic result.

As for the technique to control the growth of solutions, in the continuum case we will use modified Prüfer variables defined for $E > 0$ by

$$u'(x) = \sqrt{E}\, R(x) \cos(\theta(x)), \tag{1.5a}$$
$$u(x) = R(x) \sin(\theta(x)). \tag{1.5b}$$

One finds these obey the differential equations (with $k = \sqrt{E}$)

$$\frac{d\theta}{dx} = k - \frac{V(x)}{k} \sin^2(\theta(x)), \tag{1.6}$$
$$\frac{d \log R(x)}{dx} = \frac{1}{2k} V(x) \sin(2\theta(x)). \tag{1.7}$$

Two features of these equations are immediately noteworthy:
(a) They separate in the sense that (1.6) does not involve $R$ and after solving it, one obtains $R$ by integration. That $R$ drops out of (1.6) and the right side of (1.7) is an expression of the linearity of the initial equations.
(b) If $V = 0$ in some region $(a,b)$, then in that region $R$ is constant and $\theta(x) = \theta(a) + k(x-a)$. It is this fact that leads one to take the factor $\sqrt{E}$ in (1.5a). The addition of this $\sqrt{E}$ is what distinguishes this from ordinary Prüfer transformations.
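The relation between (1.5) and (1.6), (1.7) can be checked numerically. The following is only a sketch under stated assumptions: the bump potential `V`, the parameter values, and the hand-written Runge-Kutta stepper are illustrative choices that do not come from the paper. It integrates $-u'' + Vu = k^2 u$ directly and, separately, the pair (1.6), (1.7), and then compares $R$ and $u$ at the right endpoint.

```python
import math

def V(x):
    # Hypothetical test potential: one smooth bump supported on (2, 4), zero elsewhere.
    return 0.5 * math.sin(math.pi * (x - 2.0) / 2.0) ** 2 if 2.0 < x < 4.0 else 0.0

def rk4_step(f, x, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(x, y), y a list."""
    k1 = f(x, y)
    k2 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
    k3 = f(x + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
    k4 = f(x + h, [a + h * b for a, b in zip(y, k3)])
    return [a + h / 6 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def compare(k=1.3, theta0=0.4, x_end=6.0, steps=6000):
    """Return (R from u, R from (1.7), u direct, R sin(theta)) at x_end."""
    h = x_end / steps

    def schrodinger(x, y):            # y = [u, u'], with u'' = (V - k^2) u
        return [y[1], (V(x) - k * k) * y[0]]

    def prufer(x, y):                 # y = [theta, log R], equations (1.6), (1.7)
        return [k - V(x) / k * math.sin(y[0]) ** 2,
                V(x) / (2 * k) * math.sin(2 * y[0])]

    y = [math.sin(theta0), k * math.cos(theta0)]   # (1.5) with R(0) = 1
    z = [theta0, 0.0]
    for i in range(steps):
        x = i * h
        y = rk4_step(schrodinger, x, y, h)
        z = rk4_step(prufer, x, z, h)
    r_direct = math.hypot(y[0], y[1] / k)          # R^2 = u^2 + (u'/k)^2
    r_prufer = math.exp(z[1])
    return r_direct, r_prufer, y[0], r_prufer * math.sin(z[0])
```

Calling `compare()` returns two values of $R$ and two values of $u$ that should agree up to discretization error; outside the support of the bump, $R$ stays constant, as in feature (b).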
There is a third significant feature which we will turn to momentarily. Given how common these continuum equations are, we would have expected their discrete analogs would have been rediscovered and used many times, but even after some efforts at tracking various literature, we've found them in a single chain of four papers (and we are a fifth in this chain, since we learned of them from the fourth paper!).

The original discoverer of the correct equation was Thomas Eggarter [9] in 1971. He was not looking at an explicit difference equation but rather a continuum equation with $V(x) = V_0 \sum_{i=1}^{n} \delta(x - x_i)$. By integrating modified Prüfer variables across the $\delta$-functions, he was led to the transforms ($E = 2\cos(k)$),

$$u(n) - \cos(k)u(n-1) = R(n)\cos(\theta(n)), \tag{1.8a}$$
$$\sin(k)u(n-1) = R(n)\sin(\theta(n)), \tag{1.8b}$$

in which case we have, after some calculation (see Sect. 2),

$$\cot(\theta(n+1)) = \cot(k + \theta(n)) - (\sin(k))^{-1} V(n), \tag{1.9}$$
$$\frac{R(n+1)^2}{R(n)^2} = 1 - \frac{V(n)}{\sin(k)} \sin(2\theta(n) + 2k) + \frac{V(n)^2}{\sin^2(k)} \sin^2(\theta(n) + k). \tag{1.10}$$
Actually, he had only an equation of the form (1.9). The definition of $\theta(n)$ and the precise form of (1.9) are in a 1975 paper of Gredeskul-Pastur [13], who followed up on Eggarter's work. [9, 13] focus on (1.9) because they use the transform to study the integrated density of states. Pastur-Figotin [26] defined $R$ and exploited (1.10) to study the Lyapunov exponent. In recognition of these seminal works, we call (1.8) the EFGP transform. Their approach was further exploited in Chulaevsky-Spencer [2]. It will often be useful to use an equivalent form of (1.9) that appears as (2.14).

Notice that (1.9), (1.10) have the two critical properties (a), (b) mentioned for (1.6), (1.7) in the continuum case. In particular, if $V(n) = 0$ for $n$ in some interval $[n_0, n_1]$, then in that region $R(n)$ is constant and $\theta(n) = \theta(n_0) + k(n - n_0)$. While the EFGP transform was obtained by integrating a continuum $\delta$-function model, it could also be found by looking for a transform with property (b). We will explain this in Sect. 2.

[9, 13, 26, 2] all consider $V$'s with no decay as $n \to \infty$ but with a small coupling so that any calculations are only asymptotic in the coupling constant. It turns out that the methods are especially well suited when $V(n) \to 0$ at infinity and one obtains results that are exact for a fixed $V$. For example, in Sect. 8, we will find exact formulas for the local Hausdorff dimensions of certain singular continuous spectral measures.

The third critical feature of the modified Prüfer and EFGP transforms is a major theme of this paper, namely, that first-order terms in $V$ are oscillatory while the second-order term has a strong tendency to be strictly positive. This idea is already seen in [26, 2], where $\gamma(E)$ is $O(g^2)$ with $g$ a coupling constant because the first-order terms drop out.

Let us be explicit about this idea. In (1.6), one might think the positivity comes via the square in $\sin^2(\theta(x))$ but that is wrong! Indeed, in writing $\sin^2(\theta) = \frac{1}{2} - \frac{1}{2}\cos(2\theta)$, it is the $\cos(2\theta)$ that is critical! Formally, (1.6) says

$$\theta(x) = kx + \theta_0 - \frac{1}{k}\int_{x_0}^{x} V(y) \sin^2(ky + \theta_0)\,dy + O(V^2) \equiv kx + \theta_0 + \delta\theta + O(V^2),$$

and then using $\sin(2\theta) = \sin(2kx + 2\theta_0) + 2\cos(2kx + 2\theta_0)\,\delta\theta + O(V^2)$, we get

$$\frac{d \log R}{dx} = t_1 + t_2 + O(V^3),$$

where

$$t_1 = \frac{1}{2k} V(x) \sin(2(kx + \theta_0)) - \frac{1}{2k^2} V(x) \left[\int_{x_0}^{x} V(y)\,dy\right] \cos(2(kx + \theta_0))$$

is the oscillatory term that is often unimportant, while

$$t_2 = \frac{1}{4k^2} \frac{d}{dx} \left[\int_{x_0}^{x} V(y) \cos(2ky + 2\theta_0)\,dy\right]^2$$
has a positive integral, second order in $V$. In explicit cases, it is more subtle to prove the second order is strictly positive and, indeed, for examples like $V(x) = x^{-\alpha}$, $\alpha < \frac{1}{2}$, where the spectrum is absolutely continuous (by Weidmann [37]), the second-order terms do not cause divergences. This means that results that depend on a finite second-order term should hold more generally than those that depend on an infinite second-order term. Indeed, we make the following conjecture.

Conjecture. If $V$ is bounded and in $L^2(\mathbb{R}, dx)$ (or $\ell^2(\mathbb{Z}_+)$), then the essential support of the a.c. part of the spectrum is all of $(0,\infty)$ (or $(-2,2)$ in the discrete case).

Our idea is that for almost all (but not all; see, e.g., [24, 25, 34]) $k$, the oscillations should kill the first-order term, and so the $L^2$ condition should suffice to give a bounded transfer matrix for a.e. $k$ and so the stated conclusion about the a.c. spectrum by Theorem 1.2.

After discussing the modified Prüfer and EFGP transforms and their relation to the growth of the transfer matrix in Sect. 2, we turn to two warm-up problems in Sects. 3 and 4. In Sect. 3, we show these transforms can replace the Harris-Lutz [15] method in many cases where that method is applicable. In Sect. 4, we look at potentials $V$ with $\overline{\lim}\, x|V(x)|$ finite and show that for such potentials their positive eigenvalues can only coalesce at $E = 0$. Since examples are known with countably many eigenvalues embedded in $(0,\infty)$, this result is interesting.

In Sects. 5–7, we study sparse potentials.

Definition. A Pearson potential is one of the form

$$V(x) = \sum_{n=1}^{\infty} a_n W(x - x_n), \tag{1.11}$$

where $W$ is a bounded, non-negative function of compact support, $a_n \to 0$, and

$$1 \le x_1 < x_2 < x_3 < \cdots, \qquad \frac{x_n}{x_{n+1}} \to 0. \tag{1.12}$$
The name is in honor of David Pearson who considered potentials of the form (1.11) where $\sum_{n=1}^{\infty} a_n^2 = \infty$ and $x_n$ went to infinity sufficiently fast. To make things precise, think of the example $x_n = n!$. Our major goal in Sects. 5–6 is to prove the following:

Theorem 1.6. Let $V$ be a Pearson potential. Then
(1) If $\sum_{n=1}^{\infty} a_n^2 < \infty$, the spectrum of $-\frac{d^2}{dx^2} + V(x)$ is purely absolutely continuous on $(0,\infty)$ for any boundary condition at 0.
(2) If $\sum_{n=1}^{\infty} a_n^2 = \infty$, the spectrum of $-\frac{d^2}{dx^2} + V(x)$ is purely singular continuous on $(0,\infty)$ for any boundary condition at 0.

In Sect. 5, we will actually prove a stronger version of (1):

Theorem 1.6′. Let $V$ have the form (1.11) where

$$\overline{\lim_{n\to\infty}}\, \frac{x_n}{x_{n+1}} < 1. \tag{1.13}$$
Then (1) holds.

Pearson [27, 28] proved a weak version of (2) in that if $\sum_{n=1}^{\infty} a_n^2 = \infty$, there exists some set of $x_n$'s so that the spectrum is purely singular continuous. In [27], there are hints that a result of type (1) (again with $x_n$ sufficiently large) should hold, but nothing explicit.

As noted at the end of Sect. 5, for (1) the bumps $W(x - x_n)$ can be $n$-dependent. At the end of Sect. 6, for $[a,b] \equiv S \subset (0,\infty)$, we construct Pearson-like potentials (bumps whose width grows with $n$) so that there is purely a.c. spectrum on $S$ and purely s.c. spectrum on $(0,\infty) \setminus S$.

In a recent paper, coincident with our work, Remling [29] obtained results related to Theorem 1.6(1) using similar methods. He only obtains the existence of absolutely continuous spectrum (his results are consistent with simultaneous singular continuous spectrum while we prove there is none), and he needs at least $\exp(\frac{3}{2} n \log n)$ growth on the $x_n$ (whereas, if $f(n)$ is a monotone function with $f(n) \to \infty$ no matter how slowly, then $x_n = \exp(n f(n))$ obeys (1.12) and $x_n = \exp(an)$ obeys (1.13)).

After this manuscript was completed, we obtained a preliminary version of a preprint of Molchanov [23] with considerable overlap with our results in Sects. 5 and 6.

In Sect. 7, we will prove

Theorem 1.7. Let $x_n \in \mathbb{Z}$ obey $x_n/x_{n+1} \to 0$. Let $V$ be the potential with

$$V(x_n) = a_n, \qquad V(x) = 0 \quad \text{if } x \ne x_n \text{ for any } n.$$

Then,
(1) If $\sum a_n^2 < \infty$, the discrete Schrödinger operator with potential $V$ has purely a.c. spectrum on $(-2,2)$.
(2) If $\sum a_n^2 = \infty$, the operator has purely singular continuous spectrum on $(-2,2)$.
In Sects. 8 and 9, we discuss models with randomness and decay, first studied by Simon [31] and then by Delyon et al. [7], Delyon [6], and Kotani-Ushiroya [21]. Typical of the models discussed in these sections is ($g$ is a positive constant) $V(n) = g n^{-\alpha} a_n$, where the $a_n$ are independent, identically distributed random variables, uniformly distributed in $[-1,1]$. We prove
(i) If $\alpha > \frac{1}{2}$, the spectrum is almost surely purely absolutely continuous in $(-2,2)$.
(ii) If $0 < \alpha < \frac{1}{2}$, the spectrum is almost surely dense pure point in $(-2,2)$.
(iii) If $\alpha = \frac{1}{2}$, the spectrum is almost surely purely singular continuous in the region $|E| < (4 - \frac{1}{3}g^2)^{1/2}$ and dense pure point in the region $(4 - \frac{1}{3}g^2)^{1/2} \le |E| < 2$ (if $g^2 > 12$, interpret $(4 - \frac{1}{3}g^2)^{1/2}$ as 0).
(iv) In case $\alpha = \frac{1}{2}$ and $g^2 < 12$, in the region $|E| < (4 - \frac{1}{3}g^2)^{1/2}$, the spectrum has fractional Hausdorff dimension with local dimension $(4 - E^2 - \frac{g^2}{3})/(4 - E^2)$.

Section 8 handles the discrete case, and Sect. 9 the continuum case. For sparse potentials, we give the details in the continuum case and sketch the discrete case; while for random decaying potentials, we give details in the discrete case and sketch the continuum case.

A.K. would like to thank I.H.E.S. for its hospitality, and B.S. would like to thank Hebrew University for its hospitality, where some of this work was done.
2. Modified Prüfer and EFGP Transforms

We will be interested in solutions of

$$-u''(x) + V(x)u(x) = k^2 u(x). \tag{2.1}$$

Change variables to

$$u'(x) = kR(x)\cos(\theta(x)), \tag{2.2a}$$
$$u(x) = R(x)\sin(\theta(x)). \tag{2.2b}$$

These are called modified Prüfer variables. The $2\pi$ ambiguity in $\theta$ is fixed by choosing $\theta(0) \in [0, 2\pi)$ and demanding $\theta(x)$ be continuous in $x$. Then a straightforward calculation shows (2.1) is equivalent to the pair of equations

$$\frac{d\theta}{dx} = k - \frac{V(x)}{k}\sin^2(\theta(x)), \tag{2.3}$$
$$\frac{d(\log R)(x)}{dx} = \frac{1}{2k}V(x)\sin(2\theta(x)). \tag{2.4}$$

This change of variables is so very useful because if $V = 0$, then $\theta(x) = \theta_0 + kx$, $R(x) = R_0$. We will be able to study $V$ as a perturbation about this solution.

As explained in the introduction, one needs to study the asymptotic behavior of the norm of the transfer matrix $T(x,0)$. For any $\theta_0$ in $[0,\pi)$, let $\theta(x,\theta_0)$ solve (2.3) with initial condition $\theta(0) = \theta_0$. Then let $R(x,\theta_0)$ solve (2.4) with $R(0,\theta_0) = 1$. Then we have:
Theorem 2.1. For any $\alpha, \beta \in (0,\infty)$ and $\theta_1 \ne \theta_2$, there exist non-zero, finite constants $C_1$ and $C_2$ (independent of $x$ and $V$) so that

$$C_1 \max(R(x,\theta_1), R(x,\theta_2)) \le \|T(x,0)\| \le C_2 \max(R(x,\theta_1), R(x,\theta_2)) \tag{2.5}$$

for all $k \in (\alpha, \beta)$.

Proof. Define $\|(a,b)\|_k^2 = (ka)^2 + b^2$. Then $\min(1,k)\|(a,b)\| \le \|(a,b)\|_k \le \max(1,k)\|(a,b)\|$. So defining operator norms in terms of $\|\cdot\|_k$, we see

$$\min(k, k^{-1})\|T(x,0)\|_k \le \|T(x,0)\| \le \max(k, k^{-1})\|T(x,0)\|_k,$$

so it suffices to prove (2.5) with $\|\cdot\|_k$ rather than $\|\cdot\|$. But $\|T(x,0)\|_k \ge \max(R(x,\theta_1), R(x,\theta_2))$ is trivial and

$$\|T(x,0)\|_k \le \{\min[\sin(\tfrac{1}{2}|\theta_1 - \theta_2|), \cos(\tfrac{1}{2}|\theta_1 - \theta_2|)]\}^{-1} \max(R(x,\theta_1), R(x,\theta_2))$$

by the lemma below.

If $|\theta_1 - \theta_2| \le \frac{\pi}{2}$ (which can be arranged by replacing $\theta_1$ by $\pi + \theta_1$, if need be), then this proof shows we can take $C_1 = \min(\alpha, \beta^{-1})$, $C_2 = \max(\beta, \alpha^{-1})[\sin(\frac{1}{2}|\theta_1 - \theta_2|)]^{-1}$.

Lemma 2.2. Let $A$ be a unimodular matrix. Let $u_\theta = (\cos(\theta), \sin(\theta))$. Then if $|\theta_1 - \theta_2| \le \frac{\pi}{2}$,

$$\|A\| \le \sin(\tfrac{1}{2}|\theta_1 - \theta_2|)^{-1} \max(\|Au_{\theta_1}\|, \|Au_{\theta_2}\|).$$

Proof. There exists $\theta_0$ so that $\|Au_\theta\|^2 \ge \|A\|^2 \sin^2(\theta - \theta_0)$. If $|\theta_1 - \theta_2| \le \frac{\pi}{2}$, for any $\theta_0$ at least one of $|\sin(\theta_0 - \theta_i)|$ is larger than or equal to $|\sin(\frac{1}{2}(\theta_1 - \theta_2))|$.

Remark. One might worry that the lemma involves $\|\cdot\|$ and not $\|\cdot\|_k$ but

$$\|A\|_k = \left\| \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix} A \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}^{-1} \right\|$$

and this product is also unimodular.

For the discrete case, we are interested in solutions of ($0 \le k \le \pi$)

$$u(n+1) + u(n-1) + V(n)u(n) = 2\cos(k)u(n). \tag{2.6}$$
EFGP variables $R(n)$, $\theta(n)$ are defined by

$$R(n)\cos(\theta(n)) = u(n) - \cos(k)u(n-1), \tag{2.7a}$$
$$R(n)\sin(\theta(n)) = \sin(k)u(n-1). \tag{2.7b}$$

A priori $\theta(n)$ is only determined mod $(2\pi)$. We will fix this ambiguity later. Noticing that

$$R(n)\sin(k + \theta(n)) = \sin(k)u(n), \tag{2.8}$$

we have

$$\frac{u(n)}{u(n-1)} = \frac{\sin(k + \theta(n))}{\sin(\theta(n))}. \tag{2.9}$$

Similarly,

$$R(n)\cos(k + \theta(n)) = \cos(k)u(n) - u(n-1). \tag{2.10}$$

Thus,

$$\cot(k + \theta(n)) = \frac{\cos(k)u(n) - u(n-1)}{\sin(k)u(n)},$$

while by definition,

$$-\cot(\theta(n+1)) = \frac{\cos(k)u(n) - u(n+1)}{\sin(k)u(n)}.$$

Thus, (2.6) is equivalent to

$$\cot(\theta(n+1)) = \cot(k + \theta(n)) - \frac{V(n)}{\sin(k)}. \tag{2.11}$$

Writing $\bar\theta(n) \equiv \theta(n) + k$, we see, using first (2.7) and then (2.8)/(2.10):

$$\begin{aligned}
R(n+1)^2 &= \sin^2(k)u(n)^2 + (u(n+1) - \cos(k)u(n))^2 \\
&= \sin^2(k)u(n)^2 + (u(n-1) - \cos(k)u(n) + V(n)u(n))^2 \\
&= R(n)^2\sin^2(\bar\theta(n)) + R(n)^2\left(\cos(\bar\theta(n)) - \frac{V(n)}{\sin(k)}\sin(\bar\theta(n))\right)^2 \\
&= R(n)^2\left[1 - \frac{V(n)}{\sin(k)}\sin(2\bar\theta(n)) + \frac{V(n)^2}{\sin^2(k)}\sin^2(\bar\theta(n))\right].
\end{aligned}$$

We can summarize with the EFGP equations:

$$\nu_k(n) \equiv -\frac{V(n)}{\sin(k)}; \qquad \bar\theta(n) = \theta(n) + k, \tag{2.12a}$$
$$\cot(\theta(n+1)) = \cot(\bar\theta(n)) + \nu_k(n), \tag{2.12b}$$
$$\frac{R(n+1)^2}{R(n)^2} = 1 + \nu_k(n)\sin(2\bar\theta(n)) + \nu_k(n)^2\sin^2(\bar\theta(n)). \tag{2.12c}$$

We will fix the ambiguity in $\theta$ by demanding $\theta(n+1) - \bar\theta(n) \in [-\pi, \pi)$. Equations (2.12) can be regarded as analogs of the modified Prüfer equations in that if $V = 0$, then $R(n)$ = constant and $\theta(n) = \theta(0) + kn$.

As noted in the introduction, Eggarter arrived at the first version of the EFGP transform by looking at continuum models with $\delta$-function potential ((2.12b) is especially transparent in this mode). But one could have arrived at it by noting that when $V(n) \equiv 0$, the transfer matrix is powers of $\begin{pmatrix} 2\cos(k) & -1 \\ 1 & 0 \end{pmatrix}$. This matrix has eigenvalues $e^{\pm ik}$ and so it must be similar to $\begin{pmatrix} \cos(k) & \sin(k) \\ -\sin(k) & \cos(k) \end{pmatrix}$. That similarity transformation will make the powers simple. Indeed,

$$\begin{pmatrix} 0 & \sin(k) \\ 1 & -\cos(k) \end{pmatrix} \begin{pmatrix} 2\cos(k) & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} \cos(k) & \sin(k) \\ -\sin(k) & \cos(k) \end{pmatrix} \begin{pmatrix} 0 & \sin(k) \\ 1 & -\cos(k) \end{pmatrix}$$

so the transform (2.7) precisely realizes the similarity.

There is an analog of Theorem 2.1. Define $R(n,\theta)$ by requiring $R(1) = 1$, $\theta(1) = \theta$ in $[0,\pi)$. Then we have:
Theorem 2.3. For any $\alpha \in (0, \frac{\pi}{2})$ and $\theta_1 \ne \theta_2$, there exist non-zero, finite constants $C_1$ and $C_2$ (independent of $n$ and $V$) so that for all $k \in (\alpha, \pi - \alpha)$,

$$C_1 \max(R(n,\theta_1), R(n,\theta_2)) \le \|T(n-1,0)\| \le C_2 \max(R(n,\theta_1), R(n,\theta_2)). \tag{2.13}$$
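The EFGP equations of this section can be checked against a direct iteration of the difference equation (2.6). The sketch below is illustrative only: the value of $k$, the short potential `V`, and the initial data `u0`, `u1` are arbitrary test choices, not taken from the paper. It computes $R(n)$ both from the recursion for $R(n+1)^2/R(n)^2$ and from the definition (2.7), compares the two sides of the cotangent recursion, and checks the matrix similarity stated above for $V \equiv 0$.

```python
import math

def efgp_check(k=1.1, V=(0.3, -0.2, 0.0, 0.5, 0.1), u0=0.7, u1=-0.4):
    """Return rows (R via recursion, R via (2.7), cot lhs, cot rhs) for n = 1..len(V)."""
    c, s = math.cos(k), math.sin(k)
    # Direct iteration of (2.6); V[j] plays the role of V(j + 1).
    u = [u0, u1]
    for v in V:
        u.append((2 * c - v) * u[-1] - u[-2])

    def polar(n):
        # Definition (2.7) of R(n) and theta(n).
        x = u[n] - c * u[n - 1]
        y = s * u[n - 1]
        return math.hypot(x, y), math.atan2(y, x)

    R, _ = polar(1)
    rows = []
    for j, v in enumerate(V):
        n = j + 1
        _, theta = polar(n)
        tb = theta + k                      # theta-bar(n)
        nu = -v / s                         # nu_k(n)
        R *= math.sqrt(1 + nu * math.sin(2 * tb) + (nu * math.sin(tb)) ** 2)
        R_direct, theta_next = polar(n + 1)
        cot_lhs = math.cos(theta_next) / math.sin(theta_next)
        cot_rhs = math.cos(tb) / math.sin(tb) + nu
        rows.append((R, R_direct, cot_lhs, cot_rhs))
    return rows

def similarity_residual(k=1.1):
    """Largest entry of S M - Rot S for the free (V = 0) transfer matrix M."""
    c, s = math.cos(k), math.sin(k)
    S = [[0.0, s], [1.0, -c]]
    M = [[2 * c, -1.0], [1.0, 0.0]]
    Rot = [[c, s], [-s, c]]

    def mul(A, B):
        return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
                for i in range(2)]

    left, right = mul(S, M), mul(Rot, S)
    return max(abs(left[i][j] - right[i][j]) for i in range(2) for j in range(2))
```

In each row returned by `efgp_check()`, the first two entries and the last two entries should agree up to rounding. The angle is recomputed from (2.7) at every step, which sidesteps the branch choice for $\theta$.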
Because of the arccot, (2.12b) is somewhat awkward to deal with. Pastur-Figotin [26] have noted an equivalent form of (2.12b) which is straightforward from $e^{2i\varphi} = 1 - \frac{2}{1 + i\cot(\varphi)}$, viz.,

$$e^{2i\theta(n+1)} = e^{2i\bar\theta(n)} + \frac{i\nu_k(n)}{2} \, \frac{(e^{2i\bar\theta(n)} - 1)^2}{1 - \frac{i\nu_k(n)}{2}(e^{2i\bar\theta(n)} - 1)}. \tag{2.14}$$

As an application of (2.14) we have

Proposition 2.4. If $|\nu_k(n)| < \frac{1}{2}$, then

$$|\theta(n+1) - \bar\theta(n)| \le \pi|\nu_k(n)|. \tag{2.15}$$

Proof. If $|\nu_k(n)| < \frac{1}{2}$, then (2.14) implies that

$$|e^{2i\theta(n+1)} - e^{2i\bar\theta(n)}| \le \frac{|\nu_k(n)|}{2} \cdot \frac{4}{1/2} = 4|\nu_k(n)|.$$

Since $|e^{i\eta} - 1| \ge \frac{2|\eta|}{\pi}$, we get

$$|\theta(n+1) - \bar\theta(n)| \le \frac{\pi}{4} |e^{2i\theta(n+1)} - e^{2i\bar\theta(n)}|,$$

and so the claimed result.
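A quick numerical sanity check of the exponential form of the angle recursion and of the bound in Proposition 2.4 is sketched below. It rests on an assumption spelled out in the code: the increment $\theta(n+1) - \bar\theta(n)$ is computed from the relations $R(n+1)\sin\theta(n+1) = R(n)\sin\bar\theta(n)$ and $R(n+1)\cos\theta(n+1) = R(n)[\cos\bar\theta(n) + \nu_k(n)\sin\bar\theta(n)]$, which follow from (2.6), (2.7), (2.8) and (2.10). The grids of test values are arbitrary.

```python
import cmath
import math

def efgp_angle_step(theta_bar, nu):
    """Return delta = theta(n+1) - theta_bar(n), with theta(n+1) determined by
    R(n+1) sin(theta(n+1)) = R(n) sin(theta_bar) and
    R(n+1) cos(theta(n+1)) = R(n) (cos(theta_bar) + nu sin(theta_bar))."""
    w = complex(math.cos(theta_bar) + nu * math.sin(theta_bar), math.sin(theta_bar))
    # Rotating back by theta_bar leaves only the increment as the phase.
    return cmath.phase(w * cmath.exp(-1j * theta_bar))

def check(nu_values, theta_values):
    """Return (largest residual of the identity (2.14), whether (2.15) held everywhere)."""
    worst_identity, bound_ok = 0.0, True
    for nu in nu_values:
        for tb in theta_values:
            delta = efgp_angle_step(tb, nu)
            z = cmath.exp(2j * tb)
            lhs = cmath.exp(2j * (tb + delta))
            rhs = z + (1j * nu / 2) * (z - 1) ** 2 / (1 - (1j * nu / 2) * (z - 1))
            worst_identity = max(worst_identity, abs(lhs - rhs))
            bound_ok = bound_ok and abs(delta) <= math.pi * abs(nu) + 1e-15
    return worst_identity, bound_ok
```

For example, `check([-0.49 + 0.07 * j for j in range(15)], [0.05 + 0.1 * j for j in range(63)])` scans couplings with $|\nu| < \frac{1}{2}$ over roughly one period of $\bar\theta$.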
Note. Kiselev, Remling, and Simon [20] present a way of defining $R$, $\theta$ that makes the analogy to the continuum case transparent, makes (2.14) transparent, improves (2.15), and extends to more general $h_0$.

3. Conditional Integrals and A.C. Spectrum

It follows from [11, 16, 17] that for both continuum and discrete Schrödinger operators on $[0,\infty)$, we have (see also [33] for a quick proof):

Proposition 3.1. If $S$ is a set of reals so that for each $\lambda \in S$, $\sup_x \|T_\lambda(x,0)\| < \infty$, then $H$ has purely a.c. spectrum on $S$ in the sense that
(i) For any boundary condition $\theta$ and any $T \subset S$ with $|T| > 0$, we have $\rho_\theta^{ac}(T) > 0$.
(ii) For any boundary condition $\theta$, $\rho_\theta^{sing}(S) = 0$.

Thus, bounded transfer matrices have important spectral consequences. By Theorems 2.1 and 2.3, if we can show $R(\,\cdot\,,\theta)$ remains bounded for two initial $\theta$'s, we have boundedness of $T$. From this and (2.4), (2.12c), one easily obtains the well-known result that if $\int |V(x)|\,dx < \infty$ (resp. $\sum |V(n)| < \infty$), then the spectrum is purely a.c. in $(0,\infty)$ (resp. $(-2,2)$). Here is a result allowing more general decay, first in the continuum case.
Theorem 3.2. Fix $k \ne 0$. Suppose that $\lim_{\beta\to\infty} \int_x^\beta V(y)e^{2iky}\,dy$ exists and that

$$W_k(x) = \int_x^\infty V(y)e^{2iky}\,dy \tag{3.1}$$

obeys

$$\int |V(x)W_k(x)|\,dx < \infty. \tag{3.2}$$

Then

$$\overline{\lim_{x\to\infty}}\, \|T(x,0)\| < \infty. \tag{3.3}$$

Remarks. 1. This result is not new; it is essentially due to Harris-Lutz [15]. This is a new proof.
2. This result implies that if $V(x) = \sum_{m=1}^{N} a_m \sin(k_m x)/x^\beta$, $\beta > \frac{1}{2}$, and $k \ne \pm\frac{1}{2}k_m$ for any $m$, then (3.3) holds, and so by Proposition 3.1, the spectrum is purely a.c. except for possible positive eigenvalues in $\{\frac{1}{4}k_m^2\}$.
3. In [19], Kiselev proved that if $V(x) = O(x^{-\frac{3}{4} - \epsilon})$, then (3.2) holds off a set of Lebesgue measure zero.

Proof. We will show for any $\theta_0$, $R(x,\theta_0)$ is bounded, and then one can appeal to Theorem 2.1 to complete the proof of (3.3). Write $\theta(x) = kx + \varphi(x)$, so by (2.3), $\varphi$ obeys

$$\frac{d\varphi}{dx} = -\frac{V(x)}{k}\sin^2(kx + \varphi). \tag{3.4}$$

Since $V(x)e^{2ikx} = -\frac{dW_k}{dx}$ by (3.1), (2.4) (and $R(0) = 1$) gives

$$\begin{aligned}
\log R(x) &= -\frac{1}{2k} \operatorname{Im} \int_0^x \frac{dW_k}{dy}\, e^{2i\varphi(y)}\,dy \\
&= -\frac{1}{2k} \operatorname{Im} \left\{ [W_k(x)e^{2i\varphi(x)} - W_k(0)e^{2i\theta_0}] - 2i \int_0^x \frac{d\varphi}{dy}\, e^{2i\varphi(y)} W_k(y)\,dy \right\}
\end{aligned}$$

if we integrate by parts. By hypothesis, $W_k(x)$ is bounded, so using (3.4),

$$|\log R(x)| \le \text{bdd} + \frac{1}{k^2} \int_0^x |W_k(y)V(y)|\,dy$$

is bounded by (3.2).

Remark. A similar argument proves that

$$\lim_{x\to\infty} \left[\theta(x) - \left(kx - \frac{1}{2k}\int_0^x V(y)\,dy\right)\right]$$

exists. This in turn lets one prove there are complex solutions $\eta_\pm(k,x)$ with

$$\eta_\pm(k,x) \exp\left(\mp i\left(kx - \frac{1}{2k}\int_0^x V(y)\,dy\right)\right) \to 1,$$
$$\eta_\pm'(k,x) \exp\left(\mp i\left(kx - \frac{1}{2k}\int_0^x V(y)\,dy\right)\right) \to \pm ik.$$
Notice that if $V \in L^2$,

$$kx - \frac{1}{2k}\int_0^x V(y)\,dy = \int_0^x \sqrt{k^2 - V(y)}\,dy + Q(x),$$

where $\lim_{x\to\infty} Q(x)$ exists. So if $V \in L^2$, this says that WKB-type solutions exist. This is also what the Harris-Lutz method gives [19].

We are heading toward a proof of

Theorem 3.3. Fix $k \ne 0, \pi$. Suppose $V(n)$ is a discrete potential for which

$$\lim_{B\to\infty} \sum_{m=n}^{B} V(m)e^{2ikm} = W_k(n)$$

exists and that

$$\sum_{n=1}^{\infty} |V(n)W_k(n)| + |V(n)W_k(n+1)| < \infty. \tag{3.5}$$

Then

$$\overline{\lim_{n\to\infty}}\, \|T(n,0)\| < \infty.$$
Given a function $f$ on $\{1, 2, \ldots\}$, define $(\delta f)(n) = f(n+1) - f(n)$ and note that summation by parts takes the form

$$\sum_{m=a}^{b} g(m)(\delta f)(m) = -\sum_{m=a}^{b} f(m+1)(\delta g)(m) + (fg)(b+1) - (fg)(a).$$
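The summation by parts formula above is easy to confirm on concrete sequences. The following sketch evaluates both sides for arbitrary test functions; the names `delta` and `both_sides` are hypothetical helpers introduced here for illustration.

```python
import math

def delta(f):
    """Forward difference: (delta f)(n) = f(n + 1) - f(n)."""
    return lambda n: f(n + 1) - f(n)

def both_sides(f, g, a, b):
    """Evaluate the two sides of the summation by parts identity on [a, b]."""
    lhs = sum(g(m) * delta(f)(m) for m in range(a, b + 1))
    rhs = (-sum(f(m + 1) * delta(g)(m) for m in range(a, b + 1))
           + f(b + 1) * g(b + 1) - f(a) * g(a))
    return lhs, rhs
```

For instance, `both_sides(math.sin, lambda n: 1.0 / n, 1, 20)` returns two numbers that agree up to rounding.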
Lemma 3.4. If (3.5) holds for some $k$, then $\sum_{n=1}^{\infty} |V(n)|^2 < \infty$.

Proof. Since $W$ exists, $V \to 0$ at $\infty$ and so $V$ is bounded. Thus, writing $V(n) = -e^{-2ikn}(\delta W_k)(n)$, and summing by parts,

$$\sum_{n=1}^{B} V(n)^2 = \text{bdd} + \sum_{n=2}^{B+1} V(n)W_k(n)e^{-2ikn} - \sum_{n=1}^{B} V(n)W_k(n+1)e^{-2ikn}$$

is bounded by (3.5).
Lemma 3.5. Suppose that $\{a_n\}_{n=1}^{\infty}$ is a real sequence so that

$$a_n \to 0 \quad \text{as } n \to \infty \tag{3.6}$$

and

$$\sum_{n=1}^{N} a_n \quad \text{is bounded.} \tag{3.7}$$

Then $\prod_{n=1}^{N} (1 + a_n)$ is bounded.

Proof. By (3.6), $|a_n| \to 0$, so without loss we can suppose that $|a_n| < 1$. Then $|1 + a_n| \le 1 + a_n \le e^{a_n}$ and (3.7) implies the result.
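Proof of Theorem 3.3. By (2.12c), Lemma 3.4, and Lemma 3.5, it suffices to prove that

$$\sum_{n=1}^{N} \nu_k(n)e^{2i\bar\theta(n)} \equiv G(N) \tag{3.8}$$

is bounded. Define

$$\varphi(n) = \theta(n) - k(n-1) = \bar\theta(n) - kn.$$

Proposition 2.4 and Lemma 3.4 imply that for $n$ large

$$|(\delta\varphi)(n)| \le \pi|\nu_k(n)|. \tag{3.9}$$

By the definition (3.8) and $V(n)e^{2ikn} = -(\delta W_k)(n)$,

$$\begin{aligned}
G(N) &= \sum_{n=1}^{N} (\delta W_k)(n)(\sin k)^{-1} e^{2i\varphi(n)} \\
&= \text{bdd} - (\sin k)^{-1} \sum_{n=1}^{N} W_k(n+1)\,\delta(e^{2i\varphi})(n).
\end{aligned}$$

But $|\delta(e^{2i\varphi})| \le 2|\delta\varphi|$, so by (3.9)

$$\begin{aligned}
|G(N) - \text{bdd}| &\le C_1 \sum_{n=1}^{N} |W_k(n+1)\nu_k(n)| \\
&\le C_1 \sum_{n=1}^{N} \left(|W_k(n)\nu_k(n)| + |\nu_k(n)|^2\right) < \infty.
\end{aligned}$$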
Sometimes it is better to use slightly different Prüfer variables. For example, if $R$, $\theta$ are defined by

$$u'(x) = \sqrt{E - V(x)}\, R(x)\cos(\theta(x)), \qquad u(x) = R(x)\sin(\theta(x)),$$

then

$$\frac{d \log(R)}{dx} = \frac{1}{2(E - V(x))} \frac{\partial V}{\partial x} \cos^2(\theta(x)),$$

from which we see if $V(x) \to 0$ at infinity and $\frac{\partial V}{\partial x} \in L^1$, then solutions are bounded. (This is essentially the proof of Weidmann's theorem [37] in [33].) If one tries out an integration by parts argument, one needs both $\frac{\partial V}{\partial x} \in L^1$ and $V \in L^2$.
4. Bound States for $O(x^{-1})$ Potentials

If $|V(x)| = o(x^{-1})$, Eastham-Kalf [8] show that $-\frac{d^2}{dx^2} + V(x)$ has no positive eigenvalues; more generally, if $\overline{\lim}\, x|V(x)| = C < \infty$, they show any eigenvalue $\lambda$ must obey $\lambda \le C^2$. On the other hand, Naboko [24] and Simon [34] have constructed $V(x)$ decaying arbitrarily slower than $x^{-1}$ with eigenvalues dense in $[0,\infty)$. In fact, Simon [34] constructed $V(x)$ with $V(x) = O(x^{-1})$ so that there are infinitely many eigenvalues $\lambda_i \to 0$ as long as $\sum \sqrt{\lambda_i} < \infty$. In this section, we will handle the borderline case and improve Eastham-Kalf [8] by showing:

Theorem 4.1. Let $V(x)$ obey $C = \overline{\lim}_{x\to\infty}\, x|V(x)| < \infty$. Then there are at most countably many positive eigenvalues $\lambda_n$ for which there are solutions $u_n$ of $-u_n'' + V(x)u_n = \lambda_n u_n$ and $u_n \in L^2$. Moreover,

$$\sum_n \lambda_n \le \frac{C^2}{2}. \tag{4.1}$$
Remarks. 1. We do not specify boundary conditions on $V$, that is, (4.1) is a bound on all possible boundary conditions at once.
2. There are $\lambda_n$ so that (4.1) holds, but $\sum \sqrt{\lambda_n} = \infty$ (e.g., $\lambda_n = \frac{3C^2}{\pi^2 n^2}$) so there is a gap between Simon's examples and what our bounds allow. We believe the optimal result would be to prove that $\sum_n \sqrt{\lambda_n} \le C$.

Without loss of generality, by slightly increasing $C$ and looking at $[x,\infty)$, we can suppose that

$$|V(x)| \le C(1 + |x|)^{-1}, \tag{4.2}$$

which we henceforth do. The following is standard (see, e.g., Eastham-Kalf [8]):

Lemma 4.2. If $V$ is bounded and $u$ solves $-u'' + Vu = \lambda u$ and $u \in L^2$, then $u' \in L^2$. In particular, $R(x,\theta_0) \in L^2$ for that $\theta_0$ with $(u(0), u'(0)) = (R_0\sin(\theta_0), kR_0\cos(\theta_0))$.

Proof.

$$\begin{aligned}
\int_0^N |u'|^2\,dx &= u'u\Big|_0^N - \int_0^N u''u\,dx \\
&= u'u\Big|_0^N + \int_0^N (\lambda - V)u^2\,dx,
\end{aligned}$$

so if $\lim_{N\to\infty} \int_0^N |u'|^2\,dx = \infty$, then $\lim_{N\to\infty} u'u = \infty$, but that implies $u^2(N) = u(0)^2 + 2\int_0^N u'u\,dx \to \infty$, contradicting the fact that $u \in L^2$.

Lemma 4.3. Let $f$ and $g$ be $C^1$ functions on $[1,\infty)$ so that $|g'f| + |f'| \in L^1$. Then $\int_1^N f(x)e^{i(kx + g(x))}\,dx$ is bounded as $N \to \infty$ for any $k \ne 0$.

Proof. Write $e^{ikx} = \frac{1}{ik}\frac{d}{dx}e^{ikx}$ and integrate by parts to see that

$$\left|\int_1^N f(x)e^{i(kx + g(x))}\,dx\right| \le \frac{|f(N)|}{|k|} + \frac{|f(1)|}{|k|} + \frac{1}{|k|}\int_1^N (|f'| + |fg'|)\,dx.$$

Noting that $|f(N)| \le |f(1)| + \int_1^N |f'(y)|\,dy$, we see that the integral is bounded.

Remark. If $f(x) \to 0$ at infinity, this argument shows that $\lim_{N\to\infty} \int_1^N f(x)e^{i(kx + g(x))}\,dx$ exists.
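Lemma 4.4. Let $\{e_i\}_{i=1}^{N}$ be a set of unit vectors in a Hilbert space $\mathcal{H}$ so that

$$\alpha \equiv N \sup_{i \ne j} |\langle e_i, e_j \rangle| < 1. \tag{4.3}$$

Then

$$\sum_{i=1}^{N} |\langle g, e_i \rangle|^2 \le (1 + \alpha)\|g\|^2 \tag{4.4}$$

for any $g \in \mathcal{H}$.

Proof. Let $A$ be the $N \times N$ matrix with $a_{ij} = \langle e_i, e_j \rangle$. Note that the Hilbert-Schmidt norm of $A - 1$ is bounded by $(\sum_{i \ne j} |\langle e_i, e_j \rangle|^2)^{1/2} \le \alpha$ so (4.3) says that $A$ is invertible. If $B$ is its inverse, then

$$f_i = \sum_j B_{ij} e_j \tag{4.5}$$

obeys $\langle f_i, e_j \rangle = \delta_{ij}$, and thus

$$\sum_i \langle g, e_i \rangle f_i \equiv \text{Proj of } g \text{ to the span of the } e\text{'s},$$

and so

$$\|g\|^2 \ge \Big\| \sum_i \langle g, e_i \rangle f_i \Big\|^2.$$

By (4.5), $\langle f_i, f_j \rangle = B_{ij}$ and since $\langle h, A^{-1}h \rangle_{\mathbb{C}^N} \ge \|A\|^{-1} \langle h, h \rangle_{\mathbb{C}^N}$, we see that

$$\begin{aligned}
\sum_{i=1}^{N} |\langle g, e_i \rangle|^2 &\le \|A\| \sum_{i,j} \overline{\langle g, e_i \rangle} \langle f_i, f_j \rangle \langle g, e_j \rangle \\
&\le \|A\| \, \|g\|^2,
\end{aligned}$$

which is (4.4), since $\|A\| \le 1 + \alpha$.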
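To see Lemma 4.4 at work on a concrete case, the following sketch evaluates both sides of (4.4) for a small family of nearly orthogonal unit vectors. It is an illustration under simplifying assumptions: the Hilbert space is taken to be real and finite dimensional, and the helper names `dot`, `normalize`, and `lemma_4_4_sides` are hypothetical.

```python
import math

def dot(x, y):
    """Real inner product on R^d."""
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    """Scale x to unit length."""
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

def lemma_4_4_sides(vectors, g):
    """Return (alpha, left side of (4.4), right side of (4.4)) after normalizing the vectors."""
    e = [normalize(v) for v in vectors]
    N = len(e)
    # alpha = N * sup over i != j of |<e_i, e_j>|, as in (4.3).
    alpha = N * max(abs(dot(e[i], e[j]))
                    for i in range(N) for j in range(N) if i != j)
    lhs = sum(dot(g, ei) ** 2 for ei in e)
    rhs = (1 + alpha) * dot(g, g)
    return alpha, lhs, rhs
```

With three vectors of the form "standard basis vector plus a small common shift" in $\mathbb{R}^6$, one gets $\alpha < 1$, so the hypothesis (4.3) holds and the first returned sum is bounded by the second.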
Proof of Theorem 4.1. It obviously suffices to show for each fixed $N < \infty$ that

$$\sum_{n=1}^{N} \lambda_n \le \frac{C^2}{2}.$$

Define $R_n(x)$ to be the $R$ corresponding to the $L^2$ solution $u(x, \lambda_n)$. Normalize $u$ so $R_n(0) = 1$. By Lemma 4.2,

$$\sum_{n=1}^{N} |R_n(x)|^2 \in L^1$$

so

$$\underline{\lim}\, x \sum_{n=1}^{N} |R_n(x)|^2 = 0$$

(for if not, eventually $\sum_{n=1}^{N} |R_n(x)|^2 \ge Cx^{-1}$ is not $L^1$). Thus, we can find $B_j \to \infty$ so that for $n = 1, \ldots, N$,

$$R_n(B_j) \le B_j^{-1/2}$$

or

$$\int_0^{B_j} \frac{d}{dy}(\log R_n(y))\,dy \le -\frac{1}{2}\ln B_j,$$

so by (2.4),

$$\int_0^{B_j} V(y)\sin(2\theta_n(y))\,dy \le -\sqrt{\lambda_n}\, \log B_j. \tag{4.6}$$

Now consider the Hilbert spaces $\mathcal{H}_j = L^2((0, B_j), (1+x)\,dx)$. In $\mathcal{H}_j$, we have

$$\|V\|_{\mathcal{H}_j}^2 \le \int_0^{B_j} C^2(1 + |x|)^{-2}(1 + x)\,dx = C^2 \log(B_j) + O(1). \tag{4.7}$$

Let

$$e_n^{(j)}(y) = \frac{1}{\sqrt{N_n^{(j)}}}\, \frac{\sin(2\theta_n(y))}{(1 + |y|)}\, \chi_{[0,B_j]}(y),$$

where

$$N_n^{(j)} = \int_0^{B_j} \frac{\sin^2(2\theta_n(y))}{(1 + |y|)}\,dy.$$

Notice that $4\theta_n(y) - 4\sqrt{\lambda_n}\,y$ and $2(\theta_n \pm \theta_m)(y) - 2(\sqrt{\lambda_n} \pm \sqrt{\lambda_m})\,y$ have derivatives that are $O(x^{-1})$ by (2.3). Thus by Lemma 4.3,

$$\int_0^{B_j} \frac{\sin(2\theta_n(y))\sin(2\theta_m(y)) - \frac{1}{2}\delta_{nm}}{(1 + |y|)}\,dy$$

are bounded. We conclude that

$$N_i^{(j)} = \tfrac{1}{2}\log B_j + O(1), \tag{4.8}$$
$$\langle e_i^{(j)}, e_k^{(j)} \rangle = O((\log B_j)^{-1}), \qquad i \ne k. \tag{4.9}$$

Equations (4.6) and (4.8) imply that

$$\langle V, e_n^{(j)} \rangle_{\mathcal{H}_j} \le -\sqrt{2\lambda_n}\,(\log B_j)^{1/2} + O(1). \tag{4.10}$$

Since the number $N$ of eigenfunctions is fixed, but $B_j \to \infty$, for $j$ large Lemma 4.4 applies and

$$\sum_{n=1}^{N} |\langle V, e_n^{(j)} \rangle_{\mathcal{H}_j}|^2 \le (1 + O((\log B_j)^{-1}))\|V\|_{\mathcal{H}_j}^2. \tag{4.11}$$

But (4.10) and (4.7) then say that

$$2\left(\sum_{n=1}^{N} \lambda_n\right)\log(B_j) \le C^2\log(B_j) + O(1),$$

so

$$\sum_{n=1}^{N} \lambda_n \le \frac{C^2}{2}.$$
5. Sparse Potentials: The Continuum, Absolutely Continuous Case

Our goal in this section is to prove assertion (1) in Theorem 1.6 and Theorem 1.6′. The idea will be to control $\|T(x)\|^4$ and then use Theorem 1.3. As explained in Sect. 1, the key is oscillations in $\sin(2\theta(x))$ for $\theta(x) \sim kx_{n+1}$ for $x$ near $x_{n+1}$. We will realize this using an integration by parts so we need a priori control on objects like $\frac{d\|T(x_n)\|}{dk}$.

Fix a Pearson potential and fix $\ell$ so that $\operatorname{supp}(W) \subset [-\ell, \ell]$; $a_n$ is assumed to obey $a_n \to 0$ and $x_{n+1} > x_n + 2\ell$. Fix $\theta_0$ and solve the modified Prüfer equations for each $k \in (0,\infty)$ to get functions $\theta(x,k)$ and $R(x,k)$ (with initial conditions $\theta(x = 0, k) = \theta_0$, $R(x = 0, k) = 1$). We need two propositions to prepare for bounds in an integration by parts:

Proposition 5.1. Suppose that $\underline{\lim}\, x_{n+1}/x_n > 1$. For each $a, b > 0$, there exists a constant $C$ so that for each $k \in (a,b)$,

$$\left|\frac{\partial\theta}{\partial k}(x_n + \ell)\right| \le Cx_n \tag{5.1}$$

and

$$\left|\frac{\partial^2\theta}{\partial k^2}(x_n + \ell)\right| \le Cx_n^2. \tag{5.2}$$

Moreover, uniformly for $k \in (a,b)$,

$$\lim_{x\to\infty} \frac{1}{x}\frac{\partial\theta}{\partial k}(x) = 1, \tag{5.3}$$
$$\lim_{x\to\infty} \frac{1}{x^2}\frac{\partial^2\theta}{\partial k^2}(x) = 0. \tag{5.4}$$

Proof. Let

$$\beta = \inf_n \frac{x_{n+1}}{x_n} > 1 \tag{5.5}$$

by hypothesis. As a preliminary, note that if $h$, $g$, $f$ are functions on $[a,b]$, $h$ is $C^1$ and

$$h'(x) = f(x) + g(x)h(x), \tag{5.6}$$
Then
|h(b)| ≤ (|h(a)| + (b − a)kf k∞ )e(b−a)kgk∞
as follows from the exact solution of (5.6): Z Rx g(y) dy + h(x) = h(x)e a
Rx
x
f (y)e
y
g(z) dz
(5.7)
dy.
a
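Now let $h(x) = \frac{\partial\theta}{\partial k}(x)$. From (2.3),
\[ \frac{\partial h}{\partial x} = 1 + \frac{V(x)}{k^2} \sin^2(\theta(x)) - \frac{V(x)}{k} \sin(2\theta(x))\,h. \tag{5.8} \]
This means for $x \in (x_{n-1} + \Delta, x_n - \Delta)$, we have that
\[ \frac{\partial h}{\partial x} = 1. \tag{5.9} \]
By (5.7) and (5.8),
\[ |h(x_n + \Delta)| \leq e^{2C|a_n|\Delta}\,[\,|h(x_n - \Delta)| + 2\Delta + 2C|a_n|\Delta\,] \tag{5.10} \]
\[ \leq e^{2C|a_n|\Delta}\,[\,|h(x_{n-1} + \Delta)| + (x_n - x_{n-1}) + 2C|a_n|\Delta\,], \tag{5.11} \]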
where we used (5.9) to go from (5.10) to (5.11). In these equations, $C$ is a constant only depending on $(a, b)$. Throughout this proof, $C$ is such a constant whose value can vary from one equation to the next. Let $\beta > 1$ be given by (5.5). Pick $n_0$ so large that for $n \geq n_0$:
\[ \beta^{-1} e^{2|a_n|C\Delta} \leq \tfrac{1}{2}(1 + \beta^{-1}) \tag{5.12} \]
and
\[ e^{2C|a_n|\Delta} \left( 1 + \frac{2C|a_n|\Delta}{x_n} \right) \leq 1 + \frac{1 - \beta^{-1}}{2}. \tag{5.13} \]
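Since $\beta > 1$ and $a_n \to 0$, such an $n_0$ exists. Next, pick $D \geq 2$ so
\[ |h(x_{n_0-1} + \Delta)| \leq D x_{n_0-1}. \tag{5.14} \]
We claim inductively that for $n \geq n_0 - 1$, we have that
\[ |h(x_n + \Delta)| \leq D x_n \tag{5.15} \]
for by (5.14), this holds for $n = n_0 - 1$, and if it holds for $n - 1$, then by (5.11) and $x_{n-1} \leq \beta^{-1} x_n$,
\[ \begin{aligned}
|h(x_n + \Delta)| &\leq [D x_{n-1} + x_n - x_{n-1} + 2C|a_n|\Delta]\, e^{2C|a_n|\Delta} \\
&\leq x_n \left[ (D - 1)\beta^{-1} + 1 + \frac{2C|a_n|\Delta}{x_n} \right] e^{2C|a_n|\Delta} \\
&\leq x_n \left[ (D - 1)\,\tfrac{1}{2}(1 + \beta^{-1}) + 1 + \frac{1 - \beta^{-1}}{2} \right] \qquad \text{(by (5.12)/(5.13))} \\
&= x_n \left[ D - (D - 2)\,\frac{1 - \beta^{-1}}{2} \right] \leq D x_n
\end{aligned} \]
since $D \geq 2$. Thus, we've proven (5.15). Next, let $H(x) = h(x) - x$, so (5.8) implies that
\[ \left| \frac{\partial H}{\partial x} \right| \leq C|a_n|(1 + |H|) \tag{5.16} \]
on $(x_n - \Delta, x_n + \Delta)$. Using (5.7) and (5.15), we conclude (recall the constant $C$ changes from one equation to the next!)
\[ |H(x_n + \Delta) - H(x_n - \Delta)| \leq C|a_n| x_n. \]
Since $H(x_{n-1} + \Delta) = H(x_n - \Delta)$, we have that for $n \geq n_0$,
\[ \left| \frac{H(x_n + \Delta)}{x_n + \Delta} \right| \leq \frac{C}{x_n + \Delta} + \sum_{m=n_0}^{n} a_m \frac{x_m}{x_n + \Delta} \leq \frac{C}{x_n + \Delta} + \sum_{m=n_0}^{n} a_m \beta^{-(n-m)} \to 0 \]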
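as $n \to \infty$ since $\beta > 1$ and $a_m \to 0$. From this and (5.16), we see that $|H(x)/x| \to 0$ as $x \to \infty$, which proves (5.1).

To prove (5.2), let $g = \frac{\partial h}{\partial k} = \frac{\partial^2\theta}{\partial k^2}$. Then differentiating (5.8) with respect to $k$, we see that
\[ \frac{\partial g}{\partial x} = 0 \quad \text{on } (x_{n-1} + \Delta, x_n - \Delta), \tag{5.17a} \]
\[ \frac{\partial g}{\partial x} = A(x) + B(x)h(x) + D(x)g(x) + E(x)h^2(x) \quad \text{on } (x_n - \Delta, x_n + \Delta), \tag{5.17b} \]
where $A, B, D, E$ are uniformly bounded by $C a_n$ on this interval with $C$ uniformly bounded as $k$ runs through $(a, b)$. Now use (5.7) and (5.1) to see that
\[ |g(x_n + \Delta)| \leq e^{2C a_n \Delta}\,[\,g(x_{n-1} + \Delta) + C a_n x_n^2 \Delta\,]. \]
As above, if $n$ is so large that
\[ \beta^{-2} e^{2C a_n \Delta} \leq \tfrac{1}{2}(1 + \beta^{-1}) \quad \text{and} \quad (C a_n \Delta)\, e^{2C a_n \Delta} \leq \tfrac{1}{2}(1 - \beta^{-1}), \]
then inductively, $g(x_n + \Delta) \leq C x_n^2$ for $n$ large. This is (5.2). Plugging this into (5.17b), we see that
\[ g(x_n + \Delta) \leq C \left( 1 + \sum_{m=1}^{n} a_m x_m^2 \right), \tag{5.17c} \]
which yields $\lim_{n\to\infty} g(x_n + \Delta)/x_n^2 = 0$ from which (5.4) is immediate.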
Proposition 5.2. For any $a, b > 0$, there is a $C$ so that for all $k \in (a, b)$,
\[ \log R(x_n + \Delta) \leq C \sum_{m=1}^{n} |a_m|, \tag{5.18} \]
\[ \left| \frac{\partial \log R}{\partial k}(x_n + \Delta) \right| \leq C \sum_{m=1}^{n} |a_m x_m|. \tag{5.19} \]

Proof. By (2.4), $\log R(x)$ is constant for $x \in (x_{n-1} + \Delta, x_n - \Delta)$ and
\[ |\log R(x_n + \Delta) - \log R(x_n - \Delta)| \leq 2k^{-1} |a_n| \int W(y)\,dy, \]
so (5.18) holds with $C = 2 \min(k)^{-1} \int W(y)\,dy$. From (2.4), we have
\[ \frac{\partial}{\partial x} \frac{\partial}{\partial k} (k \log R) = V(x) \cos(2\theta(x)) \frac{\partial\theta}{\partial k}, \]
so that the bound (5.1) implies (5.19).
As a final preliminary, we note that

Lemma 5.3. Suppose that $\lim x_{n+1}/x_n > 1$. Then for a constant $C$,
\[ \sum_{n=1}^{\infty} \sum_{m \leq n} |a_n a_m| \frac{x_m}{x_n} \leq C \sum_{n=1}^{\infty} a_n^2. \]

Proof. Let $\beta = \lim x_{n+1}/x_n$. Pick $1 < \gamma < \beta$. Then for $m \leq n$, $x_m/x_n \leq C\gamma^{-|m-n|}$. Thus, the lemma follows from Young's inequality that
\[ T(a)_n \equiv \sum_m \gamma^{-|m-n|} a_m \]
is bounded from $\ell^2$ to $\ell^2$ for any $\gamma > 1$.
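As an informal numerical illustration (not part of the original argument), the following Python sketch checks the Young-type bound used in this proof: the double sum with geometric weights is controlled by the $\ell^1$ norm of the kernel times $\sum a_n^2$. The sequence `a` and the ratio `gamma` are hypothetical choices made only for the check.

```python
# Sanity check of the Young-type bound behind Lemma 5.3 (illustrative only).

def weighted_double_sum(a, gamma):
    """Sum over n and m <= n of |a_n a_m| * gamma**-(n - m)."""
    total = 0.0
    for n in range(len(a)):
        for m in range(n + 1):
            total += abs(a[n] * a[m]) * gamma ** (-(n - m))
    return total

def young_bound(a, gamma):
    """(l^1 norm of the one-sided kernel gamma**-j, j >= 0) times sum of a_n^2."""
    kernel_l1 = 1.0 / (1.0 - 1.0 / gamma)
    return kernel_l1 * sum(x * x for x in a)

# Hypothetical square-summable sequence and ratio; any gamma > 1 would do.
a = [(-1) ** n / (n + 1) for n in range(200)]
gamma = 1.5

lhs = weighted_double_sum(a, gamma)
rhs = young_bound(a, gamma)
print(lhs, rhs)
```

The bound follows from the Cauchy-Schwarz inequality combined with Young's inequality, so `lhs` never exceeds `rhs` for any finite sequence.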
Proof of Theorem 1.6′. Let $g$ be a non-negative $C^\infty$-function compactly supported on $(0, \infty)$. We will prove that
\[ \sup_n \int g(k) R(k, x_n + \Delta)^4\,dk < \infty. \tag{5.20} \]
Proving this for two values of $\theta_0$ and appealing to Theorem 2.1 gets a uniform bound on $\int g(k) \|T(0, x_n + \Delta)\|^4\,dk$. Theorem 1.3 then proves pure absolute continuity of the spectrum on $(0, \infty)$.

Let $B_n = \int g(k) R(x_n + \Delta)^4\,dk$. Notice that by (2.4), $R(x_{n-1} + \Delta) = R(x_n - \Delta)$ and
\[ R(x_n + \Delta)^4 = R(x_n - \Delta)^4 \exp(Q_n), \tag{5.21} \]
where
\[ Q_n = \frac{2}{k} \int_{-\Delta}^{\Delta} a_n W(y) \sin(2\theta(x_n + y))\,dy. \]
Since $k^{-1}$ and $a_n$ are bounded, $Q_n$ is uniformly bounded in $n$, and so
\[ \exp(Q_n) \leq 1 + Q_n + C Q_n^2 \leq 1 + Q_n + C a_n^2 \tag{5.22} \]
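(where again $C$ is a constant that varies from formula to formula). For $y \in (-\Delta, \Delta)$, we have by (2.3)
\[ |\theta(x_n + y) - \tilde\theta_n(y)| \leq C a_n, \]
where $\tilde\theta_n(y) = \theta(x_{n-1} + \Delta) + k(x_n + y - x_{n-1} - \Delta)$, so
\[ \left| Q_n - \frac{2}{k} \int_{-\Delta}^{\Delta} a_n W(y) \sin(2\tilde\theta_n(y))\,dy \right| \leq C a_n^2. \tag{5.23} \]
By (5.21)–(5.23),
\[ B_n \leq B_{n-1}(1 + C a_n^2) + E_n, \tag{5.24} \]
where
\[ E_n = a_n \int_{-\Delta}^{\Delta} dy \int \frac{2g(k)}{k} R(x_{n-1} + \Delta, k)^4\, W(y) \sin(\tilde\theta_n(y))\,dk. \]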
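Notice that we're implementing our basic strategy: We separate out the second-order terms (which will present no problem since $\prod_{n=1}^{\infty} (1 + C a_n^2) < \infty$) and need to control the first-order terms where we have an explicit highly oscillatory factor since $\theta_n \sim k x_n$. Now
\[ \frac{\partial \tilde\theta_n(y)}{\partial k} = x_n + y - x_{n-1} - \Delta + \frac{\partial\theta(x_{n-1} + \Delta)}{\partial k} > \frac{1}{2} x_n \tag{5.25} \]
for $n$ large by the bound (5.3). Thus, we can write
\[ \sin(\tilde\theta_n(y)) = \frac{1}{\partial\tilde\theta_n/\partial k}\, \frac{\partial}{\partial k} \bigl( -\cos(\tilde\theta_n(y)) \bigr) \]
and integrate by parts. After integration by parts, we have three terms:

$E_n^{(1)}$ coming from $\frac{\partial [k^{-1} g(k)]}{\partial k}$,
$E_n^{(2)}$ coming from $\frac{\partial R^4}{\partial k}$,
$E_n^{(3)}$ coming from $\frac{\partial}{\partial k} \Bigl[ \frac{1}{\partial\tilde\theta/\partial k} \Bigr]$.

For the $E_n^{(1)}$ term, we can bound $R^4$ as follows using (5.18) and $x_n \geq C\beta^n$. By (5.10), for $n$ large,
\[ R^4 \leq C \exp\left( C \sum_{m=1}^{n} a_m \right) \leq C \exp\left( \frac{n \ln(\beta)}{2} \right), \tag{5.26} \]
since $a_n \to 0$. Thus, by (5.19) and (5.26),
\[ E_n^{(1)} \leq C \beta^{n/2} \beta^{-n} = C \beta^{-n/2}. \tag{5.27} \]
For the $E_n^{(2)}$ term, we use $\frac{\partial R^4}{\partial k} = R^4 \frac{\partial \log R}{\partial k}$, (5.19), and (5.25) to see that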
\[ E_n^{(2)} \leq C B_{n-1} b_n, \]
where
\[ b_n = \sum_{m=1}^{n-1} a_n a_m \frac{x_m}{x_n}. \]
Note now that by $\sum a_n^2 < \infty$ and Lemma 5.3, we have
\[ \sum_{n=1}^{\infty} b_n < \infty. \tag{5.28} \]
For the $E_n^{(3)}$ term, we use (5.25) and (5.17c) to see that
\[ E_n^{(3)} \leq B_{n-1} c_n, \]
where
\[ c_n = C a_n \frac{1 + \sum_{m=1}^{n-1} a_m x_m^2}{x_n^2}. \]
As in the proof of Lemma 5.3,
\[ \sum c_n < \infty. \tag{5.29} \]
By (5.24) and the above estimates on $E_n^{(i)}$,
\[ \max(B_n, 1) \leq (1 + C a_n^2 + C b_n + C c_n + C\beta^{-n/2}) \max(B_{n-1}, 1). \]
By hypothesis, $\sum a_n^2 < \infty$, and by (5.28–5.29), $\sum b_n + c_n < \infty$. Thus
\[ \prod_{n=1}^{N} (1 + C a_n^2 + C b_n + C c_n + C\beta^{-n/2}) \]
is bounded and consequently, so is $B_n$.
It is easy to see that the methods of this section extend to prove:

Theorem 5.4. Suppose $V(x) = \sum W_n(x - x_n)$, where
(i) $\lim x_n/x_{n+1} < 1$,
(ii) $\operatorname{supp} W_n \subset [-\Delta, \Delta]$ for some fixed $\Delta$,
(iii) \[ \sum_n \left( \int |W_n(y)|\,dy \right)^2 < \infty. \tag{5.30} \]
Then $-\frac{d^2}{dx^2} + V(x)$ has purely a.c. spectrum on $(0, \infty)$.
6. Sparse Potentials: The Continuum, Singular Continuous Case

In this section, we will prove assertion (2) in Theorem 1.6. The idea will be to force $\|T(k^2, x_n)\|$ to infinity for almost all $k$ and suitable $x_n$. To do this, we will need to isolate a strictly positive second-order term and show that these second-order terms then dominate the first-order terms because of oscillations.

Here is a warm-up problem to show this cancellation mechanism. Let $X_n$ be independent, identically distributed random variables taking the values $\pm 1$ with probability $\frac{1}{2}$. Let $\varepsilon > 0$ and let $a_n$ be a sequence going to zero as $n \to \infty$. Finally, let
\[ Y_n = \sum_{m=1}^{n} (\varepsilon a_m^2 + a_m X_m). \]
Suppose that $\sum a_n^2 = \infty$. We claim there exists a subsequence $n(i) \to \infty$, so with probability 1,
\[ \lim_{i\to\infty} Y_{n(i)} = \infty. \tag{6.1} \]
The reason (6.1) holds is that by the central limit theorem $\sum_{m=1}^{n} a_m X_m$ is typically not more negative than $O\bigl( -\sqrt{\sum a_n^2} \bigr)$ and, because of the square root, this is smaller than $\sum_{m=1}^{n} a_m^2$.

To make a proof, notice that since $\sum_{m=1}^{n} a_m^2 \to \infty$, we can choose $n(i)$ so that $\sum_{m=1}^{n(i)} a_m^2 \geq i^2$. By a Chebyshev inequality,
\[ \mathrm{Prob}\left( \Bigl| \sum_{1}^{n(i)} a_m X_m \Bigr| \geq \frac{\varepsilon}{2} \sum_{1}^{n(i)} a_m^2 \right) \leq \frac{\bigl\| \sum_{1}^{n(i)} a_m X_m \bigr\|^2}{\frac{\varepsilon^2}{4} \bigl( \sum_{1}^{n(i)} a_m^2 \bigr)^2} = \frac{4}{\varepsilon^2 \sum_{1}^{n(i)} a_m^2} \leq \frac{4}{\varepsilon^2 i^2}. \]
Since $\sum_i \frac{1}{i^2} < \infty$, by the Borel-Cantelli lemma, with probability 1, eventually
\[ \Bigl| \sum_{1}^{n(i)} a_m X_m \Bigr| \leq \frac{\varepsilon}{2} \sum_{1}^{n(i)} a_m^2, \]
and thus eventually,
\[ Y_{n(i)} \geq \frac{\varepsilon}{2} \sum_{1}^{n(i)} a_m^2 \]
diverges.

The usual Kolmogorov stopping argument that lets one prove things without subsequences isn't obviously applicable here in a situation where we assume no regularity on the $a_m$'s (see Sect. 8 for the case $a_m = m^{-\alpha}$). Since a subsequence suffices for our application, we have not tried to push the argument through to get $\lim Y_n = \infty$, even in the toy problem. Notice that independence of the $X_n$'s was not needed; rather, it suffices to have enough control of $E(X_n X_m)$ to show that the first-order term is small compared to the second-order term. In the case at hand, we will use integration by parts in $k$ as we did in the last section to get this control. We summarize the key to the above argument with

Lemma 6.1. Let $P_n, Q_n$ be random variables so that
(i) $P_n(x) \geq \alpha_n > 0$ for a.e. $x$ and positive reals $\alpha_n$,
(ii) $\sum \alpha_n^{-1}\, \mathrm{Exp}(|Q_n|) < \infty$,
(iii) $\lim_{n\to\infty} \alpha_n = \infty$.
Then $P_n(x) + Q_n(x) \to \infty$ for a.e. $x$. If (ii) is replaced with
(ii′) $\lim_{n\to\infty} \alpha_n^{-1}\, \mathrm{Exp}(|Q_n|) = 0$,
then there exists a subsequence $n(i)$ so that $P_{n(i)}(x) + Q_{n(i)}(x) \to \infty$ for a.e. $x$.

Proof. If (ii′) holds, we can find a subsequence so that (ii) holds. Thus, it suffices to prove the result assuming (ii). By (ii), $\sum \alpha_n^{-1} |Q_n(x)| < \infty$ for a.e. $x$. In particular, $\alpha_n^{-1} Q_n(x) \to 0$ so
\[ P_n + Q_n \geq \alpha_n [1 - \alpha_n^{-1} |Q_n(x)|] \to \infty. \]

We will also need the following lemma:

Lemma 6.2. Suppose that $B_n, \alpha_n, \beta_n \geq 0$ are real numbers and that
\[ B_n \leq B_{n-1} + 2\alpha_n \sqrt{B_{n-1}} + \beta_n \qquad (n \geq 1). \tag{6.2} \]
Then,
\[ \sqrt{B_n} \leq \sqrt{B_0} + \sum_{k=1}^{n} \alpha_k + \sqrt{\sum_{k=1}^{n} \beta_k}. \tag{6.3} \]
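Proof. We give a proof by induction. Equation (6.3) holds for $n = 0$. Let $a_n = \sum_{k=1}^{n} \alpha_k$, $b_n = \sum_{k=1}^{n} \beta_k$. By the induction hypothesis,
\[ \sqrt{B_{n-1}} \leq \sqrt{B_0} + a_{n-1} + \sqrt{b_{n-1}}. \tag{6.4} \]
Equation (6.2) implies that
\[ B_n \leq \bigl( \sqrt{B_{n-1}} + \alpha_n \bigr)^2 + \beta_n. \]
So by (6.4),
\[ \begin{aligned}
B_n &\leq \bigl( \sqrt{B_0} + a_n + \sqrt{b_{n-1}} \bigr)^2 + \beta_n \\
&\leq \bigl( \sqrt{B_0} + a_n \bigr)^2 + b_n + 2\sqrt{b_{n-1}}\, \bigl( \sqrt{B_0} + a_n \bigr) \\
&\leq \bigl( \sqrt{B_0} + a_n + \sqrt{b_n} \bigr)^2,
\end{aligned} \]
proving (6.3) inductively.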
So fix a Pearson potential with $\sum a_n^2 = \infty$. Fix $\theta_0$ and let $R(x, k)$ be the solution of (2.3/2.4). Let $Y_n(k) = \log R(x_n + \Delta, k)$ and $\delta Y_n(k) = Y_n(k) - Y_{n-1}(k)$. By (2.4),
\[ \delta Y_n(k) = \frac{a_n}{2k} \int_{-\Delta}^{\Delta} W(y) \sin 2\theta(x_n + y)\,dy. \tag{6.5} \]
As in Sect. 5, we write
\[ \tilde\theta_n(y) = \theta(x_{n-1} + \Delta) + k(x_n + y - x_{n-1} - \Delta). \]
But we expand $\theta$ to the next order by letting
\[ \theta_n^{(1)}(y) = -\frac{a_n}{k} \int_{-\Delta}^{y} W(s) \sin^2(\tilde\theta_n(s))\,ds. \tag{6.6} \]
Then by (2.3),
\[ \theta(x_n + y) = \tilde\theta_n(y) + \theta_n^{(1)}(y) + O(a_n^2), \]
so by (6.5),
\[ \delta Y_n(k) = a_n X_n^{(1)} + a_n^2 S_n + O(a_n^3), \tag{6.7a} \]
\[ X_n^{(1)} = \frac{1}{2k} \int_{-\Delta}^{\Delta} W(y) \sin(2\tilde\theta_n(y))\,dy, \tag{6.7b} \]
\[ S_n = \frac{1}{k} \int_{-\Delta}^{\Delta} W(y) \cos(2\tilde\theta_n(y))\, \frac{\theta_n^{(1)}(y)}{a_n}\,dy. \tag{6.7c} \]
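In the formula for $\theta_n^{(1)}$, use $\sin^2(\tilde\theta_n(y)) = \frac{1}{2}(1 - \cos(2\tilde\theta_n(y)))$. The cos term from this formula when plugged into (6.7c) gives
\[ \frac{1}{2k^2} \int_{-\Delta}^{\Delta} W(y) \cos(2\tilde\theta_n(y)) \int_{-\Delta}^{y} W(s) \cos(2\tilde\theta_n(s))\,ds\,dy = \frac{1}{4k^2} \left( \int_{-\Delta}^{\Delta} W(y) \cos(2\tilde\theta_n(y))\,dy \right)^2. \tag{6.8} \]
We lump the contribution of the $\frac{1}{2}$ term with the first-order term. Defining $X(y) = \int_{-\Delta}^{y} W(s)\,ds$, we find
\[ \delta Y_n(k) = [a_n^2 Z_n(k) + a_n X_n(k)] + O(a_n^3), \tag{6.9} \]
where
\[ Z_n(k) = \frac{1}{4k^2} \left( \int_{-\Delta}^{\Delta} W(y) \cos(2\tilde\theta_n(y))\,dy \right)^2, \]
\[ X_n(k) = \frac{1}{2k} \int_{-\Delta}^{\Delta} \left[ W(y) \sin(2\tilde\theta_n(y)) - \frac{a_n W(y) X(y)}{2k} \cos(2\tilde\theta_n(y)) \right] dy. \]
In (6.9), the $O(a_n^3)$ means an error bounded by $C a_n^3$, where $C$ is a finite constant for $k \in [a, b]$ any compact subinterval of $(0, \infty)$. Define
\[ \widetilde{W}(k) = \int_{-\Delta}^{\Delta} W(y) e^{2iky}\,dy. \]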
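Then,
\[ Z_n(k) = \frac{1}{8k^2} |\widetilde{W}(k)|^2 + \tilde{X}_n(k), \tag{6.10} \]
where
\[ \tilde{X}_n(k) = \frac{1}{8k^2} |\widetilde{W}(k)|^2 \cos\bigl( 4(\tilde\theta_n(0, k) + \varphi(k)) \bigr), \tag{6.11} \]
where $\varphi(k) = \frac{1}{2} \operatorname{Arg}(\widetilde{W}(k))$. To see this, note that $\tilde\theta_n(y) = \tilde\theta_n(0) + ky$. If $\widetilde{W}(k) = |\widetilde{W}(k)| e^{2i\varphi(k)}$, then
\[ Z_n(k) = \frac{1}{4k^2} \left( \operatorname{Re} \int_{-\Delta}^{\Delta} W(y) e^{2i(\tilde\theta_n(0) + ky)}\,dy \right)^2 = \frac{1}{4k^2} |\widetilde{W}(k)|^2 \cos^2\bigl( 2(\tilde\theta_n(0, k) + \varphi(k)) \bigr). \]

Proof of Theorem 1.6, Part (2). Let
\[ P_n(k) = \frac{1}{8k^2} |\widetilde{W}(k)|^2 \sum_{m=1}^{n} a_m^2, \qquad Q_n(k) = Y_n(k) - P_n(k), \qquad \delta Q_n(k) = Q_n(k) - Q_{n-1}(k), \]
so
\[ \delta Q_n(k) = a_n^2 \tilde{X}_n(k) + a_n X_n(k) + O(a_n^3). \]
Let $g$ be a $C^\infty$-function compactly supported in $\{k \in (0, \infty) \mid \widetilde{W}(k) \neq 0\}$. Let
\[ B_n = \int g(k) \Bigl| \sum_{m=1}^{n} a_m X_m(k) \Bigr|^2 dk, \qquad \tilde{B}_n = \int g(k) \Bigl| \sum_{m=1}^{n} a_m^2 \tilde{X}_m(k) \Bigr|^2 dk. \]
We will prove that
\[ \sqrt{B_n} \Big/ \sum_{m=1}^{n} a_m^2 \to 0 \tag{6.12} \]
as $n \to \infty$, and similarly for $\tilde{B}_n$. Since $\sum_{m=1}^{n} a_m^3 / \sum_{m=1}^{n} a_m^2 \to 0$ (on account of $a_n \to 0$ and $\sum_{m=1}^{n} a_m^2 \to \infty$), (6.12) and the Schwarz inequality imply that
\[ \int g(k) |Q_n(k)|\,dk \Big/ \sum_{m=1}^{n} a_m^2 \to 0, \]
so Lemma 6.1 and $\inf_{k \in \operatorname{supp} g} \frac{|\widetilde{W}(k)|^2}{8k^2} > 0$ imply that there is a subsequence $n(i)$ so that $Y_{n(i)}(k) \to \infty$ for a.e. $k$ in $\operatorname{supp} g$. By doing this for two values of $\theta_0$ and using Theorem 2.1 and Theorem 1.1, we conclude there is no a.c. spectrum on $\operatorname{supp} g$. Since $\widetilde{W}$ is an entire function, it has isolated zeros and thus, this argument shows $\sigma_{ac}$ is empty. By Theorem 1.4, $\sigma_{pp} \cap (0, \infty)$ is empty, and an elementary argument proves that $\sigma(H) \supset [0, \infty)$. So the spectrum on $(0, \infty)$ is purely singular continuous.

It thus suffices to prove (6.12) (the proof for $\tilde{B}_n$ is essentially identical).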
Let $M_{n-1}(k) = \sum_{m=1}^{n-1} a_m X_m(k)$. Then
\[ B_n \leq B_{n-1} + \int g(k) M_{n-1}(k)\, a_n X_n(k)\,dk + C a_n^2 \]
for a suitable constant $C$. Now $X_n$ has $\cos(2\tilde\theta_n(y))$ and $\sin(2\tilde\theta_n(y))$ terms. As in the last section, we write those as a suitable $[d\tilde\theta_n(y, k)/dk]^{-1} \frac{d}{dk}[\ldots]$ and integrate by parts and get three terms:

One coming from $\frac{\partial [k^{-1} g(k)]}{\partial k}$. Noting that $|M_n(k)| \leq Cn$, we have that this is bounded by $\frac{Cn}{x_n}$.

One coming from $\frac{\partial M_n(k)}{\partial k}$. Using (5.1), this term is bounded by
\[ C \sum_{m=1}^{n-1} a_n a_m \frac{x_m}{x_n}. \]

One coming from $L_n = \bigl[ \frac{\partial^2\theta_n}{\partial k^2} \bigr] / \bigl[ \frac{\partial\theta_n}{\partial k} \bigr]^2$. As in the last section, this $L_n$ is bounded by $C \bigl( \sum_{m=1}^{n-1} a_m x_m^2 \bigr) / x_n^2$. We can use the Schwarz inequality to control $\int g(k) |M_n(k)|\,dk$, and so bound this term by $C \sqrt{B_{n-1}}\, a_n \sum_{m=1}^{n-1} a_m x_m^2 / x_n^2$.

The net result is the bound
\[ B_n \leq B_{n-1} + 2\alpha_n \sqrt{B_{n-1}} + \beta_n, \tag{6.13} \]
where
\[ \alpha_n = C \sum_{m=1}^{n-1} |a_n a_m| \frac{x_m^2}{x_n^2} \]
and
\[ \beta_n = C \left( a_n^2 + \frac{n}{x_n} + \sum_{m=1}^{n-1} a_n a_m \frac{x_m}{x_n} \right). \]
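By the argument in Lemma 5.3 with $x_{n-1}/x_n \to 0$ and $\sum_{m=1}^{n} a_m^2 \to \infty$, we see that
\[ \sum_{m=1}^{n} \alpha_m \Big/ \sum_{m=1}^{n} a_m^2 \to 0, \tag{6.14} \]
and that
\[ \sum_{m=1}^{n} \beta_m \leq C \left( 1 + \sum_{m=1}^{n} a_m^2 \right), \]
so
\[ \sqrt{\sum_{m=1}^{n} \beta_m} \Big/ \sum_{m=1}^{n} a_m^2 \to 0. \tag{6.15} \]
Lemma 6.2 and (6.13–6.15) imply (6.12).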
One can modify this construction to make examples of decaying potentials for which the associated Schrödinger operator has regions of a.c. spectrum and regions of s.c. spectrum. The idea is to arrange that $\widetilde{W}(k)$ vanishes in a whole interval so that even though $a_n \notin \ell^2$, we have a.c. spectrum for those $k$. Of course, $\widetilde{W}(k)$ cannot vanish if $W$ has compact support, so we will take the bump functions of increasing support converging toward a function whose Fourier transform vanishes in an interval.

So, let $S = [a, b] \subset (0, \infty)$. Let $f$ be an even Schwartz class function that vanishes if $k^2 \in S$ and is strictly positive on $[0, \infty) \setminus S$. Let $a_n = n^{-1/2}$, $x_n = (n!)^2$, $\Delta_n = n^{1/12}$. Notice that $\sum a_n^2 = \infty$. Define
\[ \tilde{f}(x) = \frac{1}{4\pi} \int \exp(-2ikx) f(k)\,dk, \qquad W_n(x) = \tilde{f}(x)\, \chi_{(-\Delta_n, \Delta_n)}(x) \]
and
\[ V(x) = \sum_n a_n W_n(x - x_n). \]
We are heading toward:

Theorem 6.3. The half-axis Schrödinger operator $-\frac{d^2}{dx^2} + V(x)$ has purely singular spectrum on $(0, \infty) \setminus S$ and purely a.c. spectrum on $S$.
Lemma 6.4. For any $m > 0$, there exists a constant $C_m$ with
\[ \left| f(k) - \int e^{2ikx} W_n(x)\,dx \right| \leq C_m n^{-m}. \tag{6.16} \]

Proof. Let $f_n(k) = \int e^{-2ikx} W_n(x)\,dx$. Then
\[ f_n(k) = \frac{1}{2\pi} \int \frac{\sin(\Delta_n (k - k'))}{(k - k')} f(k')\,dk', \]
so the left side of (6.16) is
\[ \left| \frac{1}{2\pi} \int \frac{\sin \Delta_n (k - k')}{k - k'} [f(k) - f(k')]\,dk' \right|, \]
which has the form
\[ \left| \int g(y, k) \sin \Delta_n y\,dy \right|, \]
where $g(y, k)$ is Schwartz space in $y$ with bounds (including bounds on derivatives) uniform in $k$. If we integrate by parts $12m$ times, we will get (6.16).

Proposition 5.1 extends with no change. In the region where $f(k) \neq 0$, the analysis earlier in this section shows that $\log R(x_{n(i)} + \Delta_{n(i)}) \to \infty$ for a.e. $k$ and a suitable subsequence $x_{n(i)}$, so we know the spectrum in $(0, \infty) \setminus S$ is purely singular continuous. On the other hand, if $g$ is $C^\infty$ supported in $S$, we claim that
\[ \sup_n \int g(k) R(k, x_n + \Delta_n)^4\,dk < \infty. \tag{6.17} \]
The proof is similar to that in the last section. In place of (5.22), we need to use
\[ \exp(Q_n) \leq 1 + Q_n + \tfrac{1}{2} Q_n^2 + O(a_n^3). \]
As in this section, $Q_n^2$ has a term $a_n^2 |\widetilde{W}_n(k)|^2 / 8k^2$ and oscillatory terms that we can integrate by parts. Noting that
\[ \sum_{m=1}^{n-1} a_n a_m \frac{x_m}{x_n} \leq n^{-2}\, n^{-1/2} \sum_{m=1}^{n-1} m^{-1/2} \leq C n^{-2} \]
is still summable and that $\sum a_n^2 |\widetilde{W}_n(k)|^2$ is summable by Lemma 6.4, we obtain (6.17).
7. Sparse Potentials: The Discrete Case

In this section, we will sketch the proof of Theorem 1.7. The proof follows closely that in the last two sections with (2.12) replacing (2.3/2.4). We will make use of (2.14), the Pastur-Figotin form of (2.12b). Fix $\alpha > 0$ and pick $k \in (\alpha, \pi - \alpha)$ and then $N_0$ so large that for all such $k$, $|\nu_k(n)| < \frac{1}{2}$ for $n \geq N_0$. Equation (2.14) can then be effectively used to prove the analogs of (5.1/5.2), that is,
\[ \left| \frac{\partial\theta}{\partial k}(x_n) \right| \leq C x_n, \qquad \left| \frac{\partial^2\theta}{\partial k^2}(x_n) \right| \leq C x_n^2. \tag{7.1} \]
Equation (2.12c) can be rewritten
\[ \log R(n + 1) - \log R(n) = \tfrac{1}{2} \log\bigl( 1 + \nu_k(n) \sin(2\bar\theta) + \nu_k(n)^2 \sin^2(\bar\theta) \bigr). \tag{7.2} \]
This implies the bound
\[ \log R(x_n) \leq C \sum_{m=1}^{n} |a_m|. \tag{7.3} \]
Next notice that
\[ 1 + \alpha \sin(2\theta) + \alpha^2 \sin^2(\theta) = \bigl( 1 + \tfrac{1}{2}\alpha \sin(2\theta) \bigr)^2 + \alpha^2 \sin^4(\theta). \]
This provides a uniform bound on the argument of the $\log(\cdot)$ in (7.2), and so allows one to prove
\[ \left| \frac{\partial \log R(x_n)}{\partial k} \right| \leq C \sum_{m=1}^{n} a_m x_m. \tag{7.4} \]
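The algebraic identity behind the uniform bound on the argument of the logarithm can be checked numerically. The Python sketch below is only an illustration: it evaluates both sides on a grid of hypothetical values with $|\alpha| \leq \frac{1}{2}$ and records the smallest value of the left side, which the identity forces to be at least $(1 - \frac{1}{4})^2 = \frac{9}{16}$.

```python
import math

def log_argument(alpha, theta):
    """Left side: 1 + alpha*sin(2θ) + alpha^2*sin^2(θ)."""
    return 1.0 + alpha * math.sin(2 * theta) + alpha ** 2 * math.sin(theta) ** 2

def sum_of_squares(alpha, theta):
    """Right side: (1 + alpha*sin(2θ)/2)^2 + alpha^2*sin^4(θ)."""
    return (1.0 + 0.5 * alpha * math.sin(2 * theta)) ** 2 + alpha ** 2 * math.sin(theta) ** 4

# Hypothetical grid: alpha in [-0.5, 0.5], theta over one full period.
grid = [(a / 10.0, t * math.pi / 50.0) for a in range(-5, 6) for t in range(100)]

max_gap = max(abs(log_argument(a, t) - sum_of_squares(a, t)) for a, t in grid)
min_value = min(log_argument(a, t) for a, t in grid)
print(max_gap, min_value)
```

Since the right side is a sum of squares whose first term is at least $\frac{9}{16}$ when $|\alpha| \leq \frac{1}{2}$, the logarithm in (7.2) stays away from its singularity.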
With these tools, the proof of assertion (1) of Theorem 1.7 is similar to that in Sect. 5, only a little simpler since (2.12c) implies
\[ R(n + 1)^4 \leq R(n)^4 \bigl( 1 + \nu_k(n) \sin(2\bar\theta(n)) + C_n a_n^2 \bigr). \]
The same integration by parts used in Sects. 5 and 6 shows that
\[ \int g(k) R(n, k)^4\, \nu_k(n) \sin(2\theta(n))\,dk = C(b_n + c_n + \beta^{-n/2}) \left( 1 + \int g(k) R(n, k)^4\,dk \right) \]
with $b_n = \sum_{m=1}^{n-1} a_n a_m x_m / x_n$ and $c_n$ is like $b_n$ with $x_m^2/x_n^2$ replacing $x_m/x_n$. As in Sect. 5, this proves assertion (1) in Theorem 1.7.

To prove assertion (2), we must identify a strictly positive second-order term. We write
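\[ \log(1 + \alpha \sin(2\theta) + \alpha^2 \sin^2(\theta)) = \alpha \sin(2\theta) + \alpha^2 \bigl( \sin^2(\theta) - \tfrac{1}{2} \sin^2(2\theta) \bigr) + O(\alpha^3) \tag{7.5} \]
\[ = \alpha \sin(2\theta) + \tfrac{1}{4}\alpha^2 \cos(4\theta) - \tfrac{1}{2}\alpha^2 \cos(2\theta) + \tfrac{1}{4}\alpha^2 + O(\alpha^3). \tag{7.6} \]
This lets us write
\[ \log R(n + 1) - \log R(n) = \tfrac{1}{8} a_n^2 + a_n X_n, \]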
and, as in Sect. 6, use the integration by parts machine to prove
\[ \left( \int \Bigl| \sum_{n=1}^{N} a_n X_n \Bigr|^2 g(k)\,dk \right)^{1/2} \Big/ \sum_{n=1}^{N} a_n^2 \to 0 \]
and complete the proof as there.

In this case, we don't need to worry about zeros of $\widetilde{W}(k)$ since the analog of $W$ here is $\delta_{n0}$ and so $\widetilde{W}(k) = 1$.

8. Random Decaying Potentials: The Discrete Case

In this section, we consider discrete situations where the $V(n)$ are independent random variables of zero mean and decaying variance. The results that imply a.c. spectrum require no regularity in $E(V(n)^2)$, while those for singular spectrum require some kind of regular decay, as we will explain. The results for a.c. spectrum are so general yet so simple to prove that they are a paradigm of the usefulness of the EFGP transform.

Theorem 8.1. Suppose $V_\omega(n)$ are independent random variables with $E(V_\omega(n)) = 0$ and
\[ \sum_n E(V_\omega(n)^2) + E(V_\omega(n)^4) < \infty. \tag{8.1} \]
Then for a.e. $\omega$, $h_\omega$ has purely a.c. spectrum on $(-2, 2)$.

Remarks. 1. For $E(V_\omega^2)^{1/2} \leq C n^{-\alpha}$ with $V$ bounded and $\alpha > \frac{1}{2}$, we get a.c. spectrum recovering results of Delyon, et al. [7].
2. If the $V_\omega(n)$ are uniformly bounded, then $E(V_\omega(n)^4) \leq C E(V_\omega(n)^2)$ and so (8.1) becomes $\sum_n E(V_\omega(n)^2) < \infty$; we state the general bound because unbounded $V$'s are so easy to accommodate.
3. The case $E(V_\omega(n)^2)^{1/2} = n^{-1/2} \log(n)^{-1}$ is of some interest. This sequence is $\ell^2$ so if $V$ is bounded, the theorem proves a.c. spectrum. Kotani-Ushiroya [21] cannot handle such borderline cases.

Proof. Fix $\theta_0$. Then $R_\omega(n)$ and $\theta_\omega(n)$ become random variables which are measurable functions of $\{V_\omega(j)\}_{j \leq n-1}$ and so independent of $\{V_\omega(j)\}_{j \geq n}$. By (2.12c),
\[ R(n + 1)^4 = R(n)^4 \left[ 1 + \frac{V_\omega(n)}{\sin k} \sin(2\bar\theta_\omega(n)) + O(V_\omega^2 + V_\omega^4) \right]. \]
Since $V_\omega(n)$ is independent of $\bar\theta(n)$ and $R(n)$, we have
\[ E\bigl( R_\omega(n)^4 V_\omega(n) \sin(2\bar\theta_\omega(n)) \bigr) = E(V_\omega(n))\, E\bigl( R_\omega^4(n) \sin(2\bar\theta_\omega(n)) \bigr) = 0. \]
Using independence to bound $E(R(n)^4 V_\omega^j)$ by $E(R(n)^4) E(V_\omega^j)$, we see that
\[ E(R_\omega(n + 1)^4) \leq [1 + C E(V_\omega^2(n) + V_\omega^4(n))]\, E(R_\omega^4(n)), \]
where $C$ is uniformly bounded for $k$ in any $(\alpha, \pi - \alpha)$ with $\alpha > 0$. It follows that
\[ E\left( \int_\alpha^{\pi - \alpha} R_\omega(n, k)^4\,dk \right) < \infty. \]
By Fatou's lemma, for a.e. $\omega$,
\[ \liminf \int_\alpha^{\pi - \alpha} R_\omega(n, k)^4\,dk < \infty, \]
and by Theorem 1.3, the spectrum is purely a.c. on $(-2\cos(\alpha), 2\cos(\alpha))$.

For the case where $\sum_{n=1}^{\infty} E(V(n)^2) = \infty$, we need some regularity of the fall-off. Rather than try to find complicated general conditions, we consider the case where $E(V(n)^2) \sim n^{-2\alpha}$ with $\alpha \leq \frac{1}{2}$. The same method can handle a case like $E(V(n)^2) = [n \log(n + 1)]^{-1}$ (which always has singular continuous spectrum of Hausdorff dimension 1) by the kind of arguments we will discuss in the case $\alpha = \frac{1}{2}$; in this case for typical energies $\|T(0, n)\|$ grows like $\log(n)$.

Explicitly, we suppose
(i) $E(V_\omega(n)^2)^{1/2} = \lambda n^{-\alpha}$, $0 < \alpha \leq \frac{1}{2}$; $\lambda > 0$,
(ii) $E(V_\omega(n)) = 0$,
(iii) For some $\varepsilon > 0$, $\sup_\omega |V_\omega(n)| \leq C n^{-(2\alpha/3) - \varepsilon}$,
(iv) $V_\omega(n)$ is independent of $\{V_\omega(j)\}_{j=1}^{n-1}$.

Remarks. 1. Think of the case discussed in [26, 7], where $V_\omega(n) = n^{-\alpha} X_n(\omega)$ with $X_n$ identically distributed bounded, independent random variables. If $E(X) = 0$ and $X$ is bounded, then (i)–(iv) hold.
2. With some extra effort, we could allow unbounded distributions, and only require that $\lim_{n\to\infty} n^{+\alpha} E(V_\omega(n)^2)^{1/2}$ exists and be non-zero.

Theorem 8.2. Suppose (i)–(iv) hold. Fix $k$ in $(0, \pi)$ with $k \neq \frac{\pi}{4}, \frac{2\pi}{4}, \frac{3\pi}{4}$. Then for a.e. $\omega$,
\[ \lim_{n\to\infty} \frac{\log \|T_{2\cos(k)}(n, 0)\|}{\sum_{j=1}^{n} j^{-2\alpha}} = \frac{\lambda^2}{8 \sin^2(k)}. \]

Remark. In case $\alpha < \frac{1}{2}$, this says $\|T(n, 0)\| \sim \exp(C n^{1-2\alpha})$ with $C = \frac{\lambda^2}{8(1 - 2\alpha) \sin^2(k)}$. If $\alpha = \frac{1}{2}$, this says $\|T\| \sim n^C$ with $C = \frac{\lambda^2}{8 \sin^2(k)}$.
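The constants in this remark come from the elementary asymptotics of $\sum_{j=1}^{n} j^{-2\alpha}$, which behaves like $n^{1-2\alpha}/(1-2\alpha)$ for $\alpha < \frac{1}{2}$ and like $\log n$ for $\alpha = \frac{1}{2}$. The Python sketch below is a rough numerical illustration only; the values of `alpha` and `n` are hypothetical choices.

```python
import math

def partial_sum(alpha, n):
    """Sum of j**(-2*alpha) for j = 1..n."""
    return sum(j ** (-2.0 * alpha) for j in range(1, n + 1))

n = 10 ** 5          # hypothetical cutoff
alpha = 0.25         # hypothetical exponent with alpha < 1/2

# Ratio to the power-law prediction n^(1-2α)/(1-2α); tends to 1 as n grows.
ratio_power = partial_sum(alpha, n) / (n ** (1 - 2 * alpha) / (1 - 2 * alpha))

# Borderline case α = 1/2: the harmonic sum compared with log(n).
ratio_log = partial_sum(0.5, n) / math.log(n)

print(ratio_power, ratio_log)
```

The logarithmic case converges slowly because the harmonic sum exceeds $\log n$ by roughly Euler's constant, which is why the second ratio is still visibly above 1 at this cutoff.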
Proof. By Theorem 2.3, we need only prove this result with $R(n)$ replacing $T$ for each $\theta_0$. So fix $k$ and $\theta_0$, and let $\theta_\omega(n)$, $R_\omega(n)$ solve (2.12). By (2.12c),
\[ \log R(n + 1) - \log R(n) = \tfrac{1}{2} \log\bigl( 1 + \nu_k(n) \sin(2\bar\theta(n)) + \nu_k(n)^2 \sin^2(\bar\theta(n)) \bigr). \tag{8.2} \]
Since $\sup_\omega \nu_k(n) \to 0$ as $n \to \infty$, we can use
\[ \log(1 + x) = x - \frac{x^2}{2} + O(x^3). \tag{8.3} \]
We also use
\[ \sin^2\theta - \tfrac{1}{2} \sin^2(2\theta) = \tfrac{1}{4} - \tfrac{1}{2} \cos(2\theta) + \tfrac{1}{4} \cos(4\theta). \]
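The two expansions just quoted can be combined and tested numerically. The Python sketch below is an illustration under stated assumptions: `nu` stands in for $\nu_k(n)$ as a small free parameter, and the function names are ours. It checks the trigonometric identity on a grid and confirms that the second-order approximation of the right side of (8.2) has an error that scales like the cube of `nu`.

```python
import math

def exact(nu, theta):
    """Right side of (8.2) with nu in place of nu_k(n)."""
    return 0.5 * math.log(1.0 + nu * math.sin(2 * theta) + nu ** 2 * math.sin(theta) ** 2)

def second_order(nu, theta):
    """Expansion via (8.3) and the identity: (1/2)[nu sin2θ + nu^2 (1/4 - cos2θ/2 + cos4θ/4)]."""
    trig = 0.25 - 0.5 * math.cos(2 * theta) + 0.25 * math.cos(4 * theta)
    return 0.5 * (nu * math.sin(2 * theta) + nu ** 2 * trig)

thetas = [j * math.pi / 40.0 for j in range(80)]

# Identity sin^2 θ - (1/2) sin^2 2θ = 1/4 - (1/2) cos 2θ + (1/4) cos 4θ on the grid.
identity_gap = max(
    abs(math.sin(t) ** 2 - 0.5 * math.sin(2 * t) ** 2
        - (0.25 - 0.5 * math.cos(2 * t) + 0.25 * math.cos(4 * t)))
    for t in thetas
)

def max_error(nu):
    """Largest deviation between the exact value and the second-order expansion."""
    return max(abs(exact(nu, t) - second_order(nu, t)) for t in thetas)

# Doubling nu should multiply the error by roughly 2^3 = 8.
scaling = max_error(0.02) / max_error(0.01)
print(identity_gap, max_error(0.01), scaling)
```

The constant term $\frac{1}{2} \cdot \frac{1}{4} \nu^2 = \frac{1}{8}\nu^2$ in `second_order` is the source of the factor $\frac{1}{8 \sin^2(k)}$ in the leading term of $\log R(n)$.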
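The net result is
\[ \log R(n) = \frac{1}{8 \sin^2(k)} \sum_{j=1}^{n} E(V_\omega(j)^2) + C_1 + C_2 + C_3 + C_4, \]
where the corrections have the form
\[ C_1 = -\frac{1}{2 \sin(k)} \sum_{j=1}^{n} V_\omega(j) \sin(2\bar\theta_\omega(j)), \]
\[ C_2 = \frac{1}{2 \sin^2(k)} \sum_{j=1}^{n} [V_\omega(j)^2 - E(V_\omega(j)^2)] \left[ \sin^2(\bar\theta_\omega(j)) - \tfrac{1}{2} \sin^2(2\bar\theta_\omega(j)) \right], \]
\[ C_3 = \frac{1}{2 \sin^2(k)} \sum_{j=1}^{n} E(V_\omega(j)^2) \left[ \tfrac{1}{4} \cos(4\bar\theta_\omega(j)) - \tfrac{1}{2} \cos(2\bar\theta_\omega(j)) \right], \]
\[ C_4 = \sum_{j=1}^{n} O(V_\omega(j)^3 + V_\omega(j)^4). \]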
The theorem follows if we prove that for each $q = 1, 2, 3, 4$ and a.e. $\omega$,
\[ \lim_{n\to\infty} \frac{|C_q(\omega)|}{\sum_{j=1}^{n} j^{-2\alpha}} = 0. \tag{8.4} \]
Equation (8.4) for $q = 4$ is an immediate consequence of hypothesis (iii). $C_1$, $C_2$ clearly have zero expectation values and variances that decay properly for us to hope (8.4) holds; the key to the proof will be a Martingale inequality. $C_3$ will depend on the fact that $\cos(\theta)$ has zero average and the slow variation of $E(V_\omega(n)^2)$. We break the proof to present some needed lemmas.

For the first two of these lemmas, let $X_0, X_1, \ldots, X_N$ be independent random variables, where $X_0$ can be vector valued. Suppose that for $j = 1, \ldots, N$,
\[ Z_j = X_j f_j(X_1, \ldots, X_{j-1}; X_0) \tag{8.5} \]
with $f_j$ a measurable function, and that
\[ E(X_j) = 0. \tag{8.6} \]
The following is a variant of a standard Martingale inequality; we provide a proof for the reader's convenience:

Lemma 8.3.
\[ E\left( \sup_{n=1,2,\ldots,N} |Z_1 + \cdots + Z_n| \geq r \right) \leq \frac{1}{r^2} \sum_{j=1}^{N} E(Z_j^2). \tag{8.7} \]
Proof. Define
\[ Y_n = \sum_{j=1}^{n} Z_j, \qquad Q_n = \sum_{j=n+1}^{N} Z_j \]
and let $A_j = \{\omega \mid |Y_1| \leq r, |Y_2| \leq r, \ldots, |Y_j| > r\}$. Then $\chi_n$, the characteristic function of $A_n$, is a function only of $X_0, X_1, \ldots, X_n$ and thus, if $k > n$,
\[ E(Z_k Y_n \chi_{A_n}) = E(X_k)\, E(f_k(X_1, \ldots, X_{k-1}, X_0) Y_n \chi_{A_n}) = 0. \]
Thus,
\[ E(\chi_n Y_n^2) \leq E(\chi_n (Y_n + Q_n)^2), \]
since the cross term has zero expectation when we expand the square. Thus,
\[ r^2 \sum_{j=1}^{N} E(\chi_j) \leq \sum_{j=1}^{N} E(\chi_j Y_j^2) \leq \sum_{j=1}^{N} E(\chi_j Y_N^2) \leq E(Y_N^2). \]
Since $E(Z_j Z_k) = 0$ for $j \neq k$ by the same independence argument, $E(Y_N^2) = \sum_{j=1}^{N} E(Z_j^2)$, which is (8.7).
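For a concrete feel for Lemma 8.3, the following Python sketch computes the probability on the left of (8.7) exactly in a toy case, by enumerating all sign patterns. It assumes the simplest admissible setting, which is our choice and not the paper's: $X_j = \pm 1$ with equal probability and constant multipliers $f_j = c_j$, so that $Z_j = c_j X_j$ and $E(Z_j^2) = c_j^2$.

```python
import itertools

def exceedance_probability(coeffs, r):
    """Exact P(max_n |Z_1 + ... + Z_n| >= r) for Z_j = c_j * X_j with fair signs X_j."""
    outcomes = list(itertools.product((-1, 1), repeat=len(coeffs)))
    hits = 0
    for signs in outcomes:
        partial, worst = 0.0, 0.0
        for c, s in zip(coeffs, signs):
            partial += c * s
            worst = max(worst, abs(partial))
        if worst >= r:
            hits += 1
    return hits / len(outcomes)

# Hypothetical deterministic multipliers c_j = j^(-1/2), j = 1..12.
coeffs = [1.0 / (j + 1) ** 0.5 for j in range(12)]
variance_sum = sum(c * c for c in coeffs)   # equals the sum of E(Z_j^2)

# Each entry: (threshold r, exact probability, right side of (8.7)).
checks = [(r, exceedance_probability(coeffs, r), variance_sum / r ** 2)
          for r in (1.5, 2.0, 3.0)]
for r, prob, bound in checks:
    print(r, prob, bound)
```

The enumeration visits all $2^{12}$ sign patterns, so the left side is exact rather than a Monte Carlo estimate; the bound on the right can exceed 1 for small $r$, in which case it carries no information.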
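Lemma 8.4. Suppose $E(Z_n^2) \leq C n^{-2\alpha}$. Then for a.e. $\omega$:
(1) If $\alpha < \frac{1}{2}$ and $\beta > \frac{1}{2}(1 - 2\alpha)$, then
\[ \lim_{n\to\infty} \Bigl| \sum_{j=1}^{n} Z_j \Bigr|\, n^{-\beta} = 0. \]
(2) If $\alpha = \frac{1}{2}$ and $\beta > \frac{1}{2}$, then
\[ \lim_{n\to\infty} \Bigl| \sum_{j=1}^{n} Z_j \Bigr|\, (\log n)^{-\beta} = 0. \]
(3) If $\alpha > \frac{1}{2}$,
\[ \lim_{n\to\infty} \sum_{j=1}^{n} Z_j = Y_\infty \]
exists, and for any $\beta < \alpha - \frac{1}{2}$,
\[ \lim_{n\to\infty} n^{+\beta} \Bigl| \sum_{j=n}^{\infty} Z_j \Bigr| = 0. \]

Remark. Naively, fluctuations should behave as $\bigl( \sum_{j=1}^{n} j^{-2\alpha} \bigr)^{1/2}$. This lemma shows they are not much worse. Since we only need that they are small compared to $\sum_{j=1}^{n} j^{-2\alpha}$, the lemma suffices.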
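Proof. (1) Pick $\beta_1$ so $\beta > \beta_1 > \frac{1}{2}(1 - 2\alpha)$. By Lemma 8.3,
\[ E\left( \sup_{j=1,\ldots,2^{n-1}} \Bigl| \sum_{k=2^{n-1}+1}^{2^{n-1}+j} Z_k \Bigr| \geq 2^{n\beta_1} \right) \leq C\, 2^{-2n\beta_1}\, 2^{(n-1)}\, 2^{-2(n-1)\alpha} \leq C\, 2^{-(n-1)(2\alpha + 2\beta_1 - 1)} \tag{8.8} \]
is summable in $n$ by the choice of $\beta_1$. Therefore, by the Borel-Cantelli lemma, for a.e. $\omega$, there is an $n_0(\omega)$ so that the sup inside (8.8) is less than $2^{n\beta_1}$ if $n \geq n_0$. Let $j$ be larger than $2^{n_0-1}$ and pick $n$ so that $2^{n-1} + 1 \leq j \leq 2^n$. Then
\[ \begin{aligned}
|Z_1 + \cdots + Z_j| &\leq |Z_1 + \cdots + Z_{2^{n_0}}| + \sum_{k=1}^{n} 2^{k\beta_1} \\
&\leq |Z_1 + \cdots + Z_{2^{n_0}}| + \frac{2^{(n+1)\beta_1}}{2^{\beta_1} - 1} \\
&\leq |Z_1 + \cdots + Z_{2^{n_0}}| + \frac{2^{2\beta_1}}{2^{\beta_1} - 1}\, j^{\beta_1}.
\end{aligned} \]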
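Thus, $\lim j^{-\beta_1} |Z_1 + \cdots + Z_j| < \infty$. Since $\beta > \beta_1$, the limit for $\beta$ is 0.

(2) Pick $\beta_1$ with $\beta > \beta_1 > \frac{1}{2}$ and define
\[ K_n = \left\{ \omega \;\Big|\; \sup_{j=1,\ldots,2^n} \Bigl| \sum_{m=1}^{j} Z_m \Bigr| \geq n^{\beta_1} \right\}. \]
Then by Lemma 8.3,
\[ E(K_n) \leq n^{-2\beta_1} \sum_{j=1}^{2^n} \frac{1}{j} \leq n^{-2\beta_1} (1 + n \log 2) \leq C n^{1 - 2\beta_1}, \]
since $\sum_{j=1}^{k} \frac{1}{j} \leq 1 + \log k$. Pick an integer $m$ so $m(2\beta_1 - 1) > 1$. Then
\[ \sum_{n=1}^{\infty} E(K_{n^m}) < \infty. \]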
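So by the Borel-Cantelli lemma, for a.e. $\omega$, there is $n_0(\omega)$, so if $n \geq n_0$, then $\omega \notin K_{n^m}$. If $j > 2^{n_0^m}$, pick $n$ so that $2^{(n-1)^m} < j \leq 2^{n^m}$. Then
\[ |Z_1 + \cdots + Z_j| \leq (n^m)^{\beta_1} \leq 2^{m\beta_1} (n - 1)^{m\beta_1} \leq 2^{m\beta_1} (\log 2)^{-\beta_1} (\log j)^{\beta_1}. \]

(3) Pick $\beta_1$ so $\beta < \beta_1 < \alpha - \frac{1}{2}$. Then
\[ E\left( \sup_{j=1,\ldots,2^{n-1}} \Bigl| \sum_{k=2^{n-1}+1}^{2^{n-1}+j} Z_k \Bigr| \geq 2^{-n\beta_1} \right) \leq C\, 2^{2n\beta_1}\, 2^{n-1}\, 2^{-2(n-1)\alpha} \leq C\, 2^{2\beta_1}\, 2^{-2(n-1)[\alpha - 1/2 - \beta_1]} \]
is summable. Thus, for a.e. $\omega$, there is an $n_0(\omega)$ so that for $n \geq n_0(\omega)$, the sup is bounded by $2^{-n\beta_1}$. Thus, if $j_1 \geq j_2 \geq 2^{n_2-1} \geq 2^{n_0-1}$,
\[ \Bigl| \sum_{k=j_2}^{j_1} Z_k \Bigr| \leq \sum_{n=n_2}^{\infty} 2^{-n\beta_1} \to 0 \]
as $n_2 \to \infty$. So the sum is convergent (i.e., the partial sums are Cauchy). Moreover, if $j \geq 2^{n_0-1}$ and $n$ is picked so $2^{n-1} \leq j \leq 2^n$, then
\[ \Bigl| \sum_{k=j}^{\infty} Z_k \Bigr| \leq \sum_{m=n}^{\infty} 2^{-m\beta_1} = \frac{2^{-n\beta_1}}{1 - 2^{-\beta_1}} \leq \frac{j^{-\beta_1}}{1 - 2^{-\beta_1}}, \]
and thus, if we multiply by $j^\beta$, the limit is 0.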
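Lemma 8.5. Suppose that $k \in \mathbb{R}$ is not in $\mathbb{Z}\pi$. Then there exist integers $q_\ell \to \infty$ so that for any $\theta_0, \ldots, \theta_{q_\ell}$,
\[ \Bigl| \sum_{j=1}^{q_\ell} \cos(\theta_j) \Bigr| \leq 1 + \sum_{j=1}^{q_\ell} |\theta_j - \theta_0 - kj|. \]

Remark. In essence, we show $\bigl| \sum_{j=1}^{q} \cos(\theta_0 + kj) \bigr| \leq 1$, a stronger result than the ergodic theory result that $\bigl| \frac{1}{q} \sum_{j=1}^{q} \cos(\theta_0 + kj) \bigr| \to 0$. The weaker ergodic theory result suffices for our application, but the proof of this lemma is easy so we give it.

Proof. By general number theory considerations [14], we can find $p_\ell, q_\ell$ so that
\[ \left| k - \frac{\pi p_\ell}{q_\ell} \right| \leq \frac{1}{q_\ell^2} \tag{8.8} \]
and $p_\ell/q_\ell \notin \mathbb{Z}$ if $k \notin \mathbb{Z}\pi$. For any $p/q \notin \mathbb{Z}$ and any $\theta_0$,
\[ \sum_{j=1}^{q} \cos\left( \theta_0 + \frac{jp\pi}{q} \right) = 0. \]
Thus
\[ \begin{aligned}
\Bigl| \sum_{j=1}^{q_\ell} \cos(\theta_j) \Bigr| &= \Bigl| \sum_{j=1}^{q_\ell} \Bigl[ \cos(\theta_j) - \cos\Bigl( \theta_0 + \frac{j p_\ell \pi}{q_\ell} \Bigr) \Bigr] \Bigr| \\
&\leq \sum_{j=1}^{q_\ell} \Bigl| \theta_j - \theta_0 - \frac{j p_\ell \pi}{q_\ell} \Bigr| \\
&\leq \sum_{j=1}^{q_\ell} |\theta_j - \theta_0 - kj| + \sum_{j=1}^{q_\ell} j \Bigl| k - \frac{\pi p_\ell}{q_\ell} \Bigr| \\
&\leq \frac{q_\ell (q_\ell + 1)}{2 q_\ell^2} + \sum_{j=1}^{q_\ell} |\theta_j - \theta_0 - kj|.
\end{aligned} \tag{8.9} \]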
Conclusion of the Proof of Theorem 8.2. We need to verify (8.4) for $q = 1, 2, 3$. $V_\omega(n) \sin(2\bar\theta_\omega(n)) \equiv Z_n$ has the form (8.5) and $E(Z_n^2) \leq C n^{-2\alpha}$, so by Lemma 8.4, for a.e. $\omega$,
\[ |C_1(\omega)| = o\Bigl( \sum_{j=1}^{n} j^{-2\alpha} \Bigr). \]
$[V_\omega(n)^2 - E(V_\omega(n)^2)][\sin^2(\bar\theta) - \frac{1}{2} \sin^2(2\bar\theta)]$ also has the form of (8.5) since $E(V_\omega^2 - E(V_\omega^2)) = 0$. Since $V$ is bounded,
\[ E((V^2 - E(V^2))^2) \leq C E(V^2). \]
Thus, for a.e. $\omega$,
\[ |C_2(\omega)| = o\Bigl( \sum_{j=1}^{n} j^{-2\alpha} \Bigr) \]
also. Finally, we will show
\[ \sum_{j=1}^{n} j^{-2\alpha} \cos(4\bar\theta_\omega(j)) = o\Bigl( \sum_{j=1}^{n} j^{-2\alpha} \Bigr), \]
which proves (8.4) for $q = 3$. By hypothesis on $k$, $4k \notin \mathbb{Z}\pi$ so Lemma 8.5 applies. Let $q_\ell$ be as in that lemma. Note next that by hypothesis (iii) and Proposition 2.4 for $j$ large,
\[ |\theta_\omega(j + 1) - \theta_\omega(j) - k| \leq C_0 j^{-2\alpha/3}. \tag{8.10} \]
Pick $n_0$ so
\[ n_0 \geq q_\ell^2 \tag{8.11} \]
and
\[ 4 C_0 n_0^{-2\alpha/3} \leq q_\ell^{-2}. \tag{8.12} \]
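Suppose $N = n_0 + K q_\ell$. Then
\[ \sum_{j=n_0+1}^{N} j^{-2\alpha} \cos(4\theta_\omega(j)) = \sum_{k=0}^{K} \sum_{j=1}^{q_\ell} (n_0 + k q_\ell + j)^{-2\alpha} \cos(4\theta_\omega(k q_\ell + j)) = A_1 + A_2, \]
where $A_1$ is what we get by replacing $(n_0 + k q_\ell + j)^{-2\alpha}$ by $(n_0 + k q_\ell)^{-2\alpha}$ and $A_2$ is the difference. By Lemma 8.5, (8.10), and (8.12),
\[ A_1 \leq \sum_{k=0}^{K} (n_0 + k q_\ell)^{-2\alpha} [1 + 1], \]
while using $|(n_0 + k q_\ell + j)^{-2\alpha} - (n_0 + k q_\ell)^{-2\alpha}| \leq (n_0 + k q_\ell)^{-2\alpha}\, j n_0^{-1}$ and (8.11),
\[ A_2 \leq \sum_{k=0}^{K} (n_0 + k q_\ell)^{-2\alpha}\, q_\ell^2 n_0^{-1} \leq \sum_{k=0}^{K} (n_0 + k q_\ell)^{-2\alpha}. \]
Thus for any $N$,
\[ \Bigl| \sum_{j=1}^{N} j^{-2\alpha} \cos(4\theta_\omega(j)) \Bigr| \leq C_\ell + 3 q_\ell^{-\alpha} \sum_{j=1}^{N} j^{-2\alpha}, \]
and so
\[ \lim_{N\to\infty} \Bigl( \sum_{j=1}^{N} j^{-2\alpha} \Bigr)^{-1} \Bigl| \sum_{j=1}^{N} j^{-2\alpha} \cos(4\theta_\omega(j)) \Bigr| \leq 3 q_\ell^{-\alpha} \]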
uniformly in $\omega$. Since we can take $q_\ell \to \infty$ by Lemma 8.5, the lim is 0.

Theorem 8.6. Suppose that (i)–(iv) hold with $\alpha < \frac{1}{2}$ but we consider $V(1)$ as a continuous parameter. Then for a.e. $\omega$:
(1) For a dense $G_\delta$ of values of $V(1)$, $H_\omega$ has purely singular continuous spectrum in $(-2, 2)$.
(2) For Lebesgue a.e. value of $V(1)$, $H_\omega$ has dense pure point spectrum in $(-2, 2)$ and the eigenfunctions obey $H_\omega u = 2\cos(k_m) u$ with
\[ \lim_{n\to\infty} \frac{\log(|u(n)^2 + u(n + 1)^2|^{1/2})}{|n|^{1-2\alpha}} = -\frac{(1 - 2\alpha)\lambda^2}{8 \sin^2(k_m)}. \tag{8.13} \]
If we consider a whole-line problem with independent $V_\omega(n)$, where both $\{V_\omega(n)\}_{n=1}^{\infty}$ and $\tilde{V}_\omega(n) \equiv V_\omega(-n)$, $n = 1, 2, \ldots$ obey hypotheses (i)–(iv) and $V_\omega(0)$ has a purely a.c. density, then for a.e. $\omega$, $H_\omega$ has dense pure point spectrum in $(-2, 2)$ and (8.13) holds as $|n| \to \infty$.

Remark. This strengthens the result originally proven in [31] and improved in [7] in two ways. First, we get the explicit constant in (8.13). Second, we only require one $V_\omega(\,\cdot\,)$ to have an a.c. distribution.

Proof. By Theorem 8.2 and Fubini's theorem for a.e. $\omega$, we have for a.e. $k \in (0, \pi)$,
\[ \lim_{n\to\infty} \frac{\log \|T(n)\|}{n^{1-2\alpha}} = \frac{(1 - 2\alpha)\lambda^2}{8 \sin^2(k)}. \]
Thus by Theorem 8.3 of [22], there is an $L^2$-solution obeying (8.13). The theorem follows from general principles on rank one perturbations [12, 4, 5, 28].

The case $\alpha = \frac{1}{2}$ has an extra subtlety we will need to deal with, using an argument modeled on Kotani-Ushiroya [21]. The following replaces an explicit but complex formula they use for the projection onto a decaying solution (and fills in a gap in their argument):

Lemma 8.7. Let $u_\theta = (\cos\theta, \sin\theta)$ in $\mathbb{R}^2$. For any unimodular matrix $A$ with $\|A\| > 1$, let $\theta(A)$ be the unique $\theta \in (-\frac{\pi}{2}, \frac{\pi}{2}]$ with $\|A u_\theta\| = \|A\|^{-1}$. Define $\rho(A) = \|A u_0\| / \|A u_{\pi/2}\|$. Let $A_n$ be a sequence of unimodular matrices with $\|A_n\| \to \infty$ and $\|A_{n+1} A_n^{-1}\| / \|A_n\| \|A_{n+1}\| \to 0$ as $n \to \infty$. Let $\rho_n = \rho(A_n)$, $\theta_n = \theta(A_n)$. Then:
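(i) $\theta_n$ has a limit $\theta_\infty$ if and only if $\lim_{n\to\infty} \rho_n \equiv \rho_\infty$ exists ($\rho_\infty = \infty$ is allowed, but then we only have $|\theta_n| \to \frac{\pi}{2}$).
(ii) Suppose $\theta_n$ has a limit $\theta_\infty \neq 0, \frac{\pi}{2}$ (equivalently, $\rho_\infty \neq 0, \infty$). Then
\[ \lim_{n\to\infty} \frac{\log \|A_n u_\infty\|}{\log \|A_n\|} = -1 \tag{8.14} \]
if and only if
\[ \lim_{n\to\infty} \frac{\log |\rho_n - \rho_\infty|}{\log \|A_n\|} \leq -2. \tag{8.15} \]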
Remark. Consider
\[ A_n = \begin{pmatrix} \cosh(n) & (-1)^n \sinh(n) \\ (-1)^n \sinh(n) & \cosh(n) \end{pmatrix}. \]
Then $\rho(A_n) \equiv 1$ and $\|A_n\| \to \infty$ but $\theta_n = (-1)^{n+1} (\frac{\pi}{4})$ does not have a limit. This shows that the condition $\|A_{n+1} A_n^{-1}\| / \|A_n\| \|A_{n+1}\| \to 0$ is required. Indeed, in this case that limit is 1. Kotani-Ushiroya miss this issue.

Proof. (i) Note first that
\[ \|A_n u_\theta\|^2 = \|A_n\|^2 \sin^2(\theta - \theta_n) + \|A_n\|^{-2} \cos^2(\theta - \theta_n). \tag{8.16} \]
Thus,
\[ \rho_n = \frac{\tan^2(\theta_n) + \|A_n\|^{-4}}{1 + \|A_n\|^{-4} \tan^2(\theta_n)}. \tag{8.17} \]
It follows that $\rho_n$ has a finite limit $\rho_\infty$ if $\tan^2(\theta_n)$ has a finite limit. By writing
\[ \rho_n^{-1} = \frac{\cot^2(\theta_n) + \|A_n\|^{-4}}{1 + \|A_n\|^{-4} \cot^2(\theta_n)}, \]
this is true also for $\rho_n \to \infty$ and $\tan^2(\theta_n) \to \infty$. Pick $\eta \in [0, \frac{\pi}{2}]$ so $\tan^2(\theta_n) \to \tan^2(\eta)$. If $\eta = 0$, then $\theta_n \to 0$, and if $\eta = \frac{\pi}{2}$, then $|\theta_n| \to \frac{\pi}{2}$ because of the continuity of $\tan(\theta)$ on $[-\frac{\pi}{2}, \frac{\pi}{2}]$. If $0 < \eta < \frac{\pi}{2}$, we only have $|\theta_n| \to \eta$ and have to worry about the sign (see the remark above). In (8.16), take $\theta = \theta_{n+1}$ and see that
\[ \sin^2(\theta_{n+1} - \theta_n) \leq \|A_n\|^{-2} \|A_n A_{n+1}^{-1}\|^2 \|A_{n+1} u_{\theta_{n+1}}\|^2 = \|A_n\|^{-2} \|A_{n+1}\|^{-2} \|A_{n+1} A_n^{-1}\|^2 \]
since $A_n A_{n+1}^{-1}$ is unimodular, and thus $\|A_n A_{n+1}^{-1}\| = \|A_{n+1} A_n^{-1}\|$. Thus by hypothesis,
\[ \sin^2(\theta_{n+1} - \theta_n) \to 0. \]
This, together with $|\theta_n| \to \eta \in (0, \frac{\pi}{2})$, implies that $\theta_n$ has a limit.

(ii) By (8.16), we have that (8.14) holds if and only if
\[ \lim_{n\to\infty} \frac{\log |\theta_n - \theta_\infty|}{\log \|A_n\|} \leq -2. \tag{8.18} \]
Since $\theta_\infty \neq 0, \frac{\pi}{2}$, this is true if and only if
\[ \lim_{n\to\infty} \frac{\log |\tan^2(\theta_n) - \tan^2(\theta_\infty)|}{\log \|A_n\|} \leq -2. \]
By (8.17) and $\theta_\infty \neq \frac{\pi}{2}$,
\[ \bigl|\, |\tan^2(\theta_n) - \tan^2(\theta_\infty)| - |\rho_n - \rho_\infty| \,\bigr| \leq C \|A_n\|^{-4}. \]
Thus, (8.18) holds if and only if (8.15) holds.
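The counterexample in the remark following Lemma 8.7 is easy to examine numerically. The Python sketch below builds the matrices $A_n$ from that remark and computes $\rho(A_n)$, the most contracted direction, and the ratio $\|A_{n+1} A_n^{-1}\| / (\|A_n\| \|A_{n+1}\|)$. The helper names and the closed-form $2 \times 2$ norm are our own choices for the illustration.

```python
import math

def A(n):
    """Matrix from the remark: cosh(n) on the diagonal, (-1)^n sinh(n) off it."""
    c, s = math.cosh(n), (-1) ** n * math.sinh(n)
    return [[c, s], [s, c]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(P):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det, P[0][0] / det]]

def op_norm(P):
    """Largest singular value of a real 2x2 matrix, from trace and determinant of P^T P."""
    (a, b), (c, d) = P
    frob2 = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = max(frob2 * frob2 - 4.0 * det * det, 0.0)
    return math.sqrt(0.5 * (frob2 + math.sqrt(disc)))

def rho(P):
    """rho(A) = |A u_0| / |A u_{pi/2}|, the ratio of the column norms."""
    return math.hypot(P[0][0], P[1][0]) / math.hypot(P[0][1], P[1][1])

def contracting_angle(P):
    """Angle in (-pi/2, pi/2] of the unit vector u_theta minimizing |P u_theta|."""
    (a, b), (c, d) = P
    m11, m12, m22 = a * a + c * c, a * b + c * d, b * b + d * d
    return 0.5 * math.atan2(-m12, 0.5 * (m22 - m11))

ns = range(1, 7)
ratios = [op_norm(matmul(A(n + 1), inverse(A(n)))) / (op_norm(A(n)) * op_norm(A(n + 1)))
          for n in ns]
rhos = [rho(A(n)) for n in ns]
angles = [contracting_angle(A(n)) for n in ns]
print(ratios, rhos, angles)
```

The ratios stay at 1 instead of tending to 0, $\rho(A_n)$ is constant, and the contracting angle flips between $\pm\frac{\pi}{4}$, matching the behavior described in the remark.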
Lemma 8.8. Suppose the hypotheses of Theorem 8.2 hold with α = 21 and k 6= is fixed. Then for a.e. ω, there exists an initial condition uθ(ω) so that
π 2π 3π 4, 4 , 4
λ2 log kT2 cos(k) (n, 0)uθ(ω) k =− . n→∞ log(n) 8 sin2 (k) lim
Remark. As noted in [22] (and gotten incorrectly in [21]), Ruelle's deterministic argument doesn't ever suffice in this $\|T\| \sim n^\gamma$ case. If $A_n$ is a sequence of unimodular matrices with $\lim_{n\to\infty} \log\|A_n\| / \log(n) = \gamma$, then [22] has explicit examples (even coming from deterministic Schrödinger operators) for each $\gamma > \frac{1}{2}$ where the decaying solution only obeys $\lim_{n\to\infty} \log\|A_n u_\infty\| / \log(n) = -\gamma + 1$. It also appears one needs $\gamma > \frac{3}{2}$ to be sure of the existence of decaying solutions. But following [21], the probabilistic argument here can replace Ruelle's argument.

Proof. Let $\beta = \frac{\lambda^2}{8\sin^2(k)}$. Let $R_1(n)$ and $R_2(n)$ be the $R$'s associated to $\theta = 0$ and $\theta = \frac{\pi}{2}$. By the proof of Theorem 8.2 for a.e. $\omega$,
\[ \lim_{n\to\infty} \frac{\log\|R_i(n)\|}{\log(n)} = \beta. \tag{8.19} \]
Let $\theta_i(n)$ be the corresponding EFGP angles. By (2.7),
\[ R_1(n)R_2(n)\sin(\theta_1(n) - \theta_2(n)) = \sin(k)[u_1(n)u_2(n-1) - u_1(n-1)u_2(n)] = -1 \]
(by the initial conditions $R_1(1) = R_2(1) = 1$, $\theta_1(1) = 0$, $\theta_2(1) = \frac{\pi}{2}$, and constancy of the Wronskian). Thus by (8.19) for a.e. $\omega$,
\[ \lim_{n\to\infty} \frac{\log|\theta_1(n) - \theta_2(n)|}{\log(n)} = -2\beta. \]
Let $\rho_n = \frac{R_1(n)}{R_2(n)}$.
Then by (2.12c),
\[ L_\omega(n) \equiv [\log\rho(n+1) - \log\rho(n)] = \log(1 + A_1(n)) - \log(1 + A_2(n)), \]
where
\[ A_i(n) = -\frac{V_\omega(n)}{\sin(k)}\sin(2\theta_{i,\omega}(n)) + \frac{V_\omega(n)^2}{\sin^2(k)}\sin^2(\theta_{i,\omega}(n)). \]
Define
\[ F(a, \theta) = \log(1 - a\sin(2\theta) + a^2\sin^2(\theta)). \]
By a finite Taylor expansion,
\[ F(a, \theta) = \sum_{j=1}^{J-1} a^j P_j(\theta) + O(a^J) \tag{8.20} \]
with $P_1(\theta) = \sin(2\theta)$ and the $P$'s, $C^\infty$ in $\theta$. Fix $\varepsilon > 0$ so for $n$ large, use (8.20) to see that $|\theta_1 - \theta_2| = o(n^{-2\beta+\varepsilon})$. Choosing $J$ so $n^{-J/3} = o(n^{-2\beta-1})$, we see that
\[ L_\omega(n) = -\frac{V_\omega(n)}{\sin(k)}[\sin(2\theta_1(n)) - \sin(2\theta_2(n))] + O(n^{-2\beta-1+\varepsilon}). \]
Since $\theta_j(n)$ depend only on $\{V_\omega(k)\}_{k \le n-1}$, we can apply part (3) of Lemma 8.4 (with $2\alpha = 1 + 2\beta - \varepsilon$) to see that for a.e. $\omega$,
\[ \lim_{N\to\infty} \sum_{1}^{N} L_\omega(n) \tag{8.21} \]
exists and
\[ \Big|\sum_{N}^{\infty} L_\omega(n)\Big| \le C_\omega N^{-2\beta+\varepsilon}. \tag{8.22} \]
By (8.21), $\lim \frac{R_1(n)}{R_2(n)} \equiv \rho_\infty$ exists and is different from 0 and $\infty$. Moreover, by (8.22),
\[ \lim \frac{\log|\rho(n) - \rho(\infty)|}{\log(n)} \le -2\beta. \]
Lemma 8.7 completes the proof.
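For readers who wish to experiment with the power-law behavior discussed in Lemma 8.8, the following Python sketch multiplies the one-step transfer matrices of the discrete equation (1.2) for a potential $V(n) = \lambda n^{-1/2} a_n$. It is only a sketch under stated assumptions: the hypotheses (i)–(iv) of Theorem 8.2 are not restated in this section, so the choice of $a_n$ uniform on $[-\sqrt{3}, \sqrt{3}]$ (mean zero, variance one), the function names, and the seed are all assumptions made for illustration.

```python
import math
import random

def transfer_matrix(k, lam, n_steps, seed=0):
    """Entries (t11, t12, t21, t22) of the product of one-step transfer matrices for
    u(n+1) + u(n-1) + V(n) u(n) = E u(n) with E = 2 cos(k) and
    V(n) = lam * n**-0.5 * a_n, a_n i.i.d. uniform on [-sqrt(3), sqrt(3)] (assumed)."""
    rng = random.Random(seed)
    E = 2.0 * math.cos(k)
    t11, t12, t21, t22 = 1.0, 0.0, 0.0, 1.0
    for n in range(1, n_steps + 1):
        a_n = rng.uniform(-math.sqrt(3.0), math.sqrt(3.0))
        v = lam * a_n / math.sqrt(n)
        # One step: (u(n+1), u(n)) = [[E - v, -1], [1, 0]] (u(n), u(n-1)).
        t11, t12, t21, t22 = (E - v) * t11 - t21, (E - v) * t12 - t22, t11, t12
    return t11, t12, t21, t22

def frobenius_norm(t):
    """Frobenius norm, comparable to the operator norm up to a factor sqrt(2)."""
    return math.sqrt(sum(x * x for x in t))

def observed_exponent(k, lam, n_steps, seed=0):
    """log ||T(n, 0)|| / log n for one sample of the potential."""
    return math.log(frobenius_norm(transfer_matrix(k, lam, n_steps, seed))) / math.log(n_steps)

def predicted_exponent(k, lam):
    """beta = lam^2 / (8 sin^2 k), the exponent appearing in Lemma 8.8."""
    return lam ** 2 / (8.0 * math.sin(k) ** 2)
```

For a single sample and moderate `n_steps`, `observed_exponent` fluctuates noticeably around `predicted_exponent`, since the convergence is only logarithmic in $n$; the sketch is meant for qualitative exploration rather than as a verification of the lemma.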
Theorem 8.9. Suppose (i)–(iv) hold with $\alpha = \frac{1}{2}$. Then,
(1) For a.e. $\omega$, the essential spectrum of $H_\omega$ is $[-2, 2]$ and the absolutely continuous spectrum of $H_\omega$ is empty.
(2) If $|\lambda| \ge 2$ and $V_\omega(1)$ has an absolutely continuous distribution, then for a.e. $\omega$, $H_\omega$ has dense point spectrum and only dense point spectrum in $(-2, 2)$.
(3) If $|\lambda| < 2$ and $V_\omega(1)$ has an absolutely continuous distribution, then for a.e. $\omega$, $H_\omega$ has purely singular continuous spectrum in $\{E \mid |E| < (4 - \lambda^2)^{1/2}\}$ and only dense pure point spectrum in $\{E \mid (4 - \lambda^2)^{1/2} < |E| < 2\}$.
In either case (2) or (3), in the region of point spectrum, there are almost surely eigenvectors of power decay $n^{-\beta}$ with
\[ \beta = \frac{\lambda^2}{8 - 2E^2}. \tag{8.23} \]
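The division of $(-2, 2)$ into regions in Theorem 8.9 can be read off from (8.23): a solution decaying like $n^{-\beta}$ is in $\ell^2$ precisely when $\beta > \frac{1}{2}$. The short Python sketch below tabulates this; the function names are hypothetical and the code merely evaluates the formulas stated in the theorem.

```python
import math

def decay_exponent(E, lam):
    """beta = lam^2 / (8 - 2 E^2) from (8.23); only meaningful for |E| < 2."""
    if abs(E) >= 2.0:
        raise ValueError("need |E| < 2")
    return lam ** 2 / (8.0 - 2.0 * E ** 2)

def spectral_region(E, lam):
    """Classify an energy by comparing beta with 1/2, as in Theorem 8.9."""
    beta = decay_exponent(E, lam)
    if beta > 0.5:
        return "point"                  # n^{-beta} is square summable
    if beta < 0.5:
        return "singular continuous"    # no l^2 solution
    return "threshold"

def threshold_energy(lam):
    """The energy (4 - lam^2)^(1/2) separating the two regions when |lam| < 2."""
    return math.sqrt(4.0 - lam ** 2)
```

For instance, with $\lambda = 1$ the energy $E = 0$ falls in the singular continuous region and $E = 1.9$ in the point spectrum region, while for $|\lambda| \ge 2$ every $|E| < 2$ has $\beta \ge \frac{1}{2}$.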
Remark. This theorem extends results of Delyon, et al. [7], Delyon [6], and Kotani-Ushiroya [21]. In particular, [7] conjectured that there is a region of point spectrum near $E = \pm 2$ no matter how small $\lambda$ is.

Proof. By Theorem 8.2, $\lim_{n\to\infty} \|T_\omega(0, n)\| = \infty$ for a.e. $E$ for a.e. $\omega$, so by Theorem 1.1, we conclude (1). By Lemma 8.8, for a.e. pairs $(\omega, E)$, there is a unique decaying solution with rate of decay $n^{-\beta}$ with $\beta = \frac{\lambda^2}{8 - 2E^2}$. If $\beta > \frac{1}{2}$, this is $\ell^2$ and we have potential point spectrum. If $\beta < \frac{1}{2}$, there is no $\ell^2$ solution. The general theory of rank one perturbations ([32, 5]) then yields (2) and (3).

We can compute the precise Hausdorff dimension of the singular continuous spectral measures in this case:
Theorem 8.10. Fix $\lambda < 2$ and a model obeying (i)–(iv) with $\alpha = \frac{1}{2}$. In the region $|E| \le (4 - \lambda^2)^{1/2}$, define
\[ d(E, \lambda) = \frac{4 - E^2 - \lambda^2}{4 - E^2}. \]
Suppose $V_\omega(1)$ has an absolutely continuous density. Then for a.e. $\omega$, the spectral measure, $\mu$, has dimension $d(E, \lambda)$ at $E$ in the sense that for any $\varepsilon$, there is a $\delta$ so that $\mu(A) = 0$ if $A$ is a subset of $(E - \delta, E + \delta)$ of Hausdorff dimension less than $(d - \varepsilon)$, and there is a subset $B$ of Hausdorff dimension less than $(d + \varepsilon)$, so $\mu((E - \delta, E + \delta) \setminus B) = 0$.

Proof. Let $\|u\|_L = (\sum_{j=1}^{L} u(j)^2)^{1/2}$. By the general theory of rank one perturbations, Theorem 8.2, Lemma 8.8, and the assumption of $V_\omega(1)$ for a.e. $\omega$, $\mu$ is supported on the set of energies where most solutions grow as $n^\beta$ and one decays as $n^{-\beta}$, where $\beta(E, \lambda)$ is given by (8.23). The hypothesis for singular spectrum is precisely $\beta < \frac{1}{2}$. Since $\beta < \frac{1}{2}$, $\|u_1\|_L \sim L^{-\beta}L^{1/2}$ while $\|u_2\|_L \sim L^{\beta}L^{1/2}$, where $a \sim b$ is shorthand for $\lim \frac{\log(a)}{\log(b)} = 1$. The Jitomirskaya-Last version [16, 17] of the Gilbert-Pearson [11] theory says that the Borel transform of the spectral measure is supported on the set of $E$'s, where
\[ |m(E + i\varepsilon)| \sim \frac{\|u_2\|_L}{\|u_1\|_L}, \tag{8.26} \]
and $\varepsilon$ is given by
\[ \|u_1\|_L \|u_2\|_L = \frac{1}{2\varepsilon} \tag{8.27} \]
(the $\sim$ in (8.26) holds in the strong sense that the ratio lies in the interval $(5 - \sqrt{24}, 5 + \sqrt{24})$). Thus, $\varepsilon \sim L^{-1}$ and (8.26) says that $|m(E + i\varepsilon)| \sim \varepsilon^{-2\beta}$. Since $\beta$ is continuous, the theory in [3] then says that the local dimension is given by $1 - 2\beta$ as claimed.

9. Random Decaying Potentials: The Continuum Case

Having done the discrete random case, we will only sketch the continuum case. We will specialize to a situation where {V (x)}n≤x
\[ a_n(\omega) = \int_n^{n+1} |V_\omega(y)|\,dy. \tag{9.1} \]
Suppose
(i) $E(V_\omega(x)) = 0$ for each $x$,
(ii) $\sum_n E(a_n^2 e^{Ca_n}) < \infty$ for all $C > 0$,
(iii) {V (x)}n≤x
Then for a.e. $\omega$, $-\frac{d^2}{dx^2} + V_\omega(x)$ on $L^2(0, \infty)$ has purely absolutely continuous spectrum on $(0, \infty)$ for any boundary condition.
Remarks. 1. Our methods imply for all $E > 0$ and a.e. $\omega$, $T_E(n)$ is bounded, and that implies $-\frac{d^2}{dx^2} + V_\omega(x)$ is limit point at infinity, so we need not worry about self-adjointness issues.
2. A simple example where (ii) holds is if $\sup_{\omega,x} |V_\omega(x)| < \infty$ and $E(\int_0^\infty V(y)^2\,dy) < \infty$.

Proof. By (2.4),
\[ R^4(n+1) = R^4(n)\exp(B_n(\omega)), \tag{9.2} \]
where
\[ B_n(\omega) = \frac{2}{k}\int_n^{n+1} V_\omega(x)\sin(2\theta_\omega(x))\,dx. \tag{9.3} \]
By (2.3),
\[ |\theta_\omega(x) - \theta_\omega(n) - k(x - n)| \le \frac{2}{k}a_n(\omega). \tag{9.4} \]
Using
\[ |e^x - 1 - x| \le \tfrac{1}{2}x^2 e^{|x|}, \]
we obtain from (9.2)–(9.4),
\[ R^4(n+1) \le R^4(n)(1 + Ca_n^2 e^{Ca_n}) + Q_n, \tag{9.5a} \]
\[ Q_n = R^4(n)\,\frac{2}{k}\int_n^{n+1} V_\omega(x)\sin(2\theta_\omega(n) + 2k(x - n))\,dx \tag{9.5b} \]
for some constant $C$ uniformly bounded for $k$ in any compact of $(0, \infty)$. Since $V_\omega(x)$ is independent of $\{V_\omega(y)\}_{y \le n}$, it is independent of $R(n)$ and $\theta_\omega(n)$, and so $E(Q_n) = 0$. Moreover, $a_n$ is independent of $R(n)$, so (9.5) implies that
\[ E(R^4(n+1)) \le E(R^4(n))\,E(1 + Ca_n^2 e^{Ca_n}). \]
By condition (ii), we see that
\[ \lim_{n\to\infty} E(R^4(n)) < \infty \]
with bounds uniform in $k$ on compacts of $(0, \infty)$. Thus by Fatou's lemma, for a.e. $\omega$, $\lim \int_a^b R_\omega^4(n, k)\,dk < \infty$ and so the spectrum is purely absolutely continuous by Theorem 1.3.

Theorem 9.2. Let $f$ be supported on $(0, 1)$ and let
\[ V_\omega(x) = \sum_{n=0}^{\infty} (n+1)^{-\alpha} X_n(\omega) f(x - n), \]
where $\{X_n(\omega)\}$ are independent, identically distributed bounded variables of mean zero and $0 < \alpha \le \frac{1}{2}$. Then for $4k \notin \mathbb{Z}\pi$,
\[ \lim_{n\to\infty} \frac{\log\|T(n, 0)\|}{\sum_{j=1}^{n} j^{-2\alpha}} = \frac{E(X_n^2)}{8k^2}\left|\int_0^1 f(y)e^{iky}\,dy\right|^2. \tag{9.6} \]
Remarks. 1. This implies pure point spectrum for a.e. $\omega$ if $\alpha < \frac{1}{2}$.
2. If $\alpha = \frac{1}{2}$, we get singular continuous spectrum for large $E$ and pure point spectrum for small $E$ (assuming $\int_0^1 f(y)\,dy \ne 0$ or $\int_0^1 yf(y)\,dy \ne 0$) and no a.c. spectrum.

Sketch. Define $\theta_n(y) = \theta(n) + ky$ and
\[ \delta\theta_n(y) = -(n+1)^{-\alpha} X_n \int_0^y f(y)\sin^2(\theta_n(y))\,dy. \tag{9.7} \]
By (2.3),
\[ |\theta(n + y) - \theta_n(y) - \delta\theta_n(y)| = O(n^{-2\alpha}) \]
for $y \in (0, 1)$. Plugging this into (2.4), we find that
\[ \log R(n+1) - \log R(n) = Y_n^{(1)} + Y_n^{(2)} + O(n^{-2\alpha}), \]
where
\[ Y_n^{(1)} = \frac{(n+1)^{-\alpha} X_n(\omega)}{2k}\int_0^1 f(y)\sin(2\theta_n(y))\,dy \]
and
\[ Y_n^{(2)} = \frac{(n+1)^{-\alpha} X_n(\omega)}{2k}\int_0^1 2f(y)\cos(2\theta_n(y))(\delta\theta_n)(y)\,dy. \]
By using Lemmas 8.2 and 8.3, one sees that
\[ \sum_{j=0}^{n} Y_j^{(1)} \Big/ \sum_{j=0}^{n} (j+1)^{-2\alpha} \to 0 \]
for a.e. $\omega$. The same lemmas let us replace $X_n(\omega)^2$ by $E(X_n^2(\omega))$ in $Y_n^{(2)}$. So if we let $\cong$ indicate equal up to $o(\sum_{j=0}^{n} (j+1)^{-2\alpha})$ terms, we see that
\[ \log R(n) \cong \sum_{j=0}^{n-1} (Y_j^{(3)} + Y_j^{(4)}), \]
where we use $\sin^2(\theta_n(y)) = \frac{1}{2} - \frac{1}{2}\cos(2\theta_n(y))$ and let $Y_n^{(3)}$ indicate the $-\frac{1}{2}\cos(2\theta)$ terms and $Y_n^{(4)}$ the $\frac{1}{2}$ terms. By an argument analogous to the one in the proof of Theorem 9.2 that used Lemma 8.5, $\sum Y_n^{(4)} \cong 0$ because $k \notin \mathbb{Z}\pi$. As in (6.5), we get
\[ \log R(n) \cong \sum_{j=0}^{n-1} \frac{(j+1)^{-2\alpha} E(X_n(\omega)^2)}{4k^2}\left(\int_0^1 f(y)\cos(2\theta_j(y))\,dy\right)^2. \]
As in the proof of Lemma 6.2, this last square is
\[ \frac{1}{2}\left|\int_0^1 f(y)e^{iky}\right|^2 \]
plus a term that has $\cos(4\theta_j(y))$, which we can handle using Lemma 8.5.
References
1. Carmona, R.: One-dimensional Schrödinger operators with random or deterministic potentials, New spectral types. J. Funct. Anal. 51, 229–258 (1983)
2. Chulaevsky, V. and Spencer, T.: Positive Lyapunov exponents for a class of deterministic potentials. Commun. Math. Phys. 168, 455–466 (1995)
3. del Rio, R., Jitomirskaya, S., Last, Y. and Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank one perturbations, and localization. J. d'Analyse Math. 69, 153–200 (1996)
4. del Rio, R., Makarov, N. and Simon, B.: Operators with singular continuous spectrum, II. Rank one operators. Commun. Math. Phys. 165, 59–67 (1994)
5. del Rio, R., Simon, B. and Stolz, G.: Stability of spectral types for Sturm-Liouville operators. Math. Research Lett. 1, 437–450 (1994)
6. Delyon, F.: Apparition of purely singular continuous spectrum in a class of random Schrödinger operators. J. Statist. Phys. 40, 621–630 (1985)
7. Delyon, F., Simon, B. and Souillard, B.: From power pure point to continuous spectrum in disordered systems. Ann. Inst. H. Poincaré 42, 283–309 (1985)
8. Eastham, M.S.P. and Kalf, H.: Schrödinger-type Operators With Continuous Spectra. London: Pitman Books, 1982
9. Eggarter, T.: Some exact results on electron energy levels in certain one-dimensional random potentials. Phys. Rev. B5, 3863–3865 (1972)
10. Gilbert, D.J.: On subordinacy and analysis of the spectrum of Schrödinger operators with two singular endpoints. Proc. Roy. Soc. Edin. 112A, 213–229 (1989)
11. Gilbert, D.J. and Pearson, D.B.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. 128, 30–56 (1987)
12. Gordon, A.Ya.: Pure point spectrum under 1-parameter perturbations and instability of Anderson localization. Commun. Math. Phys. 164, 489–505 (1994)
13. Gredeskul, S.A. and Pastur, L.A.: Behavior of the density of states in one-dimensional disordered systems near the edges of the spectrum. Theor. Math. Phys. 23, 132–139 (1975)
14. Hardy, G.H. and Wright, E.M.: An Introduction to the Theory of Numbers, 5th ed., Oxford: Oxford University Press, 1979
15. Harris, W.A. and Lutz, D.A.: Asymptotic integration of adiabatic oscillator. J. Math. Anal. Appl. 51, 76–93 (1975)
16. Jitomirskaya, S. and Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996)
17. Jitomirskaya, S. and Last, Y.: Power law subordinacy and singular spectra I. Half-line operators. In preparation
18. Kahane, J.P.: Some Random Series of Functions. Cambridge: Cambridge University Press, 1985
19. Kiselev, A.A.: Absolutely continuous spectrum of one-dimensional Schrödinger operators and Jacobi matrices with slowly decreasing potentials. Commun. Math. Phys. 179, 377–400 (1996)
20. Kiselev, A., Remling, C. and Simon, B.: Effective perturbation methods for one-dimensional Schrödinger operators. Preprint
21. Kotani, S. and Ushiroya, N.: One-dimensional Schrödinger operators with random decaying potentials. Commun. Math. Phys. 115, 247–266 (1988)
22. Last, Y. and Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. To appear in Invent. math.
23. Molchanov, S.: One-dimensional Schrödinger operators with sparse potentials. Preprint
24. Naboko, S.: Dense point spectra of Schrödinger and Dirac operators. Theor. Math. Phys. 68, 18–28 (1986)
25. Naboko, S. and Yakovlev, S.I.: On the point spectrum of discrete Schrödinger operators. Func. Anal. Appl. 26, 85–88 (1992)
26. Pastur, L. and Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer, 1992
27. Pearson, D.B.: Singular continuous measures in scattering theory. Commun. Math. Phys. 60, 13–36 (1978)
28. Pearson, D.B.: Pathological spectral properties. Mathematical Problems in Theoretical Physics, Proc. Internat. Conf. Math. Phys., Lausanne, Lecture Notes in Phys. 116, 1979, Berlin–New York: Springer, 1980, pp. 49–51
29. Remling, C.: A probabilistic approach to one-dimensional Schrödinger operators with sparse potentials. Commun. Math. Phys. 185, 313–323 (1997)
30. Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publ. Math. IHES 50, 275–306 (1979)
31. Simon, B.: Some Jacobi matrices with decaying potentials and dense point spectrum. Commun. Math. Phys. 87, 253–258 (1982)
32. Simon, B.: Spectral analysis and rank one perturbations and applications. CRM Lecture Notes Vol. 8, J. Feldman, R. Froese, L. Rosen, eds., Providence, RI: Amer. Math. Soc., 1995, pp. 109–149
33. Simon, B.: Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators. Proc. Amer. Math. Soc. 124, 3361–3369 (1996)
34. Simon, B.: Some Schrödinger operators with dense point spectrum. Proc. Amer. Math. Soc. 125, 203–208 (1997)
35. Simon, B. and Stolz, G.: Operators with singular continuous spectrum, V. Sparse potentials. Proc. Amer. Math. Soc. 124, 2073–2080 (1996)
36. Stolz, G.: Bounded solutions and absolute continuity of Sturm-Liouville operators. J. Math. Anal. Appl. 169, 210–228 (1992)
37. Weidmann, J.: Zur Spektraltheorie von Sturm-Liouville Operatoren. Math. Z. 98, 268–302 (1967)

Communicated by D. C. Brydges
Commun. Math. Phys. 194, 47 – 60 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Conformally Symplectic Dynamics and Symmetry of the Lyapunov Spectrum
Maciej P. Wojtkowski1, Carlangelo Liverani2
1 Department of Mathematics, University of Arizona, Tucson, Arizona 85721, USA. E-mail: [email protected]
2 Mathematics Department, University of Rome “Tor Vergata”, 00133 Rome, Italy. E-mail: [email protected]
Received: 18 March 1997 / Accepted: 5 June 1997
Abstract: A generalization of the Hamiltonian formalism is studied and the symmetry of the Lyapunov spectrum established for the resulting systems. The formalism is applied to the Gaussian isokinetic dynamics of interacting particles with hard core collisions and other systems.
0. Introduction

We study the symmetry of the Lyapunov spectrum in systems more general than Hamiltonian but closely related to the symplectic formalism. We call these systems conformally Hamiltonian. They are determined by a non-degenerate 2-form $\Theta$ on the phase space and a function $H$, called again a Hamiltonian. The form $\Theta$ is not assumed to be closed but it satisfies the following basic condition
\[ d\Theta = \gamma \wedge \Theta, \]
for some closed 1-form $\gamma$. This condition guarantees that, at least locally, the form $\Theta$ can be multiplied by a nonzero function to give a bona fide symplectic structure. The skew-orthogonality of tangent vectors is preserved under multiplication of the form by any nonzero function, hence the name conformally symplectic structure. These ideas were known to geometers for a long time, see for example the paper of Vaisman [V].

The conformally Hamiltonian (with respect to the form $\Theta$) vector field $\nabla_\Theta H$ is defined by the usual relation
\[ \Theta(\cdot, \nabla_\Theta H) = dH(\cdot). \]
The Hamiltonian function $H$ is again a first integral of the system. In Sect. 2 we prove that for any conformally Hamiltonian system restricted to a smooth level set of the Hamiltonian the Lyapunov spectrum is symmetric, with symmetric exponents adding up to a constant. More precisely the direction of the flow has to be factored out. In Sect. 3 we extend this formalism to flows with collisions.
In Sect. 1 we give an independent proof of the fact that for any conformally symplectic cocycle we have the symmetry of the Lyapunov spectrum. The first proof of this fact in the symplectic case goes back to Benettin et al. [B-G-G-S]. Our proof is based on an alternative description of Lyapunov exponents and it borrows an idea from [W1] (Lemma 1.2).

In Sect. 4 we present examples which were recently the subject of several papers. We show that the Gaussian isokinetic dynamics can be viewed as a conformally Hamiltonian system, by which we immediately recover the results of Dettmann and Morriss [D-M 1, D-M 2], on the symmetry of the Lyapunov spectrum and the Hamiltonian character of the dynamics. We extend these results to systems with collisions, taking advantage of the fact that our formalism works equally well for collisions as it does for flows. Such systems were studied in the papers of Garrido and Gallavotti, [G-G], Dellago, Posch and Hoover, [D-P-H], and Bonetto, Gallavotti and Garrido, [B-G-G], and the symmetry of the Lyapunov spectra was demonstrated numerically. In such systems the symmetry of the Lyapunov spectrum was discovered as a pairing rule by Evans, Cohen and Morriss, [E-C-M, E-M]. Chernov et al., [Ch-E-L-S], studied rigorously the Lorentz gas of periodic scatterers with an electric external field in dimension 2. Latz, van Beijeren and Dorfman, [L-B-D], considered a thermostated random Lorentz gas in 3 dimensions and found the symmetry there. Let us note that we prove the symmetry of the Lyapunov spectrum for any invariant (ergodic) measure and not only for the SRB measure, which is the easiest to access numerically.

We also show that the Gaussian isokinetic dynamics on a Riemannian manifold can be given the same treatment. The last application is to Nosé-Hoover dynamics. We show that the Hoover equations can be naturally viewed as a conformally Hamiltonian system, thus giving another proof of the symmetry of Lyapunov spectra for this system.
It was originally proven by Dettmann and Morriss, [D-M 3].

Finally let us note that our approach is one of several possible. Recently Choquard, [Ch], showed that the isokinetic and Nosé-Hoover dynamics can be considered as Lagrangian systems. Even in the conformally symplectic framework one can keep the Hamiltonian unchanged and modify the form $\Theta$, or keep the form unchanged and modify the Hamiltonian, or keep both the form and the Hamiltonian unchanged, but change time on a level set of the Hamiltonian. We elaborate on that in Remark 2.1. We believe that our approach sheds new light on these issues.

1. Conformally Symplectic Group

Let $\omega = \sum_{i=1}^{n} dp_i \wedge dq_i$ be the standard linear symplectic form in $\mathbb{R}^n \times \mathbb{R}^n$.

Proposition 1.1. For an invertible linear mapping $S$ acting on $\mathbb{R}^{2n} = \mathbb{R}^n \times \mathbb{R}^n$, the following are equivalent:
(a) $\omega(Su, Sv) = \beta\omega(u, v)$ for some scalar $\beta$ and all $u, v \in \mathbb{R}^{2n}$;
(b) $\omega(Su, Sv) = 0$ if and only if $\omega(u, v) = 0$, i.e., $S$ preserves skew-orthogonality of vectors;
(c) $S$ maps Lagrangian subspaces of $\mathbb{R}^{2n}$ onto Lagrangian subspaces.

Proof. It is apparent that (a) implies (b) and that (b) is equivalent to (c). It remains to prove that (c) implies (a). By composing $S$ with an appropriate linear symplectic map we can assume without loss of generality that $S$ preserves the Lagrangian subspaces $\mathbb{R}^n \times \{0\}$ and $\{0\} \times \mathbb{R}^n$, i.e., $S$ is block diagonal. Moreover, again by multiplying by an appropriate linear symplectic map, we can assume that $S$ is equal to identity on $\mathbb{R}^n \times \{0\}$. By (b) we conclude that $S$ is diagonal on $\{0\} \times \mathbb{R}^n$. A simple calculation shows that to satisfy (b) this diagonal matrix must be a multiple of identity, which gives us (a).

We will call a linear map from $GL(\mathbb{R}^{2n})$ conformally symplectic if it satisfies one of the properties in Proposition 1.1. The group of all conformally symplectic maps will be denoted by $CSp(\mathbb{R}^{2n})$.

Let $X$ be a measurable space with probabilistic measure $\mu$ and let $T : X \to X$ be an ergodic map. Let further $A : X \to GL(\mathbb{R}^{2n})$ be a measurable map such that
\[ \int_X \log_+ \|A(x)\|\,d\mu(x) < +\infty. \tag{1.1} \]
We define the matrix valued cocycle
\[ A^m(x) = A(T^{m-1}x) \cdots A(x). \]
By the Oseledets Multiplicative Ergodic Theorem, [O], which in this generality was first proven by Ruelle, [R], there are numbers $\lambda_1 < \ldots < \lambda_s$, called the Lyapunov exponents of the measurable cocycle $A(x)$, $x \in X$, and for almost all $x$ a flag of subspaces
\[ \{0\} = V_0 \subset V_1(x) \subset \ldots \subset V_{s-1}(x) \subset V_s = \mathbb{R}^{2n}, \]
such that for all vectors $v \in V_k(x) \setminus V_{k-1}(x)$,
\[ \lim_{m\to+\infty} \frac{1}{m}\log\|A^m(x)v\| := \lambda(v) = \lambda_k. \]
In addition, denoting by $d_k$ the difference between the dimensions of $V_k$ and $V_{k-1}$ ($d_k$ is called the multiplicity of the $k$th Lyapunov exponent), the following holds:
\[ \sum_{k=1}^{s} d_k\lambda_k = \int_X \log|\det(A(x))|\,d\mu(x), \tag{1.2} \]
i.e., the sum of all Lyapunov exponents is equal to the average exponential rate of volume growth.

Given a measurable cocycle $A(x)$, $x \in X$, satisfying (1.1) and with values in the conformally symplectic group $CSp(\mathbb{R}^{2n})$ we obtain a measurable function $\beta = \beta(x)$ such that
\[ \omega(A(x)u, A(x)v) = \beta(x)\omega(u, v), \tag{1.3} \]
for all vectors $u, v \in \mathbb{R}^{2n}$. Let us define
\[ b := \int_X \log|\beta(x)|\,d\mu(x). \tag{1.4} \]
Lemma 1.1. If a measurable cocycle $A(x)$, $x \in X$, satisfies (1.1) and it has values in the conformally symplectic group $CSp(\mathbb{R}^{2n})$, then
\[ \sum_{k=1}^{s} d_k\lambda_k = nb. \]

Proof. Since $\omega^n$ is the volume form it follows from (1.3) that the determinant of $A(x)$ is $\det A(x) = \beta(x)^n$. The lemma follows by applying (1.2).
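The relation $\det A = \beta^n$ used in this proof can be checked on a concrete matrix. The Python sketch below is an illustration under assumptions: coordinates are ordered $(p_1, \ldots, p_n, q_1, \ldots, q_n)$, and the sample matrix $A = cM$ with $M = \mathrm{diag}(P, (P^T)^{-1})$ is a hypothetical example chosen so that (1.3) holds with $\beta = c^2$.

```python
def omega(u, v, n):
    """Standard form sum_i dp_i ^ dq_i on vectors ordered (p_1..p_n, q_1..q_n)."""
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n))

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def det(A):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(A) == 1:
        return A[0][0]
    total = 0.0
    for j, entry in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

n = 2
c = 1.5
# M = diag(P, P^{-T}) with P = [[2, 1], [0, 1]] is symplectic (assumed example).
M = [[2.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.0],
     [0.0, 0.0, -0.5, 1.0]]
A = [[c * m for m in row] for row in M]   # conformally symplectic, beta = c**2
beta = c ** 2
```

With these definitions, `omega(matvec(A, u), matvec(A, v), n)` equals `beta * omega(u, v, n)` for any test vectors, and `det(A)` equals `beta ** n`, as the proof asserts.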
Lemma 1.2. Given a measurable cocycle $A(x)$, $x \in X$, satisfying (1.1) and with values in the conformally symplectic group $CSp(\mathbb{R}^{2n})$, for each two non skew-orthogonal vectors $u, v \in \mathbb{R}^{2n}$, i.e., $\omega(u, v) \ne 0$, we have
\[ \lambda(u) + \lambda(v) \ge b. \]

Proof. For the standard Euclidean norm $\|\cdot\|$ we have $|\omega(u, v)| \le \|u\|\|v\|$. From (1.3) we obtain
\[ \omega(A^m(x)u, A^m(x)v) = \omega(u, v)\prod_{i=0}^{m-1} \beta(T^i x). \]
Therefore,
\[ \log|\omega(A^m(x)u, A^m(x)v)| = \log|\omega(u, v)| + \sum_{i=0}^{m-1} \log|\beta(T^i x)|, \]
and
\[ \frac{1}{m}\log|\omega(A^m(x)u, A^m(x)v)| \le \frac{1}{m}\log\|A^m(x)u\| + \frac{1}{m}\log\|A^m(x)v\|. \]
Putting these relations together and using the Birkhoff Ergodic Theorem we conclude that
\[ b = \int_X \log|\beta(x)|\,d\mu(x) \le \lambda(u) + \lambda(v). \]
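The inequality of Lemma 1.2 can be observed in the simplest setting of a constant cocycle, where $A(x) \equiv A$ and $b = \log|\beta|$. The following Python sketch is an assumption-laden illustration: the diagonal matrix $A = c \cdot \mathrm{diag}(a_1, a_2, 1/a_1, 1/a_2)$, the coordinate ordering $(p_1, p_2, q_1, q_2)$, and the finite number of iterations used to approximate $\lambda(v)$ are all choices made for the example.

```python
import math

def omega(u, v, n):
    """Standard form sum_i dp_i ^ dq_i; vectors are ordered (p_1..p_n, q_1..q_n)."""
    return sum(u[i] * v[n + i] - u[n + i] * v[i] for i in range(n))

def lyapunov_exponent(A, v, steps=400):
    """Approximate lambda(v) = lim (1/m) log ||A^m v|| for the constant cocycle A."""
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    total = 0.0
    for _ in range(steps):
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
        norm = math.sqrt(sum(x * x for x in v))
        total += math.log(norm)      # log of the one-step growth factor
        v = [x / norm for x in v]    # renormalise to avoid overflow
    return total / steps

# Hypothetical example: A = c * diag(a1, a2, 1/a1, 1/a2), so beta = c**2 in (1.3).
c, a1, a2 = 1.5, 2.0, 3.0
diagonal = [c * a1, c * a2, c / a1, c / a2]
A = [[diagonal[i] if i == j else 0.0 for j in range(4)] for i in range(4)]
b = math.log(c ** 2)                 # (1.4) for a constant beta

def pair_sum(u, v):
    """lambda(u) + lambda(v), the left-hand side in Lemma 1.2."""
    return lyapunov_exponent(A, u) + lyapunov_exponent(A, v)
```

For the conjugate pair $u = e_{p_1}$, $v = e_{q_1}$ the sum equals $b$, for a generic non skew-orthogonal pair it exceeds $b$, and for the skew-orthogonal pair $e_{q_1}, e_{q_2}$ it falls below $b$, which shows that the hypothesis $\omega(u, v) \ne 0$ cannot be dropped.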
The following lemma is obvious. We formulate it to streamline the proof of Theorem 1.4 where it is used twice. For a linear subspace $X \subset \mathbb{R}^{2n}$ we denote by $X^\angle$ the skew-orthogonal complement of $X$, i.e., $X^\angle \subset \mathbb{R}^{2n}$ is the linear subspace containing all vectors $v$ such that $\omega(u, v) = 0$ for all $u \in X$. Since $\omega$ is assumed to be nondegenerate we have
\[ \dim X + \dim X^\angle = 2n \quad \text{and} \quad (X^\angle)^\angle = X. \]

Lemma 1.3. Let $U, V \subset \mathbb{R}^{2n}$ be two linear subspaces. If $\omega(u, v) = 0$ for all $u \in U$ and $v \in V$, then $U \subset V^\angle$, $V \subset U^\angle$ and $\dim U + \dim V \le 2n$.

Theorem 1.4. If a measurable cocycle $A(x)$, $x \in X$, satisfies (1.1) and it has values in the conformally symplectic group $CSp(\mathbb{R}^{2n})$, then we have the following symmetry of the Lyapunov spectrum:
\[ \lambda_k + \lambda_{s-k+1} = b, \]
where $b$ is given by (1.4), and the multiplicities of $\lambda_k$ and $\lambda_{s-k+1}$ are equal, for $k = 1, 2, \ldots, s$. Moreover the subspace $V_{s-k}$ is the skew-orthogonal complement of $V_k$.
Proof. Let $\mu_1 \le \mu_2 \le \ldots \le \mu_{2n}$ be the Lyapunov exponents taken with repetitions according to their multiplicities. By Lemma 1.1
\[ \mu_1 + \mu_2 + \ldots + \mu_{2n} = nb. \]
We can choose a flag of subspaces
\[ \{0\} = W_0 \subset W_1(x) \subset \ldots \subset W_{2n-1}(x) \subset W_{2n} = \mathbb{R}^{2n}, \]
such that $\dim W_l = l$ and for all vectors $v \in W_l(x) \setminus W_{l-1}(x)$ the Lyapunov exponent $\lambda(v) = \mu_l$, for $l = 1, 2, \ldots, 2n$. (Note that except in the case of all multiplicities equal to 1 there is a continuum of such flags.) Since for any $l \le n$, $\dim W_l + \dim W_{2n-l+1} = 2n + 1$, by Lemma 1.3 there are vectors $u \in W_l$ and $v \in W_{2n-l+1}$ such that $\omega(u, v) \ne 0$. By continuity there must be also vectors $\tilde{u} \in W_l \setminus W_{l-1}$ and $\tilde{v} \in W_{2n-l+1} \setminus W_{2n-l}$ such that $\omega(\tilde{u}, \tilde{v}) \ne 0$. It follows from Lemma 1.2 that $\mu_l + \mu_{2n-l+1} \ge b$, for $l = 1, 2, \ldots, n$. Adding these inequalities together, we get
\[ nb = \sum_{l=1}^{n} (\mu_l + \mu_{2n-l+1}) \ge nb, \]
which shows that all the inequalities must be actually equalities. It follows immediately that for any $k = 1, \ldots, s$, the multiplicities of $\lambda_k$ and $\lambda_{s-k+1}$ are equal and $\lambda_k + \lambda_{s-k+1} = b$.

To show that the subspace $V_{s-k}$ is the skew-orthogonal complement of the subspace $V_k$ we observe that $\omega(u, v) = 0$ for any $u \in V_k$ and $v \in V_{s-k}$. Indeed, if this is not the case we could use Lemma 1.2 to claim that $\lambda_k + \lambda_{s-k} \ge b$, which leads to the contradiction
\[ b = \lambda_k + \lambda_{s-k+1} > \lambda_k + \lambda_{s-k} \ge b. \]
We can now apply Lemma 1.3 and we obtain $V_{s-k} \subset V_k^\angle$. Since the dimensions of these subspaces are equal we must have $V_{s-k} = V_k^\angle$.

2. Conformally Symplectic Manifolds and Conformal Hamiltonian Flows

Let $M$ be a smooth manifold of even dimension. A conformally symplectic structure on $M$ is a differentiable 2-form $\Theta$ which is non-degenerate and has the following basic property:
\[ d\Theta = \gamma \wedge \Theta, \tag{2.1} \]
for some closed 1-form $\gamma$. A manifold with such a form $\Theta$ is called conformally symplectic. The origin of this name becomes clear when one observes that locally $\gamma = dU$ for some smooth function $U$ and $d(e^{-U}\Theta) = 0$, i.e. $e^{-U}\Theta$ defines a bona fide symplectic structure.

For a given function $H : M \to \mathbb{R}$, called a Hamiltonian, let us consider a vector field $\nabla_\Theta H$ defined by the usual relation
\[ \Theta(\cdot, \nabla_\Theta H) = dH. \tag{2.2} \]
We will call it the conformally Hamiltonian vector field, or conformally symplectic, or simply a Hamiltonian vector field when the conformally symplectic structure is clearly chosen. Note that our definition does not coincide with the definition of a Hamiltonian vector field from [V].

Let $\Phi^t$ denote the flow defined by the vector field $F = \nabla_\Theta H$. The Hamiltonian function $H$ is a first integral of the system. Indeed we have
\[ \frac{d}{dt}H = dH(\nabla_\Theta H) = \Theta(\nabla_\Theta H, \nabla_\Theta H) = 0. \]
Let us consider the Lie derivative of the form $\Theta$ in the direction of vector field $F$, i.e.,
\[ (L_F\Theta)(\xi, \eta) := \frac{d}{du}\Theta(D\Phi^u\xi, D\Phi^u\eta)\Big|_{u=0}. \]
Theorem 2.1. For a Hamiltonian vector field $F = \nabla_\Theta H$ we have
\[ L_F\Theta = \gamma(F)\Theta + \gamma \wedge dH. \tag{2.3} \]

Proof. We will use the Cartan formula ([A-M-R])
\[ L_F = i_F d + d\,i_F, \]
where $i_F$ is the interior and $d$ the exterior derivative. (For a differential $m$-form $\zeta$ the interior derivative $i_F\zeta$ is the differential $(m-1)$-form obtained by substituting $F$ as the first vector argument of $\zeta$.) We have $i_F\Theta = -dH$ and we get immediately
\[ L_F\Theta = i_F d\Theta - d^2H = i_F(\gamma \wedge \Theta) = \gamma(F)\Theta + \gamma \wedge dH. \]

Let us restrict the flow $\Phi^t$ to one smooth level set of the Hamiltonian, $M^c = \{z \in M \mid H(z) = c\}$. In particular we assume that on $M^c$ the differential $dH$ and the vector field $F$ do not vanish. For two vectors $\xi, \eta$ from the tangent space $T_zM^c$, we introduce
\[ w(t) = \Theta(D_z\Phi^t\xi, D_z\Phi^t\eta). \]
By (2.3) we get
\[ \frac{d}{dt}w(t) = \gamma(F(\Phi^tz))w(t), \]
since $dH$ vanishes on the tangent space $T_zM^c$. We conclude that
\[ \Theta(D_z\Phi^t\xi, D_z\Phi^t\eta) = \beta(t)\Theta(\xi, \eta), \tag{2.4} \]
for every $\xi, \eta$ from $T_zM^c$ and
\[ \beta(t) = \exp\left(\int_0^t \gamma(F(\Phi^uz))\,du\right). \tag{2.5} \]
Remark 2.1. Let us note that under a non-degenerate time change a conformally Hamiltonian vector field is still conformally Hamiltonian with the same Hamiltonian function but with respect to a modified conformally symplectic form. More precisely if $F = \nabla_\Theta H$ is a Hamiltonian vector field, then if the new time $\tau$ is related to the original time $t$ by
\[ \frac{d\tau}{dt} = f, \]
for some function $f$ of the phase point, we get that the vector field $\frac{1}{f}F$ is conformally symplectic with respect to the form $\tilde{\Theta} = f\Theta$. Indeed
\[ d\tilde{\Theta} = (d\ln f + \gamma) \wedge \tilde{\Theta} \quad \text{and} \quad \tilde{\Theta}\left(\cdot, \frac{1}{f}F\right) = dH. \]
Alternatively we can keep the same conformally symplectic form and modify the Hamiltonian separately on each level set. Indeed we have
\[ \Theta\left(\cdot, \frac{1}{f}F\right) = \frac{1}{f}dH = d\left(\frac{1}{f}(H - c)\right), \]
where the last equality is valid only on the level set $\{H = c\}$.

Finally, let us consider the symplectic form $e^{-U}\Theta$. On the level set $\{H = c\}$ we have
\[ d(e^{-U}(H - c)) = e^{-U}dH. \]
It follows that on this level set
\[ e^{-U}\Theta(\cdot, F) = d\left(e^{-U}(H - c)\right), \]
and, as a result, the vector field $F$ coincides locally with the Hamiltonian vector field given by the Hamiltonian $e^{-U}(H - c)$ (with respect to the symplectic form $e^{-U}\Theta$). This observation provides an alternate way to derive (2.4) by using the preservation of the symplectic form $e^{-U}\Theta$ by any Hamiltonian flow (with respect to this symplectic form).

For a fixed level set $M^c$ we introduce the quotient of the tangent bundle $TM^c$ of $M^c$ by the vector field $F = \nabla_\Theta H$, i.e., by the one dimensional subspace spanned by $F$. Let us denote the quotient bundle by $\hat{T}M^c$. The form $\Theta$ factors naturally from $TM^c$ to $\hat{T}M^c$, in view of (2.2). The factor form defines in each of the quotient tangent spaces $\hat{T}_zM^c$, $z \in M^c$, a linear symplectic form. The derivative of the flow preserves the vector field $F$, i.e.,
\[ D_z\Phi^t(F(z)) = F(\Phi^tz). \]
As a result the derivative can be also factored on the quotient bundle and we call it the transversal derivative cocycle and denote it by
\[ A^t(z) : \hat{T}_zM^c \to \hat{T}_{\Phi^tz}M^c. \]
It follows immediately from (2.4) that the transversal derivative cocycle is conformally symplectic with respect to $\Theta$ (or more precisely the linear symplectic form it defines in the quotient tangent spaces). We fix an invariant probability measure $\mu_c$ on $M^c$ and assume that
\[ \int_{M^c} \|D_z\Phi^t\|\,d\mu_c(z) < +\infty. \]
Under this assumption the derivative cocycle has well defined Lyapunov exponents, cf. [O, R]. Then the transversal derivative cocycle has also well defined Lyapunov exponents which coincide with the former except that one zero Lyapunov exponent is skipped. We can immediately apply Theorem 1.4 to the transversal derivative cocycle and we get the following.

Theorem 2.2. For a Hamiltonian flow $\Phi^t$, defined by the vector field $F = \nabla_\Theta H$, restricted to one level set $M^c$ we have the following symmetry of the Lyapunov spectrum of the transversal derivative cocycle with respect to an invariant ergodic probability measure $\mu$. Let
\[ \{0\} \subset V_0(z) \subset V_1(z) \subset \ldots \subset V_{s-1}(z) \subset V_s = \hat{T}_zM^c \]
be the flag of subspaces at $z$ associated with the Lyapunov spectrum $\lambda_1 < \lambda_2 < \ldots < \lambda_{s-1} < \lambda_s$ of the transversal derivative cocycle $A^t(z)$, $z \in M^c$. Then the multiplicities of $\lambda_k$ and $\lambda_{s-k+1}$ are equal and
\[ \lambda_k + \lambda_{s-k+1} = a, \quad \text{for } k = 1, 2, \ldots, s, \]
where $a = \int_{M^c} \gamma(F(z))\,d\mu_c(z)$. Moreover the subspace $V_{s-k}$ is the skew-orthogonal complement of the subspace $V_k$.

Note that the invariant measure $\mu_c$ can be supported on a single periodic orbit, so that Theorem 2.2 applies as well to the real parts of the Floquet exponents. To apply Theorem 1.4 it is enough to have linear symplectic forms in each of the quotient tangent spaces (to the level set), not necessarily coming from a conformally symplectic structure on the phase space. But then one needs to check directly how the transversal derivative cocycle acts on these forms, because we do not have the advantage of Theorem 2.1. This is essentially the line of argument in [D-M 1] and [D-M 3].

3. Conformally Symplectic Flows with Collisions

Let $M$ be a smooth manifold with piecewise smooth boundary $\partial M$. We assume that the manifold $M$ is equipped with a conformally symplectic structure $\Theta$, as defined in Sect. 2. Given a smooth function $H$ on $M$ with non vanishing differential we obtain the non vanishing conformally Hamiltonian vector field $F = \nabla_\Theta H$ on $M$. The vector field $F$ is tangent to the level sets of the Hamiltonian $M^c = \{z \in M \mid H(z) = c\}$.

We distinguish in the boundary $\partial M$ the regular part, $\partial M_r$, consisting of the points which do not belong to more than one smooth piece of the boundary and where the vector field $F$ is transversal to the boundary. The regular part of the boundary is further split into “outgoing” part, $\partial M_-$, where the vector field $F$ points outside the manifold $M$ and the “incoming” part, $\partial M_+$, where the vector field is directed inside the manifold. Suppose that additionally we have a piecewise smooth mapping $\Gamma : \partial M_- \to \partial M_+$, called the collision map. We assume that the mapping $\Gamma$ preserves the Hamiltonian, $H \circ \Gamma = H$, and so it can be restricted to each level set of the Hamiltonian. We assume that all the integral curves of the vector field $F$ that end (or begin) in the singular part of the boundary lie in a codimension 1 submanifold of $M$.
We can now define a flow $\Psi^t : M \to M$, called a flow with collisions, which is a concatenation of the continuous time dynamics $\Phi^t$ given by the vector field $F$, and the collision map $\Gamma$. More precisely a trajectory of the flow with collisions, $\Psi^t(x)$, $x \in M$, coincides with the trajectory of the flow $\Phi^t$ until it gets to the boundary of $M$ at time $t_c(x)$, the collision time. If the point on the boundary lies in the singular part then the flow is not defined for times $t > t_c(x)$ (the trajectory “dies” there). Otherwise the trajectory is continued at the point $\Gamma(\Psi^{t_c}x)$ until the next collision time, i.e., for $0 \le t \le t_c\bigl(\Gamma(\Psi^{t_c(x)}x)\bigr)$,

$$\Psi^{t_c+t}x = \Phi^t\,\Gamma\,\Psi^{t_c}x.$$

We define a flow with collisions to be conformally symplectic, if for the collision map $\Gamma$ restricted to any level set $M^c$ of the Hamiltonian we have

$$\Gamma^*\Theta = \beta\Theta, \qquad (3.1)$$
for some nonvanishing function $\beta$ defined on the boundary. More explicitly we assume that for every vector $\xi$ and $\eta$ from the tangent space $T_z\partial M^c$ to the boundary of the level set $M^c$ we have

$$\Theta(D_z\Gamma\xi, D_z\Gamma\eta) = \beta\,\Theta(\xi,\eta).$$

We restrict the flow with collisions to one level set $M^c$ of the Hamiltonian and we denote the resulting flow by $\Psi^t_c$. This flow is very likely to be badly discontinuous but we can expect that for a fixed time $t$ the mapping $\Psi^t_c$ is piecewise smooth, so that the derivative $D\Psi^t_c$ is well defined except for a finite union of codimension one submanifolds of $M^c$. We will consider only such cases.

We choose an invariant measure in our system which satisfies the condition that all the trajectories that begin (or end) in the singular part of the boundary have measure zero. Usually there are many natural invariant measures satisfying this property. For instance we get one by taking a Lebesgue measure $\nu$ in $M^c$ and averaging it over increasing time intervals ($\frac{1}{T}\int_0^T \Psi^t_{c*}\nu\,dt$ as $T \to +\infty$). Let us denote the chosen invariant measure by $\mu^c$. This measure $\mu^c$ defines the measure $\mu^c_b$ on the boundary $\partial M^c$, which is an invariant measure for the section of the flow (Poincaré map of the flow).

With respect to the measure $\mu^c$ the flow $\Psi^t_c$ is a measurable flow in the sense of the Ergodic Theory and we obtain a measurable derivative cocycle $D\Psi^t_c : T_xM^c \to T_{\Psi^t_c x}M^c$. We can define Lyapunov exponents of the flow $\Psi^t_c$ with respect to the measure $\mu^c$, if we assume that

$$\int_{M^c} \log^+\|D_x\Psi^t_c\|\,d\mu^c(x) < +\infty \quad\text{and}\quad \int_{\partial M^c_-} \log^+\|D_y\Gamma\|\,d\mu^c_b(y) < +\infty$$

(cf. [O, R]).

The derivative of the flow with collisions can be also naturally factored onto the quotient of the tangent bundle $TM^c$ of $M^c$ by the vector field $F$, which we denote by $\widehat{T}M^c$. Note that for a point $z \in \partial M^c$ the tangent to the boundary at $z$ can be naturally identified with the quotient space. We will again denote the factor of the derivative cocycle by

$$A^t(x) : \widehat{T}_xM^c \to \widehat{T}_{\Psi^t_c x}M^c.$$

We will call it the transversal derivative cocycle. If the derivative cocycle has well defined Lyapunov exponents then the transversal derivative cocycle has also well defined Lyapunov exponents which coincide with the former ones except that one zero Lyapunov exponent is skipped.

For a conformally symplectic flow with collisions the factor $A^t(x)$ of the derivative cocycle on one level set changes the form $\Theta$ by a scalar, (2.3) and (3.1), so that we can immediately apply Theorem 1.4 and we get
Theorem 3.1. For a conformally symplectic flow with collisions $\Psi^t_c$ we have the following symmetry of the Lyapunov exponents for a given ergodic invariant probability measure $\mu^c$. Let

$$\{0\} \subset V_0(x) \subset V_1(x) \subset \dots \subset V_{s-1}(x) \subset V_s = \widehat{T}_xM^c$$

be the flag of subspaces at $x$ associated with the Lyapunov spectrum $\lambda_1 < \lambda_2 < \dots < \lambda_{s-1} < \lambda_s$ of the transversal derivative cocycle $A^t(x)$, $x \in M^c$. Then the multiplicities of $\lambda_k$ and $\lambda_{s-k+1}$ are equal and

$$\lambda_k + \lambda_{s-k+1} = a + b, \quad\text{for } k = 1, 2, \dots, s,$$

where $a = \int_{M^c}\gamma(F)\,d\mu^c$ and $b = \frac{1}{\tau}\int_{\partial M^c_-}\log|\beta(y)|\,d\mu^c_b(y)$. Here $\tau = \int_{\partial M^c_-} t_c(y)\,d\mu^c_b(y)$ is the average collision time on the section of the flow. Moreover the subspace $V_{s-k}$ is the skew-orthogonal complement to $V_k$.
4. Applications

A. Gaussian isokinetic dynamics. The equations of the system are (cf. [D-M 1])

$$\dot q = p, \qquad \dot p = E - \alpha p, \qquad\text{where } \alpha = \frac{\langle E, p\rangle}{\langle p, p\rangle}. \qquad (4.1)$$
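The defining property of (4.1) is that the multiplier $\alpha$ keeps $\langle p, p\rangle$ constant along trajectories. The following is a minimal numerical sketch of that property, not part of the original argument: the uniform field `constant_field`, the step size, and the Runge-Kutta integrator are illustrative assumptions.

```python
def rhs(state, field):
    """Right-hand side of (4.1) for state = q + p (flat list)."""
    n = len(state) // 2
    q, p = state[:n], state[n:]
    E = field(q)
    # alpha = <E, p> / <p, p>
    alpha = sum(e * v for e, v in zip(E, p)) / sum(v * v for v in p)
    # q' = p, p' = E - alpha * p
    return p + [e - alpha * v for e, v in zip(E, p)]


def shifted(state, k, h):
    """Return state + h * k componentwise."""
    return [s + h * ki for s, ki in zip(state, k)]


def rk4_step(state, dt, field):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, field)
    k2 = rhs(shifted(state, k1, 0.5 * dt), field)
    k3 = rhs(shifted(state, k2, 0.5 * dt), field)
    k4 = rhs(shifted(state, k3, dt), field)
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def kinetic_energy(state):
    """H = <p, p> / 2."""
    n = len(state) // 2
    return 0.5 * sum(v * v for v in state[n:])


def constant_field(q):
    # Hypothetical example: a uniform field, which is irrotational.
    return [1.0, 0.0]


state = [0.0, 0.0, 0.6, 0.8]          # q = (0, 0), p = (0.6, 0.8)
h0 = kinetic_energy(state)
for _ in range(1000):
    state = rk4_step(state, 0.01, constant_field)
drift = abs(kinetic_energy(state) - h0)  # small, up to integrator error
```

The exact flow conserves $H$ because $\langle p, E - \alpha p\rangle = 0$; the integrator only preserves it up to discretization error, which is why `drift` is compared against a tolerance rather than zero.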
In these equations $q$ describes a point in the multidimensional configuration space $\mathbb{R}^N$, $p$ is the momentum (velocity) also in $\mathbb{R}^N$ and $\langle\cdot,\cdot\rangle$ is the arithmetic scalar product in $\mathbb{R}^N$. The field of force $E = E(q)$ is assumed to be irrotational, i.e., it has locally a potential function $U = U(q)$, $E = -\frac{\partial U}{\partial q}$.

Let us denote by $\kappa = \sum p\,dq$ the 1-form which defines the standard symplectic structure $\omega = d\kappa = \sum dp \wedge dq$. We introduce the following 2-form

$$\Theta = \omega + \frac{\langle E, dq\rangle}{\langle p, p\rangle} \wedge \kappa.$$
We choose the Hamiltonian to be $H = \frac{1}{2}\langle p, p\rangle$ and we denote by $F$ the vector field defined by (4.1). We have

$$\Theta(\cdot, F) = dH, \qquad (4.2)$$

but the form $\Theta$ does not give us a conformally symplectic structure because the relation (2.1) fails. To correct this setback we fix one level set of the Hamiltonian $M^c = \{H = \frac{1}{2}\langle p, p\rangle = c\}$ and define another 2-form

$$\Theta_c = \omega + \frac{\langle E, dq\rangle}{2c} \wedge \kappa.$$
Now we get a conformally symplectic structure. Indeed

$$d\Theta_c = -\frac{\langle E, dq\rangle}{2c} \wedge \Theta_c, \quad\text{and locally}\quad \frac{\langle E, dq\rangle}{2c} = d\left(\frac{-U}{2c}\right).$$
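For the reader's convenience, the identity for $d\Theta_c$ can be checked in one line. The sketch below is our own verification, assuming the standard sign convention $d(\alpha\wedge\beta) = d\alpha\wedge\beta - \alpha\wedge d\beta$ for a 1-form $\alpha$; it uses only $d\omega = 0$, $d\kappa = \omega$, and the closedness of $\langle E, dq\rangle$ (the field is irrotational).

```latex
\begin{align*}
d\Theta_c
  &= d\omega + \frac{1}{2c}\, d\bigl(\langle E, dq\rangle \wedge \kappa\bigr) \\
  &= \frac{1}{2c}\Bigl( d\langle E, dq\rangle \wedge \kappa
       - \langle E, dq\rangle \wedge d\kappa \Bigr) \\
  &= -\frac{\langle E, dq\rangle}{2c} \wedge \omega
   = -\frac{\langle E, dq\rangle}{2c} \wedge \Theta_c .
\end{align*}
```

The last equality holds because the extra term in $\Theta_c$ contains the factor $\langle E, dq\rangle$ a second time, and $\langle E, dq\rangle \wedge \langle E, dq\rangle = 0$.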
Moreover on $M^c$ we still have $\Theta_c(\cdot, F) = dH$ so that the restriction of (4.1) to $M^c$ coincides with a conformally Hamiltonian system with respect to the 2-form $\Theta_c$ and with the Hamiltonian $H = \frac{1}{2}\langle p, p\rangle$. We can immediately apply Theorem 2.1 and we obtain that for any invariant ergodic probability measure $\mu^c$ on $M^c$ the Lyapunov exponents $\lambda_1 < \dots < \lambda_s$ satisfy

$$\lambda_k + \lambda_{s-k+1} = -\frac{1}{2c}\int_{M^c}\langle E, p\rangle\,d\mu^c = -\int_{M^c}\alpha\,d\mu^c.$$

Note that if the vector field of force has a global potential, $E = -\frac{\partial U}{\partial q}$, then by the Birkhoff Ergodic Theorem the integral $-\frac{1}{2c}\int_{M^c}\langle E, p\rangle\,d\mu^c = \frac{1}{2c}\int_{M^c} dU(F)\,d\mu^c$ is equal to the time average of $\frac{1}{2c}\frac{dU}{dt}$ and so it must vanish. Another way to see it is that $e^{-U}\Theta_c$ defines a global symplectic structure and on $M^c$ our flow is Hamiltonian with respect to this symplectic structure and a modified Hamiltonian

$$\widetilde{H} = e^{-U}\left(\tfrac{1}{2}\langle p, p\rangle - c\right).$$

Indeed as discussed in Remark 2.1 on $M^c$ we have

$$e^{-U}\Theta_c(\cdot, F) = d\widetilde{H}.$$

For a Hamiltonian flow the symmetric Lyapunov exponents must add up to zero.

B. Gaussian isokinetic dynamics on a Riemannian manifold. For a given Riemannian manifold $N$ with the metric tensor $ds^2 = \sum g_{ij}\,dq_i\,dq_j$ we can naturally generalize the form $\Theta$ to the cotangent bundle $T^*N$. Indeed the 1-form $\kappa = \sum p\,dq$ is independent of the coordinate system, cf. [A], and for a given closed 1-form $\gamma$ we put

$$\Theta_c = d\kappa - \frac{1}{2c}\,\gamma \wedge \kappa.$$

We get $d\Theta_c = \frac{1}{2c}\,\gamma \wedge \Theta_c$, in agreement with part A, where $\gamma = dU = -\langle E, dq\rangle$. Taking $\gamma = dU$ for some potential function (single or multi-valued) and the Hamiltonian $H = \frac{1}{2}\sum g^{ij}p_ip_j$ we obtain the Gaussian isokinetic dynamics, [Ch], on the level set $H = c$ by the relation (4.2). We can repeat the discussion in part A and we conclude again that the Lyapunov exponents must be symmetric and they add up to zero, if the potential $U$ is single-valued.
C. The Gaussian isokinetic dynamics with collisions. Let us consider $n$ spherical particles in a finite box $B$ contained in $\mathbb{R}^d$ or the torus $\mathbb{T}^d$. We assume that the particles interact with each other by the potential $V(q_1, q_2, \dots, q_n)$ ($q_k \in B$, $k = 1, \dots, n$ denote the positions of the particles) and that they are subjected to the external fields given by the potentials $V_k(q_k)$, $k = 1, \dots, n$. Further we assume that the particles have the radii $r_1, \dots, r_n$, the masses $m_1, \dots, m_n$, and that they collide elastically with each other and the sides of the box, which can be flat or curved. The last element in the description of the system is the Gaussian isokinetic thermostat. As described in Parts A and B the Gaussian isokinetic thermostat gives rise to a conformally Hamiltonian flow with the Hamiltonian $H = \sum_{k=1}^n \frac{p_k^2}{2m_k}$ and an appropriate conformally symplectic structure. We will check below that the collisions in this system preserve the form $\Theta_c$, giving rise to a conformally symplectic flow with collisions. Theorem 3.1 can thus be applied to our system, giving us the symmetry of the Lyapunov spectrum.

We introduce the canonical change of variables which brings the kinetic energy into the standard form,

$$x_k = \sqrt{m_k}\,q_k, \qquad v_k = \frac{p_k}{\sqrt{m_k}}.$$
The advantage of these coordinates is that although the collision manifolds in the configuration space become less natural, the collisions between particles (and the walls of the box) are given by the billiard rule in the configuration space. The equations of motion in the $(x, v)$ coordinates are

$$\dot x = v, \qquad \dot v = -\frac{\partial U}{\partial x} - \alpha v, \qquad (4.3)$$

where $U = U(x) = V + \sum_{k=1}^n V_k$ is the total potential of the system and $\alpha = -\frac{dU(v)}{\langle v, v\rangle}$. We introduce the differential 2-form

$$\Theta_c = \sum dv \wedge dx - \frac{1}{2c}\,dU \wedge \langle v, dx\rangle.$$

As in part A we conclude that the form satisfies (2.1) and the system (4.3) restricted to $M^c$ coincides with the conformally Hamiltonian system defined by this form and the Hamiltonian $H = \frac{1}{2}\langle v, v\rangle$.

Proposition 4.1. The collision maps preserve the form $\Theta_c$.

Proof. A collision manifold is locally given by an equation of the form $g(x) = 0$, where $g$ is some differentiable function on $\mathbb{R}^{nd}$. Note that the general form of the collision map is the same for collisions of particles and the collisions with the sides of the box. Let $n(x)$, for $x \in \{x \in \mathbb{R}^{nd} \mid g(x) = 0\}$, denote the unit normal vector to the collision manifold in the configuration space. The collision map is defined as

$$x^+ = x^-, \qquad v^+ = v^- - 2\langle v^-, n(x^-)\rangle\,n(x^-), \qquad (4.4)$$

where the index $+$ corresponds to the values of $x$ and $v$ after the collision and the index $-$ to the values before the collision. As a result of these formulas we get immediately that

$$\delta x^+ = \delta x^-. \qquad (4.5)$$

It is well known, [W1, W2], that in an elastic collision the symplectic form $\omega$ is preserved. It remains to show the preservation of the second term in $\Theta_c$. It follows immediately from (4.4) and (4.5), because

$$\langle v^+, \delta x^+\rangle = \langle v^-, \delta x^-\rangle - 2\langle v^-, n(x^-)\rangle\langle n(x^-), \delta x^-\rangle,$$

and the last term is zero since we only take the variations $(\delta x^-, \delta v^-)$ tangent to the collision manifold, i.e., $\delta x^-$ is orthogonal to $n(x)$. The proposition is proven.

It follows from Proposition 4.1 that also the form $e^{-U}\Theta_c$ is preserved under collisions. Hence, as remarked in Parts A and B, if the potential $U$ is single-valued then the system restricted to one energy level coincides with a globally Hamiltonian system (with collisions) with respect to the symplectic form $e^{-U}\Theta_c$ with the Hamiltonian function equal to $\widetilde{H} = e^{-U}\left(\frac{1}{2}\langle p, p\rangle - c\right)$. We conclude that the occurrence of dissipation in such
systems is related to the topology of the configuration space (the multivaluedness of the potential $U$).

D. Nosé–Hoover dynamics. The Nosé Hamiltonian is, cf. [D-M 3],

$$H(q, s; \pi, p_s) = \sum_{i=1}^N \frac{\pi_i^2}{2m_is^2} + \varphi(q) + \frac{p_s^2}{2} + C\ln s,$$

with a non-physical time denoted by $\lambda$ and some constant $C$. The symplectic form is $\omega = \sum d\pi \wedge dq + dp_s \wedge ds$. Changing the variables as $\pi = sp$ and $\sigma = \ln s$ the Hamiltonian becomes

$$H(q, \sigma; p, p_s) = \sum_{i=1}^N \frac{p_i^2}{2m_i} + \varphi(q) + \frac{p_s^2}{2} + C\sigma, \qquad (4.6)$$

and the symplectic form is

$$\omega = e^\sigma\Bigl(\sum_i dp_i \wedge dq_i + dp_s \wedge d\sigma + d\sigma \wedge \bigl(\sum_i p_i\,dq_i\bigr)\Bigr).$$

Note that now in the Hamiltonian the thermostat $(\sigma, p_s)$ is decoupled from the system but the coupling is shifted to the symplectic form. We make finally the time change $\frac{d\lambda}{dt} = e^\sigma$. We choose not to change the Hamiltonian but rather to modify the 2-form, $e^{-\sigma}\omega(\cdot, e^\sigma\nabla_\omega H) = dH$. We end up with the Hamiltonian (4.6) and the conformally symplectic structure

$$\Theta = e^{-\sigma}\omega = \sum_i dp_i \wedge dq_i + dp_s \wedge d\sigma + d\sigma \wedge \bigl(\sum_i p_i\,dq_i + p_s\,d\sigma\bigr).$$

Since $\omega$ is closed, we have $d\Theta = -d\sigma \wedge \Theta$. Note the similarity of $\Theta$ with the form used in the discussion of the isokinetic dynamics above. This form and the Hamiltonian give us the Hoover equations

$$\dot q_i = \frac{p_i}{m_i}, \qquad \dot p_i = -\frac{\partial\varphi}{\partial q_i} - p_sp_i, \qquad \dot\sigma = p_s, \qquad \dot p_s = \sum_i \frac{p_i^2}{m_i} - C.$$

On any level set we can drop the equation for $\sigma$ since $\sigma$ can be trivially obtained from other variables using the constancy of the Hamiltonian. By Theorem 2.1 we have the symmetry of the Lyapunov spectrum for this system reduced to one level of the Hamiltonian. Moreover the Lyapunov exponents add up to the time average of $-\dot\sigma$. This average must be zero, unless $\sigma$ grows linearly, which is unlikely. Note that the Nosé–Hoover system is open in the sense that arbitrarily large fluctuations of $\sigma$ cannot be ruled out.

Acknowledgement. We thank Dmitri Alexeevski, Federico Bonetto and Philippe Choquard for many enlightening discussions during our stay at the ESI in December of 1996. In addition, we thank David Ruelle for valuable comments. We are also grateful for the opportunities provided by the hospitality of the Erwin Schrödinger Institute in Vienna where this work was done. M.P. Wojtkowski has been partially supported by NSF Grant DMS-9404420.
References

[A] Arnold, V.I.: Mathematical Methods in Classical Mechanics. Berlin–Heidelberg–New York: Springer Verlag, 1978
[A-M-R] Abraham, R., Marsden, J.E., Ratiu, T.: Manifolds, tensor analysis and applications. Berlin–Heidelberg–New York: Springer Verlag, 1988
[B-G-G-S] Benettin, G., Galgani, I., Giorgilli, A., Strelcyn, J.-M.: Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Meccanica 15, 9–20 (1980)
[B-G-G] Bonetto, F., Gallavotti, G., Garrido, P.L.: Chaotic principle: An experimental test. Preprint (1996)
[Ch] Choquard, Ph.: Lagrangian formulation of Nosé–Hoover and of isokinetic dynamics. ESI report (1996)
[Ch-E-L-S] Chernov, N.I., Eyink, G.L., Lebowitz, J.L., Sinai, Ya.G.: Steady-state electric conduction in the periodic Lorentz gas. Commun. Math. Phys. 154, 569–601 (1993)
[D-P-H] Dellago, C.P., Posch, H.A., Hoover, W.G.: Lyapunov instability in a system of hard disks in equilibrium and nonequilibrium steady states. Phys. Rev. E 53, n.2, 1485–1501 (1996)
[D-M 1] Dettmann, C.P., Morriss, G.P.: Proof of Lyapunov exponent pairing for systems at constant kinetic energy. Phys. Rev. E 53, 5541 (1996)
[D-M 2] Dettmann, C.P., Morriss, G.P.: Hamiltonian formulation of the Gaussian isokinetic thermostat. Phys. Rev. E 54, 2495 (1996)
[D-M 3] Dettmann, C.P., Morriss, G.P.: Hamiltonian reformulation and pairing of Lyapunov exponents for Nosé–Hoover dynamics. Phys. Rev. E 55, 3693 (1997)
[E-C-M] Evans, D.J., Cohen, E.G.D., Morriss, G.P.: Viscosity of a simple fluid from its maximal Lyapunov exponents. Physical Review 42A, 5990–5997 (1990)
[E-M] Evans, D.J., Morriss, G.P.: Statistical Mechanics of nonequilibrium liquids. New York: Academic Press, 1990
[G-G] Garrido, P.L., Gallavotti, G.: Billiards correlation functions. J. Stat. Phys. 76, 549–586 (1994)
[L-B-D] Latz, A., van Beijeren, H., Dorfman, J.R.: Lyapunov spectrum and the conjugate pairing rule for a thermostatted random Lorentz gas: Kinetic theory. Preprint (1996)
[O] Oseledets, V.I.: A multiplicative ergodic theorem: Characteristic Lyapunov exponents of dynamical systems. Trans. Moscow Math. Soc. 19, 197–231 (1968)
[R] Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publ. Math. IHES 50, 27–58 (1979)
[V] Vaisman, I.: Locally conformal symplectic manifolds. Int. J. Math.–Math. Sci. 8, 521–536 (1985)
[W1] Wojtkowski, M.P.: Measure Theoretic Entropy of the system of hard spheres. Ergodic Theory and Dynamical Systems 8, 133–153 (1988)
[W2] Wojtkowski, M.P.: Systems of classical interacting particles with nonvanishing Lyapunov exponents. In: Lyapunov Exponents, Proceedings, Oberwolfach 1990, L. Arnold, H. Crauel, J.-P. Eckmann (Eds.), Lecture Notes in Math. Vol. 1486, 1991, pp. 243–262

Communicated by J. L. Lebowitz
Commun. Math. Phys. 194, 61 – 70 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Pair Correlation Function of Fractional Parts of Polynomials*

Zeév Rudnick¹, Peter Sarnak²

¹ Raymond and Beverly Sackler School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel
² Department of Mathematics, Princeton University, Princeton NJ 08544, USA
Received: 22 July 1997 / Accepted: 24 September 1997
Abstract: We investigate the pair correlation function of the sequence of fractional parts of $\alpha n^d$, $n = 1, 2, \dots, N$, where $d \ge 2$ is an integer and $\alpha$ an irrational. We conjecture that for badly approximable $\alpha$, the normalized spacings between elements of this sequence have Poisson statistics as $N \to \infty$. We show that for almost all $\alpha$ (in the sense of measure theory), the pair correlation of this sequence is Poissonian. In the quadratic case $d = 2$, this implies a similar result for the energy levels of the “boxed oscillator” in the high-energy limit. This is a simple integrable system in 2 degrees of freedom studied by Berry and Tabor as an example for their conjecture that the energy levels of generic completely integrable systems have Poisson spacing statistics.
1. Introduction

Hermann Weyl [11] proved that for an integer $d \ge 1$ and an irrational $\alpha$, the sequence of fractional parts $\alpha n^d \bmod 1$, $n = 1, 2, \dots$ is equidistributed in the unit interval. A different aspect of the random behavior of the sequence has attracted attention recently: Are the spacings between members of the sequence distributed like those between members of a sequence of random numbers in the unit interval (or as some would say, do they have a “Poissonian” distribution)? This issue came up in the context of the distribution of spacings of the energy levels of integrable systems [1]. For the case $d = 1$ the spacings between the fractional parts of $\alpha n$ are essentially those of the energy levels of a two-dimensional harmonic oscillator [4, 2, 3]. For $d = 2$ the spacings are related to the spacings between the energy levels of the “boxed oscillator” [1], a particle in a 2-dimensional potential well with hard walls in one direction and harmonic binding in the other. The spacings of $\alpha n^2 \bmod 1$ were also investigated numerically in [5].

* Supported in part by grants from the U.S.-Israel Binational Science Foundation, the Israel Science Foundation, and the NSF.

If $d = 1$ it is elementary that the consecutive spacings have at most 3 values [9, 10]. Hence the sequence is not random in this case. For $d \ge 2$ the picture is very different. To explain it, we recall a basic classification of real numbers with regard to their Diophantine approximation properties: We say $\alpha$ is of type $\kappa$ if there is $c = c(\alpha) > 0$ so that $|\alpha - p/q| > c/q^\kappa$ for all integers $p, q$. For rational $\alpha$, $\kappa = 1$ and $\alpha$ is irrational if and only if $\kappa \ge 2$. It is well known that almost all $\alpha$ (in the sense of measure theory) are of type $\kappa = 2 + \epsilon$ for all $\epsilon > 0$. We will call such $\alpha$ “Diophantine”. For instance, algebraic irrationals are of this type (Roth's theorem). In [7] we establish some results towards the conjecture that $\alpha n^d \bmod 1$ is Poissonian for any $\alpha$ of Diophantine type. In this note we examine the behavior for almost all $\alpha$, which according to the above should be Poissonian.

The statistic we examine is the pair correlation: The pair correlation density for a sequence of $N$ numbers $\theta_1, \dots, \theta_N \in [0, 1]$ which are equidistributed as $N \to \infty$, measures the distribution of spacings between the $\theta_j$ at distances of order of the mean spacing $1/N$. Precisely, if $\|x\| = \mathrm{distance}(x, \mathbb{Z})$ then for any interval $[-s, s]$ set

$$R_2([-s, s], \{\theta_n\}, N) = \frac{1}{N}\,\#\Bigl\{1 \le j \ne k \le N : \|\theta_j - \theta_k\| \le \frac{s}{N}\Bigr\}. \qquad (1.1)$$

For random numbers $\theta_j$ chosen uniformly and independently, $R_2([-s, s], \{\theta_n\}, N) \to 2s$ with probability tending to 1 as $N \to \infty$. Our main result is that this holds for the sequence of fractional parts $\{\alpha n^d \bmod 1\}$ for almost every $\alpha$: Denoting by $R_2([-s, s], \alpha, N)$ the pair correlation sum (1.1) for this sequence, we show

Theorem 1. For $d \ge 2$, there is a set $P \subset \mathbb{R}$ of full Lebesgue measure such that for any $\alpha \in P$, and any $s \ge 0$,

$$R_2([-s, s], \alpha, N) \to 2s, \qquad N \to \infty.$$
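To make the statistic (1.1) concrete, here is a minimal brute-force sketch in Python that evaluates $R_2([-s, s], \alpha, N)$ for the sequence $\alpha n^d \bmod 1$. The function names are ours, and the quadratic-time double loop and floating-point arithmetic are illustrative choices only; nothing here is taken from the proof.

```python
import math


def dist_to_nearest_integer(x):
    """||x||, the distance from x to the nearest integer."""
    frac = x % 1.0
    return min(frac, 1.0 - frac)


def pair_correlation(alpha, d, N, s):
    """Evaluate (1.1) for theta_n = alpha * n^d mod 1, n = 1..N."""
    theta = [(alpha * n ** d) % 1.0 for n in range(1, N + 1)]
    threshold = s / N
    count = 0
    for j in range(N):
        for k in range(j + 1, N):
            if dist_to_nearest_integer(theta[j] - theta[k]) <= threshold:
                count += 2  # ordered pairs (j, k) and (k, j)
    return count / N


# Example call; Theorem 1 concerns the limit N -> infinity for almost
# every alpha and says nothing about this particular alpha or N.
value = pair_correlation(math.sqrt(2), 2, 200, 1.0)
```

Two sanity checks follow directly from the definition: for $s \ge N/2$ every ordered pair is counted, so the value is $N - 1$, and the value is nondecreasing in $s$.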
Remark 1.1. The proof given below does not provide (and we do not know of) any specific $\alpha$ which is provably in $P$.

Remark 1.2. Already with the pair correlation we see the necessity of a condition on the type of $\alpha$. For if there are arbitrarily large integers $p, q$ so that

$$\Bigl|\alpha - \frac{p}{q}\Bigr| \le \frac{1}{10q^{d+1}},$$

then $R_2([-s, s], \alpha, N) \not\to 2s$. Indeed if we choose $N = q$, then for $m \ne n \le N$,

$$n^d\alpha - m^d\alpha = \frac{(n^d - m^d)p}{q} + \frac{t(n^d - m^d)}{10q^{d+1}}$$

with $|t| \le 1$. Hence either $\|n^d\alpha - m^d\alpha\| \le 1/10q = 1/10N$ if $q$ divides $n^d - m^d$, or $\|n^d\alpha - m^d\alpha\| \ge 9/10q = 9/10N$ otherwise. Thus there are no normalized differences $N\|n^d\alpha - m^d\alpha\|$ in the interval $(1/10, 9/10)$ for this sequence of $N = q$.
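The gap described in Remark 1.2 can be observed exactly with rational arithmetic. The sketch below is only an illustration under stated assumptions: it takes the rational number $\alpha = p/q + t/(10q^{d+1})$ with $t = 1$ as a stand-in for an irrational $\alpha$ that is this well approximated, and the values $p = 7$, $q = 31$, $d = 2$ are arbitrary choices of ours.

```python
from fractions import Fraction


def normalized_gaps(p, q, d, t=1):
    """Return N * ||(n^d - m^d) * alpha|| for 1 <= m < n <= N = q,
    where alpha = p/q + t / (10 * q^(d+1)), computed exactly."""
    alpha = Fraction(p, q) + Fraction(t, 10 * q ** (d + 1))
    N = q
    gaps = []
    for n in range(1, N + 1):
        for m in range(1, n):
            x = (n ** d - m ** d) * alpha
            frac = x - (x.numerator // x.denominator)  # fractional part
            gaps.append(N * min(frac, 1 - frac))
    return gaps


gaps = normalized_gaps(p=7, q=31, d=2)
# Normalized differences falling in the open window (1/10, 9/10).
in_window = [g for g in gaps if Fraction(1, 10) < g < Fraction(9, 10)]
```

Since $0 < n^d - m^d < q^d$, the perturbation term is strictly smaller than $1/(10q)$ in absolute value, so `in_window` is empty for any admissible choice of $p$, $q$, $d$ and $|t| \le 1$.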
The proof of the theorem follows the steps in [8] (where a similar assertion is proven for the values of binary quadratic forms). We first establish that as a function of $\alpha \in [0, 1]$, $R_2([-s, s], \alpha, N) \to 2s$ in $L^2(0, 1)$. This together with standard bounds on the Weyl sums $S(n, N) = \sum_{x \le N} e(n\alpha x^d)$ allows us to pass to almost everywhere convergence. In Sect. 4 we briefly discuss higher correlations and show that they do not converge in $L^2$ to the expected value. Thus our approach does not lend itself directly to establishing almost everywhere convergence of the higher correlations.

2. Bounding the Variance

Let $f \in C_c^\infty(\mathbb{R})$ be a test function and set

$$R_2(f, \{\theta_n\}, N) := \frac{1}{N}\sum_{1 \le j \ne k \le N} F_N(\theta_j - \theta_k), \qquad (2.1)$$

where

$$F_N(y) = \sum_{m \in \mathbb{Z}} f(N(y + m)). \qquad (2.2)$$

The function $F_N(y)$ is periodic and has a Fourier expansion

$$F_N(y) = \frac{1}{N}\sum_{n \in \mathbb{Z}} \hat f\Bigl(\frac{n}{N}\Bigr)e(ny). \qquad (2.3)$$

Hence

$$R_2(f, \{\theta_n\}, N) = \frac{1}{N^2}\sum_{n \in \mathbb{Z}} \hat f\Bigl(\frac{n}{N}\Bigr)\sum_{1 \le j \ne k \le N} e(n(\theta_j - \theta_k)). \qquad (2.4)$$

In particular, if $\theta_n = \alpha n^d \bmod 1$, then the pair correlation function is given by

$$R_2(f, \alpha, N) = \frac{1}{N^2}\sum_{n \in \mathbb{Z}} \hat f\Bigl(\frac{n}{N}\Bigr)s_{\mathrm{off}}(n, N), \qquad (2.5)$$

where

$$s_{\mathrm{off}}(n, N) := \sum_{1 \le x \ne y \le N} e\bigl(n\alpha(x^d - y^d)\bigr). \qquad (2.6)$$

As a function of $\alpha$, $R_2(f, \alpha, N)$ is periodic and from (2.5) its Fourier expansion is

$$R_2(f, \alpha, N) = \sum_{l \in \mathbb{Z}} b_l(N)e(l\alpha), \qquad (2.7)$$

where for $l \ne 0$,

$$b_l(N) = \frac{1}{N^2}\sum_{n \ne 0}\ \sum_{\substack{1 \le x \ne y \le N \\ n(x^d - y^d) = l}} \hat f\Bigl(\frac{n}{N}\Bigr). \qquad (2.8)$$

The mean of $R_2(f, \alpha, N)$ over $\alpha \in [0, 1]$ is

$$\langle R_2\rangle = b_0(N) = \frac{1}{N^2}\sum_{1 \le x \ne y \le N} \hat f(0) = \Bigl(1 - \frac{1}{N}\Bigr)\hat f(0), \qquad (2.9)$$

so that

$$\langle R_2\rangle = \int_{-\infty}^{\infty} f(x)\,dx + O\Bigl(\frac{1}{N}\Bigr), \qquad (2.10)$$
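which is the expected value for a random sequence. We next estimate the variance of $R_2(f, \alpha, N)$ as a function of $\alpha$:

Proposition 2. As a function of $\alpha \in [0, 1]$,

$$\bigl\|R_2(f, \alpha, N) - \hat f(0)\bigr\|_2 \ll N^{-1/2+\epsilon} \qquad (2.11)$$

for any $\epsilon > 0$, the implied constants depending on $\epsilon$ and $f$.

Proof. It is easy to see from (2.8) that since $f \in C_c^\infty(\mathbb{R})$, the Fourier coefficients $b_l(N)$ are negligible for $l \ge N^{d+1+\delta}$ for any fixed $\delta > 0$. Also from (2.8) we have for $l \ne 0$,

$$b_l(N) \ll \frac{\tau(|l|)^2}{N^2}, \qquad (2.12)$$

where $\tau(|l|)$ is the number of divisors of $|l|$. This is because the factors of $l$ determine $n, x, y$. We will use the well-known estimate

$$\tau(m) \ll m^\epsilon, \quad\text{for any } \epsilon > 0.$$

Thus by Parseval

$$\begin{aligned}
\bigl\|R_2(f, \alpha, N) - \hat f(0)\bigr\|_2^2 &= \Bigl(\frac{\hat f(0)}{N}\Bigr)^2 + \sum_{l \ne 0} |b_l(N)|^2 \\
&\ll \frac{N^\epsilon}{N^2}\sum_{0 \ne |l| \le N^{d+1+\delta}} |b_l(N)| + \text{smaller order term} \\
&\ll \frac{N^\epsilon}{N^2}\sum_{l \ne 0} |b_l(N)| \\
&\ll \frac{N^\epsilon}{N^2}\sum_{1 \le x \ne y \le N}\ \sum_{n \in \mathbb{Z}} \frac{1}{N^2}\Bigl|\hat f\Bigl(\frac{n}{N}\Bigr)\Bigr| \ll N^{-1+\epsilon}. \qquad (2.13)
\end{aligned}$$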
3. Almost-Everywhere Convergence

3.1. Overview of the argument for Theorem 1. In order to prove Theorem 1 from the decay of the variance of the pair correlation, we first show that for each $f \in C_c^\infty(\mathbb{R})$, there is a set of full measure $P(f) \subset \mathbb{R}$ so that for all $\alpha \in P(f)$,

$$R_2(f, \alpha, N_m) \to \hat f(0) \qquad (3.1)$$

for a subsequence $N_m$ which grows faster than $m$. Indeed, fix $\delta > 0$, and let $\{N_m\}$ be a sequence of integers with $N_m \sim m^{1+\delta}$. Set $X_N(\alpha) = R_2(f, \alpha, N) - \hat f(0)$. By Proposition 2, $\|X_N\|_2^2 \ll N^{-1+\epsilon}$ for all $\epsilon > 0$ and so

$$\sum_{m=1}^\infty \int_0^1 |X_{N_m}(\alpha)|^2\,d\alpha < \infty.$$

Therefore (since $|X_{N_m}|^2 \ge 0$)

$$\int_0^1 \sum_m |X_{N_m}(\alpha)|^2\,d\alpha = \sum_m \int_0^1 |X_{N_m}(\alpha)|^2\,d\alpha < \infty,$$

and so $\sum_m |X_{N_m}|^2 \in L^1(0, 1)$. Thus the sum is finite almost everywhere:

$$\sum_m |X_{N_m}(\alpha)|^2 < \infty, \quad\text{for almost all } \alpha.$$
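The convergence of the series of variances is where the growth rate of $N_m$ is used. The following short computation is our own unpacking of that step, under the stated assumption $N_m \sim m^{1+\delta}$; the admissible range for $\epsilon$ is not spelled out in the text.

```latex
\begin{align*}
\sum_{m\ge 1} \|X_{N_m}\|_2^2
  \;\ll\; \sum_{m\ge 1} N_m^{-1+\epsilon}
  \;\ll\; \sum_{m\ge 1} m^{-(1+\delta)(1-\epsilon)} \;<\; \infty
  \qquad\text{as soon as}\qquad
  (1+\delta)(1-\epsilon) > 1
  \iff \epsilon < \frac{\delta}{1+\delta}.
\end{align*}
```

So any fixed $\delta > 0$ works, provided $\epsilon$ in Proposition 2 is then chosen small enough.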
Therefore, $X_{N_m}(\alpha) \to 0$ as $m \to \infty$ for almost all $\alpha$, that is we have (3.1) on a set $P(f)$ of $\alpha$'s which we may assume consists only of Diophantine numbers.

To go from almost everywhere convergence along a subsequence to almost everywhere convergence, we will show that as a function of $N$, $R_2(f, \alpha, N)$ does not oscillate much for Diophantine $\alpha$. More precisely, there is some $\nu > 0$ so that if $N_m \le n < N_{m+1}$ then for Diophantine $\alpha$, there is $c(f, \alpha) > 0$ so that

$$|X_n(\alpha) - X_{N_m}(\alpha)| \le c(f, \alpha)N_m^{-\nu}.$$

Because $0 \le n - N_m \le N_{m+1} - N_m \ll N_m^\delta$, this estimate in turn will follow from:

Proposition 3. Let $0 < \delta < 1/2^{d-1}$. Then for all $f \in C_c^\infty(\mathbb{R})$ and all $\alpha$ of Diophantine type, there is some $c(f, \alpha) > 0$ so that for all $0 \le k \le N^\delta$,

$$|X_{N+k}(\alpha) - X_N(\alpha)| \le c(f, \alpha)N^{-\nu},$$

where $\nu < 1/2^{d-1} - \delta$.
Since $X_{N_m}(\alpha) \to 0$ for all $\alpha \in P(f)$, which by throwing out a measure-zero subset we assumed consisted only of Diophantine $\alpha$'s, Proposition 3 implies $X_n(\alpha) \to 0$ for all $\alpha \in P(f)$. We will prove this proposition after finishing the proof of Theorem 1.

What remains to do is to find one subset $P \subset \mathbb{R}$ of full measure for which $R_2(f, \alpha, N) \to \int_{-\infty}^\infty f(x)\,dx$ for all $\alpha \in P$ and all $f$ which are characteristic functions of intervals $[-s, s]$ (or in $C_c^\infty(\mathbb{R})$). To do this, pick a (countable) sequence of positive $f_i \in C_c^\infty(\mathbb{R})$ so that for each $f \ge 0$ as above, there are subsequences $\{f_i^\pm\} \subset \{f_i\}$ which satisfy $f_i^- \le f \le f_i^+$ and $\int_{-\infty}^\infty (f_i^+ - f_i^-)(x)\,dx \to 0$. Take $P := \cap_i P(f_i)$ which is still of full measure. For every $\alpha$ we have

$$R_2(f_i^-, \alpha, N) \le R_2(f, \alpha, N) \le R_2(f_i^+, \alpha, N),$$

and in addition for $\alpha \in P$, we have $R_2(f_i^\pm, \alpha, N) \to \int_{-\infty}^\infty f_i^\pm$. Since $\int_{-\infty}^\infty f_i^\pm \to \int_{-\infty}^\infty f$, this shows that $R_2(f, \alpha, N) \to \int_{-\infty}^\infty f$ for $\alpha \in P$ and gives Theorem 1.

The proof of Proposition 3 will occupy the rest of this section.

3.2. Estimates for Weyl sums. We begin with some consequences of Weyl's estimates for the “Weyl sums” $S(n, N) = \sum_{x \le N} e(n\alpha x^d)$ which we will need. Throughout the remainder of this section, we set $D = 2^{d-1}$.

Lemma 4. For $\alpha$ Diophantine, and $M \ge 1$, we have

$$\sum_{1 \le n \le M} |S(n, N)|^D \ll M^{1+\epsilon}N^{D-1+\epsilon}$$

for all $\epsilon > 0$ ($D = 2^{d-1}$).

Proof. This follows from the proof of Weyl's inequality (see [6], Lemma 3). We will outline the steps. By repeated squaring, one finds that

$$|S(n, N)|^D \ll N^{D-1} + N^{D-d}\sum_{y_1, \dots, y_{d-1} = 1}^N \min\Bigl(N, \frac{1}{\|d!\,n\alpha y_1 \cdots y_{d-1}\|}\Bigr),$$

where $\|\cdot\|$ denotes the distance to the nearest integer. Now sum over $n \le M$, collecting together terms with the product $d!\,ny_1 \cdots y_{d-1}$ having a given value $m$. The number of such terms is at most the divisor function $\tau(m) \ll m^\epsilon$. Since the maximal value of $m$ is $d!\,MN^{d-1}$, we find

$$\sum_{1 \le n \le M} |S(n, N)|^D \ll MN^{D-1} + M^\epsilon N^{D-d+\epsilon}\sum_{m \le d!\,MN^{d-1}} \min\Bigl(N, \frac{1}{\|m\alpha\|}\Bigr). \qquad (3.2)$$

Proceeding as in [6], we replace $\alpha$ by a rational approximation $a/q$ with $|\alpha - a/q| \le 1/q^2$, and divide the range of summation into consecutive blocks of length $q$. This will give

$$\sum_{m \le d!\,MN^{d-1}} \min\Bigl(N, \frac{1}{\|m\alpha\|}\Bigr) \ll \Bigl(\frac{MN^{d-1}}{q} + 1\Bigr)(N + q\log q).$$

Inserting into (3.2) we get

$$\sum_{1 \le n \le M} |S(n, N)|^D \ll MN^{D-1} + M^\epsilon N^{D-d+\epsilon}\Bigl(\frac{MN^{d-1}}{q} + 1\Bigr)(N + q\log q). \qquad (3.3)$$

Now choose $q \le MN^{d-1}$ with $|\alpha - a/q| \le 1/(qMN^{d-1})$ (so certainly $|\alpha - a/q| \le 1/q^2$ so (3.3) holds). Since $\alpha$ is Diophantine, $|\alpha - a/q| \gg 1/q^{2+\epsilon}$ which gives $q \gg (MN^{d-1})^{1-\epsilon}$. Therefore

$$\Bigl(\frac{MN^{d-1}}{q} + 1\Bigr)(N + q\log q) \ll (MN^{d-1})^{1+\epsilon},$$

and consequently

$$\sum_{1 \le n \le M} |S(n, N)|^D \ll M^{1+\epsilon}N^{D-1+\epsilon}$$

as required.
As an immediate consequence of this lemma, we get on repeatedly using the Cauchy-Schwarz inequality that

Corollary 5. For $\alpha$ Diophantine, and $M \ge 1$,

$$\sum_{1 \le n \le M} |S(n, N)|^2 \ll M^{1+\epsilon}N^{2-2/D+\epsilon} \qquad (3.4)$$

and

$$\sum_{1 \le n \le M} |S(n, N)| \ll M^{1+\epsilon}N^{1-1/D+\epsilon}. \qquad (3.5)$$
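One way to see how (3.4) and (3.5) follow from Lemma 4 is via Hölder's inequality, which packages the repeated Cauchy-Schwarz steps into a single line. The sketch below is our own bookkeeping of the exponents, assuming $D \ge 2$; the $\epsilon$'s on the right are then renamed.

```latex
\begin{align*}
\sum_{1\le n\le M} |S(n,N)|^2
  &\le M^{1-2/D}\Bigl(\sum_{1\le n\le M}|S(n,N)|^D\Bigr)^{2/D}
   \ll M^{1-2/D}\bigl(M^{1+\epsilon}N^{D-1+\epsilon}\bigr)^{2/D}
   = M^{1+2\epsilon/D}\,N^{2-2/D+2\epsilon/D},\\
\sum_{1\le n\le M} |S(n,N)|
  &\le M^{1-1/D}\Bigl(\sum_{1\le n\le M}|S(n,N)|^D\Bigr)^{1/D}
   \ll M^{1-1/D}\bigl(M^{1+\epsilon}N^{D-1+\epsilon}\bigr)^{1/D}
   = M^{1+\epsilon/D}\,N^{1-1/D+\epsilon/D}.
\end{align*}
```

In both lines the powers of $M$ recombine to $1 + O(\epsilon)$, which is why the corollary has the same shape in $M$ as Lemma 4.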
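3.3. Proof of Proposition 3. We first show

$$X_{N+k}(\alpha) - X_N(\alpha) = \frac{1}{N^2}\sum_{0 < |n| \le M} \hat f\Bigl(\frac{n}{N}\Bigr)\{s_{\mathrm{off}}(n, N+k) - s_{\mathrm{off}}(n, N)\} + O\bigl(M^{2+\epsilon}N^{-2+\delta-2/D}\bigr). \qquad (3.6)$$

We use the representation (2.5),

$$X_N(\alpha) = \frac{1}{N^2}\sum_{n \ne 0} \hat f\Bigl(\frac{n}{N}\Bigr)s_{\mathrm{off}}(n, N).$$

Since $f \in C_c^\infty(\mathbb{R})$, its Fourier transform $\hat f$ is rapidly decreasing and so on using the trivial estimate $|s_{\mathrm{off}}(n, N)| \le N^2$ we see that for any $b > 0$, $M = N^{1+b}$,

$$X_N(\alpha) = \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Bigl(\frac{n}{N}\Bigr)s_{\mathrm{off}}(n, N) + \text{rapidly decaying term}.$$

Next we use $|s_{\mathrm{off}}(n, N)| \le N + |S(n, N)|^2$ and Corollary 5 to deduce that

$$\sum_{0 \ne |n| \le M} |s_{\mathrm{off}}(n, N+k)| \le MN + \sum_{0 \ne |n| \le M} |S(n, N+k)|^2 \ll M^{1+\epsilon}N^{2-2/D}. \qquad (3.7)$$

Next we claim that

$$\frac{1}{(N+k)^2}\sum_{0 \ne |n| \le M} \hat f\Bigl(\frac{n}{N+k}\Bigr)s_{\mathrm{off}}(n, N+k) = \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Bigl(\frac{n}{N}\Bigr)s_{\mathrm{off}}(n, N+k) + O\bigl(M^{2+\epsilon}N^{-2+\delta-2/D}\bigr).$$

Indeed, write

$$\frac{1}{(N+k)^2} = \frac{1}{N^2} + O\Bigl(\frac{k}{N^3}\Bigr) = \frac{1}{N^2} + O(N^{-3+\delta})$$

and

$$\frac{n}{N+k} = \frac{n}{N} + O\Bigl(\frac{nk}{N^2}\Bigr) = \frac{n}{N} + O\Bigl(\frac{M}{N^{2-\delta}}\Bigr),$$

so that for $|n| \le M$, $k < N^\delta$,

$$\hat f\Bigl(\frac{n}{N+k}\Bigr) = \hat f\Bigl(\frac{n}{N}\Bigr) + O\Bigl(\frac{M}{N^{2-\delta}}\Bigr).$$

Therefore

$$\begin{aligned}
&\frac{1}{(N+k)^2}\sum_{0 \ne |n| \le M} \hat f\Bigl(\frac{n}{N+k}\Bigr)s_{\mathrm{off}}(n, N+k) - \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Bigl(\frac{n}{N}\Bigr)s_{\mathrm{off}}(n, N+k) \\
&\quad= \sum_{0 \ne |n| \le M} \Bigl(\frac{1}{N^2} + O\Bigl(\frac{1}{N^{3-\delta}}\Bigr)\Bigr)\Bigl(\hat f\Bigl(\frac{n}{N}\Bigr) + O\Bigl(\frac{M}{N^{2-\delta}}\Bigr)\Bigr)s_{\mathrm{off}}(n, N+k) - \frac{1}{N^2}\sum_{0 \ne |n| \le M} \hat f\Bigl(\frac{n}{N}\Bigr)s_{\mathrm{off}}(n, N+k) \\
&\quad\ll \Bigl(\frac{M}{N^{4-\delta}} + \frac{1}{N^{3-\delta}}\Bigr)\sum_{0 \ne |n| \le M} |s_{\mathrm{off}}(n, N+k)| \\
&\quad\ll \frac{M}{N^{4-\delta}}\,M^{1+\epsilon}N^{2-2/D} = M^{2+\epsilon}N^{-2+\delta-2/D} \qquad\text{by (3.7)}
\end{aligned}$$

as required. This proves (3.6).

Next we express the difference $s_{\mathrm{off}}(n, N+k) - s_{\mathrm{off}}(n, N)$ as

$$s_{\mathrm{off}}(n, N+k) - s_{\mathrm{off}}(n, N) = 2\,\mathrm{Re}\Bigl(\sum_{N+1 \le y \le N+k} e(-n\alpha y^d)\sum_{1 \le x \le N} e(n\alpha x^d)\Bigr) + \sum_{N+1 \le x \ne y \le N+k} e\bigl(n\alpha(x^d - y^d)\bigr).$$

We estimate the second term trivially by $k^2$:

$$|s_{\mathrm{off}}(n, N+k) - s_{\mathrm{off}}(n, N)| \le k|S(n, N+k)| + k^2. \qquad (3.8)$$

Then inserting this into (3.6) we get

$$\begin{aligned}
X_{N+k}(\alpha) - X_N(\alpha) &\ll \frac{1}{N^2}\sum_{0 < |n| \le M} \bigl(k|S(n, N+k)| + k^2\bigr) + M^{2+\epsilon}N^{-2+\delta-2/D} \\
&\ll \frac{k}{N^2}\sum_{0 < |n| \le M} |S(n, N)| + \frac{Mk^2}{N^2} + M^{2+\epsilon}N^{-2+\delta-2/D} \\
&\ll \frac{k}{N^2}\,M^{1+\epsilon}N^{1-1/D} + \frac{Mk^2}{N^2} + M^{2+\epsilon}N^{-2+\delta-2/D} \qquad\text{by (3.5)} \\
&\ll M^{1+\epsilon}kN^{-1-1/D} \ll N^{b+\delta-1/D+\epsilon}.
\end{aligned}$$

Since $b > 0$ can be made arbitrarily small, this proves our proposition.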
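The last chain of inequalities compresses three exponent comparisons. The following bookkeeping is our own sketch of them, under the assumptions $M = N^{1+b}$, $k \le N^\delta$, $0 < \delta < 1/D$ and $D \ge 2$ used in the proof; the conditions on $b$ are our reading of "arbitrarily small".

```latex
\begin{align*}
\frac{k}{N^2}\,M^{1+\epsilon}N^{1-1/D}
  &\le N^{\delta-2+(1+b)(1+\epsilon)+1-1/D}
   = N^{\,b+\delta-1/D+\epsilon(1+b)},\\
\frac{Mk^2}{N^2}
  &\le N^{\,b+2\delta-1}
   \le N^{\,b+\delta-1/D}
   \quad\text{since } \delta < \tfrac1D \le 1-\tfrac1D,\\
M^{2+\epsilon}N^{-2+\delta-2/D}
  &= N^{\,2b+\delta-2/D+\epsilon(1+b)}
   \le N^{\,b+\delta-1/D+\epsilon(1+b)}
   \quad\text{provided } b \le \tfrac1D .
\end{align*}
```

Hence all three terms are $\ll N^{-\nu}$ for any $\nu < 1/D - \delta$ once $b$ and $\epsilon$ are chosen small enough, which is the statement of Proposition 3.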
4. Triple and Higher Correlations

The higher correlations run into some basic difficulties. For example, consider the triple correlation for $\alpha n^2 \bmod 1$. For a test function $f \in C_c^\infty(\mathbb{R}^2)$, let

$$F_N(y_1, y_2) = \sum_{(m_1, m_2) \in \mathbb{Z}^2} f\bigl(N(y_1 + m_1), N(y_2 + m_2)\bigr). \qquad (4.1)$$

This function is periodic and has a Fourier expansion

$$F_N(y) = \frac{1}{N^2}\sum_{k \in \mathbb{Z}^2} \hat f\Bigl(\frac{k}{N}\Bigr)e(k \cdot y). \qquad (4.2)$$

The triple correlation function of the sequence $\alpha n^2 \bmod 1$ for the test function $f$ is

$$R_3(f, \alpha, N) = \frac{1}{N}\sideset{}{'}\sum_{1 \le x, y, z \le N} F_N\bigl(\alpha(x^2 - y^2), \alpha(y^2 - z^2)\bigr), \qquad (4.3)$$

where the sum $\sum'$ is over all triples of distinct integers $x, y, z$. The Fourier expansion of $R_3$ is

$$R_3(f, \alpha, N) = \sum_l c_l(N)e(l\alpha) \qquad (4.4)$$

with

$$c_l(N) = \frac{1}{N^3}\sideset{}{'}\sum_{\substack{1 \le x, y, z \le N,\ k_1, k_2 \\ k_1(x^2 - y^2) + k_2(y^2 - z^2) = l}} \hat f\Bigl(\frac{k_1}{N}, \frac{k_2}{N}\Bigr). \qquad (4.5)$$

There is no doubt that the mean $\langle R_3\rangle = c_0(N) \to \hat f(0, 0)$, as $N \to \infty$, the expected answer for a random sequence, and that more generally $c_l(N) \to 0$ if $l \ne 0$. That is to say that $R_3(f, \alpha, N) \to \hat f(0, 0)$ in the weak sense. This can probably be proven. However, a much greater difficulty appears and that is that if $f(0) \ne 0$ then

$$\bigl\|R_3(f, N) - \hat f(0)\bigr\|_2^2 \gg N. \qquad (4.6)$$

This shows that the $L^2$ approach to almost-everywhere convergence is problematic in this case. In fact, this feature of the $L^2$-norm being so large is a manifestation of $R_3$ being very large at rational $\alpha$'s. For almost all $\alpha$ we still expect that $R_3(f, \alpha, N) \to \hat f(0, 0)$.

To prove (4.6) note that as $N \to \infty$,

$$\sum_l c_l(N) = \frac{1}{N^3}\sideset{}{'}\sum_{1 \le x, y, z \le N}\ \sum_{k_1, k_2} \hat f\Bigl(\frac{k_1}{N}, \frac{k_2}{N}\Bigr) \sim N^2f(0).$$

Hence if $f(0) \ne 0$ then

$$N^2 \ll \Bigl|\sum_l c_l(N)\Bigr| \le \Bigl(\sum_l |c_l(N)|^2\Bigr)^{1/2}\Bigl(\sum_{l \ll N^3} 1\Bigr)^{1/2} = N^{3/2}\Bigl(\sum_l |c_l(N)|^2\Bigr)^{1/2},$$

which gives (4.6).

References

1. Berry, M.V. and Tabor, M.: Level clustering in the regular spectrum. Proc. R. Soc. London A356, 375–394 (1977)
2. Bleher, P.M.: The energy level spacing for two harmonic oscillators with golden mean ratio of frequencies. J. Stat. Phys. 61, 869–876 (1990)
3. Bleher, P.M.: The energy level spacing for two harmonic oscillators with generic ratio of frequencies. J. Stat. Phys. 63, 261–283 (1991)
4. Pandey, A., Bohigas, O. and Giannoni, M.J.: Level repulsion in the spectrum of two-dimensional harmonic oscillators. J. Phys. A: Math. Gen. 22, 4083–4088 (1989)
5. Casati, G., Guarneri, I. and Izrailev, F.M.: Statistical properties of the quasi-energy spectrum of a simple integrable system. Phys. Lett. A 124, 263–266 (1987)
6. Davenport, H.: Analytic Methods for Diophantine Equations and Diophantine Inequalities. Ann Arbor, Michigan: Ann Arbor Publishers, 1962
7. Rudnick, Z., Sarnak, P. and Zaharescu, A.: In preparation
8. Sarnak, P.: Values at Integers of Binary Quadratic Forms. To appear in a volume in memory of C. Herz, edited by S. Drury
9. Sós, V.: On the distribution mod 1 of the sequence nα. Ann. Univ. Sci. Budapest. Eötvös Sect. Math. 1, 127–134 (1958)
10. Swierczkowski, S.: On successive settings of an arc on the circumference of a circle. Fund. Math. 46, 187–189 (1958)
11. Weyl, H.: Über die Gleichverteilung von Zahlen mod. Eins. Math. Ann. 77, 313–352 (1916)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 194, 71 – 86 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Classification of Bicovariant Differential Calculi on Quantum Groups (a Representation-Theoretic Approach)

Pierre Baumann, Frédéric Schmitt

U.F.R. de Mathématique, Université Louis Pasteur, 7 rue René Descartes, F-67084 Strasbourg Cedex, France

Received: 1 December 1996 / Accepted: 29 September 1997
Abstract: The restricted dual of a quantized enveloping algebra can be viewed as the algebra of functions on a quantum group. According to Woronowicz, there is a general notion of bicovariant differential calculus on such an algebra. We give a classification theorem of these calculi. The proof uses the notion (due to Reshetikhin and Semenov-Tian-Shansky) of a factorizable quasi-triangular Hopf algebra and relies on results of Joseph and Letzter. On the way, we also give a new formula for Rosso's bilinear form.
Introduction
Let G be a semi-simple connected simply-connected complex Lie group, g its Lie algebra, Uq g the quantized enveloping algebra of g. Uq g is a Hopf algebra. The associated quantum group is an object of non-commutative geometry. According to a point of view due to Woronowicz and developed by Faddeev, Reshetikhin and Takhtadzhyan [F–R–T], one may view the restricted (Hopf) dual (Uq g)∗ res as the algebra Aq G of functions on this quantum group. In this way, the Peter–Weyl theorem becomes a definition: the rational representations of the quantum group are the finite-dimensional representations of Uq g.
In order to study the differential geometry of quantum groups, Woronowicz [Wo] defined the notion of bicovariant differential calculus. As in the classical case, one needs only to define the differential of functions at the unit point of the quantum group. If ε : Aq G → C(q) is the augmentation map, this amounts to taking the residue class of functions belonging to ker ε modulo a right ideal R ⊆ ker ε. In the classical case, one takes R = (ker ε)². As for quantum groups, it is more important to preserve the group structure than the infinitesimal structure, and one is led to select ideals R as above by the requirement of a certain invariance condition. In this article, we solve the classification problem for these ideals R, and we give a picture of what they look like.
We now compare our results with previous ones. Rosso [Ro3] showed how to use the quasi-triangular structure of Uq g in order to construct left covariant differential calculi
on the quantum group. Modifying this construction, Jurčo [Ju] used the R-matrix in the natural representation of Uq g (and in the dual of it) so as to construct bicovariant differential calculi: he obtained particular cases (when M is the natural g-module or its dual) of our Theorem 2. (In this spirit, see also [F–P].) As regards classification results, Schmüdgen and Schüler have classified the ideals R as above, but only when g is of classical type, and under restrictive assumptions on R. Most of the results in [S–S1, S–S2] are particular cases of our Theorem 1. For instance, the classification given in Theorem 2.1 of [S–S1] corresponds (in the wording of our theorem) to the ideals R constructed (up to a twisting character χ : 2X/2Q → C×, as explained in Sect. 3.3) from the natural Uq sln-module or its dual.
Let us explain our proof and the contents of our article. Our proof relies on the quasi-triangular structure of Uq g. Since the formalism of R-matrices may be justified only for finite-dimensional Hopf algebras, we will employ the dual notion of a co-quasi-triangular (c.q.t.) Hopf algebra (see [L–T]): the algebra Aq G is c.q.t. We then use a bilinear form on Aq G, introduced by Reshetikhin and Semenov-Tian-Shansky. As Uq g is a factorizable quasi-triangular Hopf algebra (in the terminology of [R–S]), this pairing is non-degenerate and gives a linear injection Aq G ↪ Uq g ⊆ (Aq G)∗. The image of R under this map is nearly the annihilator of a Uq g-module. It is then easier to discuss what R may be. The definitions and the proofs of these assertions are given in Sects. 1 and 2. In Sect. 3, we present a construction of bicovariant differential calculi valid for any factorizable c.q.t. Hopf algebra. Finally we link, in the case of Aq G, these constructions with our classification result.
Notations.
• Let A be a k-algebra. If M is an A-module, its annihilator is denoted by annA M.
If m ∈ M and m∗ ∈ M∗ (the k-dual of M), we denote by θM(m, m∗) the matrix coefficient (A → k, a ↦ ⟨m∗, a · m⟩).
• For a Hopf algebra H, we will use Sweedler’s notation for the coproduct (Δ(a) = Σ a(1) ⊗ a(2)) and for coactions on comodules. The sum sign will generally be omitted. We will denote the augmentation and the antipode of H by ε and S respectively.
• Let H be a Hopf algebra, and H∗ res be the restricted (Hopf) dual of H. A finite-dimensional left H-module M (with a basis (mi) and the dual basis (m∗i) of M∗) can be viewed as a right H∗ res-comodule with structure map δR : (M → M ⊗ H∗ res, m ↦ Σ mi ⊗ θM(m, m∗i)).
1. Co-Quasi-Triangular Hopf Algebras
1.1. Some definitions. Let H be a Hopf algebra over a field k. A right crossed bimodule over H (in the sense of Yetter [Ye]) is a k-vector space M which is both a right H-module and a right H-comodule (with structure map δR : (M → M ⊗ H, m ↦ Σ m(0) ⊗ m(1))), the two structures being compatible: δR(m · a) = Σ m(0) · a(2) ⊗ S(a(1))m(1)a(3) (for m ∈ M, a ∈ H). When M and N are right crossed bimodules over H, M ⊗ N becomes a right crossed bimodule for the action (m ⊗ n) · a = m · a(1) ⊗ n · a(2) and the coaction δR(m ⊗ n) = (m(0) ⊗ n(0)) ⊗ m(1)n(1). There are two easy examples: we can endow H with the structures a · b = ab and δR : (H → H ⊗ H, a ↦ a(2) ⊗ S(a(1))a(3)). Alternatively, we can put on H the structures a · b = S(b(1))ab(2) (right adjoint action) and δR : (H → H ⊗ H, a ↦ a(1) ⊗ a(2)).
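As a sanity check on the compatibility condition, here is a worked instance under an assumption made only for illustration: H = k[G] is the group algebra of a group G (this is not the case studied in this article). For g, h ∈ G one has Δ(g) = g ⊗ g and S(g) = g⁻¹, so the two examples above reduce to the following.

```latex
% Assumption (illustration only): H = k[G], with g, h group elements.
% Example 1: action m . a = ma, coaction a -> a_(2) (x) S(a_(1)) a_(3).
\[
\delta_R(g) = g \otimes g^{-1} g = g \otimes 1, \qquad
\delta_R(g \cdot h) = gh \otimes 1
= (g \cdot h) \otimes h^{-1}\, 1\, h .
\]
% Example 2: right adjoint action, coaction given by the coproduct.
\[
g \cdot h = h^{-1} g h, \qquad
\delta_R(g \cdot h) = h^{-1} g h \otimes h^{-1} g h
= (g \cdot h) \otimes S(h)\, g\, h .
\]
```

In both cases the right-hand side is exactly m(0) · a(2) ⊗ S(a(1))m(1)a(3) with m = g and a = h. More generally, under the same assumption, the condition says that a crossed bimodule over k[G] is a G-graded space whose component of degree g is sent by h into the component of degree h⁻¹gh.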
When Γ is a bicovariant bimodule (see [Wo]), the space ΓL of left coinvariants is a right crossed bimodule over H. Conversely, any right crossed bimodule over H is the space of left coinvariants of a bicovariant bimodule. Finally (H still being a Hopf algebra), we endow the tensor product coalgebra H∗ res ⊗ H with the product (f ⊗ a)(g ⊗ b) = ⟨g(3), a(3)⟩⟨g(1), S(a(1))⟩(g(2)f ⊗ a(2)b). We obtain a bialgebra, called Drinfel’d’s double of H and denoted by D(H). (Here H∗ res is the standard dual of H, the coproduct is not brought into its opposite.) When M is a right crossed bimodule over H, it is a right D(H)-module for the actions: m · (f ⊗ 1) = ⟨f, m(1)⟩m(0), m · (1 ⊗ a) = m · a.
1.2. Definition of a co-quasi-triangular Hopf algebra. We give the definition of c.q.t. Hopf algebras, by now usual (see [L–T] for historical notes):
Definition 1. A co-quasi-triangular Hopf algebra is a pair (A, γ), where A is a Hopf algebra and γ : A → A∗ res is a coalgebra morphism and an algebra antimorphism such that we have the Yang–Baxter equation (or rather the Baxter commutation relations): a(1)b(1)⟨γa(2), b(2)⟩ = ⟨γa(1), b(1)⟩b(2)a(2) for all a, b ∈ A.
That γ is a coalgebra morphism and an algebra antimorphism gives us that for all a, b ∈ A, ⟨γa, b⟩ = ⟨γS(a), S(b)⟩. We call δ : A → A∗ the map such that ⟨δa, b⟩ = ⟨γb, S(a)⟩, for all a, b ∈ A. Hence we have ⟨γa, b⟩ = ⟨δb, S(a)⟩. We verify easily that δ takes its values in A∗ res and that (A, δ) is a c.q.t. Hopf algebra. If U is a Hopf algebra quasi-triangular for an R-matrix R12, then U∗ res becomes a c.q.t. Hopf algebra for the map γ given by: for a, b ∈ U∗ res, ⟨γ(a), b⟩ = ⟨b ⊗ a, R12⟩, and then ⟨δ(a), b⟩ = ⟨b ⊗ a, R21⁻¹⟩. This follows from Drinfel’d’s classical axioms. For instance, let H be a finite-dimensional Hopf algebra, and U = D(H): the dual vector space H ⊗ H∗ of U is the underlying space of the restricted dual of U. If (ei) is a basis for H, the canonical R-matrix is Σ (e∗i ⊗ 1) ⊗ (1 ⊗ ei) ∈ U ⊗ U. It corresponds to the maps γ : (H ⊗ H∗ → U, a ⊗ f ↦ ε(a)f ⊗ 1) and δ : (H ⊗ H∗ → U, b ⊗ g ↦ g(1)ε ⊗ S⁻¹(b)) (the antipode of a finite-dimensional Hopf algebra being invertible). The category of left modules over a quasi-triangular Hopf algebra is braided. The translation in the present formalism is given by the following proposition:
Proposition 1. Let (A, γ) be a c.q.t. Hopf algebra. If M is a right A-comodule, it becomes a right crossed bimodule over A when endowed with the right module structure given by: for m ∈ M and a ∈ A, m · a = ⟨γa, m(1)⟩m(0). This extra structure is compatible with tensor products of comodules and crossed bimodules.
Proof. Let δR : (M → M ⊗ A, m ↦ m(0) ⊗ m(1)) be the structure map for M. Then we have:
m(0) · a(2) ⊗ S(a(1))m(1)a(3) = m(0) ⊗ ⟨γa(2), m(1)⟩S(a(1))m(2)a(3)
= m(0) ⊗ S(a(1))a(2)m(1)⟨γa(3), m(2)⟩
= m(0) ⊗ m(1)⟨γa, m(2)⟩
= δR(m · a).
The compatibility with tensor products is a consequence of γ being a coalgebra homomorphism.
We also note that the antipode of a c.q.t. Hopf algebra is always invertible, the square of its transpose being an inner automorphism of the algebra A∗ (see [Dr2]).
Finally, when (A, γ) is a c.q.t. Hopf algebra, we have the maps γ and δ, and Radford [Ra] has shown that (im γ)(im δ) = (im δ)(im γ) is a sub-Hopf-algebra of A∗ res. This was already shown in [R–S]: there is a Hopf algebra structure (with invertible antipode) on the tensor product coalgebra A ⊗ A such that the map (A ⊗ A → A∗ res, a ⊗ b ↦ γb · δa) is a coalgebra morphism and an algebra antimorphism.
Example. In the F.R.T. construction [F–R–T], one considers matrices L+ and L−, whose elements lie in im γ and im δ respectively. Then Faddeev, Reshetikhin and Takhtadzhyan defined Uq g to be the algebra (im γ)(im δ).
1.3. The maps I and J. We fix in this subsection a c.q.t. Hopf algebra (A, γ) over the field k, and denote by δ the associated map. We define two maps I : (A → A∗ res, a ↦ γ(a(1)) Sδ(a(2))) and J : (A → A∗ res, a ↦ Sδ(a(1)) γ(a(2))). Equivalently, we may consider the pairing of two elements a, b ∈ A: ⟨I(a), b⟩ = ⟨J(b), a⟩. (When A is the dual of a quasi-triangular Hopf algebra, this pairing is ⟨a ⊗ b, R21R12⟩.) We have I = S ◦ J ◦ S and J = S ◦ I ◦ S. We will now state an important property of the map I. A∗ res is a left A∗ res ⊗ A∗ res-module for the law (x ⊗ y) · z = xz S(y). A is a right crossed bimodule over A for the structures a · b = ab and δR : (A → A ⊗ A, a ↦ a(2) ⊗ S(a(1))a(3)), so A is a right D(A)-module. Let Π : (D(A) ≡ A∗ res ⊗ A → A∗ res ⊗ A∗ res, x ⊗ b ↦ γ(b(1))x(1) ⊗ δ(b(2))x(2)).
Proposition 2. In the set-up above, Π is an algebra antimorphism. If ξ ∈ D(A) and a ∈ A, then I(a · ξ) = Π(ξ) · I(a).
Proof. That Π is an antimorphism is already in [R–S]. Then, as a consequence of the Yang–Baxter equation, we may write, for x ∈ A∗ res and a ∈ A, that Sγ(a(1))⟨x, a(2)⟩ = ⟨x(2), a(1)⟩x(1) Sγ(a(2))S(x(3)).
Then we compute, for ξ = x ⊗ b ∈ D(A):
I(a · ξ) = ⟨x, S(a(1))a(3)⟩ I(a(2)b)
= γ(b(1)) ⟨x, S(a(1))a(4)⟩ γ(a(2)) Sδ(a(3)) Sδ(b(2))
= γ(b(1)) ⟨x(1), S(a(1))⟩ SγS(a(2)) ⟨x(2), a(4)⟩ Sδ(a(3)) Sδ(b(2))
= γ(b(1)) ⟨x(2), S(a(2))⟩ x(1) SγS(a(1)) S(x(3)) ⟨x(5), a(3)⟩ x(4) Sδ(a(4)) S(x(6)) Sδ(b(2))
= γ(b(1)) ⟨x(2), S(a(2))⟩ x(1) SγS(a(1)) ⟨x(3), a(3)⟩ Sδ(a(4)) S(x(4)) Sδ(b(2))
= γ(b(1)) x(1) γ(a(1)) Sδ(a(2)) S(x(2)) Sδ(b(2))
= Π(ξ) · I(a).
We single out the particular case b = 1:
Proposition 3. We consider A and A∗ res as left A∗ res-modules for the adjoint action: if x, y ∈ A∗ res and a ∈ A, x · a = ⟨x, S(a(1))a(3)⟩a(2) and x · y = x(1) y S(x(2)). Then I : A → A∗ res is a morphism of A∗ res-modules.
Finally, we give the definition, originally due to Reshetikhin and Semenov-Tian-Shansky [R–S]:
Definition 2. One says that (A, γ) is factorizable if the pairing (A × A → k, (a, b) ↦ ⟨I(a), b⟩) is non-degenerate.
Thus (A, γ) is factorizable iff the maps I and J are injective. It is possible to show that (A, γ) is factorizable iff (A, δ) is so.
1.4. A related construction. First, let U be a Hopf algebra. It is a left U-module for the adjoint action: x · y = x(1) y S(x(2)). We let Fℓ(U) be the sum of all finite-dimensional
U-submodules of U. It is known [J–L1] that Fℓ(U) is a subalgebra of U, a left coideal in U, and a U-submodule for the left adjoint action. The multiplication in U defines a morphism of left U-modules Fℓ(U) ⊗ Fℓ(U) → Fℓ(U). We can then form the semi-direct product Fℓ(U) ⊗ U: we obtain an algebra. With U ⊗ U denoting the ordinary tensor product algebra, there is an algebra morphism (Fℓ(U) ⊗ U → U ⊗ U, x ⊗ y ↦ xy(1) ⊗ y(2)). We can make the same constructions on the right: we obtain an algebra Fr(U). If the antipode of U is invertible, the algebra morphism (U ⊗ Fr(U) → U ⊗ U, x ⊗ y ↦ x(1) ⊗ x(2)y) has the same image as the previous one. Hence this image contains Fℓ(U) ⊗ Fr(U) ⊆ U ⊗ U.
We now take a c.q.t. Hopf algebra (A, γ), with δ, I and J as in the preceding subsection. Let U = (im γ)(im δ) be the minimal sub-Hopf-algebra of A∗ res in which γ and δ take their values. We consider on A and A∗ res the A∗ res-module structures of Proposition 3. By restriction, A and A∗ res are U-modules, and I : A → A∗ res is a morphism of U-modules. We can see that I takes its values in U, which is a U-submodule of A∗ res. Further, A is the sum of its finite-dimensional U-submodules, hence im I ⊆ Fℓ(U).
Proposition 4. Let (A, γ) be a c.q.t. factorizable Hopf algebra, and I be the associated map. Let U be the sub-Hopf-algebra (im γ)(im δ) ⊆ A∗ res. We suppose that im I = Fℓ(U). Then I induces a bijection between:
• the set of right ideals R of A which are subcomodules for the right coaction δR : (A → A ⊗ A, a ↦ a(2) ⊗ S(a(1))a(3));
• the set of two-sided ideals I of Fℓ(U) which are U-submodules for the adjoint action.
This bijection preserves dimensions, codimensions, and the inclusion ordering in both sets.
Proof. By assumption, I : A → Fℓ(U) is a U-module isomorphism. We adopt the notations of Proposition 2.
A is a D(A)-module, and U ⊗ A is (the underlying space of) a sub-Hopf-algebra of D(A), so we will view A as a right U ⊗ A-module: 1 ⊗ A acts on A by right multiplication, Uop ⊗ 1 acts on A by the left adjoint action. The injectivity of I implies that im J ⊆ U separates the points of A: hence the sub-U ⊗ A-modules of A are the right ideals which are subcomodules for the right coaction δR. On the other hand, we let E be the image of the morphism (Fℓ(U) ⊗ U → U ⊗ U, x ⊗ y ↦ xy(1) ⊗ y(2)). U is a U ⊗ U-module, so is an E-module, and Fℓ(U) is a sub-E-module of U. E contains Fℓ(U) ⊗ Fr(U), with S(Fr(U)) = Fℓ(U). Therefore, the sub-E-modules of Fℓ(U) are the two-sided ideals I which are U-submodules for the adjoint action. Now the proposition is a consequence of Proposition 2: writing Π as the composition (Fℓ(U) ⊗ A∗ res → A∗ res ⊗ A∗ res, x ⊗ y ↦ xy(1) ⊗ y(2)) ◦ (A∗ res ⊗ A → Fℓ(U) ⊗ A∗ res, x ⊗ a ↦ I(a(1)) ⊗ δ(a(2))x), and using the assumption im I = Fℓ(U), we can see that E is the image of U ⊗ A through Π.
2. The Case of the Quantum Coordinate Algebra
2.1. Notations. In this section, we study the preceding constructions in the case where A is the algebra Aq G of regular functions on a quantum group. Let g be a finite-dimensional semi-simple split Lie algebra, h a splitting Cartan subalgebra, {α1, …, αℓ} ⊆ h∗ a basis for the root system, {α1∨, …, αℓ∨} ⊆ h the inverse roots, P ⊆ h∗ and Q ⊆ h∗ the weight and the root lattices. The choice of an invariant (under Weyl group action) scalar product (·|·) allows us to identify h and h∗,
with αi = di αi∨, di = (αi|αi)/2. We choose the normalization of (·|·) so that (λ | µ) ∈ Z whenever λ and µ belong to P. We denote by ρ half the sum of the positive roots, by P+ the set of dominant weights, and by w0 the longest element in the Weyl group. We now choose the following version of Uq g: this is a C(q)-algebra (q is generic) generated by Ei, Fi and Kλ (λ ∈ P). The relations are the usual ones, among which:
\[ K_\lambda E_i = q^{(\lambda|\alpha_i)} E_i K_\lambda, \qquad K_\lambda F_i = q^{-(\lambda|\alpha_i)} F_i K_\lambda, \qquad E_i F_j - F_j E_i = \delta_{ij}\, \frac{K_{\alpha_i} - K_{-\alpha_i}}{q^{d_i} - q^{-d_i}} . \]
The coproduct is given by: ΔKλ = Kλ ⊗ Kλ, ΔEi = Ei ⊗ 1 + Kαi ⊗ Ei, ΔFi = 1 ⊗ Fi + Fi ⊗ K−αi. We denote by S the antipode of Uq g. If one chooses a dominant weight λ and a character χ : P/2Q → C×, one knows how to construct a simple finite-dimensional Uq g-module, in which there is a highest weight vector mλ such that Kµ · mλ = χ(µ mod 2Q) q^{(µ|λ)} mλ. We denote such a Uq g-module by Lχ(λ); when χ is the trivial character, we simply write L(λ), and then Lχ(λ) = L(λ) ⊗ Lχ(0). The matrix coefficients of the representation L(λ) span a linear subspace C(λ) of the restricted dual of Uq g, and we let Aq G = ⊕λ∈P+ C(λ). This is a Hopf subalgebra of (Uq g)∗ res. The elements of Aq G separate the points of Uq g [J–L1], so that there is an inclusion of Uq g into the dual of Aq G, actually into the restricted dual of Aq G. We denote by S the antipode of Aq G, which is the restriction to Aq G of the transpose of the antipode of Uq g. There is an R-matrix for Uq g [Dr1, Ta, Ga]. We choose the R-matrix with the structure (diagonal part) Σ (monomial in F) ⊗ (monomial in E). If a and b belong to Aq G, the number ⟨R12, b ⊗ a⟩ ∈ C(q) is well-defined (thanks to the weight graduation of Uq g and of any finite-dimensional Uq g-module), and we can define γ, δ : Aq G → (Aq G)∗ such that ⟨R12, b ⊗ a⟩ = ⟨γ(a), b⟩ = ⟨δ(b), S(a)⟩. (Aq G, γ) and (Aq G, δ) are c.q.t. Hopf algebras, im(γ) and im(δ) are the sub-Hopf-algebras U−U0 and U0U+ of Uq g ⊆ (Aq G)∗ res respectively, and Uq g is the sub-Hopf-algebra (im γ)(im δ) = (im δ)(im γ) of (Aq G)∗ res.
2.2. Factorizability of Aq G. Let (Aq G, γ) be the c.q.t. algebra presented above, and δ be the associated map. For the whole section, we endow Aq G and Uq g with the left adjoint action of Uq g, as in Sect. 1.4: in particular, the map I : Aq G → Fℓ(Uq g) is a morphism of left Uq g-modules. Joseph and Letzter [J–L1, J–L2] have studied the structure of Fℓ(Uq g), and we need the following results:
• If λ ∈ P+, K−2λ generates a finite-dimensional Uq g-submodule of Uq g, and Fℓ(Uq g) = ⊕λ∈P+ (Uq g · K−2λ).
• Each block Uq g · K−2λ contains a unique one-dimensional Uq g-submodule; it defines a unique (up to scalars) element zλ of the center of Uq g.
• Fℓ(Uq g) ⊆ (Aq G)∗ separates the points of Aq G.
The next assertion has been stated in [R–S]:
Proposition 5. (Aq G, γ) is a factorizable c.q.t. Hopf algebra, and im I = Fℓ(Uq g).
Proof. Let λ ∈ P+, L(λ) the standard Uq g-module, mλ a highest weight vector, mw0λ a lowest weight vector, (mi) a basis for L(λ) composed of weight vectors, (m∗i) the dual basis. We have:
• The matrix element θL(λ)(mw0λ, m∗w0λ) is the linear form on Uq g given by (in the triangular decomposition U+ ⊗ U0 ⊗ U− of Uq g): EKµF ↦ ε(E) q^{(w0λ|µ)} ε(F).
• On this element, γ takes the value Kw0λ and δ the value K−w0λ.
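• The image by γ (respectively δ) of the matrix element θL(λ)(mi, m∗w0λ) (respectively θL(λ)(mw0λ, m∗i)) is zero if i ≠ w0λ.
So we have:
\[
\begin{aligned}
I\bigl(\theta_{L(\lambda)}(m_{w_0\lambda}, m^*_{w_0\lambda})\bigr)
&= \gamma\bigl((\theta_{L(\lambda)}(m_{w_0\lambda}, m^*_{w_0\lambda}))_{(1)}\bigr)\, S\bigl(\delta\bigl((\theta_{L(\lambda)}(m_{w_0\lambda}, m^*_{w_0\lambda}))_{(2)}\bigr)\bigr) \\
&= \sum_i \gamma\bigl(\theta_{L(\lambda)}(m_i, m^*_{w_0\lambda})\bigr)\, S\bigl(\delta\bigl(\theta_{L(\lambda)}(m_{w_0\lambda}, m^*_i)\bigr)\bigr) \\
&= \gamma\bigl(\theta_{L(\lambda)}(m_{w_0\lambda}, m^*_{w_0\lambda})\bigr)\, S\bigl(\delta\bigl(\theta_{L(\lambda)}(m_{w_0\lambda}, m^*_{w_0\lambda})\bigr)\bigr) \\
&= K_{2w_0\lambda}.
\end{aligned}
\]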
Hence im I is a Uq g-submodule of Fℓ(Uq g) which contains all the K2w0λ (λ ∈ P+), so im I = Fℓ(Uq g). We now want to show that J is injective. If b ∈ ker J, then for all a ∈ Aq G, ⟨I(a), b⟩ = ⟨J(b), a⟩ = 0, so b is null when viewed as a linear form on im I = Fℓ(Uq g). Then b = 0, because Fℓ(Uq g) separates the points of Aq G. Finally, owing to the formula J = S ◦ I ◦ S and to the invertibility of S, I is also injective. This concludes the proof of the proposition.
There is another way to present this result. Rosso [Ro1] introduced a bilinear non-degenerate ad-invariant form on Uq g, that Caldero [Ca] writes (Uq g × Uq g → C(q^{1/2}), (x, y) ↦ ⟨ζ(x), S⁻¹(y)⟩), where ζ : Uq g → (Uq g)∗. Rosso’s non-degeneracy result is that ζ is injective; Caldero’s theorem states that ζ maps Fℓ(Uq g) onto Aq G ⊆ (Uq g)∗ res. The triangular behaviour of Rosso’s form gives us that ζ(K2w0λ) = θL(λ)(mw0λ, m∗w0λ). The ad-invariance of Rosso’s form can be translated for ζ: when we restrict ζ to Fℓ(Uq g) and Aq G, ζ is a morphism of Uq g-modules for the adjoint structures. Now I ◦ ζ : Fℓ(Uq g) → Fℓ(Uq g) and ζ ◦ I : Aq G → Aq G are morphisms of Uq g-modules and fix the respective generators K2w0λ and θL(λ)(mw0λ, m∗w0λ) of these modules. (The fact that θL(λ)(mw0λ, m∗w0λ) generates the Uq g-submodule C(λ) of Aq G is equivalent to the fact that m∗w0λ ⊗ mw0λ generates the Uq g-module L(λ)∗ ⊗ L(λ).) So we conclude that ζ and I are mutually inverse isomorphisms, and that I is a bijection between C(λ) and Uq g · K2w0λ. The analysis also shows the amusing side-result:
Proposition 6. If x ∈ Fℓ(Uq g), y ∈ Uq g, then the Rosso form on (x, y) is given by ⟨I⁻¹(x), S⁻¹(y)⟩, where I : (Aq G → Fℓ(Uq g), a ↦ ⟨a ⊗ idUq g, R21R12⟩) is related to the universal R-matrix and S is the antipode of Uq g.
Remarks.
1. It is also possible to give a heuristic proof of this result, using the canonical R-matrix for Drinfel’d’s double and using Rosso’s formula for his form [Ro2].
2. In the preceding discussion, we were lying a bit. Caldero’s map ζ does not give exactly Rosso’s bilinear form, but our formula connecting I and Rosso’s form is correct as stated. In our notations, Caldero’s map ζ is the inverse of the map (Aq G → Fℓ(Uq g), a ↦ δ(a(1)) Sγ(a(2))).
Later, we will need to know the relations between the central elements zλ defined above. To this aim, we recall Drinfel’d’s construction of the center of Uq g [Dr2]. Let λ ∈ P+ and t ∈ Aq G be the quantum trace in L(λ): for x ∈ Uq g, ⟨t, x⟩ = Tr L(λ)(K2ρ x). t is an invariant element for the adjoint action of Uq g in Aq G, so I(t) is central, and belongs to Uq g · K2w0λ. We choose the normalization of z−w0λ by letting z−w0λ = I(t). We then have a Mackey-like theorem (which is implicit in [Dr2] and in the thesis of Caldero, Chap. II, 1.2):
Proposition 7. Let c^ν_{λµ} be the fusion coefficients for g: L(λ) ⊗ L(µ) ≃ ⊕ν c^ν_{λµ} L(ν). Then zλ zµ = Σν c^ν_{λµ} zν.
Proof. Let µ ∈ P+. We compute J(θL(µ)(mµ, m∗µ)) = K2µ (with the help of the formulas J = S ◦ I ◦ S and S(θL(µ)(mµ, m∗µ)) = θL(−w0µ)(m−µ, m∗−µ)). Now let λ ∈ P+ and let t be the quantum trace in L(λ). Let Ψ be the Harish-Chandra morphism from the center of Uq g to U0 [Ro1]. We want to compute Ψ(z−w0λ) on µ + ρ. (Evaluation on µ + ρ means the algebra homomorphism (U0 → C(q), Kλ ↦ q^{(λ|µ+ρ)}).) The result will be the image of z−w0λ by the central character of L(µ). So it is ⟨I(t), θL(µ)(mµ, m∗µ)⟩ = ⟨JθL(µ)(mµ, m∗µ), t⟩ = ⟨K2µ, t⟩ = Tr L(λ)(K2µK2ρ) = Tr L(λ)(K2(µ+ρ)). Thus Ψ(z−w0λ) equals the sum of K2ν for ν in the set of weights (with multiplicities) of L(λ). We then use the fact that Ψ is an injective algebra homomorphism.
We denote by G the Grothendieck ring of the category of finite-dimensional Uq g-modules whose components are modules L(λ), without any twisting character χ : P/2Q → C×. Let Z(Uq g) be the center of Uq g, and Z[P] the group algebra of P (with the standard Z-basis denoted by (eν)ν∈P). The map (G → Aq G, [M] ↦ Tr M(K2ρ ·)) is a ring homomorphism. If a, b ∈ Aq G are such that I(a) belongs to the center of Uq g, then I(ab) = I(a) I(b). As a consequence, the map G → Z(Uq g), [M] ↦ I(Tr M(K2ρ ·)) is a ring homomorphism. This shows again the statement in Proposition 7, and we can paraphrase the above proof by saying that the following diagram is commutative:
\[
\begin{array}{ccccc}
G & \longrightarrow & A_q G & \xrightarrow{\ I\ } & Z(U_q\mathfrak{g}) \\
\Big\downarrow{\scriptstyle \mathrm{ch}} & & & & \Big\downarrow{\scriptstyle \Psi} \\
\mathbb{Z}[P] & & \longrightarrow & & U^0
\end{array}
\]
Here ch : G → Z[P] is the ring homomorphism which maps a module to its formal character, and the bottom arrow is the map (Z[P] → U0, eν ↦ K2ν).
2.3. A technical result on the representation ring. We have just introduced a Grothendieck ring G: by the classical results of Lusztig and Rosso, G is naturally isomorphic to the representation ring of g. The elements [L(λ)] (λ ∈ P+) form a Z-basis of G and a Q-basis of G ⊗Z Q.
Proposition 8. Let λ ∈ P+. Then the ideal of G ⊗Z Q generated by the elements [L(λ + ϖ)] (ϖ ∈ P+) is the whole algebra G ⊗Z Q.
The proof of this proposition can be skipped without any drawback. Before we give it, we have to state an elementary lemma:
Lemma. Let (µ(1), …, µ(k)) ∈ (C^ℓ)^k be such that their images in (C/Z)^ℓ are all different, and let (P(1), …, P(k)) ∈ (C[X1, …, Xℓ])^k. If
\[ \sum_{m} P^{(m)}(n_1, \dots, n_\ell)\, \exp\Bigl(2\pi i \sum_j n_j \mu^{(m)}_j\Bigr) = 0 \]
holds for all (n1, …, nℓ) ∈ N^ℓ, then the polynomials P(1), …, P(k) are all equal to zero.
For ℓ = 1, this lemma states linear independence of elementary solutions of a linear difference equation. The general proof is by induction on ℓ.
Proof of Proposition 8. In this proof, we are in a classical context and we do not identify h and h∗. Let R ⊆ h∗ and R∨ ⊆ h be the direct and inverse root systems, (α ↦ α∨) the canonical bijection between R and R∨, and Q(R∨) ⊆ h the root lattice. P = P(R) ⊆ h∗ is still the weight lattice; we denote by {α1∨, …, αℓ∨} the set of inverse simple roots,
and by {ϖ1, …, ϖℓ} the set of fundamental weights. R∨ and R define Q-structures on h and h∗, and we can define hR and hC. The Weyl group W operates on h and h∗, and the affine Weyl group Wa = W ⋉ Q(R∨) operates on h. Let Z[P] be the Z-algebra of the group P, Z[P]W be the set of elements which are invariant under Weyl group action, ch : (G → Z[P]W) be the ring isomorphism “formal character”. Finally, we denote by ε(w) = ±1 the determinant of an element w of the Weyl group. For µ ∈ hC, let evµ : (Z[P] → C) be the ring morphism which sends a basic element eν (ν ∈ P) to exp(2πi⟨µ, ν⟩), where exp is the complex exponential. This extends to an algebra morphism evµ : (C[P] → C). If ν ∈ P+, let fν be the map hC → C, µ ↦ evµ(ch L(ν)).
We first assert that given any (x1, …, xℓ) ∈ C^ℓ, there exists µ ∈ hC such that for all i ∈ {1, …, ℓ}, fϖi(µ) = xi. We view C[P] as the coordinate ring of the affine variety (C×)^ℓ, and we view an element µ = Σ µi αi∨ (µi ∈ C) as the point (e^{2πiµ1}, …, e^{2πiµℓ}) ∈ (C×)^ℓ. By the Nullstellensatz, it is sufficient to prove that the elements (ch L(ϖi) − xi e0) (i = 1, …, ℓ) generate a proper ideal in C[P]. This is already true in C[P]W by [Bo], Ch. VI, § 3, Théorème 1. The case of C[P] is given by a standard trick: let ♮ : (C[P] → C[P]W) be the projection onto the trivial homogeneous component in C[P] for the action of W; ♮ is a morphism of C[P]W-modules, and thus a relation Σ Qi · (ch L(ϖi) − xi e0) = 1 in C[P] would give a relation Σ Qi♮ · (ch L(ϖi) − xi e0) = 1 in C[P]W, which is impossible.
We now want to prove a formula for the character fν(µ) = evµ(ch L(ν)). We first remark that fν is invariant under the action of the affine Weyl group Wa in hC. If the real part Re(µ) of µ lies in an open alcove of hR, our formula will just be Weyl’s character formula:
\[ f_\nu(\mu) = \frac{\sum_{w \in W} \varepsilon(w) \exp(2\pi i \langle w\mu, \nu + \rho\rangle)}{\sum_{w \in W} \varepsilon(w) \exp(2\pi i \langle w\mu, \rho\rangle)} . \]
Writing the denominator as a product over the positive roots:
\[ \exp(2\pi i \langle \mu, \rho\rangle) \prod_{\alpha \in R,\ \alpha \ge 0} \bigl(1 - \exp(-2\pi i \langle \mu, \alpha\rangle)\bigr), \]
we can see that it is a non-zero complex number.
In the general case, we let T = {α ∈ R | Re(⟨µ, α⟩) ∈ Z}: this is a closed symmetric subset of R ([Bo], Ch. VI, § 1, Définition 4), thus T is a root system in the vector space V1 ⊆ h∗R that it spans ([Bo], Ch. VI, § 1, Proposition 23). The stabilizer of µ in Wa is generated by the reflections across the affine hyperplanes in which Re(µ) lies ([Bo], Ch. V, § 3, Proposition 2), thus W1 := {w ∈ W | µ − wµ ∈ Q(R∨)} is precisely the subgroup generated by reflections along α∨ (α ∈ T), and its restriction to V1 is the Weyl group of T. Let σ be half the sum of the inverse positive roots of T: σ = ½ Σα∈T,α≥0 α∨. In restriction to V1, σ is the sum of the fundamental weights of the root system T∨ of V1∗. Let h be a small real parameter: Re(µ) + hσ then lies in an open alcove of hR and we can compute (with a little abuse):
\[
f_\nu(\mu) = \lim_{h \to 0} f_\nu(\mu + h\sigma)
= \lim_{h \to 0}
\frac{\sum_{w \in W/W_1} \sum_{w_1 \in W_1} \varepsilon(w w_1) \exp(2\pi i \langle w\mu, \nu + \rho\rangle) \exp(2\pi i h \langle w_1\sigma, w^{-1}(\nu + \rho)\rangle)}
{\sum_{w \in W/W_1} \sum_{w_1 \in W_1} \varepsilon(w w_1) \exp(2\pi i \langle w\mu, \rho\rangle) \exp(2\pi i h \langle w_1\sigma, w^{-1}\rho\rangle)} .
\]
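In the sums, we fix w ∈ W/W1 and compute the sums on w1: in the numerator for instance, we have an alternating sum of exp(2πih⟨w1σ, w⁻¹(ν + ρ)⟩), where w⁻¹(ν + ρ) ∈ P(R) has to be projected on V1, as in [Bo], Ch. VI, § 1, Proposition 28. The formula (valid in the group algebra of the weight lattice of T∨):
\[ \sum_{w_1 \in W_1} \varepsilon(w_1)\, e^{w_1\sigma} = e^{\sigma} \prod_{\alpha \in T,\ \alpha \ge 0} \bigl(1 - e^{-\alpha^\vee}\bigr) \]
then gives:
\[
f_\nu(\mu) =
\frac{\sum_{w \in W/W_1} \varepsilon(w) \exp(2\pi i \langle w\mu, \nu + \rho\rangle) \prod_{\alpha \in T,\ \alpha \ge 0} \langle \alpha^\vee, w^{-1}(\nu + \rho)\rangle}
{\sum_{w \in W/W_1} \varepsilon(w) \exp(2\pi i \langle w\mu, \rho\rangle) \prod_{\alpha \in T,\ \alpha \ge 0} \langle \alpha^\vee, w^{-1}\rho\rangle} .
\]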
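As a numerical sanity check on the two character formulas, here is a minimal sketch for g = sl2; it is an illustration and not part of the original argument. Assumptions: µ = t α∨ with t real, ν = nϖ, and ⟨α∨, ϖ⟩ = 1, so that fν(µ) is the sum of exp(2πi t(n − 2k)) over k = 0, …, n. When 2t is not an integer, Weyl’s quotient applies; when 2t is an integer, T = R and W1 = W, and the last formula reduces to exp(2πi t n)(n + 1). All function names are hypothetical.

```python
import cmath


def character_direct(n, t):
    """Sum exp(2*pi*i*t*(n - 2k)) over the weights n, n-2, ..., -n of L(n*varpi)."""
    return sum(cmath.exp(2j * cmath.pi * t * (n - 2 * k)) for k in range(n + 1))


def character_weyl(n, t):
    """Weyl's quotient for sl2 (rho = varpi); only valid when 2t is not an integer."""
    num = cmath.exp(2j * cmath.pi * t * (n + 1)) - cmath.exp(-2j * cmath.pi * t * (n + 1))
    den = cmath.exp(2j * cmath.pi * t) - cmath.exp(-2j * cmath.pi * t)
    return num / den


def character_singular(n, t):
    """Limit formula when 2t is an integer (T = R, W1 = W): one term, w = 1.

    Numerator:   exp(2*pi*i*t*(n+1)) * <alpha^vee, (n+1)*varpi> = exp(...) * (n+1)
    Denominator: exp(2*pi*i*t)       * <alpha^vee, varpi>       = exp(...) * 1
    """
    return cmath.exp(2j * cmath.pi * t * n) * (n + 1)
```

For instance, `character_direct(3, 0.2)` and `character_weyl(3, 0.2)` should agree up to rounding, and at the singular point t = 1/2 the direct sum should agree with `character_singular`.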
As ν + ρ and ρ are regular, neither of the products occurring here can be zero. (We will see soon that the denominator cannot be zero.)
We now prove that the ideal of G ⊗Z C generated by the elements [L(λ + ϖ)] (ϖ ∈ P+) is the whole algebra G ⊗Z C. We consider again [Bo], Ch. VI, § 3, Théorème 1: this time, the isomorphism ϕ : C[X1, …, Xℓ] → C[P]W is given by ϕ(Xi) = ch L(ϖi). Composing with the isomorphism ch : G → Z[P]W, we can see that G ⊗Z C is a polynomial algebra over C. We suppose by way of contradiction that the elements [L(λ + ϖ)] (ϖ ∈ P+) all belong to some maximal ideal of G ⊗Z C. Then, by the Nullstellensatz, there exists a point (x1, …, xℓ) ∈ C^ℓ such that for all ϖ ∈ P+, ϕ⁻¹(ch L(λ + ϖ))(x1, …, xℓ) = 0. We can find µ ∈ hC such that fϖi(µ) = xi (i = 1, …, ℓ): then fλ+ϖ(µ) = 0 for all ϖ ∈ P+. We next use the formula:
\[
f_{\lambda + \varpi}(\mu)\,(\text{denominator}) = \sum_{w \in W/W_1} \varepsilon(w) \exp(2\pi i \langle w\mu, \lambda + \varpi + \rho\rangle) \prod_{\alpha \in T,\ \alpha \ge 0} \langle \alpha^\vee, w^{-1}(\lambda + \varpi + \rho)\rangle,
\]
and write ϖ = Σ ni ϖi, where (ni) ∈ N^ℓ are any integers. The wµ (w ∈ W/W1) are all distinct modulo Q(R∨), and the expressions Πα∈T,α≥0 ⟨α∨, w⁻¹(λ + ϖ + ρ)⟩ are non-zero polynomials in (n1, …, nℓ) (they never vanish indeed). Then the above lemma states that the right-hand side cannot vanish for all (ni) ∈ N^ℓ. This proves first that the denominator is not null, and second that fλ+Σ ni ϖi(µ) cannot vanish for all (ni) ∈ N^ℓ. We have reached a contradiction.
To go down to the case of G ⊗Z Q is then easy: we have shown that we can express in G ⊗Z C the unity as a finite sum 1 = Σ xi [L(τi)][L(νi)], where τi ∈ P+, νi ∈ λ + P+ and xi ∈ C. As the structure constants of G ⊗Z C are integer-valued, this system, viewed as linear equations in (xi), has a solution in C, so has a solution in Q.
2.4. Classification of some ideals of Fℓ(Uq g). In order to achieve our classification of ideals R ⊆ Aq G in the next section, we must study the ideals I ⊆ Fℓ(Uq g) which are stable by the adjoint action of Uq g. The analysis requires the use of the subalgebra V of Uq g generated by Fℓ(Uq g) and by the elements K2λ (λ ∈ P+). Joseph and Letzter [J–L1] have shown that V is the subalgebra generated by the elements Ei, FiKαi and K2λ (λ ∈ P). As it is such a “big” subalgebra of Uq g, its representation theory is similar to that of Uq g. We will describe it in the next subsection, but in the following proof, we need to know that the annihilator of a finite-dimensional V-module is homogeneous with respect to the Q-graduation of V.
Proposition 9. The following two properties for a subspace I ⊆ Fℓ(Uq g) are equivalent:
(1) I is the annihilator in Fℓ(Uq g) of a finite-dimensional V-module;
(2) I is a finite-codimensional two-sided ideal of Fℓ(Uq g) and a Uq g-submodule of Fℓ(Uq g) for the left adjoint action.
Proof. We first show that (1) ⇒ (2).
If M is a finite-dimensional V-module, its annihilator in V is a finite-codimensional two-sided ideal of V, and is homogeneous w.r.t. the Q-graduation of V. It is then easy to see that annV M is a Uq g-submodule of V for
the left adjoint action. The annihilator I = (annV M) ∩ Fℓ(Uq g) of M in Fℓ(Uq g) thus satisfies the property (2).
Conversely, let I ⊆ Fℓ(Uq g) satisfy the property (2). We consider the left regular Fℓ(Uq g)-module M = Fℓ(Uq g)/I. I is its annihilator, so it is sufficient to show that M extends to a V-module. We thus want to show that the elements K−2λ ∈ Fℓ(Uq g) (λ ∈ P+) map to invertible operators in End(M).
1) M is a finite-dimensional algebra, and is also a left Uq g-module (for the adjoint action). The multiplication in M defines a morphism of left Uq g-modules: M ⊗ M → M. Thus the Q-graduation of M (defined by the structure of Uq g-module) is an algebra grading.
2) We fix λ ∈ P+. We can write M = M0 ⊕ M∞ (as C(q)-vector space), where K−2λ acts nilpotently on M0 and invertibly on M∞ (Fitting’s decomposition). M0 and M∞ are stable by the commutant of K−2λ in End(M), so are right ideals of M. If x ∈ Fℓ(Uq g) is homogeneous w.r.t. the Q-graduation of Fℓ(Uq g), x commutes (up to a non-zero scalar) with K−2λ, so M0 and M∞ are stable by left multiplication by x. Thus M0 and M∞ are also left ideals of M.
3) We now show that M0 and M∞ are Uq g-submodules of M.
(a) Let {e1, …, ek} be the set of central idempotents in M. The elements Kµ (µ ∈ P) of Uq g act on M (by the adjoint action) as algebra automorphisms, so permute the elements of the set {e1, …, ek}. Hence for each µ, there exists an integer n ≥ 1 such that Knµ fixes each ei. Since M is, as a Uq g-module, a direct sum of modules L(ν) (without any twisting character χ), and since q is generic, we conclude that e1, …, ek are fixed by the adjoint action of the elements Kµ.
(b) Let e be a central idempotent in M. e is of weight zero. We consider the q-exponential
\[ \exp_q(\operatorname{ad} E_i) = \sum_{n \ge 0} q^{-d_i n(n-1)/2}\, \frac{(\operatorname{ad} E_i)^n}{[n]_i!} \]
(i ∈ {1, …, ℓ} fixed). Then expq(ad Ei) is a well defined operator in M.
The formula
\[ \Delta(E_i^n) = \sum_{k=0}^{n} q^{d_i (n-k)k} \begin{bmatrix} n \\ k \end{bmatrix}_i E_i^{n-k} K_{\alpha_i}^{k} \otimes E_i^{k} \]
enables us to see that expq(ad Ei)(e) is an idempotent which we write e + x. Then 2ex + x² = x, x(1 − 2e) = x², x = x(1 − 2e)² = x³. The weights of the Q-homogeneous components of x belong to {nαi | n ≥ 1}; so the weights of the Q-homogeneous components of x³ belong to {nαi | n ≥ 3}, and the homogeneous component of x of weight αi is null. We obtain that (ad Ei)(e) = 0. Similarly, (ad Fi)(e) = 0 for all i ∈ {1, …, ℓ}.
(c) M0 and M∞ are ideals in M generated by central idempotents e0 and e∞ respectively. (a) and (b) show that e0 and e∞ define the trivial Uq g-module. Hence for x ∈ M0 and u ∈ Uq g, u · x = u · (xe0) = (u(1) · x)(u(2) · e0) = (u(1) · x)ε(u(2))e0 = (u · x)e0 ∈ M0. The same holds for M∞.
4) We first consider the case g = sl2. We choose naturally λ = ϖ the fundamental weight, and write M0 = L0/I and M∞ = L∞/I. The points 2) and 3) show that L0 and L∞ are two-sided ideals and left Uq g-submodules of Fℓ(Uq g). By definition of the Fitting decomposition, there exists an integer n ≥ 0 such that K−2nϖ ∈ L∞. Hence for all integers m ≥ n, we have K−2mϖ ∈ L∞, and thus zmϖ ∈ L∞. Let n0 ≥ 0 be the smallest integer such that for all m ≥ n0, zmϖ ∈ L∞. Proposition 7 and the Clebsch–Gordan theorem show that if n ≥ 1, z(n+1)ϖ + z(n−1)ϖ = zϖ znϖ. Thus n0 has to be equal to zero. So 1 = z0 ∈ L∞, M∞ = M, and K−2ϖ acts invertibly on M.
5) The general case is solved in the same way. We consider the decomposition of the point 2) and write M0 = L0/I and M∞ = L∞/I. L0 and L∞ are two-sided ideals and left Uq g-submodules of Fℓ(Uq g), and there exists an integer n ≥ 0 such that K−2nλ ∈ L∞. If ϖ ∈ P+, then K−2(nλ+ϖ) ∈ L∞, and thus znλ+ϖ ∈ L∞. Let ϕ be the Q-algebra morphism G ⊗Z Q → Z(Uq g), [M] ↦ I(Tr M (K2ρ)) considered at the end of Sect. 2.2. Then ϕ−1(L∞) is an ideal of G ⊗Z Q, which contains all the elements [L(−w0 nλ + ϖ)] (ϖ ∈ P+). Thus ϕ−1(L∞) = G ⊗Z Q by Proposition 8, and so 1 = ϕ([L(0)]) ∈ L∞, M∞ = M, and K−2λ acts invertibly on M.

Remark. This result is a particular case of Proposition 8.4.13 in [Jo]. Accordingly, its proof is shorter than that of Joseph’s theorem, and requires neither the knowledge of the inclusions between Verma modules nor the use of Gel'fand–Kirillov dimensions.

2.5. Classification of some right ideals of Aq G. The notations Aq G, Uq g, V have the same meaning as in Sects. 2.1 and 2.4. The isomorphism I : (Aq G → Fℓ(Uq g)) was introduced in Sect. 1.3. We now specify the structure of the finite-dimensional V-modules: they are completely reducible; each Uq g-module Lχ(λ) (with λ ∈ P+, χ : P/2Q → C×) is (by restriction) a simple V-module; the V-modules Lχ(λ) and Lϕ(µ) are isomorphic iff λ = µ and the characters χ, ϕ restrict to the same character 2P/2Q → C×. The simple finite-dimensional V-modules will be denoted by Lχ(λ) with λ ∈ P+ and χ : 2P/2Q → C× a character. We finally remark (see [J–L1]) that a simple finite-dimensional V-module is still simple as a Fℓ(Uq g)-module. Consequently, if (Mi) is a finite family of non-isomorphic finite-dimensional simple V-modules, the natural ring homomorphism Fℓ(Uq g) → ⊕i End Mi is surjective.

Theorem 1. 1) Let R be a finite codimensional right ideal of Aq G, which is a subcomodule of Aq G w.r.t. the right coaction δR : (Aq G → Aq G ⊗ Aq G, a ↦ a(2) ⊗ S(a(1))a(3)).
Then there exists a finite-dimensional V-module M such that R = I−1(annFℓ(Uq g) M).
2) If M is a finite-dimensional V-module, then I−1(annFℓ(Uq g) M) is a finite codimensional right ideal of Aq G, stable by the right coaction δR.
3) If M and N are finite-dimensional V-modules, then I−1(annFℓ(Uq g) M) = I−1(annFℓ(Uq g) N) iff M and N have the same irreducible components.
4) I−1(annFℓ(Uq g) M) is included in the augmentation ideal of Aq G iff M contains the trivial V-module.

Proof. 1) and 2) are consequences of Propositions 4 and 9. Let M and N be two finite-dimensional V-modules having the same annihilator in Fℓ(Uq g). Then annFℓ(Uq g) M = annFℓ(Uq g) (M ⊕ N). Let M1, . . . , Mk (respectively M1, . . . , Mn) be the distinct irreducible components of M (respectively M ⊕ N). Then we have:
\[ F_\ell(U_q\mathfrak{g})/\mathrm{ann}_{F_\ell(U_q\mathfrak{g})}(M) \simeq \bigoplus_{i=1}^{k} \mathrm{End}\, M_i \]
and:
\[ F_\ell(U_q\mathfrak{g})/\mathrm{ann}_{F_\ell(U_q\mathfrak{g})}(M \oplus N) \simeq \bigoplus_{i=1}^{n} \mathrm{End}\, M_i, \]
and so k = n: all the irreducible components of N appear in M. 3) follows. 4) can be proved in a similar way, using the fact that the augmentation ideal of Aq G is the inverse image by I of the annihilator of the trivial V-module.
3. Differential Calculi on Quantum Groups

3.1. Woronowicz’s definition. Let A be a Hopf algebra, Γ be a bicovariant bimodule and d : A → Γ be a linear map. We say that (Γ, d) is a bicovariant differential calculus on A if d is a derivation, a morphism of two-sided comodules and if the image of d generates the left A-module Γ. The dimension of the space ΓL of left coinvariants will be supposed to be finite. When (Γ, d) is a differential calculus over A, we denote by dL the map (A → ΓL, a ↦ S(a(1)) · d(a(2))). The subspace R = ker dL ∩ ker ε is a finite-codimensional right ideal of A, and a subcomodule for the right coadjoint coaction δR : (a ↦ a(2) ⊗ S(a(1))a(3)). As shown by Woronowicz, the subspace R determines (up to isomorphism) the bicovariant differential calculus (Γ, d): we call it the ideal associated to (Γ, d). Geometrically, A must be viewed as the algebra of functions over a group G, Γ is the space of 1-forms on G, ΓL is the space of left-G-invariant 1-forms on G, identified with the cotangent space at the unit element of G, and dL maps a function on G to its differential at the unit element.

3.2. A construction of bicovariant differential calculi. Let A be a c.q.t. Hopf algebra over the field k, and let γ, δ be the associated maps. We take a finite-dimensional right A-comodule M. We denote by (mi) a basis of M, by (m∗i) the dual basis, and by Rij the elements of A such that δR(mi) = Σj mj ⊗ Rji. Then ΔRji = Σk Rjk ⊗ Rki and ε(Rji) = δji (Kronecker’s symbol). Also, M is a left A∗-module, and the Rji (viewed as linear forms on A∗) are the matrix coefficients θM(mi, m∗j) of this module. Since (A, γ) is c.q.t., M becomes a right crossed bimodule over A for the action mi · a = Σj ⟨γ(a), Rji⟩ mj (Proposition 1). M∗ is a right comodule over A too, for the coaction δR(m∗i) = Σj m∗j ⊗ S(Rij). Using the fact that (A, δ) is a c.q.t. Hopf algebra, we may endow M∗ with the structure of a right crossed bimodule over A for the action m∗i · a = Σj ⟨δ(a), S(Rij)⟩ m∗j. Then, by making the tensor product, we obtain that End(M) ≃ M ⊗ M∗ is a right crossed bimodule. We denote by Γ the bicovariant bimodule associated to this right crossed bimodule End(M). As a vector space, Γ is just the tensor product A ⊗ M ⊗ M∗. On the basic elements, the structure maps are:
\begin{align*}
b \cdot (a \otimes m_i \otimes m_j^*) &= ba \otimes m_i \otimes m_j^*, \\
(a \otimes m_i \otimes m_j^*) \cdot b &= \sum_{k,\ell} a b_{(1)} \otimes \langle \gamma(b_{(2)}), R_{ki} \rangle\, m_k \otimes \langle \delta(b_{(3)}), S(R_{j\ell}) \rangle\, m_\ell^*, \\
\delta_L(a \otimes m_i \otimes m_j^*) &= a_{(1)} \otimes a_{(2)} \otimes m_i \otimes m_j^*, \\
\delta_R(a \otimes m_i \otimes m_j^*) &= \sum_{k,\ell} a_{(1)} \otimes m_k \otimes m_\ell^* \otimes a_{(2)} R_{ki} S(R_{j\ell}).
\end{align*}
It follows that the canonical element X = Σi 1 ⊗ mi ⊗ m∗i of Γ is left and right coinvariant. The linear map d : (A → Γ, a ↦ X · a − a · X) is then a derivation and a morphism of two-sided comodules.

Theorem 2. 1) If (A, γ) is a factorizable c.q.t. Hopf algebra and if M is a simple finite-dimensional non-trivial A-comodule, then the above construction gives a bicovariant differential calculus d : (A → Γ ≡ A ⊗ End(M)).
2) Its associated ideal is R = I−1(annA∗ (k ⊕ M)), where k is the trivial A∗-module.
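Proof. We first compute for a ∈ A:
\begin{align*}
d(a) &= \sum_{k,\ell} \bigl( a_{(1)} \langle I(a_{(2)}), R_{k\ell} \rangle \otimes m_k \otimes m_\ell^* - a_{(1)} \langle a_{(2)}, \delta_{k\ell} \rangle \otimes m_k \otimes m_\ell^* \bigr) \\
&= \sum_{k,\ell} a_{(1)} \langle I(a_{(2)}), R_{k\ell} - \delta_{k\ell} \rangle \otimes m_k \otimes m_\ell^*,
\end{align*}
and so:
\begin{align*}
d_L(a) &= \sum_{k,\ell} \langle I(a - \varepsilon(a)), R_{k\ell} \rangle\, m_k \otimes m_\ell^* \\
&= \sum_{k,\ell} \langle J(R_{k\ell} - \delta_{k\ell}), a \rangle\, m_k \otimes m_\ell^*.
\end{align*}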
The Rji are the matrix coefficients θM(mi, m∗j) of the A∗-module M, which is irreducible and non-trivial. Thus, by the Jacobson density theorem, the (dim M)² + 1 elements {1, Rji} are linearly independent in A. The (dim M)² linear forms {J(Rkℓ − δkℓ)} are then linearly independent in A∗, and the formula for dL(a) shows that dL maps A onto ΓL = End(M). 1) is proved. The same formula shows that R is the set of elements a in the augmentation ideal of A such that I(a) is orthogonal to all the matrix coefficients Rkℓ of the A∗-module M. Thus R = ker ε ∩ I−1(annA∗ M) = I−1(annA∗ (k ⊕ M)). We have shown 2).

If we consider now a finite family (Mi) of non-trivial non-isomorphic finite-dimensional simple right A-comodules, we can take the direct sum of such constructions. If (A, γ) is factorizable, then the map d : (A → A ⊗ (⊕i End Mi)) is a bicovariant differential calculus. The associated ideal is I−1(annA∗ (k ⊕ (⊕i Mi))).

3.3. The link with the classification theorem. We are now gathering the pieces of our patchwork. According to the statements in Sect. 3.1, Theorem 1 yields a complete classification of bicovariant differential calculi on Aq G. Morally, they are all given by the construction described in Sect. 3.2.

Proposition 10. Let Uq g and Aq G be the objects defined in Sect. 2.1. If the root and the weight lattices for g are equal, all the bicovariant differential calculi on Aq G can be constructed by the method described in Sect. 3.2.

Proof. The results in Sect. 2.5 tell us that an ideal R associated to a bicovariant differential calculus on Aq G is a subspace I−1(annFℓ(Uq g) M), where M is a V-module containing the trivial V-module. Let M1, . . . , Mn be the distinct non-trivial irreducible components of M. The assumption on g gives us that the Mi are modules L(λi) (without any twisting character), and so can be considered as non-trivial non-isomorphic simple right Aq G-comodules. The construction of Sect. 3.2 for this family of comodules leads to a bicovariant differential calculus whose associated ideal is the inverse image by I of the annihilator of the (Aq G)∗-module C(q) ⊕ (⊕i Mi). It is R, and the proposition is proved.

In the remainder of this section, we will discuss what happens when the root and the weight lattices differ. Up to the end of this article, we consider this case. There exist non-trivial characters χ : 2P/2Q → C×, and for any weight λ, we can look at the ideal R = I−1(annFℓ(Uq g) (C(q) ⊕ Lχ(λ))), and at the associated bicovariant differential calculus. It cannot be constructed by the method of Theorem 2, since Lχ(λ) is not a right Aq G-comodule. However, one may notice that the main trick in the construction of Sect. 3.2 consisted in using two different R-matrices, namely R12 and R21^{−1}. R12 was used to endow the Aq G-comodule L(λ) with the structure of a right crossed bimodule over Aq G, and R21^{−1} turned the Aq G-comodule L(λ)∗ into a right crossed bimodule over Aq G. The tensor product of these right crossed bimodules then gave the bicovariant differential calculus associated to I−1(annFℓ(Uq g) (C(q) ⊕ L(λ))). When one uses the small freedom allowed in the choice of the R-matrix of Uq g (see [Ga]), one can make similar constructions for the bicovariant differential calculi associated with some of the ideals I−1(annFℓ(Uq g) (C(q) ⊕ Lχ(λ))). We will not write all the details, but point out that this is the way followed by Schmüdgen and Schüler for the construction described in [S–S1], Theorem 2.2.

As an example, we now describe explicitly the bicovariant differential calculus associated with the ideal I−1(annFℓ(Uq g) (C(q) ⊕ Lχ(0))). Let (P/Q)^∧ be the group of characters ζ : P/Q → C×. If ζ is such a character, it extends to a one-dimensional representation ζ̄ of Aq G by letting ζ̄(θL(λ)(m, m∗)) = ζ(λ mod Q)⟨m∗, m⟩, and this gives an inclusion of the group (P/Q)^∧ into the center of (Aq G)∗res. Since (ζ̄ ⊗ id) ◦ δR : Aq G → C(q) ⊗ Aq G is given by (x ↦ ζ̄(x) ⊗ 1), we can see that the kernel of ζ̄ is a one-codimensional two-sided ideal of Aq G, stable by the right coaction δR. If ζ is not trivial, the ideal R = ker ε ∩ ker ζ̄ defines a bicovariant differential calculus on Aq G. Putting χ : 2P/2Q → C×, 2λ mod 2Q ↦ ζ(λ mod Q), we can check that R = I−1(annFℓ(Uq g) (C(q) ⊕ Lχ(0))). This construction gives all the one-dimensional differential calculi on Aq G (generalizing the result of [S–S1], Remark 4 after Theorem 2.2).

Finally, let X be an intermediate lattice between P and Q. The matrix coefficients of the irreducible representations of Uq g whose highest weights belong to X span a subalgebra Aq GX ⊆ Aq G. These algebras Aq GX are factorizable c.q.t. Hopf algebras.
For instance, Aq GQ is the algebra of functions on the quantum adjoint group, and Aq G ≡ Aq GP is the algebra of functions on the quantum simply-connected group. Our arguments in Sect. 2.5 show that the indecomposable bicovariant differential calculi on Aq GX are classified by ideals R = Aq GX ∩ I−1(annFℓ(Uq g) (C(q) ⊕ Lχ(λ))), where χ : 2X/2Q → C× is a character (extended arbitrarily to a character of the group 2P/2Q). Thus the “twisted” bicovariant differential calculi are non-local, their appearance depending on the choice of X. The bicovariant differential calculi seem localized at the central elements of GX, that is to say, at the fixed points of GX under the adjoint action.
Acknowledgement. The two authors thank the French Ministry of Higher Education and Research for its financial support (research grants). They also warmly and sincerely thank Professor M. Rosso for explaining the subject, giving bibliographic references, and supplying them with the leading idea.
Note added in proof. P. Polo kindly communicated to us the following simple proof of Proposition 8. By the formal character isomorphism, G ≃ Z[P]. Let Z[P]^W ⊆ Z[P] be the subring of W-invariant elements. Z[P] is a module of finite type over the noetherian ring Z[P]^W, hence we can choose a finite generating set (e^{νi})1≤i≤n from the family (e^ν)ν∈P. Take a weight µ such that all µ + νi are dominant. Let λ ∈ P+. Then there exist some ai ∈ Z[P]^W such that e^{−λ−µ} = Σi ai e^{νi}, hence 1 = Σi ai e^{λ+µ+νi}. Multiplying this by e^ρ and taking the alternating sum over the Weyl group, one obtains that:
\[ \mathrm{ch}\, L(0) = \sum_i a_i\, \mathrm{ch}\, L(\lambda + \mu + \nu_i). \]
This concludes the proof. Thanks are also due to A. Joseph for some useful comments about this work.
References
[Bo] Bourbaki, N.: Groupes et algèbres de Lie, Chapitres 4, 5 et 6. Paris: Masson, 1981
[Ca] Caldero, P.: Éléments ad-finis de certains groupes quantiques. C. R. Acad. Sci. Paris 316, 327–329 (1993)
[Dr1] Drinfel'd, V.G.: Quantum groups. In: Proceedings of the International Congress of Mathematicians Berkeley 1986. Providence, RI: American Mathematical Society, 1987, pp. 798–820
[Dr2] Drinfel'd, V.G.: On almost cocommutative Hopf algebras. Leningrad Math. J. 1, 321–342 (1990)
[F–P] Faddeev, L.D., Pyatov, P.N.: The differential calculus on quantum linear groups. In: Dobrushin, R.L., Minlos, R.A., Shubin, M.A., Vershik, A.M. (eds.): Contemporary Mathematical Physics (Berezin memorial volume). Amer. Math. Soc. Transl. series 2, vol. 175. Providence, RI: American Mathematical Society, 1996, pp. 35–47
[F–R–T] Faddeev, L.D., Reshetikhin, N.Yu., Takhtadzhyan, L.A.: Quantization of Lie groups and Lie algebras. Leningrad Math. J. 1, 193–225 (1990)
[Ga] Gaitsgory, D.: Existence and uniqueness of the R-matrix in quantum groups. J. Algebra 176, 653–666 (1995)
[Jo] Joseph, A.: Quantum groups and their primitive ideals. Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge, Band 29. Berlin–Heidelberg: Springer-Verlag, 1995
[J–L1] Joseph, A., Letzter, G.: Local finiteness of the adjoint action for quantized enveloping algebras. J. Algebra 153, 289–318 (1992)
[J–L2] Joseph, A., Letzter, G.: Separation of variables for quantized enveloping algebras. Amer. J. Math 116, 127–177 (1994)
[Ju] Jurčo, B.: Differential calculus on quantized simple Lie groups. Lett. Math. Phys. 22, 177–186 (1991)
[L–T] Larson, R.G., Towber, J.: Two dual classes of bialgebras related to the concepts of “quantum group” and “quantum Lie algebra”. Comm. Algebra 19, 3295–3345 (1991)
[Ra] Radford, D.E.: Minimal quasi-triangular Hopf algebras. J. Algebra 157, 285–315 (1993)
[R–S] Reshetikhin, N.Yu., Semenov-Tian-Shansky, M.A.: Quantum R-matrices and factorization problems. J. Geom. Phys. 5, 533–550 (1988)
[Ro1] Rosso, M.: Analogues de la forme de Killing et du théorème d’Harish-Chandra pour les groupes quantiques. Ann. Sci. École Norm. Sup. 23, 445–467 (1990)
[Ro2] Rosso, M.: Certaines formes bilinéaires sur les groupes quantiques et une conjecture de Schechtman et Varchenko. C. R. Acad. Sci. Paris 314, 5–8 (1992)
[Ro3] Rosso, M.: Algèbres enveloppantes quantifiées, groupes quantiques compacts de matrices et calcul différentiel non commutatif. Duke Math. J. 61, 11–40 (1990)
[S–S1] Schmüdgen, K., Schüler, A.: Classification of bicovariant differential calculi on quantum groups of type A, B, C and D. Commun. Math. Phys. 167, 635–670 (1995)
[S–S2] Schmüdgen, K., Schüler, A.: Classification of bicovariant differential calculi on quantum groups. Commun. Math. Phys. 170, 315–335 (1995)
[Ta] Tanisaki, T.: Killing forms, Harish-Chandra isomorphisms, and universal R-matrices for quantum algebras. In: Infinite analysis, Part B, Proceedings Kyoto 1991. Singapore: World Scientific Publishing, 1992, pp. 941–961
[Wo] Woronowicz, S.L.: Differential calculus on compact matrix pseudogroups (quantum groups). Commun. Math. Phys. 122, 125–170 (1989)
[Ye] Yetter, D.N.: Quantum groups and representations of monoidal categories. Math. Proc. Camb. Phil. Soc. 108, 261–290 (1990)

Communicated by A. Connes
Commun. Math. Phys. 194, 87 – 108 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On Orientation and Dynamics in Operator Algebras. Part I
Erik M. Alfsen¹, Frederic W. Shultz²
¹ Math. Dept. University of Oslo, P.O. Box 1053 Blindern, N-0316 Oslo, Norway. E-mail: [email protected]
² Math. Dept. Wellesley College, Wellesley MA-02181, USA. E-mail: [email protected]
Received: 16 June 1997 / Accepted: 29 September 1997
Abstract: This paper characterizes the self-adjoint part of C*-algebras and von Neumann algebras among normed Jordan algebras. It also explains how the associative product is determined by a general notion of orientation which is related to dynamics and reflects the dual role of physical variables as observables and as generators of one-parameter groups of motions of the state space (Schrödinger picture). This concept of orientation bridges the approach in Connes’ characterization of the natural cone of a von Neumann algebra and our own characterization of the state space of a C*-algebra.

1. Introduction

An old problem in operator algebras, motivated by physics, is to determine which (partially) ordered normed linear spaces can be the self-adjoint part of a C*-algebra or a von Neumann algebra. This is implicit in several papers of Segal [28, 29] and Kadison [20, 21, 24], and it was explicitly raised for von Neumann algebras by Sakai [27] and for C*-algebras by Sherman [30].

The self-adjoint elements of such algebras are used to represent bounded observables in algebraic models of quantum mechanics. However, the self-adjoint part A of a C*-algebra is not closed under the given associative product, but only under the symmetrized product (“Jordan product”)
\[ a \circ b = \tfrac{1}{2}(ab + ba) = \tfrac{1}{2}\bigl((a + b)^2 - a^2 - b^2\bigr). \tag{1} \]
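As a small numerical illustration of Eq. (1), the following sketch works with 2 × 2 complex matrices in plain Python. The helper names (`matmul`, `jordan`, `jordan_from_squares`, and so on) and the two sample matrices are assumptions made for this example only; they are not taken from the paper. The sketch checks that the ordinary product of two self-adjoint matrices need not be self-adjoint, that the symmetrized product is, and that the two expressions in Eq. (1) agree.

```python
# Pure-Python 2x2 complex matrices; all helper names are illustrative.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def adjoint(a):
    # Conjugate transpose.
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def jordan(a, b):
    """Symmetrized product (ab + ba)/2 from Eq. (1)."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))

def jordan_from_squares(a, b):
    """The same product written with squares only: ((a+b)^2 - a^2 - b^2)/2."""
    s = add(a, b)
    diff = add(matmul(s, s), scale(-1, add(matmul(a, a), matmul(b, b))))
    return scale(0.5, diff)

# Two non-commuting self-adjoint matrices, chosen only for illustration.
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0, 1j], [-1j, 2]]

ab = matmul(a, b)
print(close(ab, adjoint(ab)))                          # False: ab is not self-adjoint
print(close(jordan(a, b), adjoint(jordan(a, b))))      # True: the Jordan product is
print(close(jordan(a, b), jordan_from_squares(a, b)))  # True: both forms of Eq. (1) agree
```

The second form is the one that matters for the characterization problem: it shows that the Jordan product can be recovered from the squaring operation alone.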
This product makes A into a (real) Jordan algebra, and it has been proposed to model quantum mechanics on Jordan algebras rather than associative algebras. This approach is corroborated by the fact that many physically relevant properties of observables are adequately described by Jordan constructs. Knowing an element of A, we can express not only the expectation value of the corresponding observable, but its entire probability
law which is given by spectral functional calculus, and in turn by the squaring operation a ↦ a².

The Jordan algebra approach to quantum mechanics was initiated by Jordan, von Neumann and Wigner in [17] where they introduced and studied finite dimensional “formally real” Jordan algebras. The restriction to finite dimensions was removed by von Neumann [18]. Jordan operator algebras (linear spaces of self-adjoint operators on a Hilbert space closed under the Jordan product) were first studied by Topping [35] and Størmer [33]. Størmer [32] gave a characterization of those Jordan algebras of self-adjoint operators that are actually self-adjoint parts of C*-algebras. The general definitions of JB-algebras and JBW-algebras (together with a Gelfand-Naimark type representation theorem) were given by Alfsen, Shultz and Størmer [4] and by Shultz [31] respectively. These algebras are defined axiomatically as (real) Jordan algebras which are also Banach spaces, subject to suitable conditions connecting Jordan product and norm. The self-adjoint part of a C*-algebra or a von Neumann algebra is a special case of a JB-algebra or a JBW-algebra respectively. Not all JB-algebras or JBW-algebras arise in this fashion (cf. [13, Ch. 3–4]), but nevertheless they have enough structure to effectively model quantum mechanical observables.

However, it is an important feature of quantum mechanics that the physical variables play a dual role, as observables and as generators of transformation groups. The observables are random variables with a specified probability law in each state of the quantum system, while the generators determine one-parameter groups of transformations of observables (Heisenberg picture) or states (Schrödinger picture). Both aspects can be adequately dealt with in the C*-algebra or von Neumann algebra formulation of quantum mechanics.
An element a in the self-adjoint part A of such an algebra represents an observable whose probability law is determined by spectral theory as indicated above, while an element h in A determines the one-parameter group α_t : b ↦ e^{iht} b e^{−iht} (equivalently dα_t(b)/dt |_{t=0} = i[h, b]), which represents the time evolution of the observable b. The spectral functional calculus is a Jordan construct, but the generation of one-parameter groups cannot be expressed in terms of the symmetrized product. Instead it is determined by the anti-symmetrized product in A, which we will write as follows:
\[ a \star b = \tfrac{i}{2}[a, b] = \tfrac{i}{2}(ab - ba). \tag{2} \]
Thus the decomposition of the associative product into its Jordan part and its Lie part
\[ ab = a \circ b - i(a \star b) \tag{3} \]
separates the two aspects of a physical variable. To solve the characterization problem, we must find appropriate conditions for an ordered normed linear space A, under which it is possible to define an associative product on A + iA making this space a C*-algebra or a von Neumann algebra. By the discussion above, this problem can be divided in two parts: first to construct the Jordan part of the associative product, then the Lie part when the Jordan part is known. By a theorem of Kadison [21] the ordering and the norm of a C*-algebra determine the Jordan part of the product. However, they do not determine the product itself, since the opposite algebra has the same ordering and norm but differs in the sign of the Lie part of the product. Thus, to go from the Jordan structure to the C*-structure, we must make a choice for the Lie part of the product.
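The decomposition in Eq. (3), and the remark that the opposite algebra has the same Jordan part but the opposite Lie part, can be checked on a small example. The sketch below uses plain-Python 2 × 2 complex matrices; the helper names and the sample matrices are assumptions made for illustration, not notation from the paper.

```python
# Pure-Python 2x2 complex matrices; all helper names are illustrative.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def jordan(a, b):
    """Jordan part (ab + ba)/2, Eq. (1)."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))

def lie(a, b):
    """Lie part (i/2)(ab - ba), Eq. (2)."""
    return scale(0.5j, add(matmul(a, b), scale(-1, matmul(b, a))))

# Two non-commuting self-adjoint matrices, chosen only for illustration.
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0, 1j], [-1j, 2]]

j_part, l_part = jordan(a, b), lie(a, b)

print(close(l_part, adjoint(l_part)))                         # True: the Lie part is self-adjoint
print(close(matmul(a, b), add(j_part, scale(-1j, l_part))))   # True: Eq. (3)
# The opposite product (a, b) -> ba keeps the Jordan part and flips the Lie part.
print(close(matmul(b, a), add(j_part, scale(1j, l_part))))    # True
```

The last check is the finite-dimensional shadow of the sign ambiguity discussed above: order and norm cannot distinguish the two products, so an extra choice is needed.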
It was Connes who first realized that a concept of orientation was relevant in this context [8]. In this paper he studies ordered linear spaces associated with sigma-finite von Neumann algebras. (These are the von Neumann algebras which have a faithful normal state (cf. e.g. [34, Prop. II.3.19]), or equivalently the ones which admit a faithful representation with a separating and cyclic vector ξ.) Connes concludes with a characterization of such spaces (Theorem 5.8). Here he shows that a “complex ordered linear space with order unit” (E, E⁺) is isomorphic to the pair (M, M⁺) for a sigma-finite von Neumann algebra M iff there exists a self-adjoint form s on E such that the completion of the cone E⁺ with respect to s has the following properties: (i) it is “self-polar”, (ii) it is “facially homogeneous”, (iii) it is “orientable”. (All three properties are defined in Connes’ paper.) In the development leading up to this result, it is shown that the completion of M⁺ with respect to a self-polar form s is independent of s (Theorem 2.1), and that it can be identified with the natural cone P_ξ^♮ of Tomita-Takesaki theory, which can be abstractly characterized by the three properties (i), (ii), (iii) above (Theorem 5.2).

In [5] Bellissard and Iochum showed that the properties (i) and (ii) above characterize the natural cone associated in an analogous fashion with a (sigma-finite) JBW-algebra. (See also [14].) Thus in the context of such cones, Connes’ notion of orientation is exactly what is needed to move from (sigma-finite) JBW-algebras to (sigma-finite) von Neumann algebras.

It follows from results of Kadison [20] that the self-adjoint part of a C*-algebra is isometrically isomorphic, as an ordered normed linear space, to the space A(K) of all w*-continuous affine functions on the state space K. Similarly, the self-adjoint part of a von Neumann algebra is isometrically isomorphic to the space of all bounded affine functions on the normal state space.
In view of this, characterizing the self-adjoint part of a C*-algebra (von Neumann algebra) is equivalent to characterizing the state space of a C*-algebra (the normal state space of a von Neumann algebra). This was accomplished for C*-algebras by Alfsen, Hanche-Olsen and Shultz in [3] and for von Neumann algebras by Iochum and Shultz in [15]. Here too the program proceeds by way of Jordan algebras (JB-algebras and JBW-algebras respectively), but it does not involve Tomita-Takesaki theory. The result in [3] is based on two earlier papers of Alfsen and Shultz [1] and [2]. In [1] they gave conditions on the facial structure of a compact convex set guaranteeing that A(K) admits a satisfactory spectral theory and functional calculus. This gives a candidate for a Jordan product in A(K) defined in terms of squares as in Eq. (1). Then in [2, Th. 7.2] they gave necessary and sufficient conditions that this product be bilinear, in which case it makes A(K) a JB-algebra. The key condition is the “Hilbert ball axiom” which says that each face of K which is generated by two extreme points, must be affinely isomorphic to the unit ball of a Hilbert space. (These Hilbert spaces can be of arbitrary finite or infinite dimension for a general JB-algebra. But for a C*-algebra they are of dimension 3 or 1, so for the characterization of C*-algebras the relevant condition is a “3-ball axiom” rather than a general “Hilbert ball axiom”.) As in Connes’ paper, the final step is to move from Jordan structure to associative structure (here from JB-algebras to C*-algebras). Again the key role is played by a concept of “orientability”. In [3] this concept is geometric; one requires that all the “facial 3-balls” (alluded to above) can be oriented in a continuous fashion with respect to the w*-topology. This can always be done locally, so that the requirement is that these local choices can be pieced together in a continuous way to give a global orientation. 
(See [3, §7] for the precise definition of orientation and [3, Th. 8.4] for the main result characterizing state spaces of C*-algebras.) Note also that if K is the state space of a
C*-algebra, then there is a 1-1 correspondence between all “global orientations” on K and all those associative products on A + iA which organize this space to a C*-algebra inducing the same Jordan product on A as the original algebra [3, Cor. 8.5].

In [15], Iochum and Shultz first characterize the normal state space of a JBW-algebra by conditions closely related to Connes’ facial homogeneity axiom. Then they characterize the self-adjoint part of a von Neumann algebra among all JBW-algebras. This is more difficult than the similar problem for C*-algebras in one respect, and easier in another. Since the (generally non-compact) normal state space of a von Neumann algebra may be devoid of extreme points, one must use a modified and more complicated version of the 3-ball axiom in this case. On the other hand, no orientability condition is needed in [15] to solve the characterization problem for the normal state space of a von Neumann algebra. (Nevertheless, here too one can define a notion of “global orientation” in 1-1 correspondence with associative products in the same way as for C*-algebras, but now orientability is automatic, as we will see in Part II.)

The concept of orientation introduced by Connes in [8] and that introduced by ourselves in [3] are completely different in character and relate to different contexts, but the purpose is quite similar: to pass from Jordan structure to associative structure. More specifically, Connes’ “orientability axiom” provides a passage from JBW-algebras to von Neumann algebras, while our “orientability axiom” (together with the “3-ball axiom”) provides a passage from JB-algebras to C*-algebras. One of the main purposes of the current paper is to relate these two notions. In the present Part I we will give a solution of the characterization problem which relates to dynamics and applies equally well in the C* and the von Neumann case.
In the forthcoming Part II we will develop a general theory of orientation for C*-algebras and von Neumann algebras, which is of geometric nature and bridges the two original approaches to orientation for such algebras.

We will now give a brief survey of the content of the paper. We begin by a short summary of definitions and results from the theory of JB-algebras and JBW-algebras (with reference to [13] for proofs). All results are direct generalizations of known results for C*-algebras and von Neumann algebras, so readers mainly interested in the theory of orientation for C*-algebras and von Neumann algebras can avoid the intricacies of Jordan algebras.

Then we will transfer Connes’ notion of orientation from the natural cone of a JBW-algebra A to the algebra itself, and we will call the resulting notion a “Connes orientation on A”. Like the original concept, such an orientation is a complex structure on the Lie algebra of “order derivations” modulo its center (Definition 16).

Our next step will be to introduce yet another notion which is closely related to that of a Connes orientation. This notion, which makes sense both in the JB and the JBW context (and in particular in the C* and the von Neumann context) will be called a “dynamical correspondence”. It is defined to be a (suitably axiomatized) correspondence which assigns a “skew order derivation” ψa to each element a of the given algebra A (Definition 17). The skew order derivations are generators of one-parameter groups of unital order automorphisms of A, and by duality also of one-parameter groups of motions of the state space K of A. Thus a dynamical correspondence gives the elements of A a double identity, which reflects the dual role of physical variables as observables and as generators of a one-parameter group of motions of the state space. Hence the name “dynamical correspondence”. (For related notions, see [6, 7, 9, 25].)
To motivate the definition of a dynamical correspondence, we explain the geometrical meaning of this notion in the case of the 2 × 2 matrix algebra (which models a 2-level quantum system, cf. e.g. [26, Ch. 15]). Here the state space is a Euclidean 3-ball, and a
self-adjoint element of the algebra acts as an affine function on the ball. This function attains its maximum and its minimum at two antipodal points, and the corresponding one-parameter group consists of rotations about the diameter between these two points (in either one of the two possible directions depending on orientation). The geometric meaning of a dynamical correspondence in the general case will be explained in Part II. Note that a Connes orientation is defined in the context of JBW-algebras, while a dynamical correspondence is defined in the general context of unital JB-algebras (which include JBW-algebras as a special case). However, for JBW-algebras, it is shown that each Connes orientation determines a unique dynamical correspondence (Proposition 21), and conversely that each dynamical correspondence on a JBW-algebra arises in this way from a unique Connes orientation on the algebra (Corollary 24). Our main result is Theorem 23 by which a unital JB-algebra A is the self-adjoint part of a C*-algebra iff there exists a dynamical correspondence on A, in which case there is a natural 1-1 map from the dynamical correspondences on A to those C*-products on A + iA which induce the given Jordan structure on A. The same conclusions hold with “JBW” in place of “JB” and “von Neumann” in place of “C*”. In Part II we will concentrate on C*-algebras and von Neumann algebras, for which we will define our general notion of orientation. Like the orientation in [3], it is defined geometrically, but here without use of extreme points. Nevertheless, the definition is “local” in that one prescribes an orientation on “small subsystems” and then requires that the choice varies continuously. This local geometric notion of orientation provides a unified framework for studying the passage from Jordan structure to associative structure in C*-algebras and von Neumann algebras. 
In this framework we will describe the geometry of dynamical correspondences and complete the process of relating the various notions studied in Part I.
2. Order Derivations

We begin by giving the definition of JB-algebras and JBW-algebras and some of their basic properties. (A comprehensive treatment of such algebras can be found in [13].)

Definition 1. A JB-algebra is a real Jordan algebra $A$ which is also a Banach space such that the Jordan product and the norm are connected by the following conditions for $a, b$ in $A$: (i) $\|a \circ b\| \le \|a\|\,\|b\|$, (ii) $\|a^2\| = \|a\|^2$, (iii) $\|a^2\| \le \|a^2 + b^2\|$. A JB-algebra with identity element 1 is said to be unital. In this paper, we will always assume our JB-algebras are unital.

Definition 2. A JBW-algebra is a JB-algebra $A$ which is the dual of a Banach space $A_*$. The space $A_*$ is unique [13, Th. 4.4.16], and it is called the predual of $A$.

The self-adjoint part of a C*-algebra is a JB-algebra (for the product $a \circ b = \tfrac12(ab + ba)$), and the self-adjoint part of a von Neumann algebra is a JBW-algebra. In fact, many of the basic constructs for these associative algebras can be carried over to their Jordan counterparts. Note in this connection that since a JBW-algebra is a special case of a JB-algebra, all definitions given for JB-algebras also apply to JBW-algebras.
A unital JB-algebra $A$ is a complete order unit space with positive cone $A^+ = \{a^2 \mid a \in A\}$ such that for $a \in A$,
$$-1 \le a \le 1 \;\Rightarrow\; 0 \le a^2 \le 1, \qquad (4)$$
and this actually characterizes unital JB-algebras among all real Jordan algebras with identity [13, Prop. 3.1.6]. A linear functional $\rho$ on a JB-algebra $A$ is said to be positive, denoted $\rho \ge 0$, if $\rho(a) \ge 0$ for all $a \in A^+$. The set of all positive functionals in the dual $A^*$ is a w*-closed (convex) cone, denoted by $(A^*)^+$. If $A$ is a unital JB-algebra, then the set of all $\rho \in (A^*)^+$ such that $\rho(1) = 1$ is a w*-compact convex set called the state space of $A$. Elements of the state space are called states, and the extreme points of the state space are called pure states.

A JB-algebra is monotone complete if every upper bounded increasing net $\{a_\alpha\}$ has a least upper bound $a$ in $A$. A bounded linear functional $\rho$ on $A$ is called normal if $\rho(a_\alpha) \to \rho(a)$ for each net $\{a_\alpha\}$ as above. The positive normal linear functionals form a separating set for $A$ if for every non-zero $a \in A$ there exists a positive normal linear functional $\rho$ such that $\rho(a) \ne 0$. (This separation property together with monotone completeness characterizes the JBW-algebras among all JB-algebras, and this characterization is taken as the definition in [13].) The predual $A_*$ of a JBW-algebra $A$ can be identified with the subspace of $A^*$ which consists of all normal linear functionals [13, Th. 4.4.16]. We will use the term σ-weak topology to denote the topology on $A$ determined by the duality with $A_*$ (the $\sigma(A, A_*)$-topology in Bourbaki's terminology). Thus, the σ-weakly continuous linear functionals are precisely the normal ones. Note that every JBW-algebra $A$ is unital [13, Lem. 4.1.7]. The convex set of normal states on $A$ is called the normal state space of $A$.

If $A$ is a JB-algebra, then $A^{**}$ can be made into a JBW-algebra in such a way that the state space of $A$ is identified with the normal state space of $A^{**}$ [13, §4.4]. Furthermore, multiplication is separately σ-weakly continuous on a JBW-algebra [13, Cor. 4.1.6], so we can often make use of the σ-weak density of $A$ in $A^{**}$.
We will now introduce an order theoretic concept of derivation which plays an important role in Connes' paper [8]. It can be defined in the general context of ordered linear spaces, but we will only give the definition for JB-algebras. But first some motivating remarks. Derivations occur in many different contexts. What is common to the various derivations $\delta$ is the fact that they are linear operators generating a one-parameter group of maps $e^{t\delta}$ which preserve the algebraic structure under study. In our present context, we are focusing on the order structure, ignoring the multiplicative aspect. Therefore the Leibniz rule is not relevant here.

Definition 3. A bounded linear operator $\delta$ on a JB-algebra $A$ is called an order derivation if $e^{t\delta}(A^+) \subset A^+$ for all $t \in \mathbb{R}$, or what is equivalent, if $\{e^{t\delta}\}_{t\in\mathbb{R}}$ is a one-parameter group of order automorphisms.

Lemma 4. An order derivation $\delta$ on a JBW-algebra $A$ is σ-weakly continuous.

Proof. We will first show that the map $\varphi = e^{t\delta}$ is σ-weakly continuous for given $t \in \mathbb{R}$, or what is the same, that $\rho \circ \varphi \in A_*$ for every $\rho \in A_*$. Since $\varphi$ is an order automorphism and $\rho$ is a normal linear functional, $\rho \circ \varphi$ is also a normal linear functional. Thus $\rho \circ \varphi \in A_*$ as desired. The order derivation $\delta$ is the norm limit of $t^{-1}(e^{t\delta} - 1)$ when $t \to 0$, so we can find a sequence $\{\psi_n\}$ of σ-weakly continuous linear maps such that $\|\psi_n - \delta\| \to 0$. Let
$\rho \in A_*$. Then $\rho \circ \psi_n \in A_*$ for every $n$ and $\|\rho \circ \psi_n - \rho \circ \delta\| \to 0$. Since $A_*$ is complete, $\rho \circ \delta \in A_*$. Since $\rho \in A_*$ was arbitrary, $\delta$ is σ-weakly continuous.

We will give a necessary and sufficient condition that a linear operator be an order derivation. The idea behind this criterion is the following: If $\delta$ is an order derivation, then the orbit of $e^{t\delta}$ through a boundary point of the cone $A^+$ cannot proceed to the exterior of $A^+$, nor to the interior, so the velocity vector must lie in the tangent space. Connes turned this heuristic argument into a precise criterion for self-polar cones [8, Lemma 5.3]. Later on Hanche-Olsen and Evans generalized it to arbitrary cones with the nearest point property, i.e. to cones for which the minimum distance from an arbitrary point to a point in the cone is effectively attained [12]. In this form it can be applied also in our context, as the positive cone of a unital JB-algebra, and in fact of every order unit space, has the nearest point property. This is an elementary result, which is certainly well known. But since we have not been able to find a good reference, we include the proof.

Lemma 5. The positive cone $A^+$ of an order unit space $A$ has the nearest point property.

Proof. Let $a \notin A^+$. Observe that there exists $\lambda \in \mathbb{R}^+$ such that $a + \lambda 1 \in A^+$. In fact, we can take $\lambda = \|a\|$, since $a \ge -\|a\|1$ (by the definition of order unit [13, 1.2.1]). Set
$$\lambda_0 = \inf\{\lambda \in \mathbb{R} \mid a + \lambda 1 \in A^+\}. \qquad (5)$$
Since the positive cone $A^+$ is closed, the element $b = a + \lambda_0 1$ lies in $A^+$. We claim that $b$ is a nearest point for $a$, i.e. that $\|c - a\| \ge \|b - a\| = \lambda_0$ for every $c \in A^+$. Let $c \in A^+$ and set $\lambda = \|c - a\|$. Then $c \le a + \lambda 1$ (again by the definition of the order unit). Since $c \in A^+$, also $a + \lambda 1 \in A^+$. Hence $\lambda \ge \lambda_0$ as claimed.
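As a small illustration of Lemma 5, here is a worked instance in a toy order unit space. The choice of space ($\mathbb{R}^2$ with the pointwise order, order unit $1 = (1,1)$ and the max norm) and the point $a$ are assumptions made here for illustration only; they do not appear in the original text.

```latex
% Assumed example: A = R^2, pointwise order, order unit 1 = (1,1), max norm.
\begin{align*}
a &= (-3,\,2) \notin A^+, \\
\lambda_0 &= \inf\{\lambda \in \mathbb{R} \mid (-3+\lambda,\; 2+\lambda) \ge 0\} = 3, \\
b &= a + \lambda_0 1 = (0,\,5) \in A^+, \qquad \|b - a\| = \|(3,3)\| = 3, \\
c &= (c_1, c_2) \in A^+ \;\Longrightarrow\; \|c - a\| \ge |c_1 + 3| \ge 3 = \lambda_0 .
\end{align*}
```

In this example the nearest point is not unique (for instance $(0,2)$ is also at distance 3 from $a$), which is consistent with the lemma: it asserts only that the minimum distance is attained.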
Lemma 6. A bounded linear operator $\delta$ on a unital JB-algebra $A$ is an order derivation iff the following implication holds for all $a \in A^+$ and $\rho \in (A^*)^+$:
$$\rho(a) = 0 \;\Rightarrow\; \rho(\delta a) = 0. \qquad (6)$$
Proof. By Theorem 1 of [12] the quoted statement holds in the context of ordered Banach spaces with the nearest point property. By Lemma 5 it can be applied in our case.

We will denote the Jordan multiplier determined by an element $b$ of a JB-algebra $A$ by $\delta_b$. Thus for all $a \in A$,
$$\delta_b(a) = b \circ a. \qquad (7)$$

Lemma 7. Let $A$ be a unital JB-algebra. Then $\delta_b$ is an order derivation for every $b \in A$.

Proof. Suppose $\rho(a) = 0$ for $a \in A^+$ and $\rho \in (A^*)^+$. By the Cauchy-Schwarz inequality for JB-algebras [13, 3.6.2],
$$|\rho(\delta_b a)|^2 = |\rho(b \circ a)|^2 \le \rho(b^2)\,\rho(a^2). \qquad (8)$$
Generally $a^2 \le \|a\|a$ for every $a \in A^+$. In fact, the Jordan triple product (defined in [13, 2.3.2]) determines an order preserving map $a \mapsto \{cac\}$ for every $c \in A$ [13, 3.3.6], so if we evaluate $a^{1/2}$ by spectral theory [13, 3.2.4], we can write
$$a^2 = \{a^{1/2}\, a\, a^{1/2}\} \le \{a^{1/2}(\|a\|1)a^{1/2}\} = \|a\|a.$$
Now it follows from (8) that $|\rho(\delta_b a)|^2 \le \rho(b^2)\,\|a\|\,\rho(a) = 0$. By Lemma 6, $\delta_b$ is an order derivation.
Definition 8. An order derivation $\delta$ on a unital JB-algebra $A$ is self-adjoint if $\delta = \delta_a$ for some $a \in A$, and it is skew-adjoint (or just skew) if $\delta(1) = 0$.

Our next lemma shows, among other things, that the skew order derivations are the Jordan derivations, i.e. the bounded linear operators $\delta$ which satisfy the Leibniz rule
$$\delta(a \circ b) = (\delta a) \circ b + a \circ (\delta b). \qquad (9)$$
Lemma 9. Let $A$ be a unital JB-algebra with state space $K$ and let $\delta$ be an order derivation on $A$. For every $t \in \mathbb{R}$, let $\alpha_t = e^{t\delta}$ and let $\alpha_t^*$ be the dual map defined on $A^*$ by $(\alpha_t^*\rho)(a) = \rho(\alpha_t(a))$ for $\rho \in A^*$ and $a \in A$. Now the following are equivalent:
(i) $\delta$ is skew,
(ii) $\alpha_t(1) = 1$ for all $t$,
(iii) $\alpha_t$ is a Jordan automorphism for all $t$,
(iv) $\delta$ is a Jordan derivation,
(v) $\alpha_t^*(K) \subset K$ for all $t$.
Proof. (i) ⇒ (ii) Use the exponential series for $e^{t\delta}$.

(ii) ⇒ (iii) By a known theorem of Kadison, every unital order automorphism of a C*-algebra is a Jordan automorphism [21]. The same result is in fact valid for a JB-algebra. It can most easily be obtained from Theorem 12.13 of [1], by which the Jordan square $a^2 = a \circ a$ of an element $a$ of $A$ (and hence every Jordan product $a \circ b$) is completely determined by the ordering and the order unit (via the spectral integral $\int \lambda^2\, de_\lambda$ of [1, Eq. (8.28)]). Thus, if $\alpha_t$ is a unital order automorphism, then it is also a Jordan automorphism.

(iii) ⇒ (iv) Since $\alpha_t$ is a Jordan automorphism, $\alpha_t(a \circ b) = \alpha_t(a) \circ \alpha_t(b)$ for $a, b \in A$. By the standard argument,
$$\delta(a \circ b) = \lim_{t \to 0} t^{-1}\bigl(\alpha_t(a) \circ \alpha_t(b) - a \circ b\bigr) = (\delta a) \circ b + a \circ (\delta b).$$
(iv) ⇒ (i) By Leibniz' rule $\delta(1) = \delta(1 \circ 1) = 2\delta(1)$. Hence $\delta(1) = 0$.

(ii) ⇔ (v) Trivial.

For our next proof we shall need two elementary results valid for elements $x, y, z$ in a unital Banach algebra. The first is the equation
$$\lim_{n \to \infty} \|(1 + n^{-1}x)^n - e^x\| = 0, \qquad (10)$$
which follows from the continuity of the holomorphic functional calculus. The second is the inequality
$$\|y^n - z^n\| \le n \cdot \max\{\|y\|, \|z\|\}^{n-1}\, \|y - z\|, \qquad (11)$$
which follows from the decomposition $y^n - z^n = y^{n-1}(y - z) + y^{n-2}(y - z)z + \cdots + (y - z)z^{n-1}$.

We will denote the set of order derivations of a JB-algebra $A$ by $D(A)$.

Proposition 10. The set $D(A)$ of order derivations of a unital JB-algebra $A$ is a real linear space closed under Lie brackets $[\delta_1, \delta_2] = \delta_1\delta_2 - \delta_2\delta_1$.
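Proof. The fact that $D(A)$ is closed under linear operations follows directly from Lemma 6. To show that $D(A)$ is closed under Lie brackets, it suffices to show that $[\delta_1, \delta_2]$ is an order derivation for a given pair $\delta_1, \delta_2 \in D(A)$. By looking at the first few terms of the exponential series involved, we see that
$$e^{t\delta_1} e^{t\delta_2} e^{-t\delta_1} e^{-t\delta_2} = 1 + t^2[\delta_1, \delta_2] + t^3\varphi_t, \qquad (12)$$
where $\|\varphi_t\|$ is bounded for $t$ in a neighbourhood of 0. Set $t_n = n^{-1/2}$ and define $\alpha_n = e^{t_n\delta_1} e^{t_n\delta_2} e^{-t_n\delta_1} e^{-t_n\delta_2}$, $\beta_n = 1 + t_n^2[\delta_1, \delta_2]$ and $\gamma_n = \varphi_{t_n}$ for $n = 1, 2, \ldots$. Now
$$\alpha_n - \beta_n = n^{-3/2}\gamma_n \qquad (13)$$
for $n = 1, 2, \ldots$, and $\{\|\gamma_n\|\}$ is a bounded sequence. Clearly $(\alpha_n)^n$ is an order automorphism for every $n$. It follows from (10) that $\|(\beta_n)^n - \exp[\delta_1, \delta_2]\| \to 0$ when $n \to \infty$. Thus we only have to show that $\|(\alpha_n)^n - (\beta_n)^n\| \to 0$ when $n \to \infty$. We will prove this by applying (11) with $\alpha_n$ and $\beta_n$ in place of $y$ and $z$, and we begin by showing that $\{\|\alpha_n\|^n\}$ and $\{\|\beta_n\|^n\}$ are bounded sequences.

By (12), $\|\alpha_n\| \le 1 + n^{-1}\|[\delta_1, \delta_2]\| + n^{-3/2}\|\gamma_n\|$ for $n = 1, 2, \ldots$. We will assume $[\delta_1, \delta_2] \ne 0$ (otherwise there is nothing to prove). Let $\lambda > 1$ be arbitrary. Then for sufficiently large $n$,
$$\|\alpha_n\| \le 1 + \lambda n^{-1}\|[\delta_1, \delta_2]\| \le \exp\bigl(\lambda n^{-1}\|[\delta_1, \delta_2]\|\bigr),$$
which gives
$$\|\alpha_n\|^n \le \exp\bigl(\lambda\|[\delta_1, \delta_2]\|\bigr).$$
Since $\lambda > 1$ was arbitrary,
$$\limsup_{n \to \infty} \|\alpha_n\|^n \le \exp\|[\delta_1, \delta_2]\|.$$
For every $n$,
$$\|\beta_n\| \le 1 + n^{-1}\|[\delta_1, \delta_2]\|,$$
and then also
$$\|\beta_n\|^n \le \exp\|[\delta_1, \delta_2]\|.$$
Let $M > \exp\|[\delta_1, \delta_2]\| \ge 1$. By the inequalities above, $\|\alpha_n\|^n \le M$ and $\|\beta_n\|^n \le M$, and then also $\|\alpha_n\|^{n-1} \le M$ and $\|\beta_n\|^{n-1} \le M$, for sufficiently large $n$. Now it follows from (11) and (13) that
$$\|(\alpha_n)^n - (\beta_n)^n\| \le nM\|\alpha_n - \beta_n\| \le n^{-1/2}M\|\gamma_n\|$$
for large $n$. Thus $\|(\alpha_n)^n - (\beta_n)^n\| \to 0$ when $n \to \infty$. Hence $\exp[\delta_1, \delta_2]$ is a norm limit of order automorphisms and so maps the closed cone $A^+$ into itself; applying this with $t\delta_1$ (or $-t\delta_1$) in place of $\delta_1$ gives the same for $\exp(t[\delta_1, \delta_2])$ for all $t \in \mathbb{R}$, and we are done.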
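The leading terms of the expansion (12) can be checked by multiplying out the exponential series to second order. The following is a sketch only; the shorthand $X = \delta_1$, $Y = \delta_2$ is introduced here for readability and is not notation from the paper, and the remainder is written simply as $O(t^3)$.

```latex
% Shorthand (assumption for this sketch): X = \delta_1, Y = \delta_2.
\begin{align*}
e^{tX}e^{tY} &= 1 + t(X+Y) + \tfrac{t^2}{2}\bigl(X^2 + 2XY + Y^2\bigr) + O(t^3),\\
e^{-tX}e^{-tY} &= 1 - t(X+Y) + \tfrac{t^2}{2}\bigl(X^2 + 2XY + Y^2\bigr) + O(t^3),\\
e^{tX}e^{tY}e^{-tX}e^{-tY}
 &= 1 + t^2\bigl[(X^2 + 2XY + Y^2) - (X+Y)^2\bigr] + O(t^3)\\
 &= 1 + t^2\,(XY - YX) + O(t^3) \;=\; 1 + t^2[X,Y] + O(t^3).
\end{align*}
```

The first-order terms cancel, and the second-order term is exactly the Lie bracket, which is why the choice $t_n = n^{-1/2}$ makes $\beta_n = 1 + n^{-1}[\delta_1,\delta_2]$ the right comparison sequence for (10).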
Lemma 11. Every order derivation $\delta$ on a unital JB-algebra $A$ can be decomposed uniquely as the sum of a self-adjoint and a skew derivation, namely $\delta = \delta_a + \delta'$, where $a = \delta(1)$.
Proof. Set $a = \delta(1)$ and $\delta' = \delta - \delta_a$. Then $\delta'(1) = a - a \circ 1 = 0$, so $\delta = \delta_a + \delta'$ is a decomposition of the desired type. If $\delta = \delta_b + \delta''$ is another such decomposition, then evaluation at 1 gives $\delta(1) = b \circ 1 = b$, so $b = a$. Therefore the decomposition is unique.

To each $\delta \in D(A)$ we will associate the adjoint order derivation $\delta^*$ defined by $\delta^* = \delta_a - \delta'$, where $\delta = \delta_a + \delta'$ as above. Thus $\delta \in D(A)$ is self-adjoint iff $\delta^* = \delta$.

Two elements $a, b$ of a JB-algebra $A$ are said to operator commute if the operators $\delta_a, \delta_b$ commute, i.e. if $(a \circ x) \circ b = a \circ (x \circ b)$ for all $x \in A$. If $A$ is the self-adjoint part of a C*-algebra, then $a, b$ operator commute iff $ab = ba$ [11, Lem. 5.1]. The set of all elements in a JB-algebra $A$ which operator commute with every other element of $A$ is called the center of $A$, and it will be denoted by $Z(A)$. Note that $Z(A)$ is an associative subalgebra of $A$. We will also denote by $Z(D(A))$ the center of the Lie algebra $D(A)$, i.e. the set of all $\delta \in D(A)$ such that $[\delta, \delta'] = 0$ for every other element $\delta'$ of $D(A)$.
Lemma 12. If $\delta$ is a skew order derivation on a unital JB-algebra $A$ and $z \in Z(A)$, then (i) $e^{t\delta}z = z$ for all $t \in \mathbb{R}$, (ii) $\delta z = 0$.

Proof. Recall that the bidual $A^{**}$ is a JBW-algebra. By σ-weak continuity of multiplication in each variable separately, the bidual map $(e^{t\delta})^{**}$ is also a Jordan automorphism. Furthermore, again by σ-weak continuity of multiplication, the center of $A$ will be contained in the center of $A^{**}$, so it suffices to prove the lemma for the special case where $A$ is a JBW-algebra. Then it is enough to prove the lemma for the case where $z$ is a central idempotent, i.e. $z^2 = z$. Since $\delta$ is skew, $e^{t\delta}$ is a Jordan automorphism, so $e^{t\delta}z$ is also a central idempotent for $t \in \mathbb{R}$. Thus $e^{t\delta}z - z$ is the difference of two central projections, so $\|e^{t\delta}z - z\|$ is either zero or one. Since $\|e^{t\delta}z - z\|$ is a continuous function of $t$ which is zero when $t = 0$, it must be zero for all $t$. This proves (i). Now also $\delta z = \lim_{t \to 0} t^{-1}(e^{t\delta}z - z) = 0$, which proves (ii).

Lemma 13. If $A$ is a unital JB-algebra, then
$$Z(D(A)) = \{\delta_z \mid z \in Z(A)\}. \qquad (14)$$
Proof. Assume first that $\delta \in Z(D(A))$. In particular $[\delta, \delta_a] = 0$ for every $a \in A$. Let $z = \delta(1)$. Then for every $a \in A$,
$$\delta(a) = \delta\delta_a(1) = \delta_a\delta(1) = \delta_a(z) = a \circ z = z \circ a.$$
Hence $\delta = \delta_z$. Also $\delta_z\delta_a = \delta_a\delta_z$, so $z \in Z(A)$.

Assume next that $z \in Z(A)$. By the definition of $Z(A)$, $\delta_z$ commutes with every self-adjoint order derivation $\delta_a$. Therefore we only have to show that $\delta_z$ commutes with every skew derivation $\delta$. But such a derivation is a Jordan derivation, so it follows from the Leibniz rule (9) and Lemma 12 (ii) that
$$\delta\delta_z(x) = \delta(z \circ x) = (\delta z) \circ x + z \circ (\delta x) = \delta_z\delta(x)$$
for every $x \in A$. Thus $\delta\delta_z = \delta_z\delta$ as desired.
If $A$ is the self-adjoint part of a C*-algebra $\mathcal{A}$, then we will assign to each $d \in \mathcal{A}$ a linear operator $\delta_d$ on $A$ defined by
$$\delta_d(x) = \tfrac12(dx + xd^*) \qquad (15)$$
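for $x \in A$. Note that $\delta_d(x) = (\delta_d(x))^*$.

Lemma 14. If $A$ is the self-adjoint part of a C*-algebra $\mathcal{A}$ and $\delta = \delta_d$ for $d \in \mathcal{A}$, then for $x \in A$ and $t \in \mathbb{R}$,
$$\exp(2t\delta)(x) = e^{td}\, x\, e^{td^*}. \qquad (16)$$
In particular if $\delta = \delta_a$ for $a \in A$ (self-adjoint case), then
$$\exp(2t\delta)(x) = e^{ta}\, x\, e^{ta}, \qquad (17)$$
and if $\delta = \delta_{ib}$ for $b \in A$ (skew case), then
$$\exp(2t\delta)(x) = e^{itb}\, x\, e^{-itb}. \qquad (18)$$

Proof. Consider the left and right multiplication operators $L_d : x \mapsto dx$ and $R_{d^*} : x \mapsto xd^*$ defined on $\mathcal{A}$ for $d \in \mathcal{A}$, so that $2\delta_d = L_d + R_{d^*}$. Since $L_d$ and $R_{d^*}$ commute, then for $x \in A$,
$$\exp(2t\delta_d)(x) = \exp(tL_d + tR_{d^*})(x) = \exp(tL_d)\exp(tR_{d^*})(x) = e^{td}\, x\, e^{td^*}.$$
This proves (16), from which (17) and (18) both follow.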
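Formula (16) can also be cross-checked by differentiation. The following sketch uses an auxiliary function $F(t)$, which is a name introduced here for illustration and not notation from the paper; it assumes only the definition (15) of $\delta_d$.

```latex
% Auxiliary function (assumption for this sketch): F(t) = e^{td} x e^{td^*}, x self-adjoint.
\begin{align*}
F(t) &:= e^{td}\, x\, e^{td^*}, \qquad F(0) = x, \qquad F(t)^* = F(t),\\
\frac{d}{dt}F(t) &= d\,F(t) + F(t)\,d^* = 2\,\delta_d\bigl(F(t)\bigr).
\end{align*}
% The linear ODE F' = 2\delta_d F with F(0) = x has the unique solution
% F(t) = \exp(2t\delta_d)(x), in agreement with (16).
```

Since $\delta_d$ is bounded, the initial value problem has a unique solution, so this gives the same conclusion as the commuting-operators argument in the proof.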
Proposition 15. If $A$ is the self-adjoint part of a von Neumann algebra $\mathcal{A}$, then the order derivations of $A$ are the operators $\delta_d$ on $A$ defined above, and an order derivation $\delta$ is self-adjoint (skew) iff it is of the form $\delta_d$ for $d$ self-adjoint (skew).

Proof. An order derivation is self-adjoint iff it is of the form $\delta_a$ for $a \in A$ (by definition). Clearly also, an order derivation is skew if it is of the form $\delta_{ib}$ with $b \in A$ (i.e. of the form $\delta_d$ with $d$ skew). Conversely we will show that an arbitrary skew order derivation $\delta$ is of this form. For every $t \in \mathbb{R}$ the map $e^{t\delta}$ is a Jordan automorphism of $A$. We extend it by (complex) linearity to all of $\mathcal{A}$. By a theorem of Kadison [22] there is a central projection $c$ such that $e^{t\delta}$ acts as a *-isomorphism from $c\mathcal{A}$ into $\mathcal{A}$ and a *-anti-isomorphism from $(1 - c)\mathcal{A}$ into $\mathcal{A}$. By Lemma 12, $e^{t\delta}$ fixes $c$. Hence $e^{t\delta}$ is a *-automorphism of $c\mathcal{A}$ and a *-anti-automorphism of $(1 - c)\mathcal{A}$. Applying $e^{t\delta}$ twice, we observe that $e^{2t\delta}$ acts as a *-automorphism also on $(1 - c)\mathcal{A}$. Since $t \in \mathbb{R}$ was arbitrary, this means that $e^{t\delta}$ is in fact a *-automorphism of $\mathcal{A}$ for every $t \in \mathbb{R}$. Thus by the Kadison-Sakai theorem the generator $\delta$ of the one-parameter group $\{e^{t\delta}\}_{t\in\mathbb{R}}$ is an inner derivation on $\mathcal{A}$, i.e. $\delta(x) = \tfrac12(hx - xh)$ for some $h \in \mathcal{A}$ and all $x \in \mathcal{A}$ [19, Ex. 8.7.55]. Let $h = a + ib$, where $a, b \in A$. Then for each $x \in A$,
$$\delta(x) = -i\delta_{ia}(x) + \delta_{ib}(x).$$
Since $\delta(x) = \delta(x)^*$, we have $\delta_{ia}(x) = 0$ and $\delta(x) = \delta_{ib}(x)$. Thus $\delta = \delta_{ib}$.

If $d \in \mathcal{A}$, say $d = a + ib$ with $a, b \in A$, then $\delta_d$ is the sum of the order derivations $\delta_a$ and $\delta_{ib}$, so $\delta_d$ is also an order derivation. Conversely we will show that every order derivation is of this form. Assume that $\delta$ is an arbitrary order derivation on $A$ and consider the decomposition $\delta = \delta_a + \delta'$ established in Lemma 11. By the argument above, $\delta' = \delta_{ib}$ for some $b \in A$. Thus for every $x \in A$,
$$\delta(x) = \tfrac12(ax + xa) + \tfrac{i}{2}(bx - xb) = \tfrac12\bigl((a + ib)x + x(a + ib)^*\bigr) = \delta_{a+ib}(x),$$
so $\delta$ is of the desired form.
The results above can easily be dualized to the predual $\mathcal{A}_*$ of the von Neumann algebra $\mathcal{A}$. If $\delta$ is an order derivation of $A$ and $\alpha_t = e^{t\delta}$, then we consider the dual operator $(\alpha_t)_*$ on the self-adjoint part $A_*$ of $\mathcal{A}_*$ for $t \in \mathbb{R}$. Generally $\{(\alpha_t)_*\}_{t\in\mathbb{R}}$ is a one-parameter group of order automorphisms of $A_*$, and if $\delta$ is skew then each $(\alpha_t)_*$ leaves the normal state space invariant (Lemma 9 (v)), so $\{(\alpha_t)_*\}_{t\in\mathbb{R}}$ is a one-parameter group of affine automorphisms of the normal state space.

The orbits of $\{(\alpha_t)_*\}_{t\in\mathbb{R}}$ can be easily visualized in the case where $\mathcal{A}$ is the 2 × 2 matrix algebra. Here the (normal) state space is a Euclidean 3-ball, and the pure state space is the surface of the ball, i.e. a Euclidean 2-sphere. If $a \in A$ is self-adjoint and has two distinct eigenvalues $\lambda_1 < \lambda_2$ corresponding to (unit) eigenvectors $\xi_1, \xi_2$, then the vector states $\omega_{\xi_1}, \omega_{\xi_2}$ are antipodal points on the sphere (South Pole and North Pole in Fig. 1). If $\delta = \delta_{ia}$ (the skew case), then $(\alpha_t)_*$ is a rotation of the ball by an angle $t(\lambda_1 - \lambda_2)/2$ about the diameter $[\omega_{\xi_1}, \omega_{\xi_2}]$. Thus the one-parameter group $\{(\alpha_t)_*\}_{t\in\mathbb{R}}$ represents a rotational motion with rotational velocity $(\lambda_1 - \lambda_2)/2$ about this diameter, and the orbits on the sphere are the "parallel circles" (in planes orthogonal to $[\omega_{\xi_1}, \omega_{\xi_2}]$). If $\delta = \delta_a$ (the self-adjoint case), then the orbits will take us out of the state space. But this can be remedied by a normalization, i.e. by considering the parametric curves $t \mapsto \|(\alpha_t)_*\sigma\|^{-1}(\alpha_t)_*\sigma$ instead of $t \mapsto (\alpha_t)_*\sigma$. These are the "longitudinal semi-circles" on the sphere (in planes through $[\omega_{\xi_1}, \omega_{\xi_2}]$). The proof of these facts is elementary matrix calculation and will be omitted.

[Fig. 1. The state-space ball with poles $\omega_{\xi_1}$ (bottom) and $\omega_{\xi_2}$ (top); orbits labelled $\delta_a$ and $\delta_{ia}$.]
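The matrix calculation omitted above can be sketched on the algebra side as follows. This is an illustration under stated assumptions: we take the basis $\xi_1, \xi_2$ in which $a$ is diagonal, write $x = (x_{jk})$ for a self-adjoint 2 × 2 matrix, and apply (17) and (18) with $t/2$ in place of $t$; the dual maps on states act with the inverse phase, so the sense of rotation depends on that convention.

```latex
% Assumptions: a = diag(\lambda_1, \lambda_2) in the basis \xi_1, \xi_2; x = (x_{jk}) self-adjoint.
\begin{align*}
e^{t\delta_{ia}}(x) &= e^{ita/2}\, x\, e^{-ita/2}
 = \begin{pmatrix}
 x_{11} & e^{it(\lambda_1-\lambda_2)/2}\, x_{12}\\[2pt]
 e^{-it(\lambda_1-\lambda_2)/2}\, x_{21} & x_{22}
 \end{pmatrix},\\[4pt]
e^{t\delta_{a}}(x) &= e^{ta/2}\, x\, e^{ta/2}
 = \begin{pmatrix}
 e^{t\lambda_1}\, x_{11} & e^{t(\lambda_1+\lambda_2)/2}\, x_{12}\\[2pt]
 e^{t(\lambda_1+\lambda_2)/2}\, x_{21} & e^{t\lambda_2}\, x_{22}
 \end{pmatrix}.
\end{align*}
```

In the skew case the diagonal entries (the coordinate along the diameter $[\omega_{\xi_1}, \omega_{\xi_2}]$) are fixed and the off-diagonal entry only picks up a phase of angle $t(\lambda_1 - \lambda_2)/2$, which is the rotation described in the text. In the self-adjoint case the trace is not preserved, which is why the normalization is needed.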
In the example above we can easily see how the one-parameter group is determined by the geometry in the self-adjoint case, and we can also see what indeterminacy there is in the skew case. The self-adjoint element $a \in A$ determines a real valued affine function $\hat{a} : \omega \mapsto \omega(a)$ on the state space. This function attains its minimum $\lambda_1$ at $\omega_{\xi_1}$ and its maximum $\lambda_2$ at $\omega_{\xi_2}$. In the self-adjoint case the orbits are the longitudinal semi-circles traced out in the direction from $\omega_{\xi_1}$ to $\omega_{\xi_2}$. In the skew case the orbits are the parallel circles, but they can be traced out in two possible directions, "eastbound" and "westbound". The mere knowledge of the affine function $\hat{a}$ does not tell us which
direction to choose. This would require a specific orientation of the ball (right-handed or left-handed around $\overrightarrow{\omega_{\xi_1}\omega_{\xi_2}}$).
3. Connes Orientations and Dynamical Correspondences

We will now transfer Connes' concept of orientation [8, Def. 4.11] to the context of JBW-algebras. The idea is to axiomatize the map $\delta_d \mapsto \delta_{id}$ of $D(A)$ into itself where $A$ is the self-adjoint part of a von Neumann algebra $\mathcal{A}$, or rather the corresponding map which is obtained when elements of $D(A)$ are identified modulo $Z(D(A))$.

To simplify the notation, we will write $\tilde{D}(A)$ in place of $D(A)/Z(D(A))$. We will also denote the equivalence class of an element $\delta$ of $D(A)$ modulo $Z(D(A))$ by $\tilde{\delta}$. Note that the involution $\tilde{\delta}^* = \widetilde{(\delta^*)}$ is well defined on $\tilde{D}(A)$, for if $\tilde{\delta}_1 = \tilde{\delta}_2$ then $\delta_1 - \delta_2 = \delta_z$ for some $z \in Z(A)$ (Lemma 13). Hence $\delta_1^* - \delta_2^* = \delta_z^* = \delta_z$, so $\widetilde{(\delta_1^*)} = \widetilde{(\delta_2^*)}$.

Definition 16. A Connes orientation on a JBW-algebra $A$ is a complex structure on $\tilde{D}(A)$ which is compatible with Lie brackets and involution, i.e. a linear operator $I$ on $\tilde{D}(A)$ which satisfies the requirements
(i) $I^2 = -1$ (where 1 is the identity map),
(ii) $[I\tilde{\delta}_1, \tilde{\delta}_2] = [\tilde{\delta}_1, I\tilde{\delta}_2] = I[\tilde{\delta}_1, \tilde{\delta}_2]$,
(iii) $I(\tilde{\delta}^*) = -(I\tilde{\delta})^*$.

If $A$ is the self-adjoint part of a von Neumann algebra $\mathcal{A}$, then it is easily verified that $I : \tilde{\delta}_d \mapsto \tilde{\delta}_{id}$ (with $d \in \mathcal{A}$) is a Connes orientation of $A$. We call it the Connes orientation induced on $A$ from $\mathcal{A}$. (Note that the map $I_0 : \delta_d \mapsto \delta_{id}$ from $D(A)$ into itself is not well-defined, since $d$ is not determined by $\delta_d$ if $d$ is not known to be self-adjoint.)

An alternative approach is to take the basic properties of the map $a \mapsto \delta_{ia}$, where $a$ is a self-adjoint element of a von Neumann algebra $\mathcal{A}$, as axioms for a map $\psi$ which assigns to each element $a$ in a general unital JB-algebra $A$ a skew order derivation $\psi_a$ on $A$. Geometrically $\psi$ assigns to each real valued affine function on the state space $K$ of $A$ a one-parameter group of affine automorphisms of $K$. Such a one-parameter group describes a motion of states, and we will call $\psi$ a "dynamical correspondence". The precise definition is the following:

Definition 17.
A dynamical correspondence on a unital JB-algebra $A$ is a linear map $\psi : a \mapsto \psi_a$ from $A$ into the set of skew order derivations on $A$ which satisfies the requirements
(i) $[\psi_a, \psi_b] = -[\delta_a, \delta_b]$ for $a, b \in A$,
(ii) $\psi_a a = 0$ for all $a \in A$.
A dynamical correspondence on a JB-algebra $A$ will be called complete if it maps $A$ onto the set of all skew order derivations on $A$.

It is easily verified that condition (ii) above is equivalent to the statement that $\exp(t\psi_a)$ fixes $a$ for all $a \in A$ and all $t \in \mathbb{R}$. This property is easily visualized in the 2 × 2 matrix example, and it will play an important role in the geometrical investigations in Part II. We will also state the definition of a dynamical correspondence in another form. In this connection we shall need the following lemma, which will also be needed later.
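As a sanity check on Definition 17, the following worked computation suggests that the motivating map $a \mapsto \delta_{ia}$ does satisfy both requirements. It is a sketch under the assumption that $A$ is the self-adjoint part of a C*-algebra; the operators $L_a : x \mapsto ax$ and $R_a : x \mapsto xa$ are shorthand introduced here, and the only fact used is that every $L_a$ commutes with every $R_b$.

```latex
% Assumption: A is the self-adjoint part of a C*-algebra; L_a x = ax, R_a x = xa.
\begin{align*}
\delta_a &= \tfrac12\,(L_a + R_a), \qquad
\psi_a := \delta_{ia} = \tfrac{i}{2}\,(L_a - R_a),\\
[\delta_a,\delta_b] &= \tfrac14\bigl([L_a,L_b] + [R_a,R_b]\bigr),\\
[\psi_a,\psi_b] &= \bigl(\tfrac{i}{2}\bigr)^2\bigl([L_a,L_b] + [R_a,R_b]\bigr)
 = -[\delta_a,\delta_b],\\
\psi_a a &= \tfrac{i}{2}\,(aa - aa) = 0 .
\end{align*}
```

The cross terms $[L_a, R_b]$ vanish in both brackets, which is what produces the sign change in requirement (i).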
Lemma 18. Let $A$ be a unital JB-algebra and let $\psi$ be a map from $A$ into the set of all skew order derivations on $A$. Then for all pairs $a, b \in A$,
$$[\psi_a, \delta_b] = \delta_{\psi_a b}. \qquad (19)$$

Proof. Since $\psi_a$ is skew, it is a Jordan derivation. Hence for all $c \in A$,
$$\psi_a(b \circ c) = (\psi_a b) \circ c + b \circ (\psi_a c),$$
which can be rewritten
$$\psi_a\delta_b c - \delta_b\psi_a c = \delta_{\psi_a b}\, c.$$
This gives (19).
Note that linearity of $\psi$ is not listed among the requirements in the proposition below, since it follows from the other requirements.

Proposition 19. Let $A$ be a unital JB-algebra and let $\psi : a \mapsto \psi_a$ be a map from $A$ into the set of skew order derivations of $A$. Then $\psi$ is a dynamical correspondence iff the following requirements are satisfied for $a, b \in A$:
(i) $[\psi_a, \psi_b] = -[\delta_a, \delta_b]$,
(ii) $[\psi_a, \delta_b] = [\delta_a, \psi_b]$.

Proof. Assume first that $\psi$ is a dynamical correspondence. Condition (i) above is trivially satisfied as it is identical with condition (i) of Definition 17. By condition (ii) of Definition 17 and Lemma 18, for all $a \in A$,
$$[\psi_a, \delta_a] = \delta_{\psi_a a} = \delta_0 = 0.$$
By linearity of $\psi$, for all $a, b \in A$,
$$0 = [\psi_{a+b}, \delta_{a+b}] = [\psi_a, \delta_b] + [\psi_b, \delta_a],$$
which gives $[\psi_a, \delta_b] = -[\psi_b, \delta_a] = [\delta_a, \psi_b]$, and proves condition (ii) above.

Assume next that $\psi$ satisfies conditions (i) and (ii) above. Condition (i) of Definition 17 is trivially satisfied, and it follows from condition (ii) above and Lemma 18 that for all $a \in A$,
$$\delta_{\psi_a a} = [\psi_a, \delta_a] = [\delta_a, \psi_a] = -[\psi_a, \delta_a].$$
Hence $\delta_{\psi_a a} = 0$ and $\psi_a a = \delta_{\psi_a a} 1 = 0$, so condition (ii) of Definition 17 is also satisfied. It remains to show that $\psi$ is linear. By Lemma 18 and condition (ii) above, for all $a, b$,
$$\delta_{\psi_a b}(1) = [\psi_a, \delta_b](1) = [\delta_a, \psi_b](1) = -[\psi_b, \delta_a](1) = -\delta_{\psi_b a}(1).$$
Hence
$$\psi_a b = -\psi_b a, \qquad (20)$$
from which it follows that $a \mapsto \psi_a b$ is a linear map from $A$ into $A$ for each fixed $b \in A$ (since $\psi_b$ is linear), i.e. that $a \mapsto \psi_a$ is a linear map from $A$ into $D(A)$.
In the above proof we have actually also shown that if $\psi$ is a dynamical correspondence on $A$, then Eq. (20) above holds for all pairs $a, b \in A$. We shall make more use of this equation later. We will now explain the relationship between Connes orientations and dynamical correspondences, and we begin with the following:

Lemma 20. If $\psi$ is a dynamical correspondence on a unital JB-algebra $A$, then the kernel of $\psi : A \to D(A)$ is the center $Z(A)$, i.e. it consists of those $z$ for which $\delta_z$ is a central self-adjoint order derivation.

Proof. If $a \in \ker\psi$, then it follows from Definition 17 (i) that $[\delta_a, \delta_b] = -[\psi_a, \psi_b] = 0$ for all $b \in A$. Hence $a \in Z(A)$. Conversely, if $z \in Z(A)$ then $\delta_z \in Z(D(A))$ (Lemma 13), so it follows from Lemma 18 and Proposition 19 (ii) that for all $b \in A$,
$$\delta_{\psi_z b} = [\psi_z, \delta_b] = [\delta_z, \psi_b] = 0.$$
Thus $\psi_z b = 0$ for all $b \in A$, so $z \in \ker\psi$.
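Proposition 21. A Connes orientation $I$ on a JBW-algebra $A$ determines a complete dynamical correspondence $\psi$ such that $\psi_a \in I(\tilde{\delta}_a)$ for all $a \in A$, and $\psi$ is the only dynamical correspondence with this property.

Proof. Let $a \in A$. We will first show that there exists a unique $\delta \in I(\tilde{\delta}_a)$ such that $\delta$ is skew. Choose $\delta_1 \in I(\tilde{\delta}_a)$. By Definition 16 (iii),
$$\tilde{\delta}_1^* = I(\tilde{\delta}_a)^* = -I(\tilde{\delta}_a^*) = -I(\tilde{\delta}_a) = -\tilde{\delta}_1.$$
Hence $\tilde{\delta}_1^* + \tilde{\delta}_1 = \tilde{0}$, so $\delta_1^* + \delta_1 = \delta_z$ for some $z \in Z(A)$ (Lemma 13). Now define $\delta = \delta_1 - \tfrac12\delta_z = \tfrac12(\delta_1 - \delta_1^*)$. Thus $\delta$ is skew. Also
$$\delta = \delta_1 - \tfrac12\delta_z \in \tilde{\delta}_1 = I(\tilde{\delta}_a). \qquad (21)$$
If $\delta'$ is an arbitrary skew order derivation such that $\delta' \in I(\tilde{\delta}_a)$, then $\delta - \delta'$ is both central and skew. By Lemma 13, $\delta - \delta' = \delta_w$ for some $w \in Z(A)$, and $w = (\delta - \delta')(1) = 0$, so $\delta$ is the unique skew order derivation in $I(\tilde{\delta}_a)$.

Denote the unique skew order derivation in $I(\tilde{\delta}_a)$ by $\psi_a$ for each $a \in A$. Clearly $\psi : a \mapsto \psi_a$ is a linear map from $A$ into $D(A)$. Let $a, b \in A$. By Definition 16 (i), (ii),
$$[I\tilde{\delta}_a, I\tilde{\delta}_b] = I[\tilde{\delta}_a, I\tilde{\delta}_b] = I^2[\tilde{\delta}_a, \tilde{\delta}_b] = -[\tilde{\delta}_a, \tilde{\delta}_b].$$
Thus for some $z \in Z(A)$,
$$[\psi_a, \psi_b] = -[\delta_a, \delta_b] + \delta_z.$$
Since $\psi_a$ and $\psi_b$ are skew, $[\psi_a, \psi_b](1) = 0$. Also $[\delta_a, \delta_b](1) = a \circ b - b \circ a = 0$. Hence $z = \delta_z(1) = 0$. Thus $[\psi_a, \psi_b] = -[\delta_a, \delta_b]$, so $\psi$ satisfies condition (i) of Definition 17.

To show that $\psi$ also satisfies condition (ii) of Definition 17, we first observe that for all $a \in A$, by Definition 16 (ii),
$$[\tilde{\psi}_a, \tilde{\delta}_a] = [I\tilde{\delta}_a, \tilde{\delta}_a] = I[\tilde{\delta}_a, \tilde{\delta}_a] = 0.$$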
By Lemma 13 there exists $z \in Z(A)$ such that $[\psi_a, \delta_a] = \delta_z$. Now $\delta_z$ is a commutator of two bounded linear operators on $A$ and it commutes with each of them, so it follows from the Kleinecke-Shirokov Theorem [10, p. 128] that $\delta_z$ is quasi-nilpotent, i.e.
$$\lim_{n \to \infty} \|\delta_z^{\,n}\|^{1/n} = 0.$$
But $\delta_z^{\,n}(1) = z^n$ for all $n$. Hence
$$\|\delta_z^{\,2^n}\|^{2^{-n}} \ge \|z^{2^n}\|^{2^{-n}} = \|z\|, \qquad (22)$$
so $z = 0$. (The norm-closed subalgebra generated by a single element and 1 is isometrically isomorphic to $C(X)$ for some compact Hausdorff space $X$ [13, 3.2.4], so $\|z^{2^n}\| = \|z\|^{2^n}$.) Thus $[\psi_a, \delta_a] = 0$. Now it follows from Lemma 18 that $\delta_{\psi_a a} = 0$, and hence also $\psi_a a = 0$, so we have verified condition (ii) of Definition 17. Thus $\psi$ is a dynamical correspondence.

To show that $\psi$ is complete, we consider an arbitrary skew order derivation $\delta$ and we will show that $\delta = \psi_a$ for some $a \in A$. More specifically, we choose an arbitrary $\delta_1 \in I(-\tilde{\delta})$ and we will show that $\delta_1$ is self-adjoint, i.e. of the form $\delta_1 = \delta_a$ for $a \in A$, and that $\psi_a = \delta$. By Definition 16 (iii) and the fact that $\delta$ is skew,
$$\tilde{\delta}_1^* = -(I\tilde{\delta})^* = I(\tilde{\delta}^*) = I(-\tilde{\delta}) = \tilde{\delta}_1.$$
Thus $\delta_1^* - \delta_1 = \delta_z$ for some $z \in Z(A)$. Applying both sides to 1, we get $(\delta_1^* - \delta_1)(1) = z$. Since $\delta_1^* - \delta_1$ is skew, $z = 0$. Thus $\delta_1^* = \delta_1$, so $\delta_1$ is self-adjoint, i.e. $\delta_1 = \delta_a$ for some $a \in A$. By definition $\delta_a \in I(-\tilde{\delta})$, so $\tilde{\delta}_a = -I\tilde{\delta}$. Then by Definition 16 (i), $I\tilde{\delta}_a = -I^2\tilde{\delta} = \tilde{\delta}$. Thus $\delta \in I\tilde{\delta}_a$, so $\delta$ is the unique skew order derivation in $I\tilde{\delta}_a$; in other words $\delta = \psi_a$. With this we have shown that $\psi$ is a complete dynamical correspondence. The uniqueness is clear, since $\psi_a$ is the only skew order derivation in $I\tilde{\delta}_a$.

The concept of a Connes orientation is defined for JBW-algebras, while the concept of a dynamical correspondence is defined for unital JB-algebras, so the two concepts cannot be equivalent. Note however, that it will follow from our main theorem that a dynamical correspondence on a JBW-algebra is necessarily complete and is derived from a Connes orientation as in Proposition 21 (Corollary 25). Thus the two concepts are in fact equivalent in the context of JBW-algebras.
4. The Main Theorem

We are now ready to prove our main theorem which relates dynamical correspondences to associative products.

Definition 22. Let $A$ be a unital JB-algebra. A C*-product (W*-product) compatible with $A$ is an associative product $(x, y) \mapsto xy$ on the complex linear space $A + iA$ which induces the given Jordan product on $A$ and makes $A + iA$ into a C*-algebra (von Neumann algebra) with the involution $(a + ib)^* = a - ib$ and the norm $\|x\| = \|x^*x\|^{1/2}$.
Note that if a JB-algebra $A$ is the self-adjoint part of a C*-algebra $\mathcal{A}$, then we can transfer the product and the norm from $\mathcal{A}$ to $A + iA$ by the representation $x = a + ib$ (where $x \in \mathcal{A}$ and $a, b \in A$). This makes $A + iA$ into a C*-algebra with the properties in the definition above. Thus, a JB-algebra $A$ is the self-adjoint part of a C*-algebra iff there exists a C*-product compatible with $A$ on $A + iA$, and similarly in the JBW-context.

Theorem 23. A unital JB-algebra $A$ is (Jordan isomorphic to) the self-adjoint part of a C*-algebra iff there exists a dynamical correspondence on $A$. In this case each dynamical correspondence $\psi$ on $A$ determines a unique C*-product compatible with $A$ such that for $a, b \in A$,
$$\psi_a b = \tfrac{i}{2}(ab - ba), \qquad (23)$$
and each C*-product compatible with $A$ arises in this way from a unique dynamical correspondence $\psi$ on $A$. The same conclusions hold with "JBW" in place of "JB" and "W*" or "von Neumann" in place of "C*".

Proof. Assume first that $A$ admits a dynamical correspondence $\psi$. By Eq. (20) of Proposition 19 we can define an anti-symmetric bilinear product $(a, b) \mapsto a \star b$ on $A$ by writing
$$a \star b = \psi_a b. \qquad (24)$$
Next define a bilinear map $(a, b) \mapsto ab$ from $A \times A$ into $A + iA$ (considered as a real linear space) by writing
$$ab = a \circ b - i(a \star b). \qquad (25)$$
This map can be uniquely extended to a bilinear product on $A + iA$ (considered as a complex linear space). We will show that this product is associative. By linearity, it suffices to prove the associative law
$$a(cb) = (ac)b \qquad (26)$$
for $a, b, c \in A$. Writing out (26) by means of (25), we get
$$a \circ (c \circ b) - i\, a \star (c \circ b) - i\, a \circ (c \star b) - a \star (c \star b) = (a \circ c) \circ b - i\, (a \circ c) \star b - i\, (a \star c) \circ b - (a \star c) \star b.$$
Separating real and imaginary terms (and using the anti-symmetry of the $\star$-product), we get two equations. The first one can be written as follows:
$$a \star (b \star c) - b \star (a \star c) = -a \circ (b \circ c) + b \circ (a \circ c), \qquad (27)$$
and the second one as follows:
$$a \star (b \circ c) - b \circ (a \star c) = a \circ (b \star c) - b \star (a \circ c). \qquad (28)$$
The left-hand side of (27) is nothing but $[\psi_a, \psi_b](c)$, and the right-hand side of (27) is nothing but $-[\delta_a, \delta_b](c)$. Similarly the left-hand side of (28) is $[\psi_a, \delta_b](c)$, and the right-hand side of (28) is $[\delta_a, \psi_b](c)$. Thus these two equations follow directly from the characterization of a dynamical correspondence in Proposition 19. We must also show that the bilinear product on $A + iA$ is compatible with the natural involution $(a + ib)^* = a - ib$ on $A + iA$, i.e. that $(xy)^* = y^*x^*$ for $x, y \in A + iA$. By
E. M. Alfsen, F. W. Shultz
linearity it suffices to show that (ab)* = ba for a, b ∈ A. But this follows directly from the anti-symmetry of the ⋆-product, as

$$(ab)^* = \big(a\circ b - i(a\star b)\big)^* = a\circ b + i(a\star b) = b\circ a - i(b\star a) = ba.$$

We have now shown that A + iA is an associative *-algebra. By the definition of the involution, the self-adjoint part of A + iA is A. Thus by (25) and the anti-symmetry of the ⋆-product,

$$\tfrac{1}{2}(ab + ba) = a\circ b$$

for all pairs a, b ∈ A. Thus the associative product in A + iA induces the given Jordan product on A.

We will now show that $x^*x \in A^+$ for every x ∈ A + iA. The closed Jordan subalgebra $C(x^*x)$ of A generated by the self-adjoint element $x^*x$ and 1 is associative, hence isometrically isomorphic to the real commutative Banach algebra C(X) for a compact Hausdorff space X [13, Th.3.2.2]. Thus $x^*x = a - b$, where a, b are two positive elements of $C(x^*x)$ such that a ∘ b = 0. By the definition of the Jordan triple product [13, 2.3.2] and the associativity of $C(x^*x)$,

$$\{bab\} = 2b\circ(b\circ a) - (b\circ b)\circ a = b\circ(b\circ a) = 0.$$

But since the Jordan product in A is equal to that induced from the associative algebra A + iA, the same is true for the Jordan triple product. Therefore also bab = 0. Calculating in the associative *-algebra A + iA, we now find that

$$(xb)^*(xb) = b x^* x b = b(a - b)b = -b^3. \tag{29}$$
Since $0 \le b \in C(x^*x)$, we have $b^3 \ge 0$, and $b^3 = 0$ iff b = 0. Thus in order to prove b = 0, it suffices to show that $(xb)^*(xb) = 0$. For brevity we set y = xb, and we will show that $y^*y = 0$. By (29), $y^*y \in -A^+$. Write y = c + id, where c, d ∈ A, and calculate $yy^* + y^*y = 2(c^2 + d^2)$, which gives

$$yy^* = 2(c^2 + d^2) - y^*y \in A^+.$$

Thus we have shown that

$$y^*y \in -A^+ \quad\text{and}\quad yy^* \in A^+. \tag{30}$$
Following [13, 3.2.9] we define the inverse of an element a ∈ A to be its inverse in C(a) (if it exists), and we denote it by $a^{-1}$. Note that this definition is equivalent to the usual definition of inverse in Jordan algebras; namely that a′ ∈ A is the inverse of a if a′ satisfies a ∘ a′ = 1 and $a^2 \circ a' = a$ [4, Prop.2.4]. Furthermore it is easily shown that a ∈ A is invertible with inverse a′ in the Jordan algebra A iff a is invertible with inverse a′ in the Jordan algebra A + iA iff a is invertible with inverse a′ in the associative *-algebra A + iA. (For the last equivalence, see [16, p.51].) By definition, a real number λ is in the spectrum of an element a of the JB-algebra A, in symbols λ ∈ sp(a), if λ1 − a is non-invertible in A. Thus, by the above, λ ∈ sp(a) iff λ1 − a is non-invertible in the associative algebra A + iA.
On Orientation and Dynamics in Operator Algebras. I
Calculating in the associative algebra A + iA (in the same way as in the proof of [19, Prop.3.2.8]), we find that if $\lambda 1 - y^*y$ is invertible and λ ≠ 0, then

$$(\lambda 1 - yy^*)\big(y(\lambda 1 - y^*y)^{-1}y^* + 1\big) = \lambda 1,$$

so $\lambda 1 - yy^*$ is also invertible. By symmetry in y and $y^*$, this shows that $\mathrm{sp}(y^*y)\setminus\{0\} = \mathrm{sp}(yy^*)\setminus\{0\}$. By (30), $\mathrm{sp}(y^*y) \subset (-\infty, 0]$ and $\mathrm{sp}(yy^*) \subset [0, \infty)$, so $\mathrm{sp}(y^*y) = \mathrm{sp}(yy^*) = \{0\}$, and by the spectral theorem for JB-algebras [13, Th.3.2.4] $y^*y = yy^* = 0$. Hence b = 0, and then $x^*x \in A^+$ as claimed.

Define now for x ∈ A + iA,

$$\|x\| = \|x^*x\|^{1/2}. \tag{31}$$
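The resolvent identity used above for $\mathrm{sp}(y^*y)\setminus\{0\} = \mathrm{sp}(yy^*)\setminus\{0\}$ is purely algebraic, so it can be checked numerically in any associative *-algebra. The sketch below does this for 2×2 complex matrices, which is an assumption made only for illustration; the function names are hypothetical.

```python
def mul(x, y):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(x):
    """Conjugate transpose x*."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def scalar(s):
    """The matrix s*1."""
    return [[s, 0], [0, s]]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def inverse(x):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]

def resolvent_identity_error(y, lam):
    """Defect of (lam - yy*)(y (lam - y*y)^{-1} y* + 1) = lam * 1."""
    ys = adjoint(y)
    resolvent = inverse(sub(scalar(lam), mul(ys, y)))    # (lam - y*y)^{-1}
    left = sub(scalar(lam), mul(y, ys))                  # lam - yy*
    right = add(mul(mul(y, resolvent), ys), scalar(1))   # y(...)^{-1}y* + 1
    product = mul(left, right)
    target = scalar(lam)
    return max(abs(product[i][j] - target[i][j])
               for i in range(2) for j in range(2))

# Sample element; lam must be nonzero and outside the spectrum of y*y.
y = [[1 + 2j, 0.5], [-1j, 3.0]]
print(resolvent_identity_error(y, -0.7))
```

The identity follows from $(\lambda 1 - yy^*)\,y = y\,(\lambda 1 - y^*y)$, which is why no positivity assumption on $y^*y$ is needed at this step.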
Extend each state ρ on A to a complex linear functional on A + iA. Note that since the states separate the points of A, their extensions will separate the points of A + iA. Define for x, y ∈ A + iA,

$$(x \mid y)_\rho = \rho(y^*x). \tag{32}$$

Since $x^*x \in A^+$ for all x ∈ A + iA, this is a semi-definite inner product. Construct now the GNS-representations $(\pi_\rho, H_\rho)$ in the usual way, and let (π, H) be the direct sum of all such representations. If x is a non-zero element of A + iA, then there exists a state ρ such that $(x \mid 1)_\rho = \rho(x) \ne 0$. Thus by the Cauchy-Schwarz inequality, $\rho(x^*x) \ne 0$, and then $\pi_\rho(x) \ne 0$. Thus π is a *-isomorphism of A + iA into B(H). In particular, π restricts to a Jordan isomorphism of A into the self-adjoint part of B(H), so it follows from [13, Prop.3.4.3] that ‖π(a)‖ = ‖a‖ for a ∈ A. By (31) and the corresponding equation for the norm of B(H), we now have for x ∈ A + iA

$$\|\pi(x)\| = \|\pi(x^*x)\|^{1/2} = \|x^*x\|^{1/2} = \|x\|. \tag{33}$$
Thus (31) defines a norm on A + iA which makes π : A + iA → B(H) an isometric *-isomorphism. Clearly this norm satisfies the C*-condition $\|x\|^2 = \|x^*x\|$, and we will now show that A + iA is complete for this norm. Pulling back the corresponding inequalities from B(H), we have for every x = a + ib ∈ A + iA,

$$\max(\|a\|, \|b\|) \le \|x\| \le \|a\| + \|b\|.$$

Since A is complete in the order unit norm, the space A + iA must be complete in the norm (31), and then be a C*-algebra. It follows from (24), (25) and the anti-symmetry of the ⋆-product that

$$\tfrac{i}{2}(ab - ba) = \tfrac{1}{2}(a\star b - b\star a) = \psi_a(b).$$

Thus we have constructed a C*-product compatible with A which satisfies the requirement (23). This product is unique since the compatibility with the Jordan algebra A determines the self-adjoint part a ∘ b in (25), and the requirement (23) determines the skew part a ⋆ b in (25).

Assume next that A + iA is equipped with a C*-product (x, y) ↦ xy compatible with the JB-algebra A. Now we define a map ψ from A into D(A) by Eq. (23), and we prove by straightforward calculation that the two requirements (i), (ii) of Definition 17 are satisfied. Thus ψ is a dynamical correspondence on A. Also we can recover the given
associative product by substituting $\psi_a b$ for a ⋆ b in (25), so this product arises from ψ by the construction in the first part of the proof.

It only remains to specialize to JBW-algebras. In this connection we must prove that if A is a JBW-algebra and (x, y) ↦ xy is a C*-product on A + iA which is compatible with A, then A + iA is a von Neumann algebra. The quickest proof of this fact is based on Kadison’s theorem that a monotone complete C*-algebra with a separating family of normal states is a von Neumann algebra [23] (or [19, Ex. 7.6.38]). In fact, the conditions on monotone completeness and separation by normal states are both imposed on the self-adjoint part of the C*-algebra, and since they are satisfied for the JBW-algebra A, there is nothing more to prove.

Remark. The proof above shows that if A is a JB-algebra and A + iA is equipped with an associative product such that a + ib ↦ a − ib is an involution, then A + iA can be normed (in a necessarily unique way) to become a C*-algebra.

Corollary 24. A JBW-algebra A is the self-adjoint part of a von Neumann algebra iff there exists a Connes orientation on A. In this case each Connes orientation I on A determines a unique W*-product compatible with A such that for d ∈ A + iA,

$$I(\tilde\delta_d) = \tilde\delta_{id}, \tag{34}$$
and each W*-product compatible with A arises in this way from a unique Connes orientation on A.

Proof. Assume first that A admits a Connes orientation I. By Proposition 21 there exists a dynamical correspondence ψ on A such that $\psi_a \in I(\tilde\delta_a)$ for a ∈ A. Construct the corresponding W*-product in A + iA as in Theorem 23. Let d ∈ A + iA, say d = a + ib, where a, b ∈ A. Then to verify (34) we observe that for all c ∈ A,

$$\delta_{ib}\,c = \tfrac{i}{2}(bc - cb) = \psi_b c,$$

so $\delta_{ib} = \psi_b$. Since $\tilde\psi_b = I(\tilde\delta_b)$, then $I(\tilde\delta_{ib}) = I(\tilde\psi_b) = I^2(\tilde\delta_b) = -\tilde\delta_b$, which gives

$$I(\tilde\delta_d) = I(\tilde\delta_a) + I(\tilde\delta_{ib}) = \tilde\psi_a - \tilde\delta_b = \tilde\delta_{ia} - \tilde\delta_b = \tilde\delta_{id}, \tag{35}$$
and establishes (34). Recall that (as shown in the proof of Proposition 21) there is a unique skew-adjoint order derivation in $I(\tilde\delta_{id})$. Thus any two W*-products compatible with A and satisfying (34) induce the same Lie multiplication map $\delta_{id}$ for each d ∈ A + iA, as well as the same Jordan multiplication (inherited from A), and thus coincide.

Assume next that A + iA is equipped with a C*-product compatible with the JBW-algebra A. Thus A is the self-adjoint part of a von Neumann algebra $\mathcal{A}$, and by Proposition 15 each order derivation on A is of the form $\delta_d$ for some d ∈ $\mathcal{A}$. Now we can define a map I from $\tilde D(A)$ into itself by (34), and we prove by straightforward calculation that the requirements (i), (ii), (iii) of Definition 16 are satisfied. Thus I is a Connes orientation on A. Clearly I is the unique Connes orientation on A for which Eq. (34) is satisfied.

Corollary 25. A dynamical correspondence ψ on a JBW-algebra A is necessarily complete, and there is a unique Connes orientation I on A such that $\psi_a \in I(\tilde\delta_a)$ for all a ∈ A.
Proof. Let ψ be a dynamical correspondence on a JBW-algebra A. By Theorem 23 there is a unique W*-product on A + iA such that (23) holds. In other words, A is the self-adjoint part of a von Neumann algebra $\mathcal{A}$ which is unique (up to a *-isomorphism) under the requirement $\psi_a = \delta_{ia}$ for all a ∈ A. Now it follows from Corollary 24 that there is a Connes orientation I on A such that $I(\tilde\delta_a) = \tilde\delta_{ia} = \tilde\psi_a$. Thus $\psi_a \in I(\tilde\delta_a)$ for all a ∈ A. By the same argument as the one leading up to (35), $I(\tilde\delta_d) = \tilde\delta_{id}$ for all d ∈ $\mathcal{A}$. Since each order derivation on A is of the form $\delta_d$ for some d ∈ $\mathcal{A}$ (Proposition 15), this proves that I is uniquely determined by the requirement

$$\tilde\psi_a = I(\tilde\delta_a) = \tilde\delta_{ia} \quad\text{for } a \in A. \tag{36}$$
By Proposition 21 there is a unique dynamical correspondence on A for which (36) holds, and this correspondence is complete. Thus ψ is complete, and we are done.

References

1. Alfsen, E.M. and Shultz, F.W.: Non-commutative spectral theory for affine function spaces on convex sets. Memoirs A.M.S. 172, 120 (1976)
2. Alfsen, E.M. and Shultz, F.W.: On state spaces of Jordan algebras. Acta Math. 140, 155–190 (1978)
3. Alfsen, E.M., Hanche-Olsen, H., Shultz, F.W.: On state spaces of C*-algebras. Acta Math. 144, 267–305 (1980)
4. Alfsen, E.M., Shultz, F.W. and Størmer, E.: A Gelfand-Neumark theorem for Jordan algebras. Adv. Math. 28, 11–56 (1978)
5. Bellissard, J. and Iochum, B.: Homogeneous self-dual cones versus Jordan algebras, the theory revisited. Ann. Inst. Fourier (Grenoble) 28, 27–67 (1978)
6. Rajarama Bhat, B.V.: On a characterization of velocity maps in the space of observables. Pac. J. Math. 152, 1–14 (1992)
7. Bunce, L.J. and Maitland-Wright, J.D.: Velocity maps in von Neumann algebras. Pac. J. Math. 170, 421–427 (1995)
8. Connes, A.: Caractérisation des espaces vectoriels ordonnés sous-jacents aux algèbres de von Neumann. Ann. Inst. Fourier (Grenoble) 24, 121–155 (1974)
9. Grgin, E. and Petersen, A.: Duality of observables and generators in classical and quantum mechanics. J. Math. Phys. 15, 764–769 (1974)
10. Halmos, P.R.: Hilbert Space Problem Book. Toronto–London–Melbourne: D. Van Nostrand, 1976
11. Hanche-Olsen, H.: On the structure and tensor products of JC-algebras. Can. J. Math. 35, 1059–1074 (1983)
12. Hanche-Olsen, H. and Evans, D.E.: The generators of positive semigroups. J. Func. Anal. 32, 207–212 (1979)
13. Hanche-Olsen, H. and Størmer, E.: Jordan Operator Algebras. Boston–London–Melbourne: Pitman, 1984
14. Iochum, B.: Cônes autopolaires et algèbres de Jordan. Lecture Notes in Math. 1049, Berlin–Heidelberg–New York: Springer-Verlag, 1984, p. 249
15. Iochum, B. and Shultz, F.W.: Normal state spaces of Jordan and von Neumann algebras. J. Funct. Anal. 50, 317–328 (1983)
16. Jacobson, N.: Structure and representations of Jordan algebras. Am. Math. Soc. Colloq. Publ. 39, Providence, R.I.: AMS, 1968
17. Jordan, P., von Neumann, J., and Wigner, E.: On an algebraic generalization of the quantum mechanical formalism. Ann. Math. 35, 29–64 (1934)
18. von Neumann, J.: On an algebraic generalization of the quantum mechanical formalism. Mat. Sbornik 1, 415–482 (1936)
19. Kadison, R.V. and Ringrose, J.R.: Fundamentals of the Theory of Operator Algebras, I-II. London–New York: Academic Press, 1986
20. Kadison, R.V.: A representation theory for commutative topological algebra. Memoirs A.M.S. 7, 36 (1951)
21. Kadison, R.V.: A generalized Schwarz inequality and algebraic invariants for operators. Ann. of Math. 56, 494–503 (1952)
22. Kadison, R.V.: Isometries of operator algebras. Ann. Math. 54, 325–338 (1951)
23. Kadison, R.V.: Operator algebras with a faithful weakly-closed representation. Ann. Math. 64, 175–181 (1956)
24. Kadison, R.V.: Transformation of States in Operator Theory and Dynamics. Topology 3, Suppl. 2, 177–198 (1967)
25. Landsman, N.P.: Poisson Spaces with a transition probability. Rev. Math. Phys. 9, 29–57 (1997)
26. Mandel, L., Wolf, E.: Optical Coherence and Quantum Optics. Cambridge: Cambridge Univ. Press, 1995
27. Sakai, S.: The absolute value of W*-algebras of finite type. Tohoku Math. J. 8, 70–85 (1956)
28. Segal, I.E.: Irreducible representations of operator algebras. Bull. A.M.S. 53, 73–88 (1947)
29. Segal, I.E.: Postulates for general quantum mechanics. Ann. Math. 48, 930–948 (1947)
30. Sherman, S.: On Segal’s postulates for general quantum mechanics. Ann. Math. 64, 593–601 (1956)
31. Shultz, F.W.: On normed Jordan algebras which are Banach dual spaces. J. Func. Anal. 31, 360–376 (1979)
32. Størmer, E.: On the Jordan structure of C*-algebras. Trans. Am. Math. Soc. 120, 438–447 (1965)
33. Størmer, E.: Jordan algebras of type I. Acta Math. 115, 165–184 (1966)
34. Takesaki, M.: Theory of Operator Algebras I. Berlin–Heidelberg–New York: Springer-Verlag, 1979
35. Topping, D.: Jordan algebras of self-adjoint operators. Mem. Am. Math. Soc. 53, 48 (1965)

Communicated by H. Araki
Commun. Math. Phys. 194, 109 – 134 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Quasi-Classical Asymptotics for the Pauli Operator
Alexander V. Sobolev*
Centre for Mathematical Analysis and its Applications, University of Sussex, Falmer, Brighton, BN1 9QH, UK. E-mail: [email protected]
Received: 23 July 1997 / Accepted: 30 September 1997
Abstract: We study the behaviour of the sums $M_\gamma = \sum_k |\lambda_k|^\gamma$, γ > 0, of the eigenvalues of the Pauli operator in $L^2(\mathbb{R}^d)$, d = 2, 3, in a magnetic field µB(x) and electric field V(x) as the Planck constant ħ tends to zero and the magnetic field strength µ tends to infinity. We show that for µħ ≤ const the sum $M_\gamma$ obeys the natural Weyl type formula

$$M_\gamma \sim C_{\gamma,d}\,\mu\hbar^{1-d}\int |B|\Big(V_-^{\sigma} + 2\sum_{k>0}\big[2k\mu\hbar|B| + V\big]_-^{\sigma}\Big)\,dx,$$
where σ = (d − 2)/2 + γ, with an explicit constant Cγ,d . If the field B has a constant direction, then this formula is uniform in µ ≥ 0. The method is based on Colin de Verdiere’s approach proposed in his work on “magnetic bottles” (Commun. Math Phys, 105, 327–335 (1986)).
* Author supported by EPSRC under grant B/94/AF/1793.

1. Introduction

The aim of the present paper is to study the negative discrete spectrum of the Pauli operator

$$P = P_0 + V, \qquad P_0 = \big(\sigma\cdot(-i\hbar\nabla - \mu a)\big)^2 = \Big(\sum_{k=1}^{d}\sigma_k(-i\hbar\partial_k - \mu a_k)\Big)^2, \tag{1.1}$$

acting in $L^2(\Omega) \oplus L^2(\Omega)$, d = 2, 3, with the Dirichlet conditions on the boundary of the domain $\Omega \subset \mathbb{R}^d$. Here $a = (a_1, \dots, a_d)$ is a magnetic vector-potential, V is an electric (real-valued) potential decreasing at infinity, $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ denotes the 3-tuple of 2×2 Pauli matrices, i.e. Hermitian matrices such that $\sigma_k\sigma_l + \sigma_l\sigma_k = 2\delta_{lk}I$, and for any cyclic permutation (klm) of the numbers (123) one has $\sigma_k\sigma_l = i\sigma_m$. Up to a unitary equivalence all the realizations of these relations can be described by the standard Pauli matrices:

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{1.2}$$

As one can see from (1.1), the operator $P_0$ is non-negative. Moreover, it is known that under fairly broad conditions on the magnetic field B = curl a the point λ = 0 belongs to the spectrum of $P_0$, and in certain cases the spectrum coincides with the half-line [0, ∞) (see [3, 14] and references therein). The perturbed operator $P = P_0 + V$ may have some negative eigenvalues $\lambda_k = \lambda_k(\hbar, \mu B, V, \Omega)$, k = 1, 2, …. We enumerate them counting multiplicity in the non-decreasing order. We shall study the asymptotic behaviour of the sums

$$M_\gamma = M_\gamma(\hbar, \mu B, V, \Omega) = \sum_k |\lambda_k|^\gamma, \quad \gamma \ge 0, \tag{1.3}$$

as ħ → 0, µ → ∞ and discuss the applicability of the natural quasi-classical asymptotic formula

$$M_\gamma(\hbar, \mu B, V, \Omega) \sim \hbar^{-d} B_\gamma(\mu\hbar|B|, V, \Omega), \quad \hbar \to 0, \tag{1.4}$$
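with the “magnetic” Weyl coefficient

$$B^{(d)}_\gamma(b, V, \Omega) = \beta^{(d)}_\gamma \int_\Omega b(x)\Big[V_-(x)^{\frac{d-2}{2}+\gamma} + 2\sum_{k>0}\big[2kb(x) + V(x)\big]_-^{\frac{d-2}{2}+\gamma}\Big]\,dx, \tag{1.5}$$

$$\beta^{(2)}_\gamma = (2\pi)^{-1}, \qquad \beta^{(3)}_\gamma = (2\pi^2)^{-1}\,\gamma\int_0^1 (1-t)^{\frac12}\,t^{\gamma-1}\,dt. \tag{1.6}$$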
Notice that for b → 0 this coefficient tends to the standard Weyl coefficient

$$C_{d,\gamma}\int V_-(x)^{\gamma+\frac d2}\,dx$$

(with an explicit constant $C_{d,\gamma}$), so that as µħ → 0 one recovers from (1.4) the usual Weyl asymptotics. On the contrary, if µħ → ∞, then (1.4) reduces to

$$M_\gamma(\hbar, \mu B, V, \Omega) \sim \mu\hbar^{-d+1} A_\gamma(|B|, V, \Omega), \quad \hbar \to 0, \tag{1.7}$$

with

$$A_\gamma(b, V, \Omega) = A^{(d)}_\gamma(b, V, \Omega) = \beta^{(d)}_\gamma\int_\Omega b(x)\,V_-(x)^{\frac{d-2}{2}+\gamma}\,dx. \tag{1.8}$$
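The two limiting regimes described above can be illustrated numerically on the integrand of (1.5) for a constant field b > 0 and a constant potential V = −v with v > 0. The sketch below is only an illustration: the function names are hypothetical, and the small-field limit β v^{σ+1}/(σ+1) is our own Riemann-sum evaluation (the sum over k is a Riemann sum with step 2b), not a formula quoted from the paper.

```python
import math

def beta(d, gamma):
    """Constant beta_gamma^(d) from (1.6)."""
    if d == 2:
        return 1.0 / (2.0 * math.pi)
    if d == 3:
        # gamma * int_0^1 (1-t)^(1/2) t^(gamma-1) dt = gamma * B(gamma, 3/2)
        euler_beta = (math.gamma(gamma) * math.gamma(1.5)
                      / math.gamma(gamma + 1.5))
        return gamma * euler_beta / (2.0 * math.pi ** 2)
    raise ValueError("d must be 2 or 3")

def magnetic_density(b, v, d, gamma):
    """Integrand of (1.5) for constant b > 0 and constant V = -v, v > 0."""
    sigma = (d - 2) / 2 + gamma
    total = v ** sigma              # the term V_-^sigma
    k = 1
    while 2 * k * b < v:            # [2kb + V]_- vanishes once 2kb >= v
        total += 2 * (v - 2 * k * b) ** sigma
        k += 1
    return beta(d, gamma) * b * total

def weyl_density(v, d, gamma):
    """Small-field limit of the integrand (our Riemann-sum evaluation)."""
    sigma = (d - 2) / 2 + gamma
    return beta(d, gamma) * v ** (sigma + 1) / (sigma + 1)

def strong_field_density(b, v, d, gamma):
    """Integrand of (1.8), multiplied by b: only the first term survives."""
    sigma = (d - 2) / 2 + gamma
    return beta(d, gamma) * b * v ** sigma

print(magnetic_density(1e-4, 1.0, 2, 1.0), weyl_density(1.0, 2, 1.0))
print(magnetic_density(1.0, 1.0, 3, 1.0), strong_field_density(1.0, 1.0, 3, 1.0))
```

For 2b ≥ v the sum over k > 0 in (1.5) is empty, which is the pointwise mechanism behind the reduction of (1.4) to (1.7) in the strong-field regime.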
It was shown in [8, 9] that for homogeneous fields (i.e. for B = const) the asymptotics (1.4) for γ = 1 holds uniformly in the field strength µ. The first result of this kind for non-homogeneous fields was recently established in [6, 7] with the supplementary condition $\mu\hbar^3 \to 0$. In the present paper we prove that the formula (1.4) holds true if µħ remains bounded from above (Theorem 2.1), whereas in the case of a constant direction field the asymptotics (1.4) is uniform in µ (Theorem 2.2). Notice that for d = 2 the field always has a constant direction. The strategy adopted in this paper involves the following two ingredients:
1. Quasi-classical asymptotics for the Pauli operator in a cube with the Dirichlet boundary conditions;
2. Appropriate a priori upper bound for the sum of the eigenvalues (usually referred to as the Lieb–Thirring inequality).

First we decompose the domain Ω into cubes. Then step 1 yields the required asymptotics for the Pauli operator with the Dirichlet conditions on the sides of the cubes. The role of the Lieb–Thirring estimate is to ensure that the obtained asymptotic formula in the limit of small cubes turns into (1.4) for the initial problem. Our approach was inspired by Y. Colin de Verdière’s work [2] on “magnetic bottles”, where the idea of calculating the spectral asymptotics of the magnetic Schrödinger operator by decomposing the configuration space into cubes with the Dirichlet conditions was realized for the first time. Recall that the crucial point of this work was an explicit formula for the eigenvalues of the operator with a constant magnetic field on a torus, which subsequently led to appropriate spectral estimates on a cube. In fact, when implementing step 1 (see above) for µħ ≤ const, we use these estimates directly, approximating B by a piecewise constant magnetic field (see Lemma 3.1, Proposition 3.2). In the case µħ → ∞ (when B is assumed to have a constant direction) this approximation is too crude. Thus rather than apply the results of [2] directly, we use the strategy of that paper, modifying the “model” problem. Namely, we rely on the spectral properties of the Pauli operator on a torus with an arbitrary periodic (not necessarily constant) B having integer flux (see Sect. 4). The only piece of information we need here is that the point λ = 0 is an eigenvalue whose multiplicity is equal to the flux of B (see [4] and Appendix A). The other eigenvalues are irrelevant, for they do not contribute to the asymptotics of $M_\gamma$ as µħ → ∞ (see Lemma 4.3).
The crucial role in our approach is played by the Lieb–Thirring estimates for $M_\gamma$ established in [11, 12]. In particular, those estimates are responsible for the region of values µ, ħ, in which the asymptotics (1.4) is valid. Precisely, by [11, 12],

$$M_\gamma \le C(\hbar^{-3} + \mu^{3/2}\hbar^{-3/2}), \quad \text{for } d = 3; \tag{1.9}$$

$$M_\gamma \le C(\hbar^{-d} + \mu\hbar^{-d+1}), \quad \text{for constant direction } B\text{'s},\ d = 2, 3. \tag{1.10}$$

(Here for the sake of discussion we ignore the dependence of these estimates on B and V and refer to (2.14), (2.15) for precise formulation). The requirement µħ ≤ const in the case d = 3 (Theorem 2.1) is due to the fact that the estimate (1.9) shows the same behaviour in µ, ħ as the asymptotics (1.4) only for µħ ≤ const. On the contrary, for constant direction fields (Theorem 2.2), the estimate (1.10) has the “correct” order for all µ and ħ (cf. (1.4), (1.7)). The discrepancy between (1.9) and (1.10) is connected with the fundamental property that the three-dimensional Pauli operator with a non-constant direction field may have eigenfunctions with eigenvalue zero (zero modes) (see [10]), which makes it impossible for (1.10) to hold in general. The authors of [6, 7] have developed a sophisticated method, which provides a better control over the zero modes and consequently, a Lieb–Thirring estimate having better orders of µ and ħ than (1.9), which allows for the weaker condition $\mu\hbar^3 \to 0$ for the validity of (1.4). Without going into details we mention that the method of [6, 7] is concerned with the geometry of the field lines and requires more restrictive conditions on B and V than in our paper. In particular, in [6, 7] the magnetic field is supposed to be non-vanishing and bounded from above. Referring to the papers quoted above for further discussion of magnetic Lieb–Thirring inequalities and references, we mention however the most recent version of
the Lieb–Thirring inequality, obtained in [1]. It is similar to the estimate (2.14) below and reduces to (1.9) as far as the behaviour in ħ and µ is concerned.

The paper is organized as follows. Section 2 contains the precise statements of main results. In Sect. 3 we prove the asymptotics (1.4) for µħ ≤ const for the Pauli operator in a cube, assuming that V = const. Section 4 is concerned with the operator on a torus and provides necessary auxiliary estimates used in Sect. 5 for proving the asymptotics (1.4) in a cube for fields of constant direction. The proof of main results is completed in Sect. 6. Appendix A contains some known information regarding the problem on a torus (see [4]). Some technical material is collected in Appendix B.

In conclusion we make some notational conventions. Sometimes we reflect the dependence of $M_\gamma$, $B_\gamma$ and other quantities occurring in the paper on the parameters of the problem (such as ħ, a or V, etc). However, whenever it is convenient, we omit some or all of them from the notation. We may write, for example, $B_\gamma(b)$ or $B_\gamma(V)$ instead of $B_\gamma(b, V, \Omega)$. The letters C, c with or without indices denote various positive constants, whose value is of no importance and may change from line to line. By int Ω we denote the interior of a set $\Omega \subset \mathbb{R}^d$. a ≈ b means that there exists a positive constant C such that $C^{-1} \le a/b \le C$. For any measurable real-valued function f we denote by $f_+$ and $f_-$ its positive and negative parts respectively: $f_\pm = (|f| \pm f)/2$. This convention does not apply to operators (cf. (2.1) below). The points of $\mathbb{R}^d$ for both d = 2 and d = 3 as a rule are denoted by bold lower case letters: x, y, . . . . When we want to emphasize the difference between the two- and three-dimensional case we write x, y, . . . for two-dimensional variables. In particular this notation is used when representing x ∈ $\mathbb{R}^3$ as x = (x, $x_3$), x ∈ $\mathbb{R}^2$, $x_3$ ∈ $\mathbb{R}$. We shall work with two underlying Hilbert spaces: $h = h(\Omega) = L^2(\Omega)$ and H = h ⊕ h.
As a rule, elements of h and H are denoted by u, v and f, g respectively. It will be also convenient to introduce the notation $d = d(\Omega) = C_0^\infty(\Omega)$, D = d ⊕ d. Operators on H are usually denoted by blackboard bold letters, for instance $\mathbb{P}$, $\mathbb{H}$, etc. Furthermore, if W is an operator on h, then $\mathbb{W}$ denotes the operator W ⊗ I, where I is the identity operator in $\mathbb{C}^2$. The only exceptions to this convention are the letters $\mathbb{N}$ and $\mathbb{R}$, $\mathbb{C}$, which are reserved for the set of all positive integers and the fields of real and complex numbers respectively. For a self-adjoint operator T, $R(z, T) = (T - z)^{-1}$ denotes its resolvent. If T is semi-bounded, then T[·, ·] denotes the closed form associated with T, with the domain D[T].
2. Conditions and Results
2.1. Basic definitions. First of all we give a precise definition of the operator (1.1) with the Dirichlet boundary conditions. We use the choice (1.2) of the Pauli matrices. Let $\Omega \subset \mathbb{R}^d$ be an open set (possibly unbounded). Let $a = (a_1, \dots, a_d) \in L^2_{\mathrm{loc}}(\Omega)$ be a magnetic vector-potential with real-valued components. Define on d(Ω) the operators

$$Q_\pm = \Pi_1 \pm i\Pi_2, \qquad \Pi_k = -i\hbar\partial_k - \mu a_k, \quad k = 1, 2, 3. \tag{2.1}$$
Throughout the paper each statement containing the double subscript “±” must be understood separately for the sign “+” and “−”. The operators $Q_\pm$, $\Pi_k$ are closable on
d(Ω), since $\Pi_k$ are symmetric and $Q_\pm \subset Q_\mp^*$. We use the same letters $Q_+$, $Q_-$, $\Pi_k$ for their closures. Consequently, the symmetric operator

$$T = \sum_k \sigma_k\Pi_k = \begin{pmatrix} \Pi_3 & Q_- \\ Q_+ & -\Pi_3 \end{pmatrix} \tag{2.2}$$

acting in H(Ω) is also closed. We define the Pauli operator $P_0$ as the operator associated with the closed quadratic form $P_0[f, f] = \|Tf\|^2$ with the domain $D[P_0] = D(T)$. This implies that $P_0 = T^*T$. In the same way we define in h(Ω) the usual Schrödinger operator $H_a$ with the magnetic potential a: as an operator associated with the form $H_a[u, v] = \sum_k \langle \Pi_k u, \Pi_k v\rangle$. As a rule we assume that the vector-potential $a \in L^2_{\mathrm{loc}}(\Omega)$ is such that the magnetic field defined in the distributional sense as

$$B(x) = (B_1, B_2, B_3) = \operatorname{curl} a(x), \quad B_l = \partial_k a_j - \partial_j a_k$$

(for any cyclic permutation l, k, j of the numbers 1, 2, 3), belongs to $L^\infty_{\mathrm{loc}}(\Omega)$. It is easy to check using (2.1) that

$$P_0 = \begin{pmatrix} A_+ & 0 \\ 0 & A_- \end{pmatrix} - \mu\hbar\begin{pmatrix} 0 & B_1 - iB_2 \\ B_1 + iB_2 & 0 \end{pmatrix}, \qquad A_\pm = \begin{cases} Q_\pm^*Q_\pm, & d = 2; \\ \Pi_3^2 + Q_\pm^*Q_\pm, & d = 3. \end{cases} \tag{2.3}$$

Note the relation between $A_\pm$ and $H_a$:

$$A_\pm = H_a \mp \mu\hbar B_3 \tag{2.4}$$

in the sense of sesqui-linear forms on d. The perturbed operator is denoted by $P = P_0 + V$. The conditions under which this operator is well defined will be specified later. As mentioned in the Introduction, in the case of a magnetic field having a constant direction we establish the asymptotics of $M_\gamma$ in a larger region of values of the parameter µ than in the general case. We state the results for these two cases separately in Subsects. 2.2 and 2.3 below. The conditions on the potential V and the magnetic field B in both cases are chosen so as to ensure the validity of appropriate Lieb–Thirring estimates (see [11, 12]).

2.2. General fields, d = 3. Let ℓ(x), b(x) be two positive functions such that $b \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3)$. Define $D^{(3)}(x) = \{y : |x - y| \le \ell(x)\}$. Assume that

$$\begin{cases} |\ell(x) - \ell(y)| \le \varrho|x - y|, \quad \varrho < 1, \\ c \le b(y)b(x)^{-1} \le C, \quad y \in D^{(3)}(x), \\ b(x)\ell(x)^2 \ge c. \end{cases} \tag{2.5}$$

Then we suppose that

$$|B(x)| \le b(x), \quad x \in \Omega. \tag{2.6}$$

As shown in [12], if for some p > 3/2 the condition

$$\sup_{x\in\Omega}\int_{D^{(3)}(x)\cap\Omega} |V(y)|^p\big(b(y)^{\frac32} + 1\big)\,dy < \infty \tag{2.7}$$
is fulfilled, then the operator $P = P(V) = P_0 + V$ is well-defined as a form-sum on the domain $D[P_0]$. Let the coefficient $B_\gamma$ be defined in (1.5), (1.6).

Theorem 2.1. Let the magnetic field B be a continuous function obeying the condition (2.6) for some function b(x) satisfying (2.5), and let (2.7) be fulfilled. Suppose that $V_- \in L^{\gamma+3/2}(\Omega)$, $V_-^{\gamma}b^{3/2} \in L^1(\Omega)$ for some γ ≥ 1. Then the asymptotics

$$\lim\big(\hbar^d M_\gamma(\hbar, \mu B, V, \Omega) - B_\gamma(\mu\hbar|B|, V, \Omega)\big) = 0, \quad \hbar \to 0, \tag{2.8}$$

holds uniformly in µħ ≤ C.

2.3. Constant direction fields. If the magnetic field has a constant direction, then under an appropriate choice of coordinates it can be written as B(x) = (0, 0, B(x)), where x = (x, $x_3$), x = ($x_1$, $x_2$). One can find a gauge such that $a = (a_1, a_2, 0)$ with $a_k = a_k(x)$, k = 1, 2, so that $B = \partial_1 a_2 - \partial_2 a_1$ (which is always the case for d = 2). In view of (2.3) in these coordinates the operator $P_0$ is diagonal. The conditions on B will look the same for two- and three-dimensional situations. Namely, let ℓ(x), b(x) be two positive functions such that $b \in L^\infty_{\mathrm{loc}}(\mathbb{R}^2)$. Define $D^{(2)}(x) = \{y : |x - y| \le \ell(x)\}$. Assume that

$$\begin{cases} |\ell(x) - \ell(y)| \le \varrho|x - y|, \quad \varrho < 1, \\ c \le b(y)b(x)^{-1} \le C, \quad y \in D^{(2)}(x), \\ b(x)\ell(x)^2 \ge c. \end{cases} \tag{2.9}$$

Then we suppose that

$$|B(x)| \le b(x), \quad x \in \Omega. \tag{2.10}$$

As shown in [11, 12], if for some p > d/2 the condition

$$\sup_{x\in\Omega}\int_{D^{(2)}(x)\cap\Omega} |V(y)|^p\big(b(y) + 1\big)\,dy < \infty, \quad d = 2, \tag{2.11}$$

or

$$\sup_{x\in\Omega}\int_{(D^{(2)}(x)\times[x_3, x_3+1))\cap\Omega} |V(y)|^p\big(b(y) + 1\big)\,dy < \infty, \quad d = 3, \tag{2.12}$$

is fulfilled, then the operator $P = P_0 + V$ is well-defined as a form-sum on the domain $D[P_0]$.

Theorem 2.2. Let a constant direction magnetic field B be a continuous function obeying the condition (2.10) with a function b satisfying (2.9), and let (2.11) (for d = 2) or (2.12) (for d = 3) be fulfilled. Suppose that

$$V_- \in L^{\gamma+d/2}(\Omega), \qquad V_-^{\gamma+(d-2)/2}\,b \in L^1(\Omega),$$

for some γ ≥ 1 (for d = 2) or γ > 1/2 (for d = 3). Then the asymptotics

$$\lim\big(\mu\hbar^{1-d} + \hbar^{-d}\big)^{-1}\big|M_\gamma(\hbar, \mu B, V, \Omega) - \hbar^{-d}B_\gamma(\mu\hbar|B|, V, \Omega)\big| = 0, \quad \hbar \to 0, \tag{2.13}$$

holds uniformly in µ ≥ 0.
2.4. Lieb–Thirring inequalities. The derivation of the asymptotics (2.8), (2.13) is based on the following estimates on $M_\gamma$ established in [11, 12]:

Proposition 2.3.
1. Let the magnetic field B obey the condition (2.6) and let (2.7) be fulfilled. Suppose that $V_- \in L^{\gamma+3/2}(\Omega)$, $V_-^{\gamma}b^{3/2} \in L^1(\Omega)$ for some γ ≥ 1. Then

$$M_\gamma(\hbar, \mu B, V, \Omega) \le C\hbar^{-3}\int V_-(x)^{\gamma+\frac32}\,dx + C'\mu^{\frac32}\hbar^{-\frac32}\int b(x)^{\frac32}V_-(x)^{\gamma}\,dx. \tag{2.14}$$

2. Let B have a constant direction, obey the condition (2.10) and let (2.11) (for d = 2) or (2.12) (for d = 3) be fulfilled. Suppose that

$$V_- \in L^{\gamma+d/2}(\Omega), \qquad V_-^{\gamma+(d-2)/2}\,b \in L^1(\Omega),$$

for some γ ≥ 1 (for d = 2) or γ > 1/2 (for d = 3). Then

$$M_\gamma(\hbar, \mu B, V, \Omega) \le C\hbar^{-d}\int V_-(x)^{\gamma+\frac d2}\,dx + C'\mu\hbar^{-d+1}\int b(x)V_-(x)^{\gamma+\frac{d-2}{2}}\,dx. \tag{2.15}$$

Initially, these estimates were proved for $\Omega = \mathbb{R}^d$, d = 2, 3, which immediately implies (2.14), (2.15) for any open set Ω in view of the variational principle by setting B = 0, V = 0 outside Ω. We emphasize in this connection that the functions b, ℓ are defined in the entire space (and not only in Ω!) to ensure the applicability of the results of [11, 12]. Note also that for $B \in L^\infty$ the estimates take a simpler form, which follows from (2.14), (2.15) by choosing

$$b = \operatorname*{ess\,sup}_{x\in\Omega}|B(x)|, \qquad \ell(x) = 1.$$

Precisely, for d = 3 we have

$$M_\gamma(\hbar, \mu B, V, \Omega) \le C\hbar^{-3}\int V_-(x)^{\gamma+\frac32}\,dx + C'\mu^{\frac32}\hbar^{-\frac32}b^{\frac32}\int V_-(x)^{\gamma}\,dx, \quad \gamma \ge 1. \tag{2.16}$$

If the field has a constant direction, then

$$M_\gamma(\hbar, \mu B, V, \Omega) \le C\hbar^{-d}\int V_-(x)^{\gamma+\frac d2}\,dx + C'\mu\hbar^{-d+1}b\int V_-(x)^{\gamma+\frac{d-2}{2}}\,dx. \tag{2.17}$$

The constants C, C′ in (2.16) and (2.17) are the same as in (2.14) and (2.15) respectively and do not depend on Ω.

2.5. Properties of $M_\gamma$ and $B_\gamma$. Here we list some useful properties of the quantity (1.3). Below the parameters ħ, µ and the field a are fixed and omitted from notation. Introduce
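the discrete spectrum counting function: $N(\lambda) = \#\{\lambda_k : \lambda_k < -\lambda\}$, λ ≥ 0. Note that $M_0 = N(0)$. The sum (1.3) can be represented as

$$M_\gamma(V) = -\int_0^\infty \lambda^\gamma\,dN(\lambda, V) = \gamma\int_0^\infty \lambda^{\gamma-1}N(\lambda, V)\,d\lambda. \tag{2.18}$$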
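The representation (2.18) can be checked on any finite list of eigenvalues, since N(λ) is then a step function and the integral can be evaluated exactly on each interval where N is constant. The following sketch assumes γ > 0 and uses a made-up list of eigenvalues; the function names are hypothetical.

```python
def moment_direct(eigs, gamma):
    """M_gamma as the sum of |lambda_k|^gamma over negative eigenvalues."""
    return sum(abs(e) ** gamma for e in eigs if e < 0)

def counting(eigs, lam):
    """N(lam) = number of eigenvalues (with multiplicity) below -lam."""
    return sum(1 for e in eigs if e < -lam)

def moment_from_counting(eigs, gamma):
    """gamma * int_0^inf lam^(gamma-1) N(lam) dlam for gamma > 0,
    integrated exactly between consecutive jump points of N."""
    jumps = sorted({-e for e in eigs if e < 0})
    total, left = 0.0, 0.0
    for right in jumps:
        midpoint = 0.5 * (left + right)
        # int_left^right gamma * lam^(gamma-1) dlam = right^gamma - left^gamma
        total += counting(eigs, midpoint) * (right ** gamma - left ** gamma)
        left = right
    return total

# Hypothetical spectrum: three distinct negative levels, one of them double.
eigs = [-3.0, -1.5, -1.5, -0.2, 0.4]
for gamma in (0.5, 1.0, 2.0):
    print(gamma, moment_direct(eigs, gamma), moment_from_counting(eigs, gamma))
```

Each negative eigenvalue $\lambda_k$ contributes $\int_0^{|\lambda_k|}\gamma\lambda^{\gamma-1}\,d\lambda = |\lambda_k|^\gamma$, which is the content of (2.18).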
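Furthermore, using the Weyl inequalities

$$N(\lambda, V_1 + V_2) \le N\big(\lambda, (1-\zeta)^{-1}V_1\big) + N\big(\lambda, \zeta^{-1}V_2\big),$$
$$N(\lambda, V_1 + V_2) \ge N\big(\lambda, (1+\zeta)^{-1}V_1\big) - N\big(\lambda, -\zeta^{-1}V_2\big),$$

valid for any ζ ∈ (0, 1), one checks with the help of (2.18) that

$$\begin{cases} M_\gamma(V_1 + V_2) \le M_\gamma\big((1-\zeta)^{-1}V_1\big) + M_\gamma\big(\zeta^{-1}V_2\big), \\ M_\gamma(V_1 + V_2) \ge M_\gamma\big((1+\zeta)^{-1}V_1\big) - M_\gamma\big(-\zeta^{-1}V_2\big). \end{cases} \tag{2.19}$$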
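We conclude this section with some remarks on the coefficient $B_\gamma$. Notice, first of all that, similarly to (2.18),

$$B_\gamma(V) = \gamma\int_0^\infty \lambda^{\gamma-1}B_0(V + \lambda)\,d\lambda. \tag{2.20}$$

Now denote σ = γ + (d − 2)/2. The coefficient $B_\gamma$ can be estimated with the help of the quantities

$$\begin{cases} M_{d,\gamma} = M_{d,\gamma}(b, V, \Omega) = \int_\Omega b(x)|V(x)|^{\sigma}\,dx, \quad b(x) \ge 0, \\ N_{d,\gamma} = N_{d,\gamma}(V, \Omega) = \int_\Omega |V(x)|^{\sigma+1}\,dx, \end{cases} \tag{2.21}$$

as follows:

$$B_\gamma(b, V, \Omega) \le \beta^{(d)}_\gamma M_{d,\gamma}(b, V_-, \Omega) + 2\beta^{(d)}_\gamma\int_\Omega b(x)\int_0^\infty \big[2tb(x) + V(x)\big]_-^{\sigma}\,dt\,dx \le CM_{d,\gamma}(b, V_-, \Omega) + C'N_{d,\gamma}(V_-, \Omega), \tag{2.22}$$

with some constants C, C′ depending on d, γ. Furthermore, $B_\gamma$ is continuous in b and V for σ > 0. Precisely, for any functions V, $V_1$, $V_2$ and non-negative functions b, $b_1$, $b_2$ one has if σ ≤ 1:

$$|B_\gamma(b, V_1) - B_\gamma(b, V_2)| \le CM_{d,\gamma}(b, V_1 - V_2) + CN_{d,\gamma}(V_1 - V_2)^{\frac{1}{\sigma+1}}\Big(N_{d,\gamma}(V_1)^{\frac{\sigma}{\sigma+1}} + N_{d,\gamma}(V_2)^{\frac{\sigma}{\sigma+1}}\Big), \tag{2.23}$$

and

$$|B_\gamma(b_1, V) - B_\gamma(b_2, V)| \le C\Big[M_{d,\gamma}(|b_1 - b_2|, V) + M_{d,\gamma}(|b_1 - b_2|, V)^{\frac12}N_{d,\gamma}(V)^{\frac12} + M_{d,\gamma}(|b_1 - b_2|, V)^{\frac{\sigma}{2}}N_{d,\gamma}(V)^{1-\frac{\sigma}{2}} + M_{d,\gamma}(|b_1 - b_2|, V)^{\sigma}N_{d,\gamma}(V)^{1-\sigma}\Big]. \tag{2.24}$$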
If σ > 1, then

$$|B_\gamma(b, V_1) - B_\gamma(b, V_2)| \le CM_{d,\gamma}(b, V_1 - V_2)^{\frac{1}{\sigma}}\Big(M_{d,\gamma}(b, V_1)^{\frac{\sigma-1}{\sigma}} + M_{d,\gamma}(b, V_2)^{\frac{\sigma-1}{\sigma}}\Big) + C'N_{d,\gamma}(V_1 - V_2)^{\frac{1}{\sigma+1}}\Big(N_{d,\gamma}(V_1)^{\frac{\sigma}{\sigma+1}} + N_{d,\gamma}(V_2)^{\frac{\sigma}{\sigma+1}}\Big) \tag{2.25}$$

and

$$|B_\gamma(b_1, V) - B_\gamma(b_2, V)| \le CM_{d,\gamma}(|b_1 - b_2|, V) + C'M_{d,\gamma}(|b_1 - b_2|, V)^{\frac12}N_{d,\gamma}(V)^{\frac12}. \tag{2.26}$$

The proof of the above estimates is given in Appendix B.

3. Asymptotics in a Cube. Moderate Fields

Here we establish the asymptotics of $M_\gamma$ for a cube $\Omega = [0, R)^d$ as ħ → 0, µħ ≤ C for a constant potential V and arbitrary magnetic field B. The coefficient $B_\gamma$ is defined in (1.5).

Lemma 3.1. Let $\Omega = [0, R)^d$ be a cube such that V = const inside. Suppose that the magnetic field B is continuous in Ω and µħ ≤ C. Then the asymptotics (2.8) holds for all γ ≥ 1. If d = 3 and the magnetic field B has a constant direction, then (2.8) holds for all γ > 1/2.

As was explained in the Introduction, the idea of the proof is borrowed from the work [2]. In particular, we use the following estimates:

Proposition 3.2. Let $\Omega = [0, r)^d$, r > 0, B be homogeneous, i.e. B = const, and V = const. Then there exists a constant C independent of B, V, such that for any δ ∈ (0, 1/2), γ ≥ 0,

$$M_\gamma(\hbar, \mu B, V, \Omega) \le \hbar^{-d}B_\gamma(\mu\hbar|B|, V, \Omega); \tag{3.1}$$

$$M_\gamma(\hbar, \mu B, V, \Omega) \ge \hbar^{-d}(1 - \delta)^d B_\gamma(\mu\hbar|B|, V + C\hbar^2\delta^{-2}r^{-2}, \Omega). \tag{3.2}$$
For $\gamma = 0$ (i.e. for the distribution function) this proposition immediately follows from the results of [2] with the help of the representation (2.3) and (2.4). This entails (3.1), (3.2) for all $\gamma > 0$ due to the relations (2.18), (2.20).

Proof of Lemma 3.1. As in [2], we cut the cube $\Lambda = [0, R)^d$ into a collection of smaller cubes and in each of them we approximate the magnetic field by a homogeneous one. This will enable us to use Proposition 3.2. First we implement this strategy in a cube $\Lambda_0 = [-r/2, r/2)^d$ with the side $r > 0$ to be specified later on. Choose the gauge $a = (a_1, a_2, 0)$ with
\[
a_1(x) = -\frac12 \int_0^{x_2} B_3(x_1, t, 0)\,dt + \int_0^{x_3} B_2(x_1, x_2, t)\,dt,
\qquad
a_2(x) = \frac12 \int_0^{x_1} B_3(t, x_2, 0)\,dt - \int_0^{x_3} B_1(x_1, x_2, t)\,dt.
\]
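Using the fact that $\operatorname{div} B = 0$, one checks easily that $\operatorname{curl} a = B$. Now approximate $a$ by the following linear vector-potential:
\[
\mathring{a} = (\mathring{a}_1, \mathring{a}_2, 0), \qquad
\mathring{a}_1(x) = -\frac12 B_3(0)x_2 + B_2(0)x_3, \qquad
\mathring{a}_2(x) = \frac12 B_3(0)x_1 - B_1(0)x_3.
\]
Then, clearly,
\[
\max_{x \in \Lambda_0} |a(x) - \mathring{a}(x)| \le C r \sigma_r, \qquad \sigma_r = \max_{x, y \in \Lambda_0} |B(x) - B(y)|.
\tag{3.3}
\]
Then direct calculation, using (3.3), shows that for any $\epsilon \in (0, 1)$,
\[
\|T(\mu a, \Lambda_0)f\|^2 \le (1 + \epsilon)\|T(\mu\mathring{a}, \Lambda_0)f\|^2 + C(1 + \epsilon^{-1})\mu^2 r^2 \sigma_r^2 \|f\|^2,
\tag{3.4}
\]
\[
\|T(\mu a, \Lambda_0)f\|^2 \ge (1 - \epsilon)\|T(\mu\mathring{a}, \Lambda_0)f\|^2 - C(\epsilon^{-1} - 1)\mu^2 r^2 \sigma_r^2 \|f\|^2.
\tag{3.5}
\]
Pick $r \approx A\sqrt{\hbar(1 + \mu)^{-1}}$ with some number $A > 0$. Then (3.4), (3.5) provide the following estimates:
\[
M_\gamma(\hbar, \mu B, V, \Lambda_0) \ge (1 + \epsilon)^{\gamma} M_\gamma(\hbar, \mu B(0), V', \Lambda_0), \qquad
V' = \bigl(V + C(\epsilon^{-1} + 1)A^2 \mu\hbar\sigma_r^2\bigr)(1 + \epsilon)^{-1};
\tag{3.6}
\]
\[
M_\gamma(\hbar, \mu B, V, \Lambda_0) \le (1 - \epsilon)^{\gamma} M_\gamma(\hbar, \mu B(0), V'', \Lambda_0), \qquad
V'' = \bigl(V - C(\epsilon^{-1} - 1)A^2 \mu\hbar\sigma_r^2\bigr)(1 - \epsilon)^{-1}.
\tag{3.7}
\]
Taking into account that $\mu\hbar \le C$, we obtain from Proposition 3.2 that
\[
M_\gamma(\hbar, \mu B, V, \Lambda_0) \ge (1 + \epsilon)^{\gamma}(1 - \delta)^{d}\hbar^{-d} B_\gamma(\mu\hbar|B(0)|, V' + C(A\delta)^{-2}, \Lambda_0),
\tag{3.8}
\]
\[
M_\gamma(\hbar, \mu B, V, \Lambda_0) \le (1 - \epsilon)^{\gamma}\hbar^{-d} B_\gamma(\mu\hbar|B(0)|, V'', \Lambda_0),
\tag{3.9}
\]
for any $\delta \in (0, 1/2)$.

Now we shall proceed to the proof of (2.8) for the cube $\Lambda = [0, R)^d$. To that end cut $\Lambda$ into a collection of equal cubes of size
\[
r \approx A\sqrt{\hbar(1 + \mu)^{-1}}, \qquad Rr^{-1} \in \mathbb{N}.
\]
Denote these cubes by $\Lambda_j$, $j = 1, \dots$. Let $x^{(j)}$ be the center of $\Lambda_j$.

Lower bound. It is clear that $P(\Lambda) \le \sum_j P(\Lambda_j)$, which implies in view of (3.8) that
\[
M_\gamma(\Lambda) \ge \sum_j M_\gamma(\Lambda_j)
\ge (1 + \epsilon)^{\gamma}(1 - \delta)^{d}\hbar^{-d} \sum_j B_\gamma(\mu\hbar|B(x^{(j)})|, V' + C(A\delta)^{-2}, \Lambda_j),
\]
where $V'$ is defined as in (3.6) with
\[
\sigma_r = \max_{x, y \in \Lambda:\, |x - y| \le r} |B(x) - B(y)|.
\tag{3.10}
\]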
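Since $B$ is continuous, taking into account (2.23), (2.24), (2.25) and (2.26), we conclude that
\[
\liminf \Bigl[ \hbar^{d} M_\gamma(\Lambda) - (1 + \epsilon)^{\gamma}(1 - \delta)^{d} B_\gamma\bigl(\mu\hbar|B|, V(1 + \epsilon)^{-1} + C(A\delta)^{-2}, \Lambda\bigr) \Bigr] \ge 0.
\]
The parameters $\epsilon$, $\delta$, $A$ are arbitrary and independent of each other, so that, referring again to (2.23), (2.25), we immediately obtain the lower bound
\[
\liminf \bigl[ \hbar^{d} M_\gamma(\Lambda) - B_\gamma(\mu\hbar|B|, V, \Lambda) \bigr] \ge 0, \qquad \hbar \to 0, \ \mu\hbar \le C,
\tag{3.11}
\]
for all $\gamma > 0$.

Upper bound. To obtain an upper bound similar to (3.11), we introduce the following notation:
\[
S = \Lambda \setminus \cup_j \operatorname{int} \Lambda_j, \qquad S_\delta = \{x \in \mathbb{R}^d : \operatorname{dist}\{x, S\} < \delta r\}
\]
with a $\delta \in (0, 1/2)$. Proceeding à la Colin de Verdière [2], we construct a partition of unity $\psi_k$, $k \ge 0$, such that $\psi_0 \in C_0^\infty(S_\delta)$, $\psi_j \in C_0^\infty(\Lambda_j)$, $j \ge 1$, and
\[
\sum_k \psi_k^2(x) = 1, \ x \in \Lambda, \qquad |\nabla\psi_k| \le C(\delta r)^{-1}.
\]
One can easily see that
\[
P_0(S_\delta)[\psi_0 f, \psi_0 f] + \sum_{j \ge 1} P_0(\Lambda_j)[\psi_j f, \psi_j f] \le P_0(\Lambda)[f, f] + C(A\delta)^{-2}\|f\|^2,
\]
for any $f \in D(\Lambda)$, so that by virtue of the variational principle,
\[
M_\gamma(V, \Lambda) \le \sum_{j \ge 1} M_\gamma(V - C(A\delta)^{-2}, \Lambda_j) + M_\gamma(V - C(A\delta)^{-2}, S_\delta).
\]
According to (3.9),
\[
M_\gamma(V, \Lambda) \le (1 - \epsilon)^{\gamma}\hbar^{-d} \sum_{j \ge 1} B_\gamma(\mu\hbar|B(x^{(j)})|, V'' - C(A\delta)^{-2}, \Lambda_j) + M_\gamma(V - C(A\delta)^{-2}, S_\delta)
\tag{3.12}
\]
with the potential $V''$ defined in (3.7), where $\sigma_r$ is as in the proof of the lower bound. For $\mu\hbar \le C$, according to (2.16), (2.15) one has
\[
\hbar^{d} M_\gamma(V - C(A\delta)^{-2}, S_\delta) \le C\Bigl[ (|V''| + C(A\delta)^{-2})^{\gamma} + (|V''| + C(A\delta)^{-2})^{\gamma + \frac{d}{2}} \Bigr] |S_\delta|.
\]
Since $|S_\delta| \le C\delta$, it is clear that the lim sup of the r.h.s. as $A \to \infty$ and $\delta \to 0$ equals zero. Thus using, as in the proof of the lower bound, (2.23), (2.24), (2.25), (2.26) and taking into account the arbitrariness of $\epsilon$, $A$ and $\delta$, we conclude from (3.12) that
\[
\limsup \bigl[ \hbar^{d} M_\gamma(\Lambda) - B_\gamma(\mu\hbar|B|, V, \Lambda) \bigr] \le 0, \qquad \hbar \to 0, \ \mu\hbar \le C.
\]
Together with (3.11) this bound leads to the required asymptotics (2.8).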
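The choice of the scale $r$ in the proof of Lemma 3.1 can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the function names `choose_side` and `error_terms` and all sample values are hypothetical and do not come from the paper. It picks a side $r = R/n$ with integer $n$ not exceeding $A\sqrt{\hbar/(1+\mu)}$ and evaluates the two error factors that the proof has to control: $\mu^2 r^2$, which multiplies $\sigma_r^2$ in (3.4), (3.5), and the shift $\hbar^2\delta^{-2}r^{-2}$ of the potential in (3.2).

```python
import math


def choose_side(R, hbar, mu, A):
    """Return (r, n) with r = R / n, n an integer, and r <= A * sqrt(hbar / (1 + mu))."""
    r0 = A * math.sqrt(hbar / (1.0 + mu))  # target scale used in the proof
    n = max(1, math.ceil(R / r0))          # number of small cubes along one edge
    return R / n, n


def error_terms(R, hbar, mu, A, delta):
    """Return (r, n, gauge_term, localization_term) for the chosen cube side.

    gauge_term        = (mu * r)^2, the factor in front of sigma_r^2 in (3.4), (3.5)
    localization_term = hbar^2 / (delta * r)^2, the shift of V in (3.2)
    """
    r, n = choose_side(R, hbar, mu, A)
    gauge_term = (mu * r) ** 2
    localization_term = hbar ** 2 / (delta * r) ** 2
    return r, n, gauge_term, localization_term


if __name__ == "__main__":
    r, n, g, loc = error_terms(R=1.0, hbar=1e-4, mu=50.0, A=3.0, delta=0.25)
    print(r, n, g, loc)
```

Since $r \le A\sqrt{\hbar/(1+\mu)}$, the first factor is at most $A^2\mu\hbar$. If in addition $R \ge A\sqrt{\hbar/(1+\mu)}$, then $r$ is at least half of the target scale, so the second factor is at most $4\hbar(1+\mu)(A\delta)^{-2}$, which stays bounded by a multiple of $(A\delta)^{-2}$ when $\mu\hbar \le C$ and $\hbar \le 1$.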
4. Operator on a Torus

In this section we study the spectrum of the Pauli operator on a torus. The magnetic field is supposed to have a constant direction. We consider two- and three-dimensional cases simultaneously. The parameters $\hbar$, $\mu$ are kept constant throughout the section. Let $\mathbf{B} = (0, 0, B)$, $B \in L^\infty(\mathbb{R}^2)$ be a periodic magnetic field with constant direction, i.e. $B(x) = B(x_1, x_2)$ and for some positive numbers $T_1$, $T_2$ one has:
\[
B(x_1, x_2) = B(x_1 + T_1, x_2) = B(x_1, x_2 + T_2).
\tag{4.1}
\]
In the case $d = 3$ we fix also an arbitrary number $T_3 > 0$ and denote by $D^{(d)} = \times_{k=1}^{d} [0, T_k) \subset \mathbb{R}^d$ the fundamental domain of the lattice $\Gamma$ with vertices at the points $(T_1 m_1, \dots, T_d m_d)$, $m_j \in \mathbb{Z}$. We make a specific choice of the gauge for the vector-potential corresponding to $\mathbf{B}$. Namely, we assume that
\[
a = (-\partial_2\phi, \partial_1\phi, 0),
\tag{4.2}
\]
where the function $\phi$ is a solution of the equation
\[
\Delta\phi = B
\tag{4.3}
\]
chosen as follows. Split the magnetic field into two components: $B = B_0 + B_1$, where $B_0 = \mathrm{const}$ is a homogeneous field and $B_1$ is a periodic field with zero mean value. Then we choose $\phi$ in the form $\phi = \phi_0 + \phi_1$, where $\phi_1$ is a periodic solution (with periods $T_1$, $T_2$) of (4.3) with $B_1$ in the r.h.s. and $\phi_0$ is given by
\[
\phi_0 = \frac{B_0}{4}(\alpha x_1^2 + \beta x_2^2), \qquad \alpha + \beta = 2,
\]
with some real numbers $\alpha$, $\beta$. This leads to the following gauge for the magnetic potential:
\[
a_1(x) = -\partial_2\phi_1 - \beta B_0 x_2/2, \qquad a_2(x) = \partial_1\phi_1 + \alpha B_0 x_1/2,
\tag{4.4}
\]
which implies that
\[
\begin{cases}
a_1(x_1 + T_1, x_2) = a_1(x_1, x_2), & a_1(x_1, x_2 + T_2) = a_1(x_1, x_2) - \beta B_0 T_2/2,\\
a_2(x_1 + T_1, x_2) = a_2(x_1, x_2) + \alpha B_0 T_1/2, & a_2(x_1, x_2 + T_2) = a_2(x_1, x_2).
\end{cases}
\tag{4.5}
\]
With the help of these relations one easily sees that the operators $Q_\pm$ (see (2.1) for definition) commute with the so-called magnetic translations:
\[
\begin{cases}
(\tau_1 u)(x_1, x_2, x_3) = u(x_1 + T_1, x_2, x_3)\exp\bigl(-i\mu\alpha B_0 T_1 x_2/(2\hbar)\bigr),\\
(\tau_2 u)(x_1, x_2, x_3) = u(x_1, x_2 + T_2, x_3)\exp\bigl(i\mu\beta B_0 T_2 x_1/(2\hbar)\bigr).
\end{cases}
\tag{4.6}
\]
If the flux of $B$ across $D^{(2)}$ is an integer, i.e.
\[
\Phi = \frac{1}{2\pi}\int_{D^{(2)}} B(x)\,dx, \qquad \mu\hbar^{-1}\Phi = N \in \mathbb{Z},
\tag{4.7}
\]
then $\tau_1$ and $\tau_2$ mutually commute. In what follows we always assume that (4.7) is fulfilled. Consider the operators $Q_\pm$, $\Pi_j$ on the functions $u \in C^\infty(\mathbb{R}^d)$ obeying the (magnetic) periodic conditions:
\[
\tau_k u = u \ (k = 1, 2); \qquad u(x_1, x_2, x_3 + T_3) = u(x),
\tag{4.8}
\]
with the $L^2(D^{(d)})$-topology. In the case $d = 2$ the last condition in (4.8) is dropped. Since $Q_\pm$, $\Pi_j$ commute with $\tau_k$, we can view these operators as being well-defined on the torus $X^{(d)} = \mathbb{R}^d/\Gamma$. More precisely, we regard them as operators acting on the sections of a complex line bundle over $X^{(d)}$, the trivializations of the bundle being related by the rule (4.8). Hence we denote them as $Q_\pm(X^{(d)})$, $\Pi_j(X^{(d)})$. These operators are closable, since $\Pi_j$ are symmetric and $Q_\pm \subset Q_\mp^*$. Their closures are denoted by the same letters. Moreover, as the next lemma shows, $Q_+$ and $Q_-$ are mutually adjoint:

Lemma 4.1. Let $Q_\pm$ be as defined above with the magnetic potential (4.2). Then $Q_\pm^* = Q_\mp$.

Proof. For simplicity we assume $d = 2$ and denote $D = D^{(2)}$. It suffices to show that the operator
\[
Q = \begin{pmatrix} 0 & Q_- \\ Q_+ & 0 \end{pmatrix}
\]
is essentially self-adjoint on functions $f \in C^\infty(\mathbb{R}^2) \oplus C^\infty(\mathbb{R}^2)$ obeying the conditions (4.8). This amounts to proving that the equation $(Q \pm i)f^{(\pm)} = 0$ has the unique solution $f^{(\pm)} = 0$ in the class $f \in L^2_{\mathrm{loc}}(\mathbb{R}^2)$ with the condition (4.8). It is clear from the definition (4.2) of the magnetic potential that $a \in L^\infty_{\mathrm{loc}}(\mathbb{R}^2)$. Therefore it follows from the ellipticity of $Q$ that $f^{(\pm)} \in H^1_{\mathrm{loc}}(\mathbb{R}^2) \oplus H^1_{\mathrm{loc}}(\mathbb{R}^2)$. Integrating by parts and using (4.8), we get
\[
\|f^{(\pm)}\|^2_{L^2(D)} + \|Qf^{(\pm)}\|^2_{L^2(D)} = \|(Q \pm i)f^{(\pm)}\|^2_{L^2(D)} = 0.
\]
This entails $f^{(\pm)} = 0$.
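The role of the flux condition (4.7) for the magnetic translations (4.6) can be checked numerically. The Python sketch below is an illustration under assumptions: the helper names `make_translations` and `commutator_defect`, the smooth test function `u` and all parameter values are hypothetical. It composes $\tau_1$ and $\tau_2$ in both orders for a homogeneous field $B_0$ with $\alpha + \beta = 2$; by a direct computation from (4.6) the two compositions differ by the phase factor $\exp(i\mu B_0 T_1 T_2/\hbar) = \exp(2\pi i N)$, which equals one exactly when $N$ is an integer.

```python
import cmath
import math


def make_translations(T1, T2, B0, mu, hbar, alpha, beta):
    """Magnetic translations (4.6) acting on functions u(x1, x2)."""
    def tau1(u):
        return lambda x1, x2: u(x1 + T1, x2) * cmath.exp(
            -1j * mu * alpha * B0 * T1 * x2 / (2 * hbar))

    def tau2(u):
        return lambda x1, x2: u(x1, x2 + T2) * cmath.exp(
            1j * mu * beta * B0 * T2 * x1 / (2 * hbar))

    return tau1, tau2


def commutator_defect(N, T1=1.0, T2=1.5, mu=2.0, hbar=0.5, alpha=0.7,
                      point=(0.2, -0.4)):
    """|tau1 tau2 u - tau2 tau1 u| at one point, for the flux number N of (4.7)."""
    beta = 2.0 - alpha                              # alpha + beta = 2
    B0 = 2 * math.pi * N * hbar / (mu * T1 * T2)    # so that mu * Phi / hbar = N

    def u(x1, x2):
        # arbitrary smooth test function (an assumption of this sketch)
        return math.exp(-(x1 ** 2 + x2 ** 2) / 10) * (1 + 0.3j * x1 + x2)

    tau1, tau2 = make_translations(T1, T2, B0, mu, hbar, alpha, beta)
    return abs(tau1(tau2(u))(*point) - tau2(tau1(u))(*point))


if __name__ == "__main__":
    print(commutator_defect(3.0))   # integer flux
    print(commutator_defect(2.5))   # non-integer flux
```

The defect vanishes up to rounding for integer $N$ and is of the size of $|u|$ for half-integer $N$, in line with the statement following (4.7).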
In addition to the operators $A_\pm(D^{(d)})$ with the Dirichlet boundary conditions introduced in Sect. 2, we define by the formulae (2.3) with the conditions (4.8) the entries $A_\pm = A_\pm(X^{(d)})$ of the Pauli operator $P_0(X^{(d)})$ on the torus $X^{(d)}$. The Schrödinger operator $H_a(X^{(d)})$ is naturally defined as the operator associated with the closure of the form $\sum_{k=1}^{d} \|\Pi_k u\|^2$, where $u$ obeys (4.8). Using (4.5), one can check that similarly to (2.4)
\[
A_\pm(X^{(d)}) = H_a(X^{(d)}) \mp \mu\hbar B.
\tag{4.9}
\]
We will heavily use the following facts:

Proposition 4.2. Let (4.7) be fulfilled.
1. If $\pm N > 0$ (see (4.7)), then the point $\lambda = 0$ is an eigenvalue of the operator $A_\pm(X^{(d)})$ of multiplicity $\pm N$.
2. Let $d = 2$. Suppose that $\pm B(x) \ge \kappa$ with some number $\kappa > 0$. Then the spectrum of $A_\pm = A_\pm(X^{(2)})$ has a gap between $\lambda = 0$ and the rest of the spectrum of width at least $2\mu\hbar\kappa$.
Proof. The first statement of this proposition follows from the results of [4] (see Appendix A below). To prove statement 2 notice that by Lemma 4.1 one has $A_+ = Q_+^* Q_+$, $A_- = Q_+ Q_+^*$, which guarantees that the non-zero spectra of $A_+$ and $A_-$ coincide. On the other hand, by virtue of (4.9), under the condition $\pm B \ge \kappa$ one has $A_\mp \ge \pm 2\mu\hbar\kappa$.

Proposition 4.2 is crucial for proving

Lemma 4.3. Let the condition (4.7) be fulfilled and $|B| \ge \kappa$ for some $\kappa > 0$. Suppose that $V = \mathrm{const}$ in $D^{(d)}$ and $V_- \le 2\mu\hbar\kappa$. Then for any $\gamma \ge 0$,
\[
M_\gamma(\hbar, \mu B, V, X^{(2)}) = V_-^{\gamma}|N|, \qquad d = 2,
\tag{4.10}
\]
\[
\Bigl| M_\gamma(\hbar, \mu B, V, X^{(3)}) - 2\pi^2\beta_\gamma^{(3)} \frac{T_3}{\pi\hbar} V_-^{\gamma + \frac12}|N| \Bigr| \le V_-^{\gamma}|N|, \qquad d = 3,
\tag{4.11}
\]
with $\beta_\gamma^{(3)}$ defined in (1.6).

Proof. According to Proposition 4.2 and the equality (4.9), under the condition $V_- \le 2\mu\hbar\kappa$ the operator $P_0(X^{(2)}) + V$ has a single negative eigenvalue $\lambda = -V_-$, its multiplicity being equal to $|N|$. This provides (4.10). If $d = 3$, then the negative eigenvalues of the operator $P(X^{(3)})$ have the form $-V_- + (2\pi\hbar n/T_3)^2$, $n = 0, 1, \dots$. Their multiplicity equals $|N|$ for $n = 0$ and $2|N|$ for $n > 0$. Therefore
\[
|N|^{-1} M_\gamma = V_-^{\gamma} + 2\sum_{n=1}^{\infty} \bigl( (2\pi\hbar n/T_3)^2 - V_- \bigr)_-^{\gamma}.
\]
The r.h.s. does not exceed
\[
V_-^{\gamma} + 2\int_0^\infty \bigl( (2\pi\hbar t/T_3)^2 - V_- \bigr)_-^{\gamma}\,dt,
\]
which, after a change of variables, yields the desired upper bound in (4.11). The corresponding lower bound can be obtained similarly.

Let us estimate the number of negative eigenvalues of the operator $P(V, S)$ for an open subset $S \subset X^{(d)}$. (Recall that the notation $P(V, S)$ means the Pauli operator with the Dirichlet conditions on the boundary of the set $S$.) To this end fix a covering of $X^{(d)}$ by charts $\Lambda_k$, $k = 1, 2, \dots$, and a partition of unity $\psi_k \in C_0^\infty(\Lambda_k)$, associated with this covering, so that
\[
\sum_k \psi_k^2 = 1.
\]
Then
\[
\sum_k P(\Lambda_k \cap S)[\psi_k f, \psi_k f] \le P(S)[f, f] + C\hbar^2\|f\|^2, \qquad C = \max_{x, k} |\nabla\psi_k|^2.
\]
Therefore
\[
M_\gamma(V, S) \le \sum_k M_\gamma(V - C\hbar^2, \Lambda_k \cap S).
\]
Applying (2.15) to each summand in the r.h.s., we get
\[
M_\gamma(V, S) \le C'\mu\hbar^{-d+1} M_{d,\gamma}(b, V_- + C\hbar^2, S) + C''\hbar^{-d} N_{d,\gamma}(V_- + C\hbar^2, S),
\tag{4.12}
\]
with $M_{d,\gamma}$, $N_{d,\gamma}$ defined in (2.21), where $\gamma \ge 1$ ($d = 2$), $\gamma > 1/2$ ($d = 3$) and the constants $C'$, $C''$ depend on $\gamma$ and the choice of $\Lambda_k$.

5. Operator in a Cube. Strong Fields with a Constant Direction

Lemma 4.3 shows that for $|B| \ge \kappa > 0$ and sufficiently large $\mu\hbar$ only the lowest level $\lambda = 0$ of the unperturbed operator $P_0(X^{(d)})$ contributes to the negative spectrum of $P(V, X^{(d)})$. Our study of $M_\gamma(\Lambda)$ with a cube $\Lambda \subset \mathbb{R}^d$ for strong fields in this section will be based on this observation. To relate the quantity $M_\gamma(X^{(d)})$ to $M_\gamma(\Lambda)$ we are going to prove the following lemma in the spirit of Colin de Verdière [2].

Lemma 5.1. Let $\Lambda = \Lambda^{(d)} = [0, R)^d$, $d = 2, 3$, be a cube with the side $R > 0$ and let $V = \mathrm{const}$ in $\Lambda$. Let $\mathbf{B} = (0, 0, B)$, $B = B(x_1, x_2)$ with $|B| \ge \kappa > 0$. Suppose that $\gamma \ge 1$ if $d = 2$ and $\gamma > 1/2$ if $d = 3$. Then the asymptotics
\[
\lim \mu^{-1}\hbar^{d-1} M_\gamma(\hbar, \mu B, V, \Lambda) = A_\gamma(|B|, V, \Lambda)
\tag{5.1}
\]
holds as $2\mu\hbar\kappa \ge V_-$ and $\hbar \to 0$, with the coefficient $A_\gamma$ defined in (1.8).

Proof. Extend the field $B(x)$ periodically to the entire plane. Let $\Lambda_j^{(2)}$, $j = 1, 2, \dots$, be squares with disjoint interiors obtained by translating $\Lambda^{(2)}$ in the directions parallel to the plane $\mathbb{R}^2$ such that $\mathbb{R}^2 = \cup_j \Lambda_j^{(2)}$. Choose a domain $D^{(2)} \subset \mathbb{R}^2$ (depending on $\mu$, $\hbar$) in such a way that
\[
\mu\hbar^{-1}\int_{D^{(2)}} B(x)\,dx = 2\pi N, \qquad |N| \in \mathbb{N}.
\]
Single out the maximal set of those $\Lambda_j^{(2)}$ that are contained in $D^{(2)}$, and denote $M = \#\{j : \Lambda_j^{(2)} \subset D^{(2)}\}$. Taking a larger $D^{(2)}$ one can always satisfy the requirement
\[
(1 - \epsilon)|N| \le \mu\hbar^{-1} M|\Phi| \le |N|, \qquad \Phi = \frac{1}{2\pi}\int_{\Lambda^{(2)}} B(x)\,dx,
\tag{5.2}
\]
for a fixed number $\epsilon \in (0, 1/2)$. Set
\[
D^{(3)} = D^{(2)} \times [0, R), \qquad \Lambda_j^{(3)} = \Lambda_j^{(2)} \times [0, R).
\]
Denote by $X^{(d)}$ the torus obtained by identifying the opposite sides of the domain $D^{(d)}$. Further we obtain upper and lower bounds for $M_\gamma$, which will guarantee the asymptotics (5.1).

Upper bound. Clearly,
\[
\sum_{j=1}^{M} P_0(\Lambda_j^{(d)}) \ge P_0(X^{(d)}),
\]
so that
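\[
M\,N(V + \lambda, \Lambda^{(d)}) \le N(V + \lambda, X^{(d)}), \qquad M\,M_\gamma(V, \Lambda^{(d)}) \le M_\gamma(V, X^{(d)}), \quad \forall\gamma \ge 0,
\]
where we have taken into account that all the operators $P_0(\Lambda_j^{(d)})$ have identical spectra. Hence for $d = 3$ the estimates (4.11) and (5.2) entail the bound
\[
M_\gamma(V, \Lambda) \le 2\pi^2\beta_\gamma^{(3)} \frac{R}{\pi\hbar} V_-^{\gamma + \frac12}|N|M^{-1} + V_-^{\gamma}|N|M^{-1}
\le 2\pi\beta_\gamma^{(3)} \frac{R}{\hbar}\,\frac{\mu|\Phi|}{\hbar(1 - \epsilon)} V_-^{\gamma + \frac12} + \frac{\mu|\Phi|}{\hbar(1 - \epsilon)} V_-^{\gamma}.
\]
Since $\epsilon > 0$ is arbitrary, this implies that
\[
\limsup_{\hbar \to 0} \mu^{-1}\hbar^{2} M_\gamma(V, \Lambda^{(3)}) \le \beta_\gamma^{(3)} V_-^{\gamma + \frac12} \int_{\Lambda^{(3)}} |B(x)|\,dx = A_\gamma^{(3)}(|B|, V, \Lambda^{(3)}), \qquad V_- \le 2\mu\hbar\kappa.
\tag{5.3}
\]
Similarly, with the help of (4.10), one proves the analogous bound for $d = 2$:
\[
\limsup_{\hbar \to 0} \mu^{-1}\hbar M_\gamma(V, \Lambda^{(2)}) \le A_\gamma^{(2)}(|B|, V, \Lambda^{(2)}), \qquad V_- \le 2\mu\hbar\kappa.
\tag{5.4}
\]
Note that the estimates (5.3), (5.4) are valid for all $\gamma \ge 0$.

Lower bound. Let $\operatorname{int} \Lambda_j^{(d)}$ be the interior of $\Lambda_j^{(d)}$. Denote
\[
S^{(d)} = X^{(d)} \setminus \cup_{j=1}^{M} \operatorname{int} \Lambda_j^{(d)}, \qquad S_\delta^{(d)} = \{x \in X^{(d)} : \operatorname{dist}(x, S^{(d)}) < \delta\}
\]
with some $\delta < R/2$. Obviously, $X^{(d)} \subset S_\delta^{(d)} \cup \bigl(\cup_{j=1}^{M} \Lambda_j^{(d)}\bigr)$. It follows from (5.2) that the flux of the magnetic field across $S^{(2)}$ is bounded by $\epsilon\hbar\mu^{-1}|N|$, which implies that
\[
|S^{(2)}| \le \frac{2\pi\epsilon\hbar|N|}{\kappa\mu},
\]
and hence
\[
|S_\delta^{(d)}| \le C\epsilon\mu^{-1}\hbar|N| + C\delta R^{d-1} M \le CM(\epsilon + \delta)
\tag{5.5}
\]
with a constant $C$ depending on $\kappa$, $R$, $\Phi$. Using a partition of unity subordinate to the covering of $X^{(d)}$ by $S_\delta^{(d)}$ and $\Lambda_j^{(d)}$, $j = 1, 2, \dots, M$, similar to the one used in the proof of the upper bound in Lemma 3.1, we arrive at the estimate
\[
M_\gamma(V + C\hbar^2/\delta^2, X^{(d)}) \le M\,M_\gamma(V, \Lambda^{(d)}) + M_\gamma(V, S_\delta^{(d)}).
\]
Thus applying (4.11) to $M_\gamma(X^{(3)})$ and (4.12) to $M_\gamma(S_\delta^{(3)})$, and subsequently using (5.2) and (5.5), we obtain for $d = 3$,
\[
M_\gamma(V, \Lambda^{(3)}) \ge 2\pi^2\beta_\gamma^{(3)} \frac{R}{\pi\hbar} (V + C\hbar^2/\delta^2)_-^{\gamma + 1/2}|N|M^{-1} - V_-^{\gamma}|N|M^{-1} - C'\hbar^{-3}(1 + \mu\hbar)|S_\delta^{(3)}|M^{-1}
\]
\[
\ge 2\pi\beta_\gamma^{(3)}\mu\hbar^{-2} R\,(V + C\hbar^2/\delta^2)_-^{\gamma + 1/2}|\Phi| - C'\mu\hbar^{-1} - C'\hbar^{-3}(1 + \mu\hbar)(\epsilon + \delta),
\]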
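where the constant $C'$ depends on $V$, $\Phi$. As $\mu\hbar \ge c$ and $\hbar \to 0$, we obtain
\[
\liminf \mu^{-1}\hbar^{2} M_\gamma(\Lambda) \ge 2\pi\beta_\gamma^{(3)} R\,V_-^{\gamma + 1/2}|\Phi| - C(\epsilon + \delta) = A_\gamma^{(3)}(|B|, V, \Lambda^{(3)}) - C(\epsilon + \delta).
\]
Since $\epsilon$ and $\delta$ are arbitrary, this inequality and (5.3) provide (5.1) for $d = 3$. Analogously, the bound (4.10) leads to a lower bound for $M_\gamma$ in the case $d = 2$, which in combination with (5.4) yields (5.1).

Combining this result with Lemma 3.1, we arrive at

Corollary 5.2. Let $\Lambda = [0, R)^d$ with some $R > 0$ and let the fields $B$ and $V$ obey the conditions of Lemma 5.1. Suppose also that the field $B$ is continuous. Then the asymptotics (2.13) holds uniformly in $\mu \ge 0$ for $\gamma \ge 1$ ($d = 2$) and $\gamma > 1/2$ ($d = 3$).

To prove it, it suffices to recall that $\mu\hbar A_\gamma(|B|, V, \Lambda) = B_\gamma(\mu\hbar|B|, V, \Lambda)$ if $\mu\hbar\kappa \ge V_-$ (see (1.5), (1.8)). To remove the condition $|B| \ge \kappa$ in the next section, we shall need the following bound for $M_\gamma$:

Lemma 5.3. Let the cube $\Lambda$ and the fields $B$, $V$ be as in Corollary 5.2 except for the condition $|B| \ge \kappa$. Then
\[
\limsup_{\hbar \to 0} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} \bigl| M_\gamma - \hbar^{-d} B_\gamma(\mu\hbar|B|, V, \Lambda) \bigr| \le C\max_{x \in \Lambda}|B(x)|\,M_{d,\gamma}(1, V_-, \Lambda),
\tag{5.6}
\]
uniformly in $\mu \ge 0$.

Proof. Fix a number $A > 0$. According to Lemma 3.1,
\[
\limsup_{\hbar \to 0,\ \mu\hbar \le A} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} \bigl| M_\gamma - \hbar^{-d} B_\gamma(\mu\hbar|B|, V, \Lambda) \bigr| = 0.
\tag{5.7}
\]
On the other hand, due to (2.17),
\[
M_\gamma \le C\mu\hbar^{-d+1}\max_{x \in \Lambda}|B(x)|\,M_{d,\gamma}(1, V_-, \Lambda) + C'\hbar^{-d} N_{d,\gamma}(V_-, \Lambda).
\]
In view of (2.22) the coefficient $\hbar^{-d} B_\gamma$ obeys the same bound. Hence
\[
\sup_{\mu\hbar \ge A} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} \bigl| M_\gamma - \hbar^{-d} B_\gamma(\mu\hbar|B|, V, \Lambda) \bigr| \le C\max_{x \in \Lambda}|B(x)|\,M_{d,\gamma}(1, V_-, \Lambda) + C'A^{-1} N_{d,\gamma}(V_-, \Lambda).
\]
Since $A$ is arbitrary, together with (5.7) this provides (5.6).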
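The three-dimensional estimates of this section rest on the bound (4.11), which compares a sum over the longitudinal modes $n$ with an integral. The comparison can be checked numerically. The Python sketch below is an illustration under assumptions: the function names `mode_sum` and `mode_integral` and the sample values are hypothetical. It evaluates $|N|^{-1} M_\gamma$ from the proof of Lemma 4.3 and the integral $2\int_0^\infty \bigl((2\pi\hbar t/T_3)^2 - V_-\bigr)_-^{\gamma}\,dt$ in closed form via $\int_0^1 (1 - s^2)^{\gamma}\,ds = \sqrt{\pi}\,\Gamma(\gamma + 1)/(2\Gamma(\gamma + 3/2))$. Since the integrand is non-increasing in $t$, the two quantities differ by at most $V_-^{\gamma}$.

```python
import math


def mode_sum(V_minus, gamma, hbar, T3):
    """V_-^gamma + 2 * sum_{n >= 1} (V_- - (2 pi hbar n / T3)^2)_+^gamma."""
    total = V_minus ** gamma                 # n = 0 is counted once
    n = 1
    while (2 * math.pi * hbar * n / T3) ** 2 < V_minus:
        level = (2 * math.pi * hbar * n / T3) ** 2
        total += 2 * (V_minus - level) ** gamma   # n >= 1 is counted twice
        n += 1
    return total


def mode_integral(V_minus, gamma, hbar, T3):
    """2 * int_0^inf (V_- - (2 pi hbar t / T3)^2)_+^gamma dt in closed form."""
    # substitution s = 2 pi hbar t / (T3 sqrt(V_-)) reduces to int_0^1 (1 - s^2)^gamma ds
    unit_integral = (math.sqrt(math.pi) * math.gamma(gamma + 1)
                     / (2 * math.gamma(gamma + 1.5)))
    return 2 * T3 / (2 * math.pi * hbar) * V_minus ** (gamma + 0.5) * unit_integral


if __name__ == "__main__":
    for gamma in (0.75, 1.0, 2.0):
        s = mode_sum(1.0, gamma, 0.01, 2.0)
        i = mode_integral(1.0, gamma, 0.01, 2.0)
        print(gamma, s, i, abs(s - i))
```

The integral grows like $\hbar^{-1}$ while the discrepancy stays bounded by $V_-^{\gamma}$, which is why only the integral term survives in the limits (5.3) and (5.1).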
6. Proof of the Main Results

Proof of Theorem 2.2. As before, choose the coordinates in such a way that $\mathbf{B} = (0, 0, B)$. Let the quantities $M_{d,\gamma}$, $N_{d,\gamma}$ be defined as in (2.21). Let us fix a number $\epsilon > 0$. By the conditions of the theorem, there exists a finite collection of cubes $\Lambda_j \subset \Lambda$, $j = 1, 2, \dots$, with the sides parallel to the coordinate axes, such that there is a piecewise constant function $W$ such that $W(x) = \mathrm{const}$, $x \in \Lambda_j$ for each $j$,
\[
M_{d,\gamma}(b, V - W, \Lambda) \le \epsilon, \qquad N_{d,\gamma}(V - W, \Lambda) \le \epsilon,
\tag{6.1}
\]
and $W = 0$ on $\Lambda \setminus \cup_j \Lambda_j$. The existence of such cubes and a function $W$ can be proved, for example, with the help of the Whitney decomposition (see [13]). Now fix another number $\eta > 0$ and partition each $\Lambda_j$ into a finite family of cubes $\Lambda_{j,k}$, $k = 1, \dots$, such that
\[
|B(x) - B(y)| \le \eta, \qquad x, y \in \Lambda_{j,k}, \ \forall j, k.
\tag{6.2}
\]
Such a partition is possible due to the continuity of $B$. We shall say that the pair $(j, k)$ belongs to the set $A_> = A_>(\eta)$ if $\max |B(x)| \ge 2\eta$, $x \in \Lambda_{j,k}$. Otherwise we say $(j, k) \in A_< = A_<(\eta)$. Note that because of (6.2)
\[
|B(x)| \ge \eta, \qquad x \in \Lambda_{j,k}, \ (j, k) \in A_>.
\]
As in Lemma 5.1, we establish lower and upper bounds on $M_\gamma$.

Lower bound. Using (2.19) with $V_1 = W$, $V_2 = V - W$, we see that
\[
M_\gamma(V) \ge M_\gamma\bigl(W(1 + \zeta)^{-1}\bigr) - M_\gamma\bigl(\zeta^{-1}(W - V)\bigr), \qquad \forall\zeta \in (0, 1).
\tag{6.3}
\]
According to (2.15) and (6.1),
\[
M_\gamma\bigl(\zeta^{-1}(W - V)\bigr) \le C\zeta^{-\sigma}\mu\hbar^{1-d}\epsilon + C'\zeta^{-\sigma-1}\hbar^{-d}\epsilon, \qquad \sigma = \gamma + \frac{d - 2}{2}.
\tag{6.4}
\]
Estimate from below the first summand in the r.h.s. of (6.3). Clearly,
\[
\sum_{j,k} P(\Lambda_{j,k}) \ge P(\Lambda),
\]
so that
\[
M_\gamma(\Lambda) \ge \sum_{j,k} M_\gamma(\Lambda_{j,k}).
\]
Applying Corollary 5.2 to the cubes from $A_>$ and Lemma 5.3 to the cubes from $A_<$, we obtain that
\[
\liminf_{\hbar \to 0} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} \Bigl[ M_\gamma\bigl(W(1 + \zeta)^{-1}, \Lambda\bigr) - \hbar^{-d} B_\gamma\bigl(\mu\hbar|B|, W(1 + \zeta)^{-1}, \Lambda\bigr) \Bigr]
\ge -C\eta \sum_{(j,k) \in A_<} M_{d,\gamma}\bigl(1, W_-(1 + \zeta)^{-1}, \Lambda_{j,k}\bigr) \ge -C\eta.
\]
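Notice that (2.23), (2.25) and (6.1) ensure that
\[
\bigl| B_\gamma\bigl(W(1 + \zeta)^{-1}\bigr) - B_\gamma(V) \bigr| \le
\begin{cases}
C\bigl(\zeta^{\sigma} + \epsilon + \epsilon^{\frac{\sigma}{\sigma+1}}\bigr), & \sigma \le 1;\\[2pt]
C\bigl(\zeta + \epsilon^{\frac{1}{\sigma}} + \epsilon^{\frac{1}{\sigma+1}}\bigr), & \sigma > 1.
\end{cases}
\]
Now due to the last estimates and (6.4), the lower bound (6.3) leads to the estimate
\[
\liminf_{\hbar \to 0} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} \Bigl[ M_\gamma(V, \Lambda) - \hbar^{-d} B_\gamma\bigl(\mu\hbar|B|, V, \Lambda\bigr) \Bigr]
\ge -C\eta - C\bigl(\zeta + \epsilon^{\frac{1}{\sigma}} + \epsilon^{\frac{1}{\sigma+1}}\bigr)^{\min\{\sigma, 1\}} - C\zeta^{-\sigma-1}\epsilon,
\]
with a constant $C$ independent of $\eta$, $\epsilon$, $\zeta$. Taking $\epsilon$ to zero first and then $\zeta$ and $\eta$, we arrive at
\[
\liminf_{\hbar \to 0} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} \Bigl[ M_\gamma(V, \Lambda) - \hbar^{-d} B_\gamma\bigl(\mu\hbar|B|, V, \Lambda\bigr) \Bigr] \ge 0.
\tag{6.5}
\]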
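Upper bound. Using again (2.19) with the same $\zeta$ as above, we conclude that
\[
M_\gamma(V) \le M_\gamma\bigl((1 - \zeta)^{-1} W\bigr) + M_\gamma\bigl(\zeta^{-1}(V - W)\bigr).
\tag{6.6}
\]
Further argument is the same as in the proof of the lower bound in Lemma 5.1. Let $\operatorname{int} \Lambda_{j,k}$ be the interior of $\Lambda_{j,k}$. Denote $S = \mathbb{R}^d \setminus \cup_{j,k} \operatorname{int} \Lambda_{j,k}$ and define $S_\delta = \{x \in \mathbb{R}^d : \operatorname{dist}(x, S) < \delta\}$ with some $\delta > 0$. Obviously, $\Lambda \subset S_\delta \cup \bigl(\cup_{j,k} \Lambda_{j,k}\bigr)$. It is clear from the construction of $S_\delta$ that
\[
|S_\delta \setminus S| \le C\delta,
\tag{6.7}
\]
with a constant $C$ depending on $\Lambda$, $\epsilon$. Construct a partition of unity $\psi_0, \psi_{j,k} \in C^\infty(\mathbb{R}^d)$, $j, k \ge 1$, such that $\operatorname{supp} \psi_0 \subset S_\delta$, $\operatorname{supp} \psi_{j,k} \subset \Lambda_{j,k}$, $j, k \ge 1$,
\[
\psi_0^2 + \sum_{j,k} \psi_{j,k}^2 = 1, \qquad Y(x) = |\nabla\psi_0(x)|^2 + \sum_{j,k} |\nabla\psi_{j,k}(x)|^2 \le C\delta^{-2}.
\]
A simple calculation shows that
\[
P_0(S_\delta \cap \Lambda)[\psi_0 f, \psi_0 f] + \sum_{j,k} P_0(\Lambda_{j,k})[\psi_{j,k} f, \psi_{j,k} f] \le P_0(\Lambda)[f, f] + C\hbar^2 Y[f, f],
\]
for any function $f \in D(\Lambda)$. By virtue of the variational principle,
\[
M_\gamma\bigl(W(1 - \zeta)^{-1}, \Lambda\bigr) \le \sum_{j,k} M_\gamma(\tilde W, \Lambda_{j,k}) + M_\gamma(\tilde W, S_\delta \cap \Lambda), \qquad \tilde W = W(1 - \zeta)^{-1} - C\hbar^2 Y.
\tag{6.8}
\]
Notice that $\operatorname{supp} \tilde W \cap S_\delta \subset S_\delta \setminus S$ and $\tilde W_- \le W_-(1 - \zeta)^{-1} + C\hbar^2\delta^{-2}$, which implies in view of (2.15), (6.7) that
\[
\lim_{\delta \to 0} \limsup_{\hbar \to 0} \bigl(\mu\hbar^{1-d} + \hbar^{-d}\bigr)^{-1} M_\gamma(\tilde W, S_\delta \cap \Lambda) = 0.
\tag{6.9}
\]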
Now, to handle the sum in the r.h.s. of (6.8), as in the first part of the proof, we apply Corollary 5.2 to the cubes from $A_>$ and Lemma 5.3 to the cubes from $A_<$, using subsequently (2.23), (2.25) and (6.1). Then, taking into account (6.9) and the arbitrariness of $\delta$, $\epsilon$, $\zeta$, $\eta$, we end up with an upper bound similar to (6.5). The upper and lower bounds immediately provide (2.13).

The proof of Theorem 2.1 is similar, except that instead of the bound (2.15) one uses (2.14). We do not go into details.
7. Appendix A. Periodic Problem

The spectrum of the Pauli operator in two dimensions with a periodic magnetic field was studied in [4]. In particular, it was shown there that the structure of the eigenspace corresponding to the ground state $\lambda = 0$ can be described by means of the Bloch functions, i.e. the functions obeying the quasi-periodic conditions
\[
\tau_1 u(x) = e^{ip_1 T_1/\hbar} u(x), \qquad \tau_2 u(x) = e^{ip_2 T_2/\hbar} u(x),
\tag{7.1}
\]
for some $p = (p_1, p_2) \in \mathbb{R}^2$. (Here and below we use the notation of Sect. 4.) The approach of [4] amounts to the direct construction of the Bloch functions with the help of the Weierstrass $\sigma$-functions and is connected with a specific choice of the function $\phi$ in the gauge (4.2) for the vector-potential $a$. In this Appendix we provide a simpler description of the Bloch functions, although in the final stage we also make use of the $\sigma$-function.

The gauge for the vector-potential is chosen as in (4.2), (4.4). The condition (4.7) is supposed to be satisfied. Similarly to the operators $A_\pm(X^{(2)})$, $Q_\pm(X^{(2)})$ associated with the periodic conditions (4.6), we define also the operators $A_\pm^{(p)}(X^{(2)})$, $Q_\pm^{(p)}(X^{(2)})$, $p \in \mathbb{R}^2$, corresponding to the conditions (7.1). Our objective is to prove

Theorem 7.1. Let $\pm N > 0$ in (4.7). Then for any $p \in \mathbb{R}^2$ the point $\lambda = 0$ is an eigenvalue of the operator $A_\pm^{(p)}(X^{(2)})$ of multiplicity $\pm N$.

This theorem implies the first statement of Proposition 4.2.

To be definite we shall assume that $N > 0$ and show that $\lambda = 0$ is an eigenvalue of the operator $A_+^{(p)}$ of multiplicity $N$. The equation $A_+^{(p)} u = 0$ is equivalent to
\[
Q_+^{(p)} u = 0.
\tag{7.2}
\]
We seek a solution in the form
\[
u(x) = e^{-\mu\phi/\hbar}\psi(z), \qquad z = x_1 + ix_2,
\tag{7.3}
\]
where the function $\phi$ is the solution of (4.3) introduced in the beginning of Sect. 4. Substituting (7.3) in (7.2), we see that $(-i\partial_1 + \partial_2)\psi = 0$, which means that $\psi$ is an entire function. It follows from (7.1) that $\psi$ obeys the conditions
\[
\psi(z + T_1) = e^{\omega_1 z + \gamma_1}\psi(z), \qquad \psi(z + iT_2) = e^{\omega_2 z + \gamma_2}\psi(z),
\tag{7.4}
\]
with
\[
\begin{aligned}
\omega_1 &= \alpha\mu B_0 T_1(2\hbar)^{-1} = \alpha\pi N T_2^{-1}, & \gamma_1 &= \alpha\pi T_1 N(2T_2)^{-1} + ip_1 T_1\hbar^{-1};\\
\omega_2 &= -i\beta\mu B_0 T_2(2\hbar)^{-1} = -i\beta\pi N T_1^{-1}, & \gamma_2 &= \beta\pi T_2 N(2T_1)^{-1} + ip_2 T_2\hbar^{-1}.
\end{aligned}
\tag{7.5}
\]
We prove Theorem 7.1 in two steps. The first of them is

Lemma 7.2. Let $N > 0$. Assume that for a $p \in \mathbb{R}^2$ there exists a non-zero solution $u$ of (7.2), satisfying the conditions (7.1). Then
1. it has exactly $N$ roots in the domain $D = [0, T_1) \times [0, T_2)$;
2. there are exactly $N$ linearly independent solutions of such a form.

Proof. Suppose without loss of generality that all the roots of $u$ are strictly inside the domain $D$. The boundary conditions (7.4), (7.5) imply that the increment of the phase of $\psi$ when moving anti-clockwise along the closed contour formed by the sides of the rectangle $D$ equals
\[
\pi(\alpha + \beta)N = 2\pi N.
\]
By the principle of the argument, this means that the number of roots inside $D$ equals $N$. To prove the second statement of the lemma notice that for any two solutions $u_1$ and $u_2$ of (7.2), the ratio $\psi_1\psi_2^{-1}$ of the corresponding functions $\psi_1$, $\psi_2$ is a meromorphic periodic function (i.e. an elliptic function) with the total number of poles being less than or equal to $N$, with principal parts determined by the roots of $\psi_2$. There are exactly $N$ linearly independent such elliptic functions (see [5]).

It remains to prove the existence of a solution $u$ of (7.2) of the required form, or, which is the same, to prove the existence of an entire function satisfying the conditions (7.4), (7.5). We shall construct such a function explicitly, using the Weierstrass function $\sigma$ (see [5]). The function $\sigma$ is entire and has the following translation properties:
\[
\begin{cases}
\sigma(z + T_1) = -\sigma(z)\exp\{2\eta_1(z + T_1/2)\},\\
\sigma(z + iT_2) = -\sigma(z)\exp\{2i\eta_2(z + iT_2/2)\}.
\end{cases}
\tag{7.6}
\]
Here $\eta_1$, $\eta_2$ are real numbers defined through the Weierstrass $\zeta$-function $\zeta = \sigma'/\sigma$ as follows:
\[
\eta_1 = \zeta(T_1/2), \qquad \eta_2 = -i\zeta(iT_2/2).
\]
Note the Legendre relation:
\[
\eta_1 T_2 - \eta_2 T_1 = \pi.
\tag{7.7}
\]
The required function $\psi$ is provided by

Lemma 7.3. The entire function
\[
\psi(z) = \bigl(\sigma(z - \rho)\bigr)^{N} e^{\rho_0 z - \rho_1 z^2},
\tag{7.8}
\]
with the parameters
\[
\rho_1 = \frac{2\eta_1 T_2 - \alpha\pi}{2T_1 T_2} N = \frac{2\eta_2 T_1 + \beta\pi}{2T_1 T_2} N,
\tag{7.9}
\]
\[
\rho_0 = -N(\eta_1 - i\eta_2) - \frac{ip_1 T_1\eta_2}{\pi\hbar} + \frac{p_2 T_2\eta_1}{\pi\hbar},
\tag{7.10}
\]
\[
2\rho = -(T_1 - iT_2) - \frac{i(p_1 + ip_2)T_1 T_2}{\pi\hbar N},
\tag{7.11}
\]
obeys the conditions (7.4), (7.5).

Before proving the lemma notice that the two expressions in (7.9) give the same value of $\rho_1$ due to the Legendre relation (7.7) and the equality $\alpha + \beta = 2$.

Proof. We need to show that the choice (7.9), (7.10), (7.11) of the parameters $\rho_1$, $\rho_0$, $\rho$ guarantees the validity of (7.4), (7.5). Using the translation properties (7.6), one sees that (7.8) satisfies the relations
\[
\psi(z + T_1) = e^{\tilde\omega_1 z + \tilde\gamma_1}\psi(z), \qquad \psi(z + iT_2) = e^{\tilde\omega_2 z + \tilde\gamma_2}\psi(z)
\]
with the complex numbers
\[
\begin{aligned}
\tilde\omega_1 &= 2(\eta_1 N - \rho_1 T_1), & \tilde\gamma_1 &= (i\pi - 2\eta_1\rho + \eta_1 T_1)N + \rho_0 T_1 - \rho_1 T_1^2,\\
\tilde\omega_2 &= 2i(\eta_2 N - \rho_1 T_2), & \tilde\gamma_2 &= (i\pi - 2i\eta_2\rho - \eta_2 T_2)N + i\rho_0 T_2 + \rho_1 T_2^2.
\end{aligned}
\]
From the equality $\tilde\omega_k = \omega_k$ (see (7.5)), we obtain (7.9). Furthermore, substituting (7.9) in the definition of $\tilde\gamma_1$, $\tilde\gamma_2$, we get
\[
\tilde\gamma_1 N^{-1} = i\pi - 2\eta_1\rho + \eta_1 T_1 + \rho_0 T_1 N^{-1} - \rho_1 T_1^2 N^{-1} = i\pi - 2\eta_1\rho + \rho_0 T_1 N^{-1} + \frac{\alpha\pi T_1}{2T_2},
\]
and
\[
\tilde\gamma_2 N^{-1} = i\pi - 2i\eta_2\rho - \eta_2 T_2 + i\rho_0 T_2 N^{-1} + \rho_1 T_2^2 N^{-1} = i\pi - 2i\eta_2\rho + i\rho_0 T_2 N^{-1} + \frac{\beta\pi T_2}{2T_1}.
\]
Since we require that $\gamma_k = \tilde\gamma_k$, this leads to the relations
\[
i\pi - 2\eta_1\rho + \rho_0 T_1 N^{-1} = ip_1 T_1 N^{-1}\hbar^{-1}, \qquad i\pi - 2i\eta_2\rho + i\rho_0 T_2 N^{-1} = ip_2 T_2 N^{-1}\hbar^{-1}.
\]
Solving these equations and using again the Legendre relation, we arrive at (7.11), (7.10). Now the function (7.3) with $\psi$ given by (7.8) satisfies (7.2).

Now reference to Lemma 7.2 completes the proof of Theorem 7.1.
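The algebra in the proof of Lemma 7.3 can be verified numerically. The Python sketch below is an illustration under assumptions: the function name `lemma_7_3_residuals` and the sample values are hypothetical, and $\eta_1$, $\eta_2$ are not computed from an actual Weierstrass $\zeta$-function but are simply taken as real numbers constrained by the Legendre relation (7.7), which is the only property of them that the computation uses. The sketch checks that the two expressions in (7.9) agree, that $\tilde\omega_k = \omega_k$, and that (7.10), (7.11) solve the two linear relations for $\rho_0$, $\rho$.

```python
import math


def lemma_7_3_residuals(T1, T2, eta2, N, p1, p2, hbar, alpha):
    """Return the absolute residuals of the identities used in Lemma 7.3."""
    beta = 2.0 - alpha                    # alpha + beta = 2
    eta1 = (math.pi + eta2 * T1) / T2     # Legendre relation (7.7) imposed by hand

    # the two expressions for rho_1 in (7.9)
    rho1_a = (2 * eta1 * T2 - alpha * math.pi) / (2 * T1 * T2) * N
    rho1_b = (2 * eta2 * T1 + beta * math.pi) / (2 * T1 * T2) * N

    # (7.10) and (7.11)
    rho0 = (-N * (eta1 - 1j * eta2)
            - 1j * p1 * T1 * eta2 / (math.pi * hbar)
            + p2 * T2 * eta1 / (math.pi * hbar))
    rho = 0.5 * (-(T1 - 1j * T2)
                 - 1j * (p1 + 1j * p2) * T1 * T2 / (math.pi * hbar * N))

    # omega-tilde versus omega from (7.5)
    res_w1 = 2 * (eta1 * N - rho1_a * T1) - alpha * math.pi * N / T2
    res_w2 = 2j * (eta2 * N - rho1_a * T2) + 1j * beta * math.pi * N / T1

    # the two relations obtained from gamma_k = gamma-tilde_k
    res_g1 = (1j * math.pi - 2 * eta1 * rho + rho0 * T1 / N
              - 1j * p1 * T1 / (N * hbar))
    res_g2 = (1j * math.pi - 2j * eta2 * rho + 1j * rho0 * T2 / N
              - 1j * p2 * T2 / (N * hbar))

    return [abs(rho1_a - rho1_b), abs(res_w1), abs(res_w2), abs(res_g1), abs(res_g2)]


if __name__ == "__main__":
    print(lemma_7_3_residuals(T1=1.0, T2=1.5, eta2=0.3, N=3,
                              p1=0.4, p2=-0.7, hbar=0.5, alpha=0.7))
```

All five residuals vanish up to rounding for any admissible input, which mirrors the two uses of the Legendre relation in the proof.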
8. Appendix B

Here we prove the bounds (2.23), (2.24), (2.25), (2.26) for the coefficient $B_\gamma$ defined in (1.5). Let $M_{d,\gamma}$, $N_{d,\gamma}$ be defined as in (2.21) and $\sigma = \gamma + (d - 2)/2$. Assume without loss of generality that $V_1$, $V_2$ and $V$ are non-positive functions.

Proof of (2.23), (2.25). Let first $\sigma \le 1$. Using the inequality $|t_1^{\sigma} - t_2^{\sigma}| \le |t_1 - t_2|^{\sigma}$, $t_1, t_2 \in \mathbb{R}$, for each term in (1.5), we obtain
\[
|B_\gamma(b, V_1) - B_\gamma(b, V_2)| \le C M_{d,\gamma}(b, V_1 - V_2) + C'\int b|V_1 - V_2|^{\sigma} \Bigl( \sum_{k \le |V_1|b^{-1}} 1 + \sum_{k \le |V_2|b^{-1}} 1 \Bigr)\,dx
\]
\[
\le C M_{d,\gamma}(b, V_1 - V_2) + C'\int (|V_1| + |V_2|)|V_1 - V_2|^{\sigma}\,dx.
\]
Now Hölder's inequality leads to (2.23). For $\sigma > 1$ use the estimate
\[
|V_1|^{\sigma} - |V_2|^{\sigma} \le \sigma\bigl(|V_1|^{\sigma-1} + |V_2|^{\sigma-1}\bigr)|V_1 - V_2|.
\]
Applying (2.22), one sees that
\[
|B_\gamma(b, V_1) - B_\gamma(b, V_2)| \le C\int b^{\frac{\sigma-1}{\sigma}}\bigl(|V_1|^{\sigma-1} + |V_2|^{\sigma-1}\bigr) b^{\frac{1}{\sigma}}|V_1 - V_2|\,dx + C'\int \bigl(|V_1|^{\sigma} + |V_2|^{\sigma}\bigr)|V_1 - V_2|\,dx.
\]
Again Hölder's inequality ensures (2.25).

Proof of (2.24), (2.26). We begin with the case $\sigma \le 1$. Let us look first at the following auxiliary quantity:
\[
\Sigma(b) = \Sigma(b, b), \qquad \Sigma(b_1, b_2) = \Sigma_\sigma(b_1, b_2) = 2b_1 \sum_{k \ge 1} \bigl(2kb_2 + V\bigr)_-^{\sigma},
\]
with some non-negative numbers $b$, $b_1$, $b_2$, and estimate $\Sigma(b_1) - \Sigma(b_2)$. To be definite, we assume that $b_2 \le b_1$. We shall consider separately two cases: 1. $b_2 \le C|b_1 - b_2|^{1/2}|V|^{1/2}$ and 2. $b_2 \ge c|b_1 - b_2|^{1/2}|V|^{1/2}$.

Case 1. Notice that
\[
\bigl|\Sigma(b) - (\sigma + 1)^{-1}|V|^{\sigma+1}\bigr| \le Cb^{\sigma}|V|.
\tag{8.1}
\]
Indeed, it is straightforward to check that
\[
0 \le \int_{k-1}^{k} (2tb + V)_-^{\sigma}\,dt - (2kb + V)_-^{\sigma} \le \bigl(2(k - 1)b + V\bigr)_-^{\sigma} - \bigl(2kb + V\bigr)_-^{\sigma} \le 2^{\sigma}b^{\sigma},
\]
and
A. V. Sobolev
Z
∞
2b 0
(2tb + V )σ− dt =
|V |σ+1 . σ+1
Thus |6(b) − (σ + 1)−1 |V |σ+1 | ≤ 2σ+1 bσ+1
X
1,
1≤k≤|V |(2b)−1
which entails (8.1). Consequently |6(b1 ) − 6(b2 )| ≤ C(bσ1 + bσ2 )|V | σ
σ
≤ C|b1 − b2 |σ |V | + C|b1 − b2 | 2 |V |1+ 2 .
(8.2)
Case 2. Write 6(b1 ) − 6(b2 ) = 6(b1 , b1 ) − 6(b2 , b1 ) + 6(b2 , b1 ) − 6(b2 , b2 ) .
(8.3)
The first summand is bounded by |6(b1 , b1 ) − 6(b2 , b1 )| 1 1 |b1 − b2 | |b1 − b2 | 6(b1 ) ≤ |V |σ+1 ≤ C|b1 − b2 | 2 |V |σ+ 2 . ≤ b1 (σ + 1)b1
(8.4)
The second summand in (8.3) satisfies the bound X σ |6(b2 , b1 ) − 6(b2 , b2 )| ≤ 2b2 2k|b1 − b2 | , where k runs from 1 to |V |(2b2 )−1 . Thus |6(b2 , b1 ) − 6(b2 , b2 )| ≤ C
σ σ |b1 − b2 |σ |V |σ+1 ≤ C|b1 − b2 | 2 |V | 2 +1 . σ b2
(8.5)
It follows from (8.3), (8.4), (8.5) that 1
1
σ
σ
|6(b1 ) − 6(b2 )| ≤ C|b1 − b2 | 2 |V |σ+ 2 + C|b1 − b2 | 2 |V | 2 +1 . The last inequality and (8.2) provide the following bound which holds for all b1 , b2 : |6(b1 ) − 6(b2 )| σ
σ
1
1
≤ C|b1 − b2 | 2 |V |1+ 2 + C|b1 − b2 |σ |V | + C|b1 − b2 | 2 |V |σ+ 2 . To estimate the difference Bγ (b1 , V ) − Bγ (b2 , V ) write Z Z −1 σ βγ Bγ (b, V ) = b|V | dx + 2 6(b)dx,
(8.6)
(8.7)
b = b(x) being the function from Definition (1.5). The first term here yields the first term in the bound (2.24). The other three summands follow from (8.6) by making use of H¨older’s inequality:
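\[
\int |b_1 - b_2|^{\frac12}|V|^{\sigma + \frac12}\,dx \le M_{d,\gamma}(|b_1 - b_2|, V)^{\frac12} N_{d,\gamma}(V)^{\frac12},
\]
\[
\int |b_1 - b_2|^{\sigma}|V|^{\sigma^2 + 1 - \sigma^2}\,dx \le M_{d,\gamma}(|b_1 - b_2|, V)^{\sigma} N_{d,\gamma}(V)^{1-\sigma},
\]
\[
\int |b_1 - b_2|^{\frac{\sigma}{2}}|V|^{\frac{\sigma^2}{2} + \frac{(2-\sigma)(1+\sigma)}{2}}\,dx \le M_{d,\gamma}(|b_1 - b_2|, V)^{\frac{\sigma}{2}} N_{d,\gamma}(V)^{1 - \frac{\sigma}{2}}.
\]
Let $\sigma > 1$. Then we use the following relation:
\[
\Sigma_\sigma(b, V) = C_\sigma\int_0^\infty \lambda^{\sigma-2}\,\Sigma_1(b, V + \lambda)\,d\lambda
\]
with an explicit constant $C_\sigma$. Plug in here (8.6) with $\sigma = 1$:
\[
|\Sigma_\sigma(b_1, V) - \Sigma_\sigma(b_2, V)| \le C_\sigma|b_1 - b_2|^{\frac12}\int \lambda^{\sigma-2}(V + \lambda)_-^{\frac32}\,d\lambda + C_\sigma|b_1 - b_2|\int \lambda^{\sigma-2}(V + \lambda)_-\,d\lambda
\le C|b_1 - b_2|^{\frac12}V_-^{\sigma + \frac12} + C|b_1 - b_2|V_-^{\sigma}.
\]
Now (2.26) follows from (8.7).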
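The comparison between the sum $\Sigma(b)$ and the integral that underlies (8.1) can be illustrated numerically. The Python sketch below is an illustration under assumptions: the function names `neg_part_pow`, `big_sigma` and `integral_value` and the sample values are hypothetical. For $V \le 0$ and $b > 0$ it evaluates $\Sigma(b) = 2b\sum_{k \ge 1}(2kb + V)_-^{\sigma}$ and the value $|V|^{\sigma+1}/(\sigma + 1)$ of $2b\int_0^\infty (2tb + V)_-^{\sigma}\,dt$. Because the integrand is non-increasing in $t$, the sum never exceeds the integral and falls short of it by at most $2b|V|^{\sigma}$; for $\sigma \le 1$ this gap is in turn at most $2^{\sigma}b^{\sigma}|V|$, in agreement with (8.1).

```python
def neg_part_pow(x, sigma):
    """(x)_-^sigma, where (x)_- = max(-x, 0)."""
    return (-x) ** sigma if x < 0 else 0.0


def big_sigma(b, V, sigma):
    """Sigma(b) = 2 b sum_{k >= 1} (2 k b + V)_-^sigma for V <= 0 and b > 0."""
    total = 0.0
    k = 1
    while 2 * k * b + V < 0:       # terms vanish once 2 k b >= |V|
        total += neg_part_pow(2 * k * b + V, sigma)
        k += 1
    return 2 * b * total


def integral_value(V, sigma):
    """2 b int_0^inf (2 t b + V)_-^sigma dt = |V|^(sigma+1) / (sigma+1), independent of b."""
    return abs(V) ** (sigma + 1) / (sigma + 1)


if __name__ == "__main__":
    V, sigma = -3.0, 0.5
    for b in (0.05, 0.2, 0.9, 2.0):
        gap = integral_value(V, sigma) - big_sigma(b, V, sigma)
        print(b, gap, 2 * b * abs(V) ** sigma, 2 ** sigma * b ** sigma * abs(V))
```

The printed columns show the gap together with the two upper bounds; the gap shrinks as $b \to 0$, as expected of a Riemann sum.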
Acknowledgement. The first sections of the paper were written in the stimulating environment of the California Institute of Technology, in March 1996. I am grateful to B. Simon and the staff of the Caltech Math Department for their hospitality. I am grateful to R. Fenn for discussions about the Pauli matrices. Appendix A was written on the basis of discussions with M. Birman at the Schrödinger Institute in Vienna, in April 1996. I thank T. Hoffmann-Ostenhof for his kind invitation to the Institute.
References

1. Bugliaro, L., Fefferman, C., Fröhlich, J., Graf, G.M., Stubbe, J.: A Lieb–Thirring bound for a magnetic Pauli operator. Commun. Math. Phys. 187, 567–582 (1997)
2. Colin de Verdière, Y.: L'asymptotique de Weyl pour les bouteilles magnétiques. Commun. Math. Phys. 105, 327–335 (1986)
3. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators with application to quantum mechanics and global geometry. Texts and Monographs in Physics, Berlin: Springer, 1987
4. Dubrovin, B.A., Novikov, S.P.: Ground states in a periodic field. Magnetic Bloch functions and vector bundles. Soviet Math. Dokl. 22, 1, 240–244 (1980)
5. Erdélyi, A., et al.: Higher transcendental functions, vol. 2. New York: McGraw-Hill Co, Inc, 1953
6. Erdős, L., Solovej, J.P.: Semiclassical eigenvalue estimates for the Pauli operator with strong nonhomogeneous magnetic fields, I. Non-asymptotic Lieb–Thirring estimate. To appear in Duke Math. J.
7. Erdős, L., Solovej, J.P.: Semiclassical eigenvalue estimates for the Pauli operator with strong nonhomogeneous magnetic fields, II. Leading order asymptotic estimates. Commun. Math. Phys. 188, 599–656 (1997)
8. Lieb, E.H., Solovej, J.P., Yngvason, J.: Asymptotics of heavy atoms in high magnetic fields: II. Semiclassical regions. Commun. Math. Phys. 161, 77–124 (1994)
9. Lieb, E.H., Solovej, J.P., Yngvason, J.: Ground states of large quantum dots in magnetic fields. Phys. Rev. B 51, 10646–10665 (1995)
10. Loss, M., Yau, H.-T.: Stability of Coulomb systems with magnetic fields III. Zero energy bound states of the Pauli operator. Commun. Math. Phys. 104, 283–290 (1986)
11. Sobolev, A.V.: On the Lieb–Thirring estimates for the Pauli operator. Duke Math. J. 82, 607–635 (1996)
12. Sobolev, A.V.: Lieb–Thirring inequalities for the Pauli operator in three dimensions. In: Quasiclassical Methods, Eds: J. Rauch, B. Simon, IMA Volumes in Mathematics and its Applications, vol 95, pp. 155–188
13. Stein, E.M.: Singular integrals and differentiability properties of functions. Princeton: Princeton University, 1970
14. Thaller, B.: The Dirac equation. Texts and Monographs in Physics, Berlin: Springer, 1992

Communicated by B. Simon
Commun. Math. Phys. 194, 135 – 148 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Representations of Lie Groups and Geometric Quantizations

Qiang Zhao*

School of Mathematics, Peking University, Beijing, China, and Department of Mathematics, Northwest Normal University, Lanzhou, China

Received: 2 January 1997 / Accepted: 1 October 1997

Abstract: In this paper we discuss the relation between representations of Lie groups and geometric quantizations. A series of representations of Lie groups are constructed by geometric quantization of coadjoint orbits. In particular, all representations of compact Lie groups, holomorphic discrete series of representations and spherical representations of reductive Lie groups are constructed by geometric quantizations of elliptic and hyperbolic coadjoint orbits.
1. Introduction

The problem of geometric quantization is, starting from the geometry of a symplectic manifold (M, ω), which gives the model of a classical mechanical system, to construct a Hilbert space H and a set of operators on it which give the quantum analogue of this system. In particular, if some Lie group G acts transitively on (M, ω) such that (M, ω) is a Hamilton G-space, then irreducible representations of G must arise in the corresponding quantum system. Hamilton G-spaces were classified by Kostant [1].

Theorem 1.1 ([1], Theorem 5.4.1). Suppose that X is a Hamilton G-space. Then X is a covering of a coadjoint orbit $Gf \subset \mathfrak{g}^*$.

Therefore we may get irreducible representations of G by geometric quantization of coadjoint orbits or their coverings. Not every symplectic manifold can be quantized. For a symplectic manifold (M, ω), if $h^{-1}\omega \in H^2(M, \mathbb{R})$ is integral, the Souriau–Kostant formula [1] gives a prequantization of (M, ω), where h is Planck's constant. Suppose (M, ω) is a coadjoint orbit of a Lie group G, $M = Gf \cong G/G_f$ for some $f \in \mathfrak{g}^*$, where $G_f = \{g \in G \mid gf = f\}$. By Kostant [1], the coadjoint orbit M = Gf as a Hamilton G-space is prequantizable if and only if
* Supported by the Natural Science Foundation of China and the Post-Doctor's Foundation of China.
there exists a character $\Lambda$ of $G_f$ such that $d\Lambda = 2\pi i h^{-1} f$. In this case, M is said to be integral.

In general, the Souriau–Kostant prequantization is not a quantization. The problem is that the Hilbert space which prequantization constructs is too large to make the corresponding representation of the Poisson algebra irreducible. The idea is to find a polarization P of (M, ω) and replace the Hilbert space H of prequantization by the subspace $H_P$ of polarized elements in H (see [2, 3]). Then the set of functions whose operators map $H_P$ into itself forms a subalgebra $C_P(M)$ of the Poisson algebra $C^\infty(M)$. They are the classical observables which can be quantized. The difficulty of doing so is that the polarization P must be chosen properly. It is a technical and complicated task to find a good and suitable polarization. Some symplectic manifolds admit no polarizations. It is also possible that $H_P = \{0\}$ for some polarization P.

In this paper we successfully construct Kählerian polarizations and real polarizations for elliptic coadjoint orbits and hyperbolic coadjoint orbits, respectively. Using these polarizations, we quantize the two classes of coadjoint orbits. If a Lie group G is compact, every coadjoint orbit of G is elliptic. By geometric quantization of integral coadjoint orbits, all irreducible representations of G arise. In this way, we get the Borel–Weil theorem in a direct and explicit way. If the Lie group G is SU(m, n), SO*(2n) or Sp(n, R), which has holomorphic discrete series of representations, we can give these representations by geometric quantizations of integral elliptic coadjoint orbits. The spherical representations of reductive Lie groups, i.e., the principal series of representations induced by representations of minimal parabolic subgroups which are trivial on the Levi factors, are obtained by geometric quantizations of hyperbolic coadjoint orbits.

2. Geometric Quantization

Suppose that (M, ω) is a symplectic manifold.
We denote the set of all smooth real functions on M by $C^\infty(M)$ and the set of all smooth real vector fields on M by $\mathcal{X}(M)$. For $f \in C^\infty(M)$, let $X_f$ be the vector field determined by $i(X_f)\omega + df = 0$, where by $i(X_f)\omega$ we denote the contraction of $X_f$ with ω. $X_f$ is called the Hamiltonian vector field generated by f. The Poisson bracket $[f, g] \in C^\infty(M)$ of $f, g \in C^\infty(M)$ is defined by $[f, g] = X_f(g) = \omega(X_f, X_g)$. It is well-known that the Poisson bracket makes $C^\infty(M)$ into a Lie algebra, the Poisson algebra, and the map $f \to X_f$ is a Lie algebra homomorphism from $C^\infty(M)$ into $V_H(M)$, the set of Hamiltonian vector fields. A subset $F_1, F_2, \dots, F_m$ of $C^\infty(M)$ is called complete if the conditions $[F_i, G] = 0$ (i = 1, 2, ..., m) for $G \in C^\infty(M)$ imply G = constant.

Definition 2.1. A quantization of (M, ω) is a linear mapping $F \to \hat F$ of the Poisson algebra $C^\infty(M)$ (or some subalgebra of it) into the set End(H) of operators on some Hilbert space H, having the properties:
1) $\hat 1 = 1$;
2) $[F_1, F_2]^\wedge = [\hat F_1, \hat F_2]_h = \frac{2\pi i}{h}(\hat F_1 \hat F_2 - \hat F_2 \hat F_1)$;
3) $(\bar F)^\wedge = (\hat F)^*$;
4) for some complete set $F_1, \dots, F_m$ of functions the operators $\hat F_1, \dots, \hat F_m$ act irreducibly on H;
where $F, F_1, \dots, F_m \in C^\infty(M)$, and h is Planck's constant. A linear mapping which possesses the first three properties is called a prequantization.

A symplectic manifold (M, ω) is called quantizable whenever the form $h^{-1}\omega \in H^2(M, \mathbb{R})$ is integral. If (M, ω) is quantizable, following Souriau–Kostant, there is a complex line bundle L over M, a Hermitian structure ( , ) and a connection ∇ with the curvature $h^{-1}\omega$ on L which are compatible:
\[ \xi(s_1, s_2) = (\nabla_\xi s_1, s_2) + (s_1, \nabla_\xi s_2), \qquad \xi \in \mathcal{X}(M),\ s_1, s_2 \in \Gamma^\infty(L). \]
Here $\Gamma^\infty(L)$ is the set of all smooth sections of the line bundle L. The triple (L, ∇, ( , )) is called the prequantum bundle. By H we denote the space of square-integrable sections with the inner product
\[ \langle s, s' \rangle = \int_M (s, s')\,\frac{\omega^n}{n!}, \qquad \text{if } \dim M = 2n. \]

Theorem 2.1. The Souriau–Kostant formula
\[ f \longrightarrow \hat f = \frac{h}{2\pi i}\nabla_{X_f} + f \in \mathrm{End}(H), \qquad f \in C^\infty(M), \]
gives a prequantization of (M, ω).

Proof. See [1].
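Property 2) of Definition 2.1 can be checked directly for this formula. The following is only a sketch, under one assumption that the text does not spell out: the statement that ∇ has the curvature $h^{-1}\omega$ is read here as $[\nabla_X,\nabla_Y]-\nabla_{[X,Y]} = -\frac{2\pi i}{h}\,\omega(X,Y)$. The identities $[f,g]=X_f(g)=-X_g(f)$ and $[X_f,X_g]=X_{[f,g]}$ are those stated at the beginning of this section.

```latex
\begin{align*}
[\hat f,\hat g]
 &= \Big(\tfrac{h}{2\pi i}\Big)^{2}[\nabla_{X_f},\nabla_{X_g}]
    + \tfrac{h}{2\pi i}\big([\nabla_{X_f},g]+[f,\nabla_{X_g}]\big) \\
 &= \Big(\tfrac{h}{2\pi i}\Big)^{2}\Big(\nabla_{[X_f,X_g]}
    - \tfrac{2\pi i}{h}\,\omega(X_f,X_g)\Big)
    + \tfrac{h}{2\pi i}\big(X_f(g)-X_g(f)\big) \\
 &= \Big(\tfrac{h}{2\pi i}\Big)^{2}\nabla_{X_{[f,g]}}
    - \tfrac{h}{2\pi i}[f,g] + \tfrac{h}{2\pi i}\,2[f,g] \\
 &= \tfrac{h}{2\pi i}\Big(\tfrac{h}{2\pi i}\nabla_{X_{[f,g]}}+[f,g]\Big)
  = \tfrac{h}{2\pi i}\,[f,g]^{\wedge}.
\end{align*}
```

Multiplying by $2\pi i/h$ gives $[f,g]^\wedge = \frac{2\pi i}{h}(\hat f\hat g-\hat g\hat f)$, which is property 2). With the opposite sign convention for the curvature, the two $[f,g]$ terms would not combine in this way.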
Remark. In general, the Souriau–Kostant prequantization is not a quantization. In order to get quantizations, we should use polarizations.

Let (M, ω) be a symplectic manifold, and $T^{\mathbb{C}}(M)$ the complexification of the tangent bundle over M.

Definition 2.2. A subbundle $P \subset T^{\mathbb{C}}(M)$ is called a polarization if it fulfills the conditions:
1) The fibre $P_x$ is a Lagrangian subspace of $T_x^{\mathbb{C}}(M)$ for each x ∈ M.
2) The distribution $x \to P_x$ is integrable.

It is clear that if the subbundle P is a polarization then the complex conjugate subbundle $\bar P$ is also a polarization. If $P = \bar P$, the polarization P is said to be real. If $P_x \cap \bar P_x = \{0\}$ for all x ∈ M, the polarization P is said to be Kählerian. In this case, the form $b(X, Y) = i\omega(X, \bar Y)$, $X, Y \in P_x$, is a nondegenerate Hermitian form on $P_x$. Here $\bar Y$ means the conjugation of $Y \in \mathfrak{g}^{\mathbb{C}}$ with respect to the real form $\mathfrak{g}$ of $\mathfrak{g}^{\mathbb{C}}$.

Let P be a polarization of the symplectic manifold (M, ω) and (L, ∇, ( , )) a prequantum bundle over M. We denote the space of all vector fields tangent to P on M by $V_P(M)$. A smooth section $s \in \Gamma^\infty(L)$ is said to be polarized if $\nabla_{\bar X} s = 0$ for every $X \in V_P(M)$. Our idea is to quantize M by replacing the Hilbert space H of prequantization by the completion $H_P$ of the subspace $\Gamma_P$ of square-integrable polarized sections of L. Then the operator $\hat f$ with $f \in C^\infty(M)$ maps local polarized sections into polarized sections if and only if $[X, X_f] \in V_P(M)$ whenever $X \in V_P(M)$. In this case, we call f quantizable. We denote the space of quantizable functions by $C_P(M)$. It is a subalgebra of the Poisson algebra $C^\infty(M)$.
3. Geometric Quantization and Coadjoint Orbits

Suppose that G is a simply-connected Lie group with the Lie algebra $\mathfrak{g}$. If a ∈ G and $X \in \mathfrak{g}$, we write aX for Ad a(X). The coadjoint action of G on $\mathfrak{g}^* = \mathrm{Hom}_{\mathbb{R}}(\mathfrak{g}, \mathbb{R})$ is just the transpose of the adjoint action:
\[ (af)(X) = f(a^{-1}X), \qquad a \in G,\ X \in \mathfrak{g},\ f \in \mathfrak{g}^*. \]
The differential of this action is
\[ (Xf)(Y) = f([Y, X]), \qquad X, Y \in \mathfrak{g},\ f \in \mathfrak{g}^*. \]
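The formula for the differential follows in one line from the definition of the coadjoint action. The only input is the standard fact $\frac{d}{dt}\big|_{t=0}\mathrm{Ad}(\exp(-tX))Y = -[X,Y]$:

```latex
\begin{align*}
(Xf)(Y) &= \frac{d}{dt}\Big|_{t=0}\big(\exp(tX)f\big)(Y)
         = \frac{d}{dt}\Big|_{t=0} f\big(\exp(-tX)\,Y\big) \\
        &= f\big(-[X,Y]\big) = f([Y,X]).
\end{align*}
```

In particular, $Xf = 0$ exactly when $f([X,Y]) = 0$ for all $Y \in \mathfrak{g}$, which is the condition defining the Lie algebra of the stabilizer of f.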
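For $f \in \mathfrak{g}^*$, the coadjoint orbit $O = Gf \cong G/G_f$ is a G-homogeneous space, where $G_f = \{a \in G \mid af = f\}$ and its Lie algebra is $\mathfrak{g}_f = \{X \in \mathfrak{g} \mid f([X, Y]) = 0,\ \forall Y \in \mathfrak{g}\}$. So one has a homomorphism from $\mathfrak{g}$ into $\mathcal{X}(O)$: $X \to \xi^X$ defined by
\[ (\xi^X \phi)(f') = \frac{d}{dt}\Big|_{t=0} \phi(\exp(-tX)f'), \qquad f' \in O,\ \phi \in C^\infty(O). \]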
Now we attach a skew-symmetric bilinear form $\omega_{f'}$ on $T_{f'}O$ for any $f' \in O$, defined by $\omega_{f'}(\xi^X, \xi^Y) = f'([X, Y])$, $X, Y \in \mathfrak{g}$. It is easy to see that $\omega_{f'}$ is well-defined and the family of $\omega_{f'}$ defines a closed two-form ω, and therefore a G-invariant symplectic structure on O. For $Y \in \mathfrak{g}$, write $\phi^Y \in C^\infty(O)$ for the function given by
\[ \phi^Y(f') = f'(Y), \qquad f' \in O. \]
Recall that a symplectic manifold (M, ω) is said to be a Hamiltonian G-space if the Lie group G acts transitively on M by symplectic automorphisms and there is a Lie homomorphism $\lambda : \mathfrak{g} \to C^\infty(M)$ such that $X_{\lambda(Y)}$ is the vector field generated by the one-parameter subgroup exp(tY) of automorphisms on M for every $Y \in \mathfrak{g}$.

Theorem 3.1. $\xi^Y$ is the Hamiltonian vector field generated by $\phi^Y$, that is, $i(\xi^Y)\omega + d\phi^Y = 0$. Moreover, $\lambda : \mathfrak{g} \to C^\infty(O)$, $Y \to \phi^Y$, is a Lie algebra homomorphism and (O, ω, λ) is a Hamiltonian G-space. Write $C_{\mathfrak{g}}(O) \subset C^\infty(O)$ for the image of λ, which is a subalgebra of $C^\infty(O)$.

Proof. It is a direct verification.
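The direct verification can be sketched as follows, using only the definitions of $\xi^X$, ω and $\phi^Y$ given above and the convention $[f,g] = X_f(g)$ of Sect. 2. Since the vectors $\xi^X_{f'}$, $X \in \mathfrak{g}$, span $T_{f'}O$, it is enough to evaluate both one-forms on them.

```latex
\begin{align*}
d\phi^{Y}(\xi^{X})(f') &= (\xi^{X}\phi^{Y})(f')
  = \frac{d}{dt}\Big|_{t=0}\big(\exp(-tX)f'\big)(Y)
  = \frac{d}{dt}\Big|_{t=0} f'\big(\exp(tX)\,Y\big)
  = f'([X,Y]), \\
\big(i(\xi^{Y})\omega\big)(\xi^{X})(f') &= \omega_{f'}(\xi^{Y},\xi^{X})
  = f'([Y,X]) = -f'([X,Y]), \\
[\phi^{X},\phi^{Y}](f') &= (\xi^{X}\phi^{Y})(f') = f'([X,Y])
  = \phi^{[X,Y]}(f').
\end{align*}
```

The first two lines add up to $i(\xi^Y)\omega + d\phi^Y = 0$, so $\xi^Y = X_{\phi^Y}$. The third line uses this identification and shows that $Y \mapsto \phi^Y$ respects brackets.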
Let $G_f^*$ be the character group of $G_f$ and $G_f^h \subset G_f^*$ be the set of all characters $\Lambda : G_f \to S^1$ such that for any $X \in \mathfrak{g}_f$ one has
\[ \frac{d}{dt}\Big|_{t=0} \Lambda(\exp(tX)) = \frac{2\pi i}{h} f(X). \]
Definition 3.1. The coadjoint orbit $O = Gf \subset \mathfrak{g}^*$ is said to be h-integral if $G_f^h \neq \emptyset$. In particular, a 1-integral coadjoint orbit is said to be integral.

A main result of Kostant [1] is that the orbit (O, ω) is quantizable if and only if it is h-integral. Let $O = Gf \subset \mathfrak{g}^*$ be h-integral. There is a character $\Lambda \in G_f^h$. We want to construct the prequantization of (O, ω) corresponding to Λ.

First, $O \cong G/G_f$, and $(G, O, \mathrm{pr}, G_f)$ with the natural projection $\mathrm{pr} : G \to O \cong G/G_f$ is a principal bundle. So we have the associated line bundle $L = G \times_{G_f} \mathbb{C}$ corresponding to the character $\Lambda : G_f \to \mathrm{Aut}(\mathbb{C}) = \mathbb{C}^*$. Let
\[ L^* = \bigcup_{f \in O} (L_f - \{0\}). \]
The product group $I = G \times \mathbb{C}^* = G \times (\mathbb{C} - \{0\})$ operates transitively on $L^*$:
\[ (a, c)[g, z] = [ag, cz], \qquad (a, c) \in I,\ [g, z] \in L^*, \]
and the isotropy group of I at u = [e, 1] is $I_u = \{(a, \Lambda(a^{-1})) \mid a \in G_f\}$. The Lie algebra of I is $\mathfrak{i} = \mathfrak{g} \oplus \mathbb{C}$. Choose $g \in \mathfrak{i}^*$ (here g denotes a linear functional on $\mathfrak{i}$, not a group element) as follows:
\[ g(X, z) = f(X) + \frac{hz}{2\pi i}, \qquad (X, z) \in \mathfrak{i}. \]
Obviously $g|_{\mathfrak{i}_u} = 0$. Therefore $g \in (\mathfrak{i}/\mathfrak{i}_u)^* \cong T_u^* L^*$. Since $g \in T_u^* L^*$ is invariant under the action of $I_u$, there exists a unique 1-form α on $L^*$ which is I-invariant and satisfies $\alpha_u = g$. It is easy to see that α is a connection 1-form on $L^*$ which defines a connection ∇ on L.

Second, since $\Lambda : G_f \to S^1$, there exists a Hermitian structure ( , ) on L such that for g ∈ G and $c \in \mathbb{C}^*$, $([g, c], [g, c]) = |c|^2$. Now G ⊂ I acts on L; moreover α and ( , ) are G-invariant. Let
\[ H = \Big\{ s \in \Gamma(L) \;\Big|\; \int_O (s, s)\,\frac{\omega^n}{n!} < \infty, \ \text{if } \dim O = 2n \Big\}. \]
G has a unitary representation ρ on H: $(\rho(g)s)(f') = g\,s(g^{-1}f')$, g ∈ G, f' ∈ O, s ∈ H.

Theorem 3.2. The connection ∇ is determined by
\[ \nabla_{\xi^X} s = d\rho(X)s - \frac{2\pi i}{h}\,\phi^X s, \qquad X \in \mathfrak{g},\ s \in \Gamma^\infty(L). \]
∇ is compatible with ( , ) and has the curvature $h^{-1}\omega$. Therefore
\[ \phi^X \longrightarrow \frac{h}{2\pi i}\,d\rho(X), \qquad \phi^X \in C_{\mathfrak{g}}(O), \]
is a prequantization of (O, ω).
Proof. Write $X^r$ for the right-invariant vector field on G satisfying $X^r_e = X \in \mathfrak{g}$, and $[X^r_g, z_r] \in T_{[g,r]}L$ for the image of $(X^r_g, z(\partial/\partial z)_r) \in T_{(g,r)}I$ under the natural map induced by $I = G \times \mathbb{C} \to L : (g, r) \to [g, r]$, (g, r) ∈ I. Then
\[ \alpha_{[g,r]}([X^r_g, z_r]) = (l^*_{(g^{-1}, r^{-1})}\alpha_u)([X^r_g, z_r]) = \alpha_u([l_{g^{-1}*}X^r_g,\ l_{r^{-1}*}z_r]) = \alpha_u(g^{-1}X, r^{-1}z) = (gf)(X) + \frac{h}{2\pi i r}\,z. \]
For any $s \in \Gamma^\infty(L)$, we have $F_s \in C^\infty(G)$ such that $s(gf) = [g, F_s(g)]$. As a result,
\[ (s_*\xi^X)(gf) = [-X^r_g,\ (-X^r F_s)(g)] \]
and
\[ (d\rho(X)s)(gf) = [g, -(X^r F_s)(g)] = -(X^r F_s)(g)\,F_s(g)^{-1}\,s(gf). \]
Therefore,
\begin{align*}
(\nabla_{\xi^X}s)(gf) &= 2\pi i h^{-1}(s^*\alpha)_{gf}(\xi^X)\,s(gf) \\
 &= 2\pi i h^{-1}\alpha_{s(gf)}(s_*\xi^X)\,s(gf) \\
 &= 2\pi i h^{-1}\alpha_{[g, F_s(g)]}([-X^r_g, (-X^r F_s)(g)])\,s(gf) \\
 &= 2\pi i h^{-1}\Big(-(gf)(X) + \frac{h(-X^r F_s)(g)}{2\pi i F_s(g)}\Big)s(gf) \\
 &= -2\pi i h^{-1}\phi^X(gf)\,s(gf) + (d\rho(X)s)(gf).
\end{align*}
The rest of the proof is an easy verification.
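One step of the remaining verification, the curvature, can be sketched from the formula of Theorem 3.2 alone. Two assumptions are made here that the text does not state explicitly: the curvature is read as the operator $R(\xi,\eta) = [\nabla_\xi,\nabla_\eta]-\nabla_{[\xi,\eta]}$, and $d\rho(X)$ is applied to a product $\phi s$ by the Leibniz rule $d\rho(X)(\phi s) = (\xi^X\phi)s + \phi\,d\rho(X)s$, which follows from the definition of ρ. Write $c = 2\pi i/h$ and use $\xi^X\phi^Y = \phi^{[X,Y]}$ and $[\xi^X,\xi^Y] = \xi^{[X,Y]}$.

```latex
\begin{align*}
[\nabla_{\xi^X},\nabla_{\xi^Y}]
 &= [d\rho(X),d\rho(Y)] - c\,[d\rho(X),\phi^{Y}] - c\,[\phi^{X},d\rho(Y)] \\
 &= d\rho([X,Y]) - c\,\xi^{X}\phi^{Y} + c\,\xi^{Y}\phi^{X}
  = d\rho([X,Y]) - 2c\,\phi^{[X,Y]}, \\
\nabla_{[\xi^X,\xi^Y]} &= \nabla_{\xi^{[X,Y]}}
  = d\rho([X,Y]) - c\,\phi^{[X,Y]}, \\
R(\xi^X,\xi^Y)(f') &= -c\,\phi^{[X,Y]}(f')
  = -\frac{2\pi i}{h}\,f'([X,Y])
  = -\frac{2\pi i}{h}\,\omega_{f'}(\xi^X,\xi^Y).
\end{align*}
```

Under this reading, "curvature $h^{-1}\omega$" corresponds to $R = -\frac{2\pi i}{h}\omega$, which is the sign for which the formula $\frac{h}{2\pi i}\nabla_{\xi^X} + \phi^X = \frac{h}{2\pi i}d\rho(X)$ is compatible with property 2) of Definition 2.1.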
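Definition 3.2. Suppose that $\mathfrak{g}^{\mathbb{C}}$ is the complexification of $\mathfrak{g}$ and $\mathfrak{p}$ is a subalgebra of $\mathfrak{g}^{\mathbb{C}}$ containing $\mathfrak{g}_f$. $\mathfrak{p}$ is called a polarization at $f \in \mathfrak{g}^*$ if
a) $f([X, Y]) = 0$, $\forall X, Y \in \mathfrak{p}$, and
b) $2\dim_{\mathbb{C}}\mathfrak{p} = \dim G + \dim \mathfrak{g}_f$.

Set $\xi^{X+iY} = \xi^X + i\xi^Y$ for $X, Y \in \mathfrak{g}$. Then $X \to \xi^X$, $X \in \mathfrak{g}^{\mathbb{C}}$, is a Lie homomorphism from $\mathfrak{g}^{\mathbb{C}}$ into $T^{\mathbb{C}}(O)$.

Theorem 3.3. A polarization $\mathfrak{p}$ at f determines a polarization P of the symplectic manifold (O = Gf, ω) which is G-invariant, that is, $g_*P_f = P_{gf}$ for any g ∈ G. And $C_{\mathfrak{g}}(O) \subset C_P(O)$, that is, $\phi^X$ is quantizable for all $X \in \mathfrak{g}$.

Proof. Define $P_f = \mathrm{span}\{\xi^X_f \mid X \in \mathfrak{p}\}$ and
\[ P_{gf} = g_*P_f = \mathrm{span}\{\xi^{gX}_{gf} \mid X \in \mathfrak{p}\} \]
for any g ∈ G. $P_{gf}$ is well-defined since $G_f\,\mathfrak{p} \subset \mathfrak{p}$. It is also an isotropic subspace of $T^{\mathbb{C}}_{gf}O$ because
\[ \omega_{gf}(\xi^{gX}, \xi^{gY}) = (gf)([gX, gY]) = f([X, Y]) = 0 \]
for any g ∈ G and $X, Y \in \mathfrak{p}$. By b) of Definition 3.2, $P_{gf}$ is a Lagrangian subspace of $T^{\mathbb{C}}_{gf}O$. And the distribution $gf \to P_{gf}$, gf ∈ O, is integrable since $\mathfrak{p}$ is a Lie subalgebra. Therefore P is a G-invariant polarization of O. That P is G-invariant implies $[\xi^X, \eta] \in V_P(O)$, $\forall \eta \in V_P(O)$, i.e., $C_{\mathfrak{g}}(O) \subset C_P(O)$.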
4. Geometric Quantization of Elliptic and Hyperbolic Coadjoint Orbits

Suppose G is a reductive Lie group. Then we can consider the Lie algebra of G as a subalgebra $\mathfrak{g}$ of gl(n, R) for some n ∈ Z. Define the form $(X, Y) = \mathrm{tr}\,XY$, $X, Y \in \mathfrak{g}$, on $\mathfrak{g}$, which is a G-invariant symmetric nondegenerate bilinear form. Using this form we can identify $\mathfrak{g}$ with $\mathfrak{g}^*$ by an isomorphism $X \to f_X$, $X \in \mathfrak{g}$, such that $f_X(Y) = (X, Y) = \mathrm{tr}\,XY$, $X, Y \in \mathfrak{g}$. If $O \subset \mathfrak{g}$ is an adjoint orbit, by $f_O \subset \mathfrak{g}^*$ we denote the corresponding coadjoint orbit. Suppose that $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ is the Cartan decomposition of $\mathfrak{g}$ and K is the maximal compact subgroup of G with the Lie algebra $\mathfrak{k}$. It is well-known that $\mathfrak{g} = z(\mathfrak{g}) \oplus [\mathfrak{g}, \mathfrak{g}]$, where $z(\mathfrak{g})$ is the center of $\mathfrak{g}$.

Definition 4.1. $X \in \mathfrak{g} \subset gl(n, \mathbb{C})$ is said to be semisimple (nilpotent, hyperbolic, or elliptic) if the linear transformation of $\mathbb{C}^n$ defined by X is diagonalizable (respectively nilpotent, diagonalizable with real eigenvalues, or diagonalizable with purely imaginary eigenvalues).

Theorem 4.1. Assume $X = X_z + X' \in \mathfrak{g}$, $X_z \in z(\mathfrak{g})$ and $X' \in [\mathfrak{g}, \mathfrak{g}]$. Then
a) X is semisimple iff ad X is semisimple;
b) X is nilpotent iff $X_z = 0$ and ad X is nilpotent;
c) X is hyperbolic iff $X_z \in \mathfrak{p}$ and ad X is hyperbolic, or equivalently, the conjugacy class of X contains an element in $\mathfrak{p}$;
d) X is elliptic iff $X_z \in \mathfrak{k}$ and ad X is elliptic, or equivalently, the conjugacy class of X contains an element in $\mathfrak{k}$.

Proof. It follows easily from Humphreys [4] and the definitions.
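A small example may help to fix the terminology of Definition 4.1. The choice $\mathfrak{g} = sl(2, \mathbb{R})$ and the names $X_e, X_h, X_n$ are introduced here only for illustration.

```latex
\[
X_e=\begin{pmatrix}0&1\\-1&0\end{pmatrix},\qquad
X_h=\begin{pmatrix}1&0\\0&-1\end{pmatrix},\qquad
X_n=\begin{pmatrix}0&1\\0&0\end{pmatrix}.
\]
\begin{align*}
\det(X_e-\lambda) &= \lambda^{2}+1 &&\Rightarrow\ \lambda=\pm i &&\text{(elliptic)},\\
\det(X_h-\lambda) &= \lambda^{2}-1 &&\Rightarrow\ \lambda=\pm 1 &&\text{(hyperbolic)},\\
X_n^{2} &= 0 && &&\text{(nilpotent)}.
\end{align*}
```

For the Cartan involution $\theta(X) = -{}^tX$, $\mathfrak{k}$ consists of the skew-symmetric and $\mathfrak{p}$ of the symmetric traceless matrices, so $X_e \in \mathfrak{k}$ and $X_h \in \mathfrak{p}$, in agreement with c) and d) of Theorem 4.1 (here $z(\mathfrak{g}) = 0$).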
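Definition 4.2. An element $f_X \in \mathfrak{g}^*$ is said to be semisimple (or nilpotent, elliptic, or hyperbolic) if $X \in \mathfrak{g}$ is.

In order to quantize elliptic and hyperbolic coadjoint orbits, we consider some structure theory of the Lie algebra $\mathfrak{g}$. Choose a maximal abelian subalgebra $\mathfrak{t}$ of $\mathfrak{k}$. The Weyl group $W(\mathfrak{g}, \mathfrak{t}) = N_K(\mathfrak{t})/C_K(\mathfrak{t})$ acts on $\mathfrak{t}$, where $N_K(\mathfrak{t}) = \{k \in K \mid \mathrm{Ad}(k)\mathfrak{t} \subset \mathfrak{t}\}$, $C_K(\mathfrak{t}) = \{k \in K \mid \mathrm{Ad}(k)Y = Y,\ \forall Y \in \mathfrak{t}\}$. And we have the decomposition of $\mathfrak{g}^{\mathbb{C}}$:
\[ \mathfrak{g}^{\mathbb{C}} = \sum_{\alpha \in \mathfrak{t}^*} \mathfrak{g}_\alpha, \]
where $\mathfrak{g}_\alpha = \{X \in \mathfrak{g}^{\mathbb{C}} \mid [T, X] = \alpha(T)X,\ \forall T \in \mathfrak{t}\}$. Write $\Delta(\mathfrak{g}, \mathfrak{t})$ for the root system $\{\alpha \in \mathfrak{t}^* \mid \mathfrak{g}_\alpha \neq \{0\}\} - \{0\}$. It is easy to see that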
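\[ \theta(\mathfrak{g}_\alpha) = \mathfrak{g}_\alpha, \qquad \bar{\mathfrak{g}}_\alpha = \{\bar Y \mid Y \in \mathfrak{g}_\alpha\} = \mathfrak{g}_{-\alpha}, \]
where $\bar Y$ means the conjugation of $Y \in \mathfrak{g}^{\mathbb{C}}$ corresponding to the real form $\mathfrak{g}$, and θ is the Cartan involution of $\mathfrak{g}$: $\theta(X) = -{}^tX$, $X \in \mathfrak{g}$. We say that a root α is compact imaginary if $\theta|_{\mathfrak{g}_\alpha} = \mathrm{id}$, and noncompact imaginary if $\theta|_{\mathfrak{g}_\alpha} = -\mathrm{id}$. α is said to be complex if it is neither compact imaginary nor noncompact imaginary. We denote the sets of compact imaginary, noncompact imaginary and complex roots by $\Delta_k$, $\Delta_n$ and $\Delta_c$, respectively. Fix a positive root system $\Delta^+$ of $\Delta(\mathfrak{g}, \mathfrak{t})$. Correspondingly, we have $\Delta_k^+$, $\Delta_n^+$ and $\Delta_c^+$, and $\Delta^+ = \Delta_k^+ \cup \Delta_n^+ \cup \Delta_c^+$. Write $\delta = \frac12\sum_{\alpha \in \Delta^+}(\dim \alpha)\alpha$. Fix a closed Weyl chamber
\[ E = \{X \in \mathfrak{t} \mid 2\pi i\alpha(X) \ge 0 \text{ for } \alpha \in \Delta_k^+ \cup \Delta_c^+,\ 2\pi i\alpha(X) \le 0 \text{ for } \alpha \in \Delta_n^+\}. \]
It is easy to prove

Theorem 4.2. There is a bijection between E and the set of elliptic coadjoint orbits in $\mathfrak{g}^*$.

Next, we choose a maximal abelian subspace $\mathfrak{a}$ of $\mathfrak{p}$. The Weyl group $W(\mathfrak{g}, \mathfrak{a}) = N_K(\mathfrak{a})/C_K(\mathfrak{a})$ acts on $\mathfrak{a}$. We have the decomposition of $\mathfrak{g}$:
\[ \mathfrak{g} = \sum_{\alpha \in \mathfrak{a}^*} \mathfrak{g}_\alpha, \]
where $\mathfrak{g}_\alpha = \{X \in \mathfrak{g} \mid [A, X] = \alpha(A)X,\ \forall A \in \mathfrak{a}\}$. Write $\Delta(\mathfrak{g}, \mathfrak{a})$ for the root system $\{\alpha \in \mathfrak{a}^* \mid \mathfrak{g}_\alpha \neq \{0\}\} - \{0\}$. Fix a positive root system $\Delta^+$ of $\Delta(\mathfrak{g}, \mathfrak{a})$ and a closed Weyl chamber
\[ H = \{X \in \mathfrak{a} \mid 2\pi i\alpha(X) \ge 0 \text{ for } \alpha \in \Delta^+\}. \]
Write $p = \frac12\sum_{\alpha \in \Delta^+}(\dim \alpha)\alpha$. Then the following theorem is obvious.

Theorem 4.3. There is a bijection between H and the set of hyperbolic coadjoint orbits in $\mathfrak{g}^*$.

Now we can come to the geometric quantization of coadjoint orbits. Suppose $f_O \subset \mathfrak{g}^*$ is an elliptic coadjoint orbit, where $O \subset \mathfrak{g}$ is an elliptic adjoint orbit. Then there exists a unique element X ∈ E such that $f_O = Gf_X$. So $f_O \cong G/G_X$, where $G_X = \{g \in G \mid gX = X\}$ with the Lie algebra $\mathfrak{g}_X = \{Y \in \mathfrak{g} \mid [Y, X] = 0\}$. Define
\[ \mathfrak{n}_X = \sum_{\alpha \in \Delta^+,\ \alpha(X) \neq 0} \mathfrak{g}_\alpha, \qquad \mathfrak{p}_X = (\mathfrak{g}_X)^{\mathbb{C}} \oplus \mathfrak{n}_X. \]
Then $\mathfrak{p}_X$ is a parabolic subalgebra of $\mathfrak{g}^{\mathbb{C}}$ and $\mathfrak{p}_X \cap \bar{\mathfrak{p}}_X = \mathfrak{g}_X^{\mathbb{C}}$. $\mathfrak{p}_X$ induces a G-invariant Kählerian polarization P of $f_O$. Assume that $f_O$ is h-integral, that is, there exists a character $\Lambda \in G^h_{f_X}$. Then $(f_O, \omega)$ is a quantizable symplectic manifold. Let $L = G \times_{G_X} \mathbb{C}$ be the associated line bundle corresponding to the character $\Lambda : G_X \to \mathrm{Aut}(\mathbb{C}) = \mathbb{C}^*$. According to Sect. 3, we have a Hermitian structure ( , ) and a connection ∇ with the curvature $h^{-1}\omega$ on L which are compatible.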
The Kählerian polarization P determines a complex structure on $f_O$ such that every $\xi \in V_P(f_O)$ is a holomorphic vector field. Now any two nonvanishing polarized sections s and s' of L are related by $s = \phi s'$ with φ a holomorphic function on $f_O$. If we use only polarized sections to define the local trivializations, the transition functions are holomorphic, and P makes L a holomorphic line bundle. Then the space $H_P$ of square-integrable polarized sections, which are exactly the holomorphic sections of L, is closed in H (see Sect. 2), and therefore is a Hilbert space. Observe that the action of G on L is holomorphic because the polarization P is G-invariant. So $\rho(g)H_P \subset H_P$ for every g ∈ G. Now it is obvious that

Theorem 4.4. The map from $C_{\mathfrak{g}}(f_O)$ to $\mathrm{End}(H_P)$:
\[ \phi^X \longrightarrow \frac{h}{2\pi i}\,d\rho(X) = \frac{h}{2\pi i}\nabla_{\xi^X} + \phi^X, \qquad X \in \mathfrak{g}, \]
is a geometric quantization of $f_O$.

Next suppose that $f_O \subset \mathfrak{g}^*$ is a hyperbolic coadjoint orbit. Then there exists a unique element $X \in H \subset \mathfrak{a}$ such that O = GX.

Theorem 4.5. The hyperbolic coadjoint orbit $f_O = Gf_X$ is h-integral. Moreover, $G^h_{f_X}$ is naturally in one-to-one correspondence with the irreducible representations of the group of connected components of $G_X$.

Proof. Suppose that
\[ \mathfrak{g}_X = \mathfrak{k}_X \oplus \mathfrak{p}_0 \]
is the Cartan decomposition of $\mathfrak{g}_X$, and $\mathfrak{p}_1$ is the orthogonal complement of $\mathfrak{a}_X = z(\mathfrak{g}_X) \cap \mathfrak{p}_0$ in $\mathfrak{p}_0$. Then $X \in \mathfrak{a}_X$. Let $K_X \subset G_X$ be the maximal subgroup with Lie algebra $\mathfrak{k}_X$. Define $G_1 = K_X\exp(\mathfrak{p}_1)$ and $A_X = \exp(\mathfrak{a}_X)$, which is a vector group. It is easy to see that $G_X = G_1 \times A_X$. Since $X \in \mathfrak{a}_X$, we have $f_X|_{\mathfrak{k}_X + \mathfrak{p}_1} = 0$. Therefore we can define an irreducible representation Λ of $G_X$ which is trivial on the identity component of $G_1$ and
\[ \Lambda(\exp(Z)) = e^{2\pi i h^{-1}(Z, X)}, \qquad Z \in \mathfrak{a}_X. \]
So $\Lambda \in G^h_{f_X}$. Evidently such representations correspond precisely to the representations of $G_1/(G_1)_0 \cong G_X/(G_X)_0$.

Define
\[ \mathfrak{n}_X = \sum_{\alpha(X) > 0} \mathfrak{g}_\alpha, \qquad \mathfrak{p}_X = \mathfrak{g}_X \oplus \mathfrak{n}_X, \qquad N_X = \exp(\mathfrak{n}_X). \]
Theorem 4.6. The complexification of pX is a polarization at fX ∈ g∗ . It induces a G-invariant real polarization P of the orbit fO = GfX . Proof. It is obvious.
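Theorems 4.5 and 4.6 can be made concrete in the smallest case. The following sketch takes G = SL(2, R), which is an illustrative choice not made in the text, with $\mathfrak{a}$ the diagonal matrices and $e_{12}$ the elementary matrix spanning the positive root space.

```latex
\begin{align*}
&X=\begin{pmatrix}a&0\\0&-a\end{pmatrix}\ (a>0),\qquad
 G_X=\Big\{\begin{pmatrix}t&0\\0&t^{-1}\end{pmatrix}: t\neq 0\Big\}
 =\{\pm 1\}\times A_X, \\
&\mathfrak n_X=\mathbb R\,e_{12},\qquad
 \mathfrak p_X=\mathfrak g_X\oplus\mathfrak n_X
 =\Big\{\begin{pmatrix}s&u\\0&-s\end{pmatrix}: s,u\in\mathbb R\Big\}, \\
&f_X([Y,Z])=\operatorname{tr}(X[Y,Z])=0\quad (Y,Z\in\mathfrak p_X),
 \ \text{since } [Y,Z]\in\mathfrak n_X, \\
&2\dim_{\mathbb C}\mathfrak p_X^{\mathbb C}=4=3+1=\dim G+\dim\mathfrak g_{f_X}.
\end{align*}
```

The last two lines are conditions a) and b) of Definition 3.2. Here $G_X/(G_X)_0 \cong \mathbb{Z}/2$, so in this example $G^h_{f_X}$ has two elements, which differ by the sign character of $\{\pm 1\}$ and agree with $\exp(Z) \mapsto e^{2\pi i h^{-1}(Z,X)}$ on $A_X$.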
Write $P_X = G_XN_X$. Then $P_Xf'$, $f' \in f_O$, are the leaves of P, which are complete and simply-connected. And $Q = f_O/P \cong G/P_X$ is an oriented Hausdorff manifold. Assume that $\dim f_O = 2n$. Since Q is an oriented Hausdorff manifold, we can take the square root $\delta_P = \sqrt{K_P}$ of the canonical line bundle $K_P = \pi^*\Lambda^n_{\mathbb{C}}(Q)$ by taking the square roots of the transition functions, where $\pi : f_O \to Q$ is the natural projection
and $\Lambda^n_{\mathbb{C}}(Q)$ is the bundle of n-forms on Q. In fact, $\delta_P$ is the associated line bundle over $f_O \cong G/G_X$ corresponding to the character $1 \otimes e^p$ of $G_X = G_1 \times A_X$:
\[ (1 \otimes e^p)(g_1, \exp(Y)) = \exp(p(Y)), \qquad g_1 \in G_1,\ Y \in \mathfrak{a}_X. \]
Define the covariant derivative $\nabla_X$ on $K_P$ for $X \in V_P(f_O)$ by $\nabla_X\beta = i(X)d\beta$, $\beta \in \Gamma^\infty(K_P)$. The sections of $K_P$ which are covariantly constant along P are the pull-backs of n-forms on Q. If Z is a vector field on $f_O$ whose flow preserves P, then the Lie derivative $L_Z$ maps sections of $K_P$ to sections of $K_P$. The $\nabla_X$ and $L_Z$ can pass to the bundle $\delta_P = \sqrt{K_P}$, where they are determined by
\[ 2(\nabla_X\tau)\tau = \nabla_X\tau^2, \qquad 2(L_Z\tau)\tau = L_Z\tau^2. \]
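Here τ is a section of $\delta_P$. Let $\Lambda \in G^h_{f_X}$. Then the associated line bundle $L = G \times_{G_X} \mathbb{C}$ corresponding to Λ is a complex line bundle over M with a hermitian metric ( , ) and a compatible hermitian linear connection ∇ with the curvature $h^{-1}\omega$ described as before. Set $L_P = L \otimes \delta_P$. Define
\[ V_P = \{\tilde s = s\tau \in \Gamma^\infty(L_P) \mid \nabla_X\tilde s = (\nabla_Xs)\tau + s\nabla_X\tau = 0\}. \]
If $\tilde s = s\tau$ and $\tilde s' = s'\tau' \in V_P$, then $(\tilde s, \tilde s') = (s, s')\tau\tau' \in \Gamma^\infty(K_P)$ and for $X \in V_P(M)$,
\[ \nabla_X(\tilde s, \tilde s') = (\nabla_X\tilde s, \tilde s') + (\tilde s, \nabla_X\tilde s') = 0. \]
Hence we can identify $(\tilde s, \tilde s')$ with an n-form on Q and define an inner product on $V_P$ by
\[ \langle\tilde s, \tilde s'\rangle = \int_Q (\tilde s, \tilde s'). \]
The completion of $\{\tilde s \in V_P \mid \langle\tilde s, \tilde s\rangle < \infty\}$ is our quantum space $H_P$.

Now G acts on the associated line bundles L and $\delta_P$. The actions of G on L, $\delta_P$ and $f_O$ induce the actions ρ on $\Gamma^\infty(L)$ and γ on $\Gamma^\infty(\delta_P)$ of G:
\[ (\rho(g)s)(f') = g\,s(g^{-1}f'), \qquad g \in G,\ f' \in f_O,\ s \in \Gamma^\infty(L), \]
\[ (\gamma(g)\tau)(f') = g\,\tau(g^{-1}f'), \qquad g \in G,\ f' \in f_O,\ \tau \in \Gamma^\infty(\delta_P). \]
So G acts on $\Gamma^\infty(L_P)$ by $\pi = \rho \otimes \gamma$ and $\pi(G)V_P \subset V_P$ since P is a G-invariant polarization.

Theorem 4.7. The map
\[ \pi : G \longrightarrow \mathrm{Aut}(V_P) \]
is a unitary representation of G. Its differential is
\[ d\pi(X)\tilde s = (\nabla_{\xi^X}s + 2\pi i h^{-1}\phi^Xs)\nu + sL_{\xi^X}\nu, \qquad \forall X \in \mathfrak{g},\ \tilde s = s\nu \in V_P. \]

Proof. It is easy to prove $d\gamma(X) = L_{\xi^X}$ by $\delta_P = \sqrt{K_P}$ and $K_P = \pi^*\Lambda^n_{\mathbb{C}}Q$. According to Theorem 3.2, $d\rho(X) = \nabla_{\xi^X} + 2\pi i h^{-1}\phi^X$. The conclusion is now obvious.

Corollary 4.1. The map $\phi^X \longrightarrow \frac{h}{2\pi i}\,d\pi(X)$, $X \in \mathfrak{g}$, defines a geometric quantization of $f_O$.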
5. Borel–Weil Theorem, Holomorphic Discrete Series and Spherical Representations

In this section we want to determine which representations of Lie groups can be obtained by geometric quantization of elliptic and hyperbolic coadjoint orbits. We fix Planck's constant h = 1.

For a compact Lie group G, every coadjoint orbit is elliptic. Suppose $\mathfrak{g}$ is the Lie algebra of G. Choose a Cartan subalgebra $\mathfrak{t} \subset \mathfrak{g}$; then we get the root system $\Delta(\mathfrak{g}, \mathfrak{t})$. Fix a positive root system $\Delta^+$. Then the set of dominant weights is in one-to-one correspondence with the set of integral coadjoint orbits. Irreducible unitary representations of G are determined by their highest weights, which are dominant weights. Every highest weight has the form $2\pi i f_X|_{\mathfrak{t}}$ with $X \in E \subset \mathfrak{t}$, and the coadjoint orbit $f_O = Gf_X$ is integral and elliptic. Choose $\Lambda \in G^1_{f_X}$. Then the associated line bundle $L = G \times_{G_X} \mathbb{C}$ corresponding to Λ is a complex line bundle over $f_O$ with a hermitian metric ( , ) and a compatible hermitian linear connection ∇ with the curvature ω described as before. And the parabolic subalgebra $\mathfrak{p}_X$ before Theorem 4.4 determines complex structures on $f_O$ and L such that $H_P$ is exactly the set of holomorphic square-integrable sections. $H_P$ is a Hilbert space with the inner product
\[ \langle s, s'\rangle = \int_{f_O} (s, s')\,\frac{\omega^n}{n!}, \qquad s, s' \in H_P \quad (\dim f_O = 2n). \]

Lemma 5.1. The Hilbert space $H_P$ is isomorphic to the space $\tilde H_P$ of functions F on G satisfying the following conditions:
1) $F(ga) = \Lambda(a^{-1})F(g)$, g ∈ G, $a \in G_X$;
2) $Y^lF(g) = -2\pi i f_X(Y)F(g)$, g ∈ G, $Y \in \bar{\mathfrak{p}}_X$.
Here $Y^lF(g) = \frac{d}{dt}\big|_{t=0}F(g\exp(tY))$ if $Y \in \mathfrak{g}$ and
\[ (Y_1 + iY_2)^lF(g) = Y_1^lF(g) + iY_2^lF(g), \qquad \forall Y_1, Y_2 \in \mathfrak{g}. \]
The inner product on $\tilde H_P$ is determined by the norm
\[ \|F\|^2 = \int_G |F(g)|^2\,dg, \qquad F \in \tilde H_P. \]
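Proof. There is a function $F_s : G \to \mathbb{C}$ for any $s \in H_P$ such that
\[ s(gf_X) = [g, F_s(g)], \qquad g \in G. \]
Then $F_s(ga) = \Lambda(a^{-1})F_s(g)$ for $a \in G_X$ and g ∈ G; moreover,
\[ \int_{f_O} (s, s)\,\frac{\omega^n}{n!} = \|F_s\|^2. \]
If $s \in H_P$, then
\[ (\nabla_\eta s)(gf_X) = \nabla_{\eta_{gf_X}}s = 0 \]
for any g ∈ G and $\eta \in V_P(f_O)$. In other words, $\nabla_{\xi^{gY}_{gf_X}}s = 0$ for any g ∈ G and $Y \in \bar{\mathfrak{p}}$. But
\[ \frac{1}{2\pi i}\nabla_{\xi^{gY}_{gf_X}}s = \frac{1}{2\pi i}\,d\rho(gY)s(gf_X) - (\phi^{gY}s)(gf_X), \]
so $d\rho(gY)s(gf_X) - 2\pi i\phi^{gY}(gf_X)s(gf_X) = 0$, that is,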
$-Y^lF_s(g) - 2\pi i f_X(Y)F_s(g) = 0$. Therefore $Y^lF_s(g) = -2\pi i f_X(Y)F_s(g)$ for any g ∈ G and $Y \in \bar{\mathfrak{p}}$, i.e., $F_s \in \tilde H_P$. The converse is similar.

Now G acts on $H_P$ and $\tilde H_P$ by ρ: $\rho(g)s(x) = g\,s(g^{-1}x)$, $x \in f_O$, g ∈ G, $s \in H_P$, and $\tilde\rho$: $\tilde\rho(g)F(a) = F(g^{-1}a)$, a, g ∈ G, $F \in \tilde H_P$. The isomorphism in Lemma 5.1 is G-equivariant, and the representations ρ and $\tilde\rho$ are equivalent.

Now define the map $\sigma : G \to \tilde H_P^* = \mathrm{Hom}(\tilde H_P, \mathbb{C})$ as
\[ \sigma(g)(F) = F(g), \qquad F \in \tilde H_P,\ g \in G. \]

Lemma 5.2. The image Im(σ) of σ is not contained in any proper subspace of $\tilde H_P^*$. That is, $\mathrm{span}\{\sigma(g) \mid g \in G\} = \tilde H_P^*$.

Proof. If there exists a proper subspace V of $\tilde H_P^*$ such that Im(σ) ⊂ V, then the subspace
\[ V^\perp = \{F \in \tilde H_P \mid \xi(F) = 0,\ \forall \xi \in V\} \]
is a proper subspace of $\tilde H_P$. Choose a nonzero element $F \in V^\perp$. Then $\sigma(g)(F) = F(g) = 0$ for any g ∈ G, that is, F = 0. This is a contradiction.

Remark. The elements of Im(σ) are called coherent states [5].

Theorem 5.1 (Borel–Weil). Let X, $H_P$ and ρ be as before. Then $H_P \neq \{0\}$ and ρ is the irreducible representation of G with the highest weight $2\pi i f_X|_{\mathfrak{t}}$. Moreover,
\[ \phi^X \longrightarrow \frac{1}{2\pi i}\,d\rho(X), \qquad X \in \mathfrak{g}, \]
is a geometric quantization of the coadjoint orbit $f_O$.

Proof. First of all, we prove $H_P \neq \{0\}$. By Peter–Weyl theory, there exists an irreducible representation ξ of G with the lowest weight $-2\pi i f_X|_{\mathfrak{t}}$. Write V for the representation space of ξ and let $v_0 \in V$ be a nonzero lowest weight vector belonging to the weight $-2\pi i f_X|_{\mathfrak{t}}$. Suppose that ( , ) is the Hermitian inner product on V. For any v ∈ V, define the function $\tilde v$ on G as follows:
\[ \tilde v(g) = (\xi(g)v_0, v), \qquad g \in G. \]
It is easy to verify that $\tilde v \in \tilde H_P$. Since $\tilde v_0 \in \tilde H_P$, $\tilde H_P \neq \{0\}$, and hence $H_P \neq \{0\}$.

Consider the dual representation $\tilde\rho^*$ of $\tilde\rho$:
\[ (\tilde\rho^*(g)\xi)(F) = \xi(\tilde\rho(g^{-1})F), \qquad g \in G,\ F \in \tilde H_P,\ \xi \in \tilde H_P^*. \]
By Lemma 5.2, $\mathrm{span}\{\sigma(g) \mid g \in G\} = \tilde H_P^*$. But
\[ (\tilde\rho^*(g)\sigma(e))F = \sigma(e)(\tilde\rho(g^{-1})F) = (\tilde\rho(g^{-1})F)(e) = F(ge) = F(g), \qquad \forall F \in \tilde H_P,\ g \in G. \]
Hence $\tilde\rho^*(g)\sigma(e) = \sigma(g)$, g ∈ G, and $\mathrm{span}\{\tilde\rho^*(g)\sigma(e) \mid g \in G\} = \tilde H_P^*$. Therefore, $\tilde\rho^*$ is an irreducible representation of G. Besides,
\[ (\tilde\rho^*(Y)\sigma(e))(F) = -\sigma(e)(\tilde\rho(Y)F) = -(\tilde\rho(Y)F)(e) = (Y^lF)(e) = -2\pi i f_X(Y)\sigma(e)(F) \]
for any $Y \in \bar{\mathfrak{p}}_X$ and $F \in \tilde H_P$. So
\[ \tilde\rho^*(Y)\sigma(e) = -2\pi i f_X(Y)\sigma(e), \qquad Y \in \bar{\mathfrak{p}}_X, \]
and $\tilde\rho^*$ is an irreducible representation of G with the lowest weight $-2\pi i f_X$. Therefore $\tilde\rho$ is an irreducible representation of G with the highest weight $2\pi i f_X$, and so is ρ.

Remark. The Borel–Weil theorem has not been published. Bott [6] and Kostant extended the theorem. But all their results were stated in the form of complex Lie groups.

Suppose that G is a reductive Lie group and $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ is the Cartan decomposition of the Lie algebra $\mathfrak{g}$ of G. Let $\mathfrak{c}$ be the center of $\mathfrak{k}$. Assume that the centralizer of $\mathfrak{c}$ in $\mathfrak{g}$ is $Z_{\mathfrak{g}}(\mathfrak{c}) = \mathfrak{k}$ (this happens if G is compact or G = SU(n, m), SO*(2n) or Sp(n, R)). Choose a maximal abelian subalgebra $\mathfrak{t} \subset \mathfrak{k}$; then $\mathfrak{t}$ is also maximal abelian in $\mathfrak{g}$. Using the notation of Sect. 3, we have $\Delta_c = \emptyset$ and $\Delta^+ = \Delta_k^+ \cup \Delta_n^+$. Recall that $\delta = \frac12\sum_{\alpha \in \Delta^+}(\dim \alpha)\alpha \in \mathfrak{t}^*$. Extend δ to $\mathfrak{g}$ by $\delta|_{\mathfrak{g}_\alpha} = 0$. Then by the identification of $\mathfrak{g}$ with $\mathfrak{g}^*$ using the form tr(XY), $X, Y \in \mathfrak{g}$, there exists an element $Y \in \mathfrak{g}$ such that $\delta = f_{2\pi iY}$. In fact $Y \in \mathfrak{t}$. Write
\[ D = \{X \in \mathfrak{t} \mid 2\pi i\alpha(X) \ge 0 \text{ for } \alpha \in \Delta_k^+ \text{ and } 2\pi i\alpha(X) < 0 \text{ for } \alpha \in \Delta_n^+\}. \]

Theorem 5.2. For any X ∈ D, the coadjoint orbit $f_O = Gf_X$ is elliptic. If $f_O$ is integral, the space $H_P$ before Theorem 4.4 is not trivial. The corresponding representation ρ of G is irreducible, and it is the holomorphic discrete series representation of G such that the irreducible representation of K with the highest weight $2\pi i f_X$ occurs in $\rho|_K$.

Proof. Since (δ, α) = 1 for any simple root α, $2\pi i\alpha(Y) > 0$. Therefore D ⊂ E. Now the theorem follows by Theorem 4.4 and Theorem 6.6 in [7] and a proof similar to that of Lemma 5.1.

Corollary 5.1. Every holomorphic discrete series representation of G can be obtained by geometric quantization of elliptic coadjoint orbits.

For a general reductive Lie group G, let $\mathfrak{a} \subset \mathfrak{p}$ and H be as in Sect. 3. If $f_O$ is a hyperbolic coadjoint orbit with $O \subset \mathfrak{g}$, then $O \cap H = \{X\}$ and $f_O \cong G/G_X$.
Now $P_X = G_1A_XN_X$ is the Langlands decomposition of the parabolic subgroup $P_X$, where $G_1$ and $A_X$ are as in the proof of Theorem 4.5.

Lemma 5.3. The pre-Hilbert space $V_P$ in Theorem 4.7 is isomorphic to the space
\[ \tilde V_P = \{F \in C(G) \mid F(xman) = \Lambda(a^{-1})\delta(a^{-1})F(x),\ x \in G,\ m \in G_1,\ a \in A_X,\ n \in N_X\} \]
with the norm
\[ \|F\|^2 = \int_K |F(k)|^2\,dk, \qquad F \in \tilde V_P. \]
Proof. Since $L_P = L \otimes \delta_P$ is the associated line bundle over $G/G_X$ with the character $\Lambda \otimes e^\delta$ of $G_X$ and $\nabla_Y\tilde s = 0$ for $Y \in V_P(f_O)$, $\tilde s \in V_P$, by a proof similar to Lemma 5.1 there is a linear isomorphism $\phi : V_P \to \tilde V_P$ satisfying $\tilde s(gf_X) = [g, (\phi\tilde s)(g)]$, g ∈ G, $\tilde s \in V_P$. It is easy to see that
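\[ \|\tilde s\|^2 = \int_Q \|s\|^2\tau^2 = \int_{\bar N_X} |\phi\tilde s|^2\,d\bar n, \]
where $\tilde s = s\tau \in V_P$ and $\bar N_X = \theta(N_X)$. Since $\bar N_X \subset G$, every $\bar n \in \bar N_X$ has the decomposition $\bar n = k(\bar n)m(\bar n)a(\bar n)n(\bar n)$ according to $G = KG_1A_XN_X$,
\begin{align*}
\|\tilde s\|^2 &= \int_{\bar N_X} |\phi\tilde s(\bar n)|^2\,d\bar n \\
 &= \int_{\bar N_X} |\phi\tilde s(k(\bar n))|^2\,e^{-2\delta}(a(\bar n))\,d\bar n \\
 &= \int_K |\phi\tilde s(k)|^2\,dk
\end{align*}
by (5.25) of [7]. So φ is an isomorphism.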
Define an action $\tilde\pi$ of G on $\tilde V_P$ by $\tilde\pi(g)F(x) = F(g^{-1}x)$.

Theorem 5.3. $\tilde\pi$ is a unitary representation of G which is equivalent to π, the representation obtained by the geometric quantization of the coadjoint orbit $f_O$. $\tilde\pi$ is exactly the spherical representation of G.

Proof. The conclusion is obvious by the construction of $\tilde\pi$.
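For orientation, the compact case of this section (Theorem 5.1) can be made explicit for G = SU(2). The following is a sketch that is not taken from the text: the parametrization $T_\theta$, the choice of the positive root α, and the integer n are notation introduced here for illustration, and h = 1 as fixed above.

```latex
\begin{align*}
&T_\theta=\begin{pmatrix} i\theta&0\\0&-i\theta\end{pmatrix}\in\mathfrak t,\qquad
 X=T_a,\qquad
 f_X(T_\theta)=\operatorname{tr}(T_aT_\theta)=-2a\theta, \\
&[T_\theta,e_{12}]=2i\theta\,e_{12}
 \ \Rightarrow\ \alpha(T_\theta)=2i\theta,\qquad
 2\pi i\,\alpha(X)=-4\pi a\ \ge 0\iff a\le 0, \\
&\Lambda(\exp T_\theta)=e^{2\pi i f_X(T_\theta)}=e^{-4\pi i a\theta}
 \ \text{is well defined on } \{\exp T_\theta\}
 \iff n:=-4\pi a\in\mathbb Z, \\
&2\pi i f_X|_{\mathfrak t}:\ T_\theta\mapsto in\theta=\tfrac n2\,\alpha(T_\theta),
 \qquad n=0,1,2,\dots
\end{align*}
```

Here $e_{12}$ is the elementary matrix spanning the positive root space, and the well-definedness condition uses $\exp T_{2\pi} = 1$. For a < 0 the stabilizer $G_X$ is the diagonal torus and the orbit is a 2-sphere; for a = 0 the orbit is a point. The integral orbits in the chamber E are thus labelled by n, and Theorem 5.1 assigns to the orbit with label n the irreducible representation of SU(2) with highest weight $T_\theta \mapsto in\theta$, which has dimension n + 1.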
Acknowledgement. The author is deeply indebted to Professor Min Qian and Maozheng Guo who gave me guidance and encouragement to be engaged in the field of geometric quantization.
References
1. Kostant, B.: Quantization and unitary representations. Lecture Notes in Mathematics, Vol. 170, Berlin–Heidelberg–New York: Springer Verlag, 1970
2. Kirillov, A.A.: Geometric Quantization. Dyn. Syst. 4, 137–172 (1992)
3. Woodhouse, N.: Geometric Quantization. Oxford: Clarendon Press, 1992
4. Humphreys, J.E.: Introduction to Lie Algebras and Representation Theory. Berlin–Heidelberg–New York: Springer Verlag, 1972
5. Rawnsley, J.: Coherent states and Kähler manifolds. Quart. J. Math. Oxford (2), 28, 43 (1977)
6. Bott, R.: Homogeneous vector bundles. Ann. Math. (2) 66, 203–248 (1957)
7. Knapp, A.: Representation theory of semisimple groups. Princeton, NJ: Princeton University Press, 1986

Communicated by A. Connes
Commun. Math. Phys. 194, 149 – 175 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Special Quantum Field Theories in Eight and Other Dimensions
Laurent Baulieu1, Hiroaki Kanno2, I. M. Singer3
1 LPTHE, Universités Paris VI – Paris VII, URA 280 CNRS, 4 place Jussieu, F-75252 Paris Cedex 05, France. E-mail: [email protected]
2 Department of Mathematics, Faculty of Science, Hiroshima University, Higashi-Hiroshima 739, Japan. E-mail: [email protected]
3 Department of Mathematics, MIT, Cambridge, MA 02139, USA. E-mail: [email protected]
Received: 23 April 1997 / Accepted: 1 October 1997
Abstract: We build nearly topological quantum field theories in various dimensions. We give special attention to the case of eight dimensions for which we first consider theories depending only on Yang–Mills fields. Two classes of gauge functions exist which correspond to the choices of two different holonomy groups in SO(8), namely SU(4) and Spin(7). The choice of SU(4) gives a quantum field theory for a Calabi–Yau fourfold. The expectation values for the observables are formally holomorphic Donaldson invariants. The choice of Spin(7) defines another eight dimensional theory for a Joyce manifold which could be of relevance in M- and F-theories. Relations to the eight dimensional supersymmetric Yang–Mills theory are presented. Then, by dimensional reduction, we obtain other theories, in particular a four dimensional one whose gauge conditions are identical to the non-abelian Seiberg–Witten equations. The latter are thus related to pure Yang–Mills self-duality equations in 8 dimensions as well as to the N=1, D=10 super Yang–Mills theory. We also exhibit a theory that couples 3-form gauge fields to the second Chern class in eight dimensions, and interesting theories in other dimensions.
1. Introduction

Topological quantum field theory (TQFT), or more specifically, cohomological quantum field theory has been extensively studied in two, three and four dimensions. (See e.g. [1, 2] and references therein.) In this article we show that theories which are almost topological also exist in dimensions higher than four. We call them BRSTQFT's instead of TQFT's. We give special attention to the case of Yang–Mills fields in eight dimensions.

A BRSTQFT relies on a Lagrangian which contains as many bosons as fermions, interconnected by a BRST symmetry. The Lagrangian density is locally a sum of d-closed and BRST-exact terms. Starting from classical "topological" invariants, the most crucial point in the construction of the BRSTQFT is the determination of gauge fixing
L. Baulieu, H. Kanno, I. M. Singer
conditions, enforced in a BRST invariant way. In the weak coupling expansion, one interprets the theory as exploring through path integrations all quantum fluctuations around the solutions to the gauge conditions. This provides, eventually, an intuitive way to study the moduli problem associated with the choice of gauge fixing conditions, by computing Green functions defined from the BRST cohomology. Generally, one must distinguish between the ordinary gauge fixing conditions for the ordinary gauge degrees of freedom of forms and the gauge covariant ones which occur when one gauge-fixes a “topological” invariant (i.e., constant on a Pontryagin sector of gauge fields). A BRSTQFT can often be untwisted into a Poincaré supersymmetric theory; we give more examples in this paper. BRSTQFT’s are microscopic theories, in the sense that in principle they provide the fundamental fields to study (almost) topological properties. We ask: are their infrared limits describable by effective theories, following the ideas of Seiberg and Witten?
In four dimensions Donaldson [3] used the moduli space of anti-self-dual fields to describe invariants of four manifolds. Witten [4] interpreted these invariants as observables in a topological quantum field theory, twisted N = 2 supersymmetric Yang–Mills. Baulieu and Singer [5] noted that this TQFT could be obtained from a topological action by the BRST formalism with covariant gauge functions which probe the moduli space of anti-self-dual fields. In this paper, we apply this formalism to higher dimensional cases of self duality; M-theory, F-theory, and low energy limits of string theory have increased the interest in QFTs in dimension greater than four.
Over a decade ago, Corrigan et al. [6] classified the cases in which the self-duality equation for Yang–Mills fields in four dimensions could be generalized to higher dimensions. See also Ward [7]. Solutions to these equations are higher dimensional instantons [8, 9].
The generalizations in eight dimensions depend on having the holonomy group reduced from SO(8) to Spin(7) or SU(4). See Salamon [10] for background on special holonomy groups. The third author (IMS) learned about self-duality in eight dimensions for Einstein manifolds and fields associated to the spin bundle from Eric Weinstein in 1990. Weinstein constructed special instantons, computed the dimensions of the corresponding moduli space, and noted the importance of Spin(7) and SU(4). For this, and more, see [11]. The geometry for manifolds with holonomy Spin(7) can be found in Joyce [12]. For holonomy SU(4), the holomorphic extension of Donaldson Theory is being developed by Donaldson, Joyce, Lewis, and Thomas at Oxford. Their program for extending results in two, three and four dimensions from the real to the complex case is sketched in Donaldson and Thomas [13].
In the first part of this paper we describe two eight dimensional Yang–Mills quantum field theories that reflect the eight dimensional self duality equations found in [6]; we use the geometry developed by the above-mentioned authors to construct the quantum field theory. These theories cannot be called topological for they depend on some geometrical structure of the manifold $M_8$. For want of a better term, we have called them BRST quantum field theories (BRSTQFT), because they are constructed by starting with a topological action and using the BRST formalism with covariant gauge functions that again probe the moduli space of these new anti-self-dual fields.
When the holonomy group is Spin(7) ⊂ SO(8), we call $(M_8, g)$ a Joyce manifold. Section 2.1.1 gives the geometry needed to construct the BRSTQFT of 2.1.2, which is in turn described geometrically in 2.1.3. Section 2.2 gives a parallel discussion of the holomorphic case, i.e., when the holonomy group is SU(4). We compare the two cases in Sect. 2.3. We point out in Sect. 2.4 that the J-case is a twist of D = 10, N = 1 supersymmetric Yang–Mills theory (SSYM) dimensionally reduced to D = 8. Since
Special Quantum Field Theories in Eight and Other Dimensions
supersymmetries for a curved manifold require covariant constant spinors, there is one remaining supersymmetry; we explain its relation to the topological BRST symmetry. Having defined pure Yang–Mills BRSTQFT in eight dimensions, we introduce a different theory in Sect. 3 which couples an uncharged 3-form gauge field $B_3$ to the Yang–Mills field A. We propose as covariant gauge conditions of the coupled systems, the pair of equations
$$F_A = *(\Omega \wedge F_A), \qquad \mathrm{Tr}(F_A \wedge F_A)^+ + dB_3 + *dB_3 = 0, \qquad (1.1)$$
where $\Omega$ is a background closed 4-form. One must be careful here; $B_3$ is not an ordinary 3-form and $dB_3$ is not its differential. Rather, $B_3$ is locally defined, up to an exact 3-form so that $dB_3$ stands for a closed 4-form. (See the discussion in Sect. 3).
Section 4 discusses other dimensions. When $M_{12}$ is a Calabi–Yau 6-fold, one can define BRSTQFT’s and we do so. We reduce our 8D theories to 6D and 4D in Sections 4.2 and 4.3, respectively. The H case reduction can be obtained directly on a Calabi–Yau 3-fold by a modification of the methods in Sect. 2.2. The reduction to 4D is particularly interesting. On the one hand we get a twisted N = 4 SSYM of Vafa and Witten [16]. In fact, the H, J cases and the case of $M_7$, holonomy $G_2$ theory, reduced to 4D, give the three twists of N = 4 SSYM. On the other hand we also get the nonabelian Seiberg–Witten theory. Thus there is a relationship between N = 4 SSYM and nonabelian SW theories. The latter theory is obtained from the eight dimensional J theory, with its octonionic structure; the former is obtained from the N = 1, D = 10 SSYM theory, by ordinary dimensional reduction. The direct link between the D = 10 SSYM theory and the J theory is that the N = 1, D = 10 SSYM theory gives by dimensional reduction the N = 1, D = 8 SSYM which can be identified with the J theory by a simplest twist, specific to eight dimensions, which interchanges vectors and spinors (Sect. 2.4).
2. Pure Yang–Mills 8 Dimensional Case
The four dimensional Yang–Mills TQFT can be obtained by the BRST formalism. Starting with $p_1 = \frac{1}{8\pi^2}\mathrm{Tr}\, F \wedge F$, one gauge-fixes its invariances with three covariant gauge conditions and one Feynman-Landau gauge condition that probe the moduli space of self-dual curvature fields [5]. These gauge conditions are enforced in a BRST invariant way, by using the 4 gauge freedom of local general infinitesimal variations of the connection $A_\mu$.
Put mathematically, we get an elliptic complex $0 \to \Lambda^0 \xrightarrow{d} \Lambda^1 \xrightarrow{d} \Lambda^2_+ \to 0$, tensored with a Lie algebra G. In this section we extend this scheme to 8 dimensions when the holonomy group in SO(8) is either SU(4) (the case of a Calabi–Yau 4-fold) or Spin(7) (the case of a Joyce manifold). The 4-D self duality equations must be generalized to
$$\lambda F^{\mu\nu} = \frac{1}{2} T^{\mu\nu\rho\sigma} F_{\rho\sigma}, \qquad (2.1)$$
where λ is a constant (an eigenvalue) and $T^{\mu\nu\rho\sigma}$ is a totally antisymmetric tensor which is generally not invariant under general SO(D) transformations. Rather it is invariant under a subgroup of SO(D). Corrigan et al. [6] classified the possible choices of $T^{\mu\nu\rho\sigma}$ up to eight dimensions, where two solutions T are singled out. Indeed, for these cases, the space of 2-forms $\Lambda^2$ decomposes into a direct sum and one can thus replace the
self-duality condition in four dimensions by the condition that the curvature fields lie in an appropriate summand. The elliptic complex above has an 8-D counterpart: $0 \to \Lambda^0 \xrightarrow{d} \Lambda^1 \xrightarrow{d} P_+(\Lambda^2) \to 0$. Moreover, in each case, there is a closed 4-form $\Omega$ and one can replace $p_1$ by
$$\frac{1}{8\pi^2}\,\Omega \wedge \mathrm{Tr}(F \wedge F). \qquad (2.2)$$
Since $\int_X \Omega \wedge \mathrm{Tr}(F \wedge F)$ is independent of the gauge field A and since the new elliptic complex implies that the number of gauge covariant gauge functions plus Feynman-Landau type gauge condition is eight, one can use the BRST formalism to introduce new (ghosts and ghosts of ghosts) fields and an invariant action. The theory is not topological, because it depends on the reduction of the holonomy group. In the case of the SU(4) reduction, one predicts that the expectation value of the observables depends on the holomorphic structure of X, but not on the choice of the Calabi–Yau metrics. We call these theories BRSTQFT’s. We will say the BRSTQFT is of type J for Spin(7) and of type H for SU(4).
We will analyze each case. They differ in a subtle way from the point of view of BRST quantization. In the type H case one has 6 independent real covariant gauge conditions which can be seen as three complex 4-D self-duality conditions. We can complete them by a complex supress gauge condition which counts for the two missing gauge conditions allowed by the eight freedom in deforming the Yang–Mills field. In the type J case one has seven independent real equations which we can complete by the usual (real) Landau gauge condition. In the former case one has thus a complexification of all ingredients of the 4-D case. In the latter case all fields are real, and the situation is quite like the 4-D case, with the change of the quaternionic structure of the self duality equations in four dimensions into an octonionic one in eight dimensions.
The action we consider will be the BRST invariant gauge fixing of the topological invariant
$$S_0 = \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}(F \wedge F), \qquad (2.3)$$
where $\Omega$ is a fixed closed four form adapted to each case. Depending on the case, we will have six or seven covariant gauge fixing conditions of the type of Eq. (2.1), that we will denote as $\Phi_i = 0$, $1 \le i \le 6$ or 7. That we get an action containing a Yang–Mills part relies on the identity
$$a \sum_i \mathrm{Tr}(\Phi_i \Phi_i)\cdot(\mathrm{vol}) = -S_0 + \mathrm{Tr}(F \wedge *F), \qquad (2.4)$$
where a is a positive real number (one has different decompositions in the J and H cases). (vol) stands for the volume form. The last term is the action density for the Yang–Mills theory. Hence a solution to $\Phi_i = 0$ gives a stationary point of the eight dimensional Yang–Mills theory. For this reason, the equations $\frac{1}{2}T^{\mu\nu\rho\sigma}F_{\rho\sigma} = \lambda F^{\mu\nu}$ deserve to be called the instanton equation. Notice that one has the correspondence $\Omega_{\mu\nu\rho\sigma} = \epsilon_{\mu\nu\rho\sigma\alpha\beta\gamma\delta}\,T^{\alpha\beta\gamma\delta}$. By adding to $S_0$ a BRST exact term which generates among other terms $\sum_i(\Phi_i\Phi_i)$, we will thus replace the “topological” invariant $\Omega \wedge \mathrm{Tr}(F \wedge F)$ by the standard Yang–Mills Lagrangian $\mathrm{Tr}(F \wedge *F)$ plus ghost terms, which constitute the action of the BRSTQFT theory. As explained earlier, the term BRSTQFT seems to us more appropriate than the term TQFT for the resulting theory. Obviously, the remaining gauge invariances must be gauge fixed, which will be done in the same spirit as in [5].
2.1. Type J case: Joyce Manifold.
2.1.1. Geometrical setup. Recently it has been proposed that the 7 dimensional and the 8 dimensional Joyce manifolds provide a compactification to four dimensions of M-theory and F-theory, respectively [17, 18, 19]. We consider here the 8 dimensional case and call a Joyce manifold an eight dimensional manifold with Spin(7) holonomy [12]$^1$. Then Spin(7) acting on $\Lambda^4(M_8)$, the space of 4-forms, leaves invariant a self-dual 4-form $\Omega \neq 0$. Further, $\Omega$ is covariantly constant and hence closed. The space of 2-forms $\Lambda^2(M_8)$ splits into $\Lambda^2_{21} \oplus \Lambda^2_+$ with $\dim_{\mathbb{R}} \Lambda^2_+ = 7$. One can see this by noting that $\Lambda^2 \simeq so(8)$ and that $\Lambda^2_{21} \simeq$ Lie algebra of Spin(7) ⊂ so(8). The splitting can also be obtained as follows: let T be the operator on $\Lambda^2$ given by $\tau \to *(\Omega \wedge \tau)$. Then T is self adjoint with eigenvalues +1 and −3, when $\Omega$ is scaled. Its eigenspaces are $\Lambda^2_{21}$ and $\Lambda^2_+$, respectively. The ordinary anti-self-dual Yang–Mills fields in four dimensions are now to be replaced by $P_+F_A = 0$, where $P_+$ is the projection of $\Lambda^2$ onto $\Lambda^2_+$. We next discuss the linearization of this equation.
Let $S_M^+$ and $S_M^-$ (that is, $8_s$ and $8_c$ in another notation) denote the chiral and antichiral real (Majorana) spinors for $M_8$ ($M_8$ is simply connected and has a unique spin structure). Then the representation of Spin(7) on $S_M^+$ is the direct sum $\mathbb{R} \oplus V$ (that is, $8_s = 1 \oplus 7$). Let ζ be a covariantly constant spinor field of norm 1 giving the splitting of $S_M^+$. The representation of Spin(7) on $S_M^-$ is irreducible. Since $S_M \otimes S_M$ is isomorphic to forms, tensoring by ζ identifies spinors with forms. For example, $\Lambda^2(S_M^+) \simeq \Lambda^2(M_8)$; so $\Lambda^2(S_M^+) = \Lambda^2(\mathbb{R} \oplus V) = V \wedge V + \zeta \otimes V$ gives the splitting into $\Lambda^2_{21} \oplus \Lambda^2_+$. Further $\zeta \otimes S_M^-$ can be identified with $\Lambda^1(M_8)$, that is, $8_v$. We conclude that the sequence $0 \to \Lambda^0 \xrightarrow{d} \Lambda^1 \xrightarrow{P_+ d} \Lambda^2_+ \to 0$ is an elliptic sequence and $(P_+ d + d^*) : \Lambda^1 \to \Lambda^2_+ \oplus \Lambda^0$ is the Dirac operator $\not\partial : S_M^- \to S_M^+$, after the identification of spinors with forms due to ζ.
If P is a principal bundle over $M_8$ with a compact gauge group G, we can couple forms to its Lie algebra G by a vector potential A. We have the sequence $0 \to \Lambda^0 \otimes G \xrightarrow{D_A} \Lambda^1 \otimes G \xrightarrow{P_+ D_A} \Lambda^2_+ \otimes G \to 0$ which is elliptic when $P_+ D_A^2 = 0$, i.e. when $P_+F_A = 0$. (Here we have identified the Lie algebra G with the adjoint Lie algebra bundle over $M_8$.) In general, $P_+D_A + D_A^* = \not D_A : \Lambda^1 \otimes G \to \Lambda^2_+ \otimes G + \Lambda^0 \otimes G$ is elliptic. The index of the operator is the virtual dimension of the moduli space $M_J$ of solutions to the nonlinear equation $P_+F_A = 0$, modulo gauge transformations.
To make contact with the next section, let us remark that $P_+F_A = 0$ determines, in the case of a pure Yang–Mills BRSTQFT, the relevant gauge covariant gauge conditions shown in Eq. (2.1), while $D_A^*$ is the operator related to the Landau-Feynman gauge condition of ordinary gauge degrees of freedom. More precisely, the BRSTQFT that will be determined shortly is the gauge fixing by BRST techniques of $S_0[A] = \int_{M_8} \Omega \wedge \mathrm{Tr}(F \wedge F)$. The latter is independent of A, since it is $8\pi^2\, \Omega \cup p_1(P)$ which only depends on the topological charge of A. The way one gets the Yang–Mills action from the gauge fixing of an invariant is the consequence of the following. If ω is an element of $\Lambda^2$, let $\omega_-$ and $\omega_+$ be its components on $\Lambda^2_{21}$ and $\Lambda^2_+$. Then $\|\omega\|^2 = \|\omega_+\|^2 + \|\omega_-\|^2$, $\langle\omega_+, \omega_-\rangle = 0$, while
1 There is another class of Joyce manifolds in seven dimensions [20]. Its holonomy is the exceptional group G2 . Both classes of Joyce manifolds have been studied in superconformal field theory [21, 22].
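$$\begin{aligned}
\Omega \wedge F \wedge F &= \Omega \wedge (F_+ + F_-) \wedge (F_+ + F_-) \\
&= \Omega \wedge F_+ \wedge F_+ + \Omega \wedge F_- \wedge F_- + \Omega \wedge F_- \wedge F_+ + \Omega \wedge F_+ \wedge F_- \\
&= -3\,{*F_+} \wedge F_+ + {*F_-} \wedge F_- + {*F_-} \wedge F_+ - 3\,{*F_+} \wedge F_- .
\end{aligned} \qquad (2.5)$$
Thus
$$\int_{M_8} \mathrm{Tr}(\Omega \wedge F \wedge F) = \|F_-\|^2 - 3\|F_+\|^2, \qquad (2.6)$$
and
$$\|F_A\|^2 = \int_{M_8} \mathrm{Tr}(\Omega \wedge F_A \wedge F_A) + 4\|F_+\|^2. \qquad (2.7)$$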
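$\Omega \wedge \Omega$ orients $M_8$ and is the volume element. Given the topological sector, we choose $\Omega$ so that $\int_{M_8} \mathrm{Tr}(\Omega \wedge F_A \wedge F_A) \ge 0$. Then $F_+ = 0$ minimizes the action $\|F_A\|^2$. To write the BRSTQFT action in physicist’s notation, we have to be more explicit. In terms of an orthonormal basis, the self-dual four form is
$$\begin{aligned}
\Omega = {} & e^1 \wedge e^2 \wedge e^5 \wedge e^6 + e^1 \wedge e^2 \wedge e^7 \wedge e^8 + e^3 \wedge e^4 \wedge e^5 \wedge e^6 \\
& + e^3 \wedge e^4 \wedge e^7 \wedge e^8 + e^1 \wedge e^3 \wedge e^5 \wedge e^7 - e^1 \wedge e^3 \wedge e^6 \wedge e^8 \\
& - e^2 \wedge e^4 \wedge e^5 \wedge e^7 + e^2 \wedge e^4 \wedge e^6 \wedge e^8 - e^1 \wedge e^4 \wedge e^5 \wedge e^8 \\
& - e^1 \wedge e^4 \wedge e^6 \wedge e^7 - e^2 \wedge e^3 \wedge e^5 \wedge e^8 - e^2 \wedge e^3 \wedge e^6 \wedge e^7 \\
& + e^1 \wedge e^2 \wedge e^3 \wedge e^4 + e^5 \wedge e^6 \wedge e^7 \wedge e^8,
\end{aligned} \qquad (2.8)$$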
where $e^i$ ($i = 1, \dots, 8$) are vielbein fields. The operator T defined above can be written as the following Spin(7) invariant fourth rank antisymmetric tensor
$$T^{\mu\nu\rho\sigma} = \zeta^T \gamma^{\mu\nu\rho\sigma} \zeta, \qquad (2.9)$$
where $\gamma^{\mu\nu\rho\sigma}$ is the totally antisymmetric product of γ matrices for the SO(8) spinor representation,
$$\gamma^{\mu\nu\rho\sigma} = \frac{1}{4!}\,\gamma^{[\mu}\gamma^{\nu}\gamma^{\rho}\gamma^{\sigma]}, \qquad (2.10)$$
and ζ is the covariantly constant spinor introduced above to identify spinors with forms. This gives another component representation of the four form $\Omega$. To repeat the first paragraph of this section in terms of the fourth rank tensor $T^{\mu\nu\rho\sigma}$, we define an analogue of the instanton equation on the Joyce manifold [6]:
$$F^{\mu\nu} = \frac{1}{2} T^{\mu\nu\rho\sigma} F_{\rho\sigma}, \quad \text{i.e.} \quad F \in \Lambda^2_+. \qquad (2.11)$$
The curvature 2-form $F_{\mu\nu}$ in 8 dimensions has 28 components, whose Spin(7) decomposition is 28 = 7 ⊕ 21. (This is made explicit by the eigenspace decomposition of the action of $\frac{1}{2}T^{\mu\nu\rho\sigma}$ in Eq. (2.1) with the eigenvalues λ = −3 and λ = 1.) Equation (2.11) can be written as seven independent equations, showing that the curvature has no components in the former subspace, which is 7-dimensional:
$$F_{8i} = c_{ijk} F_{jk}, \qquad 1 \le i, j, k \le 7, \qquad (2.12)$$
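The 28 = 7 ⊕ 21 splitting can be checked numerically. The following is a minimal sketch, not part of the original construction: it builds the components of the 4-form exactly as written in Eq. (2.8), forms the matrix of $\tau \to *(\Omega \wedge \tau)$ on the 28 basis 2-forms using $(T\tau)_{ab} = \sum_{c<d}\Omega_{abcd}\tau_{cd}$ (valid because $\Omega$ is self-dual), and counts eigenvalues. All function names are illustrative. With the signs of Eq. (2.8) taken literally, the eigenvalues come out as +3 (7 times) and −1 (21 times); replacing $\Omega$ by $-\Omega$ gives the values −3 and +1 quoted in the text, so only the overall sign convention differs.

```python
import itertools
import numpy as np

# Terms of the 4-form in Eq. (2.8): (ordered indices 1..8, coefficient).
OMEGA_TERMS = [
    ((1, 2, 5, 6), +1), ((1, 2, 7, 8), +1), ((3, 4, 5, 6), +1),
    ((3, 4, 7, 8), +1), ((1, 3, 5, 7), +1), ((1, 3, 6, 8), -1),
    ((2, 4, 5, 7), -1), ((2, 4, 6, 8), +1), ((1, 4, 5, 8), -1),
    ((1, 4, 6, 7), -1), ((2, 3, 5, 8), -1), ((2, 3, 6, 7), -1),
    ((1, 2, 3, 4), +1), ((5, 6, 7, 8), +1),
]


def perm_sign(seq):
    """Sign of the permutation that sorts `seq` (distinct integers)."""
    sign = 1
    for i, j in itertools.combinations(range(len(seq)), 2):
        if seq[i] > seq[j]:
            sign = -sign
    return sign


def omega_tensor():
    """Totally antisymmetric components Omega[a, b, c, d], 0-based indices."""
    om = np.zeros((8, 8, 8, 8))
    for idx, coeff in OMEGA_TERMS:
        for perm in itertools.permutations(idx):
            om[tuple(k - 1 for k in perm)] = coeff * perm_sign(perm)
    return om


# Basis of 2-forms e^a ^ e^b with a before b: 28 elements.
PAIRS = list(itertools.combinations(range(8), 2))


def t_matrix(om):
    """Matrix of tau -> *(Omega ^ tau) in the basis PAIRS."""
    return np.array([[om[a, b, c, d] for (c, d) in PAIRS] for (a, b) in PAIRS])


def spectrum(t):
    """Map each (rounded) eigenvalue of the symmetric matrix t to its multiplicity."""
    eig = np.rint(np.linalg.eigvalsh(t)).astype(int)
    values, counts = np.unique(eig, return_counts=True)
    return dict(zip(values.tolist(), counts.tolist()))


if __name__ == "__main__":
    print(spectrum(t_matrix(omega_tensor())))
```

The multiplicities 7 and 21 are independent of the sign and scale of $\Omega$; only the ratio −3 between the two eigenvalues is used in Eqs. (2.5)-(2.7).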
Equation (2.12) makes the octonionic structure explicit. Indeed, the $c_{ijk}$ are the structure constants for octonions$^2$ and the eight dimensional tensors $T_{\mu\nu\rho\sigma}$ can be written as$^3$
$$T_{8ijk} = c_{ijk}, \quad 1 \le i, j, k \le 7, \qquad T_{lijk} = \frac{1}{24}\,\epsilon_{lijkabc}\, c_{abc}, \quad 1 \le i, j, k, l \le 7. \qquad (2.13)$$
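Notice that by construction, the $T_{\mu\nu\rho\sigma}$ are self-dual objects in 8 dimensions. Computed explicitly, Eq. (2.11) is
$$\begin{aligned}
\Phi_1 &\equiv F_{12} + F_{34} + F_{56} + F_{78} = 0, \\
\Phi_2 &\equiv F_{13} + F_{42} + F_{57} + F_{86} = 0, \\
\Phi_3 &\equiv F_{14} + F_{23} + F_{76} + F_{85} = 0, \\
\Phi_4 &\equiv F_{15} + F_{62} + F_{73} + F_{48} = 0, \\
\Phi_5 &\equiv F_{16} + F_{25} + F_{38} + F_{47} = 0, \\
\Phi_6 &\equiv F_{17} + F_{82} + F_{35} + F_{64} = 0, \\
\Phi_7 &\equiv F_{18} + F_{27} + F_{63} + F_{54} = 0.
\end{aligned} \qquad (2.14)$$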
In this form, the gauge functions are ready to be used to define the BRSTQFT action. It is known (see [8, 9, 14]) that at least one instanton solution exists for the 8 dimensional equation $F^{\mu\nu} = \frac{1}{2} T^{\mu\nu\rho\sigma} F_{\rho\sigma}$.$^{4,5}$ Finally, Eqs. (2.5)-(2.7) imply
$$4 \sum_{i=1}^{7} \mathrm{Tr}(\Phi_i \Phi_i)\cdot(\mathrm{vol}) = -\Omega \wedge \mathrm{Tr}(F \wedge F) + \mathrm{Tr}(F \wedge *F). \qquad (2.15)$$
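The completion of squares behind Eq. (2.15) can be illustrated pointwise for an abelian field, where no trace is needed. The sketch below is an assumption-laden check rather than the paper's derivation: it takes $\Omega$ literally from Eq. (2.8), reads the seven gauge functions off Eq. (2.14), and compares $\sum_i \Phi_i^2$ with $|F|^2 = \sum_{a<b} F_{ab}^2$ and with $Q$, the coefficient of the volume form in $\Omega \wedge F \wedge F$. In these conventions the identity reads $\sum_i \Phi_i^2 = |F|^2 + Q$; the relative sign of the $\Omega$ term and the overall factor in Eq. (2.15) depend on the sign chosen for $\Omega$ and on the normalization of Tr, which are not fixed here. Function names are illustrative.

```python
import itertools
import numpy as np

# Terms of the 4-form in Eq. (2.8): (ordered indices 1..8, coefficient).
OMEGA_TERMS = [
    ((1, 2, 5, 6), +1), ((1, 2, 7, 8), +1), ((3, 4, 5, 6), +1),
    ((3, 4, 7, 8), +1), ((1, 3, 5, 7), +1), ((1, 3, 6, 8), -1),
    ((2, 4, 5, 7), -1), ((2, 4, 6, 8), +1), ((1, 4, 5, 8), -1),
    ((1, 4, 6, 7), -1), ((2, 3, 5, 8), -1), ((2, 3, 6, 7), -1),
    ((1, 2, 3, 4), +1), ((5, 6, 7, 8), +1),
]

# Rows of Eq. (2.14): Phi_i is the sum of the four listed components F_{ab}.
PHI_ROWS = [
    [(1, 2), (3, 4), (5, 6), (7, 8)],
    [(1, 3), (4, 2), (5, 7), (8, 6)],
    [(1, 4), (2, 3), (7, 6), (8, 5)],
    [(1, 5), (6, 2), (7, 3), (4, 8)],
    [(1, 6), (2, 5), (3, 8), (4, 7)],
    [(1, 7), (8, 2), (3, 5), (6, 4)],
    [(1, 8), (2, 7), (6, 3), (5, 4)],
]


def perm_sign(seq):
    """Sign of the permutation that sorts `seq` (distinct integers)."""
    sign = 1
    for i, j in itertools.combinations(range(len(seq)), 2):
        if seq[i] > seq[j]:
            sign = -sign
    return sign


def omega_tensor():
    """Totally antisymmetric components Omega[a, b, c, d], 0-based indices."""
    om = np.zeros((8, 8, 8, 8))
    for idx, coeff in OMEGA_TERMS:
        for perm in itertools.permutations(idx):
            om[tuple(k - 1 for k in perm)] = coeff * perm_sign(perm)
    return om


def gauge_functions(f):
    """The seven Phi_i of Eq. (2.14) for an antisymmetric 8x8 array f."""
    return np.array([sum(f[a - 1, b - 1] for a, b in row) for row in PHI_ROWS])


def check_identity(seed=0):
    """Return (sum Phi_i^2, |F|^2, Q) for a random abelian curvature."""
    rng = np.random.default_rng(seed)
    m = rng.normal(size=(8, 8))
    f = m - m.T                                   # antisymmetric F_{ab}
    norm2 = 0.5 * np.sum(f * f)                   # sum over unordered pairs of F_ab^2
    q = 0.25 * np.einsum("abcd,ab,cd->", omega_tensor(), f, f)
    phi2 = np.sum(gauge_functions(f) ** 2)
    return phi2, norm2, q


if __name__ == "__main__":
    phi2, norm2, q = check_identity()
    print(phi2, norm2 + q)
```

In particular $\Phi_i = 0$ for all i forces $|F|^2 = -Q$ here, which is the pointwise form of the statement that the instanton equations saturate a bound fixed by the topological term.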
2.1.2. Action and observables. In the following all the fields are Lie algebra valued and we will suppress the Lie algebra indices. We use the standard notation $(\psi_\mu, \phi)$ for topological ghost. We also introduce the Faddeev–Popov ghost c to define a completely nilpotent BRST transformation. The topological BRST transformation for the gauge field and the ghost fields is
$$\begin{aligned}
sA_\mu &= \psi_\mu + D_\mu c, & s\psi_\mu &= -D_\mu \phi - [c, \psi_\mu], \\
sc &= \phi - \tfrac{1}{2}[c, c], & s\phi &= -[c, \phi].
\end{aligned} \qquad (2.16)$$
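As a consistency check on Eq. (2.16), nilpotency can be verified directly on the ghost sector. The short computation below is a sketch under the usual graded conventions, which the text does not spell out and which are assumed here: s is an odd derivation, c is odd, φ is even, and [ , ] denotes the graded commutator.

```latex
\begin{aligned}
s^2 c   &= s\phi - \tfrac12\, s[c,c]
         = -[c,\phi] - [sc,c]
         = -[c,\phi] - [\phi,c] + \tfrac12\,[[c,c],c] = 0, \\
s^2\phi &= -[sc,\phi] + [c,s\phi]
         = -[\phi,\phi] + \tfrac12\,[[c,c],\phi] - [c,[c,\phi]] = 0 .
\end{aligned}
% Used: s[c,c] = 2[sc,c] for odd c;  [\phi,c] = -[c,\phi];  [\phi,\phi] = 0 for even \phi;
% graded Jacobi identities: [[c,c],c] = 0 and [c,[c,\phi]] = \tfrac12 [[c,c],\phi].
```

The same two Jacobi identities are what make $s^2$ vanish on the antighost pairs introduced next.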
We need as many pairs of the anti-ghost and the auxiliary fields $(\chi_i, H_i)$ as topological gauge functions, with the following BRST transformation law:
$$s\chi_i = H_i - [c, \chi_i], \qquad sH_i = [\phi, \chi_i] - [c, H_i]. \qquad (2.17)$$
One has $1 \le i \le 7$. The gauge fixed action at the first stage is
2 If we decompose the octonions into its one dimensional real part and 7 dimensional imaginary part, $\mathbb{R}^7$, then $*_7(\Omega|_{\mathbb{R}^7})$ is a 3-form α which determines Cayley multiplication on $\mathbb{R}^7$ by $\alpha(x, y, z) = \langle x, y, z \rangle$.
3 In the four dimensional case one has similar equations, with the indices i, j, k running from 1 to 3. Then the coefficients $c_{ijk}$ are the structure constants for quaternions. The holomorphic H case that we will shortly analyze is thus a theory with a complexified quaternionic structure.
4 It is also known that a solution exists in seven dimensions if one replaces Spin(7) by $G_2$ (see [15]).
5 An interesting problem is to find conditions on a curved compact Joyce manifold $M_8$ so that such instantons exist.
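$$\begin{aligned}
S_1 &= \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}(F \wedge F) + \frac{1}{2}\, s \int_{M_8} d^8x\, \sqrt{g}\; \mathrm{Tr}\Big(\chi_i \Phi_i + \frac{1}{2}\chi_i H_i\Big) \\
&= \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}(F \wedge F) + \frac{1}{2}\int_{M_8} d^8x\, \sqrt{g}\; \mathrm{Tr}\Big(H_i \Phi_i + \frac{1}{2}H_i H_i - \chi_i (D\psi)_i + \frac{1}{2}\phi[\chi_i, \chi_i]\Big),
\end{aligned} \qquad (2.18)$$
where $(D\psi)_i$ is the FP ghost independent part of $s\Phi_i$. Eliminating the auxiliary fields $H_i$ by Eq. (2.15), one recovers the standard Yang–Mills kinetic term
$$S_1 = \int_{M_8} d^8x\, \sqrt{g}\; \mathrm{Tr}\Big(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \chi_i (D\psi)_i + \frac{1}{2}\phi[\chi_i, \chi_i]\Big). \qquad (2.19)$$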
Notice that the fermion terms break the SO(8) global invariance down to $G_2$, for which the octonion structure coefficient in Eq. (2.12) is an invariant tensor. The gauge fixing and Faddeev–Popov ghost dependence have not been considered yet: the first stage action still has a gauge symmetry in the ordinary sense. To fix it completely we take two more conditions
$$D \cdot \psi = 0, \qquad \partial \cdot A = 0. \qquad (2.20)$$
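(The meaning of the scalar product is the usual one, e.g. $D \cdot \Psi = D_\mu \Psi^\mu$.) Introducing additional fields $(\bar\phi, \eta)$ and $(\bar c, B)$ with the BRST transformation law,
$$\begin{aligned}
s\bar\phi &= \eta - [c, \bar\phi], & s\eta &= [\phi, \bar\phi] - [c, \eta], \\
s\bar c &= B - [c, \bar c], & sB &= [\phi, \bar c] - [c, B],
\end{aligned} \qquad (2.21)$$
we write the complete action as
$$\begin{aligned}
S_2 &= S_1 + s \int_{M_8} d^8x\, \sqrt{g}\; \mathrm{Tr}\Big(\bar\phi\, D \cdot \psi + \bar c\, \partial \cdot A + \frac{1}{2}\bar c B\Big) \\
&= \int_{M_8} d^8x\, \sqrt{g}\; \mathrm{Tr}\Big(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \chi_i (D\psi)_i + \frac{1}{2}\phi[\chi_i, \chi_i] \\
&\qquad + \eta\, D \cdot \psi + \bar\phi\, D \cdot D\phi - \psi \cdot [\bar\phi, \psi] + B\, \partial \cdot A + \frac{1}{2}B^2 + \bar c\, \partial \cdot Dc \\
&\qquad - \bar c\, \partial \cdot \psi + \partial \cdot A\, [c, \bar c] - \frac{1}{2}\phi[\bar c, \bar c]\Big).
\end{aligned} \qquad (2.22)$$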
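A natural set of topological observables is derived from the topological invariants
$$\frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}(F \wedge F), \qquad \int_{M_8} \mathrm{Tr}(F \wedge F \wedge F \wedge F). \qquad (2.23)$$
The method of the descent equation implies a ladder of topological invariants and, for example, gives the following descendants:
$$\begin{aligned}
O^{(0)} &= \frac{1}{2}\int_{M_8} \Omega \wedge \mathrm{Tr}(F \wedge F), \\
O^{(1)} &= \int_{\gamma_7} \Omega \wedge \mathrm{Tr}(\psi \wedge F), \\
O^{(2)} &= \int_{\gamma_6} \Omega \wedge \mathrm{Tr}\Big(\frac{1}{2}\psi \wedge \psi - \phi \wedge F\Big), \\
O^{(3)} &= -\int_{\gamma_5} \Omega \wedge \mathrm{Tr}(\psi \wedge \phi), \\
O^{(4)} &= \frac{1}{2}\int_{\gamma_4} \Omega \wedge \mathrm{Tr}(\phi \wedge \phi).
\end{aligned} \qquad (2.24)$$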
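The bookkeeping behind the ladder in Eq. (2.24) is simple degree counting, sketched below. The assignments of form degree and ghost number ($\Omega$: (4, 0), F: (2, 0), ψ: (1, 1), φ: (0, 2)) are the standard ones and are stated here as assumptions; the dictionary and function names are illustrative only.

```python
# (form degree on M_8, ghost number) of each building block.
FIELDS = {"Omega": (4, 0), "F": (2, 0), "psi": (1, 1), "phi": (0, 2)}

# Monomials appearing in the integrand of O^(k) in Eq. (2.24).
DESCENDANTS = {
    0: [["Omega", "F", "F"]],
    1: [["Omega", "psi", "F"]],
    2: [["Omega", "psi", "psi"], ["Omega", "phi", "F"]],
    3: [["Omega", "psi", "phi"]],
    4: [["Omega", "phi", "phi"]],
}


def bidegree(term):
    """Total (form degree, ghost number) of a wedge product of fields."""
    degree = sum(FIELDS[name][0] for name in term)
    ghost = sum(FIELDS[name][1] for name in term)
    return degree, ghost


def ladder():
    """Bidegrees of every monomial, keyed by the label k of O^(k)."""
    return {k: [bidegree(t) for t in terms] for k, terms in DESCENDANTS.items()}


if __name__ == "__main__":
    for k, degrees in ladder().items():
        print(k, degrees)
```

Each monomial of $O^{(k)}$ has form degree 8 − k and ghost number k, so it can be integrated over a cycle of dimension 8 − k.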
The descendant $O^{(k)}$ with ghost number k is an integral over an (8 − k) cycle $\gamma_{(8-k)}$.
2.1.3. Geometric interpretation. The virtual dimension of the moduli space $M_J$ of solutions to $P_+F_A = 0$ is $-\mathrm{index}\, \not\partial \otimes I_G$, i.e., the index of $\not\partial \otimes I_G : S^- \otimes G \to S^+ \otimes G$. Its value is
$$-\int_{M_8} \hat{A}(M_8)\, \mathrm{ch}(G), \qquad (2.25)$$
computable in terms of the relevant characteristic classes. We will discuss the vanishing theorem needed to make the virtual dimension equal to the actual dimension elsewhere.
We can interpret Sect. 2.1.1 geometrically, in analogy with Sect. 5 in [5]. The BRST equations in this section are the analogues of (7) in [5], and are the structure equations for the universal connection on $A/G \times M_8$ with structure group G. The curvature 2-form $\mathcal{F}$ for this universal connection equals $\mathcal{F}_2^0 + \mathcal{F}_1^1 + \mathcal{F}_0^2$, where $\mathcal{F}_{2-i}^i$ is an i-form in the A/G direction (ghost number) and a (2 − i)-form in the $M_8$ direction. Note that $\mathcal{F}_2^0$ at (A, x) is $F_A(x)$ and $\mathcal{F}_1^1$ assigns to $\tau \in T(A/G, A)$ and $v \in T(M_8, x)$ the value $\tau(v) \in G$, since τ is a 1-form on $M_8$. Further, $\mathcal{F}_0^2$ on $\tau_1, \tau_2 \in T(M_J, A)$ is $G(b^*_{\tau_1}(\tau_2))$ where $G = (D_A^* D_A)^{-1}$ on $\Lambda^0 \otimes G$ and $b_{\tau_1}(f) = [\tau_1, f]$ for $f \in \Lambda^0 \otimes G$; $b^*_{\tau_1}$ is the adjoint of $b_{\tau_1}$. We restrict $\mathcal{F}$ to $M_J \times M_8$ and consider $c_2 = \frac{1}{8\pi^2}\mathrm{Tr}(\mathcal{F} \wedge \mathcal{F})$, a 4-form on $M_J \times M_8$. Its expansion contains $\frac{1}{8\pi^2}\mathrm{Tr}(\mathcal{F}_1^1 \wedge \mathcal{F}_1^1)$, which has ghost number 2. This 4-form assigns to $\tau_1, \tau_2 \in T(M_J, A)$ and $v_1, v_2 \in T(M_8, x)$ the value $\frac{1}{8\pi^2}\big(\mathrm{Tr}(\tau_1(v_1)\tau_2(v_2)) - \mathrm{Tr}(\tau_1(v_2)\tau_2(v_1))\big)$. Let $\tau_1 \tilde\wedge \tau_2$ denote this 2-form on $M_8$.
Let $c^k_{4-k}$ be the component of $c_2$ which is of degree k in the $M_J$ direction and of degree 4 − k in the $M_8$ direction. Then $\int_{\gamma_k} \Omega \wedge c^k_{4-k}$ gives a k-form on $M_J$, when $\gamma_k$ is a (8 − k)-cycle on $M_8$, k = 0, 1, 2, 3 or 4. These are the observables $O^{(k)}$ in Eq. (2.24). Taking products of the forms O and integrating them over $M_J$ gives the expectation values of the products of observables. We are not addressing the central problem of integrating a form over the non compact space $M_J$. We can specialize to 6-cycles, or equivalently to 2-forms to get a closer analogy to Donaldson invariants: if $\sigma \in H^2(M_8)$, let $\Sigma_\sigma$ be the 2-form on $M_J$ given by $\Sigma_\sigma(\tau_1, \tau_2) = \int_{M_8} \Omega \wedge \tau_1 \tilde\wedge \tau_2 \wedge \sigma$. We get an r-symmetric multi-linear function on $H^2(M_8)$ given by $(\sigma_1, \dots, \sigma_r) \to \int_{M_J} \Sigma_{\sigma_1} \wedge \dots \wedge \Sigma_{\sigma_r}$, if $\dim M_J = 2r$. Of course the issue here is to make these invariants well-defined and to see how they depend on the space of Joyce manifolds modulo diffeomorphisms for a fixed $M_8$.
2.2. Type H: Calabi–Yau Complex 4-manifold.
2.2.1. Geometrical setup. Suppose now that the holonomy group for $(M_8, g)$ with metric g is SU(4). So $M_8$ is a complex manifold and we can assume that g is a Calabi–Yau metric with a Kähler 2-form ω. We choose a holomorphic covariantly constant (4,0)-form $\Omega$ which trivializes the canonical bundle K. We normalize $\Omega$ so that $\Omega \wedge \bar\Omega$ is the volume element of $M_8$. We also choose the trivial $\sqrt{K}$ for the spin structure on $M_8$.
We know that complex spinors can be identified with forms: $S_M^{\pm} \otimes \mathbb{C} \simeq \Lambda^{0,\,\mathrm{even/odd}}$ and the Dirac operator with $\bar\partial + \bar\partial^*$. Real Majorana spinors $S_M \subset S_M \otimes \mathbb{C}$ are the fixed points of a conjugation b on $S_M \otimes \mathbb{C}$. We can identify b with a conjugate linear ∗ operator as follows. For any Calabi–Yau $M_{2n}$, define $* : \Lambda^{0,p} \to \Lambda^{0,n-p}$ by $\langle \alpha, \beta \rangle = \int_{M_{2n}} \Omega \wedge \alpha \wedge *\beta$, where now $\Omega \in \Lambda^{n,0}$. (If one denotes by $*_1$ the usual map on complex manifolds: $\Lambda^{p,q} \to \Lambda^{n-q,n-p}$, then $\bar{*}_1 = \Omega \wedge *$ on $\Lambda^{0,q}$.) When n = 4, one can show that conjugation b equals $(-1)^q *$ on $\Lambda^{0,q}$. Consequently, the operator $\bar\partial^* + P_+\bar\partial : \Lambda^{0,1} \to \Lambda^{0,0} + \Lambda^{0,2}_+$ is the Dirac operator from $S_M^- \to S_M^+$. Here $\Lambda^{0,2}_\pm$ is the ± eigenspace of ∗, $P_\pm$ is the projection of $\Lambda^{0,2}$ on $\Lambda^{0,2}_\pm$; we have identified $\Lambda^{0,1}$ with $\frac{1-*}{2}(\Lambda^{0,1} + \Lambda^{0,3})$ and $\Lambda^{0,0}$ with $\frac{1+*}{2}(\Lambda^{0,0} + \Lambda^{0,4})$. The sequence $\Lambda^0 \xrightarrow{\bar\partial} \Lambda^{0,1} \xrightarrow{P_+\bar\partial} \Lambda^{0,2}_+$ is elliptic and is the linearization of the equation $P_+F_A = 0$, modulo gauge transformations.
Suppose now (E, ρ) is a complex Hermitian vector bundle over $M_8$ with metric ρ of $\dim_{\mathbb{C}} = N$. If A is a connection for E, we have its covariant differential $D_A : C^\infty(E) \to C^\infty(E \otimes \Lambda^1)$ so that $D_A = \partial_A + \bar\partial_A$ with $\bar\partial_A : C^\infty(E) \to C^\infty(E \otimes \Lambda^{0,1})$. By introducing local complex coordinates $z^\mu$, $\bar\partial_A(s^I) = (\partial_{\bar\mu} + (A^I_J)_{\bar\mu}\, s^J)\, d\bar z^{\mu}$, $I, J = 1, \dots, N$. So $(A^I_J)_{\bar\mu}\, d\bar z^{\mu}$ is a (0,1)-form on $M_8$ with N × N matrix coefficients. The 1-form connection A with values in GL(N, C) does not split naturally into $\Lambda^{0,1} + \Lambda^{1,0}$ unless E is holomorphic. A splitting can be obtained by a choice of almost complex structure on the principal bundle. See Bartolomeis and Tian [24]. In any case, the curvature $F_A$ can be decomposed as $F_A = F_A^{2,0} + F_A^{1,1} + F_A^{0,2}$ with $F_A^{0,2} = \bar\partial_A^2$.
For each $\bar\partial$ operator: $C^\infty(E) \to C^\infty(E \otimes \Lambda^{0,1})$, there exists a unique connection A such that (i) A preserves the hermitian metric ρ of E and (ii) $(D_A)^{0,1} = \bar\partial$. Hence, the space $A_P$ of $\bar\partial$ operators can be identified with the connections of the principal bundle P associated with E, which preserve the Hermitian metric. The group of complex gauge transformations H acts on the space $A_P$, because if $h \in H$, then $h^{-1}\bar\partial h$ is also a $\bar\partial$ operator.
Let G be gl(N, C). Then the sequence $\Lambda^0 \otimes G \xrightarrow{\bar\partial_A} \Lambda^{0,1} \otimes G \xrightarrow{P_+\bar\partial_A} \Lambda^{0,2}_+ \otimes G$ is still elliptic on the symbol level. We say $\bar\partial_A$ is holomorphic anti-self-dual if $P_+F_A^{0,2} = 0$, in which case the sequence is elliptic. Its index is the index of $\not\partial \otimes I_G : S_M^- \otimes G \to S_M^+ \otimes G$.
Again, the BRSTQFT will be obtained by gauge fixing $S_0 = \int_{M_8} \Omega \wedge \mathrm{Tr}(F_A^{0,2} \wedge F_A^{0,2})$. $S_0$ is independent of A, because $S_0 = 8\pi^2\, \Omega \cup p_1(E)$, since $\Omega \in \Lambda^{4,0}$. When $S_0 \neq 0$, we can normalize $\Omega$ further by $e^{i\theta}$, so that $S_0$ is real and positive. To verify Eq. (2.4) in the H case, we reduce G to u(N), using the metric ρ. If $\omega \in \Lambda^{0,2}$ has components $\omega_\pm$ in $\Lambda^{0,2}_\pm$, then $\|\omega\|^2 = \|\omega_+\|^2 + \|\omega_-\|^2$. And
$$\begin{aligned}
0 \le S_0 &= \int_{M_8} \mathrm{Tr}\; \Omega \wedge (F_{A+}^{0,2} + F_{A-}^{0,2}) \wedge (F_{A+}^{0,2} + F_{A-}^{0,2}) \\
&= -\|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2 + i\, \mathrm{Im}\, \langle F_{A+}^{0,2}, F_{A-}^{0,2} \rangle \\
&= -\|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2 .
\end{aligned} \qquad (2.26)$$
Hence
$$\|F_A^{0,2}\|^2 = 2\|F_{A+}^{0,2}\|^2 + S_0. \qquad (2.27)$$
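For readers who want the intermediate step, Eq. (2.27) follows by combining the orthogonal decomposition of the norm with Eq. (2.26). The lines below only spell out that arithmetic and use no input beyond those two statements.

```latex
\begin{aligned}
\|F_A^{0,2}\|^2 &= \|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2
  && \text{(orthogonality of } \Lambda^{0,2}_{\pm}\text{)}, \\
S_0 &= -\|F_{A+}^{0,2}\|^2 + \|F_{A-}^{0,2}\|^2
  && \text{(Eq. (2.26))}, \\
\|F_A^{0,2}\|^2 - S_0 &= 2\,\|F_{A+}^{0,2}\|^2 \;\ge\; 0 .
\end{aligned}
```

The bound is saturated exactly when $F_{A+}^{0,2} = 0$, in the same way that Eq. (2.7) is saturated by $F_+ = 0$ in the J case.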
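So the holomorphic anti-self-dual gauge condition minimizes the action $\|F_A^{0,2}\|^2$ in the topological sector with $S_0$ fixed. The (4,0) form $\Omega$ can be simply expressed in local coordinates as
$$\Omega = dz^1 \wedge dz^2 \wedge dz^3 \wedge dz^4. \qquad (2.28)$$
Also
$$F^{(0,2)} = d\bar z^{\bar\mu}\, d\bar z^{\bar\nu}\, F_{\bar\mu\bar\nu}, \qquad (2.29)$$
where
$$F_{\bar\mu\bar\nu} = \partial_{\bar\mu} A_{\bar\nu} - \partial_{\bar\nu} A_{\bar\mu} + [A_{\bar\mu}, A_{\bar\nu}] \qquad (2.30)$$
and
$$D_{\bar\mu} = \partial_{\bar\mu} + [A_{\bar\mu}, \ \cdot\ ]. \qquad (2.31)$$
One has the part of the Bianchi identity
$$D_{[\bar\mu} F_{\bar\nu\bar\rho]} = 0. \qquad (2.32)$$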
The 3 complex gauge covariant gauge conditions, which count for 6 real conditions on the 8 independent real components contained in $A_{\bar\mu}$, are
$${}^cF_{\bar\mu_1\bar\mu_2} + \epsilon_{\bar\mu_1\bar\mu_2\bar\mu_3\bar\mu_4} F_{\bar\mu_3\bar\mu_4} = 0. \qquad (2.33)$$
The two other gauge conditions are given by the following complex equation
$$\partial_{\bar\mu}\, {}^cA_{\bar\mu} + \frac{1}{2}[A_{\bar\mu}, {}^cA_{\bar\nu}] = 0. \qquad (2.34)$$
If we compute the real and imaginary parts of this condition, they give respectively the Landau gauge condition and the first of the seven conditions in (2.11). A similar decomposition of (2.33) gives the six other equations in (2.11). We have now the topological ghost $\Psi_{\bar\mu}$ with 4 independent complex components, and we have the ghost gauge condition
$$D_{\bar\mu}\, {}^c\Psi_{\bar\mu} = 0. \qquad (2.35)$$
(Here and below, we use the left upper symbol c for complex conjugation.) A consequence of the use of complex gauge transformations is that a complex Faddeev–Popov ghost c must be introduced, with complex ghost of ghost φ. Up to the complexification of all fields, we have thus exactly the same field content as the original 4 dimensional Yang–Mills TQFT. This leads us to the BRST algebra that we will shortly display.
2.2.2. Action and observables. From the previous arguments, we must write the BRST algebra in a notation where all fields are complex fields and replace the formula of the J case by
$$\begin{aligned}
sA_{\bar\mu} &= \psi_{\bar\mu} + D_{\bar\mu} c, & s\psi_{\bar\mu} &= -D_{\bar\mu}\phi - [c, \psi_{\bar\mu}], \\
sc &= \phi - \tfrac{1}{2}[c, c], & s\phi &= -[c, \phi].
\end{aligned} \qquad (2.36)$$
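In the antighost sector, we have a complex self dual two-form with 3 independent complex components $\chi_{\bar\mu\bar\nu}$ and
$$s\chi_{\bar\mu\bar\nu} = H_{\bar\mu\bar\nu} - [c, \chi_{\bar\mu\bar\nu}], \qquad sH_{\bar\mu\bar\nu} = [\phi, \chi_{\bar\mu\bar\nu}] - [c, H_{\bar\mu\bar\nu}]. \qquad (2.37)$$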
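We have also the complexified analogues of the antighosts of the four dimensional Yang–Mills TQFT, with the same transformation laws as in (2.21). Because of complexification, there are in the H case more ghosts than in the J case. Thus, part of the gauge fixing will consist in setting equal to zero the imaginary parts of the scalar ghosts $c, \phi, \bar\phi, \eta$. To impose these conditions, and the 3+1 complex gauge conditions Eqs. (2.33) and (2.34), we define
$$\begin{aligned}
Z = \int & [DA_{\bar\mu}][D\,{}^cA_{\bar\mu}][D\Psi_{\bar\mu}][D\,{}^c\Psi_{\bar\mu}][D\kappa_{\bar\mu\bar\nu}][D\,{}^c\kappa_{\bar\mu\bar\nu}][DH_{\bar\mu\bar\nu}][D\,{}^cH_{\bar\mu\bar\nu}] \\
& [D\eta][D\,{}^c\eta][D\phi][D\,{}^c\phi][D\bar\phi][D\,{}^c\bar\phi][Dc][D\,{}^cc][D\bar c][D\,{}^c\bar c][DB][D\,{}^cB]\; \exp S_H,
\end{aligned} \qquad (2.38)$$
$$\begin{aligned}
S_H = {} & \int \big[\Omega \wedge \mathrm{Tr}\, F^{(0,2)} \wedge F^{(0,2)}\big] \\
& + \int d^4z\, d^4\bar z\; s\, \mathrm{Tr}\Big( \kappa_{\bar\mu\bar\nu}\big({}^cF_{\bar\mu\bar\nu} + \epsilon_{\bar\mu\bar\nu\bar\rho\bar\sigma} F_{\bar\rho\bar\sigma} + \tfrac{1}{2}\,{}^cH_{\bar\mu\bar\nu}\big) + {}^c\kappa_{\bar\mu\bar\nu}\big(F_{\bar\mu\bar\nu} + \epsilon_{\bar\mu\bar\nu\bar\rho\bar\sigma}\, {}^cF_{\bar\rho\bar\sigma} + \tfrac{1}{2} H_{\bar\mu\bar\nu}\big) \\
& \qquad + \bar\phi\, D_{\bar\mu}\, {}^c\Psi_{\bar\mu} + {}^c\bar\phi\; {}^cD_{\bar\mu} \Psi_{\bar\mu} + \mathrm{Im}\,\bar\phi\; \mathrm{Im}\, c \\
& \qquad + \bar c\big(\partial_{\bar\mu}\, {}^cA_{\bar\mu} + \tfrac{1}{2}[A_{\bar\mu}, {}^cA_{\bar\mu}] + \tfrac{1}{2}\,{}^cB\big) + {}^c\bar c\big({}^c\partial_{\bar\mu} A_{\bar\mu} + \tfrac{1}{2}[{}^cA_{\bar\mu}, A_{\bar\mu}] + \tfrac{1}{2} B\big) \Big).
\end{aligned} \qquad (2.39)$$
If we develop the s-exact terms and eliminate the auxiliary fields H and B we get a supersymmetric action starting with $\mathrm{Tr}(F \wedge *F)$, because $\frac{1}{4}\|F_A\|^2 = \|F_A^{0,2}\|^2 + \frac{1}{4}\|\langle F, \omega \rangle\|^2$ + topological terms, and a Feynman–Landau gauge fixing term $|\partial \cdot A|^2$. The action of the H case is similar to that of the J case after elimination of the imaginary parts of $c, \phi, \bar\phi, \eta$ by means of the equations of motion coming from $s(\mathrm{Im}\,\bar\phi\; \mathrm{Im}\, c)$. Moreover, if one separates fields into their real and imaginary parts, one finds a mapping between the ghosts of the H and J case (for instance the six antighosts contained in the complex self dual two-form $\kappa_{\bar\mu\bar\nu}$ and the imaginary part of the antighosts $\bar c$ of the H cases can be identified as the seven ghosts $\kappa_i$ of the J case). Actually, up to this mapping, the actions of the H and J cases are almost identical. The definition of observables follows from the cocycles obtained by the descent equations, as sketched in the previous section. Their meaning is now discussed.
2.2.3. Geometric interpretation. Let $\tilde M$ denote $[A \in A_P]$ with $F_+^{0,2} = 0$. It is invariant under H (which acts on G in $\Lambda^2_+ \otimes G$, but not on $\Lambda^2_+$.) Let $M_H = \tilde M / H$. The 3 complex covariant gauge conditions, $\Lambda^{0,2}_+ = 0$, probe the moduli space $M_H$. We remarked earlier that $0 \to \Lambda^0 \otimes G \xrightarrow{\bar\partial_A} \Lambda^{0,1} \otimes G \xrightarrow{P_+\bar\partial_A} \Lambda^{0,2}_+ \otimes G$ is an elliptic complex with $\bar\partial_A^* + P_+\bar\partial_A : \Lambda^{0,1} \otimes G \to \Lambda^0 \otimes G + \Lambda^{0,2}_+ \otimes G$; the elliptic operator $\not\partial_A : S^- \otimes G \to S^+ \otimes G$. The complex gauge condition is $\bar\partial^* \tau = 0$ for $\tau \in \Lambda^{0,1} \otimes G$.
As before, we get a hermitian vector bundle $\tilde E$ over $M_H \times M_8$ with connection. One can compute $c_2$ of $\tilde E$ in terms of its curvature $F^H$. One has the map T of $H^{0,*}(M_8)$ into forms on $M_H$ by $\mu \xrightarrow{T} \int_{M_8} \Omega \wedge \mathrm{Tr}(F^H \wedge F^H) \wedge \mu$.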
Formally this gives a multilinear map of $H^{0,*}(M_8) \to \mathbb{C}$ by $\mu_1, \dots, \mu_r \to \int_{M_H} T(\mu_1) \wedge \dots \wedge T(\mu_r)$. These would be the expectation values of the observables of the BRSTQFT.
As in Sect. 2.1.3, part of $c_2$ is $\frac{1}{8\pi^2}\mathrm{Tr}\big((F^H)^1_1 \wedge (F^H)^1_1\big)$ with $(F^H)^1_1(\tau, v) = \tau(v) \in u(N)$. If $\sigma \in H^{0,2}(M_8)$ let $T_\sigma$ be the 2-form on $M_H$ given by $\int_{M_8} \Omega \wedge \mathrm{Tr}(\tau_1 \wedge \tau_2) \wedge \sigma$, where $\tau_i$, i = 1, 2, are (0,1) forms on $M_8$ with values in u(N). The formal holomorphic Donaldson polynomial is the symmetric r-multilinear function on $H^{0,2}(M_8)$ given by $\sigma_1, \dots, \sigma_r \to \int_{M_H} T_{\sigma_1} \wedge \dots \wedge T_{\sigma_r}$, when $\dim M_H = 2r$. (Note that if $H^{0,2}(M_8) \neq 0$, then $M_8$ is hyperKähler because elements of $H^{0,*}$ are covariantly constant.)
It will be very interesting to see when formal integration over $M_H$ is justified, and when these invariants depend only on the complex structure of $M_8$, not on the Calabi–Yau metric g, nor the hermitian metric ρ. C. Lewis [12] is investigating the conditions under which $M_H$ is the set of stable holomorphic vector bundles. Since the elliptic operator here is $\not\partial$ again, the virtual dimension of $M_H$ is
$$-\int_{M_8} \hat{A}(M_8)\, \mathrm{ch}(G). \qquad (2.40)$$
2.3. Comparison of H and J cases. Under suitable conditions ($(E,\rho)$ a stable$^{6}$ vector bundle, for example), one expects that the orbit space of $\mathcal A_P$ under the group of complex gauge transformations will be the same as the symplectic quotient, $\mathcal A_P /\!/ G_U$, where $G_U$ are the gauge transformations on $P$ reduced to the compact group $U(N)$. Since $[A;\ \langle F_A^{1,1},\omega\rangle_m = 0,\ m\in M_8]$ is the zero set of the moment map, $\mathcal A_P /\!/ G_U$ is the orbit space of this set under $G_U$. We replace the condition $P_+(F_A^{0,2}) = 0$ with $F_A^{0,2}\in\Lambda^{0,2}_+\otimes gl(N,\mathbb C)$ by the conditions $P_+(F_A^{0,2}) = 0$ and $\langle F_A^{1,1},\omega\rangle = 0$, where now $F_A^{0,2}\in\Lambda^{0,2}\otimes u(N)$ and $\langle F_A^{1,1},\omega\rangle\in u(N)$. One should get the same moduli space of solutions.
In the linearization, the sequence $gl(N,\mathbb C) \xrightarrow{\bar\partial_A} \Lambda^{0,1}\otimes gl(N,\mathbb C) \xrightarrow{P_+\bar\partial_A} \Lambda^{0,2}_+\otimes gl(N,\mathbb C)\to 0$ is replaced by $u(N)\to\Lambda^{0,1}\otimes u(N) \xrightarrow{P_+\bar\partial_A\oplus i_\omega\partial} \Lambda^{0,2}_+\otimes u(N)\oplus u(N)\to 0$, where $i_\omega\partial : \tau\in\Lambda^{0,1}\otimes u(N)\to\langle\partial\tau,\omega\rangle_m\in u(N)$. The operator $i_\omega\partial$ is the linearization of the 0-momentum condition $\langle F_A^{1,1},\omega\rangle_m = 0$; it is also the imaginary part of $\bar\partial_A^* : \Lambda^{0,1}\otimes gl(N,\mathbb C)\to gl(N,\mathbb C)$. Thus, with the reduction of Spin(7) holonomy to Spin(6) = SU(4) holonomy, the 7 dimensional $\Lambda^2_+$ in the J-case decomposes into the 6 dimensional $\Lambda^2_+$ of the H-case plus $\mathbb R$.
$^{6}$ For physicists, one might define $(E,\rho)$ to be stable if it is holomorphic, Einstein-Hermitian, i.e., $F_\rho\cdot\omega$ is a constant multiple of the identity, where $F_\rho$ is the curvature of $(E,\rho)$ relative to its unique $\rho$-connection.
2.4. Link to twisted supersymmetry. We note that the field content of our Yang–Mills BRSTQFT action in 8 dimensions is similar to that of four dimensional topological Yang–Mills theory. Since four dimensional topological Yang–Mills theory is a twisted version of D = 4, N = 2 super Yang–Mills theory and is related by dimensional reduction to the minimal six dimensional supersymmetric Yang–Mills theory, it is natural to expect a similar connection in eight dimensions. This is indeed so; we explain the type J case, although the fields $(c,\bar c,B)$, which were employed to impose the Lorentz condition $\partial^\mu A_\mu = 0$, are neglected. The gauge supermultiplet in eight dimensions consists of one gauge field in $8_v$ (the vector representation), one chiral spinor in $8_s$, one anti-chiral spinor in $8_c$ and two scalars [25]. The reduction of the holonomy group to Spin(7) defines a decomposition of the chiral spinor: $8_s = 1\oplus 7$. Now it is natural to identify $A_\mu$ and $\psi_\mu$ in our topological theory as $8_v$ and $8_c$, respectively. Furthermore $\chi_i$ and $\eta$ just correspond to the chiral spinor $8_s$ according to the above decomposition. Finally $\phi$ and $\bar\phi$ give the remaining two scalars. This exhausts all the dynamical fields in our action of eight dimensional topological Yang–Mills theory. Though we do not work out the transformation law explicitly, we believe this is a sufficiently convincing argument for the fact that the J case is the D = 8 SSYM dimensionally reduced from D = 10, N = 1 SSYM.
The connection between a general supersymmetry transformation and topological BRST transformations is the following: when $M_8$ is flat, the reduction from D = 10 is N = 2 real supersymmetry or N = 1 complex supersymmetry. For curved manifolds, the only surviving supersymmetries are those depending on covariantly constant spinors. In the J case the nilpotent topological BRST symmetry generator is a combination of the real and imaginary parts of the one surviving complex generator of supersymmetry.
As said just above, this supersymmetric Yang–Mills theory in eight dimensions is obtained by dimensional reduction from the D = 10, N = 1 super Yang–Mills theory. This suggests a relationship with superstring theory. It has been argued that the effective world volume theory of the D-brane is the dimensional reduction of the ten dimensional super Yang–Mills theory [26]. Thus the BRSTQFT constructed in this section may arise as an effective action of 7-brane theory. In fact Joyce manifolds are discussed in connection with supersymmetric cycles in [27, 28]. Recently in [29], a six dimensional topological field theory of the ADHM sigma model was obtained as a world volume theory of D-5 branes. The world volume theory of D-branes could provide a variety of higher dimensional BRSTQFT's.
3. Coupling of the 8D Theory to a 3-Form
For the pure Yang–Mills theory, we have seen that the construction of a BRSTQFT implies a consistent breaking of the SO(D) invariance. This turns out to be quite natural when closed but not exact forms exist, like the Kähler 2-form on Kähler manifolds or the holomorphic (n, 0)-form on Calabi–Yau manifolds. This idea extends to BRSTQFT's involving sets of possibly interacting p-form gauge fields with (p+1)-form curvatures $G_{p+1} = dB_p + \dots$, satisfying relevant Bianchi identities. Our point of view is that one must define a system of equations, eventually interpreted in BRSTQFT as gauge conditions, which does not overconstrain the fields. If tensors $T^{\mu_1,\dots,\mu_{2p+2}}$ of rank $2p+2$ ($2p+2\le D$) exist which are invariant under maximal subgroups of SO(D), we can consider BRSTQFT based on gauge functions of the following type, where $\lambda$ is a parameter:
$$T^{\mu_1,\dots,\mu_{2p+2}}\,G_{\mu_{p+2},\dots,\mu_{2p+2}} = \lambda\,G^{\mu_1,\dots,\mu_{p+1}}. \tag{3.1}$$
Such equations must be understood in a matricial form, since they generally involve several forms $B_p$, with different values of p. To ensure that the problem is well defined, a first requirement is that Eq. (3.1) has solutions in $G_{p+1}$ for $\lambda$ different from zero. This algebraic question is in principle straightforward to solve by group theory arguments, although we expect that geometrical arguments should also justify them. Moreover, we must also consider that $G_{p+1}$ is the curvature of a p-form gauge field $B_p$. Thus, other gauge functions must be introduced, to gauge fix the ordinary gauge freedom of $B_p$ which leaves invariant its curvature $G_{p+1}$. This gives a second requirement, since from the point of view of the quantization, the total number of gauge conditions, the topological ones and the ordinary ones, must be exactly equal to the number of independent components in the gauge field $B_p$.
To be more precise, the number of ordinary gauge freedom of a p-form gauge field in D dimensions is $C^{p-1}_{D-1}$ (this amounts to the fact that $B_p$ is truly defined up to a (p−1)-form, which is itself defined up to a (p−2)-form, and so on). We should therefore only retain invariant tensors T such that the number of components of $B_p$ equals the rank of the system of linear equations in G presented in Eq. (3.1) plus the number of ordinary gauge freedom in $B_p$. Obviously, when there are several fields in Eq. (3.1), the counting of independent conditions can become quite subtle, since one must generally combine several equations like Eq. (3.1). For instance, we will display in the next section BRSTQFT theories in dimensions D < 8. Their derivation will appear as rather simple, because they all descend by dimensional reduction from the pure Yang–Mills BRSTQFT based on the set of 6 or 7 independent self-duality gauge covariant equations in eight dimensions found in Sect. 2. Without this insight, their derivation would be less obvious.
We now turn to the introduction of a 3-form gauge field in 8 dimensions. In even D = 2k dimensions, Eq. (3.1) has a generic solution for an uncharged (k−1)-form gauge field $B_{k-1}$: assuming the existence of a curvature $G_k$ for $B_{k-1}$, we can consider the obvious generalization of self-duality equations, $G_k = *G_k$. The number of these conditions is $C^{k}_{D-1}$. On the other hand, the number of ordinary gauge freedom of a (k−1)-form gauge field is $C^{k-2}_{D-1} = C^{k-2}_{D} - C^{k-3}_{D} + C^{k-4}_{D} - \dots \pm C^{0}_{D}$. Thus imposing the ordinary gauge fixing conditions for the (k−1)-form gauge field plus the gauge covariant ones, $G_k = *G_k$, gives a number of $C^{k-1}_{D} = C^{k-2}_{D-1} + C^{k}_{D-1}$ equations, which is equal to the number of arbitrary local deformations of the $C^{k-1}_{D}$ independent components of the (k−1)-form gauge field.
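The balance between field components and gauge conditions described above is pure combinatorics, so it can be checked mechanically. The following is a minimal sketch, not part of the original derivation: the function names are ours, and the check only counts components (it says nothing about whether real self-dual k-forms exist in a given signature).

```python
from math import comb


def ordinary_gauge_freedom(D, p):
    """Gauge freedom of a p-form in D dimensions, counted as the alternating
    sum over the tower of (p-1)-form, (p-2)-form, ... gauge parameters."""
    return sum((-1) ** j * comb(D, p - 1 - j) for j in range(p))


def self_duality_conditions(k):
    """Independent conditions in G_k = *G_k for D = 2k: half of C(2k, k)."""
    return comb(2 * k, k) // 2


def counting(k):
    """Return (components of B_{k-1}, ordinary gauge freedom, self-duality conditions)."""
    D = 2 * k
    return (
        comb(D, k - 1),
        ordinary_gauge_freedom(D, k - 1),
        self_duality_conditions(k),
    )


for k in range(2, 7):
    D = 2 * k
    components, ordinary, topological = counting(k)
    assert ordinary == comb(D - 1, k - 2)         # closed form C_{D-1}^{k-2}
    assert topological == comb(D - 1, k)          # closed form C_{D-1}^{k}
    assert components == ordinary + topological   # components = gauge conditions
    print(f"D={D}: {components} = {ordinary} + {topological}")
```

For k = 2 this is the familiar four dimensional Yang–Mills balance, 4 = 1 + 3; the same loop covers the 3-form in D = 8 as the case k = 4.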
We will see that it is possible to generalize the self duality equation satisfied by a (k−1)-form gauge field. Moreover, the counting remains correct in the case it has a charge. As an example, in the 8-dimensional theory, a 3-form gauge field has 56 components, with 21 ordinary gauge freedom, while the number of self dual equations involving the 4-form curvature of the 3-form is 35, and one has 56 = 21 + 35. We thus propose as topological gauge conditions for the coupled system made of the Yang–Mills field A and the 3-form gauge field $B_3$ the following coupled equations:
$$\lambda F_{\mu\nu} = T_{\mu\nu\rho\sigma}F_{\rho\sigma},\qquad dB_3 + *(dB_3) + \alpha\,\mathrm{Tr}(F\wedge F)_+ = 0. \tag{3.2}$$
$\alpha$ is a real number, possibly quantized, and $\mathrm{Tr}(F\wedge F)_+$ denotes the self dual part of $\mathrm{Tr}(F\wedge F)$ $^{7}$. Although $B_3$ is real valued, it interacts with the Yang–Mills connection A when $\alpha\neq 0$. An octonionic instanton solves the first equation, as shown in [14] and by Eqs. (25), (30), (31) of [15] in the case of $M_8 = S^7\times\mathbb R$. For this solution, the 4-form $\mathrm{Tr}(F\wedge F)$ is not self dual. Given these facts, we are led to define a BRSTQFT in 8 dimensions based on the gauge conditions (3.2), in which a 3-form gauge field is coupled to a Yang–Mills field. The ghost spectrum for the ordinary gauge invariance of the field $B_3$ generalizes that of the Yang–Mills field, with the following unification between the ghost $B^1_2$ and the ghosts of ghosts $B^2_1$ and $B^3_0$:
$$\hat B_3 = B_3 + B^1_2 + B^2_1 + B^3_0. \tag{3.3}$$
(From now on upper indices mean ghost number and lower indices ordinary form degree.)
$^{7}$ Equation (3.2) suggests that the 3-form could be involved in an anomaly compensating mechanism. See Sect. 3.1, where we show that Eq. (3.15) implies $dB_3 = 0$ if M is compact.
The BRST symmetry of the topological Yang–Mills theory considered in the previous section satisfies
$$\hat A = A + c, \tag{3.4}$$
$$\hat F = (s+d)\hat A + \tfrac12[\hat A,\hat A] = F^0_2 + \Psi^1_1 + \phi^2_0, \tag{3.5}$$
with the notation $\Psi^1_1 = \Psi_\mu dx^\mu$ and $\phi^2_0 = \phi$. The gauge symmetry of the 3-form $B_3$ involves a 2-form infinitesimal parameter associated to $B^1_2$. We can distinguish however different topological sectors for $B_3$, which cannot be connected only by these infinitesimal gauge transformations. As an example, $B_3$ and $B_3'$ can belong to such different sectors, if
$$B_3' = B_3 + \mathrm{Tr}\big(A\wedge dA + \tfrac23\,A\wedge A\wedge A\big). \tag{3.6}$$
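As a short worked check of why the shift in (3.6) matters, recall the standard identity relating the Chern–Simons 3-form to $\mathrm{Tr}(F\wedge F)$. The symbol $\omega_3(A)$ is our shorthand for the 3-form appearing in (3.6) and is not notation used in the text; the curvature is taken as $F = dA + A\wedge A$, in agreement with (3.5).

```latex
\begin{aligned}
\omega_3(A) &= \mathrm{Tr}\Big(A\wedge dA + \tfrac{2}{3}\,A\wedge A\wedge A\Big),\\
d\,\omega_3(A) &= \mathrm{Tr}\big(dA\wedge dA\big) + 2\,\mathrm{Tr}\big(dA\wedge A\wedge A\big)
               = \mathrm{Tr}(F\wedge F),
               \qquad \mathrm{Tr}(A\wedge A\wedge A\wedge A) = 0,\\
dB_3' &= dB_3 + \mathrm{Tr}(F\wedge F).
\end{aligned}
```

Since $\omega_3(A)$ is in general only locally defined, $dB_3$ and $dB_3'$ can differ by a 4-form that is not exact, which is one way to see that the two fields need not lie in the same sector.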
We thus define the curvature of $B_3$ as
$$G^{(A)}_4 = dB_3 + \mathrm{Tr}\big(F^{(A)}\wedge F^{(A)}\big), \tag{3.7}$$
where the index (A) means the dependence upon the Yang–Mills field A. Notice that it is not globally possible to eliminate the A dependence of $G^{(A)}_4$ by a field redefinition of $B_3$ involving the Chern–Simons 3-form. The topological BRST symmetry of the 3-form gauge field system is defined from
$$\hat G_4 = (s+d)\hat B_3 + \mathrm{Tr}\big(\hat F^{(A)}\wedge\hat F^{(A)}\big) = G_4 + G^1_3 + G^2_2 + G^3_1 + G^4_0, \tag{3.8}$$
that is
$$(s+d)\big(B_3 + B^1_2 + B^2_1 + B^3_0\big) + \mathrm{Tr}\,\big(F^0_2 + \Psi^1_1 + \phi^2_0\big)\wedge\big(F^0_2 + \Psi^1_1 + \phi^2_0\big) = dB_3 + \mathrm{Tr}(F\wedge F) + G^1_3 + G^2_2 + G^3_1 + G^4_0. \tag{3.9}$$
The fields $G^g_{4-g}$, g = 1, 2, 3, 4, are the topological ghosts of $B_3$. By expansion in ghost number, Eqs. (3.5) and (3.9) define a BRST operation s which, eventually, determines the equivariant cohomology of arbitrary deformations of the Yang–Mills field modulo ordinary gauge transformations and of the 3-form gauge field, modulo the infinitesimal gauge transformations, $\delta B_3 = d\epsilon_2$, $\epsilon_2\sim\epsilon_2 + d\epsilon_1$, $\epsilon_1\sim\epsilon_1 + d\epsilon$. There is a natural topological invariant candidate for the classical part of a BRSTQFT action,
$$I_{top} = \int G^{(A)}_4\wedge G^{(A)}_4 + \Omega\wedge\mathrm{Tr}\big(F^{(A)}\wedge F^{(A)}\big). \tag{3.10}$$
Its gauge fixing is a generalization of what we do in the pure Yang–Mills case. The main point is to find the gauge function in the topological sector. The existence of the octonionic instanton, together with an associated moduli space (yet to be explored), indicates that Eq. (3.2) is a good choice $^{8}$. To enforce the gauge function Eq. (3.2), one must introduce a self-dual 4-form antighost $\kappa^{\mu\nu\rho\sigma}$, and consider the following BRST exact action:
$$S_3 = \int d^8x\; s\,\kappa^{\mu\nu\rho\sigma}\big(\partial_{[\mu}B_{\nu\rho\sigma]} + \epsilon_{\mu\nu\rho\sigma\alpha\beta\gamma\delta}\,\partial_{[\alpha}B_{\beta\gamma\delta]} + \mathrm{Tr}\,F_{[\mu\nu}F_{\rho\sigma]}\big). \tag{3.11}$$
$^{8}$ Notice that one could also consider a 7-dimensional theory, which is formally related to the BRSTQFT in 8 dimensions as the 3-dimensional Chern–Simons theory is related to the 4-dimensional Yang–Mills TQFT action.
The remaining conditions are for the usual gauge invariances of forms, whether they are classical or ghost fields. One can choose the following gauge fixing conditions for the longitudinal parts of all ghosts and ghosts of ghosts $G^g_{4-g}$,
$$\partial^\mu G^1_{3\,\mu\nu\rho} = a\,\Delta^1_{2\,\nu\rho},\qquad \partial^\mu G^2_{2\,\mu\nu} = b\,\Delta^2_{1\,\nu},\qquad \partial^\mu G^3_{1\,\mu} = c\,\Delta^3_0. \tag{3.12}$$
One must also conventionally gauge fix the longitudinal components of $B_{3\,\mu\nu\rho}$, of the ghosts $B^1_{2\,\mu\nu}$ and $B^2_{1\,\mu}$, and of the antighosts. The presence in the r.h.s. of Eq. (3.12) of the cocycles $\Delta^g_{3-g}$ stemming from the ghost decomposition of $\mathrm{Tr}\,\hat F\wedge\hat F = \mathrm{Tr}\,(F+\Psi+\phi)\wedge(F+\Psi+\phi)$ is an interesting possibility. It can lead to mass effects in TQFT, when the ghost of ghost $\phi$ takes a given mean value, depending on the choice of the vacuum in the moduli space, which can be adjusted by suitable choices of the parameters a, b, c. All these gauge conditions can be enforced in a BRST invariant way, as explained e.g. in [30]. The final result is an action of the following type (including the pure Yang–Mills part discussed in the previous sections)
$$S = \int\big(\partial_\mu B_{\nu\rho\sigma}\,\partial^\mu B^{\nu\rho\sigma} + \mathrm{Tr}\,F^{\mu\nu}F_{\mu\nu} + \partial_\mu B_{\nu\rho\sigma}\,\mathrm{Tr}\,F^{\mu\nu}F^{\rho\sigma} + \text{supersymmetric terms}\big). \tag{3.13}$$
The observables are defined from all forms $\hat O^g_{8-g}$ occurring in the ghost expansion of the 8-form
$$\hat O_8 = \hat G_4\wedge\hat G_4. \tag{3.14}$$
Whether these supersymmetric terms, made of ghost interactions, are linked to Poincaré supersymmetry is an interesting question.
3.1. Mathematical Interpretation. Fix an element of $H^4(M_8,\mathbb Z)$ and let $h_4$ denote its harmonic representative. Let ß denote the affine space of all closed 4-forms which represent this cohomology class. Then ß $= h_4 + d\Lambda^3$; strictly speaking ß $= h_4 + d(\Lambda^3/\text{closed 3-forms}) = h_4 + d\delta\Lambda^4$. In any case a tangent vector to ß can be represented as $dB_3$ with $B_3$ a 3-form. There are other ways of describing ß. An element of ß can be represented as a collection of 3-forms $\{B_u\}$, for a collection of coordinate neighborhoods U covering $M_8$, satisfying $B_u - B_v = dw_{u,v}$ on $u\cap v$. Thus $\{dB_u\}$ gives a well-defined closed form on $M_8$; to be an element of ß, this 4-form must be cohomologous to $h_4$. In the earlier part of this section, $dB_3$ means this element of ß when $B_3$ is defined locally as $B_{3,u}$; or if $B_3$ is an ordinary three form, $dB_3$ is really $h_4 + dB_3$.$^{9}$
$^{9}$ The theory of gerbes [31] gives a sheaf theoretic description for exhibiting integral cohomology classes, extending the notion of curvature field as an integral 2-cocycle.
Next consider the elliptic complex $0\to\Lambda^0\to\Lambda^1\to\Lambda^2\to\cdots\to\Lambda^4_+\to 0$, where $\Lambda^4_\pm$ are the ±1 eigenspaces of the ordinary ∗ operator on $M_8$. Remember that in the J-case we also had $0\to\Lambda^0\to\Lambda^1\to\Lambda^2_+\to 0$ with $\Lambda^2 = \Lambda^2_-\oplus\Lambda^2_+$ of dimensions 21 and 7, respectively. Consider then $0\to\Lambda^2_-\xrightarrow{d}\Lambda^3\xrightarrow{d}\Lambda^4_+\to 0$. We leave the reader to check that it is elliptic. (It does not suffice that the dimensions are 21, 56 and 35, respectively.) The linearization of the problem below involves $0\to\Lambda^0\otimes G\xrightarrow{D_A}\Lambda^1\otimes G\xrightarrow{D_A}\Lambda^2_+\otimes G\to 0$ for connections, and $0\to\Lambda^2_-\xrightarrow{d}\Lambda^3\xrightarrow{d}\Lambda^4_+\to 0$ for 3-forms. An analogue of the anti-self-dual equations for the pair (A, G) with a connection A and G ∈ ß is
$$\text{(a)}\quad F_A = *\,\Omega\wedge F_A\quad(\text{i.e. } P_+F_A = 0),\qquad \text{(b)}\quad (1+*)G = -\alpha\,\mathrm{Tr}(F_A\wedge F_A)_+. \tag{3.15}$$
This equation is a mathematical interpretation of (3.2). Note that if a solution A, $G = h_4 + dB_3$ exists for (3.15), then $\mathrm{Tr}(F_A\wedge F_A)$ is self-dual and hence harmonic. Hence $(1+*)(h_4 + dB_3)$ is harmonic. Since $(1+*)h_4$ is harmonic, so is $(1+*)dB_3$. Hence $dB_3 = 0$ (a harmonic form is closed, so $d{*}dB_3 = 0$ and, on compact $M_8$, $\|dB_3\|^2 = \langle B_3, d^*dB_3\rangle = 0$), and $G = h_4$. Note also that the sector ß, i.e. the element chosen in $H^4(M_8,\mathbb R)$, must have its self-dual part a multiple of the self-dual element $p_1(P)$.
If we linearize (3.15), we get for $\tau\in T(G)$ and $B_3\in T(\text{ß})$ the equations $P^{(2)}_+(D_A\tau) = 0$ and $P^{(4)}_+ dB_3 = 0$, where $P^{(j)}_+$ is the projection of $\Lambda^j\to\Lambda^j_+$ (j = 2, 4). We then have a pair of elliptic systems above, with gauge fixing functions $D_A^*\tau = 0$ and $d^*B_3 = 0$, respectively. The covariant gauge functions are given by (3.15). The candidate for the topological action $S_0(A,G)$ is $\int_{M_8} G\wedge G + \Omega\wedge\mathrm{Tr}(F\wedge F)$. Since we now have the covariant gauge functions to probe the moduli space of solutions to (3.15) and we have the gauge fixing functions, we can apply the BRST formalism. We first express $S_0$ in terms of the norms. From (2.7),
$$\|F_A\|^2 = \int_{M_8}\Omega\wedge\mathrm{Tr}(F_A\wedge F_A) + 4\,\|(F_A)_+\|^2. \tag{3.16}$$
Also with $G = G_+ + G_-$, $G_\pm\in\Lambda^4_\pm$, we have $\int_{M_8} G\wedge G = \|G_+\|^2 - \|G_-\|^2$. Thus one obtains $\|F_A\|^2 + \|G\|^2 = S_0 + 4\,\|(F_A)_+\|^2 + 2\,\|G_+\|^2$. We know that $\|F_A\|^2$ is minimized when $F_+ = 0$, and that $\|G\|^2$ is minimized when $G = h$. So we get a minimum when (3.15) is satisfied and it equals $S_0 + 16\pi^4\alpha^2\int_{M_8}p_1^2 = S_0^1$.
In the pure YM case, the natural space was $\mathcal A_P/G\times M_8$ or its subspace $\mathcal M_J\times M_8$. Rather than 3-forms on $M_8$, we need 3-forms on $\mathcal M_J\times M_8$ which we write as $\hat B_3 = B^0_3 + B^1_2 + B^2_1 + B^3_0$ (Eq. (3.3), above) with the upper index as the degree in the $\mathcal M_J$ direction (ghost number) and the lower index in the $M_8$ direction. As before s denotes $d_{\mathcal M_J}$ so that $(s+d)\hat B_3 = (d_{\mathcal M_J} + d_{M_8})\hat B_3 = d_{\mathcal M_J}(\hat B_3) + d_{M_8}(\hat B_3)$ is a 4-form with terms in the $(a,b)$ directions.
4. BRSTQFT's for Other Dimensions Than 8
From many points of view the case D = 8 is exceptional. It is of interest, however, to also build BRSTQFT's in other dimensions, by using the BRST quantization of d-closed Lagrangians with gauge functions as in Eq. (3.1). In this section, we first focus on theories with D < 8, which we directly obtain by various dimensional reductions in flat space of the J and H theories; we then comment on the cases D = 12 and D = 10. We will not address the question of observables; their determination is clear from the descent equations which can be derived in all possible cases from the knowledge of the BRST symmetry.
4.1. Dimensional reduction of the Yang–Mills 8D BRSTQFT. In D = 8, for the J-case, we have seen that there exists a set of seven self-duality equations, on which we have based our BRSTQFT. These equations were complemented with a Landau gauge condition to get a system of 8 independent equations for the 8 components of $A_\mu$. These seven equations can be written as
$$\Phi_i\big(F_{\mu\nu}(x^\mu)\big) = 0,\qquad 1\le i\le 7,\qquad 1\le\mu,\nu\le 8.\qquad [(2.14)] \tag{4.1}$$
Just as one obtains a BRSTQFT action based on Bogomolny equations in 3 dimensions [32], we can define a BRSTQFT in seven dimensions, by standard dimensional reduction on the eighth coordinate; that is, by putting in the above seven equations $x_8 = 0$, $\partial_8 = 0$ and replacing $A_8$ by a scalar field $\varphi(x^j)$ and $F_{i8}$ by $D_i\varphi(x^j)$. We can then gauge fix the longitudinal part of $A_i$, with an equation of the following type:
$$\partial_i A_i = [v,\varphi], \tag{4.2}$$
which allows for the case of a massive gauge field A. (Here and in what follows, the constant v defines a direction in the Lie algebra for the Yang–Mills symmetry.) The gauge fixed action will be
$$\int_{M_7} d^7x\,\big(|F_{ij}|^2 + |D_i\varphi|^2 + |\partial_iA_i - [v,\varphi]|^2 + \text{supersymmetric terms}\big). \tag{4.3}$$
This process can be iterated. We can go down from dimension 8 to 8 − n, by suppressing the dependence on n of the coordinates $x^\mu$. In D < 8 dimensions we will have a gauge field with D = 8 − n components and a set of n scalar fields $\varphi_p$, p = 1, ..., n, which should be considered as Higgs fields. Obviously, the dimensional reduction applies as well to the various ghosts, and the fields $\varphi_p$ fall into topological BRST multiplets, which, depending on the case, can possibly be interpreted as twisted Poincaré supermultiplets. Moreover, as we will see when D = n = 4, there is an interesting option to assign the fields $\varphi_p$ as elements of other representations, e.g. spinorial ones, of SO(D).
One can also consider the dimensional reduction in the H-case. One can break the symmetry between the coordinates y, z, t, w and their complex conjugates by replacing some of the fields, e.g. $\mathrm{Im}\,A_{\bar w}$, by scalar fields.
In all cases, the final theories rely on 8 independent gauge conditions for all fields: 7 for the topological gauge ones plus 1 for the ordinary gauge condition, if one starts from the J case; or 6 for the 3 complex topological gauge conditions plus 2 for the ordinary complex gauge condition, if one starts from the H case.
4.1.1. The case D=6. Since the case D = 6 is of great interest in superstring theory, let us display what we get, starting from the H case,
$$\epsilon_{\bar i\bar j\bar k}F_{\bar j\bar k} = D_{\bar i}\varphi, \tag{4.4}$$
$$\partial_{\bar i}A_{\bar i} = [v,\varphi]. \tag{4.5}$$
This set of gauge functions represents 4 complex equations, for eight degrees of freedom represented by the complex fields $A_{\bar i}$ and $\varphi$. If we start from the J case, we have
$$\Phi_i\big(F_{\mu\nu}(x^\mu), D_\mu\varphi^a(x^\mu)\big) = 0,\qquad 1\le a\le 2,\qquad 1\le\mu,\nu\le 6, \tag{4.6}$$
possibly complemented by
$$\partial_\mu A_\mu = M_{a,b}\,\varphi^a\varphi^b + N_{a,b}\,v^a\varphi^b. \tag{4.7}$$
Notice that a 2-form gauge field, subject to the topological invariance $sB_2 = \Psi_2 + \dots$, can be introduced, still in 6 dimensions, with the topological self-dual gauge condition
$$dB_2 + *(dB_2) + \alpha\,\mathrm{Tr}\big(AdA + \tfrac23 AAA\big)_+ = 0. \tag{4.8}$$
This possibility is similar to the introduction of a 3-form in D = 8.
We can directly build a BRSTQFT in 6 dimensions. First we consider a pure Yang–Mills case, taking the topological gauge fixing condition of the type
$$\lambda F_{\mu\nu} = \tfrac12\,T_{\mu\nu\rho\sigma}F^{\rho\sigma}. \tag{4.9}$$
The fourth rank tensor $T_{\mu\nu\rho\sigma}$ is assumed to be invariant under some maximal subgroup of SO(6). According to Corrigan et al. [6], only SO(4) × SO(2) and U(3) allow such an invariant tensor. The first choice corresponds to the case where the 6D manifold is a direct product of a 4D manifold and a 2D Riemann surface, $M_6 = M_4\times\Sigma_2$. The second subgroup is the holonomy group of 6 dimensional Kähler manifolds. In this case we can write down the invariant tensor as the Hodge dual of a Kähler form $\omega$,
$$T_{\mu\nu\rho\sigma} = (*\,\omega)_{\mu\nu\rho\sigma}. \tag{4.10}$$
The possible eigenvalues $\lambda$ of (4.9) with the tensor (4.10) are 1, −1 and −2. The eigenspaces of these eigenvalues give the decomposition of the 15 dimensional representation of SO(6) under its subgroup SU(3) × U(1): $15 = 8\oplus(3\oplus\bar 3)\oplus 1$.$^{10}$ Taking $\lambda = 1$ defines the 8 dimensional subspace given by the following seven linear conditions on $F_{\mu\nu}$, where we use complex indices a, b = 1, 2, 3:
$$F_{ab} = F_{\bar a\bar b} = 0, \tag{4.11}$$
$$\omega^{a\bar b}F_{a\bar b} = 0. \tag{4.12}$$
(The last Eq. (4.12) is, e.g., $F_{1\bar 1} + F_{2\bar 2} + F_{3\bar 3} = 0$.) The first condition (4.11) means that the connection is holomorphic. These equations are known as the Donaldson-Uhlenbeck-Yau (DUY) equations for the moduli space of stable holomorphic vector bundles on a Kähler manifold. They also appear in the Calabi–Yau compactification of the heterotic strings. The DUY equations imply the standard second order equation of motion for the Yang–Mills field$^{11}$. In fact, this follows from the following identity at the level of the action density:
$$-\tfrac14\,\mathrm{Tr}\,F\wedge *F + \omega\wedge\mathrm{Tr}(F\wedge F) = \mathrm{Tr}\Big(-\tfrac32\,g^{a\bar a}g^{b\bar b}F_{ab}F_{\bar a\bar b} + \big(g^{a\bar b}F_{a\bar b}\big)^2\Big), \tag{4.13}$$
where we have introduced the metric $g_{a\bar b}$ for the Kähler form $\omega$. This identity [24] is crucial in constructing a BRST Yang–Mills theory whose classical action is the topological density $\omega\wedge\mathrm{Tr}(F\wedge F)$.
$^{10}$ The usual splitting of $\Lambda^2\otimes\mathbb C$ into $\Lambda^{1,1}\oplus\Lambda^{2,0}\oplus\Lambda^{0,2}$ with $\Lambda^{1,1}$ decomposed into $\lambda\omega\oplus\omega^\perp$, where $\omega$ is the Kähler form.
$^{11}$ This is a general property of the system (4.9).
From the BRST point of view, one must introduce scalar fields to get a correct balance between the gauge fixing conditions and the field degrees of freedom and to
recover Eq. (4.4). Given a hermitian connection A for the hermitian vector bundle $(E,\rho)$, Eq. (4.4) says $F_A^{0,2} = (\bar\partial_A)^*\tilde\varphi = *_1^{-1}\partial_A *_1\tilde\varphi$, i.e., $*_1F_A^{0,2} = \partial_A(*_1\tilde\varphi)\in\Lambda^{1,3}\otimes G$. (See Sect. 2.2.1 for the definition of the operation $*_1$.) When M is a Calabi–Yau 3-fold, let $\varphi = *\tilde\varphi\in\Lambda^0\otimes G$ and we get $*F_A^{0,2} = \bar\partial_A\varphi$. Linearization gives the usual elliptic operator, the holomorphic analogue of $\begin{pmatrix}\mathrm{curl} & \mathrm{grad}\\ \mathrm{div} & 0\end{pmatrix}$:
$$\begin{pmatrix}\bar\partial_A & \bar\partial_A^*\\ \bar\partial_A^* & 0\end{pmatrix} : \begin{pmatrix}\Lambda^{0,1}\otimes G\\ \Lambda^{0,3}\otimes G\end{pmatrix}\longrightarrow\begin{pmatrix}\Lambda^{0,2}\otimes G\\ \Lambda^{0,0}\otimes G\end{pmatrix}. \tag{4.14}$$
Of course what one wants is not (4.4) but $F_A^{0,2} = 0$, Eq. (4.11), the condition that makes E a holomorphic bundle. However, as a consequence of the Bianchi identity, $\bar\partial_AF_A^{0,2} = 0$ and hence (4.4) implies $\bar\partial_A\bar\partial_A^*\tilde\varphi = 0$, which also implies $\bar\partial_A^*\tilde\varphi = 0$ when M is compact without boundary. Thus (4.4) implies (4.11); moreover, when M is a Calabi–Yau 3-fold and E is stable, $\bar\partial_A^*\tilde\varphi = 0$ (equivalently, $\bar\partial_A\varphi = 0$) only happens when $\varphi$ is a constant multiple of I in u(N). In that sense, the right-hand side of Eq. (4.5) is 0, giving the gauge fixing condition $\bar\partial^*\tau = 0$, $\tau\in\Lambda^{0,1}\otimes G$.
Equation (4.12) is the equation $\langle F,\omega\rangle_m = 0$ (see Sect. 2.3). As stated there, the orbit space under complex gauge transformations should be the same as the symplectic quotient, the orbit space under unitary gauge transformations of the 0-momentum set, i.e., the condition $\langle F,\omega\rangle_m = 0$. Equation (4.13) is a special case of Proposition 3.1 in [24], which we have used previously in Sect. 2.2.2.
The DUY equations can also be obtained from the 6 dimensional supersymmetric Yang–Mills theory on a Calabi–Yau manifold. The supersymmetry transformation laws of the (N = 1) vector multiplet $(A_M,\Psi)$ in 6 dimensions are
$$\delta A_M = i\,\bar\Xi\,\Gamma_M\Psi - i\,\bar\Psi\,\Gamma_M\Xi,\qquad \delta\Psi = -\tfrac{i}{2}\,\Sigma^{MN}\,\Xi\,F_{MN}, \tag{4.15}$$
where $\Gamma_M$ are the gamma matrices and $\Sigma^{MN} = \tfrac14[\Gamma^M,\Gamma^N]$ is the spin representation. On the Calabi–Yau manifold the holonomy group is further reduced to SU(3), which gives a covariantly constant (complex) spinor $\zeta$. In fact this is the very reason why the Calabi–Yau manifold is favorable in the compactification of superstrings to 4 dimensions. We will identify the supersymmetry transformation with $\Xi = \zeta$ as a topological BRST transformation. With this choice of parameter, SUSY transformations are decomposed according to the representations of SU(3). The decomposition of the SO(6) vector is $6 = 3\oplus\bar 3$ and the chiral spinor decomposes as $4 = 3\oplus 1$. Thus we obtain the following topological BRST transformation law:
$$sA_\mu = \psi_\mu,\quad sA_{\bar\mu} = 0,\quad s\psi_\mu = 0,\quad s\chi = g^{\mu\bar\mu}F_{\mu\bar\mu},\quad s\psi_{[\bar\mu\bar\nu]} = F_{\bar\mu\bar\nu},\quad s\rho = 0. \tag{4.16}$$
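We should explain how we have "twisted" spinors into ghosts and anti-ghosts. In terms of the covariantly constant spinor $\zeta$ which satisfies $\bar\zeta\,\Gamma_\mu = 0$, we can make the twist as follows:
$$\chi = \bar\zeta\,\Psi,\qquad \bar\psi_{\bar\mu} = \bar\zeta\,\Gamma_{\bar\mu}\Psi,\qquad \psi_{[\bar\mu\bar\nu]} = \bar\zeta\,\Gamma_{\bar\mu}\Gamma_{\bar\nu}\Psi,\qquad \bar\rho = \epsilon_{\bar\mu\bar\nu\bar\sigma}\,\bar\zeta\,\Gamma_{\bar\mu}\Gamma_{\bar\nu}\Gamma_{\bar\sigma}\Psi, \tag{4.17}$$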
where $(\bar\psi_{\bar\mu},\bar\rho)$ are complex conjugates of $(\psi_\mu,\rho)$. This is an example of the identification of spinors with forms, explained in Sect. 2.1.1. Looking at the BRST transformations of the anti-ghosts, we recover the DUY equations (4.11, 4.12).
4.1.2. Reduction to a 4-D BRSTQFT; Seiberg–Witten equations. We now turn to the reduction to D = 4, which is of special interest, particularly the theory obtained by dimensional reduction of the J theory from D = 8 to D = 4. We will get a BRSTQFT with gauge conditions identical to the non-Abelian Seiberg–Witten equations, which in turn are also related to the N = 4, D = 4 supersymmetric theory.
The main observation is that, in the J case, the set of seven equations (2.14) can be separated into 3 plus 4 equations. If we group $A_5, A_6, A_7, A_8$ into the 4 component field $\varphi_\alpha$, α = 1, 2, 3, 4, the latter can be interpreted in 4 dimensions as a commuting complex Weyl spinor and $A_\mu = A_1, A_2, A_3, A_4$ as a 4 dimensional vector. The set of the first 3 equations in Eq. (2.14) can now be interpreted as the condition that the self-dual part in 4-D of the curvature of $A_\mu$ is equal to a bilinear in $\varphi_\alpha$; then, the remaining four equations can be written as Dirac type equations. To be more precise, with the relevant definition of the 4 × 4 matrices $\Gamma^\mu$ and $\Sigma_{\mu\nu}$, the dimensional reduction down to D = 4 of Eq. (2.14) gives
$$F_{\mu\nu} + \epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma} + {}^t\varphi\,\Sigma_{\mu\nu}\,\varphi = 0,\qquad D^{(A)}_\mu\Gamma^\mu\varphi = 0. \tag{4.18}$$
The consistency of the dimensional reduction from Eq. (2.14) to Eq. (4.18), and the correctness of the SO(4) tensorial properties of all fields, are ensured by the existence of relevant elliptic operators in 8 and 4 dimensions. The remarkable feature is that the above equations are the non-abelian version of Seiberg–Witten equations. In other words, we have observed that the spinors and vectors of the non-abelian S–W theory get unified in the Yang–Mills field of the J theory. The generation of a Higgs potential, to break down the symmetry, with a remaining U (1) is in principle possible, by the relevant modifications in the gauge functions, which provide a Higgs potential, function of ϕ. This is however a subtle issue that we will address elsewhere. The form of the action after dimensional reduction is just the sum of the bosonic part of the Seiberg–Witten action, plus ghost terms. Its derivation is standard from the knowledge of the gauge function, as a BRST exact term, which enforces the gauge functions. The link to supersymmetry in 4 dimensions is as follows. The BRSTQFT based on Spin(7) is a twisted version of the D = 8, N = 1 theory where the spinor is a complex field counting for 16 = 8 + 8 independent real components, and one has a complex scalar field in the supersymmetry multiplet. This theory is itself obtained as the dimensional reduction of the D = 10, N = 1 super Yang–Mills theory, where the spinor has 16 independent real components. Thus we predict that the theory we get by dimensional reduction to 4 dimensions of BRSTQFT in 8 dimensions is related to twisted versions of the D = 4, N = 4 super Yang–Mills theory. For instance, there are 6 scalar fields in the bosonic sector of the theory as presented in the work of Vafa and Witten [16], (see their Eq.(2.1)). In our derivation, these 6 scalar fields are combinations of 4 of the components of the 8-D Yang–Mills field and of the commuting ghost and antighost φ and φ¯ of the J theory.
There are actually three ways of twisting the N = 4 SSYM in four dimensions, defined by how SO(4) ≃ SU(2) × SU(2) is embedded in the R symmetry group$^{12}$ SU(4) [16]. They are (i) (2, 1) ⊕ (1, 2), (ii) (1, 2) ⊕ (1, 2) and (iii) (1, 2) ⊕ (1, 1) ⊕ (1, 1), where we have indicated how the defining representation of SU(4) decomposes under SU(2) × SU(2). Taking into account the argument in Sect. 6 of [27], we can see that the cases (i) and (iii) arise from the reduction of type H and J cases, respectively. The remaining case (ii), which is the twist employed by Vafa-Witten [16], is obtained from the 7 dimensional Joyce manifold with $G_2$ holonomy. On the other hand, we get the non-abelian Seiberg–Witten theory with an adjoint hypermultiplet in the case (iii), which gives the relationship between N = 4 SSYM and the non-abelian Seiberg–Witten equations.
We thus conclude that very interesting twists connect the fields of the pure Yang–Mills 8-D BRSTQFT (obtained by gauge fixing the invariant $\Omega\wedge\mathrm{Tr}(F\wedge F)$), the fields which are involved in the four dimensional Seiberg–Witten equations, and the fields of the D = 4, N = 4 super-Yang–Mills theory.
We note that if one starts from the H case gauge functions, the result of compactifying down to 4 dimensions is just a complexified version of a two dimensional Yang–Mills TQFT, coupled to two scalar fields; it could also be deduced from the dimensional reduction of the 3-dimensional BRSTQFT based on the Bogomolny equations.
4.2. Dimensions larger than 8.
4.2.1. Discussion of the case D=12. A BRSTQFT in 12 dimensions might be a candidate for F-theory. 11-dimensional supergravity, defined on the boundary of a 12 dimensional manifold, emphasizes the relevance of a 3-form gauge field $C_3$, possibly coupled to a non abelian connection one form A.
The most important term $\int_{M_{11}} C_3\wedge dC_3\wedge dC_3$ of the 11-dimensional supergravity suggests that one should build a TQFT based on the gauge-fixing of the following invariant$^{13}$:
$$\int_{M_{12}}\Big(dC_3\wedge dC_3\wedge dC_3 + dC_3\wedge dC_3\wedge P_{\mathrm{inv}\,4}(F) + dC_3\wedge P_{\mathrm{inv}\,8}(F) + P_{\mathrm{inv}\,12}(F)\Big), \tag{4.19}$$
where $P^{\mathrm{inv}}_{n}(F)$ are invariant polynomials of degree n/2 of the curvature of A, i.e., characteristic classes. Special geometries like hyper or quaternionic Kähler manifolds give natural four-forms. They, their duals (which are 8-forms, and are therefore good candidates to define gauge functions for the curvature of a 3-form in 12 dimensions), and their powers might be used as well here. It is natural to try and gauge fix these topological actions to get a BRSTQFT. However, we did not find gauge fixing functions for a single uncharged 3-form gauge field in 12 dimensions. Rather, we did find one for a single charged 3-form, and another one for a theory with two uncharged 3-forms. (See below.) We could introduce a 5-form gauge field (not relevant for pure 11-dimensional supergravity), and similar to the 8-dimensional case, consider self-duality conditions for the 6-form curvature of $C_5$, with a gauge condition of the type
$$dC_5 + * dC_5 + \mathrm{Tr}(F \wedge F \wedge F)^{+} = 0. \tag{4.20}$$
In the present understanding of superstrings, 5-forms are not so natural; so we will not elaborate further on this case.

¹² The R symmetry is the automorphism of the extended supersymmetry algebra.
¹³ Here again $dC_3$ means $h + dC_3$, where h is the harmonic representative of an element in $H^4(M_{12})$.
L. Baulieu, H. Kanno, I. M. Singer
When M12 is a Calabi–Yau 6-fold, we can do some things in two different theories. In the first theory, we couple a charged 3-form B to the Yang–Mills field. (B is valued in the same Lie Algebra as A.) We again use ∗ : 30,q → 30,6−q , so that 30,3 ⊗ G = 0,3 ¯ 0,2 − ∂¯A ∂¯A ∗ B = 0 implies for compact manifolds that 30,3 + ⊗ G + 3− ⊗ G. Again ∂A F ∂¯A ∗ B = 0. The covariant gauge condition is ∗F 0,2 = ∂¯A B, B ∈ 30,3 + ⊗ G; equivalently, ∗ B. So the covariant gauge conditions become the pair F 0,2 = 0 and ∂¯A B = 0, F 0,2 = ∂¯A ∗ similar to the Calabi–Yau 3-fold case in Sect. 4.1.1. There, F 0,2 = 0 and ∂¯A ϕ e = 0, with 0,3 0,3 ϕ e ∈ 3 ⊗ G. In the present case, B ∈ 3+ ⊗ G. The moduli space is a vector bundle over the set of holomorphic bundles for a fixed C ∞ (E, ρ). Each such holomorphic structure gives a unique A with FA0,2 = 0. The fiber ¯ over A consists of [B ∈ 30,3 + ⊗ G ; ∂A B = 0]. ∂¯
∂¯
∂¯
A A A The sequence 0 → 30,0 ⊗ G −→ 30,1 ⊗ G −→ 30,2 ⊗ G −→ 30,3 + ⊗ G is elliptic at the symbol level; linearization of the covariant gauge condition together with the usual gauge fixing is given by the elliptic operator: 0,2 0,1 ∗ ∂¯A ∂¯A 3 ⊗G 3 ⊗G −→ . (4.21) : ∗ ∂¯A 0 30,3 30,0 ⊗ G + ⊗G R We take as classical “topological” action S0 [A, B] = M12 6 ∧Tr (∂¯A B ∧FA ) where 6 is the (6, 0) covariant constant formRof M12 . Since the covariant gauge function is ∗ ∗ ∗ B and since hF 0,2 , ∂¯A Bi = M12 6 ∧ Tr (F 0,2 ∧ ∂¯A B), we have k F 0,2 − F 0,2 − ∂¯A ∗ 2 0,2 2 ∗ 2 0,2 ¯ ∗ ∗ 0,2 ∗ ¯ ¯ ¯ B k2 =k ∂A B k =k F k + k ∂A B k −hF , ∂A Bi − h∂A B, F i, that is, k F 0,2 − ∂¯A 0,2 2 ∗ 2 c ∗ ¯ ¯ ¯ F k + k ∂A B k −S0 [A, B] − S0 [A, B] . (Remember that ∂A = ∗∂A ∗.) We thus ∗ B k2 . obtain a BRSTQFT whose gauge fixed action will include the term k F 0,2 k2 + k ∂¯A 0,3 Moreover, the condition that B ∈ 3+ ⊗ G can be imposed in a BRST invariant by using the ordinary gauge freedom of B 14 . In the second theory, we introduce two uncharged 2-form gauge fields B2a and two (non abelian) Yang–Mills fields Aa , with a = 1 and 2. We consider the following topological classical action Z ab 6 ∧ dB2a ∧ dB2b . (4.22)
M12
We define the following “holomorphic” gauge conditions, where the complex indices run from 1 to 6, c
2 a a a a a ∂[µ¯ Bνa¯ ρ]¯ + ab µ¯ ν¯ ρ¯ α¯ β¯ γ¯ ∂[α¯ Bβb¯ γ] ¯ + A[µ¯ Aν¯ Aρ] ¯ ). ¯ = Tr (A[µ¯ ∂ν¯ Aρ] 3
(4.23)
The right-hand side of this equation is the Chern–Simons form of rank 3. The similarity to 8 dimensions is striking, up to the replacement of the even Chern class by the odd Chern–Simons class. Equation (4.23) implies ∂ ρ¯ ∂[µ¯ Bνa¯ ρ]¯ = ab µ¯ ν¯ ρ¯ α¯ β¯ γ¯ Tr Fρb¯ α¯ Fβb¯ γ¯ .
(4.24)
Its solution is the stationary point of the following action:

¹⁴ The (0,3)-form B is valued in the same Lie algebra as the Yang–Mills field. It is thus non abelian and its quantization involves the field anti-field formalism of Batalin and Vilkovisky. We intend to perform elsewhere this rather technical task, which generalizes that sketched at the end of Sect. 3.0.
Special Quantum Field Theories in Eight and Other Dimensions
Z d12 x ab (∂[µ¯ Bνa¯ ρ]¯ c ∂[µ¯ Bνb¯ ρ]¯ + µ¯ ν¯ ρ¯ α¯ β¯ γ¯ c Bνa¯ ρ¯ Tr Fρb¯ α¯ Fβbγ¯ + complex conjugate). M12
(4.25)

Gauge fixing the Lagrangian Eq. (4.22) by the gauge condition Eq. (4.24) provides a BRST invariant action. Its ghost independent and gauge independent part is identical to the action Eq. (4.25).

4.2.2. Other possibilities. In 10 dimensions one could build a BRSTQFT based on a four-form gauge field $B_4$ and a pair of two-form gauge fields $B_2^a$, a = 1, 2, which naturally fit into the type IIB superstring. All these forms are uncharged, but they can develop non trivial interactions [30]. The curvatures are
$$G_5 = dB_4 + \epsilon_{ab} B_2^a G_3^b, \tag{4.26}$$
$$G_3^a = dB_2^a, \tag{4.27}$$
with Bianchi identities $dG_5 = \epsilon_{ab} G_3^a G_3^b$ and $dG_3^a = 0$. One can construct from these fields one closed 11-form and two 8-forms
$$\Delta_{11} = \epsilon_{ab} G_3^a G_3^b G_5, \tag{4.28}$$
$$\Delta_8^a = G_5 G_3^a. \tag{4.29}$$
The role of the invariant forms is obscure, but their existence could signal generalizations of the Green–Schwarz type anomaly cancellation mechanism. The covariant gauge function is
$$dB_4 + * dB_4 + \epsilon_{ab} B_2^a \, dB_2^b = 0. \tag{4.30}$$
The mixing of forms of various degrees by the gauge functions generalizes that of the 3-form with the Yang–Mills field in the eight dimensional theory of Sect. 3.

5. Conclusion

We have described some new Yang–Mills quantum field theories in dimensions greater than four, using self duality. In eight dimensions we found two BRSTQFT’s depending on holonomy Spin(7) (the J-case) or holonomy SU(4) (the H-case). In the J-case, BRST symmetry is what is left of supersymmetry. The increase in dimension allows us to couple ordinary gauge fields to forms of higher degree. We have given several examples. Dimensional reduction generates new theories. One of them is a BRSTQFT whose gauge conditions are the non-abelian Seiberg–Witten equations.

In four dimensions, given the self duality condition, there are other ways of deriving the Lagrangian of Witten’s topological Yang–Mills theory besides Witten’s twist of N = 2 SSYM and besides BRST [1, 2, 33]. These methods should work equally well in deriving our BRSTQFT Lagrangians for the pure Yang–Mills case.

Finally, as we have indicated earlier, the geometries of the moduli spaces we have probed have not been worked out. Much remains to be done [13]. However, from the lessons learned in four dimensions, it is tempting to hurdle these obstacles and proceed to the corresponding Seiberg–Witten abelian theory. Preliminary investigations indicate that one can compute the Seiberg–Witten invariants when $M_8$ is hyperKähler, i.e., when the holonomy group is Sp(2). This case is very similar to the Seiberg–Witten invariants for $M_4$ when it is Kähler [34].
Acknowledgement. We thank M. Duff for pointing out to us that the octonionic solution in [14] and [15] does not give a self dual T r(F ∧ F ). H.K. would like to thank T. Eguchi and T. Inami for helpful communications. The work of H.K. is supported in part by the Grant-in-Aid for Scientific Research from the Ministry of Education, Science and Culture, Japan. L. B. would like to thank the Yukawa Institute where part of this work has been done, and E. Corrigan and H. Nicolai for discussions. IMS would like to thank G. Tian for bringing him up to date on complex geometry. He also benefited from discussions with S. Axelrod, S. Donaldson, R. Thomas, and E. Weinstein. His work is supported in part by a DOE Grant No. DE-FG02-88ER25066.
Note added on July 17, 1997. T.A. Ivanova has called our attention to [14], where instanton solutions are found. B.S. Acharya and M. O’Loughlin have called our attention to their paper [35] where they discuss self duality for Euclidean gravity when d ≤ 8. B.S. Acharya, M. O’Loughlin and B. Spence also discuss self duality in [36]. In their paper, a note added says that their proof of BRST invariance would “seem to conflict” with our theory not being topological. Indeed the theory is not topological. They made the corrections in a revised version.

We expand on our assertion. Assume M is a compact oriented simply connected manifold with $\hat{A} = 1$ and assume M admits a Joyce metric, i.e., a metric with Spin(7) holonomy. The space of Joyce metrics modulo diffeomorphisms isotopic to the identity is of dimension $1 + b_4^-(M)$ (see Theorem D in [20]). It is conceivable that this manifold of Joyce metrics is not connected so that one cannot find a path from one Joyce metric to another with each point of the path a Joyce metric. The BRST argument for invariance requires a path of Joyce metrics, hence shows formally that the correlation functions are constant on components of the space of Joyce metrics. But the argument does not imply constancy of the correlation functions on all Joyce metrics. This is one reason we chose not to label our J-case QFT a topological quantum field theory.

On the mathematical side the argument analogous to BRST invariance also works formally because the correlation functions come from the second Chern class (see 2.1.3). As we indicated there, to define the analogue of Donaldson invariants (the correlation function precisely), one needs to integrate over the moduli space $M_J$ of self dual connections. To do so, a compactification of $M_J$ is important (work in progress by D. Joyce and C. Lewis).

The H-case (Sect. 2.2.3 in particular) is more complicated. Physicists allow a degeneration of the complex structure to connect one moduli space with another. We do not know how the “holomorphic Donaldson invariants” behave under this degeneration.
References
1. Birmingham, D., Blau, M., Rakowski, M., Thompson, G.: Physics Reports 209, 129 (1991)
2. Cordes, S., Moore, G., Ramgoolam, S.: Lectures on 2D Yang–Mills Theory, Equivariant Cohomology and Topological Field Theories. In: Les Houches Session LXII, hep-th/9411210
3. Donaldson, S.K.: Topology 29, 257 (1990)
4. Witten, E.: Commun. Math. Phys. 117, 353 (1988)
5. Baulieu, L., Singer, I.: Nucl. Phys. B (Proc. Suppl.) 5B, 12 (1988)
6. Corrigan, E., Devchand, C., Fairlie, D.B., Nuyts, J.: Nucl. Phys. B214, 452 (1983)
7. Ward, R.S.: Nucl. Phys. B236, 381 (1984)
8. Fairlie, D.B., Nuyts, J.: J. Phys. A17, 2867 (1984)
9. Fubini, S., Nicolai, H.: Phys. Lett. 155B, 369 (1985)
10. Salamon, S.: Riemannian Geometry, Holonomy Groups. Pitman Research Notes in Mathematics Series, 1989
11. Weinstein, E.: Extension of self-dual Yang–Mills equations across the 8th dimension. PhD dissertation, Harvard University Math. Dept, 1992
12. Joyce, D.D.: Invent. Math. 123, 507 (1996)
13. Donaldson, S.K., Thomas, R.P.: Gauge Theory in Higher Dimensions. Oxford preprint (1996)
14. Ivanova, T.A.: Phys. Lett. B315, 277 (1993); Ivanova, T.A., Popov, A.D.: Lett. Math. Phys. 24, 85 (1992); Theor. Math. Phys. 94, 225 (1993)
15. Günaydin, M., Nicolai, H.: Phys. Lett. B351, 169 (1995)
16. Vafa, C., Witten, E.: Nucl. Phys. B431, 3 (1994)
17. Vafa, C.: Nucl. Phys. B469, 403 (1996), hep-th/9602022
18. Papadopoulos, G., Townsend, P.K.: Phys. Lett. B357, 300 (1995), hep-th/9506150
19. Acharya, B.S.: N=1 M-Theory-Heterotic Duality in Three-Dimensions and Joyce Manifolds. hep-th/9604133; Dirichlet Joyce Manifolds, Discrete Torsion and Duality. hep-th/9611036
20. Joyce, D.D.: J. Diff. Geom. 43, 291, 329 (1996)
21. Shatashvili, S., Vafa, C.: Superstrings, Manifolds of Exceptional Holonomy. hep-th/9407025
22. Figueroa-O’Farrill, J.M.: A Note on the Extended Superconformal Algebras Associated with Manifolds of Exceptional Holonomy. hep-th/9609113
23. Günaydin, M., Gürsey, F.: J. Math. Phys. 14, 1651 (1973)
24. DeBartolomeis, P., Tian, G.: J. Diff. Geom. 43, 231 (1973)
25. Salam, A., Sezgin, E. (Eds): Supergravities in Diverse Dimensions. Amsterdam–Singapore: North-Holland/World Scientific, 1989
26. Witten, E.: Nucl. Phys. B460, 335 (1996), hep-th/9510135
27. Bershadsky, M., Sadov, V., Vafa, C.: Nucl. Phys. B463, 420 (1996), hep-th/9511222
28. Becker, K., Becker, M., Morrison, D.R., Ooguri, H., Oz, Y., Yin, Z.: Supersymmetric Cycles in Exceptional Holonomy Manifolds and Calabi–Yau 4-folds. hep-th/9608116
29. Furuuchi, K., Kunitomo, H., Nakatsu, T.: Topological Field Theory and Second-Quantized Five-Branes. hep-th/9610016
30. Baulieu, L.: Algebraic quantization of gauge theories. In: Perspectives in fields and particles, eds. Basdevant-Levy, Cargese Lectures 1983. London: Plenum Press, 1985; Baulieu, L.: Nucl. Phys. B478, 431 (1996)
31. Brylinski, J.-L.: Loop Spaces, Characteristic Classes, Geometric Quantization. Berlin: Birkhäuser, 1992
32. Baulieu, L., Grossman, B.: Phys. Lett. 214B, 223 (1988)
33. Atiyah, M., Jeffrey, L.: J. Geom. Phys. 7, 120 (1990)
34. Witten, E.: Math. Res. Letters 1, 764 (1994)
35. Acharya, B.S., O’Loughlin, M.: Phys. Rev. D55, R4521 (1997), hep-th/9612182
36. Acharya, B.S., O’Loughlin, M., Spence, B.: hep-th/9705138

Communicated by S.-T. Yau
Commun. Math. Phys. 194, 177 – 190 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Quenched Sub-Exponential Tail Estimates for One-Dimensional Random Walk in Random Environment

Nina Gantert⋆, Ofer Zeitouni⋆⋆

Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel

Received: 13 December 1996 / Accepted: 3 October 1997
Abstract: Suppose that the integers are assigned i.i.d. random variables $\{\omega_x\}$ (taking values in the unit interval), which serve as an environment. This environment defines a random walk $\{X_n\}$ (called a RWRE) which, when at x, moves one step to the right with probability $\omega_x$, and one step to the left with probability $1 - \omega_x$. Solomon (1975) determined the almost-sure asymptotic speed $v_\alpha$ (= rate of escape) of a RWRE. Greven and den Hollander (1994) have proved a large deviation principle for $X_n/n$, conditional upon the environment, with deterministic rate function. For certain environment distributions where the drifts $2\omega_x - 1$ can take both positive and negative values, their rate function vanishes on an interval $(0, v_\alpha)$. We find the rate of decay on this interval and prove it is a stretched exponential of appropriate exponent, that is, the absolute value of the log of the probability that the empirical mean $X_n/n$ is smaller than v, $v \in (0, v_\alpha)$, behaves roughly like a fractional power of n. The annealed estimates of Dembo, Peres and Zeitouni (1996) play a crucial role in the proof. We also deal with the case of positive and zero drifts, and prove there a quenched decay of the form $\exp(-cn/(\log n)^2)$.
⋆ On leave from the Department of Mathematics, TU Berlin. Research supported by the Swiss National Science foundation under grant 8220–046518.
⋆⋆ Partially supported by a US-Israel BSF grant.

1. Introduction

In this paper, we continue the study, initiated in [4] and [2], of tail estimates for a nearest-neighbor random walk on $\mathbb{Z}$ with site-dependent transition probabilities. Let $\omega = (\omega_x)_{x \in \mathbb{Z}}$ be an i.i.d. collection of (0, 1)-valued random variables, with marginal distribution $\alpha$. For every fixed $\omega$, let $X = (X_n)_{n \ge 0}$ be the Markov chain on $\mathbb{Z}$ starting at $X_0 = 0$ (unless explicitly stated otherwise), and with transition probabilities
$$P_\omega(X_{n+1} = y \mid X_n = x) = \begin{cases} \omega_x & \text{if } y = x + 1, \\ 1 - \omega_x & \text{if } y = x - 1, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$
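As a concrete illustration of the transition rule (1), the following minimal sketch samples one environment and then runs the walk in that fixed environment. The two-point law for $\omega_x$ (values 0.8 and 0.4 with weights 0.7 and 0.3) and all function names are assumptions made for this example; they do not come from the paper.

```python
import random


def sample_environment(radius, sample_omega, rng):
    """Draw i.i.d. omega_x for every site x in [-radius, radius]."""
    return {x: sample_omega(rng) for x in range(-radius, radius + 1)}


def run_walk(omega, n_steps, rng, start=0):
    """Run the walk of (1): from x, step right with probability omega[x]."""
    x = start
    path = [x]
    for _ in range(n_steps):
        x += 1 if rng.random() < omega[x] else -1
        path.append(x)
    return path


def two_point(rng, p=0.7, high=0.8, low=0.4):
    """Hypothetical marginal law alpha: omega = high w.p. p, low otherwise."""
    return high if rng.random() < p else low


if __name__ == "__main__":
    n = 10_000
    env_rng, walk_rng = random.Random(1), random.Random(2)
    # The environment is drawn once and then held fixed (quenched setting).
    omega = sample_environment(n, two_point, env_rng)
    path = run_walk(omega, n, walk_rng)
    print("X_n / n =", path[-1] / n)
```

Keeping separate random generators for the environment and for the walk makes it possible to rerun the walk many times in the same environment, which is the quenched point of view; redrawing the environment on every run would sample from the annealed law instead.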
The symbol $P_\omega$ denotes the measure on path space given the environment $\omega$, and is referred to as the “quenched” setting. The process $(X, \omega)$ is an example of a random walk in random environment (RWRE), and X has the law $P = \int \alpha^{\mathbb{Z}}(d\omega) P_\omega$, referred to as the “annealed” law. When no confusion arises, we use P also to denote the law of $(X, \omega)$. We use in various places, when confusion does not occur, P to denote the probability of events constructed from random variables unrelated to the RWRE. For a discussion of the different regimes that the RWRE $X_n$ exhibits, we refer to the introduction in [2].

Abbreviate $\rho = \rho(x, \omega) = (1 - \omega_x)/\omega_x$ and $\langle f \rangle = \int f(\omega)\, \alpha^{\mathbb{Z}}(d\omega)$ for any function f of the environment. Let $\rho_{\max}$ denote the maximum of $\rho$ over the closed support of $\alpha$, and let $\rho_{\min}$ denote the corresponding minimum. We will be interested here in the case $\langle \rho \rangle < 1$ and $\rho_{\max} \ge 1$, in which case (cf. [7]) the RWRE is transient and, P-a.s.,
$$\lim_{n \to \infty} n^{-1} X_n = v_\alpha := \frac{1 - \langle \rho \rangle}{1 + \langle \rho \rangle}. \tag{2}$$
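For an environment law $\alpha$ with finitely many atoms, the speed in (2) is a short computation. The sketch below assumes a hypothetical two-point law ($\omega = 0.8$ with probability 0.7, $\omega = 0.4$ with probability 0.3); the helper names are illustrative only.

```python
def mean_rho(atoms):
    """<rho> for a finite law given as (omega_value, probability) pairs."""
    return sum(p * (1.0 - w) / w for w, p in atoms)


def speed(atoms):
    """v_alpha = (1 - <rho>) / (1 + <rho>), used here only when <rho> < 1."""
    m = mean_rho(atoms)
    if m >= 1.0:
        raise ValueError("formula (2) is applied here only when <rho> < 1")
    return (1.0 - m) / (1.0 + m)


atoms = [(0.8, 0.7), (0.4, 0.3)]  # assumed example law, not from the paper
rho_max = max((1.0 - w) / w for w, _ in atoms)
print(mean_rho(atoms), rho_max, speed(atoms))
```

For this example $\langle \rho \rangle = 0.7 \cdot 0.25 + 0.3 \cdot 1.5 = 0.625 < 1$ and $\rho_{\max} = 1.5 \ge 1$, so the law falls in the regime considered here, with $v_\alpha = 0.375/1.625 = 3/13$.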
Tail estimates for $X_n/n$ have been derived for the quenched setting in [4]. In particular, it was shown there that, P-a.s., the random variables $X_n/n$ satisfy with respect to $P_\omega$ a large deviation principle of speed n and explicit, deterministic, rate function I(v), defined as follows (see [4, Theorem 2 and Corollary 1]). Let $f(r, \omega)$, $r \ge 0$, denote the continued fraction function
$$f(r, \omega) = \cfrac{1}{e^r (1 + \rho(0, \omega)) - \cfrac{\rho(0, \omega)}{e^r (1 + \rho(1, \omega)) - \cfrac{\rho(1, \omega)}{\ddots}}},$$
and let $\lambda(r) = \exp \langle \log f(r, \omega) \rangle$. Let $r(v) = 0$ for $v \le v_\alpha$, and for $v \in (v_\alpha, 1]$, let $r(v)$ be the unique solution of the equation $v^{-1} = -\lambda'(r)/\lambda(r)$. Then,
$$I(v) = \begin{cases} -r(v) - v \log \lambda(r(v)), & v \in [0, 1], \\ I(-v) + v \langle \log \rho \rangle, & v \in [-1, 0], \\ \infty, & v \notin [-1, 1]. \end{cases}$$
Furthermore, $I(v) = 0$ for $v \in [0, v_\alpha]$ and I is strictly positive elsewhere.

Our goal in this paper is to study in greater detail the regime $v \in (0, v_\alpha)$ under $P_\omega$. In the annealed setting, i.e., when one is interested in $P(X_n \le nv)$, $v \in (0, v_\alpha)$, sub-exponential rates of decay were derived in [2]. We summarize now the main results of [2] relevant to us. Recall (cf. [2]) that when $\langle \rho \rangle < 1$, there exists a unique $s > 1$ satisfying $\langle \rho^s \rangle = 1$.

Theorem 1 (see [2]). Let $v \in (0, v_\alpha)$.
(a) Positive and negative drifts. Suppose that $\langle \rho \rangle < 1$ and $\rho_{\max} > 1$. Then,
$$\lim_{n \to \infty} \log P(X_n \le nv) / \log n = 1 - s.$$
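(b) Positive and zero drifts. Suppose that $\langle \rho \rangle < 1$ but $\rho_{\max} = 1$ and $\alpha(1/2) > 0$. Then, with $C_1 = \frac{3}{2} \left| \frac{\pi \log \alpha(1/2)}{2} \right|^{2/3}$ and $C_2 = \left| \frac{\pi (\log \langle \rho \rangle)}{8} \right|^{2/3}$,
$$-C_1 \Big(1 - \frac{v}{v_\alpha}\Big)^{1/3} \le \liminf_{n \to \infty} \frac{1}{n^{1/3}} \log P(X_n \le nv) \le \limsup_{n \to \infty} \frac{1}{n^{1/3}} \log P(X_n \le nv) \le -C_2 \Big(1 - \frac{v}{v_\alpha}\Big)^{1/3}. \tag{3}$$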
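The exponent s of Theorem 1(a) is defined only implicitly by $\langle \rho^s \rangle = 1$. As a rough numerical sketch, it can be located by bisection, since $s \mapsto \langle \rho^s \rangle$ is convex, is below 1 at $s = 1$ when $\langle \rho \rangle < 1$, and is unbounded when $\rho_{\max} > 1$. The two-point law and the function names below are assumptions chosen for illustration, not taken from [2].

```python
def moment(atoms, s):
    """<rho^s> for a finite law given as (omega_value, probability) pairs."""
    return sum(p * ((1.0 - w) / w) ** s for w, p in atoms)


def solve_s(atoms, tol=1e-12):
    """Bisection for the root s > 1 of <rho^s> = 1 (needs <rho> < 1 < rho_max)."""
    rho_max = max((1.0 - w) / w for w, p in atoms if p > 0)
    if not (moment(atoms, 1.0) < 1.0 < rho_max):
        raise ValueError("need <rho> < 1 and rho_max > 1")
    lo, hi = 1.0, 2.0
    while moment(atoms, hi) < 1.0:  # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if moment(atoms, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


atoms = [(0.8, 0.7), (0.4, 0.3)]  # assumed example law, not from the paper
s = solve_s(atoms)
print("s =", s)
print("annealed exponent 1 - s   =", 1.0 - s)
print("quenched exponent 1 - 1/s =", 1.0 - 1.0 / s)
```

The last two printed numbers are the polynomial exponent of Theorem 1(a) and the stretched-exponential exponent that appears in Theorem 2 below, evaluated for the assumed law.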
Maybe surprisingly, it turns out that the annealed estimates are key to understanding the quenched asymptotics. The next theorems are our main results. They quantify the fact that the annealed probabilities of large deviations are of bigger order than their quenched counterparts, due to the possibility of rare fluctuations in the environment which may slow down the RWRE.

Theorem 2 (Positive and negative drifts). Suppose that $\langle \rho \rangle < 1$, $\rho_{\max} > 1$, and let $v \in (0, v_\alpha)$. Then, for P-a.a. $\omega$, the following statements hold:
1. For any $\delta > 0$,
$$\limsup_{n \to \infty} \frac{1}{n^{1 - 1/s - \delta}} \log P_\omega(X_n < nv) = -\infty. \tag{4}$$
2. For any $\delta > 0$,
$$\liminf_{n \to \infty} \frac{1}{n^{1 - 1/s + \delta}} \log P_\omega(X_n < nv) = 0. \tag{5}$$
Furthermore,
$$\limsup_{n \to \infty} \frac{1}{n^{1 - 1/s}} \log P_\omega(X_n < nv) = 0. \tag{6}$$
One should compare the rate of decay obtained in Theorem 2 with the annealed polynomial rate of decay (see Theorem 1) $P(X_n < nv) \simeq n^{1-s}$. As in [2], tail estimates are different when the drift cannot be negative:

Theorem 3 (Positive and zero drifts). Suppose that $\langle \rho \rangle < 1$, $\rho_{\max} = 1$, and $\alpha(\{1/2\}) > 0$. Then, for P-a.a. $\omega$, and for $v \in (0, v_\alpha)$,
$$-c_1 \Big(1 - \frac{v}{v_\alpha}\Big) \le \liminf_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(X_n < nv) \le \limsup_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(X_n < nv) \le -c_2 \Big(1 - \frac{v}{v_\alpha}\Big)^2. \tag{7}$$
Here, c1 = |π log α({1/2})|2 /8 and c2 = |π loghρi|2 /243 . Again, the rate in Theorem 3 should be compared with the annealed rate (cf. Theorem 1) P(Xn < nv) ' exp(−Ci n1/3 ). Remarks. 1. As in [2], we have not covered the case of hρi < 1, ρmax = 1, while α({1/2}) = 0. The tail estimates in the annealed case were conjectured in [2, p. 681] to be of the form exp(−Di nβ ), i = 1, 2, for some β ∈ (1/3, 1) determined by the tails of α(·) near 1/2. The same proof as in Theorem 3 then shows that the upper quenched estimates in Theorem 3 become exp(−dn/(log n)γ ), with γ = 1/β − 1.
2. In the setting of Theorem 2, we conjecture that actually
$$\liminf_{n \to \infty} \frac{1}{n^{1 - 1/s}} \log P_\omega(X_n < nv) = -\infty.$$
In fact, the derivation of the lower bound in (6) hints at such a limit. In the setting of Theorem 3, we conjecture, as in [2], that the lower bound is sharp, that is
$$\lim_{n \to \infty} \frac{(\log n)^2}{n} \log P_\omega(X_n < nv) = -c_1 \Big(1 - \frac{v}{v_\alpha}\Big).$$
In fact, it was shown recently (see [6]) that the lower bound is sharp in the annealed setting, that is one may replace $C_2$ in the right hand side of (3) by $C_1$. This however does not suffice for closing the gap in our Theorem 3, see the comment following the proof of the theorem.
3. In the setting of Theorem 2, it is natural to attempt to improve on (4), (5) by allowing for $\delta_n \to_{n \to \infty} 0$. Such improvement is possible if in Theorem 1.1 of [2], one refines the convergence, that is one proves bounds of the form
$$\limsup_{n \to \infty} g_n n^{s-1} P(X_n < nv) < \infty$$
for appropriate $g_n \to_{n \to \infty} 0$ sub-polynomially, which is possible albeit tedious. It seems however impossible in this way to completely close the gap between the upper and lower bounds exhibited in (4) and (5).

We conclude this introduction with two technical lemmas, borrowed from [2], whose proof follows readily from the explicit computations for inhomogeneous random walk of [1, pp. 66–71]. Let $X_n$ denote a RWRE and let $\bar{X}_n$ denote a RWRE with $\omega_0 = 1$. Let $\bar{\tau}_k = \min\{n : \bar{X}_n = k\}$, let $R_k = k^{-1} \sum_{i=1}^{k} \log \rho(i)$, and let $L^0 = \max_{n \ge 0} \{-X_n\}$.

Lemma 1 ([2], Lemma 2.1). For all n, k,
$$P_\omega(\bar{\tau}_k \ge n) \ge \big(1 - e^{-(k-1) R_{k-1}}\big)^n.$$

Lemma 2 ([2], Lemma 2.2). For any $k \ge 1$,
$$P(L^0 \ge k) \le \frac{\langle \rho \rangle^k}{1 - \langle \rho \rangle}.$$
2. Proofs

Proof of Theorem 2. Since the lower bound of Theorem 2 is relatively simple, and the key ideas are already explained in [2], we postpone the discussion of it and begin by providing a sketch of the proof of the upper bound leading to (4), that is, with $\tau_n = \inf\{t : X_t = n\}$, we will explain why
$$\lim_{n \to \infty} \frac{1}{n^{1 - 1/s - \delta}} \log P_\omega(\tau_n > n/v) = -\infty. \tag{8}$$
The required upper bound follows readily. We will omit subsequences, etc. in this sketch, and thus the reader interested in a complete proof should take the next few paragraphs with somewhat of a grain of salt. The precise statement of the required estimate is contained in the statement of Proposition 1.

Divide the interval [0, nv] into blocks of size roughly $k = k_n := n^{1/s + \delta}$. Let $X_n^x$ denote the RWRE started at x, and define
$$T_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k\}, \quad i = 0, \pm 1, \ldots. \tag{9}$$
By slight abuse of notation, we continue to use $P_\omega$ for the quenched law of the $\{X_n^x\}$. By using the annealed bounds of [2], see Theorem 1, one knows that $P(\tau_k > k/v) \sim k^{1-s}$. Hence, taking appropriate subsequences, one applies a Borel–Cantelli argument to control the probability, conditioned on the environment, of the time spent in each such block being large, i.e., one exhibits a uniform estimate on $P_\omega(T_k^{(i)} > k/v)$, cf. Lemma 5.

The next step involves a decoupling argument. Let
$$\bar{T}_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k \text{ or } X_t^{ik} = (i-1)k\}. \tag{10}$$
Then, using Lemma 2 and the Borel–Cantelli lemma, one shows that for all relevant blocks, that is $i = \pm 1, \pm 2, \ldots, \pm n/k$, $P_\omega(\bar{T}_k^{(i)} \ne T_k^{(i)})$ is small enough. Therefore, we can consider the random variables $\bar{T}_k^{(i)}$ instead of $T_k^{(i)}$, which have the advantage that their dependence on the environment is well localized. This allows us to obtain (cf. Lemma 7) a uniform bound on the tails of $\bar{T}_k^{(i)}$, for all relevant i.

The final step involves estimating how many of the k-blocks will be traversed from right to left before the RWRE hits the point nv. This is done by constructing a simple random walk (SRW) $S_t$ whose probability of jump to the left dominates $P_\omega(T_k^{(i)} \ne \bar{T}_k^{(i)})$ for all relevant i. The analysis of this SRW will allow us to claim (cf. Lemma 9) that the number of visits to a k-block after entering its right neighbor is negligible. Thus, the original question on the tail of $\tau_n$ is replaced by a question on the sum of (dominated by i.i.d.) random variables $\bar{T}_k^{(i)}$, which is resolved by means of the tail estimates obtained in the second step. A slight complication is presented by the need to work with subsequences in order to apply the Borel–Cantelli lemma at various places. Going from subsequences to the original n sequence is achieved by means of monotonicity arguments.

Turning now to the complete proof, we first note that it is actually enough to prove a weaker statement. For $\delta \in (0, 1 - 1/s)$, let $C_n = n^\delta$ and let $n_j = [j^{2/\delta}]$. Recall that $\tau_n = \inf\{t : X_t = n\}$, and let $\mu := v^{-1} > v_\alpha^{-1}$. The key to the upper bound is the following proposition, whose proof is postponed.

Proposition 1.
$$\lim_{j \to \infty} \frac{C_{n_j}}{n_j^{1 - 1/s}} \log P_\omega(\tau_{n_j} > n_j \mu) = -\infty. \tag{11}$$

Assuming the proposition holds true, let us show how to complete the proof of the upper bound (4). Note that, for j large,
$$\frac{n_{j+1}}{n_j} \le \frac{(j+1)^{2/\delta} + 1}{j^{2/\delta} - 1} \to_{j \to \infty} 1.$$
Let $j_n$ be such that $n_{j_n} \le n < n_{j_n + 1}$. Then, for any n,
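$$P_\omega(\tau_n > n\mu) \le P_\omega(\tau_{n_{j_n+1}} > n_{j_n} \mu) = P_\omega(\tau_{n_{j_n+1}} > n_{j_n+1}\, \mu(n)),$$
where $\mu(n) = \mu\, n_{j_n} / n_{j_n+1}$. Let N be large such that $\inf_{n \ge N} \mu\, n_{j_n}/n_{j_n+1} > \mu_\alpha$, and consider only $n > N$. One concludes from Proposition 1 that for all $\delta > 0$, P-a.s.,
$$\limsup_{n \to \infty} \frac{1}{n^{1 - \frac{1}{s} - \delta}} \log P_\omega(\tau_n > n\mu) = -\infty. \tag{12}$$
To prove (4), let $v < v' < v_\alpha$ and define $L^{[nv']} = \max\{[nv'] - X_k^{[nv']} ;\ k \ge 0\}$. Then,
$$P_\omega(X_n < nv) \le P_\omega(\tau_{[nv']} > n) + P_\omega(L^{[nv']} \ge [nv'] - nv). \tag{13}$$
By Lemma 2,
$$P(L^{[nv']} \ge [nv'] - nv) = E\big(P_\omega(L^{[nv']} \ge [nv'] - nv)\big) \le \frac{\langle \rho \rangle^{[nv'] - [nv] - 1}}{1 - \langle \rho \rangle}.$$
Hence, one may find some $\varepsilon > 0$, $\theta > 0$ such that
$$P\big(P_\omega(L^{[nv']} \ge [nv'] - nv) \ge e^{-\varepsilon n}\big) \le e^{-\theta n}.$$
Applying now the Borel–Cantelli lemma, one concludes that P-a.s.,
$$\limsup_{n \to \infty} \frac{1}{n} \log P_\omega(L^{[nv']} \ge [nv'] - nv) < -\varepsilon < 0. \tag{14}$$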
(4) follows from (13), (14) and (12).

As mentioned before, the proof of the lower bounds (5) and (6) follows the ideas of [2] (see in particular Remark 4, p. 682). Indeed, it is already explained there why, for any $\delta > 0$,
$$\liminf_{n \to \infty} \frac{1}{n^{1 - 1/s + \delta}} \log P_\omega\Big(\frac{X_n}{n} < v\Big) = 0.$$
In order to see the refined estimate in (6), we recall the following notations from [2]. Let $R_k(m) = \frac{1}{k} \sum_{i=m+1}^{m+k} \log \rho(i)$. Define $\tau_k^x = \inf\{t : X_t^x = k + x\}$ and $\bar{\tau}_k^x = \inf\{t : \bar{X}_t^x = k + x\}$, where $\bar{X}_t^x$ is the RWRE with $\omega(x) = 1$, initiated at x. It follows from Lemma 1 that
$$P_\omega(\tau_{k+1}^x \ge n) \ge P_\omega(\bar{\tau}_{k+1}^x \ge n) \ge \big(1 - e^{-k R_k(x)}\big)^n. \tag{15}$$
For $n = 1, 2, \ldots$, define
$$M_n(x) = \max_{\substack{x \le m \le x+n \\ k \le x+n-m}} k R_k(m).$$
In particular, it follows from (15) that for any $c > 0$ and $l = [n/c]$,
$$P_\omega(\tau_{l+1}^x \ge n) \ge P_\omega(\bar{\tau}_{l+1}^x \ge n) \ge \big(1 - e^{-M_l(x)}\big)^n. \tag{16}$$
We recall the following exceedance bounds, due to Iglehart. For this version, see [5], Theorem A.
Lemma 3. There exist constants $K_1$, $K_2$, such that for any $z \in \mathbb{R}$,
$$\exp\big(-K_1 \exp(-sz)\big) \le \liminf_{l \to \infty} P\Big(M_l(x) - \frac{\log l}{s} \le z\Big) \le \limsup_{l \to \infty} P\Big(M_l(x) - \frac{\log l}{s} \le z\Big) \le \exp\big(-K_2 \exp(-sz)\big).$$

A corollary of Lemma 3 and (16) (taking $y = e^z$) is the following:

Lemma 4. For any $y > 0$ there exists a $c_y > 0$ such that, for any $v' < v_\alpha$,
$$\liminf_{n \to \infty} P\Big(P_\omega(\tau_{[nv']}^x \ge n) \ge e^{-n^{1 - 1/s} / (y (v')^{1/s})}\Big) \ge c_y$$
and the convergence is uniform in x.

Equipped with Lemma 4, we have completed all the preliminaries required for proving (6). Indeed, fix $y > 0$, and let $n_k = 2^{2^k}$. Note that
$$\begin{aligned}
\limsup_{n \to \infty} \frac{\log P_\omega(X_n \le nv)}{n^{1 - 1/s}}
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(X_{n_k} \le n_k v)}{n_k^{1 - 1/s}} \\
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(\tau^{0}_{[n_k v]} \ge n_k)}{n_k^{1 - 1/s}} \\
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(\tau^{n_{k-1}}_{[n_k v] - n_{k-1}} \ge n_k)}{n_k^{1 - 1/s}} \\
&\ge \limsup_{k \to \infty} \frac{\log P_\omega(\tau^{n_{k-1}}_{[n_k v']} \ge n_k)}{n_k^{1 - 1/s}},
\end{aligned}$$
where $v' = v - \varepsilon$ for arbitrary $\varepsilon$. By Lemma 4, and the Borel–Cantelli lemma, for any $z > 0$,
$$P_\omega\big(\tau^{n_{k-1}}_{[n_k v']} \ge n_k\big) \ge e^{-n_k^{1 - 1/s} / z}$$
infinitely often. The conclusion follows by taking $z \to \infty$. This completes the proof of Theorem 2, except that we still have to show Proposition 1.

Proof of Proposition 1. Let $k = k_j = \frac{C_{n_j} n_j^{1/s}}{1 - \varepsilon}$ for some $1 > \varepsilon > 0$. For $X_n^x$ the RWRE started at x, recall that
$$T_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k\}, \quad i = 0, \pm 1, \ldots.$$
By slight abuse of notation, we continue to use $P_\omega$ for the quenched law of the $\{X_n^x\}$. Finally, let $b_n = C_n^{-\delta}$ and $I_j = \big\{-\big[\tfrac{n_j}{k_j}\big] - 1, \ldots, \big[\tfrac{n_j}{k_j}\big] + 1\big\}$. Fix $\mu' > \mu$.
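Lemma 5. For P-a.e. $\omega$, there exists a $J_0(\omega)$ such that for all $j > J_0(\omega)$, and all $i \in I_j$,
$$P_\omega\Big(\frac{T^{(i)}_{k_j}}{k_j} > \mu'\Big) \le b_{n_j}.$$

Proof of Lemma 5. By Chebycheff's bound,
$$P\Big(P_\omega\Big(\frac{T^{(i)}_{k_j}}{k_j} > \mu'\Big) > b_{n_j}\Big) \le \frac{1}{b_{n_j}} P\Big(\frac{T^{(i)}_{k_j}}{k_j} > \mu'\Big) \le \frac{1}{b_{n_j}} k_j^{1 - s + o(1)},$$
where the last inequality follows from Theorem 1(a), and $o(1) \to_{j \to \infty} 0$. Hence,
$$P\Big(P_\omega\Big(\frac{T^{(i)}_{k_j}}{k_j} > \mu'\Big) > b_{n_j} \text{ for some } i \in I_j\Big) \le 3 \Big[\frac{n_j}{k_j}\Big] \cdot \frac{1}{b_{n_j}} k_j^{1 - s + o(1)} \le \frac{3}{n_j^{\delta(s - o(1) - \delta)}} \le \frac{4}{j^{2(s - o(1) - \delta)}},$$
and the conclusion follows from the Borel–Cantelli lemma.

Let $0 < \theta < -\frac{\log \langle \rho \rangle}{1 - \varepsilon}$, $d_n^\theta = e^{-\theta n^{1/s} C_n}$, and recall that
$$\bar{T}_k^{(i)} = \inf\{t > 0 : X_t^{ik} = (i+1)k \text{ or } X_t^{ik} = (i-1)k\}.$$

Lemma 6. For P-a.e. $\omega$, there is a $J_1(\omega)$ s.t. for all $j \ge J_1(\omega)$,
$$P_\omega\big(\bar{T}^{(i)}_{k_j} \ne T^{(i)}_{k_j}, \text{ some } i \in I_j\big) \le d^\theta_{n_j}.$$

Proof of Lemma 6. Again, we use the Chebycheff bound:
$$\begin{aligned}
P\Big(P_\omega\big(\bar{T}^{(i)}_{k_j} \ne T^{(i)}_{k_j}, \text{ some } i \in I_j\big) > d^\theta_{n_j}\Big)
&\le \frac{1}{d^\theta_{n_j}} \cdot \frac{3 n_j}{k_j}\, P\big(\bar{T}^{(0)}_{k_j} \ne T^{(0)}_{k_j}\big) \\
&\le \frac{1}{d^\theta_{n_j}} \cdot \frac{3 n_j}{k_j} \cdot \frac{\langle \rho \rangle^{k_j}}{1 - \langle \rho \rangle} \\
&\le \frac{3}{1 - \langle \rho \rangle}\, n_j^{1 - \frac{1}{s} - \delta} \exp\Big(n_j^{\frac{1}{s} + \delta} \Big(\frac{\log \langle \rho \rangle}{1 - \varepsilon} + \theta\Big)\Big),
\end{aligned}$$
where the second inequality follows from Lemma 2. The conclusion follows from the Borel–Cantelli lemma.

We actually need to iterate the estimates of Lemma 5.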
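Lemma 7. For P-a.e. $\omega$, for all $j > J_0(\omega)$, and each $i \in I_j$, and for $x \ge 1$,
$$P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu' x\Big) \le (2 b_{n_j})^{[x/2] \vee 1}.$$

Proof of Lemma 7. For $1 \le x < 4$, the claim follows from Lemma 5. Assume thus that $x \ge 4$. Then,
$$\begin{aligned}
P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu' x\Big) \le P_\omega\Big(&\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu'(x - 2),\ (i-1) k_j < X^{i k_j}_{[\mu' k_j (x-2)] + 1} < (i+1) k_j, \\
&\min\{t : t \ge [\mu' k_j (x-2)] + 2,\ X_t^{i k_j} = (i+1) k_j\} \ge x \mu' k_j\Big).
\end{aligned}$$
Hence, by the Markov property,
$$\begin{aligned}
P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu' x\Big)
&\le P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu'(x - 2)\Big) \times \sup_{(i-1) k_j < y < (i+1) k_j} P_\omega\big(\inf\{t : X_t^y = (i+1) k_j\} \ge 2 \mu' k_j\big) \\
&\le P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu'(x - 2)\Big) \cdot P_\omega\big(T^{(i)}_{k_j} + T^{(i-1)}_{k_j} > 2 \mu' k_j\big) \\
&\le P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu'(x - 2)\Big) \Big[P_\omega\big(T^{(i)}_{k_j} > \mu' k_j\big) + P_\omega\big(T^{(i-1)}_{k_j} > \mu' k_j\big)\Big] \\
&\le 2 b_{n_j} P_\omega\Big(\frac{\bar{T}^{(i)}_{k_j}}{k_j} > \mu'(x - 2)\Big),
\end{aligned}$$
where the last inequality is a consequence of Lemma 5. The lemma follows by induction.

We need one more preliminary computation related to the bounds in Lemma 7. Let $\{Z^{(i)}_{k_j}\}$, $i = 1, 2, \ldots$ denote a sequence of i.i.d. positive random variables, with
$$P\Big(\frac{Z^{(i)}_{k_j}}{k_j} < \mu'\Big) = 0, \qquad P\Big(\frac{Z^{(i)}_{k_j}}{k_j} > \mu' x\Big) = (2 b_{n_j})^{[x/2] \vee 1}, \quad x \ge 1.$$

Lemma 8. For any $\lambda > 0$, and any $\varepsilon > 0$,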
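$$E\Big(\exp\Big(\lambda \frac{Z^{(i)}_{k_j}}{k_j}\Big)\Big) \le e^{\lambda \mu' (1 + \varepsilon)} + g_j,$$
where $g_j \to_{j \to \infty} 0$.

Proof of Lemma 8.
$$\begin{aligned}
E\Big(\exp\Big(\lambda \frac{Z^{(i)}_{k_j}}{k_j}\Big)\Big)
&= \int_0^\infty P\Big(\frac{Z^{(i)}_{k_j}}{k_j} > \frac{\log u}{\lambda}\Big)\, du \\
&\le e^{\lambda \mu' (1 + \varepsilon)} + \int_{e^{\lambda \mu' (1 + \varepsilon)}}^\infty (2 b_{n_j})^{\left[\frac{\log u}{2 \lambda \mu'}\right] \vee 1}\, du \\
&= e^{\lambda \mu' (1 + \varepsilon)} + g_j,
\end{aligned}$$
where $g_j \to_{j \to \infty} 0$.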
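In order to control the number of repetitions of visits to $k_j$-blocks, we introduce an auxiliary random walk. Let $S_t$, $t = 0, 1, \ldots$, denote a simple random walk with $S_0 = 0$ and
$$P(S_{t+1} = S_t + 1 \mid S_t) = 1 - P(S_{t+1} = S_t - 1 \mid S_t) = 1 - d_n^\theta.$$
Set $M_{n_j} = \frac{1}{C_{n_j}} n_j^{1 - \frac{1}{s}}$.

Lemma 9. For $\theta$ as in Lemma 6, and n large enough,
$$P\Big(\inf\Big\{t : S_t = \Big[\frac{n_j}{k_j}\Big]\Big\} > M_{n_j}\Big) \le \exp\Big(-\frac{\theta \varepsilon}{2} n_j\Big).$$

Proof of Lemma 9.
$$P\Big(\inf\Big\{t : S_t = \Big[\frac{n_j}{k_j}\Big]\Big\} > M_{n_j}\Big) \le P\Big(\frac{S_{[M_{n_j}]}}{M_{n_j}} < \frac{n_j}{k_j M_{n_j}}\Big) = P\Big(\frac{S_{[M_{n_j}]}}{M_{n_j}} < 1 - \varepsilon\Big) \le 2 e^{-M_{n_j} h_{n_j}(1 - \varepsilon)},$$
where the last inequality is a consequence of Cramér's theorem (cf. [3]), and the fact that $d_n^\theta < \varepsilon$. Here,
$$h_n(1 - x) = (1 - x) \log \frac{1 - x}{1 - d_n^\theta} + x \log \frac{x}{d_n^\theta}.$$
Using $h_n(1 - x) \ge -\frac{2}{e} - x \log d_n^\theta$, we get
$$P\Big(\frac{S_{[M_{n_j}]}}{M_{n_j}} < 1 - \varepsilon\Big) \le 2 e^{2 M_{n_j}/e}\, e^{\varepsilon M_{n_j} \log d^\theta_{n_j}} \le e^{-\frac{\varepsilon}{2} \theta n_j}.$$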
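We are now ready to prove (11). Note that, for all $j > J_0(\omega)$, and all $i \in I_j$, we may, due to Lemma 7, construct $\{Z^{(i)}_{k_j}\}$ and $\{\bar{T}^{(i)}_{k_j}\}$ on the same probability space such that $P_\omega\big(Z^{(i)}_{k_j} \ge \bar{T}^{(i)}_{k_j}\ \forall\, i \in I_j\big) = 1$. Fix $\mu_\alpha < \mu' < \mu$ and $\varepsilon > 0$ small enough. Recalling that, under $P_\omega$, the $\bar{T}^{(i)}_{k_j}$ are independent, we obtain, with $\{S_t\}$ defined before Lemma 9, and j large enough,
$$\begin{aligned}
P_\omega(\tau_{n_j} > n_j \mu)
&\le P\Big(\inf\Big\{t : S_t = \Big[\frac{n_j}{k_j}\Big]\Big\} > M_{n_j}\Big) + P\Big(\sum_{i=1}^{M_{n_j}} Z^{(i)}_{k_j} > n_j \mu\Big) \\
&\le e^{-\theta \varepsilon n_j / 2} + P\Big(\frac{1}{M_{n_j}} \sum_{i=1}^{M_{n_j}} \frac{Z^{(i)}_{k_j}}{k_j} > \mu (1 - \varepsilon)\Big) \\
&\le e^{-\theta \varepsilon n_j / 2} + \Big[E\Big(\exp\Big(\lambda \frac{Z^{(i)}_{k_j}}{k_j}\Big)\Big) \cdot e^{-\lambda \mu (1 - \varepsilon)}\Big]^{M_{n_j}} \\
&\le e^{-\theta \varepsilon n_j / 2} + \Big[e^{\lambda (\mu' + 2 \varepsilon \mu - \mu)} + g_j e^{-\lambda \mu (1 - \varepsilon)}\Big]^{M_{n_j}} \\
&\le e^{-\theta \varepsilon n_j / 2} + \big(e^{-\lambda \varepsilon \mu}\big)^{M_{n_j}},
\end{aligned}$$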
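where Lemma 9 was used in the second inequality and Lemma 8 in the fourth. Since $\lambda > 0$ is arbitrary, (11) follows.
Proof of Theorem 3. We begin by giving a quick sketch of the lower bound in (7), based on [2]. By the Erdős–Rényi strong law for the longest run of heads (or the asymptotics for long rare segments in random walks, see e.g. [3, p. 69]), there is a segment $I = (i_{\min}, i_{\max})$, with $i_{\min} \ge n(v-\varepsilon)$, $i_{\max} < nv$ and $i_{\max} - i_{\min} = \log n/(-\log \alpha(\{1/2\}))\,(1+o(1))$, such that $\omega_i = 1/2$ for $i \in I$. Let $\tilde X_n$ denote the RWRE started at $(i_{\min} + i_{\max})/2$. Let $\tau = \min\{t : \tilde X_t = i_{\min} \text{ or } \tilde X_t = i_{\max}\}$. Then $\tau$ possesses the same law as the exit time, denoted $\bar\tau$, of the simple symmetric random walk from the interval $[-(i_{\max} - i_{\min})/2, (i_{\max} - i_{\min})/2]$. As before, we let $\tau_k = \min\{t : X_t = k\}$. We have
$$\begin{aligned}
P_\omega(X_n < nv) &\ge P_\omega\left(\tau_{n(v-\varepsilon)} \ge n\,\frac{v-2\varepsilon}{v_\alpha}\right) P_\omega\left(\tau > n\left(1 - \frac{v}{v_\alpha} + \frac{2\varepsilon}{v_\alpha}\right)\right) \\
&= P_\omega\left(\tau_{n(v-\varepsilon)} \ge n\,\frac{v-2\varepsilon}{v_\alpha}\right) P\left(\bar\tau > n\left(1 - \frac{v}{v_\alpha} + \frac{2\varepsilon}{v_\alpha}\right)\right).
\end{aligned} \tag{17}$$
By Solomon’s law of large numbers, cf. (2),
$$\lim_{n\to\infty} P_\omega\left(\tau_{n(v-\varepsilon)} \ge n\,\frac{v-2\varepsilon}{v_\alpha}\right) = 1. \tag{18}$$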
By standard eigenvalue estimates for the simple random walk (cf. [8, p. 243]), lim
n→∞
n(1 −
v vα
(log n)2 log P (τ¯ > n) = −π 2 /8 . − v2εα )(log α(1/2))2
(19)
Combining (19), (17), and (18), the lower bound in (7) follows.
The proof of the upper bound in (7) follows the proof of part 1 of Theorem 2, except that there is no need for subsequences here. With $\mu = v^{-1} > v_\alpha^{-1} = \mu_\alpha$ and $t \in (0, 1)$, define $\bar\mu = t\mu_\alpha + (1-t)\mu$. Fix $1/2 > \varepsilon > 0$, $\delta > 2$, $b_n = n^{-\delta/2}$ and
$$k = k(n) := \frac{(\log n)^3 (1+\delta)^3}{C_2^3 (\bar\mu - \mu_\alpha)(1-\varepsilon)^3},$$
where $C_2$ was defined in Theorem 1. We define $I_n = \{-[n/k] - 1, \cdots, [n/k] + 1\}$, and use $T_k^{(i)}$ as in (9). Then, following the outline of the proof of Lemma 5,
$$P\left(P_\omega(T_k^{(i)} > \bar\mu k) > b_n\right) \le \frac{\exp\left(-C_2 (\bar\mu - \mu_\alpha)^{1/3} k^{1/3} (1-\varepsilon)\right)}{b_n}, \tag{20}$$
where we have used the bound
$$P(T_k^{(i)} > \bar\mu k) \le \exp\left(-k^{1/3} C_2 (\bar\mu - \mu_\alpha)^{1/3}\right),$$
which follows from Theorem 1 using the inequalities
$$P(T_k^{(i)} > \bar\mu k) \le P(X_{[\bar\mu k]} < k) \le P\left(X_{[\bar\mu k]} < ([\bar\mu k] + 1)/\bar\mu\right).$$
Thus, by the Borel–Cantelli lemma, for P-a.e. $\omega$, there exists an $N_0(\omega)$ such that for all $n > N_0(\omega)$,
$$P_\omega\left(T_k^{(i)} > \bar\mu k, \text{ some } i \in I_n\right) \le b_n. \tag{21}$$
Define $\bar T_k^{(i)}$ as in (10). Set $0 < \gamma < (1+\delta)^3 |\log\langle\rho\rangle| / \left(C_2^3 (\bar\mu - \mu_\alpha)\right)$. With $d_n = \exp(-\gamma(\log n)^3)$, the Borel–Cantelli lemma yields, as in the proof of Lemma 6, that for P-a.e. $\omega$, there exists an $N_1(\omega)$ such that for $n \ge N_1(\omega)$,
$$P_\omega\left(T_k^{(i)} \ne \bar T_k^{(i)}, \text{ some } i \in I_n\right) < d_n. \tag{22}$$
Using (21), one concludes as in Lemma 7 that for P-a.e. $\omega$, for $n > N_0(\omega)$, and each $i \in I_n$,
$$P_\omega\left(\bar T_k^{(i)} > k\bar\mu x\right) \le (2b_n)^{[x/2]\vee 1}. \tag{23}$$
Let $Z_k^{(i)}$, $i = 1, 2, \ldots$, denote a sequence of positive, i.i.d. random variables with
$$P\left(\frac{Z_k^{(i)}}{k} < \bar\mu\right) = 0, \qquad P\left(\frac{Z_k^{(i)}}{k} > \bar\mu x\right) = (2b_n)^{[x/2]\vee 1}, \quad x \ge 1.$$
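The following lemma takes the place of Lemma 8 in the proof of Theorem 2:
Lemma 10. For each $\varepsilon' > 0$, we have, for $\lambda_n = -\log(2b_n)/\left(2\bar\mu(1+\varepsilon')\right)$,
$$E \exp\left(\lambda_n Z_k^{(i)}/k\right) \le e^{\lambda_n\bar\mu} + g_n,$$
where $g_n \to 0$ as $n \to \infty$.
Proof of Lemma 10. Exactly as in the course of the proof of Lemma 8, for $n$ large enough,
$$E \exp\left(\lambda_n Z_k^{(i)}/k\right) = \int_0^\infty P\left(\frac{Z_k^{(i)}}{k} > \frac{\log u}{\lambda_n}\right) du \le e^{\lambda_n\bar\mu} + \int_{e^{\lambda_n\bar\mu}}^\infty (2b_n)^{\frac{\log u}{2\lambda_n\bar\mu}}\, du = e^{\lambda_n\bar\mu} + g_n,$$
where
$$g_n = \int_{e^{\lambda_n\bar\mu}}^\infty u^{(\log 2b_n)/(2\lambda_n\bar\mu)}\, du = \int_{e^{\lambda_n\bar\mu}}^\infty u^{-(1+\varepsilon')}\, du \to 0 \quad \text{as } n \to \infty.$$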
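Let $S_t$, $t = 0, 1, \ldots$, denote the simple random walk with $S_0 = 0$ and $P(S_{t+1} = S_t + 1 \mid S_t) = 1 - P(S_{t+1} = S_t - 1 \mid S_t) = 1 - d_n$, and let
$$M_n = \frac{n C_2^3 (\bar\mu - \mu_\alpha)(1-\varepsilon)^2}{(\log n)^3 (1+\delta)^3}.$$
Mimicking the proof in Lemma 9, we obtain that
$$P\left(\inf\{t : S_t = [n/k]\} > M_n\right) \le \exp(-n\theta\varepsilon), \tag{24}$$
where $\theta = \gamma C_2^3 (\bar\mu - \mu_\alpha)(1-\varepsilon)^2 / \left(3(1+\delta)^3\right)$. Following the proof of Theorem 2, we have
$$\begin{aligned}
P_\omega[\tau_n > n\mu] &\le P\left(\inf\left\{t : S_t = \left[\frac{n}{k}\right]\right\} > M_n\right) + P\left(\sum_{i=1}^{M_n} Z_k^{(i)} > n\mu\right) \\
&\le e^{-n\theta\varepsilon} + P\left(\frac{1}{M_n}\sum_{i=1}^{M_n} \frac{Z_k^{(i)}}{k} > \mu(1-\varepsilon)\right) \\
&\le e^{-n\theta\varepsilon} + \left[E \exp\left(\lambda_n Z_k^{(i)}/k\right) e^{-\lambda_n\mu(1-\varepsilon)}\right]^{M_n} \\
&\le e^{-n\theta\varepsilon} + e^{-\lambda_n M_n (\mu(1-\varepsilon) - \bar\mu - \varepsilon)},
\end{aligned}$$
where the second inequality is due to (24) and the last due to Lemma 10. Plug in the definition of $M_n$ and $\lambda_n$ to get
$$\limsup_{n\to\infty} \frac{(\log n)^2}{n} \log P_\omega(\tau_n > n\mu) \le -\frac{\frac{\delta}{2}\, C_2^3 (\bar\mu - \mu_\alpha)(1-\varepsilon)^2 \left(\mu(1-\varepsilon) - \bar\mu - \varepsilon\right)}{2(1+\delta)^3\, \bar\mu(1+\varepsilon')}.$$
Letting $\varepsilon$ and $\varepsilon' \to 0$ and $\delta \to 2$, one gets
$$\limsup_{n\to\infty} \frac{(\log n)^2}{n} \log P_\omega(\tau_n > n\mu) \le -C_2^3 (\bar\mu - \mu_\alpha) \frac{\mu - \bar\mu}{2\cdot 3^3\, \bar\mu} = -C_2^3 \frac{1}{2\cdot 3^3} \frac{t(1-t)(\mu - \mu_\alpha)^2}{(1-t)\mu + t\mu_\alpha}, \tag{25}$$
where we used the definition of $\bar\mu$ in the last equality. Optimizing over $t \in (0, 1)$ yields
$$\limsup_{n\to\infty} \frac{(\log n)^2}{n} \log P_\omega(\tau_n > n\mu) \le -C_2^3 \frac{1}{2\cdot 3^3} (\mu - \mu_\alpha)^2 \frac{1}{(\sqrt{\mu} + \sqrt{\mu_\alpha})^2}.$$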
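The optimization over $t$ in the last step can be checked numerically. The following is a minimal sketch, not part of the original argument: it compares a brute-force maximum of the $t$-dependent factor in (25) with the closed form $(\mu - \mu_\alpha)^2/(\sqrt{\mu} + \sqrt{\mu_\alpha})^2$. The function names and the sample values of $\mu$, $\mu_\alpha$ are hypothetical, and the maximizer $t^* = \sqrt{\mu}/(\sqrt{\mu} + \sqrt{\mu_\alpha})$ is our own computation rather than a formula stated in the text.

```python
import math


def exponent_factor(t, mu, mu_alpha):
    """t-dependent factor in (25): t(1-t)(mu-mu_alpha)^2 / ((1-t)mu + t mu_alpha)."""
    return t * (1.0 - t) * (mu - mu_alpha) ** 2 / ((1.0 - t) * mu + t * mu_alpha)


def optimal_t(mu, mu_alpha):
    """Candidate maximizer t* = sqrt(mu) / (sqrt(mu) + sqrt(mu_alpha)) (our computation)."""
    return math.sqrt(mu) / (math.sqrt(mu) + math.sqrt(mu_alpha))


def closed_form_maximum(mu, mu_alpha):
    """Value stated after optimizing: (mu - mu_alpha)^2 / (sqrt(mu) + sqrt(mu_alpha))^2."""
    return (mu - mu_alpha) ** 2 / (math.sqrt(mu) + math.sqrt(mu_alpha)) ** 2


def grid_maximum(mu, mu_alpha, steps=20000):
    """Brute-force maximum of exponent_factor over an interior grid of (0, 1)."""
    return max(exponent_factor(i / steps, mu, mu_alpha) for i in range(1, steps))


if __name__ == "__main__":
    mu, mu_alpha = 3.0, 1.2  # hypothetical sample values with mu > mu_alpha > 0
    print("grid maximum  :", grid_maximum(mu, mu_alpha))
    print("closed form   :", closed_form_maximum(mu, mu_alpha))
    print("value at t*   :", exponent_factor(optimal_t(mu, mu_alpha), mu, mu_alpha))
```

The agreement reflects the Cauchy–Schwarz inequality $(\mu/t + \mu_\alpha/(1-t))(t + (1-t)) \ge (\sqrt{\mu} + \sqrt{\mu_\alpha})^2$, which is one way to see where the closed form comes from.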
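To prove the upper bound in (7), observe that for $v < v' < v_\alpha$, by the same argument as in (14),
$$\begin{aligned}
\limsup_{n\to\infty} \frac{(\log n)^2}{n} \log P_\omega\left(\frac{X_n}{n} < v\right) &\le \limsup_{n\to\infty} \frac{(\log n)^2}{n} \log P_\omega\left(\tau_{[nv']} > [nv']\frac{1}{v'}\right) \\
&= \limsup_{n\to\infty} v' \frac{(\log[nv'])^2}{[nv']} \log P_\omega\left(\tau_{[nv']} > [nv']\frac{1}{v'}\right) \\
&\le -C_2^3 \frac{v'}{2\cdot 3^3} \left(\frac{1}{v'} - \frac{1}{v_\alpha}\right)^2 \frac{1}{\left(\frac{1}{\sqrt{v'}} + \frac{1}{\sqrt{v_\alpha}}\right)^2} \\
&\le -C_2^3 \frac{1}{2\cdot 3^3} \left(1 - \frac{v'}{v_\alpha}\right)^2 \frac{v_\alpha}{(\sqrt{v'} + \sqrt{v_\alpha})^2}.
\end{aligned}$$
Letting $v' \to v$, and using $v_\alpha/(\sqrt{v} + \sqrt{v_\alpha})^2 \ge 1/4$, we get
$$\limsup_{n\to\infty} \frac{(\log n)^2}{n} \log P_\omega\left(\frac{X_n}{n} < v\right) \le -C_2^3 \frac{1}{8\cdot 3^3} \left(1 - \frac{v}{v_\alpha}\right)^2, \tag{26}$$
completing the proof of the upper bound in (7).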
Remark. Even when one uses the results of [6] and replaces C2 by C1 in the right hand side of (26), the behaviour of the exponent in the upper bound is quadratic in (vα − v), which is far from the linear behaviour exhibited by the exponent of the corresponding lower bound. While the constant in the upper bound can be slightly further improved (e.g., by using subsequences in the proof), it seems that a new approach is needed to completely close the gap.
Added in proof. A. Pisztora and T. Povel have recently succeeded in closing the gap mentioned above, and established that the lower bound in (7) captures the right asymptotic behaviour.
References
1. Chung, K.L.: Markov chains with stationary transition probabilities. Berlin: Springer, 1960
2. Dembo, A., Peres, Y., Zeitouni, O.: Tail estimates for one-dimensional random walk in random environment. Commun. Math. Phys. 181, 667–683 (1996)
3. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Boston: Jones and Bartlett, 1993
4. Greven, A., den Hollander, F.: Large deviations for a random walk in random environment. Ann. Probab. 22, 1381–1428 (1994)
5. Karlin, S., Dembo, A.: Limit distributions of maximal segmental score among Markov dependent partial sums. Adv. in Appl. Prob. 24, 113–140 (1992)
6. Pisztora, A., Povel, T., Zeitouni, O.: Precise large deviations estimates for one-dimensional random walk in random environment. Submitted
7. Solomon, F.: Random walks in random environment. Ann. Probab. 3, 1–31 (1975)
8. Spitzer, F.: Principles of random walk. Berlin: Springer, 1976
Communicated by Ya. G. Sinai
Commun. Math. Phys. 194, 191 – 205 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Riemann Problem for Pressureless Fluid Dynamics with Distribution Solutions in Colombeau’s Sense
Jiaxin Hu
The Young Scientist Laboratory of Mathematical Physics, Wuhan Institute of Physics and Mathematics, Academia Sinica, Wuhan 430071, P.O. Box 71010, P.R. China. E-mail: [email protected]
Received: 2 May 1997 / Accepted: 6 October 1997
Abstract: The Riemann problem for the equations of pressureless fluid dynamics was considered. Solutions of this problem were constructed by employing the vanishing viscosity approach. For some initial data, the solutions were highly singular near the shock waves. The theory of generalized functions initiated by J.F. Colombeau was applied to handle the multiplication of singular distributions. As a byproduct, an entropy condition was obtained for singular distribution solutions.
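1. Introduction
We are concerned with the one-dimensional equations of pressureless fluid dynamics of the form
$$\begin{cases} \rho_t + (\rho u)_x = 0, \\ (\rho u)_t + (\rho u^2)_x = 0, \end{cases} \qquad (x, t) \in R \times R_+, \tag{1.1}$$
with initial data
$$(\rho, u)|_{t=0} = (\rho_0(x), u_0(x)) = \begin{cases} (\rho_2, u_2), & x > 0, \\ (\rho_1, u_1), & \text{otherwise}, \end{cases} \qquad \rho_1 \ge 0,\ \rho_2 \ge 0. \tag{1.2}$$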
We call (1.1), (1.2) a Riemann problem. System (1.1) is a special form of the one-dimensional fluid dynamics equations. We recall that the equations of fluid dynamics in Eulerian coordinates read
$$\rho_t + (\rho u)_x = 0 \quad \text{(conservation of mass)}, \tag{1.3}$$
$$(\rho u)_t + (\rho u^2 + p)_x = 0 \quad \text{(conservation of momentum)}, \tag{1.4}$$
where $\rho$ and $u$ stand for the density and the velocity of the fluid, respectively, while $p$ denotes the pressure [1]. The density $\rho$ is nonnegative; the regions in the physical space where $\rho = 0$ are identified with vacuum regions of the flow. As we know, a flow is formed by two kinds of effects: the effect of inertia and the effect of pressure difference. If we neglect the effect of pressure difference in (1.4) (that is, the pressure $p$ is constant), (1.3) and (1.4) reduce to (1.1). The system (1.1) also describes other important physical phenomena, see [10, 26] and references therein.
System (1.1) has the double eigenvalue $\lambda = u$ with corresponding right eigenvectors $r = (a, 0)^T$, $a$ being an arbitrary real number, and so $\nabla\lambda \cdot r = 0$. Thus (1.1) is nonstrictly hyperbolic and linearly degenerate. We recall that in general, classical Riemann solutions lie in $L^\infty_{loc}(R \times R_+)$, the space of locally bounded functions (cf. [18]). In the present situation, however, one finds that no classical weak solutions exist for some initial data, and the introduction of linear functionals on $C_0^\infty(R \times R_+)$ into Riemann solutions is found necessary. In other words, Riemann solutions can be viewed as Schwartz generalized functions. This raises the problem of how to define the product of two Schwartz generalized functions. According to L. Schwartz’s theory, it is impossible to define the multiplication of arbitrary Schwartz generalized functions, since the space of all Schwartz generalized functions is not an algebra [16]. Much interesting work has been done to overcome this difficulty [12, 22, 23]. In particular, Colombeau initiated a new algebra $G(R^n)$ of generalized functions, which extends the space of Schwartz generalized functions and allows us to define the multiplication of arbitrary distributions [2, 3]. This new idea was independently introduced by E.E. Rosinger [15], and developed by Oberguggenberger [13, 14], Colombeau, A.Y. LeRoux [4, 5, 6], Todorov [20] and Egorov [9] et al.
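The eigenvalue claim made earlier in this section can be illustrated with a small computation. The sketch below is not taken from the paper: it assumes smooth solutions with $\rho > 0$, for which (1.1) can be rewritten in the variables $(\rho, u)$ as $\rho_t + u\rho_x + \rho u_x = 0$, $u_t + u u_x = 0$, that is, $w_t + A(w) w_x = 0$ with $w = (\rho, u)^T$ and $A = \begin{pmatrix} u & \rho \\ 0 & u \end{pmatrix}$. All function names are hypothetical.

```python
def quasilinear_matrix(rho, u):
    """Coefficient matrix A(rho, u) of the system written in the variables (rho, u)."""
    return [[u, rho], [0.0, u]]


def eigenvalue_data(rho, u):
    """Return (trace, determinant, discriminant) of the characteristic polynomial of A."""
    A = quasilinear_matrix(rho, u)
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return trace, det, trace * trace - 4.0 * det


def apply(A, v):
    """Matrix-vector product for a 2x2 matrix."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]


def grad_lambda_dot_r(a):
    """lambda(rho, u) = u, so grad lambda = (0, 1); r = (a, 0)."""
    grad_lambda = (0.0, 1.0)
    r = (a, 0.0)
    return grad_lambda[0] * r[0] + grad_lambda[1] * r[1]


if __name__ == "__main__":
    rho, u, a = 2.5, -0.7, 3.0  # hypothetical sample state and eigenvector scale
    trace, det, disc = eigenvalue_data(rho, u)
    print("discriminant (0 means a double eigenvalue):", disc)
    print("eigenvalue:", trace / 2.0)
    print("A r      :", apply(quasilinear_matrix(rho, u), [a, 0.0]))
    print("lambda r :", [u * a, 0.0])
    print("grad(lambda) . r:", grad_lambda_dot_r(a))
```

The vanishing discriminant shows that the two characteristic speeds coincide (nonstrict hyperbolicity), and $\nabla\lambda \cdot r = 0$ is the linear degeneracy condition quoted in the text.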
In the present paper, we borrow this new theory to cope with the product of distributions appearing in the non-classical Riemann solutions to (1.1), (1.2). The program of this paper is as follows: in Sect. 2, for the reader’s convenience, we shall give a glimpse of Colombeau’s theory of generalized functions and then interpret in what sense (1.1), (1.2) hold in the framework of Colombeau’s theory (cf. (2.3), (2.4)). In Sect. 3 we shall use the vanishing viscosity approach, first introduced by Dafermos [7] and Tupciev [21], to show that the viscosity regularized problem
$$\begin{cases} \rho_t + (\rho u)_x = \mu t \rho_{xx}, \\ (\rho u)_t + (\rho u^2)_x = \mu t (\rho u)_{xx}, \end{cases} \qquad (x, t) \in R \times R_+, \tag{1.5}$$
with initial data (1.2) has a smooth similarity solution for $\rho_1\rho_2 > 0$. Equivalently, we consider the boundary value problem
$$\mu\rho'' = -\xi\rho' + (\rho u)', \tag{1.6a}$$
$$\mu(\rho u)'' = -\xi(\rho u)' + (\rho u^2)' \tag{1.6b}$$
with boundary conditions
$$(\rho(-\infty), u(-\infty)) = (\rho_1, u_1), \qquad (\rho(\infty), u(\infty)) = (\rho_2, u_2). \tag{1.7}$$
It is shown that (1.6), (1.7) has a smooth solution $(\rho_\mu(\xi), u_\mu(\xi))$ on $(-\infty, \infty)$ for every $\mu > 0$. In Sect. 4 we shall pay attention to the non-classical Riemann solutions and prove that the solutions $(\rho_\mu(\xi), u_\mu(\xi))$ to (1.6), (1.7) obtained in Sect. 3 generate a distribution solution $(P, U)$ to (1.1), (1.2). The weak limit of the sequence $\{\rho_\mu(\xi) : 0 < \mu < 1\}$ is just the macroscopic aspect of $P$ in $G(R)$, and the microscopic profiles of the nonclassical shock waves are analysed. Finally, in the last section we consider the case when $\rho_1 = 0$, $\rho_2 > 0$ or $\rho_1 > 0$, $\rho_2 = 0$.
We mention in passing here that E, Rykov and Sinai once considered (1.1) with more general initial data (except vacuum data) and obtained the global existence of weak solutions by using generalized variational principles [26]. Also Z. Wang, F. Huang and X. Ding investigated this problem by introducing potential functions and Lebesgue–Stieltjes integrals [24, 25]. All of them avoided the difficulty of the multiplication of distributions.
2. A Glimpse of Colombeau’s Algebra $G(R^n)$ of Generalized Functions
In this section we briefly describe the definition of the new generalized functions introduced by Colombeau ([2–4]).
Let $\Omega$ be an open set in $R^n$ and denote by $D(\Omega)$ the set of all $C^\infty$ functions on $\Omega$ with compact support. For $q$ a nonnegative integer, we set
$$A_q = \left\{\varphi \in D(R^n) \text{ such that } \int_{R^n} \varphi(\lambda)\,d\lambda = 1 \text{ and } \int_{R^n} \lambda^i \varphi(\lambda)\,d\lambda = 0 \text{ if } 1 \le |i| \le q\right\}.$$
As usual, $\lambda^i = \lambda_1^{i_1} \cdots \lambda_n^{i_n}$ and $|i| = i_1 + \cdots + i_n$. If $0 < \varepsilon < 1$ we set
$$\varphi_\varepsilon(\lambda) = \frac{1}{\varepsilon^n}\,\varphi\left(\frac{\lambda}{\varepsilon}\right).$$
We denote by $E_M[\Omega]$ the set of all mappings $R(\varphi, x) : A_0 \times \Omega \to R$ such that
(i) for any $\varphi$ the map $R_\varphi : x \to R(\varphi, x)$ is a $C^\infty$ function of the variable $x \in \Omega$;
(ii) if $D = \dfrac{\partial^{|k|}}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}$ is any partial derivation operator and if $K$ is any compact subset of $\Omega$, then there exists an integer $N$ such that if $\varphi \in A_N$ there are constants $C > 0$ and $\eta \in (0, 1]$ such that
$$\sup_{x \in K} |DR_{\varphi_\varepsilon}(x)| \le \frac{C}{\varepsilon^N} \quad \text{if } 0 < \varepsilon < \eta.$$
We set $N[\Omega]$ to be the set of all mappings $R \in E_M[\Omega]$ such that for all $D$ and $K$ as above there exists an integer $N$ such that if $\varphi \in A_q$, $q \ge N$, there exist $C > 0$ and $\eta \in (0, 1]$ such that
$$\sup_{x \in K} |DR_{\varphi_\varepsilon}(x)| \le C\varepsilon^{q-N} \quad \text{if } 0 < \varepsilon < \eta.$$
Colombeau defined a generalized function $G$ on $\Omega$ to be the equivalence class modulo $N[\Omega]$ of a representative $(\varphi, x) \to R(\varphi, x)$ of $G$, i.e., the space $G(\Omega)$ of generalized functions on $\Omega$ is the quotient set $E_M[\Omega]/N[\Omega]$. The operations in $G(\Omega)$ such as differentiation, addition and multiplication are those naturally defined on representatives. $D'(\Omega)$ is naturally imbedded as a vector subspace of $G(\Omega)$: any distribution $T$ on $\Omega$ is considered as the class of the mapping $R(\varphi, x) = \langle T(\lambda), \varphi(\lambda - x)\rangle$ if the function $\lambda \to \varphi(\lambda - x)$ has its support in $\Omega$.
Finally, following Colombeau’s notation, two elements $G_1, G_2 \in G(\Omega)$ are said to be associated (notation $G_1 \approx G_2$) if there exist some representatives $R_1, R_2$ of $G_1, G_2$ satisfying: $\forall \psi \in D(R^n)$ $\exists N$ such that $\forall \varphi \in A_N(R^n)$,
$$\lim_{\varepsilon \to 0} \int_{R^n} \left(R_1(\varphi_\varepsilon, x) - R_2(\varphi_\varepsilon, x)\right)\psi(x)\,dx = 0.$$
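As an illustration of the imbedding and of association, the following minimal sketch (not from the paper) takes $n = 1$ and the Heaviside function $H$. For a mollifier $\varphi$ supported in $(-1, 1)$, the representative of $H$ is $R(\varphi_\varepsilon, x) = \langle H(\lambda), \varphi_\varepsilon(\lambda - x)\rangle = \int_{-x/\varepsilon}^{\infty} \varphi(s)\,ds$, and the code checks numerically that $\int R(\varphi_\varepsilon, x)\psi(x)\,dx$ approaches $\int_0^\infty \psi(x)\,dx$ as $\varepsilon$ decreases. The particular bump function, the test function $\psi$ and the quadrature rule are all assumptions made for the example.

```python
import math


def bump(s):
    """Smooth bump exp(-1/(1-s^2)) supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0


def midpoint(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


NORM = midpoint(bump, -1.0, 1.0, 2000)


def phi(s):
    """Mollifier with unit integral (an element of A_0 in the text's notation)."""
    return bump(s) / NORM


def tail(y, n=200):
    """Integral of phi over (y, infinity); phi vanishes outside (-1, 1)."""
    if y >= 1.0:
        return 0.0
    if y <= -1.0:
        return 1.0
    return midpoint(phi, y, 1.0, n)


def heaviside_representative(eps, x):
    """R(phi_eps, x) = <H(lam), phi_eps(lam - x)> = integral of phi over (-x/eps, infinity)."""
    return tail(-x / eps)


def psi(x):
    """Hypothetical test function, smooth with support in (-1.5, 2.5)."""
    return bump((x - 0.5) / 2.0)


def association_error(eps, n=4000):
    """|int R(phi_eps, x) psi(x) dx - int_0^infty psi(x) dx| on the support of psi."""
    a, b = -1.5, 2.5
    smoothed = midpoint(lambda x: heaviside_representative(eps, x) * psi(x), a, b, n)
    limit = midpoint(psi, 0.0, b, n)
    return abs(smoothed - limit)


if __name__ == "__main__":
    for eps in (0.5, 0.1):
        print("eps =", eps, " error =", association_error(eps))
```

The error shrinks as $\varepsilon$ decreases, which is the numerical counterpart of the statement that the class of this representative is associated with $H$.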
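An element $G$ is said to have a distribution $T \in D'(\Omega)$ as macroscopic aspect iff $G \approx T$, i.e.,
$$\forall \psi \in D(\Omega), \quad \int R(\varphi_\varepsilon, x)\psi(x)\,dx \to \langle T, \psi\rangle \quad \text{as } \varepsilon \to 0,$$
where $R(\varphi, x)$ is some representative of $G$.
Now we are in a position to give the definition of distribution solutions to (1.1), (1.2). We first note that (1.1), (1.2) are invariant under the transformation $x \to \alpha x'$, $t \to \alpha t'$ ($\alpha > 0$). We should seek self-similar solutions of the form $(\rho(x, t), u(x, t)) = (\rho(\xi), u(\xi))$, $\xi = x/t$. Thus (1.1) changes into
$$\begin{cases} -\xi\rho'(\xi) + (\rho(\xi)u(\xi))' = 0, \\ -\xi(\rho(\xi)u(\xi))' + (\rho(\xi)u^2(\xi))' = 0, \end{cases} \qquad {}' = \frac{d}{d\xi},\ \xi \in R, \tag{2.1}$$
and (1.2) into
$$(\rho(\xi), u(\xi)) \to \begin{cases} (\rho_1, u_1), & \xi \to -\infty, \\ (\rho_2, u_2), & \xi \to +\infty. \end{cases} \tag{2.2}$$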
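The passage from (1.1) to (2.1) rests on the chain rule: for $\rho(x, t) = f(x/t)$ one has $t\rho_t = -\xi f'(\xi)$ and $t\rho_x = f'(\xi)$. The sketch below, which is only an illustration with a hypothetical smooth profile $f$, checks these two identities and the scaling invariance by central finite differences.

```python
import math


def profile(xi):
    """Hypothetical smooth self-similar profile f(xi)."""
    return 2.0 + math.tanh(xi)


def profile_prime(xi):
    """Exact derivative f'(xi) = sech^2(xi)."""
    return 1.0 / math.cosh(xi) ** 2


def rho(x, t):
    """Self-similar function rho(x, t) = f(x / t), t > 0."""
    return profile(x / t)


def d_dt(f, x, t, h=1e-5):
    """Central finite difference in t."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)


def d_dx(f, x, t, h=1e-5):
    """Central finite difference in x."""
    return (f(x + h, t) - f(x - h, t)) / (2.0 * h)


if __name__ == "__main__":
    x, t = 0.7, 1.3  # hypothetical sample point with t > 0
    xi = x / t
    print("t * rho_t          :", t * d_dt(rho, x, t))
    print("-xi * f'(xi)       :", -xi * profile_prime(xi))
    print("t * rho_x          :", t * d_dx(rho, x, t))
    print("f'(xi)             :", profile_prime(xi))
    print("scaling invariance :", rho(3.0 * x, 3.0 * t) - rho(x, t))
```

Multiplying the first equation of (1.1) by $t$ and applying these identities to $\rho$ and $\rho u$ gives the first equation of (2.1); the second follows in the same way.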
Definition 2.1. The generalized functions $P, U \in G(R)$ are said to be a distribution solution to (1.1), (1.2) if they satisfy
$$\begin{cases} -\xi P' + (PU)' \approx 0, \\ -\xi(PU)' + (PU^2)' \approx 0, \end{cases} \tag{2.3}$$
and the initial condition
$$(\rho(\xi), u(\xi)) \to \begin{cases} (\rho_1, u_1), & \xi \to -\infty, \\ (\rho_2, u_2), & \xi \to +\infty, \end{cases} \tag{2.4}$$
for some representatives $(\rho(\xi), u(\xi))$ of $P, U$.
3. Existence of Smooth Solutions to (1.6), (1.7)
In this section we shall show the existence of a smooth solution $(\rho_\mu, u_\mu)$ to (1.6), (1.7) on $(-\infty, \infty)$ for every fixed $\mu > 0$ with $\rho_\mu(\xi) > 0$. To do this, we consider the two-parameter boundary-value problem
$$\mu\rho'' = -\xi\rho' + \nu m', \tag{3.1a}$$
$$\mu m'' = -\xi m' + \nu\left(\frac{m^2}{\rho}\right)', \qquad -L < \xi < L, \tag{3.1b}$$
with
$$\rho(-L) = \rho_* + \nu(\rho_1 - \rho_*), \qquad \rho(L) = \rho_* + \nu(\rho_2 - \rho_*), \tag{3.2a}$$
$$m(-L) = \nu\rho_1 u_1 = \nu m_1, \qquad m(L) = \nu\rho_2 u_2 = \nu m_2, \tag{3.2b}$$
where the parameters satisfy $\nu \in [0, 1]$, $L \ge 1$, and $\rho_* = \min(\rho_1, \rho_2)$, $\rho_1 > 0$, $\rho_2 > 0$, $m(\xi) = \rho(\xi)u(\xi)$. The following theorem is a special case of Theorem 2.1 in [17] with $p(\rho) = 0$ (see pp. 1050–1052).
Theorem 3.1. Assume that there are positive constants $M$ and $\delta$ depending only on $u_1, u_2, \rho_1, \rho_2$ and $\mu$, but independent of $\nu$ and $L$, such that every solution $(\rho(\xi), m(\xi))$ of (3.1), (3.2) with $\rho(\xi) > 0$, corresponding to any $0 \le \nu \le 1$, $L \ge 1$, satisfies
$$\sup_{-L \le \xi \le L} \left(|m(\xi)| + \rho(\xi)\right) \le M, \qquad \inf_{-L \le \xi \le L} \rho(\xi) \ge \delta. \tag{3.3}$$
Then there exists a solution of (1.6), (1.7), denoted again by $(\rho(\xi), m(\xi))$, such that $\rho(\xi) > 0$ for $-\infty < \xi < \infty$.
Our next goal is to derive the a priori estimates (3.3) required to apply Theorem 3.1. We note that if $(\rho(\xi), m(\xi))$ is a solution of (3.1), (3.2) with $\rho > 0$, then $\rho(\xi)$ and $u(\xi) = m(\xi)/\rho(\xi)$ satisfy the following:
$$\mu\rho'' = -\xi\rho' + \nu(u\rho' + \rho u'), \tag{3.4a}$$
$$\mu u'' = \left(\nu u - \xi - 2\mu\frac{\rho'}{\rho}\right)u', \qquad -L < \xi < L, \tag{3.4b}$$
with boundary conditions
$$\rho(-L) = \rho_* + \nu(\rho_1 - \rho_*), \qquad \rho(L) = \rho_* + \nu(\rho_2 - \rho_*), \tag{3.5a}$$
$$u(-L) = \frac{m(-L)}{\rho(-L)} = \nu\frac{\rho_1 u_1}{\rho_* + \nu(\rho_1 - \rho_*)}, \qquad u(L) = \frac{m(L)}{\rho(L)} = \nu\frac{\rho_2 u_2}{\rho_* + \nu(\rho_2 - \rho_*)}. \tag{3.5b}$$
The following lemma is crucial in establishing (3.3).
Lemma 3.2. Let $(\rho(\xi), u(\xi))$ with $\rho(\xi) > 0$ be a nonconstant solution of (3.4), (3.5) in $(-L, L)$ for some $0 < \nu \le 1$ and $L \ge 1$. Then $u(\xi)$ is always a monotone function in $(-L, L)$ while $\rho(\xi)$ satisfies one of the following:
(i) $\rho(\xi)$ is monotone in $(-L, L)$,
(ii) $\rho(\xi)$ has only one maximum point in $(-L, L)$ if $u(\xi)$ is strictly decreasing in $(-L, L)$,
(iii) $\rho(\xi)$ has only one minimum point in $(-L, L)$ if $u(\xi)$ is strictly increasing in $(-L, L)$.
Proof. Let $(\rho(\xi), u(\xi))$ be a nonconstant solution of (3.4), (3.5) with $\rho(\xi) > 0$. By (3.4b), (3.5b), we know that
$$u'(\xi) = \lambda \exp\left(\int_{-L}^{\xi} \frac{1}{\mu} A(s)\,ds\right), \qquad -L < \xi < L,$$
where $A(\xi) = \nu u(\xi) - \xi - 2\mu\rho'(\xi)/\rho(\xi)$ and $\lambda$ is given by
$$u(L) - u(-L) = \lambda \int_{-L}^{L} \exp\left(\int_{-L}^{\tau} \frac{1}{\mu} A(s)\,ds\right) d\tau.$$
Therefore, $u(\xi)$ is always monotone in $(-L, L)$. Next we suppose $u(\xi)$ is strictly decreasing in $(-L, L)$ and $\rho(\xi)$ has a critical point $\sigma$ in $(-L, L)$. Then $\rho'(\sigma) = 0$. By (3.4a), we have
$$\mu\rho''(\sigma) = \nu\rho(\sigma)u'(\sigma) < 0$$
since $\rho(\sigma) > 0$ and $u'(\sigma) < 0$. Thus $\sigma$ is the maximum point of $\rho(\xi)$. Case (ii) is proven. Case (iii) can be treated similarly.
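From Lemma 3.2, $u(\xi)$ is uniformly bounded in $(-L, L)$ with respect to $\nu \in (0, 1]$, $L \ge 1$ and $\mu > 0$. It remains to estimate $\rho(\xi)$ for Cases (ii), (iii) in Lemma 3.2.
Lemma 3.3. For Cases (ii), (iii) in Lemma 3.2, there exist positive constants $M$ and $\delta$, independent of $\nu \in (0, 1]$ and $L \ge 1$, such that $\delta \le \rho(\xi) \le M$.
Proof. Motivated by [8], we first prove
$$\int_\alpha^\beta \rho(\xi)\,d\xi \le (\beta - \alpha)\bar\rho + N, \tag{3.6}$$
for every interval $(\alpha, \beta) \subset (-L, L)$, where $\bar\rho = \max_{0 \le \nu \le 1}\{\rho(-L), \rho(L)\} = \max\{\rho_1, \rho_2\}$ and $N = \bar\rho \max_{0 \le \nu \le 1} |u(-L) - u(L)|$.
In fact, we set $\theta_1 = \inf\{\xi \in (\alpha, \beta) : \rho(\xi) \ge \bar\rho\}$ if $\rho(\alpha) < \bar\rho$ (if this set is empty, (3.6) is automatically satisfied); on the other hand, we set $\theta_1 = \sup\{\xi \in (-L, \beta) : \rho(\xi) \le \bar\rho\}$ if $\rho(\alpha) \ge \bar\rho$. Similarly, we set $\theta_2 = \inf\{\xi \in (\beta, L] : \rho(\xi) \le \bar\rho\}$ if $\rho(\beta) \ge \bar\rho$, while we set $\theta_2 = \sup\{\xi \in (\alpha, \beta) : \rho(\xi) \ge \bar\rho\}$ if $\rho(\beta) < \bar\rho$. Since $\rho(\theta_1) = \rho(\theta_2) = \bar\rho$, we have
$$\int_\alpha^\beta (\rho(\xi) - \bar\rho)\,d\xi \le \int_{\theta_1}^{\theta_2} (\rho(\xi) - \bar\rho)\,d\xi = -\int_{\theta_1}^{\theta_2} \xi\rho'(\xi)\,d\xi. \tag{3.7}$$
Noting that $\rho'(\theta_1) \ge 0$ and $\rho'(\theta_2) \le 0$, integrating (3.4a) over $(\theta_1, \theta_2)$, we have
$$0 \ge \mu\rho'(\theta_2) - \mu\rho'(\theta_1) = -\int_{\theta_1}^{\theta_2} \xi\rho'(\xi)\,d\xi + \nu\bar\rho\,(u(\theta_2) - u(\theta_1)). \tag{3.8}$$
Therefore, (3.7), (3.8) give
$$\int_\alpha^\beta (\rho(\xi) - \bar\rho)\,d\xi \le \nu\bar\rho\,(u(\theta_1) - u(\theta_2)) \le \bar\rho\,|u(L) - u(-L)| \le N$$
since $u(\xi)$ is monotone in $(-L, L)$. Thus (3.6) is obtained.
Now we apply (3.6) to estimate $\rho(\xi)$ from above for Case (ii) in Lemma 3.2. By (3.6), it follows that
$$\rho_* \le \rho(\xi) \le \bar\rho + \frac{N}{|\sigma - \xi|}, \qquad \xi \in [-L, L] \setminus \{\sigma\}. \tag{3.9}$$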
Without loss of generality we assume that $\rho(\sigma) > \bar\rho$. We fix $\xi_0 \in [-L, \sigma)$ such that $\rho(\xi_0) = \bar\rho$. For any $\xi \in [\xi_0, \sigma)$ we let $\xi'$ be a point in $(\sigma, L]$ with the property $\rho(\xi') = \rho(\xi)$ (such a point exists since $\rho(L) \le \bar\rho$). Integrating (3.4a) over $(\xi, \xi')$ we obtain
$$\mu(\rho'(\xi') - \rho'(\xi)) = -\int_\xi^{\xi'} s\rho'(s)\,ds + \nu\rho(\xi)(u(\xi') - u(\xi)). \tag{3.10}$$
We note that $\rho'(\xi') \le 0$ and $-\int_\xi^{\xi'} s\rho'(s)\,ds = \int_\xi^{\xi'} (\rho(s) - \rho(\xi))\,ds \ge 0$. Therefore, (3.10) gives
$$\mu\rho'(\xi) \le \nu\rho(\xi)(u(\xi) - u(\xi')) \le \rho(\xi)\frac{N}{\bar\rho},$$
which yields
$$\rho(\sigma) \le \rho(\alpha)\exp\left(\frac{N}{\mu\bar\rho}(\sigma - \alpha)\right), \qquad \xi_0 \le \alpha \le \sigma. \tag{3.11}$$
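If $\sigma - \xi_0 \le 1$, we choose $\alpha = \xi_0$ in (3.11) to get $\rho(\sigma) \le \bar\rho\exp\left(\frac{N}{\mu\bar\rho}\right)$. On the other hand, if $\sigma - \xi_0 > 1$ we choose $\alpha = \sigma - 1$. From (3.11), (3.9), it follows that
$$\rho(\sigma) \le \rho(\sigma - 1)\exp\left(\frac{N}{\mu\bar\rho}\right) \le (\bar\rho + N)\exp\left(\frac{N}{\mu\bar\rho}\right).$$
Next we estimate $\rho(\xi)$ from below for Case (iii) in Lemma 3.2. We set $w(\xi) = -\rho(\xi)$. Then $(w(\xi), u(\xi))$ is the solution of (3.4) with the boundary data (3.5b) and
$$w(-L) = -\rho_* - \nu(\rho_1 - \rho_*), \qquad w(L) = -\rho_* - \nu(\rho_2 - \rho_*). \tag{3.5a$'$}$$
Similar to (3.9), (3.11), we obtain
$$w(\xi) \le \bar w + \frac{N}{|\tau - \xi|}, \qquad \xi \in [-L, L] \setminus \{\tau\},$$
$$w(\tau) \le w(\alpha)\exp\left(-\frac{N}{\mu\rho_*}(\tau - \alpha)\right), \qquad \xi_0 \le \alpha \le \tau,$$
where $\tau$ is the minimum point of $\rho(\xi)$ in $(-L, L)$ and $\xi_0$ is a point in $(-L, L)$ with $w(\xi_0) = \bar w$, $\bar w = \max_{0 \le \nu \le 1}\{w(-L), w(L)\} = -\rho_*$ and $N = \rho_* \max_{0 \le \nu \le 1} |u(-L) - u(L)|$. Therefore,
$$\rho(\xi) \ge \rho_* - \frac{N}{|\tau - \xi|}, \qquad \xi \in [-L, L] \setminus \{\tau\}, \tag{3.12}$$
$$\rho(\tau) \ge \rho(\alpha)\exp\left(-\frac{N}{\mu\rho_*}(\tau - \alpha)\right), \qquad \xi_0 \le \alpha \le \tau. \tag{3.13}$$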
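If $\tau - \xi_0 \le 2\frac{N}{\rho_*}$, (3.13) gives that $\rho(\tau) \ge \rho_*\exp\left(-\frac{2N^2}{\mu\rho_*^2}\right)$ for $\alpha = \xi_0$. If $\tau - \xi_0 > 2\frac{N}{\rho_*}$, let $\alpha = \tau - 2\frac{N}{\rho_*}$ and use (3.12), (3.13) to obtain $\rho(\tau) \ge \frac{1}{2}\rho_*\exp\left(-\frac{2N^2}{\mu\rho_*^2}\right)$. The proof is complete.
Now we have obtained the existence of a solution $(\rho_\mu(\xi), u_\mu(\xi))$ of (1.6), (1.7) on $(-\infty, \infty)$ for every fixed $\mu > 0$. Moreover, we have
Lemma 3.4. For every fixed $\mu > 0$, the same results as in Lemma 3.2 are valid for the solution $(\rho_\mu(\xi), u_\mu(\xi))$ of (1.6), (1.7). Furthermore,
$$\int_\alpha^\beta \rho_\mu(\xi)\,d\xi \le (\beta - \alpha)\bar\rho + N_1, \tag{3.14}$$
$$0 < \min\left\{\frac{\delta}{2}, \rho_*\right\} \le \rho_\mu(\xi) \le M + 1, \tag{3.15}$$
for every interval $(\alpha, \beta) \subset (-\infty, \infty)$, where $\bar\rho = \max\{\rho_1, \rho_2\}$ and $N_1 = \bar\rho\,|u_1 - u_2|$.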
Proof. The proof is similar to that of Lemma 3.2 and (3.6), and (3.15) can be easily obtained from the proof of Theorem 3.1 in [17] (pp. 1051–1052). We omit the details.
Similar to (3.9), we also have
$$\rho_* \le \rho_\mu(\xi) \le \bar\rho + \frac{N}{|\sigma_\mu - \xi|}, \qquad \xi \in (-\infty, \infty) \setminus \{\sigma_\mu\}, \tag{3.16}$$
when $\rho_\mu(\xi)$ has a maximum point $\sigma_\mu$ on $(-\infty, \infty)$ (by Lemma 3.4 this happens if and only if $u_1 > u_2$), where $N = \bar\rho\,(u_1 - u_2)$.
4. Existence of Distribution Solutions to (1.1), (1.2)
In this section we shall prove the existence of distribution solutions to (1.1), (1.2) for $\rho_1\rho_2 > 0$. We distinguish two cases: (1) $u_1 \le u_2$, (2) $u_1 > u_2$. Case (2) is much more complicated and is the matter of real interest in this paper.
We first consider Case (1). From Lemma 3.4, it is easily seen that $\{(\rho_\mu(\xi), u_\mu(\xi)) : 0 < \mu < 1\}$ is uniformly bounded in $\mu$ and of uniformly bounded variation. By Helly’s theorem $\{(\rho_\mu(\xi), u_\mu(\xi)) : 0 < \mu < 1\}$ possesses a subsequence which converges a.e. on $(-\infty, \infty)$ to some functions $\rho(\xi), u(\xi)$ of bounded variation, and $(\rho(\xi), u(\xi))$ provides a classical weak solution to (1.1), (1.2) (see Theorem 3.2 in [7] or Theorem 4.1 in [17]). It is a routine matter to verify that the generalized functions $P, U$ in $G(R)$ with $\rho(\xi), u(\xi)$ as macroscopic aspects, respectively, satisfy (2.3), (2.4). Thus, (1.1) with (1.2) admits a distribution solution in Colombeau’s sense for $u_1 \le u_2$ and $\rho_1\rho_2 > 0$.
Now we turn to Case (2). In this case, $\rho_\mu(\xi)$ has a maximum point $\sigma_\mu$ on $(-\infty, \infty)$, and $\rho_\mu(\xi)$ may tend to infinity as $\mu \to 0$. We call the condition $u_1 > u_2$ the entropy condition for the singular distribution solution of (1.1), (1.2). Let $\sigma_\mu \to \sigma$, $|\sigma| \le \infty$, as $\mu \to 0$ (pass to a further subsequence if necessary).
Lemma 4.1. If $|\sigma| = \infty$, then $\{(\rho_\mu(\xi), u_\mu(\xi)) : 0 < \mu < 1\}$ is uniformly bounded in $\mu$.
Proof. We first assume that $\sigma = \infty$. By (3.16) we get
$$\rho_* \le \rho_\mu(\xi) \le \bar\rho + N, \tag{4.1}$$
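for $\xi \in (-\infty, a]$ and $\mu$ small, where $a$ is any fixed real number. We take $a = 2$ and $\xi_0 \in [1, 2]$ such that $0 \le \rho_\mu'(\xi_0) = \rho_\mu(2) - \rho_\mu(1) \le \bar\rho + N - \rho_*$ by (4.1). Noting $u_2 \le u_\mu(\xi) \le u_1$ on $(-\infty, \infty)$, we take $\mu$ to be so small that
$$\frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)}\left[(\xi_0 - u_\mu(\xi_0))\rho_\mu(\xi_0) + \mu\rho_\mu'(\xi_0)\right] \le 1, \tag{4.2}$$
$$\frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)}\left[(\sigma_\mu - \xi_0)\bar\rho + N_1\right] \le 2\bar\rho + 1 \tag{4.3}$$
(this can be done since $\sigma_\mu \to \infty$ as $\mu \to 0$). Integrating (1.6a) over $(\xi_0, \sigma_\mu)$, one finds that
$$\begin{aligned}
\rho_\mu(\sigma_\mu) &= \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)}\left[(\xi_0 - u_\mu(\xi_0))\rho_\mu(\xi_0) + \mu\rho_\mu'(\xi_0) + \int_{\xi_0}^{\sigma_\mu} \rho_\mu(\xi)\,d\xi\right] \\
&\le \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)}\left[(\xi_0 - u_\mu(\xi_0))\rho_\mu(\xi_0) + \mu\rho_\mu'(\xi_0) + (\sigma_\mu - \xi_0)\bar\rho + N_1\right] \\
&\le 2 + 2\bar\rho
\end{aligned}$$
by virtue of (3.14), (4.2) and (4.3). The case $\sigma = -\infty$ can be treated similarly.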
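Now we turn to the case $|\sigma| < \infty$. Let $\xi_\alpha^\mu$ be the singularity point of (1.6), that is, $\xi_\alpha^\mu = u_\mu(\xi_\alpha^\mu)$, and set $\xi_\alpha = \lim_{\mu \to 0} u_\mu(\xi_\alpha^\mu)$. It is easily seen that $u_2 \le \xi_\alpha \le u_1$ since $u_2 \le u_\mu(\xi) \le u_1$ on $(-\infty, \infty)$.
Lemma 4.2. Let $\xi_\alpha$ be defined as above. Then $\{\rho_\mu(\xi) : 0 < \mu < 1\}$ is uniformly bounded in $\mu$ if $\sigma \ne \xi_\alpha$.
Proof. We suppose $\sigma > \xi_\alpha$. Integrating (1.6a) over $(\sigma_\mu, \sigma + 1)$, we have
$$\begin{aligned}
\mu\rho_\mu'(\sigma + 1) &= -\int_{\sigma_\mu}^{\sigma+1} \xi\rho_\mu'(\xi)\,d\xi + \rho_\mu(\sigma + 1)u_\mu(\sigma + 1) - \rho_\mu(\sigma_\mu)u_\mu(\sigma_\mu) \\
&= (\sigma_\mu - u_\mu(\sigma_\mu))\rho_\mu(\sigma_\mu) + (u_\mu(\sigma + 1) - \sigma - 1)\rho_\mu(\sigma + 1) + \int_{\sigma_\mu}^{\sigma+1} \rho_\mu(\xi)\,d\xi.
\end{aligned} \tag{4.4}$$
Notice that $\rho_\mu'(\sigma + 1) \le 0$ and
$$\sigma_\mu - u_\mu(\sigma_\mu) = \sigma_\mu - \xi_\alpha^\mu + (u_\mu(\xi_\alpha^\mu) - u_\mu(\sigma_\mu)) = (\sigma_\mu - \xi_\alpha^\mu)(1 - u_\mu'(\theta_\mu)) \ge \frac{1}{2}(\sigma - \xi_\alpha) > 0$$
for $\mu$ small, since $u_\mu'(\xi) \le 0$ on $(-\infty, \infty)$ and $\sigma_\mu - \xi_\alpha^\mu \to \sigma - \xi_\alpha$ as $\mu \to 0$. Here $\theta_\mu$ is between $\xi_\alpha^\mu$ and $\sigma_\mu$. Therefore, (4.4) gives
$$\begin{aligned}
\rho_\mu(\sigma_\mu) &\le \frac{1}{\sigma_\mu - u_\mu(\sigma_\mu)}\left[(\sigma + 1 - u_\mu(\sigma + 1))\rho_\mu(\sigma + 1) - \int_{\sigma_\mu}^{\sigma+1} \rho_\mu(\xi)\,d\xi\right] \\
&\le \frac{2}{\sigma - \xi_\alpha}(\sigma + 1 - u_2)(\bar\rho + 2N), \qquad \text{for } \mu \text{ small},
\end{aligned}$$
by (3.16) and $\rho_\mu(\xi) > 0$. Thus $\{\rho_\mu(\xi) : 0 < \mu < 1\}$ is uniformly bounded in $\mu$. The case $\sigma < \xi_\alpha$ can be treated similarly.
Given $\eta > 0$, from (3.16) we know that $\rho_\mu(\xi) \le \bar\rho + \frac{2N}{\eta}$, $\xi \in (-\infty, \xi_\alpha - \eta) \cup (\xi_\alpha + \eta, \infty)$, for $\mu$ small if $\sigma_\mu \to \sigma = \xi_\alpha$ as $\mu \to 0$. Combining this and Lemmas 4.1, 4.2 we easily get
Theorem 4.3. If $u_1 > u_2$, then $\{\rho_\mu(\xi) : 0 < \mu < 1\}$ is uniformly bounded and of uniformly bounded variation over the interval $(-\infty, \xi_\alpha - \eta) \cup (\xi_\alpha + \eta, \infty)$ for any given $\eta > 0$.
Applying Helly’s theorem and the diagonal principle, we deduce from Theorem 4.3 that there exist a subsequence of $\rho_\mu(\xi)$ (still denoted by the original one) and some function $\rho(\xi)$ such that
$$\rho_\mu(\xi) \to \rho(\xi), \qquad \xi \in (-\infty, \xi_\alpha) \cup (\xi_\alpha, \infty), \quad \mu \to 0. \tag{4.5}$$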
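We set
$$\lim_{\mu \to 0} u_\mu(\xi) = u(\xi), \qquad \xi \in (-\infty, \infty). \tag{4.6}$$
Theorem 4.4. Suppose $u_1 > u_2$ and let $(\rho(\xi), u(\xi))$ be given by (4.5), (4.6). Then for any $\eta > 0$
$$(\rho(\xi), u(\xi)) = \begin{cases} (\rho_1, u_1), & \xi \le \xi_\alpha - \eta, \\ (\rho_2, u_2), & \xi \ge \xi_\alpha + \eta. \end{cases} \tag{4.7}$$
Proof. By Theorem 4.3, it is easily seen that there exists a positive $M_1$ (independent of $\mu$) such that
$$0 < \rho_* \le \rho_\mu(\xi) \le M_1, \qquad \xi \in \left(-\infty, \xi_\alpha - \frac{\eta}{2}\right] \cup \left[\xi_\alpha + \frac{\eta}{2}, \infty\right), \tag{4.8}$$
where $\rho_* = \min(\rho_1, \rho_2)$ and $\mu$ is small. We set $\xi_0$ to be a point in $\left[\xi_\alpha + \frac{\eta}{2}, \xi_\alpha + \frac{3\eta}{4}\right]$ such that
$$u_\mu'(\xi_0) = \frac{4}{\eta}\left[u_\mu\left(\xi_\alpha + \frac{3\eta}{4}\right) - u_\mu\left(\xi_\alpha + \frac{\eta}{2}\right)\right],$$
which says that
$$-\frac{4}{\eta}(u_1 - u_2) \le u_\mu'(\xi_0) \le 0. \tag{4.9}$$
By (1.6a), (1.6b),
$$u_\mu'(\xi) = \frac{u_\mu'(\xi_0)\rho_\mu^2(\xi_0)}{\rho_\mu^2(\xi)}\exp\left(\int_{\xi_0}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds\right), \qquad \xi \ge \xi_0. \tag{4.10}$$
Observe that
$$u_\mu(s) - s = u_\mu(s) - u_\mu(\xi_\alpha^\mu) + (\xi_\alpha^\mu - s) = (s - \xi_\alpha^\mu)(u_\mu'(\theta_\mu) - 1) \le -\frac{1}{4}\eta, \tag{4.11}$$
since $s - \xi_\alpha^\mu \to s - \xi_\alpha \ge \xi_0 - \xi_\alpha \ge \frac{1}{2}\eta$ as $\mu \to 0$, and $u_\mu'(\theta_\mu) \le 0$. Here $\theta_\mu$ is between $s$ and $\xi_\alpha^\mu$. Combining (4.10) with (4.8), (4.9), (4.11), one easily gets that
$$|u_\mu'(\xi)| \le \frac{4}{\eta}(u_1 - u_2)\left(\frac{M_1}{\rho_*}\right)^2 \exp\left(-\frac{1}{4\mu}\eta(\xi - \xi_0)\right), \qquad \xi \ge \xi_0. \tag{4.12}$$
Therefore, by (4.12), for any $\xi \ge \xi_\alpha + \eta$, we deduce from $u_2 - u_\mu(\xi) = \int_\xi^\infty u_\mu'(s)\,ds$ that $\lim_{\mu \to 0} u_\mu(\xi) = u_2$ uniformly. A similar argument leads to $\lim_{\mu \to 0} u_\mu(\xi) = u_1$ for $\xi \le \xi_\alpha - \eta$.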
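Now we turn to discuss the case for $\rho(\xi)$. Let $\xi_1$ be a point in $\left[\xi_\alpha + \frac{1}{2}\eta, \xi_\alpha + \frac{3}{4}\eta\right]$ such that
$$\rho_\mu'(\xi_1) = \frac{4}{\eta}\left[\rho_\mu\left(\xi_\alpha + \frac{3}{4}\eta\right) - \rho_\mu\left(\xi_\alpha + \frac{1}{2}\eta\right)\right],$$
which combines with (4.8) to yield
$$|\rho_\mu'(\xi_1)| \le \frac{4}{\eta}(M_1 - \rho_*). \tag{4.13}$$
By (1.6a), (4.10),
$$\begin{aligned}
\rho_\mu'(\xi) &= \rho_\mu'(\xi_1)\exp\left(\int_{\xi_1}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds\right) + \frac{1}{\mu}\int_{\xi_1}^{\xi} \rho_\mu(\tau)u_\mu'(\tau)\exp\left(\int_\tau^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds\right) d\tau \\
&= \rho_\mu'(\xi_1)\exp\left(\int_{\xi_1}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds\right) + \frac{1}{\mu}u_\mu'(\xi_0)\rho_\mu^2(\xi_0)\exp\left(\int_{\xi_0}^{\xi} \frac{u_\mu(s) - s}{\mu}\,ds\right)\int_{\xi_1}^{\xi} \frac{1}{\rho_\mu(\tau)}\,d\tau.
\end{aligned} \tag{4.14}$$
From (4.13), (4.11), (4.9) and (4.8), it follows from (4.14) that
$$|\rho_\mu'(\xi)| \le \text{const}\left[\exp\left(-\frac{1}{4\mu}\eta(\xi - \xi_1)\right) + \frac{1}{\mu}(\xi - \xi_1)\exp\left(-\frac{1}{4\mu}\eta(\xi - \xi_0)\right)\right], \qquad \xi \ge \xi_\alpha + \eta. \tag{4.15}$$
Here the constant is independent of $\mu$. Accordingly, we have from (4.15) that $\lim_{\mu \to 0} \rho_\mu(\xi) = \rho_2$ for $\xi \ge \xi_\alpha + \eta$. In a similar way, $\lim_{\mu \to 0} \rho_\mu(\xi) = \rho_1$ for $\xi \le \xi_\alpha - \eta$. The proof is complete.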
Theorem 4.4 means that $\rho(\xi), u(\xi)$ share the same discontinuity point $\xi = \xi_\alpha$ on $(-\infty, \infty)$. However, the Rankine–Hugoniot condition no longer holds for $\xi = \xi_\alpha$ since $\rho_1\rho_2(u_1 - u_2) \ne 0$, and thus $(\rho(\xi), u(\xi))$ is not a classical weak solution to (1.1), (1.2). But we can get the existence of a distribution solution of (1.1), (1.2) in Colombeau’s sense. The following lemma is vital to our analysis.
Lemma 4.5. Let $u_1 > u_2$ and $(\rho_\mu(\xi), u_\mu(\xi))$ be the solution of (1.6), (1.7). Then $\rho_\mu(\xi)$ converges weakly star to the function $\rho(\xi)$ given by (4.7) plus a weighted Dirac function concentrated on $\xi = \xi_\alpha$, that is
$$\lim_{\mu \to 0} \int_{-\infty}^{\infty} \rho_\mu(\xi)\varphi(\xi)\,d\xi = \int_{-\infty}^{\infty} \rho(\xi)\varphi(\xi)\,d\xi + \lambda\langle\delta(\xi - \xi_\alpha), \varphi(\xi)\rangle, \tag{4.16}$$
for each $\varphi(\xi) \in C_0^\infty(R)$, where $\lambda = \rho_1 u_1 - \rho_2 u_2 + (\rho_2 - \rho_1)\xi_\alpha$ and $\delta(\xi)$ is the Dirac function.
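The weight $\lambda$ in (4.16) can be read as the mass that the two constant states fail to account for. The sketch below is an interpretation added for illustration, not a statement from the paper: it assumes that in the physical variables the limit density is $\rho_1$ for $x < \xi_\alpha t$, $\rho_2$ for $x > \xi_\alpha t$, plus a point mass $\lambda t$ located at $x = \xi_\alpha t$ (using $\delta(x/t - \xi_\alpha) = t\,\delta(x - \xi_\alpha t)$ for $t > 0$), and that the states at $x = \pm L$ are still $(\rho_1, u_1)$ and $(\rho_2, u_2)$, which requires $L > |\xi_\alpha| t$. Under these assumptions the mass in $[-L, L]$ computed from the profile matches the mass obtained from the boundary fluxes. All function names and sample values are hypothetical.

```python
def delta_weight(rho1, u1, rho2, u2, xi_alpha):
    """Weight of the Dirac term in (4.16): rho1 u1 - rho2 u2 + (rho2 - rho1) xi_alpha."""
    return rho1 * u1 - rho2 * u2 + (rho2 - rho1) * xi_alpha


def mass_from_profile(rho1, rho2, xi_alpha, lam, t, L):
    """Mass in [-L, L]: two constant states separated at x = xi_alpha t, plus point mass lam t."""
    front = xi_alpha * t
    return rho1 * (front + L) + rho2 * (L - front) + lam * t


def mass_from_fluxes(rho1, u1, rho2, u2, t, L):
    """Initial mass in [-L, L] plus inflow rho1 u1 t at x = -L minus outflow rho2 u2 t at x = L."""
    return (rho1 + rho2) * L + (rho1 * u1 - rho2 * u2) * t


if __name__ == "__main__":
    rho1, u1, rho2, u2 = 4.0, 3.0, 1.0, -1.0  # hypothetical data with u1 > u2
    xi_alpha = 0.8                             # any speed in [u2, u1]; the identity holds for each
    t, L = 2.0, 10.0                           # L > |xi_alpha| t
    lam = delta_weight(rho1, u1, rho2, u2, xi_alpha)
    print("delta weight      :", lam)
    print("mass from profile :", mass_from_profile(rho1, rho2, xi_alpha, lam, t, L))
    print("mass from fluxes  :", mass_from_fluxes(rho1, u1, rho2, u2, t, L))
```

The identity holds for every $\xi_\alpha$, so the formula for $\lambda$ in Lemma 4.5 by itself only expresses conservation of mass; it does not determine $\xi_\alpha$.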
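Proof. Inspired by [19] (also see [11]), we let $\xi_1 < \xi_\alpha < \xi_2$ and $\varphi(\xi) \in C_0^\infty(\xi_1, \xi_2)$ with the property that $\varphi(\xi) \equiv \varphi(\xi_\alpha)$ in a small neighbourhood of $\xi_\alpha$. By (1.6a),
$$\mu\int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\varphi''(\xi)\,d\xi = \int_{\xi_1}^{\xi_2} \left[\rho_\mu(\xi)(\xi\varphi'(\xi) + \varphi(\xi)) - \rho_\mu(\xi)u_\mu(\xi)\varphi'(\xi)\right] d\xi,$$
which combines with (3.14) to yield that
$$\lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\varphi(\xi)\,d\xi = \lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi. \tag{4.17}$$
For $\alpha_1, \alpha_2$ near $\xi_\alpha$, $\alpha_1 < \xi_\alpha < \alpha_2$, (4.7) gives
$$\begin{aligned}
\lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi
&= \lim_{\mu \to 0} \int_{\xi_1}^{\alpha_1} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi + \lim_{\mu \to 0} \int_{\alpha_2}^{\xi_2} \rho_\mu(\xi)(u_\mu(\xi) - \xi)\varphi'(\xi)\,d\xi \\
&= \int_{\xi_1}^{\alpha_1} \rho_1(u_1 - \xi)\varphi'(\xi)\,d\xi + \int_{\alpha_2}^{\xi_2} \rho_2(u_2 - \xi)\varphi'(\xi)\,d\xi \\
&= (\rho_1 u_1 - \rho_2 u_2 + \rho_2\alpha_2 - \rho_1\alpha_1)\varphi(\xi_\alpha) + \rho_1\int_{\xi_1}^{\alpha_1} \varphi(\xi)\,d\xi + \rho_2\int_{\alpha_2}^{\xi_2} \varphi(\xi)\,d\xi \\
&\to (\rho_1 u_1 - \rho_2 u_2 + (\rho_2 - \rho_1)\xi_\alpha)\varphi(\xi_\alpha) + \rho_1\int_{\xi_1}^{\xi_\alpha} \varphi(\xi)\,d\xi + \rho_2\int_{\xi_\alpha}^{\xi_2} \varphi(\xi)\,d\xi
\end{aligned}$$
as $\alpha_1 \to \xi_\alpha-$, $\alpha_2 \to \xi_\alpha+$. Therefore, it follows from (4.17) that
$$\lim_{\mu \to 0} \int_{\xi_1}^{\xi_2} \rho_\mu(\xi)\varphi(\xi)\,d\xi = (\rho_1 u_1 - \rho_2 u_2 + (\rho_2 - \rho_1)\xi_\alpha)\varphi(\xi_\alpha) + \rho_1\int_{\xi_1}^{\xi_\alpha} \varphi(\xi)\,d\xi + \rho_2\int_{\xi_\alpha}^{\xi_2} \varphi(\xi)\,d\xi. \tag{4.18}$$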
By the approximation process, (4.18) holds for every $\varphi(\xi) \in C_0^\infty(\xi_1, \xi_2)$. This completes the proof.
Now we define $\tilde\rho(\xi)$ on $(-\infty, \infty)$ as follows:
$$\tilde\rho(\xi) = \rho(\xi) + \lambda\delta(\xi - \xi_\alpha) = \rho_1 + (\rho_2 - \rho_1)H(\xi - \xi_\alpha) + \lambda\delta(\xi - \xi_\alpha), \tag{4.19}$$
where $\lambda = \rho_1 u_1 - \rho_2 u_2 + (\rho_2 - \rho_1)\xi_\alpha$ and $H(\xi)$ is the Heaviside function given by $H(\xi) = 1$ for $\xi > 0$ and $H(\xi) = 0$ for $\xi < 0$. By (4.7), we write
$$u(\xi) = u_1 + (u_2 - u_1)H(\xi - \xi_\alpha). \tag{4.20}$$
Now we have
Theorem 4.6. Suppose $u_1 > u_2$ and $\rho_1\rho_2 > 0$. Let $P$ and $U$ belong to $G(R)$ with $\tilde\rho(\xi)$ and $u(\xi)$ given by (4.19), (4.20) as their macroscopic aspects respectively. Then $(P, U)$ satisfies (2.3), (2.4), i.e., $(P, U)$ is a distribution solution to (1.1), (1.2). Moreover,
$$\xi_\alpha = \frac{\sqrt{\rho_1}\,u_1 + \sqrt{\rho_2}\,u_2}{\sqrt{\rho_1} + \sqrt{\rho_2}} \quad \text{and} \quad \lambda = \sqrt{\rho_1\rho_2}\,(u_1 - u_2).$$
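Proof. Let $\theta_i(\xi) \in A_0(R)$, i.e., $\theta_i(\xi) \in D(R)$ with $\int_R \theta_i(\xi)\,d\xi = 1$, $i = 1, 2$. We set
$$\begin{aligned}
R_1(\theta_1, \xi) &= (\tilde\rho * \theta_1)(\xi) = (\rho * \theta_1)(\xi) + \lambda\langle\delta(y - \xi_\alpha), \theta_1(y - \xi)\rangle \\
&= \rho_1 + (\rho_2 - \rho_1)\int_{\xi_\alpha - \xi}^{\infty} \theta_1(s)\,ds + \lambda\theta_1(\xi_\alpha - \xi), \\
R_2(\theta_2, \xi) &= (u * \theta_2)(\xi) = u_1 + (u_2 - u_1)\int_{\xi_\alpha - \xi}^{\infty} \theta_2(s)\,ds.
\end{aligned} \tag{4.21}$$
We want to show that for any $\psi(\xi) \in C_0^\infty(R)$,
$$\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \left[R_1(\theta_{1,\varepsilon}, \xi)(\xi\psi(\xi))' - R_1(\theta_{1,\varepsilon}, \xi)R_2(\theta_{2,\varepsilon}, \xi)\psi'(\xi)\right] d\xi = 0, \tag{4.22}$$
$$\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \left[R_1(\theta_{1,\varepsilon}, \xi)R_2(\theta_{2,\varepsilon}, \xi)(\xi\psi(\xi))' - R_1(\theta_{1,\varepsilon}, \xi)R_2^2(\theta_{2,\varepsilon}, \xi)\psi'(\xi)\right] d\xi = 0, \tag{4.23}$$
where $\theta_{i,\varepsilon}(\xi) = \frac{1}{\varepsilon}\theta_i\left(\frac{\xi}{\varepsilon}\right)$, $i = 1, 2$. Observing that
$$(\rho * \theta_{1,\varepsilon})(\xi) \to \rho(\xi), \quad R_2(\theta_{2,\varepsilon}, \xi) \to u(\xi), \quad R_2^2(\theta_{2,\varepsilon}, \xi) \to u^2(\xi) \quad \text{a.e. on } R,$$
we have
$$\begin{aligned}
\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} R_1(\theta_{1,\varepsilon}, \xi)(\xi\psi(\xi))'\,d\xi
&= \int_{-\infty}^{\infty} \rho(\xi)(\xi\psi(\xi))'\,d\xi + \lambda\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \frac{1}{\varepsilon}\theta_1\left(\frac{\xi_\alpha - \xi}{\varepsilon}\right)(\xi\psi(\xi))'\,d\xi \\
&= ((\rho_1 - \rho_2)\xi_\alpha + \lambda)\psi(\xi_\alpha) + \lambda\xi_\alpha\psi'(\xi_\alpha)
\end{aligned} \tag{4.24}$$
and
$$\begin{aligned}
\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} R_1(\theta_{1,\varepsilon}, \xi)R_2(\theta_{2,\varepsilon}, \xi)\psi'(\xi)\,d\xi
&= \int_{-\infty}^{\infty} u(\xi)\rho(\xi)\psi'(\xi)\,d\xi \\
&\quad + \lambda\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \frac{1}{\varepsilon}\theta_1\left(\frac{\xi_\alpha - \xi}{\varepsilon}\right)\psi'(\xi)\left(u_1 + (u_2 - u_1)\int_{\xi_\alpha - \xi}^{\infty} \frac{1}{\varepsilon}\theta_2\left(\frac{s}{\varepsilon}\right) ds\right) d\xi \\
&= (\rho_1 u_1 - \rho_2 u_2)\psi(\xi_\alpha) + \lambda(u_1 + (u_2 - u_1)A)\psi'(\xi_\alpha),
\end{aligned} \tag{4.25}$$
where $A = \int_{-\infty}^{\infty} \theta_1(y)\left(\int_y^{\infty} \theta_2(s)\,ds\right) dy$. Therefore, (4.22), (4.24), (4.25) imply that
$$(\lambda + (\rho_1 - \rho_2)\xi_\alpha - \rho_1 u_1 + \rho_2 u_2)\psi(\xi_\alpha) + \lambda(\xi_\alpha - u_1 - (u_2 - u_1)A)\psi'(\xi_\alpha) = 0,$$
which means
$$\lambda = (\rho_2 - \rho_1)\xi_\alpha + \rho_1 u_1 - \rho_2 u_2, \qquad \lambda(\xi_\alpha - u_1 - (u_2 - u_1)A) = 0, \tag{4.26}$$
since $\psi(\xi)$ is arbitrary. Similarly, (4.23), (4.21) reduce to
$$\begin{aligned}
&\xi_\alpha(\rho_1 u_1 - \rho_2 u_2) + \lambda(u_1 + (u_2 - u_1)A) + \rho_2 u_2^2 - \rho_1 u_1^2 = 0, \\
&\lambda\left[\xi_\alpha(u_1 + (u_2 - u_1)A) - \left(u_1^2 + 2u_1(u_2 - u_1)A + (u_2 - u_1)^2 B\right)\right] = 0,
\end{aligned} \tag{4.27}$$
where $B = \int_{-\infty}^{\infty} \theta_1(y)\left(\int_y^{\infty} \theta_2(s)\,ds\right)^2 dy$. Remembering that $u_2 \le \xi_\alpha \le u_1$, we solve (4.26), (4.27) to obtain
$$\xi_\alpha = \frac{\sqrt{\rho_1}\,u_1 + \sqrt{\rho_2}\,u_2}{\sqrt{\rho_1} + \sqrt{\rho_2}}, \qquad \lambda = \sqrt{\rho_1\rho_2}\,(u_1 - u_2), \tag{4.28}$$
$$A = \frac{\sqrt{\rho_2}}{\sqrt{\rho_1} + \sqrt{\rho_2}}, \qquad B = A^2. \tag{4.29}$$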
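The values in (4.28), (4.29) can be substituted back into the four relations they are meant to solve. The sketch below does this numerically for one set of data. It is only a consistency check, under the assumption that the relations (4.26), (4.27) are read with the grouping $\lambda = (\rho_2 - \rho_1)\xi_\alpha + \rho_1 u_1 - \rho_2 u_2$, $\lambda(\xi_\alpha - \bar u) = 0$, $\xi_\alpha(\rho_1 u_1 - \rho_2 u_2) + \lambda\bar u + \rho_2 u_2^2 - \rho_1 u_1^2 = 0$ and $\lambda[\xi_\alpha\bar u - (u_1^2 + 2u_1(u_2 - u_1)A + (u_2 - u_1)^2 B)] = 0$, where $\bar u = u_1 + (u_2 - u_1)A$ is shorthand introduced here. Function names and sample data are hypothetical.

```python
import math


def delta_shock_constants(rho1, u1, rho2, u2):
    """Values of xi_alpha, lambda, A, B as given in (4.28), (4.29)."""
    a, b = math.sqrt(rho1), math.sqrt(rho2)
    xi_alpha = (a * u1 + b * u2) / (a + b)
    lam = a * b * (u1 - u2)
    A = b / (a + b)
    B = A * A
    return xi_alpha, lam, A, B


def residuals(rho1, u1, rho2, u2):
    """Residuals of the four relations in (4.26), (4.27) under the grouping stated above."""
    xi, lam, A, B = delta_shock_constants(rho1, u1, rho2, u2)
    u_bar = u1 + (u2 - u1) * A  # shorthand, not a symbol from the paper
    r1 = lam - ((rho2 - rho1) * xi + rho1 * u1 - rho2 * u2)
    r2 = lam * (xi - u_bar)
    r3 = xi * (rho1 * u1 - rho2 * u2) + lam * u_bar + rho2 * u2 ** 2 - rho1 * u1 ** 2
    r4 = lam * (xi * u_bar - (u1 ** 2 + 2.0 * u1 * (u2 - u1) * A + (u2 - u1) ** 2 * B))
    return r1, r2, r3, r4


if __name__ == "__main__":
    data = (4.0, 3.0, 1.0, -1.0)  # hypothetical (rho1, u1, rho2, u2) with u1 > u2
    print("xi_alpha, lambda, A, B:", delta_shock_constants(*data))
    print("residuals:", residuals(*data))
```

For the sample data one finds $u_2 \le \xi_\alpha \le u_1$ and $\lambda > 0$, in line with the entropy condition $u_1 > u_2$.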
We remark here that (4.29) always holds for any $\rho_1\rho_2 > 0$ by appropriately choosing $\theta_1$ and $\theta_2$. Thus (2.3) holds for $\xi_\alpha$, $\lambda$, $A$ chosen above. And (2.4) is easily seen. This completes the proof.
5. Existence of Distribution Solutions to (1.1), (1.2) for $\rho_1\rho_2 = 0$
In this section we only consider the case $\rho_1 = 0$, $\rho_2 > 0$. The case $\rho_2 = 0$, $\rho_1 > 0$ can be treated similarly. Let $(\rho_\eta(x, t), u_\eta(x, t)) = (\rho_\eta(\xi), u_\eta(\xi))$ be the macroscopic aspect of the distribution solution $(P, U)$ to (1.1), (1.2) with initial data
$$(\rho_\eta(x, 0), u_\eta(x, 0)) = \begin{cases} (\eta, u_1), & x \le 0, \\ (\rho_2, u_2), & x > 0. \end{cases} \tag{5.1}$$
We set $(\rho(\xi), u(\xi)) = \lim_{\eta \to 0} (\rho_\eta(\xi), u_\eta(\xi))$ in $D'(R)$. Then $(\rho(\xi), u(\xi))$ is a classical weak solution to (1.1), (1.2) (of course $(P, U)$ is also a distribution solution to (1.1), (1.2) in Colombeau’s sense). As a matter of fact, when $u_1 \le u_2$, the smooth solution $(\rho_\mu^\eta(\xi), u_\mu^\eta(\xi))$ to (1.6), (5.1) is uniformly bounded with respect to both $\mu$ and $\eta$ and of uniformly bounded variation, and $(\rho(\xi), u(\xi))$ is a classical weak solution in $L^\infty(R \times R_+)$. On the other hand, when $u_1 > u_2$, the singular part $\lambda_\eta\delta(\xi - \xi_\alpha^\eta)$ of $\rho_\eta(\xi)$ vanishes as $\eta \to 0$ since its strength
$$\lambda_\eta = \sqrt{\rho_1\rho_2}\,(u_1 - u_2) = \sqrt{\eta\rho_2}\,(u_1 - u_2) \to 0 \quad \text{as } \eta \to 0$$
(cf. (4.19), (4.28)), and the weak limit $(\rho(\xi), u(\xi))$ of $(\rho_\eta(\xi), u_\eta(\xi))$ as $\eta \to 0$ lies in $L^\infty(R \times R_+)$. In a word, (1.1), (1.2) always has a classical weak solution when one initial datum is a vacuum state.
Acknowledgement. I am deeply grateful to Prof. Xiaqi Ding for his persistent encouragement. I also would like to express my thanks to Prof. M. Oberguggenberger for bringing his works to my attention.
Communicated by Ya. G. Sinai
Commun. Math. Phys. 194, 207 – 230 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Poincaré–Cartan Class and Deformation Quantization of Kähler Manifolds
Hideki Omori^1, Yoshiaki Maeda^2, Naoya Miyazaki^1, Akira Yoshioka^3
1 Department of Mathematics, Faculty of Science and Technology, Science University of Tokyo, Noda, Chiba, 278, Japan
2 Department of Mathematics, Faculty of Science and Technology, Keio University, Hiyoshi, Yokohama, 223, Japan
3 Department of Mathematics, Faculty of Engineering, Science University of Tokyo, Kagurazaka, Tokyo, 162, Japan
Received: 4 April 1997 / Accepted: 12 October 1997
Abstract: We introduce a complete invariant for Weyl manifolds, called a Poincaré–Cartan class. Applying the constructions of the Weyl manifold to complex manifolds via the Poincaré–Cartan class, we propose the notion of a noncommutative Kähler manifold. For a given Kähler manifold, the necessary and sufficient condition for a Weyl manifold to be a noncommutative Kähler manifold is given. In particular, there exists a noncommutative Kähler manifold for any Kähler manifold. We also construct the noncommutative version of the $S^1$-principal bundle over a quantizable Weyl manifold.
Introduction

The construction of a deformation quantization of symplectic manifolds has been extensively studied in recent works. The purpose of this paper is to present a cohomological invariant of Weyl manifolds which appeared in the construction of the star products on a symplectic manifold. As introduced by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer in [BFL], a deformation quantization, or more precisely a star-product on a symplectic manifold $M$, is an associative product $*$ on $C^\infty(M)[[\nu]]$, the space of formal power series in $\nu$ with coefficients in $C^\infty(M)$, such that:
(D1) $f * g = fg + \frac{\nu}{2}\{f, g\} + \cdots$ for $f, g \in C^\infty(M)$, where $\{\ ,\ \}$ stands for the Poisson bracket on $M$.
(D2) $1 * f = f = f * 1$, $\nu \in$ center.
(D3) Complex conjugation $f \to \bar f$ is an anti-automorphism of $(C^\infty(M)[[\nu]], *)$, where $\bar\nu = -\nu$.
By the localization theorem (cf. [OMY2, O, p.312]), we may always assume that the star-product $*$ has the locality property, i.e. $\operatorname{supp} f * g \subset \operatorname{supp} f \cap \operatorname{supp} g$ as $\mathbb{C}[[\nu]]$-valued functions.
One construction of star-products for symplectic manifolds was first shown by Vey [V] and Lichnerowicz [L] via a torsion free flat connection. Using different approaches, De Wilde-Lecomte [DL], Fedosov [F] and Omori-Maeda-Yoshioka [OMY1] have proved the existence of a star-product for an arbitrary symplectic manifold. De Wilde-Lecomte worked algebraically via careful cohomological arguments, while Fedosov and Omori-Maeda-Yoshioka used a geometric method on the Weyl bundle (cf. [W]). Fedosov’s crucial idea is to construct a flat connection on the sections of the Weyl bundle. [OMY1] built a noncommutative version of manifolds, called Weyl manifolds, from a given symplectic manifold. Thus, it is natural to ask how the constructions by [DL, F and OMY1] relate to each other. Deligne [D] studied the relationship between the construction of the star-product by De Wilde-Lecomte and by Fedosov and showed that these constructions are equivalent to each other. Recently, there has been interesting work on the equivalence of star products by Xu [X] and Bertelson-Cahen-Gutt [BCG].

In this paper, we first remark on the equivalence of the star-products constructed in [OMY1]. The Weyl manifold, by definition (cf. Sect. 3.1), is constructed by patching “noncommutative coordinates”, and constructions of the star product are built on that of Weyl manifolds. The quantum version of Darboux’s theorem (cf. [O], p.317) combined with the inverse Moyal product formula (1.2) easily gives that all star-products are obtained as the algebra of Weyl functions on a Weyl manifold (cf. remark after Theorem 3.2). Fedosov’s flat connection is the connection on a Weyl algebra bundle for which all Weyl functions are characterized as parallel sections.

We show in this paper that there is a bijective correspondence between the equivalence classes of Weyl manifolds and the second cohomology group $H^2(M, \nu^2\mathbb{C}[[\nu^2]])$ (Theorem 3.5). The correspondence is indeed given by a characteristic Čech 2-cohomology class (cf. Definition 3.4) called the Poincaré–Cartan class, which comes from a patching of “quantized Darboux coordinates” to make a noncommutative manifold. The Poincaré–Cartan class has been proposed previously by Karasev and Maslov in [KM] to be an invariant for their asymptotic quantization theory. It is remarked that its integration on a circle coincides with the original Poincaré–Cartan invariant (cf. [O]). On the other hand, a characteristic class was defined by Nest-Tsygan [NT] in terms of the curvature of the connection for the Weyl bundle, which distinguishes Fedosov star-products up to equivalence (cf. [F]). We conjecture that their characteristic class might coincide with the Poincaré–Cartan class for the Weyl manifold.

The main purpose of this paper is to apply the Poincaré–Cartan class to complex manifolds and to propose the notion of a noncommutative Kähler manifold. A Kähler manifold is a special type of symplectic manifold with the option that its coordinate transformations are not only symplectic but also holomorphic. For a given Kähler manifold $M$, we give a necessary and sufficient condition for a Weyl manifold over $M$ to be a noncommutative Kähler manifold in terms of its Poincaré–Cartan class (Theorem 4.6). We also show that there exists a noncommutative Kähler manifold for every Kähler manifold (Theorem 5.2).

The second subject of this paper, discussed in Sect. 6, is, as an application of the construction of star products via Weyl manifolds, a construction of a quantum $S^1$-bundle over a symplectic manifold with the quantization condition. In patching up the (noncommutative) local coordinates to obtain the Weyl manifold, we use a derivation which generates a noncommutative version of the circle action on the $S^1$-bundle of a symplectic manifold satisfying the quantization condition. Furthermore, if the base manifold $M$ has a Kähler structure, then the noncommutative version of the associated line bundle has the structure which one may call “holomorphic
line bundle”. This structure naturally gives the notion of holomorphic sections, and the space of all holomorphic sections is a maximal commutative subalgebra. It should be remarked that the construction of star-products in [OMY1] has the advantage of yielding such constructions naturally.

1. Weyl Functions

We first review briefly Weyl functions and Weyl diffeomorphisms on Weyl algebras. Here we start with a Weyl algebra over $\mathbb{R}$. A Weyl algebra $W$ is the algebra generated formally by $\nu, X_1, \cdots, X_n, Y_1, \cdots, Y_n$ over $\mathbb{R}$ with the fundamental relations
$$[\nu, X_i] = [\nu, Y_i] = 0,\qquad [X_i, X_j] = [Y_i, Y_j] = 0,\qquad [X_i, Y_j] = -\nu\delta_{ij}.$$
The multiplication of the algebra is denoted by $*$. Then, the Weyl algebra $W$ can be identified with the algebra $\mathbb{R}[[X, Y, \nu]]$ of formal power series with the following product, called the Moyal product:
$$a * b = a\exp\Big\{-\frac{\nu}{2}\,\overleftarrow{\partial_X}\,\dot\wedge\,\overrightarrow{\partial_Y}\Big\}\,b, \tag{1.1}$$
where $a\,\overleftarrow{\partial_X}\,\dot\wedge\,\overrightarrow{\partial_Y}\,b = \sum_{j=1}^n\{\partial_{X_j}a\cdot\partial_{Y_j}b - \partial_{Y_j}a\cdot\partial_{X_j}b\}$. We put the usual adic topology on $W$. The formula (1.1) can be inverted to recapture the commutative product as follows:
$$a\cdot b = a\exp\Big\{\frac{\nu}{2}\,\overleftarrow{\partial_X}\,\overset{*}{\wedge}\,\overrightarrow{\partial_Y}\Big\}\,b, \tag{1.2}$$
where $a\,\overleftarrow{\partial_X}\,\overset{*}{\wedge}\,\overrightarrow{\partial_Y}\,b = \sum_{j=1}^n\{\partial_{X_j}a*\partial_{Y_j}b - \partial_{Y_j}a*\partial_{X_j}b\}$. This can be viewed as a method of constructing a commutative product from the $*$-product. This idea appears in Sect. 3 to make a model space of a Weyl manifold, and in Sect. 6 to solve an equation given by using the $*$-product. We define an involutive anti-automorphism $a \to \bar a$ by setting $\bar X_i = X_i$, $\bar Y_j = Y_j$, $\bar\nu = -\nu$. Note that there are other systems of elements $(X_1', \cdots, X_n', Y_1', \cdots, Y_n')$ of $W$ with the same fundamental relations which topologically generate the same $W$. We call such $X_1', \cdots, X_n', Y_1', \cdots, Y_n'$ quantum canonical generators (QC-generators).

1.1. Weyl function. Let $U$ be an open set of $\mathbb{R}^{2n}$ with linear coordinates $(x_1, \cdots, x_n, y_1, \cdots, y_n)$, and $W_U$ the trivial algebra bundle $U \times W$. Let $\Gamma(W_U)$ be the space of all continuous sections of $W_U$ with respect to the compact open topology. $\Gamma(W_U)$ is an associative algebra over $\mathbb{R}$ under the pointwise $*$-product. Define the sections $\xi_i$, $\eta_i$ of $W_U$ by
$$\xi_i(p) = x_i(p) + X_i,\qquad \eta_i(p) = y_i(p) + Y_i,\qquad i = 1, \cdots, n. \tag{1.3}$$
Then, we have $[\xi_i, \eta_j] = -\nu\delta_{ij}$, $[\xi_i, \xi_j] = [\eta_i, \eta_j] = 0$. For any $\mathbb{R}[[\nu]]$-valued $C^\infty$ function $f$, we define a section $f^\sharp(\xi, \eta)$, called a Weyl function, by the formula
$$f^\sharp(\xi, \eta)(p) = \sum_{\lambda\mu}\frac{1}{\lambda!\mu!}\,\partial_x^\lambda\partial_y^\mu f(p)\,X^\lambda\cdot Y^\mu. \tag{1.4}$$
For $f \in C^\infty(U)[[\nu]]$ we call $f^\sharp$ the Weyl continuation of $f$. Obviously $\xi_i = x_i^\sharp$ and $\eta_i = y_i^\sharp$. We define $\mathcal{F}(W_U)$ to be the set of all Weyl functions. $\mathcal{F}(W_U)$ is a closed subalgebra of $\Gamma(W_U)$ (cf. [OMY1]).
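To make the Moyal product (1.1) and the Weyl continuation (1.4) concrete, here is a small worked case for $n = 1$. It is only an illustrative sketch: the choice $f = xy$ is ours, and the computation uses nothing beyond (1.1), (1.3) and (1.4).

```latex
% n = 1; only the first-order term of the exponential in (1.1) survives
% because X and Y are linear.
\begin{align*}
X * Y &= XY - \tfrac{\nu}{2}\big(\partial_X X\cdot\partial_Y Y - \partial_Y X\cdot\partial_X Y\big)
       = XY - \tfrac{\nu}{2}, \\
Y * X &= YX - \tfrac{\nu}{2}\big(\partial_X Y\cdot\partial_Y X - \partial_Y Y\cdot\partial_X X\big)
       = XY + \tfrac{\nu}{2}, \\
[X, Y] &= X * Y - Y * X = -\nu . \\[4pt]
% Weyl continuation of f = xy by (1.4):
f^\sharp(p) &= xy + y\,X + x\,Y + X\cdot Y \qquad (x = x(p),\ y = y(p)), \\
\xi * \eta &= (x + X) * (y + Y) = f^\sharp - \tfrac{\nu}{2}, \qquad
\eta * \xi = f^\sharp + \tfrac{\nu}{2}, \\
f^\sharp &= \tfrac12\big(\xi * \eta + \eta * \xi\big).
\end{align*}
```

The first three lines recover the fundamental relation $[X_i, Y_j] = -\nu\delta_{ij}$ from (1.1). The last lines show that, in this example, the Weyl continuation of $xy$ is the symmetrized $*$-product of $\xi$ and $\eta$.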
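It is easily seen that the $*$-product $f^\sharp * g^\sharp$ is given by the same formula (1.1), i.e.
$$(f^\sharp * g^\sharp)(p) = \Big(f\exp\Big\{-\frac{\nu}{2}\,\overleftarrow{\partial_x}\,\dot\wedge\,\overrightarrow{\partial_y}\Big\}\,g\Big)^\sharp(p) \qquad \text{(cf. [OMY1])}. \tag{1.5}$$
Moreover, the involutive anti-automorphism $a \mapsto \bar a$ extends naturally to $\Gamma(W_U)$, and $\overline{\mathcal{F}(W_U)} = \mathcal{F}(W_U)$. We have also $\overline{f^\sharp} = (\bar f)^\sharp$.

1.2. Integration on $W_U$. For a Weyl function $f^\sharp \in \mathcal{F}(W_U)$ with $f$ integrable on $U$, we define the integral of $f^\sharp$ by
$$\int_U f^\sharp = \int_U f\,dV \in \mathbb{R}[[\nu]],$$
where $dV = dx_1\cdots dx_n\,dy_1\cdots dy_n$ is the usual volume element on $U$. Integration by parts shows that if one of $f$, $g$ has compact support, then $\int_U f\,\{\overleftarrow{\partial_y}\,\dot\wedge\,\overrightarrow{\partial_x}\}^k\,g\,dV = 0$. Hence, we have
$$\int_U f^\sharp * g^\sharp = \int_U f\cdot g\,dV. \tag{1.6}$$
In particular, we have
$$\int_U f^\sharp * g^\sharp = \int_U g^\sharp * f^\sharp,\qquad \overline{\int_U f^\sharp} = \int_U \overline{f^\sharp}. \tag{1.7}$$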
1.3. The contact Weyl Lie algebra. We define a derivation $L_0$ as follows:
$$L_0\nu = 2\nu^2,\qquad L_0X_i = \nu X_i,\qquad L_0Y_i = \nu Y_i.$$
Together with a formal symbol $\tau$, we define a Lie algebra, called a contact Weyl Lie algebra, $\mathfrak{g} = \mathbb{R}\tau \oplus W$ with the bracket
$$[a\tau + f,\ b\tau + g] = aL_0g - bL_0f + [f, g]. \tag{1.8}$$
We easily see that $[\mathfrak{g}, \mathfrak{g}] \subset \nu * W$. We set also $\bar\tau = \tau$ to define an involutive anti-automorphism.

Definition 1.1. A linear mapping $A : \mathfrak{g} \to \mathfrak{g}$ is called a $\nu$-isomorphism if $A$ is a Lie algebra isomorphism satisfying (i) $A(\nu) = \nu$, (ii) $AW = W$ and (iii) the restriction $A|W$ is an algebra isomorphism. $D : \mathfrak{g} \to \mathfrak{g}$ is called a $\nu$-derivation if $D$ is a Lie algebra derivation satisfying (i) $D(\nu) = 0$, (ii) $DW \subset W$, and (iii) the restriction $D|W$ is an algebra derivation. (Cf. [OMY1], Definition 4.2.)

Although $L_0$ and hence $\tau$ depends on the choice of QC-generators, it is easy to see that the $\nu$-isomorphism class of $\mathfrak{g}$ is determined only by $W$.

Lemma 1.2. For every $\nu$-derivation $D : \mathfrak{g} \to \mathfrak{g}$, there are $f \in W$, $c \in \mathbb{R}$ such that $D$ is written in the form $D = \mathrm{ad}(\nu^{-1}*f) + c\,\mathrm{ad}(\log\nu)$. If $D(\tau) \in \nu * W$, then $f = \nu * g$, $g \in W$ in the above expression and $c$ is determined uniquely by $D$; $g$ is determined only up to a constant. If $D(\tau) \in \nu^2 * W$, then $D = \mathrm{ad}(\nu * g)$, where $g$ is determined uniquely by $D$.
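As a quick consistency check on Sect. 1.3, one can verify that the values prescribed for $L_0$ on the generators are compatible with the fundamental relation $[X_i, Y_j] = -\nu\delta_{ij}$, which is needed for $L_0$ to be a derivation of $W$. The computation below is a sketch using only the definition of $L_0$ and the centrality of $\nu$ in $W$.

```latex
\begin{align*}
L_0\big([X_i, Y_j]\big)
  &= [L_0 X_i,\, Y_j] + [X_i,\, L_0 Y_j]
   = [\nu X_i,\, Y_j] + [X_i,\, \nu Y_j] \\
  &= \nu\,[X_i, Y_j] + \nu\,[X_i, Y_j]
   = -2\nu^2\delta_{ij}, \\
L_0\big(-\nu\delta_{ij}\big)
  &= -\delta_{ij}\,L_0\nu = -2\nu^2\delta_{ij}.
\end{align*}
```

Both sides agree, which is why $L_0\nu$ carries the factor $2$. By (1.8), this is the same as the Jacobi identity in $\mathfrak{g}$ for the triple $(\tau, X_i, Y_j)$, since $[\tau, f] = L_0 f$ for $f \in W$.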
Here, we first remark that $\mathrm{ad}(\nu^{-1}*f)$ and $\mathrm{ad}(\log\nu)$ are defined by only symbolic use of $\nu^{-1}*f$ and $\log\nu$. Note that the above lemma is proved in [OMY1, Proposition 4.3] in the case of complex coefficients, but the proof works also for the real coefficients. Though the second statement was not given there, it can be seen easily from the proof.

Let $U$ be an open subset of $\mathbb{R}^{2n}$ with coordinates $x_1, \cdots, x_n, y_1, \cdots, y_n$, and $\Gamma(\mathfrak{g}_U)$ the space of all continuous sections of the trivial bundle $\mathfrak{g}_U = U \times \mathfrak{g}$ over $U$. We define a section by
$$\tilde\tau(p) = \tau - \sum_{i=1}^n\big(y_i(p)X_i - x_i(p)Y_i\big). \tag{1.9}$$
The sections $\xi_i$, $\eta_i$ given by (1.3) are contained in $\Gamma(\mathfrak{g}_U)$, and we have
$$[\tilde\tau, \xi_i] = \nu * \xi_i,\qquad [\tilde\tau, \eta_i] = \nu * \eta_i,\qquad [\xi_i, \eta_j] = -\nu\delta_{ij},\qquad [\tilde\tau, \nu] = 2\nu^2. \tag{1.10}$$
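The relations (1.10) can be checked directly from (1.3), (1.8) and (1.9). The following sketch evaluates the brackets at a fixed point $p$, writing $x_j = x_j(p)$, $y_j = y_j(p)$ for the (scalar) coordinate values, and uses $[Y_j, X_i] = \nu\delta_{ij}$.

```latex
\begin{align*}
[\tilde\tau, \xi_i]
  &= [\tau, X_i] - \sum_{j}\big(y_j\,[X_j, X_i] - x_j\,[Y_j, X_i]\big)
   = \nu X_i + \nu x_i = \nu * \xi_i, \\
[\tilde\tau, \eta_i]
  &= [\tau, Y_i] - \sum_{j}\big(y_j\,[X_j, Y_i] - x_j\,[Y_j, Y_i]\big)
   = \nu Y_i + \nu y_i = \nu * \eta_i, \\
[\tilde\tau, \nu]
  &= [\tau, \nu] = L_0\nu = 2\nu^2 .
\end{align*}
```

Here $[\tau, X_i] = L_0X_i$ and $[\tau, Y_i] = L_0Y_i$ by (1.8), scalars commute with everything, and $\nu$ is central in $W$, so the correction terms in (1.9) are exactly what shifts $X_i$, $Y_i$ to $\xi_i$, $\eta_i$.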
We give several remarks on the complexification. The notions of Weyl algebras and Weyl functions can be easily complexified by considering the tensor product with $\mathbb{C}$. We denote these by $W^{\mathbb{C}}$ and $\mathcal{F}(W_U)^{\mathbb{C}}$. $W$, $\mathcal{F}(W_U)$ are real subalgebras of $W^{\mathbb{C}}$, $\mathcal{F}(W_U)^{\mathbb{C}}$. The involutive anti-automorphism extends naturally to these complexified algebras by setting $\bar i = -i$. Here, one should take care that, for instance, $W$ is not the subspace $\{a \in W^{\mathbb{C}};\ \bar a = a\}$. To avoid the confusion that might occur, we define as follows: A linear mapping $\Phi : \mathcal{F}(W_U)^{\mathbb{C}} \to \mathcal{F}(W_U)^{\mathbb{C}}$ over $\mathbb{R}$ is said to have the hermitian property if $\overline{\Phi(f)} = \Phi(\bar f)$ holds for every $f$, and $\Phi$ is said to have the real-to-real property if $\Phi(\mathcal{F}(W_U)) \subset \mathcal{F}(W_U)$. Notions of $\nu$-isomorphisms and $\nu$-derivations of $\mathfrak{g}$ extend to the complexification $\mathfrak{g}^{\mathbb{C}} = \mathfrak{g} \otimes \mathbb{C}$. Lemma 1.2 and the following remark hold for the complexified case.

2. Patching Diffeomorphisms

2.1. Weyl diffeomorphisms and contact Weyl diffeomorphisms. Let $U$ and $V$ be open subsets of $\mathbb{R}^{2n}$ with coordinates $x_1, \cdots, x_n, y_1, \cdots, y_n$. Consider the trivial algebra bundles $W_U = U \times W$ and $W_V = V \times W$ over $U$ and $V$ respectively. For a bundle isomorphism $\Phi : W_U \to W_V$ inducing a diffeomorphism $\varphi : U \to V$ on base spaces, we define the pullback $\Phi^* : \Gamma(W_V) \to \Gamma(W_U)$ by $(\Phi^*S)(p) = \Phi^{-1}S(\varphi(p))$. A continuous algebra isomorphism $\Psi : \mathcal{F}(W_V) \to \mathcal{F}(W_U)$ such that $\Psi(\nu) = \nu$ will be called a pre-Weyl diffeomorphism. The following lemma is shown in [OMY1, Lemma 3.2]:

Lemma 2.1. For any pre-Weyl diffeomorphism $\Psi : \mathcal{F}(W_V) \to \mathcal{F}(W_U)$, there exists a unique bundle isomorphism $\Phi$ such that $\Psi = \Phi^*$. In particular, the induced diffeomorphism $\varphi : U \to V$ is a symplectic diffeomorphism with respect to the natural symplectic 2-form $\Omega = \sum dx_i \wedge dy_i$.

A pre-Weyl diffeomorphism $\Psi : \mathcal{F}(W_V) \to \mathcal{F}(W_U)$ is called a Weyl diffeomorphism if $\Psi$ has the hermitian property $\Psi(\bar f) = \overline{\Psi(f)}$. By Lemma 2.1 and the same proof as [OMY3] Proposition 2, we see easily that any pre-Weyl diffeomorphism $\Psi$ has the volume preserving property:
$$\int_U \Psi(f) = \int_V f. \tag{2.1}$$
Remark that the definition of the Weyl diffeomorphism is slightly stronger than that defined in [OMY1, Definition 3.4]. Though (2.1) is requested in the definition of the Weyl diffeomorphism in [OMY3], this holds automatically by the above observation. Note that the notion of $\nu$-derivations in Definition 1.1 extends naturally to $\Gamma(\mathfrak{g}_U)$. Remember that a $\nu$-derivation induces, by definition, an algebra derivation on $\Gamma(W_U)$.

Definition 2.2. A $\nu$-derivation $\Xi : \Gamma(\mathfrak{g}_U) \to \Gamma(\mathfrak{g}_U)$ is called a contact Weyl vector field if $\Xi(\nu) = 0$, $\Xi\mathcal{F}(W_U) \subset \mathcal{F}(W_U)$ and $\Xi(\tilde\tau) \in \mathcal{F}(W_U)$.

2.2. Contact Weyl diffeomorphisms. We call an isomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_V) \to \Gamma(\mathfrak{g}_U)$ a pointless contact diffeomorphism if $\Phi^{c*}$ is a Lie algebra isomorphism such that $\Phi^{c*}(\nu) = \nu$, $\Phi^{c*}(\tilde\tau) \in \tilde\tau + \mathcal{F}(W_U)$, $\Phi^{c*}\mathcal{F}(W_V) = \mathcal{F}(W_U)$, and the restriction $\Phi^{c*}|\mathcal{F}(W_V)$ is an algebra isomorphism. $\Phi^{c*}$ is called a contact Weyl diffeomorphism if the restriction to $\mathcal{F}(W_V)$ gives a Weyl diffeomorphism.

Proposition 2.3. Suppose $U$, $V$ are diffeomorphic to the open unit disk $D^{2n}$ of $\mathbb{R}^{2n}$. For every symplectic diffeomorphism $\varphi : U \to V$, there is a Weyl diffeomorphism $\Phi : W_U \to W_V$ inducing $\varphi$ between base spaces. Moreover, $\Phi^*$ extends to a contact Weyl diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_V) \to \Gamma(\mathfrak{g}_U)$ such that $\overline{\Phi^{c*}(f)} = \Phi^{c*}(\bar f)$.

Proposition 2.3 is given in [OMY1, Theorems 3.7 and 4.7] in the case of complex coefficients, but this holds also for the real case by the same proof. In the proof of [OMY1, Theorem 3.7], $\varphi$ is requested to be a symplectic diffeomorphism of U onto V. However this condition is easily removed by considering an exhausting family of closed subsets of $U$ and $V$. The $\Phi$ given by Proposition 2.3 is called a lift of $\varphi$. Note that the lift $\Phi$ of $\varphi$ is not unique in general.

Let $W_U^{\mathbb{C}}$ and $\mathcal{F}(W_U)^{\mathbb{C}}$ be the complexifications of $W_U$ and $\mathcal{F}(W_U)$ respectively. Notions of pre-Weyl diffeomorphisms and Weyl diffeomorphisms extend naturally to these complexified algebras. Let $\Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ be the complexification of $\Gamma(\mathfrak{g}_U)$. As in Lemma 1.2, the notions of contact Weyl vector fields, pointless contact diffeomorphisms, etc. extend naturally to $\Gamma(\mathfrak{g}_U)^{\mathbb{C}}$. By Lemma 1.2 and the remark mentioned in the last paragraph of Sect. 1, we have:

Lemma 2.4. For a contact Weyl vector field $\Xi : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ there exist $f \in \mathcal{F}(W_U)^{\mathbb{C}}$ and $c' \in \mathbb{C}$ such that $\Xi = \mathrm{ad}(\nu^{-1}*f) + c'\,\mathrm{ad}(\log\nu)$:
(1) If $\Xi(\tilde\tau) \in \nu * \mathcal{F}(W_U)^{\mathbb{C}}$, then $f \in \nu * \mathcal{F}(W_U)^{\mathbb{C}}$, $c' \in \mathbb{C}$, and $c'$ is uniquely determined. $\nu^{-1}*f$ is determined only up to a constant.
(2) If $\Xi(\tilde\tau) \in \nu^2 * \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$, then $c' = 0$, and $f$ can be taken in $\nu^2 * \mathcal{F}(W_U)^{\mathbb{C}}$, and such $f$ is unique.
(3) If $\Xi$ has the real-to-real property, $\Xi\Gamma(\mathfrak{g}_U) \subset \Gamma(\mathfrak{g}_U)$, then $c' \in \mathbb{R}$ and $f$ can be taken in $\mathcal{F}(W_U)$.
(4) If $\Xi$ has the real-to-real property and the hermitian property $\overline{\Xi(h)} = \Xi(\bar h)$, and $\Xi(\tilde\tau) \in \nu * \mathcal{F}(W_U)$, then $c' = 0$ and $f$ can be taken in $\nu^2 * \mathcal{F}(W_U)$, and hence such $f$ is unique.

Proof. (1) and (2) are easy to see by Lemma 1.2, and (3) is given by a similar proof. For (4), we see by (1)-(3) that there are $g \in \mathcal{F}(W_U)$ and $c' \in \mathbb{R}$ such that $\Xi = \mathrm{ad}(g) + c'\,\mathrm{ad}(\log\nu)$. By the hermitian property, we have $\overline{\Xi(\tilde\tau)} = \Xi(\tilde\tau)$. It follows that $[\tilde\tau, g + \bar g] = 4c'\nu$. Since $g + \bar g \in \sum_{k\ge 0}\nu^{2k}C^\infty(U)^\sharp$, we have $c' = 0$ and $g$ is written
in the form $g = \sum\nu^{2k+1}g_{2k+1}$. It follows that $f = \nu * g \in \nu^2 * \mathcal{F}(W_U)$. This yields $\Xi(\tilde\tau) \in \nu^2 * \mathcal{F}(W_U)$, and hence $f$ is determined uniquely by $\Xi$.

Considering the formal expansion in $\nu^k$, we see that if a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ induces the identity on the base space $U$, then $\Phi^{c*}$ is written in the form
$$\Phi^{c*} = \prod_{\infty}^{0}e^{\mathrm{ad}(\nu^kh_k^\sharp)}\,e^{c\,\mathrm{ad}(\nu^{-1})}\,e^{c'\,\mathrm{ad}(\log\nu)}, \tag{2.2}$$
using $h_k \in C^\infty(U)^{\mathbb{C}}$ for every integer $k \ge 0$ and $c, c' \in \mathbb{C}$, where the notation $\prod_{\infty}^{0}I_k$ means $\cdots I_k\cdots I_2I_1I_0$. $c'$ is determined uniquely by $\Phi^{c*}$ by virtue of Lemma 2.4, (1). By Lemma 2.4 we easily obtain the following:

Corollary 2.5. If a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ induces the identity on the base space $U$, then $h_k$, $c$, $c'$ in (2.2) satisfy the following:
(1) If $\Phi^{c*}(\tilde\tau) \in \tilde\tau + 2c + \nu^2 * \mathcal{F}(W_U)^{\mathbb{C}}$, then $c' = 0$, $h_0 = 0$, and $c$, $h_k$ ($k \ge 1$) are unique.
(2) If $\Phi^{c*}$ has the real-to-real property, $\Phi^{c*}\Gamma(\mathfrak{g}_U) = \Gamma(\mathfrak{g}_U)$, then $c, c' \in \mathbb{R}$ and $h_k \in \mathcal{F}(W_U)$.
(3) If $\Phi^{c*}$ has the real-to-real property and the hermitian property $\overline{\Phi^{c*}(f)} = \Phi^{c*}(\bar f)$, then $c' = 0$, $h_{2k} = 0$ for $k \ge 0$, and $c$ and $h_{2k+1}$ ($k \ge 0$) are unique.
(4) If $\Phi^{c*}$ induces the identity on $\mathcal{F}(W_U)^{\mathbb{C}}$, then there are $\tilde c \in \mathbb{C}[[\nu]]$, $c' \in \mathbb{C}$ such that $\Phi^{c*} = e^{\tilde c\,\mathrm{ad}(\nu^{-1}) + c'\,\mathrm{ad}(\log\nu)}$. Furthermore, if $\Phi^{c*}$ has the real-to-real property and the hermitian property, then $c' = 0$ and $\tilde c \in \mathbb{R}[[\nu^2]]$.

Proof. Here we give the proof of (4). Set $\Phi^{c*}(\tilde\tau) = \tilde\tau + g$, $g \in \mathcal{F}(W_U)^{\mathbb{C}}$. As $[\tilde\tau, \xi_i] = \nu\xi_i$, $[\tilde\tau, \eta_j] = \nu\eta_j$, and $\Phi^{c*}$ is an isomorphism, we have $[g, \xi_i] = [g, \eta_j] = 0$, hence $g \in \mathbb{C}[[\nu]]$. The second statement of (4) follows easily.

The next lemma is given in [OMY1]:

Lemma 2.6. For every pre-Weyl diffeomorphism $\Phi^* : \mathcal{F}(W_U)^{\mathbb{C}} \to \mathcal{F}(W_U)^{\mathbb{C}}$ there is a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ which extends $\Phi^*$.

Proof. By Lemma 2.1, $\Phi^*$ induces a symplectic diffeomorphism $\varphi$ on $U$. By Proposition 2.3, there is a Weyl diffeomorphism $\Psi^*$ which is a lift of $\varphi$. Let $\Psi^{c*}$ be a contact Weyl diffeomorphism which extends $\Psi^*$. Hence, $\Phi^*\Psi^{*-1}$ induces the identity on the base space. It follows by Corollary 2.5, (3) that $\Phi^* = \Psi^*e^{\mathrm{ad}(h)}$, $h \in \mathcal{F}(W_U)^{\mathbb{C}}$. Note that the $\mathrm{ad}(\nu^{-1})$ component is not used, since these act trivially on $\mathcal{F}(W_U)^{\mathbb{C}}$. Hence we define a pointless contact diffeomorphism $\Phi^{c*}$ by $\Psi^{c*}e^{\mathrm{ad}(h)}$.

A contact Weyl diffeomorphism $\Phi^{c*}$ has by definition the real-to-real property, and it may be assumed by Proposition 2.3 that $\Phi^{c*}$ has the hermitian property. We now remark the following:

Lemma 2.7. If a pointless contact diffeomorphism $\Phi^{c*} : \Gamma(\mathfrak{g}_U)^{\mathbb{C}} \to \Gamma(\mathfrak{g}_U)^{\mathbb{C}}$ has the real-to-real property and the hermitian property, then $\Phi^{c*}(\tilde\tau)$ is written in the form
$$\tilde\tau + g_0^\sharp + \nu^2 * g_2^\sharp + \cdots + \nu^{2k} * g_{2k}^\sharp + \cdots,\qquad g_{2k} \in C^\infty(U).$$
3. Poincaré–Cartan Classes

3.1. Weyl manifold. Let $W_M$ be a locally trivial algebra bundle with the fiber isomorphic to $W$. Then for an open covering $\{V_\alpha\}$ of $M$, there are local trivializations $\Phi_\alpha : W_{V_\alpha} \to W_{U_\alpha}$ associated to $V_\alpha$, where $W_{V_\alpha}$ is the restriction of $W_M$ and $W_{U_\alpha}$ is the trivial algebra bundle over $U_\alpha$ ($\subset \mathbb{R}^{2n}$). Denote by $\varphi_\alpha : V_\alpha \to U_\alpha$ the induced homeomorphism.

Definition 3.1. $W_M$ is called a (real) Weyl manifold if for each $V_\alpha$, $V_\beta$ such that $V_\alpha \cap V_\beta \neq \emptyset$, the patching transformation
$$\Phi_{\alpha\beta} = \Phi_\beta\Phi_\alpha^{-1} : U_{\alpha\beta} \times W \to U_{\beta\alpha} \times W, \tag{3.1}$$
where $U_{\alpha\beta} = \varphi_\alpha(V_\alpha \cap V_\beta)$, induces a Weyl diffeomorphism $\Phi^*_{\alpha\beta}$. Each $\Phi_\alpha : W_{V_\alpha} \to W_{U_\alpha}$ is called a local Weyl chart on $W_M$, and $W_{U_\alpha}$ is called the model algebra over $V_\alpha$. If the $\Phi^*_{\alpha\beta}$ are merely pre-Weyl diffeomorphisms, then $W_M$ is called a pre-Weyl manifold.

By Lemma 2.1, the base manifold $M$ of a pre-Weyl manifold $W_M$ has a $C^\infty$ symplectic structure. The following was the main theorem of [OMY1]:

Theorem 3.2. On every $C^\infty$ symplectic manifold $M$, there exists a Weyl manifold $W_M$. In particular, the system of trivial Weyl algebra bundles $\{W_{U_\alpha}\}$ can be patched together via Weyl diffeomorphisms.

The notions of Weyl functions, the involutive anti-automorphism $f \to \bar f$, and integration are naturally defined on a Weyl manifold $W_M$. Denote by $\mathcal{F}(W_M)$ the algebra of all Weyl functions on $W_M$. Two Weyl manifolds $W_M$, $W'_M$ are said to be isomorphic if there is an algebra isomorphism $\Psi : \mathcal{F}(W_M) \to \mathcal{F}(W'_M)$ inducing the identity on the base manifold $M$. Using the fact that $\mathcal{F}(W_M)$ is linearly isomorphic to $C^\infty(M)[[\nu]]$, we translate the algebra structure of $\mathcal{F}(W_M)$ over to $C^\infty(M)[[\nu]]$. In particular, $C^\infty(M)[[\nu]]$ is a noncommutative associative algebra which can be viewed as a deformation quantization of $(C^\infty(M), \cdot)$. Through this observation, we see also that complex conjugation $f \to \bar f$ is an involutive anti-automorphism of $(C^\infty(M)[[\nu]], *)$.

Suppose conversely that we have a deformation quantization $(C^\infty(M)[[\nu]], *)$ with an involutive anti-automorphism $f \to \bar f$ such that $\bar f = f$ for any $f \in C^\infty(M)$ and $\bar\nu = -\nu$. Let $\{V_\alpha\}$ be a locally finite simple open covering of $M$. Note that by the localization theorem [OMY2], the above $*$-product can be localized on $C^\infty(V_\alpha)[[\nu]]$. Here we need a definition:

Definition 3.3. For $f \in C^\infty(U)[[\nu]]$, the body part $b(f)$ of $f$ is an $\mathbb{R}$-valued $C^\infty$ function on $U$ such that $f - b(f) \in \nu C^\infty(U)[[\nu]]$. A system of elements $\xi_1, \cdots, \xi_{2n} \in C^\infty(U)[[\nu]]$ are called topological generators (T-generators) if the body parts $b(\xi_1), \cdots, b(\xi_{2n})$ are local coordinates on $U$.

By the same idea as Weyl continuation, every $f \in C^\infty(U)[[\nu]]$ can be viewed as a “function” of $\xi_1, \cdots, \xi_{2n}$ whenever $\xi_1, \cdots, \xi_{2n}$ are T-generators. By the quantum version of Darboux’s theorem [O], there are elements $\xi_1, \cdots, \xi_n, \eta_1, \cdots, \eta_n$ of $C^\infty(U)[[\nu]]$ such that
$$\bar\xi_i = \xi_i,\qquad \bar\eta_i = \eta_i,\qquad [\xi_i, \xi_j] = [\eta_i, \eta_j] = 0,\qquad [\xi_i, \eta_j] = -\nu\delta_{ij}, \tag{3.2}$$
which are T-generators. We call $\xi_1, \cdots, \xi_n, \eta_1, \cdots, \eta_n$ quantum canonical generators (QC-generators). As in (1.2), we use the inverse Moyal product formula to make a commutative product $\circ$. We identify $V_\alpha$ with $U_\alpha \subset \mathbb{R}^{2n}$. It is not hard to see that the mapping $f \to \bar f$ remains an involutive automorphism of $(C^\infty(V_\alpha)[[\nu]], \circ)$, and $f \circ g$ is decomposed for some $k \ge 1$ into
$$f \circ g = f\cdot g + \sum_{l\ge k}\nu^{2l}\varpi_{2l}(f, g),\qquad \varpi_{2l}(f, g) = \varpi_{2l}(g, f) \in C^\infty(V_\alpha).$$
Since the first component $\varpi_{2k}$ is a Hochschild 2-cocycle, and hence a Hochschild 2-coboundary by [OMY2, Theorem 2.2], it is easy to see that $(C^\infty(V_\alpha)[[\nu]], \circ)$ is isomorphic to $(C^\infty(V_\alpha)[[\nu]], \cdot)$ with the usual commutative product $\cdot$. Hence, there is an open subset $U_\alpha$ of $\mathbb{R}^{2n}$ and $(C^\infty(V_\alpha)[[\nu]], *)$ is isomorphic to $\mathcal{F}(W_{U_\alpha})$ through an isomorphism $\Psi^*_\alpha$ with the hermitian property. On each $V_\alpha \cap V_\beta$, the identity mapping of $(C^\infty(V_\alpha \cap V_\beta)[[\nu]], *)$ onto itself, regarded as $(C^\infty(V_\beta \cap V_\alpha)[[\nu]], *)$, induces a Weyl diffeomorphism $\Phi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}}) \to \mathcal{F}(W_{U_{\alpha\beta}})$. Hence, any deformation quantization $(C^\infty(M)[[\nu]], *)$ with an involutive anti-automorphism is obtained as an algebra of Weyl functions on a Weyl manifold.

3.2. Poincaré–Cartan classes. For a symplectic manifold $M$, there are Weyl manifolds over $M$ which are not isomorphic. We give the complete invariant for the isomorphism class of a Weyl manifold as an element of $H^2(M)[[\nu^2]]$. Let $\{V_\alpha\}$ be a covering of $M$. For each $\alpha$ let $\varphi_\alpha : V_\alpha \to U_\alpha \subset \mathbb{R}^{2n}$ be a symplectomorphic coordinate map. Consider the trivial Lie algebra bundle $\mathfrak{g}_{U_\alpha}$ on $U_\alpha$. Recall that Theorem 3.2 was proved in [OMY1] by constructing a contact Weyl diffeomorphism $\Phi^{c*}_{\alpha\beta} : \Gamma(\mathfrak{g}_{U_{\beta\alpha}}) \to \Gamma(\mathfrak{g}_{U_{\alpha\beta}})$ for $V_\alpha \cap V_\beta \neq \emptyset$, patching $\mathfrak{g}_{U_\alpha}$ and $\mathfrak{g}_{U_\beta}$ together. Let $\Phi^*_{\alpha\beta}$ be the restriction $\Phi^{c*}_{\alpha\beta}|\mathcal{F}(W_{U_{\beta\alpha}})$. It is clear that $\{\Phi^*_{\alpha\beta}\}$ gives a pre-Weyl manifold if and only if the $\Phi^{c*}_{\alpha\beta}$ satisfy
$$\Phi^{c*}_{\alpha\alpha} = 1 \quad\text{and}\quad \Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha} = e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1}) + c'_{\alpha\beta\gamma}\,\mathrm{ad}(\log\nu)}$$
on every $V_\alpha \cap V_\beta \cap V_\gamma \neq \emptyset$, where $c_{\alpha\beta\gamma} \in \mathbb{R}[[\nu]]$ and $c'_{\alpha\beta\gamma} \in \mathbb{R}$. The necessity is given by Corollary 2.5, (4), and the sufficiency is given since $e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1}) + c'_{\alpha\beta\gamma}\,\mathrm{ad}(\log\nu)}$ is the identity on each subalgebra $\mathcal{F}(W_{U_{\alpha\beta\gamma}})$, where $U_{\alpha\beta\gamma} = \varphi_\alpha(V_\alpha \cap V_\beta \cap V_\gamma)$. $\{\Phi^*_{\alpha\beta}\}$ gives a Weyl manifold if and only if, furthermore, $\Phi^{c*}_{\alpha\beta}$ has the hermitian property. If this is the case, we see that $c'_{\alpha\beta\gamma} = 0$ and $c_{\alpha\beta\gamma} \in \mathbb{R}[[\nu^2]]$. Under these situations the family $\{\mathcal{F}(W_{U_\alpha})\}$ of algebras is patched together to give an algebra sheaf on $M$.

It is easily seen that $\{c_{\alpha\beta\gamma}\}$ and $\{c'_{\alpha\beta\gamma}\}$ are Čech 2-cocycles on $M$ (cf. [O, p. 353, OMY1, Lemma 5.6]). In what follows Weyl manifolds are our main concern, but pre-Weyl manifolds are occasionally used in a supplementary role.

Definition 3.4. For a family $\{\Gamma(\mathfrak{g}_{U_\alpha})\}$ constructed on a Weyl manifold $W_M$, $\{c_{\alpha\beta\gamma}\}$ is called the Poincaré–Cartan 2-cocycle of $\{\mathfrak{g}_{U_\alpha}\}$. If we set
$$c_{\alpha\beta\gamma} = c^{(0)}_{\alpha\beta\gamma} + \nu^2c^{(2)}_{\alpha\beta\gamma} + \cdots, \tag{3.3}$$
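then $\{c^{(0)}_{\alpha\beta\gamma}\}$ is cohomologous to a Čech cocycle given by the symplectic 2-form on $M$ (cf. [O, p. 357, KM]). We call the cohomology class of $\{c_{\alpha\beta\gamma}\}$ the Poincaré–Cartan class of $\{W_{U_\alpha}\}$ and denote it by $c(W_M) = \sum_{k\ge 0} \nu^{2k} c^{(2k)}(W_M)$. In [OMY1], we constructed a Weyl manifold on $M$ such that $c_{\alpha\beta\gamma} = c^{(0)}_{\alpha\beta\gamma} \in \mathbb{R}$. The following is a characterization for the equivalence of Weyl manifolds via the Poincaré–Cartan class. This corresponds to the work by [D, NT and BCG] in which they characterized the equivalence of Fedosov star products:

Theorem 3.5. The equivalence of Weyl manifolds $W_M$ up to isomorphism is determined by the Poincaré–Cartan class. Moreover, for every $c = \sum_{k\ge 0} \nu^{2k} c^{(2k)} \in H^2(M)[[\nu^2]]$ such that $c^{(0)}$ is the class of the symplectic 2-form, there exists a Weyl manifold $W_M$ whose Poincaré–Cartan class $c(W_M)$ is $c$.

Proof. Let $\{c_{\alpha\beta\gamma}\}$, $\{c'_{\alpha\beta\gamma}\}$ be Poincaré–Cartan cocycles of $\{\mathfrak{g}_{U_\alpha}\}$ and $\{\mathfrak{g}'_{U_\alpha}\}$ respectively. Suppose the Poincaré–Cartan classes coincide. Then, there exists $b_{\alpha\beta} \in \mathbb{R}[[\nu^2]]$ on every $V_\alpha \cap V_\beta \neq \emptyset$ such that $b_{\alpha\beta} = -b_{\beta\alpha}$ and $c'_{\alpha\beta\gamma} - c_{\alpha\beta\gamma} = b_{\alpha\beta} + b_{\beta\gamma} + b_{\gamma\alpha}$.

Note that $b_{\alpha\beta}$ can be replaced by $b_{\alpha\beta} + c_{\alpha\beta}$ such that $c_{\alpha\beta} + c_{\beta\gamma} + c_{\gamma\alpha} = 0$. Since $e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$ is an automorphism, we can replace $\Phi^{c*}_{\alpha\beta}$ by $\acute\Phi^{c*}_{\alpha\beta} = \Phi^{c*}_{\alpha\beta}\, e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$. Since $e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$ is the identity on $\mathcal{F}(W_{U_{\alpha\beta}})$, this replacement does not change the isomorphism class of $\mathcal{F}(W_M)$, but it changes the Poincaré–Cartan cocycle from $\{c_{\alpha\beta\gamma}\}$ to $\{c'_{\alpha\beta\gamma}\}$. Hence we can assume that we have two families $\{\Phi^{c*}_{\alpha\beta}\}$ and $\{\acute\Phi^{c*}_{\alpha\beta}\}$ of patching transformations such that

$\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha} = \acute\Phi^{c*}_{\alpha\beta}\acute\Phi^{c*}_{\beta\gamma}\acute\Phi^{c*}_{\gamma\alpha} = e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$.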
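Since $\Phi^{c*}_{\alpha\beta}(\acute\Phi^{c*}_{\alpha\beta})^{-1}$ induces the identity on the base space, we see by Corollary 2.5, (3) that there is a unique $h_{\alpha\beta}$ such that $\Phi^{c*}_{\alpha\beta} = \acute\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}$. The $e^{c\,\mathrm{ad}(\nu^{-1})}$-terms can be removed by using the ambiguity of $b_{\alpha\beta}$ mentioned above. By a standard argument of Čech cohomology, we see that

$\acute\Phi^{c*}_{\alpha\beta} = e^{\mathrm{ad}(\nu h_\alpha)}\, \Phi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(\nu h_\beta)}$.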
(See also Lemmas 5.4 through 5.6.) This implies that the two families are isomorphic.

Conversely, suppose there is a Weyl diffeomorphism $\Psi : W'_M \to W_M$ which induces the identity on the base manifold. That is, $\Psi^*$ defines an algebra isomorphism of $\mathcal{F}(W_M)$ onto $\mathcal{F}(W'_M)$ with the hermitian property such that $\Psi^*(\nu) = \nu$. The isomorphism $\Psi^*$ is equivalently given by a family $\{\Psi^*_\alpha\}$ of isomorphisms:

$\Psi^*_\alpha : \mathcal{F}(W_{U_\alpha}) \to \mathcal{F}(W'_{U_\alpha})$, (3.4)

each of which induces the identity map on the base space $U_\alpha$ such that

$\Psi^*_\alpha \Phi^*_{\alpha\beta} (\Psi^*_\beta)^{-1} = \acute\Phi^*_{\alpha\beta}$. (3.5)

If we extend $\Psi^*_\alpha$ to a contact Weyl diffeomorphism $\Psi^{c*}_\alpha$, then the replacement of $\Phi^{c*}_{\alpha\beta}$ by $\Psi^{c*}_\alpha \Phi^{c*}_{\alpha\beta} (\Psi^{c*}_\beta)^{-1}$ makes no change of the Poincaré–Cartan cocycle.
Poincar´e–Cartan Class and Deformation Quantization of K¨ahler Manifolds
By (3.5) and Corollary 2.5, (4), we have $\acute\Phi^{c*}_{\alpha\beta} = \Psi^{c*}_\alpha \Phi^{c*}_{\alpha\beta} (\Psi^{c*}_\beta)^{-1} e^{b_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$. However, this type of replacement changes the Poincaré–Cartan cocycle within the same cohomology class.

Suppose $c = \sum_{k\ge 0} \nu^{2k} c^{(2k)} \in H^2(M)[[\nu^2]]$ is given. Then, we start with a Weyl manifold $W^{(0)}_M$ with a Poincaré–Cartan cocycle $\{c^{(0)}_{\alpha\beta\gamma}\}$, and changing patching Weyl diffeomorphisms we construct a Weyl manifold with Poincaré–Cartan class $c$. Let $\Phi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}}) \to \mathcal{F}(W_{U_{\alpha\beta}})$ be the patching Weyl diffeomorphism of $W^{(0)}_M$ and let $\Phi^{c*}_{\alpha\beta}$ be its extension as a contact Weyl diffeomorphism. Let $\{c^{(2k)}_{\alpha\beta\gamma}\}$ be a Čech cocycle involved in $c^{(2k)}$. Since the sheaf cohomology of $C^\infty$ functions $H^2(M, \mathcal{E}) = \{0\}$, there is $h^{(2)}_{\alpha\beta} \in C^\infty(U_{\alpha\beta})$ on each $U_{\alpha\beta}$ such that

$-c^{(2)}_{\alpha\beta\gamma} = h^{(2)}_{\alpha\beta} + \varphi^*_{\alpha\beta} h^{(2)}_{\beta\gamma} + \varphi^*_{\alpha\gamma} h^{(2)}_{\gamma\alpha}$. (3.6)
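For a function $r_{\alpha\beta} \in C^\infty(U_{\alpha\beta})[[\nu^2]]$, we set $\tilde h_{\alpha\beta} = (h^{(2)}_{\alpha\beta})^\sharp + \nu^2 r^\sharp_{\alpha\beta}$. If we use $\grave\Phi^{c*}_{\alpha\beta} = \Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu \tilde h_{\beta\alpha})}$ as patching diffeomorphisms for every $V_\alpha \cap V_\beta \neq \emptyset$, then we see

$\grave\Phi^{c*}_{\alpha\beta}\grave\Phi^{c*}_{\beta\gamma}\grave\Phi^{c*}_{\gamma\alpha} = \Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\beta}\tilde h_{\beta\alpha})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\gamma}\tilde h_{\gamma\beta})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\alpha}\tilde h_{\alpha\gamma})}$.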
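Here we used the general formula

$\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(h)} = e^{\mathrm{ad}(\Phi^*_{\alpha\beta} h)}\, \Phi^{c*}_{\alpha\beta}$ (3.7)

for $h \in \mathcal{F}(W_{U_{\alpha\beta}})$, proved by the uniqueness of solutions of ordinary differential equations. By (3.6), we have

$e^{\mathrm{ad}(\nu\Phi^*_{\alpha\beta}\tilde h_{\beta\alpha})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\gamma}\tilde h_{\gamma\beta})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\alpha}\tilde h_{\alpha\gamma})} = e^{\nu^2 c^{(2)}_{\alpha\beta\gamma}\,\mathrm{ad}\,\nu^{-1}} \mod \nu^4$. (3.8)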
By working on the terms $\nu^4, \nu^6, \cdots$, we can tune up $r_{\alpha\beta}$ in the same manner as in [OMY1] so that

$e^{\mathrm{ad}(\nu\Phi^*_{\alpha\beta}\tilde h_{\beta\alpha})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\gamma}\tilde h_{\gamma\beta})}\, e^{\mathrm{ad}(\nu\Phi^*_{\alpha\alpha}\tilde h_{\alpha\gamma})} = e^{\nu^2 c^{(2)}_{\alpha\beta\gamma}\,\mathrm{ad}\,\nu^{-1}}$.

It follows that $\{\grave\Phi^{c*}_{\alpha\beta}\}$ defines a Weyl manifold $\grave W_M$ with the Poincaré–Cartan class $c^{(0)} + \nu^2 c^{(2)}$. Replacing $\Phi^{c*}_{\alpha\beta}$ by $\grave\Phi^{c*}_{\alpha\beta}$ and repeating a similar argument as above, we can replace the condition mod $\nu^4$ in (3.8) by mod $\nu^6$. Repeating this procedure, we have a Weyl manifold $W_M$ such that $c(W_M) = c \in H^2(M)[[\nu^2]]$.

4. Noncommutative Kähler manifolds

In this section, we introduce a restricted notion of deformation quantization for Kähler manifolds, which we call a noncommutative Kähler manifold.

4.1. Paracoordinates. Let us first review the calculus of complex variables, which differs crucially from the real case. Let $U$ be an open subset of $\mathbb{R}^m$ with coordinate functions $x_1, \cdots, x_m$, and $C^\infty(U)^{\mathbb{C}}$ the space of all $\mathbb{C}$-valued $C^\infty$ functions on $U$. Consider a set $\{z_1, \cdots, z_m\}$ in $C^\infty(U)^{\mathbb{C}}$. Set

$\tilde U = \{\psi_z(p);\ p \in U\}$, $\psi_z(p) = (z_1(p), \cdots, z_m(p))$.

$z_1, \cdots, z_m$ are called paracoordinates of $U$, if the following conditions are satisfied:
(1) $\tilde U$ is a real $m$-dimensional $C^\infty$ submanifold of $\mathbb{C}^m$ such that the complex span of the tangent space $T_p\tilde U$ equals $\mathbb{C}^m$, i.e. $T_p\tilde U + \sqrt{-1}\,T_p\tilde U = \mathbb{C}^m$.
(2) $\psi_z : U \to \tilde U$ is a diffeomorphism.

$\psi_z$ is called the coordinate map of paracoordinates. The inverse matrix of $(\partial z_i/\partial x_j)$ is occasionally denoted by $(\partial x_i/\partial z_j)$, though we do not define the derivative $\partial x_i/\partial z_j$. Moreover, a $C^\infty$ mapping $f$ from $\tilde U$ into $\mathbb{C}$ is written in the form $f(z_1, \cdots, z_m)$, even though $z_1, \cdots, z_m$ are not necessarily independent complex variables on $\tilde U$. Since the $z_i$ are $C^\infty$ functions of $x_1, \cdots, x_m$, $f(\psi_z(x))$ is a $C^\infty$ function of $x_1, \cdots, x_m$. We define $\partial f/\partial z_i$ as follows:

$\dfrac{\partial f}{\partial z_i} = \sum_{k=1}^{m} \dfrac{\partial x_k}{\partial z_i}\,\dfrac{\partial f}{\partial x_k}$. (4.1)
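As a minimal illustration of (4.1), not taken from the paper, one may assume $m = 2$ and the paracoordinates $z_1 = x_1 + \sqrt{-1}\,x_2$, $z_2 = x_1 - \sqrt{-1}\,x_2$ on an open set $U \subset \mathbb{R}^2$. Under that assumption the inverse-matrix convention reproduces the familiar Wirtinger derivatives:

```latex
% Assumed example: z_1 = x_1 + i x_2, z_2 = x_1 - i x_2 (so z_2 is the conjugate of z_1).
\left(\frac{\partial z_i}{\partial x_j}\right)_{i,j}
  = \begin{pmatrix} 1 & \sqrt{-1} \\ 1 & -\sqrt{-1} \end{pmatrix},
\qquad
\left(\frac{\partial x_k}{\partial z_i}\right)_{k,i}
  = \begin{pmatrix} \tfrac12 & \tfrac12 \\[2pt]
                    -\tfrac{\sqrt{-1}}{2} & \tfrac{\sqrt{-1}}{2} \end{pmatrix},
\\[4pt]
% Applying (4.1) column by column:
\frac{\partial f}{\partial z_1}
  = \frac12\left(\frac{\partial f}{\partial x_1} - \sqrt{-1}\,\frac{\partial f}{\partial x_2}\right),
\qquad
\frac{\partial f}{\partial z_2}
  = \frac12\left(\frac{\partial f}{\partial x_1} + \sqrt{-1}\,\frac{\partial f}{\partial x_2}\right).
```

Here $\tilde U = \{(z, \bar z)\}$ is a real 2-dimensional submanifold of $\mathbb{C}^2$ whose tangent vectors $(1, 1)$ and $(\sqrt{-1}, -\sqrt{-1})$ span $\mathbb{C}^2$ over $\mathbb{C}$, so condition (1) holds in this example.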
Note that the right-hand side of (4.1) is computed as elements of $C^\infty(U)^{\mathbb{C}}$ or of $C^\infty(\tilde U)^{\mathbb{C}}$. Higher order derivatives are defined similarly. At each point $\tilde p \in \tilde U$, $\{dz_1, \cdots, dz_m\}$ forms a real basis in real coefficients of the cotangent space $T^*_{\tilde p}\tilde U$. Let $u_1, \cdots, u_m \in C^\infty(U)^{\mathbb{C}}$ be other paracoordinates with $\tilde U' = \{(u_1(p), \cdots, u_m(p));\ p \in U\}$. We set:

$\dfrac{\partial z_i}{\partial u_j} = \sum_{k=1}^{m} \dfrac{\partial z_i}{\partial x_k}\,\dfrac{\partial x_k}{\partial u_j}$ (4.2)

although $z_i$ is not a genuine function of $u_1, \cdots, u_m$. Note that $\psi_u\psi_z^{-1}$ is a $C^\infty$ diffeomorphism of $\tilde U$ onto $\tilde U'$. The chain rule also holds for these.

4.2. Kähler manifolds. Let $M$ be a smooth symplectic manifold with a symplectic form $\Omega$. For a function $f \in C^\infty(M)^{\mathbb{C}}$, we denote by $\bar f$ the complex conjugate of $f$. The Poisson bracket $\{\ ,\ \}$ defined by $\Omega$ satisfies $\overline{\{f, g\}} = \{\bar f, \bar g\}$ for any $f, g \in C^\infty(M)^{\mathbb{C}}$. Note that a Kähler manifold $M$ is characterized as a real symplectic manifold covered by open subsets $\{V_\alpha\}$ such that for each $V_\alpha$ there is a homeomorphism $\varphi_\alpha : V_\alpha \to U_\alpha \subset \mathbb{C}^n$ with the following properties:

(1) The coordinate functions $z^\alpha_1, \cdots, z^\alpha_n$ on $\mathbb{C}^n$ satisfy $\{z^\alpha_i, z^\alpha_j\} = \{\bar z^\alpha_i, \bar z^\alpha_j\} = 0$.
(2) The matrix $(\{z^\alpha_i, \bar z^\alpha_j\})$ is nondegenerate.
(3) On each intersection $V_\alpha \cap V_\beta$, setting $U_{\alpha\beta} = \varphi_\alpha(V_\alpha \cap V_\beta)$, the coordinate transformation $\varphi_{\alpha\beta} = \varphi_\beta\varphi_\alpha^{-1} : U_{\alpha\beta} \to U_{\beta\alpha}$ is holomorphic.

$z^\alpha_1, \cdots, z^\alpha_n$ are called Kähler coordinates (K-coordinates) on $V_\alpha$. We can assume that $\{V_\alpha\}$ is a locally finite, simple open Stein covering: i.e. a locally finite open covering such that $V_{\alpha_1} \cap \cdots \cap V_{\alpha_k}$ is a contractible Stein manifold.

Let $V$ be one of the $V_\alpha$. As is known in [KN], there exist K-coordinates $z_1, \cdots, z_n$ on $V$. We can assume that there is a Kähler potential $F$ on $V$, i.e. a real valued $C^\infty$ function $F(z, \bar z)$ such that the symplectic form $\Omega$ equals

$\Omega = \sum_{k,l} \Omega_{k,l}\, dz_k \wedge d\bar z_l$, $\quad \Omega_{k,l} = \dfrac{\sqrt{-1}}{2}\,\dfrac{\partial^2 F}{\partial z_k \partial \bar z_l}$. (4.3)

Setting $z^*_i = \frac{\sqrt{-1}}{2}\frac{\partial F}{\partial z_i}$, we have $\Omega = \sum dz_i \wedge dz^*_i$, $\{z_i, z^*_j\} = \delta_{ij}$ and $\{z^*_i, z^*_j\} = 0$. $z_1, \cdots, z_n, z^*_1, \cdots, z^*_n$ are called complex canonical coordinates (CC-coordinates).
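For orientation, here is a sketch of the simplest case, assuming $M = \mathbb{C}$ with the Kähler potential $F = z\bar z$ and writing $\Omega$ for the symplectic form; this example is an illustration and does not appear in the paper.

```latex
% Assumed example: M = C, Kaehler potential F(z, \bar z) = z \bar z, z = x + i y.
\Omega = \frac{\sqrt{-1}}{2}\,\frac{\partial^2 F}{\partial z\,\partial\bar z}\,dz\wedge d\bar z
       = \frac{\sqrt{-1}}{2}\,dz\wedge d\bar z = dx\wedge dy,
\\[4pt]
z^* = \frac{\sqrt{-1}}{2}\,\frac{\partial F}{\partial z} = \frac{\sqrt{-1}}{2}\,\bar z,
\qquad
dz\wedge dz^* = \frac{\sqrt{-1}}{2}\,dz\wedge d\bar z = \Omega,
\\[4pt]
% The normalization {z, z^*} = 1 is equivalent to the usual bracket of z and \bar z:
\{z, z^*\} = 1
\iff
\{z, \bar z\} = \frac{2}{\sqrt{-1}} = -2\sqrt{-1}.
```

In this case $z^*$ is just a constant multiple of $\bar z$; for a general potential $z^*_i$ also depends on $z$ through $\partial F/\partial z_i$.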
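Since $\frac{\partial^2 F}{\partial z_k \partial \bar z_l}$ is nondegenerate, the CC-coordinates are paracoordinates of $V$. Note that the canonical conjugate variables $z^*_1, \cdots, z^*_n$ are not uniquely defined, as $z^*_i$ can be replaced by $\tilde z^*_i = z^*_i + \frac{\partial h}{\partial z_i}$ for any holomorphic function $h$. In using CC-coordinates, the Poisson bracket becomes

$\{f, g\} = \sum_i \left( \dfrac{\partial f}{\partial z_i}\dfrac{\partial g}{\partial z^*_i} - \dfrac{\partial g}{\partial z_i}\dfrac{\partial f}{\partial z^*_i} \right)$. (4.4)

We consider the relationship between

$z_1, \cdots, z_n, \bar z_1, \cdots, \bar z_n$ and $z_1, \cdots, z_n, z^*_1, \cdots, z^*_n$ (4.5)

on $V$. Let $\varphi_z$, $\varphi'_z$ be the coordinate maps of

$(z_1(p), \cdots, z_n(p), \bar z_1(p), \cdots, \bar z_n(p))$, $\quad (z_1(p), \cdots, z_n(p), z^*_1(p), \cdots, z^*_n(p))$

respectively. We set $\tilde V = \{\varphi_z(p) \in \mathbb{C}^{2n};\ p \in V\}$, $\tilde V' = \{\varphi'_z(p) \in \mathbb{C}^{2n};\ p \in V\}$. Let $\{t_1, \cdots, t_{2n}\}$ be a real coordinate system of $V$. Note that $\varphi'_z\varphi_z^{-1}$ is a $C^\infty$ diffeomorphism of $\tilde V$ onto $\tilde V'$. Then, $\varphi'_z\varphi_z^{-1}$ can be written as $\varphi'_z\varphi_z^{-1}(z, \bar z) = (z, z^*)$. If we consider the inverse mapping of $\varphi'_z\varphi_z^{-1}$, $\bar z$ can be viewed as a "function" of $z, z^*$, which can be understood as a sort of implicit function theorem. Note that $\{dz_i, d\bar z_i\}$, $\{dz_i, dz^*_i\}$ are real bases of the cotangent spaces $T^*\tilde V$ and $T^*\tilde V'$ respectively, and we have the relations:

$dz^*_i = \dfrac{\sqrt{-1}}{2} \sum_k \left( \dfrac{\partial^2 F}{\partial z_i \partial \bar z_k}\, d\bar z_k + \dfrac{\partial^2 F}{\partial z_i \partial z_k}\, dz_k \right)$, $\quad 1 \le i \le n$. (4.6)

Since $\frac{\partial^2 F}{\partial z_i \partial \bar z_k}$ is non-singular, the above equality can be inverted to solve for $d\bar z_k$.

We consider the exterior algebra $\Lambda^*(V) = \sum \Lambda^{p,q}(V)$ consisting of elements of the form:

$\omega = \sum \omega_{i_1\cdots i_p, j_1\cdots j_q}(z, z^*)\, dz^*_{i_1} \wedge \cdots \wedge dz^*_{i_p} \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_q}$.

Define the partial exterior derivatives $\partial\omega$, $\partial^*\omega$ by:

$\partial\omega = \sum -\{z^*_{j_0}, \omega_{i_1\cdots i_p, j_1\cdots j_q}\}\, dz_{j_0} \wedge dz^*_{i_1} \wedge \cdots \wedge dz^*_{i_p} \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_q}$, (4.7)
$\partial^*\omega = \sum \{z_{i_0}, \omega_{i_1\cdots i_p, j_1\cdots j_q}\}\, dz^*_{i_0} \wedge dz^*_{i_1} \wedge \cdots \wedge dz^*_{i_p} \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_q}$.

Thus,

$\partial^* z^*_i = \sum_k (\mathrm{ad}(z_k) z^*_i)\, dz^*_k = dz^*_i$, $\quad \partial z_i = \sum_k -(\mathrm{ad}(z^*_k) z_i)\, dz_k = dz_i$.

Hence we may set

$\{z_i, \cdot\} = \dfrac{\partial}{\partial z^*_i}$, $\quad -\{z^*_i, \cdot\} = \dfrac{\partial}{\partial z_i}$.

Using the Jacobi identity for the Poisson bracket, we have

$(\partial^*)^2 = \partial^2 = 0$, $\quad \partial\partial^* + \partial^*\partial = 0$.

We set $d = \partial + \partial^*$. The following is a slight modification of the Poincaré lemma:

Lemma 4.1. Let $V$ be an open contractible subset of $M$ with CC-coordinates $(z_1, \cdots, z_n, z^*_1, \cdots, z^*_n)$. If $d\omega = 0$ for $\omega \in \Lambda^{p,q}(V)$, then there exist $\theta_1 \in \Lambda^{p-1,q}(V)$ and $\theta_2 \in \Lambda^{p,q-1}(V)$ such that $\omega = \partial^*\theta_1 + \partial\theta_2$.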
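By Lemma 4.1, we have

Lemma 4.2. On a Kähler manifold $M$, every holomorphic coordinate transformation $\varphi_{\alpha\beta} : U_{\alpha\beta} \to U_{\beta\alpha}$ induces a Poisson algebra isomorphism of the form:

$\varphi^*_{\alpha\beta}(z^\beta_i) = \varphi^i_{\alpha\beta}(z^\alpha)$, $\quad \varphi^*_{\alpha\beta}(z^{*\beta}_i) = \sum_k ((d\varphi_{\alpha\beta})^{-1})^k_i \cdot \left( z^{*\alpha}_k + \dfrac{\partial g_{\alpha\beta}}{\partial z^\alpha_k} \right)$, (4.8)

where $g_{\alpha\beta}$ is a holomorphic function.

Proof. Let $z_1, \cdots, z_n$ and $w_1, \cdots, w_n$ be K-coordinates on $U_{\alpha\beta}$ and $U_{\beta\alpha}$ respectively. Let $z_1, \cdots, z_n, z^*_1, \cdots, z^*_n$ and $w_1, \cdots, w_n, w^*_1, \cdots, w^*_n$ be the associated CC-coordinates respectively. Since $w_i = w_i(z)$ is a holomorphic function of $z_1, \cdots, z_n$, we have also $\bar w_i = \bar w_i(\bar z)$. Since $w^*_i = w^*_i(z, \bar z)$ and $\bar z_i = \bar z_i(z, z^*)$, we see that $w_i = w_i(z)$, $w^*_i = w^*_i(z, z^*)$. The Poisson isomorphism which is induced by the coordinate transformation is written as $\varphi^*_{\alpha\beta}(w_i) = w_i(z)$, $\varphi^*_{\alpha\beta}(w^*_i) = w^*_i(z, z^*)$. We define another Poisson isomorphism $\tilde\varphi^*_{\alpha\beta}$ by setting

$\tilde\varphi^*_{\alpha\beta}(w_i) = w_i(z)$, $\quad \tilde\varphi^*_{\alpha\beta}(\tilde w^*_i) = \sum_k \dfrac{\partial z_k}{\partial w_i}\, z^*_k$

using the correspondence similar to transition functions of the cotangent bundle. Since both $(w_1, \cdots, w_n, w^*_1, \cdots, w^*_n)$ and $(w_1, \cdots, w_n, \tilde w^*_1, \cdots, \tilde w^*_n)$ are CC-coordinates, we have $\{w_i, w^*_j - \tilde w^*_j\} = 0$. It follows that $g_j = w^*_j - \tilde w^*_j$ is holomorphic. By $\{w^*_i, w^*_j\} = \{\tilde w^*_i, \tilde w^*_j\} = 0$, we have $\{\tilde w^*_i, g_j\} - \{\tilde w^*_j, g_i\} = 0$, which implies $d(\sum g_i(w)\, dw_i) = 0$. By Lemma 4.1, we have $g_i = \frac{\partial g}{\partial w_i}$. Put $g = g_{\alpha\beta}(z)$. Since $\varphi_{\alpha\beta}$ is a holomorphic diffeomorphism, we have

$w^*_i = \tilde w^*_i + \dfrac{\partial g}{\partial w_i} = \sum_k \dfrac{\partial z_k}{\partial w_i} \left( z^{*\alpha}_k + \dfrac{\partial g_{\alpha\beta}}{\partial z^\alpha_k} \right)$.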
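A one-variable sanity check of this construction may help. The sketch below assumes $n = 1$, a holomorphic change of coordinate $w = w(z)$ with $w'(z) \neq 0$, and the bracket (4.4); the explicit computation is an illustration, not part of the paper.

```latex
% Assumed setting: n = 1, w = w(z) holomorphic with w'(z) \neq 0.
\tilde w^* = \frac{\partial z}{\partial w}\,z^* = \frac{z^*}{w'(z)},
\\[4pt]
\{w, \tilde w^*\}
  = \frac{\partial w}{\partial z}\,\frac{\partial \tilde w^*}{\partial z^*}
  - \frac{\partial \tilde w^*}{\partial z}\,\frac{\partial w}{\partial z^*}
  = w'(z)\cdot\frac{1}{w'(z)} - 0 = 1,
\\[4pt]
% Adding a holomorphic term does not change the bracket with w,
% because functions of z alone Poisson-commute with w by (4.4):
w^* = \tilde w^* + \frac{\partial g}{\partial w}
\;\Longrightarrow\;
\{w, w^*\} = 1 + \Bigl\{w, \frac{\partial g}{\partial w}\Bigr\} = 1.
```

So both $(w, \tilde w^*)$ and $(w, w^*)$ are canonical pairs, which is why their difference can only be a holomorphic function.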
It is obvious that $\varphi^*_{\alpha\alpha} = 1$, $g_{\alpha\alpha} = \mathrm{const}$, and on every $V_\alpha \cap V_\beta \cap V_\gamma \neq \emptyset$, we see that

$\varphi^*_{\alpha\beta}\varphi^*_{\beta\gamma}\varphi^*_{\gamma\alpha} = 1$, $\quad \varphi^*_{\alpha\beta} g_{\beta\alpha} + \varphi^*_{\alpha\gamma} g_{\gamma\beta} + g_{\alpha\gamma} = \mathrm{const}$. (4.9)

4.3. Noncommutative Kähler manifold. Let $M$ be a Kähler $n$-manifold. In the following we give a noncommutative version of K-coordinates. Viewing $M$ as a real symplectic manifold, we construct a real Weyl manifold $W_M$ as a locally trivial Weyl algebra bundle over $M$ and a noncommutative algebra $\mathcal{F}(W_M)$ of the Weyl functions of $W_M$. We now consider the complexifications $W^{\mathbb{C}}_M$ and $\mathcal{F}(W_M)^{\mathbb{C}}$. The complexification $\mathcal{F}(W_M)^{\mathbb{C}}$ is viewed as a subalgebra of the sections of the complex Weyl algebra bundle $W^{\mathbb{C}}_M$. Let $U$ be a contractible open subset of $\mathbb{R}^{2n}$.

Definition 4.3. (cf. Definition 3.3) For $f \in \mathcal{F}(W_U)^{\mathbb{C}}$, the body part $b(f)$ of $f$ is a $C^\infty$-function on $U$ such that $f - b(f)^\sharp \in \nu\Gamma(W_U)^{\mathbb{C}}$. A system of elements $\xi_1, \cdots, \xi_{2n} \in \mathcal{F}(W_U)^{\mathbb{C}}$ are called topological complex generators (TC-generators), if the body parts $b(\xi_1), \cdots, b(\xi_{2n})$ are paracoordinates on $U$.
If $\xi_1, \cdots, \xi_{2n}$ are TC-generators, then these elements together with $\nu$ generate a dense subalgebra of $\mathcal{F}(W_U)^{\mathbb{C}}$. On a local coordinate neighborhood $U$, TC-generators $\zeta_1, \cdots, \zeta_n, \bar\zeta_1, \cdots, \bar\zeta_n \in \mathcal{F}(W_U)^{\mathbb{C}}$ are called quantum Kähler coordinates (QK-coordinates), if $[\zeta_i, \zeta_j] = [\bar\zeta_i, \bar\zeta_j] = 0$, and the body part of the matrix $(-\frac{1}{\nu}[\zeta_i, \bar\zeta_j])$ is non-degenerate. The following is easy to see:

Proposition 4.4. Let $U \subset \mathbb{C}^n$ be a domain which is a Stein manifold. Suppose $\zeta_1, \cdots, \zeta_n \in \mathcal{F}(W_U)^{\mathbb{C}}$ satisfy $[\zeta_i, \zeta_j] = 0$. Then, for any holomorphic function $f(t_1, \cdots, t_n)$ on a domain $U$, $f(\zeta_1, \cdots, \zeta_n)$ can be defined by using a polynomial approximation, to be an element of $\mathcal{F}(W_U)^{\mathbb{C}}$.

Definition 4.5. A complexified pre-Weyl manifold $W^{\mathbb{C}}_M$ is called a noncommutative Kähler manifold, if there is an open covering $\{V_\alpha\}$ with QK-coordinates $z^\alpha_1, \cdots, z^\alpha_n, \bar z^\alpha_1, \cdots, \bar z^\alpha_n$ of each $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$, the model algebra over $V_\alpha$, satisfying the following: On every $U_{\alpha\beta} = \varphi_\alpha(V_\alpha \cap V_\beta)$, two systems of the generators are related through a pre-Weyl diffeomorphism $\Phi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}})^{\mathbb{C}} \to \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ such that there is a holomorphic mapping $\varphi_{\alpha\beta} = (\varphi^1_{\alpha\beta}, \cdots, \varphi^n_{\alpha\beta})$ of $U_{\alpha\beta}$ onto $U_{\beta\alpha}$ with

$\Phi^*_{\alpha\beta}(z^\beta_i) = \varphi^i_{\alpha\beta}(z^\alpha)$. (4.10)

$\varphi_{\alpha\beta}$ is called a holomorphic coordinate change. By the above definition, it is easily seen that the base manifold $M$ of a noncommutative Kähler manifold $W^{\mathbb{C}}_M$ is a Kähler manifold (cf. Sect. 4.2). A function of QK-coordinates $z^\alpha$ remains a function of QK-coordinates $z^\beta$ after any patching transformation $\Phi^*_{\beta\alpha}$. Hence on the noncommutative Kähler manifold $W^{\mathbb{C}}_M$, the notion of quantum holomorphic function is well-defined as a function of $z^\alpha$ on each $W^{\mathbb{C}}_{U_\alpha}$.

We now consider a Weyl manifold $W_M$ over $M$ and its complexification $W^{\mathbb{C}}_M$. Let $\{c_{\alpha\beta\gamma}\}$ be the Poincaré–Cartan cocycle of $\{\mathfrak{g}(U_\alpha)\}$. Since constant functions can be viewed as holomorphic functions, there is a natural homomorphism $\pi$ of $H^2(M)$ into $H^2(M, \mathcal{O})$, the sheaf cohomology group of holomorphic functions. The following is the main theorem of this paper:

Theorem 4.6. A Weyl manifold $W_M$ constructed on a Kähler manifold $M$ is a noncommutative Kähler manifold, if and only if $\pi(c^{(2k)}(W_M)) = 0$ for every $k \ge 1$. In particular, if $H^2(M, \mathcal{O}) = \{0\}$, then any Weyl manifold constructed on $M$ is a noncommutative Kähler manifold.

5. Proof of Theorem 4.6

5.1. Quantum complex coordinates. Let $M$ be a Kähler manifold. According to [OMY1], there exists a Weyl manifold $W^{\mathbb{C}}_M$.

Theorem 5.1. There is an open covering $\{V_\alpha\}$ of $M$ such that on every $V_\alpha$ there are QK-coordinates $\zeta^\alpha_1, \cdots, \zeta^\alpha_n, \bar\zeta^\alpha_1, \cdots, \bar\zeta^\alpha_n$ and the quantum canonical conjugates $\zeta^{*\alpha}_1, \cdots, \zeta^{*\alpha}_n$ with $[\zeta^\alpha_i, \zeta^{*\alpha}_j] = -\nu\delta_{ij}$, $[\zeta^{*\alpha}_i, \zeta^{*\alpha}_j] = 0$.
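Proof. We first take K-coordinates $z^\alpha_1, \cdots, z^\alpha_n$ and we make CC-coordinates $z^\alpha_1, \cdots, z^\alpha_n, z^{*\alpha}_1, \cdots, z^{*\alpha}_n$ on $U_\alpha$. Set $\zeta^\alpha_i = (z^\alpha_i)^\sharp$, $\zeta^{*\alpha}_i = (z^{*\alpha}_i)^\sharp$. Then, we get

$[\zeta^\alpha_i, \zeta^\alpha_j] = [\zeta^{*\alpha}_i, \zeta^{*\alpha}_j] = 0 \pmod{\nu^2}$, $\quad [\zeta^\alpha_i, \zeta^{*\alpha}_j] = -\nu\delta_{ij} \pmod{\nu^2}$. (5.1)

Set as follows:

$[\zeta^\alpha_i, \zeta^\alpha_j] = \nu^2 a_{ij} \pmod{\nu^3}$, $\quad [\zeta^\alpha_i, \zeta^{*\alpha}_j] = -\nu\delta_{ij} + \nu^2 b_{ij} \pmod{\nu^3}$, $\quad [\zeta^{*\alpha}_i, \zeta^{*\alpha}_j] = \nu^2 c_{ij} \pmod{\nu^3}$. (5.2)

Define a 2-form $\omega$ as follows:

$\omega = \sum (a_{ij}\, dz^{*\alpha}_i \wedge dz^{*\alpha}_j - b_{ij}\, dz^{*\alpha}_i \wedge dz^\alpha_j + c_{ij}\, dz^\alpha_i \wedge dz^\alpha_j)$. (5.3)

By the Jacobi identity, we have $d\omega = 0$. Thus, by Lemma 4.1, there exists a 1-form

$\theta = \sum \lambda_i\, dz^{*\alpha}_i + \sum \kappa_j\, dz^\alpha_j$ (5.4)

such that $d\theta = \omega$. Replacing $\zeta^\alpha_i$, $\zeta^{*\alpha}_i$ by $\tilde\zeta_i = \zeta^\alpha_i - \nu\lambda_i$, $\tilde\zeta^*_i = \zeta^{*\alpha}_i + \nu\kappa_i$, we obtain

$[\tilde\zeta_i, \tilde\zeta_j] = [\tilde\zeta^*_i, \tilde\zeta^*_j] = 0 \pmod{\nu^3}$, $\quad [\tilde\zeta_i, \tilde\zeta^*_j] = -\nu\delta_{ij} \pmod{\nu^3}$.

Repeating this procedure yields Theorem 5.1.

$\zeta^\alpha_1, \cdots, \zeta^\alpha_n, \zeta^{*\alpha}_1, \cdots, \zeta^{*\alpha}_n$ will be called quantum complex canonical generators (QCC-generators).

5.2. Standard noncommutative Kähler manifold. On a Kähler manifold $M$, we take a coordinate covering $\{V_\alpha\}$ and K-coordinates $z^\alpha_1, \cdots, z^\alpha_n$ on $U_\alpha$, where $U_\alpha = \varphi_\alpha(V_\alpha)$. By using the argument in Sect. 4.2 on each $U_\alpha$, there are CC-coordinates $z^\alpha_1, \cdots, z^\alpha_n, z^{*\alpha}_1, \cdots, z^{*\alpha}_n$. Identifying $V_\alpha$ with $U_\alpha$, we use the above CC-coordinates on $V_\alpha$. Let $\psi^\alpha_z : V_\alpha \to \mathbb{C}^{2n}$ be the coordinate map of these paracoordinates and let $\tilde V_\alpha = \{\psi^\alpha_z(p);\ p \in V_\alpha\}$. We define a star-product $*_\alpha$ on $C^\infty(\tilde V_\alpha)^{\mathbb{C}}[[\nu]]$ by

$f *_\alpha g = f \exp\left\{ -\dfrac{\nu}{2}\, \overleftarrow{\partial_{z^\alpha}} \,\dot\wedge\, \overrightarrow{\partial_{z^{*\alpha}}} \right\} g$. (5.5)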
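To see how (5.5) produces the commutation relations of Theorem 5.1, the following sketch expands the star-product to first order. It assumes the bidifferential operator in the exponent is read as $\sum_i (\overleftarrow{\partial_{z_i}}\,\overrightarrow{\partial_{z^*_i}} - \overleftarrow{\partial_{z^*_i}}\,\overrightarrow{\partial_{z_i}})$; that reading is an assumption made here for illustration, and the index $\alpha$ is dropped.

```latex
% Assumed reading of the exponent in (5.5); first-order expansion:
f * g = fg - \frac{\nu}{2}\sum_i\left(
      \frac{\partial f}{\partial z_i}\frac{\partial g}{\partial z^*_i}
    - \frac{\partial f}{\partial z^*_i}\frac{\partial g}{\partial z_i}\right) + O(\nu^2),
\\[4pt]
% On linear functions the higher-order terms vanish identically:
z_i * z^*_j = z_i z^*_j - \frac{\nu}{2}\,\delta_{ij},
\qquad
z^*_j * z_i = z_i z^*_j + \frac{\nu}{2}\,\delta_{ij},
\\[4pt]
[z_i, z^*_j] = -\nu\,\delta_{ij},
\qquad
[z_i, z_j] = [z^*_i, z^*_j] = 0.
```

Under this reading the first-order term is $-\frac{\nu}{2}\{f, g\}$ with the bracket (4.4), so that $[f, g] = -\nu\{f, g\} + O(\nu^2)$.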
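$(C^\infty(\tilde V_\alpha)^{\mathbb{C}}[[\nu]], *_\alpha)$ can be viewed as the algebra of Weyl functions $\mathcal{F}(\hat W^{\mathbb{C}}_{\tilde V_\alpha})$ of the trivial complex Weyl algebra bundle $\hat W^{\mathbb{C}}_{\tilde V_\alpha}$ over $\tilde V_\alpha$. Since $\tilde V_\alpha$ is diffeomorphic to $V_\alpha$ through the coordinate map $\psi^\alpha_z$, $\mathcal{F}(\hat W^{\mathbb{C}}_{\tilde V_\alpha})$ may be written as $\mathcal{F}(\hat W^{\mathbb{C}}_{V_\alpha})$. We now identify $C^\infty(\tilde V_\alpha)^{\mathbb{C}}[[\nu]]$ with $\mathcal{F}(\hat W^{\mathbb{C}}_{V_\alpha})$.

Let $\varphi_{\alpha\beta} = \varphi_\beta\varphi_\alpha^{-1}$ be classical holomorphic coordinate transformations. By (4.8), (4.9), we see the following: Under the same notations as in Lemma 4.2, we set $\hat\Phi^*_{\alpha\beta}(\nu) = \nu$ and

$\hat\Phi^*_{\alpha\beta}(z^\beta_i) = \varphi^i_{\alpha\beta}(z^\alpha)$, $\quad \hat\Phi^*_{\alpha\beta}(z^{*\beta}_i) = \sum_k ((d\varphi_{\alpha\beta})^{-1})^k_i \cdot \left( z^{*\alpha}_k + \dfrac{\partial g_{\alpha\beta}}{\partial z^\alpha_k} \right)$. (5.6)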
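Theorem 5.2. (i) The mapping $\hat\Phi^*_{\alpha\beta}$ extends to a pre-Weyl diffeomorphism of $(\mathcal{F}(\hat W^{\mathbb{C}}_{V_{\beta\alpha}}), *_\beta)$ onto $(\mathcal{F}(\hat W^{\mathbb{C}}_{V_{\alpha\beta}}), *_\alpha)$ such that

$\hat\Phi^*_{\alpha\alpha} = 1$, $\quad \hat\Phi^*_{\alpha\beta}\hat\Phi^*_{\beta\gamma}\hat\Phi^*_{\gamma\alpha} = 1$. (5.7)

Thus, we obtain a noncommutative Kähler manifold $\hat W^{\mathbb{C}}_M$ in which $z^\alpha_1, \cdots, z^\alpha_n, z^{*\alpha}_1, \cdots, z^{*\alpha}_n$ are local QCC-generators.

(ii) Moreover, $\hat\Phi^*_{\alpha\beta}$ extends to a pointless contact diffeomorphism $\hat\Phi^{c*}_{\alpha\beta}$ such that

$\hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$, $\quad c^{(0)}_{\alpha\beta\gamma} \in \mathbb{C}$, (5.8)

and $c^{(0)}_{\alpha\beta\gamma}$ defines a cohomology class in the coefficients $\mathbb{C}$ of the symplectic 2-form on $M$.

Proof. Omitting subscripts $\alpha, \beta$, we write $z'_i = \varphi^i(z)$. By (5.5), we have

$\left[ \sum_k \dfrac{\partial z_k}{\partial z'_i} \cdot \left( z^*_k + \dfrac{\partial g}{\partial z_k} \right),\ z'_j \right] = \nu \sum_k \dfrac{\partial z_k}{\partial z'_i}\dfrac{\partial z'_j}{\partial z_k} = \nu\delta_{ij}$,
$[z'_i, z'_j] = 0$, $\quad \left[ \sum_k \dfrac{\partial z_k}{\partial z'_i} \cdot \left( z^*_k + \dfrac{\partial g}{\partial z_k} \right),\ \sum_m \dfrac{\partial z_m}{\partial z'_j} \cdot \left( z^*_m + \dfrac{\partial g}{\partial z_m} \right) \right] = 0$. (5.9)

Thus, setting $z'^*_i = \sum_k \frac{\partial z_k}{\partial z'_i} \cdot (z^*_k + \frac{\partial g}{\partial z_k})$, we see that $z'_1, \cdots, z'_n, z'^*_1, \cdots, z'^*_n$ are QCC-generators of $C^\infty(V_{\alpha\beta})^{\mathbb{C}}[[\nu]]$.

We show that $\hat\Phi^*_{\alpha\beta}$ extends to a pre-Weyl diffeomorphism. Since $\varphi_{\alpha\beta}$ is a symplectic diffeomorphism, Proposition 2.3 gives a lift $\Psi^*_{\alpha\beta}$ of $\varphi_{\alpha\beta}$. By Theorem 3.2, we may assume that the $\Psi^*_{\alpha\beta}$ are patching Weyl diffeomorphisms of a Weyl manifold $W_M$. We consider $(\Psi^*_{\alpha\beta})^{-1}\hat\Phi^*_{\alpha\beta}$ on the above QCC-generators. Set

$(\Psi^*_{\alpha\beta})^{-1}\hat\Phi^*_{\alpha\beta}(z'_i) = z'_i + h_i$, $\quad (\Psi^*_{\alpha\beta})^{-1}\hat\Phi^*_{\alpha\beta}(z'^*_i) = z'^*_i + h^*_i$.

By (5.9) together with Lemma 4.1, we easily see that there are elements $h_{\alpha\beta} \in C^\infty(V_{\alpha\beta})^{\mathbb{C}}[[\nu]]$ such that $\hat\Phi^*_{\alpha\beta} = \Psi^*_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}$. Since $\Psi^*_{\alpha\beta}$ and $e^{\mathrm{ad}(h_{\alpha\beta})}$ are pre-Weyl diffeomorphisms, $\hat\Phi^*_{\alpha\beta}$ extends to a pre-Weyl diffeomorphism. Equation (5.7) follows directly from (4.9). Thus, we get a noncommutative Kähler manifold $\hat W_M$.

Though we can make, by Lemma 2.6, a pointless contact diffeomorphism $\hat\Phi^{c*}_{\alpha\beta}$ which extends $\hat\Phi^*_{\alpha\beta}$, we construct $\hat\Phi^{c*}_{\alpha\beta}$ directly in two ways to obtain (5.8).

We define a contact Weyl Lie algebra $\Gamma(\mathfrak{g}^{\mathbb{C}}_{U_\beta})$ by joining $\tau^\beta$ with the relations

$[\tau^\beta, \nu] = 2\nu^2$, $\quad [\tau^\beta, z^\beta_i] = \nu z^\beta_i$, $\quad [\tau^\beta, z^{*\beta}_i] = \nu z^{*\beta}_i$. (5.10)

To obtain the extension $\hat\Phi^{c*}_{\alpha\beta}$ of $\hat\Phi^*_{\alpha\beta}$ we have only to know the function $f_{\alpha\beta}$ given by $\hat\Phi^{c*}_{\alpha\beta}(\tau^\beta) = \tau^\alpha + f_{\alpha\beta}$. By (5.6), we set $z'_i = \varphi^*_{\alpha\beta}(z^\beta_i)$, $z'^*_i = \varphi^*_{\alpha\beta}(z^{*\beta}_i)$ on $U_{\alpha\beta}$. Then, by (5.9) and (5.10), $f_{\alpha\beta}$ must satisfy

$[\tau^\alpha + f_{\alpha\beta}, z'_i] = \nu z'_i$, $\quad [\tau^\alpha + f_{\alpha\beta}, z'^*_i] = \nu z'^*_i$. (5.11)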
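Note that $[\tau^\alpha, h] = \nu E^\alpha h$, where $E^\alpha$ is the Euler operator given by $E^\alpha = \sum_{i=1}^{n} (z^\alpha_i \cdot \partial_{z^\alpha_i} + z^{*\alpha}_i \cdot \partial_{z^{*\alpha}_i})$. Equation (5.11) can be rewritten by using the usual commutative product, as

$\dfrac{\partial}{\partial z'^*_i} f_{\alpha\beta} = -(I - E^\alpha) z'_i$, $\quad \dfrac{\partial}{\partial z'_i} f_{\alpha\beta} = (I - E^\alpha) z'^*_i$. (5.12)

Thus, by solving (5.12) via Lemma 4.1, we find $f_{\alpha\beta}$. Since the right-hand side of (5.12) does not involve $\nu$, $f_{\alpha\beta}$ does not involve $\nu$. Put

$c^{(0)}_{\alpha\beta\gamma} = \frac{1}{2}(f_{\alpha\beta} + \varphi^*_{\alpha\beta} f_{\beta\gamma} + \varphi^*_{\alpha\gamma} f_{\gamma\alpha})$.

Then, we have $c^{(0)}_{\alpha\beta\gamma} \in \mathbb{C}$ and $\hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$.

To find the de Rham cohomology class corresponding to $c^{(0)}_{\alpha\beta\gamma}$ through the isomorphism between Čech cohomology and de Rham cohomology, we recall another recipe for constructing $f_{\alpha\beta}$. That is, we find a 1-form $\tilde\theta_\alpha$ on every $V_\alpha$ such that $\varphi^*_{\alpha\beta}\tilde\theta_\beta - \tilde\theta_\alpha = \frac{1}{2} df_{\alpha\beta}$ on $V_{\alpha\beta} = V_\alpha \cap V_\beta$, because $\{d\tilde\theta_\alpha\}$ defines a global closed 2-form. To find $\tilde\theta_\alpha$, we remark that there is a one parameter family $\psi_t$ of symplectic diffeomorphisms $\psi_t : V_{\alpha\beta,0} \to V_{\alpha\beta,t}$ such that $V_{\alpha\beta,0} = V_{\alpha\beta}$, $\psi_0 = 1$ and $V_{\alpha\beta,1} = V_{\alpha\beta}$, $\psi_1 = (\varphi^1_{\alpha\beta}, \cdots, \varphi^{2n}_{\alpha\beta})$ (cf. [OMY3, Lemma A] and [KNz]). Define an infinitesimal symplectic transformation $H_t$ by $H_t = (\frac{d}{dt}\psi_t)\psi_t^{-1}$. Since $H_t$ is a Hamiltonian vector field, there is a $C^\infty$ function $h_t$ such that $\Omega \,\lrcorner\, H_t = -dh_t$.

Recall that a lift $\Psi^{c*}_t$ is given by solving the equation $\frac{d}{dt}\Psi^{c*}_t = \Psi^{c*}_t\, \mathrm{ad}(\frac{1}{\nu} h_t)$. If we set $\Psi^{c*}_t(\tau^\alpha) = \tau^\alpha + f_t$, then $f_t$ must satisfy the differential equation

$\dfrac{d}{dt} f_t = \dfrac{1}{\nu}[h_t, f_t] + 2h_t - \dfrac{1}{\nu}[\tau^\alpha, h_t]$. (5.13)

By the first construction of $f_{\alpha\beta}$, we may set $f_1 = f_{\alpha\beta}$ mod $\nu$.

Note that setting $\tilde\theta_\alpha = \frac{1}{2}\sum (z^\alpha_i\, dz^{*\alpha}_i - z^{*\alpha}_i\, dz^\alpha_i)$, we have $\frac{1}{\nu}[\tau^\alpha, h_t] = E^\alpha h_t = 2\tilde\theta_\alpha \,\lrcorner\, H_t$. Solving the equation of the $\nu^0$-component of (5.13), we have

$f_{\alpha\beta} = \psi^*_1 \int_0^1 \psi^{*-1}_t (2h_t - 2\tilde\theta_\alpha \,\lrcorner\, H_t)\, dt$. (5.14)

Note that $\Omega = d\tilde\theta_\alpha$ on $V_\alpha$ (cf. 4.2). By Cartan's formula for Lie derivatives, we see that

$df_{\alpha\beta} = 2\psi^*_1 \int_0^1 \psi^{*-1}_t (dh_t - d(\tilde\theta_\alpha \,\lrcorner\, H_t))\, dt = -2\psi^*_1 \int_0^1 \psi^{*-1}_t L_{H_t}\tilde\theta_\alpha\, dt$.

Hence we have $df_{\alpha\beta} = 2(\varphi^*_{\alpha\beta}\tilde\theta_\beta - \tilde\theta_\alpha)$, by remarking $\psi^{*-1}_1\tilde\theta_\alpha = \varphi^*_{\alpha\beta}\tilde\theta_\beta$. Thus, we see that $d\tilde\theta_\alpha = 2\sum dz^\alpha_i \wedge dz^{*\alpha}_i$. The last assertion is proved.

Probably, the noncommutative Kähler structure obtained by Theorem 5.2 is isomorphic to that given by Karabegov [Ka].

5.3. Proof of Theorem 4.6. Suppose we have a Weyl manifold $W_M$ with the Poincaré–Cartan class $\{c_{\alpha\beta\gamma}\}$. Let $\Phi^*_{\alpha\beta}$ be Weyl diffeomorphisms giving patching transformations, and let $\Phi^{c*}_{\alpha\beta}$ be the lifts of $\Phi^*_{\alpha\beta}$ given in Sect. 3.2. Let $\varphi_{\alpha\beta} : U_{\alpha\beta} \to U_{\beta\alpha}$ be the coordinate transformation induced by $\Phi^*_{\alpha\beta}$. $\varphi_{\alpha\beta}$ is a symplectomorphism and a holomorphic diffeomorphism at the same time.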
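By the assumption of Theorem 4.6, we have that for every $k \ge 1$, $\{c^{(2k)}_{\alpha\beta\gamma}\}$ can be written in the form $c^{(2k)}_{\alpha\beta\gamma} = g^{(2k)}_{\alpha\beta} + \varphi^*_{\alpha\beta} g^{(2k)}_{\beta\gamma} + \varphi^*_{\alpha\gamma} g^{(2k)}_{\gamma\alpha}$, $(k \ge 1)$, where $g^{(2k)}_{\alpha\beta}$ is a holomorphic function on $U_{\alpha\beta} = \varphi_\alpha(V_\alpha \cap V_\beta)$. Beside $\Phi^{c*}_{\alpha\beta}$, we define another family of pre-Weyl diffeomorphisms

$\breve\Phi^{c*}_{\alpha\beta} = \hat\Phi^{c*}_{\alpha\beta} \exp \sum_{k\ge 1} \dfrac{1}{2k-1}\, \mathrm{ad}(\nu^{2k-1} g^{(2k)}_{\alpha\beta})$ (5.15)

by using the $g^{(2k)}_{\alpha\beta}$ given above. By (3.7), we see that $\{\breve\Phi^{c*}_{\alpha\beta}\}$ satisfies

$\hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha} = \breve\Phi^{c*}_{\alpha\beta}\breve\Phi^{c*}_{\beta\gamma}\breve\Phi^{c*}_{\gamma\alpha} \exp \sum_k \dfrac{1}{2k-1}\, \mathrm{ad}\bigl(\nu^{2k-1}(g^{(2k)}_{\alpha\beta} + \hat\Phi^*_{\alpha\beta} g^{(2k)}_{\beta\gamma} + \hat\Phi^*_{\alpha\gamma} g^{(2k)}_{\gamma\alpha})\bigr)$.

Hence by (5.6), (5.8), we have

Lemma 5.3. $\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha} = \breve\Phi^{c*}_{\alpha\beta}\breve\Phi^{c*}_{\beta\gamma}\breve\Phi^{c*}_{\gamma\alpha} = e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$.

The above lemma also shows that the system $\{\breve\Phi^{c*}_{\alpha\beta}\}$ defines a pre-Weyl manifold $\breve W_M$. However, we see also the following:

Lemma 5.4. For each $\alpha, \beta$ such that $V_\alpha \cap V_\beta \neq \emptyset$, there exist a unique $h_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ and a unique $c_{\alpha\beta} \in \mathbb{C}$ such that $\Phi^{c*}_{\alpha\beta} = \breve\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}\, e^{c_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$.

Proof. We already know that $f_{\alpha\beta}$ does not involve $\nu$. Hence by (5.15) we see that $\breve\Phi^{c*}_{\alpha\beta}(\tau^\beta)$ is written in the form $\tau^\alpha + \sum_{k\ge 0} \nu^{2k} * h^\sharp_{2k}$, $h_{2k} \in C^\infty(U_{\alpha\beta})$. Apply now Lemma 2.7 to $\Phi^*_{\alpha\beta}$. Since both $\Phi^*_{\alpha\beta}$ and $\hat\Phi^*_{\alpha\beta}$ induce the same $\varphi_{\alpha\beta}$ on the base spaces, we see by Corollary 2.5, (1) that $\Phi^*_{\alpha\beta}(\hat\Phi^*_{\alpha\beta})^{-1}(\tau^\alpha)$ is written uniquely in the form $\tau^\alpha + 2c_{\alpha\beta} + \nu^2 * h'_{\alpha\beta}$, where $h'_{\alpha\beta} \in \mathcal{F}(W_{U_{\beta\alpha}})^{\mathbb{C}}$. Hence, we have $\Phi^{c*}_{\alpha\beta} = \breve\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}\, e^{c_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$.

We remark also that $c_{\alpha\beta} + c_{\beta\gamma} + c_{\gamma\alpha} = 0$. The identities $\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\alpha} = 1$, $\breve\Phi^{c*}_{\alpha\beta}\breve\Phi^{c*}_{\beta\alpha} = 1$ together with (3.7) yield $h_{\beta\alpha} = -\breve\Phi^*_{\alpha\beta} h_{\alpha\beta}$ and $c_{\alpha\beta} = -c_{\beta\alpha}$. In what follows we use

$\breve\Psi^*_{\alpha\beta} = \breve\Phi^*_{\alpha\beta}\, e^{c_{\alpha\beta}\,\mathrm{ad}(\nu^{-1})}$ (5.16)

instead of $\breve\Phi^*_{\alpha\beta}$, since this replacement (5.16) does not change the Poincaré–Cartan cocycle by the above remark. The identity in Lemma 5.3 gives the following cocycle property for $\{h_{\alpha\beta}\}$:

Lemma 5.5. On $\Gamma(\mathfrak{g}_{U_{\alpha\beta\gamma}})$, we have $e^{\mathrm{ad}(\nu h_{\alpha\beta})}\, e^{\mathrm{ad}(\nu\breve\Psi^*_{\alpha\beta} h_{\beta\gamma})}\, e^{\mathrm{ad}(\nu\breve\Psi^*_{\alpha\gamma} h_{\gamma\alpha})} = 1$.

The next lemma shows that this cocycle is a coboundary, and hence $W_M \cong \breve W_M$.

Lemma 5.6. For each $\alpha$, there exists $h_\alpha \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ such that

$\Phi^{c*}_{\alpha\beta} = e^{\mathrm{ad}(\nu h_\alpha)}\, \breve\Psi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(\nu h_\beta)}$. (5.17)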
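Proof. Let $\varphi_{\alpha\beta}$ be the symplectic diffeomorphism induced by $\Phi^*_{\alpha\beta}$. Using Lemma 5.5, we have by identifying $h_{\alpha\beta}$ with an ordinary function that

$h_{\alpha\beta} + \varphi^*_{\alpha\beta} h_{\beta\gamma} + \varphi^*_{\alpha\gamma} h_{\gamma\alpha} = 0 \mod \nu$. (5.18)

Taking a partition of unity $\{\phi_\alpha\}$ subordinate to the covering $\{V_\alpha\}$, we set

$h_\alpha = \sum_\gamma \varphi^*_{\alpha\gamma}\phi_\gamma h_{\alpha\gamma} \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. (5.19)

Using Lemma 5.5 again, we get $h_{\alpha\beta} = h_\alpha - \varphi^*_{\alpha\beta} h_\beta$. Setting

$\acute\Psi^{c*}_{\alpha\beta} = e^{\mathrm{ad}(h_\alpha)}\, \breve\Psi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(h_\beta)}$,

we see that $\Phi^{c*}_{\alpha\beta} = \acute\Psi^{c*}_{\alpha\beta}$ mod $\nu^3$. By Corollary 2.5, (1), there exists a unique $\tilde h_{\alpha\beta}$ such that

$\Phi^{c*}_{\alpha\beta} = \acute\Psi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(\nu^2 \tilde h_{\alpha\beta})}$ (5.20)

without the $e^{c\,\mathrm{ad}(\nu^{-1})}$-term. Repeating this procedure yields Lemma 5.6.

We now show Theorem 4.6. We first show the necessity; $\pi(c^{(2k)}(W_M)) = 0$ implies $M$ is a noncommutative Kähler manifold. We may assume by Lemma 5.6 that $W_M$ is a pre-Weyl manifold with a system of patching diffeomorphisms $\breve\Psi^*_{\alpha\beta}$. Since $[\nu^{-1}, z^\beta_i] = 0$ and $[g^{(2k)}_{\alpha\beta}(z^\beta), z^\beta_i] = 0$, we have by (5.15), (5.16) and Theorem 5.2 that

$\breve\Psi^*_{\alpha\beta} z^\beta_i = \hat\Phi^*_{\alpha\beta} z^\beta_i = \varphi^i_{\alpha\beta}(z^\alpha)$. (5.21)

This means the patching transformations are holomorphic. Note that $[z^\alpha_i, \bar z^\alpha_j] = -\nu\{b(z^\alpha_i), b(\bar z^\alpha_j)\}$ mod $\nu$. Hence we see the body part of the matrix $(-\frac{1}{\nu}[z^\alpha_i, \bar z^\alpha_j])$ is nondegenerate. Thus, $W_M$ is a noncommutative Kähler manifold.

To prove the sufficiency in Theorem 4.6, let $W_M$ be a Weyl manifold with the Poincaré–Cartan class $c = \sum_{k\ge 0} c^{(2k)}$. By the definition of a Weyl manifold $W_M$ for a Kähler manifold $M$, there are a simple open Stein covering $\{V_\alpha\}$, a system of trivial Lie algebra bundles $\{\mathfrak{g}_{U_\alpha}\}$ and a system of patching transformations $\{\Phi^{c*}_{\alpha\beta}\}$.

Suppose that $W_M$ is a noncommutative Kähler manifold over $M$. Then, we may assume that on each $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ there are QK-coordinates $z^\alpha_1, \cdots, z^\alpha_n, \bar z^\alpha_1, \cdots, \bar z^\alpha_n$ with the pre-Weyl diffeomorphisms $\Psi^*_{\alpha\beta} : \mathcal{F}(W_{U_{\beta\alpha}})^{\mathbb{C}} \to \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ satisfying the property (4.10). Since $\{\Phi^*_{\alpha\beta}\}$ and $\{\Psi^*_{\alpha\beta}\}$ are patching diffeomorphisms of the same Weyl manifold $W_M$, there is a pre-Weyl diffeomorphism $\Psi^*_\alpha$ for each $\alpha$ such that $\Phi^*_{\alpha\beta}\Psi^*_\beta = \Psi^*_\alpha\Psi^*_{\alpha\beta}$. Hence $W_M$ can be viewed as a pre-Weyl manifold with patching diffeomorphisms $\Psi^*_{\alpha\beta}$. Let $\Psi^{c*}_{\alpha\beta}$ be a pointless contact diffeomorphism which extends $\Psi^*_{\alpha\beta}$. On the other hand, remark that the holomorphic coordinate change $\varphi_{\alpha\beta}$ in (4.10) can be viewed as the usual holomorphic coordinate transformations on the base manifold $M$.

By Theorem 5.1, there are QCC-generators $z^\alpha_1, \cdots, z^\alpha_n, z^{*\alpha}_1, \cdots, z^{*\alpha}_n$ on each $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. By Theorem 5.2, we have pre-Weyl diffeomorphisms $\hat\Phi^*_{\alpha\beta}$ and pointless contact diffeomorphisms $\hat\Phi^{c*}_{\alpha\beta}$ which extend $\hat\Phi^*_{\alpha\beta}$.

Since $\Psi^{c*}_{\alpha\beta}(\hat\Phi^{c*}_{\alpha\beta})^{-1}$ induces the identity on the base space $U_{\alpha\beta}$, there is by (2.2) $h_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ such that $\Psi^*_{\alpha\beta} = \hat\Phi^*_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}$. The terms $e^{c\,\mathrm{ad}(\nu^{-1})}$, $e^{c'\,\mathrm{ad}(\log\nu)}$ need not be used because these are identities on $\mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$.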
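Note that $\Psi^*_{\alpha\beta}(z_i) = \hat\Phi^*_{\alpha\beta}(z_i)$ for every $z_i$. We see that $[z_i, h_{\alpha\beta}] = 0$. It follows that $h_{\alpha\beta}$ does not involve the $z^*_i$ variables, that is, it is "holomorphic".

We define $\Psi^{c*}_{\alpha\beta}$ by $\hat\Phi^{c*}_{\alpha\beta}\, e^{\mathrm{ad}(h_{\alpha\beta})}$. Then, $\Psi^{c*}_{\alpha\beta}$ is an extension of $\Psi^*_{\alpha\beta}$. The Poincaré–Cartan cocycle of $W_M$ is given as

$e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})} = \Psi^{c*}_{\alpha\beta}\Psi^{c*}_{\beta\gamma}\Psi^{c*}_{\gamma\alpha} = \hat\Phi^{c*}_{\alpha\beta}\hat\Phi^{c*}_{\beta\gamma}\hat\Phi^{c*}_{\gamma\alpha}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\beta} h_{\alpha\beta})}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\gamma} h_{\beta\gamma})}\, e^{\mathrm{ad}(h_{\gamma\alpha})}$.

By (5.8), we have $e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\beta} h_{\alpha\beta})}\, e^{\mathrm{ad}(\hat\Phi^*_{\alpha\gamma} h_{\beta\gamma})}\, e^{\mathrm{ad}(h_{\gamma\alpha})}$. Let $h_{\alpha\beta} = \sum_{k\ge 0} \nu^k h^{(k)}_{\alpha\beta}$. Since $c_{\alpha\beta\gamma} = \sum_{k\ge 0} \nu^{2k} c^{(2k)}_{\alpha\beta\gamma}$, we have

$\hat\Phi^*_{\alpha\beta} h_{\alpha\beta} + \hat\Phi^*_{\alpha\gamma} h_{\beta\gamma} + h_{\gamma\alpha} = 0 \mod \nu$.

By identifying $h^{(0)}_{\alpha\beta}$ with an ordinary function, we get

$0 = \varphi^*_{\alpha\beta} h^{(0)}_{\alpha\beta} + \varphi^*_{\alpha\gamma} h^{(0)}_{\beta\gamma} + h^{(0)}_{\gamma\alpha}$.

Consider $\check\Phi^{c*}_{\alpha\beta} = \hat\Phi^{c*}_{\alpha\beta}\, e^{-\mathrm{ad}(h^{(0)}_{\alpha\beta})}$ instead of $\hat\Phi^{c*}_{\alpha\beta}$ in the above arguments. Since the $h^{(0)}_{\alpha\beta}$ are holomorphic, $\check\Phi^{c*}_{\alpha\beta}$ can be used as patching diffeomorphisms to define a noncommutative Kähler manifold.

Now by the same reason as above, there are $h_{\alpha\beta}$ such that $\Psi^*_{\alpha\beta} = \check\Phi^*_{\alpha\beta}\, e^{\mathrm{ad}(\nu h_{\alpha\beta})}$ and the $h_{\alpha\beta}$ are holomorphic. Hence, we have

$e^{c_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})} = e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}\, e^{\mathrm{ad}(\nu\check\Phi^*_{\alpha\beta} h_{\alpha\beta})}\, e^{\mathrm{ad}(\nu\check\Phi^*_{\alpha\gamma} h_{\beta\gamma})}\, e^{\mathrm{ad}(\nu h_{\gamma\alpha})}$.

Setting $h_{\alpha\beta} = \sum_{k\ge 0} \nu^k h^{(k)}_{\alpha\beta}$, we have $c^{(2)}_{\alpha\beta\gamma} = \varphi^*_{\alpha\beta} h^{(0)}_{\alpha\beta} + \varphi^*_{\alpha\gamma} h^{(0)}_{\beta\gamma} + h^{(0)}_{\gamma\alpha}$. This implies that $c^{(2)}_{\alpha\beta\gamma}$ is a coboundary in the cochain complex with coefficients $\mathcal{O}$. Repeating this procedure, we see $\{c^{(2k)}_{\alpha\beta\gamma}\} = 0$ in $H^2(M, \mathcal{O})$ for $k \ge 1$. Thus, we obtain Theorem 4.6.

6. Construction of Noncommutative Contact Algebras

In this section we construct a certain algebra, called a noncommutative contact algebra, over a quantizable symplectic manifold. We use the notations stated in Sect. 3.

Let $M$ be a symplectic manifold with the symplectic form $\Omega$. We assume that $M$ is quantizable, i.e., $\frac{1}{\pi}\Omega \in H^2(M; \mathbb{Z})$. We consider a Weyl manifold $W_M$. On each coordinate neighborhood $U_\alpha$, we use $\tau$ given by (1.9) and we denote it by $\tilde\tau^\alpha$. In this section we assume that the Poincaré–Cartan class $c(W_M)$ is $c^{(0)}(W_M)$.

Since $M$ is quantizable, we can assume that $c^{(0)}_{\alpha\beta\gamma}$ is taken as $\pi n_{\alpha\beta\gamma}$, where $n_{\alpha\beta\gamma} \in \mathbb{Z}$. Since $[\tilde\tau^\alpha, \nu^{-1}] = -2$, we see that

$\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha}\, \tilde\tau^\alpha = \tilde\tau^\alpha + 2\pi n_{\alpha\beta\gamma}$. (6.1)

This implies $\Phi^{c*}_{\alpha\beta}\Phi^{c*}_{\beta\gamma}\Phi^{c*}_{\gamma\alpha}\, e^{i\tilde\tau^\alpha} = e^{i\tilde\tau^\alpha}$, and hence the associative algebras $\mathcal{A}(U_\alpha)$ generated by $e^{i\tilde\tau^\alpha}$ and $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ can be patched together to form an algebra sheaf. We denote this patched algebra by $\mathcal{A}(M)$. Every element of $\mathcal{A}(U_\alpha)$ is written in the form $\sum f_m * e^{im\tilde\tau^\alpha}$, $f_m \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$, and the $\Phi^{c*}_{\alpha\beta}$ are the patching transformations.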
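The step from integrality to the invariance of $e^{i\tilde\tau^\alpha}$ can be spelled out as follows. The sketch assumes $\mathrm{ad}(a)b = [a, b]$ and that the triple product of patching maps equals $e^{c^{(0)}_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}$ with $c^{(0)}_{\alpha\beta\gamma} = \pi n_{\alpha\beta\gamma}$, as in the text; the intermediate lines are added for illustration.

```latex
% Assumption: ad(a) b = [a, b]; triple product = exp( pi n ad(nu^{-1}) ).
e^{\pi n_{\alpha\beta\gamma}\,\mathrm{ad}(\nu^{-1})}\,\tilde\tau^\alpha
  = \tilde\tau^\alpha + \pi n_{\alpha\beta\gamma}\,[\nu^{-1}, \tilde\tau^\alpha] + 0
  = \tilde\tau^\alpha + 2\pi n_{\alpha\beta\gamma},
\\[4pt]
% Higher terms vanish because [\nu^{-1}, 2] = 0. Since 2 pi n is a central scalar:
e^{i(\tilde\tau^\alpha + 2\pi n_{\alpha\beta\gamma})}
  = e^{2\pi i\, n_{\alpha\beta\gamma}}\; e^{i\tilde\tau^\alpha}
  = e^{i\tilde\tau^\alpha}
\qquad (n_{\alpha\beta\gamma} \in \mathbb{Z}).
```

The same computation with a non-integer $n_{\alpha\beta\gamma}$ would leave a nontrivial phase, which is where the quantization condition enters.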
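Let $\mathcal{A}_m(M)$ be the subspace consisting of elements written in the form $f_\alpha * e^{im\tilde\tau^\alpha}$ on each $U_\alpha$, where $f_\alpha \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. $\mathcal{A}_m(M)$ is characterized as the eigenspace of $\frac{1}{i}\mathrm{ad}(\nu^{-1})$ with the eigenvalue $2m$. Clearly, we have

$\mathcal{A}(M) = \bigoplus_{m\in\mathbb{Z}} \mathcal{A}_m(M)$,

and $\mathcal{A}_0(M) = \mathcal{F}(W_M)^{\mathbb{C}}$ is a subalgebra of $\mathcal{A}(M)$. Since $\Phi^{c*}_{\alpha\beta}$ is a contact Weyl diffeomorphism, we see that there exists $f_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ on every $V_\alpha \cap V_\beta \neq \emptyset$ such that

$\Phi^{c*}_{\alpha\beta}(e^{i\tilde\tau^\beta}) = e^{i\Phi^{c*}_{\alpha\beta}(\tilde\tau^\beta)} = e^{i(\tilde\tau^\alpha + f_{\alpha\beta})}$. (6.2)

Lemma 6.1. There is $F_{\alpha\beta} \in \mathcal{F}(W_{U_{\alpha\beta}})^{\mathbb{C}}$ such that $e^{i(\tilde\tau^\alpha + f_{\alpha\beta})} = F_{\alpha\beta} * e^{i\tilde\tau^\alpha}$. In particular, we have

$e^{is(\tilde\tau^\alpha + t\nu)} = e^{is\tilde\tau^\alpha} * (1 + 2is\nu)^{\frac{t}{2}} = (1 - 2is\nu)^{-\frac{t}{2}} * e^{is\tilde\tau^\alpha}$. (6.3)

Proof. Consider $\psi(s) = e^{is(\tilde\tau^\alpha + f_{\alpha\beta})} * e^{-is\tilde\tau^\alpha}$. Since $[\nu, \psi(s)] = 0$, $\psi(s)$ does not involve $\tilde\tau^\alpha$. We have

$\dfrac{d}{ds}\psi(s) = e^{is(\tilde\tau^\alpha + f_{\alpha\beta})} * (if_{\alpha\beta}) * e^{-is\tilde\tau^\alpha} = i\psi(s) * e^{is\tilde\tau^\alpha} * f_{\alpha\beta} * e^{-is\tilde\tau^\alpha}$. (6.4)

Put $g(s) = e^{is\tilde\tau^\alpha} * f_{\alpha\beta} * e^{-is\tilde\tau^\alpha}$. Since $g(s) = e^{is\,\mathrm{ad}(\tilde\tau^\alpha)} f_{\alpha\beta}$, we see $g(s) \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$. Thus, we have a differential equation $\frac{d}{ds}\psi(s) = \psi(s) * ig(s)$, where $g(s) \in \mathcal{F}(W_{U_\alpha})^{\mathbb{C}}$ is viewed as a known function. Note that $\mathcal{F}(W_{U_\alpha})^{\mathbb{C}} \cong C^\infty(U_\alpha)^{\mathbb{C}}[[\nu]]$. By the Moyal product formula, the above differential equation can be rewritten as a system of differential equations on $U_\alpha$. It is easy to see that (6.4) has a unique solution in $C^\infty(U_\alpha)^{\mathbb{C}}[[\nu]]$. Note that

$e^{is\tilde\tau^\alpha} * t\nu * e^{-is\tilde\tau^\alpha} = \dfrac{t\nu}{1 - 2is\nu}$.

Equation (6.3) is obtained by solving (6.4) with the above quantity inserted.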
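The conjugation formula and the resulting solution of (6.4) can be checked by a short computation. The sketch below assumes only $[\tilde\tau^\alpha, \nu] = 2\nu^2$ (equivalent to $[\tilde\tau^\alpha, \nu^{-1}] = -2$ and matching (5.10)) and that conjugation by $e^{is\tilde\tau^\alpha}$ is an algebra automorphism; the auxiliary symbols $\nu(s)$ and $\sigma$ are introduced here for illustration.

```latex
% Assumption: [\tilde\tau, \nu] = 2\nu^2; \nu(s) is a hypothetical auxiliary name.
\nu(s) := e^{is\tilde\tau^\alpha} * \nu * e^{-is\tilde\tau^\alpha}, \qquad \nu(0) = \nu,
\\[4pt]
\frac{d}{ds}\,\nu(s)
  = i\, e^{is\tilde\tau^\alpha} * [\tilde\tau^\alpha, \nu] * e^{-is\tilde\tau^\alpha}
  = 2i\,\nu(s)^2
\;\Longrightarrow\;
\nu(s) = \frac{\nu}{1 - 2is\nu} = \sum_{k\ge 0} (2is)^k\,\nu^{k+1},
\\[4pt]
% For f_{\alpha\beta} = t\nu, g(s) = t\,\nu(s) commutes with \psi(s), so (6.4) integrates directly:
\psi(s) = \exp\!\left(\int_0^s \frac{i\,t\,\nu}{1 - 2i\sigma\nu}\,d\sigma\right)
        = \exp\!\left(-\frac{t}{2}\,\log(1 - 2is\nu)\right)
        = (1 - 2is\nu)^{-\frac{t}{2}}.
```

This reproduces the right-hand expression in (6.3); the left-hand form $(1 + 2is\nu)^{t/2}$ follows by moving the factor past $e^{is\tilde\tau^\alpha}$, which replaces $\nu$ by $\nu/(1 + 2is\nu)$.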
Lemma 6.1 shows that $\Phi^{c*}_{\alpha\beta}(g * e^{i\tilde\tau^\beta}) = \Phi^*_{\alpha\beta}(g) * F_{\alpha\beta} * e^{i\tilde\tau^\alpha}$ for any $g \in \mathcal{F}(W_{U_\beta})^{\mathbb{C}}$. Since $e^{t\,\mathrm{ad}(\nu^{-1})} e^{\tilde\tau^\alpha} = e^{\tilde\tau^\alpha + 2t}$, we see that $e^{\frac{t}{2}\mathrm{ad}(\nu^{-1})}$ gives an $S^1 = \{e^{it}\}$ action on $\mathcal{A}(M)$. Hence the relation (6.2) together with Lemma 6.1 can be viewed as a transition rule (coordinate transformation) of a "quantum" $S^1$-principal bundle and the associated line bundle.

Remark that the principal $S^1$-bundle $P_M$ constructed on $M$ via the quantization condition is a contact manifold with a contact form as a connection form whose curvature form is the symplectic form. We denote by $L_M$ the line bundle associated to $P_M$. $\mathcal{A}(M)$ can be viewed as a noncommutative contact algebra $(C^\infty(P_M)^{\mathbb{C}}[[\nu]], *)$ defined on $P_M$. We denote by $\tilde P_M$, $\tilde L_M$ the quantum principal bundle and its associated line bundle respectively, given by the patchwork mentioned above.

We suppose that $W_M$ is a noncommutative Kähler manifold over a Kähler manifold $M$. By Theorem 5.1, there exist QCC-generators $z^\alpha_1, \cdots, z^\alpha_n, z^{*\alpha}_1, \cdots, z^{*\alpha}_n$. As we did in (5.10), joining a new element $\tilde\tau^\alpha$ such that

$[\tilde\tau^\alpha, z^\alpha_i] = \nu z^\alpha_i$, $\quad [\tilde\tau^\alpha, z^{*\alpha}_i] = \nu z^{*\alpha}_i$, $\quad [\tilde\tau^\alpha, \nu] = 2\nu^2$, (6.5)

we construct a family of contact Lie algebras $\{\mathfrak{g}^{\mathbb{C}}_{U_\alpha}\}_\alpha$. Since the $\nu$-isomorphism class of $\mathfrak{g}^{\mathbb{C}}_{U_\alpha}$ depends only on $W^{\mathbb{C}}_{U_\alpha}$, we see that the Poincaré–Cartan cocycle of $\{\mathfrak{g}^{\mathbb{C}}_{U_\alpha}\}_\alpha$ gives the class $c^{(0)}(W_M)$ in the coefficient $\mathbb{C}$, which is assumed to be integral.
Proposition 6.2. For every $V_\alpha$ there is an element $H_\alpha * e^{i\tilde\tau^\alpha} \in A(V_\alpha)$ such that $[z_i^\alpha, H_\alpha * e^{i\tilde\tau^\alpha}] = 0$. Moreover, if $f \in A(V_\alpha)$ satisfies $[z_i^\alpha, f] = 0$ for every $i$, $1 \le i \le n$, then there is a holomorphic function $h(z)$ such that $f$ can be written in the form $h(z) * H_\alpha * e^{i\tilde\tau^\alpha}$.
Proof. We need only to show the first assertion. The inverse Moyal product formula (1.2) for QCC generators $z_1^\alpha, \cdots, z_n^\alpha, z_1^{*\alpha}, \cdots, z_n^{*\alpha}$ gives a commutative product $\circ$. Using the product $\circ$, the $*$-product is given by the Moyal product formula (1.1). Take a function $H(t)$ and put $H_\alpha = H(\sum z_i^\alpha \circ z_i^{*\alpha})$. We consider the system of equations $[z_i^\alpha, H * e^{i\tilde\tau^\alpha}] = 0$. By the Moyal product formula for the above QCC generators, this equals
\[ (H' \circ z_i^\alpha) * e^{i\tilde\tau^\alpha} + H * [z_i^\alpha, e^{i\tilde\tau^\alpha}] = 0. \tag{6.6} \]
Since $e^{i\tilde\tau^\alpha} * z_i^\alpha = z_i^\alpha * e^{i(\tilde\tau^\alpha + \nu)}$, (6.6) is reduced by using Lemma 6.1 to a differential equation
\[ \frac{\nu}{2}\Big(1 + \frac{1}{\sqrt{1 - 2i\nu}}\Big)\frac{d}{dt}H(t) + \Big(1 - \frac{1}{\sqrt{1 - 2i\nu}}\Big)H(t) = 0. \tag{6.7} \]
Equation (6.7) can be solved in $C^\infty(\mathbb R)[[\nu]]$ and we have $H \in C^\infty(V_\alpha)[[\nu]]$.
By Proposition 6.4, we see that for every $V_\alpha \cap V_\beta \ne \emptyset$ there are holomorphic functions $h_{\alpha\beta}$ such that
\[ \hat\Phi^*_{\alpha\beta}(H_\beta * e^{i\tilde\tau^\beta}) = h_{\alpha\beta} * H_\alpha * e^{i\tilde\tau^\alpha}. \tag{6.8} \]
Thus, the quantum line bundle $\tilde L_M$ of $L_M$ is a holomorphic line bundle over $M$ and $A(M)$ can be viewed as the algebra of sections of $\bigoplus_{m \in \mathbb Z} \tilde L_M^m$.
Let $H(M)[[\nu]]$ be the commutative algebra consisting of all holomorphic sections of $\bigoplus_{m \ge 0} \tilde L_M^m$. If $H(M)[[\nu]] \ne \{0\}$, $H(M)[[\nu]]$ can be viewed as a representation space of Weyl functions on $M$. As pointed out in [CGR], the multiplication operator combined with the projection to the space of all holomorphic sections is the essence of the Berezin representation [Be], which coincides with the representation produced by geometric quantization with respect to the Kähler polarization mentioned above.
References
[BCG] Bertelson, M., Cahen, M. and Gutt, S.: Equivalence of star-products. To appear in Commun. Math. Phys.
[Be] Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 8, 153–174 (1975)
[BFL] Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A. and Sternheimer, D.: Deformation theory and quantization I. Ann. of Physics 111, 61–110 (1978)
[CGR] Cahen, M., Gutt, S. and Rawnsley, J.: Quantization of Kähler manifolds, II. Trans. Amer. Math. Soc. 337, 73–98 (1993)
[D] Deligne, D.: Déformation de l'Algèbre des Fonctions d'une Variété Symplectique: Comparison entre Fedosov et De Wilde, Lecomte. Selecta Math. New Series 1, 667–697 (1995)
[DL] De Wilde, M. and Lecomte, P.B.: Existence of star-products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds. Lett. Math. Phys. 7, 487–496 (1983)
[F] Fedosov, B.: Deformation quantization and index theory. Mathematical Topics 9, Basel–Boston: Birkhäuser, 1996
[Ka] Karabegov, A.V.: Deformation quantization with separation of variables on a Kähler manifold. Commun. Math. Phys. 180, 745–755 (1996)
[KM] Karasev, M. and Maslov, V.: Asymptotic and geometric quantization. Russian Math. Surveys 39, no. 6, 133–206 (1984)
[KN] Kobayashi, S. and Nomizu, K.: Foundations of differential geometry II. New York: Wiley, 1969
[KNz] Karasev, M. and Nazaikinskii, V.: On the quantization of rapidly oscillating symbols. Math. USSR Izv. 34, 737–764 (1978)
[NT] Nest, R. and Tsygan, B.: Algebraic index theorem for families. Adv. Math. 113, 151–205 (1995)
[L] Lichnerowicz, A.: Déformations d'algèbres à une variété symplectique (les $*_\nu$-produits). Ann. Inst. Fourier, Grenoble 32, 157–209 (1982)
[O] Omori, H.: Infinite dimensional Lie groups. AMS Translation Monograph 158, Providence, RI: Am. Math. Soc., 1997
[OMY1] Omori, H., Maeda, Y. and Yoshioka, A.: Weyl manifolds and deformation quantization. Adv. Math. 85, 224–255 (1991)
[OMY2] Omori, H., Maeda, Y. and Yoshioka, A.: Deformation quantization of Poisson algebras. Contemp. Math. 179, 213–240 (1994)
[OMY3] Omori, H., Maeda, Y. and Yoshioka, A.: Existence of a closed star product. Lett. Math. Phys. 26, 285–294 (1993)
[OMY4] Omori, H., Maeda, Y. and Yoshioka, A.: A Poincaré–Birkhoff–Witt theorem for infinite dimensional Lie algebras. J. Math. Soc. Japan 46, 25–50 (1994)
[OMMY] Omori, H., Maeda, Y., Miyazaki, N. and Yoshioka, A.: Noncommutative 3-sphere: A model of noncommutative contact algebras. To appear in J. Math. Soc. Japan
[V] Vey, J.: Déformations du crochet de Poisson d'un variété symplectique. Comment. Math. Helv. 50, 421–454 (1975)
[W] Weinstein, A.: Deformation quantization. Séminaire Bourbaki, 46ème année. Astérisque 227, 789, 389–409 (1995)
[X] Xu, P.: Fedosov $*$-products and quantum moment maps. To appear
Communicated by H. Araki
Commun. Math. Phys. 194, 231 – 248 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Stochastic Burgers' Equations and Their Semi-Classical Expansions
A. Truman^1, H. Z. Zhao^{1,2}
1 Department of Mathematics, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, UK. E-mail: [email protected]
2 Department of Mathematics, University of California, Irvine, CA 92697, USA. E-mail: [email protected]
Received: 12 May 1997 / Accepted: 23 February 1998
Abstract: In this paper we use the Hopf-Cole logarithmic transformation and the stochastic Hamilton Jacobi theory to study stochastic heat equations and Burgers’ equations. Before the caustics, the stochastic inviscid Burgers’ equation gives the first term of our semi-classical expansion, i.e. the inviscid limit of the viscous stochastic Burgers’ equation. In order to push our results beyond the inviscid limit, we construct solutions for iterated (stochastic) Hamilton Jacobi continuity equations. Then we give the semiclassical asymptotic expansions for stochastic heat equations and Burgers’ equations by using Nelson’s stochastic mechanical processes with drifts given by the solution of the iterated (stochastic) Hamilton Jacobi continuity equations. The explicit formula for the remainder term is given by a path integral.
1. Introduction and Results
It is well known that the Hopf-Cole logarithmic transformation gives a powerful technique in the study of Burgers' equations (Hopf (1950)). Using this transformation, the semi-classical asymptotic analysis of Elworthy and Truman for the heat equation has led us to discover new inviscid (semi-classical) asymptotic expansions for the stochastic Burgers' equations in Truman and Zhao (1996b). On the other hand, the Hopf-Cole logarithmic transformation has proved to be a powerful technique in the study of heat equations and reaction diffusion equations (Li and Zhao (1996), Zhao (1997)). The idea of the stochastic Hamilton Jacobi theory goes back to the well-known WKB method for the asymptotics of the eigenfunctions of the Schrödinger operator. In the 1960's, Maslov applied the semi-classical Hamilton Jacobi theory and his canonical operator method to obtain the asymptotic expansions of the solutions to the Schrödinger equation and heat equations (Maslov and Fedoryuk (1981)). Truman (1977) gave a simple proof that quantum mechanics tends to classical mechanics as $\hbar \to 0$ and used the Girsanov-Cameron-Martin theorem to obtain analogous results for the diffusion
equation. A more detailed connection between classical mechanics and the diffusion equation was obtained by Elworthy and Truman (1982) using the Brownian bridge process. Recent developments include the applications to nonlinear problems such as approximate travelling wave solutions of the generalized KPP equations in Zhao and Elworthy (1992) and Elworthy, Truman, Zhao and Gaines (1994) and a new discovery of how to formulate and solve a stochastic Hamilton Jacobi equation and continuity equation with random vector and scalar potentials by Truman and Zhao. We have now generalized Hamilton Jacobi theory to include the stochastic heat equations, Burgers’ equations and stochastic Schr¨odinger equations in Truman and Zhao (1996a-c). These equations arise in e.g. nonlinear filtering problems such as the Zakai equation and the dynamics of an interface subject to a random external force such as the KPZ model and a quantum mechanical particle in a random electromagnetic field and quantum filtering problems. In recent years, stochastic Burgers’ equations have attracted the attention of mathematicians, e.g. Sinai (1991), Bertini, Cancrini and Jona-Lasinio (1994), Bertini and Cancrini (1995), Albeverio, Molchanov and Surgailis (1994), Holden, Lindstrom, Oksendal, Uboe and Zhang (1994,1995), Oksendal (1994), Truman and Zhao (1996b), to name but a few. Using a sample path of a stochastic mechanics, Truman and Zhao have constructed the solution of the stochastic Hamilton Jacobi equation following a classical sample path. For a continuous distribution of trajectories with associated density in configuration space, the leading term is given by the continuity equation. In this paper we generalize the results in Truman and Zhao (1996b) giving exact semiclassical expansions up to arbitrarily high order in powers of the viscosity µ. For this, iterated continuity equations and iterated Hamilton-Jacobi-continuity equations are introduced and their solutions are constructed. 
It is natural to study the iterated Hamilton-Jacobi-continuity equations, which are derived formally in the following way. Consider a stochastic viscous Burgers' equation
\[ dv^\mu + (\nabla v^\mu)v^\mu\,dt + (\nabla c)\,dt + (\nabla k)\,dw(t) = \frac12\mu^2\Delta v^\mu\,dt. \]
Here $d$ is the differential with respect to time $t$, $\nabla$ and $\Delta$ are the gradient and the Laplacian with respect to the space variables $x \in \mathbb R^n$, and $w(t)$ is a one dimensional Brownian motion on a probability space $(\Omega, \mathcal F, P)$. Conditions on $c$ and $k$ and initial conditions are discussed later. Let $v^\mu = \nabla S^\mu$ with appropriate initial condition for $S^\mu$. Then $S^\mu$ satisfies the following stochastic Hamilton-Jacobi-Bellman equation:
\[ dS^\mu + \frac12|\nabla S^\mu|^2\,dt + c\,dt + k\,dw(t) = \frac12\mu^2\Delta S^\mu\,dt. \]
But formally if $S^\mu \sim \sum_{j=0}^\infty \mu^{2j}S_j$, then it turns out that
\[ \sum_{j=0}^\infty \mu^{2j}\,dS_j + \frac12\Big|\sum_{j=0}^\infty \mu^{2j}\nabla S_j\Big|^2 dt + c\,dt + k\,dw(t) \sim \frac12\mu^2\sum_{j=0}^\infty \mu^{2j}\Delta S_j\,dt. \]
Comparing coefficients of $\mu^{2j}$ for $j = 0, 1, 2, \cdots$, it is not difficult to find
\[ \frac{\partial}{\partial t}S_j + \frac12\sum_{i_1,i_2\ge0,\ i_1+i_2=j}\nabla S_{i_1}\nabla S_{i_2} = \frac12\Delta S_{j-1}, \]
with the convention $\frac12\Delta S_{-1}(x,t) = -c(x,t) - k(x,t)\dot w_t$. We call these equations (stochastic) Hamilton Jacobi continuity equations. These equations are apparently related to the semiclassical limits of the quantum mechanics (Simon (1979) and Truman (1977)), to the Madelung fluid (see e.g. Guerra (1981)), and to the quantum tunneling problem (Jona-Lasinio, Martinelli and Scoppola (1981)). It seems that similar equations are also related to the quantum field theory (Glimm and Jaffe (1981)). More details will be discussed in our later publications.
The main result of this paper is to construct the solutions of the iterated Hamilton-Jacobi-continuity equations above. The solutions are given by the solutions of the iterated continuity equations. These first $m$ solutions give the first $m$ terms in the asymptotics of the solution for the stochastic heat equations and stochastic Burgers' equations. Furthermore the remainder term is given by the expectation of an exponential involving a Nelson's stochastic mechanical process (Nelson (1985)). Our results and methods for the remainder term, although related to Watling (1992), are different in this important respect as well as being stochastic in that they are applicable to systems with noise. We expect this new approach can be used in the study of the small time asymptotics of a heat kernel on a Riemannian manifold which has been studied by Elworthy (1989). Readers who are interested in deterministic equations can assume $k \equiv 0$ while reading this paper.
Assume that $c \in C^2(\mathbb R^n \times \mathbb R^1)$, $k \in C^2(\mathbb R^n \times \mathbb R^1)$ and $S(-,0) \in C^2(\mathbb R^n)$. Consider a stochastic classical mechanics
\[ \begin{cases} d\dot\Phi_s = -\nabla c(\Phi_s, s)\,ds - \nabla k(\Phi_s, s)\,dw_s, \\ \Phi_0(x) = x, \quad \dot\Phi_0(x) = \nabla S(x,0). \end{cases} \tag{1.1} \]
Here $w_s$ is a one dimensional Brownian motion on the probability space $(\Omega, \mathcal F, P)$. For each $x$, we have a solution $\Phi_s(x)$. Therefore we have a random map $\Phi_s : \Omega \times \mathbb R^n \to \mathbb R^n$ for each $s$. We assume a no caustic condition: there exists a $T(\omega) > 0$ a.s. such that for $0 \le t \le T(\omega)$, $\Phi_t(\omega) : \mathbb R^n \to \mathbb R^n$ is a diffeomorphism for a.e. $\omega \in \Omega$. Such $T(\omega) > 0$ exists if we have some control on $c$, $k$ and $S_0$. We have proved in Truman and Zhao (1996b) that if $\nabla c$, $\nabla^2 c$, $\nabla k$, $\nabla^2 k$, $\nabla S(-,0)$ and $\nabla^2 S(-,0)$ are all bounded, then there exists a $T(\omega) > 0$ a.s. such that if $0 \le s \le T$, $\Phi_s(\omega) : \mathbb R^n \to \mathbb R^n$ is a diffeomorphism for a.e. $\omega \in \Omega$. Define $\tilde S_0 : [0, +\infty) \times \mathbb R^n \to \mathbb R$ by the following non-anticipating Itô's stochastic integral
\[ \tilde S_0(y,t) = \frac12\int_0^t \big|\dot\Phi_s(y)\big|^2 ds + S(y,0) - \int_0^t c(\Phi_s y, s)\,ds - \int_0^t k(\Phi_s y, s)\,dw_s. \]
For $0 \le t \le T(\omega)$, define $S_0(x,t)$ for a.e. $\omega \in \Omega$ by
\[ S_0(x,t) = \tilde S_0(\Phi_t^{-1}x, t). \tag{1.2} \]
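As an illustration of the construction (1.1)–(1.2), the following minimal sketch treats the deterministic, force-free case $c \equiv k \equiv 0$ in one space dimension, where the characteristics are straight lines. The initial datum $S(y,0) = y^2/2$ and all function names are assumptions made for this example only; they are not taken from the paper. For this choice one expects $S_0(x,t) = x^2/(2(1+t))$, and the code checks the Hamilton Jacobi equation by finite differences.

```python
def grad_S_init(y):
    """dS(y,0)/dy for the hypothetical initial datum S(y,0) = y**2 / 2."""
    return y

def S_init(y):
    return 0.5 * y * y

def flow(y, t):
    """Phi_t(y) when c = k = 0: the path leaves y with constant velocity."""
    return y + t * grad_S_init(y)

def inverse_flow(x, t, lo=-1e6, hi=1e6, iters=200):
    """Invert y -> Phi_t(y) by bisection (the map is increasing before caustics)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if flow(mid, t) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def S_tilde0(y, t):
    """(1/2) * int_0^t |dPhi_s/ds|^2 ds + S(y,0); the c and k integrals vanish here."""
    return 0.5 * t * grad_S_init(y) ** 2 + S_init(y)

def S0(x, t):
    """S_0(x,t) = S_tilde0(Phi_t^{-1} x, t), as in (1.2)."""
    return S_tilde0(inverse_flow(x, t), t)

def hj_residual(x, t, h=1e-4):
    """Central-difference residual of dS_0/dt + (1/2)|dS_0/dx|^2 = 0."""
    dS_dt = (S0(x, t + h) - S0(x, t - h)) / (2 * h)
    dS_dx = (S0(x + h, t) - S0(x - h, t)) / (2 * h)
    return dS_dt + 0.5 * dS_dx ** 2
```

For other initial data the same sketch applies as long as `flow` stays increasing in `y`, which is the one dimensional form of the no caustic condition.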
Recall the following theorem in Truman and Zhao (1996a-b):
Theorem 1.1 (Stochastic Hamilton Jacobi equation and continuity equation). Assume that $c \in C^2(\mathbb R^n \times \mathbb R^1)$, $k \in C^2(\mathbb R^n \times \mathbb R^1)$ and $S(-,0) \in C^2(\mathbb R^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$ and $S_0$ is defined by (1.2).
(i) For a.e. $\omega \in \Omega$ and $0 \le t \le T(\omega)$,
\[ \dot\Phi_t = \nabla S_0(\Phi_t, t), \tag{1.3} \]
and $S_0(x,t)$ satisfies the following stochastic Hamilton Jacobi equation
\[ dS_0(x,t) + \frac12|\nabla S_0(x,t)|^2\,dt + c(x,t)\,dt + k(x,t)\,dw_t = 0. \tag{1.4} \]
(ii) Define $\phi(x,t) = \det\Big(\frac{\partial \Phi_t^{-1}x}{\partial x}\Big)$. Then for a.e. $\omega \in \Omega$, any $x \in \mathbb R^n$ and $0 \le t \le T(\omega)$, $\phi(x,t)$ satisfies the following continuity equation
\[ \frac{\partial}{\partial t}\phi(x,t) + \mathrm{div}\,\{\phi(x,t)\nabla S_0(x,t)\} = 0. \tag{1.5} \]
Suppose $T_0 : \mathbb R^n \to \mathbb R$ is positive and $C^\infty$. The function $T_0(x)$ is associated with the initial condition of the stochastic diffusion Eq. (1.13). Define for the random map $\Phi_s(\omega) : \mathbb R^n \to \mathbb R^n$, $T_0(y,t) = T_0(y)$ and for $j = 1, 2, \ldots$
\[ T_j(y,t) = \int_0^t \Big[\phi^{-\frac12}(\cdot,s)\,\Delta\Big(\phi^{\frac12}(\cdot,s)\,T_{j-1}(\Phi_s^{-1}\cdot, s)\Big)\Big]\Big|_{\Phi_s(y)}\,ds \]
and for $j = 0, 1, 2, \ldots$
\[ \psi_j(x,t) = T_j(\Phi_t^{-1}x, t)\sqrt{\phi_t(x)}. \tag{1.6} \]
Then we have the following iterated continuity equations:
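Lemma 1.2 (Iterated continuity equations). For a.e. $\omega \in \Omega$, $\psi_j(x,t)$ defined by (1.6) satisfy the following iterated continuity equations for $0 \le t \le T(\omega)$:
\[ \frac{\partial}{\partial t}\psi_j + \nabla\psi_j \cdot \nabla S_0 = -\frac12\psi_j\Delta S_0 + \Delta(\psi_{j-1}), \qquad j = 0, 1, 2, \ldots \tag{1.7} \]
with the convention $\psi_{-1} \equiv 0$.
Therefore, if we consider the following linear combination of $\psi_j$, we have
\[ \frac{\partial}{\partial t}\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) + \nabla\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) \cdot \nabla S_0(x,t) = -\frac12\Big(\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big)\Delta S_0(x,t) + \frac12\mu^2\Delta\Big(\sum_{j=0}^{m-1} \frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big), \]
which immediately implies
\[ \frac{\partial}{\partial t}\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) + \nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) \cdot \nabla S_0(x,t) = -\frac12\Delta S_0(x,t) + \frac12\mu^2\,\frac{\Delta\Big(\sum_{j=0}^{m-1} \frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big)}{\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t)}. \tag{1.8} \]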
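A worked special case may help fix the notation of (1.5)–(1.7). The choices below ($c \equiv k \equiv 0$, $T_0 \equiv 1$ and $S(x,0) = |x|^2/2$ on $\mathbb R^n$) are assumptions made for illustration only and do not appear in the paper. In this case every quantity is explicit and the $j = 0$ equation in (1.7) can be checked by hand.

```latex
% Illustrative special case (assumed data): c = k = 0, T_0 = 1, S(x,0) = |x|^2/2.
\begin{align*}
\Phi_t(y) &= (1+t)\,y, \qquad \Phi_t^{-1}x = \frac{x}{1+t}, \qquad
S_0(x,t) = \frac{|x|^2}{2(1+t)}, \\
\phi(x,t) &= \det\Big(\frac{\partial \Phi_t^{-1}x}{\partial x}\Big) = (1+t)^{-n}, \qquad
\psi_0(x,t) = \sqrt{\phi_t(x)} = (1+t)^{-n/2}, \\
\frac{\partial}{\partial t}\psi_0 + \nabla\psi_0\cdot\nabla S_0
  &= -\frac{n}{2}(1+t)^{-n/2-1}
   = -\frac12\,\psi_0\,\frac{n}{1+t}
   = -\frac12\,\psi_0\,\Delta S_0 .
\end{align*}
% Since \phi^{1/2} T_0 does not depend on x, its Laplacian vanishes, so
% T_j = 0 and \psi_j = 0 for all j >= 1 in this example.
```

Here the continuity equation (1.5) also holds directly, since $\partial_t\phi = -n(1+t)^{-n-1}$ and $\mathrm{div}\{\phi\nabla S_0\} = n(1+t)^{-n-1}$.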
In the next theorem we give the solutions of the iterated Hamilton-Jacobi-continuity equations. The classical Hamilton Jacobi equation is the first one among these equations.
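Theorem 1.3. Assume that $c \in C^2(\mathbb R^n \times \mathbb R^1)$, $k \in C^2(\mathbb R^n \times \mathbb R^1)$ and $S(-,0) \in C^2(\mathbb R^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$. Then for a.e. $\omega \in \Omega$, the solutions of the following Hamilton Jacobi continuity equations:
\[ \frac{\partial}{\partial t}S_j + \frac12\sum_{i_1,i_2\ge0,\ i_1+i_2=j}\nabla S_{i_1}\nabla S_{i_2} = \frac12\Delta S_{j-1}, \qquad j \ge 0 \tag{1.9} \]
with the convention $\frac12\Delta S_{-1}(x,t) = -c(x,t) - k(x,t)\dot w_t$, for $0 \le t \le T(\omega)$ are given by $S_0$ defined by (1.2), $S_1(x,t) = -\log\psi_0(x,t)$ and for $j \ge 2$,
\[ S_j(x,t) = \frac{1}{2^{j-1}}\Bigg(-\frac{\psi_{j-1}}{\psi_0} + \frac{\sum_{i_1,i_2\ge1,\ i_1+i_2=j-1}\psi_{i_1}\psi_{i_2}}{2\psi_0^2} - \frac{\sum_{i_1,i_2,i_3\ge1,\ i_1+i_2+i_3=j-1}\psi_{i_1}\psi_{i_2}\psi_{i_3}}{3\psi_0^3} + \cdots + (-1)^{j-1}\frac{\psi_1^{j-1}}{(j-1)\psi_0^{j-1}}\Bigg)(x,t). \tag{1.10} \]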
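Formula (1.10) has the shape of a logarithmic series: writing $z = \mu^2$, its right hand side agrees with the coefficient of $z^{j-1}$ in $-\log\sum_i 2^{-i}z^i\psi_i$, which is consistent with the prefactor $\exp\{-S_0/\mu^2\}\sum_j 2^{-j}\mu^{2j}\psi_j$ appearing in (1.14). The sketch below checks this reading with exact rational arithmetic at a single point, treating $\psi_0, \psi_1, \ldots$ as plain numbers. The function names and the sample values are hypothetical, and the generating-function interpretation is an observation made here rather than a statement from the paper.

```python
from fractions import Fraction
from itertools import product

def S_from_formula(j, psi):
    """Right hand side of (1.10) for j >= 2, with psi[i] the value of psi_i."""
    n = j - 1
    total = Fraction(0)
    for k in range(1, n + 1):
        # Sum over ordered k-tuples (i_1, ..., i_k), each >= 1, with total n.
        s = Fraction(0)
        for idx in product(range(1, n + 1), repeat=k):
            if sum(idx) == n:
                term = Fraction(1)
                for i in idx:
                    term *= psi[i]
                s += term
        total += Fraction((-1) ** k, k) * s / psi[0] ** k
    return total / 2 ** n

def minus_log_coeffs(psi, order):
    """Coefficients of z^1..z^order in -log(sum_i 2^{-i} z^i psi_i); entry 0 is unused."""
    a = [Fraction(p) / 2 ** i for i, p in enumerate(psi)]
    a += [Fraction(0)] * (order + 1 - len(a))
    log_c = [Fraction(0)] * (order + 1)
    for n in range(1, order + 1):
        # From A' = A * (log A)':  n a_n = sum_{k=1}^{n} k l_k a_{n-k}.
        acc = sum((k * log_c[k] * a[n - k] for k in range(1, n)), Fraction(0))
        log_c[n] = (a[n] - acc / n) / a[0]
    return [-c for c in log_c]
```

With sample values `psi = [3, 2, 5, 7]` (as `Fraction`s), `S_from_formula(j, psi)` and `minus_log_coeffs(psi, 3)[j - 1]` coincide for `j = 2, 3, 4`.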
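It follows from (1.9) that
\[ \frac{\partial}{\partial t}\Big(\sum_{j=0}^m \mu^{2j}S_j\Big) + \frac12\sum_{j=0}^m \mu^{2j}\Big(\sum_{i_1,i_2\ge0,\ i_1+i_2=j}\nabla S_{i_1}\nabla S_{i_2}\Big) + c + k\dot w(t) = \frac12\mu^2\Delta\Big(\sum_{j=0}^{m-1}\mu^{2j}S_j\Big). \tag{1.11} \]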
First we consider a stochastic process $x_s^\mu$ on a probability space $(\hat\Omega, \hat{\mathcal F}, \hat P)$ defined by the following stochastic differential equation:
\[ \begin{cases} dx_s^\mu = \mu\,dB_s - \nabla S_0(x_s^\mu, t-s)\,ds + \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\,ds, \\ x_0^\mu = x. \end{cases} \tag{1.12} \]
Here $B_s$ is a standard Brownian motion on $\mathbb R^n$ on the probability space $(\hat\Omega, \hat{\mathcal F}, \hat P)$. Note that the process $x_s$ is also on the probability space $(\Omega, \mathcal F, P)$. Therefore, as in Truman and Zhao (1996a-b), in order to make the stochastic integral $\int_0^t k(t-s, x_s)\,dw_{t-s}$ well defined in the Itô sense, we denote $w_s^* = w_{t-s}$ and $\mathcal F_s^*$ is the enlargement of the filtration $\{\mathcal F_s^0\}$, where $\mathcal F_s^0 = \sigma(w_r^* : r \le s)$. Then $w_{t-s} = w_s^*$ is $\mathcal F_s^*$ measurable and $\mathcal F_{s_1}^* \subset \mathcal F_{s_2}^*$ if $s_1 \le s_2$. We are going to use the process $x_s$, if it is $\mathcal F_s^*$ measurable and non-explosive, to construct the solution to the following stochastic diffusion equation of the Stratonovich type:
\[ \begin{cases} du_t^\mu(x) = \Big[\frac12\mu^2\Delta u_t^\mu(x) + \frac{1}{\mu^2}c(x,t)u_t^\mu(x)\Big]dt + \frac{k(x,t)}{\mu^2}u_t^\mu(x) \circ dw_t, \\ u_0^\mu(x) = T_0(x)e^{-S(x,0)/\mu^2}. \end{cases} \tag{1.13} \]
Theorem 1.4. Assume that $c \in C^2(\mathbb R^n \times \mathbb R^1)$, $k \in C^2(\mathbb R^n \times \mathbb R^1)$ and $S(-,0) \in C^2(\mathbb R^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$, $\psi_j(x,t)$ are defined by (1.6) and the stochastic process defined by (1.12) is $\mathcal F_s^*$ measurable and non-explosive. Then for a.e. $\omega \in \Omega$, $0 \le t \le T(\omega)$, the solution of the heat Eq. (1.13) is given by
\[ u_t^\mu(x) = \exp\Big\{-\frac{S_0(x,t)}{\mu^2}\Big\}\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) \times \hat E\exp\Bigg\{\frac{1}{2^{m+1}}\mu^{2(m+1)}\int_0^t \frac{\Delta(\psi_m(x_s^\mu, t-s))}{\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)}\,ds\Bigg\}. \tag{1.14} \]
Formula (1.14) looks neat for the stochastic heat Eq. (1.13). But for the stochastic Burgers' equations the formula we can obtain by taking a logarithmic transformation to (1.14) is then complicated and the remainder term is not explicit (given by a series of which the convergence certainly needs a serious discussion). To avoid this we can use an alternative stochastic process which is slightly different from (1.12): namely the following Nelson's stochastic process defined by
\[ \begin{cases} dy_s^\mu = \mu\,dB_s - \sum_{j=0}^m \mu^{2j}\nabla S_j(y_s^\mu, t-s)\,ds, \\ y_0^\mu = x. \end{cases} \tag{1.15} \]
Here $B_s$ is a standard Brownian motion on $\mathbb R^n$ on the probability space $(\hat\Omega, \hat{\mathcal F}, \hat P)$. Then we can prove the following theorem:
Theorem 1.5. Assume that $c \in C^2(\mathbb R^n \times \mathbb R^1)$, $k \in C^2(\mathbb R^n \times \mathbb R^1)$ and $S(-,0) \in C^2(\mathbb R^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$, $S_j(x,t)$ are defined by (1.10). Suppose that the stochastic process $y_s$ defined by (1.15) is $\mathcal F_s^*$ measurable and non-explosive. Then the solution of the heat Eq. (1.13) is given by
\[ u_t^\mu(x) = \exp\Big\{-\frac{1}{\mu^2}\sum_{j=0}^m \mu^{2j}S_j(x,t)\Big\}\,\hat E\exp\Big\{-\frac12\mu^{2m}\int_0^t \Delta S_m(y_s^\mu, t-s)\,ds + \frac12\sum_{j=m+1}^{2m}\mu^{2(j-1)}\sum_{0\le i_1,i_2\le m,\ i_1+i_2=j}\int_0^t (\nabla S_{i_1}\nabla S_{i_2})(y_s^\mu, t-s)\,ds\Big\}. \tag{1.16} \]
In Eq. (1.13), consider a special case when $T_0 \equiv 1$. The Hopf–Cole logarithmic transformation $v^\mu(x,t) = -\mu^2\nabla\log u_t^\mu(x)$ then gives the solution of the following stochastic Burgers' equation:
\[ \begin{cases} dv_t^\mu(x) + \big(\nabla v_t^\mu(x) \cdot v_t^\mu(x) + \nabla c(x,t)\big)\,dt + \nabla k(x,t)\,dw_t = \frac12\mu^2\Delta v_t^\mu(x)\,dt, \\ v_0^\mu(x) = \nabla S_0(x) = v_0(x). \end{cases} \tag{1.17} \]
Denote $v_j(x,t) = \nabla S_j(x,t)$. It is easy to see that $v_j(x,t)$ satisfy the following iterated Burgers' equation for $j = 0, 1, 2, \cdots$:
\[ \frac{\partial}{\partial t}v_j + \sum_{i_1,i_2\ge0,\ i_1+i_2=j}(\nabla v_{i_1})v_{i_2} = \frac12\Delta v_{j-1} \tag{1.18} \]
with the convention $\frac12\Delta v_{-1} = -\nabla c - \nabla k\,\dot w(t)$ and initial condition $v_0(x,0) = \nabla S_0(x)$, $v_j(x,0) = 0$ for $j = 1, 2, \cdots$. Then the following theorem is a corollary of Theorem 1.5 and the Hopf–Cole transformation.
Theorem 1.6. Assume that $c \in C^2(\mathbb R^n \times \mathbb R^1)$, $k \in C^2(\mathbb R^n \times \mathbb R^1)$ and $S(-,0) \in C^2(\mathbb R^n)$, $\Phi_s$ defined by (1.1) satisfies a no-caustic condition for $0 \le t \le T(\omega)$ and $S_j(x,t)$ are defined by (1.10). Suppose that the stochastic process $y_s$ defined by (1.15) is $\mathcal F_s^*$ measurable and non-explosive. The solution of the stochastic Burgers' Eq. (1.17) is given by
\[ v^\mu(x,t) = \sum_{j=0}^m \mu^{2j}v_j(x,t) - \mu^2\nabla\log\hat E\exp\Big\{-\frac12\mu^{2m}\int_0^t \mathrm{div}\,v_m(y_s^\mu, t-s)\,ds + \frac12\sum_{j=m+1}^{2m}\mu^{2(j-1)}\sum_{0\le i_1,i_2\le m,\ i_1+i_2=j}\int_0^t (v_{i_1}v_{i_2})(y_s^\mu, t-s)\,ds\Big\}. \tag{1.19} \]
Remark. We have solved the viscous stochastic Burgers’ equation up to arbitrarily high order in the viscosity µ2 . The first term in the asymptotic expansion being the solution to the inviscid Burgers’ equations, is a well-known result of Hopf (1950). It is clear from the above formula that iterated Burgers’ equations give the higher order terms of the asymptotics which arise beyond the inviscid limit. The explicit formula for the remainder term is given by the logarithmic derivative of a path integral.
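The Hopf–Cole step behind Theorem 1.6 can be tried numerically in the simplest setting. The sketch below assumes the deterministic, force-free case $c \equiv k \equiv 0$ with $T_0 \equiv 1$ in one space dimension, writes the solution of the heat equation (1.13) with the Gaussian heat kernel of variance $\mu^2 t$, and evaluates $v = -\mu^2\partial_x\log u$ by trapezoid quadrature. The function name, the quadrature window and the grid size are hypothetical choices for this example. For the assumed initial datum $S(y,0) = y^2/2$ the viscous and inviscid solutions coincide, $v(x,t) = x/(1+t)$, which gives a closed form to compare against.

```python
import math

def burgers_velocity(x, t, mu, S_init, half_width=12.0, n=4001):
    """v(x,t) = -mu^2 d/dx log u(x,t) for c = k = 0 and T_0 = 1 (1D sketch).

    With u(x,t) = int G_t(x - y) exp(-S(y,0)/mu^2) dy and G_t the heat kernel
    of variance mu^2 t, differentiating under the integral gives
        v = int ((x - y)/t) w(y) dy / int w(y) dy,
        w(y) = exp(-[(x - y)^2/(2t) + S(y,0)] / mu^2).
    """
    ys = [x - half_width + 2.0 * half_width * i / (n - 1) for i in range(n)]
    phase = [((x - y) ** 2 / (2.0 * t) + S_init(y)) / mu ** 2 for y in ys]
    p0 = min(phase)                      # subtract the minimum to avoid underflow
    w = [math.exp(-(p - p0)) for p in phase]
    w[0] *= 0.5                          # trapezoid end-point weights
    w[-1] *= 0.5
    num = sum(wi * (x - y) / t for wi, y in zip(w, ys))
    den = sum(w)
    return num / den
```

For example, `burgers_velocity(1.0, 1.0, 0.5, lambda y: 0.5 * y * y)` should be close to `0.5`. For other initial data the output can be compared with the inviscid velocity as `mu` decreases, within the time range where no caustics occur.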
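2. Proof of Iterated Continuity Equations and Hamilton Jacobi Continuity Equations
In this section we give the proofs of Lemma 1.2 and Theorem 1.3.
Proof of Lemma 1.2. Differentiating the identity $\Phi_t(\Phi_t^{-1}x) = x$ with respect to $x$ and $t$ implies (denote $y = \Phi_t^{-1}x$)
\[ \nabla_y\Phi_t(\Phi_t^{-1}x)\,\nabla\Phi_t^{-1}x = I \tag{2.1} \]
and
\[ \dot\Phi_t(\Phi_t^{-1}x) + \nabla_y\Phi_t(\Phi_t^{-1}x)\,(\dot\Phi_t^{-1}x) = 0. \tag{2.2} \]
Multiplying both sides of Eq. (2.2) by $\nabla\Phi_t^{-1}x$ and using identity (2.1) and (1.3) we obtain
\[ \dot\Phi_t^{-1}x = -(\nabla\Phi_t^{-1}x)\,\nabla S_0(x,t). \tag{2.3} \]
It turns out from the definitions of $\psi_j$ and $T_j$ that
\begin{align*}
\frac{\partial}{\partial t}\psi_j(x,t) + \nabla\psi_j(x,t) \cdot \nabla S_0(x,t)
&= \nabla_y T_j(\Phi_t^{-1}x,t) \cdot (\dot\Phi_t^{-1}x)\,\phi_t^{\frac12}(x) + \frac{\partial}{\partial t}T_j(\Phi_t^{-1}x,t)\,\phi_t^{\frac12}(x) + T_j(\Phi_t^{-1}x,t)\,\frac{\partial}{\partial t}\phi_t^{\frac12}(x) \\
&\quad + \big((\nabla\Phi_t^{-1}x)^*\nabla_y T_j(\Phi_t^{-1}x,t)\big)\phi_t^{\frac12}(x) \cdot \nabla S_0(x,t) + T_j(\Phi_t^{-1}x,t)\,\nabla\phi_t^{\frac12}(x) \cdot \nabla S_0(x,t) \\
&= \phi^{\frac12}(x,t)\,\nabla_y T_j(\Phi_t^{-1}x,t) \cdot \big[\dot\Phi_t^{-1}x + (\nabla\Phi_t^{-1}x)\nabla S_0(x,t)\big] \\
&\quad + T_j(\Phi_t^{-1}x,t)\Big[\frac{\partial}{\partial t}\phi_t^{\frac12}(x) + \nabla\phi_t^{\frac12}(x) \cdot \nabla S_0(x,t)\Big] + \Delta\Big(\phi^{\frac12}(x,t)\,T_{j-1}(\Phi_t^{-1}x,t)\Big).
\end{align*}
It follows from (2.3) and the continuity Eq. (1.5) that
\[ \frac{\partial}{\partial t}\psi_j(x,t) + \nabla\psi_j(x,t) \cdot \nabla S_0(x,t) = -\frac12\psi_j(x,t)\Delta S_0(x,t) + \Delta(\psi_{j-1}(x,t)). \]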
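Proof of Theorem 1.3. For the case $j = 0$, (1.9) is the stochastic Hamilton Jacobi Eq. (1.4). For $j = 1$, dividing both sides of the continuity equation for $\psi_0$ by $-\psi_0$ we have
\[ \frac{\partial}{\partial t}(-\log\psi_0) + \nabla(-\log\psi_0)\nabla S_0 = \frac12\Delta S_0. \tag{2.4} \]
That is (1.9) for $j = 1$. For $j = 2$, multiplying the iterated continuity equation for $\psi_1$ by $\psi_0$ and the continuity equation for $\psi_0$ by $\psi_1$ we have
\[ \psi_0\frac{\partial}{\partial t}\psi_1 + \psi_0\nabla\psi_1\nabla S_0 = -\frac12\psi_0\psi_1\Delta S_0 + \psi_0\Delta(\psi_0), \]
and
\[ \psi_1\frac{\partial}{\partial t}\psi_0 + \psi_1\nabla\psi_0\nabla S_0 = -\frac12\psi_1\psi_0\Delta S_0. \]
It turns out that
\[ \frac{\partial}{\partial t}\Big(-\frac12\frac{\psi_1}{\psi_0}\Big) + \nabla\Big(-\frac12\frac{\psi_1}{\psi_0}\Big)\nabla S_0 = -\frac12\frac{\Delta(\psi_0)}{\psi_0}. \tag{2.5} \]
But
\[ -\frac12\frac{\Delta\psi_0}{\psi_0} = \frac12\Delta(S_1) - \frac12(\nabla S_1)^2. \]
It turns out from (2.5) that
\[ \frac{\partial}{\partial t}(S_2) + \nabla(S_2)\nabla S_0 = \frac12\Delta(S_1) - \frac12(\nabla S_1)^2. \tag{2.6} \]
For an integer $j \ge 3$, first, multiplying the continuity equation for $\psi_0$ by $\psi_{j-1}$ and the iterated continuity equation for $\psi_{j-1}$ by $\psi_0$ we find
\[ \psi_{j-1}\frac{\partial}{\partial t}\psi_0 + \psi_{j-1}\nabla\psi_0\nabla S_0 = -\frac12\psi_{j-1}\psi_0\Delta S_0, \]
and
\[ \psi_0\frac{\partial}{\partial t}\psi_{j-1} + \psi_0\nabla\psi_{j-1}\nabla S_0 = -\frac12\psi_0\psi_{j-1}\Delta S_0 + \psi_0\Delta(\psi_{j-2}). \]
It turns out that
\[ \frac{\partial}{\partial t}\Big(-\frac{1}{2^{j-1}}\frac{\psi_{j-1}}{\psi_0}\Big) + \nabla\Big(-\frac{1}{2^{j-1}}\frac{\psi_{j-1}}{\psi_0}\Big)\nabla S_0 = -\frac{1}{2^{j-1}}\frac{\Delta(\psi_{j-2})}{\psi_0}. \tag{2.7} \]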
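Second, for any $i_1, i_2 \ge 1$ with $i_1 + i_2 = j - 1$, multiplying the iterated continuity equation for $\psi_{i_1}$ by $\psi_0^2\psi_{i_2}$ and the iterated continuity equation for $\psi_{i_2}$ by $\psi_0^2\psi_{i_1}$ we find
\[ \psi_0^2\psi_{i_2}\frac{\partial}{\partial t}\psi_{i_1} + \psi_0^2\psi_{i_2}\nabla\psi_{i_1}\nabla S_0 = -\frac12\psi_0^2\psi_{i_2}\psi_{i_1}\Delta S_0 + \psi_0^2\psi_{i_2}\Delta(\psi_{i_1-1}), \]
and
\[ \psi_0^2\psi_{i_1}\frac{\partial}{\partial t}\psi_{i_2} + \psi_0^2\psi_{i_1}\nabla\psi_{i_2}\nabla S_0 = -\frac12\psi_0^2\psi_{i_1}\psi_{i_2}\Delta S_0 + \psi_0^2\psi_{i_1}\Delta(\psi_{i_2-1}). \]
It turns out that
\[ \psi_0^2\frac{\partial}{\partial t}(\psi_{i_1}\psi_{i_2}) + \psi_0^2\nabla(\psi_{i_1}\psi_{i_2})\nabla S_0 = -\frac12\psi_0^2 \times 2(\psi_{i_1}\psi_{i_2})\Delta S_0 + \psi_0^2\big(\psi_{i_1}\Delta(\psi_{i_2-1}) + \Delta(\psi_{i_1-1})\psi_{i_2}\big). \]
Then the symmetry of indexes $i_1$ and $i_2$ implies that
\[ \psi_0^2\frac{\partial}{\partial t}\Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Big) + \psi_0^2\nabla\Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Big)\nabla S_0 = -\psi_0^2 \times \Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Big)\Delta S_0 + 2\psi_0^2\Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\Delta(\psi_{i_2-1})\Big). \tag{2.8} \]
On the other hand, multiplying the continuity equation for $\psi_0$ by $2\psi_0\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}$ we have
\[ \Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Big)\frac{\partial}{\partial t}\psi_0^2 + \Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Big)\nabla\psi_0^2\,\nabla S_0 = -\psi_0^2\Big(\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Big)\Delta S_0. \tag{2.9} \]
It follows from (2.8) and (2.9) that
\[ \frac{\partial}{\partial t}\Bigg(\frac{\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}}{2^{j-1} \times 2\psi_0^2}\Bigg) + \nabla\Bigg(\frac{\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}}{2^{j-1} \times 2\psi_0^2}\Bigg)\nabla S_0 = \frac{\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\Delta(\psi_{i_2-1})}{2^{j-1}\psi_0^2}. \tag{2.10} \]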
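By the same method we can prove that
\[ \frac{\partial}{\partial t}\Bigg(-\frac{\sum_{i_1+i_2+i_3=j-1,\ i_1,i_2,i_3\ge1}\psi_{i_1}\psi_{i_2}\psi_{i_3}}{2^{j-1} \times 3\psi_0^3}\Bigg) + \nabla\Bigg(-\frac{\sum_{i_1+i_2+i_3=j-1,\ i_1,i_2,i_3\ge1}\psi_{i_1}\psi_{i_2}\psi_{i_3}}{2^{j-1} \times 3\psi_0^3}\Bigg)\nabla S_0 = -\frac{\sum_{i_1+i_2+i_3=j-1,\ i_1,i_2,i_3\ge1}\psi_{i_1}\psi_{i_2}\Delta(\psi_{i_3-1})}{2^{j-1}\psi_0^3}, \]
\[ \cdots \]
\[ \frac{\partial}{\partial t}\Bigg((-1)^{j-1}\frac{\psi_1^{j-1}}{2^{j-1} \times (j-1)\psi_0^{j-1}}\Bigg) + \nabla\Bigg((-1)^{j-1}\frac{\psi_1^{j-1}}{2^{j-1} \times (j-1)\psi_0^{j-1}}\Bigg)\nabla S_0 = (-1)^{j-1}\frac{\psi_1^{j-2}\Delta\psi_0}{2^{j-1}\psi_0^{j-1}}. \tag{2.11} \]
It turns out from (2.7), (2.10) and (2.11) that
\[ \frac{\partial}{\partial t}S_j + \nabla S_j\nabla S_0 = \frac{1}{2^{j-1}}\Bigg(-\frac{\Delta(\psi_{j-2})}{\psi_0} + \frac{\sum_{i_1+i_2=j-1,\ i_1,i_2\ge1}\psi_{i_1}\Delta(\psi_{i_2-1})}{\psi_0^2} - \frac{\sum_{i_1+i_2+i_3=j-1,\ i_1,i_2,i_3\ge1}\psi_{i_1}\psi_{i_2}\Delta(\psi_{i_3-1})}{\psi_0^3} + \cdots + (-1)^{j-1}\frac{\psi_1^{j-2}\Delta\psi_0}{\psi_0^{j-1}}\Bigg). \]
This leads to
\begin{align*}
\frac{\partial}{\partial t}S_j + \nabla S_j\nabla S_0 = \frac{1}{2^{j-1}}\Bigg(&-\frac{\Delta(\psi_{j-2})}{\psi_0} + \frac{\psi_{j-2}\Delta\psi_0}{\psi_0^2} + \frac{\sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}\psi_{i_1}\Delta(\psi_{i_2})}{\psi_0^2} \\
&- \frac{\big(\sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\big)\Delta\psi_0}{\psi_0^3} - \frac{\sum_{i_1+i_2+i_3=j-2,\ i_1,i_2,i_3\ge1}\psi_{i_1}\psi_{i_2}\Delta(\psi_{i_3})}{\psi_0^3} \\
&+ \cdots + (-1)^{j-2}\frac{\psi_1^{j-3}\Delta\psi_1}{\psi_0^{j-2}} + (-1)^{j-1}\frac{\psi_1^{j-2}\Delta\psi_0}{\psi_0^{j-1}}\Bigg). \tag{2.12}
\end{align*}
Differentiating $S_{i_1}$ for $i_1 \ge 2$ with respect to space variables $x$ we have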
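\begin{align*}
\nabla S_{i_1}(x,t) = \frac{1}{2^{i_1-1}}\Bigg(&-\frac{\nabla\psi_{i_1-1}}{\psi_0} + \frac{\psi_{i_1-1}\nabla\psi_0}{\psi_0^2} + \frac{\sum_{j_1,j_2\ge1,\ j_1+j_2=i_1-1}\nabla(\psi_{j_1})\psi_{j_2}}{\psi_0^2} - \frac{\sum_{j_1,j_2\ge1,\ j_1+j_2=i_1-1}\psi_{j_1}\psi_{j_2}\nabla\psi_0}{\psi_0^3} \\
&- \frac{\sum_{j_1,j_2,j_3\ge1,\ j_1+j_2+j_3=i_1-1}\nabla(\psi_{j_1})\psi_{j_2}\psi_{j_3}}{\psi_0^3} + \frac{\sum_{j_1,j_2,j_3\ge1,\ j_1+j_2+j_3=i_1-1}\psi_{j_1}\psi_{j_2}\psi_{j_3}\nabla\psi_0}{\psi_0^4} \\
&+ \cdots + (-1)^{i_1-1}\frac{\psi_1^{i_1-2}\nabla\psi_1}{\psi_0^{i_1-1}} - (-1)^{i_1-1}\frac{\psi_1^{i_1-1}\nabla\psi_0}{\psi_0^{i_1}}\Bigg).
\end{align*}
Similarly for $i_2 \ge 2$,
\begin{align*}
\nabla S_{i_2}(x,t) = \frac{1}{2^{i_2-1}}\Bigg(&-\frac{\nabla\psi_{i_2-1}}{\psi_0} + \frac{\psi_{i_2-1}\nabla\psi_0}{\psi_0^2} + \frac{\sum_{j_1,j_2\ge1,\ j_1+j_2=i_2-1}\nabla(\psi_{j_1})\psi_{j_2}}{\psi_0^2} - \frac{\sum_{j_1,j_2\ge1,\ j_1+j_2=i_2-1}\psi_{j_1}\psi_{j_2}\nabla\psi_0}{\psi_0^3} \\
&- \frac{\sum_{j_1,j_2,j_3\ge1,\ j_1+j_2+j_3=i_2-1}\nabla(\psi_{j_1})\psi_{j_2}\psi_{j_3}}{\psi_0^3} + \frac{\sum_{j_1,j_2,j_3\ge1,\ j_1+j_2+j_3=i_2-1}\psi_{j_1}\psi_{j_2}\psi_{j_3}\nabla\psi_0}{\psi_0^4} \\
&+ \cdots + (-1)^{i_2-1}\frac{\psi_1^{i_2-2}\nabla\psi_1}{\psi_0^{i_2-1}} - (-1)^{i_2-1}\frac{\psi_1^{i_2-1}\nabla\psi_0}{\psi_0^{i_2}}\Bigg).
\end{align*}
It follows that
\begin{align*}
\sum_{i_1+i_2=j,\ i_1,i_2\ge2}\nabla S_{i_1}\nabla S_{i_2} = \frac{1}{2^{j-2}}\sum_{i_1+i_2=j,\ i_1,i_2\ge2}\Bigg(&\frac{\nabla\psi_{i_1-1}\nabla\psi_{i_2-1}}{\psi_0^2} - \frac{(\nabla\psi_{i_1-1}\,\psi_{i_2-1} + \psi_{i_1-1}\nabla\psi_{i_2-1})\nabla\psi_0}{\psi_0^3} + \frac{\psi_{i_1-1}\psi_{i_2-1}(\nabla\psi_0)^2}{\psi_0^4} \\
&- \frac{\nabla\psi_{i_1-1}\sum_{j_1,j_2\ge1,\ j_1+j_2=i_2-1}\nabla(\psi_{j_1})\psi_{j_2} + \nabla\psi_{i_2-1}\sum_{j_1,j_2\ge1,\ j_1+j_2=i_1-1}\nabla(\psi_{j_1})\psi_{j_2}}{\psi_0^3} + \cdots \\
&+ (-1)^{i_1+i_2-2}\frac{\psi_1^{i_1+i_2-4}(\nabla\psi_1)^2}{\psi_0^{i_1+i_2-2}} - (-1)^{i_1+i_2-2}\frac{2\psi_1^{i_1+i_2-3}\nabla\psi_1\nabla\psi_0}{\psi_0^{i_1+i_2-1}} + (-1)^{i_1+i_2-2}\frac{\psi_1^{i_1+i_2-2}(\nabla\psi_0)^2}{\psi_0^{i_1+i_2}}\Bigg). \tag{2.13}
\end{align*}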
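Note that $\nabla S_1 = -\frac{\nabla\psi_0}{\psi_0}$. Therefore we also have
\begin{align*}
\nabla S_1\nabla S_{j-1} = -\frac{1}{2^{j-2}}\Bigg(&-\frac{\nabla\psi_{j-2}\nabla\psi_0}{\psi_0^2} + \frac{\psi_{j-2}(\nabla\psi_0)^2}{\psi_0^3} + \frac{\sum_{j_1,j_2\ge1,\ j_1+j_2=j-2}\nabla(\psi_{j_1})\psi_{j_2}\nabla\psi_0}{\psi_0^3} - \frac{\sum_{j_1,j_2\ge1,\ j_1+j_2=j-2}\psi_{j_1}\psi_{j_2}(\nabla\psi_0)^2}{\psi_0^4} \\
&+ \cdots + (-1)^{j-2}\frac{\psi_1^{j-3}\nabla\psi_1\nabla\psi_0}{\psi_0^{j-1}} - (-1)^{j-2}\frac{\psi_1^{j-2}(\nabla\psi_0)^2}{\psi_0^{j}}\Bigg). \tag{2.14}
\end{align*}
Finally from (2.12) and (2.13) and (2.14) we obtain that
\begin{align*}
\frac{\partial}{\partial t}S_j + \frac12\sum_{i_1+i_2=j,\ i_1,i_2\ge0}\nabla S_{i_1}\nabla S_{i_2} = \frac{1}{2^{j-1}}\Bigg(&-\frac{\Delta(\psi_{j-2})}{\psi_0} + \frac{\psi_{j-2}\Delta\psi_0}{\psi_0^2} + 2\frac{\nabla\psi_{j-2}\nabla\psi_0}{\psi_0^2} - 2\frac{\psi_{j-2}(\nabla\psi_0)^2}{\psi_0^3} \\
&+ \frac{\sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}\psi_{i_1}\Delta(\psi_{i_2})}{\psi_0^2} - \frac{\sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}\Delta\psi_0}{\psi_0^3} + \frac{\sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}\nabla\psi_{i_1}\nabla\psi_{i_2}}{\psi_0^2} \\
&- \frac{4 \times \sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}(\nabla\psi_{i_1})\psi_{i_2}\nabla\psi_0}{\psi_0^3} + \frac{3 \times \sum_{i_1+i_2=j-2,\ i_1,i_2\ge1}\psi_{i_1}\psi_{i_2}(\nabla\psi_0)^2}{\psi_0^4} \\
&+ \cdots + (-1)^{j-2}\Big(\frac{\psi_1^{j-3}\Delta\psi_1}{\psi_0^{j-2}} - \frac{\psi_1^{j-2}\Delta\psi_0}{\psi_0^{j-1}} + \frac{(j-3)\psi_1^{j-4}(\nabla\psi_1)^2}{\psi_0^{j-2}} - \frac{2(j-2)\psi_1^{j-3}\nabla\psi_1\nabla\psi_0}{\psi_0^{j-1}} + \frac{(j-1)\psi_1^{j-2}(\nabla\psi_0)^2}{\psi_0^{j}}\Big)\Bigg). \tag{2.15}
\end{align*}
Then simple computation implies that the right hand side of the above equals $\frac12\Delta S_{j-1}$. That is to say
\[ \frac{\partial}{\partial t}S_j + \frac12\sum_{i_1+i_2=j,\ i_1,i_2\ge0}\nabla S_{i_1}\nabla S_{i_2} = \frac12\Delta S_{j-1}. \]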
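3. Proof of Theorem 1.4
Proof. Let $x_s^\mu$ be defined by (1.12). For each $\omega \in \Omega$, define a new probability measure $\hat P_1$ by
\[ \frac{d\hat P_1}{d\hat P} = M_t^\mu = \exp\Bigg\{-\frac{1}{\mu}\int_0^t \Big(-\nabla S_0(x_s^\mu, t-s) + \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big)dB_s - \frac{1}{2\mu^2}\int_0^t \Big|-\nabla S_0(x_s^\mu, t-s) + \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big|^2 ds\Bigg\}. \tag{3.1} \]
Then for each $\omega \in \Omega$, $x_s^\mu$ is a Brownian motion on $(\hat\Omega, \hat{\mathcal F}, \hat P_1)$ with variance $\mu^2$. Therefore $x_s^\mu$ is isometric to $B_s^\mu = x + \mu B_s$ on the probability space $(\hat\Omega, \hat{\mathcal F}, \hat P)$. Let $E_{\hat P}$ be the expectation with respect to the measure $\hat P$. We write it as $\hat E$ for simplicity. Denote by $E_{\hat P_1}$ the expectation with respect to the measure $\hat P_1$. By the Feynman–Kac formula, the solution of the heat Eq. (1.13) is given by
\begin{align*}
u_t^\mu(x) &= E_{\hat P}\,T_0(B_t^\mu)\exp\Big\{-\frac{S_0(B_t^\mu)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(B_s^\mu, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(B_s^\mu, t-s)\,dw_{t-s}\Big\} \\
&= E_{\hat P_1}\,T_0(x_t^\mu)\exp\Big\{-\frac{S_0(x_t^\mu)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(x_s^\mu, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(x_s^\mu, t-s)\,dw_{t-s}\Big\},
\end{align*}
provided $x_s^\mu$ is $\mathcal F_s^*$ measurable and non-explosive so that $\int_0^t k(x_s^\mu, t-s)\,dw_{t-s}$ is well defined in Itô's sense. By the Maruyama–Girsanov–Cameron–Martin formula we have
\[ u_t^\mu(x) = E_{\hat P}\,T_0(x_t^\mu)\exp\Big\{-\frac{S_0(x_t^\mu)}{\mu^2} + \frac{1}{\mu^2}\int_0^t c(x_s^\mu, t-s)\,ds + \frac{1}{\mu^2}\int_0^t k(x_s^\mu, t-s)\,dw_{t-s}\Big\} \cdot M_t^\mu. \tag{3.2} \]
Applying Itô's formula to $S_0(x_s^\mu, t-s) - \mu^2\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)$ we have
\begin{align*}
S_0(x_t, 0) - \mu^2\log T_0(x_t) &= S_0(x,t) - \mu^2\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) + \int_0^t \Big[d_s S_0(x_s^\mu, t-s) - \mu^2\frac{\partial}{\partial s}\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\,ds\Big] \\
&\quad + \int_0^t \Big[\nabla S_0(x_s^\mu, t-s) - \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big] \times \Big[\mu\,dB_s - \nabla S_0(x_s^\mu, t-s)\,ds + \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\,ds\Big] \\
&\quad + \frac{\mu^2}{2}\int_0^t \Big[\Delta S_0(x_s^\mu, t-s) - \mu^2\Delta\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big]ds.
\end{align*}
It turns out that
\begin{align*}
\frac{1}{\mu}\int_0^t \Big[\nabla S_0(x_s^\mu, t-s) - \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big]dB_s
&= \frac{1}{\mu^2}\Big[S_0(x_t, 0) - \mu^2\log T_0(x_t) - S_0(x,t) + \mu^2\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big] \\
&\quad + \frac{1}{\mu^2}\int_0^t \Big\{-d_s S_0(x_s^\mu, t-s) + \Big|\nabla S_0(x_s^\mu, t-s) - \mu^2\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big|^2 ds\Big\} \\
&\quad + \int_0^t \Big\{\frac{\partial}{\partial s}\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s) - \frac12\Delta S_0(x_s^\mu, t-s) + \frac12\mu^2\Delta\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big\}ds.
\end{align*}
Note that
\[ \Delta\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s) = \frac{\Delta\Big(\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big)}{\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)} - \Big|\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big|^2. \]
It follows that
\begin{align*}
M_t^\mu = \exp\Bigg\{&\frac{1}{\mu^2}\Big[S_0(x_t, 0) - \mu^2\log T_0(x_t) - S_0(x,t) + \mu^2\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t)\Big] \\
&+ \frac{1}{\mu^2}\int_0^t \Big(-d_s S_0(x_s^\mu, t-s) + \frac12|\nabla S_0(x_s^\mu, t-s)|^2 ds\Big) + \frac12\int_0^t \Big|\mu\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big|^2 ds \\
&+ \int_0^t \Big[\frac{\partial}{\partial s}\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s) - \frac12\Delta S_0(x_s^\mu, t-s) - \nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s) \cdot \nabla S_0(x_s^\mu, t-s) \\
&\qquad + \frac12\mu^2\frac{\Delta\Big(\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big)}{\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)} - \frac12\mu^2\Big|\nabla\log\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)\Big|^2\Big]ds\Bigg\}. \tag{3.3}
\end{align*}
By using the stochastic Hamilton Jacobi Eq. (1.4) and the continuity Eq. (1.8) we have
\[ u_t^\mu(x) = \exp\Big\{-\frac{S_0(x,t)}{\mu^2}\Big\}\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x,t) \times \hat E\exp\Bigg\{\frac{1}{2^{m+1}}\mu^{2(m+1)}\int_0^t \frac{\Delta(\psi_m(x_s^\mu, t-s))}{\sum_{j=0}^m \frac{1}{2^j}\mu^{2j}\psi_j(x_s^\mu, t-s)}\,ds\Bigg\}. \]
4. Proof of Theorem 1.5 and Theorem 1.6
Proof of Theorem 1.5. Let $y_s^\mu$ be defined by (1.15). For each $\omega \in \Omega$, define a new probability measure $\hat P_1$ by
\[ \frac{d\hat P_1}{d\hat P} = M_t^\mu = \exp\Bigg\{-\frac{1}{\mu}\int_0^t \Big(-\sum_{j=0}^m \mu^{2j}\nabla S_j(y_s^\mu, t-s)\Big)dB_s - \frac{1}{2\mu^2}\int_0^t \Big|-\sum_{j=0}^m \mu^{2j}\nabla S_j(y_s^\mu, t-s)\Big|^2 ds\Bigg\}. \tag{4.1} \]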
246
A. Truman, H. Z. Zhao
ˆ Pˆ1 ) with variance µ2 . Therefore ˆ F, Then for each ω ∈ , ysµ is a Brownian motion on (, µ µ ˆ By the Maruyama– ˆ P). ˆ F, ys is isometric to Bs = x + µBs on the probability space (, Girsanov–Cameron–Martin formula and Feynman–Kac formula, similar to (3.2), we have Z t S0 (ytµ ) 1 + c(ysµ , t − s) ds uµt (x) = EPˆ T0 (ytµ ) exp − µ2 µ2 0 Z t 1 µ k(ys , t − s) dwt−s · Mµt . + 2 µ 0 Applying Itˆo’s formula to
m P j=0
µ2j Sj (ysµ , t − s) we have
S0 (yt , 0) − µ2 log T0 (yt ) Z tX m m X µ2j Sj (x, t) + µ2j ds Sj (ysµ , t − s)ds = 0
j=0
j=0
Z t X m m X µ2j ∇Sj (ysµ , t − s) × µ dBs − µ2j ∇Sj (ysµ , t − s) ds + 0
j=0
Z m µ2 t X 2j + µ 1Sj (ysµ , t − s) ds . 2 0
j=0
j=0
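It turns out that
\[
\begin{aligned}
\frac{1}{\mu} \int_0^t \sum_{j=0}^m \mu^{2j} \nabla S_j(y_s^\mu, t-s)\, dB_s
={}& \frac{1}{\mu^2} \Big[ S_0(y_t,0) - \mu^2 \log T_0(y_t) - \sum_{j=0}^m \mu^{2j} S_j(x,t) \Big] \\
&+ \frac{1}{\mu^2} \int_0^t \Big\{ -\sum_{j=0}^m \mu^{2j}\, d_s S_j(y_s^\mu, t-s) + \Big| \sum_{j=0}^m \mu^{2j} \nabla S_j(y_s^\mu, t-s) \Big|^2 ds \Big\} \\
&- \frac{1}{2} \int_0^t \sum_{j=0}^m \mu^{2j} \Delta S_j(y_s^\mu, t-s)\, ds .
\end{aligned}
\tag{4.2}
\]
It follows that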
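\[
\begin{aligned}
M_t^\mu = \exp\Big\{ & \frac{1}{\mu^2} \Big[ S_0(y_t,0) - \mu^2 \log T_0(y_t) - \sum_{j=0}^m \mu^{2j} S_j(x,t) \Big] \\
&+ \frac{1}{\mu^2} \int_0^t \Big[ -\sum_{j=0}^m \mu^{2j}\, d_s S_j(y_s^\mu, t-s) + \frac{1}{2} \Big| \sum_{j=0}^m \mu^{2j} \nabla S_j(y_s^\mu, t-s) \Big|^2 ds \Big] \\
&- \frac{1}{2} \int_0^t \sum_{j=0}^m \mu^{2j} \Delta S_j(y_s^\mu, t-s)\, ds \Big\} .
\end{aligned}
\]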
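But
\[
\Big| \sum_{j=0}^m \mu^{2j} \nabla S_j \Big|^2
= \sum_{j=0}^m \mu^{2j} \sum_{\substack{i_1, i_2 \ge 0 \\ i_1 + i_2 = j}} \nabla S_{i_1} \nabla S_{i_2}
+ \sum_{j=m+1}^{2m} \mu^{2j} \sum_{\substack{0 \le i_1, i_2 \le m \\ i_1 + i_2 = j}} \nabla S_{i_1} \nabla S_{i_2} .
\]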
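The splitting of the squared sum into the orders $j \le m$ and $m < j \le 2m$ is a finite Cauchy product. As a sanity check only, the following minimal Python sketch verifies it numerically, with scalar stand-ins `a[j]` used in place of the gradients $\nabla S_j$. The function names and sample values are illustrative assumptions and do not come from the paper.

```python
def direct_square(a, mu):
    """Square of sum_{j=0}^{m} mu^(2j) * a[j], computed directly."""
    return sum(mu ** (2 * j) * a_j for j, a_j in enumerate(a)) ** 2


def split_square(a, mu):
    """Return (low, high): the orders j <= m and m < j <= 2m of the square.

    Each order j collects the pairs (i1, i2) with 0 <= i1, i2 <= m and
    i1 + i2 = j, exactly as in the double sums of the identity above.
    """
    m = len(a) - 1
    low = 0.0
    high = 0.0
    for j in range(2 * m + 1):
        pair_sum = sum(a[i1] * a[j - i1]
                       for i1 in range(m + 1) if 0 <= j - i1 <= m)
        if j <= m:
            low += mu ** (2 * j) * pair_sum
        else:
            high += mu ** (2 * j) * pair_sum
    return low, high


if __name__ == "__main__":
    coefficients = [1.0, -2.0, 0.5]   # hypothetical values standing in for grad S_j
    low, high = split_square(coefficients, mu=0.3)
    print(low + high, direct_square(coefficients, mu=0.3))
```

In the proof, only the `high` part survives in the final formula, since the orders $j \le m$ are cancelled by the equations (1.11) for the $S_j$.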
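It turns out by using (1.11) that
\[
\begin{aligned}
u_t^\mu(x) = \exp\Big\{ -\frac{1}{\mu^2} \sum_{j=0}^m \mu^{2j} S_j(x,t) \Big\} \times \hat E \exp\Big\{ & -\frac{1}{2} \mu^{2m} \int_0^t \Delta S_m(y_s^\mu, t-s)\, ds \\
&+ \frac{1}{2} \int_0^t \sum_{j=m+1}^{2m} \mu^{2(j-1)} \sum_{\substack{0 \le i_1, i_2 \le m \\ i_1 + i_2 = j}} (\nabla S_{i_1} \nabla S_{i_2})(y_s^\mu, t-s)\, ds \Big\} .
\end{aligned}
\]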
Proof of Theorem 1.6. The result for the Burgers' equation follows by applying the logarithmic transformation to (1.16).

Acknowledgement. One of us, HZZ, would like to thank Professor D. Williams FRS for his invitation to visit the University of Bath and Professors B. Oksendal and T. Lindstrom for inviting him to visit the University of Oslo. It is our great pleasure to thank Professor K.D. Elworthy, Professor W.A. Zheng and Dr. Z. Brzezniak for helpful conversations. We would like to acknowledge the support of EPSRC Grants GR/L37823 and GR/K70397.
References
1. Albeverio, S., Molchanov, S. and Surgailis, D.: Stratified structure of the universe and Burgers equation – A probability approach. Probab. Theory Relat. Fields 100, 457–484 (1994)
2. Bertini, L., Cancrini, N. and Jona-Lasinio, G.: The stochastic Burgers equation. Commun. Math. Phys. 165, 211–232 (1994)
3. Bertini, L. and Cancrini, N.: The stochastic heat equation: Feynman–Kac formula and intermittence. J. Stat. Phys. 78, 1377–1401 (1995)
4. Da Prato, G., Debussche, A. and Temam, R.: Stochastic Burgers equation. Preprint di mathematica n.27 'Scuola Normale Superiore' Pisa (1993)
5. Elworthy, K. D.: Geometric Aspects of Diffusions on Manifolds. In: Hennequin, P. L. (ed.) Ecole d'Eté Probabilité de Saint–Flour XV–XVII 1985, 1987, Lecture Notes in Mathematics 1362, Berlin–Heidelberg–New York: Springer-Verlag, 1989, pp. 276–425
6. Elworthy, K. D.: Stochastic differential equations on manifolds. London Mathematical Society Lecture Notes 70, Cambridge: Cambridge University Press, 1989
7. Elworthy, K. D. and Truman, A.: The diffusion equation and classical mechanics: An elementary formula. In: Albeverio, S. et al (eds.) Stochastic Processes in Quantum physics, Lecture Notes in Physics 173, Berlin: Springer, 1982, pp. 136–146
8. Elworthy, K. D., Truman, A., Zhao, H. Z. and Gaines, J.: Approximate travelling waves for the generalized KPP equations and classical mechanics. Proc. R. Soc. Lond., Series A 446, 529–554 (1994)
9. Glimm, J. and Jaffe, A.: Quantum Physics. New York: Springer-Verlag, 1981
10. Guerra, F.: Structural aspects of stochastic mechanics and stochastic field theory. Phys. Rep. 77, 263–312 (1981)
11. Hopf, E.: The partial differential equation $u_t + u u_x = \mu u_{xx}$. Comm. Pure and Appl. Math. 3, 201–230 (1950)
12. Holden, H., Lindstrom, T., Oksendal, B., Uboe, J. and Zhang, T. S.: The Burgers' equation with a noise force and the stochastic heat equations. Comm. PDE. 19, 119–141 (1994)
13. Holden, H., Lindstrom, T., Oksendal, B., Uboe, J. and Zhang, T. S.: The stochastic Wick-type Burgers equation. In: Etheridge, A.M. (ed.), Stochastic Partial Differential Equations, (London Mathematical Society Lecture Note Series 216), Cambridge: Cambridge University Press, 1995, pp. 141–161
14. Jona-Lasinio, G., Martinelli, F. and Scoppola, E.: The semiclassical limit of quantum mechanics. Phys. Rep. 77, 313–327 (1981)
15. Li, X.M. and Zhao, H.Z.: Gradient estimates and the smooth convergence of approximate travelling waves for reaction diffusion equations. Nonlinearity 9, 459–477 (1996)
16. Maslov, V.P. and Fedoryuk, M.V.: The quasiclassical approximation for equations of quantum mechanics. Dordrecht: Reidel Publishing Comp., 1981
17. Nelson, E.: Quantum Fluctuations. Princeton, NJ: Princeton University Press, 1985
18. Oksendal, B.: Stochastic partial differential equations and applications to hydrodynamics. In: Cardoso, A.I., de Faria, M., Potthoff, J. and Streit, L. (eds.) Stochastic Analysis and Applications in Physics, (Nato ASI Series 449), Dordrecht: Kluwer, 1994, pp. 283–305
19. Simon, B.: Functional Integration and Quantum Physics. New York: Academic Press, 1979
20. Sinai, Ya. G.: Two results concerning asymptotic behaviour of solutions of the Burgers equation with force. J. Stat. Phys. 64, 1–12 (1991)
21. Truman, A.: Classical mechanics, the diffusion (heat) equation and the Schrödinger equation. J. Math. Phys. 18, 2308–2315 (1977)
22. Truman, A. and Zhao, H. Z.: The stochastic Hamilton Jacobi equation, stochastic heat equation and Schrödinger equation. In: Davies, I. M., Truman, A. and Elworthy, K. D. (eds), Stochastic Analysis and Applications, Singapore: World Scientific, 1996a, pp. 441–464
23. Truman, A. and Zhao, H. Z.: On stochastic diffusion equations and stochastic Burgers' equations. J. Math. Phys. 37, 283–307 (1996b)
24. Truman, A. and Zhao, H. Z.: Quantum mechanics of charged particles in random electromagnetic fields. J. Math. Phys. 37, 3180–3197 (1996c)
25. Watling, K. D.: Formulae for solutions to (possibly degenerate) diffusion equations exhibiting semiclassical asymptotics. In: Truman, A. and Davies, I. M. (eds.), Stochastics and Quantum Mechanics, Singapore: World Scientific, 1992, pp. 248–271
26. Zhao, H.Z.: On the gradients of the travelling waves for the generalized KPP equations. Proc. R. Soc. Edinb. 127, 423–439 (1997)
27. Zhao, H.Z. and Elworthy, K.D.: The travelling wave solutions of scalar generalized KPP equations via classical mechanics and stochastics approaches. In: Truman, A. and Davies, I. M. (eds.), Stochastics and Quantum Mechanics, Singapore: World Scientific, 1992, pp. 298–316

Communicated by D. Brydges
Commun. Math. Phys. 194, 249 – 295 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Continuous Renormalization for Fermions and Fermi Liquid Theory
Manfred Salmhofer
Mathematik, ETH-Zentrum, 8092 Zürich, Switzerland. E-mail: [email protected]
Received: 14 May 1997 / Accepted: 6 October 1997
Abstract: I derive a Wick ordered continuous renormalization group equation for fermion systems and show that a determinant bound applies directly to this equation. This removes factorials in the recursive equation for the Green functions, and thus improves the combinatorial behaviour. The form of the equation is also ideal for the investigation of many-fermion systems, where the propagator is singular on a surface. For these systems, I define a criterion for Fermi liquid behaviour which applies at positive temperatures. As a first step towards establishing such behaviour in d ≥ 2, I prove basic regularity properties of the interacting Fermi surface to all orders in a skeleton expansion. The proof is a considerable simplification of previous ones.
1. Introduction

In this paper, I begin a study of fermionic quantum field theory by a continuous Wick ordered renormalization group equation (RGE). As an example, I take the standard many-fermion system of solid state quantum field theory, but the method applies to general fermionic models with short-range interactions. I show that a determinant of propagators appears in the RGE and I use a determinant bound to prove that a factorial which would appear in bosonic theories is removed from the recursion for the fermionic Green functions. This may lead to convergence of perturbation theory in the absence of relevant couplings, but I do not address the convergence problem, which is related to the solution of a particular combinatorial recursion, in this paper. A short account of this work has appeared in [13].

Continuous RGEs were invented by Wegner [1] and Wilson [2]. Polchinski [3] found a beautiful way to use them for a proof of perturbative renormalizability of $\phi^4$ theory. His method was simplified in [5], and extended to composite operator renormalization and to gauge theories by Keller and Kopper [6]. Keller [7] also proved local Borel summability. While equivalent to the Gallavotti-Nicolò [8–10] method, the continuous
RGE is much simpler technically. An application of continuous RG methods to nonperturbative bosonic problems [11, 12] requires many new ideas and a combination with cluster expansion techniques, to control the combinatorics. It is one of the points of this paper that for fermions, the straightforward adaption of the method yields a determinant bound which improves the combinatorics of fermionic theories as compared to bosonic ones, and may lead to nonperturbative bounds. The determinant structure is not visible in the form of the flow equation used in [3, 5, 6], because the flow equation in that form is a one-loop equation which has too little structure. A key ingredient for the present analysis is Wick ordering, which was first used in the context of continuous RGEs for scalar field theories by Wieczerkowski [4]. I show that for fermions, the Wick ordered RGE contains a determinant of propagators to which a Gram inequality applies directly. A closer look at the way the Feynman graph expansion is generated by the RGE shows that the sign cancellations bring the combinatorial factors for fermions nearer to that of a planar field theory. This reduction does, however, not lead to a planar field theory in the strict sense because of certain binomial factors in the recursion. The direct application of the determinant bound shown here requires the interaction between the fermions to be short-range. This prevents a straightforward application to systems with abelian gauge fields by simply integrating over the gauge fields. One model to which the method applies directly is the Gross-Neveu model (which has been constructed rigorously [14, 15]). I show here only the most basic power counting bounds by leaving out all relevant and marginal couplings, but it is possible to take them into account by renormalization. A class of physically realistic models with a short-range interaction is that of nonrelativistic many-fermion models. 
In these models, there is a significant complication of the analysis because the singularity of the fermion propagator in momentum space is not at a point, but instead on the Fermi surface, which is a (d − 1)-dimensional subset of momentum space. Only in one dimension, the singularity is pointlike – the “surface” becomes a point. The interest in these models has resurged recently because of the discovery of hightemperature superconductivity. Before that it was taken for granted that Fermi liquid (FL) behaviour holds in all dimensions d ≥ 2, and Luttinger liquid behaviour in one dimension (the latter has been proven [16–18]). At certain doping values, however, strong deviations from FL behaviour are seen in the high-Tc materials. The discussion following these discoveries revealed that the former arguments for FL behaviour contained logical gaps. Like [19–25], the present work is not aimed at an understanding of these deviations, but at the more modest goal of determining first under which conditions FL behaviour occurs. Before doing so, it is necessary to give a definition of what would constitute FL behaviour. At zero temperature, the noninteracting Fermi gas has a discontinuity in the occupation number density; this is also a property one would require of a zero temperature FL. This discontinuity is absent for Luttinger liquids. It is, however, not sufficient for FL behaviour because there is a one-dimensional model which has both such a discontinuity and some Luttinger liquid features in the spectral density [27]. Moreover, in the standard models of many-fermion systems such a step never occurs because superconductivity sets in below a critical temperature, and it smooths out the step in the zero-temperature Fermi distribution. It is thus desirable to give a definition of FL behaviour at temperatures above the critical temperature for superconductivity. 
This is not at all straightforward because there is no clean characterization of FL behaviour at a fixed temperature. I propose to look at a whole range of temperatures and values of the coupling constant to bring out the characteristic features of a FL.
I define an equilibrium Fermi liquid as a system in which the perturbation expansion in the coupling constant λ converges for the skeleton Green functions in the region |λ| log β small enough (here β is the inverse temperature) and where the selfenergy fulfills certain regularity conditions. The skeleton Green functions are defined in detail in Sect. 6; in them, selfenergy insertions are left out, so that the Fermi surface stays fixed. I discuss in Sect. 7 how they are related to the exact Green functions (a suitable choice of the Wick ordering covariance can be used to take the selfenergy insertions into account). The logarithmic dependence of the radius of convergence on β comes from the Cooper instability; the difference between Fermi liquids and Luttinger liquids is in the regularity properties of the selfenergy. This is discussed in detail in Sect. 2.6. The goal is to show that the standard many-fermion systems are Fermi liquids in that sense. A proof of this requires a combination of the regularity techniques of [21–23] for renormalization with the sector technique of [20] in the determinant bound (see Sect. 2.6 for further discussion). I do not give a complete proof in this paper but only part of it, by showing a determinant bound and some of the required regularity properties of the selfenergy in perturbation theory. The hope is that the determinant bound will lead to convergence, so that the method developed here, which is somewhat simpler than e.g. the one in [20], will work nonperturbatively. A different representation for fermionic Green functions that provides a simplification and leads to nonperturbative bounds is given in [28]. In Sect. 2, I review the Grassmann integral for many-fermion systems briefly, to give a self-contained motivation for the study of such systems, and to fix notation. The Fermi liquid criterion is formulated in Sect. 2.6. Sect. 3 contains the general renormalization group equation and the determinant formula Eq. (53). Sect. 
4 contains the determinant bound and an application to systems where the propagator has point singularities. In Sect. 5, I show the existence of the thermodynamic limit for the many-fermion system in perturbation theory. In Sect. 6, I prove bounds on the skeleton selfenergy that are needed to renormalize the full theory in perturbation theory. Again, the RGE in the form of [3, 5, 6] would not be very convenient for this because it is a one-loop equation, whereas the crucial effects for regularity all start at two loops. They can be seen in a simple way in the Wick ordered RGE. Details about Wick ordering and the derivation of the determinant formula Eq. (53) are deferred to the Appendix.
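2. Many-Fermion Systems

The model is defined on a spatial lattice with spacing $\varepsilon$. Continuum models are obtained in the limit $\varepsilon \to 0$; lattice models, such as the Hubbard model, are obtained by fixing $\varepsilon$. Let $d \ge 2$ be the spatial dimension, $\varepsilon > 0$, $L \in \mathbb{R}$ be such that $\frac{L}{2\varepsilon} \in \mathbb{N}$, and let $G$ be any lattice of maximal rank in $\mathbb{R}^d$, e.g., $G = \mathbb{Z}^d$. Let $\Lambda$ be the torus $\Lambda = \varepsilon G / L G$. The number of points of this lattice is $|\Lambda| = (\frac{L}{\varepsilon})^d$. Let $\int_\Lambda d\mathbf{x}\, F(\mathbf{x}) = \varepsilon^d \sum_{\mathbf{x} \in \Lambda} F(\mathbf{x})$ and $\delta_\Lambda(\mathbf{x}, \mathbf{x}') = \varepsilon^{-d} \delta_{\mathbf{x},\mathbf{x}'}$. Let $\mathcal{F}_\Lambda$ be the Fock space generated by the spin one half fermion operators satisfying the canonical anticommutation relations [29], i.e. for all $\mathbf{x}, \mathbf{x}' \in \Lambda$,
\[
c_\alpha(\mathbf{x})\, c^+_{\alpha'}(\mathbf{x}') + c^+_{\alpha'}(\mathbf{x}')\, c_\alpha(\mathbf{x}) = \delta_{\alpha\alpha'}\, \delta_\Lambda(\mathbf{x}, \mathbf{x}').
\tag{1}
\]
Here $\alpha \in \{-1, 1\}$ is the spin of the fermion in units of $\frac{\hbar}{2}$. The free part of the Hamiltonian $H_\Lambda(c, c^+) = H_0 + \lambda V$ is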
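\[
H_0 = - \sum_{\alpha \in \{-1,1\}} \int_\Lambda d\mathbf{x} \int_\Lambda d\mathbf{y}\; T(\mathbf{x}, \mathbf{y})\, c^+_\alpha(\mathbf{x})\, c_\alpha(\mathbf{y}).
\tag{2}
\]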
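The lattice normalization $\delta_\Lambda(\mathbf{x}, \mathbf{x}') = \varepsilon^{-d} \delta_{\mathbf{x},\mathbf{x}'}$ in the anticommutation relations (1) can be made concrete on a toy Fock space. The following minimal Python sketch is an illustration under stated assumptions, not the construction used in the paper: it takes only two lattice sites and a single spin species, represents the operators by a Jordan-Wigner construction as $4 \times 4$ real matrices, and rescales them by $\varepsilon^{-d/2}$. All function names are hypothetical.

```python
from itertools import product

ANNIHILATE = [[0.0, 1.0], [0.0, 0.0]]   # single-mode annihilation operator
PARITY = [[1.0, 0.0], [0.0, -1.0]]      # Jordan-Wigner sign factor
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(ab[0]))]
            for i in range(len(ab))]


def lattice_operators(eps, d):
    """Annihilation operators c(x) for two sites, scaled by eps^(-d/2)."""
    scale = eps ** (-d / 2.0)
    site_ops = [kron(ANNIHILATE, IDENTITY), kron(PARITY, ANNIHILATE)]
    return [[[scale * entry for entry in row] for row in op] for op in site_ops]


def car_holds(eps, d, tol=1e-9):
    """Check {c(x), c^+(y)} = eps^(-d) delta_{xy} and {c(x), c(y)} = 0."""
    ops = lattice_operators(eps, d)
    for x, y in product(range(2), repeat=2):
        # The matrices are real, so the adjoint c^+ is the transpose.
        mixed = anticommutator(ops[x], transpose(ops[y]))
        same = anticommutator(ops[x], ops[y])
        delta = eps ** (-d) if x == y else 0.0
        for i, j in product(range(4), repeat=2):
            expected = delta if i == j else 0.0
            if abs(mixed[i][j] - expected) > tol * max(1.0, delta):
                return False
            if abs(same[i][j]) > tol:
                return False
    return True


if __name__ == "__main__":
    print(car_holds(eps=0.5, d=2))
```

The factor $\varepsilon^{-d/2}$ is what turns the Kronecker delta of the unscaled matrices into the lattice delta function $\delta_\Lambda$, so that $\int_\Lambda d\mathbf{x}\, \delta_\Lambda(\mathbf{x}, \mathbf{x}') = 1$.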
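For a one-band model on a lattice with fixed spacing $\varepsilon$, $T(\mathbf{x}, \mathbf{y}) = t_{\mathbf{x}-\mathbf{y}} = t_{\mathbf{y}-\mathbf{x}}$ describes hopping from a site $\mathbf{y}$ to another site $\mathbf{x}$ with an amplitude $t_{\mathbf{x}-\mathbf{y}} = t_{\mathbf{y}-\mathbf{x}}$. The interaction is multiplied by a small coupling constant $\lambda$; I assume it to be a normal ordered density-density interaction
\[
V(c, c^+) = - \int_\Lambda d\mathbf{x} \int_\Lambda d\mathbf{y} \sum_{\alpha, \sigma \in \{-1,1\}} v(\mathbf{x} - \mathbf{y})\, c^+_\alpha(\mathbf{x})\, c^+_\sigma(\mathbf{y})\, c_\alpha(\mathbf{x})\, c_\sigma(\mathbf{y}).
\tag{3}
\]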
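In other words, it is a special type of a four-fermion interaction. For instance, the simplest Hubbard model is given by $\lambda = \frac{U}{2}$, where $U$ is the usual Hubbard-$U$, and by $v(\mathbf{x} - \mathbf{y}) = \delta_\Lambda(\mathbf{x}, \mathbf{y})$, and the hopping term is $t_{\mathbf{x}-\mathbf{y}} = t$ if $|\mathbf{x} - \mathbf{y}| = 1$ and zero otherwise, where $t$ is the hopping parameter. At temperature $T$ and chemical potential $\mu$, the grand canonical partition function is given by $Z_\Lambda = \mathrm{tr}_{\mathcal{F}_\Lambda}\, e^{-\beta(H_\Lambda - \mu N_\Lambda)}$ with $N_\Lambda = \int_\Lambda d\mathbf{x}\, n(\mathbf{x})$, and $\beta = \frac{1}{k_B T}$. Observables are given by expectation values of functions, mainly polynomials, of the $c$ and $c^+$,
\[
\langle O \rangle_\Lambda = \frac{1}{Z_\Lambda}\, \mathrm{tr} \left[ e^{-\beta(H_\Lambda - \mu N_\Lambda)}\, O(c, c^+) \right].
\tag{4}
\]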
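For orientation, the trace formula (4) can be evaluated by hand in the simplest possible case. The sketch below is a minimal illustration, assuming a single fermionic mode with $H = E\, c^+ c$ and $N = c^+ c$ (so the Fock space has only the two states $|0\rangle$ and $|1\rangle$) and units with $k_B = 1$. It checks that the expectation value of the occupation number is the Fermi function. The function names are hypothetical.

```python
import math


def thermal_occupation(energy, mu, beta):
    """<n> = tr(e^{-beta (H - mu N)} n) / Z for one fermionic mode.

    The two Fock states |0> and |1> have (H, N) eigenvalues (0, 0)
    and (energy, 1), so the trace is a sum over two Boltzmann weights.
    """
    weights = [math.exp(-beta * (h - mu * n))
               for h, n in ((0.0, 0), (energy, 1))]
    partition_function = sum(weights)
    return weights[1] / partition_function


def fermi_function(energy, mu, beta):
    """Closed form 1 / (exp(beta (E - mu)) + 1)."""
    return 1.0 / (math.exp(beta * (energy - mu)) + 1.0)


if __name__ == "__main__":
    for beta in (0.5, 5.0, 50.0):
        print(beta, thermal_occupation(0.3, 0.1, beta),
              fermi_function(0.3, 0.1, beta))
```

As $\beta$ grows, the occupation as a function of `energy` approaches a step at `energy == mu`. This is the zero-temperature discontinuity of the noninteracting Fermi gas mentioned in the Introduction.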
A basic question is whether the expected values of observables have a finite thermodynamic limit and whether an expansion in $\lambda$ can be used to get their behaviour at small or zero temperature $T$. For instance, one would like to expand the two-point function $\langle c^+(x) c(y) \rangle_\Lambda = \sum_{r=0}^\infty \lambda^r G^{L,\varepsilon}_{2,r}(x, y)$. It is by now well-known that the result of a naive expansion of $e^{-\lambda V}$ in powers of $\lambda$ is that at $T = 0$, $\lim_{L \to \infty} G^{L,\varepsilon}_{2,r} = \infty$ for all $r \ge 3$ (see, e.g., [19, 21]). At positive temperature $T$, this unrenormalized expansion converges for $|\lambda| \le \mathrm{const}\; T^d$; see Sect. 4. To get a better $T$-dependence of the radius of convergence, one has to renormalize. Because of the BCS instability, the best one can hope for in general is a bound $|\lambda| \log \frac{1}{T} < \mathrm{const}$ for the region of convergence. This is part of the Fermi liquid criterion formulated below.

2.1. Grassmann integral representation. The standard Grassmann integral representation is obtained by applying the Lie product formula
\[
e^{-\beta(H_\Lambda - \mu N_\Lambda)} = \lim_{n_\tau \to \infty} \left( e^{-\varepsilon_\tau (H_0 - \mu N_\Lambda)}\, e^{-\varepsilon_\tau \lambda V} \right)^{n_\tau}
\tag{5}
\]
to the trace for $Z_\Lambda$ and $\langle O \rangle$. The spacing in the imaginary-time direction is $\varepsilon_\tau = \frac{\beta}{n_\tau}$. The limit exists in operator norm because on the finite lattice $\Lambda$, all operators are just finite-dimensional matrices. Inserting the orthonormal basis of $\mathcal{F}_\Lambda$ between the factors in Eq. (5) and rearranging, I get $Z_\Lambda = \lim_{n_\tau \to \infty} Z_{\Lambda, n_\tau}$, where $Z_{\Lambda, n_\tau}$ is given by a finite-dimensional Grassmann integral, as follows. Let $n_\tau$ be even and $\mathbb{T} = \{\tau = n \varepsilon_\tau : n \in \mathbb{Z},\ -\frac{n_\tau}{2} \le n < \frac{n_\tau}{2}\}$, let $\boldsymbol{\Lambda} = \mathbb{T} \times \Lambda$, and $\mathcal{A}$ be the Grassmann algebra generated by $\psi_\sigma(x)$, $\bar\psi_\sigma(x)$, with $\sigma \in \{1, -1\}$ and $x = (\tau, \mathbf{x}) \in \boldsymbol{\Lambda}$. Fix some ordering on $\boldsymbol{\Lambda}$ and denote the usual Grassmann measure [30, 31] by $D_{\boldsymbol{\Lambda}}\psi\, D_{\boldsymbol{\Lambda}}\bar\psi = \prod_{x, \sigma} d\psi_\sigma(x)\, d\bar\psi_\sigma(x)$. Then Eq. (5) implies $Z_{\Lambda, n_\tau} = \mathcal{N}_\Lambda \int D_{\boldsymbol{\Lambda}}\psi\, D_{\boldsymbol{\Lambda}}\bar\psi\; e^{-S_\Lambda(\psi, \bar\psi)}$, where $\mathcal{N}_\Lambda$ is a normalization factor that depends on $\varepsilon$, $L$, and $n_\tau$, and where
\[
S_\Lambda(\psi, \bar\psi) = \int_{\mathbb{T}} d\tau \left( \sum_\sigma \int_\Lambda d\mathbf{x}\; \bar\psi_\sigma(\tau, \mathbf{x})\, \partial_\tau \psi_\sigma(\tau, \mathbf{x}) - H_\Lambda(\psi(\tau), \bar\psi(\tau)) \right).
\tag{6}
\]
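Here I have used the notations $\psi(\tau)(\mathbf{x}) = \psi(\tau, \mathbf{x})$, $\int_{\mathbb{T}} d\tau\, F(\tau) = \varepsilon_\tau \sum_{\tau \in \mathbb{T}} F(\tau)$, and $\partial_\tau \psi(\tau) = \varepsilon_\tau^{-1} (\psi_\sigma(\tau + \varepsilon_\tau) - \psi_\sigma(\tau))$, and the sum over $\tau$ runs over $\mathbb{T}$, with antiperiodic boundary conditions [32]. For $n_\tau < \infty$ and $L < \infty$, this is a finite-dimensional Grassmann integral. The limit $n_\tau \to \infty$, and afterwards $L \to \infty$, will be taken only for the effective action. No infinite-dimensional Grassmann integration will be required. To do the Fourier transformation, it will be convenient to deal with periodic functions defined on an interval of double length in $\tau$, and to impose the antiperiodicity as an antisymmetry condition: let $\mathbb{T}_2 = \varepsilon_\tau \mathbb{Z} / 2\beta \mathbb{Z}$, in other words, $\mathbb{T}_2 = \{\tau \in \varepsilon_\tau \mathbb{Z} : -\beta \le \tau < \beta\}$ with periodic boundary conditions. Thus the fields $\psi$ and $\bar\psi$ are periodic with respect to translations of $\tau$ by $2\beta$, and antiperiodicity with respect to translations by $\beta$ is imposed by setting
\[
\overset{(-)}{\psi}(\tau + \beta, \mathbf{x}) = - \overset{(-)}{\psi}(\tau, \mathbf{x})
\tag{7}
\]
for all $\mathbf{x} \in \Lambda$. With the further notation $\int_{\boldsymbol{\Lambda}} dx\, F(x) = \int_{\mathbb{T}} d\tau \int_\Lambda d\mathbf{x}\, F(\tau, \mathbf{x})$, and $\delta_{\boldsymbol{\Lambda}}(x, x') = \varepsilon_\tau^{-1} \varepsilon^{-d}\, \delta_{\tau,\tau'}\, \delta_{\mathbf{x},\mathbf{x}'}$, the action is $S(\psi, \bar\psi) = S_2(\psi, \bar\psi) + \lambda S_4(\psi, \bar\psi)$, where
\[
S_2(\psi, \bar\psi) = \sum_{\sigma, \sigma'} \int_{\boldsymbol{\Lambda}} dx \int_{\boldsymbol{\Lambda}} dx'\; \bar\psi_\sigma(x)\, a(x, \sigma, x', \sigma')\, \psi_{\sigma'}(x')
\tag{8}
\]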
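with
\[
a(x, \sigma, x', \sigma') = \delta_{\sigma\sigma'} \left( (\partial_\tau + \mu)\, \delta_{\boldsymbol{\Lambda}}(x, x') - T(\mathbf{x}, \mathbf{x}')\, \delta_{\mathbb{T}_2}(\tau, \tau') \right)
\tag{9}
\]
and
\[
S_4(\psi, \bar\psi) = \sum_{\sigma, \sigma'} \int_{\boldsymbol{\Lambda}} dx \int_{\boldsymbol{\Lambda}} dx'\; \bar\psi_\sigma(x)\, \psi_\sigma(x)\, v(\tau, \mathbf{x}, \tau', \mathbf{x}')\, \bar\psi_{\sigma'}(x')\, \psi_{\sigma'}(x')
\tag{10}
\]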
with $v(\tau, \mathbf{x}, \tau', \mathbf{x}') = \delta_{\mathbb{T}_2}(\tau, \tau')\, v(\mathbf{x} - \mathbf{x}')$. For the present work, the interaction does not have to be instantaneous. Retardation effects, like from phonons, are allowed. That is, $v(\tau, \mathbf{x}, \tau', \mathbf{x}')$ may have a dependence on $\tau$ and need not be local in $\tau$. The operator $a$ appearing in $S_2$ is invertible because the antiperiodicity condition removes the zero modes of the discretized time derivative. In other words, the Matsubara frequencies for fermions are nonzero at positive temperature (this will become explicit in the next section).

2.2. The propagator in Fourier space. Fourier transformation with the antiperiodicity conditions Eq. (7) is described in Appendix A. The Fourier transforms of $\psi$ and $\bar\psi$ are
\[
\overset{(-)}{\hat\psi}_\sigma(p) = \int_{\boldsymbol{\Lambda}} dx\; e^{-ipx}\, \overset{(-)}{\psi}_\sigma(x),
\tag{11}
\]
where, for $p = (\omega, \mathbf{p})$ and $x = (\tau, \mathbf{x})$, $px = \omega\tau + \mathbf{p}\mathbf{x}$. If $\Lambda^*$ is the dual lattice to $\Lambda$, the momentum $p$ is in $\boldsymbol{\Lambda}^* = \mathbb{M}_{n_\tau} \times \Lambda^*$, where
\[
\mathbb{M}_{n_\tau} = \left\{ \omega_n = \frac{\pi}{\beta}(2n + 1) : n \in \mathbb{Z},\ -\frac{n_\tau}{2} \le n < \frac{n_\tau}{2} \right\}
\tag{12}
\]
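is the set of Matsubara frequencies $\omega_n$. With the notation $\int_{\boldsymbol{\Lambda}^*} dp\, F(p) = \frac{1}{\beta} \sum_{\omega \in \mathbb{M}_{n_\tau}} \int_{\Lambda^*} d\mathbf{p}\, F(\omega, \mathbf{p})$, where $\int_{\Lambda^*} d\mathbf{p} = L^{-d} \sum_{\mathbf{p} \in \Lambda^*}$, the inverse Fourier transform is $\psi_\sigma(x) = \int_{\boldsymbol{\Lambda}^*} dp\; e^{ipx}\, \hat\psi_\sigma(p)$. The Fourier transform of the hopping term is $\hat T(\mathbf{p}, \mathbf{q}) = \delta_{\Lambda^*}(\mathbf{p} + \mathbf{q}, 0)\, \tilde T(\mathbf{q})$, with $\tilde T(\mathbf{q}) = \int_\Lambda d\mathbf{z}\; e^{i\mathbf{q}\mathbf{z}}\, t_{\mathbf{z}}$. Denoting
\[
E(\mathbf{p}) = \tilde T(\mathbf{p}) - \mu,
\tag{13}
\]
where $\mu$ is the chemical potential,
\[
\hat\omega = \frac{1}{i \varepsilon_\tau} \left( e^{i \varepsilon_\tau \omega} - 1 \right),
\tag{14}
\]
and $\delta_{\boldsymbol{\Lambda}^*}(p + p', 0) = \delta_{\Lambda^*}(\mathbf{p} + \mathbf{p}', 0)\, \varepsilon_\tau^{-1}\, \delta_{-\omega, \omega'}$, the Fourier transform of the operator $a$ in the quadratic part of the action is
\[
\hat a(p, \sigma, p', \sigma') = \delta_{\boldsymbol{\Lambda}^*}(p + p', 0)\, \delta_{\sigma\sigma'} \left( i \hat\omega' - E(\mathbf{p}') \right).
\tag{15}
\]
In other words, the matrix with entries $\hat a(p, \sigma, -p', \sigma')$ is diagonal, and for temperature $T = \frac{1}{\beta} > 0$, all diagonal entries are nonzero because
\[
|\mathrm{Re}\, \hat\omega| = |\omega| \left| \frac{\sin(\varepsilon_\tau \omega)}{\varepsilon_\tau \omega} \right| \ge \frac{1}{2\beta}.
\tag{16}
\]
Thus the inverse of $a$, the propagator $c = a^{-1}$, exists; it has the Fourier transform
\[
\hat c(p, \sigma, p', \sigma') = \delta_{\boldsymbol{\Lambda}^*}(p + p', 0)\, \delta_{\sigma\sigma'}\, \frac{1}{i \hat\omega' - E(\mathbf{p}')}.
\tag{17}
\]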
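The statements about the discretized Matsubara frequencies can be checked numerically. The following minimal Python sketch builds the set (12), evaluates $\hat\omega$ from (14), and tests the lower bound $|\mathrm{Re}\, \hat\omega| \ge \frac{1}{2\beta}$ of (16) on every frequency of the set. The function names are illustrative assumptions, and the sketch is a numerical check, not a proof.

```python
import cmath
import math


def matsubara_frequencies(beta, n_tau):
    """Frequencies omega_n = pi (2n + 1) / beta for -n_tau/2 <= n < n_tau/2."""
    if n_tau % 2 != 0:
        raise ValueError("n_tau is assumed to be even")
    half = n_tau // 2
    return [math.pi * (2 * n + 1) / beta for n in range(-half, half)]


def omega_hat(omega, eps_tau):
    """Lattice version of omega: (exp(i eps_tau omega) - 1) / (i eps_tau)."""
    return (cmath.exp(1j * eps_tau * omega) - 1.0) / (1j * eps_tau)


def min_abs_real_part(beta, n_tau):
    """Smallest |Re omega_hat| over the Matsubara set, with eps_tau = beta / n_tau."""
    eps_tau = beta / n_tau
    return min(abs(omega_hat(w, eps_tau).real)
               for w in matsubara_frequencies(beta, n_tau))


if __name__ == "__main__":
    for beta in (1.0, 10.0):
        for n_tau in (2, 8, 64):
            print(beta, n_tau, min_abs_real_part(beta, n_tau), 1.0 / (2.0 * beta))
```

The real part of $\hat\omega$ is $\sin(\varepsilon_\tau \omega)/\varepsilon_\tau$, which becomes the imaginary part of the diagonal entry $i\hat\omega - E$ in (15). That is why a lower bound on it keeps the entry away from zero for every real $E$.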
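In the formal continuum limit $\varepsilon_\tau \to 0$, $\hat\omega \to \omega$, so one gets the usual formula $(i\omega - E(\mathbf{p}))^{-1}$. The partition function of the system of independent fermions ($\lambda = 0$) is
\[
\int D_{\boldsymbol{\Lambda}}\psi\, D_{\boldsymbol{\Lambda}}\bar\psi\; e^{(\bar\psi, A \psi)} = \det A = \prod_p \left( i\, \widehat{\omega(p)} - E(\mathbf{p}) \right),
\tag{18}
\]
which is nonzero by Eq. (16).

2.3. The class of models. Denote the dual to $G$ by $\mathcal{B}$, the first Brillouin zone of the infinite lattice. For instance, for $G = \varepsilon \mathbb{Z}^d$, $\mathcal{B} = \mathbb{R}^d / \frac{2\pi}{\varepsilon} \mathbb{Z}^d$. The assumptions for the class of models are: there is $k_0 \ge 2$ such that the dispersion relation $E \in C^{k_0}(\mathcal{B}, \mathbb{R})$, and for all $\mathbf{p} \in \mathcal{B}$, $E(-\mathbf{p}) = E(\mathbf{p})$ holds. The interaction $\hat v$ is a $C^{k_0}$ function from $\mathbb{R} \times \mathcal{B}$ to $\mathbb{R}$, all its derivatives up to order $k_0$ are bounded functions on $\mathcal{B} \times \mathbb{R}$, $\hat v(-p_0, \mathbf{p}) = \hat v(p_0, \mathbf{p})$, and the limit $p_0 \to \infty$ of $\hat v$ exists and is $C^{k_0}$ in $\mathbf{p}$. There is $g_0 > 0$ such that for all $\mathbf{p}$ on the Fermi surface $S = \{\mathbf{p} : E(\mathbf{p}) = 0\}$, $|\nabla E(\mathbf{p})| \ge g_0$ holds. The Fermi surface is a subset of an $\varepsilon$-independent bounded region of momentum space (hence compact), it is strictly convex and has positive curvature everywhere. In particular, there is $V_1 > 0$ such that for all $L$ and $\varepsilon$,
\[
\int_{\Lambda^*} d\mathbf{k}\; \mathbb{1}\left( |E(\mathbf{k})| \le 2 \right) \le V_1 .
\tag{19}
\]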
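The constant
\[
E_{\max} = \sup_{\mathbf{p} \in \mathcal{B}} |E(\mathbf{p})|
\tag{20}
\]
is independent of $\varepsilon$. Under these hypotheses, there is $\epsilon_0 > 0$ and a $C^2$-diffeomorphism $\boldsymbol{\pi}$ from $(-2\epsilon_0, 2\epsilon_0) \times S^{d-1}$ to an open neighbourhood of the Fermi surface $S$ in $\mathcal{B}$, $(\rho, \theta) \mapsto \boldsymbol{\pi}(\rho, \theta)$, such that
\[
E(\boldsymbol{\pi}(\rho, \theta)) = \rho \quad \text{and} \quad |\partial_\rho \boldsymbol{\pi}(\rho, \theta)| \le \frac{2}{g_0}
\tag{21}
\]
(see [21], Lemma 2.1, and [22], Sect. 2.2; $\epsilon_0$ was called $r_0$ there). Let $J(\rho, \theta) = \det \boldsymbol{\pi}'(\rho, \theta)$ and denote
\[
J_0 = \sup_{\substack{|\rho| \le \epsilon_0 \\ \theta \in S^{d-1}}} |J(\rho, \theta)| \quad \text{and} \quad J_1 = \sup_{|\rho| \le \epsilon_0} \int_{S^{d-1}} d\theta\, |J(\rho, \theta)| .
\tag{22}
\]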
I assume that $\epsilon_0 \le 1$ and (for convenience in stating some bounds) that $\beta \epsilon_0 = \frac{\epsilon_0}{k_B T} \ge 6$. With the units chosen in a natural way, i.e., with typical bandwidths of electron volts, this corresponds to temperatures $T$ up to 1000 Kelvin if $\epsilon_0$ is of order one, which seems a sufficient temperature range to study conduction in crystals. Note, however, that $\epsilon_0$ depends on the Fermi surface and thus on the filling factor. A typical example is the discretized Laplacian
\[
E(\mathbf{k}) = B\, \frac{1}{\varepsilon^2} \sum_{\nu=1}^d \left( 1 - \cos(\varepsilon k_\nu) \right) - \mu .
\tag{23}
\]
For $\varepsilon \to 0$ and $B = \frac{1}{m}$, $E(\mathbf{k}) \to \mathbf{k}^2/2m - \mu$, the Jellium dispersion relation, which satisfies the above hypotheses if large $|\mathbf{k}|$ are cut off. For $\varepsilon = 1$ and $B = t$, one gets the tight-binding dispersion relation with hopping parameter $t/2$, which satisfies the above hypotheses if $\mu \ne td$ (half-filling). In the limit $\mu \to td$, $\epsilon_0 \to 0$ in the Hubbard model. This implies that to have bounds uniform in the filling, one has to stay away from half-filling. The energy $\epsilon_0$ sets the scale where the low-energy behaviour sets in. The effective four-point interaction at that scale can differ substantially from the original interaction. For a discussion, see Sect. 7 and [24].
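The two limiting cases of the discretized Laplacian (23) are easy to check numerically. The sketch below is a minimal illustration with hypothetical function names and sample values. It compares the lattice dispersion at small $\varepsilon$ with $\mathbf{k}^2/2m - \mu$, and it evaluates the tight-binding case $\varepsilon = 1$, $B = t$ at half-filling $\mu = td$ in $d = 2$.

```python
import math


def lattice_dispersion(k, eps, B, mu):
    """E(k) = (B / eps^2) * sum_nu (1 - cos(eps * k_nu)) - mu, as in Eq. (23)."""
    return B / eps ** 2 * sum(1.0 - math.cos(eps * k_nu) for k_nu in k) - mu


def jellium_dispersion(k, m, mu):
    """Continuum limit k^2 / (2m) - mu."""
    return sum(k_nu ** 2 for k_nu in k) / (2.0 * m) - mu


if __name__ == "__main__":
    k = (0.3, -0.7)
    m, mu = 2.0, 0.5
    # Continuum limit: eps -> 0 with B = 1/m.
    for eps in (1.0, 0.1, 0.001):
        print(eps, lattice_dispersion(k, eps, 1.0 / m, mu),
              jellium_dispersion(k, m, mu))

    # Half-filling in d = 2: eps = 1, B = t, mu = t * d.
    t, d = 1.0, 2
    for point in ((math.pi, 0.0), (math.pi / 2, math.pi / 2), (0.0, math.pi)):
        print(point, lattice_dispersion(point, 1.0, t, t * d))
```

At half-filling the three sample points all satisfy $E = 0$ and lie on the straight line $k_1 + k_2 = \pi$. The Fermi surface therefore contains flat pieces and is not strictly convex, which is consistent with the exclusion of $\mu = td$ above.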
2.4. Nambu formalism. It will be useful for deriving the component form of the RGE to rename the Grassmann variables such that the distinction between $\psi$ and $\bar\psi$ is in another index. This is a variant of the usual "Nambu formalism"; see, e.g., [33]. Let
\[
\Gamma = \boldsymbol{\Lambda} \times \{-1, 1\} \times \{1, 2\},
\tag{24}
\]
and denote $X = (x, \sigma, i) \in \Gamma$. For $x \in \boldsymbol{\Lambda}$ and $\sigma \in \{-1, 1\}$, the fields are defined as $\psi(x, \sigma, 1) = \bar\psi_\sigma(x)$ and $\psi(x, \sigma, 2) = \psi_\sigma(x)$. The antiperiodicity condition reads $\psi(x + \beta e_\tau, \sigma, i) = -\psi(x, \sigma, i)$ with $e_\tau$ the unit vector in $\tau$-direction. The Grassmann algebra generated by the $(\psi(X))_{X \in \Gamma}$ is denoted by $\mathcal{A}_\Gamma[\psi]$. Given another set of Grassmann variables $(\eta(X))_{X \in \Gamma}$, the Grassmann algebra generated by the $\psi$ and $\eta$ is denoted by $\mathcal{A}_\Gamma[\psi, \eta]$. Furthermore, denote $\int_\Gamma dX\, F(X) = \sum_{i=1}^2 \sum_{\sigma \in \{-1,1\}} \int_{\boldsymbol{\Lambda}} dx\, F(x, \sigma, i)$ and $\delta_\Gamma((x, \sigma, i), (x', \sigma', i')) = \delta_{ii'}\, \delta_{\sigma\sigma'}\, \delta_{\boldsymbol{\Lambda}}(x, x')$, and define a bilinear form on $\mathcal{A}_\Gamma[\psi, \eta]$ by
\[
(\psi, \eta)_\Gamma = \int_\Gamma dX\; \psi(X)\, \eta(X) = -(\eta, \psi)_\Gamma .
\tag{25}
\]
Then $S_2 = \frac{1}{2} (\psi, A\psi)_\Gamma$, where, for $X = (x, \sigma, i)$ and $X' = (x', \sigma', i')$,
\[
A(X, X') = \begin{cases} 0 & \text{if } i = i' \\ a(x, \sigma, x', \sigma') & \text{if } i = 1 \text{ and } i' = 2 \\ -a(x', \sigma', x, \sigma) & \text{if } i = 2 \text{ and } i' = 1 \end{cases}
\tag{26}
\]
with $a$ given by Eq. (9). In other words, when written as a matrix in the index $i$, $A$ takes the form
\[
\begin{pmatrix} 0 & a \\ -a^T & 0 \end{pmatrix}
\tag{27}
\]
with $(a^T)(x, \sigma, x', \sigma') = a(x', \sigma', x, \sigma)$ denoting the transpose of $a$. Since $a$ is invertible, $A$ is invertible as well.

With this, $Z_{\Lambda, n_\tau} = \mathcal{N}_\Lambda \det a\; \tilde Z_\Lambda$, where $\tilde Z_\Lambda = \int d\mu_C(\psi)\, e^{-\lambda S_4(\psi)}$, where $C = A^{-1}$, and $d\mu_C$ is the linear functional ("Grassmann Gaussian measure") defined by $d\mu_C(\psi) = (\det a)^{-1} D_\Gamma\psi\; e^{\frac{1}{2}(\psi, A\psi)_\Gamma}$. The constant $\mathcal{N}_\Lambda \det a$ drops out of all correlation functions and can therefore be omitted. The "measure" $d\mu_C$ is normalized, $\int d\mu_C(\psi) = 1$, and its characteristic function is
\[
\int d\mu_C(\psi)\; e^{(\eta, \psi)_\Gamma} = e^{\frac{1}{2}(\eta,\, C \eta)_\Gamma} .
\tag{28}
\]
All moments of $d\mu_C$ can be obtained by differentiating Eq. (28) with respect to $\eta$ and setting $\eta = 0$; see also the next subsection.

2.5. The connected Green functions. In the correspondence between the system, as defined by the Hamiltonian $H_\Lambda$ and $\mathcal{F}_\Lambda$, to the Grassmann integral, I have so far only discussed the partition function itself. In the path integral representation of Eq. (4), with a polynomial observable $O(c, c^+)$, one simply gets a factor $O(\psi(0), \bar\psi(0))$ in the Grassmann integral. The $m$-point Green functions of the system determined by $C$ and $V$ are
\[
\left\langle \prod_{k=1}^m \psi(X_k) \right\rangle = \frac{1}{Z_\Lambda} \int d\mu_C(\psi)\; e^{-\lambda V(\psi)} \prod_{k=1}^m \psi(X_k) .
\tag{29}
\]
They determine the expected values of all polynomials by linearity. On a finite lattice, the limit $n_\tau \to \infty$ of $\tilde Z_\Lambda$ exists by the Lie product formula Eq. (5), and for $\lambda$ small enough (depending on $L$, $\varepsilon$, $\beta$, and $\mu$), it is nonzero since the trace of the matrix $e^{-\beta(H_\Lambda - \mu N_\Lambda)}$ over the finite-dimensional space $\mathcal{F}_\Lambda$ is a continuous function of $\lambda$, which is nonzero at $\lambda = 0$ by Eq. (16) and Eq. (18). A similar argument applies to the numerator of Eq. (4). Thus the limits of numerator and denominator in Eq. (4) as $n_\tau \to \infty$ exist separately. Therefore one can take this limit in numerator and denominator through the same sequence, i.e., take $n_\tau$ to be the same in numerator and denominator. It follows that with the special choice $O(\psi(0), \bar\psi(0))$, all expectation values in the Hamiltonian picture can be expressed as the limit $n_\tau \to \infty$ of Eq. (29), with a special choice of the polynomial in the fields. Thus the correlation functions given by Eq. (29) include as a special case the expectation values of polynomials in the creation and annihilation operators.

Let $(\eta(X))_{X \in \Gamma}$ be a family of Grassmann generators. The partition function with source terms is $Z_\Gamma(\eta) = \int d\mu_C(\psi)\; e^{-\lambda V(\psi) + (\eta, \psi)_\Gamma}$. Let $\frac{\delta}{\delta\eta(X)} = \varepsilon_\tau^{-1} \varepsilon^{-d} \frac{\partial}{\partial\eta(X)}$, then
\[
\left\langle \prod_{k=1}^n \psi(X_k) \right\rangle = \frac{1}{Z_\Gamma(0)} \left[ \prod_{k=1}^n \frac{\delta}{\delta\eta(X_k)}\; Z_\Gamma(\eta) \right]_{\eta = 0} .
\tag{30}
\]
Thus, if one knows $Z_\Gamma(\eta)$ one can derive all correlation functions. It is convenient to study the connected correlation functions, defined as
\[
\left\langle \prod_{k=1}^n \psi(X_k) \right\rangle_c = \left[ \prod_{k=1}^n \frac{\delta}{\delta\eta(X_k)}\; \log Z_\Gamma(\eta) \right]_{\eta = 0}
\tag{31}
\]
instead. Since $Z_\Gamma(\eta)$ is the exponential of $\log Z_\Gamma(\eta)$, one can reconstruct all correlation functions from the connected ones. It is even more convenient to transform the sources $\eta$, to get the amputated connected Green functions. They are generated by
\[
G_{\mathrm{eff}}(\chi) = \log \int d\mu_C(\psi)\; e^{-\lambda V(\psi + \chi)} .
\tag{32}
\]
A shift in the measure shows that $G_{\mathrm{eff}}(\chi) = \frac{1}{2} (\chi, C^{-1}\chi)_\Gamma + \log Z_\Gamma(C^{-1}\chi)$, so that the study of $G_{\mathrm{eff}}$ is equivalent to that of $\log Z_\Gamma$. The selfenergy $\Sigma(p)$ is defined as the one-particle irreducible part of the two-point function. In terms of the connected amputated two-point Green function $G_2$, which is the coefficient of the quadratic part (in $\psi$) of the effective action $G_{\mathrm{eff}}(\psi)$, it is $\Sigma(p) = G_2(p)(1 - C G_2(p))^{-1}$.

2.6. Criteria for Fermi liquid behaviour. In the following I give a definition of Fermi liquid behaviour which is linked to the question of convergence of the expansion in the coupling constant $\lambda$, and I discuss in some detail the physical motivation for this definition, the results that have been proven in this direction, and its relation to other notions of FL behaviour. In most many-fermion models, one cannot expect the expansion in $\lambda$ to converge uniformly in the temperature, not even after renormalization. In particular, the Cooper instability produces a superconducting ground state, and thus a nonanalyticity in $\lambda$, if the temperature is low enough. This happens even if the initial interaction is repulsive [34, 35]. Nesting instabilities can produce other types of symmetry breaking, such as antiferromagnetic ordering, which may compete or coexist with superconductivity, but the conditions I posed, in particular the curvature of the Fermi surface, remove these instabilities at low temperatures (which temperatures are "low" depends on the scale $\epsilon_0$). Let the skeleton Green functions be defined as the connected amputated $m$-point correlation functions where selfenergy insertions are left out.
These functions are the solution of a natural truncation of the renormalization group equation; they are defined precisely in Sect. 6.

Definition 1. The $d$-dimensional many-fermion system with dispersion relation $E$ and interaction $V$ shows (equilibrium) Fermi liquid behaviour if the thermodynamic limit of the Green functions exists for $|\lambda| < \lambda_0(\beta)$, and if there are constants $M_0, M_1, M_2 > 0$ (independent of $\beta$ and $\lambda$), such that the following holds. The perturbation expansion for the skeleton Green functions converges for all $(\lambda, \beta)$ with $|\lambda| \log \beta < M_0$, and for all $(\lambda, \beta)$ with $|\lambda| \log \beta \le \frac{M_0}{2}$, the skeleton selfenergy $\Sigma_{\mathrm{sk}} : \mathbb{R} \times \mathcal{B} \to \mathbb{C}$ satisfies the regularity conditions:

1. $\Sigma_{\mathrm{sk}}$ is twice differentiable in $p$ and
\[
\max_{|\alpha| = 2} \| \partial^\alpha \Sigma_{\mathrm{sk}} \|_\infty \le M_1,
\tag{33}
\]
2. the restriction to the Fermi surface $\Sigma_{\mathrm{sk}}|_{\{0\} \times S} \in C^{k_0}(S, \mathbb{R})$, and
\[
\max_{|\alpha| = k_0} \| \partial^\alpha \Sigma_{\mathrm{sk}} \|_\infty \le M_2 .
\tag{34}
\]
Here k0 > d is the degree of differentiability of the dispersion relation E (given in Sect. 2.3). Nothing is special about the factor 21 in the condition |λ| log β ≤ M2 0 . One could instead also have taken any fixed compact subset of {z : |z| < M0 }. The derivatives mean, β when taken in p0 , a difference 2π (6sk (p0 + 2π β , p) − 6sk (p0 , p)). The maximum runs d+1 over all multiindices α ∈ N0 . This definition only concerns equilibrium properties of Fermi liquid behaviour; it does not touch phenomena like zero sound, which require an analysis of the response to perturbations that depend on real time. It is natural in that it defines a Fermi liquid above the critical temperature for superconductance: at a given λ, the value of T for which the convergence breaks down is Tc ∝ e−M0 /|λ| , which is the usual BCS formula. Convergence of perturbation theory above Tc implies that the usual Fermi liquid formulas are valid there. Convergence is stated only for skeleton quantities because that is all one can show. This convergence and the regularity properties of the selfenergy imply that the exact Green functions (no restriction to skeletons) are continuous in λ, that the exact selfenergy 6 is C 2 in λ and p, and that 6 obeys a bound similar to Eq. (33). The Green functions are not analytic in λ because otherwise already the unrenormalized expansion, which diverges termwise, would converge. The regularity properties (1) and (2) ensure that the exact Green functions can be reconstructed from the skeleton Green functions by renormalization. The usual skeleton expansion argument [36], where finiteness only of the skeleton selfenergy, but not of its derivatives, is shown, is insufficient to do that; one has to prove regularity properties (1) and (2). This was discussed in detail in [22], see also Sect. 7. The condition k0 > d is necessary to make the regularized propagator summable in position space. 
It is required in the proof of Lemma 5 (and in the proofs in [20], only that there the dispersion relation was taken $C^\infty$). In the absence of level crossing, the free dispersion relation $E(\mathbf{k})$ is usually even real analytic in $\mathbf{k}$. However, when reconstructing the exact Green functions from the skeleton Green functions, one needs regularity of the dispersion relation of the interacting system, and thus regularity property (2), which is rather hard to verify even in perturbation theory. Thus it is desirable to take the smallest possible $k_0 > d$. Because $\Sigma$ obeys a bound similar to Eq. (33), one can do the usual first-order Taylor expansion in the momenta to get
\[ \Sigma(p) = p_0 (\partial_0 \Sigma)(0, \mathbf{P}(\mathbf{p})) + (\mathbf{p} - \mathbf{P}(\mathbf{p})) \cdot \nabla\Sigma(0, \mathbf{P}(\mathbf{p})) + \tilde\Sigma(p), \tag{35} \]
from which one obtains a finite wave function renormalization
\[ Z(\mathbf{p}) = 1 + i(\partial_0 \Sigma)(0, \mathbf{P}(\mathbf{p})) \tag{36} \]
and a finite correction to the Fermi velocity, and the Taylor remainder $\tilde\Sigma(p)$ vanishes quadratically in the distance of the momentum $(p_0, \mathbf{p})$ to its projection $(0, \mathbf{P}(\mathbf{p}))$ to the Fermi surface.
Continuous Renormalization for Fermions and Fermi Liquid Theory
259
This property distinguishes Fermi liquids from other possible states of the many-fermion system, such as Luttinger liquids: In one dimension (where “Luttinger liquid behaviour” has been proven [16–18]), the second derivative of even the second order skeleton selfenergy grows like $\beta$ for large $\beta$ and thus violates the condition that the second derivative should be bounded independently of $\beta$ for $|\lambda| \log\beta \le \frac{M_0}{2}$. Note that this distinction can only be made if $\beta$ is allowed to vary; at fixed $\beta$, the requirement that something is bounded independently of $\beta$ is trivial. This is the reason why a whole range of values of $\beta$ and $\lambda$ is included in Definition 1. A full proof that the models obeying the hypotheses stated in Sect. 2.3 are Fermi liquids in the sense of Definition 1 is not within the scope of this paper, but several ingredients for such a proof are already in place. I now discuss what is known and then briefly state the main results of the present paper. The analyticity of the skeleton Green functions in $\lambda$ follows for $d = 2$ spatial dimensions from a modification of the method in [20]. The required modification, namely to put in the four-point functions, is not difficult because at positive temperature, these functions have no singularities (in momentum space), but are bounded by a constant times a power of $\log\beta$. Regularity property (1) was proven for all $d \ge 2$ in perturbation theory. The “overlapping loop” method of [21] developed for these proofs applies nonperturbatively as well, so that (1) holds. For $E(\mathbf{k}) = \mathbf{k}^2/2m - \mu$, rotational invariance implies that $\Sigma_{sk}|_{\{0\}\times S}$ is independent of $\mathbf{p}$, so that (2) is trivially fulfilled. Thus the model with this dispersion relation and a rotation-invariant short-range interaction is the simplest example of a Fermi liquid in $d = 2$. In the case without rotational symmetry (e.g.
the Hubbard model), (2) was proven in perturbation theory for $d = 2$ in [21–23], with $k_0 = 2 + h$, $h < \frac{1}{2}$, by use of a classification of graphs without double overlaps [23] and a detailed analysis of their contributions [22]. A nonperturbative implementation of the double overlap technique of [23] has not been given yet, but it should be possible. For $d = 3$, analyticity has not yet been proven (for a partial result, see [26]). Regularity property (1) was proven in perturbation theory in [21–23]. Property (2) has not even been shown in perturbation theory up to now. In this paper, I use the continuous RGE to give a largely simplified proof of the existence of the thermodynamic limit (Lemma 3) and of regularity property (1) in perturbation theory, for all $d \ge 2$ (Theorem 6). I show in perturbation theory that only the ladder four-point function (see Definition 3) can produce the logarithmic growth in $\beta$ that leads to the Cooper instability (and hence to the restriction $|\lambda| \log\beta < M_0$), and that all other contributions to the four-point function are bounded uniformly in $\beta$ (Theorem 4). In Sect. 4, I prove a basic power counting theorem (Theorem 1) that takes into account the effect of fermionic sign cancellations by a determinant bound. The determinant appearing in the RGE may lead to analyticity of the skeleton Green functions, but a proof of analyticity is not given here. A simplified proof of (2) can be given for $d = 2$ in perturbation theory by an extension of the methods developed here. For $d = 3$, regularity property (2) requires more than three derivatives of $\Sigma_{sk}|_{\{0\}\times S}$ to exist. A proof that this is the case looks rather difficult, but the simple structure of the continuous RGE makes it seem within reach of that method. A natural question is if there are criteria independent of temperature that can be applied also in the zero temperature limit for “Fermi liquid behaviour”. For the class of models specified in Sect.
2.3, the simplest criterion is that the non-ladder skeleton Green functions are analytic in the coupling constant λ, that the non-ladder skeleton selfenergy
260
M. Salmhofer
$\Sigma^{(N)}$ is $C^1$, and that $\Sigma^{(N)}|_S$ is $C^{k_0}$, with bounds uniform in $\beta$. The non-ladder skeleton Green functions are obtained by removing all ladder contributions (defined in Sect. 6) to the Green functions. The regularity implies that the Fermi velocity and the wave function renormalization are finite uniformly in $\beta$, and even at zero temperature, for $d \ge 2$ (which is the usual criterion for Fermi liquids), whereas they still diverge in one dimension as $\beta \to \infty$ [16–18]. The Fermi liquid criterion given in Definition 1 is more natural than the one using the non-ladder Green functions: if the regularity conditions (1) and (2) of Definition 1 hold, the transition from the exact Green functions to the skeleton Green functions is a matter of convenience, but the replacement of the skeleton Green functions by the non-ladder skeleton Green functions changes the model drastically because it removes superconductivity. Evidently, a definition referring to a modified model is not as natural. Moreover, as mentioned above, Fermi liquid behaviour is observed only above the critical temperature for superconductance anyway. In Sect. 6, I define the non-ladder skeleton functions precisely and prove the above statements about $\Sigma^{(N)}$ uniformly in the temperature in perturbation theory. In [25], an asymmetric model, in which the symmetry $E(\mathbf{k}) = E(-\mathbf{k})$ does not hold, was introduced, and a proof was outlined that such models are Fermi liquids down to zero temperature. The asymmetry of the Fermi surface removes the Cooper instability at zero relative momentum $q$ of the Cooper pair, i.e., it implies that the four-point function has no singularity at relative momentum $q = 0$ (which is where the usual Cooper pairing comes from). The regularity properties of the selfenergy, which are crucial for Fermi liquid behaviour, were not proven in [25]. Doing this [22, 23] is quite a bit harder than in the $(\mathbf{k} \to -\mathbf{k})$-symmetric case.
At zero temperature, regularity property (1) is replaced by (1′): $\Sigma \in C^{2-\epsilon}$ because the selfenergy is not $C^2$ at zero temperature (the second derivative grows as a power of $\log\beta$; for a detailed discussion of these problems see [22]). This modified regularity property (1′) and (2) were proven in perturbation theory for a general class of two-dimensional models with a strictly convex Fermi surface, which includes the $(\mathbf{k} \to -\mathbf{k})$-nonsymmetric Fermi surfaces, in [21–23]. More precise conditions on the dispersion relation that imply absence of the Cooper instability also at nonzero relative momentum $q$ were also formulated in [22]. It is not sufficient just to have a $\mathbf{k} \to -\mathbf{k}$ nonsymmetric surface to achieve that; one also needs that the curvature at a point on the Fermi surface and at its antipode differ except at finitely many points (for details, see Hypothesis (H4′) of [22] and the geometrical discussion in Appendix C of [22]).
3. The Renormalization Group Equation

In this section I derive the continuous RGE for fermionic models. I first derive it for the generating function, and then turn to the component form which is obtained by expanding the effective action in Wick ordered monomials of the fields. Let $\Gamma$ be a finite set, for a function $X \mapsto F(X)$ from $\Gamma$ to any linear space let $\int_\Gamma dX\, F(X) = \varepsilon_\Gamma \sum_{X\in\Gamma} F(X)$, where $\varepsilon_\Gamma > 0$ is a constant, let $\delta(X, X') = \varepsilon_\Gamma^{-1} \delta_{XX'}$, and define the bilinear form $(f, g)_\Gamma = \int_\Gamma dX\, f(X) g(X)$. Let $\mathcal{A}$ be the finite-dimensional Grassmann algebra generated by the generators $(\psi(X), \chi(X), \eta(X))_{X\in\Gamma}$ and let $\frac{\delta}{\delta\psi(X)}$ be the fermionic derivative normalized such that $\frac{\delta}{\delta\psi(X)} \psi(X') = \delta(X, X')$; recall that the fermionic derivatives anticommute.
3.1. The RGE for the generating function. For $t \ge 0$ let $C_t$ be an invertible, antisymmetric linear operator acting on functions defined on $\Gamma$, i.e. $(C_t f)(X) = \int_\Gamma dX'\, C_t(X, X') f(X')$ with
\[ C_t(X', X) = -C_t(X, X'). \tag{37} \]
Let $C_t$ be continuously differentiable in $t$; denote $\frac{\partial C_t}{\partial t} = \dot C_t$. Let $d\mu_{C_t}$ be the linear functional (Grassmann Gaussian measure) with characteristic function
\[ \int d\mu_{C_t}(\psi)\, e^{(\eta,\psi)_\Gamma} = e^{\frac{1}{2}(\eta,\, C_t \eta)_\Gamma}. \tag{38} \]
The integrals of arbitrary monomials are obtained from this formula by taking derivatives with respect to $\eta$. The measure is normalized: $\int d\mu_{C_t}(\psi) = 1$. Let $V(\psi) \in \mathcal{A}$ have no constant part, $\lambda \in \mathbb{C}$, and $G(0, \psi) = \lambda V(\psi)$. The effective action at $t > 0$ is
\[ G(t, \psi) = \log \int d\mu_{C_t}(\chi)\, e^{G(0,\chi+\psi)}. \tag{39} \]
Because the measure is normalized, $G(t, \psi)$ is a well-defined formal power series in $\lambda$. By the nilpotency of the Grassmann variables, $e^{G(t,\psi)}$ is a polynomial in $\lambda$ (the degree of which grows with $|\Gamma|$). Thus $G(t, \psi)$ is analytic in $\lambda$ for $|\lambda| < \lambda_0(\Gamma)$.

Proposition 1. Let
\[ \Delta_{C_t} = \frac{1}{2}\Bigl(\frac{\delta}{\delta\psi},\, C_t \frac{\delta}{\delta\psi}\Bigr)_\Gamma = \frac{1}{2} \int_\Gamma dX \int_\Gamma dX'\, \frac{\delta}{\delta\psi(X)}\, C_t(X, X')\, \frac{\delta}{\delta\psi(X')}. \tag{40} \]
Then
\[ \frac{\partial}{\partial t}\, e^{G(t,\psi)} = \Delta_{\dot C_t}\, e^{G(t,\psi)} \tag{41} \]
and
\[ e^{G(t,\psi)} = e^{\Delta_{C_t}}\, e^{G(0,\psi)}. \tag{42} \]
If $G(0, \psi)$ is an element of the even subalgebra, then for all $t > 0$, $G(t, \psi)$ is an element of the even subalgebra, and it satisfies the renormalization group equation
\[ \frac{\partial}{\partial t} G(t, \psi) = \Delta_{\dot C_t} G(t, \psi) + \frac{1}{2}\Bigl(\frac{\delta G(t,\psi)}{\delta\psi},\, \dot C_t \frac{\delta G(t,\psi)}{\delta\psi}\Bigr)_\Gamma. \tag{43} \]
Proof. For any $F(\psi) \in \mathcal{A}$, define $F(\frac{\delta}{\delta\eta})$ by replacing every factor $\psi(X)$ by $\frac{\delta}{\delta\eta(X)}$ in the polynomial expression for $F$. Then $F(\psi) = [F(\frac{\delta}{\delta\eta})\, e^{(\eta,\psi)_\Gamma}]_{\eta=0}$ (the derivatives $\frac{\delta}{\delta\eta}$ also generate a finite-dimensional Grassmann algebra, so the expansion for $F$ terminates at some power). Since Grassmann integration is a continuous operation, and by Eq. (38),
\[ e^{G(t,\psi)} = \Bigl[ e^{G(0, \frac{\delta}{\delta\eta})} \int d\mu_{C_t}(\chi)\, e^{(\eta,\chi+\psi)_\Gamma} \Bigr]_{\eta=0} = \Bigl[ e^{G(0, \frac{\delta}{\delta\eta})}\, e^{\frac{1}{2}(\eta, C_t\eta)_\Gamma}\, e^{(\eta,\psi)_\Gamma} \Bigr]_{\eta=0}. \tag{44} \]
For any formal power series $f(z) = \sum_k f_k z^k$,
\[ f(\Delta_{C_t})\, e^{(\eta,\psi)_\Gamma} = f\Bigl(\frac{1}{2}(\eta, C_t\eta)_\Gamma\Bigr)\, e^{(\eta,\psi)_\Gamma}, \tag{45} \]
so $e^{\frac{1}{2}(\eta, C_t\eta)_\Gamma + (\eta,\psi)_\Gamma} = e^{\Delta_{C_t}} e^{(\eta,\psi)_\Gamma}$. Since $\Delta_{C_t}$ is bilinear in the derivatives, it commutes with all factors that depend only on $\eta$ and can be taken out in front in Eq. (44). This implies Eq. (42). Since $\Delta_{C_t}$ also commutes with $\Delta_{\dot C_t}$, Eq. (41) follows. If $G(0, \psi)$ is an element of the even subalgebra, the same holds for $G(t, \psi)$ by Eq. (42), since every application of $\Delta_{C_t}$ removes two fields. Thus performing the derivatives with respect to $\psi$ gives Eq. (43).
3.2. The component RGE in position space. Let $V(\psi)$ be an element of the even subalgebra. The effective action has the expansion
\[ G(t, \psi) = \sum_{r=1}^{\infty} \lambda^r G_r(t, \psi) \tag{46} \]
with $G_r(t, \psi)$ polynomials in the Grassmann algebra. As explained in Sect. 3, this expansion converges for $\Gamma$ finite, but the radius of convergence $\lambda_0$ depends on $\Gamma$ and $C_t$. For the models discussed in Sect. 2, this means that $\lambda_0$ goes to zero in the limit $L \to \infty$ and $\beta \to \infty$. Assume that $C = \lim_{t\to\infty} C_t$ exists and let $D_t = C - C_t$ so that $\dot C_t = -\dot D_t$. The application will be that $C$ is the covariance of the model and $C_t$ is part of it, so that in $G(t, \psi)$, part of the fields have been integrated over. $D_t$ is then the covariance of the unintegrated fields. I expand the polynomial $G_r(t, \psi) \in \mathcal{A}$ in the basis for the Grassmann algebra given by the Wick ordered monomials $\Omega_{D_t}(\psi(X_1) \ldots \psi(X_p))$,
\[ G_r(t, \psi) = \sum_{m=0}^{\bar m(r)} \int_{\Gamma^m} d\underline{X}\; G_{mr}(t \mid \underline{X})\; \Omega_{D_t}\Bigl( \prod_{k=1}^{m} \psi(X_k) \Bigr), \tag{47} \]
where $G_{mr}(t \mid X_1, \ldots, X_m)$ is the connected, amputated $m$-point Green function and $\underline{X} = (X_1, \ldots, X_m)$. Details about Wick ordering are provided in Appendix B. A short formula is
\[ \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)) = e^{-\Delta_{D_t}}\, \psi(X_1) \ldots \psi(X_p). \tag{48} \]
I use the symbol $\Omega_{D_t}(\psi(X_1) \ldots \psi(X_p))$ rather than $:\psi(X_1) \ldots \psi(X_p):$ to indicate clearly with respect to which covariance Wick ordering is done, because this will be important. The $G_{mr}(t \mid X_1, \ldots, X_m)$ are assumed to be totally antisymmetric, that is, for all $\pi \in S_m$, $G_{mr}(t \mid X_{\pi(1)}, \ldots, X_{\pi(m)}) = \varepsilon(\pi)\, G_{mr}(t \mid X_1, \ldots, X_m)$, because any part of $G$ that is not antisymmetric would cancel in Eq. (47). Application of $\frac{\partial}{\partial t}$ to Eq. (47) gives a sum of two terms since two factors depend on $t$. By Eq. (48),
\[ \frac{\partial}{\partial t}\, \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)) = -\Delta_{\dot D_t}\, \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)) = \Delta_{\dot C_t}\, \Omega_{D_t}(\psi(X_1) \ldots \psi(X_p)). \tag{49} \]
When multiplied by $G_{mr}(t \mid X_1, \ldots, X_m)$ and integrated over $X_1, \ldots, X_m$, this gives $\Delta_{\dot C_t} G(t, \psi)$. Thus the term linear in $G$ drops out of Eq. (43) by Wick ordering with respect to $D_t$, and Eq. (43) now reads
\[ \sum_{m=0}^{\bar m(r)} \int_{\Gamma^m} d\underline{X}\; \Omega_{D_t}\Bigl( \prod_{k=1}^{m} \psi(X_k) \Bigr)\, \frac{\partial}{\partial t} G_{mr}(t \mid \underline{X}) = \frac{1}{2} Q_r(t, \psi), \tag{50} \]
where $Q_r(t, \psi)$ is defined by
\[ \Bigl(\frac{\delta G(t,\psi)}{\delta\psi},\, \dot C_t \frac{\delta G(t,\psi)}{\delta\psi}\Bigr)_\Gamma = \sum_{r=1}^{\infty} \lambda^r Q_r(t, \psi). \tag{51} \]
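Being an element of the Grassmann algebra, $Q_r(t, \psi)$ has the representation
\[ Q_r(t, \psi) = \sum_{m=0}^{\bar m(r)} \int_{\Gamma^m} d\underline{X}\; Q_{mr}(t \mid \underline{X})\; \Omega_{D_t}\Bigl( \prod_{k=1}^{m} \psi(X_k) \Bigr). \tag{52} \]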
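To obtain the $Q_{mr}(t \mid \underline{X})$, one has to rewrite the product of the two Wick monomials in Eq. (51). This is done in Appendix C. The result is

Proposition 2. $Q_{m1}(t \mid \underline{X}) = 0$, and for $r \ge 2$,
\[ Q_{mr}(t \mid \underline{X}) = \int d\kappa_{mr} \int_{\Gamma^i} d\underline{V} \int_{\Gamma^i} d\underline{W}\, \Bigl( -\frac{\partial}{\partial t} \det D_t^{(i)}(\underline{V}, \underline{W}) \Bigr)\, G_{m_1 r_1}(t \mid \underline{X}_1, \underline{V})\, G_{m_2 r_2}(t \mid \tilde{\underline{W}}, \underline{X}_2), \tag{53} \]
where $\int d\kappa_{mr}$ stands for the sum
\[ \int d\kappa_{mr}(r_1, m_1, r_2, m_2, i)\, F(r_1, m_1, r_2, m_2, i) = \sum_{\substack{r_1, r_2 \ge 1 \\ r_1 + r_2 = r}}\; \sum_{(m_1, m_2, i) \in M_{r_1 r_2 m}} \kappa_{m_1 m_2 i}\, F(r_1, m_1, r_2, m_2, i) \tag{54} \]
with positive weights $\kappa_{m_1 m_2 i} = \binom{m_1}{i}\binom{m_2}{i}$, so that $d\kappa_{mr}$ is a positive measure. $M_{r_1 r_2 m}$ is the set of $(m_1, m_2, i)$ such that $i \ge 1$, $1 \le m_1 \le \bar m(r_1)$, $1 \le m_2 \le \bar m(r_2)$, $m_1 + m_2 = m + 2i$, and $m_1$ and $m_2$ are even. $\underline{X}_1 = (X_1, \ldots, X_{m_1 - i})$, $\underline{X}_2 = (X_{m_1 - i + 1}, \ldots, X_m)$, $\underline{V} = (V_1, \ldots, V_i)$, $\underline{W} = (W_1, \ldots, W_i)$, and $\tilde{\underline{W}} = (W_i, \ldots, W_1)$, and $D_t^{(i)}(\underline{V}, \underline{W})$ is the $i \times i$ matrix $(D_t^{(i)}(\underline{V}, \underline{W}))_{kl} = D_t(V_k, W_l)$. Comparison of the coefficients gives the component form of the RGE,
\[ \frac{\partial}{\partial t} G_{mr}(t \mid X_1, \ldots, X_m) = \frac{1}{2} A_m Q_{mr}(t \mid X_1, \ldots, X_m), \tag{55} \]
where $A_m$ is the antisymmetrization operator
\[ (A_m f)(X_1, \ldots, X_m) = \frac{1}{m!} \sum_{\pi \in S_m} \varepsilon(\pi)\, f(X_{\pi(1)}, \ldots, X_{\pi(m)}). \tag{56} \]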
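The action of the antisymmetrization operator $A_m$ of Eq. (56) can be illustrated on ordinary functions of $m$ scalar arguments. The following is a minimal sketch, not part of the original argument: the helper names `perm_sign` and `antisymmetrize` and the test function `f` are hypothetical, and real numbers stand in for the arguments $X_k \in \Gamma$.

```python
from itertools import permutations
from math import factorial


def perm_sign(perm):
    """Sign of a permutation of 0..m-1, computed from its cycle lengths."""
    sign = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        # A cycle of even length is an odd permutation.
        if length > 0 and length % 2 == 0:
            sign = -sign
    return sign


def antisymmetrize(f, m):
    """Return A_m f as in Eq. (56): (1/m!) * sum_pi sign(pi) * f(x_pi(1), ..., x_pi(m))."""
    perms = list(permutations(range(m)))

    def a_f(*xs):
        total = sum(perm_sign(p) * f(*(xs[j] for j in p)) for p in perms)
        return total / factorial(m)

    return a_f


# Hypothetical test function of three scalar arguments (not from the paper).
def f(x, y, z):
    return x * y**2 * z**3


a_f = antisymmetrize(f, 3)
```

In this sketch `a_f` changes sign under exchange of two arguments, applying `antisymmetrize` a second time leaves it unchanged (so $A_m$ acts as a projection), and a symmetric function is mapped to zero.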
The important feature of Eq. (53) is that the determinant of the propagators appears in this equation. The Gram bound for this determinant improves the combinatorics by a factorial. I now discuss the graphical interpretation of the equation and the determinant, to motivate why this improvement can be regarded as a “planarization” of the graphs.

3.3. The graphical interpretation. The component form of the RGE has a straightforward graphical interpretation. If one associates the vertex drawn in Fig. 1 to $G_{mr}(t \mid X_1, \ldots, X_m)$, adopts the convention that the variables occurring on the internal lines of a graph are integrated, and writes out the determinant as
\[ \det D_t^{(i)}(\underline{V}, \underline{W}) = \sum_{\pi \in S_i} \varepsilon(\pi) \prod_{k=1}^{i} D_t(V_k, W_{\pi(k)}), \tag{57} \]
the right-hand side of Eq. (53) appears as the signed sum over graphs with two vertices $G_{m_1 r_1}(t)$ and $G_{m_2 r_2}(t)$, obtained by joining leg number $m_1 - i + k$ of vertex 1 with leg number $i - \pi(k) + 1$ of vertex 2, for all $k \in \{1, \ldots, i\}$. The graph for $\pi = \mathrm{id}$ is drawn in Fig. 2. The expansion in terms of Feynman graphs is generated by iteration of the equivalent integral equation
\[ G_{mr}(t \mid \underline{X}) = G_{mr}(0 \mid \underline{X}) + \frac{1}{2} A_m \int_0^t ds\; Q_{mr}(s \mid \underline{X}) \tag{58} \]
with the initial condition $G_{mr}(0 \mid \underline{X}) = -\delta_{r1} V_m(\underline{X})$, where $V_m(\underline{X})$ is the coefficient of $\Omega_C(\psi(X_1) \ldots \psi(X_m))$ in the original interaction. It is evident from Fig. 2 that only connected graphs contribute to this sum.
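The index set $M_{r_1 r_2 m}$ and the weights $\kappa_{m_1 m_2 i}$ entering the sum of Eq. (54) can be enumerated explicitly for small orders. The sketch below is only an illustration under a stated assumption: it takes $\bar m(r) = 2r + 2$, which is the maximal number of legs at order $r$ for a four-fermion initial interaction (compare the summation range in Eq. (86)); the function names `m_bar` and `kappa_terms` are hypothetical.

```python
from math import comb


def m_bar(r):
    """ASSUMPTION: four-fermion interaction, so at most 2r + 2 external legs at order r."""
    return 2 * r + 2


def kappa_terms(m, r):
    """List (r1, m1, r2, m2, i, weight) for the sum in Eq. (54), for even m."""
    terms = []
    for r1 in range(1, r):
        r2 = r - r1
        for m1 in range(2, m_bar(r1) + 1, 2):      # m1 even, 1 <= m1 <= m_bar(r1)
            for m2 in range(2, m_bar(r2) + 1, 2):  # m2 even, 1 <= m2 <= m_bar(r2)
                two_i = m1 + m2 - m                # from m1 + m2 = m + 2i
                if two_i < 2 or two_i % 2:
                    continue
                i = two_i // 2
                if i > min(m1, m2):
                    continue                       # binomial weight vanishes
                terms.append((r1, m1, r2, m2, i, comb(m1, i) * comb(m2, i)))
    return terms


second_order_four_point = kappa_terms(4, 2)
```

For $m = 4$ and $r = 2$ this gives three terms: two with one contracted line ($i = 1$, weight $8$) and one with two contracted lines ($i = 2$, weight $\binom{4}{2}^2 = 36$).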
Fig. 1. The vertex corresponding to $G_{mr}(t)$
Note that the only planar graphs appearing in this sum are from $\pi(j) = j + k \bmod i$ for $k \in \{0, \ldots, i-1\}$, and that for $k \ne 0$, these permutations produce planar graphs only if $i = m_1$ or $i = m_2$. For all other permutations, the graphs arising are nonplanar. For bosons, the determinant is replaced by a permanent, and one can permute the integration variables so that the derivative of the permanent (and hence the sum over permutations) gets replaced by
\[ i\; i!\, \Bigl( \frac{\partial}{\partial t} D_t(V_1, W_1) \Bigr) \prod_{k=2}^{i} D_t(V_k, W_k), \tag{59} \]
so that the planar graph drawn in Fig. 2 is the only one contributing to the right-hand side. Thus this factor $i!$ distinguishes between the combinatorics of the exact theory and a “planarized” theory, in which $i!$ is replaced by $i^2$ (the second factor $i$ comes from doing the derivative in the determinant, see Sect. 4). The “planarized” theory does contain more than the sum over all planar graphs because of the binomial factors (the antisymmetrization operation $A_m$ does not change the combinatorics because it contains an explicit factor $1/m!$). In the next section, I bound the determinant by $\mathrm{const}^{\,i}$ and thereby reduce the combinatorics of the fermionic theory to that of the planarized theory.

Fig. 2. A graph contributing to the right hand side of the RGE

4. Fermionic Sign Cancellations

4.1. The determinant bound. Equation (53) already suggests that a determinant bound similar to the one used in Lemma 1 of [20] can be applied to the RGE. Before applying this bound, the derivative with respect to $t$ has to be performed, and some factors need to be arranged to avoid the factor $|\Lambda|$ that appeared in [20]. The reason it does not appear here is that only connected graphs contribute to the effective action (whereas the partition function itself was bounded in Lemma 1 of [20]). In the RGE, this is very easily seen without a reference to graphs. Since the determinant is multilinear in the columns of the matrix, the derivative with respect to $t$ produces a sum of terms where every column gets differentiated. Expanding along each differentiated column gives
\[ \frac{\partial}{\partial t} \det D_t^{(i)}(\underline{V}, \underline{W}) = \sum_{l=1}^{i} \sum_{l'=1}^{i} (-1)^{l+l'}\, \dot D_t(V_l, W_{l'})\, \det D_t^{(i-1)}(\underline{V}^{(l)}, \underline{W}^{(l')}) \tag{60} \]
with $\underline{V}^{(l)} = (V_1, \ldots, V_{l-1}, V_{l+1}, \ldots, V_i)$ and a similar expression for $\underline{W}^{(l')}$. The sign is cancelled by rearranging
\[ G_{m_1 r_1}(t \mid \underline{X}_1, \underline{V}) = (-1)^{i-l}\, G_{m_1 r_1}(t \mid \underline{X}_1, \underline{V}^{(l)}, V_l), \qquad G_{m_2 r_2}(t \mid \tilde{\underline{W}}, \underline{X}_2) = (-1)^{i-l'}\, G_{m_2 r_2}(t \mid W_{l'}, \tilde{\underline{W}}^{(l')}, \underline{X}_2). \tag{61} \]
Upon renaming of the integration variables, the summand becomes independent of $l$ and $l'$, so the sum gives a factor $i^2$. Thus
\[ Q_{mr}(t \mid \underline{X}) = \int d\kappa_{mr}\; i^2 \int dV \int dW\, \bigl(-\dot D_t(V, W)\bigr) \int d\underline{Y} \int d\underline{Z}\; \det D_t^{(i-1)}(\underline{Y}, \underline{Z})\, G_{m_1 r_1}(t \mid \underline{X}_1, \underline{Y}, V)\, G_{m_2 r_2}(t \mid W, \tilde{\underline{Z}}, \underline{X}_2). \tag{62} \]
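The cofactor expansion of Eq. (60) can be checked numerically on a small matrix. The following is a minimal sketch, assuming a hypothetical $3 \times 3$ family $D(t) = A + tB$ with illustrative numerical entries (so that $\dot D = B$); the determinant is evaluated through the permutation sum of Eq. (57), and the result of Eq. (60) is compared with a central finite difference.

```python
from itertools import permutations


def perm_sign(p):
    """Sign of a permutation from its number of inversions."""
    inversions = sum(
        1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b]
    )
    return -1 if inversions % 2 else 1


def det(mat):
    """Determinant via the permutation sum of Eq. (57); only sensible for small i."""
    n = len(mat)
    total = 0.0
    for p in permutations(range(n)):
        term = float(perm_sign(p))
        for k in range(n):
            term *= mat[k][p[k]]
        total += term
    return total


def minor(mat, row, col):
    """Matrix with one row and one column removed."""
    return [[v for c, v in enumerate(r) if c != col] for k, r in enumerate(mat) if k != row]


def ddet_dt(d, d_dot):
    """Right-hand side of Eq. (60): sum over (l, l') of (-1)^(l+l') * Ddot[l][l'] * det(minor)."""
    n = len(d)
    return sum(
        (-1) ** (l + lp) * d_dot[l][lp] * det(minor(d, l, lp))
        for l in range(n)
        for lp in range(n)
    )


# Hypothetical 3x3 family D(t) = A + t*B, so that dD/dt = B (illustrative numbers only).
A = [[1.0, 0.3, -0.2], [0.5, 2.0, 0.1], [-0.4, 0.7, 1.5]]
B = [[0.2, -0.1, 0.4], [0.3, 0.6, -0.5], [0.1, 0.2, 0.9]]


def d_of_t(t):
    return [[A[k][l] + t * B[k][l] for l in range(3)] for k in range(3)]


t0, h = 0.7, 1e-5
exact = ddet_dt(d_of_t(t0), B)
numeric = (det(d_of_t(t0 + h)) - det(d_of_t(t0 - h))) / (2 * h)
```

The two values `exact` and `numeric` should agree up to the finite-difference error. The double sum has $i^2$ terms, which is the origin of the factor $i^2$ in Eq. (62).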
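Let $|||\cdot|||$ be the norm [14]
\[ |||F_m||| = \max_{p \in \{1, \ldots, m\}}\; \sup_{X_p} \int \prod_{\substack{q=1 \\ q \ne p}}^{m} dX_q\; |F_m(X_1, \ldots, X_m)|. \tag{63} \]
Lemma 1. Assume that
\[ \sup_{\underline{Y}, \underline{Z}} \bigl| \det D_t^{(i-1)}(\underline{Y}, \underline{Z}) \bigr| \le A_{i-1}(t). \tag{64} \]
Then
\[ |||Q_{mr}(t)||| \le \int d\kappa_{mr}\; i^2 A_{i-1}(t)\; |||\dot D_t|||\; |||G_{m_1 r_1}(t)|||\; |||G_{m_2 r_2}(t)|||. \tag{65} \]
Proof. Let $p \in \{1, \ldots, m\}$. Without loss of generality, let $p \le m_1 - i$, so that $X_p$ is a component of $\underline{X}_1$ (the other case is similar by the symmetry of the sum for $Q_{mr}(t)$ in $m_1$ and $m_2$). By Eq. (64),
\[ |||Q_{mr}(t)||| \le \int d\kappa_{mr}\; i^2 A_{i-1}(t)\; \sup_{X_p} \int \prod_{\substack{q=1 \\ q \ne p}}^{m_1 - i} dX_q \int dV\; \varphi(V, \underline{X}_1)\, \chi(V) \tag{66} \]
with
\[ \varphi(V, \underline{X}_1) = \int d\underline{Y}\; |G_{m_1 r_1}(t \mid \underline{X}_1, \underline{Y}, V)| \tag{67} \]
and
\[ \chi(V) = \int dW\, d\underline{Z}\, d\underline{X}_2\; \bigl| \dot D_t(V, W)\, G_{m_2 r_2}(t \mid W, \tilde{\underline{Z}}, \underline{X}_2) \bigr|. \tag{68} \]
The bound
\[ \chi(V) \le \sup_{V} \int dW\; \bigl| \dot D_t(V, W) \bigr|\; \sup_{W} \int d\underline{Z}\, d\underline{X}_2\; \bigl| G_{m_2 r_2}(t \mid W, \tilde{\underline{Z}}, \underline{X}_2) \bigr| \tag{69} \]
gives the result.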
From now on I assume that $\Lambda$ is a discrete torus and that $\Lambda^*$ is its dual. Let $N$ be a set with $|N| = n$ elements, let $\Gamma = \Lambda \times N \times \{1, 2\}$ and $\Gamma^* = \Lambda^* \times N \times \{1, 2\}$. $N$ can be thought of as the index set containing spin and colour indices. The last factor $\{1, 2\}$ distinguishes between the usual $\psi$ and $\bar\psi$, as discussed in Sect. 2.

Lemma 2. Let $X = (x, \sigma, j) \in \Gamma$, and for every $k \in \Lambda^*$, let $M(k)$ be a symmetric matrix in $M(n, \mathbb{R})$ with eigenvalues $m_\rho(k)$ satisfying $|m_\rho(k)| \le 1$. Let $f_t : \Lambda^* \to \mathbb{C}$, and let
\[ D_t(X, X') = \delta_{j', 3-j} \int_{\Lambda^*} dk\; e^{ik(x - x')}\, (-1)^j f_t\bigl((-1)^j k\bigr)\, M_{\sigma,\sigma'}\bigl((-1)^j k\bigr). \tag{70} \]
Then $D_t(X', X) = -D_t(X, X')$, and
\[ \bigl| \det D_t^{(i-1)}(\underline{Y}, \underline{Z}) \bigr| \le \Bigl( \int_{\Lambda^*} dk\; |f_t(k)| \Bigr)^{i-1}. \tag{71} \]
Proof. Since $\Lambda^* = -\Lambda^*$ and $(-1)^{j'} = -(-1)^j$, a change of variables $k \to (-1)^j k$ implies
\[ D_t(X, X') = \delta_{j+j', 3}\, (-1)^j \int_{\Lambda^*} dk\; f_t(k)\, M_{\sigma,\sigma'}(k)\, e^{ik((-1)^j x + (-1)^{j'} x')}. \tag{72} \]
The antisymmetry of $D_t$ now follows from the symmetry of $M$. Let $\Pi_\rho(k)$ be the spectral projection to the eigenspace of $m_\rho(k)$, so that $M(k) = \sum_\rho m_\rho(k) \Pi_\rho(k)$, and denote the scalar product on the spin space by $[\cdot, \cdot]$ so that $M(k)_{\sigma\sigma'} = [e_\sigma, M(k) e_{\sigma'}]$, where the $e_\sigma$ are orthonormal. Then
\[ D_t(X, X') = \int_{\Lambda^*_1} dk \int_0^{2\pi} \frac{d\varphi}{2\pi} \sum_\rho \bigl[ a_t(X)(k, \varphi, \rho),\; \Pi_\rho(k)\, b_t(X')(k, \varphi, \rho) \bigr] = \langle a_t(X), b_t(X') \rangle \tag{73} \]
with $\Lambda^*_1 = \{k \in \Lambda^* : f_t(k) \ne 0\}$, and
\[ a_t(X)(k, \varphi, \rho) = (-1)^j\, e^{-ik(-1)^j x}\, e^{-i\varphi(j-3)}\, |f_t(k)|^{\frac{1}{2}}\, e_\sigma, \qquad b_t(X')(k, \varphi, \rho) = e^{ik(-1)^{j'} x'}\, e^{i\varphi j'}\, f_t(k)\, m_\rho(k)\, |f_t(k)|^{-\frac{1}{2}}\, e_{\sigma'}. \tag{74} \]
Gram's bound [37],
\[ \bigl| \det D_t^{(i-1)}(\underline{Y}, \underline{Z}) \bigr| \le \prod_{k=1}^{i-1} \bigl( \langle a_t(Y_k), a_t(Y_k) \rangle\, \langle b_t(Z_k), b_t(Z_k) \rangle \bigr)^{\frac{1}{2}}, \tag{75} \]
and
\[ \sum_\rho |m_\rho(k)|^2\, \langle e_\sigma, \Pi_\rho(k) e_\sigma \rangle \le \sum_\rho \langle e_\sigma, \Pi_\rho(k) e_\sigma \rangle \le 1 \tag{76} \]
imply Eq. (71).
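Gram's bound, Eq. (75), states that a determinant of scalar products $\langle a_k, b_l \rangle$ is bounded by the product of the norms of the vectors, with no factorial. A minimal numerical sketch follows, assuming real vectors in a six-dimensional space as a finite-dimensional stand-in for the Hilbert space used in the proof above; the function names and the random data are hypothetical.

```python
import math
import random


def det(mat):
    """Laplace expansion along the first row (adequate for the small sizes used here)."""
    if not mat:
        return 1.0
    return sum(
        (-1) ** c * mat[0][c] * det([row[:c] + row[c + 1:] for row in mat[1:]])
        for c in range(len(mat))
    )


def inner(u, v):
    """Euclidean scalar product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))


def gram_check(a_vecs, b_vecs):
    """Return (|det <a_k, b_l>|, prod_k |a_k| |b_k|) for two families of real vectors."""
    matrix = [[inner(a, b) for b in b_vecs] for a in a_vecs]
    bound = math.prod(
        math.sqrt(inner(a, a) * inner(b, b)) for a, b in zip(a_vecs, b_vecs)
    )
    return abs(det(matrix)), bound


rng = random.Random(0)  # fixed seed, illustrative data only
samples = []
for _ in range(200):
    a_vecs = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(4)]
    b_vecs = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(4)]
    samples.append(gram_check(a_vecs, b_vecs))
```

Each pair in `samples` holds the absolute value of a $4 \times 4$ determinant and the corresponding Gram bound. The point of the bound in the text is that a sum of $i!$ terms is controlled by a product of only $i$ factors.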
Corollary 1. If $f_t(k) = \tilde D_t(k)$ and $M_{\sigma\sigma'}(k) = \delta_{\sigma\sigma'}$ (as in the above many-fermion systems) then
\[ \bigl| \det D_t^{(i-1)}(\underline{Y}, \underline{Z}) \bigr| \le \Bigl( \int_{\Lambda^*} dk\; |\tilde D_t(k)| \Bigr)^{i-1}. \tag{77} \]
4.2. Power counting for point singularities. In this section, I show some basic power counting bounds for the Green functions obtained from the truncation that all marginal or relevant couplings are left out.

Theorem 1. Assume that $D_t$ is of the form Eq. (70), that
\[ \int_{\Lambda^*} dk\; |\tilde D_t(k)| \le \Delta_1 e^{-t}, \tag{78} \]
and that
\[ |||\dot D_t||| \le \Delta_2 e^{t}. \tag{79} \]
Let $\int d\tilde\kappa_{mr}$ denote the measure obtained by restricting the sum in $\int d\kappa_{mr}$ to $m_1 \ge 4$ and $m_2 \ge 4$ and replacing $G_{4, r_k}$ by $v\,\delta_{r_k, 1}$ whenever it appears in the sum on the right-hand side of Eq. (53). Then the solution $\tilde G_{mr}(t)$ of Eq. (55) with initial condition $G_{mr}(0) = G^{(0)}_{mr}$ satisfies
\[ |||\tilde G_{mr}(t)||| \le \begin{cases} \gamma_{mr}\, e^{t(\frac{m}{2} - 2)} & \text{if } m \ge 6 \\ \gamma_{4r}\, (1 + t) & \text{if } m = 4 \\ \gamma_{2r} & \text{if } m = 2 \end{cases} \tag{80} \]
with $\gamma_{mr}$ defined recursively as
\[ \gamma_{mr} = |||G^{(0)}_{mr}||| + \frac{3\Delta_2}{m} \int d\tilde\kappa_{mr}\; i^2 \Delta_1^{i-1}\, \gamma_{m_1 r_1} \gamma_{m_2 r_2}. \tag{81} \]
Proof. Induction in $r$, with Eq. (80) and Eq. (81) as the inductive hypothesis. The case $r = 1$ is trivial. Let $r \ge 2$, and the statement hold for all $r' < r$. By Lemma 2, Eq. (64) holds with $A_{i-1}(t) = \Delta_1^{i-1} e^{-t(i-1)}$. Thus
\[ |||Q_{mr}(t)||| \le \Delta_2 \int d\tilde\kappa_{mr}\; i^2 \Delta_1^{i-1} e^{-t(i-2)}\; |||\tilde G_{m_1 r_1}(t)|||\; |||\tilde G_{m_2 r_2}(t)|||. \tag{82} \]
By the inductive hypothesis and $m_1 + m_2 - 2i = m$, $|||\cdot|||$ of the right-hand side of Eq. (58) is bounded by
\[ |||G_{mr}(0)||| + \int d\tilde\kappa_{mr}\; i^2 \Delta_1^{i-1} \Delta_2\, \gamma_{m_1 r_1} \gamma_{m_2 r_2}\; \frac{1}{2} \int_0^t ds\; e^{s(\frac{m}{2} - 2)}. \tag{83} \]
To complete the induction step, this has to be bounded by the right hand side of Eq. (80). If $m \ge 6$, $\frac{m}{2} - 2 \ge \frac{m}{6} \ge 1$, so
\[ \frac{1}{2} \int_0^t ds\; e^{s(\frac{m}{2} - 2)} \le \frac{1}{m - 4}\, e^{t(\frac{m}{2} - 2)} \le \frac{3}{m}\, e^{t(\frac{m}{2} - 2)}. \tag{84} \]
If $m = 4$, $\frac{1}{2} \int_0^t ds \le \frac{2}{m}(1 + t)$. If $m = 2$, $\frac{m}{2} - 2 = -1$, so $\frac{1}{2} \int_0^t ds\; e^{s(\frac{m}{2} - 2)} \le \frac{1}{2} = \frac{1}{m}$. This implies Eq. (80), with $\gamma_{mr}$ given by Eq. (81).
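The three elementary integral bounds used at the end of the proof (the chain in Eq. (84) for $m \ge 6$, and the cases $m = 4$ and $m = 2$) can be checked with a few lines of arithmetic. This is a minimal sketch; the function names `half_integral` and `stated_bound` and the sample values of $m$ and $t$ are hypothetical choices for illustration.

```python
import math


def half_integral(m, t):
    """(1/2) * integral_0^t exp(s * (m/2 - 2)) ds, evaluated in closed form."""
    a = m / 2 - 2
    if a == 0:
        return t / 2
    return (math.exp(a * t) - 1) / (2 * a)


def stated_bound(m, t):
    """Bound used in the induction step for even m >= 2."""
    if m >= 6:
        return (3 / m) * math.exp(t * (m / 2 - 2))
    if m == 4:
        return (2 / m) * (1 + t)
    if m == 2:
        return 1 / m
    raise ValueError("m must be an even integer >= 2")


checks = [
    (m, t, half_integral(m, t), stated_bound(m, t))
    for m in (2, 4, 6, 8, 12, 20)
    for t in (0.0, 0.1, 1.0, 5.0, 20.0)
]
```

In all three cases the bound is at most $\frac{3}{m}$ times the growth factor appearing in Eq. (80), which is why the prefactor $\frac{3\Delta_2}{m}$ in Eq. (81) suffices.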
Remark 1. In graphical language, the truncation removes all two-legged insertions that require renormalization and all nontrivial four-legged insertions (‘nontrivial’ means that the four-legged vertices are still there).

Remark 2. The solution to Eq. (81) is bounded by the solution to the untruncated recursion $\gamma_{m1} = v\,\delta_{m4}$, and for $r \ge 2$,
\[ \gamma_{mr} = \gamma^{(0)}_{mr} + \frac{A}{m} \int d\kappa_{mr}\; i^2 B^{i-1}\, \gamma_{m_1 r_1} \gamma_{m_2 r_2} \tag{85} \]
with $A = 3\Delta_2$ and $B = \Delta_1$. The constants $A$ and $B$ can be scaled out of the recursion. If the initial interaction is a four-fermion interaction, $g_{m,r} = m \gamma_{mr} A^{-1} B^{1 - \frac{m}{2}}$ satisfies the recursion $g_{m,1} = w\,\delta_{m4}$, with $w = 4AB$, and for $r \ge 2$,
\[ g_{m,r} = \sum_{s=1}^{r-1}\; \sum_{\substack{\mu = 2 \\ \mu \text{ even}}}^{2(s+1)}\; \sum_{k \ge 0} \binom{\mu - 1}{k} \binom{m + 2k + 1 - \mu}{k}\, g_{\mu,s}\, g_{m+2k+2-\mu,\, r-s}. \tag{86} \]
I do not provide bounds for the solution $g_{m,r}$ here; if $g_{m,r} \le \mathrm{const}^{\,r}$, then the above bounds imply that $\sum_r \lambda^r \tilde G_{mr}(t)$ is analytic in $\lambda$. This behaviour is suggested by the absence of the factor $i!$ that would appear for bosons in the recursion; see [13].
The propagators $C_t$ and $D_t$ for the renormalization group equation are defined using a partition of unity $\chi_1 + \chi_2 = 1$, $\chi_i \in C^\infty(\mathbb{R}_0^+, [0, 1])$ with
\[ \chi_1(x) = \begin{cases} 1 & \text{if } x \le \frac{1}{4} \\ 0 & \text{if } x \ge 1, \end{cases} \tag{87} \]
$\chi_1'(x) < 0$ for all $x \in (\frac{1}{4}, 1)$, and $\|\chi_1'\|_\infty \le 2$.

Proposition 3. Let $\Lambda$ be $d$-dimensional, and $B$ be the infinite-volume limit of $\Lambda^*$. For instance, for $\Lambda = \varepsilon\mathbb{Z}^d / L\mathbb{Z}^d$, with $\frac{L}{2\varepsilon} \in 2\mathbb{N}$, $B = \mathbb{R}^d / \frac{2\pi}{\varepsilon}\mathbb{Z}^d$. Let $D_t$ be given by Eq. (70) with $\|M(k)\| \le 1$, $f_t(0) = 0$, and for $k \ne 0$
\[ f_t(k) = \tilde f(k)\, \chi_1\bigl(e^{2t} \tilde f(k)^2\bigr), \tag{88} \]
with $\tilde f \in C^{d+1}(B \setminus \{0\}, \mathbb{C})$, satisfying for all $|\alpha| \le d + 1$ and all $|k| \le 1$, $|D^\alpha \tilde f(k)| \le F_d\, |k|^{1 - d - |\alpha|}$. Then Eq. (78) and Eq. (79) hold.

Proof. Equation (78) holds because $\int_{\Lambda^* \setminus \{0\}} |k|^{1-d} \chi_1(k^2 e^{2t})\, dk$ is a Riemann sum approximation to the convergent integral $\int_B |k|^{1-d} \chi_1(k^2 e^{2t})\, \frac{d^d k}{(2\pi)^d}$. The infinite-volume analogue of Eq. (79) is usually proven by integration by parts, using repeatedly
\[ (x_\nu - x'_\nu)\, \dot D_t(x - x') = -\int_B \frac{d^d k}{(2\pi)^d}\; e^{i(x - x')k}\, \frac{\partial}{\partial k_\nu} \hat{\dot D}_t(k), \tag{89} \]
which implies that $|\dot D_t(X, X')|$ falls off at least as $(1 + e^{-t}|x - x'|)^{-d-1}$ for large $|x - x'|$. On the torus at finite $L$, one iterates instead the summation by parts formula
\[ \dot D(x - x')\bigl(1 - e^{ia(x - x')}\bigr) = \int_{\Lambda^*} dk\; e^{ik(x - x')} \bigl( \hat{\dot D}(k) - \hat{\dot D}(k - a) \bigr), \tag{90} \]
which holds for all $a \in \Lambda^*$, decomposes $\Lambda$ into $2^d$ parts and chooses $a$ appropriately to get an analogue of Eq. (89). This works uniformly in $L$ because $M(0) = 0$. A similar argument is given in more detail in the proof of Lemma 5.

Remark 3. Let $d \ge 2$, $\Lambda = \varepsilon\mathbb{Z}^d / L\mathbb{Z}^d$ and
\[ L_\varepsilon(k) = \frac{2}{\varepsilon^2} \sum_{\nu=1}^{d} \bigl(1 - \cos(\varepsilon k_\nu)\bigr). \tag{91} \]
The choices $\tilde f(k) = L_\varepsilon(k)^{\frac{1-d}{2}}$, $M = 1$ (the toy model of [20]), and
\[ \tilde f(k) = \Bigl( \sum_{\nu=1}^{d} \frac{1}{\varepsilon^2} \sin^2(k_\nu \varepsilon) + (\varepsilon L_\varepsilon(k))^2 \Bigr)^{-1/2} \tag{92} \]
and
\[ M(k) = \tilde f(k) \Bigl( \sum_{\nu=1}^{d} \frac{1}{\varepsilon}\, i\gamma_\nu \sin(k_\nu \varepsilon) + \varepsilon L_\varepsilon(k) \Bigr) \tag{93} \]
(Wilson fermions) satisfy the hypotheses of Proposition 3. For all $t > 0$, the infinite-volume limit $L \to \infty$ and the continuum limit $\varepsilon \to 0$ of the Green functions $\tilde G_{mr}(t)$ exist and satisfy Eq. (80). In the first case, $\tilde f(k) \to |k|^{1-d}$, in the second case, $M(k)\tilde f(k) \to \not{p}/p^2$ as $\varepsilon \to 0$. The Euclidean Dirac matrices satisfy the Clifford algebra $\gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = \delta_{\mu\nu}$, and can be chosen hermitian, $\gamma_\mu^* = \gamma_\mu$. It is possible to adapt the matrix structure of Lemma 2 to satisfy the antisymmetry condition on $D_t$ also for this case.

Remark 4. In ultraviolet renormalizable theories, the signs in the exponents of Eq. (78) and Eq. (79) are reversed. A power counting theorem similar to the infrared power counting Theorem 1 can be proven provided that in the initial interaction, at most quartic polynomials appear. The statement is then that for $m \ge 6$, the Green function is bounded by $\mathrm{const}\; e^{-t(\frac{m}{2} - 2)}$.
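The continuum behaviour quoted in Remark 3 for the toy model rests on the fact that the lattice symbol $L_\varepsilon(k)$ of Eq. (91) approaches $|k|^2$ as $\varepsilon \to 0$, with an error of order $\varepsilon^2$. The following is a minimal numerical sketch; the function names and the sample momentum are hypothetical, and $k$ is treated as a fixed continuum vector rather than a point of the dual lattice.

```python
import math


def lattice_symbol(k, eps):
    """L_eps(k) = (2 / eps^2) * sum_nu (1 - cos(eps * k_nu)), Eq. (91)."""
    return (2.0 / eps**2) * sum(1.0 - math.cos(eps * k_nu) for k_nu in k)


def toy_f(k, eps):
    """Toy-model choice f(k) = L_eps(k)^((1 - d) / 2) from Remark 3, for k != 0."""
    d = len(k)
    return lattice_symbol(k, eps) ** ((1 - d) / 2)


k = (0.3, -0.7, 1.1)  # hypothetical momentum, d = 3
k_sq = sum(x * x for x in k)
errors = [abs(lattice_symbol(k, eps) - k_sq) for eps in (1.0, 0.1, 0.01)]
```

Since $1 - \cos x \le x^2/2$, the lattice symbol never exceeds $|k|^2$, and the entries of `errors` shrink roughly by a factor of $100$ for each factor of $10$ in $\varepsilon$.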
5. The Thermodynamic Limit of the Many-Fermion System

In this section, I apply the RGE to the many-fermion systems defined in Sect. 2. For $d = 1$, the singularity is pointlike, so the power counting bound Theorem 1 applies. For $d \ge 2$, the analogue of Theorem 1 gives only weaker bounds, which, e.g., in $d = 2$ would mean that the four-point function is still relevant and the six-point function is marginal. This is not the actual behaviour; showing better bounds in $d \ge 2$ requires a refinement using the sector technique of [20] and is deferred to another paper. I also show a simple bound for the full Green functions that takes into account the sign cancellations. If the coefficients $\gamma_{mr}$ given by Eq. (85) are exponentially bounded, this bound implies that the unrenormalized expansion converges in a region $|\lambda| \beta^d < \mathrm{const}$. I also give a simple proof that the thermodynamic limit exists in perturbation theory. One can also show bounds on the expansion coefficients for finite $n_\tau$ and $L$. Since this is mainly a tedious repetition of the infinite-volume proofs with integrals replaced by Riemann sums (it also requires that $\mu$ is chosen such that the Fermi surface contains no points of the finite-$L$ momentum space lattice), I will content myself with indicating where this is necessary. For convenience, I call the limit $n_\tau \to \infty$ and $L \to \infty$ the thermodynamic limit, although the first limit would more aptly be called the time-continuum limit. The limit $n_\tau \to \infty$ has to be taken first, because I want to apply Eq. (5) for operators on a finite-dimensional space only. However, it will turn out that for $T > 0$, the order of the two limits does not matter. Since I want to take the limit $n_\tau \to \infty$, I can assume that $n_\tau \ge 2\beta(\epsilon_0 + E_{max})$. Thus Lemma 9 applies, with $E_0 = \epsilon_t$, for all $t \ge 0$. 5.1. The component RGE in Fourier space. Let $\Lambda = T \times \boldsymbol{\Lambda}$ and $\Lambda^* = M_{n_\tau} \times \boldsymbol{\Lambda}^*$, as given in Sects. 2.1 and 2.2. Let $\Gamma = \Lambda \times \{-1, 1\} \times \{1, 2\}$ and $\Gamma^* = \Lambda^* \times \{-1, 1\} \times \{1, 2\}$.
The Fourier transforms of the $G_{mr}(t)$ are, with $\underline{P} = (P_1, \ldots, P_m)$,
\[ \hat G_{mr}(t \mid \underline{P}) = \int_{\Lambda^m} \prod_{k=1}^{m} dx_k\; e^{-i(p_1 x_1 + \ldots + p_m x_m)}\, G_{mr}(t \mid \underline{X}). \tag{94} \]
For $K = (k, \sigma, j) \in \Gamma^*$ let ${\sim}K = (-k, \sigma, 3 - j)$, and let $\delta_{\Gamma^*}(K, K') = \delta_{\Lambda^*}(k, k')\, \delta_{\sigma\sigma'}\, \delta_{jj'}$. Assume that the Fourier transform of the propagator $D_t$ is of the form
\[ \hat D_t(K, K') = \delta_{\Gamma^*}(K, {\sim}K')\; \bar D_t(K'), \tag{95} \]
where
\[ \bar D_t(K) = (-1)^j\, \tilde D_t\bigl((-1)^j k\bigr). \tag{96} \]
This combination of signs implies that $D_t(X, Y) = -D_t(Y, X)$. The propagator $\tilde D_t$ for the many-fermion system is given in Eq. (103). The Fourier transform of $Q_{mr}(t)$ is
\[ \hat Q_{mr}(t \mid \underline{P}) = \int d\kappa_{mr}\; i!\, i \int dK_1 \ldots dK_i\; \bigl(-\dot{\bar D}_t(K_1)\bigr) \prod_{j=2}^{i} \bar D_t(K_j)\; \hat G_{m_1 r_1}(t \mid \underline{P}^{(1)}, \underline{K})\; \hat G_{m_2 r_2}(t \mid {\sim}\underline{K}, \underline{P}^{(2)}) \tag{97} \]
with $\underline{P}^{(1)} = (P_1, \ldots, P_{m_1 - i})$, $\underline{P}^{(2)} = (P_{m_1 - i + 1}, \ldots, P_m)$, $\underline{K} = (K_1, \ldots, K_i)$, and ${\sim}\underline{K} = ({\sim}K_i, \ldots, {\sim}K_1)$. Here the arguments of $G_{m_2 r_2}(t)$ have been permuted and relabelled such that the determinant is transformed into a sum with the same sign for all terms (see Appendix C); this gives the extra factor $i!$. By translation invariance in space $x$ and time $\tau$,
\[ \hat G_{mr}(t \mid P_1, \ldots, P_m) = \delta_{\Lambda^*}(p_1 + \ldots + p_m, 0)\; I_{mr}(t \mid P_1, \ldots, P_m) \tag{98} \]
with a totally antisymmetric function $I_{mr}(t \mid P_1, \ldots, P_m)$ of $(P_1, \ldots, P_m) \in \Gamma^{*m}$ that satisfies
\[ \sum_{\mu=1}^{m} \nabla_{p_\mu} I_{mr}(t \mid P_1, \ldots, P_m) = 0. \tag{99} \]
A priori, Eq. (98) implies only the existence of a function $\hat I_{mr}(t \mid P_1, \ldots, P_m)$, defined only for those $P_1, \ldots, P_m$ for which $p_1 + \ldots + p_m = 0$. However, since $H = \{p_1, \ldots, p_m : p_1 + \ldots + p_m = 0\}$ is a linear subspace of $(\Lambda^*)^m$, one can simply extend $\hat I$ to a function on all space by defining $I = \hat I \circ \Pi_H$ with $\Pi_H$ the projection to the subspace $H$. Since $\Pi_H$ is symmetric in all its arguments, $I$ is totally antisymmetric, and Eq. (99) holds because $(1, \ldots, 1) \perp H$ (in Eq. (99), $\nabla_{p_\mu}$ is a difference operator which becomes the gradient in the limit where the momenta become continuous). The product of the two $\delta_{\Lambda^*}$ in Eq. (97) can be combined to cancel the $\delta_{\Lambda^*}$ in the relation between $\hat G$ and $I$, and to remove the integration over $k_1$. Thus the RGE in Fourier space is $\frac{\partial}{\partial t} I_{mr}(t \mid \underline{P}) = \frac{1}{2} A_m Q_{mr}(t \mid \underline{P})$, with
\[ Q_{mr}(t \mid \underline{P}) = \int d\kappa_{mr}\; i!\, i \int dK_2 \ldots dK_i \sum_{\sigma_1, j_1} \bigl(-\dot{\bar D}_t(K_1)\bigr) \prod_{s=2}^{i} \bar D_t(K_s)\; I_{m_1 r_1}(t \mid \underline{P}^{(1)}, \underline{K})\; I_{m_2 r_2}(t \mid {\sim}\underline{K}, \underline{P}^{(2)}), \tag{100} \]
where $K_1 = (k_1, \sigma_1, j_1)$ and $k_1$ is fixed as $k_1 = -(k_2 + \ldots + k_i + p_1 + \ldots + p_{m_1 - i})$. I use the same symbol for the $Q$ in position and in momentum space since it will be always clear from the context which one is meant.

Remark 5. The equivalent integral equation is
\[ I_{mr}(t \mid \underline{P}) = I_{mr}(0 \mid \underline{P}) + \frac{1}{2} A_m \int_0^t ds\; Q_{mr}(s \mid \underline{P}). \tag{101} \]
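The sum $\int d\kappa_{mr}$ contains the sum over $r_1 \ge 1$ and $r_2 \ge 1$, with the restriction $r_1 + r_2 = r$. Thus only $r_1 < r$ and $r_2 < r$ occur in this sum. Therefore Eq. (101), together with an initial condition $I_{mr}(0 \mid P)$, uniquely determines the family of functions $I_{mr}(t \mid P)$. Iteration of Eq. (101) generates the usual perturbation expansion.

5.2. Bounds on the finite-volume propagator. The propagator $\hat c$ for the many-fermion system is given in Eq. (17). If $\mathbf{k}$ is such that $E(\mathbf{k}) = 0$, then $\hat c$ becomes of order $\beta$ for small $|\omega|$. In the temperature zero limit, $\beta \to \infty$, this becomes a singularity. This is the reason why renormalization is necessary. The renormalization group flow is parametrized by $t \ge 0$, where
$$\epsilon_t = \epsilon_0 e^{-t} \tag{102}$$
is a decreasing energy scale. The fixed energy scale $\epsilon_0$ was specified in Sect. 2.3. The limit of interest is $t \to \infty$. The uncutoff propagator for the many-fermion system is given in Eq. (17). Thus, for $k = (\omega, \mathbf{k}) \in \Lambda^*$ let
$$\tilde D_t(k) = \frac{1}{i\hat\omega - E(\mathbf{k})}\, \chi_1\!\left(\epsilon_t^{-2}\, |i\hat\omega - E(\mathbf{k})|^2\right), \tag{103}$$
where $\hat\omega$ is defined in Eq. (14), $\chi_1$ is given in Eq. (87), and define $\tilde C_t(k)$ similarly, with $\chi_1$ replaced by $\chi_2 = 1 - \chi_1$. Then $\tilde C_t(k) + \tilde D_t(k) = (i\hat\omega - E(\mathbf{k}))^{-1}$ is independent of $t$. The functions $\tilde C_t$ and $\tilde D_t$ define operators $\hat D_t(K, K')$ and $\hat C_t(K, K')$ on the functions on $\Gamma^*$ by Eq. (96) and Eq. (95). Denote $\dot{\tilde D}_t = \frac{\partial}{\partial t} \tilde D_t$, and let $\mathbb{1}(A) = 1$ if the event $A$ is true and $0$ otherwise.

Proposition 4. $\tilde D_t$ is a $C^\infty$ function of $t$ that vanishes identically if $t > \log\frac{\beta\epsilon_0}{2}$. If $t \le \log\frac{\beta\epsilon_0}{2}$, then
$$\mathrm{supp}\, \tilde D_t \subset \{k \in \Lambda^* : |i\hat\omega - E(\mathbf{k})| \le \epsilon_t\}, \qquad \mathrm{supp}\, \dot{\tilde D}_t \subset \{k \in \Lambda^* : \tfrac{1}{2}\epsilon_t \le |i\hat\omega - E(\mathbf{k})| \le \epsilon_t\}. \tag{104}$$
Moreover
$$|\dot{\tilde D}_t(k)| \le 4\epsilon_t^{-1}\, \mathbb{1}\big(|i\hat\omega - E(\mathbf{k})| \le \epsilon_t\big) \le 2\beta\, \mathbb{1}\big(|i\hat\omega - E(\mathbf{k})| \le \epsilon_t\big), \qquad |\tilde D_t(k)| \le \frac{\beta}{2}\, \mathbb{1}\big(|i\hat\omega - E(\mathbf{k})| \le \epsilon_t\big), \tag{105}$$
$$\int_{\Lambda^*} dk\, |\dot{\tilde D}_t(k)| \le 4V_1 \qquad \text{and} \qquad \int_{\Lambda^*} dk\, |\tilde D_t(k)| \le V_1 \log\frac{\beta\epsilon_0}{2}, \tag{106}$$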
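where $V_1$ is the constant in Eq. (19).

Proof. $\chi_1(x) = 0$ if $x \ge 1$, so $\tilde D_t(k) \ne 0$ implies $|i\hat\omega - E(\mathbf{k})| \le \epsilon_t \le 1$. By Lemma 9, this implies $|\omega| \le \frac{\pi}{2}\epsilon_t$. Since $|\omega| \ge \frac{\pi}{\beta}$, $\tilde D_t \ne 0$ only for $t \le \log\frac{\beta\epsilon_0}{2}$. Since $|i\hat\omega - E(\mathbf{k})| \ge |\mathrm{Re}\,\hat\omega| \ge \frac{2}{\beta}$, the stated properties of $\tilde D_t$ follow. The $t$-derivative gives $|\frac{\partial}{\partial t}\tilde D_t(k)| = 2\epsilon_t^{-2}\, |i\hat\omega - E(\mathbf{k})|\, |\chi_1'(\epsilon_t^{-2} |i\hat\omega - E(\mathbf{k})|^2)|$. Since $\chi_1'(x) = 0$ for $x \notin (\frac{1}{4}, 1)$, $\dot{\tilde D}_t(k) \ne 0$ implies $\frac{1}{2}\epsilon_t \le |i\hat\omega - E(\mathbf{k})| \le \epsilon_t$, which implies Eq. (105) and Eq. (104). Equation (106) follows from these inequalities by Lemma 9, by Eq. (19), and by $\tilde D_t = -\int_t^{\log(\beta\epsilon_0/2)} ds\, \dot{\tilde D}_s$.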
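The support and size statements of Proposition 4 can be checked numerically. The following is a minimal sketch, not the construction used in the paper: it assumes a hypothetical piecewise-linear cutoff `chi1` (equal to 1 below 1/4 and 0 above 1) in place of the smooth $\chi_1$ of Eq. (87), takes $\hat\omega = \omega$, and uses illustrative function names that do not appear in the text.

```python
import math

def chi1(x):
    """Hypothetical cutoff: 1 for x <= 1/4, 0 for x >= 1, linear in between.
    It is only piecewise smooth; the chi_1 of Eq. (87) is C-infinity."""
    if x <= 0.25:
        return 1.0
    if x >= 1.0:
        return 0.0
    return (1.0 - x) / 0.75

def chi1_prime(x):
    """Derivative of the ramp: -4/3 on (1/4, 1) and zero elsewhere, so |chi1'| <= 2."""
    return -4.0 / 3.0 if 0.25 < x < 1.0 else 0.0

def eps(t, eps0=1.0):
    """Energy scale eps_t = eps_0 * exp(-t), as in Eq. (102)."""
    return eps0 * math.exp(-t)

def D(t, omega, E, eps0=1.0):
    """Cutoff propagator chi1(eps_t^-2 |i omega - E|^2) / (i omega - E)."""
    z = complex(-E, omega)  # z = i*omega - E
    return chi1(abs(z) ** 2 / eps(t, eps0) ** 2) / z

def D_dot(t, omega, E, eps0=1.0):
    """t-derivative of D: uses d/dt eps_t^-2 = 2 eps_t^-2 and |z|^2 / z = conj(z)."""
    z = complex(-E, omega)
    e = eps(t, eps0)
    return 2.0 * e ** -2 * z.conjugate() * chi1_prime(abs(z) ** 2 / e ** 2)
```

With this choice, `D` is nonzero only where $|i\omega - E| \le \epsilon_t$, `D_dot` only on the shell $\frac{1}{2}\epsilon_t \le |i\omega - E| \le \epsilon_t$, and $|$`D_dot`$| \le 4\epsilon_t^{-1}$ because the slope of the ramp is bounded by 2.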
Remark 6. The bounds Eq. (106) are crude because the restriction $|E(\mathbf{k})| \le \epsilon_t$ was replaced by $|E(\mathbf{k})| \le 2$ when Eq. (19) was applied. To get a better bound, one has to require that no point of the finite-volume lattice in momentum space is on the Fermi surface $S$, because only then does $\int_{\Lambda} d\mathbf{k}\, \mathbb{1}(|E(\mathbf{k})| \le \epsilon_t) \le \mathrm{const}\, \epsilon_t$ hold uniformly in $L$.

5.3. Existence of the thermodynamic limit in perturbation theory. The proof of existence of the thermodynamic limit will proceed inductively in $r$, because of the recursive structure of the RGE mentioned in Remark 5. It will be an application of the dominated convergence theorem to Eq. (101). To this end, it is necessary to make the integration region independent of $n_\tau$ and $L$. Although $I_{mr}(t)$, given by Eq. (101), appears evaluated at $P = (P_1, \ldots, P_m) \in \Gamma^m$ on the RHS of Eq. (101) only, the integral defines the $t$-derivative of a function defined on ${\Gamma^*_\infty}^m$, where $\Gamma^*_\infty = M(\beta) \times B \times \{1, -1\} \times \{1, 2\}$ with $B = G^*$ the first Brillouin zone of the infinite lattice, and
$$M(\beta) = \{\omega_n = \tfrac{\pi}{\beta}(2n + 1) : n \in \mathbb{Z}\} \tag{107}$$
the set of Matsubara frequencies in the limit $n_\tau \to \infty$. For a bounded function $F_m : {\Gamma^*_\infty}^m \to \mathbb{C}$, let
$$|F_m|_0 = \sup_{P \in {\Gamma^*_\infty}^m} |F_m(P)|. \tag{108}$$
Up to now, the dependence of $I_{mr}(t)$ on $(n_\tau, L)$ was not denoted explicitly. I now put it in a superscript and write $I_{mr}^{(n_\tau,L)}$. Let $\lim_{(n_\tau,L)\to\infty} = \lim_{L\to\infty} \lim_{n_\tau\to\infty}$.

Lemma 3. Let $(I_{mr}^{(n_\tau,L)}(0))_{m,r,n_\tau,L}$ be a family of bounded functions such that $I_{mr}^{(n_\tau,L)}(0) = 0$ if $m > 2r + 2$, $\lim_{(n_\tau,L)\to\infty} I_{mr}^{(n_\tau,L)}(0) = I_{mr}(0)$ exists and is a bounded function on ${\Gamma^*_\infty}^m$, and $|I_{mr}^{(n_\tau,L)}(0)|_0 \le K_{mr}^{(0)}$. Let $(I_{mr}^{(n_\tau,L)}(t))_{m,r,n_\tau,L}$ be the solution to Eq. (101). Then, for all $m$ and $r$, $I_{mr}^{(n_\tau,L)}(t) = 0$ if $m > 2r + 2$, and
$$\frac{\partial}{\partial t} I_{mr}^{(n_\tau,L)}(t) = 0 \quad \text{for all } t > \log\frac{\beta\epsilon_0}{2}, \tag{109}$$
there are bounded functions $I_{mr}(t) : {\Gamma^*_\infty}^m \to \mathbb{C}$ such that
$$\lim_{(n_\tau,L)\to\infty} I_{mr}^{(n_\tau,L)}(t) = I_{mr}(t). \tag{110}$$
Let $P_{mr}$ be the polynomials defined recursively as
$$P_{mr}(x) = 6^{1-r}\, |I_{mr}(0)|_0 + \epsilon_0^{-1} \int d\kappa_{mr}\; i\, i!\; x^{i-1} P_{m_1 r_1}(x) P_{m_2 r_2}(x) \tag{111}$$
(in particular, the coefficients of $P_{mr}$ are independent of $\beta$), then for all $n_\tau$, $L$, $\beta$, $t$,
$$\big|I_{mr}^{(n_\tau,L)}(t)\big|_0 \le (\epsilon_0\beta)^{r-1}\, P_{mr}\!\left(4V_1 \log\frac{\beta\epsilon_0}{2}\right). \tag{112}$$

Proof. Induction in $r$, with the statement of the lemma as the inductive hypothesis. Let $r = 1$. Since $r_1 \ge 1$ and $r_2 \ge 1$, the right-hand side of the equation is zero, so $I_{m1}^{(n_\tau,L)}(t, P) = I_{m1}^{(n_\tau,L)}(0, P)$ for all $t$. Thus the statement follows from the hypotheses on $I_{mr}^{(n_\tau,L)}(0)$. Let $r \ge 2$, and the statement hold for all $r' < r$. Let $m > 2r + 2$. Then $I_{mr}^{(n_\tau,L)}(0) = 0$. The five-tuple $(m_1, r_1, m_2, r_2, i)$ contributes to the right-hand side only if $r_1 + r_2 = r$,
m1 + m2 = m + 2i, i ≥ 1, and by the inductive hypothesis, only if m1 ≤ 2r1 + 2 and (nτ ,L) (t) can be nonzero only if m = m1 +m2 −2i ≤ m1 +m2 −2 ≤ m2 ≤ 2r2 +2. Thus Imr 2r1 + 2 + 2r2 + 2 − 2 = 2r + 2. The integral appearing on the right-hand side of Eq. (101) is a Riemann sum approximation to an integral over the (t, nτ , L)-independent region (nτ ,L) have a s ∈ [0, ∞), kj ∈ M(β) × B. By the inductive hypothesis, the factors Im k rk 2 ¯ limit Eq. (110) satisfying Eq. (112). Since for all t, Dt is a bounded C function, the same holds for the propagators (boundedness holds because β < ∞). Thus the integrand converges pointwise, and it suffices to show that it is bounded by an integrable function to get Eq. (110) and Eq. (112) by an application of the dominated convergence theorem. Let α = 4V1 log β2 0 , and let g be the function on [0, ∞) × (M(β) × B)i−1 given by Z 2 β0 g(s, k) = dκmr i i!Pm1 r1 (α)β r1 −1 Pm2 r2 (α)β r2 −1 1l s ≤ log 2 0 e−s i Zs Y dsj 1l |ωj | ≤ π2 0 e−sj 1l |E(kj | ≤ 20 e−sj . (113) j=2 0
The integrand on the right-hand side of Eq. (101) is bounded by g by Proposition 4, Lemma 9, and the inductive hypothesis Eq. (112). Because g vanishes identically for s > log β2 0 , it is integrable. By Eq. (105) and Eq. (106), Zt
Z
Z ds
g(k)dk2 . . . dki ≤ β r−1
dκmr i i! αi−1 Pm1 r1 (α)Pm2 r2 (α).
(114)
0
Thus Eq. (110) holds, and Eq. (112) holds with $P_{mr}$ given by Eq. (111) (the factor $6^{1-r}$ comes from the assumption that $\beta\epsilon_0 \ge 6$). Since the right-hand side of Eq. (101) vanishes for $t > \log\frac{\beta\epsilon_0}{2}$, Eq. (109) holds.

Remark 7. $I_{mr}^{(n_\tau,L)}(0)$ is not the initial interaction because at $t = 0$, the propagator $C_0 \ne 0$. The $I_{mr}^{(n_\tau,L)}(0)$ are obtained from the original interaction by Wick ordering with respect to $C$ and integrating over all fields with covariance $C_0$. The existence of the limit $(n_\tau, L) \to \infty$ of the $I_{mr}^{(n_\tau,L)}(0)$ is not obvious because in the limit $n_\tau \to \infty$, the absolute value of the propagator is not summable. In perturbation theory, this is no serious problem, and the $I_{mr}^{(n_\tau,L)}(0)$ are controlled by an inductive proof similar to the one above. An alternative proof is in Appendix D of [22].
5.4. The many-fermion system in the thermodynamic limit. In this section, let the many-fermion system satisfy the hypotheses stated in Sect. 2.3 with $k_0 > d$. In the thermodynamic limit, the spatial part $\mathbf{p}$ of momentum becomes continuous, $\mathbf{p} \in B$, and the set of Matsubara frequencies becomes $M(\beta)$, given in Eq. (107). It is convenient to take $(p_0, \mathbf{p}) \in \mathbb{R} \times B$ and put all the $\beta$-dependence into the integrand. To this end, I define the step function $\omega_\beta : \mathbb{R} \to M(\beta)$ by
$$\omega_\beta(p_0) = \frac{\pi}{\beta}(2n + 1) \quad \text{if } p_0 \in \Big(\frac{2\pi}{\beta} n, \frac{2\pi}{\beta}(n + 1)\Big]. \tag{115}$$
For any continuous and integrable function $f$,
$$\int_{\mathbb{R}} \frac{dp_0}{2\pi}\, f(\omega_\beta(p_0)) = \frac{1}{\beta} \sum_{\omega \in M(\beta)} f(\omega). \tag{116}$$
Moreover,
$$\sup_{p_0 \in \mathbb{R}} |\omega_\beta(p_0) - p_0| = \frac{\pi}{\beta} \qquad \text{and} \qquad \inf_{p_0 \in \mathbb{R}} |\omega_\beta(p_0)| = \frac{\pi}{\beta}, \tag{117}$$
$|\omega_\beta(p_0)| \ge \frac{1}{2}|p_0|$, and $\omega_\beta(-p_0) = -\omega_\beta(p_0)$ holds Lebesgue-almost everywhere, so that in integrals like Eq. (116), $\omega_\beta$ can be treated as an antisymmetric function. With this, the propagator now reads $\tilde D_t(p) = C_t(\omega_\beta(p_0), E(\mathbf{p}))$ with
$$C_t(x, y) = \frac{1}{ix - y}\, \chi_1\big(\epsilon_t^{-2}(x^2 + y^2)\big) \tag{118}$$
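(and $\epsilon_t = \epsilon_0 e^{-t}$). In infinite volume, the restriction on the spatial part $\mathbf{p}$ of momentum improves the bounds on the propagator over the finite-volume ones given above.

Lemma 4. $\tilde D_t$ is bounded, $C^\infty$ in $t$ and $C^{k_0}$ in $\mathbf{p}$. If $t > \log\frac{\beta\epsilon_0}{\pi}$, then $\tilde D_t(p) = 0$ for all $p \in \mathbb{R} \times B$. For all multiindices $\alpha \in \mathbb{N}_0^d$ with $|\alpha| \le k_0$, there is a constant $B_\alpha > 0$ such that
$$|D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\, \mathbb{1}\big(|i\omega_\beta(p_0) - E(\mathbf{p})| \le \epsilon_t\big) \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\, \mathbb{1}\big(|\omega_\beta(p_0)| \le \epsilon_t\big)\, \mathbb{1}\big(|E(\mathbf{p})| \le \epsilon_t\big), \tag{119}$$
$B_0 = 4$. For $t \le \log\frac{\beta\epsilon_0}{\pi}$,
$$\int_{\mathbb{R}} \frac{dp_0}{2\pi}\, |D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \epsilon_t^{-|\alpha|}\, \mathbb{1}\big(|E(\mathbf{p})| \le \epsilon_t\big), \tag{120}$$
$$\int_{\mathbb{R} \times B} \frac{d^{d+1}p}{(2\pi)^{d+1}}\, |D^\alpha \dot{\tilde D}_t(p)| \le 2J_1 B_\alpha\, \epsilon_t^{1-|\alpha|}, \tag{121}$$
and
$$\int_{\mathbb{R} \times B} \frac{d^{d+1}p}{(2\pi)^{d+1}}\, |\tilde D_t(p)| \le 8J_1\, \epsilon_t. \tag{122}$$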
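Proof. For $\tilde D_t(p)$ to be nonzero, $|\omega_\beta(p_0)| \le \epsilon_t$ must hold. By Eq. (117), $\tilde D_t = 0$ for $t > \log\frac{\beta\epsilon_0}{\pi}$. Since $\dot{\tilde D}_t(p) = -2\epsilon_t^{-2}\, (i\omega_\beta(p_0) + E(\mathbf{p}))\, \chi_1'\big(\epsilon_t^{-2}(\omega_\beta(p_0)^2 + E(\mathbf{p})^2)\big)$ and $\|\chi_1'\|_\infty \le 2$, $|\dot{\tilde D}_t(p)| \le 4\epsilon_t^{-1}\, \mathbb{1}\big(|i\omega_\beta(p_0) - E(\mathbf{p})| \le \epsilon_t\big)$. Thus $B_0 = 4$. Derivatives with respect to $\mathbf{p}$ can act on $E(\mathbf{p})$ or on $\chi_1'$, in which case they produce factors bounded by $\epsilon_t^{-1}|\nabla E(\mathbf{p})|$, so there is $B_\alpha$ such that Eq. (119) holds. Inserting Eq. (119) into the integral in Eq. (120) gives
$$\int_{\mathbb{R}} \frac{dp_0}{2\pi}\, |D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\, \mathbb{1}\big(|E(\mathbf{p})| \le \epsilon_t\big)\, M_\beta \tag{123}$$
with
$$M_\beta = \frac{1}{\beta} \left| \Big\{ n \in \mathbb{Z} : |2n + 1| \le \frac{\beta\epsilon_t}{\pi} \Big\} \right| \le \frac{2}{\pi}\, \epsilon_t, \tag{124}$$
which proves Eq. (120). This gives for the integral in Eq. (121),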
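$$\int_{\mathbb{R} \times B} \frac{d^{d+1}p}{(2\pi)^{d+1}}\, |D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \epsilon_t^{-|\alpha|} \int_B \frac{d^dp}{(2\pi)^d}\, \mathbb{1}\big(|E(\mathbf{p})| \le \epsilon_t\big). \tag{125}$$
For $t \ge 0$, $\epsilon_t \le \epsilon_0$, so by a change of coordinates in the integral,
$$\int_B \frac{d^dp}{(2\pi)^d}\, \mathbb{1}\big(|E(\mathbf{p})| \le \epsilon_t\big) \le \int_{-\epsilon_t}^{\epsilon_t} d\rho \int d\theta\, |J(\rho, \theta)| = 2J_1\, \epsilon_t, \tag{126}$$
which implies Eq. (121). Equation (122) follows by integration over $t$.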
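The properties of the step function $\omega_\beta$ used above are elementary and can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, the identity Eq. (116) is tested on a truncated frequency range with a midpoint rule, and the Matsubara count is compared against the bound $\frac{2}{\pi}\epsilon_t$, which is the reading of Eq. (124) assumed here.

```python
import math

def omega_beta(p0, beta):
    """Step function of Eq. (115): pi*(2n+1)/beta for p0 in (2*pi*n/beta, 2*pi*(n+1)/beta]."""
    n = math.ceil(p0 * beta / (2.0 * math.pi)) - 1
    return math.pi * (2 * n + 1) / beta

def matsubara_count(beta, eps_t):
    """Number of integers n with |2n+1| <= beta*eps_t/pi."""
    x = beta * eps_t / math.pi
    if x < 1:
        return 0
    k = math.floor(x)
    largest_odd = k if k % 2 == 1 else k - 1
    # odd integers +-1, +-3, ..., +-largest_odd
    return largest_odd + 1

def M_beta(beta, eps_t):
    """The quantity M_beta: the count above divided by beta."""
    return matsubara_count(beta, eps_t) / beta

def lhs_116(f, beta, n_max, s):
    """Midpoint rule for the integral of f(omega_beta(p0)) dp0/(2*pi) over |p0| < 2*pi*n_max/beta,
    with s sample points per step of omega_beta."""
    step = 2.0 * math.pi / (beta * s)
    total = 0.0
    for j in range(-n_max * s, n_max * s):
        total += f(omega_beta((j + 0.5) * step, beta)) * step
    return total / (2.0 * math.pi)

def rhs_116(f, beta, n_max):
    """(1/beta) times the sum of f over the frequencies pi*(2n+1)/beta with -n_max <= n < n_max."""
    return sum(f(math.pi * (2 * n + 1) / beta) for n in range(-n_max, n_max)) / beta
```

Because the integrand is piecewise constant, the midpoint rule reproduces the frequency sum up to rounding, which is all that Eq. (116) asserts.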
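5.5. A bound in position space. In this section, I prove a bound for the unrenormalized Green functions that motivates the statement that the unrenormalized expansion converges for $|\lambda|\beta^d < \mathrm{const}$ (if the solution to Eq. (86) is exponentially bounded, it implies this convergence).

Lemma 5. For many-fermion systems with $E \in C^{k_0}$, where $k_0 > d$, there is $\Delta_2 > 0$ such that for all $t \ge 0$ and all $\beta$,
$$|||\dot D_t||| \le \Delta_2\, e^{td}. \tag{127}$$

Proof. By definition of $|||\cdot|||$, $|||\dot D_t||| \le 4 \int_{-\beta/2}^{\beta/2} dx_0 \int_{\Lambda} d\mathbf{x}\, |\dot D_t(x_0, \mathbf{x})|$ with
$$\dot D_t(x_0, \mathbf{x}) = \int_{\mathbb{R}} \frac{dk_0}{2\pi} \int_B \frac{d^dk}{(2\pi)^d}\, e^{i x_0 \omega_\beta(k_0) + i\mathbf{k}\cdot\mathbf{x}}\, \dot C_t(\omega_\beta(k_0), E(\mathbf{k})). \tag{128}$$
By Eq. (118), $\dot C_t(\omega_\beta(k_0), E(\mathbf{k})) = -\frac{2}{\epsilon_t^2}\, (i\omega_\beta(k_0) + E(\mathbf{k}))\, \chi_1'\Big(\frac{\omega_\beta(k_0)^2 + E(\mathbf{k})^2}{\epsilon_t^2}\Big)$.

Claim. There is a constant $N$, depending on $k_0$ and $E$, such that for all $\beta$,
$$|\dot D_t(x_0, \mathbf{x})| \le N \epsilon_t\, \frac{1}{(1 + \epsilon_t|\mathbf{x}|)^{k_0}\, \big(1 + \epsilon_t \min\{|x_0|, \frac{\beta}{2} - |x_0|\}\big)^2}. \tag{129}$$
The lemma follows from Eq. (129) by
$$\int d^dx\, \frac{1}{(1 + \epsilon_t|\mathbf{x}|)^{k_0}} = \epsilon_t^{-d} \int d^d\xi\, (1 + |\xi|)^{-k_0} \tag{130}$$
and
$$\int_{-\beta/2}^{\beta/2} dx_0\, \frac{1}{\big(1 + \epsilon_t \min\{|x_0|, \frac{\beta}{2} - |x_0|\}\big)^2} \le 4\epsilon_t^{-1} \int_0^\infty \frac{du}{(1 + u)^2}. \tag{131}$$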
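Equation (129) is proven by the standard integration by parts method. Since the $k_0$ dependence is via the step function $\omega_\beta$, one has to use summation by parts in the form
$$\dot D_t(x_0, \mathbf{x})\, \big(1 - e^{\frac{2\pi i x_0}{\beta}}\big)^q = \int_{\mathbb{R}} \frac{dk_0}{2\pi} \int_B \frac{d^dk}{(2\pi)^d}\, e^{i x_0 \omega_\beta(k_0) + i\mathbf{k}\cdot\mathbf{x}}\, \big(\Delta^q_{\frac{2\pi}{\beta}} \dot C_t\big)(\omega_\beta(k_0), E(\mathbf{k})), \tag{132}$$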
where (1a f )(k0 ) = f (k0 ) − f (k0 − a). Taking q ≤ 2 and using a Taylor expansion of C˙t (x, y) of order q in x, one sees that 1q2π C˙t is still C k0 in k and that for any multiindex α with |α| ≤ k0 , ∂ α q ( ) 1 2π C˙t (ωβ (k0 ), E(k)) ≤ N1 ∂k β
β
1 1+|α|+q
β 2 t
1l |iωβ (k0 ) − E(k)| ≤ t .
By Eq. (124) and Eq. (126), this implies 2 α 1 −1−|α|−q x 1 − e 2π ˙ β ix0 . Dt (x) ≤ 4J1 N1 2 t β For |ξ| ≤ π, | sin ξ| ≥
min{|ξ|, π − |ξ|}, so 2π 2π 4 ix0 β x0 ≥ min{|x0 |, β2 − |x0 |}, ≥ sin 1 − e β β
and thus
(133)
(134)
2 π
|α|+q
t
This implies Eq. (129).
˙ t (x) ≤ N2 t . |xα |(min{|x0 |, β2 − |x0 |})q D
(135)
(136)
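The passage from the phase factor $1 - e^{2\pi i x_0/\beta}$ to the distance $\min\{|x_0|, \frac{\beta}{2} - |x_0|\}$ appears to rest on the elementary chain $|1 - e^{i\xi}| \ge |\sin\xi| \ge \frac{2}{\pi}\min\{|\xi|, \pi - |\xi|\}$ for $|\xi| \le \pi$, applied with $\xi = 2\pi x_0/\beta$. The following sketch checks that chain on a grid; the function names are illustrative and not taken from the paper.

```python
import cmath
import math

def lower_bound(x0, beta):
    """(4/beta) * min(|x0|, beta/2 - |x0|), the right-hand side of the chain."""
    return (4.0 / beta) * min(abs(x0), beta / 2.0 - abs(x0))

def phase_factor(x0, beta):
    """|1 - exp(2*pi*i*x0/beta)|, the factor produced by summation by parts."""
    return abs(1.0 - cmath.exp(2j * math.pi * x0 / beta))

def check_chain(beta, samples=2001, tol=1e-12):
    """Check |1 - e^{i xi}| >= |sin xi| >= lower_bound on a grid of x0 in [-beta/2, beta/2]."""
    for j in range(samples):
        x0 = -beta / 2.0 + beta * j / (samples - 1)
        s = abs(math.sin(2.0 * math.pi * x0 / beta))
        if phase_factor(x0, beta) + tol < s:
            return False
        if s + tol < lower_bound(x0, beta):
            return False
    return True
```

Equality in the second inequality is attained at $|x_0| = \beta/4$, which is why the check uses a small tolerance.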
Remark 8. This proof changes in finite volume: one needs the requirement that no point on the Fermi surface $S$ is a point of the momentum space lattice determined by the sidelength $L$ to get a bound that is uniform in $L$. This can be achieved by choosing the chemical potential $\mu$ appropriately.

Theorem 2. For the many-fermion model in $d \ge 1$, and with the initial condition $|||G_{mr}(0)||| = \gamma_{mr}^{(0)}$,
$$|||G_{mr}(t)||| \le \gamma_{mr}\, (\epsilon_0\beta)^{(r-1)d} \tag{137}$$
with the $\gamma_{mr}$ given by Eq. (85), where $A =$
8 0 d 12
and B = 8J1 0 .
Proof. By Eq. (122), Corollary 1, and Lemma 2, $|\det D_t^{(i-1)}| \le (8J_1\epsilon_t)^{i-1}$, so $\Delta_1 \le 8J_1\epsilon_0$. By Lemma 5, $|||\dot D_t||| \le \Delta_2\, e^{td}$. The proof is by induction in $r$, with the inductive hypothesis
$$|||G_{mr}(t)||| \le \gamma_{mr}\, e^{td(r-1)}. \tag{138}$$
The statement is trivial for $r = 1$. Let $r \ge 2$, and Eq. (138) hold for all $r' < r$. By Lemma 1 and Eq. (106), and since $r_1 + r_2 = r$,
$$|||Q_{mr}(t)||| \le \Delta_2\, e^{td(r-1)} \int d\kappa_{mr}\; i^2\, \Delta_1^{i-1}\, \gamma_{m_1 r_1} \gamma_{m_2 r_2}. \tag{139}$$
Since $r \ge 2$, $r - 1 \ge \frac{r}{2}$, so $\int_0^t ds\, e^{sd(r-1)} \le \frac{2}{dr}\, e^{td(r-1)}$, and Eq. (138) follows by integration and by $r \ge \frac{m}{4}$. Because $D_t = 0$ if $t > \log(\beta\epsilon_0)$, Eq. (137) holds.

Remark 9. The bounds here are rather crude because $\epsilon_t \le \epsilon_0$ was used. However, these are bounds for the full Green functions, not for truncations. Improving them requires renormalization and, for $d \ge 2$, the sector technique of [20].

Proposition 5. The hypotheses of Theorem 1 are satisfied for the many-fermion model in $d = 1$ in the thermodynamic limit.
Proof. Equation (78) follows from Eq. (121). Equation (79) follows from Lemma 5.

Proposition 6. Let $d \ge 2$. If the sum $\int d\kappa_{mr}$ is truncated to $m_1 > 2(d + 1)$ and $m_2 > 2(d + 1)$, then the $G_{mr}$ defined by this truncation satisfy for all $m > 2d + 2$,
$$|||G_{mr}(t)||| \le \gamma_{mr}\, e^{t(\frac{m}{2} - (d+1))}. \tag{140}$$
If these bounds were sharp, all $m$-point functions up to $m = 2d$ would be relevant in the RG sense, and the $(2d + 2)$-point function would be marginal. This is not really the case; see [20]. It is, however, remarkable that such a simple power counting ansatz goes through at all in $d \ge 2$.

6. Regularity of the Selfenergy

In this section, I study the regularity question for the skeleton selfenergy, which is very nontrivial even in perturbation theory. I show that all notions introduced in [21–23] arise in a natural way in the RGE and prove that the skeleton selfenergy is twice differentiable. This verifies the regularity criterion (1) for Fermi liquids in perturbation theory. The proof given here is a considerable simplification of the ones in [21, 22]. The main technique in doing the regularity proofs, and in showing that the generalized ladder graphs give the only contribution to the four-point function that is not bounded uniformly in $\beta$, is the volume improvement technique invented in [21]. Here, I show that the overlapping loops, and all related concepts developed in [21–23], appear naturally in the Wick-ordered continuous RGE. In particular, overlapping loops always appear in the skeleton two-point function, and the only nonoverlapping part of the skeleton four-point function is $m_1 = m_2 = 4$, which is precisely the ladder part of the four-point function. Moreover, the double overlaps of [23] arise in a natural way when the integral equation Eq. (101) is iterated. The main reason why all these effects are seen so easily is Wick ordering. The RGE without Wick ordering has too little structure to make these effects explicit in a convenient way. For instance, the volume improvement effect from overlapping loops is a two-loop effect, whereas the non-Wick-ordered RGE is a one-loop equation.

6.1. Volume-improved bounds. The indicator functions restrict the spatial support of the propagator to regions
$$R(\epsilon_t) = \{\mathbf{p} \in B : |E(\mathbf{p})| \le \epsilon_t\}. \tag{141}$$
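The volume of the intersection of $R$ and its translates occurs in the bounds for the integrals in the RGE. Good bounds for these volumes are the key to the analysis of these systems.

Lemma 6. Let $k_0 \ge 2$, $\pi(\rho, \theta)$ be as defined in Sect. 2, and
$$\mathcal{W}(\varepsilon) = \sup_{q \in B}\, \max_{v_i \in \{\pm 1\}} \int_{S^{d-1}} d\theta_1 \int_{S^{d-1}} d\theta_2\, \mathbb{1}\big(|E(v_1 \pi(0, \theta_1) + v_2 \pi(0, \theta_2) + q)| \le \varepsilon\big). \tag{142}$$
There is a constant $Q_V \ge 1$ such that for all $0 < \varepsilon \le \epsilon_0$,
$$\mathcal{W}(\varepsilon) \le Q_V\, \varepsilon \begin{cases} 1 + |\log \varepsilon| & \text{if } d = 2 \\ 1 & \text{if } d \ge 3. \end{cases} \tag{143}$$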
Proof. This is Theorem 1.2 of [22]. The constant $Q_V$ depends on the curvature of the Fermi surface, hence on $\epsilon_0$.

I now prepare for the regularity proofs by using the volume improvement of Lemma 6 to bound the right-hand side of the RGE. This is possible because the graph drawn in Fig. 2 is overlapping according to the graph classification of [21], if $i \ge 3$.
$$D^\alpha Q_{mr}(t \mid P) = \int d\kappa_{mr}(m_1, r_1, m_2, r_2, i)\; i\, i!\; X_{\alpha,i}(t \mid P), \tag{144}$$
where, after a change of variables $k_j \to (-1)^{l_j} k_j$,
$$X_{\alpha,i}(t \mid P) = \sum_{\sigma \in \{-1,1\}^i} \sum_{l \in \{1,2\}^i} \sum_\alpha \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!} \int_{\mathbb{R}\times B} dk_2 \ldots \int_{\mathbb{R}\times B} dk_i\; D^{\alpha_0} \dot{\tilde D}_t(k_1) \prod_{j=2}^i \tilde D_t(k_j)\; D^{\alpha_1} I_{m_1 r_1}(t \mid P^{(1)}, K)\; D^{\alpha_2} I_{m_2 r_2}(t \mid {\sim}K, P^{(2)}) \tag{145}$$
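with $K_j = (k_j, \sigma_j, l_j)$, the sum over $\alpha$ running over all triples $(\alpha_0, \alpha_1, \alpha_2)$ with $\alpha_0 + \alpha_1 + \alpha_2 = \alpha$, and $k_1 = -\sum_{j=2}^i (-1)^{l_j} k_j + p_1 + \ldots + p_{m_1-i+1}$. $X_{\alpha,i}$ depends also on $(m_1, r_1, m_2, r_2, m, r)$. Application of $|\cdot|_0$ gives
$$|X_{\alpha,i}(t)|_0 \le 4^i \sum_\alpha \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, |D^{\alpha_1} I_{m_1 r_1}(t)|_0\, |D^{\alpha_2} I_{m_2 r_2}(t)|_0\; Y_{\alpha_0,i}(t) \tag{146}$$
with
$$Y_{\alpha,i}(t) = \sup_{Q \in M(\beta)\times B}\, \max_{v_i \in \{-1,1\}} \int \prod_{j=2}^i |\tilde D_t(k_j)|\, dk_j\; \Big| D^\alpha \dot{\tilde D}_t\Big(\sum_{j=2}^i v_j k_j + Q\Big) \Big|. \tag{147}$$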
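Lemma 7. $Y_{\alpha,i}(t) = 0$ for $t > \log\frac{\beta\epsilon_0}{\pi}$, and
$$Y_{\alpha,i}(t) \le (8J_1)^{i-1} B_\alpha\, \epsilon_t^{i-2-|\alpha|}. \tag{148}$$
If $i \ge 3$,
$$Y_{\alpha,i}(t) \le (8J_1)^{i-1} B_\alpha K^{(1)}\, \tfrac{1}{2}(1 + t)\, \epsilon_t^{i-1-|\alpha|} \tag{149}$$
with $K^{(1)}$ given in Eq. (158).

Proof. By Eq. (119), $|D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \epsilon_t^{-1-|\alpha|}$, so
$$Y_{\alpha,i}(t) \le B_\alpha\, \epsilon_t^{-1-|\alpha|} \Big( \int_{\mathbb{R}\times B} |\tilde D_t(k)|\, dk \Big)^{i-1}, \tag{150}$$
thus Eq. (148) holds by Eq. (121). For $i \ge 3$,
$$Y_{\alpha,i}(t) \le \Big( \int_{\mathbb{R}\times B} |\tilde D_t(k)|\, dk \Big)^{i-3} \sup_{Q \in M(\beta)\times B}\, \max_{v_1, v_2 \in \{-1,1\}} Y_\alpha(t, Q, v_1, v_2) \tag{151}$$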
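with
$$Y_\alpha(t, Q, v_1, v_2) = \int dk_1 \int dk_2\, |\tilde D_t(k_1)|\, |\tilde D_t(k_2)|\, |D^\alpha \dot{\tilde D}_t(v_1 k_1 + v_2 k_2 + Q)|. \tag{152}$$
In $Y_\alpha(t)$, I use $\dot{\tilde D}_t = 0$ for $t > \log\frac{\beta\epsilon_0}{\pi}$, to write $\tilde D_t(k_j) = -\int_t^{\log(\beta\epsilon_0)} dt_j\, \dot{\tilde D}_{t_j}(k_j)$. By Eq. (119), $|D^\alpha \dot{\tilde D}_t(p)| \le B_\alpha\, \epsilon_t^{-1-|\alpha|}\, \mathbb{1}\big(|E(\mathbf{p})| \le \epsilon_t\big)$. Inserting this and doing the integrals over $(k_1)_0$ and $(k_2)_0$, I get by Eq. (120),
$$Y_\alpha(t, Q, v_1, v_2) \le 8^2 B_\alpha\, \epsilon_t^{-1-|\alpha|} \int_t^{\log(\beta\epsilon_0)} dt_1 \int_t^{\log(\beta\epsilon_0)} dt_2\; \mathcal{V}(t, t_1, t_2), \tag{153}$$
with
$$\mathcal{V}(t, t_1, t_2) = \int_{R(\epsilon_0 e^{-t_1})} d\mathbf{k}_1 \int_{R(\epsilon_0 e^{-t_2})} d\mathbf{k}_2\; \mathbb{1}\big(|E(v_1 \mathbf{k}_1 + v_2 \mathbf{k}_2 + Q)| \le \epsilon_0 e^{-t}\big). \tag{154}$$
With the coordinates $(\rho, \theta)$ defined in Eq. (21),
$$\mathcal{V}(t, t_1, t_2) = \int_{-\epsilon_0 e^{-t_1}}^{\epsilon_0 e^{-t_1}} d\rho_1 \int_{-\epsilon_0 e^{-t_2}}^{\epsilon_0 e^{-t_2}} d\rho_2 \int d\theta_1 \int d\theta_2\; J(\rho_1, \theta_1) J(\rho_2, \theta_2)\; \mathbb{1}\big(|E(v_1 \pi(\rho_1, \theta_1) + v_2 \pi(\rho_2, \theta_2) + Q)| \le \epsilon_0 e^{-t}\big). \tag{155}$$
Since $|\rho_j| \le \epsilon_0 e^{-t_j} \le \epsilon_0 e^{-t}$, Eq. (21) implies $|E(v_1 \pi(\rho_1, \theta_1) + v_2 \pi(\rho_2, \theta_2) + Q) - E(v_1 \pi(0, \theta_1) + v_2 \pi(0, \theta_2) + Q)| \le \frac{4}{g_0} |E|_1\, \epsilon_0 e^{-t}$, with $|E|_1 = |\nabla E|_0$. Using $J(\rho_j, \theta_j) \le J_0$ and doing the $\rho$-integrals, I get
$$\mathcal{V}(t, t_1, t_2) \le (2\epsilon_0 J_0)^2\, e^{-t_1 - t_2}\; \mathcal{W}\Big( \big(1 + \tfrac{4|E|_1}{g_0}\big)\, \epsilon_0 e^{-t} \Big) \tag{156}$$
with $\mathcal{W}$ given by Eq. (142). The integrals over $t_1$ and $t_2$ give
$$Y_\alpha(t, Q, v_1, v_2) \le 4^4 B_\alpha J_0^2\, \epsilon_t^{1-|\alpha|}\; \mathcal{W}\Big( \big(1 + \tfrac{4|E|_1}{g_0}\big)\, \epsilon_0 e^{-t} \Big). \tag{157}$$
This implies Eq. (149) with
$$K^{(1)} = 2 \Big(\frac{J_0}{J_1}\Big)^2 Q_V \Big(1 + \frac{4|E|_1}{g_0}\Big) \Big( 1 + \log\Big(1 + \frac{4|E|_1}{g_0}\Big) + |\log \epsilon_0| \Big). \tag{158}$$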
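This gives the following

Lemma 8. For all $i$,
$$|X_{\alpha,i}(t)|_0 \le 4^i (8J_1)^{i-1} \sum_\alpha \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, |D^{\alpha_1} I_{m_1 r_1}(t)|_0\, |D^{\alpha_2} I_{m_2 r_2}(t)|_0\; B_{\alpha_0}\, \epsilon_t^{i-2-|\alpha_0|}. \tag{159}$$
For $i \ge 3$,
$$|X_{\alpha,i}(t)|_0 \le 4^i (8J_1)^{i-1} \sum_\alpha \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, |D^{\alpha_1} I_{m_1 r_1}(t)|_0\, |D^{\alpha_2} I_{m_2 r_2}(t)|_0\; B_{\alpha_0} K^{(1)}\, \epsilon_t^{i-1-|\alpha_0|} \begin{cases} \frac{1}{2}(1 + t) & \text{if } d = 2 \\ 1 & \text{if } d \ge 3. \end{cases} \tag{160}$$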
Equation (160) is the implementation of the volume improvement bound from overlapping loops in the continuous RGE setting. In the next section I use it to prove regularity properties of the selfenergy.

6.2. The ladder four-point function and the skeleton selfenergy.

Definition 2. The RGE for the skeleton functions $\tilde I_{mr}(t)$ is given by Eq. (100), but with $\int d\kappa_{mr}$ replaced by $\int d\tilde\kappa_{mr}$, where in the latter the sums over $m_1$ and $m_2$ start at four instead of two. The function $\tilde I_{2r}(t)$ is the skeleton selfenergy of the model.

This truncation of the sum prevents any two-legged insertions that require renormalization from occurring in the graphical expansion. To give a precise meaning to the statement that the ladder resummation takes into account the most singular contributions to the four-point function, I split the skeleton four-point function into two pieces, as follows. In $Q_{4r}$, $m_1 + m_2 = 4 + 2i$, so $i = \frac{1}{2}(m_1 + m_2 - 4) \ge 2$ since the skeleton condition has removed $m_1 = 2$ and $m_2 = 2$. Let $Q_{4,r,2}$ be the $i = 2$ term in this sum; it corresponds to the "bubble" graph drawn in Fig. 3.
Fig. 3. The graph corresponding to $Q_{4,r,2}$ (external legs $P_1$, $P_2$, $P_3$, $P_4$; internal lines labelled $K$ and $-K$)
More explicitly, Q4,r,2 is given by X Z X 1 h ¯ t (K) D ¯˙ t (K 0 ) dk D Q4,r,2 (t | P1 , . . . , P4 ) = A4 144 2 r1 +r2 =r i1 ,σ1 ,i2 ,σ2 i 0 I˜4,r1 (t | P1 , P2 , K , K) I˜4,r2 (t |∼ K, ∼ K 0 , P3 , P4 ) (161) with K = (k, i1 , σ1 ) and K 0 = (−p1 − p2 − k, i2 , σ2 ). Let Q4,r,≥3 (t | P ) = Q4,r (t | P ) − Q4,r,2 (t | P )
(162)
be the contribution from all terms where $i \ge 3$ lines connect the two vertices in $Q_{4,r}$. Correspondingly, let
$$\tilde I_{4,r}(t) = B_r(t) + U_r(t), \tag{163}$$
where $B_r(t)$ and $U_r(t)$ are defined by
$$\frac{\partial}{\partial t} B_r(t) = \frac{1}{2} A_4\, Q_{4,r,2}(t), \qquad \frac{\partial}{\partial t} U_r(t) = \frac{1}{2} A_4\, Q_{4,r,\ge 3}(t) \tag{164}$$
with the initial condition $B_r(0) + U_r(0) = \tilde I_{4,r}(0)$ (where it is understood that one of the two summands on the left-hand side is set to zero, see below). Note that because $\tilde I_{4,r}$ occurs on the right-hand side of both equations in Eq. (164), Eq. (164) is a coupled system of differential equations.

Definition 3. The function $B_r^{(L)}$ obtained from the skeleton RGE by the truncation
$$\frac{\partial}{\partial t} U_r(t) = 0, \qquad U_r(0) = 0 \tag{165}$$
is the ladder skeleton four-point function. The function $U_r^{(N)}$ obtained by the truncation
$$\frac{\partial}{\partial t} B_r(t) = 0, \qquad B_r(0) = 0 \tag{166}$$
is the non-ladder skeleton four-point function. The functions $\tilde I_{m,r}^{(N)}$, obtained with this truncation (i.e., where $m_1 = 4$ and $m_2 = 4$ are left out in the sum $\int d\tilde\kappa_{4,r}$), are the non-ladder skeleton Green functions.
The motivation for the split is that in $Q_{4,r,\ge 3}$, the number of internal lines of the graph in Figure 2 is $i \ge 3$, so the volume improvement of Lemmas 7 and 8 improves the power counting. The constants in the following theorems are independent of $\beta$.

Theorem 3. There is a constant $L_2$ such that
$$\big|B_r^{(L)}(t)\big|_0 \le L_2^{\,r}\, \big(\tfrac{1}{2}(1 + t)\big)^{r-1} \le L_2^{\,r}\, |\log(\beta\epsilon_0)|^{r-1}. \tag{167}$$
The series $\sum_r B_r^{(L)}(t)$ converges uniformly in $t$ and $P$ if $|\lambda \log \beta\epsilon_0| < L_2^{-1}$.
Proof. This follows immediately by induction from Eq. (161) by use of Eq. (159) with $i = 2$ and $\alpha = 0$.

Using one-loop volume bounds one can show that both the particle-particle and the particle-hole ladder are bounded uniformly in $t$ and $\beta$ for $Q \ne 0$, so that the only singularity in the four-point function can arise at zero momentum (for a discussion of this, see, e.g., [24]). In the particle-particle ladder, this singularity is really there, and it implies that Fermi liquid behaviour occurs only above a critical temperature: the $\log \beta$ in Definition 1 is the logarithm occurring in Theorem 3. The next theorem states that the non-ladder skeleton four-point function is bounded and that consequently, the non-ladder skeleton selfenergy is $C^1$ uniformly in $\beta$. This shows that indeed, only the ladder four-point function produces a nonuniformity in $\beta$ (this motivates the alternative criterion for Fermi liquid behaviour in Sect. 2.6). The second derivative, however, is only bounded by a power of $\log \beta$ in $d = 2$; this motivates why at zero temperature, the selfenergy is only required to be $C^{2-\delta}$ for some $\delta > 0$.
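Theorem 4. For all $r \ge 1$, the non-ladder skeleton functions $\tilde I_{2,r}^{(N)}(t)$ converge for $t \to \infty$ to a $C^2$ function. There are constants $L_{3,r}$ and $L_{4,r}$, independent of $\beta$, such that
$$\big|D^\alpha \tilde I_{2,r}^{(N)}(t)\big|_0 \le L_{3,r} \begin{cases} 1 & \text{if } |\alpha| \le 1 \\ (\log \beta\epsilon_0)^2 & \text{if } |\alpha| = 2 \text{ and } d = 2 \\ \log \beta\epsilon_0 & \text{if } |\alpha| = 2 \text{ and } d \ge 3, \end{cases} \tag{168}$$
and
$$\big|D^\alpha \tilde I_{4,r}^{(N)}(t)\big|_0 \le L_{4,r} \begin{cases} 1 & \text{if } |\alpha| = 0 \\ \log \beta\epsilon_0 & \text{if } |\alpha| = 1 \\ \beta\epsilon_0 & \text{if } |\alpha| = 2. \end{cases} \tag{169}$$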
If the ladder four-point function is left in the RGE, its logarithmic growth (Theorem 3) shows up in all other Green functions, and in the selfenergy:

Theorem 5. The skeleton functions $\tilde I_{2,r}$ and $\tilde I_{4,r}$ converge for $t \to \infty$ and satisfy
$$\big|D^\alpha \tilde I_{m,r}(t)\big|_0 \le L_{5,m,r}\, (\log \beta\epsilon_0)^r \tag{170}$$
for $m = 2$, $|\alpha| \le 2$, and $m = 4$, $|\alpha| = 0$.

Theorems 4 and 5 are proven in the next section.

Remark 10. To prove bounds with a good $\beta$ behaviour for $m \ge 6$ requires the use of different norms, where part of the momenta are integrated [19]. It is easy to see that for $m \ge 6$, the connected $m$-point functions have singularities and thus are not uniformly bounded functions of momentum; for instance the second order six-point function is given by $C(P)$, which is $O(\beta)$ if $\mathbf{p}$ is on the Fermi surface and if $\omega = \pi/\beta$.

Remark 11. A bound $\mathrm{const}^{\,r}$ for the constants in Theorem 5 would suffice to show the first requirement of the Fermi liquid behaviour defined in Definition 1, namely the convergence of the perturbation expansion for the skeleton functions for $|\lambda \log \beta| < \mathrm{const}$. The proofs given in this section do not imply this bound because in the momentum space equation, the factorial remains. It may, however, be possible to give such bounds in $d = 2$ by combining the sector technique of [20] with the determinant bound.

6.3. The regularity proofs. At positive temperature, the frequencies are still discrete, whereas the spatial part of momentum is a continuous variable. Whenever derivatives with respect to $p_0$ are written below, they are understood as a difference operation $\frac{\beta}{2\pi}\big(f(p_0 + \frac{2\pi}{\beta}) - f(p_0)\big)$. The RGE actually defines the Green functions for (almost all) real values of $p_0$, not just the discrete set of Matsubara frequencies, so the effect of such a difference can be bounded by Taylor expansion. This changes at most constants, so I shall not write this out explicitly in the proofs. The following theorem implies Theorem 4 about the non-ladder skeleton selfenergy and four-point function.
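Note that all bounds in this theorem are independent of $\beta$.

Theorem 6. Let $\alpha$ be a multiindex with $|\alpha| \le 2$. For all $m$, $r$, let $I_{mr}(0)$ be $C^2$ in $(p_1, \ldots, p_m)$, and assume that there are $K_{mr}^{(0)} \ge 0$ such that $|D^\alpha I_{mr}(0)| \le K_{mr}^{(0)}$, with $K_{mr}^{(0)} = 0$ for $m > 2r + 2$, and $K_{m1}^{(0)} = \delta_{m4}\, v$, where $v > 0$. Let $\tilde I_{mr}^{(N)}(t)$ be the Green functions generated by the non-ladder skeleton RGE, as given in Definition 3, with initial values $I_{mr}(0)$. Let $K_{m1} = K_{m1}^{(0)}$ and for $r \ge 2$, let
$$K_{mr} = K_{mr}^{(0)} + \frac{1}{m} M_1 \int d\tilde\kappa_{mr}\; i\, i!\, (32J_1)^{i-1} K_{m_1 r_1} K_{m_2 r_2}, \tag{171}$$
where $M_1 = 240\, B K^{(1)}$, with $B = \max_\alpha B_\alpha$. Then $K_{mr} = 0$ if $m > 2r + 2$, and for all $t \ge 0$ and all $m$, $r$,
$$\big|D^\alpha \tilde Q_{mr}^{(N)}(t)\big|_0 \le 2K_{mr} \begin{cases} \epsilon_t^{2 - \frac{m}{2} - |\alpha|} & \text{if } m \ge 6 \\ \epsilon_t^{1-|\alpha|}\, \frac{1+t}{2} & \text{if } m = 4 \\ \epsilon_t^{2-|\alpha|}\, \big(\frac{1+t}{2}\big)^{\delta_{d,2}} & \text{if } m = 2. \end{cases} \tag{172}$$
Moreover, for $m \ge 6$,
$$\big|D^\alpha \tilde I_{mr}^{(N)}(t)\big|_0 \le K_{mr}\, \epsilon_t^{2 - \frac{m}{2} - |\alpha|}, \tag{173}$$
for $m = 4$,
$$\big|D^\alpha \tilde I_{mr}^{(N)}(t)\big|_0 \le K_{4r} \begin{cases} 1 & \text{if } \alpha = 0 \\ \big(\frac{1+t}{2}\big)^2 & \text{if } |\alpha| = 1 \\ \epsilon_t^{-1}\, \frac{1+t}{2} & \text{if } |\alpha| = 2, \end{cases} \tag{174}$$
and for $m = 2$,
$$\big|D^\alpha \tilde I_{mr}^{(N)}(t)\big|_0 \le K_{2r} \begin{cases} 1 & \text{if } |\alpha| \le 1 \\ \big(\frac{1+t}{2}\big)^2 & \text{if } |\alpha| = 2 \text{ and } d = 2 \\ \frac{1+t}{2} & \text{if } |\alpha| = 2 \text{ and } d = 3. \end{cases} \tag{175}$$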
Proof. Induction in $r$, with the statement of the theorem as the inductive hypothesis. $r = 1$ is trivial because $\tilde Q_{m1}^{(N)} = 0$ and because the statement holds for $I_{m1}(0)$. Let $r \ge 2$ and the statement hold for all $r' < r$. The inductive hypothesis applies to both factors $\tilde I_{m_k r_k}^{(N)}$ in $\tilde Q_{mr}^{(N)}$. For $m_k = 4$, it implies that
$$\big|D^\alpha \tilde I_{m_k r_k}^{(N)}(t)\big|_0 \le K_{m_k r_k}\, \epsilon_t^{-|\alpha|} = K_{m_k r_k}\, \epsilon_0^{-|\alpha|}\, e^{t(|\alpha| + \frac{m_k}{2} - 2)} \tag{176}$$
for all $t \ge 0$. Recall Eq. (159) and Eq. (160), and
$$\big|D^\alpha A_m \tilde Q_{mr}^{(N)}(t)\big|_0 = \big|A_m D^\alpha \tilde Q_{mr}^{(N)}(t)\big|_0 \le \big|D^\alpha \tilde Q_{mr}^{(N)}(t)\big|_0 \le \int d\tilde\kappa_{mr}^{(N)}\; i\, i!\, |X_{\alpha,i}(t)|_0. \tag{177}$$
Let $m \ge 6$. By Eq. (159) and $m_1 + m_2 = m + 2i$, the $t$-dependent factors in $X_{\alpha,i}(t)$ are
$$e^{t(\frac{m_1}{2} - 2 + |\alpha_1|)}\; e^{t(\frac{m_2}{2} - 2 + |\alpha_2|)}\; e^{t(2 - i + |\alpha_0|)} = e^{t(\frac{m}{2} - 2 + |\alpha|)}. \tag{178}$$
Since $\frac{m}{2} - 2 + |\alpha| \ge \frac{m}{2} - 2 \ge 1$, integrating the RGE gives
$$\big|D^\alpha \tilde I_{mr}^{(N)}(t)\big|_0 \le \big|D^\alpha \tilde I_{mr}^{(N)}(0)\big|_0 + \frac{1}{2} \int_0^t ds\, \big|D^\alpha \tilde Q_{mr}^{(N)}(s)\big|_0 \le K_{mr}^{(0)} + \frac{1}{m - 4}\, \nu_{mr}\, \epsilon_t^{2 - \frac{m}{2} - |\alpha|}, \tag{179}$$
where
$$\nu_{mr} = 4B \int d\tilde\kappa\; i\, i!\, (32J_1)^{i-1} \sum_\alpha \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\, K_{m_1 r_1} K_{m_2 r_2}. \tag{180}$$
For $m \ge 6$, $m - 4 \ge \frac{m}{3}$. Moreover, $\sum_\alpha \frac{\alpha!}{\alpha_0!\,\alpha_1!\,\alpha_2!} \le 3^{|\alpha|}$, so Eq. (172) and Eq. (173) follow.

Let $m = 4$. One of $m_1$ and $m_2$ must be at least six because the ladder part is left out in $\tilde I_{mr}^{(N)}$. Thus $i = \frac{1}{2}(m_1 + m_2 - 4) \ge 3$. By Eq. (160), there is an extra small factor $\epsilon_t = \epsilon_0 e^{-t}$, so
$$\big|D^\alpha \tilde Q_{4,r}^{(N)}(t)\big|_0 \le \frac{1 + t}{2}\, \epsilon_t^{1-|\alpha|}\, K^{(1)} \nu_{mr}, \tag{181}$$
which proves Eq. (172). Equation (174) follows by integration.

Let $m = 2$. The case $m_1 = m_2 = 2$ is excluded since this is the skeleton RG. Thus $i = \frac{1}{2}(m_1 + m_2 - 2) \ge 3$, and Eq. (160) implies
$$\big|D^\alpha \tilde Q_{2,r}^{(N)}(t)\big|_0 \le \Big(\frac{1 + t}{2}\Big)^{\delta_{d,2}} \epsilon_t^{2-|\alpha|}\, K^{(1)} \nu_{2r}. \tag{182}$$
Thus Eq. (172) holds, and Eq. (175) follows by integration.
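The bookkeeping in this proof can be verified mechanically. The sketch below is only an illustration, with hypothetical helper names: it checks that the three exponents in Eq. (178) add up to $\frac{m}{2} - 2 + |\alpha|$ whenever $m_1 + m_2 = m + 2i$, and that the skeleton condition $m_1, m_2 \ge 4$ forces $i \ge 2$ for $m = 4$, $i \ge 3$ for $m = 4$ once the ladder term $m_1 = m_2 = 4$ is removed, and $i \ge 3$ for $m = 2$.

```python
from fractions import Fraction

def lines(m, m1, m2):
    """Number i of internal lines from m1 + m2 = m + 2i; None unless i is an integer >= 1."""
    twice_i = m1 + m2 - m
    if twice_i < 2 or twice_i % 2:
        return None
    return twice_i // 2

def scale_exponent(m, m1, m2, a0, a1, a2):
    """Sum of the three exponents of e^t on the left-hand side of Eq. (178);
    a0, a1, a2 stand for |alpha_0|, |alpha_1|, |alpha_2|."""
    i = lines(m, m1, m2)
    return (Fraction(m1, 2) - 2 + a1) + (Fraction(m2, 2) - 2 + a2) + (2 - i + a0)

def min_lines(m, exclude_ladder=False, m_max=12):
    """Smallest i over even m1, m2 >= 4 (skeleton condition), optionally without m1 = m2 = 4."""
    best = None
    for m1 in range(4, m_max + 1, 2):
        for m2 in range(4, m_max + 1, 2):
            if exclude_ladder and m1 == 4 and m2 == 4:
                continue
            i = lines(m, m1, m2)
            if i is not None and (best is None or i < best):
                best = i
    return best
```

For example, `min_lines(4)` returns 2 (the bubble of Fig. 3), while `min_lines(4, exclude_ladder=True)` and `min_lines(2)` both return 3, which is the threshold at which Eq. (160) applies.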
Proof of Theorem 4. Let $|\alpha| \le 1$. By Eq. (175), $|D^\alpha \tilde I_{2r}^{(N)}(t)| \le K_{2,r}$. By Eq. (172), $|D^\alpha \frac{\partial}{\partial t} \tilde I_{2r}^{(N)}(t)| \le K_{2,r}\, \epsilon_0 e^{-t}\, \frac{1+t}{2} \to 0$ as $t \to \infty$. Thus the limit $t \to \infty$ of $\tilde I_{2,r}^{(N)}(t)$ exists and is a $C^1$ function of $(p_1, p_2)$. All constants are uniform in $\beta$. The second derivative of $\tilde Q_{2,r}^{(N)}$ is $O(t)$ in $d = 2$ and $O(1)$ in $d = 3$. Since $\tilde Q_{2,r}^{(N)} = 0$ for $t > \log \beta\epsilon_0$, the integral for $\tilde I_{2,r}^{(N)}$ over $t$ runs only up to $\log \beta\epsilon_0$, which gives the stated dependence on $\log \beta\epsilon_0$.
Remark 12. Note that the bounds for $m = 2$ and $|\alpha| = 2$ in Theorem 6 do not imply convergence. This is the source of the logarithmic behaviour discussed in Sect. 2 and in [22].

The following theorem implies Theorem 5. Here the bounds have an explicit $\beta$-dependence. One could avoid this $\beta$-dependence by including polynomials in $t$, but this would also increase the combinatorial coefficients by factorials.

Theorem 7. Let $\alpha$ be a multiindex with $|\alpha| \le 2$. For all $m$, $r$, let $I_{mr}(0)$ be $C^2$ in $(p_1, \ldots, p_m)$, and assume that there are $K_{mr}^{(0)} \ge 0$ such that $|D^\alpha I_{mr}(0)| \le K_{mr}^{(0)}$, with $K_{mr}^{(0)} = 0$ for $m > 2r + 2$, and $K_{m1}^{(0)} = \delta_{m4}\, v$, where $v > 0$. Let $\tilde I_{mr}(t)$ be the Green functions generated by the skeleton RGE, as given in Definition 2, with initial values $I_{mr}(0)$. Let $K_{m1} = K_{m1}^{(0)}$ and for $r \ge 2$, let $K_{mr}$ be given by Eq. (171). Then for $m \ge 4$,
$$\big|D^\alpha \tilde Q_{mr}(t)\big|_0 \le 2K_{mr}\, (\log \beta\epsilon_0)^{r-2}\, \epsilon_t^{2 - \frac{m}{2} - |\alpha|} \tag{183}$$
and
$$\big|D^\alpha \tilde I_{mr}(t)\big|_0 \le K_{mr}\, (\log \beta\epsilon_0)^{r-1}\, \epsilon_t^{2 - \frac{m}{2} - |\alpha|}. \tag{184}$$
For $m = 2$,
$$\big|D^\alpha \tilde Q_{2r}(t)\big|_0 \le 2K_{2r}\, (\log \beta\epsilon_0)^{r-2+\delta_{d,2}}\, \epsilon_t^{2-|\alpha|} \tag{185}$$
and
$$\big|D^\alpha \tilde I_{2r}(t)\big|_0 \le K_{2r}\, (\log \beta\epsilon_0)^{r-2} \begin{cases} 1 & \text{if } |\alpha| \le 1 \\ (\log \beta\epsilon_0)^{1+\delta_{d,2}} & \text{if } |\alpha| = 2. \end{cases} \tag{186}$$
Proof. The proof is by induction in $r$, with the statement of the theorem as the inductive hypothesis. It is similar to the proof of Theorem 6, with only a few changes. Note that $m_1 = 2$ and $m_2 = 2$ never appear on the right-hand side of the RGE because of the skeleton truncation in Definition 2. For $m \ge 4$, use Eq. (159); this gives
$$\big|D^\alpha \tilde Q_{mr}(t)\big|_0 \le 9\nu_{mr}\, (\log \beta\epsilon_0)^{r-2}\, \epsilon_t^{2 - \frac{m}{2} - |\alpha|}. \tag{187}$$
For $m \ge 6$, the scale integral is as in the proof of Theorem 6. For $m = 4$, the scale integral is now $\int_0^t ds = t \le \log \beta\epsilon_0$. This produces the powers of $\log \beta\epsilon_0$ upon iteration. For $m = 2$, use Eq. (160); this gives $K^{(1)}\, \epsilon_t^{2-|\alpha|}\, (\frac{1}{2}(1 + t))^{\delta_{d,2}}$ instead of $\epsilon_t^{2 - \frac{m}{2} - |\alpha|}$ in Eq. (187). The theorem now follows by integration over $t$, recalling that the upper integration limit is at most $\log \beta\epsilon_0$.

Proof of Theorem 5. Convergence of the selfenergy follows for $|\alpha| \le 1$ as in the proof of Theorem 4. For $m = 4$, the function is bounded uniformly in $t$. For $|\alpha| = 2$, convergence at $\beta > 0$ holds because $\frac{\partial}{\partial t} \tilde I_{4r}(t) = 0$ for $t > \log \beta\epsilon_0$.
7. Conclusion

The determinant bound for the continuous Wick-ordered RGE removes a factorial in the recursion for the Green functions. If the model has a propagator with a pointlike singularity, power counting bounds that include this combinatorial improvement can be proven rather easily. The improvement may lead to convergence, but I have not proven this here. If it does, then Theorem 1 implies analyticity of the Green functions where all relevant and marginal couplings are left out in a region independent of the energy scale, and Theorem 2 implies analyticity of the full Green functions for $|\lambda|\beta^d < \mathrm{const}$ in many-fermion systems for all $d \ge 1$.

Natural models to which Theorem 1 applies are the Gross-Neveu model in two dimensions and the many-fermion system in one dimension. In both cases, I have only given bounds where the marginal and relevant terms were left out, to give a simple application of the determinant bound derived in Sect. 4. In both cases, analyses of the full models, including the coupling flows, have been done previously ([14, 15] and [16–18]).

Many-fermion models in $d \ge 2$ are the most realistic physical systems where the interaction is regular enough for the analysis done here to apply directly (i.e., without the introduction of boson fields). I have defined a criterion for Fermi liquid behaviour for these models. A proof that such behaviour occurs requires more detailed bounds and a combination of the method with the sector method of [20]. This may be feasible by an extension of the analysis done here. The Jellium dispersion relation $E(\mathbf{k}) = \mathbf{k}^2/2 - \mu$ is the case where the proofs are easiest because $\Sigma|_S$ is constant. The proof that such a system is a Fermi liquid is possible by a combination of the techniques of [20] and [21]; it may also be within the reach of the continuous RGE method developed here. Perturbative bounds that implement the overlapping loop method of [21] in a simple way were given in Sect. 6.
The verification of regularity property (2) for nonspherical Fermi surfaces is not as simple, but the perturbative analysis done here can be extended rather easily to include the double overlaps used in [23], because the graph classification of [23] arises in a natural way when the integral equation for the effective action is iterated. Multiple overlaps can also be exploited in the RGE; this may be necessary for the many-fermion systems in d ≥ 3. The split of the four-point function into the ladder and non-ladder part
done in Sect. 6.2 singles out the only singular contributions to the four-point function and the least regular contributions to the selfenergy in a simple way. The treatment of these ladder contributions was done in [22], where bounds uniform in the temperature were shown for the second derivative in perturbation theory. The results in Sect. 6.2 provide another proof that only the ladder flow corresponding to Fig. 3 leads to singularities and hence instabilities. It should be noted that this statement depends on the assumptions stated in Sect. 2.3, in particular on the choice of $\epsilon_0$, in the following specific way. The curvature of the Fermi surface sets a natural scale which appears via the constants in the volume bounds. Above this scale, the geometry of the Fermi surface provides no justification of restricting to the ladder flow. This is of some relevance in the Hubbard model, where two scale regimes arise in a natural way: if $t > \tilde\mu$, where $\tilde\mu = \mu - td$, $t$ the hopping parameter, is defined such that $\tilde\mu = 0$ is half-filling, then the curvature of the Fermi surface is effectively so small that one can replace the Fermi surface by a square. More technically speaking, the constant $Q_V$, which depends on the curvature of the Fermi surface, diverges for $\tilde\mu \to 0$. Thus, for small $\tilde\mu$, $t$ has to be very small for $Q_V t < 1$ to hold. If $Q_V t > 1$, the improved volume estimate does not lead to a gain over ordinary power counting. Only below this energy scale, the curvature effects of the volume improvement bound, Lemma 6, imply that the ladder part of the four-point function dominates. To get to scale $\tilde\mu$, one has to calculate the effective action. Needless to say, the effective four-point interaction at scale $\tilde\mu$ may look very different from the original interaction, and RPA calculations suggest that the antiferromagnetic correlations produced by the almost square Fermi surface lead to an attractive nearest-neighbour-interaction [38].
However, one should keep in mind that for the reasons just mentioned, it is an ad hoc approximation to keep only the RPA part of the four-point function above scale $\tilde\mu$, and that a correct treatment must either give a different justification of the ladder approximation, or replace it by a better controlled approximation.
Above, I have not discussed how one gets from the skeleton Green functions to the full Green functions. The key to this is to take a Wick ordering covariance which already contains part of the selfenergy. That is, the Gaussian measure changes in a nontrivial way with t. This can be done such that all two-legged insertions appear with the proper renormalization subtractions, and it gives a simple procedure for a rigorous skeleton expansion. Moreover, it makes clear why it is so important to establish regularity of the selfenergy: once the selfenergy appears in the propagator, its regularity is needed to show Lemma 4 (or its sector analogue), on which in turn, all other bounds depend. The condition $k_0 > d$ also enters in this lemma and seems indispensable from the point of view of the method developed here. Details about this modified Wick ordering technique will appear later.
There is a basic duality in the technique applied to these fermionic models. The determinant bound has to be done in position space, but the regularity bounds use geometric details that are most easily seen in momentum space. The continuous RGE shows very nicely that those terms that require the very detailed regularity analysis are very simple from the combinatorial point of view and vice versa.
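A. Fourier Transformation
Recall that $\psi$ is defined on the doubled time direction $\mathbb{T}_2$ and obeys the antiperiodicity with respect to translations by $\beta$, Eq. (7), and that $n_\tau$ was chosen to be even. $\mathbb{T}_2$ is the set $\mathbb{T}_2 = \varepsilon_\tau\mathbb{Z}/2\beta\mathbb{Z} = \mathbb{T} \cup (\mathbb{T} + \beta)$, where
$$\mathbb{T} = \Big\{\tau \in \mathbb{T}_2 : \tau = \varepsilon_\tau k,\ k \in \Big\{-\frac{n_\tau}{2}, \ldots, \frac{n_\tau}{2} - 1\Big\}\Big\}. \tag{188}$$
The dual to $\mathbb{T}_2$ is $\mathbb{T}_2^* = \frac{\pi}{\beta}\mathbb{Z}/2n_\tau\mathbb{Z} = \{\omega = \frac{\pi}{\beta}k : k \in \{-n_\tau, \ldots, n_\tau - 1\}\}$. The Fourier transform on $\mathbb{T}_2$ is
$$\tilde f(\omega) = \varepsilon_\tau \sum_{\tau\in\mathbb{T}_2} e^{-i\omega\tau} f(\tau), \qquad f(\tau) = \frac{1}{2\beta}\sum_{\omega\in\mathbb{T}_2^*} e^{i\omega\tau}\tilde f(\omega). \tag{189}$$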
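If $f(\tau - \beta) = -f(\tau)$, then $\tilde f(\omega) = 0$ if $\frac{\omega\beta}{\pi}$ is even. In that case, with $\hat f(\omega) = \frac12\tilde f(\omega)$,
$$f(\tau) = \frac{1}{\beta}\sum_{\omega\in M_{n_\tau}} e^{i\omega\tau}\hat f(\omega) \tag{190}$$
with $M_{n_\tau}$ given by Eq. (12). The orthogonality relations are
$$\int_{\mathbb{T}} d\tau\, e^{i(\omega_n \pm \omega_m)\tau} = \beta\,\delta_{mn}, \qquad \frac{1}{\beta}\sum_{\omega\in M_{n_\tau}} e^{i\omega\tau} = \frac{1}{\varepsilon_\tau}\,(\delta_{\tau 0} - \delta_{\tau\beta}). \tag{191}$$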
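The second relation in (191) can be checked numerically. The following is only a sketch under stated assumptions: it takes $M_{n_\tau}$ (Eq. (12), not reproduced in this appendix) to be the $n_\tau$ odd frequencies $(2n+1)\pi/\beta$ with $-n_\tau/2 \le n < n_\tau/2$, and it uses $\varepsilon_\tau = \beta/n_\tau$. All function names are illustrative.

```python
import cmath
import math


def fermionic_frequencies(n_tau, beta):
    """Assumed form of M_{n_tau}: odd multiples of pi/beta with -n_tau/2 <= n < n_tau/2."""
    return [(2 * n + 1) * math.pi / beta for n in range(-n_tau // 2, n_tau // 2)]


def frequency_sum(k, n_tau, beta):
    """(1/beta) * sum over omega of exp(i*omega*tau), evaluated at tau = eps_tau * k."""
    eps_tau = beta / n_tau  # assumption: n_tau lattice points per interval of length beta
    tau = eps_tau * k
    total = sum(cmath.exp(1j * omega * tau) for omega in fermionic_frequencies(n_tau, beta))
    return total / beta


def delta_difference(k, n_tau, beta):
    """(delta_{tau,0} - delta_{tau,beta}) / eps_tau on the 2*beta-periodic lattice T_2."""
    eps_tau = beta / n_tau
    k_mod = k % (2 * n_tau)
    if k_mod == 0:
        return 1.0 / eps_tau
    if k_mod == n_tau:
        return -1.0 / eps_tau
    return 0.0


def max_deviation(n_tau, beta):
    """Largest discrepancy between the two sides over all tau in T_2."""
    return max(
        abs(frequency_sum(k, n_tau, beta) - delta_difference(k, n_tau, beta))
        for k in range(-n_tau, n_tau)
    )
```

Under these assumptions the deviation is at the level of rounding error, which reflects that the antiperiodic delta function on $\mathbb{T}_2$ is resolved by the odd frequencies alone.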
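Lemma 9. Let $\beta > 0$, $E_0 \ge 0$, and $n_\tau \ge 2\beta(E_0 + E_{max})$, where $E_{max}$ is defined in Eq. (20), and let $\hat\omega$ be defined as in Eq. (14). Then for all $k \in B$ and all $\omega \in M_{n_\tau}$, $|i\hat\omega - E(k)| \le E_0$ implies $|\omega| \le \frac{\pi}{2}E_0$ and $|E(k)| \le 2E_0$, and
$$\frac{1}{\beta}\sum_{\omega\in M_{n_\tau}} \mathbf{1}\big(|i\hat\omega - E(k)| \le E_0\big) \le E_0\,\mathbf{1}\big(|E(k)| \le 2E_0\big). \tag{192}$$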
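Proof. By Eq. (14), $\operatorname{Im}\hat\omega = \frac{1}{\varepsilon_\tau}(1 - \cos(\omega\varepsilon_\tau))$ and $\operatorname{Re}\hat\omega = \frac{1}{\varepsilon_\tau}\sin(\omega\varepsilon_\tau)$. The condition $|i\hat\omega - E(k)| \le E_0$ implies $|\operatorname{Re}\hat\omega| \le E_0$ and $|\operatorname{Im}\hat\omega + E(k)| \le E_0$. Thus $|\operatorname{Im}\hat\omega| \le E_0 + E_{max}$. Since $1 - \cos x \ge \frac{2}{\pi^2}x^2$, $(E_0 + E_{max})\varepsilon_\tau \ge \varepsilon_\tau|\operatorname{Im}\hat\omega| = 1 - \cos(\omega\varepsilon_\tau) \ge \frac{2}{\pi^2}(\omega\varepsilon_\tau)^2$, so $|\omega\varepsilon_\tau| \le \pi\big(\beta(E_0 + E_{max})(2n_\tau)^{-1}\big)^{1/2} \le \frac{\pi}{2}$. Since $\frac{\sin x}{x}$ is decreasing on $[0, \frac{\pi}{2}]$, $E_0 \ge |\operatorname{Re}\hat\omega| \ge |\omega|\,\big|\frac{\sin(\omega\varepsilon_\tau)}{\omega\varepsilon_\tau}\big| \ge \frac{2}{\pi}|\omega|$. So $|\omega| \le \frac{\pi}{2}E_0$. Since $1 - \cos x \le \frac12 x^2$,
$$|\operatorname{Im}\hat\omega| \le \frac{1}{2\varepsilon_\tau}(\omega\varepsilon_\tau)^2 \le \frac{\pi^2}{8}\,\frac{1}{n_\tau}\,\beta E_0^2 \le E_0. \tag{193}$$
Thus $|\operatorname{Im}\hat\omega + E(k)| \le E_0$ implies $|E(k)| \le 2E_0$. In terms of indicator functions, this means that $\mathbf{1}\big(|i\hat\omega - E(k)| \le E_0\big) \le \mathbf{1}\big(|\omega| \le \frac{\pi}{2}E_0\big)\,\mathbf{1}\big(|E(k)| \le 2E_0\big)$. The summation over $M_{n_\tau}$ gives
$$\frac{1}{\beta}\sum_{\omega\in M_{n_\tau}} \mathbf{1}\Big(|\omega| \le \frac{\pi}{2}E_0\Big) = \frac{1}{\beta}\sum_{n=-\frac{n_\tau}{2}}^{\frac{n_\tau}{2}} \mathbf{1}\Big(|2n+1| \le \frac{\beta E_0}{2}\Big) \le E_0. \tag{194}$$
Note that the last inequality holds in particular if $\frac{\beta E_0}{2} < 1$, because then the sum is empty, hence the result is zero, because $2n + 1$ is always odd.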
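The bound (192) lends itself to a brute-force check. The sketch below is an illustration only, not part of the proof. It assumes $\hat\omega = \frac{1}{\varepsilon_\tau}\big(\sin(\omega\varepsilon_\tau) + i(1 - \cos(\omega\varepsilon_\tau))\big)$, as read off from the real and imaginary parts quoted in the proof, $\varepsilon_\tau = \beta/n_\tau$, frequencies $(2n+1)\pi/\beta$ with $-n_\tau/2 \le n < n_\tau/2$, and energies restricted to $|E| \le E_{max}$. The helper names are hypothetical.

```python
import math


def omega_hat(omega, eps_tau):
    """Lattice frequency with Re = sin(omega*eps)/eps and Im = (1 - cos(omega*eps))/eps."""
    return complex(math.sin(omega * eps_tau), 1.0 - math.cos(omega * eps_tau)) / eps_tau


def smallest_even_n_tau(beta, e0, e_max):
    """Smallest even integer n_tau with n_tau >= 2*beta*(E0 + Emax)."""
    n_tau = max(2, math.ceil(2.0 * beta * (e0 + e_max)))
    return n_tau + (n_tau % 2)


def lhs_192(energy, e0, beta, n_tau):
    """(1/beta) times the number of frequencies with |i*omega_hat - E| <= E0."""
    eps_tau = beta / n_tau  # assumption: eps_tau * n_tau = beta
    count = 0
    for n in range(-n_tau // 2, n_tau // 2):  # assumed index range of M_{n_tau}
        omega = (2 * n + 1) * math.pi / beta
        if abs(1j * omega_hat(omega, eps_tau) - energy) <= e0:
            count += 1
    return count / beta


def rhs_192(energy, e0):
    """E0 times the indicator of |E| <= 2*E0."""
    return e0 if abs(energy) <= 2.0 * e0 else 0.0


def bound_holds(beta, e0, e_max, samples=201):
    """Check (192) on a uniform grid of energies in [-Emax, Emax]."""
    n_tau = smallest_even_n_tau(beta, e0, e_max)
    for j in range(samples):
        energy = -e_max + 2.0 * e_max * j / (samples - 1)
        if lhs_192(energy, e0, beta, n_tau) > rhs_192(energy, e0) + 1e-12:
            return False
    return True
```

A call such as `bound_holds(10.0, 0.7, 3.0)` scans one parameter set; for $\beta E_0/2 < 1$ the left side vanishes identically, in line with the remark closing the proof.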
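B. Wick Ordering
Recall the conventions fixed at the beginning of Sect. 3.
Definition 4. Let $W_\Gamma(\eta,\psi) = e^{(\eta,\psi)_\Gamma - \frac12(\eta, C\eta)_\Gamma}$, and let $\mathcal{A}'_\Gamma$ be the Grassmann algebra generated by $(\psi(x))_{x\in\Gamma}$. Wick ordering is the $\mathbb{C}$-linear map $\Omega_C : \mathcal{A}'_\Gamma \to \mathcal{A}'_\Gamma$ that takes the following values on the monomials: $\Omega_C(1) = 1$, and for $n \ge 1$ and $X_1, \ldots, X_n \in \Gamma$,
$$\Omega_C\big(\psi(X_1)\cdots\psi(X_n)\big) = \left[\prod_{k=1}^{n}\frac{\delta}{\delta\eta(X_k)}\, W_\Gamma(\eta,\psi)\right]_{\eta=0}. \tag{195}$$
Theorem 8. $W_\Gamma(\eta,\psi) = \Omega_C\big(e^{(\eta,\psi)_\Gamma}\big)$. Let $\alpha_1, \ldots, \alpha_n$ be Grassmann variables, and let $\alpha(X) = \sum_{k=1}^{n}\alpha_k\,\delta_\Gamma(X, X_k)$. Then
$$\Omega_C\big(\psi(X_1)\cdots\psi(X_n)\big) = \left[\left(\prod_{k=1}^{n}\frac{\partial}{\partial\alpha_k}\right)\Omega_C\big(e^{(\alpha,\psi)_\Gamma}\big)\right]_{\alpha=0}. \tag{196}$$
Proof. By Taylor expansion in $\eta$, and by definition of $\Omega_C$,
$$\begin{aligned}
W_\Gamma(\eta,\psi) &= \sum_{n\ge0}\frac{1}{n!}\sum_{X_1,\ldots,X_n\in\Gamma}\left[\prod_{k=1}^{n}\eta(X_k)\frac{\partial}{\partial\alpha(X_k)}\, W_\Gamma(\alpha,\psi)\right]_{\alpha=0}\\
&= \sum_{n\ge0}\frac{1}{n!}\int dX_1\ldots dX_n\left[\prod_{k=1}^{n}\eta(X_k)\frac{\delta}{\delta\alpha(X_k)}\, W_\Gamma(\alpha,\psi)\right]_{\alpha=0}\\
&= \sum_{n\ge0}\frac{1}{n!}\int dX_1\ldots dX_n\,\Omega_C\left(\prod_{k=1}^{n}\eta(X_k)\psi(X_k)\right)\\
&= \sum_{n\ge0}\frac{1}{n!}\,\Omega_C\big((\eta,\psi)_\Gamma^{\,n}\big) = \Omega_C\big(e^{(\eta,\psi)_\Gamma}\big).
\end{aligned} \tag{197}$$
Equation (196) follows directly from the definition of $\Omega_C$.
The next theorem contains the alternative formula Eq. (48) for the Wick ordered monomials used in Sect. 2.
Theorem 9. Let $\Delta_{C_t}$ be defined by Eq. (40). Then
$$\Omega_{C_t}\big(\psi(X_1)\cdots\psi(X_n)\big) = e^{-\Delta_{C_t}}\,\psi(X_1)\cdots\psi(X_n). \tag{198}$$
In particular, if $C_t$ depends differentiably on $t$, then
$$\frac{\partial}{\partial t}\,\Omega_{C_t}\big(\psi(X_1)\cdots\psi(X_n)\big) = -\frac{\partial\Delta_{C_t}}{\partial t}\,\Omega_{C_t}\big(\psi(X_1)\cdots\psi(X_n)\big). \tag{199}$$
Proof. For any formal power series $f(z) = \sum f_k z^k$,
$$f(\Delta_{C_t})\, e^{(\eta,\psi)_\Gamma} = f\Big(\frac12(\eta, C_t\eta)_\Gamma\Big)\, e^{(\eta,\psi)_\Gamma}, \tag{200}$$
so $e^{-\Delta_{C_t}} e^{(\eta,\psi)_\Gamma} = e^{-\frac12(\eta, C_t\eta)_\Gamma + (\eta,\psi)_\Gamma}$, from which Eq. (198) follows by definition of $\Omega_{C_t}$, because derivatives with respect to $\eta$ commute with $\Delta_{C_t}$. If $C_t$ depends differentiably on $t$, Eq. (199) follows by taking a derivative of Eq. (198) because $\Delta_{C_t}$ commutes with $\frac{\partial\Delta_{C_t}}{\partial t}$.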
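C. Wick Reordering
By Eq. (47) and Eq. (51),
$$Q_r(t,\psi) = \frac12\sum_{\substack{r_1\ge1,\,r_2\ge1\\ r_1+r_2=r}}\ \sum_{m_1=1}^{\bar m(r_1)}\ \sum_{m_2=1}^{\bar m(r_2)} Y_{r_1,m_1,r_2,m_2}$$
with
$$Y_{r_1,m_1,r_2,m_2} = \int d\underline{X}\; G_{m_1 r_1}(t \mid \underline{X}^{(1)})\; G_{m_2 r_2}(t \mid \underline{X}^{(2)})\; P(\underline{X},\psi), \tag{201}$$
where $\underline{X}^{(1)} = (X_1, \ldots, X_{m_1})$, $\underline{X}^{(2)} = (X_{m_1+1}, \ldots, X_{m_1+m_2})$, $\underline{X} = (X_1, \ldots, X_{m_1+m_2})$, $P(\underline{X},\psi) = \int dX\,dY\,\dot C_t(X,Y)\,\omega_{X,Y}(\psi)$, and
$$\omega_{X,Y}(\psi) = \left(\frac{\delta}{\delta\psi(X)}\,\Omega_{D_t}\Big(\prod_{k_1=1}^{m_1}\psi(X_{k_1})\Big)\right)\left(\frac{\delta}{\delta\psi(Y)}\,\Omega_{D_t}\Big(\prod_{k_2=1}^{m_2}\psi(X_{m_1+k_2})\Big)\right). \tag{202}$$
Only even $m_1$ and $m_2$ contribute to the sum for $Q_r(t,\psi)$ because $G_r(t,\psi)$ is an element of the even subalgebra for all $t$. Let
$$\eta^{(1)} = \sum_{k=1}^{m_1}\eta_k\,\delta_\Gamma(X, X_k), \qquad \eta^{(2)} = \sum_{k=1}^{m_2}\eta_{m_1+k}\,\delta_\Gamma(X, X_{m_1+k}), \tag{203}$$
and $\eta = \eta^{(1)} + \eta^{(2)}$. With this,
$$\begin{aligned}
\frac{\delta}{\delta\psi(X)}\,\Omega_{D_t}\Big(\prod_{k_1=1}^{m_1}\psi(X_{k_1})\Big) &= \frac{\delta}{\delta\psi(X)}\left[\prod_{k=1}^{m_1}\frac{\partial}{\partial\eta_k}\,\Omega_{D_t}\big(e^{(\eta^{(1)},\psi)_\Gamma}\big)\right]_{\eta=0}\\
&= (-1)^{m_1}\left[\prod_{k=1}^{m_1}\frac{\partial}{\partial\eta_k}\,\frac{\delta}{\delta\psi(X)}\, e^{(\eta^{(1)},\psi)_\Gamma - \frac12(\eta^{(1)}, D_t\eta^{(1)})_\Gamma}\right]_{\eta=0}\\
&= \left[\prod_{k=1}^{m_1}\frac{\partial}{\partial\eta_k}\,(-1)\,\eta^{(1)}(X)\, e^{(\eta^{(1)},\psi)_\Gamma - \frac12(\eta^{(1)}, D_t\eta^{(1)})_\Gamma}\right]_{\eta=0},
\end{aligned} \tag{204}$$
since $(-1)^{m_1} = 1$. Similarly,
$$\frac{\delta}{\delta\psi(Y)}\,\Omega_{D_t}\Big(\prod_{k=1}^{m_2}\psi(X_{m_1+k})\Big) = \left[\prod_{k=1}^{m_2}\frac{\partial}{\partial\eta_{m_1+k}}\,(-1)\,\eta^{(2)}(Y)\, e^{(\eta^{(2)},\psi)_\Gamma - \frac12(\eta^{(2)}, D_t\eta^{(2)})_\Gamma}\right]_{\eta=0}. \tag{205}$$
Since $\eta^{(1)}$ is independent of $\eta_{m_1+1}, \ldots, \eta_{m_1+m_2}$, the derivatives with respect to $\eta^{(2)}$ can be commuted through so that
$$P(\underline{X},\psi) = \left[\frac{\partial^{m_1+m_2}}{\partial\eta_1\ldots\partial\eta_{m_1+m_2}}\, Z(\eta,\psi)\right]_{\eta=0} \tag{206}$$
with
$$Z(\eta,\psi) = (\eta^{(1)}, \dot C_t\,\eta^{(2)})\; e^{(\eta,\psi)_\Gamma - \frac12(\eta^{(1)}, D_t\eta^{(1)})_\Gamma - \frac12(\eta^{(2)}, D_t\eta^{(2)})_\Gamma}. \tag{207}$$
Wick ordering of $e^{(\eta,\psi)_\Gamma}$, the antisymmetry of $D_t$, and $\dot C_t = -\dot D_t$, give
$$Z(\eta,\psi) = -\left(\frac{\partial}{\partial t}\, e^{(\eta^{(1)}, D_t\eta^{(2)})_\Gamma}\right)\Omega_{D_t}\big(e^{(\eta,\psi)_\Gamma}\big). \tag{208}$$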
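Lemma 10. Let $A$ and $B$ be elements of the Grassmann algebra generated by $(\eta(X))_{X\in\Gamma}$, that is, $A = \sum_{I\subset\Gamma} a_I\eta^I$ and $B = \sum_{I\subset\Gamma} b_I\eta^I$, where $a_I, b_I \in \mathbb{C}$, and let $F(x)$ be the formal power series $F(x) = \sum_{r\ge0} f_r x^r$. Then
$$\left[A\Big(\frac{\partial}{\partial\eta}\Big)\, B(\eta)\, F(\eta,\psi)\right]_{\eta=0} = \left[A(\eta)\, B\Big(\frac{\partial_L}{\partial\eta}\Big)\, F\Big(\frac{\partial_L}{\partial\eta},\psi\Big)\right]_{\eta=0}, \tag{209}$$
where $\frac{\partial_L}{\partial\eta}$ is the derivative with respect to $\eta$, acting to the left.
Proof. The proof is an easy exercise in Grassmann algebra and is left to the reader.
Let $m_1 + m_2 = \mu$. By Lemma 10
$$\left[\frac{\partial^{\mu}}{\partial\eta_1\ldots\partial\eta_\mu}\, Z(\eta,\psi)\right]_{\eta=0} = \left[\eta_1\ldots\eta_\mu\; Z\Big(\frac{\partial_L}{\partial\eta},\psi\Big)\right]_{\eta=0}. \tag{210}$$
By definition of $\eta^{(1)}$ and $\eta^{(2)}$,
$$\Big(\frac{\partial_L}{\partial\eta^{(1)}},\, D_t\,\frac{\partial_L}{\partial\eta^{(2)}}\Big) = \sum_{k_1=1}^{m_1}\ \sum_{k_2=m_1+1}^{\mu}\frac{\partial_L}{\partial\eta_{k_1}}\, D_t(X_{k_1}, X_{k_2})\,\frac{\partial_L}{\partial\eta_{k_2}}. \tag{211}$$
Every derivative can act only once on $\eta_1\ldots\eta_\mu$ without giving zero because the $\eta$'s are all different. So
$$\eta_1\ldots\eta_\mu\,\exp\Big(\frac{\partial_L}{\partial\eta^{(1)}},\, D_t\,\frac{\partial_L}{\partial\eta^{(2)}}\Big) = \sum_{L\subset M_1\times M_2}\eta_1\ldots\eta_\mu\prod_{(k_1,k_2)\in L} D_t(X_{k_1}, X_{k_2})\,\frac{\partial_L}{\partial\eta_{k_1}}\frac{\partial_L}{\partial\eta_{k_2}}, \tag{212}$$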
where $M_1 = \{1, \ldots, m_1\}$ and $M_2 = \{m_1+1, \ldots, \mu\}$. $L = \emptyset$ contributes to the sum, but gives the $t$-independent result $\eta_1\ldots\eta_\mu$, so the $t$-derivative removes this term in Eq. (208). Let $\pi_1(k_1,k_2) = k_1$ and $\pi_2(k_1,k_2) = k_2$. For a term given by $L \ne \emptyset$ to be nonzero, $\pi_1|_L$ and $\pi_2|_L$ must be injective, because otherwise a derivative would act twice. Thus the sum can be restricted to the set
$$\mathcal{L} = \{L \subset M_1\times M_2 : L \ne \emptyset, \text{ and } \pi_k|_L \text{ injective for } k = 1, 2\}. \tag{213}$$
If $L \in \mathcal{L}$, then $|\pi_1(L)| = |\pi_2(L)|$. Thus
$$\mathcal{L} = \bigcup_{i=1}^{\min\{m_1,m_2\}}\ \bigcup_{\substack{B_1\subset M_1,\ B_2\subset M_2\\ |B_1|=|B_2|=i}} \mathcal{L}(B_1,B_2) \tag{214}$$
with
$$\mathcal{L}(B_1,B_2) = \{L \in \mathcal{L} : \pi_1(L) = B_1 \text{ and } \pi_2(L) = B_2\}. \tag{215}$$
Let
$$B_1 = \{b_1, \ldots, b_i\},\ 1 \le b_1 < \ldots < b_i \le m_1, \qquad B_2 = \{b_{i+1}, \ldots, b_{2i}\},\ m_1+1 \le b_{i+1} < \ldots < b_{2i} \le m_1+m_2, \tag{216}$$
then for any $L \in \mathcal{L}(B_1,B_2)$, there is a unique permutation $\pi \in S_i$ such that
$$L = \{(b_k, b_{i+\pi(k)}) : k \in \{1, \ldots, i\}\}. \tag{217}$$
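Thus the sum over $\mathcal{L}$ splits into a sum over $i \ge 1$, a sum over sequences $b = (b_1, \ldots, b_{2i})$ with
$$1 \le b_1 < b_2 < \ldots < b_i \le m_1 < b_{i+1} < \ldots < b_{2i} \le m_1+m_2, \tag{218}$$
and a sum over permutations $\pi \in S_i$. Therefore
$$\eta_1\ldots\eta_\mu\,\exp\Big(\frac{\partial_L}{\partial\eta^{(1)}},\, D_t\,\frac{\partial_L}{\partial\eta^{(2)}}\Big) = \sum_{i=1}^{\min\{m_1,m_2\}}\sum_{b}\sum_{\pi\in S_i}\ \prod_{k=1}^{i} D_t(X_{b_k}, X_{b_{i+\pi(k)}})\; H(i,b,\pi) \tag{219}$$
with
$$H(i,b,\pi) = \eta_1\ldots\eta_\mu\prod_{k=1}^{i}\frac{\partial_L}{\partial\eta_{b_k}}\frac{\partial_L}{\partial\eta_{b_{i+\pi(k)}}}. \tag{220}$$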
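The decomposition of the index set in Eqs. (213) to (218) is a statement about partial matchings between $M_1$ and $M_2$, and its counting content can be verified by enumeration. The sketch below is an illustration with hypothetical function names: it lists all nonempty $L \subset M_1\times M_2$ with both projections injective and compares their number with $\sum_{i\ge1}\binom{m_1}{i}\binom{m_2}{i}\, i!$, which is the number of triples (choice of $B_1$, choice of $B_2$, permutation $\pi \in S_i$).

```python
from itertools import combinations
from math import comb, factorial


def projections_injective(pairs):
    """True if no first index and no second index occurs twice among the pairs."""
    firsts = [k1 for k1, _ in pairs]
    seconds = [k2 for _, k2 in pairs]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)


def count_by_enumeration(m1, m2):
    """Number of nonempty L in M1 x M2 with pi_1|L and pi_2|L injective."""
    m_1 = range(1, m1 + 1)
    m_2 = range(m1 + 1, m1 + m2 + 1)
    all_pairs = [(k1, k2) for k1 in m_1 for k2 in m_2]
    total = 0
    # An admissible L has at most min(m1, m2) elements.
    for size in range(1, min(m1, m2) + 1):
        for candidate in combinations(all_pairs, size):
            if projections_injective(candidate):
                total += 1
    return total


def count_by_formula(m1, m2):
    """Sum over i of (ways to pick B1) * (ways to pick B2) * (permutations in S_i)."""
    return sum(comb(m1, i) * comb(m2, i) * factorial(i) for i in range(1, min(m1, m2) + 1))
```

For instance, `count_by_enumeration(2, 2)` and `count_by_formula(2, 2)` both give 6: four single lines and two ways of drawing two lines.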
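The derivatives give (since $m_1$ and $m_2$ are even and hence $(-1)^{m_k} = 1$)
$$H(i,b,\pi) = (-1)^{\frac{i(i+1)}{2} - \sum_{k=1}^{2i} b_k}\ \varepsilon(\pi)\prod_{\substack{k=1\\ k\notin\{b_1,\ldots,b_{2i}\}}}^{\mu}\eta_k. \tag{221}$$
The remaining derivatives acting in Eq. (210) come from the Wick ordered exponential in Eq. (208). They now act on $H$. By Lemma 10, they give
$$\left[\prod_{\substack{k=1\\ k\notin\{b_1,\ldots,b_{2i}\}}}^{\mu}\eta_k\ \Omega_{D_t}\big(e^{(\frac{\partial_L}{\partial\eta},\psi)}\big)\right]_{\eta=0} = \left[\prod_{\substack{k=1\\ k\notin\{b_1,\ldots,b_{2i}\}}}^{\mu}\frac{\partial}{\partial\eta_k}\ \Omega_{D_t}\big(e^{(\eta,\psi)}\big)\right]_{\eta=0} = \Omega_{D_t}\Big(\prod_{\substack{k=1\\ k\notin\{b_1,\ldots,b_{2i}\}}}^{m_1+m_2}\psi(X_k)\Big). \tag{222}$$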
The final step is to rename the integration variables to rewrite the Wick ordered product in the form in which it appears in Eq. (52). Before this is done, it is necessary to permute the arguments of the $G_{mr}(t \mid \underline{X}^{(k)})$ such that $X_{b_1}, \ldots, X_{b_i}$ appear as the first $i$ entries. Since $b_1 < \ldots < b_i$, one can first permute $(X_1, \ldots, X_{b_1}) \to (X_{b_1}, X_1, \ldots, X_{b_1-1})$. This takes $b_1 - 1$ transpositions and hence gives a factor $(-1)^{b_1-1}$. The next permutation
$$(X_{b_1}, X_1, \ldots, X_{b_1-1}, X_{b_1+1}, \ldots, X_{b_2}) \to (X_{b_1}, X_{b_2}, X_1, \ldots, X_{b_2-1}) \tag{223}$$
gives a factor $(-1)^{b_2-2}$ because $X_{b_1}$ has already been moved, etc. Thus
$$G_{m_1 r_1}(t \mid X_1, \ldots, X_{m_1}) = (-1)^{\sum_{k=1}^{i}(b_k - k)}\; G_{m_1 r_1}(t \mid X_{b_1}, \ldots, X_{b_i}, \tilde{\underline{X}}) \tag{224}$$
with $\tilde{\underline{X}} = (X_{\rho_1}, \ldots, X_{\rho_{m_1-i}})$, where $\{1, \ldots, m_1\}\setminus\{b_1, \ldots, b_i\} = \{\rho_1, \ldots, \rho_{m_1-i}\}$ and $\rho_1 < \ldots < \rho_{m_1-i}$. Similarly,
$$G_{m_2 r_2}(t \mid \underline{X}^{(2)}) = (-1)^{\sum_{k=1}^{i}(b_{i+k} - k - m_1)}\; G_{m_2 r_2}(t \mid X_{b_{i+1}}, \ldots, X_{b_{2i}}, \tilde{\underline{X}}'), \tag{225}$$
where $\tilde{\underline{X}}'$ collects the remaining arguments of $\underline{X}^{(2)}$ in increasing order of their indices.
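Thus the $b$-dependent sign factor cancels, and upon renaming of the integration variables, $V_k = X_{b_k}$, $W_k = X_{b_{i+k}}$, etc., and with $m = m_1 + m_2 - 2i$, $Y$ becomes
$$Y_{r_1,m_1,r_2,m_2} = \sum_{i=1}^{\min\{m_1,m_2\}}\int d\underline{X}\ \Omega_{D_t}\Big(\prod_{k=1}^{m}\psi(X_k)\Big)\, Y(\underline{X}) \tag{226}$$
with
$$Y(\underline{X}) = (-1)^{\frac{i(i+1)}{2}}\int d\underline{V}\, d\underline{W}\ \sum_{b}\ G_{m_1 r_1}(t \mid \underline{V}, \underline{X}_1)\; G_{m_2 r_2}(t \mid \underline{W}, \underline{X}_2)\ \sum_{\pi\in S_i}\varepsilon(\pi)\left(-\frac{\partial}{\partial t}\prod_{k=1}^{i} D_t(V_k, W_{\pi(k)})\right) \tag{227}$$
and $\underline{X}$, $\underline{X}_1$, $\underline{X}_2$, $\underline{V}$, and $\underline{W}$ given in Proposition 2. The summand does not depend on $b$ any more, so the sum over $b$, with the constraint Eq. (218), gives
$$\sum_{\substack{B_1\subset M_1,\ B_2\subset M_2\\ |B_1|=|B_2|=i}} 1 = \binom{m_1}{i}\binom{m_2}{i} = \kappa_{m_1 m_2 i}. \tag{228}$$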
The sum over permutations $\pi$ gives the determinant of $D_t^{(i)}$. Finally,
$$G_{m_1 r_1}(t \mid V_1, \ldots, V_i, \underline{X}_1) = (-1)^{(m_1-i)i}\; G_{m_1 r_1}(t \mid \underline{X}_1, V_1, \ldots, V_i) \tag{229}$$
and
$$G_{m_2 r_2}(t \mid W_1, \ldots, W_i, \underline{X}_2) = (-1)^{\frac{i(i-1)}{2}}\; G_{m_2 r_2}(t \mid W_i, \ldots, W_1, \underline{X}_2), \tag{230}$$
which cancels the sign factor, and thus proves Eq. (53).
The graphical interpretation of the above derivation is that $G_{m_1 r_1}(t \mid \underline{X}_1)$ and $G_{m_2 r_2}(t \mid \underline{X}_2)$ are vertices of a graph, with the set of legs of vertex 1 given by $\eta_1\ldots\eta_{m_1}$ and the set of legs of vertex 2 given by $\eta_{m_1+1}\ldots\eta_{m_1+m_2}$. The operator $\big(\frac{\partial_L}{\partial\eta^{(1)}}, D_t\frac{\partial_L}{\partial\eta^{(2)}}\big)$ generates lines between these two vertices by removing factors of $\eta$ in the monomial $\eta_1\ldots\eta_{m_1+m_2}$. The number of these internal lines is $i$, and the remaining $m = m_1 + m_2 - 2i$ factors $\eta_k$ correspond to external legs of the graph. $i \ge 1$ must hold because a derivative with respect to $t$ is taken, so the graph is connected. The rearrangement using permutations is simply the counting of all those graphs that have the same value. There are $\binom{m_1}{i}$ ways of picking $i$ legs from the $m_1$ legs of vertex number 1 and $\binom{m_2}{i}$ ways of picking $i$ legs from the $m_2$ legs of vertex number 2, which gives $\kappa_{m_1 m_2 i}$. And there are $i!$ ways of pairing these legs to form internal lines. One could also have used the antisymmetry of $G_{m_2 r_2}(t)$ to get this factor explicitly, i.e. to permute to get $i!\prod_{k=1}^{i} D_t(V_k, W_k)$ instead of the determinant. This is the result one would have got for bosons since there are no sign factors in that case. The order of the $W_k$ in $G_{m_2 r_2}(t \mid W_i, \ldots, W_1, \underline{X}_2)$ is chosen reversed to cancel the $i$-dependent sign factor for $\pi = \mathrm{id}$. The graph for $\pi = \mathrm{id}$ is the planar graph drawn in Fig. 2. The sum over permutations $\pi \ne \mathrm{id}$ corresponds to the sum over all graphs with the fixed two vertices and $i$ internal lines. In particular, it contains the nonplanar graphs. If one restricts the sum to planar graphs, only the shifts $\pi_k(j) = j + k \bmod i$, $k \in \{0, \ldots, i-1\}$, remain, and this is only the case if $m_1 = i$ or $m_2 = i$. The factor $i!$ makes the combinatorial difference between the exact theory and the 'planarized' theory.
Acknowledgement. I thank Volker Bach, Walter Metzner, Erhard Seiler, and Christian Wieczerkowski for discussions. I also thank Christian Lang for his hospitality at a very pleasant visit to the University of Graz, where this work was started.
References
1. Wegner, F.: In: Phase Transitions and Critical Phenomena vol. 6, edited by C. Domb and M. Green, London–New York, Academic Press
2. Wilson, K., Kogut, J.: Phys. Reports 12, 75 (1974)
3. Polchinski, J.: Nucl. Phys. B 231, 269 (1984)
4. Wieczerkowski, C.: Commun. Math. Phys. 120, 149 (1988)
5. Keller, G., Kopper, C., Salmhofer, M.: Helv. Phys. Acta 65, 32 (1992)
6. Keller, G., Kopper, C.: Commun. Math. Phys. 148, 445 (1992); Phys. Lett. B273, 323 (1991)
7. Keller, G.: Commun. Math. Phys. 161, 311 (1994)
8. Gallavotti, G.: Rev. Mod. Phys. 57, 471 (1985)
9. Gallavotti, G. and Nicolò, F.: Commun. Math. Phys. 100, 545 (1985) and 101, 247 (1985)
10. Feldman, J., Hurd, T., Rosen, L., Wright, J.: QED: A Proof of Renormalizability. Springer Lecture Notes in Physics 312, Berlin–Heidelberg–New York: Springer-Verlag, 1988
11. Brydges, D.C., Yau, H.-T.: Commun. Math. Phys. 129, 351 (1990)
12. Brydges, D.C., Dimock, J., Hurd, T.: Commun. Math. Phys. 172, 143 (1995); mp-arc 96-538, mp-arc 96-681, and references therein
13. Salmhofer, M.: Fermionic sign cancellations in the continuous renormalization group equation. Phys. Lett. 408 B, 245 (1997)
14. Gawedzki, K. and Kupiainen, A.: Commun. Math. Phys. 102, 1 (1985)
15. Feldman, J., Magnen, J., Rivasseau, V. and Sénéor, R.: Commun. Math. Phys. 103, 67 (1986)
16. Benfatto, G., Gallavotti, G.: J. Stat. Phys. 59, 541 (1990)
17. Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Commun. Math. Phys. 160, 93 (1994)
18. Bonetto, F., Mastropietro, V.: Commun. Math. Phys. 172, 57 (1995)
19. Feldman, J. and Trubowitz, E.: Helv. Phys. Acta 63, 157 (1990), ibid. 64, 213 (1991)
20. Feldman, J., Magnen, J., Rivasseau, V. and Trubowitz, E.: Helv. Phys. Acta 65, 679 (1992)
21. Feldman, J., Salmhofer, M. and Trubowitz, E.: J. Stat. Phys. 84, 1209 (1996)
22. Feldman, J., Salmhofer, M. and Trubowitz, E.: Regularity of the Moving Fermi Surface: RPA Contributions. To appear in Commun. Pure Appl. Math.
23. Feldman, J., Salmhofer, M. and Trubowitz, E.: Regularity of the Moving Fermi Surface: The Full Selfenergy. To appear in Commun. Pure Appl. Math.
24. Salmhofer, M.: Improved Power Counting and Fermi Surface Renormalization. Rev. Math. Phys. 10 (1998)
25. Feldman, J., Knörrer, H., Lehmann, D., Trubowitz, E.: In: Constructive Physics, V. Rivasseau (ed.), Springer Lecture Notes in Physics, 1995
26. Magnen, J., Rivasseau, V.: Mathematical Physics Electronic Journal 1, No. 3 (1995)
27. I thank Walter Metzner for pointing this out to me
28. Feldman, J., Knörrer, H. and Trubowitz, E.: To appear in Commun. Math. Phys.
29. Bratteli, O., Robinson, D.: Operator Algebras and Quantum Statistical Mechanics. Berlin–Heidelberg–New York: Springer, 1979
30. Berezin, F.A.: The Method of Second Quantization. New York: Academic Press, 1966
31. Brydges, D.C. and Munoz Maya, I.: Journal of Theoretical Probability 4, 371 (1991)
32. Lüscher, M.: Commun. Math. Phys. 54, 283 (1976)
33. Brydges, D.C. and Wright, J.: J. Stat. Phys. 51, 435 (1988)
34. Kohn, W. and Luttinger, J.M.: Phys. Rev. Lett. 15, 524 (1965)
35. Feldman, J., Knörrer, H., Sinclair, R. and Trubowitz, E.: Helv. Phys. Acta 70, 154 (1997)
36. Fetter, A.L. and Walecka, J.D.: Quantum Theory of Many-Particle Systems. New York: McGraw-Hill, 1971
37. Beckenbach, E.F., Bellman, R.: Inequalities. Berlin–Heidelberg–New York: Springer, 1961
38. Vignale, G. et al.: Phys. Rev. B 39, 2956 (1989); Langmann, E., Salmhofer, M., Wallin, M.: Unpublished
Communicated by D. C. Brydges
Commun. Math. Phys. 194, 297 – 321 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Monopoles and the Gibbons–Manton Metric
Roger Bielawski
Max-Planck-Institut für Mathematik, Gottfried-Claren-Strasse 26, 53225 Bonn, Germany. E-mail: [email protected]
Received: 6 June 1997 / Accepted: 7 October 1997
Abstract: We show that, in the region where monopoles are well separated, the $L^2$ metric on the moduli space of n-monopoles is exponentially close to the $T^n$-invariant hyperkähler metric proposed by Gibbons and Manton. The proof is based on a description of the Gibbons–Manton metric as a metric on a certain moduli space of solutions to Nahm's equations, and on twistor methods. In particular, we show how the twistor description of monopole metrics determines the asymptotic metric. The construction of the Gibbons–Manton metric in terms of Nahm's equations yields a class of interesting (pseudo)-hyperkähler metrics. For example we show, for each semisimple Lie group $G$ and a maximal torus $T \le G$, the existence of a $G\times T$ invariant (pseudo)-hyperkähler manifold whose hyperkähler quotients by $T$ are precisely Kronheimer's hyperkähler metrics on $G^{\mathbb{C}}/T^{\mathbb{C}}$. A similar result holds for Kronheimer's ALE-spaces.
The moduli space $M_n$ of (framed) static $SU(2)$-monopoles of charge $n$, i.e. solutions to Bogomolny equations $d_A\Phi = *F$, carries a natural hyperkähler metric [1]. The geodesic motion in this metric is a good approximation to the dynamics of low energy monopoles [26, 33]. For the charge $n = 2$ the metric has been determined explicitly by Atiyah and Hitchin [1], and it follows from their explicit formula that when the two monopoles are well separated, the metric becomes (exponentially fast) the Euclidean Taub-NUT metric with a negative mass parameter. It was also shown by N. Manton [27] that this asymptotic metric can be determined by treating well-separated monopoles as dyons. The equations of motion for a pair of dyons in $\mathbb{R}^3$ are found to be equivalent to the equations for geodesic motion on Taub-NUT space. For an arbitrary charge $n$, it was shown in [3] that, when the individual monopoles are well-separated, the $L^2$-metric is close (as the inverse of the separation distance) to the flat Euclidean metric.
Gibbons and Manton [14] have then calculated the Lagrangian for the motion of $n$ dyons in $\mathbb{R}^3$ and shown that it is equivalent to the Lagrangian for geodesic motion in a hyperkähler metric on a torus bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$.
This metric is $T^n$-invariant and has a simple algebraic form. Gibbons and Manton have conjectured, by analogy with the $n = 2$ case, that the exact n-monopole metric differs from their metric by an exponentially small amount as the separation gets large. We shall prove this conjecture here.
Our strategy is as follows. We construct a certain moduli space $\tilde M_n$ of solutions to Nahm's equations which carries a $T^n$-invariant hyperkähler metric. Using twistor methods we identify this metric as the Gibbons–Manton metric. Finally, we show that the metrics on $\tilde M_n$ and $M_n$ are exponentially close. This proof adapts equally well to the asymptotic behaviour of $SU(N)$-monopole metrics with maximal symmetry breaking, as will be shown elsewhere.
The asymptotic picture can be explained in the twistor setting. We recall that a monopole is determined (up to framing) by a curve $S$, the spectral curve, in $T\mathbb{CP}^1$, which satisfies certain conditions [16]. One of these is triviality of the line bundle $L^{-2}$ over $S$, and a nonzero section of this bundle is the other ingredient needed to determine the metric [19, 1]. Asymptotically we have now the following situation. When the individual monopoles become well separated the spectral curve of the n-monopole degenerates (exponentially fast) into the union of spectral curves $S_i$ of individual monopoles, while the section of $L^{-2}$ becomes (also exponentially fast) $n$ meromorphic sections of $L^{-2}$ over the individual $S_i$. The zeros and poles of these sections occur only at the intersection points of the curves $S_i$. This information (and the topology of the asymptotic region of $M_n$) is, as we show in the last section, sufficient to conclude that the asymptotic metric is the Gibbons–Manton metric.
The construction of the moduli space of solutions to Nahm's equations which gives the Gibbons–Manton metric admits various generalizations. Some of them are described in Sect. 4.
Let us recall that Kronheimer [23] has shown existence of hyperkähler structures $M(\tau_1,\tau_2,\tau_3)$ on $G^{\mathbb{C}}/T^{\mathbb{C}}$, where $G$ is a compact semisimple Lie group and $T \le G$ is a maximal torus. These structures are parameterized by the cohomology classes $\tau_1, \tau_2, \tau_3 \in \mathrm{Lie}(T)$ of the three Kähler forms. We show (in Sect. 4) that there is a (pseudo)-hyperkähler manifold $M_G$ with a tri-Hamiltonian action of $T$ such that, if $\mu : M_G \to \mathrm{Lie}(T)\otimes\mathbb{R}^3$ is the hyperkähler moment map, then the hyperkähler quotient $\mu^{-1}(\tau_1,\tau_2,\tau_3)/T$ of $M_G$ by $T$ is precisely Kronheimer's $M(\tau_1,\tau_2,\tau_3)$. A similar construction can be done for Kronheimer's ALE-spaces.
The article is organized as follows. In Sects. 1 and 2 we recall the definitions of the Gibbons–Manton and monopole metrics. In Sect. 3 we introduce the moduli space $\tilde M_n$ of solutions to Nahm's equations and give heuristic arguments why the metric on $\tilde M_n$ should be exponentially close to the monopole metric. In Sect. 4, as a preliminary step to study $\tilde M_n$, we introduce yet another moduli space of solutions to Nahm's equations, somewhat simpler than $\tilde M_n$. In that section we also discuss the relation with Kronheimer's metrics mentioned above. In Sect. 5 we identify $\tilde M_n$ as a differential, complex, and finally complex-symplectic manifold. In Sect. 6 we calculate the twistor space of $\tilde M_n$ and identify its hyperkähler metric as the Gibbons–Manton metric. In Sect. 7 we finally show that the monopole metric and the metric on $\tilde M_n$ are exponentially close. The short Sect. 8 shows how one can read off the Gibbons–Manton metric, as the asymptotic form of the monopole metric, from the twistor description of the latter.
1. The Gibbons–Manton metric
The Gibbons–Manton metric [14] is an example of 4n-dimensional (pseudo)-hyperkähler metric admitting a tri-Hamiltonian (hence isometric) action of the n-dimensional torus $T^n$. Such metrics have particularly nice properties and were studied by several authors [25, 18, 32]. The Gibbons–Manton metric was described as a hyperkähler quotient of a flat quaternionic vector space by Gibbons and Rychenkova in [15]. We recall here this description, which we slightly modify to better suit our purposes. We start with flat hyperkähler metrics $g_1$ and $g_2$ on $M_1 = (S^1\times\mathbb{R}^3)^n$ and $M_2 = \mathbb{H}^{n(n-1)/2}$. We consider a pseudo-hyperkähler metric on the product manifold $M = M_1\times M_2$ given by $g = g_1 - g_2$. The complex structures on $\mathbb{H}$ are given by the right multiplication by quaternions $i, j, k$. The metric $g_1$ is invariant under the obvious action (by translations) of $T^n = (S^1)^n$ and the metric $g_2$ is invariant under the left diagonal action of $T^{n(n-1)/2}$. We consider a homomorphism $\phi : T^{n(n-1)/2} \to T^n$ given by
$$(t_{ij})_{i<j} \mapsto \left(\prod_{j=i+1}^{n} t_{ij}\ \prod_{j=1}^{i-1} t_{ji}^{-1}\right)_{i=1,\ldots,n}.$$
This defines an action of $T^{n(n-1)/2}$ on $M = M_1\times M_2$ by $t\cdot(m_1, m_2) = (\phi(t)\cdot m_1, t\cdot m_2)$. Gibbons and Rychenkova have shown that the hyperkähler quotient of $(M, g)$ by this action of $T^{n(n-1)/2}$ is the Gibbons–Manton metric. We remark that, if we choose coordinates $(t_i, x_i)$ on $M_1$, $t_i \in S^1$ and $x_i \in \mathbb{R}^3$, and quaternionic coordinates $q_{ij}$, $i < j$, on $\mathbb{H}^{n(n-1)/2}$, then the moment map equations are:
$$\frac12\, q_{ij}\, i\, \bar q_{ij} = x_i - x_j. \tag{1.1}$$
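For orientation, the homomorphism $\phi$ can be spelled out in code. The sketch below is an illustration only, with hypothetical helper names: each circle factor is represented by a unit complex number, and the code checks one direct consequence of the formula, namely that every $t_{ij}$ enters the $i$-th component while its inverse enters the $j$-th component, so that the product of all components of $\phi(t)$ equals 1.

```python
import cmath
import random


def phi(t, n):
    """Image of (t_ij)_{i<j} under phi; t maps pairs (i, j), 1 <= i < j <= n, to unit complex numbers."""
    image = []
    for i in range(1, n + 1):
        component = 1.0 + 0.0j
        for j in range(i + 1, n + 1):
            component *= t[(i, j)]   # factors t_ij with j > i
        for j in range(1, i):
            component /= t[(j, i)]   # factors t_ji^{-1} with j < i
        image.append(component)
    return image


def random_torus_element(n, rng):
    """A random point of T^{n(n-1)/2}, one phase per pair i < j."""
    return {
        (i, j): cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi))
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
    }


def product(values):
    """Product of a list of complex numbers."""
    result = 1.0 + 0.0j
    for value in values:
        result *= value
    return result
```

For $n = 2$ the image is $(t_{12}, t_{12}^{-1})$, so the single circle acts on the two $S^1$ coordinates of $M_1$ with opposite weights.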
As long as $x_i \ne x_j$ for $i \ne j$, the torus $T^{n(n-1)/2}$ acts freely on the zero-set of the moment map. The quotient of this set by $T^{n(n-1)/2}$ is a smooth hyperkähler manifold which we denote by $M_{GM}$. The action of $T^n$ on $M_1$ induces a free tri-Hamiltonian action on $M_{GM}$ for which the moment map is just $(x_1, \ldots, x_n)$. This makes $M_{GM}$ into a $T^n$-bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ of $n$ distinct points in $\mathbb{R}^3$. We shall now determine this bundle. We recall that a basis of $H_2\big(\tilde C_n(\mathbb{R}^3), \mathbb{Z}\big)$ is given by the $n(n-1)/2$ 2-spheres
$$S^2_{ij} = \{x_k \in \mathbb{R}^3;\ |x_i - x_j| = \mathrm{const},\ x_k = \mathrm{const}\ \text{if}\ k \ne i, j\}, \tag{1.2}$$
where $i < j$. We have
Proposition 1.1. The hyperkähler moment map for the action of $T^n$ makes $M_{GM}$ into a $T^n$-bundle over $\tilde C_n(\mathbb{R}^3)$ determined by the element $(s_1, \ldots, s_n)$ of $H^2\big(\tilde C_n(\mathbb{R}^3), \mathbb{Z}^n\big)$ given by
$$s_k(S^2_{ij}) = \begin{cases} -1 & \text{if } k = i\\ \ \ 1 & \text{if } k = j\\ \ \ 0 & \text{otherwise.}\end{cases}$$
Proof. From the formula (1.1) it follows that restricting the bundle to a fixed $S^2_{ij}$ is equivalent to considering the case $n = 2$. In other words $s_k(S^2_{ij}) = 0$ if $k \ne i, j$ and we have to consider only one quaternionic coordinate $q_{ij}$. The zero-set of the moment map is $\frac12 q_{ij}\, i\, \bar q_{ij} = x_i - x_j$ and the circle $S^1$ by which we quotient acts by $t\cdot\big(q_{ij}, (t_i, x_i), (t_j, x_j)\big) = \big(t q_{ij}, (t t_i, x_i), (t^{-1} t_j, x_j)\big)$. The quotient can be obtained by setting $t_i = 1$ and the induced action of the $i$th generator $s_i$ of $T^n$ is then given by left multiplication by $s_i^{-1}$ on $q_{ij}$. Since the map $q_{ij} \to \frac12 q_{ij}\, i\, \bar q_{ij}$ with the left action of $S^1$ on $\{q_{ij} \in \mathbb{H};\ |q_{ij}| = 1\}$ is the Hopf bundle, it follows that $s_i(S^2_{ij}) = -1$. A similar argument shows that $s_j(S^2_{ij}) = 1$.
In particular, $(t, x) = (t_i, x_i)$ form local coordinates on $M_{GM}$. The metric tensor can be then written in the form [32]:
$$g = \Phi\, dx\cdot dx + \Phi^{-1}(dt + A)^2,$$
where the matrix $\Phi$ and the 1-form $A$ depend only on the $x_i$ and satisfy certain linear PDE's. In particular, $\Phi$ determines the metric. For the Gibbons–Manton metric
$$4\Phi_{ij} = \begin{cases} 1 - \sum_{k\ne i}\dfrac{1}{\|x_i - x_k\|} & \text{if } i = j\\[2mm] \dfrac{1}{\|x_i - x_j\|} & \text{if } i \ne j.\end{cases}$$
2. Nahm's Equations and Monopole Metrics
We shall recall in this section the description of the $L^2$-metric on the moduli space of charge $n$ $SU(2)$-monopoles in terms of Nahm's equations. A proof that the Nahm transform [30, 16] between the two moduli spaces is an isometry was given by Nakajima in [31]. One starts with the space $\mathcal{A}$ of quadruples $(T_0, T_1, T_2, T_3)$ of smooth $\mathfrak{u}(n)$-valued functions on $(-1, 1)$ such that $T_1, T_2, T_3$ have simple poles at $\pm1$ with residues $\frac12\rho(\sigma_i)$, $i = 1, 2, 3$, where $\rho : \mathfrak{su}(2) \to \mathfrak{u}(n)$ is the standard irreducible n-dimensional representation of $\mathfrak{su}(2)$ and $\sigma_i$ are the Pauli matrices. Equipped with the $L^2$-norm (given by a biinvariant inner product on $\mathfrak{u}(n)$), $\mathcal{A}$ becomes a flat quaternionic affine space. There is an isometric and triholomorphic action of the gauge group $\mathcal{G}$ of $U(n)$-valued functions $g : [-1, 1] \to U(n)$ which are 1 at $\pm1$:
$$T_0 \mapsto \mathrm{Ad}(g)T_0 - \dot g g^{-1}, \qquad T_i \mapsto \mathrm{Ad}(g)T_i, \quad i = 1, 2, 3. \tag{2.1}$$
The zero-set of the hyperkähler moment map for this action is then described by Nahm's equations [30]:
$$\dot T_i + [T_0, T_i] + \frac12\sum_{j,k=1,2,3}\epsilon_{ijk}\,[T_j, T_k] = 0, \qquad i = 1, 2, 3. \tag{2.2}$$
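As a sanity check on the form of (2.2), one can test the simplest pole-type solution for $n = 2$. The sketch below is an illustration under stated assumptions: it takes $T_0 = 0$ and $T_i(t) = e_i/t$ with $e_i = -\frac{i}{2}\sigma_i$ for the Hermitian Pauli matrices $\sigma_i$, so that $[e_2, e_3] = e_1$ and cyclic. This normalization and sign convention is our choice and need not coincide with the convention behind the residues $\frac12\rho(\sigma_i)$ in the text. The pole is placed at $t = 0$ for simplicity, and all function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def mat_add(a, b):
    return [[a[r][c] + b[r][c] for c in range(2)] for r in range(2)]


def mat_scale(s, a):
    return [[s * a[r][c] for c in range(2)] for r in range(2)]


def commutator(a, b):
    return mat_add(mat_mul(a, b), mat_scale(-1, mat_mul(b, a)))


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Assumed skew-Hermitian basis e_i = -(i/2) sigma_i of su(2).
BASIS = [mat_scale(-0.5j, sigma) for sigma in PAULI]


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def nahm_matrix(i, t):
    """T_{i+1}(t) = e_i / t, with T_0 = 0."""
    return mat_scale(1.0 / t, BASIS[i])


def nahm_matrix_dot(i, t):
    """Exact derivative of T_{i+1}(t) with respect to t."""
    return mat_scale(-1.0 / t ** 2, BASIS[i])


def nahm_residual(i, t):
    """Largest entry of the left side of (2.2) for component i+1; zero for a solution."""
    residual = nahm_matrix_dot(i, t)  # the [T_0, T_i] term vanishes because T_0 = 0
    for j in range(3):
        for k in range(3):
            sign = levi_civita(i, j, k)
            if sign:
                term = commutator(nahm_matrix(j, t), nahm_matrix(k, t))
                residual = mat_add(residual, mat_scale(0.5 * sign, term))
    return max(abs(entry) for row in residual for entry in row)
```

The residual vanishes because $\dot T_1 = -e_1/t^2$ is cancelled by $[T_2, T_3] = e_1/t^2$, and likewise for the cyclic permutations.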
The quotient of the space of solutions by $\mathcal{G}$ is a smooth hyperkähler manifold $M_n$ of dimension $4n$. By the above mentioned result of Nakajima, $M_n$ is the moduli space of (framed) charge $n$ $SU(2)$-monopoles. With respect to any complex structure $M_n$ is biholomorphic to the space of based rational maps of degree $n$ on $\mathbb{CP}^1$ [13]. If we replace $U(n)$ by $SU(n)$ (resp. by $PSU(n)$) in the above description, we obtain the moduli space of strongly centered (resp. centered) $SU(2)$-monopoles of charge $n$.
Remark 2.1. A similar construction can be done for any compact Lie group $G$. We require $\rho : \mathfrak{su}(2) \to \mathfrak{g}$ to be a Lie algebra homomorphism whose image lies in the regular part of $\mathfrak{g}$. We obtain a smooth hyperkähler manifold of dimension $4\,\mathrm{rank}\,G$ which can be identified with a totally geodesic submanifold of a certain moduli space of $SU(N)$-monopoles (with a minimal symmetry breaking). Alternatively, as a complex manifold, it is a desingularization of $\mathfrak{h}^{\mathbb{C}}\times T^{\mathbb{C}}/W$, where $T^{\mathbb{C}}$ is a maximal torus in $G^{\mathbb{C}}$, $\mathfrak{h}^{\mathbb{C}}$ its Lie algebra, and $W$ the corresponding Weyl group [6].
The tangent space to $M_n$ can be described as the space of solutions to the linearized Nahm's equations and satisfying the condition of being orthogonal (in the $L^2$-metric) to vectors arising from infinitesimal gauge transformations. In other words the tangent space to $M_n$ at a solution $(T_0, T_1, T_2, T_3)$ can be identified with the set of solutions $(t_0, t_1, t_2, t_3)$ to the following system of linear equations:
$$\begin{aligned}
\dot t_0 + [T_0, t_0] + [T_1, t_1] + [T_2, t_2] + [T_3, t_3] &= 0,\\
\dot t_1 + [T_0, t_1] - [T_1, t_0] + [T_2, t_3] - [T_3, t_2] &= 0,\\
\dot t_2 + [T_0, t_2] - [T_1, t_3] - [T_2, t_0] + [T_3, t_1] &= 0,\\
\dot t_3 + [T_0, t_3] + [T_1, t_2] - [T_2, t_1] - [T_3, t_0] &= 0.
\end{aligned} \tag{2.3}$$
The metric is defined by
$$\|(t_0, t_1, t_2, t_3)\|^2 = \frac12\int_{-1}^{1}\sum_{i=0}^{3}\|t_i\|^2. \tag{2.4}$$
The three anti-commuting complex structures can be seen by writing a tangent vector as $t_0 + i t_1 + j t_2 + k t_3$.
3. The Asymptotic Moduli Space ˜ n (c), c ∈ R, of We shall now construct a one-parameter family of moduli spaces M solutions to Nahm’s equations carrying (pseudo-)hyperk¨ahler metrics. We shall see later on that these metrics are the Gibbons–Manton metric with different mass parameters. We consider the subspace 1 of exponentially fast decaying functions in C 1 [0, ∞], i.e.: ηt ηt 1 = f : [0, ∞] → u(n); ∃η>0 sup e kf (t)k + e kdf /dtk < +∞ . (3.1) t
As in the previous section, $\rho : \mathfrak{su}(2) \to \mathfrak{u}(n)$ is the standard irreducible $n$-dimensional representation of $\mathfrak{su}(2)$ (in particular, $\rho(\sigma_1)$ is a diagonal matrix). We denote by $\mathfrak{h}$ the (Cartan) subalgebra of $\mathfrak{u}(n)$ consisting of diagonal matrices. Let $\tilde A_n$ be the space of $C^1$-functions $(T_0, T_1, T_2, T_3)$ defined on $(0, +\infty]$ and satisfying (cf. [23]):
(i) $T_1, T_2, T_3$ have simple poles at $0$ with $\operatorname{res} T_i = \frac{1}{2}\rho(\sigma_i)$;
(ii) $T_i(+\infty) \in \mathfrak{h}$ for $i = 0, \dots, 3$;
(iii) $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is a regular triple, i.e. its centralizer is $\mathfrak{h}$;
(iv) $(T_i(t) - T_i(+\infty)) \in \Omega_1$ for $i = 0, 1, 2, 3$.
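Next we shall define the relevant gauge group. The Lie algebra of our gauge group $G(c)$ is the space of $C^2$-paths $\rho : [0, +\infty) \to \mathfrak{u}(n)$ such that
(i) $\rho(0) = 0$ and $\dot\rho$ has a limit in $\mathfrak{h}$ at $+\infty$;
(ii) $(\dot\rho - \dot\rho(+\infty)) \in \Omega_1$, and $[\tau, \rho] \in \Omega_1$ for any regular element $\tau \in \mathfrak{h}$;
(iii) $c\,\dot\rho(+\infty) + \lim_{t \to +\infty} (\rho(t) - t\dot\rho(+\infty)) = 0$.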
R. Bielawski
It is the Lie algebra of the Lie group
\[
G(c) = \bigl\{ g : [0, +\infty) \to U(n);\ g(0) = 1,\ s(g) := \lim \dot g g^{-1} \in \mathfrak{h},\ (\tau - \mathrm{Ad}(g)\tau) \in \Omega_1,\ (\dot g g^{-1} - s(g)) \in \Omega_1,\ \exp(c\,s(g)) \lim \bigl( g(t) \exp(-t\,s(g)) \bigr) = 1 \bigr\}.
\]

Remark. The last condition in the definition of $G(c)$ means that $g(t)$ is asymptotic to $\exp(ht - ch)$ for some diagonal $h$.

We introduce a family of metrics on $\tilde A_n$. Let $(t_0, t_1, t_2, t_3)$ be a tangent vector to the space $\tilde A_n$ at a point $(T_0, T_1, T_2, T_3)$. The functions $t_i$ are now regular at $0$, $i = 0, \dots, 3$. We put
\[
\|(t_0, t_1, t_2, t_3)\|_c^2 = c \sum_{0}^{3} \|t_i(+\infty)\|^2 + \int_0^{+\infty} \sum_{0}^{3} \bigl( \|t_i(s)\|^2 - \|t_i(+\infty)\|^2 \bigr)\, ds. \tag{3.2}
\]
We observe that the group $G(c)$ acting by (2.1) preserves the metric $\|\cdot\|_c$ and the three complex structures of the flat hyperkähler manifold $\tilde A_n$. We define $\tilde M_n(c)$ as the (formal) hyperkähler quotient of $\tilde A_n$ by $G(c)$ (with respect to the metric $\|\cdot\|_c$). The zero set of the moment map is given by (2.2) (here condition (iii) in the definition of $\mathrm{Lie}(G(c))$ is essential) and so $\tilde M_n(c)$ is defined as the moduli space of solutions to Nahm's equations:
\[
\tilde M_n(c) = \bigl\{ \text{solutions to (2.2) in } \tilde A_n \bigr\} / G(c).
\]

Remark. If $c > 0$, then the metric (3.2) on $\tilde M_n(c)$ will be seen to be positive definite if $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is sufficiently far from the walls of Weyl chambers. On the other hand, if $c < 0$, then the metric will be shown to be everywhere negative definite. Therefore, for $c < 0$ we should really replace $\|\cdot\|_c$ with its negative; it is, however, more convenient to consider the metrics $\|\cdot\|_c$.

We observe that sending a solution $T_i$ to the solution $rT_i(rt)$ for any $r > 0$ induces a homothety of factor $r$ between $\tilde M_n(c)$ and $\tilde M_n(rc)$.

Before we begin the detailed study of $\tilde M_n(c)$, let us explain why we expect this metric to be exponentially close to the monopole metric. It is known [4] that the solutions to Nahm's equations on $(0, 2)$ corresponding to a well-separated monopole are exponentially close to being constant away from the boundary points (i.e. on any $[\epsilon, 2 - \epsilon]$). The same is true for solutions on the half-line $(0, +\infty)$: as long as the triple $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is regular, the solutions are exponentially close to being constant away from $0$ [23] (it is helpful to notice that the space of regular triples is the same as the space $\tilde C_n(\mathbb{R}^3)$ of distinct points in $\mathbb{R}^3$). Our strategy is to take two solutions, on half-lines $(0, \infty)$ and $(-\infty, 2)$ with the same values at $\pm\infty$, cut them off at $t = 1$ and use this non-smooth solution on $(0, 2)$ (with correct boundary behaviour) to obtain an exact solution to the monopole Nahm data.
The exact solution will differ from the approximate one by an exponentially small amount. Furthermore the part of the half-line solutions which we have cut off is exponentially close to being constant and, for $c = 1$, contributes an exponentially small amount to the metric $\|\cdot\|_c$ (all estimates are uniform and can be differentiated). This can be seen from the fact that we can rewrite (3.2) as
\[
\|(t_0, t_1, t_2, t_3)\|_c^2 = \int_0^{c} \sum_{0}^{3} \|t_i(s)\|^2\, ds + \int_c^{+\infty} \sum_{0}^{3} \bigl( \|t_i(s)\|^2 - \|t_i(+\infty)\|^2 \bigr)\, ds. \tag{3.3}
\]
The first term, together with the corresponding term for the solution on (−∞, 2), is exponentially close to the monopole metric (for c = 1).
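The equality of (3.2) and (3.3) can be checked numerically. The following is a minimal sketch, not part of the paper: the scalar function `profile` is an assumed stand-in for $\sum_i \|t_i(s)\|^2$ with exponential decay to its limit, and the helper names, the cutoff and the step count are illustrative choices.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the oriented integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def norm_sq_32(f, f_inf, c, cutoff=60.0):
    """Right-hand side of (3.2): c*f(+inf) + int_0^inf (f(s) - f(+inf)) ds."""
    return c * f_inf + midpoint(lambda s: f(s) - f_inf, 0.0, cutoff)

def norm_sq_33(f, f_inf, c, cutoff=60.0):
    """Right-hand side of (3.3): int_0^c f(s) ds + int_c^inf (f(s) - f(+inf)) ds."""
    return midpoint(f, 0.0, c) + midpoint(lambda s: f(s) - f_inf, c, cutoff)

# Assumed model profile: limit value 2 at infinity, exponentially decaying remainder.
F_INF = 2.0

def profile(s):
    return F_INF + 3.0 * math.exp(-s)

if __name__ == "__main__":
    print(norm_sq_32(profile, F_INF, 1.0), norm_sq_33(profile, F_INF, 1.0))
```

Both expressions agree up to quadrature error because $c\,f(+\infty) = \int_0^c f(+\infty)\,ds$; the tail integral over $[c, +\infty)$ is the exponentially small piece referred to in the text.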
4. Moduli Space of Regular Semisimple Adjoint Orbits

In order to obtain information about $\tilde M_n(c)$ we need to consider first another moduli space of solutions to Nahm's equations, defined analogously, except that we require the solutions to be smooth at $t = 0$. This space, which can be defined for an arbitrary compact Lie group $G$, is of some interest as all hyperkähler structures on $G^{\mathbb C}/T^{\mathbb C}$ (here $T^{\mathbb C}$ is a maximal torus) due to Kronheimer [23] can be obtained from it as hyperkähler quotients (see Theorem 4.3 below). A reader who is primarily interested in monopoles should think of $G$ as $U(n)$.

Let us first recall how Kronheimer constructs hyperkähler metrics on $G^{\mathbb C}/T^{\mathbb C}$. Let $\mathfrak{h}$ be the Lie algebra of $T^{\mathbb C}$ and let $(\tau_1, \tau_2, \tau_3) \in \mathfrak{h}^3$ be a regular triple, i.e. one whose centralizer is $\mathfrak{h}$. For a fixed $\eta > 0$, consider the Banach space
\[
\Omega_1^{\eta} = \Bigl\{ f : [0, \infty] \to \mathfrak{g};\ \sup_t \bigl( e^{\eta t}\|f(t)\| + e^{\eta t}\|df/dt\| \bigr) < +\infty \Bigr\}
\]
with the norm $\|f\| = \sup_t \bigl( e^{\eta t}\|f(t)\| + e^{\eta t}\|df/dt\| \bigr)$. Define $A_\eta(\tau_1, \tau_2, \tau_3)$ as the space of $C^1$-functions $(T_0, T_1, T_2, T_3) : (0, +\infty] \to \mathfrak{g}$ which satisfy:
\[
\{ T_0(t),\ (T_i(t) - \tau_i);\ i = 1, 2, 3 \} \subset \Omega_1^{\eta}.
\]
Define also $G^{\eta}$ by replacing $\Omega_1$ with $\Omega_1^{\eta}$ in the definition of $G$ given in the previous section. Kronheimer shows then that for small enough $\eta$,
\[
M(\tau_1, \tau_2, \tau_3) = \{ \text{solutions to (2.2) in } A_\eta(\tau_1, \tau_2, \tau_3) \} / G^{\eta},
\]
equipped with the $L^2$ metric, is a smooth hyperkähler manifold, diffeomorphic to $G^{\mathbb C}/T^{\mathbb C}$. Furthermore, if $(\tau_2, \tau_3)$ is regular, then $M(\tau_1, \tau_2, \tau_3)$ is biholomorphic, with respect to the complex structure $I$, to the complex adjoint orbit of $\tau_2 + i\tau_3$.

We observe that the union of all $M(\tau_1, \tau_2, \tau_3)$ has a natural topology and it is, in fact, a smooth manifold. We shall show now that there is a $T$-bundle over this union which carries a (pseudo)-hyperkähler metric.

We define the space $A_G$ by omitting the condition (i) in the definition of $\tilde A_n$ in the previous section. Instead we require that the $T_i$ are smooth at $t = 0$ for $i = 0, 1, 2, 3$. We define $M_G(c)$, $c \in \mathbb{R}$, as the (formal) hyperkähler quotient of $A_G$ by $G(c)$ with respect to the metric (3.2). We have:

Proposition 4.1. $M_G(c)$ equipped with the metric (3.2) is a smooth hyperkähler manifold. The tangent space at a solution $(T_0, T_1, T_2, T_3)$ is described by Eqs. (2.3).

We remark that the metric (3.2) may be degenerate at some points. However, the hypercomplex structure is defined everywhere.

Proof. Define $M_G^{\eta}(c)$ by replacing $\Omega_1$ with $\Omega_1^{\eta}$ in the definition of $M_G(c)$. By the exponential decay property of solutions to Nahm's equations ([23], Lemma 3.4), a neighbourhood of a particular element in $M_G(c)$ is canonically identified with its neighbourhood in $M_G^{\eta}(c)$ for small enough $\eta$. Therefore we can use the transversality arguments of [23], Lemma 3.8 and Proposition 3.9 (with a slight modification due to condition (iii) in the definition of $\mathrm{Lie}(G(c))$) to deduce the smoothness.
The fact that the metric is hyperkähler is, formally, the consequence of the fact that $M_G(c)$ is a hyperkähler quotient. One can, in fact, check directly that the three Kähler forms are closed. We shall also, later on, identify the complex structures and the complex symplectic forms proving their closedness.
We observe now that the action on $A_G$ of gauge transformations which are asymptotic to $\exp(-th + \lambda h)$, $h \in \mathfrak{h}$, $\lambda \in \mathbb{R}$, induces a free isometric action of $T = \exp(\mathfrak{h})$ on $M_G(c)$. In fact this action is tri-Hamiltonian and a simple calculation shows

Proposition 4.2. The hyperkähler moment map $\mu = (\mu_1, \mu_2, \mu_3)$ for the action of $T$ on $M_G(c)$ is given by $\mu_i(T_0, T_1, T_2, T_3) = T_i(+\infty)$ for $i = 1, 2, 3$.

As an immediate corollary we have:

Theorem 4.3. Let $(\tau_1, \tau_2, \tau_3)$ be a regular triple in $\mathfrak{h}^3$. The hyperkähler quotient $\mu^{-1}(\tau_1, \tau_2, \tau_3)/T$ of $M_G(c)$ by the torus $T$ is isometric to Kronheimer's $M(\tau_1, \tau_2, \tau_3)$.

We have also a tri-Hamiltonian action of $G$ on $M_G(c)$ given by the gauge transformations with arbitrary values at $t = 0$. The hyperkähler moment map for this action is $(T_1(0), T_2(0), T_3(0))$.

We have two other group actions on $M_G(c)$. There is a free isometric and triholomorphic action of the Weyl group $W = N(T)/T$ given by the gauge transformations which become constant (and in $W$) exponentially fast. Finally there is a free isometric $SU(2)$-action which rotates the complex structures. As a consequence it has a globally defined Kähler potential for each Kähler form (cf. [18]). The potential for $\omega_2$ (or $\omega_3$) is given by the moment map for the action of a circle in $SU(2)$ which preserves $I$. This is easily seen to be
\[
K_J = c \sum_{i=2}^{3} \|T_i(+\infty)\|^2 + \int_0^{+\infty} \sum_{i=2}^{3} \bigl( \|T_i(s)\|^2 - \|T_i(+\infty)\|^2 \bigr)\, ds.
\]
Remark 4.4. There is a similar (pseudo)-hyperkähler manifold with a torus action such that the hyperkähler quotients by this torus are isometric to Kronheimer's ALE-metrics on the minimal resolution of a given Kleinian singularity $\mathbb{C}^2/\Gamma$ [24]. This manifold is defined as $M_G$ except that the $T_i$ have poles at $t = 0$ with the residues defined by a subregular homomorphism $\mathfrak{su}(2) \to \mathfrak{g}$ (cf. [6, 5]).

Remark 4.5. One can observe that $M_G(0)$ is a cone metric (with the $\mathbb{R}_{>0}$-action given by $T_i(t) \mapsto rT_i(rt)$) and in fact, it is an $\mathbb{H}^*$-bundle over a pseudo-quaternion-Kähler manifold (cf. [34]).

Remark 4.6. It is instructive to consider the Kähler analogue of $M_G(c)$. The Kähler metrics on $G/T$ (cf. [2]) can be described (cf. [7]) as the natural $L^2$-metrics on the moduli space of solutions to $\dot T_1 = [T_1, T_0]$ with $T_0(t), (T_1(t) - \tau) \in \Omega_1$ for a fixed regular element $\tau$ of $\mathfrak{h}$ (this gives the Kähler form whose cohomology class is $\tau$). We can now do a construction similar to that of $M_G(c)$ to obtain moduli spaces $\hat M_G(c)$ with a $(G \times T)$-invariant pseudo-Kähler metric whose Kähler quotients by $T$ are precisely the Kähler metrics on $G/T$. In this case it is easy to compute both the topology and the complex structure of $\hat M_G(c)$: $\hat M_G(c)$ is diffeomorphic to $G \times \{\text{regular elements of } \mathfrak{h}\}$, and the complex structure at $(1, h)$ is given by $I(v + w, p) = (I_0 v - p, w)$, where $v \perp \mathfrak{h}$, $w, p \in \mathfrak{h}$ and $I_0$ is the complex structure on $G/T$.

5. $\tilde M_n(c)$ as a Manifold

We now return to the space $\tilde M_n(c)$ defined in Sect. 3. Our first task is to show that this space is smooth. We shall show that $\tilde M_n(c)$ is a smooth hyperkähler quotient of the
product of the space $M_{U(n)}(c - 1)$ considered in the previous section and of another moduli space of solutions to Nahm's equations. This latter space, denoted by $N_n$, is given by $\mathfrak{u}(n)$-valued solutions to Nahm's equations defined on $(0, 1]$, smooth at $t = 1$ and with the same poles as $\tilde M_n(c)$ at $t = 0$. The gauge group consists of gauge transformations which are identity at $t = 0, 1$. Equipped with the metric (2.4) this is a smooth hyperkähler manifold [6, 11]. It admits a tri-Hamiltonian action of $U(n)$ given by gauge transformations with arbitrary values at $t = 1$.

In addition, we consider the space $M_{U(n)}(c - 1)$ defined in the previous section. We identify it this time with the space of solutions on $[1, +\infty]$ via the map $T_i(t) \mapsto T_i(t + 1)$ (so that the gauge transformations behave now, near $+\infty$, as elements of $G(c)$).

It is easy to observe that the space $\tilde M_n(c)$ is the hyperkähler quotient of $N_n \times M_{U(n)}(c - 1)$ by the diagonal action of $U(n)$ (cf. [6]; the moment map equations simply match the functions $T_1, T_2, T_3$ at $t = 1$; after that, quotienting by $G$ means that the remaining gauge transformations are smooth at $t = 1$). Using this description of $\tilde M_n(c)$ we can finally show

Proposition 5.1. $\tilde M_n(c)$ equipped with the metric (3.2) is a smooth hyperkähler manifold. The tangent space at a solution $(T_0, T_1, T_2, T_3)$ is described by the Eqs. (2.3).

Proof. Since the metric (3.2) may be degenerate, we still have to show that the moment map equations on $N_n \times M_{U(n)}(c - 1)$ are everywhere transversal. Consider a particular point in $M_{U(n)}(c - 1)$ which we represent by a solution $m = (T_0, T_1, T_2, T_3)$ with $T_0(+\infty) = 0$ and $T_i(+\infty) = \tau_i$, $i = 1, 2, 3$. Let $\mu$ be the hyperkähler moment map for the action of $G$ on $N_n \times M_{U(n)}$. We observe that the image of $d\mu|_m$ contains the image of $d\mu'|_m$, $\mu'$ being the hyperkähler moment map for the action of $G$ on $N_n \times M(\tau_1, \tau_2, \tau_3)$ (Kronheimer's definition of $M(\tau_1, \tau_2, \tau_3)$ was recalled in the previous section).
The metric on $N_n \times M(\tau_1, \tau_2, \tau_3)$ is non-degenerate and, as $G$ acts freely, $d\mu'|_m$ is surjective. Thus $d\mu$ is surjective at each point in $N_n \times M_{U(n)}(c - 1)$ and $\tilde M_n(c)$ is smooth.

We observe that, as in the case of $M_{U(n)}(c)$, $\tilde M_n(c)$ has isometric actions of the torus $T^n$ (defined as the diagonal subgroup of $U(n)$), of the symmetric group $S_n$, and of $SU(2)$. In particular, the hyperkähler moment map for the action of $T^n$ is still given by the values of $T_1, T_2, T_3$ at infinity (cf. Proposition 4.2).

We can describe the topology of $\tilde M_n(c)$:

Proposition 5.2. $\tilde M_n(c)$ is a principal $T^n$-bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ of $n$ distinct points in $\mathbb{R}^3$.

We postpone identifying this bundle until the next section (Proposition 6.3).

Proof. The space $\tilde C_n(\mathbb{R}^3)$ is the space of regular triples in the subalgebra of diagonal matrices and the moment map $\mu$ for the action of $T^n$ gives us a projection $\tilde M_n(c) \to \tilde C_n(\mathbb{R}^3)$. Let us consider a fixed regular triple $(\tau_1, \tau_2, \tau_3)$ and all elements of $\tilde M_n(c)$ with $T_i(+\infty) = \tau_i$, $i = 1, 2, 3$, i.e. $\mu^{-1}(\tau_1, \tau_2, \tau_3)$. For each such solution we can make $T_0$ identically $0$ by some gauge transformation $g$ with $g(0) = 1$. This is defined uniquely up to the action of $G \times T^n$ and so the set of $T^n$-orbits projecting via $\mu$ to $(\tau_1, \tau_2, \tau_3)$ can be identified with the set of solutions to Nahm's equations with $T_0 \equiv 0$, $T_1, T_2, T_3$ having the appropriate residues at $t = 0$ and being conjugate to $\tau_1, \tau_2, \tau_3$ at infinity. By the considerations at the beginning of this section this space is the hyperkähler quotient of $N_n \times M(\tau_1, \tau_2, \tau_3)$ by $U(n)$. The arguments of [6] show that the corresponding complex-symplectic quotient can be identified with the intersection of a regular semisimple adjoint
orbit of $GL(n, \mathbb{C})$ with the slice to the regular nilpotent orbit. This intersection is a single point. Finally, in order to identify in this case the hyperkähler quotient with the complex-symplectic one we can adapt the argument in the proof of Proposition 2.20 in [20].

Our next task is to describe the complex structure of $\tilde M_n(c)$ (because of the action of $SU(2)$ all complex structures are equivalent). As usual (cf. [13]), if we choose a complex structure, say $I$, we can introduce complex coordinates on the moduli space of solutions to Nahm's equations by writing $\alpha = T_0 + iT_1$ and $\beta = T_2 + iT_3$. The Nahm equations can be then written as one complex and one real equation:
\[
\frac{d\beta}{dt} = [\beta, \alpha], \tag{5.1}
\]
\[
\frac{d}{dt}(\alpha + \alpha^*) = [\alpha^*, \alpha] + [\beta^*, \beta]. \tag{5.2}
\]
By the remark made at the beginning of this section, $\tilde M_n(c)$ is the hyperkähler quotient of the product manifold $N_n \times M_{U(n)}(c - 1)$. We shall show that as a complex symplectic manifold $\tilde M_n(c)$ is the complex-symplectic quotient of $N_n \times M_{U(n)}(c - 1)$. Let us recall the complex structure of $N_n$ [13, 19, 6, 12]. Let $e_1, \dots, e_n$ denote the standard basis of $\mathbb{C}^n$. There is a unique solution $w_1$ of the equation
\[
\frac{dw}{dt} = -\alpha w \tag{5.3}
\]
with
\[
\lim_{t \to 0} \bigl( t^{-(n-1)/2} w_1(t) - e_1 \bigr) = 0. \tag{5.4}
\]
Setting $w_i(t) = \beta^{i-1}(t) w_1(t)$, we obtain a solution to (5.3) with
\[
\lim_{t \to 0} \bigl( t^{i-(n+1)/2} w_i(t) - e_i \bigr) = 0.
\]
The complex gauge transformation $g(t)$ with $g^{-1} = (w_1, \dots, w_n)$ makes $\alpha$ identically zero and sends $\beta(t)$ to the constant matrix
\[
B(\beta_1, \dots, \beta_n) =
\begin{pmatrix}
0 & \dots & 0 & (-1)^{n+1} S_n \\
1 & \ddots & \vdots & (-1)^{n} S_{n-1} \\
\vdots & \ddots & 0 & \vdots \\
0 & \dots & 1 & S_1
\end{pmatrix}. \tag{5.5}
\]
Here $\beta_i$ denote the (constant) eigenvalues of $\beta(t)$ and $S_i$ is the $i$th elementary symmetric polynomial in $\{\beta_1, \dots, \beta_n\}$. The mapping $(\alpha, \beta) \to (g(1), B)$ gives a biholomorphism between $(N_n, I)$ and $Gl(n, \mathbb{C}) \times \mathbb{C}^n$ [6].

We describe the complex structure of $\tilde M_n(c)$ as follows:

Proposition 5.3. There exists a $T^n$-equivariant biholomorphism between $\tilde M_n(c)$ and an open subset of
\[
\Bigl( \coprod_{\mathfrak{n}} \bigl\{ [g, b] \in Gl(n, \mathbb{C}) \times_N (\mathfrak{d} + \mathfrak{n});\ g b g^{-1} \text{ is of the form (5.5)} \bigr\} \Bigr) \Big/ \sim,
\]
where $\mathfrak{d}$ denotes diagonal matrices, the union is over unipotent algebras $\mathfrak{n}$ (with respect to $\mathfrak{d}$) and $N = \exp \mathfrak{n}$. Furthermore, the relation $\sim$ is given as follows: $[g, d + n] \sim [g', d' + n']$ if and only if $n \in \mathfrak{n}$, $n' \in \mathfrak{n}'$, and either $\mathfrak{n}' \subset \mathfrak{n}$ and there exists an $m \in N$ such that $g m^{-1} = g'$, $\mathrm{Ad}(m)(d + n) = d' + n'$ or vice versa (i.e. $\mathfrak{n} \subset \mathfrak{n}'$ etc.).

Remark. It will follow from the description of the twistor space that this biholomorphism is actually onto. Proving this right now would require showing that the $T^n$-action on $\tilde M_n(c)$ extends to the global action of $(\mathbb{C}^*)^n$. This, in turn, requires showing existence of solutions to a mixed Dirichlet-Robin problem on the half-line, something that seems quite tricky.

Proof. Fix a unipotent algebra $\mathfrak{n}$ and consider the set of all solutions $(\alpha, \beta) = (T_0 + iT_1, T_2 + iT_3)$ on $[1, +\infty)$ such that the intersection of the sum of positive eigenspaces of $\mathrm{ad}(iT_1(+\infty))$ with $C(\beta(+\infty))$ is contained in $\mathfrak{n}$. Let $M(\mathfrak{n}; c - 1)$ be the corresponding subset of $M_{U(n)}(c)$. We observe that, since $(T_1(+\infty), T_2(+\infty), T_3(+\infty))$ is a regular triple, the projection of $T_1(+\infty)$ onto $\mathfrak{d}^{\mathbb C} \cap C(\beta(+\infty))$ is a regular element, and so $\mathfrak{n}$ contains the unipotent radical of a Borel subalgebra of $C(\beta(+\infty))$ for any element of $M(\mathfrak{n}; c - 1)$. Using gauge freedom, we always make $T_0(+\infty) = 0$ and, by Proposition 4.1 of Biquard [8], such a representative is of the form
\[
g\bigl( \alpha(+\infty),\ \beta(+\infty) + \mathrm{Ad}(\exp\{-\alpha(+\infty)t\})\,n \bigr),
\]
where $n \in \mathfrak{n}$ and $g$ is a bounded $Gl(n, \mathbb{C})$-valued gauge transformation. The transformation $g$ is defined modulo $\exp\{-\alpha(+\infty)t\}\, g_0 \exp\{\alpha(+\infty)t\}$ with $g_0 \in P = \exp(\mathfrak{d} + \mathfrak{n})$. Since $T_0(+\infty) = 0$ and $T_0$ is decaying exponentially fast, $g$ has a limit (in $T^{\mathbb C}$) at $+\infty$. If we replace $g(t)$ by $g'(t) = g(t)g(+\infty)^{-1} \exp\{-\alpha(+\infty)t + c\alpha(+\infty)\}$, then $(\alpha, \beta) = g'(0, \beta(+\infty) + n')$ for an $n' \in \mathfrak{n}$. The transformation $g'$, which satisfies (at infinity) the boundary condition of an element of $G(c - 1)^{\mathbb C}$, is now defined modulo constant gauge transformations in $N$.
Moreover $g'(1)$ is independent of $G(c - 1)$ and we obtain a map $\phi : M(\mathfrak{n}) \to Gl(n, \mathbb{C}) \times_N (\mathfrak{d} + \mathfrak{n})$ by sending $(\alpha, \beta)$ to $(g'(1), \beta(+\infty) + n')$. Considering the infinitesimal version of this construction shows that $\phi$ is holomorphic.

Since $\phi$ is $U(n)$-equivariant, it is (locally) $Gl(n, \mathbb{C})$-equivariant. We can adapt the argument of Proposition 2.20 in [19] to show that $\tilde M_n(c)$ is the complex-symplectic quotient of $N_n \times M_{U(n)}(c - 1)$ by (local action of) $Gl(n, \mathbb{C})$. Let us restrict attention to $N_n \times M(\mathfrak{n})$. The complex symplectic moment map at the point $(g, B)$ of $N_n$ is $-g^{-1}Bg$ (here $g \in Gl(n, \mathbb{C})$ and $B$ is of the form (5.5)) and the complex symplectic moment map at the point corresponding to $[g', \beta_d + n]$ is $g'(\beta_d + n)g'^{-1}$ (here $\beta_d$ is diagonal and $n \in \mathfrak{n}$). The moment map equation for the diagonal action of $Gl(n, \mathbb{C})$ is $g^{-1}Bg = g'(\beta_d + n)g'^{-1}$. If we now quotient by $Gl(n, \mathbb{C})$, i.e. send $g$ to identity, we shall end up with the set of $[g', b] \in Gl(n, \mathbb{C}) \times_N (\mathfrak{d} + \mathfrak{n})$ such that $g' b g'^{-1} = B$ ($B$ is determined by the diagonal part of $b$). This identifies the charts described in this proposition. By going through the procedure we can conclude that the charts for different $\mathfrak{n}$ are matched as claimed.

So far we have shown that there is a holomorphic map $\phi$ from $\tilde M_n(c)$ to the manifold $M$ described in the statement. We still have to show that $\phi$ is 1-1. By construction our map is $T^n$-equivariant, and so $(\mathbb{C}^*)^n$-equivariant (where the action is defined). Since the $(\mathbb{C}^*)^n$-action on $M$ is free, it is free on $\tilde M_n(c)$. Furthermore the $(\mathbb{C}^*)^n$-action on $M$ leaves invariant sets of the form $M \cap Gl(n, \mathbb{C}) \times_N (d + \mathfrak{n})$, $d \in \mathfrak{d}$. Each such set is a single orbit of $(\mathbb{C}^*)^n$ and so $\phi$ is 1-1.

The above description of $\tilde M_n(c)$ is rather complicated. We remark that the open dense subset where $\beta(+\infty)$ is regular corresponds to $\mathfrak{n} = 0$, i.e. to
\[
\{ (\beta_d, g);\ \beta_d = \mathrm{diag}(\beta_1, \dots, \beta_n),\ \beta_i \neq \beta_j \text{ if } i \neq j,\ g\beta_d g^{-1} = B(\beta_1, \dots, \beta_n) \}.
\]
We shall denote the corresponding subset of $\tilde M_n(c)$ by $\tilde M_n^{reg}(c)$. We observe that an element $g$ of $Gl(n, \mathbb{C})$ which sends $\mathrm{diag}(\beta_1, \dots, \beta_n)$ to $B(\beta_1, \dots, \beta_n)$ is of the form
\[
g = V(\beta_1, \dots, \beta_n)^{-1}\, \mathrm{diag}(u_1, \dots, u_n), \tag{5.6}
\]
where $u_i \neq 0$ and $V(\beta_1, \dots, \beta_n)$ is the Vandermonde matrix, i.e. $V_{ij} = (\beta_i)^{j-1}$. We can calculate the complex symplectic form $\omega = \omega_2 + i\omega_3$ on $\tilde M_n^{reg}(c)$:

Proposition 5.4. The complex symplectic form $\omega$ on $\tilde M_n^{reg}(c)$ is given, in coordinates $\beta_i, u_i$, $i = 1, \dots, n$, by
\[
\sum_{i=1}^{n} \frac{du_i}{u_i} \wedge d\beta_i - \sum_{i<j} \frac{d\beta_i \wedge d\beta_j}{\beta_i - \beta_j}. \tag{5.7}
\]
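The form (5.6) rests on the fact that the Vandermonde matrix intertwines $\mathrm{diag}(\beta_1, \dots, \beta_n)$ with the matrix $B$ of (5.5): since diagonal matrices commute, $g\,\mathrm{diag}(\beta)\,g^{-1} = B$ for $g = V^{-1}\mathrm{diag}(u)$ reduces to $\mathrm{diag}(\beta)\,V = V B$. The following sketch checks this in exact rational arithmetic; it is an illustration, not part of the paper, and all function names and sample values are assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elem_sym(k, xs):
    """k-th elementary symmetric polynomial of xs (S_0 = 1)."""
    return sum((prod(c) for c in combinations(xs, k)), Fraction(0))

def companion(betas):
    """Matrix B(beta_1, ..., beta_n) of (5.5), stored with 0-based indices."""
    n = len(betas)
    B = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n):
        B[k][k - 1] = Fraction(1)  # ones on the subdiagonal
    for k in range(n):
        # 1-based row r = k + 1 of the last column carries (-1)^(n-r) S_(n+1-r)
        B[k][n - 1] = (-1) ** (n - k - 1) * elem_sym(n - k, betas)
    return B

def vandermonde(betas):
    """V_ij = beta_i^(j-1) in the 1-based convention of the text."""
    return [[Fraction(b) ** j for j in range(len(betas))] for b in betas]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def intertwines(betas):
    """True if diag(betas) * V == V * B, the inverse-free form of g diag g^-1 = B."""
    V = vandermonde(betas)
    left = [[betas[i] * V[i][j] for j in range(len(betas))]
            for i in range(len(betas))]
    return left == matmul(V, companion(betas))

if __name__ == "__main__":
    print(intertwines([Fraction(1), Fraction(-2), Fraction(3, 2), Fraction(5)]))
```

The nonzero entries $u_i$ drop out of the check, which is why they remain free coordinates on $\tilde M_n^{reg}(c)$.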
Proof. First, we calculate $\omega$ on the subset of $M_{U(n)}(c - 1)$, where $\beta(+\infty)$ is regular. This subset is biholomorphic to $Gl(n, \mathbb{C}) \times \{\text{regular elements of } \mathfrak{h}^{\mathbb C}\}$ and according to the proof of Proposition 5.3, an element $(\alpha, \beta)$ of this set corresponding to $(g, \beta_d) \in Gl(n, \mathbb{C}) \times \mathfrak{h}$ can be written as $(\alpha, \beta) = (-\dot g(t) g(t)^{-1},\ g(t)\beta_d g(t)^{-1})$, where $g(t)$ is a complex gauge transformation with $g(0) = g$. Therefore a tangent vector $(a(t), b(t))$ at $(\alpha, \beta)$ can be written as
\[
(a, b) = \bigl( -g\dot\rho g^{-1},\ g\bigl( b_d + [\rho, \beta_d] \bigr) g^{-1} \bigr), \tag{5.8}
\]
where $\rho$ is dual to $g^{-1}dg$ and $b_d$ is dual to $d\beta_d$. The complex symplectic form on $M_{U(n)}(c - 1)$ is given by
\[
\omega = (c - 1) \operatorname{tr}\bigl( d\alpha(+\infty) \wedge d\beta(+\infty) \bigr) + \int_0^{+\infty} \operatorname{tr}\bigl( d\alpha \wedge d\beta - d\alpha(+\infty) \wedge d\beta(+\infty) \bigr).
\]
For two tangent vectors $(a, b)$ and $(\hat a, \hat b)$, corresponding, via (5.8), to $(\rho, b_d)$ and $(\hat\rho, \hat b_d)$ we obtain
\[
\omega = -\operatorname{tr}\bigl( b_d\hat\rho - \rho\hat b_d - [\rho, \hat\rho]\beta_d \bigr),
\]
where $\rho = \rho(0)$, $\hat\rho = \hat\rho(0)$. To calculate the symplectic form on $\tilde M_n^{reg}(c)$ it remains to substitute (5.6) for $g$. Let us write $u$ for $\mathrm{diag}(u_1, \dots, u_n)$. Then $\rho$ becomes dual to $u^{-1}du - u^{-1}\,dV\,V^{-1}u$. Let us write $\nu$ for the tangent vector dual to $u^{-1}du$ and $\Upsilon$ for the tangent vector dual to $dV\,V^{-1}$. Since $\nu$ is diagonal and the $i$th row of $\Upsilon$ is of the form $b_i s$ (here we write $b_d = \mathrm{diag}(b_1, \dots, b_n)$), for a covector $s$, we can write $\omega$ as
\[
\omega = -\operatorname{tr}\bigl( b_d\hat\nu - \nu\hat b_d - [\Upsilon, \hat\Upsilon]\beta_d \bigr).
\]
It remains to calculate $\operatorname{tr}[\Upsilon, \hat\Upsilon]\beta_d$. Let us write $W_{ij}$ for the $(i, j)$th entry of $V^{-1}$, i.e.
\[
W_{ij} = (-1)^{n-i} S_{n-i}(\beta_1, \dots, \hat\beta_j, \dots, \beta_n) \Big/ \prod_{k \neq j} (\beta_j - \beta_k), \tag{5.9}
\]
$S_k$ being the $k$th elementary symmetric polynomial ($S_0 = 1$). We calculate the $(i, i)$th entry of $[\Upsilon, \hat\Upsilon]$ as
\[
\sum_j (b_i\hat b_j - \hat b_i b_j) \Bigl( \sum_k (k - 1)\beta_i^{k-2} W_{kj} \Bigr) \Bigl( \sum_k (k - 1)\beta_j^{k-2} W_{ki} \Bigr).
\]
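This means that
\[
\operatorname{tr}[\Upsilon, \hat\Upsilon]\beta_d = \sum_{i<j} (b_i\hat b_j - \hat b_i b_j)(\beta_i - \beta_j) \Bigl( \sum_k (k - 1)\beta_i^{k-2} W_{kj} \Bigr) \Bigl( \sum_k (k - 1)\beta_j^{k-2} W_{ki} \Bigr).
\]
Formula (5.7) will be proven if we can show (for $i \neq j$) the following identity:
\[
\Bigl( \sum_k (k - 1)\beta_i^{k-2} W_{kj} \Bigr) \Bigl( \sum_k (k - 1)\beta_j^{k-2} W_{ki} \Bigr) = \frac{-1}{(\beta_i - \beta_j)^2}. \tag{5.10}
\]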
According to (5.9) we have
\[
\sum_k (k - 1)\beta_i^{k-2} W_{kj} = \frac{\sum_k (k - 1)\beta_i^{k-2} (-1)^{n-k} S_{n-k}(\beta_1, \dots, \hat\beta_j, \dots, \beta_n)}{\prod_{s \neq j} (\beta_j - \beta_s)}. \tag{5.11}
\]
We compute the numerator of this expression. We set $p = n - 1$ and $(a_1, \dots, a_p) = (\beta_1, \dots, \hat\beta_j, \dots, \beta_n)$. Then the numerator can be written as
\[
\sum_{s=0}^{p} (p - s)(-1)^s a_i^{p-1-s} S_s(a_1, \dots, a_p) = \frac{d}{dt} \Bigl( \sum_{s=0}^{p} (-1)^s S_s t^{p-s} \Bigr) \Big|_{t = a_i}.
\]
Since $\sum_s S_s t^s = \prod_s (1 + a_s t)$, we can rewrite the expression under the derivative as
\[
\sum_{s=0}^{p} (-1)^s S_s t^{p-s} = \prod_{s=1}^{p} (t - a_s).
\]
Taking the derivative and substituting $a_i$ for $t$, finally gives
\[
\sum_{s=0}^{p} (p - s)(-1)^s a_i^{p-1-s} S_s(a_1, \dots, a_p) = \prod_{s \neq i} (a_i - a_s).
\]
Going back to (5.11), we have
\[
\sum_k (k - 1)\beta_i^{k-2} W_{kj} = \frac{\prod_{s \neq i, j} (\beta_i - \beta_s)}{\prod_{s \neq j} (\beta_j - \beta_s)},
\]
from which (5.10) follows.
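As a sanity check on the computation above, the following sketch builds $W = V^{-1}$ directly from (5.9), confirms that it inverts the Vandermonde matrix, and verifies the identity (5.10) in exact rational arithmetic. It is an illustration rather than part of the proof; the function names and the sample values of $\beta_i$ are assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elem_sym(k, xs):
    """k-th elementary symmetric polynomial of xs (S_0 = 1)."""
    return sum((prod(c) for c in combinations(xs, k)), Fraction(0))

def inverse_vandermonde(betas):
    """W from (5.9); W[i][j] is 0-based, so 1-based row i+1 uses S_(n-i-1)."""
    n = len(betas)
    W = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        rest = betas[:j] + betas[j + 1:]  # beta_j omitted
        denom = prod(betas[j] - b for b in rest)
        for i in range(n):
            W[i][j] = (-1) ** (n - i - 1) * elem_sym(n - i - 1, rest) / denom
    return W

def is_inverse(betas):
    """True if V W is the identity, with V_ij = beta_i^(j-1)."""
    n = len(betas)
    W = inverse_vandermonde(betas)
    return all(
        sum(betas[m] ** i * W[i][j] for i in range(n)) == (1 if m == j else 0)
        for m in range(n) for j in range(n)
    )

def derivative_sum(betas, W, i, j):
    """sum_k (k-1) beta_i^(k-2) W_kj with 1-based k; the k = 1 term vanishes."""
    n = len(betas)
    return sum((k - 1) * betas[i] ** (k - 2) * W[k - 1][j]
               for k in range(2, n + 1))

def identity_510_holds(betas):
    """True if (5.10) holds for every pair i != j."""
    n = len(betas)
    W = inverse_vandermonde(betas)
    for i in range(n):
        for j in range(n):
            if i != j:
                lhs = derivative_sum(betas, W, i, j) * derivative_sum(betas, W, j, i)
                if lhs != Fraction(-1) / (betas[i] - betas[j]) ** 2:
                    return False
    return True

if __name__ == "__main__":
    sample = [Fraction(1), Fraction(-2), Fraction(3, 2), Fraction(5)]
    print(is_inverse(sample), identity_510_holds(sample))
```

The check reflects the fact that the $j$th column of $W$ holds the coefficients of the Lagrange polynomial vanishing at all $\beta_s$, $s \neq j$, so that `derivative_sum` evaluates its derivative at $\beta_i$.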
Remark 5.5. Setting
\[
p_i = u_i \Big/ \prod_{j > i} (\beta_i - \beta_j),
\]
the formula (5.7) can be rewritten as
\[
\omega = \sum_{i=1}^{n} \frac{dp_i}{p_i} \wedge d\beta_i.
\]
6. The Twistor Space and the Metric on $\tilde M_n(c)$

We shall now identify the twistor space $Z(c)$ of $\tilde M_n(c)$. As a first step, we observe, after Hitchin et al. [18], that the hyperkähler moment map $\mu$ for the $T^n$-action defines a moment map, also denoted by $\mu$, for the complex-symplectic form along the fibers $Z(c) \to \mathbb{CP}^1$. This $\mu$ is a map from $Z(c)$ to $O(2) \otimes \mathbb{C}^n$. We shall first identify the open subset $Z^{reg}(c)$ of $Z(c)$ defined as the set
\[
Z^{reg}(c) = \mu^{-1}\bigl( O(2) \otimes \mathbb{C}^n - O(2) \otimes \Delta \bigr), \tag{6.1}
\]
where $\Delta$ is the generalized diagonal in $\mathbb{C}^n$. In terms of the coordinates $(\beta_1, \dots, \beta_n)$ and $(u_1, \dots, u_n)$ given by (5.6), $Z^{reg}(c)$ has the following description:

Proposition 6.1. $Z^{reg}(c)$ is obtained by taking two copies of $\mathbb{C} \times (\mathbb{C}^n - \Delta) \times (\mathbb{C}^*)^n$ with coordinates $(\zeta, \beta_i, u_i)$ and $(\tilde\zeta, \tilde\beta_i, \tilde u_i)$, $i = 1, \dots, n$, and identifying over $\zeta \neq 0$ by
\[
\tilde\zeta = \zeta^{-1}, \qquad \tilde\beta_i = \zeta^{-2}\beta_i, \qquad \tilde u_i = \zeta^{-(n-1)} \exp\{-c\beta_i/\zeta\}\, u_i.
\]
The real structure is given by
\[
\zeta \mapsto -1/\bar\zeta, \qquad \beta_i \mapsto -\bar\beta_i/\bar\zeta^2, \qquad u_i \mapsto \bar u_i^{-1} \bigl( 1/\bar\zeta \bigr)^{n-1} \prod_{j \neq i} (\bar\beta_i - \bar\beta_j)\, e^{c\bar\beta_i/\bar\zeta}.
\]
Finally, the complex symplectic form along the fibers is given by (5.7).

Proof. For any hyperkähler moduli space of solutions to Nahm's equations one can trivialize the twistor space by choosing an affine coordinate $\zeta$ on $\mathbb{CP}^1$ and then putting $\eta = \beta + (\alpha + \alpha^*)\zeta - \beta^*\zeta^2$, $u = \alpha - \beta^*\zeta$ for $\zeta \neq \infty$, and $\tilde\eta = \beta/\zeta^2 + (\alpha + \alpha^*)/\zeta - \beta^*$, $\tilde u = -\alpha^* - \beta/\zeta$ for $\zeta \neq 0$. Then, over $\zeta \neq 0, \infty$, we have $\tilde\eta = \eta/\zeta^2$, $\tilde u = u - \eta/\zeta$. Moreover, the real structure is $\zeta \mapsto -1/\bar\zeta$, $\eta \mapsto -\eta^*/\bar\zeta^2$, $u \mapsto -u^* + \eta^*/\bar\zeta$ (cf. [12, 9]).

We now have to go through the procedure in the proof of Proposition 5.3 to describe $Z^{reg}$ in coordinates $(\zeta, \beta_i, u_i)$ and $(\tilde\zeta, \tilde\beta_i, \tilde u_i)$. First we describe the twistor space of $N_n$ in coordinates $(g, B)$ and $(\tilde g, \tilde B)$ defined right after (5.5) (cf. [12]). Going through the procedure assigning $(g, B)$ to $(\alpha, \beta)$, we see that $\tilde B = B(\beta_1/\zeta^2, \dots, \beta_n/\zeta^2)$. On the other hand $g$ is given by $g = g(1)$, where $g(t)$ is a complex gauge transformation such that $\frac{d}{dt} g^{-1} = -u g^{-1}$. This means that $g(t)$ makes $u$ identically zero. We observe that $\exp\{-Bt/\zeta\}g(t)$ makes $\tilde u$ identically zero and $\tilde\eta$ into $B/\zeta^2$. The initial value for the solution $g^{-1}$ depends on $\zeta$ and so we can write $\tilde g(t) = U \exp\{-Bt/\zeta\}g(t)$ for some constant matrix $U$. If we are to get the form (5.5), we must have $U = U' d(\zeta)$, where
\[
d(\zeta) = \mathrm{diag}\bigl( \zeta^{-(n-1)}, \zeta^{-(n-3)}, \dots, \zeta^{n-1} \bigr). \tag{6.2}
\]
In addition $U'$ commutes with $B(\beta_1/\zeta^2, \dots, \beta_n/\zeta^2)$.
Moreover, the initial value for the equation $\frac{d}{dt} g^{-1} = -\alpha g^{-1}$ depends only on the residues of $u, \eta, \tilde u, \tilde\eta$ and therefore $U'$ does not depend on $B$. Since the initial values belong to $SU(n)$, we also have $U' \in SU(n)$. It follows that $U'$ belongs to the center of $SU(n)$. This is only an ambiguity in the choice of trivialization and it does not affect the twistor space. Similar considerations show that the real structure sends $B(\beta_1, \dots, \beta_n)$ to $B(-\bar\beta_1/\bar\zeta^2, \dots, -\bar\beta_n/\bar\zeta^2)$ and $g$ to $r(\zeta) \exp\{B^*/\bar\zeta\}(g^*)^{-1}$, where
\[
r_{ij}(\zeta) =
\begin{cases}
0 & \text{if } i + j \neq n, \\
(-1)^{j-1}\, \bar\zeta^{\,n+1-2j} & \text{if } i + j = n.
\end{cases}
\]
This time the remaining ambiguity is given by a real element in the center of $SU(n)$, i.e. $-1$ if $n$ is even.

We now go through a similar procedure for the subset of $M_{U(n)}(c - 1)$, where $\beta(+\infty)$ is regular. We have assigned in the proof of Proposition 5.3 to each element of this set a pair $(g, \beta(+\infty))$. We already know how $\beta(+\infty)$ changes (as it is given by the complex moment map for a torus action). The proof of Proposition 5.3 shows that the other coordinates, $g$ on $\{\zeta \neq \infty\}$ and $\tilde g$ on $\{\zeta \neq 0\}$, are related by $\tilde g = g \exp\{-(c - 1)\beta(+\infty)/\zeta\}$. The real structure sends $g$ to $(g^*)^{-1} \exp\{(c - 1)\beta(+\infty)^*/\bar\zeta\}$.

Finally we have to go to the complex-symplectic quotient as in the proof of Proposition 5.3. We end up with $(g, \beta_d)$ and $(\tilde g, \tilde\beta_d)$, where $\beta_d = \mathrm{diag}(\beta_1, \dots, \beta_n)$ and $g\beta_d g^{-1} = B(\beta_1, \dots, \beta_n)$ (and similarly for $(\tilde g, \tilde\beta_d)$). We see that $\beta_i$ and $\tilde\beta_i$ are related as stated and $\tilde g = d(\zeta) \exp\{-B/\zeta\}\, g \exp\{-(c - 1)\beta_d/\zeta\}$. Since $\exp\{-B/\zeta\}g = g \exp\{-\beta_d/\zeta\}$, $\tilde g = d(\zeta)\, g \exp\{-c\beta_d/\zeta\}$. If we now go to the coordinates $u_i, \tilde u_i$ defined by (5.6), we see that they change as required, since the $(i, j)$th entry of $V^{-1}$ is given by (5.9) and the $\beta_i$ change as prescribed (i.e. as sections of $O(2)$). A similar argument shows that the real structure is, up to a sign, the one described in the statement (it is enough to compare the last row in $r(\zeta)\bigl( V^{-1}\mathrm{diag}\{u_i\} \bigr)^{*-1} \mathrm{diag}\{e^{c\bar\beta_i/\bar\zeta}\}$ and in $V(-\bar\beta_1/\bar\zeta^2, \dots, -\bar\beta_n/\bar\zeta^2)^{-1} \mathrm{diag}\{u_i'\}$). We shall see shortly (Proposition 6.2) that the negative of the real structure described in the statement does not admit any sections (a section would be equivalent to a complex number with imaginary modulus). The formula for the complex symplectic structure is a direct consequence of Proposition 5.4.

We now wish to find the full twistor space and the metric on $\tilde M_n(c)$ and this means finding a family of real sections. We know their projections to $O(2) \otimes \mathbb{C}^n$: they are given by $(\beta + (\alpha + \alpha^*)\zeta - \beta^*\zeta^2)(+\infty)$ (cf. [18]) and are parameterized by $n$ distinct points in $\mathbb{R}^3$ with coordinates $(x_i, \operatorname{Re} z_i, \operatorname{Im} z_i)$, $i = 1, \dots, n$, where $x_i = \sqrt{-1}\,T_1(+\infty)$, $z_i = \beta(+\infty)$. In other words we have $n$ curves $S_i = \{(\zeta, \eta);\ \eta = z_i + 2x_i\zeta - \bar z_i\zeta^2\}$ in $T\mathbb{CP}^1$ (here $\eta$ is the fiber coordinate). According to Proposition 6.1 the $u_i$ coordinate of a real section of $Z(c)$ changes as a non-zero section of the bundle $L^c(k - 1)$ (with the transition function $\zeta^{k-1}e^{c\eta/\zeta}$ from $\infty$ to $0$) over $S_i$. This is true only away from the intersection points of the curves $S_i$ and we have to understand what happens to the section at these points. Two curves $S_i = \{(\zeta, \eta);\ \eta = z_i + 2x_i\zeta - \bar z_i\zeta^2\}$ and $S_j = \{(\zeta, \eta);\ \eta = z_j + 2x_j\zeta - \bar z_j\zeta^2\}$ intersect in a pair of distinct points $a_{ij}$ and $a_{ji}$, where
\[
a_{ij} = \frac{(x_i - x_j) + r_{ij}}{\bar z_i - \bar z_j}, \qquad r_{ij} = \sqrt{(x_i - x_j)^2 + |z_i - z_j|^2}. \tag{6.3}
\]
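The formula (6.3) and the reality of the curves $S_i$ can be verified numerically. The sketch below is an illustration only: it evaluates both curves at $a_{ij}$ and $a_{ji}$, checks the relation $a_{ji} = -1/\bar a_{ij}$ used later in the proof of Proposition 6.2, and checks that the real structure $\zeta \mapsto -1/\bar\zeta$, $\eta \mapsto -\bar\eta/\bar\zeta^2$ maps $S_i$ to itself. The function names and the two sample points of $\mathbb{R} \times \mathbb{C}$ are assumptions, and the code requires $z_i \neq z_j$.

```python
import math

def eta(x, z, zeta):
    """Curve S: eta = z + 2*x*zeta - conj(z)*zeta**2 for a point (x, z) in R x C."""
    z = complex(z)
    return z + 2 * x * zeta - z.conjugate() * zeta ** 2

def a(xi, zi, xj, zj):
    """Intersection point a_ij from (6.3); requires z_i != z_j."""
    dz = complex(zi) - complex(zj)
    r = math.sqrt((xi - xj) ** 2 + abs(dz) ** 2)
    return ((xi - xj) + r) / dz.conjugate()

def real_structure(zeta, eta_value):
    """Real structure on T CP^1: (zeta, eta) -> (-1/conj(zeta), -conj(eta)/conj(zeta)^2)."""
    zb = complex(zeta).conjugate()
    return -1 / zb, -complex(eta_value).conjugate() / zb ** 2

# Two sample points (x, z); the values are arbitrary illustrative choices.
p_i = (0.3, 1 + 2j)
p_j = (-1.1, -0.5 + 0.25j)
a_ij = a(*p_i, *p_j)
a_ji = a(*p_j, *p_i)

def residuals():
    """Absolute errors of the four checks; all should be at rounding level."""
    meet_ij = abs(eta(*p_i, a_ij) - eta(*p_j, a_ij))
    meet_ji = abs(eta(*p_i, a_ji) - eta(*p_j, a_ji))
    antipodal = abs(a_ji + 1 / a_ij.conjugate())
    zeta = 0.4 - 0.9j
    w, image = real_structure(zeta, eta(*p_i, zeta))
    real_section = abs(image - eta(*p_i, w))
    return meet_ij, meet_ji, antipodal, real_section

if __name__ == "__main__":
    print(residuals())
```

The two intersection points are the two roots of the quadratic $(\bar z_i - \bar z_j)\zeta^2 - 2(x_i - x_j)\zeta - (z_i - z_j) = 0$, which is where $r_{ij}$ comes from.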
We have:

Proposition 6.2. The real sections of the twistor space $Z(c)$ of $\tilde M_n(c)$ are given, over $\zeta \neq \infty$, by $\bigl( \beta_1(\zeta), \dots, \beta_n(\zeta), u_1(\zeta), \dots, u_n(\zeta) \bigr)$, where
\[
\beta_i(\zeta) = z_i + 2x_i\zeta - \bar z_i\zeta^2, \qquad u_i(\zeta) = A_i \prod_{j \neq i} (\zeta - a_{ji})\, e^{c(x_i - \bar z_i\zeta)},
\]
where $(x_i, z_i)$, $i = 1, \dots, n$, are distinct points in $\mathbb{R} \times \mathbb{C}$ and $A_i$ are complex numbers satisfying
\[
A_i \bar A_i = \prod_{j \neq i} (x_i - x_j + r_{ij}).
\]
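The sections of Proposition 6.2 can be tested against the patching of Proposition 6.1. Applying $\tilde u_i = \zeta^{-(n-1)}\exp\{-c\beta_i/\zeta\}u_i$ to the stated $u_i(\zeta)$ should give $A_i \prod_{j \neq i}(1 - a_{ji}\tilde\zeta)\, e^{-c(x_i + z_i\tilde\zeta)}$ with $\tilde\zeta = 1/\zeta$, which is holomorphic at $\tilde\zeta = 0$. The sketch below compares the two expressions numerically. It is an illustration, not part of the paper: the function names, the three sample points (with pairwise distinct $z_i$), the value of $c$ and the choice of a positive real $A_i$ are assumptions.

```python
import cmath
import math

def a(xi, zi, xj, zj):
    """Intersection point a_ij from (6.3); requires z_i != z_j."""
    dz = complex(zi) - complex(zj)
    r = math.sqrt((xi - xj) ** 2 + abs(dz) ** 2)
    return ((xi - xj) + r) / dz.conjugate()

def modulus_sq_A(i, points):
    """|A_i|^2 = prod_{j != i} (x_i - x_j + r_ij), as in Proposition 6.2."""
    xi, zi = points[i]
    out = 1.0
    for j, (xj, zj) in enumerate(points):
        if j != i:
            out *= (xi - xj) + math.sqrt((xi - xj) ** 2 + abs(zi - zj) ** 2)
    return out

def u(i, zeta, points, A, c):
    """u_i(zeta) from Proposition 6.2, with zeros at the points a_ji."""
    xi, zi = points[i]
    val = A * cmath.exp(c * (xi - zi.conjugate() * zeta))
    for j, (xj, zj) in enumerate(points):
        if j != i:
            val *= zeta - a(xj, zj, xi, zi)  # a_ji
    return val

def u_tilde_from_transition(i, zeta, points, A, c):
    """Apply the patching of Proposition 6.1 to u_i(zeta)."""
    n = len(points)
    xi, zi = points[i]
    beta = zi + 2 * xi * zeta - zi.conjugate() * zeta ** 2
    return zeta ** (-(n - 1)) * cmath.exp(-c * beta / zeta) * u(i, zeta, points, A, c)

def u_tilde_closed_form(i, zeta_tilde, points, A, c):
    """Expected expression in the chart around zeta = infinity."""
    xi, zi = points[i]
    val = A * cmath.exp(-c * (xi + zi * zeta_tilde))
    for j, (xj, zj) in enumerate(points):
        if j != i:
            val *= 1 - a(xj, zj, xi, zi) * zeta_tilde
    return val

# Sample data: three points (x, z) with distinct z, and c = 1 (illustrative values).
points = [(0.3, 1 + 2j), (-1.1, -0.5 + 0.25j), (0.8, -1j)]
c = 1.0

def max_mismatch(zeta=0.7 - 1.3j):
    """Largest discrepancy between the two expressions over i = 0, 1, 2."""
    worst = 0.0
    for i in range(len(points)):
        A = math.sqrt(modulus_sq_A(i, points))  # positive real representative of A_i
        lhs = u_tilde_from_transition(i, zeta, points, A, c)
        rhs = u_tilde_closed_form(i, 1 / zeta, points, A, c)
        worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    print(max_mismatch())
```

The agreement shows that the exponential factor $e^{c(x_i - \bar z_i\zeta)}$ is exactly what cancels the essential singularity of $\exp\{-c\beta_i/\zeta\}$ at $\zeta = \infty$, and that the $n - 1$ zeros $a_{ji}$ absorb the factor $\zeta^{-(n-1)}$.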
Remark. Given Proposition 5.2, this finally shows that the biholomorphism of Proposition 5.3 is onto.

Proof. Consider a real section $s$ of $Z(c)$ (corresponding to a solution $(T_0, T_1, T_2, T_3)$) which projects to a given real section $(\beta_1(\zeta), \dots, \beta_n(\zeta))$ of $O(2) \otimes \mathbb{C}^n$. For a generic section the intersection points of the $\beta_s$ are all distinct. We consider the point $a_{ji}$ at which $\beta_i$ intersects $\beta_j$ and let us assume that no other $\beta_s$ intersect there. We recall that $\sqrt{-1}\,T_1(\zeta) = \frac{1}{2}(\alpha + \alpha^*) - \beta^*\zeta$ and, hence, $\sqrt{-1}\,T_1(\zeta)(+\infty)_{ss} = x_s - \bar z_s\zeta$. This means that $\sqrt{-1}\,T_1(a_{ji})(+\infty)_{jj} < \sqrt{-1}\,T_1(a_{ji})(+\infty)_{ii}$, and so, with respect to the complex structure corresponding to $a_{ji} \in \mathbb{CP}^1$, the solution $(T_0, T_1, T_2, T_3)$ belongs to the chart described in Proposition 5.3 with $\mathfrak{n}$ generated by the matrix with the only nonzero entry having coordinates $(i, j)$. Let us write $s$ as $(\beta_i(\zeta), u_i(\zeta))$, $i = 1, \dots, n$, in a neighbourhood of $a_{ji}$, $\zeta \neq a_{ji}$ (notice that the procedure of Proposition 6.1 does assign well-defined complex numbers $u_1(\zeta), \dots, u_n(\zeta)$ to each $\zeta \neq a_{ji}$). According to the proof of Proposition 5.3 there is an element $m(\zeta) \in N = \exp \mathfrak{n}$ such that the following expression
\[
V\bigl( \beta_1(\zeta), \dots, \beta_n(\zeta) \bigr)^{-1} \mathrm{diag}\bigl( u_1(\zeta), \dots, u_n(\zeta) \bigr)\, m(\zeta)
\]
has an invertible limit at $\zeta = a_{ji}$.

Let $W_{kl}(\zeta)$ denote the $(k, l)$th entry of $V\bigl( \beta_1(\zeta), \dots, \beta_n(\zeta) \bigr)^{-1}$ and let $p(\zeta)$ denote the only non-zero non-diagonal entry of $\mathrm{diag}\bigl( u_1(\zeta), \dots, u_n(\zeta) \bigr) m(\zeta)$ ($p(\zeta)$ is the $(i, j)$th entry). We then have that $W_{kj}u_j + W_{ki}p$ and $W_{ki}u_i$ have a finite limit at $\zeta = a_{ji}$, for all $k = 1, \dots, n$. From the formula (5.9) a finite limit for $W_{ni}u_i$ implies that $u_i(a_{ji}) = 0$, while the nonvanishing of the last row of $V^{-1}\mathrm{diag}(u_s)m$ means that $a_{ji}$ is a single zero of $u_i$. If more than two sections $\beta_s(\zeta)$ meet at $a_{ji}$ the considerations are similar but involve larger $\mathfrak{n}$.
We can conclude that the $a_{ji}$ contribute precisely $n - 1$ zeros of $u_i$ (counting multiplicities) and, given Proposition 6.1, this proves the formula for $u_i(\zeta)$ as soon as we show that $u_i$ has no other zeros, or, equivalently, no poles. To prove this latter statement it is enough to show that $u_j$ does not have a pole at $a_{ji}$. We go back to the situation when $\mathfrak{n}$ is one-dimensional, and where we concluded that $W_{kj}u_j + W_{ki}p$ has a finite limit at $\zeta = a_{ji}$ for all $k = 1, \dots, n$. We can write $W_{nj}u_j + W_{ni}p$ as $(fu_j + gp)/(\beta_i - \beta_j)$ where $f$ and $g$ have finite limits at $\zeta = a_{ji}$. We then have
\[
W_{n-1,j}u_j + W_{n-1,i}p = -\frac{1}{\beta_i - \beta_j} \Bigl( fu_j \sum_{s \neq j} \beta_s + gp \sum_{s \neq i} \beta_s \Bigr),
\]
which can be rewritten as
\[
-fu_j - \sum_{s \neq i} \beta_s\, \frac{fu_j + gp}{\beta_i - \beta_j}.
\]
Since the second term has a finite limit, so does $fu_j$ and hence $u_j$. Again, if more than two sections $\beta_s(\zeta)$ meet at $a_{ji}$ the considerations are similar but involve larger $\mathfrak{n}$. Thus
we have shown the second formula of the statement. The last formula follows from the reality condition and the fact that $a_{ji} = -1/\bar a_{ij}$ (this calculation also eliminates the $\pm 1$ ambiguity in the choice of the real structure in the proof of 6.1).

We can finally identify $\tilde M_n(c)$ as a $T^n$-bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ of $n$ distinct points $x_i$ in $\mathbb{R}^3$.

Proposition 6.3. $\tilde M_n(c)$ is equivalent to the $T^n$-bundle described in Proposition 1.1.

Proof. From the last formula in Proposition 6.2 it follows that $A_i \neq 0$ if, for all $j \neq i$, $z_i \neq z_j$ or $x_i > x_j$. On the other hand, if we put
\[ A_I = A_i \prod_{j \in I} a_{ji}, \]
for any subset $I$ of $\{j;\ j \neq i\}$, then we have
\[ A_I \bar A_I = \prod_{j \neq i,\ j \notin I} \bigl(x_i - x_j + r_{ij}\bigr) \prod_{j \in I} \bigl(x_j - x_i + r_{ij}\bigr). \]
Let us choose sets $I_1, \dots, I_n$ such that $I_i \subset \{j;\ j \neq i\}$ and $j \in I_i \Leftrightarrow i \notin I_j$. Define $U(I_1, \dots, I_n)$ as the complement of the subset $\{(x_i, z_i)_{i=1,\dots,n};\ I_i^c = \{j;\ z_i = z_j \text{ and } x_i < x_j\}\}$ ($I_i^c$ denotes the complement of $I_i$ in $\{j;\ j \neq i\}$). The sets $U(I_1, \dots, I_n)$ cover $\tilde C_n(\mathbb{R}^3)$ and over each of them the bundle $\tilde M_n(c)$ is trivialized by coordinates $\bigl(x_i, z_i, A_{I_i}/|A_{I_i}|\bigr)$. To determine the bundle, choose $i < j$. The bundle restricted to $S^2_{ij}$ is given by the transition function from $U(I_1, \dots, I_n)$, where $j \notin I_i$, to $U(I_1', \dots, I_n')$, where $I_i' = I_i \cup \{j\}$, $I_j' = I_j - \{i\}$, $I_k' = I_k$ for $k \neq i, j$. Let $\phi_k$ be the transition function for the $k$th generator of $T^n$, i.e. the transition function from $A_{I_k}/|A_{I_k}|$ to $A_{I_k'}/|A_{I_k'}|$. We see that $\phi_k = 1$ if $k \neq i, j$, and $\phi_i = a_{ji}/|a_{ji}|$, $\phi_j = |a_{ji}|/a_{ji}$. Therefore $\phi_i = (z_j - z_i)/|z_j - z_i|$ and $\phi_j = \phi_i^{-1}$. It remains to identify the circle bundle over the sphere $x^2 + |z|^2 = \text{const}$ given by the transition function $z/|z|$ from the region $U_0 = \{z \neq 0 \text{ or } x > 0\}$ to the region $U_1 = \{z \neq 0 \text{ or } x < 0\}$. Let us write the unit 3-sphere as $\{(u, v) \in \mathbb{C}^2;\ |u|^2 + |v|^2 = 1\}$. The Hopf bundle is given by the $S^1$-action $t \cdot (u, v) = (tu, t^{-1}v)$ and the projection $S^3 \to S^2$ by the map $x = |u|^2 - |v|^2$, $z = 2uv$. Over $U_0$ this bundle is trivialized by $(x, z, u/|u|)$ and over $U_1$ by $(x, z, |v|/v)$. The transition function is $|z|/z$. Thus $[\phi_i] = -1 \in H^1(S^2_{ij}, S^1)$.

We can now calculate the metric on $\tilde M_n(c)$. By the remark at the end of Sect. 3, it is enough to know the metric for $c = -1, 0, 1$, as the others are obtained by homothety. We shall calculate the metric for $c = 1$. The metric for $c = -1$ is the everywhere negative definite version of the Gibbons–Manton metric (this can be seen from the $c = 1$ calculation) and the one for c = 0 the and negative-definite cone over a 3-Sasakian manifold.

Theorem 6.4. $\tilde M_n(1)$ is isomorphic, as a hyperkähler manifold, to the Gibbons–Manton manifold $M_{GM}$ defined in Sect. 1.

Proof.
We know from the previous proposition that the two spaces are diffeomorphic. We shall show that the twistor description of $\tilde M_n(1)$ and of the Gibbons–Manton metric coincide. We recall from Sect. 1 that the latter is a hyperkähler quotient of $M = M_1 \times M_2$ by a torus, where $M_1 = (S^1 \times \mathbb{R}^3)^n$ and $M_2 = \mathbb{H}^{n(n-1)/2}$. With respect to any complex structure $M_1 = (\mathbb{C}^*)^n \times \mathbb{C}^n$ and $M_2 = \mathbb{C}^{n(n-1)/2} \times \mathbb{C}^{n(n-1)/2}$. Let us write the corresponding complex coordinates as $(p_i, \beta_i)$, $i = 1, \dots, n$, on $M_1$ and as $(v_{ij}, w_{ij})$, $i < j$, on $M_2$. The complex-symplectic forms corresponding to metrics $g_1$ and $g_2$ are given by
\[ \sum_{i=1}^{n} \frac{dp_i}{p_i} \wedge d\beta_i, \tag{6.4} \]
\[ \sum_{i<j} dv_{ij} \wedge dw_{ij}. \tag{6.5} \]
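The real sections of the twistor space $Z_1$ of $M_1$ are written, over $\zeta \neq \infty$, as
\[ p_i(\zeta) = B_i e^{x_i - \bar z_i \zeta}, \qquad \beta_i(\zeta) = z_i + 2x_i\zeta - \bar z_i\zeta^2, \tag{6.6} \]
where $B_i \bar B_i = 1$. The real sections of the twistor space $Z_2$ of $M_2$ are (cf. [2], chapter 13.F):
\[ v_{ij}(\zeta) = C_{ij}(\zeta - a_{ij}), \qquad w_{ij}(\zeta) = D_{ij}(\zeta - a_{ji}), \tag{6.7} \]
where $a_{ij}, a_{ji}$ are roots of $v_{ij}w_{ij} = z_{ij} + 2x_{ij}\zeta - \bar z_{ij}\zeta^2$ for some $(x_{ij}, z_{ij}) \in \mathbb{R} \times \mathbb{C}$, i.e.
\[ a_{ij} = \frac{x_{ij} + \sqrt{x_{ij}^2 + |z_{ij}|^2}}{\bar z_{ij}}, \qquad a_{ji} = \frac{x_{ij} - \sqrt{x_{ij}^2 + |z_{ij}|^2}}{\bar z_{ij}}, \]
and
\[ C_{ij}\bar C_{ij} = -x_{ij} + \sqrt{x_{ij}^2 + |z_{ij}|^2}, \qquad D_{ij}\bar D_{ij} = x_{ij} + \sqrt{x_{ij}^2 + |z_{ij}|^2}. \]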
Here the particular choice of sections is forced either by the fact that the metric is positive definite or by requiring that the $S^1$-action $t \cdot (v_{ij}, w_{ij}) = (t v_{ij}, t^{-1} w_{ij})$ determines the Hopf bundle over the 2-sphere $x_{ij}^2 + |z_{ij}|^2 = 1$ (this calculation was done in the proof of Proposition 6.3). To obtain the twistor description of the Gibbons–Manton metric we have to perform the complex-symplectic quotient construction along the fibers of $Z_1 \oplus Z_2$ with respect to the difference of the forms (6.4) and (6.5). As in Sect. 1, the moment map equations are $v_{ij}w_{ij} = \beta_i - \beta_j$ and so the $a_{ij}, a_{ji}$ are given by (6.3). Since we already know that the manifolds are diffeomorphic, it is sufficient to determine the metric on an open dense subset, e.g. on the set where all $v_{ij}$ are non-zero. Quotienting this set by $(\mathbb{C}^*)^{n(n-1)/2}$ is equivalent to sending all $v_{ij}$ to 1. This is achieved by acting by the element $(v_{ij}^{-1})$ of $(\mathbb{C}^*)^{n(n-1)/2}$. By the description of the torus action given in Sect. 1, this sends $p_i(\zeta)$ to
\[ B_i e^{x_i - \bar z_i\zeta}\, \frac{\prod_{j < i} C_{ji}(\zeta - a_{ji})}{\prod_{j > i} C_{ij}(\zeta - a_{ij})} = E_i\, e^{x_i - \bar z_i\zeta}\, \frac{\prod_{j < i} (\zeta - a_{ji})}{\prod_{j > i} (\zeta - a_{ij})}, \tag{6.8} \]
where
\[ E_i \bar E_i = \frac{\prod_{j < i} (x_i - x_j + r_{ij})}{\prod_{j > i} (x_j - x_i + r_{ij})}. \tag{6.9} \]
These and the $\beta_i$ give the real sections for the Gibbons–Manton metric and the symplectic form is (6.4). We now compare this with the description of $Z(1)$ given in Proposition 6.2. According to Remark 5.5 we should set $p_i = u_i / \prod_{j > i} (\beta_i - \beta_j)$ in order to have the same symplectic form. We obtain
\[ p_i(\zeta) = \frac{A_i}{\prod_{j > i} (\bar z_j - \bar z_i)}\, \frac{\prod_{j < i} (\zeta - a_{ji})}{\prod_{j > i} (\zeta - a_{ij})}\, e^{x_i - \bar z_i\zeta}. \]
All we have to do is to compare the norm of $A_i / \prod_{j > i} (\bar z_j - \bar z_i)$ with the norm of $E_i$. We have, from Proposition 6.2 and Eq. (6.9),
\[ \frac{A_i \bar A_i}{\prod_{j > i} |z_j - z_i|^2} = \frac{\prod_{j \neq i} (x_i - x_j + r_{ij})}{\prod_{j > i} |z_j - z_i|^2} = \prod_{j < i} (x_i - x_j + r_{ij}) \prod_{j > i} \frac{x_i - x_j + r_{ij}}{|z_j - z_i|^2} = \frac{\prod_{j < i} (x_i - x_j + r_{ij})}{\prod_{j > i} (x_j - x_i + r_{ij})} = E_i \bar E_i, \]
which proves the theorem.
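The algebra behind this comparison can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the formulas for $a_{ij}$, $a_{ji}$ stated before (6.7), the norm $A_i\bar A_i = \prod_{j\neq i}(x_i - x_j + r_{ij})$ from Proposition 6.2 with $r_{ij} = \sqrt{(x_i-x_j)^2 + |z_i-z_j|^2}$, and Eq. (6.9). The function names and the sample points are hypothetical and chosen only for illustration.

```python
import math

def intersection_points(x, z):
    """Roots of z + 2*x*zeta - conj(z)*zeta**2 for real x and complex z != 0."""
    r = math.hypot(x, abs(z))
    zb = z.conjugate()
    return (x + r) / zb, (x - r) / zb, r

def check_pair(xi, zi, xj, zj, tol=1e-9):
    """Check the root property, a_ji = -1/conj(a_ij), and r^2 - x^2 = |z|^2."""
    x, z = xi - xj, zi - zj
    a_ij, a_ji, r = intersection_points(x, z)
    for a in (a_ij, a_ji):
        assert abs(z + 2 * x * a - z.conjugate() * a * a) < tol
    assert abs(a_ji + 1 / a_ij.conjugate()) < tol
    assert abs((x + r) * (-x + r) - abs(z) ** 2) < tol
    return r

def norm_pair(points, i):
    """Return (|A_i|^2 / prod_{j>i} |z_j - z_i|^2, |E_i|^2) for 0-based index i."""
    xi, zi = points[i]
    lhs, rhs = 1.0, 1.0
    for j, (xj, zj) in enumerate(points):
        if j == i:
            continue
        r = math.hypot(xi - xj, abs(zi - zj))
        lhs *= xi - xj + r              # factor of |A_i|^2 (assumed from Prop. 6.2)
        if j > i:
            lhs /= abs(zj - zi) ** 2    # norm of prod_{j>i} (conj z_j - conj z_i)
            rhs /= xj - xi + r          # denominator of (6.9)
        else:
            rhs *= xi - xj + r          # numerator of (6.9)
    return lhs, rhs

if __name__ == "__main__":
    # Hypothetical sample configuration of three points (x_i, z_i) in R x C.
    sample = [(0.3, 1 + 2j), (-1.1, 0.5 - 0.7j), (2.0, -1.5 + 0.25j)]
    for i in range(len(sample)):
        for j in range(len(sample)):
            if i != j:
                check_pair(*sample[i], *sample[j])
        lhs, rhs = norm_pair(sample, i)
        assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(rhs))
```

The key step is the identity $(x_i - x_j + r_{ij})(x_j - x_i + r_{ij}) = |z_i - z_j|^2$, which is what converts the factors with $j > i$ in the numerator into the denominator of (6.9).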
We shall finish the section with a remark that Propositions 6.2 and 6.3 can be generalized to define hyperkähler metrics on a class of $T^n$-bundles over $\tilde C_n(\mathbb{R}^3)$. We have:

Theorem 6.5. Let $P$ be a $T^n$-bundle over $\tilde C_n(\mathbb{R}^3)$ determined by an element $(s_1, \dots, s_n)$ of $H^2\bigl(\tilde C_n(\mathbb{R}^3), \mathbb{Z}^n\bigr)$ satisfying $s_k(S^2_{ij}) = 0$ if $k \neq i, j$ and $s_i(S^2_{ij}) = -s_j(S^2_{ij})$. Then $P$ carries a family of (pseudo)-hyperkähler metrics such that the real sections of the twistor space are given, over $\zeta \neq \infty$, by $(\beta_1(\zeta), \dots, \beta_n(\zeta), u_1(\zeta), \dots, u_n(\zeta))$, where
\[ \beta_i(\zeta) = z_i + 2x_i\zeta - \bar z_i\zeta^2, \]
\[ u_i(\zeta) = A_i \prod_{j \neq i} (\zeta - a_{ji})^{s_{ij}}\, e^{c(x_i - \bar z_i\zeta)}, \]
where $c$ is a real constant, $(x_i, z_i)$, $i = 1, \dots, n$, are distinct points in $\mathbb{R} \times \mathbb{C}$, $s_{ij} = |s_i(S^2_{ij})|$, and $A_i$ are complex numbers satisfying
\[ A_i \bar A_i = \prod_{j \neq i} \bigl(x_i - x_j + r_{ij}\bigr)^{s_{ij}}. \]
This description determines a hypercomplex structure on $P$. A (pseudo)-hyperkähler metric can then be calculated using any complex-symplectic form along the fibers, given as a section of $\Lambda^2 T_F^* \otimes O(2)$, e.g. the form (5.7). These metrics will correspond to the motion of $n$ dyons in $\mathbb{R}^3$ interacting in different ways (cf. [14]).

Remark. The calculation of the metric given above shows that the Taub-NUT metric (cf. [2]) has two very different descriptions in terms of Nahm's equations: 1) it is the metric on the totally geodesic submanifold $\tilde M_2^0(-1)$ of $\tilde M_2(-1)$ defined by considering $su(2)$-valued solutions to Nahm's equations and $SU(2)$-valued gauge transformations; 2) it is the metric on the moduli space of $SU(3)$-monopoles of charge $(1, 1)$ [10, 29].
7. Asymptotic Comparison of the Metrics

We shall now show that the Gibbons–Manton metric and the monopole metric are asymptotically exponentially close. The asymptotic region, where the individual monopoles are separated, of the monopole space $M_n$ is diffeomorphic to $P/S_n$, where $P$ is a torus bundle over the configuration space $\tilde C_n(\mathbb{R}^3)$ and $S_n$ the symmetric group. The bundle $P$ is not, however, the bundle of Proposition 6.3. Rather, as we shall see shortly, it is the quotient of that bundle by a $\mathbb{Z}_2^n$-subgroup of $T^n$. In other words it is the bundle determined by an $s \in H^2\bigl(\tilde C_n(\mathbb{R}^3), \mathbb{Z}^n\bigr)$ with all $s_k$ being twice those in Proposition 6.3. We shall compare the metric on $M_n$ with the metric on the hyperkähler quotient of $\tilde M_n(1) \times \tilde M_n(1)$ by the diagonal $T^n$-action. We do this in order to have solutions to Nahm's equations with poles at both ends of the interval $[-1, 1]$. For any $c, c'$, let us write $\tilde M_n(c, c')$ for the hyperkähler quotient of $\tilde M_n(c) \times \tilde M_n(c')$ by the diagonal action of $T^n$. The action of $T^n$ given by $t \cdot (m, m') = (tm, m')$ induces a tri-Hamiltonian action of $T^n$ on $\tilde M_n(c, c')$ which makes $\tilde M_n(c, c')$ into a $T^n$-bundle over $\tilde C_n(\mathbb{R}^3)$. We have

Lemma 7.1. $\tilde M_n(c, c')$ is isomorphic, as a hyperkähler manifold, to $\tilde M_n(c + c')/\mathbb{Z}_2^n$, where $\mathbb{Z}_2^n = \{t \in T^n;\ t^2 = 1\}$.

Proof. Let $\mu, \mu'$ be the moment maps for the action of $T^n$ on $\tilde M_n(c)$, $\tilde M_n(c')$ respectively. The moment map for the diagonal $T^n$-action on the product is $\mu + \mu'$. If we go back to the proof of Proposition 6.3 and use the same notation, we can see that the zero-set of this moment map is a $(T^n \times T^n)$-bundle over $\tilde C_n(\mathbb{R}^3)$ which restricted to each $S^2_{ij}$ is given by transition functions $(\phi_1, \dots, \phi_n, \phi_1^{-1}, \dots, \phi_n^{-1})$ (the point being that $U(I_1', \dots, I_n') = -U(I_1, \dots, I_n)$). Hence, if we quotient by $T^n$, by sending the second $T^n$ to 1 over each $U(I_1, \dots, I_n)$, we end up with a $T^n$-bundle for which the transition functions are $\phi_k^2$, $k = 1, \dots, n$. This proves the differential-geometric part of the statement. To obtain the isometry we repeat this argument for the twistor space of $\tilde M_n(c) \times \tilde M_n(c')$, performing the complex-symplectic quotient along the fibers as in the proof of Theorem 6.4.

From now on, we shall consider $\tilde M_n(1, 1)$ with half (compare formula (2.4)) of the metric given by the above lemma. In other words, locally the metric is still the Gibbons–Manton metric. We can identify $\tilde M_n(1, 1)$ with the moduli space of pairs $\bigl((T_0, T_1, T_2, T_3), (T_0', T_1', T_2', T_3')\bigr)$ of solutions to Nahm's equations, defined respectively on $[-1, \infty]$ and on $[-\infty, 1]$, such that $T_i(+\infty) = T_i'(-\infty)$ for $i = 0, 1, 2, 3$, and the residues of $T_i$ at $-1$ and of $T_i'$ at $+1$, $i = 1, 2, 3$, define the standard $n$-dimensional irreducible representation of $su(2)$. The group of gauge transformations $G(1, 1)$ is now defined as pairs $(g, g')$ such that $g(t + 1), g'(-t + 1) \in G(c)$ for some $c$ and $s = \lim_{t \to +\infty} \dot g g^{-1} = \lim_{t \to -\infty} \dot g' g'^{-1}$. The tangent space consists of pairs $\bigl((t_0, t_1, t_2, t_3), (t_0', t_1', t_2', t_3')\bigr)$ defined on $[-1, \infty]$ and on $[-\infty, 1]$, respectively, with $t_i(+\infty) = t_i'(-\infty)$ and satisfying Eqs. (2.3). The metric on $\tilde M_n(1, 1)$ can be written as
\[ \frac12 \sum_{0}^{3} \|t_i(+\infty)\|^2 + \frac12 \int_{-1}^{+\infty} \sum_{0}^{3} \bigl(\|t_i(s)\|^2 - \|t_i(+\infty)\|^2\bigr)\, ds + \frac12 \sum_{0}^{3} \|t_i'(-\infty)\|^2 + \frac12 \int_{-\infty}^{1} \sum_{0}^{3} \bigl(\|t_i'(s)\|^2 - \|t_i'(-\infty)\|^2\bigr)\, ds. \]
We can rewrite this as
\[ \begin{aligned} &\frac12 \int_{0}^{+\infty} \sum_{0}^{3} \bigl(\|t_i(s)\|^2 - \|t_i(+\infty)\|^2\bigr)\, ds + \frac12 \int_{-\infty}^{0} \sum_{0}^{3} \bigl(\|t_i'(s)\|^2 - \|t_i(+\infty)\|^2\bigr)\, ds \\ &\quad + \frac12 \int_{-1}^{0} \sum_{0}^{3} \|t_i(s)\|^2\, ds + \frac12 \int_{0}^{1} \sum_{0}^{3} \|t_i'(s)\|^2\, ds. \end{aligned} \tag{7.1} \]
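The passage from the first expression for the metric to (7.1) only uses the splitting of the integrals at $s = 0$ and the matching condition $t_i(+\infty) = t_i'(-\infty)$. As a minimal numerical sketch of this bookkeeping (an illustration only, under the assumption of a hypothetical model profile $\sum_i \|t_i(s)\|^2 = a + b e^{-s}$ and $\sum_i \|t_i'(s)\|^2 = a + b e^{s}$, which is not taken from the paper), one can compare both forms:

```python
import math

def integrate(f, lo, hi, steps=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

def both_forms(a, b, cutoff=40.0):
    """Evaluate the metric expression before and after the rewriting (7.1).

    Hypothetical model: sum_i |t_i(s)|^2  = a + b*exp(-s) on [-1, +inf),
                        sum_i |t'_i(s)|^2 = a + b*exp(s)  on (-inf, 1],
    so both have the common limit a at infinity. The infinite ranges are
    truncated at +-cutoff, where the integrands are negligibly small.
    """
    f = lambda s: a + b * math.exp(-s)
    g = lambda s: a + b * math.exp(s)
    first = (0.5 * a + 0.5 * integrate(lambda s: f(s) - a, -1.0, cutoff)
             + 0.5 * a + 0.5 * integrate(lambda s: g(s) - a, -cutoff, 1.0))
    second = (0.5 * integrate(lambda s: f(s) - a, 0.0, cutoff)
              + 0.5 * integrate(lambda s: g(s) - a, -cutoff, 0.0)
              + 0.5 * integrate(f, -1.0, 0.0)
              + 0.5 * integrate(g, 0.0, 1.0))
    return first, second

if __name__ == "__main__":
    first, second = both_forms(a=2.0, b=0.75)
    print(first, second)
```

For this model both expressions equal $a + b\,e$ in closed form, and the first line of (7.1) contributes only $b$, the part that decays with the profile.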
Let us fix a complex structure, say $I$, and write as in Sect. 5, $\alpha$ for $T_0 + iT_1$, $\beta$ for $T_2 + iT_3$. We write an element of $\tilde M_n(1, 1)$ as a pair $\bigl((\alpha_-, \beta_-), (\alpha_+, \beta_+)\bigr)$. We shall write $\beta_i$ for the $(i, i)$th entry of $\beta_-(+\infty) = \beta_+(-\infty)$ and denote by $\tilde M_n^{reg}(1, 1)$ the subset of $\tilde M_n(1, 1)$ where all $\beta_i$ are distinct. Similarly, we write $M_n^{reg}$ for the subset of $(\alpha, \beta)$ in $(M_n, I)$ where the eigenvalues of $\beta$ are distinct. We shall prove:

Theorem 7.2. There exists a biholomorphism $\phi$ from $\tilde M_n^{reg}(1, 1)/S_n$ to $M_n^{reg}$ such that
\[ |\phi^* g - g'| = O(e^{-cR}), \tag{7.2} \]
where $g, g'$ denote the monopole and Gibbons–Manton metric respectively, $c = c(n)$ is a constant, and $R$ is the separation distance of particles in $C_n(\mathbb{R}^3)$, i.e.
\[ R = \min\{|x_i - x_j|;\ i \neq j\}. \tag{7.3} \]
The same estimate holds for the Riemannian curvature tensor.

Since such a biholomorphism will be defined for any complex structure and the union of $\tilde M_n^{reg}(1, 1)$ for different complex structures is all of $\tilde M_n(1, 1)$, we conclude that the monopole and the Gibbons–Manton metrics are exponentially close in the asymptotic region of the monopole moduli space. The remainder of the section is devoted to proving this theorem. We need the following lemma:

Lemma 7.3. Let $C > 0$. The space $\tilde M_n^{reg}(1)$ is biholomorphic to the quotient of the space of solutions $(\alpha, \beta)$ to Eq. (5.1) which have the correct boundary behaviour at $t = 0$ and are constant (hence diagonal) for $t \geq C$ by the group of complex gauge transformations $g : [0, +\infty) \to Gl(n, \mathbb{C})$ with $g(0) = 1$ and $g(t) = \exp(ht - h)$ for some diagonal $h$ for $t \geq C$.

Proof. Let $(\alpha, \beta)$ be an element of $\tilde M_n^{reg}(1)$ and let $\alpha_d = \alpha(+\infty)$, $\beta_d = \beta(+\infty)$. According to the proof of Proposition 5.3, there is a unique complex gauge transformation $g$ defined on $[C/2, +\infty)$ with $g(+\infty) = 1$ such that $(\alpha, \beta) = g(\alpha_d, \beta_d)$. Let $\hat g : [C/2, \infty) \to Gl(n, \mathbb{C})$ be a smooth path with the values and the first derivatives of $\hat g$ and $g$ coinciding at $t = C/2$ and with $\hat g(t) = 1$ for $t \geq C$. We obtain a solution $(\hat\alpha, \hat\beta)$ to the complex Nahm equation (5.1) by setting
\[ (\hat\alpha, \hat\beta)(t) = \begin{cases} (\alpha, \beta)(t) & \text{if } t < C \\ \hat g(t)(\alpha_d, \beta_d) & \text{if } t \geq C. \end{cases} \tag{7.4} \]
This is a solution of the type described in the statement of this lemma. The proof of 5.3 shows further that it is only $g(C/2)\exp\{(1 - C/2)\alpha_d\}$ (and a solution to (5.1) on $[0, C/2]$) that determines the element of $\tilde M_n^{reg}(1)$. Therefore we obtain a well defined holomorphic map from $\tilde M_n^{reg}(1)$ to the moduli space described in the statement. Let us define the inverse map. Let $(\hat\alpha, \hat\beta)$ be an element of the moduli space described in the
statement. As in [23] we can find a bounded complex gauge transformation $g_0$ such that $g_0(\hat\alpha, \hat\beta)$ is an element of $\tilde M_n^{reg}(1)$. We can assume that $g_0$ has a limit $h$ at $+\infty$ (this follows from the convexity property of $g_0$ [13], since we can assume that $g_0(t)$ is hermitian for all $t$). According to Proposition 5.3 or 6.2 the action of $T^n$ on $\tilde M_n(1)$ extends to a global action of $(\mathbb{C}^*)^n$ with respect to the complex structure $I$ (or any other). Let $(\alpha, \beta)$ be the element of $\tilde M_n^{reg}(1)$ obtained from $g_0(\hat\alpha, \hat\beta)$ by the action of $h^{-1} \in (\mathbb{C}^*)^n$. Then $(\alpha, \beta) = g(\hat\alpha, \hat\beta)$ and $g \in G^{\mathbb{C}}(1)$. This gives the inverse mapping.

We can now construct a biholomorphism between $\tilde M_n^{reg}(1, 1)/S_n$ and $M_n^{reg}$. From the above lemma, $\tilde M_n^{reg}(1, 1)$ is biholomorphic to the quotient of the space of pairs $\bigl((\alpha_-, \beta_-), (\alpha_+, \beta_+)\bigr)$ defined on $[-1, +\infty)$ and on $(-\infty, 1]$ respectively such that $(\alpha_-, \beta_-)(t + 1)$ and $(\alpha_+, \beta_+)(1 - t)$ are as in the above lemma, $(\alpha_-, \beta_-)(+\infty) = (\alpha_+, \beta_+)(-\infty)$, by the group of pairs $(g_-, g_+)$ with $g_-(-1) = g_+(1) = 1$ and such that there are diagonal $h, p$ with $g_-(t) = \exp(th - p)$ for $t > -r$, $g_+(t) = \exp(th - p)$ for $t < r$ ($r \in (0, 1)$ is fixed but arbitrary). We define a solution $(\hat\alpha, \hat\beta)$ to the complex Nahm equation (5.1) on $(-1, 1)$ by
\[ (\hat\alpha, \hat\beta)(t) = \begin{cases} (\alpha_-, \beta_-)(t) & \text{if } t < 0 \\ (\alpha_+, \beta_+)(t) & \text{if } t \geq 0. \end{cases} \tag{7.5} \]
The $G^{\mathbb{C}}$-orbit of this solution (see Sect. 2 for the definition of $G$) contains a unique element of $M_n$ [13, 20]. Furthermore, the action of a $(g_-, g_+)$ translates into the action of $g \in G^{\mathbb{C}}$, where $g(t) = g_-(t)$ for $t < 0$ and $g(t) = g_+(t)$ for $t \geq 0$. Therefore we have a well defined holomorphic map $\phi_r$ from $\tilde M_n^{reg}(1, 1)$ to $M_n$. If we now have an element $(\alpha, \beta)$ of $M_n^{reg}$, we can diagonalize $\beta$ on $[-r, r]$ and make $\alpha$ diagonal and constant on $[-r, r]$. Let $(\tilde\alpha, \tilde\beta)$ be the resulting solution to the complex Nahm equation. We obtain an element of $\tilde M_n^{reg}(1, 1)$ by setting
\[ (\alpha_-, \beta_-)(t) = \begin{cases} (\tilde\alpha, \tilde\beta)(t) & \text{for } t < 0 \\ (\tilde\alpha, \tilde\beta)(0) & \text{for } t \geq 0 \end{cases} \]
and similarly for $(\alpha_+, \beta_+)$.
This defines the inverse to $\phi_r$ up to the ordering of eigenvalues of $\beta$. In other words $\phi_r$ induces a biholomorphism between $\tilde M_n^{reg}(1, 1)/S_n$ and $M_n^{reg}$. Furthermore, for a fixed element $\bigl((\alpha_-, \beta_-), (\alpha_+, \beta_+)\bigr)$ of $\tilde M_n^{reg}(1, 1)$ and two parameters $r, r'$, the resulting $(\hat\alpha, \hat\beta)$ of (7.5) are $G^{\mathbb{C}}$-equivalent and therefore $\phi_r, \phi_{r'}$ induce the same biholomorphism $\phi$.

Let us now prove the estimate (7.2). Fortunately, much of the analysis has been already done in [3]. First of all, we recall ([23], Lemma 3.4) that solutions to Nahm's equations which have a regular triple as a limit at infinity approach this limit exponentially fast, of order $O(e^{-cR})$ (that is, $T_1, T_2, T_3$ do, and we can always make $T_0$ have such decay by using the gauge freedom). The proofs of Propositions 3.11–3.14 in [3] show that the same holds for tangent vectors $(t_0, t_1, t_2, t_3)$. Let us now see what happens to a tangent vector $v$ under the map $\phi$. The gauge transformations $(g, g')$ which make the element $\bigl((\alpha_-, \beta_-), (\alpha_+, \beta_+)\bigr)$ of $\tilde M_n^{reg}(1, 1)$ constant and equal to the common value at infinity on $[-1 + C/2, +\infty)$ and $(-\infty, 1 - C/2]$ are exponentially close to the identity. In the next stage of the construction of $\phi$ – formula (7.4) – we have smoothed out the solutions, which can again be done by gauge transformations exponentially close to 1.
Therefore the resulting tangent vector $\hat v$ is exponentially close to the original one in the metric (7.1). We have then restricted the solutions (formula (7.5)) to obtain a solution $(\hat\alpha, \hat\beta)$ to the complex Nahm equation on $[-1, 1]$. Let $p$ denote this operation of restriction. The first line of the formula (7.1) is exponentially small and therefore the norm of $\hat v$ in (7.1) and the norm of $dp(\hat v)$ in (2.4) are exponentially close. The solution $(\hat\alpha, \hat\beta)$ will not satisfy the real Nahm equation; however, we will have
\[ F(\hat\alpha, \hat\beta) := \frac{d}{dt}\bigl(\hat\alpha + \hat\alpha^*\bigr) + [\hat\alpha, \hat\alpha^*] + [\hat\beta, \hat\beta^*] = O(e^{-cR}). \]
Lemma 2.10 in [13] implies now that we can solve the real equation by a complex gauge transformation bounded as $O(e^{-cR})$. We can now show that the vector $d\phi(v)$ tangent to $M_n$ (which is obtained from $dp(\hat v)$) is exponentially close to $dp(\hat v)$ by following the analysis of Sect. 3 in [3] step by step, replacing the $O(1/R)$ estimates by $O(e^{-cR})$. This proves the estimate (7.2). For the curvature estimates we do the same using the analysis of Sect. 4 in [3]. This proves Theorem 7.2.

8. Twistor Description of Monopoles and the Gibbons–Manton Metric

We shall show in this section how the twistor description of monopole metrics determines the asymptotic metric. We recall [13] that the moduli space of $n$-monopoles is biholomorphic to the space of based rational maps $p(z)/q(z)$ on $\mathbb{CP}^1$ of degree $n$ (based means that $\deg p < \deg q$). On the set where the roots $\beta_1, \dots, \beta_n$ of $q(z)$ are distinct, these roots and the values $p_i = p(\beta_i)$ of $p$ form local coordinates and the complex-symplectic form can be written as [1]:
\[ \sum_{i=1}^{n} d\beta_i \wedge \frac{dp_i}{p_i}. \tag{8.1} \]
The metric is determined by the real sections $p(z, \zeta)/q(z, \zeta)$. Their description is provided in [19]. The denominator $q(z, \zeta)$ is given by a curve $S$ – the spectral curve of the monopole – in $T\mathbb{CP}^1$ [16]. This curve satisfies several conditions, one of which is the triviality of the line bundle $L^{-2}$ restricted to $S$, and Hurtubise [19] shows that the numerator $p(z, \zeta)$ is given by a nonzero section of this bundle. (The values $p_i(\zeta)$ are given by the values of this section at the intersection points $\beta_i(\zeta)$ of $S$ with $T_\zeta\mathbb{CP}^1$.)

What happens when the individual monopoles separate? First of all, the spectral curve approaches the union of spectral curves of individual monopoles exponentially fast [4]. These curves $S_i$ are of the form $\eta_i = z_i + 2x_i\zeta - \bar z_i\zeta^2$, $i = 1, \dots, n$, where $(x_i, \mathrm{Re}\, z_i, \mathrm{Im}\, z_i)$ are locations of 1-monopoles (particles). What happens to the section of $L^{-2}$? We make a heuristic assumption (which we know to be true from Sect. 6) that the section acquires zeros and poles at the intersection points of the $S_i$ (more precisely the only singularities of $p_i(\zeta)$ occur at the intersection points of $S_i$ with other $S_j$). As we shall see this is sufficient to determine the asymptotic metric.

First of all the real structure on the bundle $L^{-2}$ is $u \mapsto \bar u^{-1} e^{-2\bar\eta/\bar\zeta}$ and therefore if $p_i$ has a zero at one of the points of $S_i \cap S_j$, then it has a pole of the same order at the other, and vice versa. Furthermore, since the metric and hence the real sections are invariant under the action of the symmetric group, we must have
\[ p_i(\zeta) = A_i \prod_{j \neq i} \left(\frac{\zeta - a_{ij}}{\zeta - a_{ji}}\right)^{k} e^{-2(x_i - \bar z_i\zeta)}, \qquad i = 1, \dots, n, \]
where $a_{ij}, a_{ji}$ are the two points in $S_i \cap S_j$ given by (6.3) and $k$ is an integer. The reality condition implies that
\[ A_i \bar A_i = \prod_{j \neq i} a_{ji}^{k}\, \bar a_{ji}^{k}. \]
One can now calculate the asymptotic metric, using (8.1). The sign of $k$ will determine the signature, while $|k|$ is simply a constant multiple. The actual value of $k$ is determined by the topology of the asymptotic region of $M_n$, and comparing with Proposition 6.3 and the remarks at the beginning of Sect. 7 we conclude that $k = 1$ (in the coordinates of Proposition 6.2, $p_i = \prod_{j;\ j \neq i} (\beta_i - \beta_j)/u_i^2$).

We remark that the above analysis can be easily done for other compact Lie groups $G$. The twistor description of metrics on moduli spaces of $G$-monopoles with maximal symmetry breaking is known from the work of Murray [28] and Hurtubise and Murray [21, 22] and from this the asymptotic metric can be calculated. We shall do the exact analysis in the case of $G = SU(N)$ in a subsequent paper.

Acknowledgement. This work was carried out at the Max-Planck-Institut für Mathematik in Bonn. I wish to thank the Institute's directors and staff for their hospitality and for creating a stimulating research atmosphere. I also thank Michael Murray for comments.
Note added in proof

The metrics of section 4 (or, at least, their positive-definite counterparts) can also be obtained as finite-dimensional hyperkähler quotients. This follows from the construction of Kobak and Swann (Internat. J. Math. 7, 193–210 (1996)) of a nilpotent $G^{\mathbb{C}}$-orbit as a hyperkähler quotient of a vector space $V$ by a product $U$ of unitary groups. The hyperkähler quotient $P$ of $V$ by the semisimple part of $U$ is a positive-definite analogue of the spaces $M_G(c)$ (i.e. Theorem 4.3 holds for $P$). Presumably the spaces $M_G(c)$ can also be obtained this way by changing the signature of the metric on $V$.
References

1. Atiyah, M.F. and Hitchin, N.J.: The geometry and dynamics of magnetic monopoles. Princeton: Princeton University Press, 1988
2. Besse, A.: Einstein manifolds. Berlin–Heidelberg–New York: Springer, 1987
3. Bielawski, R.: Asymptotic behaviour of SU(2) monopole metrics. J. reine angew. Math. 468, 139–165 (1995)
4. Bielawski, R.: Monopoles, particles and rational functions. Ann. Glob. Anal. Geom. 14, 123–145 (1996)
5. Bielawski, R.: On the hyperkähler metrics associated to singularities of nilpotent varieties. Ann. Glob. Anal. Geom. 14, 177–191 (1996)
6. Bielawski, R.: Hyperkähler structures and group actions. J. London Math. Soc. 55, 400–414 (1997)
7. Bielawski, R.: Invariant hyperkähler metrics with a homogeneous complex structure. Math. Proc. Cam. Phil. Soc. 122, 473–482 (1997)
8. Biquard, O.: Sur les équations de Nahm et les orbites coadjointes des groupes de Lie semi-simples complexes. Math. Ann. 304, 253–276 (1996)
9. Biquard, O.: Twisteurs des orbites coadjointes. Ecole Polytechnique preprint (1997)
10. Connell, S.: The dynamics of the SU(3) (1, 1) magnetic monopoles. Ph.D. thesis, The Flinders University of South Australia (1991)
11. Dancer, A.S.: Nahm's equations and hyperkähler geometry. Commun. Math. Phys. 158, 545–568 (1993)
12. Dancer, A.S.: A family of hyperkähler manifolds. Quart. J. Math. Oxford 45, 463–478 (1994)
13. Donaldson, S.K.: Nahm's equations and the classification of monopoles. Commun. Math. Phys. 96, 387–407 (1984)
14. Gibbons, G.W. and Manton, N.S.: The moduli space metric for well-separated BPS monopoles. Phys. Lett. B 356, 32–38 (1995)
15. Gibbons, G.W. and Rychenkova, P.: HyperKähler quotient construction of BPS monopole moduli spaces. Commun. Math. Phys. 186, 581–599 (1997)
16. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
17. Hitchin, N.J.: Polygons and gravitons. Math. Proc. Camb. Phil. Soc. 83, 465–476 (1979)
18. Hitchin, N.J., Karlhede, A., Lindström, U. and Roček, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535–586 (1985)
19. Hurtubise, J.C.: Monopoles and rational maps: a note on a theorem of Donaldson. Commun. Math. Phys. 100, 191–196 (1985)
20. Hurtubise, J.C.: The classification of monopoles for the classical groups. Commun. Math. Phys. 120, 613–641 (1989)
21. Hurtubise, J.C. and Murray, M.K.: On the construction of monopoles for the classical groups. Commun. Math. Phys. 122, 35–89 (1989)
22. Hurtubise, J.C. and Murray, M.K.: Monopoles and their spectral data. Commun. Math. Phys. 133, 487–508 (1990)
23. Kronheimer, P.B.: A hyper-kählerian structure on coadjoint orbits of a semisimple complex group. J. London Math. Soc. 42, 193–208 (1990)
24. Kronheimer, P.B.: The construction of ALE spaces as hyper-Kähler quotients. J. Diff. Geom. 29, 665–683 (1989)
25. Lindström, U. and Roček, M.: Scalar tensor duality and N = 1, 2 nonlinear σ-models. Nucl. Phys. 222B, 285–308 (1983)
26. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. B 110, 54–56 (1982)
27. Manton, N.S.: Monopole interactions at long range. Phys. Lett. B 154, 397–400 (1985)
28. Murray, M.K.: Non-abelian magnetic monopoles. Commun. Math. Phys. 96, 539–565 (1984)
29. Murray, M.K.: A note on the (1, 1, . . . , 1) monopole metric. J. Geom. Phys. 23, 31–41 (1997)
30. Nahm, W.: The construction of all self-dual monopoles by the ADHM method. In: Monopoles in quantum field theory. Singapore: World Scientific, 1982
31. Nakajima, H.: Monopoles and Nahm's equations. In: Einstein metrics and Yang-Mills connections. New York: Marcel Dekker, 1993
32. Pedersen, H. and Poon, Y.S.: Hyper-Kähler metrics and a generalization of the Bogomolny equations. Comm. Math. Phys. 117, 569–580 (1988)
33. Stuart, D.: The geodesic approximation for the Yang-Mills-Higgs equations. Commun. Math. Phys. 166, 149–190 (1994)
34. Swann, A.: Hyperkähler and quaternionic Kähler geometry. Math. Ann. 289, 421–450 (1991)

Communicated by H. Nicolai
Commun. Math. Phys. 194, 323 – 341 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Dynamical Localization for Discrete and Continuous Random Schrödinger Operators

F. Germinet (1), S. De Bièvre (2)

(1) UFR de Mathématiques et LPTMC, Université Paris VII – Denis Diderot, 75251 Paris Cedex 05, France.
(2) UFR de Mathématiques et URA GAT, Université des Sciences et Technologies de Lille, 59655 Villeneuve d'Ascq Cedex, France.
Received: 23 July 1997 / Accepted: 22 October 1997
Abstract: We show for a large class of random Schrödinger operators $H_\omega$ on $\ell^2(\mathbb{Z}^\nu)$ and on $L^2(\mathbb{R}^\nu)$ that dynamical localization holds, i.e. that, with probability one, for a suitable energy interval $I$ and for $q$ a positive real,
\[ \sup_t r_q(t) \equiv \sup_t \langle P_I(H_\omega)\psi_t, |X|^q P_I(H_\omega)\psi_t \rangle < \infty. \]
Here $\psi$ is a function of sufficiently rapid decrease, $\psi_t = e^{-iH_\omega t}\psi$ and $P_I(H_\omega)$ is the spectral projector of $H_\omega$ corresponding to the interval $I$. The result is obtained through the control of the decay of the eigenfunctions of $H_\omega$ and covers, in the discrete case, the Anderson tight-binding model with Bernoulli potential (dimension $\nu = 1$) or singular potential ($\nu > 1$), and in the continuous case Anderson as well as random Landau Hamiltonians.
1. Introduction

We show for a large class of random Schrödinger operators $H_\omega$ on $\ell^2(\mathbb{Z}^\nu)$ and on $L^2(\mathbb{R}^\nu)$ that dynamical localization holds, i.e. that, with probability one, for a suitable energy interval $I$ and $q > 0$,
\[ \sup_t r_q(t) \equiv \sup_t \langle P_I(H_\omega)\psi_t, |X|^q P_I(H_\omega)\psi_t \rangle < \infty. \]
Here $\psi$ is a function of sufficiently rapid decrease, $\psi_t = e^{-iH_\omega t}\psi$, $P_I(H_\omega)$ is the spectral projector of $H_\omega$ corresponding to the interval $I$ and $X$ is the usual position operator. The result covers all random Schrödinger operators for which exponential localization has been proved, including operators with Bernoulli potentials in dimension 1 and random Landau Hamiltonians, for example.
The strategy of the proof is as follows. First recall that exponential localization, i.e. pure point spectrum and exponentially decaying eigenfunctions, is by now a well established property of random Schrödinger operators in many situations. On the other hand, it is also known that exponential localization does not systematically entail dynamical localization [11]. The authors of [11] point out that, to obtain dynamical localization, some control is needed on the location and the size of the boxes outside of which the eigenfunctions "effectively" decrease exponentially. This is precisely what is achieved for random Schrödinger operators in the present paper (Theorem 3.1 and Theorem 4.2). Our proof here uses the ideas of Von Dreifus and Klein [14], and in particular those of the proof of their Theorem 2.3. We proceed as follows: once exponential localization has been proved, and using the fact that the spectrum is now known to be discrete, one can exploit the result of the multi-scale analysis a second time to get better (and sufficient) control on the eigenfunction decay.

We first deal with the discrete case (Sects. 2 and 3). In Sect. 2 we start by proving (along the lines of [11]) a sufficient condition (see (2.1)) on the eigenfunctions of a Hamiltonian $H$ which implies dynamical localization. In Sect. 3 we give the proof of the announced result for the discrete Anderson model. The continuous case is dealt with in Sects. 4 and 5. Exponential localization for Schrödinger operators has recently been carried over to the continuum by Combes and Hislop [6] and by Klopp [21]. The case of random Landau Hamiltonians is dealt with by Combes, Hislop and Barbaroux [3, 7], by Wang [27] and by Dorlas, Macris and Pulé [13]. All those papers use an adaptation to the continuous case of the multi-scale analysis originally developed for discrete Schrödinger operators ([17, 18, 14] or see [5], [24]). This reduces the proof of exponential localization to the verification of two hypotheses: a Wegner estimate and an estimate allowing the "initialization" of the multi-scale analysis. Our central result here (Theorem 4.2) shows that those two hypotheses actually imply dynamical localization. We give some applications in Sect. 5.

To put our results in perspective, we recall, first, the work of Holden and Martinelli [23] who prove, roughly speaking, that $r_2(t) = o(t)$ for some particular continuous models. More recently Del Rio, Jitomirskaya, Last and Simon [11] used bounds of Aizenman [1] to give a simple proof (avoiding the multi-scale analysis) of dynamical localization for the discrete Anderson model with a potential with bounded density. But the bounds of [1] do not seem to carry over to the continuous case, nor to Bernoulli and other singular potentials in the discrete case. To deal with these cases, we were therefore obliged to return to the (rather painful) multi-scale analysis. A further application of our result to the random dimer model [16] is given in [10], and dynamical localization for the almost Mathieu model is proven in [19], using a related method.
2. Eigenfunction Decay and Dynamical Localization

In this section we give, for a class of self-adjoint operators $H$ with pure point spectrum, defined on either $\ell^2(\mathbb{Z}^\nu)$ or $L^2(\mathbb{R}^\nu)$, a sufficient condition on the eigenfunctions (see (2.1)) that guarantees dynamical localization. Our strategy for proving dynamical localization for random Schrödinger operators is then to prove that a property much stronger than this condition is indeed satisfied (Theorem 3.1, Theorem 4.2). Let $H_0$ be the following operator on $L^2(\mathbb{R}^\nu)$:
\[ H_0 = H_1 \oplus H_2, \]
where $H_1 = p_1^2 + p_2^2$, $p_1 = \partial_{x_1} + Bx_2/2$, $p_2 = \partial_{x_2} - Bx_1/2$, $B \geq 0$, and $H_2 = \sum_{i=3}^{\nu} p_i^2$, $p_i = \partial_{x_i}$. One can also write $H_0 = (P - A)^2$, where $A$ is the vector potential $B/2\,(x_2, -x_1)$, written in the symmetric gauge, associated to the constant magnetic field $\vec B = B e_{x_3}$.

Theorem 2.1. Let $H$ be a self-adjoint operator on $\mathcal{H} = \ell^2(\mathbb{Z}^\nu)$ or $L^2(\mathbb{R}^\nu)$ with pure point spectrum on some interval $I \subset \mathbb{R}$. Let $\varphi_n$ be its eigenfunctions with corresponding eigenvalues $E_n \in I$. In the case $\mathcal{H} = L^2(\mathbb{R}^\nu)$, suppose that $I$ is compact and that $H$ has the form $H_0 + V$, $H_0$ as described above, $V \in L^\infty(\mathbb{R}^\nu)$. Suppose moreover that $\exists \gamma > 0$, $\gamma' \in\, ]0, \gamma/2[$ and sites $(x_n)$ s.t. $\forall n$,
\[ |\varphi_n(x)| < C_\gamma\, e^{\gamma'|x_n|}\, e^{-\gamma|x - x_n|}. \tag{2.1} \]
Let $q > 0$ and $\psi \in \mathcal{H}$ decaying exponentially at a rate $\theta > 2\gamma'$. Then there exists a constant $C_\psi = C_\psi(I, \gamma, \gamma', \theta, q)$ such that:
\[ \forall t \geq 0, \quad \bigl\| |X|^{q/2} e^{-iHt} P_I(H)\psi \bigr\|^2 \leq C_\psi. \tag{2.2} \]
This simple result relies on ideas of Sect. 7 in [11]. Note that (2.1) says roughly that the eigenfunctions are localized inside boxes of size $|x_n|/2$ around "centers" $x_n$. This is stronger than exponential localization of $H$ on $I$, which only means that $\exists \gamma > 0$ such that $\forall n$, $\exists C_n > 0$,
\[ |\varphi_n(x)| \leq C_n e^{-\gamma|x|}, \tag{2.3} \]
but weaker than what the authors of [11] called SULE (Semi-Uniformly Localized Eigenfunctions). We choose to present the proof of this theorem in the continuous case, since the first part of the proof (Lemma 2.2) is a little bit more technical in this situation. In order to prove Theorem 2.1 we need control on the growth of the $|x_n|$ in $n$, which is given by the following preliminary lemma:

Lemma 2.2. Let $H$ be as in Theorem 2.1 and $\delta > 0$. Then one can order the $|x_n|$ in increasing order, and there exists a constant $C = (4\max(B, 1))^{-1}$ such that for $n$ large enough (depending on $\delta$): $|x_n| \ge C n^{1/(\nu+\delta)}$.

Proof of Lemma 2.2. Essentially we follow the ideas of the proof of Theorem 7.1 of [11]. We recall that the energies $E_n$ that we consider belong to $I$. Let $\delta > 0$ be given and let $\delta' > 0$ so that $\delta > \delta'(\nu - 1)$. Let $L > 0$ be given, define $J = [0, L^{\delta'}]$, and write $\chi_{2L}(x)$ for the characteristic function of the ball of radius $2L$ centered at 0. Suppose that $|x_n| < L$. Then, for such $n$ and for some constant $C_1$, one has:
\[
\begin{aligned}
\langle \varphi_n, (1 - \chi_{2L}(X))\chi_J(H_0)\chi_{2L}(X)\varphi_n \rangle
&\le \|(1 - \chi_{2L}(X))\varphi_n\|_{L^2}\, \|\chi_{2L}(X)\varphi_n\|_{L^2} \\
&\le C_\gamma\, e^{(\gamma'+\gamma)|x_n|}\, \bigl\|(1 - \chi_{2L}(X))\, e^{-\gamma|x|}\bigr\|_{L^2} \\
&\le C_1\, e^{-(\gamma-\gamma')L}.
\end{aligned}
\tag{2.4}
\]
Secondly, using $1 - \chi_J(y) \le y L^{-\delta'}$, $y \ge 0$, and $H\varphi_n = E_n\varphi_n$:
\[
\langle \varphi_n, (1 - \chi_J(H_0))\varphi_n \rangle
\le \bigl( \langle \varphi_n, H\varphi_n \rangle + \|V\|_\infty \bigr) L^{-\delta'}
\le C(I, \|V\|_\infty)\, L^{-\delta'}.
\tag{2.5}
\]
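So, using (2.4) and (2.5) together with two similar inequalities,
\[
\operatorname{tr}\bigl(\chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X)\bigr)
\ge \sum_{n \,\mid\, E_n \in I,\ |x_n| \le L} \langle \varphi_n, \chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X)\varphi_n \rangle
\tag{2.6}
\]
\[
\ge \#\{n \mid E_n \in I,\ |x_n| \le L\}\, \bigl(1 - 3C_1 e^{-(\gamma-\gamma')L} - C(I, \|V\|_\infty) L^{-\delta'}\bigr)
\ge \tfrac{1}{2}\, \#\{n \mid E_n \in I,\ |x_n| \le L\}
\quad \text{for } L \ge L_0 = L_0(\gamma, \gamma', I, \|V\|_\infty, \delta').
\tag{2.7}
\]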
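The next step is then to bound the trace class norm of the operator $Q = \chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X)$. Let's study the case $B \ne 0$. Since $\{u_1^2 + u_2^2 \le L\} \subset \{u_1^2 \le L,\ u_2^2 \le L\}$, and denoting by $\chi^{(d)}_{2L}(x)$ the characteristic function of the $d$-dimensional ball of radius $2L$ centered at 0, remark that
\[
\begin{aligned}
\operatorname{tr}(Q) &\le \operatorname{tr}\bigl(\chi_{2L}(X)\,(\chi_J(H_1) \otimes \chi_J(H_2))\,\chi_{2L}(X)\bigr) \\
&= \operatorname{tr}\Bigl( \bigl(\chi^{(2)}_{2L}(X)\chi_J(H_1)\chi^{(2)}_{2L}(X)\bigr) \otimes \bigl(\chi^{(\nu-2)}_{2L}(X)\chi_J(H_2)\chi^{(\nu-2)}_{2L}(X)\bigr) \Bigr)
\equiv \operatorname{tr}(Q_1 \otimes Q_2).
\end{aligned}
\tag{2.8}
\]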
The operator $Q_2$ is the product of two Hilbert–Schmidt operators with respective kernels $\chi^{(\nu-2)}_{2L}(x)\,(\mathcal{F}^{-1}g)(x - y)$ and $(\mathcal{F}^{-1}g)(x - y)\,\chi^{(\nu-2)}_{2L}(y)$ [25], where $\mathcal{F}^{-1}g$ denotes the inverse Fourier transform of $g(x) = \chi_J \circ s(x)$, with $s(x_1, \ldots, x_{\nu-2}) = \sum_{i=1}^{\nu-2} x_i^2$. So, denoting by $\|C\|_1$ the trace class norm of an operator $C$ defined on $\mathcal{H}$, one has:
\[
\|Q_2\|_1 \le \|\chi_{2L}(X)\chi_J(H_2)\|_{HS}\, \|\chi_J(H_2)\chi_{2L}(X)\|_{HS}
\le \|\chi^{(\nu-2)}_{2L}(x)\|^2_{L^2}\, \|\chi_J \circ s(x)\|^2_{L^2}
\le (4L)^{\nu-2} L^{\delta'(\nu-2)}.
\tag{2.9}
\]
We turn now to the magnetic part $Q_1$. It is well known that the spectrum of our $H_1$ consists of eigenvalues $(2n + 1)B$, $n = 0, 1, 2, \ldots$. The corresponding projectors are operators with kernel [22],
\[ P_n(x, x') = e^{i\frac{B}{2} x \wedge x'}\, p_n\bigl(\sqrt{B}(x - x')\bigr), \tag{2.10} \]
where
\[ p_n((x_1, x_2)) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{ikx_1}\, h_n(k + x_2/2)\, h_n(k - x_2/2)\, dk, \]
and $h_n(k)$ are the normalised Hermite functions. Note that we gave the expression in the symmetric gauge, and not in the Landau gauge as in [22]. Write now $P_J = \sum_{n \mid E_n \in J} P_n$, the projector corresponding to the eigenvalues belonging to $J$. Then $Q_1 = \bigl(\chi^{(2)}_{2L}(X) P_J\bigr)\bigl(P_J \chi^{(2)}_{2L}(X)\bigr)$, which are two Hilbert–Schmidt operators with the same norm (because of (2.10)). And one has, for each $n$,
\[
\begin{aligned}
\|\chi^{(2)}_{2L}(X) P_n\|^2_{HS}
&= \int_x \int_{x'} \chi^{(2)}_{2L}(x)\, \bigl| p_n\bigl(\sqrt{B}(x - x')\bigr) \bigr|^2 dx\, dx' \\
&\le (4BL)^2 \int_{x_2} \Bigl\| \mathcal{F}\Bigl( h_n\bigl(\cdot + \tfrac{x_2}{2}\bigr)\, h_n\bigl(\cdot - \tfrac{x_2}{2}\bigr) \Bigr)(x_1) \Bigr\|^2_{L^2} dx_2 \\
&= (4BL)^2 \int_{x_2, k} \bigl| h_n\bigl(k + \tfrac{x_2}{2}\bigr) \bigr|^2\, \bigl| h_n\bigl(k - \tfrac{x_2}{2}\bigr) \bigr|^2 dx_2\, dk \\
&= (4BL)^2\, \|h_n\|^4_{L^2}.
\end{aligned}
\]
So, since $\|h_n\|_{L^2} = 1$, and using $\#\{n \ge 0,\ (2n + 1)B \le L^{\delta'}\} \le L^{\delta'}/2B$ if $L^{\delta'} \ge B$, one has
\[ \|\chi^{(2)}_{2L}(X) P_J\|^2_{HS} \le 8B L^{2+\delta'}. \]
And then, using (2.8) and (2.9),
\[ \operatorname{tr}(Q) \le C L^{\nu+\delta}, \tag{2.11} \]
taking $\delta > \delta'(\nu - 1)$, and $C = 4\max(B, 1)$. Note here that if $B = 0$ then the free Hamiltonian $H_0$ has the form $\sum_{i=1}^{\nu} \partial_i^2$. Hence the analysis made previously for $Q_2$ is valid for such a $Q$ and (2.11) holds. In order to finish the argument, note that, together with (2.7) and (2.8), (2.11) tells us that, for $L \ge L_0$, $N(L) \equiv \#\{n \mid E_n \in I,\ |x_n| \le L\}$ is finite. Order then the eigenfunctions in such a way that $|x_n|$ increases. So, $N(|x_n|) = n$, and if $n \ge N_0 \equiv N(L_0)$, one has, with $L = |x_n|$:
\[ |x_n| \ge C n^{1/(\nu+\delta)}, \quad \text{for } n > N_0. \tag{2.12} \]
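The counting step behind (2.12) can be spelled out as follows. This is only a sketch under the assumptions already made in the proof ($L \ge L_0$, eigenfunctions ordered so that $|x_n|$ is non-decreasing); the constant $C'$ is a symbol introduced here for illustration, and its explicit value plays no role in what follows.

```latex
% (2.7) bounds N(L) by twice the trace of Q, and (2.11) bounds that trace:
N(L) \;\le\; 2\,\operatorname{tr}\bigl(\chi_{2L}(X)\chi_J(H_0)\chi_{2L}(X)\bigr)
     \;\le\; 2C\,L^{\nu+\delta}, \qquad L \ge L_0 .
% With the ordering |x_1| \le |x_2| \le \dots, at least n centers lie
% within distance |x_n| of the origin, so for |x_n| \ge L_0:
n \;\le\; N(|x_n|) \;\le\; 2C\,|x_n|^{\nu+\delta}
\quad\Longrightarrow\quad
|x_n| \;\ge\; C'\,n^{1/(\nu+\delta)},
\qquad C' = (2C)^{-1/(\nu+\delta)} .
```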
The main difference between the continuous and discrete cases comes from this lemma, in the sense that on $L^2(\mathbb{R}^\nu)$ one has to control the behaviour of the eigenfunctions in the momentum variables: that was achieved through the use of $\chi_J(H_0)$. In the discrete case, (2.11) is replaced by the trivial equality $\operatorname{tr}(\chi_{2L}) = (4L + 1)^\nu$. This also explains why no specific form for $H$ is needed on $\ell^2(\mathbb{Z}^\nu)$. It is easy, then, to rewrite the proof of Lemma 2.2 in this case (see also [11]).

Proof of Theorem 2.1. Let $\psi \in \mathcal{H}$ be such that, for some constant $C(\psi) > 0$ and $\theta > 2\gamma'$, $|\psi(x)| < C(\psi) e^{-\theta|x|}$. We have to bound $\|X^{q/2} P_I(H) e^{-iHt}\psi\|$, for $q > 0$, $t > 0$. We recall that $\theta > 2\gamma'$ and $\gamma > 2\gamma'$. Without loss of generality, let's suppose $\theta < \gamma$ and write $\gamma = \theta + \varepsilon$, $\varepsilon > 0$. Then
\[
\begin{aligned}
\|X^{q/2} P_I(H) e^{-iHt}\psi\|^2
&\le \sum_{n \mid E_n \in I} |\langle \varphi_n, \psi \rangle|\, \|X^q \varphi_n\| \\
&\le \sum_{n \mid E_n \in I} C(\psi) C_{\gamma,q}\, |x_n|^q \int e^{-\theta|x|}\, e^{2\gamma'|x_n|}\, e^{-\gamma|x - x_n|}\, dx \\
&\le C(\psi) C_{\gamma,q} \sum_{n \mid E_n \in I} |x_n|^q\, e^{-(\theta - 2\gamma')|x_n|} \int e^{-\varepsilon|x - x_n|}\, dx \\
&\le C_\psi(\gamma, \gamma', I, \theta, \varepsilon),
\end{aligned}
\]
where we used respectively for the second and last inequalities
\[ \|X^q \varphi_n\|^2 \le C^2_{\gamma,q}\, |x_n|^{2q}\, e^{2\gamma'|x_n|}, \]
and Lemma 2.2.
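Two steps of the chain of inequalities in the proof of Theorem 2.1 are used without comment; the following sketch spells them out. It assumes $\theta < \gamma$ with $\gamma = \theta + \varepsilon$ as in the proof, and writes $a = \theta - 2\gamma' > 0$ (a symbol introduced here for convenience only).

```latex
% Triangle inequality |x_n| \le |x| + |x - x_n|, with \gamma = \theta + \varepsilon:
e^{-\theta|x|}\,e^{-\gamma|x-x_n|}
  = e^{-\theta(|x|+|x-x_n|)}\,e^{-\varepsilon|x-x_n|}
  \le e^{-\theta|x_n|}\,e^{-\varepsilon|x-x_n|} .
% Summability: t \mapsto t^{q}e^{-at} is decreasing for t \ge q/a, so the lower
% bound |x_n| \ge C n^{1/(\nu+\delta)} of Lemma 2.2 gives, for n large enough,
|x_n|^{q}\,e^{-a|x_n|}
  \le \bigl(C\,n^{1/(\nu+\delta)}\bigr)^{q}
      \exp\!\bigl(-a\,C\,n^{1/(\nu+\delta)}\bigr),
\qquad a = \theta - 2\gamma' > 0 ,
% and the right-hand side is summable in n. The remaining integral
% \int e^{-\varepsilon|x-x_n|}\,dx does not depend on n.
```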
3. The Discrete Anderson Model

We consider the self-adjoint operator
\[ H_\omega = -\Delta + V_\omega, \]
where $\Delta$ is the discrete Laplacian on $\ell^2(\mathbb{Z}^\nu)$ and $V_\omega$ ($\omega \in \Omega$) is a random potential, the $(V_\omega(x))_{x \in \mathbb{Z}^\nu}$ being i.i.d. random variables. Their common probability measure $\mu$ is assumed to be non degenerate, i.e. not concentrated on a single point. The conditions that we impose on $\mu$ are:
\[
\begin{cases}
\text{if } \nu = 1: & \exists\, \eta > 0,\ \int_{\mathbb{R}} |v|^\eta\, d\mu(v) < \infty; \\
\text{if } \nu \ge 2: & \mu \text{ is } \alpha\text{-Hölder continuous.}
\end{cases}
\tag{3.1}
\]
Let us recall how the disorder $\delta(\mu)$ of an $\alpha$-Hölder continuous measure $\mu$ is defined:
\[ \delta(\mu)^{-1} = \inf_{\tau > 0}\ \sup_{|b-a| < \tau} |b - a|^{-\alpha} \mu([a, b]). \]
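Theorem 3.1. Let $H_\omega$ be the Anderson Hamiltonian, and $\mu$ the common probability measure of the $V_\omega(x)$, $x \in \mathbb{Z}^\nu$, not concentrated in a single point and satisfying condition (3.1). Let $I$ be a compact interval and $\varepsilon > 0$.
• If $\nu = 1$, define $\Gamma \equiv \gamma(I) \equiv \inf\{\gamma(E),\ E \in I\}$, where $\gamma(E)$ is the Lyapunov exponent at energy $E$. Suppose $\Gamma = \gamma(I) > 0$.
• If $\nu > 1$, pick $\Gamma > 0$. Suppose the disorder $\delta(\mu)$ is taken sufficiently high.
Then, $\mathbb{P}$ almost surely, $H_\omega$ has pure point spectrum on $I$, and there exist centers $x_{n,\omega}$ associated to the eigenfunctions $\varphi_{n,\omega}$ with energy $E_{n,\omega} \in I$ such that: $\forall\, \gamma_0 \in\, ]0, \Gamma[$, there exists a constant $C(\omega, \varepsilon, \gamma_0)$ such that
\[ \forall x \in \mathbb{Z}^\nu, \quad |\varphi_{n,\omega}(x)| \le C(\omega, \varepsilon, \gamma_0)\, e^{\gamma_0 |x_{n,\omega}|^\varepsilon}\, e^{-\gamma_0 |x - x_{n,\omega}|}. \tag{3.2} \]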
Evidently, one can also write a “low energy” version of this theorem. As an immediate consequence of Theorem 2.1 and Theorem 3.1 we have:

Corollary 3.2. Let $H_\omega$ be as in Theorem 3.1, and $P_I(H_\omega)$ the spectral projection on the compact interval $I$. Then for $q > 0$ and $\psi \in \ell^2(\mathbb{Z}^\nu)$ decaying exponentially with rate $\theta > 0$, $\|\, |X|^{q/2} P_I(H_\omega) e^{-iH_\omega t}\psi \|^2$ is bounded uniformly in $t$ almost surely.

Comparing (3.2) to (2.3) and to (2.1), one notices that now the size of the boxes in which the eigenfunctions “live” can grow at most as $|x_{n,\omega}|^\varepsilon$. One expects this can be improved to a polynomial bound. Supposing $\mu$ has a bounded density with compact support, the polynomial bound follows from the proof of Theorem 7.6 in [11].

The proof of Theorem 3.1 is based on the ideas of [14], and in particular on the proof of Theorem 2.3 in [14]. The strategy is the following: since the hypotheses of Theorem 3.1 imply exponential localization, we know that there exist “centers” $x_{n,\omega}$, where the eigenfunction $\varphi_{n,\omega}$ is maximal, and one can then exploit the result of the multi-scale analysis a second time to improve the control of the decay of the eigenfunctions. As already pointed out, this proof has the advantage of yielding the result for singular potentials and in particular for Bernoulli potentials in dimension 1. In addition, the proof extends to continuous random Schrödinger operators, as shown in Sects. 4 and 5.

To make this paper self-contained, we start by recalling the elements from [14] that we need. First of all, $\Lambda_L(x)$ denotes the cube of side $L/2$ centered in $x$ and $\partial\Lambda_L(x)$ its boundary. $H_{\Lambda_L(x),\omega}$ is the restriction of the operator $H_\omega$ to the cube $\Lambda_L(x)$ with Dirichlet boundary conditions, and $G_{\Lambda_L(x)}(E, \cdot, \cdot)$ is its resolvent. Given $L_0 > 1$, $\alpha \in\, ]1, 2[$, we define $L_k$ ($k \in \mathbb{N}$) recursively via $L_{k+1} = L_k^\alpha$. Given in addition an integer $b \ge 2$, we define
\[ A_{k+1}(x_o) = \Lambda_{2bL_{k+1}}(x_o) \setminus \Lambda_{2L_k}(x_o). \]
Note that we do not indicate the dependence of $L_k$ and $A_k$ on $L_0$, $\alpha$ and $b$ since these quantities will at any rate be fixed later on. We further need the following definition:

Definition 3.3. Let $\gamma > 0$ and an energy $E \in \mathbb{R}$ be given. A cube $\Lambda_L(x)$ is said to be $(\gamma, E)$-regular if $E \notin \sigma(H_{\Lambda_L(x)})$ and if for all $y \in \partial\Lambda_L(x)$,
\[ |G_{\Lambda_L(x)}(E, x, y)| \le e^{-\gamma L/2}. \]
Otherwise $\Lambda_L(x)$ will be called $(\gamma, E)$-singular.

Note that this definition is $\omega$-dependent, but we follow the usual practice by not indicating this. For $x_o \in \mathbb{Z}^\nu$, $E_k(x_o)$ is defined to be the following set:
\[ \{\omega \mid \exists\, E \in I,\ \exists\, x \in A_{k+1}(x_o),\ \Lambda_{L_k}(x_o) \text{ and } \Lambda_{L_k}(x) \text{ are } (\gamma, E)\text{-singular}\}. \]
Finally, we recall a well known identity. Let $x \in \mathbb{Z}^\nu$, $E \notin \sigma(H_{\Lambda_L(x)})$, and $\varphi \in \ell^2(\mathbb{Z}^\nu)$ so that $H\varphi = E\varphi$ be given, then:
\[ \varphi(x) = \sum_{(y, y') \in \partial\Lambda_L(x)} G_{\Lambda_L(x)}(E, x, y)\, \varphi(y'). \tag{3.3} \]
Here (with some abuse of notation) $(y, y') \in \partial\Lambda_L$ means $y$ and $y'$ are nearest neighbours with $y \in \Lambda_L(x)$ and $y' \notin \Lambda_L(x)$. $\partial\Lambda^+_{L_k}(x)$ will denote the points $y'$ just outside $\Lambda_{L_k}(x)$. In order to prove Theorem 3.1, we start with the following three lemmas:

Lemma 3.4. Let $p > \nu$ and $\alpha \in\, ]1, 2 - 2\nu/(p + 2\nu)[$ and $b > 1$ be given. Assume the hypotheses of Theorem 3.1 are satisfied. Then for any $\gamma \in\, ]0, \Gamma[$ there exists $L_0 = L_0(p, \nu, \gamma, b, \delta(\mu)) > 1$ such that: $\forall\, k \in \mathbb{N}$, $\forall\, x \in \mathbb{Z}^\nu$,
\[ \mathbb{P}(E_k(x)) \le \frac{(2bL_{k+1} + 1)^\nu}{(L_k)^p}. \]

Lemma 3.5. Let $\gamma > 0$ be fixed. There exists a constant $L^*(\nu, \gamma)$ so that, if $H$ is a Schrödinger operator and $\varphi \in \ell^2(\mathbb{Z}^\nu)$ an eigenvector of $H$ with eigenvalue $E$, and if $x^* \in \mathbb{Z}^\nu$ satisfies $|\varphi(x^*)| = \sup\{|\varphi(x)|,\ x \in \mathbb{Z}^\nu\}$, then $\Lambda_L(x^*)$ is $(\gamma, E)$-singular, for all $L \ge L^*(\nu, \gamma)$.

Lemma 3.6. If $\nu$, $s_\nu$, and $\gamma$ are some positive constants, then $\forall\, \eta \in\, ]0, 1[$ there exists $L(\eta, \gamma, \nu)$ such that, if $L \ge L(\eta, \gamma, \nu)$: $\forall\, x, x_o \in \mathbb{Z}^\nu$,
\[ \bigl( s_\nu L^{\nu-1} e^{-\gamma L/2} \bigr)^{\frac{|x - x_o|}{L/2 + 1}} \le e^{-\gamma\eta|x - x_o|}. \tag{3.4} \]
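Taking logarithms in (3.4), the distance $|x - x_o|$ cancels, so the threshold $L(\eta, \gamma, \nu)$ can be located numerically. The sketch below is only an illustration of that remark: the function names, the search bound `L_max` and the sample parameter values are assumptions made here, and $s_\nu$ is passed in as an arbitrary positive constant, as in the lemma.

```python
import math


def lemma_3_6_holds(L, eta, gamma, nu, s_nu):
    """Check inequality (3.4) at scale L (independent of |x - x_o| > 0)."""
    # log of the base s_nu * L^(nu-1) * exp(-gamma * L / 2)
    log_base = math.log(s_nu) + (nu - 1) * math.log(L) - gamma * L / 2
    # After taking logs and dividing by |x - x_o|, (3.4) reads
    #   log_base / (L/2 + 1) <= -gamma * eta.
    return log_base / (L / 2 + 1) <= -gamma * eta


def threshold_scale(eta, gamma, nu, s_nu, L_max=10**5):
    """Smallest integer L0 such that (3.4) holds for every integer L in [L0, L_max].

    Returns None if (3.4) already fails at L_max (hypothetical search bound).
    """
    L0 = None
    for L in range(L_max, 0, -1):  # scan downward from the search bound
        if lemma_3_6_holds(L, eta, gamma, nu, s_nu):
            L0 = L
        else:
            break
    return L0


if __name__ == "__main__":
    # Illustrative values only: nu = 2, s_nu = 4, gamma = 1, eta = 1/2.
    print(threshold_scale(eta=0.5, gamma=1.0, nu=2, s_nu=4.0))
```

Running the search for several values of `gamma` gives a quick way to see how the threshold grows as $\gamma$ decreases.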
The first lemma follows immediately from the Appendix and from Theorem 2.2 of Von Dreifus and Klein [14] and constitutes the core of the proof of exponential localization in [14]. We will prove an analog of it for continuous random Schrödinger operators in the Appendix. The second lemma says roughly that if $\varphi$ is an eigenvector of $H$ with eigenvalue $E$, then $E$ must be close to the spectrum of $H_{\Lambda_L(x)}$ provided $L$ is big enough and $\Lambda_L(x)$ is centered on a maximum of $|\varphi|$. It is quite simple, but central for what follows. Obviously the third lemma doesn't need a proof. We have stated it separately in order to make clear later on that $L(\eta, \gamma, \nu)$ only depends on the model parameters and not on the particular eigenfunction we consider. Note that $L(\eta, \gamma, \nu)$ behaves like a positive power of $1/\gamma$.

Proof of Lemma 3.5. Let $\varphi$ be as in the lemma: $\varphi \in \ell^2(\mathbb{Z}^\nu)$, so $x^*$ exists. Suppose that $\Lambda_L(x^*)$ is $(\gamma, E)$-regular, and apply the identity (3.3) at the point $x^*$. Then for some $y' \in \partial\Lambda^+_L(x^*)$:
\[ |\varphi(x^*)| \le s_\nu L^{\nu-1} e^{-\gamma L/2}\, |\varphi(y')| \le s_\nu L^{\nu-1} e^{-\gamma L/2}\, |\varphi(x^*)|, \tag{3.5} \]
where $s_\nu$ is a constant depending only on the dimension. Now let $L^*(\gamma, \nu)$ be a positive real such that $s_\nu L^{\nu-1} e^{-\gamma L/2} < 1$ for $L \ge L^*$. Then, for such $L$, (3.5) is impossible, and $\Lambda_L(x^*)$ cannot be $(\gamma, E)$-regular any more, that is: $\Lambda_L(x^*)$ is $(\gamma, E)$-singular for $L \ge L^*(\gamma, \nu)$.

Proof of Theorem 3.1. Under the hypotheses of the theorem, $H_\omega$ has $\mathbb{P}$-a.s. exponential localization on $I$ (see [4 and 3]). This means that there exists $\Omega_0 \subset \Omega$, $\mathbb{P}(\Omega_0) = 1$, so that for all $\omega \in \Omega_0$, $\sigma_c(H_\omega) \cap I = \emptyset$ and for every eigenvalue $E_{n,\omega} \in I$, the corresponding eigenfunction $\varphi_{n,\omega}$ is in $\ell^2$ and satisfies (2.3). The aim is therefore to control the constant $C_{n,\omega}$ of (2.3) and more precisely to show that $x_{n,\omega}$ can be chosen so that this $C_{n,\omega}$ grows slower than an exponential in $|x_{n,\omega}|^\varepsilon$. In order to prove Theorem 3.1, we wish to use Eq. (3.3) repeatedly, and on a scale $L_k$ for suitably large $k$, to estimate the value of $\varphi_{n,\omega}(x)$ when $x$ belongs to $A_{k+1}(x_{n,\omega})$ for suitably chosen $x_{n,\omega}$. To do this, one has to work “outward” from $x \in A_{k+1}(x_{n,\omega})$ to the boundary of $A_{k+1}(x_{n,\omega})$, making sure that the boxes of size $L_k$ to which one applies (3.3) are regular. After these preliminaries, let's start the proof properly speaking, which will consist of three steps. Firstly, let $I$, $\Gamma > 0$, $\gamma_0$ and $\varepsilon$ be as in the theorem. Pick $\gamma \in\, ]\gamma_0, \Gamma[$.

First step. Let $p > \nu$, $\alpha \in\, ]1, 2 - 2\nu/(p + 2\nu)[$ and $b > 1$ be given. With the $L_k$, $k \ge 0$, defined in Lemma 3.4, consider
\[ F_k = \bigcup_{|x_o| \le (L_{k+1})^{1/\varepsilon}} E_k(x_o). \]
Lemma 3.4 then implies that for some constant $C(\varepsilon, \nu, b)$,
\[ \mathbb{P}(F_k) \le C(\varepsilon, \nu, b)\, L_k^{-p + \nu\alpha(1 + 1/\varepsilon)}. \]
Hence, since $p$ can be chosen larger than $2\nu(1 + 1/\varepsilon)$, one has $\sum_{k=0}^{\infty} \mathbb{P}(F_k) < \infty$. The Borel–Cantelli lemma then implies that:
\[ \mathbb{P}\Bigl( \lim_{m \to \infty} \bigcup_{k \ge m} F_k \Bigr) = 0, \]
so that the set $\Omega_1 = \{\omega \in \Omega \mid \exists\, \tilde{k}_1 = \tilde{k}_1(\omega, \varepsilon, p, \gamma) \text{ such that } \forall\, k \ge \tilde{k}_1,\ \omega \notin F_k\}$ has full measure. This ends the probabilistic part of the proof. Note that the choice of $\varepsilon$ puts a lower bound on $p$. This in turn forces the disorder to be high via Lemma 3.4.

Second step. Now pick an $\omega$ in $(\Omega_0 \cap \Omega_1)$, which will be kept fixed throughout the rest of the proof. Let $E_{n,\omega} \in I$, $\varphi_{n,\omega}$ be its eigenfunction, and let $x_{n,\omega}$ be a point where $|\varphi_{n,\omega}(x)|$ is maximal. Note that such a point exists since $\omega \in \Omega_0$ and therefore $\varphi_{n,\omega} \in \ell^2(\mathbb{Z}^\nu)$. Let
\[ k_1 = k_1(\omega, \varepsilon, p, \gamma, x_{n,\omega}) = \max(\tilde{k}_1, k_0(\varepsilon, x_{n,\omega})), \tag{3.6} \]
where for all $y \in \mathbb{Z}^\nu$, the integer $k_0(\varepsilon, y)$ is defined as follows:
\[ k_0(\varepsilon, y) = \min\{k \ge 0 \text{ such that } |y|^\varepsilon < L_{k+1}\}. \tag{3.7} \]
Hence $\forall\, k \ge k_1$, $\omega \notin E_k(x_{n,\omega})$. Indeed, if $k \ge k_1$, then $\omega \notin F_k$ and $|x_{n,\omega}|^\varepsilon < L_{k+1}$. This implies that $\omega \notin E_k(x_{n,\omega})$, and consequently that $\forall\, k \ge k_1$, $\forall\, E \in I$ and $\forall\, y \in A_{k+1}(x_{n,\omega})$ either $\Lambda_{L_k}(x_{n,\omega})$ or $\Lambda_{L_k}(y)$ is $(\gamma, E)$-regular. We can then apply Lemma 3.5 to conclude that there exists an integer $k_2 = \max(\tilde{k}_2, k_0(\varepsilon, x_{n,\omega}))$, where $\tilde{k}_2$ depends on the same parameters as $\tilde{k}_1$ and not on $n$, so that $\forall\, k \ge k_2$, $\forall\, E_{n,\omega} \in I$, $\forall\, y \in A_{k+1}(x_{n,\omega})$, $\Lambda_{L_k}(y)$ is $(\gamma, E_{n,\omega})$-regular.

Last step. We now finish the argument along the lines of the proof of Theorem 2.3 in [14]; $\omega$ is still fixed in $(\Omega_0 \cap \Omega_1)$. For $k \ge k_2$ and for $x \in A_{k+1}(x_{n,\omega})$, $\Lambda_{L_k}(x)$ being $(\gamma, E_{n,\omega})$-regular, we can apply relation (3.3):
\[ |\varphi_{n,\omega}(x)| \le s_\nu L_k^{\nu-1} e^{-\gamma L_k/2}\, |\varphi_{n,\omega}(x')|, \]
where $x' \in \partial\Lambda^+_{L_k}(x)$ is chosen so that $|\varphi_{n,\omega}(x')| = \sup\{|\varphi_{n,\omega}(y)|,\ y \in \partial\Lambda^+_{L_k}(x)\}$. As long as $x'$ is still in $A_{k+1}(x_{n,\omega})$ we can use (3.3) again, so that we can repeat this step at least $d(x, \partial A_{k+1}(x_{n,\omega}))/(L_k/2 + 1)$ times. Hence for $k \ge k_2$ and $x \in A_{k+1}(x_{n,\omega})$,
\[ |\varphi_{n,\omega}(x)| \le \bigl( s_\nu L_k^{\nu-1} e^{-\gamma L_k/2} \bigr)^{\frac{d(x, \partial A_{k+1}(x_{n,\omega}))}{L_k/2 + 1}}. \]
Comparing this to (3.4), we see that in order to get a useful estimate we need a lower bound on $d(x, \partial A_{k+1}(x_{n,\omega}))$. For that purpose we cover $\mathbb{Z}^\nu$ with new annular regions $\tilde{A}_{k+1}(x_{n,\omega})$ defined as follows. Pick $\rho \in\, ]0, 1[$ so that $\rho\gamma > \gamma_0$ and choose the integer $b$ introduced at the beginning of the proof so that $b > (1 + \rho)/(1 - \rho)$. Then set
\[ \tilde{A}_{k+1}(x_{n,\omega}) = \Lambda_{[2bL_{k+1}/(1+\rho)]}(x_{n,\omega}) \setminus \Lambda_{2L_k/(1-\rho)}(x_{n,\omega}) \subset A_{k+1}(x_{n,\omega}). \]
Note that if $x \in \tilde{A}_{k+1}(x_{n,\omega})$ then $d(x, \partial A_{k+1}(x_{n,\omega})) \ge \rho|x - x_{n,\omega}|$. Hence, repeating (3.3) at least $\rho|x - x_{n,\omega}|/(L_k/2 + 1)$ times, one has that for all $k \ge k_2$ and for all $x \in \tilde{A}_{k+1}(x_{n,\omega})$,
\[ |\varphi_{n,\omega}(x)| \le \bigl( s_\nu L_k^{\nu-1} e^{-\gamma L_k/2} \bigr)^{\frac{\rho|x - x_{n,\omega}|}{L_k/2 + 1}}, \]
or, applying Lemma 3.6, and choosing $\eta \in\, ]0, 1[$ such that $\gamma_0 = \rho\eta\gamma$, we conclude that there exists an integer $k_3 = \max(\tilde{k}_3, k_0(\varepsilon, x_{n,\omega}))$, where $\tilde{k}_3$ again does not depend on $n$, such that:
\[ \forall\, k \ge k_3, \text{ and } \forall\, x \in \tilde{A}_{k+1}(x_{n,\omega}), \quad |\varphi_{n,\omega}(x)| \le e^{-\gamma_0|x - x_{n,\omega}|}. \tag{3.8} \]
But now note that for all $x \in \mathbb{Z}^\nu$, and provided $|x - x_{n,\omega}| > L_0/(1 - \rho)$, there exists a $k$ so that $x \in \tilde{A}_{k+1}(x_{n,\omega})$. This means that there exists an integer $k = \max(\tilde{k}_4, k_0(\varepsilon, x_{n,\omega}))$, $\tilde{k}_4$ once again not depending on $n$, such that (3.8) holds for all $x \in \mathbb{Z}^\nu$ satisfying $|x - x_{n,\omega}| > L_k$. Hence, using that $|\varphi_{n,\omega}(x)| \le 1$ for all $x \in \mathbb{Z}^\nu$:
\[ \forall x \in \mathbb{Z}^\nu, \quad |\varphi_{n,\omega}(x)| \le C(\omega, \varepsilon, \gamma_0)\, e^{\gamma_0 L_k}\, e^{-\gamma_0|x - x_{n,\omega}|}. \tag{3.9} \]
So far, we have only proved that the eigenfunctions decay exponentially, but we are now in a position to control the $n$-dependence of the constant $e^{\gamma_0 L_k}$ as follows. Note that the only $n$-dependence of $k$ comes from $k_0(\varepsilon, x_{n,\omega})$. Suppose $\sup\{|x_{n,\omega}|,\ E_{n,\omega} \in I\} < \infty$, then $k$ can be chosen $n$-independently, so that we actually obtain a uniform localization (called ULE in [11]), and a fortiori condition (2.1) of Theorem 2.1. But Lemma 2.2 contradicts this first possibility. So, in fact, $\sup\{|x_{n,\omega}|,\ E_{n,\omega} \in I\} = \infty$, and, for $n$ sufficiently large, one has $k = k_0(\varepsilon, x_{n,\omega})$, i.e., with (3.7), $L_k \le |x_{n,\omega}|^\varepsilon$. Inserting this in (3.9) yields the announced result.
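The bookkeeping of scales in the proof above (the sequence $L_{k+1} = L_k^\alpha$, the index $k_0(\varepsilon, y)$ of (3.7), and the summability used for Borel–Cantelli) can be tried out numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample values of $L_0$, $\alpha$, $p$, $\nu$, $\varepsilon$ are assumptions chosen for illustration, the constant $C(\varepsilon, \nu, b)$ is dropped, and logarithms are used because $L_k$ grows doubly exponentially.

```python
import math


def log_scales(L0, alpha, k_max):
    """Return [log L_0, ..., log L_{k_max}] for the recursion L_{k+1} = L_k ** alpha."""
    return [alpha**k * math.log(L0) for k in range(k_max + 1)]


def k0(eps, y_norm, L0, alpha):
    """k_0(eps, y) of (3.7): the least k >= 0 with |y| ** eps < L_{k+1}."""
    if y_norm <= 0:
        return 0  # |y|^eps = 0 < L_1 since L_0 > 1
    target = eps * math.log(y_norm)  # log(|y|^eps)
    k = 0
    while alpha ** (k + 1) * math.log(L0) <= target:
        k += 1
    return k


def borel_cantelli_terms(L0, alpha, p, nu, eps, k_max):
    """k-dependence L_k ** (-p + nu*alpha*(1 + 1/eps)) of the bound on P(F_k)."""
    exponent = -p + nu * alpha * (1 + 1 / eps)
    return [math.exp(exponent * log_L) for log_L in log_scales(L0, alpha, k_max)]


if __name__ == "__main__":
    # Illustrative choice: nu = 1, eps = 1/2, so 2*nu*(1 + 1/eps) = 6 < p = 7,
    # and alpha = 1.5 lies in ]1, 2 - 2*nu/(p + 2*nu)[.
    terms = borel_cantelli_terms(L0=2.0, alpha=1.5, p=7.0, nu=1, eps=0.5, k_max=8)
    print(sum(terms))
    print(k0(eps=0.5, y_norm=100.0, L0=2.0, alpha=1.5))
```

With $k = k_0(\varepsilon, y) \ge 1$, minimality in (3.7) gives $L_k \le |y|^\varepsilon$, which is the inequality inserted into (3.9) at the end of the proof.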
4. The Continuous Case

In this section, our goal is to obtain the analog of Theorem 3.1 and Corollary 3.2 for continuous random Schrödinger operators. The result is stated in Theorem 4.2 below. In Sect. 5, we will present some models where the hypotheses of this theorem are satisfied. We consider random Schrödinger operators on $L^2(\mathbb{R}^\nu)$ of the following type ($\nu \ge 1$):
\[ H_\omega = H_0 + \sum_{i \in \mathbb{Z}^\nu} \lambda_i(\omega)\, u(x - i). \tag{4.1} \]
i) Here $H_0 = (i\nabla - A)^2 + V_{per}$, where $A$ is a vector potential of a constant magnetic field $\vec{B} = \operatorname{rot}(A)$, and $V_{per}$ a periodic potential.
ii) The variables $\lambda_i(\omega)$, $i \in \mathbb{Z}^\nu$, are independent and identically distributed, with common distribution $\mu$.
iii) The function $u(x)$ belongs to $C_0^2(\mathbb{R}^\nu)$, with $\operatorname{supp} u \subset [-R, R]^\nu$.
To state the hypotheses, we need to recall some notations and simple facts. We introduce $|x| = \max\{|x_i|,\ i = 1, \ldots, \nu\}$, and denote by $\Lambda_L(x)$ the cube $\Lambda_L(x) = \{y \in \mathbb{R}^\nu \mid |y - x| < L/2\}$. Moreover, $\delta > 0$ being fixed (independently of $L$), $\tilde{\Lambda}_L(x)$ is the subset
\[ \tilde{\Lambda}_L(x) = \{y \in \Lambda_L(x) \text{ such that } L/2 - \delta < |x - y| < L/2\}; \]
$\tilde{\chi}_{L,x}$ will denote its characteristic function. We denote furthermore by $\chi_{L,x}$ a function in $C^2(\mathbb{R}^\nu)$ with support in $\Lambda_L(x)$ and satisfying $0 \le \chi_{L,x} \le 1$, $\chi_{L,x} \equiv 1$ on the subcube $\Lambda_L(x) \setminus \tilde{\Lambda}_L(x)$, so that $\nabla\chi_{L,x}$ lives in $\tilde{\Lambda}_L(x)$. Note that we will often drop the $\omega$- or $x$-dependence of the objects introduced in order to alleviate the notations.

We furthermore define local Hamiltonians $H_{\Lambda_L(x),\omega}$ as follows. When $A = 0$ and $V_{per} = 0$, $H_{\Lambda_L(x),\omega}$ is the restriction of the operator $H_\omega$ to the cube $\Lambda_L(x)$ with Dirichlet boundary conditions. When $A \ne 0$ or $V_{per} \ne 0$,
\[ H_{\Lambda_L(x),\omega} = H_0 + \sum_{i \in \Lambda_L \cap \mathbb{Z}^\nu} \lambda_i(\omega)\, u(x - i). \]
We denote by $W_{L,x}$ the first order differential operator $W_{L,x} \equiv [H_0, \chi_{L,x}]$ and write $R_{\Lambda_L(x)}(E)$ for the resolvent of $H_{\Lambda_L(x),\omega}$. We then always have the geometric resolvent equation: if $\Lambda_l \subset \Lambda_L \subset \mathbb{R}^\nu$ and if $E \notin \sigma(H_{\Lambda_L})$ and $E \notin \sigma(H_{\Lambda_l})$ then
\[ \chi_l R_{\Lambda_L}(E) = R_{\Lambda_l}(E)\chi_l + R_{\Lambda_l}(E) W_l R_{\Lambda_L}(E). \tag{4.2} \]

Definition 4.1. Let $\gamma > 0$ and an energy $E \in \mathbb{R}$ be given. A cube $\Lambda_L(x)$ is said to be $(\gamma, E)$-regular if $E \notin \sigma(H_{\Lambda_L(x)})$ and if:
\[ \|\chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x}\| \le e^{-\gamma L/2}. \]
Otherwise $\Lambda_L(x)$ will be called $(\gamma, E)$-singular.

We now state the result. Given a compact interval $I$, and reals $\gamma_0 > 0$, $p > \nu$, $L_0, \tilde{L} > 1$, we introduce

Hypothesis (H1). $(\gamma_0, I, p, L_0)$:
\[ \mathbb{P}(\forall\, E \in I,\ \Lambda_{L_0} \text{ is } (\gamma_0, E)\text{-regular}) > 1 - \frac{1}{L_0^p}. \]

Hypothesis (H2). $(I, \tilde{L})$ (Wegner): There exists $C_W$ so that for all $\tilde{I} \subset I_1 \equiv \{E \mid d(E, I) < 1\}$ and $L > \tilde{L}$,
\[ \mathbb{E}(\operatorname{tr}(E_{\Lambda_L(0)}(\tilde{I}))) < C_W\, |\tilde{I}|\, |\Lambda_L|. \]

Note that, using Chebyshev's inequality and [H2]$(I, \tilde{L})$, one also has, for $L > \tilde{L}$, $0 < \eta < 1$ and $E \in I$,
\[ \mathbb{P}(d(E, \sigma(H_{\Lambda_L(0),\omega})) < \eta) < C_W\, |\Lambda_L|\, \eta. \tag{4.3} \]
This is the so-called “Wegner Estimate”. We will need both [H2] and (4.3) for the proof of Proposition 4.3.
Theorem 4.2. Let $\varepsilon > 0$. Suppose that for some interval $I$ and reals $\gamma_0 > 0$, $p > 2\nu(1 + 1/\varepsilon)$, the hypotheses [H1]$(\gamma_0, I, p, L_0)$ and [H2]$(I, \tilde{L})$ hold for $L_0 > \tilde{L}$ large enough. Then with probability one there exist points $x_{n,\omega}$, associated to the eigenfunctions $\varphi_{n,\omega}$ with energies $E_{n,\omega} \in I$, so that: $\forall\, \gamma \in\, ]0, \gamma_0[$ and for some constant $C_\omega = C(\omega, \varepsilon, \gamma, \gamma_0, I, L_0)$, one has, for all $x \in \mathbb{R}^\nu$,
\[ |\varphi_{n,\omega}(x)| \le C_\omega\, e^{\gamma|x_{n,\omega}|^\varepsilon}\, e^{-\gamma|x - x_{n,\omega}|}. \tag{4.4} \]
Moreover, if $q > 0$ and $\psi \in L^2(\mathbb{R}^\nu)$ decays exponentially with mass $\theta > 0$, then, with probability 1, there exists a constant $C_{\psi,\omega}$ such that
\[ \bigl\|\, |X|^{q/2} P_I(H_\omega) e^{-iH_\omega t}\psi \bigr\|^2 \le C_{\psi,\omega}. \]

Analysing the proofs of the previous two sections, one sees that the only missing ingredient for the proof of Theorem 4.2 is an analog of Lemma 3.4, stated as Proposition 4.3 below. Indeed, the arguments of Sect. 3 are readily transcribed to the continuous case, provided one makes the following adaptation. First, one replaces Eq. (3.3) by the following equality: if $\varphi \in L^2(\mathbb{R}^\nu)$ satisfies $H\varphi = E\varphi$ for some $E$, then for $\Lambda_L(x) \subset \mathbb{R}^\nu$ so that $E \notin \sigma(H_{\Lambda_L(x)})$:
\[ \chi_{4\delta,x}\varphi = \chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x}\varphi. \]
Secondly, one defines $x^*(\varphi)$, the analog of $x^*$ in Lemma 3.5, as follows. If $\varphi$ belongs to $L^2(\mathbb{R}^\nu)$ and $H\varphi = E\varphi$, then
\[ \sup_{y \in 4\delta\mathbb{Z}^\nu} \int_{\Lambda_{4\delta}(y)} |\varphi(x)|^2\, dx = \int_{\Lambda_{4\delta}(x^*(\varphi))} |\varphi(x)|^2\, dx. \]
So, writing $x_{n,\omega} \equiv x^*(\varphi_{n,\omega})$, and redefining the annular region $A_{k+1}(x_{n,\omega})$ as $\Lambda_{2bL_{k+1}}(x_{n,\omega}) \setminus \Lambda_{2L_k + 2R}(x_{n,\omega})$, one obtains the bound written in (4.4), but for $\|\chi_{4\delta,x}\varphi_{n,\omega}\|$, $x \in 4\delta\mathbb{Z}^\nu$, rather than for $|\varphi_{n,\omega}(x)|$. Then to get the pointwise estimate (4.4) apply Theorem 2.4 of [9], or decompose $\|e^{-\gamma|x_{n,\omega}|^\varepsilon} e^{\gamma'|x - x_{n,\omega}|}\varphi_{n,\omega}\|^2_{L^2}$, with $\gamma'$ sufficiently close to $\gamma$, on boxes of size $4\delta$ and centered on $4\delta\mathbb{Z}^\nu$, and apply Theorem IX.26 of [25].

It therefore remains to state and prove the analog of Lemma 3.4. Following [14] let's denote by $R(\gamma, L, x, y)$ the set
\[ R(\gamma, L, x, y) \equiv \{\omega \in \Omega \mid \forall\, E \in I,\ \Lambda_L(x) \text{ or } \Lambda_L(y) \text{ is } (\gamma, E)\text{-regular}\}. \]
As in the discrete case, for all $\alpha \in\, ]1, 2[$ and $L_0 > 1$ we define the sequence $(L_k)_{k \in \mathbb{N}}$ by $L_{k+1} = L_k^\alpha$.

Proposition 4.3. For any $\gamma \in\, ]0, \gamma_0[$, $p > \nu$ and $\alpha \in\, ]1, 2 - 2\nu/(p + 2\nu)[$ there exists $L^* = L^*(\gamma, I, \alpha)$ such that if [H1]$(\gamma_0, I, p, L_0)$ and [H2]$(I, \tilde{L})$ hold for $L_0 > L^*$, $L_0 > \tilde{L}$, then for all $k \ge 0$:
\[ |x - y| > L_k + 2R \implies \mathbb{P}(R(\gamma, L_k, x, y)) > 1 - \frac{1}{L_k^{2p}}. \]

The proof of Proposition 4.3 follows upon adapting the arguments of the proof of Theorem 2.2 in [14] to the continuum. Various authors [6, 21] have written up versions of the multi-scale analysis for continuous Schrödinger operators, but, to our knowledge, the version we need is not available in the literature. Nevertheless, nobody seems to doubt that any such argument can be carried over from the discrete to the continuous case. Since multi-scale arguments are, in addition, painful, we have chosen to put the proof of Proposition 4.3 in the appendix, while making an effort to give a clear, complete and relatively simple argument.
5. Applications

We briefly indicate two applications of Theorem 4.2: to an Anderson model on $\mathbb{R}^\nu$ [6] and to random Schrödinger operators with a magnetic field.

The Anderson tight-binding model. Here the free Hamiltonian $H_0$ of Eq. (4.1) is $-\Delta$. We suppose, following [6], that $u(x) > \chi_{3/2}(x)$, where $\chi_{3/2}$ is the characteristic function of the cube $\Lambda_{3/2}(0)$. Putting together Proposition 4.5 and Theorem 5.1 of [6], one has immediately from Theorem 4.2:

Theorem 5.1. Suppose that $\mu$ has an $L^\infty$ density $g(\lambda)$ with support $[0, \lambda_{max}]$ and disorder $\delta_0 = \|g\|_\infty^{-1} > 0$. Then, for energy $E_A > 0$ fixed and disorder $\delta_0$ high enough, or for disorder $\delta_0 > 0$ fixed and energy $E_A$ low enough, the conclusions of Theorem 4.2 hold on $[0, E_A]$.

We notice that the result holds as well for the “breather” model, where the potential is given by the closely related formula $V_\omega(x) = \sum_{i \in \mathbb{Z}^\nu} u(\lambda_i(\omega)(x - i))$ [8].

The Landau Hamiltonian. Here the Hamiltonian has the general form described in Eq. (4.1), i.e. $A \ne 0$ and $V_{per} = 0$. Although the result is still valid in arbitrary dimension under further assumptions (see [2]) we prefer to state the application in the well-known two dimensional version [3, 7, 27]. In that case, the vector potential $A$ is given by
\[ A = \frac{B}{2}(x_2, -x_1), \]
with $B > 0$. Recall that the spectrum of the free Landau Hamiltonian $H_0$ consists of a sequence of eigenvalues $E_n(B) = (2n + 1)B$, $n \in \mathbb{N}$. We suppose that $u > 0$, $\operatorname{supp} u \subset B(0, 1/\sqrt{2})$, and that there exist $C_0$ and $r_0 > 0$ such that $u|_{B(0,r_0)} > C_0$. We suppose that the common measure $\mu$ (of the $\lambda_i(\omega)$) has a bounded density function $g \in C_0^2(\mathbb{R})$, $g$ being even and positive for almost every $\lambda \in \operatorname{supp} g$. Under those assumptions it is well-known that $M_0 = \sup\{|V_\omega(x)|,\ x, \omega\} < +\infty$. Let's define the following bands: $I_0(B) = [-M_0, B - \varepsilon_0(B)]$, $I_{n+1}(B) = [E_n(B) + \varepsilon_n(B), E_{n+1}(B) - \varepsilon_n(B)]$ with some $\varepsilon_n(B) > 0$. It follows from [7] and Theorem 4.2:

Theorem 5.2. Let $V$ be as described above. Then for $B$ high enough, there exist some $\varepsilon_n(B) = O(B^{-1})$ such that the conclusions of Theorem 4.2 hold on each interval $I_n(B)$, $n \in \mathbb{N}$.

A similar result holds for the model treated by Wang [27], where $u$ can be negative and its support is included in $B(0, r)$, $0 < r < 1$.
A. Appendix

We turn to the proof of Proposition 4.3. Let us point out that our definition of a $(\gamma, E)$-regular box (Definition 4.1) differs slightly from the one in [6]. This difference will allow us to free ourselves from most of the difficulties due to the use of an auxiliary lattice in [6] and [21]. We need the following two concepts:

Definition A.1. Let $r > 0$. Two boxes $\Lambda_1$ and $\Lambda_2$ will be called $r$ non-overlapping iff $d(\Lambda_1, \Lambda_2) > 2r$.

Note that, if $\Lambda_1$ and $\Lambda_2$ are $R$ non-overlapping, then, since $\operatorname{supp} u \subset [-R, R]^\nu$, two events depending respectively on the $\lambda_i$ with $i$ in $\Lambda_1$ and in $\Lambda_2$ are necessarily independent.

Definition A.2. Let $\beta \in\, ]0, 1[$ be given. A box $\Lambda_L$ will be called non resonant at energy $E$ (we'll write $E$-NR) if
\[ \|R_{\Lambda_L}(E)\| \le 2e^{L^\beta}. \]
This means, in other words, that $d(E, \sigma(H_{\Lambda_L})) > (1/2)e^{-L^\beta}$. Remark that the commutator $W_L$ does not appear in Definition A.2 as it did in Definition 4.1. In fact one can replace $W_L$ with the characteristic function $\tilde{\chi}_{L,x}$ defined above (see [6] for the Anderson case and [7] Lemma 5.1 and lines (5.27 - 5.29) if $A \ne 0$) as follows. There exists a constant $C(\delta, I)$ with:
\[ \|\chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x}\| \le C(\delta, I)\, \|\chi_{4\delta,x} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\|. \tag{A.1} \]
Remark then that this last bound tells us that
\[ \Lambda_L \text{ is } E\text{-NR} \implies \|\chi_{4\delta} R_{\Lambda_L}(E) W_L\| \le 2C(\delta, I)\, e^{L^\beta}. \tag{A.2} \]
An essential ingredient of the proof of Proposition 4.3 is the following deterministic lemma.

Lemma A.3. Let $L = l^\alpha$ with $\alpha \in\, ]1, 2[$ and $x \in \mathbb{R}^\nu$, $l > 12R$. Denote by $s_\nu$ the number of faces of a cube in dimension $\nu$. Assume that for some
\[ \gamma > \bigl( 27 l^\beta + 4(\nu - 1)\ln(l/\delta) + 4\ln(2s_\nu) \bigr)/l, \]
with $\delta$ and $\beta$ defined previously, and for some energy $E$,
i) $\Lambda_L(x)$ is $E$-NR;
ii) Each box of size $4j(l + R)$, $j = 1, 2, 3$, centered in $x + l\mathbb{Z}^\nu$ and contained in $\Lambda_L(x)$ is $E$-NR;
iii) Among all the $(\gamma, E)$-singular boxes of size $l$ contained in $\Lambda_L(x)$, there are no more than three that are two by two $R$ non-overlapping.
Then $\Lambda_L(x)$ is $(\gamma', E)$-regular with
\[ \gamma' = \gamma\Bigl(1 - \frac{27}{l^{\alpha-1}}\Bigr) - \frac{2}{l^{\alpha(1-\beta)}} - \frac{\ln\bigl(2 s_\nu C (l/\delta)^{\nu-1}\bigr)}{l}, \tag{A.3} \]
with $C = C(\delta, I)$ defined in (A.1).
In [6], Combes and Hislop have proved a simpler version of this result. In fact, they have adapted to the continuous case a simplified version of [14] which is contained in chapter IX of [5] (see also [15]). But, as in the discrete case, this simplified version does not seem to suffice to obtain the results of this paper, since we need to obtain regular boxes at any size with good probability, uniformly in a compact interval of energy, and no longer at some fixed energy $E$. So we turn to [14] and adapt it to the continuous case.

Proof of Lemma A.3. The aim is to bound $\|\chi_{4\delta,x} R_{\Lambda_L(x)}(E) W_{L,x}\|$. Using first inequality (A.1), we are reduced to controlling $\|\chi_{4\delta,x} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\|$. This is achieved in (A.12) below. We recall that $\delta > 0$ has been chosen small, so, without loss of generality one can suppose $l > 3\delta$. In order to achieve our goal, we will recursively construct inside $\Lambda_L(x)$ a chain of $n$ boxes $\Lambda_l(v_k)$, $k = 0, \ldots, n - 1$, being most of the time $(\gamma, E)$-regular, and starting at $v_0 \equiv x$. At each step of this process, we will use the geometric resolvent equation as follows.

Let $l' > 3\delta$ and consider any box $\Lambda_{l'}(z) \subset \Lambda_L(x)$ with $d(\Lambda_{l'}(z), \tilde{\Lambda}_L(x)) > 0$. For $E \notin \sigma(H_{\Lambda_{l'}(z)}) \cup \sigma(H_{\Lambda_L(x)})$, the resolvent identity (4.2) gives:
\[ \chi_{4\delta,z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x} = \chi_{4\delta,z} R_{\Lambda_{l'}(z)}(E) W_{l',z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}. \]
The support of $W_{l',z}$ can be covered by a family of boxes $\Lambda_{4\delta}(v) \subset \Lambda_L(x)$, indexed by points $v$ that satisfy $|v - z| = l'/2$ and so that the sum over all the corresponding characteristic functions $\chi_{4\delta,v}$ is equal to 1 on $\operatorname{supp} W_{l',z}$. (Note that $s_\nu(1 + l'/3\delta)^{\nu-1} < s_\nu(l'/\delta)^{\nu-1}$ such boxes suffice.) Clearly, there exists one of those $v$ for which
\[ \|\chi_{4\delta,z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\| \le s_\nu (l'/\delta)^{\nu-1}\, \|\chi_{4\delta,z} R_{\Lambda_{l'}(z)}(E) W_{l',z}\|\, \|\chi_{4\delta,v} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\|. \tag{A.4} \]
Suppose now in addition that $\Lambda_{l'}(z)$ is $(\gamma, E)$-regular. Then we immediately have:
\[ \|\chi_{4\delta,z} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\| \le s_\nu (l'/\delta)^{\nu-1} e^{-\gamma l'/2}\, \|\chi_{4\delta,v} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\|. \tag{A.5} \]
Apply now this argument to $\Lambda_l(x)$, and set $v_0 = x$, $v_1 = v$. Repeat the process as long as $\Lambda_l(v_k)$ is $(\gamma, E)$-regular and stays away from the boundary of $\Lambda_L(x)$. Clearly, there exists some $k^* \ge 0$ so that for all $0 \le k < k^*$, $\Lambda_l(v_k)$ is $(\gamma, E)$-regular and $\Lambda_l(v_k) \subset \Lambda_L(x)$, $d(\Lambda_l(v_k), \tilde{\Lambda}_L(x)) > 0$, whereas one of these conditions fails for $\Lambda_l(v_{k^*})$. As a result, if $k^* > 0$, we have
\[ \|\chi_{4\delta,x} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\| \le \bigl(s_\nu (l/\delta)^{\nu-1}\bigr)^{k^*} e^{-\gamma k^* l/2}\, \|\chi_{4\delta,v_{k^*}} R_{\Lambda_L(x)}(E) \tilde{\chi}_{L,x}\|. \tag{A.6} \]
If $k^* = 0$, this equation holds trivially. The important point here is that we gained a factor $e^{-\gamma k^* l/2}$: if there were no $(\gamma, E)$-singular boxes $\Lambda_l$ inside $\Lambda_L(x)$, this would end the proof. Indeed, in that case, the process could only end when $\Lambda_l(v_{k^*})$ gets too close to the boundary of $\Lambda_L(x)$, implying $k^* \ge (L/l)$, so that (A.6) immediately yields the result upon using hypothesis (i).

Of course, there may be $(\gamma, E)$-singular boxes in $\Lambda_L(x)$ and we now use hypothesis (iii) of the lemma to control the case in which the above process stops because $\Lambda_l(v_{k^*})$ is $(\gamma, E)$-singular and $v_{k^*}$ is at a distance greater than $12(l + R)$ from the boundary of $\Lambda_L(x)$. Using hypothesis (iii) and drawing a few pictures one easily convinces oneself that one can pack all the singular boxes of size $l$ in $t \le 3$ slightly bigger and disjoint boxes $\Lambda_{l_i} \subset \Lambda_L(x)$, centered in $x + l\mathbb{Z}^\nu$, and so that each box $\Lambda_l(z)$, where $z$ belongs to the edge of one of those $\Lambda_{l_i}$, is $(\gamma, E)$-regular. More precisely, the $l_i$ are taking on
Pt one of the values 4j(l + R), 1 ≤ j ≤ 3, they satisfy i=1 li ≤ 12(l + R) ≡ l0 , and the two following facts are simultaneously true: d(z, ∂3L (x)) ≥ l/2 r [ (A.7) (1) z ∈ 3 (x)\ 3 =⇒ (3l (z) is (γ, E)-regular) . L
li
i=1
(2) If z belongs to the edge of one of the 3li , then z 6∈
Sr i=1
3li .
(A.8)
So, if $\Lambda_l(v_{k^*})$ is $(\gamma, E)$-singular, there exists $i \in \{1, .., t\}$ so that $\Lambda_l(v_{k^*}) \subset \Lambda_{l_i}$. Define a new family of points $v$ on the boundary of $\Lambda_{l_i}$, such that the boxes $\Lambda_{4\delta}(v)$ cover $\mathrm{supp}\, W_{l_i}$. Then use the equivalent of (A.4) with $\Lambda_{l'}(z) = \Lambda_{l_i}$ and $z = v_{k^*}$. This produces some $v_{k^*+1}$ that belongs to the edge of $\Lambda_{l_i}$, and consequently (A.7)-(A.8) imply that $\Lambda_l(v_{k^*+1})$ is $(\gamma, E)$-regular, provided $d(v_{k^*+1}, \partial\Lambda_L(x)) > l/2$. By checking how $v_{k^*+1}$ is positioned with respect to $v_{k^*}$, one sees that the latter condition is satisfied because $v_{k^*}$ is at least at a distance $12(l + R)$ from $\partial\Lambda_L(x)$. Use now Hypothesis (ii) and apply once again (A.4) to $\Lambda_l(v_{k^*+1})$, to obtain that for some $v_{k^*+2}$ on the edge of $\Lambda_l(v)$, with $|v_{k^*} - v_{k^*+2}| \le l_i$:
\[ \|\chi_{4\delta,v_{k^*}} R_{\Lambda_L(x)}(E)\tilde\chi_{L,x}\| \le 2 s_\nu^2 \left(\frac{l'\, l}{\delta^2}\right)^{\nu-1} e^{-\gamma l/2 + l_i^{\beta}}\, \|\chi_{4\delta,v_{k^*+2}} R_{\Lambda_L(x)}(E)\tilde\chi_{L,x}\|. \tag{A.9} \]
Using the preliminary condition on $\gamma$, this leads to
\[ \|\chi_{4\delta,v_{k^*}} R_{\Lambda_L(x)}(E)\tilde\chi_{L,x}\| \le \|\chi_{4\delta,v_{k^*+2}} R_{\Lambda_L(x)}(E)\tilde\chi_{L,x}\|. \tag{A.10} \]
This is the way in which we get past a singular box $\Lambda_l(v_{k^*})$ far from the edge of $\Lambda_L(x)$. We have now completely described the recursive construction of the $v_k$, and it is clear that the process grinds to a standstill only when, for some $n$, $\Lambda_l(v_n)$ is too close to $\partial\Lambda_L(x)$. From (A.10) one sees that, when meeting a singular box, we do not gain a factor $\exp(-\gamma l/2)$, so that we have to assure ourselves this does not happen too often before the process ends. We therefore need to count how many of the boxes $\Lambda_l(v_k)$, $0 \le k \le n$, are regular. Since $|v_{k+1} - v_k| = l/2$ in that case and $|v_{k+2} - v_k| \le l_i$ if not, it is not hard to see that the process cannot stop before $n = n^* = n_1^* + 2t$, where
\[ n_1^* = \left[ \frac{L/2 - \sum_{i=1}^{t} l_i}{l/2} \right], \]
or
\[ [L/l] - 27 \le n_1^* \le L/l. \tag{A.11} \]
Hence, using (i) of the lemma, one has
\[ \|\chi_{4\delta,x} R_{\Lambda_L(x)}(E)\tilde\chi_{L,x}\| \le \left( s_\nu (l/\delta)^{\nu-1} e^{-\gamma l/2} \right)^{n_1^*} 2 e^{L^{\beta}}. \tag{A.12} \]
Putting together relations (A.1) and (A.12) and using (A.11) as well as the definition of $\gamma'$ stated in the lemma leads to the desired result.

Proof of Proposition 4.3. Take $\alpha \in\, ]1, 2 - 2\nu/(p + 2\nu)[$ and $\gamma \in\, ]0, \gamma_0[$. Let $L \equiv L_{k+1} = L_k^{\alpha}$, and use Eq. (A.3) to produce a sequence of exponents $\gamma_k \in\, ]0, \gamma_0]$. It will be enough to show that
a) $\forall k \ge 0$, $\gamma \le \gamma_k \le \gamma_0$;
b) $\forall k \ge 0$: $|x - y| > L_{k+1} + 2R \implies \mathbb{P}(R(\gamma_k, L_{k+1}, x, y)) > 1 - 1/L_{k+1}^{2p}$.
To prove (a), choose $L_0 > 0$: the sequence $(\gamma_k)_{k \ge 0}$ produced by repeatedly using (A.3) decreases, so for all $k \ge 0$, $\gamma_{k+1} \le \gamma_k \le \gamma_0$. Then, using Eq. (A.3), it is clear there exists $L^* = L^*(\gamma, \gamma_0, \beta, \alpha, \nu)$ so that, if $L_0 > L^*$, the sequence $(\gamma_k)_{k \ge 0}$ satisfies
\[ 0 < \sum_{k=0}^{\infty} (\gamma_k - \gamma_{k+1}) \le 15\gamma_0 \sum_{k=0}^{\infty} L_k^{1-\alpha} + \sum_{k=0}^{\infty} L_k^{-\frac{1}{2}\min(1, \alpha(1-\beta))} \le \gamma_0 - \gamma. \]
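Hence (a) follows, and we turn to (b), which clearly follows from:
\[ |x - y| > L_{k+1} + 2R \implies \mathbb{P}\left( \begin{array}{l} \forall E \in I, \text{ the hypotheses of Lemma A.3 with } \gamma = \gamma_k \\ \text{and } L = L_{k+1} \text{ are satisfied for either point } x \text{ or } y \end{array} \right) > 1 - 1/L_{k+1}^{2p}. \tag{A.13} \]
Firstly, since for all $k \ge 0$, $\gamma_k \ge \gamma$, provided $L_0$ is large enough, one has: $\forall k \ge 0$,
\[ \gamma_k > \frac{1}{L_k}\left( 27 L_k^{\beta} + 4(\nu - 1)\ln(L_k/\delta) + 4\ln(2 s_\nu) \right). \]
Let's now define $I_l = \{E \in \mathbb{R},\ d(E, I) \le e^{-l^{\beta}}/2\}$ and $\sigma'(H_{\Lambda_l}) = \sigma(H_{\Lambda_l}) \cap I_l$. It is easy to estimate the probability that the distance between the respective spectra of two $R$ non-overlapping boxes $\Lambda_{l_1}$ and $\Lambda_{l_2}$, $l_1, l_2 > \tilde L$, is greater than $\eta$, $0 < \eta < 1$. Using first (4.3) and then Hypothesis [H2], one has, with some abuse of notation:
\[ \begin{aligned} \mathbb{P}\left(d(\sigma'(H_{\Lambda_{l_1}}), \sigma'(H_{\Lambda_{l_2}})) < \eta\right) &\le \int \sum_{E \in \sigma'(H_{\Lambda_{l_2}})} \mathbb{P}_{\Lambda_{l_1}}\left( d(\sigma'(H_{\Lambda_{l_1}}), E) < \eta \right) d\omega_2 \\ &\le C_W |\Lambda_{l_1}|\, \eta\, \mathbb{E}\left(\mathrm{tr}(E_{\Lambda_{l_2}}(I_{l_2}))\right) \\ &\le C_W^2 (|I| + 1) |\Lambda_{l_1}|\, |\Lambda_{l_2}|\, \eta = C_{W,I} |\Lambda_{l_1}|\, |\Lambda_{l_2}|\, \eta. \end{aligned} \tag{A.14} \]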
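Hence, for all $k$, and writing for convenience $L \equiv L_{k+1}$ and $l = L_k$: if $|x - y| > L + 2R$, it follows from (A.14) that
\[ \mathbb{P}\left( \begin{array}{l} \exists\, u \in (x + l\mathbb{Z}^\nu) \cap \Lambda_L(x),\ v \in (y + l\mathbb{Z}^\nu) \cap \Lambda_L(y) \text{ and } l_1, l_2 = L \text{ or } 4j(l + R)l,\ j = 1, .., 3 \\ \text{with } \Lambda_{l_1}(u) \subset \Lambda_L(x) \text{ and } \Lambda_{l_2}(v) \subset \Lambda_L(y), \text{ with } d(\sigma'(H_{\Lambda_{l_1}}), \sigma'(H_{\Lambda_{l_2}})) < \eta \end{array} \right) \le C_{W,I} (L/l)^{2\nu} |\Lambda_L|^2 \eta. \tag{A.15} \]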
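But consider this elementary exercise in logic: let $A_i$ and $B_j$, $i, j = 1, .., J$, be $2J$ intervals; then
\[ \begin{aligned} \forall\, i, j = 1, ..., J,\ d(A_i, B_j) > \eta &\iff \forall E \in \mathbb{R},\ \forall\, i, j = 1, ..., J,\ \left(d(E, A_i) > \eta/2 \text{ or } d(E, B_j) > \eta/2\right) \\ &\iff \forall E \in \mathbb{R}, \text{ either } \left(\forall\, i = 1, ..., J,\ d(E, A_i) > \eta/2\right) \text{ or } \left(\forall\, j = 1, ..., J,\ d(E, B_j) > \eta/2\right). \end{aligned} \]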
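The exercise above can be checked on small examples. The following is a minimal sketch, assuming closed intervals with integer endpoints (so that exact rational arithmetic applies); all function names are hypothetical and not taken from the paper. It uses the fact that, when some pair $(A_i, B_j)$ is at distance at most $\eta$, the midpoint of the gap between them is an energy $E$ close to both, so probing only those midpoints is enough to decide the right-hand statement.

```python
from fractions import Fraction
from itertools import product


def dist_point(E, interval):
    """Distance from the point E to the closed interval [a, b]."""
    a, b = interval
    return max(a - E, E - b, 0)


def dist_intervals(A, B):
    """Distance between two closed intervals A = [a1, b1] and B = [a2, b2]."""
    return max(B[0] - A[1], A[0] - B[1], 0)


def all_separated(As, Bs, eta):
    """Left-hand statement: d(A_i, B_j) > eta for every pair (i, j)."""
    return all(dist_intervals(A, B) > eta for A, B in product(As, Bs))


def energy_is_far(E, As, Bs, eta):
    """Right-hand statement at one E: E is eta/2-far from all A_i or from all B_j."""
    half = Fraction(eta) / 2
    return (all(dist_point(E, A) > half for A in As)
            or all(dist_point(E, B) > half for B in Bs))


def gap_midpoint(A, B):
    """Midpoint of the gap (or of the overlap) between A and B."""
    lo, hi = max(A[0], B[0]), min(A[1], B[1])
    return Fraction(lo + hi, 2)


def statements_agree(As, Bs, eta):
    """Compare both statements, probing E only at the gap midpoints.

    If every pair is separated, the triangle inequality makes the right-hand
    statement true at every E; otherwise a gap midpoint is a counterexample.
    """
    probes = [gap_midpoint(A, B) for A, B in product(As, Bs)]
    right = all(energy_is_far(E, As, Bs, eta) for E in probes)
    return all_separated(As, Bs, eta) == right
```

For instance, `statements_agree([(0, 1), (10, 12)], [(4, 5)], 2)` compares the two formulations for two intervals $A_i$ and one interval $B_j$.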
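This, combined with inequality (A.15) and $\eta = e^{-L_k^{\beta}}$, gives for all $k \ge 0$, and if $|x - y| > L_{k+1} + 2R$, that
\[ \mathbb{P}\left( \begin{array}{l} \forall E \in I, \text{ (i) and (ii) of Lemma A.3 with } \gamma = \gamma_k \\ \text{and } L = L_{k+1} \text{ are satisfied for either point } x \text{ or } y \end{array} \right) > 1 - 1/L_{k+1}^{2p+1}. \tag{A.16} \]
Let's finish the proof: for $L_0$ large enough, Hypothesis [H1]$(\gamma, I, p, L_0)$ gives (A.13) at rank 0. Suppose it is true at rank $k$: points (i) and (ii) of Lemma A.3, with $\gamma = \gamma_k$ and $L = L_{k+1}$, are satisfied for either point $x$ or $y$, with the probability evaluated in (A.16). Now,
\[ \begin{aligned} \mathbb{P}(\text{for any } E \in I, \text{ (iii) of Lemma A.3 holds}) &= 1 - \mathbb{P}\left( \begin{array}{l} \exists E \in I \text{ s.t. there are at least 4 } R \text{ non-overlapping} \\ (\gamma_k, E)\text{-singular boxes } \Lambda_{L_k} \text{ contained in } \Lambda_{L_{k+1}}(x) \end{array} \right) \\ &\ge 1 - \mathbb{P}\left( \begin{array}{l} \exists E \in I \text{ s.t. there are at least 2 } R \text{ non-overlapping} \\ (\gamma_k, E)\text{-singular boxes } \Lambda_{L_k} \text{ contained in } \Lambda_{L_{k+1}}(x) \end{array} \right)^2 \\ &\ge 1 - \left( \frac{(L_{k+1}/L_k + 1)^{2\nu}}{L_k^{2p}} \right)^2, \end{aligned} \tag{A.17} \]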
where we obtained the last inequality using the recurrence hypothesis. Hence, since $\alpha < 2 - 2\nu/(p + 2\nu)$, combining (A.16), (A.17), there exists a constant $L^* = L^*(p, \gamma, \gamma_0, \nu)$ such that if $L_0 > L^*$ and $|x - y| > L_{k+1} + 2R$:
\[ \mathbb{P}(R(\gamma_k, L_{k+1}, x, y)) > 1 - \frac{1}{L_{k+1}^{2p}}. \]
Use now that $\gamma_k > \gamma$, and Proposition 4.3 is proved.
References
1. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
2. Barbaroux, J.M.: Dynamique quantique des milieux désordonnés. Thèse de doctorat, Toulon (1996)
3. Barbaroux, J.M., Combes, J.M., Hislop, P.D.: Landau Hamiltonian with unbounded random potentials. Preprint (1997)
4. Carmona, R., Klein, A., Martinelli, F.: Anderson localization for Bernoulli and other singular potentials. Commun. Math. Phys. 108, 41–66 (1987)
5. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operator. Basel–Boston: Birkhäuser, 1990
6. Combes, J.M., Hislop, P.D.: Localization for some continuous, random Hamiltonian in d-dimension. J. Funct. Anal. 124, 149–180 (1994)
7. Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: Localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996)
8. Combes, J.M., Hislop, P.D., Mourre, E.: Spectral Averaging, Perturbation of Singular Spectra, and Localization. Trans. Amer. Math. Soc. 348, 4883–4894 (1996)
9. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1987
10. De Bièvre, S., Germinet, F.: Dynamical Localization for the random dimer model. Preprint (1998)
11. Del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum IV: Hausdorff dimensions, rank one perturbations and localization. J. d'Analyse Math. 69, 153–200 (1996)
12. Delyon, F., Levy, Y., Souillard, Y.: Anderson localization for multi-dimensional systems at large disorder or large energy. Commun. Math. Phys. 100, 463–470 (1985)
13. Dorlas, T.C., Macris, N., Pulé, J.V.: The Nature of the Spectrum for the Landau Hamiltonian with δ impurities. To appear in J. Stat. Phys.
14. von Dreifus, A., Klein, A.: A new proof of localization in the Anderson tight binding model. Commun. Math. Phys. 124, 285–299 (1989)
15. von Dreifus, H., Klein, A.: Localization for random Schrödinger operators with correlated potentials. Commun. Math. Phys. 140, 133–147 (1991)
16. Dunlap, D.H., Phillips, P.: Absence of localization in a random Dimer model. Phys. Rev. Lett. 61, 88 (1990)
17. Fröhlich, J., Spencer, T.: Absence of diffusion with Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983)
18. Fröhlich, J., Martinelli, F., Scoppola, E., Spencer, T.: Constructive proof of localization in the Anderson tight binding model. Commun. Math. Phys. 101, 21–46 (1985)
19. Germinet, F.: Dynamical Localization II with an Application to the Almost Mathieu Operator. Preprint (1997)
20. Jona-Lasinio, G., Martinelli, F., Scoppola, E.: Multiple tunnelings in d-dimensions: a quantum particle in a hierarchical potential. Ann. Inst. Henri Poincaré, vol. 42, 73–108 (1985)
21. Klopp, F.: Localization for continuous Random Schrödinger Operators. Commun. Math. Phys. 167, 553–569 (1995)
22. Kunz, H.: Quantum Hall effect for electrons in random potential. Commun. Math. Phys. 112, 121–145 (1987)
23. Martinelli, F., Holden, H.: On absence of diffusion near the bottom of the spectrum for a random Schrödinger operator. Commun. Math. Phys. 93, 197–217 (1984)
24. Pastur, L., Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer-Verlag, 1992
25. Reed, M., Simon, B.: Methods of modern mathematical physics Vol. I-IV, London: Academic Press, 1975
26. Simon, B., Wolff, T.: Singular continuous perturbation under rank one perturbation and localization for random Hamiltonians. Commun. Pure Appl. Math. 39, 75–90 (1986)
27. Wang, W.M.: Microlocalization, percolation and Anderson localization for the Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997)
Communicated by B. Simon
Commun. Math. Phys. 194, 343 – 358 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On Maxwell's Equations with a Temperature Effect, II
Robert Glassey1, Hong-Ming Yin2
1 Department of Mathematics, Indiana University, Bloomington, IN 47405, USA
2 Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA
Received: 24 July 1997 / Accepted: 22 October 1997
Abstract: In this paper we study Maxwell's equations with a thermal effect. This system models an induction heating process where the electric conductivity $\sigma$ strongly depends on the temperature $u$. We focus on a special one-dimensional case where the electromagnetic wave is assumed to be parallel to the y-axis. It is shown that the resulting hyperbolic-parabolic system has a global smooth solution if the electrical conductivity $\sigma(u)$ grows like $u^q$ with $0 \le q < 8 + 4\sqrt{3}$. A fundamental element in this paper is the establishment of a maximum principle for wave equations with damping. This maximum principle provides an a priori bound for the first derivative with respect to both $x$ and $t$ of the solution without the imposition of any differentiability assumptions or bounds on the coefficient of the damping term. The use of a nonlinear multiplier then permits (via a bootstrap procedure) the estimation of successively higher $L^p$-norms of the temperature function $u$.

1. Introduction
In this paper we study the following coupled hyperbolic-parabolic system:
\[ w_{tt} - w_{xx} + \sigma(u) w_t = 0, \quad (x, t) \in Q_T, \tag{1.1} \]
\[ u_t - u_{xx} = \sigma(u) w_t^2, \quad (x, t) \in Q_T, \tag{1.2} \]
subject to appropriate initial-boundary conditions, say, for example,
\[ w(i, t) = f_i(t), \quad u(i, t) = g_i(t), \quad 0 \le t \le T, \ i = 0, 1, \tag{1.3} \]
\[ w(x, 0) = w_0(x), \quad w_t(x, 0) = w_1(x), \quad u(x, 0) = u_0(x), \quad 0 \le x \le 1, \tag{1.4} \]
where $Q_T = (0, 1) \times (0, T)$ and $T > 0$. Equation (1.1) derives from Maxwell's equations [4], where the electric field is assumed to be $\{0, g(x, t), 0\}$ and $w(x, t) = \int_0^t g(x, \tau)\, d\tau$. Equation (1.2) describes standard heat conduction by taking into account the local Joule heat produced by eddy currents. One of the important features of the system is that the electric conductivity $\sigma$ strongly depends on the temperature. The reader may consult [6] for further physical background and [3, 9] (and the references therein) for the mathematical model of induction heating.

In [9] one of the authors studied the full Maxwell system coupled to a nonlinear heat equation in a bounded domain in $\mathbb{R}^3$. Existence of a global weak solution was established under a boundedness assumption on $\sigma(u)$. Moreover, for the case of one space dimension it is shown that the weak solution is also classical. When $\sigma(u)$ is not a priori assumed to be bounded, the problem becomes much more complicated. For instance, consider a special case where $\sigma(u) = u^q$ with $q > 1$. It is well known that the solution of the heat equation
\[ u_t - u_{xx} = u^q \]
will blow up in finite time, provided that the initial datum is suitably large. In particular, the $L^1$-norm of $u(x, t)$ blows up in finite time. However, with the coupling factor $w_t^2$ as the coefficient of $u^q$, it is not clear whether the temperature $u(x, t)$ will blow up in finite time. Indeed, with suitable boundary conditions, it is easy to derive the following energy estimate for any $T > 0$:
\[ \int_0^1 [w_t^2 + w_x^2]\, dx + \int_0^T\!\!\int_0^1 \sigma(u) w_t^2\, dx\, dt \le C. \]
This estimate shows that the right-hand side of Eq. (1.2) belongs to $L^1(Q_T)$. From the theory of parabolic equations [5], it follows that at least the $L^1(0, 1)$-norm of $u(x, t)$ is bounded for all $t \ge 0$. On the other hand, the boundedness of $u(x, t)$ in $L^1(0, 1)$ does not ensure the $L^p$-boundedness of $u(x, t)$ for large $p$. Thus, the existence of a global solution to the system (1.1)–(1.4) seems challenging.

In this paper we show that under certain conditions on the boundary values, the problem (1.1)–(1.4) has a global smooth solution if $\sigma_0[1 + |u|^q] \le \sigma(u) \le \sigma_1[1 + |u|^q]$ for any $q \in [0, 8 + 4\sqrt{3})$, no matter how large the initial data are. We do not know if this value for $q$ is sharp. The crucial step is that we prove a maximum principle for the following wave equation with damping:
\[ w_{tt} - w_{xx} + \sigma(x, t) w_t = 0, \]
where $\sigma(x, t) \ge 0$. Unlike the case for elliptic and parabolic equations [8], this maximum principle provides an upper bound for $|w_x|$ and $|w_t|$ in terms of the bounds for the initial and boundary quantities. It does not require any smoothness of, or bounds on, $\sigma(x, t)$. In particular, a priori bounds for $w_x$ and $w_t$ are derived when the boundary data at $x = 0$ and $x = 1$ are homogeneous, or given periodically in the $x$-variable. To the best of the authors' knowledge, this maximum principle is new and is of independent interest. The proof of the maximum principle is based on deriving a sequence of "energy identities" and an iteration technique. This technique has some similarity to Moser's iteration [7] in deriving a priori bounds of solutions to elliptic and parabolic equations. Moreover, the first two elements of this sequence may be reminiscent of an observation for the solution of the "nonlinear σ model", cf. [2].

This paper is organized as follows. In Sect. 2 we derive the maximum principle for damped wave equations. In particular, we derive an a priori bound for $w_x$ and $w_t$ for certain special boundary data. In Sect. 3, we derive various a priori energy estimates for $w(x, t)$ and $u(x, t)$. The existence of a global solution to (1.1)–(1.4) is established via a fixed point argument.
For the reader's convenience, we recall some standard notation below. Let $B$ be a Banach space and $1 \le p < \infty$, and let
\[ L^p(0, T; B) = \{ f \mid f : [0, T] \to B \text{ with the norm } \|f\|_{L^p(0,T;B)} < \infty \}, \]
where
\[ \|f\|_{L^p(0,T;B)} = \left[ \int_0^T \|f\|_B^p\, dt \right]^{1/p}. \]
The spaces $W^{m,p}(\Omega)$, $H_0^1(\Omega)$, $W_p^{2,1}(Q_T)$ and $C^{1+\alpha, \frac{1+\alpha}{2}}(\bar Q_T)$, $C^{2+\alpha, 1+\frac{\alpha}{2}}(\bar Q_T)$, etc. are the usual Sobolev and classical spaces. The reader may consult [5] for the definition of these spaces.
2. A Maximum Principle for Damped Wave Equations
Let $\sigma(x, t)$ be measurable and nonnegative in $Q = \{(x, t) : 0 < x < 1,\ t > 0\}$. Consider the following wave equation:
\[ w_{tt} - w_{xx} + \sigma(x, t) w_t = 0, \quad (x, t) \in Q. \tag{2.1} \]
With appropriate initial and boundary conditions, one can derive an a priori bound for the energy
\[ \int_0^1 [w_t^2 + w_x^2]\, dx + \int_0^t\!\!\int_0^1 \sigma w_t^2\, dx\, dt. \]
Further a priori estimates often require smoothness and bounds on $\sigma(x, t)$ as well as on its derivatives. In this section we derive $L^\infty$-bounds for $w_t$ and $w_x$ over $Q$ in terms of a constant which depends only on the values of $w_x$ and $w_t$ on the boundary $x = 0$ and $x = 1$ as well as at the initial time. In particular, if the boundary conditions are given by $w(0, t) = w(1, t) = 0$, or $w_x(0, t) = w_x(1, t) = 0$, or a periodic form $w(0, t) = w(1, t)$, $w_x(0, t) = w_x(1, t)$, then the $L^\infty$-norm of $w_x$ and $w_t$ in $Q$ can be estimated by a known constant depending only on initial values. This a priori estimate does not depend on the smoothness of $\sigma(x, t)$ nor on bounds for $\sigma(x, t)$, as long as it is nonnegative.

Theorem 2.1. Let $\sigma(x, t)$ be nonnegative and integrable in $Q$. Let $w(x, t)$ be a solution of the wave equation (2.1). Then
\[ \sup_{(x,t) \in Q} |w_x| + \sup_{(x,t) \in Q} |w_t| \le C, \tag{2.2} \]
where $C = 4\max\{C_0, C_1\}$ and
\[ C_0 = \sup_{0<x<1} |w_x(x, 0)| + \sup_{0<x<1} |w_t(x, 0)|, \]
\[ C_1 = \max\left\{ \sup_{t>0} |w_x(0, t)| + \sup_{t>0} |w_t(0, t)|,\ \sup_{t>0} |w_x(1, t)| + \sup_{t>0} |w_t(1, t)| \right\}. \]
Proof. We shall first show the result in a fixed time interval, say, $[0, T]$ for any $T > 0$. Let $Q_T = (0, 1) \times (0, T]$. Multiplying (2.1) by $w_t$, we see that
\[ \frac{\partial}{\partial t}\left[ \frac{1}{2}(w_t^2 + w_x^2) \right] + \sigma w_t^2 = \frac{\partial}{\partial x}(w_x w_t). \]
Similarly, we multiply (2.1) by $w_x$ to obtain
\[ \frac{\partial}{\partial t}(w_x w_t) + \sigma w_t w_x = \frac{\partial}{\partial x}\left[ \frac{1}{2}(w_t^2 + w_x^2) \right]. \]
On $\bar Q_T$, we define
\[ e = \frac{1}{2}(w_t^2 + w_x^2), \quad p = w_t w_x. \]
Then $e$ and $p$ will satisfy the following system:
\[ e_t + \sigma w_t^2 = p_x \quad \text{in } Q_T, \tag{2.3} \]
\[ p_t + \sigma p = e_x \quad \text{in } Q_T. \tag{2.4} \]
Define
\[ e_{n+1}(x, t) = \frac{e_n^2 + p_n^2}{2}, \quad p_{n+1} = e_n p_n, \quad n = 1, 2, \cdots, \]
where $e_1 = e = \frac{1}{2}(w_t^2 + w_x^2)$ and $p_1 = p = w_t w_x$. Then $e_{n+1}$ and $p_{n+1}$ will satisfy the following system:
\[ (e_{n+1})_t + \sigma a_{n+1} = (p_{n+1})_x \quad \text{in } Q_T, \tag{2.5} \]
\[ (p_{n+1})_t + \sigma b_{n+1} = (e_{n+1})_x \quad \text{in } Q_T, \tag{2.6} \]
where $a_1 = w_t^2$ and $a_{n+1} = e_n a_n + p_n b_n$, $b_1 = p$ and $b_{n+1} = a_n p_n + b_n e_n$, $n = 1, 2, \ldots$.

We claim that $a_{n+1} \ge 0$, $p_{n+1} b_{n+1} \ge 0$ for all $n \ge 1$. The claim can be proved by induction. Indeed, for $n = 1$, it is clear that
\[ a_2 = e_1 a_1 + p_1 b_1 = e w_t^2 + p^2 \ge 0 \]
and
\[ b_2 p_2 = (a_1 p_1 + b_1 e_1) e_1 p_1 \ge \tfrac{1}{2} w_t^4 p^2 + e^2 p^2 \ge 0. \]
Assume that the claim holds for $n$, i.e. $a_n \ge 0$, $p_n b_n \ge 0$. Then
\[ a_{n+1} = e_n a_n + p_n b_n \ge 0, \]
since $e_n \ge 0$ for all $n \ge 1$. From the definition of $p_n$ and $b_n$, we see
\[ p_{n+1} b_{n+1} = (e_n p_n)[a_n p_n + b_n e_n] = e_n a_n p_n^2 + e_n^2 p_n b_n \ge 0, \]
which concludes the induction proof. Now we integrate (2.5) over $(0, 1)$ to obtain
\[ \frac{d}{dt} \int_0^1 e_{n+1}\, dx + \int_0^1 \sigma a_{n+1}\, dx = \int_0^1 (p_{n+1})_x\, dx. \tag{2.7} \]
As $\sigma \ge 0$ and $a_{n+1} \ge 0$, we see that
\[ \int_0^1 e_{n+1}\, dx \le \int_0^1 e_{n+1}\big|_{t=0}\, dx + \int_0^t [p_{n+1}(1, \tau) - p_{n+1}(0, \tau)]\, d\tau. \tag{2.8} \]
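From the definition of $e_n$, we see that
\[ \begin{aligned} e_{n+1} = \frac{e_n^2 + p_n^2}{2} &\ge \frac{1}{2} e_n^2 = \frac{1}{2} \left( \frac{e_{n-1}^2 + p_{n-1}^2}{2} \right)^2 \\ &\ge \frac{1}{2} \cdot \left(\frac{1}{2}\right)^2 e_{n-1}^4 \\ &\ge \cdots \\ &\ge \frac{1}{2} \cdot \left(\frac{1}{2}\right)^2 \cdots \left(\frac{1}{2}\right)^{2^{n-1}} e_1^{2^n}. \end{aligned} \]
Now
\[ \frac{1}{2} \cdot \left(\frac{1}{2}\right)^2 \cdots \left(\frac{1}{2}\right)^{2^{n-1}} = \frac{1}{2^{2^n - 1}} \]
and
\[ e_1 \equiv e \ge \frac{w_t^2}{2}, \quad e_1 \ge \frac{w_x^2}{2}. \]
It follows that
\[ \int_0^1 e_{n+1}\, dx \ge \frac{1}{2^{2^{n+1} - 1}} \int_0^1 w_t^{2^{n+1}}\, dx, \tag{2.9} \]
\[ \int_0^1 e_{n+1}\, dx \ge \frac{1}{2^{2^{n+1} - 1}} \int_0^1 w_x^{2^{n+1}}\, dx. \tag{2.10} \]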
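The recursion and the pointwise lower bound behind (2.9)–(2.10) can be checked numerically at a single point $(x, t)$. The sketch below is only an illustration with hypothetical function names. It also uses an observation that is not stated in the paper but follows directly from the recursion: since $e_{n+1} \pm p_{n+1} = \frac{1}{2}(e_n \pm p_n)^2$, one gets the closed form $e_n = \left(\frac{w_t + w_x}{2}\right)^{2^n} + \left(\frac{w_t - w_x}{2}\right)^{2^n}$.

```python
def first_pair(wt, wx):
    """e_1 = (w_t^2 + w_x^2) / 2 and p_1 = w_t * w_x at one point (x, t)."""
    return 0.5 * (wt * wt + wx * wx), wt * wx


def next_pair(e, p):
    """One step of the recursion e_{n+1} = (e_n^2 + p_n^2) / 2, p_{n+1} = e_n p_n."""
    return 0.5 * (e * e + p * p), e * p


def energy_sequence(wt, wx, n_max):
    """Return the list [e_1, ..., e_{n_max}] generated by the recursion."""
    e, p = first_pair(wt, wx)
    values = [e]
    for _ in range(n_max - 1):
        e, p = next_pair(e, p)
        values.append(e)
    return values


def closed_form(wt, wx, n):
    """Closed form of e_n (an observation added here, not taken from the paper)."""
    power = 2 ** n
    return ((wt + wx) / 2) ** power + ((wt - wx) / 2) ** power


def lower_bound(w, n):
    """Pointwise bound used for (2.9)-(2.10): e_n >= |w|^(2^n) / 2^(2^n - 1)."""
    power = 2 ** n
    return abs(w) ** power / 2 ** (power - 1)
```

Comparing `energy_sequence(wt, wx, n_max)` entry by entry with `closed_form` and `lower_bound` for a few sample values of `wt` and `wx` shows how quickly $e_n$ concentrates on the larger of $|w_t \pm w_x|/2$, which is the mechanism that the $2^{n+1}$-th root argument later exploits.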
We claim that the following estimates hold for all $n \ge 0$:
\[ e_{n+1}(x, 0) \le C_0^{2^{n+1}}, \tag{2.11} \]
\[ \int_0^t [p_{n+1}(1, \tau) - p_{n+1}(0, \tau)]\, d\tau \le 2T C_1^{2^{n+1}}, \tag{2.12} \]
where
\[ C_0 = \sqrt{ \frac{ \|w_x(x, 0)\|_{L^\infty(0,1)}^2 + \|w_t(x, 0)\|_{L^\infty(0,1)}^2 }{2} }, \]
\[ C_1 = \max\left\{ \sup_{t>0} |w_x(0, t)| + \sup_{t>0} |w_t(0, t)|,\ \sup_{t>0} |w_x(1, t)| + \sup_{t>0} |w_t(1, t)| \right\}. \]
We use induction to prove (2.11)–(2.12). From the definition,
\[ e_1(x, 0) = \frac{w_x(x, 0)^2 + w_t(x, 0)^2}{2} \le C_0^2. \]
Assume that the estimate (2.11) holds for all $n \le k$, i.e.
\[ e_n(x, 0) \le C_0^{2^n}, \quad n = 1, 2, \cdots, k. \]
Now, from the definition of $e_k$ and $p_k$, we see that
\[ \begin{aligned} e_{k+1}(x, 0) &= \frac{e_k^2 + p_k^2}{2} \\ &= \frac{e_k^2 + e_{k-1}^2 p_{k-1}^2}{2} \\ &= \frac{e_k^2 + e_{k-1}^2 e_{k-2}^2 \cdots e_1^2 p_1^2}{2} \\ &\le \frac{C_0^{2^{k+1}} + C_0^{2^k + 2^{k-1} + \cdots + 2^2}\, p_1^2}{2} \\ &\le \frac{C_0^{2^{k+1}} + C_0^{2^k + 2^{k-1} + \cdots + 2^2 + 4}}{2}, \end{aligned} \]
since $p_1^2(x, 0) = |w_x(x, 0) w_t(x, 0)|^2 \le C_0^4$. Now
\[ 2 + 2^2 + \cdots + 2^k = 2(2^k - 1) = 2^{k+1} - 2. \]
It follows that
\[ e_{k+1}(x, 0) \le \frac{C_0^{2^{k+1}} + C_0^{2^{k+1}}}{2} = C_0^{2^{k+1}}. \]
To prove the second estimate (2.12) of the claim, we recall from the definition of $p_{n+1}$,
\[ p_{n+1} = e_n p_n = e_n e_{n-1} \cdots e_1 p_1. \tag{2.13} \]
By using the same induction argument, we can easily derive
\[ e_n(0, t) \le C_1^{2^n}, \quad e_n(1, t) \le C_1^{2^n}, \quad n = 1, 2, \ldots \]
as follows. From
\[ |p_1(0, t)| = |w_x(0, t) w_t(0, t)| \le C_1^2, \quad |p_1(1, t)| = |w_x(1, t) w_t(1, t)| \le C_1^2 \]
we get, assuming the validity of the claim at all steps $k \le n$,
\[ |p_{n+1}|_{x=0} \le C_1^{2^n} C_1^{2^{n-1}} \cdots C_1^{2} \cdot C_1^{2} = C_1^{2^{n+1}}. \]
Similarly,
\[ |p_{n+1}|_{x=1} \le C_1^{2^n} C_1^{2^{n-1}} \cdots C_1^{2} \cdot C_1^{2} = C_1^{2^{n+1}}. \]
Finally, we combine the above estimates to obtain from (2.8) (after an integration over $t$) the following two inequalities:
\[ \frac{1}{2^{2^{n+1} - 1}} \int_0^T\!\!\int_0^1 w_t^{2^{n+1}}\, dx\, dt \le C_0^{2^{n+1}} T + 2T^2 C_1^{2^{n+1}}, \tag{2.14} \]
\[ \frac{1}{2^{2^{n+1} - 1}} \int_0^T\!\!\int_0^1 w_x^{2^{n+1}}\, dx\, dt \le C_0^{2^{n+1}} T + 2T^2 C_1^{2^{n+1}}. \tag{2.15} \]
We take the $2^{n+1}$-th root and then take the limit as $n$ tends to infinity to conclude that
\[ \sup_{[0,1] \times [0,T]} |w_t| \le 2\max\{C_0, C_1\}, \tag{2.16} \]
\[ \sup_{[0,1] \times [0,T]} |w_x| \le 2\max\{C_0, C_1\}. \tag{2.17} \]
As the constants in (2.16)–(2.17) do not depend on $T$, the desired estimates follow immediately.

Corollary 2.2. If the boundary conditions associated with Eq. (2.1) are given by
\[ w(0, t) = w(1, t) = 0, \quad \text{or} \quad w_x(0, t) = w_x(1, t) = 0, \]
or the boundary conditions are periodic,
\[ w(0, t) = w(1, t), \quad w_x(0, t) = w_x(1, t), \quad t \ge 0, \]
then
\[ \sup_{(x,t) \in Q} |w_x| + \sup_{(x,t) \in Q} |w_t| \le C_0, \tag{2.18} \]
where
\[ C_0 = 4\left( \sup_{0<x<1} |w_x(x, 0)| + \sup_{0<x<1} |w_t(x, 0)| \right). \]
Proof. When $w(0, t) = w(1, t) = 0$ or $w_x(0, t) = w_x(1, t) = 0$, we have $p_1(0, t) = p_1(1, t) = 0$, and it follows from (2.13) that
\[ p_{n+1}(0, t) = p_{n+1}(1, t) = 0. \]
The conclusion follows immediately from the proof of Theorem 2.1. When the boundary conditions are given periodically at $x = 0$ and $x = 1$, then
\[ e_n(0, t) = e_n(1, t), \quad p_n(0, t) = p_n(1, t) \]
for all $n \ge 1$. It follows that
\[ p_{n+1}(1, t) - p_{n+1}(0, t) = 0, \quad t \ge 0. \]
Again the desired result follows immediately from the estimate (2.8) in the proof of Theorem 2.1.

Corollary 2.3. For a Cauchy problem associated with (2.1), if the initial data $w(x, 0)$ and $w_t(x, 0)$ are of class $C^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then
\[ \sup_{(x,t) \in Q} |w_x| + \sup_{(x,t) \in Q} |w_t| \le C_0, \tag{2.19} \]
where
\[ C_0 = 4\left( \sup_{x \in \mathbb{R}} |w_0(x)| + \sup_{x \in \mathbb{R}} |w_1(x)| \right). \]
Proof. Indeed, when $w(x, 0)$ and $w_t(x, 0)$ have compact support, then the solution $w(x, t)$ of the Cauchy problem for (2.1) has compact support for every $t \ge 0$. On a fixed interval $[0, T]$, we choose $M$ sufficiently large such that
\[ w_x(\pm M, t) = w_t(\pm M, t) = 0, \quad 0 \le t \le T. \]
Then
\[ p_{n+1}(\pm M, t) = 0, \quad 0 \le t \le T. \]
We integrate the inequality (2.8) over $[-M, M]$ and then take the limit to conclude the estimate
\[ \sup_{(x,t) \in [-M,M] \times (0,T]} |w_x| + \sup_{(x,t) \in [-M,M] \times (0,T]} |w_t| \le C_0. \]
Since $C_0$ does not depend on $M$ nor on $T$, we first let $M$ tend to $+\infty$ and then let $T$ go to $+\infty$ to obtain the desired result. When the initial data $w(x, 0)$ and $w_t(x, 0)$ do not have compact support, we approximate the initial values by a sequence of functions $w_n(x, 0)$ and $w_{nt}(x, 0)$ with compact support. As the estimate (2.19) for the approximate solution $w_n(x, t)$ does not depend on $n$, we can take the limit, and the limit function will satisfy the same estimate (2.19).

3. Existence of Global Solutions
We begin with the case where $\sigma(u)$ has at most linear growth, namely, $0 \le \sigma(u) \le C[1 + |u|]$. For this case, we show that the problem (1.1)–(1.4) has a unique global solution in the classical sense. For the case where $\sigma_0[1 + |u|^q] \le \sigma(u) \le \sigma_1[1 + |u|^q]$ with $1 < q < 8 + 4\sqrt{3}$, we will show that the problem (1.1)–(1.4) has a unique global solution if $f_0(t) = f_1(t) = 0$. We first list assumptions on the data.

H(3.1): $w_0(x) \in C^3[0, 1]$, $w_1(x) \in C^3[0, 1]$; $f_0(t), f_1(t) \in C^2[0, T]$,
H(3.2): $u_0(x) \in C^{2+\alpha}[0, 1]$; $g_0(t), g_1(t) \in C^{1+\alpha/2}[0, T]$ for some $\alpha \in (0, 1)$.

Moreover, the following consistency conditions are to hold:
\[ u_0(0) = g_0(0), \quad u_0(1) = g_1(0), \]
\[ g_0'(0) - u_0''(0) = \sigma(u_0(0)) f_0'(0)^2, \quad g_1'(1) - u_0''(1) = \sigma(u_0(1)) f_1'(0)^2, \]
\[ f_0(0) = w_0(0), \quad f_1(0) = w_0(1); \quad f_0'(0) = w_1(0), \quad f_1'(0) = w_1(1), \]
\[ f_0''(0) - w_0''(0) + \sigma(u_0(0)) f_0'(0) = 0, \quad f_1''(0) - w_0''(1) + \sigma(u_0(1)) f_1'(0) = 0. \]
Theorem 3.1. Under the assumptions H(3.1)–H(3.2) there exists a classical solution $(w(x, t), u(x, t))$ in $[0, T]$ for any $T > 0$, if $\sigma(u) \in C^2(\mathbb{R})$ and $0 \le \sigma(u) \le M[1 + |u|]$ for $u \in \mathbb{R}$.

Proof. The local existence of a solution to (1.1)–(1.4) is standard and we shall skip it. To prove the global solvability, we only need to derive an a priori estimate in an appropriate classical space. We first derive some a priori energy estimates. Let
\[ F(x, t) = (1 - x) f_0(t) + x f_1(t) \]
and
\[ W(x, t) = w(x, t) - F(x, t). \]
Then $W(x, t)$ satisfies
\[ W_{tt} - W_{xx} + \sigma(u)[W_t + F_t(x, t)] = -F_{tt}(x, t), \quad (x, t) \in Q_T. \tag{3.1} \]
Multiplying (3.1) by $W_t$, we obtain after integration over $[0, 1] \times (0, T]$ that
\[ \begin{aligned} \int_0^1 [W_x^2 + W_t^2]\, dx + \int_0^T\!\!\int_0^1 \sigma(u) W_t^2\, dx\, dt &\le C + C \int_0^T\!\!\int_0^1 \sigma(u) |W_t|\, dx\, dt \\ &\le C + \delta \int_0^T\!\!\int_0^1 \sigma(u) W_t^2\, dx\, dt + C(\delta) \int_0^T\!\!\int_0^1 \sigma(u)\, dx\, dt, \end{aligned} \]
where at the second step we have used Hölder's and Young's inequalities. Choosing $\delta$ properly, we obtain
\[ \int_0^1 [W_x^2 + W_t^2]\, dx + \int_0^T\!\!\int_0^1 \sigma(u) W_t^2\, dx\, dt \le C + C \int_0^T\!\!\int_0^1 |u|\, dx\, dt. \tag{3.2} \]
Let
\[ H(x, t) = (1 - x) g_1(t) + x g_2(t). \]
Then $U(x, t) = u(x, t) - H(x, t)$ satisfies
\[ U_t - U_{xx} = \sigma(U + H)[W_t + F_t]^2, \quad (x, t) \in Q_T. \tag{3.3} \]
Green's representation gives
\[ U(x, t) = \int_0^1 G(x, y, t, 0) U_0(y)\, dy + \int_0^t\!\!\int_0^1 G(x, y, t, \tau) \sigma(u) [W_t + F_t]^2\, dy\, d\tau, \tag{3.4} \]
where $G(x, y, t, \tau)$ is the Green's function of the heat equation (1.2). Integrating over $[0, 1]$ with respect to $x$, we see that
\[ \begin{aligned} \int_0^1 |U(x, t)|\, dx &\le C + C \int_0^t\!\!\int_0^1 \sigma(u)[1 + W_t^2]\, dy\, d\tau \\ &\le C + C \int_0^t\!\!\int_0^1 |u(x, t)|\, dx\, dt, \end{aligned} \]
where at the final step we have used the estimate (3.2) and the fact
\[ \int_0^1 G(x, y, t, \tau)\, dx \le C \]
for a fixed constant $C$. It follows that
\[ \int_0^1 |u(x, t)|\, dx \le C + C \int_0^t\!\!\int_0^1 |u(x, t)|\, dx\, dt. \]
Gronwall's inequality implies that
\[ \int_0^1 |u(x, t)|\, dx \le C, \]
where $C$ depends on $T$, $0 \le t \le T$. From the estimate (3.2), we have
\[ \int_0^1 [W_x^2 + W_t^2]\, dx + \int_0^T\!\!\int_0^1 \sigma(u) W_t^2\, dx\, dt \le C. \tag{3.5} \]
As
\[ |G(x, y, t, \tau)| \le C (t - \tau)^{-\frac{1}{2}} e^{-C \frac{(x-y)^2}{4(t-\tau)}}, \quad t > \tau, \ x, y \in (0, 1), \]
it follows from (3.4) and (3.5) that
\[ |U(x, t)| \le C + C \int_0^t \frac{\|u(\cdot, \tau)\|_{L^\infty(0,1)}}{\sqrt{t - \tau}}\, d\tau \quad \text{for all } (x, t) \in Q_T. \]
An application of the generalized Gronwall inequality gives us
\[ \sup_{Q_T} |u(x, t)| \le C, \]
where $C$ depends only on the initial and boundary data as well as on the upper bound $T$. With this a priori $L^\infty$-bound of $u(x, t)$ in hand, we can easily follow the argument of [9] to conclude the desired existence result.

Next we study the case where $\sigma(u)$ grows like $u^q$ with $q > 1$. We assume

H(3.3): Let $\sigma(u) \in C^2(\mathbb{R})$ satisfy $\sigma_0[1 + |u|^q] \le \sigma(u) \le \sigma_1[1 + |u|^q]$, where $\sigma_0$ and $\sigma_1$ are positive constants.

Theorem 3.2. Let the assumptions H(3.1)–H(3.3) hold and let $f_1(t) = f_2(t) = 0$ for $t \in [0, T]$. Assume $g_1(t)$, $g_2(t)$ and $u_0(x)$ are nonnegative on $[0, 1]$. Then any solution $(w, u)$ of (1.1)–(1.4) satisfies the following a priori estimates:
\[ \sup_{Q_T} |w_x| + \sup_{Q_T} |w_t| \le C, \quad \|u\|_{W_p^{2,1}(Q_T)} \le C, \]
where
\[ p = \frac{1 + \varepsilon^* + q}{q}, \quad \varepsilon^* = \frac{(q - 8) - \sqrt{(q - 8)^2 - 48}}{4} \quad \text{if } q \ge 8 + 4\sqrt{3}. \]
The number $p$ can be arbitrarily large if $q < 8 + 4\sqrt{3}$, while the constant $C$ depends only on known data and $p$.
Proof. For simplicity, we assume $g_0(t) = g_1(t) = 0$. Otherwise, similar to the argument used in Theorem 3.1, we introduce $U(x, t) = u(x, t) - H(x, t)$. Without loss of generality we may further assume that $\sigma(u) = u^q$. It will be seen that the general case can be handled similarly. The proof will be divided into three steps. First of all, we note that $u(x, t) \ge 0$ by the maximum principle, since $u_0(x) \ge 0$.

Step 1. Multiplying Eq. (1.1) by $u w_t$ and then integrating over $(0, 1) \times (0, t)$, we have, after some routine calculations (the boundary terms vanish because $w_t = 0$ at $x = 0$ and $x = 1$),
\[ \begin{aligned} \frac{1}{2} \int_0^1 u[w_x^2 + w_t^2]\, dx + \int_0^t\!\!\int_0^1 u^{q+1} w_t^2\, dx\, dt = {} & \frac{1}{2} \int_0^1 u_0[(w_0')^2 + (w_1)^2]\, dx \\ & + \frac{1}{2} \int_0^t\!\!\int_0^1 u_t[w_x^2 + w_t^2]\, dx\, dt - \int_0^t\!\!\int_0^1 w_x w_t u_x\, dx\, dt. \end{aligned} \]
Since $w_t$ and $w_x$ are uniformly bounded by Corollary 2.2, it follows that
\[ \int_0^1 u[w_x^2 + w_t^2]\, dx + \int_0^t\!\!\int_0^1 u^{q+1} w_t^2\, dx\, dt \le C_1 + C_2 \int_0^T\!\!\int_0^1 [|u_x| + |u_t|]\, dx\, dt. \tag{3.6} \]
On the other hand, by applying the $W_p^{2,1}$-estimate for the parabolic equation (1.2), we obtain
\[ \|u\|_{W_p^{2,1}(Q_T)}^p \le C\left[ \|u_0\|_{W_p^2}^p + \int_0^T\!\!\int_0^1 u^{pq} w_t^{2p}\, dx\, dt \right]. \tag{3.7} \]
In particular, for $p = \frac{q+1}{q}$, we see that
\[ \begin{aligned} \int_0^T\!\!\int_0^1 u^{pq} w_t^{2p}\, dx\, dt &\le C \int_0^T\!\!\int_0^1 u^{q+1} w_t^2\, dx\, dt \quad (\text{since } w_t \text{ is bounded}) \\ &\le C + C C_2 \int_0^T\!\!\int_0^1 [|u_x| + |u_t|]\, dx\, dt, \end{aligned} \]
where at the final step the estimate (3.6) was used. It follows that
\[ \|u\|_{W_p^{2,1}(Q_T)}^p \le C + C \int_0^T\!\!\int_0^1 [|u_x| + |u_t|]\, dx\, dt. \]
After applying Hölder's and Young's inequalities we get
\[ \|u\|_{W_p^{2,1}(Q_T)}^p \le C, \tag{3.8} \]
where $p = \frac{q+1}{q}$. Now by multiplying (1.2) by $u$ and using (3.8), we immediately obtain
\[ \sup_{0 \le t \le T} \int_0^1 u^2\, dx + \int_0^T\!\!\int_0^1 u_x^2\, dx\, dt \le C. \tag{3.9} \]
Sobolev's embedding shows that for any $v(x) \in H_0^1(0, 1)$,
\[ \int_0^1 v^6\, dx \le C \int_0^1 v_x^2\, dx \cdot \left[ \int_0^1 v^2\, dx \right]^2. \tag{3.10} \]
It follows from (3.9)–(3.10) that
\[ \int_0^T\!\!\int_0^1 u^6\, dx\, dt \le C \int_0^T\!\!\int_0^1 u_x^2\, dx\, dt \le C. \tag{3.11} \]
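The interpolation inequality (3.10) can be sanity-checked on a few explicit functions vanishing at $x = 0$ and $x = 1$. The sketch below is an illustration only: the constant $C = 4$ is an assumption made here (it follows from $v(x)^2 = 2\int_0^x v v_x \le 2\|v\|_{L^2}\|v_x\|_{L^2}$, so that $\int v^6 \le \|v\|_{L^\infty}^4 \int v^2 \le 4 \int v_x^2 \left(\int v^2\right)^2$), the integrals are approximated by a midpoint rule, and the helper names are hypothetical.

```python
import math


def integrate(f, n=4000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def sobolev_sides(v, dv, C=4.0):
    """Return (lhs, rhs) of  int v^6 <= C * int v_x^2 * (int v^2)^2  on (0, 1).

    v is the function, dv its derivative; C = 4 is an assumed admissible constant.
    """
    lhs = integrate(lambda x: v(x) ** 6)
    rhs = C * integrate(lambda x: dv(x) ** 2) * integrate(lambda x: v(x) ** 2) ** 2
    return lhs, rhs


# Sample functions in H_0^1(0, 1), each paired with its exact derivative.
SAMPLES = {
    "sin(pi x)": (lambda x: math.sin(math.pi * x),
                  lambda x: math.pi * math.cos(math.pi * x)),
    "sin(3 pi x)": (lambda x: math.sin(3 * math.pi * x),
                    lambda x: 3 * math.pi * math.cos(3 * math.pi * x)),
    "x (1 - x)": (lambda x: x * (1.0 - x),
                  lambda x: 1.0 - 2.0 * x),
}
```

Calling `sobolev_sides(*SAMPLES[name])` for each entry returns a pair whose first component stays below the second, in line with (3.10); in the proof the inequality is applied for each fixed $t$ and then integrated in time, with the $L^2$ factor controlled by the supremum in (3.9).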
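Step 2. Let $\varepsilon > 0$. Multiplying (1.1) by $u^{1+\varepsilon} w_t$ and then integrating over $(0, 1) \times (0, T)$, we obtain
\[ \int_0^1 u^{1+\varepsilon}[w_x^2 + w_t^2]\, dx + \int_0^T\!\!\int_0^1 u^{q+1+\varepsilon} w_t^2\, dx\, dt \le C + C \int_0^T\!\!\int_0^1 u^{\varepsilon}[|u_t| + |u_x|]\, dx\, dt. \tag{3.12} \]
Multiplying Eq. (1.2) by $u^{1+\varepsilon}$ and integrating over $(0, 1) \times (0, T)$, we have
\[ \begin{aligned} \sup_{0 \le t \le T} \int_0^1 u^{2+\varepsilon}\, dx + \int_0^T\!\!\int_0^1 u^{\varepsilon} u_x^2\, dx\, dt &\le C \int_0^T\!\!\int_0^1 u^{q+1+\varepsilon} w_t^2\, dx\, dt \\ &\le C + C \int_0^T\!\!\int_0^1 u^{\varepsilon}[|u_t| + |u_x|]\, dx\, dt, \end{aligned} \tag{3.13} \]
where at the final step the estimate (3.12) was used. An application of the $W_p^{2,1}$-estimate with $p = \frac{1+\varepsilon+q}{q}$ to (1.2) yields
\[ \begin{aligned} \int_0^T\!\!\int_0^1 [u^p + u_{xx}^p + u_t^p]\, dx\, dt &\le C + C \int_0^T\!\!\int_0^1 u^{q+1+\varepsilon} w_t^{2p}\, dx\, dt \\ &\le C + C \int_0^T\!\!\int_0^1 u^{\varepsilon}[|u_t| + |u_x|]\, dx\, dt, \end{aligned} \tag{3.14} \]
where the estimate (3.13) was used again. Now we use Hölder's inequality with $r = \frac{1+\varepsilon+q}{q}$ and $s = \frac{1+\varepsilon+q}{\varepsilon+1}$ and then Young's inequality to obtain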
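\[ \begin{aligned} \int_0^T\!\!\int_0^1 u^{\varepsilon} |u_t|\, dx\, dt &\le \left( \int_0^T\!\!\int_0^1 |u_t|^r\, dx\, dt \right)^{1/r} \left( \int_0^T\!\!\int_0^1 u^{\varepsilon s}\, dx\, dt \right)^{1/s} \\ &\le \delta \int_0^T\!\!\int_0^1 |u_t|^r\, dx\, dt + C(\delta) \int_0^T\!\!\int_0^1 u^{\varepsilon s}\, dx\, dt \\ &= \delta \int_0^T\!\!\int_0^1 |u_t|^{(1+\varepsilon+q)/q}\, dx\, dt + C(\delta) \int_0^T\!\!\int_0^1 u^{[\varepsilon(1+\varepsilon+q)]/(1+\varepsilon)}\, dx\, dt. \end{aligned} \tag{3.15} \]
Similarly,
\[ \int_0^T\!\!\int_0^1 u^{\varepsilon} |u_x|\, dx\, dt \le \delta \int_0^T\!\!\int_0^1 |u_x|^{(1+\varepsilon+q)/q}\, dx\, dt + C(\delta) \int_0^T\!\!\int_0^1 u^{[\varepsilon(1+\varepsilon+q)]/(1+\varepsilon)}\, dx\, dt. \tag{3.16} \]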
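If we can choose $\varepsilon$ such that $[\varepsilon(1 + \varepsilon + q)]/(1 + \varepsilon) \le 6$, then the final term on the right-hand side of (3.15) and (3.16) can be estimated by (3.11). Hence, after choosing $\delta$ small in (3.15)–(3.16), we see from (3.14)–(3.16) that
\[ \int_0^T\!\!\int_0^1 [u^p + u_{xx}^p + u_t^p]\, dx\, dt \le C, \]
where
\[ p = \frac{1 + \varepsilon + q}{q}. \]
Now the inequality
\[ \frac{\varepsilon(1 + \varepsilon + q)}{1 + \varepsilon} \le 6 \]
is equivalent to
\[ \varepsilon^2 + (q - 5)\varepsilon - 6 \le 0. \]
The maximum value of $\varepsilon$ which we can choose is
\[ \varepsilon = \frac{-(q - 5) + \sqrt{(q - 5)^2 + 24}}{2}. \]
Before we derive further estimates, we note that from the embedding theorem [5]:
\[ \|u\|_{L^\infty(Q_T)} \le C \|u\|_{W_p^{2,1}(Q_T)} \quad \text{if } p > \tfrac{3}{2}. \]
Now
\[ p = \frac{1 + \varepsilon + q}{q} = 1 + \frac{1 + \varepsilon}{q}. \]
Hence $p > \frac{3}{2}$ is equivalent to
\[ \frac{1 + \varepsilon}{q} > \frac{1}{2}. \]
It follows that $q < 6$. Consequently, when $q < 6$ there exists a constant $C$ such that
\[ \|u\|_{L^\infty(Q_T)} \le C. \]
Step 3. Let
\[ \varepsilon_1 = \frac{-(q - 5) + \sqrt{(q - 5)^2 + 24}}{2}. \]
Now from Step 2 we know that
\[ \sup_{0 \le t \le T} \int_0^1 u^{2+\varepsilon_1}\, dx + \int_0^T\!\!\int_0^1 u^{\varepsilon_1} u_x^2\, dx\, dt \le C. \]
Let $v = u^{\frac{2+\varepsilon_1}{2}}$. Then Sobolev's embedding (3.10) implies
\[ \int_0^T\!\!\int_0^1 v^6\, dx\, dt \le C \int_0^T\!\!\int_0^1 v_x^2\, dx\, dt \cdot \sup_{0 \le t \le T} \left( \int_0^1 v^2\, dx \right)^2. \]
It follows that
\[ \int_0^T\!\!\int_0^1 u^{6+3\varepsilon_1}\, dx\, dt \le C. \]
Now as in Step 2 we can choose $\varepsilon$ as large as possible such that
\[ \frac{\varepsilon(1 + \varepsilon + q)}{1 + \varepsilon} \le 6 + 3\varepsilon_1. \tag{3.17} \]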
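The largest value of $\varepsilon$ satisfying the inequality (3.17) is
\[ \varepsilon = \frac{-(q - 5 - 3\varepsilon_1) + \sqrt{(q - 5 - 3\varepsilon_1)^2 + 4(6 + 3\varepsilon_1)}}{2}. \]
We define $\varepsilon_2$ to be the right-hand side of the above equation. By continuing this process, we obtain a sequence $\varepsilon_n$ which is defined as follows:
\[ \varepsilon_{n+1} = \frac{-(q - 5 - 3\varepsilon_n) + \sqrt{(q - 5 - 3\varepsilon_n)^2 + 4(6 + 3\varepsilon_n)}}{2}, \quad n = 1, 2, \ldots. \tag{3.18} \]
We now show by induction that the sequence $\varepsilon_n$ is monotonically increasing. Note that $\varepsilon_2 - \varepsilon_1 > 0$ is equivalent to
\[ 3\varepsilon_1 + \sqrt{(q - 5 - 3\varepsilon_1)^2 + 4(6 + 3\varepsilon_1)} - \sqrt{(q - 5)^2 + 24} > 0, \tag{3.19} \]
which is equivalent to
\[ \sqrt{(q - 5 - 3\varepsilon_1)^2 + 4(6 + 3\varepsilon_1)} - (q - 7 - 3\varepsilon_1) > 0. \tag{3.20} \]
If $q - 7 - 3\varepsilon_1 < 0$, then the inequality (3.20) holds automatically. If $q - 7 - 3\varepsilon_1 \ge 0$, then a simple calculation shows that the inequality (3.20) holds as long as $q > 0$. This concludes the proof that $\varepsilon_2 > \varepsilon_1$. Now we assume that $\varepsilon_n > \varepsilon_{n-1}$. From the definition,
\[ \varepsilon_{n+1} - \varepsilon_n = \frac{3(\varepsilon_n - \varepsilon_{n-1}) + \sqrt{(q - 5 - 3\varepsilon_n)^2 + 4(6 + 3\varepsilon_n)} - \sqrt{(q - 5 - 3\varepsilon_{n-1})^2 + 4(6 + 3\varepsilon_{n-1})}}{2}. \]
It follows that $\varepsilon_{n+1} - \varepsilon_n > 0$ is equivalent to
\[ 9(\varepsilon_n - \varepsilon_{n-1})^2 + 6(\varepsilon_n - \varepsilon_{n-1}) \sqrt{(q - 5 - 3\varepsilon_n)^2 + 4(6 + 3\varepsilon_n)} + (q - 5 - 3\varepsilon_n)^2 + 4(6 + 3\varepsilon_n) > (q - 5 - 3\varepsilon_{n-1})^2 + 4(6 + 3\varepsilon_{n-1}). \]
That is,
\[ \sqrt{(q - 5 - 3\varepsilon_n)^2 + 4(6 + 3\varepsilon_n)} - [q - 7 - 3\varepsilon_n] > 0. \tag{3.21} \]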
If $q - 7 - 3\varepsilon_n < 0$, the above inequality holds automatically. If $q - 7 - 3\varepsilon_n \ge 0$, then the inequality (3.21) holds as long as $q > 0$. Let
$$\varepsilon^* = \lim_{n\to+\infty}\varepsilon_n.$$
From the definition of $\varepsilon_n$, we see that $\varepsilon^*$, if it is finite, satisfies
$$\varepsilon^* = \frac{-(q-5-3\varepsilon^*) + \sqrt{(q-5-3\varepsilon^*)^2 + 4(6+3\varepsilon^*)}}{2}. \tag{3.22}$$
Equation (3.22) is equivalent to
$$2\varepsilon^{*2} + (8-q)\varepsilon^* + 6 = 0.$$
Solving this equation, we find that
$$\varepsilon^* = \frac{(q-8) \pm \sqrt{(q-8)^2 - 48}}{4}.$$
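The recursion (3.18) and its possible limit can be explored numerically. The following Python sketch is only an illustration: the function names `eps_first`, `eps_next`, `eps_sequence` and `smaller_root` are ours and do not appear in the paper. It starts from $\varepsilon_1$ of Step 3, iterates (3.18), and compares the outcome with the smaller root of $2\varepsilon^{*2} + (8-q)\varepsilon^* + 6 = 0$ when that root is real.

```python
import math


def eps_first(q):
    """epsilon_1 from Step 3: largest root of eps^2 + (q - 5) eps - 6 = 0."""
    return (-(q - 5) + math.sqrt((q - 5) ** 2 + 24)) / 2


def eps_next(q, eps):
    """One step of the recursion (3.18)."""
    a = q - 5 - 3 * eps
    return (-a + math.sqrt(a * a + 4 * (6 + 3 * eps))) / 2


def eps_sequence(q, n):
    """Return [eps_1, ..., eps_n] generated by (3.18)."""
    seq = [eps_first(q)]
    for _ in range(n - 1):
        seq.append(eps_next(q, seq[-1]))
    return seq


def smaller_root(q):
    """Smaller root of 2 e^2 + (8 - q) e + 6 = 0; needs (q - 8)^2 >= 48."""
    return ((q - 8) - math.sqrt((q - 8) ** 2 - 48)) / 4


if __name__ == "__main__":
    # q = 10: (q - 8)^2 - 48 < 0, so (3.22) has no real root and eps_n keeps growing.
    print(eps_sequence(10.0, 6))
    # q = 16: the roots are real and eps_n increases towards the smaller one.
    print(eps_sequence(16.0, 6), smaller_root(16.0))
```

For the two sample values of $q$ chosen here, the sequence is increasing in both cases, in line with the induction argument; it is unbounded in the first case and bounded by the smaller root in the second.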
When $(q-8)^2 - 48 < 0$, then there is no real root of (3.22). This implies that $\varepsilon^* = \infty$. In this case, we can choose $n$ large enough such that $p = \frac{1+\varepsilon_n+q}{q} > \frac{3}{2}$. With this choice of $\varepsilon_n$, the embedding theorem shows that the $L^\infty$-norm of $u$ is bounded by a constant which depends only on known data. When $(q-8)^2 - 48 \ge 0$, the largest possible $\varepsilon^*$ is
$$\varepsilon^* = \frac{(q-8) - \sqrt{(q-8)^2 - 48}}{4}.$$
This implies that if
$$0 < \varepsilon^* < \frac{q - 8 - \sqrt{(q-8)^2 - 48}}{4},$$
then
$$\frac{\varepsilon^*(1+\varepsilon^*+q)}{1+\varepsilon^*} < 6 + 3\varepsilon^*.$$
Therefore we can estimate the final term in (3.16) to obtain
$$\|u\|_{W_p^{2,1}(Q_T)} \le C,$$
where
$$p = \frac{1+\varepsilon^*+q}{q}$$
and $C$ depends only on known data.

As a direct consequence of Theorem 3.2, we have the following existence result.

Corollary 3.3. Under the assumptions of Theorem 3.2, the problem (1.1)–(1.4) has a unique global smooth solution if $0 \le q < 8 + 4\sqrt{3}$.

Proof. We can assume that $q \ge 6$ from Step 2. Then the condition $(q-8)^2 - 48 < 0$ reduces to $q < 8 + 4\sqrt{3}$. Then, when $0 \le q < 8 + 4\sqrt{3}$, we have the following a priori estimate for any $p > 1$:
$$\|u\|_{W_p^{2,1}(Q_T)} \le C.$$
Sobolev's embedding yields
$$\|u\|_{C^{1+\alpha,\frac{1+\alpha}{2}}(\bar{Q}_T)} \le C,$$
where $C$ depends only on known data. The rest of the proof exactly follows from [9].
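The range of $q$ in Corollary 3.3 combines the condition $q < 6$ from Step 2 with the condition $(q-8)^2 - 48 < 0$ from Step 3. As a quick sanity check, the following Python sketch (the names `Q_MAX` and `linf_bound_available` are ours, introduced only for illustration) confirms on a grid of sample values that the union of the two conditions is the interval $0 \le q < 8 + 4\sqrt{3}$.

```python
import math

# Threshold appearing in Corollary 3.3, approximately 14.93.
Q_MAX = 8 + 4 * math.sqrt(3)


def linf_bound_available(q):
    """True if Step 2 (q < 6) or Step 3 (negative discriminant) applies."""
    step2 = 0 <= q < 6
    step3 = (q - 8) ** 2 - 48 < 0
    return q >= 0 and (step2 or step3)


if __name__ == "__main__":
    for q in (0.0, 1.0, 6.0, 14.75, 15.0, 20.0):
        print(q, linf_bound_available(q), q < Q_MAX)
```

The two conditions overlap because $8 - 4\sqrt{3} < 6$, so no value of $q$ below the threshold is left uncovered.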
Remark 3.1. Again we point out that we do not know if the number $8 + 4\sqrt{3}$ in Corollary 3.3 is optimal for the existence of a global solution to (1.1)–(1.4). When $\sigma(u) = u^q$ with any $q > 0$, the a priori estimates in Theorem 3.2 hold. However, these are not strong enough to allow us to show the existence of a global weak solution to (1.1)–(1.4). Numerical experiments suggest that the temperature will blow up in finite time if $q$ is large.

References
1. Boccardo, L. and Gallouet, T.: Nonlinear Elliptic and Parabolic Equations involving measure data. J. Funct. Anal. 87, 149–169 (1989)
2. Ginibre, J. and Velo, G.: The Cauchy problem for the O(N), CP(N−1), and G_C(N, p) models. Ann. Phys. 142, no. 2, 393–415 (1982)
3. Kriegsmann, G.A.: Microwave heating of dispersive media. SIAM J. Appl. Math. 53, 655–669 (1993)
4. Landau, L.D. and Lifshitz, E.M.: Electrodynamics of Continuous Media. New York: Pergamon Press, 1960
5. Ladyzenskaja, O.A., Solonnikov, V.A. and Ural'ceva, N.N.: Linear and Quasi-linear Equations of Parabolic Type. AMS Trans. 23, Providence, R.I.: American Math. Soc., 1968
6. Metaxas, A.C. and Meredith, R.J.: Industrial Microwave Heating. I.E.E. Power Engineering Series, Vol. 4, London: Peter Peregrinus Ltd., 1983
7. Moser, J.: A Harnack inequality for parabolic differential equations. Comm. on Pure and Appl. Math. 17, 101–134 (1964)
8. Protter, M.H. and Weinberger, H.F.: Maximum principles in Differential Equations. New York: Springer-Verlag, 1984
9. Yin, H.M.: On Maxwell's equations in an electromagnetic field with the temperature effect. Notre Dame preprint series #253, 1996, to appear in SIAM Journal of Mathematical Analysis

Communicated by H. Araki
Commun. Math. Phys. 194, 359 – 388 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Renormalization Group Pathologies and the Definition of Gibbs States

J. Bricmont1,*, A. Kupiainen2,**, R. Lefevere1

1 UCL, Physique Théorique, B-1348, Louvain-la-Neuve, Belgium. E-mail: [email protected], [email protected]
2 Helsinki University, Department of Mathematics, Helsinki 00014, Finland. E-mail: [email protected]

* Supported by EC grant CHRX-CT93-0411
** Supported by NSF grant DMS-9205296 and EC grant CHRX-CT93-0411

Received: 30 April 1997 / Accepted: 27 October 1997

Dedicated to the memory of Roland L. Dobrushin

Abstract: We show that the so-called Renormalization Group pathologies in low temperature Ising models are due to the fact that the renormalized Hamiltonian is defined only almost everywhere (with respect to the renormalized Gibbs measures). We construct this renormalized Hamiltonian using a Renormalization Group method developed for random systems and we show that the pathologies are analogous to Griffiths' singularities.

1. Introduction

The Renormalization Group (RG) has been one of the most useful tools of theoretical physics during the past decades. It has led to an understanding of universality in the theory of critical phenomena and of the divergences in quantum field theories. It has also provided a nonperturbative calculational framework as well as the basis of a rigorous mathematical understanding of these theories. Even though the RG was primarily devised for the study of (approximately) scale invariant situations such as statistical mechanical models at the critical point, it was found useful in the mathematical analysis of problems that were not critical but that nevertheless were “multiscale”: for example, first order phase transitions in regular [12] and disordered [2] spin systems. The spin variables give an appropriate representation of these systems at and above the critical point; however, at low temperatures, these models are most naturally expressed in terms of contours (domain walls) that separate the different ground states. To apply the RG method, one inductively sums over the small scale contours, producing an effective theory for the larger scale contours. However, the real power of the RG both theoretically and in most applications has been to realize it as a map between Hamiltonians, and the latter are usually expressed
in terms of the spin variables. So, one would like to define rigorously such a map, but this program has met some difficulties. It was observed in simulations [19] that the RG transformation seems, in some sense, “discontinuous” as a map between spin Hamiltonians. These observations led subsequently to a rather extensive discussion of the so-called “pathologies” of Renormalization Group Transformations (RGT): van Enter, Fernandez and Sokal have shown [34, 35] that, first of all, the RG transformation is not really discontinuous. But they also show, using results of Griffiths and Pearce [16, 17] and of Israel [20], that, roughly speaking, there does not exist a renormalized Hamiltonian for many RGT applied to Ising-like models at low temperatures and in some cases even at high temperatures (in particular in a large external field, see [32, 33]). More precisely, van Enter, Fernandez and Sokal consider various real-space RGT (block spin, majority vote, decimation) that can be easily and rigorously defined as maps acting on measures (i.e. on probability distributions of the infinite volume spin system). The problem occurs when one tries to rewrite this map in terms of Hamiltonians. Hamiltonians are usually expressed as sums of (n-body) interactions of the spins that have sufficient decay properties so as to define infinite volume Gibbs measures. If we start with a Gibbs measure $\mu$ corresponding to a given Hamiltonian $H$, one can easily define the renormalized measure $\mu'$. The problem then is to reconstruct a renormalized Hamiltonian $H'$ (i.e. a set of interactions) for which $\mu'$ is a Gibbs measure. Although this is trivial in finite volume, it is not so in the thermodynamic limit, and it is shown in [35] that, in many cases at low temperatures, even if $H$ contains only nearest-neighbour interactions, there is no absolutely summable interaction (defined in (2.7) below) giving rise to a Hamiltonian $H'$ for which $\mu'$ is a Gibbs measure.
What is one supposed to think about these pathologies? From a practical point of view, we understand the reason why there might be problems: one is using the wrong variables, i.e. the spin variables rather than the contours variables. The fact that the usefulness of the RG method depends crucially on choosing the right variables has been known for a long time. The “good” variables should be such that a single RG transformation, which can be interpreted as solving the statistical mechanics of the small scale variables with the large ones kept fixed, should be “noncritical”, i.e. should be away from the parameter regions where phase transitions occur. This is true in particular in the low temperature region if one uses the contour variables. In all the cases where pathologies were found, they were due to the fact that a single RG transformation involves a system that has a phase transition for some fixed values of the large scale variables. However, from the theoretical point of view, we believe that it is interesting to see just how pathological the renormalized Hamiltonians are. This question is related to another one, of independent interest: when is a measure Gibbsian for some Hamiltonian? For example, Schonmann showed [31] that, when one projects a Gibbs measure (at low temperatures) to the spins attached to a lattice of lower dimension, the resulting measure is not, in general, Gibbsian. There has been an extensive investigation of this problem of pathologies and Gibbsianness. Martinelli and Olivieri [29, 30] have shown that, in a non-zero external field, the pathologies disappear after sufficiently many decimations. Fernandez and Pfister [9] study the set of configurations that are responsible for those pathologies. They give criteria which hold in particular in a non-zero external field, and which imply that this set is of zero measure with respect to the renormalized measures. 
The renormalized Hamiltonian has been studied at and above the critical temperature by several authors [1, 3, 4, 5, 10, 11, 18, 21]. Our goal in this paper is to further clarify the situation: following an idea of Dobrushin [7], we prove that, for several examples considered in [35], the renormalized Hamiltonian actually exists, but the corresponding interaction satisfies a weaker summability condition than the one used in [35]. Our condition ((2.11) below) is however sufficient to define Gibbs measures, in a way that is similar to the one used before for “unbounded spins”. Thus in a sense, the pathologies are not there in the end, and the renormalized measures are Gibbsian. However, it turns out that the renormalized Hamiltonian is only defined almost everywhere with respect to the renormalized measure: it becomes “pathological” on a set of measure zero that in particular includes the configurations used in [35] to exhibit the pathologies. Our result is similar to the one of Maes and Vande Velde [28] on the Schonmann example of the measures projected on a lattice of lower dimension, except that, in our case, we show that the two renormalized states are Gibbsian with respect to the same Hamiltonian (while this question is left open in [28]). However, as pointed out to us by A. Sokal, with our definition of Gibbs states, the interaction defining the Hamiltonian is not unique, while it is essentially unique (up to physical equivalence) with the usual definition. Of course, we do not claim to justify the RG method in general, and, besides, our results do not hold near the critical point. We do not even prove that, upon iteration, the RGT drives the Hamiltonians to a trivial fixed point, although this can probably be done, in some of the examples discussed below. But we do clarify the nature of the “pathologies”. Our proof is based on the following idea: we consider the spins distributed by the renormalized measures as random external fields acting on the original system, and we apply the methods of [2] to construct the renormalized Hamiltonian. So, we consider a single transformation of the Renormalization Group but in order to construct the Hamiltonian on the image spin system, we iterate some RG transformations acting on the system with random fields.
As in all random systems, there is a set of measure zero of “bad” configurations of the random fields, for which “typical” results (e.g. decay of correlations) do not hold, and which are responsible for the Griffiths’ singularities ([15]). We shall see below that the pathologies used in [35] are actually due to those bad configurations. But, once one excludes this set of measure zero, the renormalized system has a nice Hamiltonian, with rapidly decaying interactions.
2. Results

We consider the nearest-neighbour Ising model on $\mathbb{Z}^d$, $d \ge 2$, at $\beta$ large, for simplicity. To each $i \in \mathbb{Z}^d$, we associate a variable $\sigma_i \in \{-1,+1\}$, and the (formal) Hamiltonian is
$$-\beta H = \beta \sum_{\langle ij\rangle} (\sigma_i\sigma_j - 1), \tag{2.1}$$
where $\langle ij\rangle$ denotes a nearest-neighbour pair and $\beta$ is the inverse temperature. At low temperatures, there are two extremal translation invariant Gibbs measures corresponding to (2.1), $\mu_+$ and $\mu_-$. To define our RGT, let $\mathcal{L} = (b\mathbb{Z})^d$, $b \in \mathbb{N}$, $b \ge 2$ and cover $\mathbb{Z}^d$ with disjoint $b$-boxes $B_x = B_0 + x$, $x \in \mathcal{L}$, where $B_0$ is a box of size $b$ centered around 0. Associate to each $x \in \mathcal{L}$ a variable $s_x \in \{-1,+1\}$, denote by $\sigma_A$ an element of $\{-1,+1\}^A$, for $A \subset \mathbb{Z}^d$, $|A| < \infty$, and introduce the probability kernels $T_x = T(\sigma_{B_x}, s_x)$ for $x \in \mathcal{L}$, i.e. $T_x$ satisfies
$$\text{1)}\ \ T(\sigma_{B_x}, s_x) \ge 0, \qquad \text{2)}\ \ \sum_{s_x} T(\sigma_{B_x}, s_x) = 1. \tag{2.2}$$
For any measure $\mu$ on $\{-1,+1\}^{\mathbb{Z}^d}$, we denote by $\mu(\sigma_A)$ the probability of the configuration $\sigma_A$.

Definition. Given a measure $\mu$ on $\{-1,+1\}^{\mathbb{Z}^d}$, the renormalized measure $\mu'$ on $\Omega = \{-1,+1\}^{\mathcal{L}}$ is defined by:
$$\mu'(s_A) = \sum_{\sigma_{B_A}} \mu(\sigma_{B_A}) \prod_{x\in A} T(\sigma_{B_x}, s_x), \tag{2.3}$$
where $B_A = \cup_{x\in A} B_x$, $A \subset \mathcal{L}$, $|A| < \infty$, and $s_A \in \Omega_A = \{-1,+1\}^A$. It is easy to check, using conditions 1) and 2) of (2.2), that $\mu'$ is a measure. We shall call the spins $\sigma_i$ the internal spins and the spins $s_x$ the external ones (they are also sometimes called the block spins). We shall need two other conditions on $T$: we assume that $T$ is symmetric:
$$T(\sigma_{B_x}, s_x) = T(-\sigma_{B_x}, -s_x) \tag{2.4}$$
and that
$$0 \le T(\sigma_{B_x}, s_x) \le e^{-\beta} \tag{2.5}$$
if $\sigma_i \ne s_x$ $\forall i \in B_x$. The condition (2.5) means that there is a coupling which tends to align $s_x$ and the spins in the block $B_x$. It would be more natural to have, instead of (2.5), $0 \le T \le \epsilon$ (with $\epsilon$ independent of $\beta$ but small enough). However, assuming (2.5) simplifies the proofs. Note that (2.2), (2.4), (2.5) imply that
$$\bar{T} \equiv T(\{\sigma_i = +1\}_{i\in B_x}, +1) = T(\{\sigma_i = -1\}_{i\in B_x}, -1) \ge 1 - e^{-\beta} \ge \frac{1}{2} \tag{2.6}$$
for $\beta$ large. The usual transformations, discussed in [35], Sect. 3.1.2, like decimation or majority rule, obviously satisfy (2.4), (2.5). The Kadanoff transformation satisfies (2.5) for $p$ large. Our results could be extended to the block spin transformations (where $s_x$ does not belong to $\{-1,+1\}$).

Let us now summarize the main result of [35]. Consider interactions $\Phi = (\Phi_X)$, which are families of functions $\Phi_X : \Omega_X \to \mathbb{R}$, indexed by $X \subset \mathcal{L}$, $|X| < \infty$. Assume that $\Phi$ is
a) translation invariant: $\Phi_X = \Phi_{X+x}$ $\forall x \in \mathcal{L}$,
b) uniformly absolutely summable:
$$\sum_{X \ni 0} \|\Phi_X\|_\infty < \infty. \tag{2.7}$$
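The conditions (2.2), (2.4) and (2.5) on the kernel $T$ can be verified by brute-force enumeration over one block. The Python sketch below is only an illustration under stated assumptions: it uses the standard form of the Kadanoff kernel, $T(\sigma_B, s) = \exp(p\,s\sum_{i\in B}\sigma_i)/\bigl(2\cosh(p\sum_{i\in B}\sigma_i)\bigr)$, which is not written out in the text above, and the names `kadanoff_kernel` and `satisfies_kernel_conditions` are ours.

```python
import itertools
import math


def kadanoff_kernel(sigma_block, s, p):
    """Assumed standard Kadanoff kernel T(sigma_B, s) with parameter p."""
    m = sum(sigma_block)
    return math.exp(p * s * m) / (2 * math.cosh(p * m))


def satisfies_kernel_conditions(T, block_size, beta, tol=1e-12):
    """Check (2.2), (2.4) and (2.5) for every spin configuration of one block."""
    for sigma in itertools.product((-1, 1), repeat=block_size):
        values = {s: T(sigma, s) for s in (-1, 1)}
        # (2.2): non-negativity and normalization over the block spin s.
        if min(values.values()) < 0 or abs(sum(values.values()) - 1) > tol:
            return False
        flipped = tuple(-x for x in sigma)
        for s in (-1, 1):
            # (2.4): symmetry under a global spin flip.
            if abs(values[s] - T(flipped, -s)) > tol:
                return False
            # (2.5): small weight when every internal spin disagrees with s.
            if all(x != s for x in sigma) and values[s] > math.exp(-beta):
                return False
    return True


if __name__ == "__main__":
    strong = lambda sigma, s: kadanoff_kernel(sigma, s, 2.0)
    weak = lambda sigma, s: kadanoff_kernel(sigma, s, 0.01)
    print(satisfies_kernel_conditions(strong, 4, 1.0))  # p large relative to beta
    print(satisfies_kernel_conditions(weak, 4, 5.0))    # p too small for (2.5)
```

The second call illustrates why (2.5) requires $p$ large: for small $p$ the kernel is nearly uniform and a fully misaligned block still receives weight close to $1/2$.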
This set of interactions obviously forms a Banach space, with the norm (2.7) (note that our terminology differs slightly from the one of [35]: we add the word “uniformly” to underline the difference with respect to condition (2.11) below). Then, one defines, $\forall V \subset \mathcal{L}$, $|V| < \infty$, the Hamiltonian
$$H(s_V|\bar{s}_{V^c}) = -\sum_{X\cap V \ne \emptyset} \Phi_X(s_{X\cap V} \vee \bar{s}_{X\cap V^c}), \tag{2.8}$$
where $s_V \in \Omega_V$, $\bar{s}_{V^c}$ is the restriction to $V^c$ of $\bar{s} \in \Omega$ and, for $X \cap Y = \emptyset$, $s_X \vee s_Y$ denotes the obvious configuration in $\Omega_{X\cup Y}$. The condition (2.7) implies that $H$ is a continuous function of $s_V$ and $\bar{s}_{V^c}$ (in the product topology), and that
$$\pi_V^{\Phi}(s_V|\bar{s}_{V^c}) = Z^{-1}(\bar{s}_{V^c}) \exp(-H(s_V|\bar{s}_{V^c})) \tag{2.9}$$
(where $Z^{-1}(\bar{s}_{V^c})$ is the obvious normalization factor) defines a quasilocal specification in the sense of [35].

Definition. $\mu$ is a Gibbs measure for $\Phi$ if the conditional probabilities satisfy, $\forall V \subset \mathcal{L}$, $|V|$ finite, $\forall s_V \in \Omega_V$,
$$\mu(s_V|\bar{s}_{V^c}) = \pi_V^{\Phi}(s_V|\bar{s}_{V^c}) \quad \mu \text{ a.e.} \tag{2.10}$$
Note that we left out the inverse temperature $\beta$. When we refer to $\beta$ below, we mean the inverse temperature of the original Ising model (2.1), before acting with the RGT. The main result of [35] is that, for a variety of RGT, there is no interaction satisfying a) and b) above for which $\mu'_+$ or $\mu'_-$ are Gibbs measures. However, we observe that, in order to define $H(s_V|\bar{s}_{V^c})$ and $\pi_V^{\Phi}(s_V|\bar{s}_{V^c})$, it is not necessary to assume (2.7); it is enough to assume the existence of a tail set $\bar{\Omega} \subset \Omega$ on which the following pointwise bounds hold:
b') $\bar{\Omega}$-pointwise absolutely summable:
$$\sum_{X\ni x} |\Phi_X(s_X)| < \infty \quad \forall x \in \mathcal{L},\ \forall s \in \bar{\Omega}. \tag{2.11}$$
We shall therefore enlarge the class of “allowed” interactions by dropping the condition (2.7) and assuming (2.11) instead. Actually, a similar setup was already used in the theory of “unbounded spins” with infinite range interactions, see [13]. With this condition, we can define the specification $\pi^{\Phi,\bar{\Omega}}$ by
$$\pi_V^{\Phi,\bar{\Omega}}(s_V|\bar{s}_{V^c}) = \begin{cases} Z^{-1}(\bar{s}_{V^c}) \exp(-H(s_V|\bar{s}_{V^c})) & \text{for } \bar{s} \in \bar{\Omega}, \\ 0 & \text{for } \bar{s} \notin \bar{\Omega}, \end{cases} \tag{2.12}$$
and then define Gibbs measures for the pair $(\Phi, \bar{\Omega})$ as follows:

Definition. Given a tail set $\bar{\Omega} \subset \Omega$, $\mu$ is a Gibbs measure for the pair $(\Phi, \bar{\Omega})$ if $\mu(\bar{\Omega}) = 1$, and there exists a version of the conditional probabilities that satisfy, $\forall V \subset \mathcal{L}$, $|V|$ finite, $\forall s_V \in \Omega_V$,
$$\mu(s_V|\bar{s}_{V^c}) = \pi_V^{\Phi,\bar{\Omega}}(s_V|\bar{s}_{V^c}) \quad \forall \bar{s} \in \bar{\Omega}. \tag{2.13}$$
Since conditional probabilities are defined almost everywhere, this definition is very similar to the usual one. However, when condition (2.7) holds, the conditional probabilities can be extended everywhere, and are continuous, which is not the case here. We can now state our main result:

Theorem 1. Under assumptions (2.4), (2.5) on $T$, and for $\beta$ large enough, there exist disjoint translation-invariant (hence, tail) sets $\bar{\Omega}_+, \bar{\Omega}_- \subset \Omega$ such that $\mu'_+(\bar{\Omega}_+) = \mu'_-(\bar{\Omega}_-) = 1$ and an interaction $\Phi$ satisfying a) and b') with $\bar{\Omega} = \bar{\Omega}_+ \cup \bar{\Omega}_-$ such that $\mu'_+$ and $\mu'_-$ are Gibbs measures for the pair $(\Phi, \bar{\Omega})$.

Remarks. 1. It should be emphasized that we are able to prove that there exists an interaction for which both $\mu'_+$ and $\mu'_-$ are Gibbsian (in the sense defined above). Moreover, this interaction has stronger decay properties than (2.11), see (3.5) below. Thus, the result is different from the one of Maes and Vande Velde [28] on the projected Gibbs measure: they show that both the “plus” and the “minus” projected measures are Gibbsian (in the same sense as here) for some interaction, but not necessarily for the same one.

2. One can distinguish different types of models where “pathologies” occur. Our framework is the one where the pathologies are the weakest. In the case of [28], it is an open question whether one can take the same interaction for $\mu'_+$ and $\mu'_-$. But if one combines projection with enough decimation, as in [24], then one knows that each of the resulting states is Gibbsian (in the strongest sense, i.e. with interactions satisfying (2.7)), but for different interactions. This in turn implies that non-trivial convex combinations of these states are not quasilocal everywhere, see [36], where other examples of “robust” non-Gibbsianness can be found.

3. Note that in the theory of “unbounded spins” with long range interactions, a set of “allowed” configurations has to be introduced, where a bound like (2.11) holds [13].
But here, of course, contrary to the unbounded spins models, each $\|\Phi_X\|_\infty$ is finite. We even have the bound $\|\Phi_X\|_\infty \le C\beta|X|$, see (3.4) below.

4. The set $\bar{\Omega} = \bar{\Omega}_+ \cup \bar{\Omega}_-$ is not “nice” topologically: e.g. it has an empty interior (in the usual product topology). Besides, our effective potentials do not belong to a natural Banach space like the one defined by (2.7). However, this underlines the fact that the concept of Gibbs measure is a measure-theoretic notion and the latter often does not match with topological notions.

5. We can regard $\{s_x\}$ as a set of quenched external fields coupled to the spins $\sigma$ by the probability kernels $T$. The distribution of $\{s_x\}$ is given by $\mu'_+$ or $\mu'_-$. Then, as in most disordered systems, there is a set of “good” configurations of the random fields ($\bar{\Omega}$ here) for which the system with the spins $\sigma$ has good clustering properties. And, implicitly, we shall use the latter to construct our Hamiltonian. This is why part of the proof below uses the techniques of [2, 12]. Of course, there are also “bad” configurations of the random fields for which the $\sigma$ spins do not have good clustering properties, and those are essentially the ones used in [35] to prove that there is no absolutely summable potential for which $\mu'_+$, $\mu'_-$ are Gibbsian.

6. To illustrate the role of the set $\bar{\Omega}$, consider the (trivial) case, where $b = 1$, and $T = \delta(\sigma_i - s_x)$ with $i = x$, i.e. the “renormalized” system is identical to the original one (this example was suggested to us by A. Sokal). Then $\bar{\Omega}$, as constructed in our proof, will be the set of configurations such that all the (usual) Ising contours are finite and each site is surrounded by at most a finite number of contours. When $X$ = a contour $\gamma$ (considered as a suitable set of sites), we let
$$\Phi_X(s_X) = 2\beta|\gamma| \tag{2.14}$$
for $s_X$ = a configuration making $\gamma$ a contour, and $\Phi_X(s_X) = 0$ otherwise. Obviously, this $\Phi$ satisfies (2.11) but not (2.7). One can write $\bar{\Omega} = \bar{\Omega}_+ \cup \bar{\Omega}_-$, according to the values of the spins in the infinite connected component of the complement of the contours. It is easy to see that $\mu_+$, $\mu_-$ are indeed, at low temperatures, Gibbs measures (in the sense considered here) for this new interaction: a Peierls argument shows that $\mu_+(\bar{\Omega}_+) = \mu_-(\bar{\Omega}_-) = 1$, and for $s \in \bar{\Omega}$ the (formal) Hamiltonian (2.1) is $\beta H = 2\beta \sum_\gamma |\gamma|$. Actually, we shall prove the theorem by using a kind of perturbative analysis around this example. Of course, in this example one could alternatively take $\bar{\Omega} = \Omega$ and $\Phi$ = the original nearest-neighbor interaction; this shows the nonuniqueness of the pair $(\Phi, \bar{\Omega})$ in our generalized Gibbs-measure framework.

3. Outline of the Proof

3.1. The main propositions. We shall construct the interaction $\Phi$ inductively. We shall now give the strategy and indicate the different steps of the proof. Consider $\mu'_+$; using (2.3), one sees that the conditional probabilities $\mu'_+(s_V|\bar{s}_{V^c})$ can be obtained through the following limit, if it exists:
$$\mu'_+(s_V|\bar{s}_{V^c}) = \lim_{\Lambda_2\uparrow\mathcal{L}}\ \lim_{\Lambda_1\uparrow\mathbb{Z}^d} \frac{Z^+_{\Lambda_1}(s_V \vee \bar{s}_{V^c\cap\Lambda_2})}{\sum_{\tilde{s}_V} Z^+_{\Lambda_1}(\tilde{s}_V \vee \bar{s}_{V^c\cap\Lambda_2})}, \tag{3.1}$$
where
$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = \sum_{\sigma_{\Lambda_1}} \prod_{x\in\Lambda_2} T_x\, e^{-\beta H(\sigma_{\Lambda_1}|+)}, \tag{3.2}$$
where $H(\sigma_{\Lambda_1}|+)$ is defined as in (2.1), but with the sum restricted to $i \in \Lambda_1$, with $\sigma_j = +1$, $\forall j \in \Lambda_1^c$. The conditional probabilities $\mu'_-(s_V|\bar{s}_{V^c})$ can be obtained by similar formulas, with $+$ replaced by $-$. We shall prove

Proposition 1. Under assumptions (2.4), (2.5), there exists, for all $\eta > 0$, a $\bar{\beta} < \infty$, such that, $\forall \beta \ge \bar{\beta}$ in (2.1), (2.5), there exists a set $\bar{\Omega} \subset \Omega$, $\bar{\Omega} = \bar{\Omega}_+ \cup \bar{\Omega}_-$, and an interaction $\Phi$, such that, $\forall s^1, s^2 \in \bar{\Omega}_+$, $\forall V \subset \mathcal{L}$, $|V| < \infty$,
$$\lim_{\Lambda_2\uparrow\mathcal{L}}\ \lim_{\Lambda_1\uparrow\mathbb{Z}^d} \frac{Z^+_{\Lambda_1}(s^1_{\Lambda_2})}{Z^+_{\Lambda_1}(s^2_{\Lambda_2})} = \exp\Bigl(\sum_{X\cap V\ne\emptyset} \bigl[\Phi_X(s^1_X) - \Phi_X(s^2_X)\bigr]\Bigr) \tag{3.3}$$
if $s^1_x = s^2_x$ $\forall x \notin V$. The functions $\Phi_X$ satisfy:
$$\|\Phi_X\|_\infty \le c\beta|X| \tag{3.4}$$
for some $c < \infty$, and, $\forall x \in \mathcal{L}$, $\forall s \in \bar{\Omega}$,
$$\sum_{X\ni x} |\Phi_X(s_X)| \exp(d(X)^{1-\eta}) = C(x,s) < \infty, \tag{3.5}$$
where $d(X) = \mathrm{diam}(X)$. A formula similar to (3.3) holds with $+$ replaced by $-$ for all $s^1, s^2 \in \bar{\Omega}_-$.

Remark. The factor $\exp(d(X)^{1-\eta})$ is not optimal; we could replace it by $\exp(d(X)^{1-\eta} + |X|^{1-\eta})$; but we expect $|\Phi_X(s_X)|$ to decay as $\exp(-d(X) - |X|)$.
Proposition 2. $\mu'_+(\bar{\Omega}_+) = \mu'_-(\bar{\Omega}_-) = 1$.

Clearly, (3.1) and these two Propositions imply Theorem 1.

Remark. The set $\bar{\Omega}$ will be of measure zero for Gibbs measures which are not convex combinations of $\mu'_+$ and $\mu'_-$, such as the non-translation-invariant Gibbs measures with interfaces that exist for $d \ge 3$ [6]. An open question is to find another set $\tilde{\Omega}$, of $\mu'$ measure one, for the renormalized measure corresponding to a non-translation invariant Gibbs measure $\mu$ (for $d \ge 3$), and an interaction with respect to which $\mu'$ is Gibbsian.

3.2. The contour representation. To prove these propositions, we shall use a “contour”, or “polymer” representation of $Z^+_{\Lambda_1}(s_{\Lambda_2})$. But, since we regard the external spins as random fields acting on the internal ones, we shall first define the sets where the external spins are “bad”, namely where they change sign and exert opposite influences on the internal spins. Let
$$D(s) = \bigcup \{B_x \mid x \in \mathcal{L},\ \exists y \in \mathcal{L},\ |x-y| = b,\ s_x \ne s_y\}, \tag{3.6}$$
where we use throughout the paper:
$$|x| = \max_{i=1,\dots,d} |x_i|. \tag{3.7}$$
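The definition (3.6) of the “bad” set can be made concrete on a finite window of external spins. In the following Python sketch (all names are ours and purely illustrative), a site $x \in \mathcal{L}$ is represented by its integer block index $x/b$, so that the condition $|x - y| = b$ in the norm (3.7) becomes sup-norm distance 1 between block indices. Restricting to a finite window, with sites outside the window simply ignored, is an assumption made only to keep the example self-contained.

```python
import itertools


def sup_norm(x):
    """The norm (3.7): maximum of the absolute values of the coordinates."""
    return max(abs(c) for c in x)


def bad_block_sites(s):
    """Block indices x whose box B_x belongs to D(s), cf. (3.6).

    s maps a block index (tuple of ints, standing for x / b) to +1 or -1.
    A site is bad if some neighbour at sup-norm distance 1 carries the
    opposite external spin. Neighbours outside the window are ignored.
    """
    bad = set()
    for x, sx in s.items():
        for step in itertools.product((-1, 0, 1), repeat=len(x)):
            if sup_norm(step) != 1:
                continue  # skips only the zero step
            y = tuple(a + e for a, e in zip(x, step))
            if y in s and s[y] != sx:
                bad.add(x)
                break
    return bad


if __name__ == "__main__":
    # 5 x 5 window of external spins, all +1 except a single -1 in the middle.
    window = {(i, j): 1 for i in range(5) for j in range(5)}
    window[(2, 2)] = -1
    print(sorted(bad_block_sites(window)))
```

In this example the flipped site and all of its sup-norm neighbours are bad, so $D(s)$ consists of the $3 \times 3$ group of boxes around the flipped spin.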
$D(s)$ is determined by the set of (ordinary) contours of the configuration $s$. Define also
$$D_{\Lambda_2} \equiv D^+(s_{\Lambda_2}) \tag{3.8}$$
with
$$D^+(s_{\Lambda_2}) = \bigcup \{B_x \mid x \in \Lambda_2,\ \exists y \in \Lambda_2,\ |x-y| = b,\ s_x \ne s_y\}\ \cup\ \bigcup \{B_x \mid x \in \partial\Lambda_2,\ s_x = -1\}, \tag{3.9}$$
where
$$\partial\Lambda_2 = \{x \in \Lambda_2,\ d(x, \Lambda_2^c) = b\}. \tag{3.10}$$
($d$ is the distance corresponding to (3.7).) $D^-(s_{\Lambda_2})$ is defined similarly, with $s_x = +1$ instead of $s_x = -1$. Now, we introduce the “contours” of the internal spins: let, for each term in (3.2)
$$\Gamma(\sigma_{\Lambda_1}|+) = \bigcup \{B_x \mid x \in \Lambda_2,\ \forall i \in B_x,\ \sigma_i \ne s_x\}\ \cup\ \bigcup \{B_x \subset \Lambda_1 \mid \exists \langle ij\rangle,\ i \in B_x,\ \sigma_i \ne \sigma_j\}\ \cup\ D_{\Lambda_2}, \tag{3.11}$$
where $\sigma_i = +1$ for $i \notin \Lambda_1$. So $\Gamma$ includes the boxes $B_x$ where all the internal spins differ from $s_x$, and the boxes intersected by the usual contours of the configuration $\sigma_{\Lambda_1}$, plus all the sites in $D_{\Lambda_2}$. Of course, these sets are not disjoint: in a box $B_x$ belonging to $D_{\Lambda_2}$, $x \notin \partial\Lambda_2$, either all internal spins in $B_x$ differ from $s_x$ or all internal spins differ from $s_y \ne s_x$ in some box $B_y$ with $|x-y| = b$, or there is a pair $\langle ij\rangle$ with $\sigma_i \ne \sigma_j$ in $B_x \cup B_y$. We include $D_{\Lambda_2}$ in the contours, and we coarse-grain them into $b$-boxes for convenience. Note that in all boxes $B_x$ not contained in $\Gamma$, the internal spins are constant (and, in $\Lambda_2$, are equal to the external ones). So, one may decompose $\Gamma = \cup\,\underline{\gamma}$ into connected components (a subset $Y$ of $\mathbb{Z}^d$ is connected if any two points of $Y$ can be joined by a path $(i_\alpha)$, with $|i_\alpha - i_{\alpha+1}| = 1$), and one may define contours as pairs $\gamma = (\underline{\gamma}, \sigma(\gamma))$, where
$\underline{\gamma}$ is the support of the contour, and $\sigma(\gamma)$ is a configuration $\{\sigma_i(\gamma)\}_{i\in\underline{\gamma}^c}$, $\sigma_i(\gamma) = +1$ or $-1$, defined on the complement of $\underline{\gamma}$, which is constant on the connected components of $\underline{\gamma}^c$ (this notion of contour will be slightly generalized below).

Definition. A family of contours $\Gamma$ is compatible if the supports of the contours are mutually disjoint: $\underline{\gamma}_1 \cap \underline{\gamma}_2 = \emptyset$, and if their signs match and agree with the boundary conditions on $\Lambda_1$.

So, the notion of compatibility is as for the usual Ising contours, and if $\Gamma$ is compatible, $\sigma_i(\Gamma)$ is unambiguously defined, $\forall i \in \underline{\Gamma}^c$.

Definition. A family of contours $\Gamma$ is $s$-compatible if $\Gamma$ is compatible and, moreover, $\sigma_x(\Gamma) = s_x$ $\forall x \in (\Lambda_2 \cap \mathcal{L})\setminus\underline{\Gamma}$.

The notion of $s$-compatibility imposes a constraint due to the external spins. For example, if all the external spins have value $+1$, a single (Ising) contour surrounding $\Lambda_2$, with $+$ spins outside and $-$ spins inside, is compatible but is not $s$-compatible. One may write:
$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = (\bar{T})^{|\Lambda_2|} \sum_{\underline{\Gamma}\supset D_{\Lambda_2}} \rho(\Gamma), \tag{3.12}$$
where the sum runs over $s$-compatible families of contours with $\underline{\Gamma} \subset \Lambda_1$, $\rho(\Gamma) = \prod_{\gamma\in\Gamma}\rho(\gamma)$ with $\rho(\gamma) = 0$ if $\underline{\gamma}$ does not contain the connected components of $D_{\Lambda_2}$ that it intersects or if $\sigma_x(\Gamma) \ne s_x$, for some $x$ with $B_x$ adjacent to $\underline{\gamma}$; it equals
$$\rho(\gamma) = {\sum_{\sigma_{\underline{\gamma}}}}^{\!*} \prod_{x\in\underline{\gamma}\cap\mathcal{L}} \frac{T_x}{\bar{T}}\, \exp(-\beta H(\sigma_{\underline{\gamma}}|\sigma(\gamma))) \tag{3.13}$$
otherwise. Here $H(\sigma_{\underline{\gamma}}|\sigma(\gamma))$ is defined in the same way as (2.8) but with the Hamiltonian (2.1); $\sigma_i(\gamma)$, $i \notin \underline{\gamma}$, is fixed by the signs associated to the complement of $\underline{\gamma}$, and the sum $\sum^*$ runs over spin configurations $\sigma_{\underline{\gamma}}$ such that $\gamma$ is a contour of the configuration $\sigma_{\underline{\gamma}} \vee \sigma(\gamma)$ (and $\rho(\gamma) = 0$ if the sum is empty); $\rho(\gamma)$ is a function of the external spins $\{s_x \mid x \in \Lambda_2,\ d(x, \underline{\gamma}\cap\mathcal{L}) \le 2b\}$, since it vanishes unless $\underline{\gamma}$ contains the connected components of $D_{\Lambda_2}$ that it intersects (observe that the property, for a set $D_i$, to be a connected component of $D_{\Lambda_2}$ depends on $\{s_x \mid x \in \Lambda_2,\ d(x, D_i\cap\mathcal{L}) \le 2b\}$). The fact that we sum in (3.12) over $s$-compatible families introduces a global constraint on the set of contours which will be characterized explicitly in Lemma 4.1 below. It is easy to see that
$$0 \le \rho(\gamma) \le \exp(-\beta_0\,|\underline{\gamma}\setminus(D_{\Lambda_2}\cap\partial\Lambda_2)|), \tag{3.14}$$
where $\beta_0 = \beta_0(b, \beta)$ depends on the choice of $b$ in the definition of $\mathcal{L}$ and goes to infinity as $\beta$ in (2.1) goes to infinity. To prove (3.14), observe that one gets a factor $e^{-2\beta}$ from the Hamiltonian (2.1) for each pair $\langle ij\rangle$ with $\sigma_i \ne \sigma_j$, and a factor $e^{-\beta}$ for each box $B_x$ such that $\forall i \in B_x$, $\sigma_i \ne s_x$, from our assumption (2.5) on the probability kernels $T_x$; for the boxes $B_x$ in $D_{\Lambda_2}$, but with $x \notin \partial\Lambda_2$, we use the observation made above that in or near each such box, either there is a pair $\langle ij\rangle$ with $\sigma_i \ne \sigma_j$ or all the internal spins differ from the external one. Finally, using the lower bound (2.6) on $\bar{T}$, we get (3.14) for
$$\beta_0 = c\beta, \tag{3.15}$$
with some $c > 0$ (which has to be taken small enough because, in the above argument, we implicitly assigned the same factor $e^{-2\beta}$ or $e^{-\beta}$ to different sites of $\underline{\gamma}$).

3.3. Renormalization. Let us now introduce the coarse-grained description of the system on which our inductive scheme is based. Let $L > b$ be some odd integer (which will be taken large enough below). Divide $\mathbb{Z}^d$ into disjoint $L$-boxes $\{i \mid |i - Lx| < \frac{L}{2}\}$, where $x \in \mathbb{Z}^d$, i.e. each $i \in \mathbb{Z}^d$ can be written as $i = Lx + j$ with $x \in \mathbb{Z}^d$ and $|j_\mu| < \frac{L}{2}$, $\mu = 1, \dots, d$ (here and below, we use the letters $x, y$ to denote sites in the new lattices $\mathbb{Z}^d$). We define $[L^{-1}i] = x$ and the $L$-box of sites $i$ such that $[L^{-1}i] = x$ is denoted by $Lx$. Also for a set $Y \subset \mathbb{Z}^d$, $[Y/L] = \{[L^{-1}i] \mid i \in Y\}$, while $LY = \cup\{Lx \mid x \in Y\}$. We use a similar notation for all scales $L^n$, $n = 1, 2, \dots$. We shall now describe $D_{\Lambda_2}$ and $D(s)$ on these different coarse-grained scales. Let us introduce the random variables $N_x^n = N_x^n(s)$, $n = 0, 1, 2, \dots$, defined inductively as follows:
$$N_x^0 = 2d \ \text{ if } x \in D(s), \qquad N_x^0 = 0 \ \text{ otherwise}, \tag{3.16}$$
$$N_{x'}^{n+1} = L^{-1+\eta} \sum_{y \in Lx' \setminus \mathcal{D}^n(s)} N_y^n, \tag{3.17}$$
where
$$\mathcal{D}^n(s) = \{D_i \mid D_i \text{ is a connected component of } D^n(s),\ |D_i| \le L^{\alpha},\ N^n(D_i) \le L^{-3\alpha}\}, \tag{3.18}$$
with $\eta$ as in Proposition 1, $\alpha = \frac{\eta}{4}$, $D^0(s) = D(s)$;
$$N^n(Y) = \sum_{x\in Y} N_x^n, \tag{3.19}$$
for $Y \subset \mathbb{Z}^d$, and
$$D^{n+1}(s) = [L^{-1}(\overline{D^n(s)\setminus\mathcal{D}^n(s)})], \tag{3.20}$$
where, for a set $Y \subset \mathbb{Z}^d$, we write:
$$\overline{Y} = \{i \mid d(i, Y) \le 1\}. \tag{3.21}$$
It is easy to see inductively that
$$N_x^n = 0 \ \text{ if } x \notin D^n(s) \tag{3.22}$$
and
$$D^n(s) \subset \{x \mid N_x^n \ne 0\}. \tag{3.23}$$
We define also variables $N^n_{x,\Lambda_2}$ and sets $D^n_{\Lambda_2}$, $\mathcal{D}^n_{\Lambda_2}$, $n = 2, \dots$, by the same formulas (3.17), (3.20), but starting with $D_{\Lambda_2}$ instead of $D(s)$ in (3.16).

Then, we define, $\forall x \in \mathbb{Z}^d$,
$$\bar{\Omega}_x = \{s \in \Omega \mid \exists n(x)\ \forall n \ge n(x),\ N_x^n = 0\}$$
and
$$\bar{\Omega} = \bigcap_{x\in\mathbb{Z}^d} \bar{\Omega}_x. \tag{3.24}$$
To understand intuitively the meaning of $\bar{\Omega}$, observe that iterating the operation (3.20) removes, at each step, the “small” connected components of $D(s)$, and “glues” or “blocks” together the “large” ones that are not too far from each other. Then, the configurations in $\bar{\Omega}$ are those for which this operation ends, after finitely many steps, for each sequence of $L^n$-boxes labelled by a given site of $\mathbb{Z}^d$. Note that in the configurations used in [35] to construct “counterexamples”, $D(s)$ covers an infinite connected subset of the lattice (and obviously $N_0^n \ne 0$, for all large enough $n$'s).

Now we can formulate the inductive representation for the partition function which will be used to prove Proposition 1. We shall need a somewhat more general notion of contour: here and below, a contour on scale $n$ will be a triple $\gamma = (\underline{\gamma}, \hat{\gamma}, \sigma(\gamma))$, where $\underline{\gamma}$ is a connected subset of $\mathbb{Z}^d$, $\hat{\gamma} \subset \underline{\gamma}$ ($\hat{\gamma}$ is not necessarily connected) and $\sigma(\gamma)$ is a collection of signs $\sigma_x(\gamma)$, $x \in \mathbb{Z}^d$ on the complement of $\underline{\gamma}$ which are constant on the connected components of the complement of $\underline{\gamma}$. However, when $\underline{\gamma} = D_i$, for $D_i$ a connected component of $D^n_{\Lambda_2}$, we shall have $\underline{\gamma} = \hat{\gamma}$, and we shall simply denote $\gamma$ by $D_i$. For those contours, on scale $n = 0$, the signs $\sigma(D_i)$ coincide with the values of the external spins in $D_i^c$. We shall define below (at the end of the proof of Proposition 3) “renormalized” values $s^n_x$ of the external spins, for each $x \in [L^{-n}\Lambda_2]\setminus D^n_{\Lambda_2}$. On scale $n$, the signs $\sigma(D_i)$ will also coincide with the values $s^n$ of the external spins in $D_i^c$. As before, a set of contours $\Gamma$ is compatible if $\underline{\gamma}_1 \cap \underline{\gamma}_2 = \emptyset$, $\forall \gamma_1, \gamma_2 \in \Gamma$, and if the signs match (among the contours and with the boundary conditions on $\Lambda_1$). It is $s^n$-compatible, on scale $n$, if, moreover, $\sigma_x(\Gamma) = s^n_x$, $\forall x \in \mathbb{Z}^d\setminus\underline{\Gamma}$.
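We shall derive inductively the following representation for the partition function (3.12):
$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = e^{f^n_{+,\Lambda_1}(s_{\Lambda_2})} \exp\Big(\sum_{X \subset \Lambda_2} \Phi^n_X(s_X)\Big)\, \tilde Z^+_{\Lambda_1}(s_{\Lambda_2}), \qquad (3.25)$$
where $f^n_{+,\Lambda_1}(s_{\Lambda_2})$ corresponds to a “bulk” free energy. Since in Proposition 1, we study a ratio of partition functions, we need only to bound the difference between two $f^n_{+,\Lambda_1}(s_{\Lambda_2})$, for different $s_{\Lambda_2}$’s, and this is done in (3.32) below. $\Phi^n_X$ will converge, as $n \to \infty$, to the interactions $\Phi_X$, while
$$\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) = \sum_{\Gamma} \rho^n(\Gamma) \exp(W^n(\Gamma)), \qquad (3.26)$$
where the sum runs over $s^n$-compatible sets of contours, with $\underline{\Gamma} \equiv \cup_{\gamma \in \Gamma} \underline{\gamma} \subset [L^{-n}\Lambda_1]$ and the constraint $\hat\Gamma \equiv \cup_{\gamma \in \Gamma} \hat\gamma \supset D^n_{\Lambda_2}$,
$$\rho^n(\Gamma) = \prod_{\gamma \in \Gamma} \rho^n(\gamma). \qquad (3.27)$$
$\rho^n(\gamma)$ is, for $n \ge 1$, a function of $\{s_x \mid x \in L^n\gamma \cap \Lambda_2\}$, while
$$W^n(\Gamma) = \sum_{Y \subset [L^{-n}\Lambda_1]} \Psi^n(Y, \Gamma), \qquad (3.28)$$
where $\Psi^n(Y, \Gamma)$ is, for $n \ge 1$, a function of $\{s_x \mid x \in L^n Y \cap \Lambda_2\}$. We shall prove that, eventually, $\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) \to 1$ as $\Lambda_2 \uparrow \mathcal{L}$. Proposition 1 will then follow easily from such a representation. In the proposition below, we collect the bounds satisfied by $\rho^n(\gamma)$, $\Psi^n(Y, \Gamma)$ and $f^n_{+,\Lambda_1}(s_{\Lambda_2})$: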
Proposition 3. Under the hypotheses of Proposition 1, $\forall s_{\Lambda_2} \in \Omega_{\Lambda_2}$, and for n such that $|[L^{-n}\Lambda_2]| \ge L^d$, (3.25), (3.26) hold, where $\Phi^n_X$ satisfies (3.4), (3.5) uniformly in n,
$$0 \le \rho^n(\gamma) \le \exp\big(\beta_n k_n N^n_{\Lambda_2}(\gamma) - \beta_n |\gamma \setminus D^n_{\Lambda_2}|\big) \qquad (3.29)$$
and, for each connected component $D_i$ of $D^n_{\Lambda_2}$,
$$\rho^n(D_i) \ge \exp\big(-\beta_n k_n N^n_{\Lambda_2}(D_i)\big). \qquad (3.30)$$
Moreover,
$$|\Psi^n(Y, \Gamma)| \le e^{-\beta_n |Y|}; \qquad (3.31)$$
$\Psi^n(Y, \Gamma)$ depends on $\Gamma$ only through $\Gamma \cap Y \equiv \{\gamma \mid \gamma \cap Y \neq \emptyset\}$ and $\Psi^n(Y, \Gamma) = 0$ unless Y is connected and $\Gamma \cap Y \neq \emptyset$. Finally, $\forall s^1_{\Lambda_2}, s^2_{\Lambda_2} \in \Omega_{\Lambda_2}$, with $s^1_x = s^2_x$, $\forall x \in V$, and for n such that $d([L^{-n}V], [L^{-n}\Lambda_2]^c) \ge L$,
$$|f^n_{+,\Lambda_1}(s^1_{\Lambda_2}) - f^n_{+,\Lambda_1}(s^2_{\Lambda_2})| \le C|V| \exp\big(-(d(V, \Lambda_2^c))^{1-2\eta}\big) \qquad (3.32)$$
for some $C < \infty$, and
$$\beta_n = L^{(1-\eta)n}\beta_0, \qquad k_n = k_0 - L^{-n}, \qquad (3.33)$$
where $k_0 < \infty$, $\beta_0 = c\beta$ and $\eta = 4\alpha$. Similar formulas and bounds hold for $Z^-_{\Lambda_1}$, with a function $f^n_{-,\Lambda_1}$ satisfying (3.32), and $D_{\Lambda_2}$ ($= D^+(s_{\Lambda_2})$) replaced by $D^-(s_{\Lambda_2})$.
Remarks.
1. We shall see in the proof (Eq. (4.18) below) that, at each scale, there are two contributions to $\Phi^n_X$. One is given by $\ln \rho^n(D_i)$, and is similar to the contour energy in (2.14). The other contribution comes from the sum over the internal spins that introduces “interactions” between the contours of the external spins, and gives rise to the last term in (4.18).
2. The restriction on n in the proposition is technical: for larger n’s, all the statements would remain true, except that $\beta_n$ would no longer increase as in (3.33). However, in the limit $\Lambda_2 \uparrow \mathcal{L}$, the largest value of n to which the proposition applies increases to infinity.
3. In the proofs, we shall denote by c or C a generic constant that depends only on the lattice dimension or on b, but not on the choice of L. This constant may vary from place to place. We shall use C(L) to denote a generic constant that may depend also on L. We shall assume that L is chosen large enough (given $\eta = 4\alpha$ in Proposition 1) so that inequalities like $C \le L^\alpha$ can be used. Besides, we shall assume that $\beta_0 = c\beta$ (see (3.15)) is large enough, so that inequalities like $C(L) \le \beta_0$ can be used.
4. Proofs
Proof of Proposition 3. The proof will be made for the + boundary conditions. The proof for the − boundary condition is similar, but we shall indicate which quantities may depend on the boundary conditions. Although the proof is rather technical, the main idea is quite simple: we cannot take directly the logarithm of $Z^+_{\Lambda_1}(s_{\Lambda_2})$, wherever $s_{\Lambda_2}$ is not constant, because the change of signs of $s_{\Lambda_2}$ in $D_{\Lambda_2}$ introduces constraints in
that partition function, i.e. it forces the presence of contours. We define a sort of local partition function, $\rho(\mathcal{D}_{\Lambda_2})$ (see (4.1) below), containing the contours that are constrained only by the “small” connected components of $D_{\Lambda_2}$, and factor it out of the sum. Then the sum over the contours that do not intersect $(D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2})$ can be exponentiated via the usual polymer formalism. The exponent is divided into three parts: the terms that are not inside $\Lambda_2$ contribute to $f^n_{+,\Lambda_1}(s_{\Lambda_2})$ (see (4.20)), those that are inside $\Lambda_2$ but do not depend on the remaining contours (i.e. those that intersect $(D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2})$) contribute to $\Phi^n(X)$ (see (4.18)), and finally those that depend on the remaining contours contribute to $\Psi^n(Y, \Gamma)$ (see (4.19, 4.14)). Then, we “block” the remaining contours and iterate the operation. By definition of $\bar\Omega$, eventually $D_{\Lambda_2}$ becomes empty, there are no constraints left and $\tilde Z^+_{\Lambda_1}(s_{\Lambda_2})$ converges to 1 (see (4.34)).
Turning to the proof, we see that, for n = 0, (3.25) and (3.26) follow from (3.12) with $\hat\Gamma = \underline{\Gamma}$, $\Phi^0_X = 0$, $\Psi^0(Y, \Gamma) = 0$, $f^0_{+,\Lambda_1}(s_{\Lambda_2}) = |\Lambda_2| \ln T$ (which is independent of $s_{\Lambda_2}$, so that (3.32) holds trivially) and $\rho^0(\gamma) = \rho(\gamma)$. The bounds on $\rho(\gamma)$ will be discussed later (see proof of Lemma 5). Now assume that the proposition holds for n, and let us prove it for n + 1. We shall delete the indices n, n + 1 and denote by a prime the scale n + 1. Let
$$\rho(\mathcal{D}_{\Lambda_2}) = \prod_{D_i \in \mathcal{D}_{\Lambda_2}} \rho(D_i), \qquad (4.1)$$
and write (3.26) as:
$$\tilde Z = \rho(\mathcal{D}_{\Lambda_2}) \sum_{\Gamma} \prod_{\gamma \in \Gamma} \bar\rho(\gamma) \exp(W(\Gamma)). \qquad (4.2)$$
$W(\Gamma)$ was defined in (3.28), and
$$\bar\rho(\gamma) = \frac{\rho(\gamma)}{\prod_i \rho(D_i)}, \qquad (4.3)$$
where the product runs over $D_i \subset \gamma$, $D_i \in \mathcal{D}_{\Lambda_2}$. In (4.2), we use the fact that $\hat\Gamma \supset D_{\Lambda_2} \supset \mathcal{D}_{\Lambda_2}$. A contour $\gamma$ is small if
$$V(\gamma) \not\supset [L^{-n}\Lambda_2] \qquad (4.4)$$
and
$$\gamma \cap (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}) = \emptyset, \qquad (4.5)$$
where $V(\gamma)$ is the complement of the infinite connected component of $\mathbb{Z}^d \setminus \gamma$. A contour is large otherwise. Note that, unlike in [2, 12], the notion of small contour does not refer to the size of $\gamma$, but, basically, to the subset of $D_{\Lambda_2}$ intersected by $\gamma$. It is convenient to include in the large contours those for which $V(\gamma) \supset [L^{-n}\Lambda_2]$. Indeed, as we shall see in Lemma 1 below, the global constraints on families of contours due to the fact that they have to be s-compatible can be expressed entirely in terms of those contours. This inclusion is, however, what limits the values of n in Proposition 3: when $|[L^{-n}\Lambda_2]|$ becomes too small, all the contours are “large” and the iteration stops. As we shall see, the bounds (3.29, 3.30) are sufficient to control the sum over the small contours (see Lemma 3 below). So, we rewrite the sum in (4.2) as
$$\sum^{\ell}_{\hat\Gamma_1 \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma_1) \exp(W(\Gamma_1)) \sum^{s}_{\Gamma_2} \bar\rho(\Gamma_2) \exp(W(\Gamma_1, \Gamma_2)). \qquad (4.6)$$
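$\sum^{\ell}_{\Gamma_1}$ runs over all families of large contours such that $\hat\Gamma_1 \supset (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2})$ and $\sum^{s}_{\Gamma_2}$ runs over the set $C_s(\Gamma_1)$ of families of small contours $\Gamma_2$ such that $\Gamma_1 \cup \Gamma_2$ is s-compatible and such that $\hat\Gamma_1 \cup \hat\Gamma_2 \supset D_{\Lambda_2}$. If $C_s(\Gamma_1) = \emptyset$, then $\sum^{s}_{\Gamma_2} = 0$. Finally,
$$W(\Gamma_1, \Gamma_2) = W(\Gamma_1 \cup \Gamma_2) - W(\Gamma_1) = \sum_Y \Psi(Y, \Gamma_1 \cup \Gamma_2) - \sum_Y \Psi(Y, \Gamma_1) \equiv \sum_Y \Psi(Y, \Gamma_1, \Gamma_2), \qquad (4.7)$$
where the sums run over $Y \subset [L^{-n}\Lambda_1]$, and the last sum runs over $Y \cap \Gamma_2 \neq \emptyset$ because $\Psi(Y, \Gamma)$ depends on $\Gamma$ only through $\Gamma \cap Y$. Let us first characterize explicitly the constraint that the families of contours have to be s-compatible. For that, we define $\mathrm{Out}(\Gamma) = \{\gamma \in \Gamma \mid V(\gamma) \supset [L^{-n}\Lambda_2]\}$.
Lemma 1. A compatible family of contours $\Gamma$ such that $\hat\Gamma \supset D_{\Lambda_2}$ is s-compatible if and only if $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible.
The proof of this lemma and of the other ones is given in the Appendix. Using this lemma, one may characterize the families of contours that enter the sum $\sum^{s}_{\Gamma_2}$:
Lemma 2. If $\Gamma_1$ is a family of contours such that
$$\hat\Gamma_1 \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2} \ \text{ and } \ C_s(\Gamma_1) \neq \emptyset, \qquad (4.8)$$
then $C_s(\Gamma_1)$ is the set of families of small contours $\Gamma_2$ such that:
1) $\Gamma_2 \cap \Gamma_1 = \emptyset$, (4.9)
2) the signs of the contours in $\Gamma_2 \cup \Gamma_1$ match among themselves and with the boundary conditions on $\Lambda_1$, (4.10)
3) $\hat\Gamma_2 \supset (D_{\Lambda_2} \setminus \hat\Gamma_1)$. (4.11)
We shall show that the sum $\sum^{s}_{\Gamma_2}$ can be exponentiated using this lemma, the bounds (3.29, 3.30) and the standard polymer formalism:
Lemma 3.
$$\sum^{s}_{\Gamma_2} \bar\rho(\Gamma_2) \exp(W(\Gamma_1, \Gamma_2)) = \exp\Big(\sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y, \Gamma_1)\Big), \qquad (4.12)$$
where $\varphi^+(Y, \Gamma_1)$ is a function of $\{s_x, x \in L^n Y \cap \Lambda_2\}$, for $n \ge 1$, and of $\{s_x \mid x \in \Lambda_2, d(x, Y) \le 2b\}$ for n = 0; $\varphi^+(Y, \Gamma_1)$ depends on $\Gamma_1$ only through $\Gamma_1 \cap Y$. In particular, $\varphi^+(Y, \Gamma_1) = \varphi^+(Y, \emptyset)$ if $\Gamma_1 \cap Y = \emptyset$, and we denote it by $\varphi^+(Y)$ in that case. Moreover, $\varphi^+(Y, \Gamma_1)$ satisfies the bound:
$$|\varphi^+(Y, \Gamma_1)| \le \exp(-\beta L^{-2\alpha}|Y|), \qquad (4.13)$$
and $\varphi^+(Y, \Gamma_1) = 0$ unless Y is connected. Finally, one may define $\varphi^-(Y, \Gamma_1)$ with − boundary conditions, and we have $\varphi^+(Y, \Gamma_1) = \varphi^-(Y, \Gamma_1)$, if $L^n Y \subset \Lambda_2$.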
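Now, insert (4.12) in (4.6). We write
$$\sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y, \Gamma_1) = \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y) + \sum_{Y \subset [L^{-n}\Lambda_1]} (\varphi^+(Y, \Gamma_1) - \varphi^+(Y))\chi(Y \cap \Gamma_1 \neq \emptyset) = \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y) + \sum_{Y \subset [L^{-n}\Lambda_1]} \tilde\varphi^+(Y, \Gamma_1),$$
where
$$\tilde\varphi^+(Y, \Gamma_1) = (\varphi^+(Y, \Gamma_1) - \varphi^+(Y))\chi(Y \cap \Gamma_1 \neq \emptyset). \qquad (4.14)$$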
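We get, using (4.2, 4.6) and writing $\Gamma$ for $\Gamma_1$,
$$\tilde Z = \rho(\mathcal{D}_{\Lambda_2}) \exp\Big(\sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y)\Big) \sum^{\ell}_{\hat\Gamma \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma) \exp\Big(W(\Gamma) + \sum_{Y \subset [L^{-n}\Lambda_1]} \tilde\varphi^+(Y, \Gamma)\Big). \qquad (4.15)$$
Write:
$$\sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y) = \sum_{X \subset \Lambda_2} \sum_{Y} \varphi^+(Y)\chi(L^n Y = X) + \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y)\chi(L^n Y \cap \Lambda_2^c \neq \emptyset), \qquad (4.16)$$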
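and a similar formula for $\ln \rho(\mathcal{D}_{\Lambda_2})$. Then, using (3.25), we get:
$$Z^+_{\Lambda_1}(s_{\Lambda_2}) = e^{f'_{+,\Lambda_1}(s_{\Lambda_2})} \exp\Big(\sum_{X \subset \Lambda_2} \Phi'_X(s_X)\Big) \sum^{\ell}_{\hat\Gamma \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma) \exp(\tilde W(\Gamma)), \qquad (4.17)$$
where, for $n \ge 1$, and $X \subset \Lambda_2$,
$$\Phi'_X = \Phi_X + \sum_{D_i \in \mathcal{D}_{\Lambda_2}} \ln \rho(D_i)\chi(L^n D_i = X) + \sum_{Y} \varphi^+(Y)\chi(L^n Y = X), \qquad (4.18)$$
and, using (3.28),
$$\tilde W(\Gamma) = \sum_{Y \subset [L^{-n}\Lambda_1]} \big(\Psi(Y, \Gamma) + \tilde\varphi^+(Y, \Gamma)\big), \qquad (4.19)$$
while
$$f'_{+,\Lambda_1}(s_{\Lambda_2}) = f_{+,\Lambda_1}(s_{\Lambda_2}) + \sum_{Y \subset [L^{-n}\Lambda_1]} \varphi^+(Y)\chi(L^n Y \cap \Lambda_2^c \neq \emptyset) + \sum_{D_i \in \mathcal{D}_{\Lambda_2}} \ln \rho(D_i)\chi(L^n D_i \cap \Lambda_2^c \neq \emptyset). \qquad (4.20)$$
For n = 0, one modifies (4.18, 4.20) by replacing $L^n D_i$ by $\{x \mid d(x, D_i) \le 2b\}$, and $L^n Y$ by $\{x \mid d(x, Y) \le 2b\}$. Note that all the terms in (4.19) vanish unless $Y \cap \Gamma \neq \emptyset$ (using Proposition 3 and (4.14)).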
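Lemma 4. The functions $\Phi^n_X$, defined inductively by (4.18), are functions of $\{s_x\}_{x \in X}$, are independent of the boundary conditions on $\Lambda_1$, and satisfy the bounds in Proposition 1 uniformly in n. The limit $\lim_{n \to \infty} \Phi^n_X = \Phi_X$ exists. Moreover, $f^n_{+,\Lambda_1}(s_{\Lambda_2})$, defined inductively by (4.20), and $f^n_{-,\Lambda_1}(s_{\Lambda_2})$ defined similarly, satisfy (3.32).
Now we shall “block” the terms in
$$\sum^{\ell}_{\hat\Gamma \supset D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}} \bar\rho(\Gamma) \exp(\tilde W(\Gamma)) \qquad (4.21)$$
in order to obtain the representation (3.26) for $\tilde Z$ on the next scale. Let, for each term in (4.21), $\hat\Gamma' = [L^{-1}\Gamma] \cup D'_{\Lambda_2}$ (note that $D'_{\Lambda_2}$ is not necessarily included in $[L^{-1}\Gamma]$, because of the bar in (3.20)) and decompose $\hat\Gamma'$ into connected components: $\hat\Gamma' = \cup_i \hat\gamma'_i$. Write also
$$\tilde W(\Gamma) = \sum^{\ell}_{Y \subset [L^{-n}\Lambda_1]} \big(U(Y, \Gamma) + \tilde\Psi(Y, \hat\Gamma')\big) + \sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i), \qquad (4.22)$$
where, in $\sum^{\ell}$, we sum only over Y with $d(Y) \ge \frac{L}{4}$, and we define
$$U(Y, \Gamma) = \Psi(Y, \Gamma) + \tilde\varphi^+(Y, \Gamma) - \min_{\Gamma}\big(\Psi(Y, \Gamma) + \tilde\varphi^+(Y, \Gamma)\big), \qquad (4.23)$$
and
$$\tilde\Psi(Y, \hat\Gamma') = \min_{\Gamma}\big(\Psi(Y, \Gamma) + \tilde\varphi^+(Y, \Gamma)\big), \qquad (4.24)$$
where $\min_{\Gamma}$ is taken over all $\Gamma$ such that $[L^{-1}\Gamma] \cup D'_{\Lambda_2} = \hat\Gamma'$. Finally,
$$E(\hat\gamma', \Gamma \cap \hat\gamma') = \sum_{Y \cap L\hat\gamma' \neq \emptyset} \big(\Psi(Y, \Gamma) + \tilde\varphi^+(Y, \Gamma)\big), \qquad (4.25)$$
where all the terms satisfy $d(Y) < \frac{L}{4}$. Note that those Y’s can intersect $L\hat\gamma'$ for at most one $\hat\gamma'$, since, for disconnected sets $Y_1, Y_2$ in $\mathbb{Z}^d$, $d(LY_1, LY_2) \ge L$. Let, for $Y' \subset [L^{-(n+1)}\Lambda_1]$,
$$\Psi'(Y', \hat\Gamma') = \sum^{\ell}_{[L^{-1}Y] = Y'} \tilde\Psi(Y, \hat\Gamma'). \qquad (4.26)$$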
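We get
$$(4.21) = \sum_{\hat\Gamma' \supset D'_{\Lambda_2}} \exp\Big(\sum_{Y' \subset [L^{-(n+1)}\Lambda_1]} \Psi'(Y', \hat\Gamma')\Big) \sum^{\ell}_{[L^{-1}\Gamma] \cup D'_{\Lambda_2} = \hat\Gamma'} \bar\rho(\Gamma) \exp\Big(\sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i) + \sum^{\ell}_{Y} U(Y, \Gamma)\Big). \qquad (4.27)$$
We need to do a Mayer expansion on $\exp(\sum^{\ell}_{Y} U(Y, \Gamma))$ in order to factorize the sum over $\Gamma$ in (4.27). We note, for further use, that, by (4.23), $U(Y, \Gamma) \ge 0$. We write
$$\exp\Big(\sum^{\ell}_{Y} U(Y, \Gamma)\Big) = \prod_{Y} \big(e^{U(Y, \Gamma)} - 1 + 1\big) = \sum_{\mathcal{Y}} \prod_{Y \in \mathcal{Y}} \big(e^{U(Y, \Gamma)} - 1\big) \equiv \sum_{\mathcal{Y}} V(\mathcal{Y}, \Gamma), \qquad (4.28)$$
where $V(\mathcal{Y}, \Gamma) \ge 0$. Insert (4.28) in (4.27). We can write the first $\sum^{\ell}$ in (4.27) as: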
$$\sum^{\ell}_{[L^{-1}\Gamma] \cup D'_{\Lambda_2} = \hat\Gamma'} \bar\rho(\Gamma) \exp\Big(\sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i)\Big) \sum_{\mathcal{Y}} V(\mathcal{Y}, \Gamma) = \sum_{\underline{\Gamma}' \supset \hat\Gamma'} \prod_{\gamma'} \rho'(\gamma'), \qquad (4.29)$$
where $\underline{\Gamma}' = \hat\Gamma' \cup [L^{-1}\mathcal{Y}] = \cup \underline{\gamma}'$ is decomposed into connected components and
$$\rho'(\gamma') = \sum^{\ell}_{(\Gamma, \mathcal{Y})} \bar\rho(\Gamma) V(\mathcal{Y}, \Gamma) \exp\Big(\sum_i E(\hat\gamma'_i, \Gamma \cap \hat\gamma'_i)\Big), \qquad (4.30)$$
where the sum runs over $(\Gamma, \mathcal{Y})$ such that
$$[L^{-1}\Gamma] \cup (D'_{\Lambda_2} \cap \gamma') = \hat\Gamma' \cap \gamma', \qquad (4.31)$$
$$[L^{-1}(\Gamma \cup \mathcal{Y})] \cup (D'_{\Lambda_2} \cap \gamma') = \gamma', \qquad (4.32)$$
and such that the signs $\{\sigma(\gamma)\}_{\gamma \in \Gamma}$ are the same for all the terms in the sum. Note that all the terms in (4.30) are positive by (4.23). The factorization of the sum in (4.29) holds because $U(Y, \Gamma)$ depends on $\Gamma$ only through $\Gamma \cap Y$, and all the terms in (4.25) intersect only one $\hat\gamma'$. We define $\gamma' = (\underline{\gamma}', \hat\Gamma' \cap \underline{\gamma}', \sigma(\gamma'))$, where $\sigma(\gamma')$ is determined by the common signs of $\{\sigma(\gamma)\}_{\gamma \in \Gamma}$, and
$$\Psi'(Y', \Gamma') = \Psi'(Y', \hat\Gamma'). \qquad (4.33)$$
Finally, we define $s'_{x'}$, for each $x' \in [L^{-n-1}\Lambda_2] \setminus D'_{\Lambda_2}$, as the (constant) value of $s_y$, for $y \in Lx' \setminus (\cup_{D \subset \mathcal{D}_{\Lambda_2}} V(D))$. To see that $s_y$ is constant, observe that $Lx' \cap (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}) = \emptyset$, since $x' \notin D'_{\Lambda_2}$, and to see that the set $Lx' \setminus (\cup_{D \subset \mathcal{D}_{\Lambda_2}} V(D))$ is not empty, notice that $|V(D)| \le cL^{d\alpha} < |Lx| = L^d$, for $D \subset \mathcal{D}^n_{\Lambda_2}$ and $\alpha$ small. So, $s'_{x'}$ is well-defined. With this definition of $s'$ and of $\sigma(\gamma')$, we see that the sum (4.29) runs over $s'$-compatible families of contours. Inserting (4.29) in (4.27), and combining (4.17, 4.27) we get (3.25, 3.26) on the next scale and the proof of Proposition 3 is finished with:
Lemma 5. $\Psi'(Y', \Gamma')$ defined by (4.26, 4.33) and $\rho'(\gamma')$ defined by (4.30), satisfy the claims of Proposition 3, for $\beta' = L^{1-4\alpha}\beta$.
Remark. Before proving Proposition 1, let us characterize $\bar\Omega_+$ and $\bar\Omega_-$. It is easy to see that, if $s \in \bar\Omega$, each connected component $D_i$ of $D(s)$ is finite. Moreover, for each $x \in \mathcal{L}$, there are at most a finite number of $D_i$’s with $x \in V(D_i)$ (where $V(D_i)$ = the complement of the infinite connected component of $\mathbb{Z}^d \setminus D_i$). Indeed, otherwise, $\forall n$, there exists $m \ge n$ and a connected component $D_i$ of $D^m(s)$ such that $x \in V(D_i)$, $|D_i| \ge L^\alpha$, $D_i \cap L^2\{x\} \neq \emptyset$, where $L^2\{x\}$ is the cube of size $L^2$ centered at x (indeed, if this last condition is not satisfied for some m, it will hold for a larger m, because of
the “blocking” in (3.20)). This would imply (by (3.23)) that, $\forall n$, there exists $m \ge n$ and $y \in \mathbb{Z}^d$, with $|x - y| \le L^2 + 1$, and $N^m_y \neq 0$; but since there is a finite number of such y’s, this in turn means that $s \notin \bar\Omega$. So, combining these two facts, we see that, for each $s \in \bar\Omega$, there exists a unique infinite b-connected set (a subset Y of $\mathcal{L}$ is b-connected if any two points of Y can be joined by a path $(x_i)$, $x_i \in \mathcal{L}$ with $|x_i - x_{i+1}| = b$), where $s_x$ is of a given sign and this sign defines $\bar\Omega_+$ and $\bar\Omega_-$. Now we can give the
Proof of Proposition 1. Let us apply Proposition 3 up to the largest n such that $|[L^{-n}\Lambda_2]| \ge L^d$. For that n, $[L^{-n}\Lambda_2] \subset L^2\{0\}$, where $L^2\{0\}$ is the cube of size $L^2$ centered at 0. Since $\Lambda_2 \uparrow \mathcal{L}$, $n \to \infty$. So, we have $[L^{-n}V] = \{0\}$, for n (i.e. $\Lambda_2$) large, since V is fixed. Therefore, we have, for n as above, $d([L^{-n}V], [L^{-n}\Lambda_2]^c) \ge L$ and we can use (3.32). Then, given the bounds on $\Phi^n_X$ and (3.32), it is enough to show that
$$\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) = 1 + O(e^{-c\beta_n}) \qquad (4.34)$$
for $s \in \bar\Omega_+$. We claim that, in that case, for $s \in \bar\Omega_+$ and n large enough,
$$D^n_{\Lambda_2} = \emptyset \qquad (4.35)$$
(which implies, by (3.22), $N^n_{\Lambda_2,x} = 0$, $\forall x$). Postponing the proof of (4.35), and using the representation (3.26), where now $\Gamma = \emptyset$, $\rho(\emptyset) = 1$, $W(\emptyset) = 0$, enters the sum since $D^n_{\Lambda_2} = \emptyset$, we get:
$$\tilde Z^+_{\Lambda_1}(s_{\Lambda_2}) = 1 + \sum_{\Gamma \neq \emptyset} \rho^n(\Gamma) \exp(W^n(\Gamma)), \qquad (4.36)$$
where, for each $\gamma$ in the sum, $V(\gamma) \cap [L^{-n}\Lambda_2] \neq \emptyset$ (all other $\gamma$’s were small on the first scale, see (4.4, 4.5)). Now, we use the bound (3.29), with $D^n_{\Lambda_2} = \emptyset$, and $N^n_{\Lambda_2,x} = 0$, $\forall x$. We use also (3.28) and $|W^n(\Gamma)| \le ce^{-\beta_n}|\Gamma|$, which follows from (3.31), and the fact that $\Psi^n(Y, \Gamma) = 0$ unless Y is connected and $Y \cap \Gamma \neq \emptyset$, to get
$$\sum_{\Gamma \neq \emptyset} \rho^n(\Gamma) \exp(W^n(\Gamma)) \le \sum_{\Gamma \neq \emptyset} \exp(-\beta_n|\Gamma|/2). \qquad (4.37)$$
Since for each $\gamma$ in the previous sum, $V(\gamma) \cap [L^{-n}\Lambda_2] \neq \emptyset$, and since $|[L^{-n}\Lambda_2]| \le L^{2d}$, for n as above ($[L^{-n}\Lambda_2] \subset L^2\{0\}$), we have,
$$(4.37) \le \sum_{k \ge 1} \Big(\sum_{V(\gamma) \cap [L^{-n}\Lambda_2] \neq \emptyset} \exp(-\beta_n|\gamma|/2)\Big)^k \le \sum_{k \ge 1} (e^{-c\beta_n} L^{2d})^k, \qquad (4.38)$$
which proves (4.34). We are left with the proof of (4.35). Now observe that (on scale n = 0), if $D_i$ is a connected component of $D_{\Lambda_2} = D^+(s_{\Lambda_2})$ (or of $D^-(s_{\Lambda_2})$) such that $D_i \cap \partial\Lambda_2 = \emptyset$, then $D_i \subset D(s)$. Besides, if $s \in \bar\Omega_+$ and $s_x = -1$, $x \in \partial\Lambda_2$, then $x \in V(D_i)$ for some connected $D_i \subset D(s)$.
Hence, if $s \in \bar\Omega_+$,
$$D^+(s_{\Lambda_2}) \subset \bigcup \{V(D_i) \mid D_i \subset D(s),\ V(D_i) \cap \Lambda_2 \neq \emptyset\}.$$
This implies inductively that
$$D^n_{\Lambda_2} \subset \bigcup \{V(D_i) \mid D_i \subset D^n(s),\ V(D_i) \cap [L^{-n}\Lambda_2] \neq \emptyset\}. \qquad (4.39)$$
On the other hand, if $s \in \bar\Omega_+$, there exists n such that, $\forall m \ge n$,
$$D^m(s) \cap L^2\{0\} = \emptyset. \qquad (4.40)$$
This implies that
$$V(D^n(s)) \cap L^2\{0\} = \cup_{D_i \subset D^n(s)} V(D_i) \cap L^2\{0\} = \emptyset. \qquad (4.41)$$
Indeed, if (4.41) does not hold, and $D^n(s) \cap L^2\{0\} = \emptyset$, then, because of the blocking in (3.20), $D^m(s) \cap L^2\{0\} \neq \emptyset$, for some $m \ge n$. By taking $\Lambda_2$ large, we may assume that this n is the one chosen at the beginning of the proof, in particular that $[L^{-n}\Lambda_2] \subset L^2\{0\}$. Combining this fact and (4.39, 4.41), we get $D^n_{\Lambda_2} = \emptyset$, i.e. (4.35).
To prove Proposition 2 and Lemma 5, we need the following lemma.
Lemma 6. Let $D_i$ be a connected component of $D^n$. Then
$$\sqrt{\beta_0}\, L^{\alpha n/2} |D_i| \le \beta_n N^n(D_i) \le 2d\beta_0 L^{nd} |D_i|. \qquad (4.42)$$
A similar bound holds for $D^n_{\Lambda_2}$, $N^n_{\Lambda_2}(D_i)$.
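To get a feeling for the scales involved, the constants of (3.33) and the window that (4.42) leaves for $N^n(D_i)/|D_i|$ can be tabulated numerically. The following is only an illustrative sketch: the function names and the sample values of L, d, $\beta_0$, $k_0$ and $\alpha$ are hypothetical choices made for this example, and $\eta = 4\alpha$ is taken from Proposition 3.

```python
import math

def beta_n(n, L, beta0, eta):
    # (3.33): beta_n = L^((1 - eta) n) * beta_0
    return L ** ((1 - eta) * n) * beta0

def k_n(n, L, k0):
    # (3.33): k_n = k_0 - L^(-n)
    return k0 - L ** (-n)

def lemma6_window(n, L, d, beta0, alpha):
    # Dividing (4.42) by beta_n * |D_i| gives a lower and an upper bound
    # on N^n(D_i) / |D_i|; eta = 4 * alpha as in Proposition 3.
    bn = beta_n(n, L, beta0, 4 * alpha)
    lower = math.sqrt(beta0) * L ** (alpha * n / 2) / bn
    upper = 2 * d * beta0 * L ** (n * d) / bn
    return lower, upper

# Hypothetical sample values, chosen only for illustration.
L, d, beta0, k0, alpha = 4, 2, 100.0, 2.0, 0.01
table = [(n, beta_n(n, L, beta0, 4 * alpha), k_n(n, L, k0),
          lemma6_window(n, L, d, beta0, alpha)) for n in range(5)]
for n, bn, kn, (lo, up) in table:
    print(n, bn, kn, lo, up)
```

With these sample values $\beta_n$ grows geometrically in n while $k_n$ increases towards $k_0$, and the lower bound of the window stays below the upper one on every scale listed.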
Now we shall prove that the configurations in $\bar\Omega_+$, $\bar\Omega_-$ are typical for $\mu'_+$, $\mu'_-$:
Proof of Proposition 2. It is enough to show, see (3.24), that, for any $x \in \mathbb{Z}^d$,
$$\mu'_+(\bar\Omega_x^c) = \mu'_-(\bar\Omega_x^c) = 0, \qquad (4.43)$$
because, by (3.14) and a simple Peierls argument, one sees that the probability with respect to $\mu'_+$ (resp. $\mu'_-$) of an infinite b-connected set of − (resp. +) spins is zero, hence $\mu'_+(\bar\Omega_-) = \mu'_-(\bar\Omega_+) = 0$. We shall prove that, $\forall A \subset \mathbb{Z}^d$, and $\forall \{N^n_x \in L^{-(1-\eta)n}\mathbb{N}\}_{x \in A}$,
$$\mu'_+\Big(\prod_{x \in A} \chi_{N^n_x}\, \chi_A\Big) \le \exp(-\tilde\beta_n N^n(A)), \qquad (4.44)$$
where $\chi_{N^n_x}$ means that the random variable $N^n_x(s)$ defined by (3.16, 3.17) takes the value $N^n_x$ (by (3.17), $N^n_x \in L^{-(1-\eta)n}\mathbb{N}$), $\chi_A$ is the indicator function of the event: “A is a union of connected components of $D^n(s)$”, and
$$\tilde\beta_n = cL^{-\alpha n/4}\beta_n. \qquad (4.45)$$
Then (4.43) follows from
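$$\mu'_+(\bar\Omega_x^c) \le \lim_{N \to \infty} \sum_{n=N}^{\infty} \mu'_+(N^n_x \neq 0) \le \lim_{N \to \infty} \sum_{n=N}^{\infty} \sum^{c}_{A \ni x} \exp\big(-c\sqrt{\beta_0}\, L^{\alpha n/4}|A|\big) \le \lim_{N \to \infty} C \sum_{n=N}^{\infty} \exp\big(-c\sqrt{\beta_0}\, L^{\alpha n/4}\big) = 0, \qquad (4.46)$$
where $\sum^{c}$ runs over connected sets, and, in the second inequality, we use (4.44) and the lower bound in (4.42). For $\mu'_-$, we can use the symmetry $\mu'_+(\bar\Omega_x^c) = \mu'_-(\bar\Omega_x^c)$, which follows from (2.4). To prove (4.44), consider first n = 0. Using (3.16), it is enough to show:
$$\mu'_+(A \subset D(s)) \le \exp(-c\beta_0|A|). \qquad (4.47)$$
We have, by definition (3.6) of $D(s)$,
$$\mu'_+(A \subset D(s)) = \lim_{\Lambda_2 \uparrow \mathcal{L}} \lim_{\Lambda_1 \uparrow \mathbb{Z}^d} \frac{\sum^{A} Z^+_{\Lambda_1}(\{s_x \mid d(x, A) \le b\})}{Z^+_{\Lambda_1}}, \qquad (4.48)$$
where the sum runs over $\{s_x \mid d(x, A) \le b\}$ such that, $\forall x \in A$, $\exists y \in \mathcal{L}$, $|x - y| = b$ and $s_x \neq s_y$. Using the representation (3.12) in the numerator of (4.48), we get:
$$(T)^{-|\Lambda_2|} \sum^{A} Z^+_{\Lambda_1}(\{s_x \mid d(x, A) \le b\}) \le \sum_{\Gamma \supset B_A} \rho(\Gamma) = \sum^{1}_{\Gamma_1 \supset B_A} \rho(\Gamma_1) \sum^{\Gamma_1}_{\Gamma_2} \rho(\Gamma_2), \qquad (4.49)$$
where the first sum runs over $\Gamma_1$ such that $\gamma \cap B_A \neq \emptyset$, $\forall \gamma \in \Gamma_1$, with $B_A = \cup_{x \in A} B_x$, and the second sum runs over $\Gamma_2 \cap B_A = \emptyset$, $\Gamma_2 \cap \Gamma_1 = \emptyset$. Now, for any $\Gamma_1$ in (4.49), $(T)^{-|\Lambda_2|} Z^+_{\Lambda_1} \ge \sum^{\Gamma_1}_{\Gamma_2} \rho(\Gamma_2)$. Inserting this and (4.49) in (4.48), and using the bound (3.14), where we can put $\exp(-\beta_0|\gamma|)$ in the RHS (by taking $\Lambda_2$ large enough), we get
$$\mu'_+(A \subset D(s)) \le \sum^{1}_{\Gamma_1 \supset B_A} \exp(-\beta_0|\Gamma_1|) \le \exp(-\beta_0|A|/2) \sum^{1}_{\Gamma_1 \supset B_A} \exp(-\beta_0|\Gamma_1|/2) \le \exp(-\beta_0|A|/2) \exp(ce^{-\beta_0/2}|A|) \le \exp(-c\beta_0|A|) \qquad (4.50)$$
for $\beta_0$ large enough, using $\Gamma_1 \supset B_A$ in the first inequality, and the fact that each (connected) $\gamma$ in $\Gamma_1$ contains a box in $B_A$ in the second inequality, which implies
$$\sum^{1}_{\Gamma_1 \supset B_A} \exp(-\beta_0|\Gamma_1|/2) \le \prod_{i \in B_A} \Big(1 + \sum_{\gamma \ni i} e^{-\beta_0|\gamma|/2}\Big) \le \exp(c'e^{-\beta_0/2}|B_A|) \le \exp(ce^{-\beta_0/2}|A|) \qquad (4.51)$$
with $c = c'b^d$. This proves (4.47), i.e. (4.44), for n = 0. Then we proceed inductively: we have, by (3.17, 3.20),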
$$\mu'_+\Big(\prod_{x' \in A'} \chi_{N'_{x'}}\, \chi_{A'}\Big) \le \sum_{[L^{-1}A] = A',\ \{N_y\}} \mu'_+\Big(\prod_{y \in A} \chi_{N_y}\, \chi_A\Big) \prod_{x' \in A'} \chi\Big(L^{-1+\eta} \sum_{y \in Lx' \cap A} N_y = N'_{x'}\Big). \qquad (4.52)$$
Since A is a union of connected components of $D^n(s)$, using (4.44) on scale n and Lemma 6, we have:
$$(4.52) \le \sum^{*}_{A, \{N_y\}} e^{-\tilde\beta N(A)} \prod_{x' \in A'} \chi\Big(L^{-1+\eta} \sum_{y \in Lx' \cap A} N_y = N'_{x'}\Big), \qquad (4.53)$$
where $\sum^{*}$ runs over $A, \{N_y\}$ such that $\tilde\beta N(A) \ge c\sqrt{\beta_0}\, L^{\alpha n/4}|A|$ (see (4.42, 4.45)). Since $N_y \in L^{-(1-\eta)n}\mathbb{N}$, the sum over $N_y$, $y \in Lx' \cap A$ has at most $L^{(1-\eta)(n+1)} N'_{x'}$ terms. Note also that, by (3.17), (3.33), $\beta N(A) = \beta' N'(A')$, i.e., again by (4.45), $\tilde\beta N(A) \ge \tilde\beta' L^{\alpha/4} N'(A')$. So, using $\tilde\beta N(A) \ge \frac{\tilde\beta'}{2} L^{\alpha/4} N'(A') + \sqrt{\beta_0}\, L^{\alpha n/4} \frac{|A|}{2}$, we have:
$$(4.53) \le \exp\Big(-\frac{\tilde\beta'}{2} L^{\alpha/4} N'(A')\Big) \Big(\prod_{x' \in A'} N'_{x'}\Big)^{L^d} \sum_{[L^{-1}A] = A'} L^{(1-\eta)(n+1)|A|} \exp\Big(-\sqrt{\beta_0}\, L^{\alpha n/4} \frac{|A|}{2}\Big); \qquad (4.54)$$
for $\beta_0$ and L large, the sum over A is bounded by O(1), and we get:
$$(4.54) \le \exp(-\tilde\beta' N'(A')), \qquad (4.55)$$
which proves (4.44) on scale n + 1.
Appendix: Proof of the Lemmas
Proof of Lemma 1. First, note that a compatible family $\Gamma$ that contains an s-compatible family $\tilde\Gamma$ is trivially s-compatible: each $x \in \Gamma^c \cap \mathcal{L}$ belongs also to $\tilde\Gamma^c$ and we have $\sigma_x(\tilde\Gamma) = s_x$ (because $\tilde\Gamma$ is s-compatible) and $\sigma_x(\tilde\Gamma) = \sigma_x(\Gamma)$ (because $\Gamma$ is compatible and $\tilde\Gamma \subset \Gamma$). Now, letting $\tilde\Gamma \equiv \mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ this shows that, if $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible, then $\Gamma$ is s-compatible.
To prove the converse, we proceed inductively. Let us consider first n = 0. Assume that $\Gamma$ is an s-compatible family of contours containing $D_{\Lambda_2}$. We shall show that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible. First of all, note that $\mathrm{Out}(\Gamma)$ can be written as $(\gamma_1, \ldots, \gamma_n)$ with $\underline{\gamma}_{i+1} \subset \mathrm{Int}(\gamma_i)$ and $\Lambda_2 \subset V(\gamma_n)$. Obviously, since $\Gamma$ is compatible, and since $\mathrm{Out}(\Gamma)$ contains all the contours in $\Gamma$ such that $\Lambda_2 \subset V(\gamma)$, the sign in $(V(\gamma_{i+1}))^c$ must match the one in the component of $\underline{\gamma}_i^c$ containing $\underline{\gamma}_{i+1}$; therefore, $\mathrm{Out}(\Gamma)$ is compatible. To show that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible, we consider the following cases:
a) $\underline{\gamma}_n \cap \Lambda_2 \neq \emptyset$. In that case the signs of $\sigma(\gamma_n)$ must coincide with the signs of $s_x$ for all $B_x$ adjacent to $\underline{\gamma}_n$ (because $\gamma_n$ belongs to $\Gamma$, which is s-compatible). But since the signs of $s_x$ are constant outside $D_{\Lambda_2}$, this implies that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible.
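b) $\underline{\gamma}_n \cap \Lambda_2 = \emptyset$, which means that $\underline{\gamma} \cap \Lambda_2 = \emptyset$, for all $\gamma \in \mathrm{Out}(\Gamma)$ ($\gamma_n$ is the innermost contour in $\mathrm{Out}(\Gamma)$). It also means that $s_x = 1$ for some $x \in \partial\Lambda_2$: indeed, otherwise, $\partial\Lambda_2$ would entirely belong to $D_{\Lambda_2}$, and $\partial\Lambda_2$ would be a part of a contour $\gamma \in \Gamma$ such that $\Lambda_2 \subset V(\gamma)$; hence this $\gamma$ would belong to $\mathrm{Out}(\Gamma)$, but obviously $\underline{\gamma} \cap \Lambda_2 \neq \emptyset$, contradicting our assumption on $\gamma_n$. We shall show that the (constant) sign given by $\sigma(\gamma_n)$ to $\Lambda_2$ must be +1. Assuming this result, we see that this sign is compatible with those $s_x = 1$ for $x \in \partial\Lambda_2$, and again, since the signs of $s_x$ are constant outside $D_{\Lambda_2}$, this implies that $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible. To show that this sign must be +1, assume that it is −1. This means that all $x \in \partial\Lambda_2$ with $s_x = 1$ must be in $V(\gamma)$ for some contour $\gamma \in \Gamma \setminus \mathrm{Out}(\Gamma)$, since $\Gamma$ is s-compatible. But since all $x \in \partial\Lambda_2$ with $s_x = -1$ belong to $\Gamma \supset D_{\Lambda_2}$, by definition, and since $\partial\Lambda_2$ is connected, $\partial\Lambda_2$ must be in $V(\gamma)$. But then $\Lambda_2 \subset V(\gamma)$, which contradicts the fact that $\gamma \in \Gamma \setminus \mathrm{Out}(\Gamma)$.
Let us now proceed inductively: if $\Gamma'$ is an $s'$-compatible family of contours on scale n + 1, then there is an s-compatible family $\Gamma \subset L\Gamma'$. Moreover, $\mathrm{Out}(\Gamma) \cup (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}) \subset L(\mathrm{Out}(\Gamma') \cup D'_{\Lambda_2})$. But since, by assumption, $\mathrm{Out}(\Gamma) \cup D_{\Lambda_2}$ is s-compatible, $\mathrm{Out}(\Gamma') \cup D'_{\Lambda_2}$ is $s'$-compatible because of the way $s'$ and $\sigma(\Gamma')$ were inductively defined, at the end of the proof of Proposition 3.
Proof of Lemma 2. Since the subset $\Gamma_1$ of large contours of an s-compatible family $\Gamma$ always satisfies $\mathrm{Out}(\Gamma_1) = \mathrm{Out}(\Gamma)$, $C_s(\Gamma_1) \neq \emptyset$ means, by Lemma 1, that $\Gamma_1 \cup D_{\Lambda_2}$ is s-compatible. Thus, any compatible family $\Gamma \supset \Gamma_1 \cup D_{\Lambda_2}$ is s-compatible, by the argument given at the beginning of the previous proof. So, $C_s(\Gamma_1)$ consists of all the families of contours $\Gamma_2$ so that $\Gamma_1 \cup \Gamma_2$ is compatible and contains $D_{\Lambda_2}$, which is equivalent to the statements in the Lemma.
Proof of Lemma 3. We write
$$\exp(W(\Gamma_1, \Gamma_2)) = \prod_{Y} \big(e^{\Psi(Y, \Gamma_1, \Gamma_2)} - 1 + 1\big) = \sum_{\mathcal{Y}} \tau(\mathcal{Y}, \Gamma_1, \Gamma_2), \qquad (A.1)$$
where the product over Y runs over $Y \cap \Gamma_2 \neq \emptyset$, see (4.7), and
$$\tau(\mathcal{Y}, \Gamma_1, \Gamma_2) = \prod_{Y \in \mathcal{Y}} \big(e^{\Psi(Y, \Gamma_1, \Gamma_2)} - 1\big). \qquad (A.2)$$
From (3.31), we have
$$|e^{\Psi(Y, \Gamma_1, \Gamma_2)} - 1| \le Ce^{-\beta|Y|}, \qquad (A.3)$$
and, using (4.3), (3.29), (3.30), (3.18) and (3.22) we have:
$$\bar\rho(\gamma) \le \exp\big(2\beta k L^{-3\alpha}|\gamma \cap \mathcal{D}_{\Lambda_2}| - \beta|\gamma \setminus \mathcal{D}_{\Lambda_2}|\big) \qquad (A.4)$$
because $\gamma$ is small, which implies $\gamma \cap (D_{\Lambda_2} \setminus \mathcal{D}_{\Lambda_2}) = \emptyset$. On the other hand,
$$|\gamma \setminus \mathcal{D}_{\Lambda_2}| \ge \frac{1}{2d} L^{-\alpha}|\gamma|, \quad \text{if } \gamma \not\subset \mathcal{D}_{\Lambda_2}, \qquad (A.5)$$
which follows from the fact that $\gamma$ is connected while the connected components of $\mathcal{D}_{\Lambda_2}$ contain at most $L^\alpha$ points, so that, for each connected component of $\mathcal{D}_{\Lambda_2}$, there exists at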
least one site in $\gamma \setminus \mathcal{D}_{\Lambda_2}$ which is adjacent to that connected component, and the same site can be adjacent to a fixed number of such components (the worst case is when $\gamma \setminus \mathcal{D}_{\Lambda_2}$ contains only one site separating different components of $\mathcal{D}_{\Lambda_2}$ of size $L^\alpha$). Using the fact that we also have $|\gamma \cap \mathcal{D}_{\Lambda_2}| \le |\gamma| \le 2dL^\alpha|\gamma \setminus \mathcal{D}_{\Lambda_2}|$, we get, for $\gamma \notin \mathcal{D}_{\Lambda_2}$,
$$\bar\rho(\gamma) \le \exp(-c\beta L^{-\alpha}|\gamma|) \qquad (A.6)$$
and
$$\bar\rho(\gamma) = 1 \quad \text{for } \gamma \in \mathcal{D}_{\Lambda_2}. \qquad (A.7)$$
Now inserting (A.1) in (4.12), we get
$$\sum^{s}_{\Gamma_2} \bar\rho(\Gamma_2) \exp(W(\Gamma_1, \Gamma_2)) = \sum_{\mathcal{Z}} \prod_{Z \in \mathcal{Z}} A(Z, \Gamma_1), \qquad (A.8)$$
where $\underline{Z} = \Gamma_2 \cup \mathcal{Y}$, and $Z = (\Gamma_2, \mathcal{Y})$ with $\Gamma_2 \cup \mathcal{Y}$ connected;
$$A(Z, \Gamma_1) = \bar\rho(\Gamma_2)\tau(\mathcal{Y}, \Gamma_1, \Gamma_2) \qquad (A.9)$$
with $\Gamma_2 \cup \mathcal{Y} = \underline{Z}$. $A(Z, \Gamma_1)$ depends on $\Gamma_1$ only through $\Gamma_1 \cap Z$ and is a function of $\{s_x \mid x \in L^n Z \cap \Lambda_2\}$ for $n \ge 1$ or of $\{s_x \mid x \in \Lambda_2, d(x, Z) \le 2b\}$ for n = 0, because it is a product of factors with these properties. Combining (A.3) and (A.6), (A.7), we get
$$A(Z, \Gamma_1) = 1 \qquad (A.10)$$
if $Z = (D_i, \emptyset)$, with $D_i \in \mathcal{D}_{\Lambda_2}$, and
$$|A(Z, \Gamma_1)| \le \exp(-c\beta L^{-\alpha}|Z|) \qquad (A.11)$$
otherwise. We shall now see how to apply the polymer formalism (see e.g. [22]). There are constraints on $\Gamma_2$ coming from (1,2,3) in Lemma 2: to deal with constraint (3), define a polymer as $\tilde Z = (\Gamma_2 \setminus \mathcal{D}_{\Lambda_2}) \cup \mathcal{Y}$ with $A(\tilde Z, \Gamma_1) = A(Z, \Gamma_1)$ for the corresponding Z, and define $\tilde Z$ to be “connected” if $\Gamma_2 \cup \mathcal{Y}$ is connected. Since $A(Z, \Gamma_1) = 1$, if $Z = (D_i, \emptyset)$, with $D_i \in \mathcal{D}_{\Lambda_2}$, we have $A(\tilde Z, \Gamma_1) = 1$ if $\tilde Z = \emptyset$ and the bound (A.11) allows us to control the sum over “connected” polymers. The remaining constraints on $\tilde Z$ come from (1,2) in Lemma 2. The constraint (1) gives rise to the usual hard-core constraint between polymers. To deal with constraint (2), observe that, since the contours in $\Gamma_2$ are small, if $\gamma \in \Gamma_2$, and if a finite connected component $V_i$ of $\gamma^c$ intersects $[L^{-n}\Lambda_2]$, it must be adjacent to $\gamma \cap \Lambda_2$, and the constraint (2) in Lemma 2 is automatically satisfied for that component of $\gamma^c$, since the signs $\sigma(\gamma)$ must agree (by definition of a contour) with those of the internal spins in the blocks adjacent to $\gamma$. On the other hand, if a connected component $V_i$ does not intersect $\Lambda_2$, then the s-compatibility reduces there to compatibility, as for the usual Ising contours and the constraint can be dealt with as in that case. The claims of the lemma follow then from the polymer formalism ([22]) applied to (A.8), for $\beta_0$ large; $\varphi^+(Y, \Gamma_1)$ is a sum of products of $A(Z, \Gamma_1)$, with $Z \subset Y$ so its dependence on $\{s_x\}$ and on $\Gamma_1$ follows from one of the $A(Z, \Gamma_1)$’s mentioned above. We put $L^{-2\alpha}$ in (4.13), instead of $L^{-\alpha}$, in order to control constants. To prove that $\varphi^+(Y, \Gamma_1) = \varphi^-(Y, \Gamma_1)$, if $L^n Y \subset \Lambda_2$, observe that, if $\gamma \subset \Lambda_2$, and $\gamma \cap \partial\Lambda_2 = \emptyset$, the values of $\sigma_i(\gamma)$, $i \notin \gamma$ are determined by the external spins outside $\gamma$, since, by definition, the internal spins coincide with the external ones on the boxes adjacent to $\gamma$. Hence, the value of $\rho(\gamma)$, see (3.13), is independent of the boundary conditions on $\Lambda_1$.
It is then easy to see inductively that, on scale n, if $L^n Y \subset \Lambda_2$, $\varphi^+(Y, \Gamma_1)$ is independent of the boundary conditions.
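The algebraic identity behind (A.1) and (4.28), namely that a product of factors $(e^{u_Y} - 1 + 1)$ expands into a sum over all subfamilies $\mathcal{Y}$ of the products $\prod_{Y \in \mathcal{Y}} (e^{u_Y} - 1)$, can be checked numerically on a toy example. The sketch below is only an illustration: the labels and weights are hypothetical and carry none of the geometric constraints on Y used in the text.

```python
import itertools
import math

def mayer_expand(u):
    """Sum over all subfamilies S of the labels of prod_{Y in S} (exp(u[Y]) - 1).

    The empty subfamily contributes the empty product, i.e. 1.
    """
    labels = list(u)
    total = 0.0
    for r in range(len(labels) + 1):
        for subfamily in itertools.combinations(labels, r):
            term = 1.0
            for y in subfamily:
                term *= math.expm1(u[y])  # exp(u) - 1, computed accurately
            total += term
    return total

# Hypothetical toy weights, standing in for Psi(Y, Gamma_1, Gamma_2) or U(Y, Gamma).
u = {"Y1": 0.3, "Y2": -0.1, "Y3": 0.05}
lhs = math.exp(sum(u.values()))  # exp of the sum over Y
rhs = mayer_expand(u)            # sum over subfamilies, as in (A.1)
print(lhs, rhs)
```

The two printed numbers agree up to rounding. When all weights are nonnegative, as for $U(Y, \Gamma)$ in (4.28), every term of the expansion is nonnegative as well.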
Proof of Lemma 4. First, observe that $\Phi^n_X$ is indeed a function of $\{s_x\}_{x \in X}$, since, by Proposition 3 and Lemma 3, $\rho(D_i)$, for $L^n D_i = X$, and $\varphi^+(Y)$, for $L^n Y = X$, are functions of $\{s_x\}_{x \in X}$. Besides, $\Phi^n_X$ is independent of the boundary conditions on $\Lambda_1$, by the argument given at the end of the proof of Lemma 3. Let us prove the bound (3.5). We consider separately each contribution to (4.18), and sum them over n. From (3.29, 3.30, 3.33), we have:
$$|\ln \rho^n(D_i)| \le k_0\beta_n N^n_{\Lambda_2}(D_i) \le k_0 c\beta_0 L^{nd}|D_i| \le C(L)\beta_0 L^{nd}, \qquad (A.12)$$
where, in the second inequality, we use Lemma 6, and in the third, we use $|D_i| \le L^\alpha$, since $D_i$ belongs to $\mathcal{D}_{\Lambda_2}$. Besides,
$$d(X) \le C(L)L^n, \qquad (A.13)$$
if $X = L^n D_i$ and $D_i$ belongs to $\mathcal{D}_{\Lambda_2}$. Let, for $x \in \mathcal{L}$,
$$n(x, s) = \max\{n \mid [L^{-n}x] \in D^n_{\Lambda_2}(s)\}. \qquad (A.14)$$
If $s \in \bar\Omega$, $n(x, s) < \infty$, because $[L^{-n}x] \in D^n_{\Lambda_2}(s)$ means, by (3.23), that there exists a y, with $d(y, [L^{-n}x]) \le 2$, and $N^n_y \neq 0$. However, for any fixed x, $N^n_y = 0$ for all such y’s, for n large enough, and $s \in \bar\Omega$. So we get the bound
$$\sum_n \sum_{X \ni x} \sum_{D \in \mathcal{D}^n_{\Lambda_2}} |\ln \rho^n(D)|\chi(L^n D = X) \exp(d(X)^{1-\eta}) \le \sum_n \sum_{D \in \mathcal{D}^n_{\Lambda_2}} |\ln \rho^n(D)|\chi([L^{-n}x] \in D) \exp(C(L)L^{n(1-\eta)}) \le C(L)\beta_0 \sum_{n=0}^{n(x,s)} L^{nd} \exp(C(L)L^{n(1-\eta)}) \equiv C_1(x, s) < \infty, \qquad (A.15)$$
since, for each n, $[L^{-n}x] \in D$ for at most one $D \in \mathcal{D}^n_{\Lambda_2}$. On the other hand,
$$d(X) \le C(L)L^n d(Y) \le C(L)L^n|Y| \qquad (A.16)$$
if $X = L^n Y$ and Y is connected. So, with $\eta = 4\alpha$,
$$\sum_n \sum_{X \ni x} \sum_{Y} |\varphi^+(Y)|\chi(L^n Y = X) \exp(d(X)^{(1-4\alpha)}) \le \sum_n \sum_{Y \ni [L^{-n}x]} |\varphi^+(Y)| \exp(C(L)L^{n(1-4\alpha)}|Y|) \le \sum_n \sum_{Y \ni [L^{-n}x]} |\varphi^+(Y)| \exp\Big(\frac{\beta_n L^{-2\alpha}|Y|}{2}\Big), \qquad (A.17)$$
where we used
$$C(L)L^{n(1-4\alpha)} \le \frac{\beta_n L^{-2\alpha}}{2}, \qquad (A.18)$$
which holds by (3.33) for all $n \ge 0$, provided $\beta_0$ is large enough. Then, the previous sum is bounded by
$$\sum_n \sum^{c}_{Y \ni [L^{-n}x]} \exp\Big(-\frac{\beta_n L^{-2\alpha}|Y|}{2}\Big) \le C \sum_n \exp\Big(-\beta_n \frac{L^{-2\alpha}}{2}\Big) \equiv C_2, \qquad (A.19)$$
using the bound (4.13) and the fact that the sum $\sum^{c}_{Y}$ runs over connected sets, since $\varphi^+(Y) = 0$ unless Y is connected. Now take $C(x, s) = C_1(x, s) + C_2$ to get (3.5).
Turning to the proof of (3.4), $\|\Phi^n_X\|_\infty \le c\beta_0|X|$ follows, for the second term in (4.18), from (A.12) (where $L^{nd}|D_i| \le |L^n D_i| = |X|$), since a given X can equal $L^n D_i$, for $D_i \in \mathcal{D}$, for at most one n. The last term in (4.18) is controlled by (4.13), (A.19). The existence of the limit $n \to \infty$ follows from the bounds (A.15, A.17, A.19).
Turning to the proof of (3.32), observe that the last term in (4.20) is independent of $s_V$, as long as $d([L^{-n}V], [L^{-n}\Lambda_2]^c) \ge L$, since each $\rho(D_i)$ is a function of $\{s_x \mid x \in L^n D_i\}$, $L^n D_i \cap \Lambda_2^c \neq \emptyset$, and $|D_i| \le L^\alpha$. The contribution to (3.32) of the previous term in (4.20) is bounded (on scale n), using (4.13) and the fact that $\varphi(Y)$ is a function of $s_V$ only for $Y \cap [L^{-n}V] \neq \emptyset$, by:
$$|[L^{-n}V]| \exp\big(-c\beta_n L^{-2\alpha} d([L^{-n}V], [L^{-n}\Lambda_2]^c)\big). \qquad (A.20)$$
Now, use (3.33), which implies $c\beta_n L^{-2\alpha} d([L^{-n}V], [L^{-n}\Lambda_2]^c) \ge L^{\eta n}(d(V, \Lambda_2^c))^{1-2\eta}$ (for $\beta_0$ large) and $|[L^{-n}V]| \le cL^{-nd}|V|$. The factor $L^{\eta n}$ controls the sum over n, and we get (3.32).
Proof of Lemma 5. Consider first $\Psi'(Y', \Gamma')$. From (4.14), (3.31) and (4.13) we have
$$|\tilde\Psi(Y, \hat\Gamma')| \le 3\exp(-\beta L^{-2\alpha}|Y|). \qquad (A.21)$$
Since Y is connected and all the terms in (4.26) have $d(Y) \ge \frac{L}{4}$, we have
$$|Y| \ge cL|Y'|, \qquad (A.22)$$
since $[L^{-1}Y] = Y'$. The number of terms in the sum in (4.26) is at most $2^{L^d|Y'|}$, since $Y \subset LY'$, so, using $cL\beta L^{-2\alpha} - L^d \ln 2 \ge \beta L^{1-4\alpha}$, for $\beta_0$ large, we get (see (4.33))
$$|\Psi'(Y', \Gamma')| \le \exp(-\beta L^{1-4\alpha}|Y'|) \qquad (A.23)$$
for $\beta_0$ large. Then observe that, by induction and Lemma 3, $\tilde\Psi(Y, \hat\Gamma') = 0$ unless Y is connected and $Y \cap \Gamma \neq \emptyset$ (see (4.14)). Hence, since $[L^{-1}Y] = Y'$, $\Psi'(Y', \hat\Gamma') = \Psi'(Y', \Gamma') = 0$ unless $Y'$ is connected and $Y' \cap \Gamma' \neq \emptyset$. Also by induction and Lemma 3, $\Psi(Y, \Gamma)$, $\tilde\varphi^+(Y, \Gamma)$ depend on $\Gamma$ only through $\Gamma \cap Y$; hence, since $\Gamma \subset L\hat\Gamma'$, $\tilde\Psi(Y, \hat\Gamma')$ and $\Psi'(Y', \Gamma')$ depend on $\Gamma'$ only through $\Gamma' \cap Y'$. Likewise, $\Psi(Y, \Gamma)$, $\tilde\varphi^+(Y, \Gamma)$ are functions (for $n \ge 1$) of $\{s_x \mid x \in L^n Y \cap \Lambda_2\}$, but since $Y \subset LY'$ when $[L^{-1}Y] = Y'$, $\Psi'(Y', \Gamma')$ is a function of $\{s_x \mid x \in L^{n+1}Y' \cap \Lambda_2\}$. The same conclusion holds for n = 0, because $\{x \mid d(x, Y) \le 2b\} \subseteq LY'$ when $[L^{-1}Y] = Y'$, for L large enough.
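The step leading to (A.23) uses the inequality $cL\beta L^{-2\alpha} - L^d \ln 2 \ge \beta L^{1-4\alpha}$, which holds once $\beta$ exceeds an explicit threshold, provided $cL^{2\alpha} > 1$. A minimal numerical sketch of that threshold is given below; the helper names and the sample values of L, d, $\alpha$ are hypothetical, and the generic constant c of the text is not specified there, so c = 1 is an assumption made purely for illustration.

```python
import math

def beta_threshold(L, d, alpha, c):
    """Smallest beta with c*L^(1-2*alpha)*beta - L^d*ln 2 >= beta*L^(1-4*alpha)."""
    gain = c * L ** (1 - 2 * alpha) - L ** (1 - 4 * alpha)
    if gain <= 0:
        # The inequality can only hold for large beta if c * L^(2*alpha) > 1.
        raise ValueError("need c * L**(2*alpha) > 1")
    return L ** d * math.log(2) / gain

def inequality_holds(beta, L, d, alpha, c):
    lhs = c * L * beta * L ** (-2 * alpha) - L ** d * math.log(2)
    return lhs >= beta * L ** (1 - 4 * alpha)

# Hypothetical sample values; c = 1 is an assumption, not a value from the text.
L, d, alpha, c = 16, 2, 0.05, 1.0
b_star = beta_threshold(L, d, alpha, c)
print(b_star, inequality_holds(2 * b_star, L, d, alpha, c))
```

Above the threshold the entropy factor $2^{L^d|Y'|}$ counting the terms in (4.26) is absorbed by the decay in (A.21), (A.22), which is the mechanism behind (A.23).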
Turning to ρ0 (γ 0 ), a similar argument shows that it is a function of {sx |x ∈ Ln+1 γ 0 ∩ 32 }. Let us prove (3.29) inductively: observe that it holds trivially for n = 0 because of (3.14). To proceed inductively, we use (4.30): ρ0 (γ 0 ) =
$$\sum_{(\Gamma,\mathcal{Y})}{}^{\!\prime}\; \bar\rho(\Gamma)\, V(\mathcal{Y},\Gamma)\, \exp\Big(\sum_i E(\hat\gamma_i', \Gamma \cap \hat\gamma_i')\Big)$$
(A.24)
with the constraints (4.31) and (4.32). Let us bound the different parts of this expression. Using (4.25) and the bounds (3.31), (4.13), we have: X −2α X −3α |E(γˆ i0 , 0 ∩ γˆ i0 )| ≤ C(L)e−βL |γˆ i0 | ≤ e−βL |γ 0 |, (A.25) i
i
since the sum here runs over disjoint subsets γˆ i0 of γ 0 , and the sum in (4.25) runs over connected sets Y that intersect Lγˆ 0 . Next using 4.3), the bounds (3.29) and (3.30), we have: ρ(0) ¯ ≤ exp βkN32 ((D32 \D32 ) ∩ 0) + 2βkN32 (D32 ∩ 0) − β|0\D32 | . (A.26) Since, by (3.22), N32 (0) = N32 (D32 ∩ 0) = N32 (D32 \D32 ) ∩ 0 + N32 (D32 ∩ 0). Now, we can write: |0\D32 | ≥ cL−α |0\(D32 \D32 )|,
(A.27)
because 0\D32 can be written as 0\(D32 \D32 )\D32 and, for each connected component of D32 , there is always at least one site in 0\(D32 \D32 ) which is adjacent to that component. By definition of D32 , N32 (D32 ∩ 0) ≤ L−3α |D32 ∩ 0| ≤ L−3α |0\(D32 \D32 )|
(A.28)
(since D32 ∩ 0 ⊂ 0\(D32 \D32 ) ). So we get, ρ(0) ≤ exp βkN32 ((D32 \D32 ) ∩ 0) − cβL−α |0\(D32 \D32 )| .
(A.29)
We have, by (4.28), (4.23), (3.31) and Lemma 3,
$$V(\mathcal{Y},\Gamma) \le \prod_{Y \in \mathcal{Y}} C e^{-\beta L^{-2\alpha}|Y|} \le \exp\left(-\beta L^{-3\alpha}|\mathcal{Y}|\right),$$
(A.30)
P where |Y| = Y ∈Y |Y |, and we put L−3α in order to eliminate the constant C. Inserting (A.25, A.29, A.30) in (A.24), we get X −3α ρ0 (γ 0 ) ≤ exp e−βL |γ 0 | + β 0 kN30 2 (γ 0 )
exp −βL
0,Y −3α
|Y| − cβL
−α
|0\(D32 \D32 )|
(A.31)
with the same constraints as before and we used, by definition (3.33) of β 0 , and (3.17) of N30 2 ,
βN32 (D32 \D32 ) ∩ 0 = β 0 N30 2 (γ 0 ).
(A.32) S
In order to control the sum over 0, Y, we decompose 0\(D32 \D32 ) = i Ai into connected components and call a component long if, either Ai = γ, with V (γ) ⊃ 0 ) 6= ∅ and we call Ai short otherwise. [L−n 32 ], or Ai ∩ L(γ 0 \D3 2 S S S S ` Write A = ` Ai and As = s Ai , where ` ( s ) is the union over the long (short) components. We claim that a long component satisfies d(Ai ) ≥ L; to show that, consider two cases: either Ai = γ, with V (γ) ⊃ [L−n 32 ], and d(Ai ) ≥ L because of 0 our restriction on n in Proposition 3; or Ai intersects L(γ 0 \D3 ), and, since 0 contains 2 only long contours, Ai must intersect D32 \D32 (see (4.5), in which case it must cross 0 \L([L−1 (D32 \D32 ]), which implies d(Ai ) ≥ L; the corridor of width L given by LD3 2 0 0 Moreover, each box in L(γ \D32 ) is intersected either by a long Ai or by a Y ∈ Y, all of which have a diameter at least L4 ; thus, we have X 0 |Ai | + |Y| > cL|γ 0 \D3 |. (A.33) 2 Ai ∈A`
We write then the sum over 0, Y as follows: X X X X exp −β(cL−α |Ai | + L−3α |Y|) exp(−cβL−α |Ai |). (A.34) A` ,Y
As
Ai ∈A`
Ai ∈As
Using (A.33) and β 0 = L1−4α β, this sum can be bounded by 0 exp(−2β 0 |γ 0 \D3 |) exp(e−βL 2
−3α
C(L)|γ 0 |),
for L large enough, where the last factor bounds the sums in (A.34). So, we get, going back to (A.31), −3α 0 0 | + β 0 kN30 2 (γ 0 ) + (C(L) + 1)e−βL |D3 ∩ γ 0 | , (A.35) ρ0 (γ 0 ) ≤ exp −β 0 |γ 0 \D3 2 2 −3α
0 0 using |γ 0 | = |γ 0 \D3 | + |D3 ∩ γ 0 |P and (C(L) + 1)e−βL ≤ β0. 2 2 0 0 0 0 0 By (3.22), we have N32 (γ ) = Di ⊂γ 0 N (Di ) = N (D32 ∩ γ 0 ), and, using Lemma 6, we can bound
β 0 kN30 2 (γ 0 ) + (C(L) + 1)e−βL
−3α
0 |D3 ∩ γ 0 | ≤ β 0 k 0 N30 2 (γ 0 ) 2
(A.36)
because, by (3.33), $k' - k = L^{-n}(1 - L^{-1})$ and $e^{-\beta_n L^{-3\alpha}} \ll L^{-n}$. This iterates (3.29).
To prove (3.30), observe that it follows for $n = 0$ because, see (2.1),
$$|H_D(\sigma_D|\tilde\sigma_{D^c})| \le 4d|D|,$$
(A.37)
and we obtain a lower bound, with β0 k0 = 4dβ, by keeping in (3.12) only the term with Tx = T¯ . Inductively, we use (A.38) ρ0 (Di0 ) ≥ exp(E(Di0 ))ρ (D32 \D32 ) ∩ LDi0 because all terms in (4.30) are positive. Then use (3.30) inductively, the fact, analogous to (A.32), that (A.39) βN32 (D32 \D32 ) ∩ LDi0 = β 0 N30 2 (Di0 ), (A.25) and (A.36) with γ 0 = Di0 .
Proof of Lemma 6. The upper bound follows inductively from (3.16) (for n = 0), βN (D(s)\D(s)) ∩ LDi0 = β 0 N 0 (Di0 ) X
and from
|Dj | ≤ Ld |Di0 |.
Dj ∈(D(s)\D(s))∩LDi0
As for the lower bound, it is obvious for n = 0 and β0 large. As above, we have X βN (Dj ), (A.40) β 0 N 0 (Di0 ) = j
P
runs over Dj ∈ (D(s)\D(s)) ∩ LDi0 and we have Di0 = [L−1 (∪j Dj )]. Write P 1 P2 P1 P2 the sum in (A.40) as + , where in , we have |Dj | > Lα and in , |Dj | ≤ Lα . In this latter case we must have N (Dj ) ≥ L−3α |Dj |, since Dj 6∈ D (see (3.18)). For P1 , we use Lemma 6 inductively where
j
X1
βN (Dj ) ≥
X1 p β0 Lαn/2 |Dj |,
j
(A.41)
j
and, since we have |Dj | ≥ Lα , we get Lαn/2
X1
0 |Dj | ≥ Lα(n+1)/2 |Di,1 |
(A.42)
j 0 with Di,1 = [L−1 (
S1 j
Dj )]. For
0
fact that each box Lx with x
0
box intersected by some Dj in X2
P2
, we have, using the definition (3.33) of βn and the S2 = [L−1 ( j Dj )] must contain or be adjacent to a
0 ∈ Di,2 P2
:
βN (Dj ) ≥ L−3α β
j
X2
|Dj |
j 0 | ≥ β0 L(1−4α)n L−3α c|Di,2 p α(n+1)/2 0 |Di,2 | ≥ β0 L
(A.43)
for α small enough and β0 large enough. Obviously (A.40-A.43) imply the lower bound in (4.42) on scale n + 1 since 0 0 | + |Di,2 | ≥ |Di0 |. |Di,1
(A.44)
n , N3n2 (Di ). The same arguments hold for D3 2
Acknowledgement. We would like to thank A. van Enter, R. Fernandez, C. Maes, C.-E. Pfister, A. Sokal, and K. Vande Velde for interesting discussions. This work was supported by NSF grant DMS-9205296 and by EC grant CHRX-CT93-0411.
Communicated by D. C. Brydges
Commun. Math. Phys. 194, 389 – 462 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Wulff Droplets and the Metastable Relaxation of Kinetic Ising Models
Roberto H. Schonmann (1,*), Senya B. Shlosman (2,3,**)
1 Mathematics Department, University of California at Los Angeles, Los Angeles, CA 90095, USA
2 Mathematics Department, University of California at Irvine, Irvine, CA 92697, USA
3 Institute for the Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
Received: 7 May 1997 / Accepted: 29 October 1997
Abstract: We consider the kinetic Ising models (Glauber dynamics) corresponding to the infinite volume Ising model in dimension 2 with nearest neighbor ferromagnetic interaction and under a positive external magnetic field $h$. Minimal conditions on the flip rates are assumed, so that all the common choices are being considered. We study the relaxation towards equilibrium when the system is at an arbitrary subcritical temperature $T$ and the evolution is started from a distribution which is stochastically lower than the (−)-phase. We show that as $h \searrow 0$ the relaxation time blows up as $\exp(\lambda_c(T)/h)$, with $\lambda_c(T) = w(T)^2/(12\,T\,m^*(T))$. Here $m^*(T)$ is the spontaneous magnetization and $w(T)$ is the integrated surface tension of the Wulff body of unit volume. Moreover, for $0 < \lambda < \lambda_c$, the state of the process at time $\exp(\lambda/h)$ is shown to be close, when $h$ is small, to the (−)-phase. The difference between this state and the (−)-phase can be described in terms of an asymptotic expansion in powers of the external field. This expansion can be interpreted as describing a set of $C^\infty$ continuations in $h$ of the family of Gibbs distributions with the negative magnetic fields into the region of positive fields.
Contents
1 Introduction
1.1 Preliminaries
1.2 Notation and terminology
1.3 Some tools and further definitions
1.4 Main results
1.5 Heuristics
2 Metastable Regime
2.1 Preliminaries
2.2 Bottlenecks for the dynamics
2.3 The restricted ensemble
2.4 Asymptotic expansion
2.5 More general initial distributions
3 Relaxation Regime
3.1 Preliminaries
3.2 Inverted pyramids and droplet growth
3.3 Rescaling and droplet creation
3.4 Double well structure of equilibrium distributions
3.5 Spectral gap estimates
* The work of R.H.S. was partially supported by the N.S.F. through grants DMS 9100725, DMS 9400644 and DMS 9703814.
** The work of S.B.S. was partially supported through grant DMS 9208029 and by the Russian Fund for Fundamental Research through grant 930101470.
1. Introduction
1.1. Preliminaries. This paper is a continuation of the paper [Sch 1], and contains substantial strengthening of the results of that paper in the case of dimension 2. We refer the reader to [Sch 1] for a discussion of the motivation and background of the problem. For introductions to metastability see, e.g., [GD] and [PL]. The precise results in the current paper can only be stated after enough notation is introduced and are therefore postponed to Sect. 1.4. We provide next an informal summary of our results.
Our concern in this paper is with the metastable behavior of the 2-dimensional Ising model, evolving with a reversible spin-flip dynamics, in the proximity of the phase-coexistence line. We study the system at an arbitrary subcritical temperature $T$ and under a small positive external magnetic field $h$. The results proven all refer to limits in which $h \searrow 0$. These results fully confirm, in particular, a conjecture raised by Aizenman and Lebowitz in [AL]. This conjecture was that if started from a typical configuration of the (−)-phase, for times of order $\exp(\lambda/h)$ with $\lambda$ below a critical value $\lambda_c$ the system would be in a sort of metastable state, close to the (−)-phase (in spite of the presence of the positive external field). On the other hand, for a time of order $\exp(\lambda/h)$ with $\lambda > \lambda_c$ the system would have already relaxed and so would be close to the (+)-phase. In [Sch 1] a weaker version of this conjecture was proven, with the first scenario occurring for $\lambda < \lambda_1$ and the second for $\lambda > \lambda_2$, with these two constants $\lambda_1$ and $\lambda_2$ having been explicitly estimated, but both being non-optimal. Moreover the temperature was supposed to be substantially lower than the critical one and the initial distribution had to be concentrated on the configuration with all spins down. On the good side, the results in [Sch 1] are valid in arbitrary dimension.
Here we will only consider dimension 2, but strengthen the result in the following ways:
1) The constants $\lambda_1$ and $\lambda_2$ are shown to be identical, the common value being given by
$$\lambda_c = \lambda_c(T) = \frac{w(T)^2}{12\,T\,m^*(T)}.$$
(1.1)
Here $m^*(T)$ is the spontaneous magnetization and $w(T)$ is the integrated surface tension of the Wulff body of unit volume. Note that all quantities on the right-hand side of (1.1) pertain to equilibrium statistical mechanics.
2) The (subcritical) temperature $T$ can be arbitrarily close to the critical one.
3) The initial distribution is only required to be stochastically lower than the (−)-phase. In particular it can be a Gibbs distribution under any negative value of the external field (as it would if the system were allowed to first relax to equilibrium under a negative external field and then, suddenly, the field was switched to a small positive value).
4) We also show that in a certain technical fashion it makes sense to say that at times of order $\exp(\lambda/h)$, with $\lambda < \lambda_c$, rather than being in the (−)-phase the system is better described as being in a metastable state which is infinitesimally (in $h$) higher than the (−)-phase. The rigorous result is presented in the form of an asymptotic expansion in powers of $h$. The metastable states can then be seen as a family of $C^\infty$ continuations into the region of positive external fields of the curve of the equilibrium states with negative external fields. (See also the discussion after the statement of the main result of this paper in Sect. 1.4.) The result about the $C^\infty$ continuations is not in conflict with the known fact, proven in [Isa], that at least at low enough temperature, there is no analytic continuation of the equilibrium states beyond the transition point.
Various comments regarding (1) above are in order. To our knowledge, this is the first rigorous relation established between the equilibrium Wulff shape and the time evolution of kinetic Ising models. (The first rigorous relation between the equilibrium surface tension and the time evolution of kinetic Ising models was, as far as we know, established in the fundamental paper [Mar], in the situation in which there is no external field.) It is important to point out that when we started the investigation which led to the current paper we could not see any evident reason for (1.1) to hold. This doubt was expressed, and to some extent discussed, in Sect. 1-iii of [Sch 1]. Since the doubt stemmed in part from the study of the metastable behavior of anisotropic Ising models in [KO], it is important to stress that regarding the problems treated in the current paper, our results and methods apply also to these models. In Sect. 1.5 we will present a heuristic picture which predicts (1.1) and is based on considering the free-energy of individual droplets of the stable phase in the midst of the metastable phase and taking into account droplet growth at a fast enough speed. The aspect of the heuristics which originally seemed weak to us was the idea that the evolution of these droplets is governed by their equilibrium free-energy. A recent detailed study of computer simulations of the metastable relaxation of two-dimensional kinetic Ising models in [RTMS], which was done independently of our work in this paper, also indicated the validity of (1.1).
It has become evident over the years that the metastable behavior of kinetic Ising models is very rich and that precise mathematical statements can be conjectured and sometimes proven in various different asymptotic regimes (see, e.g., Sect. 4 of [Sch 1] or a more complete discussion in [Sch2]). In a recent companion paper, [DS], results were obtained which are counterparts to some of those presented here, but in the case in which the external field is held fixed (positive and small) and the temperature is scaled to zero.
This paper is divided into three parts. In the remainder of the first one, to which this section belongs, we will be introducing notation and terminology, stating results, motivating these results heuristically and presenting some basic tools. In the second part we prove the results concerning the metastable regime, i.e., the behavior at times of order $\exp(\lambda/h)$ with $0 < \lambda < \lambda_c$. In the third part we prove the results concerning the relaxation regime, i.e., the behavior at times of order $\exp(\lambda/h)$ with $\lambda > \lambda_c$.
1.2. Notation and terminology. In this section we introduce a long sequence of definitions, notation and techniques. We tried to make everything as standard as possible, so that most readers will browse quickly through this section, finding few things which they are not familiar with. Most statements are made without proof, and we refer readers to the books [Geo and Lig], and other references therein, for explanation. Almost all the notation introduced below is identical to that in the papers [Sch1 and SS1].
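The lattice. The cardinality of a set $\Lambda \subset \mathbb{Z}^2$ will be denoted by $|\Lambda|$. The expression $\Lambda \subset\subset \mathbb{Z}^2$ will mean that $\Lambda$ is a finite subset of $\mathbb{Z}^2$. For each $x \in \mathbb{Z}^2$, we define the usual norms $\|x\|_p = (|x_1|^p + |x_2|^p)^{1/p}$, $p > 0$ finite, and $\|x\|_\infty = \max\{|x_1|, |x_2|\}$. The distance between two sets $\Lambda_1, \Lambda_2 \subset \mathbb{Z}^2$ in each one of these norms will be denoted by
$$\mathrm{dist}_p(\Lambda_1, \Lambda_2) = \inf\{\|x - y\|_p : x \in \Lambda_1,\ y \in \Lambda_2\}.$$
In case $\Lambda_1 = \{x\}$, we also write $\mathrm{dist}_p(\Lambda_1, \Lambda_2) = \mathrm{dist}_p(x, \Lambda_2)$. The interior and exterior boundaries of a set $\Lambda \subset \mathbb{Z}^2$ will be denoted, respectively, by
$$\partial_{\mathrm{int}}\Lambda = \{x \in \Lambda : \|x - y\|_1 = 1 \text{ for some } y \notin \Lambda\}$$
and
$$\partial_{\mathrm{ext}}\Lambda = \{x \notin \Lambda : \|x - y\|_1 = 1 \text{ for some } y \in \Lambda\}.$$
The $p$-norm diameter of a set $\Lambda \subset\subset \mathbb{Z}^2$ is defined by
$$\mathrm{diam}_p(\Lambda) = \max\{\mathrm{dist}_p(x, y) : x, y \in \Lambda\}.$$
Given a set $A \subset \mathbb{R}^2$, we will write $\Lambda(A) = A \cap \mathbb{Z}^2$. In case $A = [-l/2, l/2]^2$ is an $l \times l$ square centered at the origin, we simplify the notation to $\Lambda(l) = \Lambda(A) = \mathbb{Z}^2 \cap [-l/2, l/2]^2$. Given $A, B \subset \mathbb{R}^2$, $z \in \mathbb{R}^2$ and $c \in \mathbb{R}$, we define $A + B = \{x + y : x \in A,\ y \in B\}$, $A + z = A + \{z\}$, and $cA = \{cx : x \in A\}$. The set of bonds, i.e., (unordered) pairs of nearest neighbors, is defined as
$$\mathcal{B} = \{\{x, y\} : x, y \in \mathbb{Z}^d \text{ and } \|x - y\|_1 = 1\}.$$
Given a set $\Lambda \subset\subset \mathbb{Z}^2$ we define also
$$\mathcal{B}_\Lambda = \{\{x, y\} : x, y \in \Lambda \text{ and } \|x - y\|_1 = 1\},$$
$$\partial\mathcal{B}_\Lambda = \{\{x, y\} : x \in \Lambda,\ y \notin \Lambda \text{ and } \|x - y\|_1 = 1\}.$$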
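The lattice notions above are easy to experiment with on small sets. The following Python sketch is only an illustration and not part of the paper; all function names (`norm_p`, `dist_p`, `interior_boundary`, `exterior_boundary`, `box`, `bonds_inside`, `bonds_boundary`) are hypothetical, and sites are represented as pairs of integers.

```python
def norm_p(x, p):
    """p-norm of x in Z^2; p = float('inf') gives max(|x_1|, |x_2|)."""
    if p == float("inf"):
        return max(abs(x[0]), abs(x[1]))
    return (abs(x[0]) ** p + abs(x[1]) ** p) ** (1.0 / p)

def dist_p(A, B, p):
    """dist_p(A, B) = inf{ ||x - y||_p : x in A, y in B } for finite sets."""
    return min(norm_p((a[0] - b[0], a[1] - b[1]), p) for a in A for b in B)

def neighbors(x):
    """The four sites y with ||x - y||_1 = 1."""
    return [(x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)]

def interior_boundary(Lam):
    """Sites of Lam with at least one nearest neighbor outside Lam."""
    return {x for x in Lam if any(y not in Lam for y in neighbors(x))}

def exterior_boundary(Lam):
    """Sites outside Lam with at least one nearest neighbor in Lam."""
    return {y for x in Lam for y in neighbors(x) if y not in Lam}

def box(l):
    """Lambda(l) = Z^2 intersected with [-l/2, l/2]^2, for l > 0."""
    k = int(l / 2)  # largest integer k with k <= l/2
    return {(a, b) for a in range(-k, k + 1) for b in range(-k, k + 1)}

def bonds_inside(Lam):
    """B_Lam: unordered nearest-neighbor pairs with both sites in Lam."""
    return {frozenset((x, y)) for x in Lam for y in neighbors(x) if y in Lam}

def bonds_boundary(Lam):
    """Boundary bonds, stored as (x, y) with x in Lam and y outside Lam."""
    return {(x, y) for x in Lam for y in neighbors(x) if y not in Lam}
```

For instance, `box(2)` is the 3 × 3 square of sites around the origin, and `interior_boundary(box(2))` consists of every site of that square except the origin.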
Notions from percolation. A chain is a sequence of distinct sites x1 , . . . , xn , with the property that for i = 1, . . . , n − 1, ||xi − xi+1 ||1 = 1. The sites x1 and xn are called the end-points of the chain x1 , . . . , xn , and n is its length. A (*)-chain, its end-points and its length are defined in the same way, but with || · ||1 replaced by || · ||∞ . Informally this means that while chains can only move along bonds of Z2 , (*)-chains can also move along diagonals. A set of sites with the property that each two of them can be connected by a chain contained in the set is said to be a connected set. A chain or (*)-chain is said to connect two sets if it has one end-point in each set. A set of sites is said to be simply-connected in case it is connected and its complement is also a connected set. A circuit is a chain such that ||x1 − xn ||1 = 1. Similarly a (*)-circuit is a (*)-chain such that ||x1 − xn ||∞ = 1.
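The definitions of chains, (*)-chains, circuits and connected sets can be checked mechanically on small examples. The sketch below is an illustration under our own naming (`is_chain`, `is_circuit`, `is_connected` are hypothetical helpers, not notation from the paper); a chain is passed as a list of integer pairs.

```python
from collections import deque

def _step(a, b, star):
    """Distance between a and b: sup-norm if star is True, else 1-norm."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return max(dx, dy) if star else dx + dy

def is_chain(sites, star=False):
    """True if sites is a sequence of distinct sites with unit steps."""
    if len(set(sites)) != len(sites):
        return False
    return all(_step(a, b, star) == 1 for a, b in zip(sites, sites[1:]))

def is_circuit(sites, star=False):
    """A chain whose end-points are themselves at distance 1."""
    return is_chain(sites, star) and _step(sites[0], sites[-1], star) == 1

def is_connected(S):
    """True if any two sites of S are joined by a chain contained in S."""
    S = set(S)
    if not S:
        return True
    start = next(iter(S))
    seen, queue = {start}, deque([start])
    while queue:  # breadth-first search along nearest-neighbor steps
        x = queue.popleft()
        for y in ((x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)):
            if y in S and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen == S
```

Passing `star=True` switches from the 1-norm to the sup-norm, so that diagonal steps are allowed, which is exactly the informal difference between chains and (*)-chains described above.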
The configurations and observables. At each site in $\mathbb{Z}^2$ there is a spin which can take values −1 and +1. The configurations will therefore be elements of the set $\{-1, +1\}^{\mathbb{Z}^d} = \Omega$. Given $\sigma \in \Omega$, we write $\sigma(x)$ for the spin at the site $x \in \mathbb{Z}^2$. Two configurations are specially relevant, the one with all spins −1 and the one with all spins +1. We will use the simple notation − and + to denote them. The single spin space, $\{-1, +1\}$, is endowed with the discrete topology and $\Omega$ is endowed with the corresponding product topology. The following definition will be important when we introduce finite systems with boundary conditions later on; given $\Lambda \subset\subset \mathbb{Z}^2$ and a configuration $\eta \in \Omega$, we introduce
$$\Omega_{\Lambda,\eta} = \{\sigma \in \Omega : \sigma(x) = \eta(x) \text{ for all } x \notin \Lambda\}.$$
Real-valued functions with domain in $\Omega$ are called observables. For each observable $f$ we use the notation $\|f\|_\infty = \sup_{\eta \in \Omega} |f(\eta)|$. Local observables are those which depend only on the values of finitely many spins; more precisely, $f : \Omega \to \mathbb{R}$ is a local observable if there exists a set $S \subset\subset \mathbb{Z}^2$ such that $f(\sigma) = f(\eta)$ whenever $\sigma(x) = \eta(x)$ for all $x \in S$. The smallest $S$ with this property is called the support of $f$, denoted $\mathrm{Supp}(f)$. The topology introduced above on $\Omega$ has the nice feature that it makes the set of local observables dense in the set of all continuous observables.
In $\Omega$ the following partial order is introduced: $\eta \le \zeta$ if $\eta(x) \le \zeta(x)$ for all $x \in \mathbb{Z}^2$. A particularly important role will be played in this paper by the non-decreasing local observables. Each local observable can be written as the difference between two non-decreasing ones.
A (+)-chain in a configuration $\sigma$ is a chain of sites, $x_1, \dots, x_n$, as defined above, with the property that for each $i = 1, \dots, n$, $\sigma(x_i) = +1$. Given $\Lambda_1, \Lambda_2 \subset \mathbb{Z}^2$, we will use the notation $\{\Lambda_1 \overset{+}{\longleftrightarrow} \Lambda_2\}$ to denote the set of configurations in which there is a (+)-chain with one end-point in $\Lambda_1$ and one end-point in $\Lambda_2$. In case $\Lambda_1 = \{x\}$ we simplify this notation to $\{x \overset{+}{\longleftrightarrow} \Lambda_2\}$, and similarly for $\Lambda_2$. Given a configuration $\sigma$, we say that a site $x$ is (+)-connected to a set $\Lambda \subset \mathbb{Z}^2$ in $\sigma$ if $\sigma \in \{x \overset{+}{\longleftrightarrow} \Lambda\}$. The (+)-cluster of a set $\Lambda \subset \mathbb{Z}^2$ in the configuration $\sigma$ is the set of sites which are (+)-connected to $\Lambda$ in this configuration. Similar notions can be defined for (−)-connectedness, (+*)-connectedness and (−*)-connectedness. In particular the notation $\{\Lambda_1 \overset{-}{\longleftrightarrow} \Lambda_2\}$, $\{\Lambda_1 \overset{+*}{\longleftrightarrow} \Lambda_2\}$, and $\{\Lambda_1 \overset{-*}{\longleftrightarrow} \Lambda_2\}$ should now have self-explanatory meaning.
The probability measures. We endow $\Omega$ also with the Borel σ-algebra corresponding to the topology introduced above. In this fashion, each probability measure $\mu$ in this space can be identified by the corresponding expected values $\int f \, d\mu$ of all the local observables $f$. A sequence of probability measures, $(\mu_n)_{n=1,2,\dots}$, is said to converge weakly to the probability measure $\nu$ in case
$$\lim_{n \to \infty} \int f \, d\mu_n = \int f \, d\nu \quad \text{for every continuous observable } f.$$
(1.2)
The family of probability measures on $\Omega$ will be partially ordered by the following relation: $\mu \le \nu$ if
$$\int f \, d\mu \le \int f \, d\nu \quad \text{for every continuous non-decreasing observable } f.$$
(1.3)
Because the local observables are dense in the set of continuous observables, we can restrict ourselves to the local ones in (1.2) and (1.3). Moreover, because every local observable is the difference between two non-decreasing ones, we can also restrict ourselves to those in (1.2).
The Gibbs measures. We will consider always the formal Hamiltonian
$$H_h(\sigma) = -\frac{1}{2} \sum_{x,y \ \mathrm{n.n.}} \sigma(x)\sigma(y) - \frac{h}{2} \sum_x \sigma(x),$$
(1.4)
where $h \in \mathbb{R}$ is the external field and $\sigma \in \Omega$ is a generic configuration. In order to give precise definitions, we define, for each set $\Lambda \subset\subset \mathbb{Z}^2$ and each boundary condition $\eta \in \Omega$,
$$H_{\Lambda,\eta,h}(\sigma) = -\frac{1}{2} \sum_{\{x,y\} \in \mathcal{B}_\Lambda} \sigma(x)\sigma(y) - \frac{1}{2} \sum_{\{x,y\} \in \partial\mathcal{B}_\Lambda,\ y \notin \Lambda} \sigma(x)\eta(y) - \frac{h}{2} \sum_{x \in \Lambda} \sigma(x).$$
(1.5)
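To make the three sums in (1.5) concrete, here is a minimal Python sketch that evaluates $H_{\Lambda,\eta,h}(\sigma)$ for a finite set of sites. It is only an illustration: the function names and the data layout (a dict for $\sigma$ restricted to $\Lambda$, a function for the boundary condition $\eta$) are our own assumptions, not part of the paper.

```python
def neighbors(x):
    """The four nearest neighbors of the site x in Z^2."""
    return [(x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)]

def hamiltonian(Lam, sigma, eta, h):
    """H_{Lam,eta,h}(sigma) as in (1.5).

    Lam   : set of sites (pairs of integers)
    sigma : dict mapping each site of Lam to -1 or +1
    eta   : function mapping a site outside Lam to -1 or +1
    h     : external field
    """
    inner = 0.0     # sum over bonds with both end-points in Lam
    boundary = 0.0  # sum over bonds with x in Lam and y outside Lam
    for x in Lam:
        for y in neighbors(x):
            if y in Lam:
                inner += 0.5 * sigma[x] * sigma[y]  # each bond is visited twice
            else:
                boundary += sigma[x] * eta(y)
    field = sum(sigma[x] for x in Lam)
    return -0.5 * inner - 0.5 * boundary - 0.5 * h * field
```

With a single site and all boundary spins equal to +1, the spin +1 has lower energy than the spin −1 at $h = 0$, as expected for a ferromagnetic interaction.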
In what follows the temperature $T$ will often appear explicitly in the notation, for clarity. Later on in this paper we will usually be considering a situation in which the temperature is fixed, while we scale the external field $h$, and then the temperature will be omitted from the notation. Given $\Lambda \subset\subset \mathbb{Z}^2$, $\eta \in \Omega$, $E \subset \Omega$, $T > 0$ and $h \in \mathbb{R}$, we write
$$Z_{\Lambda,\eta,T,h}(E) = \sum_{\sigma \in \Omega_{\Lambda,\eta} \cap E} \exp(-\beta H_{\Lambda,\eta,h}(\sigma)),$$
where $\beta = 1/T$. We abbreviate $Z_{\Lambda,\eta,T,h} = Z_{\Lambda,\eta,T,h}(\Omega)$. The Gibbs (probability) measure in $\Lambda$ with boundary condition $\eta$ under external field $h$ and at temperature $T$ is now defined on $\Omega$ as
$$\mu_{\Lambda,\eta,T,h}(\sigma) = \begin{cases} \dfrac{\exp(-\beta H_{\Lambda,\eta,h}(\sigma))}{Z_{\Lambda,\eta,T,h}}, & \text{if } \sigma \in \Omega_{\Lambda,\eta}, \\ 0, & \text{otherwise.} \end{cases}$$
The Gibbs measures satisfy the following monotonicity relations to which we will refer as the FKG-Holley inequalities. If $\eta \le \zeta$ and $h_1 \le h_2$, then, for each $\Lambda \subset\subset \mathbb{Z}^2$ and $T > 0$,
$$\mu_{\Lambda,\eta,T,h_1} \le \mu_{\Lambda,\zeta,T,h_2}.$$
A Gibbs measure for the infinite system on $\mathbb{Z}^2$ is defined as any probability measure $\mu$ which satisfies the DLR equations in the sense that for every $\Lambda \subset\subset \mathbb{Z}^2$ and $\mu$-almost all $\eta \in \Omega$,
$$\mu(\,\cdot\,|\,\Omega_{\Lambda,\eta}) = \mu_{\Lambda,\eta,T,h}(\,\cdot\,).$$
(1.6)
Alternatively and equivalently, Gibbs measures can be defined as elements of the closed convex hull of the set of weak limit points of sequences of the form $(\mu_{\Lambda_i,\eta_i,h})_{i=1,2,\dots}$, where each $\Lambda_i$ is finite and $\Lambda_i \to \mathbb{Z}^2$, as $i \to \infty$, in the sense that $\bigcup_{i=1}^{\infty} \bigcap_{j=i}^{\infty} \Lambda_j = \mathbb{Z}^2$. For each value of $T$ and $h$, $\mu_{\Lambda(l),-,T,h}$ (resp. $\mu_{\Lambda(l),+,T,h}$) converges weakly, as $l \to \infty$, to a probability measure that we will denote by $\mu_{-,T,h}$ (resp. $\mu_{+,T,h}$). If $h \ne 0$ it is known that $\mu_{-,T,h} = \mu_{+,T,h}$, which will then be denoted simply by $\mu_{T,h}$; it is also
known that this is the only Gibbs measure for the infinite system in this case. If $h = 0$ the same is true if the temperature is larger than or equal to a critical value $T_c > 0$, and is false for $T < T_c$, in which case one says that there is phase coexistence. We use the following abbreviations and names:
$\mu_{-,T,0} = \mu_{-,T}$ = the (−)-phase,
$\mu_{+,T,0} = \mu_{+,T}$ = the (+)-phase.
Another known fact is that for each fixed $T$,
$$\mu_{T,h} \to \mu_{+,T} \ \text{weakly, as } h \searrow 0,$$
(1.7)
and
$$\mu_{T,h} \to \mu_{-,T} \ \text{weakly, as } h \nearrow 0.$$
(1.8)
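For very small volumes the finite-volume Gibbs measure can be computed by brute-force enumeration, which gives a way to see the FKG-Holley monotonicity numerically. The sketch below is an illustration only; the function names are hypothetical, the Hamiltonian is the one of (1.5), and the enumeration over all $2^{|\Lambda|}$ configurations is feasible only for a handful of sites.

```python
import math
from itertools import product

def neighbors(x):
    """The four nearest neighbors of the site x in Z^2."""
    return [(x[0] + 1, x[1]), (x[0] - 1, x[1]), (x[0], x[1] + 1), (x[0], x[1] - 1)]

def hamiltonian(Lam, sigma, eta, h):
    """H_{Lam,eta,h}(sigma) as in (1.5); sigma is a dict on Lam, eta a function."""
    inner = sum(sigma[x] * sigma[y] for x in Lam for y in neighbors(x) if y in Lam) / 2
    outer = sum(sigma[x] * eta(y) for x in Lam for y in neighbors(x) if y not in Lam)
    return -0.5 * inner - 0.5 * outer - 0.5 * h * sum(sigma[x] for x in Lam)

def gibbs_measure(Lam, eta, T, h):
    """Return (sites, mu): mu maps each spin tuple on Lam to its probability."""
    sites = sorted(Lam)
    weights = {}
    for spins in product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, spins))
        weights[spins] = math.exp(-hamiltonian(Lam, sigma, eta, h) / T)
    Z = sum(weights.values())  # partition function Z_{Lam,eta,T,h}
    return sites, {spins: w / Z for spins, w in weights.items()}

def mean_spin(Lam, eta, T, h, x):
    """Expected value of sigma(x) under the finite-volume Gibbs measure."""
    sites, mu = gibbs_measure(Lam, eta, T, h)
    i = sites.index(x)
    return sum(spins[i] * p for spins, p in mu.items())

plus = lambda y: 1    # the boundary condition "+"
minus = lambda y: -1  # the boundary condition "-"
box3 = {(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)}
```

Since $\sigma(0)$ is a non-decreasing local observable, the FKG-Holley inequalities imply that `mean_spin(box3, minus, T, h1, (0, 0))` does not exceed `mean_spin(box3, plus, T, h2, (0, 0))` whenever `h1 <= h2`; the sketch lets one check this for particular values of `T` and `h`.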
For the expected value corresponding to a Gibbs measure $\mu_{\dots}$, in finite or infinite volume, we will use the notation
$$\langle f \rangle_{\dots} = \int f \, d\mu_{\dots},$$
where $\dots$ stands for arbitrary subscripts. The corresponding conditional expectation, given the event $E$, will be denoted by $\langle f | E \rangle_{\dots}$. The spontaneous magnetization at temperature $T$ is defined as $m^*(T) = \langle \sigma(0) \rangle_{+,T}$. (Here we are using a common and convenient form of abuse of notation: $\sigma(x)$ is being used to denote the observable which associates to each configuration the value of the spin at the site $x$ in that configuration. This notation will also be used in other places.) It is known that $m^*(T) > 0$ if and only if $\mu_{-,T} \ne \mu_{+,T}$ and also that $\lim_{T \searrow 0} m^*(T) = 1$.
Surface tension and Wulff shape. The direction dependent 0-field surface tension is defined in the following way. First consider on $\mathbb{R}^2 \times \mathbb{R}^2$ the usual inner product $(x, y) = x_1 y_1 + x_2 y_2$. Let $S^1 = \{x \in \mathbb{R}^2 : \|x\|_2 = 1\}$, and for each vector $n \in S^1$, consider the following configuration, to be used as a boundary condition:
$$\eta(n)(x) = \begin{cases} +1, & \text{if } (x, n) \ge 0, \\ -1, & \text{if } (x, n) < 0. \end{cases}$$
The surface tension in the direction perpendicular to $n$ is given by
$$\tau_T(n) = \lim_{l \to \infty} -\frac{1}{\beta \|y(l) - z(l)\|_2} \log \frac{Z_{\Lambda(l),\eta(n),T,0}}{Z_{\Lambda(l),+,T,0}},$$
where $y(l)$ and $z(l) = -y(l)$ are the points where the straight line $\{x \in \mathbb{R}^2 : (x, n) = 0\}$ intersects the boundary of the square $\Lambda(l)$. It is known that for each $T < T_c$ the surface tension $\tau_T(\cdot)$ is a continuous, strictly positive and finite function. We shall use $\mathcal{D}$ to denote the set of all closed self-avoiding rectifiable curves $\gamma \subset \mathbb{R}^2$ that are a boundary of a bounded region, $\gamma = \partial V$, $V \subset \mathbb{R}^2$. Let us recall that a curve is called rectifiable if the supremum of the lengths of polygons, with edges connecting arbitrary collections of points chosen on the curve, in the order inherited from the curve, is finite (and equals then the length, $|\gamma|$, of the curve $\gamma$), and that a rectifiable curve has
a tangent at almost every point. It is easy to verify that a curve γ that is the boundary of a convex bounded region belongs to D. We can assign to each curve γ ∈ D the quantity Z W(γ) = WT (γ) = τT (ns )ds, γ
where $s$ parametrizes the curve $\gamma$ according to Euclidean length measured along this curve, and $n_s$ is the unit outward normal vector to the curve at the point $s \in \gamma$ (i.e. the vector orthogonal to the tangent in the considered point and oriented outward the region bounded by $\gamma$). The functional $\mathcal{W}_T$ will be called the Wulff functional associated to the zero-field direction-dependent surface tension $\tau_T(\cdot)$. Sometimes we will refer to it also as the integrated surface tension.

To every vector $n \in \mathbb{S}^1$ and $\lambda > 0$ we assign the half-plane
\[ L_{T,n,\lambda} = \{ x \in \mathbb{R}^2 : (x, n) \leq \lambda \tau_T(n) \}. \]
Let us consider the intersection
\[ W_{T,\lambda} = \bigcap_{n \in \mathbb{S}^1} L_{T,n,\lambda}. \tag{1.9} \]
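The intersection of half-planes in (1.9) can be explored numerically. The following is a minimal sketch, not part of the paper: it tests membership in the set by sampling finitely many directions $n$, so it only approximates the full intersection. The surface tension `toy_tau` is an assumption made for illustration (it is not the Ising surface tension $\tau_T$), and the function names and the number of sampled directions are hypothetical. For this toy choice the intersection is the square $\max(|x_1|, |x_2|) \leq \lambda$.

```python
import math


def toy_tau(n):
    """Assumed toy surface tension tau(n) = |n1| + |n2| (illustration only)."""
    return abs(n[0]) + abs(n[1])


def in_wulff_set(x, lam, tau=toy_tau, num_directions=3600):
    """Approximate test of x in W_lambda.

    Checks the half-plane condition (x, n) <= lam * tau(n) on a finite
    sample of unit vectors n, as in the intersection (1.9).
    """
    for k in range(num_directions):
        theta = 2.0 * math.pi * k / num_directions
        n = (math.cos(theta), math.sin(theta))
        if x[0] * n[0] + x[1] * n[1] > lam * tau(n) + 1e-12:
            return False
    return True


# A point inside the square, and a point outside it before and after rescaling.
print(in_wulff_set((0.9, -0.9), 1.0))
print(in_wulff_set((1.1, 0.0), 1.0))
print(in_wulff_set((2.2, 0.0), 2.0))
```

Scaling the point and $\lambda$ by the same factor leaves the answer unchanged, which reflects the fact that the shape of the set does not depend on $\lambda$.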
These sets clearly satisfy the scaling relation $W_{T,\lambda} = \lambda W_{T,1}$. In particular they keep the same shape, as $\lambda$ varies; this shape is called the Wulff shape. The Wulff body of volume 1 is defined as $W_T = W_{T,\lambda_0}$, where $\lambda_0$ is chosen so that its volume is indeed 1. $W_T$ is clearly convex and thus its boundary $\partial W_T \in D$. The following is therefore well defined,
\[ w = w(T) = \mathcal{W}_T(\partial W_T). \]
For each $T < T_c$, the boundary of the Wulff body satisfies the following variational principle. For all $\gamma \in D$ which are boundaries of regions of volume 1,
\[ w(T) \leq \mathcal{W}_T(\gamma), \tag{1.10} \]
with equality only in case $\gamma$ is a translation of $\partial W_T$.

The dynamics. We introduce now for the Ising model above, the type of time evolution which makes it into what is known as the kinetic Ising model, or stochastic Ising model, or dynamic Ising model or Glauber dynamics. First we recall that a spin flip system is defined as a Markov process on the state space $\Omega$, whose generator, $L$, acts on a generic local observable $f$ as
\[ (Lf)(\sigma) = \sum_{x \in \mathbb{Z}^d} c(x, \sigma) \bigl( f(\sigma^x) - f(\sigma) \bigr), \tag{1.11} \]
where $\sigma^x$ is the configuration obtained from $\sigma$ by flipping the spin at the site $x$, and $c(x, \sigma)$ is called the rate of flip of the spin at the site $x$ when the system is in the state $\sigma$. In order for this generator to be well defined and indeed generate a unique Markov process, one has to assume that the rates $c(x, \sigma)$ satisfy certain regularity conditions. For our purposes here, we will actually restrict ourselves to the following conditions, which are more than enough to assure the existence and uniqueness of the process.

(H1) (Translation invariance) For every $x, y \in \mathbb{Z}^d$,
Wulff Droplets and Metastable Relaxation
\[ c(x, \sigma) = c(x + y, \theta_y \sigma), \]
where $\theta_y \sigma$ is the configuration obtained by shifting $\sigma$ by $y$, i.e., $(\theta_y \sigma)(z) = \sigma(z - y)$.

(H2) (Finite range) There exists $R$ such that $c(0, \eta) = c(0, \zeta)$ if $\eta(x) = \zeta(x)$ whenever $\|x\|_\infty \leq R$. The minimal such $R$ is called the range of the interaction.

The connection between the rates of flip and the Hamiltonian (1.4) and the temperature $T = 1/\beta$ is established by imposing conditions which assure us that the Gibbs measures are not only invariant, but also reversible with respect to the dynamics. These conditions, called detailed balance, state that for each $x \in \mathbb{Z}^d$ and $\sigma \in \Omega$,
\[ c(x, \sigma) = c(x, \sigma^x) \exp(-\beta \Delta_x H_h(\sigma)), \tag{1.12} \]
where
\[ \Delta_x H_h(\sigma) = \sigma(x) \Bigl( \sum_{y : \{x,y\} \in B_\Lambda} \sigma(y) + h \Bigr), \]
which formally equals $H_h(\sigma^x) - H_h(\sigma)$. We will usually make the dependence on $h$ explicit, by writing $c_h(x, \sigma)$ for the rates. There are many examples of rates which satisfy the conditions of detailed balance (1.12) and also the other hypotheses, H(1) and H(2). The most common examples found in the literature are:

Example 1 (Metropolis Dynamics).
\[ c_h(x, \sigma) = \exp\bigl( -\beta (\Delta_x H_h(\sigma))^+ \bigr), \]
where $(a)^+ = \max\{a, 0\}$ is the positive part of $a$.

Example 2 (Heat Bath Dynamics).
\[ c_h(x, \sigma) = \frac{1}{1 + \exp(\beta \Delta_x H_h(\sigma))}. \]

Example 3.
\[ c_h(x, \sigma) = \exp\Bigl( -\frac{\beta}{2} \Delta_x H_h(\sigma) \Bigr). \]
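The detailed balance condition (1.12) can be checked numerically for the three example rates. The sketch below is an illustration under stated assumptions: it uses the fact that $\Delta_x H_h(\sigma^x) = -\Delta_x H_h(\sigma)$, takes the nearest-neighbor square lattice (so the sum of the neighboring spins lies in $\{-4, -2, 0, 2, 4\}$), and picks arbitrary values of $\beta$ and $h$. All function names are hypothetical.

```python
import math


def metropolis(delta, beta):
    """Example 1: exp(-beta * (Delta)^+)."""
    return math.exp(-beta * max(delta, 0.0))


def heat_bath(delta, beta):
    """Example 2: 1 / (1 + exp(beta * Delta))."""
    return 1.0 / (1.0 + math.exp(beta * delta))


def example3(delta, beta):
    """Example 3: exp(-(beta / 2) * Delta)."""
    return math.exp(-0.5 * beta * delta)


def detailed_balance_gap(rate, delta, beta):
    """c(x, sigma) - c(x, sigma^x) * exp(-beta * Delta).

    Flipping the spin at x changes the sign of Delta_x H_h, so the rate
    at sigma^x is the rate evaluated at -delta.
    """
    return rate(delta, beta) - rate(-delta, beta) * math.exp(-beta * delta)


def max_gap(rate, beta=0.7, h=0.1):
    """Largest violation of (1.12) over all local environments of a site."""
    # Delta_x H_h(sigma) = sigma(x) * (sum of the 4 neighbouring spins + h)
    deltas = [s * (nb + h) for s in (-1, 1) for nb in (-4, -2, 0, 2, 4)]
    return max(abs(detailed_balance_gap(rate, d, beta)) for d in deltas)


for rate in (metropolis, heat_bath, example3):
    print(rate.__name__, max_gap(rate))
```

The printed gaps should vanish up to floating-point rounding for all three rates.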
Each one of these rates satisfies also the further conditions below which will be needed for the analysis in this paper to be possible.

(H3) (Attractiveness and monotonicity in $h$) If $\eta(x) \leq \zeta(x)$ and $h_1 \leq h_2$, then
\[ c_{h_1}(x, \eta) \leq c_{h_2}(x, \zeta) \quad \text{if } \eta(x) = \zeta(x) = -1, \]
\[ c_{h_1}(x, \eta) \geq c_{h_2}(x, \zeta) \quad \text{if } \eta(x) = \zeta(x) = +1. \]

(H4) (Uniform boundedness of rates) For each temperature $T$ there is $h(T) > 0$ and $0 < c_{\min}(T) \leq c_{\max}(T) < \infty$ such that for all $h \in (-h(T), h(T))$ and $\sigma \in \Omega$,
\[ c_{\min}(T) \leq c_h(0, \sigma) \leq c_{\max}(T). \]
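Hypothesis (H3) can also be checked by brute force for a nearest-neighbor rate. The sketch below is an illustration, not the paper's argument. It assumes the Metropolis rate of Example 1 on the square lattice, where the rate at $x$ depends on the configuration only through $\sigma(x)$ and the sum of the four neighboring spins, and where an ordering of two configurations implies the same ordering of their neighbor sums. The function names and the sample values of $\beta$ and of the fields are hypothetical.

```python
import math
from itertools import product


def metropolis_rate(spin, neighbour_sum, h, beta):
    """Metropolis rate written in terms of sigma(x) and the neighbour sum."""
    delta = spin * (neighbour_sum + h)
    return math.exp(-beta * max(delta, 0.0))


def is_attractive(rate, beta=0.7, fields=(-0.2, 0.0, 0.3)):
    """Check both inequalities of (H3) over all ordered local environments."""
    sums = (-4, -2, 0, 2, 4)
    for s_eta, s_zeta in product(sums, repeat=2):
        if s_eta > s_zeta:
            continue  # only environments compatible with eta <= zeta
        for h1, h2 in product(fields, repeat=2):
            if h1 > h2:
                continue
            # spins equal to -1: the rate of flipping up must not decrease
            if rate(-1, s_eta, h1, beta) > rate(-1, s_zeta, h2, beta) + 1e-15:
                return False
            # spins equal to +1: the rate of flipping down must not increase
            if rate(+1, s_eta, h1, beta) < rate(+1, s_zeta, h2, beta) - 1e-15:
                return False
    return True


print(is_attractive(metropolis_rate))
```

Reversing the sign of the coupling inside the rate gives a simple example for which the same check fails.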
Throughout this paper we will suppose that we have chosen and kept fixed a set of rates $c_h(x, \sigma)$ which satisfy the detailed balance conditions, (1.12) and all the hypotheses H(1) - H(4). This spin flip system will be denoted by $(\sigma^{\eta}_{h;t})_{t \geq 0}$, where $\eta$ is the initial configuration. If this initial configuration is selected at random according to a probability measure $\nu$, then the resulting process is denoted by $(\sigma^{\nu}_{h;t})_{t \geq 0}$. The probability measure on the space of trajectories of the process will be denoted by $\mathbb{P}$, and the corresponding expectation by $\mathbb{E}$. (Later, when we couple various related processes, we will also use the symbols $\mathbb{P}$ and $\mathbb{E}$ to denote probabilities and expectations in some larger probability spaces, but no confusion should arise from this.)

The assumption of detailed balance, (1.12), assures that the Gibbs measures are invariant with respect to the stochastic Ising models. Moreover, from the assumption of attractiveness, H(3), one obtains the following convergence results
\[ \sigma^{-}_{h;t} \to \mu_{-,T,h} \quad \text{and} \quad \sigma^{+}_{h;t} \to \mu_{+,T,h}, \]
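weakly, as $t \to \infty$.

We will want to consider, sometimes as a tool, and sometimes for its own sake, the counterpart of the stochastic Ising model that we are considering, on an arbitrary finite set $\Lambda \subset\subset \mathbb{Z}^2$, with some boundary condition $\xi \in \Omega$. This process, which will be denoted by $(\sigma^{\eta}_{\Lambda,\xi,h;t})_{t \geq 0}$, where $\eta \in \Omega_{\Lambda,\xi}$ is the initial configuration, is defined as the spin flip system with rates of flip given by
\[ c_{\Lambda,\xi,h}(x, \sigma) = \begin{cases} c_h(x, \sigma) & \text{if } \sigma, \sigma^x \in \Omega_{\Lambda,\xi}, \\ 0 & \text{otherwise.} \end{cases} \]
When $\sigma, \sigma^x \in \Omega_{\Lambda,\xi}$, (1.12) yields, for all $x \in \mathbb{Z}^d$,
\[ \mu_{\Lambda,\xi,h}(\sigma) \, c_{\Lambda,\xi,h}(x, \sigma) = \mu_{\Lambda,\xi,h}(\sigma^x) \, c_{\Lambda,\xi,h}(x, \sigma^x), \tag{1.13} \]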
which is the usual reversibility condition for finite state-space Markov processes. (Conversely, if one requires (1.13) to be satisfied for arbitrary $\Lambda \in \mathcal{F}$ and $\xi \in \Omega$, then one can deduce that (1.12) must hold.) It is clear from H(4) that $(\sigma^{\eta}_{\Lambda,\xi,h;t})$ is irreducible and hence from (1.13) it follows that, for any $\eta$,
\[ \sigma^{\eta}_{\Lambda,\xi,h;t} \to \mu_{\Lambda,\xi,T,h}, \]
weakly, as t → ∞.
Graphical construction. In order to prove our claims in this paper, we will use a standard graphical construction which provides versions of the whole family of processes at a given temperature T , with arbitrary value of h ∈ (−h(T ), h(T )), either on the infinite lattice Z2 or on any of its finite subsets, with arbitrary boundary conditions and starting from any initial configuration, all on the same probability space. This construction is the same one used in [Sch 1]. But the relevance of this construction will be even greater here than it was in that paper, since in part 3 of our paper we will use it to define the process on regions of space-time which are fairly general, and we will set up a rescaling procedure based on such objects. The graphical construction that we use is a specific version of what is called basic coupling between spin flip processes: a coupling in which the spins flip together as
much as possible, considering the constraint that they have to flip with certain rates. The construction is carried out by first associating to each site $x \in \mathbb{Z}^2$ two independent Poisson processes, each one with rate $c_{\max}(T)$. We will denote the successive arrival times (after time 0) of these Poisson processes $(\tau^{+}_{x,n})_{n=1,2,\dots}$ and $(\tau^{-}_{x,n})_{n=1,2,\dots}$. Assume that the Poisson processes associated to different sites are also mutually independent. We say that at each point in space-time of the form $(x, \tau^{+}_{x,n})$ there is an upward mark and that at each point of the form $(x, \tau^{-}_{x,n})$ there is a downward mark. Next we associate to each arrival time $\tau^{*}_{x,n}$, where $*$ stands for $+$ or $-$, a random variable $U^{*}_{x,n}$ with uniform distribution between 0 and 1. All these random variables are supposed to be independent among themselves and independent from the previously introduced Poisson processes. This finishes the construction of the probability space. The corresponding probability and expectation will be denoted, respectively, by $\mathbb{P}$ and $\mathbb{E}$.

We have to say now how the various processes are constructed on this probability space. For finite $\Lambda$ and arbitrary $\xi$, the process $(\sigma^{\eta}_{\Lambda,\xi,h;t})$ is constructed as follows. We know that almost surely the random times $\tau^{*}_{x,n}$, $x \in \Lambda$, $n = 1, 2, \dots$, $* = +, -$, are all distinct, and we update the state of the process at each time when there is a mark at some $x \in \Lambda$ according to the following rules. If the mark that we are considering is at the point $(x, \tau^{*}_{x,n})$, and the configuration immediately before time $\tau^{*}_{x,n}$ was $\sigma$, then
i) The spins not at $x$ do not change.
ii) If $\sigma(x) = -1$ (resp. $\sigma(x) = +1$), then the spin at $x$ can only flip if the mark is of upward type (resp. downward type).
iii) If the mark is upward and $\sigma(x) = -1$, or if the mark is downward and $\sigma(x) = +1$, then we flip the spin at $x$ if and only if $c_{\Lambda,\xi,h}(x, \sigma) > U^{*}_{x,n} c_{\max}$.
One can readily see that the process constructed in this fashion has the correct rates of flip.
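The finite-volume construction just described can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' code: it uses the Metropolis rate of Example 1 (for which $c_{\max} = 1$ is a valid upper bound), a square box of side `L`, and a constant boundary condition equal to $+1$ or $-1$ outside the box. All names and parameter values are hypothetical. Running two initial configurations with the same marks and the same uniform variables is the basic coupling.

```python
import math
import random


def metropolis_rate(sigma, x, L, boundary, h, beta):
    """Rate exp(-beta * (Delta_x H_h)^+) in an L x L box with constant boundary."""
    i, j = x
    s = 0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        s += sigma[(a, b)] if 0 <= a < L and 0 <= b < L else boundary
    delta = sigma[x] * (s + h)
    return math.exp(-beta * max(delta, 0.0))


def sample_marks(L, t_max, c_max, rng):
    """Marks (time, site, type, U): two rate-c_max Poisson processes per site."""
    marks = []
    for i in range(L):
        for j in range(L):
            for kind in (+1, -1):  # +1: upward mark, -1: downward mark
                t = rng.expovariate(c_max)
                while t <= t_max:
                    marks.append((t, (i, j), kind, rng.random()))
                    t += rng.expovariate(c_max)
    marks.sort()  # process the marks in increasing order of time
    return marks


def evolve(initial, marks, L, boundary, h, beta, c_max=1.0):
    """Apply rules (i)-(iii) to the initial configuration along the marks."""
    sigma = dict(initial)
    for _, x, kind, u in marks:
        # rule (ii): an upward mark can only flip a -1 spin, a downward mark a +1 spin
        if sigma[x] == -kind:
            # rule (iii): flip if and only if the rate exceeds U * c_max
            if metropolis_rate(sigma, x, L, boundary, h, beta) > u * c_max:
                sigma[x] = kind
    return sigma


rng = random.Random(0)
L = 6
sites = [(i, j) for i in range(L) for j in range(L)]
marks = sample_marks(L, t_max=5.0, c_max=1.0, rng=rng)
low = evolve({x: -1 for x in sites}, marks, L, boundary=-1, h=0.05, beta=0.8)
high = evolve({x: +1 for x in sites}, marks, L, boundary=+1, h=0.10, beta=0.8)
print(all(low[x] <= high[x] for x in sites))
```

Because the Metropolis rates are attractive and monotone in $h$, the ordering between the two coupled configurations is preserved at every mark, whatever the realization of the marks.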
In principle, one would like to construct the processes on the infinite lattice $\mathbb{Z}^2$ in a similar fashion, with $c_h(x, \sigma)$ replacing $c_{\Lambda,\xi,h}(x, \sigma)$ in (iii). Some extra care has to be taken, because during any non-degenerate interval of time infinitely many marks occur. This is not a real problem, because of the assumption that the range of the interaction, $R$, is finite. Starting from a configuration $\eta$ at time 0, we have to say how the spin at a generic site $x$ at a time $t$ is obtained. Using percolation arguments one can argue that on a set of probability 1 in the probability space where the marks were defined, for any fixed $x$ and $t$, if we take any boundary condition $\xi$, then the sequence $(\sigma^{\eta}_{\Lambda(l),\xi,h;t}(x))_{l=1,2,\dots}$ will converge as $l \to \infty$ (i.e., will become constant for large $l$), to a limit which does not depend on $\xi$. This limit can then be taken to be the value of $\sigma^{\eta}_{h;t}(x)$, and it is clear that the process thus constructed has the correct flip rates and is therefore a version of the process $(\sigma^{\eta}_{h;t})$. The expected value of a function of that process will be denoted by $\langle \cdot \rangle^{\eta}_{h;t}$.

A standard proof of the claim above about insensitivity to receding boundary conditions can be found in [Sch 1]. Those estimates presented in [Sch 1] show also that even if we let $t$ grow with $l$, but keeping $l/t$ large enough, then the spin at a fixed site $x$ is almost insensitive up to time $t$ to what happens outside of the box $\Lambda(l)$. We state this result in the form of a lemma for future reference. This lemma is a rigorous counterpart to the informal statement that because of the finite range of the interaction and of the uniform upper bound on the rates of flip, “the effects travel with a bounded speed”.

Lemma 1.2.1. For each temperature $T$, there exists a finite positive constant $C(T)$ such that if $l \geq C(T) t$, then there exist a positive function $C_1(x)$, $x \in \mathbb{Z}^2$ and a positive constant $C_2$, such that for every site $x \in \mathbb{Z}^2$,
\[ \sup_{h \in (-h(T), h(T))} \, \sup_{\xi \in \Omega} \, \sup_{\eta \in \Omega} \mathbb{P}\bigl( \sigma^{\eta}_{h;t}(x) \neq \sigma^{\eta}_{\Lambda(l),\xi,h;t}(x) \bigr) \leq C_1(x) \exp\{-C_2 l\}. \]
Proof. See [Sch 1].

Because of the hypotheses (H3), of attractiveness and monotonicity in $h$, the coupling provided by the construction above preserves the order between the coupled marginal processes, in various cases. In this paper we will need the following facts. If $\eta \leq \zeta$, $\xi \leq \xi'$, $-h(T) < h_1 \leq h_2 < h(T)$ and $\Lambda \subset\subset \mathbb{Z}^2$ is arbitrary, then for all $t \geq 0$,
\[ \sigma^{\eta}_{\Lambda,\xi,h_1;t} \leq \sigma^{\zeta}_{\Lambda,\xi',h_2;t}, \tag{1.14} \]
\[ \sigma^{\eta}_{h_1;t} \leq \sigma^{\zeta}_{h_2;t}, \tag{1.15} \]
and
\[ \sigma^{\eta}_{\Lambda,-,h_1;t} \leq \sigma^{\zeta}_{h_2;t}. \tag{1.16} \]
We will refer to these inequalities as basic-coupling inequalities. (Observe that the FKG-Holley inequalities for the models we are considering can be derived from (1.14).)

We will sometimes have to enlarge the probability space defined by the graphical construction to accommodate a random choice of the initial configuration, according to some distribution $\nu$, performed independently of the subsequent time evolution of the process. In this case we will replace the initial configuration with $\nu$ in the notation for the process, but, in a slight abuse of notation, we will still use $\mathbb{P}$ and $\mathbb{E}$ for probabilities and expectations in this enlarged probability space.

A few more remarks on notation and conventions. We will use $C$, $C_1$, $C_2$, $C'$, $C(T)$, etc., to denote positive finite constants, whose precise values are not relevant and may even change from appearance to appearance. We will omit the temperature $T$ in most of the notation, since it is fixed.

1.3. Some tools and further definitions. Before we can state our main results, we need to introduce a few more concepts. This will be done in this section, intermingled with the presentation of some basic techniques.

A fundamental fact that we will use often is that if $T < T_c$ then there is a finite positive constant $C(T)$ such that for all $h \geq 0$, all $\Lambda \subset \mathbb{Z}^2$ and all $x, y \in \mathbb{Z}^2$,
\[ \mu_{\Lambda,+,h}\bigl( x \stackrel{-*}{\longleftrightarrow} y \bigr) \leq \mu_{+,h}\bigl( x \stackrel{-*}{\longleftrightarrow} y \bigr) \leq \mu_{+}\bigl( x \stackrel{-*}{\longleftrightarrow} y \bigr) \leq \exp(-C(T) \|x - y\|_\infty). \tag{1.17} \]
The first two inequalities above are instances of the FKG-Holley inequalities (and hold, of course, for all $T$), and the third one is Theorem 1 in [CCS]. We review next some well known ways to exploit these inequalities in combination with the FKG-Holley inequalities and the Markov property of the Gibbs measures. Until we say otherwise, we will be supposing that $f$ is an increasing local observable. We start with the observation that if $\nu_1$ and $\nu_2$ are two probability distributions on $\Omega$, and the event $E \subset \Omega$ is such that $\nu_1(\,\cdot\,) \leq \nu_2(\,\cdot\,|E^c)$, then
\[
\begin{aligned}
\int f \, d\nu_1 - \int f \, d\nu_2
&= \int f \, d\nu_1 - \nu_2(E) \int f \, d\nu_2(\,\cdot\,|E) - \nu_2(E^c) \int f \, d\nu_2(\,\cdot\,|E^c) \\
&= \int f \, d\nu_1 - \int f \, d\nu_2(\,\cdot\,|E^c) + \nu_2(E) \Bigl( - \int f \, d\nu_2(\,\cdot\,|E) + \int f \, d\nu_2(\,\cdot\,|E^c) \Bigr) \\
&\leq 2 \|f\|_\infty \, \nu_2(E).
\end{aligned}
\tag{1.18}
\]
Suppose that $\Lambda \subset \Lambda' \subset\subset \mathbb{Z}^2$ and $\eta \in \Omega$. By partitioning the complement of the event $E = \bigl\{ \mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \Lambda^c \bigr\}$ according to what the $(-)$-cluster of the set $\Lambda^c$ is, and using the FKG-Holley inequalities and the Markov property of the Gibbs distributions, one obtains
\[ \mu_{\Lambda',\eta,h}(\,\cdot\,|E^c) \geq \mu_{\Lambda,+,h}(\,\cdot\,). \tag{1.19} \]
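Therefore, from the FKG-Holley inequalities and (1.18) we obtain for any $\eta \in \Omega$
\[ 0 \leq \langle f \rangle_{\Lambda,+,h} - \langle f \rangle_{\Lambda',\eta,h} \leq 2 \|f\|_\infty \, \mu_{\Lambda',\eta,h}\bigl( \mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \Lambda^c \bigr). \tag{1.20} \]
If we let now $\Lambda' \to \mathbb{Z}^2$ and suppose that $\mu_{\Lambda',\eta,h} \to \mu$ weakly for some distribution $\mu$, we obtain
\[ 0 \leq \langle f \rangle_{\Lambda,+,h} - \int f \, d\mu \leq 2 \|f\|_\infty \, \mu\bigl( \mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \Lambda^c \bigr). \tag{1.21} \]
By taking $\eta = +$ and combining (1.21) with (1.17) we obtain, when $T < T_c$ and $h \geq 0$,
\[ 0 \leq \langle f \rangle_{\Lambda,+,h} - \langle f \rangle_{+,h} \leq C(f) \exp\bigl( -C(T) \, \mathrm{dist}_\infty(\mathrm{Supp}(f), \Lambda^c) \bigr). \tag{1.22} \]
Next we consider the correlation functions
\[ \langle f; g \rangle_{\Lambda,+,h} = \langle f g \rangle_{\Lambda,+,h} - \langle f \rangle_{\Lambda,+,h} \langle g \rangle_{\Lambda,+,h}, \]
where $f$ and $g$ are two local observables, not necessarily increasing. We claim that there is a finite positive numerical constant $C$ such that for each $T$ and $h$,
\[ |\langle f; g \rangle_{\Lambda,+,h}| \leq C \|f\|_\infty \|g\|_\infty \, \mu_{\Lambda,+,h}\bigl( \mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \mathrm{Supp}(g) \bigr). \tag{1.23} \]
There is no loss in generality in supposing that $f$ and $g$ are increasing and have $\|f\|_\infty \leq 1$, $\|g\|_\infty \leq 1$. Given $\eta \in \{-1, +1\}^{\mathrm{Supp}(g)}$, let $E_\eta$ be the event that the configuration restricted to $\mathrm{Supp}(g)$ is identical to $\eta$, and let $g_\eta$ be the value that $g$ assumes on $E_\eta$. Note now that using first the FKG-Holley inequalities and then (1.20), we obtain
\[
\begin{aligned}
\langle f g \rangle_{\Lambda,+,h}
&= \sum_{\eta \in \{-1,+1\}^{\mathrm{Supp}(g)}} \langle f | E_\eta \rangle_{\Lambda,+,h} \, g_\eta \, \mu_{\Lambda,+,h}(E_\eta) \\
&\leq \langle f \rangle_{\Lambda \setminus \mathrm{Supp}(g),+,h} \, \langle g \rangle_{\Lambda,+,h} \\
&\leq \Bigl[ \langle f \rangle_{\Lambda,+,h} + 2 \mu_{\Lambda,+,h}\bigl( \mathrm{Supp}(f) \stackrel{-}{\longleftrightarrow} \mathrm{Supp}(g) \bigr) \Bigr] \langle g \rangle_{\Lambda,+,h}.
\end{aligned}
\]
This immediately implies (1.23).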
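In case $T < T_c$ and $h \geq 0$, we can combine (1.23) with (1.17) to obtain
\[ |\langle f; g \rangle_{\Lambda,+,h}| \leq C(f, g) \exp\bigl( -C(T) \, \mathrm{dist}_\infty(\mathrm{Supp}(f), \mathrm{Supp}(g)) \bigr), \tag{1.24} \]
an estimate that in particular is uniform in $h \geq 0$ and in $\Lambda$.

A consequence of the exponential decay of correlations in (1.24) is that when $T < T_c$ the function $\langle f \rangle_{+,h}$ of $h \geq 0$, which to $h = 0$ associates $\langle f \rangle_{+}$ and to $h > 0$ associates $\langle f \rangle_{h}$, is infinitely differentiable at $h = 0$. Moreover, for $j = 1, 2, \dots$, the following identity holds:
\[ \frac{d^j \langle f \rangle_{+,h}}{dh^j} \bigg|_{h=0+} = \Bigl( \frac{\beta}{2} \Bigr)^{j} \sum_{x_1, \dots, x_j \in \mathbb{Z}^2} \langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{+}. \tag{1.25} \]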
In this expression, which will be justified below, the quantities that appear inside of the summation are called generalized Ursell functions, and are defined next. We start by defining the generalized Ursell functions for a Gibbs measure $\mu_{\Lambda,\eta,h}(\,\cdot\,)$, where $\Lambda$ is a finite set. For this purpose we consider a generalization of the Hamiltonian, in which at each site $x$ the external applied field may be different, and takes the value $h_x$. If $\underline{h}$ is the function that to each $x \in \mathbb{Z}^2$ associates $h_x$, then we will denote by $\mu_{\Lambda,\eta,\underline{h}}$ the corresponding Gibbs distribution in $\Lambda$ with boundary condition $\eta$. With a local observable $f$ given we define
\[ \langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{\Lambda,\eta,h} = \Bigl( \frac{2}{\beta} \Bigr)^{j} \frac{\partial^j \langle f \rangle_{\Lambda,\eta,\underline{h}}}{\partial h_{x_1} \cdots \partial h_{x_j}} \bigg|_{\underline{h} \equiv h}, \]
where $\underline{h} \equiv h$ means that the function $\underline{h}$ is identically $h$. It is easy to see, by induction on $j$, that $\langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{\Lambda,\eta,h}$ is a linear combination of products of $\mu_{\Lambda,\eta,h}$-expected values of local observables, all of them with support contained in $\mathrm{Supp}(f) \cup \{x_1, \dots, x_j\}$. Convergence of the generalized Ursell functions as $\Lambda \to \mathbb{Z}^2$ follows from this, for arbitrary $T$, $h$ and $\eta$, as long as $\mu_{\Lambda,\eta,h}$ converges weakly to some distribution. In case $T < T_c$, $h \geq 0$ and $\eta = +$, we can do better and obtain from (1.22) the bound
\[ |\langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{\Lambda,+,h} - \langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{+,h}| \leq C_j(f) \exp\bigl( -C(T) \, \mathrm{dist}_\infty(\mathrm{Supp}(f) \cup \{x_1, \dots, x_j\}, \Lambda^c) \bigr). \tag{1.26} \]
Observe that in particular this estimate is uniform in $h \geq 0$.

The proof of (1.25) can be found in Sect. 2 of [M-L]. In that paper the result was proven at low enough temperature, but this was so simply because the estimate (1.24), which is a consequence of (1.17), was not available then. Replacing Theorem 1 in [M-L] with (1.24), the proof in that paper applies up to $T_c$. The basic estimate used in [M-L], and to which we will return in Sect. 2.4, states that the exponential decay of correlations (1.24) implies a similar exponential decay for the generalized Ursell functions, when the diameter of the set $\mathrm{Supp}(f) \cup \{x_1, \dots, x_j\}$ becomes large:
\[ |\langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{\Lambda,+,h}| \leq C_j(T, f) \exp\Bigl( -C(T) \, \frac{\mathrm{diam}_\infty(\mathrm{Supp}(f) \cup \{x_1, \dots, x_j\})}{j} \Bigr). \tag{1.27} \]
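For $j = 1$ the definition of the generalized Ursell function reduces to a covariance, $\langle f; \sigma(x) \rangle = \langle f \sigma(x) \rangle - \langle f \rangle \langle \sigma(x) \rangle$. The sketch below checks this numerically on a very small system. It is an illustration under assumptions that are not taken from the text: the Hamiltonian is taken to be $-\frac{1}{2} \sum_{\{a,b\}} \sigma(a)\sigma(b) - \frac{1}{2} \sum_x h_x \sigma(x)$ (a form consistent with the expression for $\Delta_x H_h$ in (1.12)), the graph is a cycle of four sites with no boundary condition, and the derivative is approximated by a central difference. All names are hypothetical.

```python
import itertools
import math


def expectation(f, fields, beta, edges):
    """<f> under weights exp(-beta * H) on {-1, +1}^n (assumed Hamiltonian)."""
    num = den = 0.0
    for sigma in itertools.product((-1, 1), repeat=len(fields)):
        energy = -0.5 * sum(sigma[a] * sigma[b] for a, b in edges)
        energy -= 0.5 * sum(hx * s for hx, s in zip(fields, sigma))
        weight = math.exp(-beta * energy)
        num += f(sigma) * weight
        den += weight
    return num / den


def ursell_by_derivative(f, x, h, beta, edges, n_sites, eps=1e-5):
    """(2 / beta) * d<f>/dh_x at the constant field h, by a central difference."""
    up = [h] * n_sites
    down = [h] * n_sites
    up[x] += eps
    down[x] -= eps
    diff = expectation(f, up, beta, edges) - expectation(f, down, beta, edges)
    return (2.0 / beta) * diff / (2.0 * eps)


def covariance(f, x, h, beta, edges, n_sites):
    """<f sigma(x)> - <f><sigma(x)> at the constant field h."""
    fields = [h] * n_sites
    mean_f = expectation(f, fields, beta, edges)
    mean_s = expectation(lambda s: s[x], fields, beta, edges)
    mean_fs = expectation(lambda s: f(s) * s[x], fields, beta, edges)
    return mean_fs - mean_f * mean_s


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
spin0 = lambda s: s[0]
print(ursell_by_derivative(spin0, 2, 0.1, 0.6, cycle, 4))
print(covariance(spin0, 2, 0.1, 0.6, cycle, 4))
```

The two printed numbers should agree up to the discretization error of the finite difference.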
Observe that in particular this estimate is uniform in $h \geq 0$ and in $\Lambda$. The proof that (1.27) follows from (1.24) in the special case in which $f$ is of the form $f(\sigma) = \sigma(y_1) \cdots \sigma(y_m)$ for some set of sites $\{y_1, \dots, y_m\}$ can be found in the Appendix B of [M-L]. The general case follows immediately from this one, since any local function $f$ can be written as a linear combination of functions of this special form, with $\{y_1, \dots, y_m\}$ running over all the subsets of $\mathrm{Supp}(f)$. As explained in the proof of Theorem 4 in [M-L], (1.25) follows in a standard fashion from (1.27).

In connection to (1.25) it is worth mentioning that for $h > 0$ the function $\langle f \rangle_h$ is analytic, as follows from the Lee-Yang theorem. Moreover, in this case the Gibbs distributions are completely analytic in an appropriate sense (see [SS1] and references therein). The estimate (1.27) can be replaced for $h > 0$ by a stronger one:
\[ |\langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{\Lambda,+,h}| \leq C'_j(T, h, f) \exp\bigl( -C(T, h) \, \mathrm{dist}_{\mathrm{tree}}(\mathrm{Supp}(f), x_1, \dots, x_j) \bigr), \]
where $\mathrm{dist}_{\mathrm{tree}}(\Lambda_1, \dots, \Lambda_j)$ is the length of the shortest tree in $\mathbb{R}^2$, connecting all the sets $\Lambda_1, \dots, \Lambda_j \subset \mathbb{Z}^2$. However, the constants $C'_j(T, h, f)$ explode as $h \to 0$. It is also important to recall that, on the other hand, it has been proven in [Isa] that at low enough temperatures there is an essential singularity nevertheless at $h = 0$; this is expected to be so up to $T_c$, but no proof of that claim is available, as far as we know.

The identity (1.25) and the various related statements that we made above have, of course, analogues for $h \leq 0$. Those are the ones that will be relevant for us when we study metastability under small $h > 0$, since the “metastable state” should then be a “continuation” of the equilibrium states with $h \leq 0$.

1.4. Main results. Recall that we are considering a kinetic Ising model for the formal Hamiltonian (1.4) in dimension 2, which is supposed to satisfy conditions (H1), (H2), (H3) and (H4) of Sect. 1.2.
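Recall also that for $T < T_c$ we define
\[ \lambda_c = \lambda_c(T) = \frac{w(T)^2}{12 \, T \, m^*(T)}. \]
The following theorem is our main result.

Theorem 1. Suppose $T < T_c$. For every probability distribution $\nu \leq \mu_{-}$ the following happens:
i) If $0 < \lambda < \lambda_c$, then for each $n \in \{1, 2, \dots\}$ and for each local observable $f$,
\[ \mathbb{E}\bigl( f(\sigma^{\nu}_{h;\exp(\lambda/h)}) \bigr) = \sum_{j=0}^{n-1} b_j(f) h^j + O(h^n) \]
for $h > 0$, where
\[ b_j(f) = \frac{1}{j!} \frac{d^j \langle f \rangle_{-,h}}{dh^j} \bigg|_{h=0-} = \frac{1}{j!} \Bigl( \frac{\beta}{2} \Bigr)^{j} \sum_{x_1, \dots, x_j \in \mathbb{Z}^2} \langle f; \sigma(x_1); \dots; \sigma(x_j) \rangle_{-}, \]
and $O(h^n)$ is a function of $f$ and $h$ which satisfies $\limsup_{h \searrow 0} |O(h^n)|/h^n < \infty$.
ii) If $\lambda > \lambda_c$, then for any finite positive $C$ there is a finite positive $C_1$ such that for every local observable $f$,
\[ \Bigl| \mathbb{E}\bigl( f(\sigma^{\nu}_{h;\exp(\lambda/h)}) \bigr) - \langle f \rangle_h \Bigr| \leq C_1 \|f\|_\infty \exp\Bigl( -\frac{C}{h} \Bigr), \]
for all $h > 0$.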
From this theorem, the simple Proposition 1 in [Sch 1], and the fact that $\langle f \rangle_h \to \langle f \rangle_{+}$ as $h \searrow 0$ (see (1.7), or the paragraph which precedes (1.25)), the following corollary is obtained.

Corollary 1. Suppose $T < T_c$. For every probability distribution $\nu \leq \mu_{-}$ the following happens. If we let $h \searrow 0$ and $t \to \infty$ together, then for every local observable $f$,
i) $\mathbb{E}\bigl( f(\sigma^{\nu}_{h;t}) \bigr) \to \langle f \rangle_{-}$ if $\limsup h \log t < \lambda_c(T)$.
ii) $\mathbb{E}\bigl( f(\sigma^{\nu}_{h;t}) \bigr) \to \langle f \rangle_{+}$ if $\liminf h \log t > \lambda_c(T)$.

In other words, we are stating that the law of the random configuration $\sigma^{\nu}_{h;t}$ converges weakly to $\mu_{-}$ in case (i) and to $\mu_{+}$ in case (ii). This corollary is already an important strengthening of Theorem 1 in [Sch 1] in the 2 dimensional case. The following aspects of that theorem are improved here: 1) There is a single constant $\lambda_c$ separating the regimes (i) and (ii). 2) The temperature is now only required to be below $T_c$. 3) The initial distribution is much more general than in [Sch 1], where it was supposed to be concentrated on the configuration with all spins down. Note that, by the FKG-Holley inequalities, for each $h < 0$ the distribution $\mu_h$ satisfies the condition above on the initial distribution $\nu$.

To illustrate and clarify the way in which Theorem 1(i) improves even further the statement in Corollary 1(i), let us take the local observable given by $f(\sigma) = \sigma(0)$ and $n = 2$. We have then, when $0 < \lambda < \lambda_c$,
\[ \mathbb{E}\bigl( \sigma^{\nu}_{h;\exp(\lambda/h)}(0) \bigr) = -m^* + \chi h + O(h^2), \]
when $h > 0$. Here
\[ \chi = b_1(f) = \frac{d \langle \sigma(0) \rangle_{-,h}}{dh} \bigg|_{h=0-} = \frac{\beta}{2} \sum_{x \in \mathbb{Z}^2} \langle \sigma(0); \sigma(x) \rangle_{-} \]
is the susceptibility at $h = 0^{-}$. This means that when $h > 0$ is small the function $-m^* + \chi h$ is a better approximation to $\mathbb{E}\bigl( \sigma^{\nu}_{h;\exp(\lambda/h)}(0) \bigr)$ than the constant function identical to $-m^* = \langle f \rangle_{-}$. This function $-m^* + \chi h$ is the smooth linear continuation into the region $h \geq 0$ of the function which to $h < 0$ associates the equilibrium expectation $\langle f \rangle_h$. Similar interpretations can be given for larger values of $n$ and arbitrary $f$.

Another way to express part of the content of Theorem 1 is to observe that it claims that for any $\lambda$, $0 < \lambda < \lambda_c$ and any probability distribution $\nu \leq \mu_{-}$ the branch of states $\langle \cdot \rangle^{\nu}_{h;\exp(\lambda/h)}$ for $h > 0$ is a $C^\infty$ continuation of the family $\langle \cdot \rangle_h$ for $h < 0$. That interpretation suggests that the phenomenon of metastability should be understood dynamically, in which case the physically meaningful smooth continuations through the critical point $h = 0$ become possible.

In the physics literature (see, e.g., [BM]), one sometimes relates the metastable relaxation of a system to the presence of a “plateau” in the graph corresponding to the time evolution of a quantity of the type of $\mathbb{E}\bigl( f(\sigma^{\nu}_{h;t}) \bigr)$. Of course, strictly speaking there is no “plateau”, and generically the slope of such a function is never 0. Still, from the experimental point of view a rough “plateau” can be seen and described as follows. In a relatively short time $\mathbb{E}\bigl( f(\sigma^{\nu}_{h;t}) \bigr)$ seems to converge to a value close to $\langle f \rangle_{-}$; after this, one sees an apparent flatness in the relaxation curve over a period of time which may be quite long compared with the time needed to first approach this value. But eventually the relaxation curve starts to deviate from this almost constant value and move towards
the true asymptotic limit, close to $\langle f \rangle_{+}$. The experimentally almost flat portion of the relaxation curve is referred to as a “plateau”. Theorem 1 can be seen to some extent as giving some precise meaning to such a “plateau”, and we discuss now two ways in which this can be done. First note that if $0 < \lambda' < \lambda'' < \lambda_c$, then from Part (i) of the theorem we have
\[ \mathbb{E}\bigl( f(\sigma^{\nu}_{h;\exp(\lambda'/h)}) \bigr) - \mathbb{E}\bigl( f(\sigma^{\nu}_{h;\exp(\lambda''/h)}) \bigr) \to 0, \]
faster than any power of $h$. Observe that we are considering times which are of different order of magnitudes, when $h$ is small, and still we are observing a nearly constant $\mathbb{E}\bigl( f(\sigma^{\nu}_{h;t}) \bigr)$. For a second way in which Theorem 1 can be seen as expressing the presence of a “plateau”, we can think of plotting $\mathbb{E}\bigl( f(\sigma^{\nu}_{h;t}) \bigr)$ versus $\log(t)$, rather than versus $t$. This is somewhat the natural graph to consider, if one is interested in the order of magnitude of the relaxation time. If the $\log(t)$-axis is drawn in the proper scale, amounting to replacing it with $h \log(t)$, then, when $h$ is small, Theorem 1 tells us that the graph should be close to that of a step function which jumps at the point $\lambda_c$, from the value $\langle f \rangle_{-}$ to the value $\langle f \rangle_{+}$.

Readers who are familiar with [Sch1 and Sch2] can expect that also Theorem 4 and Corollary 1 in [Sch 1], which refer to finite systems with $(-)$ boundary conditions and sizes which are scaled as $h \searrow 0$, have stronger versions along the lines of the current paper. This is indeed the case, but for brevity we will omit the statements of these theorems, which can easily be obtained and can be proved with the techniques introduced in this paper.

1.5. Heuristics. One of the appealing features of the results proven in this paper is that some of them can be correctly predicted based on a very simple and naive-looking heuristics.
It is probably a challenge to a historian of science to trace back the origin and evolution of this non-rigorous approach to the problem, to give the proper credit to the people involved and to elucidate how the interplay among empirical observation of metastable systems, theoretical analysis and computer simulation led to the reasoning described in a simple fashion below. Here we will make no attempt to clarify the history of the subject. The interested reader will find a great deal of references to the earlier literature on the subject in the paper [RTMS]. It is worth stressing that certain parts of the heuristics were rediscovered more than once in different forms and contexts, so that giving proper credit is a very difficult task.

The reader may want to compare the heuristics presented below with the one presented in [Sch1] and [Sch2]. The main difference is that here we are being more ambitious by basing the heuristics on the computation of the free-energy of a droplet with an arbitrary shape, rather than on the energy of a square droplet. At the time that those two former papers were written, it seemed to its author that there was no compelling reason to believe that the equilibrium free-energy of droplets would predict correctly the value of $\lambda_c$, i.e., the behavior of the relaxation time to the level of the correct rate of exponential growth with $1/h$.

The first ingredient of the heuristics is the idea of looking at an individual droplet of the stable phase (roughly the $(+)$-phase, since $h$ is small) in a background given by the metastable phase (roughly the $(-)$-phase). Let $S$ be the shape of that droplet, which a priori can be arbitrary. Say that $l^2$ is the volume (i.e., the number of sites) of the droplet, and let us find an expression for the free-energy of such a droplet. This free-energy may be seen as coming from two main contributions. There should be a bulk term, proportional to $l^2$. This term should be obtained by multiplying $l^2$ by the difference in
free-energy per site between the $(+)$-phase and the $(-)$-phase in the presence of a small magnetic field $h > 0$. This difference in the free-energy per site of the two phases should come only from the term in the Hamiltonian which couples the spins to the external field and should therefore be given by $2 m^* h / 2 = m^* h$. The other relevant contribution to the free-energy of the droplet should come from its surface, where there is an interface between the $(+)$-phase and the $(-)$-phase. This contribution is proportional to the length of the interface, which is of the order of $l$. It should be multiplied by a constant $w_S$ which depends on the shape of the droplet. This constant $w_S$ represents the excess free-energy per unit of length integrated over the surface of the droplet when its scale is changed so that its volume becomes 1. Therefore, since the external field $h$ is small, we can take for $w_S$ the value $\mathcal{W}(\gamma)$, where $\gamma$ is the boundary of the droplet, rescaled in this fashion. In particular, $w_S$ is minimized when the droplet has the Wulff shape, and in this case $w_S = w$. Adding the pieces, we obtain for the free-energy of the droplet the expression
\[ \Phi_S(l) = -m^* h l^2 + w_S l. \]
The two terms in this expression become of the same order of magnitude, in case $l$ is of the order of $1/h$. Therefore, for later convenience we write $l = b/h$, with a new variable $b \geq 0$. This yields
\[ \Phi_S(b/h) = \frac{\phi_S(b)}{h}, \]
where
\[ \phi_S(b) = -m^* b^2 + w_S b. \]
This very simple function takes the value 0 at $b = 0$, grows with $b$ on the interval $[0, B_c^S]$, where $B_c^S = B_c^S(T) = \frac{w_S}{2 m^*}$, reaching its absolute maximum $\phi_S(B_c^S) = \frac{(w_S)^2}{4 m^*} = A^S(T) = A^S$ at the end of this interval and decreases with $b$ on the semi-infinite interval $[B_c^S, \infty)$. It crosses the value 0 at the point $B_0^S(T) = B_0^S = \frac{w_S}{m^*} = 2 B_c^S$.
Metastability is then “understood” from the fact that systems in contact with a heat bath move towards lowering their free-energy, so that the presence of a free-energy barrier which needs to be overcome in order to create a large droplet of the stable phase with any shape keeps the system close to the metastable phase. Subcritical droplets are constantly being created by thermal fluctuations, in the metastable phase, but they tend to shrink, as dictated by the free-energy landscape. On the other hand, once a supercritical droplet is created due to a larger fluctuation, it will grow and drive the system to the stable phase, possibly colliding and coalescing in its growth with other supercritical droplets created elsewhere.

As a function of $h$, the linear size of a critical droplet, $B_c^S / h$, blows up as $h \searrow 0$. One can then, in a somewhat circular, but heuristically-meaningful way, say that the macroscopic free-energy of droplets is indeed a relevant object of consideration. One can also hope then that sharp theorems could be conjectured and possibly proven regarding the asymptotic behavior of quantities of interest in the limit $h \searrow 0$.

Regarding the shape of the droplet, the height of this barrier is minimized by minimizing the value of the constant $w_S$, i.e., by considering Wulff-shaped droplets. This singles out the Wulff shape as the most relevant one in the heuristics above. We will simplify the notation by omitting the subscript $S$ when talking about the Wulff shape. In particular,
\[ B_c = \frac{w}{2 m^*}, \qquad A = \frac{w^2}{4 m^*}. \tag{1.28} \]
Based on the expression above for the free-energy barrier, one predicts the rate of creation of supercritical droplets with center at a given place to be $\exp\bigl( -\frac{\beta A}{h} \bigr)$.
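The elementary properties of the rescaled free-energy $\phi_S(b) = -m^* b^2 + w_S b$ used above can be verified directly. The sketch below is only a numerical illustration: the values chosen for $w$ and $m^*$ are hypothetical and are not the Ising values, and the function names are not from the paper.

```python
def phi(b, w, m_star):
    """Rescaled droplet free-energy phi(b) = -m* b^2 + w b."""
    return -m_star * b * b + w * b


def critical_droplet(w, m_star):
    """Return (B_c, A) = (w / (2 m*), w^2 / (4 m*)) as in (1.28)."""
    return w / (2.0 * m_star), w * w / (4.0 * m_star)


w, m_star = 2.0, 0.5  # hypothetical values, for illustration only
b_c, barrier = critical_droplet(w, m_star)

# The barrier height is the value of phi at the critical size B_c.
print(b_c, barrier, phi(b_c, w, m_star))
# phi returns to 0 at B_0 = 2 B_c = w / m*.
print(phi(2.0 * b_c, w, m_star))
# phi increases below B_c and decreases above it.
print(phi(0.9 * b_c, w, m_star) < barrier, phi(1.1 * b_c, w, m_star) < barrier)
```

Since the barrier $A$ is proportional to $w^2$, any shape with a larger value of $w_S$ than the Wulff shape has a strictly higher barrier.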
In what follows now we write $d$ instead of 2, to make the role of the dimension clear in the geometric argument which comes next. We are concerned with an infinite system, and we are observing it through a local function $f$, which depends, say, on the spins in a finite set $\mathrm{Supp}(f)$. For us the system will have relaxed to equilibrium when $\mathrm{Supp}(f)$ is covered by a big droplet of the plus-phase, which appeared spontaneously somewhere and then grew, as discussed above. We want to estimate how long we have to wait for the probability of such an event to be large. If we suppose that the radius of supercritical droplets grows with a speed $v$, then we can see that the region in space-time where a droplet which covers $\mathrm{Supp}(f)$ at time $t$ could have appeared is, roughly speaking, a cone with vertex in $\mathrm{Supp}(f)$ and which has as base the set of points which have time-coordinate 0 and are at most at distance $tv$ from $\mathrm{Supp}(f)$. The volume of such a cone is of the order of $(vt)^d t$. The order of magnitude of the relaxation time, $t_{\mathrm{rel}}$, before which the region $\mathrm{Supp}(f)$ is unlikely to have been covered by a large droplet and after which the region $\mathrm{Supp}(f)$ is likely to have been covered by such an object can now be obtained by solving the equation
\[ (v t_{\mathrm{rel}})^d \, t_{\mathrm{rel}} \exp\Bigl( -\frac{\beta A}{h} \Bigr) = 1. \tag{1.29} \]
This gives us
\[ t_{\mathrm{rel}} = v^{-d/(d+1)} \exp\Bigl( \frac{\beta A}{(d+1) h} \Bigr). \tag{1.30} \]
In order to use this relation to predict the way in which the relaxation time scales with $h$, one needs to figure out the way in which $v$ scales with $h$. If we suppose, for instance, that $v$ does not scale with $h$, or at least that if it goes to 0 as $h \searrow 0$, it does so slowly enough that
$$\lim_{h \searrow 0} h^{d-1} \log v = 0, \tag{1.31}$$
then we can predict that
$$t_{rel} = \exp\left(\frac{\beta A}{(d+1)h}\right) = \exp\left(\frac{\lambda_c}{h}\right),$$
where
$$\lambda_c = \frac{\beta A}{d+1} = \frac{\beta w^2}{(d+1)\,4m^*} = \frac{\beta w^2}{12\,m^*},$$
in agreement with our result (1.1). In Sect. 1-iii of [Sch1] and more explicitly in Sect. 4 of [Sch2] an argument was given in support of the conjecture that $v \sim Ch$ as $h \searrow 0$, a much stronger conjecture than (1.31). In the paper [RTMS] (see display (9) there) a different non-rigorous argument is described, in which the same conclusion is derived from an “Allen-Cahn approximation”. In part 3 of this paper we will introduce a rescaling procedure and obtain results which can be seen as rigorous counterparts to (1.31). It is interesting to compare this feature of the regime of fixed $T$ and $h \searrow 0$ with the case of fixed $h > 0$ and $T \searrow 0$, studied in [DS]. In that case the analogue of (1.31) is false, and consequently the $v$-term in (1.30) is of greater relevance than it is here.
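The heuristic relation (1.29)-(1.30) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the parameter values are hypothetical, and the choice $v = h$ is only an illustration of the conjectured behavior $v \sim Ch$ with $C = 1$. It works with logarithms to avoid overflow for small $h$.

```python
import math

def log_relaxation_time(beta_A, h, v=1.0, d=2):
    """log(t_rel) from (1.30): t_rel = v^(-d/(d+1)) * exp(beta*A / ((d+1)*h))."""
    return (beta_A / h - d * math.log(v)) / (d + 1)

def heuristic_lambda_c(beta, w, m_star, d=2):
    """lambda_c = beta*A/(d+1), with A = w^2/(4 m*) as in (1.28)."""
    A = w * w / (4.0 * m_star)
    return beta * A / (d + 1)

# Illustrative (made-up) parameter values, not taken from the paper.
beta, w, m_star = 1.0, 2.0, 0.5
beta_A = beta * w * w / (4.0 * m_star)
lam_c = heuristic_lambda_c(beta, w, m_star)

for h in (0.1, 0.01, 0.001):
    # With v = h, condition (1.31) holds and h * log(t_rel) approaches lambda_c.
    print(h, h * log_relaxation_time(beta_A, h, v=h), lam_c)
```

The printed middle column should approach the last one as $h$ decreases, which is the content of the prediction $t_{rel} = \exp(\lambda_c/h)$ under (1.31).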
R. H. Schonmann, S. B. Shlosman
2. Metastable Regime

2.1. Preliminaries. In this part of the paper we will prove part (i) of Theorem 1. The first step will be to prove, in Sections 2.2 and 2.3, the following proposition.

Proposition 2.1.1. Suppose that $T < T_c$ and $0 < \lambda < \lambda_c$. Then for each constant $a \in (0, 1/4)$, there is a positive finite constant $C$ such that for each local observable $f$ there is a positive finite constant $C(f)$ such that for $h > 0$,
$$\left| E\big(f(\sigma^-_{h;\exp(\lambda/h)})\big) - \langle f \rangle_{\Lambda(1/h^a),-,h} \right| \le C(f) \exp(-C/h^a).$$

We will not try to optimize the constants $C$ and $C(f)$ in this proposition. But we observe that from our proof, if the inequality displayed above is only required to hold for $h \le h_0$ for some $h_0 > 0$ depending on $f$, then we can take $C(f) = C' \|f\|_\infty |\mathrm{Supp}(f)|$, where $C'$ does not depend on $f$. Proposition 2.1.1 transforms our dynamical problem into an equilibrium one, in case the initial distribution is concentrated on the configuration with all spins down. In Sect. 2.4 we will study the behavior of $\langle f \rangle_{\Lambda(1/h^a),-,h}$ for small $h > 0$ and show that it gives rise to the asymptotic expansion claimed in part (i) of Theorem 1.

Let us note here that if our goal were only to prove Corollary 1(i), with the initial distribution having all spins down, then Proposition 2.1.1 would have reduced our task to a very simple one. First, from the heuristic viewpoint, with $a > 0$ small, the box $\Lambda(1/h^a)$ is too small for any supercritical droplet to fit inside, so that one should expect to see the (−)-phase inside it. A rigorous argument to the effect that for $0 < a < 1$
$$\langle f \rangle_{\Lambda(1/h^a),-,h} \to \langle f \rangle_- \quad \text{as } h \searrow 0 \tag{2.1}$$
can be obtained from the FKG-Holley inequalities in combination with part (a2) of Corollary 1 in [SS1]. But for $0 < a < 1/2$, which is our case, a simple and direct argument can also be given. It is clear that the following estimate, which can be seen as a uniform bound on a Radon-Nikodym derivative, holds. For all $\eta \in \Omega_{\Lambda(1/h^a),-}$,
$$\exp(-\beta|\Lambda(1/h^a)|h) \le \frac{\mu_{\Lambda(1/h^a),-,h}(\eta)}{\mu_{\Lambda(1/h^a),-,0}(\eta)} \le \exp(\beta|\Lambda(1/h^a)|h).$$
Since $|\Lambda(1/h^a)|h \le h^{1-2a} \to 0$ as $h \searrow 0$, (2.1) follows from the weak convergence of $\mu_{\Lambda(1/h^a),-,0}$ to $\mu_-$ as $h \searrow 0$. In this argument we see directly that, with $a < 1/2$, the box $\Lambda(1/h^a)$ is too small for the external field $h > 0$, acting on each spin, to be able to win over the effect of the negative spins at the boundary. The extension of Theorem 1(i) to arbitrary initial distributions $\nu \le \mu_-$ will be obtained in Sect. 2.5. Interestingly enough, a basic tool there will be a result obtained in [Mar], and extended to arbitrary subcritical temperatures in [CGMS], concerning the two-dimensional Ising model evolving in the absence of an external field.

2.2. Bottlenecks for the dynamics. We start now the proof of Proposition 2.1.1. Several times in the proof of this proposition we will use arguments which are only true for small enough $h > 0$, but the constant $C(f)$ can be adjusted so that we do not have to require $h$ to be small in the statement of the proposition. In order to prove Proposition 2.1.1 there is no loss in generality in supposing that $f$ is increasing and that it has $\|f\|_\infty \le 1$. For the remainder of the proof of this proposition we will make these assumptions.
To simplify the notation we set
$$t_h = \exp(\lambda/h). \tag{2.2}$$
We turn first to the proof of the easy half of Proposition 2.1.1. We will show that for small $h > 0$,
$$E(f(\sigma^-_{h;t_h})) - \langle f \rangle_{\Lambda(1/h^a),-,h} \ge -\exp\left(-\exp\left(\frac{\lambda}{2h}\right)\right). \tag{2.3}$$
For this, observe first that from the basic-coupling inequalities we have
$$E(f(\sigma^-_{h;t_h})) \ge E(f(\sigma^-_{\Lambda(1/h^a),-,h;t_h})). \tag{2.4}$$
Let the process $(\sigma^-_{\Lambda(1/h^a),-,h;t})$ and the stationary process $(\sigma^{\mu_{\Lambda(1/h^a),-,h}}_{\Lambda(1/h^a),-,h;t})$ evolve on the probability space defined by the graphical construction, so that in particular once these processes hit each other they remain together forever. Note that, for some positive $\epsilon$, the probability that these two processes will hit each other during any unit time interval is at least $\epsilon^{1/h^{2a}}$, regardless of their states at the beginning of this time interval. Also, $f(\sigma^-_{\Lambda(1/h^a),-,h;t}) \le f(\sigma^{\mu_{\Lambda(1/h^a),-,h}}_{\Lambda(1/h^a),-,h;t})$ with probability one. From these remarks it is clear that
$$0 \le \langle f \rangle_{\Lambda(1/h^a),-,h} - E(f(\sigma^-_{\Lambda(1/h^a),-,h;t_h})) \le \left(1 - \epsilon^{1/h^{2a}}\right)^{t_h - 1} \le \exp\left(-\exp\left(\frac{\lambda}{2h}\right)\right), \tag{2.5}$$
for small $h > 0$. The inequality (2.3) follows from (2.4) and (2.5). The main task in this and the next sections is to prove the other half of Proposition 2.1.1, i.e., the inequality
$$E(f(\sigma^-_{h;t_h})) - \langle f \rangle_{\Lambda(1/h^a),-,h} \le C(f) \exp(-C/h^a). \tag{2.6}$$
We approach this problem borrowing some ideas from [Sch1]. As in that paper, set
$$\Lambda_h = \Lambda(\exp(\lambda_c/h)), \tag{2.7}$$
and observe that Lemma 1.2.1 (which is the same as Lemma 1 in [Sch1]) gives us the following stronger version of Lemma 2 in [Sch1]:
$$\left| E(f(\sigma^-_{h;t_h})) - E(f(\sigma^-_{\Lambda_h,-,h;t_h})) \right| \le C_1(f) \exp(-C_2 \exp(\lambda/h)), \tag{2.8}$$
where $C_1(f)$ and $C_2$ are positive and finite. Next we will introduce a restricted set of configurations, in a way similar to [Sch1], and inspired there by [CCO] and by the heuristic idea of critical droplets. To make this idea precise one uses the standard notion of contours, on the dual lattice $\mathbb{Z}^2 + (1/2, 1/2)$, which separate spins −1 from +1. In the definition of these contours, we adopt here the splitting rules used in, e.g., [DKS] (see Sect. 3.1 there), which allow one to take the contours as self-avoiding curves, which are closed, when the boundary conditions are, e.g., (−), as is our case. We will denote by $|\Gamma|$ the length of the contour $\Gamma$, by $\mathrm{Int}\,\Gamma$ the set of sites it surrounds, and by $V(\Gamma)$ the number of spins that it surrounds, which we call the volume of $\Gamma$. As usual, a contour is called an external contour if it is not enclosed by any other contour. If $\Gamma$ is such a contour of the configuration $\sigma$, and the boundary
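conditions are (−), then at certain sites $x$, attached to $\Gamma$, the values $\sigma(x)$ are uniquely defined by the presence of $\Gamma$. The set of such sites will be denoted by $\partial\Gamma$. We have
$$\partial\Gamma = \partial_-\Gamma \cup \partial_+\Gamma, \tag{2.9}$$
where $\sigma|_{\partial_\pm\Gamma} = \pm 1$. We will use the notation
$$\Omega_- = \bigcup_{l=1}^{\infty} \Omega_{\Lambda(l),-}.$$
Our restricted set of configurations is defined as
$$R = \left\{ \sigma \in \Omega_- : \text{each contour } \Gamma \text{ in } \sigma \text{ has } V(\Gamma) \le \left(\frac{B_c}{h}\right)^2 \right\}, \tag{2.10}$$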
where $B_c$ is defined in (1.28). We want to argue that up to a time as large as $t_h$ the system evolving in the box $\Lambda_h$ with (−) boundary conditions and starting with all spins −1, will be unlikely to escape from $R$, in which case the system indeed would look very much like the (−) phase. In order to do this we introduce a modified dynamics evolving in $\Omega_{\Lambda_h,-}$, in which large droplets cannot, by definition, be formed and then we couple the unrestricted dynamics to this modified one, in a natural way.

The modified dynamics is simply defined as the Markov process on $\Omega_{\Lambda_h,-}$ which evolves as the original stochastic Ising model in $\Lambda_h$, with (−) boundary conditions, but for which all jumps out of $R$ are suppressed. In other words, the rates, $\tilde{c}_{\Lambda_h,-,h}(x, \sigma)$, of the new process are identical to $c_{\Lambda_h,-,h}(x, \sigma)$ in case $\sigma, \sigma^x \in \Omega_{\Lambda_h,-} \cap R$ and are 0 otherwise. We will denote this modified process, which is restricted to the state space $R$, by $(\tilde\sigma^{\eta}_{\Lambda_h,-,h;t})_{t \ge 0}$, where $\eta \in R$ is the initial configuration. It is well known, and very easy to prove, that such a modified process is also reversible and that since it is, in our case, irreducible, its unique invariant probability measure is $\tilde\mu_{\Lambda_h,-,h}$ given by
$$\tilde\mu_{\Lambda_h,-,h}(\,\cdot\,) = \mu_{\Lambda_h,-,h}(\,\cdot\,|R).$$
This distribution is sometimes called a “restricted ensemble”, and, informally speaking, represents the “metastable state”.

For each initial configuration $\eta \in R$, the process $(\tilde\sigma^{\eta}_{\Lambda_h,-,h;t})$ can be constructed on the same probability space corresponding to the graphical construction introduced in Sect. 1.2. For this purpose it is enough to suppress all jumps out of $R$. In other words, Poisson marks which should cause such a jump are just ignored. The important fact about this construction is that if we introduce
$$\tau = \inf\{t \ge 0 : \text{the process } (\tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t}) \text{ has a suppressed jump at time } t\},$$
then
$$\sigma^-_{\Lambda_h,-,h;t} \le \tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t} \quad \text{for all } t < \tau. \tag{2.11}$$
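The statement that suppressing jumps out of $R$ yields a reversible process whose invariant measure is the Gibbs measure conditioned on $R$ can be illustrated on a toy model. The sketch below is only an illustration under stated assumptions: it uses a five-site ring instead of the box $\Lambda_h$ with (−) boundary conditions, heat-bath rates as one possible choice of reversible rates, and a hypothetical stand-in for $R$ (at most `max_plus` plus-spins) in place of the contour-volume condition (2.10). It verifies detailed balance of the restricted rates with respect to the conditioned measure.

```python
import itertools
import math

def energy(sigma, h, J=1.0):
    """Toy energy on a ring: -J * sum_i s_i s_{i+1} - (h/2) * sum_i s_i."""
    n = len(sigma)
    bonds = sum(sigma[i] * sigma[(i + 1) % n] for i in range(n))
    return -J * bonds - 0.5 * h * sum(sigma)

def flip(sigma, x):
    """Configuration sigma^x: sigma with the spin at site x flipped."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]

def rate(sigma, x, beta, h):
    """Heat-bath flip rate (an assumed choice; reversible w.r.t. the Gibbs measure)."""
    dE = energy(flip(sigma, x), h) - energy(sigma, h)
    return 1.0 / (1.0 + math.exp(beta * dE))

def in_R(sigma, max_plus):
    """Hypothetical restricted set: at most max_plus plus-spins."""
    return sum(1 for s in sigma if s == 1) <= max_plus

def restricted_rate(sigma, x, beta, h, max_plus):
    """Modified dynamics: same rates inside R, jumps out of R suppressed."""
    if in_R(sigma, max_plus) and in_R(flip(sigma, x), max_plus):
        return rate(sigma, x, beta, h)
    return 0.0

def conditioned_gibbs(n, beta, h, max_plus):
    """Gibbs measure conditioned on R, as a dict from configurations to weights."""
    states = [s for s in itertools.product((-1, 1), repeat=n) if in_R(s, max_plus)]
    weights = {s: math.exp(-beta * energy(s, h)) for s in states}
    Z = sum(weights.values())
    return {s: wgt / Z for s, wgt in weights.items()}

def max_detailed_balance_defect(n=5, beta=0.8, h=0.3, max_plus=2):
    """Largest violation of mu(s) c(s -> t) = mu(t) c(t -> s) over pairs in R."""
    mu = conditioned_gibbs(n, beta, h, max_plus)
    worst = 0.0
    for s, p in mu.items():
        for x in range(n):
            t = flip(s, x)
            if t in mu:
                lhs = p * restricted_rate(s, x, beta, h, max_plus)
                rhs = mu[t] * restricted_rate(t, x, beta, h, max_plus)
                worst = max(worst, abs(lhs - rhs))
    return worst

print(max_detailed_balance_defect())
```

The defect printed at the end should be zero up to floating-point rounding, reflecting that conditioning on $R$ rescales all weights inside $R$ by the same factor and therefore preserves detailed balance.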
(Readers who are familiar with the argumentation in [Sch1] will have noted that, while most of what we introduced above in connection with the restricted ensemble is similar to its counterparts in that paper, the notions are not strictly parallel to those there. The reason is that, while we are still pursuing the idea that the boundary of $R$ is a bottleneck, the arguments used in [Sch1] to prove Lemma 5(i) there would give us an estimate that, while good enough to obtain Corollary 1(i), would not be good enough to obtain an estimate as sharp as Proposition 2.1.1, which is needed for the proof of Theorem 1(i).)
We introduce now a family of sets whose union is the inner boundary of $R$. For each site $x \in \mathbb{Z}^2$ define
$$F_x^- = \{\sigma \in R : \sigma^x \notin R\}.$$
Set also
$$\varphi = \sup_{x \in \Lambda_h} \tilde\mu_{\Lambda_h,-,h}(F_x^-).$$
We will now consider a discrete time Markov chain embedded into the stationary process $(\tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t})$. It is formed by the configurations of our process between the successive jumps. For this purpose order all the Poisson marks in the graphical construction which occur on $\Lambda_h$, according to the time they occur. Let $N(t)$ be the number of such marks from time 0 up to time $t$. Let $M_{x,k}$ be the event that the $k$th such mark occurs at the site $x$, and $F_{x,k}$ be the event that immediately before this $k$th mark the process $(\tilde\sigma^{\tilde\mu_{\Lambda_h,-,h}}_{\Lambda_h,-,h;t})$ is in $F_x^-$. Note that $M_{x,k}$ and $F_{x,k}$ are independent events and that $P(M_{x,k}) = 1/|\Lambda_h|$, while by stationarity $P(F_{x,k}) = \tilde\mu_{\Lambda_h,-,h}(F_x^-) \le \varphi$ for all $x$ and $k$. Set also $K = \lfloor 2\,|\Lambda_h|\, c_{\max}(T)\, t_h \rfloor$ to obtain the estimate
$$P(\tau \le t_h) \le P(N(t_h) > K) + \sum_{x \in \Lambda_h,\ k = 1, \ldots, K} P(M_{x,k} \cap F_{x,k}) \le C_1 \exp(-C_2 \exp(\lambda/h)) + C_3 |\Lambda_h| t_h \varphi, \tag{2.12}$$
where in the second inequality a standard large deviation estimate for Poisson random variables was used. From (2.8), (2.11) and (2.12) it follows that
$$E(f(\sigma^-_{h;t_h})) \le \int f \, d\tilde\mu_{\Lambda_h,-,h} + C_3 |\Lambda_h| t_h \varphi + C_4(f) \exp(-C_5 \exp(\lambda/h)). \tag{2.13}$$
All the quantities in the right-hand-side of (2.13) pertain to equilibrium statistical mechanics, so that we have reduced our dynamical problem of proving (2.6) to the equilibrium problems of proving the following two claims concerning the measure $\tilde\mu_{\Lambda_h,-,h}$:
i) for all $\epsilon > 0$ and $h > 0$ small enough
$$\varphi \le C_1 \exp\left(-\beta(1 - \epsilon)\frac{A}{h}\right), \tag{2.14}$$
where $A$ is defined by (1.28);
ii) for $h > 0$
$$\int f \, d\tilde\mu_{\Lambda_h,-,h} \le \langle f \rangle_{\Lambda(1/h^a),-,h} + C(f) \exp(-C_2/h^a). \tag{2.15}$$
It is interesting to note that the term $C_3 |\Lambda_h| t_h \varphi$ in (2.13) has a direct connection with the heuristics in Sect. 1.4. The quantity $\varphi$ plays the role of the rate of nucleation, while $|\Lambda_h| t_h$ is the space-time volume of a cylinder which plays a role similar to the space-time cone in the heuristics. The absence here of the velocity factor which appears in (1.29) is a consequence of our using an upper bound of order 1 for the velocity of propagation of effects, through (2.8). Once (2.14) is proven, it follows from the definitions (1.1), (1.28), (2.2) and (2.7) that $|\Lambda_h| t_h \varphi$ vanishes exponentially fast in $1/h$ as $h \searrow 0$.
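As a sanity check on this last claim, the exponents can be collected explicitly. The sketch below assumes that $\Lambda(l)$ is a square box with side of order $l$, so that $|\Lambda_h| \le C \exp(2\lambda_c/h)$ (the definition of $\Lambda(l)$ lies outside this section), and uses $\beta A = 3\lambda_c$, which follows from (1.28) and the expression for $\lambda_c$ in dimension 2. Combining (2.2), (2.7) and (2.14),
$$|\Lambda_h|\, t_h\, \varphi \le C \exp\left(\frac{2\lambda_c + \lambda - 3(1-\epsilon)\lambda_c}{h}\right) = C \exp\left(-\frac{\lambda_c - \lambda - 3\epsilon\lambda_c}{h}\right).$$
Since $\lambda < \lambda_c$, the exponent is negative as soon as $\epsilon < (\lambda_c - \lambda)/(3\lambda_c)$, which is the sense in which the product vanishes exponentially fast in $1/h$.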
2.3. The restricted ensemble. We start our study of the measure $\tilde\mu_{\Lambda_h,-,h}$ by observing that it is sufficient to study measures of the type $\tilde\mu_{\Lambda,-,h}$ on subsets $\Lambda$ of $\mathbb{Z}^2$ which are much smaller than $\Lambda_h$. The definition of $\tilde\mu_{\Lambda,-,h}$ is analogous to that of $\tilde\mu_{\Lambda_h,-,h}$, with $\Lambda$ replacing $\Lambda_h$. Suppose that for each sufficiently small value of $h > 0$ we have an event $E_h$ which only depends on the values of the spins inside the box $x + \Lambda(1/h^3)$. Consider the larger box $x + \Lambda(2/h^3)$ and condition on what the set of exterior contours which surround at least one site in its complement, $(x + \Lambda(2/h^3))^c$, is. Let that set consist of contours $\{\Gamma_j\}$. Denote by $\Lambda(\{\Gamma_j\})$ the connected component of the complement, $(x + \Lambda(2/h^3)) \setminus \bigcup_j (\mathrm{Int}\,\Gamma_j \cup \partial(\Gamma_j))$, which contains the set $x + \Lambda(1/h^3)$. Then one obtains, when $h$ is small, that
$$\tilde\mu_{\Lambda_h,-,h}(E_h) = \sum_i \alpha_i \cdot \tilde\mu_{\Lambda^i,-,h}(E_h), \tag{2.16}$$
where the $\Lambda^i$-s denote different $\Lambda(\{\Gamma_j\})$-s, the index $i$ runs over a finite set, $\alpha_i > 0$ and $\sum_i \alpha_i = 1$. The choices of the scales above and the need that $h$ be small for (2.16) to hold, are clearly related to the fact that we are conditioning the Gibbs measure on the absence of any contour with volume larger than $(B_c/h)^2$. Therefore, with the choices above, we are sure that for small $h$ the support of $E_h$ will be disjoint from the set of sites surrounded by any contour which also surrounds any site in $(x + \Lambda(2/h^3))^c$. The sets $\Lambda^i$ which appear in (2.16) have an additional property, which will be important later: they are simply-connected. For each value of $h > 0$ and each $x \in \Lambda_h$, the event $F_x^-$ satisfies the condition above on $E_h$, so that to derive (2.14) it is enough to obtain a corresponding upper bound:
$$\sup_{x \in \Lambda_h} \ \sup_{\Lambda \in L_{x,h}} \tilde\mu_{\Lambda,-,h}(F_x^-) \le C_1 \exp\left(-\beta(1 - \epsilon)\frac{A}{h}\right), \tag{2.17}$$
for small $h > 0$. Here $L_{x,h}$ is the family of simply-connected sets $\Lambda$ which satisfy $(x + \Lambda(1/h^3)) \cap \Lambda_h \subset \Lambda \subset x + \Lambda(2/h^3)$. Similarly, the derivation of (2.15) is reduced to that of the following,
$$\sup_{\Lambda \in L_{0,h}} \int f \, d\tilde\mu_{\Lambda,-,h} \le \langle f \rangle_{\Lambda(1/h^a),-,h} + C(f) \exp(-C_2/h^a), \tag{2.18}$$
for $h > 0$. In order to derive (2.17) and (2.18) we will use the notion of the skeleton of a contour, as introduced in [DKS]. In what follows $b$ is a fixed but arbitrary number in $(a, 1/4)$ and $r$ is a fixed but also arbitrary number in $(0, b/2) \cap (0, a)$. A contour $\Gamma$ will be said to be $h$-vertebrate if $V(\Gamma) > (1/h)^{2b}$, otherwise $\Gamma$ will be said to be $h$-invertebrate. (Usually one says that $\Gamma$ is large in the former case and small in the latter one, but this terminology would be confusing in the present paper, since “large” and “small” contours may also be used in connection with being supercritical or subcritical, with the threshold volume being the quantity $(B_c/h)^2$, which is much larger than $(1/h)^{2b}$.) Often we will omit mention of $h$ when referring to a vertebrate contour. Given now a vertebrate contour $\Gamma$ one can associate to it, in an algorithmic way, a sequence of sites, $(x_1, \ldots, x_J)$ of the dual lattice $\mathbb{Z}^2 + (1/2, 1/2)$. We think of the sites $x_1, \ldots, x_J$ as the ordered vertices of a closed polygonal curve, with possible self-intersections (see Fig. 5.3 on p. 166 in [DKS]); we will denote this curve by $\gamma$ in what follows and call it the skeleton of $\Gamma$. For the construction of $\gamma$, given $\Gamma$, the reader is referred to Chapter 5 of [DKS]; here we will limit ourselves to reviewing some of the basic properties that we can guarantee the skeleton to have:
(S.1) $x_i \in \Gamma$ for each $i$; moreover, the points $x_i$ are consecutive on $\Gamma$ (for one of the orientations of it).
(S.2) The length of each edge of $\gamma$ is bounded between $C_1 (1/h)^r$ and $C_2 (1/h)^r$, where $0 < C_1 < C_2 < \infty$ are fixed appropriate constants.
(S.3) The Hausdorff distance between $\Gamma$ and $\gamma$ satisfies
$$\rho_H(\Gamma, \gamma) \le (1/h)^r. \tag{2.19}$$
The length, $|\gamma|$, of a skeleton $\gamma$ is defined as the sum of the Euclidean lengths of its edges. To each skeleton $\gamma$ we associate its Wulff functional, $W(\gamma)$, defined by summing over the edges of $\gamma$ the product of the Euclidean length of each edge by the surface tension in the direction defined by the edge, i.e., $\tau_T(n)$, with $n$ perpendicular to the edge. Observe that from the fact that the surface tension $\tau_T(n)$ is bounded away from 0 and $\infty$ uniformly in $n$,
$$C_3 W(\gamma) \le |\gamma| \le C_4 W(\gamma). \tag{2.20}$$
From (S.2) and (2.20) it follows that the number $J(\gamma)$ of vertices in $\gamma$ satisfies
$$C_5 W(\gamma) h^r \le J(\gamma) \le C_6 W(\gamma) h^r. \tag{2.21}$$
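The definition of the Wulff functional of a polygonal skeleton, and the comparison (2.20) with its Euclidean length, can be illustrated as follows. This is a sketch under an explicit assumption: the true surface tension $\tau_T(n)$ is not computed here, and the function `toy_surface_tension` (the $\ell^1$ norm of the unit normal) is a hypothetical stand-in chosen only because it is bounded between 1 and $\sqrt{2}$.

```python
import math

def toy_surface_tension(nx, ny):
    """Hypothetical stand-in for tau_T(n): the l1 norm of the unit normal (nx, ny)."""
    return abs(nx) + abs(ny)

def edges(vertices):
    """Consecutive vertex pairs of a closed polygon (last vertex joins the first)."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]

def euclidean_length(vertices):
    """|gamma|: the sum of the Euclidean lengths of the edges."""
    return sum(math.dist(p, q) for p, q in edges(vertices))

def wulff_functional(vertices, tension=toy_surface_tension):
    """W(gamma): sum over edges of (edge length) * tension(normal to the edge)."""
    total = 0.0
    for (px, py), (qx, qy) in edges(vertices):
        dx, dy = qx - px, qy - py
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # skip degenerate edges
        nx, ny = dy / length, -dx / length  # unit normal to the edge
        total += length * tension(nx, ny)
    return total

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
diamond = [(1, 0), (0, 1), (-1, 0), (0, -1)]
for name, poly in (("square", square), ("diamond", diamond)):
    print(name, euclidean_length(poly), wulff_functional(poly))
```

With this toy tension the two-sided bound (2.20) takes the concrete form $|\gamma| \le W(\gamma) \le \sqrt{2}\,|\gamma|$, which the two example polygons satisfy.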
To each configuration $\sigma \in \Omega_-$ we can associate the collection $G = \{\Gamma_1, \ldots, \Gamma_n\}$ of its external vertebrate contours. To this collection we can associate the collection $S = \{\gamma_1, \ldots, \gamma_n\}$ of their skeletons. The Wulff functional associated to the configuration $\sigma$ is then defined as
$$W(S) = \sum_{i=1}^{n} W(\gamma_i),$$
with the convention that $W(\emptyset) = 0$. Next we want to consider the volume surrounded by the external vertebrate contours $\Gamma_1, \ldots, \Gamma_n$ and say that it has to be close to the volume surrounded by the collection of skeletons $\gamma_1, \ldots, \gamma_n$. A difficulty lies in the fact that while the volume surrounded by the contours is easily defined as
$$V(G) = \sum_{i=1}^{n} V(\Gamma_i),$$
the fact that the skeletons can self-intersect and also intersect with each other makes the notion of the volume that they surround more delicate. Fortunately the notion of “phase volume”, as defined in Sect. 2.10 of [DKS], solves this difficulty. This definition is as follows (a look at Fig. 2.5 on p. 37 of [DKS] will probably lead the reader to guess correctly the definition). The set $\mathbb{R}^2 \setminus \cup \gamma_i$ splits up into a collection of connected components $Q_\alpha$ with exactly one unbounded component among them. A component $Q_\alpha$ will be called a minus-component if any path that connects its interior points with points of the unbounded component and intersects the curves from $S$ in a finite number of points, intersects them in an odd number of points (counted with multiplicities). The phase volume of $S$, denoted by $\hat V(S)$, is defined as the joint volume of all the minus-components. Motivated by (2.19), we want to show that $V(G)$ and $\hat V(S)$ have to be also relatively close to each other. If we remove from $\mathbb{R}^2$ all the points which are at a distance not larger than $(1/h)^r$ from $\cup \Gamma_i$, then the remaining set also splits up into connected components with exactly one unbounded component among them. It is easy to see that all the bounded components are subsets of minus-components in the splitting produced by $S$, while the
unbounded component is a subset of the unbounded component in the splitting produced by $S$. It is also clear that the bounded components in this splitting are inside contours of $G$, while the unbounded component in this splitting is completely outside the contours of $G$. Hence
$$|V(G) - \hat V(S)| \le C_7 \left( \sum_{i=1}^{n} |\Gamma_i| \right) (1/h)^{2r}. \tag{2.22}$$
Similarly, by removing from $\mathbb{R}^2$ all the points which are at a distance not larger than $(1/h)^r$ from $\cup \gamma_i$, one can also derive
$$|V(G) - \hat V(S)| \le C_8 \left( \sum_{i=1}^{n} |\gamma_i| \right) (1/h)^{2r} \le C_9 W(S) (1/h)^{2r}, \tag{2.23}$$
where in the second inequality use of (2.20) was made. For convenience we introduce also another measure of the “volume” of a collection of skeletons, which is motivated by the procedure described in the explanation of why (2.22) and (2.23) hold. We define $\check V(S)$ as the number of sites inside the minus-components $Q_\alpha$, which are at a distance larger than $(1/h)^r$ from $\cup \gamma_i$. Clearly we have
$$\check V(S) \le \min\{V(G), \hat V(S)\}, \tag{2.24}$$
and
$$\check V(S) \ge \max\{V(G), \hat V(S)\} - C_{10} W(S) (1/h)^{2r}. \tag{2.25}$$
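The parity rule behind the phase volume $\hat V(S)$ defined above can be made concrete with a small numerical sketch. The code below is an illustration, not the construction of [DKS]: it estimates the joint area of the minus-components of a family of closed polygons by sampling cell centers on a grid and counting, for each sample point, the crossings of a horizontal ray with all the edges. The function names and the grid-sampling approach are assumptions made for this example.

```python
import math

def crossing_parity(point, polygons):
    """Parity of the number of crossings of the ray {(x', y): x' > x} with all polygon edges."""
    x, y = point
    crossings = 0
    for poly in polygons:
        n = len(poly)
        for i in range(n):
            (px, py), (qx, qy) = poly[i], poly[(i + 1) % n]
            if (py > y) != (qy > y):  # the edge straddles the horizontal line through the point
                x_at_y = px + (y - py) * (qx - px) / (qy - py)
                if x < x_at_y:
                    crossings += 1
    return crossings % 2

def phase_volume_estimate(polygons, step=1.0):
    """Area of the odd-parity region, estimated by sampling the centers of grid cells."""
    xs = [p[0] for poly in polygons for p in poly]
    ys = [p[1] for poly in polygons for p in poly]
    x0, y0 = min(xs), min(ys)
    nx = math.ceil((max(xs) - x0) / step)
    ny = math.ceil((max(ys) - y0) / step)
    count = 0
    for i in range(nx):
        for j in range(ny):
            center = (x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
            count += crossing_parity(center, polygons)
    return count * step * step

# Two overlapping squares: the overlap has even parity and is not counted.
a = [(0, 0), (4, 0), (4, 4), (0, 4)]
b = [(2, 2), (6, 2), (6, 6), (2, 6)]
print(phase_volume_estimate([a]), phase_volume_estimate([a, b]))
```

The example of two overlapping squares shows why the phase volume of a collection can differ from the sum of the areas enclosed by the individual curves: a region enclosed by both curves has even parity and is not counted.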
Given a finite set $G$ of $h$-vertebrate contours we denote by $S_G^{h,co}$ the set of configurations which belong to $\Omega_-$ and which have as their collection of external $h$-vertebrate contours the set $G$. We say that $G$ is a compatible set of external $h$-vertebrate contours in case $S_G^{h,co}$ is not empty. Similarly, given a finite set of skeletons $S$, we define $C_S^h$ as the class of all $G$ which are compatible sets of external $h$-vertebrate contours and which have as their set of skeletons the set $S$. We define also
$$S_S^{h,sk} = \bigcup_{G \in C_S^h} S_G^{h,co}.$$
$S_S^{h,sk}$ is the set of configurations which belong to $\Omega_-$ and which have as their collection of skeletons corresponding to their external $h$-vertebrate contours the set $S$. We say that $S$ is a compatible set of skeletons in case $S_S^{h,sk}$ is not empty. Again similarly, given an interval of real numbers, $I$, we define $S_I^{h,W}$ as the set of configurations which belong to $\Omega_-$ and for which the collection $S$ of skeletons corresponding to their external $h$-vertebrate contours satisfies $W(S) \in I$. One would like to say that the volume of a collection of skeletons is the sum of the volumes of the individual skeletons. One proper version of this is the following relation, which follows from the argumentation used to show (2.22) and (2.23). In this relation, and also later on, we use the simplified notation $\check V(\gamma_i)$ in place of the more cumbersome $\check V(\{\gamma_i\})$. If $S = \{\gamma_1, \ldots, \gamma_n\}$ is a compatible set of skeletons, then
$$\check V(S) = \sum_{i=1}^{n} \check V(\gamma_i). \tag{2.26}$$
A fundamental fact about the Wulff shape and the associated quantity $w$ is the variational characterization (1.10). By scaling lengths we obtain easily from this and (2.24) that for any skeleton $\gamma$
$$w \sqrt{\check V(\gamma)} \le W(\gamma). \tag{2.27}$$
This inequality will now be used to derive another one, which is of central relevance in this paper. This is the content of the next lemma.

Lemma 2.3.1. For each configuration in $R$, the associated set of skeletons $S$, corresponding to its set of vertebrate external contours, satisfies
$$W(S) \ge 2 m^* h \check V(S).$$
Proof. Say that the collection of external vertebrate contours for the configuration with which we are concerned is $G = \{\Gamma_1, \ldots, \Gamma_n\}$ and that $\Gamma_i$ has skeleton $\gamma_i$, for $i = 1, \ldots, n$. From (2.24) and the definition (2.10) of $R$ we have for each $i$,
$$\sqrt{\check V(\gamma_i)} \le \frac{B_c}{h}. \tag{2.28}$$
Multiplying the inequalities (2.27) (with $\gamma = \gamma_i$) and (2.28) by each other, and using the fact that $B_c = w/(2m^*)$, we obtain
$$W(\gamma_i) \ge 2 m^* h \check V(\gamma_i).$$
The claim now follows by adding over $i$, using (2.26).
In order to show (2.17), we need to know that the skeletons of configurations in $F_x^-$ are associated with sufficiently large values of $W(\cdot)$. This is the content of the next lemma.

Lemma 2.3.2. Given $\epsilon > 0$ there is $h_0 > 0$ such that for all $0 < h \le h_0$ and all $x \in \mathbb{Z}^2$ the following holds. For each configuration in $F_x^-$, the associated set of skeletons $S$, corresponding to its set of vertebrate external contours, satisfies
$$W(S) \ge \frac{2A(1 - \epsilon)}{h}.$$
Proof. Let $\sigma \in F_x^-$ be the configuration with which we are concerned. By definition, $\sigma^x \in R^c$, so the configuration $\sigma^x$ has an external contour $\Gamma$ with $V(\Gamma) > (B_c/h)^2$. On the other hand, $\sigma \in R$, so the contour $\Gamma$ of the configuration $\sigma^x$ is attached to the site $x$. Let $\mathrm{sq}(x)$ be the $3 \times 3$ square of the dual lattice, centered at $x$. Consider the set $\tilde G(x, \sigma)$ of all dual bonds that are either in $\mathrm{sq}(x)$ (there are 24 of them) or belong to a contour of $\sigma$, which contains some bonds of $\mathrm{sq}(x)$. Let $G(x, \sigma) \subset \tilde G(x, \sigma)$ be those bonds which are “visible from infinity”, i.e., are not screened away from it by other bonds in $\tilde G(x, \sigma)$. Evidently, the set of bonds $G(x, \sigma)$ serves as a set of bonds of exterior contours of some new configuration $\eta = \eta(x, \sigma)$, which in fact has no interior contours. Let these contours be $G_1, G_2, \ldots, G_k$. Clearly, $k \le 6$, since each contour $G_i$ passes through at least two boundary sites of $\mathrm{sq}(x)$. Note that $\tilde G(x, \sigma) = \tilde G(x, \sigma^x)$, so the same is true also for $G(x, \sigma^x)$, $\eta(x, \sigma^x)$ and $G_1, G_2, \ldots, G_k$.
The idea of the proof is to observe that $\mathrm{Int}\,\Gamma \subset \cup_i \mathrm{Int}\,G_i$, so in particular $\sum_i V(G_i) \ge V(\Gamma)$. That implies that the contours $G_i$ should be quite long. On the other hand, the set of dual bonds $G(x, \sigma) \setminus \mathrm{sq}(x)$ belongs to exterior contours of $\sigma$, so they should be long as well. The implementation of the above argument is as follows. Let $\tilde S$ be the set of skeletons of the subset of all vertebrate contours among $G_1, G_2, \ldots, G_k$. Denote the corresponding subset of indices by $\mathrm{ver} \subset \{1, \ldots, k\}$. We have then that $\tilde S = \{\gamma_i, i \in \mathrm{ver}\}$. Note that the volume of any invertebrate contour is $o(1/h)$, so we have:
$$\sum_{i \in \mathrm{ver}} V(G_i) \ge \left(\frac{B_c}{h}\right)^2 - o\left(\frac{1}{h}\right).$$
Hence the estimate (2.25) implies that also
$$\sum_{i \in \mathrm{ver}} \check V(\gamma_i) \ge \left(\frac{B_c}{h}\right)^2 - o\left(\frac{1}{h}\right).$$
Using now (2.27) we have
$$W(\tilde S) \ge w \sum_{i \in \mathrm{ver}} \sqrt{\check V(\gamma_i)} \ge w \sqrt{\left(\frac{B_c}{h}\right)^2 - o\left(\frac{1}{h}\right)} = \frac{2A}{h} - o\left(\frac{1}{h}\right), \tag{2.29}$$
where we are using an evident property of the square root and also the definitions (1.28). The only remaining problem now is the relation between the skeletons $S$ and $\tilde S$. One would like to argue that in some sense $\tilde S \subset S$, which would imply our claim. However, the last inclusion is almost always violated. The way out of this unlucky circumstance is the following. Note that in fact the skeleton of a contour is not uniquely defined; it should just be a closed polygon satisfying the properties of the type (S.1)-(S.3) above. Once this is the case, any such polygon satisfies any statement made above about any skeleton. We are going to use this nonuniqueness in order to prepare a special family of skeletons $\tilde{\tilde S}$ of the family $\{G_i, i \in \mathrm{ver}\}$. This family is going to be constructed in such a way as to use as big pieces of the family $S$ as possible. Namely, note that every maximal connected arc $k_j$ of $G(x, \sigma) \setminus \mathrm{sq}(x)$ is also an arc of some external contour of the configuration $\sigma$, and so it inherits a portion $\kappa_j$ of the skeleton $S$. (Note that there are at most 6 such arcs.) Namely, define $\kappa_j$ to be the maximal subpolygon of $S$ with both endpoints in $k_j$. (It might happen that some $\kappa_j$ are empty; for example, this will be the case when the external contour in question is invertebrate.) It is immediate to see that the family of the (open) polygons $\cup \kappa_j$ can be made into a skeleton family of the contour family $\{G_i, i \in \mathrm{ver}\}$ by adding at most six extra edges to it, which addition might require the prior removal of some ending edges from $\cup \kappa_j$ (also at most six in number). This is the skeleton family $\tilde{\tilde S}$ sought. Now (2.29), being valid for every possible skeleton family of the contour family $\{G_i, i \in \mathrm{ver}\}$, implies that
$$W(\tilde{\tilde S}) \ge \frac{2A}{h} - o\left(\frac{1}{h}\right).$$
On the other hand,
$$W(S) \ge W(\tilde{\tilde S}) - o\left(\frac{1}{h}\right),$$
since every edge of $\tilde{\tilde S}$ except a finite number belongs to $S$. (Note that in fact the family $S$ might be much longer than $\tilde{\tilde S}$, though it is immaterial for our argument.)

The next lemma shows that vertebrate contours have a minimum cost.

Lemma 2.3.3. There is a constant $h_0 > 0$ such that for $0 < h \le h_0$ the following holds. For each configuration in $\Omega_-$, the associated set of skeletons $S$, corresponding to its set of vertebrate external contours, is either empty or satisfies
$$W(S) \ge \frac{w}{\sqrt{2}} \left(\frac{1}{h}\right)^{b}.$$
Proof. Suppose that $S$ is not empty, so that there exists $\gamma \in S$. This means that $\gamma$ is the skeleton of a vertebrate contour $\Gamma$. Suppose that the inequality that we should prove is false and use (2.25) to prove that then
$$\check V(\gamma) \ge V(\Gamma) - C W(S) \left(\frac{1}{h}\right)^{2r} \ge \left(\frac{1}{h}\right)^{2b} - C \left(\frac{1}{h}\right)^{2r+b} \ge \frac{1}{2} \left(\frac{1}{h}\right)^{2b}.$$
In the last inequality we used the fact that $r < b/2$, and for this last inequality to be true $h_0$ has to be taken properly. Using (2.27) now, we obtain
$$W(S) \ge W(\gamma) \ge \frac{w}{\sqrt{2}} \left(\frac{1}{h}\right)^{b}.$$
The next two lemmas are of a somewhat technical nature. The first one will only be used in the proof of the second one, while the second one is a step towards proving (2.18) and will also be used in the proof of the fundamental Lemma 2.3.6 below. Some readers may prefer to first read the proof of Lemma 2.3.6, which has an immediate heuristic appeal, and later return to Lemmas 2.3.4 and 2.3.5.

We will use the notation $P(x, l) = \{x \stackrel{+*}{\longleftrightarrow} (x + \Lambda(l))^c\}$ for the event that $x$ is (+,*)-connected to a site outside the box $x + \Lambda(l)$. Given a set $G$ of compatible external $h$-vertebrate contours (possibly $G = \emptyset$), we define the following conditional Gibbs measure and corresponding expectation:
$$\mu^{G,h}_{\Lambda,-,h'} = \mu_{\Lambda,-,h'}(\,\cdot\,|S_G^{h,co}) \quad \text{and} \quad \langle f \rangle^{G,h}_{\Lambda,-,h'} = \int f \, d\mu^{G,h}_{\Lambda,-,h'}.$$
Lemma 2.3.4. There are positive finite constants $C_1$ and $C_2$ such that for all $0 < h' \le h \le 1$, all $x \in \mathbb{Z}^2$ and all finite $\Lambda \subset \mathbb{Z}^2$,
$$\mu^{\emptyset,h}_{\Lambda,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right) \le C_1 \exp(-C_2/h^a).$$
Proof. We start with an observation akin to the one that originated (2.16), but with the box $x + \Lambda(3/h^{2b})$ replacing the box $x + \Lambda(2/h^3)$ used there. Conditioning on what the set of contours which surround at least one site in $(x + \Lambda(3/h^{2b}))^c$ is we obtain, when $h$ is small,
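$$\mu^{\emptyset,h}_{\Lambda,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right) = \sum_i \alpha_i \cdot \mu^{\emptyset,h}_{\Lambda^i,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right),$$
where the index $i$ runs over a finite set, for each of its values $\Lambda^i$ is a subset of $x + \Lambda(3/h^{2b})$ which contains $(x + \Lambda(1/h^{2b})) \cap \Lambda$ and $\alpha_i > 0$; moreover $\sum_i \alpha_i = 1$. The event that the box $\Lambda^i$ is free of $h$-vertebrate contours is a decreasing event, while $P(x, \frac{1}{2h^a})$ is an increasing event, so by the FKG-Holley inequalities we obtain
$$\mu^{\emptyset,h}_{\Lambda^i,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right) \le \mu_{\Lambda^i,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right).$$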
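Using now the fact that for each $i$, $|\Lambda^i| \le 9/h^{4b}$, that $b < 1/4$ and that $0 < h' \le h \le 1$, we have
$$\mu_{\Lambda^i,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right) = \frac{Z_{\Lambda^i,-,h'}\left(P\left(x, \frac{1}{2h^a}\right)\right)}{Z_{\Lambda^i,-,h'}} \le \exp(\beta|\Lambda^i|h) \, \frac{Z_{\Lambda^i,-,0}\left(P\left(x, \frac{1}{2h^a}\right)\right)}{Z_{\Lambda^i,-,0}} \le \mu_{\Lambda^i,-,0}\left(P\left(x, \frac{1}{2h^a}\right)\right) \exp(C_1 h^{1-4b}) \le C_2 \exp(-C_3/h^a),$$
where in the last inequality we are also using (1.17), with the role of +s and −s switched. This completes the proof of the lemma.

Lemma 2.3.5. There are positive finite constants $C_1$ and $C_2$ such that for every local observable $g$ there is $h_0 = h_0(g) > 0$ such that for all $0 < h' \le h \le h_0$ and all finite $\Lambda \subset \mathbb{Z}^2$,
$$\left| \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} - \langle g \rangle_{\Lambda \cap (\mathrm{Supp}(g) + \Lambda(1/h^a)),-,h'} \right| \le C_1 \|g\|_\infty |\mathrm{Supp}(g)| \exp(-C_2/h^a).$$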
Proof. Without loss in generality we suppose that $g$ is increasing and has $\|g\|_\infty \le 1$. Set
$$E_1 = \bigcup_{x \in \mathrm{Supp}(g)} P\left(x, \frac{1}{2h^a}\right).$$
Once more we use an argument similar to the one used to prove (2.16), this time we condition on what the (+,*)-cluster of the set $(\mathrm{Supp}(g) + \Lambda(1/h^a))^c$ is. In this fashion we obtain the following equality:
$$\langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} = \left( \sum_i \alpha_i \cdot \langle g \rangle^{\emptyset,h}_{\Lambda^i,-,h'} \right) \mu^{\emptyset,h}_{\Lambda,-,h'}((E_1)^c) + \langle g | E_1 \rangle^{\emptyset,h}_{\Lambda,-,h'} \, \mu^{\emptyset,h}_{\Lambda,-,h'}(E_1),$$
where the index $i$ runs over a finite set, for each of its values $\Lambda^i$ is a subset of $\Lambda \cap (\mathrm{Supp}(g) + \Lambda(1/h^a))$ which contains $\Lambda \cap \mathrm{Supp}(g)$ and $\alpha_i > 0$; moreover $\sum_i \alpha_i = 1$. Note that for small enough $h_0$ (depending on $\mathrm{Supp}(g)$) the fact that $a < b$ implies that no $h$-vertebrate contour can fit inside $\Lambda \cap (\mathrm{Supp}(g) + \Lambda(1/h^a))$, so that for each value
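of $i$ we have $\langle \,\cdot\, \rangle^{\emptyset,h}_{\Lambda^i,-,h'} = \langle \,\cdot\, \rangle_{\Lambda^i,-,h'}$. The equality displayed above and Lemma 2.3.4 now lead to
$$\left| \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} - \sum_i \alpha_i \cdot \langle g \rangle_{\Lambda^i,-,h'} \right| \le 2 \mu^{\emptyset,h}_{\Lambda,-,h'}(E_1) \le C_1 |\mathrm{Supp}(g)| \exp(-C_2/h^a), \tag{2.30}$$
for a proper choice of $h_0$. Similarly, consider now the event
$$E_2 = \bigcup_{x \in \mathrm{Supp}(g) + \Lambda(1/h^a)} P\left(x, \frac{1}{2h^a}\right),$$
and condition on what the (+,*)-cluster of $(\mathrm{Supp}(g) + \Lambda(2/h^a))^c$ is. The same reasoning as above gives us, for small enough $h_0$,
$$\left| \langle g \rangle^{\emptyset,h}_{\Lambda,-,h'} - \sum_j \alpha_j \cdot \langle g \rangle_{\Lambda^j,-,h'} \right| \le 2 \mu^{\emptyset,h}_{\Lambda,-,h'}(E_2) \le C_1 |\mathrm{Supp}(g)| \exp(-C_2/h^a), \tag{2.31}$$
where the index $j$ runs over a finite set, for each of its values $\Lambda^j$ is a subset of $\Lambda \cap (\mathrm{Supp}(g) + \Lambda(2/h^a))$ which contains $\Lambda \cap (\mathrm{Supp}(g) + \Lambda(1/h^a))$ and $\alpha_j > 0$; moreover $\sum_j \alpha_j = 1$. Thanks to the fact that $g$ is being supposed to be increasing, we have for all $i$ and all $j$ as above,
$$\langle g \rangle_{\Lambda^i,-,h'} \le \langle g \rangle_{\Lambda \cap (\mathrm{Supp}(g) + \Lambda(1/h^a)),-,h'} \le \langle g \rangle_{\Lambda^j,-,h'}. \tag{2.32}$$
The lemma follows from (2.30), (2.31) and (2.32).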
The next lemma is a fundamental step, in which an aspect of the heuristics about droplets and their surface and bulk free-energies is made into a rigorous result.

Lemma 2.3.6. For any $p > 0$, given $\epsilon > 0$ there is a finite positive constant $h_0$ such that for any $0 < h \le h_0$, any collection of skeletons $S$, and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$,
$$\tilde\mu_{\Lambda,-,h}(S_S^{h,sk}) \le \exp\left(-\beta\left[(1 - \epsilon) W(S) - (1 + \epsilon) h m^* \check V(S)\right]\right) \le \exp\left(-\beta(1 - 3\epsilon)\frac{W(S)}{2}\right).$$
Note. In this and some further statements the restriction $|\Lambda| \le 1/h^p$ is not really necessary. Still, we are using it since it simplifies some arguments, and is enough for our purposes.
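For orientation, here is how the second inequality in Lemma 2.3.6 can be read off from Lemma 2.3.1; this is a short worked step, assuming that $S$ arises from a configuration in $R$ (which is the only case in which the left-hand side is nonzero). Lemma 2.3.1 gives $h m^* \check V(S) \le W(S)/2$, so that
$$(1 - \epsilon) W(S) - (1 + \epsilon) h m^* \check V(S) \ge (1 - \epsilon) W(S) - (1 + \epsilon)\frac{W(S)}{2} = (1 - 3\epsilon)\frac{W(S)}{2}.$$
The surface term $W(S)$ thus dominates the bulk gain $h m^* \check V(S)$ by a factor of at least two for subcritical droplets, which is the rigorous form of the free-energy barrier heuristics.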
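Proof. The second inequality is immediate from Lemma 2.3.1, so that we only have to prove the first one. It is enough to consider the case of the nonempty skeleton $S$. We start with
\[ \tilde\mu_{\Lambda,-,h}(\mathcal{S}^{h,\mathrm{sk}}_S) \le \sum_{G \in \mathcal{C}^h_S} \frac{Z_{\Lambda,-,h}(\mathcal{S}^{h,\mathrm{co}}_G)}{Z_{\Lambda,-,h}(\mathcal{S}^{h,\mathrm{co}}_\emptyset)} = \sum_{G \in \mathcal{C}^h_S} \frac{Z_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{co}}_G)}{Z_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{co}}_\emptyset)} \exp\Big( \frac{\beta}{2} \int_0^h \sum_{x \in \Lambda} \Big[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \Big] dh' \Big), \tag{2.33} \]
where in the first step we used the fact that $R \supset \mathcal{S}^{h,\mathrm{co}}_\emptyset$ when $h$ is small, while in the second step we used the fact that for an arbitrary $E \subset \Omega_{\Lambda,-}$,
\[ \frac{d}{dh} \log Z_{\Lambda,-,h}(E) = \frac{\beta}{2} \sum_{x \in \Lambda} \langle \sigma(x) \,|\, E \rangle_{\Lambda,-,h}. \]
Next we will show that given $\varepsilon > 0$ it is possible to take $h_0 > 0$ small enough so that if $G \in \mathcal{C}^h_S$, $0 < h' \le h \le h_0$, and $\Lambda$ is simply-connected and satisfies $|\Lambda| \le 1/h^p$, then
\[ \sum_{x \in \Lambda} \Big[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \Big] \le 2(1+\varepsilon) m^* \check{V}(S) + C W(S)/h^{2a} + 1. \tag{2.34} \]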
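(The reader should not be confused by the fact that $a$ seemingly does not enter the l.h.s. In fact, it enters, because the restriction $G \in \mathcal{C}^h_S$ depends on the related parameter $r$.) For this purpose let $\partial_- G$ (resp. $\partial_+ G$) be the set of sites where each configuration in $\Omega_{\Lambda,-}$ with the set of external contours equal to $G$ is doomed to be −1 (resp. +1). Let $\Lambda^G_{\mathrm{ext}}$ and $\Lambda^G_{\mathrm{int}}$ be the components of $\Lambda \setminus (\partial_- G \cup \partial_+ G)$ which are, respectively, external and internal to the contours in $G$. Let also $\check\Lambda^G_{\mathrm{ext}}$ and $\check\Lambda^G_{\mathrm{int}}$ be, respectively, the subsets of $\Lambda^G_{\mathrm{ext}}$ and $\Lambda^G_{\mathrm{int}}$ obtained by removing from these sets all sites which are at a distance not larger than $2/h^a$ from any point in any contour of $G$.
First we consider the sites $x$ which are neither in $\check\Lambda^G_{\mathrm{int}}$ nor in $\check\Lambda^G_{\mathrm{ext}}$. For these we have, using (2.19), (2.20) and the fact that $r < a$, that when $h_0$ is small,
\[ \sum_{x \in \Lambda \setminus (\check\Lambda^G_{\mathrm{int}} \cup \check\Lambda^G_{\mathrm{ext}})} \Big[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \Big] \le 2 |\Lambda \setminus (\check\Lambda^G_{\mathrm{int}} \cup \check\Lambda^G_{\mathrm{ext}})| \le C W(S)/h^{2a}. \tag{2.35} \]
Regarding now the sites $x \in \check\Lambda^G_{\mathrm{ext}}$, we observe that for these sites
\[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} = \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda^G_{\mathrm{ext}},-,h'}. \]
But for each such $x$ we have $\Lambda^G_{\mathrm{ext}} \cap (x + \Lambda(1/h^a)) = \Lambda \cap (x + \Lambda(1/h^a))$, so that a double application of Lemma 2.3.5 gives us, for small $h_0$, that
\[ \Big| \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda^G_{\mathrm{ext}},-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \Big| \le C_1 \exp(-C_2/h^a). \]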
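Combining the last two displays, and using the fact that $|\check\Lambda^G_{\mathrm{ext}}| \le |\Lambda| \le 1/h^p$, we obtain for small enough $h_0$,
\[ \sum_{x \in \check\Lambda^G_{\mathrm{ext}}} \Big[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \Big] \le 1. \tag{2.36} \]
Finally, regarding now the sites $x \in \check\Lambda^G_{\mathrm{int}}$, we observe that for these sites
\[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} = \langle \sigma(x) \rangle_{\Lambda^G_{\mathrm{int}},+,h'}. \]
But since each such $x$ is separated from the boundary of the set $\Lambda^G_{\mathrm{int}}$ by a minimal distance $1/h^a$, when $h$ is small, we obtain from the FKG-Holley inequalities, (1.22) and (1.7),
\[ \langle \sigma(x) \rangle_{\Lambda^G_{\mathrm{int}},+,h'} \le \langle \sigma(x) \rangle_{\Lambda^G_{\mathrm{int}},+,h} \le m(h) + C_1 \exp(-C_2/h^a) \le m^*(1+\varepsilon), \]
provided $h_0$ is chosen small enough. On the other hand, since $\Lambda$ is simply-connected, for each $x \in \check\Lambda^G_{\mathrm{int}}$ we have $\Lambda \cap (x + \Lambda(1/h^a)) = x + \Lambda(1/h^a)$, so that Lemma 2.3.5 gives us, for small $h_0$,
\[ \Big| \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle_{x+\Lambda(1/h^a),-,h'} \Big| \le C_1 \exp(-C_2/h^a). \]
By another application of the FKG-Holley inequalities and (1.22) (with +1 and −1 switched) we have
\[ \langle \sigma(x) \rangle_{x+\Lambda(1/h^a),-,h'} \ge \langle \sigma(x) \rangle_{x+\Lambda(1/h^a),-,0} \ge -m^* - C_1 \exp(-C_2/h^a) \ge -m^*(1+\varepsilon), \]
provided $h_0$ is chosen small enough. Combining the last four displays, and using the fact that $|\check\Lambda^G_{\mathrm{int}}| \le \check{V}(S)$, for small enough $h_0$ we obtain
\[ \sum_{x \in \check\Lambda^G_{\mathrm{int}}} \Big[ \langle \sigma(x) \rangle^{G,h}_{\Lambda,-,h'} - \langle \sigma(x) \rangle^{\emptyset,h}_{\Lambda,-,h'} \Big] \le 2(1+\varepsilon) m^* \check{V}(S). \tag{2.37} \]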
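By adding (2.35), (2.36) and (2.37), we obtain (2.34). Keeping in mind that our concern is with the r.h.s. of (2.33), we observe now that
\[ \sum_{G \in \mathcal{C}^h_S} \frac{Z_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{co}}_G)}{Z_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{co}}_\emptyset)} = \frac{\mu_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{sk}}_S)}{\mu_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{co}}_\emptyset)}. \tag{2.38} \]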
The numerator of this fraction is controlled in the fundamental Lemma 10.1 in [Pfi], where it is shown to satisfy
\[ \mu_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{sk}}_S) \le \exp(-\beta W(S) + C J(S)), \]
where $J(S) = \sum_i J(\gamma_i)$, in case $S = \{\gamma_1, ..., \gamma_n\}$, with $J(\gamma)$ being the number of vertices of the skeleton $\gamma$. In reality, in the statement of Lemma 10.1 in [Pfi] the assumption that the temperature is low enough is made. But, as pointed out in [Iof2], this assumption is actually not needed for Pfister's elegant proof, based on a clever use of duality and Griffiths' inequalities, to work. Using (2.21) we can now write that given $\varepsilon > 0$ there is $h_0 > 0$ such that for $0 < h \le h_0$,
\[ \mu_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{sk}}_S) \le \exp(-\beta(1 - \varepsilon/2) W(S)). \tag{2.39} \]
Turning now to the denominator in the r.h.s. of (2.38), observe that at the boundary of any external vertebrate contour of a configuration in $\Omega_{\Lambda,-}$ there is a (+,*)-chain with $l_\infty$-diameter at least $1/h^b$. If $\sigma \in (\mathcal{S}^{h,\mathrm{co}}_\emptyset)^c$, then in $\sigma$ there is such a (+,*)-chain, and (1.17) (with +1 and −1 switched) can be used in combination with $|\Lambda| \le 1/h^p$ to conclude that
\[ \mu_{\Lambda,-,0}(\mathcal{S}^{h,\mathrm{co}}_\emptyset) \ge \frac{1}{2}, \tag{2.40} \]
for small enough $h$. The first inequality claimed in the statement of the lemma follows from (2.33), (2.34), (2.38), (2.39) and (2.40), since $C h^{1-2a} < \varepsilon/2$ if $h$ is small.
The next lemma takes care of some remaining entropy.
Lemma 2.3.7. For any $p > 0$, given $\varepsilon > 0$ there exists $h_0 > 0$ such that for all $u > 0$ and $D > 0$, there is a finite positive constant $C$ such that for any $0 \le h \le h_0$, and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[ \tilde\mu_{\Lambda,-,h}\Big( \mathcal{S}^{h,W}_{[D/h^u,\,\infty)} \Big) \le C \exp\Big( -\beta(1-\varepsilon) \frac{D}{2h^u} \Big). \]
Proof. There is no loss in generality in supposing that $0 < \varepsilon < 1$ and that $u > r$; the second of these claims being justified by Lemma 2.3.3 and the fact that $r < b$. We start by estimating $\tilde\mu_{\Lambda,-,h}\big( \mathcal{S}^{h,W}_{[Dk/h^u,\, D(k+1)/h^u)} \big)$, for $k = 1, 2, ...$. To bound these quantities from above, using Lemma 2.3.6, all that we need is an upper bound on the number of choices of skeletons $S$ which correspond to configurations in $\Omega_{\Lambda,-}$ and have $W(S) \in [Dk/h^u, D(k+1)/h^u)$. Using (2.21), the number $J(S)$ of vertices that $S$ can have is bounded above by $N_k = C_1 D(k+1)/h^{u-r} \le C_2 Dk/h^{u-r}$. Let $V$ be the set of distinct points which are possible vertices of $S$. The cardinality of $V$ is bounded above by $4|\Lambda| \le 4/h^p$. We consider now ordered $N_k$-tuples of points in $V$ and associate to each point in such an $N_k$-tuple one of the words “continue”, “close”, or “quit”. To every such object we associate a collection of closed polygons in the following way. We start from the first point in the $N_k$-tuple and while we keep seeing the word “continue”, we do the following.
We join the first point to the second point, this one to the third point, and so on. When we first reach a point where we read “close”, we connect it to the first point of the $N_k$-tuple, closing a first polygonal line, and then we jump to the point which in the $N_k$-tuple follows the point where we just read “close”. While we do not see the word “quit”, we proceed in this fashion, understanding that the word “close” means that we close the polygonal line which we are currently constructing, and that then we jump to the next point in the $N_k$-tuple and start the next polygonal line from there. The word “quit” is self-explanatory: we stop the procedure by closing the last polygon and disregard the remaining points of the $N_k$-tuple. The procedure that we described generates all the collections of skeletons with which we are concerned and plenty of additional garbage. At any rate, counting the number of options here gives us an upper bound on the quantity we are interested in. This upper bound on the number of choices of $S$ is bounded above by
\[ |V|^{N_k} 3^{N_k} \le \Big( \frac{12}{h^p} \Big)^{C_2 D k / h^{u-r}} \le \exp\Big( \beta \frac{\varepsilon D}{4h^u} k \Big), \]
for small enough $h$. Combining this estimate with Lemma 2.3.6 gives us that for small $h_0$,
\[ \tilde\mu_{\Lambda,-,h}\Big( \mathcal{S}^{h,W}_{[D/h^u,\,\infty)} \Big) = \sum_{k=1}^{\infty} \tilde\mu_{\Lambda,-,h}\Big( \mathcal{S}^{h,W}_{[Dk/h^u,\, D(k+1)/h^u)} \Big) \le \sum_{k=1}^{\infty} C' \exp\Big( -\beta \Big(1 - \frac{\varepsilon}{2}\Big) \frac{D}{2h^u} k \Big) \exp\Big( \beta \frac{\varepsilon D}{4h^u} k \Big) \le C \exp\Big( -\beta(1-\varepsilon) \frac{D}{2h^u} \Big), \]
provided $\varepsilon < 1$.
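The “continue / close / quit” encoding used in the proof above can be made concrete with a short sketch. The following Python fragment is only an illustration: the function names `decode_polygons` and `count_labelled_tuples` are hypothetical, and the choice to discard a trailing run of points that is never closed is an assumption (such objects belong to the "additional garbage" and do not affect the upper bound).

```python
from typing import List, Tuple

Point = Tuple[int, int]


def decode_polygons(points: List[Point], words: List[str]) -> List[List[Point]]:
    """Turn a labelled tuple of points into a list of closed polygons.

    Each polygon is returned as its list of vertices; the closing edge
    from the last vertex back to the first one is implicit.
    """
    polygons: List[List[Point]] = []
    current: List[Point] = []
    for point, word in zip(points, words):
        current.append(point)
        if word == "continue":
            continue  # keep extending the current polygonal line
        # Both "close" and "quit" close the polygon under construction.
        polygons.append(current)
        current = []
        if word == "quit":
            break  # the remaining points of the tuple are disregarded
    # Assumption: a trailing open polygonal line is simply dropped.
    return polygons


def count_labelled_tuples(num_vertices: int, tuple_length: int) -> int:
    """Number of labelled tuples: |V|^N choices of points times 3^N words."""
    return (3 * num_vertices) ** tuple_length


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (1, 1), (5, 5), (6, 5), (6, 6), (9, 9)]
    wds = ["continue", "continue", "close", "continue", "continue", "quit", "continue"]
    print(decode_polygons(pts, wds))
    print(count_labelled_tuples(4, 2))
```

Since every admissible collection of skeletons arises from at least one labelled tuple, the count $(3|V|)^{N_k}$ is an upper bound on the number of choices of $S$, which is all the proof needs.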
We are now close to completing the proof of Proposition 2.1.1. The inequality (2.17) is a direct consequence of Lemmas 2.3.2 and 2.3.7. For use in Sect. 3.2, we observe that the same argument gives for each $p > 0$ and $\varepsilon > 0$ the existence of some finite $C_1$ so that
\[ \sup_{\substack{\Lambda \text{ simply-connected} \\ |\Lambda| \le (1/h)^p}} \; \sup_{x \in \Lambda} \tilde\mu_{\Lambda,-,h}(F_x) \le C_1 \exp\Big( -\beta(1-\varepsilon) \frac{A}{h} \Big), \tag{2.41} \]
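for small $h$. Turning to (2.18), Lemma 2.3.5, with $h = h'$, gives us that when $h$ is small, for all $\Lambda \in L_{0,h}$,
\[ \Big| \langle f \rangle^{\emptyset,h}_{\Lambda,-,h} - \langle f \rangle_{\Lambda(1/h^a),-,h} \Big| \le C(f) \exp(-C_1/h^a). \tag{2.42} \]
We will be done once we replace the conditional Gibbs distribution $\mu^{\emptyset,h}_{\Lambda,-,h}$, implicit in this expression, with the conditional Gibbs distribution $\tilde\mu_{\Lambda,-,h}$. To this end we combine Lemmas 2.3.3 and 2.3.7 to write
\[ \tilde\mu_{\Lambda,-,h}\big( (\mathcal{S}^{h,\mathrm{co}}_\emptyset)^c \big) \le C_2 \exp(-C_3/h^b). \]
Combining this inequality with the equalities
\[ \int f \, d\tilde\mu_{\Lambda,-,h} = \int_{\mathcal{S}^{h,\mathrm{co}}_\emptyset} f \, d\tilde\mu_{\Lambda,-,h} + \int_{(\mathcal{S}^{h,\mathrm{co}}_\emptyset)^c} f \, d\tilde\mu_{\Lambda,-,h}, \]
and
\[ \langle f \rangle^{\emptyset,h}_{\Lambda,-,h} = \frac{\int_{\mathcal{S}^{h,\mathrm{co}}_\emptyset} f \, d\tilde\mu_{\Lambda,-,h}}{\tilde\mu_{\Lambda,-,h}(\mathcal{S}^{h,\mathrm{co}}_\emptyset)}, \]
gives us
\[ \Big| \int f \, d\tilde\mu_{\Lambda,-,h} - \langle f \rangle^{\emptyset,h}_{\Lambda,-,h} \Big| \le C_4(f) \exp(-C_3/h^b) \le C_4(f) \exp(-C_3/h^a). \tag{2.43} \]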
Together (2.42) and (2.43) give us (2.18), and Proposition 2.1.1 is proved.
2.4. Asymptotic expansion.
Proposition 2.4.1. Suppose that $T < T_c$. Then for each constant $a \in (0, 1/2)$ and local observable $f$, for $n = 1, 2, 3, ...$, the following expansion holds when $h > 0$:
\[ \langle f \rangle_{\Lambda(1/h^a),-,h} = \sum_{j=0}^{n-1} b_j(f) h^j + O(h^n), \]
where
\[ b_j(f) = \frac{1}{j!} \frac{d^j \langle f \rangle_{-,h}}{dh^j} \Big|_{h=0^-} = \frac{1}{j!} \Big( \frac{\beta}{2} \Big)^j \sum_{x_1,...,x_j \in Z^2} \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_-, \]
and $O(h^n)$ is a function of $f$ and $h$ which satisfies $\limsup_{h \searrow 0} |O(h^n)|/h^n < \infty$.
The existence of the derivatives $b_j(f)$ and their relations with the summations over generalized Ursell functions are contained in (1.25), modulo the interchange of the role of +s and −s. Below we will also be using various other relations from Sect. 1.3 modulo this symmetry; we will do it without further warning. From (1.23) and (1.17), in combination with an idea already explained in Sect. 2.1 in connection with the derivation of (2.1) and also exploited at the end of the proof of Lemma 2.3.4 (and which amounts to an estimate on a Radon-Nikodym derivative), we have for each $a \in (0, 1/2)$, $T < T_c$, $0 \le h' \le h \le 1$ and local observables $f$ and $g$,
\[ \begin{aligned} |\langle f; g \rangle_{\Lambda(1/h^a),-,h'}| &\le C(f,g) \, \mu_{\Lambda(1/h^a),-,h'}\big( \mathrm{Supp}(f) \stackrel{+}{\longleftrightarrow} \mathrm{Supp}(g) \big) \\ &= C(f,g) \, \frac{Z_{\Lambda(1/h^a),-,h'}\big( \mathrm{Supp}(f) \stackrel{+}{\longleftrightarrow} \mathrm{Supp}(g) \big)}{Z_{\Lambda(1/h^a),-,h'}} \\ &\le C(f,g) \exp(\beta |\Lambda(1/h^a)| h') \, \frac{Z_{\Lambda(1/h^a),-,0}\big( \mathrm{Supp}(f) \stackrel{+}{\longleftrightarrow} \mathrm{Supp}(g) \big)}{Z_{\Lambda(1/h^a),-,0}} \\ &\le C(f,g) \, e^{\beta} \, \mu_{\Lambda(1/h^a),-,0}\big( \mathrm{Supp}(f) \stackrel{+}{\longleftrightarrow} \mathrm{Supp}(g) \big) \\ &\le C(f,g,T) \exp\big( -C(T) \, \mathrm{dist}_\infty(\mathrm{Supp}(f), \mathrm{Supp}(g)) \big). \end{aligned} \tag{2.44} \]
As with (1.27) in Sect. 1.3, the argument in Appendix B of [M-L] shows that from the exponential decay of correlations in (2.44) a similar exponential decay follows for the generalized Ursell functions. With $a$, $T$, $h$, $h'$ and $f$ as above,
\[ |\langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,h'}| \le C_j(T,f) \exp\Big( -C(T) \frac{\mathrm{diam}_\infty(\mathrm{Supp}(f) \cup \{x_1, ..., x_j\})}{j} \Big). \tag{2.45} \]
If we keep $0 < h \le 1$ fixed and look at $\langle f \rangle_{\Lambda(1/h^a),-,h'}$ as a function of $h' \ge 0$, we can write the following Taylor expansion:
\[ \begin{aligned} \langle f \rangle_{\Lambda(1/h^a),-,h} &= \sum_{j=0}^{n-1} \frac{1}{j!} \frac{d^j \langle f \rangle_{\Lambda(1/h^a),-,h'}}{d(h')^j} \Big|_{h'=0} h^j + \frac{1}{n!} \frac{d^n \langle f \rangle_{\Lambda(1/h^a),-,h'}}{d(h')^n} \Big|_{h'=h''(h)} h^n \\ &= \sum_{j=0}^{n-1} \frac{1}{j!} \Big( \frac{\beta}{2} \Big)^j \sum_{x_1,...,x_j \in \Lambda(1/h^a)} \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} \, h^j \\ &\quad + \frac{1}{n!} \Big( \frac{\beta}{2} \Big)^n \sum_{x_1,...,x_n \in \Lambda(1/h^a)} \langle f; \sigma(x_1); ...; \sigma(x_n) \rangle_{\Lambda(1/h^a),-,h''(h)} \, h^n, \end{aligned} \tag{2.46} \]
where all that we know about $h''(h)$ is that $h''(h) \in [0, h]$. Thanks to the uniformity in $h$ and $h'$ of the bound (2.45), this is enough to conclude that the coefficient which multiplies $h^n$ in the last term of (2.46) is bounded in absolute value by a constant which does not depend on $h$. In other words, this last term in (2.46) is indeed an $O(h^n)$. To conclude the proof of Proposition 2.4.1 we must show that for $j = 0, 1, ..., n-1$, the coefficient which multiplies $h^j$ converges fast enough to $b_j(f)$. We will show that this convergence occurs at a rate which is more than enough for this purpose. Indeed, using first (1.27) and then (1.26), we have for some finite positive $C_1$, $C_2$, $C_3$ and $C_4$,
\[ \begin{aligned} & \Big| \sum_{x_1,...,x_j \in \Lambda(1/h^a)} \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} - \sum_{x_1,...,x_j \in Z^2} \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_- \Big| \\ &\quad \le \Big| \sum_{x_1,...,x_j \in \Lambda(1/2h^a)} \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} - \sum_{x_1,...,x_j \in \Lambda(1/2h^a)} \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_- \Big| + C_1 \exp\Big( -\frac{C_2}{h^a} \Big) \\ &\quad \le \sum_{x_1,...,x_j \in \Lambda(1/2h^a)} \Big| \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_{\Lambda(1/h^a),-,0} - \langle f; \sigma(x_1); ...; \sigma(x_j) \rangle_- \Big| + C_1 \exp\Big( -\frac{C_2}{h^a} \Big) \\ &\quad \le C_3 \exp\Big( -\frac{C_4}{h^a} \Big). \end{aligned} \]
This completes the proof of Proposition 2.4.1. We have now finished the proof of Theorem 1(i) in the special case in which the initial distribution $\nu$ is concentrated on the configuration with all spins down.
2.5. More general initial distributions. In this section we will show that part (i) of Theorem 1 for an initial distribution $\nu \le \mu_-$ can be derived from the same result for the particular case in which $\nu$ is concentrated on the configuration with all spins down. With no loss in generality we will again suppose that $f$ is increasing and has $||f||_\infty \le 1$. In this case for all $t > 0$,
\[ E(f(\sigma^{\nu}_{h;t})) \ge E(f(\sigma^{-}_{h;t})). \]
The claim therefore will be proven once we show that for all $\lambda$ and $\lambda'$ which satisfy $0 < \lambda < \lambda'$, there are finite positive constants $C_1$ and $C_2$ such that for $h > 0$,
\[ E(f(\sigma^{\nu}_{h;\exp(\lambda/h)})) \le E(f(\sigma^{-}_{h;\exp(\lambda'/h)})) + C_1 \exp(-C_2/h). \tag{2.47} \]
To prove this inequality we first note that for arbitrary $s, t \ge 0$,
\[ E(f(\sigma^{\nu}_{h;t})) \le E(f(\sigma^{\mu_-}_{h;t})) = \int P(\sigma^{\mu_-}_{0;s} \in d\zeta) \, E(f(\sigma^{\zeta}_{h;t})). \]
From Lemma 1.2.1 there exist finite positive $C_3$, $C_4$ and $C_5$ so that we have
\[ \int P(\sigma^{\mu_-}_{0;s} \in d\zeta) \, E(f(\sigma^{\zeta}_{h;t})) \le \int P(\sigma^{-}_{0;s} \in d\zeta) \, E(f(\sigma^{\zeta}_{h;t})) + P\big( \sigma^{-}_{0;s}(x) \ne \sigma^{\mu_-}_{0;s}(x) \text{ for some } x \in \Lambda(C_3 t) \big) + C_4 \exp(-C_5 t). \]
From the basic-coupling inequalities and the Markov property,
\[ \int P(\sigma^{-}_{0;s} \in d\zeta) \, E(f(\sigma^{\zeta}_{h;t})) \le \int P(\sigma^{-}_{h;s} \in d\zeta) \, E(f(\sigma^{\zeta}_{h;t})) = E(f(\sigma^{-}_{h;s+t})). \]
Combining the last three displays, and taking $t = s = \exp(\lambda/h)$ we obtain
\[ \begin{aligned} E(f(\sigma^{\nu}_{h;\exp(\lambda/h)})) &\le E(f(\sigma^{-}_{h;2\exp(\lambda/h)})) + |\Lambda(C_3 \exp(\lambda/h))| \, P\big( \sigma^{-}_{0;\exp(\lambda/h)}(0) \ne \sigma^{\mu_-}_{0;\exp(\lambda/h)}(0) \big) + C_4 \exp(-C_5 \exp(\lambda/h)) \\ &\le E(f(\sigma^{-}_{h;\exp(\lambda'/h)})) + C_6 \exp(2\lambda/h) \Big( E\big( \sigma^{+}_{0;\exp(\lambda/h)}(0) \big) - m^* \Big) + C_4 \exp(-C_5 \exp(\lambda/h)), \end{aligned} \]
when $h$ is small. In the second inequality above we used the basic-coupling inequalities twice (which imply in particular the monotonicity of $E(f(\sigma^{-}_{h;t}))$ in $t$), and also the spin-reversal symmetry in case $h = 0$.
To complete the proof of (2.47) we have to show that $E(\sigma^{+}_{0;u}(0)) \to m^*$ fast enough as $u \to \infty$. The following lemma, which states that this happens faster than any power of $1/u$, is clearly sufficient for our purpose.
Lemma 2.5.1. Suppose $T < T_c$. Then for each $p > 0$ there is a positive finite constant $C$ such that
\[ 0 \le E\big( \sigma^{+}_{0;u}(0) \big) - m^* \le C u^{-p}. \]
Proof. The lower bound is a standard application of a basic-coupling inequality. To prove the upper bound, we will first also use basic-coupling inequalities, in order to compare the infinite system with finite ones with (+) boundary conditions. For these finite systems we will then use a result in [Mar], which was extended up to $T_c$ in [CGMS].
The result of [Mar] and [CGMS] that we will use refers to the spectral gap of the generator of a kinetic Ising model in a finite box. For each finite $\Lambda \subset Z^2$, $\eta \in \Omega$ and $h$, the process $(\sigma^{\,\cdot}_{\Lambda,\eta,h;t})_{t \ge 0}$ is a finite-state-space reversible irreducible Markov process and its generator has its (discrete) spectrum contained in the interval $(-\infty, 0]$, with 0 being in the spectrum. The spectral gap, denoted by $\mathrm{gap}(\Lambda, \eta, h)$, is then simply the absolute value of the largest non-zero number in the spectrum. It is shown in [Mar] (Theorem 3.1) that for low enough $T$, given $\varepsilon \in (0, 1/2)$, there exists a finite positive $C$ so that
\[ \mathrm{gap}(\Lambda(l), +, 0) \ge \exp(-C l^{1/2+\varepsilon}). \]
The common belief is that even up to $T_c$ the lower bound on $\mathrm{gap}(\Lambda(l), +, 0)$ in this inequality is far from optimal. No rigorous result in this direction is available, and for temperatures close to $T_c$ only a weaker bound has been proven. That bound, which is implicitly derived in [CGMS] (see the introduction of that paper), states that for any $\varepsilon > 0$,
\[ \mathrm{gap}(\Lambda(l), +, 0) \ge \exp(-\varepsilon l), \tag{2.48} \]
for large $l$. Fortunately for us, this estimate suffices for our purpose here. For any $l > 0$, we can write, using basic-coupling inequalities, a standard estimate for the relaxation to equilibrium of expected values of observables in terms of the spectral gap (see, e.g., inequality (59) in [Sch1]), and (1.22),
\[ \begin{aligned} E\big( \sigma^{+}_{0;u}(0) \big) - m^* &\le \Big( E\big( \sigma^{+}_{\Lambda(l),+,0;u}(0) \big) - \langle \sigma(0) \rangle_{\Lambda(l),+,0} \Big) + \Big( \langle \sigma(0) \rangle_{\Lambda(l),+,0} - m^* \Big) \\ &\le \frac{e^{-\mathrm{gap}(\Lambda(l),+,0)\, u}}{\mu_{\Lambda(l),+,0}(+)} + C_1 \exp(-C_2 l). \end{aligned} \tag{2.49} \]
We use now (2.48) with $\varepsilon = C_2/(2p)$, and we choose $l = (\log u)/(2\varepsilon)$. Since $\mu_{\Lambda(l),+,0}(+) \ge \exp(-C_3 l^2)$, the first term in the right hand side of (2.49) is bounded above by $\exp(-\sqrt{u}/2)$, when $u$ is large. This finishes the proof, since the second term in the right hand side of (2.49) is of the claimed form and this upper bound on the first term goes to 0 even faster.
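The arithmetic behind the choice of scale in the last step can be checked numerically. The sketch below is only an illustration under stated assumptions: the constants `c1`, `c2`, `c3` stand for the unspecified constants C1, C2, C3 of (2.49) and of the lower bound on the probability of the all-plus configuration, the values passed to them are made up, and the function names are hypothetical. With ε = C2/(2p) and l = (log u)/(2ε), the gap bound (2.48) becomes u^(-1/2), so gap times u is at least √u, while the boundary term C1 exp(−C2 l) equals C1 u^(-p).

```python
import math


def choose_scale(u: float, p: float, c2: float):
    """Return (eps, l) with eps = C2/(2p) and l = log(u)/(2*eps)."""
    eps = c2 / (2.0 * p)
    l = math.log(u) / (2.0 * eps)
    return eps, l


def log_first_term_bound(u: float, p: float, c2: float, c3: float) -> float:
    """Logarithm of the bound exp(-gap*u) / exp(-C3*l^2), with gap = exp(-eps*l)."""
    eps, l = choose_scale(u, p, c2)
    gap_lower = math.exp(-eps * l)  # equals u ** -0.5
    return -gap_lower * u + c3 * l * l


def second_term(u: float, p: float, c1: float, c2: float) -> float:
    """The boundary term C1*exp(-C2*l), which equals C1 * u**(-p)."""
    _, l = choose_scale(u, p, c2)
    return c1 * math.exp(-c2 * l)


if __name__ == "__main__":
    u, p = 1e12, 3.0
    print(log_first_term_bound(u, p, c2=1.0, c3=1.0), -math.sqrt(u) / 2)
    print(second_term(u, p, c1=2.0, c2=1.0), 2.0 * u ** (-p))
```

The polynomial-in-log correction C3 l² is negligible against √u for large u, which is why the first term ends up below exp(−√u/2).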
3. Relaxation Regime
3.1. Preliminaries. In this part of the paper we will prove part (ii) of Theorem 1. For this we suppose that $\lambda > \lambda_c$ is fixed and that $\nu \le \mu_-$. Once more, there is no loss in generality in supposing that $f$ is increasing and that it has $||f||_\infty \le 1$. For the remainder of the proof we will make these assumptions. Half of our goal is trivial, since the basic-coupling inequalities give for each $t \ge 0$,
\[ E(f(\sigma^{\nu}_{h;t})) \le E(f(\sigma^{\mu_h}_{h;t})) = \langle f \rangle_h. \]
Our goal is therefore reduced to proving that for all $C > 0$ there exists a finite $C_1$ such that
\[ E(f(\sigma^{\nu}_{h;\exp(\lambda/h)})) \ge \langle f \rangle_h - C_1 \exp(-C/h). \tag{3.1} \]
At this point there is also no loss in supposing that $\nu$ is concentrated on the configuration with all spins down, and therefore also that $\lambda_c < \lambda < 2\lambda_c$. In our argument this second assumption will simplify things. In Sect. 3.2 we will introduce certain space-time structures, which we call inverted pyramids, and which will be used to obtain statements concerning droplet growth. To be
able to use these results in order to prove (3.1), we will need to use such inverted pyramids as building blocks of a rescaling procedure, and will also need to obtain a mathematical counterpart to the notion of droplet creation at the correct rate; these two topics will be covered in Sect. 3.3. In both Sects. 3.2 and 3.3, we will need to use some lemmas which can be seen as rigorous counterparts to the notion that the function φ(b) = wb − m∗ b2 gives the free-energy of optimally-shaped droplets, and that equilibrium distributions can be studied based on this heuristics. To avoid distracting the reader with the technicalities behind these lemmas, they are only presented later in the paper, in Sect. 3.4. Some of the results in that section were already contained in the paper [SS1], but here we will need substantial strengthenings of them; moreover the techniques used here will be different from those in [SS1] and so provide an alternative to parts of that paper. Finally some estimates on the spectral gap of the generator of the dynamics of some kinetic Ising models on some finite sets will also be needed in Sections 3.2 and 3.3. Again, those will be postponed to the final section, 3.5, in this part of the paper, in order to avoid distracting the reader’s attention from the main ideas in Sects. 3.2 and 3.3. The remainder of the paper is written having in mind a reader who will be following it in the order in which the sections are presented. With this in mind we tried to motivate and explain heuristically in Sections 3.2 and 3.3 the results which are used there but will only be proven later, in Sects. 3.4 and 3.5. Readers who prefer following the lemmas and propositions in a strictly logical order, and who do not worry about the motivation behind each lemma can read the sections in the following order: 3.4, 3.5, 3.2, 3.3. Considering this possibility and also the length of the paper, we repeat some definitions from Sects. 3.2 and 3.3 in Sects. 3.4 and 3.5. 3.2. 
Inverted pyramids and droplet growth. In this section we will introduce two propositions which are counterparts to the statement: “If we start with a large enough Wulff-shaped droplet of the (+)-phase in the midst of the (−)-phase, then it is likely to grow with a linear speed larger than any negative exponential of 1/h”. In the first of these propositions “large enough” will be substantially larger than just “supercritical” (it will mean that the droplet has a negative free-energy); in the second of these propositions this aspect will be improved, at the cost of extra technicalities and extra work in the proof.
We will need to generalize the notion of a process $(\sigma^{\,\cdot}_{\Lambda,\eta,h;t})_{t \ge 0}$ evolving in a box $\Lambda$, with boundary condition $\eta$. We will have to consider the time evolution to occur in boxes which may change with time. Since we will construct all of our generalized processes using the graphical construction, their definition is very elementary. The space-time regions with which we will be concerned will all be of the following type. Let $t_0 < t_1 < t_2 < ... < t_{N+1}$ be an increasing finite sequence of times, and $\Lambda_0, \Lambda_1, ..., \Lambda_N$ be finite subsets of $Z^2$. Our space-time region is
\[ \diamondsuit = \bigcup_{i=0}^{N} [t_i, t_{i+1}] \times \Lambda_i. \]
We will refer to $[t_0, t_1] \times \Lambda_0$ as the bottom cylinder of $\diamondsuit$ and to $[t_N, t_{N+1}] \times \Lambda_N$ as the top cylinder of $\diamondsuit$. The base of $\diamondsuit$ is $t_0 \times \Lambda_0$, while the top of $\diamondsuit$ is $t_{N+1} \times \Lambda_N$. While the boundary condition could be very general, in this paper we will only need to consider (−)-boundary conditions in space-time. We will need to start the evolution from an arbitrary time $s \in [t_0, t_{N+1})$, from a space configuration $\eta \in \Omega_{\Lambda_j,-}$, where $j$ is defined by $s \in [t_j, t_{j+1})$. The following notation will be used to denote the process:
\[ (\sigma^{s,\eta}_{\diamondsuit,-,h;t})_{t \ge s}. \]
Its definition is as follows. We freeze all spins outside of $\diamondsuit$ as −1 and at time $s$ set the configuration to $\eta$. We then use the graphical construction with its standard rules to update spins inside $\diamondsuit$, after time $s$. The basic-coupling inequalities generalize in obvious ways. For instance, if $\eta \le \zeta$, and $-h(T) < h_1 \le h_2 < h(T)$, then for all $t_0 \le s \le t$,
\[ \sigma^{s,\eta}_{\diamondsuit,-,h_1;t} \le \sigma^{s,\zeta}_{\diamondsuit,-,h_2;t}, \]
and, in case $t_0 = 0$,
\[ \sigma^{0,\eta}_{\diamondsuit,-,h_1;t} \le \sigma^{\zeta}_{h_2;t}. \]
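As a purely illustrative aid, a space-time region of the type just described (a finite union of cylinders, each a time interval times a finite box) can be represented by its list of switching times and its list of boxes. The class name `SpaceTimeRegion` and its methods are hypothetical and are not part of the paper; the sketch only encodes the bookkeeping rule that a starting time s in [t_j, t_{j+1}) selects the box with index j.

```python
from bisect import bisect_right
from typing import FrozenSet, Iterable, List, Sequence, Tuple

Site = Tuple[int, int]


class SpaceTimeRegion:
    """Union of cylinders [t_i, t_{i+1}] x box_i for i = 0, ..., N."""

    def __init__(self, times: Sequence[float], boxes: Sequence[Iterable[Site]]):
        if len(times) != len(boxes) + 1:
            raise ValueError("need exactly one more time than boxes")
        if any(a >= b for a, b in zip(times, times[1:])):
            raise ValueError("times must be strictly increasing")
        self.times: List[float] = list(times)
        self.boxes: List[FrozenSet[Site]] = [frozenset(b) for b in boxes]

    def index_at(self, s: float) -> int:
        """Return the index j with t_j <= s < t_{j+1}."""
        if not (self.times[0] <= s < self.times[-1]):
            raise ValueError("s must lie in [t_0, t_{N+1})")
        return bisect_right(self.times, s) - 1

    def box_at(self, s: float) -> FrozenSet[Site]:
        """Box in which the dynamics runs when started at time s."""
        return self.boxes[self.index_at(s)]


if __name__ == "__main__":
    region = SpaceTimeRegion(
        times=[0.0, 1.0, 2.0, 3.0],
        boxes=[{(0, 0)}, {(0, 0), (1, 0)}, {(0, 0), (1, 0), (0, 1)}],
    )
    print(region.index_at(1.0), sorted(region.box_at(2.5)))
```

Sites outside the current box would be frozen at −1 in an actual simulation; that part of the graphical construction is not modelled here.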
We explain next the basic type (up to space-time translations) of space-time regions $\diamondsuit$ which we will deal with. For reasons which will become clear, we will refer to such regions as “inverted pyramids”. For $i = 0, ..., N-1$, the set $\Lambda_{i+1}$ will be obtained from $\Lambda_i$ by adding one site to it. In particular we will have $\Lambda_0 \subset \Lambda_1 \subset ... \subset \Lambda_N$. These sets $\Lambda_i$ will all be simply-connected subsets of $Z^2$ and have “approximate Wulff shapes”, in the following technical sense. Given a positive number $l_0$, we say that a subset $\Lambda$ of $Z^2$ is $l_0$-quasi-Wulff-shaped in case it is connected, simply connected and
\[ \Lambda((l - l_0)W) \subset \Lambda \subset \Lambda((l + l_0)W), \]
for some number $l$. An $l$ with this property will be said to be a linear-size-parameter for the quasi-Wulff-shaped set $\Lambda$ (of course, more than one such value of $l$ can exist). In our case there will be an $l_0$ for which each $\Lambda_i$, $i = 0, ..., N$, will be $l_0$-quasi-Wulff-shaped. The absolute constant $l_0$ will be fixed throughout our work, and will be omitted from the notation. We will identify our inverted pyramids by four non-negative parameters: $b_1 < b_2$, $h$ and $\delta$. The first three of these parameters are related to the space dimensions of the inverted pyramid, by setting $\Lambda_0 = \Lambda(\frac{b_1}{h} W)$ and $\Lambda_N = \Lambda(\frac{b_2}{h} W)$. The parameters $\delta$ and again $h$ are related to the time dimensions, by setting $t_{i+1} - t_i = \exp(\delta/h)$, for $i = 0, ..., N$. It is not hard to see that if $l_0 > 2$ then for all $b_1$, $b_2$ and $h$ as above there is a sequence, $\mathrm{Seq}(b_1, b_2; h) = (\Lambda_0, ..., \Lambda_N)$, of boxes with the required properties, and with
\[ N = N(b_1, b_2; h) \le C \Big( \frac{b_2}{h} \Big)^2 \tag{3.2} \]
for some finite constant $C$. (Of course, more than one such sequence typically exists, and we suppose that some rule to choose one from among them is being used.) Given also $\delta$, the notation $O = O(b_1, b_2; h; \delta)$ will be used for the inverted pyramid which we described above and which also has $t_0 = 0$. The technical counterpart to the idea of the growth of a droplet will be contained in an event that we define next. With $b_1$, $b_2$, $h$ and $\delta$ fixed (and omitted from the notation in several places, for simplicity), and also an initial configuration $\eta \in \Omega_{\Lambda(\frac{b_1}{h} W),-}$ given, we let $G_\eta$ be the following event:
\[ G_\eta = \Big\{ \sigma^{0,\eta}_{O,-,h;t_{N+1}} = \sigma^{t_N,+}_{O,-,h;t_{N+1}} \Big\}. \]
Observe that it is clear from the basic-coupling inequalities that, regardless of what $\eta$ is, $\sigma^{0,\eta}_{O,-,h;t_{N+1}} \le \sigma^{t_N,+}_{O,-,h;t_{N+1}}$, so that $G_\eta$ only really requires that the complementary inequality holds. Informally, $G_+$ can be seen as the event of a droplet of +1 spins of linear size proportional to $b_1/h$ at time 0 growing to become a droplet of linear size proportional to $b_2/h$ at time $t_N = N \exp(\delta/h) \le C (b_2/h)^2 \exp(\delta/h)$.
Recall from Sect. 1.5 that heuristically the free-energy of a Wulff-shaped droplet $(b/h)W$ of the (+)-phase is given by $\phi(b)/h$, where $\phi(b) = wb - m^* b^2$, and that $B_c$ is the value of $b$ which maximizes this function, while $B_0 = 2B_c$ is the value of $b$ above which this function becomes negative. In Sect. 1.5 this was used to predict heuristically the behavior of the relaxation time as $h \searrow 0$. Similarly, the free-energy of such droplets can also be used to predict the typical aspect of a Gibbs distribution $\mu_{\Lambda(\frac{b}{h} W),-,h}(\,\cdot\,)$, when $h$ is small, and this will be of great relevance in this paper. When $b < B_0$, one should expect this Gibbs distribution to resemble the (−)-phase, since droplets of the (+)-phase would all have positive free-energies. On the other hand, when $b > B_0$, one should expect this Gibbs distribution to resemble the (+)-phase, separated from the (−) spins at the boundary by a large contour, since a single droplet of the (+)-phase of the size of the system itself would have the lowest possible free-energy. Rigorous results of this type were obtained in [SS1]. Unfortunately, for our purposes in the current paper we will need technically stronger results than those in [SS1]. As mentioned in Sect. 3.1, these technical results will only be presented and proven in Sect. 3.4.
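The two thresholds in this heuristic picture follow from elementary calculus on φ(b) = wb − m*b²: the maximum is at w/(2m*) and the positive zero is at w/m*, which is twice the maximizer. A minimal numerical sketch, with hypothetical function names and made-up values for w and m* (the paper does not give numerical values):

```python
def phi(b: float, w: float, m_star: float) -> float:
    """Heuristic droplet free-energy profile phi(b) = w*b - m_star*b**2."""
    return w * b - m_star * b * b


def critical_sizes(w: float, m_star: float):
    """Return (B_c, B_0): the maximizer of phi and its positive zero."""
    b_c = w / (2.0 * m_star)  # phi'(b) = w - 2*m_star*b vanishes here
    b_0 = 2.0 * b_c           # phi(b) = b*(w - m_star*b) vanishes at w/m_star
    return b_c, b_0


if __name__ == "__main__":
    # Illustrative values only.
    w, m_star = 2.0, 0.5
    b_c, b_0 = critical_sizes(w, m_star)
    print(b_c, b_0, phi(b_c, w, m_star), phi(b_0, w, m_star))
```

For b between B_c and B_0 the profile is decreasing but still positive, which matches the distinction drawn in the text between supercritical droplets and droplets with negative free-energy.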
If the reader accepts the picture which we just presented and justified heuristically as reasonable, then he or she should have no difficulty believing in the specific statements which appear in this and in the next section and are proven only in Sect. 3.4. Next we state the first of the two main claims of this section.
Proposition 3.2.1. Given $B_0 < b_1 < b_2$ and $\delta > 0$ there are positive finite constants $C_1$ and $C_2$ such that for $h > 0$,
\[ \int d\mu_{\Lambda(\frac{b_1}{h} W),-,h}(\eta) \, P(G_\eta) \ge 1 - C_1 \exp(-C_2/h). \tag{3.3} \]
In particular
\[ P(G_+) \ge 1 - C_1 \exp(-C_2/h). \tag{3.4} \]
Moreover the choice of $C_2$ does not depend on $\delta$ and $b_2$ and it can be taken arbitrarily large, provided $b_1$ is large enough.
Observe that from the heuristic picture described before the last proposition, we can see that in (3.3) we are starting from a droplet of the (+)-phase, and the statement is that it is likely to grow at a speed which is controlled in a useful way. In comparison, for $B_c < b_1 < B_0$, (3.3) should be false, since then in the Gibbs distribution $\mu_{\Lambda(\frac{b_1}{h} W),-,h}(\,\cdot\,)$ no droplet of the (+)-phase should be present. On the other hand, we should expect (3.4) to be true also in this case, since there we are starting from a droplet not just of the (+)-phase, but actually a solid droplet of (+) spins, with a supercritical size. This claim is contained in the next proposition. In this proposition we will use the following object. Recall the definition (2.10) of $R$ and set
\[ \hat\mu_{\Lambda,-,h}(\,\cdot\,) = \mu_{\Lambda,-,h}(\,\cdot\,|\,R^c). \]
This is the Gibbs measure conditioned on the presence of a supercritical droplet. The following should be expected to happen, based on the heuristics. When $\Lambda = \Lambda(\frac{b}{h} W)$ and $b > B_0$ the conditioning has no major effect; but if $B_c < b < B_0$, then the conditioning produces a droplet of the (+)-phase, of roughly the size of the whole system, separated from the (−) spins at the boundary by a large contour. This is so since a single droplet of the (+)-phase of the size of the system itself would have the lowest possible free-energy compatible with the conditioning. It is important to note that in the next proposition, in addition to having to modify (3.3) by introducing the conditioning on $R^c$, also the way in which $\delta$ can be chosen is different. The reason for this difference will be clarified in the proof of the proposition.
Proposition 3.2.2. Given $B_c < b_1 < b_2$ there are positive finite constants $\delta_0$, $C_1$ and $C_2$ such that if $0 < \delta < \delta_0$, for $h > 0$,
\[ \int d\hat\mu_{\Lambda(\frac{b_1}{h} W),-,h}(\eta) \, P(G_\eta) \ge 1 - C_1 \exp(-C_2/h). \tag{3.5} \]
In particular
\[ P(G_+) \ge 1 - C_1 \exp(-C_2/h). \tag{3.6} \]
In the remainder of this section we will prove Propositions 3.2.1 and 3.2.2. Besides leaving some technical lemmas which concern equilibrium distributions to Sect. 3.4, also some technical results which concern the kinetic Ising models run in certain finite boxes, including those of the type of $\Lambda(\frac{b}{h} W)$, will have their proofs postponed to Sect. 3.5. These results, when used in the current section, will be heuristically motivated, though.
Proof of Proposition 3.2.1 (modulo results in Sects. 3.4 and 3.5). The second claim, (3.4), follows from the first one, (3.3), and the basic-coupling inequalities. Our task is to prove (3.3) and the claims about the value of the constant $C_2$. There is no loss in supposing that $h$ is small, and we will assume that $0 < h \le 1$. For $i = 1, ..., N$, and arbitrary $\zeta \in \Omega_{\Lambda_{i-1},-}$ set
\[ G^i_\zeta = \Big\{ \sigma^{t_{i-1},\zeta}_{O,-,h;t_{i+1}} = \sigma^{t_i,+}_{O,-,h;t_{i+1}} \Big\}. \]
Note that
\[ G_\eta \supset G^1_\eta \cap \Big( \bigcap_{i=2}^{N} G^i_+ \Big). \tag{3.7} \]
Our goal now is to prove that for some positive finite $C_1$ and $C_2$, as in the statement of the proposition, for $i = 1, ..., N$,
\[ \int d\mu_{\Lambda_{i-1},-,h}(\zeta) \, P((G^i_\zeta)^c) \le C_1 \exp(-C_2/h). \tag{3.8} \]
In particular this implies, as in the first paragraph in this proof, that
\[ P((G^i_+)^c) \le C_1 \exp(-C_2/h). \tag{3.9} \]
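By putting together (3.8) (used for $i = 1$), (3.9) (used for $i = 2, ..., N$), (3.7) and (3.2) we obtain (3.3). The sets $\Lambda_{i-1}$ and $\Lambda_i$ differ only in that the latter has one extra site, say $x$, that the former does not have. For an arbitrary $\zeta \in \Omega_{\Lambda_{i-1},-}$, we will need to compare $\mu_{\Lambda_{i-1},-,h}(\zeta)$ with $\mu_{\Lambda_i,-,h}(\zeta)$. For this purpose we introduce the notation $S^-_x = \{\sigma : \sigma(x) = -1\}$ for the event that the spin at the site $x$ is negative, and let
\[ \alpha = \inf_{|h| \le 1} \, \inf_{\xi \in \Omega} \mu_{\{0\},\xi,h}(S^-_0) \tag{3.10} \]
be the largest lower bound on the probability of having in equilibrium a spin −1 at the origin, given any information about the other spins, for values of $h$ in the arbitrarily chosen neighborhood $[-1, +1]$ of the origin. Clearly $\alpha > 0$ for each temperature $T > 0$. With this notation,
\[ \mu_{\Lambda_i,-,h}(\zeta) = \mu_{\Lambda_i,-,h}(S^-_x) \, \mu_{\Lambda_i,-,h}(\zeta \,|\, S^-_x) \ge \alpha \, \mu_{\Lambda_{i-1},-,h}(\zeta), \]
which can be seen as a uniform estimate on a Radon-Nikodym derivative:
\[ \sup_{\zeta \in \Omega_{\Lambda_{i-1},-}} \frac{d\mu_{\Lambda_{i-1},-,h}}{d\mu_{\Lambda_i,-,h}}(\zeta) \le \frac{1}{\alpha}. \tag{3.11} \]
Using the stationarity of the processes $(\sigma^{\mu_{\Lambda,-,h}}_{\Lambda,-,h;t})_{t \ge 0}$, and (3.11) we obtain
\[ \begin{aligned} \int d\mu_{\Lambda_{i-1},-,h}(\zeta) \, P\big( (G^i_\zeta)^c \big) &= \int d\mu_{\Lambda_{i-1},-,h}(\zeta) \, P\Big( \sigma^{t_{i-1},\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}} \Big) \\ &= \int d\mu_{\Lambda_{i-1},-,h}(\zeta) \, P\Big( \sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}} \Big) \\ &\le \frac{1}{\alpha} \int d\mu_{\Lambda_i,-,h}(\zeta) \, P\Big( \sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \ne \sigma^{t_i,+}_{O,-,h;t_{i+1}} \Big) \\ &= \frac{1}{\alpha} \int d\mu_{\Lambda_i,-,h}(\zeta) \, P\Big( \sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \ne \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)} \Big). \end{aligned} \tag{3.12} \]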
This may be seen at first sight as a minor and trivial maneuver, but it is actually a central step in our approach towards controlling droplet growth. We have just transformed our problem pertaining to “growth” into a problem pertaining to “rapid loss of memory” or, in other words, “rapid convergence to equilibrium”, since (3.12) will provide us with the desired (3.8), once we show that
\[ \int d\mu_{\Lambda_i,-,h}(\zeta) \, P\Big( \sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \ne \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)} \Big) \le C_1 \exp(-C_2/h). \tag{3.13} \]
Proving (3.13) seems like a standard problem, due to the vast current literature on + ) , and we want to this type of issue: we have a reversible Markov process (σ3 i ,−,h;t t≥0 show that it reaches equilibrium in a time of the order of exp(δ/h). There are nevertheless still major hurdles to overcome. The standard approach to such a problem starts with the derivation of a lower bound on the spectral gap of the generator of the process. The result that we are after would then follow if the time with which we are concerned were much larger than the inverse of the lower bound on the spectral gap. Such an approach is nevertheless unfeasible in our case, due to the fact that here the spectral gap is of the order of exp(−βA/h), so that its inverse is much larger than exp(δ/h), when δ is small. We will not give the full proof of this claim on the value of the spectral gap, since it will be of no use for us (some readers may want to take it as an exercise, with the hint that it can be solved using techniques in this paper), and will limit ourselves to explaining the nature of the difficulty at the heuristic level. This difficulty lies precisely in the sort of metastability studied in this paper. We are considering a Glauber dynamics in the box 3i ,
which is almost the same as a set $\Lambda(\frac{b}{h}W)$ with some $b \in [b_1, b_2]$. Since $B_0 < b_1 \le b$, in equilibrium we should have a large droplet of the (+)-phase, covering the box $\Lambda_i$ almost entirely. If we look at the process started from all spins down, $(\sigma^{-}_{\Lambda_i,-,h;t})_{t\ge 0}$, then to reach equilibrium this big droplet has to be formed, and the system has to go through the bottleneck presented by the situation with a critical droplet. Hence the free-energy barrier to be overcome has height $A/h$. The system should then reach equilibrium in a time of order $\exp(\beta A/h)$. (Because the linear size of the box is of the order of $1/h$, droplet growth is of no relevance for the estimate of the order of magnitude of the relaxation time inside the box.) Starting with all spins down should maximize the relaxation time, since equilibrium is basically the (+)-phase, and the inverse of this relaxation time should hence give the order of magnitude of the spectral gap. The same heuristics that pointed out the problem with using the spectral gap to prove (3.13) also indicate why we should believe that this inequality nevertheless holds. The difficulty pointed out concerns the long time needed to relax towards equilibrium if the process is started with all spins down. We are, on the other hand, concerned with the case in which we start with all spins up, much closer to equilibrium. One can talk of a heuristic picture with a double well structure. The configuration with all spins down is in the higher (metastable) well, separated from the other one by the free-energy barrier, but the configuration with all spins up is inside the deeper (stable) well, and our problem concerns only relaxation inside this well. The problem that remains at this point is how to exploit this heuristic picture and prove (3.13).
The solution will be to use the basic-coupling inequalities in order to compare our process with some modified ones, for which the spectral gaps can be proven to be large enough for our purposes. In doing so we were inspired by arguments in Sect. 5 of [Mar]. One of the two comparison processes that we will use is $(\sigma^{\cdot}_{\Lambda_i,+,h;t})_{t\ge 0}$, in which (+)-boundary conditions are used. The other one will be denoted by $(\sigma^{\cdot}_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h;t})_{t\ge 0}$, and has the following meaning. The box in which it is run is the annulus $\Lambda_i\setminus\Lambda^{\mathrm{core}}$, where $\Lambda^{\mathrm{core}} = \Lambda\big(\frac{B_c+B_0}{2h}W\big)$, and the boundary condition denoted by $(+,-)$ refers to freezing the spins up inside the core $\Lambda^{\mathrm{core}}$ and down outside $\Lambda_i$. In Sect. 3.5, Propositions 3.5.1 and 3.5.2, we will show that for any $\delta > 0$ the generators of these two processes satisfy
\[
\inf_{i=0,\dots,N} \mathrm{gap}(\Lambda_i, +, h) \ge \exp\Big(-\frac{\delta}{2h}\Big),
\tag{3.14}
\]
\[
\inf_{i=0,\dots,N} \mathrm{gap}\big(\Lambda_i\setminus\Lambda^{\mathrm{core}}, (+,-), h\big) \ge \exp\Big(-\frac{\delta}{2h}\Big),
\tag{3.15}
\]
for small enough $h$. The intuitive reason behind these relatively large spectral gaps is that the extra (+)'s introduced as boundary conditions eliminate metastability in the time evolution of these two processes. In equilibrium these systems are again basically in the (+)-phase, and if we start them with all spins down, then there is no need to nucleate a critical droplet in order to relax to equilibrium. In the first case the (+)-phase drifts inwards from the (+)-boundary towards the center. In the second case a supercritical (+)-droplet is frozen by hand in the center of the box, so that the relaxation is a “down-hill” movement on the free-energy landscape; the (+)-phase should drift outwards from the center towards the outer boundary of the box. Our task now is to show how (3.14) and (3.15) can be used to derive (3.13). We partition $\Lambda_i$ into two sets:
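\[
\Lambda^{\mathrm{in}} = \Lambda\Big(\frac{b_1+B_0}{2h}W\Big), \qquad \Lambda_i^{\mathrm{out}} = \Lambda_i\setminus\Lambda^{\mathrm{in}}.
\]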
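Using the basic-coupling inequalities, we have
\[
\begin{aligned}
&\int d\mu_{\Lambda_i,-,h}(\zeta)\, P\big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \neq \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}\big) \\
&\quad\le \int d\mu_{\Lambda_i,-,h}(\zeta) \sum_{y\in\Lambda_i} \Big\{ P\big(\sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}(y)=+1\big) - P\big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)}(y)=+1\big) \Big\} \\
&\quad= \sum_{y\in\Lambda_i} \Big\{ P\big(\sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}(y)=+1\big) - \mu_{\Lambda_i,-,h}\big(\{\sigma:\sigma(y)=+1\}\big) \Big\} \\
&\quad\le \sum_{y\in\Lambda^{\mathrm{in}}} \Big\{ P\big(\sigma^{+}_{\Lambda_i,+,h;\exp(\delta/h)}(y)=+1\big) - \mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + \sum_{y\in\Lambda_i^{\mathrm{out}}} \Big\{ P\big(\sigma^{+}_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h;\exp(\delta/h)}(y)=+1\big) - \mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\}.
\end{aligned}
\]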
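At this point we can use (3.14) and (3.15) in a standard way. For instance, from inequality (59) in [Sch1],
\[
P\big(\sigma^{+}_{\Lambda_i,+,h;\exp(\delta/h)}(y)=+1\big) - \mu_{\Lambda_i,+,h}\{\sigma:\sigma(y)=+1\}
\le \frac{e^{-\exp(\delta/h)\,\mathrm{gap}(\Lambda_i,+,h)}}{\mu_{\Lambda_i,+,h}(+)}
\le \exp\Big(C\Big(\frac{b_2}{h}\Big)^2\Big)\exp\{-e^{\frac{\delta}{2h}}\}
\]
and
\[
P\big(\sigma^{+}_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h;\exp(\delta/h)}(y)=+1\big) - \mu_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h}\{\sigma:\sigma(y)=+1\}
\le \frac{e^{-\exp(\delta/h)\,\mathrm{gap}(\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h)}}{\mu_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h}(+)}
\le \exp\Big(C\Big(\frac{b_2}{h}\Big)^2\Big)\exp\{-e^{\frac{\delta}{2h}}\}.
\]
Combining the last three displayed inequalities, we obtain
\[
\begin{aligned}
&\int d\mu_{\Lambda_i,-,h}(\zeta)\, P\big(\sigma^{\zeta}_{\Lambda_i,-,h;\exp(\delta/h)} \neq \sigma^{+}_{\Lambda_i,-,h;\exp(\delta/h)}\big) \\
&\quad\le \sum_{y\in\Lambda^{\mathrm{in}}} \Big\{ \mu_{\Lambda_i,+,h}\{\sigma:\sigma(y)=+1\} - \mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + \sum_{y\in\Lambda_i^{\mathrm{out}}} \Big\{ \mu_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h}\{\sigma:\sigma(y)=+1\} - \mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + C'\Big(\frac{b_2}{h}\Big)^2 \exp\Big(C\Big(\frac{b_2}{h}\Big)^2\Big)\exp\{-e^{\frac{\delta}{2h}}\}.
\end{aligned}
\tag{3.16}
\]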
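To see why the last term in (3.16) is harmless, it may help to spell out the two elementary estimates behind it. The following is only a sketch of the bookkeeping; the constant $C''$ is introduced here for illustration and does not appear in the original argument.

```latex
% Step 1: by (3.14), and in the same way by (3.15), the time exp(delta/h)
% exceeds the inverse gap by a factor exp(delta/(2h)):
\[
\exp(\delta/h)\,\mathrm{gap}(\Lambda_i,+,h)
\;\ge\; \exp(\delta/h)\exp\big(-\delta/(2h)\big)
\;=\; \exp\big(\delta/(2h)\big).
\]
% Step 2: the double exponential dominates the volume factors:
\[
C'\Big(\frac{b_2}{h}\Big)^{2}\exp\Big(C\Big(\frac{b_2}{h}\Big)^{2}\Big)\exp\{-e^{\delta/(2h)}\}
= \exp\Big(\log C' + 2\log\frac{b_2}{h} + C\,\frac{b_2^{2}}{h^{2}} - e^{\delta/(2h)}\Big)
\;\le\; \exp(-C''/h)
\]
% for any fixed C'' and all sufficiently small h, because e^{\delta/(2h)}
% grows faster than any power of 1/h as h decreases to 0.
```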
Our remaining problem concerns only equilibrium. We will use a result from Sect. 3.4, but observe that this result is intuitively natural, based on the heuristics presented immediately before the statement of Proposition 3.2.1. Set
\[
B_1 = \Big\{ \Big(\Lambda\Big(\frac{b_1}{h}W\Big)\Big)^c \stackrel{-,*}{\longleftrightarrow} \Lambda^{\mathrm{in}} \Big\},
\tag{3.17}
\]
\[
B_2 = \Big\{ \Lambda^{\mathrm{core}} \stackrel{-,*}{\longleftrightarrow} \big(\Lambda^{\mathrm{in}}\big)^c \Big\}.
\tag{3.18}
\]
From Lemma 3.4.8 we know that for every $b_1 > B_0$ there exists a $C_2 = C_2(b_1) > 0$, with $C_2 \to \infty$ as $b_1 \to \infty$, and a finite $C_1$ such that for $j = 1, 2$,
\[
\sup_{i=0,\dots,N} \mu_{\Lambda_i,-,h}(B_j) \le C_1 \exp(-C_2/h).
\tag{3.19}
\]
From (1.20), for each $y \in \Lambda^{\mathrm{in}}$,
\[
\mu_{\Lambda_i,+,h}\{\sigma:\sigma(y)=+1\} - \mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \le 2\mu_{\Lambda_i,-,h}(B_1).
\tag{3.20}
\]
Similarly, for each $y \in \Lambda_i^{\mathrm{out}}$,
\[
\mu_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h}\{\sigma:\sigma(y)=+1\} - \mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \le 2\mu_{\Lambda_i,-,h}(B_2).
\tag{3.21}
\]
The desired inequality (3.3) and the claims about the choice of the constant $C_2$ which appears there follow from combining (3.16), (3.20), (3.21) and (3.19).

Proof of Proposition 3.2.2 (modulo results in Sects. 3.4 and 3.5). We will explain how the proof of Proposition 3.2.1 can be adapted to prove this proposition. There are several extra complications, since $\hat\mu_{\Lambda_i,-,h}$ is not an invariant distribution for the process $(\sigma^{\cdot}_{\Lambda_i,-,h;t})_{t\ge 0}$, and in particular this is the reason for which we will have to choose $\delta_0$ small enough. The idea is to look at this distribution instead as a “metastable state” for this process, and to use techniques from Part 2 of the present paper in this connection. Using the graphical construction we can define the following processes restricted to $R^c$. For arbitrary $s \in [t_0, t_{N+1})$, and $\eta \in \Omega_{\Lambda_j,-} \cap R^c$, where $j$ is defined by $s \in [t_j, t_{j+1})$, the process $(\hat\sigma^{s,\eta}_{O,-,h;t})_{t\ge s}$ is obtained in the following simple way. We freeze all spins outside $O$ as $-1$ and at time $s$ set the configuration to $\eta$. We then use the graphical construction, with its standard rules modified by suppressing jumps which would bring the system to $R$, to update spins inside $O$ after time $s$. Bottlenecks for this dynamics are the sets $F_x^+ = \{\sigma : \sigma^x \in F_x^-\}$. Define also
\[
F_x = F_x^- \cup F_x^+.
\]
For each $x$ the event $F_x$ depends only on the spins at sites other than $x$. If $|h| \le 1$, we have from the definition (3.10) of $\alpha$, that for $i = 0, \dots, N$ and $x \in \Lambda_i$,
\[
\mu_{\Lambda_i,-,h}(F_x^-) \ge \alpha\, \mu_{\Lambda_i,-,h}(F_x).
\tag{3.22}
\]
Using this inequality in combination with the bottleneck estimate (2.41) we have that for arbitrary $\epsilon > 0$,
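\[
\begin{aligned}
\hat\mu_{\Lambda_i,-,h}(F_x^+)
&= \frac{\mu_{\Lambda_i,-,h}(F_x^+)}{\mu_{\Lambda_i,-,h}(R^c)}
\le \frac{\mu_{\Lambda_i,-,h}(F_x)}{\mu_{\Lambda_i,-,h}(R^c)}
\le \frac{1}{\alpha}\,\frac{\mu_{\Lambda_i,-,h}(F_x^-)}{\mu_{\Lambda_i,-,h}(R^c)} \\
&= \frac{1}{\alpha}\,\frac{\tilde\mu_{\Lambda_i,-,h}(F_x^-)\,\mu_{\Lambda_i,-,h}(R)}{\mu_{\Lambda_i,-,h}(R^c)}
\le \frac{1}{\alpha}\,\frac{C\exp\big(-\beta(1-\epsilon)\frac{A}{h}\big)}{\mu_{\Lambda_i,-,h}(R^c)},
\end{aligned}
\]
for small $h$, independent of $i$. Since $\Lambda\big(\frac{b_1}{h}W\big) \subset \Lambda_i$ and $b_1 > B_c$, Lemma 3.4.4 gives us for some $\delta_0 > 0$,
\[
\mu_{\Lambda_i,-,h}(R^c) \ge \exp\Big(-\beta\frac{A}{h} + \frac{2\delta_0}{h}\Big),
\]
for small $h$. So, if our choice of $\epsilon$ above is made properly, we obtain
\[
\hat\varphi = \sup_{i=0,\dots,N}\ \sup_{x\in\Lambda_i} \hat\mu_{\Lambda_i,-,h}(F_x^+) \le C_1 \exp\Big(-\frac{\delta_0}{h}\Big),
\tag{3.23}
\]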
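for some finite $C_1$. Define now
\[
\tau_i^{\zeta} = \inf\{t \ge t_i : \text{the process } (\hat\sigma^{t_i,\zeta}_{O,-,h;t}) \text{ has a suppressed jump at time } t\}.
\]
Clearly
\[
\hat\sigma^{t_i,\zeta}_{O,-,h;t} = \sigma^{t_i,\zeta}_{O,-,h;t} \qquad \text{for} \qquad t_i \le t < \tau_i^{\zeta}.
\tag{3.24}
\]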
From the same argument used to prove (2.12), we obtain now, using (3.23), for $i = 0, \dots, N$, and $\delta < \delta_0$,
\[
\int d\hat\mu_{\Lambda_i,-,h}(\zeta)\, P(\tau_i^{\zeta} \le t_{i+1})
\le C_2 \exp\big(-C_3 e^{\delta/h}\big) + C_4 \Big|\Lambda\Big(\frac{b_2}{h}W\Big)\Big|\, e^{\delta/h}\, \hat\varphi
\le C_5 \exp(-C_6/h),
\tag{3.25}
\]
for small $h$. (We remind the reader that $t_{i+1} = t_i + e^{\delta/h}$.) Before we can proceed with the adaptation of the proof of Proposition 3.2.1 to prove Proposition 3.2.2, we need to derive an analogue to (3.11). We will show that for small enough $h$, for $i = 1, \dots, N$,
\[
\sup_{\zeta\in\Omega_{\Lambda_{i-1},-}} \frac{d\hat\mu_{\Lambda_{i-1},-,h}}{d\hat\mu_{\Lambda_i,-,h}}(\zeta) \le \frac{2}{\alpha}.
\tag{3.26}
\]
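To this end, as in the argumentation for (3.11), we will use the notation $x = \Lambda_i\setminus\Lambda_{i-1}$, and $S_x^- = \{\sigma : \sigma(x) = -1\}$. First note that by partitioning $(F_x)^c \cap R^c$ according to what the configuration in $\Lambda_{i-1}$ is and denoting by $\{E_j\}$ the resulting parts, we have
\[
\begin{aligned}
\hat\mu_{\Lambda_i,-,h}\big(S_x^- \,|\, (F_x)^c\big)
&= \sum_j \hat\mu_{\Lambda_i,-,h}\big(E_j \,|\, (F_x)^c\big)\, \hat\mu_{\Lambda_i,-,h}\big(S_x^- \,|\, E_j\big) \\
&= \sum_j \hat\mu_{\Lambda_i,-,h}\big(E_j \,|\, (F_x)^c\big)\, \mu_{\Lambda_i,-,h}\big(S_x^- \,|\, E_j\big) \\
&\ge \sum_j \hat\mu_{\Lambda_i,-,h}\big(E_j \,|\, (F_x)^c\big)\, \alpha = \alpha.
\end{aligned}
\]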
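Therefore, using (3.23),
\[
\hat\mu_{\Lambda_i,-,h}(S_x^-) \ge \hat\mu_{\Lambda_i,-,h}\big((F_x)^c\big)\, \hat\mu_{\Lambda_i,-,h}\big(S_x^- \,|\, (F_x)^c\big)
\ge \big(1 - \hat\mu_{\Lambda_i,-,h}(F_x^+)\big)\, \hat\mu_{\Lambda_i,-,h}\big(S_x^- \,|\, (F_x)^c\big) \ge \frac{\alpha}{2},
\]
for small enough $h$, uniformly in $i$. Since $\zeta(x) = -1$, we have
\[
\hat\mu_{\Lambda_i,-,h}(\zeta) = \hat\mu_{\Lambda_i,-,h}(S_x^-)\, \hat\mu_{\Lambda_i,-,h}(\zeta \,|\, S_x^-)
= \hat\mu_{\Lambda_i,-,h}(S_x^-)\, \hat\mu_{\Lambda_{i-1},-,h}(\zeta) \ge \frac{\alpha}{2}\, \hat\mu_{\Lambda_{i-1},-,h}(\zeta),
\]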
completing the proof of (3.26). We are now ready to explain how the proof of Proposition 3.2.1 can be modified to prove Proposition 3.2.2. In place of (3.8), we have to prove the analogous statement:
\[
\int d\hat\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\big((G_i^{\zeta})^c\big) \le C_1 \exp(-C_2/h).
\tag{3.27}
\]
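For this we use (3.24), (3.25), the stationarity of $(\hat\sigma^{t_{i-1},\hat\mu_{\Lambda_{i-1},-,h}}_{O,-,h;t})_{t_{i-1}\le t\le t_i}$, and (3.26) to obtain the following replacement of (3.12):
\[
\begin{aligned}
\int d\hat\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\big((G_i^{\zeta})^c\big)
&= \int d\hat\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\big(\sigma^{t_{i-1},\zeta}_{O,-,h;t_{i+1}} \neq \sigma^{t_i,+}_{O,-,h;t_{i+1}}\big) \\
&\le \int d\hat\mu_{\Lambda_{i-1},-,h}(\zeta)\, P\big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \neq \sigma^{t_i,+}_{O,-,h;t_{i+1}}\big)
+ \int d\hat\mu_{\Lambda_{i-1},-,h}(\zeta)\, P(\tau_{i-1}^{\zeta} \le t_i) \\
&\le \frac{2}{\alpha}\int d\hat\mu_{\Lambda_i,-,h}(\zeta)\, P\big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \neq \sigma^{t_i,+}_{O,-,h;t_{i+1}}\big) + C_5 \exp(-C_6/h).
\end{aligned}
\tag{3.28}
\]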
We can show that (3.28) leads to (3.27) by adapting the steps used to prove that (3.12) leads to (3.8). The following are the changes in the argument. This time we take
\[
\Lambda^{\mathrm{core}} = \Lambda\Big(\frac{2B_c + b_1}{3h}W\Big), \qquad
\Lambda^{\mathrm{in}} = \Lambda\Big(\frac{B_c + 2b_1}{3h}W\Big), \qquad
\Lambda_i^{\mathrm{out}} = \Lambda_i\setminus\Lambda^{\mathrm{in}}.
\]
Similarly to the derivations of (3.16) and (3.28), we can use the basic-coupling inequalities, the spectral gap estimates in Propositions 3.5.1 and 3.5.2, (3.24), (3.25), and the stationarity of $(\hat\sigma^{t_i,\hat\mu_{\Lambda_i,-,h}}_{O,-,h;t})_{t_i\le t\le t_{i+1}}$ to derive
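\[
\begin{aligned}
&\int d\hat\mu_{\Lambda_i,-,h}(\zeta)\, P\big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}} \neq \sigma^{t_i,+}_{O,-,h;t_{i+1}}\big) \\
&\quad\le \int d\hat\mu_{\Lambda_i,-,h}(\zeta) \sum_{y\in\Lambda_i} \Big\{ P\big(\sigma^{t_i,+}_{O,-,h;t_{i+1}}(y)=+1\big) - P\big(\sigma^{t_i,\zeta}_{O,-,h;t_{i+1}}(y)=+1\big) \Big\} \\
&\quad\le \sum_{y\in\Lambda^{\mathrm{in}}} \Big\{ P\big(\sigma^{+}_{\Lambda_i,+,h;\exp(\delta/h)}(y)=+1\big) - \hat\mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + \sum_{y\in\Lambda_i^{\mathrm{out}}} \Big\{ P\big(\sigma^{+}_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h;\exp(\delta/h)}(y)=+1\big) - \hat\mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + |\Lambda_i| \int d\hat\mu_{\Lambda_i,-,h}(\zeta)\, P(\tau_i^{\zeta} \le t_{i+1}) \\
&\quad\le \sum_{y\in\Lambda^{\mathrm{in}}} \Big\{ \mu_{\Lambda_i,+,h}\{\sigma:\sigma(y)=+1\} - \hat\mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + \sum_{y\in\Lambda_i^{\mathrm{out}}} \Big\{ \mu_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h}\{\sigma:\sigma(y)=+1\} - \hat\mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \Big\} \\
&\qquad + C_7\Big(\frac{b_2}{h}\Big)^2 \Big\{ C_5 \exp(-C_6/h) + \exp\Big(C_8\Big(\frac{b_2}{h}\Big)^2\Big) e^{-\exp(\frac{\delta}{2h})} \Big\}.
\end{aligned}
\tag{3.29}
\]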
The definitions of $B_1$ and $B_2$ are the same as before (see (3.17) and (3.18)), but with the modified choices above of $\Lambda^{\mathrm{core}}$ and $\Lambda^{\mathrm{in}}$. From Lemma 3.4.8 we know that for $j = 1, 2$,
\[
\sup_{i=0,\dots,N} \hat\mu_{\Lambda_i,-,h}(B_j) \le C_9 \exp(-C_{10}/h).
\tag{3.30}
\]
Due to the conditioning in the definition of $\hat\mu_{\Lambda_i,-,h}$, the derivation of the analogues of (3.20) and (3.21) is somewhat more delicate. For $y \in \Lambda^{\mathrm{in}}$, we let $\{E_j\}$ denote the partition of $(B_1)^c$ according to what the $(-,*)$-cluster of $\big(\Lambda(\frac{b_1}{h}W)\big)^c$ is. We obtain the following:
\[
\begin{aligned}
\hat\mu_{\Lambda_i,-,h}\big(\{\sigma:\sigma(y)=+1\} \,|\, (B_1)^c\big)
&= \sum_j \alpha_j\, \hat\mu_{\Lambda_i,-,h}\big(\{\sigma:\sigma(y)=+1\} \,|\, E_j\big) \\
&= \sum_j \alpha_j\, \mu_{\Lambda_i,-,h}\big(\{\sigma:\sigma(y)=+1\} \,|\, E_j\big) \\
&\ge \mu_{\Lambda_i,+,h}\{\sigma:\sigma(y)=+1\},
\end{aligned}
\]
where in the second equality we used the fact that for each $j$, $E_j \subset R^c$, and in the final inequality we used the same standard argument which gives rise to (1.19) and the fact that $\sum_j \alpha_j = 1$. From (1.18) it follows then that
\[
\mu_{\Lambda_i,+,h}\{\sigma:\sigma(y)=+1\} - \hat\mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \le 2\hat\mu_{\Lambda_i,-,h}(B_1).
\tag{3.31}
\]
Similarly we can derive, for each $y \in \Lambda_i^{\mathrm{out}}$,
\[
\mu_{\Lambda_i\setminus\Lambda^{\mathrm{core}},(+,-),h}\{\sigma:\sigma(y)=+1\} - \hat\mu_{\Lambda_i,-,h}\{\sigma:\sigma(y)=+1\} \le 2\hat\mu_{\Lambda_i,-,h}(B_2).
\tag{3.32}
\]
Our goal, (3.27), follows from combining (3.28), (3.29), (3.30), (3.31) and (3.32).
3.3. Rescaling and droplet creation. The inverted pyramid $O = O(b_1, b_2; h; \delta)$ and the event $G_+$ were conceived having in mind their use in a rescaling procedure. To each point $k = (k_1, k_2, k_3)$ of the rescaled space time $Z^2 \times Z_+$ we associate the following translate of $O$:
\[
O_k = \Big(\frac{b'}{h}k_1,\ \frac{b'}{h}k_2,\ t_N k_3\Big) + O,
\]
where as before $N$ is 1 less than the number of elements in the sequence $\mathrm{Seq}(b_1, b_2; h)$, $t_N = N\exp(\delta/h)$ and $b' > 0$ is a new parameter. We suppose that $b_1$, $b_2$ and $b'$ are such that
\[
\Big(\Lambda\Big(\frac{b_1}{h}W\Big) + \Big(\frac{b'}{h}, 0\Big)\Big) \cap \Lambda\Big(\frac{b_1}{h}W\Big) = \emptyset,
\tag{3.33}
\]
and
\[
\Lambda\Big(\frac{b_1}{h}W\Big) + \Big(\frac{b'}{h}, 0\Big) \subset \Lambda\Big(\frac{b_2}{h}W\Big).
\tag{3.34}
\]
It will be important that this can be done for arbitrarily large values of $b_1$. Indeed, with $b_1$ given we can choose $b'$ for (3.33) to hold, and then choose $b_2$ for (3.34) to hold. Of course, we will sometimes confuse inverted pyramids with their indices in the terminology being introduced below, and no inconvenience should arise from this. We say that two inverted pyramids $O_k$ and $O_{k'}$ are neighbors in case $|k_3 - k_3'| = 1$ and $(k_1, k_2) - (k_1', k_2') \in \{(0,0), (0,1), (1,0), (0,-1), (-1,0)\}$. Note that $O = O_{(0,0,0)}$ has exactly 5 neighbors, all at rescaled time 1. The bottom cylinders of these 5 inverted pyramids will be pairwise disjoint and each will be contained in the top cylinder of the inverted pyramid $O_{(0,0,0)}$. A rescaled-space-time oriented chain will be a sequence $(k^{(1)}, \dots, k^{(n)})$ of elements of $Z^2 \times Z_+$ such that $k^{(i)}$ and $k^{(i+1)}$ are neighbors and $k_3^{(i+1)} = k_3^{(i)} + 1$, for $i = 1, \dots, n-1$. The start of the chain is $k^{(1)}$, and its end is $k^{(n)}$. We will say that the inverted pyramid $O_k$ is open if the corresponding event $G_+$ happens for it. This event will be denoted by $G_{+,k}$. More formally, $G_{+,k}$ is the set of realizations of the graphical construction which would be in $G_+$ after space and time were translated by the amount $-(\frac{b'}{h}k_1, \frac{b'}{h}k_2, t_N k_3)$. The events $G_{+,k}$ have the very nice property that they are well suited for being concatenated. The simplest version of this idea is the following. Suppose that $(k^{(1)}, \dots, k^{(n)})$ is a rescaled-space-time oriented chain. We will be concerned with the process $(\sigma^{\cdot}_{\cup_i O_{k^{(i)}},-,h;t})$. To fix some notation, say that the bottom cylinder of the bottom inverted pyramid, $O_{k^{(1)}}$, is the set $\Lambda^{\mathrm{bot}} \times [s_0, s_1]$ and that the top cylinder of the top inverted pyramid, $O_{k^{(n)}}$, is the set $\Lambda^{\mathrm{top}} \times [s_3, s_4]$. Standard applications of the basic-coupling inequalities yield
\[
\bigcap_{i=1,\dots,n} G_{+,k^{(i)}} \subset \Big\{ \sigma^{s_0,+}_{\cup_i O_{k^{(i)}},-,h;s_4} = \sigma^{s_3,+}_{\cup_i O_{k^{(i)}},-,h;s_4} \Big\}.
\]
Pictorially this amounts to a droplet of the (+)-phase flowing through the open tube $\cup_i O_{k^{(i)}}$. This relation is not yet strong enough for our purposes, because it requires a solid blob of (+) spins at the bottom, and we will not have such a solid droplet of (+)'s. Nevertheless the following stronger version of this relation can be derived via the same standard use of the basic-coupling inequalities. For an arbitrary $\eta \in \Omega_{\Lambda^{\mathrm{bot}},-}$,
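\[
\Big\{ \sigma^{s_0,\eta}_{O_{k^{(1)}},-,h;s_1} = \sigma^{s_0,+}_{O_{k^{(1)}},-,h;s_1} \Big\} \cap \bigcap_{i=1,\dots,n} G_{+,k^{(i)}}
\subset \Big\{ \sigma^{s_0,\eta}_{\cup_i O_{k^{(i)}},-,h;s_4} = \sigma^{s_3,+}_{\cup_i O_{k^{(i)}},-,h;s_4} \Big\}.
\tag{3.35}
\]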
We will now construct a space-time structure motivated by the cone which appeared in the heuristics in Sect. 1.5. Our goal is to look at the system at time $\exp(\lambda/h)$, with $\lambda_c \le \lambda \le 2\lambda_c$, and prove (3.1). Set $M = \lfloor \exp(\lambda/h)/t_N \rfloor$. Moving backwards in time we choose now a set $C_M$ of inverted pyramids $O_k$. From the inverted pyramids with rescaled time coordinate $M-1$ we take only $O_{(0,0,M-1)}$. Inductively, once we have selected the inverted pyramids at a certain rescaled time $m > 0$, we take at rescaled time $m-1$ the inverted pyramids which are neighbors to at least one inverted pyramid already included in our set. The procedure stops at rescaled time 0. Note that if we consider the indices of the selected inverted pyramids, what we have done is precisely to construct the discrete rescaled analog of the space-time cone, as in the heuristics. An alternative definition of $C_M$ is that it is the set of inverted pyramids from which we can start a rescaled-space-time oriented chain which ends at $O_{(0,0,M-1)}$. The cardinality of the set $C_M$ clearly satisfies
\[
C_1 M^3 \le |C_M| \le C_2 M^3.
\]
On the other hand, the bounds on $\lambda$ and the fact that, due to (3.2),
\[
t_N \le C\Big(\frac{b_2}{h}\Big)^2 \exp\Big(\frac{\delta}{h}\Big),
\]
give us, for small $h$,
\[
C'\Big(\frac{h}{b_2}\Big)^2 \exp\Big(\frac{\lambda-\delta}{h}\Big) \le M \le \exp\Big(\frac{2\lambda_c}{h}\Big).
\]
Therefore
\[
C_3\Big(\frac{h}{b_2}\Big)^6 \exp\Big(\frac{3(\lambda-\delta)}{h}\Big) \le |C_M| \le C_4 \exp\Big(\frac{6\lambda_c}{h}\Big).
\]
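The cubic growth of $|C_M|$ can be checked directly from the neighbor relation. The following Python sketch is an illustration only: the function names are hypothetical, and the closed form $M(2M^2+1)/3$ is a direct count of lattice points (the layer $j$ steps below the top is the $\ell^1$-ball of radius $j$ in $Z^2$, which has $2j^2+2j+1$ points), not a formula quoted from the paper.

```python
def cone_indices(M):
    """Indices (k1, k2, k3) of the rescaled cone, built backwards in time
    from the single top index (0, 0, M - 1), as in the construction of C_M."""
    # Allowed spatial displacements between neighboring inverted pyramids.
    steps = [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)]
    layer = {(0, 0)}          # spatial indices selected at the current rescaled time
    cone = set()
    for k3 in range(M - 1, -1, -1):
        cone.update((k1, k2, k3) for (k1, k2) in layer)
        # One rescaled time step down: every neighbor of an already selected index.
        layer = {(k1 + d1, k2 + d2) for (k1, k2) in layer for (d1, d2) in steps}
    return cone


def cone_size_closed_form(M):
    """Sum over j = 0, ..., M-1 of (2 j^2 + 2 j + 1), i.e. M (2 M^2 + 1) / 3."""
    return M * (2 * M * M + 1) // 3
```

In this idealized count one can take $C_1 = 2/3$ and $C_2 = 1$ in the bound $C_1 M^3 \le |C_M| \le C_2 M^3$.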
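Let us choose $\delta$ as
\[
\delta = \frac{\lambda - \lambda_c}{2},
\tag{3.36}
\]
which means that $\lambda - \delta = \lambda_c + \delta$. Then, since $\lambda_c = \beta A/3$, we have
\[
C_3\Big(\frac{h}{b_2}\Big)^6 \exp\Big(\frac{3\delta}{h}\Big)\exp\Big(\frac{\beta A}{h}\Big) \le |C_M| \le C_4 \exp\Big(\frac{6\lambda_c}{h}\Big).
\tag{3.37}
\]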
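The role of the choice (3.36) can be read off from the exponents. The lines below only restate the computation contained in the displays above, combined with the heuristic creation probability $\exp(-\beta A/h)$ per inverted pyramid; they are a sketch of the bookkeeping, not an additional claim.

```latex
\[
3(\lambda-\delta) \;=\; 3(\lambda_c+\delta) \;=\; \beta A + 3\delta ,
\qquad\text{so that}\qquad
|C_M|\, e^{-\beta A/h} \;\ge\; C_3\Big(\frac{h}{b_2}\Big)^{6} e^{3\delta/h}
\;\longrightarrow\; \infty \quad (h \searrow 0).
\]
% The expected number of inverted pyramids in C_M in which a droplet is created
% therefore diverges, since e^{3\delta/h} beats the polynomial factor (h/b_2)^6.
```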
The lower bound in (3.37) is central to our analysis, and we will return to it later, when we discuss droplet creation. The upper bound in (3.37) is of technical relevance in connection with droplet growth, because we will want to have all the events $G_{+,k}$, $k \in C_M$, happening. At this point recall that our goal is to prove (3.1), in which an arbitrarily large constant $C$ is involved. It is clear from Proposition 3.2.1 and the upper bound in
(3.37) that given a value for the constant C in (3.1) we can choose b1 so large that for some finite C1 , P(∪k∈CM (G+,k )c ) ≤ C1 exp(−C/h), (3.38) for all h > 0. Define the space-time region [
1=
! Ok
[
,
k∈CM
where =3
b2 W h
λ × M tN , exp . h
We can think of this space-time region 1 as playing the role of the cone in the heuristics. Informally speaking, (3.38) assures that once a supercritical droplet of size (b1 /h)W is born at the bottom of some Ok , k ∈ CM , then with large probability it will reach the top of the uppermost inverted pyramid, O(0,0,M −1) , of CM . If it survives also through the top cylinder of 1, it will reach the time exp(λ/h). A further condition on b2 is required to assure us that the cylinder at the top of 1 is wide enough so that the aimed conclusion (3.1) will be proven with an arbitrarily large constant C. Technically this condition is the following. Given the constant C in (3.1), we will need to take b2 large enough for there to exist a finite C1 so that hf i3( b2 W ),−,h ≥ hf ih − C1 exp(−C/h),
(3.39)
h
for all $h > 0$ small enough. That such a choice is possible is heuristically reasonable, since in the double-well picture of the Gibbs distribution $\mu_{\Lambda(\frac{b_2}{h}W),-,h}$ with $b_2$ large, the mass is concentrated in the well corresponding to the (+)-phase, which can be made very deep by choosing $b_2$ appropriately large. From the rigorous viewpoint, aside from the claim that $C$ can be arbitrarily large, a proof of (3.39) can be found in [SS1] (see the proof of Theorem 1.b.3 there). A complete proof of (3.39) is obtained by combining Lemma 3.4.8 with (1.20) and the FKG-Holley inequalities. Observe that our choices are made in the following order. Suppose $\lambda$ and $C$ are given. First (3.36) gives us the value of $\delta$. Then (3.38) gives us the value of $b_1$. Afterwards, (3.33) gives us $b'$, and finally (3.34) and (3.39) give us $b_2$. So far we have developed mathematically rigorous counterparts to the notion that if a supercritical droplet is created close to the bottom of any of the inverted pyramids $O_k$, $k \in C_M$, it is likely to grow and bring the (+)-phase to the neighborhood of the origin at time $\exp(\lambda/h)$. What we still need is to make mathematical sense of the creation of such a supercritical droplet as occurring at a rate predicted by the heuristics. The lower bound in (3.37) tells us that if we could say that close to the bottom of each one of these inverted pyramids, and independently of what happens close to the bottom of the other inverted pyramids, there is probability of the order of $\exp(-\beta A/h)$ of creating such a droplet, then we would be done. This would be akin to saying that the rate of creation of supercritical droplets is $\exp(-\beta A/h)$, as we expect. Motivated by (3.35) we will say that supercritical droplet creation occurs in the bottom of the inverted pyramid $O_k$, which has $\Lambda^{\mathrm{bot}} \times [s_0, s_1]$ as its bottom cylinder, in case the following event happens:
\[
F_k = \Big\{ \sigma^{s_0,-}_{O_k,-,h;s_1} = \sigma^{s_0,+}_{O_k,-,h;s_1} \Big\}.
\]
The events $F_k$, $k \in C_M$, are clearly mutually independent, since they are determined by the graphical construction marks in disjoint regions of space-time. The final part of this section will be concerned with proving that for each $k$, for small $h > 0$,
\[
P(F_k) = P(F_0) \ge \frac{1}{2}\exp\Big(-\frac{\beta A}{h}\Big).
\tag{3.40}
\]
If for the moment we suppose that this is known, we can complete the proof of (3.1) as follows. Consider the event
\[
E = \Big(\bigcap_{k\in C_M} G_{+,k}\Big) \cap \Big(\bigcup_{k\in C_M} F_k\Big).
\]
From (3.38), the lower bound in (3.37), and (3.40) we have for small $h$,
\[
P(E^c) \le C_1 \exp(-C/h) + \Big(1 - \frac{1}{2}\exp\Big(-\frac{\beta A}{h}\Big)\Big)^{|C_M|} \le 2C_1 \exp(-C/h).
\]
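The second inequality in the bound on $P(E^c)$ rests on the independence of the events $F_k$ and on the elementary estimate $1 - x \le e^{-x}$. Spelled out as a sketch (the intermediate steps are not displayed in the original):

```latex
\[
P\Big(\bigcap_{k\in C_M} F_k^{\,c}\Big)
= \prod_{k\in C_M}\big(1-P(F_k)\big)
\le \Big(1-\tfrac12 e^{-\beta A/h}\Big)^{|C_M|}
\le \exp\Big(-\tfrac12\, e^{-\beta A/h}\,|C_M|\Big)
\le \exp\Big(-\tfrac{C_3}{2}\Big(\frac{h}{b_2}\Big)^{6} e^{3\delta/h}\Big),
\]
% The last step uses the lower bound in (3.37). The right-hand side decays
% faster than exp(-C/h) for every fixed C, hence is at most C_1 exp(-C/h)
% for small h.
```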
Now, using basic-coupling inequalities, (3.35) and (3.39) we obtain 0,− ν E f σh;exp(λ/h) ≥ E f σ1,−,h;exp(λ/h) 0,− ≥ E f σ1,−,h;exp(λ/h) ; E − P(E c ) M tN ,+ = E f σ,−,h;exp(λ/h) ; E − P(E c ) M tN ,+ ≥ E f σ,−,h;exp(λ/h) − 2P(E c ) ≥ hf i3( b2 W ),−,h − 4C1 exp(−C/h) h
$\ge \langle f\rangle_h - C_1' \exp(-C/h)$.

All that remains to be done in this section is to show (3.40). For this purpose we will insert another inverted pyramid inside the bottom cylinder of each one of our inverted pyramids $O_k$, $k \in C_M$. To distinguish the new inverted pyramids from the ones that we have been discussing so far (parametrized by $b_1$, $b_2$, $\delta$ and $h$), we will call the old ones “growth inverted pyramids” and the ones which we are introducing now “creation inverted pyramids”. The creation inverted pyramid which is inserted inside the bottom cylinder of $O = O_0$ is described next. It will be of the form
\[
O_{\mathrm{cr}} = \bigcup_{i=0}^{N_{\mathrm{cr}}} [u_i, u_{i+1}] \times \Lambda_i^{\mathrm{cr}},
\]
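with the following features. As with the growth inverted pyramids, for $i = 0, \dots, N_{\mathrm{cr}} - 1$, the set $\Lambda^{\mathrm{cr}}_{i+1}$ will be obtained from $\Lambda^{\mathrm{cr}}_i$ by adding one site to it. In particular we will have $\Lambda^{\mathrm{cr}}_0 \subset \Lambda^{\mathrm{cr}}_1 \subset \dots \subset \Lambda^{\mathrm{cr}}_{N_{\mathrm{cr}}}$. These sets $\Lambda^{\mathrm{cr}}_i$ will all be $l_0$-quasi-Wulff-shaped. We will take $\Lambda^{\mathrm{cr}}_0 = \Lambda\big(\frac{b_0}{h}W\big)$, where $b_0 \in (B_c, b_1)$ is a new parameter. At the other end, $\Lambda^{\mathrm{cr}}_{N_{\mathrm{cr}}} = \Lambda\big(\frac{b_1}{h}W\big)$, where $b_1$ is the same one used for the growth inverted pyramids. The bottom cylinder of $O_{\mathrm{cr}}$ will have height $u_1 - u_0 = \exp\big(\frac{\delta}{2h}\big)$, while all its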
other cylinders will have height $u_{i+1} - u_i = \exp\big(\frac{\delta'}{h}\big)$, for $i = 1, \dots, N_{\mathrm{cr}}$, where $\delta' < \delta$ is also a new parameter. With the parameters $b_1$, $\delta$, $b_0$ and $\delta'$ fixed and satisfying the conditions above, it is clear that for small enough $h > 0$, there is an inverted pyramid as described above which fits inside the bottom cylinder of $O$, and has its top $\Lambda^{\mathrm{cr}}_{N_{\mathrm{cr}}} \times \{u_{N_{\mathrm{cr}}+1}\}$ coinciding with the top of the bottom cylinder of $O$, i.e., such that $u_{N_{\mathrm{cr}}+1} = \exp(\delta/h)$. This is the inverted pyramid that we will denote by $O_{\mathrm{cr}}$. The basic-coupling inequalities imply that
\[
\Big\{ \sigma^{u_0,-}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big\}
\subset \Big\{ \sigma^{0,-}_{O,-,h;\exp(\delta/h)} = \sigma^{0,+}_{O,-,h;\exp(\delta/h)} \Big\}.
\]
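Therefore the next lemma implies (3.40).

Lemma 3.2. Given $b_1 > B_c$ and $\delta > 0$ there are $b_0 \in (B_c, b_1)$ and $\delta' > 0$ such that for small $h > 0$,
\[
P\Big( \sigma^{u_0,-}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big) \ge \frac{1}{2}\exp\Big(-\frac{\beta A}{h}\Big).
\]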
Proof (modulo results in Sects. 3.4 and 3.5). Before starting the rigorous proof we will motivate it heuristically. Consider first the bottom cylinder of $O_{\mathrm{cr}}$. Intuitively, when $b_0$ is close to $B_c$, we can see the system in the box $\Lambda\big(\frac{b_0}{h}W\big)$ as having a double well structure with the deeper well corresponding to the (−)-phase, and the higher well corresponding to the presence of a supercritical droplet of the (+)-phase. The barrier between these wells is given by the configurations with a critical droplet. Note that if we were in equilibrium at time $u_1$ inside the box $\Lambda\big(\frac{b_0}{h}W\big)$ with (−) boundary conditions, then the quantity $\frac{1}{2}\exp\big(-\frac{\beta A}{h}\big)$ would indeed be a lower bound on the probability of being in the higher well and hence having a supercritical droplet. We would therefore like to say that inside this bottom cylinder the system started at time $u_0$ with all spins down should reach equilibrium at the top of this cylinder, i.e., at time $u_1 = u_0 + \exp\big(\frac{\delta}{2h}\big)$. In other words, we would like to say that the relaxation time for the process $\sigma^{\cdot}_{\Lambda(\frac{b_0}{h}W),-,h;t}$ is shorter than $\exp\big(\frac{\delta}{2h}\big)$. For this to be true it should be enough to take $b_0$ close enough to $B_c$, so that from the higher well there is a very small barrier to overcome to reach equilibrium. This free-energy barrier can be made smaller than $\frac{\delta}{2h}$, so the available time should be enough to equilibrate the system.

The heuristics in the last paragraph can be made rigorous by considering the spectral gap of the generator of the process. This will be done in Sect. 3.5. From Proposition 3.5.3 we have that if $b_0$ is chosen close enough to $B_c$, then
\[
\mathrm{gap}\Big(\Lambda\Big(\frac{b_0}{h}W\Big), -, h\Big) \ge \exp\Big(-\frac{\delta}{4h}\Big),
\tag{3.41}
\]
for small enough $h$. We start now the rigorous proof of the lemma. First we break things down according to what happens at time $u_1 = u_0 + \exp\big(\frac{\delta}{2h}\big)$,
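\[
P\Big( \sigma^{u_0,-}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big)
= \sum_{\zeta} P\Big( \sigma^{u_0,-}_{O_{\mathrm{cr}},-,h;u_1} = \zeta \Big)\, P\Big( \sigma^{u_1,\zeta}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big).
\]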
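Combining (3.41) with the standard inequality (59) in [Sch1], we have
\[
\begin{aligned}
&\Big| \sum_{\zeta} P\Big( \sigma^{u_0,-}_{O_{\mathrm{cr}},-,h;u_1} = \zeta \Big)\, P\Big( \sigma^{u_1,\zeta}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big) \\
&\qquad - \sum_{\zeta} \mu_{\Lambda(\frac{b_0}{h}W),-,h}(\zeta)\, P\Big( \sigma^{u_1,\zeta}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big) \Big| \\
&\quad\le \frac{e^{-\exp(\frac{\delta}{2h})\,\mathrm{gap}(\Lambda(\frac{b_0}{h}W),-,h)}}{\mu_{\Lambda(\frac{b_0}{h}W),-,h}(-)}
\le \exp\Big(C\Big(\frac{b_0}{h}\Big)^2\Big)\, e^{-\exp(\frac{\delta}{4h})}.
\end{aligned}
\]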
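But using Lemma 3.4.4 and Proposition 3.2.2, we can take $\delta'$ small enough so that
\[
\begin{aligned}
&\sum_{\zeta} \mu_{\Lambda(\frac{b_0}{h}W),-,h}(\zeta)\, P\Big( \sigma^{u_1,\zeta}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big) \\
&\quad\ge \mu_{\Lambda(\frac{b_0}{h}W),-,h}(R^c) \sum_{\zeta\in R^c} \hat\mu_{\Lambda(\frac{b_0}{h}W),-,h}(\zeta)\, P\Big( \sigma^{u_1,\zeta}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} = \sigma^{u_{N_{\mathrm{cr}}},+}_{O_{\mathrm{cr}},-,h;u_{N_{\mathrm{cr}}+1}} \Big) \\
&\quad\ge \exp\Big(-\frac{\beta A}{h}\Big)\cdot\frac{3}{4},
\end{aligned}
\]
for small $h$. The three displays above combined give us the lemma.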
At this point the proof of part (ii) of Theorem 1 has been reduced to proving the claims in Sects. 3.4 and 3.5. 3.4. Double well structure of equilibrium distributions. In this section we will study some of the features of the Gibbs distributions on finite simply-connected sets with (−)boundary conditions. We will extend and strengthen results contained in Theorem 1 of [SS1]; the purpose being the use of these stronger results in several other sections of the current paper. While the main results that we will derive in this section and use in other ones could be derived using the same approach as in [SS1], we will nevertheless introduce an alternative method for proving them, based on results and techniques from Sect. 2.3 of this paper. Basically, in [SS1] we started from results on large deviations of the average spin (i.e., the magnetization) inside of a box, under no external field. We could see the external field then as tilting the distribution. Such an approach is appealing and even natural, but since in the current paper our interest lies primarily on contours and not on the magnetization, it seemed even more natural to search for a direct approach to the problems, in which the magnetization need not to be mentioned. Having in mind that we are interested in the contours of configurations in − , it is natural to regard W and Vˆ , defined in Sect. 2.3, as random variables. Given a configuration η ∈ − , the associated values of these random variables are, respectively,
$W(S)$ and $\hat V(S)$, where $S$ is the collection of skeletons corresponding to the external contours of $\eta$. Basically we will show that in a sense the function
\[
\phi(b) = wb - m^* b^2, \qquad b \ge 0,
\]
plays the role of a large-deviation rate function for the random variable $\hat V$. The title of this section derives from the shape of this function. The reader will realize that in the lemmas below the results are stated and proven with a certain amount of uniformity over the allowed sets $\Lambda$; this uniformity is needed in some of our applications of the lemmas, since for instance in Sect. 3.2 we need uniformity over all the sets which are bases of cylinders of inverted pyramids. Recall that $B_c$ is the value of $b$ which maximizes $\phi$, with $\phi(B_c) = A$, while $B_0 = 2B_c$ is the value of $b$ above which this function becomes negative. Recall also that given a finite set of vertebrate contours $G$ we denote by $S_G^{h,\mathrm{co}}$ the set of configurations which belong to $\Omega_-$ and which have as their collection of external vertebrate contours the set $G$. In some of the lemmas below, we will use as a “reference” the set of configurations with no external vertebrate contours, $S_\emptyset^{h,\mathrm{co}}$. Our next lemma shows that this is essentially the same as using the set $R$, defined by (2.10), as a “reference”. We will use the notation $o_h(1)$ to represent some function of $h > 0$, satisfying the property $\lim_{h\searrow 0} o_h(1) = 0$.

Lemma 3.4.1. For any $p > 0$ there is a function $o_h(1)$ such that for any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[
1 \le \frac{Z_{\Lambda,-,h}(R)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)} \le 1 + o_h(1).
\]
Proof. From Lemmas 2.3.3 and 2.3.7 we have, for small enough $h$ depending on $p$ and some finite $C$ also depending on $p$,
\[
\tilde\mu_{\Lambda,-,h}\Big(\big(S_\emptyset^{h,\mathrm{co}}\big)^c\Big) \le C \exp\Big(-\beta\,\frac{w}{4\sqrt{2}\,h^{\tilde b}}\Big).
\]
(Here we use $\tilde b$ to denote the parameter entering the definition of the vertebrate contour.) The thesis follows immediately from the last estimate.

Lemma 3.4.2. For any $p > 0$ there is $h_0 > 0$ and a function $o_h(1)$ such that given $D_0 > 0$ there is a finite constant $C$ so that for any $D > D_0$, any $b > 0$, any $0 < h \le h_0$, and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$,
\[
\frac{Z_{\Lambda,-,h}\Big(W \ge \frac{D}{h},\ \hat V \le \big(\frac{b}{h}\big)^2\Big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)}
\le C \exp\Big(-\frac{\beta}{h}\,(D - m^* b^2)(1 + o_h(1))\Big).
\]
Proof. The proof of Lemma 2.3.6 and (2.24) show that we can choose $h_0$ and $C$ so as to have for each collection of skeletons $S$ such that $\hat V(S) \le \big(\frac{b}{h}\big)^2$,
\[
\sum_{G\in C_S^h} \frac{Z_{\Lambda,-,h}\big(S_G^{h,\mathrm{co}}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)}
\le \exp\Big(-\frac{\beta}{h}\,(W(S) - m^* b^2)(1 + o_h(1))\Big).
\]
Summing then over $S$ with $W(S) \ge \frac{D}{h}$, using the entropy estimate in the proof of Lemma 2.3.7, we obtain the desired conclusion.

In what follows, with $b > 0$ and $0 < \rho < 1$ given, $E^h_{b,\rho}$ will denote the event that there is an external contour which surrounds $\frac{b(1-\rho)}{h}W$ and is contained in $\frac{b(1+\rho)}{h}W$, and that moreover this is the only external vertebrate contour.
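Lemma 3.4.3. For any $p > 0$ and $\gamma < 1$ there are finite positive constants $h_0$ and $C_1$ and a function $o_h(1)$ such that for any $0 < h \le h_0$, any $b > 0$, any $\rho = \rho(h) > h^\gamma$ and any simply-connected set $\Lambda \subset Z^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\big(\frac{b(1+\rho)}{h}W\big)$,
\[
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)} \ge C_1 \exp\Big(-\frac{\beta}{h}\,\phi(b)(1 + o_h(1))\Big).
\]
In particular, $\rho$ can be a positive constant.

Proof. Without loss of generality we can suppose that $\rho = h^\gamma$ with $\gamma < 1$. Partition $E^h_{b,\rho}$ according to what the vertebrate external contours are, as
\[
E^h_{b,\rho} = \bigcup_{G\in G^h_{b,\rho}} S_G^{h,\mathrm{co}}.
\]
By the definition of $E^h_{b,\rho}$, each $G \in G^h_{b,\rho}$ is a singleton. Exactly as in (2.33), we have
\[
\begin{aligned}
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)}
&= \sum_{G\in G^h_{b,\rho}} \frac{Z_{\Lambda,-,h}\big(S_G^{h,\mathrm{co}}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)} \\
&= \sum_{G\in G^h_{b,\rho}} \frac{Z_{\Lambda,-,0}\big(S_G^{h,\mathrm{co}}\big)}{Z_{\Lambda,-,0}\big(S_\emptyset^{h,\mathrm{co}}\big)}
\exp\Big( \frac{\beta}{2}\int_0^h \sum_{x\in\Lambda} \Big[ \langle\sigma(x)\rangle^{G,h}_{\Lambda,-,h'} - \langle\sigma(x)\rangle^{\emptyset,h}_{\Lambda,-,h'} \Big]\, dh' \Big).
\end{aligned}
\]
Arguments analogous to the ones which led to (2.34) provide also the following bound, in the opposite direction. For small enough $h_0$, and some finite positive $C_2$, if $G \in G^h_{b,\rho}$ and $0 < h' \le h \le h_0$, then
\[
\sum_{x\in\Lambda} \Big[ \langle\sigma(x)\rangle^{G,h}_{\Lambda,-,h'} - \langle\sigma(x)\rangle^{\emptyset,h}_{\Lambda,-,h'} \Big] \ge 2m^*\Big(\frac{b}{h}\Big)^2 (1 + o_h(1)).
\]
Therefore we have
\[
\begin{aligned}
\frac{Z_{\Lambda,-,h}\big(E^h_{b,\rho}\big)}{Z_{\Lambda,-,h}\big(S_\emptyset^{h,\mathrm{co}}\big)}
&\ge C_1 \exp\Big(\frac{\beta}{h}\, m^* b^2 (1 + o_h(1))\Big) \sum_{G\in G^h_{b,\rho}} \frac{Z_{\Lambda,-,0}\big(S_G^{h,\mathrm{co}}\big)}{Z_{\Lambda,-,0}\big(S_\emptyset^{h,\mathrm{co}}\big)} \\
&\ge C_1 \exp\Big(\frac{\beta}{h}\, m^* b^2 (1 + o_h(1))\Big)\, \mu_{\Lambda,-,0}\big(E^h_{b,\rho}\big).
\end{aligned}
\]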
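The problem is now reduced to a 0-field problem. We will use the notation introduced in the proof of Lemma 2.3.6. For each $G \in \mathcal{G}^h_{b,\rho}$, let $F_G$ be the event that the spins in $\partial_- G$ are all negative and those in $\partial_+ G$ are all positive. With this notation we can write
$$\mu_{\Lambda,-,0}(E^h_{b,\rho}) = \sum_{G \in \mathcal{G}^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G)\, \mu_{\Lambda^G_{\mathrm{ext}},-,0}(S^{h,co}_\emptyset) \ge \mu_{\Lambda,-,0}(S^{h,co}_\emptyset) \sum_{G \in \mathcal{G}^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G) \ge \frac{1}{2} \sum_{G \in \mathcal{G}^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G),$$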
where in the second step we used the FKG-Holley inequalities, and in the last step we used (2.40). To complete the proof, we note that from Lemma 5.2 in [Iof1], or from the techniques in Sect. 7 of [SS2], we have
$$\sum_{G \in \mathcal{G}^h_{b,\rho}} \mu_{\Lambda,-,0}(F_G) \ge \exp\left(-\frac{\beta}{h}\, w b (1 + o_h(1))\right).$$
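It may help to see the three bounds of this proof chained together. The following worked display is only a sketch: it assumes that $\phi(b) = wb - m^* b^2$ (the form suggested by the estimates in the proof of Lemma 3.4.7), it writes $E$ for $E^h_{b,\rho}$ and $S_\emptyset$ for $S^{h,co}_\emptyset$, and it merges the different $o_h(1)$ terms informally.

```latex
\begin{align*}
\frac{Z_{\Lambda,-,h}(E)}{Z_{\Lambda,-,h}(S_\emptyset)}
 &\ge C_1 \exp\Big(\tfrac{\beta}{h}\, m^* b^2 (1+o_h(1))\Big)\, \mu_{\Lambda,-,0}(E) \\
 &\ge \tfrac{C_1}{2} \exp\Big(\tfrac{\beta}{h}\, m^* b^2 (1+o_h(1))\Big) \sum_{G} \mu_{\Lambda,-,0}(F_G) \\
 &\ge \tfrac{C_1}{2} \exp\Big(-\tfrac{\beta}{h}\big[\, w b\,(1+o_h(1)) - m^* b^2 (1+o_h(1)) \big]\Big),
\end{align*}
% Under the assumption \phi(b) = w b - m^* b^2 the last exponent is
% -(\beta/h)\,\phi(b)(1+o_h(1)), and the factor 1/2 is absorbed into C_1.
```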
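Lemma 3.4.4. For any $p > 0$ and any $b > B_c$ there are finite positive constants $h_0$ and $\epsilon$ such that for any $0 < h \le h_0$ and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\left(\frac{b}{h} W\right)$,
$$\mu_{\Lambda,-,h}(R^c) \ge \exp\left(-\frac{\beta}{h}\, A(1 - \epsilon)\right).$$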
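Proof. From Lemmas 3.4.1 and 3.4.3 we have, for small enough $h_0$, $\epsilon$, small constant $\rho$ and some positive finite $C_1$ and $C_2$,
$$\frac{Z_{\Lambda,-,h}(R^c)}{Z_{\Lambda,-,h}(R)} \ge \frac{Z_{\Lambda,-,h}\left(E^h_{b(1-\rho),\rho}\right)}{2\, Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \ge C_1 \exp\left(-\frac{\beta}{h}\, \phi(b(1-\rho))(1 + o_h(1))\right) \ge 2 \exp\left(-\frac{\beta}{h}\, A(1 - \epsilon)\right).$$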
The conclusion is now immediate.
Lemma 3.4.5. For any $p > 0$, there are finite positive constants $h_0$, $C_1$ and $C_2$ and a function $o_h(1)$ such that given also $0 < b_1 < b_2$ there is a finite constant $C_1'$ so that for any $0 < h \le h_0$, any $b \in [b_1, b_2]$, any $\kappa \in (0, 1)$, any integer $k$ and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\left(\frac{b}{h}(1 - \kappa^2/4) W\right)$,
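$$C_1 \exp\left(-\frac{\beta}{h} \min_{b(1-\kappa) < \tilde b \le b} \phi(\tilde b)(1 + o_h(1))\right) \le \frac{Z_{\Lambda,-,h}\left(\left(\frac{b}{h}(1-\kappa)\right)^2 \le \hat V \le \left(\frac{b}{h}\right)^2\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \le k C_1' \exp\left(-\frac{\beta}{h}\left(\min_{b(1-\kappa) < \tilde b \le b} \phi(\tilde b)(1 + o_h(1)) - C_2\, b\, \frac{\kappa}{k}\right)\right).$$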
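Proof. For the first inequality, simply note that if $\epsilon > 0$ is small, $\rho = \epsilon/2$, then for small $h$, for all $\tilde b > 0$,
$$\left\{\left(\frac{\tilde b}{h}(1 - \epsilon)\right)^2 \le \hat V \le \left(\frac{\tilde b}{h}\right)^2\right\} \supset E^h_{\tilde b(1-\rho),\rho},$$
and also $\Lambda\left(\frac{\tilde b(1-\rho)(1+\rho)}{h} W\right) \subset \Lambda$. Hence the claim follows from Lemma 3.4.3.

For the second inequality, note that from the variational result for families of curves presented in Sect. 2.9 of [DKS] we know that if $\hat V \ge \left(\frac{b}{h}\right)^2$ then $\mathcal{W} \ge w \frac{b}{h}$. We can therefore apply Lemma 3.4.2 to each of the $k$ events $\left\{\left(\frac{b}{h}\left(1 - i\frac{\kappa}{k}\right)\right)^2 \le \hat V \le \left(\frac{b}{h}\left(1 - (i-1)\frac{\kappa}{k}\right)\right)^2\right\}$, $i = 1, 2, \dots, k$, to conclude that
$$\frac{Z_{\Lambda,-,h}\left(\left(\frac{b}{h}\left(1 - i\frac{\kappa}{k}\right)\right)^2 \le \hat V \le \left(\frac{b}{h}\left(1 - (i-1)\frac{\kappa}{k}\right)\right)^2\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \le \frac{Z_{\Lambda,-,h}\left(\mathcal{W} \ge w\frac{b}{h}\left(1 - i\frac{\kappa}{k}\right),\ \hat V \le \left(\frac{b}{h}\left(1 - (i-1)\frac{\kappa}{k}\right)\right)^2\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \le C_1' \exp\left(-\frac{\beta}{h}\left(\phi\left(b\left(1 - (i-1)\frac{\kappa}{k}\right)\right)(1 + o_h(1)) - C_2\, b\, \frac{\kappa}{k}\right)\right).$$

The previous lemma basically contains the promised characterization of $\phi(\cdot)$ as a large-deviation rate function for the random variable $\hat V$. Informally it tells us that under appropriate conditions on $\Lambda$,
$$\frac{Z_{\Lambda,-,h}\left(\hat V \approx \left(\frac{b}{h}\right)^2\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \sim \exp\left(-\frac{\beta}{h}\, \phi(b)\right).$$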
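The informal large-deviation statement above can be made concrete with a small numerical sketch. It assumes the quadratic form $\phi(b) = wb - m^* b^2$, takes $B_c$ to be the maximiser of $\phi$, $B_0$ its positive zero and $A = \phi(B_c)$; these identifications, the function names, and the numerical values of `w` and `m_star` are illustrative assumptions, not quantities taken from this section.

```python
def phi(b, w, m_star):
    """Droplet free-energy profile, ASSUMING phi(b) = w*b - m_star*b**2."""
    return w * b - m_star * b * b


def critical_parameters(w, m_star):
    """Return (B_c, B_0, A) under the assumed quadratic form of phi."""
    b_c = w / (2.0 * m_star)   # maximiser of phi (assumed meaning of B_c)
    b_0 = w / m_star           # positive zero of phi (assumed meaning of B_0)
    return b_c, b_0, phi(b_c, w, m_star)


def log_weight(b, h, beta, w, m_star):
    """Leading-order log of Z(V ~ (b/h)^2) / Z(S_empty): -(beta/h) * phi(b)."""
    return -(beta / h) * phi(b, w, m_star)


if __name__ == "__main__":
    w, m_star = 2.0, 1.0  # hypothetical surface-tension and magnetization values
    print(critical_parameters(w, m_star))
    print(log_weight(1.0, 0.1, 1.0, w, m_star))
```

Under these assumptions $\phi$ increases on $(0, B_c)$ and decreases beyond it, which is the monotonicity that the hypotheses $\phi(b_1) < \phi(b_2)$ and $\phi(b_1) > \phi(b_2)$ of the later lemmas refer to.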
There is a technical difficulty in applying Lemma 3.4.5, nevertheless, and this is the motivation for the next lemma. The issue is that, as stated above, the lemma cannot be used to estimate $Z_{\Lambda,-,h}\left(\hat V \in (I/h)^2\right) / Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)$ if $I$ is, e.g., an interval of the form $[0, b]$, for some $b > 0$, since $C_1' = C_1'(b_1, b_2)$ can explode as $b_1 \to 0$.
Lemma 3.4.6. For any $p > 0$, there is $h_0 > 0$ and a function $o_h(1)$ such that for any $0 < h \le h_0$, any $0 < b < B_0$, and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$,
$$1 \le \frac{Z_{\Lambda,-,h}\left(\hat V \le \left(\frac{b}{h}\right)^2\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \le 1 + o_h(1).$$

Proof. Only the upper bound has to be explained. If $b < B_c$ the claim follows from Lemma 3.4.1. The general case is reduced to this one by taking $0 < b_0 \le \min\{B_c, b\}$ and using Lemma 3.4.5 to control the contribution from the interval $[b_0, b]$.

Let $\bar E^h_{b,\rho}$ be the event that there is an external contour which up to a translation surrounds $\frac{b(1-\rho)}{h} W$ and is contained in $\frac{b(1+\rho)}{h} W$.
Lemma 3.4.7. For any $p > 0$ and $\rho > 0$ there are finite positive constants $\epsilon_0$ and $C_2$ such that given also $0 < b_1 < b_2$ there are finite positive constants $h_0$ and $C_1$ such that for any $0 < h \le h_0$, any $b \in [b_1, b_2]$, any $\epsilon \le \epsilon_0$ and any simply-connected set $\Lambda \subset \mathbb{Z}^2$ which satisfies $|\Lambda| \le 1/h^p$ and contains $\Lambda\left(\frac{b}{h}(1 - \epsilon^2/4) W\right)$,
$$\mu_{\Lambda,-,h}\left((\bar E^h_{b,\rho})^c \,\Big|\, \left(\frac{b}{h}(1 - \epsilon)\right)^2 \le \hat V \le \left(\frac{b}{h}\right)^2\right) \le C_1 \exp\left(-\frac{C_2 b}{h}\right).$$

Proof. We will use the stability result for the Wulff functional (in the case of families of curves) contained in Theorem 2.9 of [DKS]. Consider a configuration in
$$(\bar E^h_{b,\rho})^c \cap \left\{\left(\frac{b}{h}(1 - \epsilon)\right)^2 \le \hat V \le \left(\frac{b}{h}\right)^2\right\}.$$
Clearly there is a positive finite $C$ (depending only on the temperature) such that, if $h$ is small, for any such configuration there is no external contour whose skeleton, even after being translated by any amount, is at a Hausdorff distance less than $C \frac{b\rho}{h}$ from the boundary of $\frac{b}{h} W$. To use Theorem 2.9 in [DKS], we scale lengths down by dividing them by $\frac{b(1-\epsilon)}{h}$. The rescaled collection of skeletons corresponding to the external contours has a phase volume bounded below by 1. On the other hand, no translate of any of the rescaled skeletons is at a Hausdorff distance less than $C \frac{\rho}{1-\epsilon}$ from the boundary of $\frac{1}{1-\epsilon} W$. The quoted theorem in [DKS] tells us then that the Wulff functional associated to the collection of scaled skeletons is bounded below by an expression of the type $w(1 + G(\rho, \epsilon))$, where $\lim_{\epsilon \searrow 0} G(\rho, \epsilon) = G(\rho, 0) > 0$. Scaling lengths back to their original value, we have obtained the following lower bound for the Wulff functional of any configuration in the event with which we are concerned:
$$\mathcal{W} \ge w\, \frac{b}{h}(1 - \epsilon)(1 + G(\rho, \epsilon)).$$
If $\epsilon_0$ is chosen small enough so that $(1 - \epsilon_0)(1 + G(\rho, \epsilon_0)) > 1$, and then $h_0$ is chosen small enough, based on Lemma 3.4.2, we obtain for all $\epsilon \le \epsilon_0$,
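$$\frac{Z_{\Lambda,-,h}\left((\bar E^h_{b,\rho})^c \cap \left\{\left(\frac{b}{h}(1 - \epsilon)\right)^2 \le \hat V \le \left(\frac{b}{h}\right)^2\right\}\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)} \le \frac{Z_{\Lambda,-,h}\left(\mathcal{W} \ge w \frac{b}{h}(1 - \epsilon)(1 + G(\rho, \epsilon)),\ \hat V \le \left(\frac{b}{h}\right)^2\right)}{Z_{\Lambda,-,h}\left(S^{h,co}_\emptyset\right)}$$
$$\le C_1 \exp\left(-\frac{\beta}{h}\left[w b (1 - \epsilon)(1 + G(\rho, \epsilon))(1 + o_h(1)) - m^* b^2 (1 + o_h(1))\right]\right) \le C_1 \exp\left(-\frac{\beta}{h}\, \phi(b)(1 + o_h(1)) - \frac{C_2' b}{h}\right),$$
where $C_2' = \beta w \left((1 - \epsilon) G(\rho, \epsilon) - \epsilon\right)/2$. The result now follows from the comparison between this estimate and the first inequality in Lemma 3.4.5. In this fashion, with both $h_0$ and $\epsilon_0$ small enough, we can take $C_2 = \beta w G(\rho, 0)/3$.

As motivation for the next lemma, we recall that by controlling how deeply the (−,*) cluster of the boundary penetrates in a set $\Lambda$, we can obtain estimates similar to (1.20) for the expected value of observables. Define 0 c [ b0 − b − b −∗ W ∗ →←→ 3 W 3 . () = Bh,b h h 0 b ∈(b,b]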
Recall that a set $\Lambda \subset \mathbb{Z}^2$ is said to be $l_0$-quasi-Wulff-shaped with linear parameter $l$ in case it is simply-connected and $\Lambda((l - l_0) W) \subset \Lambda \subset \Lambda((l + l_0) W)$. Recall also that $\hat\mu_{\Lambda,-,h}(\,\cdot\,) = \mu_{\Lambda,-,h}(\,\cdot \mid R^c)$.

Lemma 3.4.8. For any $l_0 > 0$ and any $\epsilon > 0$ there is $C_2 > 0$ such that given also $B_c < b_1 \le b_2$ there are positive finite constants $h_0$ and $C_1$ such that if $0 < h \le h_0$, then for any $l_0$-quasi-Wulff-shaped set $\Lambda$ which has linear size parameter $b/h$ with $b_1 \le b \le b_2$,
$$\hat\mu_{\Lambda,-,h}\left(B^{-*}_{h,b}(\epsilon)\right) \le C_1 \exp\left(-\frac{C_2 (b_1 - B_c)}{h}\right). \qquad (3.42)$$
In case $b_1 > B_0$ the same holds also for the measure $\mu_{\Lambda,-,h}$:
$$\mu_{\Lambda,-,h}\left(B^{-*}_{h,b}(\epsilon)\right) \le C_1 \exp\left(-\frac{C_2 (b_1 - B_0)}{h}\right). \qquad (3.43)$$
Proof. We start by introducing more terminology. The event that a certain contour $\Gamma$ is present is equivalent to the statement that the spins in a certain set of sites $S_1(\Gamma)$ all have the same sign and the spins in a certain other set of sites $S_2(\Gamma)$ all have the opposite sign. Exactly one of the two sets, $S_1(\Gamma)$ or $S_2(\Gamma)$, is surrounded by the contour, while the other is completely outside of the contour. The one which is surrounded by $\Gamma$ will be denoted by $\partial_{\mathrm{int}} \Gamma$, while the one which is outside of $\Gamma$ will be denoted by $\partial_{\mathrm{ext}} \Gamma$.
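Let $D^+_{h,b}(\epsilon)$ be the event that for some contour $\Gamma$ which surrounds $\frac{b(1-\epsilon/2)}{h} W$ all spins in $\partial_{\mathrm{int}} \Gamma$ are $+1$. Our first goal is to prove that
$$\hat\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c\right) \le C_1 \exp\left(-\frac{C_2 (b_1 - B_c)}{h}\right). \qquad (3.44)$$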
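In order to do it let us consider the event $\left\{\hat V \ge \left(\frac{b'}{h}\right)^2\right\}$, which for $b' < B_c$ and $h$ small enough is bigger than $R^c$, that is, $R^c \subset \left\{\hat V \ge \left(\frac{b'}{h}\right)^2\right\}$. If $b'$ is just slightly smaller than $B_c$, then the bigger event is a good approximation to the smaller one. So instead of considering the conditional distribution $\hat\mu_{\Lambda,-,h}$, we begin with the distribution $\bar\mu_{\Lambda,-,h}(\,\cdot\,) = \mu_{\Lambda,-,h}(\,\cdot \mid \hat V \ge (b'/h)^2)$. The choice of $b'$ is immaterial. The only thing we need is that $b' < B_c$ and that it is close enough to $B_c$, so that $\phi(b') > \phi(b_1)$. Under such a choice this value $b'$ would not even appear in our estimates.

Elementary geometric considerations show that for small enough $\rho$, dependent on $\epsilon$ but not on $b$ and $h$,
$$\Omega_{\Lambda,-} \cap (D^+_{h,b}(\epsilon))^c \subset (\bar E^h_{b,\rho})^c.$$
And obviously every configuration in $\Omega_{\Lambda,-}$ has $\hat V \le \left(\frac{b}{h}\right)^2 + \frac{C}{h}$, for some finite constant $C$, which depends on $b$. So we can use Lemma 3.4.7 to conclude that we can choose an appropriate $b''$ (larger than but close enough to $b$), and small $\epsilon' > 0$, $\epsilon''$ and $h_0 > 0$ so as to have
$$\bar\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c \,\Big|\, \hat V \ge \left(\frac{b}{h}(1 - \epsilon')\right)^2\right) = \mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c \,\Big|\, \hat V \ge \left(\frac{b}{h}(1 - \epsilon')\right)^2\right)$$
$$= \mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c \,\Big|\, \hat V \ge \left(\frac{b''}{h}(1 - \epsilon'')\right)^2\right) = \mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c \,\Big|\, \left(\frac{b''}{h}(1 - \epsilon'')\right)^2 \le \hat V \le \left(\frac{b''}{h}\right)^2\right)$$
$$\le \mu_{\Lambda,-,h}\left((\bar E^h_{b,\rho})^c \,\Big|\, \left(\frac{b''}{h}(1 - \epsilon'')\right)^2 \le \hat V \le \left(\frac{b''}{h}\right)^2\right) \le \mu_{\Lambda,-,h}\left((\bar E^h_{b'',\rho/2})^c \,\Big|\, \left(\frac{b''}{h}(1 - \epsilon'')\right)^2 \le \hat V \le \left(\frac{b''}{h}\right)^2\right) \le C_3 \exp\left(-\frac{C_4 b}{h}\right), \qquad (3.45)$$
for some finite positive $C_3$ and $C_4$, which depend on $\rho$, and hence on $\epsilon$, but with $C_4$ not depending on $b$, $b_1$ and $b_2$.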
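Lemma 3.4.5 implies that given $\epsilon''' > 0$, after possibly readjusting the value of $h_0$, we will also have
$$\bar\mu_{\Lambda,-,h}\left(\hat V \le \left(\frac{b}{h}(1 - \epsilon''')\right)^2\right) \le C_5 \exp\left(-\frac{\beta}{h}\, \frac{\phi(b(1 - \epsilon''')) - \phi(b)}{2}\right) \le C_5 \exp\left(-\frac{C \epsilon''' (b_1 - B_c)}{h}\right). \qquad (3.46)$$
The last step above is based on elementary calculus, and the resulting $C > 0$ is a constant which depends only on the temperature. From (3.45) and (3.46) (with $\epsilon' = \epsilon'''$) we have
$$\bar\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c\right) \le C_6 \exp\left(-\frac{C_7 (b_1 - B_c)}{h}\right). \qquad (3.47)$$
To go back from $\bar\mu_{\Lambda,-,h}$ to $\hat\mu_{\Lambda,-,h}$, observe that if we choose an appropriate $\epsilon'''$, from (3.46) and an application of Lemma 3.4.7 similar to the one above we obtain
$$\bar\mu_{\Lambda,-,h}(R) \le C_8 \exp\left(-\frac{C_9 b_1}{h}\right).$$
Because $R^c \subset \left\{\hat V \ge \left(\frac{b'}{h}\right)^2\right\}$, we have
$$\hat\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c\right) = \hat\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c \,\Big|\, \hat V \ge \left(\frac{b'}{h}\right)^2\right) = \bar\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c \mid R^c\right) \le \frac{\bar\mu_{\Lambda,-,h}\left((D^+_{h,b}(\epsilon))^c\right)}{\bar\mu_{\Lambda,-,h}(R^c)}.$$
From the last three displays we obtain (3.44).

Let now $\{E_j\}$ partition $D^+_{h,b}(\epsilon)$ according to what the outermost contour $\Gamma$ in its definition is. In this fashion we obtain the following:
$$\hat\mu_{\Lambda,-,h}\left(B^{-*}_{h,b}(\epsilon) \mid D^+_{h,b}(\epsilon)\right) = \sum_j \alpha_j\, \hat\mu_{\Lambda,-,h}\left(B^{-*}_{h,b}(\epsilon) \mid E_j\right) = \sum_j \alpha_j\, \mu_{\Lambda,-,h}\left(B^{-*}_{h,b}(\epsilon) \mid E_j\right)$$
$$\le \mu_{\Lambda,+,h}\left(B^{-*}_{h,b}(\epsilon/2)\right) \le C_{10} \left(\frac{b_2}{h}\right)^4 \exp\left(-\frac{C_{11} b}{h}\right) \le C_{12} \exp\left(-\frac{C_{13} b_1}{h}\right), \qquad (3.48)$$
where in the second equality we used the fact that for each $j$, $E_j \subset R^c$, in the first inequality we used the Markov property of Gibbs distributions, the FKG-Holley inequalities and the fact that $\sum_j \alpha_j = 1$, and in the second inequality we used (1.17).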
The first claim that we wanted to prove, (3.42), follows from (3.44) and (3.48). Regarding the second claim, (3.43), which refers to the case in which $b_1 > B_0$, note that then Lemmas 3.4.5 (lower bound), 3.4.6 and 3.4.7 imply
$$\mu_{\Lambda,-,h}(R) \le C_{14} \exp\left(\frac{\beta}{h}\, \frac{\phi(b)}{2}\right) + C_1 \exp\left(-\frac{C_2 b}{h}\right) \le C_{14} \exp\left(\frac{\beta}{h}\, \frac{\phi(b_1)}{2}\right) + C_1 \exp\left(-\frac{C_2 b_1}{h}\right) \le C_{15} \exp\left(-\frac{C_{16} (b_1 - B_0)}{h}\right).$$
This shows that our second claim follows from the first one, already proven, since $\mu_{\Lambda,-,h}(\,\cdot\,) \le \hat\mu_{\Lambda,-,h}(\,\cdot\,) + \mu_{\Lambda,-,h}(R)$.

3.5. Spectral gap estimates.
In this section we will prove three propositions which provide lower bounds on the spectral gap of the generator of the kinetic Ising models on some finite sets. A basic technique to be used comes from the fundamental paper [Mar], where for the first time (to our knowledge) a rigorous mathematical relation was established between relaxation times of kinetic Ising models and the equilibrium surface tension. This was done in a setting in which there is no external applied field, and the system was taken in a square box of size $l \times l$ with free boundary conditions and at low temperature (in that paper only very low temperatures were considered, but the main results were later extended up to $T_c$ in [CGMS]). The "time to jump between the (+)-phase and the (−)-phase" was shown in that paper to behave as $\exp(\beta \bar\tau l)$, as $l \to \infty$, where $\bar\tau = \tau((1, 0))$ is the surface tension in a coordinate direction.

For each finite $\Lambda \subset \mathbb{Z}^2$, $\eta \in \Omega$ and $h$, the process $(\sigma^{\cdot}_{\Lambda,\eta,h;t})_{t \ge 0}$ is a finite-state-space reversible irreducible Markov process and its generator has its (discrete) spectrum contained in the interval $(-\infty, 0]$, with 0 being in the spectrum. The spectral gap, denoted by $\mathrm{gap}(\Lambda, \eta, h)$, is then simply the absolute value of the largest non-null number in the spectrum.

We will prove the three propositions below, the first two of them being needed in Sect. 3.2 and the third one being needed in Sect. 3.3. The heuristics behind these three propositions were explained in those sections.
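The definition of the spectral gap can be checked numerically on a toy system. The sketch below is an illustration only: it assumes an open chain of `n` spins, the Hamiltonian convention $H = -\frac12 \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - \frac{h}{2}\sum_x \sigma(x)$, and single-site heat-bath rates (one admissible choice of reversible rates, not necessarily the rates used in the paper). The eigenvalues of the symmetrized generator are obtained with a plain Jacobi rotation method, chosen only to keep the example free of external libraries.

```python
import itertools
import math


def energy(sigma, h):
    # Assumed convention: H = -(1/2) sum_nn s_x s_y - (h/2) sum_x s_x, open chain.
    pair = sum(sigma[i] * sigma[i + 1] for i in range(len(sigma) - 1))
    return -0.5 * pair - 0.5 * h * sum(sigma)


def heat_bath_generator(n, beta, h):
    """Generator q and stationary law pi of single-site heat-bath dynamics."""
    states = list(itertools.product((-1, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    weights = [math.exp(-beta * energy(s, h)) for s in states]
    z = sum(weights)
    pi = [wt / z for wt in weights]
    q = [[0.0] * len(states) for _ in states]
    for s in states:
        i = index[s]
        for x in range(n):
            j = index[s[:x] + (-s[x],) + s[x + 1:]]
            rate = pi[j] / (pi[i] + pi[j])  # reversible w.r.t. pi
            q[i][j] += rate
            q[i][i] -= rate
    return q, pi


def jacobi_eigenvalues(a, sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for r in range(p + 1, n):
                if a[p][r] == 0.0:
                    continue
                tau = (a[r][r] - a[p][p]) / (2.0 * a[p][r])
                root = math.sqrt(1.0 + tau * tau)
                t = 1.0 / (tau + root) if tau >= 0 else -1.0 / (-tau + root)
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # rotate columns p and r
                    akp, akr = a[k][p], a[k][r]
                    a[k][p], a[k][r] = c * akp - s * akr, s * akp + c * akr
                for k in range(n):  # rotate rows p and r
                    apk, ark = a[p][k], a[r][k]
                    a[p][k], a[r][k] = c * apk - s * ark, s * apk + c * ark
    return sorted(a[i][i] for i in range(n))


def spectral_gap(q, pi):
    """Smallest non-zero eigenvalue of -q, via the pi-symmetrized matrix."""
    n = len(q)
    sym = [[-q[i][j] * math.sqrt(pi[i] / pi[j]) for j in range(n)]
           for i in range(n)]
    return jacobi_eigenvalues(sym)[1]


if __name__ == "__main__":
    q, pi = heat_bath_generator(3, 1.0, 0.1)
    print(spectral_gap(q, pi))
```

For two spins at zero field the eigenvalues of this toy generator can be found by hand, and under the stated conventions the gap equals $2/(1 + e^{\beta})$, which gives a quick sanity check of the routine.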
For a recap of the terminology used in these lemmas, see the beginning of Sect. 3.4 and the paragraph in that section which precedes Lemma 3.4.8.

Proposition 3.5.1. For any $l_0 > 0$, any $\epsilon > 0$ and any $\bar b > 0$ there is $h_0 > 0$ such that if $0 < h \le h_0$ then for any $l_0$-quasi-Wulff-shaped set $\Lambda$ which has linear size parameter $b/h$ with $b \le \bar b$,
$$\mathrm{gap}(\Lambda, +, h) \ge \exp\left(-\frac{\epsilon}{h}\right).$$

Proposition 3.5.2. For any $l_0 > 0$, any $\epsilon > 0$ and any $B_c < b_{\mathrm{core}} < \bar b$ there is $h_0 > 0$ such that if $0 < h \le h_0$ then for any $l_0$-quasi-Wulff-shaped set $\Lambda$ which has linear size parameter $b/h$ with $b_{\mathrm{core}} < b \le \bar b$,
$$\mathrm{gap}\left(\Lambda \setminus \Lambda_{\mathrm{core}}, (+, -), h\right) \ge \exp\left(-\frac{\epsilon}{h}\right),$$
where $\Lambda_{\mathrm{core}} = \Lambda\left(\frac{b_{\mathrm{core}}}{h} W\right)$, and the boundary condition $(+, -)$ refers to freezing the spins up inside the core $\Lambda_{\mathrm{core}}$ and down outside $\Lambda$.

Proposition 3.5.3. For any $\epsilon > 0$ there are $b > B_c$ and $h_0 > 0$ such that if $0 < h \le h_0$ then
$$\mathrm{gap}\left(\Lambda\left(\frac{b}{h} W\right), -, h\right) \ge \exp\left(-\frac{\epsilon}{h}\right).$$

We will explain how certain results and techniques in [Mar] can be used to prove the three propositions above. The specific problem in [Mar] which is close to ours is the subject of Sect. 3 in that paper. This problem concerns the spectral gap for the process with no external field, in a square box of size $l \times l$ with (+)-boundary conditions. It is shown that given $\epsilon \in (0, 1/2)$, at low enough temperature there is $C > 0$ so that
$$\mathrm{gap}(\Lambda(L), +, 0) \ge \exp\left(-C L^{1/2 + \epsilon}\right). \qquad (3.49)$$
Heuristically speaking, in this problem (contrary to the case of free-boundary conditions, which is the main concern in [Mar]), there is no free-energy barrier to overcome for the system either starting with all spins down or all spins up to reach equilibrium. One simply expects the (+)-phase to drift inwards from the boundary. (This indicates that the result (3.49) is far from optimal, and that the corresponding gap should be much larger than this lower bound.) This situation is similar to our problems, as stated in the propositions above. In the first two propositions we are dealing with situations without free-energy barriers, and in the third one the free-energy barrier can be made as small as needed, by adjusting the value of $b$.

The technique used in [Mar] to prove (3.49) consists in comparing the spectral gap for the kinetic Ising model with the one for a block-dynamics. The estimate on the spectral gap for the block-dynamics is reduced to equilibrium-statistical-mechanics problems, which are then solved. By a block-dynamics the following is meant. Suppose that $\{\Lambda^0, \dots, \Lambda^J\}$ is a finite collection of finite subsets of $\mathbb{Z}^2$ and that $\Lambda \subset \cup_{j=0,\dots,J} \Lambda^j$ is another finite subset of $\mathbb{Z}^2$. The block-dynamics in $\Lambda$ with blocks $\{\Lambda^0, \dots, \Lambda^J\}$ and with boundary condition $\eta \in \Omega_{\Lambda,-}$ will be denoted by
$$(\sigma^{\cdot}_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t})_{t \ge 0}.$$
It is defined by updating each block $\Lambda^j \cap \Lambda$ at rate 1, independently of the other blocks, and at each update of $\Lambda^j \cap \Lambda$ replacing the configuration inside this block with a configuration chosen at random according to the Gibbs distribution $\mu_{\Lambda^j \cap \Lambda, \sigma, h}$, where $\sigma$ is the current configuration. The corresponding generator is given by
$$(Lf)(\sigma) = \sum_{j=0,\dots,J} \sum_{\sigma'} \mu_{\Lambda^j \cap \Lambda, \sigma, h}(\sigma')\left(f(\sigma') - f(\sigma)\right).$$
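A single block update of the kind just described can be sketched on a small graph by exact enumeration. The example below is illustrative only: the names `block_gibbs` and `block_update` are hypothetical, the lattice is passed in as an arbitrary neighbour dictionary, and the Hamiltonian convention $H = -\frac12 \sum_{\langle x,y \rangle} \sigma(x)\sigma(y) - \frac{h}{2}\sum_x \sigma(x)$ is an assumption. Enumeration costs $2^{|\text{block}|}$ operations, so it is only meant for tiny blocks.

```python
import itertools
import math
import random


def block_gibbs(block, sigma, neighbours, beta, h):
    """Gibbs law on `block` (a list of sites) given the spins of `sigma` outside.

    `neighbours[x]` lists the nearest neighbours of x, including sites outside
    the block; `sigma` must assign a spin to every such site.
    """
    inside = set(block)
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(block)):
        trial = dict(sigma)
        trial.update(zip(block, values))
        energy = 0.0
        for x in block:
            energy -= 0.5 * h * trial[x]
            for y in neighbours[x]:
                # A bond inside the block is visited from both ends: halve it.
                factor = 0.5 if y in inside else 1.0
                energy -= 0.5 * factor * trial[x] * trial[y]
        weights[values] = math.exp(-beta * energy)
    z = sum(weights.values())
    return {values: wt / z for values, wt in weights.items()}


def block_update(rng, block, sigma, neighbours, beta, h):
    """Resample the spins in `block` from the conditional Gibbs law."""
    dist = block_gibbs(block, sigma, neighbours, beta, h)
    configs = list(dist)
    chosen = rng.choices(configs, weights=[dist[c] for c in configs])[0]
    new_sigma = dict(sigma)
    new_sigma.update(zip(block, chosen))
    return new_sigma


if __name__ == "__main__":
    # Sites 0 and 3 are frozen at +1 and act as boundary condition.
    neighbours = {1: [0, 2], 2: [1, 3]}
    sigma = {0: 1, 1: -1, 2: -1, 3: 1}
    print(block_gibbs([1, 2], sigma, neighbours, 1.0, 0.0))
    print(block_update(random.Random(0), [1, 2], sigma, neighbours, 1.0, 0.0))
```

In the full block-dynamics each block would carry its own rate 1 clock; the sketch isolates the one step whose law enters the generator.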
To state an inequality which compares the spectral gap $\mathrm{gap}(\Lambda, \eta, h)$ with the spectral gap $\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h)$ of the block dynamics, we need to introduce some notation. Set
$$L_j = \max_{k \in \mathbb{Z}} \left|\{(x_1, x_2) \in \Lambda^j : x_2 = k\}\right|, \qquad L = \max_{j=0,\dots,J} L_j,$$
and
$$V = \max_{j=0,\dots,J} |\Lambda^j|.$$
Theorem 2.1 in [Mar] provides us with the following bound:
$$\mathrm{gap}(\Lambda, \eta, h) \ge \frac{C_1 \exp(-C_2 L)}{V}\, \mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h), \qquad (3.50)$$
where $C_1$ and $C_2$ are finite positive constants which depend only on the temperature. The proof of this result in [Mar] is restricted to the case in which the blocks are of a certain type, adapted to the needs in that paper, but this restriction is clearly not relevant in the proof. Somewhat more importantly, the rates of the kinetic Ising models in [Mar] are not as general as ours, with only a special case of rates, which satisfy detailed-balance and our assumptions (H1) – (H4), being considered. This is not a problem, and indeed, once (3.50) is established for one choice of rates satisfying detailed-balance and (H4), it holds for all such rates, thanks to the fact that the spectral gap is bounded below and above by, respectively, the same equilibrium quantity multiplied by $c_{\min}(T)$ and $c_{\max}(T)$ (for this see, e.g., Eq. (61) in [Sch1]).

In order to prove Propositions 3.5.1 – 3.5.3, our blocks will be Wulff-annuli, defined as follows. Given $\rho > 0$, set
$$A^0_\rho = 2\rho W, \qquad A^j_\rho = (j + 2)\rho W \setminus j\rho W \quad \text{for } j = 0, 1, \dots,$$
and for $h > 0$, and $j = 0, 1, \dots$,
$$\Lambda^j = \Lambda\left(\frac{1}{h} A^j_\rho\right),$$
where the value of $\rho$ is chosen in a fashion that we describe next, and which depends on the value of $\epsilon$ and $\bar b$ in the propositions that we are proving (in the case of Proposition 3.5.3 we can choose some arbitrary $\bar b > B_c$). Note that for each $\rho > 0$, $j = 0, 1, \dots$, and $r \in \mathbb{R}$, the set $A^j_\rho \cap \{(x_1, x_2) \in \mathbb{R}^2 : x_2 = r\}$, if not empty, consists of either an interval or the union of two intervals, and that its Lebesgue measure satisfies
$$\max_{r \in \mathbb{R}}\ \max_{j=0,\dots,\lfloor \bar b/\rho \rfloor} \left|A^j_\rho \cap \{(x_1, x_2) \in \mathbb{R}^2 : x_2 = r\}\right| \to 0 \quad \text{as } \rho \to 0.$$
Therefore we can choose $\rho$ small enough and take $J = \lfloor \bar b/\rho \rfloor$, so that the corresponding collection of blocks, $\{\Lambda^0, \dots, \Lambda^J\}$, satisfies for each $h > 0$,
$$L \le \frac{\epsilon}{2 C_2 h}, \qquad (3.51)$$
where $C_2$ comes from (3.50). Since $V$ grows only as a power of $1/h$, from (3.50) and (3.51) we see that all that remains is to show that with the choices above and the pertinent $\Lambda$ and $\eta$, $\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h)$ can be assured to be large enough. In the case of Propositions 3.5.1 and 3.5.2 this amounts to showing that this quantity can be bounded below by a positive constant which does not depend on $h$. In the case of Proposition 3.5.3, we need to show that by taking $b$ sufficiently close to $B_c$ this will also be the case. (The way we set things up above, some blocks $\Lambda^j$ may not intersect the set $\Lambda$ in some situations.
This, of course, is not important, since such blocks have no effect on the dynamics. The setup above was chosen for notational convenience.)

To study $\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h)$ one can couple the processes $(\sigma^-_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t})_{t \ge 0}$ and $(\sigma^+_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t})_{t \ge 0}$ in such a way that the first marginal never lies above the second one, and after they hit each other they coalesce and remain together. This can be done, for instance, via a graphical construction, in which to each block $\Lambda^j$ we associate a rate 1 Poisson process. The occurrence times of this Poisson process then determine the moments of update of $\Lambda^j \cap \Lambda$ in both marginal processes, and the updates are coupled in a way that preserves the order, which is possible due to the FKG-Holley inequalities. In what follows we will use $P$ to denote the probabilities associated to this coupling. The goal is now to show that there are positive finite constants $t_0$ and $h_0$ so that for all $0 < h \le h_0$ and all $\Lambda$ with which we are concerned, and corresponding $\eta$,
$$P\left(\sigma^-_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t_0} \ne \sigma^+_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t_0}\right) \le \frac{1}{2}. \qquad (3.52)$$
From this inequality it then follows that for all $t > 0$,
$$P\left(\sigma^-_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t} \ne \sigma^+_{\Lambda,\{\Lambda^0,\dots,\Lambda^J\},\eta,h;t}\right) \le \left(\frac{1}{2}\right)^{\lfloor t/t_0 \rfloor} \le C \exp\left(-\frac{\log 2}{t_0}\, t\right).$$
In particular then we have
$$\mathrm{gap}(\Lambda, \{\Lambda^0, \dots, \Lambda^J\}, \eta, h) \ge \frac{\log 2}{t_0},$$
as needed to complete the proofs.

Concerning the proof of (3.52), we start by observing that we have only a finite number of blocks. Therefore, if $t_0$ is chosen large enough it will be likely that before time $t_0$ a sequence of updates will have occurred in the coupled processes in which the blocks were updated in a particular, predetermined order. In the case of Propositions 3.5.1 and 3.5.3 the good order is the decreasing one, from $J$ down to 0, while in the case of Proposition 3.5.2 it is the increasing order, from 0 up to $J$. The point is that at the end of a sequence of updates produced in the good order it is very likely, when $h$ is small, that the two marginal processes will have hit each other. To show this, we will use an equilibrium estimate on how much the boundary conditions can influence the Gibbs distributions inside the annular blocks. This is the content of the following lemma. After stating it and before going into the proof of it we will explain the idea of how it has to be used in order to prove (3.52). In the lemma we will use the notation
$$P^-_{h,b}(\epsilon) = \left\{\Lambda\left(\frac{b(1 - \epsilon)}{h} W\right) \stackrel{-}{\longleftrightarrow} \left(\Lambda\left(\frac{b(1 + \epsilon)}{h} W\right)\right)^c\right\},$$
and
$$P^+_{h,b}(\epsilon) = \left\{\Lambda\left(\frac{b(1 - \epsilon)}{h} W\right) \stackrel{+}{\longleftrightarrow} \left(\Lambda\left(\frac{b(1 + \epsilon)}{h} W\right)\right)^c\right\}.$$
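The passage from (3.52) to a lower bound on the block-dynamics gap is elementary arithmetic, and a short sketch may make the constants explicit. It reads the decay rate as $\log 2 / t_0$ (the absolute value of $\log(1/2)/t_0$) and uses $C = 2$ in the exponential envelope; the function names are hypothetical and the value of `t0` is a free input, not a quantity computed in the paper.

```python
import math


def coupling_failure_bound(t, t0):
    """Bound (1/2)**floor(t/t0) on P(sigma^-_t != sigma^+_t), given (3.52).

    Restarting the coupling every t0 units of time and using the Markov
    property halves the probability of non-coalescence on each interval.
    """
    return 0.5 ** math.floor(t / t0)


def exponential_envelope(t, t0):
    """Smooth envelope 2*exp(-(log 2) t / t0), which dominates the bound above."""
    return 2.0 * math.exp(-math.log(2.0) * t / t0)


def gap_lower_bound(t0):
    """Exponential decay rate implied by the envelope: log(2) / t0."""
    return math.log(2.0) / t0


if __name__ == "__main__":
    t0 = 2.0  # hypothetical coalescence time scale
    for t in (0.5, 2.0, 7.0, 20.0):
        print(t, coupling_failure_bound(t, t0), exponential_envelope(t, t0))
    print(gap_lower_bound(t0))
```

The envelope holds because $\lfloor t/t_0 \rfloor \ge t/t_0 - 1$, so that $(1/2)^{\lfloor t/t_0 \rfloor} \le 2 \cdot (1/2)^{t/t_0}$.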
Lemma 3.5.1. For any $l_0 > 0$, any $\epsilon > 0$ and any $0 < b_1 < b_2$ there are finite positive constants $C_1$ and $C_2$ such that

(a) If $\phi(b_1) < \phi(b_2)$ then for all $\Lambda_1$ and $\Lambda_2$ which are $l_0$-quasi-Wulff shaped with respective linear-size-parameters $\frac{b_1}{h}$ and $\frac{b_2}{h}$,
$$\mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^+_{h,b_1}(\epsilon)\right) \le C_1 \exp(-C_2/h),$$
for all $h > 0$. And if also $x \in \left(\Lambda\left(\frac{b_1 + \epsilon}{h} W\right)\right)^c$ then
$$\left|\langle\sigma(x)\rangle_{\Lambda_2 \setminus \Lambda_1, (+,-), h} - \langle\sigma(x)\rangle_{\Lambda_2 \setminus \Lambda_1, (-,-), h}\right| \le C_1 \exp(-C_2/h),$$
for all $h > 0$.

(b) If $\phi(b_1) > \phi(b_2)$ then for all $\Lambda_1$ and $\Lambda_2$ which are $l_0$-quasi-Wulff shaped with respective linear-size-parameters $\frac{b_1}{h}$ and $\frac{b_2}{h}$,
$$\mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^-_{h,b_2}(\epsilon)\right) \le C_1 \exp(-C_2/h),$$
for all $h > 0$. And if also $x \in \Lambda\left(\frac{b_2 - \epsilon}{h} W\right)$ then
$$\left|\langle\sigma(x)\rangle_{\Lambda_2 \setminus \Lambda_1, (+,-), h} - \langle\sigma(x)\rangle_{\Lambda_2 \setminus \Lambda_1, (+,+), h}\right| \le C_1 \exp(-C_2/h),$$
for all $h > 0$.

(c) For all $\Lambda_1$ and $\Lambda_2$ which are $l_0$-quasi-Wulff shaped with respective linear-size-parameters $\frac{b_1}{h}$ and $\frac{b_2}{h}$,
$$\mu_{\Lambda_2 \setminus \Lambda_1, (-,+), h}\left(P^-_{h,b_1}(\epsilon)\right) \le C_1 \exp(-C_2/h),$$
for all $h > 0$. And if also $x \in \left(\Lambda\left(\frac{b_1 + \epsilon}{h} W\right)\right)^c$ then
$$\left|\langle\sigma(x)\rangle_{\Lambda_2 \setminus \Lambda_1, (-,+), h} - \langle\sigma(x)\rangle_{\Lambda_2 \setminus \Lambda_1, (+,+), h}\right| \le C_1 \exp(-C_2/h),$$
for all $h > 0$.

Proof of (3.52). Here we explain the idea of deriving the estimate (3.52) from the above lemma. We will do it for the case of Proposition 3.5.1, for which the statement (c) of the lemma is used. The main observation is very simple. As the reader remembers, we are concerned with the event that the updates of the blocks happen in decreasing order, that is, first the block $\Lambda^J$ is updated, then the block $\Lambda^{J-1}$ is, and so on. Note that the boundary condition on the outer boundary of $\Lambda^J$ is (+). As we learn from the lemma, after the update of the block $\Lambda^J$, with overwhelming probability it becomes almost completely filled with (+)-phase, no matter what the boundary condition on the inner boundary of $\Lambda^J$ is. In particular, its middle line (which is the outer boundary of the block $\Lambda^{J-1}$) is in the (+)-phase. So the argument can be repeated. For the complete argument the reader is referred to [Mar], Theorem 3.1.

In the final remark before going into the proof of Lemma 3.5.1 let us explain the relations between different parts of the lemma and Propositions 3.5.1 – 3.5.3: part (a) has to be used for Proposition 3.5.3, part (b) for Proposition 3.5.2, and part (c) for Proposition 3.5.1. The choice of $b$ in Proposition 3.5.3 has to be done in such a way that $\phi(b) > \phi(b - \rho)$ with $\rho$ chosen before (3.51).

Proof of Lemma 3.5.1. In each one of the three parts of the lemma, the second claim, about the expected value of the spin at a site $x$, follows from the first claim in the same part of the lemma, by arguments analogous to (1.20).
Regarding the first claim in each part of the lemma, we could in principle develop a machinery similar to the one developed in Sect. 3.4, to deal with the Gibbs distribution inside Wulff annuli. Some technical complications would arise from the fact that such sets are not simply-connected. It turns out, nevertheless, that for our purposes we can avoid this lengthy approach, and rather use the results for simply-connected sets (more specifically, for quasi-Wulff-shaped sets) of Sect. 3.4, combined with some tricks involving conditioning. To stress that our approach is to some extent natural, we observe that the hypothesis in our lemma refers to the values of $\phi(b_1)$ and $\phi(b_2)$, and hence somehow we should use our knowledge about quasi-Wulff-shaped boxes to prove these results on the annuli. And to make the conditioning below appear less of a trick, observe that, say in part (a), we are interested in (+)-boundary conditions in the center, something that is akin to having a droplet of the (+)-phase placed there; the conditioning places such a droplet in the center.

We turn now to the proof of the first claim in part (a) of the lemma. Recall the definitions in the first paragraph of the proof of Lemma 3.4.8. Using that notation, we will say that $\Gamma$ is a negative-outside (same as positive-inside) contour of a configuration $\eta$ in case $\eta$ is identically $-1$ on $\partial_{\mathrm{ext}} \Gamma$ and identically $+1$ on $\partial_{\mathrm{int}} \Gamma$. External contours of any configuration in $\Omega_-$ are negative-outside.

Let $E$ be the event that there is a contour which surrounds $\Lambda_1$ and $E' \subset E$ be the event that such a contour is contained in $\frac{b_1(1+\epsilon/2)}{h} W$ and is negative-outside. We are going to argue that if $E$ happens, then with very high probability $E'$ happens as well. So we need an estimate on $\mu_{\Lambda_2,-,h}(E)$ from below, and an estimate on $\mu_{\Lambda_2,-,h}\left(E \cap (E')^c\right)$ from above. From Lemma 3.4.3 we know that the first probability is at least of the order of $\exp\left(-\frac{\beta}{h} \phi(b_1)\right)$.
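To obtain the second estimate, let us introduce the number $b' > b_1$, which is close enough to $b_1$, so that $\delta = 1 - \frac{b_1}{b'}$ is small enough. The choice of $\delta$ will be made later. One sees immediately that
$$\mu_{\Lambda_2,-,h}\left((E')^c \cap E\right) = \mu_{\Lambda_2,-,h}\left((E')^c \cap E \cap \left[\left(\frac{b_1}{h}\right)^2 \le \hat V \le \left(\frac{b_2}{h}\right)^2\right]\right)$$
$$\le \mu_{\Lambda_2,-,h}\left((E')^c \cap E \,\Big|\, \left(\frac{b'}{h}(1 - \delta)\right)^2 \le \hat V \le \left(\frac{b'}{h}\right)^2\right) \times \mu_{\Lambda_2,-,h}\left(\left(\frac{b'}{h}(1 - \delta)\right)^2 \le \hat V \le \left(\frac{b'}{h}\right)^2\right) + \mu_{\Lambda_2,-,h}\left(\left(\frac{b'}{h}\right)^2 \le \hat V \le \left(\frac{b_2}{h}\right)^2\right).$$
Note now, that if both events $(E')^c \cap E$ and $\left\{\hat V \le \left(\frac{b'}{h}\right)^2\right\}$ happen, and $\delta$ is small enough, then we can claim that the following three properties hold:
there is an exterior contour surrounding $\Lambda_1$;
this contour can not be shifted so as to fit inside $\frac{b'(1+\epsilon/4)}{h} W$;
there are no other exterior contours which can surround $\Lambda_1$ even after being shifted.
In other words, we have the inclusion $(E')^c \cap E \cap \left\{\hat V \le \left(\frac{b'}{h}\right)^2\right\} \subset (\bar E^h_{b',\epsilon/4})^c$. So from the hypothesis that $\phi(b_1) < \phi(b_2)$ and Lemmas 3.4.3, 3.4.5 and 3.4.7 we have
$$\mu_{\Lambda_2,-,h}\left((E')^c \mid E\right) \le C_1 \exp(-C_2/h), \qquad (3.53)$$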
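for some positive finite constants $C_1$ and $C_2$. Set
$$F = \left\{\Lambda\left(\frac{b_1(1 + \epsilon/2)}{h} W\right) \stackrel{+}{\longleftrightarrow} \left(\Lambda\left(\frac{b_1(1 + \epsilon)}{h} W\right)\right)^c\right\}.$$
Partitioning $E'$ according to what the innermost negative-outside contour around $\Lambda_1$ is and using the FKG-Holley inequalities we obtain
$$\mu_{\Lambda_2,-,h}(F \mid E') \le \mu_{\Lambda_2,-,h}(F).$$
In order to have $\phi(b_1) < \phi(b_2)$ we must have $b_2 < B_0$. Clearly also, for some $\epsilon' > 0$ which depends on $\epsilon$ and $b_1$, $F \cap \Omega_{\Lambda_2,-} \subset \left\{\mathcal{W} \ge \frac{\epsilon'}{h}\right\}$. Therefore, if we choose a conveniently small $\epsilon'' > 0$ and use Lemmas 3.4.2 and 3.4.5 we obtain
$$\mu_{\Lambda_2,-,h}(F) \le \mu_{\Lambda_2,-,h}\left(\mathcal{W} \ge \frac{\epsilon'}{h},\ \hat V \le \left(\frac{\epsilon''}{h}\right)^2\right) + \mu_{\Lambda_2,-,h}\left(\hat V \ge \left(\frac{\epsilon''}{h}\right)^2\right) \le C_1 \exp(-C_2/h).$$
Combining the various inequalities displayed above, we have
$$\mu_{\Lambda_2,-,h}(F \mid E) \le \mu_{\Lambda_2,-,h}\left((E')^c \mid E\right) + \mu_{\Lambda_2,-,h}(F \mid E') \le C_1 \exp(-C_2/h). \qquad (3.54)$$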
To proceed, we have to introduce a new notion. We will call a (*)-circuit $\gamma = x_1, \dots, x_n$ a c-circuit iff there exists a configuration $\sigma \in \Omega_-$ with one contour, $\Gamma$, such that $\partial_+ \Gamma = \{x_1, \dots, x_n\}$. (Note that our definition depends on the choice of the splitting rules, used in connection with transforming contours into closed curves.) The interior $\mathrm{Int}\, \gamma$ of a c-circuit $\gamma$ is by definition the interior of the corresponding contour $\Gamma$. Let $\gamma_1$, $\gamma_2$ be two c-circuits; we define the intersection c-circuits $\delta_k$ in the following way. Let $I = \mathrm{Int}\, \gamma_1 \cap \mathrm{Int}\, \gamma_2$ and $I_k$ be connected components of $I$ in the sense of the above mentioned splitting rules. Since $I$ is simply-connected, so are the $I_k$-s. Let $\delta_k$ be c-circuits, such that $\mathrm{Int}\, \delta_k = I_k$.

For immediate use we need the following property of c-circuits: let $\gamma_1$ and $\gamma_2$ be two c-circuits, $\delta$ be one of the intersection c-circuits, and a configuration $\sigma$ be given, such that both $\gamma_1$ and $\gamma_2$ are (+*)-circuits of it. Then so is the circuit $\delta$. To see this, let us introduce the corresponding contours $\Gamma_1$, $\Gamma_2$ and $\Delta$, and let first a site $x$ of the c-circuit $\delta$ be at the distance $\frac{1}{2}$ from some bond $b$ of $\Delta$. But this bond $b$ evidently belongs to at least one of the contours $\Gamma_1$, $\Gamma_2$, while $x$ is inside this contour, hence $\sigma(x) = +1$. In the remaining case the distance $\mathrm{dist}(x, \Delta) = \frac{\sqrt{2}}{2}$, and in such a situation there are two adjacent bonds $b_1$, $b_2$ of $\Delta$, with $\mathrm{dist}(x, b_1) = \mathrm{dist}(x, b_2) = \frac{\sqrt{2}}{2}$, and two nearest neighbors $x_1$, $x_2$ of $x$, belonging to $\delta$, such that $\mathrm{dist}(x_i, b_i) = \frac{1}{2}$, $i = 1, 2$. Consider another nearest neighbor $y$ of the sites $x_1$, $x_2$. It stays outside $\Delta$, hence it also stays outside at least one of the contours $\Gamma_1$, $\Gamma_2$, while all three sites $x$, $x_1$, $x_2$ are inside both of them. Hence both bonds $b_1$, $b_2$ belong to that contour, which again implies that $\sigma(x) = +1$.

Let $E''$ be the event that some c-circuit $\gamma$ which surrounds $\Lambda_1$ and is contained in $\frac{b_1(1+\epsilon/2)}{h} W$ is a (+*)-circuit. Let $\{E''_j\}$ be the partition of $E''$ according to what the innermost such circuit $\gamma$ is. (The preceding paragraph ensures the existence of such an
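innermost (+)-c-circuit.) Using the fact that for each $j$ we have $\Omega_{\Lambda_2,-} \cap E''_j \subset E$, the FKG-Holley inequalities and (3.53) we obtain
$$\mu_{\Lambda_2,-,h}(F \mid E) \ge \sum_j \mu_{\Lambda_2,-,h}\left(F \mid E''_j \cap E\right) \mu_{\Lambda_2,-,h}\left(E''_j \mid E\right) = \sum_j \mu_{\Lambda_2,-,h}\left(F \mid E''_j\right) \mu_{\Lambda_2,-,h}\left(E''_j \mid E\right)$$
$$\ge \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}(F) \sum_j \mu_{\Lambda_2,-,h}\left(E''_j \mid E\right) = \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}(F)\, \mu_{\Lambda_2,-,h}\left(E'' \mid E\right)$$
$$\ge \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}(F)\, \mu_{\Lambda_2,-,h}\left(E' \mid E\right) \ge \frac{1}{2}\, \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}(F), \qquad (3.55)$$
for small $h$. Comparing (3.54) with (3.55) we obtain
$$\mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^+_{h,b_1}(\epsilon)\right) \le \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}(F) \le 2 C_1 \exp(-C_2/h).$$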
This completes the proof of part (a) of the lemma, and we turn to the proof of the first claim in part (b) of the lemma.

Let this time $E$ be the event that there is a contour which surrounds $\frac{b_1(1-\epsilon)}{h} W$ and $E'$ be the event that there is a contour which surrounds $\frac{b_2(1-\epsilon/2)}{h} W$. With no loss of generality, we will suppose that $\epsilon$ is small enough so that $E' \subset E$. From the hypothesis that $\phi(b_1) > \phi(b_2)$ and Lemmas 3.4.5 and 3.4.7 we have
$$\mu_{\Lambda_2,-,h}\left((E')^c \mid E\right) \le C_1 \exp(-C_2/h), \qquad (3.56)$$
for some positive finite constants $C_1$ and $C_2$. Therefore also
$$\mu_{\Lambda_2,-,h}\left(P^-_{h,b_2}(\epsilon) \mid E\right) \le C_1 \exp(-C_2/h). \qquad (3.57)$$
Let this time $E''$ be the event that there is a (+)-circuit which surrounds $\frac{b_1(1-\epsilon)}{h} W$ and is contained in $\frac{b_1(1-\epsilon/2)}{h} W$. Let $\{E''_j\}$ partition $E''$ according to what the innermost such (+)-circuit is.

By partitioning $(P^-_{h,b_2}(\epsilon))^c$ according to what the (−)-cluster of $(\Lambda_2)^c$ is and using the FKG-Holley inequalities combined with (1.17) and the graph-theoretic duality between connectivity and (*)-connectivity, in addition to (3.57), we obtain
$$\mu_{\Lambda_2,-,h}\left((E'')^c \mid E\right) \le \mu_{\Lambda_2,-,h}\left(P^-_{h,b_2}(\epsilon) \mid E\right) + \mu_{\Lambda_2,-,h}\left((E'')^c \mid (P^-_{h,b_2}(\epsilon))^c\right) \le C_1 \exp(-C_2/h). \qquad (3.58)$$
Using the fact that for each $j$ we have $\Omega_{\Lambda_2,-} \cap E''_j \subset E$, the FKG-Holley inequalities and (3.58) we obtain
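$$\mu_{\Lambda_2,-,h}\left(P^-_{h,b_2}(\epsilon) \mid E\right) \ge \sum_j \mu_{\Lambda_2,-,h}\left(P^-_{h,b_2}(\epsilon) \mid E''_j \cap E\right) \mu_{\Lambda_2,-,h}\left(E''_j \mid E\right) = \sum_j \mu_{\Lambda_2,-,h}\left(P^-_{h,b_2}(\epsilon) \mid E''_j\right) \mu_{\Lambda_2,-,h}\left(E''_j \mid E\right)$$
$$\ge \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^-_{h,b_2}(\epsilon)\right) \sum_j \mu_{\Lambda_2,-,h}\left(E''_j \mid E\right) = \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^-_{h,b_2}(\epsilon)\right) \mu_{\Lambda_2,-,h}\left(E'' \mid E\right) \ge \frac{1}{2}\, \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^-_{h,b_2}(\epsilon)\right), \qquad (3.59)$$
for small $h$. Comparing (3.57) with (3.59) we obtain
$$\mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h}\left(P^-_{h,b_2}(\epsilon)\right) \le 2 C_1 \exp(-C_2/h).$$
This completes the proof of part (b) of the lemma, and we turn to the proof of the first claim in part (c) of the lemma.

Because in Sect. 3.4 we did not study systems with (+) boundary conditions, we will use a somewhat artificial approach to part (c), by reducing it to part (a), studied above. (In doing so we will proceed as someone who first heats up cold water in order to freeze it later, because he knows how to freeze hot water, but has never frozen cold water.) First note that by reversing all signs and then using the FKG-Holley inequalities, we obtain, for any $h' \ge -h$,
$$\mu_{\Lambda_2 \setminus \Lambda_1, (-,+), h}\left(P^-_{h,b_1}(\epsilon)\right) = \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), -h}\left(P^+_{h,b_1}(\epsilon)\right) \le \mu_{\Lambda_2 \setminus \Lambda_1, (+,-), h'}\left(P^+_{h,b_1}(\epsilon)\right).$$
Suppose that $h' > 0$ and set
$$b_1' = \frac{h'}{h}\, b_1 \quad \text{and} \quad b_2' = \frac{h'}{h}\, b_2.$$
If $h'$ is small enough we have $0 < b_1' < b_2' < B_c$ and hence $\phi(b_1') < \phi(b_2')$. But $\Lambda_1$ and $\Lambda_2$ are $l_0$-quasi-Wulff shaped with respective linear-size-parameters $\frac{b_1}{h} = \frac{b_1'}{h'}$ and $\frac{b_2}{h} = \frac{b_2'}{h'}$. So we can quote part (a) of the lemma to conclude the proof of part (c).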
Acknowledgement. Over the years during which we have worked on this project, we have enjoyed the benefit of several stimulating conversations with various colleagues. We are especially thankful to A. van Enter, R. Kotecký, F. Martinelli, E. Olivieri and E. Scoppola. Part of this work was done while R.H.S. was visiting Rome in the Fall of 1994, and he thankfully acknowledges the warm hospitality of the Physics Department of the University of Rome I and of the Mathematics Departments of the Universities of Rome II and III.
References
[AL] Aizenman, M. and Lebowitz, J.L.: Metastability effects in bootstrap percolation. J. Phys. A 21, 3801–3813 (1988)
[BM] Binder, K. and Müller-Krumbhaar, H.: Investigation of metastable states and nucleation in the kinetic Ising model. Physical Review B 9, 2328–2353 (1974)
[CCO] Capocaccia, D., Cassandro, M. and Olivieri, E.: A study of metastability in the Ising model. Commun. Math. Phys. 39, 185–205 (1974)
[CCS] Chayes, J., Chayes, L. and Schonmann, R.H.: Exponential decay of connectivities in the two-dimensional Ising model. J. Stat. Phys. 49, 433–445 (1987)
[CGMS] Cesi, F., Guadagni, G., Martinelli, F. and Schonmann, R.H.: On the 2D dynamical Ising model in the phase coexistence region near the critical point. J. Stat. Phys. 85, 55–102 (1996)
[DS] Dehghanpour, P. and Schonmann, R.H.: Metropolis dynamics relaxation via nucleation and growth. Commun. Math. Phys. 188, 89–119 (1997)
[DKS] Dobrushin, R.L., Kotecký, R. and Shlosman, S.B.: Wulff construction: a global shape from local interaction. AMS translations series, Providence, RI: Am. Math. Soc., 1992
[Geo] Georgii, H.-O.: Gibbs measures and phase transitions. Berlin, New York: Walter de Gruyter, 1988
[GD] Gunton, J.D. and Droz, M.: Introduction to the theory of metastable and unstable states. In: Lecture Notes in Physics, 183, Berlin–Heidelberg–New York: Springer-Verlag, 1983
[Iof1] Ioffe, D.: Large deviations for the 2D Ising model: A lower bound without cluster expansions. J. Stat. Phys. 74, 411–432 (1994)
[Iof2] Ioffe, D.: Exact large deviation bounds up to Tc for the Ising model in two dimensions. Prob. Th. rel. Fields 102, 313–330 (1995)
[Isa] Isakov, S.N.: Nonanalytic features of the first order phase transition in the Ising model. Commun. Math. Phys. 95, 427–443 (1984)
[KO] Kotecký, R. and Olivieri, E.: Droplet dynamics for asymmetric Ising model. J. Stat. Phys. 70, 1121–1148 (1993)
[Lig] Liggett, T.: Interacting Particle Systems. Berlin–Heidelberg–New York: Springer-Verlag, 1985
[Mar] Martinelli, F.: On the two-dimensional dynamical Ising model in the phase coexistence region. J. Stat. Phys. 76, 1179–1246 (1994)
[M-L] Martin-Löf, A.: Mixing properties, differentiability of the free energy and the central limit theorem for a pure phase in the Ising model at low temperature. Commun. Math. Phys. 32, 75–92 (1973)
[PL] Penrose, O. and Lebowitz, J.L.: Towards a rigorous molecular theory of metastability. In: Fluctuation Phenomena (second edition), E. W. Montroll, J. L. Lebowitz, editors, Amsterdam: North–Holland Physics Publishing, 1987
[Pfi] Pfister, C.E.: Large deviations and phase separation in the two-dimensional Ising model. Helv. Phys. Acta 64, 953–1054 (1991)
[RTMS] Rikvold, P.A., Tomita, H., Miyashita, S. and Sides, S.W.: Metastable lifetimes in a kinetic Ising model: dependence on field and system size. Phys. Rev. E 49, 5080–5090 (1994)
[Sch1] Schonmann, R.H.: Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region. Commun. Math. Phys. 161, 1–49 (1994)
[Sch2] Schonmann, R.H.: Theorems and conjectures on the droplet driven relaxation of stochastic Ising models. In: Probability theory of spatial disorder and phase transition, G. Grimmett, ed., Amsterdam: Kluwer Publ. Co, 1994, pp. 265–301
[SS1] Schonmann, R.H. and Shlosman, S.B.: Complete analyticity for 2D Ising completed. Commun. Math. Phys. 170, 453–482 (1995)
[SS2] Schonmann, R.H. and Shlosman, S.B.: Constrained variational problem with applications to the Ising model. J. Stat. Phys. 83, 867–905 (1996)
Communicated by Ya. G. Sinai
Commun. Math. Phys. 194, 463 – 479 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
A Steady-State Quantum Euler–Poisson System for Potential Flows
Ansgar Jüngel 1,2
1 Fachbereich Mathematik, Universität Rostock, Universitätsplatz 1, D-18055 Rostock, Germany.
2 Fachbereich Mathematik, Technische Universität Berlin, Straße des 17. Juni 136, D-10623 Berlin, Germany. E-mail: [email protected]
Received: 25 February 1997 / Accepted: 29 October 1997
Abstract: A potential flow formulation of the hydrodynamic equations with the quantum Bohm potential for the particle density and the current density is given. The equations are selfconsistently coupled to Poisson’s equation for the electric potential. The stationary model consists of nonlinear elliptic equations of degenerate type with a quadratic growth of the gradient. Physically motivated Dirichlet boundary conditions are prescribed. The existence of solutions is proved under the assumption that the electric energy is small compared to the thermal energy. The proof is based on Leray-Schauder’s fixed point theorem and a truncation method. The main difficulty is to find a uniform lower bound for the density. For sufficiently large electric energy, there exists a generalized solution (of a simplified system), where the density vanishes at some point. Finally, uniqueness of the solution is shown for a sufficiently large scaled Planck constant.
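1. Introduction
The evolution of a fluid or gas is governed by the hydrodynamic equations [20]
$$\frac{\partial n}{\partial t} + \operatorname{div} J = 0, \tag{1.1}$$
$$\frac{\partial J}{\partial t} + \operatorname{div}\Big(\frac{J \otimes J}{n} + P\Big) - nF = W. \tag{1.2}$$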
The first equation expresses the conservation of mass where n is the particle density and J the particle current density. The second equation expresses the conservation of momentum where P = (Pij ) denotes the pressure tensor, F the sum of the external forces, and W the momentum relaxation term. The ith component of div (J ⊗ J/n + P ) is given by
$$\sum_{j=1}^{d} \frac{\partial}{\partial x_j}\Big(\frac{J_i J_j}{n} + P_{ij}\Big),$$
where $d \ge 1$ is the space dimension. We consider an isothermal or isentropic quantum fluid of charged particles. In particular, the pressure tensor is assumed to be of the form $P = (\delta_{ij}\, r(n))$, where $\delta_{ij}$ is the Kronecker symbol. The pressure function $r$ is given by the particle density, i.e. $r(n) = T_o n$ in the isothermal case and $r(n) = T_o n^{\beta}$ in the isentropic case, where $\beta > 1$ and $T_o$ is a (scaled) temperature constant. In the isothermal case, the fluid temperature $T$ is equal to $T_o$; in the isentropic case we get $T = T_o n^{\beta-1}$. We assume that the external force is the gradient of the sum of the electric potential $V$, the external potential $V_{ext}$, and the quantum Bohm potential
$$Q = \delta^2\, \frac{\Delta\sqrt{n}}{\sqrt{n}},$$
$\delta > 0$ being the scaled Planck constant. The external potential models (interior) quantum wells. Equations (1.1)–(1.2) are coupled to Poisson's equation for the electric potential,
$$\lambda^2 \Delta V = n - C(x). \tag{1.3}$$
Here, $\lambda$ denotes the scaled Debye length, and $C(x)$ models fixed background ions. Finally, the relaxation term is given by $W = -\alpha J$, where $\alpha > 0$ is the inverse of the scaled relaxation time. With these assumptions the quantum hydrodynamic equations can be formulated as
$$\frac{\partial n}{\partial t} + \operatorname{div} J = 0, \tag{1.4}$$
$$\frac{\partial J}{\partial t} + \operatorname{div}\Big(\frac{J \otimes J}{n}\Big) + \nabla r(n) - n\nabla(V + V_{ext}) - \delta^2 n \nabla\Big(\frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = -\alpha J. \tag{1.5}$$
The primary application of the quantum hydrodynamic equations to date has been in analyzing the flow of electrons in quantum semiconductor devices, like resonant tunneling diodes [10]. Very similar model equations have been used in other areas of physics, e.g. in superfluidity [22] and in superconductivity [6]. The quantum Euler–Poisson system (1.3)–(1.5) has been justified in [1, 10, 12, 13, 14]. It can be derived from a moment expansion of the Wigner-Boltzmann equations [10] or from a mixed state Schrödinger–Poisson system [12]. In particular, the single state Schrödinger-Poisson system
$$i\varepsilon \frac{\partial \psi}{\partial t} = -\frac{\varepsilon^2}{2}\Delta\psi + (V + V_{ext})\psi, \qquad \lambda^2 \Delta V = |\psi|^2 - C(x)$$
is equivalent (for appropriate “smooth” solutions) to the irrotational zero temperature flow equations
$$\frac{\partial n}{\partial t} + \operatorname{div} J = 0,$$
$$\frac{\partial J}{\partial t} + \operatorname{div}\Big(\frac{J \otimes J}{n}\Big) - n\nabla(V + V_{ext}) - \frac{\varepsilon^2}{2}\, n \nabla\Big(\frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = 0$$
and Poisson's equation (1.3) (see [21, 14]). These equations are known as Madelung's fluid equations [22]. The expression “irrotational” means that the current density can be written as $J = n\nabla S$, where $S$ is called a phase or quantum Fermi potential. The equivalence of the two models follows from the definitions $n = |\psi|^2$, $\psi = \sqrt{n}\exp(iS/\varepsilon)$ and $J = n\nabla S$. We note that for finite relaxation times $\alpha > 0$, there is no equivalence to a Schrödinger-Poisson system, not even in the mixed state. In this paper we study the steady-state equations
$$\operatorname{div} J = 0, \tag{1.6}$$
$$\operatorname{div}\Big(\frac{J \otimes J}{n}\Big) + \nabla r(n) - n\nabla(V + V_{ext}) - \delta^2 n \nabla\Big(\frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = -\alpha J, \tag{1.7}$$
$$\lambda^2 \Delta V = n - C \tag{1.8}$$
in a bounded domain $\Omega \subset \mathbb{R}^d$ ($d \ge 1$) occupied by the fluid. The main assumption is that we consider a potential flow, i.e. we assume that the particle current can be written as $J = n\nabla S$ with the quantum Fermi potential $S$ (see above). This means that the velocity $J/n = \nabla S$ is assumed to be irrotational. It is physically reasonable to assume that $n > 0$ holds in the device. Since $\operatorname{div}(J \otimes J/n) = \frac{1}{2} n \nabla|\nabla S|^2$ we can rewrite (1.7) as
$$n\nabla\Big(\frac{1}{2}|\nabla S|^2 + T_o h(n) - V - V_{ext} - \delta^2\, \frac{\Delta\sqrt{n}}{\sqrt{n}}\Big) = -\alpha n \nabla S, \tag{1.9}$$
where
$$h(n) = \frac{1}{T_o}\int_1^n \frac{r'(s)}{s}\, ds \tag{1.10}$$
is the enthalpy function. In the isothermal case, $h(n) = \log(n)$ holds; for isentropic states, we have $h(n) = (\beta/(\beta-1))(n^{\beta-1} - 1)$ for $\beta > 1$. Notice that the electric potential and the quantum Fermi potential are fixed only up to additional constants. Since $n > 0$, Eq. (1.9) implies
$$\frac{1}{2}|\nabla S|^2 + T_o h(n) - V - V_{ext} - \delta^2\, \frac{\Delta\sqrt{n}}{\sqrt{n}} + \alpha S = 0.$$
The integration constant can be assumed to be zero by choosing a reference point for the electric potential. For the analysis it is convenient to use $w = \sqrt{n}$ as a variable. Then (1.6), (1.8), and (1.9) can be written as
$$\delta^2 \Delta w = w\Big(\frac{1}{2}|\nabla S|^2 + T_o h(w^2) - V - V_{ext} + \alpha S\Big), \tag{1.11}$$
$$\operatorname{div}(w^2 \nabla S) = 0, \tag{1.12}$$
$$\lambda^2 \Delta V = w^2 - C \quad\text{in } \Omega. \tag{1.13}$$
Physically relevant boundary conditions for w, S, and V will be specified later. The fluid models (1.6)–(1.8) or (1.11)–(1.13) have been studied in some special situations. For vanishing convective and quantum terms the problem (1.6)–(1.8) is known as the isentropic drift-diffusion model used for semiconductor devices [17, 18, 24]. The
quantum drift-diffusion model (zero convective term, $\delta > 0$) has been investigated in [2]. The classical potential flow hydrodynamic equations ($\delta = 0$) are analyzed in, e.g. [5, 7, 9]. In the paper [29] the existence for the one-dimensional stationary quantum hydrodynamic equations (1.6)–(1.8) with non-standard boundary conditions is investigated. The steady-state system (1.11)–(1.13) in several space dimensions is studied here mathematically for the first time. In the analysis of (1.11)–(1.13), two main difficulties arise. The elliptic equation (1.12) is, a priori, of degenerate type with a non-standard (since non-local) degeneracy. We will show, however, that the solution $w$ is strictly positive and therefore, (1.12) becomes strictly elliptic. Every solution $(w, S, V)$ of (1.11)–(1.13) with positive $w$ is a solution of the problem (1.6)–(1.8) with $n = w^2$, $J = n\nabla S$. Another difficulty arises due to the term $|\nabla S|^2$ on the right-hand side of (1.11), stemming from the convective term in (1.7). This difficulty also appears in the thermistor problem (see [4, 27]). However, we have to apply techniques different from those used in the thermistor problem. To derive the boundary conditions we make physically relevant hypotheses. The boundary data are assumed to be the superposition of the thermal equilibrium functions $(n_{eq}, S_{eq}, V_{eq})$ and the applied potential $U(x)$:
$$n = n_{eq}, \qquad S = S_{eq} + U, \qquad V = V_{eq} + U \qquad\text{on } \partial\Omega.$$
The thermal equilibrium state is defined by $J = 0$ or, equivalently, $S = \text{const.}$ (as $n > 0$). By fixing the reference point for $S$ (and $S_{eq}$) we can suppose that $S_{eq} = 0$. We assume further that the total space charge $C - n_{eq}$ vanishes at the boundary and that no quantum effects occur on $\partial\Omega$, i.e. $\Delta\sqrt{n_{eq}}/\sqrt{n_{eq}} = 0$. Finally, $V_{ext} = 0$ on $\partial\Omega$, since $V_{ext}$ is introduced to model interior quantum wells. We get from (1.11)
$$0 = \frac{1}{2}|\nabla S_{eq}|^2 + T_o h(n_{eq}) - V_{eq} + \alpha S_{eq}$$
or, since $S_{eq} = 0$,
$$V_{eq} = T_o h(n_{eq}) \qquad\text{on } \partial\Omega.$$
Therefore we get the Dirichlet boundary conditions
$$w = w_o, \qquad S = S_o, \qquad V = V_o \qquad\text{on } \partial\Omega \tag{1.14}$$
with
$$w_o = \sqrt{C}, \qquad S_o = U, \qquad V_o = T_o h(C) + U. \tag{1.15}$$
It is the aim of this paper to show the existence and uniqueness of solutions to (1.11)–(1.14). More precisely, we prove in Sect. 2 that there exists a solution $(w, S, V)$ to (1.11)–(1.14) with $\nabla S \in L^\infty(\Omega)$ under the assumption that the temperature constant $T_o$ is large enough (isothermal and isentropic case) or that the boundary Fermi potential $S_o$ is small enough in some norm (isothermal case). This means that the electric energy which is connected with the applied potential $U$ (and hence with $S_o$) has to be much smaller than the thermal energy, in some sense. For the proof we first replace Eq. (1.12) by $\operatorname{div}(\max(m, w)^2 \nabla S) = 0$ ($m > 0$) which is uniformly elliptic. By means of Leray-Schauder's fixed point theorem, the existence of a solution to the truncated problem will be shown. For this solution the density $w$ turns out to be strictly positive. So we get a solution to the original problem (1.11)–(1.14) by choosing the truncation parameter $m > 0$ smaller than the lower bound of $w$.
We need the smallness assumption on the data in the proof of the positivity of $w$. We do not know if the existence of solutions can be proved without this assumption. In the stationary thermistor problem which is formally related to the quantum hydrodynamic model, it is well known that there exist solutions only if the applied potential is “small” enough (for the precise conditions see [27]). Furthermore, in the one-dimensional case it is possible to show the non-existence of solutions for “large” applied voltages [3, 4]. We recall that the thermistor problem reads
$$\operatorname{div}(k(w)\nabla w) = -\sigma(w)|\nabla S|^2, \qquad \operatorname{div}(\sigma(w)\nabla S) = 0,$$
where $w$ and $S$ have here the meaning of the temperature and electric potential, respectively. In the simulation of semiconductor tunneling devices where a variant of the presented quantum fluid model has been used, numerical results indicate that the density can be extremely small compared to, e.g., the boundary density, for values of the applied voltage $U$ far from the thermal equilibrium (e.g. $n_{min} = 10^{-4}$, $n|_{\partial\Omega} = 1$; see [10]). It is not clear if there is a lower bound for the density for all $U$ and if yes, how it can be controlled. The positivity property of $w$ is connected to the regularity for $S$. Indeed, we show that $w$ is strictly positive if and only if the gradient of $S$ is bounded (Sect. 3). For ultra-small devices, Eqs. (1.11)–(1.14) can be replaced asymptotically by a simplified system [19]. We show that there exists a solution of this (one-dimensional) system, where the density vanishes at some point. However, the solution is discontinuous and therefore, it is only defined in a generalized sense (see Sect. 3). There exists at most one solution to (1.11)–(1.14) if the scaled Planck constant $\delta$ is sufficiently large (Sect. 4). For $\delta = 0$, there exist situations where the problem has more than one solution [11].
2. Existence of Solutions
In this section we prove the existence of solutions to (1.11)–(1.14) with general Dirichlet boundary data. The following assumptions are needed:
(A1) $\Omega \subset \mathbb{R}^d$ ($d \ge 1$) is a bounded domain with boundary $\partial\Omega \in C^{1,1}$.
(A2) $h \in C^0(0, \infty)$ is a non-decreasing function satisfying
$$\lim_{x\to\infty} h(x) = +\infty, \qquad \lim_{x\to 0+} x\, h(x^2) < +\infty.$$
(A3) $w_o \in W^{2,p}(\Omega)$ for $p > d/2$, $\inf_{\partial\Omega} w_o > 0$; $S_o \in C^{1,\gamma}(\overline{\Omega})$ with $\gamma = 2 - d/p$; $V_o \in H^1(\Omega) \cap L^\infty(\Omega)$; $C, V_{ext} \in L^\infty(\Omega)$.
The constants $\alpha$, $\delta$, $\lambda$, and $T_o$ are assumed to be positive. We call a function $h \in C^0(0, \infty)$ satisfying (A2) isothermal if $h(0+) = -\infty$ and isentropic if $h(0+) < 0$. The enthalpy function $h(s) = \log(s)$ is isothermal. Furthermore, the enthalpy $h(s) = (\beta/(\beta-1))(s^{\beta-1} - 1)$ is isentropic. The main results of this section are the following theorems:
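Theorem 2.1. Let (A1)–(A3) hold and let $h$ be isothermal. Then there exists $\varepsilon > 0$ such that if $\|S_o\|_{C^{1,\gamma}(\overline{\Omega})} \le \varepsilon$ or $T_o \ge 1/\varepsilon$, then there exists a solution $(w, S, V)$ of (1.11)–(1.14) satisfying, for some $\underline{w} > 0$,
$$w \in W^{2,p}(\Omega), \qquad S \in C^{1,\gamma}(\overline{\Omega}), \qquad V \in H^1(\Omega) \cap L^\infty(\Omega), \tag{2.1}$$
$$w(x) \ge \underline{w} > 0 \quad\text{in } \Omega. \tag{2.2}$$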
Theorem 2.2. Let (A1)–(A3) hold and let $h$ be isentropic. Then there exists $\varepsilon > 0$ such that if $T_o \ge 1/\varepsilon$ then there exists a solution $(w, S, V)$ of (1.11)–(1.14) satisfying (2.1)–(2.2).
Notice that we are assuming boundary data which are independent of the parameter $T_o$. The case of the boundary functions (1.15) can also be treated, see Remark 2.5. First we prove that there exists a solution of a truncated system. For this, define $s_K = \max(0, \min(s, K))$ and $t_m(s) = \max(m, s)$ for $s \in \mathbb{R}$ and $0 < m \le K$. Throughout this section (A1)–(A3) are assumed to hold. Consider
$$\delta^2 \Delta w = w_K\Big(\frac{1}{2}|\nabla S|^2 + T_o h(w_K^2) - V - V_{ext} + \alpha S\Big), \tag{2.3}$$
$$\operatorname{div}(t_m(w_K)^2 \nabla S) = 0, \tag{2.4}$$
$$\lambda^2 \Delta V = w_K w - C \quad\text{in } \Omega, \tag{2.5}$$
$$w = w_o, \qquad S = S_o, \qquad V = V_o \qquad\text{on } \partial\Omega. \tag{2.6}$$
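The role of the two truncations can be illustrated with a few lines of code. The sketch below implements $s_K$ and $t_m$ exactly as defined above and checks that the coefficient $t_m(w_K)^2$ of (2.4) stays between $m^2$ and $K^2$ whatever the value of $w$, which is the uniform ellipticity used in the sequel. The function names and the sample values `m = 0.1`, `K = 5.0` are illustrative assumptions.

```python
def clamp(s, K):
    """s_K = max(0, min(s, K)): truncate s to the interval [0, K]."""
    return max(0.0, min(s, K))


def floor_at(s, m):
    """t_m(s) = max(m, s): bound s from below by m > 0."""
    return max(m, s)


def ellipticity_coefficient(w, m, K):
    """Coefficient t_m(w_K)^2 appearing in the truncated equation (2.4)."""
    return floor_at(clamp(w, K), m) ** 2


m, K = 0.1, 5.0   # assumed truncation parameters with 0 < m <= K
for w in (-1.0, 0.0, 0.05, 2.0, 9.0):
    coeff = ellipticity_coefficient(w, m, K)
    # The coefficient never leaves [m^2, K^2], even for negative or very large w.
    print(w, clamp(w, K), coeff, m ** 2 <= coeff <= K ** 2)
```

Once a solution is known to satisfy $m \le w \le K$, both truncations act as the identity on it, so the truncated and original equations coincide for that solution.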
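The proof of existence of solutions to this truncated system is based on the following a priori estimates.
Lemma 2.3. Let $(w, S, V)$ be a weak solution to (2.3)–(2.6). Then there exist constants $\overline{w}$, $\underline{S}$, $\overline{S}$, $\underline{V}$, $\overline{V}$, and $c_1(m)$ such that
$$0 \le w(x) \le \overline{w}, \qquad -\underline{S} \le S(x) \le \overline{S}, \qquad -\underline{V} \le V(x) \le \overline{V} \qquad\text{in } \Omega, \tag{2.7}$$
$$\|w\|_{2,p,\Omega} \le c_1(m). \tag{2.8}$$
Here, $\|\cdot\|_{2,p,\Omega}$ denotes the norm of the Sobolev space $W^{2,p}(\Omega)$. The precise dependence of the above bounds on the data is needed in the uniqueness proof in Sect. 4 and is stated here for future reference:
$$\underline{S} = -\inf_{\partial\Omega} S_o, \qquad \overline{S} = \sup_{\partial\Omega} S_o, \tag{2.9}$$
$$\overline{V} = \sup_{\partial\Omega} V_o + c(\Omega, d, \lambda)\|C\|_{0,\infty,\Omega}, \tag{2.10}$$
$$\overline{w} = \max\big(\|w_o\|_{0,\infty,\partial\Omega},\ w_1(\overline{V}, \underline{S}, T_o, h)\big), \tag{2.11}$$
$$\underline{V} = -\inf_{\partial\Omega} V_o + c(\Omega, d, \lambda)\big(\|C\|_{0,\infty,\Omega} + \overline{w}^2\big), \tag{2.12}$$
where $c(\Omega, d, \lambda) > 0$ and $w_1 = w_1(\overline{V}, \underline{S}, T_o, h) > 0$ is such that $h(w_1^2) \ge (\overline{V} + \|V_{ext}\|_{0,\infty,\Omega} + \alpha\underline{S})/T_o$.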
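For the two model enthalpies the threshold $w_1$ in (2.11) can be written down explicitly. In the sketch below, `R` stands for the combined bound $\overline{V} + \|V_{ext}\|_{0,\infty,\Omega} + \alpha\underline{S}$ and is assumed to be nonnegative; the function names and the numerical values are hypothetical and serve only to illustrate how $w_1$ grows as $T_o$ decreases.

```python
import math


def w1_isothermal(R, T_o):
    """Smallest w1 with log(w1**2) >= R / T_o, for h(s) = log(s)."""
    return math.exp(R / (2.0 * T_o))


def w1_isentropic(R, T_o, beta):
    """Smallest w1 with (beta/(beta-1)) * (w1**(2*(beta-1)) - 1) >= R / T_o."""
    base = 1.0 + (beta - 1.0) * R / (beta * T_o)   # positive because R >= 0 is assumed
    return base ** (1.0 / (2.0 * (beta - 1.0)))


R = 3.0       # assumed value of the combined bound on V, V_ext and alpha*S
beta = 1.5    # assumed isentropic exponent
for T_o in (4.0, 2.0, 1.0, 0.5):
    print(T_o, w1_isothermal(R, T_o), w1_isentropic(R, T_o, beta))
```

In both cases the printed threshold increases as `T_o` decreases, in line with the later remark that $\overline{w} \to \infty$ as $T_o \to 0+$.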
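Proof. First step. $L^\infty$ estimates for $w$, $S$, and $V$. First observe that, using $w^- = \min(0, w) \in H_0^1(\Omega)$ as test function in (2.3), it follows
$$w(x) \ge 0 \quad\text{a.e. in } \Omega. \tag{2.13}$$
The maximum principle gives the bounds
$$-\underline{S} = \inf_{\partial\Omega} S_o \le S(x) \le \sup_{\partial\Omega} S_o = \overline{S} \quad\text{in } \Omega. \tag{2.14}$$
Next we show that $V$ is uniformly bounded in $L^\infty(\Omega)$. Let $U_o = \sup_{\partial\Omega} V_o$, $U \ge U_o$, and take $(V - U)^+ = \max(0, V - U)$ as a test function in (2.5). Then
$$\lambda^2 \int_\Omega |\nabla(V - U)^+|^2 = -\int_\Omega w_K w\,(V - U)^+ + \int_\Omega C\,(V - U)^+ \tag{2.15}$$
$$\le \int_\Omega C\,(V - U)^+ \le c\,\|(V - U)^+\|_{1,2,\Omega}\,(\operatorname{meas}(V > U))^{1/2}.$$
Here and in the following, $c$, $c_i$ denote positive constants only depending on the given data. Let $r > 2$ be such that the embedding $H^1(\Omega) \hookrightarrow L^r(\Omega)$ is continuous. It is well known that for $W > U$,
$$(\operatorname{meas}(V > W))^{1/r}\,(W - U) \le c(\Omega)\,\|(V - U)^+\|_{1,2,\Omega}$$
holds [25, Ch. 4]. Therefore we get from (2.15), for $W > U \ge U_o$,
$$\operatorname{meas}(V > W) \le \frac{c}{(W - U)^r}\,(\operatorname{meas}(V > U))^{r/2}.$$
Since $r/2 > 1$, we can apply Stampacchia's Lemma (see [26, Ch. 2.3] or [25, Ch. 4]) to get
$$V(x) \le \overline{V} \overset{\text{def}}{=} U_o + c(\Omega, d, \lambda)\|C\|_{0,\infty,\Omega}, \tag{2.16}$$
where $c(\Omega, d, \lambda) > 0$. Before we can find a lower bound for $V$, we prove that $w$ is bounded from above (independently of $K$). For this set $\overline{V}_{ext} = \|V_{ext}\|_{0,\infty,\Omega}$, let $\overline{w} \ge \|w_o\|_{0,\infty,\partial\Omega}$ and $K > \overline{w}$ and use $(w - \overline{w})^+$ as a test function in (2.3):
$$\delta^2 \int_\Omega |\nabla(w - \overline{w})^+|^2 = -\frac{1}{2}\int_\Omega w_K (w - \overline{w})^+ |\nabla S|^2 - \int_\Omega w_K (w - \overline{w})^+\, T_o\big(h(w_K^2) - h(\overline{w}^2)\big)$$
$$\qquad + \int_\Omega w_K (w - \overline{w})^+ \big(V + V_{ext} - T_o h(\overline{w}^2) - \alpha S\big)$$
$$\le \int_\Omega w_K (w - \overline{w})^+ \big(\overline{V} + \overline{V}_{ext} - T_o h(\overline{w}^2) + \alpha\underline{S}\big),$$
using (A2), (2.14) and (2.16). Since $h(s) \to \infty$ as $s \to \infty$, there exists $\overline{w} \ge \|w_o\|_{0,\infty,\partial\Omega}$ such that $h(\overline{w}^2) \ge (\overline{V} + \overline{V}_{ext} + \alpha\underline{S})/T_o$. This implies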
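$$w(x) \le \overline{w} \quad\text{a.e. in } \Omega. \tag{2.17}$$
Now use $(-V - U)^+$ with $U \ge U_o = -\inf_{\partial\Omega} V_o$ as test function in (2.5) to get
$$\lambda^2 \int_\Omega |\nabla(-V - U)^+|^2 \le \int_\Omega (w_K w - C)(-V - U)^+ \le c \int_\Omega (-V - U)^+,$$
where $c > 0$ depends on $C$ and $\overline{w}$. Using Stampacchia's method as above allows us to conclude that
$$V(x) \ge -\underline{V} \overset{\text{def}}{=} -U_o - c(\Omega, d, \lambda)\big(\|C\|_{0,\infty,\Omega} + \overline{w}^2\big).$$
Second step. $H^1$ estimate for $w$. Use $w - w_o$ as test function in (2.3) to obtain
$$\delta^2 \int_\Omega |\nabla w|^2 = \delta^2 \int_\Omega \nabla w \cdot \nabla w_o - \frac{1}{2}\int_\Omega w_K w |\nabla S|^2 + \frac{1}{2}\int_\Omega w_K w_o |\nabla S|^2 \tag{2.18}$$
$$\qquad - T_o \int_\Omega w_K (w - w_o) h(w_K^2) + \int_\Omega w_K w V - \int_\Omega w_K w_o V + \int_\Omega w_K (w - w_o) V_{ext} - \alpha \int_\Omega w_K (w - w_o) S.$$
With the test functions $V - V_o$ and $S - S_o$ in (2.5), (2.4) respectively, we get on the one hand
$$\int_\Omega w_K w V = -\lambda^2 \int_\Omega |\nabla V|^2 + \lambda^2 \int_\Omega \nabla V \cdot \nabla V_o + \int_\Omega V_o w_K w + \int_\Omega C (V - V_o)$$
$$\le -\frac{\lambda^2}{2} \int_\Omega |\nabla V|^2 + \lambda^2 \int_\Omega |\nabla V_o|^2 + c \int_\Omega w_K w + c,$$
using Young's and Poincaré's inequalities; on the other hand, we have for $K > \overline{w}$,
$$m \int_\Omega w_K |\nabla S|^2 \le \int_\Omega t_m(w_K)^2 |\nabla S|^2 \le \int_\Omega t_m(w_K)^2 |\nabla S_o|^2 \le \int_\Omega \overline{w}^2 |\nabla S_o|^2.$$
Therefore we can estimate (2.18) as follows:
$$\frac{\delta^2}{2} \int_\Omega |\nabla w|^2 \le \frac{\delta^2}{2} \int_\Omega |\nabla w_o|^2 + \frac{\overline{w}^2}{m}\, c(w_o, S_o) - T_o \int_\Omega w_K w\, h(w_K^2) + c \int_\Omega |w_K h(w_K^2)|$$
$$\qquad - \frac{\lambda^2}{4} \int_\Omega |\nabla V|^2 + c(\lambda) \int_\Omega w_K^2 + c \int_\Omega w_K w + c \le c(m, \overline{w}).$$
Third step. $W^{2,p}$ estimate for $w$. The following elliptic estimate holds [15, Thm. 8.33 and 8.34]: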
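$$\|S\|_{C^{1,\varepsilon}(\overline{\Omega})} \le c_2 \|S_o\|_{C^{1,\varepsilon}(\overline{\Omega})} \quad\text{for all } 0 < \varepsilon \le \gamma, \tag{2.19}$$
where $c_2 > 0$ depends on $\Omega$, $d$, $m$, and the $C^{0,\varepsilon}(\overline{\Omega})$ norm of $t_m(w_K)^2$. It can be seen from the proof of this estimate that $c_2 = c_3(\Omega, d)\, c_4(m)\, \|t_m(w_K)^2\|_{C^{0,\varepsilon}(\overline{\Omega})}$. Furthermore, we have the elliptic estimate
$$\|w\|_{2,p,\Omega} \le c_5\Big(\|w_o\|_{2,p,\Omega} + \Big\|w\Big(\frac{1}{2}|\nabla S|^2 + h(w_K^2) - V - V_{ext} + \alpha S\Big)\Big\|_{0,p,\Omega}\Big),$$
where $c_5 > 0$ depends on $\Omega$, $d$ and $\delta$ [15, 9.15 and 9.17]. Hence, using (2.19) for $\varepsilon = \gamma/2$,
$$\delta^2 \|w\|_{2,p,\Omega} \le c\big(1 + \|S\|^2_{1,2p,\Omega}\big) \le c\big(1 + \|S\|^2_{C^{1,\gamma/2}(\overline{\Omega})}\big) \le c\big(1 + \|w^2\|^2_{C^{0,\gamma/2}(\overline{\Omega})}\big)$$
$$\le c\big(1 + \overline{w}^2 \|w\|^2_{C^{0,\gamma/2}(\overline{\Omega})}\big) \le c\big(1 + \|w\|_{C^{0,\gamma}(\overline{\Omega})}\big) \le \frac{\delta^2}{2} \|w\|_{2,p,\Omega} + c(\delta, m).$$
In the last step we have used the interpolation inequality
$$\|w\|_{C^{0,\gamma}(\overline{\Omega})} \le \varepsilon \|w\|_{2,p,\Omega} + c(\varepsilon)\|w\|_{0,\infty,\Omega},$$
which follows from the facts that the embedding $W^{2,p}(\Omega) \hookrightarrow C^{0,\gamma}(\overline{\Omega})$ is compact (since $p > d/2$) and the embedding $C^{0,\gamma}(\overline{\Omega}) \hookrightarrow L^\infty(\Omega)$ is continuous [28, p. 365]. The constant $c(\delta, m)$ in the above estimate depends on $\Omega$, $d$, $\delta$, $m$, $\overline{w}$, and $\overline{V}$. We obtain finally $\|w\|_{2,p,\Omega} \le 2c(\delta, m)/\delta^2 = c_1(m)$.
Lemma 2.4. There exists a solution $(w, S, V)$ of
$$\delta^2 \Delta w = w\Big(\frac{1}{2}|\nabla S|^2 + T_o h(w^2) - V - V_{ext} + \alpha S\Big), \tag{2.20}$$
$$\operatorname{div}(t_m(w)^2 \nabla S) = 0, \tag{2.21}$$
$$\lambda^2 \Delta V = w^2 - C \quad\text{in } \Omega, \tag{2.22}$$
$$w = w_o, \qquad S = S_o, \qquad V = V_o \qquad\text{on } \partial\Omega, \tag{2.23}$$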
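such that $w \in W^{2,p}(\Omega)$, $S \in C^{1,\gamma}(\overline{\Omega})$, $V \in H^1(\Omega) \cap L^\infty(\Omega)$, and $w(x) \ge 0$ in $\Omega$.
Proof. We use a fixed point argument. Let $u \in C^{0,\gamma}(\overline{\Omega})$. Let $V \in H^1(\Omega)$ be the unique solution of
$$\lambda^2 \Delta V = u_K u - C \quad\text{in } \Omega, \qquad V = V_o \quad\text{on } \partial\Omega,$$
and let $S \in H^1(\Omega)$ be the unique solution of
$$\operatorname{div}(t_m(u_K)^2 \nabla S) = 0 \quad\text{in } \Omega, \qquad S = S_o \quad\text{on } \partial\Omega.$$
As in the proof of Lemma 2.3, we see that $V \in L^\infty(\Omega)$. Since $t_m(u_K)^2$ is Hölder continuous of order $\gamma$, we get $S \in C^{1,\gamma}(\overline{\Omega})$ [15, Thm. 8.34]. Finally, let $w \in H^1(\Omega)$ be the unique solution of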
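$$\delta^2 \Delta w = \sigma u_K\Big(\frac{1}{2}|\nabla S|^2 + T_o h(u_K^2) - V - V_{ext} + \alpha S\Big) \quad\text{in } \Omega, \qquad w = \sigma w_o \quad\text{on } \partial\Omega,$$
with $\sigma \in [0, 1]$. Since the right-hand side of this elliptic problem lies in $L^\infty(\Omega)$, we conclude $w \in W^{2,p}(\Omega)$ and, since $p > d/2$, $w \in C^{0,\gamma}(\overline{\Omega})$. Thus the fixed point operator $T : C^{0,\gamma}(\overline{\Omega}) \times [0, 1] \to C^{0,\gamma}(\overline{\Omega})$, $(u, \sigma) \mapsto w$, is well defined. It holds $T(u, 0) = 0$ for $u \in C^{0,\gamma}(\overline{\Omega})$. Estimates similar to those in the proof of Lemma 2.3 give the bound $\|w\|_{2,p,\Omega} \le c$ for all $w \in C^{0,\gamma}(\overline{\Omega})$ satisfying $T(w, \sigma) = w$, where $c > 0$ is independent of $w$ and $\sigma$. Standard arguments show that $T$ is continuous and compact, noting the compactness of the embedding $W^{2,p}(\Omega) \hookrightarrow C^{0,\gamma}(\overline{\Omega})$. We can apply Leray-Schauder's fixed point theorem to get a solution $(w, S, V)$ of (2.3)–(2.6). Choosing $K > \overline{w}$ (see (2.7)), this triple is also a solution of (2.20)–(2.23).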
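Proof of Theorems 2.1 and 2.2. We rewrite the elliptic estimate (2.19) for $\varepsilon = \gamma$:
$$\|S\|_{C^{1,\gamma}(\overline{\Omega})} \le c_3(\Omega, d)\, c_4(m)\, \|t_m(w)^2\|_{C^{0,\gamma}(\overline{\Omega})}\, \|S_o\|_{C^{1,\gamma}(\overline{\Omega})}.$$
It holds $c_4(m) \to \infty$ as $m \to 0+$. Now,
$$\|t_m(w)^2\|_{C^{0,\gamma}(\overline{\Omega})} \le c(\overline{w})\|w\|_{C^{0,\gamma}(\overline{\Omega})} \le c(\overline{w})\|w\|_{2,p,\Omega} \le c_5.$$
From the proof of Lemma 2.3 it can be seen that $c_5 = c_6(\overline{w})\, c_7(m)$ with $c_6(\overline{w}) \to \infty$ as $\overline{w} \to \infty$ and $c_7(m) \to \infty$ as $m \to 0+$. The bound $\overline{w}$ depends on $T_o$ such that $\overline{w} \to \infty$ as $T_o \to 0+$ (see (2.11)). Thus we can write
$$\|S\|^2_{C^{1,\gamma}(\overline{\Omega})} \le \frac{c_0}{f(T_o)\, g(m)}\, \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})}, \tag{2.24}$$
where $f$ and $g$ are positive continuous non-decreasing functions in $[0, \infty)$ such that $f(T_o) \to 0$ as $T_o \to 0+$, $f(T_o) > 0$ as $T_o \to \infty$, and $g(m) \to 0$ as $m \to 0+$. The constant $c_0 > 0$ does not depend on $S_o$, $T_o$, or $m$. Let $0 < m < \inf_{\partial\Omega} w_o$ and take $(w - m)^- = \min(0, w - m)$ as test function in (2.20). Then, using (A2), (2.24), and (2.7),
$$\delta^2 \int_\Omega |\nabla(w - m)^-|^2 = -\int_\Omega w (w - m)^-\, T_o\big(h(w^2) - h(m^2)\big) - \int_\Omega w (w - m)^- \Big(\frac{1}{2}|\nabla S|^2 + T_o h(m^2) - V - V_{ext} + \alpha S\Big)$$
$$\le \int_\Omega w\big(-(w - m)^-\big)\Big(\frac{c_0 \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})}}{f(T_o)\, g(m)} + T_o h(m^2) + \underline{V} + \overline{V}_{ext} + \alpha\overline{S}\Big),$$
where $\overline{V}_{ext} = \|V_{ext}\|_{0,\infty,\Omega}$. The constant $c_8(T_o) \overset{\text{def}}{=} \underline{V} + \overline{V}_{ext} + \alpha\overline{S}$ depends on $T_o$ through $\underline{V}$ such that $c_8(T_o)$ can be taken to be non-increasing as $T_o$ increases (see (2.11)–(2.12)). Then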
$$\delta^2 \int_\Omega |\nabla(w - m)^-|^2 \le \Big(I_1 + \frac{T_o}{g(m)}\, I_2\Big) \int_\Omega w\big(-(w - m)^-\big), \tag{2.25}$$
where
$$I_1 = \frac{1}{2} T_o h(m^2) + c_8(T_o), \qquad I_2 = \frac{c_0}{T_o f(T_o)}\, \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})} + \frac{1}{2}\, g(m) h(m^2).$$
First case: Let $h$ be isothermal. For arbitrary $T_o > 0$, let $\underline{w} \in (0, \inf_{\partial\Omega} w_o)$ be such that $h(\underline{w}^2) \le -2c_8(T_o)/T_o$ (using (A2)). This implies, for $m = \underline{w}$, that $I_1 \le 0$. Set $A = -\frac{1}{2} g(\underline{w}) h(\underline{w}^2) > 0$ and $\varepsilon^2 = A T_o f(T_o)/c_0$. Then, for $m = \underline{w}$ and $\|S_o\|_{C^{1,\gamma}(\overline{\Omega})} \le \varepsilon$, we obtain
$$I_2 \le \frac{c_0}{T_o f(T_o)}\, \varepsilon^2 - A \le 0.$$
Taking into account (2.25) we conclude that $w \ge \underline{w}$ in $\Omega$. For arbitrary $S_o$, take $m = \underline{w} \in (0, \inf_{\partial\Omega} w_o)$ such that $h(\underline{w}^2) \le -2c_8(1)$ and let $A$ be defined as above. Choose $T_1 \ge 1$ such that $T_1 f(T_1) \ge c_0 \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})}/A$. Then we have for all $T_o \ge T_1$, since $T \mapsto c_8(T)/T$ is non-increasing,
$$h(\underline{w}^2) \le -2c_8(1) \le -2c_8(T_o)/T_o,$$
and hence $I_1 \le 0$. Since the function $T \mapsto T f(T)$ is increasing, we obtain
$$\frac{c_0}{T_o f(T_o)}\, \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})} \le \frac{c_0}{T_1 f(T_1)}\, \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})} \le A,$$
by definition of $T_1$. This implies $I_2 \le 0$ and $w \ge \underline{w}$ in $\Omega$.
Second case: Let $h$ be isentropic. Let $\underline{w} \in (0, \inf_{\partial\Omega} w_o)$ be such that $h(\underline{w}^2) < 0$, and let $T_2 \ge 1$ be such that $T_2 \ge -2c_8(1)/h(\underline{w}^2) > 0$ and $T_2 f(T_2) \ge c_0 \|S_o\|^2_{C^{1,\gamma}(\overline{\Omega})}/A$, where $A$ is defined as in the first case. Taking $m = \underline{w}$ and $T_o \ge T_2$, we get $I_1 \le 0$ and $I_2 \le 0$. We conclude the proof by taking the truncation parameter $m = \underline{w}$ in (2.21).
Remark 2.5. We have assumed that the boundary functions $w_o$, $S_o$, and $V_o$ do not depend on the parameters, e.g. $T_o$. However, if we take $V_o = T_o h(C) + U(x)$ (see (1.15)), the above arguments also apply. Indeed, let $C_o > 0$ be such that $h(C_o) = 0$ and choose a scaling of the variables and functions such that $\inf_{\partial\Omega} C \ge C_o$ (this does not affect $T_o$). Then, for isothermal or isentropic functions, $h(\inf_{\partial\Omega} C) \ge 0$. This implies $\underline{V} = -T_o \inf_{\partial\Omega} h(C) + U \le U$, and the constant $c_8(T_o)$ can be taken non-increasing as $T_o$ increases. Note that now $\overline{V}$ also depends on $T_o$, but in such a way that the property $\overline{w} \to \infty$ as $T_o \to 0+$ remains valid.
Remark 2.6. Using a relaxation scaling as in [23], i.e. defining the rescaled variables $\hat{n} = n$, $\hat{S} = \alpha S = S/\tau$, $\hat{V} = V$, where $\tau = 1/\alpha$ is the scaled relaxation time, we get from (1.11)–(1.12) the equations
$$\delta^2 \Delta\hat{w} = \hat{w}\Big(\frac{\tau^2}{2}|\nabla\hat{S}|^2 + T_o h(\hat{w}^2) - \hat{V} - V_{ext} + \hat{S}\Big), \qquad \operatorname{div}(\hat{w}^2 \nabla\hat{S}) = 0.$$
One may expect that the diffusive term $T_o h(\hat{w}^2)$ dominates the convective term $(\tau^2/2)|\nabla\hat{S}|^2$ for sufficiently small $\tau > 0$, which would give the existence of solutions by the presented method, for fixed $T_o$. However, we also have to transform the boundary function $\hat{S}_o = S_o/\tau = U/\tau$, and it is easy to see that then the convective term is not necessarily “small” for small relaxation times. Choosing different boundary conditions, namely $S_o = U/\alpha$, the above rescaling gives $\hat{S}_o = U$, and the estimates of the presented proofs lead to an existence result for sufficiently small $\tau > 0$ (see [8]).
Remark 2.7. It would be very interesting to study the small dispersion limit $\delta \to 0$ and the relaxation time limit $\tau \to 0$. However, the $W^{2,p}(\Omega)$ norm of $w$ and therefore, the lower bound $\underline{w}$ depend on $\delta$ such that $\underline{w} \to 0$ as $\delta \to 0$. Moreover, it seems difficult to identify the limits of the nonlinear functions. Concerning the relaxation time limit, it can be seen that $c_8(T_o) \to \infty$ as $\tau \to 0$ (see the proof of Theorems 2.1 and 2.2), and hence, $\underline{w} \to 0$ as $\tau \to 0$. Taking the boundary conditions discussed in Remark 2.6, we expect, however, that the limit $\tau \to 0$ can be performed (see [8]). For the small dispersion limit in thermal equilibrium states, we refer to [11]. The relaxation time limit $\tau \to 0$ of the hydrodynamic equations (i.e. $\delta = 0$ in (1.7)) is performed in [23].
3. Positivity and Non-Positivity Properties
We show in this section that the existence of a uniform lower bound for the density $w$ is related to the regularity of the gradient of $S$. Furthermore, we construct a generalized one-dimensional solution of a simplified problem, where the density $w$ vanishes at some point. For this solution, the Fermi potential $S$ is discontinuous. Let (A1)–(A3) hold and let $h$ be isothermal or isentropic.
Proposition 3.1. Let $(w, S, V) \in (H^1(\Omega) \cap L^\infty(\Omega))^3$ be a weak solution to (1.11)–(1.14) with $S \in W^{1,\infty}(\Omega)$. Then there exists $m > 0$ such that
$$w(x) \ge m > 0 \quad\text{in } \Omega.$$
Proof. First let $h$ be isentropic. Then the function $f = \frac{1}{2}|\nabla S|^2 + T_o h(w^2) - V - V_{ext} + \alpha S$ is bounded in $\Omega$. Since $w \ge 0$, we can apply Harnack's inequality [15, p. 199] to $\delta^2 \Delta w = wf$ to conclude that for all subsets $\omega \subset\subset \Omega$,
$$\sup_{\omega} w \le c(\omega) \inf_{\omega} w. \tag{3.1}$$
Now suppose that $w$ vanishes in some non-empty set $\omega_o \subset\subset \Omega$. Let $\omega_n \subset\subset \Omega$ be a sequence of sets with $\omega_o \subset \omega_n$ and $\omega_n \to \Omega$ as $n \to \infty$ in the set theoretic sense. Then (3.1) gives $w = 0$ in $\omega_n$ and, in the limit $n \to \infty$, $w = 0$ in $\Omega$. This contradicts the positivity of $w_o$ on $\partial\Omega$. If $h$ is isothermal, we proceed as in [2]. Consider $\omega_o = \{w = 0\} \subset \Omega$. Since $wf \in L^\infty(\Omega)$, $w$ is continuous, hence $\omega_o$ is relatively closed in $\Omega$. Suppose that $\omega_o$ is nonvoid and choose $x_o \in \omega_o$. Then $wf \le 0$ in a ball $B(x_o) \subset \Omega$ and $\Delta w \le 0$ in $B(x_o)$. As the function $w$ assumes its nonnegative infimum 0 in $B(x_o)$, it follows that $w = 0$ in
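$B(x_o)$. Thus $\omega_o$ is relatively open in $\Omega$. This implies $\omega_o = \Omega$ or $\omega_o = \emptyset$. By the positivity of $w_o$, we conclude that $w > 0$ in $\Omega$. The existence of a uniform lower bound $m > 0$ for $w$ now follows from the continuity of $w$ in $\overline{\Omega}$.
Corollary 3.2. Let $(w, S, V)$ be a weak solution to (1.11)–(1.14). Then
$$w(x) \ge m > 0 \quad\text{a.e. in } \Omega \qquad\text{if and only if}\qquad S \in W^{1,\infty}(\Omega).$$
Now we consider the following simplified system in $\Omega = (0, 1) \subset \mathbb{R}$:
$$\delta^2 w_{xx} = \frac{1}{2} w (S_x)^2 \quad\text{in } \Omega, \qquad w(0) = 1,\ w(1) = 1, \tag{3.2}$$
$$J_x = (w^2 S_x)_x = 0 \quad\text{in } \Omega, \qquad S(0) = 0,\ S(1) = U_o. \tag{3.3}$$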
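It can be seen that Eqs. (1.11)–(1.12) reduce to (3.2)–(3.3) for very small domains (after an appropriate asymptotic limit; see [19]). We only consider $U_o \in [0, \sqrt{2}\delta\pi]$. To solve (3.2)–(3.3) we have to distinguish the cases $U_o < \sqrt{2}\delta\pi$ and $U_o = \sqrt{2}\delta\pi$. We say that $(w, S) \in H^1(\Omega) \times L^\infty(\Omega)$ is a generalized solution to (3.2)–(3.3) with $S(1) = U_o$ if there exists a sequence of weak solutions $(w_\varepsilon, S_\varepsilon) \in (H^1(\Omega))^2$ of (3.2)–(3.3) with $S(1) = U_\varepsilon$ and $U_\varepsilon \to U_o$ as $\varepsilon \to 0$ such that
$$w = \lim_{\varepsilon\to 0} w_\varepsilon, \qquad S = \lim_{\varepsilon\to 0} S_\varepsilon \qquad\text{in the } L^2(\Omega) \text{ sense},$$
and for all $\phi \in H_0^1(\Omega)$ it holds
$$\lim_{\varepsilon\to 0} \delta^2 \int_\Omega (w_\varepsilon)_x \phi_x = -\lim_{\varepsilon\to 0} \frac{1}{2} \int_\Omega w_\varepsilon (S_\varepsilon)_x^2\, \phi, \qquad \lim_{\varepsilon\to 0} \int_\Omega w_\varepsilon^2 (S_\varepsilon)_x \phi_x = 0.$$
Proposition 3.3. (i) Let $0 \le U_o < \sqrt{2}\delta\pi$. Then there exists a smooth solution $(w, S) \in (C^2(\overline{\Omega}))^2$ of (3.2)–(3.3) such that
$$w(x) \ge c(U_o) > 0 \quad\text{in } \Omega.$$
(ii) If $U_o = \sqrt{2}\delta\pi$ then there exists a generalized solution $(w, S) \in H^1(\Omega) \times L^\infty(\Omega)$ of (3.2)–(3.3) such that $w(\frac{1}{2}) = 0$.
Proof. Let $U_o = \sqrt{2}\delta\pi$ and let $U_\varepsilon < \sqrt{2}\delta\pi$ be a sequence such that $U_\varepsilon \to U_o$ as $\varepsilon \to 0$. Set $\sigma_\varepsilon = U_\varepsilon/(\sqrt{2}\delta)$. A computation shows that
$$w_\varepsilon(x) = \big((1 - 2x)^2 + 2(1 + \cos\sigma_\varepsilon)\, x(1 - x)\big)^{1/2},$$
$$S_\varepsilon(x) = \sqrt{2}\delta \arccos\Big(\frac{1 - (1 - \cos\sigma_\varepsilon)x}{w_\varepsilon(x)}\Big), \qquad x \in [0, 1],$$
solve (3.2)–(3.3) with $S_\varepsilon(1) = U_\varepsilon$. Furthermore,
$$w_\varepsilon^2(x)(S_\varepsilon)_x(x) = \sqrt{2}\delta \sin\sigma_\varepsilon \tag{3.4}$$
and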
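$$w_\varepsilon(x) \ge \sqrt{\frac{1}{2}(1 + \cos\sigma_\varepsilon)} > 0 \quad\text{in } \Omega.$$
In the limit $\varepsilon \to 0$ we get $\cos\sigma_\varepsilon \to -1$ and
$$w_\varepsilon(x) \to w(x) = |1 - 2x| \quad\text{in } L^2(\Omega), \qquad S_\varepsilon(x) \to \sqrt{2}\delta\, H(x) \quad\text{in } L^2(\Omega) \qquad (\varepsilon \to 0),$$
where $H(x) = 0$ for $x \in (0, 1/2)$ and $H(x) = \pi$ for $x \in (1/2, 1)$. Taking into account (3.4) we obtain for all $\phi \in H_0^1(\Omega)$,
$$-\frac{1}{2} \int_\Omega w_\varepsilon (S_\varepsilon)_x^2\, \phi = \delta^2 \int_\Omega (w_\varepsilon)_x \phi_x \to \delta^2 \int_\Omega w_x \phi_x = -4\delta^2 \phi\Big(\frac{1}{2}\Big),$$
$$\int_\Omega w_\varepsilon^2 (S_\varepsilon)_x \phi_x = \sqrt{2}\delta \sin\sigma_\varepsilon \int_\Omega \phi_x \to 0 \qquad (\varepsilon \to 0).$$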
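The closed-form pair $(w_\varepsilon, S_\varepsilon)$ from the proof can be checked numerically. The sketch below evaluates the residual of (3.2) and the flux identity (3.4) with central differences at interior points, and also checks the boundary value $S_\varepsilon(1) = U_\varepsilon$ and the minimum of $w_\varepsilon$ at $x = 1/2$. The parameter values `delta = 1.0` and `sigma = 2.0`, the step size, and the helper names are assumptions chosen for illustration; this is a sanity check, not part of the proof.

```python
import math


def w_eps(x, sigma):
    """w_eps(x) from the proof of Proposition 3.3."""
    return math.sqrt((1 - 2 * x) ** 2 + 2 * (1 + math.cos(sigma)) * x * (1 - x))


def S_eps(x, sigma, delta):
    """S_eps(x) from the proof of Proposition 3.3."""
    ratio = (1 - (1 - math.cos(sigma)) * x) / w_eps(x, sigma)
    ratio = max(-1.0, min(1.0, ratio))   # guard arccos against rounding
    return math.sqrt(2) * delta * math.acos(ratio)


def S_x(x, sigma, delta, step=1e-4):
    """Central-difference approximation of the derivative of S_eps."""
    return (S_eps(x + step, sigma, delta) - S_eps(x - step, sigma, delta)) / (2 * step)


def residual_32(x, sigma, delta, step=1e-4):
    """Residual of (3.2): delta^2 * w_xx - (1/2) * w * (S_x)^2."""
    w_xx = (w_eps(x + step, sigma) - 2 * w_eps(x, sigma) + w_eps(x - step, sigma)) / step ** 2
    return delta ** 2 * w_xx - 0.5 * w_eps(x, sigma) * S_x(x, sigma, delta, step) ** 2


def flux(x, sigma, delta, step=1e-4):
    """Value of w^2 * S_x, to be compared with sqrt(2) * delta * sin(sigma) from (3.4)."""
    return w_eps(x, sigma) ** 2 * S_x(x, sigma, delta, step)


delta, sigma = 1.0, 2.0   # assumed values; sigma = U / (sqrt(2) * delta) lies in [0, pi)
expected_flux = math.sqrt(2) * delta * math.sin(sigma)
for k in range(1, 10):
    x = k / 10
    print(x, residual_32(x, sigma, delta), flux(x, sigma, delta) - expected_flux)

# Boundary value of S and the minimum of w at x = 1/2.
print(S_eps(1.0, sigma, delta) - math.sqrt(2) * delta * sigma)
print(w_eps(0.5, sigma) - math.sqrt(0.5 * (1 + math.cos(sigma))))
```

All printed quantities should be close to zero, up to finite-difference error. Moving `sigma` towards $\pi$ makes the minimum of `w_eps` approach zero, which is the degeneration described in part (ii).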
Therefore, $(w, S)$ is a generalized solution to (3.2)–(3.3).

4. Uniqueness of Solutions

Uniqueness of solutions follows under the assumption that the scaled Planck constant $\delta$ is large enough. If $\delta = 0$, there exists more than one solution of the thermal equilibrium state (i.e. $J = 0$; see [11]).

Theorem 4.1. Let (A1)–(A3) hold and let $h$ be isothermal or isentropic. Then there exists $\delta_o > 0$ such that if $\delta \ge \delta_o$, there exists at most one solution $(w, S, V)$ to (1.11)–(1.14) satisfying (2.1)–(2.2).

Proof. Let $(w_1, S_1, V_1)$ and $(w_2, S_2, V_2)$ be two solutions to (1.11)–(1.14) satisfying (2.1)–(2.2). Take $w_1 - w_2$ as a test function in the difference of the Eqs. (1.11) satisfied by $w_1$, $w_2$, respectively, to get
$$\begin{aligned}
\delta^2 \int |\nabla(w_1 - w_2)|^2 ={}& -\frac12 \int (w_1 |\nabla S_1|^2 - w_2 |\nabla S_2|^2)(w_1 - w_2) \\
&+ \int (w_1 V_1 - w_2 V_2)(w_1 - w_2) \\
&- T_o \int (w_1 h(w_1^2) - w_2 h(w_2^2))(w_1 - w_2) \\
&- \alpha \int (w_1 S_1 - w_2 S_2)(w_1 - w_2) \\
&+ \int V_{ext} (w_1 - w_2)^2 \\
={}& I_1 + \cdots + I_5.
\end{aligned} \tag{4.1}$$
The weak formulation of the difference of (1.12) for $S_1$, $S_2$, respectively, reads
$$\int w_1^2\, \nabla(S_1 - S_2) \cdot \nabla\phi = -\int (w_1^2 - w_2^2)\, \nabla S_2 \cdot \nabla\phi$$
for all $\phi \in H_0^1(\Omega)$. Taking $\phi = S_1 - S_2$ we obtain
$$\begin{aligned}
\underline{w}^2 \int |\nabla(S_1 - S_2)|^2 &\le \int w_1^2\, |\nabla(S_1 - S_2)|^2 \\
&= -\int (w_1^2 - w_2^2)\, \nabla S_2 \cdot \nabla(S_1 - S_2) \\
&\le 2\overline{w}\, \|w_1 - w_2\|_{0,2}\, \|\nabla S_2\|_{0,\infty}\, \|\nabla(S_1 - S_2)\|_{0,2},
\end{aligned}$$
which implies
$$\|\nabla(S_1 - S_2)\|_{0,2} \le (2\overline{w}/\underline{w}^2)\, \|w_1 - w_2\|_{0,2}\, \|\nabla S_2\|_{0,\infty}. \tag{4.2}$$
Now we are able to estimate I1 , . . . , I5 : Z 1 (w1 ∇(S1 − S2 ) · ∇(S1 + S2 ) + (w1 − w2 )|∇S2 |2 )(w1 − w2 ) I1 = − 2 ≤ (w/w)2 + 1 k∇S2 k0,∞ (k∇S1 k0,∞ + k∇S2 k0,∞ )kw1 − w2 k20,2 , using (4.2). The integral I2 is estimated by using (1.13): Z 1 ((w1 − w2 )2 (V1 + V2 ) + (w12 − w22 )(V1 − V2 )) I2 = 2 Z Z λ2 1 2 (w1 − w2 ) (V1 + V2 ) − |∇(V1 − V2 )|2 = 2 2 ≤ V kw1 − w2 k20,2 . The monotonicity of h implies Z I3 = −To (w1 (h(w12 ) − h(w22 ))(w1 − w2 ) + (w1 − w2 )2 h(w22 )) ≤ −To h(w2 )kw1 − w2 k20,2 . Finally, we can estimate the integral I4 employing Poincar´e’s inequality and (4.2): Z I4 = −α (w1 (S1 − S2 )(w1 − w2 ) + (w1 − w2 )2 S2 ) ≤ α c()(w/w)2 k∇S2 k0,∞ + S kw1 − w2 k20,2 . Let K = k∇S1 k0,∞ + k∇S2 k0,∞ . Then we get from (4.1), 2 w2 2 2 w 2 δ − 2K + 1 − V − V ext + To h(w ) − α 2 K + S kw1 − w2 k20,2 ≤ 0. w2 w (4.3) Only K depends on δ (via the W 2,p () norm of w; see the third step of the proof of Lemma 2.3) such that K remains bounded as δ → ∞. Therefore there exists δo > 0 such that if δ ≥ δo , then (4.3) implies kw1 − w2 k20,2 ≤ 0. Hence w1 = w2 in . Finally, we infer S1 = S2 from (4.2) and V1 = V2 from (1.13).
Remark 4.2. There exists at most one weak solution $(w, S, V)$ in the class of functions satisfying $w, V \in H^1(\Omega) \cap L^\infty(\Omega)$, $w(x) \ge m > 0$ in $\Omega$, and (only) $S \in W^{1,q}(\Omega)$, where $q = d$ if $d \ge 3$, $q > 2$ if $d = 2$ and $q = 2$ if $d = 1$, under the assumption that the scaled Planck constant $\delta > 0$ is large enough. The proof of this result is similar to the proof in [16].

Acknowledgement. This work was partially supported by the EC-TMR network # ERB-4061-PL97-0396, by the Deutscher Akademischer Austauschdienst (DAAD), and by the Deutsche Forschungsgemeinschaft, grant numbers MA1662/-1 and -2.
References
1. Ancona, M. and Iafrate, G.: Quantum correction to the equation of state of an electron gas in a semiconductor. Phys. Rev. B 39, 9536–9540 (1989)
2. Ben Abdallah, N. and Unterreiter, A.: On the stationary quantum drift-diffusion model. To appear in Math. Mod. Num. Anal. (1998)
3. Cimatti, G.: Remark on the existence and uniqueness for the thermistor problem under mixed boundary conditions. Quart. Appl. Math. 47, 117–121 (1989)
4. Cimatti, G. and Prodi, G.: Existence results for a nonlinear elliptic system modelling a temperature dependent electrical resistor. Ann. Mat. Pura Appl. 152, 227–237 (1988)
5. Degond, P. and Markowich, P.: A steady state potential flow model for semiconductors. Ann. Mat. Pura Appl. 165, 87–98 (1993)
6. Feynman, R.: Statistical Mechanics, A Set of Lectures. Frontiers in Physics. New York: W.A. Benjamin, 1972
7. Gamba, I.: Sharp uniform bounds for steady potential fluid-Poisson systems. Proc. Roy. Soc. Edinburgh Sect. A 127, 479–516 (1997)
8. Gamba, I., Gasser, I., and Jüngel, A.: In preparation. 1998
9. Gamba, I. and Morawetz, C.: A viscous approximation for a 2D steady semiconductor or transonic gas dynamic flow: Existence theorem for potential flow. Comm. Pure Appl. Math. 49, 999–1049 (1996)
10. Gardner, C.: The quantum hydrodynamic model for semiconductor devices. SIAM J. Appl. Math. 54, 409–427 (1994)
11. Gasser, I. and Jüngel, A.: The quantum hydrodynamic model for semiconductors in thermal equilibrium. ZAMP 48, 45–59 (1997)
12. Gasser, I. and Markowich, P.A.: Quantum hydrodynamics, Wigner transforms and the classical limit. Asympt. Anal. 14, 97–116 (1997)
13. Gasser, I., Markowich, P.A., and Ringhofer, C.: Closure conditions for classical and quantum moment hierarchies in the small temperature limit. Transp. Theory Stat. Phys. 25, 409–423 (1996)
14. Gasser, I., Markowich, P.A. and Unterreiter, A.: Quantum Hydrodynamics. Proceedings of the SPARCH GdR Conference, St. Malo, 1995
15. Gilbarg, D. and Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Berlin: Springer, 2nd edition, 1983
16. Howison, S., Rodrigues, J. and Shillor, M.: Stationary solutions to the thermistor problem. J. Math. Anal. Appl. 174, 573–588 (1993)
17. Jüngel, A.: On the existence and uniqueness of transient solutions of a degenerate nonlinear drift-diffusion model for semiconductors. Math. Models Meth. Appl. Sci. 4, 677–703 (1994)
18. Jüngel, A.: Asymptotic analysis of a semiconductor model based on Fermi-Dirac statistics. Math. Meth. Appl. Sci. 19, 401–424 (1996)
19. Jüngel, A.: A note on current-voltage characteristics from the quantum hydrodynamic equations for semiconductors. Appl. Math. Lett. 10, 29–34 (1997)
20. Kreuzer, H.: Nonequilibrium Thermodynamics and its Statistical Foundation. Oxford: Clarendon Press, 1981
21. Landau, L. and Lifschitz, E.: Lehrbuch der Theoretischen Physik. Volume III, Quantenmechanik. Berlin: Akademie-Verlag, 1985
22. Loffredo, M. and Morato, L.: On the creation of quantum vortex lines in rotating HeII. Il Nuovo Cimento 108B, 205–215 (1993)
23. Marcati, P. and Natalini, R.: Weak solutions to a hydrodynamic model for semiconductors and relaxation to the drift diffusion equations. Arch. Rat. Mech. Anal. 129, 129–145 (1995)
24. Markowich, P.A. and Unterreiter, A.: Vacuum solutions of the stationary drift-diffusion model. Ann. Sc. Norm. Sup. Pisa 20, 371–386 (1993)
25. Stampacchia, G.: Equations elliptiques du second ordre à coefficients discontinus. Les Presses de l'Université de Montréal, Canada, 1966
26. Troianiello, G.M.: Elliptic Differential Equations and Obstacle Problems. New York: Plenum Press, 1987
27. Xie, H. and Allegretto, W.: $C^\alpha(\overline{\Omega})$ solutions of a class of nonlinear degenerate elliptic systems arising in the thermistor problem. SIAM J. Math. Anal. 22, 1491–1499 (1990)
28. Zeidler, E.: Nonlinear Functional Analysis and its Applications, Vol. II. New York: Springer, 1990
29. Zhang, B. and Jerome, J.: On a steady-state quantum hydrodynamic model for semiconductors. Nonlin. Anal. 26, 845–856 (1996)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 194, 481 – 492 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Semi-Classical Approximation for Modular Operads
E. Getzler*
Max-Planck-Institut für Mathematik, Gottfried-Claren-Str. 26, D-53225 Bonn, Germany.
* Current address: Department of Mathematics, Northwestern University, Evanston, IL 60208-2730, USA. E-mail: [email protected]
Received: 6 February 1997 / Accepted: 20 February 1998

Abstract: We study the contribution of one-loop graphs (the semi-classical expansion) in the setting of modular operads. As an application, we calculate the Betti numbers of the Deligne-Mumford-Knudsen moduli spaces of stable curves of genus 1 with $n$ marked smooth points.

The semi-classical approximation is an explicit formula of mathematical physics for the sum of Feynman diagrams with a single circuit. In this paper, we study the same problem in the setting of modular operads [5]; instead of being a number, the interaction at a vertex of valence $n$ will be an $S_n$-module. The motivation for developing this theory was the desire to calculate the $S_n$-equivariant Hodge polynomials of the Deligne-Mumford-Knudsen moduli spaces $\overline{\mathcal{M}}_{1,n}$ of stable curves of genus 1 with $n$ marked smooth points. In performing these calculations, we use the formulas for the $S_n$-equivariant Serre characteristics of $\mathcal{M}_{0,n}$ and $\mathcal{M}_{1,n}$ derived in [1] and [3] respectively. A particular consequence of our calculations will be needed in [4] to find a relation among the codimension two cycles in $\overline{\mathcal{M}}_{1,4}$.

Theorem. The $S_4$-module $H^4(\overline{\mathcal{M}}_{1,4}, \mathbb{Q})$ is isomorphic to
$$V_{(4)} \otimes \mathbb{Q}^7 \oplus V_{(3,1)} \otimes \mathbb{Q}^4 \oplus V_{(2,2)} \otimes \mathbb{Q}^2.$$

1. Wick's Theorem and the Semi-Classical Approximation

Let $\Gamma_{g,n}$ be the small category whose objects are isomorphism classes of stable graphs $G$ of genus $g(G) = g$ with $n$ totally ordered legs [5], and whose morphisms are the
automorphisms: if $G \in \Gamma_{g,n}$, its automorphism group $\mathrm{Aut}(G)$ is the subset of the permutations of the flags which preserve all the data defining the stable graph, including the total ordering of the legs. Because of the stability condition, $\Gamma_{g,n}$ is a finite category. Define polynomials $\{Mv_{g,n} \mid 2(g - 1) + n > 0\}$ of a set of variables $\{v_{g,n} \mid 2(g - 1) + n > 0\}$ by the following formula:
$$Mv_{g,n} = \sum_{G \in \mathrm{Ob}\,\Gamma_{g,n}} \frac{1}{|\mathrm{Aut}(G)|} \prod_{v \in \mathrm{Vert}(G)} v_{g(v),n(v)}. \tag{1.1}$$
Introduce the sequences of generating functions
$$a_g(x) = \sum_{2(g-1)+n>0} v_{g,n} \frac{x^n}{n!} \qquad \text{and} \qquad b_g(x) = \sum_{2(g-1)+n>0} Mv_{g,n} \frac{x^n}{n!}.$$
Wick's theorem gives an integral formula for the generating functions $\{b_g\}$ in terms of $\{a_g\}$:
$$\sum_{g=0}^\infty b_g \hbar^{g-1} = \log \int_{-\infty}^\infty \exp\Bigl(\sum_{g=0}^\infty a_g \hbar^{g-1} - \frac{(x - \xi)^2}{2\hbar}\Bigr) \frac{dx}{\sqrt{2\pi\hbar}}.$$
As written, this is purely formal, since it involves the integration of a power series in $x$. It may be made rigorous by observing that the integral transform
$$f \longmapsto \int_{-\infty}^\infty f(\hbar, x)\, e^{-(x-\xi)^2/2\hbar} \frac{dx}{\sqrt{2\pi\hbar}}$$
induces a continuous linear map on the space of Laurent series $\mathbb{Q}((\hbar))[[x]]$ topologized by the powers of the ideal $(\hbar, x)$. The semi-classical expansion is a pair of formulas for $b_0$ and $b_1$ in terms of $a_0$ and $a_1$, which we now recall.

Definition 1.2. Let $R$ be a ring of characteristic zero. The Legendre transform $\mathcal{L}$ is the involution of the set $x^2/2 + x^3 R[[x]]$ characterized by the formula
$$(\mathcal{L}f) \circ f' + f = x f'.$$

Theorem 1.3. The series $x^2/2 + b_0$ is the Legendre transform of $x^2/2 - a_0$.

The first few coefficients of $b_0$ may be calculated, either from the definition of $Mv_{0,n}$ or from Theorem 1.3:
$$\begin{array}{c|l}
n & Mv_{0,n} \\ \hline
3 & v_{0,3} \\
4 & v_{0,4} + 3v_{0,3}^2 \\
5 & v_{0,5} + 10v_{0,4}v_{0,3} + 15v_{0,3}^3 \\
6 & v_{0,6} + 15v_{0,5}v_{0,3} + 10v_{0,4}^2 + 105v_{0,4}v_{0,3}^2 + 105v_{0,3}^4
\end{array}$$
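Theorem 1.3 can be turned into a short computation. The Python sketch below is an illustration added for the reader, not the author's method of calculation: the helper names (`mul`, `compose`, `inverse`, `genus0_polynomials`) and the truncation order `N` are assumptions of the example. It uses the fact that $(\mathcal{L}f)' \circ f' = x$, so the derivative of the Legendre transform of $f = x^2/2 - a_0$ is the compositional inverse of $f'$, and it evaluates $n!$ times the coefficients of $b_0$ for given numerical values of $v_{0,n}$.

```python
from fractions import Fraction
from math import factorial

N = 6  # all power series are truncated modulo x^(N+1)

def mul(a, b):
    """Product of two truncated series."""
    c = [Fraction(0)] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def compose(a, b):
    """a(b(x)) modulo x^(N+1); requires b[0] == 0 (Horner scheme)."""
    out = [Fraction(0)] * (N + 1)
    for coeff in reversed(a):
        out = mul(out, b)
        out[0] += coeff
    return out

def inverse(a):
    """Compositional inverse of a = x + O(x^2), via g = x - h(g) with a = x + h."""
    x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 1)
    h = [Fraction(0), Fraction(0)] + list(a[2:])
    g = x[:]
    for _ in range(N):  # each pass fixes at least one more coefficient
        hg = compose(h, g)
        g = [xi - hi for xi, hi in zip(x, hg)]
    return g

def genus0_polynomials(v):
    """v maps n = 3..N to a value of v_{0,n}; returns {n: M v_{0,n}}."""
    f = [Fraction(0)] * (N + 1)          # f = x^2/2 - a_0
    f[2] = Fraction(1, 2)
    for n in range(3, N + 1):
        f[n] = -Fraction(v[n], factorial(n))
    df = [k * f[k] for k in range(1, N + 1)] + [Fraction(0)]   # f'
    dg = inverse(df)                                           # (Lf)'
    g = [Fraction(0)] + [dg[k] / (k + 1) for k in range(N)]    # Lf = x^2/2 + b_0
    return {n: g[n] * factorial(n) for n in range(3, N + 1)}

print(genus0_polynomials({3: 1, 4: 1, 5: 1, 6: 1}))  # values 1, 4, 26, 236
```

With all $v_{0,n} = 1$ the output counts trees with $n$ labelled leaves, which agrees with summing the coefficients in each row of the table.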
We now come to the formula for b1 , known as the semi-classical approximation.
Theorem 1.4. The series $b_1$ and $a_1$ are related by the formula
$$b_1 = \bigl(a_1 - \tfrac12 \log(1 - a_0'')\bigr) \circ (x + b_0').$$

By the definition of the Legendre transform, we see that $(\mathcal{L}f)' \circ f' = x$. It follows that Theorem 1.4 is equivalent to the formula
$$b_1 \circ (x - a_0') = a_1 - \tfrac12 \log(1 - a_0'').$$
This formula expresses the fact that the stable graphs contributing to $b_1$ are obtained by attaching a forest whose vertices have genus 0 to two types of graphs:
(i) those with a single vertex of genus 1 (corresponding to the term $a_1$);
(ii) stable graphs with a single circuit, and all of whose vertices have genus 0; we call such a graph a necklace.
The presence of a logarithm in the term which contributes the necklaces is related to the fact that there are $(n - 1)!$ cyclic orders of $n$ objects. The first few coefficients of $b_1$ are also easily calculated:
$$\begin{array}{c|l}
n & Mv_{1,n} \\ \hline
1 & v_{1,1} + \tfrac12 v_{0,3} \\
2 & v_{1,2} + v_{1,1}v_{0,3} + \tfrac12\bigl(v_{0,4} + v_{0,3}^2\bigr) \\
3 & v_{1,3} + 3v_{1,2}v_{0,3} + v_{1,1}v_{0,4} + \tfrac12\bigl(v_{0,5} + 3v_{0,4}v_{0,3} + 2v_{0,3}^3\bigr) \\
4 & v_{1,4} + 6v_{1,3}v_{0,3} + 3v_{1,2}v_{0,4} + 15v_{1,2}v_{0,3}^2 + v_{1,1}v_{0,5} \\
  & \quad {} + \tfrac12\bigl(v_{0,6} + 4v_{0,5}v_{0,3} + 3v_{0,4}^2 + 12v_{0,4}v_{0,3}^2 + 6v_{0,3}^4\bigr)
\end{array}$$
2. The Semi-Classical Approximation for Modular Operads

In the theory of modular operads, one replaces the sequence of coefficients $\{v_{g,n}\}$ considered above by a stable $\mathbb{S}$-module, that is, a sequence of $S_n$-modules $\mathcal{V}((g, n))$. The analogue of (1.1) is the functor on stable $\mathbb{S}$-modules which sends $\mathcal{V}$ to
$$M\mathcal{V}((g, n)) = \operatorname*{colim}_{G \in \Gamma_{g,n}} \bigotimes_{v \in \mathrm{Vert}(G)} \mathcal{V}((g(v), n(v))). \tag{2.1}$$
Thus, the coefficients in (1.1) are promoted to vector spaces, the product to a tensor product, the sum over stable graphs to a direct sum, and the weight $|\mathrm{Aut}(G)|^{-1}$ to $\operatorname{colim}_{\mathrm{Aut}(G)}$, that is, the coinvariants with respect to the finite group $\mathrm{Aut}(G)$. Note that this definition makes sense in any symmetric monoidal category $\mathcal{C}$ with finite colimits. We will need the Peter-Weyl theorem to hold for actions of the symmetric group $S_n$ on $\mathcal{C}$; thus, we will suppose that $\mathcal{C}$ is additive over a ring of characteristic zero.

Definition 2.1. The characteristic $\mathrm{ch}_n(\mathcal{V})$ of an $S_n$-module is defined by the formula
$$\mathrm{ch}_n(\mathcal{V}) = \frac{1}{n!} \sum_{\sigma \in S_n} \mathrm{Tr}_\sigma(\mathcal{V})\, p_\sigma \in \Lambda_n \otimes K_0(\mathcal{C}),$$
where $p_\sigma$ is the product of power sums $p_{|O|}$ over the orbits $O$ of $\sigma$.
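Although this definition appears to require rational coefficients, this is an artifact of the use of the power sums $p_n$; it is shown in [2] that the characteristic is a symmetric function of degree $n$ with values in the Grothendieck group of the additive category $\mathcal{C}$. If $\mathrm{rk} : \Lambda \to \mathbb{Q}[x]$ is the homomorphism defined by $h_n \mapsto x^n/n!$, we have
$$\mathrm{rk}(\mathrm{ch}_n(\mathcal{V})) = [\mathcal{V}]/n! \in K_0(\mathcal{C}) \otimes \mathbb{Q}.$$
Note that $\mathrm{rk}(f)$ is obtained from $f$ by setting the power sums $p_n$ to 0 if $n > 1$, and to $x$ if $n = 1$. The place of the generating functions $a_g$ and $b_g$ is now taken by
$$\mathbf{a}_g = \sum_{2(g-1)+n>0} \mathrm{ch}_n(\mathcal{V}((g, n))) \in \Lambda \hat\otimes K_0(\mathcal{C}),$$
$$\mathbf{b}_g = \sum_{2(g-1)+n>0} \mathrm{ch}_n(M\mathcal{V}((g, n))) \in \Lambda \hat\otimes K_0(\mathcal{C}).$$
Theorem (8.13) of [5], whose statement we now recall, calculates $\mathbf{b}_g$ in terms of $\mathbf{a}_h$, $h \le g$. Let $\Delta$ be the "Laplacian" on $\Lambda((\hbar))$ given by the formula
$$\Delta = \sum_{n=1}^\infty \hbar^n \Bigl(\frac{n}{2} \frac{\partial^2}{\partial p_n^2} + \frac{\partial}{\partial p_{2n}}\Bigr).$$

Theorem 2.2. If $\mathcal{V}$ is a stable $\mathbb{S}$-module, then
$$\sum_{g=0}^\infty \mathbf{b}_g \hbar^{g-1} = \mathrm{Log}\Bigl(\exp(\Delta)\, \mathrm{Exp}\Bigl(\sum_{g=0}^\infty \mathbf{a}_g \hbar^{g-1}\Bigr)\Bigr).$$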
There is also a formula for b0 in terms of a0 . To state it, we must recall the definition of the Legendre transform for symmetric functions. Let ˆ 0 (C) = {f ∈ 3⊗K ˆ 0 (C) | rk(f ) = x2 /2 + O(x3 )}. 3∗ ⊗K If f is a symmetric function, let f 0 = ∂f /∂p1 ; this operation may be expressed more invariantly as p⊥ 1 (Ex. I.5.3, Macdonald [6]). ˆ 0 (C) characterized Definition 2.3. The Legendre transform L is the involution of 3∗ ⊗K by the formula (Lf ) ◦ f 0 + f = p1 f 0 . The Legendre transform Lf of a function f is characterized by the formula (Lf )0 ◦ f = x. For symmetric functions, although the analogue of this formula holds, in the form (Lf )0 ◦ f 0 = h1 , 0
the situation is not as simple, since there is no single notion of integral for symmetric functions (the “constant” term may be any function of the power sums pn , n > 1). Nevertheless, there is a simple algorithm for calculating Lf from f . Denote by fn and gn the coefficents of f and g = Lf lying in 3n ⊗ K0 (C). (i) The formula f 0 ◦ (Lf )0 = h1 may be rewritten as N X n=3
gn0 +
N X n=3
N −1 X fn0 ◦ h1 + gk0 ∼ =0 k=3
This gives a recursive procedure for calculating gn0 .
mod 3N ⊗ K0 (C).
(ii) Having determined $g'$, we obtain $g$ from the formula $f = \mathcal{L}g$, or $g = p_1 g' - f \circ g'$.

We now recall Theorem (7.17) of [5], which is the generalization to modular operads of Theorem 1.3.

Theorem 2.4. The symmetric function $h_2 + \mathbf{b}_0$ is the Legendre transform of $e_2 - \mathbf{a}_0$.

The main result of this paper is a formula for $\mathbf{b}_1$ in terms of $\mathbf{a}_1$ and $\mathbf{a}_0$, generalizing Theorem 1.4. If $f$ is a symmetric function, write $\dot f = \partial f/\partial p_2 = \tfrac12 p_2^\perp f$.

Theorem 2.5.
$$\mathbf{b}_1 = \Bigl(\mathbf{a}_1 - \frac12 \sum_{n=1}^\infty \frac{\phi(n)}{n} \log(1 - \psi_n(\mathbf{a}_0'')) + \frac{\dot{\mathbf{a}}_0(\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')}\Bigr) \circ (h_1 + \mathbf{b}_0').$$
Here, $\phi(n)$ is Euler's function, the number of prime residues modulo $n$.

Remark. The first two terms inside the parentheses on the right-hand side of Theorem 2.5 are analogues of the corresponding terms in the formula of Theorem 1.4. In particular, the second of these terms is closely related to the sum over necklaces in the definition of $M\mathcal{V}((1, n))$, as is seen from the formula
$$\sum_{n=1}^\infty \mathrm{ch}_n\bigl(\mathrm{Ind}_{\mathbb{Z}_n}^{S_n} 1\!\!1\bigr) = -\sum_{n=1}^\infty \frac{\phi(n)}{n} \log(1 - p_n).$$
The remaining term may be understood as a correction term, which takes into account the fact that necklaces of 1 or 2 vertices have non-trivial involutions (while those with more vertices do not). A proof of the theorem could no doubt be given using this observation; however, we prefer to derive it directly from Theorem 2.2.

If we take the plethysm on the right of the formula of Theorem 2.5 with the symmetric function $h_1 - \mathbf{a}_0'$, and apply the formula $(h_1 + \mathbf{b}_0') \circ (h_1 - \mathbf{a}_0') = h_1$, we obtain the equivalent formulation of this theorem:
$$\mathbf{b}_1 \circ (h_1 - \mathbf{a}_0') = \mathbf{a}_1 - \frac12 \sum_{n=1}^\infty \frac{\phi(n)}{n} \log(1 - \psi_n(\mathbf{a}_0'')) + \frac{\dot{\mathbf{a}}_0(\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')}.$$
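Proof of Theorem 2.5. The symmetric function $\mathbf{b}_1$ is a sum over graphs obtained by attaching forests whose vertices have genus 0 to either a vertex of genus 1, or to a necklace. In other words,
$$\mathbf{b}_1 = \bigl(\mathbf{a}_1 + \text{sum over necklaces}\bigr) \circ (h_1 + \mathbf{b}_0').$$
To prove the theorem, we must calculate the sum over necklaces. To do this, observe that a necklace is a graph with flags coloured red or blue, such that each vertex has exactly two red flags, each edge is red, and all tails are blue. Let $\mathcal{W}((n))$, $n \ge 1$, be the sequence of representations of $S_2 \times S_n$,
$$\mathcal{W}((n)) = \mathrm{Res}^{S_{n+2}}_{S_n \times S_2} \mathcal{V}((0, n + 2));$$
think of the first factor of the product $S_n \times S_2$ as acting on the blue flags at a vertex, and the second factor as acting on the red flags. Applying Theorem 2.2, we see that
$$\mathrm{Log}\bigl(\exp(1 \otimes \Delta)\, \mathrm{Exp}(\mathrm{Ch}(\mathcal{W}))\bigr) \in \Lambda \hat\otimes \Lambda \hat\otimes K_0(\mathcal{C})$$
is the sum over stable graphs all of whose edges are red. To impose the condition that all tails are blue, we set the variables $q_n$ to zero before taking the Logarithm.

We now proceed to the explicit calculation. We set $\hbar = 1$, since it plays no rôle when all graphs have genus 1. In writing elements of $\Lambda \hat\otimes \Lambda$, we will denote power sums in the first factor of $\Lambda$ by $p_n$, and in the second by $q_n$.

Lemma 2.6. The characteristic $\mathrm{Ch}(\mathcal{W})$ of $\mathcal{W}$ is the "bisymmetric" function
$$\mathrm{Ch}(\mathcal{W}) = \tfrac12 \mathbf{a}_0'' q_1^2 + \dot{\mathbf{a}}_0 q_2 \in \Lambda \hat\otimes \Lambda_2 \hat\otimes K_0(\mathcal{C}).$$

Proof. We have $\mathrm{Ch}(\mathcal{W}) = h_2^\perp \mathbf{a}_0 \otimes h_2 + e_2^\perp \mathbf{a}_0 \otimes e_2$. Expressing this in terms of power sums, we have
$$\begin{aligned}
h_2^\perp \mathbf{a}_0 \otimes h_2 + e_2^\perp \mathbf{a}_0 \otimes e_2 &= \tfrac12\bigl((p_1^\perp)^2 + p_2^\perp\bigr) \mathbf{a}_0 \otimes \tfrac12 (q_1^2 + q_2) + \tfrac12\bigl((p_1^\perp)^2 - p_2^\perp\bigr) \mathbf{a}_0 \otimes \tfrac12 (q_1^2 - q_2) \\
&= \tfrac12 (p_1^\perp)^2 \mathbf{a}_0 \otimes q_1^2 + \tfrac12 p_2^\perp \mathbf{a}_0 \otimes q_2.
\end{aligned}$$

From this lemma, it follows that
$$\mathrm{Exp}\,\mathrm{Ch}(\mathcal{W}) = \prod_{n=1}^\infty \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n}\Bigr) \prod_{n=1}^\infty \exp\Bigl(\psi_n(\dot{\mathbf{a}}_0) \frac{q_{2n}}{n}\Bigr) \in \Lambda \hat\otimes \Lambda \hat\otimes K_0(\mathcal{C}).$$
We now apply the heat kernel and separate variables:
$$\begin{aligned}
\exp(1 \otimes \Delta)\, \mathrm{Exp}\,\mathrm{Ch}(\mathcal{W})\big|_{q_n=0} ={}& \prod_{n \text{ odd}} \exp\Bigl(\frac{n}{2} \frac{\partial^2}{\partial q_n^2}\Bigr) \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n}\Bigr)\Big|_{q_n=0} \\
&\times \prod_{n \text{ even}} \exp\Bigl(\frac{\partial}{\partial q_n} + \frac{n}{2} \frac{\partial^2}{\partial q_n^2}\Bigr) \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n}\Bigr)\Big|_{q_n=0}.
\end{aligned}$$
We now insert the explicit formulas for the heat kernel of the Laplacian, namely
$$\exp\Bigl(\frac{n}{2} \frac{\partial^2}{\partial q_n^2}\Bigr) f(q_n)\Big|_{q_n=0} = \int_{-\infty}^\infty f(q_n) \exp\Bigl(-\frac{q_n^2}{2n}\Bigr) \frac{dq_n}{\sqrt{2\pi n}}.$$
For the odd variables, matters are quite straightforward:
$$\exp\Bigl(\frac{n}{2} \frac{\partial^2}{\partial q_n^2}\Bigr) \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n}\Bigr)\Big|_{q_n=0} = \int_{-\infty}^\infty \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} - \frac{q_n^2}{2n}\Bigr) \frac{dq_n}{\sqrt{2\pi n}} = \bigl(1 - \psi_n(\mathbf{a}_0'')\bigr)^{-1/2}.$$
For the even variables, things become a little more involved:
$$\begin{aligned}
\exp\Bigl(\frac{\partial}{\partial q_n} + \frac{n}{2} \frac{\partial^2}{\partial q_n^2}\Bigr) &\exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n}\Bigr)\Big|_{q_n=0} \\
&= \exp\Bigl(\frac{n}{2} \frac{\partial^2}{\partial q_n^2}\Bigr) \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n}\Bigr)\Big|_{q_n=1} \\
&= \int_{-\infty}^\infty \exp\Bigl(\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} + \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} - \frac{(q_n - 1)^2}{2n}\Bigr) \frac{dq_n}{\sqrt{2\pi n}}.
\end{aligned}$$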
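To perform this gaussian integral, we complete the square in the exponent:
$$\begin{aligned}
\psi_n(\mathbf{a}_0'') \frac{q_n^2}{2n} &+ \psi_{n/2}(\dot{\mathbf{a}}_0) \frac{2q_n}{n} - \frac{(q_n - 1)^2}{2n} \\
&= -\bigl(1 - \psi_n(\mathbf{a}_0'')\bigr) \frac{q_n^2}{2n} + \bigl(2\psi_{n/2}(\dot{\mathbf{a}}_0) + 1\bigr) \frac{q_n}{n} - \frac{1}{2n} \\
&= -\frac{1 - \psi_n(\mathbf{a}_0'')}{2n} \Bigl(q_n - \frac{2\psi_{n/2}(\dot{\mathbf{a}}_0) + 1}{1 - \psi_n(\mathbf{a}_0'')}\Bigr)^2 + \frac{2}{n} \frac{\psi_{n/2}(\dot{\mathbf{a}}_0)\bigl(\psi_{n/2}(\dot{\mathbf{a}}_0) + 1\bigr)}{1 - \psi_n(\mathbf{a}_0'')}.
\end{aligned}$$
Thus, the gaussian integral equals
$$\bigl(1 - \psi_n(\mathbf{a}_0'')\bigr)^{-1/2} \exp\Bigl(\frac{2}{n} \frac{\psi_{n/2}(\dot{\mathbf{a}}_0)\bigl(\psi_{n/2}(\dot{\mathbf{a}}_0) + 1\bigr)}{1 - \psi_n(\mathbf{a}_0'')}\Bigr).$$
Putting these calculations together, we see that
$$\begin{aligned}
\exp(1 \otimes \Delta)\, \mathrm{Exp}\,\mathrm{Ch}(\mathcal{W})\big|_{q_n=0} &= \prod_{n=1}^\infty \bigl(1 - \psi_n(\mathbf{a}_0'')\bigr)^{-1/2} \prod_{n=1}^\infty \exp\Bigl(\frac{1}{n} \frac{\psi_n(\dot{\mathbf{a}}_0)\bigl(\psi_n(\dot{\mathbf{a}}_0) + 1\bigr)}{1 - \psi_{2n}(\mathbf{a}_0'')}\Bigr) \\
&= \prod_{n=1}^\infty \bigl(1 - \psi_n(\mathbf{a}_0'')\bigr)^{-1/2}\, \mathrm{Exp}\Bigl(\frac{\dot{\mathbf{a}}_0(\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')}\Bigr),
\end{aligned}$$
and, applying the operation Log, that
$$\mathrm{Log}\bigl(\exp(1 \otimes \Delta)\, \mathrm{Exp}\,\mathrm{Ch}(\mathcal{W})\big|_{q_n=0}\bigr) = \mathrm{Log} \prod_{n=1}^\infty \bigl(1 - \psi_n(\mathbf{a}_0'')\bigr)^{-1/2} + \frac{\dot{\mathbf{a}}_0(\dot{\mathbf{a}}_0 + 1)}{1 - \psi_2(\mathbf{a}_0'')}.$$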
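The proof of the theorem is completed by the following lemma, applied to $f = 1 - \mathbf{a}_0''$.

Lemma 2.7. Let $f \in \Lambda \hat\otimes K_0(\mathcal{C})$ have constant term equal to 1; that is, $\mathrm{rk}(f) = 1 + O(x)$. Then
$$\mathrm{Log} \prod_{n=1}^\infty \psi_n(f)^{-1/2} = -\frac12 \sum_{n=1}^\infty \frac{\phi(n)}{n} \log(\psi_n(f)).$$

Proof. By definition,
$$\mathrm{Log} \prod_{n=1}^\infty \psi_n(f)^{-1/2} = \sum_{k=1}^\infty \frac{\mu(k)}{k} \log \prod_{n=1}^\infty \psi_{nk}(f)^{-1/2} = -\frac12 \sum_{n=1}^\infty \sum_{d|n} \frac{\mu(d)}{d} \log(\psi_n(f)).$$
The lemma follows from the formula
$$\sum_{d|n} \frac{\mu(d)}{d} = \frac{\phi(n)}{n},$$
which follows by Möbius inversion from $\sum_{d|n} \phi(d) = n$.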
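The two arithmetic identities used at the end of this proof are easy to confirm for small $n$. The following Python sketch is only an illustrative check, with helper names (`mobius`, `phi`, `divisors`, `identities_hold`) chosen for this example; it tests $\sum_{d|n} \mu(d)/d = \phi(n)/n$ and $\sum_{d|n} \phi(d) = n$ in exact rational arithmetic.

```python
from fractions import Fraction
from math import gcd

def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:   # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                # one remaining prime factor
        result = -result
    return result

def phi(n):
    """Euler's function: the number of residues modulo n prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def identities_hold(n):
    """Check sum_{d|n} mu(d)/d == phi(n)/n and sum_{d|n} phi(d) == n."""
    lhs = sum(Fraction(mobius(d), d) for d in divisors(n))
    return lhs == Fraction(phi(n), n) and sum(phi(d) for d in divisors(n)) == n

print(all(identities_hold(n) for n in range(1, 101)))
```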
Corollary 2.8. Define $a_g = \mathrm{rk}(\mathbf{a}_g)$, $b_g = \mathrm{rk}(\mathbf{b}_g)$, and $\dot a_0 = \mathrm{rk}(\dot{\mathbf{a}}_0)$. Then we have
$$b_1 \circ (x - a_0') = a_1 - \tfrac12 \log(1 - a_0'') + \dot a_0(\dot a_0 + 1).$$
Example 2.9. Suppose $\mathcal{V}((0, n)) = 1\!\!1$ is the trivial one-dimensional representation for all $n \ge 3$, while $\mathcal{V}((1, n)) = 0$. Then $M\mathcal{V}((1, n))$ is an $S_n$-module whose rank is the number of graphs in $\Gamma'_{1,n}$, where $\Gamma'_{1,n} \subset \Gamma_{1,n}$ is the subset of stable graphs all of whose vertices have genus 0. We have
$$\mathbf{a}_0 = \sum_{n=3}^\infty h_n = \exp\Bigl(\sum_{n=1}^\infty \frac{p_n}{n}\Bigr) - 1 - h_1 - h_2.$$
Theorem 2.5 leads to the following results; the calculations were performed using J. Stembridge's symmetric function package SF for Maple [7].
$$\begin{array}{c|l|r}
n & \mathrm{ch}_n(M\mathcal{V}((1, n))) & |\Gamma'_{1,n}| \\ \hline
1 & s_1 & 1 \\
2 & 3s_2 & 3 \\
3 & 7s_3 + 4s_{21} & 15 \\
4 & 20s_4 + 17s_{31} + 14s_{22} + 4s_{21^2} & 111 \\
5 & 52s_5 + 78s_{41} + 71s_{32} + 33s_{31^2} + 34s_{2^2 1} + 4s_{21^3} + s_{1^5} & 1104
\end{array}$$
An explicit formula for the generating function of the numbers $|\Gamma'_{1,n}|$ may be obtained from Corollary 2.8, using the formulas $a_0' = e^x - 1 - x$, $a_0'' = e^x - 1$ and $\dot a_0 = \tfrac12 (e^x - 1)$.

Proposition 2.10.
$$\sum_{n=1}^\infty |\Gamma'_{1,n}| \frac{x^n}{n!} = \Bigl(-\frac12 \log(2 - e^x) + \frac14 (e^{2x} - 1)\Bigr) \circ (1 + 2x - e^x)^{-1}.$$
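The generating function of Proposition 2.10 can be expanded with exact rational arithmetic and compared with the last column of the table in Example 2.9. The Python sketch below is an illustration only; the helper names and the truncation order `N` are assumptions of the example, and $(1 + 2x - e^x)^{-1}$ is read as the compositional inverse, as in the proposition.

```python
from fractions import Fraction
from math import factorial

N = 5  # all power series are truncated modulo x^(N+1)

def mul(a, b):
    """Product of two truncated series."""
    c = [Fraction(0)] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def compose(a, b):
    """a(b(x)) modulo x^(N+1); requires b[0] == 0 (Horner scheme)."""
    out = [Fraction(0)] * (N + 1)
    for coeff in reversed(a):
        out = mul(out, b)
        out[0] += coeff
    return out

def inverse(a):
    """Compositional inverse of a = x + O(x^2), via g = x - h(g) with a = x + h."""
    x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 1)
    h = [Fraction(0), Fraction(0)] + list(a[2:])
    g = x[:]
    for _ in range(N):
        hg = compose(h, g)
        g = [xi - hi for xi, hi in zip(x, hg)]
    return g

# e^x - 1 and -log(1 - t) as truncated series
exp_m1 = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N + 1)]
neg_log1m = [Fraction(0)] + [Fraction(1, k) for k in range(1, N + 1)]

# F(x) = -1/2 log(2 - e^x) + 1/4 (e^{2x} - 1), using 2 - e^x = 1 - (e^x - 1)
log_part = compose(neg_log1m, exp_m1)
F = [log_part[k] / 2 + (Fraction(2 ** k, 4 * factorial(k)) if k else 0)
     for k in range(N + 1)]

# u(x) = 1 + 2x - e^x = x - x^2/2! - x^3/3! - ...
u = [Fraction(0), Fraction(1)] + [-Fraction(1, factorial(k)) for k in range(2, N + 1)]

b1 = compose(F, inverse(u))
counts = [b1[n] * factorial(n) for n in range(1, N + 1)]
print(counts)  # values 1, 3, 15, 111, 1104
```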
3. The $S_n$-Equivariant Hodge Polynomial of $\overline{\mathcal{M}}_{1,n}$

A more interesting application of Theorem 2.5 is to the stable $\mathbb{S}$-module in the category of $\mathbb{Z}$-graded mixed Hodge structures
$$\mathcal{V}((g, n)) = H_c^\bullet(\mathcal{M}_{g,n}, \mathbb{C}).$$
Let KHM be the Grothendieck group of mixed Hodge structures. The $S_n$-equivariant Serre characteristic $e^{S_n}(\mathcal{M}_{g,n})$ is by definition the characteristic $\mathrm{ch}_n(\mathcal{V}((g, n))) \in \Lambda_n \otimes \mathrm{KHM}$. It follows from the usual properties of Serre characteristics (see [2] or Proposition (6.11) of [5]) that $\mathrm{ch}_n(M\mathcal{V}((g, n)))$ is the $S_n$-equivariant Serre characteristic $e^{S_n}(\overline{\mathcal{M}}_{g,n})$ of the moduli space $\overline{\mathcal{M}}_{g,n}$ of stable curves. Since the moduli space $\overline{\mathcal{M}}_{g,n}$ is a complete smooth Deligne-Mumford stack, its $k$-th cohomology group carries a pure Hodge structure of weight $k$; thus, the Hodge polynomial of $\overline{\mathcal{M}}_{g,n}$ may be extracted from $e^{S_n}(\overline{\mathcal{M}}_{g,n})$.

Using Theorem 2.5, we will calculate the Serre characteristics $e^{S_n}(\overline{\mathcal{M}}_{1,n})$. It is shown in [1] (see also [2]) that
$$\mathbf{a}_0 = \sum_{n=3}^\infty e^{S_n}(\mathcal{M}_{0,n}) = \frac{\prod_{n=1}^\infty (1 + p_n)^{\frac1n \sum_{d|n} \mu(n/d)(1 + L^d)} - 1}{L^3 - L} - \frac{h_1}{L^2 - L} - \frac{h_2}{L + 1},$$
where $L$ is the pure Hodge structure $\mathbb{C}(-1)$ of weight 2. Theorem 2.4 implies that
$$h_2 + \mathbf{b}_0 = h_2 + \sum_{n=3}^\infty e^{S_n}(\overline{\mathcal{M}}_{0,n})$$
is the Legendre transform of $e_2 - \mathbf{a}_0$; this was used in [1] to calculate $e^{S_n}(\overline{\mathcal{M}}_{0,n})$.

Let $\mathsf{S}_{2k+2}$ be the pure Hodge structure $\mathrm{gr}^W_{2k+1} H_c^1(\mathcal{M}_{1,1}, \mathrm{Sym}^{2k} \mathsf{H})$, where $\mathsf{H}$ is the local system $R^1\pi_*\mathbb{Q}$ of rank 2 over the moduli stack of elliptic curves. (Here, $\pi : \overline{\mathcal{M}}_{1,2} \to \overline{\mathcal{M}}_{1,1}$ is the universal elliptic curve.) This Hodge structure has the following properties:
(i) $\mathsf{S}_{2k+2} = F^0\mathsf{S}_{2k+2} \oplus \overline{F^0\mathsf{S}_{2k+2}}$;
(ii) there is a natural isomorphism between $F^0\mathsf{S}_{2k+2}$ and the space of cusp forms $S_{2k+2}$ for the full modular group $\mathrm{SL}(2, \mathbb{Z})$. (In particular, $\mathsf{S}_{2k+2} = 0$ for $k \le 4$.)
It is shown in [3] that
$$\mathbf{a}_1 = \sum_{n=1}^\infty e^{S_n}(\mathcal{M}_{1,n}) = \mathrm{res}_0\Biggl[\frac{\prod_{n=1}^\infty (1 + p_n)^{\frac1n \sum_{d|n} \mu(n/d)(1 - \omega^d - L^d/\omega^d + L^d)} - 1}{1 - \omega - L/\omega + L} \times \Bigl(\sum_{k=1}^\infty \frac{\mathsf{S}_{2k+2} + 1}{L^{2k+1}}\, \omega^{2k} - 1\Bigr) (\omega - L/\omega)\, d\omega\Biggr],$$
where $\mathrm{res}_0[\alpha]$ is the residue of the one-form $\alpha$ at the origin.

We may now apply Theorem 2.5 to calculate the generating function of the $S_n$-equivariant Serre characteristics $e^{S_n}(\overline{\mathcal{M}}_{1,n})$. We do not give the details, since they are quite straightforward, though the resulting formulas are tremendously complicated when written out in full. However, we do present some sample calculations, performed with the package SF.
$$\begin{array}{c|l|r}
n & e^{S_n}(\overline{\mathcal{M}}_{1,n}) & \chi(\overline{\mathcal{M}}_{1,n}) \\ \hline
1 & (L + 1)s_1 & 2 \\
2 & (L^2 + 2L + 1)s_2 & 4 \\
3 & (L^3 + 3L^2 + 3L + 1)s_3 + (L^2 + L)s_{21} & 12 \\
4 & (L^4 + 4L^3 + 7L^2 + 4L + 1)s_4 + (2L^3 + 4L^2 + 2L)s_{31} + (L^3 + 2L^2 + L)s_{22} & 49 \\
5 & (L^5 + 5L^4 + 12L^3 + 12L^2 + 5L + 1)s_5 + (3L^4 + 11L^3 + 11L^2 + 3L)s_{41} & 260 \\
  & \quad {} + (2L^4 + 7L^3 + 7L^2 + 2L)s_{32} + (L^3 + L^2)(s_{31^2} + s_{2^2 1}) &
\end{array}$$

At the end of the paper, we give a table of the non-equivariant Serre characteristics of $\overline{\mathcal{M}}_{1,n}$ for $n \le 15$; these give an idea of the way in which the Hodge structures $\mathsf{S}_{2k+2}$ typically enter into the cohomology. In particular, we see that the even-dimensional cohomology of the moduli spaces $\overline{\mathcal{M}}_{1,n}$ is spanned by Hodge structures of the form $\mathbb{Q}(\ell)$, while the odd-dimensional cohomology is spanned by Hodge structures of the form $\mathsf{S}_{2k+2}(\ell)$. The rational cohomology groups of $\overline{\mathcal{M}}_{1,n}$ satisfy Poincaré duality: there is a non-degenerate $S_n$-equivariant pairing of Hodge structures
$$H^k(\overline{\mathcal{M}}_{1,n}, \mathbb{Q}) \otimes H^{2n-k}(\overline{\mathcal{M}}_{1,n}, \mathbb{Q}) \longrightarrow \mathbb{Q}(-n).$$
Unfortunately, our formula for $e^{S_n}(\overline{\mathcal{M}}_{1,n})$ does not render this duality manifest.
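The sample values of $e^{S_n}$ tabulated in this section can be cross-checked against the Euler characteristic column and against Poincaré duality. The Python sketch below is an illustration under stated assumptions: the dictionary `TABLE` is a hand transcription of the table for $n \le 5$, the names `dim`, `euler_characteristic` and `is_palindromic` are invented for this example, and the dimension of the irreducible $S_n$-module attached to $s_\lambda$ is computed with the standard hook length formula. Setting $L = 1$ gives the Euler characteristic, and duality shows up as the coefficient of $L^k$ equalling that of $L^{n-k}$.

```python
from math import factorial

def dim(partition):
    """Dimension of the irreducible S_n-module for a partition (hook length formula)."""
    n = sum(partition)
    # conj[j] = number of rows of length > j, i.e. the length of column j
    conj = [sum(1 for part in partition if part > j) for j in range(partition[0])]
    hooks = 1
    for i, row in enumerate(partition):
        for j in range(row):
            hooks *= (row - j) + (conj[j] - i) - 1  # arm + leg + 1
    return factorial(n) // hooks

# TABLE[n][partition] = coefficients of L^0, L^1, ..., L^n multiplying s_partition
TABLE = {
    1: {(1,): [1, 1]},
    2: {(2,): [1, 2, 1]},
    3: {(3,): [1, 3, 3, 1], (2, 1): [0, 1, 1, 0]},
    4: {(4,): [1, 4, 7, 4, 1], (3, 1): [0, 2, 4, 2, 0], (2, 2): [0, 1, 2, 1, 0]},
    5: {(5,): [1, 5, 12, 12, 5, 1], (4, 1): [0, 3, 11, 11, 3, 0],
        (3, 2): [0, 2, 7, 7, 2, 0], (3, 1, 1): [0, 0, 1, 1, 0, 0],
        (2, 2, 1): [0, 0, 1, 1, 0, 0]},
}

def euler_characteristic(n):
    """Set L = 1 and replace each s_lambda by the dimension of its module."""
    return sum(dim(lam) * sum(coeffs) for lam, coeffs in TABLE[n].items())

def is_palindromic(n):
    """Poincare duality: coefficient of L^k equals coefficient of L^(n-k)."""
    return all(coeffs == coeffs[::-1] for coeffs in TABLE[n].values())

print([euler_characteristic(n) for n in range(1, 6)])  # 2, 4, 12, 49, 260
print(all(is_palindromic(n) for n in range(1, 6)))
```

The coefficients of $L^2$ in the row $n = 4$ are 7, 4 and 2, which are the multiplicities in the theorem on $H^4(\overline{\mathcal{M}}_{1,4}, \mathbb{Q})$ stated in the introduction.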
4. The Euler Characteristic of $\overline{\mathcal{M}}_{1,n}$

As an application of Corollary 2.8, we give an explicit formula for the generating function of the Euler characteristics $\chi(\overline{\mathcal{M}}_{1,n})$.

Theorem 4.1. Let $g(x) \in x + x^2\mathbb{Q}[[x]]$ be the solution of the equation
$$2g(x) - (1 + g(x)) \log(1 + g(x)) = x.$$
Then
$$\sum_{n=1}^\infty \chi(\overline{\mathcal{M}}_{1,n}) \frac{x^n}{n!} = -\frac{1}{12} \log(1 + g(x)) - \frac12 \log\bigl(1 - \log(1 + g(x))\bigr) + \epsilon(g(x)),$$
where
$$\epsilon(x) = \frac{1}{12}\bigl(19x + 23x^2/2 + 10x^3/3 + x^4/2\bigr).$$
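The formula of Theorem 4.1 can be expanded term by term with exact rational arithmetic. The following Python sketch is illustrative only; the helper names and the truncation order `N` are assumptions of the example, and the correction polynomial is called `eps` here. It solves $2g - (1 + g)\log(1 + g) = x$ for $g$ as a power series and reads off $\chi(\overline{\mathcal{M}}_{1,n})$ for $n \le 5$.

```python
from fractions import Fraction
from math import factorial

N = 5  # all power series are truncated modulo x^(N+1)

def mul(a, b):
    """Product of two truncated series."""
    c = [Fraction(0)] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def compose(a, b):
    """a(b(x)) modulo x^(N+1); requires b[0] == 0 (Horner scheme)."""
    out = [Fraction(0)] * (N + 1)
    for coeff in reversed(a):
        out = mul(out, b)
        out[0] += coeff
    return out

def inverse(a):
    """Compositional inverse of a = x + O(x^2), via g = x - h(g) with a = x + h."""
    x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 1)
    h = [Fraction(0), Fraction(0)] + list(a[2:])
    g = x[:]
    for _ in range(N):
        hg = compose(h, g)
        g = [xi - hi for xi, hi in zip(x, hg)]
    return g

# log(1 + x) and -log(1 - t) as truncated series
log1p = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N + 1)]
neg_log1m = [Fraction(0)] + [Fraction(1, k) for k in range(1, N + 1)]

# u(x) = 2x - (1 + x) log(1 + x); g is its compositional inverse
one_plus_x = [Fraction(1), Fraction(1)] + [Fraction(0)] * (N - 1)
prod = mul(one_plus_x, log1p)
u = [(2 if k == 1 else 0) - prod[k] for k in range(N + 1)]
g = inverse(u)

# eps(x) = (19x + 23x^2/2 + 10x^3/3 + x^4/2) / 12
eps = [Fraction(0), Fraction(19, 12), Fraction(23, 24), Fraction(10, 36),
       Fraction(1, 24), Fraction(0)]

ell = compose(log1p, g)               # log(1 + g(x))
half_log = compose(neg_log1m, ell)    # -log(1 - log(1 + g(x)))
eps_g = compose(eps, g)
series = [-ell[k] / 12 + half_log[k] / 2 + eps_g[k] for k in range(N + 1)]

chi = [series[n] * factorial(n) for n in range(1, N + 1)]
print(chi)  # values 2, 4, 12, 49, 260
```

The coefficients of $g$ itself are $\chi(\overline{\mathcal{M}}_{0,n+1})/n!$, since $g(x) = x + b_0'(x)$; for instance the sketch gives $g = x + x^2/2 + x^3/3 + 7x^4/24 + \dots$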
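Proof. We apply Corollary 2.8 with the data
$$a_0' = \sum_{n=2}^\infty \chi(\mathcal{M}_{0,n+1}) \frac{x^n}{n!} = \sum_{n=2}^\infty (-1)^n (n - 2)! \frac{x^n}{n!} = (1 + x) \log(1 + x) - x,$$
$$a_0'' = \log(1 + x), \qquad \dot a_0 = \tfrac14 x(x + 2),$$
$$\begin{aligned}
a_1 &= \chi(\mathcal{M}_{1,1})\, x + \chi(\mathcal{M}_{1,2}) \frac{x^2}{2} + \chi(\mathcal{M}_{1,3}) \frac{x^3}{6} + \chi(\mathcal{M}_{1,4}) \frac{x^4}{24} + \frac{1}{12} \sum_{n=5}^\infty (-1)^n (n - 1)! \frac{x^n}{n!} \\
&= x + \frac{x^2}{2} - \frac{1}{12} \log(1 + x) + \frac{1}{12}\Bigl(x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4}\Bigr),
\end{aligned}$$
where we have used that $\chi(\mathcal{M}_{1,1}) = \chi(\mathcal{M}_{1,2}) = 1$ and $\chi(\mathcal{M}_{1,3}) = \chi(\mathcal{M}_{1,4}) = 0$. The function $g(x)$ of the statement of the theorem is $x + b_0'(x)$.

The following corollary was shown to us by D. Zagier.

Corollary 4.2.
$$\chi(\overline{\mathcal{M}}_{1,n}) \sim \frac{(n - 1)!}{4(e - 2)^n} \Bigl(1 + C n^{-1/2} + O\bigl(n^{-3/2}\bigr)\Bigr),$$
where
$$C = \sqrt{\frac{e - 2}{18\pi e}}\, (1 + 4e + 9e^2 + 4e^3 + 2e^4) \approx 18.31398807.$$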
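The numerical value of the constant $C$ in Corollary 4.2 is straightforward to reproduce. The short Python sketch below is only an arithmetic check; the variable names and the helper `leading_term` are assumptions of this example and do not come from the paper.

```python
import math

e = math.e

# square-root prefactor and polynomial factor of C
prefactor = math.sqrt((e - 2) / (18 * math.pi * e))
polynomial = 1 + 4 * e + 9 * e ** 2 + 4 * e ** 3 + 2 * e ** 4
C = prefactor * polynomial

def leading_term(n):
    """Leading factor (n - 1)! / (4 (e - 2)^n) of the asymptotic formula."""
    return math.factorial(n - 1) / (4 * (e - 2) ** n)

print(prefactor)  # approximately 0.0683579
print(C)          # approximately 18.31399
```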
Proof. Analytically continue $g(x)$ to the domain $\mathbb{C} \setminus [e - 2, \infty)$. The resulting function has an asymptotic expansion of the form
$$g(x) \sim e - 1 - \sqrt{2e(e - 2 - x)} + \sum_{k=2}^\infty a_k (e - 2 - x)^{k/2}.$$
The asymptotics (4.2) follow by applying Cauchy's integral formula to the right-hand side of Theorem 4.1, with contour the circle $|x| = e - 2$.

The peculiar polynomial $\epsilon(x)$ of Theorem 4.1 combines the error terms in the formula for $\chi(\mathcal{M}_{1,n})$ with the correction terms involving $\dot a_0$ in Corollary 2.8. Omitting the term $\epsilon(g(x))$ in Theorem 4.1 and dividing by 2, we obtain the generating function not of the Euler characteristics $\chi(\overline{\mathcal{M}}_{1,n})$, but rather of the virtual Euler characteristics $\chi_v(\overline{\mathcal{M}}_{1,n})$ of the underlying smooth moduli stack (orbifold). The asymptotic behaviour of the virtual Euler characteristics is the same as that of the Euler characteristics, with $C$ replaced by $\tilde C = \bigl(\frac{e - 2}{18\pi e}\bigr)^{1/2} \approx 0.06835794$. The ratio between these Euler characteristics has the asymptotic behaviour
$$\frac{2\chi_v(\overline{\mathcal{M}}_{1,n})}{\chi(\overline{\mathcal{M}}_{1,n})} \sim 1 + (\tilde C - C)\, n^{-1/2} + O(n^{-1}),$$
giving a statistical measure of the ramification of $\overline{\mathcal{M}}_{1,n}$ for large $n$.

Acknowledgement. I wish to thank the Department of Mathematics at the Université de Paris-VII, and the Max-Planck-Institut für Mathematik in Bonn for their hospitality during the inception and completion, respectively, of this paper. I am grateful to D. Zagier for showing me the asymptotic expansion of Corollary 4.2. This research was partially supported by a research grant of the NSF and a fellowship of the A.P. Sloan Foundation.
References
1. Getzler, E.: Operads and moduli spaces of genus 0 Riemann surfaces. In: The moduli space of curves, eds. R. H. Dijkgraaf et al., Basel: Birkhäuser, 1995, pp. 199–230
2. Getzler, E.: Mixed Hodge structures of configuration spaces. Max-Planck-Institut, preprint MPI-96-61, alg-geom/9510018
3. Getzler, E.: Resolving mixed Hodge modules on configuration spaces. To appear, Duke Math. J.; alg-geom/9611003
4. Getzler, E.: Intersection theory on $\overline{\mathcal{M}}_{1,4}$ and elliptic Gromov-Witten invariants. J. Am. Math. Soc. 10, 973–998 (1997); alg-geom/9612004
5. Getzler, E. and Kapranov, M.M.: Modular operads. Compositio Math. 110, 65–126 (1998); dg-ga/9408003
6. Macdonald, I.G.: Symmetric Functions and Hall Polynomials. 2nd edition, Oxford: Clarendon Press, 1995
7. Stembridge, J.: A Maple package for symmetric functions. http://www.math.lsa.umich.edu/~jrs/maple.html

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 493 – 512 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Suppression of Critical Fluctuations by Strong Quantum Effects in Quantum Lattice Systems
Sergio Albeverio1,2, Yuri Kondratiev3,4, Yuri Kozitsky5,6
1 Fakultät für Mathematik, Ruhr-Universität, D-44780 Bochum, Germany, SFB 237 Essen–Bochum–Düsseldorf and BiBoS Research Centre, D 33615 Bielefeld, Germany
2 CERFIM, Locarno, Switzerland
3 BiBoS Research Centre, Bielefeld Universität, D 33615, Germany
4 Institute of Mathematics, Kiev, Ukraine
5 Institute of Mathematics, Marie Curie–Sklodowska University PL 20-031 Lublin, Poland
6 Institute for Condensed Matter Physics, Lviv, Ukraine
Received: 31 May 1996 / Accepted: 12 March 1997
Abstract: Translation invariant models of quantum anharmonic oscillators with a polynomial anharmonicity and a ferroelectric interaction are considered. For these models, it is proved that the critical fluctuations of the position operator, peculiar to the critical point, are suppressed at all temperatures provided the oscillators are strongly quantum. This phenomenon is shown to occur in particular if the oscillator's mass is less than some threshold value depending on the anharmonicity parameters.

1. Posing of the Problem and Results

The object of our investigation in this paper is a countable system of quantum particles (oscillators) performing one-dimensional oscillations around their equilibrium positions and interacting among themselves. The set of all equilibrium positions is assumed to be a lattice $\mathbb{L}$. For further simplicity we choose this lattice to be simple cubic, that is, $\mathbb{L} = \mathbb{Z}^\nu$ with a certain positive integer $\nu$. For every $l \in \mathbb{L}$, let a quantum particle with the mass $m$ and one internal degree of freedom be given. The dynamics of its oscillations around the equilibrium position $l$ is described by means of canonical momentum and displacement operators $\{p_l, q_l\}$ defined on (a dense subset of) the complex Hilbert space $\mathcal{H}_l = L^2(\mathbb{R}_l, dx_l)$. For every finite subset $\Lambda \subset \mathbb{L}$, we put $\mathcal{H}_\Lambda = \bigotimes_{l \in \Lambda} \mathcal{H}_l$. The dynamics of the whole system of particles can be described by means of the model Hamiltonian formally given by
$$H = \frac12 \sum_{l \ne l';\ l, l' \in \mathbb{L}} d_{ll'}\, q_l q_{l'} + \sum_{l \in \mathbb{L}} H_l, \tag{1}$$
$$H_l = \frac{p_l^2}{2m} + V(q_l), \tag{2}$$
where the coefficients $d_{ll'}$ form the dynamical matrix, $m$ is the oscillator's mass, and $V$ is the one-particle potential. The first term in (1) describes the interaction between
494
S. Albeverio, Y. Kondratiev, Y. Kozitsky
the particles, the second one describes their individual properties by means of the oneparticle Hamiltonian Hl , l ∈ IL. The interaction is assumed to be translation invariant and ferroelectric. The simplest example of such type can be given as follows. Let |l − l0 | be the Euclidean distance between the points l and l0 . We set for l 6= l0 dll0 = −φ(|l − l0 |),
(3)
where φ is some nonnegative monotone function with sufficiently fast falloff, such that X φ(|l|) < ∞. l∈IL
For convenience, we include the case $l = l'$ in the sum in (1), putting $d_{ll} = 0$. The formal Hamiltonian cannot be defined directly as an operator; it is the heuristic limit of “local Hamiltonians” $H_\Lambda$ associated with finite subsets $\Lambda \subset \mathbb{L}$, each of which is defined as an essentially self-adjoint lower bounded operator acting on the corresponding Hilbert space $\mathcal{H}_\Lambda$. For our purposes, it is enough to consider these Hamiltonians indexed by boxes
$$\Lambda = \{l = (l_1, \dots, l_\nu) \mid l_j^0 \le l_j \le l_j^1,\ j = 1, \dots, \nu;\ l_j^0 < l_j^1,\ l_j^0, l_j^1 \in \mathbb{Z}\}.$$
To introduce the local Hamiltonian in the box $\Lambda$ which corresponds to periodic boundary conditions, we set
$$d^\Lambda_{ll'} = -\phi(|l - l'|_\Lambda),\quad l \neq l'; \qquad d^\Lambda_{ll} = 0; \tag{4}$$
where the function $\phi$ is the same as in (3) and the distance $|l - l'|_\Lambda$ is taken on the torus $T(\Lambda)$, which is obtained from $\Lambda$ by suitable identification of boundary components. For the box $\Lambda$ described above, it can be defined by the expression
$$|l - l'|^2_\Lambda = \sum_{j=1}^{\nu} |l_j - l'_j|^2_\Lambda,$$
where
$$|l_j - l'_j|_\Lambda = \min\{|l_j - l'_j|;\ l_j^1 - l_j^0 + 1 - |l_j - l'_j|\}. \tag{5}$$
This yields
$$|l - l'|_\Lambda \le |l - l'|$$
for all pairs $l, l'$ in $\Lambda$. Taking into account the monotonicity of $\phi$, one gets, for all pairs $l, l'$ in $\Lambda$,
$$d^\Lambda_{ll'} \le d_{ll'}. \tag{6}$$
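The torus distance (5) and the inequality $|l - l'|_\Lambda \le |l - l'|$ can be checked numerically on a small box. The following is a minimal sketch; the function names and the sample box are hypothetical choices made only for illustration.

```python
import itertools
import math

def torus_distance(l, lp, lower, upper):
    """Distance |l - l'|_Lambda of (5) for the box lower[j] <= l_j <= upper[j]."""
    total = 0
    for lj, lpj, lo, hi in zip(l, lp, lower, upper):
        d = abs(lj - lpj)
        side = hi - lo + 1            # number of sites in direction j
        total += min(d, side - d) ** 2
    return math.sqrt(total)

def euclidean_distance(l, lp):
    """Ordinary Euclidean distance |l - l'| on the lattice."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(l, lp)))

# Hypothetical two-dimensional box: 0 <= l_1 <= 4, -1 <= l_2 <= 2.
lower, upper = (0, -1), (4, 2)
sites = list(itertools.product(range(0, 5), range(-1, 3)))

# The torus distance never exceeds the Euclidean one.
inequality_holds = all(
    torus_distance(a, b, lower, upper) <= euclidean_distance(a, b) + 1e-12
    for a in sites for b in sites
)
```

Since $\phi$ is nonincreasing, the same comparison carries over to $d^\Lambda_{ll'} = -\phi(|l-l'|_\Lambda) \le -\phi(|l-l'|) = d_{ll'}$.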
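Thus we set
$$H_\Lambda = \frac{1}{2} \sum_{l,l' \in \Lambda} d^\Lambda_{ll'}\, q_l q_{l'} + \sum_{l \in \Lambda} H_l, \tag{7}$$
$$H^0_\Lambda = \frac{1}{2} \sum_{l,l' \in \Lambda} d_{ll'}\, q_l q_{l'} + \sum_{l \in \Lambda} H_l, \tag{8}$$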
where $H_l$ is given by (2). The latter local Hamiltonian corresponds to zero boundary conditions; it will also be considered. As mentioned above, the dynamical matrices introduced by (3)–(5) are the simplest ones having the properties we will utilize in this paper. Let us describe them more systematically. Denote by $\mathcal{L}$ the family of all boxes. For a given box $\Lambda$,
Suppression of Critical Fluctuations by Strong Quantum Effects
495
let $\mathcal{L}(\Lambda)$ denote the partition of $\mathbb{L}$ by boxes that can be obtained as corresponding translations of this $\Lambda$. Denote by $G$ the group of all translations of $\mathbb{L}$. Let $G(\Lambda)$ be the subgroup of $G$ consisting of translations which generate the partition $\mathcal{L}(\Lambda)$. This means $\mathcal{L}(\Lambda) = \{t(\Lambda),\ t \in G(\Lambda)\}$, where $t(\Lambda) = \{t(l),\ l \in \Lambda\}$. Then the dynamical matrices $(d_{ll'})_{l,l' \in \mathbb{L}}$ and $(d^\Lambda_{ll'})_{l,l' \in \Lambda}$ have the following properties:

(D1) $d_{ll'}$ is invariant under translations on $\mathbb{L}$;
(D2) $d_{ll'} \le 0$, $d^\Lambda_{ll'} \le 0$ (ferroelectricity); $d_{ll} = d^\Lambda_{ll} = 0$;
(D3) the series $\sum_{l' \in \mathbb{L}} d_{ll'}$ converges for all $l \in \mathbb{L}$;
(D4) $d^\Lambda_{ll'} = \min\{d_{l\,t(l')} : t \in G(\Lambda)\}$.

Now let us describe the one-particle potentials $V$ that will be considered. We assume:

(V1) $V$ is an even polynomial, $\deg V = 2M$ with $M \ge 2$;
(V2) the polynomial $v$ such that $v(q^2) = V(q)$ is convex on $\mathbb{R}_+ = [0, +\infty)$.

An example of $V$ satisfying these conditions is
$$V(q) = a q^2 + q^4 \tag{9}$$
with arbitrary real $a$. For $a < 0$, the potential (9) has a double-well shape. In the sequel, when speaking about a model of quantum anharmonic oscillators, we will mean the model possessing the properties (D1)–(D4), (V1), (V2). The simplest example of such a model was described above.

Remark 1.1. Let $V$ satisfy (V1), (V2). Then the probability measure $d\mu_V(x) = C \exp(-V(x))\,dx$ ($C$ is a positive normalizing constant) belongs to the BFS class of measures [6], and for every $\alpha > 0$, the function $\exp(\alpha x^2)$ is $\mu_V$-integrable on $\mathbb{R}$.

For every $\Lambda \in \mathcal{L}$ and given inverse temperature $\beta = T^{-1}$, the Gibbs states in $\Lambda$ are defined as functionals on an appropriate $C^*$-algebra of observables by means of the local Hamiltonians $H_\Lambda$, $H^0_\Lambda$ given by (7), (8) respectively as follows:
$$\gamma_{\beta,\Lambda}(A) = \frac{\operatorname{trace}(A e^{-\beta H_\Lambda})}{\operatorname{trace}(e^{-\beta H_\Lambda})}, \qquad \gamma^0_{\beta,\Lambda}(A) = \frac{\operatorname{trace}(A e^{-\beta H^0_\Lambda})}{\operatorname{trace}(e^{-\beta H^0_\Lambda})}, \tag{10}$$
see e.g. [4] for the definition and related discussions. The state $\gamma_{\beta,\Lambda}$ is periodic, the state $\gamma^0_{\beta,\Lambda}$ corresponds to zero boundary conditions. These states can be fully determined by means of temperature Green functions. Let $S_\beta$ be a circle of length $\beta$ (isometric to the segment $[0, \beta]$ with identified ends). For each ordered set $\{\tau_0, \tau_1, \dots, \tau_n\}$, $\tau_i \le \tau_{i+1}$, $\tau_i \in S_\beta$, and appropriate operators $\{A_0, A_1, \dots, A_n\}$, the corresponding temperature Green function is determined as follows:
$$\Gamma^{\beta,\Lambda}_{A_0;A_1;\dots;A_n}(\tau_0, \tau_1, \dots, \tau_n) = \frac{1}{Z_\Lambda} \operatorname{trace}\left\{A_0 e^{-(\tau_1 - \tau_0)H_\Lambda} A_1 e^{-(\tau_2 - \tau_1)H_\Lambda} \cdots A_n e^{-(\beta - \tau_n + \tau_0)H_\Lambda}\right\}, \tag{11}$$
where
$$Z_\Lambda = \operatorname{trace} e^{-\beta H_\Lambda}. \tag{12}$$
The Green function defined by (11), (12) with $H^0_\Lambda$ instead of $H_\Lambda$ will be denoted by $\Gamma^{0,\beta,\Lambda}_{A_0;A_1;\dots;A_n}(\tau_0, \tau_1, \dots, \tau_n)$. These functions can be obtained by analytic continuation of the usual Green functions corresponding to the unitary time evolution, see [1, 7]. Thermodynamic properties of the model can be described by passing to the limit $\Lambda \nearrow \mathbb{L}$. To do this, we introduce collections of boxes $\mathcal{C}$ as sets completely ordered by inclusion. Each such set is countable and thus can be considered as a sequence of boxes. For the models we consider, the existence of limiting periodic Gibbs states can be proved (see e.g. [2], where it was done for systems satisfying a reflection positivity condition).

Our intention in this paper is to show that the large fluctuations of displacements of oscillators described by the Hamiltonians (1)–(7), (8) are absent (suppressed) at any temperature if the model parameters satisfy certain conditions that can be characterized as conditions of “strong quantumness”. Thus we restrict ourselves to the consideration of those operators that describe such fluctuations. Due to the $Z_2$-symmetry of the local Hamiltonians (7), (8) one has $\gamma_{\beta,\Lambda}(q_l) = \gamma^0_{\beta,\Lambda}(q_l) = 0$. For some real $\delta$, we define
$$A_\delta(\Lambda) = \frac{1}{|\Lambda|^{\frac{1}{2}+\delta}}\, Q_\Lambda = \frac{1}{|\Lambda|^{\frac{1}{2}+\delta}} \sum_{l \in \Lambda} q_l. \tag{13}$$
This operator describes the fluctuations of displacements of particles in $\Lambda$. For some $\mathcal{C}$, let us consider the sequence $\{A_\delta(\Lambda),\ \Lambda \in \mathcal{C}\}$ from the point of view of the central limit theorem. In the case of a weak dependence between oscillations, one can expect the asymptotic “normality” of this sequence with $\delta = 0$. But if this “normality” occurs for some $\delta > 0$, this will correspond to the appearance of a strong dependence between oscillations which takes place at the critical temperature. In the quantum case, the normality of these sequences can be understood in terms of corresponding Green functions. Accordingly, we give the following definition of the critical temperature.

Definition 1.1. For given temperature $T$, some $\delta > 0$, and a sequence $\mathcal{C}$, let the sequence of temperature Green functions $\{\Gamma^{\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$ given by (11)–(13), or the sequence of functions $\Gamma^{0,\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau')$, converge in $L^1(S_\beta^2)$ to some function $\Gamma^\delta(\tau, \tau')$ different from zero on a subset of $S_\beta^2$ of positive Lebesgue measure. Then one says that the critical fluctuations of displacements of particles occur at this temperature. The latter is said to be a critical temperature. If for every sequence $\mathcal{C}$ and arbitrary positive $\delta$, these sequences of Green functions converge to zero, we say that the critical fluctuations are absent at this temperature.

Now we can formulate our main result.

Theorem 1.1. Let the model of quantum anharmonic oscillators which possesses the properties (D1)–(D4), (V1), (V2) be considered. For this model, there exists a positive $m_*$ such that for all values of the oscillator’s mass $m \in (0, m_*)$, the critical fluctuations are absent at all temperatures.

The proof of this statement is preceded by a number of lemmas, other assertions, and remarks.

Remark 1.2. The model with small values of the oscillator’s mass can be characterized as “strongly quantum”. Below we analyze this and some other conditions of strong (or large) quantumness.
Now we simplify our notations by setting
$$\Gamma^\delta_\Lambda(\tau, \tau') = \Gamma^{\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau'), \qquad \Gamma^\delta_{0,\Lambda}(\tau, \tau') = \Gamma^{0,\beta,\Lambda}_{A_\delta(\Lambda);A_\delta(\Lambda)}(\tau, \tau'). \tag{14}$$
From the definitions (11)–(13), one has
$$\Gamma^\delta_\Lambda(\tau + \theta, \tau' + \theta) = \Gamma^\delta_\Lambda(\tau, \tau'), \tag{15}$$
where the addition in $\tau + \theta$ is understood modulo $\beta$. The same periodicity can be shown to hold also for $\Gamma^\delta_{0,\Lambda}$.

Proposition 1.1. The following estimates hold with arbitrary $\delta \ge 0$, $\Lambda \in \mathcal{L}$, almost everywhere on $S_\beta^2$:
$$\Gamma^\delta_\Lambda(\tau, \tau') \ge \Gamma^\delta_{0,\Lambda}(\tau, \tau') > 0.$$
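The proof of this variant of the Lebowitz and Griffiths inequalities for our model will be given below. We set
$$U^\delta_\Lambda = \frac{1}{\beta} \iint_{S_\beta^2} \Gamma^\delta_\Lambda(\tau, \tau')\, d\tau\, d\tau'. \tag{16}$$
Taking into account (15), we obtain
$$U^\delta_\Lambda = \int_{S_\beta} \Gamma^\delta_\Lambda(\tau, \tau')\, d\tau' = \int_{S_\beta} \Gamma^\delta_\Lambda(\tau, \tau')\, d\tau. \tag{17}$$

Proposition 1.2. For given $\delta > 0$, let the sequence $\{U^\delta_\Lambda,\ \Lambda \in \mathcal{C}\}$ converge to zero. Then the sequences $\{\Gamma^\delta_\Lambda(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$, $\{\Gamma^\delta_{0,\Lambda}(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$ converge to zero in $L^1(S_\beta^2)$.

The proof of this assertion immediately follows from Proposition 1.1 and (16). Taking into account the definition (13), one has
$$\Gamma^\delta_\Lambda(\tau, \tau') = |\Lambda|^{-2\delta}\, \Gamma^0_\Lambda(\tau, \tau'); \qquad U^\delta_\Lambda = |\Lambda|^{-2\delta}\, U^0_\Lambda. \tag{18}$$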
Let us return to the dynamical matrix $D = (d_{ll'})$, which can naturally be defined on the Hilbert space $l^2(\mathbb{L})$ as a linear operator. Having in mind the translation invariance (D1) and the “ferroelectricity” property (D2), we deduce that the sum of the series mentioned in (D3) does not depend on $l$. Thus the norm of this operator is
$$\|D\| = -\sum_{l' \in \mathbb{L}} d_{ll'}. \tag{19}$$
Write
$$\tilde{H}_l = H_l + \|D\|\, q_l^2, \quad l \in \mathbb{L}, \tag{20}$$
where $H_l$ is given by (2). Each Hamiltonian $\tilde{H}_l$ with $V$ obeying (V1), (V2) has purely discrete spectrum. We set
$$\tilde{H}_l \psi_s = E_s \psi_s, \quad s \in \mathbb{Z}_+, \tag{21}$$
where the eigenvalues $E_s$ are numbered in increasing order. Denote
$$\Delta = \min\{E_{s+1} - E_s \mid s \in \mathbb{Z}_+\}. \tag{22}$$
Lemma 1.1. (Main estimate). For the model of quantum oscillators, let the parameters of the one-particle Hamiltonian $m$ and $\Delta$, the latter given by (22), and the interaction parameter $\|D\|$ satisfy the following condition:
$$m\Delta^2 > 2\|D\|. \tag{23}$$
Then for every $\beta > 0$ and arbitrary $\Lambda \in \mathcal{L}$,
$$U^0_\Lambda \le \frac{1}{m\Delta^2 - 2\|D\|}. \tag{24}$$
The proof will be given below.

Lemma 1.2. For the model of quantum anharmonic oscillators, let the parameters $m$, $\Delta$, and $\|D\|$ satisfy condition (23). Then the critical fluctuations are absent for all $\beta > 0$.

Proof. Proposition 1.1 implies $U^0_\Lambda > 0$ for all $\beta > 0$ and $\Lambda \in \mathcal{L}$. Therefore, for $m$, $\Delta$, and $\|D\|$ satisfying (23), the estimate (24) holds, thus the sequence $\{U^0_\Lambda,\ \Lambda \in \mathcal{C}\}$ is bounded. By (18), this yields, for arbitrary $\delta > 0$, that the sequence $\{U^\delta_\Lambda,\ \Lambda \in \mathcal{C}\}$ converges to zero. The latter and Proposition 1.2 yield in turn that the sequences $\{\Gamma^\delta_\Lambda(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$, $\{\Gamma^\delta_{0,\Lambda}(\tau, \tau'),\ \Lambda \in \mathcal{C}\}$ converge to zero in $L^1(S_\beta^2)$. Thus the critical fluctuations are absent for all $\beta > 0$.

For the system of harmonic oscillators, we obtain an equality in (24) (this follows from the proof of Lemma 1.1 given below). This agrees with the corresponding results obtained for such systems (see e.g. expression (7) in [7]). For a harmonic oscillator ($V(q) = \frac{1}{2} a^2 q^2$), we have $m\Delta^2 = \|D\| + a^2$, where the latter coefficient describes the “rigidity” of the oscillator. In this case, the condition (23) does not depend on $m$. But in the case of anharmonic oscillators where the one-particle potential $V$ satisfies the conditions (V1), (V2), $m\Delta^2$ may grow without bound as $m$ tends to zero, which implies that the condition (23) can be satisfied for given $\|D\|$ by taking $m$ less than some threshold value $m_*$. Such a growth of $m\Delta^2$ is proven in the following assertion.

Lemma 1.3. Let $V$ satisfy the conditions (V1), (V2). Then there exists a positive $m_*$ such that for all values of the oscillator’s mass $m$ less than $m_*$, the condition (23) is satisfied.

Proof. The one-particle potential $V(q) = v(q^2)$ in the Hamiltonian (7) has the form
$$V(q) = b_0 q^{2M} + b_1 q^{2M-2} + \dots + b_{M-1} q^2 + b_M, \tag{25}$$
where $b_0 > 0$, $M = \deg v \ge 2$, due to the condition (V1). For some $\sigma > 0$, we consider the unitary operator $W_\sigma$ in $L^2(\mathbb{R}, dx)$ (Symanzik scaling) given by the formula
$$(W_\sigma \varphi)(x) = \sigma^{\frac{1}{2}} \varphi(\sigma x). \tag{26}$$
Then from (26), we have
$$W_\sigma\, p\, W_\sigma^{-1} = \sigma^{-1} p, \qquad W_\sigma\, q\, W_\sigma^{-1} = \sigma q. \tag{27}$$
Taking $\sigma = \sigma_m := m^{-\frac{1}{2M+2}}$ and using (27), we have that $\tilde{H}_l$ is unitarily equivalent to the operator
$$m^{-\frac{M}{M+1}} \left[ L_0 + m^{\frac{1}{M+1}} R_m(q) \right], \tag{28}$$
where
$$L_0 = \frac{1}{2} p^2 + b_0 q^{2M} \tag{29}$$
and
$$R_m(q) = b_1 q^{2M-2} + b_2\, m^{\frac{1}{M+1}} q^{2M-4} + \dots + (b_{M-1} + \|D\|)\, m^{\frac{M-2}{M+1}} q^2 + b_M\, m^{\frac{M-1}{M+1}}.$$
Let $\Delta^{(0)}$ be defined by (22) but with the eigenvalues $E_s$ of the operator $L_0$ instead of $\tilde{H}_l$. Then using the perturbation theory for the operator
$$L_0 + m^{\frac{1}{M+1}} R_m(q)$$
(see e.g. [8]), we deduce the asymptotic equivalence
$$m\Delta^2 \sim m^{-\frac{M-1}{M+1}} \left(\Delta^{(0)}\right)^2, \quad m \to 0+.$$
Therefore, we can choose $m_*$ such that $m\Delta^2 > 2\|D\|$ for all $m \in (0, m_*)$. $\Box$
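The growth of $m\Delta^2$ as $m \to 0$ asserted in Lemma 1.3 can be observed numerically. The sketch below is only an illustration and not the method of the proof: it assumes units with $\hbar = 1$, replaces $p^2$ by a finite-difference Laplacian on a truncated interval, and takes the minimum in (22) over the lowest few levels only. The function name, the grid parameters, and the sample values $a = -1$, $\|D\| = 0.5$ are hypothetical.

```python
import numpy as np

def m_delta_squared(m, potential, half_width=8.0, n=600, levels=12):
    """Approximate m * Delta^2 for H = p^2/(2m) + potential(q) (hbar = 1).

    Assumption: second-order finite differences on [-half_width, half_width]
    with Dirichlet ends; Delta is the smallest gap among the lowest levels.
    """
    q = np.linspace(-half_width, half_width, n)
    h = q[1] - q[0]
    kin = 1.0 / (2.0 * m * h * h)
    H = (np.diag(2.0 * kin + potential(q))
         - kin * (np.eye(n, k=1) + np.eye(n, k=-1)))
    E = np.linalg.eigvalsh(H)[:levels]     # eigenvalues in increasing order
    delta = np.min(np.diff(E))
    return m * delta ** 2

# Hypothetical anharmonic example: V(q) = a q^2 + q^4 with a = -1,
# plus the term ||D|| q^2 of (20) with ||D|| = 0.5.
a, norm_D = -1.0, 0.5
v_tilde = lambda q: a * q ** 2 + q ** 4 + norm_D * q ** 2
anharmonic = [m_delta_squared(m, v_tilde) for m in (1.0, 0.2, 0.05)]

# Harmonic comparison: for V~(q) = (k/2) q^2 one has Delta = sqrt(k/m),
# so m * Delta^2 = k does not depend on m.
k = 3.0
harmonic = [m_delta_squared(m, lambda q: 0.5 * k * q ** 2) for m in (1.0, 0.25)]
```

In this sketch the entries of `anharmonic` increase as the mass decreases, while the entries of `harmonic` stay close to `k`, in line with the contrast drawn before Lemma 1.3.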
Remark 1.3. For the anharmonic oscillator with $V$ meeting (V1), (V2), $m\Delta^2$ can be considered as a parameter describing the quantum character of this oscillator. It may grow without bound as $m$ tends to zero, as established by Lemma 1.3, or with the growth of $\Delta$. The latter may occur for the double-well potential when its minima come close (e.g. by means of external pressure [12]), increasing the tunnelling between the wells. Therefore, for these systems, the condition (23) may serve as a mathematical realization of the notion of “strong quantumness”.

Proof of Theorem 1.1. It follows directly from Lemmas 1.2 and 1.3.
In fact, we can prove stronger statements than that of Theorem 1.1. Let us consider a sequence of operators $\{B(\Lambda),\ \Lambda \in \mathcal{C}\}$.

Definition 1.2. The sequence $\{B(\Lambda),\ \Lambda \in \mathcal{C}\}$ is said to be degenerate at zero if for all $n \in \mathbb{N}$, the sequences of temperature Green functions $\{\Gamma^{\beta,\Lambda}_{B(\Lambda);\dots;B(\Lambda)}(\tau_1, \dots, \tau_n),\ \Lambda \in \mathcal{C}\}$ converge to zero in $L^1(S_\beta^n)$.

Theorem 1.2. Let the conditions of Lemma 1.2 be satisfied. Then for all $\beta > 0$ and arbitrary $\delta > 0$, the sequence of fluctuation operators $\{A_\delta(\Lambda),\ \Lambda \in \mathcal{C}\}$ is asymptotically degenerate at zero.

The proof of this and the following theorem will be given in the next section. For any given sequence $\{\lambda(\Lambda),\ \Lambda \in \mathcal{C} \mid \lambda(\Lambda) \in \mathbb{R}\}$, we set
$$B_\lambda(\Lambda) = \lambda(\Lambda)\, A_0(\Lambda). \tag{30}$$

Theorem 1.3. Let the conditions of Lemma 1.2 be satisfied. Then for all $\beta > 0$ and an arbitrary sequence $\{\lambda(\Lambda),\ \Lambda \in \mathcal{C}\}$ converging to zero, the sequence of operators $\{B_\lambda(\Lambda),\ \Lambda \in \mathcal{C}\}$ given by (30) is asymptotically degenerate at zero.

Remark 1.4. The suppression of the long range order by strong quantum fluctuations in the physical systems described by the above models was experimentally observed (see e.g. J.E. Tibballs et al [12]) and discussed long ago from the physical point of view (see e.g. T. Schneider et al [9] or the book [5], chapter 2.5.4.3). A rigorous justification of this phenomenon was given by A. Verbeure and V.A. Zagrebnov [14]. Our theorems describe a quantum effect which is essentially stronger than the one described in [14]: they imply the suppression not only of the long-range order but also of any critical anomalies. The suppression of the long-range order would correspond to the value $\delta = \frac{1}{2}$ in (13).
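2. Functional Integral Representation

For given inverse temperature $\beta$ and $\Lambda \in \mathcal{L}$, let us consider the measure space $(\Omega_{\beta,\Lambda}; \Sigma_{\beta,\Lambda})$, where $\Omega_{\beta,\Lambda} = \{\omega_\Lambda(\cdot) \mid \omega_\Lambda : S_\beta \to \mathbb{R}^\Lambda\}$, $\omega_\Lambda(\cdot) = \{\omega_l(\cdot),\ l \in \Lambda \mid \omega_l \in \Omega_\beta := C(S_\beta \to \mathbb{R})\}$. The set $\Sigma_{\beta,\Lambda}$ is the standard $\sigma$-algebra of subsets of $\Omega_{\beta,\Lambda}$ generated by cylinder subsets, see e.g. [2]. In view of Definitions 1.1, 1.2, our main objects are the Green functions defined by (11) with all $A_j$ chosen to be $A_\delta(\Lambda)$. But in the case where $A_j$ are the multiplication operators by measurable functions $A_j(q_\Lambda)$ with $q_\Lambda = \{q_l,\ l \in \Lambda\}$, we have, in fact [7]:
$$\Gamma^{\beta,\Lambda}_{A_0;\dots;A_n}(\tau_0, \dots, \tau_n) = \int_{\Omega_{\beta,\Lambda}} \prod_{j=0}^{n} A_j(\omega_\Lambda(\tau_j))\, d\nu_{\beta,\Lambda}(\omega_\Lambda(\cdot)). \tag{31}$$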
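Here $\nu_{\beta,\Lambda}$ is determined as a measure on $\Omega_{\beta,\Lambda}$ by the Hamiltonian $H_\Lambda$ and has been rigorously defined in [1, 7]. In the sequel, we will use the local Hamiltonian $H_\Lambda$ presented as follows:
$$H_\Lambda = -\frac{1}{2} \sum_{l,l' \in \Lambda} J^\Lambda_{ll'}\, q_l q_{l'} + \sum_{l \in \Lambda} \tilde{H}_l, \tag{32}$$
where $\tilde{H}_l$ is given by (20) and
$$J^\Lambda_{ll'} = -d^\Lambda_{ll'} + \|D\|\, \delta_{ll'}, \tag{33}$$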
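with $d^\Lambda_{ll'}$ defined by (D4). For the one-particle Hamiltonian, we will use the following representation:
$$\tilde{H}_l = \frac{1}{2m}\, p_l^2 + \tilde{V}(q_l), \qquad \tilde{V}(q_l) = V(q_l) + \|D\|\, q_l^2. \tag{34}$$
By means of the matrix $(J^\Lambda_{ll'})$, one can define on the Euclidean space $\mathbb{R}^\Lambda$ the operator $J^\Lambda$ that has the norm
$$\|J^\Lambda\| = \|D\| - \sum_{l' \in \Lambda} d^\Lambda_{ll'}. \tag{35}$$
Therefore, the smallest eigenvalue of $J^\Lambda$ is not less than
$$\|D\| + \sum_{l' \in \Lambda} d^\Lambda_{ll'} > 0,$$
where (D2) and (D4) were taken into account. Thus this operator is symmetric and positive. The latter means that the scalar product $(J^\Lambda x, x)$ in $\mathbb{R}^\Lambda$ is strictly positive for all nonzero $x \in \mathbb{R}^\Lambda$. Thus we can write $\nu_{\beta,\Lambda}$ as follows:
$$d\nu_{\beta,\Lambda}(\omega_\Lambda(\cdot)) = \frac{1}{Z_{\beta,\Lambda}} \exp\left\{ \frac{1}{2} \sum_{l,l' \in \Lambda} J^\Lambda_{ll'} \int_{S_\beta} \omega_l(\tau)\, \omega_{l'}(\tau)\, d\tau \right\} \bigotimes_{l \in \Lambda} d\chi_\beta(\omega_l(\cdot)). \tag{36}$$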
The measure $\chi_\beta$ is determined by the one-particle Hamiltonian (34) and has also been defined in [1, 7]. For our purpose, it is convenient to use its heuristic representation [1, 7],
$$d\chi_\beta(\omega_l(\cdot)) = \frac{1}{Z} \exp\left( -\frac{m}{2} \int_{S_\beta} (\dot{\omega}_l(\tau))^2\, d\tau - \int_{S_\beta} \tilde{V}(\omega_l(\tau))\, d\tau \right) d\omega_l(\cdot). \tag{37}$$
It is possible to give a rigorous meaning to (37) by means of a lattice approximation, as in constructive quantum field theory. Within this approximation, the derivative $\dot{\omega}_l(\tau) := d\omega_l(\tau)/d\tau$ is replaced by a difference quotient, whereas the integrals over $S_\beta$ are replaced by the corresponding Riemann integral sums [10]. This approach will be used in the next section. For given $k \in \mathbb{N}$, let $\pi$ be a partition of the set $\{1, 2, \dots, 2k\}$ into unordered pairs, i.e. $\pi = \{(\pi(1), \pi(2)); (\pi(3), \pi(4)); \dots; (\pi(2k-1), \pi(2k))\}$, and let $\mathcal{P}(2k)$ be the collection of all such partitions. We set
$$\Gamma^\delta_{2k,\Lambda}(\tau_1, \dots, \tau_{2k}) = \Gamma^{\beta,\Lambda}_{A_\delta(\Lambda);\dots;A_\delta(\Lambda)}(\tau_1, \dots, \tau_{2k}), \tag{38}$$
where the right-hand side of (38) is defined by (11) with $A_\delta(\Lambda)$ given by (13). Then (see (14)) $\Gamma^\delta_{2,\Lambda}(\tau, \tau') = \Gamma^\delta_\Lambda(\tau, \tau')$.

Lemma 2.1. For all $\delta \ge 0$, $k \in \mathbb{N}$, the following estimate holds for all $\Lambda \in \mathcal{L}$ and almost all values of the corresponding variables:
$$\Gamma^\delta_{2k,\Lambda}(\tau_1, \dots, \tau_{2k}) \le \sum_{\pi \in \mathcal{P}(2k)} \prod_{i=1}^{k} \Gamma^\delta_\Lambda\left(\tau_{\pi(2i-1)}, \tau_{\pi(2i)}\right). \tag{39}$$
The proof of this Gaussian-upper-bound-like inequality is given in the next section. For $\Lambda$ consisting of only one element, we denote the corresponding $U^\delta_\Lambda$ given by (17) simply by $U$. Clearly, this $U$ does not depend on $\delta$ and is determined by the one-particle Hamiltonian (34). The proof of Lemma 1.1 is based on the following Gaussian-upper-bound-like estimate, which for our model is proved in the next section by means of the lattice approximation.

Lemma 2.2. For given $\beta$, let $U$ and $\|D\|$ satisfy the condition
$$U^{-1} > 2\|D\|. \tag{40}$$
Then for arbitrary $\Lambda \in \mathcal{L}$,
$$U^0_\Lambda \le \frac{U}{1 - 2U\|D\|}. \tag{41}$$

Lemma 2.3. Let $\Delta$ be defined by (22). Then for all $\beta > 0$, the following estimate holds:
$$U \le \frac{1}{m\Delta^2}. \tag{42}$$

Proof. Taking into account (17), (14), (12), (11), (10) and (21), one gets
$$U = \frac{1}{Z} \int_{S_\beta} \operatorname{trace}\left\{ q\, e^{-\tau \tilde{H}_l}\, q\, e^{-(\beta - \tau)\tilde{H}_l} \right\} d\tau, \tag{43}$$
where
$$Z = \operatorname{trace} e^{-\beta \tilde{H}_l}.$$
We set $q_{ss'} = (\psi_s, q\psi_{s'})$ and obtain from (43)
$$U = \frac{1}{Z} \sum_{s,s' \in \mathbb{Z}_+} (q_{ss'})^2\, \frac{(E_s - E_{s'})(e^{-\beta E_{s'}} - e^{-\beta E_s})}{(E_s - E_{s'})^2}. \tag{44}$$
For symmetry reasons, we have $q_{ss} = 0$, thus the case $s = s'$ in the sum (44) can be excluded. Then, taking into account definition (22), one can estimate the denominator in (44) and obtain
$$U \le \frac{1}{Z\Delta^2} \sum_{s,s'} (q_{ss'})^2 (E_s - E_{s'})(e^{-\beta E_{s'}} - e^{-\beta E_s}) = \frac{1}{\Delta^2} \cdot \frac{1}{Z} \operatorname{trace}\left\{ [q, [\tilde{H}_l, q]]\, e^{-\beta \tilde{H}_l} \right\} = \frac{1}{m\Delta^2},$$
where $[\cdot, \cdot]$ means the commutator. This gives (42).
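The spectral formula (44) and the bound (42) can be checked numerically for a single oscillator. The sketch below is an illustration under stated assumptions, not part of the proof: units with $\hbar = 1$, a finite-difference discretization of $\tilde{H}_l$, a truncation of the sums in (44) to the lowest levels, and hypothetical function names and parameter values.

```python
import numpy as np

def spectrum_and_position(m, potential, half_width=8.0, n=600, levels=40):
    """Lowest eigenvalues E_s and matrix elements q_{ss'} = (psi_s, q psi_s').

    Assumption: H = p^2/(2m) + potential(q) is replaced by a finite-difference
    matrix on [-half_width, half_width]; only `levels` states are kept.
    """
    q = np.linspace(-half_width, half_width, n)
    h = q[1] - q[0]
    kin = 1.0 / (2.0 * m * h * h)
    H = (np.diag(2.0 * kin + potential(q))
         - kin * (np.eye(n, k=1) + np.eye(n, k=-1)))
    E, psi = np.linalg.eigh(H)
    E, psi = E[:levels], psi[:, :levels]
    q_matrix = psi.T @ (q[:, None] * psi)
    return E, q_matrix

def susceptibility_U(E, q_matrix, beta):
    """Evaluate (44): (1/Z) sum_{s != s'} q_{ss'}^2 (e^{-bE_s'} - e^{-bE_s}) / (E_s - E_s')."""
    w = np.exp(-beta * (E - E[0]))           # Boltzmann weights, shifted for stability
    dE = E[:, None] - E[None, :]             # entry [s, s'] is E_s - E_s'
    dw = w[None, :] - w[:, None]             # entry [s, s'] is w_s' - w_s
    off = ~np.eye(len(E), dtype=bool)        # exclude s = s'
    return np.sum(q_matrix[off] ** 2 * dw[off] / dE[off]) / np.sum(w)

# Harmonic case V~(q) = (k/2) q^2: equality U = 1/(m Delta^2) = 1/k is expected.
k, m = 3.0, 1.0
E_h, Q_h = spectrum_and_position(m, lambda q: 0.5 * k * q ** 2)
U_harmonic = susceptibility_U(E_h, Q_h, beta=2.0)

# Anharmonic case (hypothetical values): V~(q) = q^4 + 0.5 q^2.
E_a, Q_a = spectrum_and_position(m, lambda q: q ** 4 + 0.5 * q ** 2)
U_anharmonic = susceptibility_U(E_a, Q_a, beta=1.0)
bound = 1.0 / (m * np.min(np.diff(E_a)) ** 2)
```

In the harmonic case the computed value is close to $1/k$, while in the anharmonic case it stays below `bound`, as (42) requires.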
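Proof of Lemma 1.1. By Lemma 2.3, $U^{-1} \ge m\Delta^2$, so that (23) implies (40). Since the right-hand side of (41) is increasing in $U$ on $(0, (2\|D\|)^{-1})$, inserting (42) into (41) gives (24).

Proof of Theorems 1.2 and 1.3. Follows immediately from Theorem 1.1 and estimate (39).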
3. Proof of Lemmas 2.1 and 2.2

In order to prove Lemmas 2.1 and 2.2, we will construct some technical background within the lattice approximation approach. The starting point here is expression (37). As mentioned above, the main idea of this approach is to replace the integrals over $S_\beta$ by the corresponding Riemann integral sums. Let us divide $S_\beta$ by the points $\tau_n = \frac{n}{N}\beta$, $n = 0, 1, \dots, N-1$, and introduce the following notations:
$$\xi_{ln} = \sqrt{\frac{\beta}{N}}\, \omega_l(\tau_n); \quad n = 0, 1, \dots, N-1; \tag{45}$$
$$\xi_{l0} = \xi_{lN}, \qquad \xi_l = (\xi_{l0}, \xi_{l1}, \dots, \xi_{lN-1});$$
$$\Delta\tau = \tau_{n+1} - \tau_n = \frac{\beta}{N}; \quad n = 0, 1, \dots, N-1;$$
$$\dot{\omega}_l(\tau_n) = \frac{\omega_l(\tau_{n+1}) - \omega_l(\tau_n)}{\Delta\tau}.$$
Now we define a sequence of measures on $\mathbb{R}^N$ that converges to the corresponding measure on trajectories, when $N \to \infty$, in the sense of convergence of integrals of cylinder functions. We set
$$d\chi^{(N)}_\beta(\xi_l) = \frac{1}{X^{(N)}_\beta} \exp\left( \alpha_N \sum_{n=0}^{N-1} \xi_{ln}\, \xi_{ln+1} \right) \bigotimes_{n=0}^{N-1} d\rho^{(N)}_\beta(\xi_{ln}), \tag{46}$$
$$d\rho^{(N)}_\beta(\xi_{ln}) = \frac{1}{Y^{(N)}_\beta} \exp\left( -\alpha_N\, \xi_{ln}^2 - \frac{\beta}{N}\, \tilde{V}\!\left( \sqrt{\frac{N}{\beta}}\, \xi_{ln} \right) \right) d\xi_{ln}, \tag{47}$$
where $X^{(N)}_\beta$, $Y^{(N)}_\beta$ are normalizing constants, and
$$\alpha_N = m \left( \frac{N}{\beta} \right)^2.$$
N −1 X X 1 1 ) = (N ) exp Jll30 ξln ξl0 n X dχ(N (48) β (ξl ) 2 0 Zβ,3 n=0 l∈3 l,l ∈3 N −1 N −1 X X 1 1 ) = (N ) (N ) exp J3(N ) (ln, l0 n0 )ξln ξl0 n0 X X dρ(N β (ξln ), 2 0 Zβ,3 Xβ l,l ∈3 n=0
where
l∈3 n=0
J3(N ) (ln, l0 n0 ) = δnn0 Jll30 + δll0 δ|n−n0 |N ,1 αN ,
(49)
0
δ|n−n0 |N ,1 = 1 iff |n − n | = 1, N − 1. For a suitable function F (ξ), we define Z ) (N ) = F (ξ)dνβ,3 (ξ), < F (ξ) >(N 3 IRN |3|
(50)
Z
and ) < F (ξ) >(N 0,3 =
IRN |3|
(0,N ) F (ξ)dνβ,3 (ξ),
(51)
(0,N ) where νβ,3 is defined by (48) with (N ) J0,3 (ln, l0 n0 ) = δnn0 Jll0 + δll0 δ|n−n0 |N ,1 αN ,
(52)
instead of J3(N ) (ln, l0 n0 ), and Jll0 = −dll0 + kDk,
(53)
instead of Jll30 . Note that (D4) yields Jll0 ≤ Jll30 .
(54)
Introduce
$$S^{\delta,N}_{2k,\Lambda}(n_1, \dots, n_{2k}) = |\Lambda|^{-k(1+2\delta)} \left( \frac{N}{\beta} \right)^{k} \sum_{l_1, \dots, l_{2k} \in \Lambda} \langle \xi_{l_1 n_1} \cdots \xi_{l_{2k} n_{2k}} \rangle^{(N)}_\Lambda, \tag{55}$$
where $k \in \mathbb{N}$ and $n_j = 0, 1, \dots, N-1$. We also introduce $S^{\delta,N}_{2k,0,\Lambda}(n_1, \dots, n_{2k})$ as given by (55) with $\langle \dots \rangle^{(N)}_{0,\Lambda}$ instead of $\langle \dots \rangle^{(N)}_\Lambda$. Let us consider a sequence $\{N_s \in \mathbb{N},\ s \in \mathbb{N}\}$ possessing the property $N_s < N_{s+1}$ and, therefore, tending to infinity when $s \to \infty$. By means of this sequence, we can define a sequence $\{n^{(s)}(\tau),\ s \in \mathbb{N} \mid 0 \le n^{(s)} \le N_s\}$ such that for given $\tau \in S_\beta$, $\lim_{s \to \infty} (n^{(s)}(\tau)/N_s) = \tau/\beta$.
Lemma 3.1. For every $k \in \mathbb{N}$, $\delta \ge 0$, $\Lambda \in \mathcal{L}$, the sequence
$$\left\{ S^{\delta,N_s}_{2k,\Lambda}\left(n_1^{(s)}(\tau_1), n_2^{(s)}(\tau_2), \dots, n_{2k}^{(s)}(\tau_{2k})\right),\ s \in \mathbb{N} \right\}$$
converges to the temperature Green function $\Gamma^\delta_{2k,\Lambda}(\tau_1, \dots, \tau_{2k})$, given by (38), (31), when $s \to \infty$, for almost all values of $(\tau_1, \dots, \tau_{2k}) \in S_\beta^{2k}$. The same convergence of the sequence of $S^{\delta,N_s}_{2,0,\Lambda}(n_1^{(s)}(\tau_1), n_2^{(s)}(\tau_2))$ to $\Gamma^\delta_{0,\Lambda}(\tau_1, \tau_2)$ given by (14) holds true.

Proof. Standard arguments yield that the sequence of measures $\{\nu^{(N)}_{\beta,\Lambda},\ N \in \mathbb{N}\}$ converges to the measure $\nu_{\beta,\Lambda}$ given by (36) in the sense of convergence of integrals of cylinder functions. This yields the convergence of the corresponding moments, which yields in turn the convergence to be proven. The same arguments yield the convergence to the Green functions in the case of zero boundary conditions.

Now we can fix $\Lambda \in \mathcal{L}$ and the integer $N$, and simplify our notations by putting $\sum_l = \sum_{l \in \Lambda}$, $\sum_n = \sum_{n=0}^{N-1}$, and so on. Let us consider the measure (48) as a Gibbs measure (with corresponding boundary conditions) for a classical ferromagnetic spin model with the single-spin measure $\rho^{(N)}_\beta$ given by (47), which belongs to the BFS class of measures (see Remark 1.1). We set
$$u_2(ln, l'n') = \langle \xi_{ln}\, \xi_{l'n'} \rangle^{(N)}_\Lambda; \qquad u_2^{(0)}(ln, l'n') = \langle \xi_{ln}\, \xi_{l'n'} \rangle^{(N)}_{0,\Lambda}, \tag{56}$$
$$u_4(l_1 n_1, l_2 n_2, l_3 n_3, l_4 n_4) = \langle \xi_{l_1 n_1} \xi_{l_2 n_2} \xi_{l_3 n_3} \xi_{l_4 n_4} \rangle^{(N)}_\Lambda - u_2(l_1 n_1, l_2 n_2)\, u_2(l_3 n_3, l_4 n_4) - u_2(l_1 n_1, l_3 n_3)\, u_2(l_2 n_2, l_4 n_4) - u_2(l_1 n_1, l_4 n_4)\, u_2(l_2 n_2, l_3 n_3). \tag{57}$$

Proposition 3.1. For our model, the following estimates hold:
$$0 < u_2^{(0)}(ln, l'n') \le u_2(ln, l'n'), \tag{58}$$
$$u_4(l_1 n_1, l_2 n_2, l_3 n_3, l_4 n_4) < 0, \tag{59}$$
$$\langle \xi_{l_1 n_1} \cdots \xi_{l_{2k} n_{2k}} \rangle^{(N)}_\Lambda \le \sum_{\pi \in \mathcal{P}(2k)} \prod_{i=1}^{k} u_2\left(l_{\pi(2i-1)} n_{\pi(2i-1)}, l_{\pi(2i)} n_{\pi(2i)}\right). \tag{60}$$
Proof. Equations (58) (positivity) and (59) are known as Lebowitz inequalities, the estimate (60) is the Gaussian upper bound inequality, and the relationship between $u_2^{(0)}$ and $u_2$ in (58) is the Griffiths inequality produced by (54). The validity of these inequalities follows from the properties of the measures (48) mentioned above (see Sect. 12 of [6]).

Proof of Proposition 1.1 and Lemma 2.1. It follows immediately from (58), (60), (55) and Lemma 3.1.

The rest of this paper is devoted to the proof of Lemma 2.2. To achieve this aim, we will need an additional technique explained below. Let $\mathcal{F}$ be the set of functions $F$ defined on $\mathbb{R}^M$, $M \in \mathbb{N}$, that can be continued to $\mathbb{C}^M$ as entire functions, such that the norm
$$\|F\|_a = \sup\{ |F(z)| \exp(-a\|z\|^2) : z \in \mathbb{C}^M \} \tag{61}$$
is finite for all $a > 0$. Here $\|z\|^2 = |z_1|^2 + \dots + |z_M|^2$. This set equipped with the pointwise linear operations and the topology defined by the family of norms $\{\|\cdot\|_a,\ a > 0\}$ becomes a Fréchet space that will also be denoted by $\mathcal{F}$. We will use the following property of this space [11]:

Proposition 3.2. The set of all polynomials is dense in $\mathcal{F}$.

With each symmetric bilinear form on $\mathbb{R}^M \times \mathbb{R}^M$, one can associate a symmetric linear operator $A$ such that this form can be written as $(y, Ax)$, where $(\cdot, \cdot)$ denotes the scalar product in $\mathbb{R}^M$. The matrix $(A_{jj'})$ consisting of the matrix elements of the operator $A$ in the canonical basis of $\mathbb{R}^M$ will also be denoted by $A$. We will use the notation $A > 0$ for positive operators of this type. Let $A$ be a symmetric linear positive operator. We set
$$(T(A)F)(x) = \sum_{k=0}^{\infty} \frac{1}{2^k k!} \sum_{j_1, \dots, j_k = 1}^{M}\ \sum_{j_1', \dots, j_k' = 1}^{M} A_{j_1 j_1'} \cdots A_{j_k j_k'}\, \frac{\partial^{2k} F(x)}{\partial x_{j_1} \partial x_{j_1'} \cdots \partial x_{j_k} \partial x_{j_k'}}. \tag{62}$$
Estimating the derivatives, one can prove that for all such $A$, $T(A)$ continuously maps $\mathcal{F}$ into itself. $T(A)$ can be expressed in the integral form:
$$(T(A)F)(x) = \frac{1}{X(A)} \int_{\mathbb{R}^M} F(y) \exp\left( -\frac{1}{2} (y - x, A^{-1}(y - x)) \right) dy, \tag{63}$$
where
$$X(A) = [\det(2\pi A)]^{\frac{1}{2}}. \tag{64}$$
The identity of (62) and (63) can be proven for monomials, and then for every $F \in \mathcal{F}$ by means of Proposition 3.2. This identity permits the operator $T(A)$ to be extended to a wider class of functions. Let $A$ be a symmetric linear positive operator and $B$ a symmetric linear operator on $\mathbb{R}^M$. The set of all such $B$ satisfying with this $A$ the condition $A^{-1} - B > 0$ will be denoted by $\mathcal{B}(A)$. For every $B \in \mathcal{B}(A)$, the operator $T(A)$ given by (63) can be applied to $F(x) = \exp(\frac{1}{2}(x, Bx))$, giving
$$(T(A)F)(x) = \frac{1}{X(A)} \int_{\mathbb{R}^M} \exp\left\{ \frac{1}{2} (y, By) - \frac{1}{2} (y - x, A^{-1}(y - x)) \right\} dy. \tag{65}$$
We put
$$(y, By) - (y - x, A^{-1}(y - x)) = (x, Cx) - (y - Wx, D(y - Wx)), \tag{66}$$
and obtain
$$C = B(1 - AB)^{-1}; \qquad D = A^{-1} - B; \qquad W = (1 - AB)^{-1}. \tag{67}$$
Then we insert the right-hand side of (66) into the exponent in (65). This yields:
$$(T(A)F)(x) = \frac{X((1 - AB)^{-1}A)}{X(A)} \exp\left( \frac{1}{2} (x, B(1 - AB)^{-1}x) \right) \frac{1}{X((1 - AB)^{-1}A)} \int_{\mathbb{R}^M} \exp\left( -\frac{1}{2} (y - Wx, (A^{-1} - B)(y - Wx)) \right) dy$$
$$= \frac{X((1 - AB)^{-1}A)}{X(A)} \exp\left( \frac{1}{2} (x, B(1 - AB)^{-1}x) \right). \tag{68}$$
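For $M = 1$ with $A = a$ and $B = b$, $ab < 1$, formula (68) reduces to $(T(a)F)(x) = (1 - ab)^{-1/2} \exp\!\big(\tfrac{1}{2} b x^2/(1 - ab)\big)$. A minimal numerical check by direct quadrature of (65) is sketched below; the function names, grid parameters, and sample values are hypothetical.

```python
import math
import numpy as np

def T_gaussian_numeric(a, b, x, half_width=20.0, n=20001):
    """Riemann-sum evaluation of (65) for M = 1, A = a, B = b (requires a * b < 1)."""
    y = np.linspace(x - half_width, x + half_width, n)
    dy = y[1] - y[0]
    integrand = np.exp(0.5 * b * y ** 2 - 0.5 * (y - x) ** 2 / a)
    return np.sum(integrand) * dy / math.sqrt(2.0 * math.pi * a)

def T_gaussian_closed(a, b, x):
    """Right-hand side of (68) for M = 1: X((1-ab)^{-1} a) / X(a) = (1 - ab)^{-1/2}."""
    return math.exp(0.5 * b * x ** 2 / (1.0 - a * b)) / math.sqrt(1.0 - a * b)

a, x0 = 0.5, 0.9                      # hypothetical sample values
pairs = [(T_gaussian_numeric(a, b, x0), T_gaussian_closed(a, b, x0))
         for b in (0.6, -0.4)]        # one positive and one negative b with a * b < 1
```

The negative value of `b` corresponds to the situation used later in the proof of Lemma 3.2, where the quadratic form in the exponent has a minus sign.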
Proceeding in this direction, we can extend the operator $T(A)$ to the following class of functions. For a symmetric linear positive operator $A$, we set
$$\mathcal{F}(A) = \left\{ F(x) = \exp\left( \tfrac{1}{2} (x, Bx) \right) G(x) \ \Big|\ B \in \mathcal{B}(A),\ G \in \mathcal{F} \right\}. \tag{69}$$

Proposition 3.3. For a given symmetric linear positive operator $A$, the operator $T(A)$ defined on $\mathcal{F}$ by the expression (63) can be continuously extended to $\mathcal{F}(A)$ as follows:
$$(T(A)F)(x) = \triangle(A, B) \exp\left( \frac{1}{2} (x, B(1 - AB)^{-1}x) \right) \left( T((1 - AB)^{-1}A)\, G \right)\!\left( (1 - AB)^{-1}x \right), \tag{70}$$
with
$$\triangle(A, B) = \frac{X((1 - AB)^{-1}A)}{X(A)}. \tag{71}$$
Proof. For $G \equiv 1$, (70) reduces to (68). Now let $G(x)$ be a polynomial. Then the integral on the right-hand side of (63) with $F(x) = \exp(\frac{1}{2}(x, Bx))\, G(x)$ and $B \in \mathcal{B}(A)$ converges. Applying the representation (66), we obtain
$$(T(A)F)(x) = \triangle(A, B) \exp\left( \frac{1}{2} (x, B(1 - AB)^{-1}x) \right) \frac{1}{X((1 - AB)^{-1}A)} \int_{\mathbb{R}^M} G(y) \exp\left( -\frac{1}{2} (y - Wx, (A^{-1} - B)(y - Wx)) \right) dy$$
$$= \triangle(A, B) \exp\left( \frac{1}{2} (x, B(1 - AB)^{-1}x) \right) \left( T((1 - AB)^{-1}A)\, G \right)(Wx),$$
where the integral form (63) is used as a definition, and $W$ is given by (67). Thus (70) holds for $G$ being a polynomial. Now we apply Proposition 3.2 and obtain the stated extension.

Let us consider $F \in \mathcal{F}(A)$; then, for every $t \in (0, 1]$, $F \in \mathcal{F}(tA)$. Having this in mind, we set
$$F_t(x) = (T(tA)F)(x), \quad t \in (0, 1]; \qquad F_0(x) = F(x). \tag{72}$$

Proposition 3.4. For every $t \in [0, 1]$, $F_t(x)$ is differentiable with respect to $t$, and obeys the following equation:
$$\frac{\partial F_t(x)}{\partial t} = \frac{1}{2} \sum_{j,j'=1}^{M} A_{jj'}\, \frac{\partial^2 F_t(x)}{\partial x_j\, \partial x_{j'}}; \qquad F_0(x) = F(x). \tag{73}$$
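The proof follows easily from definition (62) of $T(A)$. Now let us return to the measures (46)–(49). We set
$$\Phi^{(N)}_\Lambda(\eta) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( \sum_{l,n} \eta_{ln}\, \xi_{ln} \right) d\nu^{(N)}_{\beta,\Lambda}(\xi),$$
$$F^{(N)}_\Lambda(\eta) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( \sum_{l,n} \eta_{ln}\, \xi_{ln} \right) \bigotimes_{l} d\chi^{(N)}_\beta(\xi_l), \qquad \eta \in \mathbb{R}^{N|\Lambda|}. \tag{74}$$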
Taking into account that the one-particle potential $V$ satisfies the requirements (V1), (V2), we conclude that both $\Phi^{(N)}_\Lambda$ and $F^{(N)}_\Lambda$ belong to $\mathcal{F}$, where $M$ used in the definition of the latter is put equal to $N|\Lambda|$. Moreover, the expression (48) implies
$$\Phi^{(N)}_\Lambda(\eta) = \frac{1}{Z^{(N)}_{\beta,\Lambda}} \left( T(I_\Lambda)\, F^{(N)}_\Lambda \right)(\eta), \tag{75}$$
where the operator $I_\Lambda : \mathbb{R}^{|\Lambda|N} \to \mathbb{R}^{|\Lambda|N}$ is defined by its matrix elements
$$I_\Lambda(ln, l'n') = J^\Lambda_{ll'}\, \delta_{nn'} \tag{76}$$
and hence is positive, symmetric, and periodic. The linear operator $C : \mathbb{R}^{|\Lambda|N} \to \mathbb{R}^{|\Lambda|N}$ is said to be periodic if for its matrix elements one has $C(ln, l'n') = C((l + \lambda)(n + \nu), (l' + \lambda)(n' + \nu))$ with arbitrary $\lambda \in \Lambda$ and $\nu \in \{0, 1, \dots, N-1\}$, where the additions are modulo $|\Lambda|$ and $N$ respectively. Since $\Lambda$ and $N$ are fixed, we simplify our notations by omitting these symbols in $\Phi$, $F$, $Z_\beta$ and $I$. Now let $C$ be a periodic linear symmetric operator belonging to $\mathcal{B}(I)$. Write
$$F(\eta) = \exp\left( \frac{1}{2} (\eta, C\eta) \right) G(\eta). \tag{77}$$
For given $C$, this can be considered as a definition of $G(\eta)$. It is not difficult to check that for each positive symmetric $I$, $C \in \mathcal{B}(I)$ implies $-C \in \mathcal{B}((1 - IC)^{-1}I)$. This yields $G(\eta) \in \mathcal{F}((1 - IC)^{-1}I)$ (due to $F \in \mathcal{F}$). We set for $t \in [0, 1]$ (see (72)):
$$\exp R(t, \eta) = \left( T(t(1 - IC)^{-1}I)\, G \right)(\eta). \tag{78}$$
Inserting $F(\eta)$ as given by (77) into (75) and applying the identity (70), one obtains
$$\Phi(\eta) = \frac{1}{Z_\beta}\, \triangle(I, C) \exp\left\{ \frac{1}{2} (\eta, C(1 - IC)^{-1}\eta) + R(1, (1 - IC)^{-1}\eta) \right\}. \tag{79}$$
Now let us consider the expression (78) having in mind Proposition 3.4. This yields
$$\dot{R}(t, \eta) := \frac{\partial R(t, \eta)}{\partial t} = \frac{1}{2} \sum_{ln, l'n'} [(1 - IC)^{-1}I](ln, l'n') \left[ \frac{\partial^2 R(t, \eta)}{\partial \eta_{ln}\, \partial \eta_{l'n'}} + \frac{\partial R(t, \eta)}{\partial \eta_{ln}}\, \frac{\partial R(t, \eta)}{\partial \eta_{l'n'}} \right]. \tag{80}$$
It is not difficult to show that for all $t \in [0, 1]$, $R(t, \eta)$ and $\dot{R}(t, \eta)$ are even functions of $\eta$, infinitely differentiable at the point $\eta = 0$. We set
$$R_{2k}(t \mid l_1 n_1, l_2 n_2, \dots, l_{2k} n_{2k}) = \left( \frac{\partial^{2k} R(t, \eta)}{\partial \eta_{l_1 n_1} \cdots \partial \eta_{l_{2k} n_{2k}}} \right)_{\eta = 0},$$
$$\dot{R}_{2k}(t \mid l_1 n_1, \dots, l_{2k} n_{2k}) = \frac{\partial}{\partial t} R_{2k}(t \mid l_1 n_1, \dots, l_{2k} n_{2k}).$$
Then we obtain from (80):
$$\dot{R}_2(t \mid ln, l'n') = \frac{1}{2} \sum_{l_1 n_1, l_2 n_2} [(1 - IC)^{-1}I](l_1 n_1, l_2 n_2) \left\{ R_4(t \mid l_1 n_1, l_2 n_2, ln, l'n') + 2 R_2(t \mid l_1 n_1, ln)\, R_2(t \mid l_2 n_2, l'n') \right\}. \tag{81}$$
Taking into account the periodic boundary conditions with respect to $l \in \Lambda$, as well as to $n \in \{0, 1, \dots, N-1\}$, one deduces that
$$\sum_{ln, l'n'} R_2(t \mid ln, l'n') = |\Lambda| N \sum_{ln} R_2(t \mid ln, l'n'). \tag{82}$$
We set
$$R_2(t) = \frac{1}{|\Lambda| N} \sum_{ln, l'n'} R_2(t \mid ln, l'n'), \tag{83}$$
and obtain from (81), taking into account (82),
$$\dot{R}_2(t) = K R_2^2(t) + Q(t), \tag{84}$$
where
$$K = \frac{1}{|\Lambda| N} \sum_{ln, l'n'} [(1 - IC)^{-1}I](ln, l'n'), \tag{85}$$
$$Q(t) = \frac{1}{2|\Lambda| N} \sum_{ln, l'n'} [(1 - IC)^{-1}I](ln, l'n') \sum_{l_1 n_1, l_2 n_2} R_4(t \mid ln, l'n', l_1 n_1, l_2 n_2). \tag{86}$$
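The initial condition for $R_2(t)$ follows from (78) and (77):
$$R_2(0) = \frac{1}{|\Lambda| N} \sum_{ln, l'n'} G_2(ln, l'n') = \frac{1}{|\Lambda| N} \left\{ \sum_{ln, l'n'} F_2(ln, l'n') - \sum_{ln, l'n'} C(ln, l'n') \right\}, \tag{87}$$
where
$$G_2(ln, l'n') = \left( \frac{\partial^2 \log G(\eta)}{\partial \eta_{ln}\, \partial \eta_{l'n'}} \right)_{\eta = 0}, \tag{88}$$
$$F_2(ln, l'n') = \left( \frac{\partial^2 \log F(\eta)}{\partial \eta_{ln}\, \partial \eta_{l'n'}} \right)_{\eta = 0}. \tag{89}$$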
The operator C has been restricted to be periodic and to belong to B(I). Now we impose more essential restrictions as follows:
$$\sum_{ln, l'n'} C(ln, l'n') = \sum_{ln, l'n'} F_2(ln, l'n'), \tag{90}$$
$$C = g I^{-1}, \quad g < 1, \tag{91}$$
where $I^{-1}$ is the inverse operator of $I$ (it exists due to the positivity of $J^\Lambda$), and $g$ is a positive quantity (depending on $\Lambda$ and $N$) to be determined from (90). The possibility of choosing it less than one will be shown to be produced by (23).

Lemma 3.2. Let the operator $C$ satisfy (90), (91). Then $R_2(t)$ given by (83) is negative for all $t \in (0, 1]$. In particular,
$$R_2(1) < 0. \tag{92}$$

Proof. Taking into account (90), one gets $R_2(0) = 0$. The condition (91) yields in turn
$$[(1 - IC)^{-1}I](ln, l'n') = \frac{1}{1 - g}\, J^\Lambda_{ll'}\, \delta_{nn'} \ge 0, \tag{93}$$
due to (33) and (D2). Let us prove the assertion of this lemma assuming R4 (t | l1 n1 , l2 n2 , ln, l0 n0 ) < 0
(94)
for $t \in [0,1]$. Then $Q(t)$ given by (86) is negative for $t \in [0,1]$. Taking into account (93), we obtain that $K$ given by (85) is positive. Thus the function $R_2(t)$ possesses the following properties: it is equal to zero at $t = 0$, has a negative derivative at $t = 0$, and has a negative derivative at each $t$ such that $R_2(t) = 0$. The latter can be deduced from (84). Clearly, each continuous function of $t$ possessing these properties becomes zero at most once. In our case this occurs at the point $t = 0$. Thus this function preserves its sign for all $t \in (0,1]$, which means that it is negative (it decreases at $t = 0$) for these $t$, which was to be proved.

Now let us prove (94). We return to (78) and insert in it $G$ as given by (77), that is
$$G(\eta) = \exp\Big(-\frac12(\eta, C\eta)\Big) F(\eta).$$
Taking into account (91), we obtain then
$$\exp R(t,\eta) = \Big[\frac{1-g}{1-(1-t)g}\Big]^{N|\Lambda|/2} \exp\Big(-\frac12\,\frac{g(1-g)}{1-(1-t)g}\,(\eta, I^{-1}\eta)\Big) \Big[T\Big(\frac{t}{1-(1-t)g}\, I\Big) F\Big]\Big(\frac{1-g}{1-(1-t)g}\,\eta\Big). \tag{95}$$
We set $h_t = (1-g)[1-(1-t)g]^{-1}$ and
$$\exp\Psi(t,\eta) = \Big(T\Big(\frac{t h_t}{1-g}\, I\Big) F\Big)(\eta). \tag{96}$$
Then (95) can be rewritten as
$$\exp R(t,\eta) = h_t^{N|\Lambda|/2} \exp\Big(-\frac12\, g h_t\,(\eta, I^{-1}\eta) + \Psi(t, h_t\eta)\Big),$$
hence
$$R_4(t \mid l_1 n_1, l_2 n_2, ln, l'n') = h_t^{-4}\,\Psi_4(t \mid l_1 n_1, l_2 n_2, ln, l'n'), \tag{97}$$
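where $\Psi_4$ is defined in the same way as $R_4$. Recalling definition (74) of $F(\eta) = F_\Lambda^{(N)}(\eta)$, we obtain from (96),
$$\exp\Psi(t,\eta) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\Big(\sum_{ln}\eta_{ln}\xi_{ln} + \frac12\, t h_t \sum_{l,l',n} J^\Lambda_{ll'}\,\xi_{ln}\xi_{l'n} + \alpha_N \sum_{ln}\xi_{ln}\xi_{l\,n+1}\Big) \prod_{ln} d\rho_\beta^{(N)}(\xi_{ln}), \tag{98}$$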
where we have also taken into account (46). Therefore, $\Psi_4$ is the fourth semi-invariant (of the type (57)) of the ferromagnetic Gibbs measure ($J^\Lambda_{ll'} \ge 0$, $\alpha_N > 0$) with the initial measure $\rho_\beta^{(N)}$ belonging to the BFS class. Then $\Psi_4$ is negative (see Proposition 3.1). The latter implies (94), which completes the proof.

Now let us return to (79). We set
$$U_\Lambda^{(N)} = \frac{1}{N|\Lambda|}\sum_{ln,l'n'} \Phi_2(ln,l'n'), \tag{99}$$
where
$$\Phi_2(ln,l'n') = \frac{\partial^2 \log\Phi(\eta)}{\partial\eta_{ln}\,\partial\eta_{l'n'}}\bigg|_{\eta=0}. \tag{100}$$
Then (79) gives, with $C = gI^{-1}$,
$$U_\Lambda^{(N)} = \frac{g}{(1-g)\|J^\Lambda\|} + \frac{1}{(1-g)^2}\, R_2(1), \tag{101}$$
where
$$\|J^\Lambda\| = \sum_{l\in\Lambda} J^\Lambda_{ll'} \tag{102}$$
is given by (35). To obtain (101), we have used (76). But Lemma 3.2 yields (92), thus
$$U_\Lambda^{(N)} < \frac{g}{(1-g)\|J^\Lambda\|}. \tag{103}$$
The zeroth initial condition for (84) is produced by relation (90). But (74) implies
$$F(\eta) = \prod_l f(\eta_l) \tag{104}$$
with
$$f(\eta_l) = \int_{\mathbb{R}^N} \exp(\eta_l \xi_l)\, d\chi_\beta^{(N)}(\xi_l). \tag{105}$$
Taking into account (105), (104), and (89), we get
$$F_2(ln,l'n') = \delta_{ll'} f_2(n,n') = \delta_{ll'}\,\frac{\partial^2 \log f(\eta_l)}{\partial\eta_{ln}\,\partial\eta_{ln'}}\bigg|_{\eta_l=0} = \delta_{ll'} \int_{\mathbb{R}^N} \xi_{ln}\xi_{ln'}\, d\chi_\beta^{(N)}(\xi_l). \tag{106}$$

Proof of Lemma 2.2. Comparison of (99), (100) and (50) yields
$$U_\Lambda^{(N)} = \frac{1}{N|\Lambda|}\sum_{ln,l'n'} \langle \xi_{ln}\xi_{l'n'}\rangle_\Lambda^{(N)}. \tag{107}$$
Then taking into account definition (55), one gets
$$U_\Lambda^{(N)} = \frac{1}{\beta}\sum_{n,n'} S^{0,N}_{2,\Lambda}(n,n')\Big(\frac{\beta}{N}\Big)^2.$$
The latter can be rewritten as follows:
$$U_\Lambda^{(N)} = \frac{1}{\beta}\sum_{n,n'} S^{0,N}_{2,\Lambda}(n,n')\,\Delta\tau\,\Delta\tau'. \tag{108}$$
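The right-hand side of (108) tends to
$$\frac{1}{\beta}\int\!\!\int_{S_\beta^2} \Gamma^0_{2,\Lambda}(\tau,\tau')\,d\tau\,d\tau' = \frac{1}{\beta}\int\!\!\int_{S_\beta^2} \Gamma^0_\Lambda(\tau,\tau')\,d\tau\,d\tau' = U^0_\Lambda$$
as $N \to \infty$. Here we have taken into account (16) and Lemma 3.1. This means that the left-hand side of (103) tends to $U^0_\Lambda$ when $N \to \infty$. In order to obtain the limit of the right-hand side of (103), we use (91), (90), and obtain
$$g = \frac{\sum_{ln,l'n'} F_2(ln,l'n')}{\sum_{ln,l'n'} (I^{-1})(ln,l'n')},$$
and then by means of (106) and (45), we have
$$\lim_{N\to\infty} g = \frac{\|J^\Lambda\|}{\beta}\lim_{N\to\infty}\int_{\mathbb{R}^N}\Big(\sum_n \omega(\tau_n)\,\Delta\tau\Big)^2 d\chi_\beta^{(N)}(\xi) = \frac{\|J^\Lambda\|}{\beta}\int\Big(\int_{S_\beta}\omega(\tau)\,d\tau\Big)^2 d\chi_\beta(\omega(\cdot)) = \frac{\|J^\Lambda\|}{\beta}\int\!\!\int_{S_\beta^2}\Gamma(\tau,\tau')\,d\tau\,d\tau' = \|J^\Lambda\|\,U < 2\|D\|\,U < 1,$$
where we have used (40) as well as the estimate $\|J^\Lambda\| < 2\|D\|$ which follows from (35) and (D2), (D4). Meanwhile, $\Gamma(\tau,\tau')$ means $\Gamma^\delta_\Lambda(\tau,\tau')$ with one-point $\Lambda$, and $U$ is the same as in Lemma 2.2. Then passing to the limit $N \to \infty$ in (103), we obtain
$$U^0_\Lambda \le \frac{U}{1 - U\|J^\Lambda\|} \le \frac{U}{1 - 2U\|D\|},$$
where we have taken into account definition (35) which yields
$$\lim_{\Lambda\nearrow\mathbb{L}} \|J^\Lambda\| = 2\|D\|,$$
when $\Lambda \nearrow \mathbb{L}$.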
Acknowledgement. One of the authors (Yuri Kozitsky) gratefully acknowledges the partial support of the International Science Foundation under the Grants UCN 000, UCN 200 as well as of Deutscher Akademischer Austauschdienst (Referat 325). The financial support of the DFG (Research Project AL 214 / 9- 2) is also gratefully acknowledged by the authors.
References
1. Albeverio, S., Høegh-Krohn, R.: Homogeneous Random Fields and Quantum Statistical Mechanics. J. Funct. Anal. 19, 242–279 (1975)
2. Barbulyak, V.S., Kondratiev, Yu.G.: A Criterion for the Existence of Periodic Gibbs States of Quantum Lattice Systems. Selecta Math. (formerly Sov.) 12, 25–35 (1993)
3. Berezansky, Yu.M., Kondratiev, Yu.G.: Spectral Methods in Infinite Dimensional Analysis. Dordrecht: Kluwer Academic Publishers, 1994
4. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. New York–Heidelberg–Berlin: Springer Verlag, 1979
5. Bruce, A.D., Cowley, R.A.: Structural Phase Transitions. London: Taylor and Francis Ltd, 1981
6. Fernandez, R., Fröhlich, J., Sokal, A.: Random Walks, Critical Phenomena and Triviality in Quantum Field Theory. Berlin–Heidelberg–New York–London–Paris–Tokyo–Hong Kong: Springer Verlag, 1992
7. Globa, S.A., Kondratiev, Yu.G.: The Construction of Gibbs States of Quantum Lattice Systems. Selecta Math. Sov. 9, 297–307 (1990)
8. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer Verlag, 1966
9. Schneider, T., Beck, H., Stoll, E.: Quantum Effects in an n-component Vector Model for Structural Phase Transitions. Phys. Rev. B13, 1123–1130 (1976)
10. Simon, B.: The P(ϕ)_2 Euclidean (Quantum) Field Theory. Princeton, NJ: Princeton Univ. Press, 1974
11. Taylor, B.A.: Some Locally Convex Spaces of Entire Functions. In: Proceedings of Symposia of Pure Mathematics, Vol. XI, Providence, RI: AMS, 1968
12. Tibballs, J.E., Nelmes, R.J., McIntyre, G.J.: The Crystal Structure of Tetragonal KH2PO4 and KD2PO4 as a Function of Temperature and Pressure. J. Phys. C: Solid State Phys. 15, 37–58 (1982)
13. Verbeure, A., Zagrebnov, V.: Phase Transitions and Algebra of Fluctuation Operators in Exactly Soluble Model of a Quantum Anharmonic Crystal. J. Stat. Phys. 69, 329–359 (1992)
14. Verbeure, A., Zagrebnov, V.: No-Go Theorem for Quantum Structural Phase Transitions. J. Phys. A: Math. Gen. 28, 5415–5421 (1995)

Communicated by Ya. G. Sinai
Commun. Math. Phys.194, 513 – 539 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Lump Dynamics in the $CP^1$ Model on the Torus
J. M. Speight
Department of Mathematics, University of Texas at Austin, Austin, Texas 78712, USA. E-mail: [email protected]
Received: 28 July 1997 / Accepted: 3 November 1997
Abstract: The topology and geometry of the moduli space, $M_2$, of degree 2 static solutions of the $CP^1$ model on a torus (spacetime $T^2 \times \mathbb{R}$) are studied. It is proved that $M_2$ is homeomorphic to the left coset space $G/G_0$, where $G$ is a certain eight-dimensional noncompact Lie group and $G_0$ is a discrete subgroup of order 4. Low energy two-lump dynamics is approximated by geodesic motion on $M_2$ with respect to a metric $g$ defined by the restriction to $M_2$ of the kinetic energy functional of the model. This lump dynamics decouples into a trivial "centre of mass" motion and nontrivial relative motion on a reduced moduli space. It is proved that $(M_2, g)$ is geodesically incomplete and has only finite diameter. A low dimensional geodesic submanifold is identified and a full description of its geodesics obtained.
1. Introduction The CP 1 model in (2 + 1) dimensions has long been popular in theoretical physics, both for its condensed matter applications, and as a simple nonlinear field theory possessing topological solitons, usually called lumps. The Euler-Lagrange equation of the system is not integrable, so there is no hope of solving the multilump initial value problem exactly. Numerical simulations of the model have revealed a rich diversity in the lump dynamics, which includes not only the now-familiar 90◦ scattering in head on collisions, but also lump expansion, collapse and singularity formation. It is an interesting and highly nontrivial problem to understand the mechanisms underlying this complicated dynamics. Such understanding has been afforded in similar field theories (those of Bogomol’nyi type) by the geodesic approximation of Manton [17, 1, 22]. Here the low-energy dynamics of n solitons is approximated by geodesic motion in the moduli space of static n-soliton solutions, Mn , the metric g being defined by the restriction to Mn of the kinetic energy functional of the field theory. So understanding n-soliton dynamics is
reduced to studying the topology and geometry of (Mn , g), a finite dimensional, smooth Riemannian manifold. Several authors have pursued this programme for the CP 1 model in R2+1 with standard boundary conditions [28, 16], concentrating on the case of two lumps. There is, however, a technical problem: the metric on M2 does not, strictly speaking, exist, that is, at every point p ∈ M2 some vectors in Tp M2 are assigned infinite length by the kinetic energy functional (they are “non-normalizable zero modes”). These divergences stem from the noncompactness of space R2 . They are essentially due to the existence in the general static solution of scale and orientation parameters which are frozen in the geodesic approximation because to alter them, no matter how slowly, costs infinite kinetic energy. This is only possible because the kinetic energy is an integral over a noncompact space. One can study geodesic motion orthogonal to the bad directions, or one can remove the problem entirely by studying the model on a compact space [24]. In this paper we impose square periodic spatial boundary conditions on the model, or, equivalently, place it on a flat torus. The aim is to establish results concerning the topology and geometry of (M2 , g), and to describe their implications for low-energy two lump dynamics on the torus, within the framework of the geodesic approximation. The work is arranged as follows. In Sect. 2 we introduce the CP 1 model on the torus, and review some relevant background material. In particular we use a standard argument of Belavin and Polyakov to show that M2 is the space of degree 2 elliptic functions. In Sect. 3 we equip M2 with a natural metric topology, and prove that it is homeomorphic to the left coset space G/G0 , where G is the Lie group P SL(2, C) × T 2 and G0 is a discrete subgroup of order 4. 
This allows one to give M2 a natural differentiable structure (that of the smooth manifold G/G0 ) and provides M2 with a good global parametrization, using the covering space G. In Sect. 4 this parametrization is used to survey the degree 2 static solutions and describe their energy density distributions. It is found that exceptionally symmetric solutions exist with four, rather than two identical energy lumps, as well as the expected two-lump and annular solutions. In Sect. 5 the metric g on M2 is defined, and some of its properties discussed. We lift g to obtain ge, the metric on the covering space G, and show that ge is a product metric on P SL(2, C)×T 2 . In this way, we show that lump dynamics in the geodesic approximation decomposes into a trivial “centre of mass” motion, the T 2 part, and a nontrivial relative motion, the P SL(2, C) part. So attention may be restricted to geodesic motion on a reduced covering space, without loss of generality. In Sect. 6 it is proved that (M2 , g) is geodesically incomplete by finding an explicit, maximally extended geodesic, and showing that it has only finite length. It follows that lumps can collapse to form singularities in finite time. In Sect. 7 a 2-dimensional totally geodesic submanifold is identified by computing the fixed point set of a discrete group of isometries. The geodesics of this submanifold and their associated lump motions are described. In Sect. 8 it is proved that (M2 , g) has only finite diameter, despite its noncompactness. One should therefore visualize it as having only finite extent. In consequence, all static solutions are close to the end of moduli space, that is, close to collapse. In Sect. 9 some concluding remarks are presented. Two 3-dimensional totally geodesic submanifolds are identified, and it is shown that 90◦ head on scattering must occur in the model under certain conditions. The present work is summarized, and extensions suggested.
2. The $CP^1$ Model on the Torus

The field, a map from spacetime to $CP^1$, $W : \mathbb{R} \times T^2 \to CP^1$, will throughout be considered complex valued, so that we are using an inhomogeneous coordinate on $CP^1$, or equivalently, a stereographic coordinate on $S^2$, exploiting the well known diffeomorphism between $CP^1$ and the two sphere. The metric and volume form on the codomain in terms of such a coordinate are, respectively,
$$h = \frac{4\, du\, d\bar{u}}{(1+|u|^2)^2}, \qquad \omega = \frac{2i\, du \wedge d\bar{u}}{(1+|u|^2)^2}. \tag{1}$$
It is convenient to use a complex coordinate on physical space also, by identifying $T^2$ with $\mathbb{C}/\Omega$, where $\Omega$ is the period module, which we choose, for concreteness, to be
$$\Omega = \{n + im : n, m \in \mathbb{Z}\}. \tag{2}$$
So we impose square periodic boundary conditions of unit period on $W$. Position in $T^2$ is parametrized by position $z = x + iy$ in the covering space $\mathbb{C}$. The metric on spacetime is $\eta = dt^2 - dx^2 - dy^2$, and the action functional of the field theory is the standard harmonic map functional for mappings $(\mathbb{R} \times T^2, \eta) \to (CP^1, h)$, that is,
$$S[W] = \int_{\mathbb{R}\times T^2} \frac{\partial_\mu W\, \partial_\nu \bar{W}\, \eta^{\mu\nu}}{(1+|W|^2)^2}. \tag{3}$$
This may be written in a fashion reminiscent of Lagrangian mechanics, $S = \int dt\,(T - V)$, upon definition of the kinetic and potential energy functionals,
$$T = \int_{T^2} \frac{|\dot{W}|^2}{(1+|W|^2)^2}, \tag{4}$$
$$V = \int_{T^2} \frac{1}{(1+|W|^2)^2}\left(\left|\frac{\partial W}{\partial x}\right|^2 + \left|\frac{\partial W}{\partial y}\right|^2\right). \tag{5}$$
The configuration space $Q$ is $C^1(T^2, S^2)$, the space of continuously differentiable maps $T^2 \to S^2$ (note that $V[W]$ is finite for all $W \in Q$ by compactness of $T^2$). By Hopf's Degree Theorem [11], $Q$ decomposes into disjoint homotopy classes labelled by topological degree $n$, an integer,
$$Q = \coprod_{n\in\mathbb{Z}} Q_n. \tag{6}$$
Physically, n is interpreted as the “lump number” of the configuration, the excess of lumps over antilumps. Static solutions are extremals of V , that is harmonic maps T 2 → S 2 . The space of minimal energy static solutions in Qn is called the degree n moduli space, denoted Mn . A well-known argument due to Belavin and Polyakov [4] shows that Mn (n assumed nonnegative) is in fact the space of degree n elliptic functions, that is, holomorphic maps T 2 → S2:
$$0 \le \int_{T^2} \frac{|\partial_x W + i\partial_y W|^2}{(1+|W|^2)^2} = V[W] - \frac12\int_{T^2} W^*\omega = V[W] - \frac12 \mathrm{Vol}(S^2)\, n = V[W] - 2\pi n, \tag{7}$$
where $W^*\omega$ is the pullback of the volume form on $S^2$ by $W$. It follows that
$$V|_{Q_n} \ge 2\pi n \tag{8}$$
with equality if and only if $(\partial_x + i\partial_y)W = 0$, which is the Cauchy-Riemann equation for $W$. So if there exist degree $n$ elliptic functions, then $M_n$ is the space of such functions, since any other function has higher energy. If there are no such functions, then $M_n$ is empty, for the energy bound (8) is optimal. To see this, consider the following family of functions. For $\epsilon > 0$ small, define $W_\epsilon \in Q_n$ so that
$$W_\epsilon(z) = \begin{cases} \epsilon^{2n}/z^n & |z| < \epsilon \\ 0 & |z| > 2\epsilon \end{cases} \tag{9}$$
interpolating between these two regions with a smooth cutoff function. This consists of a flat-space degree $n$ lump of width $\epsilon^2$ cut off on a disc of radius $\epsilon$. Since $W_\epsilon$ is not exactly holomorphic, $V[W_\epsilon] > 2\pi n$, but the excess can be made arbitrarily small by choosing $\epsilon$ small enough. It is easily proved that there are no unit degree elliptic functions [13], so we conclude that $M_1 = \emptyset$, and the simplest nontrivial moduli space is $M_2$.

3. The Degree Two Moduli Space

Weierstrass explicitly constructed a degree 2 elliptic function $\wp$, and it is on this that we base our parametrization of $M_2$. The partial fraction representation of $\wp$ is
$$\wp(z) = \frac{1}{z^2} + \sum_{\nu\in\Omega\setminus\{0\}} \left[\frac{1}{(z-\nu)^2} - \frac{1}{\nu^2}\right]. \tag{10}$$
Several properties of $\wp$ will be needed, some of which follow easily from Eq. (10), others of which are less straightforward. A comprehensive treatment can be found in [15]. Specifically:
$$\wp(iz) = -\wp(z), \qquad \wp(-z) = \wp(z), \qquad \wp(\bar{z}) = \overline{\wp(z)}, \qquad \wp'(z)^2 = 4\wp(z)\big(\wp(z)^2 - e_1^2\big), \tag{11}$$
where $e_1 = \wp(\tfrac12)$ is a real number, approximately 6.875. It follows that $\wp$ is real on the boundary and central cross of the unit square, and purely imaginary on the diagonals of the unit square (see Fig. 1) and that $\wp$ has a double pole at 0 and a double zero at $(1+i)/2$. Given one holomorphic function $\wp \in M_2$ one can obtain others by composing on the right with a rigid translation of $T^2$ and on the left with a Möbius transformation of $S^2$
[Figure: the unit square with corners $0$, $1$, $i$, $1+i$; marked points $0$ (double pole), $s_0$ (double zero), $s_1$ and $s_2$.]
Fig. 1. The fundamental domain of the Weierstrass $\wp$ function: $\wp$ is real on the solid lines and imaginary on the dashed lines. The four double valency points are marked by circles
since these preserve holomorphicity and degree. In terms of a stereographic coordinate $W$ on $S^2$, Möbius transformations are unit degree rational maps [23]
$$W \mapsto \frac{a_{11}W + a_{12}}{a_{21}W + a_{22}}, \tag{12}$$
where $a_{ij} \in \mathbb{C}$ and $a_{11}a_{22} \ne a_{12}a_{21}$, else the degree degenerates to zero. One may collect the parameters $a_{ij}$ into a matrix $L \in GL(2,\mathbb{C})$ and denote the action of the matrix $L$ on $S^2$ defined in Eq. (12) by $W \mapsto L \odot W$. (The constraint $a_{11}a_{22} \ne a_{12}a_{21}$ is now $\det L \ne 0$, ensuring that $L$ is invertible, and hence in $GL(2,\mathbb{C})$.) Composition of Möbius transformations coincides with matrix multiplication,
$$L_2 \odot (L_1 \odot W) \equiv (L_2 L_1) \odot W. \tag{13}$$
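The composition rule (13), and the scale invariance discussed next, can be spot-checked numerically. The following is a minimal sketch only: the helper names `mobius` and `matmul2` and the sample matrices are illustrative assumptions, not part of the text.

```python
# Sketch: Mobius action of a 2x2 complex matrix on an inhomogeneous coordinate W.
# `mobius` and `matmul2` are illustrative helper names, not taken from the paper.

def mobius(L, W):
    """Apply W -> (a11 W + a12) / (a21 W + a22) for L = ((a11, a12), (a21, a22))."""
    (a11, a12), (a21, a22) = L
    return (a11 * W + a12) / (a21 * W + a22)

def matmul2(A, B):
    """Ordinary product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Arbitrary invertible sample matrices and an arbitrary sample point.
L1 = ((1 + 2j, 0.5), (-1j, 2.0))
L2 = ((0.3, 1j), (2.0, 1 - 1j))
W = 0.7 - 0.2j

lhs = mobius(L2, mobius(L1, W))      # L2 acting after L1
rhs = mobius(matmul2(L2, L1), W)     # single action of the matrix product L2 L1

# Rescaling a matrix by a nonzero complex number leaves the map unchanged.
L1_scaled = tuple(tuple(3j * a for a in row) for row in L1)
scaled = mobius(L1_scaled, W)
```

Here `lhs` and `rhs` should agree up to rounding, and `scaled` should agree with `mobius(L1, W)`.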
Note, however, that this Möbius representation of $GL(2,\mathbb{C})$ is not faithful since any pair of matrices $L, L' \in GL(2,\mathbb{C})$ such that $L = \lambda L'$ for some $\lambda \in \mathbb{C}$ generate the same Möbius transformation. Denoting this scale equivalence $\sim$ we identify the Möbius group with $GL(2,\mathbb{C})/\!\sim$, each equivalence class of which may be represented by a unimodular matrix ($\det L = 1$). If $L$ is unimodular then so is $-L$, so $SL(2,\mathbb{C})$ is a double cover of $GL(2,\mathbb{C})/\!\sim$, and the Möbius group is identified with $SL(2,\mathbb{C})/\mathbb{Z}_2$, usually denoted $PSL(2,\mathbb{C})$, which is easily seen to be six dimensional (the P stands for "projective"). For the sake of brevity, let $G$ denote the eight dimensional Lie group $PSL(2,\mathbb{C}) \times T^2$ with the group product $(L_1, s_1)\cdot(L_2, s_2) = (L_1 L_2, s_1 + s_2)$. We can define a $G$-action on $M_2$, $G \times M_2 \to M_2$ such that $(g, W) \mapsto W_g$, where
$$W_{(L,s)}(z) = L \odot W(z - s). \tag{14}$$
We claim that this action is transitive, since the $G$-orbit of $\wp$ exhausts $M_2$.

Lemma 1. For each $W \in M_2$ there exists $(L, s) \in G$ such that $W(z) = L \odot \wp(z - s)$.

Proof. This may be established in several ways [14, 8, 2]. One economical, instructive (and apparently novel) argument appeals to the Riemann-Hurwitz formula, which constrains the number and valency of multivalent points of a holomorphic mapping between compact Riemann surfaces given their genera and the degree of the map [3]. In the case of a degree 2 holomorphic map from $T^2$ to $S^2$ the formula states that any such function must have exactly 4 distinct double valency points (for $\wp$ these are $0$, $\tfrac12$, $\tfrac{i}{2}$ and $(1+i)/2$). Let $W \in M_2$ and $s \in T^2$ be one of its double valency points which is not a double pole. Then $(W(z+s) - W(s))^{-1}$ is another elliptic function with a double pole at 0, and no poles elsewhere (in the fundamental period square). Its Laurent expansion about 0 is
$$\frac{1}{W(z+s) - W(s)} = \frac{a_1}{z^2} + \frac{a_2}{z} + a_3 + \cdots, \tag{15}$$
where $a_1 \ne 0$. Consider $f(z) = [W(z+s) - W(s)]^{-1} - a_1\wp(z)$. This is an elliptic function with at most a simple pole at 0, and no poles elsewhere. Hence it has degree 1 or degree 0. But there are no degree 1 elliptic functions, and all degree 0 elliptic functions are constant, so $f(z) = c$. Defining
$$L' = \begin{pmatrix} a_1 W(s) & cW(s) + 1 \\ a_1 & c \end{pmatrix} \tag{16}$$
and $L = (\det L')^{-\frac12} L'$, it follows that $W(z) = L \odot \wp(z - s)$.
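The Riemann-Hurwitz count invoked in the proof can be written out explicitly. The following is a sketch under the standard statement of the formula for a nonconstant holomorphic map $f : X \to Y$ of degree $d$ between compact Riemann surfaces of genera $g_X$ and $g_Y$, with $e_p$ the valency (ramification index) of $f$ at $p$; this notation is introduced here for illustration and does not appear in the text.

```latex
\begin{align*}
2g_X - 2 &= d\,(2g_Y - 2) + \sum_{p \in X} (e_p - 1), \\
X = T^2,\ Y = S^2,\ d = 2:\qquad
0 &= 2\cdot(-2) + \sum_{p \in T^2} (e_p - 1)
\;\Longrightarrow\; \sum_{p \in T^2} (e_p - 1) = 4 .
\end{align*}
```

Since $1 \le e_p \le d = 2$, each multivalent point contributes exactly 1 to the sum, which gives four distinct double valency points.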
It is clear that for each $W \in M_2$ the associated $g \in G$ is not unique, since any one of the four distinct double valency points can be chosen as the basis of the construction of $(L, s)$ outlined above. Conversely, given a choice of $s \in T^2$, a double valency point of $W$, the construction of $L \in PSL(2,\mathbb{C})$ is unique, so for each $W \in M_2$ there are exactly four different $g \in G$ such that $W = \wp_g$. In particular, we can construct three alternative formulae for $\wp(z)$ based on the three double valency points $s_0 = (1+i)/2$, $s_1 = 1/2$ and $s_2 = i/2$ (the trivial formula $\wp(z)$ results from choosing $s = 0$, the fourth double valency point):
$$\wp(z) \equiv \frac{-e_1^2}{\wp(z - s_0)} \tag{17}$$
$$\equiv \frac{e_1[\wp(z - s_1) + e_1]}{\wp(z - s_1) - e_1} \tag{18}$$
$$\equiv \frac{-e_1[\wp(z - s_2) - e_1]}{\wp(z - s_2) + e_1}. \tag{19}$$
These are found by computing the Laurent expansions of $\wp$ about $s_i$ using (11), the formula for $\wp'$. It is convenient to treat $M_2$ as the $G$-orbit of $\wp/e_1$, rather than $\wp$. The identities (17), (18), (19) can be rewritten
$$\frac{\wp(z)}{e_1} \equiv U_i \odot \frac{\wp(z - s_i)}{e_1}, \qquad i = 0, 1, 2, \tag{20}$$
where $U_i$ are the following $SU(2)$ matrices:
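$$U_0 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad U_1 = \frac{i}{\sqrt2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad U_2 = \frac{i}{\sqrt2}\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}. \tag{21}$$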
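The value of $e_1$ and the identities (17)-(19) can be checked numerically from the partial fraction representation (10). The following is a rough sketch, assuming a square truncation of the lattice sum; the function name `wp`, the cutoff `N` and the test point are illustrative choices, and the truncation limits the accuracy to roughly three decimal places.

```python
# Sketch: truncated lattice sum for the Weierstrass function on the square
# lattice {n + i m}, used to spot-check e1 and the identities (17)-(19).
# The function name `wp` and the cutoff N are illustrative choices.

def wp(z, N=60):
    """Approximate wp(z) by summing (10) over lattice points with |n|, |m| <= N."""
    total = 1.0 / z**2
    for n in range(-N, N + 1):
        for m in range(-N, N + 1):
            if n == 0 and m == 0:
                continue
            nu = complex(n, m)
            total += 1.0 / (z - nu) ** 2 - 1.0 / nu**2
    return total

e1 = wp(0.5).real                      # the text quotes e1 as approximately 6.875
s0, s1, s2 = 0.5 + 0.5j, 0.5, 0.5j     # the three nonzero double valency points
z = 0.23 + 0.11j                       # arbitrary test point away from the poles

p = wp(z)
rhs17 = -e1**2 / wp(z - s0)
rhs18 = e1 * (wp(z - s1) + e1) / (wp(z - s1) - e1)
rhs19 = -e1 * (wp(z - s2) - e1) / (wp(z - s2) + e1)
```

Each of `rhs17`, `rhs18` and `rhs19` should agree with `p` to within the truncation error of the sum.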
So the stabilizer of ℘/e1 under the G-action is G0 = {(I, 0), (U0 , s0 ), (U1 , s1 ), (U2 , s2 )},
(22)
a discrete subgroup of $G$ isomorphic to the Viergruppe $V_4$, that is, abelian, with each element its own inverse (when checking this recall that $SL(2,\mathbb{C})$ matrices which differ only in sign are identified). The $SU(2)$ subgroup of $SL(2,\mathbb{C})$ acting on $S^2$ via $\odot$ is a double cover of $SO(3)$ acting on $S^2$ via the natural rotation action. So $G_0$ is a discrete group of simultaneous rotations of the target space $S^2$ and translations of the domain $T^2$. In fact a straightforward calculation shows that the $U_i$ are rotations of $S^2$ by $\pi$ about three orthogonal axes. For a general $W \in M_2$, then, if $W = (\wp/e_1)_g$, then $W = (\wp/e_1)_h$ if and only if $h$ is an element of the left coset $gG_0$, which we will henceforth denote $[g]$. So the mapping $\phi : G/G_0 \to M_2$, $\phi : [g] \mapsto \phi_{[g]}$, where
$$\phi_{[(L,s)]}(z) = L \odot \frac{\wp(z - s)}{e_1} \tag{23}$$
is well defined and bijective. It would seem natural, therefore, to identify $M_2$ with $G/G_0$ via $\phi$, but this only makes sense provided $\phi$ is a homeomorphism. Before proving that this is indeed the case, there are a few necessary preliminaries. Let $p : G \to G/G_0$ be the projection map $p(g) = [g]$. Since $G_0$ is a discrete subgroup of $G$, it acts freely and properly discontinuously on $G$, so the quotient space $G/G_0$ is, like $G$ itself, a Hausdorff, smooth manifold [27]. The pair $(G, p)$ is a covering space of $G/G_0$, and $p$ is a local homeomorphism. It is useful to define $\tilde\phi : G \to M_2$ such that $\tilde\phi = \phi \circ p$, that is, $\tilde\phi : g \mapsto \tilde\phi_g = (\wp/e_1)_g$.

The Lie group $SL(2,\mathbb{C})$ is noncompact, and is, in fact, homeomorphic to $\mathbb{R}^3 \times SU(2)$, as may be shown [19] by decomposing any $SL(2,\mathbb{C})$ matrix $L$ into the product $HU$, where $U \in SU(2)$ and $H$ is a positive definite, hermitian, unimodular matrix, this pair being unique. The space of $H$-matrices is homeomorphic to $\mathbb{R}^3$ and may be parametrized so that for all $\boldsymbol\lambda \in \mathbb{R}^3$,
$$H(\boldsymbol\lambda) = \sqrt{1 + |\boldsymbol\lambda|^2}\, I + \boldsymbol\lambda\cdot\boldsymbol\tau, \tag{24}$$
where $\boldsymbol\tau = (\tau_1, \tau_2, \tau_3)$ are the Pauli spin matrices. It follows that $PSL(2,\mathbb{C}) \cong \mathbb{R}^3 \times (SU(2)/\mathbb{Z}_2) \cong \mathbb{R}^3 \times SO(3)$, and so $G \cong \mathbb{R}^3 \times SO(3) \times T^2$.
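The claim that (24) produces a positive definite, hermitian, unimodular matrix for every real 3-vector can be confirmed directly. The following is a minimal sketch; the helper names `H` and `det2` and the sample vector are illustrative, and the standard Pauli matrices are assumed.

```python
# Sketch: H(lam) = sqrt(1 + |lam|^2) I + lam . tau for a real 3-vector lam,
# with tau the standard Pauli matrices. Helper names are illustrative.
import math

def H(lam):
    """Return H(lam) as a nested tuple of complex entries."""
    x, y, z = lam
    c = math.sqrt(1.0 + x * x + y * y + z * z)
    # lam . tau = [[z, x - iy], [x + iy, -z]]
    return ((complex(c + z, 0.0), complex(x, -y)),
            (complex(x, y), complex(c - z, 0.0)))

def det2(M):
    """Determinant of a 2x2 matrix stored as nested tuples."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

lam = (0.4, -1.3, 2.0)                 # arbitrary sample vector
M = H(lam)
determinant = det2(M)                  # (1 + |lam|^2) - |lam|^2 = 1
is_hermitian = M[0][1] == M[1][0].conjugate()

# The eigenvalues are c - |lam| and c + |lam|; both are positive since c > |lam|.
norm = math.sqrt(sum(a * a for a in lam))
c = math.sqrt(1.0 + norm**2)
eigs = (c - norm, c + norm)
```

The determinant is 1 because the two $|\boldsymbol\lambda|^2$ contributions cancel, and the eigenvalues multiply to 1 for the same reason.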
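To prove that $\phi$ is a homeomorphism we will need to understand the behaviour of $\tilde\phi_g : T^2 \to S^2$ as $g$ approaches the end of $G$, i.e. as $\lambda = |\boldsymbol\lambda| \to \infty$. For this purpose, consider the one parameter family $\{\phi_{\lambda,\hat{\boldsymbol\lambda}} = \tilde\phi_{(L,0)} \in M_2 : L = H(\lambda\hat{\boldsymbol\lambda}),\ \lambda \in (0,\infty)\}$ for some fixed $\hat{\boldsymbol\lambda} = \boldsymbol\lambda/\lambda \in S^2$. Explicitly,
$$\phi_{\lambda,\hat{\boldsymbol\lambda}}(z) = H(\boldsymbol\lambda) \odot \frac{\wp(z)}{e_1}. \tag{25}$$
The action of $H(\boldsymbol\lambda)$ on $S^2$ ($H(\boldsymbol\lambda) : W \mapsto H(\boldsymbol\lambda) \odot W$) has exactly two fixed points, $\hat{\boldsymbol\lambda}$ and $-\hat{\boldsymbol\lambda}$, and as $\lambda \to \infty$ all but a vanishing neighbourhood of $-\hat{\boldsymbol\lambda}$ is mapped by $H(\boldsymbol\lambda)$ to within a vanishing neighbourhood of $\hat{\boldsymbol\lambda}$ [24]. So the limiting function $\phi_{\infty,\hat{\boldsymbol\lambda}} : T^2 \to S^2$ has the general form
$$\phi_{\infty,\hat{\boldsymbol\lambda}}(z) = \lim_{\lambda\to\infty}\phi_{\lambda,\hat{\boldsymbol\lambda}}(z) = \begin{cases} \hat{\boldsymbol\lambda} & z \notin (\wp/e_1)^{-1}(-\hat{\boldsymbol\lambda}) \\ -\hat{\boldsymbol\lambda} & z \in (\wp/e_1)^{-1}(-\hat{\boldsymbol\lambda}). \end{cases} \tag{26}$$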
That is, for generic $\hat{\boldsymbol\lambda}$, all but two points of $T^2$, the preimages of $-\hat{\boldsymbol\lambda}$ under $\wp/e_1$, are mapped by $\phi_{\infty,\hat{\boldsymbol\lambda}}$ to $\hat{\boldsymbol\lambda}$, while these two points are mapped to $-\hat{\boldsymbol\lambda}$. In the four special cases $\hat{\boldsymbol\lambda} \in \{(0,0,\pm1), (\pm1,0,0)\}$, the preimages of $-\hat{\boldsymbol\lambda}$ coincide (double valency points) so all but one point in $T^2$ is mapped to $\hat{\boldsymbol\lambda}$. The point to note is that in all cases $\phi_{\lambda,\hat{\boldsymbol\lambda}}$ collapses to a discontinuous limit.

The statement that $\phi : G/G_0 \to M_2$ is a homeomorphism is, of course, meaningless until we equip $M_2$ with a topology (the domain inherits its topology from $G$, which we take to have the natural product topology on $PSL(2,\mathbb{C}) \times T^2$). There are many sensible choices for the topology on $T^2$. One simple and directly physical choice is to endow $Q_2$ with the metric topology where distance between configurations is measured by their maximum pointwise deviation in the codomain $S^2$, so that $M_2 \subset Q_2$ inherits the relative topology. That is, let $d : S^2 \times S^2 \to \mathbb{R}$ be the usual distance function on $S^2$, and define $D : Q_2 \times Q_2 \to \mathbb{R}$ such that, for all $W_1, W_2 \in Q_2$,
$$D(W_1, W_2) = \sup_{z\in T^2} d(W_1(z), W_2(z)). \tag{27}$$
It is straightforward to verify that $D$ satisfies the axioms of a distance function. The resulting metric topology on $Q_2$ is Hausdorff, as is any metric topology [5]. Rather than break up the smooth manifold $G$ into coordinate charts, it is convenient to equip $G$ with a metric topology also, as follows: let $h$ be the (Riemannian) product metric
$$h = (d\boldsymbol\lambda\cdot d\boldsymbol\lambda) \oplus h_{SO(3)} \oplus ds\, d\bar{s} \tag{28}$$
on $G \cong \mathbb{R}^3 \times SO(3) \times T^2$, where $h_{SO(3)}$ is the biinvariant metric on $SO(3)$ of unit volume. The Riemannian manifold $(G, h)$ has a natural distance function $\tilde{d}$, where $\tilde{d}(g_1, g_2)$ is the infimum of lengths (with respect to $h$) of piecewise $C^1$ paths connecting $g_1$ and $g_2$. That $\tilde{d}$ is a distance function, and that the associated metric topology coincides with the original topology on $G$ (independent of the choice of $h$) are standard theorems of Riemannian geometry [9]. We may now state and prove Theorem 1. Throughout, $B_\epsilon(x)$ denotes the open ball of radius $\epsilon$ centred on $x$, where the space containing $x$ ($S^2$, $M_2$ or $G$), and hence the appropriate distance function ($d$, $D$ or $\tilde{d}$), should be clear from context.

Theorem 1. The bijection $\phi : G/G_0 \to M_2$ is a homeomorphism.

Proof. We must prove that both $\phi$ and $\phi^{-1}$ are continuous. To prove the former, it suffices to show that $\tilde\phi = \phi \circ p$ is continuous, since the projection $p$ is a local homeomorphism. Fix $g_0 \in G$ and $\epsilon > 0$. Then we must show that $\exists\,\delta > 0$ such that $\forall g \in B_\delta(g_0)$, $\tilde\phi_g \in B_\epsilon(\tilde\phi_{g_0})$. Let $\phi_* : G \times T^2 \to S^2$ such that $\phi_*(g, z) = \tilde\phi_g(z)$. Note that $\phi_*$ is manifestly continuous. Hence, for each $\tilde{z} \in T^2$ there exists $\delta(\tilde{z}) > 0$ such that
$$(g, z) \in B_{\delta(\tilde{z})}(g_0) \times B_{\delta(\tilde{z})}(\tilde{z}) \;\Rightarrow\; d(\phi_*(g, z), \phi_*(g_0, \tilde{z})) < \frac{\epsilon}{3}. \tag{29}$$
The collection of open balls $\{B_{\delta(z)}(z) \subset T^2 : z \in T^2\}$ is an open cover of $T^2$. Since $T^2$ is compact, there exists a finite subcover $\{B_{\delta(z_j)}(z_j) : j = 1, 2, \ldots, N\}$. Define $\delta = \inf\{\delta(z_j) : j = 1, 2, \ldots, N\} > 0$.
Now, let $g \in B_\delta(g_0)$ and consider $D(\tilde\phi_g, \tilde\phi_{g_0})$. For each $z \in T^2$ there exists $j \in \{1, 2, \ldots, N\}$ such that $z \in B_{\delta(z_j)}(z_j)$. Further, $g, g_0 \in B_\delta(g_0) \subset B_{\delta(z_j)}(g_0)$ by definition of $\delta$, so $(g, z), (g_0, z) \in B_{\delta(z_j)}(g_0) \times B_{\delta(z_j)}(z_j)$. Hence, using (29) and the triangle inequality,
$$d(\phi_*(g, z), \phi_*(g_0, z)) \le d(\phi_*(g, z), \phi_*(g_0, z_j)) + d(\phi_*(g_0, z), \phi_*(g_0, z_j)) < \frac{2\epsilon}{3}. \tag{30}$$
Thus,
$$D(\tilde\phi_g, \tilde\phi_{g_0}) = \sup_{z\in T^2} d(\phi_*(g, z), \phi_*(g_0, z)) \le \frac{2\epsilon}{3} < \epsilon, \tag{31}$$
so $\tilde\phi$ is continuous.

To prove that $\phi^{-1}$ is continuous we again convert the problem to one involving $\tilde\phi$, using general properties of covering spaces. Fix $[g_0] \in G/G_0$ and choose any open neighbourhood $U$ of $[g_0]$. The inverse image of $[g_0]$ under $p$ is the left coset $g_0 G_0 = \{g_0, g_1, g_2, g_3\}$. Since $p$ is a local homeomorphism there exists $\epsilon > 0$ such that $p(U_\epsilon) \subset U$, where
$$U_\epsilon := \bigcup_{i=0}^{3} B_\epsilon(g_i) \subset G. \tag{32}$$
We will show that there exists $\delta > 0$ such that $\tilde\phi^{-1}(B_\delta(W_0)) \subset U_\epsilon$, where $W_0 = \phi_{[g_0]} \in M_2$. It follows that $\phi^{-1}(B_\delta(W_0)) \subset U$, and hence that $\phi^{-1}$ is continuous. For each $n \in \mathbb{N}$, define the compact set
$$A_n = \bar{B}_n(I, 0) \setminus U_\epsilon \subset G, \tag{33}$$
where $\bar{B}_n(I, 0)$ is the closed ball of radius $n$ centred on $(I, 0) \in G$. Since $\tilde\phi$ is continuous, $\tilde\phi(A_n) \subset M_2$ is also compact, and therefore closed ($M_2$ is Hausdorff). Hence the complement $M_2 \setminus \tilde\phi(A_n)$ is open, and it contains $W_0$ by construction (since $A_n \cap U_\epsilon = \emptyset$), so there exists $\delta_n > 0$ such that $B_{\delta_n}(W_0) \subset M_2 \setminus \tilde\phi(A_n)$. In this way, construct a positive sequence $(\delta_n)_{n=1}^\infty$, which, without loss of generality, we may assume is decreasing and converges to 0. Consider the preimage of $B_{\delta_n}(W_0)$ under $\tilde\phi$. By construction, $\tilde\phi^{-1}(B_{\delta_n}(W_0)) \cap A_n = \emptyset$, so every point in the preimage lies either in $U_\epsilon$, or at a distance greater than $n$ from $(I, 0) \in G$. We claim that there exists $N \in \mathbb{N}$ such that $\tilde\phi^{-1}(B_{\delta_N}(W_0)) \subset U_\epsilon$. Choosing $\delta = \delta_N$, the proof is then complete.

Assume this claim is false. Then $\forall n \in \mathbb{N}$ there exists $g_n \notin \bar{B}_n(I, 0)$ such that $\tilde\phi_{g_n} \in B_{\delta_n}(W_0)$. For each $n$, choose such a $g_n$ and consider the sequence $(g_n)_{n=1}^\infty$. Since $(\delta_n)_{n=1}^\infty \to 0$, the image of the sequence under $\tilde\phi$, $(\tilde\phi_{g_n})_{n=1}^\infty$, converges to $W_0$ in $M_2$. Define two projection maps on $G \cong \mathbb{R}^3 \times SO(3) \times T^2$:
$$\pi_1 : G \to [0, \infty) \ \text{ such that } \ \pi_1(\boldsymbol\lambda, U, s) = \lambda = |\boldsymbol\lambda|, \qquad \pi_2 : G \to S^2 \times SO(3) \times T^2 \ \text{ such that } \ \pi_2(\boldsymbol\lambda, U, s) = (\hat{\boldsymbol\lambda}, U, s). \tag{34}$$
The singularity of $\pi_2$ when $\lambda = 0$ is irrelevant here. By construction, $(\lambda_n)_{n=1}^\infty = (\pi_1(g_n))_{n=1}^\infty$ is unbounded, and without loss of generality, we may choose $g_n$ such that $\lambda_n$ is increasing. Since $(\pi_2(g_n))_{n=1}^\infty$ takes values in a compact space, it has, by the Bolzano-Weierstrass theorem, a convergent subsequence $(\pi_2(g_{n_r}))_{r=1}^\infty$. By translation and rotation
symmetry of $T^2$ and $S^2$ respectively, we may assume without loss of generality that its limit is $(\hat{\boldsymbol\lambda}, I, 0)$. Consider the image under $\tilde\phi$ of the associated subsequence $(g_{n_r})_{r=1}^\infty$, which approaches the end of $\mathbb{R}^3 \times SO(3) \times T^2$ asymptotic to the line $\{(t\hat{\boldsymbol\lambda}, I, 0) : t \in (0, \infty)\}$. The function $\tilde\phi_{g_{n_r}}(z)$ converges pointwise, as $r \to \infty$, to $\phi_{\infty,\hat{\boldsymbol\lambda}}(z)$, the limiting function previously described (to check this, use continuity of the $SO(3)$ and $T^2$ actions on $S^2$ and $T^2$ respectively, and of the function $\wp/e_1$). But $\phi_{\infty,\hat{\boldsymbol\lambda}}$, being discontinuous, cannot be in $M_2$, and hence cannot be $W_0$, a contradiction.

4. Degree 2 Static Solutions

An immediate corollary of Theorem 1 is that $(G, \tilde\phi)$ is a covering space of $M_2$. The aim of this section is to describe the connexion between any point $g \in G$ and its corresponding static solution $\tilde\phi_g \in M_2$, that is, to obtain a picture of what the static lumps look like, and how they change as $g$ varies. A configuration $W$ may be visualized as a distribution of unit length three-vectors ("arrows") over the torus. The energy density function of $W$ is
$$E(x, y) = \frac{|W_x|^2 + |W_y|^2}{(1 + |W|^2)^2}, \tag{35}$$
so the energy is located where the direction of the arrows is varying sharply in $(x, y)$, in other words, where neighbouring arrows are stretched apart. It is the function $E$ that we will describe as $W$ varies in $M_2$. For this purpose, rather than using the hermitian-unitary (or "polar") decomposition of $SL(2,\mathbb{C})$ used above, another standard decomposition is convenient. Namely, any $L \in SL(2,\mathbb{C})$ may be uniquely decomposed into a product $UT$ with $U \in SU(2)$ and $T$ upper triangular, real on the diagonal, positive definite and unimodular. The space of such $T$-matrices is homeomorphic to $\mathbb{R}^+ \times \mathbb{C}$ (here $\mathbb{R}^+ = (0, \infty)$) and may be parametrized thus:
$$T(\alpha, \rho) = \begin{pmatrix} \sqrt{\alpha e_1} & \sqrt{\alpha/e_1}\,\rho \\ 0 & 1/\sqrt{\alpha e_1} \end{pmatrix}. \tag{36}$$
This allows one to write any $W \in M_2$ in the form
$$W(z) = (UT) \odot \frac{\wp(z - s)}{e_1} = U \odot [\alpha(\wp(z - s) + \rho)]. \tag{37}$$
Changing $U \in SU(2)$ merely produces a global internal rotation of the solution and so has no effect on $E(z)$. Changing $s \in T^2$ translates the solution on the torus, so it suffices to examine the three parameter family
$$W(z) = \alpha(\wp(z) + \rho), \tag{38}$$
$(\alpha, \rho) \in \mathbb{R}^+ \times \mathbb{C}$, whose energy density is
$$E(z) = \frac{8\alpha^2\,|\wp(z)|\,|\wp(z)^2 - e_1^2|}{(1 + \alpha^2|\wp(z) + \rho|^2)^2}. \tag{39}$$
Note that for all $(\alpha, \rho)$, $E = 0$ at the four double valency points $z = 0, s_0, s_1, s_2$, around which the direction of the arrows is constant to first order.
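The closed form (39) can be compared against a direct finite-difference evaluation of the definition (35) for $W = \alpha(\wp + \rho)$. The sketch below is illustrative only: it assumes a truncated lattice sum for $\wp$, and the names `wp`, `energy_formula`, `energy_fd`, the cutoff `N`, the step `h` and the sample parameters are all choices made here rather than anything specified in the text.

```python
# Sketch: compare the closed form (39) with a finite-difference evaluation of (35)
# for W(z) = alpha * (wp(z) + rho). Names, cutoff N and step h are illustrative.

def wp(z, N=40):
    """Truncated lattice sum for the Weierstrass function on the lattice {n + i m}."""
    total = 1.0 / z**2
    for n in range(-N, N + 1):
        for m in range(-N, N + 1):
            if n or m:
                nu = complex(n, m)
                total += 1.0 / (z - nu) ** 2 - 1.0 / nu**2
    return total

e1 = wp(0.5).real

def energy_formula(z, alpha, rho):
    """Energy density from the closed form (39)."""
    p = wp(z)
    numerator = 8 * alpha**2 * abs(p) * abs(p * p - e1**2)
    return numerator / (1 + alpha**2 * abs(p + rho) ** 2) ** 2

def energy_fd(z, alpha, rho, h=1e-4):
    """Energy density (35), with W_x and W_y taken from central differences."""
    W = lambda w: alpha * (wp(w) + rho)
    Wx = (W(z + h) - W(z - h)) / (2 * h)
    Wy = (W(z + 1j * h) - W(z - 1j * h)) / (2 * h)
    return (abs(Wx) ** 2 + abs(Wy) ** 2) / (1 + abs(W(z)) ** 2) ** 2

z, alpha, rho = 0.3 + 0.1j, 0.2, 1 - 1j     # arbitrary sample point and parameters
E_closed = energy_formula(z, alpha, rho)
E_numeric = energy_fd(z, alpha, rho)
E_at_s1 = energy_formula(0.5, alpha, rho)   # s1 = 1/2 is a double valency point
```

The two evaluations should agree to within the truncation error of the lattice sum, and the value at $s_1$ should vanish up to rounding because $\wp(s_1) = e_1$.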
Fig. 2. Energy density plots of $W(z) = \wp(z) + \rho$ for various values of $\rho$. In plot (a) $\rho = 1 - i$, so the roots of $W$ are separate and two lumps form. In plots (b), (c) and (d), $\rho = 0, e_1, -e_1$ respectively so the roots of $W$ coincide. Here the energy distribution is roughly annular, centred on the double valency points $s_0, s_2, s_1$
The behaviour of E as (α, ρ) covers R+ × C is remarkably varied, going beyond the two-lump and annular structures one might expect by analogy with the planar CP 1 model. First, consider the case α = 1. Here, the energy is located in lumps close to the two roots of ℘(z) + ρ (symmetrically placed about s0 since ℘ is even) where the denominator of (39) is smallest. The only exceptions are when these roots coincide, ρ = 0, −e1 , e1 , or are close to coincidence, for then the lumps lose their individual identity and form a ring-like structure (centred on s0 , s1 or s2 respectively) rather reminiscent of coincident planar solitons (Fig. 2). If we now imagine increasing α above 1, the effect on W is to pull all the arrows in the configuration towards the north pole of S 2 (W = ∞), so that those close to the south pole (W = 0) are stretched apart. Since the energy is located where the arrows are stretched apart, increasing α therefore tends to concentrate E more strongly on roots of ℘(z) + ρ, and the lumps become taller and narrower. As α → ∞ the lumps collapse and “pinch off”. Conversely, if α is decreased below 1 the arrows of the configuration are pulled southwards, and for α very small, E concentrates on the double pole of ℘(z) + ρ (z = 0), where W points north. In this case a ring structure appears, centred on z = 0, and collapses to zero width as α → 0. These two cases are compared in Fig. 3. Noting the symmetry property ℘(iz) ≡ −℘(z), we see that whenever ρ passes through 0 ∈ C along a smooth curve, the roots of ℘(z) + ρ coalesce and emerge at right angles to their line of approach, giving a first hint that the familiar 90◦ scattering of lumps through a ring structure may take place in the geodesic approximation. We shall return to this point later.
Fig. 3. Energy density plots of W (z) = α(℘(z) + 1 − i) in the cases of (a) large α (α = 2) and (b) small α (α = 0.03)
Fig. 4. The exceptionally symmetric family W (z) = α℘(z). The parameter values are (a) α = 4, (b) α = 0.3, (c) α = 1/e1 , (d) α = 0.01 and (e) α = 0.005. Plot (c) depicts the most evenly spread energy distribution possible for a degree 2 static solution
The special case ρ = 0 is exceptional, and will be prominent in later sections. Examining the formula (39) in this case, we see that the global maxima of E must occur where ℘ is purely imaginary. If not, assume that a global maximum occurs at $z_0$ and let $\wp(z_0) = u \in \mathbb{C}\setminus i\mathbb{R}$. Then
\[ E(z_0) = \frac{8\alpha^2 |u|\,|u^2 - e_1^2|}{(1+\alpha^2|u|^2)^2} < \frac{8\alpha^2 |u|\,(|u|^2 + e_1^2)}{(1+\alpha^2|u|^2)^2}, \tag{40} \]
Fig. 5. Energy density plot of W (z) = (℘(z) − i)/e1
where the inequality is strict since $u^2$ is not real-negative. But there exists $z_1 \in T^2$ such that $\wp(z_1) = i|u|$, and
\[ E(z_1) = \frac{8\alpha^2 |u|\,\bigl|-|u|^2 - e_1^2\bigr|}{(1+\alpha^2|u|^2)^2} > E(z_0), \tag{41} \]
a contradiction. Given the symmetry of E under ℘ ↦ −℘, and that ℘ is even, it follows that E(z) has at least four peaks on the diagonals of the unit square, symmetrically placed about $s_0$. Plots of E confirm that there are, in fact, exactly four such peaks (Fig. 4). The most symmetric case is $\alpha = 1/e_1$, that is, $W(z) = \wp(z)/e_1$. Here, using the identity (17), one can easily show that $E(z - s_0) \equiv E(z)$, so the four peaks are located halfway towards the centre $s_0$ along the diagonals, i.e. at the points (1+i)/4, (3+i)/4, 3(1+i)/4, (1+3i)/4. This solution has the most evenly spread energy distribution possible. Once again, one can consider the effect of increasing α (pulling the arrows northwards) or decreasing α (pulling southwards) for this family. Increasing α moves the lumps towards $s_0$, where they coalesce, form a shrinking ring structure and pinch off. Decreasing α has the same effect, except the ring is centred on 0 rather than $s_0$. In fact, the solution α℘(z) is identical, up to the rotation and translation $(U_0, s_0) \in G_0$, to the solution $\wp(z)/(e_1^2\alpha)$. When α is close to $1/e_1$ and |ρ| is small but nonzero, the behaviour of E(z) is intermediate between the two cases described above. It has four peaks, but two of these are larger than the other two (Fig. 5).

5. The Metric on $M_2$

The argument of Belavin and Polyakov (7) shows that $M_2$ is the flat valley bottom of $Q_2$, on which V attains its topological minimum value, 4π. Any departure from $M_2$ involves increasing V, and hence climbing the valley walls. Consider the initial value
problem where W starts on $M_2$ and is given a small push tangential to it. Then, by energy conservation, it must stay close to $M_2$ during its subsequent evolution. In the geodesic approximation one constrains the configuration to lie on $M_2$ for all time, but allows the position in $M_2$ to evolve in time according to the constrained action principle. Since V = 4π always, the dynamics is determined solely by the kinetic energy functional (4). Using the homeomorphism φ we can transfer the differentiable structure of $G/G_0$ to $M_2$. Let $\{q^i : i = 1, 2, \ldots, 8\}$ be local coordinates on $M_2$, and consider the kinetic energy T as the $q^i$ vary in time:
\[ T = g_{ij}(q)\,\dot q^i \dot q^j, \tag{42} \]
where
\[ g_{ij}(q) = \mathrm{Re}\int_{T^2} \frac{1}{(1+|W|^2)^2}\,\frac{\partial W}{\partial q^i}\,\overline{\frac{\partial W}{\partial q^j}}. \tag{43} \]
Equation (43) defines a Riemannian metric on $M_2$, $g = g_{ij}\,dq^i dq^j$, and furthermore the constrained Euler-Lagrange equation (obtained by varying the action $S[q] = \int dt\, T(q,\dot q)$) is the geodesic equation for $(M_2, g)$. The conjecture is, then, that geodesics in this Riemannian manifold are, when travelled at low speed, close to low-energy two-lump dynamical solutions of the $CP^1$ model. Some justification for this can be found when comparison is made with other models for which the approximation has been used. In the case of abelian-Higgs vortices, for example, the approximation [22] is supported by rigorous analysis [26] and extensive numerical solution of the full field equations [18]. Ideally, one would like an explicit, closed-form expression for the metric g, but this is rarely possible in practice. There are exceptions [1, 24, 25], but unfortunately this is not one of them. It is possible to place fairly strong constraints on the possible form of g, but not to write it down explicitly (to do so requires, naively, the evaluation of 36 integrals over $T^2$, each with 8 parameters). It is convenient to lift the geometry to the covering space $(G, \tilde g)$, where $\tilde g = \tilde\phi^* g$, the pullback of the metric g by the covering projection $\tilde\phi$. The most useful constraint is that $\tilde g$ is a product metric $\tilde g = \hat g \oplus \delta$ on $PSL(2,\mathbb{C}) \times T^2$, where $\delta = 2\pi\, ds\, d\bar s$. By product metric [10] we mean block diagonal with $\hat g$ independent of position in $T^2$ and δ independent of position in $PSL(2,\mathbb{C})$. This is easily established if we recall that any $W \in M_2$ is a rational function of ℘(z − s), $W(z) = R_0(\wp(z-s))$, so denoting by µ any one of the six $PSL(2,\mathbb{C})$ moduli, we see that
\[ \frac{\partial W}{\partial s} = -R_1(\wp(z-s))\,\wp'(z-s), \qquad \frac{\partial W}{\partial \mu} = R_2(\wp(z-s)), \tag{44} \]
where $R_1$, $R_2$ are also rational functions. So the (µ, s) component of $\tilde g$ is
\[ \tilde g_{\mu s} = \mathrm{Re}\int_{T^2} f(\wp(z-s))\,\wp'(z-s) = \mathrm{Re}\int_{T^2} f(\wp(z))\,\wp'(z) = 0, \tag{45} \]
since ℘ is even while ℘′ is odd (here $f(u) = -(1+|R_0(u)|^2)^{-2} R_1(u)\overline{R_2(u)}$). Similarly, $\tilde g_{\mu\bar s} = 0$, so $\tilde g$ is block diagonal as claimed. Translation symmetry implies that $\tilde g$, and hence $\hat g$, must be independent of s. Hence, it remains to show that δ is independent of the $PSL(2,\mathbb{C})$ moduli. Let $W(x,y,t) = \widetilde W(z - s(t))$, $\widetilde W \in M_2$, and compute the kinetic energy,
\[ T = \int_{T^2} \frac{|\widetilde W'|^2}{(1+|\widetilde W|^2)^2}\,|\dot s|^2 = \frac{1}{2} V[\widetilde W]\,|\dot s|^2 = 2\pi |\dot s|^2. \tag{46} \]
Since this is, by definition, $g_{s\bar s}|\dot s|^2$, we read off the metric $\delta = 2\pi\, ds\, d\bar s$ on $T^2$. The geodesic equation for $(G, \tilde g)$ decouples into independent geodesic equations for $(PSL(2,\mathbb{C}), \hat g)$ and $(T^2, \delta)$. Consequently, we may identify s as an effective “centre of mass coordinate” which drifts on $T^2$ at constant velocity, independent of the lumps’ relative motion in $PSL(2,\mathbb{C})$. Without loss of generality, therefore, we can investigate geodesic motion in the reduced covering space $(\hat G, \hat g)$, where $\hat G$ denotes $PSL(2,\mathbb{C})$. So lump dynamics in the geodesic approximation has Galilean boost symmetry. This may be understood as a remnant of the Lorentz symmetry of the $CP^1$ model in $\mathbb{R}^{2+1}$: the field equation is still Lorentz invariant, but the spatial boundary conditions now are not. Under a Lorentz boost, they suffer Lorentz contraction. In a low speed approximation such as this, however, Galilean boost symmetry is recovered, since the spatial contraction is a high order effect. One further constraint on $\hat g$ will prove useful: since the kinetic energy functional is invariant under global internal rotations of W (rotations of the codomain $S^2$), SU(2) acts isometrically by left multiplication on $(\hat G, \hat g)$ [24]. Briefly, $\hat g$ (or $\tilde g$) is left-invariant under SU(2).

6. Geodesic Incompleteness of $(M_2, g)$

One of the most basic questions one can ask about a Riemannian manifold without boundary is whether it is geodesically complete, that is, whether all geodesics can be extended infinitely in time (forwards and backwards). In view of the noncompactness of $M_2$, this is a nontrivial question for $(M_2, g)$. We will prove that $(M_2, g)$ is, in fact, geodesically incomplete, by finding a geodesic which, although maximally extended, has only finite length (since geodesics are traversed at constant speed, this is sufficient).
This geodesic is obtained explicitly, despite our lack of explicit information about g, by using discrete isometries to identify a one dimensional geodesic submanifold. Such arguments have been used to obtain multimonopole scattering geodesics [12] given similarly scant knowledge of the metric on moduli space. The key observation is that the fixed point set of a discrete group of isometries of a Riemannian manifold is (if a submanifold) a totally geodesic submanifold, that is, a geodesic which starts on and tangential to the fixed point set must remain on the fixed point set for all subsequent time. This follows directly from uniqueness of solutions to the initial value problem of an ordinary differential equation. If a discrete group is found whose fixed point set is diffeomorphic to $\mathbb{R}$, then the set itself is a geodesic. The following mappings are isometries of $(\hat G, \hat g)$:
\[ P : L \mapsto \bar L, \tag{47} \]
\[ R : L \mapsto \tau_3 L \tau_3. \tag{48} \]
To see this, consider their effect on $W(z) = L(\wp(z)/e_1)$,
\[ P : W(z) \mapsto \bar L\!\left(\frac{\wp(z)}{e_1}\right) = \overline{L\!\left(\frac{\wp(\bar z)}{e_1}\right)} = \overline{W(\bar z)}, \tag{49} \]
\[ R : W(z) \mapsto (\tau_3 L \tau_3)\!\left(\frac{\wp(z)}{e_1}\right) = -\bigl[L(-\wp(z)/e_1)\bigr] = -W(iz). \tag{50} \]
So P produces simultaneous reflexions in both domain ($z \mapsto \bar z$) and codomain ($W \mapsto \overline{W}$), while R produces rotations of π/2 in the domain ($z \mapsto iz$) and π in the codomain ($W \mapsto -W$), all of which are symmetries of the $CP^1$ model. The composition of P and R in either order (they commute) is another isometry of $(\hat G, \hat g)$. Since $P^2 = R^2 = (PR)^2 = \mathrm{Id}$, the isometries {Id, P, R, PR} form the Viergruppe $V_4$ under composition. A straightforward calculation shows that $\hat\Sigma_V$, the fixed point set of $V_4$, is
\[ \hat\Sigma_V = \{\mathrm{diag}((\alpha e_1)^{1/2}, (\alpha e_1)^{-1/2}) : \alpha \in \mathbb{R}^+\}. \tag{51} \]
This is clearly a submanifold of $\hat G$ diffeomorphic to $\mathbb{R}$, and hence is a geodesic. Its image under the projection $\tilde\phi$ is $\Sigma_V = \{\alpha\wp(z) : \alpha \in \mathbb{R}^+\}$, which is a geodesic of $(M_2, g)$, also diffeomorphic to $\mathbb{R}$. The submanifold $\Sigma_V$ was described at the end of Sect. 4. The lump motion corresponding to this geodesic is an infinitely tall thin ring centred at 0 in the past spreading out into four distinct identical peaks, which recombine to form an infinitely tall thin ring centred on $s_0$ in the future (assuming that $\Sigma_V$ is traversed in the sense of increasing α). The question remains whether $\Sigma_V$ is traversed in finite time, i.e. has finite length, and to answer this one needs to understand the induced metric $g_V$ on $\Sigma_V$. The restriction of g to $\Sigma_V$ is
\[ g_V = f(\alpha)\, d\alpha^2, \tag{52} \]
where
\[ f(\alpha) = \int_{T^2} \frac{|\wp|^2}{(1+\alpha^2|\wp|^2)^2}. \tag{53} \]
Note that f is clearly positive and decreasing, and is easily shown to have limits ∞ and 0 as α → 0 and α → ∞ respectively. To prove that $\Sigma_V$ has finite length we will need detailed asymptotic estimates for f in these two limits. The identity (17) implies that
\[ f(\alpha) \equiv \frac{1}{(\alpha e_1)^4}\, f\!\left(\frac{1}{\alpha e_1^2}\right), \tag{54} \]
so the behaviour in one limit follows directly from the behaviour in the other.

Lemma 2. The following asymptotic formulae hold,
\[ f(\alpha) \sim \frac{\pi^2}{4\alpha} \quad \text{as } \alpha \to 0, \tag{55} \]
\[ f(\alpha) \sim \frac{\pi^2}{4 e_1^2 \alpha^3} \quad \text{as } \alpha \to \infty. \tag{56} \]
Proof. We need only prove (55) since (56) follows from this and Eq. (54). The idea is to split the integration region of (53) into a small neighbourhood of 0 and its complement, bound the contribution of the latter region and use a Laurent expansion in the former. Fix some $\epsilon \in (0, \tfrac14)$ and split $T^2$ into $D_\epsilon(0) \sqcup (T^2 \setminus D_\epsilon(0))$, where $D_\epsilon(0)$ is the open disk of radius ε centred on 0. Then
\[ \int_{D_\epsilon(0)} \frac{|\wp|^2}{(1+\alpha^2|\wp|^2)^2} < f(\alpha) < \int_{D_\epsilon(0)} \frac{|\wp|^2}{(1+\alpha^2|\wp|^2)^2} + \int_{T^2\setminus D_\epsilon(0)} \frac{|\wp|^2}{(1+\alpha^2|\wp|^2)^2} \tag{57} \]
and |℘| is bounded on $T^2 \setminus D_\epsilon(0)$, so there exists M ∈ (0, ∞) such that
\[ \int_{T^2\setminus D_\epsilon(0)} \frac{|\wp|^2}{(1+\alpha^2|\wp|^2)^2} < \int_{T^2\setminus D_\epsilon(0)} |\wp|^2 < M \tag{58} \]
independent of α. Hence
\[ \int_{D_\epsilon(0)} \frac{\alpha|\wp|^2}{(1+\alpha^2|\wp|^2)^2} < \alpha f(\alpha) < \alpha M + \int_{D_\epsilon(0)} \frac{\alpha|\wp|^2}{(1+\alpha^2|\wp|^2)^2}, \tag{59} \]
so it suffices to prove that
\[ \lim_{\alpha\to 0} \int_{D_\epsilon(0)} \frac{\alpha|\wp|^2}{(1+\alpha^2|\wp|^2)^2} = \frac{\pi^2}{4}. \tag{60} \]
The function $h(z) = z^2\wp(z)$ is analytic, bounded and (since $s_0 \notin D_\epsilon(0)$) bounded away from 0 on $D_\epsilon(0)$. So $\wp(z) = h(z)/z^2$, where $0 < c_1 < |h(z)| < c_2 < \infty$, $c_1$ and $c_2$ being constants. Defining $\gamma = \epsilon/\sqrt{\alpha}$ and $u = z/\sqrt{\alpha}$,
\[ \lim_{\alpha\to 0}\int_{D_\epsilon(0)} \frac{\alpha|\wp|^2}{(1+\alpha^2|\wp|^2)^2} = \lim_{\alpha\to 0}\int_{D_\epsilon(0)} dz\, d\bar z\, \frac{\alpha|h(z)|^2/|z|^4}{(1+\alpha^2|h(z)|^2/|z|^4)^2} = \lim_{\alpha\to 0}\int_{\mathbb{C}} du\, d\bar u\, \chi_\gamma(u)\, \frac{|h(\sqrt{\alpha}u)|^2 |u|^4}{(|u|^4 + |h(\sqrt{\alpha}u)|^2)^2}, \tag{61} \]
where $\chi_\gamma$ is the characteristic function of the disk (i.e. $\chi_\gamma(u) = 1$ if |u| < γ, 0 otherwise). The integrand of (61) is bounded above, independent of α, by
\[ \frac{c_2^2 |u|^4}{(c_1^2 + |u|^4)^2} \tag{62} \]
which is integrable on $\mathbb{C}$. Hence, Lebesgue’s dominated convergence theorem applies [6], and we may interchange the order of limit and integration in Eq. (61). From the Laurent expansion of ℘ about 0,
\[ \wp(z) = \frac{1}{z^2} + O(z^2), \tag{63} \]
one sees that $h(z) = 1 + O(z^4)$, whence
\[ \lim_{\alpha\to 0} \frac{\chi_\gamma(u)\,|h(\sqrt{\alpha}u)|^2 |u|^4}{(|u|^4 + |h(\sqrt{\alpha}u)|^2)^2} = \frac{|u|^4}{(1+|u|^4)^2}. \tag{64} \]
Integrating this function over $\mathbb{C}$ yields $\pi^2/4$, which completes the proof.
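The closing integral of the proof of Lemma 2 can be checked numerically. The following is a minimal sketch, assuming the measure in (61) is the ordinary area element on the plane; the function names are illustrative and not part of the original text.

```python
import math


def radial_integral(n=20_000):
    """Midpoint rule for I = int_0^inf t^2 / (1 + t^2)^2 dt.

    The substitution t = tan(theta) maps the half-line to (0, pi/2)
    and turns the integrand into sin(theta)^2.
    """
    h = (math.pi / 2) / n
    return h * sum(math.sin((k + 0.5) * h) ** 2 for k in range(n))


def plane_integral(n=20_000):
    """Integral of |u|^4 / (1 + |u|^4)^2 over the plane.

    In polar coordinates the area element r dr dphi gives
    2*pi * int r^5 / (1 + r^4)^2 dr, and t = r^2 reduces this to
    pi * int t^2 / (1 + t^2)^2 dt.
    """
    return math.pi * radial_integral(n)


if __name__ == "__main__":
    # Compare the quadrature with the value pi^2/4 quoted in the proof.
    print(plane_integral(), math.pi ** 2 / 4)
```

The radial reduction is done by hand so that the sketch needs only the standard library.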
There now immediately follows

Theorem 2. The moduli space $(M_2, g)$ is geodesically incomplete.

Proof. We need only prove that the length of $\Sigma_V$,
\[ l = \int_0^\infty d\alpha\, \sqrt{f(\alpha)} \tag{65} \]
is finite. By Lemma 2 there exist $0 < \alpha_1 < \alpha_2 < \infty$, $0 < c_3, c_4 < \infty$ such that $f(\alpha) < c_3/\alpha$ on $(0, \alpha_1)$ and $f(\alpha) < c_4/\alpha^3$ on $(\alpha_2, \infty)$. Hence,
\[ l < 2\sqrt{c_3\alpha_1} + \int_{\alpha_1}^{\alpha_2} d\alpha\, \sqrt{f(\alpha)} + 2\sqrt{\frac{c_4}{\alpha_2}} < 2\sqrt{c_3\alpha_1} + (\alpha_2 - \alpha_1)\sqrt{f(\alpha_1)} + 2\sqrt{\frac{c_4}{\alpha_2}} \tag{66} \]
by monotonicity of f.
The geodesic approximation predicts, then, that lumps (at least when coincident) can shrink and form singularities in finite time. Shrinking has been observed in numerical simulations of the $CP^1$ model in both the plane [29] and the torus [7], although the particular initial value problem considered here has not been simulated.

7. A Two-Dimensional Geodesic Submanifold

The Viergruppe $V_4$ has three $\mathbb{Z}_2$ subgroups {Id, P}, {Id, R} and {Id, PR} whose fixed point sets $\hat\Sigma_P$, $\hat\Sigma_R$, $\hat\Sigma_{PR}$ respectively are all geodesic submanifolds of $(\hat G, \hat g)$. Of these, $\hat\Sigma_R$ is two dimensional (the others are three dimensional) and projects under $\tilde\phi$ to
\[ \Sigma_R = \{\alpha e^{i\psi}\wp(z) : \alpha \in \mathbb{R}^+,\ \psi \in [0, 2\pi]\}, \tag{67} \]
a geodesic submanifold of $(M_2, g)$ diffeomorphic to a cylinder. This has a tractable geodesic problem. Recalling that g is left-invariant under SU(2), its restriction $g_R$ to $\Sigma_R$ is independent of ψ. In fact,
\[ g_R = f(\alpha)(d\alpha^2 + \alpha^2 d\psi^2), \tag{68} \]
where f is the same function defined in (53). Lemma 2 implies that the asymptotic form of $g_R$ towards the ends of the cylinder is
\[ g_R \sim \frac{\pi^2}{4\alpha}(d\alpha^2 + \alpha^2 d\psi^2) \quad \text{as } \alpha \to 0, \tag{69} \]
\[ g_R \sim \frac{\pi^2}{4 e_1^2 \alpha^3}(d\alpha^2 + \alpha^2 d\psi^2) \quad \text{as } \alpha \to \infty. \tag{70} \]
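The geometry of the small-α end can be made explicit by a change of coordinates in (69). The following short derivation is a sketch; the coordinates r and θ are introduced here for illustration only and do not appear in the original text.

```latex
% Small-alpha asymptotic metric (69) in polar-type coordinates (r, theta).
\begin{align*}
  r &:= \pi\sqrt{\alpha}, \qquad \theta := \psi/2, \\
  dr^2 &= \frac{\pi^2}{4\alpha}\,d\alpha^2, \qquad
  r^2\,d\theta^2 = \frac{\pi^2\alpha}{4}\,d\psi^2
                 = \frac{\pi^2}{4\alpha}\,\alpha^2\,d\psi^2, \\
  \frac{\pi^2}{4\alpha}\bigl(d\alpha^2 + \alpha^2\,d\psi^2\bigr)
       &= dr^2 + r^2\,d\theta^2, \qquad \theta \in [0, \pi].
\end{align*}
```

Since θ sweeps out a total angle π rather than 2π, the flat metric $dr^2 + r^2 d\theta^2$ is that of a cone whose deficit angle is 2π − π = π.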
The formula in (69) is the metric of a flat, singular cone with deficit angle π, so $(\Sigma_R, g_R)$ can be visualized as having a conical singularity at α = 0. By virtue of the identity (54), the metric $g_R$ is invariant under the mapping $\alpha \mapsto (e_1^2\alpha)^{-1}$, and consequently $(\Sigma_R, g_R)$ has an identical conical singularity at α = ∞, as may be shown by the reparametrization $\beta = (e_1^2\alpha)^{-1}$ in (70). So $(\Sigma_R, g_R)$ is a rotationally symmetric cylinder of finite length with its ends pinched to identical cones. It is the internal rotation orbit, about a fixed axis, of the one parameter family of exceptionally symmetric configurations already described ($\Sigma_V$). The singular points α = 0 and α = ∞ correspond to infinitely narrow, spiky, ring like configurations centred on 0 and $s_0$ respectively. Motion on $\Sigma_R$ corresponds to rotational and shape changing motion of the double lump on the torus. The conserved kinetic energy of this motion is
\[ T = f(\alpha)\dot\alpha^2 + \frac{J^2}{4\alpha^2 f(\alpha)}, \tag{71} \]
where $J = \alpha^2 f(\alpha)\dot\psi$ is the conserved angular momentum conjugate to ψ. One may imagine the dynamics as that of a point particle moving on the interval (0, ∞) with position dependent mass and subject to a potential. Geodesic motion is invariant under rescaling of time, so one can restrict attention to the two cases $J^2 = 0$ and $J^2 = 1$. If $J^2 = 0$ the motion is irrotational and the point particle travels from one conical singularity to the other in finite time along a path of constant ψ. These geodesics are just rotated versions of $\Sigma_V$. The more interesting case is when $J^2 = 1$, where the nature of the motion is determined by the centrifugal potential
\[ U(\alpha) = \frac{1}{4\alpha^2 f(\alpha)}. \tag{72} \]
From the asymptotic formulae of Lemma 2 we see that the potential has the asymptotic behaviour
\[ U(\alpha) \sim \frac{1}{\pi^2\alpha} \quad \text{as } \alpha \to 0, \tag{73} \]
\[ U(\alpha) \sim \frac{e_1^2\,\alpha}{\pi^2} \quad \text{as } \alpha \to \infty, \tag{74} \]
implying that U must have at least one stable equilibrium. The identity (54) implies a similar identity for the potential,
\[ U\!\left(\frac{1}{\alpha e_1^2}\right) \equiv U(\alpha). \tag{75} \]
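Identity (75) already pins down one critical point of U. A short worked step, assuming only that U is differentiable on (0, ∞), is to differentiate both sides with respect to α and evaluate at the fixed point of the inversion:

```latex
\begin{align*}
  \frac{d}{d\alpha}\,U\!\Bigl(\frac{1}{\alpha e_1^2}\Bigr)
     &= -\frac{1}{\alpha^2 e_1^2}\,U'\!\Bigl(\frac{1}{\alpha e_1^2}\Bigr)
      = U'(\alpha), \\
  \alpha = \frac{1}{e_1}: \qquad
  -\,U'\!\Bigl(\frac{1}{e_1}\Bigr) &= U'\!\Bigl(\frac{1}{e_1}\Bigr)
  \quad\Longrightarrow\quad U'\!\Bigl(\frac{1}{e_1}\Bigr) = 0 .
\end{align*}
```

The argument says nothing about whether further critical points exist; that question is not settled by the symmetry alone.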
Fig. 6. The centrifugal potential U (α) of Eq. (72), solid line, compared with the asymptotic formulae for U for small and large α given in Eqs. (73) and (74), dashed lines
Differentiating both sides of (75) one finds that U has a critical point at $\alpha = 1/e_1$, the fixed point of the isometry $\alpha \mapsto (\alpha e_1^2)^{-1}$. Numerical evaluation of U suggests that this is the only critical point, a global minimum, so that U is a single potential well (see Fig. 6). Since U grows unbounded as α → 0 and α → ∞, all motion in the well is oscillatory. So these geodesics wind around $\Sigma_R$, passing back and forth along its length indefinitely. They are bounded away from the singularities by angular momentum conservation. They correspond to rotational motions of the double lump during which the arrows of the configuration spin about the north-south axis of $S^2$, and its shape periodically oscillates about that of the most symmetric configuration, $W(z) = \wp(z)/e_1$.

8. The Diameter of $(M_2, g)$

In this section we will prove that $(M_2, g)$ has finite size, in an appropriate sense. Since one is interested in $(M_2, g)$ primarily for its geodesics, a linear measure of size is most meaningful, so we will consider its diameter. Recall that $(M_2, g)$, like any Riemannian manifold, has a natural distance function $d : M_2 \times M_2 \to \mathbb{R}$, where $d(W, W')$ is the infimum of lengths with respect to g of piecewise $C^1$ paths in $M_2$ connecting W and W′ (note that d has nothing to do with D, the distance function defined in Sect. 3, although they define equivalent topologies on $M_2$). The diameter of $(M_2, g)$ is simply the diameter of the associated metric space $(M_2, d)$, that is,
\[ \mathrm{diam}(M_2, g) = \sup_{W, W' \in M_2} d(W, W'). \tag{76} \]
Once again, it is the noncompactness of $M_2$ which makes this diameter interesting, and its finiteness nontrivial. The geometric meaning of the result is that all points lie within a bounded distance of each other, and, in particular, no point lies far from the end of $M_2$, where the static solutions collapse to singular, spiky configurations. Thus all static solutions are close to collapse in this geometry. This may be the underlying cause of the ubiquitous instability found in numerical simulations of two-lump scattering on the torus [7].

Theorem 3. The moduli space $(M_2, g)$ has finite diameter.

Proof. It suffices to prove that the covering space $(G, \tilde g)$ has finite diameter and, further, since $\tilde g = \hat g \oplus \delta$ is a product metric and $T^2$ is compact, it is sufficient to prove that the reduced covering space $(\hat G, \hat g)$ has finite diameter. By the triangle inequality for $d : \hat G \times \hat G \to \mathbb{R}$,
\[ \mathrm{diam}(\hat G, \hat g) \le 2 \sup_{W \in \hat G} d(W, W_0), \tag{77} \]
where $W_0$ is any point in $\hat G$. Let $W_0 = \wp$. We will explicitly construct a path from $W \in \hat G \cong SO(3) \times [\mathbb{R}^+ \times \mathbb{C}]$ to $W_0$, and bound its length independent of W. Let $W = U[\alpha(\wp + \rho)]$. The first piece of the path has $(\alpha, \rho) \in \mathbb{R}^+ \times \mathbb{C}$ fixed, but takes U to I. For example, since any U ∈ SU(2) is exp(u) for some u ∈ su(2), we could consider the path $\mathcal{U}(t) = \exp((1-t)u)$, so that $\mathcal{U}(0) = U$ while $\mathcal{U}(1) = I$. Denote by γ(α, ρ) the metric on SO(3) induced by $\hat g$ at fixed (α, ρ). Since SO(3) is compact the length of $\mathcal{U}(t)$ is bounded independent of U for each (α, ρ). One must check, however, that the length remains bounded as a function of (α, ρ). Since γ(α, ρ) is a left-invariant metric on SO(3), it suffices to show that
\[ \Gamma(\alpha, \rho) := \sum_{i,j=1}^{3} |\gamma_{ij}(\alpha, \rho)| \tag{78} \]
is a bounded function, where $\gamma_{ij}(\alpha, \rho)$ are the metric coefficients of γ evaluated at I ∈ SO(3) with respect to a particular choice of basis for $T_I SO(3)$. The basis used does not matter. One convenient choice consists of the three vectors represented by the curves
\[ \exp\!\left(it\frac{\tau_i}{2}\right), \quad i = 1, 2, 3, \quad t \in (-\epsilon, \epsilon), \tag{79} \]
where $\tau_i$ are the Pauli matrices (this is equivalent to choosing $\{i\tau_i/2 : i = 1, 2, 3\}$ as a basis for su(2)). Elementary calculation then shows that Γ < 3 for all (α, ρ). For example, $\gamma_{33}(\alpha, \rho)$ is the squared length of the vector $[\exp(it\tau_3/2)]$. Let $w := \alpha(\wp + \rho)$. Then
\[ W(z, t) = \exp\!\left(it\frac{\tau_3}{2}\right) w(z) = e^{it} w(z), \qquad \dot W(z) = \left.\frac{\partial W}{\partial t}\right|_{t=0} = i w(z), \tag{80} \]
and
\[ \gamma_{33}(\alpha, \rho) = \int_{T^2} \frac{|\dot W|^2}{(1+|W|^2)^2} = \int_{T^2} \frac{|w|^2}{(1+|w|^2)^2} < \frac{1}{2}. \tag{81} \]
Bounds on the other metric coefficients are equally straightforward. It remains to construct a path from α(℘ + ρ) to ℘ with length bounded above independent of (α, ρ). It is necessary to split $\mathbb{R}^+ \times \mathbb{C}$ into two pieces $X_+ \sqcup X_-$ and construct the path differently in each piece. Here $X_+ = \{(\alpha, \rho) \in \mathbb{R}^+ \times \mathbb{C} : \alpha > 1\}$ and $X_-$ is its complement. For any $(\alpha, \rho) \in X_-$ construct the path $x_- : [0, 1] \to \mathbb{R}^+ \times \mathbb{C}$, where
\[ x_-(t) = \begin{cases} (\alpha, (1-2t)\rho) & t \in [0, \tfrac12] \\ (1 + 2(1-\alpha)(t-1), 0) & t \in (\tfrac12, 1] \end{cases}, \tag{82} \]
so that $x_-(0) = (\alpha, \rho)$, $x_-(1) = (1, 0)$. Thinking of $\mathbb{R}^+ \times \mathbb{C}$ as the upper half of $\mathbb{R}^3$, this path consists of a horizontal line from (α, ρ) to (α, 0) followed by a vertical line from (α, 0) to (1, 0) (see Fig. 7). Its length is bounded above by the sum of the lengths of the curves $\{(\alpha, te^{i\psi}) : t \in [0, \infty)\}$ and $\{(t, 0) : t \in (0, 1]\}$, where ψ = arg ρ. So
\[ l[x_-] < l_1(\alpha, \psi) + l_3, \tag{83} \]
where
\[ \begin{aligned} l_1(\alpha, \psi) &= \int_0^\infty d|\rho|\, \sqrt{\hat g_{|\rho||\rho|}(\alpha, \rho)} = \int_0^\infty d|\rho| \left[\int_{T^2} \frac{\alpha^2}{(1+\alpha^2|\wp + |\rho|e^{i\psi}|^2)^2}\right]^{1/2}, \\ l_3 &= \int_0^1 d\alpha\, \sqrt{\hat g_{\alpha\alpha}(\alpha, 0)} = \int_0^1 d\alpha \left[\int_{T^2} \frac{|\wp|^2}{(1+\alpha^2|\wp|^2)^2}\right]^{1/2}. \end{aligned} \tag{84} \]
Fig. 7. The paths $x_-$ and $x_+$ constructed in the proof of Theorem 3 (axes: Re(ρ), Im(ρ) and α; the plane α = 1 separates $X_+$ from $X_-$)
That $l_3$ is finite follows directly from Lemma 2, since $\hat g_{\alpha\alpha}(\alpha, 0)$ is precisely f(α), the function previously discussed. To prove that $l_1(\alpha, \psi)$ is finite and bounded independent of (α, ψ) ∈ (0, 1] × [0, 2π] is more involved. By a change of variable, σ := α|ρ|, we can rewrite $l_1(\alpha, \psi)$ as
\[ l_1(\alpha, \psi) = \int_0^\infty d\sigma \left[\int_{T^2} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2}\right]^{1/2}. \tag{85} \]
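One must now appeal to a technical lemma, whose proof we postpone:

Lemma 3. There exist $\sigma_*$, C > 0, independent of (α, ψ), such that $\forall \sigma > \sigma_*$ and α ≤ 1,
\[ \int_{T^2} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2} < \frac{C}{\sigma^3}. \tag{86} \]
It follows that
\[ \begin{aligned} l_1(\alpha, \psi) &= \int_0^{\sigma_*} d\sigma \left[\int_{T^2} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2}\right]^{1/2} + \int_{\sigma_*}^\infty d\sigma \left[\int_{T^2} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2}\right]^{1/2} \\ &< \sigma_* + \int_{\sigma_*}^\infty d\sigma\, \sqrt{\frac{C}{\sigma^3}} < C' < \infty, \end{aligned} \tag{87} \]
C′ being a constant. Now, for any $(\alpha, \rho) \in X_+$ construct the path $x_+ : [0, 1] \to \mathbb{R}^+ \times \mathbb{C}$, where
\[ x_+(t) = \begin{cases} (\alpha - 2(\alpha - 1)t, \rho) & t \in [0, \tfrac12] \\ (1, 2(1-t)\rho) & t \in (\tfrac12, 1], \end{cases} \tag{88} \]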
consisting (see Fig. 7) of a vertical line from (α, ρ) to (1, ρ) followed by a horizontal line from (1, ρ) to (1, 0). Its length is bounded above by the sum of the lengths of the lines $\{(t, \rho) : t \in [1, \infty)\}$ and $\{(1, te^{i\psi}) : t \in [0, \infty)\}$. So
\[ l[x_+] < l_2(\rho) + l_1(1, \psi), \tag{89} \]
where $l_1$ was previously defined and
\[ l_2(\rho) = \int_1^\infty d\alpha\, \sqrt{\hat g_{\alpha\alpha}(\alpha, \rho)} = \int_1^\infty d\alpha \left[\int_{T^2} \frac{|\wp + \rho|^2}{(1+\alpha^2|\wp + \rho|^2)^2}\right]^{1/2}. \tag{90} \]
We have already shown that $l_1(1, \psi)$ is finite and bounded independent of ψ (this follows from Lemma 3 in the case α = 1). That $l_2(\rho)$ is finite ∀ρ ∈ $\mathbb{C}$ is easily shown, using an argument similar to that of Lemma 2. Let $z_1$, $z_2$ be the roots of ℘ + ρ (possibly coincident) and split $T^2$ into small neighbourhoods of these roots and their complement. In the complement use the trivial bound |℘ + ρ| ≥ C, constant, while near the roots use Laurent expansions of ℘ + ρ. One finds that $\hat g_{\alpha\alpha} < C'/\alpha^3$, which is sufficient for finiteness of $l_2(\rho)$ for all ρ, and boundedness of $l_2$ on any compact subset of $\mathbb{C}$. This is insufficient for our purposes, since $l_2(\rho)$ could grow unbounded as |ρ| → ∞. We again appeal to a technical lemma whose proof we postpone:

Lemma 4. For all ρ ∈ $\mathbb{C}$ such that $|\rho| > e_1 + 2$,
\[ \int_{T^2} \frac{|\wp + \rho|^2}{(1+\alpha^2|\wp + \rho|^2)^2} < \frac{2}{\alpha^4} + \frac{\pi}{2\alpha^4}\log(1+\alpha^2). \tag{91} \]
So for all ρ outside the closed disk $D_{e_1+2}(0)$,
\[ l_2(\rho) < \int_1^\infty d\alpha \left[\frac{2}{\alpha^4} + \frac{\pi}{2\alpha^4}\log(1+\alpha^2)\right]^{1/2} = C < \infty, \tag{92} \]
C being a constant. Hence $l_2$ is bounded independent of ρ, and all points in $\hat G$ lie within a bounded distance of ℘, and hence, one another.

Proof of Lemma 4. Let ρ ∈ $\mathbb{C}$ such that $|\rho| > e_1 + 2$. Since ℘ is an even function,
\[ \hat g_{\alpha\alpha}(\alpha, \rho) = 2\int_H \frac{|\wp + \rho|^2}{(1+\alpha^2|\wp + \rho|^2)^2}, \tag{93} \]
where $H = [0, 1) \times [0, \tfrac12)$ is the “half torus” (the point is that ℘ is injective on H). Split H into two pieces $H_+ \sqcup H_-$, where $H_+ = \{z \in H : |\wp + \rho| > 1\}$. Now,
\[ \int_{H_+} \frac{|\wp + \rho|^2}{(1+\alpha^2|\wp + \rho|^2)^2} < \int_{H_+} \frac{1}{\alpha^4|\wp + \rho|^2} < \int_{H_+} \frac{1}{\alpha^4} < \frac{1}{\alpha^4}. \tag{94} \]
To estimate the contribution of the $H_-$ region, we perform a variable change z ↦ u = ℘(z) on $H_-$. Since ℘ is injective on $H_-$, this variable change is well defined provided ℘ has no critical (i.e. double valency) points in $H_-$. The transformed integration range
$\wp(H_-)$ is a closed disk of unit radius centred on −ρ, so given that $|\rho| > e_1 + 2$, $\wp(H_-)$ contains none of $\{\infty, 0, \pm e_1\}$, and hence $H_-$ excludes all the double valency points. The Jacobian of the variable change is $|\wp'(z)|^{-2} = |4u(u^2 - e_1^2)|^{-1}$, so
\[ \int_{H_-} \frac{|\wp + \rho|^2}{(1+\alpha^2|\wp + \rho|^2)^2} = \frac{1}{4}\int_{\wp(H_-)} \frac{du\, d\bar u}{|u|\,|u^2 - e_1^2|}\, \frac{|u + \rho|^2}{(1+\alpha^2|u + \rho|^2)^2}. \tag{95} \]
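Now, for all $u \in \wp(H_-)$, $|u| \ge e_1 + 1 > 1$, and $|u \pm e_1| \ge \bigl||u| - e_1\bigr| \ge 1$, so
\[ \begin{aligned} \int_{H_-} \frac{|\wp + \rho|^2}{(1+\alpha^2|\wp + \rho|^2)^2} &< \frac{1}{4}\int_{\wp(H_-)} du\, d\bar u\, \frac{|u + \rho|^2}{(1+\alpha^2|u + \rho|^2)^2} \\ &= \frac{\pi}{2}\int_0^1 dx\, \frac{x^3}{(1+\alpha^2 x^2)^2} \qquad (x := |u + \rho|) \\ &= \frac{\pi}{4\alpha^4}\int_1^{1+\alpha^2} dy\, \frac{y - 1}{y^2} \qquad (y := 1 + \alpha^2 x^2) \\ &< \frac{\pi}{4\alpha^4}\log(1+\alpha^2). \end{aligned} \tag{96} \]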
Using inequalities (94) and (96) in Eq. (93), the result immediately follows.
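The radial step in (96) can be checked numerically. The sketch below compares a quadrature of the x-integral with the closed form obtained from the substitution $y = 1 + \alpha^2 x^2$, and with the logarithmic bound. The function names and the sample values of α are illustrative assumptions, not taken from the original.

```python
import math


def radial_numeric(a, n=100_000):
    """Midpoint rule for int_0^1 x^3 / (1 + a^2 x^2)^2 dx."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** 3 / (1.0 + a * a * x * x) ** 2
    return total * h


def radial_closed(a):
    """Closed form after y = 1 + a^2 x^2.

    int_1^{1+a^2} (y - 1)/y^2 dy = log(1 + a^2) - a^2/(1 + a^2),
    with the prefactor 1/(2 a^4) coming from x^3 dx = (y - 1) dy / (2 a^4).
    """
    return (math.log(1 + a * a) - a * a / (1 + a * a)) / (2 * a ** 4)


def log_bound(a):
    """Upper bound log(1 + a^2) / (2 a^4), i.e. (96) without the factor pi/2."""
    return math.log(1 + a * a) / (2 * a ** 4)


if __name__ == "__main__":
    for a in (1.0, 2.5, 10.0):
        print(a, radial_numeric(a), radial_closed(a), log_bound(a))
```

Multiplying each column by π/2 recovers the quantities that appear in (96).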
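Proof of Lemma 3. The idea is similar to the proof of Lemma 4:
\[ \int_{T^2} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2} = 2\left[\int_{H_+} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2} + \int_{H_-} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2}\right], \tag{97} \]
where again $H_+ \sqcup H_- = H$, the half torus, but now
\[ H_+ = \{z \in H : |\alpha\wp + \sigma e^{i\psi}| > \sigma^{3/4}\}. \tag{98} \]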
The $H_+$ integral is trivially bounded by $1/\sigma^3$. We make the same variable change z ↦ u = ℘(z) on $H_-$. Now $\wp(H_-)$ is a closed disk of radius $\sigma^{3/4}/\alpha$ centred on $\sigma e^{i(\psi+\pi)}/\alpha$. In order that $\wp(H_-)$ contain none of $\{\infty, 0, \pm e_1\}$, it suffices that $\sigma \ge \sigma_c$, where $\sigma_c$ is the real solution of
\[ \sigma_c - \sigma_c^{3/4} = 2e_1. \tag{99} \]
To see this, note that $\forall u \in \wp(H_-)$,
\[ |u| \ge \frac{\sigma - \sigma^{3/4}}{\alpha} \ge \sigma - \sigma^{3/4} \ge \sigma_c - \sigma_c^{3/4} = 2e_1, \tag{100} \]
where the restriction α ≤ 1 has been used. So the variable change is well defined provided $\sigma \ge \sigma_c$. Recall that the Jacobian of the transformation is $|4u(u^2 - e_1^2)|^{-1}$. Now $\forall u \in \wp(H_-)$, $|u| \ge 2e_1$ as shown above. Hence
\[ |u|^3 = |u|^2 |u| \ge 4e_1^2 |u|, \tag{101} \]
and
\[ |u(u^2 - e_1^2)| \ge \bigl||u|^3 - e_1^2|u|\bigr| = |u|^3 - e_1^2|u| \ge |u|^3 - \frac{1}{4}|u|^3 = \frac{3}{4}|u|^3. \tag{102} \]
Thus,
\[ \int_{H_-} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2} \le \frac{1}{3}\int_{\wp(H_-)} \frac{du\, d\bar u}{|u|^3}\, \frac{1}{(1+|\alpha u + \sigma e^{i\psi}|^2)^2}. \tag{103} \]
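Now let $\tilde\sigma_c$ be the real solution of
\[ \tilde\sigma_c - \tilde\sigma_c^{3/4} = \frac{\tilde\sigma_c}{2}, \tag{104} \]
and define $\sigma_* = \sup\{\sigma_c, \tilde\sigma_c\}$. Then, provided $\sigma \ge \sigma_*$, for all $u \in \wp(H_-)$,
\[ |u| \ge \frac{\sigma - \sigma^{3/4}}{\alpha} \ge \frac{\sigma}{2\alpha}. \tag{105} \]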
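This allows one to estimate the $|u|^{-3}$ part of the integrand of inequality (103), which still holds since $\sigma_* \ge \sigma_c$:
\[ \begin{aligned} \int_{H_-} \frac{1}{(1+|\alpha\wp + \sigma e^{i\psi}|^2)^2} &\le \frac{8\alpha^3}{3\sigma^3}\int_{\wp(H_-)} \frac{du\, d\bar u}{(1+|\alpha u + \sigma e^{i\psi}|^2)^2} \\ &= \frac{8\alpha^3}{3\sigma^3}\,\frac{2\pi}{\alpha^2}\int_0^{\sigma^{3/4}} dx\, \frac{x}{(1+x^2)^2} \qquad (x := |\alpha u + \sigma e^{i\psi}|) \\ &< \frac{C\alpha}{\sigma^3} \le \frac{C}{\sigma^3}. \end{aligned} \tag{106} \]
The result immediately follows.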
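The two thresholds entering $\sigma_*$ are easy to evaluate. The sketch below solves (99) by bisection and uses the closed form of the positive root of (104), which is $\sigma^{1/4} = 2$. The value of $e_1$ used here is a hypothetical placeholder (in the paper $e_1$ is fixed by the lattice), and the function names are illustrative.

```python
def gap(sigma):
    """Left-hand side shared by (99) and (104): sigma - sigma**(3/4)."""
    return sigma - sigma ** 0.75


def solve_sigma_c(e1, hi=1e9, tol=1e-12):
    """Bisection for the root sigma_c > 1 of gap(sigma) = 2*e1, with e1 > 0.

    gap vanishes at sigma = 1 and is strictly increasing beyond it, so the
    root is unique on [1, hi] whenever gap(hi) exceeds the target.
    """
    target = 2.0 * e1
    lo = 1.0
    if gap(hi) <= target:
        raise ValueError("upper bracket too small for this e1")
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if gap(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sigma_tilde_c():
    """Positive root of (104): sigma/2 = sigma**(3/4), i.e. sigma**(1/4) = 2."""
    return 2.0 ** 4


E1_ILLUSTRATIVE = 7.0  # hypothetical value, NOT the lattice constant of the paper

sigma_c = solve_sigma_c(E1_ILLUSTRATIVE)
sigma_star = max(sigma_c, sigma_tilde_c())

if __name__ == "__main__":
    print(sigma_c, sigma_tilde_c(), sigma_star)
```

For any σ above `sigma_tilde_c()` the inequality $\sigma - \sigma^{3/4} \ge \sigma/2$ used in (105) holds, since it is equivalent to $\sigma^{1/4} \ge 2$.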
9. Conclusion

In this paper we have considered the low-energy dynamics of two $CP^1$ lumps moving on a torus in the framework of the geodesic approximation. We have proved that the degree 2 moduli space $M_2$ is homeomorphic to the left coset space $G/G_0$, where G is the eight-dimensional, noncompact Lie group $PSL(2,\mathbb{C}) \times T^2$ and $G_0$ is a discrete subgroup of order 4. This result provides a good global parametrization of $M_2$ with unconstrained parameters, based on the Weierstrass ℘ function (this situation should be compared with other studies where $M_2$ was parametrized using the Weierstrass σ function and constrained parameters [7, 20]), and allows a systematic description of the degree 2 static solutions, some of which display four rather than, as one might expect, two distinct energy peaks. By lifting the metric g on $M_2$ defined by the kinetic energy to $\tilde g$ on the covering space G, we showed that the dynamics decouples into a trivial “centre of mass” motion and a nontrivial relative motion of the lumps. This reduces the problem to geodesic motion in a 6-dimensional reduced covering space $(\hat G, \hat g)$. Two further results were proved concerning the Riemannian geometry of $(M_2, g)$, namely that the moduli space is geodesically incomplete and has finite diameter. These imply that static lumps can collapse to singularities in finite time, and that all static solutions are close to such singularities. In addition, a two dimensional geodesic submanifold was identified, and its geometry and geodesics described in detail. To make further progress in solving the geodesic problem for $(M_2, g)$ one would need to resort to numerical solution of the geodesic equation. Given the explicit parametrization of $M_2$, and that the metric components are integrals over a compact, two dimensional domain, such numerical work should be reasonably economical. In particular, there are two 3 dimensional geodesic submanifolds whose geodesic problems would be well suited to numerical study, and which should yield interesting lump dynamics. These are $\Sigma_P$ and $\Sigma_{PR}$, the projected fixed point sets of the isometries $P, PR : \hat G \to \hat G$. Explicitly,
\[ \begin{aligned} \Sigma_P &= \{\exp(i\psi\tau_2/2)\,[\alpha(\wp(z) + \rho_1)] : \psi \in [0, 2\pi],\ \alpha \in \mathbb{R}^+,\ \rho_1 \in \mathbb{R}\}, \\ \Sigma_{PR} &= \{\exp(i\psi\tau_1/2)\,[\alpha(\wp(z) + i\rho_2)] : \psi \in [0, 2\pi],\ \alpha \in \mathbb{R}^+,\ \rho_2 \in \mathbb{R}\}, \end{aligned} \tag{107} \]
so both are internal rotation orbits, about (different) fixed axes, of the α(℘ + ρ) family, but with ρ real ($\Sigma_P$) or purely imaginary ($\Sigma_{PR}$). On these submanifolds, therefore, the two lumps, when distinct, are constrained to lie either on the central cross and boundary of the unit square, or its diagonals, respectively. In either case, they can only scatter through 90°. In the case of $\Sigma_P$, for example, any geodesic which punctures any of the cylinders $\rho_1 = 0$, $\rho_1 = e_1$, $\rho_1 = -e_1$ at α much greater than $1/e_1$ gives rise to 90° scattering of the lumps. Similarly, any geodesic which punctures the $\rho_2 = 0$ cylinder in $\Sigma_{PR}$ gives rise to 90° scattering along the diagonals. Both these processes have been observed in numerical simulations of the field equation [7]. To understand the long time behaviour of the geodesics after the scattering event would require detailed numerical work. Other extensions of the present work would be interesting.
One can extend the geodesic incompleteness results proved here for (M2 , g) on T 2 and elsewhere [24] for (M1 , g) on S 2 to the general setting of (Mn , g) for the CP 1 model on an arbitrary compact Riemann surface [21]. It may well be possible to similarly extend our result concerning the finite diameter of moduli space to the general setting. Also, one would expect that (M2 , g) has finite volume, as well as diameter (although neither guarantees the other), and perhaps this can be established by making refined versions of estimates such as those in Lemmas 2, 3 and 4. Finally, it should be emphasised that all our results concern an approximation to the field theory. While this has proved remarkably successful in all situations where it has been tested, one would ideally like rigorous analysis to back up physical intuition. Given the singularity of the geometry of moduli space for the planar CP 1 model, the model on the torus provides an ideal starting point for an analysis fashioned after Stuart’s work on vortices and monopoles [26]. Acknowledgement. I would like to thank Sharad Agnihotri, Jay Handfield and Carlo Morpurgo for several helpful discussions.
References
1. Atiyah, M.F., Hitchin, N.J.: The Geometry and Dynamics of Magnetic Monopoles. Princeton NJ: Princeton University Press, 1988
2. Beardon, A.F.: A Primer on Riemann Surfaces. Cambridge: Cambridge University Press, 1984, p. 89
3. Beardon, A.F.: op. cit., p. 81
4. Belavin, A.A., Polyakov, A.M.: Metastable states of two-dimensional isotropic ferromagnets. JETP Lett. 22, 245–247 (1975)
5. Choquet-Bruhat, Y., DeWitt-Morette, C., Dillard-Bleick, M.: Analysis, Manifolds and Physics, Part I. Amsterdam: North-Holland, 1982, p. 13
6. Choquet-Bruhat, Y., DeWitt-Morette, C., Dillard-Bleick, M.: op. cit., p. 43
7. Cova, R.J., Zakrzewski, Z.J.: Soliton scattering in the O(3) model on a torus. Preprint hep-th/9706166 (1997)
8. Du Val, P.: Elliptic Functions and Elliptic Curves. Cambridge: Cambridge University Press, 1973, p. 13
9. Gallot, S., Hulin, D., Lafontaine, J.: Riemannian Geometry. Berlin: Springer-Verlag, 1987, p. 13
10. Gallot, S., Hulin, D., Lafontaine, J.: op. cit., p. 58
11. Guillemin, V., Pollack, A.: Differential Topology. Englewood Cliffs NJ: Prentice-Hall, 1974, p. 146
12. Hitchin, N.J., Manton, N.S., Murray, M.K.: Symmetric monopoles. Nonlinearity 8, 661–692 (1995)
13. Knopp, K.: Theory of Functions, Part 2. New York: Dover, 1947, p. 77
14. Knopp, K.: op. cit., pp. 86–91
15. Lawden, D.F.: Elliptic Functions and Applications (Chap. 6). New York: Springer-Verlag, 1989
16. Leese, R.A.: Low-energy scattering of solitons in the CP^1 model. Nucl. Phys. B344, 33–72 (1990)
17. Manton, N.S.: A remark on the scattering of BPS monopoles. Phys. Lett. 110B, 54–56 (1982)
18. Myers, E., Rebbi, C., Strilka, R.: Study of the interaction and scattering of vortices in the abelian Higgs (or Ginzburg-Landau) model. Phys. Rev. D45, 1355–1364 (1992)
19. Penrose, R., Rindler, W.: Spinors and space-time, Vol. 1. Cambridge: Cambridge University Press, 1984, pp. 18–21
20. Richard, J-L., Rouet, A.: The CP^1 model on the torus. Nucl. Phys. B211, 447–464 (1983)
21. Sadun, L.A., Speight, J.M.: Geodesic incompleteness in the CP^1 model on a compact Riemann surface. To appear in Lett. Math. Phys.
22. Samols, T.M.: Vortex scattering. Commun. Math. Phys. 145, 149–179 (1992)
23. Schwerdtfeger, H.: Geometry of Complex Numbers. New York: Dover, 1979, p. 41
24. Speight, J.M.: Low-energy dynamics of a CP^1 lump on the sphere. J. Math. Phys. 36, 796–813 (1995)
25. Strachan, I.A.B.: Low-velocity scattering of vortices in a modified Abelian Higgs model. J. Math. Phys. 33, 102–110 (1992)
26. Stuart, D.: Dynamics of abelian Higgs vortices in the near Bogomolny regime. Commun. Math. Phys. 159, 51–91 (1994)
27. Thurston, W.P.: Three-Dimensional Geometry and Topology. Princeton NJ: Princeton University Press, 1997, pp. 153–157
28. Ward, R.S.: Slowly moving lumps in the CP^1 model in (2 + 1) dimensions. Phys. Lett. 158B, 424–428 (1985)
29. Zakrzewski, W.J., Piette, B.: Shrinking of solitons in the (2+1)-dimensional S^2 sigma model. Nonlinearity 9, 897–910 (1996)

Communicated by A. Jaffe
Commun. Math. Phys. 194, 541 – 567 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Existence of Gelling Solutions for Coagulation-Fragmentation Equations
Intae Jeon
Department of Mathematics, Ohio State University, Columbus, Ohio 43210, USA. E-mail: [email protected]
Received: 25 April 1997 / Accepted: 4 November 1997
Abstract: We study the Smoluchowski coagulation-fragmentation equation, which is an infinite set of non-linear ordinary differential equations describing the evolution of a mono-disperse system of particles in a well stirred solution. Approximating the solutions of the Smoluchowski equations by a sequence of finite Markov chains, we investigate the qualitative behavior of the solutions. We determine a device on the finite chains which can detect the gelation phenomena – the density dropping phenomena. It shows how the gelation phenomena are reflected on the sequence of finite Markov chains. Using this device, we determine various types of gelation kernels and get the bounds of gelation times.
0. Introduction

The Smoluchowski coagulation-fragmentation equation is an infinite set of non-linear ordinary differential equations describing the evolution of a mono-disperse system of particles in a well stirred solution, given by

\[
\dot C_t(j) = \frac{1}{2} \sum_{k=1}^{j-1} \{K(j-k,k)\,C_t(j-k)\,C_t(k) - F(j-k,k)\,C_t(j)\} - \sum_{k=1}^{\infty} \{K(j,k)\,C_t(j)\,C_t(k) - F(j,k)\,C_t(j+k)\}, \tag{1}
\]

for $j = 1, 2, 3, \dots$, where $C_t(j) \ge 0$ is the expected number of $j$-clusters (clusters consisting of $j$ particles) per unit volume, and $K$, $F$ are nonnegative symmetric functions: $K(i,j)$ is the coagulation rate of an $i$-cluster and a $j$-cluster, and $F(i,j)$ is the fragmentation rate of an $(i+j)$-cluster breaking up into an $i$-cluster and a $j$-cluster [Sm, D, BCC, vE3]. Here particles may coagulate to form clusters of $k$ particles for $k \ge 1$, and large clusters may fragment into smaller ones. The equations describe the dynamics of the density of $k$-clusters per unit volume based on the rates at which these events take place
assuming only second order reactions, that is, only two clusters may coagulate at a time and, when a cluster fragments, it breaks into exactly two smaller clusters. Since phenomena such as polymerization, cloud formation, star formation, and binary alloys follow these dynamics, the model has many applications in science, for example in astrophysics, atmospheric physics, colloidal chemistry, and polymer science [Sm, S, Sc, D].

Many studies have been devoted to these models recently, including deterministic and stochastic models, especially for the pure coagulation case ($F(i,j) \equiv 0$) [HEZ, LT, BP, BW, vE1, vE2]. Most recently, while we were preparing this monograph, Aldous, in his preprint, surveyed the broad scientific literature in this area and raised many interesting questions. Some direct and indirect answers are contained in this paper [A].

One expects the total density of particles ($\rho = \sum_{i=1}^{\infty} i\,C_t(i)$) to be conserved. However, explicit solutions for the case $K(i,j) = ij$ show that this is not always true. It may happen that the total density of particles decreases after a finite time, a phenomenon interpreted as gelation, for it is taken to signal the formation of a cluster of infinite size.

We treat the problem of gelation probabilistically by approximating the solutions of the Smoluchowski equations by a sequence of finite state Markov chains. This is a type of Law of Large Numbers. By making estimates on the approximating Markov chains, we derive conditions under which the Smoluchowski equations have a solution which exhibits the gelation phenomenon, and we derive explicit bounds on the gelation time.

This paper is organized as follows. After this introduction, in Sect. 1, we construct finite Markov chains on the space $l_2$, which will be used to approximate the solution of the Smoluchowski equation. We also define various types of gelation. In Sect. 2, we state the main theorems and corollaries: Theorems 1 and 2 show the tightness of the chains and the existence of solutions for the Smoluchowski equation, and Theorems 3 and 4 show when such gelation occurs. The proofs of Theorems 1 and 2 are given in Sect. 3. In Sect. 4, we develop machinery to detect gelation phenomena and see how the gelation phenomena are reflected on finite chains. Using this machinery we prove Theorems 3 and 4 in Sect. 5.
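As a rough numerical illustration of the density drop for $K(i,j) = ij$ mentioned above, the following sketch integrates the pure-coagulation equations ($F \equiv 0$) truncated to cluster sizes $1, \dots, n_{\max}$ with an explicit Euler step. The truncation, the step size, the monodisperse initial data $C_0(1) = 1$, and the function name are all assumptions of this sketch and are not taken from the paper; mass that coagulates beyond $n_{\max}$ simply leaves the truncated system and only stands in for the gel.

```python
def truncated_smoluchowski_mass(n_max, t_end, dt, kernel):
    """Explicit Euler integration of equation (1) with F = 0, truncated to
    cluster sizes 1..n_max. Returns sum_j j*C_t(j) at time t_end.

    Assumptions of this sketch: monodisperse initial data C_0(1) = 1 and
    clusters larger than n_max are discarded (they never interact again).
    """
    c = [0.0] * (n_max + 1)   # c[j] approximates C_t(j); index 0 is unused
    c[1] = 1.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dc = [0.0] * (n_max + 1)
        for j in range(1, n_max + 1):
            # gain: (1/2) sum_{k<j} K(j-k,k) C(j-k) C(k)
            gain = 0.5 * sum(kernel(j - k, k) * c[j - k] * c[k] for k in range(1, j))
            # loss: C(j) * sum_k K(j,k) C(k), restricted to k <= n_max
            loss = c[j] * sum(kernel(j, k) * c[k] for k in range(1, n_max + 1))
            dc[j] = gain - loss
        for j in range(1, n_max + 1):
            c[j] += dt * dc[j]
    return sum(j * c[j] for j in range(1, n_max + 1))
```

Calling `truncated_smoluchowski_mass(60, 0.5, 0.005, lambda i, j: i * j)` and then the same with `t_end=2.0` lets one compare the retained mass well before and well after the classical gelation time of the product kernel; the result depends on the truncation and should be read qualitatively only.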
1. Construction

In this section we construct a sequence of finite state Markov chains associated to the rate constants $K(i,j)$ and $F(i,j)$, $i, j \ge 1$, in the Smoluchowski coagulation-fragmentation equation. In the $n$th Markov chain, there are on the order of $n$ particles, each of a size inversely proportional to $n$, which can coagulate to form clusters. These larger clusters can coagulate among themselves or fragment to form smaller clusters, at rates proportional to $n$, determined by $K(i,j)$ and $F(i,j)$. This is a law of large numbers, or Euler scaling. With this scaling, the Markov chains can be thought of as discrete, stochastic approximations to solutions of the Smoluchowski equations.

Notation.
(a) Let $\mathbb{N} = \{0, 1, 2, \dots\}$, $\mathbb{N}^+ = \{1, 2, 3, \dots\}$.
(b) Let $l_2 = \{\eta \in \mathbb{R}^{\mathbb{N}^+} : \|\eta\|_{l_2} = (\sum_{k=1}^{\infty} |\eta(k)|^2)^{1/2} < \infty\}$.
For $\rho > 0$,
(c) let $E^\rho = \{\eta \in l_2 : \eta(k) \ge 0 \text{ for all } k \text{ and } \sum_{k=1}^{\infty} k\,\eta(k) \le \rho\}$,
(d) let $E_n^\rho = \{\frac{\rho}{n}\eta \in l_2 : \eta \in \mathbb{N}^{\mathbb{N}^+},\ \sum_{k=1}^{\infty} k\,\eta(k) = n\}$.
For $\eta \in E^\rho$, $m, n \ge 1$,
(e) let $\|\eta\| = \sum_{k=1}^{\infty} k\,\eta(k)$,
(f) let $\|\eta\|_n = \sum_{k=1}^{n} k\,\eta(k)$,
(g) let $\|\eta\|^n_m = \sum_{k=m}^{n} k\,\eta(k)$.
(h) $[\cdot]$ represents the greatest integer function.
(i) Let $\{e_i\}_{i=1}^{\infty}$ be the basis of $l_2$.

Remark. The density functional $\eta \mapsto \|\eta\| = \sum_{k=1}^{\infty} k\,\eta(k)$ is not continuous on $(E^\rho, \|\cdot\|_{l_2})$. For example, let $\eta_n = \frac{\rho}{n} e_n$. Then $\|\eta_n\| = \rho$ for every $n$, yet $\lim_{n\to\infty} \eta_n = 0$. However, each partial sum $\eta \mapsto \|\eta\|_l = \sum_{k=1}^{l} k\,\eta(k)$ is continuous.

Let $\{K(i,j)\}_{i,j=1}^{\infty}$, $\{F(i,j)\}_{i,j=1}^{\infty}$ be nonnegative, symmetric sequences, that is, $K(i,j) = K(j,i)$ and $F(i,j) = F(j,i)$. For all bounded functions $f$ on $E_n^\rho$ consider the generator $G_n$ given by

\[
G_n f(\eta) = \frac{n}{2\rho} \sum_{i+j \le n} \Big[ \{f(\eta + \Delta^n_{ij}) - f(\eta)\}\, K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big) + \{f(\eta - \Delta^n_{ij}) - f(\eta)\}\, F(i,j)\,\eta(i+j) \Big], \tag{2}
\]

where $\Delta^n_{ij} = \frac{\rho}{n}(e_{i+j} - e_i - e_j)$, and $\delta_{ij} = 1$ if $i = j$, and $0$ if $i \ne j$. Let $X_t^{n,\rho}$ denote the corresponding Markov chain on $E_n^\rho$.

Informally we may describe the dynamics as follows. The process waits at state $\eta$ for an exponentially distributed amount of time with parameter

\[
\lambda_n(\eta) = \frac{n}{2\rho} \sum_{i+j \le n} \Big\{ K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big) + F(i,j)\,\eta(i+j) \Big\},
\]

then jumps to state $\eta + \Delta^n_{ij}$ with probability

\[
\frac{n}{2\rho\,\lambda_n(\eta)}\, K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big),
\]

or to state $\eta - \Delta^n_{ij}$ with the complementary probability

\[
\frac{n}{2\rho\,\lambda_n(\eta)}\, F(i,j)\,\eta(i+j),
\]

where $i, j \ge 1$, $i + j \le n$. In any event, since the state space consists of finitely many points for all $n$, i.e., $|E_n^\rho| < \infty$, there is a unique well defined pure jump process on $E_n^\rho$ for each $n$. This process is strong Markov and has the characteristic property that for any bounded function $f$ on $E_n^\rho$,

\[
f(X_t^{n,\rho}) - \int_0^t G_n f(X_s^{n,\rho})\,ds \tag{3}
\]
is a martingale.

Definition 1. A stochastic coagulation-fragmentation system with density $\rho$ consists of the triple $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ such that
(a) $\{K(i,j)\}_{i,j=1}^{\infty}$, $\{F(i,j)\}_{i,j=1}^{\infty}$ are nonnegative, symmetric sequences, which are called coagulation and fragmentation kernels, respectively.
(b) $X_t^{n,\rho}$ is the pure jump process on the probability space $(\Omega_n, \mathcal{A}, P_n)$ whose Markov generator is $G_n$.
(c) $X_0^{n,\rho} = \eta^n$, for $\eta^n \in E_n^\rho$, and $\eta^n \to \eta_0$ in $(E^\rho, \|\cdot\|_{l_2})$, where $\|\eta_0\|_{l_2} = \rho$.

Note. From now on, if there is no confusion, we will drop $\rho$ from the index.

Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. For $0 < \alpha \le 1$, let $\tau_n^\alpha$ be the first hitting time of cluster size greater than or equal to $\alpha n$, i.e.,

\[
\tau_n^\alpha := \inf\{t : X_t^n(k) > 0 \text{ for some } k \ge \alpha n\}.
\]

Note that if $\alpha < \beta$, then $\tau_n^\alpha \le \tau_n^\beta$ a.s.

Definition 2.
(a) The strong gelation time for a coagulation-fragmentation system is defined by
\[
t_{sg} = \inf\{t > 0 : \exists\, 0 < \alpha \le 1 \text{ such that } \limsup_{n\to\infty} P\{\tau_n^\alpha \le t\} > 0\}.
\]
(b) The gelation time of a solution $C_t$ of the Smoluchowski equations is defined by
\[
t_g = \inf\{t > 0 : \|C_t\| < \|C_0\|\}.
\]
(c) The stochastic gelation (sto-gel) time of a weak limit $X$ of $X^{n,\rho}$ is defined by
\[
T_g = \inf\{t > 0 : P\{\|X_t\| < \|X_0\|\} > 0\}.
\]
(d) We say that strong gelation occurs, gelation occurs and sto-gel occurs if $t_{sg} < \infty$, $t_g < \infty$ and $T_g < \infty$, respectively.

Remark. In the case of pure coagulation ($F(i,j) \equiv 0$), strong gelation always implies sto-gel, but this is not clear if we have nonzero fragmentation kernels. We will deal with the relations between strong gelation and sto-gel in Sect. 4.2.

The Becker–Döring equation is a special case of the Smoluchowski coagulation-fragmentation equation which allows only coagulation and fragmentation involving single particles (see [BCC]). We generalize this as follows:

Definition 3. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. We say that it is a generalized Becker–Döring system of degree $r$ (or $r$-generalized Becker–Döring system) if there is an integer $r > 0$ such that $F(i,j) = 0$ for all $i, j$ with $\min(i,j) > r$.
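The jump dynamics described in this section can be sketched as a direct (Gillespie-type) simulation. This is only an illustrative sketch under stated assumptions: it tracks the integer cluster counts $m(k) = \frac{n}{\rho} X_t^n(k)$ instead of $X_t^n$ itself, starts from $n$ single particles, and uses hypothetical names (`simulate_chain`, `_add`) that do not appear in the paper. In terms of counts, the generator (2) gives rate $\frac{\rho}{2n} K(i,j)\, m(i)(m(j) - \delta_{ij})$ for the ordered coagulation pair $(i,j)$ and rate $\frac{1}{2} F(i,j)\, m(i+j)$ for the ordered fragmentation pair $(i,j)$.

```python
import random


def _add(counts, size, delta):
    """Change the number of clusters of a given size, dropping empty entries."""
    new_value = counts.get(size, 0) + delta
    if new_value == 0:
        counts.pop(size, None)
    else:
        counts[size] = new_value


def simulate_chain(n, rho, K, F, t_end, seed=0):
    """Simulate cluster counts m(k) = (n/rho) * X_t^n(k) up to time t_end.

    Assumption of this sketch: monodisperse start with n single particles.
    """
    rng = random.Random(seed)
    counts = {1: n}
    t = 0.0
    while True:
        events, weights = [], []
        sizes = sorted(counts)
        for i in sizes:                      # coagulation of ordered pairs (i, j)
            for j in sizes:
                same = 1 if i == j else 0
                rate = rho / (2.0 * n) * K(i, j) * counts[i] * (counts[j] - same)
                if rate > 0:
                    events.append(("coag", i, j))
                    weights.append(rate)
        for k in sizes:                      # fragmentation k -> (i, k - i)
            for i in range(1, k):
                rate = 0.5 * F(i, k - i) * counts[k]
                if rate > 0:
                    events.append(("frag", i, k - i))
                    weights.append(rate)
        total = sum(weights)
        if total == 0:
            break                            # absorbing state: nothing can happen
        t += rng.expovariate(total)          # exponential holding time
        if t > t_end:
            break
        kind, i, j = rng.choices(events, weights=weights)[0]
        if kind == "coag":
            _add(counts, i, -1)
            _add(counts, j, -1)
            _add(counts, i + j, +1)
        else:
            _add(counts, i + j, -1)
            _add(counts, i, +1)
            _add(counts, j, +1)
    return counts
```

For instance, `simulate_chain(30, 1.0, lambda i, j: i * j, lambda i, j: 0.0, 5.0)` returns a dictionary of cluster sizes and counts; the total number of particles $\sum_k k\, m(k)$ stays equal to $n$ along every path, which matches the definition of $E_n^\rho$.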
2. Statement of the main theorems

Theorem 1. (Tightness) Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the kernels satisfy
(a) (TC1) $\sup_{\eta \in E} \sum_{i,j \ge 1} K(i,j)\,\eta(i)\,\eta(j) < \infty$,
(b) (TC2) $\sup_{\eta \in E} \sum_{i,j \ge 1} F(i,j)\,\eta(i+j) < \infty$,
then the laws of $\{X^n\}$ form a tight sequence.
The above theorem shows the existence of weak limits. Natural questions then arise about whether the weak limits solve the Smoluchowski coagulation-fragmentation equation and whether they show the density dropping phenomena (sto-gel). The following theorems answer those questions for some interesting classes of coagulation and fragmentation kernels.

Theorem 2. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the kernels satisfy the conditions
(a) $\lim_{i+j\to\infty} \frac{K(i,j)}{ij} = 0$,
(b) there exists $G(i+j)$ such that $F(i,j) \le G(i+j)$ for all $i, j$ and $\lim_{i+j\to\infty} G(i+j) = 0$,
then there exists a weak limit $X_t$ of $X_t^n$, and it solves the system of the integral version of the Smoluchowski equation on any interval $t \in [0,T]$, $T < \infty$.

Theorem 3. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the kernels satisfy
(a) $\varepsilon (ij)^\alpha \le K(i,j) \le M ij$ for some $\frac{1}{2} < \alpha \le 1$, $\varepsilon > 0$, $M < \infty$,
(b) $F(i,j) \equiv 0$,
then there exists a weak limit $X_t$ of $X_t^n$, and $T_g \le \frac{C(\alpha)}{\varepsilon\rho}$, where

\[
C(\alpha) = \inf_{0 < \beta < \frac{2\alpha-1}{2}} \frac{2^{\beta}}{(2^{\beta}-1)^2} + \inf_{0 < \beta < \frac{2\alpha-1}{2}} \frac{2^{2+2\alpha+\beta}}{(2^{\beta}-1)^2\,(2^{2\alpha-1-\beta}-1)}. \tag{4}
\]
Combining Theorems 2 and 3, we have the following corollary.

Corollary 1. Suppose the rates satisfy the conditions
(a) $\lim_{i+j\to\infty} \frac{K(i,j)}{ij} = 0$,
(b) $\varepsilon (ij)^\alpha \le K(i,j)$ for some $\frac{1}{2} < \alpha < 1$, $\varepsilon > 0$,
(c) $F(i,j) \equiv 0$,
then there exists a solution of the Smoluchowski equation with $t_g < \infty$.

Remark. Our result only concerns the existence of gelling solutions. The uniqueness of solutions of the Smoluchowski coagulation-fragmentation equation, with kernels that satisfy the conditions in Corollary 1, has not been proved yet. However, it seems highly probable, since the case $K(i,j) = ij$ is proved to have a unique solution, see [K]. As Aldous points out in [A], Leyvraz and Tschudi conjectured in 1982 that $K(i,j) = (ij)^\alpha$, $\frac{1}{2} < \alpha < 1$, is a gelling kernel (see [LT]), and this has since been widely accepted in the scientific modeling literature (for the meaning, see [A]). To the best of our knowledge, Corollary 1 is the first general rigorous result about the conjecture.

Theorem 4. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose the rates satisfy
(a) the tightness conditions (TC1), (TC2) in Theorem 1,
(b) $\inf_{i,j \ge 1} \frac{K(i,j)}{ij} = \varepsilon$.
Define $\beta$ by setting $\rho\beta = \sup_\eta \sum_{i,j} F(i,j)\,\eta(i+j)$. Note that $\beta$ is independent of $\rho$. If $\rho > \frac{\beta}{\varepsilon}$, then strong gelation occurs with $t_{sg} \le \frac{2}{\varepsilon(\rho - \rho_c)}$ and $T_g \le t_{sg}$.

Combining Theorems 4 and 5 and Theorem 6 in Sect. 4.2, we have

Corollary 2. Suppose coagulation and fragmentation kernels satisfy
(a) $\varepsilon ij \le K(i,j) \le M ij$, for some $0 < \varepsilon \le M < \infty$,
(b) $0 \le F(i,j) \le \frac{C}{i+j}$, for some $0 \le C < \infty$.
If $\rho > \frac{C}{\varepsilon}$, then sto-gel occurs and $T_g \le \frac{4}{2\varepsilon\rho - C}$.

Corollary 3. Suppose coagulation and fragmentation rates satisfy
(a) $\varepsilon ij \le K(i,j) \le M ij$, for some $0 < \varepsilon \le M < \infty$,
(b) $F(i,j) \le C\,(i \vee j)$, and there exists $r > 0$ such that $F(i,j) = 0$ if $i \wedge j > r$.
If $\rho > \frac{Cr}{\varepsilon}$, then sto-gel occurs and $T_g \le \frac{2}{\varepsilon\rho - Cr}$.
The next theorem is a device on the finite chains which can detect the gelation phenomena.

Theorem 5. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system, and suppose there is a subsequence $X^{n_k}$ which converges weakly to $X$. For any fixed $t > 0$, the following are equivalent.

I. There exists a nondecreasing function $\phi : \mathbb{N}^+ \to \mathbb{N}^+$ such that $\phi(n) \le n$, $\lim_{n\to\infty} \phi(n) = \infty$, and there exists $\varepsilon > 0$ such that
\[
\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \varepsilon\} > 0.
\]

II. There exists $\varepsilon > 0$ such that $P\{\|X_t\| \le \rho - \varepsilon\} > 0$.
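The role of the truncated density $\|\cdot\|_{\phi(n)}$ in Theorem 5 can be illustrated with the example from the Remark in Sect. 1, $\eta_n = \frac{\rho}{n} e_n$: the full density stays equal to $\rho$, while the $l_2$ norm and every fixed partial density tend to zero. The sketch below only evaluates these functionals on finitely supported sequences; the dictionary representation and the function names are assumptions made for illustration.

```python
import math


def density(eta, upto=None):
    """Partial density ||eta||_l = sum_{k <= l} k * eta(k).

    eta is a dict {cluster size k: eta(k)} with finitely many entries;
    with upto=None the full density ||eta|| is returned.
    """
    return sum(k * v for k, v in eta.items() if upto is None or k <= upto)


def l2_norm(eta):
    """The l_2 norm of a finitely supported sequence."""
    return math.sqrt(sum(v * v for v in eta.values()))


rho = 1.0
for n in (10, 100, 1000):
    eta_n = {n: rho / n}          # eta_n = (rho / n) e_n
    print(n, density(eta_n), l2_norm(eta_n), density(eta_n, upto=5))
```

The printed columns show a constant full density, an $l_2$ norm of order $1/n$, and a vanishing partial density, which is the kind of mass escaping to large cluster sizes that condition I detects on the finite chains.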
3. Tightness and Weak Limits (Theorems 1 and 2)

3.1. Preliminaries.

Lemma 3.1. (a) $E^\rho$ is a compact subset of $l_2$. (b) $E_n^\rho$ is a finite subset of $E^\rho$.

Proof. (a) Note that if $\eta \in E^\rho$ then $\sup_{k \ge 1} k\,\eta(k) \le \rho$, hence $\|\eta\|_{l_2}^2 = \sum_{k=1}^{\infty} \eta(k)^2 \le \rho^2 \sum_{k=1}^{\infty} \frac{1}{k^2} \le c\rho^2 < \infty$. Furthermore, if $\eta_n \in E^\rho$ and $\eta_n \to \eta$ in $l_2$, then for all $k \ge 1$, $\eta_n(k) \to \eta(k) \ge 0$. Thus for each $l \ge 1$, $\sum_{k=1}^{l} k\,\eta(k) = \lim_{n\to\infty} \sum_{k=1}^{l} k\,\eta_n(k) \le \rho$, and so $\eta \in E^\rho$. Thus $E^\rho$ is a closed subset of $l_2$.

As for compactness, let $A : l_2 \to l_2$ be the diagonal operator given on the standard orthonormal basis $e_k = (0, 0, \dots, 0, 1, 0, \dots)$ by $A e_k = \frac{\rho}{k} e_k$. It is clear that $A$ is a compact operator since it has a complete set of eigenvectors of multiplicity one and its eigenvalues accumulate only at 0. For $\eta \in E^\rho$, define $\xi$ by $\xi(k) = \frac{k}{\rho}\eta(k)$. Then $\eta = A\xi$ and $\xi$ has norm
\[
\|\xi\|_{l_2}^2 = \sum_{k=1}^{\infty} \Big(\frac{k}{\rho}\eta(k)\Big)^2 \le \rho^{-2} \sup_{k \ge 1} k\,\eta(k) \sum_{k=1}^{\infty} k\,\eta(k) \le 1.
\]
Thus, $E^\rho$ is a closed subset of a compact set, namely the image under $A$ of the unit ball, and so $E^\rho$ is compact.
(b) It is clear from the definitions.

Lemma 3.2. For any nonnegative function $K$ on $\mathbb{N}^+$ such that $\lim_{i\to\infty} \frac{K(i)}{i} = 0$, the function $\eta \mapsto \sum_{i=1}^{\infty} K(i)\,\eta(i)$ is continuous on $(E^\rho, \|\cdot\|_{l_2})$.

Proof. Suppose $\eta^n \in E^\rho$ and $\eta^n \to \eta$ in $l_2$. For any $\varepsilon > 0$, choose $N$ such that $\frac{K(i)}{i} < \frac{\varepsilon}{2\rho}$ for all $i > N$. Then
\[
\Big| \sum_{i=1}^{\infty} K(i)\{\eta(i) - \eta^n(i)\} \Big| \le \sum_{i \le N} K(i)\,|\eta(i) - \eta^n(i)| + \sum_{i > N} K(i)\,|\eta(i) - \eta^n(i)|.
\]
The first term is a finite sum and it goes to 0 as $n \to \infty$. The second term becomes
\[
\sum_{i > N} \frac{K(i)}{i}\,|i\,\eta(i) - i\,\eta^n(i)| < \frac{\varepsilon}{2\rho}\,(2\rho) = \varepsilon.
\]
The following lemma states the necessary and sufficient condition for tightness (we refer to [JM]).

Lemma 3.3. Let $(X^n)_{n=1}^{\infty}$ be a sequence of processes defined on their respective probability spaces $(\Omega_n, \mathcal{A}, P_n)$ with values in the complete separable metric space $H$. Then the sequence $\{\tilde P_n\}$ of laws of the processes $(X^n)$ forms a tight sequence if and only if [T1] and [T2] hold, where
[T1] For any $t$ in some dense subset $T$ of $\mathbb{R}^+$, the laws of the random variables $(X_t^n)$ form a tight sequence of laws in $H$.
[T2] For every $N > 0$, $\beta > 0$, $\varepsilon > 0$, there exists $\delta > 0$ such that
\[
\limsup_{n\to\infty} P_n\{\omega \in \Omega_n : W^N(X^n(\cdot,\omega), \delta) > \beta\} \le \varepsilon, \tag{5}
\]
where $W^N(X, \delta) := \inf_{\Pi_\delta} \max_{t_i \in \Pi} \sup_{t_i \le s < t < t_{i+1}} d(X(t), X(s))$.
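3.2. Proof of Theorem 1.

Proof. By Lemma 3.3, it suffices to show that [T1], [T2] hold.
(a) [T1] is immediate from the compactness of $E$.
(b) To prove [T2], first recall the parameter $\lambda_n(\eta)$ of the exponential waiting time distribution at $\eta \in E_n$,
\[
\begin{aligned}
\lambda_n(\eta) &= \frac{n}{2\rho} \sum_{i+j \le n} \Big\{ K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big) + F(i,j)\,\eta(i+j) \Big\} \\
&\le \frac{n}{2\rho} \Big( \sup_{\eta \in E} \sum_{i,j \ge 1} K(i,j)\,\eta(i)\,\eta(j) + \sup_{\eta \in E} \sum_{i,j \ge 1} F(i,j)\,\eta(i+j) \Big) \\
&= cn,
\end{aligned}
\]
for all $\eta \in E_n$, where
\[
c = \frac{1}{2\rho} \Big( \sup_{\eta \in E} \sum_{i,j \ge 1} K(i,j)\,\eta(i)\,\eta(j) + \sup_{\eta \in E} \sum_{i,j \ge 1} F(i,j)\,\eta(i+j) \Big). \tag{6}
\]

Let $\eta \in E_n$ and suppose $X_0^n = \eta$, and let $T_\eta = \inf\{t > 0 : X_t^n \ne \eta\}$. Then $T_\eta$ is an exponential r.v. with mean $\frac{1}{\lambda_n(\eta)}$. Define $J_m$ and $X_m$ by
\[
J_m = \inf\{s > J_{m-1} : X_s \ne X_{s-}\}, \qquad X_m = X^n(J_m),
\]
for all $m \ge 0$, with $J_0 = 0$. Then $J_m = T_{X_0} + T_{X_1} + \dots + T_{X_{m-1}}$, and it represents the holding time in the successive states visited by $X_t^n$. (See p. 18, [An].)

Now let $\{T_i\}$ be a sequence of i.i.d. exponential r.v.'s with mean $\frac{1}{cn}$, and let $S_m = T_1 + T_2 + \dots + T_m$; then $S_m$ is the holding time in successive states visited by the Poisson process $Y_t$ with parameter $cn$. We can verify easily that $J_m$ stochastically dominates $S_m$, i.e., for any $m$ and any $t \ge 0$, $P\{J_m > t\} \ge P\{S_m > t\}$.

Now, for given $N > 0$, $\varepsilon > 0$, $\beta > 0$, pick $\delta > 0$ such that $\delta < \frac{\beta}{\sqrt{5}\rho c}$ and $\delta l = N$ for some integer $l$. Let $\Pi$ be the partition $t_0 = 0 < t_1 = \delta < t_2 = 2\delta < \dots < t_l = N$, and let
\[
Z_i = \sup_{t_i \le s < t < t_{i+1}} d(X^n(t,\omega), X^n(s,\omega)).
\]
Then
\[
\begin{aligned}
P_n\{W^N(X^n(\cdot,\omega), \delta) > \beta\} &\le P_n\{\max_{0 \le i \le l-1} Z_i > \beta\} \\
&= P\{Z_0 > \beta \text{ or } Z_1 > \beta \ \cdots \text{ or } Z_{l-1} > \beta\} \\
&\le l \max_i P\{Z_i > \beta\}.
\end{aligned}
\]
Note that the jump size of $X_t^n$ is bounded by $\frac{\sqrt{5}\rho}{n}$, since
\[
d(X_{t-}^n, X_t^n) = \|X_{t-}^n - X_t^n\|_{l_2} \le \sup_{i+j \le n} \|\Delta^n_{ij}\|_{l_2} = \frac{\sqrt{5}\rho}{n}.
\]
Thus, in order for $Z_i > \beta$, we need more than $[\frac{\beta n}{\sqrt{5}\rho}]$ jumps on the interval $[i\delta, (i+1)\delta)$, since if not,
\[
Z_i \le \Big[\frac{\beta n}{\sqrt{5}\rho}\Big] \frac{\sqrt{5}\rho}{n} \le \beta.
\]
Therefore,
\[
\begin{aligned}
P_n\{W^N(X^n(\cdot,\omega), \delta) \ge \beta\} &\le l \max_i P\{Z_i > \beta\} \\
&\le l\, P_n\{J_{[\frac{\beta n}{\sqrt{5}\rho}]} \le \delta\} \\
&\le l\, P\{S_{[\frac{\beta n}{\sqrt{5}\rho}]} \le \delta\} \\
&= l\, P\Big\{Y_\delta \ge \Big[\frac{\beta n}{\sqrt{5}\rho}\Big]\Big\}.
\end{aligned}
\]
But the last term goes to 0 as $n$ goes to infinity, since $\delta c n < \frac{\beta n}{\sqrt{5}\rho}$ and the tail of a Poisson random variable goes to 0 exponentially fast. Therefore,
\[
\lim_n P_n\{W^N(X^n(\cdot,\omega), \delta) > \beta\} = 0, \quad \text{i.e.,} \quad \limsup_n P_n\{\omega \in \Omega_n : W^N(X^n(\cdot,\omega), \delta) > \beta\} = 0.
\]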
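The jump-size bound used in the proof can be checked directly: for $i \ne j$ the vector $e_{i+j} - e_i - e_j$ has three entries of modulus 1, while for $i = j$ it has entries $1$ and $-2$, so the largest $l_2$ norm is $\sqrt{5}$. The short sketch below evaluates $\|\Delta^n_{ij}\|_{l_2}$ over all admissible pairs; the function name and the sample values of $n$ and $\rho$ are illustrative assumptions.

```python
import math


def jump_norm(i, j, n, rho):
    """l_2 norm of Delta^n_ij = (rho / n) * (e_{i+j} - e_i - e_j)."""
    coords = {}
    for k, sign in ((i + j, 1.0), (i, -1.0), (j, -1.0)):
        coords[k] = coords.get(k, 0.0) + sign   # i == j gives a coefficient -2
    return (rho / n) * math.sqrt(sum(v * v for v in coords.values()))


n, rho = 12, 2.0
largest = max(
    jump_norm(i, j, n, rho)
    for i in range(1, n)
    for j in range(1, n)
    if i + j <= n
)
print(largest, math.sqrt(5) * rho / n)
```

The maximum is attained on the diagonal $i = j$, which is why the constant $\sqrt{5}$ rather than $\sqrt{3}$ appears in the choice of $\delta$.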
3.3. The weak limits.

Proof of Theorem 2. For any bounded function $f$ which has a continuous bounded directional derivative in the sense that $\nabla f := (\nabla_{e_1} f, \nabla_{e_2} f, \dots, \nabla_{e_n} f, \dots)$ is $l_2$ bounded and continuous, i.e., $\sup_{\eta \in E} \|(\nabla f)(\eta)\|_{l_2} < \infty$, define
\[
M_t^{n,f} := f(X_t^n) - f(X_0^n) - \int_0^t G_n f(X_s^n)\,ds. \tag{7}
\]
Then $M^{n,f}$ is a right continuous $P_n$ martingale and $M_0^{n,f} = 0$ for all $n$. Let $\tilde P_n$ be the law of $X^n$ in $D([0,T] : E)$, and set $\xi_t(\omega) = \omega(t)$ for each $\omega \in D([0,T] : E)$, where $D([0,T] : E)$ is the set of $E$-valued cadlag functions defined on $[0,T]$ equipped with the Skorohod topology. Set
\[
\Phi_t^f(\omega) = f(\xi_t(\omega)) - f(\xi_0(\omega)) - \frac{1}{2} \int_0^t \sum_{i,j \ge 1} \Big\{ \nabla f(\xi_s(\omega)) \cdot \Delta^0_{i,j}\, K(i,j)\, \xi_s(\omega)(i)\big(\xi_s(\omega)(j) - \delta_{ij}\,\hat\xi_s(\omega)\big) - \nabla f(\xi_s(\omega)) \cdot \Delta^0_{i,j}\, F(i,j)\, \xi_s(\omega)(i+j) \Big\}\,ds, \tag{8}
\]
where $\Delta^0_{i,j} = e_{i+j} - e_i - e_j$, and $\hat\xi_s(\omega) = \frac{\rho}{n}$ if $\omega(s) \in E_n$, and $0$ if $\omega(s) \notin \bigcup_{n=1}^{\infty} E_n$. Then for any $t$, $0 \le t < T$,
\[
\Phi_t^f(X^n) = M_t^{n,f} - \phi_n(t),
\]
where $\|\phi_n\|_\infty \to 0$ as $n \to \infty$.

On $D([0,T] : E)$, consider the right continuous canonical filtration $(\mathcal{D}_t^0)_{t \ge 0}$ generated by $\xi$. For any $s, t < T$ and for any $F \in \mathcal{D}_s = \bigcap_{s' > s} \mathcal{D}^0_{s'}$, let $\psi^f(\omega) = 1_F(\omega)[\Phi_t^f(\omega) - \Phi_s^f(\omega)]$. Then
\[
\tilde E_n\{\psi^f(\xi)\} \le \|\phi_n\|_\infty. \tag{9}
\]
Consider any convergent sequence $(\tilde P_{n_k})_{k=1}^{\infty}$ and let $\tilde P$ be its limit. For any $\varepsilon > 0$, let $A_\varepsilon := \{\omega : \exists s \text{ such that } d(\omega_{s-}, \omega_s) > \varepsilon\}$; then $\tilde P(A_\varepsilon) = 0$, since $A_\varepsilon \subset (\overline{A^c_{\varepsilon/2}})^c = G$ and since $G$ is open, $\tilde P(G) \le \liminf_{n_k \to \infty} \tilde P_{n_k}(G) = 0$. Therefore, the weak limit has continuous sample paths, i.e., $\mathrm{Supp}(\tilde P) \subset C([0,T] : E) \subset D([0,T] : E)$, where $C([0,T] : E)$ is the set of $E$-valued continuous functions. This implies that for any $\omega \in \mathrm{Supp}(\tilde P)$, convergence to $\omega$ in the Skorohod topology in $D([0,T] : E)$ reduces to uniform convergence. Using Lemma 3.2, under the hypothesis of this theorem, we can prove that, for $\tilde P$-almost all $\omega$, the mapping $\omega \mapsto \psi^f(\omega)$ is continuous at $\omega$. Also, the weak convergence and the inequality (9) imply $\tilde E(\psi^f) = \lim_{n_k} \tilde E_{n_k}(\psi^f) = 0$. Therefore, from (8),
\[
M_t^f := \Phi_t^f(X) = f(X_t) - f(X_0) - \frac{1}{2} \int_0^t \sum_{i,j \ge 1} \big( \nabla f(X_s) \cdot \Delta^0_{i,j}\, K(i,j)\, X_s(i) X_s(j) - \nabla f(X_s) \cdot \Delta^0_{i,j}\, F(i,j)\, X_s(i+j) \big)\,ds \tag{10}
\]
is a $\tilde P$ martingale with $M_0^f = 0$. Moreover, $\Phi_T^f(X^{n_k}) \to \Phi_T^f(X)$ weakly, since $\Phi_T^f$ is continuous (see [Bi]). That is, $M^{n_k,f} \to M^f$ weakly. Applying the same argument used in counting the number of jumps and bounding the jump size in Theorem 1, since $M^{n,f}$ has paths with finite variation, we can show that the quadratic variation $[M^{n,f}]_t = \sum_{s \le t} (\Delta M_s^{n,f})^2$ converges to 0 in probability as $n \to \infty$. Also, since $[M^{n,f}] \to [M^f]$ weakly, $[M^f]_T = 0$. (See p. 342 [JS].) Therefore, $M_t^f = 0$ for $t \le T$, with probability 1, i.e.,
\[
f(X_t) - f(X_0) - \frac{1}{2} \int_0^t \sum_{i,j \ge 1} \big( \nabla f(X_s) \cdot \Delta^0_{i,j}\, K(i,j)\, X_s(i) X_s(j) - \nabla f(X_s) \cdot \Delta^0_{i,j}\, F(i,j)\, X_s(i+j) \big)\,ds = 0. \tag{11}
\]
Now, for any $k \in \mathbb{N}^+$, choosing $f^k$, the $k$th coordinate projection on $l_2$, and restricting it to the domain $E$, we get for any $t \in [0,T]$,
\[
\begin{aligned}
X_t(k) &= X_0(k) + \frac{1}{2} \int_0^t \sum_{i,j \ge 1} \big( \nabla f^k(X_s) \cdot \Delta^0_{i,j}\, K(i,j)\, X_s(i) X_s(j) - \nabla f^k(X_s) \cdot \Delta^0_{i,j}\, F(i,j)\, X_s(i+j) \big)\,ds \\
&= X_0(k) + \int_0^t \Big[ \frac{1}{2} \sum_{i=1}^{k-1} \{K(k-i,i)\, X_s(k-i) X_s(i) - F(k-i,i)\, X_s(k)\} - \sum_{i=1}^{\infty} \{K(k,i)\, X_s(k) X_s(i) - F(k,i)\, X_s(k+i)\} \Big]\,ds.
\end{aligned}
\]
This is the integral version of the Smoluchowski equation.
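The last step above uses $\nabla f^k \cdot \Delta^0_{i,j} = \delta_{k,i+j} - \delta_{k,i} - \delta_{k,j}$ together with the symmetry of $K$ and $F$. As a sanity check, the following sketch compares the two forms of the integrand on a randomly chosen, finitely supported state; the kernels, the support size, and the function names are hypothetical choices made only for this check.

```python
import math
import random


def generator_side(k, X, K, F):
    """(1/2) sum_{i,j} (grad f^k . Delta^0_ij) [K(i,j) X(i) X(j) - F(i,j) X(i+j)]."""
    N = len(X) - 1                       # X[1..N] is the support; X[0] is unused
    total = 0.0
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            coeff = int(i + j == k) - int(i == k) - int(j == k)
            x_sum = X[i + j] if i + j <= N else 0.0
            total += 0.5 * coeff * (K(i, j) * X[i] * X[j] - F(i, j) * x_sum)
    return total


def smoluchowski_side(k, X, K, F):
    """Right-hand side of equation (1) evaluated at the state X."""
    N = len(X) - 1
    gain = 0.5 * sum(
        K(k - i, i) * X[k - i] * X[i] - F(k - i, i) * X[k] for i in range(1, k)
    )
    loss = sum(
        K(k, i) * X[k] * X[i] - F(k, i) * (X[k + i] if k + i <= N else 0.0)
        for i in range(1, N + 1)
    )
    return gain - loss


rng = random.Random(0)
state = [0.0] + [rng.random() for _ in range(8)]
coag = lambda i, j: (i * j) ** 0.75      # symmetric sample kernels
frag = lambda i, j: 1.0 / (i + j)
for k in range(1, 9):
    assert math.isclose(
        generator_side(k, state, coag, frag),
        smoluchowski_side(k, state, coag, frag),
        abs_tol=1e-12,
    )
```

The agreement relies on the symmetry of both kernels; with a non-symmetric sample kernel the two expressions would in general differ.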
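4. Gelation Phenomena

In this section we study the gelation phenomenon, which indicates the appearance of huge clusters in a finite time. In a deterministic model, the situation of dropping total density in a finite time is interpreted as gelation.

4.1. The device. Theorem 5 shows how the stochastic gelation phenomena are reflected on the sequence of finite Markov chains $X^n$.

Proof of Theorem 5. First note that since $X_t$ has continuous sample paths, for any $t$, $X_t^{n_k} \to X_t$ weakly. (See p. 131 [EK].)

(a) (I $\Rightarrow$ II). Let $A_l = \{\eta \in E : \|\eta\|_{\phi(l)} \le \rho - \varepsilon\}$. Since $\phi$ is nondecreasing, we have for any $l \ge 1$,
\[
\|\eta\|_{\phi(l)} = \sum_{i=1}^{\phi(l)} i\,\eta(i) \le \sum_{i=1}^{\phi(l+1)} i\,\eta(i) = \|\eta\|_{\phi(l+1)}.
\]
Hence, $A_{l+1} \subset A_l$. Thus $A_l$ is a decreasing sequence of closed sets. By weak convergence,
\[
\begin{aligned}
P\{X_t \in A_l\} &\ge \limsup_{k\to\infty} P\{X_t^{n_k} \in A_l\} \\
&\ge \limsup_{k\to\infty} P\{X_t^{n_k} \in A_{n_k}\} \\
&= \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \varepsilon\} \\
&= \delta > 0.
\end{aligned}
\]
But then,
\[
P\{\|X_t\| \le \rho - \varepsilon\} = P\Big\{X_t \in \bigcap_{l=1}^{\infty} A_l\Big\} = \lim_{l\to\infty} P\{X_t \in A_l\} \ge \delta > 0.
\]

(b) (II $\Rightarrow$ I). Let us show that the negation of I implies the negation of II. Thus we assume
($\neg$I) for every nondecreasing $\phi : \mathbb{N}^+ \to \mathbb{N}^+$ such that $\phi(n) \le n$, $\phi(n) \to \infty$ as $n \to \infty$, and for every $\varepsilon > 0$,
\[
\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \varepsilon\} = 0.
\]

Claim. ($\neg$I) implies the condition
\[
\forall \varepsilon > 0, \quad \lim_{l\to\infty} \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_l \le \rho - \varepsilon\} = 0.
\]

Proof of the claim. Suppose, to the contrary, that there exist $\varepsilon > 0$ and a subsequence $l_m$, $m \ge 1$, such that for every $m \ge 1$, $\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{l_m} \le \rho - \varepsilon\} \ge \delta > 0$. Thus, for each $m \ge 1$ there exists a subsequence $k(m,r)$, $r \ge 1$, such that for all $r \ge 1$,
\[
P\{\|X_t^{n_{k(m,r)}}\|_{l_m} \le \rho - \varepsilon\} \ge \frac{\delta}{2}.
\]
Choose $r(1)$ so that $n_{k(1,r)} \ge l_1$ for every $r \ge r(1)$, and inductively choose $r(m+1)$ so that both $n_{k(m,r(m))} \le n_{k(m+1,r(m+1))}$ and $n_{k(m+1,r)} \ge l_{m+1}$ for all $r \ge r(m+1)$. Define $\phi$ by
\[
\phi(j) = \begin{cases} 1 & \text{if } 1 \le j < n_{k(1,r(1))}, \\ l_m & \text{if } n_{k(m,r(m))} \le j < n_{k(m+1,r(m+1))}. \end{cases}
\]
Then $\phi$ is nondecreasing, $\phi(j) \le j$ for all $j \ge 1$, and $\lim_{j\to\infty} \phi(j) = +\infty$. Letting $k_m = k(m, r(m))$, we find for all $m \ge 1$,
\[
P\{\|X_t^{n_{k_m}}\|_{\phi(n_{k_m})} \le \rho - \varepsilon\} = P\{\|X_t^{n_{k_m}}\|_{l_m} \le \rho - \varepsilon\} \ge \frac{\delta}{2}.
\]
Hence,
\[
\limsup_{k\to\infty} P\{\|X_t^{n_k}\|_{\phi(n_k)} \le \rho - \varepsilon\} > 0,
\]
that is, condition I is true. This proves the claim.

Now assume the negation of I and let $B_l(\varepsilon) = \{\eta \in E : \|\eta\|_l > \rho - \varepsilon\}$. For each $\varepsilon > 0$, $B_l(\varepsilon)$, $l \ge 1$, is an increasing sequence of open sets. Since $B_l(\frac{\varepsilon}{2}) \subset B_l(\varepsilon)$, we have by weak convergence,
\[
\begin{aligned}
P\{X_t \in B_l(\varepsilon)\} &\ge P\{X_t \in B_l(\tfrac{\varepsilon}{2})\} \\
&\ge \limsup_{k\to\infty} P\{X_t^{n_k} \in B_l(\tfrac{\varepsilon}{2})\} \\
&= 1 - \liminf_{k\to\infty} P\{\|X_t^{n_k}\|_l < \rho - \tfrac{\varepsilon}{2}\} \\
&\ge 1 - \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_l \le \rho - \tfrac{\varepsilon}{2}\}.
\end{aligned}
\]
But then, by the claim, if $B(\varepsilon) = \bigcup_{l=1}^{\infty} B_l(\varepsilon)$,
\[
\begin{aligned}
P\{X_t \in B(\varepsilon)\} &= \lim_{l\to\infty} P\{X_t \in B_l(\varepsilon)\} \\
&\ge 1 - \lim_{l\to\infty} \limsup_{k\to\infty} P\{\|X_t^{n_k}\|_l \le \rho - \tfrac{\varepsilon}{2}\} \\
&= 1.
\end{aligned}
\]
Therefore,
\[
P\{\|X_t\| \le \rho - \varepsilon\} = P\{X_t \in B(\varepsilon)^c\} = 0.
\]
Since $\varepsilon > 0$ was arbitrary, this shows that the negation of condition II holds.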
The above theorem suggests that, without looking at the system of deterministic equations or the weak limits, we can determine whether stochastic gelation occurs just by looking at the system of processes defined on finite particle systems. This theorem also implies that the gelation phenomenon defined in the system of deterministic equations indicates the appearance of many but relatively small clusters in a finite time, as well as the appearance of a huge cluster. This may raise some questions about the correctness of the interpretation that non-preservation of density implies the appearance of a giant cluster.

4.2. Strong gelation vs. stochastic gelation. A more suitable definition of gelation as the appearance of a huge cluster in a finite time can be given by the strong gelation which we defined in Sect. 1. The next few results show that under certain conditions on the kernels $K(i,j)$ and $F(i,j)$, strong gelation implies gelation. More precisely, suppose the coagulation-fragmentation system satisfies the tightness conditions (TC1) and (TC2). If the fragmentation rates are not too large, then the sto-gel time $T_g$ of any weak limit point can be estimated in terms of the strong gelation time $t_{sg}$, provided the latter is finite.

Proposition 1. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. Suppose $F(i,j) \equiv 0$ for all $i, j$; then $T_g \le t_{sg}$, i.e., if strong gelation occurs, then stochastic gelation occurs.

Proof. Since there is no fragmentation, the first condition of Theorem 5 holds with $\phi(n) = [\alpha n]$.

Theorem 6. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be a stochastic coagulation-fragmentation system. If the kernels satisfy the tightness condition and if $\sum_{i+j=k} F(i,j) \le C$ for all $k$ and for some $C > 0$, then $T_g \le t_{sg}$, i.e., if strong gelation occurs then stochastic gelation occurs.
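Proof. Suppose $t_{sg} < \infty$; then there exist $\alpha > 0$ and $t \ge 0$ such that $\limsup_n P_n\{\tau_n^\alpha \le t\} > 0$. Let
\[
t_\alpha = \inf\{t > 0 : \limsup_n P_n\{\tau_n^\alpha \le t\} > 0\};
\]
then $\lim_{\alpha\to 0} t_\alpha = t_{sg}$. Choose a subsequence and denote it by $\{n\}$ again, such that
\[
\lim_n P_n\{t_\alpha \le \tau_n^\alpha \le t_\alpha + \tfrac{\alpha}{C}\} > 0.
\]
For any $\eta \in E_n$, let $l(\eta) = k$, where $k$ is the largest integer such that $\eta(k) > 0$, i.e., $l$ represents the size of the largest cluster of a state $\eta$. Let
\[
\begin{aligned}
G(\eta) &:= \{\xi \in E_n : l(\xi) < l(\eta)\}, \\
H(\eta) &:= \{\xi \in E_n : l(\xi) \ge l(\eta)\}, \\
T^n &:= \inf\{t > 0 : X^n_{\tau_n^\alpha + t} \in G(X^n_{\tau_n^\alpha})\}.
\end{aligned}
\]
Note that $T^n$ is independent of $\tau_n^\alpha$. In order for the largest cluster size to decrease, it is necessary that the largest cluster should fragment. Also, the time taken for the largest cluster size to decrease is longer than the time taken for one of the largest clusters to fragment, and the rate $\mu(\xi)$ of fragmentation for one cluster of size $l(\xi)$, $\xi \in H(\eta)$, with $l(\eta) \ge \alpha n$, satisfies
\[
\begin{aligned}
\mu(\xi) &:= \frac{n}{2\rho} \sum_{i+j=l(\xi)} F(i,j)\,\xi(i+j) \\
&= \frac{n}{2\rho} \sum_{i+j=l(\xi)} F(i,j)\,\xi(l(\xi)) \\
&\le \frac{1}{2\alpha} \sum_{i+j=l(\xi)} F(i,j) \\
&\le \frac{C}{2\alpha},
\end{aligned}
\]
where the first inequality is true since $l(\xi)\,\xi(l(\xi)) \le \rho$ implies $\xi(l(\xi)) \le \frac{\rho}{l(\xi)} \le \frac{\rho}{l(\eta)} \le \frac{\rho}{\alpha n}$. This shows that $T^n$ stochastically dominates the exponential r.v. $Y$ with mean $\frac{2\alpha}{C}$. Therefore,
\[
\begin{aligned}
P_n\{T^n > \tfrac{\alpha}{C},\ t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha}{C}\} &= P_n\{T^n > \tfrac{\alpha}{C}\} \cdot P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha}{C}\} \\
&\ge P\{Y > \tfrac{\alpha}{C}\} \cdot P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha}{C}\} \\
&= \frac{1}{\sqrt{e}}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha}{C}\}.
\end{aligned}
\]
For $\omega$ with $\tau_n^\alpha(\omega) < \infty$, since
\[
\|X^n_{\tau_n^\alpha}\|^n_{[\alpha n]} = \sum_{k \ge [\alpha n]} k\, X^n_{\tau_n^\alpha}(k) \ge \alpha n\, \frac{\rho}{n} = \alpha\rho,
\]
we have
\[
\begin{aligned}
P_n\{\|X_{t_\alpha + \frac{\alpha}{C}}\|^n_{[\alpha n]} \ge \alpha\rho\} &\ge P_n\{T^n > \tfrac{\alpha}{C},\ t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha}{C}\} \\
&\ge \frac{1}{\sqrt{e}}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha}{C}\}.
\end{aligned}
\]
Therefore,
\[
\lim_{n\to\infty} P_n\{\|X_{t_\alpha + \frac{\alpha}{C}}\|^n_{[\alpha n]} \ge \alpha\rho\} > 0.
\]
By Theorem 5, $T_g \le t_\alpha + \frac{\alpha}{C}$, and by letting $\alpha \to 0$, we get $T_g \le t_{sg}$.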
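Theorem 7. Let $[\{K(i,j)\}_{i,j=1}^{\infty}, \{F(i,j)\}_{i,j=1}^{\infty}, (X_t^{n,\rho}, E_n^\rho), n \ge 1]$ be an $r$-generalized Becker–Döring system. If the kernels satisfy the tightness conditions (TC1) and (TC2), then $T_g \le t_{sg}$, i.e., if strong gelation occurs, then sto-gel occurs.

Proof. Let $C = \sup_\eta \sum_{i,j} F(i,j)\,\eta(i+j) < \infty$. Suppose $t_{sg} < \infty$; then by the same argument as in Theorem 6, there exist $\alpha > 0$ and $t \ge 0$ such that $\limsup_n P_n\{\tau_n^\alpha \le t\} > 0$. Let
\[
t_\alpha = \inf\{t > 0 : \limsup_n P_n\{\tau_n^\alpha \le t\} > 0\},
\]
and choose a subsequence, denoted by $\{n\}$ again, such that
\[
\lim_n P_n\{t_\alpha \le \tau_n^\alpha \le t_\alpha + \tfrac{\alpha\rho}{Cr}\} > 0.
\]
Let $T_t^n$ be the process which counts the number of fragmentations from $\tau_n^\alpha$ to $\tau_n^\alpha + t$, i.e., let
\[
T_t^n = |\{\tau_n^\alpha < s \le \tau_n^\alpha + t : X_s^n = X_{s-}^n - \Delta^n_{ij}\}|.
\]
Since the fragmentation rate satisfies
\[
\mu_n(\eta) := \frac{n}{2\rho} \sum_{i,j} F(i,j)\,\eta(i+j) \le \frac{Cn}{2\rho}
\]
for all $\eta \in E$, if we let $Y_t^n$ be the Poisson process with parameter $\frac{Cn}{2\rho}$, then the same argument (with different notation) as in Theorem 1 shows that $T_t^n$ is stochastically dominated by $Y_t^n$. Therefore,
\[
\begin{aligned}
P_n\{T^n_{\frac{\alpha\rho}{Cr}} \le [\tfrac{\alpha n}{2r}],\ t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\}
&= P_n\{T^n_{\frac{\alpha\rho}{Cr}} \le [\tfrac{\alpha n}{2r}]\}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\} \\
&\ge P\{Y^n_{\frac{\alpha\rho}{Cr}} \le [\tfrac{\alpha n}{2r}]\}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\} \\
&= \sum_{k=0}^{[\frac{\alpha n}{2r}]} \frac{(\frac{\alpha n}{2r})^k}{k!}\, e^{-\frac{\alpha n}{2r}}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\} \\
&\ge \frac{1}{3}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\},
\end{aligned} \tag{12}
\]
for large $n$, where the last inequality is true by the following lemma.

Lemma 4.1. For any $\alpha$, $0 < \alpha \le 1$,
\[
\sum_{k=1}^{[\alpha n]} \frac{(\alpha n)^k}{k!}\, e^{-\alpha n} \to \frac{1}{2} \quad \text{as } n \to \infty.
\]

Proof. Let $\{X_i\}$ be a sequence of i.i.d. Poisson random variables with parameter 1, and let $S_n = \sum_{k=1}^{n} X_k$; then $S_n$ is Poisson with parameter $n$. Thus,
\[
\begin{aligned}
\sum_{k=1}^{[\alpha n]} \frac{(\alpha n)^k}{k!}\, e^{-\alpha n} &\sim \sum_{k=1}^{[\alpha n]} \frac{[\alpha n]^k}{k!}\, e^{-[\alpha n]} \\
&= P\{S_{[\alpha n]} \le [\alpha n]\} \\
&= P\{S_{[\alpha n]} - [\alpha n] \le 0\} \\
&= P\Big\{\frac{S_{[\alpha n]} - [\alpha n]}{[\alpha n]^{1/2}} \le 0\Big\} \\
&\to \frac{1}{2} \quad \text{as } n \to \infty,
\end{aligned}
\]
since $\frac{S_{[\alpha n]} - [\alpha n]}{[\alpha n]^{1/2}}$ goes to a standard normal distribution.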
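The limit in Lemma 4.1 is easy to observe numerically. The sketch below sums the Poisson weights in log space to avoid overflow for large $\alpha n$; the function name and the sample values of $\alpha$ and $n$ are illustrative assumptions, and the computation only illustrates the statement without replacing the central limit argument.

```python
import math


def poisson_partial_sum(alpha, n):
    """sum_{k=1}^{[alpha n]} (alpha n)^k e^{-alpha n} / k!, computed via logarithms."""
    lam = alpha * n
    top = math.floor(lam)
    log_lam = math.log(lam)
    return sum(
        math.exp(k * log_lam - lam - math.lgamma(k + 1)) for k in range(1, top + 1)
    )


for n in (100, 10_000, 100_000):
    print(n, poisson_partial_sum(0.5, n))
```

The printed values approach $\frac{1}{2}$ slowly, at a rate of order $(\alpha n)^{-1/2}$, which is consistent with the normal approximation used in the proof and with the bound $\frac{1}{3}$ being valid only for large $n$ in (12).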
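Now, since in each fragmentation the largest cluster size decreases by at most $r$, for any $\omega \in \{T^n_{\frac{\alpha\rho}{Cr}} \le [\frac{\alpha n}{2r}]\}$, the largest cluster size decreases by at most $r \cdot [\frac{\alpha n}{2r}] \le \frac{\alpha n}{2}$. So $l(X^n_{\tau_n^\alpha + \frac{\alpha\rho}{Cr}})(\omega) \ge \alpha n - \frac{\alpha n}{2} = \frac{\alpha n}{2}$, i.e.,
\[
\|X^n_{\tau_n^\alpha + \frac{\alpha\rho}{Cr}}(\omega)\|^n_{[\frac{\alpha n}{2}]} \ge \frac{\alpha n}{2}\,\frac{\rho}{n} = \frac{\alpha\rho}{2}.
\]
Therefore,
\[
\begin{aligned}
P_n\{\|X^n_{t_\alpha + \frac{\alpha\rho}{Cr}}\|^n_{[\frac{\alpha n}{2}]} \ge \tfrac{\alpha\rho}{2}\}
&\ge P_n\{T^n_{\frac{\alpha\rho}{Cr}} \le [\tfrac{\alpha n}{2r}],\ t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\} \\
&\ge \frac{1}{3}\, P_n\{t_\alpha \le \tau_n^\alpha < t_\alpha + \tfrac{\alpha\rho}{Cr}\},
\end{aligned}
\]
and the last inequality is true by (12). Thus
\[
\lim_{n\to\infty} P_n\{\|X^n_{t_\alpha + \frac{\alpha\rho}{Cr}}\|^n_{[\frac{\alpha n}{2}]} \ge \tfrac{\alpha\rho}{2}\} > 0.
\]
As in Theorem 6, letting $\alpha \to 0$, we get $T_g \le t_{sg}$.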
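5. Applications (Theorems 3 and 4)

5.1. Proof of Theorem 3. For $0 < \beta < \frac{2\alpha-1}{2}$, $0 < \delta \ll 1$, let $C = \rho(2^\beta - 1)(1 - \delta)$, and let ... For $m > 4$, define $\phi : \mathbb{N}^+ \to \mathbb{N}^+$ by
\[
\phi(n) = \Big[\frac{n}{2}\Big] \wedge \max\{2^l : n\varepsilon_l > m\rho\}. \tag{13}
\]
Since $\{\varepsilon_k\}$ is decreasing, $\phi(n)$ is increasing and $\nearrow \infty$. Fix $n$ large enough so that $\phi(n) \gg 1$. For $\eta \in E_n$, $I \subset \{1, 2, \dots, n\}$, let $r_I(\eta)$ be the rate of jump from $\eta$ including only the coordinates in $I$, i.e.,
\[
P\{X_{t+h} \in A^I_\eta \mid X_t = \eta\} = r_I(\eta)\,h + o(h),
\]
where $A^I_\eta := \{\eta + \Delta^n_{ij} : i, j \in I\}$. (Recall that $\Delta^n_{ij} = \frac{\rho}{n}\{e_{i+j} - (e_i + e_j)\}$.) Then
\[
r_I(\eta) = \frac{n}{2\rho} \sum_{i,j \in I,\ i+j \le n} K(i,j)\,\eta(i)\Big(\eta(j) - \frac{\rho}{n}\delta_{ij}\Big).
\]
For $k \in J := \{j \in \mathbb{N} : 2^j \le \phi(n)\}$, let
\[
K^k := \Big\{\eta \in E_n : \|\eta\|_1^{2^j} \le \sum_{i=0}^{j} 2^i \varepsilon_i, \text{ for all } j \le k\Big\}.
\]
Note that $K^k \subset K^{k-1}$ for all $k \in J - \{0\}$.

Lemma 5.1. If $\eta \in K^k$ for all $k \in J$, then $\|\eta\|^n_{[\frac{\phi(n)}{2}]} \ge \delta\rho$.

Proof. Let $k_0 = \max J$. Since $\|\eta\|_1^{2^{k_0}} \le \sum_{i=1}^{k_0} 2^i \varepsilon_i < \rho(1-\delta)$, $\|\eta\|^n_{2^{k_0}+1} \ge \delta\rho$. Therefore $\|\eta\|^n_{[\frac{\phi(n)}{2}]} \ge \delta\rho$, since $\frac{\phi(n)}{2} < 2^{k_0}$.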
T . Lemma 5.2. Let I0 = [2k−1 + 1, 2k ] N+ , then
rI0 (η) ≥
n 22+2α ρ
22kα 2k ,
(14)
for all η ∈ K k−1 \ K k , k ≥ 1. Proof. For any η ∈ K k−1 \ K k , since kηk21 have
k−1
≤
Pk−1 i=0
k
2i i , kηk21 ≥
Pk
i=0
2i i , we
k k k−1 . ω = kηk22k−1 +1 = kηk21 − kηk21 ≥ 2 k k .
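Also note that if $i, j \in I_0$, then $i + j \le n$, since $2^k \le \phi(n) \le [\frac n2]$. For $k \in J$, $k \ge 1$,
\begin{align*}
r_{I_0}(\eta) &= \frac{n}{2\rho}\sum_{i,j=2^{k-1}+1}^{2^k} K(i,j)\,\eta(i)\Big\{\eta(j) - \frac{\rho}{n}\delta_{ij}\Big\} \\
&\ge \frac{n}{2\rho}\sum_{i,j=2^{k-1}+1}^{2^k} i^\alpha j^\alpha\,\eta(i)\Big\{\eta(j) - \frac{\rho}{n}\delta_{ij}\Big\} \\
&= \frac{n}{2\rho}\Big\{\Big(\sum_{i=2^{k-1}+1}^{2^k} i^\alpha\eta(i)\Big)^2 - \sum_{i=2^{k-1}+1}^{2^k} i^{2\alpha}\eta(i)\frac{\rho}{n}\Big\} \\
&\ge \frac{n}{2\rho}\Big\{\Big(\sum_{i=2^{k-1}+1}^{2^k} i^\alpha\eta(i)\Big)^2 - \sum_{i=2^{k-1}+1}^{2^k} i^{\alpha}\eta(i)\,2^{k\alpha}\frac{\rho}{n}\Big\} \\
&= \frac{n}{2\rho}\Big(\sum_{i=2^{k-1}+1}^{2^k} i^\alpha\eta(i)\Big)\Big(\sum_{i=2^{k-1}+1}^{2^k} i^\alpha\eta(i) - \frac{2^{k\alpha}\rho}{n}\Big) \tag{15} \\
&\ge \frac{n}{2\rho}\,2^{(k-1)\alpha}\frac{\omega}{2^k}\Big(2^{(k-1)\alpha}\frac{\omega}{2^k} - 2^{k\alpha}\frac{\rho}{n}\Big) \\
&\ge \frac{n}{2\rho}\,2^{(k-1)\alpha}\frac{2^k\epsilon_k}{2^k}\Big(2^{(k-1)\alpha}\frac{2^k\epsilon_k}{2^k} - 2^{k\alpha}\frac{\rho}{n}\Big) \\
&= \frac{n}{2^{1+2\alpha}\rho}\,2^{2k\alpha}\epsilon_k\Big(\epsilon_k - \frac{2^\alpha\rho}{n}\Big) \\
&\ge \frac{n}{2^{2+2\alpha}\rho}\,2^{2k\alpha}\epsilon_k^2.
\end{align*}
The last inequality is true since $n\epsilon_k \ge n\epsilon_{k_0} > m\rho$ implies $\frac{2^\alpha\rho}{n} < \frac{2^\alpha\epsilon_k}{m} \le \frac{\epsilon_k}{2}$.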
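The bound of Lemma 5.2 can be checked numerically on a small configuration. The following is only a sketch under stated assumptions: the kernel is taken to be exactly $K(i,j) = (ij)^\alpha$ (the lemma only needs $K(i,j) \ge (ij)^\alpha$), the missing symbols in the statement are read as $\epsilon_k$, so that the claim is $r_{I_0}(\eta) \ge \frac{n}{2^{2+2\alpha}\rho}2^{2k\alpha}\epsilon_k^2$, and the cluster counts, `n`, `m` and the helper names are hypothetical choices made for illustration.

```python
def rate_on_block(counts, n, rho, alpha):
    """r_{I0}(eta) for the assumed kernel K(i, j) = (i*j)**alpha.

    counts[i] is the number of clusters of size i inside the block I0,
    so that eta(i) = counts[i] * rho / n.
    """
    unit = rho / n
    total = 0.0
    for i, ci in counts.items():
        for j, cj in counts.items():
            delta = unit if i == j else 0.0
            total += (i * j) ** alpha * (ci * unit) * (cj * unit - delta)
    return n / (2 * rho) * total


def lemma_bound(eps_k, k, n, rho, alpha):
    """Right-hand side of (14) under the reading described above."""
    return n / (2 ** (2 + 2 * alpha) * rho) * 2 ** (2 * k * alpha) * eps_k ** 2


# Hypothetical data: block I0 = {5, 6, 7, 8}, i.e. k = 3.
n, rho, alpha, k, m = 1000, 1.0, 0.75, 3, 4
counts = {5: 3, 6: 2, 7: 1, 8: 2}

# omega is the mass carried by the block; take eps_k as large as omega allows.
omega = sum(i * c for i, c in counts.items()) * rho / n
eps_k = omega / 2 ** k
assert n * eps_k > m * rho          # hypothesis n * eps_k > m * rho

r = rate_on_block(counts, n, rho, alpha)
bound = lemma_bound(eps_k, k, n, rho, alpha)
print(r, bound, r >= bound)
```

In this example the actual rate exceeds the bound by a comfortable factor, which is expected since several steps of (15) are far from sharp.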
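Proof of Theorem 3. Let $T_k$ be the stopping time and $K^k_l$ be a subset of $K^k$ defined by
\[ T_k := \inf\{t > 0 : X^n_t \in K^k\}, \qquad K^k_l := \Big\{\eta \in K^k : \|\eta\|_1^{2^{k+1}} \le \rho - l\,\frac{2^{k+1}\rho}{n}\Big\}, \]
then $T_{k-1} \le T_k$ a.s. and $K^k_l \subset K^{k+1}$ for all $l$ such that
\[ l - 1 \ge \alpha_k := \Big[\Big(\rho - \sum_{i=0}^{k+1} 2^i\epsilon_i\Big)\frac{n}{2^{k+1}\rho}\Big], \]
where $[\cdot]$ is the greatest integer function, since for such $l$,
\[ \|\eta\|_1^{2^{k+1}} \le \rho - \frac{l\,2^{k+1}\rho}{n} \le \rho - \Big(\rho - \sum_{i=0}^{k+1} 2^i\epsilon_i\Big)\frac{n}{2^{k+1}\rho}\,\frac{2^{k+1}\rho}{n} = \sum_{i=1}^{k+1} 2^i\epsilon_i. \]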
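For $l = 0, 1, 2, \dots$, let $\sigma^k_l := \inf\{t > 0 : X_t \in K^k_l\}$. Then $\sigma^{k-1}_0 = T_{k-1}$, $\sigma^{k-1}_l \ge \sigma^{k-1}_{l-1}$ and
\[ T_k = T_{k-1} + \sum_{i=0}^{\alpha_{k-1}-1}\big(\sigma^{k-1}_{i+1}\wedge T_k - \sigma^{k-1}_i\wedge T_k\big) + \big(T_k - \sigma^{k-1}_{\alpha_{k-1}}\wedge T_k\big). \]
Also,
\begin{align*}
E(T_k - T_{k-1}) &= E\Big\{\sum_{i=0}^{\alpha_{k-1}-1}\big(\sigma^{k-1}_{i+1}\wedge T_k - \sigma^{k-1}_i\wedge T_k\big)\Big\} + E\big(T_k - \sigma^{k-1}_{\alpha_{k-1}}\wedge T_k\big) \\
&\le \alpha_{k-1}\max_{0\le i\le\alpha_{k-1}-1} E\big(\sigma^{k-1}_{i+1}\wedge T_k - \sigma^{k-1}_i\wedge T_k\big) + E\big(T_k - \sigma^{k-1}_{\alpha_{k-1}}\wedge T_k\big).
\end{align*}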
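Claim. For $0 \le i \le \alpha_{k-1} - 1$,
\[ E\big(\sigma^{k-1}_{i+1}\wedge T_k - \sigma^{k-1}_i\wedge T_k\big) \le \frac{2^{2+2\alpha}\rho}{n\,2^{2k\alpha}\epsilon_k^2}. \tag{16} \]
Proof. Let
\[ T = \inf\big\{t > 0 : X_{\sigma^{k-1}_i + t} \in A^{I_0}_{X_{\sigma^{k-1}_i}} \cup K^k\big\}. \]
For any $\eta \in K^{k-1}_l \setminus K^{k-1}_{l+1}$, if $\xi \in A^{I_0}_\eta$, then $\xi \in K^k$ or $\exists\, m > l$ such that $\xi \in K^{k-1}_m$, since
\begin{align*}
\|\eta\|_1^{2^k} - \|\xi\|_1^{2^k} &= \|-\mathbf{1}_{ij}\|_1^{2^k} \quad (\text{for some } i, j \in I_0) \\
&= \Big\|\frac{\rho}{n}(e_i + e_j)\Big\|_1^{2^k} \quad (\text{since } i + j > 2^k) \\
&= i\,\frac{\rho}{n} + j\,\frac{\rho}{n} \\
&\ge 2^k\frac{\rho}{n}.
\end{align*}
That is, if $X_{\sigma^{k-1}_i\wedge T_k} \in K^{k-1}_l \setminus K^{k-1}_{l+1}$ for some $l$, then $X_{\sigma^{k-1}_i} \in K^{k-1}_l \setminus K^k$ and $X_{\sigma^{k-1}_i + T} \in K^{k-1}_m \setminus K^k$ for some $m > l$, or $X_{\sigma^{k-1}_i + T} \in K^k$. Thus
\[ \sigma^{k-1}_i\wedge T_k + T \ge \sigma^{k-1}_{i+1}\wedge T_k \quad \text{a.s.}, \]
hence
\[ E\big(\sigma^{k-1}_{i+1}\wedge T_k - \sigma^{k-1}_i\wedge T_k\big) \le E(T). \]
Now let $\tau := \inf\{t > 0 : X_t \in A^{I_0}_{X_{t-}}\}$, then
\begin{align*}
E(T) &= E\{T \mid X_{\sigma^{k-1}_i}\} \\
&= E\{\tau \mid X_0 \in K^{k-1}\setminus K^k\} = E_{X_{\sigma^{k-1}_i\wedge T_k}}(\tau) \\
&\le \max_{\eta\in K^{k-1}} E_\eta(\tau) \\
&\le \max_{\eta\in K^{k-1}} \frac{1}{r_{I_0}(\eta)} \\
&\le \frac{2^{2+2\alpha}\rho}{n\,2^{2k\alpha}\epsilon_k^2},
\end{align*}
and the Claim is done.

Similarly,
\[ E\big(T_k - \sigma^{k-1}_{\alpha_{k-1}}\wedge T_k\big) \le \frac{2^{2+2\alpha}\rho}{n\,2^{2k\alpha}\epsilon_k^2}. \tag{17} \]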
Let γ = 2α − 1 − 2β > 0, then for k ≥ 1 with constants C1 , C2 which may vary in each expression, E(Tk − Tk−1 ) ≤ (αk−1 + 1) ≤ αk−1 ≤ (ρ −
22+2α ρ n22kα 2k
C1 22+2α ρ + 2 2kα n2 k m2(γ+β)k k X
2 i i )
i=0
n 22+2α ρ C1 + 2 k 2kα 2 ρ n2 k m2(γ+β)k
k X 1 22+2α ρ C1 1 = {ρ − C }) + 2 βi k 2kα 2 2 ρ 2 k m2(γ+β)k
(18)
i=0
22+2α+β ρ(1 − δ) C1 C2 δ = + + γk 2 (k+1)β γk (γ+β)k C 2 2 m2 2 22+2α+β C1 C2 δ ≤ + + . ρ(1 − δ)(2β − 1)2 2(γ+β)k m2(γ+β)k 2γk Therefore, for 0 < β < ∞ X
2α−1 2 ,
E(Tk − Tk−1 ) ≤
k=1
∞ X 22+2α+β C1 1 + C2 δ + ρ(1 − δ)(2β − 1)2 2(γ+β)k m k=1
2+2α+β
=
C1 2 + C2 δ + ρ(1 − δ)(2β − 1)2 (2γ+β − 1) m
Now, by a slight modification of (15), we get ρ n K(1, 1)η(1){η(1) − } 2ρ n n 2 C + C1 , ≥ 2ρ
γI 0 (η) =
where I 0 = {1}. Thus, as in (18), E(T0 ) ≤
2β C1 + C2 δ. + − 1)2 m
ρ(2β
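Recalling that $k_0$ is the largest integer such that $2^{k_0} \le \phi(n)$, and setting $T_{-1} = 0$, we have
\begin{align*}
E T_{k_0} &= E T_0 + E\Big(\sum_{k=1}^{k_0}(T_k - T_{k-1})\Big) \\
&\le \inf_{0<\beta<\frac{2\alpha-1}{2}} \frac{2^\beta}{\rho(2^\beta-1)^2} + \frac{C_1}{m} + C_2\delta + \inf_{0<\beta<\frac{2\alpha-1}{2}} \frac{2^{2+2\alpha+\beta}}{\rho(1-\delta)(2^\beta-1)^2(2^{\gamma+\beta}-1)} \\
&=: q(\alpha, \epsilon, \rho, m, \delta).
\end{align*}
For any $\epsilon' > 0$ and for large $n$,
\[ P_n\{T_{k_0} \le q(\alpha,\epsilon,\rho,m,\delta) + \epsilon'\} \ge \frac{\epsilon'}{E(T_{k_0}) + \epsilon'} =: \delta_0. \]
By Lemma 5.1,
\[ P\{\|X^n_t\|_{[\phi(n)/2]} \ge \delta\rho\} \ge \delta_0, \]
for all $t \ge q(\alpha,\epsilon,\rho,m,\delta) + \epsilon'$. Hence, by Theorem 5, $T_g \le q(\alpha,\epsilon,\rho,m,\delta)$. In (15), we can get a sharper bound
\[ \frac{n}{2^{1+2\alpha}\rho}\Big(1 - \frac{2}{m}\Big)2^{2k\alpha}\epsilon_k^2. \]
Letting $m \to \infty$, $\delta \to 0$, finally, we get $T_g \le \frac{C(\alpha)}{\rho}$, where
\[ C(\alpha) = \inf_{0<\beta<\frac{2\alpha-1}{2}} \frac{2^\beta}{(2^\beta-1)^2} + \inf_{0<\beta<\frac{2\alpha-1}{2}} \frac{2^{1+2\alpha+\beta}}{(2^\beta-1)^2(2^{2\alpha-1-\beta}-1)}. \]

Proof of Corollary 1. It is clear from Theorem 2 and Theorem 3.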
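To get a feeling for the size of the constant in $T_g \le C(\alpha)/\rho$, one can evaluate the two infima on a grid. The sketch below assumes the reading $C(\alpha) = \inf_\beta \frac{2^\beta}{(2^\beta-1)^2} + \inf_\beta \frac{2^{1+2\alpha+\beta}}{(2^\beta-1)^2(2^{2\alpha-1-\beta}-1)}$ with both infima over $0 < \beta < \frac{2\alpha-1}{2}$; the function name `C_alpha` and the grid size are hypothetical, and a grid search only yields an upper approximation of each infimum.

```python
def C_alpha(alpha, grid=2000):
    """Grid approximation of C(alpha); both infima run over 0 < beta < (2*alpha - 1)/2."""
    if alpha <= 0.5:
        raise ValueError("the interval for beta is empty unless alpha > 1/2")
    top = (2 * alpha - 1) / 2
    # Interior grid points only, since the interval for beta is open.
    betas = [top * j / grid for j in range(1, grid)]
    first = min(2 ** b / (2 ** b - 1) ** 2 for b in betas)
    second = min(
        2 ** (1 + 2 * alpha + b)
        / ((2 ** b - 1) ** 2 * (2 ** (2 * alpha - 1 - b) - 1))
        for b in betas
    )
    return first + second


for a in (0.6, 0.8, 1.0):
    print(a, C_alpha(a))
```

The values grow quickly as $\alpha$ decreases towards $1/2$, in line with the fact that the admissible interval for $\beta$ shrinks to a point there.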
5.2. Proof of Theorem 4. For any cadlag sample path $\omega$ of $X^n_t$, let
\[ A_n(t,\omega) := \{s \le t : \omega(s) = \omega(s-) + \mathbf{1}^n_{ij} \text{ for some } i, j\}, \]
\[ B_n(t,\omega) := \{s \le t : \omega(s) = \omega(s-) - \mathbf{1}^n_{ij} \text{ for some } i, j\}. \]
That is, $A_n(t)$ is the set of times that coagulation occurs, and $B_n(t)$ is the set of times that fragmentation occurs. Let $T^n_t(\omega) = |A_n(t,\omega)| - |B_n(t,\omega)|$, where $|\cdot|$ is the cardinality function.

Lemma 5.3. Suppose $|B_n(t,\omega)| = 0$, $\omega(0) = \rho e_1$, then $|A_n(t,\omega)| = n-1$ iff $\omega(t) = \frac{\rho}{n}e_n$.
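Before turning to the proof, the counting statement can be illustrated by a small simulation. It is a sketch only: clusters are stored as a list of sizes (so $\omega(0) = \rho e_1$ corresponds to $n$ clusters of size 1), the pair to coagulate and the cluster to split are picked uniformly at random rather than according to the kernels $K$ and $F$, and the name `simulate` and its parameters are hypothetical. The point is the bookkeeping: the number of clusters always equals $n$ minus (coagulations minus fragmentations), so the net count $n-1$ is reached exactly when a single cluster of size $n$ remains.

```python
import random


def simulate(n, max_steps, frag_prob, seed):
    """Run random coagulation/fragmentation steps starting from n monomers.

    Returns the final list of cluster sizes and the numbers of
    coagulation and fragmentation steps performed.
    """
    rng = random.Random(seed)
    clusters = [1] * n
    coag = frag = 0
    for _ in range(max_steps):
        splittable = [idx for idx, size in enumerate(clusters) if size > 1]
        if splittable and rng.random() < frag_prob:
            # Fragmentation: one cluster breaks into two smaller ones.
            idx = rng.choice(splittable)
            size = clusters.pop(idx)
            left = rng.randint(1, size - 1)
            clusters += [left, size - left]
            frag += 1
        elif len(clusters) > 1:
            # Coagulation: two distinct clusters merge.
            i, j = rng.sample(range(len(clusters)), 2)
            merged = clusters[i] + clusters[j]
            clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
            clusters.append(merged)
            coag += 1
        # Each coagulation removes one cluster, each fragmentation adds one.
        assert len(clusters) == n - (coag - frag)
        if coag - frag == n - 1:
            break
    return clusters, coag, frag


print(simulate(8, 100, 0.0, seed=1))     # pure coagulation
print(simulate(8, 10000, 0.3, seed=2))   # with fragmentation
```

With `frag_prob = 0.0` the run always stops after exactly $n-1$ coagulations, which is the situation of Lemma 5.3.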
Proof. Since coagulation is the reverse process of fragmentation, we may assume $|A_n(t,\omega)| = 0$, $\omega(0) = \frac{\rho}{n}e_n$ and prove that $|B_n(t,\omega)| = n-1$ iff $\omega(t) = \rho e_1$. This can be done by induction on $n$. (a) The case $n = 2$: $\omega(0) = \frac{\rho}{2}e_2$, so $\omega(t) = \rho e_1$ iff $|B_n(t,\omega)| = 1$. (b) Assume true for all $k < n$; then for any cadlag $\omega$ of $X^n_t$, with $\omega(0) = \frac{\rho}{n}e_n$ and with $B_n(t,\omega) = \{t_1, t_2, \dots, t_r\}$, $t_i < t_{i+1}$, $r \ge 1$, $\omega(t_1) = \frac{\rho}{n}(e_{k_1} + e_{k_2})$ for some $k_1, k_2 > 0$ such that $k_1 + k_2 = n$. Since fragmentation occurs independently from other clusters, we may think that $\omega$ consists of two states $\omega_1, \omega_2$, such that $\omega_1(0) = \frac{\rho}{n}e_{k_1} \in E^{k_1\rho/n}_{k_1}$, $\omega_2(0) = \frac{\rho}{n}e_{k_2} \in E^{k_2\rho/n}_{k_2}$, $\omega_1(t) = \frac{k_1\rho}{n}e_1$, and $\omega_2(t) = \frac{k_2\rho}{n}e_1$. Therefore, by the induction hypothesis,
\begin{align*}
|B_n(t,\omega)| &= |B_{k_1}(t,\omega_1)| + |B_{k_2}(t,\omega_2)| + 1 \\
&= (k_1 - 1) + (k_2 - 1) + 1 \\
&= n - 1,
\end{align*}
iff $\omega(t) = \rho e_1$.

Lemma 5.4. If $T^n_t(\omega) = n - 1$ for some $t$ and cadlag path $\omega$ of $X^n_t$, with $\omega(0) = \rho e_1$, then $\omega(t) = \frac{\rho}{n}e_n$.

Proof. If $T^n_t = n - 1$, then $|A_n(t,\omega)| = (n-1) + k$, and $|B_n(t)| = k$, for some $k \ge 0$. The proof is by induction on $k$, and the case $k = 0$ is a consequence of Lemma 5.3. Thus suppose the assertion is true for all $k < m$. Let $A_n(t,\omega) = \{s_1, s_2, \dots, s_{n-1+m}\}$ and $B_n(t,\omega) = \{t_1, t_2, \dots, t_m\}$, where the times listed above are distinct and in increasing order. There exist $r < n-1+m$ and $i, j \ge 1$ such that $s_r < t_1 < s_{r+1}$ and $\omega(t_1) - \omega(s_r) = -\mathbf{1}_{ij}$. Now, there exists a path $\omega'$ such that $B_n(t,\omega') = \emptyset$, $A_n(t,\omega') = \{u_1, u_2, \dots, u_p\}$, $\omega'(0) = \rho e_1$, $\omega'(u_p) = \frac{\rho}{n}e_n$, and $\omega'(u_l) = \omega(t_1)$, for some $1 < l < p$. In case this is not clear, begin with configuration $\omega(t_1)$ and, step by step, fragment its clusters until configuration $\rho e_1$ is reached. Reversing the steps, a sequence of configurations
\[ \rho e_1 = \eta_0 \to \eta_1 \to \cdots \to \eta_l = \omega(t_1) \]
is obtained, where the arrow indicates $\eta_{i+1}$ is obtained from $\eta_i$ by a single coagulation of two clusters. Next, step by step, coagulate the clusters of configuration $\omega(s_r)$ until configuration $\frac{\rho}{n}e_n$ is reached, yielding a sequence
\[ \omega(t_1) \to \omega(s_r) = \eta_{l+1} \to \eta_{l+2} \to \cdots \to \eta_p = \frac{\rho}{n}e_n. \]
Letting $u_i = \frac{i}{p}t$ and setting $\omega'(u_i) = \eta_i$ does the trick. (We use a similar device in what follows.) By Lemma 5.3, $p = n - 1$. Now, define a path $\omega''$ through the sequence of configurations
\[ \omega(s_1) \to \cdots \to \omega(s_r) = \omega'(u_{l+1}) \to \omega'(u_{l+2}) \to \cdots \to \omega'(u_{n-1}). \]
This sequence of configurations has length $n - 1$; hence, $r = l + 1$, by Lemma 5.3 again. Finally, define a new path $\omega'''$ by a sequence of configurations of which the first $l = r - 1$ steps are those of $\omega'$, leading to configuration $\omega'(u_l) = \omega(t_1)$, and whose remaining steps are exactly those of the original path $\omega$, coagulation and fragmentation steps included.

Evidently, $|B_n(t,\omega''')| = m - 1$ since the fragmentation step from $\omega(s_r)$ to $\omega(t_1)$ has been deleted from $\omega$. Also, $|A_n(t,\omega''')| = (n-1) + (m-1)$ since there are only $r - 1$ coagulation steps in $\omega'$ up to the appearance of configuration $\omega(t_1)$. By the inductive hypothesis, $\omega(t) = \omega'''(t) = \frac{\rho}{n}e_n$.

Proof of Theorem 4. By the condition of the theorem, for any $\rho > \rho_c$, $\exists\, 0 < \alpha < \frac12$ such that $\rho(1-\alpha) > \rho_c$. Assume that $n$ is sufficiently large so that $\alpha n \gg 1$. For $\eta \in E_n$ such that $\eta(i) = 0$ for all $i \ge \alpha n$, the coagulation rate
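\begin{align*}
\lambda_n(\eta) &:= \frac{n}{2\rho}\sum_{i+j\le n} K(i,j)\,\eta(i)\Big\{\eta(j) - \frac{\rho}{n}\delta_{ij}\Big\} \\
&\ge \frac{n}{2\rho}\sum_{j=1}^{[\alpha n]}\sum_{i=1}^{[\alpha n]} ij\,\eta(i)\Big\{\eta(j) - \frac{\rho}{n}\delta_{ij}\Big\} \\
&= \frac{n}{2\rho}\Big\{\sum_{j=1}^{[\alpha n]}\sum_{i=1}^{[\alpha n]} ij\,\eta(i)\eta(j) - \sum_{j=1}^{[\alpha n]}\sum_{i=1}^{[\alpha n]} ij\,\eta(i)\delta_{ij}\frac{\rho}{n}\Big\} \\
&= \frac{n}{2\rho}\Big\{\rho^2 - \sum_{i=1}^{[\alpha n]} i^2\eta(i)\frac{\rho}{n}\Big\} \\
&\ge \frac{n}{2\rho}(\rho^2 - \alpha\rho^2) = \frac{\rho n}{2}(1-\alpha),
\end{align*}
where the last inequality holds since
\[ \sum_{i=1}^{[\alpha n]} i^2\eta(i)\frac{\rho}{n} \le \sum_{i=1}^{[\alpha n]} i\,\frac{i}{n}\,\eta(i)\rho \le \alpha\rho\sum_{i=1}^{[\alpha n]} i\,\eta(i) = \alpha\rho^2. \]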
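The lower bound $\lambda_n(\eta) \ge \frac{\rho n}{2}(1-\alpha)$ can be checked directly on an example. The sketch below assumes the product kernel $K(i,j) = ij$ (the borderline case of the estimate $K(i,j) \ge ij$ used above), stores a configuration as a hypothetical dictionary of cluster counts so that $\eta(i) = \frac{\rho}{n}\,\text{counts}[i]$, and uses made-up values of $n$, $\rho$ and $\alpha$.

```python
def coagulation_rate(counts, n, rho):
    """lambda_n(eta) for the assumed kernel K(i, j) = i*j.

    counts[i] is the number of clusters of size i; eta(i) = counts[i]*rho/n.
    """
    unit = rho / n
    total = 0.0
    for i, ci in counts.items():
        for j, cj in counts.items():
            if i + j <= n:
                delta = unit if i == j else 0.0
                total += i * j * (ci * unit) * (cj * unit - delta)
    return n / (2 * rho) * total


# Hypothetical configuration: total mass n = 100, all clusters smaller than alpha*n.
n, rho, alpha = 100, 1.0, 0.3
counts = {1: 10, 5: 6, 10: 3, 15: 2}
assert sum(i * c for i, c in counts.items()) == n
assert max(counts) < alpha * n

lam = coagulation_rate(counts, n, rho)
lower = rho * n / 2 * (1 - alpha)
print(lam, lower)
```

Here the exact rate equals $\frac{n}{2\rho}\{\rho^2 - \frac{\rho}{n}\sum_i i^2\eta(i)\}$, and the gap to the bound reflects how far the cluster sizes stay below $\alpha n$.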
Also, the fragmentation rate
\[ \mu_n(\eta) := \frac{n}{2\rho}\sum_{i+j\le n} F(i,j)\,\eta(i+j) \le \frac{n}{2\rho}\cdot\rho\beta = \frac{\rho_c n}{2}. \]
Therefore,
\[ \mu_n(\eta) \le \frac{\rho_c n}{2} < \frac{\rho n}{2}(1-\alpha) \le \lambda_n(\eta). \]
Let $\lambda_0 = \frac{\rho(1-\alpha)}{2}$, $\mu_0 = \frac{\rho_c}{2}$, then $\lambda_0 > \mu_0$. Let $Y^n_t$ be the birth and death process on $\{-n, -(n-1), \dots, 0, 1, 2, \dots, n\}$ with reflecting state $\{-n\}$ and absorbing state $\{n\}$ and transition probability
\[ P_t(i+1 \mid i) = \lambda_0 n t + o(t) \quad (-n \le i < n), \qquad P_t(i-1 \mid i) = \mu_0 n t + o(t) \quad (-n < i < n), \qquad P_t(n-1 \mid n) = 0. \]
Then $Y^n_t$ represents "the number of coagulation steps − the number of fragmentation steps" up to time $t$ of a process with coagulation rate $\frac{\rho(1-\alpha)n}{2}$ and fragmentation rate $\frac{\rho_c n}{2}$. Let $T$ be the first hitting time of absorbing state $n$ from the initial state 0, i.e., let
\[ T := \inf\{t > 0 : Y^n_t = n \mid Y^n_0 = 0\}, \]
then
\[ E T \le \frac{n}{\lambda_0 n - \mu_0 n} = \frac{1}{\lambda_0 - \mu_0} < \infty \]
(see sec. 4.7 in [KT]). Recall that
\[ \tau_n^\alpha := \inf\{t : X^n_t(k) > 0 \text{ for some } k \ge \alpha n\}, \]
the first hitting time of the cluster size greater than or equal to $\alpha n$.
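The estimate $ET \le \frac{1}{\lambda_0 - \mu_0}$ for the birth and death process can be reproduced with a short computation. The sketch below does not follow [KT]; it uses the standard first-step recursion for $h_i$, the expected time to move from state $i$ to $i+1$: $h_{-n} = \frac{1}{\lambda_0 n}$ at the reflecting state and $h_i = \frac{1}{\lambda_0 n} + \frac{\mu_0}{\lambda_0}h_{i-1}$ otherwise, so that $ET = \sum_{i=0}^{n-1} h_i$. The function name and the numerical values of $\lambda_0$, $\mu_0$ are hypothetical.

```python
def expected_hitting_time(n, lam0, mu0):
    """E[T] for the chain on {-n, ..., n} with up-rate lam0*n and down-rate mu0*n,
    reflecting at -n, started at 0 and stopped on hitting n."""
    up, down = lam0 * n, mu0 * n
    h = 1.0 / up                 # from -n only upward jumps are possible
    total = 0.0
    for i in range(-n, n):
        if i > -n:
            # First-step analysis: h_i = 1/up + (down/up) * h_{i-1}.
            h = 1.0 / up + (down / up) * h
        if i >= 0:
            total += h           # T is the sum of the passage times i -> i+1, i = 0..n-1
    return total


lam0, mu0 = 0.9, 0.5             # e.g. rho = 2, alpha = 0.1, rho_c = 1
for n in (1, 5, 50):
    print(n, expected_hitting_time(n, lam0, mu0), 1 / (lam0 - mu0))
```

Since each $h_i$ is a partial geometric sum bounded by $\frac{1}{n(\lambda_0-\mu_0)}$, the total stays below $\frac{1}{\lambda_0-\mu_0}$ for every $n$ and approaches it as $n$ grows.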
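Since $X^n_t$ hits a $k$-cluster for some $k \ge \alpha n$ before "the number of coagulation − the number of fragmentation" $= n$, and by stochastic dominance, for any $\epsilon' > 0$,
\begin{align*}
P_n\Big\{\tau_n^\alpha \le \frac{1}{\lambda_0-\mu_0} + \epsilon'\Big\} &\ge P\Big\{Y_{\frac{1}{\lambda_0-\mu_0}+\epsilon'} = n\Big\} \\
&= P\Big\{T \le \frac{1}{\lambda_0-\mu_0} + \epsilon'\Big\} \\
&\ge P\{T \le E(T) + \epsilon'\} \\
&\ge \frac{\epsilon'}{E(T) + \epsilon'},
\end{align*}
for all $n$. Therefore,
\[ \liminf_n P\Big\{\tau_n^\alpha \le \frac{1}{\lambda_0-\mu_0} + \epsilon'\Big\} \ge \frac{\epsilon'}{E(T)+\epsilon'}, \]
and strong gelation occurs in time $\frac{1}{\lambda_0-\mu_0} + \epsilon'$, for any $\epsilon' > 0$. Letting $\epsilon', \alpha \to 0$, we get $t_{sg} \le \frac{2}{\rho-\rho_c}$.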
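Proof of Corollary 2. (TC1) is satisfied since, for any $\eta \in E_n$,
\begin{align*}
\sum_{i,j} K(i,j)\,\eta(i)\eta(j) &\le M\sum_{i,j} ij\,\eta(i)\eta(j) \\
&\le M\Big(\sum_i i\,\eta(i)\Big)\Big(\sum_j j\,\eta(j)\Big) \\
&\le M\rho^2 < \infty.
\end{align*}
Also, (TC2) is satisfied since
\begin{align*}
\sum_{i,j\ge1} F(i,j)\,\eta(i+j) &\le \sum_{i,j\ge1} \frac{C}{i+j}\,\eta(i+j) \\
&\le \sum_{k=2}^{\infty}\sum_{i+j=k} \frac{C}{i+j}\,\eta(i+j) \\
&= \sum_{k=2}^{\infty} (k-1)\frac{C}{k}\,\eta(k) \\
&\le \frac{C}{2}\sum_{k=2}^{\infty} k\,\eta(k) \\
&\le \frac{\rho C}{2}.
\end{align*}
Moreover,
\[ \sum_{i+j=k} F(i,j) \le \sum_{i+j=k} \frac{C}{i+j} = (k-1)\cdot\frac{C}{k} \le C. \]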
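The chain of estimates for (TC2) can be verified with exact rational arithmetic. The sketch below assumes the extremal kernel $F(i,j) = \frac{C}{i+j}$ and a hypothetical configuration $\eta$ given as a dictionary $k \mapsto \eta(k)$; it compares the double sum over $i, j \ge 1$ with its regrouped form $\sum_k (k-1)\frac{C}{k}\eta(k)$ and with the bound $\frac{\rho C}{2}$.

```python
from fractions import Fraction


def fragmentation_sum(eta, C):
    """Sum over i, j >= 1 of C/(i+j) * eta(i+j), for the assumed kernel F = C/(i+j)."""
    total = Fraction(0)
    kmax = max(eta)
    for i in range(1, kmax):
        for j in range(1, kmax - i + 1):
            total += Fraction(C) / (i + j) * eta.get(i + j, 0)
    return total


def regrouped_sum(eta, C):
    """The same quantity grouped by k = i + j: each k >= 2 has k - 1 ordered pairs."""
    return sum(Fraction(C) * (k - 1) / k * v for k, v in eta.items() if k >= 2)


# Hypothetical configuration with total mass rho = sum_k k * eta(k) = 1.
eta = {1: Fraction(1, 4), 2: Fraction(1, 8), 4: Fraction(1, 8)}
rho = sum(k * v for k, v in eta.items())
C = 3

direct = fragmentation_sum(eta, C)
print(direct, regrouped_sum(eta, C), rho * C / 2)
```

Using `Fraction` avoids any rounding question, so the equality of the two sums and the inequality against $\frac{\rho C}{2}$ are exact statements about this example.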
Since
\[ \rho\beta = \sup_\eta \sum_{i,j} F(i,j)\,\eta(i+j) \le \frac{\rho C}{2}, \]
$\beta \le \frac{C}{2}$ and $\rho_c = \beta \le \frac{C}{2}$. By Theorem 4 and Theorem 6,
\[ T_g \le \frac{2}{\rho - \rho_c} \le \frac{2}{\rho - \frac{C}{2}} = \frac{4}{2\rho - C}. \]

Proof of Corollary 3.
\[ \sum_{i,j\ge1} F(i,j)\,\eta(i+j) \le \sum_{i=1}^{r}\sum_{j=1}^{r} C(i+j)\,\eta(i+j) \le Cr\rho, \]
hence $\beta \le Cr$, $\rho_c = \beta \le Cr$, and, therefore, by Theorems 4 and 6, $t_g \le \frac{2}{\rho - Cr}$.
Acknowledgement. This is a part of the author’s Ph.D thesis. I would like to thank my adviser, P. March, for introducing this subject, for showing me the mathematical ideas and insights, and for his encouragement.
References
[A] Aldous, D.J.: Deterministic and Stochastic Models for Coalescence (Aggregation, Coagulation): A Review of the Mean-Field Theory for Probabilists. Preprint (1996)
[An] Anderson, W.I.: Continuous Time Markov Chains. New York: Springer-Verlag, 1991
[BH] Bak, T.A., Heilmann, O.J.: Post-gelation solutions to Smoluchowski's coagulation equation. J. Phys. A: Math. Gen. 27, 4203–4209 (1994)
[BC] Ball, J.M., Carr, J.: The discrete coagulation-fragmentation equations: existence, uniqueness, and density conservation. J. Stat. Phys. 61, 203–234 (1990)
[BCC] Ball, J.M., Carr, J., Penrose, O.: The Becker–Döring Cluster Equations: Basic Properties and Asymptotic Behavior of Solutions. Commun. Math. Phys. 104, 657–692 (1986)
[B] Barrow, J.D.: Coagulation with fragmentation. J. Phys. A: Math. Gen. 14, 729–733 (1981)
[Bi] Billingsley, P.: Convergence of Probability Measures. New York: J. Wiley & Sons, 1968
[BP] Buffet, E., Pule, I.V.: On the Lushnikov's model of gelation. J. Stat. Phys. 58, 1041–1058 (1990)
[BW] Buffet, E., Werner, R.F.: A counterexample in coagulation theory. J. Math. Phys. 32, 2276–2278 (1991)
[D] Drake, R.L.: A general mathematical survey of the coagulation equation. In: Topics in Current Aerosol Research 3, Part 2, G. M. Hidy and J. R. Brock, eds. Oxford: Pergamon Press, 1972, pp. 51–119
[DS] Donnelly, P., Simons, S.: On the stochastic approach to cluster size distribution during particle coagulation: I. Asymptotic expansion in the deterministic limit. J. Phys. A: Math. Gen. 26, 2755–2767 (1993)
[DGS] Dubovskii, P.B., Galkin, V.A., and Stewart, I.W.: Exact solutions for the coagulation-fragmentation equation. J. Phys. A: Math. Gen. 25, 4737–4744 (1992)
[ES] Ernst, M.H., Szamel, G.: Fragmentation kinetics. J. Phys. A: Math. Gen. 26, 6085–6091 (1993)
[EK] Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. New York: J. Wiley & Sons, 1986
[F] Feller, W.: An Introduction to Probability Theory and Its Applications, Volume II. New York: J. Wiley, 1971
[H] Halmos, P.R.: A Hilbert Space Problem Book. New York: Springer-Verlag, 1982
[HEZ] Hendricks, E., Ernst, M., and Ziff, R.: Coagulation equations with gelation. J. Stat. Phys. 31, 519–563 (1983)
[HSES] Hendricks, E., Spouge, M., Eibl, M., and Schreckenberg, M.: Exact Solutions for Random Coagulation Processes. Z. Phys. B - Condensed Matter 58, 219–227 (1985)
[HEL] Huang, J., Edwards, B.F., and Levine, A.D.: General solutions and scaling violation for fragmentation with mass loss. J. Phys. A: Math. Gen. 24, 3967–3977 (1985)
[JS] Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Berlin: Springer-Verlag, 1987
[JM] Joffe, A., Metivier, M.: Weak convergence of semimartingales with application to multi type branching process. Adv. in Appl. Probab. 18, 20–65 (1986)
[KS] Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. New York: Springer-Verlag, 1991
[KT] Karlin, S., Taylor, M.H.: A First Course in Stochastic Process. London: Academic Press, 1975
[K] Kokholm, N.J.: On Smoluchowski's coagulation equation. J. Phys. A: Math. Gen. 21, 839–842 (1988)
[L] Leyvraz, F.: Existence and properties of post-gel solutions for the kinetic equations of coagulation. J. Phys. A: Math. Gen. 16, 2861–2873 (1983)
[LT] Leyvraz, F., Tschudi, H.: Critical Kinetics near gelation. J. Phys. A: Math. Gen. 15, 1951–1964 (1982)
[Li] Liggett, T.M.: Interacting Particle Systems. New York: Springer-Verlag, 1985
[Lin] Lindvall, T.: Lectures on the Coupling Method. New York: Wiley, 1992
[Lu] Lushnikov, A.A.: Certain now aspects of the coagulation theory. Izv. Atm. Ok. Fiz. 14, 738–743 (1978)
[M] Marcus, A.H.: Stochastic coalescence. Technometrics 10, 133–146 (1968)
[MS] Merkulovich, V.M., Stepanov, A.S.: Atmospheric and Ocean Phys. 22, 195–199 (1986)
[Mc] McLeod, J.B.: On an infinite set of non-linear differential equations I, II. Quart. J. Math. Oxford (2) 13, 119–128, 193–205 (1962)
[S] Silk, J.: Star Formation. Geneva Observatory, Switzerland: Sauverny, 1980
[Sc] Scott, W.T.: Poisson statistics in distributions of coalescing droplets. J. Atmos. Sci. 24, 221–225 (1967)
[Sm] Smoluchowski, M.V.: Versuch einer mathematischen Theorie der Koagulationskinetik kolloider Lösungen. Z. Phys. Chem. 92, 129–168 (1917)
[SZT] Sorensen, C.M., Zhang, H.X., and Taylor, T.W.: Cluster-Size Evolution in a Coagulation-Fragmentation System. Phys. Rev. Lett. 59, 363–366 (1987)
[St] Stewart, I.W.: A global existence theorem for the general coagulation-fragmentation equation with unbounded kernels. Math. Meth. Appl. Sci. 11, 627–648 (1989)
[SV] Strook, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. New York: Springer-Verlag, 1979
[vE1] van Dongen, P.G.J., Ernst, M.H.: Pre- and post-gel size distributions in (ir)reversible polymerization. J. Phys. A: Math. Gen. 16, L327–L332 (1983)
[vE2] van Dongen, P.G.J., Ernst, M.H.: Size distribution in the polymerization model $A_f RB_g$. J. Phys. A: Math. Gen. 17, 2281–2297 (1984)
[vE3] van Dongen, P.G.J., Ernst, M.H.: Cluster size distribution on irreversible aggregation at large times. J. Phys. A: Math. Gen. 18, 2779–2793 (1985)
[VZL] Vigil, R.D., Ziff, R.M., and Lu, B.: New universality class for gelation in a system with particle breakup. Phys. Rev. B 38, 942–945 (1988)
[W] Warshaw, M.: Cloud droplet coalescence: Statistical foundation and a one-dimensional sedimentation model. J. Atmos. Sci. 24, 278–286 (1967)
[WS] Witten, T.A., Sander, L.M.: Phys. Rev. Lett. 47, 1400–1403 (1981)
[Wh] White, W.H.: A global existence theorem for Smoluchowski's coagulation equation. Proc. Am. Math. Soc. 80, 273–276 (1980)
[YH] Yu Jiang, Hu Gang: Long-time behavior of the cluster size distribution in joint coagulation processes. Physical Review B 40, 661–665 (1989)
[YHM] Yu Jiang, Hu Gang, and Ma BenKun: Critical property and universality in the generalized Smoluchowski coagulation equation. Phys. Rev. B 41, 9424–9429 (1990)
[Z1] Ziff, R.: Kinetics of polymerization. J. Stat. Phys. 23, 241–263 (1980)
[Z2] Ziff, R.: An explicit solution to a discrete fragmentation model. J. Phys. A: Math. Gen. 25, 2569–2576 (1992)
[ZS] Ziff, R., Stell, G.: Kinetics of polymer gelation. J. Chem. Phys. 73, 3492–3499 (1980)
[ZHE] Ziff, R., and Hendricks, E., Ernst, M.: Critical Properties for Gelation: A Kinetic Approach. Phys. Rev. Lett. 49, 593–595 (1982)
Communicated by J. L. Lebowitz
Commun. Math. Phys. 194, 569 – 589 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Boundary Exchange Algebras and Scattering on the Half Line
Antonio Liguori1,2, Mihail Mintchev1,2, Liu Zhao3
1 Dipartimento di Fisica dell'Università di Pisa, Piazza Torricelli 2, 56100 Pisa, Italy
2 Istituto Nazionale di Fisica Nucleare, Sezione di Pisa, 56100 Pisa, Italy
3 Institute of Modern Physics, Northwest University, Xian 710069, P.R. China
Received: 14 November 1996 / Accepted: 5 November 1997
Abstract: Some algebraic aspects of field quantization in space-time with boundaries are discussed. We introduce an associative algebra BR , whose exchange properties are inferred from the scattering processes in integrable models with reflecting boundary conditions on the half line. The basic properties of BR are established and the Fock representations associated with certain involutions in BR are derived. We apply these results for the construction of quantum fields and for the study of scattering on the half line.
1. Introduction It is well known that the presence of boundaries in space affects the behavior of quantum fields. In this paper we discuss the influence of the boundary conditions on the canonical commutation relations between creation and annihilation operators. Our investigation is inspired mainly by the factorized scattering theory of integrable models with reflecting boundary conditions on the half line. In the absence of boundaries [6,13,26], the algebraic features of these models are encoded in the Zamolodchikov-Faddeev (Z-F) algebra [6,26], denoted in what follows by AR . This is an associative algebra, whose generators satisfy quadratic constraints, known as exchange relations. The Fock representation of AR equipped with an appropriate involution describes the scattering processes in integrable models. In this respect one should recall first that the Fock space contains two dense subspaces whose elements are interpreted as asymptotic in- and out-states. Second, the S-matrix can be explicitly constructed as a unitary operator interpolating between the asymptotic in- and out-spaces. In a pioneering paper from the middle of the eighties, Cherednik [4] suggested a possible generalization of factorized scattering theory to integrable models with reflecting boundary conditions, which preserve integrability. The recent efforts to gain a deeper insight in various boundary-related two-dimensional phenomena, stimulated
further investigations [5,7–12,16,21,23–25] in this subject. Among others, we would like to mention the attempts to develop an algebraic approach. One of the basic ideas there is to extend the Z-F algebra by introducing [8–12] “boundary creating" (also called “reflection") operators, which formally translate in algebraic terms the nontrivial boundary conditions. When possible, such an algebraic formulation is quite attractive because the treatment of the boundary conditions in their standard analytic form is as a rule a complicated matter. In spite of the great progress in implementing the above idea in particular models, the fundamental features of the boundary operators and their interplay with the “bulk" theory are still to be investigated. This is among the main purposes of the present paper. We start our analysis by introducing an exchange algebra BR with the following structure. In the above spirit, BR contains both boundary and bulk generators. The latter have a counterpart in AR , but we shall see that the exchange of two bulk generators of BR involves in general boundary elements. The impact of the boundary on the bulk theory is therefore manifest already on the algebraic level, while the detailed boundary conditions are specified on the level of representation. We concentrate in this article on the Fock representations of BR . We will show that there exist two series of such representations, depending on certain involutions in BR . We shall construct these representations explicitly, establishing also their basic properties. As an application of these results, we will perform a detailed and rigorous investigation of the S-matrix of integrable models in the presence of reflecting boundaries. The paper is organized as follows. In Sect. 2 we define the exchange algebra BR and investigate some of its basic features. We introduce the concept of reflection BR algebra and the related notion of reflection automorphism. 
At the end of this section we describe also a family of natural generalizations of $\mathcal{B}_R$. Section 3 is devoted to the Fock representations of $\mathcal{B}_R$. In Sect. 4 we describe some applications. We show that the second quantization on the half line naturally leads to $\mathcal{B}_R$. We also analyze here the scattering operator of integrable models. The last section contains our conclusions. In the appendix we construct representations of $\mathcal{B}_R$ carrying a boundary quantum number. This article brings together and extends the results independently obtained by the present authors in [19] and [27].

2. The Exchange Algebra $\mathcal{B}_R$

$\mathcal{B}_R$ is by definition an associative algebra with identity element $\mathbf{1}$. It has two types of generators:
\[ \{a_\alpha(x),\ a^{*\alpha}(x) : \alpha = 1, \dots, N,\ x \in \mathbb{R}^s\} \tag{2.1} \]
and
\[ \{b_\alpha^\beta(x) : \alpha, \beta = 1, \dots, N,\ x \in \mathbb{R}^s\}, \tag{2.2} \]
which, as mentioned in the introduction, are called bulk and boundary generators respectively. For convenience, we divide the constraints on (2.1,2) in three groups:

(i) bulk exchange relations are quadratic in the bulk generators and read
\[ a_{\alpha_1}(x_1)\,a_{\alpha_2}(x_2) - R_{\alpha_2\alpha_1}^{\beta_1\beta_2}(x_2,x_1)\,a_{\beta_2}(x_2)\,a_{\beta_1}(x_1) = 0, \tag{2.3} \]
\[ a^{*\alpha_1}(x_1)\,a^{*\alpha_2}(x_2) - a^{*\beta_2}(x_2)\,a^{*\beta_1}(x_1)\,R_{\beta_2\beta_1}^{\alpha_1\alpha_2}(x_2,x_1) = 0, \tag{2.4} \]
\[ a_{\alpha_1}(x_1)\,a^{*\alpha_2}(x_2) - a^{*\beta_2}(x_2)\,R_{\alpha_1\beta_2}^{\alpha_2\beta_1}(x_1,x_2)\,a_{\beta_1}(x_1) = \frac12\,\delta(x_1-x_2)\,\delta_{\alpha_1}^{\alpha_2}\,\mathbf{1} + \frac12\,\delta(x_1+x_2)\,b_{\alpha_1}^{\alpha_2}(x_1); \tag{2.5} \]
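(ii) boundary exchange relations
\[ R_{\alpha_1\alpha_2}^{\gamma_2\gamma_1}(x_1,x_2)\,b_{\gamma_1}^{\delta_1}(x_1)\,R_{\gamma_2\delta_1}^{\beta_1\delta_2}(x_2,-x_1)\,b_{\delta_2}^{\beta_2}(x_2) = b_{\alpha_2}^{\gamma_2}(x_2)\,R_{\alpha_1\gamma_2}^{\delta_2\delta_1}(x_1,-x_2)\,b_{\delta_1}^{\gamma_1}(x_1)\,R_{\delta_2\gamma_1}^{\beta_1\beta_2}(-x_2,-x_1); \tag{2.6} \]

(iii) mixed relations
\[ a_{\alpha_1}(x_1)\,b_{\alpha_2}^{\beta_2}(x_2) = R_{\alpha_2\alpha_1}^{\gamma_1\gamma_2}(x_2,x_1)\,b_{\gamma_2}^{\delta_2}(x_2)\,R_{\gamma_1\delta_2}^{\beta_2\delta_1}(x_1,-x_2)\,a_{\delta_1}(x_1), \tag{2.7} \]
\[ b_{\alpha_1}^{\beta_1}(x_1)\,a^{*\alpha_2}(x_2) = a^{*\delta_2}(x_2)\,R_{\alpha_1\delta_2}^{\gamma_2\delta_1}(x_1,x_2)\,b_{\delta_1}^{\gamma_1}(x_1)\,R_{\gamma_2\gamma_1}^{\beta_1\alpha_2}(x_2,-x_1). \tag{2.8} \]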
In the above equations and in what follows the summation over repeated upper and lower indices is always understood. The entries of the exchange factor $R$ are complex valued measurable functions on $\mathbb{R}^s \times \mathbb{R}^s$, obeying
\[ R_{\alpha_1\alpha_2}^{\gamma_1\gamma_2}(x_1,x_2)\,R_{\gamma_1\gamma_2}^{\beta_1\beta_2}(x_2,x_1) = \delta_{\alpha_1}^{\beta_1}\,\delta_{\alpha_2}^{\beta_2}, \tag{2.9} \]
\[ R_{\alpha_1\alpha_2}^{\gamma_1\gamma_2}(x_1,x_2)\,R_{\gamma_2\alpha_3}^{\delta_2\beta_3}(x_1,x_3)\,R_{\gamma_1\delta_2}^{\beta_1\beta_2}(x_2,x_3) = R_{\alpha_2\alpha_3}^{\gamma_2\gamma_3}(x_2,x_3)\,R_{\alpha_1\gamma_2}^{\beta_1\delta_2}(x_1,x_3)\,R_{\delta_2\gamma_3}^{\beta_2\beta_3}(x_1,x_2). \tag{2.10} \]
These compatibility conditions are assumed throughout the paper and can be considered as general requirements on $R$, which together with Eqs. (2.3–8) define the exchange algebra $\mathcal{B}_R$. Equation (2.10) is the spectral quantum Yang–Baxter equation in its braid form, $\mathbb{R}^s$ playing the role of spectral set.

Let us comment now on the exchange relations (2.3–8), which may look at first sight a bit complicated. Concerning the general structure, we observe that after setting formally all boundary generators in (2.3–8) to zero and rescaling by a factor of $1/\sqrt{2}$ the bulk generators, one gets the Z-F algebra $\mathcal{A}_R$. This fact clarifies partially the origin of Eqs. (2.3–5). The presence of boundary generators in the right hand side of (2.5) is worth stressing. This is one of the essential points, in which our approach differs from the previous attempts to define a boundary exchange algebra. Equation (2.6) describes the exchange of two boundary generators taken in generic points and also deserves a remark. It looks similar to the boundary Yang–Baxter equation [4]; the difference is that the elements $\{b_\alpha^\beta(x)\}$ do not commute in general and consequently their position in (2.6) is essential. Notice also that $\{b_\alpha^\beta(x)\}$ close a subalgebra of $\mathcal{B}_R$, which presents by itself some interest [24]. Finally, Eqs. (2.7, 8) express the interplay between $\{a_\alpha(x), a^{*\alpha}(x)\}$ and $\{b_\alpha^\beta(x)\}$ and represent another relevant new aspect of our proposal.

Two straightforward examples, denoted by $\mathcal{B}_\pm$, correspond to the constant solutions
\[ R_{\alpha_1\alpha_2}^{\beta_1\beta_2} = \pm\,\delta_{\alpha_2}^{\beta_1}\,\delta_{\alpha_1}^{\beta_2} \tag{2.11} \]
of (2.9,10) and represent in the above context the counterparts of the canonical (anti)commutation relations. Equations (2.6–8) imply that $\{b_\alpha^\beta(x)\}$ are central elements in $\mathcal{B}_\pm$. Nevertheless, also in these relatively simple cases the right-hand side of Eq. (2.5) keeps trace of the nontrivial boundary conditions. Two applications of $\mathcal{B}_+$ with $N = 1$ are described in Sect. 4.

To further understand the structure of $\mathcal{B}_R$ and its representations, it is instructive to introduce some involutions in $\mathcal{B}_R$. Let $H_N$ be the family of invertible Hermitian $N \times N$ matrices and let $\mathcal{M}$ be the set of matrix valued functions $m : \mathbb{R}^s \to H_N$, such that the entries of $m(x)$ and $m(x)^{-1}$ are measurable and bounded in $\mathbb{R}^s$. Consider the mapping $I_m$ defined by
\[ I_m : a^{*\alpha}(x) \mapsto m_\alpha^\beta(x)\,a_\beta(x), \tag{2.12} \]
\[ I_m : a_\alpha(x) \mapsto a^{*\beta}(x)\,m^{-1\,\alpha}_{\ \ \beta}(x), \tag{2.13} \]
\[ I_m : b_\alpha^\beta(x) \mapsto m_\beta^\gamma(-x)\,b_\gamma^\delta(-x)\,m^{-1\,\alpha}_{\ \ \delta}(x). \tag{2.14} \]
Provided that $m \in \mathcal{M}$ satisfies
\[ R^{\dagger\,\gamma_1\gamma_2}_{\ \alpha_1\alpha_2}(x_1,x_2)\,m_{\gamma_1}^{\beta_1}(x_1)\,m_{\gamma_2}^{\beta_2}(x_2) = m_{\alpha_1}^{\gamma_1}(x_2)\,m_{\alpha_2}^{\gamma_2}(x_1)\,R_{\gamma_1\gamma_2}^{\beta_1\beta_2}(x_2,x_1), \tag{2.15} \]
it is not difficult to check that when extended as an antilinear antihomomorphism on $\mathcal{B}_R$, $I_m$ defines an involution. In Eq. (2.15) and in what follows the dagger stands for Hermitian conjugation, i.e.
\[ R^{\dagger\,\beta_1\beta_2}_{\ \alpha_1\alpha_2}(x_1,x_2) \equiv \overline{R}_{\beta_1\beta_2}^{\alpha_1\alpha_2}(x_1,x_2), \]
the bar indicating complex conjugation. Notice that for the algebras $\mathcal{B}_\pm$ Eq. (2.15) is satisfied for any $m \in \mathcal{M}$.

In this paper we shall concentrate on the following specific type of $\mathcal{B}_R$-algebras. We call the boundary generators $\{b_\alpha^\beta(x)\}$ reflections if
\[ b_\alpha^\gamma(x)\,b_\gamma^\beta(-x) = \delta_\alpha^\beta \tag{2.16} \]
hold. In this case we refer to $\mathcal{B}_R$ as a reflection exchange algebra. The condition (2.16) is $I_m$-invariant and one easily proves

Proposition 1. Let $\mathcal{B}_R$ be a reflection exchange algebra. Then the mapping
\[ \varrho : a_\alpha(x) \mapsto b_\alpha^\beta(x)\,a_\beta(-x), \tag{2.17} \]
\[ \varrho : a^{*\alpha}(x) \mapsto a^{*\beta}(-x)\,b_\beta^\alpha(-x), \tag{2.18} \]
\[ \varrho : b_\alpha^\beta(x) \mapsto b_\alpha^\beta(x), \tag{2.19} \]
leaves invariant the constraints (2.3–8) and extends therefore to an automorphism on $\mathcal{B}_R$. Moreover, being compatible with $I_m$, $\varrho$ is actually an automorphism of $\{\mathcal{B}_R, I_m\}$ considered as an algebra with involution.

In what follows $\varrho$ is called the reflection automorphism of $\mathcal{B}_R$. Besides encoding some essential features of any reflection exchange algebra, $\varrho$ has a direct physical interpretation in scattering theory: it provides a mathematical description of the intuitive physical picture that bouncing back from a wall, particles change the sign of their rapidities. In fact, the two elements $a^{*\alpha}(-x)$ and $a^{*\beta}(x)\,b_\beta^\alpha(x)$ are $\varrho$-equivalent,
\[ a^{*\alpha}(-x) \sim a^{*\beta}(x)\,b_\beta^\alpha(x). \tag{2.20} \]
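The role of the reflection condition (2.16) in making $\varrho$ an involutive map can be illustrated in the simplest setting. The sketch below is only a toy model under explicit assumptions: $N = 1$, $s = 1$, the boundary generator is replaced by a c-number function $B(x)$ (as happens for central $b$ in a Fock representation), and the particular choice $B(x) = \frac{x - ig}{x + ig}$ with a hypothetical real parameter $g$ is an example chosen here, not a formula taken from the text. An element $c\,a(x)$ is encoded as the pair `(c, x)`.

```python
def B(x, g=1.5):
    """Hypothetical scalar reflection factor; satisfies B(x) * B(-x) = 1 for real x."""
    return (x - 1j * g) / (x + 1j * g)


def reflect(coeff, x):
    """Toy version of (2.17): c * a(x) -> c * B(x) * a(-x), encoded as (coefficient, argument)."""
    return coeff * B(x), -x


x = 0.7
print(B(x) * B(-x))                  # reflection condition (2.16) in this toy model
print(abs(B(x)))                     # the factor is a pure phase
print(reflect(*reflect(1.0, x)))     # applying the map twice returns a(x)
```

Applying `reflect` twice multiplies the coefficient by $B(x)B(-x)$, so the scalar analogue of (2.16) is exactly what is needed for the double reflection to act as the identity.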
This relation in our framework is the counterpart of a heuristic equation (see for example Eq. (3.22) of [10]), conjectured in all papers dealing with factorized scattering with reflecting boundaries. In the next section we will show that the $\varrho$-equivalence becomes actually an equality in the Fock representation of $\{\mathcal{B}_R, I_m\}$. For proving this statement we will use the relations
\[ \{a_\alpha(x_1) - \varrho[a_\alpha(x_1)]\}\,a^{*\beta}(x_2) = a^{*\gamma}(x_2)\,R_{\alpha\gamma}^{\beta\delta}(x_1,x_2)\,\{a_\delta(x_1) - \varrho[a_\delta(x_1)]\}, \tag{2.21} \]
\[ \{a^{*\alpha}(x_1) - \varrho[a^{*\alpha}(x_1)]\}\,a^{*\beta}(x_2) = a^{*\gamma}(x_2)\,\{a^{*\delta}(x_1) - \varrho[a^{*\delta}(x_1)]\}\,R_{\gamma\delta}^{\alpha\beta}(x_2,x_1), \tag{2.22} \]
whose validity follows directly from Eqs. (2.3–5, 7, 8, 16).

Before concluding this section, we would like to introduce a whole class of more general exchange algebras which can be treated in the above way. The idea is to replace the reflection $x \mapsto -x$, which plays a special role in defining $\mathcal{B}_R$, with any almost everywhere differentiable mapping $\lambda : x \mapsto \tilde{x}$, satisfying the iterative functional equation
\[ \lambda(\lambda(x)) = x. \tag{2.23} \]
The resulting exchange algebras will be denoted by $\mathcal{B}_{R,\lambda}$ and are characterized by the following constraints: the relations (2.3,4) remain unchanged, whereas (2.5–8) take the form
\[ a_{\alpha_1}(x_1)\,a^{*\alpha_2}(x_2) - a^{*\beta_2}(x_2)\,R_{\alpha_1\beta_2}^{\alpha_2\beta_1}(x_1,x_2)\,a_{\beta_1}(x_1) = \frac{1}{2|\det\lambda'(x_1)|^{1/2}}\,\{\delta(x_1-x_2)\,\delta_{\alpha_1}^{\alpha_2}\,\mathbf{1} + \delta(x_1-\tilde{x}_2)\,b_{\alpha_1}^{\alpha_2}(x_1)\}, \tag{2.24} \]
\[ R_{\alpha_1\alpha_2}^{\gamma_2\gamma_1}(x_1,x_2)\,b_{\gamma_1}^{\delta_1}(x_1)\,R_{\gamma_2\delta_1}^{\beta_1\delta_2}(x_2,\tilde{x}_1)\,b_{\delta_2}^{\beta_2}(x_2) = b_{\alpha_2}^{\gamma_2}(x_2)\,R_{\alpha_1\gamma_2}^{\delta_2\delta_1}(x_1,\tilde{x}_2)\,b_{\delta_1}^{\gamma_1}(x_1)\,R_{\delta_2\gamma_1}^{\beta_1\beta_2}(\tilde{x}_2,\tilde{x}_1), \tag{2.25} \]
\begin{align*}
a_{\alpha_1}(x_1)\,b_{\alpha_2}^{\beta_2}(x_2) &= R_{\alpha_2\alpha_1}^{\gamma_1\gamma_2}(x_2,x_1)\,b_{\gamma_2}^{\delta_2}(x_2)\,R_{\gamma_1\delta_2}^{\beta_2\delta_1}(x_1,\tilde{x}_2)\,a_{\delta_1}(x_1), \\
b_{\alpha_1}^{\beta_1}(x_1)\,a^{*\alpha_2}(x_2) &= a^{*\delta_2}(x_2)\,R_{\alpha_1\delta_2}^{\gamma_2\delta_1}(x_1,x_2)\,b_{\delta_1}^{\gamma_1}(x_1)\,R_{\gamma_2\gamma_1}^{\beta_1\alpha_2}(x_2,\tilde{x}_1). \tag{2.26}
\end{align*}
Here $\lambda'(x)$ denotes the Jacobian matrix of the function $\lambda$. The results of this section regarding $\mathcal{B}_R$ can be transferred with obvious modifications to $\mathcal{B}_{R,\lambda}$. For the complete set of solutions of Eq. (2.23) we refer to [14]. When $s = 1$ for instance, the mapping $\lambda$ can be any almost everywhere differentiable function in $\mathbb{R}$ whose graph is symmetric with respect to the diagonal $\{(x, y) \in \mathbb{R}^2 : x = y\}$.

Summarizing, we introduced so far the exchange algebra $\mathcal{B}_R$ and some natural generalizations of it. We defined also a set of involutions in $\mathcal{B}_R$, which are useful in representation theory. Focusing on reflection type $\mathcal{B}_R$-algebras, we shall construct in the next section the relative Fock representations.

3. Fock Representations

We consider in this paper representations of $\{\mathcal{B}_R, I_m\}$ with the following general structure.
1. The representation space $\mathcal{L}$ is a locally convex and complete topological linear space over $\mathbb{C}$.
2. The generators $\{a_\alpha(x), a^{*\alpha}(x), b_\alpha^\beta(x)\}$ are operator valued distributions with common and invariant dense domain $\mathcal{D} \subset \mathcal{L}$, where Eqs. (2.3–8) hold.
3. $\mathcal{D}$ is equipped with a nondegenerate sesquilinear form (inner product) $\langle\cdot,\cdot\rangle_m$, which is at least separately continuous. The involution $I_m$ defined by Eqs. (2.12–14) is realized as a conjugation with respect to $\langle\cdot,\cdot\rangle_m$.
A Fock representation of $\{\mathcal{B}_R, I_m\}$ is specified further by the following requirement.
4. There exists a vector (vacuum state) $\Omega \in \mathcal{D}$ which is annihilated by $a_\alpha(x)$. Moreover, $\Omega$ is cyclic with respect to $\{a^{*\alpha}(x)\}$ and $\langle\Omega,\Omega\rangle_m = 1$.
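Returning briefly to the maps $\lambda$ of Eq. (2.23), a few one-dimensional examples ($s = 1$) can be tested numerically. The three maps below ($x \mapsto -x$, $x \mapsto c - x$ and $x \mapsto 1/x$ away from the origin) are illustrative choices made here, not a list taken from the text or from [14]; the helper names and the constant `c` are hypothetical. Besides $\lambda(\lambda(x)) = x$, the sketch checks the chain-rule consequence $\lambda'(\lambda(x))\,\lambda'(x) = 1$, using a central difference as a stand-in for the Jacobian.

```python
def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


c = 3.0
candidates = {
    "reflection": lambda x: -x,        # the case defining the original algebra
    "shifted reflection": lambda x: c - x,
    "inversion": lambda x: 1.0 / x,    # defined away from x = 0
}

samples = [0.5, 1.3, 2.0, -0.7]
for name, lam in candidates.items():
    for x in samples:
        y = lam(x)
        # Involution property (2.23) and its differentiated form.
        assert abs(lam(y) - x) < 1e-9
        assert abs(derivative(lam, y) * derivative(lam, x) - 1.0) < 1e-5
    print(name, "passes on", samples)
```

Each graph is symmetric about the diagonal $x = y$, which is the geometric form of the condition stated above for $s = 1$.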
A more general situation, when a boundary quantum number [10] is present, is outlined in the appendix. There is a series of direct but quite important corollaries from the above assumptions. Let us start with

Proposition 2. The automorphism $\varrho$ of any reflection $\mathcal{B}_R$-algebra is implemented in the above Fock representations by the identity operator.

Proof. First of all we observe that
\[ \langle P'[a^*]\Omega,\ \{a_\alpha(x) - \varrho[a_\alpha(x)]\}\,P[a^*]\Omega\rangle_m = 0, \tag{3.1} \]
where $P$ and $P'$ are arbitrary polynomials. In fact, by means of Eq. (2.21) one can shift the curly bracket to the vacuum and use that $a_\alpha(x)$ annihilate $\Omega$. Now the cyclicity of $\Omega$, combined with the properties of $\langle\cdot,\cdot\rangle_m$, allow to replace $P'[a^*]\Omega$ by an arbitrary state $\varphi \in \mathcal{D}$. A further conjugation leads to
\[ \langle P[a^*]\Omega,\ \{a^{*\alpha}(x) - \varrho[a^{*\alpha}(x)]\}\,\varphi\rangle_m = 0, \tag{3.2} \]
which implies
\[ a^{*\alpha}(x) = a^{*\beta}(-x)\,b_\beta^\alpha(-x) \tag{3.3} \]
on $\mathcal{D}$. Analogously, employing (2.22) one concludes that
\[ a_\alpha(x) = b_\alpha^\beta(x)\,a_\beta(-x) \tag{3.4} \]
also holds on $\mathcal{D}$. Finally, taking in consideration Eq. (2.19) we deduce that $\varrho$ is indeed implemented by the identity operator.

For describing some further characteristic features of the Fock representations of $\mathcal{B}_R$, we introduce the c-number distributions
\[ B_\alpha^\beta(x) \equiv \langle\Omega,\ b_\alpha^\beta(x)\,\Omega\rangle_m. \tag{3.5} \]
The requirement 3 implies that
\[ B^{\dagger\,\beta}_{\ \alpha}(x) = m_\alpha^\gamma(-x)\,B_\gamma^\delta(-x)\,m^{-1\,\beta}_{\ \ \delta}(x), \tag{3.6} \]
which is the analog of condition (2.15) regarding the exchange factor $R$. Two other simple consequences of our assumptions 1–4 above are collected in

Proposition 3. The vacuum vector $\Omega$ is unique (up to a phase factor) and satisfies
\[ b_\alpha^\beta(x)\,\Omega = B_\alpha^\beta(x)\,\Omega. \tag{3.7} \]
Proof. The argument implying the uniqueness of the vacuum is standard. Concerning Eq. (3.7), it can be inferred from the identity
\[ \langle[b_\alpha^\beta(x) - B_\alpha^\beta(x)]\,\Omega,\ P[a^*]\Omega\rangle_m = 0, \tag{3.8} \]
$P$ being an arbitrary polynomial. In order to prove Eq. (3.8) we shift by a conjugation the polynomial to the first factor in the right-hand side of (3.8) and apply afterwards the exchange relation (2.8) and Eq. (3.5). For completing the proof, one also employs that $\Omega$ is cyclic and $\langle\cdot,\cdot\rangle_m$ is continuous and nondegenerate.
Boundary Exchange Algebras and Scattering on the Half Line
Combining Eq. (3.7) with the fact that $a_\alpha(x)$ annihilate $\Omega$, we conclude that Eqs. (2.5, 7, 8) allow for a purely algebraic derivation of the vacuum expectation values involving any number and combination of the generators $\{a_\alpha(x), a^{*\alpha}(x), b_\alpha^\beta(x)\}$. In particular, taking the vacuum expectation value of Eq. (2.6) one gets
$$R_{\alpha_1\alpha_2}^{\gamma_2\gamma_1}(x_1, x_2)\, B_{\gamma_1}^{\delta_1}(x_1)\, R_{\gamma_2\delta_1}^{\beta_1\delta_2}(x_2, -x_1)\, B_{\delta_2}^{\beta_2}(x_2) = B_{\alpha_2}^{\gamma_2}(x_2)\, R_{\alpha_1\gamma_2}^{\delta_2\delta_1}(x_1, -x_2)\, B_{\delta_1}^{\gamma_1}(x_1)\, R_{\delta_2\gamma_1}^{\beta_1\beta_2}(-x_2, -x_1). \tag{3.9}$$
We thus recover at the level of Fock representation the original boundary Yang–Baxter equation [4]. In addition, when one is dealing with reflection algebras, Eq. (2.16) implies
$$B_\alpha^\gamma(x)\, B_\gamma^\beta(-x) = \delta_\alpha^\beta. \tag{3.10}$$
In this case we refer to $B$ as a reflection matrix. A final comment in this introductory part concerns the algebras $\mathcal{B}_\pm$. Using that $\{b_\alpha^\beta(x)\}$ are central elements, in a Fock representation of $\mathcal{B}_\pm$ one has
$$b_\alpha^\beta(x)\,\varphi = B_\alpha^\beta(x)\,\varphi \tag{3.11}$$
for any $\varphi \in \mathcal{D}$.

At this stage it is convenient to introduce the set $\mathcal{M}(R, B)$ of all elements of $\mathcal{M}$ obeying both Eqs. (2.15) and (3.6). Then the basic input fixing a Fock representation of the reflection algebra $\{\mathcal{B}_R, I_m\}$ is the triplet $\{R, B; m\}$, where $R$ and $B$ satisfy Eqs. (2.9, 10) and (3.9, 10), and $m \in \mathcal{M}(R, B)$. Some explicit examples of such triplets have been found already by Cherednik [4]. With any $\{R, B; m\}$ we associate a Fock representation denoted by $\mathcal{F}_{R,B;m}$. To the end of this section we will describe the explicit construction of $\mathcal{F}_{R,B;m}$.

Our first step will be to introduce the $n$-particle subspace $\mathcal{H}^n_{R,B}$ of $\mathcal{F}_{R,B;m}$. For this purpose we consider
$$\mathcal{H} = \bigoplus_{\alpha=1}^{N} L^2(\mathbb{R}^s), \tag{3.12}$$
equipped with the standard scalar product
$$(\varphi, \psi) = \int d^s x\, \varphi^{\dagger\alpha}(x)\, \psi_\alpha(x) = \sum_{\alpha=1}^{N} \int d^s x\, \overline{\varphi}_\alpha(x)\, \psi_\alpha(x). \tag{3.13}$$
For $n \geq 1$ the $n$-particle space $\mathcal{H}^n_{R,B}$ we are looking for will be a subspace of the $n$-fold tensor power $\mathcal{H}^{\otimes n}$, characterized by a suitable projection operator $P^{(n)}_{R,B}$. The ingredients for constructing $P^{(n)}_{R,B}$ are essentially two: a specific finite group and its representation in $\mathcal{H}^{\otimes n}$, defined in terms of the exchange factor $R$ and the reflection matrix $B$. Let us concentrate first on the group. In the case of $\mathcal{A}_R$, this was [17] simply the permutation group $\mathcal{P}_n$. The physics behind $\mathcal{B}_R$ suggests to enlarge in this case the group by adding a reflection generator. More precisely, we consider the group $\mathcal{W}_n$ generated by $\{\tau, \sigma_i : i = 1, ..., n-1\}$ which satisfy
$$\sigma_i \sigma_j = \sigma_j \sigma_i, \quad |i - j| \geq 2; \qquad \sigma_i \tau = \tau \sigma_i, \quad 1 \leq i \leq n - 2, \tag{3.14}$$
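$$\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}, \qquad \sigma_{n-1} \tau \sigma_{n-1} \tau = \tau \sigma_{n-1} \tau \sigma_{n-1}, \tag{3.15}$$
$$\sigma_i^2 = \tau^2 = 1. \tag{3.16}$$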
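$\mathcal{W}_n$ is the Weyl group associated with the root systems of the classical Lie algebra $B_n$ and has $2^n n!$ elements. Although it contains no permutations, $\mathcal{W}_1 = \{1, \tau\}$ is nontrivial.

We turn now to the representation of $\mathcal{W}_n$ in $\mathcal{H}^{\otimes n}$. Observing that any element $\varphi \in \mathcal{H}^{\otimes n}$ can be viewed as a column whose entries are $\varphi_{\alpha_1 \cdots \alpha_n}(x_1, \ldots, x_n)$, we define the operators $\{T^{(n)}, S_i^{(n)} : i = 1, ..., n-1\}$ acting on $\mathcal{H}^{\otimes n}$ according to:
$$\left[S_i^{(n)}\varphi\right]_{\alpha_1...\alpha_n}(x_1, ..., x_i, x_{i+1}, ..., x_n) = [R_{i\,i+1}(x_i, x_{i+1})]_{\alpha_1...\alpha_n}^{\beta_1...\beta_n}\, \varphi_{\beta_1...\beta_n}(x_1, ..., x_{i+1}, x_i, ..., x_n), \quad n \geq 2, \tag{3.17}$$
$$\left[T^{(n)}\varphi\right]_{\alpha_1...\alpha_n}(x_1, ..., x_n) = [B_n(x_n)]_{\alpha_1...\alpha_n}^{\beta_1...\beta_n}\, \varphi_{\beta_1...\beta_n}(x_1, ..., x_{n-1}, -x_n), \quad n \geq 1, \tag{3.18}$$
where
$$[R_{ij}(x_i, x_j)]_{\alpha_1...\alpha_n}^{\beta_1...\beta_n} = \delta_{\alpha_1}^{\beta_1} \delta_{\alpha_2}^{\beta_2} \cdots \widehat{\delta_{\alpha_i}^{\beta_i}} \cdots \widehat{\delta_{\alpha_j}^{\beta_j}} \cdots \delta_{\alpha_n}^{\beta_n}\, R_{\alpha_i\alpha_j}^{\beta_i\beta_j}(x_i, x_j) \tag{3.19}$$
and
$$[B_i(x)]_{\alpha_1...\alpha_n}^{\beta_1...\beta_n} = \delta_{\alpha_1}^{\beta_1} \delta_{\alpha_2}^{\beta_2} \cdots \widehat{\delta_{\alpha_i}^{\beta_i}} \cdots \delta_{\alpha_n}^{\beta_n}\, B_{\alpha_i}^{\beta_i}(x). \tag{3.20}$$
The hat in Eqs. (3.19, 20) indicates that the corresponding symbol must be omitted. For implementing Eqs. (3.17, 18) on the whole $\mathcal{H}^{\otimes n}$, we assume at this stage that the matrix elements $R_{\alpha_1\alpha_2}^{\beta_1\beta_2}(x_1, x_2)$ and $B_\alpha^\beta(x)$ are bounded functions. We are now in position to prove

Proposition 4. $\{T^{(n)}, S_i^{(n)} : i = 1, ..., n-1\}$ are bounded operators on $\mathcal{H}^{\otimes n}$ and the mapping
$$\chi^{(n)} : \tau \longmapsto T^{(n)}, \qquad \chi^{(n)} : \sigma_i \longmapsto S_i^{(n)}, \qquad i = 1, \cdots, n-1 \tag{3.21}$$
defines a representation of $\mathcal{W}_n$ in $\mathcal{H}^{\otimes n}$. Moreover,
$$P^{(n)}_{R,B} \equiv \frac{1}{2^n n!} \sum_{\nu \in \mathcal{W}_n} \chi^{(n)}(\nu) \tag{3.22}$$
is a bounded projection operator in $\mathcal{H}^{\otimes n}$.

Proof. The main point is to show that $\{T^{(n)}, S_i^{(n)} : i = 1, ..., n-1\}$ obey Eqs. (3.14–16). This can be checked directly. Equations (3.14) are satisfied by construction. Equations (3.15) follow from (2.10) and (3.9). Finally, Eqs. (2.9) and (3.10) imply (3.16).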
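The group-theoretic input of Proposition 4 can be checked by brute force for small $n$. The following is only a sketch under a stated assumption: it realizes the abstract generators as signed permutations (with $\sigma_i$ swapping slots $i$ and $i+1$ and $\tau$ flipping the sign of slot $n$), which is one standard model of the Weyl group of $B_n$ and is not taken from the text. The function names are hypothetical.

```python
from functools import reduce
from math import factorial


def compose(g, h):
    """Composition g∘h of signed permutations.

    An element is a tuple g with g[i-1] = ±j, meaning slot i goes to ±j;
    the action extends to negative entries by g(-i) = -g(i).
    """
    return tuple(g[abs(v) - 1] if v > 0 else -g[abs(v) - 1] for v in h)


def word(*gs):
    """Product of several group elements, in the order they are written."""
    return reduce(compose, gs)


def generators(n):
    """Return ([sigma_1, ..., sigma_{n-1}], tau) acting on n slots."""
    e = list(range(1, n + 1))
    sigmas = []
    for i in range(n - 1):
        s = e.copy()
        s[i], s[i + 1] = s[i + 1], s[i]   # swap two neighbouring slots
        sigmas.append(tuple(s))
    tau = tuple(e[:-1] + [-n])            # tau reflects the last slot
    return sigmas, tau


def generate_group(n):
    """Close {tau, sigma_i} under multiplication by breadth-first search."""
    sigmas, tau = generators(n)
    e = tuple(range(1, n + 1))
    seen, frontier = {e}, [e]
    while frontier:
        nxt = []
        for g in frontier:
            for s in sigmas + [tau]:
                h = compose(s, g)
                if h not in seen:
                    seen.add(h)
                    nxt.append(h)
        frontier = nxt
    return seen


def relations_hold(n):
    """Check Eqs. (3.14)-(3.16) for the signed-permutation generators."""
    sigmas, tau = generators(n)
    e = tuple(range(1, n + 1))
    ok = word(tau, tau) == e and all(word(s, s) == e for s in sigmas)
    for i, si in enumerate(sigmas):       # list index i holds sigma_{i+1}
        for j, sj in enumerate(sigmas):
            if abs(i - j) >= 2:
                ok &= word(si, sj) == word(sj, si)
        if i < n - 2:                     # every sigma except sigma_{n-1}
            ok &= word(si, tau) == word(tau, si)
        if i + 1 < len(sigmas):
            sn = sigmas[i + 1]
            ok &= word(si, sn, si) == word(sn, si, sn)
    if sigmas:
        s = sigmas[-1]
        ok &= word(s, tau, s, tau) == word(tau, s, tau, s)
    return ok


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, len(generate_group(n)), 2**n * factorial(n), relations_hold(n))
```

In this model the closure of the generators has exactly $2^n n!$ elements for the small $n$ tested, which is the normalization appearing in Eq. (3.22).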
Let us observe in passing that $P^{(n)}_{R,B}$ is an orthogonal projector only if the $N \times N$ identity matrix $e$ belongs to $\mathcal{M}(R, B)$. In general $P^{(n)}_{R,B}$ is not orthogonal, but being a bounded operator determines for any $n \geq 1$ a (nonempty) closed subspace
$$\mathcal{H}^n_{R,B} \equiv P^{(n)}_{R,B}\, \mathcal{H}^{\otimes n}. \tag{3.23}$$
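For $n = 1$ and $N = 1$ the group is $\mathcal{W}_1 = \{1, \tau\}$ and the projector (3.22) reduces to $(Pf)(x) = \frac{1}{2}[f(x) + B(x) f(-x)]$. The sketch below checks numerically that this map is idempotent and that its image obeys the reflection property $\varphi(x) = B(x)\varphi(-x)$. It assumes, purely for illustration, the scalar reflection factor $B(x) = (x - i\eta)/(x + i\eta)$ with $\eta = 0.7$, which satisfies Eq. (3.10); the test function and sample points are arbitrary choices.

```python
import cmath


def B(x, eta=0.7):
    """Illustrative scalar reflection factor; B(x) * B(-x) = 1, cf. Eq. (3.10)."""
    return (x - 1j * eta) / (x + 1j * eta)


def project(f):
    """One-particle projector: (P f)(x) = [f(x) + B(x) f(-x)] / 2."""
    return lambda x: 0.5 * (f(x) + B(x) * f(-x))


def f(x):
    """Arbitrary complex test function without any reflection symmetry."""
    return cmath.exp(-(x - 1.0) ** 2) * (1.0 + 0.3j * x)


Pf = project(f)
PPf = project(Pf)
points = [-2.0, -0.5, 0.3, 1.7]

# P is a projection: applying it twice changes nothing.
idempotency_error = max(abs(PPf(x) - Pf(x)) for x in points)
# Elements of the image satisfy phi(x) = B(x) phi(-x).
reflection_error = max(abs(Pf(x) - B(x) * Pf(-x)) for x in points)

if __name__ == "__main__":
    print(idempotency_error, reflection_error)
```

Both errors vanish up to rounding precisely because $B(x)B(-x) = 1$; replacing `B` by a function violating Eq. (3.10) breaks idempotency.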
By construction the elements of $\mathcal{H}^n_{R,B}$ behave as follows:
$$\varphi_{\alpha_1...\alpha_n}(x_1, ..., x_i, x_{i+1}, ..., x_n) = [R_{i\,i+1}(x_i, x_{i+1})]_{\alpha_1...\alpha_n}^{\beta_1...\beta_n}\, \varphi_{\beta_1...\beta_n}(x_1, ..., x_{i+1}, x_i, ..., x_n), \tag{3.24}$$
$$\varphi_{\alpha_1...\alpha_n}(x_1, ..., x_n) = [B_n(x_n)]_{\alpha_1...\alpha_n}^{\beta_1...\beta_n}\, \varphi_{\beta_1...\beta_n}(x_1, ..., x_{n-1}, -x_n). \tag{3.25}$$
Setting $\mathcal{H}^0_{R,B} = \mathbb{C}^1$, we introduce also the finite particle space $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ as the (complex) linear space of sequences $\varphi = \left(\varphi^{(0)}, \varphi^{(1)}, ..., \varphi^{(n)}, ...\right)$ with $\varphi^{(n)} \in \mathcal{H}^n_{R,B}$ and $\varphi^{(n)} = 0$ for $n$ large enough. The vacuum state is $\Omega = (1, 0, ..., 0, ...)$.

At this point we define on $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ the annihilation and creation operators $\{a(f), a^*(f) : f \in \mathcal{H}\}$ setting $a(f)\Omega = 0$ and
$$[a(f)\varphi]^{(n)}_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n) = \sqrt{n+1} \int d^s x\, f^{\dagger\alpha_0}(x)\, \varphi^{(n+1)}_{\alpha_0\alpha_1 \cdots \alpha_n}(x, x_1, ..., x_n), \tag{3.26}$$
$$[a^*(f)\varphi]^{(n)}_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n) = \sqrt{n}\, \left[P^{(n)}_{R,B}\, f \otimes \varphi^{(n-1)}\right]_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n), \tag{3.27}$$
for all $\varphi \in \mathcal{F}^0_{R,B;m}(\mathcal{H})$. The operators $a(f)$ and $a^*(f)$ are in general unbounded on $\mathcal{F}^0_{R,B;m}(\mathcal{H})$. However, for any $\psi^{(n)} \in \mathcal{H}^n_{R,B}$ one has the estimates
$$\| a(f)\psi^{(n)} \| \leq \sqrt{n}\, \| f \| \| \psi^{(n)} \|, \qquad \| a^*(f)\psi^{(n)} \| \leq \sqrt{n+1}\, \| P^{(n+1)}_{R,B} \| \| f \| \| \psi^{(n)} \|, \tag{3.28}$$
$\| \cdot \|$ being the $L^2$-norm. Therefore $a(f)$ and $a^*(f)$ are bounded on each $\mathcal{H}^n_{R,B}$. The right-hand side of Eq. (3.27) can be given an alternative form by implementing explicitly the action of $P^{(n)}_{R,B}$. The resulting expression is a bit complicated, but since in some cases it might be instructive, we give it for completeness:
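$$[a^*(f)\varphi]^{(n)}_{\alpha_1 \cdots \alpha_n}(x_1, ..., x_n) = \frac{1}{2\sqrt{n}} \Big\{ f_{\alpha_1}(x_1)\, \varphi^{(n-1)}_{\alpha_2 \cdots \alpha_n}(x_2, \ldots, x_n) + [C(x_1; x_2, ..., x_n)]_{\alpha_1 \cdots \alpha_n}^{\beta_1 \cdots \beta_n} f_{\beta_1}(-x_1)\, \varphi^{(n-1)}_{\beta_2 \cdots \beta_n}(x_2, \ldots, x_n) \Big\}$$
$$\quad + \frac{1}{2\sqrt{n}} \sum_{k=2}^{n} \left[R_{k-1\,k}(x_{k-1}, x_k) \cdots R_{1\,2}(x_1, x_k)\right]_{\alpha_1 \cdots \alpha_n}^{\beta_1 \cdots \beta_n} \Big\{ f_{\beta_1}(x_k)\, \varphi^{(n-1)}_{\beta_2 \cdots \beta_n}(x_1, \ldots, \widehat{x}_k, \ldots, x_n) + [C(x_k; x_1, ..., \widehat{x}_k, ..., x_n)]_{\beta_1 \cdots \beta_n}^{\gamma_1 \cdots \gamma_n} f_{\gamma_1}(-x_k)\, \varphi^{(n-1)}_{\gamma_2 \cdots \gamma_n}(x_1, \ldots, \widehat{x}_k, \ldots, x_n) \Big\}, \tag{3.29}$$
where
$$[C(x_k; x_1, ..., \widehat{x}_k, ..., x_n)]_{\alpha_1 \cdots \alpha_n}^{\beta_1 \cdots \beta_n} = \Big[ R_{12}(x_k, x_1) R_{23}(x_k, x_2) \cdots \widehat{R}_{k\,(k+1)}(x_k, x_k) \cdots R_{(n-1)\,n}(x_k, x_n)\, B_n(x_k) \cdot R_{(n-1)\,n}(x_n, -x_k) \cdots \widehat{R}_{k\,(k+1)}(x_k, -x_k) \cdots R_{23}(x_2, -x_k) R_{12}(x_1, -x_k) \Big]_{\alpha_1 \cdots \alpha_n}^{\beta_1 \cdots \beta_n}. \tag{3.30}$$
We turn now to the boundary generators, defining $b_\alpha^\beta(x)$ as the multiplicative operator whose action on $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ is given by Eq. (3.7) and
$$\left[b_\alpha^\beta(x)\varphi\right]^{(n)}_{\gamma_1...\gamma_n}(x_1, ..., x_n) = \big[R_{01}(x, x_1) R_{12}(x, x_2) \cdots R_{(n-1)\,n}(x, x_n)\, B_n(x) \cdot R_{(n-1)\,n}(x_n, -x) \cdots R_{12}(x_2, -x) R_{01}(x_1, -x)\big]_{\alpha\gamma_1...\gamma_n}^{\beta\delta_1...\delta_n}\, \varphi^{(n)}_{\delta_1...\delta_n}(x_1, ..., x_n), \tag{3.31}$$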
for $n \geq 1$. Notice that the boundary generators $\{b_\alpha^\beta(x)\}$ preserve the particle number.

By construction $\{a(f), a^*(f)\}$ and $\{b_\alpha^\beta(x)\}$ leave invariant $\mathcal{F}^0_{R,B;m}(\mathcal{H})$, which we take as the domain $\mathcal{D}$, whose existence was required in the definition of Fock representation. For deriving the commutation properties on $\mathcal{D}$ it is convenient to introduce the operator-valued distributions $a_\alpha(x)$ and $a^{*\alpha}(x)$ defined by
$$a(f) = \int d^s x\, f^{\dagger\alpha}(x)\, a_\alpha(x), \qquad a^*(f) = \int d^s x\, f_\alpha(x)\, a^{*\alpha}(x). \tag{3.32}$$
After a straightforward but lengthy computation, one verifies the validity of the following statement.

Proposition 5. The operator-valued distributions $\{a_\alpha(x), a^{*\alpha}(x)\}$ and $\{b_\alpha^\beta(x)\}$ satisfy the relations (2.3–8) on $\mathcal{D}$.

Assuming that $\mathcal{M}(R, B) \neq \emptyset$, we proceed further by implementing the involutions $\{I_m : m \in \mathcal{M}(R, B)\}$. For this purpose we have to construct a sesquilinear form $\langle \cdot\,, \cdot \rangle_m$ on $\mathcal{D}$, such that the mapping (2.12–14) is realized as the conjugation with respect to $\langle \cdot\,, \cdot \rangle_m$. Let us consider the following form on $\mathcal{D}$:
$$\langle \varphi, \psi \rangle_m = \sum_{n=0}^{\infty} \langle \varphi^{(n)}, \psi^{(n)} \rangle_m, \tag{3.33}$$
where
$$\langle \varphi^{(0)}, \psi^{(0)} \rangle_m = \overline{\varphi^{(0)}}\, \psi^{(0)}, \tag{3.34}$$
$$\langle \varphi^{(n)}, \psi^{(n)} \rangle_m = \int d^s x_1 \cdots d^s x_n\, \varphi^{(n)\dagger\alpha_1...\alpha_n}(x_1, ..., x_n)\, m_{\alpha_1}^{\beta_1}(x_1) \cdots m_{\alpha_n}^{\beta_n}(x_n)\, \psi^{(n)}_{\beta_1...\beta_n}(x_1, ..., x_n). \tag{3.35}$$
The right-hand side of (3.33) always makes sense because for any $\varphi, \psi \in \mathcal{D}$ the series is actually a finite sum. Using that $m(x)$ satisfies Eqs. (2.15) and (3.6), one easily proves

Proposition 6. The inner product defined by (3.33–35) is nondegenerate on $\mathcal{D}$ and the involution $I_m$ is implemented by $\langle \cdot\,, \cdot \rangle_m$-conjugation.

The next question concerns the positivity of $\langle \cdot\,, \cdot \rangle_m$. This point is conveniently discussed after introducing the subset $\mathcal{M}(R, B)_+$ of those elements of $\mathcal{M}(R, B)$ which are positive definite almost everywhere in $\mathbb{R}^s$. One has indeed

Proposition 7. The inner product $\langle \cdot\,, \cdot \rangle_m$ is positive definite on $\mathcal{D}$ if and only if $m \in \mathcal{M}(R, B)_+$.
Proof. From Eq. (3.35) it is clear that if $m \in \mathcal{M}(R, B)_+$ then the inner product is positive definite. Conversely, suppose that $\langle \cdot\,, \cdot \rangle_m$ is positive definite. Let $y \in \mathbb{R}^s$ be a fixed nonzero vector, and take an arbitrary $f \in \mathcal{H}$ with support lying in the half space $x \cdot y \geq 0$. Consider the 1-particle state
$$\varphi_\alpha(x) = \left[P^{(1)}_{R,B} f\right]_\alpha(x) = \frac{1}{2}\left[f_\alpha(x) + B_\alpha^\beta(x) f_\beta(-x)\right]. \tag{3.36}$$
Using Eqs. (3.6, 10) and the support properties of $f_\alpha$, one gets
$$\langle \varphi, \varphi \rangle_m = \frac{1}{2} \int d^s x\, f^{\dagger\alpha}(x)\, m_\alpha^\beta(x)\, f_\beta(x). \tag{3.37}$$
Since $f$ is arbitrary, positivity of $\langle \cdot\,, \cdot \rangle_m$ implies that $m(x)$ is positive definite almost everywhere in the half space $x \cdot y \geq 0$. Finally, the arbitrariness of $y$ allows one to extend the validity of this conclusion to $\mathbb{R}^s$.

Proposition 7 shows that there are two kinds of Fock representations of $\mathcal{B}_R$. The representation $\mathcal{F}_{R,B;m}$ will be called of type A if $\langle \cdot\,, \cdot \rangle_m$ is positive definite; otherwise we will say that $\mathcal{F}_{R,B;m}$ is of type B. The standard probabilistic interpretation of quantum field theory applies directly only to the A-series. This does not mean however that the B-series has no physical applications. In the latter case one has to isolate first a physical subspace where $\langle \cdot\,, \cdot \rangle_m$ is nonnegative. This is usually done by symmetry considerations and may depend on the specific model under consideration.

The final step in completing the derivation of $\mathcal{F}_{R,B;m}$ is the construction of the representation space $\mathcal{L}$. It is necessary at this stage to consider the classes A and B separately. For $m \in \mathcal{M}(R, B)_+$ the inner product space $\{\mathcal{D}, \langle \cdot\,, \cdot \rangle_m\}$ is actually a pre-Hilbert space. Let $\mathcal{F}_{R,B;m}(\mathcal{H})$ be the completion of $\mathcal{D}$ with respect to the Hilbert space topology. Clearly $\mathcal{L} = \mathcal{F}_{R,B;m}(\mathcal{H})$ satisfies all the requirements. For type B representations there is no distinguished Hilbert space topology for completing $\mathcal{D}$. A natural substitute is the topology $\tau$ defined by the family of seminorms
$$s_\psi(\varphi) \equiv |\langle \psi, \varphi \rangle_m|, \qquad \varphi, \psi \in \mathcal{D}. \tag{3.38}$$
It turns out [2] that $\tau$ is the weakest locally convex topology in which $\langle \cdot\,, \cdot \rangle_m$ is separately $\tau$-continuous. Moreover, $\tau$ is a Hausdorff topology, because $\langle \cdot\,, \cdot \rangle_m$ is nondegenerate. Therefore $\mathcal{D}$ admits a unique (up to isomorphism) $\tau$-completion, which has all the needed properties and provides the space $\mathcal{L}$ for the B-series.

We conclude this section by a general observation, which concerns A-type representations only and is based on the fact that any $m \in \mathcal{M}(R, B)_+$ can be written in the form $m(x) = p^\dagger(x)\, p(x)$, where $p(x)$ is an invertible matrix. Notice that $p(x)$ is not unitary unless $m(x) = e$. It is easy to show that the mapping induced by
$$a_\alpha(x) \longmapsto p_\alpha^\beta(x)\, a_\beta(x), \qquad a^{*\alpha}(x) \longmapsto a^{*\beta}(x)\, (p^{-1})_\beta^\alpha(x), \tag{3.39}$$
$$b_\alpha^\beta(x) \longmapsto p_\alpha^\gamma(x)\, b_\gamma^\delta(x)\, (p^{-1})_\delta^\beta(-x) \tag{3.40}$$
is an isomorphism between $\{\mathcal{B}_R, I_m\}$ and $\{\mathcal{B}_{R'}, I_e\}$, where
$$R'^{\,\beta_1\beta_2}_{\alpha_1\alpha_2}(x_1, x_2) = p_{\alpha_1}^{\gamma_1}(x_1)\, p_{\alpha_2}^{\gamma_2}(x_2)\, R_{\gamma_1\gamma_2}^{\delta_1\delta_2}(x_1, x_2)\, (p^{-1})_{\delta_1}^{\beta_1}(x_2)\, (p^{-1})_{\delta_2}^{\beta_2}(x_1). \tag{3.41}$$
Setting
$$B'^{\,\beta}_{\alpha}(x) = p_\alpha^\gamma(x)\, B_\gamma^\delta(x)\, (p^{-1})_\delta^\beta(-x), \tag{3.42}$$
one has in addition that $\mathcal{F}_{R,B;m}$ and $\mathcal{F}_{R',B';e}$ are equivalent. In other words, for any $m \in \mathcal{M}(R, B)_+$ one can equivalently replace $I_m$ with $I_e$, suitably modifying (see Eqs. (3.41, 42)) the exchange factor $R$ and the reflection matrix $B$.

Let us mention finally that the above formalism carries over easily to the Fock representations of $\mathcal{B}_{R,\lambda}$. One must only replace the Lebesgue measure $d^s x$ by the $\lambda$-invariant measure $|\det \lambda'(x)|^{1/2}\, d^s x$.

4. Applications

4.1. Free Boson Field on the Half Line. In order to give a first idea about the physical content of the algebra $\mathcal{B}_R$, we focus below on a simple example of quantization in $\mathbb{R}_+$. More precisely, we construct the free boson field $\Phi(t, x)$, satisfying
$$\left(\partial_t^2 - \partial_x^2 + M^2\right) \Phi(t, x) = 0, \qquad x \in \mathbb{R}_+, \tag{4.1}$$
with the boundary condition
$$\lim_{x \downarrow 0}\, (\partial_x - \eta)\, \Phi(t, x) = 0, \qquad \eta \geq 0. \tag{4.2}$$
The standard Neumann and Dirichlet boundary conditions are recovered from (4.2) by setting $\eta = 0$ or taking the limit $\eta \to \infty$ respectively. We will show that the quantization of the system (4.1, 2) can be described in terms of $\mathcal{B}_R$ with $N = 1$ and $R = 1$. The exchange structure of this boundary algebra is trivial, which allows one to isolate and easily illustrate the physical implications of the boundary generator $b(k)$. In this section the arguments of the $\mathcal{B}_R$-generators have the meaning of momenta and are denoted therefore by $k$, $p$, etc. Let us introduce the phase factor
$$B(k) = \frac{k - i\eta}{k + i\eta}. \tag{4.3}$$
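The elementary properties of the phase factor (4.3) that enter the construction can be verified numerically. The sketch below checks that $|B(k)| = 1$, that $B(k)B(-k) = 1$ (the scalar form of Eq. (3.10)), and that $\overline{B(k)} = B(-k)$ (the scalar form of Eq. (3.6) with $m = e$), and it evaluates the Neumann ($\eta = 0$) and Dirichlet ($\eta \to \infty$) limits. The sample momenta and the value of $\eta$ are arbitrary illustrative choices.

```python
def B(k, eta):
    """Phase factor of Eq. (4.3)."""
    return (k - 1j * eta) / (k + 1j * eta)


ks = [-3.0, -0.4, 0.25, 2.0]   # sample momenta (arbitrary, nonzero)
eta = 1.3                      # sample boundary parameter (arbitrary)

# |B(k)| = 1: the boundary only contributes a phase.
unit_modulus = all(abs(abs(B(k, eta)) - 1) < 1e-12 for k in ks)
# B(k) B(-k) = 1: reflection condition, cf. Eq. (3.10).
reflection = all(abs(B(k, eta) * B(-k, eta) - 1) < 1e-12 for k in ks)
# conj B(k) = B(-k): cf. Eq. (3.6) with m = e.
hermitian = all(abs(B(k, eta).conjugate() - B(-k, eta)) < 1e-12 for k in ks)

neumann = B(2.0, 0.0)      # eta = 0 gives B = 1
dirichlet = B(2.0, 1e9)    # large eta approaches B = -1

if __name__ == "__main__":
    print(unit_modulus, reflection, hermitian, neumann, dirichlet)
```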
Then the triplet $\{R = 1, B; m = e\}$ satisfies all requirements of the previous section and one can construct the corresponding Fock representation $\mathcal{F}_{1,B;e}$. Equation (3.31) shows that the operator $b(k)$ acts as a multiplication by $B(k)$. Therefore, one is left in $\mathcal{F}_{1,B;e}$ with the following relations:
$$[a(k), a(p)] = 0, \qquad [a^*(k), a^*(p)] = 0, \qquad [a(k), a^*(p)] = \frac{1}{2}\delta(k - p) + \frac{1}{2}B(k)\,\delta(k + p). \tag{4.4}$$
Notice that these would be the standard canonical commutation relations, apart from the term $B(k)\delta(k + p)$. We define now the field operator
$$\Phi(t, x) = \int_{-\infty}^{\infty} \frac{dk}{\sqrt{2\pi\omega(k)}} \left[a(k)\, e^{-i\omega(k)t + ikx} + a^*(k)\, e^{i\omega(k)t - ikx}\right], \tag{4.5}$$
where
$$\omega(k) = \sqrt{M^2 + k^2}. \tag{4.6}$$
This is just the expression in the case without boundary, but one should keep in mind that now the algebra of creation and annihilation operators is different. By means of (4.4) one easily derives the basic correlator, the two-point Wightman function
$$\langle \Omega, \Phi(t_1, x_1)\Phi(t_2, x_2)\Omega \rangle_e = \int_{-\infty}^{\infty} \frac{dk}{4\pi\omega(k)}\, e^{-i\omega(k)t_{12}} \left[e^{ik(x_1 - x_2)} + B(k)\, e^{ik(x_1 + x_2)}\right], \tag{4.7}$$
where $t_{12} = t_1 - t_2$. The right-hand side of Eq. (4.7) defines a tempered distribution ($B(k)$ is $C^\infty$ and bounded on $\mathbb{R}$), which satisfies Eqs. (4.1, 2). It consists of two terms. The term without $B(k)$ is the usual two-point Wightman function of the system without boundary. The term proportional to $B(k)$ has its origin in the boundary generator and explicitly breaks translation and Lorentz invariance. It is remarkable that in spite of this fact, $\Phi(t, x)$ is a local field. The validity of this statement can be deduced from the commutator
$$[\Phi(t_1, x_1), \Phi(t_2, x_2)] = i D(t_1 - t_2, x_1, x_2). \tag{4.8}$$
One has
$$D(t, x_1, x_2) = \Delta(t, x_1 - x_2) + \widetilde{\Delta}(t, x_1 + x_2), \tag{4.9}$$
where
$$\Delta(t, x_1 - x_2) = -\int_{-\infty}^{\infty} \frac{dk}{2\pi\omega(k)}\, \sin[\omega(k)t]\, e^{ik(x_1 - x_2)} \tag{4.10}$$
is the ordinary Pauli–Jordan function with mass $M$ and
$$\widetilde{\Delta}(t, x_1 + x_2) = -\int_{-\infty}^{\infty} \frac{dk}{2\pi\omega(k)}\, \sin[\omega(k)t]\, B(k)\, e^{ik(x_1 + x_2)}. \tag{4.11}$$
Observing that for $x_1, x_2 \in \mathbb{R}_+$ the inequality $|t_1 - t_2| < |x_1 - x_2|$ implies $|t_1 - t_2| < x_1 + x_2$, one concludes that the locality properties of the field $\Phi$ are governed by the behavior of $\widetilde{\Delta}(t, x)$ for $|t| < x$. The latter can be easily evaluated and using that $\eta \geq 0$, one finds
$$\widetilde{\Delta}(t, x)\Big|_{|t| < x} = 0. \tag{4.12}$$
So, $\Phi(t, x)$ is a local field when $x \in \mathbb{R}_+$. Notice that this is not the case if $\Phi(t, x)$ is considered on the whole real line. The two terms $\Delta$ and $\widetilde{\Delta}$ in the commutator have a very intuitive explanation. As long as $|t_1 - t_2| < |x_1 - x_2|$ no signal can propagate between the points $(t_1, x_1)$ and $(t_2, x_2)$ and the commutator vanishes. When $|x_1 - x_2| < |t_1 - t_2| < x_1 + x_2$ signals can propagate directly between the two points, but they cannot be influenced by the boundary and the only contribution comes from the standard Pauli–Jordan function $\Delta$. As soon as $x_1 + x_2 = |t_1 - t_2|$, signals starting from one of the points can be reflected at the boundary and reach the other point. This phenomenon is responsible for the term $\widetilde{\Delta}$, and is codified in the term proportional to $B(k)$ of the boundary algebra (4.4).

The case $\eta < 0$ is slightly more delicate due to the presence of a bound state in the one-particle energy spectrum, which must be taken into account in the construction of a local field. The results of this subsection can be obviously generalized to higher space-time dimensions.

4.2. Scattering on the Half Line. Before entering the details of the application of $\mathcal{B}_R$ to factorized scattering with reflecting boundary conditions, we will discuss the simple
case of particles of mass $M$ freely moving on $\mathbb{R}_+$ and bouncing over a wall at $x = 0$. The relevant one-particle space is $L^2(\mathbb{R}_+, dx)$. We denote by $\mathcal{D}_\eta \subset L^2(\mathbb{R}_+, dx)$ the subspace of $C^\infty$-functions on $\mathbb{R}_+$, which vanish for sufficiently large $x$, have square integrable first and second derivatives and obey
$$\lim_{x \downarrow 0} \left(\frac{d}{dx} - \eta\right) \varphi(x) = 0. \tag{4.13}$$
The current
$$j = -\frac{i}{2m}\left(\overline{\varphi}\, \frac{d\varphi}{dx} - \frac{d\overline{\varphi}}{dx}\, \varphi\right) \tag{4.14}$$
satisfies $j(0) = 0$ for all $\varphi \in \mathcal{D}_\eta$, thus preventing any probability flow through the wall $x = 0$. For a one-particle Hamiltonian we take
$$H^{(1)} = -\frac{1}{2M}\, \Delta, \tag{4.15}$$
defined on $\mathcal{D}_\eta$. The evolution problem is well posed because $H^{(1)}$, which is obviously symmetric, is actually essentially self-adjoint [22]. A set of (generalized) eigenstates verifying (4.13) is
$$\psi_k(x) = e^{-ikx} + B(k)\, e^{ikx}, \qquad k \in \mathbb{R}, \tag{4.16}$$
where $B(k)$ is given by Eq. (4.3). The eigenvectors (4.16), which represent physically scattering states, satisfy
$$\psi_{-k}(x) = \overline{\psi}_k(x) = B(-k)\, \psi_k(x). \tag{4.17}$$
For $\eta \geq 0$ the systems $\{\psi_k : k > 0\}$ and $\{\psi_{-k} : k > 0\}$ are separately complete and are related via complex conjugation, which in the physical context implements time reversal. When $\eta < 0$, there is in addition a unique bound state
$$\psi_b(x) = \sqrt{-2\eta}\; e^{\eta x}, \tag{4.18}$$
with energy $E = -\eta^2/2M$. The $n$-body Hamiltonian of the associated multiparticle Bose system
$$H^{(n)} = -\frac{1}{2M}\left(\Delta_1 + ... + \Delta_n\right) \tag{4.19}$$
is defined on $\mathcal{D}^n_{\eta+}$, the subspace of symmetric functions in $\mathcal{D}_\eta^{\otimes n}$. Clearly, there is neither particle production nor particle collision in this model. There is however a nontrivial reflection from the boundary, which can be described as follows. One can consider $\psi_k$ as representing a particle which, when time $t \to -\infty$, travels with momentum $-k$ towards the wall. Accordingly, we take
$$|-k\rangle^{\mathrm{in}} = \frac{1}{\sqrt{2\pi}}\, \psi_k(x), \qquad k > 0, \tag{4.20}$$
as a basis of one-particle "in"-states. Concerning the basis of one-particle "out"-states, the analogous consideration gives
$$|k\rangle^{\mathrm{out}} = \frac{1}{\sqrt{2\pi}}\, \overline{\psi}_k(x) = \frac{1}{\sqrt{2\pi}}\, \psi_{-k}(x), \qquad k > 0. \tag{4.21}$$
The scattering operator is defined at this point by
$$S\, |k\rangle^{\mathrm{out}} = |-k\rangle^{\mathrm{in}}. \tag{4.22}$$
For $\eta \geq 0$, $S$ is by construction a unitary operator on $L^2(\mathbb{R}_+, dx)$. For $\eta < 0$, $S$ is defined and unitary on the subspace of $L^2(\mathbb{R}_+, dx)$ which is orthogonal to the bound state (4.18). The one-particle matrix elements of $S$ read
$${}^{\mathrm{out}}\langle k|S|p\rangle^{\mathrm{out}} = {}^{\mathrm{out}}\langle k|-p\rangle^{\mathrm{in}} = \frac{1}{2\pi} \int_0^\infty dx\, \psi_k(x)\, \psi_p(x) = B(k)\, \delta(k - p). \tag{4.23}$$
More generally
$${}^{\mathrm{out}}\langle k_1, ..., k_n | -p_1, ..., -p_n \rangle^{\mathrm{in}} = B(k_1) \cdots B(k_n)\, \delta(k_1 - p_1) \cdots \delta(k_n - p_n), \tag{4.24}$$
provided that $k_1 > ... > k_n > 0$ and $p_1 > ... > p_n > 0$. Our main observation now is that the above simple scattering problem admits a field-theoretic solution in terms of the algebra (4.4). In fact, it is easy to verify that the vacuum expectation values
$$2^n\, \langle a^*(k_1) \cdots a^*(k_n)\Omega,\; a^*(-p_1) \cdots a^*(-p_n)\Omega \rangle_e, \tag{4.25}$$
in the Fock representation $\mathcal{F}_{1,B;e}$ reproduce precisely the transition amplitudes (4.24). We have therefore the following Fock realization
$$|k_1, ..., k_n\rangle^{\mathrm{out}} = 2^{n/2}\, a^*(k_1) \cdots a^*(k_n)\Omega, \qquad k_1 > ... > k_n > 0, \tag{4.26}$$
$$|-p_1, ..., -p_n\rangle^{\mathrm{in}} = 2^{n/2}\, a^*(-p_1) \cdots a^*(-p_n)\Omega, \qquad p_1 > ... > p_n > 0, \tag{4.27}$$
of the interpolating states. Summarizing, the scattering operator of our simple model has a purely algebraic characterization. In this respect, the term proportional to $B(k)$ in (4.4) is the algebraic counterpart of the boundary condition, given analytically by Eq. (4.13).

At this stage we have enough background for facing the more complicated problem of scattering in integrable models with reflecting boundary conditions in 1+1 space-time dimensions. The presence of particle collisions in this case leads in general to the boundary algebras $\mathcal{B}_R$ with $R \neq 1$. Using the Fock representations of $\mathcal{B}_R$, derived in the previous section, we present below a rigorous construction of the S-matrix, which generalizes some previous results [20] valid in the absence of a boundary. We also show that under certain conditions on the triplet $\{R, B; m\}$, the transition amplitudes, originally derived by Cherednik [4], are indeed Hilbert space matrix elements of a unitary operator.

The asymptotic particles of integrable models are parametrized by their rapidity $\theta \in \mathbb{R}$ and internal "isotopic" index $\alpha = 1, ..., N$. We recall that in the case of relativistic dispersion relation the energy-momentum vector is expressed in terms of $\theta$ and the mass $M$ according to
$$p_0 = M \cosh(\theta), \qquad p_1 = M \sinh(\theta). \tag{4.28}$$
An elastic reflection $(p_0, p_1) \longmapsto (p_0, -p_1)$ corresponds therefore to the transformation $\theta \longmapsto -\theta$.

The fundamental building blocks for constructing the scattering operator are the matrices $R_{\alpha_1\alpha_2}^{\beta_1\beta_2}(\theta_1, \theta_2)$ and $B_\alpha^\beta(\theta)$, which are supposed to satisfy Eqs. (2.9, 10) and (3.9, 10). We allow for $R$ to depend on $\theta_1$ and $\theta_2$ separately (and not only on $\theta_1 - \theta_2$), because in general the presence of boundaries breaks down Lorentz invariance.
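The rapidity parametrization (4.28) can be illustrated in a few lines. This minimal sketch only checks the mass-shell condition $p_0^2 - p_1^2 = M^2$ and that $\theta \mapsto -\theta$ flips the sign of $p_1$ while leaving $p_0$ unchanged; the function name and the numerical values are hypothetical.

```python
import math


def momentum(theta, mass):
    """Energy-momentum (p0, p1) of Eq. (4.28) for rapidity theta."""
    return mass * math.cosh(theta), mass * math.sinh(theta)


mass, theta = 1.5, 0.8                   # arbitrary sample values
p0, p1 = momentum(theta, mass)
q0, q1 = momentum(-theta, mass)          # elastic reflection: theta -> -theta

mass_shell_error = abs(p0 ** 2 - p1 ** 2 - mass ** 2)
reflection_error = max(abs(q0 - p0), abs(q1 + p1))

if __name__ == "__main__":
    print(mass_shell_error, reflection_error)
```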
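A crucial observation is that the algebra $\mathcal{B}_R$ alone does not determine the scattering operator $S$ we are looking for: one must fix in addition an involution $I_m$. The latter selects a Fock representation $\mathcal{F}_{R,B;m}$, which is the main ingredient for constructing $S$. Postponing the discussion of the physical meaning of the choice of $m \in \mathcal{M}(R, B)$ to the end of this section, it might be instructive for the time being to describe the set $\mathcal{M}(R, B)$ for some familiar integrable model. We choose the $SU(2)$ Thirring model. In this case $N = 2$ and setting $\theta_{12} = \theta_1 - \theta_2$ the relevant $R$-matrix reads [1]
$$R(\theta_1, \theta_2) = \frac{i\pi\, \rho(\theta_{12})}{(i\pi - \theta_{12})\, \rho(-\theta_{12})} \sum_{\alpha,\beta=1}^{2} \left[ E_{\alpha\alpha} \otimes E_{\beta\beta} + \frac{\theta_{12}}{i\pi}\, (-1)^{\alpha+\beta}\, E_{\alpha\beta} \otimes E_{\beta\alpha} \right], \tag{4.29}$$
where $E_{\alpha\beta}$ are the Weyl matrices and
$$\rho(\theta) = \Gamma\!\left(\frac{1}{2} + \frac{\theta}{2\pi i}\right) \Gamma\!\left(1 - \frac{\theta}{2\pi i}\right). \tag{4.30}$$
The general solution of Eqs. (3.9, 10), subject to the physical constraint of boundary crossing symmetry [10], is given in [3]. Let us concentrate for simplicity on the diagonal solutions
$$B(\theta) = \frac{\beta(\theta)}{\beta(-\theta)} \left[ \frac{\eta - \theta}{\eta + \theta}\, E_{11} + E_{22} \right], \tag{4.31}$$
with $\eta \in \mathbb{C}$ and
$$\beta(\theta) = \Gamma\!\left(\frac{3}{4} + \frac{\theta}{2\pi i}\right) \Gamma\!\left(1 - \frac{\theta}{2\pi i}\right) \Gamma\!\left(\frac{\eta + i\pi - \theta}{2\pi i}\right) \Gamma\!\left(\frac{\eta + 2\pi i + \theta}{2\pi i}\right). \tag{4.32}$$
Let $\mu_+$ ($\mu_-$) be any measurable real-valued even (odd) function, such that $\mu_\pm$ and $1/\mu_\pm$ are bounded. Then, if $\mathrm{Re}\, \eta = 0$, the set $\mathcal{M}(R, B)$ contains all matrices of the form
$$m(\theta) = \mu_+(\theta) \left(E_{11} + \xi E_{22}\right), \qquad \xi \in \mathbb{R},\ \xi \neq 0. \tag{4.33}$$
In addition, for $\eta = 0$ one has the solutions
$$m(\theta) = \mu_-(\theta) \left(\zeta E_{12} + \bar{\zeta} E_{21}\right), \qquad \zeta \in \mathbb{C}. \tag{4.34}$$
From Eq. (4.33) it follows that $\mathcal{M}(R, B)_+ \neq \emptyset$.

After this concrete example illustrating the set $\mathcal{M}(R, B)$, we return to the general framework. The idea is to extend the formalism, developed at the beginning of this section for the Schrödinger particle on the half line, to the case of integrable models. In what follows we assume that
$$\mathcal{M}(R, B)_+ \neq \emptyset \tag{4.35}$$
and consider representations $\mathcal{F}_{R,B;m}$ of type A. The physical motivation for this restriction is quite evident. According to Proposition 7, it ensures positivity of the metric in the asymptotic spaces $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$, which we are going to construct now. For this purpose we introduce the following relation in $C_0^\infty(\mathbb{R})$:
$$f_1 \succ f_2 \iff \theta_1 > \theta_2 \quad \forall\, \theta_1 \in \mathrm{supp}(f_1),\ \forall\, \theta_2 \in \mathrm{supp}(f_2). \tag{4.36}$$
We will adopt also the notation
$$f \succ 0 \iff \theta > 0 \quad \forall\, \theta \in \mathrm{supp}(f), \tag{4.37}$$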
and
$$\widetilde{f}(\theta) = f(-\theta). \tag{4.38}$$
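As suggested by Eqs. (4.26, 27), $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ are generated by finite linear combinations of the vectors ($k \geq 1$)
$$\mathcal{E}^{\mathrm{out}} = \{\Omega,\ a^*(f_1) \cdots a^*(f_k)\Omega : f_{1\alpha_1} \succ \cdots \succ f_{k\alpha_k} \succ 0,\ \forall\, \alpha_1, ..., \alpha_k = 1, ..., N\} \tag{4.39}$$
and
$$\mathcal{E}^{\mathrm{in}} = \{\Omega,\ a^*(\widetilde{g}_1) \cdots a^*(\widetilde{g}_k)\Omega : g_{1\beta_1} \succ \cdots \succ g_{k\beta_k} \succ 0,\ \forall\, \beta_1, ..., \beta_k = 1, ..., N\} \tag{4.40}$$
respectively. By construction both $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ are linear subspaces of the Hilbert space $\mathcal{F}_{R,B;m}(\mathcal{H})$.

One should notice that in principle there are elements of $\mathcal{F}^0_{R,B;m}(\mathcal{H})$ which belong neither to $\mathcal{F}^{\mathrm{out}}$ nor to $\mathcal{F}^{\mathrm{in}}$. We call them mixed vectors. Linear combinations involving both in- and out-states provide in general examples of such vectors. In spite of the existence of mixed vectors, the subspaces $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ satisfy a sort of asymptotic completeness, which is essential for constructing the S-matrix. More precisely, one has

Proposition 8. $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ separately are dense in $\mathcal{F}_{R,B;m}(\mathcal{H})$.

Proof. We focus on $\mathcal{F}^{\mathrm{out}}$. Let $\varphi \in \mathcal{F}_{R,B;m}(\mathcal{H})$ and let us assume that
$$\langle \varphi, \psi \rangle_m = 0 \qquad \forall\, \psi \in \mathcal{F}^{\mathrm{out}}. \tag{4.41}$$
In order to prove the thesis, we have to show that $\varphi = \left(\varphi^{(0)}, \varphi^{(1)}, ..., \varphi^{(n)}, ...\right) = 0$. Obviously $\varphi^{(0)} = 0$. Let us consider $\varphi^{(n)}$ for arbitrary but fixed $n \geq 1$. Equation (3.27) and Eq. (4.41) imply that
$$\langle \varphi^{(n)}, a^*(f_1) \cdots a^*(f_n)\Omega \rangle_m = \int d\theta_1 \cdots d\theta_n\, \varphi^{(n)\dagger\alpha_1...\alpha_n}(\theta_1, ..., \theta_n)\, m_{\alpha_1}^{\beta_1}(\theta_1) \cdots m_{\alpha_n}^{\beta_n}(\theta_n)\, f_{1\beta_1}(\theta_1) \cdots f_{n\beta_n}(\theta_n) = 0 \tag{4.42}$$
for all $f_1, ..., f_n$ such that $f_{1\alpha_1} \succ \cdots \succ f_{n\alpha_n} \succ 0$ $\forall\, \alpha_1, ..., \alpha_n = 1, ..., N$. Therefore
$$\varphi^{(n)}_{\alpha_1...\alpha_n}(\theta_1, ..., \theta_n) = 0 \tag{4.43}$$
in the domain $\theta_1 > \cdots > \theta_n > 0$. Finally, using that $\varphi^{(n)} \in \mathcal{H}^n_{R,B}$ has definite exchange and reflection properties described by Eqs. (3.24, 25), one can extend the domain of validity of (4.43) and conclude that $\varphi^{(n)}$ actually vanishes almost everywhere in $\mathbb{R}^n$. Clearly, a similar argument applies also to the case of $\mathcal{F}^{\mathrm{in}}$.

We observe in passing that the definition of $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ does not explicitly involve the boundary generators $\{b_\alpha^\beta(\theta)\}$. This fact is not surprising because $\Omega$ is cyclic with respect to $\{a^{*\alpha}(\theta)\}$.

At this point we are ready to define the scattering matrix $S$ and to prove that it is a unitary operator in $\mathcal{F}_{R,B;m}(\mathcal{H})$. The construction consists essentially of three steps. One starts by defining $S$ as the following mapping of $\mathcal{E}^{\mathrm{out}}$ onto $\mathcal{E}^{\mathrm{in}}$:
$$S\, \Omega = \Omega, \tag{4.44}$$
$$S\, a^*(g_1) a^*(g_2) \cdots a^*(g_k)\Omega = a^*(\widetilde{g}_1) a^*(\widetilde{g}_2) \cdots a^*(\widetilde{g}_k)\Omega, \tag{4.45}$$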
where $g_{1\beta_1} \succ \cdots \succ g_{k\beta_k} \succ 0$, $\forall\, \beta_1, ..., \beta_k = 1, ..., N$. It is not difficult to check that
$$\langle S\psi^{\mathrm{out}}, S\varphi^{\mathrm{out}} \rangle_m = \langle \psi^{\mathrm{out}}, \varphi^{\mathrm{out}} \rangle_m, \qquad \forall\, \psi^{\mathrm{out}}, \varphi^{\mathrm{out}} \in \mathcal{E}^{\mathrm{out}}. \tag{4.46}$$
Moreover, $S$ is invertible and
$$\langle S^{-1}\psi^{\mathrm{in}}, S^{-1}\varphi^{\mathrm{in}} \rangle_m = \langle \psi^{\mathrm{in}}, \varphi^{\mathrm{in}} \rangle_m, \qquad \forall\, \psi^{\mathrm{in}}, \varphi^{\mathrm{in}} \in \mathcal{E}^{\mathrm{in}}. \tag{4.47}$$
The second step is to extend $S$ and $S^{-1}$ by linearity to the whole $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ respectively. Clearly, one has to show that these extensions are correctly defined. Consider for instance $S$ and suppose that there exists a sequence
$$g^i_{1\beta_1} \succ \cdots \succ g^i_{k\beta_k} \succ 0, \qquad \forall\, \beta_1, ..., \beta_k = 1, ..., N, \qquad i = 1, ..., M,$$
such that
$$a^*(g_1) a^*(g_2) \cdots a^*(g_k)\Omega = \sum_{i=1}^{M} a^*(g_1^i) a^*(g_2^i) \cdots a^*(g_k^i)\Omega. \tag{4.48}$$
In order to prove that the linear extension of $S$ is not ambiguous, we must show that
$$a^*(\widetilde{g}_1) a^*(\widetilde{g}_2) \cdots a^*(\widetilde{g}_k)\Omega = \sum_{i=1}^{M} a^*(\widetilde{g}_1^{\,i}) a^*(\widetilde{g}_2^{\,i}) \cdots a^*(\widetilde{g}_k^{\,i})\Omega. \tag{4.49}$$
The argument is as follows. In the domain $\theta_1 > \theta_2 > ... > \theta_k > 0$ Eq. (4.48) implies that
$$g_{1\beta_1}(\theta_1)\, g_{2\beta_2}(\theta_2) \cdots g_{k\beta_k}(\theta_k) = \sum_{i=1}^{M} g^i_{1\beta_1}(\theta_1)\, g^i_{2\beta_2}(\theta_2) \cdots g^i_{k\beta_k}(\theta_k). \tag{4.50}$$
Because of the support properties of $\{g_j\}$ and $\{g_j^i\}$ one has that Eq. (4.50) holds actually in $\mathbb{R}^k$, which projected by $P^{(k)}_{R,B}$ proves the validity of Eq. (4.49).

It is easy to see also that Eqs. (4.46, 47) remain valid for the linear extensions of $S$ and $S^{-1}$ on $\mathcal{F}^{\mathrm{out}}$ and $\mathcal{F}^{\mathrm{in}}$ respectively. This fact implies in particular that both $S$ and $S^{-1}$ are bounded linear operators.

Finally, one extends $S$ and $S^{-1}$ by continuity to $\mathcal{F}_{R,B;m}(\mathcal{H})$. Because of the asymptotic completeness proven in Proposition 8, the extensions are unique and define the unitary scattering operator and its inverse. As it should be expected from integrability, one has $S\, \mathcal{H}^n_{R,B} \subset \mathcal{H}^n_{R,B}$. Notice however, that in contrast to the case without boundary, where the scattering operator leaves invariant each one-particle state, the S-matrix constructed above acts nontrivially already in $\mathcal{H}^1_{R,B}$.

By construction the matrix elements of $S$ between out-states in the Fock space $\mathcal{F}_{R,B;e}(\mathcal{H})$ reproduce precisely the transition amplitudes derived by Cherednik [4]. Since the latter are referred to the involution $I_e$, a natural question arising at this point concerns the physical meaning of other possible choices of $m \in \mathcal{M}(R, B)_+$. For answering this question we consider two generic asymptotic states $\varphi^{\mathrm{in}} \in \mathcal{F}^{\mathrm{in}}$ and $\psi^{\mathrm{out}} \in \mathcal{F}^{\mathrm{out}}$. If both $m, e \in \mathcal{M}(R, B)_+$, one may compare the transition amplitudes associated with the involutions $I_m$ and $I_e$. One finds
$$\langle \psi^{\mathrm{out}}, \varphi^{\mathrm{in}} \rangle_m = \langle \psi^{\mathrm{out}}, \varphi^{\mathrm{in}}_d \rangle_e = \langle \psi^{\mathrm{out}}_d, \varphi^{\mathrm{in}} \rangle_e, \tag{4.51}$$
where $\varphi^{\mathrm{in}}_d$ and $\psi^{\mathrm{out}}_d$ are the "dressed" in- and out-states
(n) γ1 γn in (n) (ϕin d )α1 ...αn (θ1 , ..., θn ) = mα1 (θ1 ) · · · mαn (θn )(ϕ )γ1 ...γn (θ1 , ..., θn ),
(4.52)
†γ1 †γn out (n) (ψdout )(n) β1 ...βn (θ1 , ..., θn ) = m β1 (θ1 ) · · · m βn (θn )(ψ )γ1 ...γn (θ1 , ..., θn ).
(4.53)
It follows from Eq. (4.51) that the effect of the involution Im is exactly reproduced in FR,B;e by appropriate dressing (4.52, 53) of the in- or out-states. The results of this section can be summarized as follows. Proposition 9. Suppose that the exchange factor R and the reflection matrix B satisfy (2.9,10) and (3.9,10). Assume also that M(R, B)+ 6= ∅. Then the scattering operator associated with the Fock representation FR,B;m is unitary for any m ∈ M(R, B)+ . Conditions (2.9, 10) and (3.9, 10) are standard for the scattering on the half line. The same is true for (2.15), which is usually imposed in the slightly stronger form β1 β2 1 β2 R†β α1 α2 (θ1 , θ2 ) = Rα1 α2 (θ2 , θ1 ),
(4.54)
known as Hermitian analyticity. We emphasize that condition (3.6), which is often overlooked in the physical literature, is essential for the unitarity of S and represents therefore an useful criterion for selecting possible reflection matrices. In the case of the SU (2) Thirring model one gets in this way the restriction Re η = 0 in Eqs. (4.31, 32). Let us mention also that if R depends on the difference θ12 ≡ θ1 − θ2 , one usually assumes [13,26] that R admits a suitable continuation to the complex θ12 -plane, which satisfies crossing symmetry, has certain pole structure, etc. In that case also B is required to have a continuation in the complex θ-plane, which obeys boundary crossing symmetry [10]. In our example (see Eqs. (4.29–32)) R and B admit such continuations. Finally, the bootstrap equations [9,26] reduce further the set of physically relevant exchange and reflection matrices. From Proposition 9 it follows however that the unitarity of S as an operator in FR,B;m (H) depends exclusively on the behavior of R and B for real values of the rapidities. 5. Outlook and Conclusions In the present paper we have introduced the associative algebra BR and investigated some of its basic features. BR admits two series of Fock representations, which have been constructed explicitly. The positive metric representations provide a framework for deriving Cherednik’s transition amplitudes and proving that they are indeed the matrix elements of a unitary scattering operator. We have shown also that the algebra B+ enters the Bose quantization on the half line. The associated Klein–Gordon field is local, in spite of the breakdown of the Poincar´e symmetry. BR is actually a member of a large family of algebras BR,λ , which are defined by Eqs. (2.23–26). BR,λ can be studied in the same way as BR and are expected to find relevant applications to statistical models with boundaries. 
It will be interesting in this respect to extend to BR,λ the notion of second R-quantization, developed in [17,18] for the Z-F algebra AR. We point out finally that one can further generalize BR,λ, eliminating the condition (2.9) and/or (3.10). In this case, instead of the Weyl group Wn, one has to deal with an infinite dimensional group W′n, which is freely generated by the elements {τ′, σ′i : i = 1, ..., n − 1} satisfying the relations (3.14,15), but not (3.16). Recent investigations [15] actually show that the group W′n appears in many different physical and mathematical contexts. We hope to say more about this generalization of BR,λ in the near future.
Appendix

In quantum field theory on the half line it is sometimes necessary to allow for a quantum number j = 1, ..., NB to reside on the boundary [10]. We will show below that this case is still described by the boundary algebra {BR, Im}, but corresponds to representations with slightly more general structure than that of FR,B;m. To be precise, instead of the requirement 4 formulated in the beginning of Sect. 3, these representations satisfy:

4′. There exists an NB-dimensional subspace (vacuum space) V ⊂ D, which is annihilated by $a_\alpha(x)$. Moreover, V is cyclic with respect to $\{a^{*\alpha}(x)\}$ and ⟨· , ·⟩m is positive definite on V.

For NB = 1 we recover the property 4 specifying FR,B;m. Let us now briefly describe the main features of the representations characterized by the conditions 1-3 and 4′. Let $\Omega_1, \dots, \Omega_{N_B}$ be an orthonormal basis in V. We denote by $P_0$ the ⟨· , ·⟩m-orthogonal projection on V and define
$$ B_\alpha^\beta(x) \equiv P_0\, b_\alpha^\beta(x)\, P_0 . \qquad (A.1) $$
Notice that $B_\alpha^\beta(x)$ is now an operator, carrying the vacuum space into itself,
$$ B_\alpha^\beta(x)\,\Omega_j = B_{\alpha\, j}^{\beta\, k}(x)\,\Omega_k . \qquad (A.2) $$
The following obvious generalization of Proposition 3 holds.

Proposition 3′. The vacuum space V is unique and satisfies
$$ b_\alpha^\beta(x)\big|_V = B_\alpha^\beta(x)\big|_V . \qquad (A.3) $$
Projecting the relevant equations on the vacuum space, one immediately verifies the validity of (3.6,9,10) as operator equations on V. Summarizing, the basic input for constructing the above more general class of representations of {BR, Im} is still the triplet {R, B; m}, the novelty being that $B_\alpha^\beta(x)$ are operators which satisfy (3.6,9,10) on V. Apart from the following minor modifications, the construction precisely follows that described in Sect. 3. First of all, the elements of $H^n_{R,B}$ carry an extra lower index varying from 1 to NB. In the scalar product this index is saturated between the two states. Second, performing the substitution $B_{\alpha_i}^{\beta_i}(x) \mapsto B_{\alpha_i\, j}^{\beta_i\, k}(x)$ in Eq. (3.20), the operator $B_i(x)$ becomes an NB × NB matrix, which inserted in (3.30,31) acts on the states by a standard matrix multiplication.

References
1. Belavin, A.A.: Exact Solution of the Two-dimensional Model with Asymptotic Freedom. Phys. Lett. B 87, 117–123 (1979)
2. Bognár, J.: Indefinite Inner Product Spaces. Berlin: Springer Verlag, 1974
3. Chao, L., Hou, B., Shi, K., Wang, Y., Yang, W.: Bosonic Realization of Boundary Operators in SU(2)-invariant Thirring Model. Int. J. Mod. Phys. A 10, 4469–4482 (1995)
4. Cherednik, I.V.: Factorizing Particles on a Half Line and Root Systems. Theor. Math. Phys. 61, 977–983 (1984)
5. Corrigan, E., Dorey, P.E., Rietdijk, H.R.: Aspects of Affine Toda Field Theory on a Half Line. Suppl. Prog. Theor. Phys. 118, 143–164 (1995)
6. Faddeev, L.D.: Quantum Completely Integrable Models in Field Theory. Soviet Sci. Rev. Sect. C 1, 107–155 (1980)
7. Fendley, P., Saleur, H.: Deriving Boundary S Matrices. Nucl. Phys. B 428, 681–693 (1994)
8. Fring, A., Köberle, R.: Affine Toda Field Theory in the Presence of Reflecting Boundaries. Nucl. Phys. B 419, 647–664 (1994)
9. Fring, A., Köberle, R.: Factorized Scattering in the Presence of Reflecting Boundaries. Nucl. Phys. B 421, 159–172 (1994)
10. Ghoshal, S., Zamolodchikov, A.B.: Boundary S Matrix and Boundary State in Two-Dimensional Integrable Quantum Field Theory. Int. J. Mod. Phys. A 9, 3841–3886 (1994)
11. Ghoshal, S.: Boundary S-matrix of the O(N)-Symmetric Non-Linear Sigma Model. Phys. Lett. B 334, 363–368 (1994)
12. Ghoshal, S.: Bound State Boundary S-Matrix of the Sine-Gordon model. Int. J. Mod. Phys. A 9, 4801–4810 (1994)
13. Karowski, M., Weisz, P.: Exact Form Factors in (1+1)-Dimensional Field Theoretic Models with Soliton Behavior. Nucl. Phys. B 139, 455–476 (1978)
14. Kuczma, M., Choczewski, B., Ger, R.: Iterative Functional Equations. Cambridge: Cambridge University Press, 1990
15. Kulish, P.P., Sasaki, R.: Covariance Properties of Reflection Equation Algebras. Progr. Theor. Phys. 89, 741–761 (1993)
16. LeClair, A., Mussardo, G., Saleur, H., Skorik, S.: Boundary Energy and Boundary States in Integrable Quantum Field Theories. Nucl. Phys. B 453, 581–618 (1995)
17. Liguori, A., Mintchev, M.: Fock Representations of Quantum Fields with Generalized Statistics. Commun. Math. Phys. 169, 635–652 (1995)
18. Liguori, A., Mintchev, M., Rossi, M.: Unitary Group Representations in Fock Spaces with Generalized Exchange Properties. Lett. Math. Phys. 35, 163–177 (1995)
19. Liguori, A., Mintchev, M.: Boundary Exchange Algebras. Preprint IFUP-TH 21/96, March 1996
20. Liguori, A., Mintchev, M., Rossi, M.: Fock Representations of Exchange Algebras with Involution. J. Math. Phys. 38, 2888–2898 (1997)
21. Penati, S., Zanon, D.: Quantum Integrability in Two-Dimensional Systems with Boundary. Phys. Lett. B 358, 63–72 (1995)
22. Reed, M., Simon, B.: Methods in Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
23. Saleur, H., Skorik, S., Warner, N.P.: The Boundary Sine-Gordon Theory: Classical and Semi-Classical Analysis. Nucl. Phys. B 441, 421–436 (1995)
24. Sklyanin, E.K.: Boundary Conditions for Integrable Quantum Systems. J. Phys. A: Math. Gen. 21, 2375–2389 (1988)
25. Warner, N.P.: Supersymmetry in Boundary Integrable Models. Nucl. Phys. B 450, 663–694 (1995)
26. Zamolodchikov, A.B., Zamolodchikov, A.B.: Factorized S-Matrices in Two Dimensions as the Exact Solutions of Certain Relativistic Quantum Field Theory Models. Ann. Phys. 120, 253–291 (1979)
27. Zhao, L.: Fock Spaces with Reflection Condition and Generalized Statistics. Preprint hep-th 96040024

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 591 – 611 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Semi-Infinite Cohomology of Affine Lie Algebras Stephen Hwang Department of Engineering Sciences, Physics and Mathematics, Karlstad University, S-651 88 Karlstad, Sweden. E-mail: [email protected] Received: 21 November 1996 / Accepted: 5 November 1997
Abstract: We study the semi-infinite or BRST cohomology of affine Lie algebras in detail. This cohomology is relevant in the BRST approach to gauged WZNW models. Our main result is to prove necessary and sufficient conditions on ghost numbers and weights for non-trivial elements in the cohomology. In particular we prove the existence of an infinite sequence of elements in the cohomology for non-zero ghost numbers. This will imply that the BRST approach to the topological WZNW model admits many more states than a conventional coset construction. This conclusion also applies to some non-topological models. Our work will also contain results on the structure of Verma modules over affine Lie algebras. In particular, we generalize the results of Verma and Bernstein-Gel’fand-Gel’fand, for finite dimensional Lie algebras, on the structure and multiplicities of Verma modules. The present work gives the theoretical basis of the explicit construction of the elements in cohomology presented previously. Our analysis proves and makes use of the close relationship between highest weight null-vectors and elements of the cohomology.
1. Introduction and Summary of Results

The present work studies the semi-infinite or BRST cohomology of affine Lie algebras. The motivation comes from the quantization of Wess-Zumino-Novikov-Witten (WZNW) models. These models play an essential part in the understanding and classification of conformal field theories. The BRST symmetry arises as a consequence of the gauging of a WZNW model w.r.t. a subgroup [1]. The constraints associated with this BRST symmetry are the generators of an affine Lie algebra $g' = g_k \oplus \tilde g_{\tilde k}$. Here $g_k$ and $\tilde g_{\tilde k}$ correspond to the same finite dimensional Lie algebra, but have different central elements $k$ and $\tilde k = -k - 2c_{\bar g}$ (see Sect. 2 for notation). The latter affine Lie algebra corresponds to an auxiliary, and in general non-unitary, WZNW model that arose in the
derivation in [1]. The physical states in the gauged WZNW model are now given by the non-trivial elements of the resulting BRST cohomology. In [2] it was proved that the BRST approach was equivalent to the conventional coset construction, so that the states were ghost-free and satisfied the usual highest weight conditions w.r.t. the subalgebra gk. The condition for this proof was that one selected a specific range of representations for the auxiliary WZNW model. For the original ungauged WZNW model the range of representations was assumed to be the integrable ones. In this work we will consider completely general highest weight representations (an analogous treatment may be given for lowest weight representations). The motivation for this is that a more general situation than in ref. [2] may be the physically relevant one. Our analysis of the cohomology is most straightforwardly applied to the case when the gauged subgroup coincides with the original group, i.e. when we have a topological WZNW model. But, as we will show, it also generalizes to the most important class of non-topological models, namely those in which the ungauged WZNW model is unitary. In [3] the explicit construction of elements in the BRST cohomology was considered. The procedure presented there for obtaining these elements showed that they were intimately related to certain null-vectors. The key to the construction was to make a selection of null-vectors that generated the states in the cohomology. It turned out that these null-vectors are the highest weight vectors. Then by using the explicit form of highest weight null-vectors given by Malikov, Feigin and Fuchs [4], the elements may be constructed. Our work here may be seen as the theoretical basis of this construction. We will here prove that the procedure in [3] will always generate non-trivial states in the cohomology. We will also prove that the ghost numbers that appeared in the construction are the only possible ones.
The ghost numbers will be uniquely determined by the representations of the algebras involved, and for fixed representations only one value (and its negative) will occur. It is still an open question whether the construction provides all the possible states. We also lack a general result on the dimensionality of the cohomology. The plan of the paper and its main results are the following. In Sect. 2 we give the basic definition and facts for affine algebras and associated modules. In Sect. 3 we discuss the structure of Verma modules. This is important since our analysis of the cohomology relies very heavily on this structure, in particular, on the embeddings of Verma modules into Verma modules. We make extensive use of a technique due to Jantzen [5] to perturb the highest weight of a reducible Verma module to obtain an irreducible one. This perturbation gives also a filtration of modules in a given Verma module. Section 3 contains results on the structure of Verma modules, which we have been unable to find in the literature. The main results are Theorem 3.10 and Theorem 3.11. These are generalizations of results of Verma [15] and Bernstein, Gel’fand and Gel’fand [11], respectively, for finite dimensional Lie algebras and of Rocha-Caridi and Wallach for affine Lie algebras with highest weights on Weyl orbits through dominant weights. The proof of Theorem 3.11 is almost identical to the proof of the finite dimensional case given in [14], Theorem 7.7.7 (which is used also in [6]). The proof of Theorem 3.10 only partly coincides with [6], as the latter does not extend to the case of antidominant weights. In Sect. 4 we proceed to introduce the BRST formalism. Most of the material (except Lemma 4.2) is well-known. In particular, we recapitulate a theorem due to Kugo and Ojima [7]. This theorem will partly be used in the main section, Sect. 5. 
It is also conceptually important in understanding the basic mechanism behind the appearance of elements in the BRST cohomology for non-zero ghost numbers, which we now explain. The theorem, which applies only to irreducible modules, states that elements in the
cohomology form either singlet or doublet (singlet pair) representations w.r.t. the BRST algebra. Furthermore, elements that are trivial or outside the cohomology form so-called quartets in the terminology of [7], i.e. sets of four states, in which two of the elements are BRST exact. In order to obtain an irreducible module, we use a trick due to Jantzen, to perturb a reducible module into an irreducible one. In the irreducible case one may prove (Corollary 5.2) that only ghost-free highest weight states are BRST non-trivial. As the perturbation is taken to zero and the module becomes reducible, certain quartets will evolve into singlet pairs in the following way. Two of the states of the quartet will remain in the irreducible module and will then form a singlet pair in this module. The two other states will become null-states. One of the main results in this paper (Theorem 5.12) is the determination of the relevant null-states. This theorem gives the necessary and sufficient conditions on the null-states to be part of a quartet that will contain a singlet pair as the perturbation is set to zero. The implications of the theorem are exploited in Theorems 5.14 and 5.15, which give the necessary and sufficient conditions on the ghost numbers and weights for which the cohomology is non-trivial. In particular in Theorem 5.15 a sequence of non-trivial BRST invariant states is proved to exist. This sequence is exactly the one for which the construction has been given in [3]. The ghost numbers appearing are ±p, where $p = l(\tilde\lambda) - l(\lambda)$ and $l(\lambda)$ is the length of a Weyl transformation associated with λ (see Sect. 3). This means that for given highest weights $\lambda$ and $\tilde\lambda$ of the original and auxiliary sectors, |p| is fixed to exactly one value. By Theorem 5.14 these ghost numbers and weights are the only non-trivial ones. Let us also address the question of how the embedding of g into a larger algebra may affect our results.
As our approach relies on the use of null-vectors, the crucial question is what happens to the relevant null-vectors as g is embedded. If the null-vector w.r.t. g ceases to be null in the larger algebra, then the entire quartet, to which the vector belongs for non-zero perturbation, will remain a quartet as Jantzen’s perturbation is set to zero. Thus the corresponding elements in the cohomology of g will now be exact. In addition, many more elements may disappear from the cohomology group. This is most evident from the construction in [3], where one used non-trivial states at ghost number p − 1 (p > 0) to construct a BRST non-trivial element of ghost number p. In the extreme case the module over the larger algebra is irreducible and all elements, except the one at zero ghost number, will disappear. There is one case in which the embedding will be straightforward. This will happen when we select integrable representations of the larger algebra. In this case it is known [9] that the irreducible module over the larger algebra is completely reducible w.r.t. any subalgebra. Hence, the results given here generalize directly. This was the situation analyzed in [2]. Corollary 5.11 proves that the solutions given in [2] for a selected range of representations of the auxiliary sector are in fact the unique solutions for zero ghost number for any selection of representations of the auxiliary sector. The existence of extra elements in the cohomology, which have non-zero ghost numbers, implies that the BRST approach to WZNW models is different from the conventional coset approach. This applies to the topological case, but also to the non-topological case, at least when we take integrable representations of the original algebra. The role of these extra states is at this point unclear. It may be that their appearance will lead to inconsistencies. One may avoid the states by selecting an appropriate range of representations for the auxiliary sector.
Then only ghost free states will appear in the cohomology. This was the situation treated in ref [2]. It may on the other hand be that the extra states are a new and important part of the quantization of WZNW models. In the latter case one may expect that the extra states will be needed to ensure S-matrix unitarity and hence will appear as poles in scattering amplitudes.
2. Preliminaries

Let $\bar g$ be a simple finite dimensional Lie algebra of rank $r$. We denote by $g_k$ the corresponding affine Lie algebra of level $k$. The sets of roots of $\bar g$ and $g$ are $\bar\alpha \in \bar\Delta$ and $\alpha \in \Delta$, respectively. The highest root of $\bar g$ is denoted $\psi$ and its length is taken to be one. The restrictions to positive roots are denoted by $\bar\Delta^+$, $\Delta^+$ and to simple roots by $\bar\Delta^s$, $\Delta^s$. The weight and root lattices of $\bar g$ and $g$ are denoted by $\bar\Gamma_w$, $\bar\Gamma_r$, $\Gamma_w$ and $\Gamma_r$. $\Gamma_r^+$ is the lattice generated by positive roots. Let $\Gamma_w^+$ be the set of dominant weights, $\Gamma_w^+ = \{\lambda \in \Gamma_w \mid \alpha_i\cdot\lambda \ge 0 \text{ for } \alpha_i \in \Delta^s\}$. Let $\Gamma_w^f = \{\lambda_i \in \Gamma_w^+ \mid 2\lambda_i\cdot\alpha_j/(\alpha_j)^2 = \delta_{ij} \text{ for } \alpha_j \in \Delta^s\}$ be the set of fundamental weights. Here $\lambda_i\cdot\alpha_j$ denotes the invariant scalar product on $g$ and $(\alpha_j)^2 = \alpha_j\cdot\alpha_j$. Define $\rho$ as twice the sum of fundamental weights of $g$. $\bar\rho$ is the corresponding sum for $\bar g$. $\rho$ satisfies $\rho\cdot\alpha_i = (\alpha_i)^2$, $\alpha_i \in \Delta^s$. We define the set of antidominant weights $\Gamma_w^- = \{\lambda \in \Gamma_w \mid \alpha_i\cdot(\lambda+\rho/2) \le 0 \text{ for } \alpha_i \in \Delta^s\}$. A weight $\mu \in \Gamma_w$ is said to be singular if it is orthogonal to at least one positive root and is said to be regular otherwise. The Weyl group $W$ of $g$ is the set of transformations on $\Gamma_w$ generated by the simple reflections
$$ \sigma_i(\lambda) = \lambda - \frac{2\lambda\cdot\alpha_i}{(\alpha_i)^2}\,\alpha_i, \qquad \alpha_i \in \Delta^s. \qquad (2.1) $$
The length $l(w)$ of $w \in W$ is the minimal number of simple reflections that give $w$. We also define the ρ-centered reflections $\sigma_i^\rho(\lambda) = \sigma_i(\lambda+\rho/2) - \rho/2$. Similarly we write $w^\rho(\lambda)$ for a general ρ-centered Weyl transformation. We define an ordering between weights. Let $\mu, \nu \in \Gamma_w$ be such that $\mu - \nu \in \Gamma_r^+$. We then write $\mu \ge \nu$. If $\mu - \nu \in \Gamma_r^+ \setminus \{0\}$, then this is denoted by $\mu > \nu$. Two weights $\lambda$ and $\mu$ are said to be on the same Weyl orbit if there exists $w \in W$ such that $\mu = w(\lambda)$. Similarly, they are said to be on the same ρ-centered Weyl orbit if $\mu = w^\rho(\lambda)$.

We make a triangular decomposition of $g$, $g = n_- \oplus h \oplus n_+$. We will use the notation $e_\alpha$ for the generators of $n_+$, $f_\alpha$ for those of $n_-$ and $h_i$, $i = 1, \dots, r+2$ for the generators of the Cartan subalgebra $h$. $h_i$, $i = 2, \dots, r+1$ span $\bar h$, $h_1$ is a central element of $g$ with eigenvalue $k/2$ and $h_0$ is a derivation. We have a corresponding decomposition of $U(g)$, the universal enveloping algebra of $g$, as $U(g) = U(n_-) \otimes U(h) \otimes U(n_+)$. Let $M(\lambda)$ denote the highest weight Verma module over $g$ of highest weight $\lambda$. The module is generated by a highest weight primary vector $v_{0\lambda}$ satisfying
$$ e_\alpha v_{0\lambda} = 0, \qquad h_i v_{0\lambda} = \lambda_i v_{0\lambda}, \qquad h_i \in h. \qquad (2.2) $$
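The simple reflections (2.1) and their ρ-centered versions are easy to experiment with numerically. The following is a minimal sketch only, under assumptions that are not part of the text: it uses the finite dimensional toy algebra sl(3) rather than an affine algebra, and it stores a weight by its Dynkin labels $n_j = 2\lambda\cdot\alpha_j/(\alpha_j)^2$. The names `CARTAN_SL3`, `simple_reflection` and `shifted_reflection` are hypothetical helpers introduced for illustration.

```python
# Toy illustration (assumption: finite dimensional sl(3), not an affine algebra).
# A weight is a tuple of Dynkin labels n_j = 2 lambda.alpha_j / alpha_j^2.
# CARTAN_SL3[i][j] = 2 alpha_i.alpha_j / alpha_j^2, i.e. the Dynkin labels of alpha_i.
CARTAN_SL3 = [[2, -1], [-1, 2]]


def simple_reflection(weight, i, cartan=CARTAN_SL3):
    """sigma_i(lambda) = lambda - (2 lambda.alpha_i / alpha_i^2) alpha_i, cf. Eq. (2.1)."""
    n_i = weight[i]
    return tuple(w - n_i * cartan[i][j] for j, w in enumerate(weight))


def shifted_reflection(weight, i, cartan=CARTAN_SL3):
    """rho-centered reflection sigma_i^rho(lambda) = sigma_i(lambda + rho/2) - rho/2.

    With rho equal to twice the sum of fundamental weights, rho/2 has
    Dynkin labels (1, ..., 1).
    """
    shifted = tuple(w + 1 for w in weight)
    reflected = simple_reflection(shifted, i, cartan)
    return tuple(w - 1 for w in reflected)


if __name__ == "__main__":
    lam = (1, 0)
    print(simple_reflection(lam, 0))
    print(shifted_reflection(lam, 0))
```

In this coordinate choice the weight $-\rho/2$, with labels $(-1,-1)$, is fixed by every ρ-centered reflection, which is a quick sanity check on the shift.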
$M(\lambda)$ admits a weight decomposition
$$ M(\lambda) = \bigoplus_{\eta \in \Gamma_r^+} M_\eta(\lambda). $$
Vectors in Mν (λ) will be called weight vectors of degree ν and their weights differ from the highest weight by ν. We consider throughout only vectors v ∈ Mη (λ) with dimMη (λ) < ∞. The dimension of Mν (λ) is P (ν), which is the number of ways ν may be written as a linear combination of positive roots with non-negative coefficients. Let M 0 (λ) be the proper maximal submodule of M (λ). Then M (λ)/M 0 (λ) is irreducible and isomorphic to the unique irreducible g−module L(λ).
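The counting function $P(\nu)$ described above can be evaluated by brute force in small cases. The sketch below is an illustration under stated assumptions, not the affine case of the text: it takes the three positive roots of the finite dimensional algebra sl(3), each with multiplicity one, and writes $\nu$ in the basis of simple roots. In the affine setting the set of positive roots is infinite and imaginary roots carry multiplicities, so the root list would have to be adapted. The names `POSITIVE_ROOTS_SL3` and `partition_function` are hypothetical.

```python
from functools import lru_cache

# Assumption: positive roots of sl(3) in the simple-root basis:
# alpha_1, alpha_2 and alpha_1 + alpha_2, each with multiplicity one.
POSITIVE_ROOTS_SL3 = ((1, 0), (0, 1), (1, 1))


def partition_function(nu, roots=POSITIVE_ROOTS_SL3):
    """Count the ways to write nu as a non-negative integer combination of roots."""

    @lru_cache(maxsize=None)
    def count(target, start):
        # The empty combination reproduces the zero vector exactly once.
        if all(c == 0 for c in target):
            return 1
        # No roots left but the target is still non-zero.
        if start == len(roots):
            return 0
        total = 0
        current = target
        # Use roots[start] zero times, once, twice, ... while nothing goes negative.
        while all(c >= 0 for c in current):
            total += count(current, start + 1)
            current = tuple(c - r for c, r in zip(current, roots[start]))
        return total

    return count(tuple(nu), 0)


if __name__ == "__main__":
    for nu in [(0, 0), (1, 1), (2, 1), (2, 2)]:
        print(nu, partition_function(nu))
```

For this toy root system the count for $\nu = a\alpha_1 + b\alpha_2$ is $\min(a,b)+1$, since the only freedom is how often the root $\alpha_1+\alpha_2$ is used.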
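Define a Hermitian form $\langle\,\cdot\,|\,\cdot\,\rangle$ as the mapping from $M(\lambda) \times M(\lambda)$ to the complex numbers by
$$ \langle v_{0\lambda} | v_{0\lambda} \rangle = 1, \qquad \langle w_\lambda | u v_\lambda \rangle = \langle u^\dagger w_\lambda | v_\lambda \rangle, \qquad (2.3) $$
where $u \in U(g)$ and $(\ )^\dagger$ denotes the Hermite conjugation defined by $e_\alpha^\dagger = f_\alpha$, $f_\alpha^\dagger = e_\alpha$, $h_i^\dagger = h_i$. For $v_\eta, w_\mu \in M_\eta(\lambda)$ we clearly have $\langle w_\mu | v_\eta \rangle = 0$ for $\eta \ne \mu$. If $\eta = \mu$, then $F(\lambda)_\eta = \langle w_\eta | v_\eta \rangle$ may be viewed as a $P(\eta) \times P(\eta)$ matrix, whose entries are polynomials in $\lambda$. The determinant of $F(\lambda)_\eta$ is given by the Kac–Kazhdan formula [10]
$$ \det F(\lambda)_\eta = \mathrm{const.} \prod_{\alpha \in \Delta^+} \prod_{n=1}^{\infty} \Big[ (\lambda+\rho/2)\cdot\alpha - \frac{n}{2}\,\alpha^2 \Big]^{P(\eta - n\alpha)}, \qquad (2.4) $$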
where roots $\alpha \in \Delta^+$ are taken with their multiplicities and $P(\eta) = 0$ if $\eta \notin \Gamma_r^+$. The zeros of the determinant are associated with highest weight vectors $v_\mu$ that occur in $M(\lambda)$ (see the following section). From Eq. (2.4) one may infer that $\mu = \lambda - n\alpha$, which implies that the Verma module $M(\mu)$ is a submodule of $M(\lambda)$. $M(\lambda)$ is irreducible if and only if there does not exist $n \in Z$ and $\alpha \in \Delta^+$ such that
$$ (\lambda+\rho/2)\cdot\alpha - \frac{n}{2}\,\alpha^2 = 0. \qquad (2.5) $$
Notice that this equation will for any imaginary root $\alpha$ (i.e. $\alpha^2 = 0$) be equivalent to the condition $k = -c_{\bar g}$, where $c_{\bar g}$ is the quadratic Casimir of the adjoint representation of $\bar g$.

3. Structure of Embeddings of Verma Modules

If the Kac–Kazhdan equation (2.5) has non-trivial solutions for a given module $M(\lambda)$, then there will exist Verma modules $M(\mu)$ that are submodules of $M(\lambda)$. This implies the existence of a g-homomorphism, $\phi \in \mathrm{Hom}_g(M(\mu), M(\lambda))$, such that $M(\mu) \xrightarrow{\phi} M(\lambda)$. We will in this section and throughout the rest of this paper assume $k \ne -c_{\bar g}$, so that solutions to Eq. (2.5) only occur for real roots $\alpha$. The structure of embeddings is most clearly depicted through a filtration due to Jantzen [5]. Introduce $z = \sum_{\lambda \in \Gamma_w^f} z_\lambda \lambda$, where $z_\lambda$ are non-zero complex numbers. Consider the one-parameter family of weights $\lambda_\epsilon = \lambda + \epsilon z$. If $\lambda$ is a weight of a reducible module $M(\lambda)$ and $z_i \ne 0$, then for $0 < |\epsilon| \ll 1$, $M(\lambda_\epsilon)$ is irreducible. We now define a filtration
$$ M(\lambda_\epsilon) \supset M^{(1)}(\lambda_\epsilon) \supset M^{(2)}(\lambda_\epsilon) \supset \dots \qquad (3.1) $$
by
$$ M^{(n)}(\lambda_\epsilon) = \{ v_\epsilon \in M(\lambda_\epsilon) \mid \langle w_\epsilon | v_\epsilon \rangle \text{ is divisible by } \epsilon^n \text{ for any } w_\epsilon \in M(\lambda_\epsilon) \}. \qquad (3.2) $$
We will often write $M_\epsilon$ for $M(\lambda_\epsilon)$ and similarly $M^{(n)}_\epsilon$ for $M^{(n)}(\lambda_\epsilon)$. If $v = u v_{0\lambda}$, $u \in U(g)$, then we write $v_\epsilon = u v_{0\lambda_\epsilon}$. In the limit $\epsilon \to 0$ this induces a filtration of modules in $M(\lambda)$
$$ M(\lambda) \supset M^{(1)}(\lambda) \supset M^{(2)}(\lambda) \supset \dots . \qquad (3.3) $$
Note that Jantzen’s filtration is hereditary: Let $M(\mu) \in M^{(s)}(\lambda)$ and $M(\nu) \in M^{(t)}(\mu)$. Then $M(\nu) \in M^{(s+t)}(\lambda)$.
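The irreducibility criterion built on Eq. (2.5) can be checked mechanically in a small example. The sketch below rests on assumptions that go beyond the text: it uses the finite dimensional toy algebra sl(3), where all roots have the same length, so that for $\alpha = \sum_i c_i \alpha_i$ the quantity $2(\lambda+\rho/2)\cdot\alpha/\alpha^2$ equals $\sum_i c_i(\lambda_i + 1)$ in terms of the Dynkin labels $\lambda_i$. It also takes $n \ge 1$, as in the product in Eq. (2.4). The function names are hypothetical.

```python
from fractions import Fraction

# Assumption: positive roots of sl(3) in the simple-root basis (all of equal length).
POSITIVE_ROOTS_SL3 = ((1, 0), (0, 1), (1, 1))


def kac_kazhdan_solutions(dynkin_labels, roots=POSITIVE_ROOTS_SL3):
    """Return all pairs (alpha, n), n = 1, 2, ..., solving Eq. (2.5) for the given weight.

    For equal root lengths, Eq. (2.5) reads n = sum_i c_i (lambda_i + 1).
    """
    solutions = []
    for root in roots:
        n = sum(c * (Fraction(label) + 1) for c, label in zip(root, dynkin_labels))
        if n.denominator == 1 and n >= 1:
            solutions.append((root, int(n)))
    return solutions


def verma_is_irreducible(dynkin_labels):
    """In this toy setting M(lambda) is irreducible iff Eq. (2.5) has no solution."""
    return not kac_kazhdan_solutions(dynkin_labels)


if __name__ == "__main__":
    print(kac_kazhdan_solutions((0, 0)))
    print(verma_is_irreducible((-1, -1)))
    print(kac_kazhdan_solutions((Fraction(1, 2), 0)))
```

Exact rational arithmetic is used so that a generic, non-integral weight is never mistaken for a solution because of rounding.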
Any irreducible subquotient of a g-module $M(\lambda)$ is isomorphic to an irreducible g-module $L(\mu)$, $\lambda - \mu \in \Gamma_r^+$. Denote by $(M(\lambda) : L(\mu))$ the multiplicity of $L(\mu)$ in $M(\lambda)$. $M^{(1)}(\lambda)$ is the maximal proper submodule of $M(\lambda)$ and hence $M(\lambda)/M^{(1)}(\lambda)$ is isomorphic to the irreducible module $L(\lambda)$. We will call the vectors in $M^{(1)}$ null-vectors of $M(\lambda)$. We define $\pi_L$ to be the projection $M(\lambda) \xrightarrow{\pi_L} L(\lambda)$. The submodules of a given Verma module are generally not all of Verma type. It is convenient to introduce the notion of primitive vectors. Let $V$ be a g-module. A vector $v_\lambda \in V$ is said to be primitive if there exists a submodule $U$ of $V$ such that
$$ v \notin U, \qquad xv \in U \text{ for any } x \in n_+ . \qquad (3.4) $$
$\lambda$ is called a primitive weight. Highest weight vectors are clearly primitive, but in general they do not exhaust all primitive vectors, even in the case of finite dimensional algebras, as was first noted in [11]. In fact, there may be infinitely many more primitive vectors than highest weight vectors (see [12] for an example for finite dimensional algebras). Any module $V$ is generated by its primitive weights as a g-module. We will call a module which is generated by acting freely with $U(n_-)$ on a primitive vector, which is not of highest weight type, a Bernstein-Gel’fand (BG) module. The corresponding primitive vector will be called a Bernstein-Gel’fand primitive vector.

Although every zero in the determinant Eq. (2.4), i.e. every $(\alpha, n)$ for which the Kac–Kazhdan equation (2.5) is satisfied, corresponds to a highest weight vector in $M(\lambda)$ (cf. Proposition 3.8), the converse is in general not true. For a given $\lambda$ there are usually more highest weight vectors than solutions $(\alpha, n)$. Let $\mathrm{Hom}_g(M(\mu), M(\lambda)) \ne 0$ for a pair $(\alpha, n)$ in Eq. (2.5) with $\alpha$ real, i.e. $\mu = \lambda - n\alpha$, $n \ge 1$ and $\alpha \in \Delta^+ \cap \Delta_R$, where $\Delta_R$ is the set of real roots. Then we may write
$$ \mu = \sigma_\alpha^\rho(\lambda) < \lambda. \qquad (3.5) $$
The inequality ensures that a solution to Eq. (2.5) exists. In the form Eq. (3.5) it is clear that by iteration, we will find new highest weight vectors not given by solutions to the Kac–Kazhdan equation for $\lambda$. It also follows that $M(\lambda)$ is irreducible if and only if $\lambda$ is antidominant. Notice that this requires $k < -c_{\bar g}$.

Let us proceed to give a more precise classification of highest weight vectors in $M(\lambda)$ in terms of Weyl transformations. Define the Bruhat ordering on $W$. Let $w, w' \in W$. We write $w' \to w$ if there exists $\alpha \in \Delta^+ \cap \Delta_R$, such that $w = \sigma_\alpha w'$ and $l(w) = l(w') + 1$. We write $w' \prec w$ if there are $w_0, w_1, \dots, w_p \in W$ such that $w' = w_p \to w_{p-1} \to \dots \to w_1 \to w_0 = w$. It may be shown that $w' \prec w$ if and only if the reduced expressions $w' = \sigma_{j_1} \dots \sigma_{j_p}$ and $w = \sigma_{i_1} \dots \sigma_{i_q}$ are such that $(j_1, \dots, j_p)$ is obtained by deleting $q - p$ elements from $(i_1, \dots, i_q)$. By combining Theorem 4.2 in [10] with Eq. (3.5) we have the following.

Theorem 3.1. A Verma module $M(\lambda)$ contains an irreducible subquotient $L(\mu)$ if and only if the following condition is satisfied:
(*) $\lambda = \mu$, or there exists a sequence of positive roots $\alpha_1, \alpha_2, \dots, \alpha_k$ and a sequence of weights $\lambda = \mu_1, \mu_2, \dots, \mu_k, \mu_{k+1} = \mu$ such that $\mu_{i+1} = \sigma^\rho_{\alpha_i}(\mu_i) < \mu_i$ for $i = 1, 2, \dots, k$.

Lemma 3.2. Let $\mu \in \Gamma_w$. Then there exists $w \in W$ and a unique $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$) such that $\mu = w^\rho(\lambda) = \sigma^\rho_{i_n} \sigma^\rho_{i_{n-1}} \dots \sigma^\rho_{i_1}(\lambda)$, where $i_1, \dots, i_n$ denote the simple roots $\alpha_{i_1}, \dots, \alpha_{i_n}$ with
(**) $\mu = \lambda$, or $\mu \ne \lambda$ and $\sigma^\rho_{i_{p+1}} \sigma^\rho_{i_p} \dots \sigma^\rho_{i_1}(\lambda) < \sigma^\rho_{i_p} \dots \sigma^\rho_{i_1}(\lambda)$ ($k > -c_{\bar g}$) or $\sigma^\rho_{i_{p+1}} \sigma^\rho_{i_p} \dots \sigma^\rho_{i_1}(\lambda) > \sigma^\rho_{i_p} \dots \sigma^\rho_{i_1}(\lambda)$ ($k < -c_{\bar g}$), $p = 1, 2, \dots, n-1$.

Proof. Consider $k < -c_{\bar g}$. For $\mu \in \Gamma_w^-$ the lemma is trivial ($w = 1$). Let $\mu = \mu_1 \notin \Gamma_w^-$. Then there exists $\alpha_1 \in \Delta^s$ such that $n_1 = (2\mu_1 + \rho)\cdot\alpha_1/\alpha_1^2 \in N = \{1, 2, 3, \dots\}$. This implies that $\mu_2 = \sigma^\rho_{\alpha_1}(\mu_1)$ satisfies $\mu_2 < \mu_1$. Let $\lambda + \rho/2 \in \Gamma_w^-$ be such that $(\mu_2 - \lambda)^2 \ge 0$ (which is always possible, as can be seen by an explicit parametrization of the weights). We have $(\mu_2 - \lambda)^2 = (\mu_1 - \lambda)^2 + n_1 (2\lambda + \rho)\cdot\alpha_1$ and, therefore, $(\mu_2 - \lambda)^2 < (\mu_1 - \lambda)^2$. If $\mu_2 \notin \Gamma_w^-$ we can continue this process. We get a sequence of weights $\mu_1 = \mu, \mu_2, \dots, \mu_r$ with $(\mu_{p+1} - \lambda)^2 < (\mu_p - \lambda)^2$ and $\mu_{p+1} = \sigma^\rho_{\alpha_p}(\mu_p) < \mu_p$, $p = 1, \dots, r-1$. This sequence must terminate after a finite number of steps, since $(\mu_r - \lambda)^2 \ge 0$ from $(\mu_2 - \lambda)^2 \ge 0$. But this can only happen if the last weight $\mu_r$ of the sequence satisfies $\alpha_i \cdot (2\mu_r + \rho) \le 0$ for all $\alpha_i \in \Delta^s$, i.e. $\mu_r \in \Gamma_w^-$. We now prove the uniqueness. Assume $w, w' \in W$ and $\lambda, \lambda' \in \Gamma_w^+$ such that $\mu = w^\rho(\lambda) = w'^\rho(\lambda')$. Then $\lambda = (w^{-1})^\rho w'^\rho(\lambda')$. This implies $\lambda = \lambda'$, as follows by an adaptation of [14], Lemma A in Sect. 13.2, to the present case. The case $k > -c_{\bar g}$ is proved in a completely analogous fashion.
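The descent used in the proof above, repeatedly applying a ρ-centered simple reflection that lowers the weight until an antidominant weight is reached, can be sketched as a short loop. This is an illustration under assumptions that are not in the text: the finite dimensional toy algebra sl(3) replaces the affine algebra, weights are integral and stored by Dynkin labels, so that $n_i = (2\mu+\rho)\cdot\alpha_i/\alpha_i^2 = \mu_i + 1$, and the first available simple root is always chosen. The names `shifted_reflection` and `descend_to_antidominant` are hypothetical.

```python
# Assumption: finite dimensional sl(3) with integral weights given by Dynkin labels.
CARTAN_SL3 = [[2, -1], [-1, 2]]


def shifted_reflection(weight, i, cartan=CARTAN_SL3):
    """sigma_i^rho(mu) = mu - n_i alpha_i with n_i = mu_i + 1 in Dynkin labels."""
    n_i = weight[i] + 1
    return tuple(w - n_i * cartan[i][j] for j, w in enumerate(weight))


def descend_to_antidominant(weight, cartan=CARTAN_SL3):
    """Return (lam, steps): the antidominant end point and the simple reflections used.

    Each step uses a simple root alpha_i with n_i = 1, 2, 3, ..., so the weight
    is strictly lowered; the loop stops once all n_i <= 0.
    """
    if any(not isinstance(label, int) for label in weight):
        raise ValueError("this sketch only handles integral weights")
    current = tuple(weight)
    steps = []
    while True:
        lowering = [i for i, label in enumerate(current) if label + 1 > 0]
        if not lowering:
            return current, steps
        i = lowering[0]
        current = shifted_reflection(current, i, cartan)
        steps.append(i)


if __name__ == "__main__":
    print(descend_to_antidominant((0, 0)))
    print(descend_to_antidominant((1, 0)))
```

In the finite dimensional toy case termination is automatic because the ρ-centered Weyl orbit is finite; the norm argument in the proof is what replaces this in the affine setting.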
Lemma 3.3. Let $\mu$ and $\lambda$ be as in Lemma 3.2 and $\mu_0 = \lambda$, $\mu_1 = \sigma^\rho_{i_1}(\mu_0)$, $\mu_2 = \sigma^\rho_{i_2}(\mu_1)$, $\dots$, $\mu_n = \sigma^\rho_{i_n}(\mu_{n-1}) = \mu$, where $\sigma^\rho_{i_k}$, $k = 1, 2, \dots, n$ are simple reflections satisfying (**). Then for $k > -c_{\bar g}$, $\mathrm{Hom}_g(M(\mu_p), M(\mu_{p-1})) \ne 0$, $p = 1, 2, \dots, n$ and for $k < -c_{\bar g}$, $\mathrm{Hom}_g(M(\mu_{p-1}), M(\mu_p)) \ne 0$, $p = 1, 2, \dots, n$.

Proof. The proof is by explicit construction. Consider e.g. $k < -c_{\bar g}$ and $\mu_p = \sigma^\rho_{i_p}(\mu_{p-1})$. We take the $sl_2$ subalgebra generated by $e_{i_p}$, $f_{i_p}$ and $h_{i_p}$ satisfying $[f_{i_p}, e_{i_p}] = -h_{i_p}$ and $[h_{i_p}, f_{i_p}] = -f_{i_p}$. Let $v_{\mu_p}$ be the highest weight vector that generates $M(\mu_p)$ and $h_{i_p} v_{\mu_p} = \mu_{i_p} v_{\mu_p}$. Then it is straightforward to check that $v_{\mu_{p-1}} = (f_{i_p})^{2\mu_{i_p}+1} v_{\mu_p}$ is a highest weight vector and it will generate a submodule isomorphic to $M(\mu_{p-1})$. Hence, $\mathrm{Hom}_g(M(\mu_{p-1}), M(\mu_p)) \ne 0$.

By Theorem 3.1, Lemma 3.2 and Lemma 3.3 we have the following:

Proposition 3.4. Let $\mu \in \Gamma_w$. Then there exists a unique $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$), or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$), such that $\mathrm{Hom}_g(M(\mu), M(\lambda)) \ne 0$ ($k > -c_{\bar g}$), or $\mathrm{Hom}_g(M(\lambda), M(\mu)) \ne 0$ ($k < -c_{\bar g}$). Furthermore, if $\nu \in \Gamma_w$ and $\mathrm{Hom}_g(M(\mu), M(\nu)) \ne 0$, then $[\dim \mathrm{Hom}_g(M(\mu), M(\lambda))][\dim \mathrm{Hom}_g(M(\nu), M(\lambda))] \ne 0$ for $k > -c_{\bar g}$ or $[\dim \mathrm{Hom}_g(M(\lambda), M(\mu))][\dim \mathrm{Hom}_g(M(\lambda), M(\nu))] \ne 0$ for $k < -c_{\bar g}$.

Lemma 3.5. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$), $w \in W$ and $\alpha \in \Delta^+ \cap \Delta_R$. Then
(i) $\sigma^\rho_\alpha w^\rho(\lambda) < w^\rho(\lambda) \Rightarrow l(\sigma_\alpha w) > l(w)$ for $k > -c_{\bar g}$ or $l(\sigma_\alpha w) < l(w)$ for $k < -c_{\bar g}$,
(ii) $l(\sigma_\alpha w) > l(w)$ for $k > -c_{\bar g}$ or $l(\sigma_\alpha w) < l(w)$ for $k < -c_{\bar g}$ $\Rightarrow \sigma^\rho_\alpha w^\rho(\lambda) \le w^\rho(\lambda)$.

Proof. The proof of (i) is identical to that of Lemma 7.7.2 (ii) in [14] (cf. [6], Lemma 8.2). Note that in the proof of Lemma 7.7.2 in [14], $\lambda \in \Gamma_w^+$ is assumed. The weaker condition on $\lambda$, assumed in our case, does not affect (i). We prove (ii) for $k > -c_{\bar g}$. We have $\sigma^\rho_\alpha w^\rho(\lambda) = w^\rho(\lambda) - n\alpha$.
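Here $n = (2w^\rho(\lambda) + \rho)\cdot\alpha/\alpha^2 \in Z$. If $n < 0$ then $\sigma^\rho_\alpha w^\rho(\lambda) > w^\rho(\lambda)$. By (i), we get $l(\sigma_\alpha w) < l(w)$ which is a contradiction. Hence, $n = 0, 1, 2, \dots$ and (ii) follows. The proof for $k < -c_{\bar g}$ is analogous.

The following two lemmas are direct generalizations of [14], Lemma 7.7.4 and Lemma 7.7.5 (cf. [6], Lemma 8.4 and Lemma 8.5).

Lemma 3.6. Let $w_1, w_2 \in W$, $\gamma \in \Delta^+ \cap \Delta_R$ and $\alpha \in \Delta^s$, with $\gamma \ne \alpha$. The following conditions are equivalent:
(i) $\sigma_\alpha w_1 \xleftarrow{\alpha} w_1$ and $\sigma_\alpha w_1 \xleftarrow{\gamma} w_2$,
(ii) $w_2 \xleftarrow{\alpha} \sigma_\alpha w_2$ and $w_1 \xleftarrow{\sigma_\alpha(\gamma)} \sigma_\alpha w_2$.

Lemma 3.7. Let $w \in W$ and $\gamma \in \Delta^+ \cap \Delta_R$ be such that $l(w) > l(\sigma_\gamma w)$. Then $w \succ \sigma_\gamma w$.

We proceed to obtain results on the g-homomorphisms $M(\nu) \to M(\mu)$. First we have the following:

Proposition 3.8 (cf. [14], Lemma 7.6.11). Let $\nu \in \Gamma_w$, $\alpha \in \Delta^+$, $\mu = \sigma^\rho_\alpha(\nu)$. Assume $\mu \le \nu$. Then $\mathrm{Hom}_g(M(\mu), M(\nu)) \ne 0$.

Proof. The proof is essentially the same as in [14]. The case $\mu = \nu$ is trivial, so we assume $\mu < \nu$. We consider only $k > -c_{\bar g}$ as the case $k < -c_{\bar g}$ is analogous. By Lemma 3.2 there exists $w \in W$ and $\lambda' + \rho/2 \in \Gamma_w^+$ such that $\nu = w^\rho(\lambda')$. Let $w = \sigma_{\alpha_n} \dots \sigma_{\alpha_1}$ be a reduced expression of $w$ in terms of simple reflections and
$$ \nu_0 = \lambda',\ \nu_1 = \sigma^\rho_{\alpha_1}(\nu_0),\ \nu_2 = \sigma^\rho_{\alpha_2}(\nu_1),\ \dots,\ \nu_n = \sigma^\rho_{\alpha_n}(\nu_{n-1}) = \nu, $$
$$ \mu_0 = \lambda,\ \mu_1 = \sigma^\rho_{\alpha_1}(\mu_0),\ \mu_2 = \sigma^\rho_{\alpha_2}(\mu_1),\ \dots,\ \mu_n = \sigma^\rho_{\alpha_n}(\mu_{n-1}) = \mu. $$
Then $\nu_0 = w'^\rho(\mu_0)$ for some $w' \in W$ (from $\nu_0 = (w^{-1})^\rho(\nu) = (w^{-1})^\rho \sigma^\rho_\alpha(\mu)$) and $\mu_0 + \rho/2 \in \Gamma_w^+$, hence $\mu_0 - \nu_0 \in \Gamma_r^+$. On the other hand, $\mu_n - \nu_n = -m\alpha$, $m > 0$. Since the same element of $W$ transforms $\mu$ and $\nu$ into $\mu_p$ and $\nu_p$, respectively, $p = 0, 1, 2, \dots, n$, $\mu_p$ is transformed from $\nu_p$ by a reflection $\sigma^\rho_{\gamma_p}$ ($\gamma_p \in \Delta^+$), hence $\mu_p - \nu_p \in \Gamma_r^+$ or $\nu_p - \mu_p \in \Gamma_r^+$. Hence, there exists a smallest integer $k$ such that $\mu_k - \nu_k \in \Gamma_r^+$ and $\mu_{k+1} - \nu_{k+1} \in -\Gamma_r^+$. Now $\mu_k - \nu_k = \sigma^\rho_{\alpha_{k+1}}(\mu_{k+1} - \nu_{k+1})$. Since $\mu_{k+1} - \nu_{k+1}$ is proportional to $\gamma_{k+1}$, it can be seen that $\sigma_{\alpha_{k+1}}(\gamma_{k+1}) \in \Delta^-$. Hence, $\gamma_{k+1} = \alpha_{k+1}$ (since $\sigma_{\alpha_{k+1}}$ permutes all positive roots except $\alpha_{k+1}$). The relations $\mu_{k+1} - \nu_{k+1} \in -\Gamma_r^+$ and $\mu_{k+1} = \sigma^\rho_{\alpha_{k+1}}(\nu_{k+1})$ imply $\mathrm{Hom}_g(M(\mu_{k+1}), M(\nu_{k+1})) \ne 0$ (Lemma 3.3). On the other hand $M(\mu_{k+2}) = M(\sigma^\rho_{\alpha_{k+2}}(\mu_{k+1}))$ so that $\mathrm{Hom}_g(M(\mu_{k+2}), M(\mu_{k+1})) \ne 0$. Hence, $\mathrm{Hom}_g(M(\mu_{k+2}), M(\sigma^\rho_{\alpha_{k+2}}(\nu_{k+1}))) = \mathrm{Hom}_g(M(\mu_{k+2}), M(\nu_{k+2})) \ne 0$. Continuing this step by step we arrive at $\mathrm{Hom}_g(M(\mu), M(\nu)) \ne 0$.

As a corollary to this proposition we can generalize results obtained by [15] and [11] for finite dimensional Lie algebras.

Corollary 3.9.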
A necessary and sufficient condition for M (µ) to be a submodule of M (ν) is that the condition (*) in Theorem 3.1 is satisfied. We are now ready to formulate one of the main results of this section, namely the dimension of the g-homomorphisms M (µ) → M (ν). This result generalizes the result of Verma [15] for finite dimensional Lie algebras and Rocha-Caridi, Wallach [6] for representations with highest weights on Weyl orbits through dominant weights.
Semi-Infinite Cohomology of Affine Lie Algebras
Theorem 3.10. Let $\mu, \nu \in \Gamma_w$. Then $\dim \mathrm{Hom}_g(M(\mu), M(\nu)) \le 1$.

Proof. We consider the cases $k > -c_{\bar g}$ and $k < -c_{\bar g}$ separately.

$k > -c_{\bar g}$: By Proposition 3.4 it is sufficient to prove that $\dim \mathrm{Hom}_g(M(\mu), M(\lambda)) \le 1$, where $\mu = w^\rho(\lambda)$, $\lambda + \rho/2 \in \Gamma_w^+$. The proof is then similar to that of [6], Lemma 8.14, using induction on $l(w)$. We only sketch it. For $l(w) = 0$ the theorem is trivial. Assume it to be true for $l(w) < p$. Consider $l(w) = p$. Let $i \in \{1, 2, \ldots, n\}$ be such that $\sigma_i^\rho(\mu) > \mu$, where $\sigma_i$ are reflections corresponding to simple roots $\alpha_i$. Then $l(\sigma_i w) < l(w)$ (Lemma 3.5) and $\dim \mathrm{Hom}_g(M(\mu), M(\sigma_i^\rho(\mu))) \neq 0$ (Proposition 3.7). Consider the $sl_2$ subalgebra $g_i$ corresponding to the simple root $\alpha_i$, $i = 1, 2, \ldots, n$. $M(\sigma_i^\rho(\mu))$ is the so-called completion of $M(\mu)$ w.r.t. $g_i$ and is unique ([18], Proposition 3.6, and [6], Proposition 8.11). Then $\dim \mathrm{Hom}_g(M(\mu), M(\lambda)) = \dim \mathrm{Hom}_g(M(\sigma_i^\rho(\mu)), M(\lambda))$. By the induction hypothesis $\dim \mathrm{Hom}_g(M(\sigma_i^\rho(\mu)), M(\lambda)) = 1$. This gives the theorem in the case $k > -c_{\bar g}$.

$k < -c_{\bar g}$: We will prove the theorem using essentially the original argument of Verma [15], Theorem 2. By Proposition 3.4 it is sufficient to prove that $\dim \mathrm{Hom}_g(M(\lambda), M(\mu)) \le 1$, where $\mu = w^\rho(\lambda)$, $\lambda \in \Gamma_w^-$. As $M(\lambda)$ is irreducible, we can count the number of states in $M(\mu)$ and $M(\lambda)$ to establish that if $\dim \mathrm{Hom}_g(M(\lambda), M(\mu)) \ge 2$, then $P(\eta) = \dim M_\eta(\mu) \ge 2 \dim M_{\eta+\lambda-\mu}(\lambda) = 2P(\eta + \lambda - \mu)$. This is, however, a contradiction ([15], Lemma 3), as can be seen by considering large $\eta$.

Note here the following. Firstly, Theorem 3.1 and Theorem 3.10 imply that a BG module $V(\mu)$ is a submodule of $M(\lambda)$ if and only if $(M(\lambda) : L(\mu)) \ge 2$. Secondly, if a BG module $V(\mu) \subset M(\lambda)$, then there exists a $g$-homomorphism $\phi_{VM}$ such that $V(\mu) \xrightarrow{\phi_{VM}} M(\mu) \subset M(\lambda)$.

As Theorem 3.10 shows that an element of $\mathrm{Hom}_g(M(\mu), M(\nu))$ is either zero or unique (up to a multiplicative constant), we write $M(\mu) \subset M(\nu)$ whenever $\mathrm{Hom}_g(M(\mu), M(\nu)) \neq 0$. We next generalize a result established for finite dimensional Lie algebras [19] and for $k > -c_{\bar g}$ in [6].

Theorem 3.11. Let $\mu, \nu \in \Gamma_w$. Then there exist $w, w' \in W$ and $\lambda + \rho/2, \lambda' + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$), or $\lambda, \lambda' \in \Gamma_w^-$ ($k < -c_{\bar g}$) such that $\mu = w^\rho(\lambda')$ and $\nu = w'^\rho(\lambda)$. For $k > -c_{\bar g}$ we have:
(i) $M(\mu) \subset M(\nu) \iff w \prec w',\ \lambda = \lambda' \iff (M(\nu) : L(\mu)) \neq 0$.
(ii) If $M(\mu) \subset M(\nu)$, $\mu \neq \nu$, then there are $\mu = \mu_0, \mu_1, \ldots, \mu_n = \nu$ such that $\mu_{i+1} = w_i^\rho(\lambda)$, $i = 0, 1, \ldots, n-1$, with $l(w_{i+1}) = l(w_i) - 1$, $w_0 = w$, $w_n = w'$ and $M(\mu_0) \subset M(\mu_1) \subset M(\mu_2) \subset \ldots \subset M(\mu_n)$.
For $k < -c_{\bar g}$ we have:
(iii) $M(\mu) \subset M(\nu) \iff w' \prec w,\ \lambda = \lambda' \iff (M(\nu) : L(\mu)) \neq 0$.
(iv) If $M(\mu) \subset M(\nu)$, $\mu \neq \nu$, then there are $\mu = \mu_0, \mu_1, \ldots, \mu_n = \nu$ such that $\mu_{i+1} = w_i^\rho(\lambda)$, $i = 0, 1, \ldots, n-1$, with $l(w_{i+1}) = l(w_i) + 1$, $w_0 = w$, $w_n = w'$ and $M(\mu_0) \subset M(\mu_1) \subset M(\mu_2) \subset \ldots \subset M(\mu_n)$.
S. Hwang
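Proof (cf. [14] and [6]). Consider $k < -c_{\bar g}$. The existence of $w, w'$ follows from Lemma 3.2. By Theorem 3.1 and Corollary 3.9 we have $M(\mu) \subset M(\nu) \iff (M(\nu) : L(\mu)) \neq 0$. Assume $M(\mu) \subset M(\nu)$. By Corollary 3.9 there exist $\gamma_1, \ldots, \gamma_n \in \Delta_+$ such that
$$\mu = w^\rho(\lambda) < \sigma^\rho_{\gamma_1} w^\rho(\lambda) < \ldots < \sigma^\rho_{\gamma_n} \sigma^\rho_{\gamma_{n-1}} \cdots \sigma^\rho_{\gamma_1} w^\rho(\lambda) = w'^\rho(\lambda').$$
Then $\lambda = \lambda'$ (Lemma 3.2) and by Lemma 3.5 we have $l(w) < l(\sigma_{\gamma_1} w) < \ldots < l(w')$. Hence, $w \prec w'$ (Lemma 3.7). We now assume $w' \prec w$, $\lambda = \lambda'$. Then there exist $\gamma_1, \ldots, \gamma_n \in \Delta_+$ such that
$$w = w_0 \xrightarrow{\gamma_1} w_1 \xrightarrow{\gamma_2} w_2 \cdots \xrightarrow{\gamma_{n-1}} w_{n-1} \xrightarrow{\gamma_n} w_n = w'.$$
By Lemma 3.5 we have $\mu = w_0^\rho(\lambda) \le w_1^\rho(\lambda) \le \ldots \le w_n^\rho(\lambda) = \nu$ and, hence, $M(w_0^\rho(\lambda)) \subset M(w_1^\rho(\lambda)) \subset \ldots \subset M(w_n^\rho(\lambda))$ (Proposition 3.8). This proves (iii) and (iv). The cases (i) and (ii) are proved analogously.

It is convenient to introduce the concept of length of a weight. Let $\mu \in \Gamma_w$. Then we define the length $l(\mu)$ as the smallest integer $l(w)$ such that $\mu = w^\rho(\lambda)$, $w \in W$, $\lambda + \rho/2 \in \Gamma_w^+$ or $\lambda \in \Gamma_w^-$. We now prove some useful results involving this concept. First we have a result similar to Lemma 3.5.

Lemma 3.12. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$), $w \in W$ and $\alpha \in \Delta_+ \cap \Delta_R$. The following conditions are equivalent:
(i) $\sigma_\alpha^\rho w^\rho(\lambda) < w^\rho(\lambda)$,
(ii) $l(\sigma_\alpha^\rho w^\rho(\lambda)) > l(w^\rho(\lambda))$ for $k > -c_{\bar g}$, or $l(\sigma_\alpha^\rho w^\rho(\lambda)) < l(w^\rho(\lambda))$ for $k < -c_{\bar g}$.

Proof. We prove (i) $\Rightarrow$ (ii) for the case $k > -c_{\bar g}$. Let $w'^\rho(\lambda) = \sigma_\alpha^\rho w^\rho(\lambda)$ with $l(w') = l(\sigma_\alpha^\rho w^\rho(\lambda))$. We have $\sigma_\alpha^\rho w'^\rho(\lambda) = w^\rho(\lambda) > \sigma_\alpha^\rho w^\rho(\lambda) = w'^\rho(\lambda)$, and, thus, $l(w) = l(\sigma_\alpha w') < l(w')$ (Lemma 3.5). By definition, $l(w) \ge l(w^\rho(\lambda))$ and, hence, $l(w^\rho(\lambda)) < l(w') = l(\sigma_\alpha^\rho w^\rho(\lambda))$. The case $k < -c_{\bar g}$ is proved analogously. We now prove (ii) $\Rightarrow$ (i) for $k > -c_{\bar g}$. We have $\sigma_\alpha^\rho w^\rho(\lambda) = w^\rho(\lambda) - n\alpha$. Here
$$n = \frac{(2w^\rho(\lambda) + \rho) \cdot \alpha}{\alpha^2} \in \mathbb{Z}.$$
$n = 0$ implies $\sigma_\alpha^\rho w^\rho(\lambda) = w^\rho(\lambda)$ and thus $l(\sigma_\alpha^\rho w^\rho(\lambda)) = l(w^\rho(\lambda))$. This contradicts (ii) and, therefore, we have $n \neq 0$. If $n < 0$ then $\sigma_\alpha^\rho w^\rho(\lambda) > w^\rho(\lambda)$. By the implication (i) $\Rightarrow$ (ii), we again contradict (ii). Hence, $n = 1, 2, \ldots$ and (i) follows. The proof for $k < -c_{\bar g}$ is analogous.

We may easily generalize [14], Proposition 7.6.8, to obtain: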
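Lemma 3.13. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$) and $w = \sigma_{\alpha_n} \cdots \sigma_{\alpha_1}$ be a reduced decomposition of $w \in W$, where $\alpha_1, \ldots, \alpha_n \in \Delta_s$. Let $\lambda_0 = \lambda$, $\lambda_1 = \sigma^\rho_{\alpha_1}(\lambda_0)$, $\lambda_2 = \sigma^\rho_{\alpha_2}(\lambda_1)$, $\ldots$, $\lambda_n = \sigma^\rho_{\alpha_n}(\lambda_{n-1})$. Then for $k > -c_{\bar g}$, $\lambda_0 \ge \lambda_1 \ge \ldots \ge \lambda_n$ and
$$\frac{2\alpha_{i+1} \cdot (\lambda_i + \rho/2)}{\alpha_{i+1}^2} \in \{0, 1, 2, \ldots\},$$
and for $k < -c_{\bar g}$, $\lambda_0 \le \lambda_1 \le \ldots \le \lambda_n$ and
$$\frac{2\alpha_{i+1} \cdot (\lambda_i + \rho/2)}{\alpha_{i+1}^2} \in \{0, -1, -2, \ldots\}.$$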
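The integers appearing in Lemma 3.13 are easy to tabulate numerically. The following is a minimal Python sketch, assuming that weights and roots are given as coordinate vectors with the Euclidean inner product and that the ρ-centered reflection acts as $\sigma^\rho_\alpha(\lambda) = \lambda - n\alpha$ with $n = 2\alpha \cdot (\lambda + \rho/2)/\alpha^2$, as in the proof of Lemma 3.12. The function names and the rank-one sample data are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction


def dot(u, v):
    """Euclidean inner product, computed exactly with rationals."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))


def rho_reflect(weight, alpha, rho):
    """Return (n, reflected) with reflected = weight - n * alpha and
    n = 2 * alpha . (weight + rho/2) / alpha^2 (hypothetical helper)."""
    shifted = [Fraction(w) + Fraction(r) / 2 for w, r in zip(weight, rho)]
    n = 2 * dot(alpha, shifted) / dot(alpha, alpha)
    reflected = [Fraction(w) - n * Fraction(a) for w, a in zip(weight, alpha)]
    return n, reflected


def reflection_chain(weight, roots, rho):
    """Apply rho-centered reflections in the given order and collect
    the number n produced at each step (hypothetical helper)."""
    current = [Fraction(w) for w in weight]
    steps = []
    for alpha in roots:
        n, current = rho_reflect(current, alpha, rho)
        steps.append(n)
    return steps, current


# Toy rank-one data (an assumption): a single root alpha with alpha^2 = 2,
# and rho chosen equal to alpha.
alpha = [1, -1]
rho = [1, -1]
n, reflected = rho_reflect([1, 0], alpha, rho)
print(n, reflected)
```

In this toy example the first reflection gives $n = 2$, and reflecting the result again gives $n = -2$ and returns the original weight, which is the involution property one expects of a reflection.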
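Lemma 3.14. Let $\lambda + \rho/2 \in \Gamma_w^+$ ($k > -c_{\bar g}$) or $\lambda \in \Gamma_w^-$ ($k < -c_{\bar g}$). Let $\mu \in \Gamma_w$ with $\mu = w^\rho(\lambda)$, $w \in W$. If $l(\mu) = l(w)$, then $w, \lambda, \mu$ satisfy (**) in Lemma 3.2 with $l(\mu) = n$. In addition, this is the minimal integer $n$ for which (**) is satisfied.

Proof. Consider $k < -c_{\bar g}$. Let $w = \sigma_{\alpha_n} \cdots \sigma_{\alpha_1}$ with $l(w) = l(\mu)$. By Lemma 3.13 we have a sequence $\lambda_0 \le \lambda_1 \le \ldots \le \lambda_n$ and
$$\frac{2\alpha_{i+1} \cdot (\lambda_i + \rho/2)}{\alpha_{i+1}^2} \in \{0, -1, -2, \ldots\}.$$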
Assume λi = λi+1 for some i ∈ {0, 1, 2 . . . , n}. Then clearly w0 = σαn . . . σαi+1 σαi−1 . . . σα1 satisfies µ = w0ρ (λ) and l(w0 ) < l(w). This contradicts the assumption l(µ) = l(w). The last assertion follows by the definition of l(µ). k > −cg¯ is proved analogously. Proposition 3.15. Let M (µ) ⊂ M (ν), where µ, ν ∈ 0w . Then l(µ) − l(ν) = n for k > −cg¯ , or l(ν) − l(µ) = n for k < −cg¯ , if and only if n is the largest integer for which M (µ) ⊂ M (n) (ν). Proof. Consider k < −cg¯ . By Proposition 3.4 and the hereditary nature of Jantzen’s filtration it is sufficient to prove the proposition for l(µ) = 0, i.e. for µ ∈ 0− w and some given M (ν). We prove the “only if” case by induction on l(ν). For l(ν) = 0 the proposition is trivial. Assume it to be true for l(ν) ≤ p − 1 and consider l(ν) = p. As p ≥ 1 there must exist α ∈ 1s such that ν 0 = σαρ (ν) < ν. Then M (ν 0 ) ⊂ M (ν) (Proposition 3.8) and l(ν 0 ) < l(ν) (Lemma 3.12). If l(ν 0 ) < p − 1 then l(ν) < p, which is a contradiction. Hence, l(ν 0 ) = p − 1. In addition, M (ν 0 ) ⊂ M (1) (ν) and M (ν 0 ) 6⊂ M (2) (ν). This follows by an explicit construction of the highest weight vector that generates M (ν 0 ) (cf. the proof of Lemma 3.3). We now use the induction hypothesis on M (ν 0 ) together with the hereditary nature of Jantzen’s filtration to conclude that the proposition holds for l(ν) = p. We prove the “if” case. Consider M (µ) ⊂ M (p) (ν), µ ∈ 0− w and use induction on p. The case p = 0 is trivial. Assume the assertion to be true for 0 ≤ p ≤ n − 1 and consider p = n ≥ 1. As p ≥ 1 there must exist α ∈ 1s such that ν 0 = σαρ (ν) < ν and M (ν 0 ) ⊂ M (ν) (Proposition 3.8) with l(ν 0 ) < l(ν) (Lemma 3.12). By explicit construction of the highest weight vector one checks that M (ν 0 ) ⊂ M (1) (ν) and M (ν 0 ) 6⊂ M (2) (ν). Then the hereditary nature of Jantzen’s filtration implies M (µ) ⊂ M (n−1) (ν 0 ), which by the induction hypothesis yields l(ν 0 ) = n − 1. 
Then ν 0 = σαρ (ν) implies l(ν) = n, which concludes the proof. The case k > −cg¯ is proved in a completely analogous fashion.
Lemma 3.16 ([14], Lemma 7.7.6; [6], Lemma 8.6). Let $w_1, w_2 \in W$. The number of elements $w \in W$ such that $w_1 \leftarrow w \leftarrow w_2$ is 0 or 2.

Lemma 3.17 (cf. [14], Lemma 7.7.7 (iii) and [6], Lemma 8.15 (iii)). Let $M(\mu_1)$ and $M(\mu_2)$ be Verma modules with highest weights $\mu_1$ and $\mu_2$, respectively. Let $\mu_1 + \rho/2$ and $\mu_2 + \rho/2$ be regular. If $l(\mu_1) = l(\mu_2) + 2$ ($k > -c_{\bar g}$) or $l(\mu_1) = l(\mu_2) - 2$ ($k < -c_{\bar g}$), then the number of $\mu$ such that $M(\mu_1) \subset M(\mu) \subset M(\mu_2)$, $M(\mu_1) \neq M(\mu) \neq M(\mu_2)$ is either 0 or 2.

Proof. Consider $k > -c_{\bar g}$. The definition of $l(\mu_1)$ and $l(\mu_2)$ implies together with Lemma 3.2 that there exist $w_1, w_2 \in W$ such that $\mu_1 = w_1^\rho(\lambda)$, $\mu_2 = w_2^\rho(\lambda)$, $\lambda + \rho/2 \in \Gamma_w^+$ and $l(w_1) = l(w_2) + 2$. In addition, $\mu_1 + \rho/2$ and $\mu_2 + \rho/2$ regular imply that $\lambda \in \Gamma_w^+$. Then the number of $w \in W$ such that $M(w_1^\rho(\lambda)) \subset M(w^\rho(\lambda)) \subset M(w_2^\rho(\lambda))$ and $M(w_1^\rho(\lambda)) \neq M(w^\rho(\lambda)) \neq M(w_2^\rho(\lambda))$ is 0 or 2, as can be seen from combining Lemma 3.16 and Theorem 3.11. This proves the assertion of the lemma for $k > -c_{\bar g}$. The case $k < -c_{\bar g}$ is proved analogously.

4. The BRST Formalism

Define the algebra $g' = g_k \oplus g_{-k-2c_{\bar g}}$, where $c_{\bar g}$ is the quadratic Casimir of the adjoint representation. This algebra is invariant under the exchange $k \to -k - 2c_{\bar g}$ and, consequently, we may restrict to $k > -c_{\bar g}$. The singular case $k = -c_{\bar g}$ will not be treated here. We will denote $g_{-k-2c_{\bar g}}$ by $\tilde g$ and the Verma module over $\tilde g$ will be denoted $\tilde M(\tilde\lambda)$, where $\tilde\lambda$ is its highest weight. Let $B_{n'_+}$, $B_{n'_-}$, $B_{h'}$, $B_{\tilde g}$ and $B_{g'}$ be bases of $n'_+$, $n'_-$, $h'$, $\tilde g$ and $g'$, respectively. The generators $\tilde e_\alpha$, $\tilde f_\alpha$ and $\tilde h_i$ are a realization of $B_{\tilde g}$, and $e'_\alpha$, $f'_\alpha$ and $h'_i$ a realization of $B_{g'}$. Define $M'_{\lambda\tilde\lambda} = M(\lambda) \otimes \tilde M(\tilde\lambda)$ and similarly $L'_{\lambda\tilde\lambda} = L(\lambda) \otimes \tilde L(\tilde\lambda)$. $\pi_{L'}$ denotes the projection $M' \to L'$.

We define the anticommuting ghost and antighost operators $c(x)$ and $b(x)$, respectively, where $x \in B_{g'}$, with the following properties:
(i) $\{c(x), b(y)\} = \delta_{x^\dagger, y}$, (4.1)
(ii) $c^\dagger(x) = c(x^\dagger)$, $b^\dagger(x) = b(x^\dagger)$, (4.2)
(iii) $b(a_1 x + a_2 y) = a_1 b(x) + a_2 b(y)$, $a_1, a_2 \in \mathbb{C}$. (4.3)
Here $\delta_{x,y} = 1$ if $x = y$ and 0 otherwise. Introduce a normal ordering
$$:c(x)b(y): \; = \begin{cases} c(x)b(y) & \text{if either } x \in B_{n'_-} \text{ or } y \in B_{n'_+}, \\ -b(y)c(x) & \text{if either } x \in B_{n'_+} \text{ or } y \in B_{n'_-}, \\ \tfrac{1}{2}\,(c(x)b(y) - b(y)c(x)) & \text{otherwise.} \end{cases} \tag{4.4}$$
Define the BRST operator
$$d = \sum_{x \in B_{g'}} c(x^\dagger)\,x + \sum_{x \in B_{h'}} c(x^\dagger)\,\rho(x) - \frac{1}{2} \sum_{x,y \in B_{g'}} :b([x, y])\,c(x^\dagger)\,c(y^\dagger):, \tag{4.5}$$
where $\rho(x)$ is the component of $\rho$ corresponding to the element $x \in B_{h'}$. The BRST operator has the following two fundamental properties: $d^2 = 0$ and $d^\dagger = d$. The first property implies that $x^{tot} = \{d, b(x)\}$ generates an algebra $g'$ which is centerless. Define a ghost module $F^{gh}$. It is generated by the ghost operators acting on a vacuum vector $v_0^{gh}$ satisfying
$$c(x)v_0^{gh} = b(y)v_0^{gh} = 0 \quad \text{for } x \in B_{n'_+} \text{ and } y \in B_{n'_+} \cup B_{h'}. \tag{4.6}$$
We also define a restricted module $\hat F^{gh} = \{v^{gh} \in F^{gh} \mid b(x)v^{gh} = 0 \text{ for } x \in B_{h'}\}$. The dual $F^{*gh}$ of $F^{gh}$ has a vacuum vector $v_0^{*gh}$ satisfying
$$c^\dagger(x)v_0^{*gh} = b^\dagger(y)v_0^{*gh} = 0 \quad \text{for } x \in B_{n'_+} \cup B_{h'} \text{ and } y \in B_{n'_+}. \tag{4.7}$$
The restricted dual is $\hat F^{*gh} = \{v^{*gh} \in F^{*gh} \mid c(x)v^{*gh} = 0 \text{ for } x \in B_{h'}\}$. Define a Hermitian form for the ghost sector by
$$\langle v_0^{*gh} | v_0^{gh} \rangle = 1, \qquad \langle v^{*gh} | u\,v^{gh} \rangle = \langle u^\dagger v^{*gh} | v^{gh} \rangle, \tag{4.8}$$
for a polynomial $u$ in the ghost operators and $v^{gh} \in F^{gh}$, $v^{*gh} \in F^{*gh}$. If $v^{gh} = u v_0^{gh}$ then we denote by $v^{*gh}$ the vector $u^\dagger v_0^{*gh}$. The ghost number $N^{gh}$ of any vector $v^{gh} \in F^{gh}$ is defined by $N^{gh}(v_0^{gh}) = 0$ and $N^{gh}(c(x)v) = N^{gh}(v) + 1$, $N^{gh}(b(x)v) = N^{gh}(v) - 1$. The ghost numbers of vectors in the dual module are similarly defined with $N^{gh}(v_0^{*gh}) = 0$. It is easily seen that $\langle u^* | v \rangle = 0$ if $N^{gh}(u^*) + N^{gh}(v) \neq 0$.

Let $C(g', V)$ be the complex $V \otimes F^{gh}$ for a $g'$-module $V$. We define the relative subcomplex $\hat C(g', V) = \{\omega \in C(g', V) \mid b(x)\omega = 0,\ x^{tot}\omega = 0 \text{ for } x \in B_{h'}\}$, and $\hat C^*(g', V)$ is the dual complex. If $\omega = v \otimes v^{gh}$ for $v \in V$, $v^{gh} \in F^{gh}$, then we denote by $\omega^*$ the vector $v \otimes v^{*gh}$. We decompose $d$ as
$$d = \hat d + \sum_{x \in B_{h'}} \left( x^{tot} c(x) + M(x) b(x) \right). \tag{4.9}$$
We have $d\omega = \hat d\omega$ for $\omega \in \hat C(g', V)$ and consequently on the relative subcomplex we may analyze the cohomology of $\hat d$ in place of $d$. The cohomology associated with $\hat d$, the semi-infinite or BRST relative cohomology, is sometimes denoted by $\hat H^{\infty/2+p}(g', V)$ to distinguish it from the conventional Lie algebra cohomology. We will, however, here for simplicity write $\hat H^p(g', V)$, where $p$ refers to elements $\omega$ with $N^{gh}(\omega) = p$. Our primary interest here will be for $V = L'_{\lambda,\tilde\lambda}$. But in order to gain knowledge of this case we will also study $V = M'_{\lambda,\tilde\lambda}$ and its submodules.

It will be convenient to make a classification of vectors in the complex $C(g', V)$ using the BRST operator. A central result due to Kugo and Ojima [7] states the following.

Theorem 4.1. Let $V$ be an irreducible module. Then a basis of $C(g', V)$ may be chosen so that for an element $\omega$ in this basis one of the following will be true.
(i) Singlet case: $\omega \in H^*(g', V)$ and $\langle \omega^* | \omega \rangle \neq 0$, $N^{gh}(\omega) = 0$.
(ii) Singlet pair case: $\omega \in H^*(g', V)$ and there exists an element $\psi \neq \omega$ such that $\psi \in H^*(g', V)$, $\langle \psi^* | \omega \rangle \neq 0$ and $N^{gh}(\psi) = -N^{gh}(\omega)$.
(iii) Quartet case: $\omega \notin H^*(g', V)$. There will exist four elements $\omega_1, \omega_2, \psi_1, \psi_2 \in C(g', V)$, where either $\omega = \omega_1$ or $\omega = \omega_2$, such that $\langle \psi_1^* | \omega_1 \rangle \neq 0$ and $\langle \psi_2^* | \omega_2 \rangle \neq 0$, $\omega_2 = d\omega_1$ and $\psi_1 = d\psi_2$ and $N^{gh}(\omega_1) = N^{gh}(\omega_2) - 1 = -N^{gh}(\psi_1) = -N^{gh}(\psi_2) - 1$.
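The ghost relation (4.1) and the ghost-number grading can be checked on a small toy model. The sketch below is only an illustration under simplifying assumptions: it uses three fermionic modes, labels them so that the pairing in (4.1) is diagonal (so that $\{c_i, b_j\} = \delta_{ij}$), and takes a vacuum annihilated by every $b_i$. This differs from the semi-infinite vacuum of (4.6), which is annihilated by some of the $c$'s as well, so the code is not the ghost module $F^{gh}$ of the paper. All names are hypothetical.

```python
from itertools import combinations

MODES = (0, 1, 2)  # toy labels for three ghost modes (an assumption)


def _sign(i, occ):
    # Jordan-Wigner sign: (-1)^(number of occupied modes below i)
    return -1 if sum(1 for j in occ if j < i) % 2 else 1


def apply_c(i, vec):
    """Toy c_i: fill mode i, raising the ghost number by one."""
    out = {}
    for occ, coeff in vec.items():
        if i not in occ:
            new = occ | {i}
            out[new] = out.get(new, 0) + _sign(i, occ) * coeff
    return out


def apply_b(i, vec):
    """Toy b_i: empty mode i, lowering the ghost number by one."""
    out = {}
    for occ, coeff in vec.items():
        if i in occ:
            new = occ - {i}
            out[new] = out.get(new, 0) + _sign(i, occ) * coeff
    return out


def add(u, v):
    """Sum of two vectors, dropping entries whose coefficient is zero."""
    out = dict(u)
    for key, coeff in v.items():
        out[key] = out.get(key, 0) + coeff
    return {k: c for k, c in out.items() if c != 0}


def anticommutator(i, j, vec):
    """Compute (c_i b_j + b_j c_i) applied to vec."""
    return add(apply_c(i, apply_b(j, vec)), apply_b(j, apply_c(i, vec)))


def ghost_number(occ):
    """In this toy model the ghost number is the number of filled modes."""
    return len(occ)


# All basis states, each labelled by its set of filled modes.
BASIS = [frozenset(s) for r in range(len(MODES) + 1)
         for s in combinations(MODES, r)]
```

On every basis state the anticommutator returns the state itself when $i = j$ and the zero vector otherwise, and each application of a toy $c_i$ that does not annihilate the state raises `ghost_number` by one, mirroring $N^{gh}(c(x)v) = N^{gh}(v) + 1$.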
There will exist an analogous classification on the relative subcomplex $\hat C(g', V)$ w.r.t. $\hat d$. In this classification all non-trivial elements in the cohomology will be singlets or singlet pairs. It should be remarked that the condition of irreducibility is essential for the theorem. In the following section, we will find that for $V$ being a reducible Verma module the above classification does not hold. In particular, the non-trivial elements of the cohomology for non-zero ghost numbers will for this case not be members of singlet pairs.

Define Jantzen's filtration for $\xi_\epsilon \in \hat C(g', M'_\epsilon)$ as follows. Let $\lambda_\epsilon = \lambda + \epsilon z$ and $\tilde\lambda_\epsilon = \tilde\lambda - \epsilon z$. Then $M'^{(n)} = \{v'_\epsilon \in M(\lambda_\epsilon) \otimes \tilde M(\tilde\lambda_\epsilon) \mid \langle w'^*_\epsilon | v'_\epsilon \rangle \text{ is divisible by } \epsilon^n \text{ for any } w'^*_\epsilon \in M^*(\lambda_\epsilon) \otimes \tilde M^*(\tilde\lambda_\epsilon)\}$. We denote by $\xi^*_\epsilon$ the vector $v_\epsilon \otimes \tilde v_\epsilon \otimes v^{*gh}$. An element $\xi_\epsilon$ is always assumed to be finite as $\epsilon \to 0$. We denote by $f(\epsilon) \sim \epsilon^n$ the leading order of a function $f(\epsilon)$ in the limit $\epsilon \to 0$. Note that our definition of Jantzen's filtration for $g'$ implies that $\lambda_\epsilon + \tilde\lambda_\epsilon$ is independent of $\epsilon$. This is required if the cohomology should have at least one non-trivial element for $\epsilon \neq 0$, namely the vacuum solution $\xi_{0\epsilon} = v_{0\epsilon} \otimes \tilde v_{0\epsilon} \otimes v_0^{gh}$. In the next section the following result will be needed.

Lemma 4.2. Let $\xi_1, \xi_2 \in \hat C(g', M'_\epsilon)$ be non-zero for $\epsilon = 0$ and $\hat d\xi_2 = g(\epsilon)\xi_1$, where $g(\epsilon) \sim 1$ or $\epsilon$. Let $n_1$ and $n_2$ be the largest integers for which $\xi_i \in \hat C(g', M'^{(n_i)})$, $i = 1, 2$. Then there exist $\zeta_1, \zeta_2 \in \hat C(g', M'_\epsilon)$ which are non-zero for $\epsilon = 0$ and satisfy
(i) $\langle \zeta_i^* | \xi_j \rangle \sim \epsilon^{n_i} \delta_{i,j} \neq 0$ for $\epsilon \neq 0$, $i = j = 1, 2$.
(ii) $\hat d\zeta_1 = f(\epsilon)\zeta_2$, where $f(\epsilon) \sim 1$ or $\sim \epsilon$.
(iii) $n_1, n_2$ are the largest integers for which $\zeta_i \in \hat C(g', M'^{(n_i)})$.
In addition, for $g(\epsilon) \sim 1$: $n_1 = n_2$ if and only if $f(\epsilon) \sim 1$, $n_1 = n_2 + 1$ if and only if $f(\epsilon) \sim \epsilon$. For $g(\epsilon) \sim \epsilon$ we have: $n_1 = n_2 - 1$ if and only if $f(\epsilon) \sim 1$, $n_1 = n_2$ if and only if $f(\epsilon) \sim \epsilon$.

Proof. Since $\xi_1 \in \hat C(g', M'^{(n_1)})$ for a largest integer $n_1$ and $M'_\epsilon$ is irreducible for $0 < |\epsilon| \ll 1$ there must exist one vector $\zeta_1 \in \hat C(g', M'^{(n_1)})$ with $\langle \xi_1^* | \zeta_1 \rangle \sim \epsilon^{n_1}$. Then
$$g(\epsilon) \langle \zeta_1^* | \xi_1 \rangle = \langle \zeta_1^* | \hat d\xi_2 \rangle = \langle \hat d\zeta_1^* | \xi_2 \rangle \tag{4.10}$$
implies $\hat d\zeta_1 = f(\epsilon)\zeta_2$, for some vector $\zeta_2$ satisfying $\langle \zeta_2^* | \xi_2 \rangle \neq 0$ and which is non-zero for $\epsilon = 0$. In addition, $f(\epsilon)$ is a non-singular function of $\epsilon$. From the fact that $\hat d$ is linear in the generators of $g'$ we can conclude that $f(\epsilon) \sim 1$ or $\epsilon$. Pick a basis of $\hat C(g', M'_\epsilon)$ such that $\xi_1, \xi_2$ are two of its elements. Denote the elements of the basis by $\xi_i$, $i = 1, 2, 3, \ldots$. Similarly we pick a basis of $\hat C(g', M'^*_\epsilon)$, $\zeta_i^*$, $i = 1, 2, 3, \ldots$. We choose it such that $\langle \zeta_i^* | \xi_j \rangle$ is non-zero only for $i = j$. Now since $\langle \zeta_i^* | \xi_2 \rangle = 0$ for $i \neq 2$ and $\xi_2 \in \hat C(g', M'^{(n_2)})$ we must have $\langle \zeta_2^* | \xi_2 \rangle \sim \epsilon^{n_2}$. This in turn implies, using $\langle \zeta_2^* | \xi_j \rangle = 0$ for $j \neq 2$ and the definition of Jantzen's filtration, that $\zeta_2 \in \hat C(g', M'^{(n_2)})$. We now conclude from $\langle \xi_1^* | \zeta_1 \rangle \sim \epsilon^{n_1}$, $\langle \xi_2^* | \zeta_2 \rangle \sim \epsilon^{n_2}$, Eq. (4.10) and $f(\epsilon) \sim 1$ or $\epsilon$ that for $g(\epsilon) \sim 1$ we have $\epsilon^{n_1} \sim \epsilon^{n_2} f(\epsilon) \sim \epsilon^{n_2}$ or $\epsilon^{n_2+1}$, while for $g(\epsilon) \sim \epsilon$ we have $\epsilon^{n_1} \sim \epsilon^{n_2-1} f(\epsilon) \sim \epsilon^{n_2-1}$ or $\epsilon^{n_2}$.

A standard tool in the analysis of the cohomology is a contracting homotopy operator. Let $\omega_0$ be a vacuum vector of $\hat C(g', M')$, i.e. $\omega_0 = v'_0 \otimes v_0^{gh}$, where $v'_0 = v_0 \otimes \tilde v_0$ and $v_0$, $\tilde v_0$ are primary highest weight vectors of $M$ and $\tilde M$, respectively. Consider an element $\omega \in \hat C(g', M')$ of the form $\omega = v' \otimes v^{gh}$ with $v' = u v_0 \otimes \tilde u \tilde v_0$, $u \in U(n_-)$, $\tilde u \in U(\tilde n_-)$ and $N^{gh}(v^{gh}) = n$. We write $u = u_m + u_{m-1} + \ldots + u_0$, where $u_i \in U(n_-)$ is a monomial of order $i$. Introduce a gradation $N_{gr}$. We define $N_{gr}(\omega_0) = 0$. Furthermore, $N_{gr}(\omega) =$
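$m - n$. We will get a filtration $\hat C(g', M') = \bigoplus_{N_{gr}} \hat C(g', M')_{N_{gr}}$. We now decompose $\hat d$ as $\hat d = d_0 + d_{-1}$, where $d_0 = \sum_{\alpha \in \Delta_+} c(e'_\alpha) f_\alpha$. We have $\hat d\omega = d_0\omega + (\text{lower order terms})$. Let $\omega_{p,q} \in \hat C(g', M')_{p+q-r}$ be of the form
$$\omega_{p,q} = f_{\alpha_1} \cdots f_{\alpha_p} v_0 \otimes \tilde v \otimes b(f'_{\beta_1}) \cdots b(f'_{\beta_q})\, c(f'_{\gamma_1}) \cdots c(f'_{\gamma_r})\, v_0^{gh}, \tag{4.11}$$
where $\alpha, \beta, \gamma \in \Delta_+$. The homotopy operator $\kappa_0$ is now defined by
$$\kappa_0 \omega_{p,q} = \frac{1}{p+q} \sum_{i=1}^{p} f_{\alpha_1} \cdots \widehat{f_{\alpha_i}} \cdots f_{\alpha_p} v_0 \otimes \tilde v \otimes b(f'_{\alpha_i})\, b(f'_{\beta_1}) \cdots b(f'_{\beta_q})\, c(f'_{\gamma_1}) \cdots c(f'_{\gamma_r})\, v_0^{gh}, \quad p \neq 0,$$
$$\kappa_0 \omega_{0,q} = 0, \tag{4.12}$$
where capped factors are omitted. It is now straightforward to verify
$$(d_0 \kappa_0 + \kappa_0 d_0)\,\omega_{p,q} = (1 - \delta_{p+q,0})\,\omega_{p,q} + (\text{lower order terms}). \tag{4.13}$$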
One may also define a gradation $\tilde N_{gr}$ using the elements of $U(\tilde n_-)$ in place of $U(n_-)$. We then have a corresponding decomposition $\hat d = \tilde d_0 + \tilde d_{-1}$ with $\tilde d_0 = \sum_{\alpha \in \Delta_+} c(e'_\alpha) \tilde f_\alpha$ and a homotopy operator $\tilde\kappa_0$.

5. The BRST Cohomology

We will now in detail study the semi-infinite relative cohomology associated with the BRST operator. The notation follows that of previous sections. $\omega_\epsilon, \xi_\epsilon, \ldots$ always denote elements of $\hat C(g', \ldots)$ that are finite in the limit $\epsilon \to 0$. Our starting point is Proposition 5.1 concerning the cohomology of Verma modules. This proposition was to our knowledge first given in [20], Proposition 2.29.

Proposition 5.1. Let $M'$ be a highest weight Verma module over $g'$. Then $\hat H^p(g', M') = 0$ for $p < 0$.

Proof ([2]). Let $\omega \in \hat H^p(g', M')$ and have a highest order term $\omega_n$ in the gradation $N_{gr}$. Then $0 = \hat d\omega = d_0\omega_n + (\text{lower order terms})$ and hence $d_0\omega_n = 0$ to leading order. Using Eq. (4.13) we conclude that $\omega_n = d_0(\kappa_0\omega_n) + (\text{lower order terms})$ and as a consequence of this, $\omega = \hat d(\kappa_0\omega_n) + (\text{lower order terms})$. Thus $\omega$ is a trivial element of $\hat H^p(g', M')$ to highest order. This may be iterated to lower orders and we find that $\omega \in \hat H^p(g', M')$ will be non-trivial only for $N_{gr}(\omega) \le 0$, which is impossible if $N^{gh}(\omega) < 0$.

Corollary 5.2 ([2]). Let $M'$ in Proposition 5.1 be irreducible. Then $\hat H^p(g', M') = 0$ for $p \neq 0$. Furthermore, $\omega \in \hat H^0(g', M')$ is the element $\omega = v_0 \otimes \tilde v_0 \otimes v_0^{gh}$, where $v_0$ and $\tilde v_0$ are primary highest weight vectors of weights $\lambda$ and $\tilde\lambda$, respectively, satisfying $\lambda + \tilde\lambda + \rho = 0$.

Corollary 5.3. Let $L'$ be the irreducible $g'$-module of $M'$. Let $\omega \in \hat C(g', M')$ be such that $0 \neq \pi_{L'}(\omega) \in \hat H^p(g', L')$, $p < 0$. Then
$$\hat d\omega = \nu, \tag{5.1}$$
where $\nu \in \hat H^{p+1}(g', M'^{(1)})$ and is non-zero.
ˆ = 0 in C(g ˆ 0 , M 0 ). Then Proposition 5.1 implies ω = Proof. ([2]) Assume first dω 0 0 0 ˆ ˆ ˆ ˆ 0 , M 0 /M 0(1) ), dη, η ∈ C(g , M ). Since ω ∈ C(g , M 0 /M 0(1) ) we must have η ∈ C(g ˆ ˆ = 0. which implies that ω is cohomologically trivial. Therefore, dω = ν 6= 0 and so dν p+1 0 0(1) 0 0 0 0(1) ˆ ˆ ˆ ˆ If ν 6∈ H (g , M ), then ν = dν for some ν ∈ C(g , M ) and d(ω − ν 0 ) = 0. Proposition 5.1 then implies that πL0 (ω) is a trivial element of Hˆ p (g 0 , L0 ). The following lemma is partly the converse of Corollary 5.3. ˆ = ν in C(g ˆ 0 , M 0 ) with ν ∈ Hˆ p+1 (g 0 , M 0(1) ) and ˆ 0 , M 0 ), dω Lemma 5.4. Let ω ∈ C(g p 0 0 ˆ πL0 (ω) 6= 0, then πL0 (ω) ∈ H (g , L ). ˆ = ν with ν ∈ Hˆ p+1 (g 0 , M 0(1) ) implies that dπ ˆ L0 (ω) = 0. Secondly, Proof. Firstly, dω 0 ˆ ˆ + ν0 ˆ 0 0 0 assume πL (ω) to be trivial, i.e. πL (ω) = dπL (ψ), ψ ∈ C(g , M 0 ). Then ω = dψ ˆ = dν ˆ 0 . This is a contradiction to ˆ 0 , M 0 ), with ν 0 ∈ C(g ˆ 0 , M 0(1) ), and so ν = dω in C(g the assumption ν ∈ Hˆ p+1 (g 0 , M 0(1) ). Lemma 5.5. dim Hˆ p+1 (g 0 , M 0(1) ) = dim Hˆ p (g 0 , L0 ) for p ≤ −2. ˆ Proof. Let ν ∈ Hˆ p+1 (g 0 , M 0(1) ) with p ≤ −2, then by Proposition 5.1 ν = dω, 0 0 p 0 0 ˆ ˆ ω ∈ C(g , M ) and πL0 (ω) 6= 0. Lemma 5.4 then implies πL0 (ω) ∈ H (g , L ). We have thus proved that dim(Hˆ p+1 (g 0 , M 0(1) )) ≤ dim(Hˆ p (g 0 , L0 )). We now prove that ˆ 0, M 0) the dimensionalities are in fact the same. Assume two elements ω1 , ω2 ∈ C(g p 0 0 ˆ 0 0 with πL (ω1 ), πL (ω2 ) ∈ H (g , L ), corresponding to the same element ν. By Corolˆ 1 = ν1 and dω ˆ 2 = ν2 , where ν1 = ν2 as elements ˆ 0 , M 0 ): dω lary 5.3 we have in C(g p+1 0 0(1) ˆ 1 − ω2 ) = ν1 − ν2 = dν ˆ 0, ˆ in H (g , M ). Subtracting the equations yields d(ω 0 0 0(1) 0 ˆ ˆ ν ∈ C(g , M ), which by Proposition 5.1 implies ω1 − ω2 − ν = d(. . .). πL0 (ω1 ) and πL0 (ω2 ) are therefore identical elements in Hˆ p (g 0 , L0 ). The results obtained so far are of importance for negative ghost numbers. 
We now turn to results relevant for positive ghost numbers. We will connect the two cases by the use of Jantzen’s perturbation, Theorem 4.1 and Lemma 4.2. ˆ = ν, ν ∈ ˆ 0 , M 0 ) with πL0 (ω) ∈ Hˆ p (g 0 , L0 ), satisfying dω Lemma 5.6. Let ω ∈ C(g 0 0(1) ˆ C(g , M ). Then: ˆ 0 , M 0 ) with πL0 (ψ) ∈ Hˆ −p (g 0 , L0 ) and hψ ∗ |ωi = 6 0. (i) There exists ψ ∈ C(g ˆ 0 , M 0(1) ) of opposite ghost number of ν, (ii) With ψ as in (i): There exists χ ∈ C(g ˆ = ψ . satisfying dχ ˆ 0 , M 0(2) ), where χ is defined as in (ii). (iii) χ, ν 6∈ C(g ˆ = 0. (iv) With ψ as in (i): dψ (v) p ≤ 0. Proof. (i) follows directly from Theorem 4.1. (ii) and (iii) follow from Theorem 4.1 and Lemma 4.2 using ω = ξ2 , ν = ξ1 and g() ∼ 1. (iv) follows by applying dˆ to the ˆ = ψ , using dˆ2 = 0 and taking the limit → 0. Finally (v) may be proved equation dχ ˆ by contradiction. If p > 0, then by (iv) and Corollary 5.3 ψ is d−exact and, hence, so is ω. ˆ 0 , M 0(1) ) be such that ˆ 0 , M 0 ), πL0 (ψ) 6= 0, and χ ∈ C(g Lemma 5.7. Let ψ ∈ C(g ˆ = ψ . Then χ ∈ C(g ˆ 0 , M 0(1) /M 0(2) ), πL0 (ψ) ∈ Hˆ p (g 0 , L0 ) and p ≥ 0. Conversely, dχ ˆ = ψ . ˆ 0 , M 0(1) ) such that dχ let πL0 (ψ) ∈ Hˆ p (g 0 , L0 ), p ≥ 1, then there exists χ ∈ C(g
ˆ = ψ . We apply Lemma 4.2 with ξ1 = ψ, ξ2 = χ and g() ∼ . Proof. Assume dχ Then n1 = 0, n2 ≥ 1 and by the lemma there exist two vectors ω and ν such that ˆ = f ()ν , with f () ∼ 1 or . Furthermore f () ∼ hω∗ |ψ i ∼ 1, hν∗ |χ i ∼ and dω 1, since otherwise n1 = n2 . This in turn implies n2 = 1, by Lemma 4.2 (iii), and ˆ 0 , M 0(1) /M 0(2) ). We now show that ν is not exact in C(g ˆ 0 , M 0(1) ). Assume χ, ν ∈ C(g 0 0(1) 0(2) ˆ with η ∈ C(g ˆ + h()ν0 , where ˆ , M /M ). Then ν = dη the contrary, i.e. ν = dη ˆ 0 , M 0(1) ) and h() is a polynomial in such that h(0) = 0. This implies that ν 0 ∈ C(g 0 ˆ 0 = f ()h()ν0 . Now lim→0 hω0∗ |ψ i 6= 0 since ω 0 and ω ω = ω − f ()η satisfies dω 0 ˆ differ by an element in C(g , M 0(1) ). This is a contradiction as can be seen from 1ˆ 0∗ 1 ˆ 0∗ 1 hω0∗ |ψ i = hω0∗ | dχ i = hdω | χ i = hf ()h()ν | χ i −→ 0
for → 0.
Thus ν ∈ Hˆ −p+1 (g 0 , M 0(1) ). Lemma 5.4 then gives πL0 (ω) ∈ Hˆ −p (g 0 , L0 ), which implies πL0 (ψ) ∈ Hˆ p (g 0 , L0 ). The condition p ≥ 0 follows from Corollary 5.3 and the fact that ˆ = 0 in C(g ˆ 0 , M 0 ). dψ We now prove the converse statement. Let πL0 (ψ) ∈ Hˆ p (g 0 , L0 ), p ≥ 1. Pick a ˆ 0 , M 0 ), πL0 (ω) ∈ basis as in Theorem 4.1 so that ψ is one of its elements and ω ∈ C(g −p 0 0 ∗ ˆ ˆ H (g , L ), hω |ψi 6= 0, is another. Corollary 5.3 implies dω = ν and then Lemma 5.6 gives the assertion. ˆ = ψ and ˆ 0 , M 0 ) be such that N gh (ψ) ≥ 1, dχ Lemma 5.8. Let ψ and χ ∈ C(g ˆ 0 , M 0(1) /M 0(2) ). Then πL0 (ψ) ∈ Hˆ p (g 0 , L0 ). χ ∈ C(g Proof. By Lemma 5.7 it is sufficient to prove that πL0 (ψ) 6= 0. Assume the contrary i.e. ˆ 0 , M 0(1) ). Then Lemma 4.2 implies that there exist two vectors ω and ν satisfying ψ ∈ C(g ˆ = f ()ν in C(g ˆ 0 , M 0 ), where f () ∼ . In addition, ψ, ω, ν ∈ C(g ˆ 0 , M 0(1) /M 0(2) ) dω ˆ 0 and hω∗ |ψ i ∼ , hν∗ |χ i ∼ . Now N gh (ω) ≤ −1, so that by Proposition 5.1, ω = dω 0 0 0 0 ˆ for some vector ω . We then have ω = dω + h()ν for some vector ν , which is non-singular for = 0 and h() is a polynomial of such that h(0) = 0. This implies ˆ 0 , which by comparing with dω ˆ = f ()ν yields h() ∼ and ν ∼ dν ˆ 0 . ˆ = h()dν dω Then ˆ 0∗ |χ i = hν0∗ |dχ ˆ i = hν0∗ |ψ i, ∼ hν∗ |χ i ∼ hdν ˆ 0 , M 0(1) ). so that hν0∗ |ψ i ∼ 1, which contradicts ψ ∈ C(g
Proposition 5.9. Hˆ p (g 0 , L0 ) for p ≥ 1 are represented by elements of the form v ⊗ v˜ 0 ⊗ ˜ v0 is a primary v gh , or equivalently of the form v0 ⊗ v˜ ⊗ v gh , where v ∈ L, v˜ ∈ L, highest weight vector w.r.t. g, v˜ 0 is a primary highest weight vector w.r.t. g˜ and v gh satisfies c(x)v gh = 0, x ∈ n+ . Proof. Let Hˆ p (g 0 , L0 ), p ≥ 1 be non-zero. Then by Theorem 4.1 there exists ω ∈ ˆ = ν (Corollary 5.3) with ν ∈ ˆ 0 , M 0 ) such that πL0 (ω) ∈ Hˆ −p (g 0 , L0 ). We have dω C(g ˆ 0 , M 0(1) ). It follows by Lemma 5.6 (iv) that ψ ∈ C(g ˆ 0 , M 0 ) with πL0 (ψ) ∈ Hˆ p (g 0 , L0 ), C(g ˆ = 0 in C(g ˆ 0 , M 0 ). We can now use the gradation N˜ gr introduced in the will satisfy dψ previous section to decompose dˆ = d˜0 + d˜−1 and use the homotopy operator κ˜ 0 to successively eliminate highest order terms of ψ in this gradation. Since p ≥ 1 we will finally get an element of the form v ⊗ v˜ 0 ⊗ v gh . The alternative form is found by using the gradation Ngr .
Proposition 5.10. Hˆ 0 (g 0 , M 0 ) are represented by elements v ⊗ v˜ 0 ⊗v0gh , or equivalently ˜ v˜ 0 are highest weight vectors w.r.t. g and by the elements v0 ⊗ v˜ ⊗v0gh , where v, v0 and v, g, ˜ respectively, with v0 and v˜ 0 being primary, and v0gh is the ghost vacuum. Furthermore, the weights µ and µ˜ of the primary highest weight vectors v0 and v˜ 0 , respectively, satisfy µ + µ˜ + ρ = 0. ˆ = 0 and we can use the gradation N˜ gr and the Proof. Let ψ ∈ Hˆ 0 (g 0 , M 0 ). Then dψ homotopy operator as in the proof of Proposition 5.9 to conclude that since N gh (ψ) = 0 we must have ψ = v ⊗ v˜ 0 ⊗ v0gh . By using the gradation Ngr we get the alternative form. The condition on the weights is a consequence of htot (v ⊗ v˜ ⊗ v0gh ) = 0. Corollary 5.11. Hˆ 0 (g 0 , L0 ) are represented by elements of the form v0 ⊗ v˜ 0 ⊗ v0gh . Furthermore, the weights µ and µ˜ of the primary highest weight vectors v0 and v˜ 0 , respectively, satisfy µ + µ˜ + ρ = 0. ˆ = 0 in C(g ˆ 0 , M 0 ). ˆ 0 , M 0 ) and πL0 (ψ) ∈ Hˆ 0 (g 0 , L0 ). Assume first dψ Proof. Let ψ ∈ C(g ˆ Then the corollary follows directly from Proposition 5.10. Consider now dψ = ν 6= 0, ˆ = 0, ˆ 0 , M 0 ) such that dω where πL0 (ν) = 0. Then by Lemma 5.6 (iv) there exists ω ∈ C(g 0 0(1) ∗ ˆ ω 6∈ C(g , M ) and hω |ψi 6= 0. We may then apply Proposition 5.10 to ω, so that ω is of the form claimed in the corollary. As hω ∗ |ωi 6= 0, ω is a singlet representation of the BRST cohomology (cf. Theorem 4.1) and, hence, ψ and ω yield equivalent elements in H 0 (g 0 , L0 ). Theorem 5.12. A necessary and sufficient condition for Hˆ ±p (g 0 , L0 ), p ≥ 1, to be non-zero is either one of the following: ˆ 0 , M 0(2) ) and ν ∈ ˆ 0 , M 0(1) ) satisfying ν 6∈ C(g (i) There exists a vector ν ∈ C(g −p+1 0 0(1) ˆ H (g , M ). ˆ 0 , M 0(2) ), N gh (χ) = ˆ 0 , M 0(1) ) satisfying χ 6∈ C(g (ii) There exists a vector χ ∈ C(g ˆ ˆ p − 1, dχ = 0 and dχ 6= 0. In addition, dim Hˆ −p+1 (g 0 , M 0(1) ) = dim Hˆ ±p (g 0 , L0 ) , p ≥ 1. Proof. 
Necessary: (i) follows by Corollary 5.3 and Lemma 5.6 (iii). (ii) follows from (i) together with Lemma 5.6 (ii) and (iii). Sufficient: (i) For p > 1 we use Lemma 5.5. This also gives the last assertion of dimensionalities for these cases. For p = 1 we have ν ∈ Hˆ 0 (g 0 , M 0(1) ). We have two ˆ for some ψ ∈ C(g ˆ 0 , M 0 ), πL0 (ψ) 6= 0. possibilities. Either ν ∈ Hˆ 0 (g 0 , M 0 ) or ν = dψ ˆ 6= 0 from Proposition 5.10, so that we get case (ii) of the In the first case we have dν theorem, which is proved below. For the second possibility we use Lemma 5.4. ˆ = ψ ˆ = 0 and dχ ˆ 6= 0 implies, using that dˆ is linear in the generators of g 0 , dχ (ii) dχ ˆ for some ψ satisfying lim→0 ψ 6= 0. Proposition 5.8 then gives πL0 (ψ) ∈ H p (g 0 , L0 ). We finally prove the assertion concerning dimensionalities for the case p = 1. Asˆ 0 , M 0 ), with πL0 (ω1 ), πL0 (ω2 ) ∈ H −p (g 0 , L0 ) and sume first that there exist ω1 , ω2 ∈ C(g −p+1 0 0(1) ˆ 1 , ν2 = dω ˆ 2 (which is necessary by Corolˆ ν 1 , ν2 ∈ H (g , M ), satisfying ν1 = dω ˆ 1 − ω2 ) = ν1 − ν2 = d(. ˆ . .), lary 5.3), where ν1 = ν2 mod exact terms. This implies d(ω 0 0 so that by Proposition 5.1, πL (ω1 ) = πL (ω2 ) mod exact terms. Consider the opposite case, i.e. two different vectors ν1 and ν2 give the same element in Hˆ −1 (g 0 , L0 ). Write ˆ 1 = ν1 and dω ˆ 2 = ν2 . The requirement that πL0 (ω1 ) and πL0 (ω2 ) are equivalent eledω ±1 0 ˆ . .), where ν 0 ∈ C(g ˆ ˆ 0 , M 0(1) ). Then ments in H (g , L0 ) now implies ω1 = ω2 + ν 0 + d(. 0 ˆ ν1 = ν2 + dν .
Corollary 5.13. dimHˆ ±1 (g 0 , L0µµ˜ ) = 1 if l(µ) ˜ − l(µ) = 1 and µ and −µ˜ − ρ are on the ±1 0 ˆ same ρ-centered Weyl orbit, and dimH (g , L0µµ˜ ) = 0 otherwise. Proof. By Proposition 5.10 and Theorem 3.11 we have dimHˆ 0 (g 0 , M 0(1) ) = 1 if l(µ) ˜ − l(µ) = 0 and µ and −µ−ρ ˜ are on the same ρ-centered Weyl orbit and dimHˆ 0 (g 0 , M 0(1) ) = 0 otherwise. Then dimHˆ ±1 (g 0 , L0µµ˜ ) =dimHˆ 0 (g 0 , M 0(1) ) = 1 (Theorem 5.12). With the help of Proposition 5.10 we can easily construct ν as in Theorem 5.12 (i), which gives the corollary. ˜ − l(µ) 6= p, or if l(µ) ˜ − l(µ) = p and Theorem 5.14. Hˆ ±p (g 0 , L0µµ˜ ) = 0, p ≥ 0, if l(µ) µ and −µ˜ − ρ are not on the same ρ-centered Weyl orbit. Proof. The theorem is true for p = 0 by Corollary 5.11 and for p = 1 by Corollary 5.13. Assume the theorem to be true for Hˆ ±q (g 0 , L0µµ˜ ), 0 ≤ q ≤ p − 1 and consider q = p. Assume there exists ω ∈ Mµ0 µ˜ such that π(ω) ∈ Hˆ −p (g 0 , L0µµ˜ ) 6= 0. Let ˆ 0 , M 0(s) ). Then there exists grad(σ) = s if s is the largest integer for which σ ∈ C(g 0(1) −p+1 0 (g , Mµµ˜ ), grad(ν) = 1 (Theorem 5.12). Write ν = ν1 + ν2 + . . . νn , where ν ∈ Hˆ νi ∈ Vi , i = 1, 2, . . . , n, grad(νi ) = 1 and Vi are Verma or BG modules of primitive ˜ − l(µ) − 1 (Proposition 3.15). We may weights (µi , µ˜ i ). We have l(µ˜ i ) − l(µi ) = l(µ) assume that ν cannot be written as a sum ν 0 + ν 00 , where ν 0 , ν 00 ∈ Hˆ −p+1 (g 0 , Mµ0(1) µ˜ ) and unequal, grad(ν 0 ) = grad(ν 00 ) = 1, as this would yield two different elements in ˆ i = 0 for some value of i, then νi = d(. ˆ . .) (Proposition 5.10) and Hˆ −p (g 0 , L0µµ˜ ). If dν clearly ν − νi will correspond to the same element ω. Hence, we may restrict to νi with ˆ i 6= 0, i = 1, . . . , n. dν ˆ = 0 using the gradation Ngr . Let νˆi be the highest Consider now the equation dν order term of νi and Ngr (νi ) = Ni , i = 1, . . . , n. Let Nˆ = max{Ni }ni=1 and order so Pm that Ni = Nˆ for i ∈ {1, 2, . . . , m}, m ≤ n. Then d0 ( i=1 )νi = 0. 
As d0 νi ∈ Vi , this equation may only be solved if there exists at least one Vj such that d0 νˆ i ∈ Vi ∩ Vj . Let φV M i
φV M i be the g-homomorphism Vi → Mi0 , i = 1, . . . , n, where Mi are Verma modules of the same primitive weight as Vi . φV M i exists for all i (see the note after Theorem 3.10). Then d0 φV M i (νˆi ) ∈ Mi0 ∩Mj0 . This is only possible if d0 φV M i (νˆi ) ∈ Mi0(1) for all ˆ V M i (νˆi ) ∈ M 0(1) to highest order in Ngr and that there exists ηi ∈ M 0 i. This implies dφ i i ˆ ˆ 0 to leading order such that ηi = dφV M i (νi ). If there exists ηi0 ∈ Mi0(1) such that η = dη i ˆ V M i (νi ) − η 0 ) to leading order, which contradicts dφ ˆ V M i (νi ) = ξi , where ξi is then d(φ i non-exact in Mi0(1) . Hence, ηi ∈ Hˆ −p+2 (g 0 , Mµ0(1) ˜ i ) to highest order. Theorem 5.12 now iµ −p+1 0 0 ˆ (g , Li ) to highest order. The induction hypothesis asserts that πLi φV M i (νˆi ) ∈ H implies l(µ˜ i ) − l(µi ) = p − 1 and that µi and −µ˜ i − ρ lie on the same ρ-centered Weyl orbit. Then l(µ) ˜ − l(µ) = p, µ and −µ˜ − ρ lie on the same Weyl orbit (Theorem 3.11 and Proposition 3.15). Theorem 5.15. Let µ, µ˜ ∈ 0w be such that µ + ρ/2 and µ˜ + ρ/2 are regular and µ and −µ˜ − ρ are on the same ρ−centered Weyl orbit. Then Hˆ ±p (g 0 , Lµµ˜ ) 6= 0, where p = l(µ) ˜ − l(µ) ≥ 0. Proof. For p = 0 the theorem is given by Corollary 5.11 (cf. the proof of Theorem 5.14, where it is shown that µ + µ˜ + ρ = 0 implies l(µ) ˜ − l(µ) = 0). For p = 1 the theorem follows from Corollary 5.13. We proceed by induction on p. Assume the theorem to be true for 0 ≤ l(µ) ˜ − l(µ) ≤ p − 1. We will also assume the following to hold to this
order of $p$. Let $\omega \in M_{\mu\tilde\mu}$ be such that $\pi_L(\omega) \in \hat H^{-q}(g', L_{\mu\tilde\mu})$ for $0 \le q \le p - 1$. We then assume $\hat d\omega = \nu_1 + \nu_2 + \ldots + \nu_n$ with $\nu_i \in M_{\mu_i\tilde\mu_i}$ and $\mathrm{grad}(\nu_i) = 1$, $i = 1, \ldots, n$ (with $\mathrm{grad}(\ldots)$ defined as in Theorem 5.14). This assumption clearly holds for $p = 1$. We now consider $\mu, \tilde\mu$ such that $l(\tilde\mu) - l(\mu) = p \ge 2$ with $\tilde\mu + \rho/2$ and $\mu + \rho/2$ being regular.

Introduce the following notation. For the Verma module $M_{\mu\tilde\mu}$ we let $M_1, \ldots, M_n$ denote all submodules such that $M_i \subset M^{(1)}_{\mu\tilde\mu}$, $M_i \not\subset M^{(2)}_{\mu\tilde\mu}$, $i = 1, \ldots, n$. Denote by $(\mu_1\tilde\mu_1), \ldots, (\mu_n\tilde\mu_n)$ their respective highest weights. Let $\phi_i$ be a non-zero element of $\mathrm{Hom}_g(M_i, M_{\mu\tilde\mu})$, $i = 1, \ldots, n$. Let $M_{i_1 \ldots i_k} = M_{i_1} \cup \ldots \cup M_{i_k}$, $i_1, \ldots, i_k = 1, \ldots, n$, and let $\phi_{i_1 \ldots i_k}$ be a non-zero element of $\mathrm{Hom}_g(M_{i_1 \ldots i_k}, M_{\mu\tilde\mu})$.

Consider now $\omega_1 \in M_1$ with $\pi_L(\omega_1) \in \hat H^{-p+1}(g', L_{\mu_1\tilde\mu_1})$. By Theorem 3.11 and the induction hypothesis $\omega_1$ exists. Then $\hat d\omega_1 = \nu_1 + \ldots + \nu_s$, where $\nu_i \in M_{1,i} \subset M^{(1)}_1$ (induction hypothesis). As $\mathrm{grad}(\nu_i) = 1$ we have $\mathrm{grad}(\phi_i(\nu_i)) = 2$. Therefore, there will exist a union of Verma modules, $M_{2 \ldots k}$ say, such that $\phi_1(\nu_1 + \ldots + \nu_s) = \phi_{2 \ldots k}(\nu'_1 + \ldots + \nu'_t)$, $\nu'_1 + \ldots + \nu'_t \in M_{2 \ldots k}$. By Lemma 3.17, $M_{2 \ldots k}$ is non-zero and different from $M_1$. Thus, $\phi_1(\nu_1 + \ldots + \nu_s)$ may either be viewed as originating from an element $\nu_1 + \ldots + \nu_s$ in $M_1$ or from an element $\nu'_1 + \ldots + \nu'_t$ in $M_{2 \ldots k}$. As $\hat d(\nu'_1 + \ldots + \nu'_t) = 0$, there exists an element $\omega_{2 \ldots k} \in M_{2 \ldots k}$ such that $\nu'_1 + \ldots + \nu'_t = \hat d\omega_{2 \ldots k}$ (Proposition 5.1). From this it follows that $\hat d(\phi_{2 \ldots k}(\omega_{2 \ldots k}) - \phi_1(\omega_1)) = 0$. Define $\xi = \phi_{2 \ldots k}(\omega_{2 \ldots k}) - \phi_1(\omega_1)$, which must be non-zero as $M_1 \neq M_{2 \ldots k}$. We now prove that $\xi$ is a non-trivial element of $\hat H^{-p+1}(g', M'^{(1)})$, which by Theorem 5.12 proves our assertions for $l(\tilde\mu) - l(\mu) = p$ (including the additional induction assumption).

Assume the contrary, $\xi = \hat d\eta$ with $\eta \in M^{(1)}_{\mu\tilde\mu}$. Let $\xi = \xi_1 + \ldots + \xi_k$, where $\xi_i \in M_i$, $i = 1, \ldots, k$. By construction $\hat d\xi_i \in M^{(2)}_{\mu\tilde\mu}$, $i = 1, \ldots, k$. If $\hat d\xi_i = 0$ then the corresponding Verma module $M_i$ may be deleted from $M_{2 \ldots k}$ without affecting the construction of $\xi$ (as $\hat d\xi_i = 0$ implies $\xi_i = \hat d(\ldots)$). Hence, we may consider $\hat d\xi_i \neq 0$, $i = 1, \ldots, k$. To highest order in the gradation $\tilde N_{gr}$ the equation $\xi = \hat d\eta$ yields $\xi^{(N)} = \xi^{(N)}_1 + \ldots + \xi^{(N)}_k = \tilde d_0\eta^{(N)}$, where $\xi^{(N)}$ and $\eta^{(N)}$ are the leading terms in $\xi$ and $\eta$, respectively, and $\xi^{(N)}_i$ denotes the $N$th order term of $\xi_i$ (which is non-zero for at least one value of $i$). Generally, $\tilde d_0\gamma \in V$ only if $\gamma \in V$ for any Verma or BG module $V$. Therefore, $\eta^{(N)} = \eta^{(N)}_1 + \ldots + \eta^{(N)}_k$ and $\xi^{(N)}_i = \tilde d_0\eta^{(N)}_i$, $\eta^{(N)}_i \in M_i$, $i = 1, \ldots, k$. This implies $\tilde d_0\xi^{(N)} = 0$ and in turn $\hat d\xi_i = 0$, which is a contradiction.

Remark 1. Results similar to Theorem 5.15 may be obtained for weights $\mu + \rho/2$ and $\tilde\mu + \rho/2$ being singular, provided the corresponding Verma modules satisfy the multiplicity condition of Lemma 3.17. It is clear, however, that this generalization does not hold for all singular cases.

Remark 2. The proof of Theorem 5.15 also provides an explicit method for finding the elements of the cohomology for negative ghost numbers. It is the same method as was presented in ref. [3].

Acknowledgement. I would like to thank Henric Rhedin for stimulating discussions during the progression of this work.
References
1. Karabali, D. and Schnitzer, H.: BRST quantization of the gauged WZW action and coset conformal field theories. Nucl. Phys. B329, 649–666 (1990)
2. Hwang, S. and Rhedin, H.: The BRST formulation of G/H WZNW models. Nucl. Phys. B406, 165–184 (1993)
3. Hwang, S. and Rhedin, H.: Construction of BRST invariant states in G/H models. Phys. Lett. B350, 38–43 (1995) (hep-th/9501084)
4. Malikov, F.G., Feigin, B.L. and Fuks, D.B.: Singular vectors in Verma modules over Kac–Moody algebras. Funkt. Anal. Ego Prilozh. 20, 25–37 (1986) (English translation in Funkt. Anal. Appl. 20, 103–113 (1986))
5. Jantzen, J.C.: Moduln mit einem höchsten Gewicht. Lecture Notes in Mathematics, Eds. A. Dold and B. Eckmann, Berlin–Heidelberg–New York: Springer-Verlag, 1979
6. Rocha-Caridi, A. and Wallach, N.R.: Projective modules over graded Lie algebras I. Math. Z. 180, 151–177 (1982)
7. Kugo, T. and Ojima, I.: Local covariant operator formalism of non-abelian gauge theories and quark confinement problem. Suppl. Prog. Theor. Phys. 66, 1–130 (1979)
8. Rocha-Caridi, A. and Wallach, N.R.: Highest weight modules over graded Lie algebras: Resolutions, filtrations and character formulas. Trans. Am. Math. Soc. 277, 133–162 (1983)
9. Kac, V.G. and Peterson, D.: Infinite dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–264 (1984)
10. Kac, V.G. and Kazhdan, D.A.: Structure of representations with highest weight of infinite-dimensional Lie algebras. Adv. in Math. 34, 97–108 (1979)
11. Bernstein, I.N., Gel'fand, I.M. and Gel'fand, S.I.: Structure of representations generated by vectors of highest weight. Funct. Anal. App. 5, 1–8 (1971)
12. Conze, N. and Dixmier, J.: Idéaux primitifs dans l'algèbre enveloppante d'une algèbre de Lie semisimple. Bull. Sc. Math 96, 339–351 (1972)
13. Kac, V.G.: Infinite dimensional Lie algebras. Third edition, Cambridge: Cambridge Univ. Press, 1990
14. Dixmier, J.: Enveloping Algebras. North Holland Mathematical Library, Amsterdam: North Holland Publ. Co., 1977
15. Verma, D.-N.: Structure of certain induced representations of complex semisimple Lie algebras. Bull. Amer. Math. Soc. 74, 160–166 (1968)
16. Humphreys, J.: Introduction to Lie algebras and representation theory. Revised edition, Berlin–Heidelberg–New York: Springer-Verlag, 1980
17. Hwang, S. and Marnelius, R.: BRST symmetry and a general ghost decoupling theorem. Nucl. Phys. B320, 476–486 (1989)
18. Enright, T.J.: On the fundamental series of a real semisimple Lie algebra: Their irreducibility, resolutions and multiplicity formulae. Ann. of Math. 110, 1–82 (1979)
19. Bernstein, I.N., Gel'fand, I.M. and Gel'fand, S.I.: Differential operators on the base affine space and the study of g-modules. In: I.M. Gel'fand, ed., Publ. of the 1971 summer school in Math., Janos Bolyai Math. Soc., Budapest, pp. 21–64
20. Lian, B.H. and Zuckermann, G.: BRST cohomology and highest weight vectors. Comm. Math. Phys. 135, 547–580 (1991)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 613 – 630 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Zeta-Function Regularization, the Multiplicative Anomaly and the Wodzicki Residue
Emilio Elizalde¹, Luciano Vanzo², Sergio Zerbini²
¹ Unitat de Recerca, CSIC, IEEC, Edifici Nexus 201, Gran Capità 2–4, 08034 Barcelona, Spain and Departament ECM and IFAE, Facultat de Física, Universitat de Barcelona, Diagonal 647, 08028 Barcelona, Spain. E-mail: [email protected]
² Dipartimento di Fisica, Università di Trento, and Istituto Nazionale di Fisica Nucleare, Gruppo Collegato di Trento, Italia. E-mail: [email protected], [email protected]
Received: 3 February 1997 / Accepted: 5 November 1997
Abstract: The multiplicative anomaly associated with the zeta-function regularized determinant is computed for the Laplace-type operators $L_1 = -\Delta + V_1$ and $L_2 = -\Delta + V_2$, with $V_1$, $V_2$ constant, in a D-dimensional compact smooth manifold $M_D$, making use of several results due to Wodzicki and by direct calculations in some explicit examples. It is found that the multiplicative anomaly is vanishing for D odd and for D = 2. An application to the one-loop effective potential of the O(2) self-interacting scalar model is outlined.
1. Introduction

Within the one-loop or external field approximation, the importance of zeta-function regularization for functional determinants, as introduced in [1], is well known: it is a powerful tool for dealing with the ambiguities (ultraviolet divergences) present in relativistic quantum field theory (see for example [2]-[4]). It allows one to give a meaning, in the sense of analytic continuation, to the determinant of a differential operator which, as the product of its eigenvalues, is formally divergent. For the sake of simplicity we shall here restrict ourselves to scalar fields. The one-loop Euclidean partition function, regularised by zeta-function techniques, reads [5]
\[ \ln Z = -\frac{1}{2}\ln\det\frac{L_D}{\mu^2} = \frac{1}{2}\zeta'(0|L_D) + \frac{1}{2}\zeta(0|L_D)\ln\mu^2 , \]
where $\zeta(s|L_D)$ is the zeta function related to $L_D$ (typically an elliptic differential operator of second order), $\zeta'(0|L_D)$ its derivative with respect to $s$, and $\mu^2$ a renormalization scale. Use is made of the fact that the analytically continued zeta-function is generally regular at $s = 0$, so that its derivative there is well defined. When the manifold is smooth and compact, the spectrum is discrete and one has
\[ \zeta(s|L_D) = \sum_i \lambda_i^{-2s} , \]
$\lambda_i^2$ being the eigenvalues of $L_D$. As a result, one can make use of the relationship between the zeta-function and the heat-kernel trace via the Mellin transform and its inverse. For $\mathrm{Re}\, s > D/2$, one can write
\[ \zeta(s|L_D) = \mathrm{Tr}\, L_D^{-s} = \frac{1}{\Gamma(s)}\int_0^\infty t^{s-1} K(t|L_D)\, dt , \tag{1.1} \]
\[ K(t|L_D) = \frac{1}{2\pi i}\int_{\mathrm{Re}\, s > D/2} t^{-s}\,\Gamma(s)\,\zeta(s|L_D)\, ds , \tag{1.2} \]
where $K(t|L_D) = \mathrm{Tr}\exp(-tL_D)$ is the trace of the heat operator. The previous relations are valid also in the presence of zero modes, with the replacement $K(t|L_D) \to K(t|L_D) - P_0$, $P_0$ being the projector onto the zero modes. A heat-kernel expansion argument leads to the meromorphic structure of $\zeta(s|L_D)$ and, as we have anticipated, it is found that the analytically continued zeta-function is regular at $s = 0$ and thus its derivative is well defined. Furthermore, in practice all the operators may be considered to be trace-class. In fact, if the manifold is compact this is true and, if the manifold is not compact, the volume divergences can be easily factorized. Thus
\[ K_t(L_D) = \int dV_D\, K_t(L_D)(x) \tag{1.3} \]
and
\[ \zeta(L_D, z) = \int dV_D\, \zeta(L_D|z)(x) , \tag{1.4} \]
where $K_t(L_D)(x)$ and $\zeta(L_D|z)(x)$ are the heat-kernel and the local zeta-function, respectively. However, if an internal symmetry is present, the scalar field is vector valued, i.e. $\phi_i$, and the simplest model is the O(2) symmetry associated with self-interacting charged fields in $R^4$. The Euclidean action is
\[ S = \int dx^4 \left[ \phi_i\left(-\Delta + m^2\right)\phi_i + \frac{\lambda}{4!}(\phi^2)^2 \right] , \tag{1.5} \]
where $\phi^2 = \phi_k\phi_k$ is the O(2) invariant. The Euclidean small disturbances operator reads
\[ A_{ik} = L_{ik} + \frac{\lambda}{6}\Phi^2\delta_{ik} + \frac{\lambda}{3}\Phi_i\Phi_k , \qquad L_{ik} = \left(-\Delta + m^2\right)\delta_{ik} , \tag{1.6} \]
in which $\Delta$ is the Laplace operator and $\Phi$ the background field, assumed to be constant. Thus, one is actually dealing with a matrix-valued elliptic differential operator. In this case, the partition function is [6]
\[ \ln Z = -\ln\det\frac{A_{ik}}{\mu^2} = -\ln\det\left[ \frac{(L + \frac{\lambda}{2}\Phi^2)}{\mu^2}\,\frac{(L + \frac{\lambda}{6}\Phi^2)}{\mu^2} \right] . \tag{1.7} \]
As a consequence, one has to deal with the product of two elliptic differential operators. In the case of two matrices, one has
\[ \ln\det(AB) = \ln\det A + \ln\det B . \tag{1.8} \]
Usually the way one proceeds is by formally assuming the validity of the above relation for differential operators. This may be quite ambiguous, since one necessarily has to employ a regularization procedure. In fact, it turns out that the zeta-function regularized determinants do not satisfy the above relation and, in general, there appears the so-called multiplicativity (or just multiplicative) anomaly [7, 8]. In terms of $F(A, B) \equiv \det(AB)/(\det A \det B)$ [8], it is defined as
\[ a_D(A, B) = \ln F(A, B) = \ln\det(AB) - \ln\det(A) - \ln\det(B) , \tag{1.9} \]
in which the determinants of the two elliptic operators, $A$ and $B$, are assumed to be defined (e.g., regularized) by means of the zeta-function [1]. It should be noted that the non-vanishing of the multiplicative anomaly implies that the relation
\[ \ln\det A = \mathrm{Tr}\ln A \tag{1.10} \]
does not hold, in general, for elliptic operators like $A = BC$.

It turns out that this multiplicative anomaly can be expressed by means of the non-commutative residue associated with a classical pseudo-differential operator, known as the Wodzicki residue [9]. Its important role in physics has been recognized only recently. In fact, within the non-commutative geometrical approach to the standard model of the electroweak interactions [10, 11], the Wodzicki residue is the unique extension of the Dixmier trace (necessary to write down the Yang-Mills action functional) to the larger class of pseudo-differential operators (ΨDO) [12]. Other recent contributions along these lines are [13–15]. Furthermore, a proposal to make use of the Wodzicki formulae as a practical tool in order to determine the singularity structure of zeta-functions has appeared in [16], and the connection between the commutator anomalies of current algebras and the Wodzicki residue has been found in [17].

The purpose of the present paper is to obtain explicitly the multiplicative anomaly for the product of two Laplace-like operators, by direct computations and by making use of several results due to Wodzicki, and to investigate the relevance of these concepts in physical situations. As a result, the multiplicative anomaly will be found to be vanishing for D odd and also for D = 2, being actually present for D > 2, with D even.

The contents of the paper are the following. In Sect. 2 we present some elementary computations in order to show the highly non-trivial character of a brute force approach to the evaluation of the multiplicative anomaly associated with two differential operators (even with very simple ones). In Sect. 3 we briefly recall several results due to Wodzicki, concerning the non-commutative residue and a fundamental formula expressing the multiplicative anomaly in terms of the corresponding residue of a suitable pseudo-differential operator. In Sect. 4, the Wodzicki formula is used in the computation of the multiplicative anomaly in $R^D$ and, as an example, the O(2) model in $R^4$ is investigated. In Sect. 5, a standard diagrammatic analysis of the O(2) model is discussed and evidence for the presence of the multiplicative anomaly at this diagrammatic level is given. In Sect. 6 we treat the case of an arbitrary compact smooth manifold without boundary. Some final remarks are presented in the Conclusions. In the Appendix a proof of the multiplicative anomaly formula is outlined.
2. Direct Calculations

Motivated by the example discussed in the introduction, one might try to perform a direct computation of the multiplicative anomaly in the case of the two self-adjoint elliptic commuting operators $L_p = -\Delta + V_p$, $p = 1, 2$, in $M_D$, with $V_p$ constant. Actually, we could deal with the shifts of two elliptic ΨDOs. For the sake of simplicity, we may put $\mu^2 = 1$ and consider all the quantities to be dimensionless. At the end, one can easily restore $\mu^2$ by simple dimensional considerations. In order to compute the multiplicative anomaly, one needs to obtain the zeta-functions of the operators. Let us begin with $M_D$ smooth and compact without boundary (the boundary case can be treated along the same lines) and let us try to express $\zeta(s|L_1L_2)$ as a function of $\zeta(s|L_p)$. If we denote $L_0 = -\Delta$ and by $\lambda_i$ its non-negative, discrete eigenvalues, the spectral theorem yields
\[ \zeta(s|L_1L_2) = \sum_i \left[(\lambda_i + V_1)(\lambda_i + V_2)\right]^{-s} . \tag{2.1} \]
Making use of the identity
\[ (\lambda_i + V_1)(\lambda_i + V_2) = (\lambda_i + V_+)^2 - V_-^2 , \tag{2.2} \]
with $V_+ = (V_1 + V_2)/2$ and $V_- = (V_1 - V_2)/2$, and noting that
\[ \frac{V_-^2}{(\lambda_i + V_+)^2} < 1 \tag{2.3} \]
for every individual $\lambda_i$, the binomial theorem gives
\[ \left[(\lambda_i + V_1)(\lambda_i + V_2)\right]^{-s} = \sum_{k=0}^{\infty} \frac{\Gamma(s + k)}{k!\,\Gamma(s)}\, V_-^{2k}\, (\lambda_i + V_+)^{-2s-2k} , \tag{2.4} \]
an absolutely convergent series expansion, valid without further restriction. Let us assume that $\mathrm{Re}\, s$ is large enough in order to safely commute the sum over $i$ with the sum over $k$. From the equations above, we get
\[ \zeta(s|L_1L_2) = \zeta(2s|L_0 + V_+) + \sum_{k=1}^{\infty} \frac{\Gamma(s + k)}{k!\,\Gamma(s)}\, V_-^{2k}\, \zeta(2s + 2k|L_0 + V_+) . \tag{2.5} \]
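As a quick numerical illustration of Eqs. (2.1)–(2.5), the following Python sketch compares the direct spectral sum with the truncated binomial series. The toy spectrum $\lambda_n = n^2$, the cutoffs, and all function names are assumptions made only for this illustration and do not come from the paper.

```python
import math

def gamma_ratio(s, k):
    """Return Gamma(s + k) / (k! * Gamma(s)), written as a finite product."""
    coeff = 1.0
    for r in range(k):
        coeff *= (s + r) / (r + 1)
    return coeff

def zeta_shifted(z, eigenvalues, shift):
    """Spectral sum of (lambda_i + shift)^(-z), i.e. zeta(z | L0 + shift)."""
    return sum((lam + shift) ** (-z) for lam in eigenvalues)

def zeta_product_direct(s, eigenvalues, v1, v2):
    """Eq. (2.1): sum of [(lambda_i + V1)(lambda_i + V2)]^(-s)."""
    return sum(((lam + v1) * (lam + v2)) ** (-s) for lam in eigenvalues)

def zeta_product_series(s, eigenvalues, v1, v2, kmax=60):
    """Eq. (2.5), truncated after the term k = kmax (k = 0 is zeta(2s | L0 + V+))."""
    v_plus = 0.5 * (v1 + v2)
    v_minus = 0.5 * (v1 - v2)
    return sum(
        gamma_ratio(s, k) * v_minus ** (2 * k)
        * zeta_shifted(2 * s + 2 * k, eigenvalues, v_plus)
        for k in range(kmax + 1)
    )

# Hypothetical toy spectrum (an assumption, not the paper's): lambda_n = n^2.
toy_spectrum = [float(n * n) for n in range(200)]
direct = zeta_product_direct(1.5, toy_spectrum, 3.0, 1.0)
series = zeta_product_series(1.5, toy_spectrum, 3.0, 1.0)
print(direct, series)
```

With $V_1 = 3$ and $V_2 = 1$ the expansion parameter in (2.3) is at most 1/4, so the truncated series should agree with the direct sum to rounding accuracy; the check says nothing about the analytic continuation to $s = 0$.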
The series (2.5) is convergent for large $\mathrm{Re}\, s$ and provides the sought-for analytical continuation to the whole complex plane. To go further, we note that, when $|c| < \lambda_1$ (the smallest non-vanishing eigenvalue of $L$), one has
\[ \zeta(s|L + c) = \zeta(s|L) + \sum_{k=1}^{\infty} \frac{\Gamma(s + k)}{k!\,\Gamma(s)}\, (-c)^k\, \zeta(s + k|L) . \tag{2.6} \]
Let us use this expression for $L_1$ and $L_2$. Since
\[ V_1 = V_+ + V_- , \qquad V_2 = V_+ - V_- , \tag{2.7} \]
one has
\[ \zeta(s|L_1) = \zeta(s|L_0 + V_+ + V_-) = \zeta(s|L_0 + V_+) + \sum_{k=1}^{\infty} \frac{\Gamma(s + k)}{k!\,\Gamma(s)}\, (-V_-)^k\, \zeta(s + k|L_0 + V_+) , \tag{2.8} \]
and
\[ \zeta(s|L_2) = \zeta(s|L_0 + V_+ - V_-) = \zeta(s|L_0 + V_+) + \sum_{k=1}^{\infty} \frac{\Gamma(s + k)}{k!\,\Gamma(s)}\, (V_-)^k\, \zeta(s + k|L_0 + V_+) . \tag{2.9} \]
For $s = 0$, there are poles, but adding the two zeta-functions for suitable $\mathrm{Re}\, s$ and making the separation between $k$ odd and $k$ even, all the terms associated with $k$ odd cancel. As a result
\[ \zeta(s|L_1) + \zeta(s|L_2) = 2\zeta(s|L_0 + V_+) + 2\sum_{m=1}^{\infty} \frac{\Gamma(s + 2m)}{(2m)!\,\Gamma(s)}\, (V_-)^{2m}\, \zeta(s + 2m|L_0 + V_+) . \tag{2.10} \]
For suitable $\mathrm{Re}\, s$, from Eqs. (2.5) and (2.10) we may write
\[ \zeta(s|L_1L_2) - \zeta(s|L_1) - \zeta(s|L_2) = \zeta(2s|L_0 + V_+) - 2\zeta(s|L_0 + V_+) + \sum_{m=1}^{\infty} \frac{(V_-)^{2m}}{\Gamma(s)} \left[ \frac{\Gamma(s + m)}{m!}\, \zeta(2s + 2m|L_0 + V_+) - 2\,\frac{\Gamma(s + 2m)}{(2m)!}\, \zeta(s + 2m|L_0 + V_+) \right] . \tag{2.11} \]
The multiplicative anomaly is minus the derivative with respect to $s$ in the limit $s \to 0$. Thus, it is present only when there are poles of the zeta functions evaluated at positive integer numbers bigger than 2. From the Seeley theorem, the meromorphic structure of the zeta function related to an elliptic operator is known, also in manifolds with boundary, the residues at the poles being simply related to the Seeley-De Witt heat-kernel coefficients $A_r$. For example, for a D-dimensional manifold without boundary one has [18]
\[ \zeta(z|L) = \frac{1}{\Gamma(z)} \sum_{r=0}^{\infty} \frac{A_r}{z + r - \frac{D}{2}} + \frac{J(z)}{\Gamma(z)} , \tag{2.12} \]
$J(z)$ being the analytical part. Since there are no poles at $s = 0$ for D odd and for D = 2 in the zeta functions appearing on the r.h.s. of Eq. (2.11), we can take the derivative at $s = 0$, i.e.
\[ a_D(L_1, L_2) = \sum_{m=1}^{\infty} (V_-)^{2m}\, \zeta(2m|L_0 + V_+) \left[ \frac{\Gamma(m)}{\Gamma(m + 1)} - 2\,\frac{\Gamma(2m)}{\Gamma(2m + 1)} \right] . \tag{2.13} \]
The bracket equals $1/m - 2/(2m) = 0$ for every $m$.
As a consequence, for D odd and for D = 2 the multiplicative anomaly is vanishing. For D > 2 and even, there are a finite number of simple poles other than at $s = 0$ in Eq. (2.11). As an example, in the important case D = 4, in a compact manifold without boundary, the zeta function has simple poles at $s = 2$, $s = 1$, $s = 0$, etc. Only the first one is relevant, the others being harmless. Separating the term corresponding to $m = 1$, only this gives a non-vanishing contribution when one takes the derivative with respect to $s$ at zero. Thus, a direct computation yields
\[ a_4(L_1, L_2) = \frac{A_0}{2}\, V_-^2 = \frac{V_D}{4(4\pi)^2}\, (V_1 - V_2)^2 . \tag{2.14} \]
It follows that there exists, potentially, an alternative direct method for computing the multiplicative anomaly for the shifts of two elliptic ΨDOs, and its structure will be a function of $V_-^2$ and of the heat-kernel coefficients $A_r$, which, in principle, are computable (the first ones are known). We will come back to this point in Sect. 6, using the Wodzicki formula. However, we observe that, here, the multiplicative anomaly is a function of the series of zeta-functions related to operators of Laplace type. One soon becomes convinced that it is not easy to go further along this way for an arbitrary D-dimensional manifold. We conclude this section with explicit examples.

Example 1. $M_D = R^D$. Let us start with a particularly simple example, i.e. $M_D = R^D$. The two zeta-functions $\zeta(s|L_i)$ are easy to evaluate and read
\[ \zeta(s|L_i) = \frac{V_D}{(4\pi)^{D/2}}\, \frac{\Gamma(s - \frac{D}{2})}{\Gamma(s)}\, V_i^{\frac{D}{2} - s} , \qquad i = 1, 2 , \tag{2.15} \]
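where $V_D$ is the (infinite) volume of $R^D$. We need to compute $\zeta(s|L_1L_2)$. For $\mathrm{Re}\, s > D/2$, starting from the spectral definition, one gets
\[ \zeta(s|L_1L_2) = \frac{2V_D}{(4\pi)^{D/2}\,\Gamma(\frac{D}{2})} \int_0^\infty dk\, k^{D-1} \left[ k^4 + (V_1 + V_2)k^2 + V_1V_2 \right]^{-s} . \tag{2.16} \]
For $\mathrm{Re}\, s > (D - 1)/4$, the above integral can be evaluated [19], to yield
\[ \zeta(s|L_1L_2) = \frac{\sqrt{2\pi}\, V_D\, \Gamma(2s - \frac{D}{2})}{(4\pi)^{D/2}\, 2^{s}\, \Gamma(s)} \left(\alpha^2 - 1\right)^{\frac{1 - 2s}{4}} (V_1V_2)^{\frac{D}{4} - s}\, P^{\frac{1}{2} - s}_{s - \frac{D+1}{2}}(\alpha) , \tag{2.17} \]
$P^{\mu}_{\nu}(z)$ being the associated Legendre function of the first kind (see for example [19]), and
\[ \alpha = \frac{V_1 + V_2}{2\sqrt{V_1V_2}} . \tag{2.18} \]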
This provides the analytical continuation to the whole complex plane. For $D = 2Q + 1$, one easily gets
\[ \zeta(0|L_1L_2) = 0 , \]
\[ \zeta'(0|L_1L_2) = \frac{\sqrt{2\pi}\, V_D\, \Gamma(-Q - \frac{1}{2})}{(4\pi)^{D/2}} \left(\alpha^2 - 1\right)^{\frac{1}{4}} (V_1V_2)^{\frac{D}{4}}\, P^{\frac{1}{2}}_{-\frac{D+1}{2}}(\alpha) = \frac{V_D\, \Gamma(-Q - \frac{1}{2})}{(4\pi)^{D/2}} \left[ 2 (V_1V_2)^{\frac{D}{2}} \left(1 + \cosh(D\gamma)\right) \right]^{1/2} , \tag{2.19} \]
in which $\cosh\gamma = \alpha$. The first equation says that the conformal anomaly vanishes. On the other hand, one has for D odd,
\[ \zeta'(0|L_1) + \zeta'(0|L_2) = \frac{V_D\, \Gamma(-Q - \frac{1}{2})}{(4\pi)^{D/2}} \left( V_1^{\frac{D}{2}} + V_2^{\frac{D}{2}} \right) . \tag{2.20} \]
As a consequence, making use of elementary properties of the hyperbolic cosine, one gets $a(L_1, L_2) = 0$. Namely, for D odd the multiplicative anomaly is vanishing (see [8]). For $D = 2Q$, the situation is much more complex. First, the conformal anomaly is non-zero, i.e.
\[ \zeta(0|L_1L_2) = \frac{V_D (-1)^Q}{(4\pi)^Q\, Q!} \left[ (V_1V_2)^{Q/2} \cosh(Q\gamma) \right] , \tag{2.21} \]
and, in general, the multiplicative anomaly is present. As a check, for D = 2, we get
\[ \zeta(0|L_1L_2) = -\frac{V_2}{4\pi} \left[ (V_1V_2)^{1/2} \cosh\gamma \right] = -\frac{V_2}{4\pi}\, (V_1 + V_2) = \frac{1}{4\pi}\, a_1(A) = \zeta(0|A) , \tag{2.22} \]
where $A = -\Delta I + V$ is a 2 × 2 matrix-valued differential operator, $I$ the identity matrix, $V = \mathrm{diag}(V_1, V_2)$, and $a_1(A)$ is the first related Seeley-De Witt coefficient, given by the well known expression $\int dx^2\, (-\mathrm{tr}\, V)$. Unfortunately, it is not simple to write down, within this naive approach, a reasonably simple expression for it, because the associated Legendre function depends on $s$ through the two indices $\mu$ and $\nu$. However, it is easy to show that the anomaly is absent when $V_1 = V_2$, therefore it will depend only on the difference $V_1 - V_2$. Thus, one may consider the case $V_2 = 0$. As a result, Eq. (2.16) yields the simpler expression
\[ \zeta(s|L_1L_2) = \frac{\sqrt{2\pi}\, V_D}{(4\pi)^{D/2}\,\Gamma(\frac{D}{2})}\, \frac{\Gamma(\frac{D}{2} - s)\,\Gamma(2s - \frac{D}{2})}{\Gamma(s)}\, V_1^{\frac{D}{2} - 2s} . \tag{2.23} \]
In this case the multiplicative anomaly is given by
\[ a(L_1, L_2) = \ln\det(L_1L_2) - \ln\det(L_1) , \tag{2.24} \]
since the regularized quantity $\ln\det(L_2) = 0$. It is easy to show that, when D is odd, again $a_D(L_1, L_2) = 0$. When $D = 2Q$, one obtains
\[ a_{2Q}(L_1, L_2) = \frac{V_D (-1)^Q}{(4\pi)^Q\, 2\, Q!}\, V_1^Q \left[ \Psi(1) - \Psi(Q) \right] . \tag{2.25} \]
We conclude this first example by observing that the multiplicative anomaly is absent when Q = 1, D = 2, and that it is present for Q > 1, D > 2 even. The result obtained is partial and more powerful techniques are necessary in order to deal with the general case. Such techniques will be introduced in the next section.
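The odd-dimensional cancellation in Example 1 rests on the "elementary properties of the hyperbolic cosine" that make the right-hand sides of (2.19) and (2.20) coincide. A minimal Python sketch of that check follows; the function names and the sample values of $V_1$, $V_2$ are assumptions chosen for illustration, and the common prefactor $V_D\,\Gamma(-Q - 1/2)/(4\pi)^{D/2}$ is dropped on both sides.

```python
import math

def cosh_side(dim, v1, v2):
    """Bracket of Eq. (2.19): [2 (V1 V2)^(D/2) (1 + cosh(D*gamma))]^(1/2),
    with cosh(gamma) = alpha as defined in Eq. (2.18)."""
    alpha = (v1 + v2) / (2.0 * math.sqrt(v1 * v2))
    gamma = math.acosh(alpha)
    return math.sqrt(2.0 * (v1 * v2) ** (dim / 2.0) * (1.0 + math.cosh(dim * gamma)))

def power_side(dim, v1, v2):
    """Bracket of Eq. (2.20): V1^(D/2) + V2^(D/2)."""
    return v1 ** (dim / 2.0) + v2 ** (dim / 2.0)

# Sample values (hypothetical): the two sides should agree for each odd D.
for dim in (1, 3, 5, 7):
    print(dim, cosh_side(dim, 3.0, 1.0), power_side(dim, 3.0, 1.0))
```

The agreement follows from $e^{\gamma} = \sqrt{V_1/V_2}$ (for $V_1 \ge V_2 > 0$), so that $2\cosh(D\gamma/2)\,(V_1V_2)^{D/4} = V_1^{D/2} + V_2^{D/2}$.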
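Example 2. $M_D = S^1 \times R^{D-1}$, $D = 1, 2, 3, \ldots$. In this case the zeta functions corresponding to $L_i$, $i = 1, 2$, are given by
\[ \zeta(s|L_i) = \frac{\pi^{(D-1)/2 - 2s}\, \Gamma(s + (1 - D)/2)}{2^{2s+1}\, L^{D - 2s}\, \Gamma(s)} \sum_{n=-\infty}^{\infty} \left[ n^2 + \left(\frac{L}{2\pi}\right)^2 V_i \right]^{(D-1)/2 - s} \tag{2.26} \]
($i = 1, 2$; here $L$ is the length of $S^1$). In terms of the basic zeta function (see [20]):
\[ \zeta(s; q) \equiv \sum_{n=-\infty}^{\infty} (n^2 + q)^{-s} = \sqrt{\pi}\, \frac{\Gamma(s - 1/2)}{\Gamma(s)}\, q^{1/2 - s} + \frac{4\pi^s}{\Gamma(s)}\, q^{1/4 - s/2} \sum_{n=1}^{\infty} n^{s - 1/2}\, K_{s - 1/2}(2\pi n\sqrt{q}) , \tag{2.27} \]
where $K_\nu$ is the modified Bessel function of the second kind, we obtain
\[ \zeta(s|L_i) = \frac{\pi^{-D/2}}{\Gamma(s)} \left[ 2^{-D} L\, \Gamma(s - D/2)\, V_i^{D/2 - s} + 2^{2 - s - D/2}\, L^{s + 1 - D/2}\, V_i^{D/4 - s/2} \sum_{n=1}^{\infty} n^{s - D/2}\, K_{s - D/2}(nL\sqrt{V_i}) \right] \equiv \zeta^{(1)}(s|L_i) + \zeta^{(2)}(s|L_i) . \tag{2.28} \]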
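For the determinant we get, for D odd,
\[ \det L_i = \exp\left\{ -\pi^{-D/2} \left[ 2^{-D} L\, \Gamma(-D/2)\, V_i^{D/2} + (2L)^{1 - D/2}\, V_i^{D/4} \sum_{n=1}^{\infty} n^{-D/2}\, K_{D/2}(nL\sqrt{V_i}) \right] \right\} , \tag{2.29} \]
and, for D even ($D = 2Q$),
\[ \det L_i = \exp\left\{ -\left[ \frac{L}{Q!} \left(-\frac{1}{4\pi}\right)^{Q} \left( \sum_{j=1}^{Q} \frac{1}{j} - \ln V_i \right) V_i^{Q} + 4L \left(\frac{\sqrt{V_i}}{2\pi L}\right)^{Q} \sum_{n=1}^{\infty} n^{-Q}\, K_Q(nL\sqrt{V_i}) \right] \right\} . \tag{2.30} \]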
As for the product L1 L2 , using the same strategy as before, after some calculations we obtain (here we use the short-hand notation L± ≡ L0 + V± , cf. equations above): [Q/2] X V−2p (−V+ )Q−2p 2L det(L1 L2 ) = (det L+ )2 exp − (2p)!(Q − 2p)!(4π)Q p=1 p−1 X 1 1 1 + − ψ(2p) − ln V+ × 1 − C + 2 (Q − 2p)! 2 j j=1 ∞ ∞ X X V−2p (1) V−2p (2) − ζ (2p|L+ ) − ζ (2p|L+ ) , (2.31) p p p=[Q/2]+1
p=1
where $[x]$ means 'integer part of $x$' and $C$ is the Euler–Mascheroni constant. We can check from these formulas that the anomaly (1.9) is zero in the case of odd dimension D. Actually, this is most easily seen, as before, by using the expression corresponding to (2.16) for the present case. It also vanishes for D = 2. The formula above is useful in order to obtain numerical values for the case D even, corresponding to different values of D and L (the series converge very quickly). The results are given in Table 1. We have looked at the variation of the anomaly in terms of the different parameters: L, D, $V_1$ and $V_2$, varying some of them while keeping the rest fixed. Within numerical errors, we have checked the complete coincidence with formula (4.5) in Sect. 4.

Table 1. Values of the multiplicative anomaly $a(L_1, L_2)$ in terms of the parameters: L, D, $V_1$ and $V_2$. Observe its evolution when some of the parameters are kept fixed while the others are varied. In all cases, a perfect coincidence with Wodzicki's expression for the anomaly is obtained (within numerical errors)

L      D     V1    V2    a(L1, L2)
1      2     2     2     0.
0.1    2     8     3     −1.8686 × 10^−14
1      2     8     3     −2.0817 × 10^−17
5      2     8     3     −1.4572 × 10^−16
10     2     8     3     −1.4572 × 10^−16
1      2     10    1     2.87 × 10^−12
1      4     10    1     0.064117
1      6     10    1     −0.028063
1      8     10    1     0.0151245
1      10    10    1     −0.003636
1      12    10    1     0.0006124
1      14    10    1     −0.00008166
1      16    10    1     9.09 × 10^−6
1      4     2     1     0.0007916
1      4     5     2     0.007124
1      4     1     6     0.019789
1      6     2     1     −0.0000945
1      6     5     2     −0.001984
1      6     1     6     −0.005512
0.1    4     7     2     0.001979
0.5    4     7     2     0.009895
1      4     7     2     0.019789
2      4     7     2     0.0395786
5      4     7     2     0.098947
10     4     7     2     0.197893
20     4     7     2     0.395786
0.1    6     7     2     −0.00070865
0.5    6     7     2     −0.00354326
1      6     7     2     −0.0070865
2      6     7     2     −0.014173
5      6     7     2     −0.0354326
10     6     7     2     −0.07008652
20     6     7     2     −0.141730
Example 3. $M_D = R^D$ with Dirichlet b.c. on $p$ pairs of perpendicular hyperplanes. The zeta function is, in this case,
\[ \zeta(s|L_i) = \frac{\pi^{(D-p)/2 - 2s}\, \Gamma(s + (p - D)/2)}{2^{D - p + 1} \prod_{j=1}^{p} a_j\; \Gamma(s)} \sum_{n_1, \ldots, n_p = 1}^{\infty} \left[ \sum_{j=1}^{p} \left(\frac{n_j}{a_j}\right)^2 + V_i \right]^{(D-p)/2 - s} , \tag{2.32} \]
where the aj , j = 1, 2, . . . , p, are the pairwise separations between the perpendicular hyperplanes. For the determinant, we get, for D − p = 2h + 1 odd, det Li = h−1/2 2 p ∞ h+1/2 X X π nj exp − 2h+2 Qp 0(−h − 1/2) + Vi , (2.33) aj 2 j=1 aj n1 ,...,np =1 j=1 and, for D − p = 2h even, h−1 ∞ h X X (−π) 1 Q 2 + h det Li = exp p 2h+1 h! j 2 j=1 aj j=1 n ,...,n +
∞ X n1 ,...,np =1
2 p X nj j=1
aj
h
+ Vi ln
2 p X nj j=1
p =1
1
aj
2 p X nj j=1
aj
h + Vi
+ Vi .
(2.34)
For the calculation of the anomaly one follows the same steps as in the two preceding examples, and we are not going to repeat them here. In order to obtain the final numbers one must make use of the inversion formula for the Epstein zeta functions of these expressions [20, 2].

3. The Wodzicki Residue and the Multiplicative Anomaly

For the reader's convenience, we will review in this section the necessary information concerning the Wodzicki residue [9] (see also [7] and the references to Wodzicki quoted therein) that will be used in the rest of the paper. Let us consider a D-dimensional smooth compact manifold without boundary $M_D$ and a (classical) ΨDO, $A$, of order $m$, acting on sections of vector bundles on $M_D$. To any ΨDO $A$ there corresponds a complete symbol $a(x, k)$, such that, modulo infinitely smoothing operators, one has
\[ (Af)(x) \sim \int_{R^D} \frac{dk}{(2\pi)^D} \int_{R^D} dy\, e^{i(x - y)k}\, a(x, k)\, f(y) . \tag{3.1} \]
The complete symbol admits an asymptotic expansion for $|k| \to \infty$, given by
\[ a(x, k) \sim \sum_j a_{m-j}(x, k) , \tag{3.2} \]
and fulfills the homogeneity property $a_{m-j}(x, tk) = t^{m-j} a_{m-j}(x, k)$, for $t > 0$. The number $m$ is called the order of $A$.
If $P$ is an elliptic operator of order $p > m$, according to Wodzicki one has the following property of the non-commutative residue, which we may take as its characterization.

Proposition. The trace of the operator $AP^{-s}$ exists and admits a meromorphic continuation to the whole complex plane, with a simple pole at $s = 0$. Its Cauchy residue at $s = 0$ is proportional to the so-called non-commutative (or Wodzicki) residue of $A$:
\[ \mathrm{res}(A) = p\, \mathrm{Res}_{s=0}\, \mathrm{Tr}(AP^{-s}) . \tag{3.3} \]
The r.h.s. of the above equation does not depend on $P$ and is taken as the definition of the Wodzicki residue of the ΨDO, $A$.

Properties.
(i) Strictly related to the latter result is the one which follows, involving the short-$t$ asymptotic expansion
\[ \mathrm{Tr}(Ae^{-tP}) \simeq \sum_j \alpha_j\, t^{\frac{D - j}{p} - 1} - \frac{\mathrm{res}(A)}{p}\, \ln t + O(t \ln t) . \tag{3.4} \]
Thus, the Wodzicki residue of $A$, a ΨDO, can be read off from the above asymptotic expansion by selecting the coefficient proportional to $\ln t$.
(ii) Furthermore, it is possible to show that $\mathrm{res}(A)$ is linear with respect to $A$ and possesses the important property of being the unique trace on the algebra of the ΨDOs, namely, one has $\mathrm{res}(AB) = \mathrm{res}(BA)$. This last property has deep implications when including gravity within the non-commutative geometrical approach to the Connes-Lott model of the electro-weak interaction theory [12, 10, 11].
(iii) Wodzicki has also obtained a local form of the non-commutative residue, which has the fundamental consequence of characterizing it through a scalar density. This density can be integrated to yield the Wodzicki residue, namely
\[ \mathrm{res}(A) = \int_{M_D} \frac{dx}{(2\pi)^D} \int_{|k|=1} a_{-D}(x, k)\, dk . \tag{3.5} \]
Here the component of order $-D$ of the complete symbol appears. From the above result it immediately follows that $\mathrm{res}(A) = 0$ when $A$ is an elliptic differential operator.
(iv) We conclude this summary with the multiplicative anomaly formula, again due to Wodzicki. A more general formula has been derived in [8]. Let us consider two invertible elliptic self-adjoint operators, $A$ and $B$, on $M_D$. If we assume that they commute, then the following formula applies:
\[ a(A, B) = \frac{\mathrm{res}\left[ (\ln(A^b B^{-a}))^2 \right]}{2ab(a + b)} = a(B, A) , \tag{3.6} \]
where $a > 0$ and $b > 0$ are the orders of $A$ and $B$, respectively. A sketch of the proof is presented in the Appendix. It should be noted that $a(A, B)$ depends on a ΨDO of zero order. As a consequence, it is independent of the renormalization scale $\mu$ appearing in the path integral.
(v) Furthermore, it can be iterated consistently. For example
\[ \zeta'(A, B) = \zeta'(A) + \zeta'(B) + a(A, B) , \]
\[ \zeta'(A, B, C) = \zeta'(AB) + \zeta'(C) + a(AB, C) = \zeta'(A) + \zeta'(B) + \zeta'(C) + a(A, B) + a(AB, C) . \tag{3.7} \]
As a consequence,
\[ a(A, B, C) = a(AB, C) + a(A, B) . \tag{3.8} \]
Since $a(A, B, C) = a(C, B, A)$, we easily obtain the cocycle condition (see [8]):
\[ a(AB, C) + a(A, B) = a(CB, A) + a(C, B) . \tag{3.9} \]
4. The O(2) Bosonic Model

In this section we come back to the problem of the exact computation of the multiplicative anomaly in the model considered in Sect. 2. Strictly speaking, the result of the last section is valid for a compact manifold, but in the case of $R^D$ the divergence is trivial, being contained in the volume factor. The Wodzicki formula gives
\[ a(L_1, L_2) = \frac{1}{8}\, \mathrm{res}\left[ (\ln(L_1L_2^{-1}))^2 \right] . \tag{4.1} \]
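The coefficient 1/8 in Eq. (4.1) can be traced back to the general formula (3.6). As a short worked step, assuming (as in the text) that $L_1$ and $L_2$ commute, are invertible, and both have order $a = b = 2$, and using linearity of the residue:

```latex
\[
\ln\!\left(L_1^{2} L_2^{-2}\right) = 2\,\ln\!\left(L_1 L_2^{-1}\right)
\quad\Longrightarrow\quad
a(L_1, L_2)
= \frac{\mathrm{res}\!\left[\left(2\ln(L_1 L_2^{-1})\right)^{2}\right]}{2\cdot 2\cdot 2\,(2+2)}
= \frac{4}{32}\,\mathrm{res}\!\left[\left(\ln(L_1 L_2^{-1})\right)^{2}\right]
= \frac{1}{8}\,\mathrm{res}\!\left[\left(\ln(L_1 L_2^{-1})\right)^{2}\right].
\]
```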
We have to construct the complete symbol of the ΨDO of zero order $[\ln(L_1L_2^{-1})]^2$. It is given by
\[ a(x, k) = \left[ \ln(k^2 + V_1) - \ln(k^2 + V_2) \right]^2 . \tag{4.2} \]
For large $k^2$, we have the following expansion, from which one can easily read off the homogeneous components:
\[ a(x, k) = \sum_{j=2}^{\infty} c_j\, k^{-2j} = \sum_{j=2}^{\infty} a_{-2j}(x, k) , \tag{4.3} \]
where
\[ c_j = \sum_{n=1}^{j-1} \frac{(-1)^j}{n(j - n)} \left( V_1^n - V_2^n \right) \left( V_1^{j-n} - V_2^{j-n} \right) . \tag{4.4} \]
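The large-$k$ expansion (4.3) with the coefficients (4.4) can be checked numerically against the closed form (4.2). The sketch below is only an illustration: the function names, the truncation order and the sample values of $k$, $V_1$, $V_2$ are assumptions, and the inner sum is taken over $n = 1, \ldots, j - 1$, since the term $n = j$ would be ill-defined.

```python
import math

def c_coeff(j, v1, v2):
    """Coefficient c_j of k^(-2j) in Eq. (4.4), with n running from 1 to j - 1."""
    total = sum(
        (v1 ** n - v2 ** n) * (v1 ** (j - n) - v2 ** (j - n)) / (n * (j - n))
        for n in range(1, j)
    )
    return (-1) ** j * total

def symbol_exact(k, v1, v2):
    """Closed form of the symbol, Eq. (4.2)."""
    return (math.log(k * k + v1) - math.log(k * k + v2)) ** 2

def symbol_series(k, v1, v2, jmax=40):
    """Truncated expansion, Eq. (4.3); requires k^2 > max(V1, V2) to converge."""
    return sum(c_coeff(j, v1, v2) * k ** (-2 * j) for j in range(2, jmax + 1))

# Sample point (hypothetical values): k = 3, V1 = 2, V2 = 1.
print(symbol_exact(3.0, 2.0, 1.0), symbol_series(3.0, 2.0, 1.0))
```

The leading coefficient is $c_2 = (V_1 - V_2)^2$, which is the term that feeds the D = 4 anomaly through the component of order $-D$ in the local formula (3.5).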
As a consequence, due to the local formula one immediately gets the following result: for D odd, the multiplicative anomaly vanishes, in perfect agreement with the direct calculation of Sect. 2. This result is consistent with a general theorem contained in [8]. For D even, if D = 2 one has no multiplicative anomaly, while for $D = 2Q$, $Q > 1$, one gets
\[ a(L_1, L_2) = \frac{V_D (-1)^Q}{4(4\pi)^Q\, \Gamma(Q)} \sum_{j=1}^{Q-1} \frac{1}{j(Q - j)} \left( V_1^j - V_2^j \right) \left( V_1^{Q-j} - V_2^{Q-j} \right) . \tag{4.5} \]
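Formula (4.5) is simple enough to evaluate directly. The following Python sketch does so as written; the function name and the default `volume=1.0` (standing in for the formally infinite factor $V_D$) are assumptions made for illustration only.

```python
import math

def wodzicki_anomaly(q, v1, v2, volume=1.0):
    """Evaluate Eq. (4.5) for D = 2Q; `volume` plays the role of V_D.

    For Q = 1 the sum is empty, so the result is zero (no anomaly in D = 2).
    """
    if q < 1:
        raise ValueError("Q must be a positive integer")
    total = sum(
        (v1 ** j - v2 ** j) * (v1 ** (q - j) - v2 ** (q - j)) / (j * (q - j))
        for j in range(1, q)
    )
    prefactor = volume * (-1) ** q / (4.0 * (4.0 * math.pi) ** q * math.gamma(q))
    return prefactor * total

# Q = 2 (D = 4): the sum has the single term (V1 - V2)^2.
print(wodzicki_anomaly(2, 5.0, 2.0), 9.0 / (4.0 * (4.0 * math.pi) ** 2))
```

The expression is symmetric under $V_1 \leftrightarrow V_2$, in line with $a(A, B) = a(B, A)$ in (3.6), and it vanishes when $V_1 = V_2$.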
It is easy to show that for $V_2 = 0$ this expression reduces to the one obtained directly in Sect. 2. In the O(2) model, for D = 4, we have
\[ a(L_1,L_2) = \frac{V_4}{4(4\pi)^2}\,(V_1 - V_2)^2 = \frac{V_4}{36(4\pi)^2}\,\lambda^2\Phi^4, \tag{4.6} \]
which, for dimensional reasons, is independent of the renormalization parameter µ. Then, the one-loop effective potential reads
\[ V_{\mathrm{eff}} = -\frac{\ln Z}{V_4} = \frac{M_1^4}{64\pi^2}\left(-\frac{3}{2} + \ln\frac{M_1^2}{\mu^2}\right) + \frac{M_2^4}{64\pi^2}\left(-\frac{3}{2} + \ln\frac{M_2^2}{\mu^2}\right) + \frac{1}{72(4\pi)^2}\,\lambda^2\Phi^4, \tag{4.7} \]
with
\[ M_1^2 = m^2 + \frac{\lambda}{2}\Phi^2, \qquad M_2^2 = m^2 + \frac{\lambda}{6}\Phi^2. \tag{4.8} \]
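As a quick consistency check of (4.5), (4.6) and (4.8), the following minimal sketch evaluates the rational part of (4.5) exactly. The function name `anomaly_coefficient` and the sample values of m², λ and Φ² are illustrative assumptions, not part of the original text; the overall factor $V_D/(4\pi)^Q$ is left out, and the identification $V_p = M_p^2$ is assumed only for the sample values.

```python
from fractions import Fraction
from math import factorial

def anomaly_coefficient(Q, V1, V2):
    """Rational factor R of Eq. (4.5), defined by a(L1, L2) = V_D / (4 pi)^Q * R.

    R = (-1)^Q / (4 Gamma(Q)) * sum_{j=1}^{Q-1} (V1^j - V2^j)(V1^(Q-j) - V2^(Q-j)) / (j (Q-j)).
    Hypothetical helper, written only to illustrate the formula.
    """
    total = Fraction(0)
    for j in range(1, Q):
        total += Fraction(1, j * (Q - j)) * (V1**j - V2**j) * (V1**(Q - j) - V2**(Q - j))
    gamma_Q = factorial(Q - 1)  # Gamma(Q) = (Q-1)! for integer Q >= 1
    return Fraction((-1) ** Q, 4 * gamma_Q) * total

# Sample (assumed) values: m^2 = 3, lambda = 6, Phi^2 = 1.
m2, lam, phi2 = Fraction(3), Fraction(6), Fraction(1)
V1 = m2 + lam * phi2 / 2   # plays the role of M_1^2 in (4.8)
V2 = m2 + lam * phi2 / 6   # plays the role of M_2^2 in (4.8)

d4 = anomaly_coefficient(2, V1, V2)   # D = 4
d2 = anomaly_coefficient(1, V1, V2)   # D = 2: empty sum
print(d4, d2)
```

For D = 4 the result equals $\lambda^2\Phi^4/36$, in line with (4.6), and for D = 2 the sum is empty, so the anomaly vanishes.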
Thus, the additional multiplicative anomaly contribution seems to modify the usual Coleman-Weinberg potential. A more careful analysis is required in order to investigate the consequences of this remarkable fact.

5. Feynman Diagrams

The necessity of the presence of the multiplicative anomaly in quantum field theory can also be understood perturbatively, using the background field method. The effective action of the O(2) model in a background field Φ will be denoted by Γ(Φ, φ), where φ is the mean field. Then, if $\Gamma_0(\phi)$ denotes the effective action with vanishing Φ, it turns out that
\[ \Gamma(\Phi,\phi) = \Gamma_0(\Phi + \phi). \tag{5.1} \]
Therefore, the nth order derivatives of Γ with respect to φ at φ = 0 determine the vertex functions of the O(2) model in the background external field. The one-loop approximation to Γ is again given by log det($L_1 L_2$), and the determinant of either of the operators, $L_1$ and $L_2$, corresponds to the sum of all vacuum-vacuum 1PI diagrams where only particles of masses squared $M_1^2 = m^2 + \lambda\Phi^2/2$ or $M_2^2 = m^2 + \lambda\Phi^2/6$ flow along the internal lines. In Fig. 1 we have depicted this, by using a solid line for type-1 particles and a dashed line for type-2 particles. Thus, for example, the inverse propagator at zero momentum for the type-1 particle, as computed from the above effective potential, is obtained from the second derivative with respect to $\phi_1$. The only 1PI graphs which would then contribute are those shown in Fig. 2. This is clearly not the case, as the full theory exhibits a trilinear coupling $\phi_2(\phi_1)^2$ which gives the additional Feynman graph depicted in Fig. 3. Without investigating this question any further, we can safely affirm already that a perturbative formula for the Wodzicki anomaly given in terms of Feynman diagrams should exist. It surely owes its simple form to very subtle cancellations among an infinite class of Feynman diagrams.

Fig. 1. The Feynman graph giving the one-loop effective potential without taking into account the anomaly, LogDet($L_1$) + LogDet($L_2$)

Fig. 2. Contributions coming from 1PI graphs

We conclude this section with some remarks. In the present model, the existence of a multiplicative anomaly of the type considered could be a trivial problem: in fact, it has the same form as the classical potential energy. This suggests that it can be absorbed in a finite renormalization of the coupling constant of the theory. Secondly, this anomaly gives no contribution to the one-loop beta function of the model, since it is independent of the arbitrary renormalization scale, but it certainly contributes to the two-loop beta function. And, finally, we have seen that the anomaly can be interpreted as an external field effect which, in the present model, could be relevant only when the theory is coupled to an external source. Therefore, it should be very interesting to study its relevance in at least two other situations, namely the cases of a spontaneously broken symmetry and of QED in external background fields.

Fig. 3. Additional Feynman graph of the full theory
6. The Case of a General, Smooth and Compact Manifold $M_D$ Without Boundary

Since the multiplicative anomaly is a local functional, it is possible to express it in terms of the Seeley-De Witt spectral coefficients. Let us consider again the operator $L_p = L_0 + V_p$, with $L_0 = -\Delta$ acting on scalars, in a smooth and compact manifold $M_D$ without boundary. We have to compute the Wodzicki residue of the ΨDO
\[ \left[\ln(L_1 L_2^{-1})\right]^2. \tag{6.1} \]
With this aim, if $V_1 < V_2$, we can consider the ΨDO
\[ \left[\ln(L_1 L_2^{-1})\right]^2 e^{-tL_1}, \tag{6.2} \]
and compute the ln t term in the short-t asymptotic expansion of its trace. We are dealing here with self-adjoint operators and thus, by using the spectral theorem, we get
\[ \mathrm{Tr}\left\{\left[\ln(L_1 L_2^{-1})\right]^2 e^{-tL_1}\right\} = \int_{V_1}^{\infty} d\lambda\, \rho(\lambda|L_1)\left[\ln\lambda - \ln(\lambda + V_2 - V_1)\right]^2 e^{-t\lambda}, \tag{6.3} \]
where $\rho(\lambda|L_1)$ is the spectral density of the self-adjoint operator $L_1$. Now, it is well known that the short-t expansion of the above trace receives contributions from the asymptotics, for large λ, of the integrand in the spectral integral. The asymptotics of the spectral function associated with $L_1$ are known to be given by (see, for example [21, 22], and the references therein)
\[ \rho(\lambda|L_1) \simeq \sum_{r=0} \frac{A_r(L_1)}{\Gamma\left(\frac{D}{2} - r\right)}\, \lambda^{\frac{D}{2} - r - 1}, \tag{6.4} \]
where the quantities $A_r(L_1)$ are the Seeley-De Witt heat-kernel coefficients while, for large λ, we have in addition
\[ \left[\ln\lambda - \ln(\lambda + V_2 - V_1)\right]^2 \simeq \sum_{j=2}^{\infty} b_j \lambda^{-j}, \tag{6.5} \]
the $b_j$ being computable: for instance $b_2 = (V_2 - V_1)^2$, $b_3 = -(V_2 - V_1)^3$, etc., as follows from squaring $\ln(1+u) = u - u^2/2 + \dots$ with $u = (V_2 - V_1)/\lambda$. As a result, we get the short-t asymptotics in the form
\[ \mathrm{Tr}\left\{\left[\ln(L_1 L_2^{-1})\right]^2 e^{-tL_1}\right\} \simeq \sum_{r=0} \frac{A_r(L_1)}{\Gamma\left(\frac{D}{2} - r\right)} \sum_{j=2}^{\infty} b_j\, t^{\,r+j-\frac{D}{2}}\, \Gamma\left(\frac{D}{2} - r - j,\, tV_1\right), \tag{6.6} \]
where Γ(z, x) is the incomplete gamma function. From this expression one obtains the following results:

(i) If D is odd, say D = 2Q + 1, the first argument of the incomplete gamma function is never zero or a negative integer. Thus, the ln t term is absent and, from the Wodzicki theorem, the multiplicative anomaly is absent too, again in agreement with the Kontsevich-Vishik theorem [8] and the explicit calculations in the previous sections.

(ii) If D is even, D = 2Q, we have to search for the log terms only, that is −Q + r + j = 0, for r ≥ 0 and j ≥ 2. As a result, for D = 2 the log term is absent once more, again in agreement with the explicit calculations of the previous sections. The multiplicative anomaly is present starting from D ≥ 4. In the important case when D = 4, it turns out that the multiplicative anomaly is identical to the one, related with $\mathbb{R}^4$, that has been evaluated previously. Terms depending on the curvature become operative only for D ≥ 6.
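The counting in (i) and (ii) can be made explicit with a small sketch. It simply enumerates the index pairs (r, j) in (6.6) for which the power of t vanishes, which is where a ln t term can arise; the function name `log_term_pairs` is a hypothetical helper introduced here for illustration.

```python
def log_term_pairs(D):
    """List the (r, j) pairs of Eq. (6.6) with r + j - D/2 = 0, r >= 0, j >= 2.

    For odd D the exponent r + j - D/2 is a half-integer, so no pair exists.
    Illustrative helper only; it does not compute the coefficients themselves.
    """
    if D % 2 == 1:
        return []
    Q = D // 2
    # r runs from 0 to Q - 2 so that j = Q - r stays >= 2.
    return [(r, Q - r) for r in range(0, Q - 1)]

for D in (2, 3, 4, 6):
    print(D, log_term_pairs(D))
```

Under this counting D = 2 and odd D give no log term, D = 4 gives only the pair (0, 2), and pairs with r ≥ 1, which bring in the higher heat-kernel coefficients, first appear at D = 6.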
7. Conclusions

In this paper, the multiplicative anomaly associated with the zeta-function regularised determinant of two ΨDOs of Laplace type on a D-dimensional smooth manifold without boundary has been studied. From a physical point of view, this condition does not seem to be too restrictive, because the one-loop effective potential may be expressed as a logarithm of the determinant of this kind of elliptic differential operator. We have shown how a direct calculation leads to analytical difficulties, even in the simplest examples. Fortunately, a very elegant formula for the multiplicative anomaly has been found by Wodzicki and we have used it here in order to compute the anomaly explicitly. It is worth mentioning that, from a computational point of view, this constitutes a big improvement, since one can make use of the results concerning the computation of the one-loop effective potential, related to second order elliptic differential operators of Laplace type. Furthermore, within the background field method, we have identified the presence of the multiplicative anomaly in the diagrammatic perturbative approach too. With regard to our example, namely the product $L_1 L_2$, we have shown that the multiplicative anomaly vanishes for D odd and also for D = 2. This seems to be related to the fact that we have only considered differential operators of second order (Laplace type). For first-order differential operators (Dirac like), things could be quite different, in principle, and we will consider this important case elsewhere. Another interesting issue is the generalization of all these procedures to smooth manifolds with a boundary. Again one should expect to obtain different results in those situations.

Acknowledgement. We would like to thank Guido Cognola and Klaus Kirsten for valuable discussions. This work has been supported by the cooperative agreement INFN (Italy)–DGICYT (Spain).
EE has been partly financed by DGICYT (Spain), projects PB93-0035 and PB96-0925, and by CIRIT (Generalitat de Catalunya), grant 1995SGR-00602.
Appendix A: The Wodzicki formula for the multiplicative anomaly

In this Appendix, for the reader’s convenience we present a proof of the multiplicative anomaly formula along the lines of Ref. [8]. Recall that, according to Wodzicki, if P is an elliptic operator of order p > a, one has the following property of the non-commutative residue related to the ΨDO A: in a neighborhood of z = 0, it holds
\[ z\,\mathrm{Tr}(AP^{-z}) = \frac{1}{\Gamma(1+z)}\,\frac{\mathrm{res}(A)}{p} + z R_A(P) + O(z^2). \tag{A.1} \]
The quantity $R_A(P)$ will play no role in the final formula. Now we resort to the following

Lemma. If η is a ΨDO of zero order, a, and B a ΨDO of positive order, b, and γ and x positive real numbers then, in a neighborhood of s = 0, one has
\[ s\,\mathrm{Tr}\left(\ln\eta\;\eta^{-xs}B^{-\gamma s}\right) = \frac{\mathrm{res}(\ln\eta)}{\Gamma(1+\gamma s)\,\gamma b} - s x\,\frac{\mathrm{res}\left((\ln\eta)^2\right)}{\Gamma(1+\gamma s)\,\gamma b} + s R_{\ln\eta}(B) + O(s^2). \tag{A.2} \]
The lemma is a direct consequence of the formal expansion
\[ \eta^{-xs} = e^{-xs\ln\eta} = I - xs\ln\eta + O(s^2) \tag{A.3} \]
and of Eq. (A.1). From the above lemma, it follows that
\[ \lim_{s\to 0}\partial_s\left[ s\,\mathrm{Tr}\left(\ln\eta\;\eta^{-xs}B^{-\gamma s}\right)\right] = C\,\frac{\mathrm{res}(\ln\eta)}{b} - x\,\frac{\mathrm{res}[(\ln\eta)^2]}{\gamma b} + R_{\ln\eta}(B), \tag{A.4} \]
in which C is the Euler–Mascheroni constant. Now consider two invertible, commuting, elliptic, self-adjoint operators A and B on $M_D$, with a and b being the orders of A and B, respectively. Within the zeta-function definition of the determinants, consider the quantity
\[ F(A,B) = \frac{\det(AB)}{(\det A)(\det B)} = e^{a(A,B)}. \tag{A.5} \]
Introduce then the family of ΨDOs
\[ A(x) = \eta^{x} B^{\frac{a}{b}}, \qquad \eta = A^{b}B^{-a}, \tag{A.6} \]
and define the function
\[ F(A(x),B) = \frac{\det(A(x)B)}{(\det A(x))(\det B)}. \tag{A.7} \]
One gets
\[ F(A(0),B) = \frac{\det B^{\frac{a+b}{b}}}{(\det B^{\frac{a}{b}})(\det B)} = 1, \qquad F(A(\tfrac{1}{b}),B) = \frac{\det(AB)}{(\det A)(\det B)} = F(A,B). \tag{A.8} \]
As a consequence, one is led to deal with the following expression for the anomaly:
\[ a(A(x),B) = \ln F(A(x),B) = -\lim_{s\to 0}\partial_s\left[\mathrm{Tr}\,(A(x)B)^{-s} - \mathrm{Tr}\,A(x)^{-s} - \mathrm{Tr}\,B^{-s}\right]. \tag{A.9} \]
This quantity has the properties: a(A(0), B) = 0 and a(A(1/b), B) = a(A, B). The next step is to compute the first derivative of a(A(x), B) with respect to x, the result being
\[ \partial_x a(A(x),B) = \lim_{s\to 0}\partial_s\left[ s\,\mathrm{Tr}\left(\ln\eta\;\eta^{-xs}B^{-s\frac{a+b}{b}}\right) - s\,\mathrm{Tr}\left(\ln\eta\;\eta^{-xs}B^{-s\frac{a}{b}}\right)\right]. \tag{A.10} \]
Making now use of Eq. (A.3), one obtains
\[ \partial_x a(A(x),B) = C\,\frac{\mathrm{res}(\ln\eta)}{b} - x\,\frac{\mathrm{res}[(\ln\eta)^2]}{a+b} - C\,\frac{\mathrm{res}(\ln\eta)}{b} + x\,\frac{\mathrm{res}[(\ln\eta)^2]}{a} = x\,\frac{b}{a(a+b)}\,\mathrm{res}[(\ln\eta)^2]. \tag{A.11} \]
And, finally, performing the integration with respect to x, from 0 to 1/b, one gets Wodzicki’s formula for the multiplicative anomaly, used in Sect. 3, namely
\[ a(A,B) = a(B,A) = \frac{\mathrm{res}\left[\left(\ln(A^{b}B^{-a})\right)^2\right]}{2ab(a+b)}. \tag{A.12} \]
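The last integration step can be checked with exact rational arithmetic. The sketch below assumes only the linear-in-x form of (A.11), $\partial_x a = x\, b/(a(a+b))\,\mathrm{res}[(\ln\eta)^2]$, and integrates it from 0 to 1/b; the function name `anomaly_prefactor` is a hypothetical helper.

```python
from fractions import Fraction

def anomaly_prefactor(a, b):
    """Coefficient of res[(ln eta)^2] after integrating (A.11) over x in [0, 1/b].

    Integral of x * b / (a (a + b)) from 0 to 1/b equals b / (a (a + b)) * (1/b)^2 / 2.
    Illustrative helper; a and b are the (positive integer) operator orders.
    """
    slope = Fraction(b, a * (a + b))
    upper = Fraction(1, b)
    return slope * upper**2 / 2

# Two second-order operators, as in Sect. 4.
second_order = anomaly_prefactor(2, 2)
# With eta = L1^2 L2^-2 one has ln(eta) = 2 ln(L1 L2^-1), hence a factor 4 from the square.
prefactor_41 = 4 * second_order
print(second_order, prefactor_41)
```

For a = b = 2 the coefficient is 1/32, and the extra factor 4 from $(\ln\eta)^2 = 4[\ln(L_1L_2^{-1})]^2$ gives the 1/8 of (4.1).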
References

1. Ray, D.B. and Singer, I.M.: Adv. in Math. 7, 145 (1971)
2. Elizalde, E., Odintsov, S.D., Romeo, A., Bytsenko, A.A. and Zerbini, S.: Zeta Regularization Techniques with Applications. Singapore: World Scientific, 1994
3. Elizalde, E.: Ten Physical Applications of Spectral Zeta Functions. Berlin: Springer-Verlag, 1995
4. Bytsenko, A.A., Cognola, G., Vanzo, L. and Zerbini, S.: Phys. Rep. 266, 1 (1996)
5. Hawking, S.W.: Commun. Math. Phys. 55, 133 (1977)
6. Benson, K., Bernstein, K. and Dodelson, S.: Phys. Rev. D44, 2480 (1991)
7. Kassel, C.: Asterisque 177, 199 (1989), Sem. Bourbaki
8. Kontsevich, M. and Vishik, S.: Functional Analysis on the Eve of the 21st Century. Vol. 1, 173–197 (1993)
9. Wodzicki, M.: Non-commutative Residue Chapter I. In: Lecture Notes in Mathematics. Yu.I. Manin, editor, Vol. 1289, Berlin: Springer-Verlag, 1987, p. 320
10. Connes, A. and Lott, J.: Nucl. Phys. B18, 29 (1990); Connes, A.: Non-commutative geometry. New York: Academic Press, 1994
11. Connes, A.: Commun. Math. Phys. 182, 155 (1996); Chamseddine, A.H. and Connes, A.: Commun. Math. Phys. 186, 731 (1997); Chamseddine, A.H. and Connes, A.: A universal action formula. hep-th/9606056; Iochum, B., Kastler, D. and Schücker, T.: J. Math. Phys. 38, 4929 (1997); Martin, C.P., Gracia-Bondia, J.M. and Varilly, J.C.: Phys. Rep. 294, 363 (1998)
12. Connes, A.: Commun. Math. Phys. 117, 673 (1988); Kastler, D.: Commun. Math. Phys. 166, 633 (1995)
13. Kalau, W. and Walze, M.: J. Geom. Phys. 16, 327 (1995)
14. Ackermann, T. and Tolksdorf, J.: A generalized Lichnerowicz formula, the Wodzicki Residue and Gravity. CPT-94, hep-th/9503152 (1995)
15. Ackermann, T.: A note on the Wodzicki Residue. funct-an/9506006 (1995)
16. Elizalde, E.: J. Phys. A30, 2735 (1997)
17. Mickelsson, J.: Wodzicki Residue and anomalies of current algebras. Integrable models and strings, Proc. Helsinki 1993, hep-th/9404093 (1994)
18. Seeley, R.T.: Am. Math. Soc. Proc. Symp. Pure Math. 10, 288 (1967)
19. Gradshteyn, I.S. and Ryzhik, I.M.: Table of Integrals, Series and Products. New York: Academic Press, 1980
20. Elizalde, E.: J. Math. Phys. 35, 6100 (1994)
21. Hörmander, L.: Acta Math. 121, 193 (1968)
22. Cognola, G., Vanzo, L. and Zerbini, S.: Phys. Lett. B223, 416 (1989)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 631 – 650 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Universality of Correlation Functions of Hermitian Random Matrices in an External Field

P. Zinn-Justin

Laboratoire de Physique Théorique de l’Ecole Normale Supérieure, 24 rue Lhomond, 75231 Paris Cedex 05, France (Unité propre du CNRS, associée à l’Ecole Normale Supérieure et l’Université Paris-Sud). E-mail: [email protected]

Received: 14 May 1997 / Accepted: 5 November 1997
Abstract: The behavior of correlation functions is studied in a class of matrix models characterized by a measure exp(−S) containing a potential term and an external source term: S = N tr(V (M ) − M A). In the large N limit, the short-distance behavior is found to be identical to the one obtained in previously studied matrix models, thus extending the universality of the level-spacing distribution. The calculation of correlation functions involves (finite N ) determinant formulae, reducing the problem to the large N asymptotic analysis of a single kernel K. This is performed by an appropriate matrix integral formulation of K. Multi-matrix generalizations of these results are discussed.
1. Introduction

More than four decades ago, Wigner [18] suggested to study the distribution of energy levels of complex systems using random matrices. In this approach, one would like to characterize the structure of the energy levels, i.e. the eigenvalues of the Hamiltonian, the latter being considered as a large matrix with random entries. It is now known that many statistical properties of spectra of true physical systems are indeed well described by those of random matrices (cf. [16] for a review): it is therefore important to understand how much these spectral properties depend on the particular matrix ensemble chosen, i.e. determine universality classes of matrix ensembles. For technical reasons, we shall consider here ensembles of hermitian matrices, which correspond to systems without time-reversal invariance. The main quantities of interest are the probability distributions $\rho_n$ of the eigenvalues: if M is a random hermitian N × N matrix, we define $\rho_n(\lambda_1,\dots,\lambda_n)$ to be the density of probability that M has $(\lambda_1,\dots,\lambda_n)$ among its N eigenvalues, with the normalization convention that $\int \prod_{i=1}^{n} d\lambda_i\, \rho_n(\lambda_1,\dots,\lambda_n) = 1$. To connect with correlation functions of the model, we define (following [15]):
\[ R_n(\lambda_1,\lambda_2,\dots,\lambda_n) \equiv \left\langle \prod_{i=1}^{n} \mathrm{tr}\,\delta(M - \lambda_i) \right\rangle. \tag{1.1} \]
Then one has $R_n(\lambda_1,\dots,\lambda_n) = \frac{N!}{(N-n)!}\,\rho_n(\lambda_1,\dots,\lambda_n)$ for distinct $\lambda_i$ (δ functions appear for coinciding eigenvalues). This means that correlation functions of any U(N)-invariant quantities (i.e. functions of the eigenvalues only) can be computed using the $\rho_n$. We shall now study a class of models in which one can express the functions $\rho_n$ in terms of a single kernel K(λ, µ); if we again assume the $\lambda_i$ all distinct, the corresponding relation for $R_n$ will be
Rn (λ1 , λ2 , . . . , λn ) = det(K(λi , λj ))i,j=1...n .
(1.2)
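These formulae are exact at finite N. For example if we also define
\[ R_n^{(c)}(\lambda_1,\lambda_2,\dots,\lambda_n) \equiv \left\langle \prod_{i=1}^{n} \mathrm{tr}\,\delta(M - \lambda_i) \right\rangle_c, \tag{1.3} \]
the connected correlation functions, this implies that
\[ R_2^{(c)}(\lambda,\mu) = -K(\lambda,\mu)K(\mu,\lambda). \tag{1.4} \]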
In these models, the study of the distribution of eigenvalues reduces to the analysis of this kernel; in particular the large N limit of K should allow to compute all correlation functions in this limit and find the different “universal” behaviors that can arise:

⋆ The long distance behavior. As it is known that the kernel fluctuates wildly on intervals of size ∼ 1/N, one must first average the kernel to suppress the oscillations on this scale and obtain a sensible long distance behavior.

⋆ The short distance behavior. This is the region λ − µ ∼ 1/N, in which the fast oscillations mentioned above are relevant.

The long distance behavior can be studied by various standard large N techniques [1, 4, 2], so we shall concentrate here on the short distance behavior. Usually one characterizes this behavior by introducing the level-spacing distribution P(s), s = N(λ − µ) (P(s) a priori also depends on λ or µ). In the large N limit, P(s) can be simply related to the asymptotic form $\hat K$ of the kernel on [λ, µ]:
\[ P(s) = \frac{d^2}{ds^2}\det\left(1 - \frac{1}{N}\hat K\right), \tag{1.5} \]
where det is the Fredholm determinant. Thus, in the short distance region, the universality of K implies the universality of the level spacing. In Sect. 2 we first consider the case of a matrix model, where the measure consists of a simple potential term tr V(M). Expressions of the kernel K in terms of orthogonal polynomials have been known for a long time [15]. One can then proceed to derive a short distance universal behavior of the kernel [4, 8]:
\[ K(\lambda,\mu) \sim \frac{\sin x}{x}, \qquad x \sim N(\lambda - \mu). \tag{1.6} \]
Here we shall rewrite K, and rederive its large N asymptotic form, using a new method which does not make use of orthogonal polynomials, keeping in mind that we are ultimately interested in the more difficult case of the model of matrices coupled to an external field. The first hint that the latter model could possess the same short distance universal behavior appeared in a series of papers [3] by Brézin and Hikami, who showed that formula (1.2) can be generalized to the gaussian ensemble with an external field, that is for the measure $\exp\left(-\frac{N}{2}\,\mathrm{tr}\,M^2 + N\,\mathrm{tr}\,MA\right) d^{N^2}M$. This measure can be interpreted by saying that M is the sum of a fixed Hamiltonian A and a random gaussian part M − A: the constant part A breaks the U(N) invariance of the model so neither orthogonal polynomials (like in the standard one-matrix model) nor biorthogonal polynomials (like in the two-matrix model) are of any use. Still, one can define a kernel in this model, which has the same short distance behavior (1.6). Then in [19] the more general case of a measure of the type exp(N tr(−V(M) + MA)) was investigated. The motivation of this measure was to study a random Hamiltonian which contains a not necessarily gaussian potential, and an external source term which breaks the U(N)-invariance: could the latter symmetry be somehow related to the universality of the level spacing distribution? To answer this question, we shall again define in Sect. 3 a kernel K such that Eq. (1.2) holds. Then, just as in Sect. 2, we shall rewrite K as a matrix integral, allowing to take the N → ∞ limit, and find the short distance universal behavior. The methods used here are very general, and in particular, they can be successfully applied to multi-matrix models; to show this we study in Sect. 4 a model of a chain of matrices, with or without an external field at the end of the chain. We give without proof, and in analogy with the one-matrix model, expressions for the kernel K, which exhibit the same short distance universality. Finally, Appendix 1 describes in detail the analytic structure of the functions involved in the large N limit, and Appendix 2 shows the connection between the formalism used here and large N character formulae.

2. The U(N)-Invariant Case

2.1. Definition of the model and of the kernel. Let us consider an ensemble of random hermitian N × N matrices with the measure
\[ Z^{-1}\exp\left(-N\,\mathrm{tr}\,V(M)\right) d^{N^2}M, \tag{2.1} \]
where V is a polynomial and Z the partition function. An important remark is that this is not the most general U(N)-invariant measure (one could have products of traces of functions of M). A classical result [15] expresses the distribution law $\rho_n$ of n eigenvalues (1 ≤ n ≤ N) of M in terms of the kernel
\[ K(\lambda,\mu) = \sum_{k=0}^{N-1} F_k(\lambda)F_k(\mu). \tag{2.2} \]
Here $F_i$ is the orthonormal function associated to the usual orthogonal polynomial $P_i(\lambda) = \lambda^i + \cdots$:
\[ F_i(\lambda) = h_i^{-1/2} P_i(\lambda)\, e^{-\frac{N}{2}V(\lambda)}, \qquad \int d\lambda\, e^{-NV(\lambda)} P_i(\lambda)P_j(\lambda) = h_i\,\delta_{ij} \tag{2.3} \]
(see [8] for a review of orthogonal polynomials in matrix models). Let us briefly rederive this result. As the measure (2.1) only depends on the eigenvalues of M, the integration over the angular variables is trivial and one finds:
\[ \rho_N(\lambda_1,\lambda_2,\dots,\lambda_N) = Z^{-1}\,\Delta^2(\lambda_i)\, e^{-N\sum_{i=1}^{N}V(\lambda_i)}. \tag{2.4} \]
The Van der Monde determinant $\Delta(\lambda_i) = \det(\lambda_i^{\,j})_{1\le i\le N,\,0\le j\le N-1}$ can be rewritten in terms of the orthogonal polynomials:
\[ \rho_N(\lambda_1,\lambda_2,\dots,\lambda_N) = Z^{-1}\det(P_k(\lambda_i))_{1\le i\le N,\,0\le k\le N-1}\,\det(P_k(\lambda_j))_{1\le j\le N,\,0\le k\le N-1}\; e^{-N\sum_{i=1}^{N}V(\lambda_i)}. \tag{2.5} \]
One can now easily compute $Z = N!\prod_{i=0}^{N-1}h_i$ by integrating over all $\lambda_i$. Combining the two determinants, we finally obtain:
\[ \rho_N(\lambda_1,\lambda_2,\dots,\lambda_N) = \frac{1}{N!}\det(K(\lambda_i,\lambda_j))_{i,j=1\dots N}. \tag{2.6} \]
The kernel K has the following properties:
\[ K(\lambda,\mu) = K(\mu,\lambda), \qquad [K \star K](\lambda,\nu) \equiv \int d\mu\, K(\lambda,\mu)K(\mu,\nu) = K(\lambda,\nu), \tag{2.7} \]
i.e. it is the orthogonal projector on the subspace spanned by the $F_k$, 0 ≤ k ≤ N − 1. Using the property K ⋆ K = K and noting that
\[ \rho_n(\lambda_1,\lambda_2,\dots,\lambda_n) = \int d\lambda_{n+1}\,\rho_{n+1}(\lambda_1,\lambda_2,\dots,\lambda_{n+1}), \tag{2.8} \]
one can then show inductively that
\[ \rho_n(\lambda_1,\lambda_2,\dots,\lambda_n) = \frac{(N-n)!}{N!}\det(K(\lambda_i,\lambda_j))_{i,j=1\dots n} \tag{2.9} \]
for any n ≤ N . This is equivalent to formula (1.2). 2.2. Matrix integral formulation of the kernel. We shall now rewrite K(λ, µ) as a matrix integral: Z 2 N K(λ, µ) = Z −1 e− 2 (V (λ)+V (µ)) d(N −1) M det(λ − M ) det(µ − M ) exp(−N tr V (M )). (2.10) It is easy to check this formula by going over to eigenvalue variables:
Random Matrices in an External Field
Z
=
635
d(N −1) M det(λ − M ) det(µ − M ) exp(−N tr V (M )) 2
Z NY −1
(dλi (λ − λi )(µ − λi )) 12 (λi )1≤i≤N −1 e−N
PN −1 i=1
i=1
=
Z NY −1
dλi 1(λi )1≤i≤N,λN ≡λ 1(λi )1≤i≤N,λN ≡µ e−N
i=1
=
X
σ
σ0
(−1) (−1) Pσ(N ) (λ)P
σ 0 (N )
(µ)
σ,σ 0 ∈SN
N −1 Z Y
V (λi )
PN −1 i=1
V (λi )
dzPσ(i) (z)P
σ 0 (i)
(z)e
−N V (z)
.
i=1
(2.11)
This is non-zero when σ = σ 0 and we find as expected: N
K(λ, µ) = e− 2 (V (λ)+V (µ))
N −1 X
h−1 k Pk (λ)Pk (µ).
(2.12)
k=0
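The kernel (2.12) and the projector property (2.7) can be illustrated numerically. The following is a minimal sketch, not the method of the paper: it assumes a quadratic potential V(x) = x²/2 and N = 3, replaces the integrals by Riemann sums on a uniform grid, and builds the orthonormal polynomials by Gram-Schmidt with respect to that discrete sum. The names `kernel_on_grid`, `XS`, `DX` are illustrative.

```python
import math

def kernel_on_grid(n_levels, potential, xs, dx):
    """Return K[i][j] = sum_k F_k(x_i) F_k(x_j) on the grid xs, cf. (2.2) and (2.12).

    Integrals over lambda are replaced by Riemann sums with step dx, so the
    orthonormality of the F_k and K * K = K hold for that discrete sum.
    """
    weights = [math.exp(-n_levels * potential(x)) * dx for x in xs]
    polys = []  # grid values of h_k^{-1/2} P_k
    for k in range(n_levels):
        p = [x ** k for x in xs]
        # Gram-Schmidt: remove the components along lower-degree polynomials.
        for q in polys:
            overlap = sum(w * a * b for w, a, b in zip(weights, p, q))
            p = [a - overlap * b for a, b in zip(p, q)]
        norm = math.sqrt(sum(w * a * a for w, a in zip(weights, p)))
        polys.append([a / norm for a in p])
    # F_k(x) = h_k^{-1/2} P_k(x) exp(-N V(x) / 2), cf. (2.3)
    damp = [math.exp(-0.5 * n_levels * potential(x)) for x in xs]
    funcs = [[a * d for a, d in zip(q, damp)] for q in polys]
    size = len(xs)
    return [[sum(f[i] * f[j] for f in funcs) for j in range(size)]
            for i in range(size)]

N = 3                                   # assumed matrix size
DX = 0.1                                # assumed grid step
XS = [-4.0 + DX * i for i in range(81)]
K = kernel_on_grid(N, lambda x: 0.5 * x * x, XS, DX)

# Discrete versions of  int K(x, x) dx = N  and  (K * K)(x_i, x_l) = K(x_i, x_l).
trace = sum(K[i][i] for i in range(len(XS))) * DX
i, l = 40, 45
conv = sum(K[i][j] * K[j][l] for j in range(len(XS))) * DX
print(trace, conv - K[i][l])
```

Because the orthonormalization uses the same discrete sum as the checks, the trace equals N and the convolution reproduces K up to rounding; a finer grid is only needed if one wants the continuum $h_k$.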
Note that Eq. (2.10) does not involve orthogonal polynomials, and that is why we shall be able to generalize it to the non U(N)-invariant model of matrices coupled to an external field, for which the orthogonal polynomials formalism is not available.

2.3. Large N asymptotics of the kernel. Formula (2.10) allows to compute asymptotics of K(λ, µ) as N → ∞. It relates K(λ, µ) to the partition function Z(λ, µ) of a matrix model with the measure $\exp(\mathrm{tr}\log(\lambda - M) + \mathrm{tr}\log(\mu - M) - N\,\mathrm{tr}\,V(M))\,d^{(N-1)^2}M$:
\[ Z(\lambda,\mu) \equiv \int d^{(N-1)^2}M\,\det(\lambda - M)\det(\mu - M)\exp(-N\,\mathrm{tr}\,V(M)). \tag{2.13} \]
Rather than directly applying the saddle point method to this expression, it is easier to write differential equations for Z. Indeed one has:
\[ \frac{\partial}{\partial\lambda}\log Z(\lambda,\mu) = (N-1)\,G_{\lambda,\mu}(\lambda), \qquad \frac{\partial}{\partial\mu}\log Z(\lambda,\mu) = (N-1)\,G_{\lambda,\mu}(\mu), \tag{2.14} \]
where $G_{\lambda,\mu}$ is the resolvent of this model:
\[ G_{\lambda,\mu}(z) = \frac{1}{N-1}\left\langle \mathrm{tr}\,\frac{1}{z - M}\right\rangle, \tag{2.15} \]
which depends on λ and µ through the tr log(λ − M) and tr log(µ − M) terms in the action. Note that the factor N − 1 in front of $G_{\lambda,\mu}$ in (2.14) forces us to compute G up to 1/N corrections. Let us now find a saddle point for the eigenvalues. In the large N limit, we shall suppose that they fill a single interval [α, β]; then $G_{\lambda,\mu}(z)$ becomes an analytic function of z with a single cut on [α, β]:
\[ G_{\lambda,\mu}(z \pm i0) = \slashed{G}_{\lambda,\mu}(z) \pm i\pi\rho_{\lambda,\mu}(z), \tag{2.16} \]
where $\rho_{\lambda,\mu}(z)$ is the density of eigenvalues at z. As a definition, $\slashed{G}(z) \equiv \frac{1}{2}(G(z+i0) + G(z-i0))$ for any function G. The saddle point equation can now be written:
\[ 2(N-1)\,\slashed{G}_{\lambda,\mu}(z) - N V'(z) + \frac{1}{z-\lambda} + \frac{1}{z-\mu} = 0 \qquad \forall z \in [\alpha,\beta]. \tag{2.17} \]
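At leading order in N, this equation is just the usual saddle point equation $2\slashed{G}(z) = V'(z)$ for the resolvent G of the original matrix model we have started from. Therefore we can write:
\[ G_{\lambda,\mu} = G + \frac{1}{N}\left(C_G + C_\lambda + C_\mu\right), \tag{2.18} \]
where we have introduced three 1/N corrections to the leading behavior of $G_{\lambda,\mu}$ corresponding to the three corrective terms in the saddle point Eq. (2.17). Following [9, 8] we deduce these corrections from their analytic properties. Both G(z) and $G_{\lambda,\mu}(z)$ behave as 1/z as z → ∞, so $C_G$, $C_\lambda$ and $C_\mu$ are $O(1/z^2)$. Furthermore they satisfy
\[ \slashed{C}_G(z) = \slashed{G}(z), \qquad \slashed{C}_\lambda(z) = \frac{1}{2(\lambda - z)}, \qquad \slashed{C}_\mu(z) = \frac{1}{2(\mu - z)} \tag{2.19} \]
and are regular for all z except near α (resp. β) where they should behave as $1/\sqrt{z-\alpha}$ (resp. $1/\sqrt{z-\beta}$). This determines them entirely:
\[ \begin{aligned} C_G(z) &= G(z) - \frac{1}{\sqrt{(z-\alpha)(z-\beta)}}, \\ C_\lambda(z) &= \frac{1}{2}\,\frac{1}{\sqrt{(z-\alpha)(z-\beta)}}\left(1 - \frac{\sqrt{(z-\alpha)(z-\beta)} - \sqrt{(\lambda-\alpha)(\lambda-\beta)}}{z-\lambda}\right), \\ C_\mu(z) &= \frac{1}{2}\,\frac{1}{\sqrt{(z-\alpha)(z-\beta)}}\left(1 - \frac{\sqrt{(z-\alpha)(z-\beta)} - \sqrt{(\mu-\alpha)(\mu-\beta)}}{z-\mu}\right). \end{aligned} \tag{2.20} \]
The right-hand side of (2.14) can now be written:
\[ (N-1)\,G_{\lambda,\mu}(\lambda) = N G(\lambda) - \frac{1}{2}\frac{d}{d\lambda}\log\sqrt{(\lambda-\alpha)(\lambda-\beta)} - \frac{1}{2}\,\frac{1}{\lambda-\mu}\left(1 - \sqrt{\frac{(\mu-\alpha)(\mu-\beta)}{(\lambda-\alpha)(\lambda-\beta)}}\right), \tag{2.21} \]
and a similar equation for $(N-1)G_{\lambda,\mu}(\mu)$. For λ ∈ [α, β], an ambiguity in (2.21) must be resolved: $G_{\lambda,\mu}$, like G, has a cut on [α, β]; so we must choose λ slightly above or below the real axis to determine the right-hand side of (2.21):
\[ (N-1)\,G_{\lambda,\mu}(\lambda \pm i0) = \frac{N}{2}V'(\lambda) \pm N i\pi\rho(\lambda) - \frac{1}{2}\frac{d}{d\lambda}\log\sqrt{(\lambda-\alpha)(\beta-\lambda)} - \frac{1}{2}\,\frac{1}{\lambda-\mu}\left(1 - \frac{\sqrt{(\mu-\alpha)(\mu-\beta)}}{\pm i\sqrt{(\lambda-\alpha)(\beta-\lambda)}}\right) \tag{2.22} \]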
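($\rho(\lambda) \equiv \rho_1(\lambda)$). This ambiguity, which appears only at N = ∞, means that there are several saddle points which we must all take into account. The same problem appears when µ gets close to the cut, so there is a total of 4 saddle points (ε, ε′) (ε, ε′ = ±) corresponding to the locations of λ and µ with respect to the cut [α, β]. Finally we can write differential equations for K(λ, µ):
\[ \begin{aligned} \frac{\partial}{\partial\lambda}\log K_{(\epsilon,\epsilon')} &= \epsilon N i\pi\rho(\lambda) - \frac{1}{2}\frac{d}{d\lambda}\log\sqrt{(\lambda-\alpha)(\beta-\lambda)} - \frac{1}{2}\,\frac{1}{\lambda-\mu}\left(1 - \epsilon\epsilon'\,\frac{\sqrt{(\mu-\alpha)(\beta-\mu)}}{\sqrt{(\lambda-\alpha)(\beta-\lambda)}}\right), \\ \frac{\partial}{\partial\mu}\log K_{(\epsilon,\epsilon')} &= \epsilon' N i\pi\rho(\mu) - \frac{1}{2}\frac{d}{d\mu}\log\sqrt{(\mu-\alpha)(\beta-\mu)} - \frac{1}{2}\,\frac{1}{\mu-\lambda}\left(1 - \epsilon\epsilon'\,\frac{\sqrt{(\lambda-\alpha)(\beta-\lambda)}}{\sqrt{(\mu-\alpha)(\beta-\mu)}}\right). \end{aligned} \tag{2.23} \]
We introduce the function ϕ(z) which satisfies $z = \frac{1}{2}(\alpha+\beta) - \frac{1}{2}(\beta-\alpha)\cos\varphi(z)$ and $\frac{1}{2}(\beta-\alpha)\sin\varphi(z) = \sqrt{(z-\alpha)(\beta-z)}$. Noting then that
\[ \frac{d}{d\lambda}\log\sin\frac{\varphi(\lambda) - \epsilon\epsilon'\varphi(\mu)}{2} = \frac{1}{2}\,\frac{1}{\lambda-\mu}\,\frac{\sqrt{(\lambda-\alpha)(\beta-\lambda)} + \epsilon\epsilon'\sqrt{(\mu-\alpha)(\beta-\mu)}}{\sqrt{(\lambda-\alpha)(\beta-\lambda)}}, \tag{2.24} \]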
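The trigonometric identity used at this step, $\frac{d}{d\lambda}\log\sin\frac{\varphi(\lambda)-\epsilon\epsilon'\varphi(\mu)}{2} = \frac{1}{2}\frac{1}{\lambda-\mu}\frac{\sqrt{(\lambda-\alpha)(\beta-\lambda)}+\epsilon\epsilon'\sqrt{(\mu-\alpha)(\beta-\mu)}}{\sqrt{(\lambda-\alpha)(\beta-\lambda)}}$, can be checked by finite differences. The sketch below assumes the sample cut [α, β] = [−2, 2] and arbitrary test points; `lhs`, `rhs` and the sign variable `s` (standing for εε′) are illustrative names.

```python
import math

ALPHA, BETA = -2.0, 2.0   # assumed endpoints of the cut, for illustration only

def phi(z):
    """Angle defined by z = (alpha+beta)/2 - (beta-alpha)/2 * cos(phi), phi in [0, pi]."""
    return math.acos((ALPHA + BETA - 2.0 * z) / (BETA - ALPHA))

def root(z):
    """sqrt((z - alpha)(beta - z)) = (beta - alpha)/2 * sin(phi(z)) inside the cut."""
    return math.sqrt((z - ALPHA) * (BETA - z))

def lhs(lam, mu, s, h=1e-6):
    """Central finite difference of log|sin((phi(lam) - s*phi(mu))/2)| with respect to lam."""
    f = lambda x: math.log(abs(math.sin((phi(x) - s * phi(mu)) / 2.0)))
    return (f(lam + h) - f(lam - h)) / (2.0 * h)

def rhs(lam, mu, s):
    """Closed form: (1/2) * 1/(lam - mu) * (root(lam) + s*root(mu)) / root(lam)."""
    return 0.5 / (lam - mu) * (root(lam) + s * root(mu)) / root(lam)

for s in (+1, -1):
    print(s, lhs(0.7, -0.3, s), rhs(0.7, -0.3, s))
```

Both sign choices agree to the accuracy of the finite difference, which supports the factor 1/2 on the right-hand side that is needed when integrating (2.23).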
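Eqs. (2.23) can be integrated:
\[ K_{(\epsilon,\epsilon')}(\lambda,\mu) = c_{(\epsilon,\epsilon')}\,\frac{1}{\lambda-\mu}\,\frac{\sin\frac{\varphi(\lambda)-\epsilon\epsilon'\varphi(\mu)}{2}}{\sqrt{\sin\varphi(\lambda)\sin\varphi(\mu)}}\,\exp\left(i N\pi\epsilon\int_{\lambda_0}^{\lambda}\rho(z)\,dz + i N\pi\epsilon'\int_{\lambda_0}^{\mu}\rho(z)\,dz\right). \tag{2.25} \]
The integration constants $c_{(\epsilon,\epsilon')}$ satisfy $c_{(\epsilon,\epsilon')} = c_{(\epsilon',\epsilon)}$ (interchange of λ and µ) and $c_{(-\epsilon,-\epsilon')} = -\bar c_{(\epsilon,\epsilon')}$ (complex conjugation). $c_{(\pm,\mp)}$ are independent of the choice of $\lambda_0$ and are fixed by imposing the normalization condition K(λ, λ) = Nρ(λ): we find (cf. next section) that $c_{(\pm,\mp)} = 1/(2\pi i)$. $c_{(\pm,\pm)}$ are undetermined; if we assume that one can find $\lambda_0$ such that $c_{(\pm,\pm)} = \pm 1/2\pi$ (for the case of an even potential we would have $\lambda_0 = 0$), then we are left with only one unknown parameter $\lambda_0$. We can finally sum the four functions $K_{(\epsilon,\epsilon')}$; we obtain:
\[ K(\lambda,\mu) = \frac{1}{\pi}\,\frac{1}{\lambda-\mu}\,\frac{1}{\sqrt{\sin\varphi(\lambda)\sin\varphi(\mu)}}\left[\sin\frac{\varphi(\lambda)+\varphi(\mu)}{2}\,\sin\left(N\pi\int_{\mu}^{\lambda}\rho(z)\,dz\right) + \sin\frac{\varphi(\lambda)-\varphi(\mu)}{2}\,\cos\left(N\pi\int_{\lambda_0}^{\lambda}\rho(z)\,dz + N\pi\int_{\lambda_0}^{\mu}\rho(z)\,dz\right)\right], \tag{2.26} \]
a formula for K that is equivalent to the one that was found in [4] by using an ansatz on the form of orthogonal polynomials (see also [8]).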
2.4. Short distance universal behavior of the kernel. Let us now inspect the region where λ − µ ∼ 1/N, α < λ < β. It is clear from (2.25)–(2.26) that the dominant contributions come here from the saddle points (±, ∓) (i.e. λ and µ on opposite sides of the cut). Actually, this can already be seen in Eqs. (2.23), which acquire a particularly simple form in this limit:
\[ \frac{d}{d\lambda}\log K_{(\pm,\mp)} = \pm N i\pi\rho(\lambda) - \frac{1}{\lambda-\mu}, \qquad \frac{d}{d\mu}\log K_{(\pm,\mp)} = \mp N i\pi\rho(\mu) - \frac{1}{\mu-\lambda}. \tag{2.27} \]
Here we do not need these simplified differential equations, since we can directly take the limit in (2.26), which yields
\[ K(\lambda,\mu) = \frac{\sin(N\pi(\lambda-\mu)\rho(\lambda))}{\pi(\lambda-\mu)}. \tag{2.28} \]
This is the well-known short distance universal behavior (1.6) of the kernel.

3. Generalization to the Case of an External Field

It was proven in [19] that in the case of a general measure with an external field, (1.2) still holds; we shall now review this result and write the kernel K in an appropriate way for asymptotic analysis.

3.1. Definition of the model and of the kernel. Let us consider the measure:
\[ Z^{-1}\exp\left(-N\,\mathrm{tr}\,V(M) + N\,\mathrm{tr}\,MA\right) d^{N^2}M, \tag{3.1} \]
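where V is an arbitrary polynomial, and $A = \mathrm{diag}(a_1,\dots,a_N)$ can be assumed diagonal. Particular matrix models of this type appear in several papers [12, 10]. One diagonalizes M: if $M = \Omega\Lambda\Omega^\dagger$, where $\Lambda = \mathrm{diag}(\lambda_1,\dots,\lambda_N)$, the integral over Ω is the usual Itzykson–Zuber integral [11] on the unitary group and we find:
\[ \rho_N(\lambda_1,\lambda_2,\dots,\lambda_N) = Z^{-1}\,\Delta(\lambda_i)\,\frac{\det(\exp(N\lambda_j a_l))}{\Delta(a_l)}\; e^{-N\sum_{i=1}^{N}V(\lambda_i)}. \tag{3.2} \]
Z can now be computed:
\[ \begin{aligned} Z &= N!\,\frac{1}{\Delta(a_l)}\int\prod_{i=1}^{N}d\lambda_i\,\det(\lambda_i^{\,k})_{1\le i\le N,\,0\le k\le N-1}\; e^{N\sum_{i=1}^{N}(-V(\lambda_i)+a_i\lambda_i)} \\ &= \frac{N!}{\Delta(a_l)}\,\det\left(\int d\lambda\,\lambda^{k}\,e^{N(-V(\lambda)+a_l\lambda)}\right)_{0\le k\le N-1,\,1\le l\le N}. \end{aligned} \tag{3.3} \]
Inserting (3.3) into (3.2) yields
\[ \rho_N(\lambda_1,\lambda_2,\dots,\lambda_N) = \frac{1}{N!}\,\frac{\det(\lambda_i^{\,k})_{1\le i\le N,\,0\le k\le N-1}\,\det(\exp(N a_l\lambda_j))_{1\le j,l\le N}}{\det\left(\int d\lambda\,\lambda^{k}\,e^{N(-V(\lambda)+a_l\lambda)}\right)_{0\le k\le N-1,\,1\le l\le N}}\; e^{-N\sum_{i=1}^{N}V(\lambda_i)}. \tag{3.4} \]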
The matrix $m_{lk} = \int d\lambda\,\lambda^{k}\exp N(-V(\lambda)+a_l\lambda)$ possesses an inverse, which we denote by $\alpha_{kl}$; putting together the three determinants (and the exp −N V(λ) factors) we finally obtain:
\[ \rho_N(\lambda_1,\lambda_2,\dots,\lambda_N) = \frac{1}{N!}\det(K(\lambda_i,\lambda_j))_{i,j=1\dots N}, \tag{3.5} \]
where
\[ K(\lambda,\mu) = e^{-\frac{N}{2}(V(\lambda)+V(\mu))}\sum_{k=0}^{N-1}\sum_{l=1}^{N}\alpha_{kl}\,\lambda^{k}\,e^{N a_l\mu}. \tag{3.6} \]
The kernel K satisfies the property:
\[ [K\star K](\lambda,\rho) = e^{-\frac{N}{2}(V(\lambda)+V(\rho))}\sum_{k,k',l,l'}\alpha_{kl}\,\lambda^{k}\int d\mu\,\mu^{k'}e^{N a_l\mu - NV(\mu)}\,\alpha_{k'l'}\,e^{N a_{l'}\rho} = K(\lambda,\rho). \tag{3.7} \]
Thus, one can follow the same line of reasoning as in the U(N)-invariant case to obtain the determinant formulae
\[ \rho_n(\lambda_1,\lambda_2,\dots,\lambda_n) = \frac{(N-n)!}{N!}\det(K(\lambda_i,\lambda_j))_{i,j=1\dots n} \tag{3.8} \]
for any n ≤ N. If we introduce the polynomials $Q_l$:
\[ Q_l(\lambda) = \sum_{k=0}^{N-1}\alpha_{kl}\,\lambda^{k}, \]
then
\[ K(\lambda,\mu) = e^{-\frac{N}{2}(V(\lambda)+V(\mu))}\sum_{l=1}^{N}Q_l(\lambda)\,e^{N a_l\mu}. \]
The polynomials $Q_l$ are of degree N − 1, and satisfy the orthogonality relations:
\[ \int d\lambda\,Q_l(\lambda)\,e^{N(-V(\lambda)+a_{l'}\lambda)} = \delta_{ll'}. \tag{3.9} \]
This proves that K is a non-orthogonal projector on the space spanned by the $Q_l(\lambda)\exp(-NV(\lambda)/2)$, 1 ≤ l ≤ N, which is also the space spanned by the $F_k$, 0 ≤ k ≤ N − 1.

3.2. Matrix integral formulation of the kernel. We shall now guess the matrix integral formulae for the polynomials $Q_l$:
\[ Q_l(\lambda) = c_l\int d^{(N-1)^2}M\,\det(\lambda - M)\exp(N\,\mathrm{tr}(-V(M)+MA^{(l)})). \tag{3.10} \]
Here $A^{(l)}$ stands for the diagonal (N − 1) × (N − 1) matrix obtained from A by removing the eigenvalue $a_l$. The $c_l$ are normalization constants. The right-hand side of (3.10) is obviously a polynomial of degree N − 1. As the polynomials $Q_l$ are entirely characterized by property (3.9), we compute
\[ \int d\lambda_l\,Q_l(\lambda_l)\,e^{N(-V(\lambda_l)+a\lambda_l)} = c_l\,(N-1)!\,\frac{1}{\Delta(A^{(l)})}\int\prod_{i=1}^{N}d\lambda_i\,\Delta(\lambda_i)_{1\le i\le N}\; e^{N\sum_{i(\neq l)}(-V(\lambda_i)+a_i\lambda_i) + N(-V(\lambda_l)+a\lambda_l)}, \tag{3.11} \]
where we have introduced the eigenvalues $\lambda_i$, i ≠ l, of the matrix M, and $\Delta(A^{(l)}) \equiv \Delta(a_{l'})_{l'\neq l}$ is the Van der Monde determinant of the eigenvalues of $A^{(l)}$. Equation (3.11) looks like a N × N matrix integral; if we define $A^{(l),a}$ to be the diagonal N × N matrix obtained from A by replacing $a_l$ with a, we have:
\[ \int d\lambda_l\,Q_l(\lambda_l)\,e^{N(-V(\lambda_l)+a\lambda_l)} = c_l\,\frac{1}{N}\,\frac{\Delta(A^{(l),a})}{\Delta(A^{(l)})}\int d^{N^2}M\,\exp(N\,\mathrm{tr}(-V(M)+MA^{(l),a})). \tag{3.12} \]
If we now set $a = a_{l'}$, l′ ≠ l, the Van der Monde determinant $\Delta(A^{(l),a_{l'}})$ of the eigenvalues of $A^{(l),a_{l'}}$ becomes zero. If $a = a_l$, $A^{(l),a_l} = A$ and the matrix integral is just the partition function Z. For (3.9) to hold we need $c_l$ to be:
\[ c_l = N Z^{-1}\,\frac{1}{\prod_{l'(\neq l)}(a_l - a_{l'})}. \tag{3.13} \]
We can finally express the kernel as:
\[ K(\lambda,\mu) = Z^{-1}e^{-\frac{N}{2}(V(\lambda)+V(\mu))}\int\prod_{i=1}^{N-1}d\lambda_i\; e^{-N\sum_{i=1}^{N-1}V(\lambda_i)}\;\Delta(\lambda_i)_{1\le i\le N,\,\lambda_N\equiv\lambda}\;\det(\exp(N\lambda_i a_l))_{1\le i,l\le N,\,\lambda_N\equiv\mu}. \tag{3.14} \]
Note that K itself, in contrast with the polynomials $Q_l$, is more naturally expressed as an integral over eigenvalues than as a matrix integral.

3.3. Saddle point equation and analytic structure. Before going on with the study of the kernel, we need to understand the analytic structure of the various functions that we shall now introduce. To do so, we first write saddle point equations for the standard partition function:
\[ Z \sim \int\prod_{i=1}^{N}d\lambda_i\,\Delta(\lambda_i)\,\det(\exp(N\lambda_j a_l))\; e^{-N\sum_{i=1}^{N}V(\lambda_i)}. \tag{3.15} \]
We shall suppose that the density of the N eigenvalues of A has a smooth limit as N → ∞. Then the eigenvalues of M also have a smooth large N limit, characterized by a saddle point distribution: we shall assume that the eigenvalues fill a single interval [α, β] (it will be argued later that this hypothesis is only technical and does not change the short distance universal behavior), with a density ρ(λ) ≡ ρ1 (λ). Usually, at this stage, one replaces the determinant det(exp(N λj al )) with P exp(N λi ai ) using the symmetry of exchange of the eigenvalues; here we shall not do so, because this would prevent us from writing down a saddle point equation. Instead, we introduce 2 functions G(z) and a(z) which have a cut on [α, β], such that:
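$$\slashed{G}(\lambda_i)=\frac{1}{N}\frac{\partial}{\partial\lambda_i}\log\Delta(\lambda_i),\qquad \slashed{a}(\lambda_i)=\frac{1}{N}\frac{\partial}{\partial\lambda_i}\log\det(\exp(N\lambda_i a_l)). \tag{3.16}$$
G is of course the resolvent:
$$G(z)=\mathrm{tr}\,\frac{1}{z-M}. \tag{3.17}$$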
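The cut structure of the resolvent can be illustrated numerically. The sketch below is not taken from the text: it assumes the normalization $G(z)=\int\rho(x)\,dx/(z-x)$ and uses the Wigner semicircle density on [−2, 2] (the standard large N density for the Gaussian potential $V(x)=x^2/2$ with A = 0) as a test case. The function names are illustrative. It evaluates G just above and below the cut and returns the average of the two values (the slashed, or principal, part) together with half the size of the jump.

```python
import math

def semicircle_density(x):
    """Wigner semicircle on [-2, 2]; an assumed test density, not from the text."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)

def resolvent(z, n_theta=200000):
    """G(z) = integral of rho(x) / (z - x) dx, computed with x = 2 cos(theta)."""
    step = math.pi / n_theta
    total = 0.0 + 0.0j
    # Under x = 2 cos(theta), rho(x) dx becomes (2/pi) sin^2(theta) d(theta).
    # The endpoints contribute nothing because sin(theta) vanishes there,
    # so the trapezoidal rule reduces to a plain sum over interior nodes.
    for k in range(1, n_theta):
        theta = k * step
        total += math.sin(theta) ** 2 / (z - 2.0 * math.cos(theta))
    return (2.0 / math.pi) * step * total

def principal_part_and_jump(x, eps=1e-3):
    """Approximate (G(x+i0) + G(x-i0))/2 and |G(x+i0) - G(x-i0)|/2 at finite eps."""
    above = resolvent(complex(x, eps))
    below = resolvent(complex(x, -eps))
    return 0.5 * (above + below).real, 0.5 * abs(above - below)

if __name__ == "__main__":
    principal, half_jump = principal_part_and_jump(1.0)
    print("principal part:", principal, "compare V'(x)/2 =", 0.5)
    print("half jump:", half_jump, "compare pi*rho(x) =", math.pi * semicircle_density(1.0))
```

For this test density the principal part should approach $V'(x)/2$ and the half jump should approach $\pi\rho(x)$ as eps goes to zero, up to corrections of order eps.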
a cannot be defined by such a simple formula; we refer to Appendix 1 for a rigorous definition of a. Here let us note that the Itzykson–Zuber formula
$$\frac{\det(\exp(N\lambda_i a_l))}{\Delta(\lambda_i)}\sim\int_{\Omega\in U(N)}d\Omega\,e^{N\,\mathrm{tr}(\Omega\Lambda\Omega^{\dagger}A)} \tag{3.18}$$
implies that $\partial/\partial\lambda_i\,\log(\det(\exp(N\lambda_i a_l))/\Delta(\lambda_i))$ is a regular function of $\lambda_i$ (i.e. there is no pole when $\lambda_i\sim\lambda_j$); so we write it under the form $f(\lambda_i)$, where f can be extended into an analytic function on the whole complex plane. Thus a(z) and G(z) are related by:
$$a(z)\equiv G(z)+f(z). \tag{3.19}$$
From (3.19) we deduce that a has the same cut as G on [α, β], that is
$$a(z\pm i0)=\slashed{a}(z)\pm i\pi\rho(z)\qquad\forall z\in[\alpha,\beta]. \tag{3.20}$$

Fig. 1. Analytic structure of a(z) and G(z). In the physical sheet (below), G(z) has a single cut, while in the other sheet (above) it can have more cuts. It is the opposite for a(z), because of Eq. (3.22)
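The Itzykson–Zuber formula (3.18) can be checked by hand in the smallest non-trivial case, N = 2. The sketch below rests on two assumptions that are not in the text: the Haar measure is normalized to total mass one, and for a Haar-distributed U(2) matrix the quantity $p=|\Omega_{11}|^2$ is uniform on [0, 1], so the group integral collapses to a one-dimensional integral. The proportionality constant left implicit in (3.18) then comes out as 1/t for N = 2, where t plays the role of N in the exponent. All function names are illustrative.

```python
import math

def iz_determinant_side(a, lam, t):
    """N = 2: det(exp(t a_i lam_j)) / (t * Delta(a) * Delta(lam))."""
    det = (math.exp(t * (a[0] * lam[0] + a[1] * lam[1]))
           - math.exp(t * (a[0] * lam[1] + a[1] * lam[0])))
    return det / (t * (a[1] - a[0]) * (lam[1] - lam[0]))

def iz_group_integral_side(a, lam, t, n=20001):
    """Haar average over U(2) of exp(t tr(Omega Lam Omega^dagger A))."""
    # tr(Omega Lam Omega^dagger A) = sum_ij a_i lam_j |Omega_ij|^2, and for a
    # 2 x 2 unitary matrix |Omega_11|^2 = |Omega_22|^2 = p, |Omega_12|^2 = 1 - p.
    diagonal = a[0] * lam[0] + a[1] * lam[1]
    off_diagonal = a[0] * lam[1] + a[1] * lam[0]
    h = 1.0 / (n - 1)
    total = 0.0
    for k in range(n):
        p = k * h
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoidal end weights
        total += weight * math.exp(t * (p * diagonal + (1.0 - p) * off_diagonal))
    return total * h

if __name__ == "__main__":
    a, lam, t = (0.3, -0.5), (1.0, 0.2), 2.0
    print(iz_group_integral_side(a, lam, t), iz_determinant_side(a, lam, t))
```

The two printed numbers should agree to the accuracy of the quadrature. The ratio of Vandermonde determinants is insensitive to the overall sign convention as long as the same one is used for both sets of eigenvalues.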
Equation (3.20) can be taken as a definition of a on [α, β] since $\slashed{a}(z)$ is given by (3.16) (see Appendix 1 for a definition of a(z) on the whole complex plane and a more detailed analysis of its properties). The saddle point equation for (3.15) is
$$\slashed{G}(z)+\slashed{a}(z)=V'(z)\qquad\forall z\in[\alpha,\beta]. \tag{3.21}$$
Using what we know of the analytic structure of G and a (Eq. (3.20)), we can now extend (3.21) to the whole complex plane:
$$G(z)+a^{\star}(z)=V'(z). \tag{3.22}$$
$a^{\star}$ denotes the function connected to a by the cut [α, β]. In other words a is a multivalued function of z, and a(z) and $a^{\star}(z)$ are the values on opposite sides of the cut [α, β],
since (for generic A and V) the cut is a square root-type cut which connects two sheets (Fig. 1).

3.4. Short distance asymptotics of the kernel. Let us analyze the more complicated saddle point equation for the integral
$$Z(\lambda,\mu)=\int\prod_{i=1}^{N-1}d\lambda_i\,e^{-N\sum_{i=1}^{N-1}V(\lambda_i)}\,\Delta(\lambda_i)_{1\le i\le N,\,\lambda_N\equiv\lambda}\,\det(\exp(N\lambda_i a_l))_{1\le i,l\le N,\,\lambda_N\equiv\mu}, \tag{3.23}$$
which is related to the kernel K by Eq. (3.14). As in the U(N)-invariant case, Z(λ, µ) can be considered as the partition function of a model in which the action is the action of (3.15) (of order $N^2$) plus additional terms dependent on λ and µ (of order N). We shall again write differential equations for Z; in a very similar way to the U(N)-invariant case, we find here:
$$\frac{\partial}{\partial\lambda}\log Z(\lambda,\mu)=(N-1)\,G_{\lambda,\mu}(\lambda),\qquad \frac{\partial}{\partial\mu}\log Z(\lambda,\mu)=(N-1)\,a_{\lambda,\mu}(\mu). \tag{3.24}$$
$G_{\lambda,\mu}$ and $a_{\lambda,\mu}$ are defined in the same way as G and a (see Appendix 1 for a definition of a), but with a modified saddle point distribution of the $\lambda_i$ due to the additional terms in the action. Of course, the leading behaviors of $G_{\lambda,\mu}$ and $a_{\lambda,\mu}$ when $N\to\infty$ are simply G and a, since the corrective terms are negligible in the leading approximation. If we wanted to solve the differential equations for all values of λ and µ, we should now calculate the 1/N corrections to the leading behavior. However, as we are only interested in the short distance behavior of K, we can restrict ourselves to the region λ − µ ∼ 1/N: variations of λ, µ around the diagonal λ = µ (where K is known: K(λ, λ) = Nρ(λ)) are then of order 1/N, i.e. we only need the leading behavior of ∂/∂λ log K. This means that the next corrections of $G_{\lambda,\mu}(z)$ and $a_{\lambda,\mu}(z)$ are actually irrelevant in this region, except for possible poles of the type 1/(z − λ) or 1/(z − µ), which would be again of order N. Of course $G_{\lambda,\mu}$ and $a_{\lambda,\mu}$ do not have any poles on the “physical sheet”; but we have learnt from the U(N)-invariant case that different choices of sheets (or of values on the cut joining these sheets, which amounts to the same) correspond to different saddle points, and that is how we were led to taking into account saddle points (±, ∓) in which poles at z = λ and z = µ do appear (cf. Figs. 2 and 3 in the next section). We now recall that a(z) = G(z) + f(z), where f is an analytic function (regular on the cut [α, β]). In the same way $a_{\lambda,\mu}(z)=G_{\lambda,\mu}(z)+f_{\lambda,\mu}(z)$, and it is now clear that poles can only come from the Vandermonde part of a(z) and not from the regular part f(z). More explicitly, one can write down a saddle point equation for $G_{\lambda,\mu}$ and $a_{\lambda,\mu}$ which looks like
$$(N-1)\,\slashed{G}_{\lambda,\mu}(z)+(N-1)\,\slashed{a}_{\lambda,\mu}(z)-NV'(z)+\frac{1}{z-\lambda}+\frac{1}{z-\mu}+(\text{regular terms of order }O(N^0))=0. \tag{3.25}$$
At this level of accuracy $f_{\lambda,\mu}=f$, the correction to f being a regular term of order 1/N. Then the correction $a_{\lambda,\mu}-a=G_{\lambda,\mu}-G$ is the same for a or G, and the analysis of
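Eq. (3.25) becomes perfectly identical to what was done in Sect. 2 with Eq. (2.17). We immediately write the differential equations that we obtain:
$$\begin{aligned}
\frac{\partial}{\partial\lambda}\log Z_{(\epsilon,\epsilon')}&=\epsilon\,Ni\pi\rho(\lambda)+N\slashed{G}(\lambda)-\frac{1}{2}\,\frac{1}{\lambda-\mu}\left(1-\epsilon\epsilon'\,\frac{\sqrt{(\mu-\alpha)(\beta-\mu)}}{\sqrt{(\lambda-\alpha)(\beta-\lambda)}}\right),\\
\frac{\partial}{\partial\mu}\log Z_{(\epsilon,\epsilon')}&=\epsilon'\,Ni\pi\rho(\mu)+N\slashed{a}(\mu)-\frac{1}{2}\,\frac{1}{\mu-\lambda}\left(1-\epsilon\epsilon'\,\frac{\sqrt{(\lambda-\alpha)(\beta-\lambda)}}{\sqrt{(\mu-\alpha)(\beta-\mu)}}\right).
\end{aligned} \tag{3.26}$$
Again it is clear that only the saddle-points (±, ∓) need to be considered because the saddle-points (±, ±) are suppressed by a factor of 1/N. It is now convenient to introduce a modified kernel; noting that transformations:
$$\tilde{K}(\lambda,\mu)=\gamma(\lambda)\,K(\lambda,\mu)\,\frac{1}{\gamma(\mu)} \tag{3.27}$$
do not affect values of determinants of type (1.2), we choose here $\gamma(\lambda)=\exp(\frac{N}{2}V(\lambda))$ so that
$$\tilde{K}(\lambda,\mu)\sim e^{-NV(\mu)}\,Z(\lambda,\mu). \tag{3.28}$$
Using the saddle point Eq. (3.21), we can rewrite the differential equations:
$$\begin{aligned}
\frac{\partial}{\partial\lambda}\log\tilde{K}_{\pm,\mp}&=\pm Ni\pi\rho(\lambda)+N\slashed{G}(\lambda)-\frac{1}{\lambda-\mu},\\
\frac{\partial}{\partial\mu}\log\tilde{K}_{\pm,\mp}&=\mp Ni\pi\rho(\mu)-N\slashed{G}(\lambda)-\frac{1}{\mu-\lambda}.
\end{aligned} \tag{3.29}$$
Imposing the condition that $\tilde{K}(\lambda,\lambda)=N\rho(\lambda)$, we finally get:
$$\tilde{K}(\lambda,\mu)=e^{N(\lambda-\mu)\slashed{G}(\lambda)}\,\frac{\sin(N\pi(\lambda-\mu)\rho(\lambda))}{\pi(\lambda-\mu)}. \tag{3.30}$$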
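The sine form in (3.30) can be compared with a finite N kernel in the simplest setting. The sketch below is an illustration under stated assumptions, not the computation of the text: it takes A = 0 and the Gaussian potential $V(x)=x^2/2$, for which the orthonormal functions are rescaled Hermite functions and the limiting density at the origin is $\rho(0)=1/\pi$. It builds the symmetric kernel K (before the change of gauge (3.27), which only contributes a factor $1+O(1/N)$ at these distances) and compares $K(0,\mu)/K(0,0)$ with $\sin(\pi u)/(\pi u)$ for $\mu=u/(N\rho(0))$. The function names are illustrative.

```python
import math

def hermite_functions(y, n):
    """Orthonormal functions phi_k(y), 0 <= k < n, for the weight exp(-y^2/2)."""
    phi = [math.exp(-y * y / 4.0) / (2.0 * math.pi) ** 0.25]
    if n > 1:
        phi.append(y * phi[0])
    # Normalized three-term recurrence of the (probabilists') Hermite polynomials.
    for k in range(1, n - 1):
        phi.append((y * phi[k] - math.sqrt(k) * phi[k - 1]) / math.sqrt(k + 1))
    return phi

def kernel(lam, mu, n_size):
    """Finite-N kernel sum_k psi_k(lam) psi_k(mu) for the weight exp(-N x^2 / 2)."""
    scale = math.sqrt(n_size)
    left = hermite_functions(scale * lam, n_size)
    right = hermite_functions(scale * mu, n_size)
    # psi_k(x) = N^(1/4) phi_k(sqrt(N) x), hence the overall factor sqrt(N).
    return scale * sum(p * q for p, q in zip(left, right))

def sine_kernel_ratio(u):
    """sin(pi u) / (pi u), the normalized short distance form of (3.30)."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

if __name__ == "__main__":
    n_size, rho0 = 40, 1.0 / math.pi
    k00 = kernel(0.0, 0.0, n_size)
    print("K(0,0)/N =", k00 / n_size, "compare rho(0) =", rho0)
    for u in (0.25, 0.5, 1.0):
        ratio = kernel(0.0, u / (n_size * rho0), n_size) / k00
        print(u, ratio, sine_kernel_ratio(u))
```

At N = 40 the agreement is only expected up to corrections of relative order 1/N, so the comparison is qualitative rather than exact.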
Formula (3.30) is a generalization of what was obtained in Sect. 2 (which is the case A = 0). The new factor $e^{N(\lambda-\mu)\slashed{G}(\lambda)}$ can be absorbed in a redefinition of the kernel of the type (3.27).

3.5. Multi-cut generalization. So far we have always assumed that the resolvent G has a single cut, i.e. the saddle point density of eigenvalues should be non-zero on a single interval [α, β]. Intuitively it seems clear that removing the “single cut” hypothesis, which is a long distance effect, should not change the short distance behavior of the kernel. Indeed, it has been checked in [6] that the short distance universality is preserved in the quartic U(N)-invariant case with two symmetric cuts (but the long distance behavior is modified). Here we shall argue that, in the more general models we consider, this change is irrelevant for short distance asymptotics. Let us start from Eq. (3.25), which was derived without any assumption on the cuts of G or a. We shall study the correction $C_\mu(z)$ which satisfies
$$\slashed{C}_\mu(z)=\frac{1}{2(z-\mu)} \tag{3.31}$$
and its effect on the differential equations (the same line of reasoning will apply to Cλ (z)). Since our interest lies in the region λ − µ ∼ 1/N , we can assume that both
λ and µ are in the same interval [α, β], which is but one of the cuts of G. Of course λ (resp. µ) must be slightly shifted in the imaginary direction to remove ambiguities, with a shift of $i\epsilon$ (resp. $i\epsilon'$). As already noted, $C_\mu(z)$ cannot have a pole at z = µ on the physical sheet. However,
$$\slashed{C}_\mu(z)=\frac{1}{2}\bigl(C(z+i0)+C^{\star}(z+i0)\bigr)=\frac{1}{2}\bigl(C(z-i0)+C^{\star}(z-i0)\bigr), \tag{3.32}$$
where $C^{\star}(z)$ is the function connected to C(z) by the cut [α, β] (note that the definition of $C^{\star}$ does not depend on the cut chosen, because of the saddle point equation); so (3.31) can be continued to complex values of z and one concludes that $C^{\star}(z)$ has a pole at z = µ, with a residue of 1. It is now easy to derive the simplified differential equations for $K_{\epsilon,\epsilon'}$ in an even more schematic way than before. If $\epsilon=\epsilon'$, λ will never get close to the pole of $C^{\star}$ (Fig. 2); therefore there will be no 1/(λ − µ) term in the differential equations and $K_{\epsilon,\epsilon'}$ will be of order $O(N^0)$, i.e. negligible. On the other hand, if $\epsilon=-\epsilon'$, λ will reach the pole of $C^{\star}$ (Fig. 3); this time we obtain Eqs. (3.29), and the usual short distance universality.
Fig. 2. Analytic structure of $G_{\lambda,\mu}$ and $a_{\lambda,\mu}$ when λ and µ are on the same side of the cut [α, β]. When λ ∼ µ and the λ on the physical sheet (indicated by a small arrow) approaches the cut, it does not reach the pole at z = µ on the other sheet

Fig. 3. Analytic structure of $G_{\lambda,\mu}$ and $a_{\lambda,\mu}$ when λ and µ are on opposite sides of the cut. This time the λ on the physical sheet, as it approaches the real axis, “sees” the pole at z = µ on the other sheet (through the cut)
4. Multi-Matrix Generalization

The results obtained in the previous sections can be generalized to models of a chain of matrices. We shall only briefly discuss these, since they are less interesting physically.
4.1. The U(N)-invariant case. The first model considered is defined by the measure (with arbitrary potentials $V^{(m)}$):
$$\prod_{m=1}^{M}d^{N^2}M^{(m)}\,\exp\left(N\,\mathrm{tr}\left(-\sum_{m=1}^{M}V^{(m)}(M^{(m)})+\sum_{m=1}^{M-1}M^{(m)}M^{(m+1)}\right)\right). \tag{4.1}$$
It has a global U(N) invariance, and just as in Sect. 1, can be treated by introducing the appropriate orthogonal polynomials. Here they are biorthogonal polynomials $P_k$ and $Q_k$ with respect to a non-local measure:
$$\int\prod_{m=1}^{M}d\lambda^{(m)}\,P_k(\lambda^{(1)})\,Q_l(\lambda^{(M)})\,e^{-N\sum_{m=1}^{M}V^{(m)}(\lambda^{(m)})+N\sum_{m=1}^{M-1}\lambda^{(m)}\lambda^{(m+1)}}=\delta_{kl}. \tag{4.2}$$
We can now define the distributions $\rho_n(\lambda_1,\ldots,\lambda_n)$ of the eigenvalues of $M^{(1)}$. It is important to note that these distributions are different from the distributions of Sect. 3, after (quenched) averaging over the external field A. They satisfy the usual determinant formulae (1.2) with the kernel
$$\begin{aligned}
K(\lambda,\mu)={}&e^{-\frac{N}{2}(V(\lambda)+V(\mu))}\int\left(\prod_{i=1}^{N-1}d\lambda_i^{(1)}\,e^{-NV(\lambda_i^{(1)})}\right)\left(\prod_{m=2}^{M}\prod_{i=1}^{N}d\lambda_i^{(m)}\,e^{-NV(\lambda_i^{(m)})}\right)\\
&\times\Delta(\lambda_i^{(1)})_{\lambda_N^{(1)}\equiv\lambda}\,\det(\exp(N\lambda_i^{(1)}\lambda_j^{(2)}))_{\lambda_N^{(1)}\equiv\mu}\left(\prod_{m=2}^{M-1}\det(\exp(N\lambda_i^{(m)}\lambda_j^{(m+1)}))\right)\Delta(\lambda_i^{(M)}).
\end{aligned} \tag{4.3}$$
We have expressed K directly as an integral over eigenvalues. Note that at this stage we can replace the determinants $\det(\exp(N\lambda_i^{(m)}\lambda_j^{(m+1)}))$ in (4.3) with
$$\exp\Bigl(N\sum_i\lambda_i^{(m)}\lambda_i^{(m+1)}\Bigr),$$
which would lead to an expression of K in terms of biorthogonal polynomials:
$$K(\lambda,\mu)=e^{-\frac{N}{2}(V(\lambda)+V(\mu))}\sum_{k=0}^{N-1}P_k(\lambda)\,\tilde{P}_k(\mu), \tag{4.4}$$
where
$$\tilde{P}_k(\mu)=\int\prod_{m=2}^{M}d\lambda^{(m)}\,Q_k(\lambda^{(M)})\,e^{N\mu\lambda^{(2)}+N\sum_{m=2}^{M-1}\lambda^{(m)}\lambda^{(m+1)}-N\sum_{m=2}^{M}V(\lambda^{(m)})}. \tag{4.5}$$
Here, as in Sect. 3, we do not follow this path: we keep the determinants, in order to have saddle points. We then write down differential equations for the kernel, which can be simplified in the region λ − µ ∼ 1/N : the resulting equations are identical to Eq. (3.29), and we find the same asymptotic expression (3.30). One could also consider correlations of eigenvalues of a matrix M (m) somewhere inside the chain, with presumably the same techniques and results. However, this has not been investigated in detail yet.
4.2. The chain of matrices with an external field at one end. In this model we add to the measure (4.1) of the preceding section an external source term for $M^{(1)}$:
$$\prod_{m=1}^{M}d^{N^2}M^{(m)}\,\exp\left(N\,\mathrm{tr}\left(-\sum_{m=1}^{M}V^{(m)}(M^{(m)})+AM^{(1)}+\sum_{m=1}^{M-1}M^{(m)}M^{(m+1)}\right)\right). \tag{4.6}$$
This measure is no longer U(N)-invariant. We define again the distributions of eigenvalues of $M^{(1)}$, which satisfy determinant formulae (1.2): we give without proof the kernel
$$\begin{aligned}
K(\lambda,\mu)={}&e^{-\frac{N}{2}(V(\lambda)+V(\mu))}\int\left(\prod_{i=1}^{N-1}d\lambda_i^{(1)}\,e^{-NV(\lambda_i^{(1)})}\right)\left(\prod_{m=2}^{M}\prod_{i=1}^{N}d\lambda_i^{(m)}\,e^{-NV(\lambda_i^{(m)})}\right)\\
&\times\det(\exp(Na_i\lambda_j^{(1)}))_{\lambda_N^{(1)}\equiv\lambda}\,\det(\exp(N\lambda_i^{(1)}\lambda_j^{(2)}))_{\lambda_N^{(1)}\equiv\mu}\left(\prod_{m=2}^{M-1}\det(\exp(N\lambda_i^{(m)}\lambda_j^{(m+1)}))\right)\Delta(\lambda_i^{(M)}).
\end{aligned} \tag{4.7}$$
The differential equations will look a little more general; for example the equivalent of Eq. (3.26) will be:
$$\frac{\partial}{\partial\lambda}\log Z(\lambda,\mu)=(N-1)\,a_{\lambda,\mu}(\lambda),\qquad \frac{\partial}{\partial\mu}\log Z(\lambda,\mu)=(N-1)\,\lambda^{(2)}_{\lambda,\mu}(\mu), \tag{4.8}$$
where 2 functions $a(\lambda^{(1)})$ and $\lambda^{(2)}(\lambda^{(1)})$ must be introduced (the logarithmic derivatives of $\det(\exp(Na_i\lambda_j^{(1)}))$ and $\det(\exp(N\lambda_i^{(1)}\lambda_j^{(2)}))$). The rest of the analysis is the same. At the end, one redefines the kernel K with a transformation of type (3.27), using the saddle point equation: $\slashed{a}(z)+\slashed{\lambda}^{(2)}(z)=V'(z)$, so that it satisfies the universal property (3.28).

5. Conclusion

Let us summarize our results. The model considered is that of a matrix coupled to an external field A; the latter has a smooth large N limit characterized by a limiting density of eigenvalues. A kernel K(λ, µ) is defined such that the determinant formulae (1.2) hold. In the simplest case where A = 0, the large N form of K(λ, µ) for all λ and µ, first found in [4], is reproduced here. In the case of a non-zero A, we restrict ourselves to the region λ − µ ∼ 1/N (it is not clear that, for general A and V, the long distance behavior of K, even after smoothing the oscillations, should be interesting, e.g. should exhibit any kind of universal behavior). The asymptotic form (3.30) is obtained, extending the level spacing universality to this class of models. The key ingredient of the derivation of the short distance universality is of course the existence of the kernel. But it seems reasonable to assume that the level-spacing universality (observed experimentally for a broad range of systems) should be true for very general matrix models, in which the correlation functions do not satisfy (1.2). The problem is then to manage to compute the level-spacing distribution P(s), even though it is no longer simply related (through K, cf. Eq. (1.5)) to the correlation functions, that
is, the naturally calculable quantities of the model. This suggests that a totally different approach is probably necessary. Here, let us mention that the question of knowing how far the universality of P(s) can be extended is reminiscent of the question of knowing what is the domain of attraction of a fixed point in renormalization group theory. What we have shown is that in our matrix models, both non-gaussian terms and terms explicitly breaking the U(N)-invariance are irrelevant in the large N limit and lead to the gaussian U(N)-invariant fixed point. Maybe RG methods can be applied here too (cf. [5]) and allow a much more general approach to the problem. In the case of multi-matrix models, the same short distance behavior is found for the correlation functions of the eigenvalues of a given matrix, here the first matrix in a chain of matrices. Of course, it would be interesting to investigate the more general problem of the correlations between eigenvalues of different matrices in the chain. This has been already done in the gaussian case [3]. The conclusion is that the interesting limit is that of an infinite chain, which tends to the c = 1 matrix model
$$Z=\int[dM(t)]\,\exp\left(-N\,\mathrm{tr}\bigl(V(M(t),t)+\dot{M}^2\bigr)\right).$$
One should then study the correlations of the eigenvalues of $M(t_1)$ and $M(t_2)$ as $t_1$ and $t_2$ are close to each other. In order to apply the methods used in this paper, one will of course have to generalize the determinant formulae (1.2) to include eigenvalues of different matrices.

Acknowledgement. I would like to thank E. Brézin and B. Eynard for stimulating discussions.
Appendix 1. Analytic Structure and Functional Inverses

Let us define the function a(λ) used in Sect. 3, and its functional inverse λ(a). We shall assume, as in [14], that the logarithm of the Itzykson–Zuber integral
$$W=\frac{1}{N^2}\log\frac{\det(\exp(Na_l\lambda_i))}{\Delta(\lambda_i)\,\Delta(a_l)} \tag{A1.1}$$
has a smooth large N limit, so that it depends only on the distributions of eigenvalues ρ(λ) and $\tilde{\rho}(a)$ of M and A. Then one can define functional derivatives of W with respect to ρ and $\tilde{\rho}$. It is clear (Eq. (3.16)) that
$$\slashed{a}(\lambda)-\slashed{G}(\lambda)=\frac{d}{d\lambda}\frac{\delta}{\delta\rho(\lambda)}W[\rho,\tilde{\rho}]. \tag{A1.2}$$
The r.h.s. can be extended to complex values of λ and has no cut on [α, β] (the support of the density ρ), therefore we can define a(λ) by removing the slashes in (A1.2). Note that this corresponds to the very simple finite N definition:
$$a(\lambda)\equiv\frac{1}{N}\frac{d}{d\lambda}\log\det(\exp(Na_l\lambda_i))_{\lambda_{N+1}\equiv\lambda}. \tag{A1.3}$$
The only problem of this definition is that there are now N eigenvalues $a_l$ and N + 1 eigenvalues $\lambda_i$: one should add one $a_l$ or remove one $\lambda_i$. In the large N limit, this problem disappears, since if we believe that W depends only on ρ(λ) and $\tilde{\rho}(a)$, then
one can redefine the eigenvalues $a_l$ (or $\lambda_i$) to add an eigenvalue $a_l$ (or remove one $\lambda_i$), keeping the densities $\tilde{\rho}(a)$ and ρ(λ) fixed. All that has been done so far is symmetric in the exchange of A and M, so we can also define λ(a):
$$\lambda(a)=\tilde{G}(a)+\frac{d}{da}\frac{\delta}{\delta\tilde{\rho}(a)}W[\rho,\tilde{\rho}], \tag{A1.4}$$
where $\tilde{G}(a)$ is the resolvent of A, and the functional derivative has been again extended to complex values of a. Let us now discuss the analytical structure of a(λ) and λ(a). On [α, β], according to (A1.2), a(λ) has the same cut as G(λ), i.e.
$$\mathrm{Im}\,a(\lambda\pm i0)=\pm\pi\rho(\lambda). \tag{A1.5}$$
Likewise, λ(a) has the same cut as $\tilde{G}(a)$. We can then define $a^{\star}(\lambda)$ (resp. $\lambda^{\star}(a)$) to be the functions on the other side of the cut of G (resp. $\tilde{G}$). If we now consider the matrix model with an external source term (measure (3.1)), $a^{\star}(\lambda)$ satisfies an additional constraint which is the saddle point Eq. (3.22). One can see that this implies that $a^{\star}(\lambda)$, just like G(λ), has no other cut in the whole complex plane than that on [α, β]. On the other hand, $\lambda^{\star}(a)$ is not constrained and therefore it may have more cuts (leading to other sheets). Finally, it can be shown by studying the large N limit of the Itzykson–Zuber integral that λ(a) and a(λ) (as multi-valued functions) are functional inverses of each other. This can be thought of as a generalization of the inversion relation found in [14], even though the connection is non-trivial. Here we shall derive this relation in an elementary fashion. We rewrite definition (A1.3) explicitly:
$$a(\lambda)=\frac{\sum_{l=1}^{N+1}a_l\,E_{l,N+1}\,e^{Na_l\lambda}}{\sum_{l=1}^{N+1}E_{l,N+1}\,e^{Na_l\lambda}}, \tag{A1.6}$$
where $E_{l,N+1}$ is the determinant $\det(\exp(Na_k\lambda_i))$ with $1\le k\le N+1$, $k\neq l$ and $1\le i\le N$. Now Eq. (A1.1) can be applied to $E_{l,N+1}$:
$$\frac{1}{N^2}\log\frac{E_{l,N+1}}{\Delta(\lambda_i)\,\Delta(a_k)_{k\neq l}}=W[\rho,\tilde{\rho}_l]+O(1/N^2), \tag{A1.7}$$
where $\tilde{\rho}_l$ is the density of the $a_k$, $k\neq l$; computing the 1/N correction to $\log E_{l,N+1}$ one finds
$$\log E_{l,N+1}=C-N\frac{\delta}{\delta\tilde{\rho}(a_l)}W[\rho,\tilde{\rho}]-\sum_{k(\neq l)}\log(a_k-a_l), \tag{A1.8}$$
where C is independent of l. Finally, using the standard trick which is to replace the sum over l with a contour integral, (A1.6) becomes
$$a(\lambda)=\frac{\oint da\,\tilde{G}(a)\,a\,e^{Na\lambda}\,e^{-N\int^{a}\lambda(a')\,da'}}{\oint da\,\tilde{G}(a)\,e^{Na\lambda}\,e^{-N\int^{a}\lambda(a')\,da'}}. \tag{A1.9}$$
The saddle point equation gives simply: a(λ) = a with λ(a) = λ, which proves the functional inversion relation. A last remark: the analytic structure described here can be made more explicit in the gaussian case (with an external field), by using Pastur’s self-consistent relation [17].
Appendix 2. Connection with Characters

One may notice the strong resemblance between equations discussed in Sect. 3.3 and Appendix 1 and large N character relations found by Kazakov et al. in [13]. To establish the connection, we shall now rederive in a very simple manner the latter relations. A representation of U(N) can be described by its highest weights $m_1\ge m_2\ge\ldots\ge m_N$. The corresponding character is defined by:
$$\chi_h(M)=\frac{\det(z_i^{h_j})}{\Delta(z_i)}, \tag{A2.1}$$
where the $z_i$ are eigenvalues of the $N\times N$ matrix M, and the $h_i$ are the shifted weights $h_i=m_i+N-i$. If the weights are positive ($m_N\ge 0$, which we shall simply write $h\ge 0$) then we can use the identity
$$\sum_{h'\ge 0}\chi_{h'}(M)\,\chi_{h'}(M_0)=\exp\left(\sum_{q=1}^{\infty}\frac{1}{q}\,\mathrm{tr}\,M^q\,\mathrm{tr}\,M_0^q\right) \tag{A2.2}$$
and the orthogonality of characters to compute
$$\chi_h(M_0)=\int_{M\in U(N)}dM\,\bar{\chi}_h(M)\sum_{h'\ge 0}\chi_{h'}(M)\,\chi_{h'}(M_0)=\int_{M\in U(N)}dM\,\bar{\chi}_h(M)\,\exp(N\,\mathrm{tr}\,V(M)). \tag{A2.3}$$
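The definition (A2.1) and the identity (A2.2) can be tried out for N = 2. The sketch below makes two assumptions that are not in the text: the Vandermonde determinant is taken as $\Delta(z)=z_1-z_2$, and the eigenvalues are chosen with modulus below 1 so that the sum over weights converges numerically (for unitary M the identity is understood formally). The function names are illustrative.

```python
import math

def character_u2(m, z):
    """Character (A2.1) for N = 2 with shifted weights h_i = m_i + N - i."""
    h1, h2 = m[0] + 1, m[1]
    det = z[0] ** h1 * z[1] ** h2 - z[0] ** h2 * z[1] ** h1
    return det / (z[0] - z[1])  # assumed convention: Delta(z) = z_1 - z_2

def weight_sum(z, w, cutoff=60):
    """Left-hand side of (A2.2): sum over weights m_1 >= m_2 >= 0, truncated."""
    return sum(character_u2((m1, m2), z) * character_u2((m1, m2), w)
               for m1 in range(cutoff) for m2 in range(m1 + 1))

def exponential_side(z, w, qmax=200):
    """Right-hand side of (A2.2), with tr M^q = z_1^q + z_2^q, truncated at qmax."""
    def power_sum(x, q):
        return x[0] ** q + x[1] ** q
    return math.exp(sum(power_sum(z, q) * power_sum(w, q) / q
                        for q in range(1, qmax + 1)))

if __name__ == "__main__":
    z, w = (0.3, -0.2), (0.25, 0.1)
    print("weights (1,0):", character_u2((1, 0), z), "compare tr M =", z[0] + z[1])
    print("weights (1,1):", character_u2((1, 1), z), "compare det M =", z[0] * z[1])
    print("identity (A2.2):", weight_sum(z, w), exponential_side(z, w))
```

With the weights (1, 0) the character reduces to the trace and with (1, 1) to the determinant, which gives a quick sanity check of the shift $h_i=m_i+N-i$.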
The potential $V(z)=\sum_{q=1}^{\infty}t_q z^q/q$ is related to $M_0$ by $t_q=\frac{1}{N}\mathrm{tr}(M_0^q)$. We now diagonalize M and integrate over the unitary group; we are left with the eigenvalues $z_i$ of M, $|z_i|=1$; using definition (A2.1) we obtain:
$$\chi_h(M_0)=\oint\prod_{i=0}^{N-1}\frac{dz_i}{z_i}\,\Delta(z_i)\,\det(z_i^{-Nh_j})\,e^{N\sum_{i=0}^{N-1}V(z_i)} \tag{A2.4}$$
(the shifted weights $h_j$ have been rescaled by a factor of N). This is to be compared with partition function (3.15). The study of the two cases (and in particular the analytic structure that arises) being very similar, we shall now skip the details of the derivation. One first writes down a saddle point equation:
$$\slashed{k}(z)+\slashed{G}(z)+V'(z)=0, \tag{A2.5}$$
where the function k(z) is defined by:
$$\slashed{k}(z_i)=\frac{1}{N}\frac{\partial}{\partial z_i}\log\det(z_i^{-Nh_j}). \tag{A2.6}$$
k(z) has the same cut as the resolvent G(z), i.e. where the saddle point density of eigenvalues is non-zero. We then have
$$k^{\star}(z)+G(z)+V'(z)=0 \tag{A2.7}$$
on the whole complex plane. ($k^{\star}$ is the function on the other side of the cut of k.) We make the trivial redefinition $h(z)=-zk^{\star}(z)$, and expand in powers of z:
$$\begin{aligned}
h(z)&=\sum_{q=1}^{\infty}t_q z^q+1+\sum_{q=1}^{\infty}z^{-q}\left(\frac{1}{N}\sum_{i=0}^{N-1}z_i^q\right)\\
&=\sum_{q=1}^{\infty}t_q z^q+1+\sum_{q=1}^{\infty}z^{-q}\,\frac{q}{N^2}\frac{\partial}{\partial t_q}\log\chi_h(M_0),
\end{aligned} \tag{A2.8}$$
which is precisely the result obtained by much more complicated methods in [13] (in their notation $z\equiv G^{-1}$). Finally, the same functional inversion argument as in Appendix 1 applies here, proving that z(h) defined by $z(h)=\exp(-w(h))$ with
$$\slashed{w}(h_i)=\frac{1}{N}\frac{\partial}{\partial h_i}\log\chi_h(M_0) \tag{A2.9}$$
(cf. [13] for a similar definition of G(h) = 1/z(h)) is the functional inverse of h(z).

References
1. Ambjorn, J. and Makeenko, Yu.M.: Mod. Phys. Lett. A5, 1753 (1990)
2. Beenakker, C.W.J.: Nucl. Phys. B422, 515 (1994)
3. Brézin, E. and Hikami, S.: Nucl. Phys. B479, 697 (1996); Brézin, E. and Hikami, S.: Preprint cond-mat/9608116; Brézin, E. and Hikami, S.: Preprint cond-mat/9702213
4. Brézin, E. and Zee, A.: Nucl. Phys. B402, 613 (1993)
5. Brézin, E. and Zinn-Justin, J.: Phys. Lett. B288, 54 (1992); Higuchi, S., Itoi, C., Nishigaki, S., Sakai, N.: Nucl. Phys. B318, 63 (1993)
6. Deo, N.: Preprint cond-mat/9703136
7. Dyson, F.J.: J. Math. Phys. 13, 90 (1972)
8. Eynard, B.: Gravitation quantique bidimensionnelle et matrices aléatoires: Thèse de doctorat de l’Université Paris 6, 1995
9. Eynard, B. and Zinn-Justin, J.: Nucl. Phys. B386, 558 (1992)
10. Gross, D.J. and Newman, M.J.: Phys. Lett. B266, 291 (1991)
11. Harish Chandra: Am. J. Math. 79, 87–120 (1957); Itzykson, C. and Zuber, J.-B.: J. Math. Phys. 21, 411 (1980)
12. Kazakov, V.A.: Nucl. Phys. B (Proc. Suppl.) 4, 93 (1988)
13. Kazakov, V.A., Staudacher, M. and Wynter, T.: Commun. Math. Phys. 177, 451 (1996); Commun. Math. Phys. 179, 235 (1996)
14. Matytsin, A.: Nucl. Phys. B411, 805 (1994)
15. Mehta, M.L.: Random matrices. 2nd ed. New York: Academic Press, 1991
16. Mello, P.A.: Theory of random matrices: Spectral statistics and scattering problems. In Mesoscopic quantum physics, Les Houches Session LXI, E. Akkermans, G. Montambaux, J.-L. Pichard, and J. Zinn-Justin, eds., Amsterdam: North-Holland, 1994, and references therein
17. Pastur, L.A.: Theor. Math. Phys. (USSR) 10, 67 (1972)
18. Wigner, E.P.: Proc. Cambridge Philos. Soc. 47, 790 (1951) and other papers reprinted in C.E. Porter: Statistical theories of spectra: fluctuations. New York: Academic Press, 1991
19. Zinn-Justin, P.: Nucl. Phys. B497, 725 (1997)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 651 – 674 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Intersection Numbers and Rank One Cohomological Field Theories in Genus One
Alexandre Kabanov¹,², Takashi Kimura¹,³
¹ Max Planck Institut für Mathematik, Gottfried Claren Str. 26, 53225 Bonn, Germany
² Department of Mathematics, Michigan State University, Wells Hall, East Lansing, MI 48824-1027, USA. E-mail: [email protected]
³ Department of Mathematics, Boston University, Boston, MA 02215, USA. E-mail: [email protected]
Received: 14 June 1997 / Accepted: 5 November 1997
Abstract: We obtain a simple recursive presentation of the tautological (κ, ψ, and λ) classes on the moduli space of curves in genus 0 and 1 in terms of boundary strata (graphs). We derive differential equations for the generating functions for their intersection numbers which allow us to prove a simple relationship between the genus zero and genus one potentials. As an application, we describe the moduli space of normalized, restricted, rank one cohomological field theories in genus one in coordinates which are additive under taking tensor products. Our results simplify and generalize those of Kaufmann, Manin, and Zagier.

Introduction

Recently, there has been a great deal of interest in the topology of the moduli space of curves. Much of this interest has been due to the important role that these spaces (and their cousins, the moduli spaces of stable maps) play in the theory of Gromov-Witten invariants and quantum cohomology [22, 32, 30], which originated in the physics literature under the name of topological gravity [32]. They furnish nontrivial examples of cohomological field theories (CohFTs) [22, 25], in genus zero (and conjecturally for higher genera). Often, this structure is enough to completely determine the Gromov-Witten invariants themselves. The moduli spaces of curves are endowed with tautological classes, and the generating functions for the associated intersection numbers obey a system of differential equations which often possess remarkable properties [32, 21]. In this paper, we apply a mixture of algebraic geometry and combinatorics to find a simple presentation of these classes in genus 0 and 1 to obtain a generalization of some equations due to Witten and Dijkgraaf [32, 5]. These generating functions parameterize the potentials associated to the space of all normalized, restricted, rank one cohomological field theories in genus one and endow this space with coordinates which are additive with respect to tensor product in the category of CohFTs.
This paper is motivated by the work of Kaufmann, Manin, and Zagier [19].
The moduli space of genus g curves with n marked points, $\mathcal{M}_{g,n}:=\{[C;x_1,x_2,\ldots,x_n]\}$, is the moduli space of configurations of n marked points on a smooth, complex curve (Riemann surface) C of genus g. We assume throughout that the stability condition 2 − 2g − n < 0 is satisfied. This moduli space has a compactification $\overline{\mathcal{M}}_{g,n}$ (due to Deligne-Knudsen-Mumford) which is the moduli space of stable curves of genus g with n marked points, where the boundary divisor $\overline{\mathcal{M}}_{g,n}-\mathcal{M}_{g,n}$ is the locus of degenerate curves. The space $\overline{\mathcal{M}}_{g,n}$ is a stratified, complex orbifold (stack) of complex dimension 3g − 3 + n, where each stratum is indexed by a decorated graph (stable graph) which denotes the type of degeneration that curves in that stratum have. A stable graph represents a cohomology class on $\overline{\mathcal{M}}_{g,n}$ by taking the closure of the corresponding stratum and applying Poincaré duality to the associated (rational) homology class. The space $\overline{\mathcal{M}}_{g,n}$ is endowed with tautological cohomology classes whose study was initiated by Mumford [28]. Let $L_i$ be the line bundle over $\overline{\mathcal{M}}_{g,n}$ whose fiber over a point $[C;x_1,x_2,\ldots,x_n]$ is $T^{*}_{x_i}C$. Then $\psi_{(g,n),i}=c_1(L_i)$, the first Chern class of $L_i$. The classes $\kappa_{(g,n),i}$ in $H^{2i}(\overline{\mathcal{M}}_{g,n},\mathbb{Q})^{S_n}$ are defined by $\kappa_{(g,n),i}:=\pi_{*}(c_1(\omega_{g,n}(D))^{i+1})$, where $\omega_{g,n}$ is the cotangent bundle to the fibers of the universal curve $\pi:\overline{\mathcal{M}}_{g,n+1}\to\overline{\mathcal{M}}_{g,n}$, D is the sum of the images of the canonical sections, and $\pi_{*}$ is fiber integration [1]. Integrals of products of these classes (intersection numbers) are of great geometric interest and are the main object of study in this paper. In particular, $\frac{1}{2\pi^2}\kappa_{(g,n),1}$ is the class of the Weil–Petersson symplectic form [1]. Zograf in [33] obtained a recursion formula for the classical Weil–Petersson volumes in genus zero. The intersection numbers of the κ classes studied in [19] are called higher Weil–Petersson volumes.
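As a small illustration of the stability condition, the dimension count, and what an intersection number is, the sketch below evaluates the genus zero ψ intersection numbers $\langle\tau_{d_1}\cdots\tau_{d_n}\rangle_0=\int\psi_1^{d_1}\cdots\psi_n^{d_n}$ using the standard closed formula $(n-3)!/\prod_i d_i!$, valid when $\sum_i d_i=n-3$. This closed formula is a well-known fact that is not derived in the text above, and the function names are illustrative.

```python
from math import factorial

def is_stable(g, n):
    """Stability condition 2 - 2g - n < 0."""
    return 2 - 2 * g - n < 0

def dimension(g, n):
    """Complex dimension 3g - 3 + n of the compactified moduli space."""
    if not is_stable(g, n):
        raise ValueError("the pair (g, n) violates 2 - 2g - n < 0")
    return 3 * g - 3 + n

def psi_number_genus0(degrees):
    """<tau_{d_1} ... tau_{d_n}>_0, zero unless sum(d_i) equals the dimension n - 3."""
    n = len(degrees)
    if not is_stable(0, n) or sum(degrees) != dimension(0, n):
        return 0
    value = factorial(n - 3)
    for d in degrees:
        value //= factorial(d)  # each division is exact (multinomial coefficient)
    return value

if __name__ == "__main__":
    print(dimension(0, 3), dimension(1, 1), dimension(2, 0))
    print(psi_number_genus0([0, 0, 0]))        # three marked points, no psi classes
    print(psi_number_genus0([0, 0, 0, 1, 1]))  # two psi classes on five marked points
```

The degree condition reflects the fact that a product of classes can only be integrated to a non-zero number when its total degree matches the dimension of the space.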
In the first part of this paper we study a generating function $H(t;s)\in\mathbb{C}[[t,s]]$ which incorporates all intersection numbers of the ψ and κ classes. Here $t=(t_0,t_1,\ldots)$ and $s=(s_1,s_2,\ldots)$ are formal variables. The function $H_g(t;s)$ denotes the summand of H(t; s) corresponding to genus g. This function has the property that H(t; 0) = F(t), the generating function for the ψ intersection numbers defined by Witten in [32, 31]. On the other hand, setting $t_i=0$ for $i\ge 1$ gives a generating function for the intersection numbers of the κ classes. This function is closely related to that considered in [19]. We give a simple recursive presentation of powers of the κ and ψ classes in terms of boundary strata in genus zero and genus one and derive a simple system of differential equations for H in genus zero and one which completely determine those intersection numbers. Taking appropriate limits in genus zero, we obtain equations due to Witten [32] for the ψ classes and equations for the κ classes which are equivalent to those in [19] and much simpler. (A differential equation for the classical Weil–Petersson volumes in genus zero was first obtained in [27].) This simplification arises because our presentation of the κ classes in terms of boundary strata is simpler. Furthermore, the genus one equation can be solved to obtain the relation
$$H_1=\frac{1}{24}\log H_0''', \tag{1}$$
where ′ denotes the partial derivative with respect to $t_0$, for all values of s and t, generalizing the result of Dijkgraaf and Witten saying that $F_1=\frac{1}{24}\log F_0'''$ [5]. We were informed that (1) was known to Zograf [26] in the special case where $t_j=0$ for all $j\ge 1$ and $s_i=0$ for all $i\ge 2$. Witten [32] conjectured (and Kontsevich [21] proved) that F(t) is the logarithm of a tau function of the KdV hierarchy after rescaling the variables. This tau function was completely characterized [5, 17, 24] by being annihilated by a sequence of differential operators $L_n$ for $n\ge -1$ satisfying $[L_m,L_n]=(m-n)L_{m+n}$, the relations of the
Virasoro algebra.¹ The equations corresponding to $L_{-1}$ and $L_0$ were proven by Witten in [32, 31] and are essentially the so-called puncture and dilaton equations. We derive the analog of the puncture and dilaton equations for intersection numbers of κ and ψ classes. These equations are valid for all genera and do not use the presentation of the tautological classes in terms of boundary strata. Solving these equations provides another proof of (1). It is not clear which one of these approaches will prove to be most useful in higher genera.

In the second part of our paper, we apply the previous results to describe the moduli space of normalized, restricted, rank one cohomological field theories (CohFTs) in genus one, generalizing the results of Kaufmann, Manin, and Zagier in genus zero [19]. A (complete) CohFT of rank r [22, 25] is an r dimensional vector space with metric (V, h) together with a collection of linear maps $H_{\bullet}(\overline{\mathcal{M}}_{g,n})\to T^nV$ which are equivariant under the action of the permutation groups and which satisfy some compatibility conditions arising from inclusion of strata on $\overline{\mathcal{M}}_{g,n}$. In the language of Getzler and Kapranov [13], the maps form a morphism of modular operads. Restricting to g = 0, a CohFT (V, h) is equivalent to endowing (V, h) with the structure of a (formal) Frobenius manifold [6, 16, 25]. The most spectacular examples of such theories arise when (V, h) is the cohomology ring of certain smooth, projective varieties with its intersection pairing and the morphisms come from the Gromov-Witten invariants associated to the manifold, thereby endowing the cohomology ring with a deformed cup product giving it the structure of quantum cohomology. In many cases, e.g. $\mathbb{CP}^n$ or Grassmannians, the structure of a CohFT is strong enough to completely determine, recursively, the number of rational curves in the manifold counted with multiplicity (see [22, 10]).
A CohFT in genus zero can be described in terms of a certain generating function (potential) associated to the structure morphisms which must satisfy the WDVV (Witten-Dijkgraaf-Verlinde-Verlinde) equations (see [10, 22]). These equations encode the relations between the boundary strata in $\overline{M}_{0,n}$ due to Keel in genus zero. Recently, Getzler [11] derived equations which are the analogs of WDVV in genus one by proving new relations which play a role in genus one analogous to that of Keel's relations in genus zero. His equations allowed him to predict the elliptic Gromov-Witten invariants of CP2 and CP3 (see also [2]). Kaufmann, Manin, and Zagier proved [19] that the moduli space of normalized, rank one, CohFTs in genus zero has coordinates s such that the tensor product in the category of CohFTs is additive in these coordinates. We prove that the moduli space of normalized, rank one, restricted, CohFTs in genus one has similar coordinates (s, u). Here the variable u arises because in genus one we need to introduce another tautological cohomology class (called λ) due to Mumford. Our proof involves a mixture of techniques from Kaufmann, Manin, and Zagier [19], Getzler [11], and results from the first part of this paper. In Sect. 1, we review the geometry of the moduli space of stable curves, its stratification in terms of stable graphs, tautological cohomology classes, their intersection numbers, and associated generating functions. In Sect. 2, we obtain a simple presentation for these classes in terms of stable graphs in genus 0 and 1 and derive differential equations satisfied by the generating functions associated to their intersection numbers. In Sect. 3, we derive the analogues of the puncture and dilaton equations. In Sect. 4, we write closed form expressions for these intersection numbers. In Sect.
5, we use analytic properties of the generating function to prove an asymptotic formula for the
1 This result is all the more interesting because of the recent conjecture in [7] which predicts a Virasoro algebra playing a similar role in the case where the target manifold is nontrivial.
A. Kabanov, T. Kimura
Weil–Petersson volumes of the moduli space of genus 1 curves as the number of punctures becomes very large. Finally in Sect. 6, we describe the moduli space of rank 1, restricted, CohFTs in genus one.
1. Moduli Space of Curves
Notation. In this paper we always consider cohomology with rational coefficients: $H^\bullet(X)$ stands for $H^\bullet(X; \mathbb{Q})$. We denote the set {1, . . . , n} by [n]. If I is a finite set we denote its cardinality by |I|.
1.1. Basic definitions. Let $M_{g,n}$ be the moduli space of smooth curves of genus g with n marked points, where 2g − 2 + n > 0, i.e. $M_{g,n} = \{ [\Sigma; x_1, x_2, \ldots, x_n] \}$, where Σ is a genus g Riemann surface and $x_1, x_2, \ldots, x_n$ are distinct marked points on Σ. Two such configurations are equivalent if they are related by a biholomorphic map. The moduli space $M_{g,n}$ has a natural compactification due to Deligne, Knudsen, and Mumford, denoted by $\overline{M}_{g,n} = \{ [C; x_1, x_2, \ldots, x_n] \}$, which is the moduli space of stable curves of genus g with n punctures and in which $M_{g,n}$ sits as a dense open subset. The spaces $\overline{M}_{g,n}$ are connected, compact, complex orbifolds (in fact, stacks) of complex dimension 3g − 3 + n. The complement of $M_{g,n}$ in $\overline{M}_{g,n}$ is a divisor with normal crossings and consists of those stable curves which have double points. The moduli space $\overline{M}_{g,n}$ forms the base of a universal family. Let $\pi : C_{g,n} \to \overline{M}_{g,n}$ be the universal curve, which can be identified with $C_{g,n} = \overline{M}_{g,n+1}$, where π is the projection obtained by forgetting the (n + 1)st puncture and then collapsing any resulting unstable irreducible components of the curve to a point. The universal curve $C_{g,n} \to \overline{M}_{g,n}$ is furthermore endowed with canonical sections $\sigma_1, \sigma_2, \ldots, \sigma_n$ such that $\sigma_i$ maps $[C; x_1, x_2, \ldots, x_n] \mapsto [C'; x'_1, x'_2, \ldots, x'_{n+1}]$, where C′ is obtained from C by attaching a three punctured sphere to $x_i$ at one of its punctures to create a double point, then labeling the remaining two punctures on the sphere $x'_i$ and $x'_{n+1}$, and finally setting all other $x'_j = x_j$. The sections $\sigma_i$ are well-defined since $\overline{M}_{0,3}$ is a point. The image of $\sigma_i : \overline{M}_{g,n} \to C_{g,n}$ gives rise to a divisor $D_i$ in $C_{g,n}$ for all i = 1, 2, . . . , n.
1.2.
Natural stratification. In the sequel it will be convenient to consider markings by arbitrary finite sets rather than just by [n]. If I is a finite set we denote by $\overline{M}_{g,I} \cong \overline{M}_{g,|I|}$ the corresponding moduli space. The natural stratification of $\overline{M}_{g,n}$ is best described in terms of graphs, and therefore we start by fixing the notation concerning graphs. We will consider only connected graphs. Each graph Γ can be described in terms of its set of vertices V(Γ), set of edges E(Γ), and set of tails S(Γ). Each edge has two endpoints belonging to V(Γ), which are allowed to be the same. Each tail has only one endpoint. If v ∈ V(Γ), we denote by n(v) the number of half-edges emanating from v, where each edge gives rise to two half-edges, and each tail to one half-edge. The natural stratification of $\overline{M}_{g,n}$ is determined by the type of the degeneration of the curve representing a point in the moduli space, and its strata can be labeled by stable graphs. A stable graph consists of a triple (Γ, g, µ), where Γ is a connected graph as above, $g : V(\Gamma) \to \mathbb{Z}_{\ge 0}$, and µ is a bijection between S(Γ) and a given set I. Moreover, one requires that for each vertex v, the stability condition 2g(v) − 2 + n(v) > 0 be satisfied. If $[C; x_1, x_2, \ldots, x_n]$ is a stable, n-pointed curve one obtains the corresponding stable graph, called its dual graph, by collapsing each irreducible component to a point (vertex),
connecting any two vertices if their corresponding components share a double point, and attaching a tail to a vertex for each marked point on that component. We define the genus g(Γ) of Γ to be $b_1(\Gamma) + \sum_{v \in V(\Gamma)} g(v)$, where $b_1(\Gamma)$ is the first Betti number of Γ. We denote by $G_{g,n}$ the set of the equivalence classes of stable graphs of genus g with n tails labeled by [n]. There is a natural action of the symmetric group $S_n$ on $G_{g,n}$. Associating a stable curve to its dual graph provides an $S_n$-equivariant bijection between the strata of the natural stratification of $\overline{M}_{g,n}$ and the elements of $G_{g,n}$. Let $M_\Gamma$ be (the closure of) the moduli space of stable curves whose dual graph is Γ. It is a closed irreducible subvariety of codimension |E(Γ)| of $\overline{M}_{g(\Gamma),S(\Gamma)}$. Moreover, it is isomorphic to a quotient of the cartesian product $\prod_{v \in V(\Gamma)} \overline{M}_{g(v),n(v)}$
by Aut(Γ), where the automorphisms of a stable graph (Γ, g, µ) are required to preserve g and µ. This quotient morphism can be made canonical if one creates a pair of labels for each edge of Γ and labels the n(v) half-edges emanating from v by the corresponding elements of S(Γ) with the labels corresponding to the edges. Each $M_\Gamma$ determines the fundamental class, in the sense of orbifolds, lying in $H^\bullet(\overline{M}_{g(\Gamma),S(\Gamma)})$. The pull back of this class under the morphism $\pi : \overline{M}_{g,n+1} \to \overline{M}_{g,n}$ is represented by a subvariety of $\overline{M}_{g,n+1}$ corresponding to the |V(Γ)| graphs each of which is obtained by attaching a tail numbered n + 1 to a vertex of Γ. The coefficient of each summand is the ratio of the orders of the automorphism groups of the graphs. It is also easy to push down these fundamental classes. Let $M_{\Gamma'}$ represent an element of $H^\bullet(\overline{M}_{g,n+1})$. The image of this element under $\pi_* : H^{\bullet+2}(C_{g,n}) \to H^\bullet(\overline{M}_{g,n})$, induced by the fiber integration, is zero if after removing the (n + 1)st tail from Γ′ the graph remains stable. In the other case, when the removal of the (n + 1)st tail destabilizes Γ′, the image is obtained by stabilization, i.e., contracting the edge connecting the unstable vertex with the rest of the graph, and then multiplying by $\frac{|\mathrm{Aut}\,\Gamma|}{|\mathrm{Aut}\,\Gamma'|}$.
1.3. Tautological classes. We will now describe three types of tautological cohomology classes (ψ, κ, and λ) associated to the universal curve. Consider the universal curve $C_{g,n} \to \overline{M}_{g,n}$. The cotangent bundle to its fibers (in the orbifold sense) forms the holomorphic line bundle $\omega_{g,n}$. Let $L_{(g,n),i} \to \overline{M}_{g,n}$ be given by the pullback $L_{(g,n),i} = \sigma_i^* \omega_{g,n}$. The tautological classes $\psi_{(g,n),i}$ in $H^2(\overline{M}_{g,n})$ are defined by $\psi_{(g,n),i} := c_1(L_{(g,n),i})$, where $c_1$ denotes the first Chern class.$^2$ The tautological classes $\kappa_{(g,n),i}$ in $H^{2i}(\overline{M}_{g,n})$ for i = 0, 1, . . . , (3g − 3 + n) are defined as follows.
Consider the bundle $\omega_{g,n}(D) \to C_{g,n}$ consisting of $\omega_{g,n}$ twisted by the divisor $D = \sum_{i=1}^{n} D_i$; then $\kappa_{(g,n),i} := \pi_*\bigl(c_1(\omega_{g,n}(D))^{i+1}\bigr)$. In particular, $\kappa_{(g,n),0} = 2g - 2 + n \in H^0(\overline{M}_{g,n})$ is the negative of the Euler characteristic of a smooth curve of genus g with n points removed. We also have the equality $\omega_{g,n}(D) = L_{(g,n+1),n+1}$ [1, 15]. Therefore
$$\kappa_{(g,n),i} = \pi_*\bigl(\psi_{(g,n+1),n+1}^{\,i+1}\bigr). \tag{2}$$
2 The Chern classes are in the sense of orbifolds and are therefore rational.
The tautological λ classes are defined to be $\lambda_{(g,n),i} := c_i(\pi_* \omega_{g,n}) \in H^{2i}(\overline{M}_{g,n})$, where i = 1, . . . , g because $\pi_* \omega_{g,n}$ is an orbifold bundle of rank g. (There are no λ classes in genus 0 and we define $\lambda_{(0,n),i} := 0$.) One can easily see that $\lambda_{(g,n+1),i} = \pi^* \lambda_{(g,n),i}$. Therefore, all of the λ classes are pull backs of the λ classes on $\overline{M}_{1,1}$ and $\overline{M}_{g,0}$, g ≥ 2. They can be expressed in terms of the κ classes, the ψ classes, and the cohomology classes lying at the boundary [28]. In particular,
$$\kappa_{(g,n),1} = 12\lambda_{(g,n),1} - \delta_{(g,n)} + \sum_{i=1}^{n} \psi_{(g,n),i}, \tag{3}$$
where $\delta_{(g,n)}$ is the fundamental class of $\overline{M}_{g,n} - M_{g,n}$ [3, 9, 28]. (This formula was brought to our attention by E. Getzler.) We will drop subscripts associated to the genus and the number of punctures if there is no ambiguity.
Notation. Let $S_k$ be the set of infinite sequences of non-negative integers $m = (m_k, m_{k+1}, m_{k+2}, \ldots)$ such that $m_i = 0$ for all i sufficiently large. We denote by $\delta^a$ the infinite sequence which has only one non-zero entry, 1, at the a-th place. For $m = (m_0, m_1, m_2, \ldots) \in S_0$ and $t = (t_0, t_1, t_2, \ldots)$, a family of independent formal variables, we will use notation of the type
$$|m| := \sum_{i \ge 0} i\, m_i, \qquad \|m\| := \sum_{i \ge 0} m_i, \qquad m! := \prod_{i \ge 0} m_i!, \qquad t^m := \prod_{i \ge 0} t_i^{m_i}.$$
We say that l ≤ m if $l_i \le m_i$ for all i. If l ≤ m we let
$$\binom{m}{l} := \prod_{i \ge 0} \binom{m_i}{l_i}.$$
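The multi-index conventions above can be made concrete with a small sketch. The function names below are our own (hypothetical) choices, and finite tuples stand in for the eventually-zero sequences; the `start` argument is an assumption used to handle sequences in $S_1$, whose first entry has index 1.

```python
from math import comb, factorial, prod


def weight(m, start=0):
    """|m| = sum of i * m_i, where the first entry of m has index `start`."""
    return sum(i * mi for i, mi in enumerate(m, start))


def norm(m):
    """||m|| = sum of the entries m_i."""
    return sum(m)


def multi_factorial(m):
    """m! = product of the factorials m_i!."""
    return prod(factorial(mi) for mi in m)


def monomial(t, m):
    """t^m = product of t_i ** m_i (t needs at least len(m) entries)."""
    return prod(ti ** mi for ti, mi in zip(t, m))


def multi_binomial(m, l):
    """Binomial of m over l = product of C(m_i, l_i); requires l <= m entrywise."""
    if len(l) != len(m) or any(li > mi for li, mi in zip(l, m)):
        raise ValueError("need l <= m entrywise, with equal lengths")
    return prod(comb(mi, li) for mi, li in zip(m, l))
```

For example, with m = (2, 0, 1) in $S_0$ one has |m| = 2, ‖m‖ = 3 and m! = 2, while for p = (1, 2) in $S_1$ (that is, $p_1 = 1$, $p_2 = 2$) calling `weight(p, start=1)` gives |p| = 5.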
We will use the same notation when $m \in S_1$.
1.4. Generating functions. Witten [31] defined a generating function which incorporates all of the information about the integrals of products of the ψ classes. In order to describe this function we need to introduce the following notation. Let
$$\langle \tau_{d_1} \tau_{d_2} \cdots \tau_{d_n} \rangle := \int_{\overline{M}_{g,n}} \psi_1^{d_1} \psi_2^{d_2} \cdots \psi_n^{d_n},$$
where g is determined by the equation $3g - 3 + n = d_1 + d_2 + \cdots + d_n$. If there exists no such g, then the left hand side is by definition zero. In case we want to mention the genus explicitly we will write $\langle \tau_{d_1} \tau_{d_2} \cdots \tau_{d_n} \rangle_g$. Note that this expression is symmetric with respect to $d_1, d_2, \ldots, d_n$ since the $\psi_i$'s are interchanged under the action of the symmetric group $S_n$. Therefore one can write it as $\langle \tau_0^{m_0} \tau_1^{m_1} \tau_2^{m_2} \cdots \rangle$, where the set $\{d_1, \ldots, d_n\}$ contains $m_0$ zeros, $m_1$ ones, etc. The generating function is defined by
$$F(t_0, t_1, t_2, \ldots) := \Bigl\langle \exp \sum_{j=0}^{\infty} t_j \tau_j \Bigr\rangle = \sum_{\{m_i\} \in S_0} \langle \tau_0^{m_0} \tau_1^{m_1} \tau_2^{m_2} \cdots \rangle \prod_{i=0}^{\infty} \frac{t_i^{m_i}}{m_i!}.$$
We will also use the notation
$$\sum_{m \in S_0} \langle \tau^m \rangle \frac{t^m}{m!}$$
for the last expression. Note that one can also write $F(t) = \sum_{g=0}^{\infty} F_g(t)$, where $F_g(t) := \sum_{m \in S_0} \langle \tau^m \rangle_g \frac{t^m}{m!}$.
Witten conjectured in [32] and Kontsevich proved in [21] that F is the logarithm of a τ-function in the KdV-hierarchy. In [19] Kaufmann, Manin, and Zagier considered a similar generating function for κ classes. If one defines
$$\langle \kappa^p \rangle_g = \langle \kappa_1^{p_1} \kappa_2^{p_2} \cdots \rangle_g := \int_{\overline{M}_{g,n}} \kappa_1^{p_1} \kappa_2^{p_2} \cdots,$$
then their generating function is
$$K_g(x; s) = K_g(x; s_1, s_2, \ldots) := \sum_{p \in S_1} \langle \kappa^p \rangle_g \frac{x^{|p|}}{|p|!} \frac{s^p}{p!}.$$
Note that here it is important to indicate the genus. The number of punctures n is then determined from 3g − 3 + n = |p|. We introduce the generating function H which incorporates both the ψ and κ classes. We shall see that F and K enjoy similar properties which arise because H obeys those properties. First we introduce the following notation. Let $m \in S_0$ and $p \in S_1$. Define
$$\langle \tau^m \kappa^p \rangle := \int_{\overline{M}_{g,n}} \psi_1^{d_1} \psi_2^{d_2} \cdots \psi_n^{d_n}\, \kappa_1^{p_1} \kappa_2^{p_2} \cdots,$$
where the set $\{d_1, \ldots, d_n\}$ contains $m_0$ zeros, $m_1$ ones, etc., and (g, n) is determined by the equations $n = \|m\|$, $3g - 3 + n = |m| + |p|$. If no such g exists we set the expression above to zero. As before, we write $\langle \tau^m \kappa^p \rangle_g$ when we want to fix g.
Definition 1.1. Let $t = (t_0, t_1, \ldots)$ and $s = (s_1, s_2, \ldots)$ be independent families of independent formal variables. We define
$$H(t; s) := \sum_{m \in S_0,\ p \in S_1} \langle \tau^m \kappa^p \rangle \frac{t^m}{m!} \frac{s^p}{p!}.$$
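As a rough illustration of how the pair (g, n) is read off from the multi-indices, the following sketch solves $n = \|m\|$, $3g - 3 + n = |m| + |p|$ for g. The function name is hypothetical, finite tuples stand in for the sequences, and the explicit stability check 2g − 2 + n > 0 is our own addition based on the definition of the moduli spaces in Sect. 1.1.

```python
def genus_and_points(m, p=()):
    """Return (g, n) for <tau^m kappa^p>, or None if no admissible genus exists.

    m = (m_0, m_1, ...) counts the psi exponents equal to 0, 1, ...;
    p = (p_1, p_2, ...) counts the classes kappa_1, kappa_2, ...
    """
    n = sum(m)                                            # n = ||m||
    weight_m = sum(i * mi for i, mi in enumerate(m))      # |m|, indices start at 0
    weight_p = sum(i * pi for i, pi in enumerate(p, 1))   # |p|, indices start at 1
    three_g = weight_m + weight_p - n + 3                 # from 3g - 3 + n = |m| + |p|
    if three_g < 0 or three_g % 3 != 0:
        return None
    g = three_g // 3
    if 2 * g - 2 + n <= 0:                                # assumed stability check
        return None
    return g, n


def charge(m, p, g):
    """3(1 - g) + sum (i - 1) m_i + sum i p_i, which vanishes when g is the admissible genus."""
    return (3 * (1 - g)
            + sum((i - 1) * mi for i, mi in enumerate(m))
            + sum(i * pi for i, pi in enumerate(p, 1)))
```

For instance, `genus_and_points((3,))` gives (0, 3), matching $\langle \tau_0^3 \rangle_0$, and `genus_and_points((0, 1))` gives (1, 1), matching $\langle \tau_1 \rangle_1$; `genus_and_points((2,))` returns None because no genus satisfies the dimension constraint.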
One can split H into the sum of $H_g$, g = 0, 1, . . . . Each $H_g$ lies in the kernel of a certain scaling differential operator, i.e., it satisfies the charge conservation equation. Multiplying each term of H by $\|m\|$ is equivalent to applying the operator $E := \sum_i t_i \partial_i$, and multiplying by |m| is equivalent to applying $\sum_i i\, t_i \partial_i$. Therefore one has
$$\Bigl[\, 3(1 - g) + \sum_{i=0}^{\infty} (i - 1)\, t_i \partial_i + \sum_{i=1}^{\infty} i\, s_i d_i \Bigr] H_g = 0, \tag{4}$$
where $\partial_i := \partial/\partial t_i$ and $d_i := \partial/\partial s_i$. Clearly H(t; 0) = F(t). In order to relate H to K one fixes a genus g, sets $t_1 = t_2 = \cdots = 0$, and $t_0 = x$. The infinite sequence m reduces to $m_0 = n = |p| + 3 - 3g$. It follows that
$$H_g(x, 0; s) = \sum_{p \in S_1} \langle \kappa^p \rangle_g \frac{x^{|p|+3-3g}}{(|p| + 3 - 3g)!} \frac{s^p}{p!}.$$
In this paper we are primarily interested in genus 0 and 1. Then $K_0(x; s) = H_0'''(x, 0; s)$ and $K_1(x; s) = H_1(x, 0; s)$, where the prime denotes the partial derivative with respect to x.
2. Presentation of the Tautological Classes via Graphs
In this section we will mainly focus on $H_0$ and $H_1$, the generating functions for the intersection numbers of the κ and ψ classes in genus 0 and genus 1. First we show that $H_0$ satisfies a system of nonlinear differential equations. These equations, when restricted to the ψ classes, were first obtained by Witten [31, 32]. When restricted to the κ classes in genus zero, equivalent but much more complicated equations were obtained by Kaufmann, Manin, and Zagier [19], the difference being accounted for by our simple recursive presentation of powers of κ and ψ classes in genus zero and one in terms of boundary divisors. We prove that $H_0$ satisfies differential equations of the same form as those obtained by Witten for just the ψ classes. We then obtain a system of differential equations relating $H_0$ and $H_1$, and the explicit formula (1). Here we present a geometric approach using the explicit presentation of the ψ and κ classes in terms of the boundary strata.
2.1. Explicit presentation in genus 0, 1. In genus 0 and genus 1 we will be able to express inductively all powers of the ψ classes, and therefore the κ classes, in terms of the boundary strata. Because the ψ classes are interchanged under the action of the symmetric group it is enough to compute $\psi_{(g,n),1}$, g = 0, 1, and all its powers. In the calculations below we will use the properties stated in Sect. 1 regarding the pull backs and push forwards of the cohomology classes represented by graphs. Let us introduce the following notation for the rest of this section. On graphs we denote by • vertices of genus 0, and by ◦ vertices of genus 1. We always assume that the ψ classes are associated to the marked point labeled 1, and subsequently omit it from the notation.
We denote $\psi_{(1,n),1}$ by $\psi_{(n)}$, and we denote $\psi_{(0,n),1}$ by $\phi_{(n)}$. Similarly, we denote $\kappa_{(1,n),a}$ by $\kappa_{(n),a}$, and $\kappa_{(0,n),a}$ by $\omega_{(n),a}$. We also adopt the following convention. Let Γ be a stable graph. According to Sect. 1, Γ determines a canonical finite quotient map from a product of moduli spaces to $M_\Gamma$ provided certain choices have been made. We denote by $\rho_\Gamma$ the composition
$$\rho_\Gamma : \prod_{v \in V(\Gamma)} \overline{M}_{g(v),n(v)} \longrightarrow M_\Gamma \longrightarrow \overline{M}_{g(\Gamma),S(\Gamma)}, \tag{5}$$
where the first arrow is the quotient morphism, and the second arrow is the inclusion. Let $\gamma_v \in H^\bullet(\overline{M}_{g(v),n(v)})$. We denote $\frac{1}{|\mathrm{Aut}(\Gamma)|}\, \rho_{\Gamma *}(\otimes_v \gamma_v)$ by the picture of Γ where each vertex v is in addition labeled by the cohomology class $\gamma_v$. We may omit the label of v if $\gamma_v$ is the fundamental class of $\overline{M}_{g(v),n(v)}$. In particular, the fundamental class of $M_\Gamma$ (in the orbifold sense) is represented by Γ with all additional labels omitted.
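In the pictures the dashed line with two arrows indicates the (sub)set of the tails emanating from a particular vertex. If φ or ω labels a vertex of a graph we will omit the subscript from the notation because it is determined by the graph. (We also assume that φ is associated to the marked point labeled by 1.) The power $\phi^0$ represents the fundamental class.
Proposition 2.1. If n ≥ 4, a ≥ 1, then the following holds in $H^\bullet(\overline{M}_{0,n})$:
$$\phi_{(n)}^{a} = \sum_{\substack{I \sqcup J = [n-2] \\ 1 \in J}} \bigl[\text{graph: two genus 0 vertices joined by an edge; the left vertex carries the tails } n-1,\ n \text{ and } I;\ \text{the right vertex carries the tails } J \text{ (including 1) and the label } \phi^{a-1}\bigr]. \tag{6}$$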
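Remark. The class $\phi^a_{(n)}$ is invariant under the subgroup $S_{n-1} \subset S_n$ whose elements fix 1. Therefore instead of n − 1, n we can choose any two labels a, b, 2 ≤ a < b ≤ n to be distinguished.
Proof. The statement is true when n = 4 and a = 1 because $\int_{\overline{M}_{0,4}} \psi_1 = 1$. We shall first prove by induction that the statement is true for all n ≥ 4 and a = 1. Let us consider the projection $\pi : \overline{M}_{0,n+1} \to \overline{M}_{0,n}$ which “forgets” the (n + 1)st marked point. By the induction hypothesis we assume that the statement is true for some n and a = 1. Recall that $\phi_{(n+1)} = \pi^* \phi_{(n)} + D_1$, where $D_1$ is the image of the section $\sigma_1 : \overline{M}_{0,n} \to \overline{M}_{0,n+1}$ [32]. Therefore applying $\pi^*$ one gets:
$$\phi_{(n+1)} - \bigl[\text{graph of } D_1:\ \text{the tails 1 and } n+1 \text{ on one vertex, all remaining tails on the other}\bigr] = \sum_{\substack{I \sqcup J = [n-2] \\ 1 \in J}} \Bigl( \bigl[\text{graph: tails } n-1,\ n,\ n+1 \text{ and } I \text{ on the left vertex; tails } J \text{ on the right vertex}\bigr] + \bigl[\text{graph: tails } n-1,\ n \text{ and } I \text{ on the left vertex; tails } n+1 \text{ and } J \text{ on the right vertex}\bigr] \Bigr).$$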
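Moving $D_1$ to the right hand side and relabeling the tails marked n − 1, n by n, n + 1 respectively one gets the statement in case of n + 1, a = 1. In order to prove the proposition for a ≥ 2 we write $\phi^a_{(n)}$ as $\phi_{(n)} \phi^{a-1}_{(n)}$, and restrict $\phi^{a-1}_{(n)}$ to the boundary stratum representing $\phi_{(n)}$.
Let $\pi_1$ be the morphism $\overline{M}_{0,n+1} \to \overline{M}_{0,n}$ forgetting the first marked point. Applying it to (6) with a + 1, using (2), and renumbering the labels {2, . . . , n + 1} by the elements of [n] we get the following
Corollary 2.2. If n ≥ 4, a ≥ 1, then the following holds in $H^\bullet(\overline{M}_{0,n})$:
$$\omega_{(n),a} = \sum_{I \sqcup J = [n-2]} \bigl[\text{graph: two genus 0 vertices joined by an edge; the left vertex carries the tails } n-1,\ n \text{ and } I;\ \text{the right vertex carries the tails } J \text{ and the label } \omega_{a-1}\bigr]. \tag{7}$$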
In (7) if a = 1 one should use that $\omega_{(J \sqcup *),0}$ associated to the right vertex is equal to |J| − 1 times the fundamental class (cf. Sect. 1). This agrees with $\pi_{1*}$ of (6) when a = 1. Now we establish a relation between the ψ and κ classes in genus 0 and 1. The proof is virtually identical to that of Prop. 2.1, and we will not reproduce it. It is shown in [4, VI.4] that
$$\psi_{(1)} = \frac{1}{12}\, \bigl[\text{graph: a single genus 0 vertex with one loop and the tail 1}\bigr].$$
Note that the coefficient is $\frac{1}{12}$ rather than $\frac{1}{24}$ due to the non-trivial automorphism of the graph.
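Proposition 2.3. If n ≥ 1, a ≥ 1, then the following holds in $H^\bullet(\overline{M}_{1,n})$:
$$\psi_{(n)}^{a} = \frac{1}{12}\, \bigl[\text{graph: a single genus 0 vertex with one loop, carrying the tails } [n] \text{ and the label } \phi^{a-1}\bigr] + \sum_{\substack{I \sqcup J = [n] \\ 1 \in J}} \bigl[\text{graph: a genus 1 vertex with tails } I \text{ joined by an edge to a genus 0 vertex with tails } J \text{ and the label } \phi^{a-1}\bigr]. \tag{8}$$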
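Pushing down the above along $\pi_1 : \overline{M}_{1,n+1} \to \overline{M}_{1,n}$, and renumbering the labels one gets
Corollary 2.4. If n ≥ 1, a ≥ 1, then the following holds in $H^\bullet(\overline{M}_{1,n})$:
$$\kappa_{(n),a} = \frac{1}{12}\, \bigl[\text{graph: a single genus 0 vertex with one loop, carrying the tails } [n] \text{ and the label } \omega_{a-1}\bigr] + \sum_{I \sqcup J = [n]} \bigl[\text{graph: a genus 1 vertex with tails } I \text{ joined by an edge to a genus 0 vertex with tails } J \text{ and the label } \omega_{a-1}\bigr]. \tag{9}$$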
2.2. Recursion relations and differential equations. Now we derive the corresponding recursion relations and differential equations for the intersection numbers of the products of the ψ and κ classes using the explicit graph presentations above. In order to obtain the recursion relations we use a method from [19] to integrate the product of the ψ and κ classes over the Poincaré dual of a chosen ψ or κ class. In order to do this we need to know how the tautological classes restrict to the strata of the natural stratification. The restriction of a ψ class to a boundary stratum is obvious. In order to restrict products of the κ classes we use Lemma 1.3 from [19] where the authors show the following restriction property for the κ classes. (They show it in the case of genus 0, but their proof is in fact valid for all genera.) Let (Γ, g, µ) be a stable graph, $\rho_\Gamma$ the corresponding morphism defined by (5), and $\kappa^p$ a product of the κ classes on $\overline{M}_{g(\Gamma),S(\Gamma)}$. Then
$$\int_{\prod_{v \in V(\Gamma)} \overline{M}_{g(v),n(v)}} \rho_\Gamma^*(\kappa^p) = p! \sum_{\substack{(p_v)_{v \in V(\Gamma)} \\ \sum_{v \in V(\Gamma)} p_v = p}}\ \prod_{v \in V(\Gamma)} \frac{\langle \kappa^{p_v} \rangle_{g(v)}}{p_v!}.$$
The argument uses a fact proved in [1] that the collection $\kappa_{(g,n),a}$ for each fixed a forms a logarithmic cohomological field theory (cf. Sect. 6), i.e., the κ classes satisfy the relation
$$\rho_\Gamma^*(\kappa_a) = \sum_{v \in V(\Gamma)} \kappa_{g(v),n(v)}. \tag{10}$$
We start with genus 0. Recall that H(t; s) is the generating function incorporating the intersection numbers for the products of the ψ and κ classes defined in Sect. 1, that $\partial_a$ and $d_a$ are the partial derivatives with respect to $t_a$, a ≥ 0, and $s_a$, a ≥ 1, respectively, and that $E = \sum_{i=0}^{\infty} t_i \partial_i$.
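Theorem 2.5. For each $m \in S_0$, $p \in S_1$, k, l ≥ 0, and a ≥ 1 one has
$$\langle \tau^{m+\delta^k+\delta^l+\delta^a} \kappa^p \rangle_0 = \sum_{\substack{m'+m''=m \\ p'+p''=p}} \binom{m}{m'} \binom{p}{p'} \langle \tau^{m'+\delta^k+\delta^l+\delta^0} \kappa^{p'} \rangle_0\, \langle \tau^{m''+\delta^{a-1}+\delta^0} \kappa^{p''} \rangle_0,$$
$$\langle \tau^{m+\delta^k+\delta^l} \kappa^{p+\delta^a} \rangle_0 = \sum_{\substack{m'+m''=m \\ p'+p''=p}} \binom{m}{m'} \binom{p}{p'} \langle \tau^{m'+\delta^k+\delta^l+\delta^0} \kappa^{p'} \rangle_0\, \langle \tau^{m''+\delta^0} \kappa^{p''+\delta^{a-1}} \rangle_0.$$
Equivalently, for each k, l ≥ 0 the function $H_0(t; s)$ satisfies
$$\partial_a \partial_k \partial_l H_0 = (\partial_k \partial_l \partial_0 H_0)(\partial_{a-1} \partial_0 H_0) \quad \text{when } a \ge 1,$$
$$d_1 \partial_k \partial_l H_0 = (\partial_k \partial_l \partial_0 H_0)\bigl((E - 1) \partial_0 H_0\bigr),$$
$$d_a \partial_k \partial_l H_0 = (\partial_k \partial_l \partial_0 H_0)(d_{a-1} \partial_0 H_0) \quad \text{when } a \ge 2.$$
This system together with $H_0(t_0, 0; 0) = \frac{t_0^3}{6}$ uniquely determines $H_0$.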
Proof. The recursion relations are a direct consequence of (6), (7), and the restriction properties of the ψ and κ classes described above. In order to derive the differential equations from the recursion relations one notices that the increment of $m_a$ or $p_a$ by one in a recursion relation corresponds to taking the partial derivative with respect to $t_a$ or $s_a$. The operator E appears because the second recursion relation when a = 1 produces $\omega_0$. The corresponding moduli space is $\overline{M}_{0,J \sqcup *}$. As $|J \sqcup *| = \|m''\| + 1$, it follows that $\omega_0 = |J| - 1 = \|m''\| - 1$, and we use that the multiplication by $m_i$ can be expressed by $t_i \partial_i$.
Remark. Setting s = 0 in the first equation one gets differential equations satisfied by $F_0$ (cf. [32]).
Remark. Setting k = l = 0, $t_0 = x$, and $t_1 = t_2 = \ldots = 0$ in the second equation one gets differential equations satisfied by $H_0(x, 0; s)$, whose third derivative with respect to x is $K_0(x; s)$. These equations are a simple consequence of the results in [19, Sect. 1].
Now we turn to genus 1. We use the explicit presentations (8) and (9) and take into account the automorphism groups of the graphs to obtain the following
Theorem 2.6. For each $m \in S_0$, $p \in S_1$, and a ≥ 1 one has
$$\langle \tau^{m+\delta^a} \kappa^p \rangle_1 = \frac{1}{24} \langle \tau^{m+2\delta^0+\delta^{a-1}} \kappa^p \rangle_0 + \sum_{\substack{m'+m''=m \\ p'+p''=p}} \binom{m}{m'} \binom{p}{p'} \langle \tau^{m'+\delta^0} \kappa^{p'} \rangle_1\, \langle \tau^{m''+\delta^0+\delta^{a-1}} \kappa^{p''} \rangle_0,$$
$$\langle \tau^{m} \kappa^{p+\delta^a} \rangle_1 = \frac{1}{24} \langle \tau^{m+2\delta^0} \kappa^{p+\delta^{a-1}} \rangle_0 + \sum_{\substack{m'+m''=m \\ p'+p''=p}} \binom{m}{m'} \binom{p}{p'} \langle \tau^{m'+\delta^0} \kappa^{p'} \rangle_1\, \langle \tau^{m''+\delta^0} \kappa^{p''+\delta^{a-1}} \rangle_0.$$
Equivalently, the functions $H_1(t; s)$ and $H_0(t; s)$ satisfy
$$\partial_a H_1 = \frac{1}{24}\, \partial_{a-1} \partial_0 \partial_0 H_0 + (\partial_0 H_1)(\partial_{a-1} \partial_0 H_0) \quad \text{when } a \ge 1,$$
$$d_1 H_1 = \frac{1}{24}\, E \partial_0 \partial_0 H_0 + (\partial_0 H_1)\bigl((E - 1) \partial_0 H_0\bigr),$$
$$d_a H_1 = \frac{1}{24}\, d_{a-1} \partial_0 \partial_0 H_0 + (\partial_0 H_1)(d_{a-1} \partial_0 H_0) \quad \text{when } a \ge 2.$$
This system together with the system and the initial conditions from Thm. 2.5 uniquely determines the pair $H_0$, $H_1$.
Remark. The first recursion relation when p = 0 was obtained by Witten in [32].
The system of differential equations above can be solved explicitly for $H_1$ in terms of $H_0$ to derive (1).
Corollary 2.7. The functions $H_1$ and $H_0$ are related by
$$H_1 = \frac{1}{24} \log \partial_0^3 H_0.$$
Proof. Because of uniqueness it is enough to check that $\frac{1}{24} \log \partial_0^3 H_0$ satisfies the differential equation in Thm. 2.6. This is a straightforward calculation which makes use of the differential equations from Thm. 2.5.
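As an illustration of this calculation, the following sketch carries out the check for the first equation of Thm. 2.6 only (the equations in the s-variables are handled in the same way). The shorthand $A := \partial_0^3 H_0$ is ours and is not used elsewhere in the paper.

```latex
% Check of the t-equation (a >= 1) for the candidate H_1 = (1/24) log A, with A := \partial_0^3 H_0.
\begin{align*}
\partial_a \partial_0^2 H_0 &= A\,(\partial_{a-1}\partial_0 H_0)
  && \text{(Thm. 2.5 with } k = l = 0\text{)} \\
\partial_a A &= (\partial_0 A)(\partial_{a-1}\partial_0 H_0) + A\,(\partial_{a-1}\partial_0^2 H_0)
  && \text{(apply } \partial_0 \text{ to both sides)} \\
\partial_a\Bigl(\tfrac{1}{24}\log A\Bigr) &= \tfrac{1}{24}\,\partial_{a-1}\partial_0\partial_0 H_0
  + \Bigl(\partial_0\,\tfrac{1}{24}\log A\Bigr)(\partial_{a-1}\partial_0 H_0)
  && \text{(divide by } 24A\text{)}
\end{align*}
```

The last line is the first equation of Thm. 2.6 with $H_1$ replaced by $\frac{1}{24}\log A$.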
Remark. Setting s = 0 we recover a result from [5, Sect. 2.2]. Setting $t_0 = x$, $t_1 = t_2 = \ldots = 0$ we get $K_1 = \frac{1}{24} \log K_0$.
3. Puncture and Dilaton Equations
In this section we introduce an approach which does not use explicit presentations of ψ and κ classes in terms of graphs. Instead we introduce the analogues of the puncture and dilaton equations. These equations generalize the classical puncture and dilaton equations obtained by Witten [32, 31]. (We shall explain these equations below.) This will allow us to write differential equations for H, and then, using these differential equations, prove that $H_0$ and $H_1$ satisfy (1).
3.1. Recursion relations. In Sect. 1 we introduced the notation incorporating the intersection numbers of both the ψ and κ classes. Now we shall prove certain recursion relations for these numbers.
Lemma 3.1. The following recursion relations are satisfied:
$$\langle \tau^{m+\delta^0} \kappa^p \rangle = \sum_{i=1}^{\infty} m_i \langle \tau^{m+\delta^{i-1}-\delta^i} \kappa^p \rangle + \sum_{\substack{j=0 \\ |j|>0}}^{p} \binom{p}{j} \langle \tau^m \kappa^{p-j+\delta^{|j|-1}} \rangle, \tag{11}$$
and for each a ≥ 1,
$$\langle \tau^{m+\delta^a} \kappa^p \rangle = \sum_{j=0}^{p} \binom{p}{j} \langle \tau^m \kappa^{p-j+\delta^{|j|+a-1}} \rangle. \tag{12}$$
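Proof. We continue to use the notation from the previous section, i.e., π is the universal curve over $\overline{M}_{g,n}$, $(\hat\psi_i, \hat\kappa_i)$ and $(\psi_i, \kappa_i)$ are the classes upstairs and downstairs respectively, $\sigma_i$ is the ith canonical section of π, and $D_i$ is its image. It was shown in [31] that $\hat\psi_i^a = \pi^* \psi_i^a + \pi^* \psi_i^{a-1} D_i$ and in [1] that $\hat\kappa_i = \pi^* \kappa_i + \hat\psi_{n+1}^i$. Note also that $\hat\psi_i D_i = 0$, $\hat\psi_{n+1} D_i = 0$ for i = 1, . . . , n, and $D_i D_j = 0$ when i ≠ j. Using this one derives that
$$\begin{aligned}
\pi_*\bigl(\hat\psi_1^{d_1} \cdots \hat\psi_n^{d_n}\, \hat\kappa_1^{p_1} \hat\kappa_2^{p_2} \cdots\bigr)
&= \pi_*\Bigl[ (\pi^*\psi_1^{d_1} + \pi^*\psi_1^{d_1-1} D_1) \cdots (\pi^*\psi_n^{d_n} + \pi^*\psi_n^{d_n-1} D_n) \\
&\qquad\quad \times (\pi^*\kappa_1 + \hat\psi_{n+1})^{p_1} (\pi^*\kappa_2 + \hat\psi_{n+1}^2)^{p_2} \cdots \Bigr] \\
&= \sum_{i:\, d_i \ne 0} \psi_1^{d_1} \cdots \psi_i^{d_i-1} \cdots \psi_n^{d_n}\, \kappa_1^{p_1} \kappa_2^{p_2} \cdots \\
&\quad + \sum_{\substack{j=0 \\ |j|>0}}^{p} \binom{p}{j}\, \psi_1^{d_1} \cdots \psi_n^{d_n}\, \kappa_{|j|-1}\, \kappa_1^{p_1-j_1} \kappa_2^{p_2-j_2} \cdots .
\end{aligned}$$
Similarly one can show that
$$\pi_*\bigl(\hat\psi_1^{d_1} \cdots \hat\psi_n^{d_n}\, \hat\psi_{n+1}^{a}\, \hat\kappa_1^{p_1} \hat\kappa_2^{p_2} \cdots\bigr) = \sum_{j=0}^{p} \binom{p}{j}\, \psi_1^{d_1} \cdots \psi_n^{d_n}\, \kappa_{|j|+a-1}\, \kappa_1^{p_1-j_1} \kappa_2^{p_2-j_2} \cdots .$$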
Recall that $\kappa_0 = 2g - 2 + n$. One can further integrate the push forward formulas above to obtain the statement of the lemma.
Remark. Recursion relations (11) and (12) do not mix intersection numbers in different genera.
Remark. If p = 0, then the second sum in the first relation vanishes, and we obtain the classical puncture equation. If p = 0 and a = 1 in the second relation, then we get the classical dilaton equation. Note that both classical equations involve only ψ classes.
Remark. It is clear that recursion relations (11) and (12) allow one to eliminate τ from the intersection number, i.e., to express all mixed intersection numbers through the intersection numbers on $\overline{M}_{0,3}$, $\overline{M}_{1,1}$, and the intersection numbers of the κ classes on $\overline{M}_{g,0}$, g ≥ 2. In [19, Cor. 2.3] the authors obtained an explicit expression for the intersection numbers of the κ classes through the intersection numbers of the ψ classes. This can be related to (12).
3.2. Differential operators. Now we derive differential equations for H using recursions (11) and (12). Recall that $\partial_i$, $d_i$ denote the partial derivatives with respect to $t_i$, $s_i$.
Theorem 3.2. The function exp(H(t; s)) is annihilated by the following differential operators:
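$$-\partial_0 + \sum_{j:\, |j| \ge 2} \frac{s^j}{j!}\, d_{|j|-1} + \sum_{i=1}^{\infty} t_i \partial_{i-1} + s_1 \Bigl( \sum_{i=0}^{\infty} \frac{2i+1}{3}\, t_i \partial_i + \sum_{i=1}^{\infty} \frac{2}{3}\, i\, s_i d_i \Bigr) + \frac{1}{2}\, t_0^2\, \delta_{g,0} + \frac{1}{24}\, s_1 \delta_{g,1},$$
$$-\partial_1 + \sum_{j:\, |j| \ge 1} \frac{s^j}{j!}\, d_{|j|} + \Bigl( \sum_{i=0}^{\infty} \frac{2i+1}{3}\, t_i \partial_i + \sum_{i=1}^{\infty} \frac{2}{3}\, i\, s_i d_i \Bigr) + \frac{1}{24}\, \delta_{g,1},$$
$$-\partial_a + \sum_{j} \frac{s^j}{j!}\, d_{|j|+a-1} \quad \text{when } a \ge 2.$$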
Remark. The differential operators above do not mix genus, and therefore they annihilate each $\exp(H_g(t; s))$ separately.
Remark. The function exp(F(t)) is annihilated by differential operators $L_i$, i ≥ −1, which, after a rescaling of variables, satisfy the Virasoro relations [5]. The first two differential operators in the statement of the theorem are analogues of $L_{-1}$ and $L_0$, which encode the puncture and dilaton equations, respectively.
Proof. The differential operators above are the direct translation of the recursion relations (11) and (12). The addition of $\delta^i$ to m or p translates into taking the corresponding partial derivative. The subtraction of $\delta^i$ from m, together with the multiplication of the term by $m_i$, translates into the multiplication by $t_i$. One should also change the summation index to obtain the second summand of each differential operator. The terms in parentheses in the first two equations come from the value of $\kappa_0 = 2g - 2 + n$. We use (4) in order to express this number in terms of differential operators. Finally, the last terms in the first two equations correspond to the initial conditions $\langle \tau_0^3 \rangle_0 = 1$, $\langle \kappa_1 \rangle_1 = \frac{1}{24}$, and $\langle \tau_1 \rangle_1 = \frac{1}{24}$.
The theorem above leads to another proof of Cor. 2.7. First one notes that $H_0$ and $H_1$ are uniquely determined by the differential operators above. Therefore it suffices to check that $\frac{1}{24} \log \partial_0^3 H_0$ satisfies the genus 1 equations. This is a direct calculation.
4. Explicit Expressions
In this section we write the closed form expressions for the intersection numbers in genus 1 as sums of multinomial coefficients.
Notation. If $b = (b_1, \ldots, b_k)$ is a vector with integer entries we denote by [b] the multinomial coefficient $\frac{(b_1 + \cdots + b_k)!}{b_1! \cdots b_k!}$, and we set it to zero if at least one entry is negative. Recall also that $\|b\|$ denotes the sum $b_1 + \ldots + b_k$. In genus 0 the intersection numbers of the ψ classes are very simple: $\langle \tau_{b_1} \cdots \tau_{b_k} \rangle_0 = [b]$.
In order to state our result in genus 1 we define for each k ≥ 1 a function $f_k : \mathbb{Z}^k_{\ge 1} \to \mathbb{Q}$ by
$$f_k(b) = f_k(b_1, \ldots, b_k) := \langle \tau_0^{\|b\|-k}\, \tau_{b_1} \cdots \tau_{b_k} \rangle_1.$$
Clearly each $f_k$ is invariant under the permutations of its arguments.
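Proposition 4.1. For each k ≥ 1,
$$f_k(b) = \frac{1}{24}\, [b] - \frac{1}{24} \sum_{\substack{\varepsilon \in \{0,1\}^k \\ \|\varepsilon\| \ge 2}} (\|\varepsilon\| - 2)!\, [b - \varepsilon]. \tag{13}$$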
Proof. The intersection numbers of the ψ classes in genus 1 are determined by the classical puncture equation and the classical dilaton equation. (See the second remark after Lemma 3.1.) Reformulated in terms of the collection $\{f_k\}$ these equations say that this collection is uniquely determined by the following properties:
• $f_1(b_1) \equiv \frac{1}{24}$,
• $f_k$ is invariant under the permutation of the arguments,
• $f_k(b) = \sum_{i=1}^{k} f_k(b - \delta^i)$ when $b_i \ge 2$ for all i,
• $f_k(b_1, \ldots, b_{k-1}, 1) = (b_1 + \cdots + b_{k-1})\, f_{k-1}(b_1, \ldots, b_{k-1})$.
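The four properties above lend themselves to a quick numerical cross-check of (13). The sketch below is only an illustration: the function names are ours, exact rational arithmetic is used via `fractions.Fraction`, and the recursion simply applies the listed properties (sorting the arguments is one way to exploit the symmetry).

```python
from fractions import Fraction
from itertools import product
from math import factorial


def multinomial(b):
    """[b] = (b_1 + ... + b_k)! / (b_1! ... b_k!), set to zero if an entry is negative."""
    if any(x < 0 for x in b):
        return 0
    result = factorial(sum(b))
    for x in b:
        result //= factorial(x)  # each partial quotient is an integer
    return result


def f_closed(b):
    """Right-hand side of (13)."""
    total = multinomial(b)
    for eps in product((0, 1), repeat=len(b)):
        size = sum(eps)
        if size >= 2:
            shifted = tuple(x - e for x, e in zip(b, eps))
            total -= factorial(size - 2) * multinomial(shifted)
    return Fraction(total, 24)


def f_recursive(b):
    """f_k computed only from the four listed properties (entries of b must be >= 1)."""
    b = tuple(sorted(b))               # symmetry: the order of the arguments is irrelevant
    if len(b) == 1:
        return Fraction(1, 24)         # f_1 is identically 1/24
    if b[0] == 1:                      # dilaton-type property, applied to an entry equal to 1
        rest = b[1:]
        return sum(rest) * f_recursive(rest)
    # puncture-type property; every entry is at least 2 at this point
    return sum(f_recursive(b[:i] + (b[i] - 1,) + b[i + 1:]) for i in range(len(b)))
```

For instance, both `f_closed((2, 2))` and `f_recursive((2, 2))` evaluate to 1/6, which is the value of $\langle \tau_0^2 \tau_2^2 \rangle_1$ obtained from these properties.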
The first three properties are obviously satisfied by the expression given in the statement of the proposition. A direct computation verifies the last property.
An explicit expression for the intersection numbers of the κ classes in genus 1 can be obtained by substitution of (13) into Cor. 2.3 from [19]. This expression is quite complicated. We do not know how to simplify this expression, and therefore we do not present the resulting formula for the κ classes here.
5. Asymptotic Formulas for Volumes of $\overline{M}_{1,n}$
In this section, we derive an asymptotic formula for the Weil–Petersson volumes of $\overline{M}_{1,n}$ in the limit that n becomes very large, extending the proof of a similar result for genus zero in [19]. We do so by using analytic properties of the generating function K(x; s) in the case where $s_i = 0$ for all i ≥ 2. In some sense, these results are complementary to those of Penner [29] who obtains similar formulas for the case where the genus becomes very large. The class of the Weil–Petersson symplectic form on $M_{g,n}$ is precisely $\frac{1}{2\pi^2} \kappa_{(g,n),1}$. The symplectic volume of $M_{g,n}$ is called the Weil–Petersson volume of $M_{g,n}$. For this reason, the intersection numbers associated to the κ classes are sometimes called higher Weil–Petersson volumes. To avoid unnecessary factors, we shall work instead with the following quantity.
Definition 5.1. $w_{g,n} := \int_{\overline{M}_{g,n}} \kappa_1^{3g-3+n}$.
Theorem 5.2 ([19]). The genus zero Weil–Petersson volumes satisfy the asymptotic relation as n → ∞,
$$w_{0,n+3} \sim \frac{2^{3/2}\, \gamma_0\, \sqrt{\pi}}{C} \cdot \frac{2^{2n}\, n^{2n+\frac{1}{2}}}{C^n e^{2n}},$$
where $\gamma_0 \approx 2.40482555777\ldots$ is the smallest zero of the Bessel function $J_0$ and $C = -2\gamma_0 J_0'(\gamma_0) \approx 2.496918339\ldots$.
They proved this by noticing that the function $H_0''(x, 0; s)$ is invertible, and when all $s_i = 0$ except for $s_1$ the inverse function satisfies Bessel's equation, after a change of variables. Combining their results with ours for genus one, we obtain the following.
Theorem 5.3. The genus one Weil–Petersson volumes satisfy the asymptotic relation as $n \to \infty$,
$$ w_{1,n} \sim \frac{\pi}{24}\, \frac{(2n)^{2n}}{C^n e^{2n}}, $$
where $C$ is the same constant as above.

Proof. One uses the asymptotic formulas for the genus zero case and our result that the generating functions are related by (1).

Remark. The theorem above supports a conjecture of Itzykson regarding the existence of such an asymptotic formula for all genera with a constant $C$ independent of the genus (cf. [19, p. 765]).

6. The Moduli Space of Cohomological Field Theories

The moduli space of normalized, rank one cohomological field theories of genus zero was described by Kaufmann, Manin, and Zagier [19]. The generating function associated to the κ classes endows this moduli space with coordinates which behave nicely with respect to taking tensor products of cohomological field theories, a notion introduced in [23] (see also [18]). In this section, we introduce the notion of a restricted, normalized, cohomological field theory in genus one and describe the moduli space of rank one theories of this kind. Such CohFTs turn out to be almost completely determined by their genus zero part using the relations between the boundary strata of $M_{1,n}$ recently obtained by Getzler [11]. The analogous set of coordinates is constructed for this moduli space, but to do so we must introduce the λ classes as well.

6.1. Cohomological field theories. Consider $G_{g,n}$, the set of stable graphs of genus $g$ with $n$ tails labeled by the set $[n]$. Each $G_{g,n}$ is acted upon by the permutation group $S_n$, which permutes the labels on the tails. There are composition maps $G_{g_1,n_1} \times G_{g_2,n_2} \to G_{g_1+g_2,\,n_1+n_2-2}$ taking $(\Gamma, \Gamma') \mapsto \Gamma \circ_{(i_1,i_2)} \Gamma'$ for all $i_1$ in $[n_1]$ and $i_2$ in $[n_2]$, given by grafting the tail $i_1$ of $\Gamma$ with the tail $i_2$ of $\Gamma'$ and then relabeling the remaining tails with elements of the set $[n_1 + n_2 - 2]$ by inserting orders.
There is another set of composition maps $G_{g,n} \to G_{g+1,n-2}$ taking $\Gamma \mapsto \mathrm{tr}_{(i_1,i_2)} \Gamma$ for all distinct pairs $i_1$ and $i_2$ in $[n]$, in which the tails $i_1$ and $i_2$ of $\Gamma$ are grafted together. These composition maps are equivariant with respect to the action of the permutation groups. Let $C[G_{g,n}]$ be the vector space over $C$ with basis $G_{g,n}$; then the compositions and permutation group actions can be extended $C$-linearly. Let $C[G]$ denote the direct sum of $C[G_{g,n}]$ over all stable pairs $g, n$. The collection $\{ C[G_{g,n}] \}$ (or, for that matter, $\{ G_{g,n} \}$) together with the composition maps and actions of the permutation groups described above forms an example of a modular operad, a notion due to Getzler and Kapranov [13]. By restricting to just the genus zero subcollection $\{ C[G_{0,n}] \}$ and forgetting about the composition maps tr, we obtain an example of a cyclic operad [14]. Similarly, the homology groups $H_\bullet(M_{g,n})$ are endowed with an action of $S_n$ which relabels the punctures on the stable curve, and there are composition maps $\circ_{(i_1,i_2)} : H_{p_1}(M_{g_1,n_1}) \otimes H_{p_2}(M_{g_2,n_2}) \to H_{p_1+p_2}(M_{g_1+g_2,\,n_1+n_2-2})$
for all $i_1$ in $[n_1]$, $i_2$ in $[n_2]$ and $\mathrm{tr}_{(i_1,i_2)} : H_p(M_{g,n}) \to H_p(M_{g+1,n-2})$ for all distinct $i_1$ and $i_2$ in $[n]$, both of which are induced from the inclusion of strata. These composition maps are equivariant under the action of the permutation groups. The natural maps $\alpha_{g,n} : C[G_{g,n}] \to H_\bullet(M_{g,n})$ mapping $\Gamma \mapsto [M(\Gamma)]$, where $M(\Gamma) := \prod_{v \in V(\Gamma)} M_{g(v),n(v)}$, preserve the above structures and give rise to the sequence of morphisms
$$ 0 \longrightarrow \langle R_{g,n} \rangle \longrightarrow C[G_{g,n}] \xrightarrow{\ \alpha_{g,n}\ } H_\bullet(M_{g,n}) \qquad (14) $$
where the kernel of $\alpha_{g,n}$ is denoted by $\langle R_{g,n} \rangle$, the ideal in $C[G]$ generated by some space of relations $R_{g,n}$.

Definition 6.1. The modular operad $H := \{ H_{g,n} \}$ is the collection of $H_{g,n} := C[G_{g,n}] / \langle R_{g,n} \rangle$.
The canonical diagonal maps $M_{g,n} \to M_{g,n} \times M_{g,n}$ induce maps $H_\bullet(M_{g,n}) \to H_\bullet(M_{g,n}) \otimes H_\bullet(M_{g,n})$, making $H_\bullet(M_{g,n})$ into a Hopf modular operad in the natural way [13]. This endows $H_{g,n}$ with the structure of a Hopf modular operad as well.

In the case of $g = 0$, the results of [22] and [20] imply that $\alpha_{0,n}$ is surjective and $H_{0,n}$ is isomorphic to $H_\bullet(M_{0,n})$. Furthermore, the relations $R_{0,n}$ are those due to Keel [20], which come from a lift of the basic codimension one relations $R_{0,4}$ on $M_{0,4}$ via the canonical forgetful map $M_{0,n} \to M_{0,4}$.

In the case of $g = 1$, $\alpha_{1,n}$ is known not to be surjective since $M_{1,n}$ has odd dimensional homology classes. However, Getzler [11] has shown that the space of relations $R_{1,n}$, in addition to those coming from Keel's relations, contains the lifts of two other relations. The first is the lift of the basic codimension one relation on $M_{1,2}$ which contains no genus one vertices; this may be regarded as the image of Keel's relations under the self-sewing morphism $\mathrm{tr}_{(3,4)} : C[G_{0,4}] \to C[G_{1,2}]$. The second relation, which contains genus one vertices, is between codimension two strata and is of the form
$$ 12\delta_{2,2} - 4\delta_{2,3} - 2\delta_{2,4} + 6\delta_{3,4} + \delta_{0,3} + \delta_{0,4} - 2\delta_{\beta} = 0 \qquad (15) $$
where each term is an $S_4$-invariant combination of graphs of a given topological type and each graph $\Gamma$ represents the homology class $[M_\Gamma]$. (See [11] for details.) Getzler also states [11] that he has shown [12] that $\alpha_{1,n}$ maps surjectively onto the even dimensional homology of $M_{1,n}$ and that the relations mentioned above do in fact generate all of $R_{1,n}$.

A cohomological field theory is essentially a representation, in the sense of operads, of $H_\bullet(M_{g,n})$. In order to define such an object, we need to define the appropriate notion of the endomorphisms of a vector space in this context. Let $V$ be a vector space over $C$ with a symmetric, nondegenerate bilinear form $h$ of degree zero. Let $\mathrm{End}(V)_{g,n} := T^n V$ be the $n$th tensor power of $V$ for all nonnegative integers $g, n$ such that $2g - 2 + n > 0$, where $T^0 V$ is understood to be $C$. $S_n$ acts upon $\mathrm{End}(V)_{g,n}$ by permuting the tensor factors, and there are composition maps $\mathrm{End}(V)_{g_1,n_1} \otimes \mathrm{End}(V)_{g_2,n_2} \to \mathrm{End}(V)_{g_1+g_2,\,n_1+n_2-2}$ taking $(\mu, \mu') \mapsto \mu \circ_{(i_1,i_2)} \mu'$ for all $i_1$ in $[n_1]$ and $i_2$ in $[n_2]$, given by applying the inverse of $h$ to the corresponding tensor factors of $\mu$ and $\mu'$, and inserting the remaining
factors in the usual way. Similarly, the composition $\mathrm{End}(V)_{g,n} \to \mathrm{End}(V)_{g+1,n-2}$ taking $\mu \mapsto \mathrm{tr}_{(i_1,i_2)} \mu$ for all distinct pairs $i_1$ and $i_2$ in $[n]$ corresponds to applying the inverse of $h$ to the appropriate pair of tensor factors of $\mu$.

Definition 6.2 (Cohomological field theory). A (complete) cohomological field theory (CohFT) of rank $r$, $(V, h)$, is a morphism of modular operads $\mu_{g,n} : H_\bullet(M_{g,n}) \to \mathrm{End}(V)_{g,n}$, where $(V, h)$ is an $r$-dimensional vector space with an invariant, symmetric bilinear form $h$. A CohFT of genus $g$ consists of maps $\mu_{g',n} : H_\bullet(M_{g',n}) \to \mathrm{End}(V)_{g',n}$, defined only for $g' \le g$, which satisfy all the axioms of a CohFT in which no higher genus maps appear. A restricted CohFT is a morphism $\mu_{g,n} : H_{g,n} \to \mathrm{End}(V)_{g,n}$. A CohFT can also be described dually in terms of maps $\mathrm{End}(V)_{g,n} \to H^\bullet(M_{g,n})$.

Notice that a restricted CohFT of genus zero is the same as a CohFT of genus zero since $H_\bullet(M_{0,n}) = H_{0,n}$.

Remark. In the language of [31, 32], a topological gravity (coupled to topological matter) is a CohFT and the morphisms $\mu_{g,n}$ are the correlation functions of the theory. The genus zero CohFT is said to be tree level while a genus one CohFT is said to be one loop.

Remark. The natural Hopf structure on $H_\bullet(M_{g,n})$ endows the category of CohFTs with a tensor product, as is usual in representation theory.

A restricted CohFT is completely determined by a generating function called its potential. If the CohFT is not restricted then one can still define the notion of a potential (essentially since the modular operad $H_\bullet(M_{g,n})$ is the quotient of some free modular operad), but we will not need to work in such generality.

Definition 6.3. The potential $\Phi = \sum_{g=0}^{\infty} \Phi_g$ of a restricted CohFT $\mu : H \to \mathrm{End}(V)$ of rank $r$ is defined by choosing a basis $\{ e_1, \ldots, e_r \}$ for $V$, where $I_{g,n}(e_{a_1}, e_{a_2}, \ldots, e_{a_n})$ is the number obtained by using $h$ to pair $\mu_{g,n}([M_{g,n}])$ with $e_{a_1} \otimes e_{a_2} \otimes \cdots \otimes e_{a_n}$ and
$$ \Phi_g(x) := \sum_{n=0}^{\infty} I_{g,n}(e_{a_1}, e_{a_2}, \ldots, e_{a_n})\, \frac{x^{a_1} x^{a_2} \cdots x^{a_n}}{n!} $$
(where the summation convention has been used), which is regarded as an element in $C[[x^1, \ldots, x^r]]$.

Theorem 6.4. An element $\Phi_0$ in $C[[x^1, \ldots, x^r]]$ is the potential of a rank $r$, genus zero CohFT $(V, h)$ if and only if [22, 25] it satisfies the WDVV equation,
$$ (\partial_a \partial_b \partial_e \Phi_0)\, h^{ef}\, (\partial_f \partial_c \partial_d \Phi_0) = (-1)^{|x^a|(|x^b|+|x^c|)}\, (\partial_b \partial_c \partial_e \Phi_0)\, h^{ef}\, (\partial_f \partial_a \partial_d \Phi_0), $$
where $h_{ab} := h(e_a, e_b)$, $h^{ab}$ is the inverse matrix to $h_{ab}$, $\partial_a$ is the derivative with respect to $x^a$, and the summation convention has been used. If $(\Phi_0, \Phi_1)$ is the potential associated to a restricted, rank $r$ CohFT of genus one then $\Phi_0$ must satisfy the WDVV equation and $(\Phi_0, \Phi_1)$ must satisfy Getzler's equation from proposition (3.14) in [11].
The WDVV equation can be read off from the basic codimension one relation on $M_{0,4}$. Similarly, Getzler's equation can be seen from his relation (Eq. 15). The second statement will become an if and only if after the proof in [12] appears.

6.2. Rank one cohomological field theories. Let $(V, h)$ be a rank one CohFT with a fixed unit vector $e$. The morphisms $H_{g,n} \to \mathrm{End}(V)_{g,n}$ are completely determined by the collection of numbers $\{ I_{g,n} \}$, where $I_{g,n} := \mu_{g,n}([M_{g,n}])(\underbrace{e, e, \ldots, e}_{n})$, which must satisfy relations between themselves reflecting the way that the boundary strata in $M_{g,n}$ fit together. The potential in this case is
$$ \Phi_g = \sum_{n=0}^{\infty} I_{g,n}\, \frac{x^n}{n!}, $$
where $I_{g,n}$ is defined to vanish for pairs $(g, n)$ which are not stable. We will see that tautological classes on the moduli space of curves give rise to complete rank one CohFTs. In order to describe the moduli space of restricted, rank one CohFTs of genus one, we need to introduce a combination of the λ classes which behaves nicely with respect to restriction.

Definition 6.5. For all stable pairs $(g, n)$, let $\Lambda_{g,n}$ be the element of $H^\bullet(M_{g,n})[s, u]$ (where $s = (s_1, s_2, \ldots)$ and $u = (u_1, u_2, u_3, \ldots)$) given by
$$ \Lambda_{g,n} := \exp\Big( \sum_{i=1}^{\infty} \big( s_i\, \kappa_{(g,n),i} + u_i\, \gamma_{(g,n),i} \big) \Big), $$
where $\gamma_{(g,n),i} := \mathrm{ch}_{2i-1}(\pi_* \omega_{g,n})$. Here $\mathrm{ch}_i$ is the $i$th Chern character, and $\pi_* \omega_{g,n}$ is the pushforward of the relative dualizing sheaf. (Notice that $\mathrm{ch}_{2i}(\pi_* \omega_{g,n})$ vanishes for all $i$ [25, 8].) The classes $\gamma_{(g,n),i}$ are polynomials in the λ classes. In particular, $\gamma_1 = \lambda_1$.

Theorem 6.6. The collection $\Lambda := \{ \Lambda_{g,n} \}$ gives rise to a complete, rank one CohFT for all values of $u$ and $s$ by integrating the cohomology classes $\Lambda_{g,n}$ over the homology classes on $M_{g,n}$. Furthermore, the tensor product of the CohFTs associated to parameter values (s1, u1) and (s2, u2) is the CohFT associated to (s1 + s2, u1 + u2).

Proof. In the case where $u_i$ vanishes for all $i$, this was proven in [23], where it was realized that the κ classes form a logarithmic CohFT following the work of Arbarello and Cornalba [1] (see Eq. 10). The proof for the case where all the $s_i$ vanish is as follows. Consider the bundles $E_{g,n} := \pi_* \omega_{g,n}$ on $M_{g,n}$ from Sect. 1. If $\Gamma$ is a graph of genus $g$ with $n$ tails, then it determines the morphism $\rho_\Gamma : \prod_{v \in V(\Gamma)} M_{g(v),n(v)} \to M_{g(\Gamma),S(\Gamma)}$. The pull back of $E_{g,n}$ under $\rho_\Gamma$ differs from $\oplus_{v \in V(\Gamma)} E_{g(v),n(v)}$ by a trivial bundle. It follows that for each $k \ge 1$ the collection of the Chern characters $\gamma_k = \mathrm{ch}_{2k-1} E_{g,n}$ forms a logarithmic CohFT (see [8, 9]). The first part of the theorem follows by combining these two results. The proof that the coordinates $(s, u)$ are additive with respect to taking tensor products follows from the definition of coproduct which is induced from the diagonal map.
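Corollary 6.7. The potential of the rank one CohFT associated to $\Lambda$ (for given values of $s$ and $u$) is precisely the generating function $\chi_g$ for the intersection numbers of the $\kappa_i$ and $\gamma_i$ classes,
$$ \chi_g(x; s, u) := \Big\langle \exp\Big( x \tau_0 + \sum_{i=1}^{\infty} ( s_i \kappa_i + u_i \gamma_i ) \Big) \Big\rangle_g = \sum_{n=0}^{\infty} I_{g,n}\, \frac{x^n}{n!}, $$
where
$$ I_{g,n} = \sum_{r,m} \frac{s^m u^r}{m!\, r!}\, \langle \kappa^m \tau_0^n \gamma^r \rangle_g . $$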
It is understood that $I_{g,n} := 0$ for unstable pairs $(g, n)$. Notice that the CohFTs arising from $\Lambda$ have the property that $I_{0,3} = 1$ for all values of $u$ and $s$. This motivates the following definition, which will play an important role in what follows.

Definition 6.8. A rank one CohFT of genus $g$ is said to be invertible if $I_{0,3}$ is nonzero and normalized if $I_{0,3} = 1$.

6.3. Cohomological field theories in genus zero and one. Let us recall the results of Kaufmann, Manin, and Zagier for rank one CohFTs of genus zero [19]. A rank one CohFT in genus zero is uniquely determined by its potential $\Phi_0(x) = \sum_{n=3}^{\infty} I_{0,n} \frac{x^n}{n!}$. Furthermore, any function $\Phi_0(x)$ in $x^3 C[[x]]$ arises from some rank one CohFT of genus zero since the WDVV equation is trivially satisfied for rank one theories. Therefore, the moduli space of CohFTs of genus zero is parameterized by the independent variables $I_{0,n}$ for $n \ge 3$. What is nontrivial, however, is the behavior of these potentials under the tensor product. In particular, the coordinates $I_{0,n}$ do not behave nicely under the tensor product. However, the generating function associated to the genus zero κ classes $H_0(x, 0; s)$ (which is equal to $\chi_0$ with $u = 0$) allows them to introduce coordinates on the space of normalized, rank one CohFTs which behave nicely under tensor products.

Theorem 6.9 ([19]). The moduli space of normalized, rank one CohFTs in genus zero is parameterized by $s$ with potential $\Phi_0(x; s) = H_0(x, 0; s)$ in $C[s][[x]]$, our generating function for the intersection numbers of κ classes in genus zero. Furthermore, taking tensor products is additive with respect to the coordinates $s$.

We now treat the case of genus one and discover that the κ classes are not sufficient to describe the entire moduli space of normalized, rank one CohFTs. We will see that one needs to introduce the first λ class.

Theorem 6.10.
If the pair $(\Phi_0, \Phi_1)$ is a potential associated to a restricted, rank one CohFT of genus one then the following equation holds in $C[[x]]$:
$$ -\big(\Phi_0^{(3)}\big)^2\, \Phi_1^{(2)} + \Phi_0^{(3)}\, \Phi_0^{(4)}\, \Phi_1^{(1)} - \frac{1}{12} \big(\Phi_0^{(4)}\big)^2 + \frac{1}{24}\, \Phi_0^{(3)}\, \Phi_0^{(5)} = 0, $$
where $\Phi_g^{(l)}$ is the $l$th derivative of $\Phi_g$.
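As a quick consistency check on the signs in this equation, one can substitute the closed form stated in Theorem 6.11, $\Phi_1 = \frac{1}{24} \log \Phi_0''' + B \Phi_0''$. The sketch below uses the shorthand $f := \Phi_0^{(3)}$ (introduced here for brevity, not part of the original notation) and assumes that $f$ is invertible in $C[[x]]$.

```latex
\begin{align*}
\Phi_1^{(1)} &= \frac{1}{24}\,\frac{f'}{f} + B f, \qquad
\Phi_1^{(2)} = \frac{1}{24}\left(\frac{f''}{f} - \frac{(f')^2}{f^2}\right) + B f', \\
-f^2\,\Phi_1^{(2)} &= -\frac{1}{24}\, f f'' + \frac{1}{24}\,(f')^2 - B f^2 f', \\
f f'\,\Phi_1^{(1)} &= \frac{1}{24}\,(f')^2 + B f^2 f', \\
-f^2\,\Phi_1^{(2)} + f f'\,\Phi_1^{(1)} - \frac{1}{12}\,(f')^2 + \frac{1}{24}\, f f''
  &= \left(\frac{1}{24} + \frac{1}{24} - \frac{1}{12}\right)(f')^2
   + \left(-\frac{1}{24} + \frac{1}{24}\right) f f'' = 0 .
\end{align*}
```

The terms proportional to $B$ cancel between the second and third lines, which is why $B$ remains a free constant in the solution.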
Proof. Equation 6.10 above is nothing more than the equation due to Getzler in Theorem 6.4 for the case of rank one theories. Our equation can be seen from Eq. 15 directly by associating to each graph
$$ \Gamma \mapsto \frac{1}{|\mathrm{Aut}(\Gamma)|} \prod_{v \in V(\Gamma)} \frac{\partial^{n(v)} \Phi_{g(v)}}{\partial x^{n(v)}} $$
and then extending linearly to linear combinations of graphs. One will obtain −36 times the equation above.

Unlike the case of genus zero where the WDVV equation is trivially satisfied, solutions to this equation fall into two classes depending upon whether $\Phi_0^{(3)}$ is invertible in the ring of formal power series $C[[x]]$.

Theorem 6.11. The pair $(\Phi_0, \Phi_1)$ is a potential associated to an invertible, restricted, rank one CohFT of genus one if and only if $\Phi_0(x)$ is of the form $I_{0,3} \frac{x^3}{6} + x^4 C[[x]]$ for $I_{0,3}$ nonzero and
$$ \Phi_1 = \frac{1}{24} \log \Phi_0''' + B\, \Phi_0'', $$
where $B$ is an arbitrary constant. Therefore, an invertible, restricted CohFT of genus one is uniquely determined by arbitrary values of $I_{0,n}$ for all $n \ge 4$, $I_{0,3} \ne 0$, and $I_{1,1}$. If the restricted, rank one CohFT is not invertible then $\Phi_0 = 0$ and $\Phi_1$ obeys no constraints. Therefore, the space of such theories is parameterized by all values of $I_{1,n}$ for all $n \ge 1$.

Proof. If the pair $(\Phi_0, \Phi_1)$ is a potential associated to an invertible, restricted, rank one CohFT then, since $\Phi_0'''$ has an inverse in $C[[x]]$, one can solve Eq. 6.10 explicitly. The converse is more difficult in the absence of the proof that the lifts of the relations described above span, in genus 1, the entire space of relations $R_{1,n}$. However, we will not need this statement but will explicitly construct restricted, normalized, rank one CohFTs in genus one which realize all solutions to Eq. 6.11 above. This we do in the next subsection. Since $I_{1,1} = \frac{1}{24} \frac{I_{0,4}}{I_{0,3}} + B\, I_{0,3}$, when $I_{0,3}$ is nonzero varying $B$ is the same as varying $I_{1,1}$ and leaving all of the $I_{0,n}$ unchanged. In the case that the restricted, rank one CohFT is not invertible, our result follows from the equation.

We shall not discuss noninvertible CohFTs any further in this paper. From now on, we shall restrict ourselves to normalized CohFTs. It is worth observing that by incorporating the ψ classes, one can use the previous result to obtain yet another proof of the formula $H_1 = \frac{1}{24} \log H_0'''$.
It is not clear which of these approaches will prove most useful in higher genera.

6.4. Potentials in genus zero and one. In this subsection, we construct potentials for a class of normalized, restricted, rank one CohFTs in genus one explicitly and show that they span the entire space of solutions to Eq. 6.11, completing the proof of that theorem. These potentials are generating functions associated to the κ classes and $\lambda_1$. This will give rise to coordinates which are additive under the tensor product in analogy with the case of genus zero in [19]. We begin with a useful lemma.
Lemma 6.12. The tautological class $\lambda_1$ on $M_{1,n}$ can be written in terms of boundary classes as follows:

$\lambda_1 = \frac{1}{12}$ [figure: boundary graph with tails labeled by $[n]$].

Proof. The proof follows from the fact that $\lambda_{(1,n),1} = \pi^* \lambda_{(1,1),1}$ via the forgetful map $\pi : M_{1,n} \to M_{1,1}$. One uses (3) to express $\lambda_{(1,1),1}$ in terms of boundary classes.

In the sequel, let $\tilde\chi_g(s, u)$ be equal to the generating function $\chi_g(s, u)$, where all values of $u_i$ are set to zero except for $u := u_1$.

Theorem 6.13. The intersection numbers above satisfy the following: $\tilde\chi_0(x; s, u) = H_0(x, 0; s)$ and
$$ \tilde\chi_1(x; s, u) = \frac{u}{24}\, H_0''(x, 0; s) + \frac{1}{24} \log H_0'''(x, 0; s), $$
where $'$ denotes differentiation with respect to $x$.
Proof. Using that $\lambda_1$ vanishes on $M_{0,n}$, the presentation of $\lambda_1$ on $M_{1,n}$ in terms of boundary strata above, and the fact that the κ classes and $\lambda_1$ form a logarithmic CohFT, we obtain the equations
$$ \langle \kappa^m \lambda_1^r \tau_0^n \rangle_0 = 0 $$
and
$$ \langle \kappa^m \lambda_1^r \tau_0^n \rangle_1 = \begin{cases} \frac{1}{24} \langle \kappa^m \tau_0^{n+2} \rangle_0 & \text{if } r = 1, \\ 0 & \text{if } r \ge 2. \end{cases} $$
Rewriting these identities in terms of $\tilde\chi_g$, using the fact that $\tilde\chi_g(s, x, 0) = H_g(s, x)$ and Theorem 1, we obtain the desired result.

Proof (completion of Theorem 6.11). By setting $B := \frac{u}{24}$ in the previous theorem and $\Phi_g = \tilde\chi_g$ for $g = 0, 1$, we conclude the proof of Theorem 6.11 since Theorem 6.9 implies that by forgetting $\Phi_1$, we obtain all possible CohFTs in genus zero. Furthermore, by varying $u$, one obtains all possible values of $I_{1,1}$ without changing the values of $I_{0,n}$.
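Remark. The relations between the intersection numbers obtained in the previous proof can be encoded in the differential equations
$$ \frac{\partial}{\partial u} \tilde\chi_0 = 0 \quad \text{and} \quad \frac{\partial}{\partial u} \tilde\chi_1 = \frac{1}{24} \frac{\partial^2}{\partial x^2} \tilde\chi_0 . \qquad (16) $$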
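As a short check, Eq. (16) can be read off from the closed forms of Theorem 6.13; the derivation below uses only those two formulas and the fact that $H_0(x, 0; s)$ does not depend on $u$.

```latex
\begin{align*}
\frac{\partial}{\partial u}\tilde\chi_0
  &= \frac{\partial}{\partial u} H_0(x,0;s) = 0, \\
\frac{\partial}{\partial u}\tilde\chi_1
  &= \frac{\partial}{\partial u}\left(\frac{u}{24}\,H_0''(x,0;s)
     + \frac{1}{24}\log H_0'''(x,0;s)\right)
   = \frac{1}{24}\,H_0''(x,0;s)
   = \frac{1}{24}\,\frac{\partial^2}{\partial x^2}\tilde\chi_0 .
\end{align*}
```

Comparing coefficients of $x^n / n!$ in the second line recovers the relation $\langle \kappa^m \lambda_1 \tau_0^n \rangle_1 = \frac{1}{24} \langle \kappa^m \tau_0^{n+2} \rangle_0$ from the proof of Theorem 6.13.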
Putting everything together, we arrive at the following theorem.

Theorem 6.14. The moduli space of normalized, restricted, rank one CohFTs of genus one is parameterized by coordinates $(s, u)$ via potentials $(\tilde\chi_0, \tilde\chi_1)$, where $\tilde\chi_0(x)$ belongs to $\frac{x^3}{6} + x^4 C[s, u][[x]]$ and $\tilde\chi_1(x)$ belongs to $x\, C[s, u][[x]]$, satisfying Theorem 6.13. The tensor product is additive in the coordinates $(s, u)$.
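To see the coordinates $(s, u)$ at work in the lowest order, the following sketch expands the two potentials of Theorem 6.13 to first nontrivial order. It assumes the standard values $\langle \kappa_1 \tau_0^4 \rangle_0 = 1$ and $\langle \kappa_1 \tau_0 \rangle_1 = \langle \lambda_1 \tau_0 \rangle_1 = \frac{1}{24}$, which are quoted here as known facts and are not derived in the text above.

```latex
\begin{align*}
\tilde\chi_0 &= H_0(x,0;s) = \frac{x^3}{6} + s_1\,\frac{x^4}{24} + O(x^5), \\
H_0'' &= x + O(x^2), \qquad
H_0''' = 1 + s_1 x + O(x^2), \qquad
\log H_0''' = s_1 x + O(x^2), \\
\tilde\chi_1 &= \frac{u}{24}\,H_0'' + \frac{1}{24}\,\log H_0'''
             = \frac{s_1 + u}{24}\,x + O(x^2).
\end{align*}
```

Under these assumptions $I_{0,3} = 1$, $I_{0,4} = s_1$ and $I_{1,1} = \frac{s_1 + u}{24}$, which agrees with $I_{1,1} = s_1 \langle \kappa_1 \tau_0 \rangle_1 + u \langle \lambda_1 \tau_0 \rangle_1$. Varying $u$ at fixed $s$ changes $I_{1,1}$ but none of the $I_{0,n}$, as used in the completion of the proof of Theorem 6.11.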
Given two rank one, normalized CohFTs in genus zero, it is not obvious how to write down the potential of the tensor product CohFT explicitly in terms of the potentials of the tensor factors. In [19], the authors show that the operation of tensor product corresponds to multiplication of the formal Laplace transforms of the two potentials associated to the tensor factors. Because a rank one, normalized, restricted CohFT in genus one is determined by its genus zero potential and the value of $u$, an explicit expression for the potential associated to the tensor product of two such theories follows from the genus zero result of [19] and Theorem 6.13.

Acknowledgement. We are grateful to the Max Planck Institut für Mathematik for their financial support and for providing a wonderfully stimulating atmosphere. We would like to thank R. Dijkgraaf, E. Getzler, and Yu. Manin for useful conversations. We are grateful to J. Stasheff for his comments on an earlier version of this paper. We would also like to thank K. Belabas for his TEXnical assistance and for providing the music.
References

1. Arbarello, E., Cornalba, M.: Combinatorial and algebro-geometric cohomology classes on the moduli space of curves. J. Algebraic Geom. 5, 705–749 (1996)
2. Caporaso, L., Harris, J.: Counting plane curves of any genus. Invent. Math. 131, 345–392 (1998)
3. Cornalba, M.: On the projectivity of the moduli spaces of curves. J. Reine Angew. Math. 443, 11–20 (1993)
4. Deligne, P., Rapoport, M.D.: Les schémas de modules de courbes elliptiques. Proc. Internat. Summer School, Antwerp 1972, LNM 349, 143–316 (1973)
5. Dijkgraaf, R.: Intersection theory, integrable hierarchies and topological field theory. In: New Symmetry Principles in Quantum Field Theory, G. Mack, ed., London: Plenum 1993, pp. 95–158
6. Dubrovin, B.: Geometry of 2D topological field theories. Integrable systems and Quantum Groups, Lecture Notes in Math. 1620, Berlin: Springer, 1996
7. Eguchi, T., Hori, K., Xiong, C.S.: Quantum cohomology and the Virasoro algebra. Phys. Lett. B 402, 71–80 (1997)
8. Faber, C.: A conjectural description of the tautological ring of the moduli space of curves. University of Amsterdam preprint (1996)
9. Faber, C.: Algorithms for computing intersection numbers on moduli spaces of curves, with an application to the class of the locus of Jacobians. alg-geom/9706006
10. Fulton, W., Pandharipande, R.: Notes on stable maps and quantum cohomology. Proceedings of the AMS conference on Algebraic Geometry, Santa Cruz, 1995
11. Getzler, E.: Intersection theory on $M_{1,4}$ and elliptic Gromov–Witten invariants. J. Am. Math. Soc. 10, 973–998 (1997)
12. Getzler, E.: Generators and relations for $H_\bullet(M_{1,n}, Q)$. In preparation
13. Getzler, E., Kapranov, M.: Modular operads. Compositio Math. 110, 65–126 (1998)
14. Getzler, E., Kapranov, M.: Cyclic operads and cyclic homology. To appear in Geometry, Topology, and Physics for Raoul. ed. S. T. Yau, Cambridge, MA: International Press, 1994
15. Hain, R.M., Looijenga, E.: Mapping class groups and moduli spaces of curves. Proceedings of the AMS conference on Algebraic Geometry, Santa Cruz, 1995
16. Hitchin, N.: Frobenius manifolds. In: Gauge Theory and Symplectic Geometry. eds. J. Hurtubise and F. Lalonde. NATO-ASI Series C 488, Kluwer, 1997
17. Kac, V.G., Schwartz, A.: Geometric interpretation of the partition function of 2D gravity. Phys. Lett. 257, 329–334 (1991)
18. Kaufmann, R.: The intersection form in $H^\bullet(M_{0,n})$ and the explicit Künneth formula in quantum cohomology. Internat. Math. Res. Notices (1996) pp. 929–952
19. Kaufmann, R., Manin, Yu.I., Zagier, D.: Higher Weil–Petersson volumes of moduli spaces of stable n-pointed curves. Commun. Math. Phys. 181, 763–787 (1996)
20. Keel, S.: Intersection theory of moduli spaces of stable n-pointed curves of genus zero. Trans. AMS 330, 545–574 (1992)
21. Kontsevich, M.: Intersection theory on the moduli space of curves and the matrix Airy function. Commun. Math. Phys. 147, 1–23 (1992)
22. Kontsevich, M., Manin, Yu.I.: Gromov–Witten classes, quantum cohomology, and enumerative geometry. Commun. Math. Phys. 164, 525–562 (1994)
23. Kontsevich, M., Manin, Yu.I. (with Appendix by R. Kaufmann): Quantum cohomology of a product. Invent. Math. 124, 313–340 (1996)
24. Looijenga, E.: Intersection theory on Deligne–Mumford compactifications [after Witten and Kontsevich]. Séminaire Bourbaki, Vol. 1992/93, Astérisque 216, 187–212 (1993)
25. Manin, Yu.I.: Frobenius manifolds, quantum cohomology, and moduli spaces (Chapters I, II, III). MPI Preprint No. 96–113, January 1996
26. Manin, Yu.I.: Private communication
27. Matone, M.: Nonperturbative model of Liouville gravity. J. Geom. Phys. 21, 381–398 (1997)
28. Mumford, D.: Towards an enumerative geometry of the moduli space of curves. In: Arithmetic and Geometry, eds. M. Artin and J. Tate, Part II, Progress in Math., Vol. 36, Basel: Birkhäuser, 1983, pp. 271–328
29. Penner, R.C.: Weil Petersson volumes. J. Diff. Geom. 35, 559–608 (1992)
30. Ruan, Y., Tian, G.: A mathematical theory of quantum cohomology. J. Diff. Geom. 42, 259–367 (1995)
31. Witten, E.: On the structure of the topological phase of two-dimensional gravity. Nucl. Phys. B 340, 281–332 (1990)
32. Witten, E.: Two-dimensional gravity and intersection theory on moduli space. Surveys in Diff. Geom. 1, 243–310 (1991)
33. Zograf, P.: The Weil–Petersson volume of the moduli spaces of punctured spheres. In: Mapping Class Groups and Moduli Spaces of Riemann Surfaces, eds. R. M. Hain and C. F. Bödigheimer, Contemp. Math. 150, 267–372 (1993)

Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 675 – 705 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
D-Brane Bound States Redux*

Savdeep Sethi1, Mark Stern2

1 School of Natural Sciences, Institute for Advanced Study, Princeton, NJ 08540, USA. E-mail: [email protected]
2 Department of Mathematics, Duke University, Durham, NC 27706, USA. E-mail: [email protected]
Received: 23 May 1997 / Accepted: 5 November 1997
Abstract: We study the existence of D-brane bound states at threshold in Type II string theories. In a number of situations, we can reduce the question of existence to quadrature, and the study of a particular limit of the propagator for the system of D-branes. This involves a derivation of an index theorem for a family of non-Fredholm operators. In support of the conjectured relation between compactified eleven-dimensional supergravity and Type IIA string theory, we show that a bound state exists for two coincident zero-branes. This result also provides support for the conjectured description of M-theory as a matrix model. In addition, we provide further evidence that there are no BPS bound states for two and three-branes twice wrapped on Calabi–Yau vanishing cycles.

1. Introduction

Remarkable progress by Polchinski in describing the solitons of Type II string theory has provided the means by which many conjectured dualities involving string theories and M-theory can be stringently tested [1]. The low-energy dynamics of coincident D-branes has been described by Witten [2], who reduced the question of finding BPS bound states to one of studying the vacuum structure of various supersymmetric Yang-Mills theories. In simple cases, the BPS mass formula forbids the decay of a charged particle saturating the mass bound, hence ensuring stability. However, there are a number of situations in which a particle is required that is only marginally stable against decay. Showing the existence of such particles, with energies at the decay threshold, is the goal of this paper. A similar problem arose for finite SU(2) N=2 Yang-Mills theory, studied in [3] and [4], where certain dyon bound states at threshold were shown to exist. The situations we shall presently study are significantly more difficult because the Hamiltonians are not as well behaved, and gauge invariance provides an added complexity.

* A preliminary version of this paper was circulated informally in November, 1996.
Let us briefly recall the low-energy dynamics of coincident Dirichlet p-branes, described in [2]. The world volume theory is the dimensional reduction of ten-dimensional N=1 Yang-Mills to the (p + 1)-dimensional world volume of the brane. For a single brane, the gauge theory is abelian, and the dynamics therefore trivial in the infrared. For N coincident branes, the gauge symmetry is enhanced to $U(N)$ rather than $U(1)^N$. After factoring out a $U(1)$ corresponding to the center of mass motion, the existence of a bound state requires that the remaining $SU(N)$ (p + 1)-dimensional Yang-Mills theory possess a normalizable supersymmetric vacuum. The bosonic potential for this model generally has flat directions, and so we encounter the problem of bound states at threshold. If a bound state is required by a conjectured duality, there is a consistency check, described by Sen [5], that can sometimes be performed. In favorable cases, one might be able to further compactify one direction of the superstring theory. If a bound state exists prior to compactification, it should give rise to BPS states in the further compactified theory which, for appropriate choices of momentum along the circle, are no longer marginally bound. The existence of these states can then be analyzed with more conventional techniques. Of course, for this consistency check, there have to be enough remaining uncompactified directions so that problems with infra-red divergences do not arise. More generally, however, the question of bound states at threshold must be addressed. Note that a normalizable state for a theory in a compact space generally does not remain normalizable when the volume is taken to infinity. The spectrum can and often does change discontinuously, and showing the existence of the bound state in the non-compact situation requires a separate analysis.
In a similar spirit, we can arrive at descriptions of the effective dynamics of p-branes multiply-wrapped on supersymmetric cycles of a compactification space. In the case of p-branes wrapped on p-cycles, the resulting description of the low-energy dynamics is some flavor of quantum mechanics, although not generally just a supersymmetric gauge theory. Our aim in this paper is to address the fundamental issue – the existence of flat directions in the potential – which arises in studying binding in these situations. This analysis generalizes the discussion in [6], where we argued for the existence of a marginal bound state of a zero-brane and a four-brane, to the case where the gauge group is non-abelian. We shall see that there are very subtle issues that arise as a result of this complication. The most exciting reason for studying this question is, however, the remarkable conjecture that M-theory may be described in terms of zero-brane dynamics in the limit where the number of branes goes to infinity [7]. This conjecture is, in part, founded on previous work studying the relation between supermembranes, and the N → ∞ limit of type IIA zero-brane quantum mechanics [8,9]. In order for the M(atrix) model to have a chance at describing M theory, we need to be able to find states in the quantum mechanics which correspond to the gravitons of eleven-dimensional supergravity. The bound state that we shall find is precisely one of these particles. In the process of showing that such a bound state exists, we will provide a detailed study of the behavior of the propagator for the two zero-brane system when the zero-branes are far apart. There are a number of complications that make this analysis quite subtle. During the lengthy course of our investigation (which pre-dates the M(atrix) model), a number of germane papers have appeared. 
Among these papers have been interesting discussions of zero-brane scattering in various approximations [10,11], and more recently, an exciting extension of the original matrix model conjecture to the case of finite N [12]. There has also been an explicit argument showing that there are no normalizable ground states in a particular simplified matrix model [13], a heuristic attempt to argue for the existence of zero-brane bound states [14], and a recent paper which has some overlap with our results [15].
In the following section, we consider the case of p = 0. We describe a seven parameter family of theories, which are the primary focus of this discussion. This family of theories is derived fundamentally from the quantum mechanics describing the zero-brane in type IIA string theory by adding mass deformations. These parameters allow us to ‘flow down’ from ten dimensions to models that correspond to the reduction of N=1 Yang-Mills in lower dimensions by taking various mass terms to infinity. We discuss general features of these models, including the various physical scenarios in which they arise. Our approach to the question of counting bound states is described in section three. There, we argue that the $L^2$ index for this class of supersymmetric quantum mechanical Yang-Mills theories is actually computable. This involves a discussion of $L^2$ index theory for non-Fredholm operators, which is an area of mathematics that is relatively unexplored. In section four, we study the question of two-particle binding in these models, and we derive a formula for the principal contribution to the index. The final section is a study of the two-particle propagator in the limit where the two particles are far apart. With this analysis, we can compute a subtle additional contribution to the index. The way in which this contribution arises involves some rather surprising cancellations. In the class of models that we investigate, we find that only the case which corresponds to the reduction of supersymmetric Yang-Mills from ten dimensions can have a unique bound state. This answers, in large part, the question of why the large N limit of the reduced ten-dimensional Yang-Mills theory should be distinguished from the large N limit of reductions of lower-dimensional Yang-Mills theories.
2. Quantum Mechanical Gauge Theory
2.1. General Comments. Let us begin by considering models that arise from reducing supersymmetric $d+1$-dimensional $SU(N)$ Yang-Mills to quantum mechanics; see, for instance, [16] for the first discussion of quantum mechanical gauge theories, or perhaps [17]. Whether the Yang-Mills theory contains additional matter multiplets does not significantly change the following discussion; so, for simplicity, we shall assume no additional matter. On reducing the connection $A_\mu$, we obtain scalar coordinates $x^i$, where $i = 1, \ldots, d$, which take values in the adjoint representation of the gauge group. We introduce canonical momenta obeying $[x^i_A, p^j_B] = i\delta_{AB}\delta^{ij}$, where the subscript $A$ is a gauge index. With the generators $T^A$ for the adjoint representation normalized so that $\mathrm{Tr}(T^A T^B) = N\delta^{AB}$, the Hamiltonian for the system takes the general form,
$$H = \frac{1}{2N}\mathrm{Tr}(p^i p^i) + V(x) + H_F. \tag{2.1}$$
The bosonic potential $V(x)$ is polynomial in $x$, and generally has flat directions. The term $H_F$ is quadratic in the fermions and linear in $x$. Specific examples will be studied in the following subsection. The $A_0$ equation of motion gives a set of constraints, $C_A$, which must vanish on physical states by Gauss’ law. The constraints obey the algebra,
$$[C_A, C_B] = i f_{ABC} C_C, \tag{2.2}$$
where fABC are the structure constants. The constraints further obey the commutation relations [CA , H] = [CA , Q] = 0, where Q is a supersymmetry generator. The
supersymmetry algebra closes on the Hamiltonian if the constraints are set to zero. An N-particle BPS bound state corresponds to a normalizable, gauge-invariant ground state for this supersymmetric system. Without detailed computation, what might we infer about the structure of the ground state? Away from the flat points, the wave function for the ground state will decay exponentially. The only interesting asymptotic behavior is expected near points where the potential is small. A preliminary comment about the structure of the flat directions is in order: for gauge group $SU(N)$, there are $d_c = (d-1)(N-1) + (N^2-1)$ commuting directions around a flat point, and $d_a = (d-1)(N^2-N)$ non-commuting directions. Let us consider the structure of the potential in the neighborhood of a flat point. As we shall subsequently describe in detail, the potential can be approximated by $V \sim -\frac{1}{2} r^2 |v|^2$, where $v$ parametrizes the transverse directions, and $r$ is a radial coordinate for the flat directions. The Hamiltonian is then essentially a set of bosonic and fermionic harmonic oscillators for the transverse directions, and a free Laplacian along the flat directions. The frequency for oscillation along the massive directions depends on $r$. This observation provides one way of seeing that there are no scattering states in the spectrum of the bosonic Hamiltonian for these models, as discussed in [18]. Somewhat surprisingly, the spectrum of the bosonic models only contains discrete states. To construct a scattering state along the flat direction, one would want to put the transverse harmonic oscillators into their ground states; however, the zero point energy of the oscillators increases with $r$, essentially forbidding finite energy scattering states. The same argument does not apply to the supersymmetric case, since the ground state energy for the additional fermions now cancels the zero point energy from the bosons, as required by supersymmetry.
If this were not the case, the subtleties in counting zero-brane bound states would not exist! In a first approximation for large $r$, any zero-energy wavefunction, $\psi(x)$, roughly takes a product form corresponding to placing the transverse oscillators into the ground state, $\psi(x) \sim g(r, \theta)\, e^{-r|v|^2/2}$, where $\theta$ are angular variables for the flat directions. The leading dependence of $g(r, \theta)$ on $r$ is believed to be power law decay for large $r$. Acting with the Hamiltonian for the massive directions on this wavefunction yields zero, since the zero point energies of the bosons and fermions cancel. We can now explain the key difficulty in studying the approximate asymptotic wavefunction: can the decay exponent be accurately estimated? We note that this issue is critical, and cannot be resolved by simple approximations of the asymptotic behavior. For instance, even in this approximation, the function $g(r, \theta)$ is not simply the solution of a free Laplacian for the $d_c$-dimensional space of flat directions since the Laplacian, which for the radial coordinate is given by,
$$\Delta_r = -\frac{1}{r^{d_c-1}}\frac{\partial}{\partial r}\, r^{d_c-1}\frac{\partial}{\partial r}, \tag{2.3}$$
also acts on the harmonic oscillator component of the wavefunction. Actually, it is unlikely that the decay exponent can be accurately estimated without at least including the first excited mode for the massive direction into the approximation. We should also note that showing that the decay is fast enough to ensure normalizability is only a first step toward showing that a bound state exists. The structure of the wavefunction would need to be studied at small $r$ where the non-abelian degrees of freedom are important. Currently, the only practical approach is to develop an appropriate index theory for the problem. As a final comment, note that the power law behavior of the asymptotic ground state wavefunction is a consequence of the lack of a mass gap in the spectrum. The supersymmetric theory contains a continuum of states which descend to zero energy, thanks to the existence of the flat direction [8].
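The counting of commuting and non-commuting directions quoted earlier in this subsection can be checked arithmetically: the two counts should add up to $d(N^2-1)$, the total number of scalar coordinates. The following is only a small sanity check; the function name `flat_direction_counts` is a hypothetical label introduced here for illustration.

```python
def flat_direction_counts(d, n):
    """Return (d_c, d_a) for gauge group SU(n) with d adjoint scalars,
    using the formulas quoted in the text."""
    d_c = (d - 1) * (n - 1) + (n * n - 1)  # commuting directions around a flat point
    d_a = (d - 1) * (n * n - n)            # non-commuting directions
    return d_c, d_a


for d in (2, 3, 5, 9):
    for n in (2, 3, 4):
        d_c, d_a = flat_direction_counts(d, n)
        # The two sets of directions should exhaust all d * (n^2 - 1) scalars.
        assert d_c + d_a == d * (n * n - 1)
        print(f"d={d} N={n}: d_c={d_c}, d_a={d_a}, total={d_c + d_a}")
```

For the two-particle zero-brane case (d = 9, N = 2) this gives eleven commuting and sixteen non-commuting directions.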
2.2. A family of models. We now turn to the models of primary interest to us. Let us recall that strongly coupled Type IIA string theory in ten dimensions has a conjectured dual description as weakly coupled eleven-dimensional supergravity compactified on an $S^1$ [19,20]. To match the Kaluza-Klein spectrum of the compactified supergravity theory, Type IIA string theory requires electrically charged particle states. The Dirichlet zero-branes, which carry RR charge, seem to be the only candidates. Since there is a single Kaluza-Klein mode for each choice of momentum along the circle direction, we desire a single D-brane bound state for each N. Proving this conjecture was our original motivation for studying these theories. Actually, Sen has argued in [5] that if a unique bound state exists in the quantum mechanics describing N zero-branes, then the spectrum of ultra-short multiplets in the toroidally compactified type II string agrees with the spectrum predicted by U-duality. The world-volume theory for the D-particle is given by the dimensional reduction of N=1 9 + 1-dimensional Yang-Mills to quantum mechanics. A Majorana-Weyl spinor in 9 + 1 dimensions has 16 real components, which means that the resulting quantum mechanical theory has N=16 supersymmetry. Let $\gamma^i_{\alpha\beta}$ be a real representation of the SO(9) Clifford algebra with $i = 1, \ldots, 9$ and $\alpha = 1, \ldots, 16$. These Clifford matrices satisfy $\{\gamma^i, \gamma^j\} = 2\delta^{ij}$. After reduction, the Hamiltonian for this system takes the form,
$$H = \frac{1}{2N}\mathrm{Tr}(p^i p^i) - \frac{1}{4N}\sum_{ij}\mathrm{Tr}([x^i, x^j]^2) - \frac{1}{2N}\mathrm{Tr}(\psi\gamma^i[x^i, \psi]), \tag{2.4}$$
where the real fermions $\psi_{A\alpha}$ obey:
$$\{\psi_{A\alpha}, \psi_{B\beta}\} = \delta_{AB}\delta_{\alpha\beta}. \tag{2.5}$$
The Hilbert space is then composed of spinors on which the quantized fermions act as elements of a Clifford algebra. The spinor wavefunctions contain an extremely large number of components, even for small N, which makes an explicit construction of the zero energy bound state wavefunction at best difficult.¹ The supersymmetry algebra takes the form,
$$\{Q_\alpha, Q_\beta\} = 2\delta_{\alpha\beta} H + 2\gamma^i_{\alpha\beta}\, x^i_A C_A, \tag{2.6}$$
where,
$$Q_\alpha = \frac{1}{N}\gamma^i_{\alpha\beta}\mathrm{Tr}(\psi_\beta p^i) - \frac{i}{4N}\mathrm{Tr}([\gamma^i, \gamma^j]\psi[x^i, x^j])_\alpha,$$
while the constraint,
$$C = -i[x^i, p^i] - \frac{1}{2}[\psi_\alpha, \psi_\alpha],$$
or explicitly,
$$C_A = f_{ABC}\Big(x^i_B p^i_C - \frac{i}{2}\psi_{B\alpha}\psi_{C\alpha}\Big). \tag{2.7}$$
The constraint takes exactly the form assumed in the previous discussion. It is natural to call this a nine-dimensional model, although it is quantum mechanics, since there are nine bosonic variables in the adjoint of $SU(N)$, and the model is the reduction of a ten-dimensional theory. The flavor symmetry is clearly Spin(9). Note that there is a nice correlation between fermion number and flavor representation that is worth mentioning at this point. The correlation essentially follows from spin-statistics in ten dimensions: fermionic states in the Hilbert space transform under spinor representations of the flavor group, while bosonic states appear in representations of SO(9). If the ground state is unique, it must therefore be bosonic. There are similar relations for the other models that we shall soon discuss.

¹ However, the existence of the nice Spin(9) flavor symmetry might bring an explicit construction of the ground state wavefunction within the realm of possibility. We leave the attempt to construct the explicit solution to braver souls.

Let us consider what sort of deformations are possible in this theory. We would like to add mass terms to compactify some of the bosonic variables and effectively reduce the dimension, but we will also require that the supersymmetry algebra maintain its nice structure. In particular, we shall not consider deformations which introduce additional terms into the right hand side of the supersymmetry algebra (2.6), which are linear in momenta. The mass deformations that we shall describe correspond, in special cases, to breaking N=4 Yang-Mills in four-dimensions to N=2 or N=1 by giving masses to various chiral fields in the adjoint representation, and reducing the corresponding model to quantum mechanics. To describe the allowed deformations, choose a real supersymmetry generator, $Q = Q_\alpha$. The generator can be split into terms involving momenta, and terms independent of momenta. Those depending on momenta can be expressed, schematically, as $\lambda^i_A p^i_A$, where $\lambda^i$ is a real fermion, and $i$ runs from 1 to 9. This leaves us with seven real fermions, $\omega^j_A$, in the adjoint of the gauge group, unpaired with a momentum operator, but each appearing in $Q$ paired with an operator, $f^j_A$, quadratic in the coordinates. The supersymmetry generator is then roughly,
$$Q \sim \lambda^i_A p^i_A + \omega^j_A f^j_A + \ldots.$$
The seven fermions, $\omega^j$, then represent our deformation degrees of freedom. We can add any reasonable operator to $f^j$, independent of the momenta, and not generate a new term linear in momenta in the expression for $\{Q, Q\}$. There are many interesting possible deformations that preserve at least one supersymmetry. Some deformations can give quite exotic classical minima of the resulting bosonic potential. This is a topic that merits further investigation. As a special prosaic case, we could add the perturbation $m x^i$ to one of the $f^j$, which would lift some of the flat directions. This is the family of deformations to which we shall restrict our discussion. More explicitly, consider a term $f^j$ which squares to give the term in the potential, $|f^j|^2$. Adding the term $m x^i$ to $f^j$ changes the potential to $|f^j + m x^i|^2$. Taking $m \to \infty$ then effectively decouples $x^i$ from the model. In this way, we generate a seven parameter family of models which depend on the values of the allowed masses for seven of the coordinates. Note that taking all masses to infinity leaves us with a two-dimensional model, and further compactification is not possible without introducing additional terms linear in the momenta into the supersymmetry algebra. There are two cases of particular interest: the three and five-dimensional models. These models correspond to the reduction of N=1 Yang-Mills from four and six dimensions, respectively. For completeness and to fix annoying normalizations, we shall describe the Hamiltonian and supersymmetry algebra for both models explicitly. In the three-dimensional case, the Hamiltonian is given by,
$$H = \frac{1}{2N}\mathrm{Tr}(p^i p^i) - \frac{1}{4N}\sum_{ij}\mathrm{Tr}([x^i, x^j]^2) + \frac{1}{N}\mathrm{Tr}(\bar\psi\sigma^i[x^i, \psi]), \tag{2.8}$$
where the index $i = 1, 2, 3$. The $\sigma^i$ are the Pauli matrices, and the complex fermions $\psi$ obey the anti-commutation relations: $\{\psi_{A\alpha}, \bar\psi_{B\beta}\} = \delta_{AB}\delta_{\alpha\beta}$, where $\alpha = 1, 2$. The supersymmetry generators are now complex, but still take a form similar to the previous example,
$$Q_\alpha = \sigma^i_{\alpha\beta}\psi_{A\beta}\, p^i_A - \frac{1}{4} f_{ABC}[\sigma^i, \sigma^j]_{\alpha\beta}\,\psi_{A\beta}\, x^i_B x^j_C,$$
while the constraints are given by,
$$C_A = f_{ABC}(x^i_B p^i_C - i\bar\psi_{B\alpha}\psi_{C\alpha}). \tag{2.9}$$
The supersymmetry algebra is now,
$$\{Q_\alpha, Q_\beta\} = 0, \qquad \{\bar Q_\alpha, \bar Q_\beta\} = 0, \qquad \{Q_\alpha, \bar Q_\beta\} = 2\delta_{\alpha\beta} H - 2\sigma^i_{\alpha\beta}\, x^i_A C_A. \tag{2.10}$$
The most glaring difference between this model and the nine-dimensional zero-brane case is that the Hilbert space is now a Fock space with a canonical vacuum. This model is quite special because the Pauli matrices form a Lie algebra, and so the complex supercharge can be expressed as [16],
$$Q_\alpha = \sigma^i_{\alpha\beta}\psi_\beta\Big(p^i_A - \frac{i}{2} f_{ABC}\,\epsilon_{ijk}\, x^j_B x^k_C\Big).$$
After introducing a potential, $W = \frac{1}{6} f_{ABC}\,\epsilon_{ijk}\, x^i_A x^j_B x^k_C$, we can conjugate the supercharge in the following way:
$$e^{W} Q_\alpha e^{-W} = \sigma^i_{\alpha\beta}\psi_\beta\, p^i_A.$$
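The conjugation by $e^{W}$ can be verified in a few lines. The sketch below assumes the coordinate representation $p^i_A = -i\,\partial/\partial x^i_A$ implied by the canonical commutator, and uses the fact that $f_{ABC}\,\epsilon_{ijk}$ is symmetric under simultaneous permutation of the index pairs $(A,i)$, $(B,j)$, $(C,k)$, so that the three derivative terms of the cubic $W$ are equal:

```latex
\begin{align*}
\frac{\partial W}{\partial x^i_A}
  &= 3\cdot\tfrac{1}{6}\, f_{ABC}\,\epsilon_{ijk}\, x^j_B x^k_C
   = \tfrac{1}{2}\, f_{ABC}\,\epsilon_{ijk}\, x^j_B x^k_C, \\
e^{W} p^i_A\, e^{-W}
  &= p^i_A + i\,\frac{\partial W}{\partial x^i_A}
   = p^i_A + \tfrac{i}{2}\, f_{ABC}\,\epsilon_{ijk}\, x^j_B x^k_C, \\
e^{W}\Big(p^i_A - \tfrac{i}{2}\, f_{ABC}\,\epsilon_{ijk}\, x^j_B x^k_C\Big) e^{-W}
  &= p^i_A .
\end{align*}
```

Since $W$ depends only on the coordinates, it commutes with the fermions and with the cubic term itself, so only the momentum is shifted and the shift cancels the interaction term exactly.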
The study of the ground state wavefunctions then takes on a cohomological flavor since the supercharge acts roughly as the operator, $Q \sim d + dW\wedge$, on the wavefunctions, which we can view as differential forms. We expect that, in this case, there should then be an explicit proof from studying the spectrum directly that shows there are no zero-energy $L^2$ wavefunctions for this model. The five-dimensional case is governed by the Hamiltonian:
$$H = \frac{1}{2N}\mathrm{Tr}(p^i p^i) - \frac{1}{4N}\sum_{ij}\mathrm{Tr}([x^i, x^j]^2) + \frac{1}{N}\mathrm{Tr}(\bar\psi\gamma^i[x^i, \psi]). \tag{2.11}$$
The index $i$ now runs from 1 to 5, and the matrices $\gamma$ are elements of the SO(5) Clifford algebra. Again, the fermions are complex, and obey the relations, $\{\psi_{A\alpha}, \bar\psi_{B\beta}\} = \delta_{AB}\delta_{\alpha\beta}$, where $\alpha = 1, \ldots, 4$. The constraint has a form identical to the previous case (2.9), and the supersymmetry algebra is given by:
$$\{Q_\alpha, Q_\beta\} = 0, \qquad \{\bar Q_\alpha, \bar Q_\beta\} = 0, \qquad \{Q_\alpha, \bar Q_\beta\} = 2\delta_{\alpha\beta} H - 2\gamma^i_{\alpha\beta}\, x^i_A C_A. \tag{2.12}$$
2.3. Wrapped D-branes. Some of the models described in the previous section have already been realized from wrapped D-brane configurations. Let us begin by considering type IIB string theory, and the case of three-branes wrapped on a collapsing three-cycle. In his study of singularities near conifold points of Calabi–Yau manifolds, Strominger required only a single massless BPS state wrapped on the vanishing cycle [21]. That there should be no bound states has been argued from a somewhat different approach in [22]. The geometry of interest is R × S 3 , where the S 3 shrinks to zero size. Clearly, the effective theory on R×S 3 is not N=4 Yang–Mills; such a theory would make little sense. Rather the world-volume theory of a D-brane on a curved space should be described by a topologically twisted theory [23,22]. As the size of the sphere shrinks, only the light degrees of freedom are relevant. The question, in this situation, then concerns the existence of a ground state in the theory obtained from the dimensional reduction of four-dimensional N=1 Yang–Mills, as first mentioned in [2]. We can also check the situation for type IIA, where we perform the same analysis for the case of two-branes wrapped on a vanishing two-cycle. The situation is exactly analogous to the case described above. The geometry is now R×S 2 , where the S 2 shrinks to zero size. The only difference involves the number of supersymmetries. The effective theory is now the reduction of N=1 Yang-Mills from six dimensions. Both models were explicitly described in the previous subsection. It seems plausible that other D-brane configurations will realize many, if not all, of the remaining models which we have discussed. 2.4. Gauge invariance. There is a rather nice feature of some of the computations that we shall describe that deserves a separate comment. 
Whether it provides a hint at how to formulate covariantly M-theory as a matrix model we leave to the judgement of the reader.² The gauge-fields in a quantum mechanical gauge theory are non-dynamical. They serve only to enforce the constraint that all states in the Hilbert space be gauge-invariant. How do we enforce such a constraint in the operator formulation? For very high temperatures, the partition function,
$$Z(\beta) = \int dx\ \mathrm{tr}\, e^{-\beta H}(x, x) \tag{2.13}$$
can be well-approximated by perturbation theory.

² When this section was originally written, the preceding comment seemed most appropriate. Subsequently, there has been an interesting proposal for a non-perturbative definition of the type IIB string, given in [24]. The high temperature limit of the partition function that we describe in this section reduces precisely to the model in that proposal. The relation between M(atrix) theory and the proposal in [24] seems to be in the spirit of a “T-duality” in the time direction.

The notation that we will use throughout the paper may be unfamiliar, and so deserves a comment: we will often consider traces of some operator, say $O$, which we will denote as,
$$\mathrm{tr}\, O(x, y),$$
where by $(x, y)$ we mean the usual propagation of a particle from point $y$ to point $x$. In an explicit basis of eigenfunctions, $\psi_n(x)$, with eigenvalue $\lambda_n$ for $O$, this expression takes the familiar form,
$$\sum_n \lambda_n \psi_n(x)\psi_n(y),$$
where $n$ may index a continuous parameter. However, in computing the partition function (2.13), it is inconvenient to try to trace over the gauge invariant spectrum of the Hamiltonian, i.e. states $|\psi(x)\rangle$ satisfying $C_A|\psi(x)\rangle = 0$. Our first task is then to implement the projection onto gauge invariant states explicitly, so we can trace over the full, unconstrained spectrum. The gauge constraints, $C_A$, split into two sets of $SU(N)$ generators: one generates rotations of the $x^i$, which we shall denote $C^b$, while the other, $C^f$, generates rotations of the fermions. Let us denote the operator generating a finite gauge transformation $g(t)$ on the fermions by $\Pi(g(t))$, where we shall drop the explicit dependence on $t$. To project onto gauge invariant states, we insert:
$$Z(\beta) = \int_{SU(N)} dt \int dx\ \mathrm{tr}\, e^{i t_A C_A} e^{-\beta H}(x, x) = \int_{SU(N)} dt \int dx\ \mathrm{tr}\, \Pi(g)\, e^{-\beta H}(gx, x), \tag{2.14}$$
where the measure for the $SU(N)$ integration is chosen so that $\int_{SU(N)} dt = 1$. The trace is now over the full Hilbert space, including gauge-variant states. For small $\beta$, we can now construct a reasonable approximation for the propagator,
$$e^{-\beta H}(x, y) = \frac{1}{(2\pi\beta)^{l/2}}\, e^{-\frac{|x-y|^2}{2\beta}}\, e^{-\beta V} e^{-\beta H_F} + \ldots, \tag{2.15}$$
where $l = d(N^2 - 1)$ is the dimension of the space of scalars. We shall describe this approximation in somewhat more detail in the following section. The fermion projection operator can be expressed as $\Pi(g(t)) = e^{i t_A C^f_A}$, which yields the expression,
$$Z(\beta) = \int_{SU(N)} dt \int dx\ \mathrm{tr}\, \frac{1}{(2\pi\beta)^{l/2}}\, e^{-\frac{|x-gx|^2}{2\beta}}\, e^{-\beta V} e^{-\beta H_F} e^{i t_A C^f_A} + \cdots.$$
As $\beta \to 0$, we see that the contribution from group elements away from the identity element is strongly suppressed. Indeed, we can then replace $g$ by $I + i\vec t\cdot\vec C^b$, and the exponential term involving $g$ becomes,
$$e^{-\frac{|i\vec t\cdot\vec C^b x|^2}{2\beta}}.$$
The term $i\vec t\cdot\vec C^b x$ is more transparent when written as $\frac{1}{N}\mathrm{Tr}[t, x^i]^2$, but this is precisely the form of a term in the potential energy, $V$. Indeed, in this limit, the gauge parameters combine exactly with the remaining coordinates to give a trace which is SO(d + 1) symmetric, rather than SO(d) symmetric. Even the fermion projection operator combines naturally with $H_F$ to give a complete symmetry between $x^i$ and $t$, in the computation of this trace. We shall put this symmetry to good use in subsequent computations. Note that for the case of zero-branes in type IIA, the partition function appears to arise from a manifestly SO(10) invariant Hamiltonian, without any hint of gauge constraints.
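The idea behind the projection in (2.14), averaging $e^{i t_A C_A}$ over the group with unit total measure, can be illustrated in the simplest abelian setting. The sketch below is not the $SU(N)$ computation of the text: it assumes a U(1) gauge group and states of definite integer charge, and the name `projection_weight` is a hypothetical label. Averaging the phase $e^{itq}$ over the circle gives one for the neutral state and zero for any charged state, which is exactly what a projector onto gauge invariant states should do.

```python
import cmath
import math


def projection_weight(charge, samples=64):
    """Average exp(i * t * charge) over the normalized U(1) measure dt / (2*pi),
    approximated by a uniform grid of `samples` angles (an illustrative choice)."""
    total = sum(
        cmath.exp(1j * (2.0 * math.pi * k / samples) * charge)
        for k in range(samples)
    )
    return total / samples


for q in (-2, -1, 0, 1, 2):
    # Only the neutral (gauge invariant) state survives the group average.
    print(q, round(abs(projection_weight(q)), 12))
```

For integer charges smaller in magnitude than the number of grid points, the uniform grid reproduces the continuum average up to rounding error, since the sampled phases are roots of unity.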
3. Counting Ground States
3.1. Defining the index. Ideally, to count the number of normalizable ground states for these models, we would like to compute the low temperature limit of the partition function,
$$\int dx \lim_{\beta\to\infty} \mathrm{tr}\, e^{-\beta H}(x, x).$$
Except for very simple systems, that computation is beyond reach. As usual, we are then interested in counting the number of $L^2$ ground states weighted by $(-1)^F$, where $F$ is the fermion number. Therefore, we wish to compute the index,
$$\mathrm{Ind} = \int dx \lim_{\beta\to\infty} \mathrm{tr}\, (-1)^F e^{-\beta H}(x, x) = n_B - n_F, \tag{3.1}$$
where the trace is over the gauge invariant spectrum of the Hamiltonian. Let us first note that the index is perfectly well-defined. The only way that the index (3.1) could not be counting the net number of ground states is if the Hamiltonian had an extremely pathological low-energy spectrum, i.e. if the density of states diverged badly as $E \to 0$. That is certainly not the case for the models we are studying. Whether the index is computable is another question entirely. The purpose of this section is to argue that our approach to computing the index actually counts the number of ground states. Before discussing the issues that arise in the non-Fredholm cases, let us discuss in some detail the situation where there is a gap in the spectrum. First, when the spectrum is actually discrete, the twisted partition function is $\beta$-independent. In these cases, we can compute the index in the $\beta \to 0$ limit, which reduces to a perturbative computation. The $\beta \to 0$ limit is what we will call the principal contribution to the index. Even in the case where the spectrum is discrete, the principal contribution, which is often computed as an integral over the coordinates, $x$, can be shifted to a boundary term. To see this, note that we should first perform all our analysis on a ball $B_R$, where $|x| < R$, and then take a limit $R \to \infty$. We can then write,
$$\mathrm{Ind} = \lim_{R\to\infty}\int_{|x|<R} dx \lim_{\beta\to\infty} \mathrm{tr}\, (-1)^F e^{-\beta H}(x, x) = \lim_{R\to\infty}\lim_{\beta_o\to 0}\int_{|x|<R} dx\, \Big\{ \mathrm{tr}\, (-1)^F e^{-\beta_o H}(x, x) + \int_{\beta_o}^{\infty} d\beta\, \frac{\partial}{\partial\beta}\, \mathrm{tr}\, (-1)^F e^{-\beta H}(x, x) \Big\}.$$
Now, computing $\partial_\beta$ of $\mathrm{tr}\, (-1)^F e^{-\beta H}$ brings down $H$, which we can replace by $Q^2$. When we try to run $Q$ around the trace,
$$\mathrm{tr}\, (-1)^F Q^2 e^{-\beta H} = -\mathrm{tr}\, Q (-1)^F Q e^{-\beta H} = -\mathrm{tr}\, (-1)^F Q^2 e^{-\beta H} - \frac{\partial}{\partial x^i}\, \mathrm{tr}\, e_i (-1)^F Q e^{-\beta H},$$
we find that the $\beta$ variation vanishes up to a total divergence. In this expression, $e_i \frac{\partial}{\partial x^i}$ is the derivative term in the supercharge, $Q$. If we define $e_n$ to be the fermion in the normal direction to the boundary, then the index can be written as a sum of two terms,
$$\mathrm{Ind} = \lim_{R\to\infty}\lim_{\beta_o\to 0} \Big\{ \int_{|x|<R} \mathrm{tr}\, (-1)^F e^{-\beta_o H} + \frac{1}{2}\int_{\beta_o}^{\infty} d\beta \int_{|x|=R} \mathrm{tr}\, e_n (-1)^F Q e^{-\beta H} \Big\}. \tag{3.2}$$
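The statement that the twisted partition function is $\beta$-independent when the spectrum is discrete can be illustrated with a finite-dimensional toy model. The sketch below is an assumption-laden illustration rather than any of the models in the text: it takes a hypothetical real matrix `Q_PLUS` mapping three bosonic states to two fermionic states, builds $Q$ in block off-diagonal form, sets $H = Q^2$, and evaluates $\mathrm{tr}\,(-1)^F e^{-\beta H}$ with a simple scaling-and-squaring Taylor exponential. All function and variable names are introduced here for illustration.

```python
def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm(m, squarings=12, terms=18):
    """exp(m) via a Taylor series on m / 2**squarings, then repeated squaring."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


# Hypothetical Q_+ : 3 bosonic states -> 2 fermionic states (rank 2).
Q_PLUS = [[1.0, 2.0, 0.0],
          [0.0, 1.0, 3.0]]
N_B, N_F = 3, 2


def supercharge():
    """Q = [[0, Q_+^T], [Q_+, 0]] in the basis (bosons, fermions)."""
    dim = N_B + N_F
    q = [[0.0] * dim for _ in range(dim)]
    for f in range(N_F):
        for b in range(N_B):
            q[N_B + f][b] = Q_PLUS[f][b]
            q[b][N_B + f] = Q_PLUS[f][b]
    return q


def twisted_trace(beta):
    """tr (-1)^F exp(-beta * Q^2), with (-1)^F = +1 on bosons and -1 on fermions."""
    q = supercharge()
    h = matmul(q, q)
    u = expm([[-beta * x for x in row] for row in h])
    bosonic = sum(u[i][i] for i in range(N_B))
    fermionic = sum(u[i][i] for i in range(N_B, N_B + N_F))
    return bosonic - fermionic


for beta in (0.05, 0.5, 2.0):
    print(beta, twisted_trace(beta))
```

Because the nonzero eigenvalues of $Q_+^T Q_+$ and $Q_+ Q_+^T$ coincide, their contributions cancel for every $\beta$ and only the single bosonic zero mode survives, so each printed value should be close to $n_B - n_F = 1$.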
At first sight, keeping track of these various limits may seem like a technicality; however, that is not the case. So, rather than continue a general discussion, let us revisit an old friend to see how these manipulations work concretely. A number of the difficulties that arise in the D-particle cases will become clearer.
3.2. The harmonic oscillator revisited. Let us consider a single supersymmetric harmonic oscillator, which has a unique ground state, and a discrete spectrum. The supercharge is given by,
$$Q = \psi_1 p + \psi_2 x,$$
where $\psi_1^2 = \psi_2^2 = 1$ and $\{\psi_1, \psi_2\} = 0$. The Hamiltonian is one-half the square of the supercharge,
$$H = \frac{1}{2} Q^2 = \frac{1}{2}(p^2 + x^2 - i\psi_1\psi_2).$$
Now to evaluate the principal term, we can consider the first term in (3.2), which we can write as,
$$\int_{-R}^{R} dx\, \frac{1}{\sqrt{2\pi\beta_o}}\, \mathrm{tr}\, i\psi_1\psi_2\, e^{-\frac{\beta_o}{2}(x^2 - i\psi_1\psi_2)} + \ldots,$$
where $(-1)^F = i\psi_1\psi_2$ in this case, and squares to the identity. The omitted terms are suppressed by powers of $\beta$. Evaluating the trace on the fermions, or in the equivalent path-integral language, integrating out the fermion zero modes, gives a leading contribution in $\beta_o$,
$$\int_{-R}^{R} dx\, \frac{1}{\sqrt{2\pi\beta_o}}\, \beta_o\, e^{-\beta_o x^2/2},$$
which gives,
$$\int_{-R\sqrt{\beta_o/2}}^{R\sqrt{\beta_o/2}} dx\, \frac{1}{\sqrt{\pi}}\, e^{-x^2}.$$
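The rescaled Gaussian integral above is the error function evaluated at $R\sqrt{\beta_o/2}$, which makes its dependence on the order of limits easy to see numerically. The sketch below compares a midpoint-rule evaluation of the unrescaled integral with the closed form, and then evaluates the closed form along two illustrative scalings of $\beta_o$ with $R$; the function names and the particular scalings $\beta_o = R^{-3}$ and $\beta_o = R^{-1}$ are choices made here for illustration.

```python
import math


def bulk_term(R, beta_o, steps=20000):
    """Midpoint-rule value of the integral over [-R, R] of
    sqrt(beta_o / (2*pi)) * exp(-beta_o * x**2 / 2)."""
    h = 2.0 * R / steps
    pref = math.sqrt(beta_o / (2.0 * math.pi))
    return h * sum(
        pref * math.exp(-0.5 * beta_o * (-R + (k + 0.5) * h) ** 2)
        for k in range(steps)
    )


def bulk_term_closed_form(R, beta_o):
    """The same integral after rescaling: erf(R * sqrt(beta_o / 2))."""
    return math.erf(R * math.sqrt(beta_o / 2.0))


# Quadrature and closed form agree at a sample point.
print(bulk_term(10.0, 0.1), bulk_term_closed_form(10.0, 0.1))

for R in (10.0, 100.0, 1000.0):
    fast = bulk_term_closed_form(R, R ** -3)  # beta_o -> 0 faster than R^-2
    slow = bulk_term_closed_form(R, R ** -1)  # beta_o -> 0 more slowly than R^-2
    print(R, fast, slow)
```

Along the faster scaling the value drifts toward zero as $R$ grows, while along the slower scaling it approaches one.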
If we take $\beta_o$ to zero faster than $R^{-2}$, this term vanishes, while if we take $\beta_o$ to zero more slowly, we obtain the expected answer of one. Whether or not we get a contribution from this term depends on how we choose to take $\beta_o$ to zero. When this term does not contribute, the second term in (3.2) contributes, and the principal contribution is shifted to a boundary term as we shall see. In this model, the principal contribution is the only contribution to the index. The boundary term gets two equal contributions from $R$ and $-R$ in this case, and so can be written,
$$\frac{1}{2}\int_{\beta_o}^{\infty} d\beta\ \mathrm{tr}\, (-i\psi_1)(i\psi_1\psi_2)\, Q\, e^{-\beta H}\Big|_{x=R}.$$
As $R$ becomes large, the potential term $e^{-\beta V}$ damps the kernel, $e^{-\beta H}$, for large $\beta$. We therefore do not need non-perturbative information about the kernel to evaluate this contribution; a small $\beta$ approximation suffices. Whenever there is a mass gap, we have this nice damping, which is the reason that the index is usually computable. In this case, evaluating the trace on fermions gives,
$$\int_{\beta_o}^{\infty} d\beta\ x\, \frac{1}{\sqrt{2\pi\beta}}\, e^{-\beta x^2/2}\Big|_{x=R},$$
and on rescaling, we obtain:
$$\int_{\beta_o R^2/2}^{\infty} d\beta\, \frac{1}{\sqrt{\pi\beta}}\, e^{-\beta}.$$
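This rescaled boundary integral is the complementary error function at $a = R\sqrt{\beta_o/2}$, so it adds to the bulk term $\mathrm{erf}(a)$ to give exactly one for any choice of $\beta_o$ and $R$. The sketch below checks this numerically. It substitutes $\beta = u^2$ to remove the integrable endpoint singularity and truncates the upper limit at a finite cutoff; both the substitution and the function name `boundary_term` are choices made here for illustration.

```python
import math


def boundary_term(R, beta_o, steps=50000, cutoff=12.0):
    """Midpoint-rule value of the integral of exp(-s) / sqrt(pi * s) over
    s >= beta_o * R**2 / 2, rewritten with s = u**2 as the integral of
    (2 / sqrt(pi)) * exp(-u**2) over u in [a, a + cutoff]."""
    a = R * math.sqrt(beta_o / 2.0)
    h = cutoff / steps
    pref = 2.0 / math.sqrt(math.pi)
    return h * sum(
        pref * math.exp(-(a + (k + 0.5) * h) ** 2) for k in range(steps)
    )


for R, beta_o in ((10.0, 1e-3), (10.0, 1e-2), (10.0, 1e-1)):
    a = R * math.sqrt(beta_o / 2.0)
    bulk = math.erf(a)                 # first term, evaluated in closed form
    edge = boundary_term(R, beta_o)    # second term, evaluated by quadrature
    print(beta_o, bulk, edge, bulk + edge)
```

The split between the two terms moves with $\beta_o R^2$, but the last printed column should stay at one up to quadrature error.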
Now we see that this term can contribute if $\beta_o$ is taken to zero sufficiently quickly with $R$. Of course, the $L^2$ index is one regardless of how fast or slowly we choose to take $\beta_o$ to zero. In the case of a model with potential $V$ homogeneous in $x$ with degree $k$, a similar argument can be applied. In that case, if we take $\beta_o$ to zero slower than $R^{-k}$ then the principal contribution is localized to the first term of (3.2), while if we take $\beta_o$ to zero faster than $R^{-k}$, the second term contributes. In the following section, we shall evaluate the principal contribution for the two D-particle case by letting $\beta_o$ go to zero more slowly than $R^{-4}$. This seems computationally simpler than trying to localize the contribution to the boundary.
3.3. Reducing the principal term to quadrature. To evaluate the principal contribution, we have to construct a reasonable approximation to $e^{-\beta H}$. We will not need to alter the usual perturbative construction of the partition function because of the flat points of the potential, $V$. We start by writing,
$$e^{-\beta H} = \frac{1}{2\pi i}\int_\gamma e^{-\beta z}\, \frac{1}{H - z}\, dz,$$
where $\gamma$ is a contour enclosing the spectrum of $H$. Let us consider the generic situation away from the flat points. We can approximate $(H - z)^{-1}$ by a perturbation series,
$$\frac{1}{H - z}(x, y) = \int \frac{e^{ik\cdot(x-y)}}{(k^2/2 + V - z)}\Big(1 - \frac{H_F}{(k^2/2 + V - z)} + \ldots\Big), \tag{3.3}$$
where the first correction, proportional to $H_F$, is shown, and subsequent terms are constructed iteratively in powers of $(k^2/2 + V - z)^{-1}$. The corresponding propagator takes the form,
$$e^{-\beta H}(x, y) = \frac{1}{(2\pi\beta)^{l/2}}\, e^{-\frac{|x-y|^2}{2\beta}}\, e^{-\beta V} e^{-\beta H_F} + \ldots, \tag{3.4}$$
where $l = d(N^2 - 1)$ is the dimension of the space of scalars. This approximation is reasonable for small $\beta$. The omitted terms which correct this approximation appear with a higher power of $(k^2/2 + V - z)^{-1}$, in (3.3), and consequently give rise to terms suppressed by powers of $\beta$ in (3.4). This approximation then suffices for evaluating the first term in (3.2), where we choose to take $\beta$ to zero more slowly than $R^{-4}$. Substituting the leading approximation for the propagator gives,
$$\lim_{R\to\infty}\lim_{\beta\to 0}\int_{|x|<R}\int dt\ \mathrm{tr}\, (-1)^F e^{-\beta H}\, \Pi(g)\, (gx, x)$$
the potential. As usual, the inclusion of $(-1)^F$ forces us to absorb fermion zero modes. In our trace, $(-1)^F$ is realized as the volume form for the Clifford algebra (2.5). There are two sources for fermions: the first is from $e^{-\beta H_F}$, while the second source is the fermion projection operator, $e^{i t_A C^f_A}$, inserted into the trace. After writing the fermion term, $i t_A C^f_A - \beta H_F$, as $\psi M \psi$ for some matrix $M$, the trace over the Clifford factors gives the Pfaffian of $M$, which is a polynomial in $x$ and $t$. On rescaling the integral, we obtain,
$$\int dt \int dx\, \frac{1}{(2\pi)^{l/2}}\, e^{-|i\vec t\cdot\vec C^b x|^2/2}\, e^{-V(x)}\, \mathrm{Pf}(M), \tag{3.5}$$
where the integration region for $t$ is now $\mathbb{R}^{N^2-1}$, while for $x$, the region is $\mathbb{R}^{d(N^2-1)}$. As will be clear from the subsequent explicit computation, the Pfaffian is of definite sign when $d$ is odd, and the integral is thus non-vanishing. However, it is far from clear that this term yields an integer, and indeed, it generally is not integral. Therefore, there had better be a non-vanishing correction term. We stress again that it is very natural to consider $t$ on equal footing with the coordinates $x^i$. Let us denote $t$ by $x^0$, and define $\gamma^0$ to be $iI$, where $I$ is the identity matrix. The coordinates $x^i_A$ now form an $(N^2 - 1) \times (d + 1)$ matrix. In this notation, the matrix takes the form $M = -(i/2) f_{ABC}\, x^i_B \gamma^i$, and the integral admits an SO(d + 1) symmetry which we shall use in section four to compute explicit values for this term in the two-particle case.
3.4. The non-Fredholm case. When the Hamiltonian under consideration has continuous spectrum, the twisted partition function is generally $\beta$-dependent. The heuristic reason for the $\beta$-dependence is that the density of states for the bosonic and fermionic scattering states can differ. Supersymmetry pairs bosonic and fermionic modes, but does not necessarily preserve the spectral density. In these cases, the principal contribution to the index is not necessarily integer, and there must be an additional contribution from the second term in (3.2). In the case where there is a mass gap, this contribution can be perturbatively evaluated. What happens in the case where there is no mass gap? Let us choose a real supercharge, $Q$, which squares to the Hamiltonian up to a gauge transformation. For this discussion, we will set the gauge constraints to zero. $Q$ is then a self-adjoint elliptic first-order operator, which anti-commutes with our $Z_2$ involution, $(-1)^F$. Let us define $Q_+$ as the restriction of $Q$ to the $+1$ eigenspace of $(-1)^F$, i.e. the bosonic states. It may be helpful to think of $Q$ as a matrix,
$$\begin{pmatrix} 0 & Q_+^* \\ Q_+ & 0 \end{pmatrix},$$
where $Q_+^* = Q_-$ is the restriction of $Q$ to fermionic states. The computation that we need to perform is the calculation of the $L^2$ index of $Q_+$. This operator, though elliptic, is not Fredholm. Recall that a Fredholm operator, by definition, has a finite-dimensional kernel and cokernel. The fact that the continuous spectrum of $H = Q^2$ contains scattering states with arbitrarily small energies implies that the image of $Q_+$ is not closed, and the cokernel is infinite-dimensional, and distinct from the kernel of $Q_-$. Hence, we take the $L^2$ index of $Q_+$ to be the dimension of $\mathrm{Ker}(Q_+) \cap L^2$ minus the dimension of $\mathrm{Ker}(Q_+^*) \cap L^2$, which is not, in this case, the dimension of the kernel minus the dimension of the cokernel of $Q_+$. Let us consider how to compute this index. Suppose that there exists a Green’s function, $G$, for $Q^2$; i.e.
a self-adjoint singular integral operator G which annihilates
the kernel of $Q^2$ and acts as $(Q^2)^{-1}$ on the orthogonal complement of the kernel. By definition, $G$ obeys,
$$Q^2 G = I - P,$$
where $P$ denotes the orthogonal projection onto the kernel of $Q^2$. So, $P$ then annihilates all states which are not zero-energy. We recall that by singular integral operator we mean an operator which is obtained by integrating against a matrix-valued kernel, $g(x, y)$, which has a well-understood singularity along the diagonal $x = y$. For example, the inverse of the Laplacian in three dimensions is the familiar kernel, $\sim |x - y|^{-1}$. Let us denote the restriction of $G$ to the $+1$ eigenspace of $(-1)^F$ by $G_+$, where:
$$G_+ = G(I + (-1)^F)/2.$$
Let $P_\pm$ denote the orthogonal projection onto the $L^2$ kernel of $Q_\pm$. Then
$$QQG_+ = (I - P)(I + (-1)^F)/2 = (I + (-1)^F)/2 - P_+,$$
and similarly, using the fact that $Q$ anticommutes with $(-1)^F$ and commutes with $G$, we have
$$QG_+Q = (I - (-1)^F)/2 - P_-.$$
Therefore, the index of $Q_+$ can be expressed as,
$$\begin{aligned} \mathrm{Ind}\, Q_+ &= \mathrm{tr} P_+ - \mathrm{tr} P_- \\ &= \mathrm{tr}[(I + (-1)^F)/2 - QQG_+] - \mathrm{tr}[(I - (-1)^F)/2 - QG_+Q] \\ &= \mathrm{tr}((-1)^F - [Q, QG_+]) \\ &= \mathrm{tr}((-1)^F + [Q, (-1)^F QG/2]). \end{aligned}$$
Of course, the difficulty is that we cannot construct $G$ explicitly; even the claim that $G$ is represented by a singular integral operator requires some justification, which we will give shortly. This difficulty can be summarized in the following way: given a set of eigenstates, $\psi_{E\alpha}$, with eigenvalue $E$ under $Q^2$, the inverse is formally,
$$(Q^2)^{-1} = \sum_{E\alpha,\, E>0} E^{-1}\, \psi_{E\alpha}(x)\psi_{E\alpha}(y),$$
where the sum may be over continuous indices. When the continuous spectrum is not bounded away from zero, it is not clear that the resulting sum converges to a function in any reasonable sense, since $G$ is an unbounded operator on $L^2$ wavefunctions in this case. However, if the scattering states do not pile up at low-energies, then it should be intuitively reasonable that $G$ is still nice, as, for example, in the free-particle case. We will see this later by realizing $G$ as a limit of bounded singular integral operators $G_w$. Physically, this limiting procedure is equivalent to adding a mass term to the propagator, and taking the limit where the mass vanishes. The problem we are describing is a common one in any theory with massless particles. So, let us proceed along the usual path by constructing an explicit approximation $W$ to $G$ for which we can compute the trace, $\mathrm{tr}((-1)^F + [Q, (-1)^F QW/2])$. We must then verify that for a carefully constructed $W$ this trace is the same as the one computing the index of $Q_+$. Our approximation will have the property that,
$$Q^2 W = I - E,$$
for some compact error term $E$ given by integrating against a matrix-valued kernel, $e(x, y)$. We also want $e(x, y)$ to decay polynomially in $(|x| + |y|)$ to a sufficiently high power which we need to determine, and study in greater detail later.
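The last equality in the expression for the index of $Q_+$ above compresses a short algebraic step. It can be spelled out as follows, assuming (in addition to the stated properties that $Q$ anticommutes with $(-1)^F$ and commutes with $G$) that $G$ commutes with $(-1)^F$, as one expects for the inverse of $Q^2$ on the complement of its kernel:

```latex
\begin{align*}
[Q, QG_+]
  &= \tfrac{1}{2}[Q, QG] + \tfrac{1}{2}[Q, QG(-1)^F] \\
  &= \tfrac{1}{2}[Q, QG(-1)^F]
     && \text{since } [Q, QG] = Q(QG - GQ) = 0, \\
  &= -\tfrac{1}{2}[Q, (-1)^F QG]
     && \text{since } QG(-1)^F = -(-1)^F QG .
\end{align*}
```

Combined with $\mathrm{tr}[(I + (-1)^F)/2] - \mathrm{tr}[(I - (-1)^F)/2] = \mathrm{tr}\,(-1)^F$, this gives $\mathrm{tr}((-1)^F - [Q, QG_+]) = \mathrm{tr}((-1)^F + [Q, (-1)^F QG/2])$.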
Let us describe what data is needed in order to ensure that $\mathrm{tr}((-1)^F + [Q, (-1)^F QW/2])$ computes the index. This is equivalent to the vanishing of $\mathrm{tr}[Q, (-1)^F Q(G - W)/2]$, which is the difference between $\mathrm{tr}((-1)^F + [Q, (-1)^F QG/2])$, which computes the index, and $\mathrm{tr}((-1)^F + [Q, (-1)^F QW/2])$, which we hope computes the index. For any integer $m$, the operator $W$ can be constructed so that the kernel for $G - W$ has $m$ continuous derivatives. Let $\chi_R$ be the characteristic function of a ball $B_R$ of radius $R$. The characteristic function is defined to be one on the ball and zero elsewhere. We will throw $\chi_R$ into the various traces to serve as a cut-off on the infra-red physics. By a similar argument to the one used in Sect. 3.1, we may use the divergence theorem to transform $\mathrm{tr}\,\chi_R [Q, (-1)^F Q(G - W)/2]$ into an integral over the boundary of $B_R$. Therefore, if we can show that $(-1)^F Q(G - W)(x, x)$ is decaying sufficiently rapidly at large $|x| = R$, we deduce that $\mathrm{tr}\,\chi_R [Q, (-1)^F Q(G - W)/2]$ converges to zero, and the index can be computed by replacing $G$ by our approximation, $W$.

To prove that $Q(G - W)$ decays sufficiently quickly, we will need to use an argument that may be unfamiliar to the reader, which establishes a correspondence between asymptotic estimates for $Q^2$ and decay rates for solutions $\psi$ to $Q^2\psi = F$, where $F$ satisfies some growth constraint. In order to orient the reader, let us first examine what the argument says in a much simpler case. Consider the differential equation in one variable $r$, on $(0, \infty)$,
\[ \left(-\frac{d^2}{dr^2} + w^2\right) f = g, \]
where $w$ is some constant. For this equation, a weak form of our general argument below says that if $e^{awr} g \in L^2$ for some $a \in (-1, 1)$, then we may conclude that $e^{awr} f$ is normalizable, if $f$ satisfies the growth constraint $e^{-bwr} f \in L^2$ for some $b < 1$. The condition on the growth of $f$ is clearly necessary to rule out the addition of the non-normalizable $e^{rw}$ to any solution. In this simple case, we can prove this result by a direct integration.
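As a concrete illustration of the one-variable statement above, the following minimal sketch takes the source term $g = e^{-cwr}$ with $0 < c < 1$ (an assumed example, not taken from the text). The function names are hypothetical. It checks by finite differences that $f_p = e^{-cwr}/(w^2(1 - c^2))$ solves the equation, and that $e^{awr} f_p$ has a finite $L^2$ norm for $a < c$; the growth constraint on $f$ is what excludes adding a multiple of $e^{wr}$.

```python
import math


def particular_solution(r, w, c):
    """f_p(r) = exp(-c*w*r) / (w^2 (1 - c^2)); solves (-d^2/dr^2 + w^2) f = exp(-c*w*r)."""
    return math.exp(-c * w * r) / (w * w * (1.0 - c * c))


def residual(r, w, c, h=1e-3):
    """Central finite-difference residual of (-f'' + w^2 f) - g at the point r."""
    f = lambda s: particular_solution(s, w, c)
    second = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    g = math.exp(-c * w * r)
    return -second + w * w * f(r) - g


def weighted_norm_sq(w, c, a, upper=60.0, n=60000):
    """Midpoint-rule estimate of the integral of |exp(a*w*r) f_p(r)|^2 over (0, upper)."""
    step = upper / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        total += (math.exp(a * w * r) * particular_solution(r, w, c)) ** 2
    return total * step


def exact_norm_sq(w, c, a):
    """Closed form of the same integral over (0, infinity), valid for a < c."""
    return 1.0 / (2.0 * (c - a) * w * (w * w * (1.0 - c * c)) ** 2)
```

For $a \geq c$ the weighted integral diverges, which matches the requirement that the weight on $f$ be no stronger than the weight already controlled on $g$.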
We will use the following analogous result, which is established in much greater generality in [25]. Suppose that for some positive constant $c$ and some compact set $K$, we have an estimate of the following form:
\[ \|Qf\|^2 \geq \|cf/r\|^2, \]
for all wavefunctions, $f$, which vanish on $K$. With these assumptions, if $Q^2 F = e$, with $r^c e \in L^2$, and if $F$ satisfies the growth constraint that $F/r^{c-s}$ be normalizable for some positive $s$, then
\[ \frac{r^{c-1} F}{\ln(r)^{1+\epsilon}} \in L^2, \]
for all positive $\epsilon$. Also,
\[ \frac{r^{c}\, QF}{\ln(r)^{\epsilon}} \in L^2, \]
for all positive $\epsilon$. In the familiar case where we have $c$ instead of $c/r$, we would obtain exponential decay as in the one-dimensional example. The reason we have a weaker decay rate in this example is that the decay is roughly no worse than $e^{-\psi}$, where $\psi$ is a function with $|d\psi|^2$ less than the asymptotic lower bound for $Q^2$, which is $c^2/r^2$ in this case, and $w^2$ in the one-dimensional example.
For purposes of illustration, let us sketch a proof of these growth estimates. For any function $u$ supported in the complement of $K$, integration by parts yields
\[ 0 = (Q^2 F, u^2 F) = \|QuF\|^2 - \|[Q, u]F\|^2. \]
The assumed estimate implies that
\[ \|cuF/r\|^2 \leq \|[Q, u]F\|^2. \]
From this formal inequality, we can deduce the normalizability of $cuF/r$ when $cu/r > |[Q, u]|$. More accurately, we can deduce the integrability of $(c^2u^2/r^2 - |[Q, u]|^2)F^2$. Let us define a cutoff function $\rho$ which is identically one outside a large ball $K'$ containing $K$ in its interior, and vanishing in $K$. Then taking $u$ to be $r^c/\ln(r)^{\epsilon}$ times the cutoff function $\rho$ gives formally
\[ \|c\,r^{c-1}F/\ln(r)^{\epsilon}\,\rho\|^2 \leq \|(r^{c-1}(c - \epsilon/\ln(r))\rho + r^c\rho')/\ln(r)^{\epsilon}\,F\|^2. \]
Collecting terms, we obtain:
\[ \int r^{2c-2}\,\epsilon\,(2c - \epsilon/\ln(r))\,|F|^2/\ln(r)^{1+2\epsilon}\,\rho^2 \leq c_{K'}\int_{K'} |F|^2, \]
for some constant $c_{K'}$, which depends on $K'$ and $c$. A limiting argument, approximating $r$ by a sequence of bounded functions, can be used to obtain from this formal inequality the boundedness of the left side, when $|F|$ is bounded on compact sets.

We will need the following variant of this inequality. Suppose that
\[ \tilde{Q}^2 = -\frac{\partial^2}{\partial r^2} - \frac{(2d-1)}{r}\frac{\partial}{\partial r} + w, \]
where $w$ is now a positive operator with $w \geq c^2/r^2$. Suppose that $\tilde{Q}^2 F = 0$ in the complement of a compact set $K$. For some $k$ to be determined, consider the point where $r^k F$ is maximum. By the maximum principle, we have at the maximum that $F' = -kF/r$ and $(r^k F)'' \leq 0$, where:
\[ \begin{aligned}
(r^kF)'' &= k(k-1)r^{k-2}F + 2kr^{k-1}F' + r^kF'' \\
&\geq k(k-1)r^{k-2}F - 2k^2r^{k-2}F - (2d-1)r^{k-1}F' + r^{k-2}c^2F \\
&= (2d - k - 2 + c^2/k)\,k\,r^{k-2}F.
\end{aligned} \]
If we choose $0 < k < 2(d-1)$, we may deduce that if the maximum exists it must occur in $K$. If $F r^a$ is bounded for any positive $a$, we may deduce that $F r^k$ is in fact bounded by its values on $K$ for all $k$ with $k < 2d + c^2/k - 2$. In our applications, $c^2$ will usually be the first or second eigenvalue of the standard spherical Laplacian, which is $0$ or $(d-1)$. We recall that the standard Laplace operator on $S^{d-1}$ has eigenvalues $k(d + k - 2)$, with $k = 0, 1, 2, \ldots$, where the multiplicity of each eigenvalue is
\[ \frac{(2k+d-2)\,(k+d-3)!}{k!\,(d-2)!}. \]
In the first case, we see that we get $r^{d-2}$ decay; in the second we see that for $d$ at least 3, we get faster than $r^{2-2d}$ decay. These estimates extend easily to the case when $\tilde{Q}^2 F = e$, where $e$ satisfies the condition that $r^{2d} e$ is bounded. We will apply this to the case when $\tilde{Q}^2 = Q_x^2 + Q_y^2$, the Hamiltonian in the first and second variables, and $F$ will be a kernel constructed from the difference between the Green's function and $W$. This argument formalizes the observation that the product of two elements of the kernel of $H$ should decay twice as fast as a single element of the kernel, and $G - W$ looks like such a product.

We will show that
\[ Q^2 = \Delta_r + u/r^2, \]
acting on wavefunctions supported outside a large compact set; here $r$ is the distance along the flat direction, $\Delta_r$ is the radial part of the Euclidean Laplacian in the flat directions, and the operator $u$ is semi-positive with first eigenvalue greater than the first or second eigenvalue of the spherical Laplacian. We can then apply the above argument to deduce that $G - W$ decays like $r^{2-2d}$. We can improve this estimate by observing that we get the second eigenvalue when the operator is restricted to wavefunctions with odd parity, and the first eigenvalue for the restriction to even parity wavefunctions. Split $(G - W)$ into its even and odd components. The odd component of $(G - W)$ decays faster than $r^{1-d}$ by the preceding discussion, and it is easy to see that applying $Q$ only improves the decay rate. To see this, use the inequality for $Q^2 F = 0$,
\[ \|wQF\| \leq 2\|[Q, w]F\|, \]
and choose $w$ appropriately. For the even component of $G - W$, we have that its image under $Q$ is odd; therefore, we again have the improved decay rate determined by the second eigenvalue of the sphere. This estimate requires that $(G - W)r^a$ be bounded for some positive $a$ and that $G$ be given as a singular integral operator. The last condition is needed to ensure that $(G - W)$ is a smooth function.

We will not prove these results in detail but will merely sketch how they follow from the same sequence of ideas we have introduced. One considers first, instead of $G$ and $W$, corresponding operators $G_w$ and $W_w$, where $G_w$ is $(Q^2 + w)^{-1}$ restricted to the orthogonal complement of the kernel of $Q^2$, etc. It is easy to get the desired initial $r^a$ boundedness for $G_w - W_w$ for each $w > 0$. One can then get the desired growth bound; i.e. we show that the maximum is controlled by an estimate on the compact set. We then allow $w$ to tend to zero to get the desired result for $G - W = \lim_{w\to 0}(G_w - W_w)$.
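The maximum-principle estimate above rests on two small pieces of arithmetic: the spectrum of the Laplacian on $S^{d-1}$, and the coefficient of $r^{k-2}F$ in the lower bound for $(r^kF)''$. The sketch below tabulates both with exact integers and fractions; the function names are illustrative only, and the multiplicity is written in its standard form with $(d-2)!$ in the denominator.

```python
from fractions import Fraction
from math import factorial


def sphere_eigenvalue(k, d):
    """k-th eigenvalue k(d + k - 2) of the Laplacian on S^(d-1)."""
    return k * (d + k - 2)


def sphere_multiplicity(k, d):
    """Multiplicity (2k+d-2)(k+d-3)! / (k! (d-2)!) of the k-th eigenvalue on S^(d-1)."""
    if k == 0:
        return 1  # constants; avoids factorial(-1) when d = 2
    return (2 * k + d - 2) * factorial(k + d - 3) // (factorial(k) * factorial(d - 2))


def max_principle_coefficient(k, d, c2):
    """Coefficient of r^(k-2) F after substituting F' = -kF/r and
    F'' >= -((2d-1)/r) F' + (c2/r^2) F into (r^k F)''."""
    k = Fraction(k)
    return k * (k - 1) - 2 * k * k + (2 * d - 1) * k + c2


def closed_form_coefficient(k, d, c2):
    """The same coefficient in the factored form (2d - k - 2 + c2/k) k."""
    k = Fraction(k)
    return (2 * d - k - 2 + Fraction(c2) / k) * k
```

For $0 < k < 2(d-1)$ and $c^2 \geq 0$ the coefficient is strictly positive, which contradicts $(r^kF)'' \leq 0$ at an exterior maximum where $F > 0$; this is the step that forces the maximum into $K$.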
This type of argument can also be used to show that $G$ is a singular integral operator. This then establishes the desired decay estimate given, one: a construction of $W$ which leads to sufficiently small $E$, that is, $|e(x, y)| \leq (|x| + |y|)^{-d-1}$; and two: a demonstration of the claimed asymptotic lower bound for $Q^2$. Under these two conditions, $\mathrm{tr}((-1)^F + [Q, (-1)^F QW/2])$ computes the index.

We now turn to the evaluation of $[Q, (-1)^F QW/2]$, which we need to boil the problem down to a concrete computation. It will be convenient to arrange the construction of $W$ so that on a large compact set, say a ball $B_R$, its contribution to the index can be computed by the standard principal term computation described in the previous subsection. On the complement of this set we will need to use special coordinates to find a nice expression for $W$. This requires us to define an approximation $A$ to $QG$ rather than the approximation $W$ to $G$, but since, off a compact set, $A$ will clearly be of the form $QW$, this will not affect our preceding discussion.

The use of two separate constructions for $W$ in different regions may seem, perhaps, a bit unnatural when dealing with Euclidean space. It is forced on us, in part, by the need to obtain very good control of the error term $E$ as $|x|$ tends to $\infty$. This rules out the use of the local computation used in $B_R$ and described previously. Moreover, the special coordinates we use in the complement of the compact set, like polar coordinates, become singular at the origin. Therefore, we will need to use two sets of cutoff functions to patch the two approximate inverses together.
Let $\rho_{n,j}(x)$ be a sequence of cutoff functions which approach $\chi_{jR}(x)$, and set $\rho_n = \rho_{n,1}$. Let $W'$ be our approximate Green's function near $\infty$, which we will construct in section five. The operator $\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2}(x, y)$ is the standard kernel that we will use in $B_R$. We can create a global approximation to $QG$ by defining:
\[ A(x, y) = \rho_n(x)\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2}(x, y)\,\rho_{n,2}(y) + (1 - \rho_n(x))\,QW'(x, y)\,(1 - \rho_{n,1/2}(y)). \]
The cutoff functions on the left of each operator are inserted to average the two operators. The cutoff functions on the right, however, are inserted so that the operators are localized to the domains where they are well-defined, and satisfy the desired estimates. The right cutoffs are of course chosen to be identically one on the support of the left cutoffs; otherwise, they would destroy the averaging effected by the left cutoffs. This is the reason for the second index on $\rho$. Then to evaluate $[(-1)^F A, Q]$, we write:
\[ \begin{aligned}
[(-1)^F A, Q] ={}& [Q, \rho_n](-1)^F\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2}\rho_{n,2} - [Q, \rho_n](-1)^F QW'(1 - \rho_{n,1/2}) \\
& - \rho_n(-1)^F(I - e^{-\beta_0 Q^2})\rho_{n,2} + (1 - \rho_n)(-1)^F(I - E)(1 - \rho_{n,1/2}) \\
& + (1 - \rho_n)(-1)^F Q[Q, W'](1 - \rho_{n,1/2}) + \ldots \\
={}& [Q, \rho_n]\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2} - [Q, \rho_n](-1)^F QW' \\
& - (-1)^F(I - \rho_n e^{-\beta_0 Q^2}) + (1 - \rho_n)(-1)^F(-E) \\
& + (1 - \rho_n)(-1)^F Q[Q, W'] + \ldots,
\end{aligned} \]
where the omitted terms are terms that trace to zero. We will construct $W'$ in section five so that the trace of $(1 - \rho_n)(-1)^F(-E) + (1 - \rho_n)(-1)^F Q[Q, W']$ will tend to zero as $n$ tends to $\infty$. Thus, subtracting off the $(-1)^F I$ term, we are left to compute
\[ \mathrm{tr}\left( [Q, \rho_n]\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2} - [Q, \rho_n](-1)^F QW' + (-1)^F\rho_n e^{-\beta_0 Q^2} \right). \]
The last term is the principal term, which is given by evaluating the integral (3.5). Taking the limit as $n$ tends to $\infty$, the two commutator terms converge to the boundary traces,
\[ \int_{|x|=R}\left( \mathrm{tr}\, e_n\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2} - \mathrm{tr}\, e_n(-1)^F QW' \right). \]
Choosing $\beta_0$ to go to zero more slowly than $R^{-4}$, the integral,
\[ \int_{|x|=R} \mathrm{tr}\, e_n\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2}, \]
decomposes into two pieces. One is associated with a small neighborhood of the flat regions, which consist, say, of all points of distance at most one from a flat point. The other contribution is associated with the complementary region, i.e. almost all of the sphere of radius R. It is not difficult to show that the contribution from the flat region is squeezed to zero as R tends to ∞. The contribution from the complementary region is not
vanishing. Using similar arguments to those presented earlier in this section, it is possible to show without extensive computation that this term exactly cancels the principal term. Standard constructions give a $W'$ whose contribution from the complementary region of $\mathrm{tr}\, e_n(-1)^F QW'$ also exactly cancels the contribution of $\int_{|x|=R}\mathrm{tr}\, e_n\int_0^{\beta_0} d\beta\, Qe^{-\beta Q^2}$. Therefore, the total contribution of this boundary integral to the index comes from
\[ -\frac{1}{2}\int_{N_F(R)} \mathrm{tr}\, e_n(-1)^F QW', \tag{3.6} \]
where $N_F(R)$ is a small neighborhood of the flat points on the boundary of the space, which is a sphere of radius $R$. We will show in section five that this integral converges to $-1/4$ in the two-particle case. In summary, the additional contribution to the index from the boundary is computed by evaluating (3.6), which is localized to the flat directions. The sum of (3.6) and the usual principal term from (3.5) must then be an integer.
4. Two-Particle Binding

4.1. Symmetries and pfaffians. The simplest case to consider is $N = 2$; already, however, the integral (3.5) is thirty-dimensional for type IIA zero-branes! We shall have to use the various symmetries available to us to simplify the computation of the principal term. Recall that the coordinates $x^i_A$ now form a $3 \times (d+1)$ matrix. The integral (3.5) is invariant under the symmetry
\[ x \to gxh, \]
where $g$ is an element of $SO(3)$ acting on the left while $h$ is an element of $SO(d+1)$ acting from the right on $x$. Note that the left action is a gauge transformation. By using these symmetries, we shall rotate $x$ into a special form,
\[ \begin{pmatrix} b_1 & 0 & 0 & \cdots & 0 \\ 0 & b_2 & 0 & \cdots & 0 \\ 0 & 0 & b_3 & \cdots & 0 \end{pmatrix}, \tag{4.1} \]
and reduce our integral to one over only three variables. In these special coordinates, the potential (including the gauge parameters) takes the special form
\[ V_e = -\frac{1}{2}\left(b_1^2b_2^2 + b_1^2b_3^2 + b_2^2b_3^2\right). \]
We shall now evaluate $\mathrm{Pf}(M)$ by evaluating the determinant of $M$. For the moment, let us return to general coordinates where $M = -(i/2)f_{ABC}\,x^i_B\gamma^i$. For convenience, let us denote $x^i_A\gamma^i$ by $x_A$. The matrix then takes the form
\[ M = \frac{i}{2}\begin{pmatrix} 0 & x_3 & -x_2 \\ -x_3 & 0 & x_1 \\ x_2 & -x_1 & 0 \end{pmatrix}. \]
By row manipulations, or equivalently, by studying the eigenvalue equation, we find that the determinant can be expressed as
\[ \det(M) = \frac{1}{2^{3(d-1)}}\det(x_1x_2x_3)\,\det\!\left(1 - x_1x_2^{-1}x_3x_1^{-1}x_2x_3^{-1}\right), \]
where $x_A^{-1} = (x_A - 2i x_A^0 I)/|x_A|^2$. After rotating $x$ into our convenient set of coordinates (4.1), we can compute this determinant, and on taking the square root obtain
\[ \mathrm{Pf}(M) = \frac{1}{2^{2(d-1)}}(b_1b_2b_3)^{d-1}. \]
We can immediately see that when $d$ is odd, the Pfaffian is an even function of the variables in both special and general coordinates, and the corresponding integral (3.5) is non-vanishing. The last ingredient that we require to compute the integral (3.5) is the measure for our simplified coordinates (4.1). We shall obtain the measure by gauge-fixing the integral (3.5) using the Faddeev-Popov approach. Let us take $x'$ to be
\[ x' = \begin{pmatrix} b_1' & 0 & 0 & \cdots & 0 \\ 0 & b_2' & 0 & \cdots & 0 \\ 0 & 0 & b_3' & \cdots & 0 \end{pmatrix}. \]
We shall insert one into the integral (3.5) in the form
\[ \int db'\,dg\,dh\ \delta(x' - gxh)\,f(b), \]
where we have to determine $f(b)$. For some $g_0$ and $h_0$, $x_0 = g_0xh_0$ takes the form (4.1). The integrals over $g$ and $h$ then reduce to integrals in a small neighborhood of $g_0$ and $h_0$, with the exception of the $SO(d-2)$ subgroup of $SO(d+1)$ that leaves the form (4.1) invariant. If $T$ is a generator for the left $SO(3)$ action, and $R$ a generator of the right $SO(d+1)$ action which does not leave (4.1) invariant, then we can replace integration over $g, h$ by
\[ \eta(d)\,\mathrm{vol}(SO(d-2))\int db'\,dT\,dR\ \delta(x' - x_0 - Tx_0 - Rx_0)\,f(b). \]
The remaining integrals are straightforward, and we find that
\[ f(b) = \frac{1}{\eta(d)\,\mathrm{vol}(SO(d-2))}\,(b_1b_2b_3)^{d-2}\,|(b_1^2 - b_2^2)(b_1^2 - b_3^2)(b_2^2 - b_3^2)|. \]
The integrals over $b$ are constrained such that $b_1 > b_2 > b_3$. The symmetry factor $\eta(d)$ is 4 for $d > 2$, but is 2 for $d = 2$ because the left and right symmetry groups are then both $SO(3)$. The value of the symmetry factor can also be checked by computing a Gaussian integral in the $d$-dimensional model, and comparing the result to the answer obtained using the measure for these special coordinates. Finally, inserting one in this form into the integral (3.5), and integrating over $x, g, h$ gives for $d > 2$,
\[ \frac{\mathrm{vol}(SO(d+1))\,\mathrm{vol}(SO(3))}{\mathrm{vol}(SO(3))}\,\frac{1}{(2\pi)^{3d/2}}\,\frac{1}{4\,\mathrm{vol}(SO(d-2))}\int db\,(b_1b_2b_3)^{d-2}\,|(b_1^2 - b_2^2)(b_1^2 - b_3^2)(b_2^2 - b_3^2)|\,\frac{1}{2^{2(d-1)}}(b_1b_2b_3)^{d-1}e^{V_e}. \]
The first factor of $1/\mathrm{vol}(SO(3))$ comes from the normalization of the integration over the gauge group, where we recall that we chose the normalization so that $\int_{SU(2)} dt = 1$ prior to rescaling.
4.2. Computing the principal contribution. Now there is a nice change of variables that will allow us to evaluate this integral. Set $y_1 = b_2b_3/\sqrt{2}$, $y_2 = b_1b_3/\sqrt{2}$, $y_3 = b_1b_2/\sqrt{2}$, and the integral becomes
\[ \begin{aligned}
&\int db\,(b_1b_2b_3)^{d-2}\,|(b_1^2 - b_2^2)(b_1^2 - b_3^2)(b_2^2 - b_3^2)|\,\frac{1}{(2\pi)^{3d/2}}\,\frac{1}{2^{2(d-1)}}(b_1b_2b_3)^{d-1}e^{V_e} \\
&\quad= \frac{1}{2^{2(d-1)}}\,\frac{1}{2}\,\frac{1}{\pi^{3d/2}}\int dy\,(y_1y_2y_3)^{d-3}\,|(y_1^2 - y_2^2)(y_1^2 - y_3^2)(y_2^2 - y_3^2)|\,e^{-(y_1^2 + y_2^2 + y_3^2)} \\
&\quad= \frac{1}{2}\,\frac{1}{2^{2(d-1)}}\,\frac{1}{\pi^{3d/2}}\,\frac{\eta(d-1)\,\mathrm{vol}(SO(d-3))}{\mathrm{vol}(SO(d))\,\mathrm{vol}(SO(3))}\int_{R^{3d}} dx\,e^{-|x|^2} \\
&\quad= \frac{1}{2^{2(d-1)}}\,\frac{1}{2}\,\frac{\eta(d-1)\,\mathrm{vol}(SO(d-3))}{\mathrm{vol}(SO(d))\,\mathrm{vol}(SO(3))}.
\end{aligned} \]
Lastly, we must multiply the result by the value of $\mathrm{tr}(I)$ from the trace over the fermions, which gives an extra factor $2^{3(d-1)}$. The net result is the formula:
\[ P = 2^{d-2}\,\frac{\eta(d-1)}{\eta(d)}\,\frac{\mathrm{vol}(SO(d+1))\,\mathrm{vol}(SO(d-3))}{\mathrm{vol}(SO(d-2))\,\mathrm{vol}(SO(d))\,\mathrm{vol}(SO(3))}, \tag{4.2} \]
for the principal contribution, $P$, for $d$ odd, where we recall that $\mathrm{vol}(SO(n)) = \mathrm{vol}(S^{n-1})\cdots\mathrm{vol}(S^1)$, and that $\mathrm{vol}(S^n) = 2\pi^{(n+1)/2}/\Gamma(\frac{n+1}{2})$. Let us conclude this discussion by listing the explicit values in the following table:
Table 1. The principal contribution to the index

Dimension    Principal contribution
3            1/4
5            1/4
9            5/4
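The entries of Table 1 can be reproduced numerically from formula (4.2). The sketch below is a direct transcription under one stated assumption: $\mathrm{vol}(SO(0))$ and $\mathrm{vol}(SO(1))$, which appear for $d = 3$, are taken to be 1 (the empty product of sphere volumes). The function names are illustrative.

```python
import math


def vol_sphere(n):
    """vol(S^n) = 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2.0 * math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0)


def vol_so(n):
    """vol(SO(n)) = vol(S^(n-1)) ... vol(S^1); assumed to be 1 for n = 0, 1."""
    volume = 1.0
    for k in range(1, n):
        volume *= vol_sphere(k)
    return volume


def eta(d):
    """Symmetry factor: 4 for d > 2 and 2 for d = 2."""
    return 2 if d == 2 else 4


def principal_contribution(d):
    """Formula (4.2) for the principal contribution P, for odd d >= 3."""
    ratio = (vol_so(d + 1) * vol_so(d - 3)) / (vol_so(d - 2) * vol_so(d) * vol_so(3))
    return 2.0 ** (d - 2) * (eta(d - 1) / eta(d)) * ratio


if __name__ == "__main__":
    for d in (3, 5, 9):
        print(d, principal_contribution(d))
```

Most sphere volumes cancel between numerator and denominator, so the result reduces to $\mathrm{vol}(S^d)$, $\mathrm{vol}(S^{d-3})$ and $\mathrm{vol}(SO(3)) = 8\pi^2$, which is why the values come out as simple rationals.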
5. The Propagator for Well-Separated Branes

5.1. Some general comments. We have determined in the previous section that the principal contribution to the index is fractional. Since the index must be an integer, there is a missing contribution. The manner in which this contribution arises is quite surprising, and involves a bizarre conspiracy of cancellations. Let us outline the procedure we will follow before presenting a detailed discussion. We will construct an approximation to the propagator for the two zero-branes when they are far apart. The approximation will be sufficiently good in the sense that any corrections will not contribute to the boundary term (3.6).

At long distances, the only states that make a sizable contribution to the propagator are those localized along the flat directions of the potential. The simplest approximate description of the physics governing the light degrees of freedom is in terms of free particle propagation along the flat directions. This is immediately modified when we try to "integrate" out the massive modes. Let us label coordinates for the $d_a$ massive directions by $y$. Then for example, the action of $\partial_r^2$, which is part of the Laplacian (2.3), on the wavefunction for the flat direction is modified because of its action on the harmonic oscillator ground state,
\[ \partial_r^2\left(\frac{r}{\pi}\right)^{d_a/4}e^{-r|y|^2/2} = \left(\frac{r}{\pi}\right)^{d_a/4}e^{-r|y|^2/2}\left[\partial_r^2 + \left(\frac{d_a}{2r} - |y|^2\right)\partial_r + \frac{d_a(d_a - 4)}{16r^2} - \frac{d_a}{4r}|y|^2 + \frac{|y|^4}{4}\right]. \]
Each of the terms appearing on the right is of order $1/r^2$, and so none can a priori be neglected. In a similar way, the rest of the terms in the Hamiltonian modify the long distance behavior. This includes the $O(y^4)$ terms in the potential, which we recall is of the form $V \sim r^2|y|^2/2 + O(y^4)$. Since we need to include the $O(y^4)$ terms, this approximation is not one-loop in the usual sense. We will need to sum up all the corrections to free propagation, which are of order $1/r^2$, and surprisingly, they all cancel. The remaining index computation then involves free particle propagation on the moduli space, which is $R^{(d_c - 2)}/Z_2$. From this computation, we will recover the needed corrections to the index.

The construction that we shall describe is to be contrasted with the kind of effective action for the light modes that has been obtained using large-distance low-velocity expansions [7,11]. After integrating out the massive modes in a one-loop approximation, the leading correction to the effective Lagrangian at large $r$ is a term of order $\sim v^4/r^7$ for the case of the nine-dimensional model. The connection between that approach and the computations that we shall describe does not seem transparent. It seems possible that exploring the connection in detail will give insight, and dare we hope a proof, of the desired non-renormalization theorem for the $F^4$ term. Computing the leading corrections to the three and five-dimensional models also seems an interesting question. Since the amount of supersymmetry is reduced, we might suspect that there is a correction to the metric on the moduli space. However, in constructing the propagator using this approach, we do not find any fundamental difference between the three cases.
Let us start by discussing the form of the various operators that we need to study in special coordinates. Without rotating to a convenient set of coordinates, it will be very difficult to say anything about the structure of the partition function. We can rotate our coordinates, $x$, into a convenient basis by using a combination of gauge and flavor symmetries. So, we can choose a basis, $x = k\Lambda q$, with $k \in SO(3)$, $q \in SO(d)$ and $\Lambda$ the following $3 \times d$ matrix,
\[ \Lambda = \begin{pmatrix} r & 0 & \cdots & 0 \\ 0 & y_2^2 & \cdots & y_2^d \\ 0 & y_3^2 & \cdots & y_3^d \end{pmatrix}. \tag{5.1} \]
We have set $\Lambda_{11} = r$, and let $y$ denote the remaining $2 \times (d-1)$ matrix, with $\Lambda_{iA} = y^i_A$, for $i, A > 1$. The reason for this choice is that the flat directions are now at the locus $y = 0$. Note that the choice of $k$ and $q$ is not unique here. So the mapping,
\[ m(k, \Lambda, q) \to k\Lambda q, \]
projects $SO(3) \times R^{2d-1} \times SO(d)$ onto our space of matrices, $x$, but is clearly not one-to-one. The fibers of the map, $m$, are non-trivial. Any function that depends on $x$ can then be lifted to a function in the product space, which is constant under those transformations of $k, \Lambda, q$ that leave $x$ invariant. Most of our computations will focus on the neighborhood, $N_F$, of a flat point given by $|y|^2 < 1$, and $r$ tending to $\infty$. This is the region that contributes to (3.6).

We need to write the Hamiltonian in terms of these new coordinates, which include the $d+1$ angular variables parametrizing the $(d+2)$-dimensional flat directions. The kinetic terms in the Hamiltonian can be determined by computing the Laplacian for the metric associated to this coordinate choice. Let us recall that given a metric, $g$, the Laplacian is given by:
\[ \Delta = -\frac{1}{\sqrt{|g|}}\,T_i\,\sqrt{|g|}\,g^{ij}\,T_j, \]
where $|g|$ is $\det(g)$, and the $T_i$ are a basis of vector-fields for some coordinate system. First, we need to make a choice of basis of vector-fields. For the coordinates (5.1), it is natural to have $\{\frac{\partial}{\partial r}, \frac{\partial}{\partial y^i_B}\}_{2\leq i\leq d,\ B>1}$ as part of the basis. We need $d+1$ additional vector-fields. From the left $SO(3)$, we can choose the two vector-fields $\{X_2, X_3\}$ which are associated to the two $SO(3)$ generators,
\[ \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \]
respectively. Similarly, we can add the $d-1$ vector-fields $\{V_j\}_{j>1}$, associated to the right $SO(d)$ generators, $z(j)$. The matrix $z(j)$ is a $d \times d$ anti-symmetric matrix with only one positive entry, $z(j)_{1j} = 1$. Our total basis is then composed of the subset of tangent vectors to the product space, $SO(3) \times R^{2d-1} \times SO(d)$, given by $\{X_2, X_3\} \cup \{\frac{\partial}{\partial r}, \frac{\partial}{\partial y^i_B}\}_{2\leq i\leq d,\ B>1} \cup \{V_j\}_{j>1}$.

We need to determine the metric, $g$, for this coordinate choice. The set of vector-fields $\{\frac{\partial}{\partial r}, \frac{\partial}{\partial y^i_B}\}$ are orthonormal and orthogonal to the rest of the basis. The rest of the basis have inner products
\[ (X_j, X_k) = r^2\delta_{jk} + (yy^t)_{jk}, \qquad (V_j, V_k) = r^2\delta_{jk} + (y^ty)_{jk}, \qquad (X_j, V_k) = 2r\,y_{jk}. \]
These inner products can be determined by pushing forward the vector-fields under $m$, and computing the resulting norms.
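The inner products above determine the $(d+1) \times (d+1)$ angular block of the metric, and its determinant can be examined numerically. The following sketch assembles that block in the basis $(X_2, X_3, V_2, \ldots, V_d)$ for a sample $y$ and compares $\log\det$ with $2(d+1)\log r - 2|y|^2/r^2$, the leading large-$r$ behavior one expects from expanding in powers of $1/r$. The sample values and helper names are assumptions made for illustration.

```python
import math


def angular_metric(r, y):
    """Angular block of the metric from the stated inner products.
    y is a list of 2 rows (gauge index 2, 3), each with d-1 entries."""
    m = len(y[0])  # d - 1
    size = 2 + m
    g = [[0.0] * size for _ in range(size)]
    for j in range(2):
        for k in range(2):
            # (X_j, X_k) = r^2 delta_jk + (y y^t)_jk
            g[j][k] = (r * r if j == k else 0.0) + sum(y[j][a] * y[k][a] for a in range(m))
    for a in range(m):
        for b in range(m):
            # (V_a, V_b) = r^2 delta_ab + (y^t y)_ab
            g[2 + a][2 + b] = (r * r if a == b else 0.0) + sum(y[j][a] * y[j][b] for j in range(2))
    for j in range(2):
        for a in range(m):
            # (X_j, V_a) = 2 r y_ja
            g[j][2 + a] = g[2 + a][j] = 2.0 * r * y[j][a]
    return g


def log_abs_det(matrix):
    """log|det| by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col, n):
                a[row][c] -= factor * a[col][c]
    return total


def leading_log_det(r, y):
    """2(d+1) log r - 2|y|^2 / r^2 for a 2 x (d-1) matrix y."""
    d = len(y[0]) + 1
    norm_sq = sum(v * v for row in y for v in row)
    return 2.0 * (d + 1) * math.log(r) - 2.0 * norm_sq / (r * r)
```

The difference between the two expressions is of order $|y|^4/r^4$, so the agreement improves rapidly as $r$ grows at fixed $y$.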
Now the metric can be written as a direct sum of two metrics, $g = g' \oplus g''$, where $g''$ is the identity matrix for the coordinates corresponding to $(r, y)$. The interesting part of the metric is the part for the angular variables. So, let us write $g' = r^2I + K$, where $K$ is determined from the above inner products. Then
\[ (g')^{-1} = I/r^2 - K/r^4 + \ldots, \]
where the omitted terms are suppressed by more powers of $r$. To compute the Laplacian, we need:
\[ \begin{aligned}
\log\det(g) = \log\det(g') &= \mathrm{tr}\log(r^2I) + \mathrm{tr}\log(I + K/r^2) \\
&= 2(d+1)\log(r) + \mathrm{tr}(K/r^2) - \mathrm{tr}(K^2/2r^4) + \ldots \\
&= 2(d+1)\log(r) - 2|y|^2/r^2 + \ldots,
\end{aligned} \]
where omitted terms are again of lower order. Finally, this allows us to write down an expression for the Hamiltonian in these special coordinates at $(k, \Lambda, q)$,
\[ \begin{aligned}
2H ={}& -\frac{\partial^2}{\partial r^2} - \frac{(d+1)}{r}\frac{\partial}{\partial r} + \frac{2y^j_B}{r^2}\frac{\partial}{\partial y^j_B} + \Delta_y - \frac{1}{r^2}\Big(\sum_{j>1}X_j^2 + \sum_{j>1}V_j^2\Big) \\
& + r^2|y|^2 + \sum_{i>j>1}(y_2^iy_3^j - y_3^iy_2^j)^2 + k_{AM}\Lambda_{sM}q_{sj}Y_{Aj} + \ldots,
\end{aligned} \tag{5.2} \]
where $\Delta_y$ is the Laplacian in the $y$ variables. We have written $H_F$ as $k_{AM}\Lambda_{sM}q_{sj}Y_{Aj}$, where $Y_{Aj} = i\gamma^j_{\alpha\beta}f_{ABC}\psi_{B\alpha}\psi_{C\beta}$ in the nine-dimensional model, with analogous expressions for the other cases. The omitted terms are all of order $O(1/r^3)$ or smaller.

5.2. Inverting the Hamiltonian. To invert the Hamiltonian, let us focus first on the harmonic oscillator term, $H_m = \Delta_y + r^2|y|^2$, with eigenvalues $2(d - 1 + n)r$, where $n$ is a non-negative integer. If there were no cancelling fermion term, this oscillator term would immediately guarantee a potential linearly increasing with $r$, and therefore, a discrete spectrum. In order to see the cancelling fermion term, we write $H_F = rY_r + H_F'$, where $Y_r = k_{A1}q_{1j}Y_{Aj}$, and $H_F'$ consists simply of all those terms in $H_F$ without a $\Lambda_{11} = r$ factor. It is an easy task to show that the eigenvalues of $Y_r$ range from $-2(d-1)$ to $2(d-1)$. In particular, when we consider wavefunctions for which the transverse bosons and fermions are simultaneously in their ground states, all terms of order $r$ cancel, and we are left with only terms of order $1/r^2$ to worry about in the Hamiltonian. On any state with excited oscillators, we see that $H$ has lowest eigenvalue of order $r$.

Our construction of $W$ will use the nice factorization of the wavefunctions in terms of their behavior in the massive directions, and their behavior along the flat directions. We can think of $H$ as a block $2 \times 2$ matrix, with respect to this decomposition, where the $H_{11}$ piece corresponds to the terms in $H$ which take the ground state for the oscillators back to itself. The piece $H_{22}$ contains terms in $H$ which send the state with one excited massive boson to itself, and the off-diagonal terms act in a similar way. As a first guess, one might try (and we did) to construct $W$ as a perturbation of some nice approximation, $W_1$, to $H_{11}^{-1} \oplus H_{22}^{-1}$. Because of the large eigenvalue of the first excited state of $H_m + rY_r$, almost any construction should give a good $H_{22}^{-1}$. Inverting $H_{11}$ is more problematic but not excessively so.

In a standard perturbative approach, we consider $HW_1 = I - E_1$. Here, $E_1$ will include, for example, such terms as $H_{12}W_1$. As a next step, set $W_2 = W_1 + W_1E_1$, with error $E_2 = -E_1^2$. One could iterate this construction to construct $W = W_n$ for some large $n$, if the errors were getting significantly smaller each time. For example, if each $E_k$ had a kernel $e_k(x, x')$ which was bounded by $(r(x) + r(x'))^{-k}$, this would lead us to the desired $W$ after some number of iterations. With this in mind, it becomes clear what terms must be included in our initial approximation, $W_1$. Because any approximate $H_{11}^{-1}$ will have an upper bound of size $r^2$, we see that we cannot discard any term of size bigger than or equal to $O(1/r^2)$ in $H$ which either acts on the ground state, or maps an excited state to the ground state, for these terms lead to errors which are not decreasing under iteration. Actually, this is an overstatement. If a term $B$ maps us, for example, from the lowest state into a higher state, we see that the error will enter as $H_{22}^{-1}BH_{11}^{-1}$. Because $H_{22}^{-1}$ is bounded above by $(r(x) + r(x'))^{-1}$, this term will be decreasing in $r$ if, for example, $B$ is $O(1/r^2)$. With these remarks to guide us, we must now return to analyze $H$, treating as lower order only terms which are $O(1/r^3)$ if they map the ground
state into itself, or are $O(1/r^2)$ if they mix the ground state with excited states. These lower order terms cannot be neglected altogether, but they simply enter as higher terms in our iterative construction of $W$.

The gauge constraints provide some further simplification in our computations. First we can 'move' to the identity element $k = 1$ in our product space, $SO(3) \times R^{2d-1} \times SO(d)$ with coordinates $(k, \Lambda, q)$, by a gauge rotation. From now on, we will restrict our discussion to the $k = 1$ subspace. Further, setting two of the gauge constraints to zero allows us to replace differentiation by $X_j$, which generates gauge transformations on the bosons, by multiplication by the fermion bilinear which generates gauge transformations on the fermions. For example, in the nine-dimensional case, the fermion bilinear is given by $Q_j := -\frac{1}{2}\psi_{1s}\psi_{js}$. There are similar expressions for the other cases. The remaining gauge constraint, $C_1$ of the three constraints in (2.2), generates a $U(1)$ subgroup which acts on $y$. With these considerations in hand, let us turn to the task of getting rid of the massive modes, and obtaining an effective Hamiltonian on the flat directions which we can invert to compute (3.6).

5.3. Constructing the effective Hamiltonian. Let us begin by computing the total contribution of terms that map the ground state to itself. Each of these terms will give rise to an interaction in the effective Hamiltonian of the form $m/r^2$, for some $m$, in a manner described in the beginning of this section. The ground state is of the form $s(q)\,r^{(d-1)/2}e^{-r|y|^2/2}$, where $s(q)$, the fermion ground state, is actually a section of a bundle determined by the lowest eigenvalue of the fermion term $Y_r$, since $s(q)$ must satisfy the equation:
\[ \{Y_r(q) + 2(d-1)\}\,s(q) = 0. \]
Therefore, the ground state depends non-trivially on the right angular coordinates, $q$. We will examine the structure of the fermion ground state in some detail shortly.
Note that the ground state is invariant under the remaining $U(1)$ subgroup of the gauge group. As a first approximation, a general state that we need to consider is a product of the ground state with a wavefunction, $f(r, q)$, along the flat directions. Let us record the various $m^i/r^2$ contributions to the effective Hamiltonian which acts on $f$, where $m^3$, $m^5$ and $m^9$ denote the values of the contributions for the three, five and nine-dimensional models, respectively. First, by definition, the term $H_m + rY_r$ vanishes on the ground state. It is convenient to rewrite $y$ and $\frac{\partial}{\partial y}$ in terms of standard annihilation and creation operators:
\[ y = \frac{1}{\sqrt{2r}}(a + a^\dagger), \qquad \frac{\partial}{\partial y} = \sqrt{\frac{r}{2}}\,(a - a^\dagger), \]
where $[a, a^\dagger] = 1$. Now we can evaluate the contribution of the derivative terms in $H$ acting on the bosonic ground state,
\[ |0\rangle = \left(\frac{r}{\pi}\right)^{(d-1)/2}e^{-r|y|^2/2}. \]
The term in (5.2) with one radial derivative gives
\[ \langle 0|\partial_r|0\rangle = \langle 0|\,\frac{d-1}{2r} - \frac{|y|^2}{2}\,|0\rangle = 0. \]
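The term with two radial derivatives then gives
\[ \begin{aligned}
\langle 0|\partial_r^2|0\rangle &= \frac{1}{r^2}\left[\left(\frac{d-1}{2}\right)\left(\frac{d-3}{2}\right) - 2\left(\frac{d-1}{2}\right)^2\right] + \frac{1}{4}\langle 0|\,|y|^2|y|^2\,|0\rangle \\
&= \frac{1-d}{4r^2}.
\end{aligned} \]
Let us label this contribution $m_r$; then $m_r^3 = 1/4$, $m_r^5 = 1/2$, and $m_r^9 = 1$. After noting that we can replace the $X_i^2$ terms in (5.2) by the action of the fermion bilinears, $Q_i^2$, which are just matrices, we see that there are two remaining derivative terms in the Hamiltonian. The first is the $y\frac{\partial}{\partial y}$ term and it is relatively easy to analyze,
\[ \langle 0|\,y\frac{\partial}{\partial y}\,|0\rangle = -\frac{1}{2}\langle 0|aa^\dagger|0\rangle = (1 - d). \]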
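The ground-state expectation values above follow from Gaussian moments, and they can be checked with exact rational arithmetic. The sketch below assumes that $|y|^2$ is a sum of $2(d-1)$ squared real coordinates, each Gaussian with variance $1/(2r)$ in the normalized ground state $(r/\pi)^{(d-1)/2}e^{-r|y|^2/2}$; the function names are illustrative.

```python
from fractions import Fraction


def y2_moments(d, r):
    """Return (<|y|^2>, <|y|^4>) for 2(d-1) Gaussian coordinates of variance 1/(2r)."""
    n = 2 * (d - 1)
    sigma2 = Fraction(1) / (2 * r)
    mean = n * sigma2
    # <y_i^4> = 3 sigma^4 and <y_i^2 y_j^2> = sigma^4 for i != j
    fourth = n * 3 * sigma2 ** 2 + n * (n - 1) * sigma2 ** 2
    return mean, fourth


def radial_expectations(d, r):
    """Return (<0|d_r|0>, <0|d_r^2|0>) using
    d_r|0> = (a - |y|^2/2)|0> with a = (d-1)/(2r), and
    d_r^2|0> = (-(d-1)/(2 r^2) + (a - |y|^2/2)^2)|0>."""
    mean, fourth = y2_moments(d, r)
    a = Fraction(d - 1) / (2 * r)
    first = a - mean / 2
    second = -Fraction(d - 1) / (2 * r * r) + a * a - a * mean + fourth / 4
    return first, second


def y_dy_expectation(d):
    """<0| y d/dy |0>: each of the 2(d-1) oscillators contributes -<0|a a^dagger|0>/2 = -1/2."""
    return -Fraction(1, 2) * 2 * (d - 1)
```

With these moments, $r^2\langle 0|\partial_r^2|0\rangle = (1-d)/4$ for every $r$, so the contribution to $H$ (half of $-\partial_r^2$) is $(d-1)/8$ in units of $1/r^2$, reproducing $1/4$, $1/2$ and $1$ for $d = 3, 5, 9$.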
Let us call this contribution $m_y$, where $m_y^d = 1 - d$. The sign of this term is critical since it is the only term that maps the ground state to the ground state, and gives a large negative contribution to the net $m$. The last derivative term comes from the action of $V_i^2$ on the ground state, where the $V_i$ generate certain flavor rotations. The $q$-dependent fermion oscillator ground state is the only part of the full ground state wavefunction that can give a non-vanishing contribution in this case. So, let us now describe the ground state of the fermions in some detail.

The real massive fermions, $\{\psi_{2\alpha}, \psi_{3\alpha}\}$, where $\alpha$ runs from 1 to $n = 4$, 8 and 16 for the three, five and nine-dimensional cases, can be arranged into annihilation and creation operators:
\[ b_\alpha = \frac{1}{\sqrt{2}}(\psi_{2\alpha} + i\psi_{3\alpha}), \qquad b_\alpha^\dagger = \frac{1}{\sqrt{2}}(\psi_{2\alpha} - i\psi_{3\alpha}). \]
The operators $b, b^\dagger$ obey the anti-commutation relation $\{b_\alpha, b_\beta^\dagger\} = \delta_{\alpha\beta}$. Can we construct a fermion state in the kernel of $Y_r + 2(d-1)$? For the moment, let us pick $q = 1$. In this case, $Y_r = 2i\gamma^1_{\alpha\beta}\psi_{2\alpha}\psi_{3\beta}$, and we can pick $\gamma^1$ to be the diagonal element of the Clifford algebra:
\[ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
With this choice, $Y_r = 2\sum_{\alpha=1}^{n/2}\left(b_\alpha^\dagger b_\alpha - b_{n/2+\alpha}^\dagger b_{n/2+\alpha}\right)$. Let us choose a fermionic ground state $|0\rangle_F$ which satisfies $b_\alpha|0\rangle_F = b_{n/2+\alpha}^\dagger|0\rangle_F = 0$ for $\alpha = 1, \ldots, n/2$. This vacuum is then in the kernel of $Y_r + 2(d-1)$ at the point $q = 1$. We should point out that, when restricted to the $(d-1)$-sphere which is the $SO(d)$ orbit of a flat point, the ground state takes values in a flat vector bundle, which is therefore trivial. This means that there is a globally defined fermion ground state wavefunction. To understand the fermion ground state for arbitrary $q$ and, in particular, to understand the action of $V_i^2$ on the ground state, we shall study the equivalent problem of its action on the operator which acts as projection onto the kernel of $Y_r + 2(d-1)$. This operator is given by,
\[ P(q) = \frac{1}{2\pi i}\oint_\Gamma dz\,\frac{1}{z - Y_r(q)}, \tag{5.3} \]
using a contour, $\Gamma$, which we will take to be a small loop enclosing $-2(d-1)$. This construction of $P(q)$ is readily seen to be correct by diagonalizing $Y_r(q)$, and computing the corresponding contour integral. Here we should clarify that we are really interested in the action of $P(q)\sum_i V_i^2$; the remainder is $O(1/r^2)$ and does not take the ground state to itself. These terms will therefore be of interest only as perturbative corrections. The operator $P(q)\sum_i V_i^2$ should act as a scalar $2m_q$ on an appropriately chosen basis of the ground state. Our task is to compute the scalar. Now, $P(q)$ is given by conjugating the projection operator at $q = 1$ with a nonconstant orthogonal matrix, whose columns give a basis of the ground state. We can write this as, $P(q) = O(q)P(1)O(q)^t$. Then $\sum_i V_i^2$ acting on $P(q)$ has terms where $O(q)$ or $O(q)^t$ are twice-differentiated and terms where each is differentiated once. It is easy to show that this last term is annihilated by $P(q)$, and hence is not germane to this calculation. If the remaining term is a multiple of $P(q)$, then the multiple will be $4m_q$, as both $O(q)$ and its conjugate should contribute a factor of $2m_q$ when differentiated. We preface the computation by commenting on how to compute the derivative of $Y_r(q)$. It is enough to compute $V_i q_{1j}$. By a change of coordinates, we then only need to compute, $V_i q_{1j}(1)$. This is given by:
\[ V_i q_{1j} = \frac{d}{dt}\big(q e^{tV_i}\big)_{1j}\Big|_{t=0} = (qV_i)_{1j}. \]
Evaluating at $q = 1$ gives, $V_i q_{1j}(1) = (V_i)_{1j} = \delta_{ij}$. Now we compute:
\[ \sum_i V_i^2 P(q) = \frac{2}{2\pi i}\oint \sum_i \frac{1}{z - Y_r(q)}\,V_iY_r\,\frac{1}{z - Y_r(q)}\,V_iY_r\,\frac{1}{z - Y_r(q)}\,dz + \frac{1}{2\pi i}\oint \sum_i \frac{1}{z - Y_r(q)}\,V_i^2Y_r\,\frac{1}{z - Y_r(q)}\,dz, \]
where $Y_r$ is an eigenfunction of the Laplacian $\sum_i V_i^2$. Hence the second integrand is a simple double pole and therefore integrates to zero. In order to compute the remaining term, it is enough to make a change of coordinates equivalent to taking $q = 1$. Then it is easy to see that $V_i(Y_r)$ takes the $-2(d-1)$ eigenspace of $Y_r$ to the $4 - 2(d-1)$ eigenspace. This gives,
\[ P(q)\,\frac{2}{2\pi i}\oint \sum_i \frac{1}{z - Y_r(q)}\,V_iY_r\,\frac{1}{z - Y_r(q)}\,V_iY_r\,\frac{1}{z - Y_r(q)}\,dz = \frac{2}{2\pi i}\oint \sum_i \frac{1}{(z + 2(d-1))}\,\frac{1}{(z + 2(d-1) - 4)}\,\frac{1}{(z + 2(d-1))}\,\mathrm{diag}(V_iY_r)^2\,dz, \]
where $\mathrm{diag}(V_iY_r)^2 = P(q)(V_iY_r)^2P(q)$. Integrating gives $-\sum_i \mathrm{diag}(V_iY_r)^2/8$. We can compute this at $q = 1$ where, $(V_iY_r) = 2i\gamma^i_{\alpha\beta}\psi_{2\alpha}\psi_{3\beta}$, and $\sum_i (V_iY_r)^2/8 = -4\gamma^i_{\alpha\beta}\psi_{2\alpha}\psi_{3\beta}\,\gamma^i_{\alpha'\beta'}\psi_{2\alpha'}\psi_{3\beta'}/8$. The terms which survive $P(q)$ are,
\[ -4\gamma^i_{\alpha\beta}\psi_{2\alpha}\psi_{3\beta}\,\gamma^i_{\alpha\beta}\psi_{2\alpha}\psi_{3\beta}/8 - 4\gamma^i_{\alpha\beta}\psi_{2\alpha}\psi_{3\beta}\,\gamma^i_{\beta\alpha}\psi_{2\beta}\psi_{3\alpha}/8 = \gamma^i_{\alpha\beta}\gamma^i_{\alpha\beta}/4 = 4(d-1). \]
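The factor $-1/8$ obtained from the contour integration above can be checked numerically. The following is a minimal sketch, not part of the original computation: it treats $\mathrm{diag}(V_iY_r)^2$ as a constant, so that only the scalar integrand $2/[(z+a)^2(z+a-4)]$ with $a = 2(d-1)$ remains, and integrates it over a circle of radius 1 around $z = -a$. The function names and the choice of radius are assumptions of this sketch.

```python
import cmath
import math

def contour_integral(g, center, radius, n=512):
    """Approximate (1/(2*pi*i)) * closed integral of g over a circle (trapezoid rule)."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        offset = radius * cmath.exp(1j * theta)
        # dz = i * offset * dtheta, and the factor i cancels against 1/(2*pi*i).
        total += g(center + offset) * offset
    return total / n

def scalar_coefficient(d):
    """Scalar multiplying diag(V_i Y_r)^2 after the z-integration, for dimension d."""
    a = 2 * (d - 1)
    # Double pole at z = -a (enclosed); simple pole at z = -a + 4 (outside radius 1).
    g = lambda z: 2 / ((z + a) ** 2 * (z + a - 4))
    return contour_integral(g, center=-a, radius=1.0)

for d in (3, 5, 9):
    value = scalar_coefficient(d)
    assert abs(value - (-0.125)) < 1e-9
```

The result does not depend on $d$, since shifting $z$ by $a$ removes it from the integrand; only the gap of 4 between the two eigenspaces enters.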
So, we finally get that $m_q^3 = 1/2$, $m_q^5 = 2$ and $m_q^9 = 8$. With the derivative terms out of the way, we can consider the two remaining operators in the Hamiltonian that map the ground state to the ground state. The first is the $O(y^4)$ term in the potential. This gives,
\[ \langle 0|\sum_{i>j}(y_{2i}y_{3j} - y_{3i}y_{2j})^2|0\rangle = \langle 0|\,2\sum_{i>j}(y_{2i}y_{3j})^2\,|0\rangle = \frac{1}{4r^2}(d-1)(d-2). \]
Calling this contribution, $m_V$, we have $m_V^3 = 1/4$, $m_V^5 = 3/2$, and $m_V^9 = 7$. The last contribution comes from the kinetic term for the two gauge rotation generators, $X_2, X_3$. Setting the gauge constraints to zero, we can replace the angular Laplacian, $X_i^2$, by the operator quartic in fermions:
\[ \sum_{\alpha=1}^{n}\big[(\psi_{1\alpha}\psi_{2\alpha})^2 + (\psi_{1\alpha}\psi_{3\alpha})^2\big]. \]
The only terms that map the ground state to the ground state are those proportional to the identity. A quick calculation gives the numerical value, $n/2$. Calling this contribution, $m_f$, we note that $m_f^3 = 1$, $m_f^5 = 2$ and $m_f^9 = 4$. If we were to stop at this point, and just consider these diagonal contributions to the effective Hamiltonian, a quick check would show that the net $m$ is non-vanishing. There would therefore be a non-trivial $1/r^2$ interaction in the effective theory. Fortunately, we are not quite finished. First, we can shift the coefficient of the $\partial_r$ term in (5.2), and generate a new $O(1/r^2)$ term by redefining our wavefunctions. The reason this is useful is that on choosing an appropriate coefficient for $\partial_r$, we can combine the radial derivatives with the $V_i$ angular derivatives to obtain a Laplacian for flat space, together with some $1/r^2$ interaction. This way, we only need to deal with Euclidean coordinates, rather than the messier angular coordinates. We want to shift the coefficient of the $\partial_r$ term from $d + 1$ to $d - 1$, so we will end up with a free particle Hamiltonian on the $d$-dimensional moduli space, together with interactions. To do so, we note that:
\begin{align*}
-\Big(\partial_r^2 + \frac{d+1}{r}\partial_r\Big) &= -\Big[\Big(\partial_r + \frac{1}{r}\Big)^2 + \frac{d-1}{r}\partial_r\Big] \\
&= -\Big[\Big(\partial_r + \frac{1}{r}\Big)^2 + \frac{d-1}{r}\Big(\partial_r + \frac{1}{r}\Big) - \frac{d-1}{r^2}\Big] \\
&= -\frac{1}{r}\Big[\partial_r^2 + \frac{d-1}{r}\partial_r - \frac{d-1}{r^2}\Big]r,
\end{align*}
where we now redefine our wavefunctions, $f(r, q) = \frac{1}{r}\tilde f(r, q)$. This gives us a new contribution to $m$, say $m_c = \frac{d-1}{2}$. So far, we have found that $H$ acts on the ground state and first excited state in the following way,
\[ H = \begin{pmatrix} \frac{1}{2}\Delta_d + m_T^d/r^2 & b^t \\ b & H_{22} \end{pmatrix}. \]
Here, $m_T = m_c + m_f + m_V + m_q + m_y + m_r$ is the total effective interaction that we have found so far, where $m_T^3 = 1$, $m_T^5 = 4$ and $m_T^9 = 16$. Lower order terms in $H_{11}$ have been
omitted. Let us turn to the form of $b$, which may change the effective interaction. For example, the terms in $b$ acting on the ground state of order $O(r^{-1/2})$ cannot be neglected. Our initial choice of $W_1$ is then not diagonal to the requisite order in the basis that we have been using. Does $b$ contain terms of the right order? It is not hard to check that the only terms that map the ground state to the first excited state, which are of the requisite order are those involving $H_F^0$. This follows from noting that $H_F^0$ is proportional to $y$ which is, $\sim a^\dagger/\sqrt{r}$ acting on the ground state. Now we see that we can significantly lower our energy if we define a new ground state $|0\rangle' = (I - \frac{b}{2r})s(q)|0\rangle$. The factor of $2r$ is chosen because acting with $b$ on a state raises its eigenvalue under $H_m + rY_r$ by $2r$. So, $H|0\rangle' = (\frac{1}{2}\Delta_d + m_T^d/r^2 - b^tb/2r)s(q)|0\rangle$ plus lower order terms. The final contribution to $m$ from $b^tb$ is computed in the same way as the other contributions, and we find that $m_b^d = -m_T^d$. Wonderfully, the $O(1/r^2)$ terms sum to zero! From these considerations, we see that we are reduced to a free-particle calculation. Our choice for an approximation $W_{11}$ to $H_{11}^{-1}$ is then particularly simple: we can just take the free-particle propagator,
\[ W_{11} = \int \frac{d^dk}{(2\pi)^d}\,\frac{e^{ik\cdot(x - x')}}{k^2}. \]
5.4. Evaluating the boundary contribution.
Now that we reached the point where we have a nice simple form for $W_{11}$, we note that the remainder of the perturbation construction is standard. We will not belabor the reader with the details of this expansion, but just provide some relevant comments. As we observed before, any reasonable construction yields a good approximate $H_{22}^{-1}$, with a nice error bound. The construction of $W$, which is perturbative in $1/r$, then follows the outline that we have described earlier in this section.
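The bookkeeping of the $1/r^2$ coefficients in Sect. 5.3 can be tallied in exact arithmetic. The sketch below is only an illustration: the quoted numbers for $d = 3, 5, 9$ are taken from the text, while the closed forms in `contributions` are interpolations of those three values (an assumption of this sketch, not formulas stated in the paper), except for $m_y = 1 - d$ and $m_c = (d-1)/2$, which the text gives directly.

```python
from fractions import Fraction as F

def contributions(d):
    """1/r^2 coefficients as functions of d (closed forms are assumed interpolations)."""
    return {
        "m_r": F(d - 1, 8),              # two radial derivatives
        "m_y": F(1 - d),                 # y d/dy term, quoted as 1 - d
        "m_q": F((d - 1) ** 2, 8),       # V_i^2 acting on the fermion ground state
        "m_V": F((d - 1) * (d - 2), 8),  # O(y^4) potential
        "m_f": F(d - 1, 2),              # quartic fermion term
        "m_c": F(d - 1, 2),              # shift from the wavefunction redefinition
    }

# Values quoted in the text for the three models.
QUOTED = {
    3: {"m_r": F(1, 4), "m_q": F(1, 2), "m_V": F(1, 4), "m_f": F(1), "m_T": F(1)},
    5: {"m_r": F(1, 2), "m_q": F(2), "m_V": F(3, 2), "m_f": F(2), "m_T": F(4)},
    9: {"m_r": F(1), "m_q": F(8), "m_V": F(7), "m_f": F(4), "m_T": F(16)},
}

def total(d):
    """m_T = m_c + m_f + m_V + m_q + m_y + m_r."""
    return sum(contributions(d).values())

for d, quoted in QUOTED.items():
    parts = contributions(d)
    for name, value in quoted.items():
        if name != "m_T":
            assert parts[name] == value   # interpolation reproduces the quoted value
    m_T = total(d)
    assert m_T == quoted["m_T"]
    m_b = -m_T                            # off-diagonal correction quoted in the text
    print(d, m_T, m_T + m_b)
```

With these inputs the totals reproduce $m_T^3 = 1$, $m_T^5 = 4$ and $m_T^9 = 16$, and adding $m_b^d = -m_T^d$ leaves no $1/r^2$ term in any of the three cases.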
The contribution of the $W_{11}$ term is non-vanishing, but it is clear after extensive, arduous but standard computations, which involve checking powers of $r$, that any trace involving the rest of $W$ will bring in more powers of $r^{-1}$ than appear in the $W_{11}$ term. These terms will therefore not contribute to the boundary term (3.6), in the limit where $r \to \infty$. We can now restrict ourselves to the free-particle Green's function, $W_{11}$. The remaining calculation is simple. We have a free particle propagating on $\mathbb{R}^d/\mathbb{Z}_2$. The $\mathbb{Z}_2$ identification comes from the Weyl group action on the Cartan of the gauge group $SU(2)$. Let us take $x$ as coordinates for $\mathbb{R}^d$. The $\mathbb{Z}_2$ action acts as parity, sending $x \to -x$. It also sends the free fermions, $\psi_{1\alpha}$, where $\alpha = 1, \ldots, 2(d-1)$, to minus themselves. The Hilbert space of gauge-invariant wavefunctions is given by: $\{f_0(x), f_1(x)\psi_{1\alpha}, f_2(x)\psi_{1\alpha}\psi_{1\beta}, \ldots\}$. Each function, $f_k$, has parity $(-1)^k$. All we need to do is compute the boundary term,
\[ -\frac{1}{2}\int_{N_F(r)} \mathrm{tr}\, e_n (-1)^F Q W_{11}, \]
for this system. This becomes an integral over the boundary of the $d$-dimensional moduli space:
\[ -\frac{1}{2}\int_{S^{d-1}(r)} \mathrm{tr}\, e_n (-1)^F Q W_{11} = -\int_{S^{d-1}(r)} \mathrm{tr}\, e_n (-1)^F Q\,\frac{1}{\mathrm{vol}(S^{d-1})\,2(d-2)}\,\frac{1}{|x - x'|^{d-2}}\bigg|_{x' = -x} = -1/4. \]
To close this computation, let us note that the lower bound on the asymptotic behavior of $H$ is given by its lowest value on the modified lowest wavefunctions, where the modification involved the off-diagonal $b$ term. The lower bound is therefore the same as the bound for $\Delta_d$, up to terms of order $O(1/r^3)$. As discussed in section three, this easily leads to the claimed asymptotic lower bound for $H$. To summarize: we have found a formula for the index that counts the net number of $L^2$ ground states in certain quantum mechanical systems, where the potential has flat directions. This involved a study of $L^2$ index theory for a family of non-Fredholm operators, which allowed us to show that the prescription we presented actually computes the index. For the case of two-particle binding, we have shown that there is a bound state for coincident zero-branes in type IIA string theory. We have also found further evidence that there are no bound states for two-branes twice wrapped on an $S^2$, and three-branes twice wrapped on an $S^3$. Note that these models are only special points in the space of theories obtained by deforming the zero-brane quantum mechanics. The actual computation split into two parts. Computing the principal term involved evaluating the integral (3.5). It would be interesting, and quite non-trivial, to compute this integral for higher rank gauge groups. Even better would be a method for avoiding this integration altogether. The second part of the computation required a study of the propagator for the two particles when they are far apart. Surprisingly, after summing a variety of corrections, this computation reduced to one involving a free particle moving on the moduli space. Undoubtedly, there is a fundamental reason for this simplification, and finding it may also shed light on whether the $F^4$ term in the effective zero-brane Hamiltonian is protected from corrections.
It seems likely that there will be an analogous reduction to a free particle calculation for other gauge groups. As a further comment, note that if we had studied a system with gauge group $U(1)$ and some charged matter, there would have been no boundary correction, as in the case involving H-monopoles [6]. The sort of decay estimates that we described can probably be used to get a handle on the structure of the ground state wavefunction. What is needed is an upper bound on how fast the wavefunction can decay along the flat directions. They may also lead to a vanishing theorem showing that all ground states in these systems must have a definite fermion number. The index would no longer be just an index, but would then count the total number of ground states. This would allow us to conclude that the zero-brane bound state is unique. Finally, systems involving marginal binding of branes with different dimensions can now be analyzed in much the same way.
Acknowledgement. It is a pleasure to thank G. Jungman, L. Susskind and E. Witten for helpful discussions. The work of S.S. is supported by NSF grant DMS-9627351, while that of M.S. by NSF grant DMS-9505040.
References
1. Polchinski, J.: Phys. Rev. Lett. 75, 47 (1995)
2. Witten, E.: Nucl. Phys. B460, 335 (1996)
3. Sethi, S., Stern, M. and Zaslow, E.: Nucl. Phys. B457, 484 (1995)
4. Gauntlett, J. and Harvey, J.: Nucl. Phys. B463, 287
5. Sen, A.: Phys. Rev. D53, 2874 (1996); Phys. Rev. D54, 2964 (1996)
6. Sethi, S. and Stern, M.: Phys. Lett. B398, 47 (1997)
7. Banks, T., Fischler, W., Shenker, S.H. and Susskind, L.: Phys. Rev. D55, 5112 (1997)
8. de Wit, B., Hoppe, J. and Nicolai, H.: Nucl. Phys. B305, 545 (1988); de Wit, B., Luscher, M.M. and Nicolai, H.: Nucl. Phys. B320, 135 (1989); de Wit, B., Marquard, V. and Nicolai, H.: Commun. Math. Phys. 128, 39 (1990)
9. Townsend, P.: Phys. Lett. B373, 68 (1996)
10. Danielsson, U.H., Ferretti, G., Sundborg, B.: Int. J. Mod. Phys. A11, 5463 (1996); Kabat, D. and Pouliot, P.: Phys. Rev. Lett. 77, 1004 (1996)
11. Douglas, M.R., Kabat, M.R., Pouliot, P. and Shenker, S.: Nucl. Phys. B485, 85 (1997)
12. Susskind, L.: hep-th/9704080
13. Frohlich, J. and Hoppe, J.: hep-th/9701119
14. Lowe, D.: hep-th/9704041
15. Yi, P.: hep-th/9704098
16. Claudson, M. and Halpern, M.: Nucl. Phys. B250, 689 (1985)
17. Brink, L., Schwarz, J.H. and Scherk, J.: Nucl. Phys. B121, 77 (1977)
18. Simon, B.: Ann. Phys. 146, 209 (1983)
19. Townsend, P.K.: Phys. Lett. B350, 184 (1995)
20. Witten, E.: Nucl. Phys. B443, 85 (1995)
21. Strominger, A.: Nucl. Phys. B451, 96 (1995)
22. Bershadsky, M., Sadov, V. and Vafa, C.: Nucl. Phys. B463, 420 (1996)
23. Sethi, S.: Unpublished
24. Ishibashi, N., Kawai, H., Kitazawa, Y. and Tsuchiya, A.: hep-th/9612115
25. Agmon, S.: Lectures on Exponential Decay of Solutions of Second-Order Elliptic Equations. Princeton, NJ: Princeton University Press, 1982
Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 194, 707 – 732 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Extendability of Solutions of the Einstein–Yang/Mills Equations
J. A. Smoller*, A. G. Wasserman
University of Michigan, Mathematics Department, Ann Arbor, MI 48109-1109, USA
Received: 17 June 1997 / Accepted: 6 November 1997
Abstract: We prove that any solution to the spherically symmetric SU (2) Einstein– Yang/Mills equations that is defined in the far field and is asymptotically flat, is globally defined. This result applies in particular to the interior of colored black holes.
1. Introduction
In this paper we prove the following surprising property of spherically symmetric solutions to the SU(2) Einstein–Yang/Mills equations: Any solution to the EYM equations which is defined in the far field ($r \gg 1$) and has finite (ADM) mass, is defined for all $r > 0$. We note that this is not true in the “other direction”; i.e., if a solution is defined near $r = 0$ with particle-like boundary conditions, a singularity can develop at some $\rho > 0$, and the solution cannot be extended for $r > \rho$ (see [8, Thm. 4.1]). Moreover, in general for nonlinear equations, existence theorems are usually only local, with perhaps global existence only for special parameter values. However for these equations we prove here a global existence result for all solutions defined in a neighborhood of infinity. Furthermore, we know (see [9]), that given any event horizon $\rho > 0$, there are an infinite number of black-hole solutions having event horizon $\rho$. Our results in this paper imply that all of these solutions can be continued back to $r = 0$. In particular, this gives information as to the behavior of the Einstein metric and the Yang–Mills field inside a black hole, a subject of recent interest; see [4,5]. In the papers [10, 14], we have studied solutions defined in a neighborhood of $r = \infty$, and we proved that either the solution is defined up to some $r = \rho > 0$, in which case it is a black-hole solution of radius $\rho$ (as discussed in [9], and therefore continues through the event horizon; i.e., to $\rho - \varepsilon \le r \le \rho$) or else the solution is defined all the way to $r = 0$, and is particle-like or is Reissner–Nordström-like. In this paper, we complete our investigations by analyzing the behavior inside the black hole; i.e., on the interval $0 < r < \rho$, see [4,5] for a discussion of the behavior near $r = 0$.
* Research supported in part by the N.S.F., Contract No. DMS-G-9501128.
In order to describe our results, we recall that for the spherically symmetric EYM equations, the Einstein metric is of the form
\[ ds^2 = -AC^2\,dt^2 + A^{-1}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{1.1} \]
and the SU(2) Yang–Mills curvature 2-form is
\[ F = w'\tau_1\,dr\wedge d\theta + w'\tau_2\,dr\wedge(\sin\theta\,d\phi) - (1 - w^2)\tau_3\,d\theta\wedge(\sin\theta\,d\phi). \tag{1.2} \]
Here $A$, $C$ and $w$ are functions of $r$, and $(\tau_1, \tau_2, \tau_3)$ form a basis for the Lie algebra $su(2)$. Using (1.1) and (1.2), the spherically symmetric SU(2) EYM equations are (cf. [1–14]):
\[ rA' + (1 + 2w'^2)A = 1 - \frac{(1 - w^2)^2}{r^2}, \tag{1.3} \]
\[ r^2Aw'' + \Big[r(1 - A) - \frac{(1 - w^2)^2}{r}\Big]w' + w(1 - w^2) = 0, \tag{1.4} \]
and
\[ \frac{C'}{C} = \frac{2w'^2}{r}. \tag{1.5} \]
Notice that (1.3) and (1.4) do not involve $C$ so that the major part of our effort is to study the coupled system (1.3), (1.4). We define the “mass function” $\mu(r)$ by $\mu(r) = r(1 - A(r))$. If
\[ \lim_{r\to\infty}\mu(r) \equiv \bar\mu < \infty, \tag{1.6} \]
the solution is said to have finite (ADM) mass. Our main result in this paper can be stated as
Theorem 1.1. Any solution to the spherically symmetric SU(2) EYM equations defined in the far field and having finite (ADM) mass, is defined for all $r > 0$.
Equivalently (see Proposition 4.1), we can restate our result as
Theorem 1.2. Any solution to the spherically symmetric SU(2) EYM equations defined in the far field and having $A(\bar r) > 0$ for some $\bar r > 1$, is defined for all $r > 0$.
We now give an outline of the proof. Assume that the solution is defined for all $r > r_0 > 0$; we then prove that the solution can be continued through $r_0$; i.e., on an interval of the form $r_0 - \varepsilon < r < \infty$, for some $\varepsilon > 0$. In order to get a handle on the solution we first prove that $A(r)$ has at most a finite number of zeros on the interval $r_0 \le r < \infty$; this is the main content of Sect. 3. Thus $A(r)$ must be of one sign for $r$ near $r_0$, $r > r_0$, and so there are two cases to consider in the proof: $A > 0$ near $r_0$ or $A < 0$ near $r_0$. When $A > 0$ near $r_0$, there are certain simplifying features of the problem; for example, $\mu'(r) > 0$ so $\mu(r)$ has a limit at $r_0$, and thus $\lim_{r\searrow r_0} A(r)$ exists. If $A(r_0) \ge 1$ then $(A, w)$ is a Reissner–Nordström-like (RNL) solution, and it was proved in [14] that such solutions are defined on $0 < r \le r_0$. If $A(r_0) > 0$, $w^2(r_0) > 1$, and $(ww')(r_0) \ge 0$, this contradicts our assumption that the solution is defined in the far field, [10]. If
$A(r_0) > 0$, $w^2(r_0) > 1$, and $(ww')(r_0) < 0$, then it was proved in [14] that again the solution is an RNL solution. Thus, in the case where $A > 0$ near $r_0$, we may assume that $1 > A(r) > 0$ and $w^2(r) < 1$ for $r$ near $r_0$. In this case, the results in [10] show that the solution can be continued beyond $r_0$; see Theorem 4.2. The main thrust of this paper is to consider the case when $A(r) < 0$ for $r$ near $r_0$ ($r > r_0$), and to prove that in this case too the solution can be continued beyond $r_0$. If $A < 0$ near $r_0$, there are two cases to consider: (I) Near $r_0$, $A$ is bounded away from zero, and (II), $A$ is not bounded away from zero; i.e., there is a sequence $r_n \searrow r_0$ such that $A(r_n) \to 0$. In Case (I), we prove that the equations are non-singular at $r_0$, and thus the solution can be continued beyond $r_0$. In Case (II), the equations are singular at $r_0$. However, we prove in this case that these solutions are exactly solutions of the type considered in [9], and the existence and uniqueness theorems proved in [10] imply that the solution can be continued beyond $r_0$. These cases form the subject of Sect. 5. In Sect. 2 we introduce some auxiliary functions which will be used in the paper, and we also recall some known results. The reader is advised to consult this section as needed. The final section consists of a list of miscellaneous results, open questions and conjectures.
2. Preliminaries
The static, spherically symmetric EYM equations, with gauge group SU(2), can be written in the form (cf. [1, 3, 7]):
\[ rA' + (1 + 2w'^2)A = 1 - \frac{u^2}{r^2}, \tag{2.1} \]
\[ r^2Aw'' + \Big[r(1 - A) - \frac{u^2}{r}\Big]w' + uw = 0, \tag{2.2} \]
\[ \frac{C'}{C} = \frac{2w'^2}{r}, \tag{2.3} \]
where
\[ u = 1 - w^2. \tag{2.4} \]
Here $w(r)$ is the connection coefficient which determines the Yang/Mills field, and $A$ and $C$ are the metric coefficients in (1.1). If we define the function $\Phi$ by
\[ \Phi(A, w, r) = r(1 - A) - \frac{u^2}{r}, \tag{2.5} \]
then (2.1) and (2.2) can be written in the compact form
\[ rA' + 2Aw'^2 = \Phi/r, \tag{2.6} \]
\[ r^2Aw'' + \Phi w' + uw = 0. \tag{2.7} \]
If $(A(r), w(r))$ is a given solution of (2.1), (2.2), then we write
\[ \Phi(r) = \Phi(A(r), w(r), r). \tag{2.8} \]
We note that (cf. [8]) the function $\Phi$ satisfies the equation
\[ \Phi' = \frac{2u^2}{r^2} + 2Aw'^2 + \frac{4uww'}{r}. \tag{2.9} \]
We shall have occasion to analyse the behavior of the functions $v$, $f$ and $\mu$ defined by
\[ v = Aw', \tag{2.10} \]
\[ f = Aw'^2, \tag{2.11} \]
and
\[ \mu = r(1 - A). \tag{2.12} \]
These satisfy the respective equations ([8, 9])
\[ v' + \frac{2w'^2}{r}v + \frac{uw}{r^2} = 0, \tag{2.13} \]
\[ r^2f' + (2rf + \Phi)w'^2 + 2uww' = 0, \tag{2.14} \]
and
\[ \mu' = 2Aw'^2 + \frac{u^2}{r^2}. \tag{2.15} \]
We now shall recall some results from the papers ([8–10, 12–14]); these will be needed in our development. The first theorem gives us control on orbits which leave the region $w^2 < 1$.
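The auxiliary equations of this section follow from (2.1), (2.2) by the chain rule, and this can be checked pointwise in exact rational arithmetic. The sketch below is an illustration under stated assumptions: it solves (2.1) for $A'$ and (2.2) for $w''$ at an arbitrary point $(r, A, w, w')$ with $r \ne 0$, $A \ne 0$, takes (2.13) in the form $v' + (2w'^2/r)v + uw/r^2 = 0$, and uses function names and sample points that are choices of this sketch, not of the paper.

```python
from fractions import Fraction as F

def field_derivatives(r, A, w, wp):
    """Solve (2.1) for A' and (2.2) for w'' at one point (requires r != 0, A != 0)."""
    u = 1 - w**2
    Ap = (1 - u**2 / r**2 - (1 + 2 * wp**2) * A) / r
    wpp = -((r * (1 - A) - u**2 / r) * wp + u * w) / (r**2 * A)
    return Ap, wpp

def residuals(r, A, w, wp):
    """Left side minus right side of (2.6), (2.9), (2.13), (2.14), (2.15)."""
    Ap, wpp = field_derivatives(r, A, w, wp)
    u = 1 - w**2
    phi = r * (1 - A) - u**2 / r
    v, f = A * wp, A * wp**2
    # Derivatives along a solution, by the chain rule (note u' = -2 w w').
    phi_p = (1 - A) - r * Ap + u**2 / r**2 + 4 * u * w * wp / r
    v_p = Ap * wp + A * wpp
    f_p = Ap * wp**2 + 2 * A * wp * wpp
    mu_p = (1 - A) - r * Ap
    return {
        "2.6": r * Ap + 2 * A * wp**2 - phi / r,
        "2.9": phi_p - (2 * u**2 / r**2 + 2 * A * wp**2 + 4 * u * w * wp / r),
        "2.13": v_p + 2 * wp**2 * v / r + u * w / r**2,
        "2.14": r**2 * f_p + (2 * r * f + phi) * wp**2 + 2 * u * w * wp,
        "2.15": mu_p - (2 * A * wp**2 + u**2 / r**2),
    }

# Arbitrary rational sample points (r, A, w, w'); both signs of A are included.
samples = [
    (F(3, 2), F(1, 3), F(1, 2), F(-2, 5)),
    (F(7, 10), F(-1, 4), F(-3, 5), F(5, 3)),
]
for point in samples:
    assert all(value == 0 for value in residuals(*point).values())
```

Because every quantity is a `Fraction`, the residuals vanish exactly rather than up to rounding, so the check is insensitive to the particular sample points chosen.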
Theorem 2.1 ([10, 14]). Let $(A(r), w(r))$ be a solution of (2.1), (2.2), and assume that for some $r_0 > 0$, $w^2(r_0) > 1$ and $A(r_0) > 0$.
i) If $(ww')(r_0) > 0$, then there is an $r_1 > r_0$ such that $\lim_{r\nearrow r_1} A(r) = 0$, and $w'$ is unbounded near $r_1$.
ii) If $(ww')(r_0) < 0$, then there is an $r_1$, $0 < r_1 < r_0$ such that $A(r_1) = 1$, $A(r) > 0$ if $0 < r \le r_0$, and $\lim_{r\searrow 0}(A(r), w(r), w'(r)) = (\infty, \bar w, 0)$, for some $\bar w$.
A solution which satisfies $A(r) > 1$ for some $r > 0$, is called a Reissner–Nordström-like (RNL) solution; see [14] for a discussion of these RNL solutions. The next two theorems disallow degenerate behavior of the function $A(r)$.
Theorem 2.2 ([12, 13]). Suppose that $(A(r), w(r))$ is a solution of (2.1), (2.2), and $\lim_{r\searrow\bar r} A(r) = 0 = \lim_{r\searrow\bar r} A'(r)$. Assume too that $A(r_1) > 0$ for some $r_1 > \max(\bar r, 1)$. Then $(A, w)$ is the extreme Reissner–Nordström (ERN) solution: $A(r) = \big(\frac{r-1}{r}\big)^2$, $w(r) \equiv 0$.
Theorem 2.3. Suppose $(A(r), w(r))$ is any solution of (2.1), (2.2), defined on an interval $r_1 \le r \le r_2$, and set $w_1 = w(r_1)$, $w_2 = w(r_2)$, and $M = \sup |Aw'^2(r)|$ for $r_1 \le r \le r_2$. Suppose $w_1 \le w(r) \le w_2$ for $r_1 \le r \le r_2$, and suppose further that there is a constant $\delta > 0$ such that $|\Phi(r)| \ge \delta$ on this $r$-interval. Then there exists a constant $\eta > 0$, depending only on $\delta$, $M$, and $|w_1 - w_2|$ such that $|r_1 - r_2| \ge \eta$.
We next recall the notions of particle-like and black hole solutions of the EYM equations. A (Bartnik–McKinnon) particle-like solution of (2.1), (2.2) is a solution defined for all $r \ge 0$, $A(0) = 1$, $(w^2(0), w'(0)) = (1, 0)$, and $w''(0) = -\lambda < 0$ is a free parameter; particle-like solutions are parametrized by (a discrete set of) $\lambda$: $(A(r, \lambda), w(r, \lambda))$.
Theorem 2.4 ([8, 9, 3]). There is an increasing sequence $\lambda_n \nearrow \bar\lambda \le 2$, where $w''(0, \lambda_n) = -\lambda_n$, such that the corresponding solutions $(A(r, \lambda_n), w(r, \lambda_n))$ are particle-like and $\lim_{r\to\infty}\big(A(r, \lambda_n), C(r, \lambda_n), w^2(r, \lambda_n), w'(r, \lambda_n)\big) = (1, 1, 1, 0)$, and $\mu_n \equiv \lim_{r\to\infty} r(1 - A(r, \lambda_n)) < \infty$. Moreover, $w(r, \lambda_n)$ has precisely $n$ zeros.
A black-hole solution of radius $\rho > 0$ of (2.1), (2.2) is a solution defined for all $r > \rho$, $\lim_{r\searrow\rho} A(r) = 0$, $A(r) > 0$ if $r > \rho$. It was shown in [9] that the functions $A$ and $w$ are analytic at $\rho$, and that $(w(\rho), w'(\rho))$ lies on the curve $C_\rho$ in the $w$–$w'$ plane given by
\[ C_\rho = \{(w, w') : \Phi(0, w, \rho)w' + uw = 0\}. \]
The curves $C_\rho$ differ depending on whether $\rho < 1$, $\rho = 1$, or $\rho > 1$; these are depicted in Figs. 1–3, below. On each of these figures we have indicated the sign of $\Phi(\rho)$ in the relevant regions by + or − signs. The components of $C_\rho$ for which $\Phi > 0$ correspond to (local) solutions for which $A'(\rho) > 0$, and (some) yield black-hole solutions. The other components correspond to (local) solutions with $A'(\rho) < 0$. Black-hole solutions can only emanate from the component of the curve containing Q (cf. Figs. 1–3). The orbits through P and R have $A(r) < 0$ for some $r > \rho$. Finally, we showed in [14] that the orbits through R correspond to RNL solutions.
[Figure 1: the curve $C_\rho$ in the $(w, w')$-plane for $\rho < 1$. The sketch marks the lines $w = \pm 1$, $w = \pm\sqrt{1-\rho}$ and $w = \pm\sqrt{1+\rho}$, the points P, Q, R, S, the RNL branches, and the sign of $\Phi$ in each region.]
Fig. 1. $C_\rho$ ($\rho < 1$)
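The defining relation of $C_\rho$ can be solved for $w'$ wherever $\Phi(0, w, \rho) \ne 0$, which gives a quick way to locate points on the curve and the values of $w$ at which it has no finite point. The following is a minimal sketch; the function names are hypothetical, and the statement that $\Phi(0, w, \rho)$ vanishes exactly where $w^2 = 1 \pm \rho$ follows from (2.5) with $A = 0$, $r = \rho$.

```python
from fractions import Fraction as F

def phi_horizon(w, rho):
    """Phi(0, w, rho) = rho - (1 - w^2)^2 / rho, i.e. (2.5) with A = 0 and r = rho."""
    return rho - (1 - w**2) ** 2 / rho

def slope_on_curve(w, rho):
    """The w' for which (w, w') lies on C_rho; undefined where Phi(0, w, rho) = 0."""
    phi = phi_horizon(w, rho)
    if phi == 0:
        raise ValueError("Phi(0, w, rho) vanishes: no finite w' on C_rho here")
    return -(1 - w**2) * w / phi

def singular_w_squared(rho):
    """Values of w^2 at which Phi(0, w, rho) = 0, namely w^2 = 1 - rho and 1 + rho."""
    return [s for s in (1 - rho, 1 + rho) if s >= 0]

rho, w = F(1, 2), F(1, 2)
wp = slope_on_curve(w, rho)
# The point satisfies the defining relation of C_rho exactly.
assert phi_horizon(w, rho) * wp + (1 - w**2) * w == 0
# By (2.6) at A = 0, the sign of A'(rho) is the sign of Phi(0, w, rho).
print(wp, phi_horizon(w, rho) > 0, singular_w_squared(rho))
```

For $\rho > 1$ only $w^2 = 1 + \rho$ survives in `singular_w_squared`, which matches the change in the number of marked vertical lines between the three figures.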
Black hole solutions are parametrized by $w(\rho)$, and the relevant theorem for black-hole solutions is:
Theorem 2.5 ([9]). Given any $\rho > 0$, there is a sequence $(\alpha_n, \beta_n) \in C_\rho$, where $\Phi(0, \alpha_n, \rho) \ne 0$, such that the corresponding solution $(A(r, \alpha_n), w(r, \alpha_n), w'(r, \alpha_n))$ of (2.1), (2.2) is defined for all $r > \rho$ and satisfies $A(r, \alpha_n) > 0$, and $w^2(r, \alpha_n) < 1$. Moreover, $\lim_{r\to\infty}(A(r, \alpha_n), w^2(r, \alpha_n), w'(r, \alpha_n)) = (1, 1, 0)$, $\lim_{r\to\infty} r(1 - A(r, \alpha_n)) < \infty$ and $w(r, \alpha_n)$ has precisely $n$ zeros.
Our final result classifies solutions which are well-behaved in the far-field. It does not describe the behavior of either the gravitational field or the YM field, inside a black hole; this is the subject dealt with in this paper.
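[Figure 2: the curve $C_\rho$ in the $(w, w')$-plane for $\rho = 1$. The sketch marks the lines $w = \pm 1$ and $w = \pm\sqrt{2}$, the points Q, R, S, the RNL branches, and the sign of $\Phi$ in each region.]
Fig. 2. $C_\rho$ ($\rho = 1$)
[Figure 3: the curve $C_\rho$ in the $(w, w')$-plane for $\rho > 1$. The sketch marks the lines $w = \pm 1$ and $w = \pm\sqrt{1+\rho}$, the points Q, R, S, the RNL branches, and the sign of $\Phi$ in each region.]
Fig. 3. $C_\rho$ ($\rho > 1$)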
Theorem 2.6 ([14]). Let $(A(r), w(r))$ be a solution of (2.1), (2.2) which is defined and smooth for $r > \bar r > 0$ and satisfies $A(r) > 0$ if $r > \bar r$. Then every such solution must be in one of the following classes:
(i) $A(r) > 1$ for all $r > 0$;
(ii) Schwarzschild Solution: $A(r) = 1 - \frac{2m}{r}$, $w^2 \equiv 1$, ($m$ = const.);
(iii) Reissner–Nordström Solution: $A(r) = 1 - \frac{c}{r} + \frac{1}{r^2}$, $w(r) \equiv 0$, ($c$ = const.);
(iv) Bartnik–McKinnon Particle-like Solution;
(v) Black-Hole Solution;
(vi) RNL Solution.
In each case, $\lim_{r\to\infty} w^2(r) = 1$ or $0$ ($0$ only for RN solutions), $\lim_{r\to\infty} rw'(r) = 0$ and $\lim_{r\to\infty} A(r) = 1$. The solution also has finite (ADM) mass; i.e. $\lim_{r\to\infty} r(1 - A(r)) < \infty$.
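The explicit solutions in classes (ii) and (iii) can be substituted directly into (2.1). The sketch below does this in exact arithmetic, with $A'$ entered by hand; it is an illustration only, the function names are hypothetical, and all arguments are assumed to be `Fraction` values. Equation (2.2) holds trivially for both families, since $w' = w'' = 0$ and $uw = 0$ when $w \equiv 0$ or $w^2 \equiv 1$.

```python
from fractions import Fraction as F

def residual_2_1(r, A, Ap, w, wp):
    """Left side minus right side of (2.1): r A' + (1 + 2 w'^2) A = 1 - u^2 / r^2."""
    u = 1 - w**2
    return r * Ap + (1 + 2 * wp**2) * A - (1 - u**2 / r**2)

def schwarzschild(r, m):
    """A = 1 - 2m/r with w = 1; returns (A, A', w, w')."""
    return 1 - 2 * m / r, 2 * m / r**2, F(1), F(0)

def reissner_nordstrom(r, c):
    """A = 1 - c/r + 1/r^2 with w = 0; returns (A, A', w, w')."""
    return 1 - c / r + 1 / r**2, c / r**2 - 2 / r**3, F(0), F(0)

for r in (F(1, 2), F(7, 5), F(3)):
    assert residual_2_1(r, *schwarzschild(r, F(3, 4))) == 0
    assert residual_2_1(r, *reissner_nordstrom(r, F(5, 2))) == 0
    # c = 2 gives the extreme Reissner-Nordstrom solution A = ((r - 1)/r)^2.
    assert reissner_nordstrom(r, F(2))[0] == ((r - 1) / r) ** 2

# For the ERN solution both A and A' vanish at r = 1.
A, Ap, _, _ = reissner_nordstrom(F(1), F(2))
assert (A, Ap) == (0, 0)
```

The last assertion is the double zero $A(\bar r) = 0 = A'(\bar r)$ at $\bar r = 1$ that singles out the ERN solution in Theorem 2.2.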
3. The zeros of A
In this section we shall prove that the zeros of $A(r)$ are discrete, except possibly for an accumulation point at $r = 0$. We shall also show that $A$ can have at most two zeros in the region $r \ge 1$. In proving these, we shall make use of Figs. 1–3. In the remainder of this paper we shall always assume that the following hypothesis (H) holds for a given solution $(A(r), w(r))$ of (1.3) and (1.4):
Hypothesis. There is an $r_1 > 1$ such that the solution $(A(r), w(r))$ is defined for all $r > r_1$, and $A(r_2) > 0$ for some $r_2 \ge r_1$.
Theorem 3.1. If the hypothesis (H) holds, then $A$ has at most a finite number of zeros in any interval of the form $\varepsilon \le r < \infty$, for any $\varepsilon > 0$. Furthermore, all the zeros of $A$, with at most two exceptions, lie in the set $r < 1$.
Note that from [12], if $A(\bar r) = 0 = A'(\bar r)$, for some $\bar r > 0$, then the solution is the extreme Reissner–Nordström (ERN) solution
\[ A(r) = \Big(\frac{r-1}{r}\Big)^2, \qquad w(r) \equiv 0. \]
For this solution, Theorem 3.1 clearly is valid. Thus, in this section we shall assume that if $A(\bar r) = 0$, then $A'(\bar r) \ne 0$. In this case from$^1$ ([10]), $\lim_{r\searrow\bar r} A(r) = 0$, $\lim_{r\searrow\bar r}(w(r), w'(r)) = (\bar w, \bar w')$ exists, and $(\bar w, \bar w') \in C_{\bar r}$; (cf. Figs. 1–3).
Proposition 3.2. $A$ cannot have more than two zeros in the region $r \ge 1$.
[Figure 4: two sketches of the graph of $A(r)$ against $r$, each with zeros at $\rho$ and $\eta$.]
Fig. 4.
Proof. Suppose that $A$ has 3 zeros in the region $r \ge 1$. Then there must exist $\rho$, $\eta$, $1 \le \rho < \eta$ with $A(\rho) = 0 = A(\eta)$ and $A'(\eta) < 0 < A'(\rho)$; cf. Fig. 4. Since $(w(\eta), w'(\eta)) \in C_\eta$ and $\eta > 1$, we see from Fig. 3 that $w^2(\eta) > 1$. Then from Theorem 2.1, (ii), $A$ cannot have any zeros if $r < \eta$. This contradiction establishes the result.
We next prove
Proposition 3.3. If $0 < r_0 < 1$, then $r_0$ cannot be a limit point of the zeros of $A$.
Notice that Theorem 3.1 follows at once from Propositions 3.2 and 3.3.
$^1$ In ([10]), the result was demonstrated for the case where $A(r) > 0$ for $r$ near $\bar r$, $r > \bar r$, but the same proof holds if $A(r) < 0$ for $r$ near $\bar r$, $r > \bar r$.
Proof. We shall show that there is a neighborhood of $r_0$ in which $A \ne 0$. Choose $\varepsilon > 0$ such that $r_0 + \varepsilon < 1$. We will show, using Theorem 2.3, that there exists an $\eta > 0$ such that if $z_1$ and $z_2$ are two consecutive zeros of $A$, $r_0 < z_1 < z_2 < r_0 + \varepsilon < 1$,
\[ A(z_1) = 0 = A(z_2), \qquad A'(z_1) > 0 > A'(z_2), \tag{3.1} \]
then
\[ z_2 - z_1 > \eta. \tag{3.2} \]
This implies that there can be at most a finite number of zeros of $A$ in the interval $(r_0, r_0 + \varepsilon)$. Now $A(z_2) = 0$ implies that $(w(z_2), w'(z_2))$ lies on $C_{z_2}$, and $A'(z_2) < 0$ implies that $(w(z_2), w'(z_2))$ lies on the middle curve in Fig. 1 (where $\rho$ is replaced by $z_2$). Without loss of generality, assume $w'(z_2) > 0$, $w(z_2) > 0$. Now define $\delta$ by
\[ (r_0 + \varepsilon) - \frac{1}{(r_0 + \varepsilon)} = -2\delta < 0. \]
Then there exists a constant $c > 0$ such that
\[ (r_0 + \varepsilon) - \frac{u^2}{(r_0 + \varepsilon)} \le -\delta < 0, \quad \text{if } |w| < c. \]
Hence
\[ r - \frac{u^2}{r} \le -\delta, \quad \text{if } r_0 \le r \le r_0 + \varepsilon, \]
and thus
\[ \Phi(r) = r - \frac{u^2}{r} - rA < -\delta, \quad z_1 \le r \le z_2, \tag{3.3} \]
since $A(r) > 0$ if $z_1 < r < z_2$. Let $w_1 = -c$, $w_2 = 0$; then there exist $r_1, r_2$, $z_1 < r_1 < r_2 < z_2$ such that $w(r_1) = -c$, $w(r_2) = 0$. (This is because $A$ cannot change sign in the interval $-c \le w \le 0$; cf. Fig. 1, and Fig. 5.)
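[Figure 5: the orbit in the $(w, w')$-plane between the curves $C_{z_1}$ and $C_{z_2}$, showing the lines $w = -1$, $w = -c = w_1$ and $w = 0 = w_2$, the points $r_1$, $r_2$ and $(w(z_2), w'(z_2))$, the locus $A = 0$, $f = 0$, and the point $r = a$ where $f(a) = 0$.]
Fig. 5.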
Now in view of (3.3), if we can show that there is an $M > 0$, $M$ independent of $z_1, z_2$ for which
\[ |Aw'^2| \le M, \quad r_1 \le r \le r_2 \ \ (\text{equivalently, } w_1 \le w(r) \le w_2), \tag{3.4} \]
then on the interval $w_1 \le w \le w_2$, we may apply Theorem 2.3 to conclude that (3.2) holds. Thus the proof of Proposition 3.3 will be complete once we prove (3.4); this is the content of the following lemma.
Lemma 3.4. If $-c \le w(r) \le 0$, then
\[ f(r) \equiv (Aw'^2)(r) < \frac{2}{r_0^2}. \tag{3.5} \]
Proof of Lemma 3.4. Recall that $f$ satisfies
\[ r^2f' + (2rf + \Phi)w'^2 + 2uww' = 0. \tag{3.6} \]
We first claim that there is a value $a < r_1$, with $f(a) = 0$, and for $a \le r < r_1$, $-1 \le w(r) \le -c$, and $w'(r) \ge 0$. Indeed, note that the orbit cannot exit the region $w^2 < 1$ through $w = -1$, for $r > z_1$, because by Theorem 2.1 (ii), there would be no zero of $A$ smaller than $r_1$. Therefore, either the point $(w(z_1), w'(z_1))$ lies in $-1 \le w \le -c$, $w' > 0$, in which case we take $a = z_1$, or else the orbit crosses the segment $-1 \le w \le -c$, $w' = 0$ at some $r = a$, and again $f(a) = 0$. We now prove
\[ \text{if } f(r) = \frac{2}{r_0^2}, \text{ then } f'(r) < 0, \tag{3.7} \]
for $r$ in the interval $(a, r_2)$. Since $f(a) = 0$, then if (3.7) holds, there can be no first value of $r$ for which $f(r) = \frac{2}{r_0^2}$, and hence (3.5) holds. Thus it suffices to prove (3.7). To do this, we first note that
\[ \Phi(r) \ge -\frac{1}{r_0}, \quad \text{if } a < r < r_2. \tag{3.8} \]
Indeed
\[ \Phi(r) = r(1 - A) - \frac{u^2}{r} \ge -\frac{u^2}{r} \ge -\frac{1}{r} \ge -\frac{1}{r_0}. \]
Now from (2.14), we have, when $f = \frac{2}{r_0^2}$,
\begin{align*}
r^2f'(r) &= -(2rf + \Phi)w'^2 - 2uww' \\
&= -\Big(2r\frac{2}{r_0^2} + \Phi\Big)w'^2 - 2uww' \\
&\le \Big[\Big(-\frac{4r}{r_0^2} + \frac{1}{r_0}\Big)w' + 2\Big]w' \\
&= \Big[2 + \frac{w'}{r_0}\Big(1 - \frac{4r}{r_0}\Big)\Big]w' \\
&\le \Big(2 - \frac{3w'}{r_0}\Big)w', \tag{3.9}
\end{align*}
where we have used (3.8). Now when $f = \frac{2}{r_0^2}$, $w'^2 = \frac{2}{r_0^2A} > \frac{2}{r_0^2}$, or $w' > \frac{\sqrt{2}}{r_0}$. Using this in (3.9) gives
\[ r^2f' \le \Big(2 - \frac{3\sqrt{2}}{r_0^2}\Big)w' < (2 - 3\sqrt{2})w' < 0, \]
and this gives (3.7). Thus the proof of Lemma 3.4 is complete, and as we have seen, this proves Proposition 3.3. 4. The Case A > 0 Near r0 In this section we shall first prove the equivalence of Theorems 1.1 and 1.2. Then we shall prove Theorem 1.2 in the case where A(r) > 0 for r near r0 , r > r0 . In view of Theorem 3.1, we know that A can have at most a finite number of zeros on the interval (r0 , ∞). Hence A(r) is of one sign for r > r0 . In this section we shall prove that if A(r) > 0 for r near r0 , the solution can be extended. The far more difficult case where A(r) < 0 for r near r0 , will be considered in Sect. 5. Proposition 4.1. Theorems 1.1 and 1.2 are equivalent. Proof. Assume that Theorem 1.1 holds, and that A(r) ¯ > 0, where r¯ is as given in the ¯ . If w2 (r) ¯ ≥ 1 and (ww0 )(r) ¯ > 0, then statement of Theorem 1.2. Consider w(r), ¯ w0 (r) from Theorem 2.1, i), the solution cannot exist for all r > r, ¯ and this contradicts our ¯ ≥ 1 and (ww0 )(r) ¯ < 0, then from Theorem 2.1, ii), the solution assumptions. If w2 (r) is an RNL solution and is thus defined for all r, 0 < r < r. ¯ Thus, we may assume that w2 (r) ¯ < 1. If w2 (r) ˜ > 1 for some r˜ > r, ¯ then (ww0 )(r) ˜ > 0, so again Theorem 2.1, i) implies that the solution is not defined in the far-field. Hence we may assume that the orbit stays in the region w2 (r) < 1 for all r > r. ¯ Moreover A(r) > 0 for all r > r¯ because A(r) = 0 for some r > r¯ > 1 cannot occur. (In w2 < 1, “crash" can occur only if r < 1; see [7].) Thus from [14, Proposition 6.2], limr→∞ µ(r) < ∞, hence Theorem 1.2 holds. Conversely, if Theorem 1.2 holds, then (1.6) implies that limr→∞ r(1 − A(r)) < ∞ so A(r) → 1 as r → ∞; in particular A(r) > 0 for r large. This implies that Theorem 1.1 holds. This last result justifies our assumption that in the remainder of this paper that the following hypothesis (H) holds for a given solution (A(r), w(r) of (1.3) and (1.4): Hypothesis. 
There is an $r_1 > 1$ such that the solution $(A(r), w(r))$ is defined for all $r > r_1$, and $A(r_2) > 0$ for some $r_2 \ge r_1$.

We now let $r_0$ be any given positive number, and assume that the solution $(A(r), w(r))$ of (1.3), (1.4) is defined for all $r > r_0$. We then have the following theorem:

Theorem 4.2. Assume that hypothesis (H) holds, and that $A(r) > 0$ for $r$ near $r_0$, $r > r_0$. Then the solution can be extended to an interval of the form $r_0 - \varepsilon < r \le r_0$.

Proof. It follows from Theorem 2.1 that either $w^2(r) < 1$ for all $r$ near $r_0$, or else $(A, w)$ is an RNL solution and is thus defined for $0 < r \le r_0$. In the case $w^2(r) < 1$ for all $r$ near $r_0$, if $A(r)$ is bounded away from zero for $r$ near $r_0$, then the solution must continue into a region of the form $(r_0 - \varepsilon, r_0]$, for some $\varepsilon > 0$. (The proof of this fact is the same if $A > 0$ or $A < 0$ near $r_0$. In (5.6) below we give the proof for $A < 0$, so we omit the proof here.) If, on the other hand, $A$ is not bounded away from zero near $r_0$, then $A(r_n) \to 0$ for some sequence $r_n \searrow r_0$. In [10], we have shown that this implies $\lim_{r \searrow r_0} A(r) = 0$ and $\lim_{r \searrow r_0} (w(r), w'(r)) \in C_{r_0}$, so the solution $(A, w)$ is analytic at $r_0$ and thus again continues past $r_0$; i.e., to an interval of the form $r_0 - \varepsilon \le r \le r_0$. This completes the proof of Theorem 4.2.

In the next section we shall consider the case where $A(r) < 0$ for $r$ near $r_0$, $r > r_0$.
Extendability of Solutions of Einstein–Yang/Mills Equations
5. The Case A < 0 Near r0

In this section we assume that the solution $(A, w)$ of (1.3), (1.4) is defined for all $r > r_0$, and that $A(r) < 0$ for $r$ near $r_0$, $r > r_0$. We shall prove that the solution can be continued past $r_0$. This is the content of the following theorem.

Theorem 5.1. Assume that hypothesis (H) holds and that $A(r) < 0$ for $r$ near $r_0$, $r > r_0$. Then the solution can be continued to an interval of the form $r_0 - \varepsilon < r \le r_0$.

Notice that Theorems 4.2 and 5.1 imply Theorem 1.2.

Proof. There are two cases to consider:

Case 1. There are positive numbers $\delta$ and $\Delta$ such that
\[ A(r) < -\delta, \quad \text{if } 0 < r_0 < r < r_0 + \Delta; \tag{5.1} \]

Case 2. There is a $\Delta > 0$ such that
\[ A(r) < 0, \quad \text{if } 0 < r_0 < r < r_0 + \Delta; \tag{5.2} \]
and
\[ \text{for some sequence } r_n \searrow r_0, \quad A(r_n) \to 0. \tag{5.3} \]
We begin the proof of Theorem 5.1 by first considering Case 1. We shall need a few preliminary results, the first of which is

Lemma 5.2. If (5.1) holds, and $w(r)$ is bounded near $r_0$ ($r > r_0$), then $w'(r)$ is bounded near $r_0$.

Proof. From (2.7), we can write
\[ w'' + \frac{\Phi}{r^2 A}\, w' = -\frac{uw}{r^2 A}. \tag{5.4} \]
Since
\[ \frac{\Phi}{r^2 A} = \frac{1}{rA} - \frac{1}{r} - \frac{u^2}{r^3 A}, \]
we see that both $\Phi/r^2A$ and $uw/r^2A$ are bounded near $r_0$. Thus the coefficients in (5.4) as well as the right-hand side are bounded, so $w'$ too is bounded near $r_0$.

Lemma 5.3. If $w'$ is bounded near $r_0$, then $A$ is bounded near $r_0$.

Proof. From (2.1), we have
\[ rA' + (1 + 2w'^2)A = 1 - \frac{u^2}{r^2}. \tag{5.5} \]
The hypothesis implies that $w$ is bounded near $r_0$, so the coefficients of (5.5) are bounded near $r_0$. Thus $A$ too is bounded near $r_0$.
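The identity used in the proof of Lemma 5.2 can be checked in one line. The sketch below assumes the definitions $u = 1 - w^2$ and $\Phi = r - rA - u^2/r$, which are read off from (5.30) and (5.31) later in this section rather than restated at this point.

```latex
\[
\frac{\Phi}{r^2 A}
  = \frac{1}{r^2 A}\Bigl(r - rA - \frac{u^2}{r}\Bigr)
  = \frac{1}{rA} - \frac{1}{r} - \frac{u^2}{r^3 A}.
\]
```

Under (5.1) we have $|A| > \delta$, and $u$ is bounded whenever $w$ is, so each term on the right is bounded for $r$ near $r_0 > 0$.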
These last two results enable us to dispose of the case where (5.1) holds, and also
\[ w(r) \text{ is bounded near } r_0 \ (r > r_0). \tag{5.6} \]
Since (5.6) holds, $A$, $w$, and $w'$ are bounded near $r_0$, and since by (5.1) $A(r) < -\delta$, we see from (1.3) and (1.4) that $A'$, $w'$ and $w''$ are bounded. Thus
\[ \lim_{r \searrow r_0} (A(r), w(r), w'(r), r) = (\bar A, \bar w, \bar w', r_0) \equiv P \]
exists, where $\bar A < 0$. Hence the orbit through $P$ is defined on an interval $r_0 - \varepsilon < r < r_0 + \varepsilon$, for some $\varepsilon > 0$.

Remark. We did not use the fact that $A < 0$ to obtain this conclusion; all we needed was $A$ bounded away from 0 and $w$ bounded near $r_0$.

We shall now show that in Case 1, $w$ must be bounded near $r_0$. To do this, we will assume that $w$ is unbounded near $r_0$, $r > r_0$, and we shall arrive at a contradiction. Thus, assume that for some $\varepsilon > 0$,
\[ w(r) \text{ is unbounded on } (r_0, r_0 + \varepsilon). \tag{5.7} \]
Lemma 5.4. If $A(r) < 0$ for $r$ near $r_0$, and (5.7) holds, then the projection of the orbit $(w(r), w'(r))$ has finite rotation about $(0, 0)$, and about $(\pm 1, 0)$, for $r$ near $r_0$.

Remark. Note that we do not assume (5.1) but only that $A < 0$ near $r_0$. In Case 2, we use the contrapositive of Lemma 5.4; i.e., if $A < 0$ for $r$ near $r_0$, and if the orbit has infinite rotation about either $(0, 0)$ or $(\pm 1, 0)$, then $w$ is bounded near $r_0$.

Proof. Assume that the orbit has infinite rotation about either $(0, 0)$ or $(\pm 1, 0)$; we will show that this leads to a contradiction. Since (5.7) holds, the orbit must rotate infinitely many times outside the region $w^2 \le 1$ as $r \searrow r_0$. We may also assume without loss of generality that $\liminf_{r \searrow r_0} w(r) = -\infty$. It follows that there exist sequences $\{r_n\}$, $\{s_n\}$, $r_{n+1} < s_{n+1} < r_n$, with $w'(r_n) = 0$, $w(s_n) = -2$, $\lim w(r_n) = -\infty$, and $w(r_n) < w(r) < w(s_n)$ for $r_n < r < s_n$; cf. Fig. 6.

Fig. 6. [Figure: the orbit in the $(w, w')$-plane, with the line $w = -2$ and the points corresponding to $r_n$ and $s_n$.]

We first show that for $w(r) \le -2$, $w'$ is bounded; i.e. (as in the proof of Lemma 3.4; cf. (3.7)),
\[ \text{if } w'(r) = \tfrac{2}{3}(r_0 + \varepsilon), \text{ then } w''(r) < 0. \tag{5.8} \]
To prove (5.8), we use (2.7):
\[ w'' = \frac{-uw}{r^2 A} - \frac{(r - rA)w'}{-r^2 A} + \frac{u^2 w'}{r^3 A} < \frac{u}{r^3 A}\,[-rw + uw']. \tag{5.9} \]
Thus, if for some $r > r_0$ with $w(r) \le -2$ we had $w'(r) = \tfrac{2}{3}(r_0 + \varepsilon)$, then since $\frac{w}{u} \le \frac{2}{3}$, it follows that
\[ w'(r) = \tfrac{2}{3}(r_0 + \varepsilon) > r\,\frac{w}{u}, \]
so that (5.9) implies (5.8). Thus, if $w(r) \le -2$, then $w'(r) < \tfrac{2}{3}(r_0 + \varepsilon)$. Since $s_n - r_n < \varepsilon$, we have, for large $n$,
\[ -2 - w(r_{n+1}) < \frac{2\varepsilon}{3}(r_0 + \varepsilon); \]
this violates (5.7).

Corollary 5.5. If (5.1) and (5.7) hold, then $\lim_{r \searrow r_0} |w(r)| = \infty$.

Proof. For $r$ near $r_0$, the lemma implies that the orbit has finite rotation near $r_0$. Thus the orbit must lie in one of the four strips $w < -1$, $-1 < w < 0$, $0 < w < 1$, $w > 1$. Since in each strip $w''$ is of fixed sign when $w' = 0$, it follows that $w'$ is of one sign near $r_0$, so that $w$ has a limit at $r_0$; since $w(r)$ is not bounded near $r_0$, the result follows.

Fig. 7. [Figure: the $(w, w')$-plane with the lines $w = -1$ and $w = 1$ and the regions labeled (1) to (8).]
It follows from the last result that if $w$ is unbounded near $r_0$, then the orbit must lie in either region (1) or region (5), as depicted in Fig. 7. We will assume that the orbit lies in region (5) for $r$ near $r_0$; the proof for region (1) is similar, and will be omitted. Thus, assuming (5.1) and (5.7), we have
\[ w'(r) < 0 \text{ near } r_0, \quad \text{and} \quad \lim_{r \searrow r_0} w(r) = +\infty. \tag{5.10} \]
Since $r_0$ is finite, (5.10) implies
\[ w'(r) \text{ is unbounded for } r \text{ near } r_0 \ (r > r_0). \tag{5.11} \]

Lemma 5.6. If $A(r) < 0$ for $r$ near $r_0$ ($r > r_0$), and (5.10) holds, then
\[ \lim_{r \searrow r_0} w'(r) = -\infty. \tag{5.12} \]

Remark. We do not use hypothesis (5.1) in this lemma, but only assume $A < 0$ near $r_0$. This result will be used in Case 2.
Proof. If $w'$ does not have a limit at $r_0$, then in view of (5.11), we can find sequences $r_n \searrow r_0$, $s_n \searrow r_0$, $r_{n+1} < s_{n+1} < r_n$, such that
\[ w'(s_n) = -n, \qquad w'(r_n) = -\frac{n}{2}, \tag{5.13} \]
and
\[ \text{if } r_n \le r \le s_n, \quad w'(r) \le -\frac{n}{2}. \tag{5.14} \]
Then if $r_n \le r \le s_n$ and $n$ is large, (2.2) gives
\begin{align*}
-w''(r) &= \frac{\bigl(r - rA - \frac{u^2}{r}\bigr)w' + uw}{r^2 A} \\
 &= \frac{\bigl(-r^2 A - \frac{u^2}{2}\bigr)w' + \bigl(ruw + r^2 w' - \frac{u^2}{2} w'\bigr)}{r^3 A} \\
 &< \frac{-w'\bigl(-r^2 A - \frac{u^2}{2}\bigr)}{-r^3 A} \\
 &< \frac{-w'(-r^2 A)}{-r^3 A} = \frac{-w'}{r} < \frac{-w'}{r_0}.
\end{align*}
Thus
\[ \frac{-w''}{-w'} < \frac{1}{r_0}, \]
and so integrating from $r_n$ to $s_n$ gives
\[ \ln 2 = \ln \frac{-w'(s_n)}{-w'(r_n)} < \frac{1}{r_0}(s_n - r_n), \tag{5.15} \]
so that
\[ s_n - r_n > r_0 \ln 2. \tag{5.16} \]
But for large $n$, $r_n < 1 + r_0$, so that (5.16) implies
\[ 1 = (1 + r_0) - r_0 \ge \sum_n (s_n - r_n) = \infty. \]
This contradiction establishes (5.12) and the proof of the lemma is complete.
Thus to dispense with Case 1, and obtain the desired contradiction (assuming that $w$ is unbounded near $r_0$), we shall prove the following proposition.

Proposition 5.7. It is impossible for (5.1) and (5.7) to hold.

To prove this proposition, we shall obtain an estimate of the form
\[ w''(r) \le k(-w'(r)) \tag{5.17} \]
for $r$ near $r_0$. Integrating from $r > r_0$ to $r_1 > r$ gives
\[ \ln \frac{-w'(r)}{-w'(r_1)} \le k(r_1 - r), \]
and this shows that $w'$ is bounded near $r_0$, thereby violating (5.12). In order to prove (5.17), we need two lemmas, the first of which is
Lemma 5.8. If $A(r) < 0$ for $r$ near $r_0$ ($r > r_0$), and both (5.10) and (5.12) hold, then writing $Aw'^2 = f$, we have
\[ -f(r) > w(r)^5, \quad \text{if } r \text{ is near } r_0. \tag{5.18} \]

Remark. We do not assume that (5.1) holds, but only that $A < 0$ near $r_0$. This result too will be used in Case 2.

Proof of Lemma 5.8. We write (2.14) in the form (cf. (2.5))
\[ r^2 f' + rf w'^2 + (rf + r - rA)w'^2 + \Bigl[\frac{-u^2}{r} w'^2 + 2uww'\Bigr] = 0. \tag{5.19} \]
Now for $r$ near $r_0$,
\[ rf + r - rA = rAw'^2 + r - rA = rA(w'^2 - 1) + r \le 0, \tag{5.20} \]
in view of (5.12). Furthermore, if $r$ is near $r_0$,
\[ \frac{-u^2}{r} w'^2 + 2uww' < 0 \tag{5.21} \]
because of (5.10) and (5.12). Thus (5.19)-(5.21) imply $r^2 f' + rf w'^2 > 0$, so that for $r$ near $r_0$,
\[ f' > -f\,\frac{-w'}{r}\,(-w') > f w', \]
or $f'/f < w'$. Integrating from $r$ to $r_1$, where $r_0 < r < r_1$ and $r_1$ is close to $r_0$, gives
\[ \ln(-f)\Big|_r^{r_1} < w(r_1) - w(r), \]
so that
\[ \ln(-f(r)) > w(r) - k_1, \]
where $k_1 = w(r_1) - \ln(-f(r_1))$. Exponentiating gives
\[ -f(r) > k_2 e^{w(r)} > w(r)^5 \]
for $r$ near $r_0$, in view of Corollary 5.5.
We shall use this last lemma for proving the following result.

Lemma 5.9. Assume that (5.1) and (5.10) hold. Then there is a constant $k > 0$ such that
\[ -A(r) > k\,w(r)^4 \tag{5.22} \]
for $r$ near $r_0$.

Proof. From (2.1), if $r$ is near $r_0$,
\[ A' = \frac{\Phi}{r^2} - \frac{2}{r} f \ge \Bigl(\frac{-f}{r} - \frac{u^2}{r^3}\Bigr) - \frac{f}{r} > \frac{-f}{r}, \]
where we have used (5.18). Thus, using (5.12),
\[ A' > \frac{-Aw'^2}{r} > k_3 A w', \]
for some $k_3 > 0$. It follows that for some constant $k > 0$,
\[ -A(r) > e^{k_3 w} > k w^4, \]
if $r$ is near $r_0$.
We can now complete the proof of Proposition 5.7. As we have seen earlier, it suffices to prove (5.17). Now since we are in region (5) (cf. Fig. 7), $uw < 0$, so that for $r$ near $r_0$, (2.2) gives
\[ w'' < \frac{(r - rA)w' - \frac{u^2}{r} w'}{-r^2 A} < \frac{u^2 w'}{r^3 A} < \frac{u^2 w'}{r_0^3 A} < c_1 \frac{w^4 w'}{A}, \]
where $c_1$ is a positive constant. Thus,
\[ w'' < \frac{c_1 w^4}{-A}(-w') < \frac{c_1}{k}(-w') \equiv c(-w'), \]
where we have used (5.22). This proves (5.17), and as we have seen, completes the proof of Proposition 5.7.

We now consider Case 2, where (5.2) and (5.3) hold; we will show that
\[ \lim_{r \searrow r_0} A(r) = 0, \tag{5.23} \]
and
\[ \lim_{r \searrow r_0} (w(r), w'(r)) = (\bar w, \bar w') \in C_{r_0}. \tag{5.24} \]
Remark. If (5.23) and (5.24) hold, then by the uniqueness theorem of [10], the solution is analytic at $r_0$ and hence continues past $r_0$.

We begin with the following result.

Proposition 5.10. If (5.2) and (5.3) hold, then the orbit has finite rotation about $(0, 0)$ in the $(w - w')$-plane; i.e. $\Omega < \infty$.

The proof will follow from a series of lemmas, the first of which is

Lemma 5.11. Assume that (5.2) and (5.3) hold. If the rotation $\Omega = \infty$, or if $w$ is bounded near $r_0$, then
\[ \lim_{r \searrow r_0} A(r) = 0. \tag{5.25} \]

Proof. If $\Omega = \infty$, then $w$ is bounded near $r_0$, by Lemma 5.4 and the remark following. Thus we will prove that if $w$ is bounded near $r_0$, then (5.25) holds, or equivalently, that
\[ \lim_{r \searrow r_0} \mu(r) \equiv \lim_{r \searrow r_0} r(1 - A(r)) = r_0. \tag{5.26} \]
Since $A(r) < 0$ for $r$ near $r_0$, $\mu(r) = r(1 - A(r)) \ge r > r_0$ if $r > r_0$, so since $A(r_n) \to 0$,
\[ \liminf_{r \searrow r_0} \mu(r) = r_0. \tag{5.27} \]
We shall next prove
\[ \limsup_{r \searrow r_0} \mu(r) \le r_0, \tag{5.28} \]
and this together with (5.27) will prove (5.26).
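The bound on $\mu'$ used in the remainder of this proof is quoted from (2.15). As a check, and assuming $\Phi = r - rA - u^2/r$ together with the form $rA' = \Phi/r - 2Aw'^2$ of (2.6) that appears in (5.48), it can be recovered as follows.

```latex
\begin{align*}
\mu'(r) &= \frac{d}{dr}\bigl[r(1 - A)\bigr] = 1 - A - rA' \\
        &= 1 - A - \Bigl(\frac{\Phi}{r} - 2Aw'^2\Bigr) \\
        &= 1 - A - \Bigl(1 - A - \frac{u^2}{r^2}\Bigr) + 2Aw'^2 \\
        &= 2Aw'^2 + \frac{u^2}{r^2}.
\end{align*}
```

Since $A < 0$ near $r_0$, the first term is nonpositive, which gives $\mu' \le u^2/r^2$; this is bounded when $w$ is bounded.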
If $\limsup_{r \searrow r_0} \mu(r) > r_0$, then we can find numbers $b$ and $c$, $b > c > r_0$, and sequences $\{s_n\}$, $\{t_n\}$, $r_0 < t_{n+1} < s_n < t_n$, with $\mu(s_n) = c$, $\mu(t_n) = b$. Thus $b - c = \mu(t_n) - \mu(s_n) = \mu'(\xi)(t_n - s_n)$, where $\xi$ is an intermediate point. Now from (2.15), for $r$ near $r_0$,
\[ \mu'(r) = 2Aw'^2 + \frac{u^2}{r^2} \le \frac{u^2}{r^2} \le k, \]
since $w$ is assumed to be bounded. Hence $(b - c) < k(t_n - s_n)$, or $t_n - s_n > (b - c)/k > 0$. This is a contradiction since $\sum_n (t_n - s_n)$ is finite. Thus (5.28) holds and the proof is complete.

Combining Lemmas 5.4 and 5.11, we get as an immediate corollary,

Corollary 5.12. If (5.2) and (5.3) hold, and $\Omega = \infty$, then $\Phi(r)$ is bounded for $r$ near $r_0$.

We next have

Lemma 5.13. If (5.2) and (5.3) hold, and $w$ is bounded near $r_0$, then either $Aw'^2$ is bounded near $r_0$, or $\lim_{r \searrow r_0} (Aw'^2)(r) = -\infty$.

Proof. We write $f = Aw'^2$, and again use (2.14):
\[ r^2 f' + (2rf + \Phi)w'^2 + 2uww' = 0. \tag{5.29} \]
If $f$ is not bounded near $r_0$, then since $\Phi$ and $w$ are bounded (Lemma 5.11), (5.29) shows that $f' > 0$ if $|f|$ is sufficiently large, and the result follows.

Lemma 5.14. If (5.2) holds, and $Aw'^2$ is bounded near $r_0$, then the rotation number $\Omega$ is finite.

Proof. We are going to apply Theorem 2.3 with $w_1 = -1$, $w_2 = -1 + \varepsilon$, for some $\varepsilon > 0$. Thus assume $\Omega = \infty$; then there exists a sequence $r_0^n \searrow r_0$ with $w(r_0^n) = 0$, $w'(r_0^n) > 0$. Since $A < 0$ near $r_0$, the orbit cannot cross the segment $w' = 0$, $-1 \le w \le 0$ for $r < r_0^n$. Thus we can find $\varepsilon > 0$ and numbers $r_{-1}^n$ and $r_{-1+\varepsilon}^n$ such that $w(r_{-1}^n) = -1$, $w(r_{-1+\varepsilon}^n) = -1 + \varepsilon$, and for $r_{-1}^n \le r \le r_{-1+\varepsilon}^n$ we have $-1 < w(r) < -1 + \varepsilon$, and for $r_{-1+\varepsilon}^n \le r \le r_0^n$, $-1 + \varepsilon < w(r) < 0$. By hypothesis, $Aw'^2$ is bounded near $r_0$, so in particular on $r_{-1}^n \le r \le r_{-1+\varepsilon}^n$, for large $n$. In order to apply Theorem 2.3, it only remains to show that $\Phi(r)$ is bounded away from 0 on this interval if $\varepsilon$ is small. Choose $\varepsilon > 0$ so small that
\[ (1 - w^2)^2 < \tfrac{1}{10} r_0^2, \quad \text{if } -1 \le w \le -1 + \varepsilon. \tag{5.30} \]
On this interval,
\[ \Phi = r - rA - \frac{u^2}{r} > r - \frac{u^2}{r} > r_0 - \frac{.1 r_0^2}{r_0} = .9 r_0. \tag{5.31} \]
Now by Theorem 2.3, there exists an $\eta > 0$ such that for each $n$,
\[ r_{-1+\varepsilon}^n - r_{-1}^n \ge \eta. \]
But as $r_{-1+\varepsilon}^n$ and $r_{-1}^n$ both lie in $(r_0, r_0 + 1)$ for large $n$, we have
\[ 1 = (r_0 + 1) - r_0 \ge \sum_n \bigl(r_{-1+\varepsilon}^n - r_{-1}^n\bigr) = \infty, \]
and this is a contradiction.

Our final lemma in the proof of Proposition 5.10 is the following

Lemma 5.15. If (5.2) and (5.3) hold, and $\Omega = \infty$, then $Aw'^2$ is bounded near $r_0$.

Proof. By Corollary 5.12, $\Phi$ is bounded. From (2.6), if $Aw'^2 \to -\infty$, then as $r \searrow r_0$,
\[ rA' = -2Aw'^2 + \frac{\Phi}{r} \longrightarrow +\infty, \]
and this contradicts (5.3).

Note that Lemmas 5.14 and 5.15 prove Proposition 5.10.

Corollary 5.16. If (5.2) and (5.3) hold, then $w(r)$ is of one sign for $r$ near $r_0$.

We next show that for $r$ near $r_0$,
\[ \text{either } w^2(r) > 1 \text{ or } w^2(r) < 1; \tag{5.32} \]
that is, either $w < -1$, or $-1 < w < 0$, or $0 < w < 1$, or $w > 1$. To prove this we need two lemmas, the first of which is:

Lemma 5.17. If (5.2) and (5.3) hold, then $\lim_{r \searrow r_0} w^2(r) = 1$ is not possible.

Proof. Suppose (for definiteness) that $\lim_{r \searrow r_0} w(r) = -1$. With $\varepsilon$ defined by (5.30), we see that for $r$ near $r_0$, $-1 - \varepsilon \le w(r) \le -1 + \varepsilon$. On this interval, (5.31) implies $\Phi(r) > .9 r_0$. Then from (2.6),
\[ rA' = -2Aw'^2 + \frac{\Phi}{r} > \frac{.9 r_0}{r} > 0, \]
and this contradicts (5.3).

We next show that the orbit has finite rotation about $(1, 0)$ in the case $w > 0$ near $r_0$, or about $(-1, 0)$ in case $w < 0$.

Lemma 5.18. If (5.2) and (5.3) hold and $w > 0$ for $r$ near $r_0$, then the projection of the orbit in the $w - w'$ plane has finite rotation about $(1, 0)$. Similarly, if $w < 0$ for $r$ near $r_0$, then the projection of the orbit in the $w - w'$ plane has finite rotation about $(-1, 0)$.

Proof. Suppose $w > 0$ near $r_0$ (the proof for $w < 0$ is similar, and will be omitted), and the orbit has infinite rotation about $(1, 0)$. Since $\lim_{r \searrow r_0} w(r) \ne 1$, we must have either $\limsup_{r \searrow r_0} w(r) > 1$ or $\liminf_{r \searrow r_0} w(r) < 1$. In either case we repeat the argument of Lemma 5.14 using the $w$-interval $[1, 1 + \varepsilon]$ or $[1 - \varepsilon, 1]$. We have that $\Phi$ is bounded away from 0 by (5.31). By Lemma 5.13, either $(Aw'^2)(r) \to -\infty$ as $r \searrow r_0$, or $Aw'^2$ is bounded near $r_0$. We rule out the case $Aw'^2 \to -\infty$ because $w'$ is of one sign; hence $Aw'^2$ is bounded near $r_0$. Using Theorem 2.3 exactly as in Lemma 5.14, we have that the orbit can cross the line $w = 1$ a finite number of times. Thus $w > 1$ or $w < 1$ for $r$ near $r_0$.
Summarizing, we have

Corollary 5.19. For $r$ near $r_0$, precisely one of the following holds: $w(r) < -1$, $-1 < w(r) < 0$, $0 < w(r) < 1$, or $w(r) > 1$.

Since $w''$, when $w' = 0$, has a fixed sign in each of the four strips, we see that $w'$ must have a fixed sign for $r$ near $r_0$; i.e., the projection of the orbit in the $w - w'$ plane must lie in one of the 8 regions depicted in Fig. 7. Since we now have the orbit confined to one of these 8 regions, without loss of generality we will consider the case where $w' < 0$. We will first show that the orbit cannot lie in regions (6) or (8) for $r$ near $r_0$. Then we will show that if the orbit is in regions (5) or (7), and $w'$ is bounded near $r_0$, then $\lim_{r \searrow r_0} A(r) = 0$ and $\lim_{r \searrow r_0} (w(r), w'(r))$ exists and lies on $C_{r_0}$; hence the orbit continues past $r_0$. We complete the proof of Theorem 5.1 by showing that the case where $w'$ is unbounded near $r_0$ cannot occur.

Lemma 5.20. If (5.2) and (5.3) hold, then the orbit cannot lie in regions (6) or (8) for $r$ near $r_0$.

Proof. In regions (6) and (8), $w$ is bounded near $r_0$. Thus from Lemma 5.11,
\[ \lim_{r \searrow r_0} A(r) = 0. \tag{5.33} \]
If $v = Aw'$, then from (2.13) we see $v' \le 0$, so $\lim_{r \searrow r_0} v(r) = L > 0$ exists. Thus writing $Aw'^2 = \frac{v^2}{A}$, we see that
\[ \lim_{r \searrow r_0} (Aw'^2)(r) = -\infty. \tag{5.34} \]
Since $w$ is bounded near $r_0$, (5.33) implies that $\Phi$ is bounded near $r_0$. Thus, from (2.6),
\[ rA' = \frac{\Phi}{r} - 2Aw'^2 \longrightarrow +\infty \]
as $r \searrow r_0$. However, this contradicts (5.3).

We now consider the case where (5.2) and (5.3) hold, and the orbit lies in one of the regions (5) or (7) for $r$ near $r_0$, $r > r_0$. We first consider the case where $w'$ is bounded.

Lemma 5.21. Suppose that (5.2) and (5.3) hold, and that the orbit lies in either region (5) or (7) for $r$ near $r_0$. If $w'(r)$ is bounded near $r_0$, then $\lim_{r \searrow r_0} A(r) = 0$, $\lim_{r \searrow r_0} (w(r), w'(r)) = (\bar w, \bar w')$ exists, and $(\bar w, \bar w')$ lies on $C_{r_0}$.

Note that in view of our remark preceding Proposition 5.10, Lemma 5.21 implies that Theorem 5.1 holds in this case.

Proof. First note that since $w'$ is bounded, $w$ is bounded, and hence Lemma 5.11 implies that
\[ \lim_{r \searrow r_0} A(r) = 0. \tag{5.35} \]
Now as $A \to 0$, and $w$ has a limit, we see that $\Phi = r - rA - u^2/r$ has a limit; call this limit $\Phi_0$; i.e.
\[ \Phi_0 = \lim_{r \searrow r_0} \Phi(r). \tag{5.36} \]
If $\Phi_0 \ne 0$, then as $\lim_{r \searrow r_0} v(r) = 0$ we may apply L'Hospital's rule to obtain
\[ \lim_{r \searrow r_0} w'(r) = \lim_{r \searrow r_0} \frac{v(r)}{A(r)} = \lim_{r \searrow r_0} \frac{v'(r)}{A'(r)} = \lim_{r \searrow r_0} \frac{\dfrac{-2w'^2 v}{r} - \dfrac{uw}{r^2}}{\dfrac{\Phi}{r^2} - \dfrac{2Aw'^2}{r}} = \lim_{r \searrow r_0} \frac{-uw}{\Phi}, \]
where we have used (2.6) and (2.13). Thus
\[ \lim_{r \searrow r_0} w'(r) = \lim_{r \searrow r_0} \frac{-uw}{\Phi}. \tag{5.37} \]
We claim that
\[ \Phi_0 \ne 0. \tag{5.38} \]
Note that if (5.38) holds, then since $w$ has a finite limit at $r_0$, (5.37) implies that $\lim_{r \searrow r_0} w'(r)$ exists and is finite, and
\[ \lim_{r \searrow r_0} (w(r), w'(r)) \in C_{r_0}. \]
So, to complete the proof of Lemma 5.21, it suffices to prove (5.38). Thus, assume $\Phi_0 = 0$; we show this leads to a contradiction. If $(uw)(r_0) \ne 0$, then (5.37) implies that $w'(r)$ is unbounded near $r_0$, and this is a contradiction. Hence we may assume $(uw)(r_0) = 0$. If $u(r_0) = 0$, then
\[ 0 = \Phi_0 = r_0 - \frac{u_0^2}{r_0} = r_0, \]
and this is a contradiction since $r_0 > 0$. Thus we may assume $w(r_0) = 0$. In this case
\[ 0 = \Phi_0 = r_0 - \frac{1}{r_0}, \]
so that $r_0 = 1$. Note too that if $w(r_0) = 0$, the orbit lies in region (7) for $r$ near $r_0$. We now have
\[ A(r_{n+1}) - A(r_n) = (r_{n+1} - r_n)A'(\xi), \tag{5.39} \]
where $r_n > \xi > r_{n+1} > 1$. From (2.6),
\[ \xi A'(\xi) = 1 - A(\xi) - \frac{u^2(\xi)}{\xi^2} - 2(Aw'^2)(\xi). \tag{5.40} \]
Since $\xi > 1$, $1 - \frac{u^2(\xi)}{\xi^2} > 0$, so for large $n$, (5.40) implies $A'(\xi) > 0$. Using this in (5.39) gives $0 > A(r_n) > A(r_{n+1})$, and this violates (5.3). Thus (5.38) holds and the proof is complete.
We now consider the case where (5.2) and (5.3) hold, the orbit is in region (5) or (7), and $w'(r)$ is unbounded for $r$ near $r_0$, $r > r_0$. We shall show that this case is impossible. First note that if $w$ is bounded near $r_0$, it follows from Lemma 5.11 that
\[ \lim_{r \searrow r_0} A(r) = 0. \tag{5.41} \]
Since $w' < 0$, $\lim_{r \searrow r_0} w(r)$ exists. Thus if $w$ is bounded near $r_0$, $\lim_{r \searrow r_0} \Phi(r)$ exists and is finite; say
\[ \lim_{r \searrow r_0} \Phi(r) = \Phi_0. \tag{5.42} \]
We now have

Proposition 5.22. If (5.2) and (5.3) hold, and $w'$ is unbounded near $r_0$, then $w$ cannot be bounded near $r_0$; in particular the orbit cannot lie in region (7).

Proof. Suppose that $w(r)$ is bounded for $r$ near $r_0$; we will show that this leads to a contradiction. Thus, in this case (5.41) holds and $\Phi_0$ is finite. We consider 3 cases, $\Phi_0 > 0$, $\Phi_0 < 0$, $\Phi_0 = 0$, and we will obtain contradictions in all cases.

Case 1. $\Phi_0 > 0$. From (2.6), for $r$ near $r_0$,
\[ A'(r) = \frac{\Phi}{r^2} - \frac{2Aw'^2}{r} > 0, \]
and this violates (5.3); thus Case 1 cannot occur.

Case 2. $\Phi_0 < 0$. We first show
\[ \lim_{r \searrow r_0} w'(r) = -\infty. \tag{5.43} \]
To see this, note that if (5.43) were false, then as $w'$ is unbounded near $r_0$, there would exist a sequence $s_n \searrow r_0$ such that $w'(s_n) < -n$ and $w''(s_n) = 0$. Then from (2.7),
\[ 0 = s_n^2 (Aw'')(s_n) + \Phi(s_n) w'(s_n) + (uw)(s_n) = \Phi(s_n) w'(s_n) + (uw)(s_n) \longrightarrow \infty \]
as $n \to \infty$. This contradiction implies that (5.43) holds. Now if $f = Aw'^2$, then from (2.14),
\[ r^2 f' + (2rf + \Phi)w'^2 + 2uww' = 0, \tag{5.44} \]
and since $(2rf + \Phi)$ is strictly negative near $r_0$ and $w$ is bounded near $r_0$, it follows from (5.43) that $f'(r) > 0$ if $r$ is near $r_0$. Thus
\[ \lim_{r \searrow r_0} f(r) = L < 0 \]
exists, where $L \ge -\infty$. We claim that
\[ L = -\infty. \tag{5.45} \]
To see this, we note first that
\[ (w'^2 v)(r) = w'(r) f(r) \to +\infty \quad (v = Aw'), \tag{5.46} \]
so that (cf. (2.13)),
\[ v' = \frac{-2w'^2 v}{r} - \frac{uw}{r^2} \to -\infty, \]
since $w$ is bounded near $r_0$. Hence, if $r_0 < r < r_1$, and $r_1$ is near $r_0$, $v(r_1) < v(r)$, so
\[ (Aw'^2)(r) = v(r)w'(r) < v(r_1)w'(r), \]
and as $v(r_1)w'(r) \to -\infty$, we see that $(Aw'^2)(r) \to -\infty$ as $r \searrow r_0$; thus (5.45) holds. Now again using (2.6),
\[ rA'(r) = -2(Aw'^2)(r) + \frac{\Phi}{r} \to +\infty \]
as $r \searrow r_0$. But this violates (5.3); hence Case 2 cannot occur. We now turn to the final case,

Case 3. $\Phi_0 = 0$. The proof in this case relies on Theorem 2.2. Indeed, we will show that $\lim_{r \searrow r_0} A'(r) = 0$, and from (5.41), $\lim_{r \searrow r_0} A(r) = 0$. This is enough to invoke Theorem 2.2, to conclude that $w(r) \equiv 0$ and thus $w'(r) \equiv 0$; this violates the assumption that $w'$ is unbounded. We first show
\[ \liminf_{r \searrow r_0} A'(r) \le 0. \tag{5.47} \]
Indeed, if $\liminf_{r \searrow r_0} A'(r) > 0$, then for $r > r_0$, $r$ near $r_0$,
\[ 0 > A(r) = A(r) - A(r_0) = (r - r_0)A'(\xi) > 0, \]
where $\xi$ is an intermediate point. This contradiction establishes (5.47). Next, since
\[ rA' = \frac{\Phi}{r} - 2Aw'^2, \tag{5.48} \]
it follows from (5.47) that $\liminf_{r \searrow r_0} \bigl(\frac{\Phi}{r} - 2Aw'^2\bigr) \le 0$, so $\liminf_{r \searrow r_0} \bigl(\frac{\Phi_0}{r_0} - 2Aw'^2\bigr) \le 0$, or
\[ 0 \ge \limsup_{r \searrow r_0} 2Aw'^2 \ge \frac{\Phi_0}{r_0} = 0; \]
thus
\[ \limsup_{r \searrow r_0} Aw'^2 = 0. \tag{5.49} \]
We next show
\[ \liminf_{r \searrow r_0} Aw'^2 = \limsup_{r \searrow r_0} Aw'^2. \tag{5.50} \]
(Note that if (5.50) holds, then $\lim_{r \searrow r_0} Aw'^2 = 0$, so from (5.48) $A'(r_0) = 0$. Thus the proof of Proposition 5.22 will be complete once we prove (5.50).) So suppose that there is an $\eta > 0$ such that
\[ \liminf_{r \searrow r_0} Aw'^2 \le -2\eta. \tag{5.51} \]
Then in view of (5.49), if $f = Aw'^2$, we can find a sequence $s_n \searrow r_0$ such that $f(s_n) = -\eta$, $f'(s_n) < 0$. Since (5.41) holds, we have $A(s_n) \to 0$, so that $w'(s_n) \to -\infty$. From (5.44),
\[ s_n^2 f'(s_n) + (-2s_n\eta + \Phi(s_n))w'^2(s_n) + 2(uww')(s_n) = 0. \tag{5.52} \]
But as $f'(s_n) < 0$ and $w'^2(s_n) \to \infty$, we see that (5.52) cannot hold for large $n$. Thus (5.50) holds and this implies $\lim_{r \searrow r_0} Aw'(r) = 0$, and thus by Theorem 2.2, we have a contradiction.

We now consider the final case in the proof of Theorem 5.1, namely in regions (5) or (7),
\[ w \text{ and } w' \text{ are unbounded near } r_0. \tag{5.53} \]
(Of course, this implies that we are in region (5).) Note too that in this case we have
\[ \lim_{r \searrow r_0} w(r) = +\infty. \tag{5.54} \]

Proposition 5.23. If (5.2) and (5.3) hold, and the orbit lies in region (5), then (5.54) cannot hold.

Note that once Proposition 5.23 is established, this will complete the proof of Theorem 5.1.

Proof. From our remark following the statement of Lemma 5.6, we have
\[ \lim_{r \searrow r_0} w'(r) = -\infty. \tag{5.55} \]
Then as we have remarked earlier, (5.18) holds; i.e. $-Aw'^2 > w^5$ for $r$ near $r_0$. Thus, from (5.48), for $r$ near $r_0$,
\[ rA'(r) = -2f + \frac{\Phi}{r} > 2w(r)^5 + 1 - A - \frac{u^2}{r^2} > 0, \]
since $u^2$ is of order $w^4$, and this contradicts (5.3).
6. Miscellaneous Results and Open questions

In Sect. 3, we proved that the zeros of $A$ are discrete, except possibly at $r = 0$. This leads to the first question.

1. Can $r = 0$ be a limit point of zeros of $A$? We conjecture that the answer is no. In a recent paper [4, p. 8, $\ell$ 7], the authors assume that the answer is no. A rigorous proof of this would be welcome. A related question is

2. Do there exist solutions of the EYM equations for which $A$ has more than two zeros? A negative answer obviously implies a negative answer to question 1. In [5], the authors have numerically obtained a solution having two zeros. This leads to the next problem.

3. Give a rigorous proof of the existence of a global solution of the EYM equations (other than the classical Reissner–Nordström solution), where $A$ has two zeros.
4. A subject of much current interest is the study of solutions near $r = 0$ [4, 5]. If, as we suspect, Question 1 has a negative answer, then every solution near $r = 0$ has either $A > 0$ or $A < 0$. If $A > 0$ near $r = 0$, then we have proved in [10] that either $\lim_{r \searrow 0} A(r) = 1$, in which case the solution is particle-like, or else $\lim_{r \searrow 0} A(r) = +\infty$, in which case the solution is a Reissner–Nordström-like (RNL) solution [14]; this case is re-discussed in [4]. If $A < 0$ near $r = 0$, much less is known. In [14], we proved the following theorem:

Theorem 6.1. Given any triple of the form $q = (1, b, c)$, there exists a unique local RNL solution $(A_q(r), w_q(r))$ satisfying $\lim_{r \searrow 0} rA(r) = b$, $w_q(0) = 1$, $w_q''(0) = c$, and the solution depends continuously on these values. If $b < 0$, then $\lim_{r \searrow 0} A(r) = -\infty$, and $\lim_{r \searrow 0} (w^2(r), w'(r)) = (1, 0)$.

These solutions have been termed Schwarzschild-like [5]. In [5], the authors also investigated RNL solutions but they mistakenly omitted the 2-parameter family of solutions that have $w(0) = 0$. These solutions have the following asymptotic form near $r = 0$:
\[ A(r) = \frac{1}{r^2} + \frac{b}{r} + \text{h.o.t.}, \qquad w(r) = cr^3 + \text{h.o.t.} \]
These solutions are interesting since they give rise to asymptotically flat solutions with half-integral rotation numbers; see [14]. In addition there are solutions which have $w^2(0) = 1$; these solutions have the following asymptotic form near $r = 0$:
\[ A(r) = \frac{b}{r} + \text{h.o.t.}, \qquad w(r) = \pm 1 + cr^2 + \text{h.o.t.} \]
There is still another type of local solution (discussed in [5]), having $A < 0$ near $r = 0$, but these do not appear to give rise to asymptotically flat global solutions [5]. We are thus led to the following "trichotomy conjecture":

Conjecture. If $(A(r), w(r))$ is a globally defined solution of the EYM equations (1.3), (1.4), then
\[ \lim_{r \searrow 0} A(r) = -\infty, \ \text{or } +1, \ \text{or } +\infty. \]
In view of our above remarks concerning the behavior of solutions if $A(r) > 0$ near $r = 0$, this conjecture can be rephrased as:

Conjecture. If $(A(r), w(r))$ is a globally defined solution to the EYM equations (1.3), (1.4), and $A(r) < 0$ for $r$ near 0, then $\lim_{r \searrow 0} A(r) = -\infty$.

5. Another interesting question is the following: Does there exist a solution to the EYM equations (1.3), (1.4), where $A(r) < 0$ in a neighborhood of $r = \infty$? We conjecture that the answer to this question is negative. If our conjecture is true, this would enable us to drop the hypothesis $A(\bar r) > 0$ in Theorem 1.2. If, on the other hand, the conjecture is false, then we can show that the orbit must have infinite rotation in the $(w, w')$-plane and $w$ must be unbounded.

6. Using the methods in [7–9], we have proved the following theorem
Theorem 6.2. There is a continuous 2-parameter family of solutions $(A_{\alpha,\beta}(r), w_{\alpha,\beta}(r))$ to the EYM equations (1.3), (1.4), defined in the far-field, which are analytic functions of $s = \frac{1}{r}$. That is, if $(A(r), w(r))$ is a solution to the EYM equations (1.3), (1.4) which is asymptotically flat, and is analytic in $s = \frac{1}{r}$, then $(A(r), w(r)) = (A_{\alpha,\beta}(r), w_{\alpha,\beta}(r))$ for some pair of parameter values $(\alpha, \beta)$.

(We omit the details of the proof as they are similar to those in [7].)

In the above theorem, one parameter is the (ADM) mass $\beta$, and in fact, $A(s = 0) = 1$ and $\frac{dA}{ds}\big|_{s=0} = -\beta$. The other parameter is $\alpha = \frac{dw}{ds}\big|_{s=0}$, and $w^2(s = 0) = 1$; cf. [10]. It follows from the results in [10 or 14] that the (ADM) mass $\beta$ is finite for any solution which is defined in the far-field. Moreover, for such solutions $\lim_{r \to \infty} r w'(r) = 0$; cf. [9]. We do not know whether $\lim_{r \to \infty} r^2 w'(r) \equiv \lim_{s \to 0} \frac{dw(s)}{ds}$ exists. This leads to the next question: Is every asymptotically flat solution to the EYM equations (1.3), (1.4) analytic in $s = \frac{1}{r}$ at $s = 0$? If the answer is affirmative, then we may consider the $(\alpha, \beta)$-plane as representing those solutions having the following asymptotic form near $s = 0$:
\[ A(s) = 1 - \beta s + \text{h.o.t.}, \qquad w(s) = 1 - \alpha s + \text{h.o.t.}, \]
and all such solutions are described by a point in the $(\alpha, \beta)$-plane (or in the corresponding plane for $w(s = 0) = -1$), or they correspond to the 1-parameter family of classical Reissner–Nordström solutions: $A(r) = 1 - \frac{c}{r} + \frac{1}{r^2}$, $w(r) \equiv 0$.
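As a sanity check on the classical Reissner–Nordström family just mentioned, the following Python sketch integrates the system numerically with $w \equiv 0$ and compares the result against the closed form $A(r) = 1 - c/r + 1/r^2$. It assumes the equations in the form $rA' = \Phi/r - 2Aw'^2$ and $r^2 A w'' + \Phi w' + uw = 0$, with $u = 1 - w^2$ and $\Phi = r - rA - u^2/r$, as they are used in Sect. 5. The function names, the fixed-step RK4 scheme and the choice $c = 1$ are illustrative assumptions, not taken from the paper.

```python
def reissner_nordstrom_A(r, c):
    """Closed-form A(r) = 1 - c/r + 1/r^2 of the classical family (w == 0)."""
    return 1.0 - c / r + 1.0 / (r * r)


def eym_rhs(r, state):
    """Right-hand side of the first-order system for (A, w, w').

    Assumed form (as used in Sect. 5):
        r A'      = Phi / r - 2 A w'^2
        r^2 A w'' = -(Phi w' + u w)
    with u = 1 - w^2 and Phi = r - r A - u^2 / r.
    """
    A, w, wp = state
    u = 1.0 - w * w
    Phi = r - r * A - u * u / r
    dA = (Phi / r - 2.0 * A * wp * wp) / r
    dwp = -(Phi * wp + u * w) / (r * r * A)
    return (dA, wp, dwp)


def integrate_eym(r_start, state, r_end, steps):
    """Classical fixed-step RK4 from r_start to r_end (illustrative only)."""
    h = (r_end - r_start) / steps
    r = r_start
    y = tuple(state)
    for _ in range(steps):
        k1 = eym_rhs(r, y)
        k2 = eym_rhs(r + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = eym_rhs(r + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = eym_rhs(r + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(
            yi + h / 6 * (a + 2 * b + 2 * d + e)
            for yi, a, b, d, e in zip(y, k1, k2, k3, k4)
        )
        r += h
    return y


if __name__ == "__main__":
    c = 1.0  # with c = 1, A(r) has no zeros, so the w''-equation stays regular
    start = (reissner_nordstrom_A(2.0, c), 0.0, 0.0)
    A_end, w_end, wp_end = integrate_eym(2.0, start, 5.0, 2000)
    print("numerical A(5):", A_end)
    print("closed form A(5):", reissner_nordstrom_A(5.0, c))
```

With $w \equiv 0$ the system reduces to the linear equation $rA' = 1 - A - 1/r^2$, so the comparison only tests the $A$-equation; values of $c$ for which $A$ vanishes on the integration interval would make the division by $r^2 A$ singular and are outside the scope of this sketch.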
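Fig. 8. [Figure: the $(\alpha, \beta)$-plane, showing the regions of RNL solutions, the Schwarzschild solutions, the level $\beta = 2$, curves labeled $\Omega = 1$, $\Omega = 2$, ..., $\Omega = n$, and the points $P_1$, $P_2$, ..., $P_n$.]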
We consider the $(\alpha, \beta)$-plane as depicted in Fig. 8. In this plane, certain regions are easy to identify. Thus, points with $\alpha < 0$ correspond to RNL solutions. Similarly, the region $\alpha > 0$, $\beta < 0$ also corresponds to RNL solutions. The line $\alpha = 0$ corresponds to Schwarzschild solutions with mass $\beta$. Particle-like and black-hole solutions must lie in the first quadrant $\alpha > 0$, $\beta > 0$. Presumably, there are a countable number of curves in the first quadrant distinguished by the number of zeros of $w$, parametrized by $\rho$, the event horizon. (These are schematically depicted in Fig. 8, where the points $P_n$ correspond to particle-like solutions and the $\beta$ coordinate of $P_n$ tends to 2 as $n \to \infty$; cf. [11].) There are also a countable number of points in this quadrant which correspond to particle-like solutions. All other solutions in this quadrant are RNL solutions.
Thus, near any particular black-hole solution, there are global solutions which are neither black-hole nor particle-like solutions; i.e., they must be RNL solutions. This follows since any point in this plane represents a global solution (from our results in this paper, cf. Theorem 1.2). Thus for any such global solution $(A, w)$, either $A$ has a zero, in which case the corresponding point $(\alpha, \beta)$ lies on one of the above-mentioned countable number of curves, or it is one of the countable number of particle-like solutions, or it is an RNL solution [10, 14]. It follows that in any neighborhood of a black-hole solution $(A_0(r), w_0(r))$ there are RNL solutions. In particular, if $A_0(r_1) = -\eta < 0$, then arbitrarily close to this solution, there are solutions $(A(r), w(r))$ having $A(r_1) > 0$. This is a spectacular example of non-continuous dependence on initial conditions.

References

1. Bartnik, R., and McKinnon, J.: Particle-like solutions of the Einstein–Yang–Mills equations. Phys. Rev. Lett. 61, 141–144 (1988)
2. Bizon, P.: Colored black holes. Phys. Rev. Lett. 64, 2844–2847 (1990)
3. Breitenlohner, P., Forgács, P. and Maison, D.: Static spherically symmetric solutions of the Einstein–Yang–Mills equations. Commun. Math. Phys. 163, 141–172 (1994)
4. Breitenlohner, P., Lavrelashvili, G. and Maison, D.: Mass inflation and chaotic behavior inside hairy black holes. gr-qc/9703047
5. Donets, E.E., Gal'tsov, D.V., and Zotov, M. Yu: Internal structure of Einstein–Yang–Mills black holes. gr-qc/9612067
6. Künzle, H.P., and Masood-ul-Alam, A.K.M.: Spherically symmetric static SU(2) Einstein–Yang–Mills fields. J. Math. Phys. 31, 928–935 (1990)
7. Smoller, J., Wasserman, A., Yau, S.-T., McLeod, J.: Smooth static solutions of the Einstein–Yang Mills equations. Commun. Math. Phys. 143, 115–147 (1991)
8. Smoller, J., Wasserman, A.: Existence of infinitely-many smooth static, global solutions of the Einstein/Yang–Mills equations. Commun. Math. Phys. 151, 303–325 (1993)
9. Smoller, J., Wasserman, A., and Yau, S.-T.: Existence of black hole solutions for the Einstein–Yang/Mills equations. Commun. Math. Phys. 154, 377–401 (1993)
10. Smoller, J., and Wasserman, A.: Regular solutions of the Einstein–Yang/Mills equations. J. Math. Phys. 36, 4301–4323 (1995)
11. Smoller, J. and Wasserman, A.: Limiting masses of solutions of Einstein–Yang/Mills equations. Physica D. 93, 123–136 (1996)
12. Smoller, J. and Wasserman, A.: Uniqueness of extreme Reissner–Nordström solution in SU(2) Einstein–Yang/Mills theory for spherically symmetric spacetime. Phys. Rev. D. (15 Nov. 1995), 52, 5812–5815 (1995)
13. Smoller, J., and Wasserman, A.: Uniqueness of zero surface gravity SU(2) Einstein–Yang/Mills black holes. J. Math. Phys. 37, 1461–1484 (1996)
14. Smoller, J., and Wasserman, A.: Reissner–Nordström-like solutions of the SU(2) Einstein–Yang/Mills equations. J. Math. Phys. 38, 6522–6559 (1997)
15. Straumann, N., and Zhou, Z.: Instability of a colored black hole solution. Phys. Lett. B. 243, 33–35 (1990)
16. Ershov, A.A., and Galtsov, D.V.: Non abelian baldness of colored black holes. Phys. Lett. A. 150, 747, 160–164 (1989)
17. Lavrelashvili, G., and Maison, D.: Regular and black-hole solutions of Einstein–Yang/Mills dilaton theory. Phys. Lett. B. 295, 67 (1992)
18. Volkov, M.S. and Gal'tsov, D.V.: Black holes in Einstein–Yang/Mills theory. Sov. J. Nucl. Phys. 51, 1171 (1990)
19. Volkov, M.S., and Gal'tsov, D.V.: Sphalerons in Einstein–Yang/Mills theory. Phys. Lett. B. 273, 273 (1991)

Communicated by A. Jaffe