Commun. Math. Phys. 244, 1 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0990-6 Published online: 2 December 2003 © Springer-Verlag 2003
Communications in
Mathematical Physics
Editorial The Journal’s Mission and Standards of Presentation
The mission of Communications in Mathematical Physics is to offer a high forum for works which are motivated by the vision and the challenges of modern physics and which at the same time meet the highest mathematical standards. The above is a broad calling, as it encompasses different subfields, which in practice have varying foundations, different subcommunities of practitioners, and which may be at different stages of progress towards mathematical maturity. As a primary instruction, we call upon the authors to keep both the above stated goals and the broad audience in mind, and we offer the following suggestions to this end.

• State the results in a broadly accessible way.
• Bear in mind that the case for publication in CMP is enhanced by a clear indication of the work’s relation to physics issues, or to mathematics which have sprung from such a relation.
• Ensure that the status of the results presented in the work is unambiguously understood: whether they are established rigorously, i.e. proven within a mathematically precise framework, or according to other standards, for example granting the validity of some particular physical framework.
• As far as possible, formulate clear definitions and state theorems, as facilitated by the formalism used. Conjectures, clearly stated as such, also have their place.

Finally, it is important to put the work into context, stating for a broad audience both the background and the essential advances being made.

While none of this is new, and the best papers in CMP have always exemplified these standards, the breadth of our readership as well as of our contributors makes it appropriate to reemphasize the importance of these requirements. They can be met in different ways; one possibility is to include an introductory section aimed at a broader readership than the body of the paper, where one may assume a somewhat higher level of familiarity with the subject. Both authors and referees are asked to consider these requirements.
In some areas this may be viewed as an addition of another hurdle to an already demanding refereeing process, but we see this as an important part of the growth of mathematics and physics. We hope that the authors, referees, and readers will agree that the goals, which include a strong field and a vital mathematical physics community, are well worth this effort. The Editors
Commun. Math. Phys. 244, 3–27 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0973-7
Projectively and Conformally Invariant Star-Products

C. Duval¹, A.M. El Gradechi²,³, V. Ovsienko⁴

¹ Université de la Méditerranée and CPT-CNRS, Luminy Case 907, 13288 Marseille, Cedex 9, France. E-mail: [email protected]
² Faculté des Sciences, Université d’Artois, 62307 Lens, France
³ CPT-CNRS, Luminy Case 907, 13288 Marseille, Cedex 9, France. E-mail: [email protected]
⁴ Institut Girard Desargues, Université Claude Bernard Lyon 1, 69622 Villeurbanne, Cedex, France. E-mail: [email protected]

Received: 24 December 2002 / Accepted: 28 May 2003
Published online: 18 November 2003 – © Springer-Verlag 2003
Abstract: We consider the Poisson algebra $S(M)$ of smooth functions on $T^*M$ which are fiberwise polynomial. In the case where $M$ is locally projectively (resp. conformally) flat, we seek the star-products on $S(M)$ which are $\mathrm{SL}(n+1,\mathbb{R})$ (resp. $\mathrm{SO}(p+1,q+1)$)-invariant. We prove the existence of such star-products using the projectively (resp. conformally) equivariant quantization, then prove their uniqueness, and study their main properties. We finally give an explicit formula for the canonical projectively invariant star-product.
1. Introduction

The deformation quantization program initiated in the seventies [3] was aimed at defining an autonomous quantization method based on Gerstenhaber’s general theory of deformation of algebraic structures [28]. The original idea was to view quantum mechanics as a one-parameter deformation of classical mechanics, more precisely, a one-parameter deformation of the algebraic structures underlying classical mechanics. If $P$ is a Poisson manifold, then $C^\infty(P)$ is naturally equipped with two algebraic structures, namely, the associative and commutative pointwise multiplication and the Lie algebra defined by the Poisson bracket. The deformed algebraic structure, describing the quantum mechanical counterpart of $(C^\infty(P), \cdot, \{\cdot,\cdot\})$, is $(C^\infty(P)[[\hbar]], \star)$, where the operation $\star$, called star-product, is an associative (but non-commutative) product on $C^\infty(P)[[\hbar]]$ deforming the commutative multiplication in the direction of the Poisson bracket. More precisely:

Definition 1.1. Let $P$ be a Poisson manifold and $C^\infty(P)$ the space of smooth complex-valued functions on $P$. A star-product on $P$ is an associative algebra structure on $C^\infty(P)[[\hbar]]$, denoted $\star$, and given by a linear map from $C^\infty(P) \otimes C^\infty(P)$ to $C^\infty(P)[[\hbar]]$, extended by linearity to $C^\infty(P)[[\hbar]] \otimes C^\infty(P)[[\hbar]]$, such that
$$F \star G = F \cdot G + \frac{i\hbar}{2}\,\{F, G\} + \sum_{r=2}^{\infty} (i\hbar)^r B_r(F, G). \tag{1.1}$$
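As an illustration of Definition 1.1, the following sketch implements the Moyal star-product on $\mathbb{R}^2$ (recalled later in this Introduction) for polynomial functions, and checks the first-order term of (1.1) and associativity on a few examples. It is not taken from the paper: the exponential formula used for the Moyal product is the standard one, the sign convention $\{F,G\} = \partial_p F\,\partial_x G - \partial_x F\,\partial_p G$ is an assumption chosen to match the moment map conventions of Sect. 2.3, the variable `t` stands for $i\hbar$, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, perm

# A polynomial in (x, p, t) is a dict {(a, b, r): c} meaning c * x^a * p^b * t^r,
# where t stands for the formal parameter i*hbar (an illustrative convention).

def add(F, G, sign=1):
    """Return F + sign * G, dropping zero coefficients."""
    out = dict(F)
    for key, c in G.items():
        out[key] = out.get(key, 0) + sign * c
        if out[key] == 0:
            del out[key]
    return out

def mul(F, G):
    """Commutative pointwise product."""
    out = {}
    for (a1, b1, r1), c1 in F.items():
        for (a2, b2, r2), c2 in G.items():
            key = (a1 + a2, b1 + b2, r1 + r2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: c for k, c in out.items() if c != 0}

def diff(F, nx, n_p):
    """Apply (d/dx)^nx (d/dp)^n_p."""
    return {(a - nx, b - n_p, r): c * perm(a, nx) * perm(b, n_p)
            for (a, b, r), c in F.items() if a >= nx and b >= n_p}

def poisson(F, G):
    """Assumed convention: {F, G} = dF/dp dG/dx - dF/dx dG/dp."""
    return add(mul(diff(F, 0, 1), diff(G, 1, 0)),
               mul(diff(F, 1, 0), diff(G, 0, 1)), sign=-1)

def moyal(F, G):
    """Standard Moyal product: sum over r of (t/2)^r / r! times the r-th power
    of the Poisson bidifferential operator (finite on polynomials)."""
    top = max((a + b for (a, b, _) in F), default=0)
    out = {}
    for r in range(top + 1):
        for s in range(r + 1):
            coeff = Fraction((-1) ** s * comb(r, s), 2 ** r * factorial(r))
            term = mul(diff(F, s, r - s), diff(G, r - s, s))
            out = add(out, {(a, b, h + r): coeff * c
                            for (a, b, h), c in term.items()})
    return out

x, p = {(1, 0, 0): 1}, {(0, 1, 0): 1}
commutator = add(moyal(p, x), moyal(x, p), sign=-1)  # p * x - x * p
```

With these conventions the commutator of `p` and `x` equals `t` times their Poisson bracket, which is the statement that the product deforms the multiplication "in the direction of the Poisson bracket".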
In the mathematical literature $\hbar$ is a formal parameter, whereas in physical applications $\hbar$ is Planck’s constant. There are usually three extra requirements for star-products:

C1. The constant function 1 is the unit of $(C^\infty(P)[[\hbar]], \star)$, namely $1 \star F = F \star 1 = F$;
C2. The star-product is symmetric, viz. $\overline{F \star G} = \overline{G} \star \overline{F}$;
C3. The bilinear maps $B_r$ are given by bidifferential operators.

Note that Condition C2 is sometimes called the parity condition.

The first reported star-product appeared in the work of Grœnewold [31]. It was derived from the Weyl-Wigner quantization on $P = \mathbb{R}^{2n}$. It is nowadays more commonly known as the Moyal star-product; Moyal actually obtained the Lie algebra bracket associated with Grœnewold’s star-product [39]. This first star-product was later rediscovered by Vey [44].

The general problem of existence of star-products was raised in [3]. Using cohomological techniques, De Wilde and Lecomte [16] proved the existence of star-products on any symplectic manifold. A geometric proof of the same result together with an algorithmic construction was obtained by Fedosov [24, 25] (see [45] for a survey of this construction and [41] for an alternative approach). More recently, Kontsevich proved an existence theorem for an arbitrary Poisson manifold, giving explicit formulæ for $P = \mathbb{R}^n$ [34]. Operadic and quantum field theoretic interpretations of Kontsevich’s result were later given by Tamarkin [43] and by Cattaneo and Felder [11], respectively.

The problem of the uniqueness of star-products is usually studied modulo equivalence (see Sect. 2.2 for definitions and [32, 14] for recent developments). However, extra conditions can sometimes be imposed to single out a canonical star-product. For instance, Gutt [33] proved that the Moyal star-product is the unique $(\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n})$-invariant and covariant star-product on $\mathbb{R}^{2n}$.

The notion of a $G$-invariant star-product, where $G$ is a Lie group of Poisson automorphisms of $P$, was introduced in [3] (see Sect. 2.1 for definitions). Existence of a $G$-invariant star-product on a symplectic manifold was proved by Lichnerowicz [38] for any compact Lie group $G$ of symplectomorphisms. More recently, Fedosov [26] constructed a $G$-invariant star-product on a symplectic manifold endowed with a $G$-invariant symplectic connection.

In this article, we deal with cotangent bundles $P = T^*M$ equipped with their canonical symplectic structure, and restrict considerations to the Poisson algebra $S(M)$ of smooth functions on $T^*M$ polynomial on fibers. We furthermore assume $M$ to be a smooth $n$-dimensional manifold endowed with either a projectively or a conformally flat structure, i.e., $M$ admits a (locally defined) action of either $\mathrm{SL}(n+1,\mathbb{R})$ or $\mathrm{SO}_0(p+1,q+1)$, the connected component of the pseudo-orthogonal group with $n = p + q$. The basic example of a projectively (resp. conformally) flat manifold is $\mathbb{RP}^n$ (resp. $(S^p \times S^q)/\mathbb{Z}_2$).

Denote by $G$ either the projective or the conformal group. We study, in the present article, $G$-invariant star-products on $T^*M$, where the $G$-action is the canonical lift of the natural action on the base. Our first result, Theorem 5.1, establishes the uniqueness of a $G$-invariant homogeneous star-product on $S(M)$. Our second result, Theorem 5.7, proves the uniqueness of a $G$-invariant star-product modulo $G$-equivalence and reparametrization.
Let us emphasize that we do not assume conditions C1, C2 and C3 a priori. It turns out that C1 and C2 are automatically satisfied while C3 does not hold; in fact, the maps $B_r$ in (1.1) are pseudo-differential bilinear operators. Our $G$-invariant star-products cannot be obtained by Fedosov’s or Kontsevich’s constructions, as the latter lead to bidifferential star-products.

The existence of $G$-invariant star-products on $S(M)$ is based on the existence of a $G$-equivariant quantization map [37, 23] (see also [22]). The latter is the unique (up to normalization) isomorphism of $G$-modules, $Q_\lambda : S(M) \to D_\lambda(M)$, where $D_\lambda(M)$ is the space of differential operators acting on tensor densities of degree $\lambda$. Such a quantization map defines a $G$-invariant associative product on $S(M)$ which turns out to be a star-product for $\lambda = \frac{1}{2}$, as proved in [9, 23]. The existence and uniqueness results of the present article represent the deformation quantization counterparts of those obtained for $G$-equivariant quantization. In both situations invariance properties ensure uniqueness.

The pseudo-differential nature of the $G$-invariant star-products has been revealed by Brylinski [9] and Astashkevich and Brylinski [2]. In the latter reference, invariant star-products on minimal nilpotent coadjoint orbits of semi-simple Lie groups have been investigated. These results are closely related to ours since these orbits are punctured cotangent bundles $T^*M \setminus M$; nevertheless the Poisson algebras considered in [2] are smaller than $S(M)$. Moreover, our approach provides explicit formulæ in the projective case, answering a question raised in [2].

The paper is organized as follows. In Sect. 2 we recall the notions of invariant and equivalent star-products, and we give a short account of equivariant quantization for cotangent bundles. In Sect. 3, we define projective and conformal geometries and determine the ring of projectively/conformally invariant linear operators on $S(M)$.
The existence of $G$-invariant star-products on $T^*M$, along with a few of their properties, is proved in Sect. 4. Sect. 5 contains our uniqueness theorems. In Sect. 6, we give an autonomous derivation of the canonical projectively invariant star-product on $S(\mathbb{RP}^n)$, based only on projective invariant theory. Explicit formulæ are then provided. We end this paper with Sect. 7, where we gather our conclusion, a discussion and a few perspectives.
2. Invariant Star-Products and Equivariant Quantization

In this section we introduce the general notions of invariance and covariance of star-products with respect to a Hamiltonian action of a connected Lie group $G$.
2.1. Invariant, covariant and strongly invariant star-products. First of all, let us give the precise definition of an invariant star-product already mentioned in the Introduction.

Definition 2.1. Given a Poisson action of a Lie group $G$ on a Poisson manifold $P$, a star-product $\star$ on $C^\infty(P)$ is called $G$-invariant if
$$g^*(F \star G) = g^*F \star g^*G \tag{2.1}$$
for all $F, G \in C^\infty(P)[[\hbar]]$ and $g \in G$.

In the case where the $G$-action is Hamiltonian one has the following supplementary notions:
Definition 2.2. Consider a Hamiltonian $G$-action on a Poisson manifold $P$ with associated equivariant moment map $J : P \to \mathfrak{g}^*$, where $\mathfrak{g}^*$ is the dual of the Lie algebra $\mathfrak{g}$ of $G$. A star-product $\star$ on $P$ is called
a) $G$-covariant if
$$J_X \star J_Y - J_Y \star J_X = i\hbar\,\{J_X, J_Y\}, \tag{2.2}$$
b) strongly $G$-invariant if
$$J_X \star F - F \star J_X = i\hbar\,\{J_X, F\} \tag{2.3}$$
for all $F \in C^\infty(P)[[\hbar]]$ and $X, Y \in \mathfrak{g}$, where $J_X$ is the Hamiltonian function on $P$ corresponding to $X$.

Remark 2.3. Note that a different terminology is sometimes attached to this last notion in the literature. What we call here strong $G$-invariance corresponds to the notion of preferred observables in [3, 18] and to Property IP2 in [1]. Beware that, in the latter reference, strong invariance means covariance and invariance.

Let us now recall the following useful result.

Proposition 2.4 ([1]). If a star-product is strongly $G$-invariant, then it is both $G$-invariant and $G$-covariant.

Proof. Using the definition (2.3) of strong invariance, we write
$$\begin{aligned}
i\hbar\,\{J_X, F \star G\} &= J_X \star F \star G - F \star G \star J_X \\
&= J_X \star F \star G - F \star J_X \star G + F \star J_X \star G - F \star G \star J_X \\
&= i\hbar\,\big(\{J_X, F\} \star G + F \star \{J_X, G\}\big),
\end{aligned}$$
which is nothing but the infinitesimal version of formula (2.1) expressing the invariance property. The $G$-invariance of the star-product then follows from the connectedness of $G$. As for covariance, it is an immediate consequence of strong invariance.

Remark 2.5. The converse of Proposition 2.4 is proved in [1] under the additional assumption of a transitive $G$-action.

2.2. Equivalence, G-equivalence and reparametrization. In the traditional classification of star-products one introduces a notion of equivalence. Two star-products, $\star$ and $\star'$, are called equivalent if there exists a formal series
$$\Phi = \mathrm{Id} + i\hbar\,\Phi_1 + (i\hbar)^2\,\Phi_2 + \cdots, \tag{2.4}$$
where $\Phi_i : C^\infty(P) \to C^\infty(P)$ are some linear operators, such that
$$\Phi(F) \star' \Phi(G) = \Phi(F \star G). \tag{2.5}$$
Usually, one also allows for formal changes of the parameter of deformation:
$$\mu : i\hbar \mapsto i\hbar + \sum_{k=2}^{\infty} a_k\,(i\hbar)^k, \tag{2.6}$$
where $a_k \in \mathbb{R}$, in order to comply with Property C2 from the Introduction. For $G$-invariant star-products it is natural to consider the notion of $G$-equivalence.
Definition 2.6 ([38]). Two equivalent $G$-invariant star-products are called $G$-equivalent if each map $\Phi_i$ in (2.4) is $G$-equivariant.

The condition for two star-products to be $G$-equivalent is much stronger than the usual condition of equivalence (see [4] for recent developments).

2.3. Equivariant quantization and the associated invariant star-product. Equivariant quantization as developed in [37, 23, 21, 22] applies to cotangent bundles. From here on we restrict ourselves to $P = T^*M$ endowed with its canonical symplectic form. Let $S(M) \subset C^\infty(T^*M)$ be the space of (complex-valued) functions on $T^*M$ polynomial on fibers, and $D(M)$ be the space of linear differential operators acting on $C^\infty(M)$. The space $S(M)$ is the space of symbols of operators in $D(M)$; it has a natural grading
$$S(M) = \bigoplus_{k=0}^{\infty} S_k(M) \tag{2.7}$$
by the degree of homogeneity.

Let $F_\lambda(M)$ be the space of tensor densities of degree $\lambda \in \mathbb{C}$ on $M$, i.e., the space of sections of the complex line bundle $|\Lambda^n T^*M|^\lambda \otimes \mathbb{C}$. In local coordinates such densities are of the form
$$f(x^1, \ldots, x^n)\,|dx^1 \wedge \cdots \wedge dx^n|^\lambda \tag{2.8}$$
with $f \in C^\infty(M)$. Denote by $D_\lambda(M)$ the space of linear differential operators on $F_\lambda(M)$; it has a natural filtration $D_\lambda^0(M) \subset D_\lambda^1(M) \subset \cdots \subset D_\lambda^k(M) \subset \cdots$ such that $S(M) = \mathrm{gr}(D_\lambda(M))$.

Definition 2.7. A quantization map is an invertible linear map $Q_\lambda : S(M) \to D_\lambda(M)[\hbar]$ which preserves the principal symbol in the following sense: for a homogeneous polynomial $F \in S_k(M)$, the principal symbol of the differential operator $Q_\lambda(F)$ is equal to $(i\hbar)^k F$.

There is a natural action of the group of diffeomorphisms, $\mathrm{Diff}(M)$, on $F_\lambda(M)$, denoted by $g_\lambda : F_\lambda(M) \to F_\lambda(M)$ for all $g \in \mathrm{Diff}(M)$. We will rather use the corresponding action of the Lie algebra of vector fields, $\mathrm{Vect}(M)$, which is given by
$$L^\lambda_X f = X^i\,\frac{\partial f}{\partial x^i} + \lambda\,\frac{\partial X^i}{\partial x^i}\,f \tag{2.9}$$
for all $X = X^i\,\partial/\partial x^i \in \mathrm{Vect}(M)$, with the local identification $F_\lambda(M) \cong C^\infty(M)$ made in (2.8). (We will use Einstein’s summation convention throughout this article.) Note that the expression (2.9) is, indeed, independent of the choice of a coordinate system.

The canonical lift of the $\mathrm{Diff}(M)$-action to $T^*M$ is automatically Hamiltonian with moment map $J$ given by
$$J_X = \xi_i X^i \in S_1(M). \tag{2.10}$$
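Formula (2.9) can be exercised in the simplest case $n = 1$ with polynomial coefficients. The sketch below, which is only an illustration and not part of the paper, checks on one example that $X \mapsto L^\lambda_X$ respects the Lie bracket of vector fields, i.e. $[L^\lambda_X, L^\lambda_Y] = L^\lambda_{[X,Y]}$. Polynomials are represented as coefficient lists, and all helper names are hypothetical.

```python
from fractions import Fraction

# Polynomials in one variable x are coefficient lists [c0, c1, c2, ...].

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim(a + sign * b for a, b in zip(p, q))

def mul(p, q):
    out = [0] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    return trim(k * c for k, c in enumerate(p) if k > 0)

def lie_density(X, f, lam):
    """Formula (2.9) for n = 1: L^lam_X f = X f' + lam X' f."""
    return add(mul(X, deriv(f)), [lam * c for c in mul(deriv(X), f)])

def bracket(X, Y):
    """Lie bracket of X d/dx and Y d/dx, i.e. (X Y' - Y X') d/dx."""
    return add(mul(X, deriv(Y)), mul(Y, deriv(X)), sign=-1)

lam = Fraction(1, 2)      # half-densities, the case relevant for star-products
X = [0, 0, 1]             # x^2 d/dx
Y = [1, 2, 0, 3]          # (1 + 2x + 3x^3) d/dx
f = [1, 1, 5]             # local expression of a density, as in (2.8)

lhs = add(lie_density(X, lie_density(Y, f, lam), lam),
          lie_density(Y, lie_density(X, f, lam), lam), sign=-1)
rhs = lie_density(bracket(X, Y), f, lam)
```

Here `lhs` and `rhs` are the two sides of the representation property; exact rational arithmetic avoids rounding issues for non-integer $\lambda$.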
Definition 2.8. Consider a Lie group $G \subset \mathrm{Diff}(M)$. A quantization map $Q_\lambda$ is called $G$-equivariant if
$$Q_\lambda(g^*F) = g_\lambda^{-1} \circ Q_\lambda(F) \circ g_\lambda \tag{2.11}$$
for all $g \in G$ and $F \in S(M)$.

The above formula plays a central role in the forthcoming developments. We will need its infinitesimal guise
$$L^\lambda_X \circ Q_\lambda(F) - Q_\lambda(F) \circ L^\lambda_X = Q_\lambda(L_X F) \tag{2.12}$$
for all $X \in \mathfrak{g}$, where $L_X$ stands for the canonical lift to $T^*M$ of the fundamental vector field associated with $X$.

From such a quantization map, we immediately obtain an associative product given by
$$F \star_\lambda G = Q_\lambda^{-1}\big(Q_\lambda(F) \circ Q_\lambda(G)\big). \tag{2.13}$$
Note that this product is not necessarily of the form (1.1). However, Condition C1 is automatically satisfied. The following proposition is a direct consequence of the above definitions.

Proposition 2.9. If $Q_\lambda$ is a $G$-equivariant quantization map on $S(M)$, then the associative product on $S(M)$ given by (2.13) is $G$-invariant.

One wonders if there exists some extra condition sufficient to ensure strong $G$-invariance of the $G$-invariant associative product (2.13). The next proposition introduces a natural geometric property of the quantization map that leads to the desired result.

Proposition 2.10. If $Q_\lambda$ is a $G$-equivariant quantization map on $S(M)$, which furthermore satisfies the condition
$$Q_\lambda(J_X) = i\hbar\,L^\lambda_X \tag{2.14}$$
for all $X \in \mathfrak{g}$, then the associative product on $S(M)$ given by (2.13) is strongly $G$-invariant.

Proof. Let $X \in \mathfrak{g}$ and $F \in S(M)$; then, using successively (2.13), (2.14), and (2.12), we get
$$J_X \star_\lambda F - F \star_\lambda J_X = Q_\lambda^{-1}\big[Q_\lambda(J_X), Q_\lambda(F)\big] = Q_\lambda^{-1}\big[i\hbar\,L^\lambda_X, Q_\lambda(F)\big] = i\hbar\,L_X F = i\hbar\,\{J_X, F\},$$
where the last equality stems from the definition of the moment map. The proof that (2.3) holds is complete.
3. Projectively/Conformally Invariant Operators

We gather here definitions and results that will be used throughout the paper. Those mainly concern projective/conformal differential geometry. We will consider the Lie groups $G = \mathrm{SL}(n+1,\mathbb{R})$ and $G = \mathrm{SO}_0(p+1,q+1)$ together with their homogeneous spaces $M = \mathbb{RP}^n$ and $M = (S^p \times S^q)/\mathbb{Z}_2$, respectively. From here on, $G$ will stand for either of the two groups above and $M$ for either of the corresponding homogeneous spaces. In the framework of Weyl’s invariant theory [46], we will introduce, for each geometry, $G$-invariant linear operators on $T^*M$ which will serve as our main tools.
3.1. The projective and conformal symmetries. The real projective space of dimension $n$ is an $\mathrm{SL}(n+1,\mathbb{R})$-homogeneous space. More precisely, $\mathbb{RP}^n = \mathrm{SL}(n+1,\mathbb{R})/\mathrm{Aff}(n,\mathbb{R})$, where $\mathrm{Aff}(n,\mathbb{R}) = \mathrm{GL}(n,\mathbb{R}) \ltimes \mathbb{R}^n$ is an affine subgroup of $\mathrm{SL}(n+1,\mathbb{R})$. Let $x^1, x^2, \ldots, x^n$ be an affine coordinate system on $\mathbb{RP}^n$; the fundamental vector fields associated with the $\mathrm{SL}(n+1,\mathbb{R})$-action on $\mathbb{RP}^n$ are then given by
$$\frac{\partial}{\partial x^i}, \qquad x^i\,\frac{\partial}{\partial x^j}, \qquad x^i x^j\,\frac{\partial}{\partial x^j}, \tag{3.1}$$
with $i, j = 1, \ldots, n$. The vector fields (3.1) correspond to translations, linear transformations and inversions, respectively; they generate a flag of Lie algebras $\mathbb{R}^n \subset \mathfrak{aff}(n,\mathbb{R}) \subset \mathfrak{sl}(n+1,\mathbb{R})$.

The sphere $S^n$ with its canonical metric is a conformally flat manifold. The same is true for $(S^p \times S^q)/\mathbb{Z}_2$ in the case of signature $p - q$. Those are homogeneous spaces $\mathrm{SO}(p+1,q+1)/\mathrm{CE}(p,q)$, where $\mathrm{CE}(p,q) = \mathrm{CO}(p,q) \ltimes \mathbb{R}^n$ is the conformal Euclidean group, $\mathrm{CO}(p,q) = \mathrm{SO}(p,q) \times \mathbb{R}^*_+$, and $n = p + q$. The fundamental vector fields associated with the $\mathrm{SO}_0(p+1,q+1)$-action on $(S^p \times S^q)/\mathbb{Z}_2$ in an “anallagmatic” coordinate system are given (see, e.g., [19]) by
$$\frac{\partial}{\partial x^i}, \qquad x_i\,\frac{\partial}{\partial x^j} - x_j\,\frac{\partial}{\partial x^i}, \qquad x^i\,\frac{\partial}{\partial x^i}, \qquad x_j x^j\,\frac{\partial}{\partial x^i} - 2\,x_i x^j\,\frac{\partial}{\partial x^j}, \tag{3.2}$$
where $i, j = 1, \ldots, n$, and where indices are raised and lowered using the standard metric $g$ of $(S^p \times S^q)/\mathbb{Z}_2$. The vector fields (3.2) correspond to translations, rotations, homotheties and inversions, respectively; they generate a flag of Lie algebras $\mathbb{R}^n \subset \mathfrak{e}(p,q) \subset \mathfrak{ce}(p,q) \subset \mathfrak{o}(p+1,q+1)$.

These two groups of transformations, $G$, define respectively the projective and the conformal geometries; their Lie algebras, $\mathfrak{g}$, spanned by the vector fields (3.1) and (3.2), are finite-dimensional maximal Lie subalgebras of $\mathrm{Vect}(M)$, see [37, 5]. We also introduce, for convenience, the notation $H \subset G$ for the affine Lie subgroups $H = \mathrm{Aff}(n,\mathbb{R})$ in the projective case, and $H = \mathrm{CE}_0(p,q)$ in the conformal case. The corresponding Lie subalgebras will be denoted by $\mathfrak{h} \subset \mathfrak{g}$.
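For $n = 1$ the vector fields (3.1) reduce to $\partial_x$, $x\,\partial_x$ and $x^2\,\partial_x$, and the closure of their Lie brackets can be checked directly. The following sketch is only an illustration under that one-dimensional restriction; the dictionary representation and the function name are hypothetical.

```python
# A polynomial vector field (c0 + c1 x + c2 x^2 + ...) d/dx is a dict {power: coeff}.

def bracket(X, Y):
    """Uses [x^a d/dx, x^b d/dx] = (b - a) x^(a+b-1) d/dx, extended bilinearly."""
    out = {}
    for a, ca in X.items():
        for b, cb in Y.items():
            if a != b:
                k = a + b - 1
                out[k] = out.get(k, 0) + (b - a) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

translation = {0: 1}   # d/dx
dilation = {1: 1}      # x d/dx
inversion = {2: 1}     # x^2 d/dx

generators = [translation, dilation, inversion]
# Every bracket of generators stays within polynomial degree <= 2,
# so the three fields span a Lie algebra (a copy of sl(2, R)).
closed = all(max(bracket(X, Y), default=0) <= 2
             for X in generators for Y in generators)
```

By contrast, `bracket(inversion, {3: 1})` has degree 4, so adding a cubic vector field already breaks closure of the span.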
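3.2. Affine and Euclidean invariant operators. Since the group $\mathrm{Diff}(M)$ of diffeomorphisms of $M$ admits a canonical lift to $T^*M$, let us lift, accordingly, the action of $G$. The search for $G$-invariant linear operators on $S(M)$ will be dealt with in two stages. We first consider the affine (resp. Euclidean) subgroup and determine the algebra of $\mathrm{Aff}(n,\mathbb{R})$-invariant (resp. $(\mathrm{SO}_0(p,q) \ltimes \mathbb{R}^n)$-invariant) operators; in the next section we will then enforce full $G$-invariance.

A classical result from invariant theory shows that the commutant of $\mathrm{Aff}(n,\mathbb{R})$ in $\mathrm{End}(S(M))$ is generated by the following two operators:
$$E = \xi_i\,\frac{\partial}{\partial \xi_i}, \qquad D = \frac{\partial}{\partial x^i}\,\frac{\partial}{\partial \xi_i}, \tag{3.3}$$
which span the Lie algebra $\mathfrak{aff}(1,\mathbb{R})$. Indeed, an $\mathrm{Aff}(n,\mathbb{R})$-invariant linear operator mapping $S_k(M)$ into $S_\ell(M)$ is proportional to $D^{k-\ell}$ (see, e.g., [46, 29]). The commutant of $\mathrm{Aff}(n,\mathbb{R})$ in $\mathrm{End}(S(M))$ is, hence, given by series in $E$ and $D$, convergent on $S(M)$.

It has been shown in [23] that the commutant of $\mathrm{SO}_0(p,q) \ltimes \mathbb{R}^n$ in $\mathrm{End}(S(M))$ is generated by the operators
$$R = \xi^i \xi_i, \qquad E = \xi_i\,\frac{\partial}{\partial \xi_i}, \qquad T = \frac{\partial}{\partial \xi^i}\,\frac{\partial}{\partial \xi_i}, \tag{3.4}$$
whose commutation relations are those of $\mathfrak{sl}(2,\mathbb{R})$, together with
$$G = \xi^i\,\frac{\partial}{\partial x^i}, \qquad D = \frac{\partial}{\partial x^i}\,\frac{\partial}{\partial \xi_i}, \qquad \Delta = \frac{\partial}{\partial x^i}\,\frac{\partial}{\partial x_i}, \tag{3.5}$$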
which span the Heisenberg Lie algebra $\mathfrak{h}_1$. The operators (3.4) and (3.5) span the Lie algebra $\mathfrak{sl}(2,\mathbb{R}) \ltimes \mathfrak{h}_1$.

3.3. Projectively and conformally invariant operators. It is noteworthy that $E$ commutes with the lift of any diffeomorphism of $M$. One may ask if, upon restriction to $G \subset \mathrm{Diff}(M)$, there exist other linear operators on $T^*M$ that commute with $G$. The answer is negative in the projective case and positive in the conformal case.

Proposition 3.1. The commutant of $\mathrm{SL}(n+1,\mathbb{R})$ in $\mathrm{End}(S(M))$ is generated by $E$.

Proof. An affinely invariant linear operator is a series in $E$ and $D$ of the form
$$A = \sum_{s=0}^{\infty} P_s(E)\,D^s, \tag{3.6}$$
where $P_s$ is a series in one variable. Let $X_i = x^i x^j\,\partial/\partial x^j$ be the $i$-th generator of inversions in (3.1). Straightforward computation (see [37]) yields the commutation relation
$$\big[L_{X_i}, D\big] = (2E + n + 1) \circ \frac{\partial}{\partial \xi_i}.$$
One then checks that
$$\big[L_{X_i}, A\big] = \sum_{s=0}^{\infty} s\,P_s(E)\,(2E + n + s)\,D^{s-1} \circ \frac{\partial}{\partial \xi_i}. \tag{3.7}$$
This expression vanishes if and only if $P_s = 0$ for all $s \geq 1$. Hence $A = P_0(E)$ is a necessary condition for $A$ to commute with the $\mathrm{SL}(n+1,\mathbb{R})$-action.
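The two commutation relations used in this argument, $[E, D] = -D$ and $[L_{X_i}, D] = (2E + n + 1) \circ \partial/\partial\xi_i$, can be tested on polynomials for $n = 1$. The sketch below is a sanity check under stated assumptions rather than a proof: it assumes the usual canonical lift $L_X = X\,\partial_x - X'\,\xi\,\partial_\xi$ of a vector field $X\,\partial_x$ to $T^*\mathbb{R}$, so that the inversion $x^2\,\partial_x$ lifts to $x^2\,\partial_x - 2x\xi\,\partial_\xi$. All names are hypothetical.

```python
# Fibrewise-polynomial functions on T*R: dict {(a, b): c} meaning c * x^a * xi^b.

def clean(F):
    return {k: c for k, c in F.items() if c != 0}

def sub(F, G):
    out = dict(F)
    for k, c in G.items():
        out[k] = out.get(k, 0) - c
    return clean(out)

def E(F):
    """Euler operator xi d/dxi of (3.3)."""
    return clean({(a, b): b * c for (a, b), c in F.items()})

def D(F):
    """Operator d/dx d/dxi of (3.3)."""
    return clean({(a - 1, b - 1): a * b * c for (a, b), c in F.items() if a and b})

def d_xi(F):
    return clean({(a, b - 1): b * c for (a, b), c in F.items() if b})

def lift_inversion(F):
    """Assumed canonical lift of x^2 d/dx: x^2 d/dx - 2 x xi d/dxi."""
    return clean({(a + 1, b): (a - 2 * b) * c for (a, b), c in F.items()})

n = 1
F = {(3, 2): 5, (1, 4): -2, (0, 1): 7, (2, 0): 1}

# [E, D] F compared with -D F
lhs_ED = sub(E(D(F)), D(E(F)))
rhs_ED = {k: -c for k, c in D(F).items()}

# [L_X, D] F compared with (2E + n + 1) d/dxi F
lhs_LD = sub(lift_inversion(D(F)), D(lift_inversion(F)))
g = d_xi(F)
rhs_LD = clean({k: 2 * E(g).get(k, 0) + (n + 1) * c for k, c in g.items()})
```

On a monomial $x^a \xi^b$ both sides of the second relation reduce to $2b^2\,x^a\xi^{b-1}$, which is what the dictionaries above reproduce term by term.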
The conformal counterpart of the above statement is as follows.

Proposition 3.2. The commutant of $\mathrm{SO}_0(p+1,q+1)$ in $\mathrm{End}(S(M))$ is the commutative associative algebra generated by $E$ and the operator $R_0 = R \circ T$.

Proof. A sketch of this proof was given in [23]; for the sake of completeness we give the details here. Let us consider an operator $Z$ on the space of polynomials of degree $k$, namely
$$S^k(M) = \bigoplus_{\ell=0}^{k} S_\ell(M),$$
and commuting with the canonical lift of $\mathrm{SO}_0(p+1,q+1)$. It is, according to classical invariant theory [46, 29], a differential operator, polynomial in the generators (3.4) and (3.5). We therefore seek a differential operator $Z$ on $T^*M$ which commutes with the $\mathrm{SO}_0(p+1,q+1)$-action. Its principal symbol $\sigma(Z)$ is a function on $T^*(T^*M)$, polynomial on fibers. More precisely, if $(\zeta_i, y^i)$ denote the conjugate variables to $(x^i, \xi_i)$ respectively, then $\sigma(Z)$ is polynomial in the variables $\xi_i, \zeta_i, y^i$. The function $\sigma(Z)$ has to be annihilated by the canonical lifts to $T^*(T^*M)$ of all generators (3.2) of the conformal Lie algebra $\mathfrak{o}(p+1,q+1)$.

Let us assume that $\sigma(Z)$ is $\mathfrak{ce}(p,q)$-invariant and consider then invariance with respect to inversions whose $i$-th generator is $X_i = x_j x^j\,\partial/\partial x^i - 2 x_i x^j\,\partial/\partial x^j$. Its canonical lift to $T^*(T^*M)$ is given by
$$\begin{aligned}
\widetilde{L}_{X_i} ={}& x_j x^j\,\frac{\partial}{\partial x^i} - 2\,x_i x^j\,\frac{\partial}{\partial x^j} \\
&+ 2\,x_i \left( \xi_j\,\frac{\partial}{\partial \xi_j} - y^j\,\frac{\partial}{\partial y^j} + \zeta_j\,\frac{\partial}{\partial \zeta_j} \right) \\
&- 2\,x^j \left( \xi_i\,\frac{\partial}{\partial \xi^j} - \xi_j\,\frac{\partial}{\partial \xi^i} + y_i\,\frac{\partial}{\partial y^j} - y_j\,\frac{\partial}{\partial y^i} + \zeta_i\,\frac{\partial}{\partial \zeta^j} - \zeta_j\,\frac{\partial}{\partial \zeta^i} \right) \\
&+ 2 \left( \xi_i\,y_j\,\frac{\partial}{\partial \zeta_j} - y_i\,\xi_j\,\frac{\partial}{\partial \zeta_j} - \xi_j y^j\,\frac{\partial}{\partial \zeta^i} \right),
\end{aligned} \tag{3.8}$$
and the invariance with respect to inversions reads $\widetilde{L}_{X_i}\,\sigma(Z) = 0$. Now, invariance with respect to $\mathfrak{ce}(p,q)$ clearly implies that $\sigma(Z)$ is annihilated by the first three terms in (3.8), so that
$$\left( \xi_i\,y_j\,\frac{\partial}{\partial \zeta_j} - y_i\,\xi_j\,\frac{\partial}{\partial \zeta_j} - \xi_j y^j\,\frac{\partial}{\partial \zeta^i} \right) \sigma(Z) = 0 \tag{3.9}$$
for all $i = 1, \ldots, n$.

Lemma 3.3. Equation (3.9) implies
$$\frac{\partial \sigma(Z)}{\partial \zeta_i} = 0 \tag{3.10}$$
for all $i = 1, \ldots, n$.
Proof. The determinant of the matrix $A^i_j = y^i \xi_j - \xi^i y_j + \xi_k y^k\,\delta^i_j$ intervening in (3.9) is $\det(A) = \xi_i \xi^i\; y_j y^j\; (\xi_k y^k)^{n-2}$, which is non-zero on the complement of a lower-dimensional smooth submanifold of $T^*(T^*M)$.

By $\mathfrak{e}(p,q)$-invariance, the operator $Z$ is a polynomial in the differential operators (3.4) and (3.5), see Sect. 3.2. Furthermore, invariance with respect to the generator of homotheties $X_0 = x^i\,\partial/\partial x^i$ shows that $Z$ is in fact a polynomial in
$$R_0 = R \circ T, \qquad E, \qquad G_0 = G \circ T, \qquad D, \qquad \Delta_0 = \Delta \circ T. \tag{3.11}$$
The principal symbols of the last three operators are
$$\sigma(G_0) = \xi_i \zeta^i\; y_j y^j, \qquad \sigma(D) = \zeta_i y^i, \qquad \sigma(\Delta_0) = \zeta_i \zeta^i\; y_j y^j.$$
These three polynomials are algebraically independent for $n > 1$. Condition (3.10) implies then that $Z$ depends only on $E$ and $R_0$. Note that if $n = 1$, we find $R_0 = E(E-1)$ in agreement with Proposition 3.1.

We have thus shown that, for all $k$, any $Z \in \mathrm{End}(S^k(M))$ commuting with the $\mathrm{SO}_0(p+1,q+1)$-action is polynomial in $E$ and $R_0$. This completes the proof of Proposition 3.2.

4. Existence of Projectively and Conformally Invariant Star-Products

Taking advantage of the results obtained in equivariant quantization (see [37, 23, 9]) and of Proposition 2.9, one defines a $G$-invariant star-product on $T^*M$. In this section we give a brief account of the projectively and conformally equivariant quantizations and discuss the main properties of the associated invariant star-products.

4.1. Construction of G-invariant star-products. It has been proved in [37, 23] that, for any $\lambda \in \mathbb{C}$, there exists a unique $G$-equivariant quantization map $Q_\lambda : S(M) \to D_\lambda(M)[\hbar]$ on $T^*M$. In a local coordinate system, one can locally identify $S(M)$ and $D_\lambda(M)$ through the normal ordering prescription:
$$P^{i_1 \ldots i_k}\,\xi_{i_1} \cdots \xi_{i_k} \;\longmapsto\; (i\hbar)^k\,P^{i_1 \ldots i_k}\,\frac{\partial}{\partial x^{i_1}} \cdots \frac{\partial}{\partial x^{i_k}}, \tag{4.1}$$
where $P^{i_1 \ldots i_k}$ is a smooth function of $(x^1, \ldots, x^n)$. The explicit formula of $Q_\lambda$ is only known in the projective case; it is given, in an adapted coordinate system, and using the identification (4.1), by the series [22]
$$Q_\lambda = \sum_{r=0}^{\infty} C_r(E)\,(i\hbar D)^r, \tag{4.2}$$
where $E$ and $D$ are as in (3.3) and
$$C_r(E) = \frac{1}{r!}\,\frac{(E + (n+1)\lambda)_r}{(2E + n + r)_r}, \tag{4.3}$$
where $(a)_r := a(a+1) \cdots (a+r-1)$ is the Pochhammer symbol. The expression (4.2) is well defined globally on $T^*M$ since $M$ is projectively flat.

An important feature of the quantization map (4.2) is that it is homogeneous in the following sense. Let us assign a degree to the deformation parameter $\hbar$; more precisely, we put $\deg \hbar = 1$. Then $Q_\lambda$ preserves the total degree on $S(M)[\hbar]$. In other words, one has

Proposition 4.1. The quantization map $Q_\lambda$ commutes with the Euler operator
$$\mathcal{E} = E + \hbar\,\frac{\partial}{\partial \hbar}. \tag{4.4}$$

Proof. This follows from the commutation relation $[E, D] = -D$ and the expression (4.2).

In the conformal case we have no explicit formula for the $\mathrm{SO}_0(p+1,q+1)$-equivariant quantization map. However, one can guarantee [23] that $Q_\lambda$ is also homogeneous in this case. A $G$-invariant star-product on $T^*M$ can be obtained from such a $G$-equivariant quantization map.

Proposition 4.2 ([23, 9]). The associative product associated with $Q_\lambda$ through (2.13) is a star-product if and only if $\lambda = \frac{1}{2}$.

The proof consists in checking that $\lambda = \frac{1}{2}$ is the only value of $\lambda$ for which the first-order term in $\hbar$ of the associative product (2.13) coincides with the Poisson bracket. Note, however, that the uniqueness of $Q_{1/2}$ does not a priori ensure the uniqueness of a $G$-invariant star-product.

4.2. Main properties. For the constructed $G$-invariant star-products, Condition C1 from Sect. 1 is satisfied. We will show below that Condition C2 also holds.

Definition 4.3. A star-product on the space $S(M)$ will be called homogeneous if all the bilinear operators $B_r$ in (1.1) are homogeneous of degree $r$, that is, if they preserve the grading (2.7) according to
$$B_r : S_k(M) \otimes S_\ell(M) \to S_{k+\ell-r}(M), \tag{4.5}$$
or, equivalently, if $\mathcal{E}$ is a derivation of the star-product algebra.

Proposition 4.4. The $G$-invariant star-product (2.13) obtained from the $G$-equivariant quantization map $Q_{1/2}$ is symmetric, homogeneous and strongly $G$-invariant.

Proof. The quantization map $Q_{1/2}$ is symmetric, namely, it satisfies
$$Q_{1/2}(F)^* = Q_{1/2}(\overline{F})$$
for all $F \in S(M)$ [37, 23, 21], where $Q_{1/2}(F)^*$ denotes the formal adjoint operator with respect to the natural pairing on compactly supported $\frac{1}{2}$-densities. Using the definition (2.13) of the star-product, one now gets the symmetry condition C2.
Homogeneity of the quantization map $Q_{1/2}$ readily implies the homogeneity of the corresponding star-product.

The projectively and the conformally equivariant quantization maps $Q_{1/2}$ coincide up to second-order terms, namely, in both cases one has
$$Q_{1/2} = \mathrm{Id} + \frac{i\hbar}{2}\,D + O(\hbar^2)$$
in any coordinate system (cf. [37, 23]). One easily verifies that $Q_{1/2}$ satisfies condition (2.14). By Proposition 2.10, the associated $G$-invariant star-products are thus strongly $G$-invariant.

Condition C3 fails to be satisfied (as proved in [9] and [2] for a subalgebra of $S(M)$). Each term $B_r$ is a pseudo-differential bilinear operator, while its restriction $B_r|_{S_k(M) \otimes S_\ell(M)}$ is a bidifferential operator, just like $Q_{1/2}|_{S_k(M)}$ is a differential operator, see [37]. Hence the constructed star-product is local, namely, for all $F, G \in S(M)$, $\mathrm{Supp}(F \star G) \subset \mathrm{Supp}(F) \cap \mathrm{Supp}(G)$, see Lemma 5.3 below.

5. Uniqueness of G-Invariant Star-Product

Our goal is to show that the star-products constructed in Sect. 4.1 with the help of the $G$-equivariant quantization map are the unique $G$-invariant star-products where, as above, $G = \mathrm{SL}(n+1,\mathbb{R})$ and $G = \mathrm{SO}_0(p+1,q+1)$, respectively. We prove uniqueness in two different settings:
1. in the class of homogeneous $G$-invariant star-products,
2. in the class of all $G$-invariant star-products modulo formal reparametrizations and $G$-equivalence.
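The first-order term $Q_{1/2} = \mathrm{Id} + \frac{i\hbar}{2} D + O(\hbar^2)$ quoted in Sect. 4.2 can be compared with the coefficients (4.3) of the projective quantization map. The sketch below evaluates $C_r(E)$ on an eigenvalue $e$ of $E$ with exact rational arithmetic; it is only an illustration, and the function names are hypothetical. Since $C_r(E)$ is applied after $D^r$ in (4.2), on a symbol of degree $k$ the relevant eigenvalue is $e = k - r$.

```python
from fractions import Fraction
from math import factorial

def pochhammer(a, r):
    """(a)_r = a (a + 1) ... (a + r - 1), with (a)_0 = 1."""
    out = Fraction(1)
    for j in range(r):
        out *= a + j
    return out

def c_r(r, e, n, lam):
    """Coefficient (4.3) with the operator E replaced by its eigenvalue e."""
    numerator = pochhammer(Fraction(e) + (n + 1) * lam, r)
    denominator = factorial(r) * pochhammer(2 * e + n + r, r)
    return numerator / denominator

half = Fraction(1, 2)
# For lambda = 1/2 the first-order coefficient is 1/2 for every degree and dimension.
first_order_is_half = all(c_r(1, e, n, half) == half
                          for e in range(6) for n in range(1, 5))
```

For $\lambda = \frac{1}{2}$ the numerator $e + \frac{n+1}{2}$ is exactly half of the denominator $2e + n + 1$, which explains why the first-order coefficient does not depend on $e$ or $n$.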
5.1. Homogeneous star-products. Homogeneity of a star-product (see Definition 4.3) is a very natural property from a physical standpoint. Indeed, if one considers $\hbar$ as a physical constant whose dimension is that of an action (i.e., the dimension of Planck’s constant, which is also the inverse dimension of the Poisson bracket on $T^*M$), then the physical dimension of the star-product $F \star G$ of two observables is the same as that of their product $FG$, when $\star$ is homogeneous. This is a direct consequence of the fact that $B_r$ has the same physical dimension as $\hbar^{-r}$, which follows from associativity. In other words, a homogeneous star-product is dimensionless.

On the other hand, homogeneous star-products were thoroughly studied in the mathematical literature. For instance, De Wilde and Lecomte proved [15] that any two homogeneous star-products on a cotangent bundle are equivalent (in the sense of the definitions of Sect. 2.2).

The $G$-invariant star-products constructed in Sect. 4.1 are also homogeneous (see Proposition 4.4). The first main result of this paper is

Theorem 5.1. There exists a unique homogeneous $G$-invariant star-product on the space of symbols $S(M)$.
Proof. In Sect. 4 we proved the existence of a homogeneous G-invariant star-product on $\mathcal{S}(M)$. We will now prove its uniqueness. Let $\star$ and $\star'$ be two homogeneous G-invariant star-products, with terms $B_r$ and $B'_r$, respectively. Let us assume that the first $r-1$ terms of these star-products coincide, and use induction over $r$. The difference $B_r - B'_r$ is a G-invariant homogeneous Hochschild 2-cocycle. Indeed, associativity of the star-product implies that $\delta B_r$ depends only upon $B_i$ with $i < r$, where the Hochschild coboundary of a 2-cochain $B$ is given by

$$\delta B(F,G,H) = F\,B(G,H) - B(FG,H) + B(F,GH) - B(F,G)\,H, \qquad (5.1)$$
implying that $\delta(B_r - B'_r) = 0$. Let $C$ be a Hochschild 2-cocycle on $\mathcal{S}(M)$. Assume now that $C$ is homogeneous as in (4.5) and G-invariant. As a bilinear map, $C$ decomposes as a sum $C_1 + C_0$, where $C_1$ and $C_0$ are, respectively, the skew-symmetric and the symmetric parts of $C$. We will need the following well-known result.

Proposition 5.2. For any local Hochschild 2-cocycle $C$ on $\mathcal{S}(M)$, the skew-symmetric part $C_1$ is a bivector, and the symmetric part $C_0$ is the coboundary of a local 1-cochain.

This statement is an important result in deformation theory. It was first established in the differentiable case [44] and was later generalized to local cocycles in [10] (let us mention that this result also holds for continuous cocycles [13, 40]). In order to apply Proposition 5.2, we will prove that each term $B_r$ of a G-invariant star-product is local, a result that generalizes Theorem 5.1 in [37].

Lemma 5.3. Any linear G-invariant operator $B : \mathcal{S}_k(M) \otimes \mathcal{S}_\ell(M) \to \mathcal{S}_m(M)$ with $m \le k + \ell$ is local.

Proof. We must prove that $\mathrm{Supp}(B(F,G)) \subset \mathrm{Supp}(F) \cap \mathrm{Supp}(G)$ for all $F \in \mathcal{S}_k(M)$ and $G \in \mathcal{S}_\ell(M)$. Suppose that one of the arguments, $F$ or $G$, vanishes in a neighbourhood of some $x \in M$; we will prove that $B(F,G)(x) = 0$. Let us now locally identify $M$ with $\mathbb{R}^n$ and consider the subalgebra $\mathbb{R} \ltimes \mathbb{R}^n$ of $\mathfrak{g}$ generated by the Euler vector field, $E$, and the translations. Using translation-invariance, we may, hence, assume $x = 0$. We will embed $\mathcal{S}_k(\mathbb{R}^n) \otimes \mathcal{S}_\ell(\mathbb{R}^n)$ into $\mathcal{S}_{k+\ell}(\mathbb{R}^{2n})$ and notice that $F \otimes G$ vanishes in a neighbourhood of $x = 0$ in $\mathbb{R}^{2n}$. It remains to show that if $B : \mathcal{S}_{k+\ell}(\mathbb{R}^{2n}) \to \mathcal{S}_m(\mathbb{R}^n)$ is a linear map which commutes with the action of homotheties $L_E$, then for all $H \in \mathcal{S}_{k+\ell}(\mathbb{R}^{2n})$ that vanish in a neighbourhood of $x = 0$, we have $B(H)(0) = 0$, provided $m \le k + \ell$. But the proof of the latter statement coincides with that of Theorem 5.1 in [37].

The building blocks of the operators $B_r$ are the H-invariant operators listed in (3.11). These operators never increase the degree of homogeneity in $\xi = (\xi_i)$, hence Lemma 5.3 applies. We are now able to use Proposition 5.2 and consider $C_1$ and $C_0$ separately. The assertion of Theorem 5.1 will follow from Lemmas 5.4 and 5.6 below.

Lemma 5.4. There is no non-zero G-invariant bivector on $T^*M$ with coefficients in $\mathcal{S}(M)$ homogeneous of degree $r \ge 2$.

Proof. There is clearly no non-zero such bivector $W : \mathcal{S}_k(M) \otimes \mathcal{S}_\ell(M) \to \mathcal{S}_{k+\ell-r}(M)$ for $r > 2$. For $r = 2$, if it exists, it is necessarily of the form $W = W_{ij}\,\partial/\partial\xi_i \wedge \partial/\partial\xi_j$ with coefficients $W_{ij}$ of degree 0 in $\xi$. Since $W$ is invariant with respect to the generators of translations, $\partial W_{ij}/\partial x^s = 0$ for all $s = 1, \ldots, n$. But, in this case, $W$ cannot be invariant with respect to homotheties.
We have thus proved that there is no non-zero bivector invariant with respect to the $(n+1)$-dimensional Lie algebra of translations and homotheties. This Lie algebra is a Lie subalgebra of both $\mathrm{sl}(n+1,\mathbb{R})$ and $\mathrm{o}(p+1,q+1)$. Lemma 5.4 is proved.

Remark 5.5. Note that, in the proofs of Lemmas 5.3 and 5.4, we only needed invariance with respect to a subalgebra of $\mathfrak{g}$.

Lemma 5.6. There is no non-zero G-invariant Hochschild 2-coboundary $C_0$ on the associative commutative algebra $\mathcal{S}(M)$ homogeneous of degree $r \ge 2$.

Proof. Suppose that such a $C_0$ exists. Being a coboundary, it is of the form $C_0 = \delta A$, where

$$\delta A(F,G) = F\,A(G) - A(FG) + A(F)\,G \qquad (5.2)$$

for some linear map $A : \mathcal{S}_k(M) \to \mathcal{S}_{k-r}(M)$, with $r \ge 2$. Let us prove that $A$ is G-invariant. Since $C_0$ is G-invariant, for any $X \in \mathfrak{g}$ the linear map $L_X(A) = [L_X, A]$ is a Hochschild 1-cocycle on $\mathcal{S}(M)$. Indeed, one has $\delta \circ L_X = L_X \circ \delta$. Thus, $L_X(A)$ is a derivation of $\mathcal{S}(M)$. Therefore, it is a vector field on $T^*M$ polynomial in $\xi$ and, hence, $L_X(A)$ cannot decrease the degree by more than 1. Note, however, that $L_X(A) : \mathcal{S}_k(M) \to \mathcal{S}_{k-r}(M)$ with $r \ge 2$ since, again, $L_X(A) = L_X \circ A - A \circ L_X$ and $L_X$ preserves $\mathcal{S}_k(M)$ for any vector field $X$ on $M$. It follows that $L_X(A) = 0$ for all $X \in \mathfrak{g}$ and thus $A$ is G-invariant. The classification of G-invariant linear maps on $\mathcal{S}(M)$ is given by Proposition 3.1 and Proposition 3.2. Being homogeneous of degree zero in $\xi$, a non-zero G-invariant element $A$ of $\mathrm{End}(\mathcal{S}(M))$ cannot decrease the degree. Lemma 5.6 is proved.

Lemmas 5.4 and 5.6 imply that $B_r - B'_r = 0$ for $r \ge 2$. This completes the proof of Theorem 5.1.

The unique homogeneous G-invariant star-product will be called G-canonical. According to Proposition 4.4, this G-canonical star-product is the one associated with the G-equivariant quantization map $Q_{1/2}$ from Sect. 4. The same proposition also states that it is both symmetric and strongly G-invariant.

5.2. Uniqueness up to G-equivalence and reparametrization. The following theorem is the second main result of this paper.

Theorem 5.7. The G-canonical star-product on the space of symbols $\mathcal{S}(M)$ is the unique G-invariant star-product modulo formal reparametrizations and G-equivalence.

Proof. Let $\star$ and $\star'$ be two G-invariant star-products, with terms $B_r$ and $B'_r$, respectively. Let us assume that there exist a G-invariant formal series (2.4) and a reparametrization (2.6) intertwining the first $r-1$ terms of these star-products, and use induction over $r$. Using this equivalence we can assume that $\star$ and $\star'$ coincide up to the $(r-1)$th order term. The difference $C = B_r - B'_r$ is then a G-invariant Hochschild 2-cocycle. As in Sect. 5.1 we consider the decomposition $C = C_1 + C_0$, where $C_1$ and $C_0$ are, respectively, the skew-symmetric and the symmetric parts of $C$. By Proposition 5.2, $C_1$ is a bivector and $C_0$ is a coboundary.
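The coboundary formulas (5.1) and (5.2) can be illustrated on a toy commutative algebra. The following is a minimal sketch, assuming polynomials in a single variable stored as coefficient lists and taking the second derivative as a hypothetical 1-cochain `A`; none of these names come from the paper. It checks that $\delta A$ is symmetric and that $\delta(\delta A) = 0$.

```python
# Polynomials in one variable t, stored as coefficient lists [c0, c1, ...].
def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def neg(p):
    return [-c for c in p]

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def A(p):
    """Hypothetical 1-cochain used for illustration: the second derivative."""
    return deriv(deriv(p))

def delta1(a):
    """Coboundary of a 1-cochain, following (5.2)."""
    return lambda F, G: add(add(mul(F, a(G)), neg(a(mul(F, G)))), mul(a(F), G))

def delta2(b):
    """Coboundary of a 2-cochain, following (5.1)."""
    def db(F, G, H):
        out = []
        for term in (mul(F, b(G, H)), neg(b(mul(F, G), H)),
                     b(F, mul(G, H)), neg(mul(b(F, G), H))):
            out = add(out, term)
        return out
    return db

C0 = delta1(A)            # symmetric 2-cochain; here it equals -2 F' G'
F, G, H = [1, 2, 3], [0, 1, 0, 4], [5, 0, 1]
is_symmetric = C0(F, G) == C0(G, F)
is_cocycle = delta2(C0)(F, G, H) == []
```

For this choice of `A` the coboundary reduces to $-2F'G'$, a symmetric bidifferential operator, which is the kind of term that Proposition 5.2 assigns to the symmetric part $C_0$.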
We will need the following two lemmas.

Lemma 5.8.
(i) In the projective case, the canonical Poisson bivector

$$\Pi = \frac{\partial}{\partial\xi_i} \wedge \frac{\partial}{\partial x^i} \qquad (5.3)$$

on $T^*M$ is the unique (up to an overall multiplicative constant) G-invariant bivector.
(ii) In the conformal case with $n \neq 2$, the canonical Poisson bivector on $T^*M$ is the unique (up to an overall multiplicative constant) G-invariant bivector.
(iii) In the conformal case with $n = 2$, there are two G-invariant bivectors on $T^*M$, namely the canonical Poisson bivector and the Poisson bivector

$$\Lambda = \frac{1}{2}\, g^{ij}\xi_i\xi_j\, \sigma_{k\ell}\, \frac{\partial}{\partial\xi_k} \wedge \frac{\partial}{\partial\xi_\ell}, \qquad (5.4)$$

where $g = g_{ij}\,dx^i dx^j$ represents a conformal class of (pseudo-)Riemannian metrics and $\sigma = \frac{1}{2}\sigma_{k\ell}\,dx^k \wedge dx^\ell$ stands for the surface element of $(M, g)$.

Proof. Consider an arbitrary bivector field $W$ on $T^*M$. In any local coordinate system it is of the form

$$W = W^{ij}(\xi,x)\,\frac{\partial}{\partial x^i} \wedge \frac{\partial}{\partial x^j} + W_i^j(\xi,x)\,\frac{\partial}{\partial\xi_i} \wedge \frac{\partial}{\partial x^j} + W_{ij}(\xi,x)\,\frac{\partial}{\partial\xi_i} \wedge \frac{\partial}{\partial\xi_j}, \qquad (5.5)$$

where the coefficients $W^{ij}(\xi,x)$, $W_i^j(\xi,x)$ and $W_{ij}(\xi,x)$ are functions of $x^i$, $\xi_i$ which are polynomial in $\xi$. Choose an adapted coordinate system related to the projective or conformal structure on $M$, respectively (see Sect. 3.1). Since $W$ is G-invariant, it commutes with the action of the generators of translations, that is, with the vector fields $X_i = \partial/\partial x^i$, where $i = 1, \ldots, n$. It follows that the coefficients of $W$ are independent of $x^i$. Furthermore, $W$ is invariant with respect to the action of the homothety vector field $X_0 = x^i\,\partial/\partial x^i$. The canonical lift of $X_0$ to $T^*M$ is $L_{X_0} = x^i\,\partial/\partial x^i - \xi_i\,\partial/\partial\xi_i$. One immediately obtains the following homogeneity conditions:
1. the coefficient $W^{ij}(\xi)$ has to be homogeneous in $\xi$ of degree $-2$,
2. the coefficient $W_i^j(\xi)$ has to be homogeneous in $\xi$ of degree 0,
3. the coefficient $W_{ij}(\xi)$ has to be homogeneous in $\xi$ of degree 2,
so that $W^{ij}(\xi) = 0$, while $W_i^j(\xi)$ are constant, and $W_{ij}(\xi) = W_{ij}^{k\ell}\,\xi_k\xi_\ell$ are quadratic polynomials. A G-invariant bivector (5.5) is, therefore, a sum of two independent G-invariant bivectors $W_0 = W_i^j\,\partial/\partial\xi_i \wedge \partial/\partial x^j$ and $W_2 = W_{ij}^{k\ell}\,\xi_k\xi_\ell\,\partial/\partial\xi_i \wedge \partial/\partial\xi_j$.

Considering now invariance with respect to the linear subgroup of $G$ entails that $W_0$ represents an invariant in $(\mathbb{R}^n)^* \otimes \mathbb{R}^n$ and $W_2$ an invariant in $\bigwedge^2(\mathbb{R}^n)^* \otimes S^2\mathbb{R}^n$ with respect to the standard linear action of $\mathrm{SL}(n,\mathbb{R})$ in the projective case, and $\mathrm{SO}_0(p,q)$ in the conformal case. A classical result of invariant theory (see [46, 29]) yields $W_0 = c_0\,\Pi$ with $c_0 \in \mathbb{C}$ and $W_2 = 0$, except for $n = 2$ in the conformal case, where $W_2 = c_2\,\Lambda$ with $c_2 \in \mathbb{C}$. Hence, we have proved that the bivectors (5.3), and (5.4) for $n = 2$ in the conformal case, are the only bivectors invariant with respect to the affine subgroup of $G$. To complete the proof, one checks that the bivectors (5.3) and (5.4) are invariant with respect to inversions, i.e., the quadratic vector fields in (3.1) and (3.2).
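The three homogeneity conditions in the proof above amount to simple weight bookkeeping under the lifted homothety. The sketch below is only an illustration, with hypothetical names (`WEIGHT`, `required_xi_degree`): it assigns weight $-1$ to $\partial/\partial x^i$ and $+1$ to $\partial/\partial\xi_i$ (their commutators with $L_{X_0}$), weight $-d$ to an $x$-independent coefficient of degree $d$ in $\xi$, and solves for the degree that makes the total weight vanish.

```python
# Weights under L_{X0} = x^i d/dx^i - xi_i d/dxi_i:
#   [L_{X0}, d/dx^i]  = -d/dx^i   -> weight -1
#   [L_{X0}, d/dxi_i] = +d/dxi_i  -> weight +1
# An x-independent coefficient homogeneous of degree d in xi has weight -d.
WEIGHT = {"d/dx": -1, "d/dxi": +1}

def required_xi_degree(first, second):
    """Degree d in xi for which f(xi) * first ^ second has total weight zero."""
    # Invariance: -d + weight(first) + weight(second) = 0.
    return WEIGHT[first] + WEIGHT[second]

degrees = {
    "W^{ij}": required_xi_degree("d/dx", "d/dx"),
    "W_i^j": required_xi_degree("d/dxi", "d/dx"),
    "W_{ij}": required_xi_degree("d/dxi", "d/dxi"),
}
```

Since the coefficients are polynomial in $\xi$, the negative degree required for $W^{ij}$ forces that coefficient to vanish, as stated in the proof.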
Lemma 5.9. Every G-invariant Hochschild 2-coboundary $C_0$ on the associative commutative algebra $\mathcal{S}(M)$ is of the form $C_0 = \delta A$, where $A$ is a G-invariant linear map on $\mathcal{S}(M)$.

Proof. The 2-coboundary $C_0$ is local thanks to Lemma 5.3. This clearly implies that any 1-cochain $A$ such that $C_0 = \delta A$ is local, cf. Proposition 5.2. Given a G-invariant Hochschild 2-coboundary $C_0 = \delta A$, we will prove that there exists a linear map $\tilde A$ such that $\delta\tilde A = \delta A$ and $L_X(\tilde A) = 0$ for all $X \in \mathfrak{g}$. Clearly, G-invariance of $C_0 = \delta A$ implies $L_X(\delta A) = 0$ for any $X \in \mathfrak{g}$. Thus, $\delta(L_X(A)) = 0$, which means that $L_X(A)$ is a vector field. A local operator $A$ is locally given, according to the Peetre theorem [42], by a differential operator; in an arbitrary coordinate system,

$$A = A_{(0)} + A_{(1)} + A_{(2)} + \cdots + A_{(m)}, \qquad (5.6)$$

where

$$A_{(i)} = \sum_{i_1+i_2=i} A^{s_1\ldots s_{i_1}}_{t_1\cdots t_{i_2}}(x,\xi)\, \frac{\partial}{\partial x^{s_1}} \cdots \frac{\partial}{\partial x^{s_{i_1}}}\, \frac{\partial}{\partial\xi_{t_1}} \cdots \frac{\partial}{\partial\xi_{t_{i_2}}}. \qquad (5.7)$$

Choose a coordinate system adapted to either the projective or the conformal structure. Consider first the action of the affine Lie subalgebra, $\mathfrak{h} \subset \mathfrak{g}$, that is, $\mathfrak{h} = \mathrm{aff}(n,\mathbb{R})$ in the projective case and $\mathfrak{h} = \mathrm{ce}(p,q)$ in the conformal case, introduced in Sect. 3.1. For each component $A_{(i)}$, except for $A_{(1)}$, one has $L_X(A_{(i)}) = 0$, where $X \in \mathfrak{h}$, since this operator is of the form (5.7) and thus cannot be a vector field. Put $\tilde A = A - A_{(1)}$; this operator satisfies $L_X(\tilde A) = 0$ for all $X \in \mathfrak{h}$ and, obviously, $\delta\tilde A = \delta A$. In particular, invariance with respect to translations guarantees that the coefficients in (5.7) are independent of $x$.

In the projective case, an affinely invariant operator $\tilde A$ is of the form (3.6); for the generators $X_i$ of inversions, $L_{X_i}(\tilde A)$ is given by (3.7). This is a vector field if and only if $P_s = 0$ for all $s \ge 1$. Hence $\tilde A = P_0(E)$ and thus $L_{X_i}(\tilde A) = 0$, see Proposition 3.1.

In the conformal case, let us rewrite the expression of $\tilde A$ in a different form, namely

$$\tilde A = \tilde A_{(0)} + \tilde A_{(1)} + \tilde A_{(2)} + \cdots + \tilde A_{(t)},$$

where $t \le m$ and

$$\tilde A_{(j)} = \tilde A^{s_1\ldots s_j}\, \frac{\partial}{\partial x^{s_1}} \cdots \frac{\partial}{\partial x^{s_j}};$$

each $\tilde A^{s_1\ldots s_j}$ is a differential operator in $\xi$ with polynomial coefficients in $\xi$. Each term $\tilde A_{(j)}$ is invariant with respect to translations and homogeneous in $x$ of degree $-j$. Invariance with respect to homotheties implies that $\tilde A_{(j)}$ is homogeneous in $\xi$ of degree $-j$, that is,

$$[E, \tilde A_{(j)}] = -j\,\tilde A_{(j)}.$$

Let $X_i$ be the $i$th generator of inversions. The operator $L_{X_i}(\tilde A_{(j)})$ is homogeneous in $\xi$ of degree $-j$, since $L_{X_i}(E) = 0$, cf. Proposition 3.2. Hence, $L_{X_i}(\tilde A)$ is a vector field only if $L_{X_i}(\tilde A_{(j)}) = 0$ for $j \ge 2$, since it is polynomial in $\xi$. Because of its $\mathfrak{h}$-invariance, $\tilde A$ belongs to the ring generated by the operators

$$E, \qquad R_0 = R \circ T, \qquad D, \qquad G_0 = G \circ T, \qquad \Delta_0 = \Delta \circ T,$$
(1) is then where these operators have been defined in (3.4), (3.5) and (3.11). The term A necessarily of the form (1) = α D + β G0 A where α and β are polynomials in E and R0 . A direct computation yields ∂ ∂ ∂ LXi (A(1) ) = 2α ξi T − 2E − 2β R0 −n + 2ξi T . ∂ξi ∂ξi ∂ξi Every term in this expression, except for −2nα ∂/∂ξi , is a differential operator of order > 1 for any α and β. Thus, the right-hand side can be a non-zero vector field only if α is a non-zero constant. On the other hand, −2β R0 ∂/∂ξi is, at least, a thirdorder term unless β is zero. But, the remaining terms 2α ξi T and −4αE ∂/∂ξi are of (1) ) = 0. order 2 and linearly independent. One concludes that α = 0 and thus LXi (A (0) ) = 0. Finally, the term A(0) is obviously a polynomial in E and R0 and, hence, LXi (A We have thus proved that LXi (A) = 0 for all i = 1, . . . , n. Lemma 5.9 is proved. Let us resort to Lemmas 5.8 and 5.9 to complete the proof. The G-invariant Hochschild 2-cocycle C = Br − Br is a sum C = C1 + C0 . The symmetric part C0 is a Hochschild coboundary and, by Lemma 5.9 is of the form C0 = δA, where A is a G-invariant 1-cochain. This term can be removed by a G-equivalence map = Id + (i)r A. Under the hypotheses of parts (i) and (ii) of Lemma 5.8, the skew-symmetric part C1 is proportional to the canonical Poisson bivector, that is, to the first-order term B1 . It can be removed by a reparametrization i → i + c (i)r for some c ∈ R. Theorem 5.7 is proved for the first two options, (i) and (ii), of Lemma 5.8. In the conformal case and for n = 2 (part (iii) of Lemma 5.8), the skew-symmetric part C1 is a linear combination of the canonical Poisson bivector and of the bivector in (5.4). By a reparametrization map we can remove the canonical Poisson bivector but not the bivector . Let us, indeed, show that, if Br − Br = C1 = k, then necessarily k = 0. We associate to the star-products and the corresponding star-commutators [F, G] =
1 (F G − G F ) . i
(5.8)
Since the two star-products are associative, the corresponding star-commutators satisfy the Jacobi identity. Put J (F, G, H ) = [F, [G, H ] ] + (cyclic) and consider the difference J (F, G, H ) − J (F, G, H ). By assumption, this expression has to be identically zero. Since the two star-products coincide up to order r − 1 in i, this difference is trivially zero up to order r − 2. Straightforward computation shows that the (r − 1)th order term in the above difference is equal to 2k[ , ](F, G, H ), where [ , ] is the Schouten bracket of and . Jacobi identities for and -commutators then lead to k [ , ] = 0. Lemma 5.10. The two Poisson bivectors and are not compatible.
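Proof. The Schouten bracket is

$$[\Pi,\Lambda] = 2\,\frac{\partial}{\partial\xi_1} \wedge \frac{\partial}{\partial\xi_2} \wedge \left(\xi_1\frac{\partial}{\partial x^1} + \xi_2\frac{\partial}{\partial x^2}\right) = 2\,\Lambda \wedge \frac{G}{R},$$

where $G$ and $R$ are as in (3.5) and (3.4). This expression does not vanish.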
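Incompatibility can also be seen concretely: two Poisson bivectors are compatible exactly when the sum of their brackets still satisfies the Jacobi identity. The following is a minimal sketch on $T^*\mathbb{R}^2$, assuming the flat Euclidean metric and $\sigma_{12} = 1$, so that the second bivector (5.4) reduces to $(\xi_1^2 + \xi_2^2)\,\partial/\partial\xi_1 \wedge \partial/\partial\xi_2$. The tiny polynomial helpers are hypothetical and written only for this check.

```python
from collections import defaultdict

# Polynomials in (x1, x2, xi1, xi2), stored as {exponent tuple: coefficient}.
X1, X2, XI1, XI2 = range(4)

def var(i):
    e = [0, 0, 0, 0]
    e[i] = 1
    return {tuple(e): 1}

def add(p, q, sign=1):
    out = defaultdict(int, p)
    for e, c in q.items():
        out[e] += sign * c
    return {e: c for e, c in out.items() if c != 0}

def mul(p, q):
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def diff(p, i):
    out = {}
    for e, c in p.items():
        if e[i] > 0:
            f = list(e)
            f[i] -= 1
            out[tuple(f)] = c * e[i]
    return out

def bracket_canonical(F, G):
    """Canonical bracket: dF/dxi_i dG/dx^i - dF/dx^i dG/dxi_i."""
    out = {}
    for x, xi in ((X1, XI1), (X2, XI2)):
        out = add(out, mul(diff(F, xi), diff(G, x)))
        out = add(out, mul(diff(F, x), diff(G, xi)), sign=-1)
    return out

R = add(mul(var(XI1), var(XI1)), mul(var(XI2), var(XI2)))

def bracket_second(F, G):
    """Bracket of R d/dxi1 ^ d/dxi2 (flat metric and sigma_12 = 1 assumed)."""
    inner = add(mul(diff(F, XI1), diff(G, XI2)),
                mul(diff(F, XI2), diff(G, XI1)), sign=-1)
    return mul(R, inner)

def bracket_sum(F, G):
    return add(bracket_canonical(F, G), bracket_second(F, G))

def jacobiator(br, F, G, H):
    out = {}
    for a, b, c in ((F, G, H), (G, H, F), (H, F, G)):
        out = add(out, br(br(a, b), c))
    return out

F, G, H = var(XI1), var(XI2), var(X1)
jac_canonical = jacobiator(bracket_canonical, F, G, H)
jac_second = jacobiator(bracket_second, F, G, H)
jac_sum = jacobiator(bracket_sum, F, G, H)
```

On this triple each bracket separately has vanishing Jacobiator, while the summed bracket gives $2\xi_1$, which is consistent with the non-vanishing Schouten bracket of the lemma (the exact proportionality factor between Jacobiator and Schouten bracket depends on conventions).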
Thus, the constant $k$ in the above formula has to vanish. This completes the proof of Part (iii). Theorem 5.7 is proved.

Lemmas 5.8–5.10 can be summarized as follows:

Proposition 5.11. The second G-invariant Hochschild cohomology space is

$$HH^2_G(\mathcal{S}(M); \mathcal{S}(M)) = \begin{cases} \mathbb{R}^2, & \text{in the conformal case for } n = 2, \\ \mathbb{R}, & \text{otherwise,} \end{cases}$$

and the cup product in the first instance is non-zero.

This result could have been derived from Kontsevich's [34] or Fedosov's [27] classification of equivalence classes of deformations.

Remark 5.12. Theorem 5.7 does not guarantee uniqueness of a star-product but of a class of G-invariant star-products. Together with Propositions 3.1 and 3.2 this leads to an explicit description of all G-invariant star-products. Indeed, they are all obtained from the G-canonical homogeneous star-product by the equivalence (2.5) and reparametrization (2.6); the equivalence map is given in terms of the G-invariant operators $E$ in the projective case and $E$ and $R_0$ in the conformal case.

5.3. Uniqueness up to G-equivalence and reparametrization, G-covariance and homogeneity. In this section we compare our uniqueness theorems with those obtained for the Moyal star-product in [33]. The Moyal star-product is the unique, up to reparametrization, $(\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n})$-invariant star-product on $\mathbb{R}^{2n}$. It was also proved that it is uniquely selected within its reparametrization class by furthermore requiring its covariance. The $(\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n})$-equivalence class of the Moyal star-product has a single element, since the $(\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n})$-commutant in $\mathrm{End}(C^\infty(\mathbb{R}^{2n}))$ is trivial, so that there are no non-zero invariant Hochschild 2-coboundaries.

One may wonder if in our present setting G-covariance plays a similar role, namely, that of an extra condition that selects the canonical G-invariant star-product of Sect. 5.1 within its reparametrization and G-equivalence classes described in Sect. 5.2. The answer is negative; however, we have

Proposition 5.13. If two G-invariant and G-covariant star-products on $\mathcal{S}(M)$ are equivalent up to reparametrization, then they coincide.

Proof. Let $\star$ and $\star'$ be two G-invariant and G-covariant star-products on $\mathcal{S}(M)$ belonging to the same reparametrization class. Their G-covariance translates into (see (2.2)):

$$J_X \star J_Y - J_Y \star J_X = i\hbar\,\{J_X, J_Y\} = J_X \star' J_Y - J_Y \star' J_X \qquad (5.9)$$
for all $X, Y \in \mathfrak{g}$. On the other hand, reparametrization equivalence means that there exists a formal power series (2.6) such that

$$F \star G = \sum_{r \ge 0} (i\hbar)^r B_r(F,G) = \sum_{r \ge 0} (\mu(i\hbar))^r B'_r(F,G).$$

Using this equation, one rewrites the right-hand side of (5.9) in terms of $\star$, with $\mu(i\hbar)$ as deformation parameter. Now, using the left-hand side of (5.9), one gets $\mu(i\hbar) = i\hbar$, from which the conclusion follows.

An analog of the above statement, where the reparametrization equivalence is replaced by G-equivalence, does not hold. Indeed, one shows, using an argument similar to the one in the above proof, that two G-invariant and G-covariant star-products on $\mathcal{S}(M)$ in the same G-equivalence class do not necessarily coincide. So, covariance does not play the same role for $G$ as it does for $\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n}$. However, a simple verification shows that, for the Moyal star-product, homogeneity has exactly the same effect as $(\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n})$-covariance. Hence, the G-canonical and the Moyal star-products are uniquely determined by two simple conditions, namely, invariance and homogeneity.

6. Explicit Formula for the Projectively-Invariant Star-Product

In this section we compute the explicit formula of the canonical homogeneous projectively-invariant star-product. This solves a problem raised in [2].

Projective invariance will be dealt with in two stages. We first consider invariance with respect to an affine subgroup $\mathrm{Aff}(n,\mathbb{R})$ of $\mathrm{SL}(n+1,\mathbb{R})$ and determine the affine-invariant bilinear operators on $\mathcal{S}(\mathbb{RP}^n)$. Those will be used to write down an Ansatz for the star-product we are looking for. We will then enforce full projective invariance by further demanding that inversions preserve the star-product. This will give rise to Eqs. (6.11) and (6.12) below. Another system of equations will arise from the associativity requirement (see (6.14)). The unique solution of the complete system of equations will be given explicitly at the end of this section.

6.1. Autonomous derivation from the invariance principle. We need to classify the bilinear $\mathrm{Aff}(n,\mathbb{R})$-invariant differential operators on $\mathcal{S}(\mathbb{R}^n)$. For that purpose, let us resort to the natural embedding

$$\mathcal{S}(\mathbb{R}^n) \otimes \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^{2n}) \qquad (6.1)$$

and denote by $(x, \xi, y, \eta)$ the natural coordinate system on $T^*\mathbb{R}^n \times T^*\mathbb{R}^n$. The operators of divergence with respect to the first and the second arguments

$$D_{\xi x}(F,G) = D(F)\,G, \qquad D_{\eta y}(F,G) = F\,D(G), \qquad (6.2)$$

where $D$ is as in (3.3), and the operators of contraction

$$D_{\eta x}(F,G) = \left.\frac{\partial}{\partial x^i}\frac{\partial}{\partial\eta_i}\bigl(F(\xi,x)\,G(\eta,y)\bigr)\right|_{\eta=\xi,\,y=x}, \qquad (6.3)$$

$$D_{\xi y}(F,G) = \left.\frac{\partial}{\partial y^i}\frac{\partial}{\partial\xi_i}\bigl(F(\xi,x)\,G(\eta,y)\bigr)\right|_{\eta=\xi,\,y=x} \qquad (6.4)$$

are obviously $\mathrm{Aff}(n,\mathbb{R})$-invariant differential operators. Restricting ourselves to homogeneous components, we get the following:
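Proposition 6.1. Every bilinear differential operator

$$\mathcal{S}_k(\mathbb{R}^n) \otimes \mathcal{S}_\ell(\mathbb{R}^n) \to \mathcal{S}_m(\mathbb{R}^n) \qquad (6.5)$$

invariant with respect to the action of $\mathrm{Aff}(n,\mathbb{R})$ is a homogeneous polynomial in $D_{\xi x}$, $D_{\xi y}$, $D_{\eta x}$ and $D_{\eta y}$ of degree $k + \ell - m$.

This enables us to write the most general $\mathrm{Aff}(n,\mathbb{R})$-invariant bilinear operation $\mathcal{S}(\mathbb{R}^n) \otimes \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)[\hbar]$. According to Theorem 5.7, we will express it as a termwise homogeneous formal series which, when restricted to $\mathcal{S}_k(\mathbb{R}^n) \otimes \mathcal{S}_\ell(\mathbb{R}^n)$, takes the form

$$F \star G = \sum_{r=0}^{\infty} (i\hbar)^r B_r^{k,\ell}(F,G), \qquad (6.6)$$

where $B_r^{k,\ell}$ is a bidifferential operator, homogeneous of degree $r$ in $D_{\xi x}$, $D_{\xi y}$, $D_{\eta x}$, $D_{\eta y}$, viz.

$$B_r^{k,\ell}(F,G)(\xi,x) = \sum_{\alpha+\beta+\gamma+\delta=r} B^{k,\ell}_{\alpha,\beta,\gamma,\delta}\, \left. D_{\xi y}^{\alpha} D_{\eta x}^{\beta} D_{\xi x}^{\gamma} D_{\eta y}^{\delta}\bigl(F(\xi,x)\,G(\eta,y)\bigr)\right|_{\eta=\xi,\,y=x} \qquad (6.7)$$

with constant coefficients $B^{k,\ell}_{\alpha,\beta,\gamma,\delta}$. Since we seek a star-product, we have to impose

$$B^{k,\ell}_{0,0,0,0} = 1 \qquad (6.8)$$

and

$$B^{k,\ell}_{1,0,0,0} = -B^{k,\ell}_{0,1,0,0} = \frac{1}{2} \qquad (6.9)$$

in order to get the multiplication and the Poisson bracket as the first two terms, as in Eq. (1.1).

Expressions (6.6) and (6.7) constitute our Ansatz for an $\mathrm{SL}(n+1,\mathbb{R})$-invariant star-product on $T^*\mathbb{RP}^n$. It now remains to impose on the operation (6.6) the following conditions: (i) invariance with respect to inversions, and (ii) associativity.

6.2. Projective invariance. Let $X_i = x^i x^j\,\partial/\partial x^j$ be the $i$th generator of inversions. Denote by

$$L_{X_i} = x^i x^j \frac{\partial}{\partial x^j} - x^j\xi_j \frac{\partial}{\partial\xi_i} - x^i\xi_j \frac{\partial}{\partial\xi_j} + y^i y^j \frac{\partial}{\partial y^j} - y^j\eta_j \frac{\partial}{\partial\eta_i} - y^i\eta_j \frac{\partial}{\partial\eta_j}$$

its canonical lift to $T^*(\mathbb{R}^{2n})$. Invariance with respect to inversions translates into the following equations:

$$\sum_{\alpha+\beta+\gamma+\delta=r} B^{k,\ell}_{\alpha,\beta,\gamma,\delta} \left.\left[L_{X_i},\, D_{\xi y}^{\alpha} D_{\eta x}^{\beta} D_{\xi x}^{\gamma} D_{\eta y}^{\delta}\right]\right|_{\eta=\xi,\,y=x} = 0 \qquad (6.10)$$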
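at each order $r \in \mathbb{N}$. The latter yield the following system of equations:

$$(\alpha+1)(\alpha+\delta-\ell)\,B^{k,\ell}_{\alpha+1,\beta,\gamma,\delta} + (\beta+1)(\beta+\delta-\ell)\,B^{k,\ell}_{\alpha,\beta+1,\gamma,\delta} = (\gamma+1)(n+2k-\gamma-1)\,B^{k,\ell}_{\alpha,\beta,\gamma+1,\delta} + (\alpha+1)(\beta+1)\,B^{k,\ell}_{\alpha+1,\beta+1,\gamma,\delta-1} \qquad (6.11)$$

and

$$(\beta+1)(\beta+\gamma-k)\,B^{k,\ell}_{\alpha,\beta+1,\gamma,\delta} + (\alpha+1)(\alpha+\gamma-k)\,B^{k,\ell}_{\alpha+1,\beta,\gamma,\delta} = (\delta+1)(n+2\ell-\delta-1)\,B^{k,\ell}_{\alpha,\beta,\gamma,\delta+1} + (\alpha+1)(\beta+1)\,B^{k,\ell}_{\alpha+1,\beta+1,\gamma-1,\delta}. \qquad (6.12)$$

6.3. Associativity. If $F \in \mathcal{S}_k(\mathbb{R}^n)$, $G \in \mathcal{S}_\ell(\mathbb{R}^n)$, and $H \in \mathcal{S}_m(\mathbb{R}^n)$, the associativity condition takes the form

$$\sum_{j=0}^{r} B^{k,\ell+m-j}_{r-j}\bigl(F, B^{\ell,m}_j(G,H)\bigr) = \sum_{j=0}^{r} B^{k+\ell-j,m}_{r-j}\bigl(B^{k,\ell}_j(F,G), H\bigr) \qquad (6.13)$$

for all $r \in \mathbb{N}$. Equation (6.13) then reads

$$\begin{aligned}
&\sum_{j=0}^{r}\ \sum_{\alpha+\beta+\gamma+\delta=r-j} B^{k,\ell+m-j}_{\alpha,\beta,\gamma,\delta}\,(D_{\xi y}+D_{\xi z})^{\alpha}(D_{\eta x}+D_{\zeta x})^{\beta} D_{\xi x}^{\gamma}\,(D_{\eta y}+D_{\eta z}+D_{\zeta y}+D_{\zeta z})^{\delta} \\
&\qquad \times \sum_{\alpha'+\beta'+\gamma'+\delta'=j} B^{\ell,m}_{\alpha',\beta',\gamma',\delta'}\, D_{\eta z}^{\alpha'} D_{\zeta y}^{\beta'} D_{\eta y}^{\gamma'} D_{\zeta z}^{\delta'} \\
&= \sum_{j=0}^{r}\ \sum_{\alpha+\beta+\gamma+\delta=r-j} B^{k+\ell-j,m}_{\alpha,\beta,\gamma,\delta}\,(D_{\xi z}+D_{\eta z})^{\alpha}(D_{\zeta x}+D_{\zeta y})^{\beta}\,(D_{\xi x}+D_{\xi y}+D_{\eta x}+D_{\eta y})^{\gamma} D_{\zeta z}^{\delta} \\
&\qquad \times \sum_{\alpha'+\beta'+\gamma'+\delta'=j} B^{k,\ell}_{\alpha',\beta',\gamma',\delta'}\, D_{\xi y}^{\alpha'} D_{\eta x}^{\beta'} D_{\xi x}^{\gamma'} D_{\eta y}^{\delta'}.
\end{aligned} \qquad (6.14)$$

6.4. Explicit solution of the system. We solve the system of equations (6.11), (6.12) and (6.14) by first determining the components $B^{k,\ell}_{\alpha,\beta,0,0}$, then $B^{k,\ell}_{\alpha,\beta,\gamma,0}$ and, finally, the full expression $B^{k,\ell}_{\alpha,\beta,\gamma,\delta}$.

6.4.1. First stage. Identifying in the associativity equation (6.14) the coefficients of the monomials $D_{\xi z}^{r-j} D_{\zeta x}^{j}$, one readily finds $B^{k,\ell+m}_{r-j,j,0,0} = B^{k+\ell,m}_{r-j,j,0,0}$. Thus, $B^{k,\ell}_{\alpha,\beta,0,0}$ depends only on $k + \ell$; we write

$$B^{k,\ell}_{\alpha,\beta,0,0} = C_{\alpha,\beta}(k+\ell). \qquad (6.15)$$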
Using again (6.14), we identify the coefficients of the monomials $D_{\xi z}^{r-j-1} D_{\zeta x}^{j} D_{\eta x}$ and $D_{\xi z}^{r-j-1} D_{\zeta x}^{j} D_{\xi y}$, respectively, to get the following system:

$$(j+1)\,B^{k,\ell+m}_{r-j-1,j+1,0,0} = B^{k+\ell,m}_{r-j-1,j,1,0} + B^{k+\ell-1,m}_{r-j-1,j,0,0}\,B^{k,\ell}_{0,1,0,0},$$

$$(r-j)\,B^{k,\ell+m}_{r-j,j,0,0} = B^{k+\ell,m}_{r-j-1,j,1,0} + B^{k+\ell-1,m}_{r-j-1,j,0,0}\,B^{k,\ell}_{1,0,0,0}.$$

Resorting to the invariance equation (6.11) for $\alpha = r-j-1$, $\beta = j$, and $\gamma = \delta = 0$, we obtain the supplementary equation

$$(r-j)(r-j-\ell-1)\,B^{k,\ell}_{r-j,j,0,0} + (j+1)(j-\ell)\,B^{k,\ell}_{r-j-1,j+1,0,0} - (n+2k-1)\,B^{k,\ell}_{r-j-1,j,1,0} = 0.$$

The previous three equations together with (6.9) and (6.15) imply

$$(r-j)(r-n-2k)\,C_{r-j,j}(k) + \frac{1}{2}(n+2k-2j-1)\,C_{r-j-1,j}(k-1) = 0.$$

The latter equation, supplemented with (6.8), then yields

$$B^{k,\ell}_{\alpha,\beta,0,0} = \frac{(-1)^{\beta}}{(\alpha+\beta)!}\; \frac{\dbinom{\frac{1}{2}(n-1)+k+\ell-\beta}{\alpha}\dbinom{\frac{1}{2}(n-1)+k+\ell-\alpha}{\beta}}{\dbinom{n+2k+2\ell-\alpha-\beta}{\alpha+\beta}}. \qquad (6.16)$$
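The closed form (6.16) can be checked against the recurrence that precedes it using exact rational arithmetic. The sketch below is only a sanity check under stated assumptions: the brackets in (6.16) are read as generalized binomial coefficients $\binom{x}{m} = x(x-1)\cdots(x-m+1)/m!$, the argument `K` stands for $k+\ell$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def binom(x, m):
    """Generalized binomial coefficient x(x-1)...(x-m+1)/m! for rational x."""
    out = Fraction(1)
    for i in range(m):
        out *= Fraction(x) - i
    return out / factorial(m)

def C(alpha, beta, K, n):
    """Closed form (6.16), written as a function of K = k + l."""
    a = Fraction(n - 1, 2) + K
    num = binom(a - beta, alpha) * binom(a - alpha, beta)
    den = binom(n + 2 * K - alpha - beta, alpha + beta)
    return Fraction((-1) ** beta, factorial(alpha + beta)) * num / den

def recurrence_residual(alpha, beta, K, n):
    """Left-hand side of the recurrence, with r - j = alpha >= 1 and j = beta."""
    r = alpha + beta
    return (alpha * (r - n - 2 * K) * C(alpha, beta, K, n)
            + Fraction(n + 2 * K - 2 * beta - 1, 2) * C(alpha - 1, beta, K - 1, n))

# K is taken large enough that no denominator binomial vanishes.
residuals = [recurrence_residual(a, b, 10, n)
             for n in (2, 3) for a in range(1, 5) for b in range(0, 5)]
```

All residuals vanish on this grid, and the low-order values agree with the normalizations (6.8) and (6.9): `C(0, 0, K, n)` is 1, `C(1, 0, K, n)` is 1/2 and `C(0, 1, K, n)` is -1/2.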
6.4.2. Second stage. Here we only use the first invariance equation (6.11) with $\delta = 0$. Long but straightforward calculations lead to

$$B^{k,\ell}_{\alpha,\beta,\gamma,0} = \frac{1}{\gamma!\,(n+2k-\gamma)_{\gamma}} \sum_{r+s=\gamma} \binom{\gamma}{r} (\alpha+1)_r\,(\beta+1)_s\,(\alpha-\ell)_r\,(\beta-\ell)_s\; B^{k,\ell}_{\alpha+r,\beta+s,0,0}, \qquad (6.17)$$

where the last term is as in (6.16).

6.4.3. Last stage. A reverse iterative computation on $\delta$ using the second invariance equation (6.12) finally leads to the sought-for result

$$B^{k,\ell}_{\alpha,\beta,\gamma,\delta} = \frac{1}{\delta!\,(n+2\ell-\delta)_{\delta}} \sum_{r+s+t=\delta} (-1)^{s} \binom{\delta}{r,s,t} (\alpha+1)_{r+s}\,(\beta+1)_{s+t}\,(\alpha+\gamma-k)_r\,(\beta+\gamma-k)_t\; B^{k,\ell}_{\alpha+r+s,\beta+s+t,\gamma-s,0}, \qquad (6.18)$$

where the first factor in the sum is the trinomial coefficient and the last one is given by (6.17).
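The two iterative stages can be tested numerically against the invariance equations they are meant to solve. The following is a minimal sketch under explicit assumptions: the symbols $(x)_m$ are read as rising factorials, the $\alpha$- and $\beta$-prefactors in (6.18) are read as $(\alpha+1)_{r+s}$ and $(\beta+1)_{s+t}$, coefficients with a negative index are taken to be zero, and `seed` is an arbitrary hypothetical stand-in for $B^{k,\ell}_{\alpha,\beta,0,0}$ (the check does not depend on its values).

```python
from fractions import Fraction
from math import factorial

def poch(x, m):
    """Rising factorial (x)_m = x (x+1) ... (x+m-1)."""
    out = Fraction(1)
    for i in range(m):
        out *= x + i
    return out

def seed(alpha, beta):
    """Arbitrary stand-in for B_{alpha,beta,0,0} (hypothetical values)."""
    return Fraction(1 + 2 * alpha + 3 * beta * beta, 7 + alpha)

def B3(alpha, beta, gamma, k, l, n):
    """B_{alpha,beta,gamma,0} following (6.17); zero when gamma < 0."""
    if gamma < 0:
        return Fraction(0)
    total = Fraction(0)
    for r in range(gamma + 1):
        s = gamma - r
        mult = factorial(gamma) // (factorial(r) * factorial(s))
        total += (mult * poch(alpha + 1, r) * poch(beta + 1, s)
                  * poch(alpha - l, r) * poch(beta - l, s)
                  * seed(alpha + r, beta + s))
    return total / (factorial(gamma) * poch(n + 2 * k - gamma, gamma))

def B4(alpha, beta, gamma, delta, k, l, n):
    """B_{alpha,beta,gamma,delta} following (6.18) as read above."""
    total = Fraction(0)
    for r in range(delta + 1):
        for s in range(delta - r + 1):
            t = delta - r - s
            mult = factorial(delta) // (factorial(r) * factorial(s) * factorial(t))
            total += ((-1) ** s * mult
                      * poch(alpha + 1, r + s) * poch(beta + 1, s + t)
                      * poch(alpha + gamma - k, r) * poch(beta + gamma - k, t)
                      * B3(alpha + r + s, beta + s + t, gamma - s, k, l, n))
    return total / (factorial(delta) * poch(n + 2 * l - delta, delta))

def residual_611(alpha, beta, gamma, k, l, n):
    """(6.11) at delta = 0, where the last term drops out."""
    lhs = ((alpha + 1) * (alpha - l) * B3(alpha + 1, beta, gamma, k, l, n)
           + (beta + 1) * (beta - l) * B3(alpha, beta + 1, gamma, k, l, n))
    rhs = (gamma + 1) * (n + 2 * k - gamma - 1) * B3(alpha, beta, gamma + 1, k, l, n)
    return lhs - rhs

def residual_612(alpha, beta, gamma, delta, k, l, n):
    """Difference of the two sides of (6.12)."""
    lhs = ((beta + 1) * (beta + gamma - k) * B4(alpha, beta + 1, gamma, delta, k, l, n)
           + (alpha + 1) * (alpha + gamma - k) * B4(alpha + 1, beta, gamma, delta, k, l, n))
    rhs = ((delta + 1) * (n + 2 * l - delta - 1) * B4(alpha, beta, gamma, delta + 1, k, l, n)
           + (alpha + 1) * (beta + 1) * B4(alpha + 1, beta + 1, gamma - 1, delta, k, l, n))
    return lhs - rhs

k, l, n = 4, 5, 3  # sample values keeping all denominators non-zero
res_611 = [residual_611(a, b, g, k, l, n)
           for a in range(3) for b in range(3) for g in range(3)]
res_612 = [residual_612(a, b, g, d, k, l, n)
           for a in range(3) for b in range(3) for g in range(3) for d in range(3)]
```

Both lists of residuals vanish for these sample values, which supports the reading of the prefactors used here; it does not by itself test associativity (6.14).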
6.5. Symmetry condition.

Proposition 6.2. The symmetry condition C2 translates for the Ansatz (6.6)–(6.7) into

$$B^{k,\ell}_{\alpha,\beta,\gamma,\delta} = (-1)^{\alpha+\beta+\gamma+\delta}\, B^{\ell,k}_{\beta,\alpha,\delta,\gamma}. \qquad (6.19)$$

Proof. If $F \in \mathcal{S}_k(\mathbb{R}^n)$ and $G \in \mathcal{S}_\ell(\mathbb{R}^n)$, we immediately get from Condition C2 that $B_r^{k,\ell}(F,G) = (-1)^r B_r^{\ell,k}(G,F)$. Then, a change of dummy variables in (6.7) completes the proof.
It turns out that our star-product given by (6.6), (6.7) and (6.18) automatically satisfies the symmetry condition (6.19). Although this is not transparent from the expression (6.18), it is a direct consequence of Proposition 4.4 and Theorem 5.1.

7. Conclusion, Discussion and Outlook

In this work we have proved the existence and uniqueness of a canonical G-invariant star-product on $T^*M$ for $G = \mathrm{SL}(n+1,\mathbb{R})$ (resp. $G = \mathrm{SO}_0(p+1,q+1)$) and $M = \mathbb{RP}^n$ (resp. $(S^p \times S^q)/\mathbb{Z}_2$). We have, moreover, given an explicit formula for the canonical projectively invariant star-product. For both geometries, the canonical star-product so obtained is symmetric, homogeneous, strongly G-invariant (hence G-covariant), but not differential. These properties, except for the last one, are shared with the Moyal star-product on $\mathbb{R}^{2n}$.

Theorem 5.1 shows that the homogeneity condition supplementing G-invariance uniquely determines the canonical G-invariant star-product on $\mathcal{S}(M)$. Likewise, the Moyal star-product is also uniquely specified by $(\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n})$-invariance and homogeneity. This allows us to draw a parallel between our canonical G-invariant star-product and Moyal's, namely, they are uniquely determined by the same two simple conditions: invariance and homogeneity. Of course, this parallel is far from complete, since, for instance, $G$ and $\mathrm{Sp}(2n,\mathbb{R}) \ltimes \mathbb{R}^{2n}$ do not have the same geometric status; the action of the former on $T^*M$ is lifted from that on $M$, which is not the case for the latter.

Furthermore, it is clear that, for the projective and the conformal cases, there is no G-invariant (symplectic) connection on $T^*M$, since $G$ does not act on the bundle of linear frames of $T^*M$. Hence, no Fedosov [26] canonical G-invariant star-product can be constructed. Besides, Fedosov's construction would have led to a star-product given by bidifferential operators.

The generalization of the existence and uniqueness theorems for projectively/conformally invariant star-products on $T^*M$ in the case of a non-flat projective/conformal connection on $M$ remains an open problem. In a recent work [6], Bordemann has taken a significant step in this new direction, by investigating the projectively equivariant quantization on a cotangent bundle of a manifold with a non-flat projective structure (see also [21] and [7]). Note also that since the canonical star-products studied in this work may be considered as the projective/conformal analogs of the Moyal star-product, they may play a role similar to that of the latter in a construction à la Fedosov of a star-product on a symplectic manifold with a Cartan projective/conformal symplectic connection.

In the case $n \ge 2$, let us mention that the explicit form of the conformally invariant star-product is, so far, out of reach. This was already the situation for the conformally equivariant quantization map [23] (see also [21]).
In the conformal case with $n = 2$, Theorem 5.7 holds for star-products of the form (1.1) with the standard Poisson bracket on $T^*M$ as first-order term. However, one could easily construct, in this case, another G-invariant star-product with the Poisson bracket (5.4) as first-order term. It would be interesting to give a physical status to this second, somewhat "exotic", star-product.

In the case of dimension $n = 1$, our results are related to earlier work by Cohen, Manin and Zagier [12]. The projective and the conformal algebras are, in this case, isomorphic to $\mathrm{sl}(2,\mathbb{R})$. Moreover, the canonical projectively and conformally invariant star-products coincide by uniqueness, and thus the explicit formulæ given in Sect. 6.4 correspond to those obtained in [12] for $\lambda = \frac{1}{2}$.

Acknowledgements. It is a pleasure to thank Ranee Brylinski, Simone Gutt, Pierre Lecomte and John Rawnsley for valuable help and encouragement. This work was done while the second author was visiting CPT as a délégué CNRS; he thanks CNRS for granting him a délégation and the Université d'Artois for granting a one-year leave of absence. The second and third authors both thank the CPT for hospitality.
References

1. Arnal, D., Cortet, J.-C., Molin, P., Pinczon, G.: Covariance and geometrical invariance in star-quantization. J. Math. Phys. 24, 276–283 (1983)
2. Astashkevich, A., Brylinski, R.: Non-Local Equivariant Star Product on the Minimal Nilpotent Orbit. To appear in Advances in Math, math.QA/0010257 v2
3. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation theory and quantization. I. Deformations of symplectic structures. Ann. Phys. 111(1), 61–110 (1978)
4. Bertelson, M., Bieliavsky, P., Gutt, S.: Parametrizing equivalence classes of invariant star products. Lett. Math. Phys. 46(4), 339–345 (1998)
5. Boniver, F., Lecomte, P.B.A.: A remark about the Lie algebra of infinitesimal conformal transformations of the Euclidean space. Bull. London Math. Soc. 32(3), 263–266 (2000)
6. Bordemann, M.: Sur l'existence d'une prescription d'ordre naturelle projectivement invariante. math.DG/0208171
7. Bouarroudj, S.: Projectively equivariant quantization map. Lett. Math. Phys. 51(4), 265–274 (2000)
8. Brylinski, R.: Equivariant Deformation Quantization for the Cotangent Bundle of a Flag Manifold. Ann. Inst. Fourier 52(3), 881–897 (2002)
9. Brylinski, R.: Non-Locality of Equivariant Star Products on $T^*(\mathbb{RP}^n)$. Lett. Math. Phys. 58(1), 21–28 (2001)
10. Cahen, M., Gutt, S., De Wilde, M.: Local cohomology of the algebra of $C^\infty$ functions on a connected manifold. Lett. Math. Phys. 4(3), 157–167 (1980)
11. Cattaneo, A., Felder, G.: A path integral approach to the Kontsevich quantization formula. Comm. Math. Phys. 212(3), 591–611 (2000)
12. Cohen, P., Manin, Yu., Zagier, D.: Automorphic pseudodifferential operators. In: Algebraic aspects of integrable systems, Progr. Nonlinear Differential Equations Appl. 26, Boston, MA: Birkhäuser, 1997, pp. 17–47
13. Connes, A.: Noncommutative differential geometry. Inst. Hautes Etudes Sci. Publ. Math. 62, 257–360 (1985)
14. Deligne, P.: Déformations de l'algèbre des fonctions d'une variété symplectique: Comparaison entre Fedosov et De Wilde, Lecomte. Selecta Math. (N.S.) 1(4), 667–697 (1995)
15. De Wilde, M., Lecomte, P.: Star-products on cotangent bundles. Lett. Math. Phys. 7(3), 235–241 (1983)
16. De Wilde, M., Lecomte, P.: Existence of star-products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds. Lett. Math. Phys. 7(6), 487–496 (1983)
17. De Wilde, M., Lecomte, P.: An homotopy formula for the Hochschild cohomology. Compositio Math. 96(1), 99–109 (1995)
18. Dito, G., Sternheimer, D.: Deformation quantization: Genesis, developments and metamorphoses, IRMA Lectures in Math. Theoret. Phys. 1, Berlin: Walter de Gruyter, 2002, pp. 9–54
19. Dubrovin, B.A., Fomenko, A.T., Novikov, S.P.: Modern geometry – methods and applications, Part I. Graduate Texts in Mathematics 93, New York: Springer-Verlag, 1992
20. Duval, C., Ovsienko, V.: Space of second order linear differential operators as a module over the Lie algebra of vector fields. Adv. Math. 132(2), 316–333 (1997)
21. Duval, C., Ovsienko, V.: Conformally equivariant quantum Hamiltonians. Selecta Math. (N.S.) 7(3), 291–320 (2001)
22. Duval, C., Ovsienko, V.: Projectively equivariant quantization and symbol calculus: Noncommutative hypergeometric functions. Lett. Math. Phys. 57(1), 61–67 (2001)
23. Duval, C., Lecomte, P., Ovsienko, V.: Conformally equivariant quantization: Existence and uniqueness. Ann. Inst. Fourier 49(6), 1999–2029 (1999)
24. Fedosov, B.V.: Formal quantization. In: Some topics of modern mathematics and their applications to problems of mathematical physics (in Russian), Moscow, 1985, pp. 129–136
25. Fedosov, B.V.: A simple geometrical construction of deformation quantization. J. Diff. Geom. 40(2), 213–238 (1994)
26. Fedosov, B.V.: Non-abelian reduction in deformation quantization. Lett. Math. Phys. 43(2), 137–154 (1998)
27. Fedosov, B.V.: Deformation quantization and index theory. Mathematical Topics 9, Berlin: Akademie Verlag, 1996
28. Gerstenhaber, M.: On the deformation of rings and algebras. Ann. of Math. 79(2), 59–103 (1964)
29. Goodman, R., Wallach, N.: Representations and invariants of the classical groups. Encyclopedia of Mathematics and its Applications 68, Cambridge: Cambridge University Press, 1998
30. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics. Reading, MA: Addison-Wesley, 1992
31. Grœnewold, H.J.: On the principles of elementary quantum mechanics. Physica 12, 405–460 (1946)
32. Gutt, S.: Variations on deformation quantization. Math. Phys. Stud. 21, Dordrecht: Kluwer Acad. Publ., 2000, pp. 217–254
33. Gutt, S.: Contribution à l'étude des espaces symplectiques homogènes. Acad. Roy. Belg. Cl. Sci. Mém. Collect. 8° (2) 44(6), (1983)
34. Kontsevich, M.: Deformation quantization of Poisson manifolds I. q-alg/9709040
35. Lecomte, P.B.A.: Classification projective des espaces d'opérateurs différentiels agissant sur les densités. C. R. Acad. Sci. Paris Sér. I Math. 328(4), 287–290 (1999)
36. Lecomte, P.B.A.: On the cohomology of Sl(m + 1, R) acting on differential operators and Sl(m + 1, R)-equivariant symbol. Indag. Math., N.S. 11(1), 95–114 (2000)
37. Lecomte, P.B.A., Ovsienko, V.: Projectively invariant symbol calculus. Lett. Math. Phys. 49(3), 173–196 (1999)
38. Lichnerowicz, A.: Déformations d'algèbres associées à une variété symplectique (les $*_\nu$-produits). Ann. Inst. Fourier 32(1), 157–209 (1982)
39. Moyal, J.E.: Quantum mechanics as a statistical theory. Proc. Cambridge Philos. Soc. 45, 99–124 (1949)
40. Nadaud, F.: On continuous and differential Hochschild cohomology. Lett. Math. Phys. 47(1), 85–95 (1999)
41. Omori, H., Maeda, Y., Yoshioka, A.: Weyl manifolds and deformation quantization. Adv. Math. 85(2), 224–255 (1991)
42. Peetre, J.: Une caractérisation abstraite des opérateurs différentiels. Math. Scand. 7, 211–218 (1959) and 8, 116–120 (1960)
43. Tamarkin, D.E.: Another proof of M. Kontsevich formality theorem. math.QA/9803025
44. Vey, J.: Déformation du crochet de Poisson sur une variété symplectique. Comment. Math. Helv. 50(4), 421–454 (1975)
45. Weinstein, A.: Deformation quantization. Séminaire Bourbaki, 1993/94, Astérisque 227, 389–409 (1995)
46. Weyl, H.: The Classical Groups. Princeton, NJ: Princeton University Press, 1946

Communicated by H. Spohn
Commun. Math. Phys. 244, 29–61 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0994-2
Communications in
Mathematical Physics
Random Matrix Ensembles Associated to Compact Symmetric Spaces
Eduardo Dueñez
American Institute of Mathematics, The Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218, USA. E-mail:
[email protected] Received: 10 October 2002 / Accepted: 23 June 2003 Published online: 25 November 2003 – © Springer-Verlag 2003
Abstract: We introduce random matrix ensembles that correspond to the infinite families of irreducible Riemannian symmetric spaces of type I. In particular, we recover the Circular Orthogonal and Symplectic Ensembles of Dyson, and find other families of (unitary, orthogonal and symplectic) ensembles of Jacobi type. We discuss the universal and weakly universal features of the global and local correlations of the levels in the bulk and at the “hard” edge of the spectrum (i.e., at the “central points” ±1 on the unit circle). Previously known results are extended, and we find new simple formulas for the Bessel Kernels that describe the local correlations at a hard edge.

1. Introduction

Local correlations between eigenvalues of various ensembles of random unitary, orthogonal or symplectic matrices, in the limit when their size tends to infinity, are known to exhibit universal behavior in the bulk of the spectrum. Dyson’s “Threefold Way” [14] predicts that this behavior is to depend only on the symmetry type of the ensemble (unitary, orthogonal or symplectic). Unfortunately, for general ensembles this conjecture remains open, though in the unitary case (modeled after the Gaussian Unitary Ensemble) the universality of the local correlations has been proven for some classes of families [9, 7, 3, 2]. In the orthogonal and symplectic cases the extension of results known for Gaussian ensembles is technically more complicated but some more recent work deals with families of such ensembles [26].

Most of the focus has been on non-compact (Gaussian and the like) matrix ensembles. In the present article we study families of compact (circular) ensembles including, in particular, Dyson’s circular ensembles: the COE, CUE and CSE [11]. In Sect. 2 we fit Dyson’s ensembles into the framework of the theory of symmetric spaces. In Sect. 3 we proceed to associate a matrix ensemble to every family of irreducible compact symmetric spaces (all of these are known by the work of
This research has been supported in part by the FRG grant DMS–00–74028 from the NSF
Table 1. Parameters of the probability measure of the eigenvalues for ensembles of type I

Type         G/K                                     Parameters
A I (COE)    U(R)/O(R)                               β = 1 (not Jacobi)
A II (CSE)   U(2R)/USp(2R)                           β = 4 (not Jacobi)
A III        U(2R + L)/U(R + L) × U(R)               β = 2, (a, b) = (L, 0)
BD I         O(2R + L)/O(R + L) × O(R)               β = 1, (a, b) = ((L − 1)/2, −1/2)
D III        SO(4R)/U(2R)                            β = 4, (a, b) = (0, 0)
D III        SO(4R + 2)/U(2R + 1)                    β = 4, (a, b) = (2, 0)
C I          USp(2R)/U(R)                            β = 1, (a, b) = (0, 0)
C II         USp(4R + 2L)/USp(2R + 2L) × USp(2R)     β = 4, (a, b) = (2L + 1, 1)
Cartan [4, 5]). If G/K is a compact symmetric space, then K is the fixed-point set of an involution of G, and using this involution and a realization of G as a matrix group we embed G/K → G, thus realizing it as a matrix ensemble, cf., Theorem 1. By the Weyl integration formula, the measure on G is decomposed as a product of measures on K × A × K, where A is an abelian torus and it is the induced measure on A that yields the probability measure of the eigenvalues, cf., Theorem 2. As for specific examples, the most well-known of the compact symmetric spaces are the classical groups of orthogonal, unitary and symplectic matrices, for which questions about universality have known answers [17]. These are the so-called “Type II” spaces. Zirnbauer [30], on the other hand, has constructed the “infinitesimal” versions of the other (type I) ensembles, namely their tangent spaces at the identity element, which is enough to derive their eigenvalue measures. Our Theorem 1, however, describes the “global” ensembles associated to the infinite families of compact symmetric spaces of type I in a very explicit manner analogous to Dyson’s description of his circular ensembles. Besides Dyson’s COE and CSE, the other compact matrix ensembles of type I are of Jacobi type in the sense that their joint eigenvalue measure is given by dν(x1 , . . . , xR ) ∝
∏_{1≤j<k≤R} |x_j − x_k|^β ∏_{j=1}^{R} (1 − x_j)^a (1 + x_j)^b dx_j on [−1, 1]^R (1)
for some parameters a, b > −1 (depending on the ensemble, see Table 1) and β = 1, 2, 4 (the “symmetry parameter”) in the orthogonal, unitary and symplectic cases, respectively. Here, the “free” eigenvalues come in R pairs (or quadruples) x_j ± √−1 y_j, excluding eigenvalues equal to +1 forced by the symmetry built into the ensemble. Also, R stands for the rank of the corresponding symmetric space, and our interest is in the semiclassical limit of the eigenvalue statistics as R → ∞ (here L ≥ 0 is a fixed integer; different values of this parameter yield different ensembles). The name “Jacobi ensembles” comes from the intimate connection between the measure (1) and the classical Jacobi polynomials on the interval [−1, 1]. In Sect. 4 we state the main results on the level density (Theorem 3) and the universality of the local correlations (Theorem 4) in the bulk of the spectrum for general unitary, orthogonal and symplectic Jacobi ensembles (previous results of Nagao and Forrester [23] are insufficient for our purposes). We rely on work of Adler et al. [1]. At the “hard edges” ±1 of the interval, Dyson’s universality breaks down and we obtain simple formulas for the Bessel kernel in terms of which the hard edge correlations are expressed. In a nutshell, for Jacobi ensembles:
• Away from the “hard edge” x = ±1, the local correlations follow the universal law of the GOE (β = 1), GUE (β = 2) or GSE (β = 4). Namely, in terms of local parameters ξ_j around a fixed z_o = cos α_o ∈ (−1, 1) so that x_j = cos(α_o + (π/R)ξ_j), these local correlations are given by
L_β^{(n)}(z_o; ξ_1, …, ξ_n) = DET(K̄_β(ξ_j, ξ_k))_{n×n}, (2)
where DET stands for either the usual (β = 2) or quaternion (β = 1, 4) determinant, and K̄_β is the (scalar or quaternion) Sine kernel (cf. Eqs. (74)–(78)).
• At the hard edge z_o = +1, the local correlations depend on the parameter a of the Jacobi ensemble as well as on β. In terms of local parameters ξ_j > 0 with x_j = cos((π/R)ξ_j) the same expression (2) holds except that the kernel K̄_β is to be replaced by a Bessel kernel K̂_β^{(a)}(ξ, η) given by Eqs. (84)–(86). At the hard edge z_o = −1 the result is equivalent upon replacing a by b.

2. Dyson’s Circular Ensembles as Symmetric Spaces

For motivational purposes in this section we review the constructions of the circular ensembles of Dyson and their probability measures of the eigenvalues in a manner in which the theory of Riemannian symmetric spaces is brought into play.

The Circular Unitary Ensemble (CUE) is the set S = S(N) of all N × N unitary matrices H, endowed with the unique probability measure dµ(H) that is invariant under left (also right) multiplication by any unitary matrix. This requirement makes the measure invariant under unitary changes of bases, hence the ensemble’s name. In the study of statistics of eigenvalues, the relevant probability measure is the one induced by dµ(H) on the torus A = A(N) ⊂ S(N) consisting of unitary diagonal matrices
A = {diag(λ_1 = e^{iθ_1}, …, λ_N = e^{iθ_N})}, (3)
where Θ = (θ_1, …, θ_N) ∈ [0, 2π)^N, say. To be more precise, let us denote by K = K(N) the unitary group of N × N matrices (its underlying set is just S(N)). Then we have a surjective mapping
K × A → S, (k, a) ↦ H = kak^{−1}, (4)
and correspondingly there exists a probability measure dν(a) on A such that, for any continuous function f ∈ C(S),
∫_S f(H) dµ(H) = ∫_K ∫_A f(kak^{−1}) dν(a) dHaar(k), (5)
where we denote by dHaar(k) the unique translation-invariant probability measure on K (so here dHaar = dµ). This measure dν(a) can be pulled back to some measure on the space [0, 2π)^N of angles which, abusing notation, we denote by dν(Θ). The measure dν(a) (or dν(Θ)) is the so-called probability measure of the eigenvalues (for the CUE). We have [21]
dν(Θ) ∝ |Van(e^{iΘ})|^2 dΘ on [0, 2π)^N. (6)
Here the symbol “∝” stands for proportionality up to a constant (depending only on N), dΘ = dθ_1 … dθ_N is the usual translation-invariant measure on the space of angles Θ, e^{iΘ} = (e^{iθ_1}, …, e^{iθ_N}) and, for a vector x = (x_1, …, x_N), Van(x) is the Vandermonde determinant
Van(x) = det(x_j^{k−1})_{N×N} = ∏_{1≤j<k≤N} (x_k − x_j). (7)
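As a quick numerical sanity check of (7), the following Python sketch compares the determinant of the matrix (x_j^{k−1}) with the product of differences for a few points on the unit circle, and evaluates the unnormalized CUE density of (6) at those angles. The helper names and the sample angles are illustrative choices, not part of the paper.

```python
import cmath

def vandermonde_product(x):
    """Product of (x_k - x_j) over all pairs j < k, the right-hand side of (7)."""
    prod = 1 + 0j
    for k in range(len(x)):
        for j in range(k):
            prod *= x[k] - x[j]
    return prod

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(entry) for entry in row] for row in matrix]
    n = len(a)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

# Sample angles (arbitrary) and the corresponding points e^{i theta} on the unit circle.
thetas = [0.3, 1.1, 2.0, 4.5]
x = [cmath.exp(1j * t) for t in thetas]

# Row j holds the powers x_j^0, x_j^1, ..., x_j^(N-1).
matrix = [[x[j] ** k for k in range(len(x))] for j in range(len(x))]

det_value = determinant(matrix)
prod_value = vandermonde_product(x)
cue_density = abs(prod_value) ** 2  # unnormalized density of (6) at these angles
```

The two values `det_value` and `prod_value` should agree up to rounding error; `cue_density` vanishes only when two angles coincide, which is the eigenvalue repulsion encoded in the Vandermonde factor.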
The construction of the Circular Orthogonal Ensemble (COE) is as follows. One starts with the set S = S(N) of N × N symmetric unitary matrices H. However, because S(N) is not a group, the choice of the probability measure dµ(H) is not as obvious as it was for the CUE. Let G = G(N) again be the group of N × N unitary matrices g, and K = K(N) ⊂ G(N) be the group of orthogonal matrices. Let Ω(g) = (g^T)^{−1} be the involution of G whose fixed-point set is K. Then we may identify
G/K ≅ S, G ∋ g ↦ H = gΩ(g)^{−1} =: g^{1−Ω}, (8)
and by general principles the translation-invariant probability measures on G and K determine a unique G-invariant measure dµ(ḡ) = dµ(H) on G/K ≅ S which satisfies
∫_G f(g) dHaar(g) = ∫_{G/K} ∫_K f(gk) dHaar(k) dµ(ḡ), (9)
where on the right-hand side g stands for a choice of an element g ∈ G such that gK = ḡ. The left translation-invariance of dHaar(g) ensures that dµ(ḡ) is invariant under left translations by elements of K, therefore the measure dµ(H) is invariant under orthogonal changes of bases, hence the ensemble’s name. The probability measure of eigenvalues dν(a) = dν(Θ) is again that which satisfies (5) (with the same torus A ⊂ S as for the CUE). It is known that [11]
dν(Θ) ∝ |Van(e^{iΘ})| dΘ on [0, 2π)^N. (10)
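The identification (8) can be illustrated for N = 2. The sketch below assumes the involution g ↦ (g^T)^{−1}, so that a unitary g is sent to H = g g^T, and checks that H is symmetric and unitary and that left translation of g by an orthogonal k conjugates H by k. The unitary matrix comes from an explicit angle parametrization and is not Haar-distributed; all function names are hypothetical.

```python
import cmath
import math
import random

def random_unitary_2x2(rng):
    """A 2x2 unitary matrix from four random angles (not Haar-distributed)."""
    phi, alpha, beta, gamma = (rng.uniform(0, 2 * math.pi) for _ in range(4))
    c, s = math.cos(gamma), math.sin(gamma)
    phase = cmath.exp(1j * phi)
    return [
        [phase * cmath.exp(1j * alpha) * c, phase * cmath.exp(1j * beta) * s],
        [-phase * cmath.exp(-1j * beta) * s, phase * cmath.exp(-1j * alpha) * c],
    ]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def dagger(m):
    """Conjugate transpose."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def coe_element(g):
    """H = g * g^T, i.e. g multiplied by the inverse of (g^T)^{-1}."""
    return matmul(g, transpose(g))

rng = random.Random(0)
g = random_unitary_2x2(rng)
H = coe_element(g)

identity = [[1, 0], [0, 1]]
symmetry_error = max_abs_diff(H, transpose(H))
unitarity_error = max_abs_diff(matmul(H, dagger(H)), identity)

# Left translation by an orthogonal k acts on H by conjugation: (kg)(kg)^T = k H k^T.
t = 0.7
k = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
conjugation_error = max_abs_diff(
    coe_element(matmul(k, g)), matmul(matmul(k, H), transpose(k))
)
```

All three error values should be at the level of floating-point rounding, which matches the statement that the measure on S is invariant under orthogonal changes of basis.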
The constructions of the Circular Symplectic Ensemble (CSE) and of its measure on eigenvalues dν(Θ) are very similar to the case of the COE. Here S(N) consists of 2N × 2N self-dual unitary matrices. Namely, letting
J = J_N = [ 0  −I_N ; I_N  0 ], (11)
a matrix H is self-dual if it equals its dual H^D := J H^T J^T. If we let G = G(N) be the group of 2N × 2N unitary matrices and K = K(N) be the subgroup of symplectic matrices k (they satisfy k J k^T = J) then K is the fixed-point set of the involution Ω(g) = (g^D)^{−1}. The identification (8) continues to hold and (9) again defines the probability measure dµ(H) = dµ(ḡ) of the ensemble. It is invariant under symplectic changes of bases. The torus A consists here of diagonal matrices:
A = {diag(e^{iθ_1}, …, e^{iθ_N}, e^{iθ_1}, …, e^{iθ_N})} (12)
with twice-repeated eigenvalues. Then the probability measure of the eigenvalues is characterized by (5), and indeed
dν(Θ) ∝ |Van(e^{iΘ})|^4 dΘ on [0, 2π)^N. (13)
Summing up, the measure on eigenvalues for the circular ensembles is given by
dν(Θ) ∝ |Van(e^{iΘ})|^β dΘ, (14)
where β = 1, 2, 4 in the orthogonal, unitary and symplectic cases, respectively.

Remark. It can be appreciated that the parameter β determines the strength of the repulsion between nearby eigenvalues: this repulsion is stronger the larger β is. Hence anything that measures the local interactions between eigenvalues is likely to depend on β. This is the case, in particular, of the “local correlations” between eigenvalues, cf. Sect. 4.

Remark. The apparent dissimilarity in the construction of the measure dµ(H) in the case of the unitary vs. the orthogonal and symplectic ensembles is not essential. In fact, the unitary ensemble S(N) is still a quotient G(N)/K(N), where G(N) = U(N) × U(N) is the direct product of two copies of the unitary group, and K(N) is the diagonal of G(N) (isomorphic to the unitary group itself). If we identify S(N) with the “anti-diagonal” {H = (g, g^{−1})} ⊂ G(N) and take Ω(g, h) = (h, g), then the construction of the ensemble and of the measures dµ(H) and dν(Θ) follows through in essentially the same manner. We omit the details.

The key observation is that the constructions above show that the circular ensembles are examples of Riemannian globally symmetric spaces.

3. Compact Symmetric Spaces as Matrix Ensembles

Any Riemannian globally symmetric space X is locally isometric to a product of irreducible ones (the symbol “≈” means “is locally isometric to”):
X ≈ ∏_i X_i^{(c)} × ∏_j X_j^{(nc)} × E^ℓ, (15)
where the X_i^{(c)} (resp., the X_j^{(nc)}) are irreducible symmetric spaces of compact (resp., noncompact) type, and E^ℓ = (E^1)^ℓ is ℓ-dimensional Euclidean space (a flat manifold). In the case of the circular ensembles, we have
CUE = U(N) ≈ SU(N) × S^1,
COE = U(N)/O(N) ≈ (SU(N)/SO(N)) × S^1,
CSE = U(2N)/USp(2N) ≈ (SU(2N)/USp(2N)) × S^1, (16)
where in each case the first factor is an irreducible symmetric space of the compact type and the other (Euclidean) factor is a circle S 1 ≈ E 1 (we write S 1 rather than E 1 to emphasize that the spaces are compact). In the language of differential geometry, the probability measure of a circular ensemble is the one determined by the natural volume element of the manifold. Hence the natural question arises as to how to construct a random matrix ensemble corresponding to each (infinite) family of irreducible symmetric spaces of compact type. The restriction to infinite families is due to the need to have a large parameter N such that the number of eigenvalues grows with N , and then we are interested mainly in limiting statistics.
The presence of the Euclidean factor S^1 (which comes from the subset of scalar multiples of the identity matrix within the ensemble) is rather convenient and natural. If we were to define “irreducible” circular ensembles analogously to Dyson’s circular ensembles, except requiring that they consist of matrices with unit determinant, then the spaces so obtained would be irreducible symmetric spaces of the compact type (i.e., the factors S^1 would disappear from (16)). However, the measure on eigenvalues would no longer be translationally invariant (under transformations of the form Θ → Θ + (t, …, t)). Namely, instead of the measure (14), we would obtain an asymmetric version given by the same formula but with Θ replaced by (θ_1, …, θ_{N−1}, −θ_1 − ⋯ − θ_{N−1}) and with dθ_N omitted from the volume element dΘ. As may be expected from such a loss of symmetry, a rigorous analysis of these “irreducible” ensembles would be more involved.

Since we are considering only compact symmetric spaces, it is possible to normalize the natural volume element to obtain a probability measure. This is not the case for symmetric spaces of non-compact type. To clarify the difference, we analyze the example of the classical Gaussian matrix ensembles, which also fit within the framework of the theory of symmetric spaces (the construction is analogous to that of the circular ensembles):
GUE ≈ SL(N, C)/SU(N) × E^1,
GOE ≈ SL(N, R)/SO(N) × E^1,
GSE ≈ SU*(2N)/USp(2N) × E^1. (17)
Finding the probability measure on eigenvalues also reduces to a factorization of measures dµ(H) = dHaar(k) dν(a) in the sense of (5), where K is still the group of invariance (orthogonal, unitary, symplectic) of the ensemble’s measure, but where A ≅ E^N is now a Euclidean space, which in the case of these ensembles consists of real diagonal matrices which can be parametrized by N-tuples Λ = (λ_1, …, λ_N) of real numbers. However, the measure dµ(H) is certainly not the one obtained from the Riemannian volume element dHaar(g) of G through (10) since the latter is not normalizable. A choice has to be made to make this measure into a finite one while preserving its left and right K-invariance. One possibility is provided by a “Gaussian” probability measure on G proportional to
e^{−(β/2) tr g^2} dHaar(g) (18)
(the symmetry parameter β = 1, 2, 4 corresponds to the orthogonal, unitary and symplectic cases, respectively, just as in the case of the Orthogonal ensembles), which in turn yields the measure on eigenvalues:
dν(a) ∝ e^{−β Σ_j λ_j^2} |Van(Λ)|^β dΛ. (19)
It can be rightfully argued that the choice of the Gaussian normalization for the measure on these matrix ensembles is rather arbitrary and motivated by analytical rather than conceptual considerations. The point we wish to state here is that making such a choice is unavoidable. For the compact spaces, however, no such choice needs to be made since their volume element already determines a unique probability measure. We will henceforth restrict our attention to compact ensembles for that reason. The general definition of a Riemannian symmetric space of the compact type is as follows. We start with a compact semisimple Lie algebra g (i. e., exp(ad(g)) ⊂ GL(g) is
Table 2. The infinite families of symmetric spaces of type I

Type     G/K                                  Rank R
A I      SU(N)/SO(N)                          N − 1
A II     SU(2N)/USp(2N)                       N − 1
A III    SU(M + N)/S(U(M) × U(N))             min(M, N)
BD I     SO(M + N)/SO(M) × SO(N)              min(M, N)
D III    SO(2N)/U(N)                          ⌊N/2⌋
C I      USp(2N)/U(N)                         N
C II     USp(2M + 2N)/USp(2M) × USp(2N)       min(M, N)
compact) having an involutive automorphism ω. Then g splits into the sum of the (+1)- and (−1)-eigenspaces of ω as
g = k ⊕ p (20)
(the subspace p ⊂ g can be identified with the tangent space to G/K at the identity coset o = K/K). G/K is called a Riemannian symmetric space of the compact type if (1) K ⊂ G are Lie groups (G connected) whose Lie algebras are k, g; and (2) there is a (necessarily unique) involutive automorphism Ω of G such that (G^Ω)_o ⊂ K ⊂ G^Ω, where G^Ω is the fixed-point set of Ω in G (a Lie subgroup of G) and (G^Ω)_o is its identity component (then dΩ_e = ω).

The complete list of irreducible symmetric spaces (up to local isometry) is known by the classical work of Cartan. As we will explain later, it suffices to consider one matrix ensemble in each equivalence class of locally isometric symmetric spaces, because the measures on eigenvalues for locally isometric ensembles are the same.

The irreducible symmetric spaces of compact type are classified into spaces of “Type I” and “Type II”. Of these the latter are simplest to describe: they are the (connected) simple compact Lie groups G, provided with a bi-invariant (under both left and right translations) Riemannian metric. Proving that such a G is a bona fide symmetric space of the compact type as defined before involves expressing it as (G × G)/G in a manner analogous to the remark at the end of Sect. 2 (for the CUE). Up to local isometry, the infinite families of Type II spaces are those of orthogonal SO(N), unitary SU(N) and (compact) symplectic USp(2N) groups. The random matrix theory of these spaces is well known [17].

The Type I spaces, on the other hand, are those symmetric spaces G/K of the compact type with G simple. The bi-invariant Riemannian metric on G determines that on the quotient G/K. Table 2 lists the infinite families of Type I spaces, up to local isometry. Without loss of generality, we assume henceforth that min(M, N) = N.

Choose a maximal abelian subalgebra a of g contained in p. Then the subgroup A = exp(a) is a torus that projects onto a totally flat submanifold AK/K ⊂ G/K (a flat torus).
This totally flat manifold is maximal, and its dimension is the rank R of the symmetric space G/K. Thus, R = dim(AK/K) = dim A = dim a. Guided by the exposition in the previous section, it is reasonable to regard as ensembles the symmetric spaces G/K of type I endowed with their normalized Riemannian volume elements dµ(ḡ), which satisfy (9). However, the elements of these ensembles are not matrices but rather cosets ḡ = gK ∈ G/K.

Theorem 1. The infinite families of type I ensembles G/K can be realized as matrix ensembles S. Indeed, (8) maps G/K bijectively onto a submanifold S ⊂ G, and G is a
classical group of matrices, hence S is a space of matrices. Under this correspondence, AK/K ⊂ G/K is mapped onto the torus A. The action of K on G/K by left translation corresponds to the conjugation H → kH k −1 on matrices H ∈ S, and any H ∈ S is conjugate to some a ∈ A under this action. Moreover, two matrices in A are conjugate under K if and only if they have the same eigenvalues. The proof of the theorem is a long exercise in elementary linear algebra. We shall omit most of the details, which can be found in [10]. In what follows we describe the explicit matrix ensembles S which are the images of the imbedding (8). In each case, we choose the involution of G so that its fixed-point set is exactly K. The cases of A I (COE) and A II (CSE) have been discussed already. We introduce some notation (recall that JN is defined by Eq. (11)): IN JN = , (21) IN 2N×2N JM , (22) JMN = JN (2M+2N)×(2M+2N) JM JMN = , (23) JN (2M+2N)×(2M+2N) IM IMN = . (24) −IN (M+N)×(M+N) The canonical bilinear antisymmetric matrix Jn in the definition of the compact symplectic group U Sp(2n) will be taken to be (11) in the case of ensembles with one parameter N (n = N ), and (22) in the case of ensembles with two parameters M, N (n = M + N ). A III. Take M ≥ N ≥ 1 and G(M, N ) = U (M + N ). Then K(M, N ) = U (M) × U (N) is the fixed-point set of the involution g → g := I gI ,
(25)
with I = IMN as in (24). The symmetric space U (M +N )/U (M)×U (N ) = SU (M +N )/S(U (M)×U (N )) is realized as the matrix ensemble
S(M, N ) := {H = GI such that G ∈ U (M + N ) is Hermitian of signature (M, N )}, under the identification (8). A choice of the abelian torus A is given by 1M−N N −N , A= N N
(26)
(27)
where N = diag(λ1 , . . . , λN ) is an arbitrary diagonal unitary matrix. Besides the eigenvalue 1 with multiplicity M − N , the eigenvalues of the matrix in (27) come in R = N pairs λj , λ−1 j , |λj | = 1. BD I. Let M ≥ N ≥ 1, G(M, N ) = O(M + N ), and K(M, N ) = O(M) × O(N ) be the fixed-point set of the involution (25) with I = IMN as in (24). Then
G/K = O(M + N)/O(M) × O(N ) = SO(M + N )/S(O(M) × O(N )) ≈ SO(M + N)/SO(M) × SO(N ) (the last two spaces are locally isometric). The symmetric space O(M +N )/O(M)×O(N ) can be realized as the set of matrices S(M, N ) := {H = gI such that g ∈ O(M + N ) is symmetric of signature (M, N )},
(28)
by means of (8). The torus A is just as in (27) and we get the same description for the eigenvalues. D III. Let G(N ) = SO(2N ) and K(N) = SO(2N ) ∩ Sp(2N, C) U (N ): g −g U (N ) g → ∈ K(N ). (29) g g Then K(N ) is the fixed-point set of the involution g → g := J T (g −1 )T J = J T gJ
(30)
with J = JN as in (11). We can identify G(N )/K(N ) with the set S(N ) := {H ∈ SO(2N ) s. t. H J is “dexter” antisymmetric}
(31)
using Eq. (8). We now explain what we mean by a dexter matrix. Say G is a 2N × 2N orthogonal antisymmetric matrix. Then an orthogonal change of basis puts it into the canonical form JN . However, this may not be possible by means of a proper orthogonal change of basis (i. e., of determinant +1). Specifically, when N is even, the two complex structures ±JN are equivalent (under, say, the proper orthogonal change of basis JN as in (21)), but when N is odd they are not. We call G dexter if, by a proper orthogonal change of basis, it can be taken into the canonical form +JN . Thus, for N even, all orthogonal antisymmetric matrices are dexter, whereas for N odd, only half of them are (in this case, conjugation by JN takes “dexter” matrices into “sinister” ones and vice-versa). Now, for H ∈ S(N), G := H J is dexter antisymmetric, so our discussion above proves the surjectivity of the mapping. The torus A is R −R R R for N even; R R − R R 1 , (32) A= R −R R R for N odd. 1 R R −R R where R = diag(λ1 , . . . , λR ) is a diagonal unitary matrix. Besides the double eigenvalue 1, which occurs for N odd, the matrices in (32) have R quadruples of eigenvalues −1 λj , λj , λ−1 j , λj .
C I. Here G(N ) = U Sp(2N ), and K(N) U (N ) is the fixed-point set of the involution (25) with I = INN as in (24). Explicitly, g ∈ K(N ). (33) U (N ) g → (g T )−1 Identify G(N )/K(N ) with the set S(N ) := {H = GI s.t. G ∈ U (2N ) is Hermitian and J G = −GJ }
(34)
by means of (8). The torus A is A=
N −N N N
,
(35)
with N a unitary diagonal matrix as before. The eigenvalues occur in pairs just as in the case of (27) with M = N . C II. Let M ≥ N ≥ 1 and G(M, N ) = U Sp(2M + 2N ). We take the complex structure J = JMN as in (22). Then K(M, N) = U Sp(2M) × U Sp(2N ) consists exactly of those elements that also stabilize IMN I = , (36) IMN with IMN as in (24), so that K(M, N ) is the fixed-point set of the involution
g → g := I gI .
(37)
We can realize the symmetric space G(M, N )/K(M, N ) as the set of matrices S(M, N ) := {H = GI such that G ∈ U Sp(2M + 2N ) is Hermitian of signature (M, N )},
(38)
where we mean the quaternionic signature as discussed below. We recall that any matrix G ∈ Sp(2n, C) which is Hermitian (G = GT ), has real eigenvalues and can be diagonalized with a symplectic matrix g ∈ Sp(2n, C), that is, n g −1 Gg = (39) −1 n for some real diagonal matrix n . The usual signature of G is of the form (2a, 2b), so we call (a, b) the quaternionic signature. The identification is, of course, given by (8). The torus A is IM−N N −N N N A= (40) IM−N N N −N N with N unitary diagonal. Besides the eigenvalue 1 with multiplicity 2(M − N ), the other eigenvalues occur in quadruples like those of the matrices in (32).
For each of the ensembles, the torus A, which has dimension equal to the rank R of the symmetric space, is parametrized by diagonal unitary matrices
Λ_R = diag(λ_1, …, λ_R), |λ_j| = 1. (41)
Abusing notation, we will also write Λ_R for the vector (λ_1, …, λ_R). The tangent space a to this torus at the identity is identified with the space of R-tuples iΘ = (iθ_1, …, iθ_R), θ_j ∈ ℝ. Recall that we identify p with the tangent space to G/K at the base-point o = K/K. The exponential maps Exp of G/K and exp of G are related by
Exp(X) = exp(X)K ∈ G/K (42)
for X ∈ p = T_o(G/K). For iΘ ∈ a, exp(iΘ) is given by the matrix on the right-hand side of Eqs. (27), (32), (35) and (40), respectively, provided we choose λ_j = e^{iθ_j} in (41).

Proposition 1 (KAK decomposition). Let G/K be a symmetric space of the compact type and A ⊂ G be as above. The mapping
K × A × K → G, (k_1, a, k_2) ↦ k_1 a k_2 (43)
is a surjection.

The KAK decomposition has an integral counterpart.

Proposition 2 (Weyl’s integration formula). There is a measure dν̄(a) on A such that, for any f ∈ C(G),
∫_G f(g) dg = ∫_K ∫_K ∫_A f(k_1 a k_2) dν̄(a) dk_2 dk_1. (44)
(We have simplified our notation by dropping the name “Haar” of the respective invariant measures.) Denote by Σ^+ the set of positive roots of the symmetric Lie algebra (g, ω), and by m_α the multiplicity of a positive root α ∈ Σ^+. Then
dν̄(a) ∝ ∏_{α∈Σ^+} |sin α(Θ)|^{m_α} da =: Δ(Θ) da, (45)
say, where Θ is chosen so a = exp(iΘ). (With the notation above, we write iΘ = log(a). This is well defined modulo 2π.) Now recall that the (positive) roots of (g, ω) are certain non-zero real-valued linear functionals on a (in fact one should speak about the roots which are positive with respect to a fixed Weyl chamber in a). The root systems of the irreducible orthogonal Lie algebras of compact type are well known by Cartan’s work.

Proposition 3. The positive roots and multiplicities for the irreducible orthogonal Lie algebras of type I are as follows (let L = M − N in the case of ensembles with two parameters).
• A I. α = θ_k − θ_j (j < k): m_α = 1.
• A II. α = θ_k − θ_j (j < k): m_α = 4.
• A III. α = θ_k ± θ_j (1 ≤ j < k ≤ R): m_α = 2; α = θ_j (1 ≤ j ≤ R): m_α = 2L; α = 2θ_j (1 ≤ j ≤ R): m_α = 1.
• BD I. α = θ_k ± θ_j (1 ≤ j < k ≤ R): m_α = 1; α = θ_j (1 ≤ j ≤ R): m_α = L.
• D III, N even. α = θ_k ± θ_j (1 ≤ j < k ≤ R): m_α = 4; α = 2θ_j (1 ≤ j ≤ R): m_α = 1.
• D III, N odd. α = θ_k ± θ_j (1 ≤ j < k ≤ R): m_α = 4; α = θ_j (1 ≤ j ≤ R): m_α = 4; α = 2θ_j (1 ≤ j ≤ R): m_α = 1.
• C I. α = θ_k ± θ_j (1 ≤ j < k ≤ R): m_α = 1; α = 2θ_j (1 ≤ j ≤ R): m_α = 1.
• C II. α = θ_k ± θ_j (1 ≤ j < k ≤ R): m_α = 4; α = θ_j (1 ≤ j ≤ R): m_α = 4L; α = 2θ_j (1 ≤ j ≤ R): m_α = 3.

We are now ready to derive the measure on eigenvalues for ensembles of type I.

Theorem 2. The measure on eigenvalues for a symmetric space of type I is given by
dν(a) ∝ Δ(Θ/2) da = ∏_{α∈Σ^+} |sin(α(Θ)/2)|^{m_α} da, iΘ = log a. (46)
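Theorem 2 can be cross-checked numerically against Table 1 and (1). With x_j = cos θ_j, the product over positive roots of |sin(α(Θ)/2)|^{m_α} should agree, up to a constant factor, with the Jacobi density (1) pulled back to the angles (which contributes one factor sin θ_j per level). The Python sketch below tests this at a few random points for each Jacobi-type family. The multiplicities are those of Proposition 3; the A III row (2, 2L, 1) is assumed here because it is the only choice consistent with β = 2 and (a, b) = (L, 0) in Table 1. Function names, L = 2 and R = 3 are illustrative.

```python
import math
import random

def type_one_data(L):
    """Map each type to ((m_pair, m_short, m_long), (beta, a, b)).

    m_pair, m_short, m_long are the multiplicities of the roots
    theta_k +- theta_j, theta_j and 2*theta_j; (beta, a, b) come from Table 1.
    """
    return {
        "A III":        ((2, 2 * L, 1), (2, L, 0)),
        "BD I":         ((1, L, 0),     (1, (L - 1) / 2, -0.5)),
        "D III (even)": ((4, 0, 1),     (4, 0, 0)),
        "D III (odd)":  ((4, 4, 1),     (4, 2, 0)),
        "C I":          ((1, 0, 1),     (1, 0, 0)),
        "C II":         ((4, 4 * L, 3), (4, 2 * L + 1, 1)),
    }

def log_root_density(thetas, mults):
    """Logarithm of the product over roots of |sin(alpha(Theta)/2)|^m_alpha."""
    m_pair, m_short, m_long = mults
    total = 0.0
    for k in range(len(thetas)):
        for j in range(k):
            total += m_pair * math.log(abs(math.sin((thetas[k] - thetas[j]) / 2)))
            total += m_pair * math.log(abs(math.sin((thetas[k] + thetas[j]) / 2)))
    for t in thetas:
        total += m_short * math.log(abs(math.sin(t / 2)))  # root theta_j
        total += m_long * math.log(abs(math.sin(t)))       # root 2*theta_j
    return total

def log_jacobi_density_in_angles(thetas, params):
    """Logarithm of the Jacobi density (1) at x = cos(theta), times |dx/dtheta|."""
    beta, a, b = params
    xs = [math.cos(t) for t in thetas]
    total = 0.0
    for k in range(len(xs)):
        for j in range(k):
            total += beta * math.log(abs(xs[k] - xs[j]))
    for x, t in zip(xs, thetas):
        total += a * math.log(1 - x) + b * math.log(1 + x) + math.log(math.sin(t))
    return total

def max_spread(L=2, R=3, samples=5, seed=0):
    """Largest variation, over the sample points, of the log-ratio of the two densities."""
    rng = random.Random(seed)
    points = [[rng.uniform(0.2, math.pi - 0.2) for _ in range(R)] for _ in range(samples)]
    worst = 0.0
    for mults, params in type_one_data(L).values():
        diffs = [
            log_root_density(p, mults) - log_jacobi_density_in_angles(p, params)
            for p in points
        ]
        worst = max(worst, max(diffs) - min(diffs))
    return worst
```

If the two densities are proportional, the log-ratio is the same at every sample point, so `max_spread()` should be zero up to rounding error for all six rows.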
Using Weyl’s integration formula, we deduce that, for any f ∈ C(S),
∫_S f(H) dµ(H) = ∫_{G/K} ∫_K f((gk)^{1−Ω}) dk dµ(ḡ) (by (8))
= ∫_G f(g^{1−Ω}) dg (since k^{1−Ω} = e)
= ∫_K ∫_A ∫_K f((k_1 a k_2)^{1−Ω}) dk_2 dν̄(a) dk_1
= ∫_K ∫_A ∫_K f((k_1 a)^{1−Ω}) dk_2 dν̄(a) dk_1
= ∫_K ∫_A f((ka)^{1−Ω}) dν̄(a) dk
= ∫_K ∫_A f(k a^2 k^{−1}) dν̄(a) dk (since a^{1−Ω} = a^2). (47)
This ought to be compared with (5), which defines the measure dν(a) on eigenvalues. A key property of the measure dν̄(a) defined by (45) is reflected in the fact that Δ(Θ) = Δ(Θ′) if Θ ≡ Θ′ mod π (this follows in general from the fact that the roots take integral values on the “unit lattice” exp^{−1}(e), and can be verified for ensembles of type I directly using Proposition 3). From that observation, it follows that:
∫_K ∫_A f(k a^2 k^{−1}) dν̄(a) dk ∝ ∫_K ∫_{[0,2π]^R} f(k exp(2iΘ) k^{−1}) Δ(Θ) dΘ dk
= 2^R ∫_K ∫_{[0,π]^R} f(k exp(2iΘ) k^{−1}) Δ(Θ) dΘ dk
= ∫_K ∫_{[0,2π]^R} f(k exp(iΘ) k^{−1}) Δ(Θ/2) dΘ dk
= ∫_K ∫_A f(k a k^{−1}) Δ(Θ/2) da dk. (48)
When put together with (47), this proves (46).

Now we restrict attention to the most interesting case, that of “class functions” f ∈ C(K\S), that is, those functions on S which depend only on the eigenvalues of the matrix, viz.
f(kak^{−1}) = f(a). (49)
The tori A are parametrized by R-tuples (λ_j = e^{iθ_j}). From the knowledge of the structure of the set of eigenvalues of the matrices in these tori, we see that for all the ensembles of type I except for Dyson’s A I and A II, changing the sign of any θ_j does not change the set of eigenvalues since these always come in pairs {e^{±iθ_j}} (with single or double multiplicity), hence any class function f ∈ C(K\S) is determined by its values on exp([0, π]^R) ⊂ A, and correspondingly
∫_S f(H) dµ(H) = ∫_A f(a) dν(a) ∝ ∫_{[−π,π]^R} f(exp(iΘ)) Δ(Θ/2) dΘ = 2^R ∫_{[0,π]^R} f(exp(iΘ)) Δ(Θ/2) dΘ. (50)
Hence, except in the cases of A I and A II, it is convenient to regard the measure on eigenvalues as one supported on [0, π]^R. Noting that the contribution of a pair of roots θ_k ± θ_j to Δ(Θ/2) is
|sin((θ_k − θ_j)/2) sin((θ_k + θ_j)/2)| ∝ |cos θ_k − cos θ_j|, (51)
it is clear that for all the ensembles of type I, except for the COE and the CSE, the measure on eigenvalues is proportional to the measure
|Van(cos Θ)|^β ∏_{1≤j≤R} |sin θ_j|^P |sin(θ_j/2)|^Q dΘ, on [0, π]^R. (52)
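The proportionality in (51) is the product-to-sum identity sin A sin B = (cos(A − B) − cos(A + B))/2 with A = (θ_k − θ_j)/2 and B = (θ_k + θ_j)/2, which gives the constant 1/2. A minimal Python check at a few random angle pairs, with illustrative function names:

```python
import math
import random

def pair_factor(theta_j, theta_k):
    """Left-hand side of (51): contribution of the two roots theta_k +- theta_j."""
    return abs(math.sin((theta_k - theta_j) / 2) * math.sin((theta_k + theta_j) / 2))

def cosine_gap(theta_j, theta_k):
    """Right-hand side of (51) without the constant: |cos(theta_k) - cos(theta_j)|."""
    return abs(math.cos(theta_k) - math.cos(theta_j))

rng = random.Random(1)
ratios = []
for _ in range(5):
    tj = rng.uniform(0, math.pi)
    tk = rng.uniform(0, math.pi)
    ratios.append(pair_factor(tj, tk) / cosine_gap(tj, tk))
```

Every entry of `ratios` should equal 1/2 up to rounding, so the product over all pairs j < k of the factors in (51), each raised to the power β, is a constant multiple of |Van(cos Θ)|^β.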
Table 3. Parameters of the probability measure of the eigenvalues for ensembles of type II

Type         S(N)           Parameters
a_N (CUE)    U(N)           β = 2
b_N          SO(2N + 1)     β = 2, (a, b) = (1/2, −1/2)
c_N          USp(2N)        β = 2, (a, b) = (1/2, 1/2)
d_N          SO(2N)         β = 2, (a, b) = (−1/2, −1/2)
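(Here $\beta = 1, 2, 4$ according to the multiplicity $m_\alpha$ of the roots $\theta_k \pm \theta_j$.) Because $|\sin\theta| = |1-\cos\theta|^{1/2}\,|1+\cos\theta|^{1/2}$ and $|\sin(\theta/2)| = 2^{-1/2}\,|1-\cos\theta|^{1/2}$, the above is proportional to the measure
$$|\operatorname{Van}(\cos\Theta)|^{\beta} \prod_{1\le j\le R} |1-\cos\theta_j|^{p}\,|1+\cos\theta_j|^{q}\,d\Theta \quad \text{on } [0,\pi]^R. \tag{53}$$
We make the change of variables $\Theta \mapsto x = \cos\Theta$ to obtain
$$d\nu(x) \propto |\operatorname{Van}(x)|^{\beta} \prod_{1\le j\le R} |1-x_j|^{a}\,|1+x_j|^{b}\,dx \quad \text{on } [-1,1]^R, \tag{54}$$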
where $a = p - 1/2$, $b = q - 1/2$, and $dx = dx_1 \cdots dx_R$. The weight function
$$w(x) = |1-x|^{a}\,|1+x|^{b} \quad \text{on } [-1,1] \tag{55}$$
is that with respect to which the classical Jacobi orthogonal polynomials $P_n^{(a,b)}(x)$ are defined, so a matrix ensemble for which the probability measure of the eigenvalues is given by (54) is called a Jacobi ensemble (with parameters $(a, b)$). For $\beta = 1, 2, 4$ we call such an ensemble orthogonal, unitary or symplectic, respectively.

Recall that, for the COE and CSE, the probability measure of the eigenvalues is given by (14). It coincides with that given by Weyl's formula (Proposition 2) since
$$|e^{i\theta_k} - e^{i\theta_j}| = 2\left|\sin\frac{\theta_k-\theta_j}{2}\right|. \tag{56}$$
For completeness, Table 3 is the analogue of Table 1 for (the infinite families of) symmetric spaces of type II (compact Lie groups). The CUE is a circular ensemble with $\beta = 2$ and measure on eigenvalues (14), whereas the orthogonal and symplectic groups are unitary Jacobi ensembles.

4. Universality of Local Correlations: Statement of Results

In this section we state our main results, the limiting level density and correlation functions for general Jacobi ensembles. The proofs will be presented in the next section. As we have shown, with the exception of Dyson's COE (A I) and CSE (A II), the ensembles of type I are special cases of (orthogonal, unitary or symplectic) Jacobi ensembles. We consider the joint probability measure of the $R$ levels (we speak about levels rather than eigenvalues since the natural variables to use are $x_j = \operatorname{Re}\lambda_j$) given in the general form
$$d\nu(\mathbf{x}_R) = P_R(\mathbf{x}_R)\,d\mathbf{x}_R, \tag{57}$$
where $\mathbf{x}_R = (x_1, \dots, x_R)$ is an $R$-tuple of levels. The $n$-level correlation function $I_R^{(n)}(\mathbf{x}_n)$ is defined by
$$I_R^{(n)}(\mathbf{x}_n) = \frac{R!}{(R-n)!} \int\cdots\int P_R(\mathbf{x}_n, x_{n+1}, \dots, x_R)\,dx_{n+1}\cdots dx_R. \tag{58}$$
It is, loosely speaking, the probability that $n$ of the levels, regardless of order, lie in infinitesimal neighborhoods of $x_1, \dots, x_n$ (but the total mass of the measure $I_R^{(n)}(\mathbf{x}_n)\,d\mathbf{x}_n$ is now $R!/(R-n)!$ and not 1).

The semi-classical limit $R \to \infty$ is of great interest. The so-called "universality conjecture" (which dates back to the work of Dyson [14]) states that the local correlations of the eigenvalues in the bulk of the spectrum tend to very specific limits that depend only on the symmetry parameter $\beta$. Special cases of the truth of this assertion are known. In particular, in the unitary case $\beta = 2$, the result is proven in certain generality [8, 7, 2, 3], but for $\beta = 1, 4$ it is known only for special ensembles such as the circular ensembles of Dyson [11–13] and, by work of Nagao and Forrester [23], for most Laguerre ensembles and Jacobi ensembles. However, the latter assumes that the parameters $a, b$ are strictly positive, hence it is not applicable to ensembles of type I (cf. Table 1).

It is an extremely important fact that for general orthogonal, unitary and symplectic ensembles the correlation functions can be expressed as determinants (a discovery which goes back, in the unitary case, to the work of Gaudin and Mehta [15, 19], and in the orthogonal and symplectic cases to Dyson's study of his circular ensembles, later extended by Chadha, Mahoux and Mehta [18, 6, 22] to the general case). In the case of unitary Jacobi ensembles there exists a scalar-valued kernel $K_{R2}^{(a,b)}(x, y)$ defined in terms of the classical Jacobi orthogonal polynomials $P_n^{(a,b)}(x)$ (the projector kernel onto the span of the first $R$ Jacobi polynomials) satisfying [24]
$$I_{R\beta}^{(n)}(\mathbf{x}_n) = \det\big(K_{R\beta}(x_j, x_k)\big)_{j,k=1,\dots,n}. \tag{59}$$
In the case of the orthogonal (resp., symplectic) Jacobi ensembles, there exists a matrix-valued kernel [24] (alternatively, a "quaternion" kernel)
$$K_{R\beta}^{(a,b)}(x, y) = \begin{pmatrix} S_{R\beta}^{(a,b)}(x, y) & I_{R\beta}^{(a,b)}(x, y) - \delta\,\epsilon(x - y) \\ D_{R\beta}^{(a,b)}(x, y) & S_{R\beta}^{(a,b)T}(x, y) \end{pmatrix}, \tag{60}$$
where $\delta = 1$ (resp., $\delta = 0$; the $\epsilon$-term is absent in the symplectic case),
$$\epsilon(z) = \frac{1}{2}\operatorname{sgn}(z) = \frac{1}{2}\,\frac{z}{|z|}, \tag{61}$$
and the scalar kernel $S_{R\beta}^{(a,b)}$ is defined in terms of the skew-orthogonal polynomials of the second (resp., first) kind depending on the weight (55), and the other quantities are given by
$$I_{R\beta}^{(a,b)}(x, y) = -\int_x^y S_{R\beta}^{(a,b)}(x, z)\,dz, \tag{62}$$
$$D_{R\beta}^{(a,b)}(x, y) = \partial_x S_{R\beta}^{(a,b)}(x, y), \tag{63}$$
$$S_{R\beta}^{(a,b)T}(x, y) = S_{R\beta}^{(a,b)}(y, x). \tag{64}$$
The matrix kernel (60) is self-dual in the sense that $K_{R\beta}^{(a,b)}(y, x) = K_{R\beta}^{(a,b)}(x, y)^D$ (cf. Sect. 2). The correlation functions themselves are given by
$$I_{R\beta}^{(n)}(\mathbf{x}_n) = \det\big(K_{R\beta}(x_j, x_k)\big)_{n\times n}. \tag{65}$$
Indeed, if the matrix $(K_{R\beta}(x_j, x_k))_{n\times n}$ is interpreted as a quaternion self-dual matrix [20], then the right-hand side of (65) is its Dyson "quaternion determinant" qdet [11–13], so (65) can be rewritten:
$$I_{R\beta}^{(n)}(\mathbf{x}_n) = \operatorname{qdet}\big(K_{R\beta}(x_j, x_k)\big)_{n\times n}. \tag{66}$$
Remark. In what follows we will unify notation by writing DET (all caps) to signify the usual determinant when $\beta = 2$ and the quaternion determinant when $\beta = 1, 4$. Thus, both Eqs. (59) and (66) will be written
$$I_{R\beta}^{(n)}(\mathbf{x}_n) = \operatorname{DET}\big(K_{R\beta}(x_j, x_k)\big)_{n\times n}. \tag{67}$$
The first quantity of interest is the (global) level density. Indeed, since the first correlation function has total mass $R$, one might expect that the probability measure $R^{-1} I_R^{(1)}(x)\,dx$ on $[-1, 1]$ tends to a limiting measure as $R \to \infty$. We define the level density to be the corresponding probability density function:
$$\rho(x) = \lim_{R\to\infty} R^{-1} I_R^{(1)}(x). \tag{68}$$
Assuming $\rho(x)$ to be continuous, the bulk of the spectrum is the set $\{x : \rho(x) > 0\}$: points where the level density vanishes or blows up to infinity are excluded from the bulk of the spectrum.

Theorem 3. For the orthogonal, unitary or symplectic Jacobi ensembles associated to the weight function (55), the global level density is given by
$$\rho(x) = \frac{1}{\pi\sqrt{1 - x^2}} \quad \text{on } (-1, 1). \tag{69}$$
The limit in (68) is attained uniformly on compact subsets of $(-1, 1)$. This theorem will be proved in the following section. If we revert to the angular variable $\theta$ with $x = \cos\theta$, we see that
$$\rho(x)\,dx = \frac{d\theta}{\pi}, \tag{70}$$
so the level density in the variable $\theta$ is constant, identically $1/\pi$ on $(0, \pi)$: the eigenvalues become equidistributed on the unit circle (with respect to its invariant measure), and uniformly so away from the central eigenvalues $\pm 1$, in the semiclassical limit $R \to \infty$. The bulk of the spectrum excludes the edges $\pm 1$.

The local $n$-level correlations are the "local" semi-classical limits of the $n$-level correlations $I_R^{(n)}$. When localizing in a neighborhood of a fixed level $z_o$ belonging to the bulk of the spectrum, these local correlations are universal in the sense that they depend neither on the specific ensemble nor on the choice of $z_o$ but only on the symmetry parameter $\beta$. In particular they coincide with the local correlations of the Gaussian Orthogonal ($\beta = 1$), Unitary ($\beta = 2$) or Symplectic ($\beta = 4$) ensemble, respectively.
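The statement of Theorem 3 can be illustrated numerically in one case where the kernel is explicit. The following is a minimal sketch, assuming the unitary Jacobi ensemble with $a = b = -1/2$ (the parameters listed for SO(2N) in Table 3), for which the orthonormal polynomials are Chebyshev polynomials of the first kind; the function names are hypothetical and chosen only for this illustration.

```python
import math


def chebyshev_kernel_diagonal(n_levels, x):
    """Diagonal K_N(x, x) of the unitary Jacobi kernel for a = b = -1/2.

    Assumption of this sketch: the kernel is w(x) * sum_k p_k(x)^2, where
    p_0 = 1/sqrt(pi) and p_k = sqrt(2/pi) * cos(k * theta), x = cos(theta),
    are orthonormal for the weight w(x) = (1 - x^2)^(-1/2).
    """
    theta = math.acos(x)
    weight = 1.0 / math.sqrt(1.0 - x * x)
    total = 1.0 / math.pi
    for k in range(1, n_levels):
        total += (2.0 / math.pi) * math.cos(k * theta) ** 2
    return weight * total


def limiting_density(x):
    """Limiting level density 1 / (pi * sqrt(1 - x^2)) of Eq. (69)."""
    return 1.0 / (math.pi * math.sqrt(1.0 - x * x))


if __name__ == "__main__":
    n_levels = 2000
    for x in (-0.8, 0.0, 0.3, 0.9):
        finite = chebyshev_kernel_diagonal(n_levels, x) / n_levels
        print(f"x = {x:+.1f}  K_N/N = {finite:.6f}  rho = {limiting_density(x):.6f}")
```

For fixed $x$ in the bulk the two printed columns should agree up to a correction of order $1/N$, which is the content of (68) and (69) in this special case.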
For Jacobi ensembles the bulk of the spectrum consists of the open interval $(-1, 1)$, whereas the local correlations near the "hard edges" $\pm 1$ (which correspond to the "central eigenvalues" $\pm 1$ on the unit circle) have a different behavior which is sensitive to the parameters $(a, b)$ of the ensemble.

Remark. As we shall see later, the level density vanishes to some order at, say, the hard edge $+1$ depending on the parameter $a$ (which is natural since $a$ determines the order to which the weight function (55) vanishes at $x_j = +1$). The local correlations fail to follow Dyson's universal "threefold way", but rather depend on this parameter. The same limiting behavior occurs at the hard edge 0 of Laguerre ensembles [23], so that, at least conjecturally, these "universal" laws, manifestly different from Dyson's bulk regimes, describe the behavior of the local correlations at a hard edge for general orthogonal, unitary or symplectic ensembles.

We now fix a level $z_o \in [-1, 1]$. Given that the eigenvalue density is uniform, it is natural to change variables from $x$ to $\xi$ stretching the angles by a factor $R$, namely setting
$$x_j = \cos\left(\alpha_o + \frac{\pi}{R}\,\xi_j\right), \tag{71}$$
where $\alpha_o = \arccos z_o$ (note that the change of variables depends on $R$). The semiclassical limit of the correlation functions is obtained by letting $R$ tend to infinity. What the factor $\pi/R$ accomplishes is that, on the bulk of the spectrum, the local level density (i.e., the local limit of the correlation function $I_{R\beta}^{(1)}$) will be $\bar\rho(\xi) \equiv 1$.

Theorem 4. For the orthogonal ($\beta = 1$), unitary ($\beta = 2$) and symplectic ($\beta = 4$) Jacobi ensembles associated to the weight function (55), the local correlations are as follows:

• Bulk local correlations (independent of β and of the choice of a fixed $z_o = \cos\alpha_o \in (-1, 1)$).

– Local level density:
$$\bar\rho(\xi) = \lim_{R\to\infty} (R\rho(x))^{-1} I_{R\beta}^{(1)}(x) \equiv 1, \quad \xi \in \mathbb{R}, \tag{72}$$
where $x$ depends on $\xi$ as in (71) and $\rho(x)$ is the global level density (69).

– Local correlations:
$$L_\beta^{(n)}(z_o; \boldsymbol{\xi}_n) = \lim_{R\to\infty} (R\rho(z_o))^{-n} I_{R\beta}^{(n)}(\mathbf{x}_n) = \operatorname{DET}\big(\bar K_\beta(\xi_j, \xi_k)\big)_{n\times n}, \tag{73}$$
where $\mathbf{x}_n$ and $\boldsymbol{\xi}_n$ are related by (71) (recall that DET stands for the usual or the quaternion determinant in the cases of $\beta = 2$ and $\beta = 1, 4$, respectively). In the case $\beta = 2$, $\bar K_2$ is the scalar Sine Kernel
$$\bar K_2(\xi, \eta) = \begin{cases} \dfrac{\sin \pi(\xi - \eta)}{\pi(\xi - \eta)}, & \xi \ne \eta; \\[1ex] \bar\rho(\xi) = 1, & \xi = \eta. \end{cases} \tag{74}$$
In the case $\beta = 4$ the matrix Sine Kernel $\bar K_4$ is given by
$$\bar K_4(\xi, \eta) = \begin{pmatrix} \bar S_4(\xi, \eta) & \bar I_4(\xi, \eta) \\ \bar D_4(\xi, \eta) & \bar S_4^T(\xi, \eta) \end{pmatrix}, \tag{75}$$
where
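$$\bar S_4(\xi, \eta) = \bar K_2(2\xi, 2\eta), \quad \bar I_4(\xi, \eta) = -\int_\xi^\eta \bar S_4(\xi, t)\,dt, \quad \bar D_4(\xi, \eta) = \partial_\xi \bar S_4(\xi, \eta), \quad \bar S_4^T(\xi, \eta) = \bar S_4(\eta, \xi). \tag{76}$$
In the case $\beta = 1$ the matrix Sine Kernel $\bar K_1$ is given by
$$\bar K_1(\xi, \eta) = \begin{pmatrix} \bar S_1(\xi, \eta) & \bar I_1(\xi, \eta) - \epsilon(\xi - \eta) \\ \bar D_1(\xi, \eta) & \bar S_1^T(\xi, \eta) \end{pmatrix}, \tag{77}$$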
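where
$$\bar S_1(\xi, \eta) = \bar K_2(\xi, \eta), \quad \bar I_1(\xi, \eta) = -\int_\xi^\eta \bar S_1(\xi, t)\,dt, \quad \bar D_1(\xi, \eta) = \partial_\xi \bar S_1(\xi, \eta), \quad \bar S_1^T(\xi, \eta) = \bar S_1(\eta, \xi). \tag{78}$$

• Hard edge $z_o = +1$ ($\alpha_o = 0$).

– Central point level density. For $\xi > 0$:
$$\lim_{R\to\infty} \left(\frac{R}{\pi}\right)^{-1} I_{R\beta}^{(1)}(x) = \hat\rho_\beta(\xi) \tag{79}$$
(where $x$ depends on $\xi$ by (71)) is given by:
$$\hat\rho_2^{(a)}(\xi) = \frac{\pi}{2}\,(\pi\xi)\left[J_a(\pi\xi)^2 - J_{a-1}(\pi\xi)\,J_{a+1}(\pi\xi)\right], \tag{80}$$
$$\hat\rho_1^{(a)}(\xi) = \hat\rho_2^{(2a+1)}(\xi) + \frac{\pi}{2}\,J_{2a+1}(\pi\xi) \int_{\pi\xi}^{\infty} J_{2a+1}, \tag{81}$$
$$\hat\rho_4^{(a)}(\xi) = \hat\rho_2^{(a)}(2\xi) - \frac{\pi}{2}\,J_{a-1}(2\pi\xi) \int_0^{2\pi\xi} J_{a+1}. \tag{82}$$

– Local correlations. For $\boldsymbol{\xi}_n > 0$:
$$L_\beta^{(n)}(+1; \boldsymbol{\xi}_n) = \lim_{R\to\infty} \left(\frac{R}{\pi}\right)^{-n} I_{R\beta}^{(n)}(\mathbf{x}_n) = \operatorname{DET}\big(\hat K_\beta(\xi_j, \xi_k)\big)_{n\times n}, \tag{83}$$
with $\mathbf{x}_n$ related to $\boldsymbol{\xi}_n$ by (71). The scalar "Bessel Kernel" $\hat K_2 = \hat K_2^{(a)}$ is given by
$$\hat K_2^{(a)}(\xi, \eta) = \begin{cases} \dfrac{\sqrt{\xi\eta}}{\xi^2 - \eta^2}\left[\pi\xi\,J_{a+1}(\pi\xi)\,J_a(\pi\eta) - J_a(\pi\xi)\,\pi\eta\,J_{a+1}(\pi\eta)\right], & \xi \ne \eta; \\[1ex] \hat\rho_2^{(a)}(\xi), & \xi = \eta. \end{cases} \tag{84}$$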
Fig. 1. Graphs of $\hat\rho_2^{(a)}(\xi)$ for a = −1/2 (the "even" Sine Kernel, solid), a = +1/2 (the "odd" Sine Kernel, dotted), and a = 0 (the Legendre Kernel, dashed)
For $\beta = 1, 4$ the matrix Bessel Kernels are given by the same expressions of (75)–(78), except that the bars are to be replaced by hats and $\hat S_1 = \hat S_1^{(a)}$, $\hat S_4 = \hat S_4^{(a)}$ are given by
$$\hat S_1^{(a)}(\xi, \eta) = \sqrt{\frac{\xi}{\eta}}\,\hat K_2^{(2a+1)}(\xi, \eta) + \frac{\pi}{2}\,J_{2a+1}(\pi\eta) \int_{\pi\xi}^{\infty} J_{2a+1}(t)\,dt, \tag{85}$$
$$\hat S_4^{(a)}(\xi, \eta) = \sqrt{\frac{\xi}{\eta}}\,\hat K_2^{(a-1)}(2\xi, 2\eta) - \frac{\pi}{2}\,J_{a-1}(2\pi\eta) \int_0^{2\pi\xi} J_{a-1}(t)\,dt, \tag{86}$$
where the $J_\nu$ are the Bessel functions of the first kind. The next section will be devoted to the proof of this theorem.

Remark. The local limits at the edge $z_o = -1$ are given by the same formulae replacing the parameter $a$ by $b$.

Remark. In connection with the hard edge correlations for the classical orthogonal and symplectic groups (Table 3), we remark that the unitary Bessel kernel (84), in the case $a = +1/2$ (resp., $a = -1/2$), coincides with the "odd" (resp., "even") Sine Kernel [17]:
$$\hat K_2^{(\pm 1/2)}(\xi, \eta) = \frac{\sin \pi(\xi - \eta)}{\pi(\xi - \eta)} \mp \frac{\sin \pi(\xi + \eta)}{\pi(\xi + \eta)}. \tag{87}$$

Remark. The integral in (86) diverges for $-1 < a \le 0$. However, in the next section we prove alternative versions of Eqs. (85) and (86). Refer to Eqs. (154), (155), (170) (which converges for all $a > -1$), and (171).
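The coincidence of the Bessel kernel (84) at $a = \pm 1/2$ with the odd and even Sine Kernels (87) can be checked numerically. The sketch below is only an illustration: it evaluates the Bessel functions from the power series (100) truncated to a fixed number of terms (an assumption that is adequate for moderate arguments), and the function names are hypothetical.

```python
import math


def bessel_j(nu, z, terms=60):
    """Bessel function J_nu(z), z > 0, from the truncated power series (100)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (z / 2.0) ** (2 * k) / (
            math.factorial(k) * math.gamma(nu + k + 1))
    return (z / 2.0) ** nu * total


def bessel_kernel(a, xi, eta):
    """Scalar Bessel kernel of Eq. (84) for xi != eta."""
    px, pe = math.pi * xi, math.pi * eta
    bracket = (px * bessel_j(a + 1, px) * bessel_j(a, pe)
               - bessel_j(a, px) * pe * bessel_j(a + 1, pe))
    return math.sqrt(xi * eta) / (xi ** 2 - eta ** 2) * bracket


def sine_kernel_pair(sign, xi, eta):
    """sin(pi(xi-eta))/(pi(xi-eta)) + sign * sin(pi(xi+eta))/(pi(xi+eta))."""
    diff, summ = math.pi * (xi - eta), math.pi * (xi + eta)
    return math.sin(diff) / diff + sign * math.sin(summ) / summ


if __name__ == "__main__":
    xi, eta = 1.3, 0.4
    # a = +1/2 should match the "odd" kernel (minus sign in Eq. (87)).
    print(bessel_kernel(0.5, xi, eta), sine_kernel_pair(-1.0, xi, eta))
    # a = -1/2 should match the "even" kernel (plus sign in Eq. (87)).
    print(bessel_kernel(-0.5, xi, eta), sine_kernel_pair(+1.0, xi, eta))
```

Each printed pair should agree to within rounding error, reflecting the closed forms of $J_{\pm 1/2}$ and $J_{3/2}$ in terms of sines and cosines.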
Fig. 2. Graphs of $\hat\rho_1^{(a)}(\xi)$ for a = −1/2 (solid), a = +1/2 (dotted), and a = 0 (dashed)

Fig. 3. Graphs of $\hat\rho_4^{(a)}(\xi)$ for a = 0 (solid), a = 1 (dotted), and a = 2 (dashed)
5. Proofs

In this section we prove Theorems 3 and 4. First we remark that the unitary case has been studied in the work of Nagao and Wadati [24, 25], but we reproduce the proofs here for completeness and also to show that the hypothesis $a > -1$ is, in a certain sense, unnecessary. Also we remark that Forrester and Nagao [23] have studied the hard edge correlations directly, using skew-orthogonal polynomial expressions for the matrix kernels $K_{R1}$, $K_{R4}$, but their results apply only when the parameters $a, b$ are strictly positive, and in view of the application to symmetric spaces this restriction is unacceptable (see Table 1). Also, their somewhat more complicated formulas for the limiting quantities $\hat S_1$, $\hat S_4$ are given in terms of iterated integrals of Bessel functions. Here we take advantage of the more recent work of Adler et al. which provides simple "summation formulas" for the quantities $S_{R1}$, $S_{R4}$.

5.1. Some preliminary results and formulas. The various results we quote on Jacobi polynomials can be found in Szegő's book [28] and in his article on asymptotic properties of Jacobi polynomials [27] (reproduced in his collected papers [29]). Stirling's formula and the Bessel function identities can be found, for instance, in the tables of Gradshteyn and Ryzhik [16]. We denote by $P_N^{(A,B)}(x)$ the classical Jacobi polynomials defined by
$$(1-x)^A (1+x)^B\,P_N^{(A,B)}(x) = \frac{(-1)^N}{2^N N!}\,\frac{d^N}{dx^N}\left[(1-x)^{N+A}(1+x)^{N+B}\right]. \tag{88}$$
When $A, B > -1$, these polynomials are orthogonal on $[-1, 1]$ with respect to the weight
$$w(x) = |1-x|^A\,|1+x|^B, \tag{89}$$
but they are not normalized. However, the formula (88) is meaningful for arbitrary (real or complex) values of the parameters $A, B$, and defines a polynomial in $A, B, x$ of degree (at most) $N$ in $x$. In fact
$$P_N^{(A,B)}(x) = \sum_{k=0}^{N} \binom{A+N}{k}\binom{B+N}{N-k}\left(\frac{x-1}{2}\right)^{N-k}\left(\frac{x+1}{2}\right)^{k}. \tag{90}$$
In particular
$$P_N^{(A,B)}(+1) = \binom{A+N}{N}. \tag{91}$$
The derivative of a Jacobi polynomial is related to another Jacobi polynomial by the identity (the apostrophe denotes differentiation with respect to $x$)
$$P_N^{(A,B)\prime}(x) = \frac{1}{2}\,(N + A + B + 1)\,P_{N-1}^{(A+1,B+1)}(x). \tag{92}$$

Proposition 4 (Darboux's formula). (With an improved error term due to Szegő [27].) For arbitrary reals $A, B$,
$$P_N^{(A,B)}(\cos\theta) = (\pi N)^{-1/2}\left(\sin\frac{\theta}{2}\right)^{-A-1/2}\left(\cos\frac{\theta}{2}\right)^{-B-1/2}\cos(\tilde N\theta + \gamma) + E, \qquad \tilde N = N + \frac{A+B+1}{2}, \quad \gamma = -\left(A + \frac{1}{2}\right)\frac{\pi}{2}, \tag{93}$$
for $0 < \theta < \pi$, where the error term $E$ satisfies
$$E = \theta^{-A-3/2}\,O(N^{-3/2}), \quad \text{uniformly for } c/N \le \theta \le \pi - \epsilon, \tag{94}$$
for any positive constants $c, \epsilon$, and the constant implied by the $O$ symbol depends only on $c, \epsilon, A, B$.
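The explicit sum (90), the value at $+1$ in (91) and the derivative identity (92) lend themselves to a quick numerical sanity check for small degree. The following sketch assumes generalized binomial coefficients computed through the Gamma function (valid as long as no Gamma argument is a non-positive integer) and uses a central finite difference for the derivative; the helper names are hypothetical.

```python
import math


def gen_binom(z, k):
    """Generalized binomial coefficient (z choose k), real z, integer k >= 0."""
    return math.gamma(z + 1) / (math.gamma(k + 1) * math.gamma(z - k + 1))


def jacobi(n, a, b, x):
    """Jacobi polynomial P_n^{(a,b)}(x) from the explicit sum (90)."""
    return sum(gen_binom(a + n, k) * gen_binom(b + n, n - k)
               * ((x - 1) / 2) ** (n - k) * ((x + 1) / 2) ** k
               for k in range(n + 1))


def jacobi_derivative_fd(n, a, b, x, h=1e-5):
    """Central finite-difference approximation of d/dx P_n^{(a,b)}(x)."""
    return (jacobi(n, a, b, x + h) - jacobi(n, a, b, x - h)) / (2 * h)


if __name__ == "__main__":
    n, a, b, x = 5, 0.3, -0.4, 0.25
    # Eq. (91): value at x = +1.
    print(jacobi(n, a, b, 1.0), gen_binom(a + n, n))
    # Eq. (92): derivative versus the shifted Jacobi polynomial.
    print(jacobi_derivative_fd(n, a, b, x),
          0.5 * (n + a + b + 1) * jacobi(n - 1, a + 1, b + 1, x))
```

The explicit sum is numerically reliable only for small $N$; for large degree the alternating terms cancel badly, which is one practical reason to rely on asymptotic formulas such as (93).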
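Proposition 5 (Hilb's formula). (As generalized by Szegő to Jacobi polynomials [28].) For $A > -1$ and any real $B$:
$$\left(\sin\frac{\theta}{2}\right)^{A}\left(\cos\frac{\theta}{2}\right)^{B} P_N^{(A,B)}(\cos\theta) = \tilde N^{-A}\,\frac{\Gamma(N + A + 1)}{N!}\,\sqrt{\frac{\theta}{\sin\theta}}\,J_A(\tilde N\theta) + E, \tag{95}$$
where $\tilde N = N + (A+B+1)/2$ as in (93) and the error term $E$ is given by
$$E = \begin{cases} \theta^{1/2}\,O(N^{-3/2}) & \text{if } c/N \le \theta \le \pi - \epsilon, \\ \theta^{A+2}\,O(N^{A}) & \text{if } 0 < \theta \le c/N, \end{cases} \tag{96}$$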
where c, are arbitrary but fixed positive constants, and the constants implied by the O symbol depend on A, B, c, only. The restriction to A > −1, however, is too strong for some purposes, and we will need the following formula, also due to Szeg¨o [27] (reproduced in [29]): θ tan(θ/2) θ −B θ −A (A,B) PN cos 1− (cos θ ) = sin 2 2 sin θ 2θ (97) × JA (N θ ) + R, with N as in (93). Here A, B are arbitrary reals. The error term R satisfies: 1 θ 2 −A O(N −3/2 ) if c/N ≤ θ ≤ π − , R= O(N A−2 ) if 0 < θ ≤ c/N,
(98)
where $c, \epsilon$ are fixed positive numbers, and the constants implied by the $O$ symbol depend only on $A, B, c, \epsilon$. It must be noted, however, that the error term $R$ of (98) does not depend on $\theta$ on the range $0 < \theta < c/N$, which makes this formula less useful than (95) with the error term (96) for $\theta$ in this range.

Recall Stirling's asymptotic formula for the Gamma function:
$$\log\Gamma(x) = \left(x - \frac{1}{2}\right)\log x - x + \frac{1}{2}\log 2\pi + O(x^{-1}), \quad \text{as } x \to \infty. \tag{99}$$
The Bessel functions of the first kind are defined by the series
$$J_\nu(z) = \left(\frac{z}{2}\right)^{\nu} \sum_{k=0}^{\infty} (-1)^k\,\frac{z^{2k}}{2^{2k}\,k!\,\Gamma(\nu + k + 1)}, \quad z \in \mathbb{C}\setminus(-\infty, 0],\ \nu \in \mathbb{R}; \tag{100}$$
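they satisfy, among many others, the relations:
$$J_\nu'(z) = J_{\nu-1}(z) - \frac{\nu}{z}\,J_\nu(z), \tag{101}$$
$$J_\nu'(z) = -J_{\nu+1}(z) + \frac{\nu}{z}\,J_\nu(z), \tag{102}$$
$$J_\nu'(z) = \frac{1}{2}\left[J_{\nu-1}(z) - J_{\nu+1}(z)\right], \tag{103}$$
$$J_{\nu+1}(z) = \frac{2\nu}{z}\,J_\nu(z) - J_{\nu-1}(z), \tag{104}$$
$$\frac{d}{dz}\left[z^{\nu} J_\nu(z)\right] = z^{\nu} J_{\nu-1}(z), \tag{105}$$
$$\frac{d}{dz}\left[z^{-\nu} J_\nu(z)\right] = -z^{-\nu} J_{\nu+1}(z). \tag{106}$$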
We also have
$$\int J_\nu = 2\sum_{k=0}^{\infty} J_{\nu+2k+1}, \tag{107}$$
$$\int_0^{\infty} J_\nu(t)\,dt = 1 \quad \text{for } \nu > -1. \tag{108}$$
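The Bessel function relations above can be spot-checked numerically. The sketch below is an illustration under stated assumptions: Bessel functions are evaluated from the truncated series (100), the recurrence (104) is tested directly, and (107) is read as the definite integral from 0 to $z$ (for $\nu > -1$), approximated by a composite Simpson rule. All function names are hypothetical.

```python
import math


def bessel_j(nu, z, terms=40):
    """Bessel function J_nu(z), z >= 0, from the truncated power series (100)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (z / 2.0) ** (2 * k) / (
            math.factorial(k) * math.gamma(nu + k + 1))
    return (z / 2.0) ** nu * total


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi] with an even number of steps."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def recurrence_gap(nu, z):
    """Difference between the two sides of the recurrence (104)."""
    return bessel_j(nu + 1, z) - (2 * nu / z * bessel_j(nu, z) - bessel_j(nu - 1, z))


def integral_series(nu, z, terms=30):
    """Right-hand side of (107): 2 * sum_k J_{nu+2k+1}(z), truncated."""
    return 2 * sum(bessel_j(nu + 2 * k + 1, z) for k in range(terms))


if __name__ == "__main__":
    nu, z = 1.5, 3.0
    print("recurrence gap:", recurrence_gap(nu, z))
    print("integral:", simpson(lambda t: bessel_j(nu, t), 0.0, z))
    print("series:  ", integral_series(nu, z))
```

Taking differences of (107) at orders $\nu$ and $\nu + 2$ recovers $\int_0^z J_\nu - \int_0^z J_{\nu+2} = 2J_{\nu+1}(z)$, which is the form in which the identity is applied later in the text.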
5.2. Asymptotics of the unitary Jacobi kernel. In this section we recall the proofs of some of the results of Nagao and Wadati [24], which will be needed later on in the analysis of the orthogonal and symplectic cases. Using the Christoffel-Darboux summation formula [28], the scalar kernel $K_{N2}^{(A,B)}$ can be written in the form
$$K_{N2}^{(A,B)}(x, y) = \frac{2^{-A-B}}{2N + A + B}\,\frac{\Gamma(N+1)\,\Gamma(N+A+B+1)}{\Gamma(N+A)\,\Gamma(N+B)}\,\sqrt{w(x)w(y)}\;\frac{P_N^{(A,B)}(x)\,P_{N-1}^{(A,B)}(y) - P_{N-1}^{(A,B)}(x)\,P_N^{(A,B)}(y)}{x - y}, \tag{109}$$
for $x \ne y$, and
$$K_{N2}^{(A,B)}(x, x) = \frac{2^{-A-B}}{2N + A + B}\,\frac{\Gamma(N+1)\,\Gamma(N+A+B+1)}{\Gamma(N+A)\,\Gamma(N+B)}\,w(x)\left[P_N^{(A,B)\prime}(x)\,P_{N-1}^{(A,B)}(x) - P_{N-1}^{(A,B)\prime}(x)\,P_N^{(A,B)}(x)\right]. \tag{110}$$
We observe that the kernel $K_{N2}$ given by (109) and (110) is well-defined for $A, B > -c$ for any real constant $c$ provided $N$ is sufficiently large. First consider the global level density
$$\rho(x) = \lim_{N\to\infty} N^{-1} K(x, x). \tag{111}$$
Using Darboux's formula (93) together with the identity (92) in the expression (110) for the kernel, we find:
$$K_{N2}^{(a,b)}(x, x) = \frac{N}{\pi\sqrt{1 - x^2}} + O(1), \tag{112}$$
where the implied constant depends only on $\epsilon$ for $-1 + \epsilon \le x \le 1 - \epsilon$. Equation (112) proves (69) (in the unitary case).

A density function $D = D(x_1, \dots, x_n)$ defines a measure $D\,dx_1 \cdots dx_n$. Under a (monotonically increasing or decreasing) differentiable change of variables $x_j = X(u_j)$, this density is transformed into the density
$$\mathcal{D}(u_1, \dots, u_n) = \prod_{j=1}^{n} |X'(u_j)|\; D(X(u_1), \dots, X(u_n)). \tag{113}$$
If the density $D$ is given as a determinant with a (scalar) kernel $K(x, y)$, namely $D = \det(K(x_j, x_k))_{n\times n}$, then the change of variables reflects itself in the kernel in the following fashion:
Lemma 1. After the (monotonic) differentiable change of variables $u \mapsto x = X(u)$, the correlation functions are given as the determinant (59) defined using the kernel
$$\mathcal{K}(u, v) = \sqrt{|X'(u)\,X'(v)|}\;K(X(u), X(v)). \tag{114}$$
This is clear since the introduction of the factor $\sqrt{|X'(u)X'(v)|}$ results in multiplying the determinant (59) by $\prod_{j=1}^{n} |X'(u_j)|$.

The localization at some $-1 < z_o = \cos\alpha_o < 1$ given by the change of variables (71) leads us to consider the limit
$$\bar K_2^{(a,b)}(\xi, \eta) = \lim_{N\to\infty} \left[N\sqrt{\rho(x)\rho(y)}\right]^{-1} K_{N2}^{(a,b)}(x, y) = \lim_{N\to\infty} (N\rho(z_o))^{-1} K_{N2}^{(a,b)}(x, y), \tag{115}$$
with $x, y$ related to $\xi, \eta$ by (71), which from Darboux's formula (93) can be easily seen to be the Sine Kernel (74), independently of the value of $z_o$ (as long as $-1 < z_o < 1$), for any real $a, b$, and the limit is attained uniformly on compacta.

For the localization at $z_o = +1$ ($\alpha_o = 0$) (localization at $z_o = -1$ is analogous provided $a$ and $b$ are interchanged), we use the same change of variables (71) with $\boldsymbol{\xi}_n > 0$. To compute the limit
$$\hat K_2^{(a,b)}(\xi, \eta) = \lim_{N\to\infty} \left[N\sqrt{\rho(x)\rho(y)}\right]^{-1} K_{N2}^{(a,b)}(x, y) = \lim_{N\to\infty} (N\rho(z_o))^{-1} K_{N2}^{(a,b)}(x, y), \tag{116}$$
we use Szegő's formulas (95), (97), in conjunction with (109) and (110):
$$\hat K_2^{(a)}(\xi, \eta) = \frac{\sqrt{\xi\eta}}{\xi^2 - \eta^2}\left[J_a(\pi\xi)\,\pi\eta\,J_a'(\pi\eta) - \pi\xi\,J_a'(\pi\xi)\,J_a(\pi\eta)\right]. \tag{117}$$
Using the differentiation formula (102) we rewrite this kernel in the form (84). For the case $\xi = \eta$ we start with the expression (110) and use the differentiation formula (92) to find:
$$\hat\rho_2^{(a)}(\xi) = \hat K_2^{(a)}(\xi, \xi) = \frac{\pi}{2}\left[J_a(\pi\xi)\,J_{a+1}(\pi\xi) + \pi\xi\,J_{a+1}'(\pi\xi)\,J_a(\pi\xi) - \pi\xi\,J_a'(\pi\xi)\,J_{a+1}(\pi\xi)\right]. \tag{118}$$
Applying the differentiation formula (101) and the recurrence formula (104) this can be rewritten in the form (80).
5.3. Asymptotics of the orthogonal Jacobi kernel. We start with some general remarks. If a density $P = P(x_1, \dots, x_n)$ is given as a quaternion determinant with a self-dual matrix kernel $K(y, x) = K(x, y)^D$, namely $P = \operatorname{qdet}(K(x_j, x_k))_{1}^{n}$, then under a differentiable change of variables $x_j = X(u_j)$ the density is still given as a quaternion determinant.

Lemma 2. After a (monotonic) differentiable change of variables $u \mapsto x = X(u)$, a density function
$$P(x_1, \dots, x_n) = \operatorname{qdet}(K(x_j, x_k)) \tag{119}$$
defined in terms of some self-dual matrix kernel ($\delta = 0, 1$)
$$K(x, y) = \begin{pmatrix} S(x, y) & I(x, y) - \delta\,\epsilon(x - y) \\ D(x, y) & S^T(x, y) \end{pmatrix} \tag{120}$$
with
$$I(x, y) = -\int_x^y S(x, z)\,dz, \tag{121}$$
$$D(x, y) = \partial_x S(x, y), \tag{122}$$
$$S^T(x, y) = S(y, x) \tag{123}$$
is transformed into the density
$$\mathcal{P}(u_1, \dots, u_n) = \operatorname{qdet}(\mathcal{K}(u_j, u_k)), \tag{124}$$
where
$$\mathcal{K}(u, v) = \begin{pmatrix} \mathcal{S}(u, v) & \mathcal{I}(u, v) - \delta\,\epsilon(u - v) \\ \mathcal{D}(u, v) & \mathcal{S}^T(u, v) \end{pmatrix}, \tag{125}$$
$$\mathcal{S}(u, v) = S(X(u), X(v))\,|X'(v)| = \pm S(X(u), X(v))\,X'(v), \tag{126}$$
$$\mathcal{I}(u, v) = -\int_u^v \mathcal{S}(u, w)\,dw, \tag{127}$$
$$\mathcal{D}(u, v) = \partial_u \mathcal{S}(u, v), \tag{128}$$
$$\mathcal{S}^T(u, v) = \mathcal{S}(v, u). \tag{129}$$
For the proof, we need first:

Lemma 3. Let $H = H^D = J_n H^T J_n^T$ be a $2n \times 2n$ self-dual complex matrix. Let $k_j$, $j = 1, 2, \dots, n$ be arbitrary complex constants. Set $K = \operatorname{diag}(k_1, \dots, k_n)$. Then the matrices
$$H_1 = \operatorname{diag}(I, K)\,H\,\operatorname{diag}(K, I), \qquad H_2 = \operatorname{diag}(-I, K)\,H\,\operatorname{diag}(-K, I) \tag{130}$$
(where $I = I_n$ is the $n \times n$ identity matrix) are both self-dual, and
$$\operatorname{qdet}(H_1) = \det(K)\,\operatorname{qdet}(H) = \operatorname{qdet}(H_2). \tag{131}$$
The verification that $H_1$ and $H_2$ are self-dual is trivial. On the other hand, since $(\operatorname{qdet} X)^2 = \det X$ for any self-dual matrix $X$, we have that
$$(\operatorname{qdet}(H_1))^2 = (\operatorname{qdet}(H_2))^2 = (\det(K))^2 \det(H) = (\det(K))^2 (\operatorname{qdet}(H))^2. \tag{132}$$
Hence Eq. (131), which is an equality between polynomials in the entries of the matrices involved, must hold up to a sign. Setting $K = I_n$ we see that the first equality in (131)
holds, and setting $K = -I_n$, so $H_2 = -H$, the validity of the second equality in (131) is equivalent to the easy fact that $\operatorname{qdet}(-H) = (-1)^n \operatorname{qdet} H = \det(-I_n)\,\operatorname{qdet} H$.

Proceeding to the proof of Lemma 2, we first observe that, after the change of variables $u \mapsto x$, the density $P(x_1, \dots, x_n)$ transforms into the density
$$\mathcal{P}(u_1, \dots, u_n) = P(X(u_1), \dots, X(u_n)) \prod_{j=1}^{n} |X'(u_j)|. \tag{133}$$
We apply Lemma 3 with $H = (K(X(u_j), X(u_k)))_{n\times n}$ and $k_j = |X'(u_j)|$ to conclude that (124) holds with either of the two kernels (we write $X(u, v)$ for $(X(u), X(v))$)
$$\mathcal{K}_\pm(u, v) = \begin{pmatrix} S(X(u, v))\,|X'(v)| & \pm(I - \delta\epsilon)(X(u, v)) \\ \pm D(X(u, v))\,|X'(u)|\,|X'(v)| & S^T(X(u, v))\,|X'(u)| \end{pmatrix}. \tag{134}$$
The plus and minus signs correspond to applying the first and second of the equalities in (131), respectively. If $x \mapsto u$ preserves orientation, then we observe that $\epsilon(X(u) - X(v)) = \epsilon(u - v)$ and conclude by a simple application of the chain rule and a change of variables in the integral that the kernel $\mathcal{K}_+$ coincides with $\mathcal{K}$ from (125) for the choices (126)–(129). If $x \mapsto u$ reverses orientation, we choose the minus signs, observe that $\epsilon(X(u) - X(v)) = -\epsilon(u - v)$ and proceed exactly as before to see that $\mathcal{K}_-$ coincides with (125) in this case.

Lemma 2 explains the relations (78) between the entries of the limiting kernels $\bar K_\beta$ and also of $\hat K_\beta$ ($\beta = 1, 4$). The relations certainly hold when $R$ is finite after applying the change of variables (71) to the matrix kernel $K_{R\beta}$ so as to obtain another kernel $\mathcal{K}_{R\beta}$. They can be shown to continue to hold in the limit either by noting that the sequence of scalar kernels $\{S_{R\beta}(\xi, \eta)\}_{R=0}^{\infty}$ is a normal sequence of analytic functions (i.e., it converges uniformly on compacta), or by direct verification that each of the sequences $\{S_{R\beta}\}$, $\{I_{R\beta}\}$, $\{K_{R\beta}\}$, $\{S_{R\beta}^T\}$ converges to the correct limit as $R \to \infty$. In what follows we will only consider the limit of the quantity $S_{R\beta}$ which alone determines the matrix kernel $K_{R\beta}$.

Let $A = 2a + 1$, $B = 2b + 1$, where $a, b$ are the parameters of the orthogonal Jacobi ensemble. Assume also that $R$ is even. Observe that $A, B > -1$ if $a, b > -1$. The summation formula of Adler et al. [1] expresses the orthogonal kernel $S_{R1}^{(a,b)}$ using the unitary kernel $K_{R-1,2}^{(A,B)}$ and another term. As we shall see, this other term is negligible in the localized limit (in the bulk of the spectrum), but it does contribute to the edge limit.
The summation formula for the quantity $S_{R1}^{(a,b)}(x, y)$ of (60) is as follows [1]:
$$S_{R1}^{(a,b)}(x, y) = \sqrt{\frac{1 - x^2}{1 - y^2}}\;K_{R-1,2}^{(A,B)}(x, y) + c_{R-2}\,\psi_{R-1}(y)\,\epsilon\psi_{R-2}(x). \tag{135}$$
Here $\epsilon$ denotes the integral operator (cf. Eq. (61))
$$(\epsilon f)(x) = \int_{-1}^{1} \epsilon(x - y)\,f(y)\,dy, \tag{136}$$
and we have set
$$\psi_N(t) = \psi_N^{(A,B)}(t) = (1 - t)^{(A-1)/2}\,(1 + t)^{(B-1)/2}\,P_N^{(A,B)}(t) \tag{137}$$
and
$$c_N = 2^{-A-B-1}\,\frac{\Gamma(N + 2)\,\Gamma(N + A + B + 2)}{\Gamma(N + A + 1)\,\Gamma(N + B + 1)}. \tag{138}$$
The quantity $S_{R1}^{(a,b)}$ determines the entries of the matrix kernel $K_{R1}^{(a,b)}$ as per Eqs. (62)–(64). From Stirling's formula (99), the asymptotic behavior of the coefficient $c_N$ is
$$c_N \sim 2^{-A-B-1} N^2, \quad \text{as } N \to \infty. \tag{139}$$
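The asymptotic behavior (139) of the coefficient (138) is easy to confirm numerically. The following is a minimal sketch that works with logarithms of Gamma functions to avoid overflow; the parameter values and function names are chosen arbitrarily for illustration.

```python
import math


def log_c(n, a, b):
    """Logarithm of the coefficient c_N of Eq. (138), via log-Gamma."""
    return (-(a + b + 1) * math.log(2)
            + math.lgamma(n + 2) + math.lgamma(n + a + b + 2)
            - math.lgamma(n + a + 1) - math.lgamma(n + b + 1))


def ratio_to_asymptote(n, a, b):
    """c_N divided by the asymptote 2^(-A-B-1) * N^2 of Eq. (139)."""
    log_asymptote = -(a + b + 1) * math.log(2) + 2 * math.log(n)
    return math.exp(log_c(n, a, b) - log_asymptote)


if __name__ == "__main__":
    for n in (10, 100, 10_000, 1_000_000):
        print(n, ratio_to_asymptote(n, 0.5, -0.3))
```

The printed ratio should approach 1 with a correction of order $1/N$, consistent with the $O(x^{-1})$ error term in Stirling's formula (99).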
Lemma 4. For any real $A, B$:
$$\lim_{N\to\infty} \psi_N^{(A,B)}(\cos\phi) = 0 \tag{140}$$
for $0 < \phi < \pi$, uniformly on compacta.

This follows immediately from Darboux's formula (93). This lemma is, however, insufficient to understand the asymptotics of the function $\psi_N$ as $N \to \infty$ since it says nothing about the behavior of $\psi_N$ near the edge. First we note:

Lemma 5. For $A > -1$ and $B$ arbitrary:
$$\lim_{N\to\infty} N^{-1}\,\psi_N^{(A,B)}(\cos(\phi/N)) = 2^{\frac{A+B}{2}}\,\frac{J_A(\phi)}{\phi}, \tag{141}$$
$$\lim_{N\to\infty} \psi_N^{(A,B)}(\cos(\phi/N))\,\sin(\phi/N) = 2^{\frac{A+B}{2}}\,J_A(\phi). \tag{142}$$
The limits hold uniformly on compact subsets of $(0, \infty)$. These follow from Szegő's formula (95).

Lemma 6. For $A, B$ real with $A > -1$ and any $0 < \theta < \pi$ we have:
$$\lim_{N\to\infty} N \int_0^{\theta} \psi_N^{(A,B)}(\cos\phi)\,\sin\phi\,d\phi = 2^{\frac{A+B}{2}}, \tag{143}$$
$$\lim_{N\to\infty} N \int_0^{\theta/N} \psi_N^{(A,B)}(\cos\phi)\,\sin\phi\,d\phi = 2^{\frac{A+B}{2}} \int_0^{\theta} J_A. \tag{144}$$
These follow again from Szegő's formula (95) and Eq. (108). When $-1 < A < 0$, the dependence on $\theta$ of the second of the error terms in (96) is critical to ensure that the contribution of this error term to the integral is negligible (in particular, this lemma cannot be proven using the alternate formula (98) unless $A > 0$).

Corollary 1. For $-1 < A, B$ and $0 < \theta < \pi$:
$$\lim_{N\to\infty} N\,(\epsilon\psi_N^{(A,B)})(\cos\theta) = 0, \tag{145}$$
$$\lim_{N\to\infty} N\,(\epsilon\psi_N^{(A,B)})(\cos(\theta/N)) = 2^{\frac{A+B}{2}}\left[1 - \int_0^{\theta} J_A\right] = 2^{\frac{A+B}{2}} \int_{\theta}^{\infty} J_A. \tag{146}$$
This follows from the previous lemma applied to both $\psi_N^{(A,B)}$ and $\psi_N^{(B,A)}$. We also used (108) to obtain the last equality.

We localize at some $z_o = \cos\alpha_o \in (-1, 1)$ using the change of variable $x \mapsto \xi$ of (71). The limit to consider is
$$\bar S_1^{(a,b)}(\xi, \eta) = \lim_{R\to\infty} (R\rho(y))^{-1} S_{R1}^{(a,b)}(x, y) = \lim_{R\to\infty} (R\rho(z_o))^{-1} S_{R1}^{(a,b)}(x, y). \tag{147}$$
By the lemmas above, the second term on the right-hand side of (135) is negligible in the limit. Also, the factor $\sqrt{(1 - x^2)/(1 - y^2)}$ is 1 in the limit. Thus, the limit (147) is equal to the limiting unitary kernel, namely the Sine Kernel, whence the expression (78).

As for the central point, let us now localize at $z = +1$. Using the summation formula (135), Lemma 5 and Corollary 1, we readily find:
$$\hat S_1^{(a)}(\xi, \eta) = \sqrt{\frac{\xi}{\eta}}\,\hat K_2^{(2a+1)}(\xi, \eta) + \frac{\pi}{2}\,J_{2a+1}(\pi\eta)\left[1 - \int_0^{\pi\xi} J_{2a+1}(t)\,dt\right] = \sqrt{\frac{\xi}{\eta}}\,\hat K_2^{(2a+1)}(\xi, \eta) + \frac{\pi}{2}\,J_{2a+1}(\pi\eta) \int_{\pi\xi}^{\infty} J_{2a+1}(t)\,dt. \tag{148}$$
As we remarked already, the conditions $a > -1$ and $A > -1$ are equivalent since $A = 2a + 1$. Thus we have derived a weak universality law for the local correlations at the central points $\pm 1$ for any $a, b > -1$.

Lemma 7. Let $\kappa_\alpha(x, y) = x\,J_{\alpha+1/2}(x)\,J_{\alpha-1/2}(y) - J_{\alpha-1/2}(x)\,y\,J_{\alpha+1/2}(y)$. Then
$$\sqrt{\frac{x}{y}}\,\kappa_{\alpha\pm1/2}(x, y) - \sqrt{\frac{y}{x}}\,\kappa_{\alpha\mp1/2}(x, y) = \mp\frac{x^2 - y^2}{\sqrt{xy}}\,J_{\alpha-1/2\mp1/2}(x)\,J_{\alpha-1/2\pm1/2}(y). \tag{149}$$
(This equation stands for two different equations, one with the top signs and another with the bottom signs.)

We prove the equation with the choice of the top signs (the other case is analogous). Indeed, expanding the left-hand side we obtain:
$$x^{3/2} J_{\alpha+1}(x)\,y^{-1/2} J_\alpha(y) - x^{1/2} J_\alpha(x)\,y^{1/2} J_{\alpha+1}(y) - x^{1/2} J_\alpha(x)\,y^{1/2} J_{\alpha-1}(y) + x^{-1/2} J_{\alpha-1}(x)\,y^{3/2} J_\alpha(y). \tag{150}$$
The central terms can be combined into $-2\alpha\,x^{1/2} J_\alpha(x)\,y^{-1/2} J_\alpha(y)$ using the identity (104) and expanded using this same identity into $-x^{3/2} J_{\alpha-1}(x)\,y^{-1/2} J_\alpha(y) - x^{3/2} J_{\alpha+1}(x)\,y^{-1/2} J_\alpha(y)$. Two terms cancel out, and the remaining two factor to give the right-hand side of (149).

We now have, using Lemma 7 (and recalling that $A - 1 = 2a$),
$$\sqrt{\frac{\xi}{\eta}}\,\hat K_2^{(A)}(\xi, \eta) = \sqrt{\frac{\xi}{\eta}}\,\frac{\sqrt{\xi\eta}}{\xi^2 - \eta^2}\,\kappa_{A+1/2}(\pi\xi, \pi\eta) = \sqrt{\frac{\eta}{\xi}}\,\frac{\sqrt{\xi\eta}}{\xi^2 - \eta^2}\,\kappa_{A-1/2}(\pi\xi, \pi\eta) - \pi\,J_{A-1}(\pi\xi)\,J_A(\pi\eta) = \sqrt{\frac{\eta}{\xi}}\,\hat K_2^{(A-1)}(\xi, \eta) - \pi\,J_{A-1}(\pi\xi)\,J_A(\pi\eta), \tag{151}$$
and similarly
$$\sqrt{\frac{\xi}{\eta}}\,\hat K_2^{(A)}(\xi, \eta) = \sqrt{\frac{\eta}{\xi}}\,\hat K_2^{(A+1)}(\xi, \eta) + \pi\,J_{A+1}(\pi\xi)\,J_A(\pi\eta). \tag{152}$$
From (107):
$$\int_0^{\pi\xi} J_A \pm 2J_{A\mp1}(\pi\xi) = \int_0^{\pi\xi} J_{A\mp2}. \tag{153}$$
The last two equations provide alternative forms of the kernel $\hat S_1^{(a)}$, namely
$$\hat S_1^{(a)}(\xi, \eta) = \sqrt{\frac{\eta}{\xi}}\,\hat K_2^{(2a)}(\xi, \eta) + \frac{\pi}{2}\,J_{2a+1}(\pi\eta)\left[1 - \int_0^{\pi\xi} J_{2a-1}(t)\,dt\right], \tag{154}$$
$$\hat S_1^{(a)}(\xi, \eta) = \sqrt{\frac{\eta}{\xi}}\,\hat K_2^{(2a+2)}(\xi, \eta) + \frac{\pi}{2}\,J_{2a+1}(\pi\eta)\left[1 - \int_0^{\pi\xi} J_{2a+3}(t)\,dt\right]. \tag{155}$$
As before, the terms in brackets can be replaced by $\int_{\pi\xi}^{\infty}$.
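Lemma 7 is a purely algebraic consequence of the recurrence (104), so it can be sanity-checked numerically for both sign choices. The sketch below evaluates the Bessel functions from the truncated series (100), which is an assumption adequate for moderate arguments; the function names are hypothetical.

```python
import math


def bessel_j(nu, z, terms=60):
    """Bessel function J_nu(z), z > 0, from the truncated power series (100)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (z / 2.0) ** (2 * k) / (
            math.factorial(k) * math.gamma(nu + k + 1))
    return (z / 2.0) ** nu * total


def kappa(alpha, x, y):
    """kappa_alpha(x, y) as defined in Lemma 7."""
    return (x * bessel_j(alpha + 0.5, x) * bessel_j(alpha - 0.5, y)
            - bessel_j(alpha - 0.5, x) * y * bessel_j(alpha + 0.5, y))


def lemma7_sides(alpha, x, y, top=True):
    """Return (lhs, rhs) of Eq. (149) for the top or the bottom signs."""
    s = 0.5 if top else -0.5
    lhs = (math.sqrt(x / y) * kappa(alpha + s, x, y)
           - math.sqrt(y / x) * kappa(alpha - s, x, y))
    sign = -1.0 if top else 1.0
    rhs = (sign * (x * x - y * y) / math.sqrt(x * y)
           * bessel_j(alpha - 0.5 - s, x) * bessel_j(alpha - 0.5 + s, y))
    return lhs, rhs


if __name__ == "__main__":
    for top in (True, False):
        print("top signs" if top else "bottom signs", lemma7_sides(1.2, 2.0, 0.7, top))
```

Each printed pair should agree to within rounding error. The value $\alpha = 1.2$ is chosen so that none of the Bessel orders involved is a negative integer, where the Gamma function in the series has poles.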
5.4. Asymptotics of the symplectic Jacobi kernel. Here we set $A = a - 1$, $B = b - 1$, where $a, b$ are the parameters of the symplectic Jacobi ensemble. Note that here $a, b > -1$ corresponds to $A, B > -2$. With $c_N$ as in (138) and $\psi_N = \psi_N^{(A,B)}$ as in (137), the summation formula in this case reads
$$S_{R4}^{(a,b)}(x, y) = \frac{1}{2}\sqrt{\frac{1 - x^2}{1 - y^2}}\;K_{2R,2}^{(A,B)}(x, y) - \frac{1}{2}\,c_{2R-1}\,\psi_{2R}(y)\,\delta\psi_{2R-1}(x), \tag{156}$$
where the operator $\delta$ acts by
$$\delta f(x) = \int_x^1 f(t)\,dt. \tag{157}$$
The formula (156) only holds verbatim when $a > 0$ (that is, $A, B > -1$), since the integral defining $\delta\psi_N^{(A,B)}$ is divergent for $A \le -1$. However, we note that the skew orthogonal polynomials of the second kind are analytic functions of the parameters $a, b > -1$ (corresponding to $A, B > -2$), hence the kernel $K_{N4}$ is an analytic function on $a, b > -1$. Thus, we must find a suitable analytic continuation of (156) valid for $A, B > -2$. First we remark that, although the original kernel $K_{2R,2}^{(A,B)}$ of unitary Jacobi ensembles is defined for $A, B > -1$, Eq. (109) is well-defined and analytic for $A, B > -2$ if $R > 1$ (which we will assume). We write
$$\delta\psi_N^{(A,B)}(x) = \int_x^1 (1 - t)^{(A-1)/2}(1 + t)^{(B-1)/2}\,P_N^{(A,B)}(t)\,dt = \int_x^1 (1 - t)^{(A-1)/2}(1 + t)^{(B-1)/2}\left(P_N^{(A,B)}(t) - P_N^{(A,B)}(1)\right)dt + P_N^{(A,B)}(1) \int_x^1 (1 - t)^{(A-1)/2}(1 + t)^{(B-1)/2}\,dt. \tag{158}$$
The first integral on the right-hand side is well-defined and analytic for $A > -2$. The term $P_N^{(A,B)}(1) = \binom{A+N}{N}$ (cf. Eq. (91)) vanishes for $A = -1$, which is sufficient to extend the second integral on the right-hand side to a well-defined analytic function on the range $A > -2$. It is easy to rewrite that integral as an incomplete Beta function and use well-known results to achieve the extension, but one can also proceed elementarily as follows. Integrating the second integral by parts we obtain, for $A > -1$:
$$\binom{A+N}{N} \int_x^1 (1 - t)^{(A-1)/2}(1 + t)^{(B-1)/2}\,dt = \frac{2}{A+1}\binom{A+N}{N}(1 - x)^{(A+1)/2}(1 + x)^{(B-1)/2} + \frac{B-1}{A+1}\binom{A+N}{N} \int_x^1 (1 - t)^{(A+1)/2}(1 + t)^{(B-3)/2}\,dt. \tag{159}$$
Observe that
$$\frac{1}{A+1}\binom{A+N}{N} = \frac{1}{N}\binom{A+N}{N-1}, \tag{160}$$
and the latter is an analytic function of $A$ for all $A$. Then both terms on the right-hand side of (159) are analytic functions of $A > -2$ for $-1 < x \le 1$, so this last equation provides the analytic extension of the integral (158) defining $\delta\psi_N^{(A,B)}(x)$, which is sensu stricto undefined for $A \le -1$, to an analytic function on $A > -2$.

The rest of the reasoning is analogous to that in the orthogonal case. The only technical difficulty arises because the error term (98) in Szegő's formula does not depend on $\theta$ in the range $0 < \theta \le c/N$, effectively making the reasoning of the previous section inapplicable when $-2 < A \le -1$. This is to be expected since the summation formula only makes sense after being analytically continued. In what follows we prove that the various limits of the kernel do in fact depend analytically on the parameter $A$, thus allowing the expressions obtained for $A > -1$ to be extended to $A > -2$.

Using Szegő's formula (97) (valid for all $A$), it is straightforward to obtain the following variant of Lemma 6:

Lemma 8. For any $A$, $B$, $\theta$ real and $0 < \phi < \pi$ we have:
$$\lim_{N\to\infty} N\int_{\theta/N}^{\phi} \psi_N^{(A,B)}(\cos\psi)\,\sin\psi\,d\psi = 2^{(A+B)/2}\int_\theta^\infty J_A. \tag{161}$$

Lemma 9. Using Eq. (159), the expression
$$N\int_0^{\theta/N} \psi_N^{(A,B)}(\cos\phi)\,\sin\phi\,d\phi \tag{162}$$
can be analytically continued to a regular function on $A > -2$. As $N \to \infty$, this function tends to a limit which is also analytic for $A > -2$ and coincides with (144) for $A > -1$.

We change variables $\phi \to \phi/N$. As before, we split the integral to rewrite (162) in the form
$$\begin{aligned}
&2^{(A+B)/2}\int_0^\theta \Bigl(\sin\frac{\phi}{2N}\Bigr)^{A}\Bigl(\cos\frac{\phi}{2N}\Bigr)^{B}\Bigl[P_N^{(A,B)}\Bigl(\cos\frac{\phi}{N}\Bigr) - P_N^{(A,B)}(1)\Bigr]\,d\phi \\
&\quad + 2^{(A+B)/2}\,P_N^{(A,B)}(1)\int_0^\theta \Bigl(\sin\frac{\phi}{2N}\Bigr)^{A}\Bigl(\cos\frac{\phi}{2N}\Bigr)^{B}\,d\phi.
\end{aligned} \tag{163}$$
The first of these terms is analytic for $A > -2$; the second one has an analytic continuation given by (159). It is easy to see that this second term has the asymptotic behavior
$$2^{(A+B)/2}\,P_N^{(A,B)}(1)\int_0^\theta \Bigl(\sin\frac{\phi}{2N}\Bigr)^{A}\Bigl(\cos\frac{\phi}{2N}\Bigr)^{B}\,d\phi \;\sim\; 2^{(B-A)/2}\,\frac{1}{N}\binom{A+N}{N-1}\Bigl(\frac{\theta}{N}\Bigr)^{A+1} \tag{164}$$
as $N \to \infty$, and from Stirling's formula (99), the binomial coefficient $\binom{A+N}{N-1} = \frac{\Gamma(A+N+1)}{\Gamma(N)\,\Gamma(A+2)} = O(N^{A+1})$, hence this second term is asymptotically negligible. As for the first term in (163), we first write
$$\begin{aligned}
P_N^{(A,B)}\Bigl(\cos\frac{\phi}{N}\Bigr) - P_N^{(A,B)}(1)
&= -\frac{1}{N}\int_0^\phi \bigl(P_N^{(A,B)}\bigr)'\Bigl(\cos\frac{\psi}{N}\Bigr)\sin\frac{\psi}{N}\,d\psi \\
&= -\frac{N+A+B+1}{2N}\int_0^\phi P_{N-1}^{(A+1,B+1)}\Bigl(\cos\frac{\psi}{N}\Bigr)\sin\frac{\psi}{N}\,d\psi,
\end{aligned} \tag{165}$$
where we have used the differentiation formula (92). We can now use Szegő's formula (95) to estimate $P_{N-1}^{(A+1,B+1)}$ since $A + 1 > -1$. The upshot is that the limit of (162) as $N \to \infty$ can be written as the following integral, which is an analytic function of $A > -2$:
$$-2^{(A+B)/2}\int_0^\theta \phi^{A}\int_0^\phi \psi^{-A}J_{A+1}(\psi)\,d\psi\,d\phi. \tag{166}$$
Using the Bessel function identity (106) we can simplify the above integral, for $A > -1$:
$$2^{(A+B)/2}\int_0^\theta J_A(\phi)\,d\phi, \tag{167}$$
which is in agreement with Lemma 6. We note that the expression (167) can be easily continued to an analytic function of $A > -2$ without the need to rewrite it as the double integral (166). Namely, using (107) we have, for $A > -1$,
$$\int_0^\theta J_A(\phi)\,d\phi = 2J_{A+1}(\theta) + \int_0^\theta J_{A+2}(\phi)\,d\phi. \tag{168}$$
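As a quick numerical sanity check of the continuation formula (168), the following minimal sketch compares both sides for sample values of $A \ge 0$ and evaluates the right-hand side for a value of $A$ in $(-2, -1)$. It assumes the standard recurrence $2J_\nu' = J_{\nu-1} - J_{\nu+1}$, which gives the coefficient 2 in front of $J_{A+1}(\theta)$. The helper names `bessel_j`, `simpson`, `lhs_168` and `rhs_168` are illustrative only and do not come from the paper.

```python
import math

def bessel_j(nu, x, terms=40):
    """J_nu(x) for x >= 0 from its power series.

    Assumes nu is not a negative integer; for nu < 0 the point x = 0
    must be avoided because of the factor (x/2)**nu.
    """
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (x / 2.0) ** (2 * k + nu)
                  / (math.factorial(k) * math.gamma(k + nu + 1)))
    return total

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def lhs_168(A, theta):
    """Left-hand side of (168); used here only for A >= 0."""
    return simpson(lambda p: bessel_j(A, p), 0.0, theta)

def rhs_168(A, theta):
    """Right-hand side of (168); every term is finite for A > -2."""
    return 2.0 * bessel_j(A + 1, theta) + simpson(
        lambda p: bessel_j(A + 2, p), 0.0, theta)

if __name__ == "__main__":
    for A in (0.0, 1.0):
        print(A, lhs_168(A, 3.0), rhs_168(A, 3.0))
    # The right-hand side remains well defined in the continued range.
    print(rhs_168(-1.5, 1.0))
```

The left-hand side is only evaluated for $A \ge 0$ in this sketch, since the plain Simpson rule does not handle the integrable singularity of $J_A$ at the origin for $-1 < A < 0$.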
The expression on the right-hand side of (168) is analytic for $A > -2$ and provides the desired analytic continuation.

The global level density is derived identically to the previous section. The limiting kernel in the bulk of the spectrum is given by the sum of two terms: $\bar S_2^{(a)}(2\xi, 2\eta)$ and another term which is negligible in the limit. For the central point $z = +1$, the lemmas above yield the following expression for the limiting kernel:
$$\hat S_4^{(a)}(\xi,\eta) = \frac{\xi}{\eta}\,\hat K_2^{(A)}(2\xi, 2\eta) - \frac{\pi}{2}\,J_A(2\pi\eta)\int_0^{2\pi\xi} J_A(t)\,dt, \tag{169}$$
where the last integral is to be understood in the sense of Eq. (168) for $A \le -1$. Using Eqs. (151) and (152) together with (153) and Eq. (169) above, the kernel can be rewritten in either of the forms:
$$\hat S_4^{(a)}(\xi,\eta) = \frac{\eta}{\xi}\,\hat K_2^{(a)}(2\xi, 2\eta) - \frac{\pi}{2}\,J_{a-1}(2\pi\eta)\int_0^{2\pi\xi} J_{a+1}(t)\,dt, \tag{170}$$
$$\hat S_4^{(a)}(\xi,\eta) = \frac{\eta}{\xi}\,\hat K_2^{(a-2)}(2\xi, 2\eta) - \frac{\pi}{2}\,J_{a-1}(2\pi\eta)\int_0^{2\pi\xi} J_{a-3}(t)\,dt. \tag{171}$$

Acknowledgement. I wish to thank Prof. Peter Sarnak for his continued encouragement and guidance as my Ph.D. thesis advisor as well as Brian Conrey for making my stay at AIM possible.
Communicated by P. Sarnak
Commun. Math. Phys. 244, 63–97 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0979-1
Communications in
Mathematical Physics
Classification of Two-Dimensional Local Conformal Nets with c < 1 and 2-Cohomology Vanishing for Tensor Categories
Yasuyuki Kawahigashi¹, Roberto Longo²
¹ Department of Mathematical Sciences, University of Tokyo, Komaba, Tokyo, 153-8914, Japan. E-mail: [email protected]
² Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 1, 00133 Roma, Italy. E-mail: [email protected]
Received: 14 April 2003 / Accepted: 1 July 2003
Published online: 13 November 2003 – © Springer-Verlag 2003
Abstract: We classify two-dimensional local conformal nets with parity symmetry and central charge less than 1, up to isomorphism. The maximal ones are in a bijective correspondence with the pairs of A-D-E Dynkin diagrams with the difference of their Coxeter numbers equal to 1. In our previous classification of one-dimensional local conformal nets, Dynkin diagrams D2n+1 and E7 do not appear, but now they do appear in this classification of two-dimensional local conformal nets. Such nets are also characterized as two-dimensional local conformal nets with µ-index equal to 1 and central charge less than 1. Our main tool, in addition to our previous classification results for one-dimensional nets, is 2-cohomology vanishing for certain tensor categories related to the Virasoro tensor categories with central charge less than 1.
1. Introduction

The subject of Conformal Quantum Field Theory is particularly interesting in two spacetime dimensions and has indeed been intensively studied in the last two decades with important motivations from Physics (see e.g. [11]) and Mathematics (see e.g. [14]). Basically the richness of structure is due to the fact that the conformal group (with respect to the Minkowskian metric) is infinite dimensional in 1 + 1 dimensions. Already at the early stage of investigation, it was realized that such an infinite dimensional symmetry group puts rigid constraints on the structure, and the problem of classifying all models was posed and considered as a major aim. Indeed many important results in this direction were obtained; in particular the central charge $c > 0$, an intrinsic quantum label associated with each model, was shown to split into a discrete range $c < 1$ and a continuous one $c \ge 1$, see [2, 17, 19] and refs in [19].
Supported in part by JSPS. Supported in part by GNAMPA and MIUR.
The main purpose of this paper is to achieve a complete classification of the two-dimensional conformal models in the discrete series. In order to formulate such a statement in a precise manner, we need to explain our setting.

The essential, intrinsic structure of a given model is described by a net $\mathcal A$ on the two-dimensional Minkowski spacetime $M$ [22]. With each double cone $\mathcal O$ (an open region which is the intersection of the past of one point and the future of a second point) one associates the von Neumann algebra $\mathcal A(\mathcal O)$ generated by the observables localized in $\mathcal O$ (say smeared fields integrated with test functions with support in $\mathcal O$). The net $\mathcal A : \mathcal O \mapsto \mathcal A(\mathcal O)$ is then local and covariant with respect to the conformal group. One may restrict $\mathcal A$ to the two light rays $x \pm t = 0$ and obtain two local conformal nets $\mathcal A_\pm$ on $\mathbb R$, hence on its one-point compactification $S^1$. So we have an irreducible two-dimensional subnet
$$\mathcal B(\mathcal O) \equiv \mathcal A_+(I_+) \otimes \mathcal A_-(I_-) \subset \mathcal A(\mathcal O),$$
where $\mathcal O = I_+ \times I_-$ is the double cone associated with the intervals $I_\pm$ of the light rays. The structure of $\mathcal A$, thus the classification of local conformal nets, splits in the following two points:
• The classification of local conformal nets on $S^1$.
• The classification of irreducible local extensions of chiral conformal nets.
Here a chiral net is a net that splits in the tensor product of two one-dimensional nets on the light rays.

Now the conformal group of $M$ is $\mathrm{Diff}(S^1) \times \mathrm{Diff}(S^1)$¹; thus, restricting the projective unitary covariance representation to the two copies of $\mathrm{Diff}(S^1)$, we get Virasoro nets $\mathrm{Vir}_{c_\pm} \subset \mathcal A_\pm$ with central charge $c_\pm$. If there is a parity symmetry, then $c_+ = c_-$, so we may talk of the central charge $c \equiv c_\pm$ of $\mathcal A$. If $c < 1$, it turns out that $\mathcal A$ is completely rational [32] and the subnet $\mathrm{Vir}_c \otimes \mathrm{Vir}_c \subset \mathcal A$ has finite Jones index, where $\mathrm{Vir}_c \otimes \mathrm{Vir}_c(\mathcal O) \equiv \mathrm{Vir}_c(I_+) \otimes \mathrm{Vir}_c(I_-)$.
The classification of two-dimensional local conformal nets with central charge $c < 1$ and parity symmetry thus splits in the following two points:
(a) The classification of Virasoro nets $\mathrm{Vir}_c$ on $S^1$ with $c < 1$.
(b) The classification of irreducible local extensions with finite Jones index of the two-dimensional Virasoro net $\mathrm{Vir}_c \otimes \mathrm{Vir}_c$.

Point (a) has been completely achieved in our recent work [31]. The Virasoro nets on $S^1$ with central charge less than one are in bijective correspondence with the pairs of $A$-$D_{2n}$-$E_{6,8}$ Dynkin diagrams such that the difference of their Coxeter numbers is equal to 1. Among other important aspects of this classification, we mention here the occurrence of nets that are not realized as coset models, in contrast to a long standing expectation. (See Remarks after Theorem 7 of [34] on this point. Also, Carpi and Xu recently made progress on classification for the case $c = 1$ in [10, 54], respectively.)

The aim of this paper is to pursue point (b). We shall obtain a complete classification of the two-dimensional local conformal nets (with parity) with central charge in the discrete series. To this end we first classify the maximal nets in this class. Maximality here means that the net does not admit any irreducible local conformal net extension. Maximality will turn out to be also equivalent to the triviality of the superselection structure or to µ-index equal to one, that is Haag duality for a disconnected union of finitely many double cones.

¹ More precisely $\mathrm{Diff}(S^1) \times \mathrm{Diff}(S^1)$ is the conformal group of the Minkowskian torus $S^1 \times S^1$, the conformal completion of $M = \mathbb R \times \mathbb R$ (light ray decomposition), and the covariance group is a central extension of $\mathrm{Diff}(S^1) \times \mathrm{Diff}(S^1)$, see Sect. 2.
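The Coxeter-number condition on pairs of Dynkin diagrams can be made concrete with a small enumeration. The sketch below is illustrative only: it uses the standard Coxeter numbers $h(A_n) = n+1$, $h(D_n) = 2n-2$ ($n \ge 4$), $h(E_6) = 12$, $h(E_7) = 18$, $h(E_8) = 30$, and it assumes the usual parametrization of the discrete series, $c = 1 - 6/(m(m+1))$ with Coxeter numbers $m$ and $m+1$, which is standard background and is not stated in the text above. The function names and the `one_dimensional` flag are hypothetical.

```python
from fractions import Fraction

EXCEPTIONAL = {"E_6": 12, "E_7": 18, "E_8": 30}

def diagrams_with_coxeter_number(h, one_dimensional=False):
    """Names of A-D-E Dynkin diagrams with Coxeter number h.

    With one_dimensional=True only the A, D_{2n}, E_6 and E_8 diagrams
    of the one-dimensional list are kept.
    """
    found = []
    if h >= 2:
        found.append(f"A_{h - 1}")              # h(A_n) = n + 1
    if h >= 6 and h % 2 == 0:
        n = h // 2 + 1                          # h(D_n) = 2n - 2, n >= 4
        if n % 2 == 0 or not one_dimensional:
            found.append(f"D_{n}")
    for name, h_exc in EXCEPTIONAL.items():
        if h == h_exc and not (one_dimensional and name == "E_7"):
            found.append(name)
    return found

def diagram_pairs(m, one_dimensional=False):
    """Pairs of diagrams with Coxeter numbers m and m + 1."""
    first = diagrams_with_coxeter_number(m, one_dimensional)
    second = diagrams_with_coxeter_number(m + 1, one_dimensional)
    return [(g1, g2) for g1 in first for g2 in second]

def central_charge(m):
    """Discrete-series value c = 1 - 6 / (m (m + 1)) (assumed parametrization)."""
    return 1 - Fraction(6, m * (m + 1))

if __name__ == "__main__":
    for m in (3, 11, 17, 29):
        print(m, central_charge(m), diagram_pairs(m), diagram_pairs(m, True))
```

Since one of two consecutive Coxeter numbers is odd and only the $A$ diagrams have odd Coxeter number, every pair produced this way contains at least one $A$ diagram.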
It is clear at this point that our methods mainly concern Operator Algebras, in particular Subfactor Theory, see [50]. Indeed this was already the case in our previous one-dimensional classification [31]. The use of von Neumann algebras not only provides a clear formulation of the problem, but also suggests the path to follow in the analysis.

Our strategy is the following. The dual canonical endomorphism of $\mathrm{Vir}_c \otimes \mathrm{Vir}_c \subset \mathcal A$ decomposes as
$$\theta = \bigoplus_{ij} Z_{ij}\,\rho_i \otimes \bar\rho_j \tag{1}$$
(i.e. the above is the restriction to $\mathrm{Vir}_c \otimes \mathrm{Vir}_c$ of the vacuum representation of $\mathcal A$), where $\{\rho_i\}_i$ are representatives of unitary equivalence classes of irreducible DHR endomorphisms of the net $\mathrm{Vir}_c$. Since $\mu_{\mathcal A} = 1$, it turns out, by using the results in [32], that the matrix $Z$ is a modular invariant for the tensor category of representations of the Virasoro net $\mathrm{Vir}_c$ [41], and such modular invariants have been classified by Cappelli-Itzykson-Zuber [9]. We shall show that this map $\mathcal A \mapsto Z$ sets up a bijective correspondence between the set of isomorphism classes of two-dimensional maximal local conformal nets with parity and central charge less than one on one hand and the list of Cappelli-Itzykson-Zuber modular invariants, on the other hand.

We first prove that the correspondence $\mathcal A \mapsto Z$ is surjective. Indeed, by our previous work [31], $Z$ can be realized by α-induction as in [5] for extensions of the Virasoro nets. (See [38, 52, 3, 6, 7, 4] for more on α-induction.) Then Rehren's results in [48] imply that $\theta$ defined as in (1) is the canonical endomorphism associated with a natural Q-system, and we have a corresponding local extension $\mathcal A$ of $\mathrm{Vir}_c \otimes \mathrm{Vir}_c$ and this produces the matrix $Z$ in the above correspondence.

To show the injectivity of the correspondence note that, due to the work of Rehren [47], we have an inclusion
$$\mathrm{Vir}_c(I_+) \otimes \mathrm{Vir}_c(I_-) \subset \mathcal A_+(I_+) \otimes \mathcal A_-(I_-) \subset \mathcal A(\mathcal O),$$
where $\mathcal A_+ \otimes \mathcal A_-$ is the maximal chiral subnet. By assumption, $\mathcal A_+$ and $\mathcal A_-$ are isomorphic with central charge $c < 1$, thus they are in the discrete series classified in [31]. Moreover $Z$ determines uniquely the isomorphism class of $\mathcal A_\pm$ and an isomorphism $\pi$ from a fusion rule of $\mathcal A_+$ onto that of $\mathcal A_-$ so that the dual canonical endomorphism $\lambda$ on $\mathcal A_+ \otimes \mathcal A_-$ decomposes as
$$\lambda = \bigoplus_i \alpha_i \otimes \bar\alpha_{\pi(i)}, \tag{2}$$
where $\{\alpha_i\}_i$ is a system of irreducible DHR endomorphisms of $\mathcal A_+ = \mathcal A_-$. If $Z$ is a modular invariant of type I, the map $\pi$ is trivial, so the dual canonical endomorphism has the same form as the Longo-Rehren endomorphism [38]. Thus the classification is reduced to the classification of Q-systems in the sense of [36] having the canonical endomorphism of the form given by Eq. (1). This type of classification of Q-systems, up to unitary equivalences, was studied by Izumi-Kosaki [27] as a subfactor analogue of 2-cohomology of (finite) groups. In our setting, we now have a 2-cohomology group of a tensor category, while the 2-cohomology of Izumi-Kosaki does not have a group structure in general. The group operation comes from a natural composition of 2-cocycles. Then the crucial point in our analysis is the vanishing of this 2-cohomology
for a certain tensor category as we will explain below, and this vanishing implies that the dual Q-system for the inclusion $\mathcal A_+ \otimes \mathcal A_- \subset \mathcal A$ has a standard dual canonical endomorphism as in the Longo-Rehren Q-system [38], namely $\mathcal A_+ \otimes \mathcal A_- \subset \mathcal A$ is the “quantum double” inclusion constructed in [38]. At this point, as we know the isomorphism class of $\mathcal A_\pm$ by our previous classification [31], it follows that the isomorphism class of $\mathcal A$ is determined by $Z$.

If the modular invariant is of type II, then $\pi$ gives a non-trivial fusion rule automorphism; however, $\pi$ is actually associated with an automorphism of the tensor category acting non-trivially on irreducible objects [4]. We may then extend our arguments of 2-cohomology vanishing and deal also with this case. It turns out that the automorphism $\pi$ is an automorphism of a braided tensor category.

We thus arrive at the following classification: the maximal local two-dimensional conformal nets with $c < 1$ and parity symmetry are in a bijective correspondence with the pairs of the A-D-E Dynkin diagrams such that the difference of their Coxeter numbers is equal to 1, namely $Z$ is a modular invariant listed in Table 1 (end of Sect. 5). Note that Dynkin diagrams of type $D_{2n+1}$ and $E_7$ do appear in the list of the present classification of two-dimensional conformal nets, but they were absent in the one-dimensional classification list [31].

Now, as we shall see, a two-dimensional local conformal net $\mathcal B$ in the discrete series is a finite-index subnet of a maximal local conformal net $\mathcal A$. Moreover $\mathcal A$ and $\mathcal B$ have the same two-dimensional Virasoro subnet. Using this, we then obtain the classification of all local two-dimensional conformal nets with central charge $c < 1$. The non-maximal ones correspond bijectively to the pairs $(T, \alpha)$, where $T$ is a proper sub-tensor category of the representation tensor category of $\mathrm{Vir}_c$ and $\alpha$ is an automorphism of $T$. There are at most two automorphisms, thus two possible nets for a given $T$.
The complete list is given in Table 2 (end of Sect. 6).

As we have mentioned, a crucial point in our analysis is to show the uniqueness up to equivalence of the Q-system associated with the canonical endomorphism of the form (2) in our cases. To this end we consider a cohomology associated with a representation tensor category that we have to show to vanish in our case. Note that our 2-cohomology groups are a generalization of the usual 2-cohomology groups of finite groups, so they certainly do not vanish in general.

Before concluding this introduction we make explicit that our classification applies as well to the local conformal nets with central charge less than one on other two-dimensional spacetimes. Indeed if $N$ is a two-dimensional spacetime that is conformally equivalent to $M$, namely conformally diffeomorphic to a subregion of the Einstein cylinder $S^1 \times \mathbb R$, we may then consider the local conformal nets on $N$ that satisfy the double cone KMS property. These nets are in one-to-one correspondence with the local conformal nets on Minkowski spacetime $M$, see [21], and so one immediately reads off our classification in these different contexts. An important case where this applies is represented by the two-dimensional de Sitter spacetime.

2. Two-Dimensional Completely Rational Nets and Central Charge

Let $M$ be the two-dimensional Minkowski spacetime, namely $\mathbb R^2$ equipped with the metric $dt^2 - dx^2$. We shall also use the light ray coordinates $\xi_\pm \equiv t \pm x$. We have the decomposition $M = L_+ \times L_-$ where $L_\pm = \{\xi : \xi_\pm = 0\}$ are the two light ray lines. A double cone $\mathcal O$ is a non-empty open subset of $M$ of the form $\mathcal O = I_+ \times I_-$ with $I_\pm \subset L_\pm$ bounded intervals; we denote by $\mathcal K$ the set of double cones.
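The description of double cones in light ray coordinates can be sketched as follows. This is a minimal illustration, not part of the paper: the class `DoubleCone`, its field names and the constructor `between` are hypothetical, and the sketch only assumes the coordinates $\xi_\pm = t \pm x$ together with the earlier description of a double cone as the intersection of the future of one point with the past of another.

```python
from dataclasses import dataclass

def light_ray_coordinates(t, x):
    """Return (xi_plus, xi_minus) = (t + x, t - x)."""
    return t + x, t - x

@dataclass(frozen=True)
class DoubleCone:
    """O = I_+ x I_- with I_+ = (a_plus, b_plus), I_- = (a_minus, b_minus)."""
    a_plus: float
    b_plus: float
    a_minus: float
    b_minus: float

    @classmethod
    def between(cls, past_tip, future_tip):
        """Future of past_tip intersected with the past of future_tip.

        Both tips are (t, x) pairs; future_tip must lie in the open
        forward light cone of past_tip.
        """
        a_plus, a_minus = light_ray_coordinates(*past_tip)
        b_plus, b_minus = light_ray_coordinates(*future_tip)
        if not (a_plus < b_plus and a_minus < b_minus):
            raise ValueError("future_tip is not in the open future of past_tip")
        return cls(a_plus, b_plus, a_minus, b_minus)

    def contains(self, t, x):
        """True if the point (t, x) lies in the open double cone."""
        xi_plus, xi_minus = light_ray_coordinates(t, x)
        return (self.a_plus < xi_plus < self.b_plus
                and self.a_minus < xi_minus < self.b_minus)

if __name__ == "__main__":
    cone = DoubleCone.between((-1.0, 0.0), (1.0, 0.0))
    print(cone, cone.contains(0.0, 0.5), cone.contains(0.0, 1.5))
```

In these coordinates membership in a double cone reduces to two independent interval tests, which is the product structure $\mathcal O = I_+ \times I_-$ used throughout.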
The Möbius group $PSL(2,\mathbb R)$ acts on $\mathbb R \cup \{\infty\}$ by linear fractional transformations, hence this action restricts to a local action on $\mathbb R$ (see e.g. [8]); in particular, if $F \subset \mathbb R$ has compact closure there exists a connected neighborhood $\mathcal U$ of the identity in $PSL(2,\mathbb R)$ such that $gF \subset \mathbb R$ for all $g \in \mathcal U$. It is convenient to regard this as a local action on $\mathbb R$ of the universal covering group $\widetilde{PSL}(2,\mathbb R)$ of $PSL(2,\mathbb R)$. We then have a local (product) action of $\widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ on $M = L_+ \times L_-$. Clearly $\widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ acts by pointwise rescaling the metric $d\xi_+\,d\xi_-$, i.e. by conformal transformations.

A local Möbius covariant net $\mathcal A$ on $M$ is a map $\mathcal A : \mathcal O \in \mathcal K \mapsto \mathcal A(\mathcal O)$, where the $\mathcal A(\mathcal O)$'s are von Neumann algebras on a fixed Hilbert space $\mathcal H$, with the following properties:
• Isotony. $\mathcal O_1 \subset \mathcal O_2 \Rightarrow \mathcal A(\mathcal O_1) \subset \mathcal A(\mathcal O_2)$.
• Locality. If $\mathcal O_1$ and $\mathcal O_2$ are spacelike separated then $\mathcal A(\mathcal O_1)$ and $\mathcal A(\mathcal O_2)$ commute elementwise (two points $\xi_1$ and $\xi_2$ are spacelike if $(\xi_1 - \xi_2)_+ (\xi_1 - \xi_2)_- < 0$).
• Möbius covariance. There exists a unitary representation $U$ of $\widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ on $\mathcal H$ such that, for every double cone $\mathcal O \in \mathcal K$,
$$U(g)\,\mathcal A(\mathcal O)\,U(g)^{-1} = \mathcal A(g\mathcal O), \qquad g \in \mathcal U,$$
with $\mathcal U \subset \widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ any connected neighborhood of the identity such that $g\mathcal O \subset M$ for all $g \in \mathcal U$.
• Vacuum vector. There exists a unit $U$-invariant vector $\Omega$, cyclic for $\bigvee_{\mathcal O \in \mathcal K} \mathcal A(\mathcal O)$.
• Positive energy. The one-parameter unitary subgroup of $U$ corresponding to time translations has positive generator.

The 2-torus $S^1 \times S^1$ is a conformal completion of $M = L_+ \times L_-$ in the sense that $M$ is conformally diffeomorphic to a dense open subregion of $S^1 \times S^1$ and the local action of $PSL(2,\mathbb R) \times PSL(2,\mathbb R)$ on $M$ extends to a global conformal action on $S^1 \times S^1$. But in general the net $\mathcal A$ does not extend to a Möbius covariant net on $S^1 \times S^1$; this is related to the failure of timelike commutativity (note that a chiral net, i.e. the tensor product of two local nets on $S^1$, would extend). Indeed we have a covariant unitary representation of $\widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ and not of $PSL(2,\mathbb R) \times PSL(2,\mathbb R)$. Let however $G$ be the quotient of $\widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ modulo the relation $(r_{2\pi}, r_{-2\pi}) = (\mathrm{id}, \mathrm{id})$ (spatial $2\pi$-rotation is the identity).

Proposition 2.1. The representation $U$ of $\widetilde{PSL}(2,\mathbb R) \times \widetilde{PSL}(2,\mathbb R)$ factors through a representation of $G$.

The above proposition holds as a consequence of spacelike locality; it is a particular case of the conformal spin-statistics theorem and can be proved as in [20].

Because of the above Prop. 2.1, $\mathcal A$ does extend to a local $G$-covariant net on the Einstein cylinder $E = \mathbb R \times S^1$, the cover of the 2-torus obtained by lifting the time coordinate from $S^1$ to $\mathbb R$. Explicitly, $M$ is conformally equivalent to a double cone $\mathcal O_M$ of $E$. By parametrizing $E$ with coordinates $(t', \theta)$, $-\infty < t' < \infty$, $-\pi \le \theta < \pi$, the transformation
$$\xi_\pm = \tan\bigl(\tfrac{1}{2}(t' \pm \theta)\bigr) \tag{3}$$
is a diffeomorphism of the subregion $\mathcal O_M = \{(t', \theta) : -\pi < t' \pm \theta < \pi\} \subset E$ with $M$, which is a conformal map when $E$ is equipped with the metric $ds^2 \equiv dt'^2 - d\theta^2$.
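The conformal property of the map (3) can be checked numerically. Differentiating (3) gives $d\xi_+\,d\xi_- = \Omega^2\,(d\tau^2 - d\theta^2)$ with $\Omega^2 = \bigl[4\cos^2\tfrac{1}{2}(\tau+\theta)\,\cos^2\tfrac{1}{2}(\tau-\theta)\bigr]^{-1}$, where $\tau$ denotes the time coordinate on the cylinder. The sketch below is illustrative only; the function names and the symbol `tau` are assumptions, not notation from the paper.

```python
import math

def cylinder_to_minkowski(tau, theta):
    """Map a point of O_M to light ray coordinates (xi_plus, xi_minus) via Eq. (3)."""
    u, v = tau + theta, tau - theta
    if not (-math.pi < u < math.pi and -math.pi < v < math.pi):
        raise ValueError("point lies outside O_M")
    return math.tan(0.5 * u), math.tan(0.5 * v)

def minkowski_to_cylinder(xi_plus, xi_minus):
    """Inverse of Eq. (3): recover (tau, theta) from (xi_plus, xi_minus)."""
    u = 2.0 * math.atan(xi_plus)    # u = tau + theta
    v = 2.0 * math.atan(xi_minus)   # v = tau - theta
    return 0.5 * (u + v), 0.5 * (u - v)

def conformal_factor(tau, theta):
    """Omega^2 such that d xi_+ d xi_- = Omega^2 (d tau^2 - d theta^2)."""
    return 0.25 / (math.cos(0.5 * (tau + theta)) ** 2
                   * math.cos(0.5 * (tau - theta)) ** 2)

def metric_ratio(tau, theta, d_tau, d_theta):
    """Finite-difference estimate of d xi_+ d xi_- / (d tau^2 - d theta^2)."""
    p0, m0 = cylinder_to_minkowski(tau, theta)
    p1, m1 = cylinder_to_minkowski(tau + d_tau, theta + d_theta)
    return (p1 - p0) * (m1 - m0) / (d_tau ** 2 - d_theta ** 2)

if __name__ == "__main__":
    tau, theta = 0.3, -0.7
    print(conformal_factor(tau, theta), metric_ratio(tau, theta, 1e-5, 3e-6))
```

The finite-difference ratio agrees with $\Omega^2$ up to terms of the order of the step size, and displacements with $d\tau = d\theta$ leave $\xi_-$ unchanged, so light rays are mapped to light rays.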
$G$ acts globally on $E$ and the net $\mathcal A$ extends uniquely to a $G$-covariant net on $E$ with $U$ the unitary covariant action (see [8]). We shall denote by the same symbol $\mathcal A$ both the net on $M$ and the extended net on $E$. If $\mathcal O_1 \subset M$ (or $\mathcal O_1 \subset E$) we shall denote by $\mathcal A(\mathcal O_1)$ the von Neumann algebra generated by the $\mathcal A(\mathcal O)$'s as $\mathcal O$ varies in the double cones contained in $\mathcal O_1$. If $\mathcal O \in \mathcal K$ we shall denote by $\Lambda_{\mathcal O}$ the one-parameter subgroup of $G$ defined as follows: $\Lambda_{\mathcal O} = g\Lambda_W g^{-1}$ if $W$ is a wedge, $\Lambda_W$ is the boost one-parameter group associated with $W$, and $g\mathcal O = W$ with $g \in PSL(2,\mathbb R) \times PSL(2,\mathbb R)$, see [23].

We collect in the next proposition a few basic properties of a local Möbius covariant net. The proof is either in the references or can be immediately obtained from those. All the statements also hold true (with obvious modifications) in any spacetime dimension. We shall use the lattice symbol $\vee$ to denote the von Neumann algebra generated.

Proposition 2.2. Let $\mathcal A$ be a local Möbius covariant net on $M$ as above. The following hold:
(i) Double cone KMS property. If $\mathcal O \subset E$ is a double cone, then the unitary modular group associated with $(\mathcal A(\mathcal O), \Omega)$ has the geometrical meaning $\Delta_{\mathcal O}^{it} = U(\Lambda_{\mathcal O}(-2\pi t))$ [8].
(ii) Haag duality on $E$; wedge duality on $M$. If $\mathcal O \subset E$ is a double cone then $\mathcal A(\mathcal O') = \mathcal A(\mathcal O)'$. Here $\mathcal O'$ is the causal complement of $\mathcal O$ in $E$ (note that $\mathcal O'$ is still a double cone). In particular $\mathcal A(W') = \mathcal A(W)'$, where $W$ is a wedge in $M$, say $W = (-\infty, a) \times (-\infty, b)$, and $W'$ its causal complement in $M$, thus $W' = (a, \infty) \times (b, \infty)$ [8, 21].
(iii) Modular PCT symmetry. There is an anti-unitary involution $\Theta$ on $\mathcal H$ such that $\Theta\mathcal A(\mathcal O)\Theta = \mathcal A(-\mathcal O)$, $\Theta U(g)\Theta = U(\theta(g))$ and $\Theta\Omega = \Omega$. Here $\mathcal O$ is any double cone in $E$ and $\theta$ is the automorphism of $G$ associated with space and time reflection [8].
(iv) Additivity. Let $\mathcal O$ be a double cone and $\{\mathcal O_i\}$ a family of open sets such that $\bigcup_i \mathcal O_i$ contains the axis of $\mathcal O$. Then $\mathcal A(\mathcal O) \subset \bigvee_i \mathcal A(\mathcal O_i)$ [15].
(v) Equivalence between irreducibility and uniqueness of the vacuum. $\mathcal A$ is irreducible on $M$ (that is $\bigvee_{\mathcal O \in \mathcal K} \mathcal A(\mathcal O) = B(\mathcal H)$), iff $\mathcal A$ is irreducible on $E$, iff $\Omega$ is the unique $U$-invariant vector (up to a phase) [20].
(vi) Decomposition into irreducibles. $\mathcal A$ has a unique direct integral decomposition in terms of local irreducible Möbius covariant nets. If $\mathcal A$ is conformal (see below) then the fibers in the decomposition are also conformal [20].

By the above point (vi) we shall always assume our nets to be irreducible.

Let $\mathrm{Diff}(\mathbb R)$ denote the group of positively oriented diffeomorphisms of $\mathbb R$ that are smooth at infinity (with the identification $\mathbb R = S^1 \setminus \{\infty\}$, $\mathrm{Diff}(\mathbb R)$ is the subgroup of $\mathrm{Diff}(S^1)$ of orientation preserving diffeomorphisms of $S^1$ that fix the point $\infty$). By identifying $M$ with the double cone $\mathcal O_M \subset E$ as above, we may identify elements of $\mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$ with conformal diffeomorphisms of $\mathcal O_M$. Such diffeomorphisms uniquely extend (by periodicity) to global conformal diffeomorphisms of $E$. Namely the element $(r_{2\pi}, \mathrm{id})$ of $G$ generates a subgroup of $G$ (isomorphic to $\mathbb Z$) for which $\mathcal O_M$ is a fundamental domain in $E$. We may then extend an element of $\mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$ from $\mathcal O_M$ to all $E$ by requiring commutativity with this $\mathbb Z$-action; this is the unique conformal extension to $E$.

Let $\mathrm{Conf}(E)$ denote the group of global, orientation preserving conformal diffeomorphisms of $E$. $\mathrm{Conf}(E)$ is generated by $\mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$ and $G$ (note that $\mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$ intersects $G$ in the “Poincaré-dilation” subgroup). Indeed if $\varphi \in \mathrm{Conf}(E)$, then
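$\varphi\mathcal O_M$ is a maximal double cone of $E$, namely the causal complement of a point. Thus there exists an element $g \in G$ such that $g\mathcal O_M = \varphi\mathcal O_M$. Then $\psi \equiv g^{-1}\varphi$ maps $\mathcal O_M$ onto $\mathcal O_M$ and so $\psi \in \mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$ and $\varphi = g \cdot \psi$. Note that, by the same argument, any element of $\mathrm{Conf}(E)$ is uniquely the product of an element of $\mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$, a space rotation and a time translation on $E$.

A local conformal net $\mathcal A$ on $M$ is a Möbius covariant net such that the unitary representation $U$ of $G$ extends to a projective unitary representation of $\mathrm{Conf}(E)$ (still denoted by $U$) so that the extended net on $E$ is covariant. In particular
$$U(g)\,\mathcal A(\mathcal O)\,U(g)^{-1} = \mathcal A(g\mathcal O), \qquad g \in \mathcal U,$$
if $\mathcal U$ is a connected neighborhood of the identity of $\mathrm{Conf}(E)$, $\mathcal O \in \mathcal K$, and $g\mathcal O \subset M$ for all $g \in \mathcal U$. We further assume that
$$U(g)\,X\,U(g)^{-1} = X, \qquad g \in \mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R), \tag{4}$$
if $X \in \mathcal A(\mathcal O_1)$, $g \in \mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$ and $g$ acts identically on $\mathcal O_1$. We may check the conformal covariance on $M$ by the local action of $\mathrm{Diff}(\mathbb R) \times \mathrm{Diff}(\mathbb R)$.

Given a Möbius covariant net $\mathcal A$ on $M$ and a bounded interval $I \subset L_+$ we set
$$\mathcal A_+(I) \equiv \bigcap_{\mathcal O = I \times J} \mathcal A(\mathcal O) \tag{5}$$
(intersection over all intervals $J \subset L_-$), and analogously define $\mathcal A_-$. By identifying $L_\pm$ with $\mathbb R$ we then get two Möbius covariant local nets $\mathcal A_\pm$ on $\mathbb R$, the chiral components of $\mathcal A$, but for the cyclicity of $\Omega$; we shall also denote $\mathcal A_\pm$ by $\mathcal A_R$ and $\mathcal A_L$. By the Reeh-Schlieder theorem the cyclic subspace $\mathcal H_\pm \equiv \overline{\mathcal A_\pm(I)\Omega}$ is independent of the interval $I \subset L_\pm$ and $\mathcal A_\pm$ restricts to a (cyclic) Möbius covariant local net on $\mathbb R$ on the Hilbert space $\mathcal H_\pm$. Since $\Omega$ is separating for every $\mathcal A(\mathcal O)$, $\mathcal O \in \mathcal K$, the map $X \in \mathcal A_\pm(I) \mapsto X\!\restriction\!\mathcal H_\pm$ is an isomorphism for any interval $I$, so we will often identify $\mathcal A_\pm$ with its restriction to $\mathcal H_\pm$.

Proposition 2.3. Let $\mathcal A$ be a Möbius covariant (resp. conformal) net on $M$. Setting $\mathcal A_0(\mathcal O) \equiv \mathcal A_+(I_+) \vee \mathcal A_-(I_-)$, $\mathcal O = I_+ \times I_-$, then $\mathcal A_0$ is a Möbius covariant (resp. conformal) subnet of $\mathcal A$, there exists a consistent family of vacuum preserving conditional expectations $\varepsilon_{\mathcal O} : \mathcal A(\mathcal O) \to \mathcal A_0(\mathcal O)$ and the natural isomorphism from the product $\mathcal A_+(I_+) \cdot \mathcal A_-(I_-)$ to the algebraic tensor product $\mathcal A_+(I_+) \odot \mathcal A_-(I_-)$ extends to a normal isomorphism between $\mathcal A_+(I_+) \vee \mathcal A_-(I_-)$ and $\mathcal A_+(I_+) \otimes \mathcal A_-(I_-)$.

Proof. By the double cone KMS property in Prop. 2.2, for any given double cone $\mathcal O$, $\mathcal A_0(\mathcal O)$ is globally invariant under the modular group of $\mathcal A(\mathcal O)$ w.r.t. the vacuum state. Hence, by the Takesaki theorem [50], there exists a vacuum preserving conditional expectation $\varepsilon_{\mathcal O} : \mathcal A(\mathcal O) \to \mathcal A_0(\mathcal O)$. $\varepsilon_{\mathcal O}$ is given by $\varepsilon_{\mathcal O}(X)E = EXE$, $X \in \mathcal A(\mathcal O)$, where $E$ is the orthogonal projection onto $\overline{\mathcal A_0(\mathcal O)\Omega}$. By the Reeh-Schlieder theorem $E$ is independent of $\mathcal O$, thus if $\tilde{\mathcal O}$ is a double cone containing $\mathcal O$ we have $\varepsilon_{\tilde{\mathcal O}}\!\restriction\!\mathcal A(\mathcal O) = \varepsilon_{\mathcal O}$, namely the $\varepsilon_{\mathcal O}$'s form a consistent family. Clearly $\mathcal A_0$ is a Möbius covariant subnet, as the unitary Möbius representation is generated by modular unitary one-parameter subgroups. In particular $\mathcal A_0$ is a factor. For similar reasons there exists a normal faithful expectation from $\mathcal A_0(\mathcal O)$ to $\mathcal A_+(I_+)$ and $\mathcal A_-(I_-)$.
Since $\mathcal A_+(I_+)$ and $\mathcal A_-(I_-)$ are commuting factors and $\mathcal A_0(\mathcal O)$ is a factor, it follows that $\mathcal A_0(\mathcal O)$ is naturally isomorphic to $\mathcal A_+(I_+) \otimes \mathcal A_-(I_-)$.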
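Thus we may identify $\mathcal H_+ \otimes \mathcal H_-$ with $\mathcal H_0 \equiv \overline{\mathcal A_0(\mathcal O)\Omega}$ and $\mathcal A_+(I_+) \otimes \mathcal A_-(I_-)$ with $\mathcal A_0(\mathcal O)$. Now suppose that $\mathcal A$ is conformal. The following corollary is immediate.

Corollary 2.4. If $\mathcal A$ is conformal then $\mathcal A_0 \equiv \mathcal A_+ \otimes \mathcal A_-$ is also conformal; moreover $\mathcal A_0$ extends to a local $\mathrm{Diff}(S^1) \times \mathrm{Diff}(S^1)$-covariant net on the 2-torus, namely $\mathcal A_\pm$ are local conformal nets on $S^1$.

Assuming $\mathcal A$ to be conformal we set
$$\mathrm{Vir}_+(I) \equiv \{U(g) : g \in \mathrm{Diff}(I) \times \{\mathrm{id}\}\}'', \qquad I \subset L_+, \tag{6}$$
$$\mathrm{Vir}_-(I) \equiv \{U(g) : g \in \{\mathrm{id}\} \times \mathrm{Diff}(I)\}'', \qquad I \subset L_-, \tag{7}$$
$$\mathrm{Vir}(\mathcal O) \equiv \mathrm{Vir}_+(I_+) \vee \mathrm{Vir}_-(I_-), \qquad I_\pm \subset L_\pm. \tag{8}$$

Proposition 2.5. $\mathrm{Vir}_\pm(I) \subset \mathcal A_\pm(I)$, $I \subset L_\pm$, and $\mathrm{Vir}_+(I_+) \vee \mathrm{Vir}_-(I_-)$ is naturally isomorphic to $\mathrm{Vir}_+(I_+) \otimes \mathrm{Vir}_-(I_-)$, $I_\pm \subset L_\pm$. $\mathrm{Vir}_\pm$ is the Virasoro subnet of $\mathcal A_\pm$.

Proof. Given an interval $I_+ \subset L_+$ and $g \in \mathrm{Diff}(I_+) \times \{\mathrm{id}\}$ then, by property (4) and Haag duality on $E$ (Prop. 2.2), $U(g)$ belongs to $\mathcal A(\mathcal O)$, where $\mathcal O = I_+ \times I_-$ is a double cone, for all intervals $I_- \subset L_-$. Hence, by definition (5), $U(g) \in \mathcal A_+(I_+)$. So $\mathrm{Vir}_+(I_+) \subset \mathcal A_+(I_+)$ and there is an analogous containment $\mathrm{Vir}_- \subset \mathcal A_-$. By Prop. 2.3 we then have a natural isomorphism $\mathrm{Vir}_+(I_+) \vee \mathrm{Vir}_-(I_-) \cong \mathrm{Vir}_+(I_+) \otimes \mathrm{Vir}_-(I_-)$. The last statement is immediate because the restriction of $U$ to $\mathcal H_\pm$ implements the covariance unitary representation for $\mathcal A_\pm$.

The central charge of $\mathrm{Vir}_\pm$ is denoted by $c_\pm$ and is called the central charge of $\mathcal A$. In our case $c_+ = c_-$ and we then refer to the common value $c$ of $c_\pm$ as the central charge of $\mathcal A$. In this paper we shall use only the a priori weaker form of conformal covariance given by the above proposition. Indeed we shall just need that $\mathcal A_\pm$ are conformal nets on $S^1$, with central charge less than one.

Proposition 2.6. For every double cone $\mathcal O$, $\mathrm{Vir}(M)' \cap \mathcal A(\mathcal O) = \mathbb C$ (the coset net is trivial). If $c_\pm < 1$ then $\mathrm{Vir}(\mathcal O) \subset \mathcal A(\mathcal O)$ is an irreducible inclusion with finite Jones index.

Proof. The proof is analogous to the one of [31, Prop. 3.5]. The second statement follows because $\mathrm{Vir}(\mathcal O)$ is completely rational if $c_\pm < 1$, see [31] and next section.

Indeed, by a recent result by Köster [34], the local irreducibility $\mathrm{Vir}(\mathcal O)' \cap \mathcal A(\mathcal O) = \mathbb C$ holds without assuming $c_\pm < 1$, but we do not need this in our paper. Thus the left and right mover subalgebras $\mathcal A_\pm$ are rich, and our problem is to classify the extensions of $\mathcal A_0$, indeed of $\mathrm{Vir}$.

It is easy to see that $\mathcal A_0$ is the unique maximal chiral subnet of $\mathcal A$, namely it coincides with the subnet $\mathcal A_L^{\max} \otimes \mathcal A_R^{\max}$ in Rehren's work [47, 48]. That is to say $\mathcal A_L^{\max}(\mathcal O) \otimes 1 = \mathcal A(\mathcal O) \cap U(\{\mathrm{id}\} \times PSL(2,\mathbb R))'$ and similarly for $\mathcal A_R^{\max}$.
Indeed $\mathcal A_L^{\max} \otimes \mathcal A_R^{\max}$, being chiral, is clearly contained in $\mathcal A_0$; on the other hand $\mathcal A_+$ commutes with $U(\mathrm{id} \times PSL(2,\mathbb R))$ so $\mathcal A_+ \subset \mathcal A_L^{\max}$ and analogously $\mathcal A_- \subset \mathcal A_R^{\max}$.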
Classification of 2D Local Conformal Nets
71
2.1. Complete rationality. Let $\mathcal{A}$ be a local conformal net on the two-dimensional Minkowski spacetime $M$. We shall say that $\mathcal{A}$ is completely rational if the following three conditions hold:

a) Haag duality on $M$. For any double cone $\mathcal{O}$ we have $\mathcal{A}(\mathcal{O}) = \mathcal{A}(\mathcal{O}')'$. Here $\mathcal{O}'$ is the causal complement of $\mathcal{O}$ in $M$.

b) Split property. If $\mathcal{O}_1, \mathcal{O}_2 \in \mathcal{K}$ and the closure $\bar{\mathcal{O}}_1$ of $\mathcal{O}_1$ is contained in $\mathcal{O}_2$, the natural map $\mathcal{A}(\mathcal{O}_1) \cdot \mathcal{A}(\mathcal{O}_2)' \to \mathcal{A}(\mathcal{O}_1) \odot \mathcal{A}(\mathcal{O}_2)'$ extends to a normal isomorphism $\mathcal{A}(\mathcal{O}_1) \vee \mathcal{A}(\mathcal{O}_2)' \to \mathcal{A}(\mathcal{O}_1) \otimes \mathcal{A}(\mathcal{O}_2)'$.

c) Finite µ-index. Let $E = \mathcal{O}_1 \cup \mathcal{O}_2 \subset M$ be the union of two double cones $\mathcal{O}_1, \mathcal{O}_2$ such that $\bar{\mathcal{O}}_1$ and $\bar{\mathcal{O}}_2$ are spacelike separated. Then the Jones index $[\mathcal{A}(E')' : \mathcal{A}(E)]$ is finite. This index is denoted by $\mu_{\mathcal{A}}$, the µ-index of $\mathcal{A}$.

The notion of complete rationality has been introduced and studied in [32] for a local net $\mathcal{C}$ on $\mathbb{R}$. If $\mathcal{C}$ is conformal, the definition of complete rationality strictly parallels the above one in the two-dimensional case. In general, the one-dimensional version of the above three conditions must be supplemented by the following two conditions:

d) Strong additivity. If $I_1, I_2 \subset \mathbb{R}$ are open intervals and $I$ is the interior of $\overline{I_1 \cup I_2}$, then $\mathcal{C}(I) = \mathcal{C}(I_1) \vee \mathcal{C}(I_2)$.

e) Modular PCT symmetry. There is a vector $\Omega$, cyclic and separating for all the $\mathcal{C}(I)$'s, such that if $a \in \mathbb{R}$ the modular conjugation $J$ of $(\mathcal{C}(a, \infty), \Omega)$ satisfies $J \mathcal{C}(I) J = \mathcal{C}(-I + 2a)$ for all intervals $I$.

If $\mathcal{C}$ is conformal, then d) and e) follow from a), b), c). In any case, conditions a) to e) together have strong consequences on the structure of the net [32]. In particular
\[ \mu_{\mathcal{C}} = \sum_i d(\rho_i)^2, \]
where the $\rho_i$ form a system of irreducible sectors of $\mathcal{C}$.

Returning to the two-dimensional local conformal net $\mathcal{A}$, consider the time-zero net $\mathcal{C}(I) \equiv \mathcal{A}(\mathcal{O})$, where $I$ is an interval of the $t = 0$ line in $M$ and $\mathcal{O} = I''$ is the double cone with basis $I$. Note that $\mathcal{C}$ is local but not conformal (positivity of energy does not hold). However $\mathcal{C}$ inherits all properties from a) to e) from $\mathcal{A}$. Thus we may define $\mathcal{A}$ to be completely rational by requiring $\mathcal{C}$ to be completely rational. In this way all results in [32] immediately apply to the two-dimensional context.

3. Modular Invariance and µ-Index of a Net

Rehren raised a question in [49, p. 351, lines 8–13] about modular invariants arising from a decomposition of a two-dimensional net and its µ-index. Müger has then solved the problem affirmatively in [41]. We recall some notions and results necessary for our work here.

In [47–49], Rehren studied a 2-dimensional local conformal quantum field theory $\mathcal{B}(\mathcal{O})$ which irreducibly extends a given pair of chiral theories $\mathcal{A} = \mathcal{A}_L \otimes \mathcal{A}_R$. That is, the mathematical structure studied there is an irreducible inclusion of nets, $\mathcal{A}_L(I) \otimes \mathcal{A}_R(J) \subset \mathcal{B}(\mathcal{O})$, where $I, J$ are light ray intervals and $\mathcal{O}$ is a double cone $I \times J$. Note that here $\mathcal{A}_L$ and $\mathcal{A}_R$ can be distinct. For such an extension, we decompose the dual canonical endomorphism $\theta$ on $\mathcal{A}_L \otimes \mathcal{A}_R$ as
\[ \theta = \bigoplus_{ij} Z_{ij}\, \alpha_i^L \otimes \alpha_j^R, \]
where $\{\alpha_i^L\}_i$ and $\{\alpha_j^R\}_j$ are systems of irreducible DHR endomorphisms of $\mathcal{A}_L$ and $\mathcal{A}_R$, respectively. The matrix $Z = (Z_{ij})$ is called a coupling matrix. The two nets $\mathcal{A}_L$ and $\mathcal{A}_R$ define S- and T-matrices, $S_L, T_L, S_R, T_R$, respectively, as in [46]. We are interested in the case where the S-matrices are invertible. (By the results in [32], this invertibility, which is called non-degeneracy of the braiding, holds if the nets are completely rational in the sense of [32].) Then Rehren considered when the following two intertwining relations hold:
\[ T_L Z = Z T_R, \qquad S_L Z = Z S_R. \tag{9} \]
Note that if $\mathcal{A}_L = \mathcal{A}_R$ and the non-degeneracy of the braiding holds, this condition implies the usual modular invariance of $Z$. (We always have $Z_{00} = 1$ and $Z_{ij} \in \{0, 1, 2, \dots\}$.) He considered natural situations where the above equalities (9) hold, but also pointed out that they are not necessarily valid in general by showing a very easy counter-example to the intertwining property (9). He then continues as follows. “A possible criterium to exclude models like the counter examples, and hopefully to enforce the intertwining property, could be that the local 2D theory B does not possess nontrivial superselection sectors, but I have no proof that this condition indeed has the desired consequences.”

Müger [41] has proved that this triviality of the superselection structures is indeed sufficient (and necessary) for the intertwining property (9), when the nets $\mathcal{A}_L$ and $\mathcal{A}_R$ are completely rational.

Theorem 3.1 (Müger [41]). Under the above conditions, the following are equivalent.
1. The net $\mathcal{B}$ has only the trivial superselection sector.
2. The µ-index $\mu_{\mathcal{B}}$ is 1.
3. The matrix $Z$ has the intertwining property (9),
\[ T_L Z = Z T_R, \qquad S_L Z = Z S_R. \]
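To make the intertwining property (9) concrete in the case $\mathcal{A}_L = \mathcal{A}_R$, the following minimal sketch checks it numerically for $SU(2)_k$. The explicit formulas for $S$ and $T$ are the standard ones for $SU(2)_k$ and are an assumption of this sketch (the text only refers to [46] for their definition); all function names are hypothetical. It tests the identity matrix, the $D_4$-type matrix with $Z_{00} = Z_{04} = Z_{40} = Z_{44} = 1$, $Z_{22} = 2$ at level $k = 4$, and a matrix that couples two labels with different conformal weights.

```python
import cmath
import math

def su2_S(k):
    """S-matrix of SU(2)_k with labels 0..k (assumed standard formula)."""
    n = k + 2
    return [[math.sqrt(2.0 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def su2_T(k):
    """Diagonal T-matrix exp(2*pi*i*(h_a - c/24)), h_a = a(a+2)/(4(k+2)), c = 3k/(k+2)."""
    c = 3.0 * k / (k + 2)
    return [[cmath.exp(2j * math.pi * (a * (a + 2) / (4.0 * (k + 2)) - c / 24.0))
             if a == b else 0.0
             for b in range(k + 1)] for a in range(k + 1)]

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-9):
    """Entrywise comparison up to a numerical tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def has_intertwining_property(Z, k):
    """Check T Z = Z T and S Z = Z S, i.e. (9) with A_L = A_R = SU(2)_k."""
    S, T = su2_S(k), su2_T(k)
    return close(matmul(T, Z), matmul(Z, T)) and close(matmul(S, Z), matmul(Z, S))

k = 4
identity = [[1 if a == b else 0 for b in range(k + 1)] for a in range(k + 1)]

d4 = [[0] * (k + 1) for _ in range(k + 1)]
for a, b in [(0, 0), (0, 4), (4, 0), (4, 4)]:
    d4[a][b] = 1
d4[2][2] = 2

# Hypothetical non-example: couples labels 0 and 1, whose weights differ by 1/8.
not_invariant = [row[:] for row in identity]
not_invariant[0][1] = 1

print(has_intertwining_property(identity, k))
print(has_intertwining_property(d4, k))
print(has_intertwining_property(not_invariant, k))
```

The third matrix already fails the $T$-relation, since $T Z = Z T$ forces $Z_{ab} = 0$ whenever the diagonal entries $T_{aa}$ and $T_{bb}$ differ.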
In the case where we can naturally identify $\mathcal{A}_L$ and $\mathcal{A}_R$, the above theorem gives a relation between the classification problem of the modular invariants and the classification problem of the local extensions of $\mathcal{A}_L \otimes \mathcal{A}_R$ with µ-index equal to 1.

4. Longo-Rehren Subfactors and 2-Cohomology of a Tensor Category

Let $M$ be a type III factor. We say that a finite subset $\Delta \subset \mathrm{End}(M)$ is a system of endomorphisms of $M$ if the following conditions hold, as in [5, Def. 2.1].
1. Each $\lambda \in \Delta$ is irreducible and has finite statistical dimension.
2. The endomorphisms in $\Delta$ are mutually inequivalent.
3. We have $\mathrm{id}_M \in \Delta$.
4. For any $\lambda \in \Delta$, we have an endomorphism $\bar\lambda \in \Delta$ such that $[\bar\lambda]$ is the conjugate sector of $[\lambda]$.
5. The set $\Delta$ is closed under composition and subsequent irreducible decomposition, i.e., for any $\lambda, \mu \in \Delta$, we have non-negative integers $N_{\lambda,\mu}^{\nu}$ with $[\lambda][\mu] = \sum_{\nu \in \Delta} N_{\lambda,\mu}^{\nu} [\nu]$ as sectors.

Two typical examples of systems of endomorphisms are as follows. First, if we have a subfactor $N \subset M$ with finite index, then consider representatives of unitary equivalence classes of irreducible endomorphisms appearing in irreducible decompositions of powers $\gamma^n$ of the canonical endomorphism $\gamma$ for the subfactor. If the set of representatives is finite, that is, if the subfactor is of finite depth, then we obtain a finite system of endomorphisms. Second, if we have a local conformal net $\mathcal{A}$ on the circle, we consider representatives of unitary equivalence classes of irreducible DHR endomorphisms of this net. If the set of representatives is finite, that is, if the net is rational, then we obtain a finite system of endomorphisms of $M = \mathcal{A}(I)$, where $I$ is some fixed interval of the circle.

Recall the definition of a Q-system in [36]. Let $\theta$ be an endomorphism of a type III factor. A triple $(\theta, V, W)$ is called a Q-system if we have the following properties:
\begin{align}
V &\in \mathrm{Hom}(\mathrm{id}, \theta), \tag{10} \\
W &\in \mathrm{Hom}(\theta, \theta^2), \tag{11} \\
V^* V &= 1, \tag{12} \\
W^* W &= 1, \tag{13} \\
V^* W &= \theta(V^*) W \in \mathbb{R}_+, \tag{14} \\
W^2 &= \theta(W) W, \tag{15} \\
\theta(W^*) W &= W W^*. \tag{16}
\end{align}

Actually, it has been proved in [39] that Condition (16) is redundant. (It has been also proved in [27] that Condition (15) is redundant if (16) is assumed.) In this case, $\theta$ is a canonical endomorphism of a certain subfactor of the original factor.

For a finite system $\Delta$ as above, Longo and Rehren constructed a subfactor $M \otimes M^{\mathrm{opp}} \subset R$ in [38, Prop. 4.10] such that the dual canonical endomorphism has a decomposition $\theta = \bigoplus_{\lambda \in \Delta} \lambda \otimes \lambda^{\mathrm{opp}}$, by explicitly writing down a Q-system $(\theta, V, W)$. We, however, could have an inequivalent Q-system for the same dual canonical endomorphism $\theta$. (We say that two Q-systems $(\theta, V_1, W_1)$ and $(\theta, V_2, W_2)$ are equivalent if we have a unitary $u \in \mathrm{Hom}(\theta, \theta)$ satisfying
\[ V_2 = u V_1, \qquad W_2 = u\, \theta(u)\, W_1 u^*. \]
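This equivalence of Q-systems is equivalent to inner conjugacy of the corresponding subfactors [27].) We study this problem of uniqueness of the Q-systems below.

Classification of Q-systems for a given dual canonical endomorphism was studied as a subfactor analogue of 2-cohomology of a group in [27]. We show that for a Longo-Rehren Q-system, we naturally have a 2-cohomology group of a tensor category, while 2-cohomology in [27] is not a group in general.

Suppose we have a family $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$ with $C_{\lambda\mu} \in \mathrm{Hom}(\lambda\mu, \lambda\mu)$. An intertwiner $C_{\lambda\mu}$ naturally defines an operator $C_{\lambda\mu}^{\nu} \in \mathrm{End}(\mathrm{Hom}(\nu, \lambda\mu))$ for $\nu \in \Delta$ by composition from the left. For $\lambda, \mu, \nu, \pi \in \Delta$, we have a decomposition
\[ \mathrm{Hom}(\pi, \lambda\mu\nu) = \bigoplus_{\sigma \in \Delta} \mathrm{Hom}(\sigma, \lambda\mu) \otimes \mathrm{Hom}(\pi, \sigma\nu). \]
We have
\[ \bigoplus_{\sigma \in \Delta} C_{\lambda\mu}^{\sigma} \otimes C_{\sigma\nu}^{\pi} \in \mathrm{End}(\mathrm{Hom}(\pi, \lambda\mu\nu)) \]
according to this decomposition. We similarly have
\[ \bigoplus_{\tau \in \Delta} C_{\lambda\tau}^{\pi} \otimes C_{\mu\nu}^{\tau} \in \mathrm{End}(\mathrm{Hom}(\pi, \lambda\mu\nu)) \]
based on the last expression of the decompositions
\[ \mathrm{Hom}(\pi, \lambda\mu\nu) \cong \bigoplus_{\tau \in \Delta} \mathrm{Hom}(\pi, \lambda\tau) \otimes \lambda(\mathrm{Hom}(\tau, \mu\nu)) \cong \bigoplus_{\tau \in \Delta} \mathrm{Hom}(\pi, \lambda\tau) \otimes \mathrm{Hom}(\tau, \mu\nu). \]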
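We now consider the following conditions.

Definition 4.1. We say that a family $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$ is a unitary 2-cocycle of $\Delta$, if the following conditions hold:
1. For $\lambda, \mu \in \Delta$, each $C_{\lambda\mu}$ is a unitary operator in $\mathrm{Hom}(\lambda\mu, \lambda\mu)$.
2. For $\lambda \in \Delta$, we have $C_{\lambda\,\mathrm{id}} = 1$ and $C_{\mathrm{id}\,\lambda} = 1$.
3. For $\lambda, \mu, \nu, \pi \in \Delta$, we have
\[ \bigoplus_{\sigma \in \Delta} C_{\lambda\mu}^{\sigma} \otimes C_{\sigma\nu}^{\pi} = \bigoplus_{\tau \in \Delta} C_{\lambda\tau}^{\pi} \otimes C_{\mu\nu}^{\tau} \]
as an identity in $\mathrm{End}(\mathrm{Hom}(\pi, \lambda\mu\nu))$ with respect to the above decompositions of $\mathrm{Hom}(\pi, \lambda\mu\nu)$.

We always assume unitarity for $C_{\lambda\mu}$ in this paper, so we simply say a 2-cocycle for a unitary 2-cocycle. For a 2-cocycle $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$, we define $C_{\lambda\mu\nu}^{\pi} \in \mathrm{End}(\mathrm{Hom}(\pi, \lambda\mu\nu))$ by
\[ \bigoplus_{\sigma \in \Delta} C_{\lambda\mu}^{\sigma} \otimes C_{\sigma\nu}^{\pi}. \]
Similarly, we can define
\[ C_{\lambda_1 \lambda_2 \cdots \lambda_n}^{\mu_1 \mu_2 \cdots \mu_m} \in \mathrm{End}(\mathrm{Hom}(\mu_1 \mu_2 \cdots \mu_m, \lambda_1 \lambda_2 \cdots \lambda_n)). \]
(Note that well-definedness follows from Condition 3 in Definition 4.1.) In this notation, we have $C_{\lambda\mu}^{\lambda\mu} \in \mathrm{End}(\mathrm{Hom}(\lambda\mu, \lambda\mu))$ and this endomorphism is given as the left multiplication of $C_{\lambda\mu} \in \mathrm{Hom}(\lambda\mu, \lambda\mu)$ on $\mathrm{Hom}(\lambda\mu, \lambda\mu)$, where the product structure on $\mathrm{Hom}(\lambda\mu, \lambda\mu)$ is given by composition. In this way, we can identify $C_{\lambda\mu}^{\lambda\mu} \in \mathrm{End}(\mathrm{Hom}(\lambda\mu, \lambda\mu))$ and $C_{\lambda\mu} \in \mathrm{Hom}(\lambda\mu, \lambda\mu)$.

We next consider a strict $C^*$-tensor category $\mathcal{T}$, with conjugates, subobjects, and direct sums, whose objects are given as finite direct sums of endomorphisms in $\Delta$. We then study an automorphism $\Phi$ of $\mathcal{T}$ such that $\Phi(\lambda)$ and $\lambda$ are unitarily equivalent for all objects $\lambda$ in $\mathcal{T}$. For all $\lambda \in \Delta$, we choose a unitary $u_\lambda$ with $\Phi(\lambda) = \mathrm{Ad}(u_\lambda) \cdot \lambda$. By adjusting $\Phi$ with $(\mathrm{Ad}(u_\lambda))_{\lambda \in \Delta}$, we may and do assume that $\Phi(\lambda) = \lambda$. Then such an automorphism $\Phi$ gives a family of automorphisms
\[ \Phi_{\lambda_1 \lambda_2 \cdots \lambda_n}^{\mu_1 \mu_2 \cdots \mu_m} \in \mathrm{Aut}(\mathrm{Hom}(\mu_1 \mu_2 \cdots \mu_m, \lambda_1 \lambda_2 \cdots \lambda_n)), \]
for $\lambda_1, \lambda_2, \dots, \lambda_n, \mu_1, \mu_2, \dots, \mu_m \in \Delta$, with the compatibility condition
\[ \Phi_{\lambda_1 \lambda_2 \cdots \lambda_n}^{\nu_1 \nu_2 \cdots \nu_k} = \bigoplus_{\mu_1, \mu_2, \dots, \mu_m \in \Delta} \Phi_{\lambda_1 \lambda_2 \cdots \lambda_n}^{\mu_1 \mu_2 \cdots \mu_m} \otimes \Phi_{\mu_1 \mu_2 \cdots \mu_m}^{\nu_1 \nu_2 \cdots \nu_k} \]
on the decomposition
\[ \mathrm{Hom}(\nu_1 \nu_2 \cdots \nu_k, \lambda_1 \lambda_2 \cdots \lambda_n) = \bigoplus_{\mu_1, \mu_2, \dots, \mu_m \in \Delta} \mathrm{Hom}(\mu_1 \mu_2 \cdots \mu_m, \lambda_1 \lambda_2 \cdots \lambda_n) \otimes \mathrm{Hom}(\nu_1 \nu_2 \cdots \nu_k, \mu_1 \mu_2 \cdots \mu_m). \]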
It is clear that a family $(C_{\lambda_1 \lambda_2 \cdots \lambda_n}^{\mu_1 \mu_2 \cdots \mu_m})$ arising from a 2-cocycle $(C_{\lambda\mu})$ is an automorphism of a tensor category in this sense. Conversely, suppose that we have an automorphism $\Phi$ of a tensor category acting on objects trivially as above. Then using the isomorphism
\[ \mathrm{Hom}(\lambda\mu, \lambda\mu) \cong \bigoplus_{\nu \in \Delta} \mathrm{Hom}(\nu, \lambda\mu) \otimes \mathrm{Hom}(\lambda\mu, \nu), \]
the family $(\Phi_{\lambda\mu}^{\nu})$ gives a unitary intertwiner in $\mathrm{Hom}(\lambda\mu, \lambda\mu)$. We denote this intertwiner by $C_{\lambda\mu}$ and then it is clear that the family $(C_{\lambda\mu})$ gives a 2-cocycle in the above sense. Thus in this correspondence, we can identify a 2-cocycle on $\Delta$ and an automorphism of the tensor category arising from $\Delta$ that fixes each object in the category. We now have the following definition.

Definition 4.2. (1) We say that 2-cocycles $(C_{\lambda\mu})_{\lambda\mu}$ and $(C'_{\lambda\mu})_{\lambda\mu}$ are equivalent if we have a family $(\omega_\lambda)_\lambda$ of scalars of modulus 1 such that
\[ C_{\lambda\mu}^{\nu} = \frac{\omega_\nu}{\omega_\lambda \omega_\mu}\, C'^{\,\nu}_{\lambda\mu} \in \mathrm{End}(\mathrm{Hom}(\nu, \lambda\mu)). \]
If a 2-cocycle $(C_{\lambda\mu})_{\lambda\mu}$ is equivalent to $(1)_{\lambda\mu}$, then we say that it is trivial.
(2) We say that a 2-cocycle $(C_{\lambda\mu})_{\lambda\mu}$ is scalar-valued if all $C_{\lambda\mu}^{\nu}$'s are scalar operators on $\mathrm{Hom}(\lambda\mu, \nu)$.
(3) We say that an automorphism $\Phi$ of the tensor category as above is trivial if we have a family $(\omega_\lambda)_\lambda$ of scalars of modulus 1 satisfying
\[ \Phi_{\lambda_1 \lambda_2 \cdots \lambda_n}^{\mu_1 \mu_2 \cdots \mu_m} = \frac{\omega_{\mu_1} \cdots \omega_{\mu_m}}{\omega_{\lambda_1} \cdots \omega_{\lambda_n}}. \]

Note that if a 2-cocycle is trivial, then it is scalar-valued, in particular.

We now recall the definition of the Longo-Rehren subfactor [38, Prop. 4.10] as follows. (See [40, 43, 44] for related or more general definitions.) Let $\Delta = \{\lambda_k \mid k = 0, 1, \dots, n\}$ be a finite system of endomorphisms of a type III factor $M$ where $\lambda_0 = \mathrm{id}$. We choose a system $\{V_k \mid k = 0, 1, \dots, n\}$ of isometries with $\sum_{k=0}^{n} V_k V_k^* = 1$ in the factor $M \otimes M^{\mathrm{opp}}$, where $M^{\mathrm{opp}}$ is the opposite algebra of $M$ and we denote the anti-linear isomorphism from $M$ onto $M^{\mathrm{opp}}$ by $j$. Then we set
\[ \rho(x) = \sum_{k=0}^{n} V_k \big( (\lambda_k \otimes \lambda_k^{\mathrm{opp}})(x) \big) V_k^*, \]
for $x \in M \otimes M^{\mathrm{opp}}$, where $\lambda^{\mathrm{opp}} = j \cdot \lambda \cdot j^{-1}$. We set $V = V_0 \in \mathrm{Hom}(\mathrm{id}, \rho)$ and define $W \in \mathrm{Hom}(\rho, \rho^2)$ as follows:
\[ W = \sum_{k,l,m=0}^{n} \sqrt{\frac{d_k d_l}{w\, d_m}}\; V_k (\lambda_k \otimes \lambda_k^{\mathrm{opp}})(V_l)\, T_{kl}^{m}\, V_m^*, \]
where $d_k$ is the statistical dimension of $\lambda_k$, $w$ is the global index of the system, $w = \sum_{k=0}^{n} d_k^2$, and
\[ T_{kl}^{m} = \sum_{i=1}^{N_{kl}^{m}} (T_{kl}^{m})_i \otimes j((T_{kl}^{m})_i). \]
Here $N_{kl}^{m}$ is the structure constant, $\dim \mathrm{Hom}(\lambda_m, \lambda_k \lambda_l)$, and $\{(T_{kl}^{m})_i \mid i = 1, 2, \dots, N_{kl}^{m}\}$ is a fixed orthonormal basis of $\mathrm{Hom}(\lambda_m, \lambda_k \lambda_l)$. Note that the operator $T_{kl}^{m}$ does not depend on the choice of the orthonormal basis. Proposition 4.10 in [38] says that the triple $(\rho, V, W)$ is a Q-system. Thus we have a subfactor $M \otimes M^{\mathrm{opp}} \subset R$ with index $w$ corresponding to the dual canonical endomorphism $\rho$. We call this a Longo-Rehren subfactor arising from the system $\Delta$. Furthermore, if $\Delta$ is a subsystem of all the irreducible DHR endomorphisms of a local conformal net $\mathcal{A}$, then any Q-system having this dual canonical endomorphism gives an extension $\mathcal{B} \supset \mathcal{A} \otimes \mathcal{A}^{\mathrm{opp}}$. This 2-dimensional net $\mathcal{B}$ is local if and only if $\varepsilon(\rho, \rho) W = W$ by [38, Prop. 4.10], where $\varepsilon$ is the braiding. In general, if the system $\Delta$ has a braiding $\varepsilon$, and this condition $\varepsilon(\rho, \rho) W = W$ holds, we say that the Q-system $(\rho, V, W)$ satisfies locality.

We now would like to characterize a general Q-system having the same dual canonical endomorphism $\rho$. First, we have the following simple lemma.
Lemma 4.3. Let $F, F'$ be finite dimensional complex Hilbert spaces and $j$ an anti-linear isomorphism from $F$ onto $F'$. For any vector $\xi \in F \otimes F'$, we define a linear map $A : F \to F$ by $\xi = \sum_k A\xi_k \otimes j(\xi_k)$, where $\{\xi_k\}$ is an orthonormal basis of $F$. Then this linear map $A$ is independent of the choice of the orthonormal basis $\{\xi_k\}$.

Proof. This is straightforward by the anti-isomorphism property of $j$.
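The basis independence in Lemma 4.3 can be checked numerically in a small case. The sketch below is only an illustration under stated assumptions: it takes $F = F' = \mathbb{C}^2$ and $j$ equal to entrywise complex conjugation (both choices are made here and are not from the text), fixes a map $A$, and builds $\xi = \sum_k A\xi_k \otimes j(\xi_k)$ from two different orthonormal bases. Both bases give the same vector $\xi$, so the map $A$ recovered from $\xi$ does not depend on the basis. All names are hypothetical.

```python
import cmath
import math

def orthonormal_basis(t, phi):
    """Two orthonormal vectors of C^2 (the columns of a 2x2 unitary matrix)."""
    u1 = [math.cos(t), math.sin(t)]
    u2 = [-cmath.exp(1j * phi) * math.sin(t), cmath.exp(1j * phi) * math.cos(t)]
    return [u1, u2]

def apply(A, v):
    """Matrix A applied to the vector v."""
    return [sum(A[i][m] * v[m] for m in range(len(v))) for i in range(len(A))]

def tensor_from_map(A, basis):
    """Coefficients X[a][b] of xi = sum_k A(xi_k) (x) j(xi_k), j = complex conjugation."""
    n = len(A)
    X = [[0j] * n for _ in range(n)]
    for xi_k in basis:
        A_xi = apply(A, xi_k)
        for a in range(n):
            for b in range(n):
                X[a][b] += A_xi[a] * xi_k[b].conjugate()
    return X

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a numerical tolerance."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1 + 2j, 0.5], [-1j, 3.0]]                 # an arbitrary linear map on C^2
standard = [[1.0, 0.0], [0.0, 1.0]]
rotated = orthonormal_basis(0.7, 1.3)

xi_standard = tensor_from_map(A, standard)
xi_rotated = tensor_from_map(A, rotated)
print(close(xi_standard, xi_rotated))           # same vector xi from both bases
```

With these choices the coefficient array of $\xi$ is simply the matrix of $A$, because $\sum_k \xi_k \overline{\xi_k}^{\,T}$ is the identity for any orthonormal basis.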
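The next theorem gives our characterization of Q-systems.

Theorem 4.4. Let $\Delta, \rho, V, W$ be as above. If another triple $(\rho, V, W')$ with $W' \in \mathrm{Hom}(\rho, \rho^2)$ is a Q-system, we have a 2-cocycle $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$ such that
\[ W' = \sum_{k,l,m=0}^{n} \sqrt{\frac{d_k d_l}{w\, d_m}}\; V_k (\lambda_k \otimes \lambda_k^{\mathrm{opp}})(V_l)\, (C_{\lambda_k \lambda_l} \otimes 1)\, T_{kl}^{m}\, V_m^*. \tag{17} \]
Conversely, if we have a 2-cocycle $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$, then the triple $(\rho, V, W')$ with $W'$ defined as in (17) is a Q-system. The Q-system $(\rho, V, W')$ is equivalent to the above canonical Q-system $(\rho, V, W)$ if and only if the corresponding 2-cocycle $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$ is trivial, if and only if the corresponding automorphism of the tensor category arising from $\Delta$ is trivial.

Moreover, suppose that the system $\Delta$ has a braiding $\varepsilon^{\pm}$. Then the Q-system $(\rho, V, W')$ satisfies locality if and only if the corresponding 2-cocycle $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$ satisfies the following symmetric condition:
\[ C_{\lambda\mu} = \varepsilon^{-}_{\mu\lambda}\, C_{\mu\lambda}\, \varepsilon^{+}_{\lambda\mu}, \tag{18} \]
for all $\lambda, \mu \in \Delta$. If this symmetric condition holds, the corresponding automorphism of the tensor category arising from $\Delta$ is an automorphism of a braided tensor category.

Proof. If $(\rho, V, W')$ with $W' \in \mathrm{Hom}(\rho, \rho^2)$ is a Q-system, then we have a system of intertwiners $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$ such that identity (17) holds, and the intertwiners $(C_{\lambda\mu})$ are uniquely determined by Lemma 4.3. Expanding both sides of identity (15), we obtain the following identity:
\begin{align}
&\sum_{k,l,m,p,q=0}^{n} \sqrt{\frac{d_k d_l d_m}{w^2 d_p}}\; V_k (\lambda_k \otimes \lambda_k^{\mathrm{opp}})(V_l)\, (\lambda_k \lambda_l \otimes \lambda_k^{\mathrm{opp}} \lambda_l^{\mathrm{opp}})(V_m) \notag \\
&\qquad \times (\lambda_k \otimes \lambda_k^{\mathrm{opp}})\big( (C_{\lambda_l \lambda_m} \otimes 1)\, T_{lm}^{q} \big)\, (C_{\lambda_k \lambda_q} \otimes 1)\, T_{kq}^{p}\, V_p^* \notag \\
&= \sum_{k,l,m,p,r=0}^{n} \sqrt{\frac{d_k d_l d_m}{w^2 d_p}}\; V_k (\lambda_k \otimes \lambda_k^{\mathrm{opp}})(V_l)\, (\lambda_k \lambda_l \otimes \lambda_k^{\mathrm{opp}} \lambda_l^{\mathrm{opp}})(V_m) \notag \\
&\qquad \times (C_{\lambda_k \lambda_l} \otimes 1)\, T_{kl}^{r}\, (C_{\lambda_r \lambda_m} \otimes 1)\, T_{rm}^{p}\, V_p^*. \tag{19}
\end{align}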
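We decompose
\[ \mathrm{Hom}(\lambda_p, \lambda_k \lambda_l \lambda_m) \cong \bigoplus_{q=0}^{n} \mathrm{Hom}(\lambda_p, \lambda_k \lambda_q) \otimes \mathrm{Hom}(\lambda_q, \lambda_l \lambda_m) \cong \bigoplus_{r=0}^{n} \mathrm{Hom}(\lambda_r, \lambda_k \lambda_l) \otimes \mathrm{Hom}(\lambda_p, \lambda_r \lambda_m), \]
as above, and apply Lemma 4.3 to the above identity (19) to obtain Condition 3 in Definition 4.1. Similarly, Condition 2 in Definition 4.1 follows from identity (14).

We next prove unitarity of $C_{\lambda\mu} \in \mathrm{Hom}(\lambda\mu, \lambda\mu)$. First note that the operators $C_{\lambda\bar\lambda}^{\mathrm{id}}$, $C_{\bar\lambda\lambda}^{\mathrm{id}}$ are scalar multiples of the identity because $\mathrm{Hom}(\mathrm{id}, \lambda\bar\lambda)$, $\mathrm{Hom}(\mathrm{id}, \bar\lambda\lambda)$ are both 1-dimensional. Since the triple $(\rho, V, W')$ also satisfies identity (16), we expand both sides of identity (16) and use Lemma 4.3 as in the above arguments. Then we obtain the following: The intertwiner space $\mathrm{Hom}(\lambda\mu, \nu\sigma)$ for $\lambda, \mu, \nu, \sigma \in \Delta$ can be decomposed in two ways as follows:
\begin{align}
\mathrm{Hom}(\lambda\mu, \nu\sigma) &\cong \bigoplus_{\tau \in \Delta} \mathrm{Hom}(\lambda, \nu\tau) \otimes \mathrm{Hom}(\tau\mu, \sigma) \tag{20} \\
&\cong \bigoplus_{\pi \in \Delta} \mathrm{Hom}(\lambda\mu, \pi) \otimes \mathrm{Hom}(\pi, \nu\sigma). \tag{21}
\end{align}
On one hand, Lemma 4.3 applied to the left-hand side of identity (16) produces a map in $\mathrm{End}(\mathrm{Hom}(\lambda\mu, \nu\sigma))$ which maps $T_i \otimes S_j^* \in \mathrm{Hom}(\lambda, \nu\tau) \otimes \mathrm{Hom}(\tau\mu, \sigma)$, identified with $\nu(S_j^*) T_i \in \mathrm{Hom}(\lambda\mu, \nu\sigma)$, to $\nu(S_j^* C_{\tau\mu}^*)\, C_{\nu\tau} T_i \in \mathrm{Hom}(\lambda\mu, \nu\sigma)$, where $T_i$ and $S_j$ are isometries in $\mathrm{Hom}(\lambda, \nu\tau)$ and $\mathrm{Hom}(\sigma, \tau\mu)$, respectively. On the other hand, Lemma 4.3 applied to the right-hand side of identity (16) produces a map in $\mathrm{End}(\mathrm{Hom}(\lambda\mu, \nu\sigma))$ which maps $T_i^* \otimes S_j \in \mathrm{Hom}(\lambda\mu, \pi) \otimes \mathrm{Hom}(\pi, \nu\sigma)$, identified with $S_j T_i^* \in \mathrm{Hom}(\lambda\mu, \nu\sigma)$, to $C_{\nu\sigma} S_j T_i^* C_{\lambda\mu}^* \in \mathrm{Hom}(\lambda\mu, \nu\sigma)$, where $T_i$ and $S_j$ are isometries in $\mathrm{Hom}(\pi, \lambda\mu)$ and $\mathrm{Hom}(\pi, \nu\sigma)$, respectively. These two maps are equal in $\mathrm{End}(\mathrm{Hom}(\lambda\mu, \nu\sigma))$.

In the above decomposition (21), we set $\lambda = \sigma = \mathrm{id}$ and $\mu = \nu$, then we have $\tau = \bar\mu$ and $\pi = \mu$ in the summations. With Frobenius reciprocity as in [25] and the above identity of two maps in $\mathrm{End}(\mathrm{Hom}(\lambda\mu, \nu\sigma))$, we obtain the identity
\[ C_{\mu\bar\mu}^{\mathrm{id}}\, C_{\bar\mu\mu}^{\mathrm{id}} = 1. \tag{22} \]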
We next apply identity (13) to (17) and obtain the following equality:
\[ \sum_{\lambda,\mu \in \Delta} d_\lambda d_\mu K_{\lambda\mu}^{\nu} = w\, d_\nu, \tag{23} \]
where we have set $K_{\lambda\mu}^{\nu} = \mathrm{Tr}((C_{\lambda\mu}^{\nu})^* C_{\lambda\mu}^{\nu})$ and $\mathrm{Tr}$ is the non-normalized trace on $\mathrm{Hom}(\lambda\mu, \lambda\mu)$. Setting $\nu = \mathrm{id}$ in (23), we obtain
\[ \sum_{\lambda \in \Delta} d_\lambda^2\, |C_{\lambda\bar\lambda}^{\mathrm{id}}|^2 = w, \]
which, together with (22), implies $|C_{\lambda\bar\lambda}^{\mathrm{id}}| = 1$ for all $\lambda \in \Delta$.

In the above decomposition (21), we now set $\lambda = \mathrm{id}$, then we have $\tau = \bar\nu$ and $\pi = \mu$ in the summations. With Frobenius reciprocity as in [25] and the above identity of two maps in $\mathrm{End}(\mathrm{Hom}(\lambda\mu, \nu\sigma))$, we obtain the identity
\[ C_{\nu\bar\nu}^{\mathrm{id}}\; \nu\big( (C_{\bar\nu\mu}^{\sigma})^* \tilde T \big)\, R_{\nu\bar\nu} = \sqrt{\frac{d_\mu}{d_\nu d_\sigma}}\; C_{\nu\sigma}^{\mu} T, \tag{24} \]
for all $T \in \mathrm{Hom}(\mu, \nu\sigma)$, where $\tilde T \in \mathrm{Hom}(\bar\nu\mu, \sigma)$ is the Frobenius dual of $T$ and $R_{\nu\bar\nu} \in \mathrm{Hom}(\mathrm{id}, \nu\bar\nu)$ is the canonical isometry. This identity (24), Condition 3 in Definition 4.1, already proved, and identity (22) imply the following identity:
\[ \langle C_{\bar\nu\mu}^{\sigma} T,\; C_{\bar\nu\mu}^{\sigma} S \rangle = (C_{\nu\bar\nu}^{\mathrm{id}})^*\, R_{\bar\nu\nu}^*\, \bar\nu\big( C_{\nu\sigma}^{\mu} \tilde S^* \big)\, C_{\bar\nu\mu}^{\sigma} T = (C_{\nu\bar\nu}^{\mathrm{id}})^*\, C_{\bar\nu\nu}^{\mathrm{id}}\, S^* T = \langle T, S \rangle, \]
where we have $T, S \in \mathrm{Hom}(\sigma, \bar\nu\mu)$ and the inner product is given by $\langle T, S \rangle = S^* T \in \mathbb{C}$. This is the desired unitarity of $C_{\bar\nu\mu}$. The converse also holds in the same way and the remaining parts are straightforward.

It is easy to see that we can multiply 2-cocycles and the multiplication on the equivalence classes of 2-cocycles is well-defined. In this way, we obtain a group and this is called the 2-cohomology group of $\Delta$ (or of the corresponding tensor category). It is also easy to see that the multiplication gives the composition of the corresponding automorphisms of the tensor category.

The part of the above theorem on a bijective correspondence between Q-systems $(\rho, V, W')$ with locality and automorphisms of the braided tensor category has been also announced by Müger in [41].
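When every endomorphism in the system is an automorphism (the group case taken up in Example 4.5), Condition 3 of Definition 4.1 reduces to the usual group 2-cocycle identity $c(g,h)\,c(gh,k) = c(g,hk)\,c(h,k)$. The following minimal sketch checks this identity for one sample cocycle. The group $G = \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ and the function $c(g,h) = (-1)^{g_1 h_2}$ are choices made here for illustration, and all names are hypothetical. Since a trivial cocycle $\omega_{gh}/(\omega_g \omega_h)$ on an abelian group is automatically symmetric, the failure of symmetry shows that this cocycle is not trivial.

```python
from itertools import product

# Elements of G = Z/2Z x Z/2Z as pairs (g1, g2); the identity is (0, 0).
G = list(product((0, 1), repeat=2))

def mul(g, h):
    """Group law of Z/2Z x Z/2Z (componentwise addition mod 2)."""
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def c(g, h):
    """Sample 2-cocycle (an assumption of this sketch): c(g, h) = (-1)^(g1 * h2)."""
    return (-1) ** (g[0] * h[1])

def is_normalized(c):
    """Analogue of Condition 2: the cocycle is 1 when one argument is the identity."""
    e = (0, 0)
    return all(c(e, g) == 1 and c(g, e) == 1 for g in G)

def is_cocycle(c):
    """Analogue of Condition 3: c(g,h) c(gh,k) = c(g,hk) c(h,k) for all g, h, k."""
    return all(c(g, h) * c(mul(g, h), k) == c(g, mul(h, k)) * c(h, k)
               for g in G for h in G for k in G)

def is_symmetric(c):
    """Symmetric condition c(g,h) = c(h,g), which every trivial cocycle on G satisfies."""
    return all(c(g, h) == c(h, g) for g in G for h in G)

print(is_normalized(c), is_cocycle(c), is_symmetric(c))
```

The check is exhaustive over the 64 triples of group elements, so no numerical tolerance is involved.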
Example 4.5. If all the endomorphisms in $\Delta$ are automorphisms, then the fusion rules determine a finite group $G$. It is easy to see that the Longo-Rehren Q-system gives a crossed product by an outer action of $G$ and the above 2-cohomology group for $\Delta$ is isomorphic to the usual 2-cohomology group of $G$. Furthermore, if the system $\Delta$ has a braiding, then the group $G$ is abelian. In this case, the symmetric condition of a cocycle means $c_{g,h} = c_{h,g}$ for the corresponding usual 2-cocycle $c$ of the finite abelian group $G$. It is well-known that such a 2-cocycle is trivial. (See [1, Lemma 3.4.2], for example.)

When all the 2-cocycles for $\Delta$ are trivial, we say that we have a 2-cohomology vanishing for $\Delta$. Thus, 2-cohomology vanishing implies uniqueness of the Longo-Rehren subfactor in the following sense.

Corollary 4.6. Let $\Delta$ be as above. If we have a 2-cohomology vanishing for $\Delta$ and $\rho = \bigoplus_{\lambda \in \Delta} \lambda \otimes \lambda^{\mathrm{opp}}$ is a dual canonical endomorphism for a subfactor $M \otimes M^{\mathrm{opp}} \subset P$, then this subfactor is inner conjugate to the Longo-Rehren subfactor $M \otimes M^{\mathrm{opp}} \subset R$.

5. 2-Cohomology Vanishing and Classification

In this section, we first study a general theory of 2-cohomology for a $C^*$-tensor category and then apply it to the tensor categories related to the Virasoro algebra.

We consider a strict $C^*$-tensor category $\mathcal{T}$ (with conjugates, subobjects, and direct sums) in the sense of [13, 39] and we assume that we have only finitely many equivalence classes of irreducible objects in $\mathcal{T}$ and that each object has a decomposition into a finite direct sum of irreducible objects. Such a tensor category is often called rational. We may and do assume that our tensor categories are realized as those of endomorphisms of a type III factor. Choose a system $\Delta$ of endomorphisms of a type III factor $M$ corresponding to the $C^*$-tensor category $\mathcal{T}$. Suppose we have a 2-cocycle $(C_{\lambda\mu})_{\lambda,\mu \in \Delta}$. We introduce some basic notions.

Suppose that we have $\sigma \in \Delta$ such that for any $\lambda \in \Delta$, there exists $k \ge 0$ such that $\lambda \prec \sigma^k$. Then we say that $\sigma$ is a generator of $\Delta$. In the following, we consider only the case $\sigma = \bar\sigma$. In this case, we say that $\Delta$ has a self-conjugate generator $\sigma$.

Suppose $\sigma \in \Delta$ is a self-conjugate generator of $\Delta$. We further assume that for all $\lambda, \mu \in \Delta$, we have $\dim \mathrm{Hom}(\lambda\sigma, \mu) \in \{0, 1\}$. In this case, we say that multiplications by $\sigma$ have no multiplicities.

Take $\lambda_1, \lambda_2, \lambda_3, \lambda_4 \in \Delta$ and assume
\[ \dim \mathrm{Hom}(\lambda_1\sigma, \lambda_2) = \dim \mathrm{Hom}(\sigma\lambda_1, \lambda_3) = \dim \mathrm{Hom}(\lambda_3\sigma, \lambda_4) = \dim \mathrm{Hom}(\sigma\lambda_2, \lambda_4) = 1. \]
Choose isometric intertwiners
\[ T_1 \in \mathrm{Hom}(\lambda_2, \lambda_1\sigma), \quad T_2 \in \mathrm{Hom}(\lambda_4, \sigma\lambda_2), \quad T_3 \in \mathrm{Hom}(\lambda_3, \sigma\lambda_1), \quad T_4 \in \mathrm{Hom}(\lambda_4, \lambda_3\sigma). \]
Then the composition $T_4^* T_3^* \sigma(T_1) T_2$ is in $\mathrm{Hom}(\lambda_4, \lambda_4) = \mathbb{C}$. This value is the connection as in [42, 14, Chapter 9]. We denote this complex number by $W(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$. (Note that this value depends on $T_1, T_2, T_3, T_4$ though they do not appear in the notation.) If all these complex numbers are non-zero, then we say that the connections of $\Delta$ with respect to the generator $\sigma$ are non-zero. This condition is independent of the choices of isometric intertwiners $T_j$'s, because we now assume that multiplications by $\sigma$ have no multiplicities.

Suppose we have a map $g : \Delta \to \mathbb{Z}/2\mathbb{Z}$. For an endomorphism $\sigma$ that is a direct sum of elements $\lambda_j$'s with $g(\lambda_j) = k \in \mathbb{Z}/2\mathbb{Z}$, we also set $g(\sigma) = k$. If we have $g(\lambda\mu) = g(\lambda) + g(\mu)$, then we say that $\Delta$ has a $\mathbb{Z}/2\mathbb{Z}$-grading. An endomorphism $\lambda \in \Delta$ is called even [resp. odd] when $g(\lambda) = 0$ [resp. $g(\lambda) = 1$].

Theorem 5.1. Suppose we have a finite system $\Delta$ of endomorphisms with a self-conjugate generator $\sigma \in \Delta$ satisfying all the following conditions:
1. Multiplications by $\sigma$ have no multiplicities.
2. One of the following holds:
   (a) We have $\sigma \prec \sigma^2$.
   (b) The system $\Delta$ has a $\mathbb{Z}/2\mathbb{Z}$-grading and the generator $\sigma$ is odd.
3. The connections of $\Delta$ with respect to the generator $\sigma$ are non-zero.
4. For any $\lambda, \nu_1, \nu_2 \in \Delta$ with $\nu_1 \prec \sigma^n$, $\nu_2 \prec \sigma^n$, $\lambda \prec \sigma\nu_1$, and $\lambda \prec \sigma\nu_2$, we have $\mu \in \Delta$ with $\mu \prec \sigma^{n-1}$, $\nu_1 \prec \sigma\mu$, and $\nu_2 \prec \sigma\mu$.
Then any 2-cocycle $(C_{\lambda\mu})_{\lambda\mu}$ of $\Delta$ is trivial.

Before presenting a proof, we make a comment on Condition 4. Consider the Bratteli diagram for the higher relative commutants of a subfactor $\sigma(M) \subset M$. We number the steps of the Bratteli diagrams as $0, 1, 2, \dots$. Then Condition 4 says the following. (Recall that $\sigma$ is self-conjugate.) Suppose we have vertices corresponding to $\nu_1$ and $\nu_2$ at the $n$th step of the Bratteli diagrams, and they are connected to the vertex $\lambda$ in the $(n+1)$st step. Then there exists a vertex $\mu$ in the $(n-1)$st step that is connected to $\nu_1$ and $\nu_2$. Note that if $\nu_1$ and $\nu_2$ already appear in the $(n-2)$nd step, then this condition trivially holds by taking $\mu = \lambda$. Thus, if the subfactor $\sigma(M) \subset M$ is of finite depth, then checking finitely many cases is sufficient for verifying Condition 4, and this can be done by drawing the principal graph of the subfactor $\sigma(M) \subset M$.

Proof. Using Conditions 1, 3 and 4, we first prove that the unitary operator $C_{\sigma\sigma\cdots\sigma}^{\lambda} \in \mathrm{End}(\mathrm{Hom}(\lambda, \sigma\sigma\cdots\sigma))$ is scalar for any $\lambda \in \Delta$. Let the number of $\sigma$'s in $C_{\sigma\sigma\cdots\sigma}^{\lambda}$ be $k$; we prove the above property $C_{\sigma\sigma\cdots\sigma}^{\lambda} \in \mathbb{C}$ by induction on $k$. Note that the intertwiner space $\mathrm{Hom}(\lambda, \sigma\sigma\cdots\sigma)$ is decomposed as
\[ \bigoplus_{\lambda_1, \dots, \lambda_{k-2}} \mathrm{Hom}(\lambda_1, \sigma\sigma) \otimes \mathrm{Hom}(\lambda_2, \lambda_1\sigma) \otimes \cdots \otimes \mathrm{Hom}(\lambda, \lambda_{k-2}\sigma), \]
and each of the spaces $\mathrm{Hom}(\lambda_1, \sigma\sigma) \otimes \mathrm{Hom}(\lambda_2, \lambda_1\sigma) \otimes \cdots \otimes \mathrm{Hom}(\lambda, \lambda_{k-2}\sigma)$ is one-dimensional by Condition 1. Each such one-dimensional subspace gives a non-zero eigenvector of the unitary operator $C_{\sigma\sigma\cdots\sigma}^{\lambda}$ with eigenvalue $C_{\sigma\sigma}^{\lambda_1} C_{\lambda_1\sigma}^{\lambda_2} \cdots C_{\lambda_{k-2}\sigma}^{\lambda}$, and what we have to prove is that these eigenvalues are all identical. Note that the decomposition of $\mathrm{Hom}(\lambda, \sigma\sigma\cdots\sigma)$ as above is depicted graphically in Fig. 1. Another picture, Fig. 2, gives another decomposition into a direct sum of one-dimensional eigenspaces.
Fig. 1. Decomposition into a direct sum of one-dimensional eigenspaces

Fig. 2. Decomposition into a direct sum of one-dimensional eigenspaces
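Roughly speaking, what we prove is that if a unitary matrix has several “different” decompositions into direct sums of one-dimensional eigenspaces, then the unitary matrix needs to be a scalar multiple of the identity matrix.

First, let $k = 2$. By Condition 1, the space $\mathrm{Hom}(\lambda, \sigma\sigma)$ is one-dimensional for any $\lambda \in \Delta$, so we obviously have $C_{\sigma\sigma}^{\lambda} \in \mathbb{C}$.

Suppose now we have $C_{\sigma\sigma\cdots\sigma}^{\lambda} \in \mathbb{C}$ for any $\lambda \in \Delta$ if the number of $\sigma$'s is less than or equal to $k$. We will prove $C_{\sigma\sigma\cdots\sigma}^{\lambda} \in \mathbb{C}$ for any $\lambda \in \Delta$ when the number of $\sigma$'s is $k + 1$.

First note that we have $C_{\sigma\sigma\cdots\sigma}^{\lambda} C_{\lambda\sigma}^{\mu} \in \mathbb{C}$ by the induction hypothesis and Condition 1. What we have to prove is that this scalar is independent of $\lambda$ when $\mu$ is fixed. That is, suppose we have $\lambda, \lambda', \mu \in \Delta$, $\lambda \prec \sigma^k$, $\lambda' \prec \sigma^k$, $\mu \prec \lambda\sigma$, $\mu \prec \lambda'\sigma$. We will prove
\[ C_{\sigma\sigma\cdots\sigma}^{\lambda}\, C_{\lambda\sigma}^{\mu} = C_{\sigma\sigma\cdots\sigma}^{\lambda'}\, C_{\lambda'\sigma}^{\mu} \in \mathbb{C}. \]
By Condition 4, there exists $\nu \in \Delta$ such that $\nu \prec \sigma^{k-1}$, $\lambda \prec \sigma\nu$, and $\lambda' \prec \sigma\nu$. Then there exists $\tau \in \Delta$ such that $\tau \prec \nu\sigma$ and $\mu \prec \sigma\tau$. Note that we have
\[ C_{\sigma\sigma\cdots\sigma}^{\lambda}\, C_{\lambda\sigma}^{\mu} = C_{\sigma\nu}^{\lambda}\, C_{\sigma\sigma\cdots\sigma}^{\nu}\, C_{\lambda\sigma}^{\mu} \in \mathbb{C}, \]
where the number of $\sigma$'s in $C_{\sigma\sigma\cdots\sigma}^{\nu}$ is $k - 1$. The scalar $C_{\sigma\nu}^{\lambda} C_{\lambda\sigma}^{\mu}$ is the eigenvalue of the operator $C_{\sigma\nu\sigma}^{\mu}$ corresponding to the eigenvector given by the one-dimensional intertwiner space $\mathrm{Hom}(\lambda, \sigma\nu) \otimes \mathrm{Hom}(\mu, \lambda\sigma)$. Similarly, the scalar $C_{\nu\sigma}^{\tau} C_{\sigma\tau}^{\mu}$ is the eigenvalue of the same operator $C_{\sigma\nu\sigma}^{\mu}$ corresponding to the eigenvector given by the one-dimensional intertwiner space $\mathrm{Hom}(\tau, \nu\sigma) \otimes \mathrm{Hom}(\mu, \sigma\tau)$. Condition 3 implies that these two eigenvectors are not orthogonal, thus the two eigenvalues are equal, because the operator $C_{\sigma\nu\sigma}^{\mu}$ has an orthonormal basis of eigenvectors and thus it is normal. In this way, we obtain the identities
\[ C_{\sigma\nu}^{\lambda}\, C_{\lambda\sigma}^{\mu} = C_{\nu\sigma}^{\tau}\, C_{\sigma\tau}^{\mu} = C_{\sigma\nu}^{\lambda'}\, C_{\lambda'\sigma}^{\mu}, \]
which implies
\[ C_{\sigma\sigma\cdots\sigma}^{\lambda}\, C_{\lambda\sigma}^{\mu} = C_{\sigma\nu}^{\lambda}\, C_{\sigma\sigma\cdots\sigma}^{\nu}\, C_{\lambda\sigma}^{\mu} = C_{\sigma\nu}^{\lambda'}\, C_{\sigma\sigma\cdots\sigma}^{\nu}\, C_{\lambda'\sigma}^{\mu} = C_{\sigma\sigma\cdots\sigma}^{\lambda'}\, C_{\lambda'\sigma}^{\mu} \in \mathbb{C}, \]
as desired, where the numbers of $\sigma$'s in $C_{\sigma\sigma\cdots\sigma}^{\lambda}$, $C_{\sigma\sigma\cdots\sigma}^{\nu}$, and $C_{\sigma\sigma\cdots\sigma}^{\lambda'}$ are $k$, $k - 1$, and $k$, respectively.

We next prove the triviality of the cocycle $C$ by using Condition 2. First we assume we have 2 (a) of the assumptions in the theorem, that is, $\sigma \prec \sigma^2$. Set $\omega_{\mathrm{id}} = 1$. Since $\mathrm{id} \prec \sigma^2$, the condition $C_{\sigma\sigma\sigma}^{\sigma} \in \mathbb{C}$ implies that $C_{\sigma\sigma}^{\sigma} C_{\sigma\sigma}^{\sigma} = C_{\sigma\sigma}^{\mathrm{id}} C_{\mathrm{id}\,\sigma}^{\sigma}$. By unitarity of $C$ in Theorem 4.4, we have $|C_{\sigma\sigma}^{\sigma}| = 1$; we thus set $\omega_\sigma = (C_{\sigma\sigma}^{\sigma})^{-1} \in \mathbb{C}$. (Recall that we have already proved $C_{\sigma\sigma}^{\sigma}$ is a scalar.) Then this implies $C_{\sigma\sigma}^{\mathrm{id}} = \omega_{\mathrm{id}}/\omega_\sigma^2$. For $\lambda \in \Delta$ not equivalent to $\mathrm{id}, \sigma$, we choose a minimum positive integer $k$ with $\lambda \prec \sigma^k$. We set $\omega_\lambda = \omega_\sigma^k C_{\sigma\cdots\sigma}^{\lambda} \in \mathbb{C}$, where the number of $\sigma$'s in $C_{\sigma\cdots\sigma}^{\lambda}$ is $k$. For any $m > k$, we can represent the scalar $C_{\sigma\cdots\sigma}^{\lambda}$, where $\sigma$ appears $m$ times, as $C_{\sigma\sigma}^{\sigma} \cdots C_{\sigma\sigma}^{\sigma} C_{\sigma\cdots\sigma}^{\lambda}$, where the number of $C_{\sigma\sigma}^{\sigma}$'s is $m - k$ and the number of $\sigma$'s in $C_{\sigma\cdots\sigma}^{\lambda}$ is $k$. This implies $C_{\sigma\cdots\sigma}^{\lambda} = \omega_\lambda/\omega_\sigma^m$, where the number of $\sigma$'s in $C_{\sigma\cdots\sigma}^{\lambda}$ is $m$. Now choose arbitrary $\lambda, \mu, \nu \in \Delta$ with $\lambda \prec \sigma^l$, $\mu \prec \sigma^m$. We can represent $C_{\sigma\cdots\sigma}^{\nu} \in \mathbb{C}$ with $\sigma$ appearing $l + m$ times, as the product $C_{\lambda\mu}^{\nu} C_{\sigma\cdots\sigma}^{\lambda} C_{\sigma\cdots\sigma}^{\mu}$, where $\sigma$'s appear $l$ and $m$ times, respectively, and then we obtain
\[ \frac{\omega_\lambda}{\omega_\sigma^l}\, \frac{\omega_\mu}{\omega_\sigma^m}\, C_{\lambda\mu}^{\nu} = \frac{\omega_\nu}{\omega_\sigma^{l+m}}, \]
which gives $C_{\lambda\mu}^{\nu}\, \omega_\lambda \omega_\mu = \omega_\nu$. Unitarity in Theorem 4.4 gives $\omega_\lambda \omega_\mu \neq 0$; we thus have $C_{\lambda\mu}^{\nu} = \omega_\nu/(\omega_\lambda \omega_\mu)$.

We next deal with the case 2 (b), that is, we now assume that the system $\Delta$ has a $\mathbb{Z}/2\mathbb{Z}$-grading and the generator $\sigma$ is odd. We first set $\omega_{\mathrm{id}} = 1$. Since $\mathrm{id} \prec \sigma^2$, we next set $\omega_\sigma$ to be a square root of $(C_{\sigma\sigma}^{\mathrm{id}})^{-1}$. (Note that $|C_{\sigma\sigma}^{\mathrm{id}}| = 1$ by unitarity in Theorem 4.4.) It does not matter which square root we choose. For $\lambda \in \Delta$ not equivalent to $\mathrm{id}, \sigma$, we choose a minimum positive integer $k$ with $\lambda \prec \sigma^k$ in the same way as above in the case of 2 (a). We again set $\omega_\lambda = \omega_\sigma^k C_{\sigma\cdots\sigma}^{\lambda} \in \mathbb{C}$, where the number of $\sigma$'s in $C_{\sigma\cdots\sigma}^{\lambda}$ is $k$. For any $m > k$, we can represent the scalar $C_{\sigma\cdots\sigma}^{\lambda}$, where $\sigma$ appears $m$ times, as $C_{\sigma\sigma}^{\mathrm{id}} \cdots C_{\sigma\sigma}^{\mathrm{id}} C_{\sigma\cdots\sigma}^{\lambda}$, where the number of $C_{\sigma\sigma}^{\mathrm{id}}$'s is $(m - k)/2$ and the number of $\sigma$'s in $C_{\sigma\cdots\sigma}^{\lambda}$ is $k$, because $m - k$ is now even, due to the $\mathbb{Z}/2\mathbb{Z}$-grading. Then we obtain
\[ C_{\sigma\cdots\sigma}^{\lambda} = \frac{1}{\omega_\sigma^{m-k}}\, \frac{\omega_\lambda}{\omega_\sigma^k} = \frac{\omega_\lambda}{\omega_\sigma^m}, \]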
where the number of $\sigma$'s in $C_{\sigma\cdots\sigma}^{\lambda}$ is $m$. Then the same argument as in the above case of 2 (a) proves the triviality of the cocycle $C_{\lambda\mu}$.

Remark 5.2. The 2-cohomology does not vanish in general, as is well-known in the finite group case. For example, if the system $\Delta$ arises from an outer action of a finite group $G = \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$, it is known that we have a non-trivial unitary 2-cocycle for this group $G$. So as in Example 4.5, the 2-cohomology for the corresponding tensor category does not vanish.

In [31, Theorem 2.4, Theorem 4.1], we have classified local extensions of the conformal nets $SU(2)_k$ and $\mathrm{Vir}_c$ with $k = 1, 2, 3, \dots$ and $c = 1 - 6/(m(m+1))$, $m = 2, 3, 4, \dots$. (Here the symbol $\mathrm{Vir}_c$ denotes the Virasoro net with central charge $c$.) We use the symbols $SU(2)_k$ and $\mathrm{Vir}_c$ also for the corresponding $C^*$-tensor categories. We also say that the corresponding $C^*$-tensor categories of these local extensions of the nets $SU(2)_k$ and $\mathrm{Vir}_c$ are extensions of the tensor categories $SU(2)_k$ and $\mathrm{Vir}_c$. Furthermore, the tensor category $SU(2)_k$ has a natural $\mathbb{Z}/2\mathbb{Z}$-grading and the even objects make a sub-tensor category. We call it the even part of $SU(2)_k$. We then have the following theorem.

Theorem 5.3. Any finite system $\Delta$ of endomorphisms corresponding to one of the following tensor categories has a self-conjugate generator $\sigma$ satisfying all the Conditions in Theorem 5.1, and thus, we have 2-cohomology vanishing for these tensor categories:
1. The $SU(2)_k$-tensor categories and their extensions.
2. The sub-tensor categories of those in Case 1.
3. The Virasoro tensor categories $\mathrm{Vir}_c$ with $c < 1$ and their extensions.
4. The sub-tensor categories of those in Case 3.

Proof. We deal with the following cases separately. Here for the extensions of $SU(2)_k$-tensor categories and the Virasoro tensor categories $\mathrm{Vir}_c$, we use the labels by (pairs of) Dynkin diagrams as in [31, Theorem 2.4, Theorem 4.1], which arise from the labels of modular invariants by Cappelli-Itzykson-Zuber [9]. (These also correspond to the type I modular invariants listed in Table 1 in this paper.) Note that the braiding does not matter now, so we ignore the braiding structure here.

1. The $SU(2)_k$-tensor categories and their extensions.
   (a) Tensor categories $A_n$.
   (b) Tensor categories $D_{2n}$.
   (c) Tensor category $E_6$.
   (d) Tensor category $E_8$.
2. The (non-trivial) sub-tensor categories of those in Case 1.
   (a) The group $\mathbb{Z}/2\mathbb{Z}$.
   (b) The even parts of the $SU(2)_k$-tensor categories.
3. The Virasoro tensor categories $\mathrm{Vir}_c$ with $c < 1$ and their extensions.
   (a) Tensor categories $(A_{n-1}, A_n)$.
   (b) Tensor categories $(A_{4n}, D_{2n+2})$.
   (c) Tensor categories $(D_{2n+2}, A_{4n+2})$.
   (d) Tensor category $(A_{10}, E_6)$.
   (e) Tensor category $(E_6, A_{12})$.
   (f) Tensor category $(A_{28}, E_8)$.
   (g) Tensor category $(E_8, A_{30})$.
4. The (non-trivial) sub-tensor categories of those in Case 3.
   (a) The sub-tensor categories of those in Case 3(a).
(b) The sub-tensor categories of those in Case 3(b). (c) The sub-tensor categories of those in Case 3(c). (d) The sub-tensor categories of those in Case 3(d). (e) The sub-tensor categories of those in Case 3(e). (f) The sub-tensor categories of those in Case 3(f). (g) The sub-tensor categories of those in Case 3(g).
Case 1(a). We label the irreducible objects of the tensor category $A_{k+1}$ with $0, 1, 2, \dots, k$, as usual. Let $\sigma$ be the standard generator 1. Condition 1 of Theorem 5.1 clearly holds. Since the fusion rule of the tensor category $SU(2)_k$ has a Z/2Z-grading and this generator 1 is odd, Condition 2(b) also holds. Now the connection values with respect to this $\sigma$ are the usual connection values of the paragroup $A_{k+1}$ as in [42, 30, 14, Sect. 11.5], and they are non-zero and Condition 3 holds. The multiplication rule by the generator $\sigma$ is described with the usual Bratteli diagram for the principal graph $A_{k+1}$ as in [28, 14, Chapter 9], so we see that Condition 4 holds.

Case 1(b). The irreducible objects of the tensor category are labeled with the even vertices of the Dynkin diagram $D_{2n}$. (So we also use the name $D_{2n}^{\mathrm{even}}$ for this tensor category.) If $2n = 4$, then this tensor category is given by the group Z/3Z, and we can verify the conclusion directly, so we assume that $n > 2$. We label $\sigma$ as in Fig. 3. Then we can easily verify Conditions 1, 2(a) and 4. We next verify Condition 3. We label four irreducible objects as in Fig. 4. (If $n = 3$, we set $\lambda_1 = \mathrm{id}$.) Note that the connection with respect to the generator $\sigma$ has a principal graph as in Fig. 5. (See [24], for example, for the fusion rules of a subfactor with principal graph $D_{2n}$.) We first claim that if the vertices $\lambda_3$ and $\lambda_4$ are not involved, then the connection values with respect to the generator $\sigma$ are non-zero. As in [3, II, Sect. 3], we may assume that the irreducible objects of the tensor category are realized as $\{\alpha_0, \alpha_2, \dots, \alpha_{2n-4}, \alpha_{2n-2}^{(1)}, \alpha_{2n-2}^{(2)}\}$, arising from α-induction applied to the system $SU(2)_{4n-4}$ having the irreducible objects $\{0, 1, 2, \dots, 4n-4\}$. (Note that it does not matter whether we use $\alpha^+$ or $\alpha^-$, so we have dropped the ± symbol.) We denote, by $W(i, j, k, l)$, the connection value with respect to the generator $\sigma = \alpha_2$ given by the square in Fig. 6. (Note that the value $W(i, j, k, l)$ depends on the choices of intertwiners, but the absolute value $|W(i, j, k, l)|$ is independent of such choices, since the intertwiner spaces are now all one-dimensional.)

[Fig. 3. The principal graph for the subfactor $D_{2n}$, with the vertices id and $\sigma$ marked.]

[Fig. 4. The principal graph for the subfactor $D_{2n}$, with the vertices id, $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$ marked.]

[Fig. 5. The principal graph for the subfactor $\sigma(M) \subset M$.]

[Fig. 6. A connection value for $D_{2n}^{\mathrm{even}}$: a square with corners $i$, $j$, $k$, $l$.]

For example, assume $n > 4$ and consider the connection value $W(\alpha_4, \alpha_6, \alpha_4, \alpha_6)$. By
[3, II, Sect. 3], all four intertwiners involved in this connection come from the intertwiners for $SU(2)_{4n-4}$, and thus the connection value is given by the connection $W(4, 6, 4, 6)$ for $SU(2)_{4n-4}^{\mathrm{even}}$ with respect to the generator 2. This value is given as a single term of $6j$-symbols of $SU(2)_{4n-4}$ and it is non-zero by [29]. The general case is dealt with by the same method. Thus, we consider the remaining case where all four vertices of the connection value are among $\lambda_1, \lambda_2, \lambda_3, \lambda_4$. Below, we denote the vertices $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ simply by 1, 2, 3, 4. Denote the statistical dimensions of 1, 2, 3, 4 by $d_1, d_2, d_3, d_4$ respectively. Their explicit values are as follows:
$$d_1 = \frac{\sin\frac{2n-5}{4n-2}\pi}{\sin\frac{\pi}{4n-2}}, \qquad d_2 = \frac{\sin\frac{2n-3}{4n-2}\pi}{\sin\frac{\pi}{4n-2}}, \qquad d_3 = d_4 = \frac{1}{2\sin\frac{\pi}{4n-2}}. \tag{25}$$
For a fixed pair $(i, l)$, we denote the unitary matrix $(W(i, j, k, l))_{j,k}$ by $W_{il}$. Using the bi-unitarity Axioms 1 and 4 in [14, Chap. 10], originally due to [42], we compute several matrices $W_{il}$ below. Recall that the renormalization Axiom 4 in [14, Chap. 10] now implies
$$|W(i, j, k, l)| = \sqrt{\frac{d_j d_k}{d_i d_l}}\, |W(j, i, l, k)|.$$
If i = 1 and l = 3, 4, then the entries in Wil are again given as single terms of the 6j -symbols of SU (2)4n−4 and thus, they are non-zero. The unitary matrices W13 and W14 have size 1 × 1, so the entries are obviously non-zero. The unitary matrix W21 has a size 2 × 2, and all the entries in Wil are again given as single terms of the 6j -symbols of SU (2)4n−4 and thus, they are non-zero. The unitary matrix W22 has a size 4 × 4. The entry W (2, 1, 1, 2) is non-zero because we have already seen that W11 has no zero entries and we have the renormalization axiom. Similarly, the entries W (2, 2, 1, 2), W (2, 1, 2, 2), W (2, 3, 1, 2), W (2, 1, 3, 2), W (2, 4, 1, 2), and W (2, 1, 4, 2) are non-zero.
The entry $W(2, 2, 2, 2)$ is also given as a single term of the $6j$-symbols of $SU(2)_{4n-4}$ and thus, it is non-zero. We assume $W(2, 3, 2, 2) = 0$ and will derive a contradiction. Using the renormalization axiom twice, we obtain $W(2, 2, 3, 2) = 0$. Another use of the renormalization axiom gives $W(3, 2, 2, 2) = 0$. Since the 2 × 2 matrix $W_{32}$ is unitary, this implies $|W(3, 4, 2, 2)| = 1$. The renormalization axiom then gives $|W(2, 2, 3, 4)| = 1$. Since the 2 × 2 matrix $W_{24}$ is unitary, this gives $W(2, 3, 3, 4) = W(2, 2, 2, 4) = 0$. These two equalities then give $W(3, 4, 2, 3) = 0$ and $W(2, 2, 4, 2) = 0$ with the renormalization axiom, respectively. Thus we have verified that the (2, 4)-entry of the 4 × 4 unitary matrix $W_{22}$ is zero. Similarly, its (4, 2)-entry is also zero. The identity $W(3, 4, 2, 3) = 0$ and unitarity of the 2 × 2 matrix $W_{33}$ give $|W(3, 2, 2, 3)| = 1$. The renormalization axiom then produces $|W(2, 3, 3, 2)| = d_3/d_2$. The 1 × 1 matrix $W_{43}$ is unitary, thus the renormalization axiom gives $|W(2, 4, 3, 2)| = d_3/d_2$. Similarly, we obtain $|W(2, 3, 4, 2)| = d_3/d_2$. The 1 × 1 matrix $W_{13}$ is unitary, thus the renormalization axiom gives $|W(2, 1, 3, 2)| = \sqrt{d_1 d_3}/d_2$. Now we use the orthogonality of the second and third row vectors of the 4 × 4 unitary matrix $W_{22}$. We have so far obtained that the (2, 3), (2, 4), (3, 2)-entries are zero and the (3, 1)-entry is non-zero. We thus know that the (2, 1)-entry is zero, but this is a contradiction because we have already seen above that the (2, 1)-entry $W(2, 1, 2, 2)$ is non-zero. We have thus proved $W(2, 3, 2, 2) \neq 0$. By a similar method, we can prove that $W(2, 4, 2, 2)$, $W(2, 2, 3, 2)$ and $W(2, 2, 4, 2)$ are all non-zero.

We next assume $W(2, 3, 3, 2) = 0$. For the same reason as above, we obtain
$$|W(2, 3, 1, 2)| = |W(2, 4, 1, 2)| = |W(2, 1, 3, 2)| = |W(2, 1, 4, 2)| = \frac{\sqrt{d_1 d_3}}{d_2}, \tag{26}$$
$$|W(2, 3, 4, 2)| = |W(2, 4, 3, 2)| = \frac{d_3}{d_2}. \tag{27}$$
Since $W(2, 3, 3, 2) = 0$, the renormalization axiom implies $W(3, 2, 2, 3) = 0$.
Since the 2 × 2 matrix $W_{33}$ is unitary, we obtain $|W(3, 2, 4, 3)| = 1$. The renormalization axiom gives $|W(2, 3, 3, 4)| = \sqrt{d_3/d_2}$. Unitarity of the 2 × 2 matrix $W_{24}$ then gives $|W(2, 2, 2, 4)| = \sqrt{d_3/d_2}$, which then gives $|W(2, 2, 4, 2)| = |W(2, 4, 2, 2)| = d_3/d_2$ with the renormalization axiom. The identities (25), together with a simple computation of trigonometric functions, give
$$d_1 d_3 + 2 d_3^2 = d_2^2. \tag{28}$$
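As a quick numerical sanity check of (28), the following sketch evaluates the statistical dimensions from (25) and prints the residual $d_1 d_3 + 2 d_3^2 - d_2^2$ for small $n$. The function names `dims` and `identity_28_residual` are illustrative choices, not notation from the text, and floating-point agreement is of course not a proof.

```python
import math

def dims(n):
    """Statistical dimensions d1, d2, d3 (= d4) from (25) for D_{2n}."""
    h = 4 * n - 2                      # denominator appearing in (25)
    s = math.sin(math.pi / h)
    d1 = math.sin((2 * n - 5) * math.pi / h) / s
    d2 = math.sin((2 * n - 3) * math.pi / h) / s
    d3 = 1.0 / (2.0 * s)
    return d1, d2, d3

def identity_28_residual(n):
    """Left-hand side minus right-hand side of (28)."""
    d1, d2, d3 = dims(n)
    return d1 * d3 + 2 * d3 ** 2 - d2 ** 2

for n in range(3, 9):
    print(n, identity_28_residual(n))
```

The residuals should vanish up to rounding error for every $n \geq 3$; for $n = 3$ one also recovers $d_1 = 1$, in line with the convention $\lambda_1 = \mathrm{id}$.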
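Since the third row vector, the fourth row vector, and the third column vector of the unitary matrix $W_{22}$ have norm 1, this identity (28), together with (26), (27), gives $|W(2, 2, 3, 2)| = |W(2, 3, 2, 2)| = d_3/d_2$ and $W(2, 4, 4, 2) = 0$. Thus the matrix $A = (A_{jk})_{jk} = (|W(2, k, j, 2)|)_{jk}$ is given as follows, where $\alpha, \beta, \gamma$ are non-negative real numbers:
$$A = \begin{pmatrix} \alpha & \beta & \dfrac{\sqrt{d_1 d_3}}{d_2} & \dfrac{\sqrt{d_1 d_3}}{d_2} \\ \beta & \gamma & \dfrac{d_3}{d_2} & \dfrac{d_3}{d_2} \\ \dfrac{\sqrt{d_1 d_3}}{d_2} & \dfrac{d_3}{d_2} & 0 & \dfrac{d_3}{d_2} \\ \dfrac{\sqrt{d_1 d_3}}{d_2} & \dfrac{d_3}{d_2} & \dfrac{d_3}{d_2} & 0 \end{pmatrix}. \tag{29}$$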
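Orthogonality of the first and third row vectors of $W_{22}$ implies
$$\alpha \frac{\sqrt{d_1 d_3}}{d_2} + \beta \frac{d_3}{d_2} \geq \frac{\sqrt{d_1 d_3}\, d_3}{d_2^2}. \tag{30}$$
Since the first row vector of $W_{22}$ has norm 1, we also have
$$\alpha^2 + \beta^2 = 1 - \frac{2 d_1 d_3}{d_2^2}. \tag{31}$$
By the Cauchy-Schwarz inequality with (30) and (31), we obtain
$$\frac{\sqrt{d_1}\, d_3}{d_2} \leq \sqrt{d_1 + d_3}\, \sqrt{1 - \frac{2 d_1 d_3}{d_2^2}},$$
which, together with (28), implies
$$\sqrt{d_1}\, d_3 \leq \sqrt{d_1 + d_3}\, \sqrt{2 d_3^2 - d_1 d_3}.$$
This implies $d_1^2 \leq 2 d_3^2$, which gives
$$\sin^2 \frac{2n-5}{4n-2}\pi \leq \frac{1}{2} \tag{32}$$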
by (25). This inequality (32) fails if we have $(2n-5)/(4n-2) > 1/4$, that is, $n > 9/2$. Since we now assume $n \geq 3$, this has produced a contradiction and we have shown $W(2, 3, 3, 2) \neq 0$, unless $n = 3, 4$. We deal with the remaining two cases $n = 3, 4$ by direct computations of the connection as follows.

If $n = 3$, we have the Dynkin diagram $D_6$. A subfactor with principal graph $D_6$ is realized as the asymptotic inclusion [42, p. 137], [14, Def. 12.23], [26, Sect. 2], of a subfactor with principal graph $A_4$ as in [43, Sect. III.1], [14, p. 663], [26, Theorem 4.1]. Thus the tensor category $D_6^{\mathrm{even}}$ is realized as a self-tensor product of the tensor category $A_4^{\mathrm{even}}$, and our current generator $\sigma$ is realized as a tensor product of the standard generators in two copies of $A_4^{\mathrm{even}}$. As in Case 2(b) below, the connection values are non-zero for $A_4^{\mathrm{even}}$, thus our current connection values are also non-zero as products of two non-zero values.

We finally deal with the case $n = 4$. We label the even vertices of the principal graph $D_8$ as in Fig. 7.

[Fig. 7. The principal graph for the subfactor $D_8$, with the even vertices labeled 0, 1, 2, 3, 4.]

We continue the computations of the $|W(i, j, k, l)|$'s using the matrix (29), where the non-negative real numbers $\alpha, \beta, \gamma$ have been defined. The renormalization axiom gives
√ |W (1, 1, 2, 1)| = d2 /d1 |W (1, 1, 1, 2)| and unitarity of the 2 × 2-matrix W12 gives |W (1, 2, 2, 2)| = |W (1, 1, 1, 2)|. So we have d2 |W (1, 1, 2, 1)| = |W (1, 2, 1, 1)| = |W (1, 2, 2, 2)| = |W (2, 1, 2, 2)| = β, (33) d1 again by the renormalization. We also have |W (1, 2, 1, 1)| = β.
(34)
Unitarity of the 1 × 1 matrix $W_{02}$ gives $|W(0, 1, 1, 2)| = 1$ and thus, the renormalization axiom gives
$$|W(1, 0, 2, 1)| = |W(1, 2, 0, 1)| = \frac{\sqrt{d_2}}{d_1}, \tag{35}$$
since $d_0 = 1$. Similarly, unitarity of the 1 × 1 matrix $W_{01}$ gives
$$|W(1, 0, 1, 1)| = |W(1, 1, 0, 1)| = \frac{1}{\sqrt{d_1}}, \tag{36}$$
and unitarity of the 1 × 1 matrix $W_{00}$ gives
$$|W(1, 0, 0, 1)| = \frac{1}{d_1}. \tag{37}$$
We also have
$$|W(1, 2, 2, 1)| = \frac{d_2}{d_1} |W(2, 1, 1, 2)| = \frac{d_2}{d_1} \alpha. \tag{38}$$
Thus the 3 × 3 matrix $B = (B_{jk})_{jk} = (|W(1, k, j, 1)|)_{jk}$ is given as follows, where $\delta$ is a non-negative real number, by (33), (34), (35), (36), (37), (38):
$$B = \begin{pmatrix} \dfrac{1}{d_1} & \dfrac{1}{\sqrt{d_1}} & \dfrac{\sqrt{d_2}}{d_1} \\ \dfrac{1}{\sqrt{d_1}} & \delta & \beta \\ \dfrac{\sqrt{d_2}}{d_1} & \beta & \dfrac{d_2}{d_1}\alpha \end{pmatrix}. \tag{39}$$
The first row vector of the matrix (29) has norm 1, thus we have
$$\alpha^2 + \beta^2 = 1 - \frac{2 d_1 d_3}{d_2^2}. \tag{40}$$
The third row vector of the matrix (39) has norm 1, thus we have
$$\frac{d_2}{d_1^2} + \beta^2 + \frac{d_2^2}{d_1^2} \alpha^2 = 1. \tag{41}$$
Eqs. (40) and (41) give the following value for $\beta^2$:
$$\beta^2 = \frac{d_2^2 - 2 d_1 d_3 - d_1^2 + d_2}{d_2^2 - d_1^2}. \tag{42}$$
Note that the denominator is not zero. Let $t$ be the index of the subfactor with principal graph $D_8$. (That is, $t = 4\cos^2(\pi/14)$.) Then the Perron-Frobenius theory gives the following identities:
$$d_1 = t - 1, \qquad d_2 = t^2 - 3t + 1, \qquad d_3 = \frac{t^3 - 5t^2 + 6t - 1}{2}.$$
Then these imply $d_2^2 - 2 d_1 d_3 - d_1^2 + d_2 = 0$ in (42); we thus obtain $\beta = 0$, which has been already excluded above. We have thus reached a contradiction and shown $W(2, 3, 3, 2) \neq 0$. Similarly, we can prove $W(2, 4, 4, 2) \neq 0$. The unitary matrix $W_{34}$ has size 1 × 1, so the renormalization axiom implies $W(2, 4, 3, 2) \neq 0$. Similarly, we have $W(2, 3, 4, 2) \neq 0$. We have thus proved that all the entries of $W_{22}$ are non-zero.

The unitary matrix $W_{23}$ has size 2 × 2. If this matrix has a zero entry, we have either $W(2, 2, 2, 3) = W(2, 4, 4, 3) = 0$ or $W(2, 2, 4, 3) = W(2, 4, 2, 3) = 0$. The former case, together with the renormalization axiom, implies $W(2, 2, 3, 2) = 0$, which is already excluded in the above study of $W_{22}$. The latter case gives $|W(2, 4, 4, 3)| = 1$, which, together with the renormalization axiom, implies $|W(4, 2, 3, 4)| = \sqrt{d_2/d_4} > 1$ by (25). This is against the unitarity axiom and thus cannot happen. The 2 × 2 unitary matrix $W_{24}$ is dealt with in a similar way to the case $W_{23}$.

The unitary matrices $W_{31}$ and $W_{34}$ also have size 1 × 1, so the entries are again non-zero. The matrices $W_{32}$ and $W_{33}$ have size 2 × 2. The entries of $W_{32}$ have the same absolute values as the entries of $W_{23}$, so the above arguments for $W_{23}$ show that they are non-zero. We next consider $W_{33}$. If this 2 × 2 unitary matrix contains a zero entry, then we have either $W(3, 2, 2, 3) = W(3, 4, 4, 3) = 0$ or $W(3, 2, 4, 3) = W(3, 4, 2, 3) = 0$. The former case, together with the renormalization axiom, implies $W(2, 3, 3, 2) = 0$, which is already excluded in the above study of $W_{22}$. The latter case, together with the renormalization axiom, implies $W(2, 3, 3, 4) = 0$, which is already excluded in the above study of $W_{24}$.
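The identities used in the case $n = 4$ above can be checked numerically. The sketch below, with illustrative variable names that are not taken from the text, compares the Perron-Frobenius expressions in $t$ with the values from (25) at $n = 4$ and evaluates the numerator and denominator of (42).

```python
import math

# Index of the subfactor with principal graph D8, as stated in the text.
t = 4 * math.cos(math.pi / 14) ** 2

# Perron-Frobenius expressions for the statistical dimensions.
d1 = t - 1
d2 = t ** 2 - 3 * t + 1
d3 = (t ** 3 - 5 * t ** 2 + 6 * t - 1) / 2

# The same dimensions from (25) with n = 4, i.e. 4n - 2 = 14.
s = math.sin(math.pi / 14)
d1_from_25 = math.sin(3 * math.pi / 14) / s
d2_from_25 = math.sin(5 * math.pi / 14) / s
d3_from_25 = 1 / (2 * s)

# Numerator and denominator of the expression (42) for beta^2.
numerator = d2 ** 2 - 2 * d1 * d3 - d1 ** 2 + d2
denominator = d2 ** 2 - d1 ** 2

print(abs(d1 - d1_from_25), abs(d2 - d2_from_25), abs(d3 - d3_from_25))
print(numerator, denominator)
```

Up to rounding error the two sets of dimensions should agree, the numerator should vanish, and the denominator should stay well away from zero, in line with the conclusion $\beta = 0$ drawn from (42).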
The four matrices W4l can be dealt with in the same way as above for W3l . Thus we are done for Case 1(b). Case 1(c). Only fusion rules and 6j -symbols matter, and the braiding does not matter, for the Conditions in Theorem 5.1, so our tensor category can be identified with SU (2)2 and this is a special case of Case 1 above. Case 1(d). In a similar way to the above case, this tensor category can be identified with the even part of the tensor category SU (2)3 , so this is a special case of Case 2(b) below. Case 2(a). This is trivial. Case 2(b). We label the irreducible objects of the tensor category SU (2)k with index 0, 1, 2, . . . , k, as above. (We also use the name Aeven k+1 for this tensor category.) Let σ be the generator 2 this time. Conditions 1 and 2(a) of Theorem 5.1 clearly hold.
Since all 6j -symbols for SU (2)k have non-zero values as in [29], Condition 3 holds, in particular. The multiplication rule by the generator σ is described with the even steps of the usual Bratteli diagram for the principal graph Ak+1 as in [28, 14, Chap. 9], so we see that Condition 4 holds. Case 3(a). This is the Virasoro tensor category with central charge c = 1 − 6/n(n + 1). We recall the description of the irreducible objects in the tensor category given by [53, Theorem 4.6] applied to SU (2)n−1 ⊂ SU (2)n−2 ⊗SU (2)1 , as follows. (Also see [31, Sect. 3] for our notations.) We now have a net of subfactors Vir c ⊗ SU (2)n−1 ⊂ SU (2)n−2 ⊗ SU (2)1 with finite index and apply the α-induction to this inclusion. The irreducible representations of the net Vir c are labeled as {σj,k | j = 0, 1, . . . , n − 2,
k = 0, 1, . . . , n − 1,
j + k ∈ 2Z}.
Xu’s result [53, Theorem 4.6] then shows the following. First, the systems {σj,k } and {ασj,k ⊗id } have the isomorphic fusion rules and 6j -symbols. Furthermore, the latter system is isomorphic to the system {(λ j ⊗ id)(αid×λk ) | j = 0, 1, . . . , n − 2,
k = 0, 1, . . . , n − 1,
j + k ∈ 2Z},
where {λk | k = 0, 1, . . . , n − 1} and {λ j | j = 0, 1, . . . , n − 2} are the system of irreducible DHR endomorphisms of the nets SU (2)n−1 and SU (2)n−2 , respectively. This system has further isomorphic fusion rules and 6j -symbols to the system {λ j ⊗ λk | j = 0, 1, . . . , n − 2,
k = 0, 1, . . . , n − 1,
j + k ∈ 2Z},
(43)
of irreducible DHR endomorphisms of the net $SU(2)_{n-2} \otimes SU(2)_{n-1}$. (Note that we have a restriction $j + k \in 2\mathbb{Z}$, so this system is a subsystem of all the irreducible DHR endomorphisms of the net $SU(2)_{n-2} \otimes SU(2)_{n-1}$.) As in [31, Sect. 3], we can identify the system of these $\sigma_{j,k}$'s with the system of characters of the minimal models [11, Subsect. 7.3.4] whose fusion rules are given in [11, Subsect. 7.3.3]. We take the DHR endomorphism $\sigma_{1,1}$ as $\sigma$ in Theorem 5.1 and then, from these fusion rules, we easily see that Condition 1 holds. It is also easy to see that we have a natural Z/2Z-grading such that $\sigma$ is an odd generator, so Condition 2(b) holds. By considering the connection of the system (43), we know that the connection value with respect to the generator $\sigma$ is a product of the two connection values of the systems $SU(2)_{n-2}$ and $SU(2)_{n-1}$ with respect to the standard generators. Since these two connection values for $SU(2)_{n-2}$ and $SU(2)_{n-1}$ are the usual connection values for the paragroups labeled with the Dynkin diagrams $A_{n-1}$ and $A_n$, and they are non-zero by [42, 30, 14, Section 11.5], we conclude that Condition 3 holds. From the fusion rule described as above, we verify that Condition 4 also holds. (Recall the comment on Condition 4 after the statement of Theorem 5.1 and draw the principal graph for a subfactor given by $\sigma_{1,1}$.)

Case 3(b). The tensor category is produced with α-induction and a simple current extension of index 2 as in [3, II, Sect. 3]. The fusion rules and $6j$-symbols are given by a direct product of the two systems $A_{4n}^{\mathrm{even}}$ and $D_{2n+2}^{\mathrm{even}}$. We can use the direct product of the $\sigma$ in Fig. 3 and the $\sigma$ in Case 2(b) as the current $\sigma$ for Theorem 5.1. Then Conditions 1, 2(a), and 4 easily follow and the connection values are non-zero as products of non-zero values in Cases 1(b) and 2.

Case 3(c). This case is proved in a similar way to the above proof of Case 3(b).
Case 3(d). The tensor category is produced with α-induction as in [31, Sect. 4.2]. The irreducible objects of the tensor category are labeled with pairs $(j, k)$ with $j = 0, 1, \dots, 9$ and $k = 0, 1, 2$ with $j + k \in 2\mathbb{Z}$. The fusion rules of the objects $\{(j, 0) \mid j = 0, 1, \dots, 9\}$ obey the $A_{10}$ fusion rule and those of $\{(0, 0), (0, 1), (0, 2)\}$ obey the $A_3$ fusion rule. Let $\sigma$ be the object $(1, 1)$. Then as in Case 1, we can verify Conditions 1, 2(b), 3 and 4.

Case 3(e). This case is proved in a similar way to the above proof of Case 3(d).

Case 3(f). The tensor category is again produced with α-induction as in [31, Sect. 4.2]. The fusion rules and $6j$-symbols are given as the direct product of the two systems $A_{28}^{\mathrm{even}}$ and $A_4^{\mathrm{even}}$. The irreducible objects of the former system are labeled with $0, 2, \dots, 26$ as usual, and the latter system is given as $\{\mathrm{id}, \tau\}$ with $\tau^2 = \mathrm{id} \oplus \tau$. Then we can choose $(14, \tau)$ as $\sigma$ and verify Conditions 1, 2(a), 3 and 4, using the same arguments as in Cases 2 and 3(b).

Case 3(g). This case is proved in a similar way to the above proof of Case 3(f).

Case 4(a). Now, the only non-trivial sub-tensor categories are Z/2Z, $SU(2)_{n-2}$, $SU(2)_{n-2}^{\mathrm{even}}$, $SU(2)_{n-1}$, $SU(2)_{n-1}^{\mathrm{even}}$ and the even parts with respect to the Z/2Z-grading described in the above proof of Case 3(a). The conclusion trivially holds for the first case. The next four cases have been already dealt with in Cases 1(a) and 2(b). In the last case, we can identify the tensor category with the direct product of two tensor categories $SU(2)_{n-2}^{\mathrm{even}}$ and $SU(2)_{n-1}^{\mathrm{even}}$. We use the same labeling of the irreducible DHR sectors as in the proof of Case 3(a) and then we can use the generator $\sigma_{2,2}$ as $\sigma$ in Theorem 5.1.

Case 4(b). The only non-trivial sub-tensor categories we have are now $A_{4n}^{\mathrm{even}}$ and $D_{2n+2}^{\mathrm{even}}$. Thus, we have the conclusion by Cases 2(b) and 1(b), respectively.

Case 4(c). This case is proved in a similar way to the above proof of Case 4(b).

Case 4(d). The only non-trivial sub-tensor categories we have are now Z/2Z, $A_{10}^{\mathrm{even}}$, their direct product, and $A_3$. We can deal with the group Z/2Z trivially. The cases $A_{10}^{\mathrm{even}}$ and $A_3$ are particular cases of Cases 2(b) and 1(a), respectively. For the case of the direct product of $A_{10}^{\mathrm{even}}$ and Z/2Z, we can choose $\sigma = (2, 2)$ in the notation of the proof for Case 3(d).

Case 4(e). This case is proved in a similar way to the above proof of Case 4(d).

Case 4(f). The only non-trivial sub-tensor categories we have are now $A_4^{\mathrm{even}}$ and $A_{28}^{\mathrm{even}}$. Both are special cases of Case 2(b).

Case 4(g). This case is proved in a similar way to the above proof of Case 4(f).

Remark 5.4. We have the following application of the above theorem. Consider the tensor category corresponding to the WZW-model $SU(2)_{28}$. Regard the irreducible objects as irreducible endomorphisms of a type III factor $M$ and label them as $\mathrm{id} = \lambda_0, \lambda_1, \lambda_2, \dots, \lambda_{28}$ as usual. Then the endomorphism $\gamma = \lambda_0 \oplus \lambda_{10} \oplus \lambda_{18} \oplus \lambda_{28}$ is a dual canonical endomorphism and uniqueness of the Q-system $(\gamma, V, W)$ up to unitary equivalence was shown in [33, Sect. 6] based on a result in vertex operator algebras. (This uniqueness was used in our previous work [31].) Izumi has also given another proof of this uniqueness with a more direct method. We remark that our above theorem also gives a different proof of this uniqueness as follows. We may assume that $M$ is injective. Suppose that we have two endomorphisms $\rho_1, \rho_2$ of $M$ such that $\rho_1 \bar\rho_1 = \rho_2 \bar\rho_2 = \gamma$. As in [6, Prop. A.3], we can prove that the two subfactors $\rho_1(M) \subset M$ and $\rho_2(M) \subset M$ have isomorphic higher relative commutants, and then we conclude by [45, Cor. 6.4] that the two subfactors are isomorphic via $\theta \in \mathrm{Aut}(M)$. We then may and do assume $\rho_2 = \theta \cdot \rho_1$ and now we have $\theta \cdot \gamma \cdot \theta^{-1} = \gamma$. Since $\gamma = \lambda_0 \oplus \lambda_{10} \oplus \lambda_{18} \oplus \lambda_{28}$ and powers of $\gamma$ produce all of $\lambda_0, \lambda_2, \lambda_4, \dots, \lambda_{28}$,
we know that $[\theta \cdot \lambda_{2j} \cdot \theta^{-1}] = [\lambda_{2j}]$ for $j = 0, 1, 2, \dots, 14$, where the square brackets denote the unitary equivalence classes. Then we have a map $\theta : \mathrm{Hom}(\lambda, \mu) \ni t \mapsto \theta(t) \in \mathrm{Hom}(\theta \cdot \lambda \cdot \theta^{-1}, \theta \cdot \mu \cdot \theta^{-1})$ giving an automorphism of the tensor category generated by powers of $\gamma$. By Case 2 of Theorem 5.3, this automorphism $\theta$ is trivial in the sense of Definition 4.2. The automorphism $\theta$ sends the Q-system $(\gamma, V_1, W_1)$ for $\rho_1$ to the one $(\gamma, V_2, W_2)$ for $\rho_2$, and now the triviality of $\theta$ implies that these two systems are unitarily equivalent.

Using the above Theorem 5.3, we obtain the following classification result of 2-dimensional completely rational nets. The meaning of the condition that the µ-index is 1 will be further studied in the next section. Consider a 2-dimensional local completely rational conformal net $B$ with central charge $c = 1 - 6/m(m+1) < 1$ and µ-index $\mu_B = 1$. By [47], we have inclusions
$$A_L \otimes A_R \subset A_L^{\max} \otimes A_R^{\max} \subset B,$$
where $A_L$, $A_R$, $A_L^{\max}$, $A_R^{\max}$ are one-dimensional local conformal nets. By assumption, $A_L^{\max}$ and $A_R^{\max}$ have the same central charge $c$. Rehren's result [47, Cor. 3.5] and our results [32, Prop. 24] together imply that the fusion rules of the systems of entire irreducible DHR endomorphisms of the two nets $A_L^{\max}$, $A_R^{\max}$ are isomorphic, and our previous result [31, Theorem 5.1] implies that the two nets $A_L^{\max}$, $A_R^{\max}$ are isomorphic as nets. Since both $A_L^{\max}$, $A_R^{\max}$ contain $\mathrm{Vir}_c$ as subnets, we obtain an irreducible inclusion $\mathrm{Vir}_c \otimes \mathrm{Vir}_c \subset B$. A decomposition of a vacuum sector of $B$ restricted on $\mathrm{Vir}_c \otimes \mathrm{Vir}_c$ produces a decomposition matrix $(Z_{\lambda\mu})_{\lambda\mu}$, where $\lambda, \mu$ are representatives of unitary equivalence classes of irreducible DHR endomorphisms of the net $\mathrm{Vir}_c$. Since $\mu_B = 1$, by Theorem 3.1, due to Müger [41], we know that this matrix $Z$ is a modular invariant of the Virasoro tensor category $\mathrm{Vir}_c$ and such modular invariants have been classified by Cappelli-Itzykson-Zuber [9] as in Table 1.
We claim that this correspondence from B to Z is bijective.
Theorem 5.5. The above correspondence from B to Z gives a bijection from the set of isomorphism classes of such two-dimensional nets to the set of modular invariants Z in Table 1.

Table 1. Modular invariants for the Virasoro tensor category Vir_c

m         Labels for modular invariants in [9]    Type
n         (A_{n-1}, A_n)                          I
4n        (D_{2n+1}, A_{4n})                      II
4n + 1    (A_{4n}, D_{2n+2})                      I
4n + 2    (D_{2n+2}, A_{4n+2})                    I
4n + 3    (A_{4n+2}, D_{2n+3})                    II
11        (A_{10}, E_6)                           I
12        (E_6, A_{12})                           I
17        (A_{16}, E_7)                           II
18        (E_7, A_{18})                           II
29        (A_{28}, E_8)                           I
30        (E_8, A_{30})                           I
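A simple consistency check on Table 1 can be sketched as follows. It relies on a standard feature of the Cappelli-Itzykson-Zuber labeling that is not stated explicitly in the text above, namely that the two Dynkin diagrams in the label for a given $m$ have Coxeter numbers $m$ and $m + 1$; it also evaluates the central charge $c = 1 - 6/m(m+1)$. The function names and the tuple encoding of the rows are illustrative assumptions.

```python
from fractions import Fraction

def coxeter_number(kind, rank):
    """Coxeter number of the Dynkin diagram of the given type and rank."""
    if kind == "A":
        return rank + 1
    if kind == "D":
        return 2 * rank - 2
    return {6: 12, 7: 18, 8: 30}[rank]   # E6, E7, E8

def table1_rows(n):
    """Rows of Table 1 for a given n, as (m, left label, right label, type)."""
    return [
        (n, ("A", n - 1), ("A", n), "I"),
        (4 * n, ("D", 2 * n + 1), ("A", 4 * n), "II"),
        (4 * n + 1, ("A", 4 * n), ("D", 2 * n + 2), "I"),
        (4 * n + 2, ("D", 2 * n + 2), ("A", 4 * n + 2), "I"),
        (4 * n + 3, ("A", 4 * n + 2), ("D", 2 * n + 3), "II"),
        (11, ("A", 10), ("E", 6), "I"),
        (12, ("E", 6), ("A", 12), "I"),
        (17, ("A", 16), ("E", 7), "II"),
        (18, ("E", 7), ("A", 18), "II"),
        (29, ("A", 28), ("E", 8), "I"),
        (30, ("E", 8), ("A", 30), "I"),
    ]

def central_charge(m):
    """Central charge c = 1 - 6/(m(m+1)) as an exact fraction."""
    return 1 - Fraction(6, m * (m + 1))

def row_is_consistent(row):
    """Check the Coxeter numbers m, m+1 and the bound c < 1 for one row."""
    m, left, right, _type = row
    return (coxeter_number(*left) == m
            and coxeter_number(*right) == m + 1
            and central_charge(m) < 1)

for row in table1_rows(3):
    print(row, central_charge(row[0]), row_is_consistent(row))
```

The admissible range of $n$ for each parametrized row is not modeled here; the loop uses $n = 3$ purely for illustration.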
Proof. We first prove that this correspondence is surjective. Take a modular invariant $Z$ in Table 1. By [31, Subsecs. 4.1, 4.2, 4.3], we conclude that this modular invariant can be realized with α-induction as in [5, Cor. 5.8] for extensions of the Virasoro nets. Then Rehren's results in [48, Theorem 1.4, Prop. 1.5] imply that we have a corresponding Q-system and a local extension $B \supset \mathrm{Vir}_c \otimes \mathrm{Vir}_c$ and that this $B$ produces the matrix $Z$ in the above correspondence.

We next show injectivity of the map. Suppose that we have inclusions
$$A_L \otimes A_R \subset A_L^{\max} \otimes A_R^{\max} \subset B,$$
where $A_L$, $A_R$ are isomorphic to $\mathrm{Vir}_c$ and that this decomposition gives a matrix $Z$. We have to prove that the net $B$ is uniquely determined up to isomorphism. Recall that the nets $A_L^{\max}$ and $A_R^{\max}$ are among those classified by [31, Theorem 5.1]. As we have seen above, $A_L^{\max}$ and $A_R^{\max}$ are isomorphic as nets and we can naturally identify them. This isomorphism class and an isomorphism $\pi$ from a fusion rule of $A_L^{\max}$ onto that of $A_R^{\max}$ are uniquely determined by $Z$ by [31, Theorem 5.1]. (Also see [4].)

If the modular invariant is of type I, then we can naturally identify $A_L^{\max}$ and $A_R^{\max}$ and the map $\pi$ is trivial. Then the Q-system for the inclusion $A_L^{\max} \otimes A_R^{\max} \subset B$ has a standard dual canonical endomorphism as in the Longo-Rehren Q-system and the above results Corollary 4.6, Theorems 5.1, 5.3 imply that this Q-system is equivalent to the Longo-Rehren Q-system.

If the modular invariant is of type II, then we have a non-trivial fusion rule automorphism $\pi$. We then know by [4, Lemma 5.3] that this fusion rule automorphism $\pi$ actually gives an automorphism of the tensor category acting non-trivially on irreducible objects. The same arguments as in the proof of Theorem 4.4 show that 2-cohomology vanishing implies uniqueness of the Q-system. Again, the above results Corollary 4.6, Theorems 5.1, 5.3 give the 2-cohomology vanishing, thus we have the desired uniqueness of the Q-system for the inclusion $A_L^{\max} \otimes A_R^{\max} \subset B$.

Remark 5.6. In the case when the modular invariant $Z$ above is of type II, the automorphism $\pi$ of the tensor category above is actually an automorphism of a braided tensor category, as seen from the above proof. In the above classification, we have shown 2-cohomology vanishing without assuming locality. In the context of classification of two-dimensional nets, this means that any (relatively local irreducible) extension $B$ of $A_L^{\max} \otimes A_R^{\max}$ with µ-index being 1 is automatically local.

6. The µ-Index, Maximality of Extensions, and Classification of Non-Maximal Nets

In Theorem 5.5, we have classified 2-dimensional completely rational local conformal nets with central charge less than 1 under the assumption that the µ-index is 1. In this section, we clarify the meaning of this condition on the µ-index. As we have seen above, this condition is equivalent to triviality of the superselection structure of the net. We further show that this condition is equivalent to maximality of extensions of the 2-dimensional net, when we have a parity symmetry for the net $B$. Here the net $B$ is said to have a parity symmetry if we have a vacuum-fixing unitary involution $P$ such that $P B(O) P = B(pO)$, where $p$ maps $x + t \mapsto x - t$ in the two-dimensional Minkowski space. In this case, $P$ clearly implements an isomorphism of $A_L$ and $A_R$ and thus, an isomorphism of $A_L^{\max}$ and $A_R^{\max}$.
Suppose we have a local extension $C$ of the two-dimensional completely rational local conformal net $B$ and the inclusion $B \subset C$ is strict. Then we have $\mu_B > \mu_C \geq 1$ by [32, Prop. 24]. That is, if the net $B$ is not maximal with respect to local extensions, then we have $\mu_B > 1$. This argument does not require a parity symmetry condition.

Conversely, suppose we have $\mu_B > 1$. Then the results in [41] show that the dual canonical endomorphism for the inclusion $A_L^{\max} \otimes A_R^{\max} \subset B$ is of the form $\bigoplus \lambda \otimes \pi(\lambda)$, where both $A_L^{\max}$ and $A_R^{\max}$ are local extensions of $\mathrm{Vir}_c$ and $\lambda$ runs through a proper subsystem of the system of the irreducible DHR endomorphisms of the net $A_L^{\max}$ and $\pi$ is an isomorphism from such system onto another subsystem of irreducible DHR endomorphisms of the net $A_R^{\max}$. Both $A_L^{\max}$ and $A_R^{\max}$ are in the classification list of [31, Theorem 5.1], and now they are isomorphic. Recall that at least one of the two subsystems is a proper subsystem, since $\mu_B > 1$, and the parity symmetry condition now implies that both subsystems are proper.

First suppose that the map $\pi$ is trivial. Then the Q-system for the inclusion $A_L^{\max} \otimes A_R^{\max} \subset B$ is the usual Longo-Rehren Q-system arising from the subsystem by Corollary 4.6, Theorem 5.1 and Case 4 of Theorem 5.3. Then, Izumi's Galois correspondence [26, Theorem 2.5] shows that we have a further extension $C \supset B$ such that the Q-system for $A_L^{\max} \otimes A_R^{\max} \subset C$ is the Longo-Rehren Q-system using the entire system of the irreducible DHR endomorphisms of $A_L^{\max}$ and the index $[C : B]$ is strictly larger than 1. We know that the extension $C$ arising from the Longo-Rehren Q-system is local. That is, the net $B$ is not maximal with respect to local extensions.

Next suppose that the map $\pi$ is non-trivial.
By checking the representation categories of the local extensions of the Virasoro nets classified in [31, Theorem 5.1], we know that only such non-trivial isomorphisms arise from interchanging of $2j$ and $4n - 2 - 2j$ of the system $SU(2)_{4n-2}$, where $j = 0, 1, \dots, 2n - 1$, or the well-known non-trivial automorphism of the system $D_{10}^{\mathrm{even}}$. In both cases, the map $\pi$ can be extended to an automorphism of the entire system of irreducible DHR endomorphisms of $A_L^{\max}$ and we can obtain a proper extension $C \supset B$ in a similar way to the above case. Thus, again, the net $B$ is not maximal with respect to local extensions.

We summarize these proper sub-tensor categories of the extensions of the Virasoro tensor categories $\mathrm{Vir}_c$ ($c < 1$) with trivial or non-trivial automorphisms as in Table 2. Each entry "nontrivial" means that we have a unique nontrivial automorphism for the sub-tensor category. For example, the sub-tensor category $SU(2)_6^{\mathrm{even}}$ of $(A_7, A_8)$ appears in the case $n = 8$ of the 4th entry having a trivial automorphism and the case $n = 2$ of the 5th entry having a nontrivial automorphism. We thus have exactly two non-maximal local conformal nets for this sub-tensor category.

Thus we have proved that the net $B$ with parity symmetry has $\mu_B = 1$ if and only if it is maximal with respect to local extensions. In such a case, we say that $B$ is a maximal net. These results, together with Theorem 5.5, imply the following main theorem of this paper immediately.

Theorem 6.1. The above correspondence from $B$ to $Z$ in Theorem 5.5 gives a bijection from the set of isomorphism classes of such maximal two-dimensional nets with parity symmetry and central charge less than 1 to the set of modular invariants $Z$ in Table 1.
Furthermore, the above discussions on the possible proper sub-tensor categories of the extensions of the Virasoro tensor categories Vir c (c < 1) with trivial or non-trivial automorphisms imply that non-maximal two-dimensional local conformal nets with parity symmetry and central charge less than 1 are classified according to Table 2, since we have 2-cohomology vanishing for all these tensor categories by Theorem 5.3.
Table 2. Proper sub-tensor categories of extensions of the Virasoro tensor categories Vir_c with automorphisms

m         Tensor category          Sub-tensor category                        Automorphism
n         (A_{n-1}, A_n)           {id}                                       trivial
n         (A_{n-1}, A_n)           Z/2Z                                       trivial
n         (A_{n-1}, A_n)           SU(2)_{n-2}                                trivial
n         (A_{n-1}, A_n)           SU(2)_{n-2}^even                           trivial
4n        (A_{4n-1}, A_{4n})       SU(2)_{4n-2}^even                          nontrivial
n         (A_{n-1}, A_n)           SU(2)_{n-1}                                trivial
4n - 1    (A_{4n-2}, A_{4n-1})     SU(2)_{4n-2}^even                          nontrivial
n         (A_{n-1}, A_n)           SU(2)_{n-2}^even × SU(2)_{n-1}^even        trivial
4n        (A_{4n-1}, A_{4n})       SU(2)_{4n-2}^even × SU(2)_{4n-1}^even      nontrivial
4n - 1    (A_{4n-2}, A_{4n-1})     SU(2)_{4n-3}^even × SU(2)_{4n-2}^even      nontrivial
4n + 1    (A_{4n}, D_{2n+2})       {id}                                       trivial
4n + 1    (A_{4n}, D_{2n+2})       SU(2)_{4n-1}^even                          trivial
4n + 1    (A_{4n}, D_{2n+2})       D_{2n+2}^even                              trivial
4n + 2    (D_{2n+2}, A_{4n+2})     {id}                                       trivial
4n + 2    (D_{2n+2}, A_{4n+2})     SU(2)_{4n+1}^even                          trivial
4n + 2    (D_{2n+2}, A_{4n+2})     D_{2n+2}^even                              trivial
11        (A_{10}, E_6)            {id}                                       trivial
11        (A_{10}, E_6)            Z/2Z                                       trivial
11        (A_{10}, E_6)            SU(2)_2                                    trivial
11        (A_{10}, E_6)            SU(2)_9^even                               trivial
11        (A_{10}, E_6)            Z/2Z × SU(2)_9^even                        trivial
12        (E_6, A_{12})            {id}                                       trivial
12        (E_6, A_{12})            Z/2Z                                       trivial
12        (E_6, A_{12})            SU(2)_2                                    trivial
12        (E_6, A_{12})            SU(2)_{11}^even                            trivial
12        (E_6, A_{12})            Z/2Z × SU(2)_{11}^even                     trivial
17        (A_{16}, D_{10})         D_{10}^even                                nontrivial
18        (D_{10}, A_{18})         D_{10}^even                                nontrivial
29        (A_{28}, E_8)            {id}                                       trivial
29        (A_{28}, E_8)            SU(2)_3^even                               trivial
29        (A_{28}, E_8)            SU(2)_{27}^even                            trivial
30        (E_8, A_{30})            {id}                                       trivial
30        (E_8, A_{30})            SU(2)_3^even                               trivial
30        (E_8, A_{30})            SU(2)_{29}^even                            trivial
Theorem 6.2. The non-maximal two-dimensional local conformal nets with parity symmetry and central charge less than 1 are classified bijectively, up to isomorphism, according to the entries in Table 2.

Acknowledgements. A part of this work was done during a visit of the first-named author to Università di Roma "Tor Vergata". Another part was done while both authors stayed at the Mathematisches Forschungsinstitut Oberwolfach for a miniworkshop "Index theorems and modularity in operator algebras". We thank M. Izumi for useful discussions. We gratefully acknowledge the support of GNAMPA-INDAM and MIUR (Italy), Grants-in-Aid for Scientific Research, JSPS (Japan) and the Mathematisches Forschungsinstitut Oberwolfach.
References
1. Baumgärtel, H.: Operatoralgebraic Methods in Quantum Field Theory. Berlin: Akademie Verlag, 1995
2. Belavin, A.A., Polyakov, A.M., Zamolodchikov, A.B.: Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. 241, 333–380 (1984)
3. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors I. Commun. Math. Phys. 197, 361–386 (1998), II. 200, 57–103 (1999), III. 205, 183–228 (1999)
4. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors. Commun. Math. Phys. 213, 267–289 (2000)
5. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral projectors and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
6. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Chiral structure of modular invariants for subfactors. Commun. Math. Phys. 210, 733–784 (2000)
7. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Longo-Rehren subfactors arising from α-induction. Publ. RIMS, Kyoto Univ. 37, 1–35 (2001)
8. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
9. Cappelli, A., Itzykson, C., Zuber, J.-B.: The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories. Commun. Math. Phys. 113, 1–26 (1987)
10. Carpi, S.: On the representation theory of Virasoro Nets. To appear in Commun. Math. Phys., math.OA/0306425
11. Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal Field Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1996
12. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics. I. Commun. Math. Phys. 23, 199–230 (1971), II. 35, 49–85 (1974)
13. Doplicher, S., Roberts, J.E.: A new duality theory for compact groups. Invent. Math. 98, 157–218 (1989)
14. Evans, D.E., Kawahigashi, Y.: Quantum symmetries on operator algebras. Oxford: Oxford University Press, 1998
15. Fredenhagen, K., Jörß, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansion. Commun. Math. Phys. 176, 541–554 (1996)
16. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. I. Commun. Math. Phys. 125, 201–226 (1989), II. Rev. Math. Phys. Special issue, 113–157 (1992); Fröhlich, J.: Statistics of fields, the Yang-Baxter equation, and the theory of knots and links. In: Nonperturbative quantum field theory (Cargèse, 1987), NATO Adv. Sci. Inst. Ser. B Phys., 185, New York: Plenum, 1988, pp. 71–100
17. Friedan, D., Qiu, Z., Shenker, S.: Details of the non-unitarity proof for highest weight representations of the Virasoro algebra. Commun. Math. Phys. 107, 535–542 (1986)
18. Goddard, P., Kent, A., Olive, D.: Unitary representations of the Virasoro and super-Virasoro algebras. Commun. Math. Phys. 103, 105–119 (1986)
19. Goddard, P., Olive, D. (eds.): Kac-Moody and Virasoro algebras. A Reprint Volume for Physicists. Singapore: World Scientific, 1988
20. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
21. Guido, D., Longo, R.: A converse Hawking-Unruh effect and $dS^2$/CFT correspondence. gr-qc/0212025, to appear in Ann. H. Poincaré
22. Haag, R.: Local Quantum Physics. 2nd ed., Berlin-Heidelberg-New York: Springer, 1996
23. Hislop, P.D., Longo, R.: Modular structure of the von Neumann algebras associated with the free massless scalar field theory. Commun. Math. Phys. 84, 71–85 (1982)
24. Izumi, M.: Application of fusion rules to classification of subfactors. Publ. RIMS, Kyoto Univ. 27, 953–994 (1991)
25. Izumi, M.: Subalgebras of infinite $C^*$-algebras with finite Watatani indices II: Cuntz-Krieger algebras. Duke Math. J. 91, 409–461 (1998)
26. Izumi, M.: The structure of sectors associated with the Longo-Rehren inclusions. Commun. Math. Phys. 213, 127–179 (2000)
27. Izumi, M., Kosaki, H.: On a subfactor analogue of the second cohomology. Rev. Math. Phys. 14, 733–757 (2002)
28. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
29. Kauffman, L., Lins, S.L.: Temperley–Lieb recoupling theory and invariants of 3-manifolds. Princeton, NJ: Princeton University Press, 1994
30. Kawahigashi, Y.: On flatness of Ocneanu's connections on the Dynkin diagrams and classification of subfactors. J. Funct. Anal. 127, 63–107 (1995)
31. Kawahigashi, Y., Longo, R.: Classification of local conformal nets. Case c < 1. To appear in Ann. Math., math-ph/0201015
32. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001)
33. Kirillov Jr., A., Ostrik, V.: On q-analog of McKay correspondence and ADE classification of sl(2) conformal field theories. Adv. Math. 171, 183–227 (2002)
34. Köster, S.: Local nature of coset models. Preprint 2003, math-ph/0303054
35. Longo, R.: Index of subfactors and statistics of quantum fields I–II. Commun. Math. Phys. 126, 217–247 (1989) & 130, 285–309 (1990)
36. Longo, R.: A duality for Hopf algebras and for subfactors. Commun. Math. Phys. 159, 133–150 (1994)
37. Longo, R.: Conformal subnets and intermediate subfactors. Commun. Math. Phys. 237, 7–30 (2003), math.OA/0102196
38. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
39. Longo, R., Roberts, J.E.: A theory of dimension. K-theory 11, 103–159 (1997)
40. Masuda, T.: An analogue of Longo's canonical endomorphism for bimodule theory and its application to asymptotic inclusions. Internat. J. Math. 8, 249–265 (1997)
41. Müger, M.: Extensions and modular invariants of rational conformal field theories. In preparation
42. Ocneanu, A.: Quantized group, string algebras and Galois theory for algebras. In: Operator algebras and applications, Vol. 2 (Warwick, 1987). D.E. Evans and M. Takesaki, (eds.), London Mathematical Society Lecture Note Series 36, Cambridge: Cambridge University Press, 1988, pp. 119–172
43. Ocneanu, A.: Quantum symmetry, differential geometry of finite graphs and classification of subfactors. University of Tokyo Seminary Notes 45, Notes recorded by Y. Kawahigashi, 1991
44. Popa, S.: Symmetric enveloping algebras, amenability and AFD properties for subfactors. Math. Res. Lett. 1, 409–425 (1994)
45. Popa, S.: Classification of subfactors and of their endomorphisms. CBMS Regional Conference Series. Amer. Math. Soc. 86 (1995)
46. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The Algebraic Theory of Superselection Sectors, D. Kastler, (ed.), Singapore: World Scientific, 1990
47. Rehren, K.-H.: Chiral observables and modular invariants. Commun. Math. Phys. 208, 689–712 (2000)
48. Rehren, K.-H.: Canonical tensor product subfactors. Commun. Math. Phys. 211, 395–406 (2000)
49. Rehren, K.-H.: Locality and modular invariance in 2D conformal QFT. In: Mathematical Physics in Mathematics and Physics, R. Longo, (ed.), Fields Inst. Commun. 30, AMS Publications, Providence, RI: AMS, 2001, pp. 341–354, math-ph/0009004
50. Turaev, V.G.: Quantum invariants of knots and 3-manifolds. Berlin-New York: Walter de Gruyter, 1994
51. Takesaki, M.: Theory of Operator Algebras. Vol. I, II, III, Springer Encyclopaedia of Math. Sci. 124 (2002), 125, 127 (2003)
52. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998)
53. Xu, F.: Algebraic coset conformal field theories I. Commun. Math. Phys. 211, 1–44 (2000)
54. Xu, F.: Strong additivity and conformal nets. Preprint 2003, math.QA/0303266

Communicated by A. Connes
Commun. Math. Phys. 244, 99–109 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0976-4
Nonlinear Stability of Boundary Layers of the Boltzmann Equation, I. The case $\mathcal{M}_\infty < -1$

Seiji Ukai¹, Tong Yang², Shih-Hsien Yu²

¹ Department of Applied Mathematics, Yokohama National University, Yokohama, Japan
² Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong, P.R. China
Received: 4 March 2003 / Accepted: 3 July 2003 Published online: 11 November 2003 – © Springer-Verlag 2003
Abstract: This is a continuation of the paper [15] on nonlinear boundary layers of the Boltzmann equation, where existence was established and shown to depend strongly on the Mach number $\mathcal{M}_\infty$ of the Maxwellian state at the far field. In this paper, when $\mathcal{M}_\infty < -1$, we show that the linearized operator decays exponentially in time, and therefore a bootstrapping argument yields nonlinear stability of the boundary layers.

1. Introduction and Main Result

The nonlinear Milne problem can be stated as follows. Consider the 3-dimensional half-space $D = \{(x, y, z) \in \mathbb{R}^3 \mid x > 0\}$, in which the mass density $F$ of gas particles is assumed constant on each plane parallel to the boundary $\partial D = \{x = 0\}$, although the particle motion is 3-dimensional. That is, $F$ is assumed to be a function of position $x$ (but not of $y$, $z$) and particle velocity $\xi = (\xi_1, \xi_2, \xi_3) \in \mathbb{R}^3$. Here, $\xi_1$ stands for the velocity component along the $x$-axis. Then, $F$ is governed by the stationary Boltzmann equation
\[
\begin{cases}
\xi_1 F_x = Q(F, F), & x > 0,\ \xi \in \mathbb{R}^3, \\
F|_{x=0} = F_b(\xi), & \xi_1 > 0,\ (\xi_2, \xi_3) \in \mathbb{R}^2, \\
F \to M_\infty(\xi) \ (x \to \infty), & \xi \in \mathbb{R}^3,
\end{cases}
\tag{1.1}
\]
where
\[
M_\infty(\xi) = M[\rho_\infty, u_\infty, T_\infty](\xi) = \frac{\rho_\infty}{(4\pi T_\infty)^{3/2}} \exp\left( -\frac{|\xi - u_\infty|^2}{2 T_\infty} \right)
\tag{1.2}
\]
is a Maxwellian with constants ρ∞ > 0, u∞ = (u∞,1 , u∞,2 , u∞,3 ) ∈ R3 , and T∞ > 0 which are the macroscopic components in the particle distribution F . By a shift of the
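variable $\xi$ in the direction orthogonal to the $x$-axis, we can assume without loss of generality that $u_{\infty,2} = u_{\infty,3} = 0$, and then the sound speed and Mach number of this equilibrium state are given by
\[
c_\infty = \sqrt{\tfrac{5}{3} T_\infty}, \qquad \mathcal{M}_\infty = \frac{u_{\infty,1}}{c_\infty},
\tag{1.3}
\]
respectively, see [4]. Here, $Q$, the collision operator, is a bilinear integral operator
\[
Q(F, G) = \frac{1}{2} \int_{\mathbb{R}^3 \times S^2} \bigl( F(\xi') G(\xi_*') + F(\xi_*') G(\xi') - F(\xi) G(\xi_*) - F(\xi_*) G(\xi) \bigr)\, q(\xi - \xi_*, \omega)\, d\xi_*\, d\omega,
\tag{1.4}
\]
with
\[
\xi' = \xi - [(\xi - \xi_*) \cdot \omega]\, \omega, \qquad \xi_*' = \xi_* + [(\xi - \xi_*) \cdot \omega]\, \omega,
\tag{1.5}
\]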
where "$\cdot$" is the inner product of $\mathbb{R}^3$. We restrict ourselves to the hard sphere gas, for which the collision kernel $q$ is given by $q(\zeta, \omega) = \sigma_0 |\zeta \cdot \omega|$, where $\sigma_0$ is the surface area of the hard sphere.

The existence of stationary solutions, called boundary layer solutions, to the problem (1.1) was studied recently in [15]. The result there shows that the existence of boundary layer solutions depends on the Mach number $\mathcal{M}_\infty$ at $x = \infty$. When $\mathcal{M}_\infty \neq 0, \pm 1$, a solvability condition is given implicitly, so that the co-dimension of the manifold of boundary data $F_b(\xi)$ is obtained. In the simplest case, i.e., $\mathcal{M}_\infty < -1$, there is no extra solvability condition because all the information at infinity goes into the layer. This means that as long as the boundary data $F_b$ is close to the Maxwellian at $x = \infty$ under some suitable norm, the boundary layer solution always exists.

As a first step in studying the stability of the boundary layer solutions obtained in [15], we consider the case $\mathcal{M}_\infty < -1$. The main reason why this case is the easiest is that the linearized problem exhibits exponential decay, and this decay estimate is easier to handle in the bootstrapping argument for nonlinear stability. In the other cases, the decay rate should be algebraic, as for the Cauchy problem, so that the analysis is more difficult and will be pursued by the authors in the future.

For the boundary layer problem, there are many results on linear existence, stability and numerical computation, cf. [1, 2, 5–8, 12–14]. Since we discuss the stability problem in this paper, we will not present these works in detail.

The main result of this paper can be stated as follows. Let $\bar F = \bar F(x, \xi)$ be the stationary solution to the problem (1.1). Consider the initial boundary value problem
\[
\begin{cases}
F_t + \xi_1 F_x = Q(F, F), & t > 0,\ x > 0,\ \xi \in \mathbb{R}^3, \\
F|_{t=0} = F_0(x, \xi), & x > 0,\ \xi \in \mathbb{R}^3, \\
F|_{x=0} = F_b(\xi), & t > 0,\ \xi_1 > 0,\ (\xi_2, \xi_3) \in \mathbb{R}^2, \\
F \to M_\infty(\xi) \ (x \to \infty), & t > 0,\ \xi \in \mathbb{R}^3.
\end{cases}
\tag{1.6}
\]
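As a quick sanity check on the collision rule (1.5), the following minimal Python sketch draws one random pair of velocities and a random unit vector, applies (1.5), and measures how far the total momentum, the total kinetic energy, and the hard sphere kernel argument $|(\xi - \xi_*) \cdot \omega|$ move. The function names `dot` and `collide`, the seed, and the Gaussian sampling are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def dot(a, b):
    """Euclidean inner product on R^3."""
    return sum(p * q for p, q in zip(a, b))


def collide(xi, xi_star, omega):
    """Apply the collision rule (1.5) for a unit vector omega (hypothetical helper)."""
    a = dot([p - q for p, q in zip(xi, xi_star)], omega)
    xi_new = [p - a * w for p, w in zip(xi, omega)]
    xi_star_new = [q + a * w for q, w in zip(xi_star, omega)]
    return xi_new, xi_star_new


random.seed(0)  # arbitrary seed, used only for reproducibility
xi = [random.gauss(0.0, 1.0) for _ in range(3)]
xi_star = [random.gauss(0.0, 1.0) for _ in range(3)]
omega = [random.gauss(0.0, 1.0) for _ in range(3)]
length = math.sqrt(dot(omega, omega))
omega = [w / length for w in omega]  # normalize so that omega lies on S^2

xi_new, xi_star_new = collide(xi, xi_star, omega)

# Largest component-wise change of the total momentum xi + xi_star.
momentum_gap = max(
    abs((p + q) - (r + s))
    for p, q, r, s in zip(xi_new, xi_star_new, xi, xi_star)
)
# Change of the total kinetic energy |xi|^2 + |xi_star|^2.
energy_gap = abs(
    dot(xi_new, xi_new) + dot(xi_star_new, xi_star_new)
    - dot(xi, xi) - dot(xi_star, xi_star)
)
# Change of |(xi - xi_star) . omega|, the quantity entering the hard sphere kernel.
before = abs(dot([p - q for p, q in zip(xi, xi_star)], omega))
after = abs(dot([p - q for p, q in zip(xi_new, xi_star_new)], omega))
kernel_gap = abs(before - after)
```

All three gaps should vanish up to rounding, which reflects the conservation of momentum and energy in a binary collision and the invariance of $q$ under (1.5).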
Theorem 1.1. When $\mathcal{M}_\infty < -1$, under the assumption that
\[
|F_b(\xi) - M_\infty(\xi)| \le \varepsilon_0 W_\beta(\xi), \qquad \xi \in \mathbb{R}^3_+, \qquad \beta > 5/2,
\]
with the weight function $W_\beta(\xi)$ defined in (2.1) and $\varepsilon_0$ being a sufficiently small positive constant, there exists a boundary layer solution $\bar F(x, \xi)$ to (1.1), as proved in [15]. For (1.6), when $[[F_0(x, \xi) - \bar F(x, \xi)]] < \varepsilon_1$ with $\beta > 5/2$, where $\varepsilon_1 > 0$ is a sufficiently small constant and the norm $[[\cdot]]$ is defined in (2.28), there exists a unique solution $F(t, x, \xi)$ to the problem (1.6) which decays exponentially in time to the stationary solution $\bar F(x, \xi)$. In other words, the boundary layer solution in this case is nonlinearly stable.

Remark 1.2. We prove the global existence in the setting of the contraction mapping principle associated to the reduced problem (2.7) related to the quantity $F - \bar F$, in the space endowed with the norm (2.30). Hence, the asymptotic stability is a straightforward consequence of it. As for the existence, the method in [11] may work for (1.1).

The proof of our theorem is given in the following section. We will first consider two semigroups associated with two linearized problems of (1.6) and show that they both have the exponential decay property. Then, by applying the bootstrapping argument and the smallness of the strength of the boundary layer, we will obtain the nonlinear stability result stated in Theorem 1.1. In the following, $c$ is used to denote a generic positive constant.

2. Stability Analysis

The stability problem for (1.6) can be discussed in two steps. The first step is to consider the corresponding linearized problem by the energy method for $L^2_{x,\xi}$ and then the bootstrapping argument for $L^\infty_{x,\xi}$. The exponential decay in time estimate obtained in the first step can be used in the second step for nonlinear stability, by using Grad's estimate on the nonlinear Boltzmann collision term to obtain an a priori estimate on the solution for the application of the fixed point theorem. In the following, we will use the weight function
\[
W_\beta(\xi) = (1 + |\xi|)^{-\beta} \bigl\{ M[1, u_\infty, T_\infty](\xi) \bigr\}^{1/2},
\tag{2.1}
\]
with $\beta \in \mathbb{R}$.
First, we shall look for the solution of (1.6) in the form
\[
F(t, x, \xi) = M_\infty(\xi) + W_0(\xi) f(t, x, \xi),
\tag{2.2}
\]
where $W_0$ is the weight of (2.1) with $\beta = 0$. Then, the problem (1.6) reduces to
\[
\begin{cases}
f_t + \xi_1 f_x - L f = \Gamma(f), & t > 0,\ x > 0,\ \xi \in \mathbb{R}^3, \\
f|_{t=0} = f_0(x, \xi), & x > 0,\ \xi \in \mathbb{R}^3, \\
f|_{x=0} = a_0(\xi), & t > 0,\ \xi_1 > 0,\ (\xi_2, \xi_3) \in \mathbb{R}^2, \\
f \to 0 \ (x \to \infty), & t > 0,\ \xi \in \mathbb{R}^3,
\end{cases}
\tag{2.3}
\]
where
\[
a_0 = W_0^{-1} \bigl( F_b - M_\infty \bigr),
\]
and
\[
L f = W_0^{-1} \bigl\{ Q(M_\infty, W_0 f) + Q(W_0 f, M_\infty) \bigr\}, \qquad \Gamma(f) = \Gamma(f, f),
\tag{2.4}
\]
with
\[
\Gamma(f, g) = W_0^{-1} Q(W_0 f, W_0 g).
\]
The operator $L$ is linear while the remainder $\Gamma$ is quadratic, both acting only on the variable $\xi$. The following properties (and nothing else) of them will be used in the sequel. Set $L^p_\xi = L^p(\mathbb{R}^3_\xi)$ and $L^\infty_{\xi,\beta} = L^\infty(\mathbb{R}^3_\xi, W_\beta(\xi)\, d\xi)$.

Proposition 2.1. For the hard sphere model, the following holds with some positive constants $\nu_0, \nu_1, k_0, k_1, k_2$ depending only on $\rho_\infty, u_\infty, T_\infty$.

(i) $L$ has the decomposition
\[
L = -\nu(\xi) \times {} + K,
\]
where $\nu(\xi)$ is a positive function satisfying
\[
\nu_0 \le \nu(\xi) \le \nu_0^{-1} (1 + |\xi|), \qquad \xi \in \mathbb{R}^3,
\]
whereas $K$ is an integral operator
\[
K h = \int_{\mathbb{R}^3} K(\xi, \xi') h(\xi')\, d\xi'
\]
with the kernel enjoying the estimate
\[
|K(\xi, \xi')| \le k_0 \bigl( |\xi - \xi'| + |\xi - \xi'|^{-1} \bigr) e^{-k_1 |\xi - \xi'|^2}.
\]

(ii) $L$ is non-positive self-adjoint on $L^2_\xi$, with the estimate
\[
(L h, h)_{L^2_\xi} \le -\nu_1 \| (1 + |\xi|)^{1/2} P^\perp h \|^2_{L^2_\xi},
\tag{2.5}
\]
where $P^\perp = I - P$, $P$ being the orthogonal projection onto the null space $N$ of $L$.

(iii) $K$ has the regularizing property that it is bounded as an operator
\[
K : L^\infty_{\xi,\beta} \to L^\infty_{\xi,\beta+1} \qquad \text{and} \qquad K : L^2_\xi \to L^\infty_\xi
\]
for all $\beta \ge 0$.

(iv) The bilinear operator $\Gamma(f, g)$ enjoys the estimate
\[
\| \nu^{-1} \Gamma(f, g) \|_{L^\infty_{\xi,\beta}} \le k_3 \| f \|_{L^\infty_{\xi,\beta}} \| g \|_{L^\infty_{\xi,\beta}}
\]
for all $\beta$.
Proof. For $\rho_\infty = 1$, $u_\infty = 0$, and $T_\infty = 1$, that is, for the case of the standard Maxwellian $M^0(\xi) = M[1, 0, 1](\xi)$, all the statements above are found in, e.g., [4], pp. 197–198, except for (2.5), which is stated in [6]. Let $\nu^0(\xi)$ and $K^0(\xi, \xi')$ be the ones corresponding to the standard Maxwellian $M^0$. Their explicit formulas go back to [10, 3] (see also [4], pp. 196–197). Since
\[
M[\rho_\infty, u_\infty, T_\infty](\xi) = \alpha M^0(\gamma(\xi - u_\infty)),
\]
for $\alpha = \rho_\infty / T_\infty^{3/2}$ and $\gamma = 1 / T_\infty^{1/2}$, it follows from (2.4) that
\[
\nu(\xi) = c_0 \nu^0(\gamma(\xi - u_\infty)), \qquad K(\xi, \xi') = c_0 K^0(\gamma(\xi - u_\infty), \gamma(\xi' - u_\infty)),
\]
with $c_0 = \alpha / \gamma = \rho_\infty / T_\infty$, whence the proposition follows for the general Maxwellian.

This proposition is also valid for Grad's cut-off hard potential [9] with due modification, particularly with $(|\xi| + 1)^\delta$ ($\delta \in [0, 1]$) in place of $(|\xi| + 1)$ in (2.5). Since the model we consider is the hard sphere ($\delta = 1$), we can let $f = e^{-\sigma x} g$ in (2.3) and control by (2.5) (and by $P$) the term $\sigma \xi_1$ appearing in the deduced problem
\[
\begin{cases}
g_t + \xi_1 g_x - \sigma \xi_1 g - L g = e^{-\sigma x} \Gamma(g), & t > 0,\ x > 0,\ \xi \in \mathbb{R}^3, \\
g|_{t=0} = g_0(x, \xi), & x > 0,\ \xi \in \mathbb{R}^3, \\
g|_{x=0} = a_0(\xi), & t > 0,\ \xi_1 > 0,\ (\xi_2, \xi_3) \in \mathbb{R}^2, \\
g \to 0 \ (x \to \infty), & t > 0,\ \xi \in \mathbb{R}^3.
\end{cases}
\tag{2.6}
\]
Now, denote the stationary boundary layer solution to (2.6) by $\bar g$ and let the initial datum $g_0$ be a small perturbation of $\bar g$. Then the stability problem we consider can be formulated as follows:
\[
\begin{cases}
\tilde g_t + \xi_1 \tilde g_x - \sigma \xi_1 \tilde g - L \tilde g = e^{-\sigma x} \{ \bar L \tilde g + \Gamma(\tilde g) \}, & t > 0,\ x > 0,\ \xi \in \mathbb{R}^3, \\
\tilde g|_{t=0} = \tilde g_0(x, \xi), & x > 0,\ \xi \in \mathbb{R}^3, \\
\tilde g|_{x=0} = 0, & t > 0,\ \xi_1 > 0,\ (\xi_2, \xi_3) \in \mathbb{R}^2, \\
\tilde g \to 0 \ (x \to \infty), & t > 0,\ \xi \in \mathbb{R}^3,
\end{cases}
\tag{2.7}
\]
where $\tilde g = g - \bar g$, $\tilde g_0 = g_0 - \bar g$ and $\bar L \tilde g = 2 \Gamma(\bar g, \tilde g)$.

Let $S(t)$ be the solution operator (semi-group) of the linear problem
\[
\begin{cases}
h_t + \xi_1 h_x - \sigma \xi_1 h - L h = 0, & t > 0,\ x > 0,\ \xi \in \mathbb{R}^3, \\
h|_{x=0} = 0 \ (\xi_1 > 0), \quad h \to 0 \ (x \to \infty), & t > 0,\ \xi \in \mathbb{R}^3, \\
h|_{t=0} = h_0(x, \xi), & x > 0,\ \xi \in \mathbb{R}^3.
\end{cases}
\tag{2.8}
\]
Then we have $h = S(t) h_0$. For the case $\mathcal{M}_\infty < -1$, the $L^2$ decay estimate for (2.8) is easy to establish. Recall that in this case, the operator $A = P \xi_1 P$ introduced in our previous paper [15] is negative definite on $N$, whereas $L$ is also negative definite on $N^\perp$ with the estimate (2.5). Here, $P$ and $N$ are as in Proposition 2.1. Now, for a small $\sigma > 0$, a straightforward energy estimate gives
\[
\frac{1}{2} \frac{d}{dt} \| h(t) \|^2 + \langle |\xi_1| h^0, h^0 \rangle_- + \nu_2 \| (1 + |\xi|)^{1/2} h(t) \|^2 \le 0,
\]
with a constant $\nu_2 > 0$ (say $\nu_2 = \nu_1 / 2$), where $\| \cdot \| = \| \cdot \|_{L^2_{x,\xi}}$, $\langle \cdot, \cdot \rangle_- = (\cdot, \cdot)_{L^2_\xi(\xi_1 < 0)}$, and $h^0 = h|_{x=0}$. Since $\| h \| \le \| (1 + |\xi|)^{1/2} h \|$, this implies that
\[
\frac{d}{dt} \bigl( e^{\nu_2 t} \| h(t) \|^2 \bigr) + e^{\nu_2 t} \bigl\{ 2 \langle |\xi_1| h^0(t), h^0(t) \rangle_- + \nu_2 \| (1 + |\xi|)^{1/2} h(t) \|^2 \bigr\} \le 0.
\]
Then it follows that
\[
e^{\nu_2 t} \| h(t) \|^2 + \int_0^t e^{\nu_2 \tau} \bigl\{ 2 \langle |\xi_1| h^0(\tau), h^0(\tau) \rangle_- + \nu_2 \| (1 + |\xi|)^{1/2} h(\tau) \|^2 \bigr\}\, d\tau \le \| h_0 \|^2,
\tag{2.9}
\]
and
\[
\| S(t) h_0 \| \le e^{-\kappa t} \| h_0 \|, \qquad \kappa = \frac{\nu_2}{2}.
\tag{2.10}
\]
As for the existence analysis, we want to prove the following estimate, which is sufficient for the application of the fixed point theorem to get the global existence of the solution to the nonlinear problem (2.7):
\[
\| S(t) h_0 \|_\beta \le c\, e^{-\kappa t} \bigl( \| h_0 \|_\beta + \| h_0 \|_{L^2_{x,\xi}} \bigr),
\tag{2.11}
\]
for $\beta \ge 0$, where $\| \cdot \|_\beta$ is the norm of the space $L^\infty_\beta = L^\infty_{x,\xi}(W_\beta(\xi)\, dx\, d\xi)$. In order to prove (2.11), we first consider another, simpler linear solution operator. Let $\nu(\xi)$ be as in Proposition 2.1(i) and let $S_0(t)$ be the solution operator (semi-group) of
\[
\begin{cases}
h_t + \xi_1 h_x - \sigma \xi_1 h + \nu(\xi) h = 0, & t > 0,\ x > 0,\ \xi \in \mathbb{R}^3, \\
h|_{x=0} = 0 \ (\xi_1 > 0), \quad h \to 0 \ (x \to \infty), & t > 0,\ \xi \in \mathbb{R}^3, \\
h|_{t=0} = h_0(x, \xi), & x > 0,\ \xi \in \mathbb{R}^3.
\end{cases}
\tag{2.12}
\]
The solution to the above linear initial boundary value problem has the following explicit expression:
\[
h = S_0(t) h_0 = e^{-(\nu(\xi) - \sigma \xi_1) t} \chi(x - \xi_1 t)\, h_0(x - \xi_1 t, \xi),
\tag{2.13}
\]
where $\chi(y)$ is the usual characteristic function for $y > 0$. Based on this expression and the lower bound $\nu(\xi) \ge \nu_0 > 0$, a simple calculation yields the following estimate on $S_0$:
\[
\| S_0(t) h_0 \|_X \le c\, e^{-(2\kappa - \varepsilon) t} \| h_0 \|_X,
\tag{2.14}
\]
with $\kappa$ chosen to be $\min(\nu_0, \nu_2)/2$, for some small constant $\varepsilon > 0$. Here the space $X$ can be either $L^\infty_\beta$ or $L^2_{x,\xi}$. From (2.8) and (2.12), we have
\[
\begin{cases}
S(t) h_0 = S_0(t) h_0 + \displaystyle\int_0^t S_0(t - s) K S(s) h_0\, ds = \sum_{j=0}^{m-1} I_j(t) + J_m(t), \\
I_0(t) = S_0(t) h_0, \\
I_j(t) = \displaystyle\int_0^t S_0(t - s) K I_{j-1}(s)\, ds = (S_0 K) * I_{j-1}, \\
J_m(t) = \underbrace{(S_0 K) * (S_0 K) * \cdots * (S_0 K)}_{m} * h,
\end{cases}
\tag{2.15}
\]
with $h = S(t) h_0$. Here and hereafter, "$*$" stands for the convolution in $t$. By using the estimate (2.14) and the regularizing property of the compact operator $K$ in Proposition 2.1(iii), we have, for $\beta \ge j \ge 0$,
\[
\| I_j(t) \|_\beta \le c_j\, e^{(-2\kappa + \varepsilon) t} \| h_0 \|_{\beta - j}.
\tag{2.16}
\]
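The explicit formula (2.13) that drives these bounds can be explored numerically. The sketch below is a toy version under stated assumptions: it keeps only the velocity component $\xi_1$, replaces the hard sphere collision frequency by the stand-in $\nu(\xi_1) = \nu_0 (1 + |\xi_1|)$, and uses illustrative values of $\nu_0$, $\sigma$ and an arbitrary initial datum bounded by 1. None of these choices come from the paper. It checks the semigroup property $S_0(t+s) = S_0(t) S_0(s)$ on a grid and the sup-norm decay at rate $\nu_0$, which holds here because $\nu(\xi_1) - \sigma \xi_1 \ge \nu_0$ whenever $\sigma \le \nu_0$.

```python
import math

NU0 = 1.0     # hypothetical lower bound nu_0 of the collision frequency
SIGMA = 0.25  # hypothetical small sigma, chosen below nu_0


def nu(xi1):
    """Stand-in collision frequency with linear growth (not the hard sphere formula)."""
    return NU0 * (1.0 + abs(xi1))


def S0(t, h0):
    """Return the function (x, xi1) -> (S0(t) h0)(x, xi1) from formula (2.13)."""
    def h(x, xi1):
        y = x - xi1 * t
        if y <= 0.0:
            # chi(y) = 0: the backward characteristic left the half-space x > 0.
            return 0.0
        return math.exp(-(nu(xi1) - SIGMA * xi1) * t) * h0(y, xi1)
    return h


def h0(x, xi1):
    """Arbitrary smooth initial datum with |h0| <= 1 on x > 0."""
    return math.exp(-x) / (1.0 + xi1 * xi1)


# Grid in (x, xi1); the times are picked so that no characteristic foot
# x - xi1 * t lands exactly on the boundary x = 0 for a grid point.
grid = [(0.1 * i, -3.0 + 0.5 * j) for i in range(1, 30) for j in range(13)]
t, s = 0.73, 0.41

semigroup_gap = max(
    abs(S0(t + s, h0)(x, v) - S0(t, S0(s, h0))(x, v)) for x, v in grid
)
sup_ht = max(abs(S0(t, h0)(x, v)) for x, v in grid)
decay_bound = math.exp(-NU0 * t)  # e^{-nu_0 t} times sup|h0|, and sup|h0| <= 1
```

In this toy setting `semigroup_gap` is at rounding level and `sup_ht` stays below `decay_bound`, mirroring the mechanism behind (2.14) for $X = L^\infty_\beta$.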
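The estimate on $J_m$ is more complicated and can be stated in the following bootstrapping lemma.

Lemma 2.2. For $\beta \ge 0$, we have
\[
\| J_{\beta+3}(t) \|_\beta \le c\, e^{-\kappa t} \| h_0 \|_{L^2_{x,\xi}}.
\]

Proof. First, again by the regularizing property of $K$ in Proposition 2.1(iii), we have
\[
\| J_{\beta+3}(t) \|_\beta \le \frac{C}{\beta!} \int_0^t (t - \tau)^\beta e^{-(2\kappa - \varepsilon)(t - \tau)} \| J_2 \|_{L^\infty_x(L^2_\xi)}(\tau)\, d\tau,
\tag{2.17}
\]
where
\[
J_2(t) = (S_0 K) * (S_0 K) * h = S_0 * \bar J,
\tag{2.18}
\]
with
\[
\bar J = K S_0 K * h = \int_0^t K S_0(t - s) K h(s)\, ds = \int_0^t \bar J_0(t - s, s)\, ds.
\tag{2.19}
\]
We now estimate $\bar J_0(t, s)$ as follows. Here, we need to use some integral property of the compact operator $K$. By definition, we have
\[
\bar J_0(t, s) = K S_0(t) K h(s) = \int_{\mathbb{R}^3 \times \mathbb{R}^3} K(\xi, \xi') K(\xi', \xi'') e^{-(\nu(\xi') - \sigma \xi_1') t} \chi(y)\, h(s, y, \xi'')\, d\xi'\, d\xi'',
\tag{2.20}
\]
where $y = x - \xi_1' t$. Hence,
\[
|\bar J_0(t, s)| \le e^{-(\nu_0 - \varepsilon) t} \int_{\mathbb{R} \times \mathbb{R}^3} K_0(\xi, \xi_1', \xi'') \chi(y)\, |h(s, y, \xi'')|\, d\xi_1'\, d\xi'',
\tag{2.21}
\]
where
\[
K_0(\xi, \xi_1', \xi'') \equiv \int_{\mathbb{R}^2} |K(\xi, \xi')|\, |K(\xi', \xi'')|\, d\xi_2'\, d\xi_3',
\]
with $\xi' = (\xi_1', \xi_2', \xi_3')$. Notice that the estimate of the kernel $K(\xi, \xi')$ stated in Proposition 2.1(i) gives
\[
\int_{\mathbb{R}^3} |K(\xi, \xi')|\, d\xi' = \int_{\mathbb{R}^3} |K(\xi', \xi)|\, d\xi' \le C_0, \qquad \int_{\mathbb{R}^2} |K(\xi, \xi')|\, d\xi_2'\, d\xi_3' \le C_1,
\]
where $C_0$ and $C_1$ are some positive constants depending only on the parameters $\rho_\infty, u_\infty, T_\infty$. Thus, we have
\[
\int_{\mathbb{R} \times \mathbb{R}^3} K_0(\xi, \xi_1', \xi'')\, d\xi_1'\, d\xi'' = \int_{\mathbb{R}^3 \times \mathbb{R}^3} |K(\xi, \xi')|\, |K(\xi', \xi'')|\, d\xi'\, d\xi'' \le C_0^2,
\]
\[
\int_{\mathbb{R}^3} K_0(\xi, \xi_1', \xi'')\, d\xi'' \le C_0 \int_{\mathbb{R}^2} |K(\xi, \xi')|\, d\xi_2'\, d\xi_3' \le C_0 C_1.
\]
By (2.21) and the Schwarz inequality,
\[
\begin{aligned}
|\bar J_0(t, s)|^2 &\le e^{-2(2\kappa - \varepsilon) t} \int_{\mathbb{R} \times \mathbb{R}^3} K_0(\xi, \xi_1', \xi'')\, d\xi_1'\, d\xi'' \times \int_{\mathbb{R} \times \mathbb{R}^3} K_0(\xi, \xi_1', \xi'') \chi(y)\, |h(s, y, \xi'')|^2\, d\xi_1'\, d\xi'' \\
&\le C_0^2\, e^{-2(2\kappa - \varepsilon) t} \int_{\mathbb{R} \times \mathbb{R}^3} K_0(\xi, \xi_1', \xi'') \chi(y)\, |h(s, y, \xi'')|^2\, d\xi_1'\, d\xi''.
\end{aligned}
\tag{2.22}
\]
Therefore, we have
\[
\begin{aligned}
\| \bar J_0(t, s) \|^2_{L^\infty_x(L^2_\xi)} &= \sup_{x > 0} \int_{\mathbb{R}^3} |\bar J_0(t, s)|^2\, d\xi \\
&\le C_0^2\, C_0 C_1\, e^{-2(2\kappa - \varepsilon) t} \int_{\mathbb{R} \times \mathbb{R}^3} \chi(y)\, |h(s, y, \xi'')|^2\, d\xi_1'\, d\xi'' \\
&= \frac{c}{t}\, e^{-2(2\kappa - \varepsilon) t} \int_{\mathbb{R}^3} \int_0^\infty |h(s, y, \xi'')|^2\, dy\, d\xi'' \\
&\le \frac{c}{t}\, e^{-2(2\kappa - \varepsilon) t} e^{-2\kappa s} \| h_0 \|^2_{L^2_{x,\xi}}.
\end{aligned}
\tag{2.23}
\]
Here, we have used the $L^2$ decay estimate (2.10). Hence (2.19) and (2.23) give
\[
\begin{aligned}
\| \bar J(t) \|_{L^\infty_x(L^2_\xi)} &\le \int_0^t \| \bar J_0(t - s, s) \|_{L^\infty_x(L^2_\xi)}\, ds \\
&\le c \int_0^t \frac{e^{-(2\kappa - \varepsilon)(t - s)}}{\sqrt{t - s}}\, e^{-\kappa s} \| h_0 \|_{L^2_{x,\xi}}\, ds \\
&\le c\, e^{-\kappa t} \int_0^t \frac{e^{-(\kappa - \varepsilon)(t - s)}}{\sqrt{t - s}}\, ds\, \| h_0 \|_{L^2_{x,\xi}} \le c\, e^{-\kappa t} \| h_0 \|_{L^2_{x,\xi}}.
\end{aligned}
\tag{2.24}
\]
This and (2.14), (2.18) give
\[
\begin{aligned}
\| J_2(t) \|_{L^\infty_x(L^2_\xi)} = \| S_0 * \bar J \|_{L^\infty_x(L^2_\xi)} &\le \int_0^t e^{-(2\kappa - \varepsilon)(t - s)} \| \bar J(s) \|_{L^\infty_x(L^2_\xi)}\, ds \\
&\le c \int_0^t e^{-(2\kappa - \varepsilon)(t - s)} e^{-\kappa s}\, ds\, \| h_0 \|_{L^2_{x,\xi}} \\
&\le c\, e^{-\kappa t} \| h_0 \|_{L^2_{x,\xi}}.
\end{aligned}
\tag{2.25}
\]
Plugging this into (2.17) yields
\[
\| J_{\beta+3}(t) \|_\beta \le \frac{c}{\beta!}\, e^{-\kappa t} \int_0^t (t - \tau)^\beta e^{-(\kappa - \varepsilon)(t - \tau)}\, d\tau\, \| h_0 \|_{L^2_{x,\xi}} \le c\, e^{-\kappa t} \| h_0 \|_{L^2_{x,\xi}}.
\tag{2.26}
\]
This completes the proof of the lemma.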
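The two time integrals used in (2.24) and (2.25) have elementary closed forms, which makes the loss from rate $2\kappa - \varepsilon$ to rate $\kappa$ easy to inspect. The following minimal sketch compares them with a midpoint quadrature. The values of $\kappa$, $\varepsilon$, $t$ and the helper names are illustrative assumptions, and the closed forms are standard calculus identities rather than formulas quoted from the paper.

```python
import math


def exp_convolution(t, kappa, eps):
    """Closed form of int_0^t exp(-(2k - e)(t - s)) exp(-k s) ds, as in (2.25)."""
    return math.exp(-kappa * t) * (1.0 - math.exp(-(kappa - eps) * t)) / (kappa - eps)


def singular_integral(t, a):
    """Closed form of int_0^t exp(-a u) / sqrt(u) du = sqrt(pi / a) erf(sqrt(a t))."""
    return math.sqrt(math.pi / a) * math.erf(math.sqrt(a * t))


def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule, used only as an independent numerical check."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))


kappa, eps, t = 0.5, 0.05, 3.0  # illustrative values with 0 < eps < kappa

# Regular convolution integral from (2.25).
conv_numeric = midpoint(
    lambda s: math.exp(-(2 * kappa - eps) * (t - s)) * math.exp(-kappa * s), 0.0, t
)
conv_gap = abs(conv_numeric - exp_convolution(t, kappa, eps))

# Singular integral from (2.24); the substitution u = v^2 removes the
# 1/sqrt(u) singularity: int_0^t e^{-a u}/sqrt(u) du = 2 int_0^sqrt(t) e^{-a v^2} dv.
a = kappa - eps
sing_numeric = 2.0 * midpoint(lambda v: math.exp(-a * v * v), 0.0, math.sqrt(t))
sing_gap = abs(sing_numeric - singular_integral(t, a))

# Uniform-in-time constants that play the role of the generic constant c.
conv_constant = 1.0 / (kappa - eps)
sing_constant = math.sqrt(math.pi / (kappa - eps))
```

For every $t \ge 0$ the first integral is at most $e^{-\kappa t}/(\kappa - \varepsilon)$ and the second is at most $\sqrt{\pi/(\kappa - \varepsilon)}$, so both steps cost only a constant that depends on $\kappa - \varepsilon$.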
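This lemma and (2.16) complete the proof of the $L^\infty_\beta$ decay estimate (2.11). In order to estimate the nonlinear term $\Gamma(\tilde g)$ and the coupling term $\bar L \tilde g$ in (2.7) by Proposition 2.1(iv), we also need the following lemma.

Lemma 2.3. When $\beta \ge 0$, for the two semigroups $S_0$ and $S$, we have
\[
\| S_0 * \nu(\xi) h \|_\beta(t) \le c\, e^{-\kappa t} \sup_{0 \le \tau \le t} \{ e^{\kappa \tau} \| h \|_\beta(\tau) \},
\]
\[
\| S * \nu(\xi) h \|_\beta(t) \le c\, e^{-\kappa t / 2} \Bigl\{ \sup_{0 \le \tau \le t} \bigl( e^{\kappa \tau / 2} \| h \|_\beta(\tau) \bigr) + \sup_{0 \le \tau \le t} \bigl( e^{\kappa \tau / 2} \| \nu h \|_{L^2_{x,\xi}}(\tau) \bigr) \Bigr\},
\]
both for every function $h(t, x, \xi)$ with the relevant norm bounded.

Proof. First, by the special property of the semigroup $S_0$ and the linear growth rate of $\nu(\xi)$, we have
\[
\begin{aligned}
\| S_0 * \nu h \|_\beta &\le \sup_{x, \xi} \int_0^t (1 + |\xi|^\beta)\, e^{-(\nu(\xi) - \sigma \xi_1)(t - s)} \chi(x - \xi_1 (t - s))\, \nu(\xi)\, |h(s, x - \xi_1 (t - s), \xi)|\, ds \\
&\le e^{-\kappa t} \sup_{0 \le \tau \le t} \{ e^{\kappa \tau} \| h \|_\beta(\tau) \}\, \sup_\xi \int_0^t e^{-(\nu(\xi) - \kappa - \sigma \xi_1)(t - s)} \nu(\xi)\, ds \\
&\le c\, e^{-\kappa t} \sup_{0 \le \tau \le t} \{ e^{\kappa \tau} \| h \|_\beta(\tau) \}.
\end{aligned}
\]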
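The last step above relies on the factor $\nu(\xi)$ being absorbed by the time integral, uniformly in $\xi$ and $t$. A minimal sketch of this absorption follows, under assumptions that are not in the paper: only the component $\xi_1$ is kept, the collision frequency is replaced by the stand-in $\nu(\xi_1) = \nu_0 (1 + |\xi_1|)$ (which grows linearly from below as well as from above), and $\kappa = \nu_0 / 2$, $\sigma = \nu_0 / 4$ are illustrative values.

```python
import math

NU0 = 1.0          # hypothetical lower bound nu_0
KAPPA = NU0 / 2.0  # illustrative kappa, consistent with kappa <= nu_0 / 2
SIGMA = NU0 / 4.0  # illustrative small sigma


def nu(xi1):
    """Stand-in collision frequency with linear growth (not the hard sphere formula)."""
    return NU0 * (1.0 + abs(xi1))


def time_integral(t, xi1):
    """Closed form of int_0^t exp(-(nu - kappa - sigma * xi1)(t - s)) * nu ds."""
    rate = nu(xi1) - KAPPA - SIGMA * xi1
    return nu(xi1) * (1.0 - math.exp(-rate * t)) / rate


velocities = [0.5 * k for k in range(-200, 201)]  # xi1 in [-100, 100]
times = (0.1, 1.0, 10.0, 100.0)
worst = max(time_integral(t, v) for t in times for v in velocities)
```

With these choices the decay rate satisfies $\nu - \kappa - \sigma \xi_1 \ge \nu / 2$, so the integral never exceeds 2, even though $\nu$ itself is unbounded. This is the sense in which the constant $c$ in the first estimate of Lemma 2.3 does not depend on $\xi$.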
To give the estimate for $S$, we use the relation between $S$ and $S_0$, $S = S_0 + S_0 * K S$. First, write (2.11) as
\[
\| S(t) h_0 \|_\beta \le c\, e^{-\kappa t} [[h_0]]_\beta,
\tag{2.27}
\]
with
\[
[[\cdot]]_\beta = \| \cdot \|_\beta + \| \cdot \|_{L^2_{x,\xi}}.
\tag{2.28}
\]
We assume $\beta \ge 1$, but the proof is similar for other $\beta$. By the regularizing property of the operator $K$ again, we have
\[
\begin{aligned}
\| S_0 * K S * \nu h \|_\beta &\le \int_0^t e^{-\kappa (t - s)} \| K S * \nu h \|_\beta(s)\, ds \\
&\le c \int_0^t e^{-\kappa (t - s)} \| S * \nu h \|_{\beta - 1}(s)\, ds \\
&\le c \int_0^t \int_0^s e^{-\kappa (t - s)} e^{-\kappa (s - \tau)} [[\nu h]]_{\beta - 1}(\tau)\, d\tau\, ds \qquad \text{(by (2.27))} \\
&\le c \sup_{0 \le \tau \le t} \{ e^{\kappa \tau / 2} [[\nu h]]_{\beta - 1}(\tau) \} \int_0^t e^{-\kappa (t - s)} e^{-\kappa s / 2}\, s\, ds \\
&\le c\, e^{-\kappa t / 2} \sup_{0 \le \tau \le t} \{ e^{\kappa \tau / 2} [[\nu h]]_{\beta - 1}(\tau) \}.
\end{aligned}
\]
Combining this with the estimate for $S_0$, we have
\[
\| S * \nu(\xi) h \|_\beta(t) \le c\, e^{-\kappa t / 2} \Bigl\{ \sup_{0 \le \tau \le t} e^{\kappa \tau / 2} \| h \|_\beta(\tau) + \sup_{0 \le \tau \le t} e^{\kappa \tau / 2} [[\nu h]]_{\beta - 1}(\tau) \Bigr\}.
\]
Recalling the linear growth of $\nu(\xi)$ and the definition (2.28) completes the proof of the lemma.

By using the estimates in the above lemmas and (2.27), we can now construct a global solution to the nonlinear problem (2.7). The definition of the semigroup implies that
\[
\tilde g = S(t) \tilde g_0 + S * \{ e^{-\sigma x} ( \bar L \tilde g + \Gamma(\tilde g) ) \}.
\tag{2.29}
\]
Denote the right-hand side by $\Phi[\tilde g]$. We have
\[
\begin{aligned}
\| \Phi[\tilde g] \|_\beta &\le \| S(t) \tilde g_0 \|_\beta + \| S * \{ \nu \nu^{-1} e^{-\sigma x} ( \bar L \tilde g + \Gamma(\tilde g) ) \} \|_\beta \\
&\le c\, e^{-\kappa t / 2} \Bigl\{ [[\tilde g_0]]_\beta + \sup_{\tau \ge 0} e^{\kappa \tau / 2} \| e^{-\sigma x} \nu \nu^{-1} ( \bar L \tilde g + \Gamma(\tilde g) ) \|_\beta(\tau) + \sup_{\tau \ge 0} e^{\kappa \tau / 2} \| e^{-\sigma x} \nu \nu^{-1} ( \bar L \tilde g + \Gamma(\tilde g) ) \|_{L^2_{x,\xi}}(\tau) \Bigr\} \\
&\le c\, e^{-\kappa t / 2} \{ [[\tilde g_0]]_\beta + \| \bar g \|_\beta\, ||| \tilde g ||| + ||| \tilde g |||^2 \},
\end{aligned}
\]
where
\[
||| h ||| = \sup_{t \ge 0} \{ e^{\kappa t / 2} \| h \|_\beta(t) \}.
\tag{2.30}
\]
In the above we have used the estimate in Proposition 2.1(iv) and the relation
\[
\| e^{-\sigma x} \nu h \|^2_{L^2_{x,\xi}} \le \int_0^\infty e^{-2 \sigma x}\, dx \int_{\mathbb{R}^3} \nu^2(\xi) (1 + |\xi|)^{-2\beta}\, d\xi\, \| h \|^2_\beta = c \| h \|^2_\beta, \qquad \beta > \frac{5}{2}.
\]
Consequently, we have
\[
||| \Phi[\tilde g] ||| \le c \bigl( [[\tilde g_0]]_\beta + \| \bar g \|_\beta\, ||| \tilde g ||| + ||| \tilde g |||^2 \bigr),
\]
and similarly,
\[
||| \Phi[\tilde g] - \Phi[\tilde h] ||| \le c \bigl( \| \bar g \|_\beta\, ||| \tilde g - \tilde h ||| + ||| \tilde g + \tilde h |||\, ||| \tilde g - \tilde h ||| \bigr),
\]
with the same constant $c$. The smallness assumption on $[[\tilde g_0]]_\beta$ and that on $\| \bar g \|_\beta$, which follows from the smallness assumption on the boundary data $a_0$ in (2.6), now assure that the nonlinear map $\Phi$ is a contraction in a small ball of the Banach space defined with the norm (2.30), and therefore a unique fixed point exists. This implies, taking into account the choice of the norm (2.30), that (2.7) has a unique global in time solution converging exponentially to 0 as $t \to \infty$ in the norm (2.28). Thus Theorem 1.1 follows.

Acknowledgement. The research of the first author was supported by Grant-in-Aid for Scientific Research (C) 136470207, Japan Society for the Promotion of Science (JSPS). The research of the second author was supported by the Competitive Earmarked Research Grant of Hong Kong CityU 1092/02P# 9040737. The research of the third author was supported by the Competitive Earmarked Research Grant of Hong Kong # 9040645.
References
1. Aoki, K., Nishino, K., Sone, Y., Sugimoto, H.: Numerical analysis of steady flows of a gas condensing on or evaporating from its plane condensed phase on the basis of kinetic theory: Effect of gas motion along the condensed phase. Phys. Fluids A 3, 2260–2275 (1991)
2. Bardos, C., Caflisch, R.E., Nicolaenko, B.: The Milne and Kramers problems for the Boltzmann equation of a hard sphere gas. Comm. Pure Appl. Math. 49, 323–352 (1986)
3. Carleman, T.: Sur La Théorie de l'Équation Intégrodifférentielle de Boltzmann. Acta Mathematica 60, 91–142 (1932)
4. Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. Berlin: Springer-Verlag, 1994
5. Cercignani, C.: Half-space problem in the kinetic theory of gases. In: E. Kröner, K. Kirchgässner, (eds.), Trends in Applications of Pure Mathematics to Mechanics, Berlin: Springer-Verlag, 1986, pp. 35–50
6. Coron, F., Golse, F., Sulem, C.: A classification of well-posed kinetic layer problems. Commun. Pure Appl. Math. 41, 409–435 (1988)
7. Golse, F., Perthame, B., Sulem, C.: On a boundary layer problem for the nonlinear Boltzmann equation. Arch. Rat. Mech. Anal. 103(1), 81–96 (1988)
8. Golse, F., Poupaud, F.: Stationary solutions of the linearized Boltzmann equation in a half-space. Math. Methods Appl. Sci. 11, 483–502 (1989)
9. Grad, H.: Asymptotic Theory of the Boltzmann Equation. In: Rarefied Gas Dynamics, J.A. Laurmann, (ed.), Vol 1, 26, New York: Academic Press, 1963, pp. 26–59
10. Hilbert, D.: Grundzüge einer Allgemeinen Theorie der Linearen Integralgleichungen. (German) New York, N.Y.: Chelsea Publishing Company, 1953, pp. xxvi+282
11. Lions, P.-L.: Conditions at infinity for Boltzmann's equation. Commun. Partial Diff. Eqs. 19, 335–367 (1994)
12. Sone, Y.: Kinetic Theory of Evaporation and Condensation-Linear and Nonlinear Problems. J. Phys. Soc. Japan 45(1), (1978)
13. Sone, Y.: Kinetic Theory and Fluid Dynamics. Berlin: Birkhäuser, 2002
14. Ukai, S.: On the half-space problem for the discrete velocity model of the Boltzmann equation. In: Advances in Nonlinear Partial Differential Equations and Stochastic, Kawashima, T. Yangisawa, (eds.), Series on Advances in Mathematics for Applied Sciences, Vol. 48, Singapore–New York: World Scientific, 1998, pp. 160–174
15. Ukai, S., Yang, T., Yu, S.-H.: Nonlinear Boundary Layers of the Boltzmann Equation: I, Existence. Commun. Math. Phys. 236, 373–393 (2003)

Communicated by H.-T. Yau
Commun. Math. Phys. 244, 111–131 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0966-6
Young Wall Realization of Crystal Graphs for $U_q(C_n^{(1)})$

Jin Hong¹, Seok-Jin Kang², Hyeonmi Lee²

¹ National Security Research Institute, 161 Gajeong-Dong, Yuseong-Gu, Daejeon 305-350, Korea. E-mail: [email protected]
² Korea Institute for Advanced Study, 207-43 Cheongryangri-dong, Dongdaemun-Gu, Seoul 130-722, Korea. E-mail: [email protected]; [email protected]
Abstract: We give a realization of crystal graphs for basic representations of the (1) quantum affine algebra Uq (Cn ) using combinatorics of Young walls. The notion of splitting blocks plays a crucial role in the construction of crystal graphs. 1. Introduction In [10] and [13], Kashiwara and Lusztig independently developed the crystal basis theory (or canonical basis theory) for integrable modules over quantum groups associated with symmetrizable Kac-Moody algebras. A crystal basis can be viewed as a basis at q = 0 and has a structure of colored oriented graph, called the crystal graph, defined by Kashiwara operators. The crystal graphs have many nice combinatorial features reflecting the internal structure of integrable modules over quantum groups. For example, one can compute the characters of integrable representations by finding an explicit combinatorial description of crystal graphs. Moreover, the crystal graphs have extremely simple behavior with respect to taking the tensor product. Thus, the crystal basis theory provides us with a very powerful combinatorial method of studying the structure of integrable modules over quantum groups. Let Uq (g) be a quantum group associated with a symmetrizable Kac-Moody algebra g and let V (λ) denote the irreducible highest weight Uq (g)-module with a dominant integral highest weight λ. One of the most interesting problems in crystal basis theory is to find an explicit realization of the crystal graph B(λ) of V (λ), called the irreducible highest weight crystal with highest weight λ. When g is a classical Lie algebra, the crystal graph B(λ) can be realized as the set of semistandard Young tableaux of a given shape satisfying certain additional conditions [11]. This work was supported by KOSEF Grant # 98-0701-01-5-L and theYoung Scientist Award, Korean Academy of Science and Technology. This work was supported by BK21 Project, Mathematical Sciences Division, Seoul National University.
J. Hong, S.-J. Kang, H. Lee

For the quantum affine algebras of type $A_{n-1}^{(1)}$, Misra and Miwa constructed the crystal graphs $B(\lambda)$ for basic representations using colored Young diagrams [14]. Their idea was extended to construct crystal graphs for irreducible highest weight $U_q(A_{n-1}^{(1)})$-modules of arbitrary higher level [4]. The crystal graphs constructed in [4] and [14] can be parameterized by certain paths which arise naturally in the theory of solvable lattice models. Motivated by this observation, Kang, Kashiwara, Misra, Miwa, Nakashima, and Nakayashiki developed the theory of perfect crystals for quantum affine algebras and gave a realization of crystal graphs $B(\lambda)$ over classical quantum affine algebras of arbitrary higher level in terms of paths [7–9].

In [5], Kang introduced the notion of Young walls as a new combinatorial scheme for realizing the crystal graphs for quantum affine algebras. The Young walls consist of colored blocks with various shapes that are built on a given ground state wall, and they can be viewed as generalizations of colored Young diagrams. For the classical quantum affine algebras of type $A_n^{(1)}$ (n ≥ 1), $A_{2n-1}^{(2)}$ (n ≥ 3), $D_n^{(1)}$ (n ≥ 4), $A_{2n}^{(2)}$ (n ≥ 1), $D_{n+1}^{(2)}$ (n ≥ 2), and $B_n^{(1)}$ (n ≥ 3), the crystal graphs $B(\lambda)$ of the basic representations were realized as the affine crystals consisting of reduced proper Young walls. However, for the quantum affine algebras of type $C_n^{(1)}$ (n ≥ 2), the problem of Young wall realization of crystal graphs was left open.

The purpose of this paper is to fill in this missing part: we develop the combinatorics of Young walls for the quantum affine algebras $U_q(C_n^{(1)})$ (n ≥ 2) and give a realization of crystal graphs $B(\lambda)$ for the basic representations as the sets of reduced proper Young walls. This case is more difficult to deal with than the other classical quantum affine algebras, because the level-1 perfect crystals for this case are intrinsically of level-2. The notion of splitting blocks was introduced to overcome this difficulty. We believe this notion will play a crucial role in the construction of higher level irreducible highest weight crystals for all classical quantum affine algebras.

2. Quantum Group $U_q(C_n^{(1)})$ and its Level-1 Perfect Crystal

We refer the readers to the references cited in the introduction, or to the books on quantum groups [2, 3], for the basic concepts on quantum groups and crystal bases. Familiarity with at least the following concepts will be assumed: quantum group, crystal basis, irreducible highest weight crystal, (abstract) crystal, perfect crystal, ground state path, λ-path, signature (of a path), path space. A clear understanding of the Young wall theory for any one of the affine types developed prior to this work will be immensely helpful in reading this paper, although it is not a logical prerequisite.

Let us fix basic notations here.
• $U_q = U_q(C_n^{(1)})$ : quantum group of type $C_n^{(1)}$ (n ≥ 2).
• $I = \{0, 1, \dots, n\}$ : index set.
• $A = (a_{ij})_{i,j \in I}$ : generalized Cartan matrix of type $C_n^{(1)}$.
• $P^\vee = (\oplus_{i \in I} \mathbb{Z} h_i) \oplus \mathbb{Z} d$ : dual weight lattice.
• $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} P^\vee$ : Cartan subalgebra.
• $\alpha_i$, $\delta$, $\Lambda_i$ : simple root, null root, fundamental weight.
• $P = (\oplus_{i \in I} \mathbb{Z} \Lambda_i) \oplus \mathbb{Z} \delta$ : weight lattice.
• $e_i$, $K_i^{\pm 1}$, $f_i$, $q^d$ : generators of $U_q(C_n^{(1)})$.
• $U_q' = U_q'(C_n^{(1)})$ : subalgebra of $U_q$ generated by $e_i$, $K_i^{\pm 1}$, $f_i$ (i ∈ I).
• $\overline{P} = \oplus_{i \in I} \mathbb{Z} \Lambda_i$ : classical weight lattice.
• $\mathrm{wt}$, $\overline{\mathrm{wt}}$ : (affine) weight, classical weight.
• $B(\Lambda_i)$ : irreducible highest weight crystal of highest weight $\Lambda_i$.
• $B^{(1)}$ : level-1 perfect crystal of type $C_n^{(1)}$.
• $\tilde{e}_i$, $\tilde{f}_i$ : Kashiwara operators.
• $\mathcal{P}(\Lambda_i)$ : the set of $\Lambda_i$-paths (with crystal structure).
We cite two theorems that are crucial for our work. The first is the path realization of irreducible highest weight crystals.

Theorem 2.1. The path space is isomorphic to the irreducible highest weight crystal: $B(\Lambda_i) \cong \mathcal{P}(\Lambda_i)$.

Below is a perfect crystal of type $C_n^{(1)}$, introduced in [6]. We use a special case of this result.

Theorem 2.2. A level-1 perfect crystal of type $C_n^{(1)}$ is given as follows:
$$B^{(1)} = \Bigl\{ (x_1, \dots, x_n \mid \bar{x}_n, \dots, \bar{x}_1) \Bigm| x_i, \bar{x}_i \in \mathbb{Z}_{\ge 0},\ \sum_{i=1}^{n} (x_i + \bar{x}_i) = 0 \text{ or } 2 \Bigr\}.$$
For $b = (x_1, \dots, x_n \mid \bar{x}_n, \dots, \bar{x}_1)$, the action of the Kashiwara operator $\tilde{f}_i$ on $B^{(1)}$ is given as follows. For i = 0,
$$\tilde{f}_0 b = \begin{cases} (x_1 + 2, x_2, \dots, \bar{x}_2, \bar{x}_1) & \text{if } x_1 \ge \bar{x}_1, \\ (x_1 + 1, x_2, \dots, \bar{x}_2, \bar{x}_1 - 1) & \text{if } x_1 = \bar{x}_1 - 1, \\ (x_1, x_2, \dots, \bar{x}_2, \bar{x}_1 - 2) & \text{if } x_1 \le \bar{x}_1 - 2. \end{cases}$$
For i = 1, …, n − 1,
$$\tilde{f}_i b = \begin{cases} (x_1, \dots, x_i - 1, x_{i+1} + 1, \dots, \bar{x}_1) & \text{if } x_{i+1} \ge \bar{x}_{i+1}, \\ (x_1, \dots, \bar{x}_{i+1} - 1, \bar{x}_i + 1, \dots, \bar{x}_1) & \text{if } x_{i+1} < \bar{x}_{i+1}. \end{cases}$$
For i = n,
$$\tilde{f}_n b = (x_1, \dots, x_n - 1 \mid \bar{x}_n + 1, \dots, \bar{x}_1).$$
The action of the Kashiwara operator $\tilde{e}_i$ on $B^{(1)}$ is given as follows. For i = 0,
$$\tilde{e}_0 b = \begin{cases} (x_1 - 2, x_2, \dots, \bar{x}_2, \bar{x}_1) & \text{if } x_1 \ge \bar{x}_1 + 2, \\ (x_1 - 1, x_2, \dots, \bar{x}_2, \bar{x}_1 + 1) & \text{if } x_1 = \bar{x}_1 + 1, \\ (x_1, x_2, \dots, \bar{x}_2, \bar{x}_1 + 2) & \text{if } x_1 \le \bar{x}_1. \end{cases}$$
For i = 1, …, n − 1,
$$\tilde{e}_i b = \begin{cases} (x_1, \dots, x_i + 1, x_{i+1} - 1, \dots, \bar{x}_1) & \text{if } x_{i+1} > \bar{x}_{i+1}, \\ (x_1, \dots, \bar{x}_{i+1} + 1, \bar{x}_i - 1, \dots, \bar{x}_1) & \text{if } x_{i+1} \le \bar{x}_{i+1}. \end{cases}$$
For i = n,
$$\tilde{e}_n b = (x_1, \dots, x_n + 1 \mid \bar{x}_n - 1, \dots, \bar{x}_1).$$
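The case distinctions of Theorem 2.2 can be tried out on a computer. The following Python sketch is only an illustration under our own conventions, which are not part of the theorem: an element b is stored as the flat tuple (x_1, …, x_n, x̄_n, …, x̄_1), the zero element is represented by `None`, and a result that falls outside $B^{(1)}$ is taken to be zero. The function names are hypothetical.

```python
from itertools import product


def in_B1(b):
    """Membership in B^(1): nonnegative entries whose sum is 0 or 2."""
    return all(v >= 0 for v in b) and sum(b) in (0, 2)


def elements(n):
    """All elements of B^(1) as flat tuples (x_1..x_n, xbar_n..xbar_1)."""
    return [b for b in product(range(3), repeat=2 * n) if in_B1(b)]


def kashiwara_f(i, b, n):
    """f~_i following the cases of Theorem 2.2; None stands for 0."""
    if b is None:
        return None
    c = list(b)
    x = lambda j: j - 1        # index of x_j in the flat tuple
    xb = lambda j: 2 * n - j   # index of xbar_j in the flat tuple
    if i == 0:
        if c[x(1)] >= c[xb(1)]:
            c[x(1)] += 2
        elif c[x(1)] == c[xb(1)] - 1:
            c[x(1)] += 1
            c[xb(1)] -= 1
        else:
            c[xb(1)] -= 2
    elif i < n:
        if c[x(i + 1)] >= c[xb(i + 1)]:
            c[x(i)] -= 1
            c[x(i + 1)] += 1
        else:
            c[xb(i + 1)] -= 1
            c[xb(i)] += 1
    else:
        c[x(n)] -= 1
        c[xb(n)] += 1
    return tuple(c) if in_B1(c) else None


def kashiwara_e(i, b, n):
    """e~_i following the cases of Theorem 2.2; None stands for 0."""
    if b is None:
        return None
    c = list(b)
    x = lambda j: j - 1
    xb = lambda j: 2 * n - j
    if i == 0:
        if c[x(1)] >= c[xb(1)] + 2:
            c[x(1)] -= 2
        elif c[x(1)] == c[xb(1)] + 1:
            c[x(1)] -= 1
            c[xb(1)] += 1
        else:
            c[xb(1)] += 2
    elif i < n:
        if c[x(i + 1)] > c[xb(i + 1)]:
            c[x(i)] += 1
            c[x(i + 1)] -= 1
        else:
            c[xb(i + 1)] += 1
            c[xb(i)] -= 1
    else:
        c[x(n)] += 1
        c[xb(n)] -= 1
    return tuple(c) if in_B1(c) else None
```

For instance, with n = 2 the call `kashiwara_f(0, (0, 0, 0, 0), 2)` returns `(2, 0, 0, 0)`, and applying `kashiwara_e(0, ...)` to that result gives back the zero vector `(0, 0, 0, 0)`.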
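The remaining maps describing the crystal structure on $B^{(1)}$ are given below:
$$\varphi_0(b) = 1 - \frac{1}{2} \sum_{i=1}^{n} (x_i + \bar{x}_i) + (\bar{x}_1 - x_1)_+,$$
$$\varphi_i(b) = x_i + (\bar{x}_{i+1} - x_{i+1})_+ \quad (i = 1, \dots, n-1),$$
$$\varphi_n(b) = x_n,$$
$$\varepsilon_0(b) = 1 - \frac{1}{2} \sum_{i=1}^{n} (x_i + \bar{x}_i) + (x_1 - \bar{x}_1)_+,$$
$$\varepsilon_i(b) = \bar{x}_i + (x_{i+1} - \bar{x}_{i+1})_+ \quad (i = 1, \dots, n-1),$$
$$\varepsilon_n(b) = \bar{x}_n,$$
$$\overline{\mathrm{wt}}(b) = \sum_{i=0}^{n} \bigl(\varphi_i(b) - \varepsilon_i(b)\bigr) \Lambda_i.$$
Here, $(x)_+ = \max(0, x)$.

The perfect crystal for $U_q(C_2^{(1)})$ used in [1] is different from the one given in this theorem.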
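The closed formulas for $\varphi_i$ and $\varepsilon_i$ are easy to evaluate mechanically. The Python sketch below is an illustration only; it again assumes our own encoding of b as the flat tuple (x_1, …, x_n, x̄_n, …, x̄_1), and the function names are hypothetical.

```python
def pos(x):
    """(x)_+ = max(0, x)."""
    return max(0, x)


def phi(i, b, n):
    """phi_i(b) for b = (x_1..x_n, xbar_n..xbar_1)."""
    x = lambda j: b[j - 1]
    xb = lambda j: b[2 * n - j]
    if i == 0:
        # sum(b) is 0 or 2 on B^(1), so the integer division is exact
        return 1 - sum(b) // 2 + pos(xb(1) - x(1))
    if i == n:
        return x(n)
    return x(i) + pos(xb(i + 1) - x(i + 1))


def eps(i, b, n):
    """epsilon_i(b) for b = (x_1..x_n, xbar_n..xbar_1)."""
    x = lambda j: b[j - 1]
    xb = lambda j: b[2 * n - j]
    if i == 0:
        return 1 - sum(b) // 2 + pos(x(1) - xb(1))
    if i == n:
        return xb(n)
    return xb(i) + pos(x(i + 1) - xb(i + 1))


def classical_weight(b, n):
    """Coefficients of Lambda_0, ..., Lambda_n in the classical weight of b."""
    return [phi(i, b, n) - eps(i, b, n) for i in range(n + 1)]
```

As a usage example for n = 2, `classical_weight((2, 0, 0, 0), 2)` evaluates to `[-2, 2, 0]`, that is, $-2\Lambda_0 + 2\Lambda_1$, while the zero vector has classical weight 0.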
Example 2.3. The following is a drawing of the level-1 perfect crystal for $U_q(C_2^{(1)})$ in the form of the above theorem. Readers familiar with the crystal basis theory will notice the $U_q(C_2)$-crystal $B(2\Lambda_1) \subset B(\Lambda_1) \otimes B(\Lambda_1)$ in the drawing. This is what we meant by the level-2 nature of the perfect crystal for $U_q(C_n^{(1)})$ in the introduction.

[Figure: crystal graph of $B^{(1)}$ for $U_q(C_2^{(1)})$. Vertices: (1, 0|0, 1), (1, 0|1, 0), (0, 1|0, 1), (1, 1|0, 0), (0, 1|1, 0), (2, 0|0, 0), (0, 0|1, 1), (0, 2|0, 0), (0, 0|2, 0), (0, 0|0, 2), (0, 0|0, 0). Arrows are labeled 0, 1, 2.]
3. New Realization of the Level-1 Perfect Crystal

In this section, we construct the set of slices and obtain a new realization for the level-1 perfect crystal $B^{(1)}$.

3.1. Slices. A slice is what will later become a column in our Young walls. The basic ingredient of our discussion will be the following colored blocks.

• 0-block : half-unit height, unit width, unit depth.
• i-block : unit height, unit width, unit depth (i = 1, …, n).

To simplify drawings, we shall use just the frontal view when representing a set of blocks stacked in a wall of unit thickness.
[Figure: a wall of colored blocks drawn in perspective, and the same wall drawn in frontal view.]
A set of finitely many blocks, stacked in one column, following the pattern

[Figure: the stacking pattern of a column. Read from the bottom up, the colors are 0, 0, 1, …, n − 1, n, n − 1, …, 1, 0, 0, 1, …; the figure marks the covering blocks and the supporting blocks.]

is called a level-$\frac{1}{2}$ slice of type $C_n^{(1)}$. For those with previous Young wall experience, we stress that the bottom of the column must be a 0-block as given in the above pattern. We see that, in this repeating pattern, an i-block appears twice in each cycle for i = 0, …, n − 1. To distinguish the two places, we have given names to these positions or blocks. A covering block is one that is closer to the n-block that sits below it than to the position for the n-block above it. If it is the other way around, it is a supporting block. Notice that, by convention, each n-block is both a supporting block and a covering block. Any consecutive sequence of blocks in a level-$\frac{1}{2}$ slice that contains one n-block and two i-blocks for each i = 0, …, n − 1 is called a δ. If we may place an i-block on top of some level-$\frac{1}{2}$ slice and still obtain a level-$\frac{1}{2}$ slice, we shall call that place an i-slot. The notions of covering i-slot and supporting i-slot are self-explanatory.

Remark 3.1. We warn the reader that, even though $\delta = \alpha_0 + 2\alpha_1 + \cdots + 2\alpha_{n-1} + \alpha_n$ for $U_q(C_n^{(1)})$, we are using two 0-blocks for our definition of δ. This is because we shall always be using 0-blocks in pairs. For example, in applying the $\tilde{f}_0$ action, two 0-blocks will be added.

We may add a δ to a level-$\frac{1}{2}$ slice or remove a δ from a big enough level-$\frac{1}{2}$ slice c and write this as c ± δ. For example, when dealing with $U_q(C_2^{(1)})$, we have

[Figure: the column reading 0, 0, 1 from the bottom, plus δ, equals the column reading 0, 0, 1, 2, 1, 0, 0, 1 from the bottom.]
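The repeating pattern and the operation c + δ can be made concrete with a few lines of Python. This is a sketch under an assumption of ours: we read the pattern figure as the period 0, 0, 1, …, n − 1, n, n − 1, …, 1 counted upward from the bottom, which agrees with the $U_q(C_2^{(1)})$ example above. A level-$\frac{1}{2}$ slice is then determined by its number of blocks; the function names are hypothetical.

```python
from collections import Counter


def cycle(n):
    """One period of colors, read upward from the bottom of a column."""
    return [0, 0] + list(range(1, n)) + [n] + list(range(n - 1, 0, -1))


def half_slice(m, n):
    """Colors (bottom to top) of the level-1/2 slice with m blocks."""
    pattern = cycle(n)
    return [pattern[k % len(pattern)] for k in range(m)]


def add_delta(column, n):
    """c + delta: extend the column by one full period (2n + 1 blocks)."""
    return half_slice(len(column) + 2 * n + 1, n)


if __name__ == "__main__":
    c = half_slice(3, 2)
    bigger = add_delta(c, 2)
    print(c, bigger)
    # The blocks that were added form a delta: two 0-blocks, two 1-blocks, one 2-block.
    print(Counter(bigger[len(c):]))
```

Since any 2n + 1 consecutive positions cover one full period, every such run contains one n-block and two i-blocks for each i = 0, …, n − 1, which is exactly the definition of a δ.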
Definition 3.2. An ordered pair $C = (c_1, c_2)$ of level-$\frac{1}{2}$ slices is a level-1 slice of type $C_n^{(1)}$ if $c_1 \subset c_2 \subset c_1 + \delta$ and if it contains an even number of 0-blocks. Each $c_i$ is called the i-th layer of C. The set of all level-1 slices is denoted by $\mathcal{S}^{(1)}$.
We shall often just say slice when dealing with level-1 slices. Mentally, we picture a level-1 slice as two columns with the first layer placed in front of the second layer, rather than as an ordered pair. We explain how to draw a slice with the following example.

[Figure: two level-$\frac{1}{2}$ slices $c_1$ and $c_2$ of type $C_2^{(1)}$, and the drawing of the level-1 slice $C = (c_1, c_2)$ they form.]
We now explain the notion of splitting an i-block in a level-1 slice. Suppose that the top part of some level-1 slice C takes one of the following two shapes

[Figure: the two possible shapes of the top part of C; in each, one layer ends with an i-block and the other layer ends with an (i − 1)-block.]

for some 0 < i < n. To split an i-block in such a level-1 slice means to break off the top half of the i-block and to place it on top of the (i − 1)-block, so that it looks like

[Figure: the same two shapes after splitting; each layer now ends with a half block marked "i/2".]

The "i/2" written in the cut-off i-blocks is supposed to convey the idea that this is a half of the i-block. We will never split a 0-block, but splitting an n-block may be done similarly.

[Figure: splitting an n-block, shown for the two possible shapes of the top part of C; after splitting, each layer ends with a half block marked "n/2".]
Simply put, splitting an i-block (i ≠ 0) is breaking off the top half of a covering i-block and placing it in a supporting i-slot.

Remark 3.3. The result obtained after splitting an i-block in a slice will not be considered a level-1 slice. As will become clearer when we deal with Young walls in the following sections, splitting is supposed to be a temporary act, used to see things from a different point of view.

Remark 3.4. If it is possible to split an i-block in some slice, splitting a block of color different from i in the same column is not possible. So it makes sense to speak of splitting a column, if possible. Any non-split block is a whole block. Hence a whole 0-block is of half-unit height and whole blocks of any other color are of unit height.

We now explain how to apply some action, which we denote by $\tilde{f}_i$ (i = 0, …, n), on the set $\mathcal{S}^{(1)} \cup \{0\}$. For i = 1, …, n, we go through the following steps until we see a matching case, either to add one i-block to the slice or to take the result as zero.

(1) The $\tilde{f}_i$ action on zero is zero.
(2) If i ≠ n and splitting an (i + 1)-block is possible, we take the result to be zero.
(3) If neither of the slots at the top of the two level-$\frac{1}{2}$ slices is for an i-block, the result is zero.
(4) If just one of the two slots is for an i-block, place an i-block in the slot.
(5) If we have two i-slots at the top, and either i = n or they are of the same kind (supporting or covering), do as follows:
  • If they are at different heights, place an i-block in the first layer, i.e., the lower slot.
  • Otherwise, place an i-block in the second layer, i.e., the back slot.
(6) If we have come this far, we must have two i-slots of different kinds. Place an i-block in the covering slot.

To apply $\tilde{f}_0$, we follow the next steps.

(1) The $\tilde{f}_0$ action on zero is zero.
(2) If it is possible to split a 1-block, the result is zero.
(3) If there are no 0-slots available, the result is zero.
(4) If the tops of both layers are 0-slots, place a 0-block in each of the two slots.
(5) If only one of the slots is a 0-slot, place two 0-blocks in that layer.

We also define the $\tilde{e}_i$ action on $\mathcal{S}^{(1)} \cup \{0\}$. For i ≠ 0, the action of $\tilde{e}_i$ on a slice removes one i-block or sends it to zero following the next set of rules.

(1) The $\tilde{e}_i$ action on zero is zero.
(2) If i ≠ n and splitting an (i + 1)-block is possible, we take the result to be zero.
(3) If neither of the blocks at the top of the two level-$\frac{1}{2}$ slices is an i-block, the result is zero.
(4) If just one of the two top blocks is an i-block, remove the i-block.
(5) If the tops of both level-$\frac{1}{2}$ slices are i-blocks, and if either i = n or they are of the same kind, do as follows:
  • If they are at different heights, remove the i-block in the second layer, i.e., the higher block.
  • Otherwise, remove the i-block in the first layer, i.e., the closer block.
(6) We must now have two i-blocks of different types at the top of the two level-$\frac{1}{2}$ slices. Remove the supporting i-block.

To apply $\tilde{e}_0$, we use the following steps.

(1) The $\tilde{e}_0$ action on zero is zero.
(2) If it is possible to split a 1-block, the result is zero.
(3) If neither of the two top blocks is a 0-block, the result is zero.
(4) If the tops of both layers are 0-blocks, remove a 0-block from each of the two layers.
(5) If only one of the top blocks is a 0-block, remove two 0-blocks in that layer.
3.2. The perfect crystal. In this subsection, we give a new realization for the level-1 perfect crystal of type $C_n^{(1)}$ by modding out the repetitive part from the set of slices.

Definition 3.5. We may add a δ to a slice $C = (c_1, c_2)$ by changing this into $C + \delta = (c_2, c_1 + \delta)$. If, for the same slice C, we have $c_2 \supset \delta$, we may also remove a δ from C by changing C into $C - \delta = (c_2 - \delta, c_1)$.

Definition 3.6. Two slices C and C′ are related, denoted by C ∼ C′, if one of the two slices may be obtained from the other by adding finitely many δ. Define $\mathcal{C}^{(1)} = \mathcal{S}^{(1)} / \sim$.

Proposition 3.7. The actions $\tilde{f}_i$ and $\tilde{e}_i$, previously defined on the set of level-1 slices, give the set $\mathcal{C}^{(1)}$ a $U_q'(C_n^{(1)})$-crystal structure.

Proof. Consider the map C ↦ C + δ, where C denotes a slice. We may easily check that each of the steps used in defining the $\tilde{f}_i$ and $\tilde{e}_i$ actions on the set of level-1 slices commutes with this map. So the Kashiwara operators $\tilde{f}_i$ and $\tilde{e}_i$ are well-defined on $\mathcal{C}^{(1)}$. We may now define various other maps as usual:
$$\varepsilon_i(C) = \max\{ n \mid \tilde{e}_i^{\,n} C \in \mathcal{C}^{(1)} \}, \qquad \varphi_i(C) = \max\{ n \mid \tilde{f}_i^{\,n} C \in \mathcal{C}^{(1)} \},$$
$$\overline{\mathrm{wt}}(C) = \sum_i \bigl( \varphi_i(C) - \varepsilon_i(C) \bigr) \Lambda_i.$$
Checking that these maps satisfy all of the following relations defining a crystal structure is straightforward:

• $\varphi_i(C) = \varepsilon_i(C) + \langle h_i, \overline{\mathrm{wt}}(C) \rangle$ for all i ∈ I,
• $\overline{\mathrm{wt}}(\tilde{e}_i C) = \overline{\mathrm{wt}}(C) + \alpha_i$ if $\tilde{e}_i C \in \mathcal{C}^{(1)}$,
• $\overline{\mathrm{wt}}(\tilde{f}_i C) = \overline{\mathrm{wt}}(C) - \alpha_i$ if $\tilde{f}_i C \in \mathcal{C}^{(1)}$,
• $\varepsilon_i(\tilde{e}_i C) = \varepsilon_i(C) - 1$, $\varphi_i(\tilde{e}_i C) = \varphi_i(C) + 1$ if $\tilde{e}_i C \in \mathcal{C}^{(1)}$,
• $\varepsilon_i(\tilde{f}_i C) = \varepsilon_i(C) + 1$, $\varphi_i(\tilde{f}_i C) = \varphi_i(C) - 1$ if $\tilde{f}_i C \in \mathcal{C}^{(1)}$,
• $\tilde{f}_i C = C'$ if and only if $C = \tilde{e}_i C'$ for $C, C' \in \mathcal{C}^{(1)}$, i ∈ I,
• if $\varphi_i(C) = -\infty$ for $C \in \mathcal{C}^{(1)}$, then $\tilde{e}_i C = \tilde{f}_i C = 0$.

Hence we have a $U_q'(C_n^{(1)})$-crystal structure on $\mathcal{C}^{(1)}$.
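Definitions 3.2 and 3.5 can be sketched in code once a level-$\frac{1}{2}$ slice is encoded by its number of blocks. This encoding is an assumption of ours: it relies on every column following the same pattern from the bottom, with period 2n + 1 and with the first two positions of each period occupied by 0-blocks, so that inclusion of slices becomes comparison of block counts. The function names are hypothetical.

```python
def zero_blocks(m, n):
    """Number of 0-blocks among the lowest m blocks of a column."""
    period = 2 * n + 1
    return 2 * (m // period) + min(m % period, 2)


def is_level1_slice(c1, c2, n):
    """Definition 3.2 for layers given by their block counts c1 and c2."""
    period = 2 * n + 1
    nested = c1 <= c2 <= c1 + period            # c1 in c2 in c1 + delta
    even = (zero_blocks(c1, n) + zero_blocks(c2, n)) % 2 == 0
    return nested and even


def add_delta(C, n):
    """C + delta = (c2, c1 + delta), as in Definition 3.5."""
    c1, c2 = C
    return (c2, c1 + 2 * n + 1)


def remove_delta(C, n):
    """C - delta = (c2 - delta, c1); None if c2 does not contain a delta."""
    c1, c2 = C
    period = 2 * n + 1
    if c2 < period:
        return None
    return (c2 - period, c1)
```

For n = 2, the pair (2, 4) is a level-1 slice, `add_delta((2, 4), 2)` gives `(4, 7)`, which is again a level-1 slice, and `remove_delta` undoes the addition.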
Recall the finite $U_q'(C_n^{(1)})$-crystal $B^{(1)}$ given in Sect. 2. We shall define a map from $B^{(1)}$ to $\mathcal{C}^{(1)}$. The following preliminary mapping is first needed. Each level-$\frac{1}{2}$ slice on the right is listed by the colors of its blocks, read from the bottom up.

• (0, …, 0|0, …, 0) ⟶ 0
• (1, 0, …, 0|0, …, 0) ⟶ 0, 0
• (0, …, 0, 1, 0, …, 0|0, …, 0), with 1 at the i-th place from the left ⟶ 0, 0, …, i − 1
• (0, …, 0|1, 0, …, 0) ⟶ 0, 0, …, n
• (0, …, 0|0, …, 0, 1, 0, …, 0), with 1 at the i-th place from the right ⟶ 0, 0, …, n, …, i
Now, to map an element of $B^{(1)}$ to $\mathcal{C}^{(1)}$, we first write the element as a sum of two elements and map it to $\mathcal{C}^{(1)}$ using the above preliminary mapping. The following few examples in the case of $C_2^{(1)}$ should make this clearer.

[Figure: the decompositions (0, 0|0, 0) = (0, 0|0, 0) + (0, 0|0, 0), (1, 0|1, 0) = (1, 0|0, 0) + (0, 0|1, 0), and (0, 2|0, 0) = (0, 1|0, 0) + (0, 1|0, 0), each followed by a drawing of the resulting level-1 slice.]

Of course, the right-hand side should be taken as the equivalence class in $\mathcal{S}^{(1)} / \sim$ represented by the drawing. It is easy to see that this correspondence does not depend on which of the two summands we decide to map to the first layer. For example, we could have done

[Figure: the decomposition (1, 0|1, 0) = (0, 0|1, 0) + (1, 0|0, 0), followed by a drawing of the resulting level-1 slice.]

for the second example above. This might look different at first sight, but you can check that the drawing on the right-hand side belongs to the same equivalence class as the one given above.

Theorem 3.8. The map defined above is an isomorphism of $U_q'(C_n^{(1)})$-crystals, $B^{(1)} \cong \mathcal{C}^{(1)}$.

Proof. There is a natural map from $\mathcal{S}^{(1)}$ to $B^{(1)}$. It may be modded out by the equivalence relation on $\mathcal{S}^{(1)}$ to obtain the inverse of the above defined map. Hence the map is bijective. It is a lengthy but straightforward case-by-case comparison to verify that the Kashiwara operators of Theorem 2.2 and those of this section are compatible under this bijection.

Remark 3.9. If we change the definition of a slice slightly to be a column which extends infinitely downward, we may state that the set of slices $\mathcal{S}^{(1)}$ is a realization for the affinization of $B^{(1)}$.

We close this section with an example that could be of help in understanding the proof of Theorem 3.8.
Example 3.10. The following is a drawing of the level-1 perfect crystal for $U_q(C_2^{(1)})$ in the form of $\mathcal{C}^{(1)}$. Readers may want to compare this with Example 2.3.

[Figure: crystal graph of $\mathcal{C}^{(1)}$ for $U_q(C_2^{(1)})$, with each vertex drawn as a level-1 slice and arrows labeled 0, 1, 2.]
A close study of this example will convince the reader that choosing to use two half-height blocks (and not a single unit-height block) for the $\tilde{f}_0$ action was a natural decision.

4. Young Walls

In this section, we define the set of reduced proper Young walls. We also define a crystal structure on the set of proper Young walls.

4.1. Level-1 Young walls. We line up the level-$\frac{1}{2}$ slices defined earlier and consider blocks stacked in the following pattern.

[Figure: copies of the level-$\frac{1}{2}$ slice pattern placed side by side, column after column.]
Definition 4.1. A level-$\frac{1}{2}$ weak Young wall of type $C_n^{(1)}$ is a set of blocks, or halves of blocks, that satisfies the following conditions:

• It is stacked in the pattern given above.
• Except for the rightmost column, there is no free space to the right of any block.

Remark 4.2. In previous works ([1, 5]), level-1 Young walls were defined to be built on ground state walls. Implicitly, it was also assumed that the building process was done in finitely many steps. Neither of these conditions is imposed on a level-$\frac{1}{2}$ weak Young wall. For example, the wall

[Figure: a wall, extending infinitely to the left, built from 0-, 1-, and 2-blocks.]

is a level-$\frac{1}{2}$ weak Young wall of type $C_n^{(1)}$, but not a level-1 Young wall in the sense given in previous works [1, 5]. We also do not allow the empty wall to be considered a level-$\frac{1}{2}$ weak Young wall.

For a level-$\frac{1}{2}$ weak Young wall Y, we define Y + δ to be the level-$\frac{1}{2}$ weak Young wall obtained by adding a δ to each and every column of Y. Here, a δ is a connected sequence of blocks that contains one n-block and two i-blocks for each i = 0, 1, …, n − 1.
Definition 4.3. An ordered pair $Y = (Y_1, Y_2)$ is a level-1 weak Young wall of type $C_n^{(1)}$ if it satisfies the following conditions.

• Each $Y_i$ is a level-$\frac{1}{2}$ weak Young wall.
• In each column, any halves of blocks for each color add up to form a whole block. That is, any split blocks come in matching pairs.

The level-$\frac{1}{2}$ weak Young wall $Y_i$ is called the i-th layer of the Young wall Y.

Definition 4.4. A level-1 weak Young wall $Y = (Y_1, Y_2)$ is a level-1 Young wall if it satisfies the following conditions.

• It contains only whole blocks.
• $Y_1 \subset Y_2 \subset Y_1 + \delta$.
• Each column contains an even number of 0-blocks.

In short, a level-1 Young wall is a level-1 weak Young wall obtained by concatenating level-1 slices.

4.2. Reduced proper Young walls. The i-th column of a level-1 Young wall $Y = (Y_1, Y_2)$ is denoted by Y(i). We choose to number them so that the rightmost column is named Y(0). Note that each column of a level-1 Young wall is a level-1 slice. So we shall utilize the previous notation for drawing level-1 slices when drawing level-1 Young walls. That is, we shall color the first layer gray. The i-th column of the j-th layer $Y_j$ is denoted by $Y_j(i)$. We could view the same $Y_j(i)$ also as the j-th layer of the i-th column of Y. Normally, splitting some block in a column of a level-1 Young wall would not give us a Young wall, nor even a weak Young wall.

Definition 4.5. Let us be given a level-1 Young wall $Y = (Y_1, Y_2)$. The Young wall Y is proper if it satisfies the following conditions:

• When we split every possible column of Y (see Remark 3.4), the end result is a level-1 weak Young wall.
• For each of the two level-$\frac{1}{2}$ weak Young walls (layers) in the end result, none of the columns of integer height have the same height.

The set of all level-1 proper Young walls is denoted by $\mathcal{F}$.

Since a column of a Young wall is a slice, we can add a δ to or remove a δ from a column. To add a δ to a column $Y(i) = (Y_1(i), Y_2(i))$ means to change this into
$$Y(i) + \delta = (Y_2(i), Y_1(i) + \delta).$$
Here, the "+δ" on the right-hand side should be understood in the level-$\frac{1}{2}$ sense. Similarly, if $Y_2(i) \supset \delta$, we may remove a δ from the same column by changing it into
$$Y(i) - \delta = (Y_2(i) - \delta, Y_1(i)).$$

Definition 4.6. A column in a level-1 proper Young wall is said to contain a removable δ if the Young wall is still proper after removing a δ from that column. A proper Young wall is reduced if none of its columns contain a removable δ. The set of all level-1 reduced proper Young walls is denoted by $\mathcal{Y}$.

4.3. The crystal structure. Recall that each column of a Young wall is a slice, so that we have the actions $\tilde{f}_i$ and $\tilde{e}_i$ defined on them (Sect. 3.1).

Definition 4.7.
(1) A column in a level-1 proper Young wall is k times i-admissible if k is the maximal number of times we may apply $\tilde{f}_i$ to the column with the result remaining a proper Young wall.
(2) A column in a level-1 proper Young wall is k times i-removable if k is the maximal number of times we may apply $\tilde{e}_i$ to the column with the result remaining a proper Young wall.

Remark 4.8. Recall that we add two 0-blocks to a column when applying $\tilde{f}_0$. So, being k times 0-admissible will imply that we can place 2k 0-blocks and still obtain a proper Young wall.

Remark 4.9. Even for i ≠ 0, the number of slots in a column of a Young wall, in which a single i-block may be placed while remaining a proper Young wall, does not necessarily equal the number of times a column is i-admissible.

[Figure: two adjacent columns of a Young wall of type $C_2^{(1)}$.]

In the above Young wall of type $C_2^{(1)}$, the left column is only once 1-admissible. But we may place a 1-block in either of the two 1-slots in the left column and still obtain a proper Young wall.

Remark 4.10. The property of a Young wall column being i-admissible depends on the column which sits to the right of the column in consideration. Likewise, being i-removable depends on the left column.

The action of the Kashiwara operators $\tilde{f}_i$ and $\tilde{e}_i$ on a level-1 proper Young wall is defined as follows:

(1) For each column of the Young wall, write under it x-many 1 followed by y-many 0, if the column is x times i-removable and y times i-admissible.
(2) From the (half-)infinite list of 0 and 1, successively cancel out each (0, 1) pair to obtain a finite sequence of 1 followed by some 0 (reading from left to right).
(3) Let $\tilde{f}_i$ act on the column corresponding to the left-most 0 remaining (as an operator acting on slices). Set it to zero if no 0 remains.
(4) Let $\tilde{e}_i$ act on the column corresponding to the right-most 1 remaining (as an operator acting on slices). Set it to zero if no 1 remains.
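The signature rule in steps (1) to (4) is a bracket-matching procedure, which the following Python sketch illustrates. It is not the authors' implementation. It assumes that the wall is given as a finite list of columns from left to right, each column described only by the pair (x, y) of its i-removable and i-admissible counts, and that any columns further to the left contribute symbols that cancel among themselves. The function names are hypothetical.

```python
def reduce_signature(columns):
    """Cancel (0, 1) pairs in the i-signature.

    columns: list of (x, y) pairs, left to right, where the column is
    x times i-removable and y times i-admissible.
    Returns (ones, zeros): the column indices of the surviving 1s and 0s.
    """
    ones, zeros = [], []
    for idx, (x, y) in enumerate(columns):
        for _ in range(x):             # the x copies of 1 are written first
            if zeros:
                zeros.pop()            # cancels the nearest surviving 0 on its left
            else:
                ones.append(idx)
        zeros.extend([idx] * y)        # then the y copies of 0
    return ones, zeros


def f_target(columns):
    """Index of the column on which f~_i acts, or None if the result is zero."""
    _, zeros = reduce_signature(columns)
    return zeros[0] if zeros else None


def e_target(columns):
    """Index of the column on which e~_i acts, or None if the result is zero."""
    ones, _ = reduce_signature(columns)
    return ones[-1] if ones else None
```

For example, the columns `[(2, 0), (0, 1), (1, 2)]` carry the signature 1 1 | 0 | 1 0 0, which reduces to 1 1 0 0; here `f_target` returns 2 and `e_target` returns 0.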
The 0 and 1 placed under the Young wall in the above process are called the i-signature of the respective columns or of the Young wall. It is clear from the definition that the result obtained after the action of $\tilde{f}_i$ or $\tilde{e}_i$ is still a proper Young wall. We may now define various other maps as before. For a proper Young wall Y, we define
$$\varepsilon_i(Y) = \max\{ n \mid \tilde{e}_i^{\,n} Y \text{ is nonzero} \}, \qquad (4.1)$$
$$\varphi_i(Y) = \max\{ n \mid \tilde{f}_i^{\,n} Y \text{ is nonzero} \}, \qquad (4.2)$$
$$\overline{\mathrm{wt}}(Y) = \sum_i \bigl( \varphi_i(Y) - \varepsilon_i(Y) \bigr) \Lambda_i. \qquad (4.3)$$
Checking that these maps satisfy all of the relations defining a crystal structure is straightforward. Hence we have a $U_q'(C_n^{(1)})$-crystal structure on the set of proper Young walls.

Proposition 4.11. The set $\mathcal{F}$ of all level-1 proper Young walls forms an (abstract) $U_q'(C_n^{(1)})$-crystal.
5. Irreducible Highest Weight Crystals

In this section, we show that the set of all reduced proper Young walls built on some ground state wall is isomorphic to the irreducible highest weight crystal of appropriate highest weight.

5.1. Ground state walls. Let us denote by $\mathcal{P}$ the set of all half-infinite tensor products of elements from the perfect crystal $B^{(1)}$. In particular, we have $\mathcal{P}(\Lambda_k) \subset \mathcal{P}$. By using the composition of maps
$$\mathcal{S}^{(1)} \longrightarrow \mathcal{C}^{(1)} \xrightarrow{\ \sim\ } B^{(1)}$$
on each column of a proper Young wall, we may define a map
$$\mathcal{F} \longrightarrow \mathcal{P}. \qquad (5.1)$$
Now, fix any ground state path $b_{\Lambda_k} \in \mathcal{P}$ and consider its inverse image under the map (5.1). For example, when dealing with $U_q(C_2^{(1)})$, all of the following level-1 proper Young walls are sent to $b_{\Lambda_0}$ under the map (5.1).
[Figure: several level-1 proper Young walls of type $C_2^{(1)}$, each extending infinitely to the left, all of which are sent to $b_{\Lambda_0}$.]
We can see that three of these are reduced. Let us denote by $Y_{\Lambda_k} \in \mathcal{F}$ the unique element of the inverse image of $b_{\Lambda_k}$ under the map (5.1) that is contained in all other elements of this inverse image. That is, we take it to be the smallest element of the inverse image. It is clear that each $Y_{\Lambda_k}$ is a reduced proper Young wall. Explicitly, we have

[Figure: the ground state walls $Y_{\Lambda_0}$, $Y_{\Lambda_k}$ (for k = 1, …, n − 1), and $Y_{\Lambda_n}$, each drawn as a row of identical columns extending infinitely to the left.]

Definition 5.1. The level-1 reduced proper Young wall $Y_{\Lambda_k}$ is called the ground state wall of weight $\Lambda_k$. Any level-1 proper Young wall that contains only finitely many more blocks than $Y_{\Lambda_k}$ is said to have been built on $Y_{\Lambda_k}$.

5.2. Irreducible highest weight crystal.

Definition 5.2. The set of all level-1 proper Young walls built on $Y_{\Lambda_k}$ is denoted by $\mathcal{F}(\Lambda_k)$. We will denote by $\mathcal{Y}(\Lambda_k)$ the set of all level-1 reduced proper Young walls built on $Y_{\Lambda_k}$.
Proposition 5.3. The set $\mathcal{F}(\Lambda_k)$ of level-1 proper Young walls built on $Y_{\Lambda_k}$ forms a $U_q(C_n^{(1)})$-crystal.

Proof. It is clear that it forms a $U_q'(C_n^{(1)})$-subcrystal of $\mathcal{F}$. So it suffices to give an affine weight to each Young wall which is compatible with the other maps. Recall the previous definition of the classical weight of a Young wall given by (4.3). We may define the affine weight of a Young wall $Y \in \mathcal{F}(\Lambda_k)$ by
$$\mathrm{wt}(Y) = \overline{\mathrm{wt}}(Y) - \frac{1}{2} \bigl( \text{number of 0-blocks in } Y \setminus Y_{\Lambda_k} \bigr)\, \delta. \qquad (5.2)$$
In particular, the affine weight of $Y_{\Lambda_k}$ is $\Lambda_k$. It is straightforward to verify that $\mathcal{F}(\Lambda_k)$ forms a $U_q(C_n^{(1)})$-crystal with this definition of affine weights.

Notice that the image of $Y_{\Lambda_k}$ under the map (5.1) is the ground state path $b_{\Lambda_k}$ of $\mathcal{P}(\Lambda_k)$. Hence it makes sense to restrict the domain and range of the map (5.1) to obtain a map
$$\mathcal{Y}(\Lambda_k) \longrightarrow \mathcal{P}(\Lambda_k). \qquad (5.3)$$
We claim that this is a bijection. To check the surjectivity, we may explicitly construct a reduced proper Young wall which maps to the path in question, and in the process, we will notice that the condition of being reduced forces it to be chosen uniquely. Here is our main theorem, which shows that this bijection is a crystal isomorphism.

Theorem 5.4. The set $\mathcal{Y}(\Lambda_k)$ is a $U_q(C_n^{(1)})$-subcrystal of $\mathcal{F}(\Lambda_k)$, and it is isomorphic to the irreducible highest weight crystal $B(\Lambda_k)$.

The rest of this section is devoted mostly to proving this theorem. Consider the inverse of the map (5.3), but with the range enlarged, as a map
$$\mathcal{P}(\Lambda_k) \longrightarrow \mathcal{F}(\Lambda_k).$$
To prove the theorem, it suffices to show that this is a strict crystal morphism. Then the image, which we have already seen to be $\mathcal{Y}(\Lambda_k)$, would be a subcrystal of $\mathcal{F}(\Lambda_k)$. The second statement follows from Theorem 2.1. We shall focus our efforts on showing that this map commutes with the Kashiwara operator $\tilde{f}_i$. Other parts of the proof are similar or easy.

Let us first review the action of the Kashiwara operator $\tilde{f}_i$ on a path element $p = \cdots \otimes p(j) \otimes \cdots \otimes p(1) \otimes p(0) \in \mathcal{P}(\Lambda_k)$.

(1) Under each p(j), write $\varepsilon_i(p(j))$-many 1 followed by $\varphi_i(p(j))$-many 0.
(2) From the (half-)infinite list of 0 and 1, successively cancel out each (0, 1) pair to obtain a finite sequence of 1 followed by some 0 (reading from left to right).
(3) Let $\tilde{f}_i$ act on the p(j) corresponding to the left-most 0 remaining. Set it to zero if no 0 remains.

This is quite similar to the action of $\tilde{f}_i$ on Young walls defined in Sect. 4.3. Now, recalling the definition of the map above and Theorem 3.8, we find that, to prove Theorem 5.4, it suffices to prove the following two lemmas.

Lemma 5.5. A Kashiwara operator acts on the j-th tensor component of a path p if and only if it acts on the j-th column of the Young wall corresponding to p. We also have $\tilde{f}_i(p) = 0$ (or $\tilde{e}_i(p) = 0$) for some path p if and only if $\tilde{f}_i$ (respectively, $\tilde{e}_i$) sends the Young wall corresponding to p to 0.
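The three steps reviewed above for paths can be combined with the formulas for $\varepsilon_i$ and $\varphi_i$ from Sect. 2 into a small experiment. The Python sketch below is only an illustration. It assumes that a finite tail of the path is given as a list of $B^{(1)}$ elements from left to right (so the last entry plays the role of p(0)), that each element is encoded as the flat tuple (x_1, …, x_n, x̄_n, …, x̄_1), and that the components further to the left do not leave any uncancelled 0. All function names are hypothetical.

```python
def pos(x):
    """(x)_+ = max(0, x)."""
    return max(0, x)


def phi(i, b, n):
    """phi_i(b) on B^(1), b = (x_1..x_n, xbar_n..xbar_1)."""
    x = lambda j: b[j - 1]
    xb = lambda j: b[2 * n - j]
    if i == 0:
        return 1 - sum(b) // 2 + pos(xb(1) - x(1))
    if i == n:
        return x(n)
    return x(i) + pos(xb(i + 1) - x(i + 1))


def eps(i, b, n):
    """epsilon_i(b) on B^(1), b = (x_1..x_n, xbar_n..xbar_1)."""
    x = lambda j: b[j - 1]
    xb = lambda j: b[2 * n - j]
    if i == 0:
        return 1 - sum(b) // 2 + pos(x(1) - xb(1))
    if i == n:
        return xb(n)
    return xb(i) + pos(x(i + 1) - xb(i + 1))


def f_position(path, i, n):
    """List index of the tensor component on which f~_i acts, or None.

    path: finite tail of the path, components listed from left to right.
    """
    zeros = []                              # positions of surviving 0s
    for j, b in enumerate(path):
        for _ in range(eps(i, b, n)):       # each 1 cancels the nearest 0 on its left
            if zeros:
                zeros.pop()
        zeros.extend([j] * phi(i, b, n))    # then phi_i(b) copies of 0
    return zeros[0] if zeros else None
```

For n = 2 and i = 0, the pair (0, 0|0, 2) ⊗ (2, 0|0, 0) carries the signature 00 followed by 11, everything cancels, and `f_position` returns `None`; for two copies of (0, 0|0, 0) the signature 1 0 1 0 reduces to 1 0 and the operator acts on the right component.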
Lemma 5.6. The set $\mathcal{Y}(\Lambda_k) \cup \{0\}$ is closed under the action of Kashiwara operators.

Before giving the proofs for these lemmas, we shall illustrate the signature cancellation comparison between path and Young wall with two examples.

Example 5.7. Consider the following part of a path:
$$\cdots \otimes (0, \dots, 0 \mid 0, \dots, 0, 2) \otimes (2, 0, \dots, 0 \mid 0, \dots, 0) \otimes \cdots.$$
Suppose we are dealing with the i = 0 case. The signature that should be under the left element (0, …, 0, 2) is 00 and that for the right element (2, 0, …, 0) is 11. After cancelling out the (0, 1) pairs, we are left with nothing. Now consider the columns of the reduced proper Young wall which corresponds to this path.

[Figure: two drawings, joined by "or", of the corresponding pair of Young wall columns.]

The two drawings are δ-shifts of each other and it does not matter which of the two drawings we use. When dealing with the case i = 0, under the left column we would write 0 and under the right column we would write 1. After (0, 1)-pair cancellation, we are again left with nothing. The signatures under the path description and the Young wall description agree after cancelling out (0, 1) pairs.

We give one more example which is a bit more complicated.

Example 5.8. Consider the following part of a path and the columns of the reduced proper Young wall which corresponds to this path.
$$\cdots \otimes (0, \dots, 0 \mid 0, \dots, 0) \otimes (0, \dots, 0 \mid 0, \dots, 0) \otimes \cdots$$

[Figure: the corresponding pair of Young wall columns.]

As before, the reader is free to use a δ-shift of this Young wall. When dealing with $\tilde{f}_0$, the signatures to be written under them are given in the following table:

              left ε    left φ    right ε    right φ
path            1         0          1          0
Young wall      ?         ·          ·          ?

The first question mark in the Young wall row signifies that the number of 1 that should be written there depends on the column that sits to the left of the left column. Likewise, the second question mark signifies that the number of 0 to be written depends on the column that comes to its right. The two dots imply that no 0 and no 1, respectively, should be written there.
So in this case, we do not know the complete signature to be written under the Young wall columns. Hence, a straightforward comparison of signatures after cancellation of (0, 1)-pairs is not possible. Still, we can verify that what is left of the left-ϕ and right-ε signatures, after the (0, 1)-pair cancellation, is the same for the path and the Young wall. In this example, they both amount to nothing.

Proof of Lemma 5.5. It suffices to check that, for all possible left-right pairs of perfect crystal elements and their corresponding Young wall elements, what remains after (0, 1)-pair cancellation of the left-ϕ and right-ε signatures agrees. (The right-most column may be dealt with in a similar way.) Let us deal with the i = 0 case first. The following notation will be used to denote various columns of Young walls.

[Figure: Young wall columns for the case i = 0, some topped by a k-block, with their notations 0, 00, 10, 11, and 1.]
Here, the top k-blocks can be anything that comes between the supporting 1-block and the covering 2-block (inclusive), but may not be the covering 1-block. Columns that are related (Definition 3.6) are denoted by the same notation. Notice that we have taken the signature of the corresponding perfect crystal element as the notation for each Young wall column. The following table lists all left-right pairs for which the signatures to be written under the path description and the Young wall description are not trivially the same. The signatures in the table body are what should be written as the left-ϕ and right-ε signatures under the two Young wall columns.

                      right
   left       10        11        1
   0          ·         1         · / 01
   00         0         01        0
   10         ·         1         ·
The case when the left column is 0 and the right column is 1 breaks up into two cases. Depending on the two top k-blocks for the left and right columns, which may be distinct, the left-ϕ and the right-ε signatures to be placed under the two columns could be either nothing or 01. In the latter case the signature agrees trivially with that of the path description. We can easily see that the signatures agree with that of the corresponding path description in all the above cases after (0, 1)-pair cancellations. For all other possible left-right pairs not covered in this table, the left-ϕ and right-ε signatures to be placed under the Young wall columns agree exactly with the corresponding path signatures, that is, even before the (0, 1)-pair cancellations. For 0 < i < n, the following notation will be used.
[Figure: Young wall columns for the case 0 < i < n and their notations 0, 00, 10, 11, and 1, including underlined and overlined variants.]
Again, we have used notation that reflect the signature of the corresponding perfect crystal element. Underlines and overlines show whether the i-blocks and i-slots of the slice are in supporting or covering positions. The following table gives the Young wall signatures for all the nontrivial pairs. right
10
10
1
11
11
0
·
· / 01
· / 01
1
1 / 011
00
0
0
0
·
01
00
0
0
0
0 / 001
01
00
0
0
· / 01
left
10
11
1
01
01
0 / 001
0
01
·
0
·
1 / 011
1
· / 01
10
·
·
10
·
·
·
·
·
10
·
1
1
· / 01
1
1
1
· / 01
1
1
·
The blank slots are where the signatures agree trivially. As before, many of the cases break up into subcases, depending on the top k-blocks for 0, 0, 1, and 1. For the remaining i = n case, the following notation will be used.

[Figure: Young wall columns for the case i = n and their notations 0, 00, 10, 11, and 1.]
Here, the top k-blocks for 0 and 1 may be taken to be anything except for the n-block and the supporting (n − 1)-block. It could also be two 0-blocks. The following table lists all nontrivial pairs.

                      right
   left       10        11        1
   0          ·         1         · / 01
   00         0         01        0
   10         ·         1         ·
This completes the proof of Lemma 5.5.
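
The (0, 1)-pair cancellation used throughout the proof can be checked mechanically. The following is a minimal sketch, not the authors' procedure: it assumes a signature is given as a string of the characters 0 and 1, and that a (0, 1)-pair is a 0 immediately followed by a 1 once earlier pairs have been removed. The function name `cancel_01_pairs` is hypothetical.

```python
def cancel_01_pairs(signature: str) -> str:
    """Repeatedly delete (0, 1)-pairs: a 0 immediately followed by a 1."""
    stack = []
    for symbol in signature:
        if symbol == "1" and stack and stack[-1] == "0":
            stack.pop()           # this 1 cancels the nearest uncancelled 0 to its left
        else:
            stack.append(symbol)  # nothing to cancel against, keep the symbol
    return "".join(stack)


# Alternatives that appear in the tables, such as "1 / 011" or "0 / 001",
# reduce to the same string after cancellation.
for before in ["01", "011", "001", "10"]:
    print(before, "->", repr(cancel_01_pairs(before)))
```

Under these assumptions the result always consists of a run of 1's followed by a run of 0's, so two signatures agree after cancellation exactly when the function returns the same string for both.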
Proof of Lemma 5.6. Suppose that $\mathcal{Y}(\Lambda_k) \cup \{0\}$ is not closed under $\tilde f_i$. That is, there exists some $Y \in \mathcal{Y}(\Lambda_k)$ for which $\tilde f_i Y \notin \mathcal{Y}(\Lambda_k) \cup \{0\}$. We assume that $\tilde f_i$ has acted on the $j$th column of $Y$ and let $p$ be the path corresponding to $Y$. Then, by Lemma 5.5, the action of $\tilde f_i$ on $p$ would also have been on the $j$th tensor component of $p$. Since $\tilde f_i Y \in \mathcal{F}(\Lambda_k)$, we may remove δ's finitely many times from $\tilde f_i Y$ to obtain a reduced proper Young wall $Y'$. The number of δ's removed is nonzero since $\tilde f_i Y \notin \mathcal{Y}(\Lambda_k)$. Note that $\tilde f_i(p)$ is the path corresponding to $Y'$.

Let us apply $\tilde e_i$ to both $\tilde f_i(p)$ and $Y'$. We have $p = \tilde e_i(\tilde f_i(p))$ and the action of $\tilde e_i$ on $\tilde f_i(p)$ would have been on the $j$th tensor component. By Lemma 5.5, the action of $\tilde e_i$ on $Y'$ will also be on the $j$th column. Recall that we started out with a reduced proper Young wall $Y$, added a block to the $j$th column to obtain $\tilde f_i Y$, removed finitely many δ's to obtain $Y'$, and, finally, removed a block from the $j$th column of $Y'$ to obtain $\tilde e_i Y'$. Hence the proper Young wall $\tilde e_i Y'$ may be obtained from the reduced proper Young wall $Y$ by removing finitely many δ's. We may now remove finitely many δ's from $\tilde e_i Y'$ to obtain a reduced proper Young wall $Y''$ which also corresponds to $p$ under the map. This contradicts the fact that only one element of $\mathcal{Y}(\Lambda_k)$ corresponds to a given $p$ under the map. This completes the proof of this lemma.

Since the proofs of Lemmas 5.5 and 5.6 are complete, Theorem 5.4 has been proved. We close this paper with an example.

Example 5.9. Following is the top part of the crystal graph $\mathcal{Y}(\Lambda_0)$ for $U_q(C_2^{(1)})$.

[Figure: top part of the crystal graph $\mathcal{Y}(\Lambda_0)$ for $U_q(C_2^{(1)})$.]
References
1. Hong, J., Kang, S.-J.: Crystal graphs for basic representations of the quantum affine algebra $U_q(C_2^{(1)})$. In: Representations and quantizations (Shanghai, 1998), Beijing: China High. Educ. Press, 2000, pp. 213–227
2. Hong, J., Kang, S.-J.: Introduction to Quantum Groups and Crystal Bases. Graduate Studies in Mathematics 42, Providence, RI: Am. Math. Soc., 2002
3. Jantzen, J.C.: Lectures on Quantum Groups. Graduate Studies in Mathematics 6, Providence, RI: Am. Math. Soc., 1996
4. Jimbo, M., Misra, K.C., Miwa, T., Okado, M.: Combinatorics of representations of $U_q(\widehat{\mathfrak{sl}}(n))$ at q = 0. Commun. Math. Phys. 136(3), 543–566 (1991)
5. Kang, S.-J.: Crystal bases for quantum affine algebras and combinatorics of Young walls. Proc. Lond. Math. Soc. (3) 86, 29–69 (2003)
6. Kang, S.-J., Kashiwara, M., Misra, K.C.: Crystal bases of Verma modules for quantum affine Lie algebras. Compositio Math. 92(3), 299–325 (1994)
7. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Affine crystals and vertex models. In: Infinite analysis, Part A, B (Kyoto, 1991), River Edge, NJ: World Sci. Publishing, 1992, pp. 449–484
8. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Perfect crystals of quantum affine Lie algebras. Duke Math. J. 68(3), 499–607 (1992)
9. Kang, S.-J., Kashiwara, M., Misra, K.C., Miwa, T., Nakashima, T., Nakayashiki, A.: Vertex models and crystals. C. R. Acad. Sci. Paris Sér. I Math. 315(4), 375–380 (1992)
10. Kashiwara, M.: On crystal bases of the q-analogue of universal enveloping algebras. Duke Math. J. 63(2), 465–516 (1991)
11. Kashiwara, M., Nakashima, T.: Crystal graphs for representations of the q-analogue of classical Lie algebras. J. Algebra 165(2), 295–345 (1994)
12. Kuniba, A., Misra, K.C., Okado, M., Takagi, T., Uchiyama, J.: Crystals for Demazure modules of classical affine Lie algebras. J. Algebra 208(1), 185–215 (1998)
13. Lusztig, G.: Canonical bases arising from quantized enveloping algebras. J. Am. Math. Soc. 3(2), 447–498 (1990)
14. Misra, K., Miwa, T.: Crystal base for the basic representation of $U_q(\widehat{\mathfrak{sl}}(n))$. Commun. Math. Phys. 134(1), 79–88 (1990)
Communicated by L. Takhtajan
Commun. Math. Phys. 244, 133–156 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0967-5
Communications in
Mathematical Physics
A New Class of Obstructions to the Smoothness of Null Infinity
Juan Antonio Valiente Kroon
Max Planck Institut für Gravitationsphysik, Albert Einstein Institut, Am Mühlenberg 1, 14476 Golm, Germany. E-mail:
[email protected] Received: 12 November 2002 / Accepted: 14 July 2003 Published online: 5 November 2003 – © Springer-Verlag 2003
Abstract: Expansions of the gravitational field arising from the development of asymptotically Euclidean, time symmetric, conformally flat initial data are calculated in a neighbourhood of spatial and null infinities up to order 6. To this end a certain representation of spatial infinity as a cylinder is used. This setup is based on the properties of conformal geodesics. It is found that these expansions suggest that null infinity has to be non-smooth unless the Newman-Penrose constants of the spacetime, and some other higher order quantities of the spacetime vanish. As a consequence of these results it is conjectured that similar conditions occur if one were to take the expansions to even higher orders. Furthermore, the smoothness conditions obtained suggest that if time symmetric initial data which are conformally flat in a neighbourhood of spatial infinity yield a smooth null infinity, then the initial data must in fact be Schwarzschildean around spatial infinity.
1. Introduction

Penrose introduced the seminal idea that the gravitational field of isolated systems can be conveniently described by means of the notion of asymptotic simplicity [21] (see footnote 1). Central to the concept of asymptotic simplicity is the idea – expectation – that the conformal boundary of the spacetime – null infinity, $\mathscr{I}$ – should possess a smooth differentiable structure. This approach to the description of isolated bodies in General Relativity is usually known as Penrose's proposal – see e.g. [17, 18]. Static and stationary spacetimes have been shown to be (weakly) asymptotically simple, with a smooth null infinity [6]. However, the main purpose behind introducing the concept of asymptotic simplicity is to provide a suitable framework for the discussion of radiation.

Current address: Institut für Theoretische Physik der Universität Wien, Boltzmanngasse 5, 1180 Wien, Austria.
1 It will be assumed that the reader is familiar with the ideas of the so-called conformal framework to describe the properties of isolated bodies and the concept of asymptotic flatness. For a recent review, the reader is referred to [18].

In spite of its elegance and
aesthetical appeal, Penrose's proposal is of little use if one is not able to prove that there exists a large family of non-trivial (in the radiative sense) asymptotically simple solutions to the Einstein Field Equations. A programme aimed at investigating the existence of such solutions, and at providing a conclusive "answer" to Penrose's proposal, has been started by Friedrich – see e.g. [10–16]. His strategy is based on the use of the so-called Conformal Field Equations, which allow us to work and prove existence statements directly in the conformally rescaled, "unphysical" spacetime. Along these lines, Friedrich has been able to prove a semiglobal existence result that ensures that hyperboloidal initial data close to Minkowski data yield an asymptotically simple development which includes the point $i^+$ (future timelike infinity) [14]. Recently, Chruściel & Delay [2] – retaking an old idea by Cutler & Wald [5] – using a refined version of an initial data set constructed by Corvino [4] have been able to prove the existence of a large class of non-trivial (radiative) asymptotically simple spacetimes. Corvino's initial data are constructed so that they are Schwarzschildean in a neighbourhood of spatial infinity, $i^0$. This means that the radiation content in the spacetime arising from the development of the data is somehow special. This can be seen directly from the fact that the Newman-Penrose constants of the spacetime are zero [8, 26]. Chruściel & Delay's result is no doubt very important. However, it is not as general as one would like. It has been suspected for a long time now that the region of spacetime where null infinity and spatial infinity meet is somehow problematic – see e.g. [22]. From the analysis of the hyperboloidal initial value problem it turns out that the smoothness of null infinity is preserved by the evolution if smooth data sufficiently close to Minkowski data are prescribed.
The latter indicates that somehow the "decision" of having a smooth structure at null infinity is made in an arbitrarily small neighbourhood of spatial infinity. In some sense, Corvino's data avoid all the intricacies and complications of this region of spacetime by setting the asymptotic end in the simplest way which is consistent with the presence of a non-vanishing ADM mass on the initial hypersurface. In connection with this, Friedrich [16] has performed a detailed first analysis of the behaviour of the gravitational field arising from asymptotically Euclidean, time symmetric initial data in the region where null infinity "touches" spatial infinity. By means of a novel representation of spatial infinity in which the point $i^0$ of the standard conformal picture is blown up to a cylinder $I$ – the cylinder at spatial infinity – a certain regularity condition on the Cotton-Bach tensor and symmetrised higher order derivatives of it has been obtained. The hope was that this regularity condition would ensure the smoothness of null infinity, at least in the region close to $i^0$ and $I$. A subsequent analysis by Friedrich & Kánnár [19] of the first orders of some expansions that can be obtained by evaluating the Conformal Field Equations at $I$ led to the conjecture that Friedrich's regularity condition is the only condition one has to impose on time symmetric initial data possessing an analytic compactification in order to obtain a development with smooth null infinity [18]. More precisely,

2 The Cotton-Bach tensor $B_{ijk}$ is related to the Cotton-Bach spinor via $B_{ijk} \to b_{abce}\epsilon_{df} + b_{abdf}\epsilon_{ce}$. This correspondence is carried out by the Infeld-van der Waerden symbols.

Conjecture (Friedrich, 2002). There exists an integer $k_* > 0$ such that for given $k \geq k_*$ the time evolution of an asymptotically Euclidean, time symmetric, conformally smooth initial data set admits a conformal extension to null infinity of class $C^k$ near spacelike infinity, if the Cotton-Bach spinor (see footnote 2) satisfies the condition

$$D_{(a_s b_s} \cdots D_{a_1 b_1} b_{abcd)}(i) = 0, \qquad s = 0, 1, \ldots, s_*,$$

for a certain integer $s_* = s_*(k)$. If the extension is of class $C^\infty$ then the condition should hold to all orders.

The Cotton-Bach spinor alluded to in the previous conjecture can be regarded as the 3-dimensional analogue of the Weyl spinor in the sense that it characterises locally the conformal flatness of a manifold – i.e. it vanishes in a conformally flat region of a 3-dimensional manifold. The condition provided in Friedrich's conjecture can be shown to be conformally invariant and thus it is an asymptotic condition on the initial data. More remarkably, it can be shown to be satisfied by asymptotically Euclidean static data without restricting the multipole structure [15].

The objective of this paper is to provide further insight into the conjecture above. It will turn out that the conjecture, as it stands, is false. In order to see why this is the case, the expansions of Kánnár & Friedrich will be carried to an even higher order. This requires the implementation of the Conformal Field Equations on a computer algebra system (Maple V). It should be emphasized that despite the use of the computer to perform the expansions, the results presented here are exact up to the order carried out. In order to simplify our discussion, the analysis will be restricted to developments of time symmetric initial data which are conformally flat near infinity. This class of data satisfies the regularity condition of the conjecture in the simplest possible way while still providing a big enough family of data. The time symmetry requirement stems from the fact that Friedrich's analysis has so far only been carried out for this class of data. A similar analysis of initial data with non-vanishing second fundamental form still lies in the future. However, some first steps have already been carried out [7]. The conformal flatness of the data ensures that the initial data satisfy the regularity condition trivially.
Again, the construction of non-conformally flat data satisfying Friedrich's regularity condition is a non-trivial endeavour whose undertaking will be left for future studies. In the light of the results presented here, it turns out that the restriction to the class of conformally flat data is not a drawback. Furthermore, it is not hard to guess how the results could generalise in the case of general time symmetric data. The principal result of our investigation is the following.

Theorem (Main theorem). Necessary conditions for the development of initial data which are time symmetric and conformally flat in a neighbourhood $B_a(i)$ of (spatial) infinity to be smooth at the intersection of null infinity and spatial infinity are that the Newman-Penrose constants $G^{(5)}_k$, $k = 0, \ldots, 4$, and the higher order Newman-Penrose constants $G^{(6)}_k$, $k = 0, \ldots, 6$, vanish.

A more precise formulation of the theorem, including the definition of the Newman-Penrose constants and the higher order Newman-Penrose constants in terms of the initial data, will be given in the main text. It is just mentioned that the Newman-Penrose constants are a set of absolutely conserved quantities along null infinity. They are expressed in terms of integrals on cuts of null infinity and their value is independent of the cut considered. For asymptotically simple spacetimes containing the point $i^+$ in their conformal completion – e.g. those spacetimes arising from Friedrich's existence result for hyperboloidal data [14] – the constants can be shown to correspond to the value of the rescaled Weyl tensor at $i^+$. It is noted that the conditions appearing in the main theorem are fulfilled by the Schwarzschild initial data. The theorem has been obtained from the analysis, up to order $p = 6$, of expansions constructed from the solutions of the transport equations induced by the conformal field equations upon evaluation on the cylinder at spatial infinity.
From the evidence provided by the equations it is not unreasonable to conjecture that similar
conditions arise if one were to obtain expansions to even higher orders. In [16] it was shown that the whole setup of the cylinder at spatial infinity is completely regular for Schwarzschildean data. Thus, the hypothetical new conditions must be satisfied by the Schwarzschild initial data. It is on these grounds that the following conjecture is put forward:

Conjecture (New conjecture). If an initial data set which is time symmetric and conformally flat in a neighbourhood $B_a(i)$ of the point $i$ yields a development with a smooth null infinity, then the initial data are in fact Schwarzschildean in $B_a(i)$.

Again, a more technical version of the conjecture is given in the main text. This conjecture can be understood as some kind of rigidity result – i.e. the only asymptotically simple spacetimes which arise from data which are conformally flat near infinity are those of the Chruściel-Delay type. For example, the Brill-Lindquist data – which are conformally flat – do not satisfy the conditions stated in the main theorem, and thus their development is not asymptotically simple – see the main text for the details. However, one could always use the Corvino-Chruściel-Delay techniques to produce modified Brill-Lindquist data whose development possesses a smooth null infinity.

The article is structured as follows: in Sect. 2 a brief summary of the description of spacetime in the neighbourhood of spatial and null infinities in terms of the cylinder at spatial infinity is given. This digest has the intention of fixing the notation and conventions to be used in the calculations described in the present article. Particular attention is paid to the spatial 2-spinor formalism and to the expansions of functions on $S^3$ in terms of unitary representations of $SU(2, \mathbb{C})$. The reader is, in any case, referred to [16] for a more extensive discussion. In Sect. 3, the conformal field equations written in the conformal geodesic gauge are discussed.
The initial data for the latter in the case of an asymptotically Euclidean, time symmetric, conformally flat initial hypersurface are described. The transport equations implied by the conformal field equations on the cylinder at spatial infinity are also introduced. Section 4 contains the new results to be presented in the article. Here the solution of the transport equations is described. Due to the large size of the expressions involved, the description will be focused on what we believe are the most relevant features of the solutions. However, it should be emphasized that everything has been explicitly calculated. The main conclusions extracted from the calculations are presented as our main theorem. In order to understand the meaning of the conditions presented in the main theorem, the Schwarzschild solution is discussed in this context, and a conjecture is formulated.

2. Spacetime in a Neighbourhood of Spatial Infinity

Let $(\tilde M, \tilde g_{\mu\nu})$ be a vacuum spacetime arising as the development of asymptotically Euclidean, time symmetric, conformally analytic initial data $(\tilde S, \tilde h_{\alpha\beta})$. Later, we will restrict the class of initial data sets under discussion to those conformally flat around infinity. Assume for simplicity that $\tilde S$ possesses only one asymptotic end. Let $i$ be the infinity corresponding to that end. The point $i$ is obtained by conformally compactifying the initial hypersurface $\tilde S$ with an analytic conformal factor $\Omega$ which can be obtained from solving the time symmetric constraint equations. The compact 3-dimensional manifold obtained in this way will be denoted by $S$, and the conformally rescaled 3-metric by $h_{\alpha\beta}$. Assume that the 3-metric $h_{\alpha\beta}$ is analytic in an open ball $B_a(i)$ of $S$ with radius $a$ centered on $i$. Let $\rho$ denote the geodesic distance along geodesics starting at $i$. The radius $a$ of the ball $B_a(i)$ is chosen such that $B_a(i)$ is geodesically convex.
Now, let $\tilde{\mathcal N} \subset \tilde M$ be the domain of influence of the ball $B_a(i) \cap \tilde S$. Intuitively, one expects $\tilde{\mathcal N}$ to cover a region of spacetime "close to null and spatial infinities". In reference [16] it has been shown that once the time symmetric constraint equations have been solved, a certain gauge based on the properties of conformal geodesics can be introduced. Let $\tau$ be the parameter along these curves. This gauge has the property of producing a conformal factor $\Theta$ which can in turn be used to rescale the region $\tilde{\mathcal N}$ to obtain a "finite representation", $\mathcal N$, of spacetime in a neighbourhood of spatial and null infinities. The relevant conformal factor is given by

$$\Theta = \kappa^{-1}\,\Omega \left(1 - \tau^2 \frac{\kappa^2}{\omega^2}\right), \tag{1}$$

where $\omega$ is given by

$$\omega = \frac{2\Omega}{\sqrt{|D_a\Omega\, D^a\Omega|}}, \tag{2}$$

and $\kappa$ is a smooth function depending on $\rho$ and the "angular coordinates" – see below – such that $\kappa = \kappa'\rho$ with $\kappa'(i) = 1$. The function $\kappa'$ contains the remaining piece of conformal freedom in our setting.

Throughout this work, space 2-spinors will be systematically used. In order to avoid problems with vanishing frame vectors on surfaces diffeomorphic to spheres, our discussion will be carried out not on $\mathcal N$ but on a subbundle $C_{a,\kappa}$ of the frame bundle over $\mathcal N$. The subbundle $C_{a,\kappa}$ can be shown to be a 5-dimensional submanifold of $\mathbb{R} \times \mathbb{R} \times SU(2, \mathbb{C})$ with structure group $U(1)$. More precisely, we define $C_{a,\kappa}$ to be given by

$$C_{a,\kappa} = \left\{ (\tau, \rho, t) \in \mathbb{R} \times \mathbb{R} \times SU(2, \mathbb{C}) \;\Big|\; 0 \leq \rho < a,\; -\frac{\omega}{\kappa} \leq \tau \leq \frac{\omega}{\kappa} \right\}. \tag{3}$$

The projection $\pi$ of $C_{a,\kappa}$ into $\mathcal N$ corresponds to the Hopf map $SU(2, \mathbb{C}) \to SU(2, \mathbb{C})/U(1) \approx S^2$. Scalar fields and tensorial fields on $\mathcal N$ are lifted to $C_{a,\kappa}$. Their "angular" dependence will then be given in terms of functions of $t \in SU(2, \mathbb{C})$. The manifold $C_{a,\kappa}$ has the following important submanifolds:

$$I = \{ (\tau, \rho, t) \in C_{a,\kappa} \mid \rho = 0,\; |\tau| < 1 \}, \tag{4a}$$
$$I^\pm = \{ (\tau, \rho, t) \in C_{a,\kappa} \mid \rho = 0,\; \tau = \pm 1 \}, \tag{4b}$$
$$\mathscr{I}^\pm = \left\{ (\tau, \rho, t) \in C_{a,\kappa} \;\Big|\; \rho > 0,\; \tau = \pm\frac{\omega}{\kappa} \right\}. \tag{4c}$$
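
As an illustration of formulae (1)–(4), the following worked example evaluates them in the simplest case. It is only a sketch under stated assumptions: the initial conformal factor is taken in the form of Eq. (21) of Sect. 3.1 with $W$ constant, $W = m/2$; the gauge function is chosen as $\kappa = \rho$ (that is, $\kappa' = 1$); and the norm of $D_a\Omega$ is computed with the flat 3-metric in which $\rho$ is the radial coordinate.

```latex
\begin{align*}
\Omega &= \frac{\rho^2}{(1+\rho W)^2}, \qquad W = \frac{m}{2}, \\
\partial_\rho \Omega &= \frac{2\rho}{(1+\rho W)^2} - \frac{2\rho^2 W}{(1+\rho W)^3}
                      = \frac{2\rho}{(1+\rho W)^3}, \\
\omega &= \frac{2\Omega}{\sqrt{|D_a\Omega\, D^a\Omega|}}
        = \frac{2\rho^2}{(1+\rho W)^2}\cdot\frac{(1+\rho W)^3}{2\rho}
        = \rho\,(1+\rho W), \\
\frac{\omega}{\kappa} &= 1 + \frac{m}{2}\,\rho \qquad (\kappa = \rho), \\
\Theta &= \kappa^{-1}\,\Omega\left(1 - \tau^2\frac{\kappa^2}{\omega^2}\right)
        = \frac{\rho}{(1+\rho W)^2}\left(1 - \frac{\tau^2}{(1+\rho W)^2}\right).
\end{align*}
```

With these choices $\Theta$ vanishes on $\rho = 0$, which is the cylinder $I$ together with $I^\pm$, and on $\tau = \pm(1 + m\rho/2)$ with $\rho > 0$, which is $\mathscr{I}^\pm$. The two sets meet at $\rho = 0$, $\tau = \pm 1$, in agreement with (4a)–(4c).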
2.1. Space 2-spinors. Consider the antisymmetric spinors $\epsilon_{ab}$, $\epsilon^{ab}$, $a, b = 0, 1$. These satisfy $\epsilon_{01} = 1$, $\epsilon^{01} = 1$. Let $\tau^{aa'}$ denote the tangent vector to the conformal geodesics parametrised by $\tau$. We set

$$\tau^{aa'} = \epsilon_0{}^a\, \epsilon_{0'}{}^{a'} + \epsilon_1{}^a\, \epsilon_{1'}{}^{a'}. \tag{5}$$

Differential operators $X_+$, $X_-$ and $X$ on $SU(2, \mathbb{C})$ can be introduced by considering a basis of the Lie algebra of $SU(2, \mathbb{C})$ and then looking at (complex) linear combinations of the real left invariant vector fields generated by the basis of the Lie algebra on $SU(2, \mathbb{C})$. In particular, the vector field $X$ is chosen so that it generates $U(1)$. The fields $X_\pm$, $X$ satisfy the following commutation relations:

$$[X, X_+] = 2X_+, \qquad [X, X_-] = -2X_-, \qquad [X_+, X_-] = -X. \tag{6}$$
With the help of $X_\pm$ one can construct the following (frame) spinor fields,

$$c_{aa'} = c^0_{aa'}\,\partial_\tau + c^1_{aa'}\,\partial_\rho + c^+_{aa'}\,X_+ + c^-_{aa'}\,X_-, \tag{7}$$

on $C_{a,\kappa}$ (see footnote 3). The use of a space spinor formalism based on the vector field $\sqrt{2}\,\partial_\tau = \tau^{aa'} c_{aa'}$ allows us to perform our whole discussion in terms of quantities without primed indices. Accordingly, we write

$$c_{aa'} = \frac{1}{\sqrt{2}}\,\tau_{aa'}\,\partial_\tau - \tau^{b}{}_{a'}\, c_{ab}, \tag{8}$$

with $c_{ab} = \tau_{(a}{}^{b'} c_{b)b'} = c^0_{ab}\,\partial_\tau + c^1_{ab}\,\partial_\rho + c^+_{ab}\,X_+ + c^-_{ab}\,X_-$. The connection is represented by coefficients $\Gamma_{abcd}$ which can be decomposed in the form

$$\Gamma_{abcd} = \frac{1}{\sqrt{2}}\left(\xi_{abcd} - \chi_{(ab)cd}\right) - \frac{1}{2}\,\epsilon_{ab} f_{cd}, \tag{9}$$

where the fields entering in the decomposition possess the following symmetries: $\chi_{abcd} = \chi_{ab(cd)}$, $\xi_{abcd} = \xi_{(ab)(cd)}$, $f_{ab} = f_{(ab)}$. The curvature will be described by the rescaled conformal Weyl spinor $\phi_{abcd} = \phi_{(abcd)}$, and by the spinor field $\Theta_{abcd} = \Theta_{ab(cd)}$ which encodes information relative to the Ricci part of the curvature. For the purpose of writing the field equations it will be customary to consider it decomposed in terms of $\Theta_g{}^g{}_{cd}$ and $\Theta_{(ab)cd}$.

For later use it is noted that an arbitrary four-index spinor $X_{abcd}$ can be written in terms of the "elementary spinors" $\epsilon^i_{abcd}$ with $i = 0, \ldots, 4$, $\epsilon_{ac} x_{bd} + \epsilon_{bd} x_{ac}$, $\epsilon_{ac} y_{bd} + \epsilon_{bd} y_{ac}$, $\epsilon_{ac} z_{bd} + \epsilon_{bd} z_{ac}$, and $h_{abcd}$, where

$$x_{ab} = \sqrt{2}\,\epsilon_{(a}{}^0 \epsilon_{b)}{}^1, \qquad y_{ab} = -\frac{1}{\sqrt{2}}\,\epsilon_a{}^1 \epsilon_b{}^1, \qquad z_{ab} = \frac{1}{\sqrt{2}}\,\epsilon_a{}^0 \epsilon_b{}^0, \tag{10}$$

and

$$\epsilon^i_{abcd} = \epsilon_{(a}{}^{(e}\, \epsilon_b{}^{f}\, \epsilon_c{}^{g}\, \epsilon_{d)}{}^{h)_i}, \qquad h_{abcd} = -\epsilon_{a(c}\,\epsilon_{d)b}. \tag{11}$$

The notation $(abcd)_i$ means that the indices are to be symmetrised and then $i$ of them set to 1. We write

$$X_{abcd} = X_0\,\epsilon^0_{abcd} + X_1\,\epsilon^1_{abcd} + X_2\,\epsilon^2_{abcd} + X_3\,\epsilon^3_{abcd} + X_4\,\epsilon^4_{abcd} + X_x(\epsilon_{ac} x_{bd} + \epsilon_{bd} x_{ac}) + X_y(\epsilon_{ac} y_{bd} + \epsilon_{bd} y_{ac}) + X_z(\epsilon_{ac} z_{bd} + \epsilon_{bd} z_{ac}) + X_h\, h_{abcd}. \tag{12}$$

3 A knowledge of the spinorial fields is equivalent to a knowledge of the metric. The tensorial expressions for the frame can be recovered using the Infeld-van der Waerden symbols via $e_{\hat\mu} = \sigma_{\hat\mu}{}^{aa'} c_{aa'}$, where the indices with caret are frame indices. The metric tensor is then recovered from the completeness relation for the frame.
2.2. Expansions of functions on $C_{a,\kappa}$. In order to obtain our expansions of the gravitational field around spatial infinity, we will need to decompose the functions arising into their diverse spherical (harmonic) sectors. It is recalled that any real analytic complex-valued function $f$ on $SU(2, \mathbb{C})$ can be expanded in terms of some functions $\sqrt{m+1}\, T_m{}^k{}_j$ forming a complete set in $L^2(\mu, SU(2, \mathbb{C}))$, where $\mu$ is the standard Haar measure on $SU(2, \mathbb{C})$. One has

$$f(t) = \sum_{m=0}^{\infty} \sum_{j=0}^{m} \sum_{k=0}^{m} f_{m,k,j}\, T_m{}^k{}_j.$$

The functions $T_m{}^k{}_j$ can be shown to be related to the standard spin-weighted spherical harmonics. The operators $X_\pm$ and $X$ introduced in the previous section can be seen to yield, upon application to the $T_m{}^k{}_j$ functions, the following:

$$X_+ T_m{}^k{}_j = \beta_{m,j}\, T_m{}^k{}_{j-1}, \tag{13a}$$
$$X_- T_m{}^k{}_j = -\beta_{m,j+1}\, T_m{}^k{}_{j+1}, \tag{13b}$$

with $\beta_{m,j} = \sqrt{j(m - j + 1)}$.
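
The action (13a)–(13b) can be checked against the commutation relations (6) numerically. The sketch below is not taken from the paper: it fixes $m$ and $k$, represents $X_+$, $X_-$ and $X$ as matrices on the span of $T_m{}^k{}_j$, $j = 0, \ldots, m$, takes $\beta_{m,j} = \sqrt{j(m-j+1)}$, and assumes that $X$ acts diagonally as $X\,T_m{}^k{}_j = (m - 2j)\,T_m{}^k{}_j$, which is consistent with the spin-weight convention $Xf = 2sf$ used in the text. All function names are hypothetical.

```python
import math


def beta(m: int, j: int) -> float:
    """beta_{m,j} = sqrt(j (m - j + 1)), as in (13)."""
    return math.sqrt(j * (m - j + 1))


def ladder_matrices(m: int):
    """Matrices of X_+, X_-, X on span{T_m^k_j : j = 0..m}; column j is the image of T_j."""
    n = m + 1
    x_plus = [[0.0] * n for _ in range(n)]
    x_minus = [[0.0] * n for _ in range(n)]
    x_diag = [[0.0] * n for _ in range(n)]
    for j in range(n):
        if j >= 1:
            x_plus[j - 1][j] = beta(m, j)        # X_+ T_j = beta_{m,j} T_{j-1}
        if j < m:
            x_minus[j + 1][j] = -beta(m, j + 1)  # X_- T_j = -beta_{m,j+1} T_{j+1}
        x_diag[j][j] = float(m - 2 * j)          # assumption: X T_j = (m - 2j) T_j
    return x_plus, x_minus, x_diag


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]


def scaled(factor, a):
    return [[factor * entry for entry in row] for row in a]


def close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def satisfies_relations(m: int) -> bool:
    """Check [X, X_+] = 2 X_+, [X, X_-] = -2 X_-, [X_+, X_-] = -X for a given m."""
    xp, xm, x = ladder_matrices(m)
    return (close(commutator(x, xp), scaled(2.0, xp))
            and close(commutator(x, xm), scaled(-2.0, xm))
            and close(commutator(xp, xm), scaled(-1.0, x)))


print(all(satisfies_relations(m) for m in range(7)))
```

The third relation is the one that is sensitive to the square root in $\beta_{m,j}$, since $[X_+, X_-]$ acts on $T_m{}^k{}_j$ as $\beta_{m,j}^2 - \beta_{m,j+1}^2 = 2j - m$.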
A function $f$ on $SU(2, \mathbb{C})$ is said to have spin weight $s$ if $Xf = 2sf$. This definition can be readily extended to functions on $C_{a,\kappa}$. As will be seen later, all the quantities we will work with have a well defined spin weight. Let $f$ be an analytic function on $C_{a,\kappa}$ with an integer spin weight $s$. Now, consider a spinorial symmetric analytic function on $C_{a,\kappa}$, $\lambda_{a_1 \cdots a_{2r}}$, with essential components $\lambda_j = \lambda_{(a_1 \cdots a_{2r})_j}$, $0 \leq j \leq 2r$, of spin weight $s = r - j$. Then, the components of the function will possess expansions of the form

$$\lambda_j = \sum_{p=0}^{\infty} \lambda_{j,p}\, \rho^p, \tag{14}$$

where the coefficients $\lambda_{j,p}$ can in turn be decomposed in terms of the functions $T_m{}^k{}_j$ as

$$\lambda_{j,p} = \sum_{q=|r-j|}^{q(p)} \sum_{k=0}^{2q} \lambda_{j,p;2q,k}\, T_{2q}{}^{k}{}_{q-r+j}, \tag{15}$$
where $0 \leq |r - j| \leq q(p) \leq \infty$. An expansion of the latter type will be referred to as an expansion of type $q(p)$. The conformal field equations are nonlinear. Thus, when expanding them, one finds products of $T$-functions. These products can in turn be reexpressed as a linear combination of $T$'s. More precisely, one has:

Lemma 1. Multiplying T functions. The following holds:

$$T_{2n_1}{}^{k_1}{}_{l_1} \times T_{2n_2}{}^{k_2}{}_{l_2} = \sum_{n=q_0}^{n_1+n_2} (-1)^{n+n_1+n_2}\, C(n_1, n_1 - l_1; n_2, n_2 - l_2; n, n_1 + n_2 - l_1 - l_2) \times C(n_1, n_1 - k_1; n_2, n_2 - k_2; n, n_1 + n_2 - k_1 - k_2) \times T_{2n}{}^{n+k_1+k_2-n_1-n_2}{}_{n+l_1+l_2-n_1-n_2}, \tag{16}$$
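where $q_0 = \max\{|n_1 - n_2|,\; n_1 + n_2 - k_1 - k_2,\; n_1 + n_2 - l_1 - l_2\}$, and $C(l_1, m_1; l_2, m_2; l, m)$ denotes the standard Clebsch-Gordan coefficients of $SU(2, \mathbb{C})$ (see footnote 4).

4 Some other notations used in the physics literature are: $C(l_1, m_1; l_2, m_2; l, m) \equiv \langle l_1, l_2; m_1, m_2 | l, m \rangle \equiv C(l_1, l_2, l | m_1, m_2, m)$.

3. The Conformal Evolution Equations

Using the conformal geodesic gauge and the 2-spinor decomposition, it can be shown that the extended conformal field equations given in [16] imply the following evolution equations for the unknowns $v = (c^\mu_{ab}, \xi_{abcd}, f_{ab}, \chi_{(ab)cd}, \Theta_{(ab)cd}, \Theta_g{}^g{}_{cd})$, $\mu = 0, 1, \pm$:

$$\partial_\tau c^0_{ab} = -\chi_{(ab)}{}^{ef} c^0_{ef} - f_{ab}, \tag{17a}$$
$$\partial_\tau c^\alpha_{ab} = -\chi_{(ab)}{}^{ef} c^\alpha_{ef}, \tag{17b}$$
$$\partial_\tau \xi_{abcd} = -\chi_{(ab)}{}^{ef} \xi_{efcd} + \frac{1}{\sqrt{2}}\left(\epsilon_{ac}\chi_{(bd)ef} + \epsilon_{bd}\chi_{(ac)ef}\right) f^{ef} - \sqrt{2}\,\chi_{(ab)(c}{}^{e} f_{d)e} - \frac{1}{2}\left(\epsilon_{ac}\Theta_f{}^f{}_{bd} + \epsilon_{bd}\Theta_f{}^f{}_{ac}\right) - i\,\Theta\,\mu_{abcd}, \tag{17c}$$
$$\partial_\tau f_{ab} = -\chi_{(ab)}{}^{ef} f_{ef} + \frac{1}{\sqrt{2}}\,\Theta_f{}^f{}_{ab}, \tag{17d}$$
$$\partial_\tau \chi_{(ab)cd} = -\chi_{(ab)}{}^{ef} \chi_{efcd} - \Theta_{(cd)ab} + \Theta\,\eta_{abcd}, \tag{17e}$$
$$\partial_\tau \Theta_{(ab)cd} = -\chi_{(cd)}{}^{ef}\, \Theta_{(ab)ef} - \partial_\tau\Theta\; \eta_{abcd} + i\sqrt{2}\, d^e{}_{(a}\, \mu_{b)cde}, \tag{17f}$$
$$\partial_\tau \Theta_g{}^g{}_{ab} = -\chi_{(ab)}{}^{ef}\, \Theta_g{}^g{}_{ef} + \sqrt{2}\, d^{ef} \eta_{abef}, \tag{17g}$$

where

$$\eta_{abcd} = \frac{1}{2}\left(\phi_{abcd} + \tau_a{}^{a'} \tau_b{}^{b'} \tau_c{}^{c'} \tau_d{}^{d'}\, \bar\phi_{a'b'c'd'}\right), \qquad \mu_{abcd} = -\frac{i}{2}\left(\phi_{abcd} - \tau_a{}^{a'} \tau_b{}^{b'} \tau_c{}^{c'} \tau_d{}^{d'}\, \bar\phi_{a'b'c'd'}\right), \tag{18}$$

denote respectively the electric and magnetic part of $\phi_{abcd}$ with respect to the field $\partial_\tau$. The quantities $\Theta$, $\partial_\tau\Theta$ and $d_{ab}$, given by formulae (1) and (23), are known directly from the initial data. Thus, Eqs. (17a)–(17g) are essentially ordinary differential equations for the components of the vector $v$. The most important part of the propagation equations corresponds to the evolution equations for the spinor $\phi_{abcd}$ derived from the Bianchi identities, the Bianchi propagation equations:

$$(\sqrt{2} - 2c^0_{01})\,\partial_\tau\phi_0 + 2c^0_{00}\,\partial_\tau\phi_1 - 2c^\alpha_{01}\,\partial_\alpha\phi_0 + 2c^\alpha_{00}\,\partial_\alpha\phi_1 = (2\Gamma_{0011} - 8\Gamma_{1010})\,\phi_0 + (4\Gamma_{0001} + 8\Gamma_{1000})\,\phi_1 - 6\Gamma_{0000}\,\phi_2, \tag{19a}$$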
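$$\sqrt{2}\,\partial_\tau\phi_1 - c^0_{11}\,\partial_\tau\phi_0 + c^0_{00}\,\partial_\tau\phi_2 - c^\alpha_{11}\,\partial_\alpha\phi_0 + c^\alpha_{00}\,\partial_\alpha\phi_2 = -(4\Gamma_{1110} + f_{11})\,\phi_0 + (2\Gamma_{0011} + 4\Gamma_{1100} - 2f_{01})\,\phi_1 + 3f_{00}\,\phi_2 - 2\Gamma_{0000}\,\phi_3, \tag{19b}$$
$$\sqrt{2}\,\partial_\tau\phi_2 - c^0_{11}\,\partial_\tau\phi_1 + c^0_{00}\,\partial_\tau\phi_3 - c^\alpha_{11}\,\partial_\alpha\phi_1 + c^\alpha_{00}\,\partial_\alpha\phi_3 = -\Gamma_{1111}\,\phi_0 - 2(\Gamma_{1101} + f_{11})\,\phi_1 + 3(\Gamma_{0011} + \Gamma_{1100})\,\phi_2 - 2(\Gamma_{0001} - f_{00})\,\phi_3 - \Gamma_{0000}\,\phi_4, \tag{19c}$$
$$\sqrt{2}\,\partial_\tau\phi_3 - c^0_{11}\,\partial_\tau\phi_2 + c^0_{00}\,\partial_\tau\phi_4 - c^\alpha_{11}\,\partial_\alpha\phi_2 + c^\alpha_{00}\,\partial_\alpha\phi_4 = -2\Gamma_{1111}\,\phi_1 - 3f_{11}\,\phi_2 + (2\Gamma_{1100} + 4\Gamma_{0011} + 2f_{01})\,\phi_3 - (4\Gamma_{0001} - f_{00})\,\phi_4, \tag{19d}$$
$$(\sqrt{2} + 2c^0_{01})\,\partial_\tau\phi_4 - 2c^0_{11}\,\partial_\tau\phi_3 + 2c^\alpha_{01}\,\partial_\alpha\phi_4 - 2c^\alpha_{11}\,\partial_\alpha\phi_3 = -6\Gamma_{1111}\,\phi_2 + (4\Gamma_{1110} + 8\Gamma_{0111})\,\phi_3 + (2\Gamma_{1100} - 8\Gamma_{0101})\,\phi_4. \tag{19e}$$

To the latter we add a set of three equations, also implied by the Bianchi identities, which we refer to as the Bianchi constraint equations:

$$c^0_{11}\,\partial_\tau\phi_0 - 2c^0_{01}\,\partial_\tau\phi_1 + c^0_{00}\,\partial_\tau\phi_2 + c^\alpha_{11}\,\partial_\alpha\phi_0 - 2c^\alpha_{01}\,\partial_\alpha\phi_1 + c^\alpha_{00}\,\partial_\alpha\phi_2 = -(2\Gamma_{(01)11} - 4\Gamma_{1110})\,\phi_0 + (2\Gamma_{0011} - 4\Gamma_{(01)01} - 4\Gamma_{1100})\,\phi_1 + 6\Gamma_{(01)00}\,\phi_2 - 2\Gamma_{0000}\,\phi_3, \tag{20a}$$
$$c^0_{11}\,\partial_\tau\phi_1 - 2c^0_{01}\,\partial_\tau\phi_2 + c^0_{00}\,\partial_\tau\phi_3 + c^\alpha_{11}\,\partial_\alpha\phi_1 - 2c^\alpha_{01}\,\partial_\alpha\phi_2 + c^\alpha_{00}\,\partial_\alpha\phi_3 = \Gamma_{1111}\,\phi_0 - (4\Gamma_{(01)11} - 2\Gamma_{1101})\,\phi_1 + 3(\Gamma_{0011} - \Gamma_{1100})\,\phi_2 - (2\Gamma_{0001} - 4\Gamma_{(01)00})\,\phi_3 - \Gamma_{0000}\,\phi_4, \tag{20b}$$
$$c^0_{11}\,\partial_\tau\phi_2 - 2c^0_{01}\,\partial_\tau\phi_3 + c^0_{00}\,\partial_\tau\phi_4 + c^\alpha_{11}\,\partial_\alpha\phi_2 - 2c^\alpha_{01}\,\partial_\alpha\phi_3 + c^\alpha_{00}\,\partial_\alpha\phi_4 = 2\Gamma_{1111}\,\phi_1 - 6\Gamma_{(01)11}\,\phi_2 + (4\Gamma_{0011} + 4\Gamma_{(01)01} - 2\Gamma_{1100})\,\phi_3 - (4\Gamma_{0001} - 2\Gamma_{(01)00})\,\phi_4. \tag{20c}$$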
3.1. The initial data. As pointed out in the introduction, only asymptotically Euclidean, time symmetric, analytically conformally flat initial data will be considered in our discussion. A number of simplifications arise under these assumptions. In particular, around $i$, the conformal factor of the initial hypersurface can be written as
$$\Omega = \frac{\rho^2}{(1 + \rho W)^2}, \tag{21}$$
where $W(i) = m/2$, with $m$ the ADM mass of the initial hypersurface. The function $W$ satisfies the Yamabe equation, which under our assumptions reduces to the Laplace equation. Therefore $W$ is harmonic, and thus can be written as
$$W = \frac{m}{2} + \sum_{p=1}^{\infty}\sum_{k=0}^{2p}\frac{1}{p!}\, w_{p,2p,k}\, T_{2p}{}^{k}{}_{p}\, \rho^p, \tag{22}$$
where the coefficients $w_{p,2p,k}$, $p = 1, 2, \ldots$, $k = 0, \ldots, 2p$ are complex numbers satisfying the regularity condition $\overline{w}_{p,2p,k} = (-1)^{p+k} w_{p,2p,2p-k}$, so that $W$ is a real valued function. As mentioned in Sect. 2, a crucial property of our setup based on the properties of conformal geodesics is that it renders a conformal factor $\Theta$ (see formula (1)) for the region of spacetime under discussion. Furthermore, solving the conformal geodesic equations also yields a 1-form $d_{ab}$, which appears in the propagation equations (17f) and (17g). Under our assumptions of time symmetry and conformal flatness it is given by
$$d_{ab} = 2\rho\,\frac{x_{ab} - \rho^2 D_{ab}W}{(1 + \rho W)^3}. \tag{23}$$
Once the function $\kappa$ of Sect. 2 has been chosen, the initial data for the conformal propagation equations (17a)–(17g) is given by
$$\Theta_{abcd} = -\frac{\kappa^2}{\Omega} D_{(ab}D_{cd)}\Omega, \tag{24a}$$
$$\phi_{abcd} = \frac{\kappa^3}{\Omega^2} D_{(ab}D_{cd)}\Omega, \tag{24b}$$
$$c^0_{ab} = 0, \qquad c^1_{ab} = \kappa x_{ab}, \tag{24c}$$
$$c^+_{ab} = \frac{\kappa}{\rho} z_{ab}, \qquad c^-_{ab} = \frac{\kappa}{\rho} y_{ab}, \tag{24d}$$
$$\xi_{abcd} = \sqrt{2}\left(\frac{\kappa}{2\rho}(\epsilon_{ac}x_{bd} + \epsilon_{bd}x_{ac}) - \frac{1}{2\kappa}(\epsilon_{ac}D_{bd}\kappa + \epsilon_{bd}D_{ac}\kappa)\right), \tag{24e}$$
$$\chi_{(ab)cd} = 0, \qquad f_{ab} = D_{ab}\kappa, \tag{24f}$$
where $D_{ab}$, the spinorial covariant derivative of the initial hypersurface $S$, is given by
$$D_{ab}\mu_{cd} = x_{ab}\partial_\rho\mu_{cd} + \frac{1}{\rho} z_{ab}X_+\mu_{cd} + \frac{1}{\rho} y_{ab}X_-\mu_{cd} - \gamma_{ab}{}^{e}{}_{c}\,\mu_{ed} - \gamma_{ab}{}^{e}{}_{d}\,\mu_{ec}, \tag{25}$$
for a given differentiable spinorial function $\mu_{cd}$, and where the flat connection coefficients $\gamma_{abcd}$ are given by
$$\gamma_{abcd} = \frac{1}{2\rho}(\epsilon_{ac}x_{bd} + \epsilon_{bd}x_{ac}). \tag{26}$$

3.2. The transport equations. Equations (17a)–(17g) can be concisely written in the form
$$\partial_\tau v = Kv + Q(v, v) + L\phi, \tag{27}$$
where $K$ and $Q$ are respectively a linear and a quadratic function with constant coefficients, whereas $L$ is a linear function depending on the coordinates via $\Theta$, $\partial_\tau\Theta$ and $d_{ab}$. For the Bianchi propagation equations (19a)–(19e) one can write
$$\sqrt{2}E\,\partial_\tau\phi + A^{ab}c^\mu_{ab}\,\partial_\mu\phi = B(\Gamma_{abcd})\phi, \tag{28}$$
where now $E$ denotes the $(5 \times 5)$ unit matrix, $A^{ab}c^\mu_{ab}$ are $(5 \times 5)$ matrices, and $B(\Gamma_{abcd})$ is a linear $(5 \times 5)$-matrix valued function of the connection coefficients $\Gamma_{abcd}$. In the sequel, given an unknown $u$ we will write $u^{(0)} = u|_I$. The objects $\Theta$, $\partial_\tau\Theta$ and $d_{ab}$, from which $L$ is constructed, vanish on $I$. Thus $L^{(0)} = L|_I = 0$, and consequently Eqs. (27) and (28) decouple from each other. The system of equations for the $v$ unknowns, Eq. (27), turns out to be an interior system upon evaluation on the cylinder
at spatial infinity. Its initial data can be read from the restriction of the initial data (24a)–(24f) to $I$. It can be seen that this restriction, irrespective of the choice of the function $\kappa$, coincides with the initial data of Minkowski spacetime. With this information in hand, the system for $v^{(0)}$ can be readily solved, yielding
$$\Theta^{(0)}_{abcd} = 0, \qquad \chi^{(0)}_{(ab)cd} = 0, \qquad f^{(0)}_{ab} = 0, \qquad \xi^{(0)}_{abcd} = 0, \tag{29a}$$
$$(c^0_{ab})^{(0)} = -\tau x_{ab}, \qquad (c^1_{ab})^{(0)} = 0, \qquad (c^-_{ab})^{(0)} = y_{ab}, \qquad (c^+_{ab})^{(0)} = z_{ab}. \tag{29b}$$
From this solution it follows that the matrix $A^{ab}c^1_{ab}$ in the system (28) satisfies
$$A^{ab}c^1_{ab}|_I = 0. \tag{30}$$
This particular result will be crucial in our later discussion. As a consequence of it, the system (28) implies another interior system on $I$, as no $\rho$-derivatives arise upon evaluation on $I$. It can be solved, giving
$$\phi^{(0)}_{abcd} = -6m\,\epsilon^2_{abcd}. \tag{31}$$
Because the whole system of conformal field equations reduces to an interior system on $I$ (something that does not happen with normal characteristics) we call it a total characteristic. The idea of interior systems previously discussed can be generalised by applying $p$ times $\partial_\rho$ to Eqs. (27) and (28) and then evaluating on $I$. In this way one obtains a hierarchy of interior systems for the unknowns $u^{(p)} = \partial^p_\rho u|_I$. These quantities can be used to construct formal expansions of the form
$$u = \sum_{p=0}\frac{1}{p!}\,u^{(p)}\rho^p \tag{32}$$
for the field quantities. The resulting equations, which will be referred to generically as transport equations, are of the form
$$\partial_\tau v^{(p)} = Kv^{(p)} + Q(v^{(0)}, v^{(p)}) + Q(v^{(p)}, v^{(0)}) + \sum_{j=1}^{p-1}\left(Q(v^{(j)}, v^{(p-j)}) + L^{(j)}\phi^{(p-j)}\right) + L^{(p)}\phi^{(0)}, \tag{33}$$
which correspond to the propagation equations of the $v$ unknowns, Eqs. (17a)–(17g). From the Bianchi propagation equations (19a)–(19e) one gets
$$\left(\sqrt{2}E + A^{ab}(c^0_{ab})^{(0)}\right)\partial_\tau\phi^{(p)} + A^{ab}(c^C_{ab})^{(0)}\partial_C\phi^{(p)} = B(\Gamma^{(0)}_{abcd})\phi^{(p)} + \sum_{j=1}^{p}\binom{p}{j}\left(B(\Gamma^{(j)}_{abcd})\phi^{(p-j)} - A^{ab}(c^\mu_{ab})^{(j)}\partial_\mu\phi^{(p-j)}\right), \tag{34}$$
where $C = \pm$. The systems (33) and (34) can be regarded as systems for the unknowns $v^{(p)}$ and $\phi^{(p)}$ if the lower order quantities $v^{(j)}$ and $\phi^{(j)}$, $0 \leq j \leq p - 1$, are known. The two systems are decoupled from each other, and accordingly one would first solve the system (33) and then feed its solution into the system (34), which could in turn be solved.
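The binomial weights in Eq. (34) are those of the Leibniz rule: applying $\partial^p_\rho$ to a product and evaluating at $\rho = 0$ gives $\sum_j \binom{p}{j} a^{(j)} b^{(p-j)}$, with the $j = 0$ and $j = p$ terms being the ones that contain the order-$p$ unknown. The following Python sketch checks this bookkeeping on two arbitrary polynomials in $\rho$. The polynomials and helper names are illustrative assumptions and are not taken from the Maple V scripts mentioned later in the text.

```python
from math import comb, factorial

def poly_mul(a, b):
    """Multiply two polynomials in rho given as coefficient lists (lowest order first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def deriv_at_zero(coeffs, p):
    """p-th rho-derivative at rho = 0, i.e. u^(p) in the normalisation u = sum u^(p) rho^p / p!."""
    return factorial(p) * coeffs[p] if p < len(coeffs) else 0

def leibniz(a, b, p):
    """Order-p coefficient of the product a*b assembled from lower-order coefficients."""
    return sum(comb(p, j) * deriv_at_zero(a, j) * deriv_at_zero(b, p - j)
               for j in range(p + 1))

# Two arbitrary test polynomials (hypothetical data, for illustration only).
a = [2, -1, 3, 5]
b = [1, 4, 0, -2, 7]
product = poly_mul(a, b)

for p in range(len(product)):
    assert deriv_at_zero(product, p) == leibniz(a, b, p)
```

The same pattern, applied to the quadratic term $Q(v, v)$ and to the products in Eq. (28), is what turns the propagation equations into a hierarchy in which order $p$ is sourced by orders $0, \ldots, p-1$.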
Fig. 1. Spacetime close to spatial and null infinities: to the left the standard representation of spatial infinity as a point $i^0$; to the right the representation where spatial infinity is envisaged as a cylinder
Because of (29a) and (29b), the matrix accompanying the $\partial_\tau$ derivative in the system (34) is given by
$$\sqrt{2}E + A^{ab}(c^0_{ab})^{(0)} = \sqrt{2}\,\mathrm{diag}(1 + \tau, 1, 1, 1, 1 - \tau). \tag{35}$$
As a consequence of this, the symbol of the system loses rank at $\tau = \pm 1$: the system degenerates there. The points $\tau = \pm 1$ correspond precisely to the sets $I^\pm$, cf. (4b), the sets where “null infinity touches spatial infinity”. It is exactly this feature of the field equations that forces us to undertake a complicated and detailed analysis of the system (28). From a heuristic point of view the degeneracy at the sets $I^\pm$ can be understood as a consequence of the change of behaviour of the conformal boundary of the spacetime with regard to the conformal field equations: the cylinder at spatial infinity $I$ is a total characteristic, while null infinity $\mathscr{I}^\pm$ is “only” a normal characteristic, i.e. only proper subsets of the field equations reduce to interior systems on either $\mathscr{I}^+$ or $\mathscr{I}^-$.

Now, standard theory of symmetric hyperbolic systems guarantees, for a given order $p$, the existence of solutions to the joint system (33)–(34) for any subset of $I$. However, at the sets $I^\pm$ the degeneracy implied by Eq. (35) means that the usual energy estimates provide no information precisely at the points one is most interested in; a similar phenomenon occurs with the original system (27)–(28). Thus, one needs to devise non-standard methods in order to address existence issues, see for example the discussions in [9, 25].

A first analysis of the transport equations carried out in [16] reveals that the effect of the degeneracy described in the previous paragraph is to produce solutions of the transport equations containing terms of the form $(1 - \tau)^m(1 + \tau)^n\ln(1 \pm \tau)$, with the effect that the solutions to the transport equations do not extend smoothly to the sets $I^\pm$. A necessary condition to preclude the appearance of such logarithmic terms is given in the following theorem.

Theorem 1 (Friedrich, 1997). The solutions $u^{(p)}$ of the transport equations extend smoothly to the sets $I^\pm$ only if the condition
$$D_{(a_s b_s}\cdots D_{a_1 b_1}b_{abcd)}(i) = 0, \qquad s = 0, 1, \ldots \tag{36}$$
is satisfied at all orders $s$. If it is not satisfied for some $s$, the solutions develop logarithmic singularities at $I^\pm$.
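How a coefficient of $\partial_\tau$ that vanishes at $\tau = -1$ can generate terms of the form $(1 + \tau)^n\ln(1 + \tau)$ may be seen in a scalar toy model. The equation $(1 + \tau)y' - ky = (1 + \tau)^k$ used below is an assumption chosen purely for illustration; it is not one of the transport equations (33)–(34). Its general solution is $y = (1 + \tau)^k(\ln(1 + \tau) + C)$, which the sketch verifies by central finite differences.

```python
import math

def y(tau, k, c=0.0):
    """Solution of the toy equation (1 + tau) y' - k y = (1 + tau)**k."""
    return (1.0 + tau) ** k * (math.log(1.0 + tau) + c)

def residual(tau, k, c=0.0, h=1e-6):
    """(1 + tau) y' - k y - (1 + tau)**k, with y' from a central difference."""
    dy = (y(tau + h, k, c) - y(tau - h, k, c)) / (2.0 * h)
    return (1.0 + tau) * dy - k * y(tau, k, c) - (1.0 + tau) ** k

# The residual vanishes (to finite-difference accuracy) away from tau = -1.
for k in (1, 2, 3):
    for tau in (-0.5, 0.0, 0.5):
        assert abs(residual(tau, k)) < 1e-6
        assert abs(residual(tau, k, c=2.5)) < 1e-6
```

In this toy model the source term resonates with the homogeneous solution $(1 + \tau)^k$, and the logarithm is the result. Since the $k$-th derivative of $(1 + \tau)^k\ln(1 + \tau)$ contains $k!\ln(1 + \tau)$, such a term has $k - 1$ continuous derivatives at $\tau = -1$ but not $k$.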
This theorem is the basis for the conjecture by H. Friedrich mentioned in the introduction. The analysis leading to the theorem is essentially an analysis of the homogeneous parts of the transport equations (33)–(34). A careful look at the complete equations reveals that even if the regularity condition (36) is satisfied there are still other potential sources of logarithmic terms. In Sect. 4 we shall discuss these further obstructions to the smoothness of null infinity.

So far, the function $\kappa$ introduced in Eq. (1) has been required to be of the form $\kappa = \rho\kappa'$, with $\kappa'$ analytic and such that $\kappa'(i) = 1$, but otherwise it has remained unspecified.$^5$ Two choices consistent with the requirement $\kappa = \rho\kappa'$ will be considered here. The first, $\kappa = \rho$, is the simplest non-trivial one. For this choice $\mathscr{I}^+$ in a neighbourhood of $I^+$ is concave, while $\mathscr{I}^-$ in a neighbourhood of $I^-$ would be convex. The choice $\kappa = \rho$ has the virtue of rendering the simplest possible analytic expressions, both for the initial data (24a)–(24f) and the solution of the transport equations (33)–(34). Unfortunately, it is hard to attach to it any geometrical significance other than its simplicity. The other choice to be considered is $\kappa = \omega$. This choice is admissible as under our assumptions $\omega = \rho + O(\rho^2)$. With this choice,
$$\Theta = \Omega\,\omega^{-1}(1 - \tau^2), \tag{37}$$
so that $\mathscr{I}^\pm$ near $I^\pm$ are described by the hypersurfaces $\tau = \pm 1$, $\rho > 0$ respectively: null infinity will be composed of two parallel planes, formally similar to the case of Minkowski, see [25].$^6$ Consequently, the system of conformal field equations (27)–(28) degenerates not only on $I^\pm$ but also on $\mathscr{I}^\pm$. Thus, the choice $\kappa = \omega$ has more geometrical and analytic relevance. As a drawback it renders more complicated analytic expressions.

For future use, we note the following result on the expansion types of the diverse unknowns appearing in the transport equations (33) and (34). Its proof comes from inspection [16].

Lemma 2. The functions $(c^1_{ab} - \rho x_{ab})^{(p)}$, $v^{(p)}$, $\phi^{(p)}$, $p = 1, 2, \ldots$, are of expansion type $p - 2$, $p - 1$, and $p$ respectively.
$^5$ It is noted that the simple choice $\kappa = 1$ would lead to the standard representation of spatial infinity as a point. The requirement of $\kappa$ being of the form $\kappa = \rho\kappa'$ ensures that spatial infinity is blown up to a cylinder. See Fig. 1.

$^6$ This similarity is in some aspects deceptive, as generically when $m \neq 0$ the generators of null infinity, although confined to the planes $\tau = \pm 1$, are bent and may rotate a spin frame that is parallelly transported along them.

4. Solving the Transport Equations Given Time Symmetric, Conformally Flat Initial Data

On grounds of simplicity and aesthetics it is natural to wonder whether the regularity condition, Eq. (36), is the only requirement one has to impose on the initial data in order to obtain solutions to the transport equations which are smooth, see for example the discussion in [18]. Before trying to prove some statement along these lines, it is of clear interest to calculate some further orders in the expansions. The rationale is firstly to verify whether the conjecture still holds, and secondly to try to find some patterns in the solutions that one could exploit in an eventual abstract proof. In order to simplify the calculations, we have chosen to restrict our attention to those time symmetric initial data sets which are conformally flat near infinity. These satisfy the regularity condition (36) in a trivial way. Thus, they represent the simplest (non-trivial) class of initial data sets one can look at. Their simplicity is somewhat deceptive, and should not be regarded as a limitation on the kind of insight that can be gained through them. A great deal about the solutions of the Einstein field equations has been learned from the analysis of this class of initial data, see e.g. [1, 20, 26].

Fig. 2. Effect of the choice of the function $\kappa$ on the representation of null infinity near spatial infinity. To the left with the choice $\kappa = \rho$; to the right the choice $\kappa = \omega$ so that null infinity corresponds to the hypersurfaces $\tau = \pm 1$

The already “large” transport equation systems (33)–(34) do not convey the scale of the computational difficulties one has to face if one is to take the expansions carried out in [19] to even higher orders. However, the calculations one has to carry out are well suited to treatment with a computer algebra system. In order to analyse with ease the solutions of the transport equations some scripts in the computer algebra system Maple V have been written. With the aim of working with a system of ordinary differential equations, the Maple V scripts take the transport equations (33) and express them in terms of the functions $T_{2n}{}^{k}{}_{l}$ according to Lemma 2. This involves expanding products of the form $T_{2n_1}{}^{k_1}{}_{l_1} \times T_{2n_2}{}^{k_2}{}_{l_2}$ using formula (16). The resulting ordinary differential equations are then solved exactly using the ordinary differential equation solver of Maple V. Full details of the computer algebra implementation will be presented elsewhere.

Because of the size of the expressions contained in the solutions, we have opted to provide a description of the qualitative features of the expansions rather than a complete list of all the terms calculated. In particular, attention will be focused on the solutions to the transport equations arising from the Bianchi identities, the functions $\phi_j^{(p)}$. It should be emphasized that this does not mean that the solutions to the transport equations for the $v$ unknowns are not important.
They are also crucial: a tiny mistake in the calculation of their solutions would destroy the whole structure of the solutions. However, as the discussion in the previous section has pointed out, the logarithmic terms that destroy the smoothness of null infinity appear first in the components of the Weyl spinor.

In order to perform our expansions a number of assumptions have been made. We list them here for quick reference.

Assumptions. It will be assumed that:
(i) The initial data set is asymptotically Euclidean and time symmetric.
(ii) In a neighbourhood $B_a(i)$ of $i$ the initial data is conformally flat. The function $W$ appearing in the conformal factor of the initial hypersurface, see Eq. (21), is a solution of the Laplace equation admitting in $B_a(i)$ a decomposition of the form
$$W = \frac{m}{2} + \sum_{i=1}^{8}\frac{1}{i!}\,W_i\,\rho^i + O(\rho^9), \tag{38}$$
where
$$W_i = \sum_{k=0}^{2i} w_{i,2i,k}\,T_{2i}{}^{k}{}_{i}, \tag{39}$$
with the coefficients $w_{i,2i,k}$, $i = 1, \ldots, 7$, $k = 0, \ldots, 2i$ complex numbers satisfying $\overline{w}_{i,2i,k} = (-1)^{i+k} w_{i,2i,2i-k}$ so that $W$ is a real valued function. This is consistent with the decomposition given in Eq. (22).
(iii) Likewise, the components of the vector unknowns $v^{(p)}$ and $\phi^{(p)}$ admit on $I$ expansions in terms of $T_j{}^{k}{}_{l}$ functions consistent with Lemma 2.
(iv) The following two choices of the function $\kappa$, see Eq. (1), will be considered: $\kappa_1 = \rho$, $\kappa_2 = \omega$.

The results of the calculations under these assumptions are now described.

4.1. The orders p = 0, . . . , 4. Firstly, calculations for the orders $p = 1$, $p = 2$ and $p = 3$ were undertaken. The results are in complete agreement with those given by Friedrich & Kánnár when reduced to the case of conformally flat initial data. The solutions at order $p = 0$ can be schematically written as,
$$\phi_0^{(0)} = 0, \tag{40a}$$
$$\phi_1^{(0)} = 0, \tag{40b}$$
$$\phi_2^{(0)} = -m, \tag{40c}$$
$$\phi_3^{(0)} = 0, \tag{40d}$$
$$\phi_4^{(0)} = 0, \tag{40e}$$
independently of the choice of $\kappa$. At order $p = 1$ one has,
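$$\phi_0^{(1)} = 0, \tag{41a}$$
$$\phi_1^{(1)} = -3X_+W_1(1 - \tau)^2, \tag{41b}$$
$$\phi_2^{(1)} = m^2\left(\tfrac{1}{2}\tau^4 - 3\tau^2\right) + 6W_1(\tau^2 - 1), \tag{41c}$$
$$\phi_3^{(1)} = 3X_-W_1(1 + \tau)^2, \tag{41d}$$
$$\phi_4^{(1)} = 0, \tag{41e}$$
when $\kappa = \rho$. The expressions for $\phi_0^{(1)}$, $\phi_1^{(1)}$, $\phi_3^{(1)}$ and $\phi_4^{(1)}$ for the choice $\kappa = \omega$ are similar. That of $\phi_2^{(1)}$ is given by,
$$\phi_2^{(1)} = m^2\left(\tfrac{1}{2}\tau^4 - 3\tau^2 - \tfrac{3}{2}\right) + 6W_1(\tau^2 - 1). \tag{41f}$$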
At order $p = 2$ one has,
$$\phi_0^{(2)} = f_1(\tau)X_+X_+W_2, \tag{42a}$$
$$\phi_1^{(2)} = f_2(\tau)mX_+W_1 + f_3(\tau)X_+W_2, \tag{42b}$$
$$\phi_2^{(2)} = f_4(\tau)m^3 + f_5(\tau)mW_1 + f_6(\tau)W_2, \tag{42c}$$
$$\phi_3^{(2)} = -f_2(-\tau)mX_-W_1 - f_3(-\tau)X_-W_2, \tag{42d}$$
$$\phi_4^{(2)} = f_1(-\tau)X_-X_-W_2, \tag{42e}$$
where $f_i(\tau)$, $i = 1, \ldots, 6$, are polynomials in $\tau$. Their explicit form is not relevant for our purposes. The polynomials are slightly different for each of the choices of $\kappa$, but are of the same order. Similarly, the components of the Weyl tensor at order $p = 3$ are of the form,
$$\phi_0^{(3)} = g_1(\tau)X_+X_+W_3 + g_2(\tau)mX_+X_+W_2 + g_3(\tau)(X_+W_1)^2, \tag{43a}$$
$$\phi_1^{(3)} = g_4(\tau)X_+W_3 + g_5(\tau)mX_+W_2 + g_6(\tau)W_1X_+W_1 + g_7(\tau)m^2X_+W_1, \tag{43b}$$
$$\phi_2^{(3)} = g_8(\tau)W_3 + g_9(\tau)mW_2 + g_{10}(\tau)(W_1)^2 + g_{11}(\tau)m^2W_1 + g_{12}(\tau)m^4 + g_{13}(\tau)b, \tag{43c}$$
$$\phi_3^{(3)} = -g_4(-\tau)X_-W_3 - g_5(-\tau)mX_-W_2 - g_6(-\tau)W_1X_-W_1 - g_7(-\tau)m^2X_-W_1, \tag{43d}$$
$$\phi_4^{(3)} = g_1(-\tau)X_-X_-W_3 + g_2(-\tau)mX_-X_-W_2 + g_3(-\tau)(X_-W_1)^2. \tag{43e}$$
Again, the functions $g_i(\tau)$, $i = 1, \ldots, 13$, are polynomials, while $b = 2w_{1,2,0}w_{1,2,2} - w_{1,2,1}^2$. The first fully new result corresponds to the order $p = 4$. Here, again, the solutions are still fully regular and polynomial:
$$\phi_0^{(4)} = h_1(\tau)X_+W_1X_+W_2 + h_2(\tau)W_1X_+X_+W_2 + h_3(\tau)m(X_+W_1)^2 + h_4(\tau)mX_+X_+W_3 + h_5(\tau)X_+X_+W_4, \tag{44a}$$
$$\phi_1^{(4)} = h_6(\tau)m^3X_+W_1 + h_7(\tau)W_2X_+W_1 + h_8(\tau)W_1X_+W_2 + h_9(\tau)mW_1X_+W_1 + h_{10}(\tau)mX_+W_3 + h_{11}(\tau)X_+W_4, \tag{44b}$$
$$\phi_2^{(4)} = h_{12}(\tau)m^5 + h_{13}(\tau)b + h_{14}(\tau)m^3W_1 + h_{15}(\tau)W_1W_2 + h_{16}(\tau)m(W_1)^2 + h_{17}(\tau)mW_3 + h_{18}(\tau)W_4, \tag{44c}$$
$$\phi_3^{(4)} = -h_6(-\tau)m^3X_-W_1 - h_7(-\tau)W_2X_-W_1 - h_8(-\tau)W_1X_-W_2 - h_9(-\tau)mW_1X_-W_1 - h_{10}(-\tau)mX_-W_3 - h_{11}(-\tau)X_-W_4, \tag{44d}$$
$$\phi_4^{(4)} = h_1(-\tau)X_-W_1X_-W_2 + h_2(-\tau)W_1X_-X_-W_2 + h_3(-\tau)m(X_-W_1)^2 + h_4(-\tau)mX_-X_-W_3 + h_5(-\tau)X_-X_-W_4. \tag{44e}$$
Again, the functions $h_i(\tau)$, $i = 1, \ldots, 18$, are polynomials depending on the choice of $\kappa$.
4.2. The first obstructions to smoothness: Orders p = 5, 6. The calculation of the solutions to the Bianchi transport equations up to order $p = 4$ has shown that all of them are polynomial, and thus smooth at $I^\pm$. Consequently, the $v$ unknowns also happen to be polynomial. The first modifications to this behaviour occur at the rather high order $p = 5$. Feeding the solutions of the transport equations up to order $p = 4$ into the $v$ transport equations (33) with $p = 5$ and solving, one finds that the components of $v^{(5)}$ are again polynomial. However, the solutions of the Bianchi transport equations are of the form,
$$\phi_0^{(5)} = C_0m^2G^{(5)}\left[(1 - \tau)^7\ln(1 - \tau) + (1 + \tau)^3(351 - 150\tau + 48\tau^2 - 10\tau^3 + \tau^4)\ln(1 + \tau)\right] + k_0(\tau), \tag{45a}$$
$$\phi_1^{(5)} = C_1m^2G^{(5)}\left[(1 - \tau)^6(2\tau + 5)\ln(1 - \tau) - (2\tau^3 - 15\tau^2 + 48\tau - 75)(1 + \tau)^4\ln(1 + \tau)\right] + k_1(\tau), \tag{45b}$$
$$\phi_2^{(5)} = C_2m^2G^{(5)}\left[(1 - \tau)^5(\tau^2 + 5\tau + 8)\ln(1 - \tau) + (1 + \tau)^5(\tau^2 - 5\tau + 8)\ln(1 + \tau)\right] + k_2(\tau), \tag{45c}$$
$$\phi_3^{(5)} = C_1m^2G^{(5)}\left[(1 - \tau)^4(2\tau^3 + 15\tau^2 + 48\tau + 75)\ln(1 - \tau) - (1 + \tau)^6(2\tau - 5)\ln(1 + \tau)\right] + k_3(\tau), \tag{45d}$$
$$\phi_4^{(5)} = C_0m^2G^{(5)}\left[(1 - \tau)^3(351 + 150\tau + 48\tau^2 + 10\tau^3 + \tau^4)\ln(1 - \tau) + (1 + \tau)^7\ln(1 + \tau)\right] + k_4(\tau), \tag{45e}$$
where $C_0$, $C_1$, $C_2$ are irrelevant non-zero numerical factors, and $k_i(\tau)$, $i = 0, \ldots, 4$, are polynomials depending on $m$, $W_1$, $W_2$, $W_3$, $W_4$, their $X_\pm$ derivatives and products of them. Most remarkably,
$$G^{(5)} = \sum_{k=0}^{4}G_k^{(5)}\,T_4{}^{k}{}_{2}, \tag{46}$$
where,
$$G_0^{(5)} = mw_{2,4,0} - 2\sqrt{6}\,w_{1,2,0}^2, \tag{47a}$$
$$G_1^{(5)} = mw_{2,4,1} - 4\sqrt{3}\,w_{1,2,0}w_{1,2,1}, \tag{47b}$$
$$G_2^{(5)} = mw_{2,4,2} - 4w_{1,2,1}^2 - 4w_{1,2,0}w_{1,2,2}, \tag{47c}$$
$$G_3^{(5)} = mw_{2,4,3} - 4\sqrt{3}\,w_{1,2,1}w_{1,2,2}, \tag{47d}$$
$$G_4^{(5)} = mw_{2,4,4} - 2\sqrt{6}\,w_{1,2,2}^2. \tag{47e}$$
Thus, the coefficients $G_k^{(5)}$ are (up to an irrelevant numerical factor) the Newman-Penrose constants of the development of the time symmetric conformally flat initial data [19]. Plugging the solutions (45a)–(45e) into the transport equations for $p = 6$ one finds that the solutions of the sectors of the form $T_4{}^{k}{}_{j}$ develop terms containing $\ln^2(1 \pm \tau)$ and $\ln(1 \pm \tau)$. Furthermore, the sectors $T_6{}^{k}{}_{j+1}$ in $\phi_j^{(p)}$ contain logarithms. More precisely, the solution will be of the form,
$$\phi_j^{(6)} = G^{(5)}\left[l_{1j}(\tau)\ln^2(1 - \tau) + l_{2j}(\tau)\ln(1 - \tau) + l_{3j}(\tau)\ln^2(1 + \tau) + l_{4j}(\tau)\ln(1 + \tau)\right] + G^{(6)}\left[l_{5j}(\tau)\ln(1 - \tau) + l_{6j}(\tau)\ln(1 + \tau)\right] + l_{7j}(\tau), \tag{48}$$
where $j = 0, \ldots, 4$, and $l_{kj}(\tau)$ are polynomials. The coefficients $G^{(6)}$ are new obstructions to the smoothness of the solutions. These will be discussed below. It is not hard to imagine that from this point onward, terms containing $\ln(1 \pm \tau)$ and higher powers of them will spread throughout the solutions to the transport equations. Instead of analysing this phenomenon, we will rather focus on the smooth solutions. Setting the Newman-Penrose constants to zero, the solution at order $p = 6$ is of the form,
$$\phi_0^{(6)} = D_0m^3G^{(6)}\left[(2 + \tau)(1 - \tau)^8\ln(1 - \tau) + (-254 + 233\tau - 128\tau^2 + 46\tau^3 - 10\tau^4 + \tau^5)(1 + \tau)^4\ln(1 + \tau)\right] + l_0(\tau), \tag{49a}$$
$$\phi_1^{(6)} = D_1m^3G^{(6)}\left[(23 + 20\tau + 5\tau^2)(1 - \tau)^7\ln(1 - \tau) - (233 - 256\tau + 138\tau^2 - 40\tau^3 + 5\tau^4)(1 + \tau)^5\ln(1 + \tau)\right] + l_1(\tau), \tag{49b}$$
$$\phi_2^{(6)} = D_2m^3G^{(6)}\left[(64 + 69\tau + 30\tau^2 + 5\tau^3)(1 - \tau)^6\ln(1 - \tau) + (-64 + 69\tau - 30\tau^2 + 5\tau^3)(1 + \tau)^6\ln(1 + \tau)\right] + l_2(\tau), \tag{49c}$$
$$\phi_3^{(6)} = D_1m^3G^{(6)}\left[(233 + 256\tau + 138\tau^2 + 40\tau^3 + 5\tau^4)(1 - \tau)^5\ln(1 - \tau) - (23 - 20\tau + 5\tau^2)(1 + \tau)^7\ln(1 + \tau)\right] + l_3(\tau), \tag{49d}$$
$$\phi_4^{(6)} = D_0m^3G^{(6)}\left[(254 + 233\tau + 128\tau^2 + 46\tau^3 + 10\tau^4 + \tau^5)(1 - \tau)^4\ln(1 - \tau) + (-2 + \tau)(1 + \tau)^8\ln(1 + \tau)\right] + l_4(\tau), \tag{49e}$$
with $l_0(\tau), \ldots, l_4(\tau)$ polynomials, and
$$G^{(6)} = \sum_{k=0}^{6}G_k^{(6)}\,T_6{}^{k}{}_{3}, \tag{50}$$
where the coefficients $G_k^{(6)}$, $k = 0, \ldots, 6$, are given in terms of initial data quantities by,
$$G_0^{(6)} = m^2w_{3,6,0} - 12\sqrt{10}\,w_{1,2,0}^3, \tag{51a}$$
$$G_1^{(6)} = m^2w_{3,6,1} - 12\sqrt{30}\,w_{1,2,0}^2w_{1,2,1}, \tag{51b}$$
$$G_2^{(6)} = m^2w_{3,6,2} - 24\sqrt{6}\,w_{1,2,0}w_{1,2,1}^2 - 12\sqrt{6}\,w_{1,2,0}^2w_{1,2,2}, \tag{51c}$$
$$G_3^{(6)} = m^2w_{3,6,3} - 72w_{1,2,0}w_{1,2,1}w_{1,2,2} - 24w_{1,2,1}^3, \tag{51d}$$
$$G_4^{(6)} = m^2w_{3,6,4} - 24\sqrt{6}\,w_{1,2,1}^2w_{1,2,2} - 12\sqrt{6}\,w_{1,2,0}w_{1,2,2}^2, \tag{51e}$$
$$G_5^{(6)} = m^2w_{3,6,5} - 12\sqrt{30}\,w_{1,2,1}w_{1,2,2}^2, \tag{51f}$$
$$G_6^{(6)} = m^2w_{3,6,6} - 12\sqrt{10}\,w_{1,2,2}^3. \tag{51g}$$
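A quick consistency check on the expressions (47a)–(47e) and (51a)–(51g): if the data $w_{n,2n,k}$ obey the reality condition $\overline{w}_{n,2n,k} = (-1)^{n+k}w_{n,2n,2n-k}$, then the coefficients $G_k^{(5)}$ and $G_k^{(6)}$ should obey the same condition with $n = 2$ and $n = 3$ respectively, as expected for the coefficients of a real function. The Python sketch below tests this numerically on randomly generated data. The function names and the random test data are illustrative assumptions; only the algebraic expressions are taken from the text.

```python
import math
import random

def real_data(n, rng):
    """Random w_{n,2n,k}, k = 0..2n, with conj(w_k) = (-1)**(n+k) * w_{2n-k}."""
    w = [0j] * (2 * n + 1)
    for k in range(n):
        z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        w[k] = z
        w[2 * n - k] = (-1) ** (n + k) * z.conjugate()
    w[n] = complex(rng.uniform(-1, 1), 0.0)  # the middle coefficient is real
    return w

def g5(m, w1, w2):
    """Coefficients G^(5)_k of Eqs. (47a)-(47e)."""
    s6, s3 = math.sqrt(6), math.sqrt(3)
    return [
        m * w2[0] - 2 * s6 * w1[0] ** 2,
        m * w2[1] - 4 * s3 * w1[0] * w1[1],
        m * w2[2] - 4 * w1[1] ** 2 - 4 * w1[0] * w1[2],
        m * w2[3] - 4 * s3 * w1[1] * w1[2],
        m * w2[4] - 2 * s6 * w1[2] ** 2,
    ]

def g6(m, w1, w3):
    """Coefficients G^(6)_k of Eqs. (51a)-(51g)."""
    s10, s30, s6 = math.sqrt(10), math.sqrt(30), math.sqrt(6)
    w10, w11, w12 = w1
    return [
        m**2 * w3[0] - 12 * s10 * w10**3,
        m**2 * w3[1] - 12 * s30 * w10**2 * w11,
        m**2 * w3[2] - 24 * s6 * w10 * w11**2 - 12 * s6 * w10**2 * w12,
        m**2 * w3[3] - 72 * w10 * w11 * w12 - 24 * w11**3,
        m**2 * w3[4] - 24 * s6 * w11**2 * w12 - 12 * s6 * w10 * w12**2,
        m**2 * w3[5] - 12 * s30 * w11 * w12**2,
        m**2 * w3[6] - 12 * s10 * w12**3,
    ]

def obeys_reality(g, n, tol=1e-9):
    """True if conj(g_k) = (-1)**(n+k) * g_{2n-k} for all k."""
    return all(abs(g[k].conjugate() - (-1) ** (n + k) * g[2 * n - k]) < tol
               for k in range(2 * n + 1))

rng = random.Random(0)
w1, w2, w3 = real_data(1, rng), real_data(2, rng), real_data(3, rng)
assert obeys_reality(g5(1.7, w1, w2), 2)
assert obeys_reality(g6(1.7, w1, w3), 3)
```

A check of this kind does not validate the numerical factors themselves, but it does catch misplaced exponents or indices in the quadratic and cubic terms.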
In analogy to the order $p = 5$ we will refer to these coefficients as the order 6 Newman-Penrose “constants”. One is naturally bound to ask whether the coefficients $G^{(6)}$ are actually associated with some conserved quantities at null infinity in the same way that the coefficients $G^{(5)}$ are. This consideration is beyond the scope of the present article, and will be analysed in detail in future work. It is pointed out, as a plausibility argument, that in the analysis of polyhomogeneous Bondi expansions carried out in, for example, [3, 23, 24], the first logarithmic terms appearing in the expansions were associated with a conserved quantity on null infinity. As can be seen from our discussion, if $G^{(5)} = 0$, then the first logarithmic terms appearing in our expansions are precisely those associated with $G^{(6)}$. However, the expansions described here are based on the conformal geodesic gauge, while those in [3, 23, 24] use the so-called Bondi gauge. Thus, one would have to look in detail into the possible appearance of logarithmic terms in the transformation connecting the two gauges.

Theorem (Main theorem, precise formulation). Necessary conditions for the development of initial data which is time symmetric and conformally flat in a neighbourhood $B_a(i)$ of (spatial) infinity to be smooth on the set $I \cup I^+ \cup I^-$ are that the Newman-Penrose constants, $G_k^{(5)}$, $k = 0, \ldots, 4$, should vanish. Furthermore, the “higher order” Newman-Penrose constants, $G_k^{(6)}$, $k = 0, \ldots, 6$, should also vanish. If only the coefficients $G_k^{(5)}$ vanish then the rescaled Weyl spinor is at most $C^2$ on a neighbourhood of either $I^+$ or $I^-$. If both $G^{(5)}$ and $G^{(6)}$ vanish then generically the Weyl tensor will be at most of class $C^3$ on the latter neighbourhoods.

From the last result one can directly extract the following important consequence:

Corollary 1. The regularity condition (36) is not a sufficient condition for the smoothness at $I \cup I^+ \cup I^-$ of the development of asymptotically Euclidean, time symmetric initial data sets.

This corollary is thus a negative answer to the conjecture raised in [18]. It is nevertheless surprising that the obstructions to the smoothness of null infinity arise at such a high order in the expansions. Acquiring a deeper understanding of why this is the case would in turn require an abstract understanding of the algebraic structure of the transport equations (33) and (34). It is not unreasonable to expect that some group theoretical properties of the whole setup play a major role here. It is worth mentioning that the logarithmic singularities contained in the solutions of the transport equations are associated with the conformal structure, and are not artifacts of a choice of gauge, see the discussion regarding this in [16]. Furthermore, because of the hyperbolic nature of the propagation equations, it is very likely that the logarithmic terms in the solutions of the transport equations will propagate along the generators of null infinity. The details of this propagation, however, still have to be worked out.

The theorem also suggests the following unexpected conjecture:

Conjecture. The developments of the (non-degenerate) Brill-Lindquist and Misner initial data sets possess non-smooth null infinities.
By non-degenerate Brill-Lindquist and Misner data it is understood that neither the masses of the individual holes nor the separation parameter are zero; otherwise the data is exactly Schwarzschildean, and thus its development has a smooth null infinity. In [8, 26] the Newman-Penrose constants of the Brill-Lindquist and Misner data sets [1, 20] have been calculated using the formula found by Friedrich & Kánnár [19]. Due to the axial symmetry, there is only one non-vanishing Newman-Penrose constant. Because of the non-vanishing Newman-Penrose constants, their asymptotic expansions will contain logarithmic terms as indicated in Sect. 4. In order to prove the conjecture one would then require an existence statement for the development. This is a very hard task and may involve considerations regarding Cosmic Censorship. That the development of these two initial data sets possesses a non-smooth null infinity may have implications for the description of their late time behaviour in terms of linear perturbations on a Schwarzschild background. It is however noted that the Corvino-Chruściel-Delay techniques provide the possibility of constructing modified Brill-Lindquist and Misner data which are exactly Schwarzschild data outside a compact set. These data sets would, in principle, admit a smooth null infinity, but again the details of this cannot be filled in with the techniques currently available.

Some remarks regarding the theorem are also in order:

Remark 1. The calculations for the order $p = 7$ are already beyond the capabilities of Maple V: the expressions involved are too large for the simplification routines of the computer algebra system, even on the Origin computer available at the Albert Einstein Institute. Nevertheless, the calculations of axially symmetric situations are still possible. These have been carried out for the orders $p = 7$ and $p = 8$.
Assuming that both $G^{(5)}$ and $G^{(6)}$ vanish, the solutions of the Bianchi transport equations have again the expected form:
$$\phi_j^{(7)} = E_jm^3G^{(7)}\left(m_{1j}(\tau)\ln(1 - \tau) + m_{2j}(\tau)\right) + m_{3j}(\tau), \tag{52}$$
where now, due to the axial symmetry, there is only one order 7 Newman-Penrose constant,
$$G^{(7)} = \left(m^3w_{4,8,4} - 192\,w_{1,2,1}^4\right)T_8{}^{4}{}_{4}. \tag{53}$$
If $G^{(7)}$ vanishes in turn,
$$\phi_j^{(8)} = F_jm^4G^{(8)}\left(n_{1j}(\tau)\ln(1 - \tau) + n_{2j}(\tau)\right) + n_{3j}(\tau), \tag{54}$$
with,
$$G^{(8)} = \left(m^4w_{5,10,5} - 1920\,w_{1,2,1}^5\right)T_{10}{}^{5}{}_{5}. \tag{55}$$
From this evidence it is not too hard to guess the following general formula for the obstructions to the smoothness of null infinity in the axially symmetric situation,
$$G^{(p)} = \left(m^{p-4}w_{p-3,2p-6,p-3} - 2^{p-4}(p - 3)!\,(w_{1,2,1})^{p-3}\right)T_{2p-6}{}^{p-3}{}_{p-3}. \tag{56}$$
The proof of such a formula is nevertheless beyond our current understanding of the transport equations. The significance of the latter expression will be discussed shortly.
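The numerical factors multiplying the powers of $w_{1,2,1}$ in the explicitly computed axial obstructions, namely 4 in (47c), 24 in (51d), 192 in (53) and 1920 in (55), follow the pattern $2^{p-4}(p-3)!$. The short Python check below compares this closed form with the four computed values. The function name is an illustrative assumption.

```python
from math import factorial

def axial_factor(p):
    """Factor multiplying (w_{1,2,1})**(p-3) in the axial obstruction at order p."""
    return 2 ** (p - 4) * factorial(p - 3)

# Values read off from Eqs. (47c), (51d), (53) and (55), with w_{1,2,0} = w_{1,2,2} = 0.
explicit = {5: 4, 6: 24, 7: 192, 8: 1920}

for p, value in explicit.items():
    assert axial_factor(p) == value
```

Agreement at four orders is of course only evidence for the pattern, in line with the remark above that a proof of the general formula is not available.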
Remark 2. In the non-axially symmetric case, it is conjectured that the obstructions to the smoothness of null infinity (generalised Newman-Penrose constants) are given in terms of the initial data by the following expressions,
$$G_k^{(p)} = m^{p-4}w_{p-3,2p-6,k} - \sum_{s_0+s_1+s_2=k}c_{p,k;s_0,s_1,s_2}\,w_{1,2,0}^{s_0}w_{1,2,1}^{s_1}w_{1,2,2}^{s_2}, \tag{57}$$
with $k = 0, 1, \ldots, p$, where the coefficients $c_{p,k;s_0,s_1,s_2}$ are some numerical constants.

4.3. Obstructions to smoothness and the Schwarzschild initial data. As mentioned before, the Newman-Penrose constants of the Schwarzschild spacetime are all zero. Thus, in order to gain some insight into the significance of the expressions (56) and (57) it is convenient to see what occurs in the case of the Schwarzschild initial data. It is not hard to calculate an expression for initial data for the Schwarzschild spacetime on the slice of time symmetry. On this slice the initial 3-metric is conformally flat. Thus, the required (harmonic) conformal factor can be calculated directly from the Green’s function for the three dimensional Laplace equation in spherical coordinates. The Green’s function is given by,
$$G(r, \theta, \varphi; r', \theta', \varphi') = \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\frac{4\pi}{2n + 1}\,Y_{n,m}(\theta, \varphi)\,Y_{n,m}(\theta', \varphi')\,\frac{r'^n}{r^{n+1}}, \tag{58}$$
for $r > r'$. The latter expression can be lifted into the frame bundle $C_{a,\kappa}$ and written in terms of the functions $T_j{}^{k}{}_{l}$ so that,
$$W = \frac{m}{2}\sum_{n=0}^{\infty}\sum_{k=0}^{2n}r'^n\,T_{2n}{}^{2n-k}{}_{n}(t')\,T_{2n}{}^{k}{}_{n}\,\rho^n, \tag{59}$$
where $(t', r')$ denote the coordinates of the singularity of the Green function on the frame bundle. Thus, the $T_{2n}{}^{2n-k}{}_{n}(t')$ are fixed complex numbers. We write,
$$w_{n,2n,k} = \frac{1}{2}\,m\,r'^n\,n!\;T_{2n}{}^{2n-k}{}_{n}(t'), \tag{60}$$
for $n = 0, 1, \ldots$ and $k = 0, \ldots, 2n$. The latter coefficients are not independent but related to each other via recurrence relations,
$$(n + 1)(2n + 1)\,w_{1,2,1} \times w_{n,2n,k} = (k + 1)(2n - k + 1)\,w_{n+1,2(n+1),k+1} + r'^2\,n(n + 1)\,k(2n - k)\,w_{n-1,2(n-1),k-1}. \tag{61}$$
These can be readily obtained from similar recurrence relations holding for the spherical harmonics. The Schwarzschild solution has an obvious axial symmetry. If one orients the coordinate system so as to make this symmetry of the initial data explicit (the singularity of the Green’s function is placed on the $z$ axis) one ends up with a much simplified expression,
$$W = \sum_{n=0}^{\infty}\left(\frac{2}{m}\right)^{n-1}(w_{1,2,1})^n\,T_{2n}{}^{n}{}_{n}\,\rho^n, \tag{62}$$
so that the only non-vanishing $w_{n,2n,k}$ coefficients are given by,
$$w_{n,2n,n} = \left(\frac{2}{m}\right)^{n-1}(w_{1,2,1})^n\,n!. \tag{63}$$
One obtains precisely this expression if one requires that the hierarchy of axial Newman-Penrose constants, formula (56), vanishes at every order $p$. It is however important to point out that this is not a proof but again a plausibility argument, as formula (56) has only been verified up to order $p = 8$. A much more involved calculation shows that the expressions one obtains for the coefficients $w_{2,4,0}, \ldots, w_{2,4,4}$ and $w_{3,6,0}, \ldots, w_{3,6,6}$ by setting $G^{(5)}$ and $G^{(6)}$ to zero are precisely those one has for the Schwarzschild initial data. This involves the use of the recurrence relation (61). Now, the function $W$ is a solution to $\Delta W = 0$, and thus analytic. This in turn implies that the function $W$ one deduces from requiring the coefficients (56) to vanish at all orders $p$ is exactly Schwarzschildean. This evidence leads to the following,

Conjecture (Precise formulation). For every $k > 0$ there exists a $p = p(k)$ such that the time evolution of an asymptotically Euclidean, time symmetric, conformally smooth initial data set which is conformally flat near $i$ admits a conformal extension to null infinity of class $C^k$ near spacelike infinity, if and only if the initial data set is Schwarzschildean to order $p(k)$ near $i$.

The latter is the successor, in the context of time symmetric, conformally flat initial data, of the conjecture by Friedrich [18] to which reference was made in the introduction. It is, however, worth mentioning that if proved true, the conjecture stated here would constitute a rigidity result associated with the notion of asymptotic simplicity.
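The statement that the axial Schwarzschild coefficients (63) annihilate the hierarchy of axial obstructions can be checked in exact rational arithmetic. The Python sketch below inserts $w_{n,2n,n} = (2/m)^{n-1}(w_{1,2,1})^n n!$ with $n = p - 3$ into the bracket $m^{p-4}w_{p-3,2p-6,p-3} - 2^{p-4}(p-3)!\,(w_{1,2,1})^{p-3}$, using the numerical factor that reproduces (53) and (55). The sample values of $m$ and $w_{1,2,1}$ and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def schwarzschild_w(n, m, w1):
    """Axial Schwarzschild coefficient w_{n,2n,n} of Eq. (63)."""
    return (Fraction(2) / m) ** (n - 1) * w1 ** n * factorial(n)

def axial_obstruction(p, m, w_top, w1):
    """Bracket multiplying T_{2p-6}^{p-3}_{p-3} in the axial obstruction at order p."""
    return m ** (p - 4) * w_top - 2 ** (p - 4) * factorial(p - 3) * w1 ** (p - 3)

# Arbitrary non-zero rational sample values (hypothetical, for illustration only).
m, w1 = Fraction(3, 2), Fraction(-5, 7)

for p in range(5, 13):
    n = p - 3
    assert axial_obstruction(p, m, schwarzschild_w(n, m, w1), w1) == 0
```

Conversely, requiring the bracket to vanish for every $p \geq 5$ fixes $w_{n,2n,n}$ for all $n \geq 2$ in terms of $m$ and $w_{1,2,1}$, which is the content of the plausibility argument above.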
5. Conclusions and Extensions

The results presented in the main theorem, together with the considerations leading to the conjecture put forward in Sect. 4, suggest that no gravitational radiation should be present around spatial infinity if one is to have a smooth null infinity. In other words, the notion of smooth null infinity seems to be incompatible with the presence of radiation around $i^0$. Whether this behaviour of spacetime in the neighbourhood of spatial infinity has implications either for the demeanour of the sources of the gravitational field in the infinite past or for the nature of incoming radiation travelling from past null infinity is a natural, but hard to answer, question.

A natural extension of the calculations described here would be to consider what happens with more general (i.e. not conformally flat) time symmetric initial data. If the time symmetric initial data is not conformally flat, then the Newman-Penrose constants are given in terms of the initial data by
\[
\begin{aligned}
\tilde{G}^{(5)}_0 &= m w_{2,4,0} - 2\sqrt{6}\, w_{1,2,0}^2 - \frac{1}{254\sqrt{6}} R^{(2)}_0, \\
\tilde{G}^{(5)}_1 &= m w_{2,4,1} - 4\sqrt{3}\, w_{1,2,0} w_{1,2,1} - \frac{1}{254\sqrt{6}} R^{(2)}_1, \\
\tilde{G}^{(5)}_2 &= m w_{2,4,2} - 4 w_{1,2,1}^2 - 4 w_{1,2,0} w_{1,2,2} - \frac{1}{254\sqrt{6}} R^{(2)}_2, \\
\tilde{G}^{(5)}_3 &= m w_{2,4,3} - 4\sqrt{3}\, w_{1,2,1} w_{1,2,2} - \frac{1}{254\sqrt{6}} R^{(2)}_3, \\
\tilde{G}^{(5)}_4 &= m w_{2,4,4} - 2\sqrt{6}\, w_{1,2,2}^2 - \frac{1}{254\sqrt{6}} R^{(2)}_4,
\end{aligned}
\]
where $R^{(2)}_k$ are coefficients appearing in the expansion of the Ricci scalar r of the initial 3-metric around spatial infinity. In view of the present results one would expect the quantities $\tilde{G}^{(5)}_k$ to appear as obstructions to the smoothness of null infinity. Similarly, one would expect to obtain generalisations $\tilde{G}^{(6)}_k$ of the higher order Newman-Penrose constants $G^{(6)}_k$ by adding suitable expressions containing the Ricci scalar. Ultimately, one would like to prove that there is an infinite hierarchy of such quantities acting as obstructions to the smoothness of $\mathscr{I}$. The relation of these quantities with static initial data should then be analysed, at least for the first orders. It could well be the case that a time symmetric initial data set will yield a development with smooth null infinity only if it is static in a neighbourhood of spatial infinity. In relation to this, it is noted that the asymptotically Euclidean static initial data satisfy the regularity condition (36); see [15].

With regard to a proof of the conjecture presented in this article, it is noted that it would require a much deeper understanding of the properties of the transport equations at spatial infinity than the one currently available. In particular, one would like to be able to discuss their algebraic properties in a much more abstract way, without having to resort to “explicit expressions” as was done here. As mentioned before, group theoretical properties of the setup should play a crucial role here.

Acknowledgements. I thank H. Friedrich who suggested the problem and helped with discussions and clarifications all along the long winding road. I also thank S. Dain, M. Mars and R. Vera from whom I have benefited through several discussions and comments on early versions of the manuscript. The late stages of this research have been funded by a Lise Meitner grant (M-690-N09) of the FWF, Austria.
The computer algebra calculations in Maple V have been carried out at the facilities of the Max Planck Institut für Gravitationsphysik, Albert Einstein Institut in Golm, Germany.
References
1. Brill, D.R., Lindquist, R.W.: Interaction energy in geometrostatics. Phys. Rev. 131, 471 (1963)
2. Chruściel, P.T., Delay, E.: Existence of non-trivial, vacuum, asymptotically simple spacetimes. Class. Quantum Grav. 19, L71 (2002)
3. Chruściel, P.T., MacCallum, M.A.H., Singleton, D.B.: Gravitational waves in general relativity XIV. Bondi expansions and the “polyhomogeneity” of $\mathscr{I}$. Phil. Trans. Roy. Soc. Lond. A 350, 113 (1995)
4. Corvino, J.: Scalar curvature deformations and a gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214, 137 (2000)
5. Cutler, C., Wald, R.M.: Existence of radiating Einstein-Maxwell solutions which are $C^\infty$ on all of $\mathscr{I}^+$ and $\mathscr{I}^-$. Class. Quantum Grav. 6, 453 (1989)
6. Dain, S.: Initial data for stationary spacetimes near spacelike infinity. Class. Quantum Grav. 18, 4329 (2001)
7. Dain, S., Friedrich, H.: Asymptotically flat initial data with prescribed regularity at infinity. Commun. Math. Phys. 222, 569 (2001)
8. Dain, S., Valiente Kroon, J.A.: Conserved quantities in a black hole collision. Class. Quantum Grav. 19, 811 (2002)
9. Friedrich, H.: Spin-2 fields on Minkowski space near space-like and null infinity. Class. Quantum Grav. 20, 101 (2003)
10. Friedrich, H.: On the existence of analytic null asymptotically flat solutions of Einstein’s vacuum field equations. Proc. Roy. Soc. Lond. A 381, 361 (1981)
11. Friedrich, H.: On the regular and the asymptotic characteristic initial value problem for Einstein’s vacuum field equations. Proc. Roy. Soc. Lond. A 375, 169 (1981)
12. Friedrich, H.: Cauchy problems for the conformal vacuum field equations in General Relativity. Commun. Math. Phys. 91, 445 (1983)
13. Friedrich, H.: On purely radiative space-times. Commun. Math. Phys. 103, 35 (1986)
14. Friedrich, H.: On the existence of n-geodesically complete or future complete solutions of Einstein’s field equations with smooth asymptotic structure. Commun. Math. Phys. 107, 587 (1986)
15. Friedrich, H.: On static and radiative space-times. Commun. Math. Phys. 119, 51 (1988)
16. Friedrich, H.: Gravitational fields near space-like and null infinity. J. Geom. Phys. 24, 83 (1998)
17. Friedrich, H.: Einstein’s equation and conformal structure. In: The Geometric Universe. Science, Geometry and the work of Roger Penrose, S.A. Huggett, L.J. Mason, K.P. Tod, S.T. Tsou, N.M.J. Woodhouse, (eds.), Oxford: Oxford University Press, 1999, p. 81
18. Friedrich, H.: Conformal Einstein evolution. In: The conformal structure of spacetime: Geometry, Analysis, Numerics, J. Frauendiener and H. Friedrich, (eds.), Lecture Notes in Physics, Berlin-Heidelberg-New York: Springer, 2002, p. 1
19. Friedrich, H., Kánnár, J.: Bondi-type systems near space-like infinity and the calculation of the NP-constants. J. Math. Phys. 41, 2195 (2000)
20. Misner, C.W.: The method of images in geometrodynamics. Ann. Phys. 24, 102 (1963)
21. Penrose, R.: Asymptotic properties of fields and space-times. Phys. Rev. Lett. 10, 66 (1963)
22. Penrose, R.: Zero rest-mass fields including gravitation: Asymptotic behaviour. Proc. Roy. Soc. Lond. A 284, 159 (1965)
23. Valiente Kroon, J.A.: Conserved Quantities for polyhomogeneous spacetimes. Class. Quantum Grav. 15, 2479 (1998)
24. Valiente Kroon, J.A.: Logarithmic Newman-Penrose Constants for arbitrary polyhomogeneous spacetimes. Class. Quantum Grav. 16, 1653 (1999)
25. Valiente Kroon, J.A.: Polyhomogeneous expansions close to null and spatial infinity. In: The Conformal Structure of Spacetimes: Geometry, Numerics, Analysis, J. Frauendiener and H. Friedrich, (eds.), Lecture Notes in Physics, Berlin-Heidelberg-New York: Springer, 2002, p. 135
26. Valiente Kroon, J.A.: Early radiative properties of the developments of time symmetric conformally flat initial data. Class. Quantum Grav. 20, L53 (2003)

Communicated by H. Nicolai
Commun. Math. Phys. 244, 157–185 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0989-z
Communications in
Mathematical Physics
“Extrinsic” and “Intrinsic” Data in Quantum Measurements: Asymptotic Convex Decomposition of Positive Operator Valued Measures
Andreas Winter
Department of Computer Science, University of Bristol, Merchant Venturers Building, Woodland Road, Bristol BS8 1UB, UK. E-mail: [email protected]
Received: 12 September 2001 / Accepted: 21 July 2003
Published online: 25 November 2003 – © Springer-Verlag 2003
Abstract: We study the problem of separating the data produced by a given quantum measurement (on states from a memoryless source which is unknown except for its average state), described by a positive operator valued measure (POVM), into a “meaningful” (intrinsic) and a “not meaningful” (extrinsic) part. We are able to give an asymptotically tight separation of this form, with the “intrinsic” data quantified by the Holevo mutual information of a certain state ensemble associated to the POVM and the source, in a model that can be viewed as the asymptotic version of the convex decomposition of POVMs into extremal ones. This result is applied to a similar separation theorem for quantum instruments and quantum operations, in their Kraus form. Finally we comment on links to related subjects: we stress the difference between data and information (in particular by pointing out that information typically is strictly less than data), derive the Holevo bound from our main result, and look at its classical case: we show that this includes the solution to the problem of extrinsic/intrinsic data separation with a known source, then compare with the well-known notion of sufficient statistics. The result on decomposition of quantum operations is used to exhibit a new aspect of the concept of entropy exchange of an open dynamics. An appendix collects several estimates for mixed state fidelity and trace norm distance that seem to be new, in particular a construction of canonical purification of mixed states that turns out to be valuable for analyzing their fidelity.
I. The Problem

Consider a quantum system, represented by a Hilbert space $\mathcal{H}$ (which we assume to be of dimension $d < \infty$ in the sequel), and a measurement on this system, described by a positive operator valued measure (POVM) $a = (a_1, \dots, a_m)$, $a_j \in \mathcal{B}(\mathcal{H})$ such that $a_j \ge 0$ and $\sum_j a_j = \mathbb{1}$.
Following [22] and [29] we shall be concerned with the question “How much information is obtained by a?”, beginning with a clarification of what this question should mean at all. Imagine that a family of states (represented by density operators) $\rho_i$ on $\mathcal{H}$ is given, let us say with a priori probabilities $p_i$, such that the density operator of this source of states is $\rho = \sum_i p_i \rho_i$. Then the “information” in question could mean the information in j about i, and one way to quantify it would be given by Shannon’s mutual information [25] $I(i \wedge j)$. Note that this is in general less than the amount of raw data, which is operationally quantified by the entropy of the distribution of the j: $H(\lambda)$, with $\lambda_j = \mathrm{Tr}(\rho a_j)$, due to Shannon’s source coding theorem [25]. This choice however is rather arbitrary: asking about the identity of the state from a list. Why not allow a different list, or ask for some property of the state? Also, mutual information is a measure of correct identification; but what if we need only “almost correct” identification, as in quantum statistical detection theory [11]? It seems hence that specifying the information in measurement results, or even only the amount, in an operationally satisfying way, is problematic, and one reason might be the complementarity of quantum mechanics: qualitatively, accessing some observable property optimally entails rather poor performance for others. Nevertheless, it is quite obvious intuitively that in almost any POVM there is “quantum noise”, i.e. redundancy put into the j by the very quantum mechanical probability rule, most simply due to nonorthogonality of the operators $a_j$, for example in an overcomplete system (see e.g. [15]).
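As a small numerical illustration of the raw data rate $H(\lambda)$ with $\lambda_j = \mathrm{Tr}(\rho a_j)$, the following sketch evaluates it for a qubit. The trine POVM, the state `rho` and all helper names are assumptions chosen for illustration only; they are not taken from the text.

```python
import math

def outer(v):
    """Rank-one operator |v><v| for a real vector v."""
    return [[x * y for y in v] for x in v]

def scale(c, a):
    """Multiply a matrix (nested lists) by a scalar."""
    return [[c * x for x in row] for row in a]

def trace_prod(a, b):
    """Tr(ab) for square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def shannon_entropy(p):
    """Shannon entropy in bits (logs to basis 2, as in the paper)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Hypothetical example: trine POVM a_j = (2/3)|e_j><e_j| on a qubit.
trine = [scale(2 / 3, outer([math.cos(math.pi * t / 3), math.sin(math.pi * t / 3)]))
         for t in range(3)]

# Hypothetical average state rho of the source.
rho = [[0.75, 0.0], [0.0, 0.25]]

lam = [trace_prod(rho, a) for a in trine]   # outcome distribution lambda_j
data_rate = shannon_entropy(lam)            # H(lambda), bits per measurement
print(lam, data_rate)
```

For this particular choice the outcome distribution is (1/2, 1/4, 1/4), so the raw data rate is 1.5 bits per measurement, more than the single bit a qubit can carry.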
Our approach will thus be from the opposite end: instead of attempting the impossible, defining what “useful” means in any circumstances, we adopt a very simple criterion of uselessness: statistical independence from the measured states, because independent randomness can be generated from outside without accessing the quantum system. On the other hand we do not permit a distortion of the measurement itself, so that we are forced to consider a simulation of the original measurement by means of, first, a random choice ν of a measurement $a^{(\nu)}$ from a list and, second, computation of a result from the outcome of this measurement and the random choice, such that the statistical distribution of these results is indistinguishable from that of the original measurement, on any prepared state. Because we can absorb the computation of the results into the labelling of the $a^{(\nu)}$, this means that we aim at finding such POVMs $a^{(\nu)}$, whose elements are labelled by the same j as a, and probabilities $x_\nu$, such that
\[ a = \sum_\nu x_\nu a^{(\nu)}, \quad \text{i.e.} \quad \forall j \quad a_j = \sum_\nu x_\nu a^{(\nu)}_j. \tag{1} \]
(The operators must be the same because otherwise there would be states that induce distinguishable outcome distributions. Below we will introduce an element of approximation into this scheme.)

Why should we want to do such a decomposition, interesting though the structure exhibited (convex set of POVMs) might be mathematically? Observe that each $a^{(\nu)}$ has its distribution of outcomes, with the probabilities $\lambda^{(\nu)}_j = \mathrm{Tr}(\rho a^{(\nu)}_j)$ of j conditional on ν. Shannon’s source coding theorem [25] quantifies the amount of data in such a source as the (Shannon) entropy
\[ H(\lambda^{(\nu)}) = -\sum_j \lambda^{(\nu)}_j \log \lambda^{(\nu)}_j, \]
by compression (we note that in this paper all logs and exps are to basis 2). Hence, on average, one needs
\[ H(j|\nu) := \sum_\nu x_\nu H(\lambda^{(\nu)}) \]
bits to faithfully compress the data (j), given ν as side-information. This motivates the study of the function
\[ \delta(\rho, a) := \min\Big\{ H(j|\nu) : a = \sum_\nu x_\nu a^{(\nu)} \Big\}, \tag{2} \]
which is the minimum data rate (in Shannon’s sense) for exact reconstruction of the data.

Example 1. Look at a qubit system, $\mathbb{C}^2$, with basis $\{|0\rangle, |1\rangle\}$: there let us consider the five “Chrysler” states (in analogy to the “Mercedes” trine states)
\[ |e_t\rangle = \cos\frac{\pi t}{5}\,|0\rangle + \sin\frac{\pi t}{5}\,|1\rangle, \quad \text{for } t = 0, \dots, 4. \]
The collection $a = \big(\tfrac{2}{5}|e_t\rangle\langle e_t|\big)_{t=0,\dots,4}$ is a POVM, and we can determine its decompositions into extremal ones: these latter are given by putting weights on the $|e_t\rangle\langle e_t|$, and it is straightforward that for an extremal POVM at most 3 can be nonzero (as the “Chrysler” states form a pentagon on the Bloch sphere equator). In fact, every extremal must be of the form
\[ \big( \alpha|e_t\rangle\langle e_t|,\ \beta|e_{t+2}\rangle\langle e_{t+2}|,\ \beta|e_{t+3}\rangle\langle e_{t+3}| \big), \quad t = 0, \dots, 4, \]
indices understood modulo 5. From the normalization $\alpha + 2\beta = 2$ and the requirement that the operators sum to $\mathbb{1}$ one can determine the weights to be
\[ \alpha = 1 - \cot^2\frac{2\pi}{5} \approx 0.8944, \qquad \beta = \frac{1}{2}\sin^{-2}\frac{2\pi}{5} \approx 0.5528. \]
For simplicity now look at the maximally mixed state $\rho = \frac{1}{2}\mathbb{1}$, for which it is unimportant which decomposition into these extremal POVMs is chosen, as all contributions ν will give the same Shannon entropy:
\[ \delta(\rho, a) = H(j|\nu) = H\Big(\frac{\alpha}{2}, \frac{\beta}{2}, \frac{\beta}{2}\Big) = H\Big(1-\beta, \frac{\beta}{2}, \frac{\beta}{2}\Big) = H(1-\beta, \beta) + \beta \approx 1.5447. \]
In contrast, the main Theorem 2 below will achieve a rate of $H(\rho) = 1$, asymptotically.

The computation of $\delta(\rho, a)$ is an interesting problem in its own right (in particular the question whether anything can be gained on δ by considering multiple copies, i.e. the additivity problem); however, we take a different approach, bearing in mind that the operational content of Shannon’s theorem involves block coding, i.e. a large number l of independent copies of the simple system described above, and an arbitrarily small yet nonzero error probability. Thus we are really decomposing the POVM
\[ a^{\otimes l} = \big( a_{j_1} \otimes \cdots \otimes a_{j_l} \big)_{j^l \in \{1,\dots,m\}^l}, \]
where we have introduced the notation $j^l = j_1 \dots j_l$ for a string of symbols, used henceforth. And the error introduced through block compression entails that instead of Eq. (1) we will only have
\[ a^{\otimes l} \approx A = \sum_\nu x_\nu A^{(\nu)}, \tag{3} \]
where the ≈ sign is made precise to mean “average approximation of outcome statistics”: assuming an ensemble $\{\sigma_k, q_k\}$ with $\sum_k q_k \sigma_k = \rho^{\otimes l}$, there is the joint distribution of input k and output $j^l$ when applying $a^{\otimes l}$,
\[ \gamma(k, j^l) = q_k \mathrm{Tr}(\sigma_k a_{j^l}), \tag{4} \]
and likewise for A:
\[ \Gamma(k, j^l) = q_k \mathrm{Tr}(\sigma_k A_{j^l}). \tag{5} \]
Then we require that, independent of the particular ensemble,
\[ \frac{1}{2}\|\gamma - \Gamma\|_1 = \frac{1}{2}\sum_{k, j^l} |\gamma(k, j^l) - \Gamma(k, j^l)| \le \epsilon. \tag{CP} \]
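The numbers quoted in Example 1 can be checked directly. The sketch below is only an illustration: the helper names are hypothetical, and the uniform weights $x_\nu = 1/5$ are one admissible choice of convex decomposition rather than anything prescribed by the text. It builds the five “Chrysler” projectors, forms the extremal POVMs with weights $\alpha = 1 - \cot^2(2\pi/5)$ and $\beta = \frac{1}{2}\sin^{-2}(2\pi/5)$, verifies Eq. (1), and evaluates $H(\alpha/2, \beta/2, \beta/2)$.

```python
import math

def proj(theta):
    """Projector |e><e| onto e = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

ZERO = [[0.0, 0.0], [0.0, 0.0]]
ID = [[1.0, 0.0], [0.0, 1.0]]

e = [proj(math.pi * t / 5) for t in range(5)]   # |e_t><e_t|
a = [scale(2 / 5, p) for p in e]                # the "Chrysler" POVM

alpha = 1 - 1 / math.tan(2 * math.pi / 5) ** 2
beta = 0.5 / math.sin(2 * math.pi / 5) ** 2

def extremal(t):
    """Extremal POVM with weight alpha on e_t and beta on e_{t+2}, e_{t+3}."""
    w = [0.0] * 5
    w[t], w[(t + 2) % 5], w[(t + 3) % 5] = alpha, beta, beta
    return [scale(w[j], e[j]) for j in range(5)]

def total(ops):
    s = ZERO
    for op in ops:
        s = add(s, op)
    return s

parts = [extremal(t) for t in range(5)]
is_povm = all(close(total(p), ID) for p in parts)

# Uniform mixture x_nu = 1/5 of the five extremal POVMs, cf. Eq. (1).
mix = [total([scale(1 / 5, p[j]) for p in parts]) for j in range(5)]
reproduces_a = all(close(mix[j], a[j]) for j in range(5))

# For rho = identity/2 the outcome probabilities are (alpha/2, beta/2, beta/2).
delta = entropy([alpha / 2, beta / 2, beta / 2])
print(alpha, beta, delta)
```

The entropy comes out close to 1.5447 bits, which is noticeably above the asymptotic rate $H(\rho) = 1$ mentioned in the example.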
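(It is not difficult to see that Eq. (1) raised to the l-th tensor power, together with Shannon compression of the outcomes of $a^{(\nu_1)} \otimes \cdots \otimes a^{(\nu_l)}$ for the probable $\nu_1 \dots \nu_l$, yields exactly that.) Indeed we can, using the abbreviation $\omega = \rho^{\otimes l}$, rewrite Eq. (4) as
\[ \gamma(k, j^l) = \mathrm{Tr}\big( \omega^{-1/2} q_k \sigma_k \omega^{-1/2}\, \sqrt{\omega}\, a_{j^l} \sqrt{\omega} \big), \]
observing that the $S_k = \omega^{-1/2} q_k \sigma_k \omega^{-1/2}$ form a POVM on $\mathcal{H}^{\otimes l}$ (this fact was observed before, and used in [17] to classify all ensembles with a given average state). Similarly
\[ \Gamma(k, j^l) = \mathrm{Tr}\big( S_k \sqrt{\omega}\, A_{j^l} \sqrt{\omega} \big), \]
and we can rewrite and estimate the left-hand side of (CP) as follows:
\[ \frac{1}{2}\|\gamma - \Gamma\|_1 = \frac{1}{2}\sum_{j^l}\sum_k \big| \mathrm{Tr}\big( S_k \sqrt{\omega}\,(a_{j^l} - A_{j^l})\sqrt{\omega} \big) \big| \le \frac{1}{2}\sum_{j^l} \big\| \sqrt{\omega}\,(a_{j^l} - A_{j^l})\sqrt{\omega} \big\|_1, \]
with the trace norm $\|\cdot\|_1$. So (CP) is in fact implied by
\[ \frac{1}{2}\sum_{j^l} \Big\| \sqrt{\rho}^{\,\otimes l}\,(A_{j^l} - a_{j^l})\,\sqrt{\rho}^{\,\otimes l} \Big\|_1 \le \epsilon. \tag{CM} \]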
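The observation that the operators $S_k = \omega^{-1/2} q_k \sigma_k \omega^{-1/2}$ form a POVM, and that they reproduce the joint distribution of Eq. (4), can be tested on a single qubit (block length l = 1, so that ω = ρ). The ensemble, the POVM and the helper names below are hypothetical choices made for this sketch; the matrix square root uses a closed form valid for real symmetric positive definite 2×2 matrices.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return a[0][0] + a[1][1]

def sqrtm2(a):
    """Square root of a real symmetric positive definite 2x2 matrix.

    Closed form: sqrt(A) = (A + sqrt(det A) I) / sqrt(tr A + 2 sqrt(det A)).
    """
    s = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    t = math.sqrt(trace(a) + 2 * s)
    return [[(a[0][0] + s) / t, a[0][1] / t], [a[1][0] / t, (a[1][1] + s) / t]]

def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

# Hypothetical ensemble {sigma_k, q_k}: |0><0| and |+><+| with equal weights.
sigma = [[[1.0, 0.0], [0.0, 0.0]], [[0.5, 0.5], [0.5, 0.5]]]
q = [0.5, 0.5]
omega = add(scale(q[0], sigma[0]), scale(q[1], sigma[1]))   # average state

# Hypothetical two-outcome POVM a.
a = [[[0.5, 0.25], [0.25, 0.5]], [[0.5, -0.25], [-0.25, 0.5]]]

root = sqrtm2(omega)
inv_root = inv2(root)

# S_k = omega^{-1/2} q_k sigma_k omega^{-1/2}
S = [matmul(matmul(inv_root, scale(q[k], sigma[k])), inv_root) for k in range(2)]
sum_S = add(S[0], S[1])   # should equal the identity

# gamma(k, j) computed directly and via S_k, as in the rewriting of Eq. (4).
mismatch = max(
    abs(q[k] * trace(matmul(sigma[k], a[j]))
        - trace(matmul(S[k], matmul(matmul(root, a[j]), root))))
    for k in range(2) for j in range(2)
)
print(sum_S, mismatch)
```

Both checks succeed for any ensemble with full-rank average state, since they only rely on the cyclicity of the trace and on $\sum_k q_k \sigma_k = \omega$.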
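Notice that the condition can be phrased in a particularly nice way introducing the quantum operations
\[ \varphi_*^{\otimes l} : \sigma \longmapsto \sum_{j^l} \mathrm{Tr}(\sigma a_{j^l})\, |j^l\rangle\langle j^l|, \tag{6} \]
\[ \Phi_* : \sigma \longmapsto \sum_{j^l} \mathrm{Tr}(\sigma A_{j^l})\, |j^l\rangle\langle j^l|. \tag{7} \]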
[Figure 1: boxes SOURCE (size K), $a^{\otimes l}$ and DATA RECORD (size $m^l$), connected by arrows labelled k and $j^l$.]

Fig. 1. The source represents a number of possible states encountered by the POVM, but there is no way of knowing which is present (apart from the a priori distribution). The data produced by the measurement is then stored in a record. The rates of these processes are represented by the sizes of the different boxes and width of the data flow arrows: originally the rates of the source and of the measurement outcomes are both large
Namely, for a purification π of ρ, (CM) is easily seen to be equivalent to
\[ \frac{1}{2}\Big\| (\mathrm{id} \otimes \Phi_*)(\pi^{\otimes l}) - (\mathrm{id} \otimes \varphi_*^{\otimes l})(\pi^{\otimes l}) \Big\|_1 \le \epsilon. \tag{8} \]

The organization of the paper is as follows: In Sect. II we will present our main Theorem 2 and its proof, which is much more satisfying than results in previous work [22, 29], which can now be regarded as precursors: they are shown to follow easily from Theorem 2 in Sect. III. Sect. IV is concerned with the asymptotic optimality of our main theorem, a strong converse result, Theorem 8. After this, in Sect. V we apply our result to a kind of asymptotic normal form of completely positive trace preserving maps (operations as well as instruments), and present an extensive discussion in Sect. VI: we restate our observation from [29] that one ought to distinguish obtained data from information, give a new, conceptually simple proof of the Holevo bound, remark on the classical case of the main theorem (which includes the problem of separating extrinsic and intrinsic data under a known source ensemble), comment on the related concept of sufficient statistics, and discuss the bearing of our results on the concept of entropy exchange of an open dynamics of a system. We close with a challenging open problem. An appendix features several not widely known facts about the mixed state fidelity, in particular introducing canonical purifications of mixed states; a second appendix collects properties of typical sequences and typical subspaces, used in the main text.

II. Separating Extrinsic and Intrinsic Data

We want to represent (up to a small deviation as specified by the (CM) condition) $a^{\otimes l}$ as a convex combination of POVMs $A^{(\nu)}$, with positive weights $x_\nu$, ν = 1, . . . , N, each being defined on the set $[m]^l$ and having a small number M of sequences on which it is supported (i.e. where $A^{(\nu)}_{j^l} \neq 0$): this is an even stronger requirement than the entropy condition we had considered in the introduction.
Performing A amounts to choosing a ν (with probability $x_\nu$), and performing $A^{(\nu)}$, which itself can generate at most M different outcomes: the ν-part of the produced data is obviously independent of the incoming signal, while the measurement outcome (conditional on the ν chosen) contains the useful information. Our central result is:

Theorem 2. For the state ρ and the POVM a define a canonical ensemble $\{\hat\rho_j, \lambda_j\}$, with
\[ \lambda_j = \mathrm{Tr}(\rho a_j), \qquad \hat\rho_j = \frac{1}{\lambda_j}\sqrt{\rho}\, a_j \sqrt{\rho}. \]
There exist POVMs $A^{(\nu)}$ on $[m]^l$, ν = 1, . . . , N, each supported on a set of cardinality at most M, where
\[ M = \exp\big( l I(\lambda; \hat\rho) + O(\sqrt{l}) \big), \qquad N = \exp\big( l\,(H(\lambda) - I(\lambda; \hat\rho)) + O(\sqrt{l}) \big), \]
such that for $A = \frac{1}{N}\sum_\nu A^{(\nu)}$ condition (CM) is satisfied.

[Figure 2: boxes SOURCE (size K), A, DATA RECORD (sizes M and $m^l$) and RANDOMNESS (size N), with arrows labelled k, $j^l$ and ν.]

Fig. 2. A nice way of picturing the content of Theorem 2 is in the form of an elaborate bottleneck between source and outcomes: it is supplied from outside with the extrinsic data ν, and conditional on this and the incoming k produces the intrinsic data $j^l$. Only the intrinsic data are correlated to the signal k, while the extrinsic data (though evidently an indispensable part of the whole data) is independent of it. To put it pointedly: while it is difficult and possibly ambiguous to speak of “useful data”, one can clearly identify data of no import in all respects: the unrelated randomness ν. This is put into focus by Theorem 2, and our concept of usefulness is just the remainder after extracting as much uselessness as possible

The characteristic constant in the exponent is
\[ I(\lambda; \hat\rho) = H(\rho) - \sum_j \lambda_j H(\hat\rho_j), \]
the entropy defect of the ensemble (Lebedev and Levitin [20]), or the quantum mutual information between a sender producing the letter j with probability $\lambda_j$ and a receiver getting the letter state $\hat\rho_j$ (see [16, 24]). It is the difference between the von Neumann entropy $H(\rho) = -\mathrm{Tr}\,\rho \log \rho$ of the ensemble and its conditional entropy $H(\hat\rho|\lambda) = \sum_j \lambda_j H(\hat\rho_j)$. Observe that not only ρ can be recovered from this ensemble (as its average), but also the POVM a: $a_j = \rho^{-1/2} \lambda_j \hat\rho_j \rho^{-1/2}$. This construction is known as the “square root measurement” [14], or “pretty good measurement” [10]. We shall give the proof of Theorem 2 in a minute, after a few preparations. A central part of the argument is the following auxiliary result from [2] that we state separately:

Lemma 3 (Ahlswede, Winter [2], Thm. A.19). Let $X_1, \dots, X_M$ be independent identically distributed (i.i.d.) random variables with values in the algebra $\mathcal{L}(\mathcal{K})$ of linear operators on $\mathcal{K}$, which are bounded between 0 and $\mathbb{1}$. Assume that the average $\mathbb{E}X_\mu = \sigma \ge s\mathbb{1}$. Then for every 0 < η < 1/2,
\[ \Pr\Big\{ \frac{1}{M}\sum_{\mu=1}^{M} X_\mu \notin [(1 \pm \eta)\sigma] \Big\} \le 2 \dim\mathcal{K}\, \exp\Big( -M \frac{\eta^2 s}{2 \ln 2} \Big), \]
where $[(1 \pm \eta)\sigma] = [(1-\eta)\sigma; (1+\eta)\sigma]$ is an interval in the operator order: $[A; B] = \{X \in \mathcal{B}(\mathcal{K}) : A \le X \le B\}$.

We shall use the concepts of typical and conditionally typical subspaces in the form of [28], which we collect in Appendix B.

Proof of Theorem 2. Define the following operators: for $j^l \in \mathcal{T}^l_{\lambda,\delta}$ let
\[ \xi_{j^l} = \Pi^l_{\rho,\delta}\, \Pi^l_{\hat\rho,\delta}(j^l)\, \hat\rho_{j^l}\, \Pi^l_{\hat\rho,\delta}(j^l)\, \Pi^l_{\rho,\delta}. \]
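We choose $\delta = m\sqrt{2d/\epsilon}$, so that
\[ S := \lambda^{\otimes l}(\mathcal{T}^l_{\lambda,\delta}) \ge 1 - \epsilon, \tag{9} \]
\[ \mathrm{Tr}\,\xi_{j^l} \ge 1 - \epsilon, \tag{10} \]
which is true by Chebyshev’s inequality and Eqs. (B.2) and (B.3), specifying ε later. Notice that in this way $\mathrm{Tr}\,\omega' \ge 1 - 2\epsilon$ for
\[ \omega' = \sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} \lambda_{j^l}\, \xi_{j^l}. \]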
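By Eq. (B.8) we have
\[ \Pi^l_{\rho,\delta}\, \omega\, \Pi^l_{\rho,\delta} \ge \alpha\, \Pi^l_{\rho,\delta}, \]
with $\alpha = \exp(-lH(\rho) - O(\sqrt{l}))$. Define now Π to be the projector onto the subspace spanned by the eigenvectors of $\omega'$ with eigenvalue $\ge \epsilon\alpha$. By construction we find $\mathrm{Tr}\,\Omega \ge 1 - 3\epsilon$ for $\Omega = S^{-1}\Pi\omega'\Pi$.

Now let $\xi'_{j^l} = \Pi\xi_{j^l}\Pi$ and define i.i.d. random variables $J^{(\nu)}_\mu \in \mathcal{T}^l_{\lambda,\delta}$, ν = 1, . . . , N, µ = 1, . . . , M by
\[ \Pr\{J^{(\nu)}_\mu = j^l\} = \frac{\lambda_{j^l}}{S} =: L_{j^l}. \]
That is, we consider N independent sets of M independent choices each, from $\mathcal{T}^l_{\lambda,\delta}$. Observe that $\Omega = \mathbb{E}\,\xi'_{J^{(\nu)}_\mu}$, the expected value of the random operators $\xi'_{J^{(\nu)}_\mu}$. We shall show that with high probability the following conditions hold:
\[ \frac{1}{M}\sum_{\mu=1}^{M} \xi'_{J^{(\nu)}_\mu} \in [(1 \pm \epsilon)\Omega] \tag{I$_\nu$} \]
for all ν, and
\[ \frac{1}{NM}\sum_{\nu,\mu=1}^{N,M} \delta_{J^{(\nu)}_\mu} \in [(1 \pm \epsilon)L]. \tag{II} \]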
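This is most easily seen with the help of Lemma 3: according to it
\[ \Pr\{\neg \mathrm{I}_\nu\} \le 2\,\mathrm{Tr}\,\Pi\, \exp\Big( -M \frac{\epsilon^3 \alpha}{2\beta \ln 2} \Big), \qquad \Pr\{\neg \mathrm{II}\} \le 2\,|\mathcal{T}^l_{\lambda,\delta}|\, \exp\Big( -NM \frac{\epsilon^2 \gamma}{2 \ln 2} \Big), \]
with
\[ \gamma = \min\{\lambda_{j^l} : j^l \in \mathcal{T}^l_{\lambda,\delta}\} \ge \exp\big( -lH(\lambda) - Km\delta\sqrt{l} \big), \]
compare Eq. (B.8). Choosing M and N according to the theorem’s statement will force the sum of these probabilities to be less than 1, i.e. with positive probability all the events (I$_\nu$) and (II) happen.

Let us assume we fix now values for the $J^{(\nu)}_\mu$ such that all equations (I$_\nu$) and (II) are satisfied. Then we may define operators
\[ A^{(\nu)}_{j^l} = \frac{S}{1+\epsilon}\,\frac{1}{M} \sum_{\mu : J^{(\nu)}_\mu = j^l} \omega^{-1/2}\, \xi'_{J^{(\nu)}_\mu}\, \omega^{-1/2} = \frac{S}{1+\epsilon}\,\frac{|\{\mu : J^{(\nu)}_\mu = j^l\}|}{M}\, \omega^{-1/2}\, \xi'_{j^l}\, \omega^{-1/2}. \]
We check that for each ν these form a sub-POVM (i.e., a collection of positive operators with sum upper bounded by $\mathbb{1}$): using (I$_\nu$) and the definitions of Ω and $\omega'$ we find
\[ \sum_{j^l} \sqrt{\omega}\, A^{(\nu)}_{j^l} \sqrt{\omega} = \frac{S}{1+\epsilon}\,\frac{1}{M}\sum_{\mu=1}^{M} \xi'_{J^{(\nu)}_\mu} \le S\,\Omega = \Pi\omega'\Pi \le \omega' \le \omega. \]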
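Finally, we check that condition (CM) holds: it is sufficient to do this for the sub-POVM constructed, because then we can distribute the remaining operator weight to fill up to $\mathbb{1}$ arbitrarily. We calculate directly from the definitions:
\[
\begin{aligned}
\frac{1}{2}\sum_{j^l} \big\| \sqrt{\omega}\,(a_{j^l} - A_{j^l})\sqrt{\omega} \big\|_1
&= \frac{1}{2}\sum_{j^l} \Big\| \lambda_{j^l}\hat\rho_{j^l} - \frac{S\,|\{\nu\mu : J^{(\nu)}_\mu = j^l\}|}{(1+\epsilon)NM}\, \xi'_{j^l} \Big\|_1 \\
&\le \frac{1}{2}(1 - S) + \frac{1}{2}\sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} L_{j^l} \big\| \hat\rho_{j^l} - \xi'_{j^l} \big\|_1 + \frac{1}{2}\Big\| L - \frac{1}{NM}\sum_{\nu\mu} \delta_{J^{(\nu)}_\mu} \Big\|_1 \\
&\le \frac{\epsilon}{2} + \frac{1}{2}\sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} L_{j^l} \big\| \hat\rho_{j^l} - \xi'_{j^l} \big\|_1 + \frac{\epsilon}{2} \\
&\le \epsilon + \sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} L_{j^l} \Big( \frac{1}{2}\big\| \hat\rho_{j^l} - \xi_{j^l} \big\|_1 + \frac{1}{2}\big\| \xi_{j^l} - \xi'_{j^l} \big\|_1 \Big).
\end{aligned}
\tag{11}
\]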
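By the definition of $\xi_{j^l}$, using Eq. (10) and Lemma 4 below, we can bound the first of the two terms in brackets by $\epsilon + \sqrt{2\epsilon}$. It remains to estimate the second: consider
\[ \Omega' = \sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} L_{j^l}\, \xi_{j^l}, \]
and recall that $\xi'_{j^l} = \Pi\xi_{j^l}\Pi$, hence $\Omega = \Pi\Omega'\Pi$. By construction we have
\[ \sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} L_{j^l}\, \mathrm{Tr}\,\xi'_{j^l} \ge 1 - 3\epsilon, \]
thus, using Lemma 4 with each of the $\xi_{j^l}$ and employing concavity of the square root function, we end up with
\[ \sum_{j^l \in \mathcal{T}^l_{\lambda,\delta}} L_{j^l}\, \frac{1}{2}\big\| \xi_{j^l} - \xi'_{j^l} \big\|_1 \le \sqrt{6\epsilon}, \]
which allows us to estimate (11) by $2\epsilon + \sqrt{2\epsilon} + \sqrt{6\epsilon}$.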
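To make the quantities entering Theorem 2 concrete, the following sketch computes the canonical ensemble $\{\hat\rho_j, \lambda_j\}$, the rate $I(\lambda; \hat\rho)$ and the square root reconstruction $a_j = \rho^{-1/2}\lambda_j\hat\rho_j\rho^{-1/2}$ for one qubit. It is a minimal illustration under stated assumptions: the state is taken diagonal so that its square root is trivial, and the POVM, the state and all function names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return a[0][0] + a[1][1]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def eig2(a):
    """Eigenvalues of a real symmetric 2x2 matrix."""
    half_tr = trace(a) / 2
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    r = math.sqrt(max(half_tr * half_tr - det, 0.0))
    return [half_tr + r, half_tr - r]

def vn_entropy(a):
    """Von Neumann entropy in bits."""
    return -sum(x * math.log2(x) for x in eig2(a) if x > 1e-15)

def shannon(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

# Hypothetical diagonal state rho and two-outcome POVM a (not rank one).
p0 = 0.75
rho = [[p0, 0.0], [0.0, 1 - p0]]
sqrt_rho = [[math.sqrt(p0), 0.0], [0.0, math.sqrt(1 - p0)]]
inv_sqrt_rho = [[1 / math.sqrt(p0), 0.0], [0.0, 1 / math.sqrt(1 - p0)]]
a = [[[0.5, 0.25], [0.25, 0.5]], [[0.5, -0.25], [-0.25, 0.5]]]

# Canonical ensemble: lambda_j = Tr(rho a_j), rho_hat_j = sqrt(rho) a_j sqrt(rho) / lambda_j
lam = [trace(matmul(rho, aj)) for aj in a]
rho_hat = [scale(1 / lam[j], matmul(matmul(sqrt_rho, a[j]), sqrt_rho)) for j in range(2)]

average = add(scale(lam[0], rho_hat[0]), scale(lam[1], rho_hat[1]))   # should be rho
holevo = vn_entropy(rho) - sum(lam[j] * vn_entropy(rho_hat[j]) for j in range(2))

# Square root ("pretty good") reconstruction of the POVM from the ensemble.
recovered = [matmul(matmul(inv_sqrt_rho, scale(lam[j], rho_hat[j])), inv_sqrt_rho)
             for j in range(2)]

print(lam, holevo, shannon(lam) - holevo)
```

In the language of the theorem, `holevo` is the exponent of M (intrinsic data per copy) and `shannon(lam) - holevo` is the exponent of N (extrinsic data per copy). For a POVM consisting of rank-one operators the states $\hat\rho_j$ are pure and the first exponent reduces to $H(\rho)$.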
Here is the lemma that we needed in the proof: it says that a POVM element that is likely to respond to a state acts “gently” on it in the sense of little disturbance.

Lemma 4 (Lemma V.9 of [28]). For a state ρ and an operator $0 \le X \le \mathbb{1}$, if $\mathrm{Tr}(\rho X) \ge 1 - \lambda$, then
\[ \big\| \rho - \sqrt{X}\rho\sqrt{X} \big\|_1 \le \sqrt{8\lambda}. \]
The same is true if ρ is only a subnormalized density operator.

III. Previous Approaches

The question addressed in the present paper of quantifying the “amount of information obtained by a quantum measurement” has been posed before, in the works [22] and [29], with mathematical modellings different from ours, though there is an evolution leading from the first to the present: In [22] the POVM a was assumed to maximize a certain Bayesian gain (there called “fidelity”)
\[ F(a) = \sum_{ij} p_i \mathrm{Tr}(\rho_i a_j) F_{ij}, \]
to achieve the optimal (i.e. maximal) value $F_{\mathrm{opt}}$. On blocks of length l the gain (or fidelity) function was extended by defining $F_{i^l j^l} = \frac{1}{l}\sum_{k=1}^{l} F_{i_k j_k}$. This definition has the easily checked property that the gain on blocks of length l,
\[ F(a^{\otimes l}) = \sum_{i^l j^l} p_{i^l} \mathrm{Tr}(\rho_{i^l} a_{j^l}) F_{i^l j^l}, \tag{12} \]
equals the single letter expression F(a). Note that in this way the maximum Bayesian gain is still $F_{\mathrm{opt}}$ (which can be seen from Eq. (13) below). Then the following theorem was shown:
Theorem 5 (Massar, Popescu [22]). For ε > 0 and l large enough there exists a POVM A with fidelity $F(A) \ge F_{\mathrm{opt}} - \epsilon$ and
\[ M \le \exp\big( l\,(H(\rho) + \epsilon) \big) \]
many outcomes among the $j^l$.
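This result was interpreted as saying that about any property of the ensemble states, as encoded in the Bayesian gain matrix $F_{ij}$, one can learn at most one bit per qubit. In [29] this was extended and clarified as follows: observe that for any POVM $A = (A_{j^l_\mu})_{\mu=1,\dots,M}$ one has
\[
\begin{aligned}
F(A) &= \sum_{i^l}\sum_{\mu} p_{i^l} \mathrm{Tr}(\rho_{i^l} A_{j^l_\mu})\, \frac{1}{l}\sum_{k=1}^{l} F_{i_k j_{\mu k}} \\
&= \frac{1}{l}\sum_{k=1}^{l} \sum_{i}\sum_{j} p_i \mathrm{Tr}\big(\rho_i (A|k)_j\big) F_{ij},
\end{aligned}
\tag{13}
\]
where (with $[l] = \{1, \dots, l\}$)
\[
\begin{aligned}
(A|k)_j &= \mathrm{Tr}_{\neq k}\Big[ \big( \rho^{\otimes [l]\setminus k} \otimes \mathbb{1}_k \big) \sum_{\mu :\, j_{\mu k} = j} A_{j^l_\mu} \Big] \\
&= \rho^{-1}\, \mathrm{Tr}_{\neq k}\Big[ \rho^{\otimes l} \sum_{\mu :\, j_{\mu k} = j} A_{j^l_\mu} \Big] \\
&= \sqrt{\rho}^{\,-1}\, \mathrm{Tr}_{\neq k}\Big[ \sqrt{\rho}^{\,\otimes l} \sum_{\mu :\, j_{\mu k} = j} A_{j^l_\mu}\, \sqrt{\rho}^{\,\otimes l} \Big] \sqrt{\rho}^{\,-1}.
\end{aligned}
\tag{14}
\]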
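For each k, the collection $((A|k)_j)_{j=1,\dots,m}$ obviously is a POVM on $\mathcal{H}$. We may assume (as we shall do in the sequel) that the $|F_{ij}|$ are bounded by 1: then the fidelity condition of Theorem 5, reading
\[ |F(A) - F(a)| \le \epsilon, \tag{C0} \]
is implied by
\[ \forall k \quad \sum_{ij} \big| p_i \mathrm{Tr}\big(\rho_i (A|k)_j\big) - p_i \mathrm{Tr}(\rho_i a_j) \big| \le \epsilon. \tag{C1} \]
This is itself implied by
\[ \forall k\ \forall i \quad \sum_{j} \big| \mathrm{Tr}\big(\rho_i (A|k)_j\big) - \mathrm{Tr}(\rho_i a_j) \big| \le \epsilon, \tag{C2} \]
which in turn follows from
\[ \forall k \quad \sum_{j} \big\| (A|k)_j - a_j \big\| \le \epsilon, \tag{C3} \]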
with the operator norm $\|\cdot\|$. It was then proved:

Theorem 6 (Winter, Massar [29]). Given ε > 0, there exists a POVM $A = (A_{j^l_\mu})_{\mu=1,\dots,M}$ with
\[ M \le \exp\big( l I(\lambda; \hat\rho) + C\sqrt{l} \big) \]
(where C is a constant depending only on ε, d and m), and such that (C3) is satisfied.
[Figure 3: boxes SOURCE (size K), A and DATA REC. (size M), connected by arrows labelled k and µ.]

Fig. 3. In [22] and [29] the original POVM is replaced by an “equivalent” one (as made precise in Theorems 5 and 6) with many fewer outcomes. So, POVM and data record need much less rate of processing and storage, respectively. Of course, compared to Theorem 2 we lose many potential measurement results in constructing the new POVM
This theorem is in an asymptotic sense the best possible (such an optimality was missing in [22]): Theorem 7 (Winter, Massar [29]). Let 0 < ≤ (λ0 /2)2 , with λ0 = minj λj . Then for any POVM A = (Ajµl )µ=1,... ,M such that (C3) holds, one has 2 3 M ≥ exp l I (λ; ρ) ˆ + 2 log 2 . λ0 λ0 d Here we want to show that the Theorems 5 and 6 may be obtained as corollaries of Theorem 2. (ν) Proof of Theorem 5. Choose xν and A according to Theorem 2, such that condition (CM) is satisfied for A = ν xν A(ν) , with some > 0 (which implies that also (CP) is satisfied with the same ). Then, assuming without loss of generality that |Fij | ≤ 1, we get immediately out of Eq. (12) that
\[ |F(A) - F(a^{\otimes l})| \le \epsilon. \]
Since we assume that a maximizes F we conclude, using linearity of F in the POVM:
\[ F_{\mathrm{opt}} - \epsilon = F(a^{\otimes l}) - \epsilon \le F(A) = \sum_\nu x_\nu F(A^{(\nu)}). \]
This finally means that for at least one ν,
\[ F(A^{(\nu)}) \ge F_{\mathrm{opt}} - \epsilon, \]
which is what we wanted to prove: recall that $A^{(\nu)}$ has $M \le \exp\big( l I(\lambda; \hat\rho) + O(\sqrt{l}) \big) \le \exp\big( l H(\rho) + O(\sqrt{l}) \big)$ many outcomes.

Note that the latter estimate is met with equality if and only if a is maximally refined (i.e., consists of rank-1 operators only), so regardless of a, H(ρ) is the rate of intrinsic data of any probing of the ensemble. Note further that our derivation does not depend on the particular structure of the block-fidelity: obviously we can as well conclude for any ensemble $\{\sigma_k, q_k\}$ with average ω and any fidelity matrix $F_{k j^l}$ that
\[ \big| F(A) - F(a^{\otimes l}) \big| \le \sum_{k j^l} q_k \big| \mathrm{Tr}(\sigma_k A_{j^l}) - \mathrm{Tr}(\sigma_k a_{j^l}) \big|\, |F_{k j^l}| \le \epsilon F_{\max}, \]
with $F_{\max} := \max_{k j^l} |F_{k j^l}|$. If now $F_{\max} \le O\big(F(a^{\otimes l})\big)$ for l → ∞ then we get (for sufficiently large l) $F(A) \ge (1 - \epsilon)F(a^{\otimes l})$.

Of course, as explained in Eqs. (C0)–(C3), Theorem 5 is really a corollary of Theorem 6. So, we continue to prove the latter:

Proof of Theorem 6. Assume that a collection of POVMs $A^{(\nu)}$, ν = 1, . . . , N like in Theorem 2 is chosen, with probabilities $x_\nu$, such that $A' = \sum_\nu x_\nu A^{(\nu)}$ satisfies (CM). Define i.i.d. random variables $T_1, \dots, T_Q$, each with $\Pr\{T_q = \nu\} = x_\nu$. We want to study the random POVMs $A^{(T_q)}$, and especially their mean
\[ A = \frac{1}{Q}\sum_{q=1}^{Q} A^{(T_q)}. \]
Observe that $\mathbb{E}A = \mathbb{E}A^{(T_q)} = A'$. Recall the definition of marginal POVMs. Obviously, by linearity of this definition, we have
\[ (A|k) = \frac{1}{Q}\sum_{q=1}^{Q} (A^{(T_q)}|k) \]
and
\[ \mathbb{E}(A|k) = \mathbb{E}(A^{(T_q)}|k) = (A'|k). \]
From condition (CM) and the monotonicity of the trace norm under partial trace we get now, for every k,
\[ \frac{1}{2}\sum_{j} \big\| \sqrt{\rho}\,\big( (A'|k)_j - a_j \big)\sqrt{\rho} \big\|_1 \le \epsilon. \tag{15} \]
Denoting the smallest nonzero eigenvalue of any of the $\sqrt{\rho}\,a_j\sqrt{\rho}$ by u, and choosing ε small enough, this assures that $\sqrt{\rho}\,(A'|k)_j\sqrt{\rho}$ restricted to the support of $\sqrt{\rho}\,a_j\sqrt{\rho}$ is lower bounded by u/2. Then we can apply Lemma 3 and obtain
\[ \Pr\Big\{ \sqrt{\rho}\,(A|k)_j\sqrt{\rho} \notin (1 \pm \epsilon)\sqrt{\rho}\,(A'|k)_j\sqrt{\rho} \ \text{on}\ \mathrm{supp}\,\hat\rho_j \Big\} \le 2d\, \exp\Big( -Q \frac{\epsilon^2 u}{4 \ln 2} \Big). \]
Thus we can estimate the sum of these probabilities over all k = 1, . . . , l and j = 1, . . . , m to less than 1 if
\[ Q \ge 1 + \frac{4 \ln 2}{\epsilon^2 u} \log(2dlm). \]
This implies that there exist actual values of the $T_q$ such that for all k,
\[ \frac{1}{2}\sum_{j} \Big\| \sqrt{\rho}\,(A|k)_j\sqrt{\rho}\,\big|_{\mathrm{supp}\,\hat\rho_j} - \sqrt{\rho}\,a_j\sqrt{\rho} \Big\|_1 \le 2\epsilon, \tag{16} \]
where we observed that the $\sqrt{\rho}\,(A'|k)_j\sqrt{\rho}$ all have trace at most 1, and have used Eq. (15). Hence we get (with $\Lambda_{kj} = \mathrm{Tr}(\rho (A|k)_j)$ and $\Lambda_{kj}\hat{P}_{kj} = \sqrt{\rho}\,(A|k)_j\sqrt{\rho}$)
\[ \sum_{j} \Lambda_{kj}\, \mathrm{Tr}\big( \hat{P}_{kj}\big|_{\mathrm{supp}\,\hat\rho_j} \big) \ge 1 - 2\epsilon, \]
and using Lemma 4, this gives
\[ \frac{1}{2}\sum_{j} \Big\| \sqrt{\rho}\,(A|k)_j\sqrt{\rho}\,\big|_{\mathrm{supp}\,\hat\rho_j} - \sqrt{\rho}\,(A|k)_j\sqrt{\rho} \Big\|_1 \le 2\sqrt{\epsilon}. \tag{17} \]
Now (16) and (17) yield
\[ \frac{1}{2}\sum_{j} \big\| \sqrt{\rho}\,\big( (A|k)_j - a_j \big)\sqrt{\rho} \big\|_1 \le 2\epsilon + 2\sqrt{\epsilon}. \]
Denoting the minimal eigenvalue of ρ by r (which we assumed to be positive) this readily implies √ 1 (A|k)j − aj ≤ 2 + 2 , 2 rd j
and we are done, since A has only MQ many possible outcomes.
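Lemma 4, which is invoked in the proofs of Theorems 2 and 6, can be probed numerically on a qubit. The sketch below is an illustration under stated assumptions: the state `rho`, the operator `X` and the helper names are hypothetical, and the 2×2 closed forms for the square root and the eigenvalues only apply to real symmetric matrices of that size. It compares the disturbance $\|\rho - \sqrt{X}\rho\sqrt{X}\|_1$ with the bound $\sqrt{8\lambda}$, where $\lambda = 1 - \mathrm{Tr}(\rho X)$.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return a[0][0] + a[1][1]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def eig2(a):
    """Eigenvalues of a real symmetric 2x2 matrix."""
    half_tr = trace(a) / 2
    r = math.sqrt(max(half_tr * half_tr - det(a), 0.0))
    return [half_tr + r, half_tr - r]

def sqrtm2(a):
    """Square root of a real symmetric positive definite 2x2 matrix."""
    s = math.sqrt(det(a))
    t = math.sqrt(trace(a) + 2 * s)
    return [[(a[0][0] + s) / t, a[0][1] / t], [a[1][0] / t, (a[1][1] + s) / t]]

def trace_norm(a):
    """Trace norm of a real symmetric 2x2 matrix: sum of |eigenvalues|."""
    return sum(abs(x) for x in eig2(a))

# Hypothetical state rho and operator X with 0 <= X <= identity.
rho = [[0.9, 0.1], [0.1, 0.1]]
X = [[0.95, 0.05], [0.05, 0.6]]

lam = 1 - trace(matmul(rho, X))          # Tr(rho X) = 1 - lam
root_X = sqrtm2(X)
disturbed = matmul(matmul(root_X, rho), root_X)

disturbance = trace_norm(sub(rho, disturbed))
bound = math.sqrt(8 * lam)
print(lam, disturbance, bound)
```

The smaller λ is, i.e. the more likely X is to respond to ρ, the smaller the guaranteed disturbance; this is the sense in which such a measurement acts “gently”.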
IV. Strong Converse

In this section we prove the asymptotic optimality of the separation of the measurement from Theorem 2. To be precise, it is

Theorem 8. Whenever there are POVMs $A^{(\nu)}$ on $[m]^l$, ν = 1, . . . , N, each supported on at most M elements, and probability weights $x_\nu > 0$, such that $A = \sum_\nu x_\nu A^{(\nu)}$ satisfies condition (CM), for some ε < 1, then
\[ M \ge \exp\big( l I(\lambda; \hat\rho) - O(\sqrt{l}) \big), \qquad MN \ge \exp\big( l H(\lambda) - O(\sqrt{l}) \big), \]
where the constants depend only on ε.

Proof. Let us begin with the second inequality: by construction the set $R \subset [m]^l$ of possible outcomes of A has cardinality at most MN. Denoting by Λ the distribution of outcomes according to A, i.e. $\Lambda_{j^l} = \mathrm{Tr}(\rho^{\otimes l} A_{j^l})$, from (CM) we get immediately
\[ \frac{1}{2}\big\| \lambda^{\otimes l} - \Lambda \big\|_1 \le \epsilon, \]
which in turn implies
\[ \lambda^{\otimes l}(R) \ge \Lambda(R) - \epsilon = 1 - \epsilon. \tag{18} \]
By a well known trick [30] the lower bound now follows: we consider $R' = R \cap \mathcal{T}^l_{\lambda,\delta}$, with $\delta = \sqrt{2m/(1-\epsilon)}$, whence we have, using Chebyshev’s inequality,
\[ \lambda^{\otimes l}(R') \ge \frac{1-\epsilon}{2}. \]
Using the fact (compare Eq. (B.8))
\[ \forall j^l \in \mathcal{T}^l_{\lambda,\delta} \quad \lambda_{j^l} \le \exp\big( -lH(\lambda) + Km\delta\sqrt{l} \big), \]
we conclude
\[ MN \ge |R| \ge |R'| \ge \frac{1-\epsilon}{2}\exp\big( lH(\lambda) - Km\delta\sqrt{l} \big). \]
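Now for the first inequality: introduce the ensembles $\{\widehat{P}^{(\nu)}_{j^l}, \Lambda^{(\nu)}_{j^l}\}_{j^l}$ with
\[
\Lambda^{(\nu)}_{j^l}\,\widehat{P}^{(\nu)}_{j^l} = \sqrt{\omega}\,A^{(\nu)}_{j^l}\sqrt{\omega},
\]
all of which have average $\omega$. Then we define the (subnormalized) density operators
\[
\widetilde{P}^{(\nu)}_{j^l} = \Pi^l_{\hat\rho,\delta}(j^l)\,\widehat{P}^{(\nu)}_{j^l}\,\Pi^l_{\hat\rho,\delta}(j^l), \qquad
\tilde\rho_{j^l} = \Pi^l_{\hat\rho,\delta}(j^l)\,\hat\rho_{j^l}\,\Pi^l_{\hat\rho,\delta}(j^l),
\]
for $j^l \in T^l_{\lambda,\delta}$, with $\delta = \frac{4mn}{1-\epsilon}$. Then by Chebyshev inequality and Eq. (B.2)
\[
\frac{1}{2}\Big\| \omega - \sum_{j^l \in T^l_{\lambda,\delta}} \lambda_{j^l}\,\tilde\rho_{j^l} \Big\|_1 \le \frac{1-\epsilon}{2},
\]
while from (CM) we get
\[
\frac{1}{2}\Big\| \sum_{j^l \in T^l_{\lambda,\delta}} \lambda_{j^l}\,\tilde\rho_{j^l} - \sum_\nu x_\nu \sum_{j^l \in T^l_{\lambda,\delta}} \Lambda^{(\nu)}_{j^l}\,\widetilde{P}^{(\nu)}_{j^l} \Big\|_1 \le \frac{1+\epsilon}{2} =: \epsilon'. \tag{19}
\]
These immediately imply
\[
\sum_\nu x_\nu \sum_{j^l \in T^l_{\lambda,\delta}} \Lambda^{(\nu)}_{j^l}\operatorname{Tr}\widetilde{P}^{(\nu)}_{j^l} \ge 1 - \epsilon',
\]
so there exists at least one $\nu$ such that
\[
\sum_{j^l \in T^l_{\lambda,\delta}} \Lambda^{(\nu)}_{j^l}\operatorname{Tr}\widetilde{P}^{(\nu)}_{j^l} \ge 1 - \epsilon'.
\]
Now consider the (subnormalized) density operators
\[
\Theta^{(\nu)}_{j^l} = \sqrt{\widehat{P}^{(\nu)}_{j^l}}\;\Pi^l_{\hat\rho,\delta}(j^l)\,\sqrt{\widehat{P}^{(\nu)}_{j^l}}, \tag{20}
\]
which evidently satisfy
\[
\theta := \sum_{j^l \in T^l_{\lambda,\delta}} \Lambda^{(\nu)}_{j^l}\,\Theta^{(\nu)}_{j^l} \le \sum_{j^l} \Lambda^{(\nu)}_{j^l}\,\widehat{P}^{(\nu)}_{j^l} = \omega.
\]
Denoting with $\Pi$ the projection onto the support of $\theta$ and inserting $\operatorname{Tr}\Theta^{(\nu)}_{j^l} = \operatorname{Tr}\widetilde{P}^{(\nu)}_{j^l}$, we arrive at
\[
\operatorname{Tr}(\omega\,\Pi) \ge 1 - \epsilon',
\]
from where we conclude
\[
\operatorname{rank}\Pi = \operatorname{Tr}\Pi \ge \exp\big(l\,H(\rho) - O(\sqrt{l})\big).
\]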
This follows by a standard reasoning (which we take from [28]): for $F = \Pi^l_{\rho,\delta}\,\Pi\,\Pi^l_{\rho,\delta}$ (with $\Pi$ the projection onto the support of $\theta$), choosing $\delta$ large enough, we get
\[
\operatorname{Tr}\big(\Pi\,\Pi^l_{\rho,\delta}\,\omega\,\Pi^l_{\rho,\delta}\big) = \operatorname{Tr}(\omega F) \ge \frac{1-\epsilon'}{2}.
\]
By Eq. (B.8) the inequality follows. On the other hand each of the operators $\Theta^{(\nu)}_{j^l}$ has rank at most $\exp\big(l\,H(\hat\rho|P) + O(\sqrt{l})\big)$, and we deduce our claim.

We may relax a bit the condition of the theorem regarding the parameter $M$: if we allow the different POVMs $A^{(\nu)}$ to have different numbers $M_\nu$ of possible outcomes, then we can prove the slightly stronger estimate
\[
M := \sum_\nu x_\nu M_\nu \ge \exp\big(l\,I(\lambda;\hat\rho) - O(\sqrt{l})\big)
\]
(while the second inequality obviously holds for $\sum_\nu M_\nu$). To see this go back to Eq. (20) and observe that by a Markov inequality argument
\[
\Pr\nolimits_x\Big\{ \nu : \sum_{j^l \in T^l_{\lambda,\delta}} \Lambda^{(\nu)}_{j^l}\operatorname{Tr}\widetilde{P}^{(\nu)}_{j^l} \ge 1 - \sqrt{\epsilon'} \Big\} \ge 1 - \sqrt{\epsilon'},
\]
whence the claim directly follows.

Remark 9. While in the above proof we assumed the property (CM) for $\epsilon < 1$, we conjecture that (CP) for all sources with average $\omega$, with $\epsilon < 1$, is sufficient to arrive at its conclusion. Let us inspect this possibility along the lines of the proof: crucial were the estimates (18) and (19), the former being an immediate consequence of (CP), so we would have to show this only for the latter. However, this demonstration has escaped us so far.

Finally, a comment on why this converse is strong: optimality of Theorem 2 is proved already by our observation in the previous section that it implies Theorem 6, and the lower bound of Theorem 7. However, closer inspection of this lower bound reveals that it coincides with the upper bound only in the limit $\epsilon \to 0$. For positive $\epsilon$ it leaves room for a tradeoff between compression and error (not untypical for the type of error concept we had used). This is known in information theory as a weak converse [30]. The strong converse in contrast shows optimality of the upper bound in the asymptotic limit $l \to \infty$, with any $\epsilon$ bounded away from 1.
V. Asymptotic Decomposition of Instruments and Operations

An interesting generalization of our main theorem arises from the point of view that POVMs are just a special case of general open dynamics: the most general form of evolution is a completely positive, trace preserving linear map $\varphi_*$ from states on $\mathcal{H}$ to states on $\mathcal{K}$. Such a map can (non-uniquely) be represented in the Kraus form
\[
\varphi_* : \pi \longmapsto \sum_{j=1}^m V_j\,\pi\,V_j^*, \tag{21}
\]
where $V_j : \mathcal{H} \to \mathcal{K}$ are $\mathbb{C}$-linear and $\sum_j V_j^* V_j = \mathbb{1}$. The representation can be made unique by considering it as a partial measurement, and including the outcome $j$: extend the output system to $\mathcal{K} \otimes \mathcal{J}$, and modify the map $\varphi_*$ to
\[
\tilde\varphi_* : \pi \longmapsto \sum_{j=1}^m V_j\,\pi\,V_j^* \otimes |j\rangle\langle j|.
\]
(Technically this will amount to a change of the Kraus operators, too, but we will not need the details here.) This is the notion of an instrument (Davies and Lewis [7]). One can see that it is representable in Kraus form, too, so we will in the sequel always look at a particular Kraus representation.

In analogy to the question about POVMs of this work we would like to approximate $\varphi_*^{\otimes l}$ by the average of some $\Phi_*^{(\nu)}$, $\nu = 1,\ldots,N$, each of which should have a Kraus representation with a small number of contributing operators. As is well known this number is the dimension of the ancillary system (environment) sufficient to emulate the effect of the operation by a unitary interaction and subsequent partial trace. Its logarithm is an upper bound on the “information leakage” from the system to the environment. Note that (apart from looking at approximation) we are considering here the problem of convex decomposition of completely positive maps, like we did before for POVMs. Of course, every completely positive map has a decomposition into such extremal ones, with possibly fewer terms in the Kraus representation. For this one can employ a theorem of Choi [6], saying that $\varphi_*$ from Eq. (21) is extremal if and only if the family of operators $V_j^* V_k$ is linearly independent (in particular, then $m \le d$).

We show now how to solve this problem as a consequence of Theorem 2, with an additional reasoning mainly directed to quantum state fidelities. Formally, we are looking for a family of maps
\[
\Phi_*^{(\nu)} : \mathcal{B}(\mathcal{H}^{\otimes l}) \longrightarrow \mathcal{B}(\mathcal{K}^{\otimes l}), \qquad
\sigma \longmapsto \sum_{\mu=1}^{M} W_\mu^{(\nu)}\,\sigma\,W_\mu^{(\nu)*} \tag{22}
\]
and probabilities $x_\nu$ such that for $\Phi_* = \sum_\nu x_\nu \Phi_*^{(\nu)}$ and any ensemble $\{\sigma_k, q_k\}$ with average $\omega = \rho^{\otimes l}$ the following condition holds:
\[
\sum_k q_k\,\frac{1}{2}\big\| \varphi_*^{\otimes l}(\sigma_k) - \Phi_*(\sigma_k) \big\|_1 \le \epsilon. \tag{CO}
\]
In fact, there is an appealing way to state them all together, and strengthen the content at the same time: for a purification $\pi$ of $\rho$ on an extended system $\mathcal{H} \otimes \mathcal{H}$ we ask for
\[
\frac{1}{2}\big\| (\varphi_* \otimes \mathrm{id})^{\otimes l}(\pi^{\otimes l}) - (\Phi_* \otimes \mathrm{id}^{\otimes l})(\pi^{\otimes l}) \big\|_1 \le \epsilon. \tag{CO*}
\]
Indeed, this implies (CO): just observe that by choosing a POVM $(T_k)$ on $\mathcal{H}^{\otimes l}$ one can “induce” any ensemble $\{\sigma_k, q_k\}$ on $\mathcal{H}^{\otimes l}$ for $\omega$, in the following sense:
\[
q_k\,\sigma_k = \operatorname{Tr}_{\mathcal{H}^{\otimes l}}\big( \pi^{\otimes l}\,(\mathbb{1} \otimes T_k) \big).
\]
How to do this is explained in detail in [17] (or see Appendix A below). Note that this generalizes the implication of (CP) from (CM), discussed earlier, when we view the POVMs as the quantum operations Eqs. (6) and (7). Conversely, assuming (CO) for all ensembles for $\omega$ does unfortunately not imply (CO*) with a comparable error parameter. (Examples are not hard to construct for which (CO) holds with a small $\epsilon$ while the bound in (CO*) is close to 1.)

With $\varphi_*$ there is associated the POVM $a = (a_j = V_j^* V_j : j = 1,\ldots,m)$, and with this goes the ensemble $\{\hat\rho_j, \lambda_j\}$, as before.

Theorem 10. With the above notation and $\epsilon > 0$ there exist quantum operations $\Phi_*^{(\nu)}$ in the form of Eq. (22), with
\[
M \le \exp\big(l\,I(\lambda;\hat\rho) + O(\sqrt{l})\big), \qquad
N \le \exp\big(l\,(H(\lambda) - I(\lambda;\hat\rho)) + O(\sqrt{l})\big),
\]
and such that $\Phi_* = \frac{1}{N}\sum_\nu \Phi_*^{(\nu)}$ satisfies (CO*). These bounds are asymptotically best possible if $\varphi_*$ is an instrument.

Proof. Let $A^{(\nu)}$ and $x_\nu$ be the POVMs and probabilities constructed in Theorem 2 from $a^{\otimes l}$ and $\omega = \rho^{\otimes l}$, and let $A = \sum_\nu x_\nu A^{(\nu)}$. We use the notation from the proof of this theorem and from Sect. IV:
\[
\sqrt{\rho}\,a_j\sqrt{\rho} = \lambda_j\,\hat\rho_j, \qquad
\sqrt{\omega}\,A^{(\nu)}_{j^l}\sqrt{\omega} = \Lambda^{(\nu)}_{j^l}\,\widehat{P}^{(\nu)}_{j^l}, \qquad
\Lambda_{j^l} = \sum_\nu x_\nu\,\Lambda^{(\nu)}_{j^l}.
\]
Note that by the proof of Theorem 2 the $\widehat{P}^{(\nu)}_{j^l}$ either are 0 or equal to
\[
\widehat{P}_{j^l} := \frac{S}{1+\epsilon}\,\xi_{j^l}.
\]
Introduce the unitaries $U_j$ by the polar decomposition
\[
V_j\sqrt{\rho} = U_j\sqrt{\sqrt{\rho}\,V_j^* V_j\sqrt{\rho}} = U_j\sqrt{\lambda_j\,\hat\rho_j}, \tag{23}
\]
and let $U_{j^l} = U_{j_1} \otimes \cdots \otimes U_{j_l}$. Now define $W^{(\nu)}_{j^l}$ by letting
\[
W^{(\nu)}_{j^l}\sqrt{\omega} = U_{j^l}\sqrt{\Lambda^{(\nu)}_{j^l}\,\widehat{P}^{(\nu)}_{j^l}}, \tag{24}
\]
and observe that for fixed $\nu$ only $M$ of them are nonzero, and that for fixed $j^l$ these are all multiples of each other. Hence these operators define a quantum operation $\Phi_*^{(\nu)}$ according to the theorem, and $\Phi_* = \sum_\nu x_\nu \Phi_*^{(\nu)}$.
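With these definitions we check that (CO*) is satisfied: using $\pi^{\otimes l} = (\sqrt{\omega} \otimes \mathbb{1}^{\otimes l})\,|I\rangle\langle I|\,(\sqrt{\omega} \otimes \mathbb{1}^{\otimes l})$ (see Lemma 14 in Appendix A) and Eqs. (23) and (24) we calculate
\begin{align*}
&\big\| (\varphi_*^{\otimes l} \otimes \mathrm{id}^{\otimes l})(\pi^{\otimes l}) - (\Phi_* \otimes \mathrm{id}^{\otimes l})(\pi^{\otimes l}) \big\|_1 \\
&\quad\le \sum_{j^l} \Big\| \big(V_{j^l} \otimes \mathbb{1}^{\otimes l}\big)\pi^{\otimes l}\big(V_{j^l}^* \otimes \mathbb{1}^{\otimes l}\big) - \sum_\nu x_\nu \big(W^{(\nu)}_{j^l} \otimes \mathbb{1}^{\otimes l}\big)\pi^{\otimes l}\big(W^{(\nu)*}_{j^l} \otimes \mathbb{1}^{\otimes l}\big) \Big\|_1 \\
&\quad= \sum_{j^l} \Big\| \lambda_{j^l}\big(\sqrt{\hat\rho_{j^l}} \otimes \mathbb{1}^{\otimes l}\big)|I\rangle\langle I|\big(\sqrt{\hat\rho_{j^l}} \otimes \mathbb{1}^{\otimes l}\big) - \Lambda_{j^l}\big(\sqrt{\widehat{P}_{j^l}} \otimes \mathbb{1}^{\otimes l}\big)|I\rangle\langle I|\big(\sqrt{\widehat{P}_{j^l}} \otimes \mathbb{1}^{\otimes l}\big) \Big\|_1 \\
&\quad\le \big\| \lambda^{\otimes l} - \Lambda \big\|_1 + \sum_{j^l} \lambda_{j^l} \Big\| \big(\sqrt{\hat\rho_{j^l}} \otimes \mathbb{1}^{\otimes l}\big)|I\rangle\langle I|\big(\sqrt{\hat\rho_{j^l}} \otimes \mathbb{1}^{\otimes l}\big) - \big(\sqrt{\widehat{P}_{j^l}} \otimes \mathbb{1}^{\otimes l}\big)|I\rangle\langle I|\big(\sqrt{\widehat{P}_{j^l}} \otimes \mathbb{1}^{\otimes l}\big) \Big\|_1. \tag{25}
\end{align*}
The last line here is estimated as follows: the first term is bounded by $2\epsilon$ (see the proof of Theorem 8), and for the other we use Lemma 14 in Appendix A: observe that for each $j^l$ the two terms inside the trace norm are the canonical purifications of $\hat\rho_{j^l}$ and $\widehat{P}_{j^l}$, respectively. Thus we get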
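\[
\Big\| \big(\sqrt{\hat\rho_{j^l}} \otimes \mathbb{1}^{\otimes l}\big)|I\rangle\langle I|\big(\sqrt{\hat\rho_{j^l}} \otimes \mathbb{1}^{\otimes l}\big) - \big(\sqrt{\widehat{P}_{j^l}} \otimes \mathbb{1}^{\otimes l}\big)|I\rangle\langle I|\big(\sqrt{\widehat{P}_{j^l}} \otimes \mathbb{1}^{\otimes l}\big) \Big\|_1 \le 2\sqrt{2}\,\sqrt[4]{\big\| \hat\rho_{j^l} - \widehat{P}_{j^l} \big\|_1},
\]
and using concavity of the root function and the estimate of Eq. (11) we can upper bound the last line of Eq. (25) by $O(\epsilon^{1/8})$.

If $\varphi_*$ is an instrument any approximate convex decomposition of $\varphi_*^{\otimes l}$ implies a similar decomposition for the POVM $a^{\otimes l}$. Hence Theorem 8 gives the optimality of the bounds for $M$ and $N$.

Interestingly, the bounds of Theorem 10 depend on the Kraus representation (21) of the map $\varphi_*$: all other such representations are related by unitary transforms, i.e.
\[
\varphi_*(\sigma) = \sum_J V_J\,\sigma\,V_J^* \quad\text{if and only if}\quad V_J = \sum_j U_{Jj}\,V_j,
\]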
with a unitary matrix (UJj )Jj of complex numbers. (This is essentially a consequence of the uniqueness up to unitaries of the Stinespring dilation [26] of ϕ, which implies the Kraus representation. This fact is also discussed in detail in [23]). This motivates the introduction of (ρ; ϕ∗ ) :=
min
Kraus repr. of ϕ∗
I (λ; ρ), ˆ
(26)
i.e. the minimum rate of the parameter M in decompositions of ϕ∗ according to Theorem 10. Note that, according to [23], the minimum of H (λ) over all Kraus representations is exactly Se , the entropy exchange of the map ϕ∗ (with respect to ρ). For a discussion see Subsect. VIE below. VI. Discussion We have introduced a separation into extrinsic and intrinsic data of a quantum measurement. It was shown to have definite minimal rates for either of these, and that it encompasses all previously known results on “meaningful” data in quantum measurements. A particular advantage of Theorem 2 before Theorems 5 and 6 is that it not even requires a new POVM (which might be experimentally difficult to realize). Instead, it can be understood as a mere re–interpretation of the data delivered by a⊗l : in fact, by (ν) our construction in the proof of Theorem 2 for all ν and j l either Aj l is 0 or very close to a multiple of aj l , in the sense of (CM). Hence the random variable N, defined as a function of j l : (ν) Tr ωA xν jl (ν) Pr N = ν|j l = , (27) Tr ωAj l = xν λj l Tr (ωaj l ) (up to a scaling factor, close to 1 for typical j l ), is almost independent from the source ensemble {σk , qk } in (CP). More precisely, 1 (ν) qk xν Tr (σk Aj l ) − qk Tr (σk aj l ) Pr{N = ν|j l } ≤ , 2 l kνj
and in fact, we even have 1 √ √ √ (ν) √ xν ωAj l ω − Pr{N = ν|j l } ωaj l ω ≤ . 1 2 l νj
This means that one can reproduce the statistics of the whole diagram in Fig. 2 from the outcomes of $a^{\otimes l}$, by inventing the $\nu$ distributed according to Eq. (27). This gives a new view on the extrinsic/intrinsic separation: rather than replacing the original POVM by a fancy construction, one can from the original data $j^l$ compute the extrinsic data $\nu$, and conditional on that the intrinsic part. Then one can successfully pretend that this separation was delivered by the mixture of the POVMs $A^{(\nu)}$.
A. Data vs. information. One (as it turns out, rather careless) interpretation of our result could be that the “useful” information produced by the POVM $a$ amounts to $I(\lambda;\hat\rho)$. This in itself is not yet precise, so let's fix “information” to mean “communicable information” in the sense of Shannon [25]: for any source $\{\sigma_i, \mu_i\}$ with average $\sum_i \mu_i\sigma_i = \rho$ the source and measurement outcome are random variables $X$ and $Y$ with a joint distribution
\[
\Pr\{X = i, Y = j\} = \mu_i\operatorname{Tr}(\sigma_i a_j),
\]
and the mutual information of these is I (X ∧ Y ) = H (X) + H (Y ) − H (XY ). We repeat here the discussion of [29] regarding the relation between this quantity and I (λ; ρ): ˆ Observe first that the joint distribution of X and Y can be rewritten as √ √ Pr{X = i, Y = j } = Tr ρ −1/2 µi σi ρ −1/2 ρaj ρ = λj Tr (ρˆj Si ), where the Si = ρ −1/2 µi σi ρ −1/2 form a POVM (compare [17] where this correspondence between POVMs and ensembles was used to classify ensembles with given density matrix). But here the Holevo bound [13] applies, with the ensemble {ρˆj , λj }, and thus we have proved: Theorem 11. Let {σi , µi } be any ensemble whose average state i µi σi equals ρ. Define random variables X, Y with joint distribution Pr{X = i, Y = j } = µi Tr (σi aj ) (this is the probability for σi to occur and that j is observed on this state). Then I (X ∧ Y ) ≤ I (λ; ρ). ˆ Note that in general maximization over the ensemble {σi , µi } (yielding the accessible information Jρ (a) = Iacc (λ; ρ), ˆ because in the above proof it corresponds to an information maximization over the POVM Si ) does not achieve the upper bound: see [13], where it is shown that it does if and only if all the ρˆj commute. Furthermore, by a result from [12], Jρ ⊗l (a⊗l ) = lJρ (a), hence the gap remains even asymptotically! For further discussion of this point we refer the reader to [29], Sect. VII C. We record here only the consequence that one ought to distinguish between data (collected by measurement) and information (about a property of the states): the latter is never larger than the former, and typically in quantum situations it is strictly less. However, this seems nothing to worry about: after all, this is an observation quite familiar from our experience, though it is worth stressing that in the present context it is a purely quantum phenomenon. 
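The gap between data and information described above can be made concrete on a small example. The following is a minimal sketch, not taken from the paper: it assumes a qubit source emitting $|0\rangle$ or $|1\rangle$ with equal weights and a three-outcome "trine" POVM, both chosen here purely for illustration, and uses base-2 logarithms. All names (`entropy`, `eigvals_sym2`, `povm`, and so on) are hypothetical helpers.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; (numerically) zero terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-15)

def eigvals_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = m
    mean, rad = (a + c) / 2, math.hypot((a - c) / 2, b)
    return [mean + rad, mean - rad]

def trace_prod(x, y):
    """Tr(x y) for 2x2 matrices."""
    return sum(x[r][s] * y[s][r] for r in range(2) for s in range(2))

# Illustrative data (an assumption of this sketch): source {sigma_i, mu_i} and trine POVM.
mu = [0.5, 0.5]
sigma = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
angles = [j * math.pi / 3 for j in range(3)]
povm = [[[2 / 3 * math.cos(t) ** 2, 2 / 3 * math.cos(t) * math.sin(t)],
         [2 / 3 * math.cos(t) * math.sin(t), 2 / 3 * math.sin(t) ** 2]] for t in angles]

# The average state rho is diagonal here, so sqrt(rho) acts entrywise on the diagonal.
rho_diag = [sum(mu[i] * sigma[i][r][r] for i in range(2)) for r in range(2)]
root = [math.sqrt(x) for x in rho_diag]

# lambda_j * rho_hat_j = sqrt(rho) a_j sqrt(rho);  I(lambda; rho_hat) = H(rho) - sum_j lambda_j H(rho_hat_j)
lam, holevo = [], entropy(rho_diag)
for a in povm:
    m = [[root[r] * a[r][s] * root[s] for s in range(2)] for r in range(2)]
    lj = m[0][0] + m[1][1]
    lam.append(lj)
    holevo -= lj * entropy([max(e / lj, 0.0) for e in eigvals_sym2(m)])

# Joint distribution Pr{X=i, Y=j} = mu_i Tr(sigma_i a_j) and its mutual information.
joint = [[mu[i] * trace_prod(sigma[i], a) for a in povm] for i in range(2)]
p_y = [sum(joint[i][j] for i in range(2)) for j in range(3)]
mutual = entropy(mu) + entropy(p_y) - entropy([p for row in joint for p in row])

print(f"I(X^Y) = {mutual:.4f} bits, I(lambda; rho_hat) = {holevo:.4f} bits")
```

For this particular choice the mutual information stays well below $I(\lambda;\hat\rho)$, in line with Theorem 11; other sources with the same average state can be tried by changing `mu` and `sigma`, as long as the average remains diagonal (a simplification of this sketch).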
Peter Shor has remarked the notable fact that in the presence of entanglement, however, this distinction disappears: the entanglement–assisted capacity [3] for the quantum–classical channel that is represented by our POVM, i.e. ϕ∗ from Eq. (6), with the average of the sent symbols required to be ρ (this means that in the formula for the entanglement–assisted capacity one has to put a purification of ρ) coincides with our I (λ; ρ)! ˆ In fact, our result can be understood as a weak version of the conjectured “Quantum Reverse Shannon Theorem” [3, 4], for quantum–classical channels. To end this part of the discussion note that the bound of Theorem 2 in the case of a maximally refined measurement is simply the von Neumann entropy H (ρ) of the source, and this regardless of the nature of the POVM and of the source. In this sense, there is “democracy among measurements”, at least the maximally refined ones.
It is thus appealing to view our result as a dual to the creation of a density operator by mixing pure states: it is well known that in any representation $\rho = \sum_i p_i\sigma_i$, with pure states $\sigma_i$, $H(p) \ge H(\rho)$, with equality iff the $\sigma_i$ are mutually orthogonal eigenstates of $\rho$: hence, $H(\rho)$ is the minimum entropy needed to generate $\rho$. In the present work we identify $H(\rho)$ as the maximum entropy of measurement data correlated to $\rho$.

B. Holevo bound. Here we show how to turn around the previous argument to actually prove the Holevo information bound. The statement is as follows:

Theorem 12 (Holevo [13]). Let $\{\hat\rho_j, \lambda_j\}_{j=1,\ldots,m}$ be an ensemble of states with average $\rho = \sum_j \lambda_j\hat\rho_j$, and $(S_i)_{i=1,\ldots,n}$ a POVM. Define the joint distribution of random variables $Y$, $X$ to be
\[
\Pr\{Y = j, X = i\} = \lambda_j\operatorname{Tr}(\hat\rho_j S_i). \tag{28}
\]
Then we have the inequality
\[
I(Y \wedge X) \le I(\lambda;\hat\rho) = H(\rho) - \sum_j \lambda_j H(\hat\rho_j).
\]

Proof. To begin with, observe that Eq. (28) may be rewritten as
\[
\Pr\{Y = j, X = i\} = p_i\operatorname{Tr}(\sigma_i a_j),
\]
with $a_j = \rho^{-1/2}\lambda_j\hat\rho_j\,\rho^{-1/2}$ and the ensemble $\{\sigma_i, p_i\}$, where $p_i\sigma_i = \sqrt{\rho}\,S_i\sqrt{\rho}$. Now consider i.i.d. realizations $X_1, Y_1, \ldots, X_l, Y_l$ of the pair $X, Y$. We shall apply Theorem 2 to $a^{\otimes l}$ and $\rho^{\otimes l}$, with parameter $0 < \epsilon < 1$. Hence, for $A = \sum_\nu x_\nu A^{(\nu)}$ and the ensemble $\{\sigma_{i^l}, p_{i^l}\}$ the condition (CP) holds. Let us define random variables $\Xi$, $\Upsilon$ by
\[
\Pr\{\Upsilon = j^l, \Xi = i^l, N = \nu\} = x_\nu\,p_{i^l}\operatorname{Tr}\big(\sigma_{i^l} A^{(\nu)}_{j^l}\big),
\]
so that
\[
\Pr\{\Upsilon = j^l, \Xi = i^l\} = p_{i^l}\operatorname{Tr}\big(\sigma_{i^l} A_{j^l}\big).
\]
Then we may calculate (with $f(\epsilon) := \epsilon(\log m + 2\log n)$)
\begin{align*}
l\,I(Y \wedge X) &= I(X^l \wedge Y^l) \\
&\le I(\Xi \wedge \Upsilon) + l f(\epsilon) + 4 \\
&\le I(\Xi \wedge N\Upsilon) + l f(\epsilon) + 4 \\
&= I(\Xi \wedge N) + I(\Xi \wedge \Upsilon | N) + l f(\epsilon) + 4 \\
&\le 0 + \log M + l f(\epsilon) + 4 \\
&\le l\,I(\lambda;\hat\rho) + O(\sqrt{l}) + l f(\epsilon) + 4.
\end{align*}
Only classical entropy relations have been used: line 2 is by Lemma 13 stated below, line 3 is by data processing, as υ is a function of ν and µ, line 4 is a standard identity, and line 5 by independence of ν and ξ and the standard inequality $I(\xi \wedge \mu|\nu) \le H(\mu)$. Now divide by $l$ and let $l \to \infty$:
\[
I(Y \wedge X) \le I(\lambda;\hat\rho) + \epsilon(\log m + 2\log n).
\]
As $\epsilon > 0$ was arbitrary, the theorem follows.
Lemma 13 (Fano [8]). Let $P$ and $Q$ be probability distributions on a set with finite cardinality $a$, such that $\frac{1}{2}\|P - Q\|_1 \le \lambda$. Then $|H(P) - H(Q)| \le \lambda\log a + 2H(\lambda, 1-\lambda)$.
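The continuity estimate of Lemma 13 is easy to probe numerically. The sketch below is only an illustration under assumptions made here: logarithms are taken to base 2, $\lambda$ is set equal to the variational distance $\frac{1}{2}\|P - Q\|_1$ itself, and the pairs of distributions are drawn at random on a five-letter alphabet. The function names are hypothetical.

```python
import math
import random

def entropy(p):
    """Shannon entropy in bits, skipping non-positive entries."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def fano_gap(p, q):
    """Return (|H(P) - H(Q)|, lambda*log a + 2*H(lambda, 1-lambda)) with lambda = (1/2)||P - Q||_1."""
    lam = 0.5 * sum(abs(x - y) for x, y in zip(p, q))
    bound = lam * math.log2(len(p)) + 2 * entropy([lam, 1 - lam])
    return abs(entropy(p) - entropy(q)), bound

def random_distribution(a, rng):
    """A random probability vector of length a (normalized uniform weights)."""
    w = [rng.random() for _ in range(a)]
    total = sum(w)
    return [x / total for x in w]

rng = random.Random(0)
pairs = [fano_gap(random_distribution(5, rng), random_distribution(5, rng)) for _ in range(2000)]
worst_slack = min(bound - gap for gap, bound in pairs)
print(f"smallest slack (bound minus entropy difference): {worst_slack:.4f}")
```

A nonnegative slack on every sampled pair is consistent with the lemma; it is of course no substitute for the proof.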
The reader may want to compare this proof to our earlier one in [29]: despite similarities they are conceptually completely different! In fact, there we introduced the Holevo mutual information as a certain fidelity measure (which may seem slightly artificial) and applied Theorem 6, while here we directly exploit the “bottleneck” nature of our main result (compare again Fig. 2), thus providing a much more natural approach.

C. Fixed source ensemble and classical case. Our approach has concentrated on universal properties of the POVM, leaving the source as free as possible. What happens if we fix the source $\{\rho_i, p_i\}$? Note that the whole situation is fully classical now, as we only have to regard the correlation between source issues $X = i$ and measurement results $Y = j$. Thus it is modelled by the classical case of the initial problem: the source is $\{|i\rangle\langle i|, p_i\}$, and the POVM $b$ consists of operators
\[
b_j = \sum_i \operatorname{Tr}(\rho_i a_j)\,|i\rangle\langle i|.
\]
This model has the same joint statistics of $i$ and $j$ as the above described one (most generally, $b_j$ can be any operator with eigenbasis $\{|i\rangle\}$). Now observe the following: as long as the POVMs $A^{(\nu)}$ are diagonal in the basis $\{|i^l\rangle\}$, too (this is the classicality condition for the POVMs), the validity of (CP) for all ensembles with average
\[
P = \sum_i p_i\,|i\rangle\langle i|
\]
is implied by its validity for the ensemble $\{|i\rangle\langle i|, p_i\}$. This is because source states $\rho_i$ and $\sum_{i'} |i'\rangle\langle i'|\rho_i|i'\rangle\langle i'|$ produce the same statistics, so only sources consisting of mixtures of the $|i\rangle\langle i|$ have to be considered. The condition (CP) for them clearly is implied by its validity for $\{|i\rangle\langle i|, p_i\}$. At this point Theorems 2 and 8 can be applied: because the induced ensemble for source state $P$ and POVM $b$ is $\{\hat\sigma_j, \lambda_j\}$, with
\[
\lambda_j = \sum_i p_i\operatorname{Tr}(\rho_i a_j) = \operatorname{Tr}(\rho a_j), \qquad
\hat\sigma_j = \frac{1}{\lambda_j}\sum_i p_i\operatorname{Tr}(\rho_i a_j)\,|i\rangle\langle i|,
\]
we obtain $I(X \wedge Y)$, that is the Shannon mutual information between the source and the measurement, as the rate of intrinsic data. More precisely, we can perform a data separation by postprocessing, according to the prescription of the beginning of this section, Eq. (27), into extrinsic $\nu$, almost independent of $i^l$, and intrinsic $j^l$ depending on $i^l$ and $\nu$. However, this is not exactly what we set out to do initially: Theorem 2 allows us to decompose the $b_{j^l}$ into convex combinations of operators
\[
B^{(\nu)}_{j^l} = \sum_{i^l} \beta^{(\nu)}_{j^l|i^l}\,|i^l\rangle\langle i^l|,
\]
but it is not clear that these can be obtained from POVMs $A^{(\nu)}$, in the sense that
\[
\forall\nu\ \forall j^l\ \forall i^l: \quad \beta^{(\nu)}_{j^l|i^l} = \operatorname{Tr}\big(\rho_{i^l} A^{(\nu)}_{j^l}\big).
\]
For this to hold the vectors $\big(\beta^{(\nu)}_{j^l|i^l}\big)_{i^l}$ (for all $j^l$) must belong to the cone spanned by the vectors $\big(\langle\psi|\rho_{i^l}|\psi\rangle\big)_{i^l}$. It is conceivable that under this condition the obtainable intrinsic data rate increases. We have to leave this interesting question for the moment.

For classical sources and measurements we thus obtain that intrinsic data equals mutual information. On the other hand, we can come back to their being distinct in truly quantum situations: we pointed out in Subsect. VI A that the maximum of $I(X \wedge Y)$ over all sources with average $\rho$ gives the accessible information $I_{\mathrm{acc}}(\lambda;\hat\rho)$ of the ensemble $\{\hat\rho_j, \lambda_j\}$, which in general is less than $I(\lambda;\hat\rho)$. The difference can be accounted for by considering that the sources in this maximization are of the special i.i.d. type (on $l$-blocks), while (CM) implies (CP) even for sources of entangled states, as long as their average is $\omega = \rho^{\otimes l}$. This should be viewed especially in the light of the conjecture implied in Subsect. VI F.

D. Sufficient statistics. The reader familiar with classical statistical theories may have been reminded by our above discussion of the concept of sufficient statistics, at least when the quantum source and the observation are essentially classical, i.e. when all the $\rho_i$ and $a_j$ commute: the former are then just probability distributions and the latter form a statistical decision rule, with distribution of $j$ conditional on $i$ denoted $q(j|i)$. As there is also a distribution $p_i$ on the $i$ we have here a statistical model in the sense of estimation theory (we refer the reader to [19] for detailed explanations). We will consider the values of $i$ and $j$ as random variables: then a sufficient statistics is a random variable $k$ which is a function of $j$ (whose distribution conditional on $i$ we denote $\tilde q(k|i)$), such that the distribution of $j$ conditional on $k$ is independent of $i$:
\[
\Pr\{j|k\} = \Pr\{j|k,i\} \qquad \forall i.
\]
Let us denote these conditional probabilities by $r(j|k)$. This implies that we can simulate the distribution of $j$ conditional on $i$ from $k$:
\[
q(j|i) = \sum_k r(j|k)\,\tilde q(k|i).
\]
In words, to each entry $k$ of the new data record there exists a distribution on the $j$ of the original data record such that the latter's distribution is recovered as a convolution; in terms of stochastic maps $q$ is factorized into $\tilde q$ and $r$:
\[
i \xrightarrow{\ \tilde q\ } k \xrightarrow{\ r\ } j.
\]
On the other hand, our Theorem 2 provides something appearing to be dual to this (apart from holding only approximately and in an asymptotic setting; these things are easily introduced in sufficient statistics, too): a random variable $\nu$ with distribution $x$, independent of $i$ and $j$, and conditional on it a stochastic map $a_\nu(j|i)$ such that
\[
q(j|i) = \sum_\nu x_\nu\,a_\nu(j|i).
\]
In a diagram, with the single variable $\nu$ feeding both maps:
\[
\begin{array}{c} \nu \\ \downarrow \\ i \xrightarrow{\ \widetilde{Q}_\nu\ } \mu \xrightarrow{\ R_\nu\ } j. \end{array}
\]
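The factorization that defines a sufficient statistic can be checked on a toy model. The sketch below is an illustration only, with all numbers and names assumed here: the statistic $k$ is a noisy copy of a binary source letter $i$, and the full record is $j = 2k + c$ with an independent fair coin $c$, so that $k = \lfloor j/2 \rfloor$ is a function of $j$.

```python
# q_tilde[i][k]: distribution of the statistic k given the source letter i (hypothetical numbers).
q_tilde = [[0.9, 0.1], [0.2, 0.8]]

# r[k][j]: distribution of the full record j given k; j = 2*k + (fair coin), hence k = j // 2.
r = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

# Composition of the two stochastic maps: q(j|i) = sum_k r(j|k) * q_tilde(k|i).
q = [[sum(r[k][j] * q_tilde[i][k] for k in range(2)) for j in range(4)] for i in range(2)]

# Sufficiency: Pr{j | k, i} = q(j|i) / q_tilde(k(j)|i) must not depend on i.
cond = [[q[i][j] / q_tilde[i][j // 2] for j in range(4)] for i in range(2)]

for i in range(2):
    print(f"i={i}: q(.|i) = {q[i]}, Pr(j | k(j), i) = {cond[i]}")
```

In this model the conditional law of $j$ given $k$ is the coin alone, so it is the same for both source letters, which is exactly the independence required above.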
Like $k$ in the case of sufficient statistics, the pair $\mu\nu$ is a function of $j$, but unlike there, where $\tilde q$ and $r$ were stochastic maps with independent sources of randomness (when stochastic maps are viewed as set function valued random variables, this is expressed by the independence of $\tilde q$ and $r$), the maps $\widetilde{Q}$ and $R$ draw their randomness from the same source $\nu$. In summary, there is no direct isomorphism between our concept of data reduction and sufficient statistics (which, too, can be used to reduce the entropy of data sets): the latter appears as a special case where the maps $\widetilde{Q}$ and $R$ are independent.

E. Entropy exchange. We want to discuss an application of Theorem 10 to the entropy exchange of quantum operations, introduced by Schumacher [23] (and previously by Lindblad [21]): for a quantum operation $\varphi_*$ in the form (21) it is defined as
\[
S_e(\rho;\varphi_*) = H(W), \qquad\text{with}\quad W_{jk} = \operatorname{Tr}(V_j\,\rho\,V_k^*).
\]
It can be shown to be independent of the Kraus representation, by identifying it with the entropy increase in an initially pure environment of the system by a Stinespring dilation of $\varphi_*$, see [23]. In the latter work a number of interesting relations between $S_e$ and other entropic quantities are shown. In particular, returning to the notation of Sect. V, it is shown that there is a (in this sense, minimal) Kraus representation of $\varphi_*$ such that $H(\lambda) = S_e(\rho;\varphi_*)$. Because of $I(\lambda;\hat\rho) \le H(\lambda)$ (this is simply the data processing inequality [1]), we conclude that the minimum in Eq. (26) is at most $S_e(\rho;\varphi_*)$. By the derivation the quantity of Eq. (26) may be dubbed genuinely quantum entropy exchange of a channel, as it is that part of the noise that cannot be accounted for classically. From a different point of view, in fact also the maximum of $I(\lambda;\hat\rho)$ over all Kraus representations of $\varphi_*$ (compare Eq. (26)) is interesting: in a cryptographic setting, where $\varphi_*$ connects users A and B, and is controlled by an eavesdropper E, it is the amount of data collected by E about A's messages in the worst case.
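That $S_e = H(W)$ does not depend on the Kraus representation, while $H(\lambda)$ does, can be seen on a small numerical example. This is a minimal sketch under assumptions made only for illustration: a qubit amplitude damping map with real Kraus operators, a real density matrix, base-2 logarithms, and a second Kraus representation obtained by mixing the operators with a rotation. All names are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits; (numerically) zero terms are skipped."""
    return -sum(x * math.log2(x) for x in p if x > 1e-15)

def matmul(x, y):
    return [[sum(x[r][t] * y[t][c] for t in range(2)) for c in range(2)] for r in range(2)]

def transpose(x):
    return [[x[c][r] for c in range(2)] for r in range(2)]

def eigvals_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = m
    mean, rad = (a + c) / 2, math.hypot((a - c) / 2, b)
    return [mean + rad, mean - rad]

def exchange_matrix(kraus, rho):
    """W_jk = Tr(V_j rho V_k^*) for real Kraus operators (adjoint = transpose)."""
    def tr(x):
        return x[0][0] + x[1][1]
    return [[tr(matmul(matmul(vj, rho), transpose(vk))) for vk in kraus] for vj in kraus]

# Illustrative data: amplitude damping with probability g, and a real qubit state rho.
g = 0.3
v = [[[1.0, 0.0], [0.0, math.sqrt(1 - g)]], [[0.0, math.sqrt(g)], [0.0, 0.0]]]
rho = [[0.6, 0.2], [0.2, 0.4]]

# Second representation V'_J = sum_j U_Jj V_j, with U a rotation (a real unitary matrix).
co, si = math.cos(0.7), math.sin(0.7)
u = [[co, si], [-si, co]]
v2 = [[[u[J][0] * v[0][r][t] + u[J][1] * v[1][r][t] for t in range(2)] for r in range(2)]
      for J in range(2)]

results = []
for kraus in (v, v2):
    w = exchange_matrix(kraus, rho)
    s_e = entropy([max(e, 0.0) for e in eigvals_sym2(w)])  # S_e = H(W)
    h_lam = entropy([w[0][0], w[1][1]])                    # lambda_j = W_jj = Tr(rho V_j^* V_j)
    results.append((s_e, h_lam))
    print(f"S_e = {s_e:.6f}, H(lambda) = {h_lam:.6f}")
```

Both representations give the same $S_e$, and in each case $H(\lambda) \ge S_e$, since $\lambda$ is the diagonal of $W$; equality requires a representation in which $W$ is diagonal.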
A deeper investigation of these concepts is relegated to another occasion.

F. An open problem. An interesting and challenging question is about the amount of data collected by $a$ under the hypothesis of an arbitrarily varying source (AVS), instead of the i.i.d. model considered here: An AVS is a collection of source ensembles $\{\rho_{is}, p_{is}\}$ (with average state $\rho_s$), labelled by $s \in S$, which we make into a discrete memoryless source by considering the ensembles (labelled by $s^l \in S^l$) $\{\rho_{i^l s^l}, p_{i^l s^l}\}_{i^l}$. The idea is that at each position $k = 1,\ldots,l$ the source may be arbitrarily in one of the internal states $s \in S$. We have no information, not even statistical, about $s$, so our data separation must work for all $s^l \in S^l$: formally the condition on $A = \sum_\nu x_\nu A^{(\nu)}$ is
\[
\forall s^l: \quad \frac{1}{2}\sum_{j^l} \Big\| \sqrt{\omega(s^l)}\,\big(a_{j^l} - A_{j^l}\big)\sqrt{\omega(s^l)} \Big\|_1 \le \epsilon, \tag{AVCM}
\]
where $\omega(s^l) = \rho_{s_1} \otimes \cdots \otimes \rho_{s_l}$ is the average state of the source when in internal state $s^l$.
A natural candidate for the minimum data rate of the A(ν) seems to be ( ) max I (λ; ρ) ˆ : ρ ∈ conv{ρs : s ∈ S} , √ √ with λj ρˆj = ρaj ρ, and conv denoting the closed convex hull. If this is true, then in particular the quantity (a) = max I (λ; ρ) ˆ ρ
is the amount of data collected by a, regardless of any source ensemble. Acknowledgements. I am indebted to Serge Massar for his introducing me to the problem addressed in this paper and for interesting discussions, and to Hiroshi Nagaoka for pointing out to me the possible relation between the present approach and sufficient statistics. Thanks to Peter Shor who supplied the insight that the difference between data and information disappears in the presence of entanglement. I thank Masanao Ozawa for pointing out to me that Theorem 10, initially only formulated for operations, is in fact valid for instruments. Part of this work was done during my stay at the ERATO project “Quantum Computation and Information”, Tokyo (August/September 2001). I thank the members of the project for their hospitality, and especially Keiji Matsumoto for discussions on the content of the appendix, on which I also enjoyed conversation with Richard Jozsa and Masahide Sasaki. Last but not least, special thanks are due to Marco P. Carota for constant encouragement during the course of this work.
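Appendix A. Canonical Purifications

In this appendix we collect a few facts about mixed state fidelity and a certain kind of purification of mixed states, which we call canonical, that seem not to be widely known. These are used in the main text, but seem to be of interest in their own right.

For the state $\omega$ on $\mathcal{H}_1$ consider a purification $|\psi\rangle = \sum_i \sqrt{r_i}\,|i\rangle \otimes |i\rangle$ on a bipartite system $\mathcal{H}_1 \otimes \mathcal{H}_2$, that we already have put in Schmidt polar form. Then on both systems there exist ($\mathbb{R}$-linear) complex conjugation maps with respect to the basis $\{|i\rangle\}$:
\[
|\phi\rangle = \sum_i \alpha_i\,|i\rangle \longmapsto \sum_i \overline{\alpha_i}\,|i\rangle =: |\overline{\phi}\rangle.
\]
Then, with $|I\rangle = \sum_i |i\rangle \otimes |i\rangle$, it can be checked that
\[
|\psi\rangle\langle\psi| = (\sqrt{\omega} \otimes \mathbb{1})\,|I\rangle\langle I|\,(\sqrt{\omega} \otimes \mathbb{1}) = (\mathbb{1} \otimes \sqrt{\omega})\,|I\rangle\langle I|\,(\mathbb{1} \otimes \sqrt{\omega}),
\]
see also the following Lemma 14. Then
\begin{align*}
&\big(\mathbb{1} \otimes \sqrt{S_k}\big)\,|\psi\rangle\langle\psi|\,\big(\mathbb{1} \otimes \sqrt{S_k}\big) \\
&\quad= \big(\mathbb{1} \otimes \sqrt{S_k}\sqrt{\omega}\big)\,|I\rangle\langle I|\,\big(\mathbb{1} \otimes \sqrt{\omega}\sqrt{S_k}\big) \\
&\quad= q_k\,(\mathbb{1} \otimes U_k)\,(\mathbb{1} \otimes \sqrt{\tau_k})\,|I\rangle\langle I|\,(\mathbb{1} \otimes \sqrt{\tau_k})\,(\mathbb{1} \otimes U_k^*) \\
&\quad= q_k\,(\mathbb{1} \otimes U_k)\,|t_k\rangle\langle t_k|\,(\mathbb{1} \otimes U_k^*),
\end{align*}
the third line introducing $q_k\tau_k = \sqrt{\omega}\,S_k\sqrt{\omega}$ on $\mathcal{H}_2$, and the polar decomposition $\sqrt{S_k}\sqrt{\omega} = U_k\sqrt{q_k\tau_k}$, the fourth the canonical purification $|t_k\rangle$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ of $\tau_k$ (with respect to $|I\rangle\langle I|$), see Lemma 14 below. By this lemma we can infer
\[
\operatorname{Tr}_{\mathcal{H}_2}\big( |\psi\rangle\langle\psi|\,(\mathbb{1} \otimes S_k) \big) = q_k\operatorname{Tr}_{\mathcal{H}_2}|t_k\rangle\langle t_k| = q_k\,\overline{\tau_k},
\]
with the complex conjugated operator $\overline{\tau_k}$, which is defined as
\[
\overline{\tau_k} = \sum_i |\overline{\phi_i}\rangle\langle\overline{\phi_i}|, \qquad\text{if}\quad \tau_k = \sum_i |\phi_i\rangle\langle\phi_i|.
\]
Note that this is uniquely defined, regardless of the convex decomposition chosen, and in particular independent of the phases of the $|\phi_i\rangle$. The ensemble $\{\overline{\tau_k}, q_k\}$ has average $\overline{\omega} = \omega$, and conversely, the above formulas show how to induce any ensemble $\{\sigma_k, q_k\}$ for $\omega$ on $\mathcal{H}_1$: let $S_k = \omega^{-1/2}\,q_k\overline{\sigma_k}\,\omega^{-1/2}$ (this was noted before in [17] in the context of classifying ensembles with a given density operator).

Lemma 14 (“Pretty good purifications”). Consider orthonormal bases of spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, both denoted $\{|i\rangle\}$, and introduce $|I\rangle = \sum_i |i\rangle \otimes |i\rangle$. As before, we denote the complex conjugation with respect to this basis by an overline. Then for a state $\rho = \sum_i \alpha_i\,|\psi_i\rangle\langle\psi_i|$ (in diagonalized form),
\[
|r\rangle\langle r| = (\sqrt{\rho} \otimes \mathbb{1})\,|I\rangle\langle I|\,(\sqrt{\rho} \otimes \mathbb{1}), \qquad\text{with}\quad |r\rangle = \sum_i \sqrt{\alpha_i}\,|\psi_i\rangle \otimes |\overline{\psi_i}\rangle,
\]
is a purification of $\rho$. We call it the canonical purification with respect to $|I\rangle$. (Note that this definition makes sense as it is independent of phases in the $|\psi_i\rangle$.) If $|s\rangle\langle s|$ is the canonical purification of another state $\sigma$ then for the fidelity between these:
\[
F(|r\rangle, |s\rangle) = |\langle r|s\rangle|^2 = \big(\operatorname{Tr}\sqrt{\rho}\sqrt{\sigma}\big)^2. \tag{A.1}
\]
Furthermore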
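\[
\operatorname{Tr}\sqrt{\rho}\sqrt{\sigma} \ge 1 - \sqrt{\|\rho - \sigma\|_1}, \tag{A.2}
\]
\[
\frac{1}{2}\big\| |r\rangle\langle r| - |s\rangle\langle s| \big\|_1 \le \sqrt[4]{4\,\|\rho - \sigma\|_1}. \tag{A.3}
\]
Proof. The formula for the canonical purification is a straightforward calculation. With its help, it is also straightforward to check the fidelity identity, Eq. (A.1). Now for the last two estimates: begin with
\begin{align*}
1 - \operatorname{Tr}\sqrt{\rho}\sqrt{\sigma} &= \operatorname{Tr}\sqrt{\rho}\,\big(\sqrt{\rho} - \sqrt{\sigma}\big) \\
&\le \big\| \sqrt{\rho}\,(\sqrt{\rho} - \sqrt{\sigma}) \big\|_1 \\
&\le \|\sqrt{\rho}\|_2\,\|\sqrt{\rho} - \sqrt{\sigma}\|_2 \\
&\le \sqrt{\operatorname{Tr}|\rho - \sigma|} = \sqrt{\|\rho - \sigma\|_1},
\end{align*}
invoking two nontrivial inequalities: in the third line we use Cor. IV.2.6 of [5] (which is a kind of Hölder or Cauchy-Schwarz inequality), in the fourth line Thm. X.1.3 from the same book. Finally, use the well known identity
\[
\frac{1}{2}\big\| |r\rangle\langle r| - |s\rangle\langle s| \big\|_1 = \sqrt{1 - F(|r\rangle, |s\rangle)}
\]
to obtain
\begin{align*}
\frac{1}{2}\big\| |r\rangle\langle r| - |s\rangle\langle s| \big\|_1 &= \sqrt{1 - |\langle r|s\rangle|^2} \le \sqrt{2}\,\sqrt{1 - |\langle r|s\rangle|} \\
&= \sqrt{2}\,\sqrt{1 - \operatorname{Tr}\sqrt{\rho}\sqrt{\sigma}} \le \sqrt{2}\,\sqrt[4]{\|\rho - \sigma\|_1},
\end{align*}
which we wanted to show.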
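The estimates (A.2) and (A.3) can be tried out on a pair of qubit states. The sketch below is only an illustration under assumptions introduced here: the two states are real density matrices chosen arbitrarily, the overlap of the canonical purifications is computed as $\langle r|s\rangle = \operatorname{Tr}\sqrt{\rho}\sqrt{\sigma}$ (Eq. (A.1)), and the matrix square root uses the closed form $\sqrt{M} = (M + \sqrt{\det M}\,\mathbb{1})/\sqrt{\operatorname{Tr}M + 2\sqrt{\det M}}$, valid for nonzero positive semidefinite $2\times 2$ matrices. Function names are hypothetical.

```python
import math

def sqrt_psd2(m):
    """Square root of a nonzero real PSD 2x2 matrix: (M + sqrt(det) I) / sqrt(tr + 2 sqrt(det))."""
    s = math.sqrt(max(m[0][0] * m[1][1] - m[0][1] * m[1][0], 0.0))
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t], [m[1][0] / t, (m[1][1] + s) / t]]

def trace_prod(x, y):
    """Tr(x y) for 2x2 matrices."""
    return sum(x[r][k] * y[k][r] for r in range(2) for k in range(2))

def trace_norm_diff(x, y):
    """||x - y||_1 for real symmetric 2x2 matrices (sum of absolute eigenvalues)."""
    a, b, c = x[0][0] - y[0][0], x[0][1] - y[0][1], x[1][1] - y[1][1]
    mean, rad = (a + c) / 2, math.hypot((a - c) / 2, b)
    return abs(mean + rad) + abs(mean - rad)

def qubit_state(z, x):
    """Real qubit density matrix with Bloch components (x, 0, z); needs x^2 + z^2 <= 1."""
    return [[(1 + z) / 2, x / 2], [x / 2, (1 - z) / 2]]

rho, sigma = qubit_state(0.6, 0.3), qubit_state(0.5, 0.45)
overlap = trace_prod(sqrt_psd2(rho), sqrt_psd2(sigma))   # <r|s> = Tr sqrt(rho) sqrt(sigma)
dist = trace_norm_diff(rho, sigma)                        # ||rho - sigma||_1
purified = math.sqrt(max(1 - overlap ** 2, 0.0))          # (1/2) || |r><r| - |s><s| ||_1

print(f"(A.2): {overlap:.6f} >= {1 - math.sqrt(dist):.6f}")
print(f"(A.3): {purified:.6f} <= {math.sqrt(2) * dist ** 0.25:.6f}")
```

For states this close the bounds are far from tight, which matches their role in the main text as coarse but dimension-free estimates.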
Remark 15. Observe $\operatorname{Tr}\sqrt{\rho}\sqrt{\sigma} \le \|\sqrt{\rho}\sqrt{\sigma}\|_1$, the square of this latter quantity being known as the (mixed state) fidelity [18]. By theorems of Uhlmann [27] and Jozsa [18] the mixed state fidelity $F(\rho,\sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$ equals the maximum over the pure state fidelities of all possible purifications of $\rho$ and $\sigma$. Because of well known relations between mixed state fidelity and trace norm distance (see [9]), more precisely
\[
1 - \sqrt{F(\rho,\sigma)} \le \frac{1}{2}\|\rho - \sigma\|_1 \le \sqrt{1 - F(\rho,\sigma)}, \tag{A.4}
\]
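the lemma tells us that at least for (mixed state) fidelity close to 1 the canonical purifications are not too far off the optimum with respect to (pure state) fidelity.

Appendix B. Typical Sequences and Subspaces

For a probability distribution $P$ on the finite set $\mathcal{X}$ define the set of typical sequences (with $\delta > 0$)
\[
T^l_{P,\delta} = \Big\{ x^l : \forall x\ \ |N(x|x^l) - lP_x| \le \delta\sqrt{l\,P_x(1-P_x)} \Big\},
\]
where $N(x|x^l)$ counts the number of occurrences of $x$ in the word $x^l = x_1 \ldots x_l$. For a state $\rho$ fix eigenstates $e_1,\ldots,e_d$ (with eigenvalues $R_1,\ldots,R_d$) and define for $\delta > 0$ the typical projector as
\[
\Pi^l_{\rho,\delta} = \sum_{t^l \in T^l_{R,\delta}} e_{t_1} \otimes \cdots \otimes e_{t_l}.
\]
For a collection of states $\hat\rho_j$, $j = 1,\ldots,m$, and $j^l \in [m]^l$ define the conditional typical projector as
\[
\Pi^l_{\hat\rho,\delta}(j^l) = \bigotimes_j \Pi^{I_j}_{\hat\rho_j,\delta},
\]
where $I_j = \{k : j_k = j\}$ and $\Pi^{I_j}_{\hat\rho_j,\delta}$ is meant to denote the typical projector of the state $\hat\rho_j$ on the subsystem composed of the tensor factors $I_j$ in the tensor product of $l$ factors. From [28] we cite the following properties of these projectors:
\begin{align*}
\operatorname{Tr}\big(\rho^{\otimes l}\,\Pi^l_{\rho,\delta}\big) &\ge 1 - \frac{d}{\delta^2}, \tag{B.1} \\
\operatorname{Tr}\big(\hat\rho_{j^l}\,\Pi^l_{\hat\rho,\delta}(j^l)\big) &\ge 1 - \frac{md}{\delta^2}, \tag{B.2} \\
\operatorname{Tr}\big(\hat\rho_{j^l}\,\Pi^l_{\rho,\delta}\big) &\ge 1 - \frac{m^2 d}{\delta^2}, \tag{B.3} \\
\operatorname{Tr}\Pi^l_{\rho,\delta} &\le \exp\big(l\,H(\rho) + Kd\delta\sqrt{l}\big), \tag{B.4} \\
\operatorname{Tr}\Pi^l_{\rho,\delta} &\ge \Big(1 - \frac{d}{\delta^2}\Big)\exp\big(l\,H(\rho) - Kd\delta\sqrt{l}\big), \tag{B.5} \\
\operatorname{Tr}\Pi^l_{\hat\rho,\delta}(j^l) &\le \exp\big(l\,H(\hat\rho|P_{j^l}) + Kmd\delta\sqrt{l}\big), \tag{B.6} \\
\operatorname{Tr}\Pi^l_{\hat\rho,\delta}(j^l) &\ge \Big(1 - \frac{md}{\delta^2}\Big)\exp\big(l\,H(\hat\rho|P_{j^l}) - Kmd\delta\sqrt{l}\big), \tag{B.7}
\end{align*}
for an absolute constant $K > 0$, and the empirical distribution $P_{j^l}$ of letters $j$ in the word $j^l$:
\[
P_{j^l}(j) = \frac{1}{l}\,N(j|j^l).
\]
Finally, with
\[
\alpha = \exp\big(-l\,H(\rho) - Kd\delta\sqrt{l}\big), \qquad \alpha' = \exp\big(-l\,H(\rho) + Kd\delta\sqrt{l}\big),
\]
\[
\beta = \exp\big(-l\,H(\hat\rho|P_{j^l}) + Kmd\delta\sqrt{l}\big), \qquad \beta' = \exp\big(-l\,H(\hat\rho|P_{j^l}) - Kmd\delta\sqrt{l}\big),
\]
we have
\[
\alpha'\,\Pi^l_{\rho,\delta} \ge \Pi^l_{\rho,\delta}\,\rho^{\otimes l}\,\Pi^l_{\rho,\delta} \ge \alpha\,\Pi^l_{\rho,\delta}, \tag{B.8}
\]
\[
\beta'\,\Pi^l_{\hat\rho,\delta}(j^l) \le \Pi^l_{\hat\rho,\delta}(j^l)\,\hat\rho_{j^l}\,\Pi^l_{\hat\rho,\delta}(j^l) \le \beta\,\Pi^l_{\hat\rho,\delta}(j^l). \tag{B.9}
\]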
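The definition of typical sequences and the Chebyshev estimate behind bounds such as (B.1) can be illustrated by brute-force enumeration for a short block length. The sketch below makes assumptions of its own: a three-letter distribution, block length $l = 8$ and $\delta = 2.5$, and as comparison the standard union bound $1 - |\mathcal{X}|/\delta^2$ obtained by applying Chebyshev's inequality to each letter count. The function name is hypothetical.

```python
import itertools
import math

def typical_set_probability(p, l, delta):
    """Probability, under the i.i.d. distribution p^l, of the set of delta-typical words of length l."""
    total = 0.0
    for word in itertools.product(range(len(p)), repeat=l):
        counts = [word.count(x) for x in range(len(p))]
        typical = all(
            abs(counts[x] - l * p[x]) <= delta * math.sqrt(l * p[x] * (1 - p[x]))
            for x in range(len(p))
        )
        if typical:
            total += math.prod(p[x] for x in word)
    return total

p, l, delta = [0.5, 0.3, 0.2], 8, 2.5
prob = typical_set_probability(p, l, delta)
chebyshev = 1 - len(p) / delta ** 2
print(f"P^l(T) = {prob:.4f}, Chebyshev lower bound = {chebyshev:.4f}")
```

The enumeration visits all $3^8$ words, so it is only meant for tiny parameters; it shows that the typical set already carries most of the probability even where the Chebyshev bound is quite loose.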
Communicated by H. Araki
Commun. Math. Phys. 244, 187–208 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0978-2
Communications in
Mathematical Physics
A Statistical Approach to the Asymptotic Behavior of a Class of Generalized Nonlinear Schrödinger Equations
Richard S. Ellis1, Richard Jordan2, Peter Otto1, Bruce Turkington1
1 Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003, USA. E-mail: [email protected]; [email protected]; [email protected]
2 Dynamics Technology, Inc., 1555 Wilson Boulevard, Suite 320, Arlington, VA 22209, USA. E-mail: [email protected]
Received: 4 April 2003 / Accepted: 23 July 2003
Published online: 13 November 2003. © Springer-Verlag 2003
Abstract: A statistical relaxation phenomenon is studied for a general class of dispersive wave equations of nonlinear Schrödinger-type which govern non-integrable, non-singular dynamics. In a bounded domain the solutions of these equations have been shown numerically to tend in the long-time limit toward a Gibbsian statistical equilibrium state consisting of a ground-state solitary wave on the large scales and Gaussian fluctuations on the small scales. The main result of the paper is a large deviation principle that expresses this concentration phenomenon precisely in the relevant continuum limit. The large deviation principle pertains to a process governed by a Gibbs ensemble that is canonical in energy and microcanonical in particle number. Some supporting Monte-Carlo simulations of these ensembles are also included to show the dependence of the concentration phenomenon on the properties of the dispersive wave equation, especially the high frequency growth of the dispersion relation. The large deviation principle for the process governed by the Gibbs ensemble is based on a large deviation principle for Gaussian processes, for which two independent proofs are given.
This research was supported in part by grants from the Department of Energy (DE-FG02-99ER25376) and from the National Science Foundation (NSF-DMS-0202309).
This research was partially supported by a Mathematical Sciences Postdoctoral Research Fellowship from the National Science Foundation.
This research was supported in part by grants from the Department of Energy (DE-FG02-99ER25376) and from the National Science Foundation (NSF-DMS-0207064).

1. Introduction

Many dynamical models of physical systems governed by nonlinear partial differential equations exhibit a typical long-time behavior in which coherent structures organize on the large spatial scales while turbulent fluctuations dominate the small scales [19]. Perhaps the most familiar setting for this behavior is two-dimensional or quasi-geostrophic fluid turbulence, where the coherent structures are large-scale steady motions, such as shear flows or vortices, and the turbulent background is a vorticity field that fluctuates on the small scales [12, 32]. The zonal jets and embedded vortical spots in the active weather layer of Jupiter are especially persistent and conspicuous examples of this phenomenon [13, 29]. These coherent structures have been shown to be realizable as the equilibrium states in a statistical model of the geophysical fluid dynamical system [8, 33].

Another physical system exhibiting this behavior is two-dimensional magnetohydrodynamics. Long-time numerical simulations of MHD turbulence show that the systems end in states in which the magnetic and velocity fields fluctuate on small scales around a steady mean state on the large scales [4]. Equilibrium statistical models have also been able to capture these coherent structures [20, 24].

This statistical relaxation phenomenon is shared by certain dispersive wave systems, for which the generic coherent structures are solitary waves that interact with a disorganized background of wave radiation [9, 10, 15, 34]. For instance, non-integrable, focusing, nonlinear Schrödinger equations in a bounded domain organize after a long evolution into a single solitary wave coupled with small-scale fluctuations [22, 23]. This self-organization behavior has been shown to be consistent with relaxation to a statistical equilibrium state, both qualitatively and quantitatively [25].

Motivated by this fundamental phenomenon exhibited by many complex systems, we devote the present paper to a detailed analysis of the equilibrium statistical behavior of a particular class of dispersive wave systems. Specifically, we consider a class of generalized nonlinear Schrödinger (GNLS) equations on a bounded domain $D$ in $\mathbb{R}^d$ with appropriate boundary conditions.
These systems govern the dynamics of a complex field $\psi(x,t)$, $x \in \mathbb{R}^d$, $t \in \mathbb{R}$, via the equation
\[ i\psi_t + L\psi + f(|\psi|^2)\psi = 0. \tag{1.1} \]
$L$ denotes an unbounded linear operator on the complex Hilbert space $L^2_c(\rho)$ of square-integrable functions on $D$ with respect to a measure $\rho$; $\langle \cdot, \cdot \rangle$ denotes the inner product on $L^2_c(\rho)$. It is assumed that $L$ is symmetric and that the spectrum of $-L$ consists of positive eigenvalues $\lambda_k$ satisfying $\sum_{k=1}^{\infty} 1/\lambda_k < \infty$. In addition, the corresponding eigenfunctions $e_k$ are assumed to be real functions that form an orthonormal basis of $L^2_c(\rho)$.

We choose to focus our analysis on these systems for two reasons. First, they are widely considered to be prototypes for dynamical systems that exhibit organization of coherent structures within turbulence, and accordingly there is a rich literature on their phenomenology [9, 10, 15, 34]. Second, they are simple enough to be amenable to a complete and rigorous analysis by the methods of equilibrium statistical mechanics.

In one space dimension ($d = 1$) the basic example of this class is $L = \partial^2/\partial x^2$ on $D = [0, \ell]$ with Lebesgue measure $\rho$ on $[0, \ell]$ and with homogeneous Dirichlet boundary conditions, where $\ell < \infty$. In this case, we refer to (1.1) as the basic NLS equation. Our analysis also applies to the operator $L = \partial^2/\partial x^2 + U(x)$, where $U(x)$ is a suitable potential, and to other boundary conditions such as $\ell$-periodic conditions. For the basic NLS equation and for this wider class of NLS equations in one dimension, the eigenvalues $\lambda_k$ grow like $k^2$ as $k \to \infty$.

In (1.1) we restrict our attention to smooth nonlinearities $f$ that satisfy
\[ f(0) = 0, \qquad \sup_{a \in [0,\infty)} |f(a)| < \infty ; \tag{1.2} \]
e.g., $f(|\psi|^2) = b|\psi|^2/(1 + |\psi|^2)$ with scale factor $b$. Nonlinearities with these properties arise in physical applications as large-amplitude corrections to the cubic NLS
equation, and they are referred to as bounded or saturated nonlinearities [31]. Our analysis applies to both the focusing GNLS, for which $f(a) > 0$, and the defocusing GNLS, for which $f(a) < 0$. Our main interest, however, is in the focusing case, since the formation of coherent solitary waves is the dominant mechanism in that case. The restriction to bounded nonlinearities excludes blow-up of solutions and the collapse of waves due to self-focusing. We impose this restriction because our goal is to analyze statistical equilibrium ensembles of regular solutions that model the long-time average behavior of the system. Accordingly, we choose GNLS equations for which solutions exist and are regular for all time.

An interesting generalization included in our analysis is to pseudo-differential operators $L$ whose eigenvalues $\lambda_k$ grow like $k^{\alpha}$ as $k \to \infty$ with $\alpha > 1$. The GNLS equation (1.1) then resembles the equation introduced by Majda, McLaughlin and Tabak (MMT) in their study of weak turbulence closure theories [28]. From the standpoint of equilibrium statistical mechanics, we are interested in how the phenomenon of concentration into a coherent structure depends on $\alpha$. In contrast to the MMT equations, we restrict our analysis to bounded nonlinearities in $\psi$ itself; the MMT equations pertain to homogeneous, cubic nonlinearities in $M\psi$, where $M$ is another pseudo-differential operator with eigenvalues that grow like $k^{-\sigma}$. While it would be possible to study a broader class of such equations, the class of GNLS equations (1.1) is sufficiently broad to exhibit the typical behavior of the statistical equilibrium states and to show how this behavior depends upon the linear frequencies of the dispersive wave system.

The object of our analysis is the statistical equilibrium description of the complex dynamical system (1.1) via classical Gibbsian statistics.
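The role of the growth exponent $\alpha$ can be checked numerically. The following is a minimal sketch, assuming the model spectrum $\lambda_k = k^{\alpha}$ exactly (the text only requires this growth asymptotically); the helper name `partial_sum` is hypothetical. It compares partial sums of $\sum_k 1/\lambda_k$, which level off for $\alpha > 1$ and keep growing for $\alpha = 1$.

```python
import math


def partial_sum(alpha, n_terms):
    """Partial sum of 1/lambda_k for the assumed model spectrum lambda_k = k**alpha."""
    return sum(1.0 / k ** alpha for k in range(1, n_terms + 1))


if __name__ == "__main__":
    for alpha in (2.0, 1.5, 1.0):
        sums = [partial_sum(alpha, n) for n in (10**3, 10**4, 10**5)]
        # For alpha > 1 the partial sums stabilize; for alpha = 1 they grow like log n.
        print(alpha, [round(s, 4) for s in sums])
    # Reference value for alpha = 2: the full series equals pi^2 / 6.
    print("pi^2/6 =", round(math.pi ** 2 / 6, 4))
```

For $\alpha = 2$ the partial sums approach $\pi^2/6$, while for $\alpha = 1$ each additional decade of modes adds roughly $\log 10$ to the sum.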
Our choice of distribution on phase space is a mixed Gibbs ensemble that is canonical with respect to the energy invariant and microcanonical with respect to the particle number invariant. For the GNLS equation (1.1), the Hamiltonian, or energy functional, is
\[ H(\psi) \doteq -\frac{1}{2}\langle L\psi, \psi \rangle - \frac{1}{2}\int_D F(|\psi|^2)\, d\rho, \tag{1.3} \]
where $F$ is related to the nonlinearity $f$ by $F(a) = \int_0^a f(s)\, ds$. The particle number, or wave action, is half the $L^2_c(\rho)$-norm squared:
\[ Q(\psi) \doteq \frac{1}{2}\int_D |\psi|^2\, d\rho. \]
The resulting statistical description rests on these two exact invariants of the GNLS equation (1.1), together with the conservation of phase volume under the Hamiltonian dynamics. In order to keep our development concise, we intentionally suppress the momentum invariant by breaking the $x$-translation invariance of the system. To this end, we consider (1.1) in a bounded domain $D$ with homogeneous boundary conditions $\psi = 0$ imposed on $\partial D$. Alternatively, we could consider an operator $L$ with a potential under periodic boundary conditions to obtain similar results.

While the statistical equilibrium theory of the NLS equation has been the focus of several analyses, including [1, 7, 27, 30, 35], our approach and our results differ fundamentally from those investigations. In particular, previous investigators have constructed Gibbs distributions that are Wiener-type measures having infinite mean energy. Our interest, on the other hand, centers on modeling the ensemble-average behavior of regular solutions to (1.1) from initial conditions having given, finite, mean energy $H(\psi^0) = E$ and given particle number $Q(\psi^0) = N$. Our motivation derives from numerical studies of
the underlying GNLS equation which show that from generic initial conditions, such as a field of waves emerging from a modulational instability, the dynamics approximately realize a Gibbs ensemble after a sufficient time [22, 23]. A spectral analysis of these numerical solutions identifies an approximate dimension $n = n(T)$ of the phase space that supports the Gibbs distribution after a long, but finite, time $T$. Moreover, a continuum limit is achieved as $T \to \infty$, in the sense that $n(T)$ goes to infinity at a definite rate with $T$. This observed behavior of regular solutions to (1.1) strongly suggests that the relevant continuum limit for a statistical equilibrium theory is the one obtained from the Gibbs states of the spectrally-truncated dynamics on $n$ eigenmodes with fixed mean energy $E$ and fixed particle number $N$ as $n \to \infty$. Accordingly, this limit is the focus of our analysis. While we establish rigorous results about these statistical equilibrium states, we do not address the theoretical problem of proving ergodicity of this dynamics. Rather we accept the ergodic hypothesis on the basis of convincing numerical evidence [22, 23].

Our results pertain to a continuum limit $n \to \infty$ of a sequence of Gibbs distributions on $n$-dimensional phase spaces corresponding to spectrally-truncated, Hamiltonian dynamics having a finite number of degrees of freedom $n$. As has been noted elsewhere [25, 27], these Gibbs ensembles are necessarily microcanonical in $Q$, since a Gibbs canonical ensemble with respect to both $H$ and $Q$ can be divergent for a focusing nonlinearity. We therefore use a mixed ensemble in which the microcanonical condition $Q = N$ is imposed on the canonical distribution in $H$ with an inverse temperature $\beta$ that is rescaled by $n$ so that the mean energy, $\langle H \rangle = E$, remains finite as $n \to \infty$. The study of the limiting behavior of these mixed ensembles is perfectly suited to analysis by large deviation techniques.
Our main result is a large deviation principle demonstrating that the ground-state solitary waves are the most probable macroscopic states in the relevant continuum limit, in the sense that the mixed ensembles concentrate on these ground states in the $L^2_c(\rho)$-norm. This large deviation principle may be considered as a mathematically rigorous statement of the explanation of the observed formation of large-scale coherent structures within small-scale wave turbulence given in [25], where an asymptotically exact mean-field theory was developed and compared with direct numerical simulations.

The outline of the paper is as follows. In Sect. 2, we construct the statistical equilibrium model based on a spectral truncation of the GNLS dynamics and motivate the mixed ensemble based upon the invariants $H$ and $Q$. In Sect. 3 we state the main theorem, a large deviation principle for a sequence of finite-dimensional fields with respect to the mixed ensemble introduced in Sect. 2. The main theorem is proved in Sect. 4 using a basic large deviation theorem for Gaussian processes, for which two independent proofs are given. Both of these proofs require that the linear frequencies $\lambda_k$ satisfy $\sum_{k=1}^{\infty} 1/\lambda_k < \infty$ (see Cond. 2.1). Finally, in Sect. 5 we display the results of some Monte-Carlo simulations of the mixed ensemble in one space dimension. Besides demonstrating the concentration phenomenon numerically when $\sum_{k=1}^{\infty} 1/\lambda_k < \infty$, these simulations exhibit the change in behavior when this growth condition does not hold.
2. Statistical Equilibrium Description of GNLS Dynamics

The GNLS equation (1.1) is considered on a bounded domain $D$ in $\mathbb{R}^d$. The nonlinearity is bounded, in the sense that $f$ satisfies the conditions (1.2). The operator $L$ defining the linear part of the GNLS equation is assumed to satisfy the following condition:
Condition 2.1. $L$ is a symmetric operator on $L^2_c(\rho)$. The spectrum of $-L$ consists of positive eigenvalues $\lambda_k$ satisfying $\sum_{k=1}^{\infty} 1/\lambda_k < \infty$. The corresponding eigenfunctions $e_k$ are real functions that form an orthonormal basis of $L^2_c(\rho)$.

A number of important examples underlie the general theory.

Example 2.2.
(a) The basic example is $L = \partial^2/\partial x^2$ on $D = [0, \ell]$ with homogeneous Dirichlet boundary conditions and $\rho$ Lebesgue measure on $[0, \ell]$, where $\ell < \infty$. In this case, for each $k \in \mathbb{N}$, $\lambda_k = (k\pi/\ell)^2$ and $e_k = \sqrt{2/\ell}\, \sin(k\pi x/\ell)$.
(b) Let $p$ be a $C^2$ function on $D = [0, \ell]$ satisfying $\inf_{x \in [0,\ell]} p(x) > 0$, $q$ a negative continuous function on $[0, \ell]$, and $\rho$ Lebesgue measure on $[0, \ell]$. For $\xi \in L^2_c(\rho)$ we define
\[ L\xi \doteq \frac{d}{dx}\left( p(x) \frac{d\xi}{dx} \right) - q(x)\xi(x) \]
with homogeneous Dirichlet boundary conditions. By standard Sturm-Liouville theory, $L$ satisfies Condition 2.1. As in the basic example given in part (a), the eigenvalues $\lambda_k$ of $L$ grow like $k^2$ as $k \to \infty$ [3, Thm. 10.9].
(c) Let $D$ be any bounded domain in $\mathbb{R}^d$, $\rho$ any measure on $D$, $\{\lambda_k\}_{k=1}^{\infty}$ any positive sequence satisfying $\sum_{k=1}^{\infty} 1/\lambda_k < \infty$, and $\{e_k\}_{k=1}^{\infty}$ any real orthonormal basis of $L^2_c(\rho)$. We denote by $\langle \cdot, \cdot \rangle$ the inner product on $L^2_c(\rho)$. The operator $L$ defined for any $\xi \in L^2_c(\rho)$ by
\[ L\xi \doteq -\sum_{k=1}^{\infty} \lambda_k \langle \xi, e_k \rangle e_k \]
satisfies Condition 2.1. Such operators $L$ include a class of pseudodifferential operators that arise in weak turbulence theory [28], for which the boundary conditions are periodic, the Fourier basis functions $e_k$ are trigonometric, and eigenvalues $\lambda_k$ are powers $k^{\alpha}$. Condition 2.1 limits the power to $\alpha > 1$.

We proceed with the definition of the probabilistic model, which pertains to a sequence of finite-dimensional approximations to the GNLS equation (1.1). For these approximations we choose a spectral truncation of the GNLS-dynamics [7, 25, 35]. The same ideas can be applied to other discrete approximations such as finite-difference [26]. With respect to the basis $\{e_k\}$ of $L^2_c(\rho)$, let $W_n \doteq \mathrm{span}\{e_1, \ldots, e_n\}$ be the $n$-dimensional subspace consisting of functions
\[ \psi^{(n)}(x) = u^{(n)}(x) + i v^{(n)}(x) \doteq \sum_{k=1}^{n} \psi_k e_k(x), \tag{2.1} \]
with arbitrary complex coefficients $\psi_k = u_k + i v_k$. For each fixed $n$, the field $\psi^{(n)}$ takes values in $L^2_c(\rho)$ and corresponds to an $n$-dimensional microstate for the model; that is, a point $\psi \doteq (\psi_1, \ldots, \psi_n)$ in the phase space $\Omega_n \doteq \mathbb{C}^n$ or equivalently $\Omega_n = \mathbb{R}^{2n}$. The microscopic dynamics for this model is governed by
\[ i\psi^{(n)}_t + L\psi^{(n)} + P^{(n)}\bigl( f(|\psi^{(n)}|^2)\psi^{(n)} \bigr) = 0, \tag{2.2} \]
where $P^{(n)}$ denotes the orthogonal projection that maps $L^2_c(\rho)$ onto $W_n$. This spectral truncation of the GNLS equation (1.1) is equivalent to a system of ordinary differential
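equations for the real Fourier coefficients $u_k$ and $v_k$, $k = 1, \ldots, n$, having a canonical Hamiltonian form; namely,
\[ \frac{du_k}{dt} = \frac{\partial H_n}{\partial v_k} = \lambda_k v_k - \int_D f\bigl((u^{(n)})^2 + (v^{(n)})^2\bigr)\, v^{(n)} e_k\, d\rho, \]
\[ \frac{dv_k}{dt} = -\frac{\partial H_n}{\partial u_k} = -\lambda_k u_k + \int_D f\bigl((u^{(n)})^2 + (v^{(n)})^2\bigr)\, u^{(n)} e_k\, d\rho \]
with Hamiltonian
\[ H_n(\psi) = H_n(u_1, v_1, \ldots, u_n, v_n) \doteq \frac{1}{2}\sum_{k=1}^{n} \lambda_k |\psi_k|^2 - \frac{1}{2}\int_D F(|\psi^{(n)}|^2)\, d\rho = -\frac{1}{2}\langle L\psi^{(n)}, \psi^{(n)} \rangle - \frac{1}{2}\int_D F(|\psi^{(n)}|^2)\, d\rho \equiv D_n(\psi) + \Theta_n(\psi). \tag{2.3} \]
The functions $D_n$ and $\Theta_n$ are defined by this display, and
\[ \Theta(\psi) \doteq -\frac{1}{2}\int_D F(|\psi|^2)\, d\rho; \tag{2.4} \]
hence for $\psi \in W_n$, $\Theta_n(\psi) = \Theta(\psi^{(n)})$. Clearly, $H_n(\psi)$ equals $H(\psi^{(n)})$, the restriction to $W_n$ of the functional $H$ defined in (1.3). The spectrally truncated particle number, $Q_n(\psi) \doteq Q(\psi^{(n)})$, is also an invariant of the microscopic dynamics (2.2) and is given by
\[ Q_n(\psi) = Q_n(u_1, v_1, \ldots, u_n, v_n) \doteq \frac{1}{2}\int_D |\psi^{(n)}|^2\, d\rho = \frac{1}{2}\sum_{k=1}^{n} |\psi_k|^2. \tag{2.5} \]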
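The truncated invariants can be evaluated directly for the basic example of Example 2.2(a). The sketch below is illustrative only: it assumes $\ell = 1$, the saturated nonlinearity $f(a) = ba/(1+a)$ with antiderivative $F(a) = b\,(a - \log(1+a))$, and a composite midpoint rule for the integrals over $D$. All function names (`e`, `lam`, `field`, `Q_n`, `H_n`, `integrate`) are hypothetical helpers, not notation from the paper.

```python
import math

ELL = 1.0  # interval length, an illustrative choice


def e(k, x):
    """Dirichlet eigenfunction e_k(x) = sqrt(2/ell) * sin(k*pi*x/ell)."""
    return math.sqrt(2.0 / ELL) * math.sin(k * math.pi * x / ELL)


def lam(k):
    """Eigenvalue lambda_k = (k*pi/ell)**2 of -d^2/dx^2 with Dirichlet conditions."""
    return (k * math.pi / ELL) ** 2


def F(a, b=1.0):
    """Antiderivative of the saturated nonlinearity f(a) = b*a/(1+a), with F(0) = 0."""
    return b * (a - math.log1p(a))


def integrate(g, m=2000):
    """Composite midpoint rule for the integral of g over [0, ELL]."""
    h = ELL / m
    return h * sum(g((j + 0.5) * h) for j in range(m))


def field(coeffs, x):
    """psi^(n)(x) = sum_k psi_k e_k(x) for complex coefficients psi_1..psi_n."""
    return sum(c * e(k, x) for k, c in enumerate(coeffs, start=1))


def Q_n(coeffs):
    """Truncated particle number, right-hand side of (2.5)."""
    return 0.5 * sum(abs(c) ** 2 for c in coeffs)


def H_n(coeffs, b=1.0):
    """Truncated Hamiltonian (2.3): quadratic part D_n plus nonlinear part Theta_n."""
    d_n = 0.5 * sum(lam(k) * abs(c) ** 2 for k, c in enumerate(coeffs, start=1))
    theta_n = -0.5 * integrate(lambda x: F(abs(field(coeffs, x)) ** 2, b))
    return d_n + theta_n


if __name__ == "__main__":
    coeffs = [1 + 0.5j, -0.3j, 0.2]
    # Both sides of (2.5): coefficient sum versus integral of |psi^(n)|^2 / 2.
    print(Q_n(coeffs), 0.5 * integrate(lambda x: abs(field(coeffs, x)) ** 2))
    print(H_n(coeffs))
```

The two printed values of the particle number agree because the $e_k$ are orthonormal, which is the content of the second equality in (2.5).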
We define the statistical equilibrium model by a Gibbs ensemble on the $2n$-dimensional phase space $\Omega_n$, in which $H_n$ is treated canonically and $Q_n$ is treated microcanonically. We refer to this ensemble as the mixed ensemble, and we denote it by $P^N_{n,\beta}(d\psi)$, where $N \in [0, \infty)$ is a given particle number and $\beta > 0$ is a given inverse temperature. Formally, the mixed ensemble is the probability distribution
\[ P^N_{n,\beta}(d\psi) \doteq \frac{1}{Z_n(\beta, N)} \exp(-\beta H_n(\psi))\, \delta(Q_n(\psi) - N)\, V_n(d\psi), \tag{2.6} \]
where $Z_n(\beta, N)$ is the normalizing constant
\[ Z_n(\beta, N) \doteq \int_{\{Q_n = N\}} \exp(-\beta H_n(\psi))\, ds(\psi). \]
Here $V_n(d\psi) \doteq \prod_{k=1}^{n} du_k\, dv_k$ is the phase volume on $\Omega_n$, and $ds(\psi)$ is the hypersurface area on the sphere $Q_n(\psi) = N$, which is the support of the distribution (2.6).

This choice of ensemble can be motivated intuitively from the known dynamical behavior of numerical solutions to (1.1). Long-time simulations of the dynamics show that, while the energy $H$ is sensitive to the fluctuations that develop on the small scales,
the particle number $Q$ depends on the coherent structure on the large scale. This phenomenon is related to the phenomenological description of weak turbulence in which there is a flux of energy to small scales and of particle number to large scales. Physical reasoning then suggests that the appropriate ensemble be canonical in $H$, since energy is in contact with a bath of turbulent small-scale waves, and that it be microcanonical in $Q$, since the particle number is contained in the coherent large-scale waves which are isolated from the turbulent bath.

Let us define this mixed ensemble precisely as a conditional probability measure. We return to the decomposition $H_n(\psi) = D_n(\psi) + \Theta_n(\psi)$ given in (2.3). An easy calculation given in part (a) of Proposition 4.3 shows that for any bounded nonlinearity $f$, $\sigma > 0$ can be chosen sufficiently large so that $\Theta(\xi) + \sigma Q(\xi) \geq 0$ for all $\xi \in L^2_c(\rho)$. It follows that for all $n \in \mathbb{N}$ and $\psi \in W_n$,
\[ \Theta_n(\psi) + \sigma Q_n(\psi) = \Theta(\psi) + \sigma Q(\psi) \geq 0. \]
Since $D_n(\psi) = -\frac{1}{2}\langle L\psi^{(n)}, \psi^{(n)} \rangle \geq 0$, we have for all $\psi \in W_n$
\[ H_n(\psi) + \sigma Q_n(\psi) = D_n(\psi) + \Theta_n(\psi) + \sigma Q_n(\psi) \geq 0. \]
It is worth noting that such a $\sigma$ also exists for a wider class of nonlinearities $f$; e.g., unbounded, but subcritical nonlinearities. We then construct the following $\sigma$-regularized canonical measure:
\[ P_{n,\beta}(d\psi) \doteq \frac{1}{Z_n(\beta)} \exp(-\beta[H_n(\psi) + \sigma Q_n(\psi)])\, V_n(d\psi), \tag{2.7} \]
which exists and is normalizable. The normalizing constant $Z_n(\beta)$ is given by
\[ Z_n(\beta) \doteq \int_{\Omega_n} \exp(-\beta[H_n(\psi) + \sigma Q_n(\psi)])\, V_n(d\psi). \]
By contrast, when $\sigma = 0$, it is known that $Z_n(\beta)$ diverges for certain focusing nonlinearities since $H_n$ goes to $-\infty$ in some directions of the phase space $\Omega_n$ [25, 27]. Ideally, we would like to define the mixed ensemble to be
\[ P^N_{n,\beta}(d\psi) \doteq P_{n,\beta}(\, d\psi \mid Q_n(\psi) = N\,); \tag{2.8} \]
namely, a regular conditional distribution given the microcanonical constraint $Q_n = N$. In this formulation, the mixed ensemble (2.8) is independent of the choice of $\sigma$ and coincides with the formal expression (2.6). In order to avoid technicalities involving regular conditional distributions, we will consider, in place of (2.8), the conditional measure
\[ P^{N,\varepsilon}_{n,\beta}(d\psi) \doteq P_{n,\beta}(\, d\psi \mid Q_n(\psi) \in [N - \varepsilon, N + \varepsilon]\,), \tag{2.9} \]
where $\varepsilon$ is a positive parameter defining the thickened shell $[N - \varepsilon, N + \varepsilon]$. For suitable values of $N$, all sufficiently large $n$, and all $\varepsilon > 0$, $P_{n,\beta}\{Q_n(\psi) \in [N - \varepsilon, N + \varepsilon]\} > 0$, and so the conditional probability $P^{N,\varepsilon}_{n,\beta}(d\psi)$ is well defined [see (4.11)].

The main theorem in this paper, stated in Theorem 3.1, is the large deviation principle on $L^2_c(\rho)$ for $\psi^{(n)}$ with respect to $P^{N,\varepsilon}_{n,n\beta}$ in the continuum limit $n \to \infty$ followed by $\varepsilon \to 0$; $N$ is kept fixed while $\beta$ has been replaced by the mean-field scaling $n\beta$. With this scaling the ensemble mean energy $\langle H_n \rangle$ tends to a finite limit $E$. In contrast to the definition (2.8), the conditional measures $P^{N,\varepsilon}_{n,n\beta}$ are no longer independent of $\sigma$ because of the presence of $\varepsilon$ in the definition (2.9). However, the rate function in Theorem 3.1 is independent of $\sigma$.
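For a concrete bounded nonlinearity, one sufficient choice of $\sigma$ can be seen pointwise. Since $|f| \leq M$ implies $F(a) \leq Ma$ for $a \geq 0$, taking $\sigma \geq M$ makes the integrand $\frac{1}{2}[\sigma a - F(a)]$ of $\Theta + \sigma Q$ nonnegative. The sketch below checks this for the example $f(a) = ba/(1+a)$, where $M = |b|$. This is an illustration under that assumption and not necessarily the argument used in Proposition 4.3; the name `shell_integrand` is hypothetical.

```python
import math


def F(a, b):
    """Antiderivative of f(a) = b*a/(1+a) with F(0) = 0."""
    return b * (a - math.log1p(a))


def shell_integrand(a, sigma, b):
    """Pointwise integrand of Theta + sigma*Q, evaluated at a = |xi(x)|^2 >= 0."""
    return 0.5 * (sigma * a - F(a, b))


if __name__ == "__main__":
    b = 2.0
    grid = [0.0] + [10 ** (p / 4.0) for p in range(-20, 25)]
    # With sigma = |b| = sup|f| the integrand stays nonnegative on the whole grid.
    print(min(shell_integrand(a, sigma=abs(b), b=b) for a in grid))
    # With a smaller sigma it becomes negative for large amplitudes.
    print(shell_integrand(100.0, sigma=0.5 * b, b=b))
```

The second printed value is negative, which shows that the bound can fail when $\sigma$ is taken too small for a focusing nonlinearity.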
3. Statement of Main Theorem: LDP for Mixed Ensemble

Earlier investigations in [22, 25] give theoretical and numerical evidence that, for the basic NLS equation in which $L = \partial^2/\partial x^2$ and $\rho$ is Lebesgue measure on $[0, \ell]$, the random field $\psi^{(n)}$ defined in (2.1) concentrates on the set of ground states $e^{i\theta}\varphi(x)$ of the NLS equation in the continuum limit $n \to \infty$ with fixed $E$ and $N$. In [25] a mean-field approximation is developed to explain this phenomenon, and long-time numerical simulations of the freely-evolving dynamics support the theory [22]. Alternatively, this phenomenon of concentration on the set of ground states can be demonstrated by implementing Monte-Carlo simulations of the mixed Gibbs ensemble in the continuum limit; this approach is used in Sect. 5 of the present paper. Our main goal in the present paper is to formulate and prove a large deviation principle that holds for $\psi^{(n)}$ with respect to the mixed ensemble $P^{N,\varepsilon}_{n,n\beta}$ defined in (2.9) and that is valid for general operators $L$ satisfying Condition 2.1. This large deviation principle constitutes a precise and rigorous statement of the concentration phenomenon that occurs in the continuum limit.

We start with two definitions. Let $\mathcal{X}$ be a Hilbert space, $J$ a function mapping $\mathcal{X}$ into $[0, \infty]$, $\{\mu_n, n \in \mathbb{N}\}$ a sequence of probability measures on $\mathcal{X}$, $\{\mu^{\varepsilon}_n, n \in \mathbb{N}, \varepsilon > 0\}$ a family of probability measures on $\mathcal{X}$, and $a_n$ a positive sequence tending to $\infty$. $J$ is called a rate function if for each $M < \infty$ the set $\{\xi \in \mathcal{X} : J(\xi) \leq M\}$ is compact. For $A$ a subset of $\mathcal{X}$ we write $J(A)$ for $\inf\{J(\xi) : \xi \in A\}$. The sequence $\mu_n$ is said to satisfy a large deviation principle (LDP) on $\mathcal{X}$ with the scaling constants $a_n$ and the rate function $J$ if for each closed subset $F$ of $\mathcal{X}$
\[ \limsup_{n \to \infty} \frac{1}{a_n} \log \mu_n\{F\} \leq -J(F) \]
and for each open subset $G$ of $\mathcal{X}$
\[ \liminf_{n \to \infty} \frac{1}{a_n} \log \mu_n\{G\} \geq -J(G). \]
Similarly, as $n \to \infty$ and $\varepsilon \to 0$, the collection $\mu^{\varepsilon}_n$ is said to satisfy an LDP on $\mathcal{X}$ with the scaling constants $a_n$ and the rate function $J$ if for each closed subset $F$ of $\mathcal{X}$
\[ \limsup_{\varepsilon \to 0}\, \limsup_{n \to \infty} \frac{1}{a_n} \log \mu^{\varepsilon}_n\{F\} \leq -J(F) \]
and for each open subset $G$ of $\mathcal{X}$
\[ \liminf_{\varepsilon \to 0}\, \liminf_{n \to \infty} \frac{1}{a_n} \log \mu^{\varepsilon}_n\{G\} \geq -J(G). \]
The main result in this paper is the LDP stated in Theorem 3.1. We first indicate the form of the rate function. For $\xi = \sum_{k=1}^{\infty} \langle \xi, e_k \rangle e_k \in L^2_c(\rho)$, the Hamiltonian introduced in (1.3) can be written as $H(\xi) = D(\xi) + \Theta(\xi)$, where $\Theta(\xi)$ is defined in (2.4), and
\[ D(\xi) \doteq \frac{1}{2}\sum_{k=1}^{\infty} \lambda_k |\langle \xi, e_k \rangle|^2. \]
In terms of the square root of the positive, symmetric operator $-L$,
\[ D(\xi) = \begin{cases} \frac{1}{2}\|\sqrt{-L}\,\xi\|^2 & \text{if } \xi \in \mathrm{dom}(\sqrt{-L}) \\ \infty & \text{if } \xi \in L^2_c(\rho) \setminus \mathrm{dom}(\sqrt{-L}), \end{cases} \]
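A standard one-dimensional illustration of this definition, not taken from the paper, is the sample mean of $n$ independent standard Gaussian variables. Its distribution is $N(0, 1/n)$, and it satisfies an LDP on $\mathbb{R}$ with scaling constants $a_n = n$ and rate function $J(x) = x^2/2$. The sketch below evaluates $\frac{1}{n}\log \mu_n\{[a, \infty)\}$ from the exact Gaussian tail and compares it with $-J([a,\infty)) = -a^2/2$; the function names are hypothetical.

```python
import math


def log_tail_prob(n, a):
    """log P(mean of n iid N(0,1) variables >= a), from the exact Gaussian tail."""
    # The mean is N(0, 1/n), so P(mean >= a) = 0.5 * erfc(a * sqrt(n / 2)).
    return math.log(0.5 * math.erfc(a * math.sqrt(n / 2.0)))


def rate(x):
    """Rate function J(x) = x^2 / 2 of the Gaussian sample mean."""
    return 0.5 * x * x


if __name__ == "__main__":
    a = 1.0
    for n in (10, 100, 1000):
        # The scaled log-probability approaches -J(a) as n grows.
        print(n, log_tail_prob(n, a) / n, -rate(a))
```

The gap between the two printed columns shrinks like $(\log n)/n$, which is the sub-exponential correction that the LDP ignores.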
where
\[ \sqrt{-L}\,\xi \doteq \sum_{k=1}^{\infty} \sqrt{\lambda_k}\, \langle \xi, e_k \rangle e_k. \]
For $N \in [0, \infty)$ we also introduce
\[ \bar{E}(N) \doteq \inf\{H(\xi) : \xi \in L^2_c(\rho),\ Q(\xi) = N\}. \tag{3.1} \]
We call $\bar{E}$ the coherent energy function since $\bar{E}(N)$ is the energy of the coherent structure with particle number $N$. The function $\bar{E}$ is lower semicontinuous and bounded below; indeed, by (4.12) $\bar{E}$ differs by a constant from the function $\tilde{E}$ defined in (4.10), which by part (a) of Proposition 4.5 is nonnegative and lower semicontinuous. For $N \in [0, \infty)$ and $\xi \in L^2_c(\rho)$, the rate function in Theorem 3.1 is defined to be
\[ J^N(\xi) \doteq \begin{cases} H(\xi) - \bar{E}(N) & \text{if } Q(\xi) = N \\ \infty & \text{otherwise.} \end{cases} \tag{3.2} \]
Theorem 3.1 will be proved in the next section.

Theorem 3.1. The $L^2_c(\rho)$-valued process $\psi^{(n)}$ is defined in (2.1) and the mixed ensemble $P^{N,\varepsilon}_{n,n\beta}$ in (2.9). We fix $\beta > 0$, take $N \in [0, \infty)$, and assume Condition 2.1. Then as $n \to \infty$ and $\varepsilon \to 0$, the $P^{N,\varepsilon}_{n,n\beta}$-distributions of $\psi^{(n)}$ satisfy the LDP on $L^2_c(\rho)$ with the scaling constants $n\beta$ and the rate function $J^N$ defined in (3.2).

Heuristically, the LDP means that the elements $\bar{\xi} \in L^2_c(\rho)$ that minimize $H$ subject to the constraint $Q = N$ are the overwhelmingly most probable states with respect to the mixed ensemble in the continuum limit $n \to \infty$ followed by $\varepsilon \to 0$. This set of constrained minimizers is the set of equilibrium macrostates or ground states; we denote it by $\mathcal{E}^N$. For an equilibrium macrostate $\bar{\xi}$ we have $J^N(\bar{\xi}) = 0$, while for any $\xi \in L^2_c(\rho)$ that is not an equilibrium macrostate we have $J^N(\xi) > 0$. We now consider, for any $r > 0$, the complement of an $r$-neighborhood of the equilibrium set $\mathcal{E}^N$ and define
\[ j(r) \doteq \inf\{J^N(\xi) : \mathrm{dist}(\xi, \mathcal{E}^N) \geq r > 0\}, \]
the distance being taken in the $L^2_c(\rho)$-norm. Then $j(r) > 0$. From the large deviation upper bound in Theorem 3.1, we infer that
\[ P^{N,\varepsilon}_{n,n\beta}\{\mathrm{dist}(\psi^{(n)}, \mathcal{E}^N) \geq r > 0\} \leq e^{-n\beta j(r)/2} \to 0 \quad \text{as } n \to \infty,\ \varepsilon \to 0. \]
Thus any set of $\xi \in L^2_c(\rho)$ that lies a positive distance from the equilibrium set $\mathcal{E}^N$ has an exponentially small probability of being observed for sufficiently large $n$ and sufficiently small $\varepsilon > 0$. This property of $\mathcal{E}^N$ justifies calling it the set of equilibrium macrostates.

An LDP can be viewed as an exponential-order refinement of the law of large numbers [11, 16]. From this viewpoint, we might expect the random field $\psi^{(n)}(x)$ to satisfy an LDP in the continuum limit because it is the sum of component fields $\psi_k e_k(x)$ that are asymptotically independent. In essence, this insight is the basis for the mean-field approximation used in [25], which relies on the smallness of the fluctuations of $\psi^{(n)}(x)$ in the $L^2_c(\rho)$-norm. As we will see in the next section, the proof of the LDP for $\psi^{(n)}$ depends crucially on the continuity of the functionals $Q$ and $\Theta$ with respect to the $L^2_c(\rho)$-topology [Prop. 4.3]. These properties that are needed to prove the LDP are intimately related to the properties used to derive the mean-field theory.
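The rate function (3.2) can be written out explicitly in the degenerate linear case $f \equiv 0$, which satisfies (1.2) but is chosen here purely for illustration. Then $H = D$, the constrained infimum in (3.1) is attained on the first eigenmode, and $\bar{E}(N) = \lambda_1 N$. The sketch below assumes the Dirichlet spectrum $\lambda_k = (k\pi/\ell)^2$ of Example 2.2(a) and fields with finitely many nonzero coefficients; `rescale_to_shell` and `rate_linear` are hypothetical helper names.

```python
import math


def lam(k, ell=1.0):
    """Dirichlet eigenvalue lambda_k = (k*pi/ell)**2."""
    return (k * math.pi / ell) ** 2


def particle_number(coeffs):
    """Q = (1/2) * sum_k |xi_k|^2 for coefficients xi_k = <xi, e_k>."""
    return 0.5 * sum(abs(c) ** 2 for c in coeffs)


def rescale_to_shell(coeffs, N):
    """Rescale the coefficients so that the field satisfies Q = N."""
    s = math.sqrt(N / particle_number(coeffs))
    return [s * c for c in coeffs]


def rate_linear(coeffs, N, ell=1.0, tol=1e-9):
    """J^N for f = 0: D(xi) - lambda_1 * N on the shell Q = N, infinity off it."""
    if abs(particle_number(coeffs) - N) > tol:
        return math.inf
    d = 0.5 * sum(lam(k, ell) * abs(c) ** 2 for k, c in enumerate(coeffs, start=1))
    return d - lam(1, ell) * N


if __name__ == "__main__":
    N = 2.0
    ground = rescale_to_shell([1j], N)         # first mode only, arbitrary phase
    mixed = rescale_to_shell([1.0, 1.0], N)    # equal weight on modes 1 and 2
    print(rate_linear(ground, N))              # zero up to rounding
    print(rate_linear(mixed, N), 3 * math.pi ** 2)
```

Any field that puts weight on higher modes pays a strictly positive cost, in line with the statement that $J^N$ vanishes only on the set of equilibrium macrostates.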
4. Proof of Theorem 3.1

Given $\beta > 0$ we introduce the following Gaussian measures on the phase space $\Omega_n = \mathbb{R}^{2n}$:
\[ G_{n,\beta}(d\psi) \doteq \frac{1}{C_{n,\beta}} \exp(-\beta D_n(\psi))\, V_n(d\psi), \tag{4.1} \]
where $C_{n,\beta}$ is the normalizing constant
\[ C_{n,\beta} \doteq \int_{\mathbb{R}^{2n}} \exp(-\beta D_n(\psi))\, V_n(d\psi). \]
The proof of Theorem 3.1 is based on the LDP of $\psi^{(n)}$ with respect to the measures $G_{n,n\beta}$, where $\beta$ in (4.1) has been replaced by $n\beta$. The motivation for introducing these measures is that the canonical measures $P_{n,n\beta}$ can be expressed in terms of $G_{n,n\beta}$ [see (4.8)], and hence the LDP for $P_{n,n\beta}$ follows directly from that for $G_{n,n\beta}$ [Thm. 4.1]. In turn, the LDP for the $P^{N,\varepsilon}_{n,n\beta}$-distributions of $\psi^{(n)}$ stated in Theorem 3.1 is derived from the LDP for the $P_{n,n\beta}$-distributions of $\psi^{(n)}$.

The LDP for the $G_{n,n\beta}$-distributions of $\psi^{(n)}$ is stated in the next theorem and is proved using a corollary of Baldi's Theorem stated in [11, Cor. 4.5.27]. After giving this proof, we sketch a second proof using an LDP for Gaussian processes proved by Bolthausen [5]. The next theorem states the LDP on the complex Hilbert space of square-integrable functions $L^2_c(\rho)$; any $\xi \in L^2_c(\rho)$ can be written as $\xi^1 + i\xi^2$, where $\xi^1$ and $\xi^2$ are elements of the corresponding real Hilbert space $L^2(\rho)$. Since both proofs of Theorem 4.1 are based on results formulated for real spaces, we will prove an equivalent LDP replacing $L^2_c(\rho)$ by the topologically equivalent Hilbert space $L^2(\rho) \times L^2(\rho)$; this equivalence is defined by the correspondence $\xi = \xi^1 + i\xi^2 \in L^2_c(\rho) \leftrightarrow (\xi^1, \xi^2) \in L^2(\rho) \times L^2(\rho)$.

Theorem 4.1. The $L^2_c(\rho)$-valued process $\psi^{(n)}$ is defined in (2.1) and the Gaussian measures $G_{n,\beta}$ in (4.1). We fix $\beta > 0$ and assume Condition 2.1. Then as $n \to \infty$, the $G_{n,n\beta}$-distributions of $\psi^{(n)}$ satisfy the LDP on $L^2_c(\rho)$ with the scaling constants $n\beta$ and the rate function
\[ I(\xi) = I(\xi^1 + i\xi^2) \doteq \frac{1}{2}\sum_{k=1}^{\infty} \lambda_k |\xi^1_k + i\xi^2_k|^2 = \frac{1}{2}\sum_{k=1}^{\infty} \lambda_k \bigl[ (\xi^1_k)^2 + (\xi^2_k)^2 \bigr], \tag{4.2} \]
where $\xi^1_k + i\xi^2_k \doteq \langle \xi^1, e_k \rangle + i\langle \xi^2, e_k \rangle$. Alternatively, the rate function $I(\xi)$ equals
\[ D(\xi) = \begin{cases} \frac{1}{2}\|\sqrt{-L}\,\xi\|^2 & \text{if } \xi \in \mathrm{dom}(\sqrt{-L}) \\ \infty & \text{if } \xi \in L^2_c(\rho) \setminus \mathrm{dom}(\sqrt{-L}). \end{cases} \tag{4.3} \]

Proof. The equality of the quantities defined in (4.2) and in (4.3) is immediate. The function $\psi^{(n)}$ defined in (2.1) can be written as
\[ \psi^{(n)} = \sum_{k=1}^{n} u_k e_k(x) + i \sum_{k=1}^{n} v_k e_k(x) \equiv \psi^{(n),1} + i\psi^{(n),2}. \]
. . Setting ψk1 = uk and ψk2 = vk , we have ψ (n),1 = nk=1 ψk1 ek and ψ (n),2 = nk=1 ψk2 ek . Because L2c (ρ) and L2 (ρ) × L2 (ρ) are topologically equivalent, proving an LDP for the Gn,nβ -distributions of ψ (n) on L2c (ρ) is equivalent to proving an LDP for the Gn,nβ -distributions of (ψ (n),1 , ψ (n),2 ) on L2 (ρ) × L2 (ρ). The inner product on L2 (ρ) × L2 (ρ) is ∞
. (ξ 1 , ξ 2 ), (θ 1 , θ 2 ) = ξ 1 , θ 1 + ξ 2 , θ 2 = ξkα θkα , 2
α=1 k=1
. . where for α = 1, 2 ξkα = ξ α , ek and θkα = θ α , ek . We begin the proof by computing, for ϕ ∈ L2 (ρ) × L2 (ρ),
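\[
\begin{aligned}
c(\varphi) &\doteq \lim_{n\to\infty} \frac{1}{n\beta} \log \int_{\mathbb{R}^n \times \mathbb{R}^n} \exp\bigl[n\beta\,\langle \varphi, \psi^{(n)}\rangle\bigr]\, G_{n,n\beta}(d\psi) \\
&= \lim_{n\to\infty} \frac{1}{n\beta} \log \frac{1}{C_{n,n\beta}} \prod_{\alpha=1}^{2}\prod_{k=1}^{n} \int_{\mathbb{R}} d\psi_k^\alpha\, \exp\Bigl[n\beta\Bigl(\varphi_k^\alpha \psi_k^\alpha - \frac{1}{2}\lambda_k (\psi_k^\alpha)^2\Bigr)\Bigr] \\
&= \lim_{n\to\infty} \frac{1}{n\beta} \log \prod_{k=1}^{n} \exp\Bigl[\frac{n\beta}{2}\,\frac{(\varphi_k^1)^2 + (\varphi_k^2)^2}{\lambda_k}\Bigr] \\
&= \frac{1}{2}\sum_{k=1}^{\infty} \frac{(\varphi_k^1)^2 + (\varphi_k^2)^2}{\lambda_k}.
\end{aligned} \tag{4.4}
\]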
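The third equality in (4.4) rests on a one-dimensional Gaussian integral. As a sketch of that step, write $a = n\beta$ (a shorthand introduced here only for brevity), suppress the indices on $\varphi_k^\alpha$, $\psi_k^\alpha$, and $\lambda_k$, and assume that the normalizing constant $C_{n,n\beta}$ factorizes into the one-dimensional integrals appearing in the denominator below. Completing the square then gives:

```latex
\[
a\Bigl(\varphi\psi - \tfrac{1}{2}\lambda\psi^2\Bigr)
  = -\frac{a\lambda}{2}\Bigl(\psi - \frac{\varphi}{\lambda}\Bigr)^2 + \frac{a\varphi^2}{2\lambda},
\qquad\text{so that}\qquad
\frac{\displaystyle\int_{\mathbb{R}} \exp\Bigl[a\Bigl(\varphi\psi - \tfrac{1}{2}\lambda\psi^2\Bigr)\Bigr]\,d\psi}
     {\displaystyle\int_{\mathbb{R}} \exp\Bigl[-\tfrac{1}{2}a\lambda\psi^2\Bigr]\,d\psi}
  = \exp\Bigl[\frac{a\varphi^2}{2\lambda}\Bigr].
\]
```

Taking the product over $k = 1, \dots, n$ and $\alpha = 1, 2$ reproduces the exponent $\frac{n\beta}{2}\bigl[(\varphi_k^1)^2 + (\varphi_k^2)^2\bigr]/\lambda_k$ in the third line of (4.4).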
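By Condition 2.1 $\lambda_k > 0$ and $\lambda_k \to \infty$; hence $0 \le c(\varphi) \le \mathrm{const} \cdot \|\varphi\|^2 < \infty$. Because of the relatively simple form of $c$, it is elementary to check that $c$ is Gateaux differentiable and is weakly continuous on $L^2(\rho) \times L^2(\rho)$. Because $c$ is a sum of quadratic terms, it is also straightforward to calculate its Legendre-Fenchel transform. For $\xi = (\xi^1, \xi^2) \in L^2(\rho) \times L^2(\rho)$, this function is given by
\[
\begin{aligned}
I(\xi) &\doteq \sup_{\varphi \in L^2(\rho) \times L^2(\rho)} \{\langle \varphi, \xi\rangle - c(\varphi)\} \\
&= \sum_{k=1}^{\infty} \sup_{\varphi_k^1 \in \mathbb{R}} \Bigl\{\varphi_k^1 \xi_k^1 - \frac{1}{2}\frac{(\varphi_k^1)^2}{\lambda_k}\Bigr\} + \sum_{k=1}^{\infty} \sup_{\varphi_k^2 \in \mathbb{R}} \Bigl\{\varphi_k^2 \xi_k^2 - \frac{1}{2}\frac{(\varphi_k^2)^2}{\lambda_k}\Bigr\} \\
&= \frac{1}{2}\sum_{k=1}^{\infty} \lambda_k\bigl[(\xi_k^1)^2 + (\xi_k^2)^2\bigr].
\end{aligned} \tag{4.5}
\]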
The function $I(\xi)$ calculated in (4.5) coincides with the function $I(\xi)$ defined in (4.2). By Corollary 4.5.27 in [11], we will be able to conclude that the $G_{n,n\beta}$-distributions of $(\psi^{(n),1}, \psi^{(n),2})$ on $L^2(\rho) \times L^2(\rho)$, and thus the $G_{n,n\beta}$-distributions of $\psi^{(n)}$ on $L^2_c(\rho)$, satisfy the LDP with rate function $I(\xi)$ after we show that the $G_{n,n\beta}$-distributions of $(\psi^{(n),1}, \psi^{(n),2})$ are exponentially tight; i.e., for any $K < \infty$ there exists a compact set $A$ such that
\[
\limsup_{n\to\infty} \frac{1}{n\beta} \log G_{n,n\beta}\{(\psi^{(n),1}, \psi^{(n),2}) \in A^c\} < -K.
\]
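In order to prove the exponential tightness, we define for $M < \infty$ the level sets
\[
A_M \doteq \{\xi \in L^2(\rho) \times L^2(\rho) : I(\xi) \le M\}.
\]
We first prove that the sets $A_M$ are compact by showing that any sequence $\xi^{(n)} = (\xi^{(n),1}, \xi^{(n),2})$ in $A_M$ has a subsequence converging to an element of $A_M$. Since $\xi^{(n)} \in A_M$, we have for all $k \in \mathbb{N}$, $n \in \mathbb{N}$, and $\alpha = 1, 2$ the bound $\frac{1}{2}\lambda_k (\xi_k^{(n),\alpha})^2 \le M$; thus for each $k$ and $\alpha$, $\xi_k^{(n),\alpha}$ has a convergent subsequence. For $\alpha = 1, 2$ let $\xi_1^{(n_1),\alpha}$ be the convergent subsequence of $\xi_1^{(n),\alpha}$, and for each $k$ and $\alpha$ let $\xi_{k+1}^{(n_{k+1}),\alpha}$ be the convergent subsequence of $\xi_{k+1}^{(n_k),\alpha}$. For each $k$ and $\alpha$ there exists $\xi_k^\alpha$ such that $\lim_{n_k\to\infty} \xi_k^{(n_k),\alpha} = \xi_k^\alpha$. A diagonalization argument yields a subsequence $\hat{n} \in \mathbb{N}$ such that $\lim_{\hat{n}\to\infty} \xi_k^{(\hat{n}),\alpha} = \xi_k^\alpha$ for each $k$ and $\alpha$. The quantity $\xi \doteq \sum_{k=1}^{\infty} (\xi_k^1, \xi_k^2) e_k$ is an element of $A_M$. Indeed, by Fatou's lemma
\[
\frac{1}{2}\sum_{\alpha=1}^{2}\sum_{k=1}^{\infty} \lambda_k (\xi_k^\alpha)^2 \le \liminf_{\hat{n}\to\infty} \frac{1}{2}\sum_{\alpha=1}^{2}\sum_{k=1}^{\infty} \lambda_k (\xi_k^{(\hat{n}),\alpha})^2 \le M.
\]
For each $\hat{n}$ we define $\xi^{(\hat{n})} \doteq \sum_{k=1}^{\infty} (\xi_k^{(\hat{n}),1}, \xi_k^{(\hat{n}),2}) e_k$. In order to complete the proof that $A_M$ is compact, we show that $\xi^{(\hat{n})} \to \xi$ in $L^2(\rho) \times L^2(\rho)$. Since $\xi^{(\hat{n})}$ and $\xi$ are in $A_M$, the finiteness of $\sum_{k=1}^{\infty} 1/\lambda_k$ assumed in Condition 2.1 implies that uniformly over $\hat{n}$
\[
\sum_{\alpha=1}^{2}\sum_{k=1}^{\infty} (\xi_k^{(\hat{n}),\alpha} - \xi_k^\alpha)^2 \le 2\sum_{\alpha=1}^{2}\sum_{k=1}^{\infty} \bigl[(\xi_k^{(\hat{n}),\alpha})^2 + (\xi_k^\alpha)^2\bigr] \le 4\sum_{k=1}^{\infty} \frac{M}{\lambda_k} < \infty.
\]
Hence by the dominated convergence theorem
\[
\lim_{\hat{n}\to\infty} \|\xi^{(\hat{n})} - \xi\|^2 = \lim_{\hat{n}\to\infty} \sum_{\alpha=1}^{2}\sum_{k=1}^{\infty} (\xi_k^{(\hat{n}),\alpha} - \xi_k^\alpha)^2 = 0.
\]
This concludes the proof that $A_M$ is compact in $L^2(\rho) \times L^2(\rho)$.

We complete the proof of Theorem 4.1 by showing that the $G_{n,n\beta}$-distributions of $(\psi^{(n),1}, \psi^{(n),2})$ are exponentially tight. For any $M < \infty$ Chebyshev's inequality yields
\[
\begin{aligned}
G_{n,n\beta}\{(\psi^{(n),1}, \psi^{(n),2}) \in A_M^c\}
&= G_{n,n\beta}\Bigl\{\frac{1}{2}\sum_{k=1}^{n} \lambda_k\bigl[(\psi_k^1)^2 + (\psi_k^2)^2\bigr] > M\Bigr\} \\
&\le \exp\Bigl[-\frac{n\beta}{2}M\Bigr] \int_{\mathbb{R}^n \times \mathbb{R}^n} \exp\Bigl[\frac{n\beta}{4}\sum_{k=1}^{n} \lambda_k\bigl[(\psi_k^1)^2 + (\psi_k^2)^2\bigr]\Bigr]\, G_{n,n\beta}(d\psi) \\
&= \exp\Bigl[-\frac{n\beta}{2}M\Bigr] \prod_{k=1}^{n}\prod_{\alpha=1}^{2} \frac{\displaystyle\int_{\mathbb{R}} \exp\Bigl[-\frac{n\beta}{4}\lambda_k (\psi_k^\alpha)^2\Bigr]\, d\psi_k^\alpha}{\displaystyle\int_{\mathbb{R}} \exp\Bigl[-\frac{n\beta}{2}\lambda_k (\psi_k^\alpha)^2\Bigr]\, d\psi_k^\alpha} \\
&= \exp\Bigl[-\frac{n\beta}{2}M\Bigr]\, 2^n.
\end{aligned}
\]
It follows that
\[
\limsup_{n\to\infty} \frac{1}{n\beta} \log G_{n,n\beta}\{(\psi^{(n),1}, \psi^{(n),2}) \in A_M^c\} \le -\frac{M}{2} + \frac{\log 2}{\beta}.
\]
Since $A_M$ is compact and $M$ can be taken arbitrarily large, the proof is complete.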
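The Chebyshev step in the proof above can be checked numerically. The following is a minimal sketch, not part of the original argument: each ratio of one-dimensional Gaussian integrals should equal $\sqrt{2}$, so that the $2n$ factors multiply to $2^n$ and the normalized logarithm of the bound equals $-M/2 + (\log 2)/\beta$. The function names, the quadrature grid, and the choice of eigenvalues $\lambda_k = k^2$ are assumptions made only for this illustration.

```python
import numpy as np


def moment_factor(a, lam, half_width=12.0, num_points=200_001):
    """Ratio of the two one-dimensional Gaussian integrals in the Chebyshev step.

    numerator:   integral of exp(-(a/4) * lam * x^2) dx
    denominator: integral of exp(-(a/2) * lam * x^2) dx
    Here a plays the role of n*beta. Both integrals are approximated by a
    Riemann sum on a grid scaled to the width of the Gaussians.
    """
    scale = 1.0 / np.sqrt(a * lam)
    x = np.linspace(-half_width * scale, half_width * scale, num_points)
    dx = x[1] - x[0]
    numerator = np.sum(np.exp(-0.25 * a * lam * x**2)) * dx
    denominator = np.sum(np.exp(-0.5 * a * lam * x**2)) * dx
    return numerator / denominator


def tightness_exponent(n, beta, M, lam):
    """(1/(n*beta)) * log of the bound exp(-n*beta*M/2) * (product of factors)."""
    a = n * beta
    log_bound = -0.5 * a * M
    for lam_k in lam[:n]:
        # one factor for each of the two real components alpha = 1, 2
        log_bound += 2.0 * np.log(moment_factor(a, lam_k))
    return log_bound / a


# Hypothetical eigenvalues lambda_k = k^2, chosen only for this illustration.
lam = np.arange(1, 65, dtype=float) ** 2
exponent = tightness_exponent(n=64, beta=0.1, M=50.0, lam=lam)
closed_form = -50.0 / 2.0 + np.log(2.0) / 0.1
print(exponent, closed_form)
```

The factor does not depend on $\lambda_k$ or on $n\beta$, which is why the bound takes the same closed form for every admissible sequence of eigenvalues.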
Before proving Theorem 3.1 we sketch a second proof of Theorem 4.1 using an LDP for Gaussian processes proved by Bolthausen [5]. Let $(\Omega, \mathcal{F}, \Pi)$ be a probability space on which is defined a doubly indexed sequence of independent $N(0, 1)$ Gaussian random variables $g_k^\alpha$ indexed by $k \in \mathbb{N}$ and $\alpha = 1, 2$. In terms of the eigenvalues and eigenfunctions of $L$ introduced in Condition 2.1, we define for $n \in \mathbb{N}$, $\omega \in \Omega$, and $x \in D$ the independent mean-0 Gaussian processes
\[
y^{(n),1} = y^{(n),1}(\omega, x) \doteq \sum_{k=1}^{n} g_k^1(\omega)\,\frac{e_k(x)}{\sqrt{\lambda_k}} \quad\text{and}\quad y^{(n),2} = y^{(n),2}(\omega, x) \doteq \sum_{k=1}^{n} g_k^2(\omega)\,\frac{e_k(x)}{\sqrt{\lambda_k}}. \tag{4.6}
\]
These processes take values in $L^2(\rho)$, and $y^{(n),1} + iy^{(n),2}$ take values in $L^2_c(\rho)$. The basic NLS equation is defined by $L = \partial^2/\partial x^2$ on $[0, \Lambda]$. Inserting into (4.6) the eigenvalues and eigenfunctions given in Example 2.2(a), we have for $\alpha = 1, 2$ and $x \in [0, \Lambda]$,
\[
y^{(n),\alpha}(\omega, x) = \frac{\sqrt{2\Lambda}}{\pi}\sum_{k=1}^{n} g_k^\alpha(\omega)\,\frac{\sin(k\pi x/\Lambda)}{k}.
\]
It is well known that with probability 1, as $n \to \infty$ these processes converge in $L^2(dx)$ to independent Brownian bridges on $[0, \Lambda]$. The processes $y^{(n),\alpha}$ are closely related to the processes used by Wiener in his construction of Brownian motion [21, pp. 21-22]. The probability-1 convergence of $y^{(n),\alpha}$ in the general case of (4.6) is the basis of the second proof of Theorem 4.1. We will prove the convergence in a moment.

We need the following lemma relating the distributions of $(y^{(n),1} + iy^{(n),2})/\sqrt{\beta}$ and $\psi^{(n)}$. The routine proof is omitted.

Lemma 4.2. Fix $\beta > 0$. Then as measures on $L^2_c(\rho)$, the $\Pi$-distributions of $(y^{(n),1} + iy^{(n),2})/\sqrt{\beta}$ and the $G_{n,\beta}$-distributions of $\psi^{(n)}$ are equal. In particular, replacing $\beta$ by $n\beta$, we see that the $\Pi$-distributions of $(y^{(n),1} + iy^{(n),2})/\sqrt{n\beta}$ and the $G_{n,n\beta}$-distributions of $\psi^{(n)}$ are equal.
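The Brownian bridge limit mentioned before Lemma 4.2 can be illustrated without any sampling. Because the Gaussian coefficients are independent with unit variance, the variance of the truncated sine series at a point $x$ is the sum of the squared coefficient functions, and a Brownian bridge on an interval of length `length` has variance `x * (length - x) / length`. The sketch below compares the two; the function names and the parameter `length` (standing for the length of the interval) are hypothetical names chosen for this illustration.

```python
import numpy as np


def truncated_variance(x, n, length=1.0):
    """Variance at x of the n-term sine series with independent N(0,1) coefficients.

    Each term contributes (2*length / (pi^2 * k^2)) * sin(k*pi*x/length)^2,
    which is e_k(x)^2 / lambda_k for the Dirichlet problem on [0, length].
    """
    k = np.arange(1, n + 1)
    coeff = 2.0 * length / (np.pi**2 * k**2)
    return float(np.sum(coeff * np.sin(k * np.pi * x / length) ** 2))


def bridge_variance(x, length=1.0):
    """Variance at x of a Brownian bridge pinned to 0 at both ends of [0, length]."""
    return x * (length - x) / length


for n in (16, 64, 256, 20_000):
    print(n, truncated_variance(0.3, n), bridge_variance(0.3))
```

The gap between the two values is the tail of the series $\sum_k 1/\lambda_k$, which is the quantity whose finiteness is assumed in Condition 2.1.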
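This lemma allows us to prove Theorem 4.1 by showing that the $\Pi$-distributions of $(y^{(n),1} + iy^{(n),2})/\sqrt{n\beta}$ on $L^2_c(\rho)$ (equivalently, the $\Pi$-distributions of $(y^{(n),1}, y^{(n),2})/\sqrt{n\beta}$ on $L^2(\rho) \times L^2(\rho)$) satisfy the LDP with the scaling constants $n\beta$ and the rate function $I(\xi)$ defined in (4.2). We first prove that as $n \to \infty$, with $\Pi$-probability 1 the Gaussian processes $y^{(n),1}$ and $y^{(n),2}$ defined in (4.6) converge in $L^2(\rho)$ to the independent Gaussian processes
\[
y^1 \doteq \sum_{k=1}^{\infty} g_k^1\,\frac{e_k}{\sqrt{\lambda_k}} \quad\text{and}\quad y^2 \doteq \sum_{k=1}^{\infty} g_k^2\,\frac{e_k}{\sqrt{\lambda_k}}.
\]
On the product space $\Omega \times D$, we define for $n \in \mathbb{N}$ and $\alpha = 1, 2$
\[
\bar{y}^{(n),\alpha} \doteq \sum_{k=1}^{n} |g_k^\alpha|\,\frac{|e_k|}{\sqrt{\lambda_k}} \quad\text{and}\quad \bar{y}^{\alpha} \doteq \sum_{k=1}^{\infty} |g_k^\alpha|\,\frac{|e_k|}{\sqrt{\lambda_k}}.
\]
Since the $g_k^\alpha$ are independent $N(0, 1)$ Gaussian random variables and the $e_k$ are orthonormal in $L^2(\rho)$,
\[
\int_{\Omega \times D} (\bar{y}^{\alpha})^2\, d\Pi \times d\rho = \lim_{n\to\infty} \int_{\Omega \times D} (\bar{y}^{(n),\alpha})^2\, d\Pi \times d\rho = \sum_{k=1}^{\infty} \int_{\Omega}\int_{D} (g_k^\alpha)^2\,\frac{(e_k)^2}{\lambda_k}\, d\Pi\, d\rho = \sum_{k=1}^{\infty} \frac{1}{\lambda_k}.
\]
Since this sum is finite by Condition 2.1 and $|y^\alpha| \le |\bar{y}^\alpha|$, we have $\bar{y}^\alpha \in L^2(\rho)$ $\Pi$-a.e. and thus $y^\alpha \in L^2(\rho)$ $\Pi$-a.e. The bound $|y^\alpha - y^{(n),\alpha}|^2 \le (2\bar{y}^\alpha)^2 \in L^1(\rho)$ $\Pi$-a.e. allows us to apply the dominated convergence theorem, which yields the desired limit:
\[
\int_{D} |y^\alpha - y^{(n),\alpha}|^2\, d\rho \to 0 \quad \Pi\text{-a.e.}
\]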
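This completes the proof that with $\Pi$-probability 1, $y^{(n),1}$ and $y^{(n),2}$ converge in $L^2(\rho)$ to $y^1$ and $y^2$. The probability-1 convergence just proved implies the weak convergence
\[
\Pi\{(y^{(n),1}, y^{(n),2})/\sqrt{\beta} \in d\xi^1 \times d\xi^2\} \Rightarrow \Pi\{(y^1, y^2)/\sqrt{\beta} \in d\xi^1 \times d\xi^2\} \quad\text{on } L^2(\rho) \times L^2(\rho). \tag{4.7}
\]
By Theorem 2 in [5] and the discussion following that theorem, we conclude that the $\Pi$-distributions of $(y^{(n),1}, y^{(n),2})/\sqrt{n\beta}$ on $L^2(\rho) \times L^2(\rho)$, and thus the $G_{n,n\beta}$-distributions of $\psi^{(n)}$ on $L^2_c(\rho)$, satisfy the LDP with scaling constants $n\beta$ and rate function
\[
\frac{1}{\beta}\,h_\beta(\xi) \doteq \frac{1}{\beta} \cdot \sup_{\varphi \in L^2(\rho) \times L^2(\rho)} \{\langle \varphi, \xi\rangle - \log M_\beta(\varphi)\},
\]
where
\[
M_\beta(\varphi) \doteq \int_{\Omega} \exp\langle \varphi, (y^1, y^2)/\sqrt{\beta}\rangle\, d\Pi.
\]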
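In order to calculate $M_\beta(\varphi)$, we use the bound
\[
\sup_{n \in \mathbb{N}} \int_{\Omega} \exp\bigl(t\,\|(y^{(n),1}, y^{(n),2})\|\bigr)\, d\Pi < \infty \quad\text{for all } t > 0,
\]
which follows from the weak convergence (4.7) [5, p. 427]. Applying Lemma 4.2 and calculating as in (4.4), we find that for $\varphi = (\varphi^1, \varphi^2) \in L^2(\rho) \times L^2(\rho)$
\[
\begin{aligned}
M_\beta(\varphi) &= \lim_{n\to\infty} \int_{\Omega} \exp\langle \varphi, (y^{(n),1}, y^{(n),2})/\sqrt{\beta}\rangle\, d\Pi \\
&= \lim_{n\to\infty} \int_{\mathbb{R}^{2n}} \exp\langle \varphi, \psi^{(n)}\rangle\, G_{n,\beta}(d\psi) \\
&= \exp\Bigl[\frac{1}{2\beta}\sum_{k=1}^{\infty} \frac{(\varphi_k^1)^2 + (\varphi_k^2)^2}{\lambda_k}\Bigr].
\end{aligned}
\]
Via a calculation as in (4.5), we conclude that for $\xi = (\xi^1, \xi^2) \in L^2(\rho) \times L^2(\rho)$
\[
\frac{1}{\beta}\,h_\beta(\xi) = \frac{1}{2}\sum_{k=1}^{\infty} \lambda_k\bigl[(\xi_k^1)^2 + (\xi_k^2)^2\bigr].
\]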
Since this equals the function $I(\xi)$ defined in (4.2), our sketch of the second proof of Theorem 4.1 is complete.

We now turn to the proof of Theorem 3.1, which states the LDP for the distributions of $\psi^{(n)}$ with respect to the conditional measures $P^{N,\varepsilon}_{n,n\beta}$ defined in (2.9). Before proving this theorem, we must establish several properties of the functionals $Q$ and $\Theta$ appearing in $P^{N,\varepsilon}_{n,n\beta}$. We recall that for $\xi \in L^2_c(\rho)$
\[
Q(\xi) \doteq \frac{1}{2}\int_{D} |\xi|^2\, d\rho \quad\text{and}\quad \Theta(\xi) \doteq -\frac{1}{2}\int_{D} F(|\xi|^2)\, d\rho
\]
and that for $a \ge 0$, $F(a) \doteq \int_0^a f(s)\, ds$.

Proposition 4.3. The following properties are valid.
(a) For any $\sigma > \|f\|_\infty$ we have $\Theta(\xi) + \sigma Q(\xi) \ge 0$ for all $\xi \in L^2_c(\rho)$.
(b) Both $Q$ and $\Theta$ are continuous functionals on $L^2_c(\rho)$.

Proof. (a) For any $\xi \in L^2_c(\rho)$,
\[
\Theta(\xi) + \sigma Q(\xi) = -\frac{1}{2}\int_{D} F(|\xi|^2)\, d\rho + \frac{\sigma}{2}\int_{D} |\xi|^2\, d\rho \ge -\frac{\|f\|_\infty}{2}\,\|\xi\|^2 + \frac{\sigma}{2}\,\|\xi\|^2.
\]
The last expression is nonnegative provided $\sigma > \|f\|_\infty$.
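(b) The continuity of $Q$ is obvious. To prove the continuity of $\Theta$, we note that for $0 \le a \le b < \infty$, $|F(b) - F(a)| \le \|f\|_\infty |b - a|$. Hence for $\xi$ and $\zeta$ in $L^2_c(\rho)$,
\[
\begin{aligned}
|\Theta(\xi) - \Theta(\zeta)|
&\le \frac{\|f\|_\infty}{2}\Bigl[\int_{\{|\xi| \ge |\zeta|\}} \bigl(|\xi|^2 - |\zeta|^2\bigr)\, d\rho + \int_{\{|\xi| < |\zeta|\}} \bigl(|\zeta|^2 - |\xi|^2\bigr)\, d\rho\Bigr] \\
&\le \frac{\|f\|_\infty}{2}\int_{D} \bigl||\xi|^2 - |\zeta|^2\bigr|\, d\rho \\
&\le \frac{\|f\|_\infty}{2}\int_{D} |\xi - \zeta| \cdot (|\xi| + |\zeta|)\, d\rho \\
&\le \frac{\|f\|_\infty}{2}\Bigl(\int_{D} |\xi - \zeta|^2\, d\rho\Bigr)^{1/2}\Bigl(\int_{D} (|\xi| + |\zeta|)^2\, d\rho\Bigr)^{1/2} \\
&\le \frac{\|f\|_\infty}{2}\,\|\xi - \zeta\|\,\bigl(2\|\xi\|^2 + 2\|\zeta\|^2\bigr)^{1/2}.
\end{aligned}
\]
This yields the continuity of $\Theta$.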
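For the remainder of this section we choose $\sigma > \|f\|_\infty$ so that $\Theta(\xi) + \sigma Q(\xi) \ge 0$ for any $\xi \in L^2_c(\rho)$ [Prop. 4.3(a)]. We are now ready to prove Theorem 3.1, which states the LDP for the distributions of $\psi^{(n)}$ with respect to the conditional measures $P^{N,\varepsilon}_{n,n\beta}$. We first express the measures $P_{n,n\beta}$ in terms of $G_{n,n\beta}$:
\[
\begin{aligned}
P_{n,n\beta}(d\psi) &\doteq \frac{1}{Z_n(n\beta)} \exp\bigl(-n\beta[H_n(\psi) + \sigma Q_n(\psi)]\bigr)\, V_n(d\psi) \\
&= \frac{1}{\hat{Z}_n(n\beta)} \exp\bigl(-n\beta[\Theta(\psi^{(n)}) + \sigma Q(\psi^{(n)})]\bigr)\, G_{n,n\beta}(d\psi),
\end{aligned} \tag{4.8}
\]
where $\hat{Z}_n(n\beta)$ denotes the normalizing constant
\[
\hat{Z}_n(n\beta) \doteq \int_{\mathbb{R}^n \times \mathbb{R}^n} \exp\bigl(-n\beta[\Theta(\psi^{(n)}) + \sigma Q(\psi^{(n)})]\bigr)\, G_{n,n\beta}(d\psi).
\]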
In order to prove the LDP for the measures $P_{n,n\beta}$, we need a definition. Let $J$ be a rate function on $L^2_c(\rho)$. A sequence of measures $\mu_n$ on $L^2_c(\rho)$ is said to satisfy the Laplace principle on $L^2_c(\rho)$ with the scaling constants $n\beta$ and the rate function $J$ if for all bounded continuous functions $h$,
\[
\lim_{n\to\infty} \frac{1}{n\beta} \log \int_{L^2_c(\rho)} \exp(-n\beta h)\, d\mu_n = -\inf_{\xi \in L^2_c(\rho)} \{h(\xi) + J(\xi)\}.
\]
As proved in Theorems 1.2.3 and 1.2.5 in [14], $\mu_n$ satisfies the Laplace principle on $L^2_c(\rho)$ with the rate function $J$ if and only if $\mu_n$ satisfies the LDP on $L^2_c(\rho)$ with the rate function $J$.

The measures $P_{n,n\beta}$ defined in (2.7) have the form of a canonical ensemble with interaction function $\Theta + \sigma Q$. We prove the LDP for the $P_{n,n\beta}$-distributions of $\psi^{(n)}$ by proving the Laplace principle for these distributions. By Theorem 4.1 the $G_{n,n\beta}$-distributions of $\psi^{(n)}$ satisfy the LDP on $L^2_c(\rho)$ with rate function $D(\xi)$. Since $\Theta(\xi) + \sigma Q(\xi) \ge 0$ for any $\xi \in L^2_c(\rho)$ [Prop. 4.3(a)], for any bounded continuous function $h$, $\Theta + \sigma Q + h$ is bounded below. Hence the Laplace principle for the $P_{n,n\beta}$-distributions of $\psi^{(n)}$ is a consequence of Theorem 1.3.4 in [14]. A straightforward calculation (see, e.g., the proof of Thm. 3.1 in [6]) allows us to express the rate function in terms of the Hamiltonian $H(\xi) \doteq D(\xi) + \Theta(\xi)$:
\[
\begin{aligned}
J(\xi) &\doteq D(\xi) + \Theta(\xi) + \sigma Q(\xi) - \inf\{D(\xi) + \Theta(\xi) + \sigma Q(\xi) : \xi \in L^2_c(\rho)\} \\
&= H(\xi) + \sigma Q(\xi) - \inf\{H(\xi) + \sigma Q(\xi) : \xi \in L^2_c(\rho)\}.
\end{aligned} \tag{4.9}
\]
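The Laplace principle used above can be illustrated in one dimension. The sketch below is only a toy analogue, not the infinite-dimensional setting of the paper: it takes the measures to be centered Gaussians on the real line with variance $1/a$, which satisfy the LDP with scaling constants $a$ and rate function $J(x) = x^2/2$, and compares the normalized log-integral with $-\inf\{h + J\}$ for one bounded continuous $h$. The scalar `a` stands in for $n\beta$, and the function names, the test function `h`, and the quadrature grid are all assumptions made for this illustration.

```python
import numpy as np


def laplace_lhs(h, a, half_width=8.0, num_points=400_001):
    """(1/a) * log of the integral of exp(-a*h(x)) against the N(0, 1/a) density."""
    x = np.linspace(-half_width, half_width, num_points)
    dx = x[1] - x[0]
    # log of exp(-a*h(x)) times the Gaussian density sqrt(a/(2*pi)) * exp(-a*x^2/2)
    log_integrand = -a * (h(x) + 0.5 * x**2) + 0.5 * np.log(a / (2.0 * np.pi))
    shift = np.max(log_integrand)  # subtract the maximum to avoid overflow
    return (shift + np.log(np.sum(np.exp(log_integrand - shift)) * dx)) / a


def laplace_rhs(h, half_width=8.0, num_points=400_001):
    """-inf over x of h(x) + J(x), with J(x) = x^2/2, evaluated on the same grid."""
    x = np.linspace(-half_width, half_width, num_points)
    return -float(np.min(h(x) + 0.5 * x**2))


def h(x):
    """A bounded continuous test function, chosen arbitrarily."""
    return np.sin(3.0 * x)


for a in (10.0, 100.0, 2000.0):
    print(a, laplace_lhs(h, a), laplace_rhs(h))
```

As `a` grows, the two columns approach each other at a rate of order $(\log a)/a$, which is the finite-dimensional counterpart of the limit in the definition.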
We have proved the following result.

Theorem 4.4. As $n \to \infty$, the sequence $P_{n,n\beta}(\psi^{(n)} \in d\xi)$ satisfies the LDP on $L^2_c(\rho)$ with the scaling constants $n\beta$ and the rate function $J$ defined in (4.9).

We now address the question of when the conditional measure $P^{N,\varepsilon}_{n,n\beta}$ is well defined. To this end, we introduce
\[
\tilde{E}(N) \doteq \inf\{J(\xi) : \xi \in L^2_c(\rho),\ Q(\xi) = N\}, \tag{4.10}
\]
which is nonnegative and finite for all $N \in [0, \infty)$. Because $Q$ is a continuous functional on $L^2_c(\rho)$, the preceding theorem and the contraction principle imply that $\tilde{E}$ is a rate function, and thus is lower semicontinuous, and also yield the LDP stated in part (a) of the next proposition [11, Thm. 4.2.1]. If one applies the large deviation lower bound to the open set $(N - \varepsilon, N + \varepsilon) \subset [N - \varepsilon, N + \varepsilon]$ and uses the bound $\tilde{E}((N - \varepsilon, N + \varepsilon)) \le \tilde{E}(N)$, then one obtains part (b). Part (b) implies that for any $N \in [0, \infty)$, all sufficiently large $n$, and all $\varepsilon > 0$ we have
\[
P_{n,n\beta}\{Q(\psi^{(n)}) \in [N - \varepsilon, N + \varepsilon]\} > 0. \tag{4.11}
\]
Hence for these values of $N$, $n$, and $\varepsilon$, the conditional measure $P^{N,\varepsilon}_{n,n\beta}$ is well defined.

Proposition 4.5. (a) As $n \to \infty$, the sequence $P_{n,n\beta}\{Q(\psi^{(n)}) \in dx\}$ satisfies the LDP on $\mathbb{R}$ with the scaling constants $n\beta$ and the rate function $\tilde{E}$. In particular, $\tilde{E}$ is nonnegative and lower semicontinuous on $\mathbb{R}$.
(b) For $N \in [0, \infty)$ and any $\varepsilon > 0$,
\[
\liminf_{n\to\infty} \frac{1}{n\beta} \log P_{n,n\beta}\{Q(\psi^{(n)}) \in [N - \varepsilon, N + \varepsilon]\} \ge -\tilde{E}(N) > -\infty.
\]

We indicate other expressions for $\tilde{E}$. Substituting into the definition of $\tilde{E}$ the formula (4.9) for $J$, we obtain
\[
\begin{aligned}
\tilde{E}(N) &\doteq \inf\{J(\xi) : \xi \in L^2_c(\rho),\ Q(\xi) = N\} \\
&= \inf\{H(\xi) + \sigma Q(\xi) : \xi \in L^2_c(\rho),\ Q(\xi) = N\} - \inf\{H(\xi) + \sigma Q(\xi) : \xi \in L^2_c(\rho)\} \\
&= \inf\{H(\xi) : \xi \in L^2_c(\rho),\ Q(\xi) = N\} + \sigma N - \inf\{H(\xi) + \sigma Q(\xi) : \xi \in L^2_c(\rho)\}.
\end{aligned}
\]
Recalling the function $\bar{E}(N) \doteq \inf\{H(\xi) : \xi \in L^2_c(\rho),\ Q(\xi) = N\}$ introduced in (3.1), we see that
\[
\tilde{E}(N) = \bar{E}(N) + \sigma N - \inf\{H(\xi) + \sigma Q(\xi) : \xi \in L^2_c(\rho)\}. \tag{4.12}
\]
Since $\tilde{E}$ is nonnegative and lower semicontinuous, it follows that $\bar{E}$ is bounded below and lower semicontinuous.

We now complete the proof of Theorem 3.1, which states the LDP for the $P^{N,\varepsilon}_{n,n\beta}$-distributions of $\psi^{(n)}$. This is carried out by proving that as $n \to \infty$ and $\varepsilon \to 0$ the sequence $P^{N,\varepsilon}_{n,n\beta}(\psi^{(n)} \in d\xi)$ satisfies the LDP on $L^2_c(\rho)$ with the scaling constants $n\beta$ and the rate function $J^N$ defined in (3.2). The function $Q$ defining the conditioning in this measure is continuous. If it were also bounded, then the desired LDP would be a consequence of Theorem 3.2 in our paper [17]. However, a quick examination of the proof of that theorem reveals that only the continuity of $Q$ is required, not its boundedness (specifically, in the application of the contraction principle in the proof of Prop. 3.1 in [17]). For $N \in [0, \infty)$, Theorem 3.2 in [17], with its proof modified as just described, shows that as $n \to \infty$ and $\varepsilon \to 0$ the sequence $P^{N,\varepsilon}_{n,n\beta}(\psi^{(n)} \in d\xi)$ satisfies the LDP on $L^2_c(\rho)$ with the scaling constants $n\beta$ and the rate function
\[
\tilde{J}^N(\xi) \doteq \begin{cases} J(\xi) - \tilde{E}(N) & \text{if } Q(\xi) = N, \\ \infty & \text{otherwise.} \end{cases}
\]
Substituting the formula (4.9) for $J$ into the definition of $\tilde{J}^N$ and using (4.12) to relate $\tilde{E}$ and $\bar{E}$, we see that $\tilde{J}^N$ equals the function $J^N$ defined in (3.2). This yields the desired LDP for $P^{N,\varepsilon}_{n,n\beta}(\psi^{(n)} \in d\xi)$. The proof of Theorem 3.1 is concluded.

5. Monte-Carlo Simulations

Here we summarize some numerical computations that display the variety of behaviors that are exhibited by the mixed Gibbs ensembles $P^{N,\varepsilon}_{n,n\beta}(d\psi^{(n)})$ in the continuum limit as $n \to \infty$. In particular, we implement a Monte-Carlo sampling procedure to probe the dependence of the concentration phenomenon on Condition 2.1. To this end we consider operators $L$ on $L^2_c[0, \pi]$ with Dirichlet boundary conditions and with eigenfunctions $e_k(x) = \sqrt{2}\sin(kx)$, and we choose the corresponding eigenvalues of $L$ to be $-\lambda_k = -k^{\alpha}$ for $0 < \alpha < +\infty$. Our main concern is to distinguish the case when $\alpha > 1$, for which Condition 2.1 holds and hence our LDP applies, from the case when $\alpha \le 1$, for which a concentration behavior may or may not occur.

Our Monte-Carlo procedure is a modification of the standard Metropolis algorithm [2] appropriate to the mixed ensemble, which is canonical with respect to the energy $H_n$ and microcanonical with respect to the particle number $Q_n$. The microcanonical constraint is enforced exactly at each step of the Markov chain that defines the Metropolis algorithm; at each step two components of the random state $\psi$ are updated in a manner that preserves the spherical constraint, $\sum_{k=1}^{n} |\psi_k|^2 = 2N$. To improve the sampling properties of the algorithm, a form of simulated annealing is used. That is, the sampling procedure is implemented in two stages: first at a small $\beta$ (high temperature) and then at the prescribed $\beta$ (given temperature).

To exhibit the concentration behavior predicted by the LDP for $\psi^{(n)}$, we sample the mixed Gibbs ensemble $P^{N,\varepsilon}_{n,n\beta}$ (with $\varepsilon \to 0+$) for the three values $n = 16, 64, 256$ with $\beta$ fixed.
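The constraint-preserving update described above can be sketched as follows. This is not the code used for the computations reported in this section; it is a minimal illustration under stated assumptions. The pair update redraws two complex components uniformly on the 3-sphere that keeps their combined squared modulus fixed (a symmetric proposal, so the usual Metropolis acceptance rule applies), the energy is a placeholder containing only the quadratic part $\frac{1}{2}\sum_k \lambda_k |\psi_k|^2$ with the nonlinear term omitted, and annealing is not included. All function names and default parameters are hypothetical.

```python
import numpy as np


def energy(psi, lam):
    """Placeholder energy: quadratic part only, (1/2) * sum_k lam_k * |psi_k|^2.

    The nonlinear term of the Hamiltonian is deliberately omitted in this sketch.
    """
    return 0.5 * np.sum(lam * np.abs(psi) ** 2)


def metropolis_step(psi, lam, beta_eff, rng):
    """One Metropolis step that preserves sum_k |psi_k|^2 exactly (up to rounding)."""
    n = psi.size
    j, k = rng.choice(n, size=2, replace=False)
    r2 = abs(psi[j]) ** 2 + abs(psi[k]) ** 2
    # Uniform point on the sphere of radius sqrt(r2) in R^4: normalize a Gaussian vector.
    v = rng.normal(size=4)
    v *= np.sqrt(r2) / np.linalg.norm(v)
    proposal = psi.copy()
    proposal[j] = v[0] + 1j * v[1]
    proposal[k] = v[2] + 1j * v[3]
    dH = energy(proposal, lam) - energy(psi, lam)
    if dH <= 0.0 or rng.random() < np.exp(-beta_eff * dH):
        return proposal, True
    return psi, False


def run_chain(n=16, N=1.0, alpha=2.0, beta=0.1, steps=2000, seed=0):
    """Run the chain at inverse temperature n*beta on the sphere sum |psi_k|^2 = 2N."""
    rng = np.random.default_rng(seed)
    lam = np.arange(1, n + 1, dtype=float) ** alpha  # lambda_k = k^alpha
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    psi *= np.sqrt(2.0 * N) / np.linalg.norm(psi)  # start on the constraint sphere
    accepted = 0
    for _ in range(steps):
        psi, ok = metropolis_step(psi, lam, n * beta, rng)
        accepted += ok
    return psi, accepted / steps


psi, rate = run_chain()
print(np.sum(np.abs(psi) ** 2), rate)
```

Redrawing the pair on its own sphere is one simple way to keep the chain exactly on the constraint surface; a two-stage run in the spirit of the annealing described above would call `run_chain`-style loops first with a small `beta` and then with the prescribed one.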
In all our computations the underlying GNLS equation has the saturated nonlinearity
\[
f(|\psi|^2) = \frac{b\,|\psi|^2}{1 + |\psi|^2} \tag{5.1}
\]
with scale factor $b$. First we set $\alpha = 2.0$, $\beta = 0.1$, and $b = 10$. These parameters are chosen to yield a ground state of approximately unit amplitude and unit width. Figure 1a displays the three plots for this sequence, each plot composed of five representative samples of $|\psi|$ drawn from the Monte-Carlo sampled ensemble. The expected behavior under Condition 2.1 is clearly demonstrated; namely, the fluctuations visible for $n = 16$ decrease for $n = 64$ and almost disappear for $n = 256$, for which all five samples remain close to the ground state.

[Figure 1: two rows of three panels, for $n = 16$, $n = 64$, and $n = 256$, each showing samples of $|\psi|$. Row (a): $\alpha = 2.0$, $\beta = 0.1$, $b = 10$. Row (b): $\alpha = 0.5$, $\beta = 1.0$, $b = 1.0$.]
Fig. 1a,b. Samples from mixed Gibbs ensemble for $\alpha = 2.0$ and $\alpha = 0.5$

Next we set $\alpha = 0.5$, $\beta = 1.0$, and $b = 1.0$. This choice of $\alpha$ furnishes an example of the behavior of the mixed Gibbs ensemble in the continuum limit when Condition 2.1 is violated. As in Fig. 1a, the parameters $\beta$ and $b$ are fixed so that the ground state has approximately unit amplitude and unit width. The sequence of three plots displayed in Fig. 1b shows greater fluctuations than the corresponding plots in Fig. 1a. As $n$ increases, both the spatial scale and the magnitude of the fluctuations decrease, so that for $n = 256$ each of the five samples is a near-ground state on the large scale having small fluctuations on the small scales. Figure 1b appears to show a concentration around the ground state even in the case when $\alpha = 0.5$, but possibly with a slower rate of convergence than for $\alpha > 1$, and possibly in a weaker norm than the $L^2_c(\rho)$-norm.

We do not know whether Condition 2.1 is necessary as well as sufficient for an LDP for the process $\psi^{(n)}$ in the $L^2_c(\rho)$-topology. On the one hand, the displayed computations for $\alpha < 1$ and other computed results for $\alpha < 1$ not included here suggest that a weaker condition may be sufficient for such an LDP. On the other hand, the proof of Theorem 4.1, the basic LDP for Gaussian processes on which our main Theorem 3.1 is based, is not valid for $\alpha < 1$. This suggests
that the concentration property breaks down when $\alpha < 1$. A more exhaustive numerical investigation might help to resolve this question, but such a computationally intensive study will not be pursued here.

A second set of numerical computations is shown in Fig. 2. Here we investigate the dependence of the concentration phenomenon and the associated ground states on the strength of the nonlinearity in the GNLS equation. Specifically, we vary the parameter $b > 0$ in the saturated nonlinearity given in (5.1). Since the GNLS equation is focusing, an increase of $b$ is expected to localize and intensify the ground state solitary wave. In Fig. 2a this effect is displayed for $b = 10, 20, 100$, with $\alpha = 2.0$, $\beta = 10$ and $n = 64$; in Fig. 2b it is displayed for $b = 1, 2, 10$, with $\alpha = 0.5$, $\beta = 100$ and $n = 64$. As in Fig. 1, these plots exhibit five samples of $|\psi|$ for each ensemble. Since $\beta$ is relatively large in each of these cases, each sample has the shape of a ground state. For large $b$, whether or not the summability of $1/\lambda_k$ in Condition 2.1 holds, we observe a ground state that is a highly localized solitary wave; an extreme case of this is given in Fig. 2b for $b = 10$.

From the point of view of the present paper, the most noteworthy effect displayed in Fig. 2 is the presence of approximate translates of the exact ground state among the Monte Carlo samples. Indeed, for high $\beta$ (low temperature) the energy of all the displayed samples is close to the ground state energy, and the samples themselves are often close to being translates of the exact ground state. This effect is straightforwardly explained by the fact that, for relatively localized states, the energy $H$ is only slightly different among all translations of a ground state that are not too close to the boundary. According to our main LDP in Theorem 3.1, the rate function is simply the energy difference between a candidate state and the ground state(s) [see (3.2)]. Thus, while we have adopted Dirichlet boundary conditions to reduce the translational invariance of the GNLS equation and thereby to simplify the presentation of our results, we have found that in the strongly focusing regime an approximate translational invariance persists. Of course, this effect diminishes as the Monte Carlo simulations are carried out for increasing $n$.

[Figure 2: two rows of three panels, each showing samples of $|\psi|$. Row (a): $\alpha = 2.0$, $\beta = 10$, $n = 64$, with $b = 10$, $b = 20$, and $b = 100$. Row (b): $\alpha = 0.5$, $\beta = 100$, $n = 64$, with $b = 1$, $b = 2$, and $b = 10$.]
Fig. 2a,b. Samples from mixed Gibbs ensemble for increasing values of $b$

In summary, the concentration phenomenon that is precisely expressed in our main LDP is definitely borne out by numerical sampling of the mixed ensembles over a wide range of parameters, even though this phenomenon is somewhat complicated by the near translation-invariance of localized ground states. In fact, the range of parameters for which the LDP holds may be wider than the range covered by our main theorem.

Acknowledgement. The authors thank Adam Eisner, who provided the Monte-Carlo method and code used in Sect. 5 to carry out the simulations of the mixed ensemble.
References
1. Bidegaray, B.: Invariant measures for some partial differential equations. Physica D 82, 340–364 (1995)
2. Binder, K., Heermann, D.W.: Monte Carlo Simulation in Statistical Physics. Fourth edition. Springer Series in Solid-State Sciences, Vol. 80, Berlin: Springer-Verlag, 2002
3. Birkhoff, G., Rota, G.-C.: Ordinary Differential Equations. Second edition. Waltham: Blaisdell Publishing Co., 1969
4. Biskamp, D.: Nonlinear Magnetohydrodynamics. Cambridge Monographs in Plasma Physics. Cambridge: Cambridge Univ. Press, 1993
5. Bolthausen, E.: On the probability of large deviations in Banach spaces. Ann. Probab. 12, 427–435 (1984)
6. Boucher, C., Ellis, R.S., Turkington, B.: Derivation of maximum entropy principles in two-dimensional turbulence via large deviations. J. Stat. Phys. 98, 1235–1278 (2000)
7. Bourgain, J.: Periodic nonlinear Schrödinger equation and invariant measures. Commun. Math. Phys. 166, 1–26 (1994)
8. Bouchet, F., Sommeria, J.: Emergence of intense jets and Jupiter's Great Red Spot as maximum-entropy structures. J. Fluid Mech. 464, 165–207 (2002)
9. Cai, D., Majda, A.J., McLaughlin, D.W., Tabak, E.G.: Spectral bifurcations in dispersive wave turbulence. Proc. Nat. Acad. Sci. 96, 14216–14221 (1999)
10. Cai, D., McLaughlin, D.W.: Chaotic and turbulent behavior of unstable 1D nonlinear dispersive waves. J. Math. Phys. 41, 4125–4153 (2000)
11. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Second edition. New York: Springer-Verlag, 1998
12. DiBattista, M.T., Majda, A.J., Grote, M.J.: Meta-stability of equilibrium statistical structures for prototype geophysical flows with damping and driving. Physica D 151, 271–304 (2001)
13. Dowling, T.E.: Dynamics of Jovian atmospheres. Ann. Rev. Fluid Mech. 27, 293–334 (1995)
14. Dupuis, P., Ellis, R.S.: A Weak Convergence Approach to the Theory of Large Deviations. New York: John Wiley & Sons, 1997
15. Dyachenko, S., Zakharov, V.E., Pushkarev, A.N., Shvets, V.F., Yan'kov, V.V.: Soliton turbulence in nonintegrable wave systems. Soviet Phys. JETP 69, 1144–1147 (1989)
16. Ellis, R.S.: Entropy, Large Deviations and Statistical Mechanics. New York: Springer-Verlag, 1985
17. Ellis, R.S., Haven, K., Turkington, B.: Large deviation principles and complete equivalence and nonequivalence results for pure and mixed ensembles. J. Stat. Phys. 101, 999–1064 (2000)
18. Gikhman, I.I., Skorohod, A.V.: The Theory of Stochastic Processes I. Trans. by S. Kotz, Berlin: Springer-Verlag, 1974
19. Hasegawa, A.: Self-organization processes in continuous media. Adv. Phys. 34, 1–42 (1985)
20. Isichenko, M.B., Gruzinov, A.V.: Isotopological relaxation, coherent structures, and Gaussian turbulence in two-dimensional magnetohydrodynamics. Phys. Plasmas 1, 1802–1816 (1994)
21. Itô, K., McKean, H.P.: Diffusion Processes and Their Sample Paths. New York/Berlin: Academic Press/Springer Verlag, 1965
22. Jordan, R., Josserand, C.: Self-organization in nonlinear wave turbulence. Phys. Rev. E 61, 1527–1539 (2000)
23. Jordan, R., Josserand, C.: Statistical equilibrium states for the nonlinear Schrödinger equation. Math. Comp. Simulation 55, 433–447 (2001)
24. Jordan, R., Turkington, B.: Ideal magnetofluid turbulence in two dimensions. J. Stat. Phys. 87, 661–695 (1997)
25. Jordan, R., Turkington, B., Zirbel, C.L.: A mean-field statistical theory for the nonlinear Schrödinger equation. Physica D 137, 353–378 (2000)
26. Kevrekidis, P.G., Rasmussen, K.O., Bishop, A.R.: The discrete nonlinear Schrödinger equation: A survey of recent results. Int. J. Mod. Phys. B 15, 2833–2900 (2001)
27. Lebowitz, J.L., Rose, H.A., Speer, E.R.: Statistical mechanics of a nonlinear Schrödinger equation. J. Stat. Phys. 50, 657–687 (1988)
28. Majda, A.J., McLaughlin, D.W., Tabak, E.G.: A one-dimensional model for dispersive wave turbulence. J. Nonlinear Sci. 7, 9–44 (1997)
29. Marcus, P.S.: Jupiter's Great Red Spot and other vortices. Annual Rev. Astronomy and Astrophys. 31, 523–573 (1993)
30. McKean, H.P.: Statistical mechanics of nonlinear wave equations IV. Cubic Schrödinger. Commun. Math. Phys. 168, 479–491 (1995)
31. Rasmussen, J.J., Rypdal, K.: Blow-up in nonlinear Schroedinger equations–I: A general review. Physica Scripta 33, 481–504 (1986)
32. Segre, E., Kida, S.: Late states of incompressible 2d decaying vorticity fields. Fluid Dyn. Res. 23, 89–112 (1998)
33. Turkington, B., Majda, A.J., Haven, K., DiBattista, M.: Statistical equilibrium predictions of jets and spots on Jupiter. Proc. Nat. Acad. Sci. USA 98, 12346–12350 (2001)
34. Zakharov, V.E., Pushkarev, A.N., Shvets, V.F., Yan'kov, V.V.: Soliton turbulence. JETP Lett. 48, 83–86 (1988)
35. Zhidkov, P.E.: On an invariant measure for a nonlinear Schrödinger equation. Soviet Math. Dokl. 43, 431–434 (1991)

Communicated by P. Constantin
Commun. Math. Phys. 244, 209–244 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0991-5
Communications in
Mathematical Physics
PCT Theorem for the Operator Product Expansion in Curved Spacetime Stefan Hollands Enrico Fermi Institute, Department of Physics, University of Chicago, 5640 Ellis Ave., Chicago, IL 60637, USA. E-mail:
[email protected] Received: 17 December 2002 / Accepted: 20 June 2003 Published online: 27 November 2003 – © Springer-Verlag 2003
Abstract: We consider the operator product expansion for quantum field theories on general analytic 4-dimensional curved spacetimes within an axiomatic framework. We prove under certain general, model-independent assumptions that such an expansion necessarily has to be invariant under a simultaneous reversal of parity, time, and charge (PCT) in the following sense: The coefficients in the expansion of a product of fields on a curved spacetime with a given choice of time and space orientation are equal (modulo complex conjugation) to the coefficients for the product of the corresponding charge conjugate fields on the spacetime with the opposite time and space orientation. We propose that this result should be viewed as a replacement of the usual PCT theorem in Minkowski spacetime, at least in as far as the algebraic structure of the quantum fields at short distances is concerned.

1. Introduction

The operator product expansion [1, 2] states that the product of any finite number of field operators localized at nearby points can be approximated by a sum of products of c-number coefficient functions of the coordinates of the points relative to a reference point, times fields that are localized at the reference point. Furthermore, of these coefficient functions, only finitely many are singular as the spacetime points approach the reference point. In mathematical symbols, if the fields in the theory are denoted by the generic symbol $\phi^{(i)}$ (with $(i)$ a label that distinguishes the various kinds of fields), then the operator expansion is an expansion of the form
\[
\phi^{(1)}(y_1)\phi^{(2)}(y_2) \cdots \phi^{(n)}(y_n) \sim \sum_{(i)} c^{(i)}(y_1, y_2, \dots, y_n)\,\phi^{(i)}(x). \tag{1}
\]
The notation “∼” means that the difference between the expectation value in any “reasonable” state of the left side and the expectation value of a suitable finite partial sum on the right side goes to zero as the points $y_1, \dots, y_n$ approach the reference point$^1$, $x$. Moreover, the rate at which this difference goes to zero can be made arbitrary by including sufficiently many terms in the partial sum. In practice, the operator product expansion is useful to find approximate expressions at short distances (high momenta) for the expectation value of a product of $n$ fields when the corresponding expectation values of the singly localized fields $\phi^{(i)}$ on the right side of Eq. (1) are known, for example experimentally, and where the coefficients $c^{(i)}$ can be calculated. Such techniques have been used successfully e.g. to gain insights into the internal structure of hadrons.

In Minkowski spacetime, model-independent derivations of the operator product expansion from first principles within an axiomatic framework, including a precise specification of the nature of states for which it holds, have been given in various contexts [3–7]. The derivation that is in our opinion most general and physically best motivated seems to be that of [7] (which is based in turn on earlier work by [8, 9]), and our analysis builds partly on the results and ideas of this work. More formal proofs of the validity of operator product expansion order by order in perturbation theory within quantum field theoretic models derived from a renormalizable Lagrangian had in fact been established much earlier [2].

In general curved spacetimes, a derivation of the operator product expansion from first principles is not available at present$^2$. In this paper, we will not investigate this important issue, but consider instead the simpler question which properties of the operator product expansion can be derived in curved spacetime if one assumes that such an expansion exists in a suitable sense and that it has certain general model-independent properties.
Specifically, we are going to derive the following result about the invariance of the operator product expansion under parity, charge conjugation, and time reversal (PCT) in a general 4-dimensional analytic curved spacetime: If the operator product expansion has the properties³ (L) that the distributional coefficients $c^{(i)}$ depend only locally on the metric (in a generally covariant manner), (M) that they satisfy a suitable “microlocal” spectrum condition [10], and (A) that they vary analytically under analytic variations of the spacetime metric, then the operator product expansion will automatically have an invariance under reversal of the space and time orientations of the spacetime, and charge conjugation of the fields (if the theory contains any charged fields). Since the coefficients $c^{(i)}$ in the operator product expansion can be viewed in some sense as “structure constants” of the algebra of quantum fields, one can interpret this result as showing that the algebraic structure of the quantum fields in curved spacetime is invariant under PCT at short distances, at least under the above general assumptions. The status of our assumptions is the following: We will argue that property (M) is satisfied for the operator product expansion in Minkowski spacetime as constructed in [7], and we will show elsewhere [11] that properties (L), (M), and (A) are satisfied for perturbative constructions of the operator product expansion in an arbitrary curved spacetime.
¹ One can of course take advantage of translation invariance in Minkowski spacetime to set the reference point $x$ to the origin in Minkowski space, which is done in usual formulations of the operator product expansion. We are avoiding this since it has no invariant meaning on a curved spacetime.
² Heuristically, one expects that if an operator product expansion holds for a theory in Minkowski spacetime, then it also holds for the corresponding theory in curved spacetime, since, essentially by the “Einstein equivalence principle,” the short distance behavior of a quantum field theory in curved spacetime should be the “same” as that of the corresponding field theory in Minkowski spacetime.
³ We emphasize in particular that we do not assume that our model is derived from a Lagrangian.
We would now like to qualify our statement about the PCT invariance of the operator product expansion and distinguish it from corresponding statements about (global) PCT invariance. For quantum field theories in Minkowski spacetime satisfying the Wightman axioms or the axioms of algebraic quantum field theory, one can show [12–14] that PCT is always implemented by an anti-unitary operator $\Theta$ on the vacuum Hilbert space of the theory, in the sense that $\Theta \phi(x) \Theta^{-1} = i^F (-1)^M \phi(-x)^*$, where $M$ is the number of unprimed spinor indices of the field, and where $F$ is zero if the field is bosonic, and one if it is fermionic. If the theory has an operator product expansion like Eq. (1), then one easily finds that this expansion has a similar invariance under PCT by acting with $\Theta$ on both sides of this expansion. Hence, in Minkowski spacetime, the PCT invariance of the operator product expansion is a simple and direct consequence of the (global) PCT invariance of the theory. On the other hand, it is clear that for a quantum field theory defined only on a single, fixed curved spacetime, one cannot in general even formulate the notion of PCT symmetry in the same manner as in Minkowski spacetime, since a generic spacetime does not possess any isometries analogous to parity ($\mathbf{x} \to -\mathbf{x}$) and time reversal ($t \to -t$) in Minkowski spacetime. Nevertheless, a notion of (global) PCT symmetry can naturally be formulated in a theory that is consistently given on all oriented and time oriented globally hyperbolic spacetimes in the sense of a generally covariant quantum field theory as recently introduced in [15, 16]: It is natural to say that a generally covariant quantum field theory is globally PCT invariant if the algebras of observables⁴ corresponding to any given spacetime and to that spacetime equipped with the opposite orientations are isomorphic, and if any quantum field on the spacetime with the original orientation is mapped to the charge conjugate field on the spacetime with the opposite orientation under this isomorphism.
In Minkowski spacetime, the map $t \to -t$, $\mathbf{x} \to -\mathbf{x}$ is an isometry reversing the space and time orientations, so the notion of PCT invariance of a theory as just stated for general curved spacetimes reduces to the usual notion of PCT invariance in Minkowski spacetime (with the isomorphism given by $A \to \Theta A \Theta^*$). It is not known at present whether and under what circumstances PCT invariance as formulated above holds in general curved spacetimes. Consequently, a corresponding PCT invariance of the operator product expansion in curved spacetime does not automatically follow in the same straightforward manner as in Minkowski spacetime. As we show in this paper, PCT invariance of the operator product expansion can nevertheless be proven under the above general and model-independent assumptions (M), (L), and (A). As we have already mentioned, this result may be viewed as an “infinitesimal” version of the PCT-theorem in curved spacetime, in the sense that it proves PCT-invariance of the algebraic relations between the quantum fields at short distances.
Our strategy for proving this result is the following. Using that the coefficients in the operator product expansion depend locally and covariantly on the spacetime metric and the spacetime orientations, and using that this dependence is analytic, we show, using the ideas of [17], that each coefficient can be expanded into a sum of terms, each of which is a product of a curvature tensor at the reference point, $x$, times a Lorentz-invariant Minkowski space distribution in the Riemannian normal coordinates of the points $y_i$ relative to $x$. The PCT invariance of the coefficients in the operator product expansion is then seen to follow if these Minkowski space distributions are invariant (up to permutation of the arguments and a combinatorial factor) under a reflection of the Riemannian normal coordinates of the $n$ points $y_i$ about the origin.
In order to show that this invariance indeed holds, we use the microlocal spectrum condition to show that our Minkowski space distributions arise as the boundary value of certain analytic functions. The desired invariance is then shown using the transformation properties of these analytic functions under complex Lorentz transformations on a suitable complex domain by methods that are similar to the proof of the PCT-theorem in Minkowski spacetime [12]. We remark that, while our proof makes essential use of the fact that the spacetime is real analytic, we expect that our result can be generalized to spacetimes that are only smooth by approximating such spacetimes with a sequence of real analytic spacetimes and by making suitable additional assumptions about the “continuity” of the operator product expansion (of the kind introduced in [17]) under such approximations.
⁴ The algebra of observables associated with a given spacetime is the abstract *-algebra generated by quantum fields smeared with testfunctions of compact support in the spacetime.
The organization of this paper is as follows. In Sect. 2, we recall the notion of a generally covariant quantum field theory in curved spacetime. In Sect. 3, we give a precise formulation of the operator product expansion in curved spacetime and in Sect. 4 we state our technical assumptions concerning the properties of this expansion. Section 5 contains the main result of this paper (Theorem 5.1).
Our conventions and notations related to the spacetime geometry are as follows: We view a spacetime as a triple $M = (M, g_{ab}, o)$, where $M$ is a 4-dimensional manifold, $g_{ab}$ is a metric tensor of signature $(+ - - -)$, and $o$ denotes space and time orientations, represented by a tuple $(T, \epsilon_{abcd})$, where $T$ is a time function on $M$ and $\epsilon_{abcd}$ is a nowhere vanishing volume form. Throughout, we assume the manifold structure of $M$ and the spacetime metric $g_{ab}$ to be real analytic. We denote (abstract) tensor indices by lower case letters of the Roman alphabet and the components of a tensor in a coordinate chart by letters of the Greek alphabet.
2. Mathematical Formulation of Quantum Field Theory on Curved Spacetimes
The usual formulations of quantum field theory on Minkowski spacetime rely heavily on the existence of a preferred vacuum state and the special properties of that state. The existence of such a state is, in turn, tied up with the special symmetries of Minkowski spacetime, and indeed, there is no preferred state, nor even any preferred Hilbert space construction that can be singled out for special consideration on a generic curved spacetime. Moreover, apart from a very limited class of spacetimes such as static ones, most Lorentzian spacetimes cannot be viewed as a real section of a complex spacetime that also possesses a real, Euclidean section, so a formulation of quantum field theories on generic Lorentzian spacetimes via Euclidean methods such as the Euclidean path-integral is not possible⁵ in general. Fortunately, there is a simple and fully satisfactory way to formulate quantum field theory in curved spacetime which bypasses all of these problems, namely the so-called “(generally covariant) algebraic approach to quantum field theory” [15, 16] (for a review of the algebraic approach to quantum field theory in Minkowski spacetime, see [18]). In this framework, a (generally covariant) quantum field theory is viewed as an assignment that associates with every oriented and time-oriented spacetime $M \equiv (M, g_{ab}, o)$ an abstract *-algebra⁶ $\mathcal{A}(M)$ with unit whose elements are the observables⁷ of the theory.
⁵ By this we do not mean that it is not worthwhile to study the Euclidean path integral in curved space, or other related quantities, such as e.g. “effective actions”. What we mean is that the physical interpretation of such quantities and their properties is very unclear unless the Euclidean spacetime under consideration has a real, Lorentzian section. This is a very severe restriction that excludes essentially all spacetimes that are not static.
⁶ In [15], these algebras were assumed to be $C^*$-algebras. This is too restrictive for the purposes of the present paper since we also want $\mathcal{A}(M)$ to contain unbounded elements.
⁷ We use the term “observable” somewhat sloppily since we will allow in the algebras also quantities that anti-commute (rather than commute) at spacelike separations.
The features of locality and general covariance of a quantum field theory are reflected in the following consistency properties of this assignment: Consider a situation in which we are given an isometric embedding $\chi : N \to M$ of a spacetime $N$ into a spacetime $M$ which preserves the causal structure and the orientations, meaning that if $o = (T, \epsilon_{abcd})$ is the orientation of $M$, then $(\chi^* T, \chi^* \epsilon_{abcd})$ coincides with the orientation of $N$. Then we postulate that there exists an injective *-homomorphism
$$\alpha_\chi : \mathcal{A}(N) \to \mathcal{A}(M). \tag{2}$$
Furthermore, if $\chi_1$ and $\chi_2$ are isometric embeddings with the above properties such that the composition $\chi_1 \circ \chi_2$ can be defined (and consequently defines again an isometric embedding with these properties), then we postulate that
$$\alpha_{\chi_1 \circ \chi_2} = \alpha_{\chi_1} \circ \alpha_{\chi_2}. \tag{3}$$
The existence of the algebraic isomorphism $\alpha_\chi$ in Eq. (2) with the property (3) formalizes the idea that observables associated with a spacetime $N$ that is isometric to a globally hyperbolic subregion of a larger spacetime $M$ can be viewed via $\alpha_\chi$ as observables in the larger spacetime satisfying the same algebraic relations. This can be interpreted as saying that the algebraic relations between the observables depend locally and covariantly on the metric. If $M$ is Minkowski spacetime with a given choice of orientations, then the orientation and causality preserving (global) isometries of $M$ are given precisely by the translations $x \to x + a$, where $a \in \mathbb{R}^4$, together with the proper orthochronous Lorentz transformations $x \to \Lambda x$, where $\Lambda \in L_+^\uparrow$. Thus, in the special case of Minkowski spacetime, our axioms say that the Poincaré group acts on the algebra of observables by a group of *-automorphisms $\alpha_{\{\Lambda,a\}}$. Requirements Eq. (2) and Eq. (3) may therefore be viewed as a replacement of the notion of Lorentz-covariance of a quantum field theory in Minkowski spacetime by the notion of general covariance.
In order to formulate the notion of local commutativity or local anti-commutativity in this algebraic framework, we need to assume that algebraic elements $A \in \mathcal{A}(M)$ can be uniquely decomposed into a “bosonic” and a “fermionic” part. This is formalized by requiring that there exists a *-automorphism $\gamma_M$ for every oriented spacetime $M$ with the property $(\gamma_M)^2 = 1$ and $\gamma_M = \gamma_N \circ \alpha_\chi$ whenever $\chi$ is an orientation and causality preserving isometric embedding from $N$ into $M$. We can then uniquely decompose $A = A_+ + A_-$, where $\gamma_M(A_\pm) = \pm A_\pm$, and we call $A_+$ the bosonic and $A_-$ the fermionic part of $A$.
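As a brief check of the decomposition just described, the bosonic and fermionic parts can be written down explicitly. The following is a minimal sketch under the assumptions stated in the text, namely that $\gamma_M$ is linear and satisfies $(\gamma_M)^2 = 1$; the projection formula itself is our own illustration and is not quoted from the source.

```latex
% Sketch: explicit even/odd parts of A under the grading automorphism \gamma_M.
% Assumes only linearity of \gamma_M and (\gamma_M)^2 = 1.
\begin{align*}
A_\pm &:= \tfrac{1}{2}\bigl(A \pm \gamma_M(A)\bigr),
  & A_+ + A_- &= A, \\
\gamma_M(A_\pm) &= \tfrac{1}{2}\bigl(\gamma_M(A) \pm \gamma_M^2(A)\bigr)
  = \tfrac{1}{2}\bigl(\gamma_M(A) \pm A\bigr) = \pm A_\pm .
\end{align*}
% Uniqueness: if A = B_+ + B_- with \gamma_M(B_\pm) = \pm B_\pm, then
% \gamma_M(A) = B_+ - B_-, hence B_\pm = \tfrac{1}{2}(A \pm \gamma_M(A)) = A_\pm.
```

Since $\gamma_M$ is multiplicative, products of two bosonic or two fermionic elements are bosonic, while a product of one bosonic and one fermionic element is fermionic.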
Given now two isometric embeddings $\chi_i : N_i \to M$, $i = 1, 2$, such that the image of $N_1$ under $\chi_1$ in $M$ is spacelike related to the image of $N_2$ under $\chi_2$, our requirement of local (anti-) commutativity is
$$[\alpha_{\chi_1}(A_1), \alpha_{\chi_2}(A_2)]_\gamma = 0 \tag{4}$$
for all $A_1 \in \mathcal{A}(N_1)$ and $A_2 \in \mathcal{A}(N_2)$, where
$$[A, B]_\gamma = \begin{cases} AB + BA & A, B \text{ fermionic}, \\ AB - BA & A \text{ or } B \text{ bosonic}, \end{cases} \tag{5}$$
is the graded commutator.
The algebras of observables, $\mathcal{A}(M)$, were referred to as “abstract”, because it has not been assumed that their elements are represented as linear operators on some particular Hilbert space. This is of great conceptual advantage, because there exist in general
many inequivalent representations of which no particular one can be singled out for special consideration. The quantum states are simply all linear functionals $\omega : \mathcal{A}(M) \to \mathbb{C}$, $A \to \langle A \rangle_\omega$, from the algebra associated with that spacetime with values in the complex numbers, which are positive in the sense that $\langle A^* A \rangle_\omega \ge 0$ for all $A \in \mathcal{A}(M)$, and which are normalized in the sense that $\langle \mathbf{1} \rangle_\omega = 1$, where $\mathbf{1}$ is the identity element. By formulating the theory in terms of abstract algebras, we have therefore avoided prejudicing ourselves towards the particular class of states that can be represented as vectors or density matrices in some particular representation. States of particular interest may be singled out for example in spacetimes which happen to have symmetries or suitable asymptotic regions, or in models with additional internal symmetries, but we emphasize that the question whether such choices are possible is not in any way related to the algebraic structure of $\mathcal{A}(M)$, and hence does not affect the formulation of the quantum field theory.
A local covariant (scalar) field, $\phi$, is an assignment which associates with every spacetime $M$ a linear map
$$\phi_M : \mathcal{D}(M) \to \mathcal{A}(M), \quad f \to \phi_M(f), \tag{6}$$
from the space $\mathcal{D}(M)$ of all smooth compactly supported functions on $M$ to $\mathcal{A}(M)$. The locality and covariance property of the field is encoded in the requirement that
$$\alpha_\chi(\phi_N(f)) = \phi_M(\chi_* f), \tag{7}$$
whenever $\chi : N \to M$ is an orientation and causality preserving isometric embedding of a spacetime $N$ into a spacetime $M$, and where $\chi_* f$ denotes the testfunction on $M$ corresponding to the testfunction $f$ on $N$ via the map $\chi$. The above transformation law (7) expresses (a) that the field $\phi(x)$ is constructed entirely out of the metric in an arbitrarily small neighborhood of the point $x$, and (b) that it is constructed out of the metric in a generally covariant way. In the case when $M$ is Minkowski spacetime and $\chi = \{\Lambda, a\}$ is an element of the Poincaré group, Eq. (7) specializes to $\alpha_{\{\Lambda,a\}}(\phi(x)) = \phi(\Lambda x + a)$, which is the familiar special relativistic transformation law for a scalar field.
The above definition of a local covariant quantum field of scalar type can be generalized in a relatively straightforward manner to fields of arbitrary spinor type. The main new issue is that the definition of spinors in curved spacetime requires the existence and specification of a spin structure. We will consequently assume that the spacetimes under consideration can be equipped with a spin structure (for matters related to spinors in curved spacetime, see Appendix B). Since we want the quantum fields of spinor type to be elements in $\mathcal{A}(M)$ after smearing with a suitable testfunction, we will now view these algebras as depending not only on the spacetime metric and orientations, but also on the particular choice of spin structure, if several inequivalent spin structures are possible. The above locality and covariance property (2) and the local (anti-) commutativity property (4) of the assignment $M \to \mathcal{A}(M)$ are then formulated in terms of embedding maps $\chi : N \to M$ which not only preserve the metric structure and orientation, but in addition also lift to a homomorphism between the spin structures on $N$ and $M$, respectively. Moreover, local and covariant quantum fields of spinor type are defined as assignments
$$\phi_M : \mathcal{D}(M; \mathcal{F}(M)) \to \mathcal{A}(M), \quad f \to \phi_M(f), \tag{8}$$
where F(M) is the appropriate tensor product of vector bundles V(M), V (M), V ∗ (M) and V ∗ (M) corresponding respectively to the unprimed, primed, upper primed, and upper unprimed spinor indices of the field. The local and covariant transformation property of these fields is formulated as in the scalar case, Eq. (7), in terms of embedding
maps χ : N → M. The new feature is that these maps are now also required to preserve orientation and time orientation and to lift to a corresponding map between the respective spinor structures. Moreover, the symbol χ∗ f in Eq. (7) now denotes the compactly supported section in the vector bundle F(M) that corresponds to the compactly supported section f in the bundle F(N ) via the lift of the map χ to the spin structures over N and M. For the quantum field theories that we consider in this paper, we assume that there are countably many local covariant fields, which we shall denote by the generic symbol φ (i) , where i ∈ N is a label that distinguishes the various fields. Note that, since the grading maps satisfy γM = γN ◦ αχ for every orientation and causality preserving isometric embedding, we can consistently decompose any local and covariant field into its bosonic and fermionic parts for all spacetimes. Thus, without loss of generality, we can assume that a local covariant field is either bosonic or fermionic. We emphasize however that we do not assume that all half odd-integer spin fields are fermionic and that all integer spin fields are bosonic. Such a relation between spin and statistics has been proven recently by Verch [19], but the technical assumptions made in [19] are not identical with the technical assumptions we will be making here. The proof of our main result on the other hand does not rely on the spin-statistics relation, so we will avoid assuming the spin-statistics relation in this paper. We emphasize that the axiomatic framework we have set up so far says nothing a priori about the relation between the field observables and corresponding algebras associated with globally isometric spacetimes carrying different orientations. 
However, it is natural to conjecture that the framework (or, more likely, some extended version or variant thereof) implies a symmetry of the theory if the orientations of the spacetime are changed in a way corresponding to the reversal of time and parity in Minkowski spacetime. Namely, let $M = (M, g_{ab}, o)$ be a given globally hyperbolic spacetime with orientations $o = (T, \epsilon_{abcd})$, and let $\bar M = (M, g_{ab}, -o)$ be the same spacetime with orientations $-o = (-T, \epsilon_{abcd})$, i.e., with the same orientation and the opposite time orientation. In this situation, a (as yet, hypothetical) PCT-theorem in curved spacetime can be formulated as follows: There exists, for each $M$, an anti-linear *-isomorphism,
$$\theta^{PCT}_M : \mathcal{A}(M) \to \mathcal{A}(\bar M), \tag{9}$$
such that
$$\theta^{PCT}_{\bar M} \circ \theta^{PCT}_M = \mathrm{id}, \tag{10}$$
which is consistently given for all $M$ in the following sense. If $\chi : N \to M$ is a causality and orientation preserving embedding (lifting to a homomorphism of the spin structures over $N$ and $M$ compatible with $o$) and if $\bar\chi : \bar N \to \bar M$ is the corresponding embedding between the spacetimes with orientation $-o$, then
$$\theta^{PCT}_M \circ \alpha_\chi = \alpha_{\bar\chi} \circ \theta^{PCT}_N. \tag{11}$$
The above consistency condition formally expresses the demand that the map $\theta^{PCT}_M$ is locally and covariantly constructed out of the metric. It implies in particular that $\theta^{PCT}_M$ maps local, covariant fields over $M$ to local covariant fields over $\bar M$. By analogy to Minkowski spacetime, we conjecture that, more precisely, $\theta^{PCT}_M$ can be chosen in such a way that
$$\theta^{PCT}_M(\phi_M(f)) = i^F (-1)^M \phi_{\bar M}(f)^*, \tag{12}$$
where $F$ is 0 or 1 if the field is bosonic resp. fermionic, and where $M$ is the number of unprimed spinor indices of the field.
It is perhaps not immediately obvious how the above formulation of the PCT-property in the generally covariant framework is related to the usual PCT-theorem in Minkowski space, and so we briefly explain this point. In Minkowski spacetime consider the map $\chi : x \to -x$. In the framework of Wightman field theory, one proves [13, 12] the existence of an anti-unitary map $\Theta$ on the Hilbert space on which the fields are represented as operators, such that $\Theta \phi_M(f) \Theta^{-1} = i^F (-1)^M \phi_M(\chi_* f)^*$ for all local fields on Minkowski spacetime $M = (\mathbb{R}^4, \eta_{ab}, o)$ with a given set of orientations $o$. The map $\chi$ is a global orientation preserving isometry⁸ between Minkowski spacetime with a given set of orientations $o$, and Minkowski spacetime $\bar M = (\mathbb{R}^4, \eta_{ab}, -o)$ with the orientations $-o$. Hence, by the general covariance principle Eq. (7) applied to the special case of Minkowski spacetime, we get a corresponding *-isomorphism $\alpha_\chi : \mathcal{A}(M) \to \mathcal{A}(\bar M)$ satisfying $\alpha_\chi(\phi_M(f)) = \phi_{\bar M}(\chi_* f)$. Thus, the composition
$$\theta^{PCT}_M \equiv \alpha_\chi \circ \mathrm{Ad}_\Theta \tag{13}$$
(where $\mathrm{Ad}_\Theta(A) = \Theta A \Theta^{-1}$) defines a PCT-map with the desired properties in the special case of Minkowski spacetime. Note that, in Minkowski spacetime, it is not actually necessary to consider spacetimes with opposite orientations and one can simply view $\mathrm{Ad}_\Theta : \mathcal{A}(M) \to \mathcal{A}(M)$ itself as the PCT-map (as is of course done in all discussions of the PCT theorem in Minkowski space). However, it is clearly not possible to generalize this formulation of PCT to general curved spacetimes without symmetries analogous to $x \to -x$. On the other hand, the above formulation of PCT via the maps $\theta^{PCT}_M$ does not rely on the existence of any such symmetries. As we argue in remark (3) following our main Theorem 5.1 below, the results of this paper in some sense prove the existence of the maps $\theta^{PCT}_M$ in the sense of an asymptotic expansion of the algebra structure near a spacetime point (for real analytic spacetimes). Thus, our results support the conjecture that the maps $\theta^{PCT}_M$ with the above properties indeed exist.
3. Formulation of the Operator Product Expansion in Curved Spacetime
In the last section we have reviewed the formulation of quantum field theory in curved spacetime as an assignment of *-algebras of observables to spacetimes, and we have introduced local, covariant quantum fields as suitable assignments to spacetimes of elements in the algebra of observables associated with the spacetime. We now wish to study quantum field theories in curved spacetime that possess in addition an operator product expansion. Let $\Sigma(M)$ be the space of all complex linear functionals on $\mathcal{A}(M)$,
$$\Sigma(M) = \{\sigma : \mathcal{A}(M) \to \mathbb{C} \mid \sigma(c_1 A_1 + c_2 A_2) = c_1 \sigma(A_1) + c_2 \sigma(A_2)\}. \tag{14}$$
We say that such a functional is real, if $\sigma(A^*) = \overline{\sigma(A)}$ for all $A$. Quantum states $\omega : A \to \langle A \rangle_\omega$ are normalized and positive elements of $\Sigma(M)$. The proof [7] of the operator product expansion in Minkowski spacetime suggests that one should view the coefficients $c^{(i)}$ appearing in the operator product expansion (1) as being the $n$-point functions of certain “standard” linear functionals $\sigma^{(i)}$ on $\mathcal{A}(M)$, where $(i)$ is a label that distinguishes the various local and covariant fields in the theory. We will adopt this viewpoint in our formulation of the operator product expansion in curved spacetime. Our (as yet, still formal) definition of a local, covariant quantum field theory possessing an operator product expansion is then as follows:
⁸ Note that $\chi$ is an orientation reversing isometry from $M$ to $M$.
Definition 3.1. We say that a local covariant quantum field theory (with only scalar fields) possesses an operator product expansion, if for any space and time oriented spacetime $M$ and any point $x \in M$ there exist linear functionals
$$\sigma^{(i)}_{M,x} \in \Sigma(M) \tag{15}$$
such that
$$\sigma^{(i)}_{M,x} \circ \gamma_M = (-1)^{F^{(i)}} \sigma^{(i)}_{M,x}, \quad F^{(i)} = \begin{cases} 0 & \text{if } \phi^{(i)} \text{ is bosonic}, \\ 1 & \text{if } \phi^{(i)} \text{ is fermionic}, \end{cases} \tag{16}$$
and
$$\Big\langle \prod_{k=1}^n \phi^{(j_k)}_M(y_k) \Big\rangle_\omega - \sum_{i=1}^N \sigma^{(i)}_{M,x}\Big( \prod_{k=1}^n \phi^{(j_k)}_M(y_k) \Big) \Big\langle \phi^{(i)}_M(x) \Big\rangle_\omega \to 0, \tag{17}$$
as $(y_1, \dots, y_n) \to (x, \dots, x)$ and as $N \to \infty$, for all suitable states $\omega$ on $\mathcal{A}(M)$, and any collection of fields $\phi^{(j_1)}, \dots, \phi^{(j_n)}$.
Remarks. (1) The coefficients $c^{(i)}$ in our previous expression for the operator product expansion (1) correspond to the standard functionals $\sigma^{(i)}$ in the above formulation via
$$c^{(i)}_{M,x}(y_1, \dots, y_n) = \sigma^{(i)}_{M,x}(\phi^{(j_1)}_M(y_1) \cdots \phi^{(j_n)}_M(y_n)), \tag{18}$$
where we have now put a subscript “$M, x$” on the coefficients $c^{(i)}$ in order to indicate the dependence on the spacetime and the reference point, $x$. Condition Eq. (16) expresses the demand that each term in the operator product expansion has the same fermion number modulo 2.
(2) The above definition can be generalized in a straightforward way to theories that contain not only scalar fields but fields of arbitrary spinor type, in which case all quantities depend in addition on a choice of spin structure over $M$ which is compatible with the space and time orientations (see Appendix B for details). Since local covariant fields of spinor type take testfunctions as entries that are sections in a vector bundle $\mathcal{F}(M)$ corresponding to the spinor type of the field, it is natural in this case to view $\sigma^{(i)}_{M,x}$ as linear functionals on $\mathcal{A}(M)$ taking values not in $\mathbb{C}$ but instead in the complex vector space $\mathcal{F}_x(M)$, where we mean the fibre of this vector bundle over $x$.
To make the above definition mathematically precise, we still need to specify (a) the precise nature of the states $\omega$ that are allowed in Eq. (17), as well as the nature of the functionals $\sigma^{(i)}_{M,x}$, and (b) the precise sense in which the expression (17) tends to 0. We now turn to these tasks.
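Before doing so, it may help to see what Definition 3.1 and Eq. (18) amount to in the simplest familiar case. The following sketch is a standard textbook illustration and is not taken from the source: it assumes a free scalar field in Minkowski spacetime, $n = 2$, the vacuum two-point function $W_0$, normal ordering with respect to the vacuum, a state $\omega$ whose normal-ordered two-point function is smooth, and it treats the identity as one of the fields labelled by $(i)$.

```latex
% Illustrative sketch (our assumption, not the source's construction):
% leading OPE terms for a free scalar field, with \xi_k := y_k - x.
\begin{align*}
\phi(y_1)\phi(y_2) &= W_0(y_1,y_2)\,\mathbf{1} + {:}\phi(y_1)\phi(y_2){:} , \\
\langle {:}\phi(y_1)\phi(y_2){:} \rangle_\omega
  &= \langle {:}\phi^2{:}(x) \rangle_\omega
   + (\xi_1 + \xi_2)^\mu \langle {:}\phi\,\partial_\mu\phi{:}(x) \rangle_\omega
   + O(|\xi|^2) .
\end{align*}
% Coefficients read off in the sense of Eq. (18):
%   identity field:                c(y_1,y_2) = W_0(y_1,y_2),
%   field :\phi^2: :               c(y_1,y_2) = 1,
%   field :\phi \partial_\mu \phi: c(y_1,y_2) = (\xi_1 + \xi_2)^\mu .
```

In this example each further order of the Taylor expansion contributes fields with more derivatives and leaves a remainder that vanishes faster as $y_1, y_2 \to x$, which is the behaviour that the convergence requirement on expression (17) is meant to capture.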
Given a spacetime $M$, a collection $\phi^{(j_1)}, \dots, \phi^{(j_n)}$ of local covariant fields and a functional $\sigma \in \Sigma(M)$, we consider the multi-linear functional
$$\times^n \mathcal{D}(M) \to \mathbb{C}, \quad (f_1, \dots, f_n) \to \sigma(\phi^{(j_1)}_M(f_1) \cdots \phi^{(j_n)}_M(f_n)) \tag{19}$$
on the $n$-fold cartesian product of the space of testfunctions on $M$, where for simplicity we assume that all the fields are scalar. The regularity properties of a functional $\sigma$ may be specified by specifying regularity properties for the functionals (19) for an arbitrary set of local covariant fields. Firstly, we will ask that the functionals (19) are distributions on $\times^n M$, i.e., that they are continuous with respect to the Laurent Schwartz topology on the spaces of testfunctions⁹ $\times^n \mathcal{D}(M)$. Among these, we now further restrict our attention to those functionals $\sigma$ for which the distributions (19) have a particular singularity structure specified by the following “microlocal spectrum condition” [10]:
$$\mathrm{WF}_A(\sigma(\phi^{(j_1)}_M(y_1) \cdots \phi^{(j_n)}_M(y_n))) \subset \Gamma_M, \tag{20}$$
where $\mathrm{WF}_A$ is the “analytic wave front set” [20] of a distribution¹⁰, and where $\Gamma_M \subset T^*(\times^n M) \setminus \{0\}$ is defined in terms of the geometry as follows: Let $G(p)$ be a “decorated embedded graph” in $M$. By this we mean an embedded graph in $M$ whose vertices are points $x_1, \dots, x_n$ in $M$ and whose edges, $e$, are piecewise smooth curves¹¹ $\gamma$ in $M$ connecting the vertices. Each such edge $e$ is equipped with a future pointing timelike or null coparallel covectorfield $(p_e)_a$, meaning that
$$\dot\gamma^a \nabla_a (p_e)_b = 0, \quad g^{ab} (p_e)_a (p_e)_b \ge 0, \quad (p_e)^a \nabla_a T > 0, \tag{21}$$
where $T$ is the time function that defines the time orientation of $M$. If $e$ is an edge in $G(p)$ connecting the points $x_i$ and $x_j$ with $i < j$, then we denote $s(e) = i$ its source and $t(e) = j$ its target. With this notation, we define
$$\Gamma_M = \Big\{ (x_1, k_1; \dots; x_n, k_n) \in T^*(\times^n M) \setminus \{0\} \;\Big|\; \exists \text{ decorated graph } G(p) \text{ with vertices } x_1, \dots, x_n \text{ such that } k_i = \sum_{e: s(e) = i} p_e - \sum_{e: t(e) = i} p_e \ \ \forall i \Big\}. \tag{22}$$
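The definition (22) can be unpacked in the two simplest cases. The following is a sketch of our own reading of Eq. (22), writing $\Gamma_M$ for the set defined there; it assumes, as stated in the text, that every edge joins two vertices $x_i$, $x_j$ with $i < j$, and that in the formula for $k_i$ each covector field $p_e$ is evaluated at the vertex $x_i$. The symbol $\bar V^+_x$ for the closed cone of future pointing causal covectors at $x$ is introduced here only for this illustration.

```latex
% Sketch of Eq. (22) for n = 1 and n = 2 (our illustration, not from the source).
\begin{align*}
n = 1:&\quad \text{there are no edges, so } k_1 = 0; \text{ the zero covector is excluded, hence } \Gamma_M = \emptyset . \\
n = 2:&\quad s(e) = 1,\ t(e) = 2 \text{ for every edge } e, \text{ so} \\
      &\quad k_1 = \sum_e p_e(x_1) \in \bar V^+_{x_1} \setminus \{0\}, \qquad
             k_2 = -\sum_e p_e(x_2) \in -\bigl(\bar V^+_{x_2} \setminus \{0\}\bigr) .
\end{align*}
% In Minkowski spacetime parallel transport is path independent, so
% p_e(x_2) = p_e(x_1), and for x_1 \neq x_2 the n = 2 set reduces to
% \{ (x_1, k; x_2, -k) : k \in \bar V^+ \setminus \{0\} \}.
```

The $n = 1$ case is what lies behind the statement that one-point functions of functionals satisfying Eq. (20) have empty analytic wave front set and are therefore analytic.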
We will denote by
$$\Sigma_A(M) = \Big\{ \sigma \in \Sigma(M) \;\Big|\; \mathrm{WF}_A\Big( \sigma\Big( \prod_{k=1}^n \phi^{(j_k)}_M(y_k) \Big) \Big) \subset \Gamma_M \Big\} \tag{23}$$
the space of all linear functionals such that Eq. (20) holds for an arbitrary set of local covariant fields. Our operator product expansion will be required to hold only for states $\omega \in \Sigma_A(M)$. The analytic wave front set of $\langle \phi^{(i)}(x) \rangle_\omega$ is then empty, meaning that this expression is not just a distribution, but in fact an analytic function in $x$. This implies in particular that the products of distributions implicit in our operator product expansion (17) are automatically well-defined.
⁹ Strictly speaking, we should demand that our functionals are continuously defined on the space $\mathcal{D}(\times^n M)$ rather than continuous multilinear functionals on $\times^n \mathcal{D}(M)$. However, by the “Schwartz Nuclear Theorem”, these requirements are actually equivalent.
¹⁰ Our convention for the Fourier transform in $\mathbb{R}^m$ is $\hat f(k) = (2\pi)^{-m/2} \int e^{+ikx} f(x)\, d^m x$, which is opposite to the convention used in [20]. It follows from this that our definition of the analytic wave front set is minus the definition given in [20].
¹¹ We note that a more restrictive notion of a microlocal spectrum condition would be obtained if we would replace “piecewise smooth curve” by “causal curve” or “null-geodesic”.
We furthermore require that, for all $x \in M$, the standard functionals $\sigma^{(i)}_{M,x}$ are such that Eq. (20) is satisfied in some neighborhood of $x$; in other words, we require:
(M) For every $x \in M$ there exists an isometric embedding $\chi : N \to M$ preserving the orientations such that $x$ is in the image of $N$ under $\chi$, and such that the linear functional on $\mathcal{A}(N)$ defined by
$$\mathcal{A}(N) \ni A \to \sigma^{(i)}_{M,x}(\alpha_\chi(A)) \in \mathbb{C} \tag{24}$$
is an element of $\Sigma_A(N)$. We have thus accomplished (a).
The above microlocal spectrum condition (or rather, an analogous “$C^\infty$”-version thereof) was first proposed by [10], as a replacement for the usual spectrum condition on vacuum states in Minkowski spacetime. It was shown in [10] that it is satisfied in any Wightman quantum field theory in Minkowski spacetime, as well as for so-called quasifree “Hadamard states” [21] in linear quantum field theories in curved spacetimes¹². The above analytic version of this condition is natural in analytic spacetimes and was first proposed in [16]. It is discussed in [23] in connection with long-range correlations in quantum field theories on curved spacetimes, and it was used in [24] to prove the PCT and spin-statistics theorem for certain non-local field theories on Minkowski spacetime.
Our motivation for imposing the microlocal spectrum condition (20) on the operator product expansion comes from the following facts. It was shown in [7] that the operator product expansion in Minkowski spacetime will hold typically only for states $\omega$ that are well-behaved at high energies (for example energy-bounded), and that the standard functionals $\sigma^{(i)}_{M,x}$ can be chosen energy-bounded. On the other hand, one can show that if $M$ is Minkowski spacetime, then every functional with bounded energy satisfies the microlocal spectrum condition. More specifically, assume that the algebra of observables corresponding to Minkowski space admits a faithful representation on a Hilbert space on which the group of automorphisms $\alpha_a$ associated with the translations by a four vector $a^\mu$ is implemented by a strongly continuous group of unitaries, $\alpha_a(A) = e^{i a^\mu P_\mu} A e^{-i a^\mu P_\mu}$, with self-adjoint generator $P$ satisfying the spectrum condition, $\mathrm{spec}\, P \subset \bar V^+$, where $\bar V^+$ is the closure of the future lightcone in Minkowski spacetime, and where $A$ has been identified with the linear operator on the Hilbert space representing it.
We say that a functional $\sigma \in \Sigma(M)$ in Minkowski space has finite energy below $p_0$ (relative to some Lorentz frame) if
$$\sigma(A) = \sigma(E_{p_0} A E_{p_0}) \quad \forall A \in \mathcal{A}(M), \tag{25}$$
where $E_{p_0}$ denotes the projector on the spectral subspace of the Hamiltonian $P^0$ corresponding to energies less than $p_0$, and where we have assumed that $\sigma$ can be identified with a functional on the image of $\mathcal{A}(M)$ under the representation. Then such a $\sigma$ satisfies the microlocal spectrum condition (20) in Minkowski spacetime (a formal proof of this statement, which follows closely a similar argument invented in [10], is given in Appendix A). Furthermore, it seems to be the case that the coefficients in the operator product expansion for free fields in analytic curved spacetimes satisfy our analytic microlocal spectrum condition [11], and we expect this also to be true for perturbatively defined self-interacting quantum field theories in curved spacetimes.
¹² In fact, it can be shown that the Hadamard states as defined in [21] are precisely the states whose two-point function satisfies a suitably strengthened version (see footnote 11 on p. 218) of the $C^\infty$-microlocal spectrum condition [22] (for a discussion of the analytic case, see [23]).
We next turn to our second task (b) to explain the precise sense in which the expression (17) converges to zero. An investigation of the operator product expansion for free fields shows that one can certainly not expect that expression (17) tends to zero in the sense of a convergent sequence of functions, or rather, distributions. Rather, one can only expect that this expression has an arbitrarily low scaling degree as $(y_1, \dots, y_n) \to (x, \dots, x)$ for large $N$ in any given state $\omega \in \Sigma_A(M)$. The model-independent derivation [7] of the operator product expansion in Minkowski spacetime from first principles leads to the same conclusion¹³. We will consequently formulate the convergence of the operator product expansion by demanding that the scaling degree of the expression (17) becomes arbitrarily small when $N \to \infty$.
Let $u$ be a distribution on an open, convex neighborhood $X$ of a point $x$ in $\mathbb{R}^n$. If $\lambda$ is a positive number less than 1 and $f$ is a smooth compactly supported function on $X$, we define another such function $f_\lambda$ by setting $f_\lambda(y) = \lambda^{-n} f(x + \lambda^{-1}(y - x))$. The scaling degree [25], $\delta$, of $u$ at the point $x$ is defined as
$$\delta = \inf\{\gamma \in \mathbb{R}_+ \mid \lim_{\lambda \to 0} \lambda^\gamma u(f_\lambda) = 0 \quad \forall f \in \mathcal{D}(X)\}. \tag{26}$$
The scaling degree of a distribution thus characterizes the strength of its singularity at $x$. It is a completely local concept in that it depends only on the behaviour of $u$ near $x$, and it can be generalized in an invariant manner to distributions on a manifold $X$ by localizing $u$ in a chart near a point $x$ in the manifold. The precise sense in which we assume the operator product expansion to converge is then the following: We ask that for every $\delta < 0$, we can find an $N$ such that the scaling degree of the distribution defined by the left side of expression (17) at $(y_1, \dots, y_n) = (x, \dots, x)$ is less than $\delta$. This accomplishes (b). We note, however, that the PCT-invariance of the operator product expansion that we are going to state and prove in Sect. 5 will follow independently of any assumptions made about the convergence of this expansion at small distances; in other words, property (b) will not be used at all in the proof given in Sect. 5 (see also the remark following Theorem 5.1).
4. Technical Assumptions About the OPE

In the last section we have given a mathematically precise formulation of the operator product expansion in a curved spacetime. In order to be able to prove our main result that the operator product expansion (17) has a PCT-invariance, we will now make the following further assumptions about the nature of this expansion:

(L) The standard functionals $\sigma^{(i)}_{M,x}$ have a local and covariant dependence on the spacetime metric and orientations.

(A) The standard functionals $\sigma^{(i)}_{M,x}$ have a suitable analytic variation under analytic variations of the spacetime metric.

Footnote 13: We remark, however, that the convergence properties of the operator product expansion established in [7] are stronger than the convergence properties postulated here in that they hold uniformly for all states with energy below some arbitrary $p_0$.
Our motivation for imposing (L) and (A) comes from the fact that these properties are satisfied in free field theories in curved spacetime and are also expected to hold in perturbatively defined interacting quantum field theories in curved spacetime [11]. As we explain below, condition (L) is equivalent to the local and covariant dependence on the metric of the coefficients $c^{(i)}$ in the operator product expansion (1). Since these coefficients can be viewed, in some sense, as structure constants for the algebraic relations between the quantum fields at short distances, we may view (L) as a strengthened version of the general covariance property of the quantum field theory under consideration. We now discuss the precise form of (L) and (A) in turn.

In order to formulate our condition that the standard functionals depend locally and covariantly on the metric, it is useful to first define an equivalence relation $\overset{x}{\sim}$ between linear functionals on $\mathcal{A}(M)$ relative to a point $x \in M$ by declaring two such functionals $\varphi_1$ and $\varphi_2$ to be equivalent if they coincide when restricted to some neighborhood of the point $x$, where the restriction of a linear functional on $\mathcal{A}(M)$ to a globally hyperbolic neighborhood $O \subset M$ is defined in the obvious way by viewing $\mathcal{A}(O)$ as a subalgebra of $\mathcal{A}(M)$ via the *-isomorphism Eq. (2) corresponding to the embedding $O \subset M$. The assignment of pairs $(M, x)$, consisting of an oriented, time oriented spacetime $M$ and a point $x \in M$, to the functionals $\sigma^{(i)}_{M,x}$ is then said to be local and covariant if
\[
\sigma^{(i)}_{M,\chi(x)} \circ \alpha_\chi \overset{x}{\sim} \sigma^{(i)}_{N,x} \tag{27}
\]
for any orientation and causality preserving isometric embedding $\chi : N \to M$, where we have assumed for simplicity that all fields in the theory are scalar. In the case when the theory contains spinor fields as well, we consider causality and orientation preserving isometric embeddings $\chi$ that in addition lift to a corresponding map between the spin-structures over $N$ and $M$ respectively. If $F(N)$ is the vector bundle over $N$ corresponding to the spinor type of the field $\phi^{(i)}$, then, as explained in the remark following Def. 3.1, the functional $\sigma^{(i)}_{N,x}$ should be viewed as taking values not in $\mathbb{C}$, but in the finite dimensional vector space $F_x(N)$, and likewise for the functional $\sigma^{(i)}_{M,\chi(x)}$. The analog of Eq. (27) for spinor fields is then
\[
\sigma^{(i)}_{M,\chi(x)} \circ \alpha_\chi \overset{x}{\sim} \chi_* \sigma^{(i)}_{N,x}, \tag{28}
\]
where $\chi_* : F_x(N) \to F_{\chi(x)}(M)$ is the linear map induced by $\chi$. The above locality and covariance conditions (27) and (28) imply that the $n$-point functions (18) of our standard functionals are distributions which are locally and covariantly constructed out of the metric and the orientations near the reference point, $x$. Namely, if $\chi : N \to M$ is an orientation and causality preserving isometric embedding, then it follows immediately from the transformation law of the fields (7) and the functionals (27) that
\[
\chi^* c^{(i)}_{M,\chi(x)}(y_1, \dots, y_n) = c^{(i)}_{N,x}(y_1, \dots, y_n) \tag{29}
\]
in the sense of distributions for all $y_j$ in some neighborhood of the point $x$, where $\chi^*$ denotes the pull-back of a distribution, defined by analogy with the pull-back of a smooth density. The reader may wonder why we are not demanding equality in Eq. (27) rather than only equivalence under $\overset{x}{\sim}$, or alternatively, why we do not impose that relation (29) holds for all $y_j$ in $N$, rather than some neighborhood of the point $x$. The reason for this is that we typically expect the coefficients (18) to contain expressions like the geodesic distance, $s_M(y_1, y_2)$, between two points in $M$ near $x$. Now the geodesic distance
between two points is not a quantity that is locally constructed out of the metric, since the geodesic distance between two points in a spacetime $N$ (even if it can be defined unambiguously) can be made shorter by embedding $N$ into a suitably chosen larger spacetime $M$. Therefore, it is not in general true that $\chi^* s_M = s_N$ for the geodesic distance. On the other hand, it is true that $\chi^* s_M = s_N$ when both sides are restricted to a suitably small neighborhood $O$ of $x$. Our locality and covariance condition as stated above requires only that there is some region, $O$, such that both sides of (27) are equal upon restriction to $O$, but we have not imposed any requirements upon the size of $O$, which could so far vary arbitrarily as the embedding varies. For technical reasons, we must also impose the additional condition that $O$ can be chosen uniformly in the following sense as the embedding $\chi$ varies. We ask that for every spacetime $M$ and point $x$, there exists an open neighborhood $\mathcal{X}$ of the identity in the space $\mathrm{Diff}^A_x(M)$ of analytic diffeomorphisms on $M$ leaving $x$ fixed such that $\sigma^{(i)}_{M,x} \circ \alpha_\chi = \sigma^{(i)}_{\chi^* M,x}$ for all $\chi \in \mathcal{X}$, when restricted to some fixed $O$. We view this additional requirement as part of our definition of locality and covariance of the functionals in the operator product expansion.

We next want to formulate condition (A) that the local, covariant functionals in the operator product expansion have an analytic dependence under analytic variations of the spacetime metric. For this, we consider 1-parameter families of real analytic metrics $g^{(s)}_{ab}$ on $M$ which vary analytically with respect to a real parameter $s \in I = (a, b)$ in the sense that
\[
g^{(s)}_{ab} - (ds)_a (ds)_b \tag{30}
\]
is a real analytic metric on the real analytic 5-dimensional manifold $I \times M$. Since the standard functionals in the operator product expansion have already been assumed to be locally and covariantly constructed out of the metric, we obtain from the family of metrics $g^{(s)}_{ab}$ a corresponding family of functionals (labelled by the parameter $s$) associated with this family of metrics. Our analyticity requirement (A) is then, in essence, that all $n$-point functions of these functionals have a suitable analytic dependence on the parameter $s$. A complication arises from the fact that these $n$-point functions are not analytic functions but rather only distributions, so we must first consider the question what we actually mean by the statement that a family of distributions depends analytically on a parameter. Following [17], we make the following definition.

Definition 4.1. We say that a family of distributions $u^{(s)}$ on an analytic manifold $X$ depends analytically on the parameter $s \in I = (a, b)$ with respect to a family of conic sets $K^{(s)} \subset T^*(X) \setminus \{0\}$ if (a) the dependence on $s$ of the family of distributions $u^{(s)}$ on $X$ is such that it can be viewed as a distribution $\tilde{u}$ on $\tilde{X} = I \times X$ and if (b) it holds that
\[
WF_A(\tilde{u}) \subset \{(\tilde{x}, \tilde{k}) \in T^*(\tilde{X}) \setminus \{0\} \mid f^{(s)}(x) = \tilde{x}, \ (x, {}^t f^{(s)\prime}(x)\tilde{k}) \in K^{(s)}\}, \tag{31}
\]
where $f^{(s)} : X \to \tilde{X}$ maps any point $x \in X$ to $\tilde{x} = (s, x) \in \tilde{X}$, $f^{(s)\prime}$ is the differential of this map viewed as a linear map $T(X) \to T(\tilde{X})$, and ${}^t f^{(s)\prime}$ denotes the transpose of this linear map, acting between $T^*(\tilde{X}) \to T^*(X)$.

A detailed discussion and motivation of this definition is given in [17, App. A]; here we only note the following facts. Firstly, if $\tilde{u} \in \mathcal{D}'(\tilde{X})$ is any distribution satisfying (31), then by the results of [20, Thm. 8.5.1], the pull-back of this distribution and all of its
$s$-derivatives by the map $f^{(s)}$ exists as a distribution on $X$ for any $s \in I$ and defines an analytic family $u^{(s)}$ of distributions in the sense of the above definition, with each member satisfying $WF_A(u^{(s)}) \subset K^{(s)}$. In the special case when the cones $K^{(s)}$ are empty for all $s$, we consequently have that $WF_A(u^{(s)}) = \emptyset$, so each $u^{(s)}$ is an analytic function on $X$. The set (31) is then empty as well and the family is consequently jointly analytic in $s$ and $x$. Thus, when $K^{(s)}$ is empty, our definition of the analytic dependence on a parameter coincides with the natural notion for analytic functions.

With this definition in mind, we now state the precise form of condition (A). Let $(M, g^{(s)}_{ab})$ be a family of analytic spacetimes whose metrics vary analytically with $s$, and suppose that there is a corresponding analytic family of time functions $T^{(s)}$ and volume forms $\epsilon^{(s)}_{abcd}$ for all $s$, which thus define a family of space and time oriented spacetimes $M(s)$. We say that the $\sigma^{(i)}_{M,x}$ depend analytically on the metric if there is a neighborhood $N$ of $x$ such that the restriction of
\[
(f_1, \dots, f_n) \mapsto \sigma^{(i)}_{M(s),x}\big(\phi^{(j_1)}_{M(s)}(f_1) \cdots \phi^{(j_n)}_{M(s)}(f_n)\big) \tag{32}
\]
to $\times^n \mathcal{D}(N)$ is a family of distributions that depends analytically on $s$ with respect to the family of conic sets $\Gamma_{M(s)}$ defined in Eq. (22) for every set of local, covariant fields $\phi^{(j_1)}, \dots, \phi^{(j_n)}$.

5. PCT-Invariance of the Operator Product Expansion

We are now going to formulate our main result about the PCT invariance of the operator product expansion in curved spacetime. Let $M$ be a globally hyperbolic spacetime with metric $g_{ab}$ and space-time orientation $o = (T, \epsilon_{abcd})$, which admits a spin-structure. Let $\overline{M}$ be the spacetime whose manifold structure and metric coincide with those of our original spacetime, but whose space and time orientation is given by $-o = (-T, \epsilon_{abcd})$, i.e., is reversed relative to that of the original spacetime. Since the definition of spinors involves a choice of orientation, the notion of spinors on $M$ and $\overline{M}$ will not coincide. Therefore, in order to formulate a relation between the operator product expansions on $M$ and $\overline{M}$ involving spinors, one needs to identify spinors on $M$ with spinors on $\overline{M}$. As we show in Appendix B, it is always possible to choose the spinor structures on $M$ and $\overline{M}$ in such a way that a natural identification is possible, namely, we get a map
\[
\tilde{I} : V(M) \to V(\overline{M}) \tag{33}
\]
between the corresponding associated vector bundles of which the spinors are elements. In the following, we shall therefore always assume that the spin structures over $M$ and $\overline{M}$ have been chosen so that such an identification is possible. The same remarks apply to the bundles $V^*(M)$, $\bar{V}(M)$, $\bar{V}^*(M)$, as well as their tensor products.

Theorem 5.1. Suppose that a local, covariant quantum field theory possesses an operator product expansion in the sense of Def. 3.1, and suppose that the standard functionals $\sigma^{(i)}_{M,x}$ in this operator product expansion satisfy (L), (M), and (A). Then the dependence of these standard functionals on the space and time orientations is expressed by the relation
\[
i^{F^{(i)}} (-1)^{M^{(i)}} \, \sigma^{(i)}_{M,x}\big(\phi^{(j_1)}_{M}(y_1) \cdots \phi^{(j_n)}_{M}(y_n)\big) = i^{F} (-1)^{M} \, \sigma^{(i)}_{\overline{M},x}\big(\phi^{(j_n)}_{\overline{M}}(y_n) \cdots \phi^{(j_1)}_{\overline{M}}(y_1)\big) \tag{34}
\]
for any finite number of local covariant fields, and any oriented and time oriented spacetime admitting a spin structure, and all $y_j$ in some open neighborhood of $x$. Here,
\[
F = F^{(j_1)} + \cdots + F^{(j_n)}, \qquad M = M^{(j_1)} + \cdots + M^{(j_n)}, \tag{35}
\]
with $M^{(i)}$ the number of unprimed spinor indices of the field $\phi^{(i)}$,
\[
F^{(i)} = \begin{cases} 0 & \text{if } \phi^{(i)} \text{ is bosonic,} \\ 1 & \text{if } \phi^{(i)} \text{ is fermionic,} \end{cases} \tag{36}
\]
and it is understood that the map $\tilde{I}$ is used to identify the spinor indices corresponding to the space-time orientation $+o$ on the left side with the spinor indices corresponding to the space-time orientation $-o$ on the right side of the above equation.

Remarks. (1) The proof given below shows that relation (34) holds true for any family of functionals $\sigma^{(i)}_{M,x}$ with the properties (L), (M), (A), and Eq. (16). The fact that these functionals define an operator product expansion with property (17) does not play any role in our proof. We also re-emphasize that it is neither assumed nor used anywhere in the proof that the spin-statistics relation holds for the fields, i.e., it is not assumed that half odd-integer spin fields are fermionic and that integer spin fields are bosonic.

(2) It follows from condition (16) on the functionals $\sigma^{(i)}$ that the number $F$ is even respectively odd if and only if $\phi^{(i)}$ is bosonic respectively fermionic.

(3) The theorem can be reformulated as an invariance condition of the theory under PCT as follows: If $\phi_M$ is a local covariant field on $M$ with $N$ primed and $M$ unprimed spinor indices, define a corresponding charge-conjugate field, $\phi^C_{\overline{M}}$ on $\overline{M}$, by
\[
\phi^C_{\overline{M}}(f) = i^F (-1)^M \phi_{\overline{M}}(f)^*, \tag{37}
\]
so that $\phi^C_{\overline{M}}$ is now a local, covariant field that has $M$ primed and $N$ unprimed spinor indices. Consider the distributions $c^{(i)}_{M,x}$ defined in terms of the standard functionals $\sigma^{(i)}_{M,x}$ by Eq. (18). These distributions are the coefficients appearing in the operator product expansion of the product of the fields $\phi^{(j)}_M$, $j = 1, \dots, n$, on a spacetime $M$ with a given space and time orientation,
\[
\phi^{(1)}_M(y_1) \cdots \phi^{(n)}_M(y_n) \sim \sum_{(i)} c^{(i)}_{M,x}(y_1, \dots, y_n) \, \phi^{(i)}_M(x), \tag{38}
\]
where "$\sim$" in the above relation is understood in the precise sense of Def. 3.1. It follows from Eq. (34) that the distributional coefficients in the operator product expansion of the charge conjugate fields $\phi^{(j)C}_{\overline{M}}(y_j)$ on the spacetime $\overline{M}$ with the opposite space and time orientation relative to that of $M$ are given by the complex conjugated coefficients $\overline{c^{(i)}_{M,x}}$ for the spacetime $M$, i.e.,
\[
\phi^{(1)C}_{\overline{M}}(y_1) \cdots \phi^{(n)C}_{\overline{M}}(y_n) \sim \sum_{(i)} \overline{c^{(i)}_{M,x}(y_1, \dots, y_n)} \, \phi^{(i)C}_{\overline{M}}(x). \tag{39}
\]
Relations (38) and (39) say that the operator product expansions on $M$ and $\overline{M}$ are equivalent. If these relations were moreover honest equations rather than only asymptotic relations as $(y_1, \dots, y_n) \to (x, \dots, x)$, then the $c^{(i)}_{M,x}$ could be viewed as structure constants of the algebra of fields $\mathcal{A}(M)$, and relations (38) and (39) could be viewed as saying that the map $\theta^{PCT}_M : \phi_M(f) \mapsto \phi^C_{\overline{M}}(f)$ sending any smeared local covariant field on $M$ to its charge conjugate on $\overline{M}$ defines an (anti-linear) isomorphism between the field algebra $\mathcal{A}(M)$ and the field algebra $\mathcal{A}(\overline{M})$ of the spacetime associated with the opposite space and time orientation. In this sense, our theorem may be viewed as an analog of the PCT theorem in Minkowski spacetime as described in Sect. 2.

(4) Throughout this paper, we are restricting attention to theories in spacetime dimension $d = 4$. Let us therefore briefly comment on what happens to our results and constructions in other dimensions. In even spacetime dimensions, our main Theorem 5.1 as well as the method of proof are basically unchanged. Some minor differences arise only from the fact that the notion of a spinor field is now based on the covering homomorphism $\mathrm{Spin}_0(d-1, 1) \to L_+^\uparrow$ of the $d$-dimensional proper orthochronous Lorentz group by the spin group. In odd spacetime dimensions $d$, the reversal of parity and time $(x^0, x^1, \dots, x^{d-1}) \to (-x^0, -x^1, \dots, -x^{d-1})$ in Minkowski spacetime corresponds to changing the orientations as $(T, \epsilon_{ab \dots c}) \to (-T, -\epsilon_{ab \dots c})$, where we note the difference to the case of even $d$. However, it is well known that otherwise reasonable theories in Minkowski spacetime may fail to possess an invariance under PCT in this sense$^{14}$, and it is therefore not expected that there is an analog of our main Theorem 5.1 in odd dimensions with regard to the change of orientations $(T, \epsilon_{ab \dots c}) \to (-T, -\epsilon_{ab \dots c})$. On the other hand, there is an analog of the PCT theorem in Minkowski spacetime associated with the transformation $(x^0, x^1, x^2, \dots, x^{d-1}) \to (-x^0, -x^1, x^2, \dots, x^{d-1})$ corresponding to the change of orientations $(T, \epsilon_{ab \dots c}) \to (-T, \epsilon_{ab \dots c})$. It can be shown that our main Theorem 5.1 as well as the method of proof generalizes also to odd dimensions with regard to the change of orientations $(T, \epsilon_{ab \dots c}) \to (-T, \epsilon_{ab \dots c})$. Some differences concerning the prefactors in our invariance property (34) arise only from the fact that the identification of the spin-bundles over $M$ respectively $\overline{M}$ (see Appendix B for the case $d = 4$) has to be done in a different way than in even dimensions.

Proof of Theorem 5.1, scalar bosonic case. For simplicity, we will first treat the special case when all fields in the theory are scalar and bosonic. Then Eq. (34) reduces to
\[
\sigma_{M,x}\big(\phi^{(1)}_M(y_1) \cdots \phi^{(n)}_M(y_n)\big) = \sigma_{\overline{M},x}\big(\phi^{(n)}_{\overline{M}}(y_n) \cdots \phi^{(1)}_{\overline{M}}(y_1)\big), \tag{40}
\]
for all $y_j$ in some neighborhood $N$ of $x$, where we have set $(j_k) = (k)$ without loss of generality, and where we have dropped the superscript $(i)$ on $\sigma^{(i)}_{M,x}$ to simplify the notation. Thus, when only scalar fields are present in the theory, our theorem will be proven if we can prove Eq. (40).

Footnote 14: The failure of the usual proof of the PCT theorem in odd dimensions can be traced back to the fact that the transformation $(x^0, x^1, \dots, x^{d-1}) \to (-x^0, -x^1, \dots, -x^{d-1})$ is not contained in the complexified proper, orthochronous Lorentz group in odd dimensions.

We will prove Eq. (40) using a particular family of metrics which interpolates analytically between the metric $g_{ab}$ and the Minkowski metric. The construction of this family is as follows. In a convex normal neighborhood around the point $x$, we introduce
Riemannian normal coordinates for our metric $g_{ab}$, denoted $y^\alpha = (y^0, y^1, y^2, y^3)$, so that the point $x$ has the coordinates $y^\alpha = 0$, and so that the coordinate components $g_{\mu\nu}$ of the metric satisfy $g_{\mu\nu}(0) = \eta_{\mu\nu}$ at this point. On this convex normal neighborhood of $x$, we define a family of metrics $g^{(s)}_{ab}$, $s \in I = (-1-c, 1+c)$, $c > 0$, via its coordinate components by
\[
g^{(s)}_{\mu\nu}(y^\alpha) = \eta_{\mu\nu} + \sum_{n \ge 1} \frac{s^n}{n!} \sum_{\alpha_1 \dots \alpha_n} y^{\alpha_1} \cdots y^{\alpha_n} \, \frac{\partial^n g_{\mu\nu}(0)}{\partial y^{\alpha_1} \cdots \partial y^{\alpha_n}}. \tag{41}
\]
It is obvious from this expression that (in our convex normal neighborhood) $g^{(0)}_{ab}$ is the flat, Minkowskian metric, that $g^{(1)}_{ab}$ is equal to the original metric, and that $g^{(s)}_{ab}$ has an analytic dependence on the parameter $s \in I$. Since the statement of the theorem is completely local, we may pass from $M$ to a neighborhood of $x$ on which all metrics $g^{(s)}_{ab}$ are defined globally and which is globally hyperbolic with respect to these metrics for all $s \in I$. Furthermore, we will from now on view $M$ as a neighborhood of the origin in $\mathbb{R}^4$ by identifying points $y \in M$ with their Riemannian normal coordinates, viewed as points in $\mathbb{R}^4$.

If $(T, \epsilon_{abcd})$ is the time function respectively volume form defining the time and space orientation, $+o$, of the spacetime $(M, g^{(1)}_{ab})$, then it is clear that $\nabla_a T$ will remain timelike with respect to the metrics $g^{(s)}_{ab}$ in a neighborhood of the point $x$ for all $s \in I$. By shrinking $M$ further if necessary, we can therefore assume without loss of generality that $T$ defines a corresponding time orientation on the spacetimes $(M, g^{(s)}_{ab})$ for all $s \in I$. Similar remarks apply to the space orientation, as well as the reversed orientations $-o = (-T, \epsilon_{abcd})$. We have therefore defined 1-parameter families of oriented and time oriented spacetimes
\[
M(s) = (M, g^{(s)}_{ab}, +o), \qquad \overline{M}(s) = (M, g^{(s)}_{ab}, -o). \tag{42}
\]
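The interpolating family (41) can be illustrated numerically. The sketch below is only a toy model, not part of the construction in the text: it assumes a single hypothetical analytic metric component `g` along one coordinate axis (an exponential, chosen because its Taylor coefficients at the origin are known in closed form) and checks that the series (41) sums to the original component evaluated at $s y$, that $s = 0$ gives the flat value and $s = 1$ the original one, and that reflecting $y \to -y$ is the same as sending $s \to -s$.

```python
import math

ETA_00 = 1.0   # one Minkowski component, eta_00 = +1 in signature (+,-,-,-)
A = 0.7        # hypothetical constant fixing the toy metric component

def g(y):
    """Toy analytic metric component along one coordinate axis, with g(0) = eta_00."""
    return ETA_00 * math.exp(A * y)

def g_family_series(s, y, order=40):
    """Partial sum of Eq. (41): eta + sum_{n>=1} s^n/n! * y^n * (d^n g/dy^n)(0)."""
    total = ETA_00
    for n in range(1, order + 1):
        nth_derivative_at_0 = ETA_00 * A ** n   # exact for the exponential toy model
        total += s ** n / math.factorial(n) * y ** n * nth_derivative_at_0
    return total

def g_family(s, y):
    """Closed form of the same family: the original component evaluated at s*y."""
    return g(s * y)

y = 0.3
flat = g_family(0.0, y)        # s = 0: flat value eta_00
original = g_family(1.0, y)    # s = 1: original component
series_gap = abs(g_family_series(0.5, y) - g_family(0.5, y))
reflection_gap = abs(g_family(0.5, -y) - g_family(-0.5, y))
print(flat, original, series_gap, reflection_gap)
```

For a genuine metric the components carry two tensor indices, and the reflection contributes a factor $(-1)^2 = 1$ from the Jacobian, so the scalar toy model captures the $s \to -s$ behaviour of each component.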
By our assumption (A), we know that the $\sigma_{M,x}$ are local and covariant functionals that vary analytically under analytic variations of the metric. This means by definition that we can pick a neighborhood $N$ of the point $x$ such that the restriction to $\times^n N$ of the family of $n$-point functions $\sigma_{M(s),x}\big(\prod_j \phi^{(j)}_{M(s)}(y_j)\big)$ is a family of distributions which varies analytically with $s$ with respect to the cones $\Gamma_{M(s)}$, and we may thus differentiate this family with respect to the parameter $s$ and set $s = 0$ afterwards. An analogous statement holds of course for the family of spacetimes $\overline{M}(s)$ with the opposite orientations. If Eq. (40) holds for all spacetimes, then this gives
\[
\frac{d^k}{ds^k} \sigma_{M(0),x}\big(\phi^{(1)}_{M(0)}(y_1) \cdots \phi^{(n)}_{M(0)}(y_n)\big) = \frac{d^k}{ds^k} \sigma_{\overline{M}(0),x}\big(\phi^{(n)}_{\overline{M}(0)}(y_n) \cdots \phi^{(1)}_{\overline{M}(0)}(y_1)\big), \tag{43}
\]
for all $(y_1, \dots, y_n) \in \times^n N$, where $M(0)$ is Minkowski spacetime $(M, g^{(0)}_{ab})$ with space and time orientation $+o$, and where $\overline{M}(0)$ is Minkowski spacetime with the space and time orientation $-o$. This equation is therefore a necessary condition for our theorem to be true. Our first major step in the proof is to show that it is also sufficient.

It follows immediately from the microlocal spectrum condition together with the transformation rules of the analytic wave front set under diffeomorphisms that
\[
WF_A\Big(\sigma_{\overline{M},x}\big(\phi^{(n)}_{\overline{M}}(y_n) \cdots \phi^{(1)}_{\overline{M}}(y_1)\big)\Big) \subset \pi^* \Gamma_{\overline{M}}, \tag{44}
\]
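where $\pi$ is the permutation
\[
\pi = \begin{pmatrix} 1 & 2 & \dots & n \\ n & n-1 & \dots & 1 \end{pmatrix}, \tag{45}
\]
and where the set $\Gamma_{\overline{M}}$ is defined as in (22), but with the orientations reversed relative to the original orientations, $o = (T, \epsilon_{abcd})$. We claim that
\[
\pi^* \Gamma_{\overline{M}} = \Gamma_M. \tag{46}
\]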
Let $(y_1, k_1; \dots; y_n, k_n) \in \Gamma_M$, which means that there exists a decorated graph $G(p)$ with $(p_e)^a \nabla_a T > 0$ such that $k_i = \sum_{e: i = s(e)} p_e - \sum_{e: i = t(e)} p_e$ for all $i$. (Remember that if $e$ is an edge joining $y_j$ and $y_k$ with $j < k$, then $j = s(e)$ and $k = t(e)$.) We need to show that $(y_n, k_n; \dots; y_1, k_1) \in \Gamma_{\overline{M}}$. Consider the graph $\bar{G}(p)$ whose edges and vertices are identical to the edges and vertices of $G(p)$, but which are decorated with the covectors $-p_e$, each of which is future pointing with respect to the time function $-T$ on $\overline{M}$. Note that the notion of source $\bar{s}(e)$ and target $\bar{t}(e)$ relative to the ordering $(y_n, \dots, y_1)$ of the vertices is opposite to the above notion of source and target relative to the ordering $(y_1, \dots, y_n)$, so that if $e$ is an edge in $\bar{G}(p)$ joining $y_j$ and $y_k$ with $j < k$, then $j = \bar{t}(e)$ and $k = \bar{s}(e)$ relative to the ordering $(y_n, \dots, y_1)$. It is a trivial consequence of these definitions that $k_i$ can be written alternatively as $k_i = \sum_{e: i = \bar{s}(e)} (-p_e) - \sum_{e: i = \bar{t}(e)} (-p_e)$ for all $i$, which displays $(y_n, k_n; \dots; y_1, k_1)$ as the element in $\Gamma_{\overline{M}}$ associated with the graph $\bar{G}(p)$.

Let $u$ be the distribution on $\times^n M$ given by the difference between the left and right side of Eq. (40). Equations (44) and (46) show that $u$ is the difference of two distributions whose analytic wave front set is contained in $\Gamma_M$. Hence, by the rules for calculating the analytic wave front set of sums of distributions, we have $WF_A(u) \subset \Gamma_M$. By a similar argument, if $u^{(s)}$ is the family of distributions defined as the difference of the left minus right side of Eq. (40) with $M$ and $\overline{M}$ replaced by $M(s)$ and $\overline{M}(s)$, then $u^{(s)}$ is an analytic family of distributions on $\times^n N$ relative to the conic sets $\Gamma_{M(s)}$. Furthermore, since we assume that Eq. (43) holds, we have that
\[
\frac{d^k}{ds^k} u^{(s)}(y_1, \dots, y_n)\Big|_{s=0} = 0 \qquad \forall k, \ (y_1, \dots, y_n) \in \times^n N. \tag{47}
\]
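The relabelling step behind Eq. (46) is pure bookkeeping and can be checked mechanically. The following sketch is an illustration under simplifying assumptions, not the construction of the text: it works in flat space with the hypothetical choice $T = y^0$, so that "future pointing" just means a positive time component, and it represents a decorated graph as a list of `(source, target, p)` triples with `source < target`. Swapping source and target on every edge and replacing $p_e$ by $-p_e$ leaves every vertex covector $k_i$ unchanged.

```python
import random

def momenta_at_vertices(n, edges):
    """k_i = sum_{e: i = s(e)} p_e - sum_{e: i = t(e)} p_e for vertices 1..n.

    Each edge is a triple (source, target, p) with p a 4-component covector."""
    k = {i: [0.0] * 4 for i in range(1, n + 1)}
    for source, target, p in edges:
        for mu in range(4):
            k[source][mu] += p[mu]
            k[target][mu] -= p[mu]
    return k

def reversed_graph(edges):
    """Graph for the ordering (y_n, ..., y_1): source and target swap, p_e -> -p_e."""
    return [(target, source, [-c for c in p]) for source, target, p in edges]

random.seed(0)
n = 4
edges = []
for j in range(1, n + 1):
    for m in range(j + 1, n + 1):
        spatial = [random.uniform(-1.0, 1.0) for _ in range(3)]
        energy = 2.0 + random.random()   # exceeds |spatial| <= sqrt(3), so p is timelike
        edges.append((j, m, [energy] + spatial))

k_original = momenta_at_vertices(n, edges)
k_reversed = momenta_at_vertices(n, reversed_graph(edges))
max_gap = max(abs(k_original[i][mu] - k_reversed[i][mu])
              for i in k_original for mu in range(4))

# With T = y^0: p_e is future pointing for T, and -p_e is future pointing for -T.
future_for_T = all(p[0] > 0 for _, _, p in edges)
future_for_minus_T = all(-1.0 * p[0] > 0 for _, _, p in reversed_graph(edges))
print(max_gap, future_for_T, future_for_minus_T)
```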
We need to show that it follows that $u^{(s)}(y_1, \dots, y_n) = 0$ for all $s \in I$ and $(y_1, \dots, y_n) \in \times^n N$. Since we have identified $M$ with an open neighborhood of the origin in $\mathbb{R}^4$ via Riemannian normal coordinates (with the point $x$ corresponding to the origin under this identification), we may take the neighborhood $N$ to be the ball $B_r$ of radius $r$ around the origin in $\mathbb{R}^4$ with respect to the Euclidean norm
\[
\|y\|^2 = \sum_{\mu=0}^{3} |y^\mu|^2, \tag{48}
\]
defined by our choice of coordinates $y^\mu$. It is known [10, Lem. 4.2] that each component $(\Gamma_M)_{(y_1, \dots, y_n)}$ of $\Gamma_M$ in the cotangent space $T^*_{(y_1, \dots, y_n)}(\times^n M) \setminus \{0\}$ is a proper$^{15}$, closed convex cone, which we identify with a proper, closed convex cone in $\mathbb{R}^{4n} \setminus \{0\}$. Moreover, it can be seen that $r > 0$ can be chosen so small that
\[
\bigcup_{(y_1, \dots, y_n) \in \times^n B_r, \ s \in I} (\Gamma_{M(s)})_{(y_1, \dots, y_n)} \subset C, \tag{49}
\]

Footnote 15: A cone is said to be proper if it does not contain any straight line.
where $C$ is a proper, closed, convex cone in $\mathbb{R}^{4n}$. Our claim that $u^{(s)}(y_1, \dots, y_n) = 0$ now follows from the following lemma, which we shall prove in Appendix C.

Lemma 5.1. Let $u^{(s)} \in \mathcal{D}'(X)$, $X$ an open subset of $\mathbb{R}^m$, be a family of distributions that depends analytically on $s \in I = (a, b)$ with respect to conic sets $K^{(s)} \subset X \times (C \setminus \{0\})$, where $C$ is a closed, proper convex cone in $\mathbb{R}^m$. Suppose that
\[
\frac{d^k}{ds^k} u^{(s)}\Big|_{s=s_0} = 0 \tag{50}
\]
for all $k$ and some $s_0 \in I$. Then $u^{(s)} = 0$ for all $s \in I$.

Thus, we have obtained the important intermediate result that our theorem will be proven if Eq. (43) can be shown. It follows immediately from the definition of our analytic family of metrics that the map $\chi : y \to -y$ satisfies
\[
\chi^* g^{(s)}_{ab} = g^{(-s)}_{ab}. \tag{51}
\]
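As a short check of Eq. (51), one can spell out the step that the text calls immediate. The only assumption made here is that the series (41) converges on the neighborhood under consideration, so that it sums to the Taylor expansion of the analytic components $g_{\mu\nu}$ about the origin:

```latex
% Eq. (41) is the Taylor series of g_{\mu\nu} about 0, evaluated at the point s y:
g^{(s)}_{\mu\nu}(y) = g_{\mu\nu}(s y).
% Pull-back under \chi(y) = -y, whose Jacobian is
% \partial\chi^\rho / \partial y^\mu = -\delta^\rho_\mu:
(\chi^* g^{(s)})_{\mu\nu}(y)
  = \frac{\partial \chi^\rho}{\partial y^\mu}
    \frac{\partial \chi^\sigma}{\partial y^\nu}\,
    g^{(s)}_{\rho\sigma}(\chi(y))
  = (-\delta^\rho_\mu)(-\delta^\sigma_\nu)\, g_{\rho\sigma}(-s y)
  = g_{\mu\nu}\big((-s)\, y\big)
  = g^{(-s)}_{\mu\nu}(y).
```

The two sign factors from the Jacobian cancel because the metric has two covariant indices; this is the same parity counting that later produces the factor $(-1)^j$ for a tensor with $j$ indices.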
Moreover, it is clear that $\chi$ reverses the space and time orientation in the sense that $\chi^* o = (\chi^* T, \chi^* \epsilon_{abcd})$ defines the same space and time orientation as $-o = (-T, \epsilon_{abcd})$. This shows that $\chi$ is an orientation preserving isometry between the space and time oriented spacetimes $M(s)$ and $\overline{M}(-s)$ given in Eq. (42). By the locality and covariance property (7) of the fields, we therefore have that
\[
\chi^* \phi_{\overline{M}(-s)}(y) = \phi_{\overline{M}(-s)}(-y) = \alpha_\chi(\phi_{M(s)}(y)), \tag{52}
\]
and by the locality and covariance of the standard functionals, we have that
\[
\sigma_{\overline{M}(-s),x} \circ \alpha_\chi \overset{x}{\sim} \sigma_{M(s),x}. \tag{53}
\]
Putting this together, we therefore know that
\[
\sigma_{M(s),x}\big(\phi^{(1)}_{M(s)}(y_1) \cdots \phi^{(n)}_{M(s)}(y_n)\big) = \sigma_{\overline{M}(-s),x}\big(\phi^{(1)}_{\overline{M}(-s)}(-y_1) \cdots \phi^{(n)}_{\overline{M}(-s)}(-y_n)\big), \tag{54}
\]
for all $(y_1, \dots, y_n) \in \times^n B_r$, all $s \in I$, for some $r > 0$. If we now differentiate both sides of this equation $k$ times with respect to $s$ at $s = 0$, and substitute the result into Eq. (43), we get the important intermediate result that the theorem will be proven if we can show
\[
\frac{d^k}{ds^k} \sigma_{M(0),x}\big(\phi^{(1)}_{M(0)}(y_1) \cdots \phi^{(n)}_{M(0)}(y_n)\big) = (-1)^k \frac{d^k}{ds^k} \sigma_{M(0),x}\big(\phi^{(n)}_{M(0)}(-y_n) \cdots \phi^{(1)}_{M(0)}(-y_1)\big), \tag{55}
\]
for all $k$, $(y_1, \dots, y_n) \in \times^n B_r$ and some $r > 0$, where $M(0)$ is Minkowski spacetime $(M, g^{(0)}_{ab})$ with the orientation $+o$. Thus, by the preceding steps, we have managed to transform the original problem of proving identity (40) between distributions associated with spacetimes $(M, g^{(1)}_{ab})$ with space and time orientations $+o$ respectively $-o$, to the problem of proving relations (55) for a set of distributions on Minkowski spacetime $(M, g^{(0)}_{ab})$ with a single orientation, $+o$. The remainder of this proof therefore consists in showing that these relations are indeed true.

For this, we need to analyze the $s$-derivatives of the distributions (32) for our particular family of spacetime metrics (41) and orientation $+o$. Such an analysis was carried out in a similar context in [17, Thm. 4.1] in order to derive a "scaling expansion" for certain distributions that arise in the context of perturbative interacting quantum field theories in curved spacetimes. The properties of the distributions considered in [17] which enter the analysis are (a) that they are locally and covariantly constructed from the metric near a reference point, $x$, (b) that they depend analytically on the metric in analytic spacetimes, (c) that they depend smoothly on the metric in smooth spacetimes, and (d) that they have a certain scaling behavior under rescalings of the metric by a constant conformal factor. Using only these properties, it was shown that the $k$-th derivative with respect to $s$ of the family of these distributions corresponding to the spacetime metrics defined in (41) can be written as a linear combination of curvature terms of the appropriate "dimension", times Lorentz-invariant Minkowski space distributions (they also satisfy other properties, but these are not relevant in the present context). Inspection of the proof of this statement given in [17] shows that in analytic spacetimes, it only relies on (a) and (b) above, but not on (c) and (d). Furthermore, one easily sees that the arguments given in [17] will still be valid when properties (a) and (b) are replaced by the essentially identical properties (L) and (A) assumed for our distributions (18). (In fact, the precise form of our conditions (L) and (A) has been chosen precisely so that the arguments of [17] are still valid.) We therefore conclude that the expression on the left side of Eq. (55) can be decomposed into a sum of curvature terms of the appropriate dimension, times Lorentz invariant distributions in Minkowski spacetime. However, since assumptions (L) and (A) are weaker than the requirements (a) and (b) used in [17, Thm. 4.1] in that they hold only for an arbitrarily small neighborhood of the reference point, $x$, one gets only the weaker result that the Minkowski space distributions are in fact only defined in some neighborhood of the origin (which we take to be a ball), and that they are invariant only under those Lorentz transformations that are sufficiently close to the identity. More precisely, there exists an $r > 0$ such that
\[
\frac{d^k}{ds^k} \sigma_{M(0),x}\big(\phi^{(1)}_{M(0)}(y_1) \cdots \phi^{(n)}_{M(0)}(y_n)\big) = \sum_{j} \sum_{\mu_1 \dots \mu_j} C^{(J)\, \mu_1 \dots \mu_j}(x) \, W^{(J)}_{\mu_1 \dots \mu_j}(y_1, \dots, y_n) \tag{56}
\]
for all $(y_1, \dots, y_n) \in \times^n B_r$, where the $W^{(J)}$ are tensor-valued distributions on $\times^n B_r$ and where $(J) = (1 2 \dots n)$ is a shorthand for the indices labelling the fields. The expressions $C^{(J)}$ are the coordinate components in Riemannian normal coordinates of curvature tensors that are polynomials
\[
C^{(J)\, m_1 \dots m_j}\big(g_{ab}(x), R_{abcd}(x), \dots, \nabla_{(e_1} \cdots \nabla_{e_{k-2})} R_{abcd}(x)\big) \tag{57}
\]
of the metric, the Riemann tensor and its covariant derivatives at $x$. Each monomial in $C^{(J)}$ contains precisely $k$ derivatives of the metric, implying that
\[
j = k \mod 2. \tag{58}
\]
The $W^{(J)}$ have the further property:

(i) There exist $r, \delta > 0$ such that
\[
W^{(J)}(\Lambda y_1, \dots, \Lambda y_n) = D(\Lambda) W^{(J)}(y_1, \dots, y_n) \tag{59}
\]
for all $\Lambda \in L_+^\uparrow$ with $\|\Lambda - 1\| < \delta$ and all $(y_1, \dots, y_n) \in \times^n B_{r-\delta}$, where the norm of a linear transformation is defined using the Euclidean norm (48), and where $D(\Lambda)$ is the tensor representation
\[
D(\Lambda)^{\nu_1 \dots \nu_j}_{\mu_1 \dots \mu_j} = \Lambda^{\nu_1}_{\mu_1} \cdots \Lambda^{\nu_j}_{\mu_j}. \tag{60}
\]
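To make the tensor representation (60) concrete, here is a minimal sketch for rank $j = 2$. It rests on assumptions that are not fixed by the text: the matrix `lam[nu][mu]` is taken to store $\Lambda^\nu{}_\mu$, the action on a tensor is taken to be $(D(\Lambda)W)_{\mu_1\mu_2} = \Lambda^{\nu_1}{}_{\mu_1}\Lambda^{\nu_2}{}_{\mu_2} W_{\nu_1\nu_2}$, and the helper names are hypothetical. The check uses the standard fact that the Minkowski metric is an invariant rank-2 tensor under any Lorentz transformation; the Frobenius norm is used only as an easily computed upper bound on the operator norm of $\Lambda - 1$.

```python
import math

# Minkowski metric eta_{mu nu} with signature (+,-,-,-), as in Eq. (63).
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def boost_x(rapidity):
    """Proper orthochronous boost along the first spatial axis; lam[nu][mu] = Lambda^nu_mu."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply_d_rank2(lam, w):
    """(D(Lambda) W)_{m1 m2} = Lambda^{n1}_{m1} Lambda^{n2}_{m2} W_{n1 n2}."""
    return [[sum(lam[n1][m1] * lam[n2][m2] * w[n1][n2]
                 for n1 in range(4) for n2 in range(4))
             for m2 in range(4)]
            for m1 in range(4)]

def frobenius_distance_from_identity(lam):
    """Frobenius norm of Lambda - 1; an upper bound for the operator norm."""
    return math.sqrt(sum((lam[a][b] - (1.0 if a == b else 0.0)) ** 2
                         for a in range(4) for b in range(4)))

lam = boost_x(0.05)
transformed_eta = apply_d_rank2(lam, ETA)
invariance_gap = max(abs(transformed_eta[a][b] - ETA[a][b])
                     for a in range(4) for b in range(4))
distance = frobenius_distance_from_identity(lam)
print(invariance_gap, distance)
```

A small rapidity keeps the boost close to the identity, which is the regime $\|\Lambda - 1\| < \delta$ in which property (i) is asserted.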
Furthermore, since the restriction of $\sigma_{M(s),x}$ to a sufficiently small neighborhood of $x$ satisfies the microlocal spectrum condition for the cone $\Gamma_{M(s)}$, and since this family has an analytic dependence on $s$ with respect to these cones, we have, by the same arguments as in the remark at the end of Sect. 4 of [17], that

(ii) There exists an $r > 0$ such that the restriction of $W^{(J)}$ to $\times^n B_r$ has analytic wave front set
\[
WF_A(W^{(J)}) \subset \Gamma_{M(0)}. \tag{61}
\]
Here, $\Gamma_{M(0)}$ is the cone (22) defined with respect to the Minkowskian metric $g^{(0)}_{ab}$ and orientation $+o$. Using our Riemannian normal coordinates $(y^0, y^1, y^2, y^3)$ to identify $M$ with a subset of the ball $B_r$, and assuming that $\nabla_a y^0$ is future pointing with respect to $+o$, it can be written as
\[
\Gamma_{M(0)} = \Big\{ (y_1, k_1; \dots; y_n, k_n) \in T^*(\times^n B_r) \setminus \{0\} \ \Big| \ \exists \, p_{ij} \in \bar{V}^+, \ n \ge j > i \ge 1 : \ k_i = \sum_{j: j > i} p_{ij} - \sum_{j: j < i} p_{ji} \ \text{for all } i \Big\}, \tag{62}
\]
where $\bar{V}^+$ is the closure of the forward light cone $V^+$ in Minkowski space,
\[
V^\pm = \{ k \in \mathbb{R}^4 \mid \eta_{\mu\nu} k^\mu k^\nu > 0, \ \pm k^0 > 0 \}. \tag{63}
\]
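Two elementary consequences of the description (62) are worth seeing on an example: the covectors $k_i$ always sum to zero (each $p_{ij}$ enters once with each sign), the first covector $k_1$ lies in $\bar{V}^+$, and the last one $k_n$ lies in $-\bar{V}^+$ (both are sums of elements of a convex cone, up to sign). The sketch below checks this for one hand-picked configuration with $n = 3$; the sample values of `p` and the function names are hypothetical and serve only as an illustration.

```python
def in_closed_forward_cone(k, tol=1e-12):
    """Membership in the closure of V^+ from Eq. (63): eta(k, k) >= 0 and k^0 >= 0."""
    minkowski_square = k[0] ** 2 - k[1] ** 2 - k[2] ** 2 - k[3] ** 2
    return minkowski_square >= -tol and k[0] >= -tol

def covectors_from_pairs(n, p):
    """k_i = sum_{j > i} p[(i, j)] - sum_{j < i} p[(j, i)], as in Eq. (62)."""
    ks = []
    for i in range(1, n + 1):
        k = [0.0] * 4
        for j in range(1, n + 1):
            for mu in range(4):
                if j > i:
                    k[mu] += p[(i, j)][mu]
                elif j < i:
                    k[mu] -= p[(j, i)][mu]
        ks.append(k)
    return ks

n = 3
p = {
    (1, 2): [1.0, 0.0, 0.0, 1.0],   # lightlike, on the boundary of the cone
    (1, 3): [2.0, 1.0, 0.0, 0.0],   # timelike
    (2, 3): [3.0, 0.0, 2.0, 1.0],   # timelike
}
all_p_in_cone = all(in_closed_forward_cone(v) for v in p.values())

ks = covectors_from_pairs(n, p)
total = [sum(k[mu] for k in ks) for mu in range(4)]
first_in_cone = in_closed_forward_cone(ks[0])
minus_last_in_cone = in_closed_forward_cone([-c for c in ks[-1]])
print(ks, total, first_in_cone, minus_last_in_cone)
```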
Besides the above properties (i) and (ii) for the $W^{(J)}$ in the expansion (56), we will now derive one more property, (iii), from the fact that the coefficients in our operator product expansion are not just arbitrary local covariant distributions with a specific analytic dependence on the metric, but arise in fact from a set of linear functionals on the algebras of observables $\mathcal{A}(M)$. To exploit this fact, we consider the multilinear maps on $\times^n \mathcal{D}(M)$ defined by
\[
(f_1, \dots, f_n) \mapsto \sigma_{M,x}\Big(\phi^{(1)}_M(f_1) \cdots \big[\phi^{(j)}_M(f_j), \phi^{(j+1)}_M(f_{j+1})\big] \cdots \phi^{(n)}_M(f_n)\Big). \tag{64}
\]
Then by Eq. (4), the right side of the above equation will vanish if the supports of $f_j$ and $f_{j+1}$ are spacelike related$^{16}$ in $M$ with respect to the metric $g^{(1)}_{ab}$.

Footnote 16: We say that two closed sets $K_1$ and $K_2$ are spacelike related if there exist open neighborhoods, $O_1$ and $O_2$, such that $x_1$ and $x_2$ are spacelike for every pair of points $(x_1, x_2) \in O_1 \times O_2$.

Consider now the
PCT Theorem for the Operator Product Expansion in Curved Spacetime
231
(s)
analytic family of metrics gab constructed above in Eq. (41) and a set of testfunctions on M such that the supports of fj and fj +1 are spacelike related with respect to the flat (0) Minkowskian metric gab . Then the supports of fj and fj +1 will continue to be spacelike (s) related also with respect to the metrics gab for sufficiently small s. Consequently, (1) (j ) (j +1) (n) σM(s),x φM(s) (f1 ) · · · φM(s) (fj ), φM(s) (fj +1 ) · · · φM(s) (fn ) = 0 (65) will hold provided that s is sufficiently small and provided fj and fj +1 are spacelike (0) related with respect to the flat Minkowskian metric gab . Differentiating this relation k times at s = 0, we therefore find that (f1 , . . . , fn ) →
(k) (1) dk (k+1) σM(s),x φM(s) (f1 ) · · · φM(s) (fk ), φM(s) (fk+1 ) k ds (n) · · · φM(s) (fn ) |s=0
(66)
is a distribution on $\times^n \mathcal{D}(M)$ that vanishes whenever the supports of $f_j$ and $f_{j+1}$ are spacelike separated with respect to the Minkowskian metric $g^{(0)}_{ab}$. Since the distributions $W^{(J)}$ are related to the above distributions Eq. (66) via the scaling expansion (56), we have found the following property of the distributions $W^{(J)}$:

(iii) There exists an $r > 0$ such that if $\|y_j\| < r$ and $y_i$ and $y_{i+1}$ are spacelike to each other with respect to the Minkowski metric $\eta_{\mu\nu}$, then there holds
\[ W^{(J)}(y_1, \dots, y_i, y_{i+1}, \dots, y_n) = W^{(\pi_{i,i+1} J)}(y_1, \dots, y_{i+1}, y_i, \dots, y_n), \tag{67} \]
where $(\pi_{i,i+1} J)$ stands for $(1 \dots (i+1)\, i \dots n)$.

We have argued so far that the theorem will be proven if we can show Eq. (55). If we now substitute the expansion (56) into Eq. (55), and use that $j = k$ modulo 2, we see that Eq. (55) will follow if we can show that there is an $r > 0$ such that
\[ W^{(J)}(y_1, \dots, y_n) = (-1)^j\, W^{(\pi J)}(-y_n, \dots, -y_1) \tag{68} \]
for all $y_j \in B_r$ in the sense of distributions, where $\pi$ is the permutation (45). Since we already know that the distributions $W^{(J)}$ satisfy properties (i), (ii) and (iii) above, the proof of the theorem will therefore be complete once we have established the following proposition:

Proposition 5.1. Suppose that $W^{(J)}_{\mu_1 \dots \mu_j}$ are tensor-valued distributions on $\times^n B_r$ for which there holds (i), (ii) and (iii). Then there is some $r > 0$ such that $W^{(J)}$ satisfies Eq. (68) within $\times^n B_r$.

The remainder of this section is devoted to the proof of this proposition. A key ingredient in our proof is the following theorem about distributions whose analytic wave front set is contained in the dual of an open, convex cone [20, Thm. 8.4.15]:

Theorem 5.2. Let $u$ be a distribution on $X \subset \mathbb{R}^m$ with $\mathrm{WF}_A(u) \subset X \times K^D$, where$^{17}$
\[ K^D = \{ k \in \mathbb{R}^m \mid k \cdot x \le 0 \ \ \forall x \in K \} \tag{69} \]
is the dual of an open convex cone $K \subset \mathbb{R}^m$, with $k \cdot x$ the standard inner product in $\mathbb{R}^m$. If $X_0 \subset X$ is an open subset with compact closure $\bar X_0 \subset X$, then one can find a $\gamma > 0$ and a function $U$ analytic in $\{ x + iy \in \mathbb{C}^m \mid x \in X_0,\ y \in K,\ \|y\| < \gamma \}$ such that $u$ is the boundary value of $U$,
\[ u(f) = \lim_{y \in K,\ y \to 0} \int U(x + iy)\, f(x)\, d^m x, \tag{70} \]
which we write as
\[ u(x) = \mathop{\mathrm{B.V.}}_{y \in K,\ y \to 0} U(x + iy). \tag{71} \]

17 Our definition of the dual cone differs trivially from that employed in [20, Thm. 8.4.15] since our convention for the Fourier transform is opposite to the convention employed in [20].
Consider the linear transformation $f : (\xi_1, \dots, \xi_n) \mapsto (y_1, \dots, y_n)$ on $\mathbb{R}^{4n}$ defined by
\[ f : \quad y_i = \xi_i + \xi_{i+1} + \cdots + \xi_n \quad \text{for all } i, \tag{72} \]
so that $\xi_n = y_n$ and $\xi_i = y_i - y_{i+1}$ for all $i \ne n$. For $(\xi_1, \dots, \xi_n) \in \times^n B_r$, $r > 0$ sufficiently small, we define a distribution $w^{(J)}$ by
\[ w^{(J)}(\xi_1, \dots, \xi_n) = W^{(J)}(f(\xi_1, \dots, \xi_n)), \tag{73} \]
which expresses $W^{(J)}$ in terms of relative coordinates about the "center of mass" point $\xi_n = y_n$. By the rules for calculating the analytic wave front set of the pull back of a distribution under an analytic map, we have
\[ \mathrm{WF}_A(w^{(J)}) = f^* \mathrm{WF}_A(W^{(J)}) = \Big\{ (\xi_1, \ell_1; \dots; \xi_n, \ell_n) \in T^*(\times^n B_r) \setminus \{0\} \;\Big|\; \exists\, (y_1, k_1; \dots; y_n, k_n) \in \mathrm{WF}_A(W^{(J)}):\ \ell_i = k_1 + \cdots + k_i,\ y_i = \xi_i + \xi_{i+1} + \cdots + \xi_n \Big\}. \tag{74} \]
By (ii), we know that $(y_1, k_1; \dots; y_n, k_n)$ is in $\mathrm{WF}_A(W^{(J)})$ only if there exists a set of covectors $p_{ij} \in \bar V^+$, $i < j$, such that $k_i = \sum_{j: j > i} p_{ij} - \sum_{j: j < i} p_{ji}$ for all $i$. Thus, if $(\xi_1, \ell_1; \dots; \xi_n, \ell_n)$ is in $\mathrm{WF}_A(w^{(J)})$, then we must have
\[ \ell_i = \sum_{j=1}^{i} k_j = \sum_{j=1}^{i} \Big( \sum_{l: l > j} p_{jl} - \sum_{l: l < j} p_{lj} \Big) \tag{75} \]
\[ \phantom{\ell_i} = \sum_{j: j > i} ( p_{1j} + p_{2j} + \cdots + p_{ij} ), \tag{76} \]
where the equality in the second line can be proved by induction in $i$. Thus, $\ell_i \in \bar V^+$ for all $i$, and in particular $\ell_n = 0$. We have thus shown that
\[ \mathrm{WF}_A(w^{(J)}) \subset (\times^n B_r) \times ( \bar V^+ \times \cdots \times \bar V^+ \times \{0\} ). \tag{77} \]
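The passage from the covectors $k_i$ of Eq. (62) to the partial sums $\ell_i$ of Eqs. (75) and (76) can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names, the random sampling of the $p_{ij}$ from the closed forward cone, and the tolerances are all hypothetical choices and not part of the paper.

```python
import math
import random


def covectors_from_p(p, n):
    """k_i = sum_{j>i} p[(i, j)] - sum_{j<i} p[(j, i)], as in Eq. (62) (0-based i)."""
    k = []
    for i in range(n):
        vec = [0.0] * 4
        for j in range(i + 1, n):
            vec = [a + b for a, b in zip(vec, p[(i, j)])]
        for j in range(i):
            vec = [a - b for a, b in zip(vec, p[(j, i)])]
        k.append(vec)
    return k


def partial_sums(k):
    """l_i = k_1 + ... + k_i, the covectors dual to the relative coordinates."""
    out, running = [], [0.0] * 4
    for vec in k:
        running = [a + b for a, b in zip(running, vec)]
        out.append(list(running))
    return out


def random_forward_vector(rng):
    """A random element of the closed forward light cone (k^0 >= |k_spatial|)."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    return [math.sqrt(sum(s * s for s in spatial)) + rng.uniform(0.0, 1.0)] + spatial


def in_closed_forward_cone(v, tol=1e-9):
    return v[0] + tol >= math.sqrt(v[1] ** 2 + v[2] ** 2 + v[3] ** 2)


def check_relative_covectors(n, seed=0):
    """Verify Eq. (76), l_i in the closed forward cone, and l_n = 0 for random p_ij."""
    rng = random.Random(seed)
    p = {(i, j): random_forward_vector(rng)
         for i in range(n) for j in range(i + 1, n)}
    ell = partial_sums(covectors_from_p(p, n))
    for i in range(n):
        # Right side of Eq. (76): sum over j > i of (p_1j + ... + p_ij).
        expected = [0.0] * 4
        for j in range(i + 1, n):
            for m in range(i + 1):
                expected = [a + b for a, b in zip(expected, p[(m, j)])]
        if any(abs(a - b) > 1e-9 for a, b in zip(ell[i], expected)):
            return False
        if not in_closed_forward_cone(ell[i]):
            return False
    # The last partial sum is the total of all k_i, which must vanish.
    return all(abs(c) < 1e-9 for c in ell[n - 1])
```

For instance, `check_relative_covectors(4)` samples six covectors $p_{ij}$ and confirms the telescoping cancellation behind Eq. (76) for that sample.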
We now use this information about the analytic wave front set of $w^{(J)}$ to show that $w^{(J)}$ is the boundary value of some analytic function. The closed forward lightcone is the dual of the open past lightcone, $\bar V^+ = (V^-)^D$, therefore
\[ \bar V^+ \times \cdots \times \bar V^+ \times \{0\} = ( V^- \times \cdots \times V^- \times \mathbb{R}^4 )^D \subset \mathbb{R}^{4n}, \tag{78} \]
so Eq. (77) tells us in conjunction with the above theorem that $w^{(J)}$ is the boundary value
\[ w^{(J)}(\xi_1, \dots, \xi_n) = \mathop{\mathrm{B.V.}}_{\eta_j \to 0,\ \eta_j \in V^- \ \forall j \ne n} w^{(J)}(\xi_1 + i\eta_1, \dots, \xi_n + i\eta_n) \tag{79} \]
of a function $w^{(J)}(\zeta_1, \dots, \zeta_n)$ that is holomorphic in the domain
\[ \mathcal{T}_n = \{ (\zeta_1, \dots, \zeta_n) \in \mathbb{C}^{4n} \mid \|\zeta_j\| < r \text{ for all } j,\ \mathrm{Im}\, \zeta_j \in V^- \text{ for all } j \ne n \} \tag{80} \]
for some $r > 0$. Thus, we have shown that property (ii) of the distributions $W^{(J)}$ implies that $w^{(J)}$ is the boundary value of a function that is holomorphic on $\mathcal{T}_n$.

We next want to show that the analytic functions $w^{(J)}(\zeta_1, \dots, \zeta_n)$ are Lorentz invariant. For this let $\Lambda \in L_+^\uparrow$ with $\|\Lambda - 1\| < \delta$, and consider the function
\[ (\zeta_1, \dots, \zeta_n) \mapsto w^{(J)}(\Lambda\zeta_1, \dots, \Lambda\zeta_n) - D(\Lambda)\, w^{(J)}(\zeta_1, \dots, \zeta_n) \tag{81} \]
for $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n$ such that $\|\zeta_j\| < r - \delta$ for all $j$. The boundary value of this function as $\mathrm{Im}\, \zeta_j \to 0$ vanishes by (i). Therefore, by the "edge-of-the-wedge theorem" (see e.g. [12, Thm. 2.17]), this function itself has to vanish,
\[ w^{(J)}(\Lambda\zeta_1, \dots, \Lambda\zeta_n) = D(\Lambda)\, w^{(J)}(\zeta_1, \dots, \zeta_n) \tag{82} \]
for all $\Lambda \in L_+^\uparrow$ such that $\|\Lambda - 1\| < \delta$ and such that $(\zeta_1, \dots, \zeta_n)$ and $(\Lambda\zeta_1, \dots, \Lambda\zeta_n)$ are in $\mathcal{T}_n$.

We finally would like to use property (iii) to infer a corresponding property for $w^{(J)}$. Let $(\xi_1, \dots, \xi_n) \in \times^n B_r$, $(y_1, \dots, y_n) = f(\xi_1, \dots, \xi_n)$ such that all the difference vectors $y_i - y_j$ are spacelike related with respect to $\eta_{\mu\nu}$. Then since $w^{(J)} = f^* W^{(J)}$, and since $W^{(J)}$ satisfies (iii), we conclude that there exists an $r > 0$ such that
\[ w^{(J)}(\xi_1, \dots, \xi_{n-1}, \xi_n) = w^{(\pi J)}\Big( -\xi_{n-1}, \dots, -\xi_1, \sum_{i=1}^{n} \xi_i \Big), \tag{83} \]
in the sense of distributions. We have thus altogether shown that properties (i), (ii) and (iii) for $W^{(J)}$ imply that $w^{(J)} = f^* W^{(J)}$, with $f$ given by (72), is the boundary value of an analytic function $w^{(J)}(\zeta_1, \dots, \zeta_n)$ on the domain $\mathcal{T}_n$, satisfying Eqs. (82) and (83). When expressed in terms of $w^{(J)}$, the assertion (68) of the proposition reads
\[ w^{(J)}(\xi_1, \dots, \xi_{n-1}, \xi_n) = (-1)^j\, w^{(\pi J)}\Big( \xi_{n-1}, \dots, \xi_1, -\sum_{i=1}^{n} \xi_i \Big), \tag{84} \]
in the sense of distributions on ×n Br for some r > 0. We have therefore reached the important intermediate conclusion that the proposition will be shown if we can show that Eq. (84) holds for any w(J ) which is the boundary value of an analytic function on the domain Tn (see Eq. (79)) satisfying Eqs. (82) and (83). One notes that Eqs. (79), (82) and (83) satisfied by the functions w (J ) resemble properties of the Wightman functions in Minkowski spacetime [12] (when expressed in relative coordinates). Moreover, relation Eq. (84) which we want to prove resembles a property of the Wightman functions expressing the PCT invariance of the corresponding
Wightman field theory. Our proof of Eq. (84) will follow closely the proof of the PCT theorem for Wightman field theories in Minkowski spacetime, see especially [12]. There are, however, two important differences between our functions $w^{(J)}$ and the Wightman functions (expressed in relative coordinates) in Minkowski spacetime. Firstly, our functions $W^{(J)}$ are by contrast with the Wightman functions not translation invariant, so the relations (83) reflecting the local commutativity are not identical to the corresponding relations for the Wightman functions. Secondly, our distributions $w^{(J)}$ as well as their analytic extensions are defined only locally and the Lorentz invariance Eq. (82) holds a priori only for Lorentz transformations in a neighborhood of the identity transformation. On the other hand, one finds that global translation invariance and invariance under the full Lorentz group play an important role once one looks at the details of the proof (see e.g. [12]) of the PCT theorem in Minkowski space. For these reasons, the arguments given e.g. in [12] cannot be taken over wholesale, but must be carefully adapted.

First, we notice that the transformation law Eq. (82) of the $w^{(J)}$ not only holds for proper Lorentz transformations $\Lambda$ such that $\|\Lambda - 1\| < \delta$ and such that $(\zeta_1, \dots, \zeta_n)$ and $(\Lambda\zeta_1, \dots, \Lambda\zeta_n)$ are in $\mathcal{T}_n$, but moreover for any rotation of the form
\[ R(\varphi) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi & 0 \\ 0 & -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{85} \]
This can easily be proven by noting that such a rotation leaves the region $\mathcal{T}_n$ invariant and that it can be written as a product of $N$ rotations $R(\varphi/N)$, each of which satisfies $\|R(\varphi/N) - 1\| < \delta$. The invariance then follows by applying the transformation rule (82) to each such small rotation in turn and using the group character of the transformation rule.

We will now show that the transformation law can be further generalized to more general transformations by invoking the analyticity of the functions $w^{(J)}$. Consider the abelian group of complex Lorentz transformations
\[ \Lambda(\alpha + i\beta) = \begin{pmatrix} \cosh(\alpha + i\beta) & 0 & 0 & \sinh(\alpha + i\beta) \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh(\alpha + i\beta) & 0 & 0 & \cosh(\alpha + i\beta) \end{pmatrix}, \qquad \alpha, \beta \in \mathbb{R}, \tag{86} \]
which corresponds to a real, proper orthochronous Lorentz transformation if $\beta = 0$. The action of such a transformation on a complex vector, $\zeta \mapsto \hat\zeta = \Lambda(\alpha + i\beta)\zeta$, can be written as
\[ \hat\zeta^+ = e^{\alpha + i\beta} \zeta^+, \qquad \hat\zeta^- = e^{-\alpha - i\beta} \zeta^-, \qquad \hat\zeta^1 = \zeta^1, \qquad \hat\zeta^2 = \zeta^2, \tag{87} \]
where we have introduced the notation $\zeta^\pm = (\zeta^0 \pm \zeta^3)/\sqrt{2}$ for every complex four vector. We now have (compare [12, Thm. 2.11]):

Lemma 5.2. The functions $w^{(J)}(\zeta_1, \dots, \zeta_n)$ possess a unique, single-valued analytic continuation to the extended tube domain
\[ \mathcal{T}_n' = \{ (\Lambda(\alpha + i\beta)\zeta_1, \dots, \Lambda(\alpha + i\beta)\zeta_n) \mid (\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n,\ \alpha, \beta \in \mathbb{R} \} \subset \mathbb{C}^{4n}, \tag{88} \]
which transforms as
\[ w^{(J)}(\Lambda(\alpha + i\beta)\zeta_1, \dots, \Lambda(\alpha + i\beta)\zeta_n) = D(\Lambda(\alpha + i\beta))\, w^{(J)}(\zeta_1, \dots, \zeta_n) \tag{89} \]
for all $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n$ and all $\alpha, \beta$.

Proof. We already know by Eq. (82) that Eq. (89) holds if $\beta = 0$, if $(\zeta_1, \dots, \zeta_n)$ as well as $(\Lambda(\alpha)\zeta_1, \dots, \Lambda(\alpha)\zeta_n)$ are in $\mathcal{T}_n$ and if $\alpha$ is in a sufficiently small real neighborhood of 0. Since $\Lambda(\alpha)$ is real analytic in $\alpha$, both sides of Eq. (82) define analytic functions of $\alpha$ in this real neighborhood of 0. Therefore Eq. (89) holds for all $\alpha + i\beta$ in a sufficiently small complex neighborhood of 0 such that $(\Lambda(\alpha + i\beta)\zeta_1, \dots, \Lambda(\alpha + i\beta)\zeta_n) \in \mathcal{T}_n$. If $(\zeta_1, \dots, \zeta_n)$ is in $\mathcal{T}_n$ but $(\Lambda(\alpha + i\beta)\zeta_1, \dots, \Lambda(\alpha + i\beta)\zeta_n)$ is not in $\mathcal{T}_n$, then the left side of Eq. (89) is initially not defined and we try to define it by the right side in this case. It may happen that a point $(\xi_1, \dots, \xi_n) \in \mathcal{T}_n'$ can be reached in different ways from elements in $\mathcal{T}_n$, i.e., that it can be written as $(\Lambda(\alpha_1 + i\beta_1)\zeta_1, \dots, \Lambda(\alpha_1 + i\beta_1)\zeta_n)$ or $(\Lambda(\alpha_2 + i\beta_2)\rho_1, \dots, \Lambda(\alpha_2 + i\beta_2)\rho_n)$, where $(\zeta_1, \dots, \zeta_n)$ and $(\rho_1, \dots, \rho_n)$ are in $\mathcal{T}_n$. Unless these different ways of writing $(\xi_1, \dots, \xi_n) \in \mathcal{T}_n'$ give rise to the same definition of $w^{(J)}(\xi_1, \dots, \xi_n)$, our proposed extension of $w^{(J)}$ will not be single valued. Thus, the nontrivial task is to show that Eq. (89) holds when $(\zeta_1, \dots, \zeta_n)$ and $(\Lambda(\alpha + i\beta)\zeta_1, \dots, \Lambda(\alpha + i\beta)\zeta_n)$ are in $\mathcal{T}_n$, where $\alpha = \alpha_1 - \alpha_2$ and $\beta = \beta_1 - \beta_2$. We already know that Eq. (89) holds when $\alpha + i\beta$ is sufficiently close to 0. Therefore, by the well known method of analytic continuation by overlapping neighborhoods combined with the group character of the transformation law Eq. (89), this equation will follow if we can show that there exists a continuous curve $t \mapsto \gamma(t) = \alpha(t) + i\beta(t)$ such that $\gamma(0) = 0$, $\gamma(1) = \alpha + i\beta$ and such that $(\Lambda(\gamma(t))\zeta_1, \dots, \Lambda(\gamma(t))\zeta_n) \in \mathcal{T}_n$ for all $0 \le t \le 1$. Thus, our construction of the analytic extension of $w^{(J)}$ to the extended domain $\mathcal{T}_n'$ will be complete if we can construct such a curve $\gamma$. Without loss of generality we can assume that $0 \le \beta \le \pi$.

Our proposal for the curve $\gamma$ is then
\[ \gamma(t) = \begin{cases} 2t\alpha & \text{for } 0 \le t \le \tfrac12, \\ \alpha + i(2t - 1)\beta & \text{for } \tfrac12 \le t \le 1. \end{cases} \tag{90} \]
We need to show that $\zeta_j(t) = \Lambda(\gamma(t))\zeta_j$ satisfies
\[ \|\zeta_j(t)\| < r, \qquad \mathrm{Im}\, \zeta_j(t) \in V^- \qquad \text{for all } j = 1, \dots, n \text{ and } t \in [0, 1]. \tag{91} \]
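In order to prove the first relation, we note that for any complex four vector $\zeta$, we have that $\|\zeta\|^2 = |\zeta^+|^2 + |\zeta^-|^2 + |\zeta^1|^2 + |\zeta^2|^2$, where $\zeta^\pm = (\zeta^0 \pm \zeta^3)/\sqrt{2}$. For $t \in [0, \tfrac12]$, this gives
\[ \|\zeta_j(t)\|^2 = |e^{2t\alpha} \zeta_j^+|^2 + |e^{-2t\alpha} \zeta_j^-|^2 + |\zeta_j^1|^2 + |\zeta_j^2|^2 \le (1 - 2t)\, \|\zeta_j(0)\|^2 + 2t\, \|\zeta_j(\tfrac12)\|^2, \tag{92} \]
by the convexity of the exponential function. For $t \in [\tfrac12, 1]$, this gives
\[ \|\zeta_j(t)\|^2 = |e^{\alpha + i(2t-1)\beta} \zeta_j^+|^2 + |e^{-\alpha - i(2t-1)\beta} \zeta_j^-|^2 + |\zeta_j^1|^2 + |\zeta_j^2|^2 = \|\zeta_j(1)\|^2 < r^2. \tag{93} \]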
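The two norm estimates along the curve $\gamma$ can be probed numerically. The following sketch applies $\Lambda(\gamma(t))$ in the light-cone components of Eq. (87) and checks, on a grid of $t$ values, that the squared norm stays below the convex combination of its values at $t = 0$ and $t = \tfrac12$ on the first half of the curve (the form that convexity of the exponential gives directly) and is constant on the second half. All function names, the grid size, and the tolerance are assumptions made for illustration.

```python
import cmath


def boost(zeta, w):
    """Apply Lambda(w), w = alpha + i*beta, via zeta^± -> exp(±w) * zeta^±."""
    root2 = 2 ** 0.5
    z0, z1, z2, z3 = zeta
    plus = cmath.exp(w) * (z0 + z3) / root2
    minus = cmath.exp(-w) * (z0 - z3) / root2
    return [(plus + minus) / root2, z1, z2, (plus - minus) / root2]


def norm_sq(zeta):
    """Squared Euclidean norm of a complex four vector."""
    return sum(abs(c) ** 2 for c in zeta)


def gamma(t, alpha, beta):
    """The curve of Eq. (90): first a real boost, then a purely imaginary one."""
    if t <= 0.5:
        return complex(2.0 * t * alpha, 0.0)
    return complex(alpha, (2.0 * t - 1.0) * beta)


def norm_bounds_hold(zeta, alpha, beta, steps=50, tol=1e-9):
    """Check the convexity bound on [0, 1/2] and constancy on [1/2, 1]."""
    start = norm_sq(zeta)
    middle = norm_sq(boost(zeta, gamma(0.5, alpha, beta)))
    for s in range(steps + 1):
        t = s / steps
        value = norm_sq(boost(zeta, gamma(t, alpha, beta)))
        if t <= 0.5:
            bound = (1.0 - 2.0 * t) * start + 2.0 * t * middle
            if value > bound + tol:
                return False
        elif abs(value - middle) > tol:
            return False
    return True
```

Since the norm at $t = \tfrac12$ equals the norm at $t = 1$, both endpoint values are below $r^2$ whenever the start and end points lie in the tube, which is what the bound needs.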
It follows straightforwardly from these relations that $\|\zeta_j(t)\| < r$ for $t \in [0, 1]$ and all $j$. In order to prove the second relation, we note that, since $\Lambda(2t\alpha)$ are real restricted Lorentz transformations, we have that $\mathrm{Im}\, \zeta_j(t) = \Lambda(2t\alpha)\, \mathrm{Im}\, \zeta_j \in V^-$ for $t \in [0, \tfrac12]$. To show that $\mathrm{Im}\, \zeta_j(t) \in V^-$ also for all $t \in [\tfrac12, 1]$, it is sufficient to show that $\mathrm{Im}\, \zeta_j(t)^\mu n_\mu > 0$ for any real, future pointing timelike or null vector $n$, for all $j$. We have
\[ \mathrm{Im}\, \zeta_j(t)^\mu n_\mu = \sin \beta(2t - 1)\, \mathrm{Re}\, \zeta_j^\mu n_\mu + \cos \beta(2t - 1)\, \mathrm{Im}\, \zeta_j^\mu n_\mu, \tag{94} \]
which implies that
\[ \sin\beta\ \mathrm{Im}\, \zeta_j(t)^\mu n_\mu = \sin \beta\tau\ \mathrm{Im}\, \zeta_j(1)^\mu n_\mu + \sin \beta(1 - \tau)\ \mathrm{Im}\, \zeta_j(\tfrac12)^\mu n_\mu, \qquad \tau = 2t - 1. \tag{95} \]
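The step from Eq. (94) to Eq. (95) is a trigonometric identity that holds for any quantity of the form $a \sin\beta\tau + b \cos\beta\tau$. The short sketch below checks it numerically; the coefficients `a` and `b` stand in for the two pairings on the right side of Eq. (94) and, like the function names, are placeholders chosen for this example.

```python
import math


def pairing(tau, beta, a, b):
    """Value of a*sin(beta*tau) + b*cos(beta*tau), the form of Eq. (94)."""
    return a * math.sin(beta * tau) + b * math.cos(beta * tau)


def eq95_residual(tau, beta, a, b):
    """Left side minus right side of Eq. (95); zero up to rounding."""
    lhs = math.sin(beta) * pairing(tau, beta, a, b)
    rhs = (math.sin(beta * tau) * pairing(1.0, beta, a, b)
           + math.sin(beta * (1.0 - tau)) * pairing(0.0, beta, a, b))
    return lhs - rhs


def weights_nonnegative(tau, beta):
    """For 0 < beta < pi and 0 <= tau <= 1 both sine weights are >= 0."""
    return math.sin(beta * tau) >= 0.0 and math.sin(beta * (1.0 - tau)) >= 0.0
```

Because both weights are non-negative for $0 < \beta < \pi$ and $0 \le \tau \le 1$, the sign of the pairing at intermediate $\tau$ is controlled by its signs at the two endpoints.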
The case $\beta = 0$ is trivial, and the case $\beta = \pi$ cannot occur, since otherwise we would have $\mathrm{Im}\, \zeta_j(1)^\mu n_\mu < 0$ for some future pointing timelike or null vector $n$, which cannot be since $\mathrm{Im}\, \zeta_j(1) \in V^-$ by assumption. We can therefore assume that $0 < \beta < \pi$. The above equation then displays $\mathrm{Im}\, \zeta_j(t)^\mu n_\mu$ as a positive linear combination of two positive numbers for $0 \le \tau \le 1$. This proves that $\mathrm{Im}\, \zeta_j(t) \in V^-$ for $\tfrac12 \le t \le 1$ and hence altogether that $(\zeta_1(t), \dots, \zeta_n(t)) \in \mathcal{T}_n$ for all $0 \le t \le 1$.

Consider now a point $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n$. Then we know that $w^{(J)}(\zeta_1, \dots, \zeta_n)$ transforms according to the transformation law Eq. (82) for any rotation $R(\varphi)$ as in Eq. (85), and any complex Lorentz transformation $\Lambda(\alpha + i\beta)$. Applying this transformation rule in particular to the product $\Lambda(i\pi) R(\pi) = -1$, we find that
\[ w^{(J)}(-\zeta_1, \dots, -\zeta_n) = (-1)^j\, w^{(J)}(\zeta_1, \dots, \zeta_n) \tag{96} \]
for all $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n$, since $D(-1) = (-1)^j$. Moreover, since both sides of this equation are analytic functions, we find that this equation holds in fact for all $(\zeta_1, \dots, \zeta_n)$ in the extended domain $\mathcal{T}_n'$. We will now use this equation to get a relation between the $w^{(J)}$ for real arguments. We note however that we cannot straightforwardly take the boundary value of both sides of the above equation as $\mathrm{Im}\, \zeta_j \to 0$ in order to get such a relation, since $\mathrm{Im}\, \zeta_j$ has the opposite sign on both sides of the above equation and $\mathrm{Im}\, \zeta_j = 0$ can therefore not be approached from within $V^-$ on both sides. To circumvent this problem, one considers special real points in $\mathcal{T}_n'$ defined as follows (compare [12, Thm. 2.12]): Let $n$ be the spacelike vector in $\mathbb{R}^4$ given by $(0, 0, 0, 1)$, and consider the open, proper, convex and spacelike cone $K$ in $\mathbb{R}^4$ defined by the equation $\xi^\mu n_\mu > \|\xi\|$.

Lemma 5.3. The extended domain $\mathcal{T}_n'$ includes the set
\[ J_n = \{ (\zeta_1, \dots, \zeta_n) \in \mathbb{R}^{4n} \mid \|\zeta_j\| < r,\ \zeta_j \in K \}, \tag{97} \]
and $J_n$ is an open, real domain in $\mathbb{R}^{4n}$.

Proof. The last statement is obvious. We must show that if $(\xi_1, \dots, \xi_n) \in J_n$, then there is $\alpha, \beta$ such that $(\Lambda(\alpha + i\beta)\xi_1, \dots, \Lambda(\alpha + i\beta)\xi_n)$ is in $\mathcal{T}_n$. One calculates that
\[ \mathrm{Im}\, \Lambda(i\beta)\xi_j = -\sin\beta \begin{pmatrix} \xi_j^3 \\ 0 \\ 0 \\ \xi_j^0 \end{pmatrix}, \qquad \|\Lambda(i\beta)\xi_j\| = \|\xi_j\|. \tag{98} \]
By definition, $\xi_j \in K$ means that $\xi_j^\mu n_\mu > \|\xi_j\|$ and $\|\xi_j\| < r$, where $n = (0, 0, 0, 1)$. The first condition implies that $\xi_j^3 > (|\xi_j^0|^2 + \cdots + |\xi_j^3|^2)^{1/2} \ge |\xi_j^0|$, showing that $\mathrm{Im}\, \Lambda(i\beta)\xi_j \in V^-$ for all $j$ and any $0 < \beta < \pi$, and the second condition shows that $\|\Lambda(i\beta)\xi_j\| < r$ for all $j$. This proves the lemma.

One now notes that if $(\zeta_1, \dots, \zeta_n) \in J_n$, and $(y_1, \dots, y_n) = f(\zeta_1, \dots, \zeta_n)$, then it follows that the difference vectors $y_j - y_k$ are all spacelike (and non-zero) since
\[ y_j - y_k = \sum_{j \le i < k} \zeta_i \in K, \tag{99} \]
by the convexity of $K$. For such $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n'$ we therefore know that Eq. (83) holds. Combining this relation with Eq. (96), we have therefore found that
\[ w^{(J)}(\zeta_1, \dots, \zeta_{n-1}, \zeta_n) = (-1)^j\, w^{(\pi J)}\Big( \zeta_{n-1}, \dots, \zeta_1, -\sum_{j=1}^{n} \zeta_j \Big), \tag{100} \]
for all $(\zeta_1, \dots, \zeta_n) \in J_n$. Moreover, since the set $J_n$ forms an open real domain in the complex domain $\mathcal{T}_n'$, this equation will in fact hold for all $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n'$, so in particular for $(\zeta_1, \dots, \zeta_n) \in \mathcal{T}_n$. We now take the boundary value of both sides of Eq. (100) as $\mathrm{Im}\, \zeta_j$ goes to zero while keeping $\mathrm{Im}\, \zeta_j \in V^-$ for $1 \le j \le n - 1$, which gives us
\[ \mathop{\mathrm{B.V.}}_{\eta_j \to 0,\ \eta_j \in V^- \ \forall j \ne n} w^{(J)}(\xi_1 + i\eta_1, \dots, \xi_{n-1} + i\eta_{n-1}, \xi_n + i\eta_n) = (-1)^j \mathop{\mathrm{B.V.}}_{\eta_j \to 0,\ \eta_j \in V^- \ \forall j \ne n} w^{(\pi J)}\Big( \xi_{n-1} + i\eta_{n-1}, \dots, \xi_1 + i\eta_1, -\sum_{j=1}^{n} (\xi_j + i\eta_j) \Big). \tag{101} \]
By Eq. (79), the left and right sides of this equation are equal, in the distributional sense, to the left respectively right side of Eq. (84). This proves the proposition and hence the theorem in the scalar, bosonic case.

Proof of Theorem 5.1, general case. The proof of the theorem when fields of arbitrary spinor type and/or fermionic fields are present does not differ substantially from the scalar bosonic case, so we will only briefly outline the main changes that occur in this more general case relative to the scalar bosonic case. If the functionals $\sigma^{(i)}$ in Eq. (34) carry abstract spinor indices collectively denoted $A_0$, and the fields $\phi^{(j_k)}$ carry spinor indices collectively denoted $A_k$, then the distributions $W^{(J)}_{\mu_1 \dots \mu_n}$ in Eq. (56) get replaced by spinor valued distributions,
\[ W^{(J)}_{\mu_1 \dots \mu_k\, \alpha_0 \dots \alpha_n}(y_1, \dots, y_n) \in \mathcal{D}'(\times^n B_r), \tag{102} \]
where $(J)$ stands for $(i j_1 \cdots j_n)$, and where $\alpha_j$ label the coordinate components corresponding to the abstract spinor indices $A_j$ in a suitable trivialization of the spin bundle. By repeating the same kind of arguments as in the scalar, bosonic case (taking into account the definition of the map $\tilde I$, see Appendix B), it is seen that the statement of the theorem in the general case can be reduced to the proof of the identity
\[ i^{F^{(i)}} (-1)^{M^{(i)}}\, W^{(J)}(y_1, \dots, y_n) = i^{F} (-1)^{M} (-1)^{j}\, W^{(\pi J)}(-y_n, \dots, -y_1). \tag{103} \]
The distributions $W^{(J)}$ now have the transformation behavior
\[ W^{(J)}(\Lambda(L) y_1, \dots, \Lambda(L) y_n) = D^{(J)}(L)\, W^{(J)}(y_1, \dots, y_n) \tag{104} \]
for all $L \in SL_2(\mathbb{C})$ with $\|L - 1\| < \delta$ and all $(y_1, \dots, y_n) \in \times^n B_{r-\delta}$, where $D^{(J)}(L)$ is now
\[ D^{(J)}(L)_{\alpha_0 \dots \alpha_n\, \mu_1 \dots \mu_j}{}^{\beta_0 \dots \beta_n\, \nu_1 \dots \nu_j} = D^{(i)}(L)^{\beta_0}{}_{\alpha_0}\, D^{(j_1)}(L)^{\beta_1}{}_{\alpha_1} \cdots D^{(j_n)}(L)^{\beta_n}{}_{\alpha_n}\, \Lambda(L)^{\nu_1}{}_{\mu_1} \cdots \Lambda(L)^{\nu_j}{}_{\mu_j}, \tag{105} \]
where $\Lambda(L)$ is the proper orthochronous Lorentz transformation corresponding to $L$ via the usual covering homomorphism $SL_2(\mathbb{C}) \to L_+^\uparrow$, and where $D^{(j)}(L)$ is the spinor representation of $SL_2(\mathbb{C})$ corresponding to the spinor character of the field $\phi^{(j)}$. The commutation relations (iii) are modified to
\[ W^{(J)}(y_1, \dots, y_k, y_{k+1}, \dots, y_n) = \pm W^{(\pi_{k,k+1} J)}(y_1, \dots, y_{k+1}, y_k, \dots, y_n), \tag{106} \]
whenever $y_k$ and $y_{k+1}$ are spacelike to each other with respect to the Minkowski metric $\eta_{\mu\nu}$, where $-$ is chosen if both fields $\phi^{(j_k)}$ and $\phi^{(j_{k+1})}$ are fermionic and $+$ is chosen otherwise. (The permutation must also act in the obvious way on the spinor indices of $W^{(J)}$, but we have suppressed this.) As in the scalar, bosonic case, the distributions $w^{(J)} = f^* W^{(J)}$ are shown to be boundary values of analytic functions on $\mathcal{T}_n$. In order to pass to the extended tube, $\mathcal{T}_n'$, one considers the analytic family of transformations
\[ L(\alpha + i\beta) = \begin{pmatrix} e^{i(\alpha + i\beta)/2} & 0 \\ 0 & e^{-i(\alpha + i\beta)/2} \end{pmatrix}, \tag{107} \]
in the group $SL_2(\mathbb{C})$, so that the complex Lorentz transformations $\Lambda(\alpha + i\beta)$ in Eq. (86) are given by the pair $(L(\alpha + i\beta), L(\alpha - i\beta)) \in SL_2(\mathbb{C}) \times SL_2(\mathbb{C})$ via the usual covering homomorphism $SL_2(\mathbb{C}) \times SL_2(\mathbb{C}) \ni (L, M) \mapsto \Lambda(L, M) \in L_+(\mathbb{C})$. It is then straightforward to prove the analog of Lemma 5.2 for the $w^{(J)}$. Taking these modifications into account, one then proves in basically the same way as in the scalar, bosonic case that
\[ W^{(J)}(y_1, \dots, y_n) = (-1)^{F(F-1)/2} (-1)^m (-1)^j\, W^{(\pi J)}(-y_n, \dots, -y_1) \tag{108} \]
for all $y_j \in B_r$ in the sense of distributions, where $F$ is the total number of fermion fields in the collection $\phi^{(j_1)}, \dots, \phi^{(j_n)}$, and where $m = M^{(i)} + M^{(j_1)} + \cdots + M^{(j_n)}$ is the total number of unprimed spinor indices represented by $\alpha_0 \dots \alpha_n$. One has
\[ (-1)^{F(F-1)/2} = \begin{cases} i^{F} & \text{if } F \text{ is even,} \\ i^{F-1} & \text{if } F \text{ is odd.} \end{cases} \tag{109} \]
If $F$ is even, then the field $\phi^{(i)}$ corresponding to the index $(i)$ in Eq. (34) has to be bosonic, $F^{(i)} = 0$, by Remark (2) following Theorem 5.1. If $F$ is odd, then $\phi^{(i)}$ has to be fermionic, $F^{(i)} = 1$. Hence in both cases, we get Eq. (103). This proves the theorem.

Acknowledgements. I would like to thank D. Buchholz, K. Fredenhagen, R. Verch, and R. M. Wald for helpful and stimulating discussions. This work was supported by NSF grant PHY00-90138 to the University of Chicago.
A. $\Sigma_A$ in Minkowski Spacetime

In this appendix, we show that when $M = (\mathbb{R}^4, \eta_{ab})$ is Minkowski spacetime and $\sigma$ is a linear functional on $\mathcal{A}(M)$ with bounded energy in the sense of Eq. (25), then $\sigma \in \Sigma_A(M)$, i.e., the $n$-point functions of $\sigma$ satisfy the microlocal spectrum condition (20) in Minkowski spacetime. To see most clearly what is involved, we will give a formal argument, which can however be made precise by the methods of [12, Chap. 3]. Consider the distribution $u$ in $n + 1$ spacetime variables defined formally by
\[ u(\xi_1, \dots, \xi_{n+1}) = \sigma\big( e^{i\xi_1^\mu P_\mu} \phi^{(1)}(0)\, e^{i\xi_2^\mu P_\mu} \cdots e^{i\xi_n^\mu P_\mu} \phi^{(n)}(0)\, e^{i\xi_{n+1}^\mu P_\mu} \big). \tag{110} \]
The Fourier transform of $u$ is then (formally) given by
\[ \hat u(k_1, \dots, k_{n+1}) = \sigma\big( \delta(P - k_1)\, \phi^{(1)}(0)\, \delta(P - k_2) \cdots \delta(P - k_n)\, \phi^{(n)}(0)\, \delta(P - k_{n+1}) \big). \tag{111} \]
Since the spectrum of the energy-momentum operator $P$ lies entirely within the forward lightcone $\bar V^+$, and since $\sigma$ has bounded energy in the sense of Eq. (25), it is easily seen that
\[ \mathrm{supp}\, \hat u \subset X_{p_0} \times (\times^{n-1} \bar V^+) \times X_{p_0}, \tag{112} \]
where $X_{p_0} = \{ k \in \bar V^+ \mid k^0 \le p_0 \}$. It follows by [20, Thm. 8.4.17] that
\[ \mathrm{WF}_A(u) \subset (\times^{n+1} \mathbb{R}^4) \times ( \{0\} \times (\times^{n-1} \bar V^+) \times \{0\} ). \tag{113} \]
Now, it is easy to see from the transformation law $U(a)\phi^{(i)}(y)U(a)^* = \phi^{(i)}(y + a)$ of the local covariant fields under translations $y \mapsto y + a$ that
\[ \sigma\big( \phi^{(1)}(y_1) \cdots \phi^{(n)}(y_n) \big) = u(y_1, y_1 - y_2, \dots, y_{n-1} - y_n, y_n) \tag{114} \]
\[ \phantom{\sigma\big( \phi^{(1)}(y_1) \cdots \phi^{(n)}(y_n) \big)} = f^* u(y_1, \dots, y_n), \tag{115} \]
where the linear map $f : \mathbb{R}^{4n} \to \mathbb{R}^{4(n+1)}$ is defined by the last equation. Thus, by the transformation properties of the analytic wave front set under analytic maps,
\[ \mathrm{WF}_A\Big( \sigma\Big( \prod_{k=1}^{n} \phi^{(k)}(y_k) \Big) \Big) = f^* \mathrm{WF}_A(u) \subset \big\{ (y_1, k_1; \dots; y_n, k_n) \mid k_1 = p_{12},\ k_n = -p_{(n-1)n},\ k_i = p_{i(i+1)} - p_{(i-1)i} \text{ for } 2 \le i \le n - 1,\ p_{ij} \in \bar V^+ \text{ for all } i, j \big\}. \tag{116} \]
The right side of this inclusion is contained in the set $\Gamma_M$ when $M$ is Minkowski spacetime. This is seen by taking $G(p)$ in the definition (22) of $\Gamma_M$ to be the linear graph in which each point $x_i$ is connected to its predecessor $x_{i-1}$ by precisely one straight line, $e$, decorated with the momentum $p_e = p_{(i-1)i} \in \bar V^+$. This proves that $\sigma \in \Sigma_A(M)$.
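To make the last step concrete, the sketch below compares the covectors of the chain form in Eq. (116) with those obtained from the general formula $k_i = \sum_{j>i} p_{ij} - \sum_{j<i} p_{ji}$ of Eq. (62) when only neighbouring points are connected. The function names and the sample momenta are hypothetical and serve only to illustrate that the linear graph is a special case of the general decorated graph.

```python
def covectors_from_graph(p, n):
    """General form: k_i = sum_{j>i} p[(i, j)] - sum_{j<i} p[(j, i)] (0-based).

    Pairs missing from `p` are treated as carrying zero momentum.
    """
    zero = (0.0, 0.0, 0.0, 0.0)
    k = []
    for i in range(n):
        vec = [0.0] * 4
        for j in range(i + 1, n):
            vec = [a + b for a, b in zip(vec, p.get((i, j), zero))]
        for j in range(i):
            vec = [a - b for a, b in zip(vec, p.get((j, i), zero))]
        k.append(vec)
    return k


def covectors_from_chain(q):
    """Chain form of Eq. (116); q[i] is the momentum on the line joining
    point i and point i + 1 (0-based), so there are len(q) + 1 points."""
    n = len(q) + 1
    k = [list(q[0])]
    for i in range(1, n - 1):
        k.append([a - b for a, b in zip(q[i], q[i - 1])])
    k.append([-c for c in q[n - 2]])
    return k


def chain_matches_graph(q, tol=1e-12):
    """True if both constructions give the same covectors for the chain q."""
    n = len(q) + 1
    p = {(i, i + 1): q[i] for i in range(n - 1)}
    general = covectors_from_graph(p, n)
    chain = covectors_from_chain(q)
    return all(abs(a - b) <= tol
               for u, v in zip(general, chain) for a, b in zip(u, v))
```

With three sample momenta in the forward cone, `chain_matches_graph` confirms that the first covector is $p_{12}$, the last is $-p_{(n-1)n}$, and the intermediate ones are differences of neighbouring momenta.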
B. Spinors in Curved Spacetimes

We here review the construction of spinors on a 4-dimensional curved spacetime $M$, with a particular eye on the role played by the orientations. Our review follows closely [26], to which we refer for details. Let $M$ be a globally hyperbolic spacetime with space and time orientation $o = (T, \epsilon_{abcd})$. Let $x$ be a point in the spacetime. In the tangent space $T_x(M)$ at $x$ we consider the set $F_x(M)$ of all oriented, time oriented orthonormal frames $(e_\mu)^a$ (where $\mu = 0, 1, 2, 3$), meaning that
\[ g_{ab} (e_\mu)^a (e_\nu)^b = \eta_{\mu\nu}, \qquad (e_0)^a \nabla_a T > 0, \qquad \epsilon_{abcd} (e_0)^a (e_1)^b (e_2)^c (e_3)^d > 0. \tag{117} \]
Clearly, if $(e'_\mu)^a$ is another such frame, then there is a unique proper orthochronous Lorentz transformation $\Lambda$ such that $(e'_\mu)^a = \Lambda^\nu{}_\mu (e_\nu)^a$. The frame bundle, $F(M)$, is defined as the union of the spaces $F_x(M)$ as $x$ runs over all points in $M$. It has the structure of a principal fibre bundle over $M$, whose structure group is the proper orthochronous Lorentz group $L_+^\uparrow$, which acts upon elements in each fibre $F_x(M)$ by transforming the orthonormal frames. Spinors over $M$ can be defined if and only if there exists a principal fibre bundle $S(M)$, called "spin-bundle", with structure group $SL_2(\mathbb{C})$ and base manifold $M$ that covers the frame bundle in the sense that there is an onto map $f : S(M) \to F(M)$ such that the group action of $SL_2(\mathbb{C})$ on the spin bundle corresponds to the group action of the Lorentz group on the frame bundle via the covering homomorphism $SL_2(\mathbb{C}) \to L_+^\uparrow$. A spin bundle need not exist in a general curved spacetime, and if it exists, it need not be unique.

The situation is however rather simple in the case when the spacetime $M$ is simply connected, $\pi_1(M) = 0$. In that case, a (necessarily unique) spin bundle will exist if and only if $\pi_1(F(M)) = \mathbb{Z}_2$, and the spin bundle is in fact simply given by the universal covering space of the frame bundle $F(M)$. (Remember that the universal covering space $\tilde X$ of a topological space $X$ is the space of equivalence classes of continuous paths $\gamma : [0, 1] \to X$ with $\gamma(0) = x_0$, where two paths are equivalent if they can be composed to a closed path in $X$ that is homotopic to the trivial path given by $\gamma(\lambda) = x_0$ for all $\lambda \in [0, 1]$.) Let $\chi : N \to M$ be an isometric, orientation and time orientation preserving embedding. If $(e_\mu)^a$ is an oriented and time oriented orthonormal frame on $N$, then clearly $\chi_* (e_\mu)^a$ will be such a frame over $M$ and this defines an embedding $\chi_* : F(N) \to F(M)$.
This embedding lifts to a corresponding map between the covering spaces by defining its action on a path γ : [0, 1] → F (N ) to be the path γ : [0, 1] → F (M) given by γ (λ) = χ∗ γ (λ) for all λ ∈ [0, 1], because it is easily seen that the equivalence class of γ only depends on the equivalence class of γ . Thus, if both N and M are simply connected and π1 (F (M)) = Z2 , then also π1 (F (N )) = Z2 , and we get a natural embedding map
χ∗ : S(N ) → S(M)
(118)
which is compatible with the action of the group $SL_2(\mathbb{C})$ on these spaces. (This can be seen by noting that, since $SL_2(\mathbb{C})$ is the universal cover of $L_+^\uparrow$, every $L \in SL_2(\mathbb{C})$ can be identified with an equivalence class of continuous paths $\gamma_L : [0, 1] \to L_+^\uparrow$ starting at 1 and ending at $\Lambda$, the element in $L_+^\uparrow$ covered by $L$.) If the spacetime $M$ is not simply connected, i.e. $\pi_1(M) = G \ne 0$, then one can show that a spin-bundle $S(M)$ covering the frame bundle will exist if and only if the fundamental group of the frame bundle is isomorphic to a direct product
\[ \psi : \pi_1(F(M)) \cong \mathbb{Z}_2 \times G \tag{119} \]
in the sense that every element of the form (g1 , e2 ) (with e2 the identity element in G) corresponds via ψ to a path in F (M) that is homotopic to a path lying within a single fiber Fx (M), and that each path in F (M) corresponding to an element of the form (e1 , g2 ) (with e1 the identity in Z2 ) projects down to a path in M that is homotopic to a path representing g2 ∈ π1 (M). In this case, one can define a spin bundle S(M) as the space of equivalence classes of continuous paths in F (M), where two such paths are now regarded as equivalent if their composition can be continuously deformed to a path that corresponds to the group element of the form (e1 , g2 ) under the isomorphism ψ. Since ψ is not necessarily unique18 , there may now exist several inequivalent constructions of S(M), each corresponding to a particular choice for the isomorphism ψ. An isometric embedding χ : N → M between spacetimes admitting a spin structure, can therefore be lifted to a map χ∗ as in Eq. (118) for one and only one choice of spin-structure over N. Spinors in the spacetime M are constructed as elements in the vector bundles that are associated with the principal fibre bundle S(M). These are defined as follows. On the cartesian product S(M) × C2 we define an equivalence relation ∼ by declaring two elements (s, v) and (s , v ) to be equivalent if there is an element L ∈ SL2 (C) such that s = L−1 s and v = D(L)v, where L−1 s denotes the action of a group element L on an element s in the principal fibre bundle, and where D is the fundamental representation of SL2 (C) on C2 . The space of equivalence classes V(M) = (S(M) × C2 )/ ∼
(120)
is then seen to be a vector bundle over M with each fibre isomorphic to C2 . Classical spinors fields over M (with an upper unprimed spinor index) are by definition sections in this vector bundle. This construction can be varied by replacing the space C2 by ∗ ¯ 2∗ of antilinear the dual space C2 of complex linear functionals on C2 , or the space C 2 2 2∗ ¯ dual to C ¯ , and by replacing the representation D functionals on C , or the space C by the appropriate representations of SL2 (C) on these spaces. We shall denote the corresponding vector bundles over M by V ∗ (M), V ∗ (M) and V (M), respectively. They correspond to spinors with a lower unprimed, lower primed and upper primed index. Suppose that we are given an isometric, orientation and causality preserving embedding, χ : N → M between two oriented spacetimes N and M and suppose that each spacetime has a spin structure such that χ lifts to a map χ∗ as in Eq. (118) between the corresponding spin bundles. In this situation, we automatically get a map χ∗ : V(N ) → V(M),
[(s, v)]∼ → [(χ∗ (s), v)]∼
(121)
between the corresponding associated spin bundles (and likewise the bundles V (M), V ∗ (M) and V (M) as well as their tensor products). We finally explain the dependence of the above construction of spinors on the choice of space and time orientation of the spacetime. Let F (M) be the bundle of frames that are orthogonal with respect to the metric gab and that are oriented and time oriented with respect to a time and space orientation o = (T , abcd ) on M, and let F (M) be the bundle of orthonormal frames that are oriented with respect to the opposite time 18 It is clear that the different possible choices for ψ are in 1-to-1 correspondence with the non-unity automorphisms of the group Z2 × G. These in turn are easily seen to be in 1-to-1 correspondence with the normal subgroups H of G such that G/H = Z2 .
and space orientation, −o = (−T , abcd ). Then these bundles are naturally isomorphic under the map I : F (M) → F (M),
(eµ )a → (−eµ )a ,
(122)
since a tetrad (eµ )a is positively oriented with respect to o if and only if the tetrad (−eµ )a is positively oriented with respect to −o. As explained above, a construction of a spinor bundle S(M) covering the frame bundle F (M) is equivalent to a choice of isomorphism ψ : π1 (F (M)) → Z2 × G, and a construction of a spinor bundle S(M) in the spacetime M with the opposite orientations is likewise equivalent to a choice of isomorphism ψ : π1 (F (M)) → Z2 × G. It is possible to see that the map I will lift to a corresponding bundle isomorphism I˜ between the spinor bundles S(M) and S(M) if and only if ψ and ψ are compatible in the sense that ψ ◦ I ◦ ψ −1 is the identity homomorphism in π1 (M) × Z2 . By changing ψ or ψ if necessary, we can therefore always assume that there is indeed a natural map I˜ identifying the spin bundles S(M) and S(M). From the above constructions it is then clear that this map will induce a corresponding map I˜ : V(M) → V(M)
(123)
between the corresponding associated vector bundles of which the spinors are elements, and similar statements hold for the bundles V (M), V ∗ (M), V ∗ (M), as well as their tensor products. This provides us with a natural identification of spinors over the spacetimes M and M.

C. Proof of Lemma 5.1

For the convenience of the reader, we repeat the statement of the lemma:

Lemma C.1. Let $u^{(s)} \in \mathcal{D}'(X)$, $X$ an open subset of $\mathbf{R}^m$, be a family of distributions that depends analytically on $s \in I = (a, b)$ with respect to conic sets $K^{(s)} \subset X \times (C \setminus \{0\})$, where $C$ is a closed, proper convex cone in $\mathbf{R}^m$. Suppose that

$$\frac{d^k}{ds^k} u^{(s)} \Big|_{s=s_0} = 0 \tag{124}$$
for all $k$ and some $s_0 \in I$. Then $u^{(s)} = 0$ for all $s \in I$.

Proof. Let us set $\tilde X = I \times X$, $\tilde x = (s, x)$, and view the family of distributions $u^{(s)}$ as defining a distribution $\tilde u \in \mathcal{D}'(\tilde X)$. Since the set $C$ is proper, closed and convex, we may without loss of generality assume that $C = \{k \in \mathbf{R}^m \mid k \cdot n \ge \delta \|n\| \|k\|\}$, where "dot" is the standard inner product for vectors in $\mathbf{R}^m$, where $\|\cdot\|$ is the corresponding norm, where $0 < \delta < 1$ and where $n$ is some nonzero vector in $\mathbf{R}^m$. Let $\tilde n = (0, n) \in \mathbf{R}^{m+1}$ and consider the quantity

$$c = \inf_{(\tilde x, \tilde k) \in WF_A(\tilde u)|_{\tilde X_0}} \frac{\tilde k \cdot \tilde n}{\|\tilde k\| \, \|\tilde n\|}, \tag{125}$$

where $\tilde X_0$ is any compact subset of $\tilde X$. We claim that $c > 0$. Since the analytic wave front set is closed, the infimum is achieved for some $(\tilde x_0, \tilde k_0) \in WF_A(\tilde u)|_{\tilde X_0}$. If $c \le 0$, we have consequently

$$0 \ge \tilde k_0 \cdot \tilde n = k_0 \cdot n, \qquad k_0 = {}^t f^{(s)\prime}(x_0)\, \tilde k_0, \qquad \tilde x_0 = f^{(s)}(x_0), \tag{126}$$

where $f^{(s)} : X \to \tilde X$ is the embedding map. It is therefore not possible that $k_0 \in C$, unless $k_0 = 0$. This is however in contradiction with the assumption of the lemma, since the analyticity of $u^{(s)}$ with respect to $s$ implies that when $(\tilde x_0, \tilde k_0) \in WF_A(\tilde u)$, then necessarily $k_0 \in C \setminus \{0\}$. We must therefore have that $c > 0$, and consequently that

$$WF_A(\tilde u)|_{\tilde X_0} \subset \tilde X_0 \times \tilde C, \qquad \tilde C = \{\tilde k \in \mathbf{R}^{m+1} \mid \tilde k \cdot \tilde n \ge c \|\tilde n\| \|\tilde k\|\}, \tag{127}$$
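and we may assume without loss of generality that $c < 1$. The cone $\tilde C$ is the dual of the open cone consisting of all $\tilde x \in \mathbf{R}^{m+1}$ such that $\tilde x \cdot \tilde n > (1 - c)\|\tilde n\| \|\tilde x\|$. By Thm. 5.2, we can therefore conclude that there is a function $\tilde U$ that is analytic in the complex domain consisting of all $\tilde x + i\tilde y \in \mathbf{C}^{m+1}$ for which $\tilde y \cdot \tilde n > (1 - c)\|\tilde n\| \|\tilde y\|$ and $\tilde x \in \tilde X_0$, so that $\tilde u$ is the boundary value of $\tilde U$ as $\tilde y \to 0$,

$$\tilde u(s, x) = \mathop{\mathrm{B.V.}}_{(t,y) \to 0,\; n \cdot y > (1-c)\|n\| \sqrt{\|y\|^2 + t^2}} \tilde U(s + it, x + iy), \tag{128}$$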
where we are now writing $\tilde x = (s, x)$, $\tilde y = (t, y)$, and where we have used the definition $\tilde n = (0, n)$. We may set $t = 0$ on the right side of this equation when $y \cdot n > (1 - c)\|n\| \|y\|$, and take $k$ derivatives with respect to $s$ of both sides of the equation. Setting $s = s_0$ and using the assumption of the lemma, this gives

$$0 = \mathop{\mathrm{B.V.}}_{y \to 0,\; n \cdot y > (1-c)\|n\| \|y\|} \frac{d^k}{ds^k} \tilde U(s_0, x + iy) \qquad \forall k. \tag{129}$$
We already know that the function $x + iy \mapsto \frac{d^k}{ds^k} \tilde U(s_0, x + iy)$ is analytic when $n \cdot y > (1 - c)\|n\| \|y\|$, and we have now found that its distributional boundary values as $y \to 0$ vanish. We therefore conclude, by the "edge-of-the-wedge theorem" (see e.g. [12, Thm. 2.17]), that this function itself has to vanish. Therefore

$$\tilde U(s, x + iy) = \sum_{k=0}^{\infty} \frac{(s - s_0)^k}{k!} \frac{d^k}{ds^k} \tilde U(s_0, x + iy) = 0 \tag{130}$$
for sufficiently small $|s - s_0|$, and hence for all $s$ in $I$. Thus, by Eq. (128), $\tilde u(s, x) = u^{(s)}(x) = 0$ in the sense of distributions for all $s \in I$.

References
1. Wilson, K.: Non-Lagrangian models of current algebras. Phys. Rev. 179, 1499 (1969); Anomalous dimensions and the breakdown of scale invariance in perturbation theory. Phys. Rev. D2(8), 1478–1493 (1970)
2. Zimmermann, W.: Normal products and the short distance expansion in the perturbation theory of renormalizable interactions. Ann. Phys. 77, 570 (1973) [Lect. Notes Phys. 558, 278 (2000)]
3. Wilson, K., Zimmermann, W.: Operator product expansions and composite field operators in the general framework of quantum field theory. Commun. Math. Phys. 24, 87–106 (1972)
4. Fredenhagen, K., Jörss, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansions. Commun. Math. Phys. 176, 541–554 (1996)
5. Schlieder, S., Seiler, E.: Remarks concerning the connection between properties of the 4-point function and the Wilson-Zimmermann expansion. Commun. Math. Phys. 31, 137–159 (1973)
6. Lüscher, M.: Operator product expansions on the vacuum on conformal quantum field theory in two space-time dimensions. Commun. Math. Phys. 50, 23 (1976)
7. Bostelmann, H.: Lokale Algebren und Operatorprodukte am Punkt. Doktorarbeit, Universität Göttingen, 2000, available at http://www.lqp.uni-goettingen.de/papers/00/12/00121700.html
8. Fredenhagen, K., Hertel, J.: Local algebras of observables and pointlike localized fields. Commun. Math. Phys. 80, 555–561 (1986)
9. Buchholz, D., Wichmann, E.H.: Causal independence and the energy level density of states in local quantum field theory. Commun. Math. Phys. 106, 321 (1986)
10. Brunetti, R., Fredenhagen, K., Köhler, M.: The microlocal spectrum condition and Wick polynomials on curved spacetimes. Commun. Math. Phys. 180, 633–652 (1996) [arXiv:math-ph/9903028]
11. Hollands, S., Wald, R.M.: Work in progress
12. Streater, R.F., Wightman, A.S.: PCT, Spin and Statistics and All That. New York: Benjamin, 1964
13. Pauli, W.: Exclusion Principle, Lorentz Group and Reflection of Space-Time and Charge. In: Niels Bohr and the Development of Physics, W. Pauli (ed.), New York: Pergamon Press, 1955, p. 30; R. Jost, Eine Bemerkung zum CPT-Theorem. Helv. Phys. Acta 30, 409 (1957)
14. Borchers, H.J., Yngvason, J.: On the PCT-theorem in the theory of local observables. [arXiv:math-ph/0012020]
15. Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle: A new paradigm for local quantum physics. Commun. Math. Phys. 237, 31–68 (2003)
16. Hollands, S., Wald, R.M.: Local Wick polynomials and time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 223, 289 (2001) [arXiv:gr-qc/0103074]
17. Hollands, S., Wald, R.M.: Existence of local covariant time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 231, 309 (2002) [arXiv:gr-qc/0111108]
18. Haag, R.: Local Quantum Physics. Berlin: Springer-Verlag, 1992
19. Verch, R.: A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework. Commun. Math. Phys. 223, 261 (2001) [arXiv:math-ph/0102035]
20. Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Berlin: Springer-Verlag, 1983
21. Kay, B.S., Wald, R.M.: Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on spacetimes with a bifurcate Killing horizon. Phys. Rep. 207, 49–136 (1995)
22. Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529 (1996)
23. Strohmaier, A., Verch, R., Wollenberg, M.: Microlocal analysis of quantum fields on curved spacetimes: Analytic wavefront sets and Reeh-Schlieder theorems. J. Math. Phys. 43, 5514–5530 (2002)
24. Soloviev, M.A.: PCT, spin-statistics, and analytic wave front set. Theor. Math. Phys. 121, 1377 (1999)
25. Steinmann, O.: Perturbation expansions in axiomatic field theory. Lect. Notes Phys. 11, Berlin: Springer-Verlag, 1971
26. Wald, R.M.: General Relativity. Chicago: University of Chicago Press, 1984

Communicated by H. Nicolai
Commun. Math. Phys. 244, 245–260 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0974-6
Communications in
Mathematical Physics
A Strong Regularity Result for Parabolic Equations
Qi S. Zhang
Department of Mathematics, University of California, Riverside, CA 92521, USA. E-mail: [email protected]
Received: 7 January 2003 / Accepted: 9 July 2003
Published online: 18 November 2003 – © Springer-Verlag 2003
Abstract: We consider a parabolic equation with a drift term, $\Delta u + b\nabla u - u_t = 0$. Under the condition $\operatorname{div} b = 0$, we prove that solutions possess dramatically better regularity than those provided by standard theory. For example, we prove continuity of solutions when not even boundedness is expected.

1. Introduction

We aim to study the parabolic equation

$$\Delta u(x, t) + b(x, t)\nabla u(x, t) - u_t(x, t) = 0, \qquad (x, t) \in \mathbf{R}^n \times (0, \infty), \tag{1.1}$$
where $\Delta$ is the standard Laplacian, $b$ is a vector valued function and $n \ge 2$. Standard existence and regularity theory for this kind of equation has existed for several decades. For instance, when $|b| \in L^p(\mathbf{R}^n)$, $p > n$, the fundamental solution of (1.1) has local in time Gaussian lower and upper bounds ([A]). Hence bounded solutions are Hölder continuous. In this paper we study the regularity problem of (1.1) for much more singular functions $b$.

Several factors provide strong motivation for studying this kind of problem. The first is to investigate a possible gain of regularity in the presence of the singular drift term $b$. This line of research has been followed in the papers [St, KS, O, CrZ, Se]. Under the condition $|b| \in L^n(\mathbf{R}^n)$, Stampacchia [St] proved that bounded solutions of $\Delta u + b\nabla u = 0$ are Hölder continuous. In the paper [CrZ], Cranston and Zhao proved that solutions to this equation are continuous when $b$ is in a suitable Kato class, i.e.

$$\lim_{r \to 0} \sup_x \int_{|x-y| \le r} \frac{|b(y)|}{|x-y|^{n-1}}\,dy = 0.$$

In the paper [KS] Kovalenko and Semenov proved the Hölder continuity of solutions when $|b|^2$ is independent of time and is sufficiently small in the form sense. See the next paragraph for a statement of their condition. This result was recently generalized in [Se] to equations with leading term in divergence form. In [O], Osada proved, among other things, that the fundamental solution of (1.1)
has global Gaussian upper and lower bounds when $b$ is the derivative of bounded functions (in the distribution sense) and $\operatorname{div} b = 0$. More recently, in the paper [LZ], Hölder continuity of solutions to (1.1) was established when $|b|^2$ is form bounded and $\operatorname{div} b = 0$. See below for a description. We should mention that many authors have also studied the regularity property of the related heat equation $\Delta u + Vu - u_t = 0$. Here $V$ is a singular potential. We refer the reader to the papers by Aizenman and Simon [AS], Simon [Si] and references therein. It is worth remarking that the current situation exhibits fundamentally new phenomena compared with that case.

Another motivation comes from the study of nonlinear equations involving gradient structures. These include the Navier-Stokes equations, which can be regarded as systems of parabolic equations with very singular first order terms. Our result provides a different proof of the well known fact that weak solutions to the two dimensional Navier-Stokes equations are smooth (Corollary 2). For the three dimensional Navier-Stokes equations, it is interesting to note that the singularity of the velocity field is covered by our theorem (see the discussion at the end of the introduction).

In this paper we actually go much beyond the above kinds of singularities. A special case of our result states that, under the assumption that $\operatorname{div} b = 0$, weak solutions of (1.1) are bounded as long as $b \in L^p_{loc}(\mathbf{R}^n)$ with $p > n/2$, $n \ge 4$. For a time dependent vector field $b \in L^2_{loc}(\mathbf{R}^n \times (0, \infty))$, it suffices to assume the general form bounded condition: for a fixed $m \in (1, 2]$ and $\phi \in C_0^\infty(\mathbf{R}^n \times (0, \infty))$,

$$\int\!\!\int_{\mathbf{R}^n} |b(x, t)|^m \phi^2 \,dx\,dt \le k \int\!\!\int_{\mathbf{R}^n} |\nabla_x \phi(x, t)|^2 \,dx\,dt,$$
where $k$ is independent of $\phi$. When $m = 2$ and $k$ is sufficiently small, we are in a situation covered by the papers [KS] and [LZ]. The most interesting case is when $m$ is close to 1. It is widely assumed that solutions of (1.1) can be regular only if the above inequality holds for $m \ge 2$. However, Theorem 1.1 below proves that weak solutions to (1.1) are bounded as long as $m > 1$. In fact they are Lipschitz in the spatial direction. Hence $b$ can be almost twice as singular as allowed by standard theory, provided that $\operatorname{div} b = 0$.

The above class of the drift term $b$ includes and much exceeds the generalized Kato class that has been studied in several interesting papers [ChZ, CrZ, G]. These functions in general are not the derivative of bounded functions considered in [O] either (see Remark 1.1 below). Here is an example. Let $b = b(x_1, x_2, x_3)$ be a vector field in $\mathbf{R}^3$. If $b$ has a local singularity of the form $c/|x|^{1+\epsilon}$ with a small $\epsilon > 0$, then $b$ is in this class. In contrast, all previous results at best allow singularities of the form $c/|x|$.

In addition to the unexpected regularity result, we also prove that the fundamental solution of (1.1) satisfies a global Gaussian-like upper bound. An interesting feature of the bound is that it is not just a perturbation result: the global bound holds when $b$ satisfies (1.2) with no additional smallness condition. Using the approximation result in Sect. 2, the fundamental solution here means the minimum of pointwise limits of the fundamental solution of (1.1) with $b$ replaced by a sequence of smooth and divergence free vector fields. An interesting fact is that the bound is no longer Gaussian, reflecting the contribution given by the singularity of the drift term $b$. In the Kato class case, local in time Gaussian bounds for the heat kernel with singular drift terms were obtained in [Z], which was extended in [LS] recently.
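The example with the singularity $c/|x|^{1+\epsilon}$ can be made concrete with a little exponent arithmetic. The sketch below is only an illustration and rests on an assumption not spelled out in the text, namely that the form bound near the origin is obtained from the classical Hardy inequality $\int \phi^2/|x|^2 \le (2/(n-2))^2 \int |\nabla \phi|^2$ in $\mathbf{R}^n$, $n \ge 3$. Under that assumption $|b|^m \le c^m/|x|^2$ for $|x| \le 1$ as soon as $m(1+\epsilon) \le 2$; the function names are hypothetical.

```python
from fractions import Fraction


def largest_admissible_m(eps):
    """Largest m with m * (1 + eps) <= 2, so that |b|^m <= c^m / |x|^2 for |x| <= 1
    when b has the local singularity c / |x|^(1 + eps)."""
    return Fraction(2) / (1 + Fraction(eps))


def hardy_constant(n):
    """Constant of the classical Hardy inequality in R^n, n >= 3 (assumed mechanism):
    int phi^2 / |x|^2 dx <= (2 / (n - 2))^2 * int |grad phi|^2 dx."""
    return Fraction(2, n - 2) ** 2


eps = Fraction(1, 10)
m = largest_admissible_m(eps)
print(m, m > 1, m <= 2)   # 20/11 True True
print(hardy_constant(3))  # 4
```

For any $0 < \epsilon < 1$ the resulting $m$ lies in $(1, 2)$, which is the range allowed in the form bounded condition.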
In this paper we use the following definition of weak solutions.

Definition 1.1. Let $D \subseteq \mathbf{R}^n$ be a domain and $T \in (0, \infty]$. A function $u$ such that $u, |\nabla u| \in L^2_{loc}(D \times [0, T])$ is a weak solution to (1.1) if, for any $\phi \in C_0^\infty(D \times (-T, T))$, there holds

$$\int_0^T\!\!\int_D (u\,\partial_t \phi - \nabla u \nabla \phi)\,dx\,dt + \int_0^T\!\!\int_D b\nabla u\, \phi \,dx\,dt = -\int_D u_0(x)\phi(x, 0)\,dx.$$
The assumption that $\nabla u$ is square integrable can be weakened. However, we are not seeking full generality here. Now we are ready to state the main theorem of the paper.

Theorem 1.1. Suppose $b$ satisfies:
(1) $b \in L^2_{loc}(\mathbf{R}^n \times [0, \infty))$ and $\operatorname{div} b(\cdot, t) = 0$;
(2) for a fixed $m \in (1, 2]$ and any $\phi \in C^\infty(\mathbf{R}^n \times (0, \infty))$ with compact support in the spatial direction,

$$\int\!\!\int_{\mathbf{R}^n} |b|^m \phi^2 \,dx\,dt \le k \int\!\!\int_{\mathbf{R}^n} |\nabla \phi|^2 \,dx\,dt, \tag{1.2}$$

where $k$ is independent of $\phi$.

Then the following statements hold.
(i) Weak solutions of (1.1) are locally bounded.
(ii) Weak solutions of (1.1) are Hölder continuous when $b = b(x)$ and $m = 2$.
(iii) Let $G$ be the fundamental solution of (1.1). There exist positive constants $c_1$ and $c_2$ such that, for any $x, y \in \mathbf{R}^n$ and $t > s > 0$,

$$G(x, t; y, s) \le \begin{cases} \dfrac{c_1 (t-s)}{[(t-s)\,|B(x, \sqrt{t-s})|]^{m/(2(m-1))}} \left[ \exp\!\left(-c_2 \dfrac{|x-y|^m}{(t-s)^{m-1}}\right) + \exp\!\left(-c_2 \dfrac{|x-y|^2}{t-s}\right) \right], & t - s \le 1, \\[2ex] \dfrac{c_1}{|B(x, \sqrt{t-s})|} \left[ \exp\!\left(-c_2 \dfrac{|x-y|^m}{(t-s)^{m-1}}\right) + \exp\!\left(-c_2 \dfrac{|x-y|^2}{t-s}\right) \right], & t - s \ge 1. \end{cases}$$
Remark 1.1. Note that the above upper bound reduces to the standard Gaussian upper bound when $m = 2$. This case was recently investigated in [LZ]. Part (ii), which follows from [LZ], is included here for completeness.

If $b \in L^p(\mathbf{R}^n)$ with $p > n/2$, then it is well known that (1.2) is satisfied. See [Si]. Hence we have

Corollary 1. Let $u$ be a weak solution of the elliptic equation $\Delta u + b\nabla u = 0$. Suppose $b \in L^p_{loc}(\mathbf{R}^n)$, $p > n/2$, $n \ge 4$, and $\operatorname{div} b = 0$. Then $u$ is a bounded function.

Note that without the assumption $\operatorname{div} b = 0$, it is known ([St]) that $u$ is Hölder continuous when $b \in L^p_{loc}$ with $p = n$.

Remark 1.2. Due to its importance and potential applications, we single out part of the result of Theorem 1.1 in the three dimensional case as a corollary.

Corollary 2. Let $D \subseteq \mathbf{R}^n$, $n = 2, 3$. Assume $|b| \in L^\infty([0, T], L^2(D))$ and $\operatorname{div} b = 0$. Suppose $u$ is a weak solution of (1.1) in $D \times [0, T]$. Then $u$ is locally bounded. In particular, weak solutions to the two dimensional Navier-Stokes equations are smooth when $t > 0$.
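Proof. It is enough to prove that the above condition on $b$ alone implies that condition (1.2) is satisfied for some $m > 1$. Here is a proof when $n = 3$. The case when $n = 2$ is dealt with similarly. Let us take $m = 4/3$ and $p = 2/m = 3/2$. Then, by Hölder's inequality,

$$\begin{aligned} \int_0^T\!\!\int_D |b|^{4/3} \phi^2 \,dx\,dt &\le \int_0^T \left( \int_D |b|^{mp} \,dx \right)^{1/p} \left( \int_D \phi^{2p/(p-1)} \,dx \right)^{(p-1)/p} dt \\ &= \int_0^T \left( \int_D |b|^2 \,dx \right)^{2/3} \left( \int_D \phi^6 \,dx \right)^{1/3} dt \\ &\le \sup_{t \in [0, T]} \left( \int_D |b|^2(x, t) \,dx \right)^{2/3} \int_0^T \left( \int_D \phi^6 \,dx \right)^{1/3} dt \\ &\le C \sup_{t \in [0, T]} \left( \int_D |b|^2(x, t) \,dx \right)^{2/3} \int_0^T\!\!\int_D |\nabla \phi|^2 \,dx\,dt. \end{aligned}$$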
The last step follows from the Sobolev imbedding. Now let $u = (u_1(x, t), u_2(x, t))$ be a weak solution to the 2-d Navier-Stokes equation

$$\Delta u - u\nabla u - \nabla P - u_t = 0, \qquad \operatorname{div} u = 0.$$

Then the curl of $u$, denoted by $w$, is a scalar satisfying $\Delta w - u\nabla w - w_t = 0$, which is of the form (1.1) with $b = -u$. By definition, $u \in L^\infty((0, \infty), L^2(\mathbf{R}^2))$. So Theorem 1.1 shows that $w$ is bounded when $t > 0$. Hence $u$ is smooth too when $t > 0$.
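The exponent bookkeeping in the Hölder and Sobolev steps of the proof of Corollary 2 (case $n = 3$, $m = 4/3$) can be checked with exact rational arithmetic. This is a minimal sketch for verification only; the function name and the dictionary keys are hypothetical.

```python
from fractions import Fraction


def corollary2_exponents(n=3, m=Fraction(4, 3)):
    """Exponents appearing in the Hoelder step with p = 2/m and its conjugate q."""
    p = 2 / m                 # p = 3/2 when m = 4/3
    q = p / (p - 1)           # conjugate exponent, q = p/(p-1)
    return {
        "b_power": m * p,                    # |b|^{mp}
        "phi_power": 2 * q,                  # phi^{2p/(p-1)}
        "sobolev": Fraction(2 * n, n - 2),   # critical Sobolev exponent 2n/(n-2)
        "outer_b": 1 / p,                    # outer power on the |b| integral
        "outer_phi": 1 / q,                  # outer power (p-1)/p on the phi integral
    }


e = corollary2_exponents()
# |b|^{mp} = |b|^2, and phi^6 matches the Sobolev exponent in R^3.
assert e["b_power"] == 2 and e["phi_power"] == e["sobolev"] == 6
assert e["outer_b"] == Fraction(2, 3) and e["outer_phi"] == Fraction(1, 3)
```

Since the outer power on the $\phi$ integral is $1/3$, the factor $(\int_D \phi^6)^{1/3}$ equals $\|\phi\|_{L^6}^2$, which is exactly what the Sobolev inequality controls by $\int_D |\nabla \phi|^2$.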
Discussion. Here we would like to speculate on some possible links between the regularity problem of the 3-d Navier-Stokes equation and Corollary 2. Let $u$ be a Leray-Hopf solution to the 3-d Navier-Stokes equation

$$\Delta u - u\nabla u - \nabla P - u_t = 0, \qquad \operatorname{div} u = 0, \qquad |u(\cdot, 0)| \in L^2(\mathbf{R}^3).$$

Then it is well known that $\|u(\cdot, t)\|_{L^2(\mathbf{R}^3)}$ is non-increasing and hence uniformly bounded. Therefore, assuming only that the pressure term $P$ is sufficiently regular locally, Corollary 2 implies that $u$ is bounded and hence smooth. It seems that all previous regularity results make some global restrictions either on $P$ or on the initial value $u_0$. We mention the recent interesting result of Seregin and Sverak [SS]. The authors proved that $u$ is smooth provided that $u_0 \in W^{1,2}$ and $P$ is bounded from below. See also [BG]. It would be interesting to see how far the method in this paper may go for the system case.

The rest of the paper is organized as follows. In Sect. 2 we show some approximation results for solutions of (1.1) with a singular drift term. Theorem 1.1 will be proven in Sect. 3.
2. Preliminaries

Since the drift term $b$ in (1.1) can be much more singular than those allowed by the standard theory, the existence and uniqueness of weak solutions of (1.1) cannot be taken for granted. In order to proceed, we first need to prove some approximation results. The next proposition shows that Eq. (1.1) possesses weak solutions even when $b$ satisfies the assumption of Theorem 1.1.

Proposition 2.1. Let $b$ be given as in Theorem 1.1 and let $b_k$ be a sequence of smooth divergence free vector fields. Suppose $b_k \to b$ in $L^2(D \times [0, T])$ norm and let $u_k$ be the unique solution to

$$\begin{cases} \Delta u_k + b_k \nabla u_k - \partial_t u_k = 0, & \text{in } D \times [0, T], \\ u_k(x, t) = 0, & (x, t) \in \partial D \times [0, T], \\ u_k(x, 0) = u_0(x), & u_0 \in L^2(\mathbf{R}^n). \end{cases} \tag{2.1}$$

Then there exists a subsequence of $\{u_k\}$, still denoted by $\{u_k\}$, which converges weakly in $L^2(D \times [0, T])$ to a solution of (1.1).

Proof. Since $\operatorname{div} b_k = 0$, multiplying Eq. (2.1) by $u_k$ and integrating, one easily obtains

$$\int_0^T\!\!\int_D |\nabla u_k|^2 \,dx\,dt + \frac{1}{2}\int_D u_k^2(x, T)\,dx = \frac{1}{2}\int_D u_0^2(x)\,dx.$$

Hence there exists a function $u$ such that $u, |\nabla u| \in L^2(D \times [0, T])$ and a subsequence of $\{u_k\}$, still denoted by $\{u_k\}$, such that

$$u_k \to u \ \text{ weakly in } L^2(D \times [0, T]); \qquad \nabla u_k \to \nabla u \ \text{ weakly in } L^2(D \times [0, T]).$$

We will prove that $u$ is a solution to (1.1). Clearly $u_k$ satisfies, for any $\phi \in C_0^\infty(D \times [0, T))$,

$$\int_0^T\!\!\int_D (u_k \partial_t \phi - \nabla u_k \nabla \phi)\,dx\,dt + \int_0^T\!\!\int_D b_k \nabla u_k\, \phi \,dx\,dt = -\int_D u_0(x)\phi(x, 0)\,dx. \tag{2.2}$$

By the weak convergence of $u_k$ and $\nabla u_k$, we have

$$\int_0^T\!\!\int_D (u_k \partial_t \phi - \nabla u_k \nabla \phi)\,dx\,dt \to \int_0^T\!\!\int_D (u\,\partial_t \phi - \nabla u \nabla \phi)\,dx\,dt, \qquad k \to \infty. \tag{2.3}$$

Next, notice that

$$\int_0^T\!\!\int_D b_k \nabla u_k\, \phi \,dx\,dt - \int_0^T\!\!\int_D b \nabla u\, \phi \,dx\,dt = \int_0^T\!\!\int_D (b_k - b)\nabla u_k\, \phi \,dx\,dt + \int_0^T\!\!\int_D b(\nabla u_k - \nabla u)\, \phi \,dx\,dt.$$
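By the strong convergence of $b_k$ and the weak convergence of $\nabla u_k$, we see that

$$\int_0^T\!\!\int_D b_k \nabla u_k\, \phi \,dx\,dt - \int_0^T\!\!\int_D b \nabla u\, \phi \,dx\,dt \to 0, \qquad k \to \infty. \tag{2.4}$$

By (2.2), (2.3) and (2.4) we obtain

$$\int_0^T\!\!\int_D (u\,\partial_t \phi - \nabla u \nabla \phi)\,dx\,dt + \int_0^T\!\!\int_D b \nabla u\, \phi \,dx\,dt = -\int_D u_0(x)\phi(x, 0)\,dx,$$

i.e. $u$ is a solution to (1.1).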
Proposition 2.2. Suppose $b \in C^\infty(\mathbf{R}^n \times [0, \infty)) \cap L^\infty$ and $\operatorname{div} b = 0$. Let $G$ be the fundamental solution of (1.1). Then, for any $x \in \mathbf{R}^n$ and $t > s > 0$,

$$\int_{\mathbf{R}^n} G(x, t; y, s)\,dy = 1, \qquad \int_{\mathbf{R}^n} G(x, t; y, s)\,dx = 1.$$

Proof. Since $b$ is smooth and bounded, $G$ is smooth and has a local Gaussian upper bound. Hence we have

$$\frac{d}{ds} \int_{\mathbf{R}^n} G(x, t; y, s)\,dy = \int_{\mathbf{R}^n} [-\Delta_y G(x, t; y, s) + b(y, s)\nabla_y G(x, t; y, s)]\,dy = 0.$$

The other equality is proved similarly.
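The conservation of mass behind Proposition 2.2 can be observed in a discrete model. The following sketch is an illustration only and makes several assumptions that are not in the text: it works on the periodic unit square instead of $\mathbf{R}^n$, uses an explicit Euler scheme with centred differences, and builds the drift from a hypothetical stream function $\psi(x, y) = \sin(2\pi x)\sin(2\pi y)$ so that $b$ is divergence free at the discrete level. Summation by parts then shows that the total mass $\sum u \, dx^2$ is preserved by each step up to rounding.

```python
import math


def step(u, b1, b2, dx, dt):
    """One explicit Euler step of u_t = Laplace(u) + b . grad(u) on a periodic grid."""
    n = len(u)
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        ip, im = (i + 1) % n, (i - 1) % n
        for j in range(n):
            jp, jm = (j + 1) % n, (j - 1) % n
            lap = (u[ip][j] + u[im][j] + u[i][jp] + u[i][jm] - 4.0 * u[i][j]) / dx ** 2
            ux = (u[ip][j] - u[im][j]) / (2.0 * dx)   # centred difference in x (index i)
            uy = (u[i][jp] - u[i][jm]) / (2.0 * dx)   # centred difference in y (index j)
            new[i][j] = u[i][j] + dt * (lap + b1[i][j] * ux + b2[i][j] * uy)
    return new


def mass_drift(n=16, steps=50):
    """Return the total mass before and after `steps` time steps."""
    dx = 1.0 / n
    dt = 0.1 * dx ** 2
    psi = [[math.sin(2 * math.pi * i * dx) * math.sin(2 * math.pi * j * dx)
            for j in range(n)] for i in range(n)]
    # b = (-d psi/dy, d psi/dx), taken with the same centred differences as in `step`,
    # so the discrete divergence of b vanishes identically.
    b1 = [[-(psi[i][(j + 1) % n] - psi[i][(j - 1) % n]) / (2.0 * dx)
           for j in range(n)] for i in range(n)]
    b2 = [[(psi[(i + 1) % n][j] - psi[(i - 1) % n][j]) / (2.0 * dx)
           for j in range(n)] for i in range(n)]
    u = [[math.exp(-50.0 * ((i * dx - 0.5) ** 2 + (j * dx - 0.5) ** 2))
          for j in range(n)] for i in range(n)]
    mass0 = sum(map(sum, u)) * dx ** 2
    for _ in range(steps):
        u = step(u, b1, b2, dx, dt)
    mass1 = sum(map(sum, u)) * dx ** 2
    return mass0, mass1
```

Calling `mass_drift()` returns two values that agree up to floating point rounding, which mirrors the identity $\int G(x, t; y, s)\,dx = 1$ for divergence free $b$.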
Proposition 2.3. Let $Q \equiv D \times [0, T]$ with $D \subseteq \mathbf{R}^n$ being a smooth domain and $T > 0$. Suppose that $b \in C^\infty(Q) \cap L^\infty(Q)$ and $f \in L^1(Q)$. Suppose $u$ is a weak solution to

$$\begin{cases} \Delta u + b\nabla u - u_t = f, & \text{in } Q, \\ u(x, t) = 0, & (x, t) \in \partial D \times [0, T], \\ u(x, 0) = 0. \end{cases} \tag{2.5}$$

Here the boundary condition is in the sense that $u \in L^2([0, T], W_0^{1,2}(D))$. Then

$$u(x, t) = -\int_0^t\!\!\int_D G(x, t; y, s) f(y, s)\,dy\,ds.$$

Here $G$ is the Green's function of (1.1) with initial Dirichlet boundary condition in $Q$.

Proof. This result is trivial when $f$ is bounded and smooth. When $f$ is just $L^1$, it is known too. Here we present a proof for completeness. Let $\psi$ be a smooth function in $Q$. Since $b$ is bounded and smooth, standard theory shows that the following backward problem has a unique smooth solution:

$$\begin{cases} \Delta \eta - b\nabla \eta + \eta_t = \psi, & \text{in } Q, \\ \eta(x, t) = 0, & (x, t) \in \partial D \times [0, T], \\ \eta(x, T) = 0. \end{cases} \tag{2.6}$$

Moreover

$$\eta(y, s) = -\int_s^T\!\!\int_D G(x, t; y, s)\psi(x, t)\,dx\,dt. \tag{2.7}$$
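Since $u$ is a weak solution to (2.5), we have, by definition,

$$\int_Q [-\nabla u \nabla \eta + b\nabla u\, \eta + u \eta_t]\,dx\,dt = \int_Q f\eta \,dx\,dt.$$

Using integration by parts we have

$$\int_Q u[\Delta \eta - b\nabla \eta + \eta_t]\,dx\,dt = \int_Q f\eta \,dx\,dt.$$

By this, (2.6) and (2.7), we deduce

$$\int_Q u\psi \,dx\,dt = -\int_0^T\!\!\int_D f(y, s) \int_s^T\!\!\int_D G(x, t; y, s)\psi(x, t)\,dx\,dt\,dy\,ds.$$

That is

$$\int_Q u\psi \,dx\,dt = -\int_0^T\!\!\int_D \left[ \int_0^t\!\!\int_D G(x, t; y, s) f(y, s)\,dy\,ds \right] \psi(x, t)\,dx\,dt.$$

The proposition follows since $\psi$ is arbitrary.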
Proposition 2.4. Suppose $u$ is a weak solution of Eq. (1.1) in the cube $Q = D \times [0, T]$, where $b$ satisfies the condition in Theorem 1.1. Here $D$ is a domain in $\mathbf{R}^n$. Then $u$ is the $L^1_{loc}$ limit of functions $\{u_k\}$. Here each $u_k$ is a weak solution of (1.1) in which $b$ is replaced by smooth, divergence free $b_k$ such that $b_k \to b$ strongly in $L^2(Q)$, $k \to \infty$.

Proof. First we select a sequence of smooth, bounded, divergence free $b_k$ such that $b_k \to b$ strongly in $L^2(Q)$, $k \to \infty$. Let $D' \subset D$ be a smooth sub-domain of $D$. Then the following problem has a weak solution $u_k$:

$$\begin{cases} \Delta u_k + b_k \nabla u_k - (u_k)_t = 0, & \text{in } Q' = D' \times (0, T), \\ u_k(x, t) = u(x, t), & (x, t) \in \partial D' \times [0, T], \\ u_k(x, 0) = u(x, 0). \end{cases} \tag{2.8}$$

Clearly $u_k - u$ is a weak solution to the following:

$$\begin{cases} \Delta (u_k - u) + b_k \nabla (u_k - u) - (u_k - u)_t = (b - b_k)\nabla u, & \text{in } Q' = D' \times (0, T), \\ (u_k - u)(x, t) = 0, & (x, t) \in \partial D' \times [0, T], \\ (u_k - u)(x, 0) = 0. \end{cases} \tag{2.9}$$

Here the boundary condition is in the sense that $u_k - u \in L^2([0, T], W_0^{1,2}(D'))$. By our assumptions on $b$, $b_k$ and $\nabla u$, we know that $(b - b_k)\nabla u \in L^1(Q')$. Since $b_k$ is bounded and smooth, Proposition 2.3 shows that

$$(u_k - u)(x, t) = -\int_0^t\!\!\int_{D'} G_k(x, t; y, s)(b - b_k)\nabla u(y, s)\,dy\,ds.$$
Here $G_k$ is the Green's function of $\Delta u + b_k \nabla u - u_t = 0$ in $Q'$ with Dirichlet initial boundary value condition. By Proposition 2.2, or the local version of it, we have

$$\int_{D'} G_k(x, t; y, s)\,dx \le 1.$$

Hence

$$\int_{D'} |u_k - u|(x, t)\,dx \le \int_0^t\!\!\int_{D'} |b - b_k|\,|\nabla u(y, s)|\,dy\,ds.$$

Hence

$$\int_{D'} |u_k - u|(x, t)\,dx \le \|b - b_k\|_{L^2(Q')} \|\nabla u\|_{L^2(Q')} \to 0.$$

This proves the proposition.
3. Proof of Theorem 1.1

Using the approximation result of Sect. 2, we may and do assume that the vector field $b$ is bounded and smooth. The beginning of the proof generally follows the classical strategy of using test functions to establish $L^2$-$L^\infty$ bounds and weighted estimates for solutions of (1.1). However, it is well known that this method does not provide a sharp global upper bound in the presence of lower order terms, and that the vector field $b$ cannot be as singular as we are assuming. For instance, there is usually an extraneous $e^{ct}$ term when $t$ is large. Nevertheless, by using the special structure of the drift term and exploiting a special role of the divergence of $b$, we show that this classical method can be refined to derive sharp global bounds. In order to overcome the singularity of the drift term $b$, we need to construct a refined test function. This is the key step in proving the bounds. We divide the proof into five steps. For the sake of clarity we draw a flow chart for the proof:

Step 1: Energy estimates using refined test function ⇒ Step 2: $L^\infty$ bound for weak solutions ((i) of Theorem 1.1) ⇒ Step 3: Weighted estimates ⇒ Step 4: Gaussian-like upper bound ((iii) of Theorem 1.1); Step 5: Proof of (ii).

Step 1. Caccioppoli inequality (energy estimates). Let $u$ be a solution of (1.1) in the parabolic cube $Q_{\sigma r} = B(x, \sigma r) \times [t - (\sigma r)^2, t]$. Here $x \in \mathbf{R}^n$, $\sigma > 1$, $r > 0$ and $t > 0$. By direct computation, for any rational number $p \ge 1$ which can be written as the quotient of two integers with the denominator being odd, one has

$$\Delta u^p + b\nabla u^p - \partial_t u^p = p(p-1)|\nabla u|^2 u^{p-2}. \tag{3.1}$$

Here the condition on $p$ is to ensure that $u^p$ makes sense when $u$ changes sign. One can also just work on positive solutions now and prove the boundedness of all solutions later. See Step 6 at the end of the section.
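Choose $\psi = \phi(y)\eta(s)$ to be a refined cut-off function satisfying

$$\operatorname{supp} \phi \subset B(x, \sigma r); \quad \phi(y) = 1, \ y \in B(x, r); \quad \frac{|\nabla \phi|}{\phi^\delta} \le \frac{C}{(\sigma - 1)r}, \quad 0 \le \phi \le 1;$$

here $\delta \in (0, 1)$. By scaling it is easy to show that such a function exists. Moreover,

$$\operatorname{supp} \eta \subset (t - (\sigma r)^2, t); \quad \eta(s) = 1, \ s \in [t - r^2, t]; \quad |\eta'| \le 2/((\sigma - 1)r)^2; \quad 0 \le \eta \le 1.$$

Denoting $w = u^p$ and using $w\psi^2$ as a test function on (3.1), one obtains

$$\int_{Q_{\sigma r}} (\Delta w + b\nabla w - \partial_s w)\, w\psi^2 \,dy\,ds = p(p-1) \int_{Q_{\sigma r}} |\nabla u|^2 w^2 u^{-2} \psi^2 \,dy\,ds \ge 0.$$

Using integration by parts, one deduces

$$\int_{Q_{\sigma r}} \nabla(w\psi^2)\nabla w \,dy\,ds \le \int_{Q_{\sigma r}} b\nabla w\,(w\psi^2)\,dy\,ds - \int_{Q_{\sigma r}} (\partial_s w)\, w\psi^2 \,dy\,ds. \tag{3.2}$$

By direct calculation,

$$\begin{aligned} \int_{Q_{\sigma r}} \nabla(w\psi^2)\nabla w \,dy\,ds &= \int_{Q_{\sigma r}} \nabla[(w\psi)\psi]\nabla w \,dy\,ds \\ &= \int_{Q_{\sigma r}} [\nabla(w\psi)(\nabla(w\psi) - (\nabla\psi)w) + w\psi\nabla\psi\nabla w]\,dy\,ds \\ &= \int_{Q_{\sigma r}} \left[ |\nabla(w\psi)|^2 - |\nabla\psi|^2 w^2 \right] dy\,ds. \end{aligned}$$

Substituting this to (3.2), we obtain

$$\int_{Q_{\sigma r}} |\nabla(w\psi)|^2 \,dy\,ds \le \int_{Q_{\sigma r}} b\nabla w\,(w\psi^2)\,dy\,ds - \int_{Q_{\sigma r}} (\partial_s w)\, w\psi^2 \,dy\,ds + \int_{Q_{\sigma r}} |\nabla\psi|^2 w^2 \,dy\,ds. \tag{3.3}$$

Next notice that

$$\begin{aligned} \int_{Q_{\sigma r}} (\partial_s w)\, w\psi^2 \,dy\,ds &= \frac{1}{2}\int_{Q_{\sigma r}} (\partial_s w^2)\psi^2 \,dy\,ds \\ &= -\int_{Q_{\sigma r}} w^2 \phi^2 \eta\,\partial_s \eta \,dy\,ds + \frac{1}{2}\int_{B(x, \sigma r)} w^2(y, t)\phi^2(y)\,dy. \end{aligned}$$

Combining this with (3.3), we see that

$$\int_{Q_{\sigma r}} |\nabla(w\psi)|^2 \,dy\,ds + \frac{1}{2}\int_{B(x, \sigma r)} w^2(y, t)\phi^2(y)\,dy \le \int_{Q_{\sigma r}} (|\nabla\psi|^2 + \eta\,\partial_s \eta)\, w^2 \,dy\,ds + \int_{Q_{\sigma r}} b(\nabla w)(w\psi^2)\,dy\,ds \equiv T_1 + T_2. \tag{3.4}$$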
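The first term on the right-hand side of (3.4) is already in good shape. So let us estimate the second term as follows:

$$\begin{aligned} T_2 &= \int_{Q_{\sigma r}} b(\nabla w)(w\psi^2)\,dy\,ds = \frac{1}{2}\int_{Q_{\sigma r}} b\psi^2 \nabla w^2 \,dy\,ds = -\frac{1}{2}\int_{Q_{\sigma r}} \operatorname{div}(b\psi^2)\, w^2 \,dy\,ds \\ &= -\frac{1}{2}\int_{Q_{\sigma r}} \operatorname{div} b\,(\psi w)^2 \,dy\,ds - \frac{1}{2}\int_{Q_{\sigma r}} b\nabla(\psi^2)\, w^2 \,dy\,ds \\ &= -\frac{1}{2}\int_{Q_{\sigma r}} \operatorname{div} b\,(\psi w)^2 \,dy\,ds - \int_{Q_{\sigma r}} b(\nabla\psi)\psi w^2 \,dy\,ds \\ &= -\int_{Q_{\sigma r}} b(\nabla\psi)\psi w^2 \,dy\,ds. \end{aligned}$$

Here we just used the assumption that $\operatorname{div} b = 0$. The next paragraph contains the key argument of the paper. Notice that for $\delta \in (0, 1)$, $a \in (0, 2)$ and $m \in (1, 2]$,

$$\begin{aligned} T_2 &\le \left| \int_{Q_{\sigma r}} b(\nabla\psi)\psi w^2 \,dy\,ds \right| = \left| \int_{Q_{\sigma r}} b\psi^{1+\delta} |w|^{2-a}\, \frac{\nabla\psi}{\psi^\delta}\, |w|^a \,dy\,ds \right| \\ &\le \left( \int_{Q_{\sigma r}} |b|^m \psi^{(1+\delta)m} |w|^{(2-a)m} \,dy\,ds \right)^{1/m} \left( \int_{Q_{\sigma r}} \Big( \frac{|\nabla\psi|}{\psi^\delta} \Big)^{m/(m-1)} |w|^{am/(m-1)} \,dy\,ds \right)^{(m-1)/m}. \end{aligned}$$

Take $a$, $\delta$ so that

$$(2 - a)m = 2, \qquad (1 + \delta)m = 2.$$

Then

$$\frac{am}{m-1} = a\, \frac{2}{2-a} \Big/ \Big( \frac{2}{2-a} - 1 \Big) = 2, \qquad \delta = (2/m) - 1 < 1.$$

These and the assumption on the cut-off function $\psi$ show that

$$T_2 \le \left( \int_{Q_{\sigma r}} |b|^m (\psi w)^2 \,dy\,ds \right)^{1/m} \left( \frac{c}{[(\sigma - 1)r]^{m/(m-1)}} \int_{Q_{\sigma r}} w^2 \,dy\,ds \right)^{(m-1)/m}.$$

This implies, for any $\epsilon > 0$,

$$T_2 \le \epsilon^m \int_{Q_{\sigma r}} |b|^m (\psi w)^2 \,dy\,ds + C\, \frac{\epsilon^{-m/(m-1)}}{[(\sigma - 1)r]^{m/(m-1)}} \int_{Q_{\sigma r}} w^2 \,dy\,ds. \tag{3.5}$$

By our assumptions on $b$,

$$\int_{Q_{\sigma r}} |b|^m (\psi w)^2 \,dy\,ds \le k \int_{Q_{\sigma r}} |\nabla(\psi w)|^2 \,dy\,ds.$$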
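The choice of the exponents $a$ and $\delta$ in the Hölder step above can be verified with exact arithmetic. The short sketch below solves $(2-a)m = 2$ and $(1+\delta)m = 2$ for a given $m$ and confirms that $am/(m-1) = 2$; the function name is hypothetical and the code is a check, not part of the proof.

```python
from fractions import Fraction


def holder_parameters(m):
    """Solve (2 - a) m = 2 and (1 + delta) m = 2 for a given m in (1, 2]."""
    m = Fraction(m)
    a = 2 - 2 / m
    delta = 2 / m - 1
    return a, delta


for m in (Fraction(4, 3), Fraction(3, 2), Fraction(11, 10)):
    a, delta = holder_parameters(m)
    assert (2 - a) * m == 2            # |w|^{(2-a)m} = w^2
    assert (1 + delta) * m == 2        # psi^{(1+delta)m} = psi^2
    assert a * m / (m - 1) == 2        # |w|^{am/(m-1)} = w^2
    assert 0 < a < 2 and 0 < delta < 1
```

For $m < 2$ the value of $\delta$ is strictly between 0 and 1, which is why the cut-off function is required to satisfy $|\nabla \phi|/\phi^\delta \le C/((\sigma-1)r)$ with $\delta \in (0, 1)$.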
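Substituting the last inequality (the form bound on $b$) into (3.5), we can find $k_1 < 1/2$ and $k_2 > 0$ such that

$$|T_2| = \left| \int_{Q_{\sigma r}} b(\nabla w)(w\psi^2)\,dy\,ds \right| \le k_1 \int_{Q_{\sigma r}} |\nabla(\psi w)|^2 \,dy\,ds + k_2\, \frac{1}{((\sigma - 1)r)^{m/(m-1)}} \int_{Q_{\sigma r}} w^2 \,dy\,ds. \tag{3.6}$$

Combining (3.4) with (3.6), we reach

$$\int_{Q_{\sigma r}} |\nabla(w\psi)|^2 \,dy\,ds + \int_{B(x, \sigma r)} w^2(y, t)\phi^2(y)\,dy \le \frac{C}{((\sigma - 1)r)^{m/(m-1)}} \int_{Q_{\sigma r}} w^2 \,dy\,ds, \qquad r \le 1, \tag{3.7}$$

$$\int_{Q_{\sigma r}} |\nabla(w\psi)|^2 \,dy\,ds + \int_{B(x, \sigma r)} w^2(y, t)\phi^2(y)\,dy \le \frac{C}{((\sigma - 1)r)^2} \int_{Q_{\sigma r}} w^2 \,dy\,ds, \qquad r \ge 1. \tag{3.7'}$$

Step 2. $L^2$-$L^\infty$ bounds. It is known that (3.7) implies the following $L^2$-$L^\infty$ estimate via Moser's iteration:

$$\sup_{Q_r} u^2 \le C\, \frac{1}{|Q_r|^{m/(2(m-1))}} \int_{Q_{2r}} u^2 \,dy\,ds, \qquad r \le 1. \tag{3.8}$$

Here $m > 1$. Also, (3.7') shows

$$\sup_{Q_r} u^2 \le C\, \frac{1}{|Q_r|} \int_{Q_{2r}} u^2 \,dy\,ds, \qquad r \ge 1. \tag{3.8'}$$

Indeed, by Hölder's inequality,

$$\int_{\mathbf{R}^n} (\phi w)^{2(1+(2/n))} \le \left( \int_{\mathbf{R}^n} (\phi w)^{2n/(n-2)} \right)^{(n-2)/n} \left( \int_{\mathbf{R}^n} (\phi w)^2 \right)^{2/n}.$$

Using the Sobolev inequality, one obtains

$$\int_{\mathbf{R}^n} (\phi w)^{2(1+(2/n))} \le C \left( \int_{\mathbf{R}^n} (\phi w)^2 \right)^{2/n} \int_{\mathbf{R}^n} |\nabla(\phi w)|^2.$$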
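The last inequality, together with (3.7), implies, for some $C_1 > 0$,

$$\int_{Q_{\sigma' r}} u^{2p\theta} \le C \left( C_1 (r\tau)^{-m/(m-1)} \int_{Q_{\sigma r}} u^{2p} \right)^{\theta},$$

where $\theta = 1 + (2/n)$ and $\sigma' = \sigma - \tau$. When the dimension $n$ is odd or $u \ge 0$, we set $\tau_i = 2^{-i-1}$, $\sigma_0 = 1$, $\sigma_i = \sigma_{i-1} - \tau_i = 1 - \sum_{j=1}^{i} \tau_j$, $p = \theta^i$. The above then yields, for some $C_2 > 0$,

$$\int_{Q_{\sigma_{i+1} r}(x, t)} u^{2\theta^{i+1}} \le C \left( C_2^{\,i+1}\, r^{-m/(m-1)} \int_{Q_{\sigma_i r}(x, t)} u^{2\theta^i} \right)^{\theta}.$$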
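After iterations the above implies

$$\left( \int_{Q_{\sigma_{i+1} r}(x, t)} u^{2\theta^{i+1}} \right)^{\theta^{-i-1}} \le C^{\sum_j \theta^{-j-1}}\, C_2^{\sum_j (j+1)\theta^{-j-1}}\, \left( r^{-m/(m-1)} \right)^{\sum_j \theta^{-j}} \int_{Q_r} u^2,$$

where the sums run over $j = 0, \dots, i$. Letting $i \to \infty$ and observing that $\sum_{j=0}^{\infty} \theta^{-j} = (n + 2)/2$, we obtain

$$\sup_{Q_{r/2}} u^2 \le \frac{C}{r^{m(n+2)/(2(m-1))}} \int_{Q_r} u^2.$$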
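The two limits used in the iteration can be checked numerically. The sketch below (a hypothetical helper, for illustration only) sums the geometric series $\sum_j \theta^{-j}$ with $\theta = 1 + 2/n$, which should approach $(n+2)/2$, and evaluates $\sigma_i = 1 - \sum_{j=1}^{i} 2^{-j-1}$, which should approach $1/2$, matching the cube $Q_{r/2}$ in the final estimate.

```python
def moser_limits(n, terms=200):
    """Partial sums for the Moser iteration with theta = 1 + 2/n and tau_j = 2^(-j-1)."""
    theta = 1.0 + 2.0 / n
    geometric = sum(theta ** (-j) for j in range(terms))         # -> (n + 2) / 2
    sigma = 1.0 - sum(2.0 ** (-j - 1) for j in range(1, terms))  # -> 1 / 2
    return geometric, sigma


for n in (2, 3, 4, 7):
    geometric, sigma = moser_limits(n)
    assert abs(geometric - (n + 2) / 2) < 1e-12
    assert abs(sigma - 0.5) < 1e-12
```

The exponent of $r$ in the final bound is therefore $\frac{m}{m-1} \cdot \frac{n+2}{2}$, in agreement with the factor $|Q_r|^{m/(2(m-1))}$ in (3.8) since $|Q_r|$ is comparable to $r^{n+2}$.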
This proves (3.8) either for odd $n$ or for all $n$ and nonnegative $u$. Similarly one proves (3.8'). In case $n$ is even and $u$ changes sign, we just regard $u$ as a solution of Eq. (1.1) in $\mathbf{R}^{n+1} \times (0, T)$. Then the $L^\infty$ bound of $u$ follows from the above.

Step 3. Weighted estimate. Let $G$ be the heat kernel of (1.1). For a fixed $\lambda \in \mathbf{R}$ and a fixed bounded function $\psi$ such that $|\nabla\psi| \le 1$, we write

$$f_s(y) = e^{\lambda\psi(y)} \int G(y, s; z, 0)\, e^{-\lambda\psi(z)} f(z)\,dz.$$

Here and later the integral takes place in $\mathbf{R}^n$ if no integral region is specified. Direct computation shows that

$$\begin{aligned} \partial_s \|f_s\|_2^2 &= 2\int (\partial_s f_s(y)) f_s(y)\,dy = 2\int e^{\lambda\psi(y)} f_s(y) \int \partial_s G(y, s; z, 0)\, e^{-\lambda\psi(z)} f(z)\,dz\,dy \\ &= 2\int e^{\lambda\psi(y)} f_s(y)\, \Delta_y \left( \int G(y, s; z, 0)\, e^{-\lambda\psi(z)} f(z)\,dz \right) dy \\ &\quad + 2\int e^{\lambda\psi(y)} f_s(y)\, b(y)\nabla_y \left( \int G(y, s; z, 0)\, e^{-\lambda\psi(z)} f(z)\,dz \right) dy \\ &\equiv J_1 + J_2. \end{aligned} \tag{3.9}$$

Following standard computation, we see that

$$J_1 \le -2\int |\nabla f_s(y)|^2 \,dy + 2c\lambda^2 \int f_s(y)^2 \,dy. \tag{3.10}$$

Next we estimate $J_2$. For simplicity we write $u(y, s) = e^{-\lambda\psi(y)} f_s(y)$,
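which is a solution to $\Delta u + b\nabla u - \partial_t u = 0$ in $\mathbf{R}^n \times (0, \infty)$. Then

$$\begin{aligned} J_2 &= 2\int e^{\lambda\psi(y)} f_s(y)\, b(y)\nabla_y u(y, s)\,dy = -2\int \operatorname{div}\!\left( e^{\lambda\psi(y)} f_s(y)\, b(y) \right) u(y, s)\,dy \\ &= -2\int \nabla\!\left( e^{\lambda\psi(y)} f_s(y) \right) b(y)\, u(y, s)\,dy - 2\int e^{\lambda\psi(y)} f_s(y)\, u(y, s) \operatorname{div} b(y)\,dy \\ &= -2\lambda \int e^{\lambda\psi(y)} f_s(y)\, u(y, s)\nabla\psi(y)\, b(y)\,dy - 2\int e^{\lambda\psi(y)} u(y, s)\nabla f_s(y)\, b(y)\,dy - 2\int f_s(y)^2 \operatorname{div} b(y)\,dy \\ &= -2\lambda \int f_s(y)^2 \nabla\psi(y)\, b(y)\,dy - 2\int f_s(y)\nabla f_s(y)\, b(y)\,dy - 2\int f_s(y)^2 \operatorname{div} b(y)\,dy \\ &= -2\lambda \int f_s(y)^2 \nabla\psi(y)\, b(y)\,dy - \int f_s(y)^2 \operatorname{div} b(y)\,dy. \end{aligned}$$

In the last step we have used integration by parts. Hence, since $\operatorname{div} b = 0$,

$$J_2 = -2\lambda \int f_s(y)^2 \nabla\psi(y)\, b(y)\,dy. \tag{3.11}$$

Using an argument similar to the key Hölder argument in Step 1, we see that

$$J_2 \le 2\lambda \int |b f_s^{2-a}\, \nabla\psi f_s^{a}|\,dy \le \left( \int |b|^m f_s^{(2-a)m} \,dy \right)^{1/m} 2\lambda \left( \int (|\nabla\psi|)^{m/(m-1)} f_s^{am/(m-1)} \,dy \right)^{(m-1)/m}.$$

Here, as before, $a \in (0, 2)$ and $(2 - a)m = 2$, $am/(m - 1) = 2$. It follows that, for any $\epsilon > 0$,

$$J_2 \le \epsilon \int |b|^m f_s(y)^2 \,dy + c\,\epsilon^{-1/(m-1)} \lambda^{m/(m-1)} \int f_s(y)^2 \,dy.$$

Combining this with the estimate for $J_1$, we have

$$\partial_s \|f_s\|_2^2 \le -2\int |\nabla f_s(y)|^2 \,dy + c_1 (\lambda^{m/(m-1)} + \lambda^2) \int f_s(y)^2 \,dy + \epsilon \int |b|^m f_s(y)^2 \,dy.$$

Here $c_1$ may depend on $\epsilon$. Writing

$$F(s) \equiv \|f_s\|_2^2, \qquad H(s) \equiv -2\int |\nabla f_s(y)|^2 \,dy + \epsilon \int |b|^m f_s(y)^2 \,dy,$$

the above differential inequality can be written as

$$\partial_s F(s) \le c_1 (\lambda^{m/(m-1)} + \lambda^2) F(s) + H(s).$$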
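Hence

$$F(s) \le e^{cz(\lambda)s} F(0) + e^{Cz(\lambda)s} \int_0^s e^{-z(\lambda)\tau} H(\tau)\,d\tau,$$

where $z(\lambda) \equiv \lambda^{m/(m-1)} + \lambda^2$. That is

$$F(s) \le e^{cz(\lambda)s} F(0) + e^{Cz(\lambda)s} \left[ -2\int_0^s\!\!\int |\nabla(f_\tau(y) e^{-z(\lambda)\tau/2})|^2 \,dy\,d\tau + \epsilon \int_0^s\!\!\int |b|^m (f_\tau(y) e^{-z(\lambda)\tau/2})^2 \,dy\,d\tau \right].$$

Taking $\epsilon$ sufficiently small and using the condition on $b$, we conclude that

$$\|f_s\|_2^2 \le e^{cz(\lambda)s} \|f\|_2^2 = e^{c(\lambda^{m/(m-1)} + \lambda^2)s} \|f\|_2^2. \tag{3.12}$$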
Step 4. Gaussian-like upper bound. For simplicity we only prove the bound for $G(x, t; y, 0)$. We only treat the case $t \le 1$. When $t \ge 1$, the situation is simpler and the proof is omitted.

Now consider the function $u(y, s) = e^{-\lambda\psi(y)} f_s(y)$, which is a solution to $\Delta u + b\nabla u - \partial_t u = 0$ in $\mathbf{R}^n \times (0, \infty)$. Here $\psi$ is a function such that $|\nabla\psi| \le 1$ and whose precise values are to be chosen later. Applying (3.8) with $Q_{\sqrt{t}/2}(x, t) = B(x, \sqrt{t}/2) \times (3t/4, t)$, we obtain

$$u(x, t)^2 \le C\, \frac{1}{|Q_{\sqrt{t}/2}(x, t)|^{m/(2(m-1))}} \int_{3t/4}^{t}\!\!\int_{B(x, \sqrt{t}/2)} u^2.$$

From (3.12), it follows that

$$\begin{aligned} e^{2\lambda\psi(x)} u(x, t)^2 &\le C e^{2\lambda\psi(x)}\, \frac{1}{|Q_{\sqrt{t}/2}(x, t)|^{m/(2(m-1))}} \int_{3t/4}^{t}\!\!\int_{B(x, \sqrt{t}/2)} u^2 \\ &= C\, \frac{1}{|Q_{\sqrt{t}/2}(x, t)|^{m/(2(m-1))}} \int_{3t/4}^{t}\!\!\int_{B(x, \sqrt{t}/2)} e^{2\lambda[\psi(x) - \psi(z)]} f_s^2 \\ &\le C e^{2\lambda\sqrt{t}}\, \frac{t}{[t\,|B(x, \sqrt{t})|]^{m/(2(m-1))}}\, e^{c(\lambda^{m/(m-1)} + \lambda^2)t}\, \|f\|_2^2. \end{aligned}$$

Taking the supremum over all $f \in L^2(B(y, \sqrt{t}))$ with $\|f\|_2 = 1$, we find that

$$e^{2\lambda[\psi(x) - \psi(y)]} \int_{B(y, \sqrt{t}/2)} G(x, t; z, 0)^2 \,dz \le C e^{4\lambda\sqrt{t} + c(\lambda^{m/(m-1)} + \lambda^2)t}\, \frac{t}{[t\,|B(x, \sqrt{t})|]^{m/(2(m-1))}}. \tag{3.13}$$

Note that, in its second entries, the heat kernel $G$ satisfies the equation

$$\Delta u - \nabla(bu) + \partial_s u = 0.$$
Hence it satisfies u − b∇u + ∂s u = 0. Therefore we can use (3.8) backward on the second entries of the heat kernel to conclude, from (3.13), that t/4 1 2 G(x, t; y, 0) ≤ C G(x, t; z, s)2 dzds √ |Q√t/2 (y, t)|m/(2(m−1)) 0 B(y, t/2) √ t2 m/(m−1) +λ2 )t−2λ[ψ(x)−ψ(y)] ≤C . √ m/(m−1) e4λ t+c(λ [t|B(x, t)|] √ This shows, since λ t ≤ c1 + c2 λ2 t, G(x, t; y, 0)2 ≤ C
t2 m/(m−1) +λ2 )t−2λ[ψ(x)−ψ(y)] . √ m/(m−1) ec(λ [t|B(x, t)|]
Now we select ψ so that ψ(x) − ψ(y) = |x − y|. Then it follows G(x, t; y, 0)2 ≤ C
t2 m/(m−1) +λ2 )t−2λ|x−y| ≡ C(t)eQ(λ) . (3.14) √ m/(m−1) ec(λ [t|B(x, t)|]
Here for simplicity, we write Q(λ) ≡ c(λm/(m−1) + λ2 )t − 2λ|x − y|. Now we choose λ to be a positive number satisfying λ1/(m−1) + λ = a|x − y|/t,
(3.15)
where a > 0 will be chosen in a moment. Then Q(λ) = cλ(λ1/(m−1) + λ)t − 2λ|x − y| = (ca − 2)λ|x − y|. Taking a = 1/c, we see that Q(λ) = −λ|x − y|.
(3.16)
Next we consider two separate cases.

Case 1. $|x - y|/t \ge 1$. Then from (3.15), there exists $c_0 > 0$ such that $\lambda \ge c_0$. Hence $\lambda \le c_1 \lambda^{1/(m-1)}$ because $m \le 2$. By (3.15), $\lambda \ge c_2 (|x - y|/t)^{m-1}$. This shows, via (3.16),
$$ Q(\lambda) \le -c_3\, \frac{|x - y|^m}{t^{m-1}}. \tag{3.17} $$
Case 2. $|x - y|/t \le 1$. In this case (3.15) implies that $\lambda \le c_0$ and hence $\lambda^{1/(m-1)} \le c_1 \lambda$. Therefore, by (3.15), $\lambda \ge c_2 |x - y|/t$. Hence
$$ Q(\lambda) \le -c_3 |x - y|^2/t. \tag{3.18} $$
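Substituting (3.17) and (3.18) into (3.14) and taking square roots, we obtain
$$ G(x, t; y, 0) \le \frac{c_1 t}{[t\,|B(x, \sqrt{t})|]^{m/(2(m-1))}} \left[ \exp\left(-c_2\, \frac{|x - y|^m}{t^{m-1}}\right) + \exp\left(-c_2\, \frac{|x - y|^2}{t}\right) \right]. $$
This proves the upper bound for $G$.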
Step 5. Proof of (ii). Since the proof is identical to that in [LZ], we omit the details.

Final Remark. Using a Nash type estimate, it is easy to prove that $G(x, t; y, s) \le c/(t - s)^{n/2}$. It would be interesting to combine this bound with the bound in (iii) to get a sharper bound. The same could be said about the lower bound.

Acknowledgement. We thank Professor Vitali Liskevich and Victor Shapiro for helpful conversations.
References
[A] Aronson, D.G.: Non-negative solutions of linear parabolic equations. Ann. Scuola Norm. Sup. Pisa 22, 607–694 (1968)
[AS] Aizenman, M., Simon, B.: Brownian motion and Harnack inequality for Schrödinger operators. Comm. Pure Appl. Math. 35(2), 209–273 (1982)
[BG] Berselli, L.C., Galdi, G.P.: Regularity criteria involving the pressure for the weak solutions to the Navier-Stokes equations. Proc. Am. Math. Soc. 130(12), 3585–3595 (2002)
[ChZ] Chen, Z.Q., Zhao, Z.: Diffusion processes and second order elliptic operators with singular coefficients for lower order terms. Math. Ann. 302(2), 323–357 (1995)
[CrZ] Cranston, M., Zhao, Z.: Conditional transformation of drift formula and potential theory for $\frac{1}{2}\Delta + b(\cdot)\nabla$. Commun. Math. Phys. 112(4), 613–625 (1987)
[G] Gerhard, W.D.: The probabilistic solution of the Dirichlet problem for $\frac{1}{2}\Delta + \langle a, \nabla\rangle + b$ with singular coefficients. J. Theoret. Probab. 5(3), 503–520 (1992)
[KS] Kovalenko, V.F., Semenov, Yu.A.: $C_0$-semigroups in the spaces $L^p(\mathbf{R}^d)$ and $C(\mathbf{R}^d)$ generated by $\Delta + b\cdot\nabla$. (Russian) Teor. Veroyatnost. i Primenen. 35(3), 449–458 (1990); Translation in Theory Probab. Appl. 35(3), 443–453 (1990)
[LS] Liskevich, V., Semenov, Y.: Estimates for fundamental solutions of second-order parabolic equations. J. Lond. Math. Soc. (2) 62(2), 521–543 (2000)
[LZ] Liskevich, V., Zhang, Q.S.: Extra regularity for parabolic equations with drift terms. Manuscripta Math., to appear
[MS] Milman, P.D., Semenov, Y.: Disingularizing weights and the heat kernel bounds. Preprint, 1998
[O] Osada, H.: Diffusion processes with generators of generalized divergence form. J. Math. Kyoto Univ. 27(4), 597–619 (1987)
[Se] Semenov, Y.A.: Hölder continuity of bounded solutions of parabolic equations. Preprint, 1999
[Si] Simon, B.: Schrödinger semigroups. Bull. AMS 7, 447–526 (1982)
[St] Stampacchia, G.: Le problème de Dirichlet pour les équations elliptiques du second ordre à coefficients discontinus. (French) Ann. Inst. Fourier (Grenoble) 15(1), 189–258 (1965)
[SS] Seregin, G., Šverák, V.: Navier-Stokes equations with lower bounds on the pressure. Arch. Ration. Mech. Anal. 163(1), 65–86 (2002)
[Z] Zhang, Q.S.: Gaussian bounds for the fundamental solutions of $\nabla(A\nabla u) + B\nabla u - u_t = 0$. Manuscripta Math. 93(3), 381–390 (1997)

Communicated by B. Simon
Commun. Math. Phys. 244, 261–284 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0988-0
On the Representation Theory of Virasoro Nets

Sebastiano Carpi

Dipartimento di Scienze, Università “G. d’Annunzio” di Chieti-Pescara, Viale Pindaro 42, 65127 Pescara, Italy. E-mail: [email protected]

Received: 1 July 2003 / Accepted: 23 July 2003
Published online: 25 November 2003 – © Springer-Verlag 2003
Abstract: We discuss various aspects of the representation theory of the local nets of von Neumann algebras on the circle associated with positive energy representations of the Virasoro algebra (Virasoro nets). In particular we classify the local extensions of the c = 1 Virasoro net for which the restriction of the vacuum representation to the Virasoro subnet is a direct sum of irreducible subrepresentations with finite statistical dimension (local extensions of compact type). Moreover we prove that if the central charge c is in a certain subset of (1, ∞), including [2, ∞), and h ≥ (c − 1)/24, the irreducible representation with lowest weight h of the corresponding Virasoro net has infinite statistical dimension. As a consequence we show that if the central charge c is in the above set and satisfies c ≤ 25 then the corresponding Virasoro net has no proper local extensions of compact type.

Supported in part by the Italian MIUR and GNAMPA-INDAM.

1. Introduction

The idea that the formulation of relativistic quantum physics in terms of local nets of von Neumann algebras (see e.g. [27]) provides a natural framework for the classification of two-dimensional conformal field theories was already present in the late eighties in the work of Buchholz, Mack and Todorov [3]. As an illustration of this idea these authors classified the local conformal nets over $S^1$ (compactified light ray) whose common “germ” is the U(1) chiral current algebra, namely the local nets extending the one generated by a U(1) current. In the same paper they suggested the more general (and ambitious) classification program of conformal field theories on $S^1$ whose “germ” is the Virasoro algebra Vir. In other words, they proposed to classify the local extensions of the Virasoro nets, i.e. the local nets of von Neumann algebras on $S^1$ which are generated by the positive energy unitary irreducible representations with lowest weight 0 (vacuum representations) of Vir or equivalently (see [45]) by a chiral energy-momentum tensor
T (z), cf. [5]. Since the equivalence class of a Virasoro net is completely determined by the value of the central charge c in the corresponding representation of Vir, one has to classify the local extensions of a family of nets labelled by a positive real number c and this is a clearly well defined problem which in fact turns out to be equivalent to the one of classifying diffeomorphism covariant nets on the circle. In a recent remarkable paper [33] Kawahigashi and Longo have been able to solve the above problem for all the Virasoro nets with c < 1 and subsequently they used this result to classify all local conformal nets on the two-dimensional Minkowski space-time, having parity symmetry and central charge less than 1 [34]. The extension of their results in the c ≥ 1 region appears to be a very important and difficult challenge. In the transition from c < 1 to c ≥ 1 two drastic differences are immediately evident. The first is that the Virasoro nets with c ≥ 1 are all known to be non-rational. The second is that they are all expected to have irreducible sectors with infinite statistical dimension [49], a fact that has been proved by the present author in the case c = 1 [8]. Rationality and the absence of irreducible sectors with infinite statistical dimension play a fundamental role in the classification of c < 1 conformal nets and hence, some of the main ideas in [33] do not apply for the remaining values of the central charge. The purpose of this paper is to give some new insight in the understanding of the representation theory of c ≥ 1 Virasoro nets and their local extensions with the above problems in mind, especially concerning the role of infinite statistical dimension. 
Our main results are the classification of local extensions of compact type (see Definition 3.2) for the Virasoro net with c = 1 (Theorem 3.5)$^1$ and the proof that if the central charge c is in a certain subset of (1, ∞) containing [2, ∞), then the irreducible positive energy representations with lowest weight h ≥ (c − 1)/24 of the corresponding Virasoro net have infinite statistical dimension (Theorem 5.1), a fact that is expected to hold for every c > 1 and h > 0 [49]. As a consequence of the latter result we show that if c ∈ [2, 25] then the corresponding Virasoro net has no proper local extensions of compact type (Theorem 5.7). As in the c = 1 case [8], we use oscillator (Fock) representations of Vir to obtain the result on infinite statistical dimension but the argument we have found for c > 1 is more intricate and relies in part on recent results of S. Köster [38]. Besides these main results we also provide the proof of some relevant properties of the Virasoro nets which seem to appear at most implicitly in the literature, like the fact that every irreducible positive energy representation of a Virasoro net on a separable Hilbert space comes from a representation of the Virasoro algebra (Prop. 2.1) or the fact that every local extension of a Virasoro net is diffeomorphism covariant (Prop. 3.7).
$^1$ An equivalent result has been independently obtained by F. Xu [55].

2. Preliminaries

2.1. Conformal nets on $S^1$ and diffeomorphism covariance. Let $\mathcal{I}$ be the set of nonempty, nondense, open intervals of the unit circle $S^1 = \{z \in \mathbf{C} : |z| = 1\}$. A conformal net on $S^1$ is a family $\mathcal{A} = \{\mathcal{A}(I) : I \in \mathcal{I}\}$ of von Neumann algebras, acting on an infinite-dimensional separable complex Hilbert space $\mathcal{H}_{\mathcal{A}}$, satisfying the following properties:

(i) Isotony.
$$ \mathcal{A}(I_1) \subset \mathcal{A}(I_2), \quad \text{if } I_1 \subset I_2,\ I_1, I_2 \in \mathcal{I}. \tag{1} $$
(ii) Locality.
$$ \mathcal{A}(I_1) \subset \mathcal{A}(I_2)', \quad \text{if } I_1 \cap I_2 = \emptyset,\ I_1, I_2 \in \mathcal{I}. \tag{2} $$
(iii) Möbius covariance. There exists a strongly continuous unitary representation $U$ of $\mathrm{PSL}(2, \mathbf{R})$ in $\mathcal{H}_{\mathcal{A}}$ such that
$$ U(g)\mathcal{A}(I)U(g)^{-1} = \mathcal{A}(gI), \quad I \in \mathcal{I},\ g \in \mathrm{PSL}(2, \mathbf{R}), \tag{3} $$
where $\mathrm{PSL}(2, \mathbf{R})$ acts on $S^1$ by Möbius transformations (cf. Appendix A).
(iv) Positivity of the energy. The representation $U$ has positive energy, namely the conformal Hamiltonian $L_0$, which generates the restriction of $U$ to the one-parameter subgroup of rotations $r(\vartheta)$, has nonnegative spectrum.
(v) Existence and uniqueness of the vacuum. There exists a unique (up to a phase) $U$-invariant unit vector $\Omega \in \mathcal{H}_{\mathcal{A}}$.
(vi) Cyclicity of the vacuum. $\Omega$ is cyclic for the algebra $\mathcal{A}(S^1) := \bigvee_{I \in \mathcal{I}} \mathcal{A}(I)$.

Some consequences of the axioms are [22, 25, 19]:

(vii) Reeh-Schlieder property. For every $I \in \mathcal{I}$, $\Omega$ is cyclic and separating for $\mathcal{A}(I)$.
(viii) Bisognano-Wichmann property. If $\Delta_I$ is the modular operator associated to $\mathcal{A}(I)$ and $\Omega$ then
$$ \Delta_I^{it} = U(\Lambda_I(2\pi t)), \tag{4} $$
where $\Lambda_I$ is the one-parameter subgroup of $\mathrm{PSL}(2, \mathbf{R})$ of special conformal transformations preserving $I$.
(ix) Haag duality. For every $I \in \mathcal{I}$,
$$ \mathcal{A}(I)' = \mathcal{A}(I^c), \tag{5} $$
where $I^c$ denotes the interior of $S^1 \setminus I$.
(x) Factoriality. The algebras $\mathcal{A}(I)$ are type $\mathrm{III}_1$ factors.
(xi) Irreducibility. $\mathcal{A}(S^1) = B(\mathcal{H}_{\mathcal{A}})$, where $B(\mathcal{H}_{\mathcal{A}})$ denotes the algebra of all bounded linear operators on $\mathcal{H}_{\mathcal{A}}$.
(xii) Additivity. If $\mathcal{S} \subset \mathcal{I}$ is a covering of the interval $I$ then
$$ \mathcal{A}(I) \subset \bigvee_{J \in \mathcal{S}} \mathcal{A}(J). \tag{6} $$

Furthermore, it follows easily from the strong continuity of the representation $U$ that conformal nets are continuous from outside, namely
$$ \mathcal{A}(I) = \bigcap_{J \supset \overline{I}} \mathcal{A}(J). \tag{7} $$

A conformal net $\mathcal{A}$ is said to be split if given two intervals $I_1, I_2 \in \mathcal{I}$ such that the closure $\overline{I_1}$ of $I_1$ is contained in $I_2$, there exists a type I factor $N(I_1, I_2)$ such that
$$ \mathcal{A}(I_1) \subset N(I_1, I_2) \subset \mathcal{A}(I_2). \tag{8} $$
If $\mathrm{Tr}(t^{L_0}) < \infty$ for every $t \in (0, 1)$ then $\mathcal{A}$ is split [13, Theorem 3.2].
A is said to be strongly additive if for every I, I1 , I2 ∈ I with I1 , I2 obtained by removing a point from I we have A(I1 ) ∨ A(I2 ) = A(I ).
(9)
It is often convenient to identify $S^1 \setminus \{-1\}$ with the real line $\mathbf{R}$. With this identification, the family of nonempty open bounded intervals of $\mathbf{R}$ corresponds to the family $\mathcal{I}_0 = \{I \in \mathcal{I} : -1 \notin \overline{I}\}$. The restriction $\mathcal{A}_0$ of a conformal net $\mathcal{A}$ to $\mathcal{I}_0$ can be considered as a net on $\mathbf{R}$. Moreover, since $\mathcal{I}_0$ is directed under inclusion, one can define the quasi-local $C^*$-algebra (still denoted $\mathcal{A}_0$) $\left(\bigcup_{I \in \mathcal{I}_0} \mathcal{A}(I)\right)^{-\|\cdot\|}$ as the $C^*$-inductive limit of the local von Neumann algebras $\mathcal{A}(I)$, $I \in \mathcal{I}_0$.

We now briefly discuss diffeomorphism covariance. Let $\mathrm{Diff}^+(S^1)$ be the group of orientation preserving diffeomorphisms of the circle. It is an infinite dimensional Lie group modelled on the real topological vector space $\mathrm{Vect}(S^1)$ of smooth real vector fields on $S^1$ with the usual $C^\infty$ topology [46, Sect. 6]. Its Lie algebra coincides with $\mathrm{Vect}(S^1)$ with the bracket given by the negative of the usual bracket of vector fields. Hence if $g(z)$, $f(z)$, $z = e^{i\vartheta}$, are real valued functions in $C^\infty(S^1)$ then
$$ \left[ g(e^{i\vartheta}) \frac{d}{d\vartheta},\; f(e^{i\vartheta}) \frac{d}{d\vartheta} \right] = \left( \Big(\frac{d}{d\vartheta} g(e^{i\vartheta})\Big) f(e^{i\vartheta}) - \Big(\frac{d}{d\vartheta} f(e^{i\vartheta})\Big) g(e^{i\vartheta}) \right) \frac{d}{d\vartheta}. \tag{10} $$
In this paper we shall often identify the vector field $g(e^{i\vartheta}) \frac{d}{d\vartheta} \in \mathrm{Vect}(S^1)$ with the corresponding real function $g(z) \in C^\infty(S^1)$.

Following [33] for every $I \in \mathcal{I}$ we shall denote by $\mathrm{Diff}(I)$ the subgroup of $\mathrm{Diff}^+(S^1)$ whose elements are the diffeomorphisms of the circle which act as the identity on $I^c$. Note that $\mathrm{Diff}(I)$ does not coincide with the group of diffeomorphisms of the open interval $I$, as the notation might erroneously suggest.

By a strongly continuous projective unitary representation $V$ of $\mathrm{Diff}^+(S^1)$ on a Hilbert space $\mathcal{H}$ we shall always mean a strongly continuous homomorphism of $\mathrm{Diff}^+(S^1)$ into the quotient $U(\mathcal{H})/\mathbf{T}$ of the unitary group of $\mathcal{H}$ by the circle subgroup $\mathbf{T}$. The restriction of the representation $V$ to the Möbius subgroup of $\mathrm{Diff}^+(S^1)$ always lifts to a unique strongly continuous unitary representation $U$ of the universal covering group $\widetilde{\mathrm{PSL}}(2, \mathbf{R})$ of $\mathrm{PSL}(2, \mathbf{R})$. We shall say that $V$ extends $U$ and that $V$ is a positive energy representation if $U$ is a positive energy representation of $\widetilde{\mathrm{PSL}}(2, \mathbf{R})$, namely if the corresponding conformal Hamiltonian $L_0$, which generates the restriction of $U$ to the lifting $\tilde{r}(\vartheta)$ of the one-parameter subgroup $r(\vartheta)$ of rotations, has nonnegative spectrum. Note that although for $\gamma \in \mathrm{Diff}^+(S^1)$, $V(\gamma)$ is defined only up to a phase as an operator on $\mathcal{H}$, expressions like $V(\gamma) T V(\gamma)^*$ for $T \in B(\mathcal{H})$ or $V(\gamma) \in \mathcal{M}$ for a (complex) subspace $\mathcal{M} \subset B(\mathcal{H})$ are unambiguous and will be used in the following.

We shall say that a conformal net on $S^1$ is diffeomorphism covariant if there is a strongly continuous projective unitary representation $V$ of $\mathrm{Diff}^+(S^1)$ on $\mathcal{H}_{\mathcal{A}}$ extending $U \circ q$ (where $U$ is the original unitary representation of $\mathrm{PSL}(2, \mathbf{R})$ making $\mathcal{A}$ Möbius covariant and $q : \widetilde{\mathrm{PSL}}(2, \mathbf{R}) \to \mathrm{PSL}(2, \mathbf{R})$ denotes the covering map) and such that, for every $I \in \mathcal{I}$,
$$ V(\gamma)\mathcal{A}(I)V(\gamma)^* = \mathcal{A}(\gamma I), \quad \gamma \in \mathrm{Diff}^+(S^1), \tag{11} $$
$$ V(\gamma) A V(\gamma)^* = A, \quad \gamma \in \mathrm{Diff}(I),\ A \in \mathcal{A}(I^c). \tag{12} $$
2.2. Representations of conformal nets. A representation of a conformal net A is a family π = {πI : I ∈ I}, where πI is a (unital) representation of A(I ) on a fixed Hilbert space Hπ , such that πJ |A(I ) = πI if I ⊂ J, I, J ∈ I.
(13)
Irreducibility, direct sums and unitary equivalence of representations of a conformal net can be defined in a natural way, see [22, 25]. If $\mathcal{H}_\pi$ is separable then, since the local von Neumann algebras $\mathcal{A}(I)$, $I \in \mathcal{I}$, are factors, $\pi$ is automatically locally normal, namely $\pi_I$ is normal for each $I \in \mathcal{I}$, see [51]. Hence, $\pi_I(\mathcal{A}(I))$ is a type $\mathrm{III}_1$ factor. The unitary equivalence class of a representation $\pi$ on a separable Hilbert space is called a sector and denoted $[\pi]$. If $\pi$ is irreducible then we say that $[\pi]$ is an irreducible sector (also called superselection sector). The defining representation $\pi_0$ of a conformal net $\mathcal{A}$ on the Hilbert space $\mathcal{H}_{\mathcal{A}}$ is called the vacuum representation. The corresponding sector is called the vacuum sector and $\mathcal{H}_{\mathcal{A}}$ is said to be the vacuum Hilbert space of $\mathcal{A}$.

A representation $\pi$ is said to be covariant if there is a strongly continuous unitary representation $U_\pi$ of $\widetilde{\mathrm{PSL}}(2, \mathbf{R})$ on $\mathcal{H}_\pi$ such that
$$ \mathrm{Ad}\,U_\pi(g) \circ \pi_I = \pi_{q(g)I} \circ \mathrm{Ad}\,U(q(g)), \quad g \in \widetilde{\mathrm{PSL}}(2, \mathbf{R}),\ I \in \mathcal{I}. \tag{14} $$
If $U_\pi$ can be chosen to be a positive energy representation, then $\pi$ is said to be covariant with positive energy. In this case one can always choose $U_\pi$ to be with positive energy and inner, namely such that
$$ U_\pi(\widetilde{\mathrm{PSL}}(2, \mathbf{R})) \subset \pi(\mathcal{A})'' := \bigvee_{I \in \mathcal{I}} \pi_I(\mathcal{A}(I)), \tag{15} $$
and this choice is unique, see [36] and (the proof of) [1, Lemma 5.14].

Given a covariant representation $\pi$ of $\mathcal{A}$ on a separable Hilbert space $\mathcal{H}_\pi$ one has the (isomorphic) type III subfactors $\pi_I(\mathcal{A}(I)) \subset \pi_{I^c}(\mathcal{A}(I^c))'$, $I \in \mathcal{I}$ [22]. Hence the corresponding (minimal) index $[\pi_{I^c}(\mathcal{A}(I^c))' : \pi_I(\mathcal{A}(I))]$ [29, 39, 41] is independent of $I \in \mathcal{I}$ and the statistical dimension $d(\pi)$ of $\pi$ is given by
$$ d(\pi) = [\pi_{I^c}(\mathcal{A}(I^c))' : \pi_I(\mathcal{A}(I))]^{1/2}. \tag{16} $$
A representation $\rho$ of a conformal net $\mathcal{A}$ on its vacuum Hilbert space $\mathcal{H}_{\mathcal{A}}$ is said to be localized in an interval $I_0 \in \mathcal{I}$ if $\rho_{I_0^c}$ is the identity representation. As a consequence of Haag duality if a representation $\rho$ is localized in $I_0$ and $I \in \mathcal{I}$ contains $I_0$ then $\rho_I$ is an endomorphism of $\mathcal{A}(I)$ whose index is the square of the statistical dimension of the representation $\rho$. Moreover, for every interval $I_0 \in \mathcal{I}$ and every representation $\pi$ of $\mathcal{A}$ on a separable Hilbert space one can find a representation $\rho$, localized in $I_0$ and unitarily equivalent to $\pi$, see [22, 25]. The restriction to $\mathcal{A}_0$ of a representation $\rho$ localized in some $I_0 \in \mathcal{I}_0$ is called a DHR endomorphism and in fact yields an endomorphism of the quasi-local $C^*$-algebra $\mathcal{A}_0$, see [43, Sect. 3]. The set of DHR endomorphisms is a semigroup (under composition) and it has a natural (DHR) unitary braiding, see [20, 22]. As usual we shall denote by $\epsilon(\rho, \sigma)$ the unitary braiding operator associated to the DHR endomorphisms $\rho$ and $\sigma$.
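2.3. Subsystems. A conformal subsystem (or subnet) of a conformal net $\mathcal{A}$ is a family $\mathcal{B} = \{\mathcal{B}(I) : I \in \mathcal{I}\}$ of nontrivial von Neumann algebras acting on $\mathcal{H}_{\mathcal{A}}$ such that:
$$ \mathcal{B}(I) \subset \mathcal{A}(I), \quad I \in \mathcal{I}; \tag{17} $$
$$ U(g)\mathcal{B}(I)U(g)^{-1} = \mathcal{B}(gI), \quad I \in \mathcal{I},\ g \in \mathrm{PSL}(2, \mathbf{R}); \tag{18} $$
$$ \mathcal{B}(I_1) \subset \mathcal{B}(I_2), \quad \text{if } I_1 \subset I_2,\ I_1, I_2 \in \mathcal{I}. \tag{19} $$
We shall use the notation $\mathcal{B} \subset \mathcal{A}$ for conformal subsystems. Note that $\mathcal{B}$ is not in general a conformal net in the precise sense of the definition since it does not satisfy property (vi) (cyclicity of the vacuum) unless $\mathcal{B} = \mathcal{A}$. However one gets a conformal net $\hat{\mathcal{B}}$ by restricting the algebras $\mathcal{B}(I)$, $I \in \mathcal{I}$, and the representation $U$ to the closure $\mathcal{H}_{\mathcal{B}}$ of $\mathcal{B}(S^1)\Omega$. Since the map $b \in \mathcal{B}(I) \mapsto b|_{\mathcal{H}_{\mathcal{B}}} \in \hat{\mathcal{B}}(I)$ is an isomorphism for every $I \in \mathcal{I}$, because of the Reeh-Schlieder property, as usual, we shall often use the symbol $\mathcal{B}$ instead of $\hat{\mathcal{B}}$, specifying, if necessary, when $\mathcal{B}$ acts on $\mathcal{H}_{\mathcal{A}}$ or on $\mathcal{H}_{\mathcal{B}}$.

Let $\pi$ be the defining representation of the conformal net $\mathcal{B} \subset \mathcal{A}$ on the Hilbert space $\mathcal{H}_{\mathcal{A}}$ (i.e. the restriction to $\mathcal{B}$ of the vacuum representation of $\mathcal{A}$). Because of the separability of $\mathcal{H}_{\mathcal{A}}$, for every $I_0 \in \mathcal{I}$ we can find a representation $\theta$ on the vacuum Hilbert space $\mathcal{H}_{\mathcal{B}}$ of $\mathcal{B}$, which is unitarily equivalent to $\pi$ and is localized in $I_0$. Then if $I_0 \subset I \in \mathcal{I}$, $\theta_I$ is a dual canonical endomorphism for the subfactor $\mathcal{B}(I) \subset \mathcal{A}(I)$, namely there is a canonical endomorphism (in the sense of [41]) for the latter whose restriction to $\mathcal{B}(I)$ coincides with $\theta_I$, see [44, Prop. 3.4] and [43, Sect. 3.3].

2.4. Virasoro nets and their representations. Let Vir denote the Virasoro algebra, i.e. the complex Lie algebra spanned by $L_n$, $n \in \mathbf{Z}$, and a central element $\kappa$ with relations
$$ [L_n, L_m] = (n - m)L_{n+m} + \delta_{n+m,0}\, \frac{n^3 - n}{12}\, \kappa. \tag{20} $$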
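That the relations (20) define a Lie algebra can be sanity-checked by verifying the Jacobi identity on basis elements. The sketch below is only illustrative: the dictionary encoding of elements (integer keys for $L_n$, the key `'k'` for $\kappa$) and the helper names `bracket`, `add`, `L` and `jacobi` are assumptions made here, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def bracket(x, y):
    """Lie bracket from Eq. (20); elements are dicts {n: coeff of L_n, 'k': coeff of kappa}."""
    out = {}
    for (a, xa), (b, yb) in product(x.items(), y.items()):
        if a == 'k' or b == 'k':
            continue  # kappa is central, so it brackets to zero
        coeff = xa * yb
        out[a + b] = out.get(a + b, 0) + coeff * (a - b)
        if a + b == 0:
            out['k'] = out.get('k', 0) + coeff * Fraction(a**3 - a, 12)
    return {key: v for key, v in out.items() if v != 0}


def add(*elems):
    """Sum of elements, dropping zero coefficients."""
    out = {}
    for e in elems:
        for key, v in e.items():
            out[key] = out.get(key, 0) + v
    return {key: v for key, v in out.items() if v != 0}


def L(n):
    """Basis element L_n."""
    return {n: Fraction(1)}


def jacobi(n, m, k):
    """Cyclic sum [L_n,[L_m,L_k]] + [L_m,[L_k,L_n]] + [L_k,[L_n,L_m]]; empty dict means zero."""
    x, y, z = L(n), L(m), L(k)
    return add(bracket(x, bracket(y, z)),
               bracket(y, bracket(z, x)),
               bracket(z, bracket(x, y)))


indices = range(-4, 5)
jacobi_holds = all(jacobi(n, m, k) == {} for n in indices for m in indices for k in indices)
```

For instance, `bracket(L(2), L(-2))` returns the element $4L_0 + \frac{1}{2}\kappa$, in agreement with (20) for $n = 2$, $m = -2$.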
We shall denote by $L(c, h)$ the unique positive energy irreducible unitary representation of Vir with lowest weight $h$ and central charge $c$ (see e.g. [14, 30]). The conformal Hamiltonian $L_0$ is diagonalizable on the corresponding Vir-module (still denoted $L(c, h)$) with spectrum contained in $h + \mathbf{N}_0$ $^2$ and containing the lowest weight $h$. Moreover, the central element $\kappa$ acts as multiplication by the real number $c$. Positivity of the energy implies $h \ge 0$ and unitarity (or hermiticity) means that there is a positive definite sesquilinear form $(\cdot, \cdot)$ on $L(c, h)$ such that
$$ (\xi, L_n \psi) = (L_{-n}\xi, \psi), \tag{21} $$
for $\xi, \psi \in L(c, h)$, $n \in \mathbf{Z}$. The above conditions give restrictions on the values of the pair $(c, h)$. In fact either $c \ge 1$ and $h \ge 0$ or we have a pair $(c(m), h_{p,q}(m))$, $m \in \mathbf{N}$, where
$$ c(m) = 1 - \frac{6}{(m + 2)(m + 3)} \tag{22} $$
and
$$ h_{p,q}(m) = \frac{((m + 3)p - (m + 2)q)^2 - 1}{4(m + 2)(m + 3)}, \tag{23} $$
$p = 1, \dots, m + 1$, $q = 1, \dots, p$ (discrete series representations). For later convenience we shall denote by $D \subset [\frac{1}{2}, 1)$ the set of discrete values of the central charge in Eq. (22). Accordingly the set of allowed values of the central charge is $D \cup [1, \infty)$.

$^2$ In this paper $\mathbf{N}_0$ (resp. $\mathbf{N}$) denotes the set of nonnegative (resp. positive) integers.

Now let $\mathcal{H}(c, h)$ be the Hilbert space completion of the module $L(c, h)$. Then the Virasoro algebra acts on $\mathcal{H}(c, h)$ by unbounded operators on the common invariant domain $L(c, h) \subset \mathcal{H}(c, h)$ which can in fact be identified with the subspace $\mathcal{H}^{fin}(c, h)$ of finite energy vectors, i.e. the linear span of the eigenvectors of the conformal Hamiltonian. The (chiral) energy-momentum tensor $T_{(c,h)}(z)$, $z = e^{i\vartheta} \in S^1$, associated to $L(c, h)$, is defined by the formal power series
$$ T_{(c,h)}(z) = \sum_{n \in \mathbf{Z}} L_n z^{-n-2}. \tag{24} $$
For a function on $S^1$, $\vartheta \mapsto f(e^{i\vartheta})$, with finite Fourier series (trigonometric polynomial), the operator
$$ T_{(c,h)}(f) = \sum_{n \in \mathbf{Z}} L_n f_n, \tag{25} $$
where
$$ f_n = \int_0^{2\pi} \frac{d\vartheta}{2\pi}\, e^{-in\vartheta} f(e^{i\vartheta}), \tag{26} $$
belongs to Vir and hence is well defined on $\mathcal{H}^{fin}(c, h)$ and leaves it invariant. Also the following (formal) notation is used:
$$ T_{(c,h)}(f) = \oint_{S^1} \frac{z\,dz}{2\pi i}\, T_{(c,h)}(z) f(z) = \int_0^{2\pi} \frac{d\vartheta}{2\pi}\, T_{(c,h)}(e^{i\vartheta})\, e^{i2\vartheta} f(e^{i\vartheta}). \tag{27} $$
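As a small worked illustration of the discrete series in Eqs. (22) and (23), the allowed pairs $(c(m), h_{p,q}(m))$ can be tabulated with exact rational arithmetic. This is only a sketch; the function names `central_charge` and `lowest_weights` are hypothetical helpers introduced here.

```python
from fractions import Fraction


def central_charge(m):
    """c(m) = 1 - 6 / ((m + 2)(m + 3)), Eq. (22), for a positive integer m."""
    return 1 - Fraction(6, (m + 2) * (m + 3))


def lowest_weights(m):
    """h_{p,q}(m) from Eq. (23) for p = 1, ..., m + 1 and q = 1, ..., p."""
    denom = 4 * (m + 2) * (m + 3)
    return {
        (p, q): Fraction(((m + 3) * p - (m + 2) * q) ** 2 - 1, denom)
        for p in range(1, m + 2)
        for q in range(1, p + 1)
    }


c1 = central_charge(1)   # first value of the discrete series
h1 = lowest_weights(1)   # the corresponding allowed lowest weights
```

For $m = 1$ this gives $c = 1/2$ with lowest weights $h_{1,1} = 0$, $h_{2,1} = 1/2$ and $h_{2,2} = 1/16$, and every $c(m)$ lies in $[\frac{1}{2}, 1)$, consistent with $D \subset [\frac{1}{2}, 1)$.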
The Virasoro net $\mathcal{A}_{(\mathrm{Vir},c)}$ can be defined as in [5] as the net generated by the energy-momentum tensor $T_c(z) := T_{(c,0)}(z)$ in the representation of lowest weight 0 on $\mathcal{H}(c, 0) =: \mathcal{H}_{\mathcal{A}_{(\mathrm{Vir},c)}}$. First of all one can show that the map $f \mapsto T_c(f)$ extends (uniquely) to an operator valued distribution (Wightman field) on the invariant domain $C^\infty(L_0)$, the subspace of smooth vectors for $L_0$. Moreover the linear energy-bounds established in [5] (also cf. [24]) imply that for every smooth real valued function $f$, $T_c(f)$ is essentially self-adjoint (on any core for $L_0$) and that $e^{iT_c(f_1)}$ commutes with $e^{iT_c(f_2)}$ when the real smooth functions $f_1$ and $f_2$ have disjoint supports (in fact these properties also hold in the representations with $h > 0$). It follows that the net of von Neumann algebras defined by
$$ \mathcal{A}_{(\mathrm{Vir},c)}(I) = \{e^{iT_c(f)} : f \in C^\infty(S^1),\ \text{real},\ \mathrm{supp}\, f \subset I\}'', \quad I \in \mathcal{I}, \tag{28} $$
is local, and in fact one can verify all the other axioms of a conformal net. In particular the representation U of PSL(2, R) is obtained by integrating the self-adjoint part of the (complex) Lie subalgebra of Vir spanned by L−1 , L0 , L1 and the vacuum vector is the (normalized) lowest weight vector in L(c, 0). An alternative construction is obtained by integrating the representations L(c, 0) of Vir to the corresponding projective unitary representations of Diff + (S 1 ). In fact as shown by Goodman and Wallach [24] (cf. also [53]), for each allowed pair (c, h) there
is a unique strongly continuous projective unitary representation $V_{(c,h)}$ of $\mathrm{Diff}^+(S^1)$ on $\mathcal{H}(c, h)$ satisfying
$$ V_{(c,h)}(\exp(f)) = p(e^{iT_{(c,h)}(f)}) \tag{29} $$
for every real smooth function (vector field) $f$ on $S^1$. Here $\exp(f) \in \mathrm{Diff}^+(S^1)$ denotes the exponential of the vector field $f$, namely $t \mapsto \exp(tf)$ is the unique one-parameter group of diffeomorphisms generated by $f$, and $p : U(\mathcal{H}(c, h)) \to U(\mathcal{H}(c, h))/\mathbf{T}$ denotes the quotient map. Then the net $\mathcal{A}_{(\mathrm{Vir},c)}$ can be defined by
$$ \mathcal{A}_{(\mathrm{Vir},c)}(I) = \{V_{(c,0)}(\gamma) : \gamma \in \mathrm{Diff}(I)\}'', \quad I \in \mathcal{I}. \tag{30} $$
The two definitions are equivalent because the group generated by the exponentials of smooth vector fields with support in $I \in \mathcal{I}$ is dense in $\mathrm{Diff}(I)$, see [40, Sect. V.2]. From the second one the diffeomorphism covariance of the Virasoro nets is explicit. As a consequence of the finiteness of the (vacuum) Virasoro characters $\chi(t) := \mathrm{Tr}(t^{L_0})$ for every $t \in (0, 1)$ the Virasoro nets are split for every allowed value of the central charge. For $c \le 1$ the Virasoro nets are strongly additive [33, 55] while for $c > 1$ they are not [5].

We now discuss some properties of the representation theory of the Virasoro nets that we shall need in the following. Let $\mathcal{H}(c, h)$ be the Hilbert space completion of the Vir-module $L(c, h)$ as at the beginning of this subsection and let $T_{(c,h)}(z)$ be the corresponding energy-momentum tensor. A representation of $\mathcal{A}_{(\mathrm{Vir},c)}$ on $\mathcal{H}(c, h)$ will be denoted $\pi_h^c$ if for every $I \in \mathcal{I}$ and every real smooth function $f$ on $S^1$ with support in $I$, the following holds:
$$ \pi^c_{h\,I}(e^{iT_c(f)}) = e^{iT_{(c,h)}(f)}. \tag{31} $$
It is immediate to verify that if a representation satisfying Eq. (31) exists, then it is unique. It is more complicated to demonstrate the existence of such representations. Of course the vacuum representation $\pi_0^c$ exists for every allowed value of $c$, i.e. for each $c \in D \cup [1, \infty)$. If $c < 1$ and $h$ is a corresponding allowed value of the lowest weight, then the representation $\pi_h^c$ exists as a consequence of the Goddard, Kent, Olive coset construction [23] and the local equivalence of positive energy representations of the loop groups $LSU(2)$ at fixed level [22, 54], cf. [40, V.3.3.2] and [33, Sect. 3]. If $c \ge 1$ the existence of $\pi_h^c$ has been proved by D. Buchholz and H. Schulz-Mirbach for every $h \ge (c - 1)/24$. Finally if $c \in (D + 1) \cup [2, \infty)$, then $c - 1$ is an allowed value of the central charge. Then using the embedding $\mathcal{A}_{(\mathrm{Vir},c)} \subset \mathcal{A}_{(\mathrm{Vir},c-1)} \otimes \mathcal{A}_{(\mathrm{Vir},1)}$, one can easily construct, for every $h \ge 0$, the representation $\pi_h^c$ as a subrepresentation of the restriction to $\mathcal{A}_{(\mathrm{Vir},c)}$ of $\pi_0^{c-1} \otimes \pi_h^1$.$^3$ As far as we know the existence of the representation $\pi_h^c$ for the remaining allowed pairs $(c, h)$ is still an important open problem.

$^3$ I learned this argument in an unpublished manuscript of D. Buchholz [6].

Proposition 2.1. If $\pi$ is an irreducible covariant representation with positive energy of the Virasoro net $\mathcal{A}_{(\mathrm{Vir},c)}$ on a separable Hilbert space $\mathcal{H}_\pi$ then it is unitarily equivalent to $\pi_h^c$ for some $h \ge 0$.
Proof. Let $V_{(c,0)}$ be the unique projective unitary representation of $\mathrm{Diff}^+(S^1)$ on $\mathcal{H}_{\mathcal{A}_{(\mathrm{Vir},c)}}$ such that Eq. (29) holds with $h = 0$. From [37, Sect. 2] (cf. also [33, Lemma 3.1]) we know that there is a strongly continuous positive energy projective unitary representation $V_\pi$ of $\mathrm{Diff}^+(S^1)$ on $\mathcal{H}_\pi$ such that $p(\pi_I(V_{(c,0)}(\gamma))) = V_\pi(\gamma)$ for each $I \in \mathcal{I}$ and $\gamma \in \mathrm{Diff}(I)$. Then it follows from the irreducibility and local normality of $\pi$ that $V_\pi$ is irreducible. As a consequence of Theorem A.1 in the Appendix, there is on $\mathcal{H}_\pi^{fin}$ a positive energy representation $R_\pi$ of the Virasoro algebra with central charge $c' \in D \cup [1, \infty)$, which is unitarily equivalent to $L(c', h)$ for some $h \ge 0$. Let
$$ T^\pi(z) = \sum_{n \in \mathbf{Z}} L_n^\pi z^{-n-2} $$
be the corresponding energy-momentum tensor. Then, for every real smooth vector field $f$ on $S^1$, we have
$$ V_\pi(\exp(f)) = p(e^{iT^\pi(f)}). $$
It follows that if $I \in \mathcal{I}$ and the support of $f$ is contained in $I$,
$$ \pi_I(e^{iT_c(f)}) = e^{i\alpha_I(f)} e^{iT^\pi(f)}, $$
where $\alpha_I(f)$ is a real constant. Now, it is fairly easy to check that there is a (necessarily unique) distribution $\alpha$ such that $\alpha(f) = \alpha_I(f)$ for every $I \in \mathcal{I}$ and every real function $f$ whose support is contained in $I$ and that Möbius covariance implies that $\alpha = 0$. Hence we have the equality
$$ \pi_I(e^{iT_c(f)}) = e^{iT^\pi(f)}, $$
which implies $c' = c$. The conclusion then follows because the representation $R_\pi$ of Vir is unitarily equivalent to $L(c, h)$ for some $h \ge 0$.

We conclude this subsection with the following proposition.

Proposition 2.2. Let $\pi$ be a positive energy covariant representation of the Virasoro net $\mathcal{A}_{(\mathrm{Vir},c)}$ on a separable Hilbert space and let $U_\pi$ be the corresponding unique inner unitary representation of $\widetilde{\mathrm{PSL}}(2, \mathbf{R})$. Assume that $U_\pi(\tilde{r}(2\pi)) \in \mathbf{C}1$. Then the following hold:
(a) The representation $\pi$ is a direct sum of irreducible covariant positive energy representations.
(b) There exists a unique strongly continuous projective unitary representation $V_\pi$ of $\mathrm{Diff}^+(S^1)$ on $\mathcal{H}_\pi$ satisfying
$$ p(\pi_I(e^{iT_c(f)})) = V_\pi(\exp(f)), \tag{32} $$
for every $I \in \mathcal{I}$ and every real smooth function $f$ with support contained in $I$. Moreover, this representation satisfies
$$ V_\pi(\gamma) \in \pi_I(\mathcal{A}_{(\mathrm{Vir},c)}(I)) \quad \forall I \in \mathcal{I},\ \forall \gamma \in \mathrm{Diff}(I), \tag{33} $$
$$ V_\pi(q(g)) = p(U_\pi(g)) \quad \forall g \in \widetilde{\mathrm{PSL}}(2, \mathbf{R}). \tag{34} $$
Proof. The net $\mathcal{A}_{(\mathrm{Vir},c)}$ has the split property and hence, as a consequence of [35, Prop. 56], $\pi$ has a direct integral decomposition
$$ \pi = \int_X^{\oplus} \pi_\lambda \, d\mu(\lambda), $$
where, for almost every $\lambda$, $\pi_\lambda$ is an irreducible representation of $\mathcal{A}_{(\mathrm{Vir},c)}$ on a separable Hilbert space $\mathcal{H}(\lambda)$. Since $U_\pi(g) \in \pi(\mathcal{A}_{(\mathrm{Vir},c)})''$ for each $g \in \widetilde{\mathrm{PSL}}(2, \mathbf{R})$ we also have the decomposition
$$ U_\pi(g) = \int_X^{\oplus} U_\lambda(g) \, d\mu(\lambda). $$
If $h_\pi$ is the lowest eigenvalue of $L_0^\pi$ we have by assumption $U_\pi(\tilde{r}(2\pi)) = e^{2\pi i h_\pi}$. Hence $U_\lambda$ is, for almost every $\lambda$, a positive energy representation satisfying $L_0^\lambda \ge h_\pi$ and $U_\lambda(\tilde{r}(2\pi)) = e^{2\pi i h_\pi}$. It follows that, for almost every $\lambda$, $\pi_\lambda$ is an irreducible covariant representation of $\mathcal{A}_{(\mathrm{Vir},c)}$ with positive energy which, because of Prop. 2.1, is unitarily equivalent to $\pi^c_{h_\pi + n_\lambda}$ for some $n_\lambda \in \mathbf{N}_0$. Now let $X_n = \{\lambda \in X : \pi_\lambda \simeq \pi^c_{h_\pi + n}\}$. Then, it follows from [35, Lemma 60], that $\{X_n : n \in \mathbf{N}_0\}$ is a family of pairwise disjoint measurable subsets of $X$ such that $\mu(X \setminus \bigcup_{n \in \mathbf{N}_0} X_n) = 0$. Hence
$$ \pi \simeq \bigoplus_{n \in \mathbf{N}_0} \int_{X_n}^{\oplus} \pi_\lambda \, d\mu(\lambda), $$
and since $\int_{X_n}^{\oplus} \pi_\lambda \, d\mu(\lambda)$ is unitarily equivalent to a (possibly zero) multiple of $\pi^c_{h_\pi + n}$, (a) follows.

Now it follows from (a) and Prop. 2.1 that on the dense subspace $C^\infty(L_0^\pi)$ of smooth vectors for $L_0^\pi$ there is a projective representation $\eta$ of the Lie algebra of smooth real vector fields on $S^1$ by essentially skew-adjoint operators satisfying $e^{\eta(f)} = \pi_I(e^{iT_c(f)})$ if $I \in \mathcal{I}$ and $\mathrm{supp}\, f \subset I$. Moreover $\eta$ satisfies the assumptions in [53, Theorem 5.2.1] (cf. the proof of [53, Theorem 6.1.1] and the discussion in [38, Appendix]). Hence it can be integrated to a unique strongly continuous projective unitary representation of the covering group of $\mathrm{Diff}^+(S^1)$ which, since by assumption $e^{2\pi i L_0^\pi} = e^{2\pi i h_\pi}$, factors through $\mathrm{Diff}^+(S^1)$ giving a representation $V_\pi$ satisfying Eqs. (32) and (34). The remaining claim in (b) then follows easily.

3. Local Extensions

Definition 3.1. We define a local extension of a conformal net $\mathcal{A}$ to be a conformal net $(\mathcal{B}, U, \mathcal{H}_{\mathcal{B}})$ together with a conformal subsystem $\mathcal{C} \subset \mathcal{B}$ such that the corresponding conformal net $\hat{\mathcal{C}}$ on $\mathcal{H}_{\mathcal{C}}$ is isomorphic to $\mathcal{A}$ and such that
$$ U(\mathrm{PSL}(2, \mathbf{R})) \subset \mathcal{C}(S^1). \tag{35} $$
In agreement with the notation for conformal subsystems, since $\mathcal{A}$ and $\hat{\mathcal{C}}$ are isomorphic, we shall often identify $\mathcal{A}$ and $\mathcal{C}$ and accordingly we shall write $\mathcal{A} \subset \mathcal{B}$ instead of $\mathcal{C} \subset \mathcal{B}$. Condition (35) implies that $\mathcal{C} \subset \mathcal{B}$ is a full subsystem, namely that
$$ \mathcal{C}(S^1)' \cap \mathcal{B}(I) = \mathbf{C}1, \quad I \in \mathcal{I}. \tag{36} $$
It prevents trivial extensions of the type $\mathcal{A} \subset \mathcal{A} \otimes \mathcal{C}$, cf. [3]. For finite index subsystems condition (35) is automatically satisfied and we do not know any example of a full conformal subsystem violating it. Note that in the literature the term “local extension” is often used in a weaker sense (see e.g. [44]).

A class of examples of local extensions is obtained by considering fixed point subsystems under compact group actions. More precisely given a conformal net $\mathcal{B}$ and a strongly compact group $G$ of (vacuum preserving) internal symmetries of $\mathcal{B}$ one can define the fixed point subsystem $\mathcal{A} \equiv \mathcal{B}^G$. This kind of construction is paradigmatic in the algebraic approach to quantum field theory, see [15, 18]. One has $\mathcal{A}(S^1) = G'$ (cf. [18, Theorem 3.6]) and since $U$ and $G$ commute (see [22, Lemma 2.22]), condition (35) is satisfied. Hence $\mathcal{B}$ is a local extension of $\mathcal{A}$ in the sense of Definition 3.1. If $\pi$ is the identical representation of $\mathcal{A}$ on $\mathcal{H}_{\mathcal{B}}$ one has
$$ \pi = \bigoplus_{\xi \in \hat{G}} d(\xi)\, \pi_\xi, \tag{37} $$
where $\hat{G}$ is the set of equivalence classes of irreducible unitary representations of $G$, the $\pi_\xi$ are mutually inequivalent irreducible covariant representations of $\mathcal{A}$ (with trivial univalence) appearing with multiplicity $d(\xi)$ equal to the dimension of the representations of $G$ of class $\xi$ and satisfying $d(\pi_\xi) = d(\xi)$, see [18, 28, 47] and [41, I Sect. 7]. Moreover, the vacuum representation of $\mathcal{A}$ is associated to the trivial one dimensional representation of $G$ and the corresponding Hilbert space $\mathcal{H}_{\mathcal{A}}$ coincides with the subspace of $G$-invariant vectors of $\mathcal{H}_{\mathcal{B}}$.

We denote by $\Delta \equiv \Delta_{\mathcal{B}}$ the semigroup of the DHR endomorphisms of $\mathcal{A}_0$ which are unitarily equivalent to a finite direct sum of representations $\pi_\xi$, $\xi \in \hat{G}$. Then the (DHR) braiding on $\Delta$ gives in fact a permutation symmetry, $\Delta$ is a dual of $G$ in the sense of Doplicher-Roberts duality theory [16, 17] and one can recover the local extension $\mathcal{B}$ by the Doplicher-Roberts reconstruction theorem [18], see [47, Prop. 3.8] for the necessary adaptations to conformal nets on $S^1$.

More generally, let $\mathcal{A}$ be a conformal net on $S^1$ and let $\Delta$ be a semigroup of DHR endomorphisms of $\mathcal{A}_0$, all covariant with finite dimension. Assume that the DHR braiding on $\Delta$ is in fact a permutation symmetry (para-Bose statistics for the endomorphisms in $\Delta$) and that $\Delta$ is specially directed in the sense of [16, Sect. 5]. Then the Doplicher-Roberts construction provides a local extension $\mathcal{B}$ of $\mathcal{A}$ and a strongly compact group $G$ of vacuum preserving internal symmetries of $\mathcal{B}$ such that $\mathcal{A}$ coincides with the fixed point net $\mathcal{B}^G$. Note that by [17, Theorem 3.4] (see also [18, Lemma 3.7]) if $\Delta$ has direct sums, subobjects and conjugates then it is specially directed. In the following we shall use the notation $\mathcal{B} = \mathcal{A} \rtimes \Delta$ for the net obtained through the above Doplicher-Roberts cross product construction.

The decomposition in Eq. (37) suggests the following generalization of the local extensions with compact group action discussed above, cf. [44, Sect. 5].

Definition 3.2. A local extension $\mathcal{B}$ of a conformal net $\mathcal{A}$ is of compact type if the corresponding representation $\pi$ of $\mathcal{A}$ on $\mathcal{H}_{\mathcal{B}}$ satisfies
$$ \pi = \bigoplus_i n_i \pi_i, \tag{38} $$
where the $\pi_i$ are (necessarily covariant with positive energy) mutually inequivalent irreducible subrepresentations of $\pi$ having finite statistical dimension and appearing with multiplicity $n_i$.
Although we did not assume in Definition 3.2 any bound on the multiplicities $n_i$, these turn out to be finite as a consequence of the following proposition, cf. [33, Prop. 2.3] and [10] for related results.

Proposition 3.3. Let B be a local extension of compact type of a conformal net A on $S^1$ and let π be the corresponding representation of A on $H_B$. Then the following hold:
(a) On $H_B$ we have
$$A(I) \vee A(I^c) = A(S^1)'', \quad I \in \mathcal{I}. \tag{39}$$
(b) The local extension B is irreducible, namely
$$A(I)' \cap B(I) = \mathbb{C}1, \quad I \in \mathcal{I}. \tag{40}$$
(c) Every irreducible representation of A is contained in π with finite (possibly zero) multiplicity.

Proof. Let θ be a representation of A localized in $I \in \mathcal{I}$ and equivalent to π. Then for $J \supset I$, $\theta_J$ is a dual canonical endomorphism for the subfactor $A(J) \subset B(J)$. By assumption π is a direct sum of covariant representations with finite statistical dimension. Hence we can find isometries $V_i \in A(I)$, $i \in \mathbb{N}$, with orthogonal ranges, satisfying $E_i := V_i V_i^* \in \theta(A(S^1))'$, $\sum_{i \in \mathbb{N}} E_i = 1$ and such that the representations $\sigma^i$ defined by $\sigma^i_J(\cdot) = V_i^* \theta_J(\cdot) V_i$, $J \in \mathcal{I}$, are irreducible, covariant, localized in I and with finite statistical dimension. If $T \in \theta_I(A(I))' \cap A(I)$ then $V_i^* T V_j \sigma^j_I(\cdot) = \sigma^i_I(\cdot) V_i^* T V_j$ for $i, j \in \mathbb{N}$ and hence by the equivalence of local and global intertwiners for localized representations with finite dimension [25, Theorem 2.3] we have $V_i^* T V_j \sigma^j_J(\cdot) = \sigma^i_J(\cdot) V_i^* T V_j$ for every $J \in \mathcal{I}$. It follows that $E_i T E_j \in \theta(A(S^1))'$ and hence $T \in \theta(A(S^1))'$. Since $T \in \theta_I(A(I))' \cap A(I)$ was arbitrary and $\theta(A(S^1))' \subset \theta_I(A(I))' \cap A(I)$ by Haag duality, we conclude that $\theta(A(S^1))'' = \theta_I(A(I)) \vee \theta_{I^c}(A(I^c))$. Hence $\pi(A(S^1))'' = \pi_I(A(I)) \vee \pi_{I^c}(A(I^c))$ which proves (a). Now, recalling that $U(PSL(2,\mathbb{R})) \subset A(S^1)''$, by definition of local extensions, we find $\mathbb{C}1 = A(S^1)' \cap B(I)$ and hence (b) follows from (a) and locality. Finally (c) follows from [28, p. 39].

Since the defining extensions of the fixed point nets under compact groups of internal symmetries and the finite index extensions are of compact type we can conclude that (b) of Prop. 3.3 generalizes the irreducibility results for conformal subsystems in [7, Prop. 2.1] and [13, Cor. 2.7] (the latter in the local case).

Remark 3.4. If B is a local extension of compact type of a conformal net A on $S^1$ then it follows from Proposition 3.3 (and its proof) that $A(I) \subset B(I)$, $I \in \mathcal{I}$, is an irreducible discrete inclusion of infinite factors in the sense of [28, Def. 3.7].
We now consider the Virasoro net $A_{(\mathrm{Vir},1)}$ with c = 1. By [49, Prop. 4] $A_{(\mathrm{Vir},1)}$ is the fixed point net under the action of SO(3) on the conformal net $A_{SU(2)_1}$ associated to the level one vacuum representation of the loop group LSU(2). The corresponding representation π of $A_{(\mathrm{Vir},1)}$ on $H_{A_{SU(2)_1}}$ satisfies
$$\pi = \bigoplus_{j \in \mathbb{N}_0} (2j+1)\,\pi^1_{j^2}, \tag{41}$$
where $\pi^1_{j^2}$ is the representation of $A_{(\mathrm{Vir},1)}$ with lowest weight $j^2$. As a consequence $d(\pi^1_{j^2}) = 2j + 1$ for each $j \in \mathbb{N}$ [49, Cor. 6].
We can consider the permutation symmetric semigroup Δ of DHR endomorphisms of $A_{(\mathrm{Vir},1)}$ which are localized in some $I \in \mathcal{I}_0$ and equivalent to a finite direct sum of representations of the type $\pi^1_{j^2}$, $j \in \mathbb{N}_0$. Then, as discussed above, $A_{SU(2)_1}$ can be identified with the Doplicher-Roberts cross product $A_{(\mathrm{Vir},1)} \rtimes \Delta$. Now let B be a local extension of compact type of $A_{(\mathrm{Vir},1)}$ and let π be the corresponding representation of $A_{(\mathrm{Vir},1)}$ on $H_B$. By Prop. 2.1 and [8, Theorem 4.4] every irreducible subrepresentation of π is equivalent to a DHR endomorphism in Δ (note that only subrepresentations with integer lowest conformal energy can appear) and hence
$$\pi \simeq \bigoplus_{i \in \mathbb{N}} \sigma_i, \tag{42}$$
where $\sigma_i \in \Delta$ for each $i \in \mathbb{N}$. The local extensions of $A_{(\mathrm{Vir},1)}$ with the above property have been independently classified by the author (cf. the announcement in [32]) and by Feng Xu [55, Sect. 4.2.2]. The resulting possibilities are described in the following theorem (we outline our original proof below).

Theorem 3.5. A local extension B of $A_{(\mathrm{Vir},1)}$ is of compact type if and only if B is isomorphic to a fixed point net $A^H_{SU(2)_1}$ for some closed subgroup H of SO(3).

Proof. The "if" part is a straightforward consequence of the fact that $A_{SU(2)_1}$ is an extension of compact type of $A_{(\mathrm{Vir},1)}$, cf. Eq. (41). Now let B be an extension of compact type of $A_{(\mathrm{Vir},1)}$ and let π be the corresponding representation of $A_{(\mathrm{Vir},1)}$ on $H_B$. Given a representation θ of $A_{(\mathrm{Vir},1)}$ localized in $I \in \mathcal{I}_0$ and unitarily equivalent to π (so that if $J \supset I$, $\theta_J$ is a dual canonical endomorphism for the inclusion $A_{(\mathrm{Vir},1)}(J) \subset B(J)$) we deduce from Eq. (42) that θ is equivalent to a (possibly infinite) direct sum of DHR endomorphisms in the permutation symmetric semigroup Δ defined after Eq. (41). It follows that the monodromy operator $M(\rho, \theta) := \varepsilon(\rho, \theta)\,\varepsilon(\theta, \rho)$ is trivial (i.e. equal to 1) for every $\rho \in \Delta$. We now use the extension of DHR endomorphisms as defined in [44, Prop. 3.9] (cf. also [50, Sect. 3.4.7]) and which is called α-induction in [2]. For every $\rho \in \Delta$, the triviality of the monodromy operator $M(\rho, \theta)$ implies that its extension $\alpha_\rho$ (we use the notation in [2]) to $B_0$ is still localized in an interval in $\mathcal{I}_0$, see [44, Prop. 3.9]. Now the crucial point is that the functorial properties of α-induction (called homomorphism properties in [2]) imply that $\Delta^\alpha := \{\alpha_\rho : \rho \in \Delta\}$ is still a specially directed permutation symmetric semigroup of Möbius covariant (bosonic) DHR endomorphisms of $B_0$. These functorial properties have been established in [11, 12] for inclusions of local nets on the four dimensional Minkowski space-time and in [2] for finite index nets of subfactors on the real line. Due to the triviality of the monodromy (which is automatic in four space-time dimensions) one can use the arguments in [11, 12] (see also [9, Sect. 2] for an overview) to get the desired structure on $\Delta^\alpha$. Hence, as recalled at the beginning of this section, we can use the Doplicher-Roberts cross product construction to define a local extension $B \rtimes \Delta^\alpha$ of the conformal net B. The next point is that the proof of [12, Theorem 3.5] can be adapted to our situation to show that there is a natural inclusion (up to isomorphism) $A_{SU(2)_1} = A_{(\mathrm{Vir},1)} \rtimes \Delta \subset B \rtimes \Delta^\alpha$ (compatible with $A_{(\mathrm{Vir},1)} \subset B$) and in fact it turns out that $B \rtimes \Delta^\alpha$ is a local extension of $A_{SU(2)_1}$. But the latter conformal net has no proper local extensions (see e.g. [3, 33]) and hence we conclude that $B \rtimes \Delta^\alpha = A_{SU(2)_1}$. Accordingly $B = A^H_{SU(2)_1}$ for some closed subgroup H of SO(3) as claimed.
The above proof relies on specific properties of the representation category of the net $A_{(\mathrm{Vir},1)}$, namely on the fact that the subcategory of representations with finite statistical dimension (and trivial univalence) is permutation symmetric, a fact that appears to be rather exceptional for conformal nets on $S^1$. However in the case of local nets on the four dimensional space-time similar ideas have been used by the author and R. Conti to study local extensions in a fairly general context [10]. As a matter of fact the above mentioned investigation in [10] inspired our proof of Theorem 3.5. Coming back to conformal nets on $S^1$ we remark that there are well known local extensions of the Virasoro net with c = 1 which are not conformal subsystems of $A_{SU(2)_1}$ (see e.g. [3]) and hence are not of compact type as a consequence of Theorem 3.5. However F. Xu has made further progress and classified the local extensions B of the c = 1 Virasoro net such that the corresponding representation of $A_{(\mathrm{Vir},1)}$ on $H_B$ contains a subrepresentation equivalent to some $\pi^1_{j^2}$, $j \in \mathbb{N}$ [55, Theorem 4.6]. The above condition is called the "spectrum condition" in [55] where it is conjectured that all nontrivial extensions of the c = 1 Virasoro net have to satisfy it. This motivates the following definition:

Definition 3.6. A local extension B of a conformal net A is maximally non-compact if the corresponding representation π of A on $H_B$ satisfies the following condition: the only subrepresentation of finite statistical dimension of π is the vacuum subrepresentation.

From the previous discussion we can conclude that a local extension of the c = 1 Virasoro net satisfies Xu's spectrum condition if and only if it is not maximally non-compact. No examples of maximally non-compact extensions of this net seem to be known. We shall however exhibit in Sect. 5 various examples of maximally non-compact extensions for the Virasoro nets with c > 1. We conclude this section with the following proposition.
Proposition 3.7. Let B be a local extension of the Virasoro net $A_{(\mathrm{Vir},c)}$. Then the following hold:
(a) $A_{(\mathrm{Vir},c)}(I)' \cap B(I) = \mathbb{C}1$, for every $I \in \mathcal{I}$;
(b) The net B is diffeomorphism covariant.

Proof. Let π be the representation of $A_{(\mathrm{Vir},c)}$ on $H_B$ associated with the local extension B. If V is the corresponding strongly continuous projective unitary representation of $\mathrm{Diff}^+(S^1)$ on $H_B$ given by (b) of Prop. 2.2, then $V(\gamma) \in A_{(\mathrm{Vir},c)}(I)$ if $\gamma \in \mathrm{Diff}(I)$, for each $I \in \mathcal{I}$. Moreover, for every $g \in PSL(2,\mathbb{R})$ we have $V(g) = p(U(g))$, where U is the representation which makes B Möbius covariant. Hence, it follows from [38, Theorem 12] that $A_{(\mathrm{Vir},c)}(I)' \cap B(I) = U(PSL(2,\mathbb{R}))' \cap B(I) = \mathbb{C}1$, which proves (a). Now let I be a given interval in $\mathcal{I}$ and let $\gamma \in \mathrm{Diff}^+(S^1)$ be such that $\gamma I = I$. Since γ preserves the orientation it must keep fixed the boundary points of I. An elementary argument (which we omit here) then shows that for every $J \in \mathcal{I}$ containing the closure of I we can find a diffeomorphism $\gamma_J \in \mathrm{Diff}(J)$ with $\gamma_J|_I = \gamma|_I$, i.e. $\gamma^{-1}\gamma_J \in \mathrm{Diff}(I^c)$. Since $V(\gamma^{-1}\gamma_J) \in A_{(\mathrm{Vir},c)}(I^c) \subset B(I^c)$, we find
$$V(\gamma_J) B(I) V(\gamma_J)^* = V(\gamma) V(\gamma^{-1}\gamma_J) B(I) V(\gamma^{-1}\gamma_J)^* V(\gamma)^* = V(\gamma) B(I) V(\gamma)^*,$$
and hence $V(\gamma) B(I) V(\gamma)^* \subset B(J)$ for every $J \in \mathcal{I}$ containing the closure of I. Thus, conformal nets being continuous from outside, we conclude that $V(\gamma) B(I) V(\gamma)^* \subset B(I)$. If γ is arbitrary we can always find a $g \in PSL(2,\mathbb{R})$ such that $gI = \gamma I$. It follows that
$$V(\gamma) B(I) V(\gamma)^* = U(g) V(g^{-1}\gamma) B(I) V(g^{-1}\gamma)^* U(g)^* \subset B(gI) = B(\gamma I),$$
and hence, for every $I \in \mathcal{I}$, $\gamma \in \mathrm{Diff}^+(S^1)$, we have $V(\gamma) B(I) V(\gamma)^* = B(\gamma I)$, and also (b) is proved.
Remark 3.8. If B is a diffeomorphism covariant net on $S^1$ and V is the corresponding projective unitary representation of $\mathrm{Diff}^+(S^1)$, one can define a covariant subsystem C of B by
$$C(I) = \{V(\gamma) : \gamma \in \mathrm{Diff}(I)\}'', \quad I \in \mathcal{I}. \tag{43}$$
Arguing as in the proof of Prop. 2.1 it can be shown that the conformal net C on $S^1$ is isomorphic to $A_{(\mathrm{Vir},c)}$ for some $c \in D \cup [1, \infty)$. It follows that the correspondence between diffeomorphism covariant nets on $S^1$ and local extensions of the Virasoro nets is one-to-one, cf. [33].

4. On the Oscillator Representations of the Virasoro Nets with c > 1

Let $(A_{U(1)}, U, H_{U(1)})$ be the conformal net generated by the U(1) chiral current algebra, see [3, 5]. The Hilbert space $H_{U(1)}$ and the net $A_{U(1)}$ can be identified with the Fock space $e^{H_1}$, where $H_1$ is acted on by the irreducible representation of $PSL(2,\mathbb{R})$ of lowest weight 1, and with the corresponding second quantization net respectively [26]. We denote by $H^{fin}_{U(1)}$ the dense subspace of finite energy vectors, i.e. the algebraic direct sum of the $L_0$ eigenspaces. Then $H^{fin}_{U(1)}$ carries the unique irreducible lowest weight representation of the oscillator (Heisenberg) algebra
$$[J_n, J_m] = n\,\delta_{n+m,0}, \quad m, n \in \mathbb{Z}, \tag{44}$$
$$J_0 = q1, \tag{45}$$
with lowest weight q = 0, see [3] and [30, Sect. 2.2]. The corresponding lowest weight vector Ω is the vacuum vector and for $\xi, \psi \in H^{fin}_{U(1)}$, $n \in \mathbb{Z}$ we have
$$(\xi, J_n \psi) = (J_{-n}\xi, \psi) \tag{46}$$
(hermiticity). Note that defining $J^q_n := J_n$ for $n \neq 0$ and $J^q_0 := q1$ we obtain a unitary representation of the oscillator algebra with arbitrary lowest weight $q \in \mathbb{R}$. The U(1) current $J(z)$, $z = e^{i\vartheta} \in S^1$, is defined as an operator valued distribution by
$$J(z) = \sum_{n \in \mathbb{Z}} J_n z^{-n-1} \tag{47}$$
and the common invariant domain for the smeared field operators
$$J(u) = \oint_{S^1} \frac{dz}{2\pi i}\, J(z) u(z), \quad u \in C^\infty(S^1),$$
can be chosen to be the subspace $C^\infty(L_0)$ of smooth vectors for $L_0$. For a real function $u \in C^\infty(S^1)$, $J(u)$ is essentially self-adjoint and the unitary operators $W(u) := e^{iJ(u)}$ with $u \in C^\infty(S^1)$ real, supp u ⊂ I, generate $A_{U(1)}(I)$ for every $I \in \mathcal{I}$. Moreover the Weyl relations hold:
$$W(u) W(v) = W(u + v)\, e^{-\oint_{S^1} \frac{dz}{4\pi i}\, u'(z) v(z)}, \tag{48}$$
for real smooth functions u, v, where $u'(z)$ denotes the derivative $\frac{d}{dz} u(z) = -i e^{-i\vartheta} \frac{d}{d\vartheta} u(e^{i\vartheta})$. As shown in [3] (see also [4]) for every $q \in \mathbb{R}$ there is a covariant irreducible representation of $A_{U(1)}$ (BMT-automorphism) $\gamma_q$ on $H_{U(1)}$ such that
$$\gamma_{q\,I}(W(u)) = e^{iq \oint_{S^1} \frac{dz}{2\pi i}\, z^{-1} u(z)}\, W(u) = e^{iJ^q(u)}, \tag{49}$$
for $I \in \mathcal{I}$, $u \in C^\infty(S^1)$ with support in I. Here the field $J^q(z)$ is defined by
$$J^q(z) = \sum_{n \in \mathbb{Z}} J^q_n z^{-n-1} = J(z) + q z^{-1}. \tag{50}$$
$\gamma_{q_1}$ and $\gamma_{q_2}$ are inequivalent if $q_1 \neq q_2$. Moreover, if ϕ is a real smooth function such that $-i\varphi'(z) = z^{-1} q$ for $z \in I$ then
$$\gamma_{q\,I}(\cdot) = \mathrm{Ad}\,W(-\varphi)(\cdot), \tag{51}$$
and hence $\gamma_q$ is locally implementable by Weyl unitaries. In fact Eq. (51) can be used to define the representation $\gamma_q$. Note that $\gamma_0$ is the vacuum representation of $A_{U(1)}$ and that for every $I \in \mathcal{I}$, we have
$$\gamma_{q\,I}(A_{U(1)}(I)) = A_{U(1)}(I). \tag{52}$$
We now come to the oscillator representations of the Virasoro algebra. For $\lambda, q \in \mathbb{R}$, $n \in \mathbb{Z}$ the operators
$$L^{(\lambda,q)}_n = \delta_{n,0}\frac{\lambda^2}{2} + \frac{1}{2}\sum_{j \in \mathbb{Z}} : J^q_{-j} J^q_{j+n} : + i\lambda n J^q_n, \tag{53}$$
where the colons denote normal ordering, define a positive energy unitary representation $R(\lambda, q)$ of the Virasoro algebra on $H^{fin}_{U(1)}$ with central charge $c = 1 + 12\lambda^2$, see e.g. [30, Sect. 3.4]. Since $L^{(0,0)}_0$ coincides with $L_0$ (by the Sugawara formula) we have $L^{(\lambda,q)}_0 = L_0 + (\lambda^2 + q^2)/2$ and hence Ω is a lowest energy vector for these representations with energy $(\lambda^2 + q^2)/2$. We associate to the above representations the energy-momentum tensors $T^{(\lambda,q)}(z)$ defined by
$$T^{(\lambda,q)}(z) = \sum_{n \in \mathbb{Z}} L^{(\lambda,q)}_n z^{-n-2}. \tag{54}$$
Then the following holds (see [21, Remark 4.2])
$$T^{(\lambda,q)}(z) = \frac{1}{2} : J^q(z)^2 : - i\lambda\left(\frac{1}{z} + \frac{d}{dz}\right) J^q(z) + \frac{\lambda^2}{2z^2}, \tag{55}$$
and hence, recalling that $J^q(z) = J(z) + q z^{-1}$,
$$T^{(\lambda,q)}(z) = \frac{1}{2} : J(z)^2 : + \frac{q}{z} J(z) - i\lambda\left(\frac{1}{z} + \frac{d}{dz}\right) J(z) + \frac{\lambda^2 + q^2}{2z^2}. \tag{56}$$
For $f \in C^\infty(S^1)$ the smeared field operator
$$T^{(\lambda,q)}(f) = \oint_{S^1} \frac{dz}{2\pi i}\, T^{(\lambda,q)}(z) f(z) \tag{57}$$
is well defined on the domain $C^\infty(L_0)$ and leaves it globally invariant. Moreover we see from Eq. (56) that the field $T^{(\lambda,q)}(z)$ is local with respect to $J(z)$ in the sense that if $f, u \in C^\infty(S^1)$ have disjoint supports, the operators $T^{(\lambda,q)}(f)$ and $J(u)$ commute on $C^\infty(L_0)$ and that $T^{(\lambda,q)}(f)$ is hermitian if f is a real function. Finally, it follows from [5, Sect. 2] (cf. also [24]) that $T^{(\lambda,q)}(f)$ is essentially self-adjoint for each real valued smooth function f and that in this case $e^{iT^{(\lambda,q)}(f)}$ commutes with $W(u)$ if the support of the real function u is disjoint from the one of f. We now define an isotonous net $B^{(\lambda,q)}$ on $H_{U(1)}$ by
$$B^{(\lambda,q)}(I) = \{e^{iT^{(\lambda,q)}(f)} : f \in C^\infty(S^1),\ \text{real},\ \mathrm{supp} f \subset I\}'', \tag{58}$$
for $I \in \mathcal{I}$. As a consequence of the above discussion and of Haag duality for $A_{U(1)}$ we obtain the following proposition:

Proposition 4.1. For every $I \in \mathcal{I}$ we have
$$B^{(\lambda,q)}(I) \subset A_{U(1)}(I). \tag{59}$$
The net $B^{(\lambda,q)}$ so defined is not in general a conformal subsystem of $A_{U(1)}$. In fact it can be shown that $B^{(\lambda,q)}$ transforms covariantly with respect to the representation U making $A_{U(1)}$ Möbius covariant only for $(\lambda, q) = (0, 0)$, $B^{(0,0)}$ being the (c = 1) Virasoro subnet of $A_{U(1)}$. However, the equality $L^{(\lambda,q)}_0 = L_0 + (\lambda^2 + q^2)/2$ implies rotation covariance for every $(\lambda, q) \in \mathbb{R}^2$, namely
$$U(r(\vartheta))\, B^{(\lambda,q)}(I)\, U(r(-\vartheta)) = B^{(\lambda,q)}(r(\vartheta) I), \quad \vartheta \in \mathbb{R},\ I \in \mathcal{I}. \tag{60}$$
We shall need the following two lemmata:

Lemma 4.2. For every pair $(\lambda, q) \in \mathbb{R}^2$ and every $I \in \mathcal{I}$ the following holds:
$$B^{(\lambda,q)}(I) = \gamma_{q\,I}(B^{(\lambda,0)}(I)). \tag{61}$$

Proof. Let $\varphi, f \in C^\infty(S^1)$ be real functions such that $-i\varphi'(z) = z^{-1} q$ for $z \in I$ and supp f ⊂ I. For $\xi, \psi \in H^{fin}_{U(1)}$ a straightforward calculation shows that
$$(\xi, W(-\varphi)\, T^{(\lambda,0)}(f)\, \psi) = (T^{(\lambda,q)}(f)\, \xi, W(-\varphi)\, \psi).$$
Since $H^{fin}$ is a common core for $T^{(\lambda,0)}(f)$ and $T^{(\lambda,q)}(f)$ it follows that
$$W(-\varphi)\, e^{iT^{(\lambda,0)}(f)}\, W(\varphi) = e^{iT^{(\lambda,q)}(f)},$$
and hence, recalling Eq. (51),
$$\gamma_{q\,I}(e^{iT^{(\lambda,0)}(f)}) = e^{iT^{(\lambda,q)}(f)},$$
cf. [5, p. 123] and [4, p. 361]. The conclusion then follows from the definition of $B^{(\lambda,q)}(I)$ given in Eq. (58).
Lemma 4.3. The representation $R(\lambda, q)$ defined after Eq. (53) is irreducible for every $\lambda \neq 0$ and $q \in \mathbb{R}$.

Proof. The character $\chi_{(\lambda,q)}(t)$, $t \in (0, 1)$, of the representation $R(\lambda, q)$ is given by
$$\chi_{(\lambda,q)}(t) = \mathrm{Tr}\big(t^{L^{(\lambda,q)}_0}\big) = t^{\frac{\lambda^2 + q^2}{2}}\, p(t),$$
where $p(t) = \prod_{n=1}^{\infty} (1 - t^n)^{-1} = \mathrm{Tr}(t^{L_0})$ and hence the conclusion follows since, by [30, Eq. (3.15) and Prop. 8.2], it coincides with the character of the irreducible representation $L(c, h)$ of Vir with central charge $c = 1 + 12\lambda^2$ and lowest weight $h = (\lambda^2 + q^2)/2$.

Corollary 4.4. Let $A_{(\mathrm{Vir},c)}$ be the Virasoro net with central charge $c = 1 + 12\lambda^2$, $\lambda \neq 0$, and let $\pi^c_h$ be the (irreducible) representation of $A_{(\mathrm{Vir},c)}$ with lowest weight $h = (\lambda^2 + q^2)/2$ as defined in Subsect. 2.4. Then there is a representation $\pi_{(\lambda,q)}$ of $A_{(\mathrm{Vir},c)}$ on $H_{U(1)}$, unitarily equivalent to $\pi^c_h$, such that for every $I \in \mathcal{I}$ the following holds:
$$\pi_{(\lambda,q)\,I}(A_{(\mathrm{Vir},c)}(I)) = B^{(\lambda,q)}(I). \tag{62}$$
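As a small illustration of the character identity used in the proof of Lemma 4.3, the following sketch expands $p(t) = \prod_{n \ge 1} (1 - t^n)^{-1}$ as a power series. The coefficient of $t^n$ is the number of partitions of n, which is the dimension of the $L_0$ eigenspace at energy n in the Fock space. The function name `partition_coefficients` is hypothetical and chosen only for this example.

```python
def partition_coefficients(n_max):
    """Coefficients of prod_{n>=1} (1 - t^n)^(-1) up to order t^n_max.

    coeffs[k] is the number of partitions of k, i.e. the multiplicity of
    the L_0 eigenvalue k in Tr(t^{L_0}) = p(t).
    """
    coeffs = [1] + [0] * n_max
    # Multiply successively by the geometric series 1/(1 - t^part).
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            coeffs[total] += coeffs[total - part]
    return coeffs


if __name__ == "__main__":
    print(partition_coefficients(6))
```

Multiplying the series by $t^{(\lambda^2 + q^2)/2}$ only shifts the energies, so the same multiplicities appear at the energies $(\lambda^2 + q^2)/2 + n$.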
We shall need the following proposition in the next section.

Proposition 4.5. Let $A_{(\mathrm{Vir},c)}$ be a Virasoro net with c > 1. Then, if $h \geq (c - 1)/24$ we have $d(\pi^c_h) = d(c)$, where $d(c) \in [1, \infty]$ does not depend on h and satisfies d(c) > 1.

Proof. The assumption on the range of c and h implies that we can find $\lambda \neq 0$ and $q \in \mathbb{R}$ such that $c = 1 + 12\lambda^2$ and $h = (\lambda^2 + q^2)/2$. Then it follows from Corollary 4.4 that $d(\pi^c_h) = d(\pi_{(\lambda,q)})$ and we have to show that the latter does not depend on q. By Eq. (16) and Corollary 4.4 we find
$$d(\pi_{(\lambda,q)})^2 = [B^{(\lambda,q)}(I^c)' : B^{(\lambda,q)}(I)], \quad I \in \mathcal{I}.$$
From Proposition 4.1 and Haag duality for $A_{U(1)}$ it follows that
$$B^{(\lambda,q)}(I) \subset A_{U(1)}(I) = A_{U(1)}(I^c)' \subset B^{(\lambda,q)}(I^c)',$$
and hence, using the multiplicativity of the minimal index [42] (cf. the proof of [8, Prop. 3.1]), that
$$d(\pi_{(\lambda,q)})^2 = [A_{U(1)}(I) : B^{(\lambda,q)}(I)] \cdot [A_{U(1)}(I^c) : B^{(\lambda,q)}(I^c)].$$
Now, using Lemma 4.2 and Eq. (52) we find, for an arbitrary $J \in \mathcal{I}$,
$$[A_{U(1)}(J) : B^{(\lambda,q)}(J)] = [\gamma_{q\,J}(A_{U(1)}(J)) : \gamma_{q\,J}(B^{(\lambda,0)}(J))] = [A_{U(1)}(J) : B^{(\lambda,0)}(J)],$$
and hence
$$d(\pi_{(\lambda,q)})^2 = [A_{U(1)}(I) : B^{(\lambda,0)}(I)] \cdot [A_{U(1)}(I^c) : B^{(\lambda,0)}(I^c)]$$
does not depend on q. Finally if d(c) = 1 then, for every $I \in \mathcal{I}$, $\pi_{(\lambda,q)}(A_{(\mathrm{Vir},c)}(I)) = A_{U(1)}(I)$, which is impossible since $A_{U(1)}$ is strongly additive (see [5, 26]) while $A_{(\mathrm{Vir},c)}$ is not.
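The first step of the proof of Proposition 4.5, namely solving $c = 1 + 12\lambda^2$ and $h = (\lambda^2 + q^2)/2$ for real λ and q, can be sketched as follows. This is a minimal illustration, not part of the original argument: the function name `oscillator_parameters` is hypothetical, and the choice of the nonnegative square roots is an assumption made only to pick one solution.

```python
import math


def oscillator_parameters(c, h):
    """Return (lam, q) with c = 1 + 12*lam**2 and h = (lam**2 + q**2)/2.

    Requires c > 1 (so that lam != 0) and h >= (c - 1)/24 (so that q is real).
    The nonnegative roots are returned; signs are a convention of this sketch.
    """
    if c <= 1:
        raise ValueError("requires c > 1")
    lam_sq = (c - 1) / 12
    q_sq = 2 * h - lam_sq  # nonnegative exactly when h >= (c - 1)/24
    if q_sq < 0:
        raise ValueError("requires h >= (c - 1)/24")
    return math.sqrt(lam_sq), math.sqrt(q_sq)


if __name__ == "__main__":
    lam, q = oscillator_parameters(13, 1)
    print(lam, q, 1 + 12 * lam**2, (lam**2 + q**2) / 2)
```

The threshold $h = (c - 1)/24$ corresponds to q = 0, which is why the oscillator representations only reach lowest weights at or above it.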
5. Sectors with Infinite Dimension and Maximally Non-Compact Local Extensions

Let $D \subset [\frac{1}{2}, 1)$ be the set of allowed values of the central charge in the discrete series representations of Vir as defined in Subsect. 2.4 and let $c \in (D + 1) \cup [2, \infty)$. Then c − 1 is an allowed value of the central charge and the tensor product net $A_{(\mathrm{Vir},c-1)} \otimes A_{SU(2)_1}$ is a local extension of $A_{(\mathrm{Vir},c)}$. The representation of $A_{(\mathrm{Vir},1)}$ on $H_{A_{SU(2)_1}}$ contains the irreducible lowest weight representation $\pi^1_{j^2}$, $j \in \mathbb{N}_0$, with multiplicity 2j + 1 (see Eq. (41)) and hence the multiplicity m(c, j) of $\pi^c_{j^2}$ in the representation of $A_{(\mathrm{Vir},c)}$ on $H_{A_{(\mathrm{Vir},c-1)}} \otimes H_{A_{SU(2)_1}}$ satisfies $m(c, j) \geq 2j + 1$ for every $j \in \mathbb{N}_0$. We are now ready to prove the following theorem, cf. [8, Theorem 4.4] and the guess in [49, Sect. 2].

Theorem 5.1. If $c \in (D + 1) \cup [2, \infty)$ and $h \geq (c - 1)/24$ then $d(\pi^c_h) = \infty$.

Proof. Let π be the representation of $A_{(\mathrm{Vir},c)}$ in $H_{A_{(\mathrm{Vir},c-1)}} \otimes H_{A_{SU(2)_1}}$ as described above. Then, as explained in Sect. 2, π is unitarily equivalent to a representation θ on $H_{A_{(\mathrm{Vir},c)}}$ localized in an interval $I_0 \in \mathcal{I}$ and for every $I \in \mathcal{I}$ with $I_0 \subset I$, $\theta_I$ is a dual canonical endomorphism for the inclusion $A_{(\mathrm{Vir},c)}(I) \subset A_{(\mathrm{Vir},c-1)}(I) \otimes A_{SU(2)_1}(I)$, which is irreducible because of Prop. 3.7. Now let $\rho^c_{j^2}$ be a representation of $A_{(\mathrm{Vir},c)}$ on $H_{A_{(\mathrm{Vir},c)}}$, unitarily equivalent to $\pi^c_{j^2}$ and localized in $I_0$, and let $I \in \mathcal{I}$ be an interval containing $I_0$. As shown just before the statement of this theorem the multiplicity m(c, j) of the representation $\rho^c_{j^2}$ in θ satisfies $m(c, j) \geq 2j + 1$. Hence (by Haag duality) the endomorphism $\rho^c_{j^2\,I}$ is contained in $\theta_I$ with multiplicity $n(c, j) \geq 2j + 1$ for each $j \in \mathbb{N}_0$. Now, it follows from Prop. 4.5 that $d(\rho^c_{j^2}) = d(c)$ for each $j \geq \sqrt{(c - 1)/24}$, where d(c) does not depend on j. Let us assume that $d(c) < \infty$. Then by [25, Cor. 2.10] $\rho^c_{j^2\,I}$ is an irreducible endomorphism of $A_{(\mathrm{Vir},c)}(I)$ for every $j \geq \sqrt{(c - 1)/24}$ and by [28, p. 39] we conclude that $2j + 1 \leq n(c, j) \leq d(c)^2$ for every $j \geq \sqrt{(c - 1)/24}$, in contradiction with the assumption $d(c) < \infty$. Hence $d(c) = \infty$ and the conclusion follows from Prop. 4.5.

Corollary 5.2. If $c \in (D + 1) \cup [2, \infty)$ and B is a local extension of compact type of $A_{(\mathrm{Vir},c)}$ then the index $[B : A_{(\mathrm{Vir},c)}]$ is finite.

Proof. Let π be the representation of $A_{(\mathrm{Vir},c)}$ on $H_B$ defined by the local extension B. Only representations with integer lowest weight can appear in the decomposition of π. But there are only a finite number of positive integers m satisfying m < (c − 1)/24 and hence, by Theorem 5.1, only a finite number of irreducible DHR sectors can appear in the decomposition of π. Now, recalling that the inclusion $A_{(\mathrm{Vir},c)}(I) \subset B(I)$, $I \in \mathcal{I}$, is irreducible, the conclusion follows from (the proof of) [33, Prop. 2.3].

Now let G be a simply connected compact Lie group with simple Lie algebra Lie(G) and let k be a positive integer. We denote by $A_{G_k}$ the conformal net associated to the vacuum representation of the corresponding loop group (or affine Lie algebra) at level k (see [22, 49, 52, 54]). As is well known, the Sugawara formula (see e.g. [14, Sect. 15.2] and [30, Sect. 10.1]) implies that the net $A_{G_k}$ is a local extension of the Virasoro net $A_{(\mathrm{Vir},c)}$ with central charge
$$c \equiv c(G_k) = \frac{\dim(G)\,k}{k + h^\vee}, \tag{63}$$
where $h^\vee$ is the dual Coxeter number of Lie(G), cf. [22, Sect. III.7] and [49, Sect. 1]. The central charge $c(G_k)$ is bounded by
$$r \leq c(G_k) \leq \dim(G), \tag{64}$$
where r is the rank of Lie(G) and the lower bound is saturated only for simply laced Lie algebras at level k = 1. Note that $c(G_k) < 2$ implies that r = 1 and thus that G = SU(2). In the latter case we have $c(SU(2)_k) = 3k/(k + 2)$. If $k \geq 4$ we have $c(SU(2)_k) \geq 2$. The remaining possibilities are $c(SU(2)_1) = 1$, $c(SU(2)_2) = 1 + 1/2$ and $c(SU(2)_3) = 1 + 4/5$. We summarize the above discussion in the following lemma.

Lemma 5.3. If $G_k \neq SU(2)_1$ then $c(G_k) \in (D + 1) \cup [2, \infty)$.

Recall that there is a strongly continuous representation of G in the (unitary) group of internal symmetries of $A_{G_k}$ leaving the vacuum invariant. This representation is not in general faithful and its kernel coincides with the (finite) center Z(G) of G. It is known that the fixed point net $A^G_{G_k}$ satisfies
$$A_{(\mathrm{Vir},c)} \subset A^G_{G_k} \subset A_{G_k}, \quad c = c(G_k), \tag{65}$$
see [49]. In particular, since G/Z(G) is infinite, the index $[A_{G_k} : A_{(\mathrm{Vir},c)}]$ is infinite.

Corollary 5.4. If $G_k \neq SU(2)_1$ then the local extension $A_{G_k}$ of $A_{(\mathrm{Vir},c)}$, $c = c(G_k)$, is not of compact type.

Proof. Due to Lemma 5.3 we can apply Corollary 5.2 and the conclusion follows from $[A_{G_k} : A_{(\mathrm{Vir},c)}] = \infty$.
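The SU(2) case of Lemma 5.3 can be checked with exact rational arithmetic. The sketch below assumes the standard parametrization of the discrete series, $c = 1 - 6/(m(m+1))$ with $m \geq 3$, for the set D (the text only refers to Subsect. 2.4 for its definition, so this formula is an assumption of the example). The function names are hypothetical, and the search over m is bounded by `m_max`.

```python
from fractions import Fraction


def central_charge_su2(k):
    """c(SU(2)_k) = 3k/(k + 2), i.e. Eq. (63) with dim = 3 and h^vee = 2."""
    return Fraction(3 * k, k + 2)


def discrete_series_index(c, m_max=1000):
    """Return m >= 3 with c = 1 - 6/(m(m+1)), or None if no such m <= m_max.

    Assumption: D is the standard discrete series, starting at m = 3 (c = 1/2).
    """
    for m in range(3, m_max + 1):
        if c == 1 - Fraction(6, m * (m + 1)):
            return m
    return None


def in_lemma_5_3_range(c):
    """True if c lies in (D + 1) united with [2, infinity)."""
    return c >= 2 or discrete_series_index(c - 1) is not None


if __name__ == "__main__":
    for k in range(1, 6):
        c = central_charge_su2(k)
        print(k, c, in_lemma_5_3_range(c))
```

In this sketch k = 2 and k = 3 land in D + 1 through m = 3 and m = 5 respectively, while k = 1 gives c = 1, which is excluded.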
The following consequence of Corollary 5.4 has been pointed out by K.-H. Rehren in [49] with a different argument based on the comparison of characters. It can also be proved using [55, Theorem 2.4] and the fact that $A_{(\mathrm{Vir},c)}$ is not strongly additive when c > 1.

Corollary 5.5. If $G_k \neq SU(2)_1$, then the inclusion $A_{(\mathrm{Vir},c)} \subset A^G_{G_k}$ is proper.

The next result shows that maximally non-compact local extensions naturally appear for the Virasoro nets with c > 1.

Proposition 5.6. If $G_k \neq SU(2)_1$ and $c = c(G_k) \leq 25$ then $A_{G_k}$ is a maximally non-compact local extension of $A_{(\mathrm{Vir},c)}$.

Proof. The representation π of $A_{(\mathrm{Vir},c)}$ in $H_{A_{G_k}}$ can only have irreducible subrepresentations with a nonnegative integer lowest weight. Since by assumption $(c - 1)/24 \leq 1$, it follows from Theorem 5.1 that the only subrepresentation of π with finite dimension is the vacuum representation. Hence the extension is maximally non-compact.

For SU(N), $h^\vee = N$ and hence $c(SU(N)_k) = k(N^2 - 1)/(N + k)$, and we see that Prop. 5.6 gives an infinite series of maximally non-compact extensions of the c > 1 Virasoro nets. Examples are: $SU(2)_k$, k > 1; $SU(3)_k$, $SU(4)_k$, $SU(5)_k$, k arbitrary; $SU(N)_1$, $2 < N \leq 26$. Actually the same proof of Prop. 5.6, together with Prop. 2.1, gives the following stronger result.

Theorem 5.7. If $c \in (1 + D) \cup [2, 25]$ then every local extension of the Virasoro net $A_{(\mathrm{Vir},c)}$ is maximally non-compact. In particular $A_{(\mathrm{Vir},c)}$ has no local extensions of compact type.
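The list of SU(N) examples for Prop. 5.6 follows from the formula $c(SU(N)_k) = k(N^2 - 1)/(N + k)$ together with the bound $c \leq 25$. A minimal sketch of that arithmetic, with hypothetical function names, is:

```python
from fractions import Fraction


def central_charge_sun(n, k):
    """c(SU(N)_k) = k (N^2 - 1) / (N + k), using h^vee = N and dim = N^2 - 1."""
    return Fraction(k * (n * n - 1), n + k)


def satisfies_prop_5_6(n, k):
    """Hypotheses of Prop. 5.6 for G_k = SU(N)_k: G_k != SU(2)_1 and c <= 25."""
    return (n, k) != (2, 1) and central_charge_sun(n, k) <= 25


if __name__ == "__main__":
    # Level one: c = N - 1, so the bound holds up to N = 26.
    print([n for n in range(2, 30) if satisfies_prop_5_6(n, 1)])
    # SU(6): the bound fails from some level on, unlike SU(5).
    print([k for k in range(1, 20) if not satisfies_prop_5_6(6, k)])
```

Since $c(SU(N)_k)$ increases with k towards $N^2 - 1$, the bound holds for all levels exactly when $N^2 - 1 \leq 25$, that is for $N \leq 5$, which matches the "k arbitrary" entries in the list above.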
A. Appendix

In this appendix we give a differentiability result for the representations of $\mathrm{Diff}^+(S^1)$ which is used in the proof of Prop. 2.1. This result has been essentially obtained by T. Loke [40] (cf. also [54] for analogous results for loop groups) and here we consider the modifications needed in this paper. We shall closely follow the discussion in [40, Chap. I]. An element of the group Mob of Möbius transformations of $S^1$ is given by a map $z \mapsto \frac{\alpha z + \beta}{\bar{\beta} z + \bar{\alpha}}$, where α, β are complex numbers satisfying $|\alpha|^2 - |\beta|^2 = 1$. Mob is a Lie subgroup of $\mathrm{Diff}^+(S^1)$ isomorphic to $PSL(2,\mathbb{R})$. The corresponding Lie subalgebra of $\mathrm{Vect}(S^1)$ is spanned by the vector fields
$$x := -\sin\vartheta\,\frac{d}{d\vartheta}, \quad y := -\cos\vartheta\,\frac{d}{d\vartheta}, \quad h := \frac{d}{d\vartheta}, \tag{66}$$
whose brackets are given by
$$[h, x] = -y, \quad [h, y] = x, \quad [x, y] = h. \tag{67}$$
More generally, for each $n \in \mathbb{N}$, the vector fields
$$x_n := -\frac{1}{n}\sin n\vartheta\,\frac{d}{d\vartheta}, \quad y_n := -\frac{1}{n}\cos n\vartheta\,\frac{d}{d\vartheta}, \quad h_n := \frac{1}{n}\frac{d}{d\vartheta}, \tag{68}$$
span isomorphic Lie subalgebras of $\mathrm{Vect}(S^1)$, each associated to a Lie subgroup $\mathrm{Mob}_n$ of $\mathrm{Diff}^+(S^1)$. Clearly $\mathrm{Mob}_1 = \mathrm{Mob}$ and it is not hard to see that, for each n > 1, $\mathrm{Mob}_n$ is isomorphic to an n-fold covering of $PSL(2,\mathbb{R}) \simeq \mathrm{Mob}$ and that the corresponding covering map transforms the one-parameter group $\exp(t h_n)$ into the one-parameter subgroup r(t) of rotations of $PSL(2,\mathbb{R})$. Now let V be a strongly continuous projective unitary representation of $\mathrm{Diff}^+(S^1)$ on a separable Hilbert space. For every $n \in \mathbb{N}$, the restriction of V to $\mathrm{Mob}_n$ lifts to a strongly continuous unitary representation $U_n$ of $\widetilde{PSL(2,\mathbb{R})}$. Note that $\exp(2\pi h_n)^n = 1$ and hence $U_n(\tilde{r}(2\pi))^n = U_n(\exp(2\pi h_n))^n = \chi_n 1$ for a suitable complex number $\chi_n$ of modulus one. In particular $U_n(\tilde{r}(2\pi))$ has finite spectrum for each $n \in \mathbb{N}$. Now let $\frac{1}{n} X_n$, $\frac{1}{n} Y_n$ and $\frac{i}{n}(L_0 + c_n)$, $c_n \in \mathbb{R}$, $c_1 = 0$, be the skew-adjoint generators of the one-parameter groups of unitaries $U_n(\exp(t x_n))$, $U_n(\exp(t y_n))$, and $U_n(\exp(t h_n))$, respectively. On the dense subspace $D_n \subset H$ of $C^\infty$ vectors for the representation $U_n$ the above operators define a representation of the Lie algebra (67) and hence we have on $D_n$,
$$[iL_0, X_n] = -nY_n, \quad [iL_0, Y_n] = nX_n, \quad [X_n, Y_n] = in(L_0 + c_n), \quad n \in \mathbb{N}. \tag{69}$$
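The bracket relations (67), and their analogues for the fields in (68), can be checked numerically. The sketch below represents a vector field $f\,d/d\vartheta$ by the pair (f, f') and assumes the sign convention $[f\,d/d\vartheta, g\,d/d\vartheta] = (f'g - fg')\,d/d\vartheta$, which is opposite to the usual commutator of vector fields; this convention is an assumption chosen here because it reproduces the signs in (67). All function names are hypothetical.

```python
import math


def bracket(field_a, field_b):
    """Bracket of f d/dtheta and g d/dtheta, each given as a pair (f, f').

    Assumed convention: [f, g] = f' g - f g' (the sign that matches (67)).
    Returns the coefficient function of the resulting vector field.
    """
    f, df = field_a
    g, dg = field_b
    return lambda theta: df(theta) * g(theta) - f(theta) * dg(theta)


def mobius_fields(n):
    """Coefficient functions and derivatives of x_n, y_n, h_n from (68)."""
    x = (lambda t: -math.sin(n * t) / n, lambda t: -math.cos(n * t))
    y = (lambda t: -math.cos(n * t) / n, lambda t: math.sin(n * t))
    h = (lambda t: 1.0 / n, lambda t: 0.0)
    return x, y, h


def relations_hold(n, samples=50, tol=1e-12):
    """Check [h,x] = -y, [h,y] = x, [x,y] = h at equally spaced sample points."""
    x, y, h = mobius_fields(n)
    for s in range(samples):
        t = 2 * math.pi * s / samples
        if abs(bracket(h, x)(t) + y[0](t)) > tol:
            return False
        if abs(bracket(h, y)(t) - x[0](t)) > tol:
            return False
        if abs(bracket(x, y)(t) - h[0](t)) > tol:
            return False
    return True


if __name__ == "__main__":
    print([relations_hold(n) for n in range(1, 6)])
```

For n = 1 this is the algebra (67) itself; for n > 1 the factors 1/n in (68) are what keep the structure constants unchanged.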
If V is a positive energy representation, since the unitary operator $e^{i2\pi L_0}$ acts as multiplication by a complex number, the spectrum of $L_0$ is pure point and every eigenvalue is of the form h + n, where $h \geq 0$ is the lowest eigenvalue of $L_0$ and n is a nonnegative integer. Now let $H^{fin}$ be the linear span of the eigenvectors of $L_0$. Loke has shown in [40, Sect. I.1] that if a positive energy representation V is such that the eigenspaces of $L_0$ are all finite-dimensional, then
$$H^{fin} \subset \bigcap_{n \in \mathbb{N}} D_n. \tag{70}$$
Moreover he proved that the operators $L_0$, $L_n := iY_n - X_n$ and $L_{-n} := iY_n + X_n$, $n \in \mathbb{N}$, define a unitary representation of Vir on $H^{fin}$ and that the corresponding energy-momentum tensor
$$T(z) = \sum_{n \in \mathbb{Z}} L_n z^{-n-2} \tag{71}$$
extends to an operator valued distribution on the subspace of smooth vectors for $L_0$ such that T(f) is essentially self-adjoint on $H^{fin}$ for each $f \in \mathrm{Vect}(S^1)$ and satisfies
$$p(e^{iT(f)}) = V(\exp(f)). \tag{72}$$
The finite dimensionality of the $L_0$ eigenspaces is used in [40] to infer that for each $n \in \mathbb{N}$, the representation $U_n$ is a direct sum of positive energy representations of $\widetilde{PSL(2,\mathbb{R})}$ and that $D_n \supset H^{fin}$. However these facts hold for every positive energy representation V, as a consequence of the proposition below (applied to each representation $U_n$) and hence the results of Loke described above hold (without any essential modification in the proofs) also if the finite dimensionality of the eigenspaces of $L_0$ is not assumed. Moreover the representation of Vir on $H^{fin}$ so obtained can be seen to be irreducible (and hence unitarily equivalent to some L(c, h), cf. [30, Remark 3.5]) if and only if the corresponding projective representation V of $\mathrm{Diff}^+(S^1)$ is irreducible, cf. Lemma 2.2 in [40, Sect. I.2].

Proposition A.1. Let U be a strongly continuous unitary representation of $\widetilde{PSL(2,\mathbb{R})}$ on a separable Hilbert space H and let $L_0$ be the self-adjoint generator of the restriction of U to the lifting $\tilde{r}(t)$ of the one-parameter rotation subgroup of $PSL(2,\mathbb{R})$. Assume that the spectrum of $L_0$ is bounded from below and that the one of $U(\tilde{r}(2\pi))$ is finite. Then the following hold:
(a) U is a positive energy representation (i.e. $L_0$ has a nonnegative spectrum) and it is completely reducible to a direct sum of irreducible subrepresentations.
(b) Every eigenvector of $L_0$ is a smooth vector for the representation U.

Proof. If U is assumed to be irreducible then the positivity of the energy follows from the bound on the spectrum of $L_0$ as a consequence of the classification of the irreducible representations of $\widetilde{PSL(2,\mathbb{R})}$ [48] (cf. [40, Sect. I.1.3]) and hence the positive energy condition for U follows in general by direct integral decomposition. Then (a) follows e.g. from [36, Lemma 8]. As a consequence there is an increasing sequence $0 = n_1 < n_2 < \dots$ of nonnegative integers (which is possibly finite) and a decomposition
$$H = \bigoplus_k H_k$$
such that the restriction of U to $H_k$ is a (possibly infinite) multiple of an irreducible representation with lowest weight $h + n_k$. Hence if ψ is an eigenvector of $L_0$ corresponding to the eigenvalue λ we can write
$$\psi = \sum_{n_k \leq \lambda - h} (\psi_k, \psi)\,\psi_k,$$
where $\psi_k \in H_k$ is a normalized eigenvector of $L_0$. Since every eigenvector of the generator of rotations in an irreducible representation of $\widetilde{PSL(2,\mathbb{R})}$ is smooth (see e.g. [48, Sect. I.1]), each $\psi_k$ is smooth and hence ψ is a smooth vector for the representation U, so that also (b) is proved.
We can summarize the discussion in this appendix in the following theorem, cf. [40, Sect. I.2.4].

Theorem A.1. Let V be a strongly continuous positive energy projective unitary irreducible representation of $\mathrm{Diff}^+(S^1)$ on a (necessarily separable) Hilbert space H. Then V is unitarily equivalent to the unique projective unitary representation $V_{(c,h)}$ which integrates the Vir-module L(c, h) for some c > 0, $h \geq 0$. In particular the corresponding generator of rotations $L_0$ has finite-dimensional eigenspaces.

Acknowledgements. The author would like to thank R. Conti, S. Köster and R. Longo for discussions, explanations and comments. Theorem 3.5 has been announced at the Miniworkshop "Conformal Field Theory. An Introduction" held in Rome in March 2003. The author thanks the organizers D. Guido and (again) R. Longo for the invitation.
References
1. Bertozzini, P., Conti, R., Longo, R.: Covariant sectors with infinite dimension and positivity of the energy. Commun. Math. Phys. 193, 471–492 (1998)
2. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors I. Commun. Math. Phys. 197, 361–386 (1998)
3. Buchholz, D., Mack, G., Todorov, I.T.: The current algebra on the circle as a germ of local field theories. Nucl. Phys. B (Proc. Suppl.) 5B, 20–56 (1988)
4. Buchholz, D., Mack, G., Todorov, I.T.: Localized automorphisms of the U(1)-current. In [31]
5. Buchholz, D., Schulz-Mirbach, H.: Haag duality in conformal quantum field theory. Rev. Math. Phys. 2, 105–125 (1990)
6. Buchholz, D.: Introduction to conformal QFT in two dimensions. Unpublished manuscript, 1990
7. Carpi, S.: Classification of subsystems for the Haag-Kastler nets generated by c = 1 chiral current algebras. Lett. Math. Phys. 47, 353–364 (1999)
8. Carpi, S.: The Virasoro algebra and sectors with infinite statistical dimension. math.OA/0203027, to appear in Ann. H. Poincaré
9. Carpi, S., Conti, R.: Classification of subsystems, local symmetry generators and intrinsic definition of local observables. In: R. Longo (ed.), Mathematical physics in mathematics and physics. Fields Institute Communications, Vol. 30, Providence, RI: AMS, 2001, pp. 83–103
10. Carpi, S., Conti, R.: In preparation
11. Conti, R.: Inclusioni di algebre di von Neumann e teoria algebrica dei campi. Ph.D. Thesis, Università di Roma Tor Vergata, 1996
12. Conti, R., Doplicher, S., Roberts, J.E.: Superselection theory for subsystems. Commun. Math. Phys. 218, 263–281 (2001)
13. D'Antoni, C., Longo, R., Radulescu, F.: Conformal nets, maximal temperature and models from free probability. J. Operator Theory 45, 195–208 (2001)
14. Di Francesco, Ph., Mathieu, P., Sénéchal, D.: Conformal Field Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1996
15. Doplicher, S., Haag, R., Roberts, J.E.: Fields, observables and gauge transformations I, II. Commun. Math. Phys. 13, 1–23 (1969); Commun. Math. Phys. 15, 173–200 (1969)
16. Doplicher, S., Roberts, J.E.: Endomorphisms of C*-algebras, cross products and duality for compact groups. Ann. Math. 130, 75–119 (1989)
17. Doplicher, S., Roberts, J.E.: A new duality theory for compact groups. Invent. Math. 98, 157–218 (1989)
18. Doplicher, S., Roberts, J.E.: Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Commun. Math. Phys. 131, 51–107 (1990)
19. Fredenhagen, K., Jörß, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansions. Commun. Math. Phys. 176, 541–554 (1996)
20. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras II. Geometric aspects and conformal covariance. Rev. Math. Phys. Special Issue, 113–157 (1992)
21. Furlan, P., Sotkov, G.M., Todorov, I.T.: Two-dimensional conformal quantum field theory. Riv. Nuovo Cimento 12(6), 1–202 (1989)
22. Gabbiani, F., Fröhlich, J.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
284
S. Carpi
23. Goddard, P., Kent, A., Olive, D.: Unitary representations of the Virasoro and super-Virasoro algebra. Commun. Math. Phys. 103, 105–119 (1986) 24. Goodman, R., Wallach, N.R.: Projective unitary positive-energy representations of Diff(S 1 ). J. Funct. Anal. 63, 299–321 (1985) 25. Guido, D., Longo, R.: The conformal spin and statistic theorem. Commun. Math. Phys. 181, 11–35 (1996) 26. Guido, D., Longo, R., Wiesbrock, H.-W.: Extensions of conformal nets and superselection structures. Commun. Math. Phys. 192, 217–244 (1998) 27. Haag, R.: Local Quantum Physics. 2nd ed. Berlin-Heidelberg-New York: Springer-Verlag, 1996 28. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998) 29. Jones, V.: Index of subfactors. Invent. Math. 72, 1–25 (1983) 30. Kac, V.G., Raina, A.K.: Bombay Lectures on Highest Weight Representations of Infinite Dimensional Lie Algebras. Singapore: World Scientific, 1987 31. Kastler, D. ed.: The algebraic theory of superselection sectors. Singapore: World Scientific, 1990 32. Kawahigashi, Y.: Classification of operator algebraic conformal field theories. math.OA/0211141 33. Kawahigashi, Y., Longo, R.: Classification local conformal nets. Case c < 1. math.OA/0211141, to appear in Ann. Math. 34. Kawahigashi, Y., Longo, R.: Classification of two-dimensional local conformal nets with c < 1 and 2-cohomology vanishing for tensor categories. math-ph/0304022 35. Kawahigashi, Y., Longo, R., M¨uger, M.: Multi-interval subfactor and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001) 36. K¨oster, S.: Conformal transformations as observables. Lett. Math. Phys. 61, 187–198 (2002) 37. K¨oster, S.: Absence of stress energy tensor in CFT2 models. math-ph/0303053 38. K¨oster, S.: Local nature of cosets models. math-ph/0303054 39. 
Kosaki, H.: Extension of Jones’ theory on index to arbitrary subfactors. J. Funct. Anal. 66, 123–140 (1986) 40. Loke, T.: Operator algebras and conformal field theory of the discrete series representation of Diff + (S 1 ). PhD Thesis, University of Cambridge, 1994 41. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126 217– 247, (1989) and II. Correspondences, braid group statistics and Jones polynomial. Commun. Math. Phys. 130, 285–309 (1990) 42. Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992) 43. Longo, R.: Conformal subnets and intermediate subfactors. Commun. Math. Phys. 237, 7–30 (2003) 44. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) 45. Mack, G.: Introduction to conformal invariant quantum field theory in two and more dimensions. In: G. t’ Hooft, et al., (eds.), Non perturbative quantum field theory. New York: Plenum Press, 1988, pp.353–383 46. Milnor, J.: Remarks on infinite-dimensional Lie groups. In: B.S. De Witt and R. Stora, (eds)., Relativity, groups and topology II. Les Houches, Session XL, 1983, Amsterdam, New York: Elsevier, 1984, pp. 1007–1057 47. M¨uger, M.: On charged fields with group symmetry and degeneracies of Verlinde’s matrix S. Ann. Inst. H. Poincar´e 71, 359–394 (1999) 48. Puk´anzsky, L.: The Plancherel formula for the universal covering group of SL(2,R). Math. Annalen 156, 96–143 (1964) 49. Rehren, K.-H.: A new view of the Virasoro algebra. Lett. Math. Phys. 30, 125–130 (1994) 50. Roberts, J.E.: Lectures on algebraic quantum field theory. In [31], pp. 1–112 51. Takesaki, M.: Theory of operator algebras I. Berlin-Heidelberg-New York: Springer-Verlag, 2002 52. Toledano Laredo, V.: Fusion of positive energy representations of LSpin2n . PhD Thesis, Cambridge: University of Cambridge, 1997 53. Toledano Laredo, V.: Integrating unitary representations of infinite-dimensional Lie groups. J. Funct. Anal. 161, 478–508 (1999) 54. 
Wassermann, A.: Operator algebras and conformal field theory III: Fusion of positive energy representations of SU(N) using bounded operators. Invent. Math. 133, 467–538 (1998) 55. Xu, F.: Strong additivity and conformal nets. math.QA/0303266 Communicated by Y. Kawahigashi
Commun. Math. Phys. 244, 285–296 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0965-7
Extremal Projectors of q-Boson Algebras

Toshiki Nakashima

Department of Mathematics, Sophia University, Tokyo 102-8554, Japan. E-mail: [email protected]

Received: 9 July 2002 / Accepted: 26 July 2003
Published online: 13 November 2003 – © Springer-Verlag 2003
Abstract: We define the extremal projector of the q-boson Kashiwara algebra $B_q(\mathfrak{g})$ and study its basic properties. Applying these properties to the representation theory of the category $\mathcal{O}(B_q(\mathfrak{g}))$, whose objects are "upper bounded" $B_q(\mathfrak{g})$-modules, we obtain its semi-simplicity and the classification of simple modules.

1. Introduction

In [3], we studied the so-called q-boson Kashiwara algebra, in particular a kind of q-vertex operators and their 2-point functions. We found therein an interesting object $\Gamma$, but at that time we did not fully reveal its properties as an "extremal projector". Tolstoy et al. introduced the notion of "extremal projectors" for Lie (super)algebras and quantum (super)algebras, studied their properties extensively and applied them to representation theory (see [2, 7, 8] and references therein). In the present paper, we shall re-define the extremal projector for the q-boson algebras, clarify its properties and apply it to the representation theory of q-boson algebras.

To be more precise, let $\{e''_i, f_i, q^h \mid i \in I,\ h \in P^*\}$ be the generators of the q-boson algebra $B_q(\mathfrak{g})$. The extremal projector $\Gamma$ is an element of $\widehat{B}_q(\mathfrak{g})$ (a completion of $B_q(\mathfrak{g})$) which satisfies

$$e''_i\,\Gamma = \Gamma f_i = 0, \qquad \Gamma^2 = \Gamma, \qquad \sum_k a_k \Gamma b_k = 1,$$

for some $a_k \in B_q^-(\mathfrak{g})$ and $b_k \in B_q^+(\mathfrak{g})$ (see Theorem 5.2). Let $\mathcal{O}(B)$ be the category of "upper bounded" $B_q(\mathfrak{g})$-modules (see Sect. 3). By using the above properties of $\Gamma$, we shall show that the category $\mathcal{O}(B)$ is semi-simple and classify its simple modules. In [1], Kashiwara gave the projector $P$ for the q-boson algebra in the $\mathfrak{sl}_2$-case in order to define the crystal base of $U_q^-(\mathfrak{g})$. He used it to show the semi-simplicity of $\mathcal{O}(B_q(\mathfrak{sl}_2))$. So our $\Gamma$ is a generalization of his projector $P$ to arbitrary Kac-Moody algebras.
The organization of this article is as follows. In Sect. 2, we review the definitions of quantum algebras and q-boson Kashiwara algebras and their properties. In Sect. 3, we introduce the category $\mathcal{O}(B)$ of modules of the q-boson algebras, which we treat in the sequel. In Sect. 4, we review the so-called Drinfeld-Killing form and, using it, we define an element $C$ in the tensor product of q-boson algebras, which plays a significant role in the study of extremal projectors. In Sect. 5, we define extremal projectors for the q-boson algebras and investigate their important properties. In the last section, we apply them to show the semi-simplicity of the category $\mathcal{O}(B)$ and to classify the simple modules in $\mathcal{O}(B)$. In [3] we gave a proof of its semi-simplicity, but there was a substantial gap. Thus, the last section is devoted to closing that gap. An elementary proof of the semi-simplicity of the category $\mathcal{O}(B)$ can be found in, e.g., [9].

2. Quantum Algebras and q-Boson Kashiwara Algebras

We shall define the algebras playing a significant role in this paper. First, let $\mathfrak{g}$ be a symmetrizable Kac-Moody algebra over $\mathbb{Q}$ with a Cartan subalgebra $\mathfrak{t}$, $\{\alpha_i \in \mathfrak{t}^*\}_{i\in I}$ the set of simple roots and $\{h_i \in \mathfrak{t}\}_{i\in I}$ the set of coroots, where $I$ is a finite index set. We define an inner product on $\mathfrak{t}^*$ such that $(\alpha_i,\alpha_i) \in \mathbb{Z}_{\ge 0}$ and $\langle h_i,\lambda\rangle = 2(\alpha_i,\lambda)/(\alpha_i,\alpha_i)$ for $\lambda \in \mathfrak{t}^*$. Set $Q = \bigoplus_i \mathbb{Z}\alpha_i$, $Q_+ = \bigoplus_i \mathbb{Z}_{\ge 0}\alpha_i$ and $Q_- = -Q_+$. We call $Q$ a root lattice. Let $P$ be a lattice of $\mathfrak{t}^*$, i.e. a free $\mathbb{Z}$-submodule of $\mathfrak{t}^*$ such that $\mathfrak{t}^* \cong \mathbb{Q}\otimes_{\mathbb{Z}} P$, and $P^* = \{h \in \mathfrak{t} \mid \langle h, P\rangle \subset \mathbb{Z}\}$. Now, we introduce the symbols $\{e_i, e''_i, f_i, f'_i\ (i \in I),\ q^h\ (h \in P^*)\}$. These symbols satisfy the following relations:

$$q^0 = 1, \quad\text{and}\quad q^h q^{h'} = q^{h+h'}, \tag{2.1}$$
$$q^h e_i q^{-h} = q^{\langle h,\alpha_i\rangle} e_i, \tag{2.2}$$
$$q^h e''_i q^{-h} = q^{\langle h,\alpha_i\rangle} e''_i, \tag{2.3}$$
$$q^h f_i q^{-h} = q^{-\langle h,\alpha_i\rangle} f_i, \tag{2.4}$$
$$q^h f'_i q^{-h} = q^{-\langle h,\alpha_i\rangle} f'_i, \tag{2.5}$$
$$[e_i, f_j] = \delta_{i,j}\,(t_i - t_i^{-1})/(q_i - q_i^{-1}), \tag{2.6}$$
$$e''_i f_j = q_i^{\langle h_i,\alpha_j\rangle} f_j e''_i + \delta_{i,j}, \tag{2.7}$$
$$f'_i e_j = q_i^{\langle h_i,\alpha_j\rangle} e_j f'_i + \delta_{i,j}, \tag{2.8}$$
$$\sum_{k=0}^{1-\langle h_i,\alpha_j\rangle} (-1)^k X_i^{(k)} X_j X_i^{(1-\langle h_i,\alpha_j\rangle-k)} = 0 \quad (i \ne j), \qquad\text{for } X_i = e_i,\ e''_i,\ f_i,\ f'_i, \tag{2.9}$$

where $q$ is transcendental over $\mathbb{Q}$ and we set $q_i = q^{(\alpha_i,\alpha_i)/2}$, $t_i = q_i^{h_i}$, $[n]_i = (q_i^n - q_i^{-n})/(q_i - q_i^{-1})$, $[n]_i! = \prod_{k=1}^n [k]_i$ and $X_i^{(n)} = X_i^n/[n]_i!$.

Now, we define the algebras $B_q(\mathfrak{g})$, $\overline{B}_q(\mathfrak{g})$ and $U_q(\mathfrak{g})$. The algebra $B_q(\mathfrak{g})$ (resp. $\overline{B}_q(\mathfrak{g})$) is an associative algebra over $\mathbb{Q}(q)$ generated by the symbols $\{e''_i, f_i\}_{i\in I}$ (resp. $\{e_i, f'_i\}_{i\in I}$) and $q^h$ ($h \in P^*$) with the defining relations (2.1), (2.3), (2.4), (2.7) and (2.9) (resp. (2.1), (2.2), (2.5), (2.8) and (2.9)). The algebra $U_q(\mathfrak{g})$ is the usual quantum algebra over $\mathbb{Q}(q)$ generated by the symbols $\{e_i, f_i\}_{i\in I}$ and $q^h$ ($h \in P^*$) with the defining relations (2.1), (2.2), (2.4), (2.6) and (2.9). We shall call the algebras $B_q(\mathfrak{g})$ and $\overline{B}_q(\mathfrak{g})$ the q-boson Kashiwara algebras ([1]). Furthermore, we define their subalgebras
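$$T = \langle q^h \mid h \in P^*\rangle = B_q(\mathfrak{g}) \cap \overline{B}_q(\mathfrak{g}) \cap U_q(\mathfrak{g}),$$
$$B_q^\vee(\mathfrak{g})\ (\text{resp. } \overline{B}_q^\vee(\mathfrak{g})) = \langle e''_i, f_i\ (\text{resp. } e_i, f'_i) \mid i \in I\rangle \subset B_q(\mathfrak{g})\ (\text{resp. } \overline{B}_q(\mathfrak{g})),$$
$$U_q^+(\mathfrak{g})\ (\text{resp. } U_q^-(\mathfrak{g})) = \langle e_i\ (\text{resp. } f_i) \mid i \in I\rangle =: \overline{B}_q^+(\mathfrak{g})\ (\text{resp. } B_q^-(\mathfrak{g})),$$
$$U_q^{\ge}(\mathfrak{g})\ (\text{resp. } U_q^{\le}(\mathfrak{g})) = \langle e_i\ (\text{resp. } f_i),\ q^h \mid i \in I,\ h \in P^*\rangle,$$
$$B_q^+(\mathfrak{g})\ (\text{resp. } \overline{B}_q^-(\mathfrak{g})) = \langle e''_i\ (\text{resp. } f'_i) \mid i \in I\rangle \subset B_q^\vee(\mathfrak{g})\ (\text{resp. } \overline{B}_q^\vee(\mathfrak{g})),$$
$$B_q^{\ge}(\mathfrak{g})\ (\text{resp. } \overline{B}_q^{\le}(\mathfrak{g})) = \langle e''_i\ (\text{resp. } f'_i),\ q^h \mid i \in I,\ h \in P^*\rangle \subset B_q(\mathfrak{g})\ (\text{resp. } \overline{B}_q(\mathfrak{g})).$$

We shall use the abbreviated notations $U$, $B$, $\overline{B}$, $B^\vee$, ... for $U_q(\mathfrak{g})$, $B_q(\mathfrak{g})$, $\overline{B}_q(\mathfrak{g})$, $B_q^\vee(\mathfrak{g})$, ... if there is no confusion. For $\beta = \sum_i m_i\alpha_i \in Q_+$ we set $|\beta| = \sum_i m_i$ and

$$U^\pm_{\pm\beta} = \{u \in U^\pm \mid q^h u q^{-h} = q^{\pm\langle h,\beta\rangle} u\ (h \in P^*)\},$$

and call $|\beta|$ the height of $\beta$ and $U^+_\beta$ (resp. $U^-_{-\beta}$) the weight space of $U^+$ (resp. $U^-$) with weight $\beta$ (resp. $-\beta$). We also define $B^+_\beta$ and $\overline{B}^-_{-\beta}$ in a similar manner.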
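The q-integers $[n]_i$, the q-factorials $[n]_i!$ and the q-binomial coefficients built from them recur throughout the paper. The following Python sketch evaluates them in exact rational arithmetic. It is only an illustration under assumptions that are not part of the text: $q_i$ is specialised to a rational number different from $0$ and $\pm 1$ (in the paper $q$ is transcendental), and the function names `q_int`, `q_factorial` and `q_binomial` are hypothetical.

```python
from fractions import Fraction


def q_int(n, qi):
    """Symmetric q-integer [n]_i = (qi^n - qi^-n) / (qi - qi^-1).

    qi is a numeric stand-in for q_i = q^{(alpha_i, alpha_i)/2};
    it must not be 0, 1 or -1.
    """
    qi = Fraction(qi)
    return (qi**n - qi**(-n)) / (qi - 1 / qi)


def q_factorial(n, qi):
    """q-factorial [n]_i! = [1]_i [2]_i ... [n]_i, with [0]_i! = 1."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= q_int(k, qi)
    return result


def q_binomial(n, k, qi):
    """q-binomial coefficient [n]_i! / ([k]_i! [n-k]_i!) for 0 <= k <= n."""
    return q_factorial(n, qi) / (q_factorial(k, qi) * q_factorial(n - k, qi))


if __name__ == "__main__":
    qi = Fraction(2)
    # [3] = q^2 + 1 + q^-2, so at q = 2 it equals 4 + 1 + 1/4.
    print(q_int(3, qi), q_factorial(3, qi), q_binomial(4, 2, qi))
```

With these helpers the divided power $X_i^{(n)} = X_i^n/[n]_i!$ is just the $n$-th power scaled by `1 / q_factorial(n, qi)`.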
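Proposition 2.1 ([3]). (i) We have the following algebra homomorphisms: $\Delta : U \to U \otimes U$, $\Delta^{(r)} : B \to B \otimes U$, $\Delta^{(l)} : \overline{B} \to U \otimes \overline{B}$ and $\Delta^{(b)} : U \to \overline{B} \otimes B$ given by

$$\Delta(q^h) = \Delta^{(r)}(q^h) = \Delta^{(l)}(q^h) = \Delta^{(b)}(q^h) = q^h \otimes q^h, \tag{2.10}$$
$$\Delta(e_i) = e_i \otimes 1 + t_i \otimes e_i, \qquad \Delta(f_i) = f_i \otimes t_i^{-1} + 1 \otimes f_i, \tag{2.11}$$
$$\Delta^{(r)}(e''_i) = (q_i - q_i^{-1}) \cdot 1 \otimes t_i^{-1} e_i + e''_i \otimes t_i^{-1}, \qquad \Delta^{(r)}(f_i) = f_i \otimes t_i^{-1} + 1 \otimes f_i, \tag{2.12}$$
$$\Delta^{(l)}(e_i) = e_i \otimes 1 + t_i \otimes e_i, \qquad \Delta^{(l)}(f'_i) = (q_i - q_i^{-1})\, t_i f_i \otimes 1 + t_i \otimes f'_i, \tag{2.13}$$
$$\Delta^{(b)}(e_i) = t_i \otimes \frac{t_i e''_i}{q_i - q_i^{-1}} + e_i \otimes 1, \qquad \Delta^{(b)}(f_i) = 1 \otimes f_i + \frac{t_i^{-1} f'_i}{q_i - q_i^{-1}} \otimes t_i^{-1}, \tag{2.14}$$

and extending these to the whole algebras by the rule $\Delta(xy) = \Delta(x)\Delta(y)$ and $\Delta^{(i)}(xy) = \Delta^{(i)}(x)\Delta^{(i)}(y)$ ($i = r, l, b$).

(ii) We have the following anti-isomorphisms $S : U \to U$ and $\varphi : \overline{B} \to B$ given by

$$S(e_i) = -t_i^{-1} e_i, \qquad S(f_i) = -f_i t_i, \qquad S(q^h) = q^{-h},$$
$$\varphi(e_i) = -\frac{1}{q_i - q_i^{-1}}\, e''_i, \qquad \varphi(f'_i) = -(q_i - q_i^{-1}) f_i, \qquad \varphi(q^h) = q^{-h},$$

and extending these to the whole algebras by the rule $S(xy) = S(y)S(x)$ and $\varphi(xy) = \varphi(y)\varphi(x)$. Here $S$ is called the antipode of $U$. We also denote $\varphi|_{U^{\ge}} = \varphi|_{\overline{B}^{\ge}}$ by $\varphi$.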
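We obtain the following triangular decomposition of the q-boson Kashiwara algebra:

Proposition 2.2. The multiplication map defines an isomorphism of vector spaces:

$$B_q^-(\mathfrak{g}) \otimes T \otimes B_q^+(\mathfrak{g}) \xrightarrow{\ \sim\ } B_q(\mathfrak{g}), \qquad u_1 \otimes u_2 \otimes u_3 \mapsto u_1 u_2 u_3.$$

Proof. By [1, (3.1.2)], rewritten for the generators $e''_i$ by means of (2.7), we have

$$e''^{\,n}_i f_j^{(m)} = \begin{cases} \displaystyle\sum_{k=0}^{\min(n,m)} q_i^{2nm-(n+m)k+k(k-1)/2} \begin{bmatrix} n \\ k \end{bmatrix}_i f_i^{(m-k)}\, e''^{\,n-k}_i, & \text{if } i = j, \\[2ex] q_i^{nm\langle h_i,\alpha_j\rangle} f_j^{(m)}\, e''^{\,n}_i, & \text{otherwise.} \end{cases}$$

For $n = m = 1$ this reduces to (2.7). By this formula and the standard argument, we can show the proposition.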
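We define weight completions of $L^{(1)} \otimes \cdots \otimes L^{(m)}$, where $L^{(i)} = B$ or $U$ (see [6]):

$$\widehat{L}^{(1)} \widehat{\otimes} \cdots \widehat{\otimes} \widehat{L}^{(m)} = \varprojlim_{l}\ L^{(1)} \otimes \cdots \otimes L^{(m)} \big/ (L^{(1)} \otimes \cdots \otimes L^{(m)})\, L^{+,l},$$

where $L^{+,l} = \bigoplus_{|\beta_1|+\cdots+|\beta_m| \ge l} L^{(1)+}_{\beta_1} \otimes \cdots \otimes L^{(m)+}_{\beta_m}$. (Note that $U \cong U^- \otimes T \otimes U^+$ and $B \cong B^- \otimes T \otimes B^+$.) The linear maps $\Delta$, $\Delta^{(r)}$, $S$, $\varphi$, multiplication, etc. extend naturally to such completions.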
3. Category $\mathcal{O}(B)$

Let $\mathcal{O}(B)$ be the category of left $B$-modules such that
(i) Any object $M$ has a weight space decomposition $M = \bigoplus_{\lambda \in P} M_\lambda$, where $M_\lambda = \{u \in M \mid q^h u = q^{\langle h,\lambda\rangle} u \text{ for any } h \in P^*\}$.
(ii) For any element $u \in M$ there exists $l > 0$ such that $e''_{i_1} e''_{i_2} \cdots e''_{i_l} u = 0$ for any $i_1, i_2, \dots, i_l \in I$.

The similar category $\mathcal{O}(B^\vee)$ for $B_q^\vee(\mathfrak{g})$ is introduced in [1], which is defined with the above condition (ii). In [1], Kashiwara mentions that the category $\mathcal{O}(B^\vee)$ is semi-simple, though he does not give an exact proof. We give a proof of the semi-simplicity of $\mathcal{O}(B)$ in Sect. 6. For $\lambda \in P$ we define the $B$-module $H(\lambda)$ by $H(\lambda) := B/I_\lambda$, where the left ideal $I_\lambda$ is defined as

$$I_\lambda := \sum_i B e''_i + \sum_{h \in P^*} B(q^h - q^{\langle h,\lambda\rangle}).$$

In Sect. 6, we shall also show that $\{H(\lambda) \mid \lambda \in P\}$ is a set of representatives of isomorphism classes of simple modules.
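4. Bilinear Forms and Elements C

Proposition 4.1 ([4–6]).
(i) There exists a unique bilinear form

$$\langle\ ,\ \rangle : U^{\ge} \times U^{\le} \to \mathbb{Q}(q)$$

satisfying the following:

$$\langle x, y_1 y_2\rangle = \langle \Delta(x), y_1 \otimes y_2\rangle \qquad (x \in U^{\ge},\ y_1, y_2 \in U^{\le}),$$
$$\langle x_1 x_2, y\rangle = \langle x_2 \otimes x_1, \Delta(y)\rangle \qquad (x_1, x_2 \in U^{\ge},\ y \in U^{\le}),$$
$$\langle q^h, q^{h'}\rangle = q^{-(h|h')} \qquad (h, h' \in P^*),$$
$$\langle T, f_i\rangle = \langle e_i, T\rangle = 0, \qquad \langle e_i, f_j\rangle = \delta_{ij}/(q_i^{-1} - q_i),$$

where $(\ |\ )$ is an invariant bilinear form on $\mathfrak{t}$.
(ii) The bilinear form $\langle\ ,\ \rangle$ enjoys the following properties:

$$\langle x q^h, y q^{h'}\rangle = q^{-(h|h')} \langle x, y\rangle \qquad \text{for } x \in U^{\ge},\ y \in U^{\le},\ h, h' \in P^*. \tag{4.1}$$
$$\text{For any } \beta \in Q_+,\ \langle\ ,\ \rangle|_{U^+_\beta \times U^-_{-\beta}} \text{ is non-degenerate and } \langle U^+_\gamma, U^-_{-\delta}\rangle = 0 \text{ if } \gamma \ne \delta. \tag{4.2}$$

We call this bilinear form the Drinfeld-Killing form of $U$. For $\beta = \sum_i m_i\alpha_i \in Q_+$ ($m_i \ge 0$), set $k_\beta := \prod_i t_i^{m_i}$, and let $\{x_r^\beta\}_r$ be a basis of $U^+_\beta$ and $\{y_r^{-\beta}\}_r$ be the dual basis of $U^-_{-\beta}$ with respect to the Drinfeld-Killing form. We denote the canonical element in $U^+_\beta \otimes U^-_{-\beta}$ with respect to the Drinfeld-Killing form by

$$C_\beta := \sum_r x_r^\beta \otimes y_r^{-\beta}.$$

We set

$$C := \sum_{\beta \in Q_+} (1 \otimes k_\beta^{-1})(1 \otimes S^{-1})(C_\beta) \in U^+ \widehat{\otimes}\, U^- = U^+ \widehat{\otimes}\, B^-. \tag{4.3}$$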
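The element $C$ satisfies the following relations:

Proposition 4.2.
(i) For any $i \in I$, we have

$$(t_i^{-1} \otimes e''_i)\, C = C\, \bigl(t_i^{-1} \otimes e''_i + (q_i - q_i^{-1})\, t_i^{-1} e_i \otimes 1\bigr), \tag{4.4}$$
$$(f_i \otimes t_i^{-1} + 1 \otimes f_i)\,(\varphi \otimes 1)(C) = (\varphi \otimes 1)(C)\,(f_i \otimes t_i^{-1}). \tag{4.5}$$

Here note that (4.4) is an equation in $U_q(\mathfrak{g}) \widehat{\otimes} B_q(\mathfrak{g})$ and (4.5) is an equation in $B_q(\mathfrak{g}) \widehat{\otimes} B_q(\mathfrak{g})$.
(ii) The element $C$ is invertible and the inverse is given by

$$C^{-1} = \sum_{\beta \in Q_+} q^{-(\beta,\beta)} (k_\beta \otimes k_\beta^{-1})(S^{-1} \otimes S^{-1})(C_\beta). \tag{4.6}$$

Proof. The proof of (4.5) has been given in [3, 6.2]. Thus, let us show (4.4). For that purpose, we need the following lemma: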
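Lemma 4.3. For $\beta \in Q_+$, let $C_\beta = \sum_r x_r^\beta \otimes y_r^{-\beta}$ be the canonical element in $U^+_\beta \otimes U^-_{-\beta}$ as above and set $C'_\beta := (1 \otimes S^{-1})(C_\beta)$. Then for any $\beta \in Q_+$ and $i \in I$, we have

$$[\,t_i^{-1} \otimes e''_i,\ (1 \otimes k_{\beta+\alpha_i}^{-1})(C'_{\beta+\alpha_i})\,] = (1 \otimes k_\beta^{-1})(C'_\beta)\,\bigl(t_i^{-1} e_i \otimes (q_i - q_i^{-1}) \cdot 1\bigr) \in U_q(\mathfrak{g}) \otimes B_q(\mathfrak{g}), \tag{4.7}$$

where we use the identification $B_q^-(\mathfrak{g}) = U_q^-(\mathfrak{g})$.

Proof. Applying $\langle \cdot, z\rangle \otimes 1$ on both sides of (4.7), where $z \in U^-_{-\beta-\alpha_i}$, we obtain

$$(\langle\cdot,z\rangle \otimes 1)(\text{L.H.S. of (4.7)}) = \sum_r \Bigl( \langle t_i^{-1} x_r^{\beta+\alpha_i}, z\rangle \otimes e''_i\, k_{\beta+\alpha_i}^{-1} S^{-1}(y_r^{-\beta-\alpha_i}) - \langle x_r^{\beta+\alpha_i} t_i^{-1}, z\rangle \otimes k_{\beta+\alpha_i}^{-1} S^{-1}(y_r^{-\beta-\alpha_i})\, e''_i \Bigr)$$
$$= q^{-(\alpha_i,\beta+\alpha_i)}\, e''_i\, k_{\beta+\alpha_i}^{-1} S^{-1}(z) - k_{\beta+\alpha_i}^{-1} S^{-1}(z)\, e''_i$$
$$= k_{\beta+\alpha_i}^{-1} \bigl(e''_i S^{-1}(z) - S^{-1}(z)\, e''_i\bigr),$$

$$(\langle\cdot,z\rangle \otimes 1)(\text{R.H.S. of (4.7)}) = \sum_r \langle x_r^\beta t_i^{-1} e_i, z\rangle \otimes (q_i - q_i^{-1})\, k_\beta^{-1} S^{-1}(y_r^{-\beta}). \tag{4.8}$$

For $z \in U^-_{-\beta-\alpha_i}$ we can define $v \in U^-_{-\beta}$ uniquely by

$$\Delta(z) = 1 \otimes z + f_i \otimes v\, t_i^{-1} + \cdots.$$

By the properties of the Drinfeld-Killing form, we have

$$\langle x_r^\beta t_i^{-1} e_i, z\rangle = \langle e_i \otimes x_r^\beta t_i^{-1}, \Delta(z)\rangle = \langle e_i \otimes x_r^\beta t_i^{-1},\ 1 \otimes z + f_i \otimes v\, t_i^{-1} + \cdots\rangle = \langle e_i, f_i\rangle \langle x_r^\beta t_i^{-1}, v\, t_i^{-1}\rangle = \frac{q_i^{-2}}{q_i^{-1} - q_i}\, \langle x_r^\beta, v\rangle.$$

Thus,

$$\text{R.H.S. of (4.8)} = -q_i^{-2}\, k_\beta^{-1} S^{-1}(v). \tag{4.9}$$

Here, in order to complete the proof of Lemma 4.3, let us show:

$$e''_i S^{-1}(z) - S^{-1}(z)\, e''_i = -q_i^{-2}\, t_i\, S^{-1}(v). \tag{4.10}$$

Without loss of generality, we may assume that $z$ is of the form $z = f_{i_1} f_{i_2} \cdots f_{i_k} \in U^-_{-\beta-\alpha_i}$ ($\beta + \alpha_i = \alpha_{i_1} + \cdots + \alpha_{i_k}$). For $\beta = \sum_j m_j \alpha_j$, we shall show this by induction on $m_i$ for fixed $i \in I$. If $m_i = 0$, $z$ is of the form $z = z' f_i z''$, where $z'$ and $z''$ are monomials in the $f_j$'s not including $f_i$. By $S^{-1}(f_j) = -t_j f_j$ and $e''_i (t_j f_j) = (t_j f_j)\, e''_i$ ($i \ne j$) we have

$$e''_i S^{-1}(z') = S^{-1}(z')\, e''_i, \qquad e''_i S^{-1}(z'') = S^{-1}(z'')\, e''_i. \tag{4.11}$$

Hence, we obtain

$$e''_i S^{-1}(z) = S^{-1}(z'')\,(-e''_i t_i f_i)\, S^{-1}(z') = S^{-1}(z'')\,(-t_i f_i e''_i - q_i^{-2} t_i)\, S^{-1}(z')$$
$$= S^{-1}(z'')\,(-t_i f_i)\, S^{-1}(z')\, e''_i - q_i^{-2}\, S^{-1}(z'')\, t_i\, S^{-1}(z')$$
$$= S^{-1}(z'')\, S^{-1}(f_i)\, S^{-1}(z')\, e''_i - q^{(\beta''-\alpha_i,\alpha_i)}\, t_i\, S^{-1}(z' z''),$$

where $\beta'' \in Q_+$ is determined by $z'' \in U^-_{-\beta''}$. Therefore, for $m_i = 0$, we have

$$\text{L.H.S. of (4.10)} = -q^{(\beta''-\alpha_i,\alpha_i)}\, t_i\, S^{-1}(z' z'').$$

In the case $m_i = 0$ we can easily obtain $v = q^{(\beta'',\alpha_i)} z' z''$ and then

$$\text{R.H.S. of (4.10)} = -q^{(\beta''-\alpha_i,\alpha_i)}\, t_i\, S^{-1}(z' z'') = \text{L.H.S. of (4.10)}.$$

Thus, the case $m_i = 0$ has been shown. Suppose that $m_i > 0$. We divide $z = z' z''$ such that $m'_i < m_i$ and $m''_i < m_i$, where $m'_i$ (resp. $m''_i$) is the number of $f_i$ included in $z'$ (resp. $z''$). Writing

$$\Delta(z') = 1 \otimes z' + f_i \otimes v' t_i^{-1} + \cdots, \qquad \Delta(z'') = 1 \otimes z'' + f_i \otimes v'' t_i^{-1} + \cdots,$$

and calculating $\Delta(z' z'')$ directly, we obtain

$$v = z' v'' + q^{(\beta'',\alpha_i)}\, v' z'', \tag{4.12}$$

where again $z'' \in U^-_{-\beta''}$. By the hypothesis of the induction,

$$e''_i S^{-1}(z) = e''_i S^{-1}(z'')\, S^{-1}(z') = \bigl(S^{-1}(z'')\, e''_i - q_i^{-2} t_i S^{-1}(v'')\bigr)\, S^{-1}(z')$$
$$= S^{-1}(z'')\, e''_i\, S^{-1}(z') - q_i^{-2} t_i\, S^{-1}(z' v'')$$
$$= S^{-1}(z'')\bigl(S^{-1}(z')\, e''_i - q_i^{-2} t_i S^{-1}(v')\bigr) - q_i^{-2} t_i\, S^{-1}(z' v'')$$
$$= S^{-1}(z' z'')\, e''_i - q_i^{-2} t_i \bigl(S^{-1}(z' v'') + q^{(\beta'',\alpha_i)} S^{-1}(v' z'')\bigr)$$
$$= S^{-1}(z)\, e''_i - q_i^{-2} t_i\, S^{-1}(v).$$

Note that in the last equality we use (4.12). Now, we have completed the proof of Lemma 4.3.

Proof of Proposition 4.2. If $\beta \in Q_+$ does not include $\alpha_i$, since $e''_i$ and $S^{-1}(z)$ ($z \in U^-_{-\beta}$) commute with each other by (4.11), we have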
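$$(t_i^{-1} \otimes e''_i)(1 \otimes k_\beta^{-1})(C'_\beta) = (1 \otimes k_\beta^{-1})(C'_\beta)(t_i^{-1} \otimes e''_i).$$

Thus, we have

$$(t_i^{-1} \otimes e''_i)\, C - C\, (t_i^{-1} \otimes e''_i) = \sum_{\gamma \in Q_+} \Bigl[ (t_i^{-1} \otimes e''_i)(1 \otimes k_\gamma^{-1})(C'_\gamma) - (1 \otimes k_\gamma^{-1})(C'_\gamma)(t_i^{-1} \otimes e''_i) \Bigr]$$
$$= \sum_{\beta \in Q_+} \Bigl[ (t_i^{-1} \otimes e''_i)(1 \otimes k_{\beta+\alpha_i}^{-1})(C'_{\beta+\alpha_i}) - (1 \otimes k_{\beta+\alpha_i}^{-1})(C'_{\beta+\alpha_i})(t_i^{-1} \otimes e''_i) \Bigr]$$
$$= \sum_{\beta \in Q_+} [\,t_i^{-1} \otimes e''_i,\ (1 \otimes k_{\beta+\alpha_i}^{-1})(C'_{\beta+\alpha_i})\,]$$
$$= \sum_{\beta \in Q_+} (1 \otimes k_\beta^{-1})(C'_\beta)\bigl((q_i - q_i^{-1})\, t_i^{-1} e_i \otimes 1\bigr) \qquad \text{(by Lemma 4.3)}$$
$$= C\, \bigl((q_i - q_i^{-1})\, t_i^{-1} e_i \otimes 1\bigr).$$

Then we obtain (4.4).

Next, let us show (ii). Set $\widetilde{C} := \sum_\beta q^{(\beta,\beta)} (1 \otimes k_\beta)(S \otimes 1)(C_\beta)$. By [6, Sect. 4], we have $\widetilde{C}^{-1} = \sum_\beta q^{(\beta,\beta)} (k_\beta^{-1} \otimes k_\beta)(C_\beta)$. Here note that

$$(S^{-1} \otimes S^{-1})(\widetilde{C}) = \sum_\beta q^{(\beta,\beta)} (1 \otimes S^{-1})\{(1 \otimes k_\beta)(C_\beta)\} = \sum_\beta q^{(\beta,\beta)} \{(1 \otimes S^{-1})(C_\beta)\}(1 \otimes k_\beta^{-1}) = \sum_\beta (1 \otimes k_\beta^{-1})(1 \otimes S^{-1})(C_\beta) = C.$$

Thus, we obtain

$$C^{-1} = (S^{-1} \otimes S^{-1})(\widetilde{C}^{-1}) = \sum_\beta q^{(\beta,\beta)} \{(S^{-1} \otimes S^{-1})(C_\beta)\}(k_\beta \otimes k_\beta^{-1}) = \sum_\beta q^{-(\beta,\beta)} (k_\beta \otimes k_\beta^{-1})(S^{-1} \otimes S^{-1})(C_\beta),$$

and complete the proof of Proposition 4.2.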
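Remark. By the explicit form of $C^{-1}$ in (4.6), we find that $C^{-1} \in U_q^+(\mathfrak{g}) \widehat{\otimes}\, U_q^-(\mathfrak{g}) = U_q^+(\mathfrak{g}) \widehat{\otimes}\, B_q^-(\mathfrak{g})$.

5. Extremal Projectors

Let $C$ be as in Sect. 4. We define the extremal projector of $B_q(\mathfrak{g})$ by

$$\Gamma := m \circ \sigma \circ (\varphi \otimes 1)(C) = \sum_{\beta \in Q_+,\ r} k_\beta^{-1}\, S^{-1}(y_r^{-\beta})\, \varphi(x_r^\beta), \tag{5.1}$$

where $m : a \otimes b \mapsto ab$ is the multiplication and $\sigma : a \otimes b \mapsto b \otimes a$ is the permutation. Here note that $\Gamma$ is a well-defined element of $\widehat{B}_q(\mathfrak{g})$.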
Example 5.1 ([1, 3]). In the $\mathfrak{sl}_2$-case, the following is the explicit form of $\Gamma$:

$$\Gamma = \sum_{n \ge 0} q^{\frac{1}{2}n(n-1)} (-1)^n f^{(n)} e''^{\,n}.$$
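The $\mathfrak{sl}_2$ formula $\Gamma = \sum_{n\ge 0} q^{n(n-1)/2}(-1)^n f^{(n)} e''^{\,n}$ can be checked numerically on the module spanned by the vectors $f^m \cdot 1$, on which the lowering generator (written $e''$ here) kills $1$. The Python sketch below is a sanity check under stated assumptions rather than part of the paper: it assumes the $\mathfrak{sl}_2$ relation $e'' f = q^2 f e'' + 1$ from (2.7), so that $e'' f^m \cdot 1 = (1 + q^2 + \cdots + q^{2m-2})\, f^{m-1} \cdot 1$, it replaces the transcendental $q$ by a rational number, and the names `q_int`, `q_factorial` and `gamma_on_fock` are hypothetical.

```python
from fractions import Fraction


def q_int(n, q):
    """Symmetric q-integer [n] = (q^n - q^-n) / (q - q^-1)."""
    return (q**n - q**(-n)) / (q - 1 / q)


def q_factorial(n, q):
    """q-factorial [n]! = [1][2]...[n], with [0]! = 1."""
    out = Fraction(1)
    for k in range(1, n + 1):
        out *= q_int(k, q)
    return out


def gamma_on_fock(m, q):
    """Scalar c with Gamma (f^m . 1) = c (f^m . 1) in the sl_2 case.

    Uses Gamma = sum_n q^{n(n-1)/2} (-1)^n f^{(n)} e''^n and the
    assumed action e'' f^k . 1 = (1 + q^2 + ... + q^{2k-2}) f^{k-1} . 1.
    """
    q = Fraction(q)
    total = Fraction(0)
    for n in range(m + 1):  # terms with n > m vanish since e''^n f^m . 1 = 0
        # e''^n f^m . 1 = c_m c_{m-1} ... c_{m-n+1} f^{m-n} . 1
        lowering = Fraction(1)
        for k in range(m - n + 1, m + 1):
            lowering *= sum(q**(2 * j) for j in range(k))
        # f^{(n)} = f^n / [n]! raises f^{m-n} . 1 back to f^m . 1
        total += (-1)**n * q**(n * (n - 1) // 2) * lowering / q_factorial(n, q)
    return total


if __name__ == "__main__":
    q = Fraction(3, 2)
    for m in range(6):
        print(m, gamma_on_fock(m, q))
```

Under these assumptions the scalar is $1$ for $m = 0$ and $0$ for every $m \ge 1$, so $\Gamma$ acts as the projection onto the vector annihilated by $e''$ along the image of $f$, in line with $e''\Gamma = \Gamma f = 0$ and $\Gamma^2 = \Gamma$.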
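Theorem 5.2. The extremal projector $\Gamma$ enjoys the following properties:
(i) $e''_i\, \Gamma = 0$, $\Gamma f_i = 0$ ($\forall i \in I$).
(ii) $\Gamma^2 = \Gamma$.
(iii) There exist $a_k \in B_q^-(\mathfrak{g})\,(= U_q^-(\mathfrak{g}))$, $b_k \in B_q^+(\mathfrak{g})$ such that

$$\sum_k a_k \Gamma b_k = 1.$$

(iv) $\Gamma$ is a well-defined element of $\widehat{B}_q^\vee(\mathfrak{g})$.

Proof. It is easy to see (iv) by the explicit forms of the antipode $S$, the anti-isomorphism $\varphi$ and $\Gamma$ in (5.1). The statement (ii) is an immediate consequence of (i). So let us show (i) and (iii). The formula $\Gamma f_i = 0$ has been shown in [3]. Thus, we shall show $e''_i \Gamma = 0$. Here let us write $C = \sum_k c_k \otimes d_k$, where $c_k \in U_q^+(\mathfrak{g})$ and $d_k \in B_q^-(\mathfrak{g})$. Thus, we have

$$\Gamma = \sum_k d_k\, \varphi(c_k).$$

Equation (4.4) can be written as follows:

$$\sum_k t_i^{-1} c_k \otimes e''_i d_k = \sum_k \bigl( c_k t_i^{-1} \otimes d_k e''_i + (q_i - q_i^{-1})\, c_k t_i^{-1} e_i \otimes d_k \bigr). \tag{5.2}$$

Applying $m \circ \sigma \circ (\varphi \otimes 1)$ on both sides of (5.2), we get

$$\sum_k e''_i\, d_k\, \varphi(c_k)\, t_i = \sum_k d_k\, e''_i\, t_i\, \varphi(c_k) - \sum_k d_k\, e''_i\, t_i\, \varphi(c_k) = 0,$$

and then $e''_i \Gamma t_i = 0$, which implies the desired result since $t_i$ is invertible.

Next, let us see (iii). By the remark in the last section, we can write

$$C^{-1} = \sum_k b'_k \otimes a_k \in U_q^+(\mathfrak{g}) \widehat{\otimes}\, B_q^-(\mathfrak{g}).$$

Then,

$$1 \otimes 1 = \sum_{j,k} b'_k c_j \otimes a_k d_j. \tag{5.3}$$

Applying $m \circ \sigma \circ (\varphi \otimes 1)$ on both sides of (5.3), we obtain

$$1 = \sum_{j,k} a_k\, d_j\, \varphi(c_j)\, \varphi(b'_k) = \sum_k a_k\, \Gamma\, \varphi(b'_k).$$

Here setting $b_k := \varphi(b'_k)$, we get (iii).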
6. Representation Theory of $\mathcal{O}(B)$

As an application of the extremal projector $\Gamma$, we shall show the following theorem.

Theorem 6.1. (i) The category $\mathcal{O}(B)$ is a semi-simple category. (ii) The module $H(\lambda)$ is a simple object of $\mathcal{O}(B)$ and for any simple object $M$ in $\mathcal{O}(B)$ there exists some $\lambda \in P$ such that $M \cong H(\lambda)$. Furthermore, $H(\lambda)$ is a rank one free $B_q^-(\mathfrak{g})$-module.

In order to show this theorem, we need to prepare several things. For an object $M$ in $\mathcal{O}(B)$, set

$$K(M) := \{v \in M \mid e''_i v = 0 \text{ for any } i \in I\}.$$

Lemma 6.2. For an object $M$ in $\mathcal{O}(B)$, we have

$$\Gamma \cdot M = K(M). \tag{6.1}$$

Proof. By Theorem 5.2(i), we have $e''_i \Gamma = 0$ for any $i \in I$. Thus, it is trivial to see that $\Gamma \cdot M \subset K(M)$. Owing to the explicit form of $\Gamma$, we find that

$$1 - \Gamma \in \sum_i \widehat{B}_q(\mathfrak{g})\, e''_i.$$

Therefore, for any $v \in K(M)$ we get $(1 - \Gamma)v = 0$, which implies that $\Gamma \cdot M \supset K(M)$.

Lemma 6.3. For an object $M$ in $\mathcal{O}(B)$, we have

$$M = B_q^-(\mathfrak{g}) \cdot (K(M)). \tag{6.2}$$

Proof. By Theorem 5.2(iii), we have $1 = \sum_k a_k \Gamma b_k$ ($a_k \in B_q^-(\mathfrak{g})$, $b_k \in B_q^+(\mathfrak{g})$). For any $u \in M$,

$$u = \sum_k a_k (\Gamma b_k u).$$

By Lemma 6.2, we have $\Gamma b_k u \in K(M)$. Then we obtain the desired result.

Proposition 6.4. For an object $M$ in $\mathcal{O}(B)$, we have

$$M = K(M) \oplus \Bigl(\sum_i \mathrm{Im}(f_i)\Bigr). \tag{6.3}$$

Proof. By (6.2), we get

$$M = K(M) + \Bigl(\sum_i \mathrm{Im}(f_i)\Bigr).$$

Thus, it is sufficient to show

$$K(M) \cap \Bigl(\sum_i \mathrm{Im}(f_i)\Bigr) = \{0\}. \tag{6.4}$$

Let $u$ be a vector in $K(M) \cap (\sum_i \mathrm{Im}(f_i))$. Since $u \in \sum_i \mathrm{Im}(f_i)$, there exist $\{u_i \in M\}_{i \in I}$ such that $u = \sum_{i \in I} f_i u_i$. By the argument in the proof of Lemma 6.2, we have $\Gamma u = u$ for $u \in K(M)$. It follows from Theorem 5.2(i) that

$$u = \Gamma u = \sum_{i \in I} (\Gamma f_i)\, u_i = 0,$$

which implies (6.4).
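Lemma 6.5. If $u, v \in M$ ($M$ an object in $\mathcal{O}(B)$) satisfy $v = \Gamma u$, then there exists $P \in B_q(\mathfrak{g})$ such that $v = P u$.

Proof. By the definition of the category $\mathcal{O}(B)$, there exists $l > 0$ such that $\varphi(x_r^\beta) u = 0$ for any $r$ and $\beta$ with $|\beta| > l$. Thus, by the explicit form of $\Gamma$ in (5.1), we can write

$$v = \Gamma u = \Bigl(\sum_{|\beta| \le l,\ r} k_\beta^{-1}\, S^{-1}(y_r^{-\beta})\, \varphi(x_r^\beta)\Bigr) u,$$

which implies our desired result.

Proof of Theorem 6.1. Let $L \subset M$ be objects in the category $\mathcal{O}(B)$. We shall show that there exists a submodule $N' \subset M$ such that $M = L \oplus N'$. Since $K(M)$ (resp. $K(L)$) is invariant under the action of any $q^h$, we have the weight space decomposition:

$$K(M) = \bigoplus_{\lambda \in P} K(M)_\lambda \qquad \Bigl(\text{resp. } K(L) = \bigoplus_{\lambda \in P} K(L)_\lambda\Bigr).$$

There exist subspaces $N_\lambda \subset K(M)_\lambda$ such that $K(M)_\lambda = K(L)_\lambda \oplus N_\lambda$, which is a decomposition of a vector space. Here set $N := \bigoplus_\lambda N_\lambda$. We have $K(M) = K(L) \oplus N$. Let us show

$$M = L \oplus B_q(\mathfrak{g}) \cdot N. \tag{6.5}$$

Since $M = B_q(\mathfrak{g}) \cdot (K(M)) = B_q(\mathfrak{g})(K(L) \oplus N)$, we get $M = L + B_q(\mathfrak{g}) \cdot N$. Let us show

$$L \cap B_q(\mathfrak{g}) \cdot N = \{0\}. \tag{6.6}$$

For $v \in L \cap B_q(\mathfrak{g}) \cdot N$ we have by Theorem 5.2(iii),

$$v = \sum_k a_k (\Gamma b_k v).$$

It follows from $v \in L$ that $\Gamma b_k v \in K(L)$, and from $v \in B_q(\mathfrak{g}) \cdot N$ that $\Gamma b_k v \in \Gamma(B_q(\mathfrak{g}) \cdot N) = N$. These imply $\Gamma b_k v \in K(L) \cap N = \{0\}$. Hence we get $v = 0$ and then (6.5), so that $N' = B_q(\mathfrak{g}) \cdot N$ is the required submodule.

Next, let us show (ii). As an immediate consequence of Proposition 2.2 we can see that $H(\lambda)$ is a rank one free $B_q^-(\mathfrak{g})$-module. Let $\pi_\lambda : B_q(\mathfrak{g}) \to H(\lambda)$ be the canonical projection and set $u_\lambda := \pi_\lambda(1)$. Here we have

$$H(\lambda) = B_q^-(\mathfrak{g}) \cdot u_\lambda = \mathbb{Q}(q) u_\lambda + \sum_i \mathrm{Im}(f_i).$$

It follows from this, Proposition 6.4 and $\mathbb{Q}(q) u_\lambda \subset K(H(\lambda))$ that $H(\lambda) = \mathbb{Q}(q) u_\lambda \oplus \sum_i \mathrm{Im}(f_i)$ and then

$$\Gamma \cdot H(\lambda) = K(H(\lambda)) = \mathbb{Q}(q) u_\lambda. \tag{6.7}$$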
In order to show the irreducibility of $H(\lambda)$, it is sufficient to see that for arbitrary $u\,(\ne 0)$, $v \in H(\lambda)$, there exists $P \in B_q(\mathfrak{g})$ such that $v = P u$. Set $v = Q u_\lambda$ ($Q \in B_q^-(\mathfrak{g})$). By Theorem 5.2(iii), we have

$$u = \sum_k a_k (\Gamma b_k u) \ne 0.$$

Then, for some $k$ we have $\Gamma b_k u \ne 0$, which implies by (6.7) that $c\, \Gamma b_k u = u_\lambda$ for some non-zero scalar $c$. Therefore, by Lemma 6.5, there exists some $R \in B_q(\mathfrak{g})$ such that $u_\lambda = R u$ and then we have $v = Q u_\lambda = Q R u$. Thus, $H(\lambda)$ is a simple module in $\mathcal{O}(B)$.

Suppose that $L$ is a simple module in $\mathcal{O}(B)$. First, let us show

$$\dim(K(L)) = 1. \tag{6.8}$$

For $x, y\,(\ne 0) \in K(L)$, there exists $P \in B_q(\mathfrak{g})$ such that $y = P x$. Since $x \in K(L)$, we can take $P \in B_q^-(\mathfrak{g})$. Because $y \in K(L)$ and $K(L) \cap \sum_i \mathrm{Im}(f_i) = \{0\}$, we find that $P$ must be a scalar, say $c$. Thus, we have $y = c x$, which yields (6.8). Let $u_0$ be a basis vector in $K(L)$. The space $K(L)$ is invariant under the action of any $q^h$ and then $u_0 \in L_\lambda$ for some $\lambda \in P$. Therefore, since $H(\lambda)$ is a rank one free $B_q^-(\mathfrak{g})$-module, the map

$$\phi_\lambda : H(\lambda) \to L, \qquad P u_\lambda \mapsto P u_0 \quad (P \in B_q^-(\mathfrak{g})),$$

is a well-defined non-trivial homomorphism of $B_q(\mathfrak{g})$-modules. Thus, by Schur's lemma, we obtain $H(\lambda) \cong L$.

Acknowledgement. The author would like to thank Y. Koga for valuable discussions and A.N. Kirillov for introducing the papers [2, 7] to him.
References
1. Kashiwara, M.: On crystal bases of the q-analogue of universal enveloping algebras. Duke Math. J. 63, 465–516 (1991)
2. Khoroshkin, S.M., Tolstoy, V.N.: Extremal projector and universal R-matrix for quantized contragredient Lie (super)algebras. In: Quantum Groups and related topics, Gielerak et al. (eds.), 1992, pp. 23–32
3. Nakashima, T.: Quantum R-matrix and Intertwiners for the Kashiwara algebras. Commun. Math. Phys. 164, 239–258 (1994)
4. Rosso, M.: Analogues de la forme de Killing et du théorème d'Harish-Chandra pour les groupes quantiques. Ann. scient. Éc. Norm. Sup. 23, 445–467 (1990)
5. Rosso, M.: Certaines formes bilinéaires sur les groupes quantiques et une conjecture de Schechtman et Varchenko. C.R. Acad. Sci. Paris Ser. 1 Math. 314(1), 5–8 (1992)
6. Tanisaki, T.: Killing forms, Harish-Chandra isomorphisms, and universal R-matrices for quantum algebras. Int. J. Mod. Phys. A7(Suppl. 1B), 941–961 (1992)
7. Tolstoy, V.N.: Extremal projectors for Quantized Kac-Moody superalgebras and some of their applications. In: Quantum Groups (Clausthal, 1989), Lecture Notes in Physics 370, Berlin: Springer, 1990, pp. 118–125
8. Tolstoy, V.N.: Projection operator method for quantum groups. In: Special Functions 2000: Current perspective and future directions, J. Bustoz et al. (eds.), NATO Science Series II, 30, Amsterdam: Kluwer Acad. Publishers, 2001, pp. 457–488, arXiv:math.QA/0104045
9. Tan, Y.: The q-analogue of bosons and Hall algebras. Comm. Algebra 30(9), 4335–4347 (2002)

Communicated by Y. Kawahigashi
Commun. Math. Phys. 244, 297–309 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0977-3
Cantor Spectrum for the Almost Mathieu Operator

Joaquim Puig

Dept. de Matemàtica Aplicada i Anàlisi, Univ. de Barcelona, Gran Via 585, 08007 Barcelona, Spain. E-mail: [email protected]

Received: 24 March 2003 / Accepted: 29 July 2003
Published online: 18 November 2003 – © Springer-Verlag 2003
Abstract: In this paper we use results on reducibility, localization and duality for the Almost Mathieu operator,

$$(H_{b,\phi}x)_n = x_{n+1} + x_{n-1} + b\cos(2\pi n\omega + \phi)\, x_n$$

on $l^2(\mathbb{Z})$ and its associated eigenvalue equation to deduce that for $b \ne 0, \pm 2$ and $\omega$ Diophantine the spectrum of the operator is a Cantor subset of the real line. This solves the so-called "Ten Martini Problem" for these values of $b$ and $\omega$. Moreover, we prove that for $|b| \ne 0$ small or large enough all spectral gaps predicted by the Gap Labelling theorem are open.

1. Introduction. Main Results

In this paper we study the nature of the spectrum of the Almost Mathieu operator

$$(H_{b,\phi}x)_n = x_{n+1} + x_{n-1} + b\cos(2\pi\omega n + \phi)\, x_n, \qquad n \in \mathbb{Z} \tag{1}$$

on $l^2(\mathbb{Z})$, where $b$ is a real parameter, $\omega$ is an irrational number and $\phi \in \mathbb{T} = \mathbb{R}/(2\pi\mathbb{Z})$. Since for each $b$ this is a bounded self-adjoint operator, the spectrum is a compact subset of the real line which does not depend on $\phi$ because of the assumption on $\omega$. This spectrum will be denoted by $\sigma_b$. For $b = 0$, this set is the interval $[-2, 2]$. The understanding of the spectrum of (1) is related to the dynamical properties of the difference equation

$$x_{n+1} + x_{n-1} + b\cos(2\pi\omega n + \phi)\, x_n = a x_n, \qquad n \in \mathbb{Z} \tag{2}$$

for $a \in \mathbb{R}$, which is sometimes called the Harper equation. In what follows we will assume that the frequency $\omega$ is Diophantine:
Definition 1. We say that a real number $\omega$ is Diophantine whenever there exist positive constants $c$ and $r > 1$ such that the estimate

$$|\sin 2\pi n\omega| > \frac{c}{|n|^r}$$

holds for all $n \ne 0$.

The nature of the spectrum of this operator has been studied intensively in the last twenty years (for a review, see Last [27]) and an open problem has been to know whether the spectrum is a Cantor set or not, which is usually referred to as the "Ten Martini Problem". In this paper we derive two results on this problem. The first one is non-perturbative:

Corollary 1. If $\omega$ is Diophantine, then the spectrum of the Almost Mathieu operator is a Cantor set if $b \ne 0, \pm 2$.

Here, we prefer to call this result a corollary, rather than a theorem, because the proof requires just a combination of reducibility, point spectrum and duality developed quite recently for the Almost Mathieu operator and the related eigenvalue equation. The argument is in fact reminiscent of Ince's original argument for the classical Mathieu differential equation (see [19]).

In the critical case $|b| = 2$, Y. Last proved in [26] that the spectrum of the Almost Mathieu operator is a subset of the real line with zero Lebesgue measure and that it is a Cantor set for the values of $\omega$ which have an unbounded continued fraction expansion, which is a set of full measure. This last result has been obtained recently for the remaining Diophantine frequencies by Avila & Krikorian [2].

The Cantor structure of the spectrum of the Almost Mathieu operator can be better understood if we make use of the concept of rotation number, which can be defined as follows. Let $(x_n)_{n \in \mathbb{Z}}$ be a non-trivial solution of (2), for some fixed $a, b, \phi$. Let $S(N)$ be the number of changes of sign of such a solution for $1 \le n \le N$, adding one if $x_N = 0$. Then the limit

$$\lim_{N \to \infty} \frac{S(N)}{2N}$$

exists, it does not depend on the chosen solution $x$, nor on $\phi$, and it is denoted by $\mathrm{rot}(a, b)$. A more complete presentation of this object can be found in Sect. 2. Here we only mention some properties which relate it to the spectrum of $H_{b,\phi}$:

Proposition 1 ([3, 12, 18, 23]). The rotation number has the following properties:
(i) The rotation number, $\mathrm{rot}(a, b)$, is a continuous function of $(a, b) \in \mathbb{R}^2$.
(ii) For a fixed $b$, the spectrum of (1), $\sigma_b$, is the set of $a_0 \in \mathbb{R}$ such that $a \mapsto \mathrm{rot}(a, b)$ is not locally constant at $a_0$.
(iii) (Gap labelling) If $I$ is an open, non-void interval in the resolvent set of (1), $\rho_b = \mathbb{R} - \sigma_b$, then there is an integer $k \in \mathbb{Z}$ such that $2\,\mathrm{rot}(a, b) - k\omega \in \mathbb{Z}$ for all $a \in I$. That is,

$$\mathrm{rot}(a, b) = \frac{1}{2}\{k\omega\},$$

where $\{\cdot\}$ denotes the fractional part of a real number.
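The sign-counting definition of the rotation number and the gap labels $\frac{1}{2}\{k\omega\}$ lend themselves to a direct numerical experiment. The Python sketch below iterates equation (2) and counts sign changes. It is a rough finite-$N$ estimate rather than the method of the paper, and the function names `rotation_number` and `gap_label_value`, the initial data, the default number of steps and the rescaling threshold are all assumptions made for illustration.

```python
import math


def rotation_number(a, b, omega, phi=0.0, n_steps=20000):
    """Finite-N estimate S(N) / (2N) of rot(a, b) for equation (2).

    Iterates x_{n+1} = (a - b cos(2 pi omega n + phi)) x_n - x_{n-1}
    from arbitrary non-trivial initial data and counts sign changes.
    """
    x_prev, x_curr = 1.0, 0.3  # arbitrary choice of (x_0, x_1)
    changes = 0
    for n in range(1, n_steps):
        potential = b * math.cos(2.0 * math.pi * omega * n + phi)
        x_next = (a - potential) * x_curr - x_prev
        # A zero of the solution counts as one change of sign.
        if x_curr == 0.0 or x_curr * x_next < 0.0:
            changes += 1
        x_prev, x_curr = x_curr, x_next
        # Solutions grow exponentially in gaps; rescale to avoid overflow.
        scale = abs(x_prev) + abs(x_curr)
        if scale > 1e100:
            x_prev /= scale
            x_curr /= scale
    return changes / (2.0 * n_steps)


def gap_label_value(k, omega):
    """Value {k omega} / 2 taken by rot(a, b) on the gap with label k."""
    return ((k * omega) % 1.0) / 2.0


if __name__ == "__main__":
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    # For b = 0 the solutions are cos(n theta + const) with a = 2 cos(theta),
    # so the estimate should be close to theta / (2 pi).
    print(rotation_number(1.0, 0.0, golden))
    print([round(gap_label_value(k, golden), 4) for k in range(1, 6)])
```

For $b = 0$ and $a = 1$ the solutions are $\cos(n\pi/3 + \text{const})$, so the estimate is close to $1/6$, while for $a > 2$ or $a < -2$ (outside $\sigma_0 = [-2,2]$) it is close to $0$ or $1/2$ respectively.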
Fig. 1. Numerical computation of the ten biggest spectral gaps for the Almost Mathieu operator with different values of $b$ and $\omega = (\sqrt{5} - 1)/2$. They correspond to the first $|k|$ such that $\{k\omega\}/2$ belongs to $[1/4, 1/2]$. The coupling parameter $b$ is in the vertical direction whereas the spectral one, $a$, is in the horizontal one. Note that for $b = 0$, all gaps except the upper one are collapsed
From this proposition we conclude that the resolvent set is the disjoint union of countably (or finitely) many open intervals called spectral gaps, possibly void, which can be uniquely labelled by an integer k called the resonance. If the closure of a spectral gap degenerates to a point we will say that it is a collapsed gap and otherwise that it is a non-collapsed gap. See Fig. 1 for a numerical computation of the biggest gaps in the spectrum of the Almost Mathieu operator for several values of b. In particular if, for a fixed b, all the spectral gaps are open and the frequency ω is irrational, then the spectrum σ_b is a Cantor set. The question of the non-collapsing of all spectral gaps is sometimes called the Strong (or Dry) Ten Martini Problem. However, if non-collapsed gaps are dense in the spectrum, then the spectrum is still a Cantor set, although some (perhaps infinitely many) collapsed gaps may also coexist. Now we can formulate the second corollary in this paper:
Corollary 2. Assume that ω ∈ R is Diophantine. Then, there is a constant C = C(ω) > 0 such that if 0 < |b| < C or 4/C < |b| < ∞ all the spectral gaps of the spectrum of the Almost Mathieu operator are open.
Before ending this introduction we give a short account of the existing results (to our knowledge) on the Cantor spectrum of the Almost Mathieu operator for |b| ≠ 0, 2. The Cantor spectrum for the Almost Mathieu operator was first conjectured by Azbel [4], and Kac, in 1981, conjectured that all the spectral gaps are open. The problem of the Cantor structure of the spectrum was called the “Ten Martini Problem” by Simon [32] (and remained as Problem 4 in [33]). Sinai [34] proved that for Diophantine ω and sufficiently large (or small) |b|, depending on ω, the spectrum σ_b is a Cantor set. Choi, Elliott & Yui [8] proved that the spectrum σ_b is a Cantor set for all b ≠ 0 when ω is a
Liouville number obeying the condition |ω − p/q| < D^{−q} for a certain constant D > 1 and infinitely many rationals p/q. In particular, this means that for a G_δ-dense subset of pairs (b, ω) the spectrum is a Cantor set, which is the Bellissard-Simon result [5]. For results on Cantor spectrum for continuous quasi-periodic and almost periodic Schrödinger operators see Moser [28], Johnson [22], Eliasson [17] and Puig & Simó [31]. Nevertheless, collapsed gaps appear naturally in quasi-periodic Schrödinger operators, as was shown by Broer, Puig & Simó [6], and there are examples which do not display Cantor spectrum, see De Concini & Johnson [11]. Finally, let us mention that, if we consider the case of rational ω, all spectral gaps, apart from the middle one, are open if b ≠ 0. This result was proved by van Mouche [35] and Choi, Elliott & Yui [8]. Let us now outline the contents of the present paper. In Sect. 2 we introduce some of the tools needed to prove our two main results. These include the different definitions of the rotation number, the concept of reducibility of linear quasi-periodic skew-products and the duality for the Almost Mathieu operator. In Sect. 3 we apply the reducibility results by Eliasson to prove Corollary 2. Finally, in Sect. 4, the proof of Corollary 1 is given, which is based on a result of non-perturbative localization by Jitomirskaya.
2. Prerequisites: Rotation Number, Reducibility, Duality and Lack of Coexistence
Rotation number. The rotation number for quasi-periodic Schrödinger equations is a very useful object with deep connections to the spectral properties of Schrödinger operators. It is also related to the dynamical properties of the solutions of the associated eigenvalue equation. This allows several equivalent definitions, which we shall now try to present. The rotation number was introduced for continuous time quasi-periodic Schrödinger equations by Johnson & Moser [23].
The discrete version was introduced by Herman [18] (which is also defined for quasi-periodic skew-product flows on SL(2, R) × T) and Delyon & Souillard [12] (which is the definition given in the introduction). We will now review these definitions, their connection and some important properties. Herman’s definition is dynamical. Here we follow the presentation by Krikorian [25]. Write Eq. (2) as a quasi-periodic skew-product flow on R^2 × T,
$$u_{n+1} = A(\theta_n)\,u_n, \qquad \theta_{n+1} = \theta_n + 2\pi\omega, \tag{3}$$
setting u_n = (x_n, x_{n−1})^T and
$$A(\theta) = \begin{pmatrix} a - b\cos\theta & -1 \\ 1 & 0 \end{pmatrix}, \tag{4}$$
which belongs to SL(2, R), the group of bidimensional matrices with determinant one. The quasi-periodic flow can also be defined on SL(2, R) × T considering the flow given by
$$X_{n+1} = A(\theta_n)\,X_n, \qquad \theta_{n+1} = \theta_n + 2\pi\omega, \tag{5}$$
with X_0 ∈ SL(2, R). This can be seen as the iteration of the following quasi-periodic cocycle on SL(2, R) × T:
$$SL(2,\mathbb{R}) \times \mathbb{T} \longrightarrow SL(2,\mathbb{R}) \times \mathbb{T}, \qquad (X, \theta) \mapsto \left(A(\theta)X,\ \theta + 2\pi\omega\right), \tag{6}$$
which we denote by (A, ω). We will now give Herman’s definition of the rotation number of a quasi-periodic cocycle like (6) with A : T → SL(2, R) homotopic to the identity. For a general A : T → SL(2, R), this last property is not always true, since SL(2, R) is not simply connected. Indeed, its first homotopy group is isomorphic to Z, with generator the rotation R_1 : T → SL(2, R) given by
$$R_1(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
for all θ ∈ T. In our case, the Almost Mathieu cocycle (4) is homotopic to the identity. Let S^1 be the set of unit vectors of R^2 and let us denote by p : R → S^1 the projection given by the exponential p(t) = e^{it}, identifying R^2 with C. Because of the linear character of the cocycle, the continuous map
$$F : S^1 \times \mathbb{T} \longrightarrow S^1 \times \mathbb{T}, \qquad (v, \theta) \mapsto \left(\frac{A(\theta)v}{\|A(\theta)v\|},\ \theta + 2\pi\omega\right) \tag{7}$$
is also homotopic to the identity. Therefore, it admits a continuous lift F̃ : R × T → R × T of the form F̃(t, θ) = (t + f(t, θ), θ + 2πω) such that f(t + 2π, θ + 2πω) = f(t, θ) and
$$p\left(t + f(t, \theta)\right) = \frac{A(\theta)p(t)}{\|A(\theta)p(t)\|}$$
for all t ∈ R and θ ∈ T. The map f is independent of the choice of F̃ up to the addition of a constant 2πk, with k ∈ Z. Since the map θ → θ + 2πω is uniquely ergodic on T, for all (t, θ) ∈ R × T the limit
$$\lim_{N\to\infty} \frac{1}{2\pi N} \sum_{n=0}^{N-1} f\left(\tilde{F}^n(t, \theta)\right)$$
exists, it is independent of (t, θ) and the convergence is uniform in (t, θ), see Herman [18] and Johnson & Moser [23]. This object is called the fibered rotation number, which will be denoted by ρ_f(a, b), and it is defined modulo Z. For instance, if A_0 ∈ SL(2, R) is a constant matrix, then the fibered rotation number of the cocycle (A_0, ω), for any irrational ω, is the absolute value of the argument of the eigenvalues divided by 2π. Using a suspension argument (see Johnson [24]) it can be seen that, for the Almost Mathieu cocycle (as for any quasi-periodic Schrödinger cocycle), the fibered rotation number coincides with the Sturmian definition given in the introduction. Note that this
last rotation number, rot(a, b), belongs to the interval [0, 1/2], whereas the fibered rotation number, ρ_f(a, b), is an element of R/Z. Both can be linked by means of the integrated density of states, see Avron & Simon [3]. Let k_L(a, b, φ) be (L − 1)^{−1} times the number of eigenvalues less than or equal to a for the restriction of H_{b,φ} to the set {1, . . . , L − 1}, for some φ ∈ T, with zero boundary conditions at both ends 0 and L. Then, as L → ∞, the k_L(a, b, φ) converge to a continuous function k(a, b), which is the integrated density of states. The basic relations are
$$2\,\mathrm{rot}(a, b) = k(a, b) \qquad \text{and} \qquad 2\rho_f(a, b) = k(a, b) + l,$$
for a suitable integer l ∈ Z. In particular,
$$\mathrm{rot}(a, b) = \rho_f(a, b) \pmod{\tfrac{1}{2}\mathbb{Z}}.$$
In what follows, the arithmetic nature of the rotation number will be of importance. We will say that the rotation number is rational or resonant with respect to ω if there exists a constant k ∈ Z such that rot(a, b) = {kω}/2 or, equivalently, ρ_f(a, b) = kω/2 modulo (1/2)Z. Also, we say that it is Diophantine with respect to ω whenever the bound
$$\left|\mathrm{rot}(a, b) - \frac{\{k\omega\}}{2}\right| = \min_{l\in\mathbb{Z}} \left|\rho_f(a, b) - \frac{k\omega}{2} - \frac{l}{2}\right| \ge \frac{K}{|k|^{\tau}}$$
holds for all k ∈ Z − {0} and suitable fixed positive constants K and τ.
Reducibility. A main tool in the study of quasi-periodic skew-product flows is their reducibility to constant coefficients. Reducibility is a concept defined for the continuous and discrete case (for an introduction see the reviews by Eliasson [14, 15] and, for more references, the survey [30] by the author). A quasi-periodic skew-product flow like (3), or a quasi-periodic cocycle like (6), with A : T → SL(2, R), is said to be reducible to constant coefficients if there is a continuous map Z : T → SL(2, R) and a constant matrix B ∈ SL(2, R), called the Floquet matrix, such that the conjugation
$$A(\theta)Z(\theta) = Z(\theta + 2\pi\omega)B \tag{8}$$
is satisfied for all θ ∈ T. When ω is rational, in which case the flow is periodic, any skew-product flow is reducible to constant coefficients. Even in this periodic case, it is not always possible to reduce with the same frequency ω, but with ω/2. If there is a reduction to constant coefficients like (8), then a fundamental matrix of solutions of (3),
$$X_{n+1}(\phi) = A(2\pi n\omega + \phi)\,X_n(\phi), \qquad n \in \mathbb{Z},$$
with X_0 : T → SL(2, R) continuous, has the following Floquet representation:
$$X_n(\phi) = Z(2\pi n\omega + \phi)\,B^n\,Z(\phi)^{-1}X_0(\phi) \tag{9}$$
for all n ∈ Z and φ ∈ T. This gives a complete description of the qualitative behaviour of the flow (3). The rotation number of a quasi-periodic cocycle is not invariant through a conjugation like (8). There are however the following easy relations:
Proposition 2. Let ω be an irrational number and (A_1, ω) and (A_2, ω) be two quasi-periodic cocycles on SL(2, R) × T homotopic to the identity, with ρ_1 and ρ_2 the corresponding fibered rotation numbers. Assume that there exists a continuous map Z : T → SL(2, R) such that A_1(θ)Z(θ) = Z(θ + 2πω)A_2(θ) for all θ ∈ T. Then, if k ∈ Z is the degree of Z, ρ_1 = ρ_2 + kω modulo Z.
This proposition shows that, for any fixed irrational frequency ω, the class of quasi-periodic cocycles with rational rotation number (resp. with Diophantine rotation number) is invariant under conjugation, although the rotation number itself may change. It also shows that, whenever a quasi-periodic skew-product flow in SL(2, R) × T is reducible to a Floquet matrix with trace ±2, the rotation number must be rational.
Duality and lack of coexistence. To end this section, let us present a specific feature of the Almost Mathieu operator or, rather, of the associated eigenvalue equation, which is at the basis of our arguments. It is part of what is known as Aubry duality or simply duality:
Theorem 1 (Avron & Simon [3]). For every irrational ω, the rotation number of (2) satisfies the relation
$$\mathrm{rot}(a, b) = \mathrm{rot}(2a/b, 4/b) \tag{10}$$
for all b ≠ 0 and a ∈ R. According to Proposition 1 this means that the spectrum σ_{4/b}, for b ≠ 0, is just a dilatation of the spectrum σ_b. In particular, σ_b is a Cantor set (resp. none of the spectral gaps of σ_b is collapsed) if and only if σ_{4/b} is a Cantor set (resp. none of the spectral gaps of σ_{4/b} is collapsed). In the proof of our two main results we will use the following argument, which is analogous to Ince’s argument for the classical Mathieu periodic differential equation (see [19] §7.41). In principle, the eigenvalue equation of a general quasi-periodic Schrödinger operator may have two linearly independent quasi-periodic solutions with frequency ω (or ω/2). One may call this phenomenon coexistence of quasi-periodic solutions, in analogy with the classical Floquet theory for second-order periodic differential equations. A trivial example of this occurs in the Almost Mathieu case for b = 0 and suitable values of a. Let us now show that in the Almost Mathieu case this does not happen if b ≠ 0, i.e. two quasi-periodic solutions with frequency ω of the eigenvalue equation cannot coexist. Let (x_n)_{n∈Z} satisfy the equation
$$x_{n+1} + x_{n-1} + b\cos(2\pi\omega n + \phi)\,x_n = a x_n, \qquad n \in \mathbb{Z} \tag{11}$$
for some a, b ≠ 0 and φ. If it is quasi-periodic with frequency ω, there exists a continuous function ψ : T → R such that x_n = ψ(2πωn + φ) for all n ∈ Z. The Fourier coefficients of ψ, (ψ_m)_{m∈Z}, satisfy the following equation:
$$2\cos(2\pi\omega m)\,\psi_m + \frac{b}{2}\left(\psi_{m+1} + \psi_{m-1}\right) = a\psi_m, \qquad m \in \mathbb{Z},$$
which is equivalent to
$$\psi_{m+1} + \psi_{m-1} + \frac{4}{b}\cos(2\pi\omega m)\,\psi_m = \frac{2a}{b}\,\psi_m, \qquad m \in \mathbb{Z}. \tag{12}$$
Since ψ is at least continuous, (ψ_m)_{m∈Z} belongs to l^2(Z). Now the reason for the absence of coexisting quasi-periodic solutions is clear. Indeed, if (y_n)_{n∈Z} is another linearly independent quasi-periodic solution of (11) with frequency ω, say y_n = χ(2πωn + φ), for some continuous χ, then the sequence of the Fourier coefficients of χ, (χ_m)_{m∈Z}, would be a solution of (12) belonging to l^2(Z). The sequences (ψ_m)_{m∈Z} and (χ_m)_{m∈Z} would be two linearly independent solutions of (12) which both belong to l^2(Z). This is a contradiction, because for bounded potentials, like the cosine, we are always in the limit-point case (see [7, 9] for the continuous case). In our discrete case, this is even simpler, since any solution in l^2(Z) of the eigenvalue equation must tend to zero at ±∞. Hence, the existence of two linearly independent solutions both belonging to l^2(Z) would be in contradiction with the preservation of the Wronskian. Therefore, two quasi-periodic solutions with frequency ω cannot coexist if b ≠ 0. A similar argument shows that quasi-periodic solutions of the form
$$(-1)^n\,\psi(2\pi\omega n + \phi), \tag{13}$$
for a continuous ψ : T → R cannot coexist. Finally, note that the coexistence of two quasi-periodic solutions with frequency ω of Eq. (11) is equivalent to the reducibility of the corresponding two-dimensional skew-product flow (3), with the identity as Floquet matrix. Similarly the coexistence of two quasi-periodic solutions of the type (13) is equivalent to the reducibility of the flow with minus the identity as Floquet matrix.
3. The Strong Ten Martini Problem for Small (and Large) |b|
In this section we will show that for 0 < |b| < C, where C > 0 is a suitable constant, and for |b| > 4/C all spectral gaps are open. The theorem from which we will derive Corollary 2 is due to Eliasson and it was originally stated for the continuous case, based on a KAM scheme. It can be adapted to the discrete case to obtain the following:
Theorem 2 ([16, 17]). Assume that ω is Diophantine with constants c and r. Then there is a constant C(c, r) such that, if |b| < C(c, r) and rot(a, b) is either rational or Diophantine, then the quasi-periodic skew-product flow
$$\begin{pmatrix} x_{n+1} \\ x_n \end{pmatrix} = \begin{pmatrix} a - b\cos\theta_n & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_n \\ x_{n-1} \end{pmatrix}, \qquad \theta_{n+1} = \theta_n + 2\pi\omega \tag{14}$$
on R^2 × T is reducible to constant coefficients, with Floquet matrix B, by means of a quasi-periodic (with frequency ω/2) and analytic transformation. Moreover, if a is at an endpoint of a spectral gap of σ_b, then the trace of B is ±2, being B = ±I if, and only if, the gap collapses. Finally, if B = ±I then the transformation Z can be chosen to have frequency ω.
For other reducibility results in the context of quasi-periodic Schrödinger operators see Dinaburg & Sinai [13] and Moser & Pöschel [29] for the continuous case and Krikorian [25] and Avila & Krikorian [2] for the discrete case. Taking into account the arguments from the previous section, Corollary 2 is immediate. Indeed, let |b| < C, where C is the constant given by the theorem for a fixed Diophantine frequency ω. Then the skew-product flow (14) is reducible to constant coefficients and the Floquet matrix has trace ±2 if a is an endpoint of a spectral gap. Moreover the gap is collapsed if, and only if, the Floquet matrix B is ±I. Since we have seen in the previous section that (14) for b ≠ 0 cannot be reducible to these Floquet matrices, Corollary 2 follows.
4. Non-Perturbative Localization and Cantor Spectrum for b ≠ 0
In this section we will see how Corollary 1 is a consequence of the following theorem on non-perturbative localization, due to Jitomirskaya:
Theorem 3 ([20]). Let ω be Diophantine. Define the set of resonant phases as the set of those φ ∈ T such that the relation
$$|\sin(\phi + \pi n\omega)| < \exp\left(-|n|^{\frac{1}{2r}}\right) \tag{15}$$
holds for infinitely many values of n, r being the constant in the definition of a Diophantine number. Then, if φ is not a resonant phase and |b| > 2, the operator H_{b,φ} has only pure point spectrum with exponentially decaying eigenfunctions. Moreover, any of these eigenfunctions (ψ_n)_{n∈Z} satisfies
$$\beta(b) = -\lim_{|n|\to\infty} \frac{\log\left(\psi_n^2 + \psi_{n+1}^2\right)}{2|n|} = \log\frac{|b|}{2}. \tag{16}$$
Now we prove Corollary 1. Let |b| > 2. Then, according to Theorem 3, the operators H_{b,0} and H_{b,π} have only pure point spectrum with exponentially decaying eigenfunctions. The eigenvalue equation associated to these operators has the following properties:
Lemma 1. Let (x_n)_{n∈Z} be a solution of the difference equation
$$x_{n+1} + x_{n-1} + b\cos(2\pi n\omega + \phi)\,x_n = a x_n, \qquad n \in \mathbb{Z},$$
for some constants a, b and φ ∈ T. Then, if φ = 0, π, (x_{−n})_{n∈Z} is also a solution of this equation.
Let us consider the operator H_{b,0}. According to Theorem 3, there exists a sequence of eigenvalues (a^k(b))_{k∈Z} with eigenvectors (ψ^k(b))_{k∈Z}, exponentially localized and which form a complete orthonormal basis of l^2(Z). Moreover the set of eigenvalues (a^k(b))_{k∈Z} must be dense in the spectrum σ_b. For simplicity, we do not write the dependence on b in what follows. None of these eigenvalues can be repeated, since we are in the limit point case. Writing each of the ψ^k as ψ^k = (ψ^k_n)_{n∈Z}, we define
$$\tilde{\psi}^k(\theta) = \sum_{n\in\mathbb{Z}} \psi^k_n\, e^{in\theta},$$
for θ ∈ T. All these functions belong to C^a_β(T, R), the set of real analytic functions of T with analytic extension to |Im θ| < β, and they are even functions of θ, because of Lemma 1 (here we have applied again that we are in the limit point case). Passing to the dual equation, we obtain that, for each k ∈ Z, the sequence (ψ̃^k(2πωn))_{n∈Z} is a quasi-periodic solution of
$$x_{n+1} + x_{n-1} + \frac{4}{b}\cos\theta_n\, x_n = \frac{2a}{b}\, x_n, \qquad \theta_{n+1} = \theta_n + 2\pi\omega, \qquad n \in \mathbb{Z}, \tag{17}$$
provided a is now replaced by a^k. We are now going to see that 2a^k/b is at an endpoint of a spectral gap and that this gap is non-collapsed. To do so we will use reducibility as in the proof of Theorem 2. For a direct proof that 2a^k/b is at an endpoint of a gap (it has rational rotation number), see again Herman [18]. The fact that (ψ̃^k(2πωn))_{n∈Z} is a quasi-periodic solution of (17) means that, for all θ ∈ T, the following equation is satisfied:
$$\begin{pmatrix} \tilde{\psi}^k(4\pi\omega + \theta) \\ \tilde{\psi}^k(2\pi\omega + \theta) \end{pmatrix} = \begin{pmatrix} \frac{2a^k}{b} - \frac{4}{b}\cos\theta & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \tilde{\psi}^k(2\pi\omega + \theta) \\ \tilde{\psi}^k(\theta) \end{pmatrix}.$$
The following lemma shows that, if this is the case, then the quasi-periodic skew-product flow
$$\begin{pmatrix} x_{n+1} \\ x_n \end{pmatrix} = \begin{pmatrix} \frac{2a^k}{b} - \frac{4}{b}\cos\theta_n & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_n \\ x_{n-1} \end{pmatrix}, \qquad \theta_{n+1} = \theta_n + 2\pi\omega \tag{18}$$
is reducible to constant coefficients.
Lemma 2. Let A : T → SL(2, R) be a real analytic map, with analytic extension to |Im θ| < δ for some δ > 0. Assume that there is a nonzero real analytic map v : T → R^2, with analytic extension to |Im θ| < δ, such that v(θ + 2πω) = A(θ)v(θ) holds for all θ ∈ T. Then, the quasi-periodic skew-product flow given by
$$u_{n+1} = A(\theta_n)\,u_n, \qquad \theta_{n+1} = \theta_n + 2\pi\omega, \tag{19}$$
with (u_n, θ_n) ∈ R^2 × T for all n ∈ Z is reducible to constant coefficients by means of a quasi-periodic transformation which is analytic in |Im θ| < δ and has frequency ω. Moreover the Floquet matrix can be chosen to be of the form
$$B = \begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix} \tag{20}$$
for some c ∈ R.
Proof. Since v = (v_1, v_2)^T does not vanish, d = v_1^2 + v_2^2 is always different from zero and the transformation
$$Z(\theta) = \begin{pmatrix} v_1(\theta) & -v_2(\theta)/d(\theta) \\ v_2(\theta) & v_1(\theta)/d(\theta) \end{pmatrix}$$
is an analytic map Z : T → SL(2, R). The transformation Z defines a conjugation of A with B^1, being A(θ)Z(θ) = Z(2πω + θ)B^1(θ), which means that B^1 is
$$B^1(\theta) = \begin{pmatrix} 1 & b^1_{12}(\theta) \\ 0 & 1 \end{pmatrix},$$
for some analytic b^1_{12} : T → R. The transformed skew-product flow, defined by the matrix B^1, is reducible to constant coefficients because it is in triangular form, the frequency ω is Diophantine and b^1_{12} is analytic. Indeed, if y_{12} : T → R is an analytic solution of the small divisors equation
$$y_{12}(2\pi\omega + \theta) - y_{12}(\theta) = b^1_{12}(\theta) - [b^1_{12}], \qquad \theta \in \mathbb{T},$$
where [b^1_{12}] is the average of b^1_{12} (see [1]), then the transformation
$$Y(\theta) = \begin{pmatrix} 1 & y_{12}(\theta) \\ 0 & 1 \end{pmatrix}$$
conjugates B^1 with its averaged part:
$$B = [B^1] = \begin{pmatrix} 1 & [b^1_{12}] \\ 0 & 1 \end{pmatrix},$$
which is in the form of (20).
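The small divisors equation in the proof of Lemma 2 can be solved mode by mode in Fourier space: the k-th coefficient of y_12 is the k-th coefficient of b^1_12 divided by e^{2πikω} − 1, for k ≠ 0. The following is a minimal sketch for a trigonometric polynomial right-hand side; the function names and the sample coefficients are hypothetical, chosen only for illustration, and the sketch says nothing about convergence for a general analytic function, which is where the Diophantine condition enters.

```python
import cmath
import math


def solve_small_divisors(b_hat, omega):
    """Solve y(theta + 2*pi*omega) - y(theta) = b(theta) - [b] for a
    trigonometric polynomial b(theta) = sum_k b_hat[k] * exp(i*k*theta).
    Returns the Fourier coefficients of y (zero mean) and the average [b].
    (Hypothetical helper, not taken from the paper.)"""
    average = b_hat.get(0, 0.0)
    y_hat = {
        k: bk / (cmath.exp(2j * math.pi * k * omega) - 1.0)  # small divisor
        for k, bk in b_hat.items()
        if k != 0
    }
    return y_hat, average


def evaluate(coeffs, theta):
    """Evaluate sum_k coeffs[k] * exp(i*k*theta)."""
    return sum(c * cmath.exp(1j * k * theta) for k, c in coeffs.items())


if __name__ == "__main__":
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    # Sample data: b(theta) = 0.7 + cos(theta) + 0.5*sin(2*theta).
    b_hat = {0: 0.7, 1: 0.5, -1: 0.5, 2: -0.25j, -2: 0.25j}
    y_hat, avg = solve_small_divisors(b_hat, golden)
    theta = 0.3
    lhs = evaluate(y_hat, theta + 2.0 * math.pi * golden) - evaluate(y_hat, theta)
    rhs = evaluate(b_hat, theta) - avg
    print(abs(lhs - rhs))  # residual of the equation at one sample point
```

The divisors e^{2πikω} − 1 never vanish for irrational ω, but they can be arbitrarily small as |k| grows, which is why the argument in the text needs both analyticity of b^1_12 and the Diophantine condition on ω.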
Thus, applying this lemma, the flow (18) is reducible to constant coefficients with Floquet matrix B, of the form (20). That is, there exists a real analytic map Z : T → SL(2, R) such that
$$A(\theta)Z(\theta) = Z(\theta + 2\pi\omega)B \tag{21}$$
for all θ ∈ T. Moreover, since the trace of B is 2, the rotation number of (17) is rational, so that we are at the endpoint of a gap, which we want to show is non-collapsed. By the arguments of Sect. 2, we rule out the possibility of B being the identity. Indeed, this would imply the coexistence of two quasi-periodic analytic solutions with frequency ω, which does not happen in the Almost Mathieu case. Therefore B ≠ I and, thus, c ≠ 0 in the definition above. If B ≠ I, it is a well-known fact of Floquet theory that 2a^k/b lies at the endpoint of a non-collapsed gap (see, for example, the monograph [36] for classical Floquet theory or [6] for the continuous and quasi-periodic Schrödinger case). For the sake of completeness we sketch the argument. We will see that there exists an α_0 > 0 such that if 0 < |α| < α_0 and α is either positive or negative (depending on the sign of c) then 2a^k/b + α lies in the resolvent set of σ_{4/b}. To do so, we will show that, for these values of α, the skew-product flow
$$\begin{pmatrix} x_{n+1} \\ x_n \end{pmatrix} = \begin{pmatrix} \frac{2a^k}{b} + \alpha - \frac{4}{b}\cos\theta_n & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_n \\ x_{n-1} \end{pmatrix}, \qquad \theta_{n+1} = \theta_n + 2\pi\omega \tag{22}$$
has an exponential dichotomy (see Coppel [10]), which implies that 2a^k/b + α ∉ σ_{4/b} (see Johnson [21]). The reduction given by Z transforms this system into
$$y_{n+1} = \begin{pmatrix} 1 + \alpha\left(z_{11}z_{12} - c z_{11}^2\right) & c + \alpha\left(-c z_{11}z_{12} + z_{12}^2\right) \\ -\alpha z_{11}^2 & 1 - \alpha z_{11}z_{12} \end{pmatrix} y_n, \qquad \theta_{n+1} = \theta_n + 2\pi\omega, \tag{23}$$
where y_n ∈ R^2 are the new variables. The z_{ij} are the elements of the matrix Z and we have used the relations given by (21) and the special form of A and B. In the same calculation, we also see that (z_{11}(2πnω))_{n∈Z} is a quasi-periodic solution of Eq. (17) and that it is not identically zero. Using averaging theory (see, for example, Arnol'd [1]), system (23) can be transformed into
$$y_{n+1} = \left[\begin{pmatrix} 1 + \alpha\left([z_{11}z_{12}] - c[z_{11}^2]\right) & c + \alpha\left(-c[z_{11}z_{12}] + [z_{12}^2]\right) \\ -\alpha[z_{11}^2] & 1 - \alpha[z_{11}z_{12}] \end{pmatrix} + M\right] y_n, \qquad \theta_{n+1} = \theta_n + 2\pi\omega \tag{24}$$
by means of a conjugation in SL(2, R), with M analytic in both θ and α (in some narrower domains) and of order α^2. The time-independent part of the above system is hyperbolic if cα < 0. Therefore, if |α| ≠ 0 is small enough, the time-dependent system (24) has an exponential dichotomy for cα < 0. Hence 2a^k/b + α does not belong to σ_{4/b}. Since this works for all a^k (which are dense in the spectrum), σ_{4/b} is a Cantor set. By duality the result is also true for σ_b. This ends the proof of Corollary 1.
Remark 1. The same can be done for the operator H_{b,π} instead of H_{b,0}. In this case the Floquet matrix has trace −2. The corresponding point eigenvalues correspond to ends of non-collapsed gaps and are dense in the spectrum.
Acknowledgements. The author is indebted to Hakan Eliasson, Raphael Krikorian and Carles Simó for stimulating discussions and comments on this problem. He would like to thank the anonymous referee for his useful suggestions. He also wants to thank the Centre de Mathématiques at the École Polytechnique for hospitality. Help from the Catalan grant 2000FI71UBPG and grants DGICYT BFM2000-805 (Spain) and CIRIT 2000 SGR-27, 2001 SGR-70 (Catalonia) is also acknowledged.
References
1. Arnol'd, V.I.: Geometrical methods in the theory of ordinary differential equations. Vol. 250 of Grundlehren der Mathematischen Wissenschaften. New York: Springer-Verlag, 1983
2. Avila, A., Krikorian, R.: Reducibility or non-uniform hyperbolicity for quasiperiodic Schrödinger cocycles. Preprint, 2003
3. Avron, J., Simon, B.: Almost periodic Schrödinger operators II. The integrated density of states. Duke Math. J. 50, 369–391 (1983)
4. Azbel, M.Ya.: Energy spectrum of a conduction electron in a magnetic field. Soviet Phys. JETP 19, 634–645 (1964)
5. Bellissard, J., Simon, B.: Cantor spectrum for the almost Mathieu equation. J. Funct. Anal. 48(3), 408–419 (1982)
6. Broer, H.W., Puig, J., Simó, C.: Resonance tongues and instability pockets in the quasi-periodic Hill-Schrödinger equation. Commun. Math. Phys. 241(2-3), 467–503 (2003)
7. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. The Probability and its Applications. Basel-Boston: Birkhäuser, 1990
8. Choi, M.D., Elliott, G.A., Yui, N.: Gauss polynomials and the rotation algebra. Invent. Math. 99(2), 225–246 (1990)
9. Coddington, E.A., Levinson, N.: Theory of ordinary differential equations. New York-Toronto-London: McGraw-Hill Book Company, Inc., 1955
10. Coppel, W.A.: Dichotomies in stability theory. Lecture Notes in Mathematics, Vol. 629, Berlin: Springer-Verlag, 1978
11. De Concini, C., Johnson, R.A.: The algebraic-geometric AKNS potentials. Ergodic Theory Dynam. Syst. 7(1), 1–24 (1987)
12. Delyon, F., Souillard, B.: The rotation number for finite difference operators and its properties. Commun. Math. Phys. 89(3), 415–426 (1983)
13. Dinaburg, E.I., Sinai, Y.G.: The one-dimensional Schrödinger equation with quasi-periodic potential. Funkt. Anal. i. Priloz. 9, 8–21 (1975)
14. Eliasson, L.H.: One-dimensional quasi-periodic Schrödinger operators – dynamical systems and spectral theory. In: European Congress of Mathematics, Vol. I (Budapest, 1996), Basel: Birkhäuser, 1998, pp. 178–190
15. Eliasson, L.H.: Reducibility and point spectrum for linear quasi-periodic skew-products. In: Proceedings of the International Congress of Mathematicians, Vol. II (Berlin, 1998), number Extra Vol. II, (electronic), 1998, pp. 779–787
16. Eliasson, L.H.: On the discrete one-dimensional quasi-periodic Schrödinger equation and other smooth quasi-periodic skew products. In: Hamiltonian systems with three or more degrees of freedom (S’Agaró, 1995), Volume 533 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci. Dordrecht: Kluwer Acad. Publ., 1999, pp. 55–61
17. Eliasson, L.H.: Floquet solutions for the one-dimensional quasi-periodic Schrödinger equation. Commun. Math. Phys. 146, 447–482 (1992)
18. Herman, M.R.: Une méthode pour minorer les exposants de Lyapunov et quelques exemples montrant le caractère local d’un théorème d’Arnold et de Moser sur le tore de dimension 2. Comment. Math. Helv. 58(3), (1983)
19. Ince, E.L.: Ordinary Differential Equations. New York: Dover Publications, 1944
20. Jitomirskaya, S.Y.: Metal-insulator transition for the almost Mathieu operator. Ann. Math. (2) 150(3), 1159–1175 (1999)
21. Johnson, R.: The recurrent Hill’s equation. J. Diff. Eq. 46, 165–193 (1982)
22. Johnson, R.: Cantor spectrum for the quasi-periodic Schrödinger equation. J. Diff. Eq. 91, 88–110 (1991)
23. Johnson, R., Moser, J.: The rotation number for almost periodic potentials. Commun. Math. Phys. 84, 403–438 (1982)
24. Johnson, R.A.: A review of recent work on almost periodic differential and difference operators. Acta Appl. Math. 1(3), 241–261 (1983)
25. Krikorian, R.: Reducibility, differentiable rigidity and Lyapunov exponents for quasi-periodic cocycles on T × SL(2, R). Preprint
26. Last, Y.: Zero measure spectrum for the almost Mathieu operator. Commun. Math. Phys. 164(2), 421–432 (1994)
27. Last, Y.: Almost everything about the almost Mathieu operator. I. In: XIth International Congress of Mathematical Physics (Paris, 1994), Cambridge MA: Internat. Press, 1995, pp. 366–372
28. Moser, J.: An example of Schrödinger equation with almost periodic potential and nowhere dense spectrum. Comment. Math. Helv. 56, 198–224 (1981)
29. Moser, J., Pöschel, J.: An extension of a result by Dinaburg and Sinai on quasi-periodic potentials. Comment. Math. Helv. 59, 39–85 (1984)
30. Puig, J.: Reducibility of linear differential equations with quasi-periodic coefficients: A survey. Barcelona: Preprint University of Barcelona, 2002, Available at http://www.maia.ub.es/~puig/preprints/qpred.ps
31. Puig, J., Simó, C.: Analytic families of reducible linear quasi-periodic equations. In progress 2003
32. Simon, B.: Almost periodic Schrödinger operators: A review. Adv. Appl. Math. 3(4), 463–490 (1982)
33. Simon, B.: Schrödinger operators in the twenty-first century. In: Mathematical physics 2000, London: Imp. Coll. Press, 2000, pp. 283–288
34. Sinai, Ya.G.: Anderson localization for one-dimensional difference Schrödinger operator with quasiperiodic potential. J. Statist. Phys. 46(5-6), 861–909 (1987)
35. van Mouche, P.: The coexistence problem for the discrete Mathieu operator. Commun. Math. Phys. 122(1), 23–33 (1989)
36. Yakubovich, V.A., Starzhinskii, V.M.: Linear differential equations with periodic coefficients. 1, 2. New York-Toronto, Ont: Halsted Press [John Wiley & Sons], 1975
Communicated by B. Simon
Commun. Math. Phys. 244, 311–334 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0975-5
Communications in
Mathematical Physics
Towards a Quantum Analog of Weak KAM Theory Lawrence C. Evans Department of Mathematics, University of California, Berkeley, CA 94720, USA Received: 3 February 2003 / Accepted: 31 July 2003 Published online: 18 November 2003 – © Springer-Verlag 2003
Abstract: We discuss a quantum analogue of Mather’s minimization principle for Lagrangian dynamics, and provide some formal calculations suggesting the corresponding Euler–Lagrange equation. We then rigorously construct from the dual eigenfunctions of a certain non-selfadjoint operator a candidate ψ for a minimizer, and recover aspects of “weak KAM” theory in the limit as h → 0. Regarding our state ψ as a quasimode, we furthermore derive some error estimates, although it remains an open problem to improve these bounds.
1. Introduction
This paper proposes an extension of Mather’s variational principle [M1, M2, M-F] and Fathi’s weak KAM theory [F1, F2, F3] to quantum states. We interpret “weak KAM” theory to mean the application of nonlinear PDE methods, mostly for first–order equations, towards understanding the structure of action minimizing measures solving Mather’s problem. As explained in the introduction to [E-G], a goal is interpreting these measures as providing a sort of “integrable structure”, governed by an associated “effective Hamiltonian” H̄, in the midst of otherwise possibly very chaotic dynamics. The relevant PDE are a nonlinear eikonal equation and an associated continuity (or transport) equation. In this work we attempt to extend this viewpoint and some of the techniques to a quantum setting, in the semiclassical limit. We do so by suggesting an analogue of Mather’s action minimization problem for the Lagrangian L(v, x) = (1/2)|v|^2 − W(x), where the potential W is periodic, and formally computing the first and second variations. Thus motivated, we next build a candidate state ψ for a minimizer and discuss at length its properties. As in the nonquantum case, we come fairly naturally upon an eikonal PDE (with some extra terms) and an exact continuity equation. We next send h → 0 and show how the usual structure of weak KAM theory appears in this limit.
Supported in part by NSF Grant DMS-0070480 and by the Miller Institute for Basic Research in Science, UC Berkeley
More interesting is understanding if our ψ is a good quasimode, that is, a decent approximate solution of an appropriate eigenvalue problem. This turns out to be so, although our error bounds are too weak to allow for any deductions about the spectrum. Much of the interest in the following calculations centers upon our minimizing, subject to certain side conditions, the expected value of the Lagrangian, namely the expression
$$\int_{\mathbb{T}^n} \frac{h^2}{2}|D\psi|^2 - W|\psi|^2 \, dx,$$
and not the expected value of the Hamiltonian,
$$\int_{\mathbb{T}^n} \frac{h^2}{2}|D\psi|^2 + W|\psi|^2 \, dx.$$
It will turn out that, owing to the constraints, a minimizer of the former is approximately a critical point of the latter. But the key question, unresolved here, is determining when the error terms are of order, say, o(h) in L^2. The calculations below represent improvements upon some ideas developed earlier in [E1]. An interesting recent paper of Anantharaman [A] presents a somewhat similar approach within a probabilistic framework, and Holcman and Kupka’s forthcoming paper [H-K] is related. Likewise, Gomes [G] found some related constructions for his stochastic analogue of Aubry-Mather theory. We later discuss also some formal connections with the “stochastic mechanics” approach to quantum mechanics of Nelson [N] and also with homogenization theory for divergence–structure second–order elliptic PDE.
Action minimizing measures. Hereafter T^n denotes the flat torus in R^n, the unit cube with opposite faces identified. We are given a smooth and periodic potential function W : T^n → R and a vector V ∈ R^n. The Lagrangian is
$$L(v, x) := \frac{1}{2}|v|^2 - W(x) \qquad (v \in \mathbb{R}^n,\ x \in \mathbb{T}^n),$$
and the corresponding Hamiltonian is
$$H(p, x) := \frac{1}{2}|p|^2 + W(x) \qquad (p \in \mathbb{R}^n,\ x \in \mathbb{T}^n).$$
Mather’s minimization problem is to find a Radon measure µ on the velocity–position configuration space R^n × T^n to minimize the generalized action
$$A[\mu] := \int_{\mathbb{R}^n}\int_{\mathbb{T}^n} \frac{1}{2}|v|^2 - W \, d\mu, \tag{1.1}$$
subject to the requirements that
$$\mu \ge 0, \qquad \mu(\mathbb{R}^n \times \mathbb{T}^n) = 1, \tag{1.2}$$
$$\int_{\mathbb{R}^n}\int_{\mathbb{T}^n} v \cdot D\phi \, d\mu = 0 \qquad \text{for all } \phi \in C^1(\mathbb{T}^n), \tag{1.3}$$
and
$$\int_{\mathbb{R}^n}\int_{\mathbb{T}^n} v \, d\mu = V. \tag{1.4}$$
The identity (1.3) is a weak form of flow invariance.
Towards a Quantum Analog of Weak KAM Theory
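Quantum action minimizing states. We propose as a quantum version of Mather’s problem to find $\psi$ minimizing the action
$$A[\psi] := \int_{\mathbb{T}^n} \frac{h^2}{2}|D\psi|^2 - W|\psi|^2 \, dx, \tag{1.5}$$
subject to the constraints that
$$\int_{\mathbb{T}^n} |\psi|^2 \, dx = 1, \tag{1.6}$$
$$\int_{\mathbb{T}^n} (\bar\psi D\psi - \psi D\bar\psi) \cdot D\phi \, dx = 0 \quad \text{for all } \phi \in C^1(\mathbb{T}^n), \tag{1.7}$$
and
$$\frac{h}{2i}\int_{\mathbb{T}^n} \bar\psi D\psi - \psi D\bar\psi \, dx = V. \tag{1.8}$$
Here $h$ denotes a positive constant. We always suppose that $\psi$ has the Bloch wave form
$$\psi = e^{\frac{iP\cdot x}{h}}\hat\psi$$
for some $P \in \mathbb{R}^n$ and a periodic function $\hat\psi : \mathbb{T}^n \to \mathbb{C}$. If $\psi$ is smooth, condition (1.7) reads
$$\operatorname{div}(\bar\psi D\psi - \psi D\bar\psi) = 0. \tag{1.7$'$}$$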
This is the analogue of the flow invariance. The vector field $j := \bar\psi D\psi - \psi D\bar\psi$ represents the flux.

Remark. While it is presumably possible to introduce some sort of quantization for more general Lagrangians than $L(v, x) = \frac{1}{2}|v|^2 - W(x)$, most of the subsequent analysis would fail: we will from Sect. 3 onward rely upon some Cole-Hopf type transformations that depend upon the precise structure of this Lagrangian.

2. First and Second Variations, Local Minimizers

In this section we provide some formal calculations concerning the first and second variations of our problem (1.5)–(1.8). These heuristic deductions motivate the constructions and computations in Sects. 3–8. Let us take the complex-valued state in polar form
$$\psi = ae^{iu/h}, \tag{2.1}$$
where the phase $u$ has the structure
$$u = P\cdot x + v \tag{2.2}$$
for some $\mathbb{T}^n$-periodic function $v$. Thus $\psi$ has the requisite Bloch wave form. The action is then
$$A[\psi] = \int_{\mathbb{T}^n} \frac{h^2}{2}|Da|^2 + \frac{a^2}{2}|Du|^2 - Wa^2 \, dx, \tag{2.3}$$
and the constraints (1.6)–(1.8) become
$$\int_{\mathbb{T}^n} a^2 \, dx = 1, \tag{2.4}$$
$$\operatorname{div}(a^2 Du) = 0, \tag{2.5}$$
$$\int_{\mathbb{T}^n} a^2 Du \, dx = V. \tag{2.6}$$
2.1. First variation. Let $\{(u(\tau), a(\tau))\}_{-1\le\tau\le1}$ be a smooth one-parameter family satisfying (2.4)–(2.6), with $(u(0), a(0)) = (u, a)$. We suppose also that for each $\tau \in (-1, 1)$, we can write $u(\tau) = P(\tau)\cdot x + v(\tau)$, where $P(\tau) \in \mathbb{R}^n$ and $v(\tau)$ is $\mathbb{T}^n$-periodic. Define
$$j(\tau) := \int_{\mathbb{T}^n} \frac{h^2}{2}|Da(\tau)|^2 + \frac{a^2(\tau)}{2}|Du(\tau)|^2 - Wa^2(\tau) \, dx, \tag{2.7}$$
and hereafter write $' = \frac{d}{d\tau}$.

Theorem 2.1. We have $j'(0) = 0$ for all variations if and only if
$$-\frac{h^2}{2}\Delta a = a\left(\frac{|Du|^2}{2} + W - E\right) \tag{2.8}$$
for some real number $E$.

We interpret (2.8) as the Euler–Lagrange equation for our minimization problem, and call $\psi = ae^{iu/h}$ a critical point if this PDE is satisfied.

Proof. 1. We first compute
$$j' = \int_{\mathbb{T}^n} h^2 Da\cdot Da' + aa'|Du|^2 + a^2 Du\cdot Du' - 2Waa' \, dx.$$
Next, differentiate (2.5), (2.6):
$$\operatorname{div}(2aa'Du + a^2Du') = 0, \tag{2.9}$$
$$\int_{\mathbb{T}^n} 2aa'Du + a^2Du' \, dx = 0, \tag{2.10}$$
and set $\tau = 0$. Recall also that $Du = P + Dv$. Multiply (2.9) by $v$, integrate by parts, then take the inner product of (2.10) with $P$. Add the resulting expressions to find
$$\int_{\mathbb{T}^n} 2aa'|Du|^2 + a^2 Du\cdot Du' \, dx = 0.$$
Hence
$$j'(0) = \int_{\mathbb{T}^n} h^2 Da\cdot Da' - aa'|Du|^2 - 2Waa' \, dx = 2\int_{\mathbb{T}^n} \left(-\frac{h^2}{2}\Delta a - a\left(\frac{|Du|^2}{2} + W\right)\right)a' \, dx.$$
Then $j'(0) = 0$ for all such $a'$ provided
$$-\frac{h^2}{2}\Delta a - \left(\frac{|Du|^2}{2} + W\right)a = -Ea,$$
for some real constant $E$. This is so since the variation $a'$ must satisfy the identity
$$\int_{\mathbb{T}^n} a'a \, dx = 0, \tag{2.11}$$
which we obtain upon differentiating (2.4).
Remark. The foregoing deduction depends upon the implicit assumption that we can construct a wide enough class of variations to permit our concluding (2.8) from the integral identities involving $a'$. We will return to this point in Sect. 8.
2.2. Second variation. We next differentiate $j$ twice with respect to $\tau$:

Theorem 2.2. If $\psi = ae^{iu/h}$ is a critical point, then
$$j''(0) = \int_{\mathbb{T}^n} h^2|Da'|^2 + a^2|Du'|^2 - 2(a')^2\left(\frac{|Du|^2}{2} + W - E\right) dx. \tag{2.12}$$

Proof. 1. We have
$$j'' = \int_{\mathbb{T}^n} h^2|Da'|^2 + h^2 Da\cdot Da'' + (a')^2|Du|^2 + aa''|Du|^2 + 4aa'Du\cdot Du' + a^2|Du'|^2 + a^2Du\cdot Du'' - 2W(a')^2 - 2Waa'' \, dx. \tag{2.13}$$
Differentiating (2.9), (2.10) again, we find
$$\operatorname{div}\left(2(a')^2Du + 2aa''Du + 4aa'Du' + a^2Du''\right) = 0 \tag{2.14}$$
and
$$\int_{\mathbb{T}^n} 2(a')^2Du + 2aa''Du + 4aa'Du' + a^2Du'' \, dx = 0. \tag{2.15}$$
Set $\tau = 0$. Multiply (2.14) by $v$, integrate, multiply (2.15) by $P$, and add:
$$\int_{\mathbb{T}^n} 2(a')^2|Du|^2 + 2aa''|Du|^2 + 4aa'Du\cdot Du' + a^2Du\cdot Du'' \, dx = 0. \tag{2.16}$$

2. We employ this equality in (2.13):
$$\begin{aligned}
j''(0) &= \int_{\mathbb{T}^n} 2a''\left(-\frac{h^2}{2}\Delta a - a\left(\frac{|Du|^2}{2} + W\right)\right) + h^2|Da'|^2 - (a')^2|Du|^2 + a^2|Du'|^2 - 2W(a')^2 \, dx \\
&= \int_{\mathbb{T}^n} 2a''(-Ea) + h^2|Da'|^2 - 2(a')^2\left(\frac{|Du|^2}{2} + W\right) + a^2|Du'|^2 \, dx \\
&= \int_{\mathbb{T}^n} h^2|Da'|^2 + a^2|Du'|^2 - 2(a')^2\left(\frac{|Du|^2}{2} + W - E\right) dx.
\end{aligned}$$
We have used here the identity
$$\int_{\mathbb{T}^n} a''a + (a')^2 \, dx = 0,$$
derived by twice differentiating (2.4).
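2.3. Local minimizers. We continue to write $\psi = ae^{iu/h}$, and now assume as well that
$$a > 0 \quad \text{in } \mathbb{T}^n. \tag{2.17}$$
Observe that this follows from (2.8) and the strong maximum principle, provided $a$ and $u$ are smooth enough.

Theorem 2.3. If $\psi = ae^{iu/h}$ is a critical point and (2.17) holds, then
$$j''(0) = \int_{\mathbb{T}^n} a^2|Du'|^2 + h^2a^2\left|D\left(\frac{a'}{a}\right)\right|^2 dx > 0, \tag{2.18}$$
provided $a' \not\equiv 0$. In this case we call $\psi = ae^{iu/h}$ a local minimizer.

Proof. The Euler–Lagrange equation (2.8) asserts that
$$-\frac{h^2}{2}\frac{\Delta a}{a} = \frac{|Du|^2}{2} + W - E.$$
Hence
$$\begin{aligned}
j''(0) &= \int_{\mathbb{T}^n} a^2|Du'|^2 + h^2|Da'|^2 - 2(a')^2\left(-\frac{h^2}{2}\frac{\Delta a}{a}\right) dx \\
&= \int_{\mathbb{T}^n} a^2|Du'|^2 + h^2|Da'|^2 + h^2(a')^2\frac{|Da|^2}{a^2} - 2h^2a'\frac{Da\cdot Da'}{a} \, dx \\
&= \int_{\mathbb{T}^n} a^2|Du'|^2 + h^2a^2\left|D\left(\frac{a'}{a}\right)\right|^2 dx.
\end{aligned}$$
The last term is strictly positive, unless $a' \equiv \lambda a$ for some constant $\lambda \neq 0$. But this is impossible, since $\int_{\mathbb{T}^n} a'a \, dx = 0$.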
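The integration by parts behind the last two lines of this proof can be sanity-checked numerically. The following is a minimal sketch, assuming one space dimension, $h = 1$, and hypothetical test functions (the amplitude `a`, and `b` standing in for the variation $a'$); none of these choices come from the paper. It compares $\int |b'|^2 + b^2 a''/a \, dx$ with $\int a^2 |(b/a)'|^2 \, dx$ over one period.

```python
import math

def completing_the_square_gap(n=400):
    """Return both sides of the integrated identity on the 1-D torus [0, 1)."""
    two_pi = 2.0 * math.pi
    lhs = 0.0
    rhs = 0.0
    for k in range(n):
        x = k / n
        # Hypothetical amplitude a > 0 and its first two derivatives.
        a = 1.0 + 0.3 * math.cos(two_pi * x)
        da = -0.3 * two_pi * math.sin(two_pi * x)
        d2a = -0.3 * two_pi ** 2 * math.cos(two_pi * x)
        # Hypothetical variation b (playing the role of a') and its derivative.
        b = math.sin(two_pi * x) + 0.5 * math.cos(2.0 * two_pi * x)
        db = two_pi * math.cos(two_pi * x) - two_pi * math.sin(2.0 * two_pi * x)
        # Left side: |b'|^2 + b^2 a''/a.  Right side: a^2 |(b/a)'|^2.
        lhs += db * db + b * b * d2a / a
        quotient_derivative = (db * a - b * da) / (a * a)
        rhs += a * a * quotient_derivative ** 2
    # Equally spaced sums are very accurate for smooth periodic integrands.
    return lhs / n, rhs / n

if __name__ == "__main__":
    left, right = completing_the_square_gap()
    print(left, right, abs(left - right))
```

The identity itself does not use the constraint $\int a'a\,dx = 0$; that constraint only enters when ruling out $a' \equiv \lambda a$.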
3. Some Useful Identities

Motivated by the foregoing calculations, our aim now is constructing an explicit state $\psi$, which will turn out to be a critical point, and indeed a local minimizer, of $A[\cdot]$, subject to (1.6)–(1.8). We start with two linear problems.
3.1. Dual eigenfunctions. Consider the dual eigenvalue problems:
$$\begin{cases} -\frac{h^2}{2}\Delta w + hP\cdot Dw - Ww = E^0w & \text{in } \mathbb{T}^n \\ w \text{ is } \mathbb{T}^n\text{-periodic} \end{cases} \tag{3.1}$$
and
$$\begin{cases} -\frac{h^2}{2}\Delta w^* - hP\cdot Dw^* - Ww^* = E^0w^* & \text{in } \mathbb{T}^n \\ w^* \text{ is } \mathbb{T}^n\text{-periodic}, \end{cases} \tag{3.2}$$
where $E^0 = E^0(P) \in \mathbb{R}$ is the principal eigenvalue. Note carefully the minus signs in front of the potential $W$. We may assume the real eigenfunctions $w, w^*$ to be positive in $\mathbb{T}^n$ and normalized so that
$$\int_{\mathbb{T}^n} ww^* \, dx = 1. \tag{3.3}$$
Furthermore, we can take $w$, $w^*$ and $E^0$ to be smooth in both the variables $x$ and $P$. We employ a form of the Cole–Hopf transformation, to define
$$v := -h\log w, \qquad v^* := h\log w^*. \tag{3.4}$$
Then
$$w = e^{-v/h}, \qquad w^* = e^{v^*/h}, \tag{3.5}$$
and a calculation shows that
$$\begin{cases} -\frac{h}{2}\Delta v + \frac{1}{2}|P + Dv|^2 + W = \bar H_h(P) & \text{in } \mathbb{T}^n \\ v \text{ is } \mathbb{T}^n\text{-periodic} \end{cases} \tag{3.6}$$
and
$$\begin{cases} \frac{h}{2}\Delta v^* + \frac{1}{2}|P + Dv^*|^2 + W = \bar H_h(P) & \text{in } \mathbb{T}^n \\ v^* \text{ is } \mathbb{T}^n\text{-periodic}, \end{cases} \tag{3.7}$$
for
$$\bar H_h(P) := \frac{|P|^2}{2} - E^0(P).$$
Standard PDE estimates applied to (3.6) and (3.7) provide the bounds
$$|Dv|, |Dv^*| \le C, \tag{3.8}$$
for a constant $C$ depending only upon $P$ and the potential $W$.
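The Cole–Hopf calculation linking (3.1) and (3.6) can be checked numerically in one dimension. The sketch below is illustrative only: it picks a hypothetical periodic $v$, assigns an arbitrary value to $\bar H_h(P)$, defines $W$ so that (3.6) holds exactly, and then measures the residual of (3.1) for $w = e^{-v/h}$ with $E^0 = |P|^2/2 - \bar H_h(P)$, using central differences. All parameter values and function names are assumptions made for the test, not data from the paper.

```python
import math

H = 0.5          # the parameter h
P = 0.3          # the vector P, a scalar in one dimension
HBAR = 1.0       # arbitrary value assigned to \bar H_h(P) for this test
TWO_PI = 2.0 * math.pi

def v(x):
    return 0.2 * math.sin(TWO_PI * x)

def dv(x):
    return 0.2 * TWO_PI * math.cos(TWO_PI * x)

def d2v(x):
    return -0.2 * TWO_PI ** 2 * math.sin(TWO_PI * x)

def potential(x):
    # W is defined so that (3.6) holds exactly for the chosen v and HBAR.
    return HBAR + 0.5 * H * d2v(x) - 0.5 * (P + dv(x)) ** 2

def w(x):
    # Cole-Hopf transform (3.5).
    return math.exp(-v(x) / H)

def residual(x, step=1e-4):
    # Residual of (3.1), with derivatives of w taken by central differences.
    dw = (w(x + step) - w(x - step)) / (2.0 * step)
    d2w = (w(x + step) - 2.0 * w(x) + w(x - step)) / step ** 2
    e0 = 0.5 * P * P - HBAR
    return -0.5 * H * H * d2w + H * P * dw - potential(x) * w(x) - e0 * w(x)

def max_residual(samples=50):
    return max(abs(residual(k / samples)) for k in range(samples))

if __name__ == "__main__":
    print(max_residual())
```

The residual is limited only by the finite-difference error, which is the expected behaviour if the transformation is carried out with the signs shown in (3.4)–(3.6).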
3.2. Continuity and eikonal equations. Define
$$\sigma := ww^* \tag{3.9}$$
and
$$u := P\cdot x + \frac{v + v^*}{2}. \tag{3.10}$$
Note that although $w$, $w^*$, $v$, $v^*$, $u$ and $\sigma$ depend on $h$, we will for notational simplicity mostly not write these functions with a subscript $h$. The importance of the product (3.9) of the eigenfunctions is noted also in Anantharaman [A]. According to (3.3),
$$\sigma > 0 \ \text{in } \mathbb{T}^n, \qquad \int_{\mathbb{T}^n} \sigma \, dx = 1.$$

Theorem 3.1. (i) We have
$$\operatorname{div}(\sigma Du) = 0 \quad \text{in } \mathbb{T}^n. \tag{3.11}$$
(ii) Furthermore,
$$\frac{1}{2}|Du|^2 + W - \bar H_h(P) = \frac{h}{4}(\Delta v - \Delta v^*) - \frac{1}{8}|Dv - Dv^*|^2 \quad \text{in } \mathbb{T}^n. \tag{3.12}$$

We call (3.11) the continuity (or transport) equation, and regard (3.12) as an eikonal equation with an error term on the right-hand side.

Proof. 1. We compute
$$\begin{aligned}
h\operatorname{div}(w^*Dw - wDw^*) &= h(w^*\Delta w - w\Delta w^*) \\
&= \frac{2}{h}\left(w^*\frac{h^2}{2}\Delta w - w\frac{h^2}{2}\Delta w^*\right) \\
&= \frac{2}{h}\left(w^*(-E^0w - Ww + hP\cdot Dw) - w(-E^0w^* - Ww^* - hP\cdot Dw^*)\right) \\
&= 2(w^*P\cdot Dw + wP\cdot Dw^*) = 2P\cdot D\sigma.
\end{aligned}$$
But
$$w^*Dw - wDw^* = w^*\left(-\frac{Dv}{h}\right)w - w\left(\frac{Dv^*}{h}\right)w^* = -\frac{1}{h}\sigma(Dv + Dv^*),$$
and therefore
$$P\cdot D\sigma + \operatorname{div}\left(\frac{1}{2}\sigma D(v + v^*)\right) = 0.$$
This is (3.11).
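2. Recalling the formula
$$\frac{1}{2}|a - b|^2 + \frac{1}{2}|a + b|^2 = |a|^2 + |b|^2,$$
we compute
$$\frac{1}{2}\left|P + \frac{1}{2}D(v + v^*)\right|^2 = \frac{1}{8}|(P + Dv) + (P + Dv^*)|^2 = \frac{1}{4}|P + Dv|^2 + \frac{1}{4}|P + Dv^*|^2 - \frac{1}{8}|Dv - Dv^*|^2.$$
Hence
$$\begin{aligned}
\frac{1}{2}|Du|^2 + W - \bar H_h(P) &= \frac{1}{2}\left(\frac{1}{2}|P + Dv|^2 + W - \bar H_h(P)\right) + \frac{1}{2}\left(\frac{1}{2}|P + Dv^*|^2 + W - \bar H_h(P)\right) - \frac{1}{8}|D(v - v^*)|^2 \\
&= \frac{1}{2}\left(\frac{h}{2}\Delta v\right) + \frac{1}{2}\left(-\frac{h}{2}\Delta v^*\right) - \frac{1}{8}|D(v - v^*)|^2,
\end{aligned}$$
owing to (3.6), (3.7).

Remark. We also have the identities
$$-\frac{h}{2}\Delta\sigma - \operatorname{div}((P + Dv)\sigma) = 0, \tag{3.13}$$
$$-\frac{h}{2}\Delta\sigma + \operatorname{div}((P + Dv^*)\sigma) = 0. \tag{3.14}$$
For a quick derivation, observe first that
$$\frac{h}{2}\Delta\sigma = \operatorname{div}\left(\frac{1}{2}D(v^* - v)\sigma\right).$$
Add and subtract this from (3.11).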
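The continuity equation (3.11) has an exact discrete counterpart, which the following sketch illustrates in one dimension. Everything here is an assumption made for illustration: the potential $W(x) = \cos 2\pi x$, the parameter values, the central-difference discretization, and the use of power iteration on $I - \Delta t\,L$ to find the principal eigenvectors. With central differences the discrete version of (3.2) is the transpose of that of (3.1), so both share the same principal eigenvalue and the discrete flux, which approximates $\sigma u'$, is constant across the grid.

```python
import math

def principal_eigenvector(apply_op, n, dt, steps):
    """Power iteration on I - dt*L; converges to the positive principal eigenvector."""
    w = [1.0] * n
    for _ in range(steps):
        lw = apply_op(w)
        w = [wi - dt * li for wi, li in zip(w, lw)]
        scale = sum(w) / n
        w = [wi / scale for wi in w]
    return w

def discrete_flux(n=32, h=0.5, p=0.3, dt=2e-3, steps=10000):
    dx = 1.0 / n
    pot = [math.cos(2.0 * math.pi * i * dx) for i in range(n)]  # hypothetical W
    diff = 0.5 * h * h / dx ** 2
    drift = h * p / (2.0 * dx)

    def op(sign):
        # sign=+1.0 discretizes the operator in (3.1); sign=-1.0 is its
        # transpose, which discretizes the operator in (3.2).
        def apply_op(w):
            out = []
            for i in range(n):
                left, right = w[i - 1], w[(i + 1) % n]   # periodic neighbours
                out.append(-diff * (right - 2.0 * w[i] + left)
                           + sign * drift * (right - left)
                           - pot[i] * w[i])
            return out
        return apply_op

    w = principal_eigenvector(op(+1.0), n, dt, steps)
    ws = principal_eigenvector(op(-1.0), n, dt, steps)
    norm = sum(a * b for a, b in zip(w, ws)) * dx        # normalization (3.3)
    ws = [b / norm for b in ws]

    flux = []
    for i in range(n):
        j = (i + 1) % n
        # Discrete analogue of h * sigma * u' on the cell edge between i and i+1.
        f = (-0.5 * h * h / dx * (ws[i] * w[j] - w[i] * ws[j])
             + 0.5 * h * p * (ws[i] * w[j] + w[i] * ws[j]))
        flux.append(f / h)
    return flux

if __name__ == "__main__":
    values = discrete_flux()
    print(min(values), max(values))
```

In one dimension $\operatorname{div}(\sigma Du) = 0$ says that $\sigma u'$ is constant, so the printed minimum and maximum should agree up to the accuracy of the iteration.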
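3.3. Integral identities involving $Du$ and $D^2u$. To simplify notation, we will hereafter write
$$d\sigma := \sigma \, dx. \tag{3.15}$$

Theorem 3.2. These formulas hold:
$$\int_{\mathbb{T}^n} \frac{1}{2}|Du|^2 + W \, d\sigma = \bar H_h(P) + \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2 \, d\sigma, \tag{3.16}$$
$$\int_{\mathbb{T}^n} |D^2u|^2 + \frac{1}{4}|D^2v - D^2v^*|^2 \, d\sigma = -\int_{\mathbb{T}^n} \Delta W \, d\sigma. \tag{3.17}$$

Proof. 1. In view of (3.12),
$$\int_{\mathbb{T}^n} \frac{1}{2}|Du|^2 + W - \bar H_h(P) \, d\sigma = \frac{h}{4}\int_{\mathbb{T}^n} \Delta(v - v^*)\sigma \, dx - \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2\sigma \, dx.$$
But $\sigma = ww^* = e^{\frac{v^* - v}{h}}$, and therefore
$$\begin{aligned}
\int_{\mathbb{T}^n} \frac{1}{2}|Du|^2 + W - \bar H_h(P) \, d\sigma &= -\frac{h}{4}\int_{\mathbb{T}^n} D(v - v^*)\cdot\frac{D(v^* - v)}{h}\sigma \, dx - \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2\sigma \, dx \\
&= \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2 \, d\sigma.
\end{aligned}$$

2. Now differentiate the identity (3.12) twice with respect to $x_k$:
$$Du\cdot Du_{x_kx_k} + Du_{x_k}\cdot Du_{x_k} + W_{x_kx_k} = \frac{h}{4}\Delta(v - v^*)_{x_kx_k} - \frac{1}{4}D(v - v^*)\cdot D(v - v^*)_{x_kx_k} - \frac{1}{4}D(v - v^*)_{x_k}\cdot D(v - v^*)_{x_k}.$$
Multiply by $\sigma$ and integrate:
$$\begin{aligned}
\int_{\mathbb{T}^n} Du\cdot Du_{x_kx_k} + Du_{x_k}\cdot Du_{x_k} + W_{x_kx_k} \, d\sigma &= \frac{h}{4}\int_{\mathbb{T}^n} \Delta(v - v^*)_{x_kx_k} \, d\sigma - \frac{1}{4}\int_{\mathbb{T}^n} D(v - v^*)\cdot D(v - v^*)_{x_kx_k} \, d\sigma \\
&\quad - \frac{1}{4}\int_{\mathbb{T}^n} D(v - v^*)_{x_k}\cdot D(v - v^*)_{x_k} \, d\sigma.
\end{aligned}$$
According to Theorem 3.1, the first term on the left vanishes. We integrate by parts in the first term on the right, and thereby derive an expression that cancels the second term on the right. Next sum on $k$, to derive (3.17).

4. First and Second Derivatives of $\bar H_h$

As explained in [E-G], the behavior of various expressions as functions of $P$ is important:

Theorem 4.1. We have
$$D\bar H_h(P) = \int_{\mathbb{T}^n} Du \, d\sigma, \tag{4.1}$$
$$D^2\bar H_h(P) = \int_{\mathbb{T}^n} D^2_{xP}u \otimes D^2_{xP}u \, d\sigma + \frac{1}{4}\int_{\mathbb{T}^n} D^2_{xP}(v - v^*) \otimes D^2_{xP}(v - v^*) \, d\sigma. \tag{4.2}$$
In particular, $\bar H_h$ is a convex function of $P$.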
Our notation means that the $(l, m)$th component of the first term on the right-hand side of (4.2) is $\int_{\mathbb{T}^n} u_{x_iP_l}u_{x_iP_m} \, d\sigma$ and of the second term is $\frac{1}{4}\int_{\mathbb{T}^n} (v - v^*)_{x_iP_l}(v - v^*)_{x_iP_m} \, d\sigma$.

Proof. 1. We differentiate (3.1) and (3.2) with respect to $P_k$:
$$-\frac{h^2}{2}\Delta w_{P_k} + hP\cdot Dw_{P_k} - Ww_{P_k} - E^0w_{P_k} = E^0_{P_k}w - hw_{x_k}, \tag{4.3}$$
$$-\frac{h^2}{2}\Delta w^*_{P_k} - hP\cdot Dw^*_{P_k} - Ww^*_{P_k} - E^0w^*_{P_k} = E^0_{P_k}w^* + hw^*_{x_k}. \tag{4.4}$$
Multiply (4.3) by $w^*$ and integrate over $\mathbb{T}^n$. Since $w^*$ solves (3.2), we deduce
$$DE^0(P) = h\int_{\mathbb{T}^n} Dw\,w^* \, dx = -\int_{\mathbb{T}^n} Dv\,ww^* \, dx = -\int_{\mathbb{T}^n} Dv \, d\sigma.$$
Similarly, we multiply (4.4) by $w$ and integrate:
$$DE^0(P) = -h\int_{\mathbb{T}^n} Dw^*\,w \, dx = -\int_{\mathbb{T}^n} Dv^* \, d\sigma.$$
Then
$$D\bar H_h(P) = P - DE^0(P) = P + \frac{1}{2}\int_{\mathbb{T}^n} Dv + Dv^* \, d\sigma = \int_{\mathbb{T}^n} Du \, d\sigma.$$

2. Next, differentiate the identity (3.12) with respect to $P_k$ and $P_l$:
$$\bar H_{h,P_kP_l} = Du\cdot Du_{P_kP_l} + Du_{P_k}\cdot Du_{P_l} - \frac{h}{4}\Delta(v - v^*)_{P_kP_l} + \frac{1}{4}D(v - v^*)\cdot D(v - v^*)_{P_kP_l} + \frac{1}{4}D(v - v^*)_{P_k}\cdot D(v - v^*)_{P_l}.$$
Multiply by $\sigma$ and integrate, to discover
$$\begin{aligned}
\bar H_{h,P_kP_l} &= \int_{\mathbb{T}^n} Du\cdot Du_{P_kP_l} + Du_{P_k}\cdot Du_{P_l} \, d\sigma - \frac{h}{4}\int_{\mathbb{T}^n} \Delta(v - v^*)_{P_kP_l} \, d\sigma \\
&\quad + \frac{1}{4}\int_{\mathbb{T}^n} D(v - v^*)\cdot D(v - v^*)_{P_kP_l} \, d\sigma + \frac{1}{4}\int_{\mathbb{T}^n} D(v - v^*)_{P_k}\cdot D(v - v^*)_{P_l} \, d\sigma.
\end{aligned}$$
In view of Theorem 3.1 and the periodicity of $u_{P_kP_l}$, the first term on the right vanishes. Since $\sigma = ww^* = e^{\frac{v^* - v}{h}}$, we can integrate by parts in the third term on the right, obtaining an expression that cancels the fourth term. Formula (4.2) results.

As an application of Theorem 4.1, we modify some ideas from [E-G] and [E2] to discuss the effects of a nonresonance condition on the asymptotics as $h \to 0$. We will explain in the next section that as $h \to 0$ the functions $\bar H_h$ converge uniformly on compact sets to the convex function $\bar H$, the effective Hamiltonian in the sense of Lions–Papanicolaou–Varadhan [L-P-V]. Let us suppose that $\bar H$ is differentiable at $P$ and that $V = D\bar H(P)$ satisfies
$$V\cdot m \neq 0 \quad \text{for each } m \in \mathbb{Z}^n,\ m \neq 0. \tag{4.5}$$
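Theorem 4.2. Suppose also that $D^2\bar H_h(P)$ is bounded as $h \to 0$. Then
$$\lim_{h\to0}\int_{\mathbb{T}^n} \Phi(D_Pu) \, d\sigma = \int_{\mathbb{T}^n} \Phi(X) \, dX \tag{4.6}$$
for each continuous, $\mathbb{T}^n$-periodic function $\Phi$.

We discuss in [E-G] that this statement is consistent with the classical assertion that the Hamiltonian dynamics in the $X, P$ variables, where $X := D_Pu$, correspond to the trivial motion $\dot X = V$, $\dot P = 0$.

Proof. Fix any $m \in \mathbb{Z}^n$, and observe that the function
$$e^{2\pi im\cdot D_Pu} = e^{2\pi im\cdot x}e^{\pi im\cdot D_P(v + v^*)}$$
is $\mathbb{T}^n$-periodic. Consequently
$$\begin{aligned}
0 &= \int_{\mathbb{T}^n} Du\cdot D\left(e^{2\pi im\cdot D_Pu}\right) d\sigma = 2\pi i\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu}m_ku_{x_j}u_{x_jP_k} \, d\sigma \\
&= 2\pi i\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu}m_k\bar H_{h,P_k} \, d\sigma + 2\pi i\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu}m_k(u_{x_j}u_{x_jP_k} - \bar H_{h,P_k}) \, d\sigma \\
&=: 2\pi i(A + B).
\end{aligned} \tag{4.7}$$
We claim now that
$$B = O(h) \quad \text{as } h \to 0. \tag{4.8}$$
To confirm this, notice first that our differentiating (3.12) gives the identity
$$u_{x_j}u_{x_jP_k} - \bar H_{h,P_k} = \frac{h}{4}\Delta(v - v^*)_{P_k} - \frac{1}{4}(v - v^*)_{x_j}(v - v^*)_{x_jP_k};$$
and therefore
$$(u_{x_j}u_{x_jP_k} - \bar H_{h,P_k})\sigma = \frac{h}{4}\left((v - v^*)_{x_jP_k}\sigma\right)_{x_j}.$$
Hence
$$\begin{aligned}
B &= \int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu}m_k(u_{x_j}u_{x_jP_k} - \bar H_{h,P_k})\sigma \, dx \\
&= \frac{h}{4}\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu}m_k\left((v - v^*)_{x_jP_k}\sigma\right)_{x_j} dx \\
&= -\frac{h\pi i}{2}\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu}m_lu_{x_jP_l}m_k(v - v^*)_{x_jP_k} \, d\sigma.
\end{aligned}$$
So
$$|B| \le Ch\int_{\mathbb{T}^n} |D^2_{xP}u|^2 + |D^2_{xP}(v - v^*)|^2 \, d\sigma = O(h)$$
according to (4.2), since $D^2\bar H_h(P)$ is bounded. This proves (4.8).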
But then (4.7) implies
$$\left|m\cdot D\bar H_h(P)\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu} \, d\sigma\right| = |B| \le O(h).$$
We will see in Sect. 5 that $\bar H_h \to \bar H$, locally uniformly. Since therefore $D\bar H_h(P) \to D\bar H(P) = V$ and since $m\cdot V \neq 0$, we deduce that
$$\lim_{h\to0}\int_{\mathbb{T}^n} e^{2\pi im\cdot D_Pu} \, d\sigma = 0.$$
This limit holds for all $m \neq 0$, and hence
$$\lim_{h\to0}\int_{\mathbb{T}^n} \Phi(D_Pu) \, d\sigma = \int_{\mathbb{T}^n} \Phi(X) \, dX$$
for each continuous, $\mathbb{T}^n$-periodic function $\Phi$.
5. An Identity Involving Exact Solutions of the Eikonal Equation

Assume next that $\hat v$ is a Lipschitz continuous almost everywhere solution of the cell problem
$$\begin{cases} \frac{1}{2}|P + D\hat v|^2 + W = \bar H(P) & \text{in } \mathbb{T}^n \\ \hat v \text{ is } \mathbb{T}^n\text{-periodic}. \end{cases} \tag{5.1}$$
The term on the right-hand side of (5.1) is the effective Hamiltonian in the sense of Lions–Papanicolaou–Varadhan [L-P-V], a central assertion of which is that for a given vector $P$ problem (5.1) is solvable in the sense of viscosity solutions. (Our $\bar H$ corresponds to Mather’s function $\alpha$, and is equivalent also to Mañé’s constant.) Write
$$\hat u := P\cdot x + \hat v; \tag{5.2}$$
so that
$$\frac{1}{2}|D\hat u|^2 + W = \bar H(P) \quad \text{almost everywhere in } \mathbb{T}^n. \tag{5.3}$$
Theorem 5.1. This formula holds:
$$\frac{1}{2}\int_{\mathbb{T}^n} \left|\frac{1}{2}D(v + v^*) - D\hat v\right|^2 d\sigma + \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2 \, d\sigma = \bar H(P) - \bar H_h(P). \tag{5.4}$$

Proof. We employ the identity
$$\frac{1}{2}|a - b|^2 + a\cdot(b - a) = \frac{1}{2}|b|^2 - \frac{1}{2}|a|^2$$
with
$$a = P + \frac{1}{2}D(v + v^*) = Du, \qquad b = P + D\hat v = D\hat u,$$
to discover
$$\frac{1}{2}\int_{\mathbb{T}^n} \left|\frac{1}{2}D(v + v^*) - D\hat v\right|^2 d\sigma + \int_{\mathbb{T}^n} Du\cdot D\left(\hat v - \frac{1}{2}(v + v^*)\right) d\sigma = \int_{\mathbb{T}^n} \frac{1}{2}|D\hat u|^2 + W \, d\sigma - \int_{\mathbb{T}^n} \frac{1}{2}|Du|^2 + W \, d\sigma.$$
Owing to Theorem 3.1 and the periodicity of $v$, $v^*$ and $\hat v$, the second term on the left vanishes. Consequently (5.3) and (3.16) imply
$$\frac{1}{2}\int_{\mathbb{T}^n} \left|\frac{1}{2}D(v + v^*) - D\hat v\right|^2 d\sigma = \bar H(P) - \bar H_h(P) - \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2 \, d\sigma.$$

As an application, we have the estimate:

Theorem 5.2. (i) For each $P \in \mathbb{R}^n$,
$$\bar H_h(P) \le \bar H(P) \le \bar H_h(P) + O(h) \quad \text{as } h \to 0. \tag{5.5}$$
(ii) Hence
$$\int_{\mathbb{T}^n} \left|\frac{1}{2}D(v + v^*) - D\hat v\right|^2 d\sigma + \int_{\mathbb{T}^n} |Dv - Dv^*|^2 \, d\sigma = O(h). \tag{5.6}$$

Proof. 1. According to (5.4), $\bar H_h(P) \le \bar H(P)$. In addition, we have the minimax formula
$$\bar H(P) = \inf_{v \in C^1(\mathbb{T}^n)}\max_{x \in \mathbb{T}^n}\left\{\frac{1}{2}|P + Dv|^2 + W(x)\right\}. \tag{5.7}$$
(See for instance the appendix of [E2] for a quick proof due to A. Fathi.) Furthermore, standard PDE estimates deduce from (3.6), (3.7) the one-sided second derivative bounds
$$v_{\xi\xi} \le C \quad \text{and} \quad v^*_{\xi\xi} \ge -C, \tag{5.8}$$
for some constant $C$ and any unit vector $\xi$. Then formula (3.12) implies
$$\frac{1}{2}\left|P + \frac{1}{2}D(v + v^*)\right|^2 + W \le \bar H_h(P) + Ch.$$
Consequently, we can deduce from (5.7) that
$$\bar H(P) \le \max_{x \in \mathbb{T}^n}\left\{\frac{1}{2}\left|P + \frac{1}{2}D(v + v^*)\right|^2 + W(x)\right\} \le \bar H_h(P) + Ch.$$

2. Statement (ii) follows from (5.4), (5.5).
6. Quantum Lagrangian Calculations In this section we draw some connections between the minimization problems, both classical and quantum, discussed in Sect. 1–2, and the explicitly constructed state ψ studied in Sect. 3–5.
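6.1. Quantum action. As before, we have $a = \sigma^{1/2} = e^{\frac{v^* - v}{2h}}$ and we continue to write $\psi = ae^{\frac{iu}{h}}$. Recall as well that the action of $\psi$ is
$$A[\psi] := \int_{\mathbb{T}^n} \frac{h^2}{2}|D\psi|^2 - W|\psi|^2 \, dx.$$
We next demonstrate that $\psi$ satisfies the Euler–Lagrange equation (2.8).

Theorem 6.1. We have
$$\frac{1}{2}|Du|^2 + W - \bar H_h(P) = -\frac{h^2}{2}\frac{\Delta a}{a} \quad \text{in } \mathbb{T}^n. \tag{6.1}$$
In particular, $\psi$ is a critical point of the action $A[\cdot]$, subject to (1.6)–(1.8).

According to Theorem 2.3, $\psi$ is a local minimizer as well. I conjecture that $\psi$ is in fact a global minimizer, but am unable to prove this.

Proof. Since $a = e^{\frac{v^* - v}{2h}}$, we compute
$$\begin{aligned}
-\frac{h^2}{2}\frac{\Delta a}{a} &= -\frac{h^2}{2}e^{\frac{v - v^*}{2h}}\Delta\left(e^{\frac{v^* - v}{2h}}\right) \\
&= -\frac{h^2}{2}e^{\frac{v - v^*}{2h}}\left(\frac{1}{2h}\Delta(v^* - v) + \frac{1}{4h^2}|D(v^* - v)|^2\right)e^{\frac{v^* - v}{2h}} \\
&= \frac{h}{4}\Delta(v - v^*) - \frac{1}{8}|D(v - v^*)|^2 \\
&= \frac{1}{2}|Du|^2 + W - \bar H_h(P),
\end{aligned}$$
the last equality being (3.12).

We now compute the action of $\psi$. To do so, we first introduce $\bar L_h$, the Legendre transform of $\bar H_h$, and also write
$$V_h := D\bar H_h(P) = \int_{\mathbb{T}^n} Du \, d\sigma. \tag{6.2}$$

Theorem 6.2. The quantum action of $\psi$ is
$$A[\psi] = \bar L_h(V_h). \tag{6.3}$$

Proof. Let us employ (3.11) and (3.16), to deduce
$$\begin{aligned}
A[\psi] &= \int_{\mathbb{T}^n} \frac{h^2}{2}|D\psi|^2 - W|\psi|^2 \, dx \\
&= \int_{\mathbb{T}^n} \frac{h^2}{2}|Da|^2 + \frac{a^2}{2}|Du|^2 - Wa^2 \, dx \\
&= \int_{\mathbb{T}^n} \frac{h^2}{2}|Da|^2 \, dx + \int_{\mathbb{T}^n} |Du|^2a^2 \, dx - \int_{\mathbb{T}^n} \left(\frac{1}{2}|Du|^2 + W\right)a^2 \, dx \\
&= \frac{1}{8}\int_{\mathbb{T}^n} |Dv - Dv^*|^2 \, d\sigma + \int_{\mathbb{T}^n} \left(P + \frac{1}{2}D(v + v^*)\right)\cdot Du \, d\sigma - \int_{\mathbb{T}^n} \frac{1}{2}|Du|^2 + W \, d\sigma \\
&= P\cdot V_h - \bar H_h(P) = \bar L_h(V_h).
\end{aligned}$$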
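6.2. Convergence as $h \to 0$. We now determine the behavior of $\sigma$ and $u$, defined by (3.9), (3.10), as $h \to 0$. For the remainder of this section we for clarity add subscripts “$h$” to display the dependence on this parameter. Thus
$$\sigma_h = w_hw_h^*, \qquad u_h = P\cdot x + \frac{v_h + v_h^*}{2}, \quad \text{etc.}$$
Define a measure $\mu_h$ on velocity-position configuration space by requiring
$$\int_{\mathbb{R}^n}\int_{\mathbb{T}^n} \Phi(v, x) \, d\mu_h := \int_{\mathbb{T}^n} \Phi(Du_h(x), x) \, d\sigma_h \tag{6.4}$$
for each continuous $\Phi : \mathbb{R}^n \times \mathbb{T}^n \to \mathbb{R}$. In view of estimate (3.8) the measures $\{\mu_h\}_{h>0}$ have uniformly bounded support, and we can consequently obtain a sequence $h_j \to 0$ and a probability measure $\mu$ such that
$$\mu_{h_j} \rightharpoonup \mu \quad \text{weakly as measures.}$$
We may also suppose that
$$V_{h_j} \to V \quad \text{in } \mathbb{R}^n. \tag{6.5}$$

Theorem 6.3. The measure $\mu$ solves Mather’s minimization problem (1.1)–(1.4).

Proof. 1. First we check that $\mu$ satisfies the constraints (1.2)–(1.4), the first of which is clear since $\mu$ is a probability measure. If furthermore $\phi \in C^1(\mathbb{T}^n)$, then
$$\int_{\mathbb{R}^n}\int_{\mathbb{T}^n} v\cdot D\phi \, d\mu = \lim_{j\to\infty}\int_{\mathbb{T}^n} Du_{h_j}\cdot D\phi \, d\sigma_{h_j} = 0,$$
since $\operatorname{div}(\sigma_hDu_h) = 0$. This is (1.3); and (1.4) similarly holds since
$$\int_{\mathbb{R}^n}\int_{\mathbb{T}^n} v \, d\mu = \lim_{j\to\infty}\int_{\mathbb{T}^n} Du_{h_j} \, d\sigma_{h_j} = \lim_{j\to\infty}V_{h_j} = V.$$

2. Recall from (5.5) that $\bar H_h \to \bar H$, uniformly on compact sets. Therefore Theorem 6.2 and (6.5) imply
$$\begin{aligned}
\bar L(V) &= \lim_{j\to\infty}\bar L_{h_j}(V_{h_j}) \\
&= \lim_{j\to\infty}\int_{\mathbb{T}^n} \frac{h_j^2}{2}|Da_{h_j}|^2 + \frac{a_{h_j}^2}{2}|Du_{h_j}|^2 - Wa_{h_j}^2 \, dx \\
&= \lim_{j\to\infty}\frac{1}{8}\int_{\mathbb{T}^n} |Dv_{h_j} - Dv^*_{h_j}|^2 \, d\sigma_{h_j} + \lim_{j\to\infty}\int_{\mathbb{T}^n} \frac{1}{2}|Du_{h_j}|^2 - W \, d\sigma_{h_j} \\
&= \lim_{j\to\infty}\int_{\mathbb{R}^n}\int_{\mathbb{T}^n} \frac{1}{2}|v|^2 - W \, d\mu_{h_j} \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{T}^n} \frac{1}{2}|v|^2 - W \, d\mu = A[\mu].
\end{aligned} \tag{6.6}$$
Here we recalled (5.6) to ensure that the first term on the third line goes to 0.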
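3. Suppose now $\nu$ is any other measure satisfying (1.2)–(1.4). Let $u^\varepsilon = \eta_\varepsilon * \hat u$, where $\eta_\varepsilon$ is a standard mollifier and as before $\hat u$ solves the eikonal equation (5.3) for any $P \in \partial\bar L(V)$. Then
$$\frac{1}{2}|Du^\varepsilon|^2 + W \le \bar H(P) + C\varepsilon$$
for some constant $C$, everywhere on $\mathbb{T}^n$. Therefore, writing $v^\varepsilon := \eta_\varepsilon * \hat v$ so that $Du^\varepsilon = P + Dv^\varepsilon$,
$$\begin{aligned}
A[\nu] = \int_{\mathbb{R}^n}\int_{\mathbb{T}^n} \frac{1}{2}|v|^2 - W \, d\nu &\ge \int_{\mathbb{R}^n}\int_{\mathbb{T}^n} \frac{1}{2}|v|^2 + \frac{1}{2}|Du^\varepsilon|^2 \, d\nu - \bar H(P) - C\varepsilon \\
&\ge \int_{\mathbb{R}^n}\int_{\mathbb{T}^n} v\cdot Du^\varepsilon \, d\nu - \bar H(P) - C\varepsilon \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{T}^n} v\cdot Dv^\varepsilon \, d\nu + P\cdot V - \bar H(P) - C\varepsilon \\
&= \bar L(V) - C\varepsilon,
\end{aligned}$$
the integral of $v\cdot Dv^\varepsilon$ vanishing by (1.3). This holds for each $\varepsilon > 0$, and so $\bar L(V)$ is less than or equal to the action of any measure $\nu$ satisfying (1.2)–(1.4). But then (6.6) guarantees that the measure $\mu$ is a minimizer.

Anantharaman [A] provides interesting and additional information about the limit minimizing measure.

Limits of the eikonal equations. Upon passing if necessary to a further subsequence, we deduce from (3.6), (3.7) that
$$v_{h_j} \to v, \quad v^*_{h_j} \to v^* \quad \text{uniformly on } \mathbb{T}^n,$$
where $v$, $v^*$ are viscosity solutions of the respective PDE
$$\frac{1}{2}|P + Dv|^2 + W = \bar H(P) \quad \text{in } \mathbb{T}^n \tag{6.7}$$
and
$$-\frac{1}{2}|P + Dv^*|^2 - W = -\bar H(P) \quad \text{in } \mathbb{T}^n. \tag{6.8}$$
In particular the Lipschitz continuous functions $v$, $v^*$ solve (6.7), (6.8) almost everywhere. Therefore
$$u_{h_j} \to u := P\cdot x + \frac{v + v^*}{2} \quad \text{uniformly on } \mathbb{T}^n,$$
where
$$\frac{1}{2}|Du|^2 + W \le \bar H(P) \quad \text{almost everywhere.}$$
Finally, write
$$\sigma := \operatorname{proj}_x\mu \tag{6.9}$$
for the projection of $\mu$ onto $\mathbb{T}^n$. Then
$$\sigma_{h_j} \rightharpoonup \sigma \quad \text{weakly as measures.}$$
Applying the regularity theory from [E-G], we deduce that $Dv$, $Dv^*$, and therefore $Du$, exist for each point in $\operatorname{spt}\sigma$, and
$$\frac{1}{2}|Du|^2 + W = \bar H(P) \quad \text{on } \operatorname{spt}\sigma. \tag{6.10}$$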
7. Quantum Hamiltonian Calculations

7.1. Quasimodes. The observations in the last section show that our construction in Sect. 3 is a sort of semiclassical “quantization” of weak KAM theory. We next show that $\psi$ built above is an approximate solution of the stationary Schrödinger equation
$$-\frac{h^2}{2}\Delta\psi + W\psi = E\psi \quad \text{in } \mathbb{T}^n.$$
Notice the plus sign in front of the potential $W$. As usual, $a = \sigma^{1/2} = e^{\frac{v^* - v}{2h}}$ and $\psi = ae^{\frac{iu}{h}}$. Then
$$-\frac{h^2}{2}\Delta\psi + W\psi - E\psi = \left(\frac{1}{2}|Du|^2 + W - E\right)\psi - \frac{ih}{2}\frac{\operatorname{div}(a^2Du)}{a^2}\psi - \frac{h^2}{2}\frac{\Delta a}{a}\psi =: A + B + C. \tag{7.1}$$
In view of Theorem 3.1, $B \equiv 0$. Now take $E = \bar H_h(P)$. According to Theorem 6.1, $A \equiv C$; that is, the formal $O(1)$–term identically equals the formal $O(h^2)$–term in the expansion (7.1). Therefore
$$-\frac{h^2}{2}\Delta\psi + W\psi - E\psi = 2\left(\frac{1}{2}|Du|^2 + W - \bar H_h(P)\right)\psi \tag{7.2}$$
for $E = \bar H_h(P)$. It is sometimes useful to rewrite this as
$$-\frac{h^2}{2}\Delta\psi = (|Du|^2 + W - \bar H_h(P))\psi.$$

Theorem 7.1. If $E = \bar H_h(P)$,
$$-\frac{h^2}{2}\Delta\psi + W\psi - E\psi = O(h), \tag{7.3}$$
the right hand side estimated in $L^2(\mathbb{T}^n)$.
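Proof. Define the remainder term
$$R := 2\left(\frac{1}{2}|Du|^2 + W - \bar H_h(P)\right)\psi. \tag{7.4}$$
Then
$$\begin{aligned}
\frac{1}{4}\int_{\mathbb{T}^n} |R|^2 \, dx &= \int_{\mathbb{T}^n} \left(\frac{h}{4}\Delta(v - v^*) - \frac{1}{8}|D(v - v^*)|^2\right)^2 d\sigma \\
&= \int_{\mathbb{T}^n} \frac{h^2}{16}(\Delta(v - v^*))^2 - \frac{h}{16}\Delta(v - v^*)|D(v - v^*)|^2 + \frac{1}{64}|D(v - v^*)|^4 \, d\sigma.
\end{aligned}$$
Observe now that
$$\begin{aligned}
-\frac{h}{16}\int_{\mathbb{T}^n} \Delta(v - v^*)|D(v - v^*)|^2 \, d\sigma &= -\frac{h}{16}\int_{\mathbb{T}^n} \Delta(v - v^*)|D(v - v^*)|^2e^{\frac{v^* - v}{h}} \, dx \\
&= \frac{1}{16}\int_{\mathbb{T}^n} |D(v - v^*)|^2D(v - v^*)\cdot D(v^* - v) \, d\sigma + \frac{h}{8}\int_{\mathbb{T}^n} D(v - v^*)\cdot D^2(v - v^*)D(v - v^*) \, d\sigma \\
&= -\frac{1}{16}\int_{\mathbb{T}^n} |D(v - v^*)|^4 \, d\sigma + \frac{h}{8}\int_{\mathbb{T}^n} D(v - v^*)\cdot D^2(v - v^*)D(v - v^*) \, d\sigma.
\end{aligned}$$
Since $\frac{1}{16} > \frac{1}{64}$, we derive for some constant $C$ the estimate:
$$\int_{\mathbb{T}^n} |R|^2 \, dx + \int_{\mathbb{T}^n} |D(v - v^*)|^4 \, d\sigma \le Ch^2\int_{\mathbb{T}^n} |D^2(v - v^*)|^2 \, d\sigma. \tag{7.5}$$
We deduce finally from (7.5) and (3.17) that $R$ is of order at most $O(h)$ in $L^2(\mathbb{T}^n)$.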
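The expansion (7.1), on which this estimate rests, holds for any smooth amplitude and phase, and can be checked numerically. The sketch below works in one dimension with hypothetical choices of $a$, $u$, $h$ and $P$ (none taken from the paper), and compares $-\frac{h^2}{2}\psi''$, computed by central differences, with the sum of the three terms of (7.1). The contributions of $W$ and $E$ appear identically on both sides and are therefore omitted.

```python
import cmath
import math

H = 0.5                      # the parameter h
P = 0.3                      # the vector P, a scalar in one dimension
TWO_PI = 2.0 * math.pi

# Hypothetical amplitude a > 0 with its derivatives.
def amp(x):
    return 1.0 + 0.3 * math.cos(TWO_PI * x)

def d_amp(x):
    return -0.3 * TWO_PI * math.sin(TWO_PI * x)

def d2_amp(x):
    return -0.3 * TWO_PI ** 2 * math.cos(TWO_PI * x)

# Hypothetical phase u = P x + (periodic part); only its derivatives are needed below.
def phase(x):
    return P * x + 0.1 * math.sin(TWO_PI * x)

def d_phase(x):
    return P + 0.1 * TWO_PI * math.cos(TWO_PI * x)

def d2_phase(x):
    return -0.1 * TWO_PI ** 2 * math.sin(TWO_PI * x)

def psi(x):
    return amp(x) * cmath.exp(1j * phase(x) / H)

def expansion_gap(x, step=1e-4):
    """|(-h^2/2) psi'' - (A + B + C)| with the W and E terms dropped from both sides."""
    lap_psi = (psi(x + step) - 2.0 * psi(x) + psi(x - step)) / step ** 2
    lhs = -0.5 * H * H * lap_psi
    a, da, d2a = amp(x), d_amp(x), d2_amp(x)
    du, d2u = d_phase(x), d2_phase(x)
    div_flux = 2.0 * a * da * du + a * a * d2u          # (a^2 u')'
    rhs = (0.5 * du * du
           - 0.5j * H * div_flux / (a * a)
           - 0.5 * H * H * d2a / a) * psi(x)
    return abs(lhs - rhs)

if __name__ == "__main__":
    print(max(expansion_gap(k / 20) for k in range(20)))
```

For the state constructed in Sect. 3 the middle term vanishes by (3.11); the sketch keeps it because the test functions here are arbitrary.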
Remark. The $O(h)$–error term is not especially good, and indeed M. Zworski has outlined for me some other constructions building quasimodes with similar error estimates in quite general circumstances. Estimate (7.5) does show that if $E = \bar H_h(P)$ and if
$$\int_{\mathbb{T}^n} |D^2(v - v^*)|^2 \, d\sigma = o(1), \tag{7.6}$$
we would then have the better error bound
$$-\frac{h^2}{2}\Delta\psi + W\psi - E\psi = o(h) \quad \text{as } h \to 0 \tag{7.7}$$
in $L^2(\mathbb{T}^n)$. We may hope that assertion (7.6) is true in some generality, although it can fail, as the following shows:
Example. Assume that $P = 0$ and that the potential $W$ attains its maximum at a unique point $x_0 \in \mathbb{T}^n$, where $\Delta W(x_0) < 0$. In this situation we can readily check that $\bar H(0) = W(x_0)$. Since $P = 0$, we can take $w \equiv w^*$; whence $v \equiv -v^*$ and so $u \equiv 0$. Then according to (3.16) and (5.5), (5.6),
$$\lim_{h\to0}\int_{\mathbb{T}^n} W \, d\sigma = W(x_0) = \max_{\mathbb{T}^n} W. \tag{7.8}$$
So the weak limit of the measures $\sigma$ as $h \to 0$ is the unit mass at $x_0$. Consequently, the identity (3.17) implies
$$\lim_{h\to0}\int_{\mathbb{T}^n} |D^2(v - v^*)|^2 \, d\sigma = -4\Delta W(x_0) > 0. \tag{7.9}$$
This example is from the forthcoming paper of Y. Yu [Y], who provides a very complete analysis of our problem in one dimension. In particular, if $n = 1$ and $\bar H(P) > \min\bar H$, the error term is indeed $o(h)$ in $L^2$ as $h \to 0$.

7.2. Comparison with stochastic mechanics. There are formal connections with the Guerra–Morato and Nelson variational principle in stochastic quantum mechanics, as set forth in Nelson [N], Guerra–Morato [G-M], Yasue [Ya], Carlen [C], etc. (I thank A. Majda for some of these references.) I attempt here to explain the link by recasting their form of the action into our setting and notation, disregarding the probabilistic interpretations. In effect, then, the action of Guerra–Morato becomes
$$\tilde A[\psi] := \int_{\mathbb{T}^n} \left(\frac{1}{2}Dv\cdot Dv^* - W\right)a^2 \, dx \tag{7.10}$$
for $\psi = ae^{iu/h}$, $a = e^{\frac{v^* - v}{2h}}$, $u = \frac{1}{2}(v + v^*)$, $P = 0$. (The Lagrangian density $\left(\frac{1}{2}Dv\cdot Dv^* - W\right)a^2$ here should be compared with formula (93) in Guerra–Morato [G-M]. See also (14.31) in Nelson [N].) We rewrite this, observing that
$$-\frac{h^2}{2}|Da|^2 + \frac{a^2}{2}|Du|^2 = \frac{a^2}{2}Dv^*\cdot Dv.$$
Consequently
$$\tilde A[\psi] = \int_{\mathbb{T}^n} -\frac{h^2}{2}|Da|^2 + \frac{a^2}{2}|Du|^2 - Wa^2 \, dx. \tag{7.11}$$
This action differs from ours due to the sign change in the first term.

Theorem 7.2. Let $\psi = ae^{iu/h}$ be a smooth critical point of the action $\tilde A[\cdot]$, subject to the constraints (1.6)–(1.8). Then for some real constant $E$:
$$-\frac{h^2}{2}\Delta\psi + W\psi = E\psi \quad \text{in } \mathbb{T}^n, \tag{7.12}$$
and so $\psi$ is an exact solution of this stationary Schrödinger equation.

Proof. We deduce, as in the proof of Theorem 2.1, that
$$-\frac{h^2}{2}\Delta a + a\left(\frac{|Du|^2}{2} + W - E\right) = 0,$$
with a sign change as compared with (2.8). Thus if we write out (7.1), we have
$$-\frac{h^2}{2}\Delta\psi + W\psi - E\psi = A + B + C,$$
and $B \equiv 0$, $A + C \equiv 0$.
8. Connections with Linear Elliptic Homogenization

This section works out some relationships between $\bar H_h$ and homogenization theory for divergence–structure, second order elliptic PDE: see for instance Bensoussan–Lions–Papanicolaou [B-L-P]. Our conclusions are very similar to those of Capdeboscq [Cp], and the calculations in Pedersen [P] are related as well.

Let $A = ((a_{ij}))$ be symmetric, positive definite, and $\mathbb{T}^n$-periodic. Suppose $U$ is a bounded, open subset of $\mathbb{R}^n$, with a smooth boundary. We consider this boundary value problem for an elliptic PDE with rapidly varying coefficients:
$$\begin{cases} -\left(a_{ij}\left(\frac{x}{\varepsilon}\right)u^\varepsilon_{x_i}\right)_{x_j} = f & \text{in } U \\ u^\varepsilon = 0 & \text{on } \partial U. \end{cases}$$
Then $u^\varepsilon \rightharpoonup u$ weakly in $H^1_0(U)$, $u$ solving the limit problem
$$\begin{cases} -\bar a_{ij}u_{x_ix_j} = f & \text{in } U \\ u = 0 & \text{on } \partial U. \end{cases}$$
The effective diffusion coefficient matrix $\bar A = ((\bar a_{ij}))$ is determined as follows [B-L-P]. For $j = 1, \dots, n$, let $\chi^j$ solve the corrector problem
$$\begin{cases} -\left(a_{kl}\chi^j_{x_k}\right)_{x_l} = (a_{jl})_{x_l} & \text{in } \mathbb{T}^n \\ \chi^j \text{ is } \mathbb{T}^n\text{-periodic}. \end{cases} \tag{8.1}$$
Let us then for $i, j = 1, \dots, n$ define
$$\bar a_{ij} := \int_{\mathbb{T}^n} a_{ij} - a_{kl}\chi^i_{x_k}\chi^j_{x_l} \, dx. \tag{8.2}$$

Theorem 8.1. Define $V_h = D\bar H_h(P)$. Then
$$\bar AP = V_h, \tag{8.3}$$
where $\bar A = ((\bar a_{ij}))$ is the effective diffusion coefficient matrix corresponding to
$$A := a^2I = ((a^2\delta_{ij})). \tag{8.4}$$
Notice in (8.4) that $a^2 = \sigma = ww^*$ depends on both $h$ and $P$.
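Proof. For the special case of the diagonal matrix $A$ given by (8.4), the corrector PDE (8.1) reads
$$\begin{cases} -\left(a^2\chi^j_{x_k}\right)_{x_k} = (a^2)_{x_j} & \text{in } \mathbb{T}^n \\ \chi^j \text{ is } \mathbb{T}^n\text{-periodic}. \end{cases} \tag{8.5}$$
Now $u = x\cdot P + \frac{1}{2}(v + v^*)$ solves $\operatorname{div}(\sigma Du) = 0$, and therefore
$$-\operatorname{div}\left(a^2D\left(\frac{1}{2}(v + v^*)\right)\right) = \operatorname{div}(a^2P) = D(a^2)\cdot P.$$
Hence
$$\frac{v + v^*}{2} = P_i\chi^i. \tag{8.6}$$
Consequently, for $j = 1, \dots, n$,
$$\begin{aligned}
V_{h,j} = \int_{\mathbb{T}^n} u_{x_j} \, d\sigma &= \int_{\mathbb{T}^n} (\chi^i + x_i)_{x_j}P_i \, d\sigma \\
&= P_j - \int_{\mathbb{T}^n} \chi^iP_i\sigma_{x_j} \, dx = P_j + \int_{\mathbb{T}^n} \chi^iP_i\left(a^2\chi^j_{x_k}\right)_{x_k} dx \\
&= P_j - P_i\int_{\mathbb{T}^n} a^2\chi^i_{x_k}\chi^j_{x_k} \, dx = (\bar AP)_j.
\end{aligned}$$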
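In one space dimension the corrector problem (8.5) can be integrated by hand, and formula (8.2) then reduces to the harmonic mean of $\sigma = a^2$. The sketch below checks this numerically for a hypothetical density $\sigma(x) = 1 + \frac{1}{2}\cos 2\pi x$ (an assumption made for illustration, not the $\sigma$ of the paper). With this $\bar a$, Theorem 8.1 reads $V_h = P\left(\int_0^1 \sigma^{-1}\,dx\right)^{-1}$ when $n = 1$.

```python
import math

def effective_coefficient_1d(n=400):
    """Return (a_bar from formula (8.2), harmonic mean of sigma) on the 1-D torus."""
    two_pi = 2.0 * math.pi
    xs = [k / n for k in range(n)]
    # Hypothetical positive periodic density sigma = a^2 with unit mass.
    sigma = [1.0 + 0.5 * math.cos(two_pi * x) for x in xs]
    harmonic_mean = 1.0 / (sum(1.0 / s for s in sigma) / n)
    # In one dimension (8.5) integrates to sigma * chi' = -sigma + c, and the
    # periodicity of chi forces c to equal the harmonic mean of sigma.
    d_chi = [-1.0 + harmonic_mean / s for s in sigma]
    # Formula (8.2) with a_11 = sigma.
    a_bar = sum(s - s * d * d for s, d in zip(sigma, d_chi)) / n
    return a_bar, harmonic_mean

if __name__ == "__main__":
    a_bar, harmonic = effective_coefficient_1d()
    print(a_bar, harmonic)
```

For this particular density the harmonic mean is $\sqrt{3}/2$ in closed form, which gives an independent check on the quadrature.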
Remark (Constructing a family of variations). We return finally to a point left open in Sect. 2. Recall that we introduced a one parameter family of variations $\{(u(\tau), a(\tau))\}_{-1\le\tau\le1}$ satisfying the constraints (2.4)–(2.6), with $(u(0), a(0)) = (u, a)$. We assumed also that
$$u(\tau) = P(\tau)\cdot x + v(\tau), \tag{8.7}$$
for $P(\tau) \in \mathbb{R}^n$ and $v(\tau)$ $\mathbb{T}^n$-periodic. To finish up the proof of Theorem 2.1 we needed to know that $a' := \frac{da}{d\tau}$ was arbitrary, subject only to the integral identity (2.11). We show next we can indeed do so, provided
$$a > 0 \quad \text{in } \mathbb{T}^n. \tag{8.8}$$
To confirm this, take a smooth function $a(\cdot)$ of $\tau$ satisfying (2.4) and $a(0) = a$. For a given $P \in \mathbb{R}^n$, we invoke the Fredholm alternative to solve
$$-\operatorname{div}(a^2(\tau)Dv) = \operatorname{div}(a^2(\tau)P)$$
for a periodic function $v = v(\tau)$. Then $u(\tau) = P\cdot x + v(\tau)$ solves $\operatorname{div}(a(\tau)^2Du(\tau)) = 0$, and the issue is whether for small $\tau$ we can select $P = P(\tau)$ so that
$$\int_{\mathbb{T}^n} a^2(\tau)Du(\tau) \, dx = V. \tag{8.9}$$
We next introduce the function $\Phi = \{\Phi^1, \dots, \Phi^n\}$ defined by
$$\Phi(P) := \int_{\mathbb{T}^n} a^2Du \, dx = P + \int_{\mathbb{T}^n} a^2Dv \, dx.$$
Now since $-\operatorname{div}(a^2Dv) = \operatorname{div}(a^2P)$, we have
$$-(a^2v_{x_kP_l})_{x_k} = (a^2)_{x_l};$$
and so $v_{P_l} = \chi^l$ in the notation above. Therefore, as in the proof of Theorem 8.1, we can calculate
$$\begin{aligned}
\Phi^k_{P_l} &= \delta_{kl} + \int_{\mathbb{T}^n} a^2v_{x_kP_l} \, dx = \delta_{kl} - \int_{\mathbb{T}^n} (a^2)_{x_k}v_{P_l} \, dx \\
&= \delta_{kl} + \int_{\mathbb{T}^n} (a^2v_{x_iP_k})_{x_i}v_{P_l} \, dx = \delta_{kl} - \int_{\mathbb{T}^n} a^2v_{x_iP_k}v_{x_iP_l} \, dx = \bar a_{kl}.
\end{aligned}$$
In other words, $D\Phi(P) = \bar A$ and the latter matrix is nonsingular. Hence the Implicit Function Theorem ensures for small $\tau$ that we can find $P = P(\tau)$ satisfying (8.9).

References

[A] Anantharaman, N.: Gibbs measures on path space and viscous approximation to action-minimizing measures. Trans. AMS, 2003
[B-L-P] Bensoussan, A., Lions, J.-L., Papanicolaou, G.: Asymptotic Analysis for Periodic Structures. Amsterdam: North-Holland, 1978
[Cp] Capdeboscq, Y.: Homogenization of a neutronic critical diffusion problem with drift. Proc. Royal Soc. Edinburgh 132, 567–594 (2002)
[C] Carlen, E.: Conservative diffusions. Commun. Math. Phys. 94, 293–315 (1984)
[E1] Evans, L.C.: Effective Hamiltonians and quantum states. Séminaire Équations aux Dérivées Partielles, École Polytechnique 2000–2001
[E2] Evans, L.C.: Some new PDE methods for weak KAM theory. Calculus of Variations and PDE 17, 159–177 (2003)
[E-G] Evans, L.C., Gomes, D.: Effective Hamiltonians and averaging for Hamiltonian Dynamics I. Arch. Rat. Mech. Anal. 157, 1–33 (2001)
[F1] Fathi, A.: Théorème KAM faible et théorie de Mather sur les systèmes lagrangiens. C. R. Acad. Sci. Paris Sér. I Math. 324, 1043–1046 (1997)
[F2] Fathi, A.: Solutions KAM faibles conjuguées et barrières de Peierls. C. R. Acad. Sci. Paris Sér. I Math. 325, 649–652 (1997)
[F3] Fathi, A.: Weak KAM theory in Lagrangian Dynamics, Preliminary Version. Lecture notes 2001
[G] Gomes, D.: A stochastic analogue of Aubry-Mather theory. Nonlinearity 15, 581–603 (2002)
[G-M] Guerra, F., Morato, L.: Quantization of dynamical systems and stochastic control theory. Phys. Rev. D 27, 1774–1786 (1983)
[H-K] Holcman, D., Kupka, I.: Singular perturbations and first order PDE on manifolds. Preprint, 2002
[L-P-V] Lions, P.-L., Papanicolaou, G., Varadhan, S. R. S.: Homogenization of Hamilton–Jacobi equations. Unpublished, circa 1988
[M1] Mather, J.: Minimal measures. Comment. Math Helv. 64, 375–394 (1989)
[M2] Mather, J.: Action minimizing invariant measures for positive definite Lagrangian systems. Math. Zeits. 207, 169–207 (1991)
[M-F] Mather, J., Forni, G.: Action minimizing orbits in Hamiltonian systems. In: Transition to Chaos in Classical and Quantum Mechanics, Lecture Notes in Math 1589, S. Graffi, ed., Berlin-Heidelberg-New York: Springer, 1994
[N] Nelson, E.: Quantum Fluctuations. Princeton, NJ: Princeton University Press, 1985
[P] Pedersen, F.B.: Simple derivation of the effective mass equation using a multiple-scale technique. Euro. J. Phys. 18, 43–45 (1997)
[Ya] Yasue, K.: Stochastic calculus of variations. J. Funct. Anal. 41, 327–340 (1981)
[Y] Yu, Y.: Error estimates for a quantization of a Mather set. To appear
Communicated by P. Constantin
Commun. Math. Phys. 244, 335–345 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0986-2
Communications in
Mathematical Physics
A Positive Mass Theorem for Spaces with Asymptotic SUSY Compactification Xianzhe Dai Department of Mathematics, University of California, Santa Barbara, CA 93106, USA. E-mail: [email protected] Received: 16 June 2003 / Accepted: 8 August 2003 Published online: 25 November 2003 – © Springer-Verlag 2003
Abstract: We prove a positive mass theorem for spaces which asymptotically approach a flat Euclidean space times a Calabi-Yau manifold (or any special holonomy manifold except the quaternionic Kähler). This is motivated by the very recent work of Hertog-Horowitz-Maeda [HHM].

In general relativity, isolated gravitational systems are modelled by asymptotically flat spacetimes. The spatial slices of such spacetimes are then asymptotically flat Riemannian manifolds. That is, Riemannian manifolds $(M^n, g)$ such that $M = M_0 \cup M_\infty$ with $M_0$ compact and $M_\infty \simeq \mathbb{R}^n - B_R(0)$ for some $R > 0$ so that in the induced Euclidean coordinates the metric satisfies the asymptotic conditions
$$g_{ij} = \delta_{ij} + O(r^{-\tau}), \quad \partial_kg_{ij} = O(r^{-\tau-1}), \quad \partial_k\partial_lg_{ij} = O(r^{-\tau-2}). \tag{0.1}$$
Here $\tau > 0$ is the asymptotic order and $r$ is the Euclidean distance to a base point. The total mass (the ADM mass) of the gravitational system can then be defined via a flux integral [ADM, LP]
$$m(g) = \lim_{R\to\infty}\frac{1}{4\omega_n}\int_{S_R} (\partial_ig_{ij} - \partial_jg_{ii}) * dx_j. \tag{0.2}$$
Here $\omega_n$ denotes the volume of the $n - 1$ sphere and $S_R$ the Euclidean sphere with radius $R$ centered at the base point. If $\tau > \frac{n-2}{2}$ and $n \ge 2$, then $m(g)$ is independent of the asymptotic coordinates $x_i$, and thus is an invariant of the metric. The positive mass theorem [SY1, SY2, SY3, Wi1] says that this total mass is nonnegative provided one has nonnegative local energy density.

Theorem 0.1 (Schoen-Yau, Witten). Suppose $(M^n, g)$ is an asymptotically flat spin manifold of dimension $n \ge 3$ and of order $\tau > \frac{n-2}{2}$. If the scalar curvature $R \ge 0$, then $m(g) \ge 0$ and $m(g) = 0$ if and only if $M = \mathbb{R}^n$.
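As an illustration of the flux integral (0.2), the following sketch evaluates it numerically for the spatial Schwarzschild metric in isotropic coordinates, $g_{ij} = \left(1 + \frac{m}{2r}\right)^4\delta_{ij}$ on $\mathbb{R}^3$, a standard example that is not discussed in the text. The function names, the finite-difference step and the quadrature resolution are all assumptions made for the sketch; on $S_R$ the form $*dx_j$ is taken to be $\nu_j\,dA$ with $\nu$ the outward unit normal. For this metric the integral at radius $R$ equals $m\left(1 + \frac{m}{2R}\right)^3$, so the computed value should approach the parameter $m$ as $R$ grows.

```python
import math

def metric(x, mass):
    """Spatial Schwarzschild metric in isotropic coordinates (assumed example)."""
    r = math.sqrt(sum(c * c for c in x))
    factor = (1.0 + mass / (2.0 * r)) ** 4
    return [[factor if i == j else 0.0 for j in range(3)] for i in range(3)]

def d_metric(x, k, mass, step=1e-2):
    """Central-difference approximation of the partial derivative d_k g_ij at x."""
    xp, xm = list(x), list(x)
    xp[k] += step
    xm[k] -= step
    gp, gm = metric(xp, mass), metric(xm, mass)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * step) for j in range(3)]
            for i in range(3)]

def adm_mass(radius, mass=1.0, n_theta=60, n_phi=8):
    """Midpoint-rule approximation of the flux integral (0.2) over S_R for n = 3."""
    omega = 4.0 * math.pi                      # area of the unit 2-sphere
    total = 0.0
    for a in range(n_theta):
        theta = math.pi * (a + 0.5) / n_theta
        for b in range(n_phi):
            phi = 2.0 * math.pi * (b + 0.5) / n_phi
            nu = [math.sin(theta) * math.cos(phi),
                  math.sin(theta) * math.sin(phi),
                  math.cos(theta)]
            x = [radius * c for c in nu]
            dg = [d_metric(x, k, mass) for k in range(3)]   # dg[k][i][j] = d_k g_ij
            integrand = 0.0
            for j in range(3):
                for i in range(3):
                    integrand += (dg[i][i][j] - dg[j][i][i]) * nu[j]
            area = (radius ** 2 * math.sin(theta)
                    * (math.pi / n_theta) * (2.0 * math.pi / n_phi))
            total += integrand * area
    return total / (4.0 * omega)

if __name__ == "__main__":
    print(adm_mass(100.0), adm_mass(1000.0))
```

The residual difference from $m$ at finite radius reflects both the factor $\left(1 + \frac{m}{2R}\right)^3$ and the coarse midpoint quadrature in the polar angle.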
Remark. The scalar curvature $R$ is the local energy density.

According to string theory [CHSW], our universe is really ten dimensional, modelled by $M^{3,1} \times X$, where $X$ is a Calabi-Yau 3-fold. This is the so-called Calabi-Yau compactification, which motivates the spaces we now consider. We consider complete Riemannian manifolds $(M^n, g)$ such that $M = M_0 \cup M_\infty$ with $M_0$ compact and $M_\infty \simeq (\mathbb{R}^k - B_R(0)) \times X$ for some $R > 0$ and $X$ a compact simply connected Calabi-Yau manifold (or a manifold with any other special holonomy except $Sp(m) \cdot Sp(1)$), so that the metric on $M_\infty$ satisfies
\[ g = \mathring{g} + h, \quad \mathring{g} = g_{\mathbb{R}^k} + g_X, \quad h = O(r^{-\tau}), \quad \mathring{\nabla} h = O(r^{-\tau-1}), \quad \mathring{\nabla}\mathring{\nabla} h = O(r^{-\tau-2}). \tag{0.3} \]
Here $\mathring{\nabla}$ is the Levi-Civita connection of $\mathring{g}$ and $\tau > 0$ is the asymptotic order. We will call $M$ a space with asymptotic SUSY compactification. The mass for such a space is then defined by
\[ m(g) = \lim_{R \to \infty} \frac{1}{4\omega_k \mathrm{vol}(X)} \int_{S_R \times X} (\mathring{\nabla}_{e^0_a} g_{ja} - \mathring{\nabla}_{e^0_j} g_{aa}) \, {*dx_j} \, d\mathrm{vol}(X). \tag{0.4} \]
Here $\{e^0_a\} = \{\frac{\partial}{\partial x_i}, f_\alpha\}$ is an orthonormal basis of $\mathring{g}$, the $*$ operator is the one on the Euclidean factor, the indices $i, j$ run over the Euclidean factor, the index $\alpha$ runs over $X$, and the index $a$ runs over the full index set of the manifold. In fact, this reduces to
\[ m(g) = \lim_{R \to \infty} \frac{1}{4\omega_k \mathrm{vol}(X)} \int_{S_R \times X} (\partial_i g_{ij} - \partial_j g_{aa}) \, {*dx_j} \, d\mathrm{vol}(X). \]

Remark. If $\tau > \frac{k-2}{2}$ and $k \geq 2$, then $m(g)$ is independent of the asymptotic coordinates.
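To illustrate the reduced mass formula, consider an end whose perturbation lives only on the Euclidean factor. The sketch below assumes $k = 3$ and the metric $g = u^4 g_{\mathbb{R}^3} + g_X$ with $u = 1 + \frac{m}{2r}$; this example is an assumption made for illustration and is not taken from the text.

```latex
% Illustrative assumption: k = 3, g = u^4 g_{R^3} + g_X, u = 1 + m/(2r).
% In the frame {\partial/\partial x_i, f_\alpha}: g_{ij} = u^4\delta_{ij},
% g_{\alpha\beta} = \delta_{\alpha\beta}, g_{i\alpha} = 0.
\begin{align*}
\partial_i g_{ij} - \partial_j g_{aa}
  &= \partial_j u^4 - \partial_j\big(3u^4 + \dim X\big)
   = -2\,\partial_j u^4 = \frac{4m\,u^3\,x_j}{r^3}, \\
\int_{S_R \times X} (\partial_i g_{ij} - \partial_j g_{aa})\,{*dx_j}\,d\mathrm{vol}(X)
  &= \mathrm{vol}(X)\,\frac{4m\,u(R)^3}{R^2}\cdot 4\pi R^2
   \;\longrightarrow\; 16\pi m\,\mathrm{vol}(X), \\
m(g) &= \frac{16\pi m\,\mathrm{vol}(X)}{4\omega_3\,\mathrm{vol}(X)} = m,
  \qquad \omega_3 = 4\pi .
\end{align*}
```

The factor $\mathrm{vol}(X)$ in the normalization cancels the integration over the compact factor, so in this example the mass agrees with the ADM mass of the Euclidean factor.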
Our main result is

Theorem 0.2. Let $(M, g)$ be a complete spin manifold as above with asymptotic order $\tau > \frac{k-2}{2}$ and $k \geq 3$. If $M$ has nonnegative scalar curvature, then $m(g) \geq 0$, and $m(g) = 0$ if and only if $M = \mathbb{R}^k \times X$.

Remark. The result extends without change to the case with more than one end.

Remark. Just as in the usual case, the restriction $k \geq 3$ has to do with getting the correct spin structure at the ends. See Sect. 5 for additional comments regarding the spin structures of the ends.

Our motivation comes from a very recent work of Hertog-Horowitz-Maeda [HHM] on the Calabi-Yau compactifications. Using the existence result of Stolz [S1, S2] on metrics of positive scalar curvature, they constructed classical configurations which have regions of (arbitrarily large) negative energy density as seen from the four dimensional perspective. This should be contrasted with the positivity (nonnegativity) of the total mass, as guaranteed by Theorem 0.2. According to [HHM], physical consequences of the negative energy density include possible violation of Cosmic Censorship and new thermal instability. The Lorentzian version of Theorem 0.2 will be discussed in a separate paper.
1. Manifolds with Special Holonomy

For a complete Riemannian manifold $(M^n, g)$, the holonomy group $\mathrm{Hol}(g)$ (with respect to a base point) is the subgroup of $O(n)$ generated by parallel translations along all loops at the base point. For simply connected irreducible nonsymmetric spaces, Berger has given a complete classification of possible holonomy groups, namely, $SO(n)$, which is the generic situation, $U(m)$ (if $n = 2m$), which is Kähler, $SU(m)$ for Calabi-Yau, $Sp(m) \cdot Sp(1)$ (if $n = 4m$), which is called quaternionic Kähler, $Sp(m)$, which is called hyper-Kähler, $Spin(7)$ (if $n = 8$), and $G_2$ (if $n = 7$). Except for the generic and Kähler cases, the rest are called special holonomy.

If a Riemannian manifold $(M, g)$ is spin, then one can consider spinors $\phi$ on $M$, which are sections of the spinor bundle $S$. The Levi-Civita connection $\nabla$ of $g$ lifts to a connection on the spinor bundle, which will still be denoted by the same notation. In fact, any metric connection lifts in the same way. The Dirac operator is $D\phi = e_i \cdot \nabla_{e_i} \phi$, where $e_i$ is a local orthonormal basis of $M$ and $e_i \cdot$ is the Clifford multiplication. A spinor $\phi$ is parallel if $\nabla\phi = 0$. Implicitly, all these depend on the underlying spin structure; spin structures are in one-to-one correspondence with elements of $H^1(M, \mathbb{Z}_2)$ [LM]. Thus, for simply connected manifolds, one has a unique spin structure. It seems that the issue of spin structure in this context is a subtle one, deserving further study. (See also Sect. 5.)

All manifolds with special holonomy, with the exception of the quaternionic Kähler ones, carry nonzero parallel spinors. In fact, one has the following theorem of McKenzie Wang [Wa].

Theorem 1.1. Let $(M, g)$ be a complete, simply connected, irreducible Riemannian spin manifold and let $N$ be the dimension of the space of parallel spinors. Then $N > 0$ if and only if the holonomy group is one of $SU(m)$, $Sp(m)$, $Spin(7)$, $G_2$.

Remark. Wang [Wa] actually characterizes each special holonomy by the number of parallel spinors.

Remark.
Manifolds with parallel spinors are called supersymmetric (SUSY) in the physics literature.

2. Proof of Theorem 0.2

Our proof is an extension of Witten's spinor proof [Wi1]. Here we follow the idea of Andersson and Dahl [AnD] and use the following alternative form of the Lichnerowicz formula.

Lemma 2.1. Given a spinor $\phi$ on a Riemannian spin manifold, define a 1-form $\alpha$ via $\alpha(X) = \langle (\nabla_X + X \cdot D)\phi, \phi \rangle$. Then
\[ \mathrm{div}\, \alpha = \frac{R}{4} |\phi|^2 + |\nabla\phi|^2 - |D\phi|^2. \]
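Proof. Choose an orthonormal basis $e_a$ such that $\nabla e_a = 0$ at the given point. Then (with the Einstein summation convention in force)
\[ \mathrm{div}\, \alpha = (\nabla_{e_a} \alpha)(e_a) = e_a(\alpha(e_a)) = \langle (\nabla_{e_a} + e_a \cdot D)\phi, \nabla_{e_a}\phi \rangle + \langle \nabla_{e_a}(\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle \]
\[ = |\nabla\phi|^2 - |D\phi|^2 + \langle (\delta_{ab} + e_a \cdot e_b \cdot) \nabla_{e_a} \nabla_{e_b} \phi, \phi \rangle. \]
The last term is just
\[ \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_a} \nabla_{e_b} \phi, \phi \rangle = \frac{1}{4} \langle [e_a \cdot, e_b \cdot] R(e_a, e_b)\phi, \phi \rangle = \frac{R}{4} |\phi|^2 \]
by the usual calculation as in the Lichnerowicz formula [LM].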
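The algebraic steps used in this proof can be spelled out as follows. This is a sketch that assumes the Clifford convention $e_a \cdot e_b \cdot + e_b \cdot e_a \cdot = -2\delta_{ab}$ (the one compatible with the identity $e_a \cdot e_d \cdot = \frac{1}{2}[e_a \cdot, e_d \cdot] - \delta_{ad}$ used in Sect. 4) and skew-adjointness of Clifford multiplication by vectors.

```latex
% Assumed conventions: e_a e_b + e_b e_a = -2\delta_{ab};  <v.\phi, \psi> = -<\phi, v.\psi>.
\begin{align*}
\delta_{ab} + e_a\cdot e_b\cdot
  &= -\tfrac12\,(e_a\cdot e_b\cdot + e_b\cdot e_a\cdot) + e_a\cdot e_b\cdot
   = \tfrac12\,[e_a\cdot, e_b\cdot], \\
\langle e_a\cdot D\phi,\ \nabla_{e_a}\phi\rangle
  &= -\langle D\phi,\ e_a\cdot\nabla_{e_a}\phi\rangle = -|D\phi|^2, \\
\tfrac12\,[e_a\cdot, e_b\cdot]\,\nabla_{e_a}\nabla_{e_b}\phi
  &= \tfrac14\,[e_a\cdot, e_b\cdot]\,(\nabla_{e_a}\nabla_{e_b} - \nabla_{e_b}\nabla_{e_a})\phi
   = \tfrac14\,[e_a\cdot, e_b\cdot]\,R(e_a, e_b)\phi .
\end{align*}
```

The last line uses the antisymmetry of the commutator in $a, b$ together with $\nabla e_a = 0$ at the chosen point, so that no $\nabla_{[e_a, e_b]}$ term appears.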
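Therefore, for any compact domain $\Omega \subset M$,
\[ \int_\Omega \Big( \frac{R}{4} |\phi|^2 + |\nabla\phi|^2 - |D\phi|^2 \Big) d\mathrm{vol}(g) = \int_{\partial\Omega} \langle (\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle \, \mathrm{int}(e_a) \, d\mathrm{vol}(g) = \int_{\partial\Omega} \langle (\nabla_\nu + \nu \cdot D)\phi, \phi \rangle \, d\mathrm{vol}(g|_{\partial\Omega}), \tag{2.5} \]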
where $e_a$ is an orthonormal basis of $g$ and $\nu$ is the unit outer normal of $\partial\Omega$. Also, here $\mathrm{int}(e_a)$ is the interior multiplication by $e_a$. In particular, for a harmonic spinor $\phi$, i.e., $D\phi = 0$, the left-hand side of (2.5) will be nonnegative provided $R \geq 0$. On the other hand, if the harmonic spinor $\phi$ can be chosen so that it is asymptotic to a parallel spinor at infinity and we choose the domain $\Omega$ so that $\partial\Omega = S_R \times X$, then we will show that the right-hand side of (2.5) converges to the mass (up to a positive normalizing constant).

Thus, for the first part of our theorem, we are left with two tasks. First, we need to show the existence of harmonic spinors which are asymptotic to a parallel spinor. Second, we need to show that the boundary term converges to the mass. The existence of the harmonic spinor is dealt with in Sect. 4 (Lemma 4.1) after the necessary analysis in the next section, and the computation of the limit of the boundary term is also left to Sect. 4 (Lemma 4.2).

We now continue with the proof of the rigidity. If $m(g) = 0$, then it follows that $\phi$ is a (nonzero) parallel spinor on $M$. This implies that $M$ is Ricci flat, as
\[ e_a \cdot R(e_a, X)\phi = -\frac{1}{2} \mathrm{Ric}(X) \cdot \phi. \]
Thus, we are in a position to use the splitting theorem of Cheeger-Gromoll [CG]. To find lines in $M$, we start with sequences of pairs of points $p_i, q_i$ in $M_\infty \simeq (\mathbb{R}^k - B_R(0)) \times X$. When $R$ is sufficiently large, one can choose $p_i, q_i$ so that their distance is comparable to their Euclidean distance. It follows that one can construct a line in $M$ this way. Similarly, we can construct $k$ lines in $M$ that are almost perpendicular to each other. It follows that $M = \mathbb{R}^k \times X$.
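The step from a parallel spinor to Ricci flatness can be made explicit. The following is a short sketch based on the curvature identity quoted above and on the standard fact that Clifford multiplication by a vector $v$ satisfies $|v \cdot \phi| = |v|\,|\phi|$.

```latex
\begin{align*}
\nabla\phi = 0
  \;&\Longrightarrow\; R(e_a, X)\phi = 0 \quad \text{for all } X, \\
  &\Longrightarrow\; \mathrm{Ric}(X)\cdot\phi = -2\, e_a\cdot R(e_a, X)\phi = 0, \\
  &\Longrightarrow\; 0 = |\mathrm{Ric}(X)\cdot\phi|^2 = |\mathrm{Ric}(X)|^2\,|\phi|^2 .
\end{align*}
```

Since a parallel spinor has constant norm and $\phi$ is nonzero, this forces $\mathrm{Ric}(X) = 0$ for every $X$.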
3. Fibered Boundary Calculus

We will use the fibered boundary calculus of Melrose-Mazzeo [MM] (further developed by Boris Vaillant in his thesis [V] and in [HHMa]) to solve for the harmonic spinor with the correct asymptotic behavior. The change of variable $r = \frac{1}{x}$ makes the metric into what is called a fibered boundary metric, which is defined in the more general setting as follows.

Consider a complete noncompact Riemannian manifold $(M, g)$. Assume that $M$ has a compactification $\bar{M}$ such that $\partial\bar{M}$ comes with a fibration structure $F \to \partial\bar{M} \xrightarrow{\pi} B$. Moreover, in a neighborhood of the boundary $\partial\bar{M}$, the metric $g$ has the form
\[ g = \frac{dx^2}{x^4} + \frac{\pi^*(g_B)}{x^2} + g_F, \tag{3.6} \]
where $x$ is a defining function of the boundary, i.e., $x = 0$ on $\partial\bar{M}$ and $dx \neq 0$ on the boundary. Also, $g_B$ is a metric on the base $B$ and $g_F$ is a family of fiberwise metrics. Thus, in the setting of spaces with asymptotic SUSY compactification, one has a trivial fibration $S^{k-1} \times X$ and $x = \frac{1}{r}$. We will use the notation $M$, $\bar{M}$, and $\partial M$, $\partial\bar{M}$ interchangeably.

For a manifold with boundary, the Lie algebra of b-vector fields consists of vector fields tangent to the boundary,
\[ \mathcal{V}_b(M) = \{ V \mid V \text{ is tangent to the boundary } \partial M \}. \]
The Lie algebra of vector fields associated with the fibered boundary metric is
\[ \mathcal{V}_{fb} = \{ V \in \mathcal{V}_b(M) \mid V \text{ is tangent to the fibers } F \text{ at } \partial M, \ Vx = O(x^2) \}. \tag{3.7} \]
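For the model product metric at infinity, the change of variable can be checked directly. The sketch below assumes polar coordinates on the Euclidean factor, with $g_{S^{k-1}}$ denoting the round metric on the unit sphere (a notation introduced here only for illustration).

```latex
% Assumption: polar coordinates on R^k - B_R(0); g_{S^{k-1}} is the round unit-sphere metric.
\begin{align*}
\mathring{g} &= dr^2 + r^2\, g_{S^{k-1}} + g_X,
  \qquad r = \frac{1}{x}, \quad dr = -\frac{dx}{x^2}, \\
\mathring{g} &= \frac{dx^2}{x^4} + \frac{g_{S^{k-1}}}{x^2} + g_X .
\end{align*}
```

This is of the form (3.6) with base $B = S^{k-1}$, $g_B = g_{S^{k-1}}$ and fiber $F = X$. In these coordinates $x^2 \partial_x = -\partial_r$, which has unit length with respect to $\mathring{g}$ and satisfies the condition $Vx = O(x^2)$ in (3.7).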
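If $y$ is a local coordinate of $B$ and $z$ is a local coordinate of $F$, then $\mathcal{V}_{fb}$ is spanned by $x^2 \partial_x$, $x \partial_y$, $\partial_z$. The fibered boundary vector fields $\mathcal{V}_{fb}$ generate the ring of fibered boundary differential operators. The Dirac operator $D$ associated to the fibered boundary metric is such a fibered boundary differential operator of first order.

Define the $L^2$ and Sobolev spaces as follows:
\[ L^2(M, S) = L^2(M, S; d\mathrm{vol}(g)) = L^2\Big(M, S, \frac{dx\,dy\,dz}{x^{2+l}}\Big), \quad \text{if } \dim B = l, \]
\[ L^{p,2}(M, S) = \{ \phi \in L^2(M, S) \mid \nabla_{V_1} \cdots \nabla_{V_j} \phi \in L^2(M, S), \ \forall j \leq p, \ V_i \in \mathcal{V}_b \}. \]
For $\gamma \in \mathbb{R}$, the space of conormal sections of order $\gamma$ is defined to be
\[ \mathcal{A}^\gamma(M, S) = \{ \phi \in C^\infty(M, S) \mid |\nabla_{V_1} \cdots \nabla_{V_j} \phi| \leq C x^\gamma, \ \forall j, \ V_i \in \mathcal{V}_b \}, \]
while the space of polyhomogeneous sections is
\[ \mathcal{A}^*_{phg}(M, S) = \Big\{ \phi \in \mathcal{A}^*(M, S) \ \Big| \ \phi \sim \sum_{\mathrm{Re}\,\gamma_j \to \infty} \sum_{k=0}^{N_j} \psi_{jk} \, x^{\gamma_j} (\log x)^k, \ \psi_{jk} \in C^\infty(\partial M, S) \Big\}. \]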
Here the expansion is the usual asymptotic expansion, uniform with all the derivatives. We usually specify all possible pairs $(\gamma_j, N_j)$ that can appear in the expansion, and the collection of $(\gamma_j, N_j)$ is called the index set.

Assume that $\ker D_F$ has constant dimension, so that it forms a vector bundle on the base $B$. Let $\Pi_0$ be the orthogonal projection onto $\ker D_F$ and $\Pi_\perp = I - \Pi_0$. The following is a summary of the results developed in [MM, V, HHMa].

Theorem 3.1. Suppose that $a$ is not an indicial root of $\Pi_0 x^{-1} D \Pi_0$. Then
\[ D : x^a L^{1,2}(M, S) \to x^{a+1} \Pi_0 L^2(M, S) \oplus x^a \Pi_\perp L^2(M, S) \]
is Fredholm. If $D\phi = 0$ for $\phi \in x^a L^2(M, S)$, then $\phi$ is polyhomogeneous with exponents in its expansion determined by the indicial roots of $\Pi_0 x^{-1} D \Pi_0$ and truncated at $a$. If $D\xi = \psi$ for $\psi \in \mathcal{A}^a(M, S)$ and $\xi \in x^{c-1} \Pi_0 L^{1,2}(M, S) \oplus x^c \Pi_\perp L^{1,2}(M, S)$ and $c < a$, then $\xi \in \Pi_0 \mathcal{A}^I_{phg}(M, S) + \mathcal{A}^a(M, S)$.

For the precise definition of the indicial root, and in particular, the indicial root of $\Pi_0 x^{-1} D \Pi_0$, we refer the reader to [MM, HHMa]. For our purpose, we only note that the indicial roots form a discrete set.
Remark. Strictly speaking, only $\mathring{g}$ is a fibered boundary metric in the pure sense, but it is easy to see that the results generalize to the metric $g$. In any case, the metric perturbation produces only a lower order term (cf. Sect. 4).

Lemma 3.2. If $R \geq 0$ and $a > \frac{k-2}{2}$ is not an indicial root, then
\[ D : x^a L^{1,2}(M, S) \to x^{a+1} \Pi_0 L^2(M, S) \oplus x^a \Pi_\perp L^2(M, S) \]
is an isomorphism.

Proof. We first see that it is injective. If $D\phi = 0$ for $\phi \in x^a L^2(M, S)$, then by Theorem 3.1, $\phi \in \mathcal{A}^a_{phg}(M, S)$. Now, from (2.5),
\[ \int_\Omega \Big( |\nabla\phi|^2 + \frac{R}{4} |\phi|^2 \Big) d\mathrm{vol} = \int_{\partial\Omega} \langle \nabla_\nu \phi, \phi \rangle \, d\mathrm{vol}(\partial\Omega). \]
By taking $\Omega$ so that $\partial\Omega = S_r \times X$ and letting $r \to \infty$, we see that the right-hand side goes to zero since $\phi \in \mathcal{A}^a_{phg}(M, S)$ and $a > \frac{k-2}{2}$. It follows then by the assumption $R \geq 0$ that $\phi$ is parallel and hence zero.

Now, if $\omega$ is in the cokernel of $D$, then, by the Fredholm property, $\omega \in x^{a+1} \Pi_0 L^2(M, S) \oplus x^a \Pi_\perp L^2(M, S)$ and $\omega$ is a weak solution of the Dirac equation: $\langle \omega, D\xi \rangle = 0$, $\forall \xi \in x^a L^{1,2}(M, S)$. It follows by the regularity part of Theorem 3.1 that $\omega \in \mathcal{A}^a_{phg}(M, S)$. Therefore the same argument as above shows $\omega = 0$.
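The role of the threshold $a > \frac{k-2}{2}$ in this argument can be seen from a simple power count. The sketch below assumes the conormal bounds $|\phi| \leq C x^a$ and $|x\partial_x \phi| \leq C x^a$ with $x = \frac{1}{r}$, and that the outer normal is $\nu = \partial_r = -x^2 \partial_x$ up to lower order terms; the constant $C$ is generic.

```latex
% Assumptions: x = 1/r, |\phi| \le C x^a, |\nabla_{x\partial_x}\phi| \le C x^a,
% \nu = \partial_r = -x (x\partial_x) up to lower order terms.
\begin{align*}
|\phi| &= O(r^{-a}), \qquad |\nabla_\nu \phi| = O(r^{-a-1}), \qquad
\mathrm{vol}(S_r \times X) = O(r^{k-1}), \\
\Big| \int_{S_r \times X} \langle \nabla_\nu \phi, \phi \rangle \, d\mathrm{vol} \Big|
  &\le C\, r^{-a-1}\, r^{-a}\, r^{k-1} = C\, r^{\,k-2-2a} .
\end{align*}
```

The bound tends to zero as $r \to \infty$ exactly when $k - 2 - 2a < 0$, that is, when $a > \frac{k-2}{2}$.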
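4. Computation of the Mass

Recall that $g = \mathring{g} + h$ with $\mathring{g} = g_{\mathbb{R}^k} + g_X$ and $h = O(r^{-\tau})$, $\mathring{\nabla} h = O(r^{-\tau-1})$, $\mathring{\nabla}\mathring{\nabla} h = O(r^{-\tau-2})$. Let $e^0_a$ be the orthonormal basis of $\mathring{g}$ which consists of $\frac{\partial}{\partial x_i}$ followed by an orthonormal basis $f_\alpha$ of $g_X$. Orthonormalizing $e^0_a$ with respect to $g$ gives rise to an orthonormal basis $e_a$ of $g$. Moreover,
\[ e_a = e^0_a - \frac{1}{2} h_{ab} e^0_b + O(r^{-2\tau}). \tag{4.8} \]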
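The first-order formula (4.8) can be verified by checking orthonormality up to quadratic terms in $h$. The sketch below assumes $h_{ab} = h(e^0_a, e^0_b)$ and summation over repeated indices.

```latex
% Assumption: h_{ab} = h(e^0_a, e^0_b) = O(r^{-\tau}); repeated indices are summed.
\begin{align*}
g\big(e^0_a - \tfrac12 h_{ac} e^0_c,\ e^0_b - \tfrac12 h_{bd} e^0_d\big)
  &= g(e^0_a, e^0_b) - \tfrac12 h_{ac}\, g(e^0_c, e^0_b)
     - \tfrac12 h_{bd}\, g(e^0_a, e^0_d) + O(h^2) \\
  &= (\delta_{ab} + h_{ab}) - \tfrac12 h_{ab} - \tfrac12 h_{ab} + O(h^2) \\
  &= \delta_{ab} + O(r^{-2\tau}) .
\end{align*}
```

So the frame $e^0_a - \frac{1}{2} h_{ab} e^0_b$ is $g$-orthonormal up to an error of order $r^{-2\tau}$, which is absorbed in the remainder of (4.8).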
This gives rise to a gauge transformation
\[ A : SO(\mathring{g}) \ni e^0_a \to e_a \in SO(g) \]
which identifies the corresponding spin groups and spinor bundles.

To compare $\nabla$ and $\mathring{\nabla}$, in particular their lifts to the spinor bundles, one introduces a new connection $\nabla^0 = A \circ \mathring{\nabla} \circ A^{-1}$. This connection is compatible with the metric $g$ but has a torsion
\[ T(X, Y) = \nabla^0_X Y - \nabla^0_Y X - [X, Y] = -(\mathring{\nabla}_X A) A^{-1} Y + (\mathring{\nabla}_Y A) A^{-1} X. \tag{4.9} \]
The difference of $\nabla$ and $\nabla^0$ is then expressible in terms of the torsion
\[ 2 \langle \nabla_X Y - \nabla^0_X Y, Z \rangle = \langle T(X, Y), Z \rangle - \langle T(X, Z), Y \rangle - \langle T(Y, Z), X \rangle, \tag{4.10} \]
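where we use the metric $g$ for the inner product $\langle \cdot, \cdot \rangle$. Since $\nabla$ and $\nabla^0$ are both $g$-compatible, their induced connections on the spinor bundle differ by
\[ \nabla_{e_a} - \nabla^0_{e_a} = -\frac{1}{4} \sum_{b,c} \big( \omega_{bc}(e_a) - \mathring{\omega}_{bc}(e_a) \big) e_b e_c, \tag{4.11} \]
where $e_b$, $e_c$ act on the spinors by the Clifford multiplication and the connection 1-forms are
\[ \omega_{bc}(e_a) = \langle \nabla_{e_a} e_b, e_c \rangle, \quad \mathring{\omega}_{bc}(e_a) = \langle \mathring{\nabla}_{e_a} e_b, e_c \rangle. \]
From (4.10) and (4.9) we obtain
\[ \nabla_{e_a} - \nabla^0_{e_a} = \frac{1}{8} \sum_{b \neq c} (\mathring{\nabla}_{e_b} g_{ac} - \mathring{\nabla}_{e_c} g_{ab}) e_b e_c + O(r^{-2\tau-1}) \tag{4.12} \]
for the difference of the two connections acting on spinors.

Lemma 4.1. There exists a harmonic spinor on $(M, g)$ which is asymptotic to a parallel spinor at infinity.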
Proof. Our manifold is $M = M_0 \cup M_\infty$ with $M_0$ compact and $M_\infty \simeq (\mathbb{R}^k - B_R(0)) \times X$. Since $k \geq 3$ and $X$ is simply connected, the end $M_\infty$ is also simply connected, and therefore has a unique spin structure coming from the product of the restriction of the spin structure on $\mathbb{R}^k$ and the spin structure on $X$. Now pick a unit norm parallel spinor $\psi_0$ of $(\mathbb{R}^k, g_{\mathbb{R}^k})$ and a unit norm parallel spinor $\psi_1$ of $(X, g_X)$. Then $\phi_0 = A(\psi_0 \otimes \psi_1)$ defines a spinor on $M_\infty$. We extend $\phi_0$ smoothly inside. Then $\nabla^0 \phi_0 = 0$ outside a compact set. Thus, it follows from (4.12) that
\[ \nabla\phi_0 = O(r^{-\tau-1}). \tag{4.13} \]
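The decay rate (4.13) follows from (4.12) in one line. The sketch below uses only that $\nabla^0 \phi_0 = 0$ near infinity, that $\mathring{\nabla} \mathring{g} = 0$ (so $\mathring{\nabla} g = \mathring{\nabla} h$), and the decay assumption (0.3).

```latex
\begin{align*}
\nabla_{e_a}\phi_0
  &= (\nabla_{e_a} - \nabla^0_{e_a})\,\phi_0
     \qquad (\nabla^0\phi_0 = 0 \text{ near infinity}) \\
  &= \frac18 \sum_{b \neq c}
     \big(\mathring{\nabla}_{e_b} h_{ac} - \mathring{\nabla}_{e_c} h_{ab}\big)\,
     e_b e_c\,\phi_0 + O(r^{-2\tau-1}) \\
  &= O(r^{-\tau-1}),
     \qquad \text{since } \mathring{\nabla} h = O(r^{-\tau-1})
     \text{ and } |e_b e_c\,\phi_0| = |\phi_0| \text{ for } b \neq c .
\end{align*}
```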
We now construct our harmonic spinor by setting $\phi = \phi_0 + \xi$ and solving $D\xi = -D\phi_0 \in O(r^{-\tau-1})$. By using Lemma 3.2, adjusting $\tau$ slightly if necessary so that it is not one of the indicial roots, we have a solution $\xi \in O(r^{-\tau})$.

Lemma 4.2. For the harmonic spinor $\phi$ constructed above, we have
\[ \lim_{R \to \infty} \int_{S_R \times X} \langle (\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle \, \mathrm{int}(e_a) \, d\mathrm{vol}(g) = \omega_k \mathrm{vol}(X) \, m(g). \]
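Proof. By (2.5),
\[ \int_{S_R \times X} \langle (\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle \, \mathrm{int}(e_a) \, d\mathrm{vol}(g) = \mathrm{Re} \int_{S_R \times X} \langle (\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle \, \mathrm{int}(e_a) \, d\mathrm{vol}(g). \]
Now,
\[ \langle (\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle = \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi, \phi \rangle \]
\[ = \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi_0, \phi_0 \rangle + \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi_0, \xi \rangle + \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \xi, \phi_0 \rangle + \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \xi, \xi \rangle. \tag{4.14} \]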
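The second term and the last term are $O(r^{-2\tau-1})$ and therefore contribute nothing in the limit. For the third term, one notices that if $\beta$ is the $(n-2)$-form
\[ \beta = \langle [e_a \cdot, e_b \cdot] \phi, \psi \rangle \, \mathrm{int}(e_a) \, \mathrm{int}(e_b) \, d\mathrm{vol}(g) \]
(Einstein summation here and below), then
\[ d\beta = -2 \big( \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi, \psi \rangle \, \mathrm{int}(e_b) \, d\mathrm{vol}(g) + \langle [e_a \cdot, e_b \cdot] \phi, \nabla_{e_b} \psi \rangle \, \mathrm{int}(e_b) \, d\mathrm{vol}(g) \big) \]
\[ = -4 \big( \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi, \psi \rangle \, \mathrm{int}(e_b) \, d\mathrm{vol}(g) - \langle \phi, [e_a \cdot, e_b \cdot] \nabla_{e_b} \psi \rangle \, \mathrm{int}(e_b) \, d\mathrm{vol}(g) \big), \]
which yields
\[ \int_{\partial\Omega} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi, \psi \rangle \, \mathrm{int}(e_b) \, d\mathrm{vol}(g) = \int_{\partial\Omega} \langle \phi, [e_a \cdot, e_b \cdot] \nabla_{e_b} \psi \rangle \, \mathrm{int}(e_b) \, d\mathrm{vol}(g). \tag{4.15} \]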
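It follows then that the third term is dealt with similarly to the second. Thus the only contribution comes from the first term, for which we note that
\[ \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi_0, \phi_0 \rangle = \frac{1}{2} \langle [e_a \cdot, e_b \cdot] (\nabla_{e_b} - \nabla^0_{e_b}) \phi_0, \phi_0 \rangle \]
\[ = \frac{1}{16} \sum_{c \neq d} (\mathring{\nabla}_{e_c} g_{bd} - \mathring{\nabla}_{e_d} g_{bc}) \langle [e_a \cdot, e_b \cdot] e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle + O(r^{-2\tau-1}) \]
by (4.12). Now
\[ \frac{1}{16} \sum_{c \neq d} (\mathring{\nabla}_{e_c} g_{bd} - \mathring{\nabla}_{e_d} g_{bc}) \langle [e_a \cdot, e_b \cdot] e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle \]
\[ = \frac{1}{8} \sum_{c \neq d} (\mathring{\nabla}_{e_c} g_{bd} - \mathring{\nabla}_{e_d} g_{bc}) \langle e_a \cdot e_b \cdot e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle + \frac{1}{8} \sum_{c \neq d} (\mathring{\nabla}_{e_c} g_{ad} - \mathring{\nabla}_{e_d} g_{ac}) \langle e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle \]
\[ = \frac{1}{8} \sum_{c \neq d} \mathring{\nabla}_{e_c} g_{bd} \langle e_a \cdot e_b \cdot e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle + \frac{1}{8} \sum_{c \neq d} \mathring{\nabla}_{e_d} g_{bb} \langle e_a \cdot e_d \cdot \phi_0, \phi_0 \rangle + \frac{1}{8} \sum_{c \neq d} (\mathring{\nabla}_{e_c} g_{ad} - \mathring{\nabla}_{e_d} g_{ac}) \langle e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle \]
\[ = \frac{1}{8} \sum_{c \neq d} \mathring{\nabla}_{e_c} g_{bb} \langle e_a \cdot e_c \cdot \phi_0, \phi_0 \rangle + \frac{1}{4} \sum_{c \neq d} \mathring{\nabla}_{e_b} g_{bd} \langle e_a \cdot e_d \cdot \phi_0, \phi_0 \rangle + \frac{1}{8} \sum_{c \neq d} \mathring{\nabla}_{e_d} g_{bb} \langle e_a \cdot e_d \cdot \phi_0, \phi_0 \rangle + \frac{1}{8} \sum_{c \neq d} (\mathring{\nabla}_{e_c} g_{ad} - \mathring{\nabla}_{e_d} g_{ac}) \langle e_c \cdot e_d \cdot \phi_0, \phi_0 \rangle. \]
For the last equality, we use $e_c \cdot e_d \cdot = \frac{1}{2} [e_c \cdot, e_d \cdot]$ for $c \neq d$, and that $[e_c \cdot, e_d \cdot]$ is skew-hermitian, to see that its real part is zero. Finally, one uses $e_a \cdot e_d \cdot = \frac{1}{2} [e_a \cdot, e_d \cdot] - \delta_{ad}$ and the skew-hermitian property of the commutators to obtain
\[ \mathrm{Re} \, \frac{1}{2} \langle [e_a \cdot, e_b \cdot] \nabla_{e_b} \phi_0, \phi_0 \rangle = \frac{1}{4} (\mathring{\nabla}_{e_b} g_{ab} - \mathring{\nabla}_{e_a} g_{bb}) |\phi_0|^2 + O(r^{-2\tau-1}). \]
This yields
\[ \lim_{R \to \infty} \int_{S_R \times X} \langle (\nabla_{e_a} + e_a \cdot D)\phi, \phi \rangle \, \mathrm{int}(e_a) \, d\mathrm{vol}(g) = \lim_{R \to \infty} \int_{S_R \times X} \frac{1}{4} (\mathring{\nabla}_{e_b} g_{ab} - \mathring{\nabla}_{e_a} g_{bb}) |\phi_0|^2 \, \mathrm{int}(e_a) \, d\mathrm{vol}(g). \]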
To see that this reduces to the definition of the mass, we first note that one can replace $e_a$ by $e^0_a$ in the integrand on the right-hand side, producing only an error of $O(r^{-2\tau-1})$, and then replace $d\mathrm{vol}(g)$ by $dx \, d\mathrm{vol}(X)$ with a similar error term.
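The final identification with (0.4) can be sketched as follows. The sketch assumes that $|\phi_0| = 1$ on the end (unit norm parallel spinors transported by the isometry $A$), and uses that on $S_R \times X$ only the Euclidean directions $a = j$ contribute, because $\mathrm{int}(f_\alpha)(dx \, d\mathrm{vol}(X))$ restricts to zero there while $\mathrm{int}(\partial/\partial x_j) \, dx = {*dx_j}$.

```latex
% Assumptions: |\phi_0| = 1 near infinity; int(f_\alpha)(dx\, dvol(X)) vanishes on S_R x X;
% int(\partial/\partial x_j)\, dx = *dx_j on the Euclidean factor.
\begin{align*}
\lim_{R\to\infty} \int_{S_R \times X}
  \frac14 \big(\mathring{\nabla}_{e^0_b} g_{ab} - \mathring{\nabla}_{e^0_a} g_{bb}\big)\,
  \mathrm{int}(e^0_a)\, dx\, d\mathrm{vol}(X)
 &= \frac14 \lim_{R\to\infty} \int_{S_R \times X}
  \big(\mathring{\nabla}_{e^0_b} g_{jb} - \mathring{\nabla}_{e^0_j} g_{bb}\big)\,
  {*dx_j}\, d\mathrm{vol}(X) \\
 &= \frac14 \cdot 4\,\omega_k\, \mathrm{vol}(X)\, m(g)
  = \omega_k\, \mathrm{vol}(X)\, m(g) .
\end{align*}
```

The last step is the definition (0.4) with the summation index $a$ renamed to $b$.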
5. Negative Energy Solutions in Kaluza-Klein Theory

It was observed by Witten that positive energy theorems do not extend immediately to Kaluza-Klein theory [Wi2]. He observed that there are two zero energy solutions on a space asymptotic to $M^4 \times S^1$ which should lead to perturbatively negative energy solutions. The explicit negative energy solutions were constructed later in [BP, BH].

The following example is from [BH]. Consider the analytically continued Reissner-Nordström metric
\[ ds^2 = \Big(1 - \frac{2m}{r} - \frac{q^2}{r^2}\Big) d\theta^2 + \Big(1 - \frac{2m}{r} - \frac{q^2}{r^2}\Big)^{-1} dr^2 + r^2 d\Omega_2^2, \]
where $r \geq r_+ = m + \sqrt{m^2 + q^2}$, $\theta \in \mathbb{R} / \frac{2\pi r_+^2}{r_+ - m} \mathbb{Z}$, and $d\Omega_2^2$ is the standard metric on the 2-sphere. This is a scalar flat metric on $\mathbb{R}^2 \times S^2$, asymptotic to $\mathbb{R}^3 \times S^1$ at infinity. The mass can be computed via (0.4), which gives
\[ m(g) = \frac{1}{2} \, \frac{r_+ - m}{2\pi r_+^2} \, m. \tag{5.16} \]
For fixed asymptotic geometry, i.e., fixed circle size $\frac{2\pi r_+^2}{r_+ - m} = l$, this can be made arbitrarily negative if one takes $m < 0$ sufficiently large in absolute value, while $q \neq 0$ is chosen appropriately (which will necessarily be large as well).

The reason here is that the end $\mathbb{R}^3 \times S^1$, and in particular $S^1$, has the wrong spin structure! Recall that $S^1$ has two spin structures, which correspond to the trivial double cover of $S^1$ and the nontrivial double cover of $S^1$. Here, since $S^1$ bounds the disk inside, it has the spin structure corresponding to the nontrivial double cover. It therefore has no parallel spinor.
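The values of $r_+$ and of the period of $\theta$ in this example can be checked by a short computation. The sketch below introduces the shorthand $f(r) = 1 - \frac{2m}{r} - \frac{q^2}{r^2}$ and the radial variable $\rho$, neither of which appears in the text, and uses the usual requirement that the metric close up smoothly at $r = r_+$.

```latex
% Shorthand (assumption for illustration): f(r) = 1 - 2m/r - q^2/r^2.
\begin{align*}
f(r_+) = 0 \;&\Longleftrightarrow\; r_+^2 - 2m\,r_+ - q^2 = 0
  \;\Longrightarrow\; r_+ = m + \sqrt{m^2 + q^2}, \\
f'(r_+) &= \frac{2m}{r_+^2} + \frac{2q^2}{r_+^3}
  = \frac{2(r_+ - m)}{r_+^2}
  \qquad (\text{using } q^2 = r_+^2 - 2m\,r_+), \\
\rho = 2\sqrt{\frac{r - r_+}{f'(r_+)}}
  \;&\Longrightarrow\;
  f\,d\theta^2 + f^{-1}dr^2 \approx d\rho^2 + \Big(\frac{f'(r_+)}{2}\Big)^2 \rho^2\, d\theta^2, \\
\text{period of } \theta &= \frac{4\pi}{f'(r_+)} = \frac{2\pi r_+^2}{r_+ - m} = l,
  \qquad\text{so that (5.16) reads } m(g) = \frac{m}{2l} .
\end{align*}
```

In this form it is visible that, at fixed circle size $l$, the mass has the sign of the parameter $m$. Note also that $r_+ - m = \sqrt{m^2 + q^2} > 0$ and $r_+ > 0$ for $q \neq 0$, even when $m < 0$.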
Acknowledgement. This work is motivated and inspired by the work of Gary Horowitz and his collaborators [HHM]. The author is indebted to Gary for sharing his ideas and for interesting discussions. The author would also like to thank Is Singer for bringing them together and for useful discussion. Thanks are also due to Xiao Zhang for useful comments.
References
[AnD] Andersson, L., Dahl, M.: Scalar curvature rigidity for asymptotically locally hyperbolic manifolds. Ann. Glob. Anal. Geom. 16, 1–27 (1998)
[ADM] Arnowitt, S., Deser, S., Misner, C.: Coordinate invariance and energy expressions in general relativity. Phys. Rev. 122, 997–1006 (1961)
[AsHa] Ashtekar, A., Hansen, R.: A unified treatment of null and spatial infinity in general relativity. I. Universal structure, asymptotic symmetries, and conserved quantities at spatial infinity. J. Math. Phys. 19, 1542–1566 (1978)
[AsHo] Ashtekar, A., Horowitz, G.: Energy-momentum of isolated systems cannot be null. Phys. Lett. 89A, 181–184 (1982)
[Ba1] Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 36, 661–693 (1986)
[Ba2] Bartnik, R.: Quasi-spherical metrics and prescribed scalar curvature. J. Diff. Geom. 37, 31–71 (1993)
[BP] Brill, D., Pfister, H.: States Of Negative Total Energy In Kaluza-Klein Theory. Phys. Lett. B 228, 359 (1989)
[BH] Brill, D., Horowitz, G.T.: Negative Energy In String Theory. Phys. Lett. B 262, 437 (1991)
[CG] Cheeger, J., Gromoll, D.: The splitting theorem for manifolds of non-negative Ricci curvature. J. Diff. Geom. 6, 119–128 (1971)
[CHSW] Candelas, P., Horowitz, G., Strominger, A., Witten, E.: Vacuum configurations for superstrings. Nucl. Phys. B258, 46 (1985)
[Ch1] Chruściel, P.: Boundary conditions at spatial infinity from a Hamiltonian point of view. In: Topological Properties and Global Structure of Space-Time (Erice, 1985), NATO, Adv. Sci. Inst. Ser. B: Phys. 138, New York: Plenum, 1986, pp. 49–59
[GHHP] Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
[HHMa] Hausel, T., Hunsicker, E., Mazzeo, R.: Hodge cohomology of gravitational instantons. To appear in Duke Math J.
[HHM] Hertog, T., Horowitz, G., Maeda, K.: Negative energy density in Calabi-Yau compactifications. hep-th/0304199
[He] Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
[HM] Horowitz, G., Myers, R.: The AdS/CFT correspondence and a new positive energy conjecture for general relativity. Phys. Rev. D59, 026005 (1999)
[HP] Horowitz, G., Perry, M.: Gravitational energy cannot become negative. Phys. Rev. Lett. 48, 371–374 (1982)
[HT] Horowitz, G., Tod, P.: A relation between local and total energy in general relativity. Commun. Math. Phys. 85, 429–447 (1982)
[LM] Lawson, H., Michelsohn, M.: Spin Geometry. Princeton Math. Series, Vol. 38, Princeton, NJ: Princeton University Press, 1989
[LP] Lee, J., Parker, T.: The Yamabe problem. Bull. Am. Math. Soc. 17, 31–81 (1987)
[MM] Mazzeo, R., Melrose, R.: Pseudodifferential operators on manifolds with fibered boundaries. Asian J. Math. 2, 833–866 (1998)
[PT] Parker, T., Taubes, C.: On Witten's proof of the positive energy theorem. Commun. Math. Phys. 84, 223–238 (1982)
[Pe] Penrose, R.: Some unsolved problems in classical general relativity. In: Seminar on Differential Geometry, S.-T. Yau, (ed.), Annals of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 631–668
[RegT] Regge, T., Teitelboim, C.: Role of surface integrals in the Hamiltonian formulation of general relativity. Ann. Phys. 88, 286–318 (1974)
[S1] Stolz, S.: Simply Connected Manifolds of Positive Scalar Curvature. Bull. Am. Math. Soc. 23, 427 (1990)
[S2] Stolz, S.: Simply connected manifolds of positive scalar curvature. Ann. of Math. (2) 136(3), 511–540 (1992)
[SY1] Schoen, R., Yau, S.-T.: On the proof of the positive mass conjecture in general relativity. Commun. Math. Phys. 65, 45–76 (1979)
[SY2] Schoen, R., Yau, S.-T.: The energy and the linear momentum of spacetimes in general relativity. Commun. Math. Phys. 79, 47–51 (1981)
[SY3] Schoen, R., Yau, S.-T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231–260 (1981)
[V] Vaillant, B.: Index and spectral theory for manifolds with generalized fibered cusps. Preprint, math.DG/0102072
[Wa] Wang, M.: Parallel spinors and parallel forms. Ann. Global Anal. Geom. 7(1), 59–68 (1989)
[Wi1] Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
[Wi2] Witten, E.: Instability Of The Kaluza-Klein Vacuum. Nucl. Phys. B 195, 481 (1982)
[Yo] York, J.: Energy and momentum of the gravitational field. In: Essays in General Relativity, F.J. Tipler, (ed.), New York: Academic Press, 1980
[Z] Zhang, X.: Angular momentum and positive mass theorem. Commun. Math. Phys. 206, 137–155 (1999)

Communicated by G. W. Gibbons
Commun. Math. Phys. 244, 347–393 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0993-3
Communications in
Mathematical Physics
One-Dimensional Behavior of Dilute, Trapped Bose Gases
Elliott H. Lieb¹, Robert Seiringer¹,², Jakob Yngvason²
¹ Department of Physics, Jadwin Hall, Princeton University, P. O. Box 708, Princeton, NJ 08544, USA
² Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, 1090 Vienna, Austria
Received: 12 May 2003 / Accepted: 14 August 2003 Published online: 25 November 2003 – © E.H. Lieb, R. Seiringer, J. Yngvason 2003
Abstract: Recent experimental and theoretical work has shown that there are conditions in which a trapped, low-density Bose gas behaves like the one-dimensional delta-function Bose gas solved years ago by Lieb and Liniger. This is an intrinsically quantum-mechanical phenomenon because it is not necessary to have a trap width that is the size of an atom (as might have been supposed), but it suffices merely to have a trap width such that the energy gap for motion in the transverse direction is large compared to the energy associated with the motion along the trap. Up to now the theoretical arguments have been based on variational-perturbative ideas or numerical investigations. In contrast, this paper gives a rigorous proof of the one-dimensional behavior as far as the ground state energy and particle density are concerned. There are four parameters involved: the particle number, $N$, transverse and longitudinal dimensions of the trap, $r$ and $L$, and the scattering length $a$ of the interaction potential. Our main result is that if $r/L \to 0$ and $N \to \infty$ the ground state energy and density can be obtained by minimizing a one-dimensional density functional involving the Lieb-Liniger energy density with coupling constant $\sim a/r^2$. This density functional simplifies in various limiting cases and we identify five asymptotic parameter regions altogether. Three of these, corresponding to the weak coupling regime, can also be obtained as limits of a three-dimensional Gross-Pitaevskii theory. We also show that Bose-Einstein condensation in the ground state persists in a part of this regime. In the strong coupling regime the longitudinal motion of the particles is strongly correlated. The Gross-Pitaevskii description is not valid in this regime and new mathematical methods come into play.
© 2003 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.
Work partially supported by U.S. National Science Foundation grant PHY 01-39984.
Erwin Schrödinger Fellow, supported by the Austrian Science Fund.
1. Introduction

The technique of trapping and cooling atoms, which led to the first realization of Bose-Einstein condensation (BEC) in dilute alkali gases in 1995 [5, 18], has recently opened the possibility for experimental studies, in highly elongated traps, of Bose gases that are effectively one-dimensional. Some of the remarkable properties of ultracold one-dimensional Bose systems with delta function interactions, analyzed long ago [22, 23], may thus become accessible to experimental scrutiny in the not too distant future. Among these are pseudo-fermionic behavior [12], the absence of BEC in a dilute limit [21, 36, 14, 34, 10], and an excitation spectrum different from that predicted by Bogoliubov's theory [23, 17, 20]. The paper [33] by Olshanii triggered a number of theoretical investigations on the transitions from 3D to an effective 1D behavior with its peculiar properties, see, e.g., [6–8, 11, 13, 19, 32, 35, 39]; systems showing the first evidence of such a transition have recently been prepared experimentally [4, 15, 16, 38].

Until now the theoretical work on the dimensional cross-over in elongated traps has either been based on variational calculations, starting from a three-dimensional delta-potential [7, 13, 33], or on numerical Monte Carlo studies [1, 3] with more realistic, genuine 3D potentials but particle numbers limited to the order of 100. This work is important and has led to valuable insights, in particular about different parameter regions [8, 35, 32], but a more thorough theoretical understanding is clearly desirable since this is not a simple problem. In fact, it is evident that for a potential with a hard core the true 3D wave function does not approximately factorize in the longitudinal and transverse variables, and the effective one-dimensional potential cannot be obtained by simply integrating out the transverse variables of the 3D potential. In this sense the problem is more complicated than in a somewhat analogous situation of atoms in extremely strong magnetic fields [2, 30], where the Coulomb interaction behaves like an effective one-dimensional delta potential when the magnetic field shrinks the cyclotron radius of the electrons to zero. In that case the delta potential can be obtained formally by integrating out the variables transverse to the field in a suitably scaled Coulomb potential. With a hard core, on the other hand, where the energy is essentially kinetic, this method will not work since it would immediately introduce impenetrable barriers in 1D. The one-dimensional effective interaction emerges only if the kinetic part of the Hamiltonian and the potential are considered together.

In the present paper we start with an arbitrary, repulsive 3D pair potential of finite range and prove rigorously that in a well-defined limit the ground state energy and particle density of the system are described exactly by a one-dimensional model with delta-function interaction. This is a highly quantum-mechanical phenomenon with no classical counterpart, since a 1D description is possible even though the transverse trap dimension is much larger than the range of the atomic forces. It suffices that the energy gap associated with the transverse confinement is much larger than the internal energy per particle. While the three-dimensional density remains low (in the sense that the distance between particles is large compared to the three-dimensional scattering length), the one-dimensional density can be either high or low. We remark that, in contrast to three-dimensional gases, high density in one dimension corresponds to weak interactions and vice versa [22].

In this paper we shall always be concerned with large particle number, $N$, which is appropriate for the consideration of actual experiments. In order to make precise statements we shall typically take the limit $N \to \infty$, but the reader can confidently apply these limiting statements to finite numbers like $N = 100$.
Besides $N$, the parameters of the problem are the scattering length, $a$, of the two-body interaction potential, and two lengths, $r$ and $L$, describing the transverse and the longitudinal extension of the trap potential, respectively. To keep the introductory discussion simple let us first think of the case that the particles are confined in a box with dimensions $r$ and $L$. The three-dimensional particle density is then $\rho^{3D} = N/(r^2 L)$ and the one-dimensional density $\rho^{1D} = N/L$. The case of quadratic or more general trapping potentials will be considered later.

We begin by describing the division of the space of parameters into two basic regions. This decomposition will eventually be refined into five regions, but for the moment let us concentrate on the basic dichotomy. In earlier work [27, 26] we have proved that the three-dimensional Gross-Pitaevskii formula for the energy (including its limiting "Thomas-Fermi" case) is correct to leading order in situations in which $a$ is small and $N$ is large. This energy has two parts: the energy necessary to confine the particles in the trap, which is roughly $(\hbar^2/2m) N (r^{-2} + L^{-2})$, plus the internal energy of interaction, which is $(\hbar^2/2m) N 4\pi a \rho^{3D}$. The trouble is that while this formula is correct for a fixed confining potential in the limit $N \to \infty$ with $a^3 \rho^{3D} \to 0$, it does not hold uniformly if $r/L$ gets small as $N$ gets large. In other words, new physics can come into play as $r/L \to 0$, and it turns out that this depends on the ratio of $a/r^2$ to $\rho^{1D} = N/L$.

As we shall show, the two basic regimes to consider in highly elongated traps, i.e., when $r \ll L$, are
• The one-dimensional limit of the three-dimensional Gross-Pitaevskii/"Thomas-Fermi" regime
• The "true" one-dimensional regime.
The former is characterized by $aL/r^2N \to 0$, while in the latter regime $aL/r^2N$ is of the order one or even tends to infinity (which is referred to as the Girardeau-Tonks¹ region).
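The combination $aL/r^2N$ can be traced back to the box densities introduced above. The following one-line computation is a sketch in which constants of order one (such as $4\pi$ and $\hbar^2/2m$) are dropped; it compares the three-dimensional interaction energy per particle with the natural one-dimensional energy scale.

```latex
% Box geometry from the text: \rho^{3D} = N/(r^2 L), \rho^{1D} = N/L.
% Constants of order one are dropped.
\begin{align*}
\frac{a\,\rho^{3D}}{(\rho^{1D})^2}
  = \frac{N a/(r^2 L)}{N^2/L^2}
  = \frac{a L}{r^2 N}
  = \frac{a/r^2}{\rho^{1D}} .
\end{align*}
```

The last expression is the ratio of $a/r^2$ to $\rho^{1D}$ mentioned above, so a small value of $aL/r^2N$ corresponds to high one-dimensional density relative to the coupling, and a value of order one or larger to low density.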
These two regimes correspond to high one-dimensional density (weak interaction) and low one-dimensional density (strong interaction), respectively. The significance of the combination aL/r 2 N can be understood by noting that it is the ratio of the 3D energy per particle, ∼ aρ 3D ∼ N a/r 2 L, to the 1D energy ∼ (ρ 1D )2 = (N/L)2 . Physically, the main difference between the two regimes is that for strong interactions the motion of the particles in the longitudinal direction is highly correlated, while in the weak interaction regime it is not. Mathematically, this distinction also shows up in our proofs. In both regimes the internal energy of the gas is small compared to the energy of confinement which is of order N/r 2 . However, this in itself does not imply a specifically one-dimensional behavior. (If a is sufficiently small it is satisfied in a trap of any shape.) One-dimensional behavior, when it occurs, manifests itself by the fact that the transverse motion of the atoms is uncorrelated while the longitudinal motion is correlated (very roughly speaking) in the same way as pearls on a necklace. Thus, the true criterion for 1D behavior is that aL/r 2 N is of the order unity and not merely the condition that the energy of confinement dominates the internal energy. The starting point for our investigations is the Hamiltonian for N spinless Bosons in a confining 3D trap potential and with a short range, repulsive pair interaction. We find 1 We call this the Girardeau-Tonks region only because many authors refer in the present context to Tonks [41]. In our opinion this should really be called the Girardeau region because it was he who first understood how to compute the spectrum of a 1D quantum-mechanical hard core gas and who understood that the Fermi-Dirac wave functions played a role [12]. Tonks was interested in the positive temperature partition functions of a hard core classical gas – a very different and much simpler question.
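it convenient to write the Hamiltonian in the following way (in appropriate units):

$$H_{N,L,r,a} = \sum_{j=1}^{N} \left( -\Delta_j + V_r^\perp(\mathbf{x}_j^\perp) + V_L(z_j) \right) + \sum_{1\le i<j\le N} v_a(|\mathbf{x}_i - \mathbf{x}_j|) \tag{1.1}$$

with $\mathbf{x} = (x, y, z) = (\mathbf{x}^\perp, z)$,

$$V_r^\perp(\mathbf{x}^\perp) = \frac{1}{r^2} V^\perp(\mathbf{x}^\perp/r), \qquad V_L(z) = \frac{1}{L^2} V(z/L), \qquad v_a(|\mathbf{x}|) = \frac{1}{a^2} v(|\mathbf{x}|/a), \tag{1.2}$$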
where $r$, $L$, $a$ are variable scaling parameters while $V^\perp$, $V$ and $v$ are fixed. The interaction potential $v$ is supposed to be nonnegative, of finite range and have scattering length 1; the scaled potential $v_a$ then has scattering length $a$. The external trap potentials $V$ and $V^\perp$ confine the motion in the longitudinal ($z$) and the transversal ($\mathbf{x}^\perp$) directions, respectively, and are assumed to be locally bounded and tend to $\infty$ as $|z|$ and $|\mathbf{x}^\perp|$ tend to $\infty$. To simplify the discussion we find it also convenient to assume throughout that $V$ is homogeneous of some order $s > 0$, namely $V(z) = |z|^s$, but weaker assumptions, e.g. asymptotic homogeneity [28], would in fact suffice. The case of a simple box with hard walls is realized by taking $s = \infty$, while the usual harmonic approximation is $s = 2$. Moreover, to avoid unnecessary technicalities we shall assume that $V^\perp$ is polynomially bounded at infinity, but our results certainly also hold for faster growing potentials, or even finite domains with Dirichlet boundary conditions. Units are chosen so that $\hbar = 1$ and $2m = 1$. It is understood that the lengths associated with the ground states of $-d^2/dz^2 + V(z)$ and $-\Delta^\perp + V^\perp(\mathbf{x}^\perp)$ are both of the order 1 so that $L$ and $r$ measure, respectively, the longitudinal and the transverse extensions of the trap. We denote the ground state energy of (1.1) by $E^{QM}(N, L, r, a)$ and the ground state particle density by $\rho^{QM}_{N,L,r,a}(\mathbf{x})$.

In parallel with the three-dimensional Hamiltonian we consider the Hamiltonian for $n$ Bosons in one dimension with delta interaction and coupling constant $g \ge 0$, i.e.,

$$H^{1D}_{n,g} = \sum_{j=1}^{n} -\partial_j^2 + g \sum_{1\le i<j\le n} \delta(z_i - z_j), \tag{1.3}$$
where $\partial_j = \partial/\partial z_j$. We consider this Hamiltonian for the $z_j$ in an interval of length $\ell$ in the thermodynamic limit, $\ell \to \infty$, $n \to \infty$ with $\rho = n/\ell$ fixed. The ground state energy per particle in this limit is independent of boundary conditions and can, according to [22], be written as

$$e_0^{1D}(\rho) = \rho^2 e(g/\rho), \tag{1.4}$$

with a function $e(t)$ determined by a certain integral equation. Its asymptotic form is $e(t) \approx \tfrac12 t$ for $t \ll 1$ and $e(t) \to \pi^2/3$ for $t \to \infty$. Thus

$$e_0^{1D}(\rho) \approx \tfrac12 g\rho \quad \text{for } g/\rho \ll 1 \tag{1.5}$$

and

$$e_0^{1D}(\rho) \approx \frac{\pi^2}{3}\rho^2 \quad \text{for } g/\rho \gg 1. \tag{1.6}$$
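The two asymptotic forms (1.5) and (1.6) can be checked numerically. The following is a minimal sketch, assuming the standard rescaled form of the Lieb-Liniger integral equation from [22], which is written there for the interaction $2c\sum\delta$, so that $g = 2c$ in the present convention. The function name `lieb_liniger_point`, the midpoint quadrature and the fixed-point iteration are illustrative choices and not part of the paper; for small `lam` the kernel is sharply peaked and more nodes and iterations would be needed.

```python
import math

def lieb_liniger_point(lam, nodes=100, iterations=100):
    """Return (t, e) with t = g/rho and e = e(t) in the convention of (1.4).

    Assumed equations (rescaled Lieb-Liniger form, interaction 2c*sum(delta)):
        2*pi*G(x) = 1 + 2*lam * int_{-1}^{1} G(y) / (lam^2 + (x - y)^2) dy,
        gamma = lam / int G,      e = (gamma/lam)^3 * int x^2 G(x) dx,
    where gamma = c/rho, hence t = g/rho = 2*gamma.
    """
    h = 2.0 / nodes
    xs = [-1.0 + (i + 0.5) * h for i in range(nodes)]  # midpoint nodes on [-1, 1]
    kernel = [[2.0 * lam / (lam * lam + (x - y) ** 2) for y in xs] for x in xs]
    G = [1.0 / (2.0 * math.pi)] * nodes
    for _ in range(iterations):  # fixed-point iteration of the integral equation
        G = [(1.0 + h * sum(k * gy for k, gy in zip(row, G))) / (2.0 * math.pi)
             for row in kernel]
    mass = h * sum(G)
    second_moment = h * sum(x * x * gx for x, gx in zip(xs, G))
    gamma = lam / mass
    e = (gamma / lam) ** 3 * second_moment
    return 2.0 * gamma, e
```

Calling `lieb_liniger_point(lam)` for increasing `lam` traces out the curve $t \mapsto e(t)$ parametrically; large `lam` corresponds to the strongly interacting limit in which $e(t)$ approaches $\pi^2/3$.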
Taking $\rho e_0^{1D}(\rho)$ as a local energy density for an inhomogeneous one-dimensional system we can form the energy functional

$$\mathcal{E}[\rho] = \int_{\mathbb{R}} \left( |\nabla\sqrt{\rho}(z)|^2 + V_L(z)\rho(z) + \rho(z)^3 e(g/\rho(z)) \right) dz \tag{1.7}$$

with ground state energy defined by minimizing $\mathcal{E}[\rho]$ over all normalized densities $\rho$, i.e.,

$$E(N, L, g) = \inf\left\{ \mathcal{E}[\rho] : \rho \ge 0,\ \int_{\mathbb{R}} \rho(z)\,dz = N \right\}. \tag{1.8}$$

By standard methods (cf., e.g., [27]) one can show that there is a unique minimizer, i.e., a density $\rho_{N,L,g}(z)$ with $\int \rho_{N,L,g}(z)\,dz = N$ and $\mathcal{E}[\rho_{N,L,g}] = E(N, L, g)$. Here it is important to note that $t \mapsto t^3 e(1/t)$ is convex. We define the mean 1D density of this minimizer to be

$$\bar\rho = \frac{1}{N} \int_{\mathbb{R}} \left( \rho_{N,L,g}(z) \right)^2 dz. \tag{1.9}$$

In a rigid box, i.e., for $s = \infty$, $\bar\rho$ is simply $N/L$, but in more general traps it depends also on $g$ besides $N$ and $L$. The order of magnitude of $\bar\rho$ in the various parameter regions will be described in the next section.

Our main result relates the 3D ground state energy of (1.1), $E^{QM}(N, L, r, a)$, to the 1D density functional energy $E(N, L, g)$ for a suitable $g$ in the large $N$ limit provided $r/L$ and $a/r$ are sufficiently small. To state this precisely, let $e^\perp$ and $b(\mathbf{x}^\perp)$ denote the ground state energy and the normalized, nonnegative ground state wave function of $-\Delta^\perp + V^\perp(\mathbf{x}^\perp)$, respectively. The corresponding quantities for $-\Delta^\perp + V_r^\perp(\mathbf{x}^\perp)$ are $e^\perp/r^2$ and $b_r(\mathbf{x}^\perp) = (1/r)\,b(\mathbf{x}^\perp/r)$. In the case that the trap is a cylinder with hard walls $b$ is a Bessel function; for a quadratic $V^\perp$ it is a Gaussian. In any case, $b$ is a bounded function and, in particular, $b \in L^4(\mathbb{R}^2)$. Hence we can define $g$ by

$$g = \frac{8\pi a}{r^2} \int_{\mathbb{R}^2} |b(\mathbf{x}^\perp)|^4\, d^2\mathbf{x}^\perp = 8\pi a \int_{\mathbb{R}^2} |b_r(\mathbf{x}^\perp)|^4\, d^2\mathbf{x}^\perp. \tag{1.10}$$

Our main theorem is:

Theorem 1.1 (From 3D to 1D). Let $N \to \infty$ and simultaneously $r/L \to 0$ and $a/r \to 0$ in such a way that $r^2\bar\rho \cdot \min\{\bar\rho, g\} \to 0$. Then

$$\lim \frac{E^{QM}(N, L, r, a) - N e^\perp/r^2}{E(N, L, g)} = 1. \tag{1.11}$$
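As an illustration of the definition (1.10) of $g$, consider a quadratic transverse potential. The sketch below assumes the specific choice $V^\perp(\mathbf{x}^\perp) = |\mathbf{x}^\perp|^2$ (with $\hbar = 2m = 1$), for which the normalized ground state of $-\Delta^\perp + V^\perp$ is $b(\mathbf{x}^\perp) = \pi^{-1/2} e^{-|\mathbf{x}^\perp|^2/2}$ with $e^\perp = 2$, so that $\int |b|^4 = 1/(2\pi)$ and $g = 4a/r^2$. The helper names and the quadrature parameters are hypothetical.

```python
import math

def b_gauss(radius):
    """Normalized 2D ground state of -Laplacian + |x|^2 (assumed trap) at |x| = radius."""
    return math.exp(-0.5 * radius * radius) / math.sqrt(math.pi)

def radial_integral(func, rmax=8.0, steps=20000):
    """Midpoint approximation of the integral of func(|x|) over R^2 in polar coordinates."""
    h = rmax / steps
    total = 0.0
    for i in range(steps):
        radius = (i + 0.5) * h
        total += func(radius) * 2.0 * math.pi * radius * h
    return total

def coupling_g(a, r):
    """g from (1.10): (8*pi*a / r^2) times the integral of |b|^4."""
    b4 = radial_integral(lambda radius: b_gauss(radius) ** 4)
    return 8.0 * math.pi * a / r ** 2 * b4
```

For this Gaussian profile the numerical value of `coupling_g(a, r)` should agree with the closed form $4a/r^2$ up to quadrature error; other transverse potentials only change the value of $\int |b|^4$.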
Note that because of (1.5) and (1.6) the condition $r^2\bar\rho \cdot \min\{\bar\rho, g\} \to 0$ is the same as

$$e_0^{1D}(\bar\rho) \ll 1/r^2, \tag{1.12}$$

i.e., the average energy per particle associated with the longitudinal motion should be much smaller than the energy gap between the ground and first excited state of the confining Hamiltonian in the transverse directions. (The precise meaning of $\ll$ is that the ratio of the left side to the right side tends to zero in the limit considered.) Note also that while
the one-dimensional density can be either high or low (compared to $g$), the gas is always dilute in a three-dimensional sense in the limit considered, i.e., $a^3\rho^{3D} \sim a^2 g\bar\rho \ll 1$.

The two regimes mentioned previously correspond to specific restrictions on the size of the ratio $g/\bar\rho$ as $N \to \infty$, namely $g/\bar\rho \ll 1$ for the limit of the 3D Gross-Pitaevskii regime (weak interaction/high 1D density), and $g/\bar\rho \gtrsim 1$ for the “true” one-dimensional regime (strong interaction/low 1D density). We shall now describe briefly the finer division of these regimes into five regions altogether. Three of them (Regions 1–3) belong to the weak interaction regime and two (Regions 4–5) to the strong interaction regime. In each of these regions the general functional (1.7) can be replaced by a different, simpler functional, and the energy $E(N, L, g)$ in Theorem 1.1 by the ground state energy of that functional. The five regions are

• Region 1, the Ideal Gas case: $g/\bar\rho \ll N^{-2}$, with $\bar\rho \sim N/L$, corresponding to a non-interacting gas in an external potential.
• Region 2, the 1D GP case: $g/\bar\rho \sim N^{-2}$, with $\bar\rho \sim N/L$, described by a 1D Gross-Pitaevskii energy functional with energy density $\tfrac12 g\rho^2$.
• Region 3, the 1D TF case: $N^{-2} \ll g/\bar\rho \ll 1$, with $\bar\rho \sim (N/L)(NgL)^{-1/(s+1)}$, where $s$ is the degree of homogeneity of the longitudinal confining potential $V$. This region is described by a Thomas-Fermi type functional with energy density $\tfrac12 g\rho^2$, without a gradient term.
• Region 4, the LL case: $g/\bar\rho \sim 1$, with $\bar\rho \sim (N/L)N^{-2/(s+2)}$, described by an energy functional with the Lieb-Liniger energy (1.4), without a gradient term.
• Region 5, the GT case: $g/\bar\rho \gg 1$, with $\bar\rho \sim (N/L)N^{-2/(s+2)}$, described by a functional with energy density $\sim \rho^3$, corresponding to the Girardeau-Tonks limit of the LL energy density.

We note that the condition $g/\bar\rho \sim 1$ means that Region 4 requires the gas cloud to have aspect ratio $r/\bar L$ of the order $N^{-1}(a/r)$ or smaller, where $\bar L \equiv N/\bar\rho \sim LN^{2/(s+2)}$ is the length of the cloud.
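The division into five regions can be turned into a rough bookkeeping aid. The following is only a heuristic sketch: the regions are defined asymptotically, so the finite threshold `tol`, which stands in for “$\ll$” and “$\gg$”, is a hypothetical parameter, as are the function names. It uses that $g/\bar\rho \sim N^{-2}$ with $\bar\rho \sim N/L$ is the same as $NgL \sim 1$, together with the length scales $L(NgL)^{1/(s+1)}$ and $LN^{2/(s+2)}$ implied by the mean densities quoted for Regions 3 and 4–5.

```python
def length_scales(N, L, g, s):
    """Return (TF cloud length, LL cloud length, density parameter N / LL length)."""
    length_tf = L * (N * g * L) ** (1.0 / (s + 1))
    length_ll = L * N ** (2.0 / (s + 2))
    return length_tf, length_ll, N / length_ll

def classify_region(N, L, g, s, tol=10.0):
    """Heuristic region label 1..5; `tol` is an assumed cutoff, not from the paper."""
    x = N * g * L  # g/rho_bar ~ N^-2 corresponds to NgL ~ 1
    length_tf, _, gamma = length_scales(N, L, g, s)
    if x < 1.0 / tol:
        return 1  # ideal gas
    if x <= tol:
        return 2  # 1D GP
    if g * length_tf / N < 1.0 / tol:
        return 3  # 1D TF: NgL large but g/rho_bar still small
    if g / gamma <= tol:
        return 4  # LL: g/rho_bar of order one
    return 5      # GT: g/rho_bar large
```

For example, with $s = 2$, $N = 10^4$ and $L = 1$, increasing $g$ from $10^{-6}$ to $10^{5}$ moves the label through all five regions in order.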
Experimentally, such small aspect ratios are quite a challenge and the situations described in [4, 15, 16, 38] are still rather far from this regime. It may not be completely out of reach, however.

The condition $a/r \to 0$ is automatically fulfilled in Regions 1–4, provided (1.12) holds, since $(a/r)^2 = r^2 (g\bar\rho)(g/\bar\rho) = r^2\bar\rho^2 (g/\bar\rho)^2 \ll 1$ by (1.12) if $g/\bar\rho$ is bounded. Moreover, as discussed in the next section, in Regions 1–2 the condition $r/L \to 0$ implies (1.12) and hence $a/r \to 0$, and in Region 4, $a/r \to 0$ implies (1.12). The hypotheses of Theorem 1.1 are thus not entirely independent.

In the next Sect. 2 we define the various energy functionals more precisely and also discuss the 1D behavior of the density $\rho^{QM}_{N,L,r,a}(\mathbf{x})$, separately for each region. Moreover, we prove, in Subsect. 2.7, that Regions 1–3 can be reached as limiting cases of a 3D Gross-Pitaevskii theory. In this sense, the behavior in these regions contains remnants of the 3D theory, which also shows up in the fact that BEC prevails in Regions 1 and 2, as discussed in Sect. 5. Heuristically, these traces of 3D can be understood from the fact that in Regions 1–3 the 1D formula for the energy per particle, $g\bar\rho \sim aN/(r^2\bar L)$, gives the same result as the three-dimensional formula [31], i.e., scattering length times three-dimensional density. This is no longer so in Regions 4 and 5 and different mathematical methods are required.

Despite significant differences the proof of Theorem 1.1 has some basic strategies in common with the (considerably simpler) proof of the Gross-Pitaevskii limit Theorem in [27] (see also [31] and [26]). The upper bound for the energy in Regions 1–3, given
in Subsect. 4.1, is the simplest estimate and can be obtained by a method analogous to that of [27]. For the lower bound, and also for the upper bound in Regions 4–5, one considers first finite numbers of particles in boxes with Neumann or Dirichlet boundary conditions, and subsequently puts these boxes together to treat inhomogeneous external potentials and the infinite particle number limit. For the lower bound of the energy in the boxes Dyson’s Lemma [9, 31, 26], which converts a “hard” potential into a “soft” potential at the expense of sacrificing kinetic energy, is an essential tool. The main differences compared to [27] are on the one hand due to the fact that in Regions 1–3 the lower bounds in [27] are not valid because they are not uniform in the shape of the trap, and on the other hand due to the correlations in the Lieb-Liniger wave function for the longitudinal motion in Regions 4–5. For the latter reason even the proof of the upper bound in Regions 4–5 is considerably more involved than for Regions 1–3.

Section 3 contains the main technical estimates for the boxes. We consider here the 3D Hamiltonian for a finite number of particles on a finite interval in the longitudinal direction with Neumann or Dirichlet boundary conditions and a confining potential $V_r^\perp$ in the transverse directions. We estimate its energy from above and below in terms of the energy of a 1D Hamiltonian with delta interactions. In Sect. 4 we apply these results to prove Theorem 1.1. In the last Sect. 5 we consider the question of Bose-Einstein condensation and prove that it holds in Regions 1 and 2.

Part of the results of this work were announced in [29].

2. The Five Parameter Regions

We shall now discuss the simplifications of Theorem 1.1 in the five different parameter regions.
Besides the ground state energy we also consider the convergence of the quantum mechanical density $\rho^{QM}_{N,L,r,a}(\mathbf{x})$ averaged over the transverse variables, i.e., of

$$\hat\rho^{QM}_{N,L,r,a}(z) := \int \rho^{QM}_{N,L,r,a}(\mathbf{x}^\perp, z)\, d^2\mathbf{x}^\perp. \tag{2.1}$$

We state the results in the form of five theorems, one for each region. These theorems will be proved in Sect. 4. Together they imply Theorem 1.1 because the energy functionals involved are all limiting cases of the general functional (1.7). The methods for proving the latter statement are fairly standard (cf., e.g., [27, 28] for similar computations). Since a proof of all five limit theorems for the functional (1.7) would be largely repetitious, we shall limit ourselves to giving a proof of two of them in Subsect. 2.6, as an example.

2.1. The Ideal Gas Region. This region corresponds to the trivial case where the interaction is so weak that it effectively vanishes in the large $N$ limit and everything collapses to the ground state of $-d^2/dz^2 + V_L(z)$. By scaling, the ground state energy and wave function of this latter operator can be written as $L^{-2} e^\parallel$ and $L^{-1/2}\sqrt{\rho^\parallel(z/L)}$, where $e^\parallel$ and $\sqrt{\rho^\parallel(z)}$ are the corresponding quantities for $-d^2/dz^2 + V(z)$.

Theorem 2.1 (Ideal gas limit). Suppose $r/L \to 0$ and $NgL \sim NaL/r^2 \to 0$ as $N \to \infty$. Then

$$(N/L^2)^{-1} \left( E^{QM}(N, L, r, a) - N e^\perp/r^2 \right) \to e^\parallel \tag{2.2}$$

and

$$(N/L)^{-1} \hat\rho^{QM}_{N,L,r,a}(Lz) \to \rho^\parallel(z) \tag{2.3}$$
weakly in $L^1(\mathbb{R})$.

2.2. The 1D Gross-Pitaevskii Region. This region is described by the 1D GP density functional

$$\mathcal{E}^{GP}_{L,g}[\rho] = \int_{\mathbb{R}} \left( |\nabla\sqrt{\rho}(z)|^2 + V_L(z)\rho(z) + \tfrac12 g\rho(z)^2 \right) dz \tag{2.4}$$

corresponding to the high density approximation (1.5) of the interaction energy in (1.7). Its ground state energy

$$E^{GP}(N, L, g) = \inf\left\{ \mathcal{E}^{GP}_{L,g}[\rho] : \rho \ge 0,\ \int_{\mathbb{R}} \rho(z)\,dz = N \right\} \tag{2.5}$$

has the scaling property

$$E^{GP}(N, L, g) = (N/L^2)\, E^{GP}(1, 1, NgL) \tag{2.6}$$

and likewise, the minimizer $\rho^{GP}_{N,L,g}(z)$ satisfies

$$\rho^{GP}_{N,L,g}(Lz) = (N/L)\, \rho^{GP}_{1,1,NgL}(z). \tag{2.7}$$

Theorem 2.2 (1D GP limit). Suppose $r/L \to 0$ and $NgL \sim NaL/r^2$ is fixed as $N \to \infty$. Then

$$(N/L^2)^{-1} \left( E^{QM}(N, L, r, a) - N e^\perp/r^2 \right) \to E^{GP}(1, 1, NgL) \tag{2.8}$$

and

$$(N/L)^{-1} \hat\rho^{QM}_{N,L,r,a}(Lz) \to \rho^{GP}_{1,1,NgL}(z) \tag{2.9}$$

weakly in $L^1(\mathbb{R})$.

Remark. If $r/L \to 0$ and $NgL$ stays bounded, as in Regions 1 and 2, condition (1.12) is automatically satisfied, because $g\bar\rho r^2 \sim aN/L \sim (r/L)^2 (NgL)$. Likewise, $a/r \to 0$, because $a/r \sim (r/L) N^{-1} (NgL)$.

2.3. The 1D “Thomas-Fermi” Region. This region is a limiting case of the previous one in the sense that $NgL \sim NaL/r^2 \to \infty$, but $a/r$ is sufficiently small so that $g/\bar\rho \sim (aL/Nr^2)(NaL/r^2)^{1/(s+1)} \to 0$, i.e., the high density approximation in (1.5) is still valid. Here $s$ is the degree of homogeneity of $V$ and the explanation of the factor $(NaL/r^2)^{1/(s+1)} \sim (NgL)^{1/(s+1)}$ is as follows: The linear extension $\bar L$ of the minimizing density $\rho^{GP}_{N,L,g}$ is for large values of $NgL$ determined by $V_L(\bar L) \sim g(N/\bar L)$, which gives $\bar L \sim (NgL)^{1/(s+1)} L$. In addition condition (1.12) requires $g\bar\rho \ll r^{-2}$, which means that $(Na/L)(NaL/r^2)^{-1/(s+1)} \to 0$.

If $NgL \sim NaL/r^2 \to \infty$ the gradient term in the functional (2.4) becomes negligible compared to the other terms. In fact, by a simple scaling,
$$\mathcal{E}^{GP}_{L,g}[\rho] = \frac{N}{L^2}(NgL)^{s/(s+1)} \int_{\mathbb{R}} \left( (NgL)^{-(s+2)/(s+1)} |\nabla\sqrt{\tilde\rho}(z)|^2 + V(z)\tilde\rho(z) + \tfrac12 \tilde\rho(z)^2 \right) dz, \tag{2.10}$$

where the scaled density $\tilde\rho$ is determined by

$$\rho(z) = (N/\bar L_{TF})\, \tilde\rho(z/\bar L_{TF}), \quad \text{with } \bar L_{TF} := (NgL)^{1/(s+1)} L. \tag{2.11}$$

This leads to the functional

$$\mathcal{E}^{TF}_{L,g}[\rho] = \int_{\mathbb{R}} \left( V_L(z)\rho(z) + \tfrac12 g\rho(z)^2 \right) dz \tag{2.12}$$

whose ground state energy

$$E^{TF}(N, L, g) = \inf\left\{ \mathcal{E}^{TF}_{L,g}[\rho] : \rho \ge 0,\ \int_{\mathbb{R}} \rho(z)\,dz = N \right\} \tag{2.13}$$

has the scaling property

$$E^{TF}(N, L, g) = (N/L^2)(NgL)^{s/(s+1)} E^{TF}(1, 1, 1). \tag{2.14}$$

The minimizer $\rho^{TF}_{N,L,g}$ satisfies

$$\rho^{TF}_{N,L,g}(\bar L_{TF} z) = (N/\bar L_{TF})\, \rho^{TF}_{1,1,1}(z) \tag{2.15}$$

and can be computed explicitly:

$$\rho^{TF}_{1,1,1}(z) = [\mu^{TF} - V(z)]_+, \tag{2.16}$$

where $[t]_+ = \max\{t, 0\}$ and $\mu^{TF}$ is determined by the normalization $\int \rho^{TF}_{1,1,1}(z)\,dz = 1$. Because of a formal similarity with the Thomas-Fermi energy functional for fermions, which also has no gradient terms, the functional (2.12) is in the literature usually referred to as a “Thomas-Fermi” functional, but the physics is, of course, quite different.

The limit theorem in this region is

Theorem 2.3 (1D “TF limit”). Suppose, as $N \to \infty$, $r/L \to 0$, $NgL \sim NaL/r^2 \to \infty$, but $g/\bar\rho \sim (a\bar L_{TF}/Nr^2) \to 0$ and $g\bar\rho r^2 \sim Na/\bar L_{TF} \to 0$ with $\bar L_{TF}$ given in (2.11). Then

$$(N/L^2)^{-1}(NgL)^{-s/(s+1)} \left( E^{QM}(N, L, r, a) - N e^\perp/r^2 \right) \to E^{TF}(1, 1, 1) \tag{2.17}$$

and

$$(N/\bar L_{TF})^{-1} \hat\rho^{QM}_{N,L,r,a}(\bar L_{TF} z) \to \rho^{TF}_{1,1,1}(z) \tag{2.18}$$

weakly in $L^1(\mathbb{R})$.
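The explicit minimizer (2.16) is easy to evaluate for the homogeneous potentials $V(z) = |z|^s$ assumed in this paper. In the sketch below the closed form for $\mu^{TF}$ follows from the normalization $\int [\mu - |z|^s]_+\,dz = 2\mu^{(s+1)/s}\, s/(s+1) = 1$; the function names and the midpoint quadrature are illustrative assumptions.

```python
def tf_chemical_potential(s):
    """mu_TF solving 2*s/(s+1) * mu^((s+1)/s) = 1 for V(z) = |z|^s."""
    return ((s + 1.0) / (2.0 * s)) ** (s / (s + 1.0))

def tf_density(z, s):
    """rho^TF_{1,1,1}(z) = [mu_TF - |z|^s]_+ as in (2.16)."""
    return max(tf_chemical_potential(s) - abs(z) ** s, 0.0)

def tf_energy(s, steps=20000):
    """Midpoint evaluation of the functional (2.12) with N = L = g = 1 at the minimizer."""
    mu = tf_chemical_potential(s)
    zmax = mu ** (1.0 / s)  # edge of the support of the minimizer
    h = 2.0 * zmax / steps
    total = 0.0
    for i in range(steps):
        z = -zmax + (i + 0.5) * h
        rho = max(mu - abs(z) ** s, 0.0)
        total += (abs(z) ** s * rho + 0.5 * rho * rho) * h
    return total
```

For the harmonic case $s = 2$ the profile is an inverted parabola with $\mu^{TF} = (3/4)^{2/3}$, and a direct integration gives $E^{TF}(1,1,1) = \tfrac{4}{5}(\mu^{TF})^{5/2} = \tfrac35 \mu^{TF}$.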
2.4. The Lieb-Liniger Region. This region corresponds to the case $g/\bar\rho \sim 1$, so that neither the high density (1.5) nor the low density approximation (1.6) is valid and the full LL energy (1.4) has to be used. The extension $\bar L$ of the system is now determined by $V_L(\bar L) \sim (N/\bar L)^2$ which leads to $\bar L_{LL} = LN^{2/(s+2)}$, in contrast to $\bar L_{TF} = L(NgL)^{1/(s+1)}$ in the TF case. Condition (1.12) means in this region that $Nr/\bar L_{LL} \sim N^{s/(s+2)} r/L \to 0$. Since $Nr/\bar L_{LL} \sim (\bar\rho/g)(a/r)$, this condition is automatically fulfilled if $g/\bar\rho$ is bounded away from zero and $a/r \to 0$. Conversely, if $g/\bar\rho$ is bounded, then (1.12) implies $a/r \to 0$. The energy functional is

$$\mathcal{E}^{LL}_{L,g}[\rho] = \int_{\mathbb{R}} \left( V_L(z)\rho(z) + \rho(z)^3 e(g/\rho(z)) \right) dz \tag{2.19}$$

with corresponding energy

$$E^{LL}(N, L, g) = \inf\left\{ \mathcal{E}^{LL}_{L,g}[\rho] : \rho \ge 0,\ \int_{\mathbb{R}} \rho(z)\,dz = N \right\}. \tag{2.20}$$

Introducing the density parameter

$$\gamma := N/\bar L_{LL} = (N/L) N^{-2/(s+2)} \tag{2.21}$$

we can write the scaling relation of the functional as

$$E^{LL}(N, L, g) = N\gamma^2 E^{LL}(1, 1, g/\gamma). \tag{2.22}$$

The minimizer, $\rho^{LL}_{N,L,g}(z)$, satisfies

$$\rho^{LL}_{N,L,g}(\bar L_{LL} z) = \gamma\, \rho^{LL}_{1,1,g/\gamma}(z). \tag{2.23}$$

Theorem 2.4 (LL limit). Suppose $r/L \to 0$ and $a/r \to 0$, with $g/\gamma > 0$ fixed as $N \to \infty$. Then

$$(N\gamma^2)^{-1} \left( E^{QM}(N, L, r, a) - N e^\perp/r^2 \right) \to E^{LL}(1, 1, g/\gamma) \tag{2.24}$$

and

$$\gamma^{-1} \hat\rho^{QM}_{N,L,r,a}(\bar L_{LL} z) \to \rho^{LL}_{1,1,g/\gamma}(z) \tag{2.25}$$
weakly in $L^1(\mathbb{R})$.

2.5. The Girardeau-Tonks Region. This region corresponds to impenetrable particles, i.e., the limiting case $g/\bar\rho \to \infty$ and hence the formula (1.6) for the energy density. As in Region 4, the mean density is here $\bar\rho \sim \gamma = (N/L)N^{-2/(s+2)}$. The energy functional is

$$\mathcal{E}^{GT}_{L}[\rho] = \int_{\mathbb{R}} \left( V_L(z)\rho(z) + (\pi^2/3)\rho(z)^3 \right) dz \tag{2.26}$$

with energy

$$E^{GT}(N, L) = \inf\left\{ \mathcal{E}^{GT}_{L}[\rho] : \rho \ge 0,\ \int_{\mathbb{R}} \rho(z)\,dz = N \right\}, \tag{2.27}$$

which can be written

$$E^{GT}(N, L) = N\gamma^2 E^{GT}(1, 1). \tag{2.28}$$

The minimizer $\rho^{GT}_{1,1}(z)$ has the form

$$\rho^{GT}_{1,1}(z) = \pi^{-1} [\mu^{GT} - V(z)]_+^{1/2}, \tag{2.29}$$

with $\mu^{GT}$ determined by the normalization. Note that its shape is different from that of (2.16), which makes it possible to distinguish experimentally the TF regime from the GT regime. The scaling relation for the minimizer is

$$\rho^{GT}_{N,L}(\bar L_{LL} z) = \gamma\, \rho^{GT}_{1,1}(z). \tag{2.30}$$

Theorem 2.5 (GT limit). Suppose $r/L \to 0$ and $a/r \to 0$, with $g/\gamma \to \infty$ as $N \to \infty$. Then

$$(N\gamma^2)^{-1} \left( E^{QM}(N, L, r, a) - N e^\perp/r^2 \right) \to E^{GT}(1, 1) \tag{2.31}$$

and

$$\gamma^{-1} \hat\rho^{QM}_{N,L,r,a}(\bar L_{LL} z) \to \rho^{GT}_{1,1}(z) \tag{2.32}$$

weakly in $L^1(\mathbb{R})$.
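The Girardeau-Tonks profile (2.29) can be made explicit for $V(z) = |z|^s$. The sketch below assumes the normalization $\int \rho^{GT}_{1,1} = 1$ and evaluates $\int_0^1 (1-u^s)^{1/2}\,du = s^{-1}\,\Gamma(1/s)\Gamma(3/2)/\Gamma(1/s + 3/2)$ with the standard library gamma function; the function names are hypothetical.

```python
import math

def gt_chemical_potential(s):
    """mu_GT solving (1/pi) * int [mu - |z|^s]_+^(1/2) dz = 1 for V(z) = |z|^s."""
    beta = math.gamma(1.0 / s) * math.gamma(1.5) / math.gamma(1.0 / s + 1.5)
    return (math.pi * s / (2.0 * beta)) ** (2.0 * s / (s + 2.0))

def gt_density(z, s):
    """rho^GT_{1,1}(z) = (1/pi) * [mu_GT - |z|^s]_+^(1/2) as in (2.29)."""
    mu = gt_chemical_potential(s)
    return math.sqrt(max(mu - abs(z) ** s, 0.0)) / math.pi
```

For $s = 2$ this gives $\mu^{GT} = 2$ and a semicircular profile of radius $\sqrt{2}$, in contrast to the inverted parabola $[\mu^{TF} - z^2]_+$ of (2.16).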
2.6. Limiting cases of the general energy functional. As already stated, the proof of Theorem 1.1 in Sect. 4 consists in comparing the ground state energy of the many-body Hamiltonian (1.1) with the ground state energies of the functionals defined in Subsects. 2.1–2.5 in the various parameter domains. To link these special cases to the functional (1.7) it then remains to show that the ground state energy of (1.7) coincides with that of the functionals in Subsects. 2.1–2.5 in the appropriate asymptotic limits. The proof of this follows the same pattern as the derivation of the 3D TF limit from 3D GP theory in [27, 28] and we shall here only give explicit proofs for the functionals in Theorems 2.3 and 2.4 as examples. The limit theorems for the density are derived from the energy convergence by variation of the external potential in a standard way (cf., e.g. [28]).

Proposition 2.1. If $N \to \infty$, $NgL \to \infty$, but $g\bar L_{TF}/N = gL(NgL)^{1/(s+1)}/N \to 0$, then

$$\left( (N/L^2)(NgL)^{s/(s+1)} \right)^{-1} E(N, L, g) \to E^{TF}(1, 1, 1) \tag{2.33}$$

and

$$\left( N/\bar L_{TF} \right)^{-1} \rho_{N,L,g}(\bar L_{TF} z) \to \rho^{TF}_{1,1,1}(z) \tag{2.34}$$

weakly in $L^1(\mathbb{R})$.
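Proof. With $\tilde\rho$ the scaled density given by (2.11) the energy functional (1.7) can be written

$$\mathcal{E}[\rho] = (N/L^2)(NgL)^{s/(s+1)} \int_{\mathbb{R}} \left( (NgL)^{-(s+2)/(s+1)} |\partial\sqrt{\tilde\rho}(z)|^2 + V(z)\tilde\rho(z) + \tilde\rho(z)^3 N(g\bar L_{TF})^{-1} e\big(g\bar L_{TF}/N\tilde\rho(z)\big) \right) dz. \tag{2.35}$$

Now $t\,e(1/t) \le \tfrac12$ for all $t$ [22], so

$$\left( (N/L^2)(NgL)^{s/(s+1)} \right)^{-1} \mathcal{E}[\rho] \le \int_{\mathbb{R}} \left( (NgL)^{-(s+2)/(s+1)} |\partial\sqrt{\tilde\rho}(z)|^2 + V(z)\tilde\rho(z) + \tfrac12 \tilde\rho(z)^2 \right) dz. \tag{2.36}$$

Let

$$j_\varepsilon(z) = (2\varepsilon)^{-1} \exp(-|z|/\varepsilon). \tag{2.37}$$

Define $\tilde\rho = \rho^{TF}_{1,1,1} * j_\varepsilon$. Then, since $|\partial j_\varepsilon| = \varepsilon^{-1} j_\varepsilon$ and $\int \tilde\rho(z)\,dz = 1$,

$$\int |\partial\sqrt{\tilde\rho}(z)|^2\,dz \le 1/(4\varepsilon^2) < \infty, \tag{2.38}$$

and (2.36) implies

$$\limsup_{N\to\infty} \left( (N/L^2)(NgL)^{s/(s+1)} \right)^{-1} E(N, L, g) \le \mathcal{E}^{TF}_{1,1}[\rho^{TF}_{1,1,1} * j_\varepsilon] \tag{2.39}$$

in the limit considered. Moreover, $\int (\rho^{TF}_{1,1,1} * j_\varepsilon)^2 \le \int (\rho^{TF}_{1,1,1})^2$ because $\int j_\varepsilon = 1$, and

$$\int \rho^{TF}_{1,1,1} * j_\varepsilon(z)\,|z|^s\,dz \to \int \rho^{TF}_{1,1,1}(z)\,|z|^s\,dz \tag{2.40}$$

for $\varepsilon \to 0$. (Note that $\rho^{TF}_{1,1,1} = [\mu^{TF} - |z|^s]_+$ is continuous and of compact support.) Hence

$$\limsup_{N\to\infty} \left( (N/L^2)(NgL)^{s/(s+1)} \right)^{-1} E(N, L, g) \le E^{TF}(1, 1, 1). \tag{2.41}$$

On the other hand, dropping the positive gradient term in (2.35) gives

$$\left( (N/L^2)(NgL)^{s/(s+1)} \right)^{-1} \mathcal{E}[\rho] \ge \int_{\mathbb{R}} \left( V(z)\tilde\rho(z) + \tilde\rho(z)^3 M e\big(1/M\tilde\rho(z)\big) \right) dz, \tag{2.42}$$

with $M = N/(g\bar L_{TF})$. Note that $M \to \infty$ in the limit considered here. The functional on the right side of (2.42) has as minimizer

$$\rho^{(M)}(z) = [f_M^{-1}(\mu^{(M)} - V(z))]_+, \tag{2.43}$$

where $f_M^{-1}$ is the inverse of the function $f_M(t) = d/dt\,[Mt^3 e(1/tM)]$ and $\mu^{(M)}$ is chosen so that $\int \rho^{(M)} = 1$. Note also that $t^{-1} f_M(t) \to 1$ as $M \to \infty$, uniformly on $[\delta, \infty[$ for every $\delta > 0$. From this it follows easily that $\rho^{(M)}$ converges uniformly to $\rho^{TF}_{1,1,1}$, given by (2.16), as $M \to \infty$. With $\rho = \rho_{N,L,g}$, the minimizer of $\mathcal{E}$, we thus obtain from (2.42),

$$\liminf_{N\to\infty} \left( (N/L^2)(NgL)^{s/(s+1)} \right)^{-1} E(N, L, g) \ge E^{TF}(1, 1, 1). \tag{2.44}$$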
To prove the corresponding result (2.34) for the density we pick a $C^\infty$ function $Y$ of compact support together with an $\varepsilon > 0$ and replace $V_L(z)$ by

$$V_L(z) + \frac{\varepsilon}{L^2}(NgL)^{s/(s+1)}\, Y\big(L^{-1}(NgL)^{-1/(s+1)} z\big) = \frac{1}{L^2}(NgL)^{s/(s+1)} \left( V(z') + \varepsilon Y(z') \right) \tag{2.45}$$

with $z' = z/\bar L_{TF} = L^{-1}(NgL)^{-1/(s+1)} z$. While $V(z') + \varepsilon Y(z')$ is not strictly homogeneous of order $s$, it is asymptotically homogeneous in the sense of Def. 1.1 in [28] and as in the proof of Lemma 2.3 in [28] this is sufficient for (2.33), now with the modified external potential (2.45). Since both (1.8) and the TF energy are concave in $\varepsilon$, the derivative with respect to $\varepsilon$ can be exchanged with the limits $N \to \infty$, $(NgL) \to \infty$, giving (2.34) in the sense of distributions. Since the densities have norm 1, the convergence holds also weakly in $L^1(\mathbb{R})$.

Proposition 2.2. If $N \to \infty$ with $g/\gamma$ fixed, where $\gamma = N/\bar L_{LL} = (N/L)N^{-2/(s+2)}$, then

$$(N\gamma^2)^{-1} E(N, L, g) \to E^{LL}(1, 1, g/\gamma) \tag{2.46}$$

and

$$\gamma^{-1} \rho_{N,L,g}(\bar L_{LL} z) \to \rho^{LL}_{1,1,g/\gamma}(z) \tag{2.47}$$

weakly in $L^1(\mathbb{R})$.

Proof. With $\bar L_{LL} = LN^{2/(s+2)}$ we define the scaled density $\tilde\rho$ by

$$\rho(z) = (N/\bar L_{LL})\, \tilde\rho(z/\bar L_{LL}). \tag{2.48}$$

The energy functional (1.7) can then be written

$$\mathcal{E}[\rho] = N\gamma^2 \int_{\mathbb{R}} \left( N^{-2} |\partial\sqrt{\tilde\rho}(z)|^2 + V(z)\tilde\rho(z) + \tilde\rho(z)^3 e\big(g/(\gamma\tilde\rho(z))\big) \right) dz. \tag{2.49}$$

The lower bound

$$(N\gamma^2)^{-1} E(N, L, g) \ge E^{LL}(1, 1, g/\gamma) \tag{2.50}$$

follows simply by dropping the positive gradient term from the right side of (2.49). For the upper bound take $j_\varepsilon$ as in (2.37) and define $\tilde\rho = j_\varepsilon * \rho^{LL}_{1,1,g/\gamma}$ to obtain

$$(N\gamma^2)^{-1} \mathcal{E}[\rho] \le \frac{N^{-2}}{4\varepsilon^2} + \mathcal{E}^{LL}_{1,1,g/\gamma}[\tilde\rho] \tag{2.51}$$

and hence

$$\limsup_{N\to\infty}\, (N\gamma^2)^{-1} E(N, L, g) \le \mathcal{E}^{LL}_{1,1,g/\gamma}[j_\varepsilon * \rho^{LL}_{1,1,g/\gamma}] \tag{2.52}$$

for all $\varepsilon > 0$. The convergence

$$\lim_{\varepsilon\to 0} \mathcal{E}^{LL}_{1,1,g/\gamma}[j_\varepsilon * \rho^{LL}_{1,1,g/\gamma}] = \mathcal{E}^{LL}_{1,1,g/\gamma}[\rho^{LL}_{1,1,g/\gamma}] = E^{LL}(1, 1, g/\gamma) \tag{2.53}$$

follows by continuity of $|z|^s$ and $t^2 e(t)$ and uniform convergence of $j_\varepsilon * \rho^{LL}_{1,1,g/\gamma}$ to $\rho^{LL}_{1,1,g/\gamma}$. The convergence of the densities follows as in the previous proposition by perturbing the external potential, this time replacing $V_L(z)$ by

$$V_L(z) + \varepsilon\gamma^2\, Y\big(L^{-1}N^{-2/(s+2)} z\big) = \gamma^2 \left( V(z') + \varepsilon Y(z') \right) \tag{2.54}$$

with $z' = z/\bar L_{LL} = L^{-1}N^{-2/(s+2)} z$.
2.7. One-dimensional GP as limit of three-dimensional GP. We shall now demonstrate that the ground state energy in Regions 1–3 can be obtained as a limit of the three-dimensional Gross-Pitaevskii energy. The latter is defined by the energy functional

$$\mathcal{E}^{GP}_{3D}[\Phi] = \int_{\mathbb{R}^3} \left( |\nabla\Phi(\mathbf{x})|^2 + \left( V_r^\perp(\mathbf{x}^\perp) + V_L(z) \right) |\Phi(\mathbf{x})|^2 + 4\pi a |\Phi(\mathbf{x})|^4 \right) d^3\mathbf{x}. \tag{2.55}$$

We denote its ground state energy, i.e., the infimum over all $\Phi$ with $\int |\Phi|^2 = N$, by $E^{GP}_{3D}(N, L, r, a)$. It satisfies the scaling relation

$$E^{GP}_{3D}(N, L, r, a) = (N/L^2)\, E^{GP}_{3D}(1, 1, r/L, Na/L). \tag{2.56}$$

Because of (2.56) and (2.6) it is sufficient to compare $E^{GP}_{3D}$ and $E^{GP}$ for $N = 1$ and $L = 1$.

Theorem 2.6. Let $g$ be given by (1.10). In the limit $r \to 0$ and $a \to 0$,

$$\frac{E^{GP}_{3D}(1, 1, r, a) - e^\perp/r^2}{E^{GP}(1, 1, g)} \to 1, \tag{2.57}$$

uniformly in $g$ as long as $r^2 E^{GP}(1, 1, g) \to 0$.

Proof. We denote the minimizer of the one-dimensional GP functional (2.4) with $N = 1$, $L = 1$ and $g$ fixed by $\phi(z)^2$. Taking $b_r(\mathbf{x}^\perp)\phi(z)$ as the trial function for the 3D functional (2.55) and using the definition (1.10) of $g$ we obtain without further ado the upper bound

$$E^{GP}_{3D}(1, 1, r, a) \le e^\perp/r^2 + E^{GP}(1, 1, g) \tag{2.58}$$

for all $r$ and $a$. For a lower bound we consider the one-particle Hamiltonian

$$H_{r,a} = -\Delta^\perp + V_r^\perp(\mathbf{x}^\perp) - \partial_z^2 + V(z) + 8\pi a\, b_r(\mathbf{x}^\perp)^2 \phi(z)^2. \tag{2.59}$$
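Taking the 3D Gross-Pitaevskii minimizer $\Phi(\mathbf{x})$ for $N = 1$, $L = 1$, as the trial function we get

$$\begin{aligned} \inf \operatorname{spec} H_{r,a} &\le E^{GP}_{3D}(1, 1, r, a) - 4\pi a \int |\Phi|^4 + 8\pi a \int b_r^2 \phi^2 |\Phi|^2 \\ &\le E^{GP}_{3D}(1, 1, r, a) + 4\pi a \int b_r^4 \phi^4 \\ &= E^{GP}_{3D}(1, 1, r, a) + \frac{g}{2} \int \phi^4. \end{aligned} \tag{2.60}$$

On the other hand, $\inf \operatorname{spec} H_{r,a}$ can be bounded below by Temple’s inequality [40], which says that for any Hamiltonian $H$ with lowest eigenvalues $E_0 < E_1$ and expectation value $\langle H \rangle < E_1$ in some state,

$$E_0 \ge \langle H \rangle - \frac{\langle (H - \langle H \rangle)^2 \rangle}{E_1 - \langle H \rangle}. \tag{2.61}$$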
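Temple’s inequality (2.61) is easy to test on a toy model. The sketch below is only an illustration with hypothetical helper names: it works in the eigenbasis of a Hamiltonian with prescribed levels, so that $E_0$ and $E_1$ are known, and evaluates the right side of (2.61) for a state given by its amplitudes.

```python
def moments(levels, coeffs):
    """Return <H> and <(H - <H>)^2> for amplitudes `coeffs` in the eigenbasis of H."""
    norm = sum(c * c for c in coeffs)
    mean_h = sum(e * c * c for e, c in zip(levels, coeffs)) / norm
    variance = sum((e - mean_h) ** 2 * c * c for e, c in zip(levels, coeffs)) / norm
    return mean_h, variance

def temple_lower_bound(mean_h, variance, e1):
    """Right side of (2.61); the inequality requires <H> < E1."""
    if not mean_h < e1:
        raise ValueError("Temple's inequality needs <H> < E1")
    return mean_h - variance / (e1 - mean_h)
```

For a two-level system the bound is attained with equality, while for more levels it lies strictly below $E_0$ in general; it is useful when the variance is small compared to the gap $E_1 - \langle H \rangle$, which is the situation exploited in the proof here.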
We apply this to $H = H_{r,a}$ and the state defined by $b_r\phi$. Here

$$\langle H \rangle = \frac{e^\perp}{r^2} + E^{GP}(1, 1, g) + \frac{g}{2} \int \phi^4, \tag{2.62}$$

and since $E_1 \ge \tilde e^\perp/r^2$ with $\tilde e^\perp > e^\perp$, (2.62) is smaller than $E_1$ for $r^2 E^{GP}(1, 1, g)$ small enough. Moreover,

$$(H - \langle H \rangle)(b_r\phi) = (8\pi a\, b_r^2 - g)\,\phi^3 b_r \tag{2.63}$$

and thus

$$\begin{aligned} \langle (H - \langle H \rangle)^2 \rangle &= \int \phi^6 \left( (8\pi a)^2 \int b_r^6 - g^2 \right) \le \int \phi^6\, (8\pi a)^2 \int b_r^4\, \|b_r\|_\infty^2 \\ &\le \text{const.}\ g\|\phi\|_\infty^2\, g \int \phi^4 \le \text{const.}\ E^{GP}(1, 1, g)^2, \end{aligned} \tag{2.64}$$

where we used [28, Lemma 2.1] to bound $g\|\phi\|_\infty^2$ by the GP energy. Combining (2.60), (2.61) and (2.62) we thus get

$$E^{GP}_{3D}(1, 1, r, a) - e^\perp/r^2 \ge E^{GP}(1, 1, g) \left( 1 - \text{const.}\ r^2 E^{GP}(1, 1, g) \right), \tag{2.65}$$

and the proof is complete.

Remark. In combination with Theorem 2.2 this result demonstrates a fortiori that the three-dimensional GP limit theorem in [27] holds even if $r/L \to 0$, provided $NaL/r^2$ stays bounded. A more direct proof of this fact, closer to the lines of [27], is certainly possible, but it requires redoing all estimates keeping track of the dependence on $r/L$.
3. Finite n Bounds

Before we give the proof of our main Theorem 1.1 in Sect. 4, we will explain briefly the strategy, and give some auxiliary results in this section. In particular, we will derive upper and lower bounds on the ground state energy of (1.1) with the external potential $V_L(z)$ replaced by a box with Dirichlet and Neumann boundary conditions, respectively. To obtain bounds on the full Hamiltonian (1.1), space will be divided in the $z$-direction into small boxes of side length $\ell$, and the bounds of this section will be used in every box. The reason for this is twofold: first, this allows us to consider an essentially homogeneous system, without the additional difficulty of the external potential $V_L(z)$, and secondly, by varying $\ell$ the particle number in each box can be controlled. This is necessary, since the bounds we obtain in every box will not be uniform in the particle number. Since the particle number in the boxes will be small (compared to $N$), we denote it by $n$.

In this section, we study the Hamiltonian

$$H = \sum_{j=1}^{n} \left( -\Delta_j + V_r^\perp(\mathbf{x}_j^\perp) \right) + \sum_{1\le i<j\le n} v_a(|\mathbf{x}_i - \mathbf{x}_j|) \tag{3.1}$$

on $L^2((\mathbb{R}^2 \times [0, \ell])^n)$. Let $E_D^{QM}(n, \ell, r, a)$ and $E_N^{QM}(n, \ell, r, a)$ denote its ground state energy with Dirichlet and Neumann boundary conditions, respectively. The following theorem gives upper and lower bounds on the ground state energy of (3.1) in terms of the ground state energy of the 1D Hamiltonian (1.3). Its proof will be given in Subsects. 3.1 and 3.2. A crucial step in the proof of the lower bound will be the use of the “Dyson Lemma” (see [9, 31]), which converts the “hard” interaction potential $v_a$ into a “soft” one. With this new “soft” potential perturbation theory can be done, and rigorous bounds are obtained with the use of Temple’s inequality (2.61).

Theorem 3.1 (Finite n bounds). Let $E_D^{1D}(n, \ell, g)$ and $E_N^{1D}(n, \ell, g)$ denote the ground state energy of (1.3) on $L^2([0, \ell]^n)$, with Dirichlet and Neumann boundary conditions, respectively, and let $g$ be given by (1.10). Then there is a finite number $C > 0$ such that

$$E_N^{QM}(n, \ell, r, a) - \frac{n e^\perp}{r^2} \ge E_N^{1D}(n, \ell, g) \left( 1 - Cn \left( \frac{a}{r} \right)^{1/8} \left[ 1 + \frac{nr}{\ell} \left( \frac{a}{r} \right)^{1/8} \right] \right). \tag{3.2}$$

Moreover,

$$E_D^{QM}(n, \ell, r, a) - \frac{n e^\perp}{r^2} \le E_D^{1D}(n, \ell, g) \left( 1 + C \left[ \left( \frac{na}{r} \right)^2 \left( 1 + \frac{a\ell}{r^2} \right) \right]^{1/3} \right), \tag{3.3}$$

provided the term in square brackets is less than 1.

Let us comment briefly on the error terms in (3.2) and (3.3). As already mentioned, in the proof of Theorem 1.1 we will divide space in the $z$-direction into small boxes of side length $\ell$. The number of particles in each box will be roughly $n \sim N\ell/\bar L$, where $\bar L \equiv N/\bar\rho$ is the extension of the system in the $z$-direction. The $n$-dependence of the error term in (3.2) restricts us essentially to have a finite particle number $n$, i.e., that $n \sim N\ell/\bar L \sim 1$, or $\ell \sim \bar L/N$. But for (3.2) to be useful we need $\ell \gtrsim r$, i.e., $r \lesssim \bar L/N$ or, in other words, $r$ should be of the order of the mean particle spacing, or smaller. For $r \gg \bar L/N$, $r$ is much bigger than the mean particle spacing, and we have to use a different strategy, similar to the one used in the 3D problem [31, 27]. This will be
necessary for the lower bound in Regions 1–3. The result is stated in Theorem 3.2. For its proof it will be necessary to use the box method also in x⊥ -direction, similar to the 3D case considered in [27]. However, one cannot use directly the results from there, one has to be careful to retain uniformity in r/L. Likewise, (3.3) will not be useful for an upper bound in all the Regions 1–5. The ¯ reason is the last term in (3.3), where we want g ∼ g L/N 1, which is only fulfilled in Regions 1–4. For Region 5, we use a different upper bound, given in the following Theorem. The proof of Theorem 3.2 will be given in Subsects. 3.3 and 3.4. Theorem 3.2 (Additional energy bounds). With the same notations as in Thm. 3.1, ⊥ 2g ne n na 1/8 QM −1/14 EN (n, , r, a) − 2 ≥ + 1−C n r 2 r2 √ 1/4 4/39 na r na + 1+ √ + . (3.4) r n Moreover, denoting the range of va by R0 , QM
ED (n, , r, a) −
ne⊥ π 2 n3 (1 + 1/n) (1 + 1/2n) ≤ , r2 3 2 (1 − (n − 1)R0 /)2
(3.5)
provided (n − 1)R0 < . ¯ )1/3 Remark. By definition, R0 ∼ a. Equation (3.4) will be used with ∼ (r 2 L/N 2 1/3 ¯ ¯ and n ∼ N /L. √ In this case we have, in Regions 1–3, 1, √ ng ∼ a(N/r L) ¯ 1/3 1 and r/( n) ∼ 1. na/ ∼ a ρ¯ 1, na/r ∼ a(N/r 2 L) The following four subsections contain the proofs of Theorems 3.1 and 3.2. Throughout, C denotes a constant independent of the parameters, although the value of different C’s may be different.
3.1. Upper bound for Theorem 3.1. In this subsection we are going to prove (3.3). We use the variational principle. Let $\psi$ denote the ground state of (1.3) with Dirichlet boundary conditions, normalized by $\langle\psi|\psi\rangle = 1$, and let $G$ and $F$ be given by

$$G(\mathbf{x}_1, \dots, \mathbf{x}_n) = \psi(z_1, \dots, z_n) \prod_{j=1}^{n} b_r(\mathbf{x}_j^\perp) \tag{3.6}$$

and

$$F(\mathbf{x}_1, \dots, \mathbf{x}_n) = \prod_{i<j} f(|\mathbf{x}_i - \mathbf{x}_j|). \tag{3.7}$$

Here $f$ is a monotone increasing function, with $0 \le f \le 1$ and $f(t) = 1$ for $t \ge R$ for some $R \ge R_0$. For $t \le R$ we shall choose $f(t) = f_0(t)/f_0(R)$, where $f_0$ is the solution to the zero-energy scattering equation for $v_a$ [31, 27]. Note that $f_0(R) = 1 - a/R$ for
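$R \ge R_0$, and $f_0'(t) \le t^{-1} \min\{1, a/t\}$. We use as a trial wave function for (3.1) the function

$$\Psi(\mathbf{x}_1, \dots, \mathbf{x}_n) = G(\mathbf{x}_1, \dots, \mathbf{x}_n)\, F(\mathbf{x}_1, \dots, \mathbf{x}_n). \tag{3.8}$$

Let $\rho_\psi^{(2)}$ denote the two-particle density of $\psi$, normalized by $\int \rho_\psi^{(2)}(z, z')\,dz\,dz' = 1$. Since $F$ is 1 whenever no pair of particles is closer together than a distance $R$, we can estimate the norm of $\Psi$ by

$$\begin{aligned} \langle\Psi|\Psi\rangle &\ge \int G^2 \Big( 1 - \sum_{i<j} \theta(R - |\mathbf{x}_i - \mathbf{x}_j|) \Big) \\ &= 1 - \frac{n(n-1)}{2} \int \rho_\psi^{(2)}(z, z')\, b_r(\mathbf{x}^\perp)^2 b_r(\mathbf{y}^\perp)^2\, \theta(R - |\mathbf{x} - \mathbf{y}|)\, dz\,dz'\,d^2\mathbf{x}^\perp d^2\mathbf{y}^\perp \\ &\ge 1 - \frac{n(n-1)}{2} \int \rho_\psi^{(2)}(z, z')\, b_r(\mathbf{x}^\perp)^2 b_r(\mathbf{y}^\perp)^2\, \theta(R - |\mathbf{x}^\perp - \mathbf{y}^\perp|)\, dz\,dz'\,d^2\mathbf{x}^\perp d^2\mathbf{y}^\perp \\ &= 1 - \frac{n(n-1)}{2} \int b_r(\mathbf{x}^\perp)^2 b_r(\mathbf{y}^\perp)^2\, \theta(R - |\mathbf{x}^\perp - \mathbf{y}^\perp|)\, d^2\mathbf{x}^\perp d^2\mathbf{y}^\perp \\ &\ge 1 - \frac{n(n-1)}{2} \frac{\pi R^2}{r^2} \|b\|_4^4, \end{aligned} \tag{3.9}$$

where we used Young’s inequality [24] in the last step. Using

$$\langle\Psi| -\Delta_j |\Psi\rangle = -\int F^2 G \Delta_j G + \int G^2 |\nabla_j F|^2 \tag{3.10}$$

and the Schrödinger equation $H^{1D}_{n,g}\psi = E_D^{1D}\psi$, we can write the expectation value of (3.1) as

$$\langle\Psi|H|\Psi\rangle = \Big( E_D^{1D} + \frac{n}{r^2} e^\perp \Big) \langle\Psi|\Psi\rangle - g\, \Big\langle\Psi\Big| \sum_{i<j} \delta(z_i - z_j) \Big|\Psi\Big\rangle + \int G^2 \Big( \sum_{j=1}^{n} |\nabla_j F|^2 + \sum_{i<j} v_a(|\mathbf{x}_i - \mathbf{x}_j|)\, |F|^2 \Big). \tag{3.11}$$

Now, since $0 \le f \le 1$ and $f' \ge 0$ by assumption, $F^2 \le f(|\mathbf{x}_i - \mathbf{x}_j|)^2$, and

$$\sum_{j=1}^{n} |\nabla_j F|^2 \le 2 \sum_{i<j} f'(|\mathbf{x}_i - \mathbf{x}_j|)^2 + 4 \sum_{k<i<j} f'(|\mathbf{x}_k - \mathbf{x}_i|)\, f'(|\mathbf{x}_k - \mathbf{x}_j|). \tag{3.12}$$

Consider the first term on the right side of (3.12), together with the last term in (3.11). These terms are bounded above by

$$\begin{aligned} &2 \int G^2 \sum_{i<j} \Big( f'(|\mathbf{x}_i - \mathbf{x}_j|)^2 + \tfrac12 v_a(|\mathbf{x}_i - \mathbf{x}_j|)\, f(|\mathbf{x}_i - \mathbf{x}_j|)^2 \Big) \\ &\quad = n(n-1) \int b_r(\mathbf{x}^\perp)^2 b_r(\mathbf{y}^\perp)^2 \rho_\psi^{(2)}(z, z') \Big( f'(|\mathbf{x} - \mathbf{y}|)^2 + \tfrac12 v_a(|\mathbf{x} - \mathbf{y}|)\, f(|\mathbf{x} - \mathbf{y}|)^2 \Big). \end{aligned} \tag{3.13}$$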
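Let

$$h(z) = \int \Big( f'(|\mathbf{x}|)^2 + \tfrac12 v_a(|\mathbf{x}|)\, f(|\mathbf{x}|)^2 \Big)\, d^2\mathbf{x}^\perp. \tag{3.14}$$

Note that $h(z) = 0$ for $|z| \ge R$, and $\int h(z)\,dz = 4\pi a (1 - a/R)^{-1}$. Using Young’s inequality for the integration over the $\perp$-variables, we get

$$(3.13) \le \frac{n(n-1)}{r^2} \|b\|_4^4 \int_{\mathbb{R}^2} \rho_\psi^{(2)}(z, z')\, h(z - z')\, dz\,dz'. \tag{3.15}$$

Consider now the contribution from the last term in (3.12). We can write

$$\begin{aligned} &4 \int G^2 \sum_{k<i<j} f'(|\mathbf{x}_k - \mathbf{x}_i|)\, f'(|\mathbf{x}_k - \mathbf{x}_j|) \\ &\quad = \frac{2}{3} n(n-1)(n-2) \int f'(|\mathbf{x}_1 - \mathbf{x}_2|)\, f'(|\mathbf{x}_2 - \mathbf{x}_3|)\, b_r(\mathbf{x}_1^\perp)^2 b_r(\mathbf{x}_2^\perp)^2 b_r(\mathbf{x}_3^\perp)^2\, \rho_\psi^{(3)}(z_1, z_2, z_3)\, d^3\mathbf{x}_1 d^3\mathbf{x}_2 d^3\mathbf{x}_3, \end{aligned} \tag{3.16}$$

where $\rho_\psi^{(3)}$ denotes the three-particle density of $\psi$, normalized by 1. Let

$$k(z) = \int f'(|\mathbf{x}|)\, d^2\mathbf{x}^\perp, \tag{3.17}$$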
which is supported in [−R, R]. For the integration over x1⊥ we use b2∞ b2∞ k(z − z ) ≤ k∞ . (3.18) f (|x1 − x2 |)br (x1⊥ )2 d 2 x1⊥ ≤ 1 2 r2 r2 For the remaining integrations, we proceed as in (3.15) to obtain 2 b2 b4 (2) (3.16) ≤ n(n − 1)(n − 2) 2∞ 2 4 k∞ ρψ (z, z )k(z − z )dzdz . (3.19) 2 3 r r R Now, for any φ ∈ H 1 (R),
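\begin{align*}
\big| |\phi(z)|^2 - |\phi(z')|^2 \big| &= \Big| \int_z^{z'} \frac{d|\phi(z'')|^2}{dz''}\,dz'' \Big| \le 2\|\phi\|_\infty \int_z^{z'} \Big|\frac{d\phi(z'')}{dz''}\Big|\,dz'' \\
&\le 2|z-z'|^{1/2} \Big(\int_{\mathbb{R}} |\phi|^2\Big)^{1/4} \Big(\int_{\mathbb{R}} \Big|\frac{d\phi}{dz}\Big|^2\Big)^{3/4}, \tag{3.20}
\end{align*}
where we used $\|\phi\|_\infty^2 \le \|d\phi/dz\|_2 \|\phi\|_2$. Applying this to $\rho^{(2)}_\psi(z,z')$, considered as a function of $z$ only, and using the fact that the support of $h$ is contained in $[-R, R]$, we therefore get
\begin{align*}
& \Big| \int_{\mathbb{R}^2} \rho^{(2)}_\psi(z,z')\,h(z-z')\,dz\,dz' - \int_{\mathbb{R}} h(z)\,dz \int \rho^{(2)}_\psi(z,z)\,dz \Big| \\
&\quad \le 2R^{1/2} \int_{\mathbb{R}} h(z)\,dz \,\Big( \int_{\mathbb{R}^2} \Big| \partial_z \sqrt{\rho^{(2)}_\psi(z,z')} \Big|^2 dz\,dz' \Big)^{3/4} \\
&\quad \le 2R^{1/2} \int_{\mathbb{R}} h(z)\,dz \,\Big\langle \psi \Big| -\frac{d^2}{dz_1^2} \Big| \psi \Big\rangle^{3/4}, \tag{3.21}
\end{align*}
where we used Schwarz' inequality, the normalization of $\rho^{(2)}_\psi$ and the symmetry of $\psi$. The same argument is used for (3.19) with $h$ replaced by $k$. Now $\int h(z)\,dz = 4\pi a(1 - a/R)^{-1}$, and
\[ \|k\|_\infty \le \frac{2\pi a}{1 - a/R}\,\big(1 + \ln(R/a)\big), \qquad \int_{\mathbb{R}} k(z)\,dz \le \frac{2\pi a R}{1 - a/R}\,\Big(1 - \frac{a}{2R}\Big). \tag{3.22} \]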
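The logarithm in the bound on $\|k\|_\infty$ can be traced to a one-dimensional integral. In polar coordinates, $k(z) = 2\pi\int_{|z|}^{R} f'(t)\,t\,dt$, and the bound $f_0'(t) \le t^{-1}\min\{1, a/t\}$ reduces this to $2\pi\int_0^R \min\{1, a/t\}\,dt$ divided by $f_0(R) = 1 - a/R$. The following sketch, with hypothetical function name and parameter values, checks numerically that this radial integral equals $2\pi a(1 + \ln(R/a))$.

```python
import math

def k_sup_radial_integral(a, R, steps=200_000):
    """2*pi * int_0^R min(1, a/t) dt, evaluated by the midpoint rule."""
    dt = R / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += min(1.0, a / t) * dt
    return 2.0 * math.pi * total

a, R = 0.5, 4.0
numeric = k_sup_radial_integral(a, R)
closed_form = 2.0 * math.pi * a * (1.0 + math.log(R / a))
```

Dividing either value by $1 - a/R$ gives the right side of the first bound in (3.22).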
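Therefore
\[ (3.15) + (3.19) \le \frac{n(n-1)}{2}\, g\, \frac{1+K}{1 - a/R} \Big( \int \rho^{(2)}_\psi(z,z)\,dz + 2R^{1/2} \Big\langle \psi \Big| -\frac{d^2}{dz_1^2} \Big| \psi \Big\rangle^{3/4} \Big), \tag{3.23} \]
where we denoted
\[ K = \frac{2\pi}{3}\,(n-2)\,\frac{a}{R}\,\frac{1 + \ln(R/a)}{1 - a/R}\,\Big(\frac{R}{r}\Big)^2 \|b\|_\infty^2. \tag{3.24} \]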
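It remains to bound the second term in (3.11). We use again the fact that $F$ is equal to 1 as long as the particles are not within a distance $R$. We obtain
\begin{align*}
\Big\langle \Psi \Big| \sum_{i<j} \delta(z_i - z_j) \Big| \Psi \Big\rangle
&\ge \frac{n(n-1)}{2} \int \rho^{(2)}_\psi(z,z)\,dz \int b_r(x^\perp)^2 b_r(y^\perp)^2 \big(1 - \theta(R - |x^\perp - y^\perp|)\big) \\
&\ge \frac{n(n-1)}{2} \int \rho^{(2)}_\psi(z,z)\,dz\, \Big(1 - \frac{\pi R^2}{r^2}\,\|b\|_4^4\Big). \tag{3.25}
\end{align*}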
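Putting together the bounds (3.9), (3.23) and (3.25), and using the fact that $g\,\tfrac12 n(n-1) \int \rho^{(2)}_\psi(z,z)\,dz \le E_D^{1D}$ and $\langle\psi| -d^2/dz_1^2 |\psi\rangle \le E_D^{1D}/n$, we obtain the upper bound
\begin{align*}
\frac{\langle\Psi|H|\Psi\rangle}{\langle\Psi|\Psi\rangle} - \frac{n e^\perp}{r^2} \le E_D^{1D}(n,\ell,g) \times \Bigg( 1 &+ \Big[ 1 - \frac{n(n-1)}{2}\,\frac{\pi R^2}{r^2}\,\|b\|_4^4 \Big]^{-1} \\
&\times \Big( \frac{a/R + K}{1 - a/R} + \frac{\pi R^2}{r^2}\,\|b\|_4^4 + g(n-1)R^{1/2} \Big(\frac{n}{E_D^{1D}}\Big)^{1/4} \frac{1+K}{1 - a/R} \Big) \Bigg), \tag{3.26}
\end{align*}
provided the term in square brackets is positive. We now use $E_D^{1D}/n \ge (\pi/\ell)^2$, and choose
\[ R^3 = \frac{a r^2}{n^2 (1 + g\ell)}. \tag{3.27} \]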
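To see why this choice of $R$ balances the error terms, one can evaluate two of them explicitly. With $R$ as in (3.27), the ratio $a/R$ equals $(na/r)^{2/3}(1+g\ell)^{1/3}$, while $n^2R^2/r^2$ equals $(na/r)^{2/3}(1+g\ell)^{-2/3}$ and is therefore no larger. The sketch below checks this for sample values; the numbers are illustrative only, and $g$ is treated as an independent input rather than computed from $a$ and $r$.

```python
def radius_from_3_27(a, r, n, g, ell):
    """R defined by R^3 = a r^2 / (n^2 (1 + g*ell))."""
    return (a * r**2 / (n**2 * (1.0 + g * ell))) ** (1.0 / 3.0)

a, r, n, g, ell = 1e-3, 1.0, 10, 0.05, 20.0   # illustrative values
R = radius_from_3_27(a, r, n, g, ell)

ratio_a_over_R = a / R
target = (n * a / r) ** (2.0 / 3.0) * (1.0 + g * ell) ** (1.0 / 3.0)
area_term = n**2 * R**2 / r**2
```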
This gives (under the assumption $(na/r)^2(1 + g\ell) \le 1$)
\[ E_D^{\rm QM}(n,\ell,r,a) - \frac{n e^\perp}{r^2} \le E_D^{1D}(n,\ell,g) \Big( 1 + C \Big(\frac{na}{r}\Big)^{2/3} (1 + g\ell)^{1/3} \Big) \tag{3.28} \]
for some constant $C > 0$.

3.2. Lower bound for Theorem 3.1. We are now going to prove (3.2). We write a general wave function $\Psi$ as
\[ \Psi(x_1,\dots,x_n) = f(x_1,\dots,x_n) \prod_{k=1}^n b_r(x_k^\perp), \tag{3.29} \]
which can always be done, since $b_r$ is a strictly positive function. Partial integration gives
\[ \langle\Psi|H|\Psi\rangle = \frac{n e^\perp}{r^2} + \sum_{i=1}^n \int \Big( |\nabla_i f|^2 + \tfrac12 \sum_{j,\, j\ne i} v_a(|x_i - x_j|)\,|f|^2 \Big) \prod_{k=1}^n b_r(x_k^\perp)^2\, d^3x_k. \tag{3.30} \]
Choose some $R > R_0$, fix $i$ and $x_j$, $j \ne i$, and consider the Voronoi cell $\Omega_j$ around particle $j$, i.e.,
\[ \Omega_j = \{ x : |x - x_j| \le |x - x_k| \text{ for all } k \ne j \}. \tag{3.31} \]
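The Voronoi cell condition in (3.31) is a plain nearest-neighbor test, which the following minimal sketch makes explicit. The function name and the sample particle positions are hypothetical and serve only as an illustration.

```python
import math

def in_voronoi_cell(x, centers, j):
    """True if x is at least as close to centers[j] as to every other center."""
    dj = math.dist(x, centers[j])
    return all(dj <= math.dist(x, c) for k, c in enumerate(centers) if k != j)

# Three sample particle positions in R^3 (illustrative values).
centers = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
near_first = in_voronoi_cell((0.5, 0.0, 0.0), centers, 0)
near_second = in_voronoi_cell((1.5, 0.0, 0.0), centers, 1)
not_first = in_voronoi_cell((1.5, 0.0, 0.0), centers, 0)
```

Points on a cell boundary belong to both adjacent cells, matching the non-strict inequality in the definition.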
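Denote by $B_j$ the ball of radius $R$ around $x_j$. We can estimate
\begin{align*}
& \int_{\Omega_j \cap B_j} b_r(x_i^\perp)^2 \Big( |\nabla_i f|^2 + \tfrac12 v_a(|x_i - x_j|)\,|f|^2 \Big)\, d^3x_i \\
&\quad \ge \min_{x \in B_j} b_r(x^\perp)^2\; a \int_{\Omega_j \cap B_j} U(|x_i - x_j|)\,|f|^2 \\
&\quad \ge \frac{\min_{x \in B_j} b_r(x^\perp)^2}{\max_{x \in B_j} b_r(x^\perp)^2}\; a \int_{\Omega_j \cap B_j} b_r(x_i^\perp)^2\, U(|x_i - x_j|)\,|f|^2, \tag{3.32}
\end{align*}
where we used Lemma 1 of [31], and
\[ U(r) = \begin{cases} 3(R^3 - R_0^3)^{-1} & \text{for } R_0 \le r \le R, \\ 0 & \text{otherwise.} \end{cases} \tag{3.33} \]
For some $\delta > 0$ define $B_\delta \subset \mathbb{R}^2$ by
\[ B_\delta = \{ x^\perp \in \mathbb{R}^2 : b(x^\perp)^2 \ge \delta \}. \tag{3.34} \]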
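The prefactor $3(R^3 - R_0^3)^{-1}$ in (3.33) is such that $\int U(|x|)\,d^3x = 4\pi$, independently of $R$ and $R_0$. A short numerical sketch of this normalization follows; the function names and sample radii are hypothetical.

```python
import math

def U(t, R0, R):
    """Step potential of (3.33): constant on the shell R0 <= t <= R, zero elsewhere."""
    return 3.0 / (R**3 - R0**3) if R0 <= t <= R else 0.0

def integral_U(R0, R, steps=100_000):
    """4*pi * int_{R0}^{R} U(t) t^2 dt by the midpoint rule."""
    dt = (R - R0) / steps
    total = 0.0
    for i in range(steps):
        t = R0 + (i + 0.5) * dt
        total += U(t, R0, R) * t**2 * dt
    return 4.0 * math.pi * total

value = integral_U(0.2, 1.0)
```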
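Estimating $\max_{x \in B_j} b_r(x^\perp)^2 \le \min_{x \in B_j} b_r(x^\perp)^2 + 2(R/r^3)\|\nabla b^2\|_\infty$, we obtain
\[ \frac{\min_{x \in B_j} b_r(x^\perp)^2}{\max_{x \in B_j} b_r(x^\perp)^2} \ge \chi_{B_\delta}(x_j^\perp / r) \Big( 1 - 2\,\frac{R}{r}\,\frac{\|\nabla b^2\|_\infty}{\delta} \Big). \tag{3.35} \]
(For a proof that $\nabla b^2$ is a bounded function, see the proof of Lemma 1 in the Appendix.) Here $\chi_{B_\delta}$ denotes the characteristic function of $B_\delta$. Denoting $k(i)$ the nearest neighbor to particle $i$, we conclude that, for $0 \le \varepsilon \le 1$,
\begin{align*}
& \sum_{i=1}^n \int \Big( |\nabla_i f|^2 + \tfrac12 \sum_{j,\, j \ne i} v_a(|x_i - x_j|)\,|f|^2 \Big) \prod_{k=1}^n b_r(x_k^\perp)^2\, d^3x_k \\
&\quad \ge \sum_{i=1}^n \int \Big( \varepsilon |\nabla_i f|^2 + (1 - \varepsilon) |\nabla_i f|^2\, \chi_{\min_k |z_i - z_k| \ge R}(z_i) \\
&\qquad\qquad + a'\, U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r)\, |f|^2 \Big) \prod_{k=1}^n b_r(x_k^\perp)^2\, d^3x_k, \tag{3.36}
\end{align*}
where $a' = a(1 - \varepsilon)(1 - 2R\|\nabla b^2\|_\infty / r\delta)$. Define $F(z_1,\dots,z_n) \ge 0$ by
\[ |F(z_1,\dots,z_n)|^2 = \int |f(x_1,\dots,x_n)|^2 \prod_{k=1}^n b_r(x_k^\perp)^2\, d^2x_k^\perp. \tag{3.37} \]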
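Neglecting the kinetic energy in the $\perp$-direction in the second term in (3.36) and using the Schwarz inequality to bound the longitudinal kinetic energy of $f$ by the one of $F$, we get the estimate
\begin{align*}
\langle\Psi|H|\Psi\rangle - \frac{n e^\perp}{r^2} &\ge \sum_{i=1}^n \int \Big( \varepsilon |\partial_i F|^2 + (1 - \varepsilon) |\partial_i F|^2\, \chi_{\min_k |z_i - z_k| \ge R}(z_i) \Big) \prod_{k=1}^n dz_k \\
&\quad + \sum_{i=1}^n \int \Big( \varepsilon |\nabla_i^\perp f|^2 + a'\, U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r)\, |f|^2 \Big) \prod_{k=1}^n b_r(x_k^\perp)^2\, d^3x_k, \tag{3.38}
\end{align*}
where $\partial_j = d/dz_j$, and $\nabla^\perp$ denotes the gradient in the $\perp$-direction. We now investigate the last term in (3.38). Consider, for fixed $z_1,\dots,z_n$, the expression
\[ \sum_{i=1}^n \int \Big( \varepsilon |\nabla_i^\perp f|^2 + a'\, U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r)\, |f|^2 \Big) \prod_{k=1}^n b_r(x_k^\perp)^2\, d^2x_k^\perp. \tag{3.39} \]
To estimate this term from below, we use Temple's inequality, as in [31]. Let $\tilde e^\perp$ denote the gap above zero in the spectrum of $-\Delta^\perp + V^\perp - e^\perp$, i.e., the lowest non-zero eigenvalue. By scaling, $\tilde e^\perp / r^2$ is the gap in the spectrum of $-\Delta^\perp + V_r^\perp - e^\perp / r^2$. Note that under the transformation $\phi \mapsto b_r^{-1}\phi$ this latter operator is unitarily equivalent to $\nabla^{\perp *} \cdot \nabla^\perp$ as an operator on $L^2(\mathbb{R}^2, b_r(x^\perp)^2 d^2x^\perp)$, as considered in (3.39). Hence also this operator has $\tilde e^\perp / r^2$ as its energy gap. Denote
\[ \langle U^k \rangle = \int \Big( \sum_{i=1}^n U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r) \Big)^k \prod_{k=1}^n b_r(x_k^\perp)^2\, d^2x_k^\perp. \tag{3.40} \]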
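Temple's inequality (2.61) implies (under the assumption that the denominator in the last term is positive)
\[ (3.39) \ge |F|^2\, a' \langle U \rangle \Big( 1 - a'\, \frac{\langle U^2 \rangle}{\langle U \rangle}\, \frac{1}{\varepsilon \tilde e^\perp / r^2 - a' \langle U \rangle} \Big). \tag{3.41} \]
Now, using (3.33) and Schwarz' inequality, $\langle U^2 \rangle \le 3n(R^3 - R_0^3)^{-1} \langle U \rangle$, and
\begin{align*}
\langle U \rangle &\le n(n-1) \int U(|x - y|)\, b_r(x^\perp)^2 b_r(y^\perp)^2\, d^2x^\perp d^2y^\perp \\
&\le n(n-1)\, \frac{\|b\|_4^4}{r^2} \int U(|x|)\, d^2x^\perp \le n(n-1)\, \frac{\|b\|_4^4}{r^2}\, \frac{3\pi R^2}{R^3 - R_0^3}. \tag{3.42}
\end{align*}
Using this and $a' \le a$ in the error term, we obtain
\[ (3.41) \ge |F|^2\, a'' \langle U \rangle, \tag{3.43} \]
where
\[ a'' = a' \Big( 1 - \frac{3n}{\varepsilon \tilde e^\perp}\, \frac{a r^2}{R^3}\, \frac{1}{1 - (R_0/R)^3} \Big[ 1 - \frac{n^2}{\varepsilon \tilde e^\perp}\, \frac{a}{R}\, 3\pi \|b\|_4^4\, \frac{1}{1 - (R_0/R)^3} \Big]^{-1} \Big), \tag{3.44} \]
with the understanding that the term in square brackets is positive. Now let
\[ d(z - z') = \int_{\mathbb{R}^4} b_r(x^\perp)^2 b_r(y^\perp)^2\, U(|x - y|)\, \chi_{B_\delta}(y^\perp / r)\, d^2x^\perp d^2y^\perp. \tag{3.45} \]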
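Note that $d(z) = 0$ if $|z| \ge R$. We estimate $\langle U \rangle$ from below by
\begin{align*}
\langle U \rangle &\ge \sum_{i \ne j} \int U(|x_i - x_j|)\, \chi_{B_\delta}(x_j^\perp / r) \prod_{k,\, k \ne i,j} \theta(|x_k - x_i| - R) \prod_{l=1}^n b_r(x_l^\perp)^2\, d^2x_l^\perp \\
&\ge \sum_{i \ne j} \int U(|x_i - x_j|)\, \chi_{B_\delta}(x_j^\perp / r) \Big( 1 - \sum_{k,\, k \ne i,j} \theta(R - |x_k - x_i|) \Big) \prod_{l=1}^n b_r(x_l^\perp)^2\, d^2x_l^\perp \\
&\ge \sum_{i \ne j} d(z_i - z_j) \Big( 1 - (n-2)\, \frac{\pi R^2}{r^2}\, \|b\|_\infty^2 \Big). \tag{3.46}
\end{align*}
Let
\[ a''' = a'' \Big( 1 - (n-2)\, \frac{\pi R^2}{r^2}\, \|b\|_\infty^2 \Big), \tag{3.47} \]
and denote $g' = 2a''' \int_{\mathbb{R}} d(z)\,dz$. Note that since $|b(x^\perp)^2 - b(y^\perp)^2| \le R\|\nabla b^2\|_\infty$ for $|x^\perp - y^\perp| \le R$,
\begin{align*}
\int_{\mathbb{R}} d(z)\,dz &\ge \frac{4\pi}{r^2} \Big( \int_{B_\delta} b(x^\perp)^4\, d^2x^\perp - \frac{R}{r}\,\|\nabla b^2\|_\infty \Big) \\
&\ge \frac{4\pi}{r^2} \Big( \|b\|_4^4 - \delta - \frac{R}{r}\,\|\nabla b^2\|_\infty \Big). \tag{3.48}
\end{align*}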
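We write
\begin{align*}
& \sum_{i=1}^n \int \Big( \varepsilon |\partial_i F|^2 + a''' \sum_{j,\, j \ne i} d(z_i - z_j)\, |F|^2 \Big) \prod_{k=1}^n dz_k \\
&\quad = \sum_{i \ne j} \int \Big( \frac{\varepsilon}{n-1}\, |\partial_i F|^2 + a'''\, d(z_i - z_j)\, |F|^2 \Big) \prod_{k=1}^n dz_k. \tag{3.49}
\end{align*}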
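Now consider, for fixed $z_j$, $j \ne i$, the expression
\[ \int \Big( \frac{\varepsilon}{n-1}\, |\partial_i F|^2 + a'''\, d(z_i - z_j)\, |F|^2 \Big)\, dz_i. \tag{3.50} \]
We claim that
\[ (3.50) \ge \tfrac12 g' \max_{|z_i - z_j| \le R} |F|^2\, \chi_{[R, \ell - R]}(z_j) \Big( 1 - \Big( \frac{2(n-1)}{\varepsilon}\, g' R \Big)^{1/2} \Big). \tag{3.51} \]
Assume that (3.51) is false. Estimating, for any $H^1([0, \ell])$-function $\phi$,
\begin{align*}
|\phi(z)|^2 - \max_{|z - z_0| \le R} |\phi(z)|^2 &\ge - \Big| \int_z^{z_0} \partial |\phi|^2 \Big| \\
&\ge -2R^{1/2} \max_{|z - z_0| \le R} |\phi(z)| \Big( \int_0^\ell |\partial \phi|^2 \Big)^{1/2}, \tag{3.52}
\end{align*}
and applying this estimate to $F$ (considered only as a function of $z_i$), we obtain, using $\varepsilon \int |\partial_i F|^2 \le \tfrac12 (n-1) g' \max_{|z_i - z_j| \le R} |F|^2$ by assumption,
\begin{align*}
a''' \int d(z_i - z_j)\, |F|^2\, dz_i &\ge a''' \int d(z - z_j)\, dz \max_{|z_i - z_j| \le R} |F|^2 \Big( 1 - \Big( \frac{2}{\varepsilon}\, (n-1) g' R \Big)^{1/2} \Big) \\
&\ge \tfrac12 g'\, \chi_{[R, \ell - R]}(z_j) \max_{|z_i - z_j| \le R} |F|^2 \Big( 1 - \Big( \frac{2}{\varepsilon}\, (n-1) g' R \Big)^{1/2} \Big), \tag{3.53}
\end{align*}
contradicting our assumption. This proves (3.51). Putting everything together, we thus obtain
\begin{align*}
\langle\Psi|H|\Psi\rangle - \frac{n e^\perp}{r^2} &\ge \sum_{i=1}^n \int (1 - \varepsilon) |\partial_i F|^2\, \chi_{\min_k |z_i - z_k| \ge R}(z_i) \prod_{k=1}^n dz_k \\
&\quad + \sum_{i \ne j} \tfrac12 g'' \int \max_{|z_i - z_j| \le R} |F|^2\, \chi_{[R, \ell - R]}(z_j) \prod_{k,\, k \ne i} dz_k, \tag{3.54}
\end{align*}
where
\[ g'' = g' \Big( 1 - \Big( \frac{2}{\varepsilon}\, (n-1) g' R \Big)^{1/2} \Big). \tag{3.55} \]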
Assume that $(n+1)R < \ell$. Given an $F$ with $\int |F|^2\, dz_1 \cdots dz_n = 1$, define, for $0 \le z_1 \le z_2 \le \cdots \le z_n \le \ell - (n+1)R$,
\[ \psi(z_1,\dots,z_n) = F(z_1 + R,\, z_2 + 2R,\, z_3 + 3R,\, \dots,\, z_n + nR), \tag{3.56} \]
and extend the function to all of $[0, \ell - (n+1)R]^n$ by symmetry. A simple calculation shows that, for
\[ H' = (1 - \varepsilon) \sum_{i=1}^n -\partial_i^2 + g'' \sum_{i<j} \delta(z_i - z_j) \tag{3.57} \]
on $L^2([0, \ell - (n+1)R]^n)$,
\[ (3.54) \ge \langle\psi|H'|\psi\rangle \ge (1 - \varepsilon)\, E_N^{1D}(n, \ell - (n+1)R, g'')\, \langle\psi|\psi\rangle \ge (1 - \varepsilon)\, E_N^{1D}(n, \ell, g'')\, \langle\psi|\psi\rangle. \tag{3.58} \]
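Here $\langle\psi|H'|\psi\rangle$ is interpreted in the quadratic form sense, since $\psi$ does not necessarily fulfill Neumann boundary conditions. Since these give the lowest energy for the quadratic form, (3.58) is valid anyway. It remains to estimate $\langle\psi|\psi\rangle$ for the $F$ that is related to the true ground state $\Psi$ by (3.37). We have
\begin{align*}
\langle\psi|\psi\rangle &= \int |F|^2 \prod_{k=1}^n \chi_{[R, \ell - R]}(z_k) \prod_{i<j} \theta(|z_i - z_j| - R) \\
&\ge 1 - \int |F|^2 \Big( \sum_{k=1}^n \big( 1 - \chi_{[R, \ell - R]}(z_k) \big) + \sum_{i<j} \theta(R - |z_i - z_j|)\, \chi_{[R, \ell - R]}(z_j) \Big). \tag{3.59}
\end{align*}
To bound the second term on the right side of (3.59), we use
\begin{align*}
\int |F|^2 \sum_{i<j} \theta(R - |z_i - z_j|)\, \chi_{[R, \ell - R]}(z_j) &\le R \sum_{i \ne j} \int \max_{|z_i - z_j| \le R} |F|^2\, \chi_{[R, \ell - R]}(z_j) \prod_{k,\, k \ne i} dz_k \\
&\le \frac{2R}{g''} \Big( E_N^{\rm QM}(n, \ell, r, a) - \frac{n e^\perp}{r^2} \Big), \tag{3.60}
\end{align*}
where the last inequality follows from (3.54). For the other term, we use the simple fact that, for any function $\phi \in H^1([0, \ell])$, and for $0 \le R \le \ell$,
\[ \int_0^R \phi(z)\, dz - \frac{R}{\ell} \int_0^\ell \phi(z)\, dz = \int_0^\ell \phi'(z)\, f_{R,\ell}(z)\, dz, \tag{3.61} \]
with $f_{R,\ell}(z) = zR/\ell - \min\{z, R\}$. Note that $|f_{R,\ell}(z)| \le R$. Applying this to $F^2$ and using Schwarz' inequality we get the estimate
\begin{align*}
\int |F|^2 \sum_{k=1}^n \big( 1 - \chi_{[R, \ell - R]}(z_k) \big) &\le 2n\, \frac{R}{\ell} + 4nR\, \Big\langle F \Big| \frac{1}{n} \sum_i -\partial_i^2 \Big| F \Big\rangle^{1/2} \\
&\le 2n\, \frac{R}{\ell} + 4nR \Big( \frac{1}{n}\, E_N^{\rm QM}(n, \ell, r, a) - \frac{e^\perp}{r^2} \Big)^{1/2}. \tag{3.62}
\end{align*}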
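The integration-by-parts identity (3.61), read as $\int_0^R \phi - (R/\ell)\int_0^\ell \phi = \int_0^\ell \phi' f_{R,\ell}$ with $f_{R,\ell}(z) = zR/\ell - \min\{z, R\}$, can be checked numerically for a sample function. The sketch below uses a hypothetical test function $\phi(z) = \cos(3z) + z^2$ and illustrative values of $R$ and $\ell$; it also confirms the bound $|f_{R,\ell}| \le R$ on a grid.

```python
import math

def f_R_ell(z, R, ell):
    """Kernel f_{R,ell}(z) = z R / ell - min(z, R); it vanishes at z = 0 and z = ell."""
    return z * R / ell - min(z, R)

def midpoint(func, lo, hi, steps=100_000):
    """Midpoint-rule quadrature of func on [lo, hi]."""
    dz = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * dz) for i in range(steps)) * dz

R, ell = 0.3, 2.0                                        # illustrative values
phi = lambda z: math.cos(3.0 * z) + z**2                 # sample test function
dphi = lambda z: -3.0 * math.sin(3.0 * z) + 2.0 * z      # its derivative

lhs = midpoint(phi, 0.0, R) - (R / ell) * midpoint(phi, 0.0, ell)
rhs = midpoint(lambda z: dphi(z) * f_R_ell(z, R, ell), 0.0, ell)
```

The boundary terms drop out because the kernel vanishes at both endpoints, which is why no boundary condition on $\phi$ is needed.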
Denoting A ≡ EN (n, , r, a) − ne⊥ /r 2 , we thus have QM
A≥
EN1D (n, , g )
√ 2R R 1 − A − 2n − 4 nRA1/2 g
,
(3.63)
which implies A≥
EN1D (n, , g )
! R 1D 1 − 2n − 2R nEN (n, , g ) .
1 + 2REN1D (n, , g )/g
(3.64)
We now use the simple upper bounds
\[ E_N^{1D}(n, \ell, g'') \le \frac{n(n-1)}{2}\, \frac{g''}{\ell} \qquad \text{and} \qquad E_N^{1D}(n, \ell, g'') \le \frac{\pi^2}{3}\, \frac{n^3}{\ell^2}, \tag{3.65} \]
which follow from a constant and a free-fermion trial wave function, respectively. Moreover, by concavity of EN1D in g, EN1D (n, , g ) ≥ EN1D (n, , g)g /g for g ≤ g, and therefore QM
EN (n, , r, a) −
ne⊥ g ≥ EN1D (n, , g) 2 r g
1−
R 2π n(n + 1) + √ n2 . 3
(3.66)
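The second bound in (3.65) can be made concrete under one assumption about the trial function, namely that it is built from the $n$ lowest Neumann modes $\cos(k\pi z/\ell)$, $k = 0, \dots, n-1$ (the text only says "free-fermion trial wave function"). The kinetic energy is then $\sum_{k=0}^{n-1} (\pi k/\ell)^2$, which the sketch below compares with $\pi^2 n^3/(3\ell^2)$. Function names are hypothetical.

```python
import math

def neumann_free_fermion_energy(n, ell):
    """Sum of the n lowest Neumann eigenvalues (pi k / ell)^2, k = 0..n-1 (assumed trial state)."""
    return sum((math.pi * k / ell) ** 2 for k in range(n))

def free_fermion_bound(n, ell):
    """Right side of the second inequality in (3.65)."""
    return math.pi**2 * n**3 / (3.0 * ell**2)

ell = 2.0
checks = [neumann_free_fermion_energy(n, ell) <= free_fermion_bound(n, ell) for n in range(1, 51)]
```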
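We now choose
\[ R = r \Big( \frac{a}{r} \Big)^{1/4}, \qquad \varepsilon = \Big( \frac{a}{r} \Big)^{1/8}, \qquad \delta = \Big( \frac{a}{r} \Big)^{1/8}, \tag{3.67} \]
and obtain
\[ E_N^{\rm QM}(n, \ell, r, a) - \frac{n e^\perp}{r^2} \ge E_N^{1D}(n, \ell, g) \Big( 1 - Cn \Big( \frac{a}{r} \Big)^{1/8} \Big[ 1 + \frac{nr}{\ell} \Big( \frac{a}{r} \Big)^{1/8} \Big] \Big) \tag{3.68} \]
for some constant $C > 0$.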
3.3. Upper bound for the Girardeau-Tonks regime. We use as a trial function
\[ \Psi(x_1,\dots,x_n) = \psi(z_1,\dots,z_n) \prod_{k=1}^n b_r(x_k^\perp), \tag{3.69} \]
where $\psi$ is the ground state of a one-dimensional Bose gas of particles with hard cores of radius $R_0$. It is well known that its energy is the same as that of $n$ non-interacting fermions on a line of reduced length $\ell - (n-1)R_0$. An explicit calculation yields (3.5).
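The fermionic energy referred to here is easy to evaluate. The sketch below assumes Dirichlet boundary conditions on the reduced interval (an assumption made for illustration, in line with the Dirichlet energies used for the upper bounds) and sums the $n$ lowest eigenvalues $(\pi k/\ell')^2$, $k = 1, \dots, n$, with $\ell' = \ell - (n-1)R_0$. The function names are hypothetical.

```python
import math

def dirichlet_free_fermion_energy(n, length):
    """Sum of the n lowest Dirichlet eigenvalues (pi k / length)^2, k = 1..n."""
    return sum((math.pi * k / length) ** 2 for k in range(1, n + 1))

def hard_core_energy(n, ell, R0):
    """Energy of n hard-core bosons of core size R0 on [0, ell], via the free-fermion mapping."""
    reduced = ell - (n - 1) * R0
    if reduced <= 0:
        raise ValueError("hard cores do not fit into the interval")
    return dirichlet_free_fermion_energy(n, reduced)

energy = hard_core_energy(5, 10.0, 0.1)
closed_form = math.pi**2 * 5 * 6 * 11 / (6.0 * (10.0 - 4 * 0.1) ** 2)
```

For $R_0 \to 0$ this reduces to the Girardeau-Tonks energy of impenetrable point bosons.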
3.4. Lower bound for Theorem 3.2. We start by defining an auxiliary 2D GP energy functional by
\[ \phi \mapsto \int_{\mathbb{R}^2} \Big( |\nabla^\perp \phi|^2 + V^\perp |\phi|^2 + p |\phi|^4 \Big)\, d^2x^\perp \tag{3.70} \]
for some parameter $p \ge 0$. The following fact will be needed below.

Lemma 1. For any $p \ge 0$, there exists a unique minimizer (up to a constant phase factor) of (3.70) under the normalization condition $\int |\phi|^2 = 1$, denoted by $\phi_p$, that can be chosen strictly positive. Moreover, both $\phi_p$ and $\nabla^\perp \phi_p$ are bounded uniformly in $p$ and $x^\perp$ for $p$ in a finite interval $[0, P]$.

The proof of this lemma, as well as the proof of Lemmas 2 and 3 below, will be given in the Appendix. The energy corresponding to the minimizer $\phi_p$ will be denoted by $E^{\rm aux}(p)$. Define also $\phi_{p,r}(x^\perp) = r^{-1} \phi_p(x^\perp / r)$. Writing the wave function $\Psi$ as
\[ \Psi(x_1,\dots,x_n) = f(x_1,\dots,x_n) \prod_{k=1}^n \phi_{p,r}(x_k^\perp), \tag{3.71} \]
using partial integration and the variational equation for $\phi_p$, we obtain
\begin{align*}
& \langle\Psi|H|\Psi\rangle - \frac{n}{r^2}\, E^{\rm aux}(p) - \frac{np}{r^2} \int \phi_p^4 \\
&\quad = \sum_{i=1}^n \int \Big( |\nabla_i f|^2 + \tfrac12 \sum_{j,\, j \ne i} v_a(|x_i - x_j|)\, |f|^2 - 2p\, \phi_{p,r}(x_i^\perp)^2\, |f|^2 \Big) \prod_{k=1}^n \phi_{p,r}(x_k^\perp)^2\, d^3x_k. \tag{3.72}
\end{align*}
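We now divide space in $x^\perp$-direction into boxes of side length $s$, labeled by $\alpha$. Let $\phi_{\alpha,\max}$ and $\phi_{\alpha,\min}$ denote the maximal and minimal value of $\phi_{p,r}$ inside box $\alpha$, respectively. We obtain
\[ E_N^{\rm QM}(n, \ell, r, a) - \frac{n}{r^2}\, E^{\rm aux}(p) - \frac{np}{r^2} \int \phi_p^4 \ge \inf_{\{n_\alpha\}} \sum_\alpha \Big( E_\alpha(n_\alpha) - 2 n_\alpha\, p\, \phi_{\alpha,\max}^2 \Big), \tag{3.73} \]
with
\[ E_\alpha(n) = \inf_f \frac{ \sum_{i=1}^n \int_\alpha \Big( |\nabla_i f|^2 + \tfrac12 \sum_{j,\, j \ne i} v_a(|x_i - x_j|)\, |f|^2 \Big) \prod_{k=1}^n \phi_{p,r}(x_k^\perp)^2\, d^3x_k }{ \int_\alpha |f|^2 \prod_{k=1}^n \phi_{p,r}(x_k^\perp)^2\, d^3x_k }. \tag{3.74} \]
In (3.73) the infimum is over all possible distributions of the $n$ particles into the boxes $\alpha$. As a first step we will replace $\phi_{\alpha,\max}^2$ by $\phi_{\alpha,\min}^2$ in the last term in (3.73). The error in doing so is bounded above by
\[ 2np\, \sup_\alpha \big( \phi_{\alpha,\max}^2 - \phi_{\alpha,\min}^2 \big) \le 2\sqrt{2}\, n\, \frac{ps}{r^3}\, \|\nabla^\perp \phi_p^2\|_\infty. \tag{3.75} \]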
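To bound $E_\alpha$ from below, we divide $\alpha$ into even smaller boxes, denoted by $\beta$, with side length $t \le s$, where $s/t \in \mathbb{N}$. Let $c_i$ denote the number of boxes with exactly $i$ particles. Then
\[ E_\alpha(n) \ge \inf_{\{c_i\}} \sum_{i \ge 0} c_i \inf_{\beta \subset \alpha} E_\beta(i). \tag{3.76} \]
Note that the infimum is under the constraints $\sum c_i = (s/t)^2$ and $\sum c_i\, i = n$. For a lower bound on $E_\beta$, we use
\[ E_\beta(n) \ge \Big( \frac{\phi_{\beta,\min}^2}{\phi_{\beta,\max}^2} \Big)^n \inf_f \frac{ \sum_{i=1}^n \int_\beta \Big( |\nabla_i f|^2 + \tfrac12 \sum_{j,\, j \ne i} v_a(|x_i - x_j|)\, |f|^2 \Big) \prod_{k=1}^n d^3x_k }{ \int_\beta |f|^2 \prod_{k=1}^n d^3x_k }. \tag{3.77} \]
Fix some $\delta > 0$, and assume that $\phi_{\alpha,\min}^2 \ge \delta / r^2$. Then
\[ \Big( \frac{\phi_{\beta,\min}^2}{\phi_{\beta,\max}^2} \Big)^n \ge 1 - \sqrt{2}\, n\, \frac{t}{r}\, \frac{\|\nabla^\perp \phi_p^2\|_\infty}{\delta} \tag{3.78} \]
(compare with (3.35)). The rest of (3.77) can be bounded below by the same method as in the previous subsection. The only difference lies in the fact that $b_r$ is replaced by the constant function, and the integrations are only over the box $\beta$. The result is (compare with (3.2))
\[ E_\beta(n) \ge E_N^{1D}(n, \ell, 8\pi a / t^2) \Big( 1 - Cn \Big( \frac{a}{t} \Big)^{1/8} \Big[ 1 + \frac{nt}{\ell} \Big( \frac{a}{t} \Big)^{1/8} \Big] \Big) \times \Big( 1 - \sqrt{2}\, n\, \frac{t}{r}\, \frac{\|\nabla \phi_p^2\|_\infty}{\delta} \Big). \tag{3.79} \]
To proceed we need an explicit lower bound on $E_N^{1D}$ that will be proved in the Appendix.

Lemma 2. There is a finite number $C > 0$ such that
\[ E_N^{1D}(n, \ell, g) \ge \frac12\, \frac{n(n-1)}{\ell}\, g \Big( 1 - Cn(\ell g)^{1/2} \Big). \tag{3.80} \]
Applying this lemma to (3.79), we obtain
\begin{align*}
E_\beta(n) \ge \frac{n(n-1)}{\ell}\, \frac{4\pi a}{t^2} & \Big( 1 - Cn \Big( \frac{a}{t} \Big)^{1/8} \Big[ 1 + \frac{nt}{\ell} \Big( \frac{a}{t} \Big)^{1/8} \Big] \Big) \\
& \times \Big( 1 - \sqrt{2}\, n\, \frac{t}{r}\, \frac{\|\nabla^\perp \phi_p^2\|_\infty}{\delta} \Big) \Big( 1 - Cn(\ell a / t^2)^{1/2} \Big). \tag{3.81}
\end{align*}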
Note that the right side is independent of β ⊂ α. We insert this bound in (3.76), and use the following lemma. It is a simple generalization of a result of [31]. Although we need at this point only the version proved in [31], we state the lemma in this general form for later use.
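Lemma 3 ([31]). For $n \in \mathbb{N} \cup \{0\}$, let $E(n)$ be a sequence of non-negative real numbers that is superadditive, i.e., $E(n_1 + n_2) \ge E(n_1) + E(n_2)$, and bounded below by
\[ E(n) \ge L(n) K(n), \tag{3.82} \]
with $K : \mathbb{R}_+ \to \mathbb{R}_+$ monotone decreasing, $L : \mathbb{R}_+ \to \mathbb{R}_+$ convex, $L(0) = 0$, and
\[ L'(x) \le \frac{L(\lambda x)}{2\lambda x} \tag{3.83} \]
for some $\lambda > 1$ and all $x > 0$. (Here $L'$ denotes the right derivative of $L$.) Let $c_n$ be a sequence of non-negative real numbers, with
\[ \sum_{n \ge 0} c_n \le M \qquad \text{and} \qquad \sum_{n \ge 0} c_n\, n = N. \tag{3.84} \]
Then
\[ \sum_{n \ge 0} c_n E(n) \ge M\, L(N/M)\, K(\lceil \lambda N / M \rceil), \tag{3.85} \]
where $\lceil x \rceil$ denotes the smallest integer $\ge x$.

The proof is given in the Appendix. Note that $\inf_\beta E_\beta(n)$ is certainly a superadditive function, as the infimum over superadditive functions. Therefore we can apply Lemma 3 with $L(x) = x[x - 1]_+$ and $\lambda = 4$, to (3.76), together with (3.81), and obtain
\[ E_\alpha(n_\alpha) \ge \frac{4\pi a\, n_\alpha^2}{\ell s^2} \Big( 1 - \frac{s^2}{n_\alpha t^2} \Big) R(n_\alpha), \tag{3.86} \]
where
\begin{align*}
R(n) = {} & \Big( 1 - C \lceil 4nt^2/s^2 \rceil \Big( \frac{a}{t} \Big)^{1/8} \Big[ 1 + \lceil 4nt^2/s^2 \rceil\, \frac{t}{\ell} \Big( \frac{a}{t} \Big)^{1/8} \Big] \Big) \\
& \times \Big( 1 - \sqrt{2}\, \lceil 4nt^2/s^2 \rceil\, \frac{t}{r}\, \frac{\|\nabla^\perp \phi_p^2\|_\infty}{\delta} \Big) \Big( 1 - C \lceil 4nt^2/s^2 \rceil (\ell a / t^2)^{1/2} \Big). \tag{3.87}
\end{align*}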
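The choice $\lambda = 4$ for $L(x) = x[x-1]_+$ can be verified directly against condition (3.83). For $x \ge 1$ the right derivative is $2x - 1$, while $L(4x)/(8x) = 2x - \tfrac12$; for $x < 1$ the right derivative vanishes. The sketch below checks the inequality on a grid and also shows that a smaller value such as $\lambda = 2$ would fail at $x = 1$. Function names are hypothetical.

```python
def L(x):
    """L(x) = x [x - 1]_+ as used with Lemma 3."""
    return x * max(x - 1.0, 0.0)

def L_right_derivative(x):
    """Right derivative of L: 2x - 1 for x >= 1, and 0 for x < 1."""
    return 2.0 * x - 1.0 if x >= 1.0 else 0.0

def condition_3_83_holds(x, lam=4.0):
    """Check L'(x) <= L(lam * x) / (2 * lam * x) at a single point x > 0."""
    return L_right_derivative(x) <= L(lam * x) / (2.0 * lam * x)

grid_ok = all(condition_3_83_holds(0.01 * k) for k in range(1, 1001))
fails_for_lambda_2 = not condition_3_83_holds(1.0, lam=2.0)
```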
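We now insert this bound in (3.73), and use $n_\alpha \le n$ in the error terms. Note that $R$ is monotone decreasing in $n$, hence $R(n_\alpha) \ge R(n)$. Minimizing over $n_\alpha$ gives
\[ E_\alpha(n_\alpha) - 2 n_\alpha\, p\, \phi_{\alpha,\min}^2 \ge - \frac{s^2 \ell\, p^2\, \phi_{\alpha,\min}^4}{4\pi a}\, \frac{1}{R(n)} \Big( 1 + \frac{2\pi a r^2}{\ell t^2 p \delta} \Big)^2. \tag{3.88} \]
This holds for boxes $\alpha$ where $\phi_{\alpha,\min}^2 \ge \delta / r^2$. In boxes where this is not the case, we simply use positivity of $E_\alpha$ (which follows from positivity of $v_a$) in the form
\[ E_\alpha(n_\alpha) - 2 n_\alpha\, p\, \phi_{\alpha,\min}^2 \ge -2 n_\alpha\, \frac{p\delta}{r^2}. \tag{3.89} \]
Putting everything together, using $\sum_\alpha s^2 \phi_{\alpha,\min}^4 \le r^{-2} \int \phi_p^4$ and choosing
\[ p = \frac{4\pi a n}{\ell}, \tag{3.90} \]
we obtain
\[ E_N^{\rm QM}(n, \ell, r, a) \ge \frac{n}{r^2}\, E^{\rm aux}(4\pi a n / \ell) - \frac{4\pi a n^2}{\ell r^2} \Big\{ \sqrt{8}\, \frac{s}{r}\, \|\nabla^\perp \phi_p^2\|_\infty + 2\delta + \int \phi_p^4 \Big[ \frac{1}{R(n)} \Big( 1 + \frac{r^2}{2 t^2 \delta n} \Big)^2 - 1 \Big] \Big\}. \tag{3.91} \]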
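We are still free to choose the parameters $t$, $s$ and $\delta$. It remains to derive a lower bound on $E^{\rm aux}$. This will be done similarly to the lower bound on the 3D GP energy given in Subsect. 2.7. Consider the auxiliary Schrödinger operator
\[ H^{\rm aux} = -\Delta^\perp + V^\perp(x^\perp) + 2p\, b(x^\perp)^2. \tag{3.92} \]
Using $\phi_p$ as a trial function, we have
\[ \inf \operatorname{spec} H^{\rm aux} \le E^{\rm aux}(p) - p \int \phi_p^4 + 2p \int b^2 \phi_p^2 \le E^{\rm aux}(p) + p \int b^4. \tag{3.93} \]
On the other hand, using Temple's inequality (2.61),
\begin{align*}
\inf \operatorname{spec} H^{\rm aux} &\ge e^\perp + 2p \int b^4 - \frac{4p^2 \int b^6}{\tilde e^\perp - 2p \int b^4} \\
&\ge e^\perp + 2p \int b^4 \Big( 1 - \frac{2p \|b\|_\infty^2}{\tilde e^\perp - 2p \|b\|_\infty^2} \Big). \tag{3.94}
\end{align*}
Equations (3.93) and (3.94) together give
\[ E^{\rm aux}(p) \ge e^\perp + p \int b^4 \Big( 1 - \frac{4p \|b\|_\infty^2}{\tilde e^\perp - 2p \|b\|_\infty^2} \Big). \tag{3.95} \]
We now choose
\[ t = \frac{r}{\varepsilon \sqrt{n}}, \qquad s = \varepsilon r, \qquad \delta = \varepsilon, \tag{3.96} \]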
and −1/14
=n
+
na r2
1/8 +
r 1+ √ n
√
na r
1/4 4/39 ,
(3.97)
and obtain as the final result QM
EN (n, , r, a) ≥
na ne⊥ n2 g 1 − C + . + r2 2
(3.98)
Note that we did not pay any attention to the restriction $s/t \in \mathbb{N}$ in choosing $s$ and $t$. However, since, with our choice, $s/t = \varepsilon^2 \sqrt{n} \ge n^{1/2 - 1/7}$, this can easily be made an integer by replacing $\varepsilon$ by some $\bar\varepsilon$ with $\varepsilon \le \bar\varepsilon \le 2\varepsilon$. This affects only the constant $C$ in (3.98).
3.5. Boundary conditions. As a last step in this section, before giving the proof of our main Theorem 1.1, we investigate the dependence of the ground state energy of (1.3) on the boundary conditions. In the upper bound above we used Dirichlet boundary conditions for the energy $E^{1D}$, and Neumann boundary conditions for the lower bound. To relate these energies to the energy with periodic boundary conditions and to prove independence of boundary conditions in the thermodynamic limit, we need the following lemma.

Lemma 4. Denote $E_p^{1D}(n, \ell, g)$ the ground state energy of (1.3) with periodic boundary conditions, i.e., on the torus $[0, \ell]^n$. Then there is a finite number $C > 0$ such that
\[ E_N^{1D}(n, \ell, g) \le E_p^{1D}(n, \ell, g) \le E_D^{1D}(n, \ell, g), \tag{3.99} \]
and
\[ E_D^{1D}(n, \ell, g) \le E_N^{1D}(n, \ell, g) + C\, \frac{n^{7/3}}{\ell^2}. \tag{3.100} \]
Proof. The first inequality (3.99) is standard, noting that the interaction considered has zero range. For (3.100) we use a result of [37, Lemma 2.1.13 and Prop. 2.2.10], which implies, for $0 < b < \ell/2$,
\[ E_D^{1D}(n, \ell + 2b, g) \le E_N^{1D}(n, \ell, g) + \frac{2n}{b^2}. \tag{3.101} \]
Using $E_D^{1D}(n, \ell + 2b, g) = E_D^{1D}(n, \ell, g(1 + 2b/\ell))(1 + 2b/\ell)^{-2} \ge E_D^{1D}(n, \ell, g)(1 + 2b/\ell)^{-2}$ and $E_N^{1D}(n, \ell, g) \le \pi^2 n^3 / (3\ell^2)$, we obtain
\[ E_D^{1D}(n, \ell, g) \le E_N^{1D}(n, \ell, g) + \frac{2n}{b^2}\, (1 + 2b/\ell)^2 + \frac{4\pi^2}{3}\, \frac{n^3 b}{\ell^3}\, (1 + b/\ell). \tag{3.102} \]
Now $b = \text{const.}\ \ell / n^{2/3}$ gives the desired result.
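The last step of the proof can be illustrated numerically. With $b = c\,\ell/n^{2/3}$, both correction terms in (3.102) scale as $n^{7/3}/\ell^2$. The sketch below takes $c = 1$ (an arbitrary illustrative choice, admissible for $n \ge 3$ since it needs $b < \ell/2$) and evaluates the correction in units of $n^{7/3}/\ell^2$. Function names are hypothetical.

```python
import math

def correction_3_102(n, ell, b):
    """Sum of the two correction terms on the right side of (3.102)."""
    boundary = (2.0 * n / b**2) * (1.0 + 2.0 * b / ell) ** 2
    rescale = (4.0 * math.pi**2 / 3.0) * (n**3 * b / ell**3) * (1.0 + b / ell)
    return boundary + rescale

def scaled_correction(n, ell, const=1.0):
    """Correction in units of n^(7/3) / ell^2 for b = const * ell / n^(2/3)."""
    b = const * ell / n ** (2.0 / 3.0)
    return correction_3_102(n, ell, b) * ell**2 / n ** (7.0 / 3.0)

values = [scaled_correction(n, 1.0) for n in (3, 10, 100, 10**4, 10**6)]
limit = 2.0 + 4.0 * math.pi**2 / 3.0
```

The scaled correction stays bounded and tends to $2 + 4\pi^2/3$ as $n \to \infty$ for $c = 1$, so a finite constant $C$ in (3.100) exists.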
4. Proof of Theorem 1.1 With the results of the previous section in hand, we can now give the proof of our main Theorem 1.1. The proof will be divided into four subsections, two for the upper and lower bound, respectively. In each subsection, we compare the ground state energy of HN,L,r,a with the ground state energy of one of the functionals in Subsects. 2.1–2.5, which, as explained there and in Subsect. 2.6, is asymptotically equal to E(N, L, g) in the respective parameter region. Combining all the bounds obtained, this will prove Theorem 1.1, together with the claimed uniformity in the parameters. The corresponding convergence of the ground state particle density, as stated in Theorems 2.1–2.5, follows in a standard way by variation with respect to the external potential VL (z) (compare with Props. 2.1 and 2.2 in Subsect. 2.6). Since the proof of the energy convergence is already quite lengthy by itself, the simple modifications necessary for a proof of the density convergence will be omitted. Let again L¯ = N/ρ¯ denote the extension of the system in z-direction. As already mentioned in the beginning of Sect. 3, it will be necessary, for the lower bound to E QM , to consider the case of small and large N r/L¯ separately. We will divide space in the z-directions into small boxes, and use the bounds of Sect. 3 in each box. To control the
number of particles in each box, Lemma 3 will be essential. For small $Nr/\bar L$, where $r$ is smaller than the mean particle distance, we will use the lower bound given in Thm. 3.1. For larger values of $Nr/\bar L$, where $r$ is actually bigger than the mean particle distance, the lower bound of Thm. 3.2 will be used instead. Note that this distinction is only relevant in Regions 1–3, since in Regions 4 and 5 $Nr/\bar L = r\bar\rho \ll 1$ by condition (1.12). Hence $r$ is always smaller than the mean particle distance in Regions 4 and 5.

4.1. Upper bound for Regions 1–3. For an upper bound that gives the right asymptotics as long as $g/\bar\rho \ll 1$, we can essentially use the same technique as in [27]. The results of Sect. 3 are not needed in this case. As a trial function, we use
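\[ \Psi(x_1,\dots,x_N) = F(x_1,\dots,x_N) \prod_{k=1}^N b_r(x_k^\perp)\, \phi^{\rm GP}(z_k), \tag{4.1} \]
where $\phi^{\rm GP} = (\rho^{\rm GP}_{N,L,g})^{1/2}$ (cf. Subsect. 2.2), and $F$ is the "Dyson wave function", described in [9, 27]. The result is
\[ E^{\rm QM}(N, L, r, a) \le \frac{N e^\perp}{r^2} + E^{\rm GP}(N, L, g) \Big( 1 + C a\, \|b_r\|_\infty^{2/3}\, \|\phi^{\rm GP}\|_\infty^{2/3} \Big), \tag{4.2} \]
as long as $a \|b_r\|_\infty^{2/3} \|\phi^{\rm GP}\|_\infty^{2/3} \le 1$. Note that, by the same proof as in Lemma 2.1 of [28], $g \|\phi^{\rm GP}\|_\infty^2 \le 2E^{\rm GP}/N$, and therefore
\[ E^{\rm QM}(N, L, r, a) \le \frac{N e^\perp}{r^2} + E^{\rm GP}(N, L, g) \Big( 1 + \text{const.}\ a^{2/3} N^{-1/3} E^{\rm GP}(N, L, g)^{1/3} \Big). \tag{4.3} \]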
Now, in Regions 1–3, E GP (N, L, g) ∼ E(N, L, g), and 2 g a 2 E GP (N, L, g)/N ∼ a 2 L−2 + g ρ¯ ∼ (a/L)2 + g ρr ¯ 2 1. ρ¯
(4.4)
4.2. Upper bound for Regions 4 and 5. If $g/\bar\rho$ is not small, the method of the previous subsection does not work, and we have to proceed differently. As in Subsect. 3.5, let $E_p^{1D}(n, \ell, g)$ denote the ground state energy of (1.3) on an interval of length $\ell$, with periodic boundary conditions, and write $E_p^{1D}(n, \ell, g) = n^3 e_n(g\ell/n)/\ell^2$. In [22] it is shown that, for every fixed $t \ge 0$, $\lim_{n \to \infty} e_n(t) = e(t)$. Since the functions are monotone increasing, concave and bounded in $t$, the convergence is actually uniform in $t$. By Lemma 4 of Subsect. 3.5, the same is true with $E_D^{1D}$ instead of $E_p^{1D}$. Hence we get the estimate
\[ E_D^{1D}(n, \ell, g) \le \frac{n^3}{\ell^2} \Big( e(g\ell/n) + \delta(n) \Big) \tag{4.5} \]
for some bounded function $\delta$ satisfying $\lim_{n \to \infty} \delta(n) = 0$. Without loss of generality we may assume that $\delta$ is monotone decreasing. Let $\rho$ be the minimizer of the LL functional (2.19) under the normalization condition $\int \rho = N$. Note that $\rho$ has compact support, with radius $R = L(\mu L^2)^{1/s}$, where $\mu = \partial E^{\rm LL}(N, L, g)/\partial N$. (This $R$ is different from that in Eq. (3.27).) By monotonicity and concavity of $E^{\rm LL}$ in $g$, and by the scaling relation (2.22),
\[ \frac{E^{\rm LL}(N, L, g)}{N} \le \mu \le 3\, \frac{E^{\rm LL}(N, L, g)}{N}. \tag{4.6} \]
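Divide $\mathbb{R}$ in the $z$-direction into intervals of length $\ell$, labeled by $\alpha$, with $R_0 < \ell < R/2$. Let $n_\alpha \in \mathbb{N}$ be a collection of integers such that $\sum_\alpha n_\alpha = N$. Let $V_\alpha = \sup_{z \in \alpha} V_L(z)$, and denote $E_D^{\rm QM}(n, \ell, r, a)$ the ground state energy of (3.1) in a box of side length $\ell$ and with Dirichlet boundary conditions. By confining the particles into different boxes of length $\ell - R_0$, a distance $R_0$ apart, we get the estimate
\[ E^{\rm QM}(N, L, r, a) \le \sum_\alpha \Big( E_D^{\rm QM}(n_\alpha, \ell - R_0, r, a) + V_\alpha n_\alpha \Big). \tag{4.7} \]
Using (3.3) and (4.5) as well as monotonicity of $e$, we obtain
\[ E^{\rm QM}(N, L, r, a) - \frac{N e^\perp}{r^2} \le \sum_\alpha \Big( \frac{n_\alpha^3}{\ell^2} \Big( e(g\ell/n_\alpha) + \delta(n_\alpha) \Big) R(n_\alpha) + V_\alpha n_\alpha \Big), \tag{4.8} \]
with
\[ R(n) = \frac{1}{(1 - R_0/\ell)^2} \Big( 1 + C \Big[ \Big( \frac{na}{r} \Big)^2 (1 + g\ell) \Big]^{1/3} \Big), \tag{4.9} \]
provided we choose $n_\alpha$ and $\ell$ such that the term in square brackets is less than 1. Note the additional factor $(1 - R_0/\ell)^{-2}$, which is due to the fact that the size of the box is only $\ell - R_0$. Now let $\bar n_\alpha = \int_\alpha \rho(z)\,dz$, and $n_\alpha = \lceil \bar n_\alpha \rceil$, where $\lceil x \rceil$ denotes the smallest integer $\ge x$. With this choice $\sum_\alpha n_\alpha \ge N$, but by monotonicity of (4.8) in $N$ we can plug in these values of $n_\alpha$ for an upper bound. Since $n_\alpha \le \bar n_\alpha + 1$, and $e$ and $\delta$ are monotone increasing and decreasing, respectively, we obtain
\[ (4.8) \le \sum_\alpha \Big( \frac{\bar n_\alpha^3}{\ell^2} \Big( e(g\ell/\bar n_\alpha) + \delta(\bar n_\alpha) \Big) \Big( 1 + \frac{1}{\bar n_\alpha} \Big)^3 R(\ell \|\rho\|_\infty + 1) + V_\alpha (\bar n_\alpha + 1) \Big). \tag{4.10} \]
Here we estimated $\bar n_\alpha$ by $\ell \|\rho\|_\infty$ in $R$. Denote $\hat V_\alpha = \min_{z \in \alpha} V_L(z)$. We estimate $V_\alpha \le \hat V_\alpha + \text{const.}\ L^{-2}(\ell/L)(R/L)^{s-1}$ in boxes where $n_\alpha > 0$. Using $(R/L)^s \le 3L^2 E^{\rm LL}(N, L, g)/N$ (see (4.6)), we therefore see that the error in replacing $V_\alpha$ by $\hat V_\alpha$ is, in total, bounded above by $\text{const.}\ E^{\rm LL}(N, L, g)(\ell/R)$. Fix some $0 < \varepsilon < 1$. We first bound the contribution to (4.10) from boxes where $\bar n_\alpha \le 1/\varepsilon$. Now both $e$ and $\delta$ are bounded, the number of boxes with nonzero $\bar n_\alpha$ is bounded by $(R/\ell) + 2$, and $\hat V_\alpha \le L^{-2}(R/L)^s$ in these boxes. Therefore this contribution is bounded above by
\[ \text{const.}\ \frac{R}{\varepsilon \ell} \Big( \frac{1}{\varepsilon^2 \ell^2}\, R(\ell \|\rho\|_\infty + 1) + \frac{E^{\rm LL}(N, L, g)}{N} \Big). \tag{4.11} \]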
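For the remaining boxes, we use $\bar n_\alpha \ge 1/\varepsilon$ and $\bar n_\alpha \le \ell \|\rho\|_\infty$ to obtain
\begin{align*}
(4.10) \le (4.11) &+ \text{const.}\ E^{\rm LL}(N, L, g)\, \frac{\ell}{R} \\
&+ (1 + \varepsilon) \sum_\alpha \Big( \frac{\bar n_\alpha^3}{\ell^2} \Big( e(g\ell/\bar n_\alpha) + \delta(1/\varepsilon) \Big) (1 + \varepsilon)^2 R(\ell \|\rho\|_\infty + 1) + \hat V_\alpha \bar n_\alpha \Big). \tag{4.12}
\end{align*}
Since $x \mapsto x^3 e(1/x)$ is convex, we can use Jensen's inequality to bound the sum from above by
\[ \int_{\mathbb{R}} \Big( \rho(z)^3 \Big( e(g/\rho(z)) + \delta(1/\varepsilon) \Big) (1 + \varepsilon)^2 R(\ell \|\rho\|_\infty + 1) + V(z) \rho(z) \Big)\, dz. \tag{4.13} \]
Now, by the scaling (2.23), $\rho(z) = \gamma \tilde\rho_{g/\gamma}(\gamma z / N)$, where $\gamma = (N/L) N^{-2/(s+2)}$ and $\tilde\rho_{g/\gamma}$ is a function that depends only on $g/\gamma$. Therefore $\|\rho\|_\infty \le \gamma \|\tilde\rho_{g/\gamma}\|_\infty$, and $\|\rho\|_3^3 \le N \gamma^2 \|\tilde\rho_{g/\gamma}\|_\infty^2$. We choose, for some $0 < \hat\varepsilon < \varepsilon$, $\ell = (\hat\varepsilon \gamma)^{-1}$, and use
\[ E^{\rm LL}(1, 1, g/\gamma)^{1/s} \le \frac{\gamma R}{N} \le \pi^{2/s}. \tag{4.14} \]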
Putting everything together, we get, for C denoting some universal constant, N e⊥ E QM (N, L, r, a) − 2 r · R , ≤ E LL (N, L, g) + N γ 2 ρ˜g/γ 2∞ δ(1/) + C(ˆ /)3
(4.15)
where
C ˆ R = 1+ +C (1 + )3 N ˆ E LL (1, 1, g/γ )1/s aρ˜g/γ ∞ 2/3 g 1/3 aR0 γ −2 1+ 1+C . × 1 − 2 ˆ r g ˆ r ˆ γ
(4.16)
The choice of $\varepsilon$ and $\hat\varepsilon$ is determined by $a/r$, $g/\gamma$ and $N$. The bound is uniform in $g/\gamma$ for bounded $g/\gamma$ and $\gamma/g$. If $g/\gamma \to \infty$ as $N \to \infty$ (Region 5), we can use the same method to obtain an upper bound, replacing the bound (3.3) by (3.5) in (4.8). This gives a bound uniform in $g/\gamma$ (for $\gamma/g$ bounded). Combined with the result (4.15), this shows that in the limit $N \to \infty$ and $r/L \to 0$,
\[ \limsup \frac{E^{\rm QM}(N, L, r, a) - N e^\perp / r^2}{E^{\rm LL}(N, L, g)} \le 1, \tag{4.17} \]
uniformly in the parameters, as long as $a/r \to 0$ and $\gamma/g$ stays bounded.
4.3. Lower bound for Regions 3–5. We now derive a lower bound on $E^{\rm QM}$ that will give the right asymptotics in Regions 3–5. As in the upper bound, given in Subsect. 4.2, we will use the box method, this time with Neumann boundary conditions. In each box, the results of Sect. 3 will be used. In analogy to (4.5) we infer from [22] and Lemma 4 that
\[ E_N^{1D}(n, \ell, g) \ge \frac{n^3}{\ell^2} \Big( e(g\ell/n) - \delta(n) \Big) \tag{4.18} \]
for some bounded, monotone decreasing function $\delta$ satisfying $\lim_{n \to \infty} \delta(n) = 0$. This will be used in the bound for Regions 4 and 5. If $g\ell/n$ is small, however, we use $e(g\ell/n) \le \tfrac12 g\ell/n$ and (3.80) to obtain
\[ E_N^{1D}(n, \ell, g) \ge \frac{n^2(n-1)}{\ell^2}\, e(g\ell/n) \Big( 1 - Cn(\ell g)^{1/2} \Big). \tag{4.19} \]
We divide $\mathbb{R}$ in the $z$-direction into intervals of side length $M$, labeled by $\alpha$. Denote $E_N^{\rm QM}(n, M, r, a)$ the ground state energy of (3.1) in a box of side length $M$ and with Neumann boundary conditions. Let again $\hat V_\alpha = \inf_{z \in \alpha} V_L(z)$, and $V_\alpha = \sup_{z \in \alpha} V_L(z)$. By confining the particles into different boxes and neglecting the interaction between different boxes, we get the estimate
\[ E^{\rm QM}(N, L, r, a) \ge \inf_{\{N_\alpha\}} \sum_\alpha \Big( E_N^{\rm QM}(N_\alpha, M, r, a) + \hat V_\alpha N_\alpha \Big), \tag{4.20} \]
where the infimum is over all distributions of the $N$ particles among the boxes $\alpha$. As in the upper bound, we can estimate the difference of the maximum and minimum of $V_L$ for boxes $\alpha$ inside some interval $[-R, R]$ by $\text{const.}\ L^{-2}(M/L)(R/L)^{s-1}$. For boxes outside $[-R, R]$ we use $\hat V_\alpha \ge V_\alpha(1 - sM/R)$. Choosing $R$ the radius of the LL minimizer, and $M = \varepsilon R$ for some $0 < \varepsilon < 1$, we see that, analogously to the upper bound, the error in replacing $\hat V_\alpha$ by $V_\alpha(1 - \varepsilon s)$ is, in total, bounded above by $\text{const.}\ \varepsilon E^{\rm LL}(N, L, g)$.

We now have to estimate $E_N^{\rm QM}(N_\alpha, M, r, a)$ from below. We cannot directly use (3.2), because this bound is not uniform in the particle number. Instead we proceed similarly to [31]. We divide the interval $M$ again into smaller intervals of length $\ell = \eta M$, where $1/\eta \in \mathbb{N}$. Neglecting the interaction between different boxes, we obtain
\[ E_N^{\rm QM}(N_\alpha, M, r, a) \ge \inf_{\{c_n\}} \sum_{n=1}^{N_\alpha} c_n\, E_N^{\rm QM}(n, \ell, r, a), \tag{4.21} \]
where $c_n$ denotes the number of boxes containing exactly $n$ particles, and the infimum is over all $c_n \in \mathbb{N}$ under the condition $\sum_n c_n n = N_\alpha$ and $\sum_n c_n = M/\ell = \eta^{-1}$. Fix some $0 < \chi < 1$, and consider the case $n \ge 1/\chi$. Then, using (3.2) and (4.18),
\[ E_N^{\rm QM}(n, \ell, r, a) - \frac{n e^\perp}{r^2} \ge \frac{n^3}{\ell^2}\, e(g\ell/n) \Big( 1 - \nu(n) \Big), \tag{4.22} \]
with
\[ \nu(n) = \frac{\delta(1/\chi)}{e(g\ell/n)} + Cn \Big( \frac{a}{r} \Big)^{1/8} \Big[ 1 + \frac{nr}{\ell} \Big( \frac{a}{r} \Big)^{1/8} \Big]. \tag{4.23} \]
Note that $\nu(n)$ is monotone increasing in $n$. We now use Lemma 3 from Subsect. 3.4, with $L(x) = x^3 e(g\ell/x)$. Note that for this $L$ (3.83) holds with $\lambda = 6$, since $e$ is a monotone increasing and concave function, with $e(0) = 0$. Let $N' = \sum_{n \ge 1/\chi} c_n n$. The contribution from $n \ge 1/\chi$ to the sum in (4.21) will be bounded below using Lemma 3 and (4.22). For $n < 1/\chi$, we simply use $E_N^{\rm QM}(n, \ell, r, a) \ge n e^\perp / r^2$. We thus obtain
\[ E_N^{\rm QM}(N_\alpha, M, r, a) - \frac{N_\alpha e^\perp}{r^2} \ge \frac{N'^3}{M^2}\, e(gM/N') \Big( 1 - \nu(6N'\eta) \Big). \tag{4.24} \]
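The claim that (3.83) holds with $\lambda = 6$ uses only monotonicity, concavity and $e(0) = 0$: then $L'(x) \le 3x^2 e(c/x)$ and $e(t/6) \ge e(t)/6$, so $L(6x)/(12x) = 18x^2 e(c/(6x)) \ge 3x^2 e(c/x)$. The sketch below tests the inequality numerically with a stand-in function $e(t) = t/(1+t)$, which shares these three properties but is not the actual Lieb-Liniger function; the constant `c` plays the role of $g\ell$ and is an arbitrary illustrative value.

```python
def e_stand_in(t):
    """Hypothetical stand-in: increasing, concave, bounded, e(0) = 0 (not the true LL function)."""
    return t / (1.0 + t)

def L(x, c=1.0):
    """L(x) = x^3 e(c / x), with c in the role of g * ell."""
    return x**3 * e_stand_in(c / x)

def condition_3_83_holds(x, lam=6.0, c=1.0, h=1e-6):
    """Check L'(x) <= L(lam x) / (2 lam x), using a forward difference for the right derivative."""
    derivative = (L(x + h, c) - L(x, c)) / h
    return derivative <= L(lam * x, c) / (2.0 * lam * x)

grid_ok = all(condition_3_83_holds(0.05 * k) for k in range(1, 2001))
```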
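Using $N_\alpha \ge N' \ge N_\alpha - 1/(\eta\chi)$, this gives
\[ E_N^{\rm QM}(N_\alpha, M, r, a) - \frac{N_\alpha e^\perp}{r^2} \ge \frac{N_\alpha^3}{M^2}\, e(gM/N_\alpha) \Big( 1 - \frac{1}{N_\alpha \eta \chi} \Big)^3 \Big( 1 - \nu(6N_\alpha \eta) \Big). \tag{4.25} \]
Now if $\hat\varepsilon N_\alpha \ge 2$ for some $0 < \hat\varepsilon < \chi$, we can choose $\tfrac12 \hat\varepsilon \le \delta \le \hat\varepsilon$ such that $\delta N_\alpha \in \mathbb{N}$, and take $\eta = (\delta N_\alpha)^{-1}$. We also use $\lceil 6/\delta \rceil \le \lceil 12/\hat\varepsilon \rceil \le 13/\hat\varepsilon$ (since $\hat\varepsilon < 1$). Note that, using (4.14) and $\delta \le \hat\varepsilon$,
\[ \frac{\hat\varepsilon}{13}\, g\ell \ge \frac{\varepsilon}{13}\, \frac{g}{\gamma}\, E^{\rm LL}(1, 1, g/\gamma)^{1/s} \equiv \xi(g/\gamma). \tag{4.26} \]
Therefore
\[ E_N^{\rm QM}(N_\alpha, M, r, a) - \frac{N_\alpha e^\perp}{r^2} \ge \frac{N_\alpha^3}{M^2}\, e(gM/N_\alpha)\, R, \tag{4.27} \]
with
\[ R = \Big( 1 - \frac{\hat\varepsilon}{\chi} \Big)^3 \Big( 1 - \frac{\delta(1/\chi)}{e(\xi(g/\gamma))} - \frac{C}{\hat\varepsilon} \Big( \frac{a}{r} \Big)^{1/8} \Big[ 1 + \frac{1}{\xi(g/\gamma)}\, \frac{a}{r} \Big( \frac{a}{r} \Big)^{1/8} \Big] \Big). \tag{4.28} \]
Here we used (4.14) and $N_\alpha \le N$ in the last error term. The bound (4.27) holds for $\hat\varepsilon N_\alpha \ge 2$. If $N_\alpha < 2/\hat\varepsilon$, however, we use
\[ \frac{N_\alpha^3}{M^2}\, e(gM/N_\alpha) \le N_\alpha\, \frac{4\pi^2}{3 \hat\varepsilon^2 M^2}. \tag{4.29} \]
Using these bounds in (4.20), we obtain
\begin{align*}
& E^{\rm QM}(N, L, r, a) - \frac{N e^\perp}{r^2} + C \varepsilon E^{\rm LL}(N, L, g) + N\, \frac{4\pi^2}{3 \hat\varepsilon^2 M^2} \\
&\quad \ge \inf_{\{N_\alpha\}} \sum_\alpha \Big( \frac{N_\alpha^3}{M^2}\, e(gM/N_\alpha) + V_\alpha N_\alpha \Big) R\, (1 - \varepsilon s). \tag{4.30}
\end{align*}
Note that, by (4.14) and (2.22),
\[ \frac{N}{M^2} = \frac{N}{\varepsilon^2 R^2} \le E^{\rm LL}(N, L, g)\, \frac{1}{\varepsilon^2 N^2 E^{\rm LL}(1, 1, g/\gamma)^{1 + 2/s}}. \tag{4.31} \]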
% Define ρ(z) = α Nα χα (z), where χα is the characteristic function of the interval α. The sum in (4.30) is bounded below by E LL [ρ] ≥ E LL (N, L, g), and therefore N e⊥ r2 LL ≥ E (N, L, g)R 1 − C −
E QM (N, L, r, a) −
4π 2 3 2 ˆ 2 N 2 E LL (1, 1, g/γ )1+2/s
.
(4.32)
The choice of , ˆ and χ is determined by g/γ and a/r. They can be chosen such that (4.32) gives the correct lower bound in the limit considered, uniformly in g/γ for bounded γ /g. This finishes the proof of the lower bound for Regions 4 and 5. If g/γ → 0 as N → ∞, we can use exactly the same strategy, with (4.19) replacing the bound (4.18). Considering the case n ≥ 1/χ , (4.22) is still valid, but with ν(n) replaced by a 1/8
nr a 1/8 ν (n) = χ + const. n g + const. n 1+ . (4.33) r r For Nα ≥ max{2/ˆ , 2 N } we proceed exactly as above. (We recall that 0 < < 1, and M = R.) The reason why we have to ensure that Nα ≥ 2 N is the second term in (4.33), where we want to be small. (Note that we choose = M/(δNα ).) For Nα < max{2/ˆ , 2 N }, we replace the bound (4.29) by Nα3 Nα g e(gM/Nα ) ≤ max{2/ˆ , 2 N} 2 M R 2 Nα LL 2 g/γ ≤ , (4.34) E (N, L, g) 1 + 2 N N 2 ˆ E LL (1, 1, g/γ )1+1/s where we used e(x) ≤ 21 x and (4.14). Note that, for small g/γ , the last fraction is order 1. We obtain E QM (N, L, r, a) −
N e⊥ ≥ E LL (N, L, g)R , r2
(4.35)
with R given by + g 1 1 ˆ 3 C a 1/8 a a 1/8 1−χ −C 1+ − R = 1 − χ ˆ ˆ γ ˆ r ξ(g/γ ) r r 2 g/γ × 1 − C − 1+ . (4.36) 2 N 2 ˆ E LL (1, 1, g/γ )1+1/s Here we used again (4.14) to estimate R from above, and 2 N ≤ Nα ≤ N in the error term. This proves the lower bound in Region 3, as long as a/(rξ(g/γ )) stays bounded (or at least does not increase too fast as a/r goes to zero with N ). Note that, for small g/γ , a 1 a rN ∼ (γ /g)(s+2)/(s+1) ∼ , r ξ(g/γ ) r L¯ TF
(4.37)
with L¯ TF defined in (2.11), and hence the bound is uniform for rN/L¯ TF bounded.
As explained briefly in the introduction to this section, if N r/L¯ TF is not small, we have to use Thm. 3.2 instead of Thm. 3.1. Hence we will now assume that A ≡ (rN/L¯ TF )1/3 ≥ 1, but still g/γ → 0 as N → ∞. Instead of (3.2) we will use the bound (3.4) in (4.22). This gives (4.22) with ν(n) replaced by 1/8 √ √ 1/4 *4/39 χ r na na na . (4.38) + 1+ + χ 1/14 + ν (n) = C r2 r Let 0 < ˆ < 1. For Nα ≥ max{2A2 /ˆ , 2 N} we proceed as above, but choosing η = A2 /(δNα ) for 21 ˆ ≤ δ ≤ ˆ . Moreover, we choose χ = 1/($ Nα η). Using R ∼ L¯ TF , this gives (4.35), with R replaced by 1 1 1/4 394 1/2 14 8 aA ˆ Na aA ˆ R = 1 − C + + + 1 + 1/2 R $ A2 r ˆ 2 $ ˆ r 2A2 g/γ ×(1 − $ )3 1 − C − 1+ . (4.39) 2 N 2 ˆ E LL (1, 1, g/γ )1+1/s Note that, for g/γ 1, N a/R ∼ a ρ¯ 1, and (aA/r)3 = (a/r)2 N a/R. Moreover, A2 /N N −1/3 if L¯ TF /L = (NgL)1/(s+1) is bounded away from zero. Hence, for bounded 1/A, this gives the desired lower bound for Region 3. In summary, we have thus shown that, in the limit N → ∞ and r/L → 0, lim inf
$$\frac{E^{\rm QM}(N,L,r,a) - N e^{\perp}/r^{2}}{E^{\rm LL}(N,L,g)} \ge 1 , \tag{4.40}$$
uniformly in the parameters, provided (1.12) holds, $a/r \to 0$ and $1/(NgL)$ stays bounded. This finishes the proof of the lower bound for Regions 3–5.

4.4. Lower bound for Regions 1 and 2. We are left with the lower bound for Regions 1 and 2. We proceed similarly to [26, Sect. 5.1]. Let $\rho^{\rm GP}_{N,L,g}$ denote the minimizer of the GP functional (2.4) under the normalization condition $\int_{\mathbb R} \rho = N$, and let $\phi(z) = (\rho^{\rm GP}_{N,L,g}(z))^{1/2}$. We write a general wave function $\Psi$ as
$$\Psi(x_1,\dots,x_N) = F(x_1,\dots,x_N) \prod_{k=1}^{N} \phi(z_k)\, b_r(x_k^{\perp}) \tag{4.41}$$
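and assume that $\langle\Psi|\Psi\rangle = 1$. In evaluating the expectation value of $H_{N,L,r,a}$, we use partial integration and the GP equation
$$-\phi'' + V\phi + g|\phi|^{2}\phi = \frac{1}{N}\Big(E^{\rm GP}(N,L,g) + \frac{g}{2}\int|\phi|^{4}\Big)\phi . \tag{4.42}$$
(The factor $1/N$ reflects the normalization $\int|\phi|^{2} = N$.) Moreover, we split a fraction of the kinetic energy into a part where the particles are closer than a distance $T > R_0$, and its complement. This splitting will be important in the proof of BEC below. More precisely, for fixed $i$ and $x_j$, $j \neq i$, let
$$\chi_{i,T}(x) = \begin{cases} 1 & \text{if } \min_{k,\,k\neq i}|x - x_k| \ge T \\ 0 & \text{otherwise,} \end{cases} \tag{4.43}$$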
and let $\bar\chi_{i,T} = 1 - \chi_{i,T}$. Then, for $0 \le \varepsilon \le 1$,
$$\langle\Psi|H_{N,L,r,a}|\Psi\rangle = \frac{N e^{\perp}}{r^{2}} + E^{\rm GP}(N,L,g) + \frac{g}{2}\int|\phi|^{4} + Q(F) + (1-\varepsilon)\sum_{i=1}^{N}\int|\nabla_i F|^{2}\,\chi_{i,T}(x_i)\prod_{k=1}^{N}\phi(z_k)^{2}\, b_r(x_k^{\perp})^{2}\, d^{3}x_k , \tag{4.44}$$
with
$$Q(F) = \int\Big\{\sum_{i=1}^{N}\big[\varepsilon|\nabla_i F|^{2} + (1-\varepsilon)|\nabla_i F|^{2}\,\bar\chi_{i,T}(x_i)\big] + \sum_{i<j} v_a(|x_i - x_j|)\,|F|^{2} - g\sum_{i=1}^{N}|\phi(z_i)|^{2}|F|^{2}\Big\}\prod_{k=1}^{N}\phi(z_k)^{2}\, b_r(x_k^{\perp})^{2}\, d^{3}x_k . \tag{4.45}$$
To bound $Q(F)$ from below, we use again the box method. We divide $\mathbb{R}$ in the $z$-direction into intervals of length $M$, labeled by $\alpha$, put $N_\alpha$ particles in box $\alpha$, neglect the interaction between boxes, and minimize over the distribution of the $N$ particles. This gives a lower bound. More precisely,
$$\inf_{F} Q(F) \ge \inf_{\{N_\alpha\}} \sum_{\alpha} \inf_{F_\alpha} Q_\alpha(F_\alpha) , \tag{4.46}$$
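where $F_\alpha = F_\alpha(x_1,\dots,x_{N_\alpha})$, and $Q_\alpha$ is the same as $Q$, but with all the integrations restricted to the box $\alpha$, and $N$ replaced by $N_\alpha$. The infima are under the normalization conditions $\int |F|^{2}\prod_{k=1}^{N}\phi(z_k)^{2} b_r(x_k^{\perp})^{2} d^{3}x_k = 1$ and $\int_\alpha |F_\alpha|^{2}\prod_{k=1}^{N_\alpha}\phi(z_k)^{2} b_r(x_k^{\perp})^{2} d^{3}x_k = 1$, respectively.

We consider two cases. Choose $\delta > 0$. First, assume that, for all $z \in \alpha$, $|\phi(z)|^{2} \ge \delta N/L$. We use the same method as in [26, Eqs. (5.28)–(5.34)] to get rid of the $\phi^{2}$ in the measure $\phi(z)^{2}dz$. Let $\phi_{\alpha,\min}$ and $\phi_{\alpha,\max}$ denote the minimal and maximal value of $\phi$ inside the box $\alpha$, respectively. We first proceed as in (3.30)–(3.36), and obtain, for $T \ge r(a/r)^{1/4}$ = radius of $U$,
$$\sum_{i=1}^{N_\alpha}\int_\alpha\Big[(1-\varepsilon)|\nabla_i F|^{2}\,\bar\chi_{i,T}(x_i) + \tfrac12\sum_{j,\,j\neq i} v_a(|x_i - x_j|)\,|F|^{2}\Big]\prod_{k=1}^{N_\alpha}\phi(z_k)^{2} b_r(x_k^{\perp})^{2} d^{3}x_k$$
$$\ge \sum_{i=1}^{N_\alpha}\int_\alpha \frac{\phi_{\alpha,\min}^{2}}{\phi_{\alpha,\max}^{2}}\, a'\, U(|x_i - x_{k(i)}|)\,\chi_{B_\delta}(x_{k(i)}^{\perp}/r)\,|F|^{2}\prod_{k=1}^{N_\alpha}\phi(z_k)^{2} b_r(x_k^{\perp})^{2} d^{3}x_k . \tag{4.47}$$
Here $U$ is given in (3.33), and $a'$ is given after Eq. (3.36). Denoting
$$\widetilde F(x_1,\dots,x_{N_\alpha}) = F(x_1,\dots,x_{N_\alpha})\prod_{k=1}^{N_\alpha}\phi(z_k) , \tag{4.48}$$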
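and using
$$|\nabla_i \widetilde F|^{2} \le 2|\nabla_i F|^{2}\prod_{k=1}^{N_\alpha}\phi(z_k)^{2} + 2|\widetilde F|^{2}\,\frac{\sup_{z\in\alpha}|\phi'|^{2}}{\phi_{\alpha,\min}^{2}} , \tag{4.49}$$
we get
$$\int_\alpha |\nabla_i F|^{2}\prod_{k=1}^{N_\alpha}\phi(z_k)^{2} b_r(x_k^{\perp})^{2} d^{3}x_k \ge \int_\alpha\Big[\tfrac12|\nabla_i \widetilde F|^{2} - \frac{C_{NgL}}{\delta L^{2}}|\widetilde F|^{2}\Big]\prod_{k=1}^{N_\alpha} b_r(x_k^{\perp})^{2} d^{3}x_k . \tag{4.50}$$
Here we denoted
$$C_{NgL} = \frac{L^{3}}{N}\sup_{z}|\phi'(z)|^{2} , \tag{4.51}$$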
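which, by scaling, depends only on $NgL$. Estimating $\phi^{2}$ by its maximum in the last term in (4.45), we therefore have
$$Q_\alpha(F) \ge \widetilde Q_\alpha(\widetilde F) - N_\alpha\, g\,\phi_{\alpha,\max}^{2} - \varepsilon\,\frac{C_{NgL}}{\delta L^{2}}\,N_\alpha , \tag{4.52}$$
with
$$\widetilde Q_\alpha(F) = \sum_{i=1}^{N_\alpha}\int_\alpha\Big[\frac{\varepsilon}{2}|\nabla_i F|^{2} + \frac{\phi_{\alpha,\min}^{2}}{\phi_{\alpha,\max}^{2}}\, a'\, U(|x_i - x_{k(i)}|)\,|F|^{2}\Big]\prod_{k=1}^{N_\alpha} b_r(x_k^{\perp})^{2} d^{3}x_k . \tag{4.53}$$
Denote the infimum of $\widetilde Q_\alpha(F)$ over all $F$ (under the normalization condition $\int|F|^{2}\prod_k b_r(x_k^{\perp})^{2} d^{3}x_k = 1$) by $\widetilde E_\alpha(N_\alpha, M)$, and choose $\varepsilon = (a/r)^{1/8}$. Looking at the proof of Thm. 3.1, we see that the lower bound in Subsect. 3.2 was obtained exactly from an expression like (4.53) (compare with (3.36)), except for the additional factor $\phi_{\alpha,\min}^{2}/\phi_{\alpha,\max}^{2}$. This factor can be estimated by
$$\frac{\phi_{\alpha,\min}^{2}}{\phi_{\alpha,\max}^{2}} \ge 1 - \frac{2M}{L}\sqrt{\frac{C_{NgL}}{\delta}} . \tag{4.54}$$
Therefore we can apply (3.2), and, in addition, Lemma 2 from Subsect. 3.4 to estimate $E^{\rm 1D}_{\rm N}$ from below. This gives
$$\widetilde E_\alpha(N_\alpha, M) \ge \frac12\,\frac{N_\alpha(N_\alpha - 1)}{M}\, g\,\Big(1 - \frac{2M}{L}\sqrt{\frac{C_{NgL}}{\delta}}\Big)\big(1 - C\nu(N_\alpha, M)\big) , \tag{4.55}$$
where
$$\nu(N, M) = N (gM)^{1/2} + N\Big(\frac{a}{r}\Big)^{1/8}\Big[1 + \frac{Nr}{M}\Big(\frac{a}{r}\Big)^{1/8}\Big] . \tag{4.56}$$
Note that $\nu(N, M)$ is monotone increasing in $N$.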
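This bound is of no use for large $N_\alpha$, however. Therefore we will use again the box method, as in Subsect. 4.3 (see also [31]), with small boxes $\ell = M\eta$ for some $\eta^{-1} \in \mathbb{N}$. The use of Lemma 3, with $L(x) = x[x-1]_+$ and $\lambda = 4$, implies
$$\widetilde E_\alpha(N_\alpha, M) \ge \frac{1}{2M}\,N_\alpha^{2}\, g\,\Big(1 - \frac{1}{N_\alpha\eta}\Big)\Big(1 - \frac{2M\eta}{L}\sqrt{\frac{C_{NgL}}{\delta}}\Big)\big(1 - C\nu(4N_\alpha\eta, M\eta)\big) . \tag{4.57}$$
Let $1 \ge \hat\varepsilon \ge 2/N$ such that $\hat\varepsilon N \in \mathbb{N}$, and choose $\eta = (\hat\varepsilon N)^{-1}$. We estimate $N_\alpha \le N$ in the last term in (4.57), and choose $M = \varepsilon L$ for some $\varepsilon > \hat\varepsilon$. Minimizing over $N_\alpha$ yields
$$\widetilde E_\alpha(N_\alpha, M) - g\,\phi_{\alpha,\min}^{2} N_\alpha \ge -\tfrac12\, g\,\phi_{\alpha,\min}^{4}\, M\, R , \tag{4.58}$$
with
$$R = \Big(1 + \frac{\hat\varepsilon}{2\varepsilon\delta}\Big)^{2}\Big(1 - \frac{2\varepsilon}{\hat\varepsilon N}\sqrt{\frac{C_{NgL}}{\delta}}\Big)^{-1}\Big(1 - C\Big[\sqrt{\frac{\varepsilon g L}{\hat\varepsilon^{3} N}} + \frac{1}{\hat\varepsilon}\Big(\frac{a}{r}\Big)^{1/8}\Big(1 + \frac{rN}{\varepsilon L}\Big(\frac{a}{r}\Big)^{1/8}\Big)\Big]\Big)^{-1} . \tag{4.59}$$
For boxes $\alpha$ where $\phi_{\alpha,\min}^{2} < N\delta/L$, we just neglect the positive terms in (4.45). This gives
$$E^{\rm QM}(N,L,r,a) - \Big(1 - \Big(\frac{a}{r}\Big)^{1/8}\Big)\sum_{i=1}^{N}\int|\nabla_i F|^{2}\chi_{i,T}(x_i)\prod_{k=1}^{N}\phi(z_k)^{2} b_r(x_k^{\perp})^{2} d^{3}x_k$$
$$\ge \frac{N e^{\perp}}{r^{2}} + E^{\rm GP}(N,L,g) + \frac{g}{2}\int|\phi|^{4} - \Big(\frac{a}{r}\Big)^{1/8}\frac{N\, C_{NgL}}{L^{2}\delta} + \sum_{\alpha}\inf_{N_\alpha}\Big[\widetilde E_\alpha(N_\alpha, M) - g\,\phi_{\alpha,\max}^{2} N_\alpha\Big] - \delta\,\frac{g N^{2}}{L} . \tag{4.60}$$
Here we neglected the condition $\sum_\alpha N_\alpha = N$ on the $N_\alpha$, which can only lower the infimum. Moreover, the error in replacing $\phi_{\alpha,\max}^{2}$ by $\phi_{\alpha,\min}^{2}$ in the term in square brackets in (4.60) is bounded above by
$$2NgM\,\sup_{z}\phi(z)|\phi'(z)| \equiv 2N^{2} g\,\frac{M}{L^{2}}\,\widehat C_{NgL} . \tag{4.61}$$
Note that, by scaling, $\widehat C_{NgL}$ depends only on $NgL$. Using (4.58), $\sum_\alpha \phi_{\alpha,\min}^{4} M \le \int\phi^{4}$ and $\sum_\alpha \phi_{\alpha,\min}^{2} M \le N$, and dropping the positive second term on the left side of (4.60), we finally obtain
$$E^{\rm QM}(N,L,r,a) - \frac{N e^{\perp}}{r^{2}} - E^{\rm GP}(N,L,g) \ge -\frac{g}{2}\int|\phi|^{4}\,(R - 1) - \Big(\frac{a}{r}\Big)^{1/8}\frac{N\,C_{NgL}}{L^{2}\delta} - \delta N^{2}\frac{g}{L} - 2N^{2}\frac{g}{L}\,\varepsilon\,\widehat C_{NgL} . \tag{4.62}$$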
Note that $E^{\rm GP}(N,L,g) \ge \tfrac12 g\int\phi^{4}$ and $E^{\rm GP}(N,L,g) \ge c_s N/L^{2}$ for some constant $c_s$ depending only on $s$. Hence we see that, for bounded $NgL$, the parameters $\varepsilon$, $\hat\varepsilon$ and $\delta$ can be chosen arbitrarily small with $a/r$ and $N$ to show the desired lower bound, as long as $Nr/L \ll (r/a)^{1/4}$ and, in particular, for $Nr/L$ bounded. (Note that $\bar L \sim L$ in Regions 1 and 2.) Here we also need that $C_{NgL}$ and $\widehat C_{NgL}$ are uniformly bounded if $NgL$ stays bounded, which follows by the same methods as in the proof of Lemma 1 in the Appendix. For bigger values of $Nr/L$ we have to proceed differently, using the lower bound (3.4) instead of (3.2), as we also did in the previous subsection for the lower bound for Region 3. We omit the details. The results of this subsection can thus be summarized as
$$\liminf\ \frac{E^{\rm QM}(N,L,r,a) - N e^{\perp}/r^{2}}{E^{\rm GP}(N,L,g)} \ge 1 \tag{4.63}$$
in the limit N → ∞ and r/L → 0, uniformly in the parameters as long as NgL stays bounded. This finishes the proof of the lower bound for Regions 1 and 2.
5. Bose-Einstein Condensation

In this last section we investigate the question of Bose-Einstein condensation in the ground state. It will be proved to occur in Regions 1 and 2. It probably also occurs in part of Region 3, but we cannot prove this and it remains an open problem. BEC in the ground state means that the one-body density matrix $\gamma(x, x')$, which is obtained from the ground state wave function $\Psi_0$ by
$$\gamma(x, x') = N\int \Psi_0(x, x_2,\dots,x_N)\,\Psi_0(x', x_2,\dots,x_N)^{*}\, d^{3}x_2\cdots d^{3}x_N , \tag{5.1}$$
factorizes as $N\psi(x)\psi(x')$ for some normalized $\psi$ (in the $N \to \infty$ limit, of course). This, in fact, is 100% condensation. It was proved in [25] for a fixed trap potential in the Gross-Pitaevskii limit, i.e., for both $r/L$ and $Na/L$ fixed as $N \to \infty$. Here we extend this result to the case $r/L \to 0$ with $NgL$ fixed. The function $\psi$ is the square-root of the minimizer of the 1D GP functional (2.4) times the transverse function $b_r(x^{\perp})$.

BEC is not expected in Regions 4 and 5. Lenard [21] showed that the largest eigenvalue of $\gamma$ grows only as $N^{1/2}$ for a homogeneous gas of 1D impenetrable bosons and, according to [34] and [10], this holds also for a GT gas in a harmonic trap. (The exponent 0.59 in [14] can probably be ascribed to the small number of particles ($N = 10$) considered.)

Our main result about BEC in the ground state is:

Theorem 5.1 (BEC in Region 2). If $N \to \infty$, $r/L \to 0$ with $NaL/r^{2}$ fixed, then
$$\frac{L r^{2}}{N}\,\gamma(r x^{\perp}, Lz;\ r x'^{\perp}, Lz') \to b(x^{\perp})\,b(x'^{\perp})\,\phi^{\rm GP}(z)\,\phi^{\rm GP}(z') \tag{5.2}$$
in trace norm. Here $\phi^{\rm GP}$ is the minimizer of the GP functional (2.4) with $N = 1$, $L = 1$ and interaction parameter $NgL$.
Proof. As in the proof of the energy asymptotics in Region 2, we have to distinguish the cases $rN/L$ small or large. For simplicity we consider only the case $rN/L \ll (r/a)^{1/4}$. The case of larger $rN/L$ can be treated in the same manner, replacing the bound (3.2) by (3.4) in Subsect. 4.4, and choosing $T$ in (4.43) appropriately. From the lower bound to the energy in Subsect. 4.4, together with the upper bound in Subsect. 4.1, we infer that if $\Psi_0$ is the ground state of the Hamiltonian $H_{N,L,r,a}$, $T = r(a/r)^{1/4}$ and $F$ is defined by
$$\Psi_0(x_1,\dots,x_N) = F(x_1,\dots,x_N)\prod_{k=1}^{N} L^{-1/2}\phi^{\rm GP}(z_k/L)\, b_r(x_k^{\perp}) , \tag{5.3}$$
then
$$\lim_{N\to\infty}\frac{L^{2}}{N}\sum_{i=1}^{N}\int|\nabla_i F|^{2}\,\chi_{i,T}(x_i)\prod_{k=1}^{N} L^{-1}\phi^{\rm GP}(z_k/L)^{2}\, b_r(x_k^{\perp})^{2}\, d^{3}x_k = 0 \tag{5.4}$$
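in the limit $N \to \infty$, $r/L \to 0$ with $NgL$ fixed. Here $\chi_{i,T}$ is given in (4.43). Note that in this limit $NT^{3}/(r^{2}L) = (Nr/L)(a/r)^{3/4} \to 0$, i.e., the volume of the set where $\chi_{i,T}$ is zero is small compared to the total volume $r^{2}L$. Equation (5.2) now follows, using the methods of [25].

A. Appendix: Proof of Auxiliary Lemmas

Proof of Lemma 1. Without restriction we may assume that $V^{\perp} \ge 0$. The existence, uniqueness and positivity of a minimizer $\phi_p$ are standard (cf., e.g., [27]). From the variational equation
$$\big(-\Delta^{\perp} + V^{\perp}(x^{\perp}) + 2p|\phi_p(x^{\perp})|^{2}\big)\phi_p(x^{\perp}) = \mu_p\,\phi_p(x^{\perp}) \tag{A.1}$$
we infer that, for $K(x^{\perp} - y^{\perp})$ the integral kernel of $(-\Delta^{\perp} + 1)^{-1}$,
$$\phi_p(x^{\perp}) = \int K(x^{\perp} - y^{\perp})\big(\mu_p + 1 - V^{\perp}(y^{\perp}) - 2p|\phi_p(y^{\perp})|^{2}\big)\phi_p(y^{\perp})\, d^{2}y^{\perp} . \tag{A.2}$$
Using positivity of $V^{\perp}$, $\phi_p$ and $K$, as well as the normalization of $\phi_p$, we obtain the bound
$$\phi_p(x^{\perp})\,e^{|x^{\perp}|} \le \sup_{x^{\perp}} e^{|x^{\perp}|}\int_{V^{\perp}(y^{\perp}) \le \mu_p + 1} K(x^{\perp} - y^{\perp})\,(\mu_p + 1)\,\phi_p(y^{\perp})\, d^{2}y^{\perp}$$
$$\le (\mu_p + 1)\sup_{x^{\perp}} e^{|x^{\perp}|}\Big(\int_{V^{\perp}(y^{\perp}) \le \mu_p + 1} K(x^{\perp} - y^{\perp})^{2}\, d^{2}y^{\perp}\Big)^{1/2} \equiv C_p . \tag{A.3}$$
Since $\mu_p$ is uniformly bounded for $p$ in a bounded interval, so is $C_p$. Moreover,
$$|\nabla^{\perp}\phi_p(x^{\perp})| \le \int |\nabla^{\perp}K(x^{\perp} - y^{\perp})|\,\big|\mu_p + 1 - V^{\perp}(y^{\perp}) - 2p|\phi_p(y^{\perp})|^{2}\big|\,\phi_p(y^{\perp})\, d^{2}y^{\perp} \le \widetilde C_p\,\|\nabla^{\perp}K\|_{1} , \tag{A.4}$$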
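with
$$\widetilde C_p = \sup_{y^{\perp}}\big|\mu_p + 1 - V^{\perp}(y^{\perp}) - 2p|\phi_p(y^{\perp})|^{2}\big|\,\phi_p(y^{\perp}) , \tag{A.5}$$
which is finite and uniformly bounded for bounded $p$ because of (A.3) and the fact that $V^{\perp}$ is polynomially bounded at infinity by assumption.

Proof of Lemma 2. We write (1.3) as
$$H_{n,g} = \sum_{j=1}^{n}\Big(-\tfrac12\partial_j^{2} + \tfrac12\sum_{i \neq j}\Big[-\frac{1}{n-1}\,\partial_j^{2} + g\,\delta(z_i - z_j)\Big]\Big) . \tag{A.6}$$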
To bound this expression from below, we want to use Temple's inequality, and for this purpose we have to smear out the $\delta$-function interaction. We use the following lemma (compare with [2, Lemma 6.3]), whose proof can be found below.

Lemma 5. Let $\partial_z^{2}$ denote the Neumann Laplacian on an interval $[0, \ell]$, let $z_0 \in (0, \ell)$, and choose positive numbers $A$, $B$ and $\alpha$ such that $R \equiv B\arctan(BA/2\alpha) \le \min\{z_0, \ell - z_0\}$. Then
$$-\alpha\,\partial_z^{2} + A\,\delta(z - z_0) - \frac{\alpha}{B^{2}}\,\theta(R - |z - z_0|) \ge 0 . \tag{A.7}$$
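We apply this result to the operator in square brackets in (A.6). This gives
$$H_{n,g} \ge \sum_{j=1}^{n}\Big(-\tfrac12\partial_j^{2} + \tfrac12\sum_{i \neq j}\frac{1}{(n-1)B^{2}}\,\theta(R - |z_j - z_i|)\,\chi_{[R,\ell - R]}(z_i)\Big) , \tag{A.8}$$
with $R = B\arctan(Bg(n-1)/2)$ and $B > 0$ arbitrary. Temple's inequality (2.61) implies that, for
$$U(z_1,\dots,z_n) = \tfrac12\sum_{i \neq j}\frac{1}{(n-1)B^{2}}\,\theta(R - |z_j - z_i|)\,\chi_{[R,\ell - R]}(z_i) \tag{A.9}$$
and $\langle U^{k}\rangle = \ell^{-n}\int U^{k}\, dz_1\cdots dz_n$,
$$E^{\rm 1D}_{\rm N}(n, \ell, g) \ge \langle U\rangle\Big(1 - \frac{\langle U^{2}\rangle}{\langle U\rangle}\,\frac{1}{\tfrac12\pi^{2}/\ell^{2} - \langle U\rangle}\Big) , \tag{A.10}$$
provided the term in the denominator is positive. By Schwarz' inequality,
$$\langle U^{2}\rangle \le \frac{n}{2B^{2}}\,\langle U\rangle . \tag{A.11}$$
Moreover,
$$\langle U\rangle = \frac12\,\frac{n}{B^{2}}\,\frac{1}{\ell^{2}}\int_{R}^{\ell - R}dz\int_{0}^{\ell}dw\,\theta(R - |z - w|) = \frac{n}{B^{2}\ell^{2}}\,R\,(\ell - 2R) . \tag{A.12}$$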
Using $x \ge \arctan(x) \ge x(1 - x/3)$ for $x \ge 0$, this leads to the estimate
$$E^{\rm 1D}_{\rm N}(n, \ell, g) \ge \tfrac12\,n(n-1)\,\frac{g}{\ell}\Big(1 - \frac{B^{2}gn}{\ell} - \frac{Bgn}{6}\Big)\Big(1 - \frac{n}{B^{2}\big(\pi^{2}/\ell^{2} - n(n-1)g/\ell\big)}\Big) , \tag{A.13}$$
under the assumption $\pi^{2}/\ell^{2} > n(n-1)g/\ell$. Choosing $B = (g\pi^{2}/\ell^{3})^{-1/4}$, this gives, for $R = (n^{2}\ell g)^{1/2}/\pi$,
$$E^{\rm 1D}_{\rm N}(n, \ell, g) \ge \tfrac12\,n(n-1)\,\frac{g}{\ell}\Big(1 - R - \frac{\pi}{6n^{1/2}}R^{3/2}\Big)\Big(1 - \frac{R}{1 - R^{2}}\Big) \tag{A.14}$$
and proves the desired lower bound.
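The substitution that leads from (A.13) to (A.14) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes (A.13) and (A.14) in the form $E \ge \tfrac12 n(n-1)(g/\ell)(\cdots)(\cdots)$ with $B = (g\pi^{2}/\ell^{3})^{-1/4}$, and the function names `optimal_B`, `bound_A13` and `bound_A14` are hypothetical helpers, not notation from the paper.

```python
import math

def optimal_B(ell, g):
    """Choice of B made after (A.13): B = (g*pi^2/ell^3)^(-1/4)."""
    return (g * math.pi**2 / ell**3) ** (-0.25)

def bound_A13(n, ell, g, B):
    """Right side of (A.13) for a given smearing parameter B."""
    gap = math.pi**2 / ell**2 - n * (n - 1) * g / ell
    if gap <= 0:
        raise ValueError("(A.13) requires pi^2/ell^2 > n(n-1)g/ell")
    lead = 0.5 * n * (n - 1) * g / ell
    first = 1 - B**2 * g * n / ell - B * g * n / 6
    second = 1 - n / (B**2 * gap)
    return lead * first * second

def bound_A14(n, ell, g):
    """Right side of (A.14), written with K = n*(ell*g)^(1/2)/pi."""
    K = n * math.sqrt(ell * g) / math.pi
    lead = 0.5 * n * (n - 1) * g / ell
    first = 1 - K - math.pi * K**1.5 / (6 * math.sqrt(n))
    second = 1 - K / (1 - K**2)
    return lead * first * second

# Sample parameters (arbitrary, chosen so that K is small).
n, ell, g = 3, 1.0, 0.01
B = optimal_B(ell, g)
print(bound_A13(n, ell, g, B), bound_A14(n, ell, g))
```

With this choice of $B$ the first bracket of (A.13) coincides term by term with that of (A.14), while the second bracket of (A.14) is slightly smaller because $n(n-1)$ is replaced by $n^{2}$, so (A.14) is the weaker of the two bounds.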
Proof of Lemma 5. Let $h$ denote the operator on the left side of (A.7). The function
$$f(z) = \begin{cases} \cos\big((R - |z - z_0|)/B\big) & \text{for } 0 \le |z - z_0| \le R \\ 1 & \text{otherwise} \end{cases} \tag{A.15}$$
fulfills the Schrödinger equation $hf = 0$, and since it is positive, it must be the ground state of $h$. Hence $h \ge 0$.

Proof of Lemma 3. Let $p = \lambda N/M$. For $n \le p$ we use the estimate
$$E(n) \ge L(n)K(p) , \tag{A.16}$$
which follows from monotonicity of $K$. For $n \ge p$, we use superadditivity, which implies that
$$E(n) \ge [n/p]\,E(p) \ge \frac{n}{2p}\,L(p)K(p) , \tag{A.17}$$
where $[x]$ denotes the integer part of $x$. Now let $t = \sum_{n<p} c_n\, n$. Hence
$$\sum_{n \ge 0} c_n E(n) \ge K(p)\Big[M\,L(t/M) + (N - t)\,\frac{L(p)}{2p}\Big] . \tag{A.19}$$
To obtain a lower bound, we have to minimize the right side over all $0 \le t \le N$. Note that (A.19) is convex in $t$, hence the minimum is either taken at $t = N$, or, if it is taken at some $t_0 < N$, its right derivative at $t_0$ has to be positive. Using (3.83) this leads to
$$\frac{L(\lambda t_0/M)}{2\lambda t_0/M} \ge \frac{L(p)}{2p} . \tag{A.20}$$
By our assumptions on $L$, $L(x)/x$ is monotone increasing, and hence $t_0 \ge pM/\lambda \ge N$. Thus we can set $t = N$ in (A.19), and obtain the desired lower bound.
References

1. Astrakharchik, G.E., Giorgini, S.: Quantum Monte Carlo study of the three- to one-dimensional crossover for a trapped Bose gas. Phys. Rev. A 66, 053614-1–6 (2002)
2. Baumgartner, B., Solovej, J.P., Yngvason, J.: Atoms in Strong Magnetic Fields: The High Field Limit at Fixed Nuclear Charge. Commun. Math. Phys. 212, 703–724 (2000)
3. Blume, D.: Fermionization of a bosonic gas under highly elongated confinement: A diffusion quantum Monte Carlo study. Phys. Rev. A 66, 053613-1–8 (2002)
4. Bongs, K., Burger, S., Dettmer, S., Hellweg, D., Arlt, J., Ertmer, W., Sengstock, K.: Waveguides for Bose-Einstein condensates. Phys. Rev. A 63, 031602 (2001)
5. Cornell, E.A., Wieman, C.E.: Bose-Einstein condensation in a dilute gas, the first 70 years and some recent experiments. In: Les Prix Nobel 2001, Stockholm: The Nobel Foundation, 2002, pp. 87–108. Reprinted in: Rev. Mod. Phys. 74, 875–893 (2002); Chem. Phys. Chem. 3, 476–493 (2002)
6. Das, K.K.: Highly anisotropic Bose-Einstein condensates: Crossover to lower dimensionality. Phys. Rev. A 66, 053612-1–7 (2002)
7. Das, K.K., Girardeau, M.D., Wright, E.M.: Crossover from One to Three Dimensions for a Gas of Hard-Core Bosons. Phys. Rev. Lett. 89, 110402-1–4 (2002)
8. Dunjko, V., Lorent, V., Olshanii, M.: Bosons in Cigar-Shaped Traps: Thomas-Fermi Regime, Tonks-Girardeau Regime, and In Between. Phys. Rev. Lett. 86, 5413–5416 (2001)
9. Dyson, F.J.: Ground-State Energy of a Hard-Sphere Gas. Phys. Rev. 106, 20–26 (1957)
10. Forrester, P.J., Frankel, N.E., Garoni, T.M., Witte, N.S.: Finite one dimensional impenetrable Bose systems: Occupation numbers. Phys. Rev. A 67, 043607 (2003); Forrester, P.J., Frankel, N.E., Garoni, T.M.: Random Matrix Averages and the impenetrable Bose Gas in Dirichlet and Neumann Boundary Conditions. J. Math. Phys. 44, 4157 (2003)
11. Gangardt, D.M., Shlyapnikov, G.V.: Local correlations in a strongly interacting 1D Bose gas. New J. Phys. 5, 79 (2003)
12. Girardeau, M.: Relationship between Systems of Impenetrable Bosons and Fermions in One Dimension. J. Math. Phys. 1, 516–523 (1960)
13. Girardeau, M.D., Wright, E.M.: Bose-Fermi variational Theory for the BEC-Tonks Crossover. Phys. Rev. Lett. 87, 210401-1–4 (2001)
14. Girardeau, M.D., Wright, E.M., Triscari, J.M.: Ground-state properties of a one-dimensional system of hard-core bosons in a harmonic trap. Phys. Rev. A 63, 033601-1–6 (2001)
15. Görlitz, A., Vogels, J.M., Leanhardt, A.E., Raman, C., Gustavson, T.L., Abo-Shaeer, J.R., Chikkatur, A.P., Gupta, S., Inouye, S., Rosenband, T., Ketterle, W.: Realization of Bose-Einstein Condensates in Lower Dimension. Phys. Rev. Lett. 87, 130402-1–4 (2001)
16. Greiner, M., Bloch, I., Mandel, O., Hänsch, T., Esslinger, T.: Exploring Phase Coherence in a 2D Lattice of Bose-Einstein Condensates. Phys. Rev. Lett. 87, 160405 (2001)
17. Jackson, A.D., Kavoulakis, G.M.: Lieb Mode in a Quasi-One-Dimensional Bose-Einstein Condensate of Atoms. Phys. Rev. Lett. 89, 070403 (2002)
18. Ketterle, W.: When atoms behave as waves: Bose-Einstein condensation and the atom laser. In: Les Prix Nobel 2001, Stockholm: The Nobel Foundation, 2002, pp. 118–154. Reprinted in: Rev. Mod. Phys. 74, 1131–1151 (2002); Chem. Phys. Chem. 3, 736–753 (2002)
19. Kolomeisky, E.B., Newman, T.J., Straley, J.P., Qi, X.: Low-Dimensional Bose Liquids: Beyond the Gross-Pitaevskii Approximation. Phys. Rev. Lett. 85, 1146–1149 (2000). Bhaduri, R.K., Sen, D.: Comment on “Low-Dimensional Bose Liquids: Beyond the Gross-Pitaevskii Approximation”. Phys. Rev. Lett. 86, 4708 (2001). Reply, Phys. Rev. Lett. 86, 4709 (2001)
20. Komineas, S., Papanicolaou, N.: Vortex Rings and Lieb Modes in a Cylindrical Bose-Einstein Condensate. Phys. Rev. Lett. 89, 070402 (2002)
21. Lenard, A.: Momentum distribution in the ground state of the one-dimensional system of impenetrable bosons. J. Math. Phys. 5, 930–943 (1964)
22. Lieb, E.H., Liniger, W.: Exact Analysis of an Interacting Bose Gas. I. The General Solution and the Ground State. Phys. Rev. 130, 1605–1616 (1963)
23. Lieb, E.H.: Exact Analysis of an Interacting Bose Gas. II. The Excitation Spectrum. Phys. Rev. 130, 1616–1624 (1963)
24. Lieb, E.H., Loss, M.: Analysis. Second edition, Providence, RI: American Mathematical Society, 2001
25. Lieb, E.H., Seiringer, R.: Proof of Bose-Einstein Condensation for Dilute Trapped Gases. Phys. Rev. Lett. 88, 170409-1–4 (2002)
26. Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: The Ground State of the Bose Gas. In: Current Developments in Mathematics, 2001, Cambridge: International Press, 2002, pp. 131–178
27. Lieb, E.H., Seiringer, R., Yngvason, J.: Bosons in a trap: A rigorous derivation of the Gross-Pitaevskii energy functional. Phys. Rev. A 61, 043602-1–13 (2000)
28. Lieb, E.H., Seiringer, R., Yngvason, J.: A Rigorous Derivation of the Gross-Pitaevskii Energy Functional for a Two-dimensional Bose Gas. Commun. Math. Phys. 224, 17–31 (2001)
29. Lieb, E.H., Seiringer, R., Yngvason, J.: One-dimensional Bosons in Three-dimensional Traps. arXiv:cond-mat/0304071, Phys. Rev. Lett. 91, 150401 (2003)
30. Lieb, E.H., Solovej, J.P., Yngvason, J.: Asymptotics of Heavy Atoms in High Magnetic Fields. I: Lowest Landau Band Regions. Commun. Pure and Appl. Math. 47, 513–593 (1994)
31. Lieb, E.H., Yngvason, J.: Ground State Energy of the Low Density Bose Gas. Phys. Rev. Lett. 80, 2504–2507 (1998)
32. Menotti, C., Stringari, S.: Collective Oscillations of a 1D Trapped Bose gas. Phys. Rev. A 66, 043610 (2002)
33. Olshanii, M.: Atomic Scattering in the Presence of an External Confinement and a Gas of Impenetrable Bosons. Phys. Rev. Lett. 81, 938–941 (1998)
34. Papenbrock, T.: Ground-state properties of hard-core bosons in one-dimensional harmonic traps. Phys. Rev. A 67, 041601(R) (2003)
35. Petrov, D.S., Shlyapnikov, G.V., Walraven, J.T.M.: Regimes of Quantum Degeneracy in Trapped 1D Gases. Phys. Rev. Lett. 85, 3745–3749 (2000)
36. Pitaevskii, L., Stringari, S.: Uncertainty Principle, Quantum Fluctuations, and Broken Symmetries. J. Low Temp. Phys. 85, 377–388 (1991)
37. Robinson, D.W.: The Thermodynamic Pressure in Quantum Statistical Mechanics. Lecture Notes in Physics, Vol. 9, Berlin-Heidelberg-New York: Springer, 1971
38. Schreck, F., Khaykovich, L., Corwin, K.L., Ferrari, G., Bourdel, T., Cubizolles, J., Salomon, C.: Quasipure Bose-Einstein Condensate Immersed in a Fermi Sea. Phys. Rev. Lett. 87, 080403 (2001)
39. Tanatar, B., Erkan, K.: Strongly interacting one-dimensional Bose-Einstein condensates in harmonic traps. Phys. Rev. A 62, 053601-1–6 (2000). Girardeau, M.D., Wright, E.M.: Comment on “Strongly interacting one-dimensional Bose-Einstein condensates in harmonic traps”. arXiv:cond-mat/0010457
40. Temple, G.: The theory of Rayleigh’s Principle as Applied to Continuous Systems. Proc. Roy. Soc. London A 119, 276–293 (1928)
41. Tonks, L.: The Complete Equation of State of One, Two and Three-Dimensional Gases of Hard Elastic Spheres. Phys. Rev. 50, 955–963 (1936)

Communicated by M. Aizenman
Commun. Math. Phys. 244, 395–417 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-1000-8
Communications in
Mathematical Physics
The Infinite Volume Limit of Dissipative Abelian Sandpiles

C. Maes¹, F. Redig², E. Saada³

¹ Instituut voor Theoretische Fysica, K.U. Leuven, Celestijnenlaan 200D, 3001 Leuven, Belgium
² Faculteit Wiskunde en Informatica, Technische Universiteit Eindhoven, Postbus 513, 5600 MB Eindhoven, The Netherlands and EURANDOM, Eindhoven, The Netherlands
³ CNRS, UMR 6085, Laboratoire de Mathématiques Raphaël Salem, Université de Rouen, site Colbert, 76821 Mont-Saint-Aignan Cedex, France

Received: 9 October 2002 / Accepted: 15 September 2003
Published online: 2 December 2003 – © Springer-Verlag 2003

Work partially supported by Tournesol project – nr. T2001.11/03016VF

Abstract: We construct the thermodynamic limit of the stationary measures of the Bak-Tang-Wiesenfeld sandpile model with a dissipative toppling matrix (sand grains may disappear at each toppling). We prove uniqueness and mixing properties of this measure and we obtain an infinite volume ergodic Markov process leaving it invariant. We show how to extend the Dhar formalism of the ‘abelian group of toppling operators’ to infinite volume in order to obtain a compact abelian group with a unique Haar measure representing the uniform distribution over the recurrent configurations that create finite avalanches.

1. Introduction

The abelian sandpile is a lattice model where a discrete height-variable (e.g. representing the slope of a sandpile at that site) is associated to each site. (Sand) grains are randomly added and if at a site the height exceeds some critical value $\gamma$, then that “unstable” site “topples”, i.e., gives an equal portion of its grains to each of its neighboring sites which in turn can become “unstable” and “topple” etc., until every site has again a subcritical height-value. An unstable site thus creates an “avalanche” involving possibly the toppling of many sites around it. The reach of this avalanche depends on the configuration, making this dynamics highly non-local.

Since their appearance in [1], sandpile models have been studied intensively. One physical motivation is related to what was called self-organized criticality. The steady state typically exhibits power law decay of correlations and of avalanche sizes with an amazing universality of critical exponents found in many computer simulations and in a wide range of natural phenomena. See e.g. [16] for an overview of various models. The advantage of the abelian sandpile model lies in its rich mathematical structure, first discovered by Dhar, see for instance [4, 5, 8, 14], and also [12] for a mathematical review of the main properties of the model in finite volume. The main technical tool in the analysis is the “abelian group” of toppling operators which can be identified with the set of recurrent configurations.

Our aim is to define the model on infinite graphs or better, to understand how the process settles down in a stationary regime as the volume increases. The main problem to overcome is the non-locality or extreme sensitivity to boundary-conditions or surface effects of the dynamics, which is of course directly related to its physical interest. We have constructed in [9] and [10] the infinite volume standard sandpile process on the one-dimensional lattice and on homogeneous trees.

In the present paper we focus on the thermodynamic limit for dissipative models. There, the infinite graph $S$ is a subgraph of the regular lattice $\mathbb{Z}^{d}$ and on each site the height has a critical value $\gamma \ge N$ = the maximal number of neighbors of a site in $S$. The finite volume rule now starts as follows: choose a site $x$ at random from the volume $V$ and add one grain to it. Suppose that $x$ has $N_x$ nearest neighbors and that the new height at $x$ is $\gamma + 1$. Then, it topples by giving to each of its nearest neighbors one grain and dissipating $\gamma - N_x$ grains to a sink associated to the volume. We say that the site $x$ is dissipative when $\gamma > N_x$ and the model is dissipative when this happens for a considerable fraction of sites. This condition can be rephrased in terms of the simple random walk on $S$ with a sink associated to the dissipative sites: the model is dissipative when the Green's function decays fast enough in the lattice distance, see (2.10) below for a precise formulation. Dissipative abelian sandpile models have appeared in the physics literature in [15, 11] and [3], where it was argued that dissipation removes criticality, that is, correlation functions decay exponentially fast uniformly in the volume.
From the point of view of defining the thermodynamic limit, the main simplification of dissipative models is that there is a stronger control of the non-locality: more precisely, the probability that a site $y$ is influenced by addition on $x$ decays exponentially fast (or at least in a summable way) in the distance between the sites. Hence “avalanche clusters” are almost surely finite. As we will see, the avalanche clusters in a dissipative model behave as “subcritical percolation” clusters, with a characteristic size (in particular they have a finite first moment).

Dissipative models as studied in the present paper teach us little about the original goal of sandpile models, i.e., about self-organized criticality. One gains however in providing a rather complete mathematical analysis. There are various reasons to be interested in dissipative models. One still obtains a nonlocal dynamics in analogy with the original models but, as we will show, the nonlocality can be better controlled mathematically. Secondly, one can hope to approach the thermodynamic limit of the original critical model again, by letting the dissipation approach zero. Thirdly, the claim in the physics literature that the dissipative model is noncritical has only been proved for very special explicitly computable correlation functions. We give a complete analysis and prove that correlations of all local observables decay exponentially. Finally, it will turn out that the dissipative model shows the interesting structure of a compact abelian group of addition operators, also in the infinite volume limit. That generalizes the Dhar formalism to infinite volume and can stand as an example of what to expect for the original critical model.

1.1. Results. Our three main results are:
1. The extension of the Dhar formalism to infinite volume sandpile dynamics. That includes the construction of a compact abelian group of recurrent configurations on which we can define addition (of sand) operations.
2. The construction of the thermodynamic limit of the finite volume stationary measure with exponential decay of correlations in the case of “strong dissipativity”.
3. The construction of an infinite volume sandpile process which converges exponentially fast to its unique stationary measure.

1.2. Plan of the paper. The paper is organized as follows: in Sect. 2 we repeat some of the basic results on the abelian sandpile model in finite volume and we introduce the definition of dissipativity, with examples. In Sect. 3 we show how to extend the dynamics on infinite volume recurrent configurations and we recover the group structure of “addition of recurrent configurations.” In Sect. 4 we prove existence and ergodic properties of the infinite volume dynamics. Sect. 5 is devoted to the proof of exponential decay of correlations.

2. Finite Volume Model

In this section we recall some definitions and properties of abelian sandpiles in finite volume. In [4, 5, 8, 14] and [12], the reader will find more details. The infinite graphs $S$ on which we construct the dissipative abelian sandpile dynamics are $S = \mathbb{Z}^{d}$, and “strips”, that is, $S = \mathbb{Z} \times \{1,\dots,\ell\}$, for some integer $\ell > 1$ (notice that $\ell = 1$ corresponds to $S = \mathbb{Z}^{d}$ with $d = 1$). Finite subsets of $S$ will be denoted by $V$, $W$; we write $\mathcal{S} = \{W \subset S : W \text{ finite}\}$. We denote by $\partial V$ the external boundary of $V$: all the sites in $S \setminus V$ that have a nearest neighbor in $V$. Let $N$ be the maximal number of neighbors of a site in $S$, e.g., $N = 2d$ for $S = \mathbb{Z}^{d}$ and $N = 4$ for $S = \mathbb{Z} \times \{1,\dots,\ell\}$, $\ell \ge 3$.

The state space of the process in infinite volume is $\Omega = \{1,\dots,\gamma\}^{S}$, with some integer $\gamma \ge N$. We fix $V \in \mathcal{S}$, a nearest neighbor connected subset of $S$. Then $\Omega_V = \{1,\dots,\gamma\}^{V}$ is the state space of the process in the finite volume $V$. We denote by $N_V(x)$ the number of nearest neighbors of $x$ in $V$. A (infinite volume) height configuration $\eta$ is a mapping from $S$ to $\mathbb{N} = \{1, 2, \dots\}$ assigning to each site $x$ a “number of sand grains” $\eta(x) \ge 1$.
If $\eta \in \Omega$, it is called a stable configuration. Otherwise $\eta$ is unstable. For $\eta \in \Omega$, $\eta_V$ is its restriction to $V$, and for $\eta, \zeta \in \Omega$, $\eta_V\zeta_{V^{c}}$ denotes the configuration whose restriction to $V$ (resp. $V^{c}$) coincides with $\eta_V$ (resp. $\zeta_{V^{c}}$). The configuration space $\Omega$ is endowed with the product topology, making it into a compact metric space. A function $f : \Omega \to \mathbb{R}$ is local if there is a finite $W \subset S$ such that $\eta_W = \zeta_W$ implies $f(\eta) = f(\zeta)$. The minimal (in the sense of set ordering) such $W$ is called the dependence set of $f$, and is denoted by $D_f$. A local function can be seen as a function on $\Omega_W$ for all $W \supset D_f$, and every function on $\Omega_W$ can be seen as a local function on $\Omega$. The set $\mathcal{L}$ of all local functions is uniformly dense in the set $C(\Omega)$ of all continuous functions on $\Omega$.

2.1. The dynamics in finite volume. The toppling matrix $\Delta$ on $S$ is defined by, for $x, y \in S$,
$$\Delta_{xx} = \gamma , \qquad \Delta_{xy} = -1 \ \text{if } x \text{ and } y \text{ are nearest neighbors}, \qquad \Delta_{xy} = 0 \ \text{otherwise}. \tag{2.1}$$
We denote by $\Delta^{V}$ the restriction of $\Delta$ to $V \times V$.
A site $x \in V$ is called a dissipative site in the volume $V$ if
$$\sum_{y \in V}\Delta_{xy} > 0 .$$
Thus if $\gamma > N$, every site is dissipative. If $\gamma = N$, the internal boundary sites of $V$ (that is, all the sites in $V$ that have a nearest neighbor in $S \setminus V$) are the only dissipative sites in $V$. To define the sandpile dynamics, we first introduce the toppling of a site $x$ as the mapping $T_x : \mathbb{N}^{V} \to \mathbb{N}^{V}$ defined by
$$T_x(\eta)(y) = \begin{cases} \eta(y) - \Delta^{V}_{xy} & \text{if } \eta(x) > \Delta^{V}_{xx} , \\ \eta(y) & \text{otherwise.} \end{cases} \tag{2.2}$$
In words, site $x$ topples if and only if its height is strictly larger than $\Delta^{V}_{xx} = \gamma$, by transferring $-\Delta^{V}_{xy} \in \{0, 1\}$ grains to site $y \neq x$ and losing itself in total $\Delta^{V}_{xx} = \gamma$ grains. As a consequence, if the site is dissipative, then, upon toppling, some grains are lost. Toppling rules commute on unstable configurations, that is, for $x, y \in V$ such that $\eta(x) > \gamma = \Delta^{V}_{xx}$ and $\eta(y) > \gamma = \Delta^{V}_{yy}$:
$$T_x(T_y(\eta)) = T_y(T_x(\eta)) .$$
For $\eta \in \mathbb{N}^{V}$, we say that $\zeta \in \Omega_V$ arises from $\eta$ by toppling if there exists a $k$-tuple $(x_1,\dots,x_k)$ of sites in $V$ such that
$$\zeta = \Big(\prod_{i=1}^{k} T_{x_i}\Big)(\eta) .$$
The toppling transformation is the mapping T : NV → V defined by the requirement that T (η) arises from η by toppling. The fact that stabilization of an unstable configuration is always possible follows from the existence of dissipative sites. The fact that T is well-defined, that is, that the same final stable configuration is obtained irrespective of the order of the topplings, is a consequence of the commutation property, see [12] for a complete proof. For η ∈ NV and x ∈ V , let ηx denote the configuration obtained from η by adding one grain to site x, that is ηx (y) = η(y) + δx,y . The addition operator defined by ax,V : V → V ; η → ax,V η = T (ηx )
(2.3)
represents the effect of adding a grain to the stable configuration η and letting a stable configuration arise by toppling. Because T is well-defined, the composition of addition operators is commutative. We can now define a discrete time Markov chain $\{\eta_n : n \geq 0\}$ on $\Omega_V$ by picking a point x ∈ V randomly at each discrete time step and applying the addition operator $a_{x,V}$ to the configuration. We define also a continuous time Markov process $\{\eta_t : t \geq 0\}$ with infinitesimal generator
$$L^{0,\varphi}_V f(\eta) = \sum_{x \in V} \varphi(x)\,[f(a_{x,V}\eta) - f(\eta)]; \tag{2.4}$$
this is a pure jump process on $\Omega_V$, where $\varphi : S \to (0, \infty)$ is the addition rate function.
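The toppling map (2.2) and the addition operator (2.3) can be tried out on a very small volume. The following is a minimal sketch, not the authors' code: the three-site path, the value γ = 2 and all function names are illustrative assumptions. It stabilizes by repeated toppling and checks by brute force that the addition operators commute on every stable configuration.

```python
from itertools import product

GAMMA = 2  # illustrative choice of the diagonal entry of the toppling matrix
SITES = ("a", "b", "c")  # hypothetical three-site path a ~ b ~ c
NEIGHBORS = {"a": ("b",), "b": ("a", "c"), "c": ("b",)}


def delta(x, y):
    """Toppling matrix: GAMMA on the diagonal, -1 for nearest neighbors, else 0."""
    if x == y:
        return GAMMA
    return -1 if y in NEIGHBORS[x] else 0


def topple(eta, x):
    """The map T_x of (2.2): it acts only if eta(x) > Delta_xx."""
    if eta[x] <= delta(x, x):
        return dict(eta)
    return {y: eta[y] - delta(x, y) for y in SITES}


def stabilize(eta):
    """Apply topplings until every height is at most GAMMA."""
    eta = dict(eta)
    while True:
        unstable = [x for x in SITES if eta[x] > GAMMA]
        if not unstable:
            return eta
        eta = topple(eta, unstable[0])


def add_grain(eta, x):
    """The addition operator a_{x,V} of (2.3): add one grain at x, then stabilize."""
    raised = dict(eta)
    raised[x] += 1
    return stabilize(raised)


def all_stable():
    """All stable configurations with heights in {1, ..., GAMMA}."""
    for heights in product(range(1, GAMMA + 1), repeat=len(SITES)):
        yield dict(zip(SITES, heights))


def additions_commute():
    """Check a_x a_y = a_y a_x on every stable configuration."""
    return all(
        add_grain(add_grain(eta, x), y) == add_grain(add_grain(eta, y), x)
        for eta in all_stable() for x in SITES for y in SITES
    )
```

Stabilization terminates here because the two end sites of the path are dissipative for γ = 2; on a volume without dissipative sites the loop in `stabilize` need not stop.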
2.2. Recurrent configurations, invariant measure. The Markov chain $\{\eta_n, n \geq 0\}$ (or its continuous time version $\{\eta_t\}$) has a unique recurrent class $R_V$, and its stationary measure $\mu_V$ is the uniform measure on that class, that is,
$$\mu_V = \frac{1}{|R_V|} \sum_{\eta \in R_V} \delta_\eta. \tag{2.5}$$
A configuration $\eta \in \Omega_V$ belongs to $R_V$ if it passes the burning algorithm (see [4]), which is described as follows. Pick $\eta \in \Omega_V$ and erase the set $E_1$ of all sites x ∈ V with a height strictly larger than the number of neighbors of that site in V, that is, satisfying the inequality $\eta(x) > N_V(x)$. Iterate this procedure for the new volume $V \setminus E_1$, and so on. If at the end some non-empty subset $V_f$ is left, η satisfies, for all $x \in V_f$, $\eta(x) \leq N_{V_f}(x)$. The restriction $\eta_{V_f}$ is then called a forbidden subconfiguration (fsc). If $V_f$ is empty, the configuration is called allowed. The set $A_V$ of allowed configurations coincides with the set of recurrent configurations, $A_V = R_V$ (see [8, 12, 14]). A recurrent configuration is thus nothing but a configuration without forbidden subconfigurations. This extends to infinite volume:

Definition 2.1. A configuration η ∈ Ω is called recurrent if for any V ∈ S, $\eta_V \in R_V$.

The set R of all recurrent configurations forms a perfect (hence uncountable) subset of Ω. This means that R is closed (hence compact) and every element η ∈ R is the limit of a sequence $\eta_n \in R$, $\eta_n \neq \eta$. On the set $R_V$, the finite volume addition operators $a_{x,V}$ can be inverted and they generate a finite abelian group. This group is characterized by the closure relation
$$\prod_{y \in V} a_{y,V}^{\Delta^V_{xy}} = \mathrm{Id}. \tag{2.6}$$
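The burning algorithm of Sect. 2.2 is short enough to write out. The sketch below is an illustration under assumptions that are not in the text: a three-site path with sites labelled 0, 1, 2, maximal height γ = 2, and hypothetical function names. It returns the set of sites that are never erased, which is empty exactly for allowed configurations and otherwise supports a forbidden subconfiguration.

```python
from itertools import product

NEIGHBORS = {0: (1,), 1: (0, 2), 2: (1,)}  # hypothetical path 0 ~ 1 ~ 2
GAMMA = 2  # illustrative maximal height


def burning_leftover(eta):
    """Run the burning algorithm; return the sites that are never erased."""
    remaining = set(eta)
    while True:
        # A site burns if its height exceeds its number of neighbors
        # inside the volume that is still present.
        burnable = {
            x for x in remaining
            if eta[x] > sum(1 for y in NEIGHBORS[x] if y in remaining)
        }
        if not burnable:
            return remaining
        remaining -= burnable


def is_allowed(eta):
    """A configuration is allowed when the burning algorithm erases every site."""
    return not burning_leftover(eta)


# Brute-force list of allowed (= recurrent) configurations on the path.
ALLOWED = [
    heights
    for heights in product(range(1, GAMMA + 1), repeat=3)
    if is_allowed(dict(enumerate(heights)))
]
```

On this example the configuration with heights 2, 1, 1 leaves the last two sites unburnt, so it contains the forbidden subconfiguration 11, while 2, 1, 2 burns completely.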
By the group property, the uniform measure $\mu_V$ is invariant under the action of $a_{x,V}$ and of $a_{x,V}^{-1}$.

2.3. Toppling numbers. For x, y ∈ V and $\eta \in \Omega_V$, let $n_V(x, y, \eta)$ denote the number of topplings at site y by adding a grain at x, that is, the number of times we have to apply the operator $T_y$ to stabilize $\eta^x$ in the volume V. We have the relation
$$\eta(y) + \delta_{x,y} = a_{x,V}\eta(y) + \sum_{z \in V} \Delta^V_{yz}\, n_V(x, z, \eta). \tag{2.7}$$
Defining
$$G_V(x, y) = \int \mu_V(d\eta)\, n_V(x, y, \eta) \tag{2.8}$$
one obtains, by integrating (2.7) over $\mu_V$:
$$G_V(x, y) = (\Delta^V)^{-1}_{xy}. \tag{2.9}$$
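Relation (2.9) can be checked exactly on a small volume by averaging toppling numbers over the recurrent class. The sketch below is an assumption-laden illustration rather than the paper's method: it uses a three-site path with γ = 2, enumerates the recurrent configurations with the burning algorithm, and verifies with rational arithmetic that the matrix of averaged toppling numbers times the toppling matrix is the identity, which avoids an explicit matrix inversion.

```python
from fractions import Fraction
from itertools import product

SITES = (0, 1, 2)  # hypothetical path 0 ~ 1 ~ 2
NEIGHBORS = {0: (1,), 1: (0, 2), 2: (1,)}
GAMMA = 2  # illustrative choice

# Toppling matrix Delta^V as nested lists.
DELTA = [[GAMMA if x == y else (-1 if y in NEIGHBORS[x] else 0) for y in SITES]
         for x in SITES]


def is_recurrent(eta):
    """Burning algorithm: recurrent iff every site is eventually erased."""
    remaining = set(SITES)
    while remaining:
        burnable = {x for x in remaining
                    if eta[x] > sum(1 for y in NEIGHBORS[x] if y in remaining)}
        if not burnable:
            return False
        remaining -= burnable
    return True


def toppling_numbers(eta, x):
    """n_V(x, ., eta): topplings at each site caused by adding a grain at x."""
    eta = list(eta)
    eta[x] += 1
    counts = [0] * len(SITES)
    while True:
        unstable = [z for z in SITES if eta[z] > GAMMA]
        if not unstable:
            return counts
        z = unstable[0]
        counts[z] += 1
        for y in SITES:
            eta[y] -= DELTA[z][y]


RECURRENT = [eta for eta in product(range(1, GAMMA + 1), repeat=len(SITES))
             if is_recurrent(eta)]

# G[x][y] is the average of n_V(x, y, eta) under the uniform measure, as in (2.8).
G = [[Fraction(sum(toppling_numbers(eta, x)[y] for eta in RECURRENT),
               len(RECURRENT)) for y in SITES] for x in SITES]


def times_delta(x, y):
    """Entry (x, y) of the matrix product G * Delta^V."""
    return sum(G[x][z] * DELTA[z][y] for z in SITES)
```

For this volume `G[2][2]` evaluates to 3/4: adding a grain at the last site causes one toppling there for three of the four recurrent configurations and none for the fourth.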
In the limit V ↑ S, $G_V$ converges to the Green's function G of the simple random walk on S with a sink associated to the dissipative sites (that is, every site x is linked with $\gamma - N_S(x)$ edges to a sink and the walk stops when it reaches the sink). By (2.8), the probability that a site y topples by addition at x in volume V is bounded by $G_V(x, y)$.

Definition 2.2. We say that the sandpile model is dissipative if
$$\sup_{x \in S} \sum_{y \in S} G(x, y) < +\infty. \tag{2.10}$$
In our examples, if γ > 2d for $\mathbb{Z}^d$ or γ ≥ 4 for strips, the Green's function G(x, y) decays exponentially in the lattice distance between x and y and hence (2.1) defines a dissipative model. From now on, we restrict ourselves to these cases.

Definition 2.3. For any integer n, let $\nu_{W_n}$ be a probability measure on $\Omega_{W_n}$, with $W_n \in S$, $W_n \uparrow S$. Then $\nu_{W_n}$ converges to a probability measure ν on Ω if for any f ∈ L,
$$\lim_{n \to \infty} \int f \, d\nu_{W_n} = \int f \, d\nu.$$
We denote by I the set of all limit points of $\{\mu_V : V \in S\}$ in the sense of Definition 2.3. By compactness of Ω, I is a non-empty compact convex set. Moreover, by (2.5) and Definition 2.1, any µ ∈ I concentrates on R (see [10]).

2.4. Untoppling numbers. On the set $R_V$ the addition operators $a_{x,V}$ are invertible. The action of the inverse operator on a recurrent configuration can be defined recursively as follows, see [8]. Consider $\eta \in R_V$ and x ∈ V. Remove one grain from η at site x. If the resulting configuration is recurrent, it is $a_{x,V}^{-1}\eta$; otherwise it contains a forbidden subconfiguration (fsc) in $V_1 \subset V$. In that case "untopple" the sites in $V_1$. By untoppling of a site z we mean that the sites are updated according to the rule $\eta(y) \to \eta(y) + \Delta^V_{zy}$. Iterate this procedure until a recurrent configuration is obtained: the latter coincides with $a_{x,V}^{-1}\eta$.

As an example, consider a graph with just three sites a ∼ b ∼ c for γ = 2. The configuration 212 is recurrent. After removal of one grain at site c, we get 211, which contains the fsc 11. Untoppling site b gives 130, and untoppling site c gives 122, which is recurrent. Conversely, one verifies that addition at site c on 122 gives back the original configuration 212.

Call $n^-_V(x, y, \eta)$ the number of untopplings at site y by removing one grain from x and from untoppling sites until a recurrent configuration is obtained. As in the previous section, one easily proves the relation
$$\int n^-_V(x, y, \eta)\, \mu_V(d\eta) = G_V(x, y). \tag{2.11}$$

3. The Group of Addition Operators in Infinite Volume

In this section we show how to obtain the group of addition operators in the infinite volume limit. The assumption of dissipativity is crucial in order to obtain a compact abelian group in the thermodynamic limit.
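The recursive description of the inverse addition operator in Sect. 2.4 can be run on the three-site example a ∼ b ∼ c with γ = 2 given there. The following sketch is illustrative only: the sites are labelled 0, 1, 2 and all function names are assumptions. It removes a grain, finds the sites of a forbidden subconfiguration with the burning algorithm, untopples them, and repeats until the configuration is recurrent.

```python
SITES = (0, 1, 2)  # the path a ~ b ~ c of the text, labelled 0, 1, 2
NEIGHBORS = {0: (1,), 1: (0, 2), 2: (1,)}
GAMMA = 2


def delta(x, y):
    """Toppling matrix: GAMMA on the diagonal, -1 for nearest neighbors, else 0."""
    if x == y:
        return GAMMA
    return -1 if y in NEIGHBORS[x] else 0


def forbidden_sites(eta):
    """Sites left over by the burning algorithm (empty iff eta is recurrent)."""
    remaining = set(SITES)
    while True:
        burnable = {x for x in remaining
                    if eta[x] > sum(1 for y in NEIGHBORS[x] if y in remaining)}
        if not burnable:
            return remaining
        remaining -= burnable


def add_grain(eta, x):
    """Addition operator: add one grain at x and stabilize by toppling."""
    eta = list(eta)
    eta[x] += 1
    while True:
        unstable = [z for z in SITES if eta[z] > GAMMA]
        if not unstable:
            return tuple(eta)
        z = unstable[0]
        for y in SITES:
            eta[y] -= delta(z, y)


def remove_grain(eta, x):
    """Inverse addition operator on a recurrent configuration, via untoppling."""
    eta = list(eta)
    eta[x] -= 1
    while True:
        leftover = forbidden_sites(eta)
        if not leftover:
            return tuple(eta)
        for z in leftover:  # untopple every site of the forbidden subconfiguration
            for y in SITES:
                eta[y] += delta(z, y)


RECURRENT = [(1, 2, 2), (2, 1, 2), (2, 2, 1), (2, 2, 2)]
```

Starting from 212 and removing a grain at c, `remove_grain((2, 1, 2), 2)` passes through 211 and returns 122, as in the example of the text. Removing a grain at b instead needs two rounds of untoppling, which is why the procedure is iterated.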
3.1. Addition operator. The finite volume addition operators $a_{x,V}$ (cf. (2.3)) are defined on Ω via
$$a_{x,V} : \Omega \to \Omega : \eta \mapsto a_{x,V}\eta = (a_{x,V}\eta_V)_V\, \eta_{V^c} \tag{3.1}$$
(with some slight abuse of notation). Similarly, the inverses are defined on R via
$$a_{x,V}^{-1} : R \to \Omega : \eta \mapsto a_{x,V}^{-1}\eta = (a_{x,V}^{-1}\eta_V)_V\, \eta_{V^c}. \tag{3.2}$$
Remark that if η ∈ R, then $(a_{x,V}\eta)_W \in R_W$ for all W ⊂ V, but $a_{x,V}\eta$ is not necessarily an element of R.

Definition 3.1. For η ∈ Ω, we say that the limit of the finite volume addition operators is defined on η if for every x ∈ S, there exists $\Lambda_0 \in S$ such that for any Λ ∈ S, $\Lambda \supset \Lambda_0$, $a_{x,\Lambda}\eta = a_{x,\Lambda_0}\eta$; in that case, we write $a_x\eta = a_{x,\Lambda_0}\eta$. Similarly, for η ∈ R, we say that the limit of the finite volume inverse addition operators is defined on η if for every x ∈ S, there exists $\Lambda_0 \in S$ such that for any Λ ∈ S, $\Lambda \supset \Lambda_0$, $a_{x,\Lambda}^{-1}\eta = a_{x,\Lambda_0}^{-1}\eta$; we write $a_x^{-1}\eta = a_{x,\Lambda_0}^{-1}\eta$.
Remark that if η ∈ R and $a_x$ is defined on η, then $a_x\eta \in R$.

Lemma 3.1. Assume (2.10). For any µ ∈ I there exists a tail measurable subset $\Omega' \subset \Omega$ such that:
1. $\mu(\Omega') = 1$;
2. The limit of the finite volume addition operators and their inverses is defined on every $\eta \in \Omega'$.
Moreover, every µ ∈ I is invariant under the action of $a_x$ and $a_x^{-1}$, that is, for all x ∈ S and f ∈ L,
$$\int f(a_x\eta)\,\mu(d\eta) = \int f(a_x^{-1}\eta)\,\mu(d\eta) = \int f(\eta)\,\mu(d\eta), \tag{3.3}$$
and $a_x a_x^{-1} = a_x^{-1} a_x = \mathrm{id}$ on $\Omega'$.

Proof. We prove the result for the addition operators; the analogue for the inverses is proved along the same lines by replacing "number of topplings" by "number of untopplings". Pick $W_k \in S$, $W_k \uparrow S$ such that $\mu_{W_k} \to \mu$, and x ∈ S. We have to prove that
$$\mu\big(\forall \Lambda_0 \in S,\ \exists V \supset \Lambda_0 : a_{x,V}\eta \neq a_{x,\Lambda_0}\eta\big) = 0. \tag{3.4}$$
We enumerate $S = \{x_n : n \in \mathbb{N}\}$, with $V_n = \{x_1, \ldots, x_n\}$ such that $V_n \uparrow S$, $x_n \in \partial V_{n-1}$. If $a_{x,V}\eta \neq a_{x,V_n}\eta$, then some boundary site of $V_n$ has toppled under addition at x in volume V. This implies that for every m such that $V_m \supset V$ some external boundary site of $V_n$ topples upon addition at x in $V_m$. Therefore, the left-hand side of (3.4) is bounded by
$$\mu\big(\forall n \in \mathbb{N},\ \exists p \geq n,\ \exists y \in \partial V_n : n_{V_p}(x, y, \eta) \geq 1\big)$$
and we have to estimate
$$\mu\big(\exists p \geq n,\ \exists y \in \partial V_n : n_{V_p}(x, y, \eta) \geq 1\big). \tag{3.5}$$
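Since $n_{V_p}(x, y, \eta) \leq n_{V_{p+1}}(x, y, \eta)$,
$$\begin{aligned}
\mu\big(\exists p \geq n,\ \exists y \in \partial V_n : n_{V_p}(x, y, \eta) \geq 1\big)
&\leq \lim_{k \to \infty} \mu\big(\exists y \in \partial V_n : n_{V_k}(x, y, \eta) \geq 1\big) \\
&\leq \lim_{k \to \infty} \sum_{y \in \partial V_n} \int n_{V_k}(x, y, \eta)\,\mu(d\eta) \\
&\leq \lim_{k \to \infty} \sum_{y \in \partial V_n} \int n_{W_k}(x, y, \eta)\,\mu_{W_k}(d\eta) \\
&= \sum_{y \in \partial V_n} G(x, y),
\end{aligned}$$
which implies that (3.5) converges to zero as n tends to infinity, by condition (2.10). Finally, (3.3) follows easily from Definition 3.1, $\mu(\Omega') = 1$, f ∈ L, and the invariance of $\mu_V$ under the finite volume addition operators $a_{x,V}$ and $a_{x,V}^{-1}$.

Notice that for $\eta \in \Omega'$, we can take the limit V ↑ S in (2.7) and write
$$\eta(y) + \delta_{x,y} = a_x\eta(y) + \sum_{z \in S} \Delta_{yz}\, n_S(x, z, \eta) \tag{3.6}$$
for any x, y ∈ S, where $n_S(x, z, \eta)$, the number of topplings at site z ∈ S by adding a grain at x, satisfies $\sum_{z \in S} n_S(x, z, \eta) < +\infty$.

Lemma 3.2. Assume (2.10). For any µ ∈ I there exists a tail measurable subset $\Omega_o \subset \Omega$ with $\mu(\Omega_o) = 1$ such that for any V ∈ S and $n_x$, x ∈ V, integers, the product $\prod_{x \in V} a_x^{n_x}$ is well-defined, as the limit of $\prod_{x \in V} a_{x,\Lambda}^{n_x}$ as Λ ↑ S, on every $\eta \in \Omega_o$.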
Proof. We fix V ∈ S, x ∈ V, $n_x$ a positive integer, and we prove that $a_x^{n_x}$ is well-defined on a set of full µ-measure (the case of negative $n_x$ is similar and the extension to finite products is straightforward). Following the same lines as in the preceding proof, we have to replace (3.4) by
$$\mu\big(\forall \Lambda_0 \in S,\ \exists \Lambda \supset \Lambda_0 : a_{x,\Lambda}^{n_x}\eta \neq a_{x,\Lambda_0}^{n_x}\eta\big) = 0.$$
We denote by $E_{V_p}(n_x, x, z, \eta)$ the event that addition in $V_p$ of $n_x$ grains at x causes at least one toppling at z. As these events are increasing in p, we estimate
$$\mu\big(\exists p \geq n,\ \exists y \in \partial V_n : E_{V_p}(n_x, x, y, \eta)\big) \leq \lim_{k \to \infty} \sum_{y \in \partial V_n} \mu_{W_k}\big(E_{W_k}(n_x, x, y, \eta)\big) \leq \sum_{y \in \partial V_n} n_x\, G(x, y),$$
where the last inequality is a consequence of (2.9) and (3.3). From this we deduce that for any V ∈ S, $n = (n_x, x \in V) \in \mathbb{Z}^V$, the product $\prod_{x \in V} a_x^{n_x}$ is well-defined on a tail measurable set Ω(V, n) of µ-measure one. The set $\Omega_o$ is then the countable intersection
$$\Omega_o = \bigcap_{V \in S,\ n \in \mathbb{Z}^V} \Omega(V, n)$$
of tail measurable µ-measure one sets.
The following proposition extends this to addition on infinite products.
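Proposition 3.1. Assume (2.10). If $n = (n_x, x \in S) \in \mathbb{Z}^S$ satisfies $\sum_{x \in S} |n_x|\, G(0, x) < +\infty$, the product $\prod_{x \in S} a_x^{n_x}$ is well-defined on a set Ω(n) of µ-measure 1, for every µ ∈ I.

Proof. Take $n_x \geq 0$ for every x ∈ S; the case of negative $n_x$ is treated again by replacing "topplings" with "untopplings". It suffices to show that for every $\Lambda_0 \in S$,
$$\mu\Big(\exists V_0,\ \forall V \supset V_0,\ \forall y \in \Lambda_0 : \Big(\prod_{x \in V} a_x^{n_x}\eta\Big)(y) = \Big(\prod_{x \in V_0} a_x^{n_x}\eta\Big)(y)\Big) = 1$$
or
$$\lim_{V_0 \uparrow S} \mu\Big(\exists V \supset V_0,\ \exists y \in \Lambda_0 : \Big(\prod_{x \in V} a_x^{n_x}\eta\Big)(y) \neq \Big(\prod_{x \in V_0} a_x^{n_x}\eta\Big)(y)\Big) = 0. \tag{3.7}$$
The left-hand side of (3.7) is bounded by the sum
$$\sum_{y \in \Lambda_0} \mu\Big(\exists V \supset V_0 : \Big(\prod_{x \in V} a_x^{n_x}\eta\Big)(y) \neq \Big(\prod_{x \in V_0} a_x^{n_x}\eta\Big)(y)\Big). \tag{3.8}$$
If none of the external boundary points of $\Lambda_0$ topples upon addition of $n_z$ grains at $z \in V \setminus V_0$ to the configuration $\prod_{x \in V_0} a_x^{n_x}\eta$, we have that for all $y \in \Lambda_0$:
$$\Big(\prod_{x \in V} a_x^{n_x}\eta\Big)(y) = \Big(\prod_{x \in V_0} a_x^{n_x}\eta\Big)(y).$$
Since µ is invariant under the $a_x$, see (3.3), the sum (3.8) is bounded from above by
$$\sum_{y \in \Lambda_0} \sum_{|x - y| = 1} \sum_{z \in V_0^c} \mu\big(E_S(n_z, z, x, \eta)\big) \leq \sum_{y \in \Lambda_0} \sum_{|x - y| = 1} \sum_{z \in V_0^c} n_z\, G(z, x),$$
which implies (3.7) by the hypothesis on n.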
3.2. Group structure. Here we show that the product $\prod_{x \in S} a_x^{n_x}$ can be defined on any recurrent configuration, provided we identify recurrent configurations which differ by a multiple of Δ. Given $n \in \mathbb{Z}^S$ and η ∈ R, we consider the set
$$A_n(\eta) = \{\xi \in R : \exists m \in \mathbb{Z}^S,\ \eta + n = \xi + \Delta m\}.$$
Similarly, for subtraction,
$$S_n(\eta) = \{\xi \in R : \exists m \in \mathbb{Z}^S,\ \eta - n = \xi + \Delta m\}.$$
Fix $n \in \mathbb{Z}^S$ so that
$$\sup_{y \in S} \sum_{x \in S} [\,|n_x| + 2\gamma\,]\, G(y, x) = B < +\infty \tag{3.9}$$
and let
$$\Omega_n = \{\eta \in R : S_n(\eta) \neq \emptyset,\ A_n(\eta) \neq \emptyset\}$$
be the set of recurrent configurations for which both addition and subtraction with n give rise to a new recurrent configuration, modulo the toppling matrix applied to an integer function.

Lemma 3.3. $\Omega_n = R$.

Proof. We prove that $\Omega_n$ is closed. Let $(\eta_k)_{k \geq 0}$ be a sequence in $\Omega_n$ which converges to η as k → ∞. For each k, there exist $\eta_k^{\pm} \in R$ and $m_k^{\pm} \in [-B, B]^S$ such that
$$\eta_k \pm n = \eta_k^{\pm} + \Delta m_k^{\pm}. \tag{3.10}$$
Since $R \times [-B, B]^S$ is compact, there exists a subsequence $k_i \to \infty$ such that $\eta_{k_i}^{\pm} \to \eta^{\pm}$ and $m_{k_i}^{\pm} \to m^{\pm}$. Taking limits along this subsequence in (3.10) yields $\eta \pm n = \eta^{\pm} + \Delta m^{\pm}$, that is, $\eta \in \Omega_n$. Looking back at Proposition 3.1, $\Omega(n) \cap R \subset \Omega_n$ and Ω(n) is a µ-measure one (hence non-empty) tail set. Therefore it is dense and $\Omega_n = R$.

Definition 3.2. Two recurrent configurations η, ζ ∈ R are called equivalent, and we write η ∼ ζ, if there exists $m \in \mathbb{Z}^S$ such that
$$\eta = \zeta + \Delta m. \tag{3.11}$$

Remark 3.1.
1. For all $n \in \mathbb{Z}^S$, η ∈ R, if $\zeta, \zeta' \in A_n(\eta)$ (or $\zeta, \zeta' \in S_n(\eta)$), then $\zeta \sim \zeta'$.
2. If $\eta \sim \eta'$, then $A_n(\eta) = A_n(\eta')$, $S_n(\eta) = S_n(\eta')$ for all $n \in \mathbb{Z}^S$.
3. In the finite volume case one can prove that every equivalence class in $\mathbb{Z}^V / \Delta^V \mathbb{Z}^V$ contains exactly one recurrent configuration, that is, $\eta, \zeta \in R_V$ and $\eta = \zeta + \Delta^V m$ imply η = ζ. This is no longer true in infinite volume. As an example we take $S = \mathbb{Z} \times \{1, 2\}$, γ = 4. Then the recurrent configurations η(x) = 3 for all x and ζ(x) = 4 for all x (denoted by 3 and 4) are equivalent: ζ = η + Δm, where m(x) = 1 for all x.
We can now introduce the addition operator on classes: take the class [η] containing the recurrent configuration η, let $\xi \in A_n(\eta)$ and define
$$\prod_{x \in S} a_x^{n_x}[\eta] = [\xi].$$
Notice that if $\eta \in \Omega'$ (that is, η ∈ R is such that $a_x$ is the limit of $a_{x,V}$ on η), then
$$a_x[\eta] = [a_x\eta]. \tag{3.12}$$

Proposition 3.2. Assume (2.10). R/∼ is a compact metric space.

Proof. It suffices to show that equivalence classes are closed. Suppose we have sequences $(\eta_k)$, $(\xi_k)$ of recurrent configurations with $\eta_k \sim \xi_k$, $\eta_k \to \eta$, $\xi_k \to \xi$. Then, there exist $m_k \in [-M, M]^S$ with $M = 2\gamma \sup_{x \in S} \sum_{y \in S} G(x, y)$ such that
$$\eta_k = \xi_k + \Delta m_k. \tag{3.13}$$
We can choose a subsequence $k_i \to +\infty$ such that $m_{k_i} \to m$. Taking limits along this subsequence in (3.13) yields η = ξ + Δm, giving η ∼ ξ.

By point 2 of Remark 3.1 the addition of equivalence classes of configurations in R is well-defined.

Definition 3.3. Assume (2.10). For [η], [ξ] in R/∼ we define [η] ⊕ [ξ] to be the class which contains $A_\xi(\eta)$.

Theorem 3.1. (R/∼, ⊕) is a compact abelian group, hence it admits a unique Haar measure.

Proof. The group property is immediate; the compactness follows from Proposition 3.2. For the consequence see e.g. [7], p. 31.

The next result shows that from a measure theoretic perspective, there is no difference between classes of the relation ∼ and recurrent configurations. As a corollary, we obtain that the set I of possible weak limit points of the finite volume stationary measures is a singleton.

Proposition 3.3. For every µ ∈ I there exists a set $\tilde{\Omega} \subset R$ of µ-measure one such that for all $\eta \in \tilde{\Omega}$, [η] = {η}.

Before proving the proposition, we state and prove

Theorem 3.2. The set I is a singleton.
Proof. Suppose that I contains two different measures µ, ν. Then there exists a measurable subset A such that µ(A) ≠ ν(A). µ and ν are lifted to R/∼ via $\bar{\mu}([A]) = \mu(\cup_{\eta \in A}[\eta])$. Using Proposition 3.3,
$$\bar{\mu}([A]) = \mu\big(\cup_{\eta \in A}[\eta]\big) = \mu\big((\cup_{\eta \in A}[\eta]) \cap \tilde{\Omega}\big) = \mu\big(\cup_{\eta \in A}\{\eta\}\big) = \mu(A). \tag{3.14}$$
Analogously $\bar{\nu}([A]) = \nu(A)$. Hence $\bar{\mu}$ and $\bar{\nu}$ are different. Because µ and ν are invariant under the action of the addition operators $a_x$, it follows that $\bar{\mu}$ and $\bar{\nu}$ are different and invariant under the group action. This contradicts the uniqueness of the Haar measure.

Proof of Proposition 3.3. Let the set $\tilde{\Omega}$ consist of recurrent configurations η that satisfy
1. For all x ∈ S, $a_x$ and $a_x^{-1}$ are well defined as limits of the corresponding finite volume operators, and $a_x a_x^{-1}\eta = a_x^{-1} a_x \eta = \eta$ (that is, $\eta \in \Omega'$).
2. For all finite volumes $V_0$, there is a volume Λ, $V_0 \subset \Lambda$, so that, whenever W is a finite set outside Λ, W ∩ Λ = ∅, and for all $n \in [-B, B]^S$,
$$\prod_{x \in W} a_x^{n_x}\eta(y) = \eta(y), \quad \text{for all } y \in V_0. \tag{3.15}$$
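That $\mu(\tilde{\Omega}) = 1$ follows from the same kind of arguments as for $\mu(\Omega') = 1$ in Lemma 3.1. Moreover, $\mu(a_x\tilde{\Omega}) = \mu(a_x^{-1}\tilde{\Omega}) = 1$ by invariance. Consider an arbitrary finite volume V and abbreviate $V_1 = V \cup \partial V$, $V_2 = V_1 \cup \partial V_1$. By the closure relation for the infinite volume addition operators, see (2.6), we have the identity
$$\prod_{x \in V_1} \prod_{y \in V_2} a_y^{\Delta_{xy} n_x} = \mathrm{id}.$$
This gives
$$\begin{aligned}
\prod_{y \in V} a_y^{(\Delta n)(y)}
&= \prod_{y \in V} \prod_{x \in V_1} a_y^{\Delta_{xy} n_x}
= \prod_{x \in V_1} \prod_{y \in V} a_y^{\Delta_{xy} n_x} \\
&= \Big(\prod_{x \in V_1} \prod_{y \in V_2} a_y^{\Delta_{xy} n_x}\Big) \Big(\prod_{x \in V_1} \prod_{y \in V_2 \setminus V} a_y^{-\Delta_{xy} n_x}\Big)
= \prod_{x \in V_1} \prod_{y \in V_2 \setminus V} a_y^{-\Delta_{xy} n_x}.
\end{aligned}$$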
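Therefore, from (3.15) it follows that for every $n \in [-B, B]^S$,
$$\lim_{p \uparrow \infty} \prod_{x \in V_p} a_x^{(\Delta n)(x)}(\eta) = \eta \tag{3.16}$$
along some sequence of increasing volumes. Therefore, if $\eta, \xi \in \tilde{\Omega}$ satisfy
$$\eta + \Delta n = \xi \tag{3.17}$$
then, using (3.16) and (3.17):
$$\eta = \lim_{p \to \infty} \prod_{x \in V_p} a_x^{(\Delta n)(x)}(\eta) = \xi,$$
which shows the desired property of the set $\tilde{\Omega}$.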
From now on we denote by µ the unique element of I as well as the Haar measure.

4. Infinite Volume Dynamics

From the previous sections we know that I contains a unique element µ and that addition operators as well as their inverses are well-defined on µ-typical configurations. This measure µ is the natural candidate for a stationary measure of a Markov process on infinite volume recurrent configurations. The construction of this Markov process is completely identical to what was done in [10]. We therefore state the results on existence and Poisson representation of this process without proofs, in the following section, and proceed in Sect. 4.2 to the proof of its ergodic properties, which was open in [10].

4.1. Infinite volume Markov process. For the unique µ ∈ I we can construct a stationary Markov process on µ-typical infinite volume configurations, as in [10]. We assume that the addition rate function ϕ introduced in (2.4) satisfies
$$\sup_{y \in S} \sum_{x \in S} \varphi(x)\, G(y, x) < \infty. \tag{4.1}$$
This condition ensures that the number of topplings at any site x ∈ S remains finite almost surely in any finite interval of time when grains are added at intensity ϕ. Notice that for dissipative systems, by (2.10), we can take the addition rate function constant. To each site x ∈ S we associate a Poisson process $N_\varphi^{t,x}$ (for different sites these Poisson processes are mutually independent) with rate ϕ(x). At the event times of $N_\varphi^{t,x}$ we "add a grain" at x, that is, we apply the addition operator $a_x$ to the configuration. For every finite volume V ∈ S, the natural extension of (2.4),
$$L_V^{\varphi} = \sum_{x \in V} \varphi(x)\,(a_x - I), \tag{4.2}$$
is the $L^p(\mu)$ generator of the stationary pure jump process on $\Omega'$ with semigroup
$$S_V^{\varphi}(t) f = \exp(t L_V^{\varphi}) f = \int \Big(\prod_{x \in V} a_x^{N_\varphi^{t,x}} f\Big)\, dP, \tag{4.3}$$
where P denotes the joint distribution of the independent Poisson processes $\{N_\varphi^{t,x}\}$, and $f \in L^p(\mu)$. The following theorems can be derived directly from the techniques developed in [10].

Theorem 4.1. If ϕ satisfies condition (4.1), then
1. The semigroups $S_V^{\varphi}(t)$ converge strongly in $L^1(\mu)$ to a semigroup $S_\varphi(t)$.
2. $S_\varphi(t)$ is the $L^1(\mu)$ semigroup of a stationary Markov process $\{\eta_t : t \geq 0\}$ on Ω.
3. For any f ∈ L,
$$\lim_{t \downarrow 0} \frac{S_\varphi(t) f - f}{t} = L^{\varphi} f = \sum_{x \in S} \varphi(x)\,[a_x f - f],$$
where the limit is taken in $L^1(\mu)$.
4. The process $\{\eta_t : t \geq 0\}$ admits a càdlàg version (right-continuous with left limits).

The intuitive description of the process $\{\eta_t : t \geq 0\}$ is correct under condition (4.1), that is, the process has a representation in terms of Poisson processes:

Theorem 4.2. Assume (4.1). For µ × P almost every (η, ω) the limit
$$\lim_{V \uparrow S} \prod_{x \in V} a_x^{N_\varphi^{t,x}(\omega)}\eta = \eta_t$$
exists. The process $\{\eta_t : t \geq 0\}$ is a version of the process of Theorem 4.1, that is, its $L^1(\mu)$ semigroup coincides with $S_\varphi(t)$.

To formulate the next theorem we need a partial order on configurations, functions, and probability measures on Ω. For η, ξ ∈ Ω, η ≤ ξ if η(x) ≤ ξ(x) for all x ∈ S. A function $f : \Omega \to \mathbb{R}$ is monotone if η ≤ ξ implies f(η) ≤ f(ξ), for all η, ξ ∈ Ω. For two probability measures ν, ν′ on Ω, ν ≤ ν′ if ν(f) ≤ ν′(f) for all monotone bounded Borel measurable functions f.

Theorem 4.3. Let ν ≤ µ. For ν × P almost every (η, ω) the limit
$$\lim_{V \uparrow S} \prod_{x \in V} a_x^{N_\varphi^{t,x}(\omega)}\eta = \eta_t$$
exists. The process $\{\eta_t : t \geq 0\}$ is Markovian with $\eta_0$ distributed according to ν.

Remark 4.1. Theorem 4.3 implies that η ≡ 1 can be taken as initial configuration.

4.2. An ergodic theorem. In the rest of this section, we assume for simplicity that the rate function ϕ ≡ 1 is constant, and we write S(t) (see Theorem 4.1), $N^{t,x}$, L and $L^0_V$ (see (2.4)) without subscript ϕ. We investigate the convergence of νS(t) to µ for a probability measure ν ≤ µ. Before we give the statement and its proof, observe that the role of the dissipativity parameter γ here is double. First, the approximation (and even the existence) of the infinite volume process by finite volume ones gets nicer and easier to prove when γ increases. It is essentially based on the dissipativity condition (2.10). On the other hand, in finite volume, the exponential relaxation to the stationary measure $\mu_V$ also depends on γ and, in fact, becomes slower for larger γ. This can be seen from ignoring (as would be reasonable for
very large γ and dimension d) the interaction with other sites: we then have essentially a one site dynamics by which at exponential times one grain is added to the site until the latter reaches a height γ + 1, after which it topples to height 1 and so on. The relaxation time of this dynamics being clearly proportional to γ, the convergence is slower for larger γ.

Theorem 4.4. Suppose ν is a probability measure on Ω such that ν ≤ µ. There is a constant $C_2 > 0$ so that for all f ∈ L, there exists $C_f < +\infty$ such that
$$\Big| \int S(t) f \, d\nu - \int f \, d\mu \Big| \leq C_f \exp(-C_2 t). \tag{4.4}$$
In particular, νS(t) converges weakly to µ, uniformly in ν ≤ µ and exponentially fast.

Proof. The idea is to approximate S(t) by finite volume semigroups, and to estimate the speed of convergence as a function of the volume. More precisely, we split
$$\Big| \int S(t) f \, d\nu - \int f \, d\mu \Big| \leq A_t^V(f) + B_t^V(f) + C^V(f) \tag{4.5}$$
with
$$\begin{aligned}
A_t^V(f) &= \Big| \int S(t) f \, d\nu - \int S_V(t) f \, d\nu_V \Big|, \\
B_t^V(f) &= \Big| \int S_V(t) f \, d\nu_V - \int f \, d\mu_V \Big|, \\
C^V(f) &= \Big| \int f \, d\mu_V - \int f \, d\mu \Big|,
\end{aligned}$$
where $\nu_V$ is the restriction of ν to V and
$$S_V(t) f(\eta) = \int f\Big(\prod_{x \in V} a_{x,V}^{N^{t,x}}\eta\Big)\, dP.$$
By Theorem 3.2,
$$\lim_{V \uparrow S} C^V(f) = 0. \tag{4.6}$$
For the first term in the right-hand side of (4.5) we write
$$A_t^V(f) = \Big| \int\!\!\int \Big[ f\Big(\prod_{x \in S} a_x^{N^{t,x}}\eta\Big) - f\Big(\prod_{x \in V} a_{x,V}^{N^{t,x}}\eta\Big) \Big]\, dP\, d\nu \Big|. \tag{4.7}$$
The integrand of the right-hand side is zero if no avalanche from $V^c$ has influenced sites of $D_f$ during the interval [0, t]; otherwise it is bounded by $2\|f\|_\infty$. Therefore, since the $N^{t,x}$ are rate one Poisson processes:
$$A_t^V(f) \leq \kappa\, \|f\|_\infty\, t \sum_{y \in D_f} \sum_{x \in V^c} G(x, y) \tag{4.8}$$
for some constant κ. Therefore that first term can be controlled by the dissipativity condition (2.10). The second term in the right-hand side of (4.5) is estimated by the relaxation to equilibrium of the finite volume dynamics. The generator $L^0_V$ has the eigenvalues
$$\sigma(L^0_V) = \Big\{ \sum_{x \in V} \Big[ \exp\Big( 2\pi i \sum_{y \in V} G_V(x, y)\, n_y \Big) - 1 \Big] : n \in \mathbb{Z}^V / \Delta^V \mathbb{Z}^V \Big\}. \tag{4.9}$$
The eigenvalue 0 corresponding to the stationary state arises from the choice n = 0. For the speed of relaxation to equilibrium we are interested in the minimum absolute value of the real part of the non-zero eigenvalues. More precisely: $B_t^V(f) \leq C_f \exp(-\lambda_V t)$, where
$$\lambda_V = \inf\big\{ |\mathrm{Re}(\lambda)| : \lambda \in \sigma(L^0_V) \setminus \{0\} \big\} = 2 \inf\Big\{ \sum_{x \in V} \sin^2\Big( \pi \sum_{y \in V} G_V(x, y)\, n_y \Big) : n \in \mathbb{Z}^V / \Delta^V \mathbb{Z}^V,\ n \neq 0 \Big\}$$
by (4.9). Since there is a constant c so that for all real numbers r
$$\sin^2(\pi r) \geq c\, \big( \min\{ |r - k| : k \in \mathbb{Z} \} \big)^2,$$
we get
$$\sum_{x \in V} \sin^2\big( \pi ((\Delta^V)^{-1} n)_x \big) \geq c \inf\big\{ \|(\Delta^V)^{-1} n - k\|^2 : n \in \mathbb{Z}^V / \Delta^V \mathbb{Z}^V,\ n \neq 0,\ k \in \mathbb{Z}^V \big\}, \tag{4.10}$$
where $\|\cdot\|$ represents the Euclidean norm in $\mathbb{Z}^V$, which we estimate by
$$\|(\Delta^V)^{-1} n - k\|^2 = \|(\Delta^V)^{-1}(n - \Delta^V k)\|^2 \geq \|\Delta^V\|^{-2}.$$
For any regular volume we have
$$\|\Delta^V\| \leq \sqrt{2\gamma^2 + 16 d^2}.$$
This gives
$$B_t^V(f) \leq C_f \exp(-Ct), \tag{4.11}$$
where C > 0 is independent of V. The statement of the theorem now follows by combining (4.6), (4.8), (4.11).
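The passage from (4.9) to the expression for $\lambda_V$ rests on an elementary identity, written out here for convenience. The notation $r_x$ and the explicit constant c = 4 are additions for illustration; the text only asserts that some constant c exists.

```latex
% Write r_x = \sum_{y \in V} G_V(x,y)\, n_y for the phase attached to site x.
\[
  \mathrm{Re}\bigl(e^{2\pi i r_x} - 1\bigr) = \cos(2\pi r_x) - 1 = -2\sin^2(\pi r_x),
  \qquad\text{so}\qquad
  \Bigl|\mathrm{Re}\sum_{x \in V}\bigl(e^{2\pi i r_x} - 1\bigr)\Bigr|
  = 2\sum_{x \in V}\sin^2(\pi r_x).
\]
% Lower bound on sin^2: by concavity of the sine on [0, \pi/2],
% |\sin(\pi u)| \ge 2|u| for |u| \le 1/2. Taking u = r - k with k the
% integer nearest to r gives
\[
  \sin^2(\pi r) \ge 4\,\bigl(\min\{|r - k| : k \in \mathbb{Z}\}\bigr)^2 ,
\]
% so c = 4 is one admissible choice of the constant.
```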
Remark 4.2. When we restrict ourselves to the case where
$$\sum_{x \in S} \varphi(x) = M < \infty, \tag{4.12}$$
$L^{\varphi}$ becomes a bounded operator, hence it generates a pure jump process which is a continuous time random walk on the group (R/∼, ⊕). By the ergodic properties of random walks on compact groups we then obtain that
$$\lim_{t \to \infty} \nu S_\varphi(t) = \mu$$
for every measure ν on R/∼ (see Theorems 2.5.14, 2.6.2 and Corollary 2.6.4 in [7] for details).

4.3. Mixing property. To the stationary process defined in Theorem 4.1, we associate the process on R/∼ by putting
$$[\eta]_t = [\eta_t]. \tag{4.13}$$
For that, it is important to notice that the equivalence of recurrent configurations is preserved in time (by Theorem 4.2, and points 1, 2 of Remark 3.1): when η ∼ ξ, then $\eta_t \sim \xi_t$ with P-probability one. For the following theorem, we abbreviate without consequences $[\eta]_t = \eta_t$.

Theorem 4.5. The process $\{\eta_t : t \geq 0\}$ is mixing, that is, for all f, g ∈ L:
$$\lim_{t \to \infty} \int (S(t) f)\, g \, d\mu = \int f \, d\mu \int g \, d\mu. \tag{4.14}$$

Proof. Since the semigroup is a normal operator on $L^2(\mu)$, ergodicity of the process implies the mixing property by [13]. It thus suffices to prove that a bounded non-negative function f such that $\int f \, d\mu > 0$ and Lf = 0 satisfies $f = \int f \, d\mu$ µ-a.s. By the invariance of µ under $a_x$,
$$0 = -2 \int (f L f)\, d\mu = \sum_{x \in S} \int (a_x f - f)^2 \, d\mu,$$
which implies $a_x f = f$ for all $a_x$, µ-a.s. Hence, the measure
$$d\nu_f = \frac{f \, d\mu}{\int f \, d\mu}$$
is invariant under the action of $a_x$, thus under the group action. By uniqueness of the Haar measure, we conclude $\nu_f = \mu$.

5. Decay of Correlations

In this section we prove that the infinite volume measure µ has exponential decay of correlations under a condition of "strong dissipativity". That means for the model (2.1) with $S = \mathbb{Z}^d$ that γ must be sufficiently large, e.g. γ > 13 for d = 2; for the strips $S = \mathbb{Z} \times \{1, \ldots, \ell\}$ with finite ℓ it always suffices that γ > 3. In [11] the exponential decay between very special local observables (indicators of so-called weakly allowed clusters) is also obtained in the case where the Green function decays exponentially. However, the technique developed in that paper does not apply to all local functions.
5.1. Decoupling argument. We start with the heuristics of the main ingredient in the proof of exponential decay of correlations. The rest is based on quite general stochastic-geometric methods that are reviewed in [6]. To be specific, suppose that $S = \mathbb{Z}^2$ and γ > 4 (in (2.1)). Then, for every volume V ∈ S,
$$\mu_V(\eta(x) = a \mid \eta(z) = c) = \mu_{V \setminus z}(\eta(x) = a) \tag{5.1}$$
for all a, c ∈ {1, . . . , γ}, c > 4, x ≠ z in V, because, by the burning algorithm, we can burn away the sites on which we know that the configuration is sufficiently large. Instead of fixing in (5.1) the height value at one site z, we could do the same thing on some region C ⊂ V that does not contain x, see Lemma 5.1. On the other hand, if sites x and y are not very close to each other, we can find volumes $\Lambda_x, \Lambda_y \subset V$ that contain x, respectively y, and that also do not touch (more precisely, that satisfy $(\Lambda_x \cup \partial\Lambda_x) \cap \Lambda_y = \emptyset$). Then, see Lemma 5.2,
$$\mu_{\Lambda_x \cup \Lambda_y}(\eta(x) = a,\ \eta(y) = b) = \mu_{\Lambda_x}(\eta(x) = a)\, \mu_{\Lambda_y}(\eta(y) = b). \tag{5.2}$$
The combination of (5.1) with (5.2) yields conditional independence of two events that are separated by a region C where the configuration is sufficiently high, see Lemma 5.3.

Definition 5.1. Let V ∈ S, C ⊂ V, $\sigma \in \Omega_V$. The subconfiguration $\sigma_C$ is V-burnable if there exists a bijection f : {1, . . . , n} → C such that $N_V(f(1)) < \sigma(f(1))$, and for every j = 1, . . . , n − 1, $N_{V \setminus \{f(1), \ldots, f(j)\}}(f(j+1)) < \sigma(f(j+1))$.

As an example, on $\mathbb{Z}^2$ with maximal height $\gamma = \Delta_{xx} = 5$, every closed curve along which the heights are at least 4 and containing at least one point with height 5 is burnable.

Lemma 5.1. Let V = Λ ∪ C ∈ S, Λ ∩ C = ∅ and fix an arbitrary configuration $\sigma \in \Omega_V$ so that $\sigma_C$ is V-burnable. Put $E_C = \{\eta \in \Omega_V : \eta_C = \sigma_C\}$. Then, for all events A that depend only on the configuration in Λ (that is, $A \in F_\Lambda$),
$$\mu_V(A \mid E_C) = \mu_\Lambda(A). \tag{5.3}$$

Proof. By the burning algorithm, $\eta \in R_V \cap E_C$ if and only if $\eta_\Lambda \in R_\Lambda$ and $\eta_C = \sigma_C$. Therefore,
$$\mu_V(A \mid E_C) = \frac{\sum_{\eta \in R_V} I(\eta \in A)\, I(\eta \in E_C)}{\sum_{\eta \in R_V} I(\eta \in E_C)} = \frac{\sum_{\eta_\Lambda \in R_\Lambda} \sum_{\eta_C \in R_C} I(\eta_\Lambda \in A)\, I(\eta \in E_C)}{\sum_{\eta_C \in R_C} I(\eta \in E_C)\, |R_\Lambda|} = \mu_\Lambda(A).$$
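The decoupling identity (5.3) can be checked by enumeration on a small volume. The sketch below rests on illustrative assumptions that are not in the text: V is a path of five sites with γ = 3, C consists of the middle site with height 3 (which is V-burnable, since the site has two neighbors), and Λ is the remaining four sites. It verifies that the recurrent configurations of V with the prescribed height on C are exactly the recurrent configurations of Λ, so the conditional measure is uniform on the recurrent class of Λ.

```python
from itertools import product

GAMMA = 3  # illustrative maximal height
V = (0, 1, 2, 3, 4)  # hypothetical path 0 ~ 1 ~ 2 ~ 3 ~ 4
C = (2,)  # separating region; height 3 there exceeds its 2 neighbors
LAMBDA = (0, 1, 3, 4)  # the rest of the volume
NEIGHBORS = {x: tuple(y for y in (x - 1, x + 1) if y in V) for x in V}


def is_recurrent(eta):
    """Burning algorithm on the sites of eta; neighbors outside eta are ignored."""
    remaining = set(eta)
    while remaining:
        burnable = {x for x in remaining
                    if eta[x] > sum(1 for y in NEIGHBORS[x] if y in remaining)}
        if not burnable:
            return False
        remaining -= burnable
    return True


def recurrent_set(sites):
    """All recurrent height tuples on `sites`, in the order of `sites`."""
    return {
        heights
        for heights in product(range(1, GAMMA + 1), repeat=len(sites))
        if is_recurrent(dict(zip(sites, heights)))
    }


R_V = recurrent_set(V)
R_LAMBDA = recurrent_set(LAMBDA)

# Restrict to Lambda the recurrent configurations of V with height 3 on C.
CONDITIONED = {
    tuple(h[x] for x in LAMBDA)
    for h in R_V
    if all(h[z] == GAMMA for z in C)
}
```

Since each element of `CONDITIONED` comes from exactly one configuration of V with the fixed height on C, equality of the two sets means that the uniform measure on the recurrent class of V, conditioned on the height on C, restricts to the uniform measure on the recurrent class of Λ, in line with (5.3).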
Remark 5.1. We do not need to condition on one fixed burnable configuration. The lemma and its proof above remain unchanged when taking $E_C = \{\eta \in \Omega_V : \eta_C \text{ is } V\text{-burnable}\}$, the event that we can burn away the sites of C first.

Lemma 5.2. Let $\Lambda_1, \Lambda_2 \in S$ with
$$(\Lambda_1 \cup \partial\Lambda_1) \cap \Lambda_2 = \emptyset. \tag{5.4}$$
For $A \in F_{\Lambda_1}$, $B \in F_{\Lambda_2}$,
$$\mu_{\Lambda_1 \cup \Lambda_2}(A \cap B) = \mu_{\Lambda_1 \cup \Lambda_2}(A)\, \mu_{\Lambda_1 \cup \Lambda_2}(B).$$

Proof. We have $\eta \in R_{\Lambda_1 \cup \Lambda_2}$ if and only if $\eta_{\Lambda_1} \in R_{\Lambda_1}$ and $\eta_{\Lambda_2} \in R_{\Lambda_2}$. The rest is writing out expectations as in the proof of Lemma 5.1.

We now state the conditional independence.

Lemma 5.3. For V ∈ S and C ⊂ V, suppose that $V \setminus C = \Lambda_1 \cup \Lambda_2$ with $\Lambda_1, \Lambda_2$ satisfying (5.4). Then, for all $A \in F_{\Lambda_1}$, $B \in F_{\Lambda_2}$,
$$\mu_V(A \cap B \mid E_C) = \mu_V(A \mid E_C)\, \mu_V(B \mid E_C).$$

Proof. By Lemma 5.1, $\mu_V(A \cap B \mid E_C) = \mu_{V \setminus C}(A \cap B)$, and continuing via Lemma 5.2
$$\mu_V(A \cap B \mid E_C) = \mu_{\Lambda_1 \cup \Lambda_2}(A)\, \mu_{\Lambda_1 \cup \Lambda_2}(B). \tag{5.5}$$
The proof is finished by applying again Lemma 5.1 to the two factors in the right-hand side of (5.5).

The conditional independence (5.5) is reminiscent of the situation for Markov random fields. Here $\mu_V$ is not Markovian but nevertheless for all $A \in F_\Lambda$, Λ ⊂ V, the conditional probability of A given the configuration in V \ Λ is
$$\mu_V(A \mid \eta_{V \setminus \Lambda}) = \mu_\Lambda(A) \tag{5.6}$$
whenever $\eta_{\partial\Lambda \cap V}$ is V-burnable. In particular, this conditional probability (5.6) is then independent of the particular $\eta_{V \setminus \Lambda}$.
5.2. Geometric argument. From the previous decoupling argument, it is clear how to proceed for the proof of decay of correlations. What needs to be established is that there will be typically some "circuit" C, separating two far away dependence sets, where the configuration is burnable. We thus basically end up with a stochastic-geometric or percolation-like argument as also reviewed in [6]. The first thing to see is that burnability is sufficiently probable. We do that first for the strip in Lemma 5.4 and then for the full lattice in Lemma 5.5.

Lemma 5.4. Let $V = \{(x, y) \in \mathbb{Z}^2 : |x| \leq k,\ y = 1, \ldots, \ell\}$ and γ ≥ 4 in (2.1). Fix some $x_1$, $|x_1| \leq k$, and let $C = \{(x_1, y) \in V : y = 1, \ldots, \ell\}$. There is p = p(ℓ, γ) > 0 (uniformly in k) such that for all events $E(x_1)$ that only depend on the heights η(x, y) with (x, y) ∉ C,
$$\mu_V\big(\eta(x, y) \geq 4 \text{ for all } (x, y) \in C \mid E(x_1)\big) \geq p > 0. \tag{5.7}$$

Proof. Via Bayes' rule,
$$\mu_V\big(\eta(x, y) \geq 4 \text{ for all } (x, y) \in C \mid E(x_1)\big) = \mu_V\big(E(x_1) \mid \eta(x, y) \geq 4 \text{ for all } (x, y) \in C\big)\, \frac{\mu_V\big(\eta(x, y) \geq 4 \text{ for all } (x, y) \in C\big)}{\mu_V(E(x_1))}. \tag{5.8}$$
If η(x, y) ≥ 4 for the points (x, y) ∈ C, then $\eta_C$ is V-burnable, and by Lemma 5.1
$$\mu_V\big(E(x_1) \mid \eta(x_1, y_1) \geq 4 \text{ whenever } |y_1| \leq \ell\big) = \mu_{V \setminus C}(E(x_1)).$$
On the other hand, by counting,
$$\frac{\mu_{V \setminus C}(E(x_1))}{\mu_V(E(x_1))} \geq \frac{|R_V|}{|R_{V \setminus C}|\, |R_C|}$$
and
$$\mu_V\big(\eta(x, y) \geq 4 \text{ for all } (x, y) \in C\big) = (\gamma - 4 + 1)^{\ell}\, \frac{|R_{V \setminus C}|}{|R_V|}.$$
As a consequence we can take 0
(γ − 4 + 1) γ −3 = . |RC | γ −3+
Remark 5.2. Obviously, p ↓ 0 as ↑ ∞. For the regular lattice S = Zd we have: Lemma 5.5. Consider the model (2.1) with S = Zd . For all ε > 0, there is a γ0 < +∞ so that for all V ∈ S, all x ∈ V and all events E ∈ FV \x , µV (η(x) > 2d|E) > 1 − ε whenever γ ≥ γ0 .
Proof. We can repeat the steps of the proof in Lemma 5.4. At the end we must estimate the number of burnable heights at x divided by the number of configurations at x. That is,
$$\mu_V(\eta(x) > 2d \mid E) > \frac{\gamma - 2d}{\gamma}.$$
It thus suffices that 2d < γε.
We need one more lemma before giving the geometric argument, because the latter requires stochastic domination by Bernoulli measure.

Lemma 5.6. The invariant probability measure $\mu_V$ for the sandpile dynamics in V is irreducible, that is, for two given recurrent configurations η, η′, there is a sequence $\eta_0 = \eta, \ldots, \eta_m = \eta'$ of recurrent configurations such that $\eta_i$ and $\eta_{i+1}$ differ only in one site.

Proof. Since η′ can be reached from η by a finite number of additions, it is enough to show that for any x ∈ V, there is such a sequence from η to $a_x\eta$. Let
$$\Lambda_x^+(\eta) = \{y \in V;\ (a_x\eta)(y) > \eta(y)\}, \qquad \Lambda_x^-(\eta) = \{y \in V;\ (a_x\eta)(y) < \eta(y)\}.$$
We first add sand grains one by one on the sites $y \in \Lambda_x^+(\eta)$, to reach the value $(a_x\eta)(y)$. Each step leads to a configuration larger than η, thus recurrent. We denote by $\eta^+ = \eta_0^+$ the recurrent configuration $(a_x\eta)_{\Lambda_x^+(\eta)}\, \eta_{[\Lambda_x^+(\eta)]^c}$. If $\Lambda_x^-(\eta) = \emptyset$ we are finished. If not, we write $\Lambda_x^-(\eta) = \{z_1, \ldots, z_n\}$ and we pass from $\eta_k^+ = (a_x\eta)_{\{z_1, \ldots, z_k\}}\, \eta^+_{\{z_1, \ldots, z_k\}^c}$ to $\eta_{k+1}^+$ for every 1 ≤ k ≤ n ($\eta_k^+ \geq a_x\eta$ is recurrent and differs from $\eta_{k-1}^+$ in one site) to reach $\eta_n^+ = a_x\eta$.

We are now in a position to give the main stochastic-geometric argument leading to exponential decay of correlations. It copies the proof that complete analyticity for Markovian random fields follows from absence of disagreement percolation, as done in [2], see Theorem 7.1 in [6], except that we can replace the Markov property by the decoupling property (5.6). Let $p_c(d)$ denote the percolation threshold for Bernoulli site percolation on $\mathbb{Z}^d$. Let V ∈ S and let Λ ⊂ V on which we fix two arbitrary height configurations $\eta_\Lambda$ and $\eta'_\Lambda$ to consider two conditional probabilities $\mu_1 = \mu_V(\cdot \mid \eta_\Lambda)$ and $\mu_2 = \mu_V(\cdot \mid \eta'_\Lambda)$.

Theorem 5.1. Suppose that γ ≥ 4 for $S = \mathbb{Z} \times \{1, \ldots, \ell\}$ or $4d < \gamma\, p_c(d)$ for $S = \mathbb{Z}^d$ in (2.1). There exist constants α > 0, C < +∞ so that for all V ∈ S, Λ ⊂ V, W ⊂ V \ Λ, $\eta_\Lambda, \eta'_\Lambda \in R_\Lambda$ and for every event $A \in F_W$,
$$|\mu_1(A) - \mu_2(A)| \leq C e^{-\alpha\, \mathrm{dist}(W, \Lambda)}, \tag{5.9}$$
where dist(·, ·) is the nearest neighbor distance between the two subsets.
Proof. We give the proof for the lattice $S = \mathbb{Z}^d$. The case of the strip is analogous but a little simpler (using Lemma 5.4). We use a coupling argument. First, we introduce some linear ordering on $V \setminus \Lambda$. We construct via iteration a coupling between $\mu_1$ and $\mu_2$ which is a random field $(X, X')$. We start by setting $X(x) = \eta(x)$, $X'(x) = \eta'(x)$ on $\Lambda$. Suppose that we have already realized the coupling as $(X, X') = (\eta, \eta')$ on all sites outside some non-empty set $T \subset V \setminus \Lambda$. Consider then the conditional distributions $\mu_V(\cdot\,|\,\eta_{V\setminus T})$ and $\mu_V(\cdot\,|\,\eta'_{V\setminus T})$. One possibility is that on the external boundary both $\eta_{\partial T}$ and $\eta'_{\partial T}$ are $V$-burnable. But then, via (5.6), these two conditional probabilities are equal on $T$ and we can take the optimal coupling for which $X = X'$ on $T$. Alternatively, we choose the smallest site $x \in T$ having a nearest neighbor $y \in V \setminus T$ for which $X(y) \le 2d$ or $X'(y) \le 2d$, and we find the value $(\eta(x), \eta'(x))$ for the coupling at $x$ from sampling the optimal coupling between the single site distributions $\mu_V(X(x) = \cdot\,|\,\eta_{V\setminus T})$ and $\mu_V(X'(x) = \cdot\,|\,\eta'_{V\setminus T})$. At this step the coupling is defined outside $T \setminus x$ and we can repeat the iteration, giving us a coupling between $\mu_1$ and $\mu_2$. From the above construction, it is possible that in the coupling $X(x) \ne X'(x)$ at some $x \in W$ only if there is a nearest neighbor path from $x$ to $\Lambda$ on which $X(y) \le 2d$ or $X'(y) \le 2d$. On the other hand, no matter what we fix off $y$,
$$P\big(X(y) \le 2d \text{ or } X'(y) \le 2d \,\big|\, \eta(z), \eta'(z), z \in V \setminus y\big) \le 2\big(1 - \mu_V(\eta(y) > 2d \,|\, \eta(z), z \ne y)\big). \qquad (5.10)$$
For $\gamma$ large enough (from Lemma 5.5), this is bounded by $p_c(d)$. The proof is then concluded via an application of stochastic domination with the Bernoulli product measure (thanks to Lemma 5.6, see Theorem 4.8 in [6]) and using that the cluster-diameter in sub-critical Bernoulli site percolation has an exponential tail.

Examples.
1. The dissipative system in dimension 2: we have $p_c(2) = 0.5927$ (as a numerical result). Thus we need to take $\gamma > 13$ so that $8 < \gamma p_c(2)$.
2. The dissipative system in high dimension. Since $p_c(d) \simeq 1/(2d)$ for large $d$, we conclude exponential decay of correlations as soon as $\gamma > 8d^2$.

References
1. Bak, P., Tang, K., Wiesenfeld, K.: Self-Organized Criticality. Phys. Rev. A 38, 364–374 (1988)
2. van den Berg, J., Maes, C.: Disagreement percolation in the study of Markov fields. Ann. Probab. 25, 1316–1333 (1994)
3. Daerden, F., Vanderzande, C.: Dissipative abelian sandpiles and random walks. Phys. Rev. E 63, 030301: 1–4 (2001)
4. Dhar, D.: Self Organised Critical State of Sandpile Automaton Models. Phys. Rev. Lett. 64(14), 1613–1616 (1990)
5. Dhar, D.: The Abelian Sandpiles and Related Models. Physica A 263, 4–25 (1999)
6. Georgii, H.-O., Häggström, O., Maes, C.: The random geometry of equilibrium phases. Phase Transitions and Critical Phenomena, Vol. 18, C. Domb and J.L. Lebowitz (eds.), London: Academic Press, 2001, pp. 1–142
7. Heyer, H.: Probability measures on locally compact groups. Berlin-Heidelberg-New York: Springer, 1977
8. Ivashkevich, E.V., Priezzhev, V.B.: Introduction to the sandpile model. Physica A 254, 97–116 (1998)
9. Maes, C., Redig, F., Saada, E., Van Moffaert, A.: On the thermodynamic limit for a one-dimensional sandpile process. Markov Proc. Rel. Fields 6, 1–22 (2000)
10. Maes, C., Redig, F., Saada, E.: The abelian sandpile model on an infinite tree. Ann. Probab. 30(4), 1–27 (2002)
11. Mahieu, S., Ruelle, P.: Scaling fields in the two-dimensional abelian sandpile model. Phys. Rev. E 64, 066130–(1–19) (2001)
12. Meester, R., Redig, F., Znamenski, D.: The abelian sandpile; a mathematical introduction. Markov Proc. Rel. Fields 7, 509–523 (2002)
13. Rosenblatt, M.: Transition probability operators. Proc. Fifth Berkeley Symposium. Math. Statist. Prob. 2, 473–483 (1967)
14. Speer, E.: Asymmetric Abelian Sandpile Models. J. Stat. Phys. 71, 61–74 (1993)
15. Tsuchiya, V.T., Katori, M.: Phys. Rev. E 61, 1183 (2000)
16. Turcotte, D.L.: Self-Organized Criticality. Rep. Prog. Phys. 62, 1377–1429 (1999)

Communicated by H. Spohn
Commun. Math. Phys. 244, 419–453 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-1010-6
Communications in
Mathematical Physics
Exponential Equations Related to the Quantum ‘ax + b’ Group
Małgorzata Rowicka 1,2
1 Department of Mathematical Methods in Physics, University of Warsaw, Warsaw, Poland
2 University of Texas Southwestern, 5323 Harry Hines Blvd, Dallas, TX 75390-9038, USA. E-mail: [email protected]
Received: 15 January 2001 / Accepted: 7 October 2003 Published online: 16 December 2003 – © Springer-Verlag 2003
Abstract: We study pairs $b, \beta$ of unbounded selfadjoint operators, satisfying commutation rules inspired by the quantum ‘ax + b’ group [19]: $b\beta = -\beta b$ and $\beta^2 = \mathrm{id}$ except on $\ker b$, on which $\beta^2 = 0$. We find all measurable, unitary-operator valued functions $F$ satisfying the exponential equation $F(b, \beta)F(d, \delta) = F((b, \beta) \odot (d, \delta))$, where $d, \delta$ satisfy the same commutation rules as $b, \beta$, and $\odot$ is modeled after the comultiplication of the quantum ‘ax + b’ group. This result is crucial for classification of all unitary representations of the quantum ‘ax + b’ group, which is achieved in our forthcoming paper [12].

1. Introduction

To explain the title of this article let us use an analogy with the classical group theory. Finding all representations of a given group $G$ amounts to finding all functions $F$ defined on $G$, satisfying for every $x, y \in G$ the following exponential equation: $F(x + y) = F(x)F(y)$.
(1)
Usually, we restrict ourselves only to unitary representations satisfying certain continuity conditions. For example, let us consider $G = \mathbb{R}$. Then, all continuous, $S^1$-valued functions $F$ satisfying (1) are given by the formula
$$F_a(x) = e^{iax}, \qquad (2)$$
where $a \in \hat{\mathbb{R}}$. They coincide with all 1-dimensional, continuous unitary representations of the group $\mathbb{R}$. Note that these representations are numbered by real numbers $a$. Let us now take a step further and consider all strongly continuous unitary representations of $\mathbb{R}$ acting on a Hilbert space $H$. Then the function $F$ in (1) is operator-valued. One can also
Supported by KBN grant No 2 PO3 A 03618
study Eq. (1) in a more general setting: x and y can be operators acting on the Hilbert space and satisfying certain conditions. Unbounded selfadjoint operators, related to the quantum ‘ax + b’ group, will be considered in this paper. By Stone’s theorem, if F is a strongly continuous function satisfying (1), with values in unitary operators, then there is a uniquely determined selfadjoint operator a in H such that F is given by (2). Note that now representations Fa are numbered not by real numbers, but by unbounded, selfadjoint operators. In this paper we find all solutions of the general exponential equation related to the (extended) quantum ‘ax + b’ group. The classical ‘ax + b’ group is the group of the affine transformations preserving orientation of the real line (i.e. a > 0). Its quantum deformation has been constructed in [19]. This group is sometimes called extended, because an additional generator β has been added. This extra generator is necessary to construct a quantum deformation of the ‘ax + b’ group in the framework of C ∗ -algebras and unbounded operators. In the remaining part of this section we present the notation and notions used in this paper, some of which are non-standard. Most important are the notions of an operator domain and of an operator function. In this language, we describe the well-known quantum SUq (2) group and the quantum ‘ax + b’ group (as constructed by Woronowicz and Zakrzewski). We introduce two new operator domains, which will play a prominent role in this paper. They are both related to the quantum ‘ax + b’ group. One of them is non-commutative, it will be called M. The knowledge of unitary representations of M enables us to classify unitary representations of the quantum group ‘ax + b’ [12]. However, this operator domain is non-commutative, and as such difficult to deal with. Therefore, we start by considering a similar, but commutative, operator domain N . In Sect. 
2 we discuss this operator domain, introduce a binary operation on it and finally find all its unitary representations. Then, in Sect. 3, we reintroduce the operator domain $M$ and also equip it with a binary operation. Using an operator map from $N$ to $M$ and the results from the previous section, we can obtain a description of all unitary representations of $M$. This is achieved in Theorem 3.4, which is the main result of this paper. We have chosen to be precise when formulating and stating propositions and theorems. However, we are sometimes very loose when giving comments, especially about the “quantum” nature of considered objects.

1.1. Notation. The symbols $H$ and $K$ will denote Hilbert spaces. We study only separable Hilbert spaces, usually infinite dimensional. Most linear operators encountered in this paper are unbounded. All operators considered are densely defined. The set of all closed operators acting on $H$ is denoted by $C(H)$, and the set of all unitary ones by $\mathrm{Unit}(H)$. The scalar product will be denoted by $\langle\cdot|\cdot\rangle$. In our notation, the scalar product is antilinear in the first variable. The functional calculus of selfadjoint operators [9, 10, 13] is an important technique employed here. The symbol “sign $T$” will denote the partial isometry obtained from the polar decomposition of a selfadjoint operator $T$. We use a non-standard, but very convenient notation [18] for orthogonal projections and their images, as explained below.

Let $a$ and $b$ be strongly commuting selfadjoint operators acting on a Hilbert space $H$. Then, by the spectral theorem, there exists a common spectral measure $dE(\lambda, \mu)$ such that
$$a = \int_{\mathbb{R}^2} \lambda\, dE(\lambda, \mu), \qquad b = \int_{\mathbb{R}^2} \mu\, dE(\lambda, \mu).$$
Then, for every complex measurable function $f$ of these two variables:
$$f(a, b) = \int_{\mathbb{R}^2} f(\lambda, \lambda')\, dE(\lambda, \lambda').$$
Let $f$ be a logical sentence and let $\chi(f)$ be 0 if $f$ is false, and 1 otherwise. If $R$ is a binary relation on $\mathbb{R}$ then $f(\lambda, \lambda') = \chi(R(\lambda, \lambda'))$ is the characteristic function of the set $\Delta = \{(\lambda, \lambda') \in \mathbb{R}^2 : R(\lambda, \lambda') \text{ is true}\}$. Assuming that $\Delta$ is measurable, $f(a, b) = E(\Delta)$. From now on we will write $\chi(R(a, b))$ instead of $f(a, b)$:
$$\chi(R(a, b)) = \int_{\mathbb{R}^2} \chi(R(\lambda, \lambda'))\, dE(\lambda, \lambda') = E(\Delta).$$
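As a finite-dimensional illustration of this notation (a sketch, not part of the original text): if $a$ and $b$ are commuting diagonal matrices with eigenvalue pairs $(\lambda_j, \mu_j)$, then $\chi(R(a, b))$ is the diagonal 0/1 matrix that selects those $j$ for which $R(\lambda_j, \mu_j)$ holds. The helper name `chi` and the sample spectra below are assumptions made only for this example.

```python
def chi(relation, spec_a, spec_b):
    """Diagonal of the projection chi(R(a, b)) for commuting diagonal a, b.

    spec_a[j] and spec_b[j] are the eigenvalues of a and b on the j-th
    common eigenvector; the entry is 1 where the relation holds, else 0.
    """
    return [1 if relation(lam, mu) else 0 for lam, mu in zip(spec_a, spec_b)]


# Hypothetical joint spectrum of two commuting selfadjoint matrices.
spec_a = [2.0, -1.0, 0.0, 1.0]
spec_b = [1.0, 3.0, 0.0, -4.0]

p_gt = chi(lambda lam, mu: lam > mu, spec_a, spec_b)      # chi(a > b)
p_neg = chi(lambda lam, mu: mu < 0, spec_a, spec_b)       # chi(b < 0)
p_one = chi(lambda lam, mu: lam == 1.0, spec_a, spec_b)   # chi(a = 1)

# Each result is a projection: its diagonal entries are idempotent.
assert all(e * e == e for e in p_gt + p_neg + p_one)
print(p_gt, p_neg, p_one)
```

In this toy setting the spectral subspace $H(a > b)$ is simply the span of the basis vectors whose entry in `p_gt` equals 1.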
The image of $\chi(R(a, b))$ will be denoted by $H(R(a, b))$, where $H$ is the Hilbert space on which the operators $a$ and $b$ act. Thus we have defined symbols $\chi(a > b)$, $\chi(a^2 + b^2 = 1)$, $\chi(a = 1)$, $\chi(b < 0)$, $\chi(a \ne 0)$, etc. They are orthogonal projections on appropriate spectral subspaces. For example, $H(a = 1)$ is an eigenspace of the operator $a$, corresponding to the eigenvalue 1, and $\chi(a = 1)$ is an orthogonal projector onto this eigenspace. Generally, whenever $\Delta$ is a measurable subset of $\mathbb{R}$, then $H(a \in \Delta)$ is a spectral subspace of $a$ corresponding to $\Delta$, and $\chi(a \in \Delta)$ is its spectral projection.

1.2. Zakrzewski relation. Below we define the Zakrzewski relation [18], which depends on a parameter $\hbar$, such that $-\pi < \hbar < \pi$.

Definition 1.1. Let $R$ and $S$ be selfadjoint operators acting on a Hilbert space $H$. The operators $R$ and $S$ are in the Zakrzewski relation $R \multimap S$ if
1. (sign $R$) commutes with $S$ and (sign $S$) commutes with $R$.
2. On the subspace $(\ker R)^\perp \cap (\ker S)^\perp$ we have $|R|^{il}|S|^{ik} = e^{i\hbar lk}|S|^{ik}|R|^{il}$ for any $l, k \in \mathbb{R}$.
Note that the Zakrzewski relation is not symmetric: from $R \multimap S$ it follows that $S \multimap R^{-1}$ [18], not that $S \multimap R$. This is why we are using the asymmetric symbol $\multimap$ to denote this relation. Now we will give an example of operators satisfying the Zakrzewski relation. Let us set $H = L^2(\mathbb{R})$ and let $\hat q$ and $\hat p$ denote the position and the momentum operators in the Schrödinger representation. Then the domain of $\hat q$ is
$$D(\hat q) = \Big\{\psi \in L^2(\mathbb{R}) : \int_{\mathbb{R}} x^2 |\psi(x)|^2\, dx < \infty \Big\}$$
and $\hat q$ acts on $D(\hat q)$ as multiplication by the coordinate: $(\hat q\psi)(x) = x\psi(x)$.
The domain of $\hat p$ consists of all distributions $\psi \in L^2(\mathbb{R})$ with square-integrable first derivative,
$$D(\hat p) = \{\psi \in L^2(\mathbb{R}) : \psi' \in L^2(\mathbb{R})\}.$$
For any $\psi \in D(\hat p)$ and for a given $\hbar$,
$$(\hat p\psi)(x) = \frac{\hbar}{i}\frac{d\psi(x)}{dx}.$$
The operators $e^{\hat p}$ and $e^{\hat q}$ satisfy the Zakrzewski relation. Observe that two strictly positive (i.e. positive and invertible) selfadjoint operators satisfy the Zakrzewski relation if and only if they satisfy the Weyl commutation relations [10]. If $\ker R = \ker S = \{0\}$, we will call the pair $(R, S)$ non-degenerate. All non-degenerate pairs of operators satisfying the Zakrzewski relation are in some sense built from the operators $e^{\hat q}$ and $e^{\hat p}$. Precisely, as proved by Woronowicz [18]:

Proposition 1.2. Let $R$ and $S$ be operators acting on a Hilbert space $H$, and let $\ker R = \{0\}$, $\ker S = \{0\}$ and $R \multimap S$. Then the pair $(R, S)$ is unitarily equivalent to the pair $(u \otimes e^{\hat p}, v \otimes e^{\hat q})$ acting on the Hilbert space $K \otimes L^2(\mathbb{R})$, where $u, v$ are unitary, selfadjoint and mutually commuting operators acting on $K$:
$$u = u^* = u^{-1}, \qquad v = v^* = v^{-1}, \qquad uv = vu.$$
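The claim that $e^{\hat p}$ and $e^{\hat q}$ satisfy condition 2 of Definition 1.1 can be checked numerically on a test function. The sketch below relies on two standard facts that are assumptions as far as this text is concerned: $(e^{\hat p})^{il} = e^{il\hat p}$ acts as the translation $\psi(x) \mapsto \psi(x + \hbar l)$ when $\hat p = \frac{\hbar}{i}\frac{d}{dx}$, and $(e^{\hat q})^{ik} = e^{ik\hat q}$ acts as multiplication by $e^{ikx}$. The value of `HBAR`, the test function `psi0` and all function names are hypothetical choices for illustration.

```python
import cmath
import math

HBAR = 0.7  # sample value of the parameter, chosen inside (-pi, pi)


def R_pow(psi, l):
    """Apply (e^p)^{il} = e^{ilp}: translation x -> x + HBAR * l."""
    return lambda x: psi(x + HBAR * l)


def S_pow(psi, k):
    """Apply (e^q)^{ik} = e^{ikq}: multiplication by exp(i k x)."""
    return lambda x: cmath.exp(1j * k * x) * psi(x)


def psi0(x):
    """Smooth, rapidly decaying test function (arbitrary choice)."""
    return (1.0 + 0.3 * x) * math.exp(-x * x / 2.0)


def weyl_defect(l, k, points):
    """Largest pointwise value of | R^{il} S^{ik} psi - e^{i hbar l k} S^{ik} R^{il} psi |."""
    lhs = R_pow(S_pow(psi0, k), l)   # first S^{ik}, then R^{il}
    rhs = S_pow(R_pow(psi0, l), k)   # first R^{il}, then S^{ik}
    phase = cmath.exp(1j * HBAR * l * k)
    return max(abs(lhs(x) - phase * rhs(x)) for x in points)


grid = [0.25 * j for j in range(-12, 13)]
print(weyl_defect(1.3, -0.8, grid))  # of the order of rounding error
```

The defect vanishes up to floating-point rounding for any real `l`, `k`, which is the relation $|R|^{il}|S|^{ik} = e^{i\hbar lk}|S|^{ik}|R|^{il}$ with $R = e^{\hat p}$, $S = e^{\hat q}$ evaluated on one vector.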
1.3. Selfadjoint extensions of $R + S$ for $R \multimap S$. Selfadjoint extensions of the sum $R + S$, where $R \multimap S$, have been proved to be very important in constructing a quantum deformation of the ‘ax + b’ group [18, 19]. The operator $R + S$ is symmetric, but may not be selfadjoint. Moreover, in some cases such a sum does not have a selfadjoint extension at all. We will now describe conditions under which there exists a unique selfadjoint extension of the sum of selfadjoint operators satisfying the Zakrzewski relation. To this end, let us consider a symmetric operator $Q$ acting on a Hilbert space $H$. In general, $Q$ is not selfadjoint. Let $\tau$ be a selfadjoint operator acting on the same Hilbert space $H$ as $Q$ does, such that $\tau^2$ is a projection. The operator $\tau$ is called a reflection operator for $Q$ if $\tau$ anticommutes with $Q$ (that is, $\tau Q \subset -Q\tau$) and $Q$ restricted to $H(\tau = 0)$ is selfadjoint. Then $Q$ preserves the direct sum decomposition $H = H(\tau = 0) \oplus H(\tau^2 = 1)$ and the restriction of $Q$ to $H(\tau = 0)$ is well defined. By Proposition 5.1 of [18], there exists a unique selfadjoint extension $Q_\tau$ of $Q$, such that $(\tau - I)D(Q_\tau) \subset D(Q)$. The domain of the operator $Q_\tau$ is
$$D(Q_\tau) = D(Q) + D(Q^*) \cap H(\tau = 1) = \{x \in D(Q^*) : (\tau - I)x \in D(Q)\}.$$
Note that it is not enough to take as a domain of $Q_\tau$ the restriction of $D(Q^*)$ to $H(\tau = 1)$, since it may not contain the whole $D(Q)$. It has also been proved in [18] that selfadjoint extensions of the symmetric operator $Q = R + S$ exist only if there exists a selfadjoint operator $\tau$, such that $\tau$ anticommutes with $R$ and $S$, and
$$\tau^2 = \chi(e^{i\hbar/2} RS < 0).$$
Moreover, every selfadjoint extension of $R + S$ is defined uniquely by its reflection operator $\tau$. To stress this dependence we will use the symbol $[R + S]_\tau$. The selfadjoint extension of $R + S$ is given by the formula
$$[R + S]_\tau = (R + S)^*\big|_{D(R+S) + D((R+S)^*) \cap H(\tau = 1)}.$$
As we will point out in Subsect. 2.2, such selfadjoint extensions can be very conveniently expressed in terms of the quantum exponential function (see Eq. 8).

1.4. Operator domains and operator functions. Now we will introduce two notions essential for understanding this paper: operator domain and operator function. An operator domain can be thought of as a “quantum space”. In a generalization of the duality theory such spaces correspond to non-commutative $C^*$-algebras [15, 6, 1]. A $C^*$-algebra can be described by its generators and relations [17] and these generators can be viewed as non-commutative coordinates on a “quantum space”. An operator function can be thought of as a function of “non-commuting operator variables”. Operator functions and domains preserve the symmetries of Hilbert space. Precise definitions are given below.

Definition 1.3. For a Hilbert space $H$, let $D_H$ be a subset of $C(H)^N$, such that
1. For two Hilbert spaces $H$ and $K$ and for a unitary operator $V : H \to K$, and for an element $x := (x_1, x_2, \dots, x_N) \in D_H \subset C(H)^N$ we have
$$V x V^* := (V x_1 V^*, V x_2 V^*, \dots, V x_N V^*) \in D_K.$$
2. For a space $\Lambda$ with a measure $\mu$ and for a measurable field¹ of Hilbert spaces $\{H(\lambda)\}_{\lambda \in \Lambda}$, and for any measurable closed-operator field $\{a(\lambda)\}_{\lambda \in \Lambda}$, $a(\lambda) \in C(H)^N$, we have
$$\int_\Lambda^\oplus a(\lambda)\, d\mu(\lambda) \in D_{\int_\Lambda^\oplus H(\lambda)\, d\mu(\lambda)}$$
if and only if $a(\lambda) \in D_{H(\lambda)}$ for $\mu$-almost all $\lambda \in \Lambda$.

¹ Its definition, as well as the definition of a measurable closed-operator field, can be found in [14]. See also [7].

Let $D$ be a collection of $D_H$'s, numbered by $H$. We will call $D$ an $N$-dimensional operator domain. Operator domains in this paper can be thought of as commutation rules that can be realized in different Hilbert spaces (see Examples 1.4, 1.5 and 1.6 below). Note that the Hilbert space $H$ plays the role of a variable in this scheme, i.e. the difference between an operator domain $D$ and a set $D_H$ is such as between a function $f$ and its value $f(x)$ at a point $x$. An operator domain is a category; the bounded intertwining operators are its morphisms (see [15] and [17]). The notion of a measurable closed-operator field used in Def. 1.3 is not widely known. However, it can be easily reduced to the more popular notion of a measurable bounded-operator field. For $T \in C(H)$, its $z$-transform is defined by
$$z_T = T(I + T^*T)^{-1/2}.$$
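In the simplest case $H = \mathbb{C}$ a closed operator is just a complex number, and the $z$-transform can be written out explicitly. The following sketch (function names are illustrative assumptions, and the inverse formula $T = z(1 - |z|^2)^{-1/2}$ is the scalar specialization worked out for this example) shows that $z_T$ always lies in the open unit disc and that $T$ can be recovered from it.

```python
import math


def z_transform(t: complex) -> complex:
    """Scalar z-transform z_T = T (1 + |T|^2)^(-1/2) for T acting on H = C."""
    return t / math.sqrt(1.0 + abs(t) ** 2)


def inverse_z_transform(z: complex) -> complex:
    """Recover T from z_T; only meaningful for |z| < 1."""
    return z / math.sqrt(1.0 - abs(z) ** 2)


samples = [0.0, 1.0, -2.5, 3.0 + 4.0j, 100.0j]
for t in samples:
    z = z_transform(t)
    # |z_T| < 1 even though |T| may be arbitrarily large.
    print(t, abs(z), inverse_z_transform(z))
```

The map $T \mapsto z_T$ thus trades an unbounded quantity for a bounded one without losing information, which is what allows measurability of closed-operator fields to be reduced to the bounded case.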
Fig. 1. Schematic depiction of a classical locally compact space, which can be identified with the set $A_{\mathbb{C}}$
Observe that $z_T$ is a bounded operator and $T$ is uniquely determined by $z_T$. For more details, see [17]. We say that a closed-operator field
$$\lambda \mapsto a(\lambda) \in C(H(\lambda)) \qquad (3)$$
is measurable, if the field
$$\lambda \mapsto z_{a(\lambda)} \in B(H(\lambda))$$
is a measurable bounded-operator field. Then there exists a unique operator
$$a \in C\Big(\int_\Lambda^\oplus H(\lambda)\, d\mu(\lambda)\Big),$$
such that
$$z_a = \int_\Lambda^\oplus z_{a(\lambda)}\, d\mu(\lambda).$$
We call the operator $a$ a direct integral of the field (3) and write $a := \int_\Lambda^\oplus a(\lambda)\, d\mu(\lambda)$. For bounded operators, this notion of a direct integral coincides with that used in [2]. In particular, when $\Lambda$ is a countable set and $\mu$ is a counting measure, then $H = \bigoplus_{\lambda \in \Lambda} H(\lambda)$ and condition 2 of Def. 1.3 takes the form
$$\bigoplus_{\lambda \in \Lambda} a(\lambda) \in D_H \quad \text{if and only if} \quad a(\lambda) \in D_{H(\lambda)} \ \text{for any } \lambda \in \Lambda.$$
However, the operators considered in this paper have continuous spectra (which lead to non-atomic spectral measures), therefore in Def. 1.3 we use a more general notion of a direct integral.

Example 1.4. The collection of sets $A_H$,
$$A_H = \{(R, \rho) \in C(H)^2 \mid R = R^*,\ \rho = \rho^* \text{ and } R\rho = \rho R \text{ and } \rho^2 = \chi(R < 0)\},$$
numbered by $H$, forms the operator domain $A$. This operator domain has been studied in [18].
Fig. 2. Schematic depiction of a classical locally compact space, which can be identified with the set $N_{\mathbb{C}}$
Observe that the operator domain $A$ is commutative. For $H = \mathbb{C}$, the set $A_{\mathbb{C}}$ may be identified with a locally compact space constructed from the sum of three half-lines: $[0, +\infty[ \times \{0\}$ and $]-\infty, 0] \times \{-1\}$ and $]-\infty, 0] \times \{1\}$, with the points $(0, 0)$, $(0, -1)$ and $(0, 1)$ identified (see Fig. 1). Observe also that in the general case this space is a joint spectrum of $R$ and $\rho$.

Example 1.5. The collection of sets $N_H$,
$$N_H = \{(R, \rho) \in C(H)^2 \mid R = R^*,\ \rho = \rho^* \text{ and } R\rho = \rho R \text{ and } \rho^2 = \chi(R \ne 0)\},$$
numbered by $H$, forms the operator domain $N$. This operator domain is also commutative. For $H = \mathbb{C}$, the set $N_{\mathbb{C}}$ can be identified with a locally compact space similar to the one described in the previous example, but consisting of four, not three, half-lines with a common origin (see Fig. 2). The operator domain $N$ will be discussed in Sect. 2.

Example 1.6. The collection of sets $M_H$,
$$M_H = \{(b, \beta) \in C(H)^2 \mid b = b^*,\ \beta = \beta^* \text{ and } b\beta = -\beta b \text{ and } \beta^2 = \chi(b \ne 0)\},$$
numbered by $H$, forms the operator domain $M$. Since the operators $b$ and $\beta$ do not commute, the set $M_H$ cannot be identified with a classical space for any $H$. It is the first example of a purely “quantum” space in this paper. The operator domain $M$ will be discussed in Sect. 3.

These examples show that the description of operator domains is similar to the description of a manifold given by a set of equations. For example, the sphere $S^2 = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 + x_3^2 = 1\}$ is described by giving coordinates $(x_1, x_2, x_3)$ and relations between them ($x_1^2 + x_2^2 + x_3^2 = 1$). Therefore, the unbounded operators entering the descriptions of operator domains can be thought of as “coordinates on a quantum space”. The operator functions can be viewed as a recipe specifying what needs to be done with an $N$-tuple of closed operators $(a_1, a_2, \dots, a_N)$ to obtain a given closed operator $F(a_1, a_2, \dots, a_N)$.

Definition 1.7. Let $D$ be an operator domain and let for any Hilbert space $H$ be given a map $F_H : D_H \to C(H)$. We say that $F$ is a measurable operator function if
1. For Hilbert spaces $H$ and $K$ and for a unitary operator $V : H \to K$ and for an element $x \in D_H$ we have $F_K(V x V^*) = V F_H(x) V^*$.
2. For a space with a measure $(\Lambda, \mu)$ and for a measurable Hilbert-space field $\{H(\lambda)\}_{\lambda \in \Lambda}$ and for an $a \in D_H$ with a decomposition
$$a = \int_\Lambda^\oplus a(\lambda)\, d\mu(\lambda) \in D_{\int_\Lambda^\oplus H(\lambda)\, d\mu(\lambda)},$$
the operator field $\{F_{H(\lambda)}(a(\lambda))\}_{\lambda \in \Lambda}$ is measurable and
$$F_{\int_\Lambda^\oplus H(\lambda)\, d\mu(\lambda)}(a) := \int_\Lambda^\oplus F_{H(\lambda)}(a(\lambda))\, d\mu(\lambda).$$

For example, let $(a_1, a_2, \dots, a_N) \in D_H$ and let the $a_i$'s be normal, mutually strongly commuting operators, with joint spectrum contained in $\Lambda \subset \mathbb{C}^N$. Then measurable operator functions on $D$ are measurable functions on $\Lambda$. This shows that the above definition is a generalization of the functional calculus of measurable functions of strongly commuting normal operators to the case of non-commuting closed operators (not necessarily normal). Having defined the notion of an operator function, we can now proceed to the definition of an operator map between operator domains.

Definition 1.8. Let $M$ be an operator domain and let $N$ be a $k$-dimensional operator domain. Moreover, let $F = (F^1, F^2, \dots, F^k)$, where $F^i$ for $i = 1, 2, \dots, k$ are operator functions on $M$. If for a Hilbert space $H$ and for an $m \in M_H$,
$$F_H(m) = (F_H^1(m), F_H^2(m), \dots, F_H^k(m)) \in N_H,$$
then we call $F$ an operator map from the operator domain $M$ into the operator domain $N$.
1.5. Binary operations on operator domains. To proceed toward the definition of a binary operation we need to define its domain first. If $D$ is an operator domain, then we can construct a new operator domain, consisting of all pairs of operators from $D$ satisfying certain commutation relations or spectral conditions. Since this construction resembles the classical definition of a relation, we will use similar notation. Namely, we will denote such an operator domain² by $R_D$. We start with trivial examples of this construction in the present subsection. The first non-trivial one will be $R_{N_H}$ in the next subsection.

² A priori, it is not guaranteed that such a defined object will satisfy the definition of an operator domain. However, this will be obvious in the specific cases considered in this paper.

Example 1.9 (Operator domain and map related to the quantum $SU_q(2)$ group). Let us define the operator domain $SU_q(2)$ as follows:
$$SU_q(2)_H = \big\{(\alpha, \gamma) \in B(H) :\ \alpha\alpha^* + \gamma\gamma^* = I;\ \alpha\gamma = q\gamma\alpha;\ \gamma\gamma^* = \gamma^*\gamma;\ \alpha\gamma^* = q\gamma^*\alpha;\ \alpha^*\alpha + q^2\gamma^*\gamma = I \big\}.$$
Let us define $R_{SU_q(2)_H}$ by
$$R_{SU_q(2)_H} = \big\{((\alpha_1, \gamma_1), (\alpha_2, \gamma_2)) : (\alpha_1, \gamma_1), (\alpha_2, \gamma_2) \in SU_q(2)_H \big\}.$$
There are neither commutation relations nor spectral conditions to be satisfied. Moreover, let us define the operator map $\odot$ by
$$\odot : R_{SU_q(2)_H} \ni ((\alpha_1, \gamma_1), (\alpha_2, \gamma_2)) \to (\alpha, \gamma) \in SU_q(2)_H,$$
where $\alpha = \alpha_1 \otimes \alpha_2 - q\gamma_1^* \otimes \gamma_2$ and $\gamma = \gamma_1 \otimes \alpha_2 + \alpha_1^* \otimes \gamma_2$. One can check that the binary operation thus defined is associative, i.e. for any $(\alpha_1, \gamma_1), (\alpha_2, \gamma_2), (\alpha_3, \gamma_3) \in SU_q(2)_H$ we have
$$\{(\alpha_1, \gamma_1) \odot (\alpha_2, \gamma_2)\} \odot (\alpha_3, \gamma_3) = (\alpha_1, \gamma_1) \odot \{(\alpha_2, \gamma_2) \odot (\alpha_3, \gamma_3)\}.$$
Since in the definition of the operator domain $SU_q(2)$ we restricted ourselves to bounded operators, the related quantum $SU_q(2)$ group is a compact quantum group.

Example 1.10 (Operator domain and map related to the quantum ‘ax + b’ group). Let us define the operator domain $G$ as a collection of sets $G_H$, such that
$$G_H = \big\{(a, b, \beta) \in C(H)^3 :\ a > 0,\ a \multimap b,\ (b, \beta) \in M_H,\ a\beta = \beta a \big\},$$
where $M$ is the operator domain from Example 1.6. In this case $R_{G_H}$ is again trivial:
$$R_{G_H} = \big\{((a_1, b_1, \beta_1), (a_2, b_2, \beta_2)) : (a_1, b_1, \beta_1), (a_2, b_2, \beta_2) \in G_H \big\}.$$
The operator map $\odot$ on $R_G$ is given for every $H$ by
$$\odot : R_{G_H} \ni ((a_1, b_1, \beta_1), (a_2, b_2, \beta_2)) \to (a, b, \beta) \in G_H,$$
where $a = a_1 \otimes a_2$ and $b = [a_1 \otimes b_2 + b_1 \otimes I]_{(-1)^k (\beta_1 \otimes \beta_2)\chi(b_1 \otimes b_2 < 0)}$. The formula for $\beta$ is much more complicated, so we omit it here. This additional ‘generator’ $\beta$ is needed to ensure the existence of a selfadjoint extension of $a_1 \otimes b_2 + b_1 \otimes I$. To make the operation $\odot$ associative one has to ensure that
$$\hbar = \pm\frac{\pi}{2k + 3}, \quad \text{where } k = 0, 1, 2, \dots$$
and $k$ is the same as chosen in the formula for the selfadjoint extension of $a_1 \otimes b_2 + b_1 \otimes I$. The operator domain $G$ with the operation $\odot$ defined as above forms the quantum ‘ax + b’ group on the Hilbert space level [19]. Observe that from $a \multimap b$ it follows that the operators $a$ and $b$ are not bounded. Therefore the corresponding quantum ‘ax + b’ group is non-compact.
2. The Operator Domain $N$

Let $H$ be a separable, infinite dimensional Hilbert space. Let $N$ be a collection of sets $N_H$,
$$N_H = \{(R, \rho) \in C(H)^2 :\ R = R^*,\ \rho = \rho^*,\ \rho R = R\rho,\ \rho^2 = \chi(R \ne 0)\}. \qquad (4)$$
$N$ is the operator domain from Example 1.5. To ensure the existence of a selfadjoint extension of the sum $R + S$ we have to consider another set
$$R_{N_H} = \{((R, \rho), (S, \sigma)) \in N_H :\ R \multimap S \text{ and } S\rho = -\rho S \text{ and } R\sigma = -\sigma R \text{ and } \rho\sigma = \sigma\rho\}.$$
Notice that the set $R_{N_H}$ consists of all pairs of elements of $N_H$ satisfying certain additional conditions. These additional conditions are necessary to ensure the existence of a selfadjoint extension of the sum $R + S$. To give a formula for such a selfadjoint extension we need to introduce the notion of a quantum exponential function. In order to define the quantum exponential function $F_\hbar$, it is convenient to introduce the special function $V_\theta$ first.

2.1. The special function $V_\theta$ and the quantum exponential function $F_\hbar$. The special function $V_\theta$ is defined by
$$V_\theta(x) = \exp\left\{\frac{1}{2\pi i}\int_0^\infty \log(1 + a^{-\theta})\,\frac{da}{a + e^{-x}}\right\}$$
for any $x \in \mathbb{C}$ such that $|\Im x| < \pi$. The function $V_\theta$ can be extended to a function meromorphic on $\mathbb{C}$. Let
$$\Omega_+^\hbar = \{r \in \mathbb{C} \setminus \{0\} : \arg r \in [0, \hbar]\}, \qquad \Omega_-^\hbar = \{r \in \mathbb{C} \setminus \{0\} : \arg r \in [-\pi, \hbar - \pi]\},$$
$$\Omega^\hbar = \Omega_+^\hbar \times \{0\} \cup \Omega_-^\hbar \times \{-1, 1\}.$$
The quantum exponential function $F_\hbar$ is a quantum analogue of the exponential function fit for operators belonging to $R_{N_H}$. How this should be understood is explained in Corollary 2.2. For $(r, \rho) \in \Omega^\hbar$, the quantum exponential function $F_\hbar$ is defined by
$$F_\hbar(r, \rho) = \big[1 + i\rho(-r)^{\pi/\hbar}\big] V_\theta(\log r).$$
In particular, for $r$ real, $(r, \rho) \in \Omega_{\mathrm{real}} := \mathbb{R}_+ \times \{0\} \cup \mathbb{R}_- \times \{-1, 1\}$,
$$F_\hbar(r, \rho) = \begin{cases} V_\theta(\log r) & \text{for } r > 0 \text{ and } \rho = 0, \\ \{1 + i\rho|r|^{\pi/\hbar}\} V_\theta(\log|r| - \pi i) & \text{for } r < 0 \text{ and } \rho = \pm 1. \end{cases} \qquad (5)$$
The function $F_\hbar$ can be extended continuously to the closure of $\Omega_{\mathrm{real}}$ by setting $F_\hbar(0, \rho) = 1$. An axiomatic introduction of $F_\hbar$ and some other properties of $F_\hbar$ and $V_\theta$ can be found in Sect. 2 of [18].
2.2. Binary operation on $N$. Observe that if $(R, \rho) \in N_H$, then $R$ commutes with $\rho\chi(R < 0)$ and the joint spectrum of these operators is the closure of $\Omega_{\mathrm{real}}$. Therefore, the expression $F_\hbar(R, \rho\chi(R < 0))$ is well defined. Moreover, since $(R, \rho\chi(R < 0)) \in A_H$, and since $F_\hbar : \Omega_{\mathrm{real}} \to \mathbb{C}$ is a measurable function, it follows (see Introduction) that $F_\hbar$ is an operator function defined on $A$. Let $((R, \rho), (S, \sigma)) \in R_{N_H}$. Assume that $\ker S = \{0\}$. This assumption is not very limiting since every operator $S$ can be expressed as a direct sum of its restrictions to $(\ker S)^\perp$ (which is invertible) and to $\ker S$ (which is trivial). Define
$$T = e^{i\hbar/2} S^{-1} R \quad \text{and} \quad \tau = (-1)^k \rho\sigma, \qquad (6)$$
where $k \in \mathbb{N}$ and $k$ is related to $\hbar$ by
$$\hbar = \pm\frac{\pi}{2k + 3}. \qquad (7)$$
Define also
$$[R + S]_{\tau\chi(T<0)} = F_\hbar(T, \tau\chi(T < 0))^*\, S\, F_\hbar(T, \tau\chi(T < 0)) \qquad (8)$$
and
$$\tilde\sigma = F_\hbar(T, \tau\chi(T < 0))^*\, \sigma\, F_\hbar(T, \tau\chi(T < 0)). \qquad (9)$$
Since $F_\hbar(T, \tau\chi(T < 0))$ is a unitary operator, we see that $([R + S]_{\tau\chi(T<0)}, \tilde\sigma) \in N_H$. Therefore, the operation
$$\odot_N : R_{N_H} \to N_H, \qquad (R, \rho) \odot_N (S, \sigma) = ([R + S]_{\tau\chi(T<0)}, \tilde\sigma)$$
is well defined. As we have explained above while considering the operator $S$, we can safely assume that $\ker R = \{0\}$. Then the operator $T$ is invertible. Using the properties of $T$ and $\tau$ derived in [18], one can check that
$$((T, \tau), ((R, \rho) \odot_N (S, \sigma))) \in R_{N_H}, \quad ((T, \tau), (R, \rho)) \in R_{N_H} \quad \text{and} \quad (((T, \tau) \odot_N (R, \rho)), (S, \sigma)) \in R_{N_H}.$$
Moreover, one can prove that the operation $\odot_N$ is associative in the sense of
$$(T, \tau) \odot_N ((R, \rho) \odot_N (S, \sigma)) = ((T, \tau) \odot_N (R, \rho)) \odot_N (S, \sigma). \qquad (10)$$
The proof of associativity is not trivial; one has to keep track of the domains of the unbounded operators involved. However, a very similar proof was published in [19], so we omit it here.

Now we will consider the operator domain $A$ (Example 1.4). This operator domain is fairly similar to the operator domain of our interest, $N$. The reason why we need $A$ is historic: before it became clear that the operator domain appropriate for our purposes is $N$, Woronowicz had proved some useful facts about $A$. To make use of them, we shall
Fig. 3. Schematic depiction of the map $\varphi$ in case the operator domains $A$ and $N$ are represented by locally compact spaces corresponding to $A_{\mathbb{C}}$ and $N_{\mathbb{C}}$, respectively
first introduce $A$ and next use facts proved about it to prove analogous facts about $N$. Then we will work with the actual domain $N$. The operator domain $R_A$ and the operation $\odot_A$ on $A$ have been described in [18], although not in the language of operator domains. In this language, the definition of $R_{A_H}$ reads
$$R_{A_H} = \{((R, \rho), (S, \sigma)) \in A_H :\ R \multimap S \text{ and } S\rho = -\rho S \text{ and } R\sigma = -\sigma R\}.$$
Note that it is very similar to the definition of $R_{N_H}$, except that on the list of commutation relations in the case of $R_{A_H}$ the relation $\rho\sigma = \sigma\rho$ is missing. The lack of commutativity of $\sigma$ and $\rho$ has proved to be a main technical difficulty and is the reason why in this paper we focus on the operator domain $N$ instead of $A$. Now that we have described $R_{A_H}$, we can define the operator map $\odot_A : R_{A_H} \to A_H$ by
$$(R, \rho) \odot_A (S, \sigma) = ([R + S]_\tau, \tilde\sigma),$$
where
$$\tau = (-1)^k \rho\sigma + (-1)^k \sigma\rho \quad \text{and} \quad \tilde\sigma = F_\hbar(T, \tau)^*\, \sigma\, F_\hbar(T, \tau).$$
We are going to apply the results related to the operator domain $A$ obtained in [18] to the operator domain $N$ defined by (4). Our main tool will be an operator map $\varphi$ from the operator domain $N$ into $A$. (How one can think about this map in terms of corresponding locally compact spaces is shown in Fig. 3.) Corollary 2.1 below defines $\varphi$ and describes its most important property.

Corollary 2.1. Let $\varphi$ be an operator map from $N_H$ into $A_H$, defined for any $(R, \rho) \in N_H$ by
$$\varphi(R, \rho) = (R, \rho\chi(R < 0)).$$
Then, for any $((R, \rho), (S, \sigma)) \in R_{N_H}$, we have
$$\varphi((R, \rho) \odot_N (S, \sigma)) = \varphi(R, \rho) \odot_A \varphi(S, \sigma). \qquad (11)$$
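For $H = \mathbb{C}$ the map $\varphi$ can be made completely explicit, since elements of $N_{\mathbb{C}}$ and $A_{\mathbb{C}}$ are just pairs of real numbers. The sketch below reads the defining conditions as $\rho^2 = \chi(R \ne 0)$ for $N$ and $\rho^2 = \chi(R < 0)$ for $A$; the function names are hypothetical, and only the scalar case is covered (nothing is claimed here about the binary operations).

```python
def in_N(r: float, rho: int) -> bool:
    """Scalar case of N: rho^2 = chi(R != 0)."""
    return rho * rho == (1 if r != 0 else 0)


def in_A(r: float, rho: int) -> bool:
    """Scalar case of A: rho^2 = chi(R < 0)."""
    return rho * rho == (1 if r < 0 else 0)


def phi(r: float, rho: int):
    """phi(R, rho) = (R, rho * chi(R < 0)) for H = C."""
    return (r, rho * (1 if r < 0 else 0))


# Sample points on the four half-lines of N_C, plus the common origin.
points_N = [(2.0, 1), (2.0, -1), (-3.0, 1), (-3.0, -1), (0.0, 0)]
for point in points_N:
    assert in_N(*point)
    assert in_A(*phi(*point))
    print(point, "->", phi(*point))
```

The two positive half-lines of $N_{\mathbb{C}}$ (with $\rho = \pm 1$) are both sent to the single half-line $\mathbb{R}_+ \times \{0\}$ of $A_{\mathbb{C}}$, while the negative half-lines are left unchanged, which matches the passage from four half-lines to three.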
Proof. By the definition of $\odot_N$, we know that
$$(R, \rho) \odot_N (S, \sigma) = ([R + S]_{\tau\chi(T<0)}, \tilde\sigma),$$
where $\tau = (-1)^k \rho\sigma$ and $\tilde\sigma = F_\hbar(T, \tau\chi(T < 0))^*\, \sigma\, F_\hbar(T, \tau\chi(T < 0))$. Then, by the definition of $\varphi$:
$$\varphi(([R + S]_{\tau\chi(T<0)}, \tilde\sigma)) = ([R + S]_{\tau\chi(T<0)},\ \tilde\sigma\chi([R + S]_{\tau\chi(T<0)} < 0)).$$
On the other hand,
$$\varphi(R, \rho) \odot_A \varphi(S, \sigma) = (R, \rho\chi(R < 0)) \odot_A (S, \sigma\chi(S < 0)) = ([R + S]_{\hat\tau}, \hat{\tilde\sigma}),$$
where
$$\hat\tau = (-1)^k \rho\chi(R < 0)\sigma\chi(S < 0) + (-1)^k \sigma\chi(S < 0)\rho\chi(R < 0)$$
and
$$\hat{\tilde\sigma} = F_\hbar(T, \hat\tau)^*\, \sigma\chi(S < 0)\, F_\hbar(T, \hat\tau).$$
In order to prove that the first components of $\varphi((R, \rho) \odot_N (S, \sigma))$ and $\varphi(R, \rho) \odot_A \varphi(S, \sigma)$ do coincide, it is enough to show that
$$\tau\chi(T < 0) = (-1)^k \rho\chi(R < 0)\sigma\chi(S < 0) + (-1)^k \sigma\chi(S < 0)\rho\chi(R < 0), \qquad (12)$$
because selfadjoint extensions are determined uniquely by their reflection operators. Since $R$ and $S$ satisfy the Zakrzewski relation, it follows that $R$ commutes with (sign $S$) and $S$ commutes with (sign $R$). Hence,
$$R\sigma\chi(S < 0) = \sigma\chi(S < 0)R \quad \text{and} \quad S\rho\chi(R < 0) = \rho\chi(R < 0)S.$$
This means that if $((R, \rho), (S, \sigma)) \in R_{N_H}$ then $((R, \rho\chi(R < 0)), (S, \sigma\chi(S < 0))) \in R_{A_H}$. Then, the sum $R + S$ has a selfadjoint extension, determined uniquely by the following reflection operator $\hat\tau$:
$$\hat\tau = (-1)^k \rho\chi(R < 0)\sigma\chi(S < 0) + (-1)^k \sigma\chi(S < 0)\rho\chi(R < 0).$$
Since $\sigma$ anticommutes with $R$, it follows that $\chi(R < 0)\sigma = \sigma\chi(R > 0)$, and analogously for $\rho$ and $S$. Hence,
$$\hat\tau = (-1)^k \rho\sigma\{\chi(R > 0)\chi(S < 0) + \chi(S > 0)\chi(R < 0)\} = (-1)^k \rho\sigma\chi(e^{i\hbar/2} S^{-1} R < 0).$$
Comparing this result with formula (6) we see that (12) holds. We proceed now to proving that also the second components of $\varphi((R, \rho) \odot_N (S, \sigma))$ and $\varphi(R, \rho) \odot_A \varphi(S, \sigma)$ do coincide. To this end, one has to prove that
$$F_\hbar(T, \hat\tau)^*\, \sigma\chi(S < 0)\, F_\hbar(T, \hat\tau) = \tilde\sigma\chi([R + S]_{\tau\chi(T<0)} < 0).$$
Observe that, by virtue of Theorem 5.2 of [18], we get that
$$[R + S]_{\tau\chi(T<0)} = F_\hbar(T, \tau\chi(T < 0))^*\, S\, F_\hbar(T, \tau\chi(T < 0)).$$
Since F (T , τ χ (T < 0)) is a unitary operator, it follows that χ([R + S]τ χ(T <0) < 0) = F (T , τ χ (T < 0))∗ χ (S < 0)F (T , τ χ (T < 0)) .
(13)
Moreover, by Theorem 6.2 of [18], we have (sign T) = (sign S). Therefore
$$\hat{\tilde\sigma} = F(T,\tau)^{*}\,\sigma\chi(S<0)\,F(T,\tau) = F(T,\tau\chi(T<0))^{*}\,\sigma\chi(S<0)\,F(T,\tau\chi(T<0)).$$
Hence, since F(T, τχ(T < 0)) is a unitary operator,
$$F(T,\tau\chi(T<0))^{*}\,\sigma\chi(S<0)\,F(T,\tau\chi(T<0)) = F(T,\tau\chi(T<0))^{*}\,\sigma\,F(T,\tau\chi(T<0)) \times F(T,\tau\chi(T<0))^{*}\,\chi(S<0)\,F(T,\tau\chi(T<0)) = \tilde\sigma\,F(T,\tau\chi(T<0))^{*}\,\chi(S<0)\,F(T,\tau\chi(T<0)).$$
Moreover, taking into account (13), we get
$$\tilde\sigma\,F(T,\tau\chi(T<0))^{*}\,\chi(S<0)\,F(T,\tau\chi(T<0)) = \tilde\sigma\,\chi\!\left([R+S]_{\tau\chi(T<0)} < 0\right),$$
which completes the proof.
By rewriting Theorem 6.1 from [18] in our language, we obtain the following Corollary 2.2. Let ((R, ρ), (S, σ )) ∈ RNH and let ker S = {0} . Then F (ϕ((R, ρ) N (S, σ ))) = F (ϕ(R, ρ))F (ϕ(S, σ )).
(14)
The above corollary explains why F is called the quantum exponential function. Moreover, the quantum exponential function F, like the classical exponential one, is the only function (up to a parameter) satisfying the exponential equation (14) (see Theorem 2.3 below).

Theorem 2.3. Let ((R, ρ), (S, σ)) ∈ RNH and let f : real → S¹ be a measurable function. Then the following conditions are equivalent:
a) f(ϕ(R, ρ))f(ϕ(S, σ)) = f(ϕ((R, ρ) N (S, σ))).
b) There exist M ≥ 0 and µ = ±1, such that f(ϕ(r, ρ)) = F(ϕ(Mr, µρ)) for a.e. (r, ρ) ∈ R × {−1, 1}.

Proof. b) ⇒ a) We first consider the case of M = 0. Then F(Mr, µρ) = F(0, µρ) = 1, because by Theorem 1.1 of [18],
$$\lim_{r\to 0} F(r,\rho) = 1.$$
It is easily seen that if M > 0 and µ = ±1 then ((MR, µρ), (MS, µσ )) ∈ RNH and ker MS = {0}. Thus assumptions of Corollary 2.2 are satisfied and therefore the function f (r, ρχ (r < 0)) = F (Mr, µρχ (Mr < 0)) satisfies condition a). b) ⇐ a) By Corollary 2.1 we may apply Theorem 7.1 of [18], and b) follows directly.
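For comparison, the classical statement alluded to before Theorem 2.3 is that the measurable functions f from the real line to the unit circle with f(x)f(y) = f(x + y) are exactly f(x) = exp(iMx). The following Python sketch is only a numerical illustration of that classical case (the function names and sample points are hypothetical, and nothing here involves the quantum exponential function F):

```python
import cmath

def classical_character(M):
    """Return f(x) = exp(i*M*x), a map from the reals to the unit circle."""
    return lambda x: cmath.exp(1j * M * x)

def satisfies_exponential_equation(f, points, tol=1e-12):
    """Check f(x) f(y) == f(x + y) on all pairs drawn from `points`."""
    return all(abs(f(x) * f(y) - f(x + y)) < tol for x in points for y in points)

points = [-2.5, -1.0, 0.0, 0.3, 4.0]
# A character exp(iMx) passes the test for any real parameter M.
print(satisfies_exponential_equation(classical_character(1.7), points))
# A unit-modulus function that is not of this form fails it.
print(satisfies_exponential_equation(lambda x: cmath.exp(1j * x * x), points))
```

The parameter M plays the same role as the parameter (M, µ) in condition b): it labels the solutions of the exponential equation.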
2.3. The matrix representation of N and some useful matrix elements. The goal of this subsection is to derive certain relationships we will use to classify unitary representations of (N, N ). Consider ((R, ρ), (S, σ)) ∈ RNH. Assume that ker R = ker S = {0}. Since R and S satisfy the Zakrzewski relation, the operators R and S commute with (sign R) and (sign S). Therefore we may introduce the following notation:
$$\begin{aligned}
H_{++} &= H(R>0)\cap H(S>0),\\
H_{+-} &= H(R>0)\cap H(S<0),\\
H_{-+} &= H(R<0)\cap H(S>0),\\
H_{--} &= H(R<0)\cap H(S<0).
\end{aligned}$$
Then H = H++ ⊕ H+− ⊕ H−+ ⊕ H−−. A vector ψ from the space H is represented by
$$\psi = \begin{pmatrix}\psi_{++}\\ \psi_{+-}\\ \psi_{-+}\\ \psi_{--}\end{pmatrix},$$
where ψ++ ∈ H++, ψ+− ∈ H+−, ψ−+ ∈ H−+, and ψ−− ∈ H−−. Therefore, all operators acting on H can be represented by 4 × 4 matrices. Moreover, since ρ is selfadjoint and ρ² = χ(R ≠ 0), we see that the maps ρ : H−− → H−+ and ρ : H−+ → H−− are mutually inverse. Similarly, mutually inverse are the maps ρ : H+− → H++ and ρ : H++ → H+−. Since σ is selfadjoint and σ² = χ(S ≠ 0), we see that the maps σ : H−− → H+− and σ : H+− → H−− are mutually inverse. The same applies to the pair of maps σ : H−+ → H++ and σ : H++ → H−+. Therefore, the Hilbert spaces H−+, H+−, H−− and H++ are unitarily equivalent. In what follows we simply assume that H++ = H−+ = H+− = H−− and denote this Hilbert space by H+. With this assumption,
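$$\rho = \begin{pmatrix} 0 & I & 0 & 0\\ I & 0 & 0 & 0\\ 0 & 0 & 0 & I\\ 0 & 0 & I & 0 \end{pmatrix}
\quad\text{and}\quad
\sigma = \begin{pmatrix} 0 & 0 & I & 0\\ 0 & 0 & 0 & I\\ I & 0 & 0 & 0\\ 0 & I & 0 & 0 \end{pmatrix}.$$
Hence the matrix representation of the operator τ := (−1)^k ρσ is
$$\tau = (-1)^k \begin{pmatrix} 0 & 0 & 0 & I\\ 0 & 0 & I & 0\\ 0 & I & 0 & 0\\ I & 0 & 0 & 0 \end{pmatrix}.$$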
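The block pattern of τ can be checked mechanically. The Python sketch below replaces each identity block I by the number 1 and uses the basis order (++, +−, −+, −−); this scalar stand-in and the helper names are assumptions made only for illustration:

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# Basis order: (++, +-, -+, --); the identity block I is replaced by 1.
RHO = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
SIGMA = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]
ANTIDIAGONAL = [[int(i + j == 3) for j in range(4)] for i in range(4)]

rho_sigma = matmul(RHO, SIGMA)
print(rho_sigma == ANTIDIAGONAL)          # block pattern of tau, up to (-1)^k
print(matmul(RHO, RHO) == IDENTITY)       # rho^2 = I when ker R = {0}
print(matmul(SIGMA, RHO) == rho_sigma)    # rho and sigma commute
```

The sign (−1)^k is a common scalar factor and is left out of the check.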
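Since the operators R and S commute with (sign R) and with (sign S), they are represented by diagonal matrices
$$R = \begin{pmatrix} R_+ & 0 & 0 & 0\\ 0 & R_+ & 0 & 0\\ 0 & 0 & -R_+ & 0\\ 0 & 0 & 0 & -R_+ \end{pmatrix}
\quad\text{and}\quad
S = \begin{pmatrix} S_+ & 0 & 0 & 0\\ 0 & -S_+ & 0 & 0\\ 0 & 0 & S_+ & 0\\ 0 & 0 & 0 & -S_+ \end{pmatrix},$$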
where R+ and S+ denote R and S restricted to H+. Clearly, R+ and S+ are selfadjoint and strictly positive, and they satisfy the Zakrzewski relation. Moreover,
$$T = e^{i\hbar/2} S^{-1}R = \begin{pmatrix} T_+ & 0 & 0 & 0\\ 0 & -T_+ & 0 & 0\\ 0 & 0 & -T_+ & 0\\ 0 & 0 & 0 & T_+ \end{pmatrix},$$
where $T_+ = e^{i\hbar/2} S_+^{-1} R_+$. The operator T+ is selfadjoint and strictly positive. Let us consider the following two operators:
$$R_o = e^{\hat p} \quad\text{and}\quad S_o = e^{\hat q}.$$
Denote the complex conjugation operator by Jo. Then, for w ∈ L²(R), $(J_o w)(t) = \overline{w(t)}$, where t ∈ R. Note that Jo is an antilinear operator such that (Jo)² = I and (Jo)* = Jo and
$$J_o R_o J_o = J_o e^{\hat p} J_o = e^{-\hat p} = R_o^{-1} \quad\text{and}\quad J_o S_o J_o = J_o e^{\hat q} J_o = e^{\hat q} = S_o.$$
By Corollary 1.2, a pair (R, S) satisfying the Zakrzewski relation is unitarily equivalent to $(u\otimes e^{\hat p},\ v\otimes e^{\hat q})$, where u, v are unitary, selfadjoint and commuting operators, i.e. Sp u, Sp v ⊂ {−1, 1}. Therefore, there exists an antilinear operator J, such that J² = I and J* = J and
$$JRJ = R^{-1} \quad\text{and}\quad JSJ = S. \tag{15}$$
Since J is antilinear and J* = J, for any w, v ∈ H,
$$\langle w|Jv\rangle = \langle v|J^{*}w\rangle = \langle v|Jw\rangle. \tag{16}$$
Throughout this subsection we will assume that the operators R and S are strictly positive. Then, we can define an operator F by
$$F = e^{i\pi/4}\, e^{-i\log^2 S/\hbar}\, e^{-i\log^2 T/(2\hbar)}, \tag{17}$$
where as usual $T = e^{i\hbar/2} S^{-1}R$. Note that
$$F^{*} = F^{-1} \quad\text{and}\quad JFJ = F^{-1}. \tag{18}$$
Keeping in mind the definition of the operator F (formula (17)) and remembering that the pairs (R, S), (T, R) and (T, S) satisfy the Zakrzewski relation, one can prove that
$$FRF^{-1} = S \quad\text{and}\quad FSF^{-1} = R^{-1}. \tag{19}$$
Analogously, taking also into account (15) and the definitions of T and F, we obtain
$$JTJ = FT^{-1}F^{-1}. \tag{20}$$
Operators R, S and T are selfadjoint operators with purely continuous spectrum. It is well known that such operators do not have eigenvectors. However, they can have socalled generalized eigenvectors ([5, 8, 4]). Let us consider two locally convex spaces, Y and Y , such that Y ⊂ H ⊂ Y , where Y is endowed with a topology finer than H, whereas Y is the space of all continuous linear functionals on Y . For a given selfadjoint operator A acting on H, there exists Y such that 1. AY ⊂ Y and A is continuous on Y . 2. The operator A can be extended to Y by the duality formula Af | = f |A ,
∀f ∈ Y,
∈ Y ,
where f | denotes the action of the functional ∈ Y on the vector f ∈ Y (we denote the scalar product and duality form by the same symbol ·|·). The operator A is the extension of A into Y . 3. There exists a measure E on the spectrum of the operator A, Sp(A), such that for E-almost all a ∈ Sp(A) , there exists a a ∈ Y such that A a = aa . Hence, for all f ∈ Y we have Af |a = af |a . The functional a is called a generalized eigenvector of A with generalized eigenvalue a. 4. For each pair f, g ∈ Y , we have the Dirac completeness relation g|a a |f dE(a) (21) g|f = Sp(A)
with a |f := f |a . Let |r , |s and |t denote generalized eigenvectors of R, S and T with real generalized eigenvalues r, s and t, respectively. Namely, let for any f ∈ Y , f |R r := Rf |r = rf |r , f |S s := Sf |s = sf |s , f |T t := Tf |t = tf |t . From now on, we will use the same symbols for operators R, S and T , and their extensions. Example 2.4. Let H = L2 (R) and ˆ qˆ R = epˆ and S = eqˆ and T = e 2 S −1 R = ep− . i
(22)
Here, the locally convex space Y mentioned above is the Schwartz space of smooth functions on R decreasing rapidly at infinity S(R). It can be easily checked that the tempered distributions |r , |s and |t , such that for any test function f ∈ S(R), i 1 f (x)e x log r dx, f |r := √ 2π r R 1 f |s := √ f (x)δ(log s − x)dx, s R x log t x2 1 f |t := √ f (x)ei 2 ei dx , 2π t R
satisfy the definition of a generalized eigenvector of the operators R, S and T, respectively. Moreover, note that for R and S given by (22), the operator F is the usual Fourier transform. To see this, it is enough to realize that the characteristic properties of the Fourier transform are
$$F\hat q F^{-1} = \hat p \quad\text{and}\quad F\hat p F^{-1} = -\hat q \quad\text{and}\quad F^{4} = \mathrm{id}.$$
The scalar product of f and g is defined by the formula
$$\langle g|f\rangle := \int_{-\infty}^{+\infty} \overline{g(x)}\, f(x)\, dx. \tag{23}$$
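The property F⁴ = id quoted above for the Fourier transform has a finite-dimensional counterpart that is easy to test. The Python sketch below uses the unitary discrete Fourier transform on C⁸ as a stand-in for the transform on L²(R); this discretization, the sample vector, and the function names are assumptions chosen only for illustration:

```python
import cmath
import math

def unitary_dft(x):
    """Unitary discrete Fourier transform of a list of complex numbers."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n)) / math.sqrt(n)
        for k in range(n)
    ]

def close(u, v, tol=1e-9):
    """Entrywise comparison of two complex vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

x = [1.0, 2.0 - 1.0j, 0.5j, -3.0, 0.25, 1.0 + 1.0j, 0.0, -0.75]
twice = unitary_dft(unitary_dft(x))
four_times = unitary_dft(unitary_dft(twice))
reflected = [x[-m % len(x)] for m in range(len(x))]

print(close(twice, reflected))   # F^2 acts as the reflection m -> -m
print(close(four_times, x))      # F^4 = id
```

Applying the transform twice gives the reflection of the argument, which is why four applications return the original vector.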
Now, disregarding g in (21) yields
$$|f\rangle = \int_{0}^{+\infty} |s\rangle\langle s|f\rangle\, ds,$$
analogously, disregarding f in (21) yields
$$\langle g| = \int_{0}^{+\infty} \langle g|t\rangle\langle t|\, dt.$$
Therefore
$$\langle g|f\rangle = \int_{0}^{+\infty}\!\!\int_{0}^{+\infty} \langle g|t\rangle\langle t|s\rangle\langle s|f\rangle\, ds\, dt,$$
with notation ⟨t|s⟩ := ⟨t| |s⟩. Note that the above coincides with (23) if and only if ⟨t|s⟩ = δ(t − s), where δ is the Dirac δ distribution. In this case, the generalized eigenvectors are said to be δ-normalized. Note that the generalized eigenvectors |r⟩, |s⟩ and |t⟩ defined above are δ-normalized. However, these generalized eigenvectors are not specified uniquely by saying that they are δ-normalized, because if e.g. |s⟩ is such an eigenvector, then so is α|s⟩, where α is a number of modulus 1. Therefore, in what follows, we will specify the generalized eigenvectors used more precisely. We will also use notation of the type ⟨r|s⟩ := ⟨r| |s⟩. Note that ⟨r|s⟩ provides a transition from the expansion of a test function f ∈ S(R) in the basis of generalized eigenvectors of the operator S into an expansion of f in the basis formed by generalized eigenvectors of R:
$$\langle r|f\rangle = \int_{\mathbb{R}} \langle r|s\rangle\langle s|f\rangle\, ds. \tag{24}$$
To shorten this notation, from now on we shall skip the integration symbol when the same variable occurs twice, once in the ket |·⟩ and once in the bra ⟨·|. For example, we shall write ⟨r|f⟩ = ⟨r|s⟩⟨s|f⟩ instead of (24).
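The convention of suppressing the integral over a repeated variable is the continuous analogue of inserting a resolution of the identity with respect to an orthonormal basis. The Python sketch below illustrates this in Cⁿ, where the integral becomes a finite sum; the discrete Fourier basis and the test vectors are hypothetical choices, not objects from the text:

```python
import cmath
import math

def inner(g, f):
    """Scalar product <g|f>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(g, f))

def fourier_basis(n):
    """Orthonormal basis e_k[j] = exp(2*pi*i*j*k/n)/sqrt(n) of C^n."""
    return [
        [cmath.exp(2j * math.pi * j * k / n) / math.sqrt(n) for j in range(n)]
        for k in range(n)
    ]

n = 6
basis = fourier_basis(n)
f = [complex(j + 1, -j) for j in range(n)]
g = [complex(math.cos(j), j * j) for j in range(n)]

direct = inner(g, f)
# Resolution of the identity: <g|f> = sum_k <g|e_k><e_k|f>.
resolved = sum(inner(g, e) * inner(e, f) for e in basis)
print(abs(direct - resolved) < 1e-9)
```

In the finite-dimensional setting δ-normalization reduces to ordinary orthonormality of the basis vectors.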
Now we proceed to deriving formulas we will need later on. Note that by (15) for any f ∈ S(R) , f |Ss = f |J SJ s . Since J is an antilinear operator and S commutes with J , it follows that f |J SJ s = SJ s |Jf = J Ss |Jf = f |J 2 Ss = f |Ss = sf |s . Comparing the above formulas we obtain SJ s |Jf = sf |s = sf |J 2 s = sJ s |Jf . Therefore it follows that |J s is a generalized eigenvector of S, corresponding to the eigenvalue s¯ . Moreover, |J s is δ-normalized, because |s is. Therefore, there exists an α, such that |α| = 1 and |J s = α|s . As we mentioned earlier, |s was not defined uniquely by specifying that it is a δ-normalized, generalized eigenvector of S with the eigenvalue s. Now we specify |s , by choosing it to satisfy |J s = |s .
(25)
Moreover, observe that by (19), F RF −1 |r = r|r . Hence |F −1 r = β|r , where |β| = 1. Analogously as before, we now specify |r completely by choosing |r = |F −1 r .
(26)
Applying (25) and (26), and then (18) and again (26) and (16) and (25), we obtain r |s = F −1 r |J s = r |F J s = r |J F −1 s = r |J s = s |J r = s | r , so r |s = s | r .
(27)
We proceed to deriving more complicated formulas. Note that ei
log2 t 2
r |t t |s = r |ei
log2 T 2
= r |e
i π4
= r |e
i π4
=e
i π4
e
t t |s = F −1 r |ei
e
2 −i log S
e
e
2 −i log S
t t |s
2 i log S
2 −i log2 T
e
2 i log2 T
log2 T 2
t t |s
t t |s
r |t t |s = ei 4 e−i π
log2 r
r |t t |s .
Keep in mind that according to the convention introduced after Eq.(24), the common factor t |s occurs in the above formulas because of implicit integration over t. From the above calculations, we can derive the formula e−i 4 ei π
log2 t 2
r |t t |s = e−i
log2 r
r |t t |s .
(28)
Now, we will prove that r |Vθ (log T )∗ s = c e−i
log2 s
s |Vθ (log T )r .
(29)
Let us first compute the left hand side of formula (29), LH S = r |Vθ (log T )∗ s = r |Vθ (log T )∗ t t |s = Vθ (log t)r |t t |s .
(30)
Moreover, by formula 1.36 from [18], for any t ∈ R, Vθ (log t) = e−i 4 c ei π
log2 t 2
Vθ (− log t) ,
(31)
π2
where c = ei( 4 + 24 + 6 ) . Hence π
LH S = e−i 4 c ei π
log2 t 2
Vθ (− log t)r |t t |s .
Compute now the right-hand side of (29): RH S = c e−i
log2 s
s |Vθ (log t)|t t |r
= c Vθ (log t)e−i
log2 s
s |t t |r .
(32)
Note that by (17) and (18), T |ei
log2 S
J t = e−i
log2 T 2
ei 4 T F −1 |J t . π
Moreover by (20), e−i
log2 T 2
ei 4 T F −1 J |t = e−i π
log2 T 2
= t −1 e−i
ei 4 F −1 J T −1 J J |t π
log2 T 2
ei 4 F −1 J |t = t −1 |ei π
log2 S
J t .
2 i log S
The above means that |e J t is a generalized eigenvector of T corresponding to the generalized eigenvalue t −1 . Analogously as we did for |s and |r , we now specify |t completely by setting |ei
log2 S
J t = |t −1 .
Hence Vθ (log t)s |t t |r = Vθ (− log t)s |t −1 t −1 |r = Vθ (− log t)s |ei
log2 S
= Vθ (− log t)e
2 −i log r
= Vθ (− log t)e
2 −i log r
= Vθ (− log t)e
2 −i log r
J t ei
log2 S
J t |r
e
2 i log s
s |J t J t |r
e
2 i log s
t |J s J r |t
e
2 i log s
t |s r |t .
Therefore Vθ (log t)s |t = Vθ (− log t)e−i
log2 r
ei
log2 s
t |s r |t .
(33)
In fact, we have proved a formula even more general than (33). Namely, for any measurable function f we have f (log t)s |t t |r = f (− log t)e−i
log2 r
ei
log2 s
t |s r |t .
(34)
We will apply this formula to the quantum ‘az + b’ group in our forthcoming paper [11]. Comparing (32) and (30) and using (28) and (33) we get (29). Using again formulas 1.36 of [18] with z = − log t − iπ, where t ∈ R, and (7), we obtain Vθ (log t − iπ ) = c (−1)k e−i 4 ei π
log t 2
2
e− log t Vθ (−log t − iπ ) . π
Using the above and the same method as for derivation of (29), we can obtain formulas for some matrix elements we will soon find very useful: r |Vθ (log T − iπ I )∗ |s˜ = i(−1)k c e−i
log2 s˜
π
s˜ |T Vθ (log T − iπ I )|r (35)
and r |T Vθ (log T − iπ I )∗ |s˜ = i(−1)k c e−i π
log2 s˜
s˜ |Vθ (log T − iπ I )|r . (36)
2.4. Unitary representations of N . Now we will find a formula for all unitary representations of (N, N ) acting on a Hilbert space K. Definition of a unitary representation of (N, N ) acting on a Hilbert space K is slightly complicated, so we will start with explaining a simpler case. An operator map V is a 1-dimensional unitary representation of (N, N ) (or representation of (N, N ) acting on the Hilbert space C), if for any given Hilbert space H there exists a map VH : NH −→ Unit(H), such that for any ((R, ρ), (S, σ )) ∈ RNH we have VH (R, ρ)VH (S, σ ) = VH ((R, ρ) N (S, σ )). This definition is very natural. In order to generalize it to representations acting on any Hilbert space K (not necessarily C) we have to require that V is an operator map with values in operators acting on K. This is done in the two first conditions of the definition below. Definition 2.5. Let K be a given Hilbert space. Let for any Hilbert space H there exists a map VH : NH −→ Unit(K ⊗ H) , such that 1. For any (R, ρ) ∈ NH and any v ∈ Unit(H, H ) we have (idK ⊗ v ∗ )VH (R, ρ)(idK ⊗ v) = VH (v ∗ Rv, v ∗ ρv) . 2. For any space with measure ( , µ) and for any measurable Hilbert-space field {H(λ)}λ∈ and for any two measurable closed-operator fields {R(λ)}λ∈ and {ρ(λ)}λ∈ , such that (R(λ), ρ(λ)) ∈ NH(λ) we have ⊕ ⊕ ⊕ VH(λ) (R(λ), ρ(λ))dµ(λ) = V ⊕ H(λ)dµ(λ) R(λ)dµ(λ), ρ(λ)dµ(λ) .
3. For any ((R, ρ), (S, σ )) ∈ RNH we have VH (R, ρ)VH (S, σ ) = VH ((R, ρ) N (S, σ )) .
(37)
If Conditions 1,2, and 3 are satisfied, then we call V a unitary representation of (N, N ) acting on Hilbert space K. Now we shall derive a formula for all unitary representations of (N, N ) on a Hilbert space K. In what follows we omit the subscript H in VH . Theorem 2.6. A map
V : NH −→ Unit(K ⊗ H)
is a unitary representation of (N, N ) on a Hilbert space K if and only if there exists a pair (M, µ) ∈ NK such that for any (R, ρ) ∈ NH we have V(R, ρ) = F(M ⊗ R, (µ ⊗ ρ)χ(M ⊗ R < 0)).

Proof. ⇐ Observe that if (R, ρ) ∈ NH and (M, µ) ∈ NK then also (M ⊗ R, µ ⊗ ρ) ∈ NK⊗H. Therefore we can apply Theorem 2.3, which is our claim.

⇒ We will follow the proof of Theorem 4.2 of [16]. We will start by presenting an outline of our reasoning. We will first show that if V is a unitary representation of (N, N ), then V(r, ρ)V(s, σ) = V(s, σ)V(r, ρ) for any r, s ∈ R \ {0} and ρ, σ ∈ {−1, 1}. Then we will find a formula for all unitary representations of (N, N ) acting on C. We will complete the proof using the spectral decomposition theorem. Our proof starts with the observation that since R and ρ commute and Sp ρ = {−1, 1}, the function V may be written as $V(R,\rho) = V_1(R) + (I_K\otimes\rho)\,V_2(R),$
(38)
where
$$V_1(R) = \tfrac{1}{2}\,\big(V(R,1) + V(R,-1)\big) \quad\text{and}\quad V_2(R) = \tfrac{1}{2}\,\big(V(R,1) - V(R,-1)\big).$$
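At the level of scalars the decomposition (38) is simply the split of a function of ρ ∈ {−1, 1} into its even and odd parts. A short Python sketch, with a hypothetical unit-modulus test function `sample_V` standing in for V (it is not the paper's F), confirms that V(r, ρ) = V1(r) + ρV2(r) for both values of ρ:

```python
import cmath

def split_even_odd(V, r):
    """Return (V1(r), V2(r)) with V1 = (V(r,1)+V(r,-1))/2 and V2 = (V(r,1)-V(r,-1))/2."""
    return (V(r, 1) + V(r, -1)) / 2, (V(r, 1) - V(r, -1)) / 2

def sample_V(r, rho):
    """Hypothetical unit-modulus test function; not the quantum exponential function."""
    return cmath.exp(1j * (r + 0.3 * rho * r * r))

def decomposition_holds(V, points, tol=1e-12):
    """Check V(r, rho) == V1(r) + rho * V2(r) for rho = +1 and rho = -1."""
    for r in points:
        v1, v2 = split_even_odd(V, r)
        for rho in (1, -1):
            if abs(V(r, rho) - (v1 + rho * v2)) > tol:
                return False
    return True

print(decomposition_holds(sample_V, [-2.0, -0.5, 0.7, 3.1]))
```

The operator statement follows from the same computation applied on the two spectral subspaces of ρ.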
We will now use the following lemma.

Lemma 2.7. For any r, s ∈ R and ρ, σ ∈ {−1, 1} we have
$$V(r,\rho)V(s,\sigma) = V(s,\sigma)V(r,\rho). \tag{39}$$
Moreover,
$$V_2(r)V_2(-s) = 0 = V_2(-s)V_2(r). \tag{40}$$

The proof of this lemma can be found in Appendix A. We stress that satisfying (39) and (40) is a necessary, but not sufficient condition for V to be a representation of (N, N ). Now we proceed toward finding a formula for all unitary representations of (N, N ) acting on C. Let us first solve Eq. (40). There are 3 cases:
1. For any r ∈ R+, V2(r) = 0 and V2(−r) ≠ 0. Then
$$V(r,\rho) = \begin{cases} V_1(r) & \text{for } r>0,\\ V_1(r) + \rho V_2(r) & \text{for } r<0, \end{cases}$$
so V(R, ρ) = V1(R) + ρχ(R < 0)V2(R) = V(R, ρχ(R < 0)) = V(ϕ(R, ρ)). Similarly V(S, σ) = V1(S) + σχ(S < 0)V2(S) = V(S, σχ(S < 0)) = V(ϕ(S, σ)). Moreover, since V is a representation of (N, N ), V(ϕ(R, ρ))V(ϕ(S, σ)) = V(ϕ((R, ρ) N (S, σ))). Hence by Theorem 2.3, V(R, ρ) = F(MR, µρχ(R < 0)), where M ≥ 0 and µ = ±1.
2. V2(−r) = 0 and V2(r) ≠ 0. In the same manner as in the previous case we conclude that V(R, ρ) = F(−MR, µρχ(R > 0)) = F(M̃R, µρχ(M̃R < 0)), where M̃ = −M ≤ 0 and µ = ±1.
3. V2(r) = 0 and V2(−r) = 0. In this case V(R, ρ) = V1(R) and V(S, σ) = V1(S). Note that
$$V_1(R)V_1(S) \neq V_1\!\left([R+S]_{\mu\chi(e^{i\hbar/2}S^{-1}R<0)}\right),$$
since $[R+S]_{\mu\chi(e^{i\hbar/2}S^{-1}R<0)}$ depends on ρ and on σ, while the left hand side does not. Therefore we conclude that V = V1 is not a representation of (N, N ).

Thus we have proved that all unitary representations of (N, N ) acting on the Hilbert space C are V(R, ρ) = F(MR, µρχ(MR < 0)), where M ∈ R and µ = ±1. Now we turn to the case of representations of (N, N ) acting on an arbitrary Hilbert space K. Note that if dim K = k < ∞, then from commutation of the unitary operators V(r, ρ) and V(s, σ), the existence of an orthonormal basis diagonalizing the matrices V(r, ρ) and V(s, σ) for any r, s ∈ R and ρ, σ ∈ {−1, 1} follows. Thus the problem reduces to finding solutions of k scalar equations Vo(R, ρ)Vo(S, σ) = Vo((R, ρ) N (S, σ)), where the complex-valued function Vo is defined on R × {−1, 1}. The same conclusion can be drawn for an arbitrary separable Hilbert space K. The reason is that operators V(r, ρ) and V(s, σ) belong to a commutative *-subalgebra of B(K). Therefore, by the
spectral theorem and its consequences [3, Chap. X], the operators V(r, ρ) and V(s, σ) have the same spectral measure:
$$V(r,\rho) = \int_{0}^{2\pi} V_o(r,\rho,t)\, dE_K(t) \quad\text{and}\quad V(s,\sigma) = \int_{0}^{2\pi} V_o(s,\sigma,t)\, dE_K(t).$$
Hence
$$V(R,\rho) = \int_{\mathbb{R}\times\{-1,1\}}\int_{0}^{2\pi} V_o(r,\rho,t)\, dE_K(t)\otimes dE_{R,\rho}(z)$$
and
$$V(S,\sigma) = \int_{\mathbb{R}\times\{-1,1\}}\int_{0}^{2\pi} V_o(s,\sigma,t)\, dE_K(t)\otimes dE_{S,\sigma}(z),$$
where dER,ρ is the joint spectral measure of strongly commuting operators R and ρ. Thus, Theorem 2.6 reduces to the proved above result that all unitary representations of (N, N ) acting on C are of the form V (R, ρ) = F (MR, µρχ (MR < 0)) , where M ∈ R and µ = ±1. 3. The Operator Domain M The main goal of this paper is to find all unitary representations of the operator domain M, which will be introduced below. This operator domain is very close to the operator domain corresponding to the quantum ‘ax + b’ group. To emphasize this we use the same letters b and β, which denote the operators generating3 the quantum ’ax + b’ group in [19]. Theorem 3.4 proved in this section, will be crucial for our next paper [12]. For a given Hilbert space H, let us define the MH by MH = {(b, β) ∈ C(H)2 |b = b∗ , β = β ∗ , βb = −bβ, β 2 = χ (b = 0)} and the set RMH by RMH = {((b, β), (d, δ)) |(b, β), (d, δ) ∈ MH
× b d, bδ = δb, dβ = βd, βδ = δβ} . Let us assume that ker d = {0} and let us introduce f = e 2 d −1 b i
and
φ = ±βδχ (e 2 d −1 b < 0) i
and [b + d]φ = F (f, φ)∗ dF (f, φ)
and
δ˜ = F (f, φ)∗ δF (f, φ) .
Corollary 3.1. The map M , defined for every ((b, β), (d, δ)) ∈ RMH , such that ker d = {0}, by ˜ (b, β) M (d, δ) = ([b + d]φ , δ) is a well defined operator map from RMH into MH . 3
In fact, generators are b and ibβ.
(41)
˜ ∈ MH . We first show that [b + d]φ is a Proof. It is enough to prove that ([b + d]φ , δ) i
selfadjoint operator. The operator φ is a reflection operator corresponding to e 2 d −1 b, i since φ is selfadjoint, φ 2 = χ (e 2 d −1 b < 0) and φ anticommutes with b + d. The i operator b + d is selfadjoint, if e 2 bd ≥ 0 (see Theorem 5.4 [18]). Hence [b + d]φ = i
F (f, φ)∗ dF (f, φ) is a selfadjoint extension of the sum of selfadjoint operators e 2 f d and d (see Theorem 6.1 and 4.1 [18]), corresponding to the reflection operator φ. We prove now that δ˜ is a selfadjoint operator. From δ˜ = F (f, φ)∗ δF (f, φ) it follows that δ˜ is selfadjoint, because δ˜ is unitary equivalent to the selfadjoint operator ˜ because d commutes with δ. In order δ. Moreover, note that [b + d]φ commutes with δ, ˜ ∈ MH , we have to check that to prove that ([b + d]φ , δ) δ˜2 = χ ([b + d]φ = 0) . To this end we compute δ˜2 = F (f, φ)∗ δF (f, φ)F (f, φ)∗ δF (f, φ) = F (f, φ)∗ δ 2 F (f, φ) = F (f, φ)∗ χ (d = 0)F (f, φ) = χ (F (f, φ)∗ dF (f, φ) = 0) = χ ([b + d]φ = 0). The operation M is associative, in the same sense as N is (see the explanation following formula (10)).
3.1. The matrix representation of M. Let ker b = ker d = {0}. Since b and d satisfy the Zakrzewski relation, it follows that b and d commute with (sign b) and (sign d). Therefore, we may introduce the following notation:
$$\begin{aligned}
H_{++} &= H(b>0)\cap H(d>0),\\
H_{+-} &= H(b>0)\cap H(d<0),\\
H_{-+} &= H(b<0)\cap H(d>0),\\
H_{--} &= H(b<0)\cap H(d<0).
\end{aligned}$$
Then H = H++ ⊕ H+− ⊕ H−+ ⊕ H−−. Every vector ψ ∈ H can be represented as
$$\psi = \begin{pmatrix}\psi_{++}\\ \psi_{+-}\\ \psi_{-+}\\ \psi_{--}\end{pmatrix},$$
where ψ++ ∈ H++, ψ+− ∈ H+−, ψ−+ ∈ H−+ and ψ−− ∈ H−−. Consequently, operators acting on H will be represented by 4 × 4 matrices. From (41), it follows that the maps β : H++ → H−+ and β : H−+ → H++ and β : H−− → H+− and β : H+− → H−− are mutually inverse. Similarly by (41) we obtain that the maps δ : H++ → H+− and δ : H+− → H++ and δ : H−− → H−+ and δ : H−+ → H−− are mutually inverse. Hence the Hilbert spaces H++, H−+, H+− and H−− are unitarily equivalent. In what follows, we assume for simplicity that H++ = H−+ = H+− = H−− and denote this Hilbert space by H+. With this notation we have the following representations of β and δ:
$$\beta = \begin{pmatrix} 0 & 0 & I & 0\\ 0 & 0 & 0 & I\\ I & 0 & 0 & 0\\ 0 & I & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\delta = \begin{pmatrix} 0 & I & 0 & 0\\ I & 0 & 0 & 0\\ 0 & 0 & 0 & I\\ 0 & 0 & I & 0 \end{pmatrix}.$$
Since b anticommutes with β and commutes with δ and since d anticommutes with δ and commutes with β, they will be represented as follows:
$$b = \begin{pmatrix} b_+ & 0 & 0 & 0\\ 0 & b_+ & 0 & 0\\ 0 & 0 & -b_+ & 0\\ 0 & 0 & 0 & -b_+ \end{pmatrix}
\quad\text{and}\quad
d = \begin{pmatrix} d_+ & 0 & 0 & 0\\ 0 & -d_+ & 0 & 0\\ 0 & 0 & d_+ & 0\\ 0 & 0 & 0 & -d_+ \end{pmatrix},$$
where b+ and d+ are restrictions of b and d to H+. It is easily seen that b+ and d+ are selfadjoint and strictly positive and that they satisfy the Zakrzewski relation. Moreover,
$$f = e^{i\hbar/2} d^{-1} b = \begin{pmatrix} f_+ & 0 & 0 & 0\\ 0 & -f_+ & 0 & 0\\ 0 & 0 & -f_+ & 0\\ 0 & 0 & 0 & f_+ \end{pmatrix},$$
where $f_+ = e^{i\hbar/2} d_+^{-1} b_+$ is selfadjoint and strictly positive. Hence
$$\phi = (-1)^k \beta\delta\,\chi\!\left(e^{i\hbar/2} d^{-1} b < 0\right) = (-1)^k \beta\delta\,\chi(f<0) = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & (-1)^k I & 0\\ 0 & (-1)^k I & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix},$$
where k ∈ N. 3.2. From N to M. As we stated in the introduction, the operator domain N is auxiliary, and what we are really interested in are unitary representations of (M, M ). However, (N, N ) has been easier to work with, because it is commutative, so we found a formula for all unitary representations of it. It would be very convenient now to have an operator map from N into M, which would allow us to “transfer” the results obtained for (N, N ) to (M, M ). Such an operator map is constructed below. Proposition 3.2. Let ((R, ρ), (S, σ )) ∈ RNH . Define a map ϕ : NH → MC2 ⊗H for any (R, ρ) ∈ NH by
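$$\varphi(R,\rho) = \left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right).$$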
Then φ is an operator map and for any ((R, ρ), (S, σ )) ∈ RNH we have ϕ (R, ρ) NH (S, σ ) = ϕ(R, ρ) MC2 ⊗H ϕ(S, σ ).
Proof. We first prove that
$$\left[ \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix} + \begin{pmatrix} S & 0\\ 0 & -S \end{pmatrix} \right]_{\phi} = \begin{pmatrix} [R+S]_\tau & 0\\ 0 & -[R+S]_\tau \end{pmatrix}.$$
Since selfadjoint extensions are determined uniquely by reflection operators, it is enough to check that $I_{\mathbb{C}^2}\otimes\tau = \phi$. We already know that
$$\phi = (-1)^k \beta\delta = (-1)^k \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix}\begin{pmatrix} 0 & \sigma\\ \sigma & 0 \end{pmatrix} = (-1)^k \begin{pmatrix} \rho\sigma & 0\\ 0 & \rho\sigma \end{pmatrix}$$
and that τ = (−1)^k ρσ. Hence
$$\phi = \begin{pmatrix} \tau & 0\\ 0 & \tau \end{pmatrix}.$$
3.3. Unitary representations of M. Before we formulate our main result, we need to prove the following proposition 2 . For any (b, β) ∈ M Proposition 3.3. Let (g, γ ) ∈ MK and ((b, β), (d, δ)) ∈ MH H define a map ϕ(g,γ ) by
ϕ(g,γ ) : MH (b, β) → (g ⊗ b, γ ⊗ β) ∈ NK⊗H . Then ϕ(g,γ ) ((b, β) M (d, δ)) = ϕ(g,γ ) (b, β) N ϕ(g,γ ) (d, δ) . Proof. We first have to check that the selfadjoint extensions on both sides of the above formula are the same, i.e. that g ⊗ [b + d]φ equals [g ⊗ b + g ⊗ d]τ . Since [g ⊗ b + g ⊗ d]τ equals g ⊗ [b + d]τ |H and selfadjoint extensions are given uniquely by reflection operators, it is enough to check that φ = τ |H . We know that φ = (−1)k βδχ (e 2 d −1 b < 0) i
and τ = (−1)k (γ ⊗ β)(γ ⊗ δ)χ (e 2 (g ⊗ b)−1 (g ⊗ d) < 0) i
= (−1)k (γ 2 ⊗ βδ)χ (I ⊗ e 2 b−1 d < 0) . i
Since γ 2 = χ (g = 0), it follows that τ = (−1)k (I ⊗ βδ)χ (I ⊗ e 2 b−1 d < 0) . i
Obviously i
τ |H = (−1)k βδχ (e 2 db < 0) = φ, so the selfadjoint operators determined by τ and φ are also the same. Next, we have to prove that ˜ . (γ ⊗ δ) = (γ ⊗ δ) By the definition of the left-hand side, LH S = F (IK ⊗ f, IK ⊗ φ)∗ (γ ⊗ δ)F (IK ⊗ f, IK ⊗ φ) = (IK ⊗ F (f, φ))∗ (γ ⊗ δ) (IK ⊗ F (f, φ)) = RH S . This completes the proof of Proposition 3.3.
We can now formulate our main result. Theorem 3.4. An operator map U is a unitary representation of (M, M ) acting on a Hilbert space K, if and only if there exists such (g, γ ) ∈ MK that for any (b, β) ∈ MH : U (b, β) = F (g ⊗ b, (γ ⊗ β)χ (g ⊗ b < 0)).
(42)
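Proof. We first show that if U is a representation of (M, M ), it has the form of (42). Let ϕ : NH → $M_{\mathbb{C}^2\otimes H}$ be the operator map considered before. Let (b, β) denote the following element from $M_{\mathbb{C}^2\otimes H}$:
$$(b,\beta) := \varphi(R,\rho) = \left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right).$$
Let V(R, ρ) := U(b, β) = U(ϕ(R, ρ)). Since U is a unitary representation of (M, M ), it follows by Proposition 3.2 that V is a unitary representation of (N, N ) on K ⊗ C². Next, by Theorem 2.6 we get
$$U\!\left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right) = F\big(M\otimes R,\ (\mu\otimes\rho)\,\chi(M\otimes R<0)\big), \tag{43}$$
where (M, µ) ∈ $N_{K\otimes\mathbb{C}^2}$. We find the conditions for (M, µ) ∈ $N_{\mathbb{C}^2\otimes K}$ under which (43) holds. We know that U, ϕ and V are operator maps. Let a unitary operator Û be given by
$$\hat U := I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \otimes I_H.$$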
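Let us consider the conjugate action of Û on the left-hand side of (43):
$$\left( I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \otimes I_H \right)^{*} U\!\left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right) \left( I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \otimes I_H \right) = U\!\left( \begin{pmatrix} -R & 0\\ 0 & R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right).$$
Hence
$$\hat U^{*} F(M\otimes R,\ \mu\otimes\rho)\,\hat U = F(-M\otimes R,\ \mu\otimes\rho). \tag{44}$$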
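Observe that for any t ∈ R \ {0}, the pair (tR, ρ) belongs to NH whenever (R, ρ) does. Therefore we may write tR instead of R in (44). The function F is not injective, however the family F (t·) separates points of real . Therefore from (44) it follows that
$$\left( I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \right) M \left( I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \right) = -M
\quad\text{and}\quad
\left( I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \right) \mu \left( I_K\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \right) = \mu.$$
Let us introduce another unitary operator
$$\hat V := I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \otimes I_H.$$
Considering the conjugate action of V̂ on the left-hand side of (43), we obtain:
$$\left( I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \otimes I_H \right)^{*} U\!\left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right) \left( I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \otimes I_H \right) = U\!\left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & -\rho\\ -\rho & 0 \end{pmatrix} \right).$$
Hence $\hat V^{*} F(M\otimes R,\ \mu\otimes\rho)\,\hat V = F(M\otimes R,\ -\mu\otimes\rho)$, so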
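$$\left( I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \right) M \left( I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \right) = M
\quad\text{and}\quad
\left( I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \right) \mu \left( I_K\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \right) = -\mu.$$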
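The four conjugation relations obtained so far pin down the C² factor of M and µ. The Python sketch below checks this on 2 × 2 matrices with entries in {−1, 0, 1}, a deliberately small search space in which the operator coefficients acting on K are replaced by scalars (an illustrative assumption; the helper names are hypothetical):

```python
from itertools import product

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def conjugate(p, m):
    """Return p m p for a selfadjoint unitary p (so that p* = p)."""
    return matmul(matmul(p, m), p)

def scaled(c, m):
    """Multiply every entry of m by the scalar c."""
    return [[c * entry for entry in row] for row in m]

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def solutions(sign_under_x, sign_under_z):
    """All 2x2 matrices with entries in {-1, 0, 1} obeying both conjugation rules."""
    found = []
    for a, b, c, d in product((-1, 0, 1), repeat=4):
        m = [[a, b], [c, d]]
        if (conjugate(SIGMA_X, m) == scaled(sign_under_x, m)
                and conjugate(SIGMA_Z, m) == scaled(sign_under_z, m)):
            found.append(m)
    return found

M_like = solutions(-1, +1)    # changes sign under sigma_x, invariant under sigma_z
mu_like = solutions(+1, -1)   # invariant under sigma_x, changes sign under sigma_z
print(M_like == [scaled(c, SIGMA_Z) for c in (-1, 0, 1)])
print(mu_like == [scaled(c, SIGMA_X) for c in (-1, 0, 1)])
```

Within this search space the only solutions are scalar multiples of diag(1, −1) for the first pair of relations and of the flip matrix for the second pair.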
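Therefore M and µ have the form
$$M = g\otimes \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \quad\text{and}\quad \mu = \gamma\otimes \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix},$$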
where both g and γ are operators acting on the Hilbert space K. Note that since (M, µ) ∈ $N_{K\otimes\mathbb{C}^2}$, we see that g, γ are selfadjoint, g anticommutes with γ and γ² = χ(g ≠ 0). It means that (g, γ) ∈ MK. Hence
$$U\!\left( \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right) = F\!\left( g\otimes \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix},\ \left( \gamma\otimes \begin{pmatrix} 0 & \rho\\ \rho & 0 \end{pmatrix} \right) \chi\!\left( g\otimes \begin{pmatrix} R & 0\\ 0 & -R \end{pmatrix} < 0 \right) \right),$$
so U(b, β) = F(g ⊗ b, (γ ⊗ β)χ(g ⊗ b < 0)), where (g, γ) ∈ MK. We have proved that every unitary representation of (M, M ) has the form (42). Moreover, by Proposition 3.3, every operator function given by (42) is a unitary representation of (M, M ). This finishes the proof.

This result will prove extremely useful in our next paper [12], where we find all unitary representations of the quantum ‘ax + b’ group. In that paper, we will also show that the quantum ‘ax + b’ group coincides with its Pontryagin dual. On the other hand, the formula (34) will be used in our other forthcoming paper [11], where we will study unitary representations of the quantum ‘az + b’ group.

Appendix A. Proof of Lemma 2.7

Lemma 2.7. For any r, s ∈ R and ρ, σ ∈ {−1, 1},
$$V(r,\rho)V(s,\sigma) = V(s,\sigma)V(r,\rho). \tag{39}$$
Moreover,
$$V_2(r)V_2(-s) = 0 = V_2(-s)V_2(r). \tag{40}$$
Proof. To prove this lemma, we are going to compute matrix elements of both sides of the equation

V(R, ρ)V(S, σ) = V((R, ρ) N (S, σ)),   (41)

which has to be satisfied because V is a unitary representation of (N, N ) (see Definition 2.5, Point 3). From the definition of V1 and V2 (see Eq. (38)), it follows that
$$V(R,\rho) = \begin{pmatrix} V_1(R_+) & V_2(R_+) & 0 & 0\\ V_2(R_+) & V_1(R_+) & 0 & 0\\ 0 & 0 & V_1(-R_+) & V_2(-R_+)\\ 0 & 0 & V_2(-R_+) & V_1(-R_+) \end{pmatrix}$$
and
$$V(S,\sigma) = \begin{pmatrix} V_1(S_+) & 0 & V_2(S_+) & 0\\ 0 & V_1(-S_+) & 0 & V_2(-S_+)\\ V_2(S_+) & 0 & V_1(S_+) & 0\\ 0 & V_2(-S_+) & 0 & V_1(-S_+) \end{pmatrix}.$$
Hence, the left-hand side of (41) is given by
$$V(R,\rho)V(S,\sigma) = \begin{pmatrix}
V_1(R_+)V_1(S_+) & V_2(R_+)V_1(-S_+) & V_1(R_+)V_2(S_+) & V_2(R_+)V_2(-S_+)\\
V_2(R_+)V_1(S_+) & V_1(R_+)V_1(-S_+) & V_2(R_+)V_2(S_+) & V_1(R_+)V_2(-S_+)\\
V_1(-R_+)V_2(S_+) & V_2(-R_+)V_2(-S_+) & V_1(-R_+)V_1(S_+) & V_2(-R_+)V_1(-S_+)\\
V_2(-R_+)V_2(S_+) & V_1(-R_+)V_2(-S_+) & V_2(-R_+)V_1(S_+) & V_1(-R_+)V_1(-S_+)
\end{pmatrix}.$$
Since V((R, ρ) N (S, σ)) = F(T, τχ(T < 0))* V(S, σ) F(T, τχ(T < 0)), we have to compute
$$F(T,\tau\chi(T<0)) = \begin{pmatrix}
V_\theta(\log T_+) & 0 & 0 & 0\\
0 & V_\theta(\log T_+ - i\pi I) & i(-1)^k T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I) & 0\\
0 & i(-1)^k T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I) & V_\theta(\log T_+ - i\pi I) & 0\\
0 & 0 & 0 & V_\theta(\log T_+)
\end{pmatrix}.$$
Therefore, the right-hand side of (41) is given by
$$X := V((R,\rho)\ N\ (S,\sigma)) = \begin{pmatrix}
X(1,1) & X(1,2) & X(1,3) & 0\\
X(2,1) & X(2,2) & X(2,3) & X(2,4)\\
X(3,1) & X(3,2) & X(3,3) & X(3,4)\\
0 & X(4,2) & X(4,3) & X(4,4)
\end{pmatrix},$$
where
$$\begin{aligned}
X(1,1) &= V_\theta(\log T_+)^{*}\, V_1(S_+)\, V_\theta(\log T_+),\\
X(1,2) &= i(-1)^k\, V_\theta(\log T_+)^{*}\, V_2(S_+)\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I),\\
X(1,3) &= V_\theta(\log T_+)^{*}\, V_2(S_+)\, V_\theta(\log T_+ - i\pi I),\\
X(2,1) &= -i(-1)^k\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)^{*}\, V_2(S_+)\, V_\theta(\log T_+),\\
X(2,2) &= V_\theta(\log T_+ - i\pi I)^{*}\, V_1(-S_+)\, V_\theta(\log T_+ - i\pi I)\\
&\quad + T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)^{*}\, V_1(S_+)\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I),\\
X(2,3) &= i(-1)^k\, V_\theta(\log T_+ - i\pi I)^{*}\, V_1(-S_+)\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)\\
&\quad - i(-1)^k\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)^{*}\, V_1(S_+)\, V_\theta(\log T_+ - i\pi I),\\
X(2,4) &= V_\theta(\log T_+ - i\pi I)^{*}\, V_2(-S_+)\, V_\theta(\log T_+),\\
X(3,1) &= V_\theta(\log T_+ - i\pi I)^{*}\, V_2(S_+)\, V_\theta(\log T_+),\\
X(3,2) &= -i(-1)^k\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)^{*}\, V_1(-S_+)\, V_\theta(\log T_+ - i\pi I)\\
&\quad + i(-1)^k\, V_\theta(\log T_+ - i\pi I)^{*}\, V_1(S_+)\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I),\\
X(3,3) &= T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)^{*}\, V_1(-S_+)\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)\\
&\quad + V_\theta(\log T_+ - i\pi I)^{*}\, V_1(S_+)\, V_\theta(\log T_+ - i\pi I),\\
X(3,4) &= -i(-1)^k\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I)^{*}\, V_2(-S_+)\, V_\theta(\log T_+),\\
X(4,2) &= V_\theta(\log T_+)^{*}\, V_2(-S_+)\, V_\theta(\log T_+ - i\pi I),\\
X(4,3) &= i(-1)^k\, V_\theta(\log T_+)^{*}\, V_2(-S_+)\, T_+^{\pi/\hbar}\, V_\theta(\log T_+ - i\pi I),\\
X(4,4) &= V_\theta(\log T_+)^{*}\, V_1(-S_+)\, V_\theta(\log T_+).
\end{aligned}$$
In order to prove our thesis, we will prove that for r, s ∈ R+ , all the following 10 formulas hold:
V1(r)V1(s) = V1(s)V1(r),   (42)
V1(−r)V1(−s) = V1(−s)V1(−r),   (43)
V1(r)V1(−s) = V1(−s)V1(r),   (44)
V2(r)V2(s) = V2(s)V2(r),   (45)
V2(−r)V2(−s) = V2(−s)V2(−r),   (46)
V2(r)V2(−s) = 0 = V2(−s)V2(r),   (47)
V1(r)V2(s) = V2(s)V1(r),   (48)
V1(−r)V2(−s) = V2(−s)V1(−r),   (49)
V1(r)V2(−s) = V2(−s)V1(r),   (50)
V1(−r)V2(s) = V2(s)V1(−r).   (51)
In order to prove this, we will compute certain matrix elements of both sides of (41) and will use the previously derived formulas (27), (29) and (35) and (36). First we prove the formula (42). Compute r |V1 (R+ )V1 (S+ )s = r |X(1, 1)s = r |Vθ (log T+ )∗ V1 (S+ ) Vθ (log T+ )s . Hence, V1 (r)V1 (s) = r |s −1 r |Vθ (log t)∗ t t |s˜ V1 (˜s )s˜ |t˜t˜|Vθ (log t)s . Applying (29) we get V1 (r)V1 (s) = V1 (˜s )r |s −1 r |Vθ (log T+ )∗ s˜ s˜ |Vθ (log T+ )s = c e−i
log2 s˜
V1 (˜s )r |s −1 s˜ |Vθ (log T+ )r s˜ |Vθ (log T+ )s .
Note that s˜ |Vθ (log T+ )|r s˜ |Vθ (log T+ )s is symmetric with respect to swapping r and s. By (27) the same holds for r |s , and the remaining terms of the above formula depend on neither r nor s. Therefore V1 (r)V1 (s) = V1 (s)V1 (r), i.e. (42) holds. In the same manner one can prove formulas (43) and (45) and (46). The proof of the formula (44) is slightly different. Compute r |V1 (R+ )V1 (−S+ )s = r |X(2, 2)s = r | Vθ (log T+ − iπ I )∗ V1 (−S+ ) Vθ (log T+ − iπ I ) s π
π
+ r |T+ Vθ (log T+ − iπ I )∗ V1 (S+ ) T+ × Vθ (log T+ − iπ I )s .
Exponential Equations Related to Quantum ‘ax + b’ Group
Hence V1 (r)V1 (−s) = r |s −1 r |Vθ (log t − iπ I )∗ t t |s˜ V1 (−˜s ) × |s˜ |t˜t˜|Vθ (log t˜ − iπ I )s + r |t Vθ (log t − iπ I )∗ |t π × t |s˜ V1 (˜s )|s˜ |V1 (S+ )t˜t˜|t˜ Vθ (log t˜ − iπ )s = r |s −1 V1 (−˜s )r | Vθ (log T+ − iπ I )∗ s˜ s˜ | π
π
× Vθ (log T+ − iπ I )s + V1 (˜s )r |T+ Vθ (log T+ − iπ I )∗ s˜ π × s˜ |T+ Vθ (log T+ − iπ I )s . Therefore, by (35) and (36) we obtain log2 s˜
V1 (r)V1 (−s) = i(−1)r c e−i r |s −1 π × V1 (−˜s )s˜ |T+ Vθ (log T+ − iπ I )r s˜ |Vθ (log T+ − iπ I )s π + V1 (˜s )s˜ |Vθ (log T+ − iπ I )r s˜ |t Vθ (log T+ − iπ I )s . (52) On the other hand, s |V1 (−R+ )V1 (S+ )r = s |X(3, 3)r π
π
= s |T+ Vθ (log T+ − iπ I )∗ V1 (−s) T+ × Vθ (log T+ − iπ I ) r +s | Vθ (log T+ − iπ I )∗ V1 (s) Vθ (log T+ − iπ I )r . Hence, π V1 (−s)V1 (r) = s |r −1 s |t Vθ (log t − iπ )∗ t t |s˜ s˜ |V1 (−s)t˜ × t˜|t˜ Vθ (log t˜ − iπ )r + s |Vθ (log t − iπ I )∗ t t |s˜
× s˜ |V1 (s)t˜t˜|Vθ (log t˜ − iπ )r π = s |r −1 V1 (−˜s )s |T+ Vθ (log T+ − iπ I )∗ s˜ π
π
× s˜ |T+ Vθ (log T+ − iπ I )r
+ V1 (˜s )s | Vθ (log T+ − iπ I )∗ s˜ s˜ |Vθ (log T+ − iπ I )r . Therefore, by (35) and (36) log2 s˜
V1 (−s)V1 (r) = i(−1)k c e−i s |r −1 π × V1 (˜s )s˜ |Vθ (log T+ − iπ I )r s˜ |T+ Vθ (log T+ − iπ I )s
π + V1 (−˜s )s˜ |T+ Vθ (log T+ − iπ I )r s˜ |Vθ (log T+ − iπ I )s . (53)
Applying (27) we see that (52) and (53) are the same. Thus $V_1(r)V_1(-s) = V_1(-s)V_1(r)$, so (44) also holds. The proofs of formulas (48), (49), (50) and (51) are exactly the same as that of (44), so we omit them. In order to prove (47) one has to observe additionally that $X(1,4) = X(4,1) = 0$. Thus we have proved formulas (42) to (51). Adding and subtracting (42) and (48) yields
\[ V_1(r)\{V_1(s) + V_2(s)\} = V_1(r)V(s,1) = \{V_1(s) + V_2(s)\}V_1(r) = V(s,1)V_1(r). \]
Hence
\[ V_1(r)V(s,1) = V(s,1)V_1(r) \quad\text{and}\quad V_1(r)V(s,-1) = V(s,-1)V_1(r). \tag{54} \]
In the same manner we can see that V (r, )V (s, σ ) = V (s, σ )V (r, ) for any r, s ∈ R \ {0} and , σ ∈ {−1, 1}. Moreover, one can see that (40) holds: V2 (r)V2 (−s) = 0 = V2 (−s)V2 (r) . This completes the proof.
Acknowledgements. The author is greatly indebted to Professor Stanisław L. Woronowicz for stimulating discussions and important hints and comments. The author also wishes to thank Professor Wiesław Pusz, Professor Marek Bożejko and Dr. Andrzej Kudlicki for helpful suggestions. The author also gratefully acknowledges the many helpful suggestions of the Referee.
References
1. Connes, A.: Noncommutative Geometry. New York: Academic Press, 1995
2. Dixmier, J.: Les algèbres d'opérateurs dans l'espace Hilbertien. Paris: Gauthier-Villars, 1969
3. Dunford, N., Schwartz, J.T.: Linear Operators, Part II: Spectral Theory. New York-London: Interscience Publishers, 1963
4. Gadella, M., Gomez, F.: A Unified Mathematical Formalism for the Dirac Formulation of Quantum Mechanics. Found. Phys. 32, 815–869 (2002)
5. Gelfand, I.M., Shilov, G.E.: Generalized Functions: Spaces of Fundamental and Generalized Functions. New York: Academic Press, 1968
6. Kruszynski, P., Woronowicz, S.L.: A Non-commutative Gelfand-Naimark Theorem. J. Oper. Theor. 8, 361–389 (1982)
7. Maurin, K.: Methods of Hilbert spaces. Warszawa: PWN, 1967
8. Maurin, K.: General Eigenfunction Expansion and Unitary Representations of Topological Groups. Warszawa: PWN, 1968
9. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Part I. New York: Academic Press, 1975
10. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Part II. New York: Academic Press, 1975
11. Rowicka, M.: Quantum ‘az+b’ group at roots of unity: Unitary representations. math.QA/0108020
12. Rowicka, M.: Unitary representations of the quantum ‘ax+b’ group. math.QA/0102151
13. Rudin, W.: Functional Analysis. New York: McGraw-Hill, 1991
14. Woronowicz, S.L.: Operator systems and their application to the Tomita-Takesaki theory. J. Oper. Theor. 2, 169–209 (1979)
15. Woronowicz, S.L.: Duality in the C*-algebra theory. Warszawa, 1983
16. Woronowicz, S.L.: Operator Equalities Related to the Quantum E(2) Group. Commun. Math. Phys. 144, 417–428 (1992)
17. Woronowicz, S.L.: C*-algebras generated by unbounded elements. Rev. Math. Phys. 7, 481–521 (1995)
18. Woronowicz, S.L.: Quantum exponential function. Rev. Math. Phys. 12, 873–920 (2000)
19. Woronowicz, S.L., Zakrzewski, S.: Quantum ‘ax+b’ group. Rev. Math. Phys. 14, 797–828 (2002)

Communicated by L. Takhtajan
Commun. Math. Phys. 244, 455–481 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-1020-4
Communications in
Mathematical Physics
Superdiffusivity of Asymmetric Exclusion Process in Dimensions One and Two

C. Landim^{1,2}, J. Quastel^{3}, M. Salmhofer^{4,5}, H.-T. Yau^{6}

1 IMPA, Estrada Dona Castorina 110, CEP 22460 Rio de Janeiro, Brasil
2 CNRS UPRES-A 6085, Université de Rouen, 76128 Mont Saint Aignan, France. E-mail: [email protected]
3 Department of Mathematics and Statistics, University of Toronto, 100 St. George Street, Toronto, Canada M5S 3G3. E-mail: [email protected]
4 Max-Planck Institute for Mathematics, Inselstr. 22, 04103 Leipzig, Germany. E-mail: [email protected]
5 Theoretical Physics, University of Leipzig, Augustusplatz 10, 04109 Leipzig, Germany. E-mail: [email protected]
6 Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY 10012, USA. E-mail: [email protected]
Received: 13 February 2002 / Accepted: 4 November 2002 Published online: 16 December 2003 – © Springer-Verlag 2003
Abstract: We prove that the diffusion coefficient for the asymmetric exclusion process diverges at least as fast as $t^{1/4}$ in dimension $d = 1$ and $(\log t)^{1/2}$ in $d = 2$. The method applies to nearest and non-nearest neighbor asymmetric exclusion processes.

1. Introduction

The asymmetric exclusion process is a Markov process on $\{0,1\}^{\mathbb{Z}^d}$ consisting of interacting continuous time random walks with asymmetric jump rates. There is at most one particle allowed per site. A particle at a site $x$ waits for an exponential time and then jumps to $y$ provided that site is not occupied. Otherwise the jump is suppressed and the process starts again. The jump is attempted at rate $p(y - x)$. In this article the jump law $p(\cdot)$ is assumed to have a nonzero mean, so that there is transport of the system. Consider the system in equilibrium with a Bernoulli product measure of density $\rho$ as the invariant measure. Define the time dependent correlation function in equilibrium by
\[ S(x,t) = E_\rho\big[\{\eta_x(t) - \rho\}\{\eta_0(0) - \rho\}\big] = \langle \eta_x(t); \eta_0(0) \rangle, \]
where $E_\rho$ stands for the expectation on the path space corresponding to a process starting from a Bernoulli measure with density $\rho$. If we choose $\rho = 1/2$, there is no net global drift, i.e., $\sum_x x\,S(x,t) = 0$. Otherwise one needs to subtract a net drift, which complicates but does not change the results or methods. Our main question is the behaviour for large $t$ of the diffusion coefficient
\[ D(t) = \frac{1}{8t} \sum_{x \in \mathbb{Z}^d} x^2\, S(x,t). \]
In dimensions $d \ge 3$, the diffusion coefficient was proved to be bounded [9] for general asymmetric simple exclusion processes. Based on mode coupling theory,
van Beijeren, Kutner and Spohn [3] conjectured that $D(t) \sim (\log t)^{2/3}$ in dimension $d = 2$ and $D(t) \sim t^{1/3}$ in $d = 1$. Similar predictions were made in [7] for the Kardar-Parisi-Zhang equation in $d = 1$, which, when differentiated and appropriately discretized, yields the asymmetric exclusion process.

This problem has received much attention recently in the context of integrable systems. The main quantity analyzed there is the fluctuation of the current across the origin in $d = 1$ for the totally asymmetric simple exclusion process (only nearest neighbor jumps to the right), starting from the special initial configuration with all sites to the left of the origin occupied and all sites to the right of the origin empty. Johansson [6] observed that in this special situation the current across the origin can be mapped into a last passage percolation problem. By analyzing this problem asymptotically, Johansson proved that the variance of the current is of order $t^{2/3}$. In the case of discrete time, Baik and Rains [2] analyze an extended version of the last passage percolation problem and obtain fluctuations of order $t^{\alpha}$, where $\alpha = 1/3$ or $\alpha = 1/2$ depending on the parameters of the model. Both the approaches of [6] and [2] are related to the earlier results of Baik-Deift-Johansson [1] on the distribution of the length of the longest increasing subsequence in random permutations. In [10] (see also [11]), Prähofer and Spohn succeeded in mapping the current of the totally asymmetric simple exclusion process into a last passage percolation problem for a general class of initial data, including the equilibrium case considered in this article. For the discrete time case, the extended problem is closely related to the work [2], but the boundary conditions are different. For continuous time, besides the boundary condition issue, one needs to extend the result of [2] from the geometric to the exponential distribution.
To relate these results to our problem, the variance of the current across the origin is proportional to
\[ \sum_{x \in \mathbb{Z}^d} |x|\, S(x,t). \tag{1.1} \]
Therefore, Johansson's result on the variance of the current can be interpreted as the spreading of $S(x,t)$ being of order $t^{2/3}$. If we combine the work of [10] and [2], neglect various issues discussed above, and extrapolate to the second moment, we obtain a growth of the second moment as $t^{4/3}$, consistent with the conjectured $D(t) \sim t^{1/3}$.

The results based on integrable systems give not just the variance of the current across the origin, but also its limiting distribution. The main restrictions appear to be the rigid requirements on the fine details of the dynamics and on the initial data, the restriction to one space dimension, and the special quantities which can be analysed. In fact, even for the totally asymmetric simple exclusion process in $d = 1$ there was previously no proof that $D(t)$ diverges as $t \to \infty$. For general asymmetric exclusion processes all of these problems were completely open.

In this article, we present a method to study the diffusion coefficient of general asymmetric exclusion processes. Using this we obtain without too much work lower bounds $D(t) \ge (\log t)^{1/2}$ in dimension $d = 2$ and $D(t) \ge t^{1/4}$ in $d = 1$. We have restricted the proof to the case $\rho = 1/2$, but a similar proof works for all densities away from zero or one.

1.1. The model and main results.

Denote the particle configuration by $\eta = \{\eta_x\}_{x \in \mathbb{Z}^d}$, where $\eta_x$ is equal to 1 if site $x$ is occupied and is equal to 0 otherwise. Denote by $\eta^{x,y}$ the configuration obtained from $\eta$ by exchanging the occupation variables at $x$ and $y$:
\[ (\eta^{x,y})_z = \begin{cases} \eta_z & \text{if } z \neq x, y, \\ \eta_x & \text{if } z = y, \\ \eta_y & \text{if } z = x. \end{cases} \]
We assume that the jump law $p(\cdot)$ is local, $p(z) = 0$ for $|z| \ge L$ for some $L < \infty$, and that there is transport of mass, $\sum_z z\,p(z) \neq 0$. The generator $L$ of the asymmetric simple exclusion process is given by
\[ (Lf)(\eta) = \sum_{x,y \in \mathbb{Z}^d} p(y - x)\, \eta_x (1 - \eta_y)\, [f(\eta^{x,y}) - f(\eta)]. \]
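For each $\rho$ in $[0,1]$, denote by $\nu_\rho$ the Bernoulli product measure on $\{0,1\}^{\mathbb{Z}^d}$ with density $\rho$ and by $\langle \cdot, \cdot \rangle_\rho$ the inner product in $L^2(\nu_\rho)$. The probability measures $\nu_\rho$ are invariant for the process. For two cylinder functions $f$, $g$ and a density $\rho$, denote by $\langle f; g \rangle_\rho$ the covariance of $f$ and $g$ with respect to $\nu_\rho$:
\[ \langle f; g \rangle_\rho = \langle f g \rangle_\rho - \langle f \rangle_\rho \langle g \rangle_\rho. \]
Let $P_\rho$ denote the law of the asymmetric exclusion process starting from the equilibrium measure $\nu_\rho$. Expectation with respect to $P_\rho$ is denoted by $E_\rho$. Let
\[ S_\rho(x,t) = E_\rho\big[\{\eta_x(t) - \eta_x(0)\}\,\eta_0(0)\big] \]
denote the time dependent correlation functions in equilibrium with density $\rho$. Denote by $\chi$ the compressibility given by
\[ \chi = \chi(\rho) = \sum_{x \in \mathbb{Z}^d} \langle \eta_x; \eta_0 \rangle_\rho. \]
In our setting, $\chi(\rho) = \rho(1 - \rho)$. The bulk diffusion coefficient is defined by
\[ D_{i,j}(\rho,t) = \frac{1}{t}\,\frac{1}{2\chi} \Big\{ \sum_{x \in \mathbb{Z}^d} x_i x_j\, S_\rho(x,t) - \chi\,(v_i t)(v_j t) \Big\}, \]
where $v$ in $\mathbb{R}^d$ is the velocity defined by
\[ v t = \frac{1}{\chi} \sum_{x \in \mathbb{Z}^d} x\, E_\rho\big[\{\eta_x(t) - \eta_x(0)\}\,\eta_0(0)\big]. \]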
To simplify the notation we now specialize to the totally asymmetric simple exclusion process to the right in $d = 1$ and, in $d = 2$, to jumps only to the nearest neighbor to the right in the $x_1$ coordinate and jumps to both nearest neighbors in the $x_2$ coordinate with symmetric jump rule. More precisely, we take
\[ (Lf)(\eta) = \sum_{x \in \mathbb{Z}} \eta_x (1 - \eta_{x+1})\, [f(\eta^{x,x+1}) - f(\eta)] \]
in dimension 1 and
\[ (Lf)(\eta) = \sum_{x \in \mathbb{Z}^2} \Big\{ \eta_x [1 - \eta_{x+e_1}]\, [f(\eta^{x,x+e_1}) - f(\eta)] + \frac{1}{2}\, [f(\eta^{x,x+e_2}) - f(\eta)] \Big\} \]
in dimension 2. We have combined the symmetric jump in the $x_2$-axis into the last term. We emphasize that the result and method in this paper apply to all asymmetric exclusion processes; the special choice is made only to simplify the notation. The velocity of the totally asymmetric simple exclusion process is explicitly computed as $v = 1 - 2\rho$ in $d = 1$ and $v = 2(1 - 2\rho)e_1$ in $d = 2$. We further assume for simplicity that the density $\rho = 1/2$ so that the velocity is zero. Denote the instantaneous currents (i.e., the difference between the rate at which a particle jumps from $x$ to $x + e_i$ and the rate at which a particle jumps from $x + e_i$ to $x$) by $\tilde w_{x,x+e_i}$:
\[ \tilde w_{x,x+e_1} = \eta_x [1 - \eta_{x+e_1}], \qquad \tilde w_{x,x+e_2} = \frac{1}{2} [\eta_x - \eta_{x+e_2}]. \]
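The dimension-one generator and the invariance of the Bernoulli measures $\nu_\rho$ can be checked by brute force on a small lattice. The following is only a sketch: it assumes a periodic ring of a few sites in place of $\mathbb{Z}$, and the function names are hypothetical. It applies the generator to a local function and verifies that the $\nu_\rho$-expectation of $Lf$ vanishes exactly.

```python
from fractions import Fraction
from itertools import product


def tasep_generator(f, eta):
    """(Lf)(eta) = sum_x eta_x (1 - eta_{x+1}) [f(eta^{x,x+1}) - f(eta)] on a ring."""
    n = len(eta)
    total = Fraction(0)
    for x in range(n):
        y = (x + 1) % n  # periodic boundary: an assumption of this sketch
        if eta[x] == 1 and eta[y] == 0:
            swapped = list(eta)
            swapped[x], swapped[y] = eta[y], eta[x]
            total += f(tuple(swapped)) - f(eta)
    return total


def bernoulli_weight(eta, rho):
    """Probability of the configuration eta under the product measure nu_rho."""
    occupied = sum(eta)
    return rho ** occupied * (1 - rho) ** (len(eta) - occupied)


def stationary_mean_of_Lf(f, n_sites, rho):
    """Exact nu_rho-expectation of Lf, obtained by enumerating all configurations."""
    return sum(
        bernoulli_weight(eta, rho) * tasep_generator(f, eta)
        for eta in product((0, 1), repeat=n_sites)
    )
```

For instance, `stationary_mean_of_Lf(lambda eta: eta[0] * eta[2], 5, Fraction(1, 3))` evaluates the expectation for the local function $\eta_0 \eta_2$ on a ring of five sites; exact rational arithmetic avoids any rounding question.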
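We have the conservation law
\[ L\eta_0 + \sum_{i=1}^{d} \big( \tilde w_{0,e_i} - \tilde w_{-e_i,0} \big) = 0. \]
Let $w_i(\eta)$ denote the renormalized current in the $i$th direction:
\[ w_i(\eta) = \tilde w_{0,e_i} - \langle \tilde w_{0,e_i} \rangle_\rho - \frac{d}{d\theta} \langle \tilde w_{0,e_i} \rangle_\theta \Big|_{\theta = \rho} (\eta_0 - \rho). \]
Note the subtraction of the linear term in this definition. We have
\[ w_1(\eta) = -(\eta_0 - \rho)(\eta_{e_1} - \rho) - \rho\,[\eta_{e_1} - \eta_0], \qquad w_2(\eta) = -\frac{1}{2} [\eta_{e_2} - \eta_0]. \]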
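The closed form of $w_1$ can be confirmed directly from the definition of the renormalized current. The sketch below (function names are hypothetical) uses $\langle \tilde w_{0,e_1} \rangle_\theta = \theta(1 - \theta)$, which holds under the product measure, and compares both expressions on all four values of $(\eta_0, \eta_{e_1})$.

```python
from fractions import Fraction
from itertools import product


def tilde_w1(eta0, eta1):
    """Instantaneous current across the bond (0, e1): eta_0 (1 - eta_{e1})."""
    return eta0 * (1 - eta1)


def mean_current(theta):
    """Bernoulli(theta) expectation of the current: theta (1 - theta)."""
    return theta * (1 - theta)


def mean_current_derivative(theta):
    """Derivative of theta (1 - theta) with respect to theta."""
    return 1 - 2 * theta


def w1_from_definition(eta0, eta1, rho):
    """Current minus its mean minus the linear term in (eta_0 - rho)."""
    return (
        tilde_w1(eta0, eta1)
        - mean_current(rho)
        - mean_current_derivative(rho) * (eta0 - rho)
    )


def w1_closed_form(eta0, eta1, rho):
    """-(eta_0 - rho)(eta_{e1} - rho) - rho [eta_{e1} - eta_0]."""
    return -(eta0 - rho) * (eta1 - rho) - rho * (eta1 - eta0)


def identity_holds(rho):
    """Compare both expressions on every occupation pattern of the two sites."""
    return all(
        w1_from_definition(a, b, rho) == w1_closed_form(a, b, rho)
        for a, b in product((0, 1), repeat=2)
    )
```

Calling `identity_holds(Fraction(1, 2))` or `identity_holds(Fraction(1, 3))` checks the identity at the corresponding density with exact rational arithmetic.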
A function $f$ on $\{0,1\}^{\mathbb{Z}^d}$ will be called local if it only depends on the variables at finitely many sites. For local functions $g$ and $h$ we define the semi-inner product
\[ \langle\!\langle g, h \rangle\!\rangle_\rho = \sum_{x \in \mathbb{Z}^d} \langle \tau_x g; h \rangle_\rho = \sum_{x \in \mathbb{Z}^d} \langle \tau_x h; g \rangle_\rho. \tag{1.2} \]
Since the density $\rho$ is fixed to be 1/2 in this article, we will henceforth leave out the subscript. All but a finite number of terms in this sum vanish because $\nu_\rho$ is a product measure and $g$, $h$ are local. From this inner product, we define the seminorm: $\|f\|^2 = \langle\!\langle f, f \rangle\!\rangle$. Note that gradient terms $g = \tau_x h - h$ and all degree one functions vanish in this norm. Therefore we shall identify the currents $w$ with their degree two parts: for the rest of the article we set
\[ w_1(\eta) = -(\eta_0 - \rho)(\eta_{e_1} - \rho), \qquad w_2(\eta) = 0. \]
It has been proved in [8] that the diffusion coefficient D(t) = {Di,j (t), 1 ≤ i, j ≤ d} can be written as t s 1 1 ds dr es L wj , wi∗ Di,j (t) = δi,j − 2 2χ t 0 0 s t 1 ds dr es L wi , wj∗ , − 2χ t 0 0 where wi∗ = −wi is the normalized current in the ith direction for the adjoint process. Fix a vector in Rd . It follows from the previous identity and Ito’s formula that
2 t
| |2 1 −1/2
· D(t) − = t · w(η(s))ds
. 2 2χ 0 This is a variant of the Green-Kubo formula [8]. In d = 1 of course D(t) is a scalar. In our special case in d = 2, since w2 = 0, D(t) is a matrix with D1,2 (t) = D2,1 (t) = 0, D2,2 (t) = 1/2 and
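\[ D_{1,1}(t) = \frac{1}{2} + \frac{1}{2\chi}\, \Big\| t^{-1/2} \int_0^t w_1(\eta(s))\, ds \Big\|^2. \]
Recall that $\int_0^\infty e^{-\lambda t} f(t)\, dt \sim \lambda^{-\alpha}$ as $\lambda \to 0$ means, in some weak sense, that $f(t) \sim t^{\alpha - 1}$ as $t \to \infty$. Throughout the following $\lambda$ will always be a positive real number. We can now state the main result.

Theorem 1. There exists $C > 0$ so that for sufficiently small $\lambda > 0$,
\[ d = 1: \quad \int_0^\infty e^{-\lambda t}\, t D(t)\, dt \ge C \lambda^{-2 - \frac{1}{4}}, \]
\[ d = 2: \quad \int_0^\infty e^{-\lambda t}\, t D_{1,1}(t)\, dt \ge C \lambda^{-2}\, |\log \lambda|^{1/2}. \]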
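The correspondence between power laws in $t$ and in $\lambda$ recalled above rests on the exact identity $\int_0^\infty e^{-\lambda t} t^{\alpha - 1}\, dt = \Gamma(\alpha) \lambda^{-\alpha}$. As an illustration, the sketch below compares a numerical Laplace transform with this closed form; the quadrature rule, the cutoff and the function names are choices made here for illustration only. With $t D(t) \sim t^{1 + 1/4}$, that is $\alpha = 2 + 1/4$, one recovers the exponent $\lambda^{-2 - 1/4}$ of Theorem 1 in $d = 1$.

```python
import math


def laplace_transform(f, lam, steps=200_000, cutoff=50.0):
    """Composite Simpson approximation of int_0^T exp(-lam t) f(t) dt, T = cutoff / lam."""
    upper = cutoff / lam
    h = upper / steps  # steps must be even for Simpson's rule
    total = f(0.0) + math.exp(-lam * upper) * f(upper)
    for k in range(1, steps):
        t = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.exp(-lam * t) * f(t)
    return total * h / 3.0


def power_law_transform(alpha, lam):
    """Closed form Gamma(alpha) * lam^(-alpha) of the transform of t^(alpha - 1)."""
    return math.gamma(alpha) * lam ** (-alpha)
```

For example, `laplace_transform(lambda t: t ** 1.25, 0.1)` can be compared with `power_law_transform(2.25, 0.1)`; the truncation at $T = 50/\lambda$ discards a tail of relative size about $e^{-50}$.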
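The conjectured behavior for $t$ large is $D(t) \sim t^{1/3}$ in $d = 1$ and $D_{1,1}(t) \sim (\log t)^{2/3}$ in $d = 2$. This theorem says that in a certain average, asymptotic sense $D(t) \ge t^{1/4}$ in $d = 1$ and $D_{1,1}(t) \ge (\log t)^{1/2}$ in $d = 2$. From the definition, we can rewrite the diffusion coefficient as
\[ t D_{1,1}(t) = \frac{t}{2} + \frac{1}{\chi} \int_0^t \int_0^s \langle\!\langle e^{uL} w_1, w_1 \rangle\!\rangle\, du\, ds \]
in $d = 2$, with an analogous formula in $d = 1$ (just drop the subscripts). Thus,
\[ \int_0^\infty e^{-\lambda t}\, t D_{1,1}(t)\, dt = \frac{1}{2\lambda^2} + \frac{1}{\chi} \int_0^\infty dt\, e^{-\lambda t} \int_0^t \int_0^s \langle\!\langle e^{uL} w_1, w_1 \rangle\!\rangle\, du\, ds. \]
Changing the order of integration, we obtain that the previous expression is equal to
\[ \frac{1}{2\lambda^2} + \frac{1}{\chi} \int_0^\infty du \int_u^\infty dt\, e^{-\lambda (t - u)} \int_u^t ds\, e^{-\lambda u} \langle\!\langle e^{uL} w_1, w_1 \rangle\!\rangle = \frac{1}{2\lambda^2} + \frac{1}{\chi \lambda^2}\, \langle\!\langle w_1, (\lambda - L)^{-1} w_1 \rangle\!\rangle, \]
since $\int_u^\infty e^{-\lambda (t - u)} (t - u)\, dt = \lambda^{-2}$. Therefore, Theorem 1 follows from the following estimate on the resolvent.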
Lemma 1.1. There exists $C > 0$ such that for sufficiently small $\lambda > 0$,
\[ d = 1: \quad \langle\!\langle w, (\lambda - L)^{-1} w \rangle\!\rangle \ge C \lambda^{-1/4}; \]
\[ d = 2: \quad \langle\!\langle w_1, (\lambda - L)^{-1} w_1 \rangle\!\rangle \ge C\, |\log \lambda|^{1/2}. \]

Theorem 1 can also be interpreted as a lower bound on the variance of the position of a second class particle in the asymmetric simple exclusion process. Fix $0 < \rho < 1$ and distribute particles on $\mathbb{Z}^d_* = \mathbb{Z}^d - \{0\}$ according to a Bernoulli product measure with density $\rho$. These particles are called first class particles. Place a second class particle at the origin. Particles evolve according to the asymmetric exclusion process. However, the first class particles have priority over the second class particle in the sense that if a first class particle tries to jump to a site occupied by the second class particle, they exchange positions; while if the second class particle tries to jump to a site occupied by a first class particle, the jump is suppressed. Denote by $X_t = (X_t^1, \dots, X_t^d)$ the position of the second class particle at time $t$. An elementary computation shows that
\[ P[X_t = x] = \frac{1}{\chi} \langle \eta_x(t); \eta_0(0) \rangle = \frac{1}{\chi} S(x,t). \]
In particular, $E[X_t] = vt$, $E[X_t^i; X_t^j] = 2t D_{i,j}(t)$ and, by Theorem 1, in a weak sense, in dimension 1, $E[X_t; X_t] \ge t^{5/4}$ and in dimension 2, $E[a \cdot X_t; a \cdot X_t] \ge |a|^2\, t (\log t)^{1/2}$ for all $a$ in $\mathbb{R}^2$. In dimension $d \ge 3$, a central limit theorem for the second class particle with the usual renormalization in $\sqrt{t}$ follows essentially from the equilibrium fluctuations of the density field proved in [4].

2. Duality and Resolvent Hierarchy

Denote by $C = C(\rho)$ the space of $\nu_\rho$-mean zero local functions. For a finite subset $A$ of $\mathbb{Z}^d$, denote by $\Psi_A$ the $\nu_\rho$-mean zero local function defined by
\[ \Psi_A = \prod_{x \in A} \frac{\eta_x - \rho}{\sqrt{\rho(1 - \rho)}}. \]
By convention, $\Psi_\emptyset = 1$. Note that the collection $\{\Psi_A\}$, where $A$ ranges over finite subsets of $\mathbb{Z}^d$, forms an orthonormal basis of $L^2(\nu_\rho)$. For $n \ge 0$, denote by $E_n$ the family of all subsets of $\mathbb{Z}^d$ with $n$ points: $E_n = \{A \subset \mathbb{Z}^d : |A| = n\}$, and let $E$ be the finite subsets of $\mathbb{Z}^d$: $E = \cup_{n \ge 0} E_n$. All $\nu_\rho$-mean zero local functions $f$ can be decomposed uniquely as a finite linear combination of cylinder functions of finite degree:
\[ f = \sum_{A \in E} \mathfrak{f}(A)\, \Psi_A = \sum_{n \ge 0} \sum_{A \in E_n} \mathfrak{f}(A)\, \Psi_A \]
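for a finitely supported function $\mathfrak{f} : E \to \mathbb{R}$. We call $\mathfrak{f}$ the Fourier coefficients of the cylinder function $f$. Note that these coefficients depend on the density because the basis $\{\Psi_A, A \in E\}$ itself depends on $\rho$: $\mathfrak{f}(A) = \mathfrak{f}(A, \rho)$. A local function $f = \sum_{A \in E} \mathfrak{f}(A) \Psi_A$ whose Fourier coefficients vanish outside $E_n$ is said to have degree $n$. The scalar product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$ can be easily represented in terms of the Fourier coefficients. Fix two local functions $u$, $v$ and write them in the basis $\{\Psi_A, A \in E\}$:
\[ u = \sum_{A \in E} \mathfrak{u}(A)\, \Psi_A, \qquad v = \sum_{A \in E} \mathfrak{v}(A)\, \Psi_A. \]
An elementary computation shows that
\[ \langle\!\langle u, v \rangle\!\rangle = \sum_{x \in \mathbb{Z}^d} \sum_{n \ge 1} \sum_{A \in E_n} \mathfrak{u}(A)\, \mathfrak{v}(A + x). \]
In this formula, $B + z$ is the set $\{x + z;\ x \in B\}$. The summation starts from $n = 1$ due to the presence of the covariance in the definition of the inner product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$. We say that two finite subsets $A$, $B$ of $\mathbb{Z}^d$ are equivalent if one is the translation of the other. This equivalence relation is denoted by $\sim$ so that $A \sim B$ if $A = B + x$ for some $x$ in $\mathbb{Z}^d$. Let $\tilde E_n$ be the quotient of $E_n$ with respect to this equivalence relation: $\tilde E_n = E_n / \sim$, $\tilde E = E / \sim$. For any summable function $\mathfrak{f} : E \to \mathbb{R}$,
\[ \sum_{A \in E} \mathfrak{f}(A) = \sum_{A \in \tilde E} \sum_{z \in \mathbb{Z}^d} \mathfrak{f}(A + z). \]
In particular, for two local functions $u$, $v$,
\[ \langle\!\langle u, v \rangle\!\rangle = \sum_{x, z \in \mathbb{Z}^d} \sum_{n \ge 1} \sum_{A \in \tilde E_n} \mathfrak{u}(A + z)\, \mathfrak{v}(A + x + z) = \sum_{n \ge 1} \sum_{A \in \tilde E_n} \tilde{\mathfrak{u}}(A)\, \tilde{\mathfrak{v}}(A), \tag{2.1} \]
where, for a finite set $A$ and a summable function $\mathfrak{u} : E \to \mathbb{R}$,
\[ \tilde{\mathfrak{u}}(A) = \sum_{z \in \mathbb{Z}^d} \mathfrak{u}(A + z). \]
In view of the previous formula, denote by $L^2(E)$ the Hilbert space induced by the finite supported functions on $E$ with scalar product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$ given by
\[ \langle\!\langle \mathfrak{u}, \mathfrak{v} \rangle\!\rangle = \sum_{n \ge 1} \sum_{A \in \tilde E_n} \tilde{\mathfrak{u}}(A)\, \tilde{\mathfrak{v}}(A). \]
In particular, for two local functions $u = \sum_{A \in E} \mathfrak{u}(A) \Psi_A$, $v = \sum_{A \in E} \mathfrak{v}(A) \Psi_A$,
\[ \langle\!\langle u, v \rangle\!\rangle = \langle\!\langle \mathfrak{u}, \mathfrak{v} \rangle\!\rangle. \]
We now examine how the generator of the asymmetric exclusion process $L$ acts on the Fourier coefficients. Fix a local function $u = \sum_{A \in E} \mathfrak{u}(A) \Psi_A$. A simple computation shows that
\[ Lu = \sum_{A \in E} (L_\rho \mathfrak{u})(A)\, \Psi_A, \]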
where Lρ = S + A0 + A+ + A− , (Su)() = (1/2)
d
[u(x,x+ej ) − u()] ,
j =1 x∈Zd
(A0 u)() = (1 − 2ρ) (A+ u)() = (A− u)() =
χ (ρ) χ (ρ)
{u(x,x+e1 ) − u()} ,
x∈,x+e1 ∈
{u(\{x + e1 }) − u(\{x})} ,
x,x+e1 ∈
{u( ∪ {x + e1 }) − u( ∪ {x})} .
x,x+e1 ∈
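In our special case $\rho = 1/2$ so that $A_0 = 0$. The operator $L_\rho$, which is not a generator, can thus be decomposed in three pieces, $S$, $A_+$, $A_-$. $S$ is the symmetric part of $L_\rho$ and does not change the degree of a function. In contrast, $A_+$ increases the degree by one, while $A_-$ decreases it by one and $A_-$ is the adjoint of $-A_+$: $(A_-)^* = -A_+$. In particular, $A_- + A_+$ is anti-symmetric and $\langle\!\langle Sf, g \rangle\!\rangle = \langle\!\langle f, Sg \rangle\!\rangle$ and $\langle\!\langle A_+ f, g \rangle\!\rangle = -\langle\!\langle f, A_- g \rangle\!\rangle$ so that $\langle\!\langle (A_+ + A_-) f, g \rangle\!\rangle = -\langle\!\langle f, (A_+ + A_-) g \rangle\!\rangle$. Moreover, a simple computation shows that in $L^2(E)$:
\begin{align}
S f &= 0 \quad \text{for all functions } f \text{ of degree one}, \notag\\
A_- f &= 0 \quad \text{for all functions } f \text{ of degree two}. \tag{2.2}
\end{align}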
For $n \ge 1$, let $G_n = \cup_{1 \le k \le n} E_k$. Denote by $\pi_n$ the orthogonal projection on $L^2(E_n)$, by $P_n$ the orthogonal projection on $L^2(G_n)$ and by $L_n = L_{\rho,n}$ the operator $L_\rho$ truncated at level $n$: $L_n = P_n L_\rho P_n$. In particular, $f = \sum_{n \ge 1} \pi_n f$ and $P_n = \sum_{1 \le j \le n} \pi_j$. Recall that the normalized current $w_1 = -(\eta_0 - \rho)(\eta_{e_1} - \rho)$ has degree 2. Its Fourier transform $\mathfrak{w}$ is given by
\[ \mathfrak{w}(A) = \begin{cases} -\chi(\rho) & \text{if } A = \{0, e_1\}, \\ 0 & \text{otherwise}. \end{cases} \]
To investigate the asymptotic behavior of $\langle\!\langle w, (\lambda - L)^{-1} w \rangle\!\rangle$, for $\lambda > 0$ consider the resolvent equation $(\lambda - L) u_\lambda = w$. In the Fourier space, the equation becomes the hierarchy equations
\begin{align*}
A_+^* \pi_3 u_\lambda + (\lambda - S) \pi_2 u_\lambda &= w, \\
A_+^* \pi_{k+1} u_\lambda + (\lambda - S) \pi_k u_\lambda - A_+ \pi_{k-1} u_\lambda &= 0 \quad \text{for } k \ge 3,
\end{align*}
because $A_+^* = -A_-$. The hierarchy starts at degree 2 instead of 1 because the degree one equation is trivial. Indeed, by (2.2), $S \pi_1 u_\lambda = 0$, $A_- \pi_2 u_\lambda = 0$, so that the degree one equation $-A_- \pi_2 u_\lambda + (\lambda - S) \pi_1 u_\lambda = 0$ becomes $\pi_1 u_\lambda = 0$. Hence $\pi_1 u_\lambda$ plays no role and we can set $\pi_1 u_\lambda = 0$.
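Consider the truncated resolvent equation up to the degree $n$:
\begin{align}
A_+^* \pi_3 u_{\lambda,n} + (\lambda - S) \pi_2 u_{\lambda,n} &= w, \notag\\
A_+^* \pi_{k+1} u_{\lambda,n} + (\lambda - S) \pi_k u_{\lambda,n} - A_+ \pi_{k-1} u_{\lambda,n} &= 0, \quad 3 \le k \le n - 1, \tag{2.3}\\
(\lambda - S) \pi_n u_{\lambda,n} - A_+ \pi_{n-1} u_{\lambda,n} &= 0. \notag
\end{align}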
We can solve the final equation of (2.3) by πn uλ,n = (λ − S)−1 A+ πn−1 uλ,n . Substituting this into the equation of degree n − 1, we have −1 πn−2 uλ,n . πn−1 uλ,n = (λ − S) + A∗+ (λ − S)−1 A+ Solving iteratively we arrive at π2 uλ,n = Tn w , where the operators {Tn , n ≥ 2} are defined inductively by T2 = (λ − S)−1 ,
Tn+1 =
(λ − S) + A∗+ Tn A+
−1
.
The truncated equation represents the solution of (λ − Ln )uλ,n = w and hence π2 uλ,n , w = w, (λ − Ln )−1 w so that w, (λ − Ln )−1 w = w, Tn w , where, for example, −1 , T3 = (λ − S) + A− (λ − S)−1 A+ −1 −1 A+ . T4 = (λ − S) + A− (λ − S) + A− (λ − S)−1 A+ Lemma 2.1. For each λ > 0, w, (λ − L2k+1 )−1 w is an increasing sequence which converges to w, (λ − Lρ )−1 w and w, (λ − L2k )−1 w is a decreasing sequence which converges to w, (λ − Lρ )−1 w . Proof. Since λ − S is positive, it is easy to show from the definition of the sequence of operators Tn that 0 ≤ T3 ≤ T2 and that Tm ≤ Tn if Tm−1 ≥ Tn−1 . In particular, {T2k , k ≥ 1} is a decreasing sequence, {T2k+1 , k ≥ 1} is an increasing sequence and T2k+1 ≤ T2j for any k, j ≥ 1: w, (λ − L3 )−1 w ≤ w, (λ − L5 )−1 w ≤ · · · · · · ≤ w, (λ − L4 )−1 w ≤ w, (λ − L2 )−1 w .
(2.4)
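The interlacing in (2.4) is the familiar behavior of a continued fraction. As a toy illustration only, not the operators of this paper, one may replace $\lambda - S$ by a positive number `p` and $A_+$ by a real number `a`, so that $A_+^* T_n A_+$ becomes `a * a * T_n`. The hypothetical helper below iterates the resulting scalar recursion; the even-indexed values decrease, the odd-indexed values increase, and every odd value stays below every even one.

```python
from fractions import Fraction


def truncated_sequence(p, a, n_max):
    """Return {n: T_n} with T_2 = 1/p and T_{n+1} = 1/(p + a^2 T_n) for 2 <= n < n_max."""
    values = {2: 1 / p}
    for n in range(2, n_max):
        values[n + 1] = 1 / (p + a * a * values[n])
    return values


def interlacing_holds(values):
    """Check that even-indexed terms decrease, odd-indexed terms increase, and odds <= evens."""
    evens = [values[n] for n in sorted(values) if n % 2 == 0]
    odds = [values[n] for n in sorted(values) if n % 2 == 1]
    evens_decrease = all(x > y for x, y in zip(evens, evens[1:]))
    odds_increase = all(x < y for x, y in zip(odds, odds[1:]))
    return evens_decrease and odds_increase and max(odds) < min(evens)
```

For example, `interlacing_holds(truncated_sequence(Fraction(1, 2), Fraction(13, 10), 12))` checks the pattern for eleven terms in exact arithmetic. The map $T \mapsto 1/(p + a^2 T)$ is decreasing, which is the scalar counterpart of the implication "$T_m \le T_n$ if $T_{m-1} \ge T_{n-1}$" used in the proof.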
To check that w, (λ − Lρ )−1 w is in fact the limit of these upper and lower bounds we use the variational formula. For any matrix M, let Ms denote the symmetric part
−1 (M + M ∗ )/2. The identity [M −1 ]s = M ∗ (Ms )−1 M always holds, and thus we have w, (λ − Lρ )−1 w = sup 2w, f − (λ − Lρ )f, (λ − S)−1 (λ − Lρ )f , (2.5) f
where the supremum is carried over all finite supported functions f : E → R. Note that (λ − Lρ )f, (λ − S)−1 (λ − Lρ )f = f, (λ − S)f + Af, (λ − S)−1 Af , provided A = A+ + A− . Hence, w, (λ − Lρ )−1 w = sup inf 2w − A∗ g, f − f, (λ − S)f + g, (λ − S)g . f
g
Let $a_n$ denote the supremum restricted to finite supported functions $f$ in $L^2(G_n)$, and $a^n$ denote the infimum restricted to finite supported functions $g$ in $L^2(G_n)$ so that $a_n \uparrow \langle\!\langle w, (\lambda - L_\rho)^{-1} w \rangle\!\rangle$ and $a^n \downarrow \langle\!\langle w, (\lambda - L_\rho)^{-1} w \rangle\!\rangle$. By straightforward computation one checks that $a_n \le \langle\!\langle w, (\lambda - L_{n+1})^{-1} w \rangle\!\rangle \le a^n$, giving the desired result.

In what follows we will present a general approach to Eq. (2.3) which, from (2.4), gives a nontrivial lower bound on the diffusion coefficient without too much work. Because it gives a sequence of upper and lower bounds, the method has the potential to give the full conjectured scaling of the diffusion coefficient.

3. Degree 3 Lower Bounds

From Lemma 2.1 of the previous section we have a lower bound at degree three. However, computations are complicated by the hard core exclusion. We now describe a method to remove the hard core restriction in the computation. We then perform the computations in Fourier space. The estimates which justify the removal of the hard core are presented in the next section. By removal of the hard core, we mean replacing functions defined on $E_n$ by symmetric functions defined on $\mathbb{Z}^{dn}$ and replacing operators acting on $E_n$ by operators acting on $\mathbb{Z}^{dn}$. We first identify a function $\mathfrak{f} : E_n \to \mathbb{R}$ with a symmetric function $f : (\mathbb{Z}^d)^n \to \mathbb{R}$. To this end, for $n \ge 1$, let $E_{n,1} = \{\mathbf{x}_n := (x_1, \dots, x_n) : x_k \in \mathbb{Z}^d,\ x_i \neq x_j \text{ for } i \neq j\}$ and define
\[ f(\mathbf{x}_n) = \begin{cases} \mathfrak{f}(\{x_1, \dots, x_n\}) & \text{if } \mathbf{x}_n \in E_{n,1}, \\ 0 & \text{otherwise}. \end{cases} \tag{3.1} \]
With the notation just introduced,
\[ E\Big[ \Big( \sum_{A \in E_n} \mathfrak{f}(A)\, \Psi_A \Big)^2 \Big] = \frac{1}{n!} \sum_{x_1, \dots, x_n \in \mathbb{Z}^d} |f(x_1, \dots, x_n)|^2. \]
For a function $f : \mathbb{Z}^{nd} \to \mathbb{R}$, we shall use the same symbol $\langle f \rangle$ to denote the expectation
\[ \frac{1}{n!} \sum_{x_1, \dots, x_n \in \mathbb{Z}^d} f(x_1, \dots, x_n) \]
and write the inner product of two functions as f, g = f g. If f and g vanish on the complement of En,1 , this coincides with the inner product introduced before. We also define, as before, f, g = x∈Zd τx f, g. d n Denote by En the space (Z ) and let E = ∪n≥1 En , Gn = ∪1≤j ≤n Ej . We use the same symbol πn , Pn = 1≤j ≤n πj for the projection onto En , Gn . As before, there is a simple formula for the inner product · , · . Consider two finitely supported functions f , g : En → R. By definition, f, g =
f (x1 + z, . . . , xn + z)g(xn ) =
xn ∈En z∈Zd
f˜(xn )g(xn ) ,
xn ∈En
where
f˜(xn ) =
f (x1 + z, . . . , xn + z) .
z∈Zd
Denote by ∼ the equivalence relation on En defined by xn ∼ yn if xn − yn = (z, . . . , z) for some z in Zd . Let E˜ n = En ∼ . Since summing over all sites in En is the same as summing over all equivalence classes and then over all elements of a single class, the previous sum is equal to
f˜(xn + z)g(xn + z) =
xn ∈E˜ n z∈Zd
˜ n) f˜(xn )g(x
xn ∈E˜ n
because f˜(xn + z) = f˜(xn ). Here, xn + z stands for (x1 + z, . . . , xn + z). It remains to choose an element of each class. This can be done by fixing the last coordinate to be zero. In conclusion,
f, g =
f ∗ (xn−1 )g ∗ (xn−1 ) ,
(3.2)
xn−1 ∈En−1
where f ∗ (xn−1 ) =
f (x1 + z, . . . , xn−1 + z, z) .
z∈Zd
Here again we see that the translations in the inner product effectively reduce the degree of a function by one. We derive now explicit formulas for the operators S, A+ acting on symmetric functions of (Zd )n . An elementary computation shows that (Sf )(xn ) =
d n
σ σ 1{∇α,i xn ∈ En,1 }∇α,i f (xn )
α=1 i=1 σ =±
if xn belongs to En,1 and (Sf )(xn ) = 0 if xn does not. Here, for σ = ±, σ ∇α,i xn = (x1 , . . . , xi + σ eα , . . . , xn ) , σ σ ∇α,i f (xn ) = f (∇α,i xn ) − f (xn ) .
Note that S is the discrete Laplacian with Neumann boundary condition on En,1 . In the same way, (A+ f )(xn ) =
n i=1 j =i
ij
1{xj = xi + e1 }∇+ f (xn )
(3.3)
if xn belongs to En,1 and (A+ f )(xn ) = 0 otherwise. Here, ij
j
∇+ f (xn ) = f (xni ) − f (xn ) j
j
and the index j in xn indicates the absence of xj in the vector xn : xn = (x1 , . . . , xj −1 , xj +1 , . . . , xn ). We now extend the operators S, A+ to symmetric functions not necessarily vanishing on En,1 by formulas analogous to the ones above, except that we drop the indicator function. Let S and A+ be the operators defined by: (SF )(xn ) = (F )(xn ) =
d n
σ (∇α,i F )(xn ) .
i=1 σ =± α=1
Of course, S is the discrete Laplacian . On the other hand, (A+ F )(xn ) =
n i=1 j =i
ij
1{xj = xi + e1 }∇+ F (xn ) .
(3.4)
Notice that A+ F = 0 if |F | < ∞ and hence the counting measure is invariant. Let L = + A+ , and denote by Ln = Pn LPn the restriction of L to Gn . Throughout the rest of the paper we will use C(λ) to denote a function of λ > 0 which has the property that for some C < ∞, and for sufficiently small λ, C| log λ| , d = 1 ; C(λ) ≤ (3.5) C, d=2. In the next section we will prove the following lemma. Lemma 3.1. There exists a constant C(λ) as in (3.5) such that 1 w, (λ − Ln )−1 w ≤ w, (λ − Ln )−1 w ≤ C(λ)3 n4 w, (λ − Ln )−1 w. C(λ)4 n6 The proof of this lemma is given in the next section. The special case n = 3 combined with Lemma 2.1 tells us that −1 w, (λ − ) + A∗+ (λ − )−1 A+ w ≤ C(λ)4 w, (λ − L)−1 w .
To bound below the left hand side of the previous expression, we define the Fourier transform of a function F : Znd → R by
(pn ) = F
e−ixn ·pn F (xn )
xn ∈Znd
for pn ∈ (Rd /2π Zd )n . The Fourier transform of the discrete Laplacian acting on F is given by (pn ) = − −F
d n
eiek pj − 2 + e−iek pj
Fˆ (pn ) = ω(pn )Fˆ (pn ) ,
j =1 k=1
where ω(pn ) = nj=1 ω(pj ). In d = 2 we will denote the d components of p by (r, s): rn = (r1 , . . . , rn ) and sn = (s1 , . . . , sn ). In d = 1, ω(p) = − eip − 2 + e−ip and in ir d = 2, ω(p) = − e − 2 + e−ir − eis − 2 + e−is . Note that in both cases ω is real valued and nonnegative. If F is a symmetric function of two integer variables we have A + F (p1 , p2 , p3 ) =
eie1 ·pσ1 − e−ie1 ·pσ3 Fˆ (pσ1 + pσ3 , pσ2 )
σ
=
i{sin(e1 · pσ1 ) + sin(e1 · pσ3 )}Fˆ (pσ1 + pσ3 , pσ2 ) ,
σ
where σ runs over permutations of degree three. On the other hand, in view of (3.2), F, G =
π 1 ∗ 1 ∗ (p) G ∗ (p) dp . F (z) G∗ (z) = F 2! 4π −π d z∈Z
An elementary computation shows that ∗ (p) = Fˆ (p, −p) F so that 1 F, G = 4π
π
−π
ˆ Fˆ (p, −p) G(p, −p) dp .
Lemma 3.2. Fix a symmetric function F : (Zd )2 → R. There exists a finite constant C such that in dimension 1, π −1 A+ F, (λ − ) A+ F ≤ C du ω(u)[λ + ω(u)]−1/2 |Fˆ (u, −u)|2 ; −π
and in dimension 2, A+ F, (λ − )
−1
A+ F ≤ C
π −π
du ω(e1 · u) | log(λ + ω(u))| |Fˆ (u, −u)|2 .
Proof. Using the Schwarz inequality to bound the cross terms by the diagonal terms, we can bound A+ F by A+ F, (λ − )
−1
≤C
p1 +p2 +p3 =0
2 |A + F (p1 , p2 , p3 )| dS p1 +p2 +p3 =0 λ + ω(p1 ) + ω(p2 ) + ω(p3 ) |sin(e1 · p1 ) + sin(e1 · p3 )|2 |Fˆ (p1 + p3 , p2 )|2 dS, (3.6) λ + ω(p1 ) + ω(p2 ) + ω(p3 )
A+ F =
where dS is the element of surface area on the hyperplane {p1 + p2 + p3 = 0}. Let denote the region in which at least one of the integration variables, ri or si in d = 2 or pi in d = 1, is bounded away from ±π and 0, let us say by 1/8. On the denominator, λ + ω(p1 ) + ω(p2 ) + ω(p3 ) ≥ C −1 > 0 independent of λ. Hence the integration over in (3.6) is uniformly bounded in λ. We are only concerned with terms diverging as λ ↓ 0 and hence we can restrict our attention to the integration over C . We need to divide C according to whether p1 , p3 in d = 1 and r1 , r3 , s1 , s3 in d = 2 are within 1/8 of −π = π or 0. There are four regions in d = 1 and sixteen in d = 2. We have to compute each one and add them up. But in fact they are all analogous, and give the same result. So for simplicity we only present the region where they are all in [−1/8, 1/8]. We call u = p1 + p3 , v = p1 − p3 . The integration over the corresponding region in (3.6) is bounded by a constant multiple of
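$$\int_{|u|\le 1/8} du\, \omega(u)\, |\hat F(u, -u)|^2 \left[ \int_{|v|\le 1/8} \frac{dv}{\lambda + |u + v|^2 + |u - v|^2 + |u|^2} \right]$$
in $d = 1$ and
$$\int_{|u|\le 1/8} du\, \omega(e_1\cdot u)\, |\hat F(u, -u)|^2 \left[ \int_{|v|\le 1/8} \frac{dv}{\lambda + |u + v|^2 + |u - v|^2 + |u|^2} \right]$$
in $d = 2$. Estimating the inside integration, in brackets, in $d = 1$,
$$\int_{|v|\le 1/8} \frac{dv}{\lambda + (u + v)^2 + (u - v)^2 + u^2} \le C(\lambda + u^2)^{-1/2}.$$
In $d = 2$,
$$\int_{|v|\le 1/8} \frac{dv}{\lambda + |u + v|^2 + |u - v|^2 + |u|^2} \le C\, |\log(\lambda + u^2)|,$$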
which completes the proof of the lemma.
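The two inner estimates can be made explicit. The following is only a sketch and not part of the original argument: it uses the identity $(u+v)^2 + (u-v)^2 + u^2 = 3u^2 + 2v^2$, assumes that $|v|$ is the Euclidean norm in $d = 2$, and introduces the shorthand $a$ purely for illustration.

```latex
% Shorthand (an assumption of this sketch): a = \lambda + 3|u|^2 \ge \lambda + |u|^2.
% d = 1: extend the v-integration to the whole line.
\int_{|v|\le 1/8} \frac{dv}{\lambda + 3u^2 + 2v^2}
  \le \int_{-\infty}^{\infty} \frac{dv}{a + 2v^2}
  = \frac{\pi}{\sqrt{2a}}
  \le \frac{\pi}{\sqrt{2}}\,(\lambda + u^2)^{-1/2}.
% d = 2: polar coordinates in v.
\int_{|v|\le 1/8} \frac{dv}{\lambda + 3|u|^2 + 2|v|^2}
  = 2\pi \int_0^{1/8} \frac{r\,dr}{a + 2r^2}
  = \frac{\pi}{2}\,\log\Bigl(1 + \frac{1}{32a}\Bigr)
  \le \frac{\pi}{2}\,\bigl|\log(\lambda + |u|^2)\bigr|
  \quad\text{whenever } \lambda + |u|^2 \le \tfrac{1}{32}.
```

The last inequality uses $1 + 1/(32s) \le 1/(16s)$ for $s = \lambda + |u|^2 \le 1/32$, together with $a \ge s$.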
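Lemma 3.3. There exists $C < \infty$ such that
$$\Big\langle\!\!\Big\langle w, \big[\lambda - \Delta + A_+^*(\lambda - \Delta)^{-1} A_+\big]^{-1} w \Big\rangle\!\!\Big\rangle \ge \begin{cases} C^{-1} \lambda^{-1/4}, & d = 1; \\ C^{-1} |\log\lambda|^{1/2}, & d = 2. \end{cases} \tag{3.7}$$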
Proof. $d = 1$: The Fourier transform of the current $w$ is $\hat w(p_1, p_2) = e^{-ip_2}/2$. By the variational formula (2.5) for the $H_{-1}$ norm, the left-hand side of (3.7) is equal to
$$\sup_F \Big\{ 2\langle\!\langle w, F\rangle\!\rangle - \big\langle\!\big\langle F, [\lambda - \Delta + A_+^*(\lambda - \Delta)^{-1} A_+] F \big\rangle\!\big\rangle \Big\},$$
where the supremum is performed over all local functions $F$. We use the Fourier transform and the estimate stated in the previous lemma to bound below the previous expression. By the identity stated just before Lemma 3.2, and in view of the explicit formula for the Fourier transform of the current given above, the linear term of the variational formula is equal to
$$\frac{1}{8\pi} \int_{-\pi}^{\pi} \hat F(p, -p)\, e^{ip}\, dp = \frac{1}{8\pi} \int_{-\pi}^{\pi} \hat F(p, -p) \cos p\, dp.$$
The last identity follows from the fact that $\hat F(a, b) = \hat F(b, a)$ because $F$ is symmetric and from a change of variables. The first quadratic term, $\langle\!\langle F, (\lambda - \Delta) F\rangle\!\rangle$, is equal to
$$\frac{1}{4\pi} \int_{-\pi}^{\pi} \{\lambda + 2\omega(p)\}\, |\hat F(p, -p)|^2\, dp.$$
By Lemma 3.2, the second one is bounded above by
$$C \int_{-\pi}^{\pi} \frac{\omega(p)}{\{\lambda + \omega(p)\}^{1/2}}\, |\hat F(p, -p)|^2\, dp$$
for some finite constant $C$. In view of the explicit formulas for each term of the variational formula, choosing $F$ appropriately we obtain that the left-hand side of (3.7) is bounded below by
$$C \int_{-\pi}^{\pi} \frac{\cos^2\xi\, d\xi}{\lambda + \omega(\xi)(\lambda + \omega(\xi))^{-1/2}}.$$
We can restrict the integration to the region $\xi \in [-1/8, 1/8]$ and replace $\omega(x)$ by $x^2$ to have a further lower bound,
$$C \int_{-1/8}^{1/8} \frac{d\xi}{\lambda + \xi^2 (\lambda + \xi^2)^{-1/2}} \ge C \lambda^{-1/4}.$$
This proves Lemma 1.1 for $d = 1$. $d = 2$: The proof follows closely the one in dimension one and we omit some details. The Fourier transform of the current $w$ is $\hat w(p_1, p_2) = e^{-ir_2}/2$. From the previous lemma the left-hand side of (3.7) is bounded below by
$$C^{-1} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \frac{d\xi\, d\eta}{\lambda + \omega(\eta) + \omega(\xi) + \omega(\xi)\, |\log(\lambda + \omega(\eta) + \omega(\xi))|}\,\cdot$$
We can restrict to the region $\xi, \eta \in [-1/8, 1/8]$ to have a further lower bound. In this region, we can replace $\omega(x)$ by $x^2$ up to a constant factor and use $|\log(\lambda + \eta^2 + \xi^2)| \le |\log(\lambda + \eta^2)|$ to obtain a further lower bound
$$C^{-1} \int_{-1/8}^{1/8} \int_{-1/8}^{1/8} \frac{d\xi\, d\eta}{\lambda + \eta^2 + \xi^2 + \xi^2 |\log(\lambda + \eta^2)|}\,\cdot$$
Letting $z = \xi (1 + |\log(\lambda + \eta^2)|)^{1/2}$ the integration is bounded below by
$$\int_{-1/8}^{1/8} \int_{-1/8}^{1/8} \frac{dz\, d\eta}{\lambda + \eta^2 + z^2}\, (1 + |\log(\lambda + \eta^2)|)^{-1/2}.$$
Changing to polar coordinates $r, \theta$ and restricting to $\pi/3 \le \theta \le 2\pi/3$ we get another lower bound,
$$\int_0^{1/20} \frac{r\, dr}{\lambda + r^2}\, |\log(\lambda + r^2)|^{-1/2} \ge C^{-1} |\log\lambda|^{1/2}.$$
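As an illustrative numerical check, not part of the original proof, the $d = 1$ lower bound $\int_{-1/8}^{1/8} d\xi\, [\lambda + \xi^2(\lambda + \xi^2)^{-1/2}]^{-1} \ge C\lambda^{-1/4}$ can be observed directly. The sketch below is a minimal one under stated assumptions: the function names, the geometric grid and the number of cells are choices made here, not taken from the paper. The main contribution comes from $|\xi| \lesssim \lambda^{3/4}$, where the integrand is of order $\lambda^{-1}$.

```python
import math

def integrand(xi, lam):
    """Integrand 1 / (lam + xi^2 * (lam + xi^2)^(-1/2)) of the d = 1 lower bound."""
    return 1.0 / (lam + xi * xi / math.sqrt(lam + xi * xi))

def lower_bound_integral(lam, half_width=0.125, cells=20000):
    """Midpoint rule on a geometric grid (hypothetical helper for this sketch).

    The integrand is even, so we integrate over [0, half_width] and double.
    """
    start = 1e-3 * lam ** 0.75  # well inside the peak of width ~ lam^(3/4)
    total = start * integrand(0.5 * start, lam)  # single midpoint cell on [0, start]
    ratio = (half_width / start) ** (1.0 / cells)
    left = start
    for _ in range(cells):
        right = left * ratio
        total += (right - left) * integrand(0.5 * (left + right), lam)
        left = right
    return 2.0 * total

# lambda^(1/4) times the integral should stay bounded away from zero as lambda -> 0.
scaled = {lam: lower_bound_integral(lam) * lam ** 0.25
          for lam in (1e-2, 1e-4, 1e-6, 1e-8)}
for lam, value in scaled.items():
    print(f"lambda = {lam:.0e}:  lambda^(1/4) * integral = {value:.3f}")
```

Restricting to $|\xi| \le \lambda^{3/4}$ and using $(\lambda + \xi^2)^{-1/2} \le \lambda^{-1/2}$ shows that the rescaled value is at least $\pi/2$ once $\lambda^{3/4} \le 1/8$, which is the threshold the check relies on.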
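Lemma 1.1 follows immediately from Lemma 3.3, (2.4) and Lemma 3.1 in $d = 2$. However in $d = 1$ this only gives the slightly weaker result $\langle\!\langle w, (\lambda - L)^{-1} w\rangle\!\rangle \ge C^{-1} \lambda^{-1/4} / |\log\lambda|^4$. The $\log\lambda$ is purely from the removal of the hard core, and we now show that the correct degree 3 lower bound in $d = 1$ is of order $\lambda^{-1/4}$. We conclude this section with a proof of Lemma 1.1 in dimension 1. Proof of Lemma 1.1 in $d = 1$. By Lemma 2.1, by definition of the operator $T_3$ introduced just before that lemma and by the variational formula (2.5),
$$\langle\!\langle w, (\lambda - L)^{-1} w\rangle\!\rangle \ge \langle\!\langle w, (\lambda - L_3)^{-1} w\rangle\!\rangle = \sup_f \Big\{ 2\langle\!\langle w, f\rangle\!\rangle - \langle\!\langle f, (\lambda - S) f\rangle\!\rangle - \langle\!\langle A_+ f, (\lambda - S)^{-1} A_+ f\rangle\!\rangle \Big\}. \tag{3.8}$$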
It remains to choose a test function $f$ to obtain a lower bound. Before proposing such a function, we derive explicit formulas for all expressions which appear on the right-hand side of this variational formula. Denote by $\mathbb N$ the positive integers. Fix two local functions $f$, $g$ of degree 2 and denote by $f$, $g$ their Fourier transforms. In view of (2.1) and since there is a one to one correspondence between the equivalent class $\tilde E_2$ and $\mathbb N$,
$$\langle\!\langle f, g\rangle\!\rangle = \sum_{x\in\mathbb N} f_*(x)\, g_*(x),$$
where
$$f_*(x) = \sum_{z\in\mathbb Z} f(\{z, z + x\}).$$
In particular, for any local function of degree 2, since the current has degree 2, $\langle\!\langle f, w\rangle\!\rangle = -2\chi(\rho) f_*(1)$.
In the same way, an elementary computation shows that for every local function $f$ and $g$ of degree 2 and 3, respectively,
$$\langle\!\langle f, (-S) f\rangle\!\rangle = \sum_{x\in\mathbb N} \{f_*(x + 1) - f_*(x)\}^2, \qquad \langle\!\langle g, (-S) g\rangle\!\rangle = \frac{1}{4} \sum_{x\in I_2} \sum_{v} \{g_*(x + v) - g_*(x)\}^2\, \mathbf 1\{x + v \in I_2\}. \tag{3.9}$$
In this formula $I_2 = \{(x, y) \in \mathbb Z^2 : 0 < x < y\}$, the last summation is carried over all vectors $v$ of $\mathbb Z^2$ such that $v = \pm(0, 1), \pm(1, 0), \pm(1, 1)$ and, for every local function of degree 3,
$$g_*(x_1, x_2) = \sum_{z\in\mathbb Z} g(\{z, x_1 + z, x_2 + z\}).$$
Finally, for a local function $f$ of degree 2 and $(x, y)$ in $I_2$,
$$(A_+ f)_*(x, y) = \chi(\rho)\, \big\{ \mathbf 1\{x = 1\} - \mathbf 1\{x = y - 1\} \big\}\, \big\{ f_*(y) - f_*(y - 1) \big\}.$$
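Consider now a function of degree 2 such that
$$f_*(x) = \lambda^{-1/4} e^{-\lambda^{3/4}(x - 1)}$$
for $x$ in $\mathbb N$ and compute all terms which appear in the variational formula (3.8). It follows from the previous identities that as $\lambda \downarrow 0$,
$$2\langle\!\langle w, f\rangle\!\rangle \sim \lambda^{-1/4}, \qquad \lambda \langle\!\langle f, f\rangle\!\rangle \sim \lambda^{-1/4}, \qquad \langle\!\langle f, (-S) f\rangle\!\rangle \sim \lambda^{1/4}$$
and that
$$(A_+ f)_*(x, y) = \sqrt{\lambda}\, e^{-\lambda^{3/4} y}\, \big\{ \mathbf 1\{x = y - 1\} - \mathbf 1\{x = 1\} \big\} = R_\lambda(x, y).$$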
In particular, the final term $\langle\!\langle A_+ f, (\lambda - S)^{-1} A_+ f\rangle\!\rangle$ can be written as
$$\int_0^\infty dt\, e^{-\lambda t} \sum_{x, y\in I_2} p_t(x, y)\, R_\lambda(x)\, R_\lambda(y) \sim \lambda \int_0^\infty dt\, e^{-\lambda t} \sum_{x > 2,\, y > 2} e^{-\lambda^{3/4}(x + y)} \Big\{ p_t\big((x - 1, x), (y - 1, y)\big) - p_t\big((x - 1, x), (1, y)\big) - p_t\big((1, x), (y - 1, y)\big) + p_t\big((1, x), (1, y)\big) \Big\},$$
where $p_t$ are the transition probabilities of the random walk on $I_2$ with Dirichlet form given by the second formula in (3.9). Denote this random walk by $(X_t, Y_t)$ and let $(\tilde X_t, \tilde Y_t) = (X_t - 1, Y_t - X_t - 1)$. $(\tilde X_t, \tilde Y_t)$ is a random walk on the first quadrant $Q_1 = \{(x, y) \in \mathbb Z^2 : x, y \ge 0\}$. It jumps from $(x, y)$ to $(x, y) + v$ at rate one, where $v = \pm(1, 0), \pm(0, 1), \pm(1, -1)$. Jumps from $Q_1$ to its complement are suppressed. Let
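$q_t$ be the transition probabilities of the random walk $(\tilde X_t, \tilde Y_t)$. With this notation, the previous integral is equal to
$$\lambda \int_0^\infty dt\, e^{-\lambda t} \sum_{x > 0,\, y > 0} e^{-\lambda^{3/4}(x + y)} \Big\{ q_t\big((x, 0), (y, 0)\big) - q_t\big((x, 0), (0, y)\big) - q_t\big((0, x), (y, 0)\big) + q_t\big((0, x), (0, y)\big) \Big\}.$$
By images we can rewrite this as
$$2\lambda \int_0^\infty dt\, e^{-\lambda t} \sum_{x, y \ne 0} e^{-\lambda^{3/4}(|x| + |y|)} \Big\{ q_t\big((x, 0), (y, 0)\big) - q_t\big((x, 0), (0, y)\big) \Big\},$$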
where $q_t$ is now the transition density for a continuous time random walk on $\mathbb Z^2$, where the particle makes jumps at rate 1/2 of $(1, 0)$, $(-1, 0)$, $(0, 1)$, $(0, -1)$ and, in the first and third quadrants $(1, -1)$ and $(-1, 1)$, and in the second and fourth quadrants $(1, 1)$ and $(-1, -1)$. On the axes themselves the rules are changed a bit. For example, on the positive $x$-axis the particle jumps at rate 1/4 of $(0, 1)$, $(0, -1)$, $(-1, 1)$ and $(-1, -1)$. On the other axes these rules are just naturally rotated. The diffusion approximation is
$$2\lambda \int_0^\infty dt\, e^{-\lambda t} \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-\lambda^{3/4}(|x| + |y|)} \Big\{ q_t\big((x, 0), (y, 0)\big) - q_t\big((x, 0), (0, y)\big) \Big\},$$
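where $q_t$ is now the transition density for a diffusion in $\mathbb R^2$ with generator $\Delta + d^2$, where $d = \partial_y - \partial_x$ in the first and third quadrants and $d = \partial_y + \partial_x$ in the second and fourth quadrants. The corresponding Dirichlet form is comparable to the standard one and therefore we can bound the transition probabilities above and below by those of Brownian motion (see [5]). By change of variables
$$\lambda \int_0^\infty dt\, e^{-\lambda t} \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-\lambda^{3/4}|x + y|}\, \frac{e^{-|y - x|^2/4t}}{4\pi t} = C \lambda^{-1/4}.$$
The error
$$\lambda \int_0^\infty dt\, e^{-\lambda t} \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, \Big[ e^{-\lambda^{3/4}(|x| + |y|)} - e^{-\lambda^{3/4}|x + y|} \Big] \frac{e^{-|y - x|^2/4t}}{4\pi t} = O(\lambda).$$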
Choosing $a f_*$ for our test function instead of $f_*$, in view of the bounds (3.8) and of the explicit computations, we obtain that $\langle\!\langle w, (\lambda - L)^{-1} w\rangle\!\rangle \ge C_1 a \lambda^{-1/4} - C_2 a^2 \lambda^{-1/4}$ for some finite constants $C_1$, $C_2$ independent of $\lambda$. Choosing an appropriate $a$, we conclude the proof.
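The scalings claimed for the test function $f_*(x) = \lambda^{-1/4} e^{-\lambda^{3/4}(x-1)}$ can be checked by summing the geometric series in closed form. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and only $\lambda \sum_x f_*(x)^2$ and $\sum_x \{f_*(x+1) - f_*(x)\}^2$ are evaluated, which are the expressions the identities above give for $\lambda\langle\!\langle f, f\rangle\!\rangle$ and $\langle\!\langle f, (-S)f\rangle\!\rangle$.

```python
import math

def trial_function_terms(lam):
    """Closed-form sums for f_*(x) = lam^(-1/4) * exp(-lam^(3/4) * (x - 1)), x = 1, 2, ...

    Returns (lam * sum_x f_*(x)^2, sum_x (f_*(x + 1) - f_*(x))^2).
    """
    decay = lam ** 0.75
    amp_sq = lam ** -0.5                      # f_*(1)^2
    one_minus_q = -math.expm1(-2.0 * decay)   # 1 - exp(-2 * decay), geometric ratio q
    norm_sq = amp_sq / one_minus_q            # sum over x >= 1 of f_*(x)^2
    step = -math.expm1(-decay)                # 1 - exp(-decay)
    dirichlet = amp_sq * step * step / one_minus_q
    return lam * norm_sq, dirichlet

for lam in (1e-2, 1e-4, 1e-6):
    mass, dirichlet = trial_function_terms(lam)
    # Both rescaled quantities approach 1/2 as lam -> 0.
    print(lam, mass * lam ** 0.25, dirichlet / lam ** 0.25)
```

Both rescaled quantities tend to $1/2$, consistent with $\lambda\langle\!\langle f, f\rangle\!\rangle \sim \lambda^{-1/4}$ and $\langle\!\langle f, (-S)f\rangle\!\rangle \sim \lambda^{1/4}$. For the last step of the proof, the quadratic $C_1 a - C_2 a^2$ is maximal at $a = C_1/(2C_2)$, where it equals $C_1^2/(4C_2)$.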
4. Removal of Hard Core In this section we prove Lemma 3.1. Recall the main statement is that $\langle\!\langle w, (\lambda - L_n)^{-1} w\rangle\!\rangle$ can be bounded above and below in terms of the corresponding quantity for the dynamics without the hard core, at the expense of constants depending on $n$. Since we only use the bound for $n = 3$ in this article, the precise dependence of the constants on $n$ is not important and in many places is probably not optimal. For a symmetric function $f$ on $(\mathbb Z^d)^n$, we denote by $\rho_i$, $i = 1, 2, 3$ the one, two and three point functions:
$$\rho_1(x) = \frac{1}{(n - 1)!} \sum_{x_1, \dots, x_{n-1}} f(x, x_1, \dots, x_{n-1})^2,$$
$$\rho_2(x, y) = \frac{1}{(n - 2)!\, 2!} \sum_{x_1, \dots, x_{n-2}} f(x, y, x_1, \dots, x_{n-2})^2,$$
$$\rho_3(x, y, z) = \frac{1}{(n - 3)!\, 3!} \sum_{x_1, \dots, x_{n-3}} f(x, y, z, x_1, \dots, x_{n-3})^2,$$
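for distinct sites $x, y, z$ in $\mathbb Z^d$. The following is Lemma 4.8 of [9]. Lemma 4.1. There exists a finite constant $C$ such that for every symmetric function $f : (\mathbb Z^d)^n \to \mathbb R$,
$$\langle \rho_3^{1/2}, (-S_3) \rho_3^{1/2} \rangle \le C n^2 \langle f, (-S) f \rangle.$$
Proof. A simple computation shows that for a finitely supported function $h : E_n \to \mathbb R$,
$$\langle h, (-S) h \rangle = 2\rho(1 - \rho) \sum_{j=1}^{d} \sum_{x\in\mathbb Z^d} \sum_{\substack{A\in E_{n-1} \\ A\cap\{x, x+e_j\} = \emptyset}} (h_{A\cup\{x\}} - h_{A\cup\{x+e_j\}})^2.$$
In particular, for a finitely supported function $g : E_3 \to \mathbb R$,
$$\langle g, (-S) g \rangle = 2\rho(1 - \rho) \sum_{j=1}^{d} \sum_{x\in\mathbb Z^d} \sum_{\substack{\{y,z\}\in E_2 \\ \{y,z\}\cap\{x, x+e_j\} = \emptyset}} [g(x, y, z) - g(x + e_j, y, z)]^2.$$
Fix a finitely supported symmetric function $f : (\mathbb Z^d)^n \to \mathbb R$. For $\{x, y, z\}$ in $E_3$, let
$$\rho_3(x, y, z) = \frac{1}{3!} \sum_{\substack{\Gamma\in E_{n-3} \\ \Gamma\cap\{x,y,z\} = \emptyset}} f^2_{\{x,y,z\}\cup\Gamma}.$$
Set $g = \sqrt{\rho_3}$ and write, as in [9], for $w = x$ or $x + e_j$,
$$g(w, y, z)^2 = \frac{1}{3!} \sum_{\substack{\Gamma\in E_{n-3} \\ \Gamma\cap\{x, x+e_j, y, z\} = \emptyset}} f^2_{\Gamma\cup\{w,y,z\}} + F(x, x + e_j, y, z)^2,$$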
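where
$$F(x, x + e_j, y, z)^2 = \sum_{\substack{\Gamma\in E_{n-4} \\ \Gamma\cap\{x, x+e_j, y, z\} = \emptyset}} f^2_{\Gamma\cup\{x, x+e_j, y, z\}}.$$
By the reverse triangle inequality $(\|a\|_2 - \|b\|_2)^2 \le \|a - b\|_2^2$,
$$\{g(x, y, z) - g(x + e_j, y, z)\}^2 \le \frac{1}{3!} \sum_{\substack{\Gamma\in E_{n-3} \\ \Gamma\cap\{x, x+e_j, y, z\} = \emptyset}} [f_{\Gamma\cup\{x,y,z\}} - f_{\Gamma\cup\{x+e_j,y,z\}}]^2.$$
In particular, $\langle g, (-S) g \rangle$ is less than or equal to
$$C \sum_{j=1}^{d} \sum_{x\in\mathbb Z^d} \sum_{\substack{\{y,z\}\in E_2 \\ \{y,z\}\cap\{x, x+e_j\} = \emptyset}} \sum_{\substack{\Gamma\in E_{n-3} \\ \Gamma\cap\{x, x+e_j, y, z\} = \emptyset}} (f_{\Gamma\cup\{x\}\cup\{y,z\}} - f_{\Gamma\cup\{x+e_j\}\cup\{y,z\}})^2$$
for some finite constant $C$. The summand depends only on $A = \{y, z\} \cup \Gamma$. Since $|\{B \subset A : |B| = 2\}| = \binom{n-1}{2}$,
$$\langle \sqrt{\rho_3}, (-S) \sqrt{\rho_3} \rangle \le C \binom{n-1}{2} \langle f, (-S) f \rangle$$
for some finite constant $C$. This concludes the proof of the lemma.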
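The key step of the proof above is the reverse triangle inequality $(\|a\|_2 - \|b\|_2)^2 \le \|a - b\|_2^2$, applied to the two vectors of values of $f$ indexed by the remaining sites, whose norms are the two values of $g$ being compared. As a small illustration, not part of the original, the sketch below checks the inequality on random vectors; the helper names are hypothetical and the random vectors merely stand in for those values of $f$.

```python
import math
import random

def l2_norm(vec):
    """Euclidean norm of a finite sequence of reals."""
    return math.sqrt(sum(t * t for t in vec))

def reverse_triangle_gap(a, b):
    """Return ||a - b||^2 - (||a|| - ||b||)^2, which is nonnegative
    by the reverse triangle inequality (equivalently, by Cauchy-Schwarz)."""
    diff = [s - t for s, t in zip(a, b)]
    return l2_norm(diff) ** 2 - (l2_norm(a) - l2_norm(b)) ** 2

random.seed(0)
gaps = []
for _ in range(1000):
    size = random.randint(1, 8)
    a = [random.uniform(-1.0, 1.0) for _ in range(size)]
    b = [random.uniform(-1.0, 1.0) for _ in range(size)]
    gaps.append(reverse_triangle_gap(a, b))

min_gap = min(gaps)  # nonnegative up to floating-point rounding
print(min_gap)
```

The gap equals $2(\|a\|_2 \|b\|_2 - a\cdot b)$, so it vanishes exactly when one vector is a nonnegative multiple of the other.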
The following lemma is a simple extension of Theorem 4.7 of [9] or Lemma 3.7 of [12] to dimensions $d = 1, 2$. We use hereafter the lattice distance in $\mathbb Z^d$ so that $|x| = |(x^1, \dots, x^d)| = \sum_{1\le j\le d} |x^j|$. Recall also the definition of $E_{n,1}$ given in (3.1). Lemma 4.2. Fix $R > 0$. There exists a constant $C(\lambda)$ as in (3.5) such that for a symmetric function $f : (\mathbb Z^d)^n \to \mathbb R$ vanishing on $E_{n,1}^c$,
$$\Big\langle \sum_{i,j,k} \mathbf 1\{|x_i - x_k| + |x_k - x_j| \le R\}\, f^2 \Big\rangle \le C(\lambda)\, n^2 \langle f, (\lambda - S) f \rangle,$$
$$\Big\langle \sum_{i,j,k,l} \mathbf 1\{|x_i - x_j| + |x_k - x_l| \le R\}\, f^2 \Big\rangle \le C(\lambda)\, n^3 \langle f, (\lambda - S) f \rangle.$$
In these inequalities summation is carried over all distinct indices $\{i, j, k\}$ or $\{i, j, k, l\}$ in $\{1, \dots, n\}$. Proof. We prove the first inequality and leave the second to the reader. By definition of the three point function, the left side is a constant times
$$\big\langle \mathbf 1\{|x_1 - x_3| + |x_3 - x_2| \le R\}\, \rho_3(x_1, x_2, x_3) \big\rangle.$$
By the previous lemma, we can bound the Dirichlet form of $g = \rho_3^{1/2}$ by that of $f$. Thus we only have to prove that
$$\big\langle \mathbf 1\{|x_1 - x_3| + |x_3 - x_2| \le R\}\, g^2 \big\rangle \le C(\lambda) \langle g, (\lambda - S) g \rangle$$
for functions $g$ of degree three.
Recall that for configurations with three particles we have $E_{3,1} = \{\mathbf x_3 := (x_1, x_2, x_3) : x_i \ne x_j \text{ for } i \ne j\}$. We have $g(\mathbf x_3) = 0$ whenever $\mathbf x_3 \notin E_{3,1}$ and the operator $S$ is the discrete Laplacian on $E_{3,1}$ with Neumann boundary conditions. Define $G(\mathbf x_3) = g(\mathbf x_3)$ if $\mathbf x_3 \in E_{3,1}$ and $G(\mathbf x_3) = \mathrm{Av}_{\mathbf y_3\in E_{3,1},\, |\mathbf y_3 - \mathbf x_3|\le 2}\, g(\mathbf y_3)$ for $\mathbf x_3 \notin E_{3,1}$. Here, $\mathrm{Av}$ stands for the average of the function over the indicated set. We claim that for $G$ so defined,
$$\langle G, (\lambda - \Delta) G \rangle \le C \langle g, (\lambda - S) g \rangle. \tag{4.1}$$
Consider $[G(\mathbf x_3) - G(x_1 + e_1, x_2, x_3)]^2$ with $(x_1 + e_1, x_2, x_3) \in E_{3,1}$ and $\mathbf x_3 \notin E_{3,1}$. In this case $G(\mathbf x_3)$ is the average of $g(\mathbf y_3)$ with $|\mathbf y_3 - \mathbf x_3| \le 2$ and $\mathbf y_3 \in E_{3,1}$. We can check that $\mathbf y_3$ and $(x_1 + e_1, x_2, x_3) \in E_{3,1}$ can be connected via nearest-neighbor bonds in $E_{3,1}$. Thus we have $[G(\mathbf x_3) - G(x_1 + e_1, x_2, x_3)]^2 \le C \langle g, (-S) g \rangle$. A similar inequality can be checked if $(x_1 + e_1, x_2, x_3) \notin E_{3,1}$ and this proves (4.1). Since $g(\mathbf x_3) = G(\mathbf x_3)$ on $E_{3,1}$ and $0$ otherwise, it is clear that
$$\big\langle \mathbf 1\{|x_1 - x_3| + |x_3 - x_2| \le R\}\, g^2 \big\rangle \le \big\langle \mathbf 1\{|x_1 - x_3| + |x_3 - x_2| \le R\}\, G^2 \big\rangle.$$
Thus, to prove the lemma it will suffice to prove that
$$\big\langle \mathbf 1\{|x_1 - x_3| + |x_3 - x_2| \le R\}\, G^2 \big\rangle \le C(\lambda) \langle G, (\lambda - \Delta) G \rangle.$$
Fix the $x_3$ variable and integrate only over $x_1, x_2$. Dropping the part of the Laplacian in the $x_3$ direction makes the right-hand side smaller. Hence, it is enough to prove that
$$\big\langle \mathbf 1\{|x_1| + |x_2| \le R\}\, G^2 \big\rangle \le C(\lambda) \langle G, (\lambda - \Delta) G \rangle$$
for functions $G(x_1, x_2)$. Call $V = \mathbf 1\{|x_1| + |x_2| \le R\}$. It is local and bounded. We are in $\mathbb Z^{2d}$ and we want to show that there is a $C(\lambda)$ such that $C(\lambda)(\lambda - \Delta) \ge V$ as operators, or, equivalently, $V^{1/2} (\lambda - \Delta)^{-1} V^{1/2} \le C(\lambda)$. Let $G_\lambda(x, y)$ be the kernel of $(\lambda - \Delta)^{-1}$ and $\varphi : \mathbb Z^{2d} \to \mathbb R$. Since $V$ is bounded by 1, $\sum_{x,y\in\mathbb Z^{2d}} V^{1/2}(x)\, G_\lambda(x, y)\, V^{1/2}(y)\, \varphi(x) \varphi(y)$ is bounded by $G_\lambda(0, 0) \sum_x \varphi^2(x)$ (cf. Lemma 3.1 in [12] for an elementary proof). If $d = 1$, we are in $\mathbb Z^2$ and $C(\lambda) = G_\lambda(0, 0) \le C |\log\lambda|$. In $d = 2$, we are in the transient case $\mathbb Z^4$ and $G_\lambda(0, 0)$ is bounded uniformly in $\lambda$ (see [13]).
We now divide the complement of $E_{n,1}$ into two sets. We call a site $x$ an isolated double site if $x_i = x_j = x$, $|x_k - x| \ge 5$ for all $k \ne i, j$. Denote by $E_{n,2}$ the set with at most isolated double sites and $E_{n,3} = (\mathbb Z^d)^n - (E_{n,1} \cup E_{n,2})$ the rest. For a configuration $(x_1, x_1, \dots, x_k, x_k, x_{2k+1}, x_{2k+2}, \dots, x_n)$ with $k$ isolated double sites, we define $F = T f$ by
$$F(x_1, x_1, \dots, x_k, x_k, x_{2k+1}, x_{2k+2}, \dots, x_n) = \mathrm{Av}_{y_k :\, |x_i - y_i| = 1 \text{ for all } i}\; f(x_1, y_1, \dots, x_k, y_k, x_{2k+1}, \dots, x_n).$$
If $x \in E_{n,3}$, then $F(x) = 0$, e.g., $F(x, x, x + e_1, x_4, \dots, x_n) = 0$. We also define the restriction $R F$ by $R F(\mathbf x_n) = F(\mathbf x_n)$ if $\mathbf x_n \in E_{n,1}$ and $R F(x) = 0$ otherwise. Note that $T$ and $R$ are not inverse to each other although $R T$ is the identity.
Lemma 4.3. There is a constant $C(\lambda)$ as in (3.5) such that for any symmetric function $f$ on $(\mathbb Z^d)^n$ vanishing on $E_{n,1}^c$ and $F = T f$,
$$\frac{1}{C(\lambda) n^2} \langle F, (\lambda - \Delta) F \rangle \le \langle f, (\lambda - S) f \rangle \le \langle F, (\lambda - \Delta) F \rangle. \tag{4.2}$$
Moreover, if $\tilde F = T R F$,
$$\langle F, (\lambda - \Delta) F \rangle \ge \frac{1}{C(\lambda) n^2} \langle \tilde F, (\lambda - \Delta) \tilde F \rangle. \tag{4.3}$$
Proof. By definition, $\langle F, -\Delta F \rangle$ is given by $\frac{1}{2} \sum_{x,y:\,|x-y|=1} [F(x) - F(y)]^2$. The upper bound of (4.2) is immediate, since $S$ has Neumann boundary conditions, corresponding to dropping terms in the Dirichlet form with either $x$ or $y$ in $E_{n,1}^c$. We now prove the lower bound in (4.2). We decompose $x$ and $y$ into three sets, $E_{n,1}$, $E_{n,2}$, $E_{n,3}$, so that
$$\langle F, -\Delta F \rangle = \sum_{\alpha = 1,2,3;\ \beta = 1,2,3} \Omega_{\alpha,\beta},$$
where
$$\Omega_{\alpha,\beta} = \frac{1}{2} \sum_{x\in E_{n,\alpha},\, y\in E_{n,\beta}:\, |x-y|=1} [F(x) - F(y)]^2.$$
If both $x$ and $y$ satisfy the hard core condition, then the contribution is
$$\Omega_{1,1} = \frac{1}{2} \sum_{x,y:\,|x-y|=1} [f(x) - f(y)]^2 \le \langle f, -S f \rangle.$$
We can estimate terms where either $x$ or $y$ is in $E_{n,3}$ by
$$|\Omega_{1,3} + \Omega_{2,3} + \Omega_{3,3}| \le C \Big\langle \sum_{i\ne j\ne k} \mathbf 1\{|x_i - x_k| + |x_k - x_j| \le R\}\, F^2 \Big\rangle$$
for some finite constant $C$. From the definition of $F$, we can check that
$$\Big\langle \sum_{i\ne j\ne k} \mathbf 1\{|x_i - x_k| + |x_k - x_j| \le R\}\, F^2 \Big\rangle \le C \Big\langle \sum_{i\ne j\ne k} \mathbf 1\{|x_i - x_k| + |x_k - x_j| \le R\}\, f^2 \Big\rangle.$$
The last term is bounded by $C(\lambda) n^2 \langle f, (\lambda - S) f \rangle$ from Lemma 4.2. Thus we have $|\Omega_{1,3} + \Omega_{2,3} + \Omega_{3,3}| \le C(\lambda) n^2 \langle f, (\lambda - S) f \rangle$. We now bound $\Omega_{1,2}$. In this case we have, for example, $x = (x_1, x_1 + e_1, x_3, \dots, x_n)$ and $y = (x_1, x_1, x_3, \dots, x_n)$. Notice that because $x \in E_{n,1}$, $y$ can in fact have at most one double site. By assumption of isolated double sites, we have $|x_j - x_1| \ge 5$ for all $j \ge 3$. Thus,
$$F(y) = \mathrm{Av}_{|z - x_1| = 1}\, f(x_1, z, x_3, \dots, x_n).$$
Under the assumption $|x_j - x_1| \ge 5$ for all $j \ge 3$ we can always connect $z$ to $x_1$. By Schwarz inequality, we then have
$$\sum_{x_1, x_3, \dots, x_n} |f(x_1 + e_1, z, x_3, \dots, x_n) - f(x_1, z, x_3, \dots, x_n)|^2 \le C \langle f, (-S) f \rangle.$$
Hence, |1,2 | ≤ C f, (−S)f . Finally we bound 2,2 . The typical case looks like x = (x1 , x1 , x3 , x3 + e1 , x5 , . . . , xn ) and y = (x1 , x1 , x3 , x3 , x5 , · · · , xn ). Then F (x1 , x1 , x3 , x3 + e1 , x5 , . . . , xn ) − F (x1 , x1 , x3 , x3 , x5 , . . . , xn ) = Av Av [f (x1 , z, x3 , x3 + e1 , x5 , · · · , xn ) − f (x1 , z, x3 , w, x5 , · · · , xn )] . |z−x1 |=1 |w−x3 |=1
Using Jensen’s inequality and the same arguments as in the estimate of 1,2 above we obtain |2,2 | ≤ Cf, (−S)f . Putting all these estimates together, we have the lower bound of (4.2). To prove (4.3), call f = R F so that F˜ = T f . From (4.2) we have f, (λ − S)f ≥ C(λ)n−2 F˜ , (λ − )F˜ . Since S is an operator with Neumann boundary condition, for any F with R F = f we have F, (λ − )F ≥ Cf, (λ − S)f and this proves (4.3). Lemma 4.4. Recall that Gn = ∪j ≤n Ej . Let f, g : Gn → R and let F, G = T f, T g. There is a constant C(λ) as in (3.5) such that g, Af − G, AF ≤ C(λ) nG, (λ − )G1/2 F, (λ − )F 1/2 . (4.4) Moreover, if F˜ = T R F , ˜ AF , G − AF, G ≤ C(λ)nG, (λ − )G1/2 F, (λ − )F 1/2 ,
(4.5)
where C(λ) is a constant as in (3.5). Proof. We first prove (4.4). Note that it suffices to prove it with A replaced by A+ , since it then follows for A∗+ = −A− and thus for A = A+ + A− . Suppose, first of all, that f : En → R, g : En+1 → R are functions of degree n and n + 1, respectively. Recall the definition of A+ in (3.3). It suffices to consider one term in the summation, 1,2 1,2 1,2 say A1,2 + f = 1{x2 = x1 + e1 }1{xn ∈ En,1 }∇+ f and A+ f = 1{x2 = x1 + e1 }∇+ f . We divide (Zd )n+1 into N = {|xj − x1 | > 5 for all j ≥ 3} ∩ {|xi − xj | > 5 for all i = j , i, j ≥ 3} and its complement. If xn+1 ∈ N , then xn+1 belongs to En+1,1 so that 1,2 g(xn+1 ) = G(xn+1 ) and (A1,2 + F − A+ f )(xn+1 ) = 0. It remains to estimate 2
2
g1{N c } 2 . g, 1{N c }A12 + f ≤ f (x1 + e1 , x3 · · · , xn+1 ) − f (x1 , x3 · · · , xn+1 ) Clearly, 1{N c } ≤
1{|xj − x1 | ≤ 5} +
j ≥3
1{|xi − xj | ≤ 5} .
i,j ≥3 i=j
Since |x2 − x1 | = 1, by Lemma 4.2, g1{N c } 2 ≤ C(λ)n3 g, (λ − S)g .
Replacing 1, 2 by i, j and summing over all i, j , by the permutation symmetry of f we have g, 1{N c }A+ f |2 ≤ C(λ)n3 g, (λ − S)gn−1 f, (−S)f . (4.6) A similar bound holds for F on N c . Combining these estimates and using Lemma 4.3, we obtain (4.4) for A+ . If now f : Gn → R, write f = 1≤j ≤n fj . From (4.6) we have gk+1 , A+ fk | ≤ C(λ)nεg(k+1) , (λ − S)g(k+1) + Cnε −1 f(k) , (−S)f(k) for every ε > 0. Summing over k and optimizing ε, we get (4.4). Repeating the proof, 1,2 ˜ i.e., noting that A1,2 + F − A+ F = 0 on N and using Lemma 4.2 to bound the term on N c , gives (4.5). Lemma 4.5. There is a constant C(λ) as in (3.5) such that for all f : Gn → R, C −1 AF, (λ − )−1 AF − C(λ)2 n2 F, (λ − )F ≤ Af, (λ − S)−1 Af ≤ C(λ)n2 AF, (λ − )−1 AF + C(λ)3 n4 F, (λ − )F ,
(4.7)
where F = T f . Moreover, if F˜ = T R F , AF, (λ − )−1 AF ≥ C −1 AF˜ (λ − )−1 AF˜ − C(λ)2 n2 F, (λ − )F . (4.8) Proof. Recall the variational formula Af, (λ − S)−1 Af = sup {2g, Af − g, (λ − S)g} . g
By Lemmas 4.3 and 4.4, we can bound the right side from above by 2g, Af − g, (λ − S)g ≤ 2G, AF − (1/C(λ))n−2 G, (λ − )G + C(λ)nG, (λ − )G1/2 F, (λ − )F 1/2 ≤ C(λ)n2 AF, (λ − )−1 AF + C(λ)3 n4 F, (λ − )F , which give the upper bound in (4.7). Alternatively we can bound the right side below by 2g, Af − g, (λ − S)g ≥ 2G, AF − C(λ)nG, (λ − )G1/2 F, (λ − )F 1/2 − G, (λ − )G ≥ 2G, AF − (1/2)G, (λ − )G − C(λ)2 n2 F, (λ − )F . Optimizing over G, we obtain that 2g, Af − g, (λ − S)g ≥ CAF, (λ − )−1 AF − C(λ)2 n2 F, (λ − )F . This proves the lower bound in (4.7).
On the other hand, by definition, AF, (λ − )−1 AF = sup {2AF, G − G(λ − )G} . G
By Lemma 4.4, 2AF, G − G(λ − )G ≥ 2AF˜ , G − C(λ)nG, (λ − )G1/2 F, (λ − )F 1/2 − G, (λ − )G ≥ 2AF˜ , G − CG, (λ − )G − C(λ)2 n2 F, (λ − )F . Taking the sup over G, we prove (4.8).
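Our interest is in the inner product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$. It is easy to check that all previous lemmas hold for the inner product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$ as well. Since extensions from $\langle \cdot, \cdot \rangle$ to $\langle\!\langle \cdot, \cdot \rangle\!\rangle$ were carried out in detail in [9], we shall only outline the basic procedures to prove these lemmas for the inner product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$. We first prove the analogue of Lemma 4.3 for the inner product $\langle\!\langle \cdot, \cdot \rangle\!\rangle$. For any local functions we rewrite the inner product as
$$\langle\!\langle g, h \rangle\!\rangle = \lim_{k\to\infty} (2k + 1)^{-2} \Big\langle \sum_{|x|\le k} \tau_x g\, ;\, \sum_{|x|\le k} \tau_x h \Big\rangle.$$
Similarly, we have
$$\langle\!\langle g, (-S) h \rangle\!\rangle = \lim_{k\to\infty} (2k + 1)^{-2} \Big\langle \sum_{|x|\le k} \tau_x g\, ;\, (-S) \sum_{|x|\le k} \tau_x h \Big\rangle. \tag{4.9}$$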
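Recall the definition of $F = T f$, which is linear in $f$. Furthermore, the intersection properties of $(x_1, \dots, x_n)$ are independent of the translation, i.e. $x$ belongs to $E_{n,i}$ if and only if $\tau_z x$ belongs to $E_{n,i}$. In particular,
$$T \sum_{|x|\le k} \tau_x h = \sum_{|x|\le k} \tau_x T h.$$
For each fixed $k$ everything is still local so by Lemma 4.3,
$$(2k + 1)^{-2} \Big\langle \sum_{|x|\le k} \tau_x F, (\lambda - \Delta) \sum_{|x|\le k} \tau_x F \Big\rangle \le C(\lambda)\, n^2\, (2k + 1)^{-2} \Big\langle \sum_{|x|\le k} \tau_x f, (\lambda - S) \sum_{|x|\le k} \tau_x f \Big\rangle.$$
Letting $k \to \infty$ and using that the limits exist on both sides we obtain Lemma 4.3 for $\langle\!\langle \cdot, \cdot \rangle\!\rangle$. To prove Lemmas 4.4 and 4.5 for $\langle\!\langle \cdot, \cdot \rangle\!\rangle$, we only have to use that $A$ commutes with translations. Proof of Lemma 3.1. Upper Bound. We start with the variational formula for the $H_{-1}$ norm (see (2.5)),
$$\langle\!\langle w, (\lambda - L_n)^{-1} w \rangle\!\rangle = \sup_f \Big\{ 2\langle\!\langle w, f \rangle\!\rangle - \langle\!\langle (\lambda - L_n) f, (\lambda - S)^{-1} (\lambda - L_n) f \rangle\!\rangle \Big\}, \tag{4.10}$$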
where the supremum is carried over all finitely supported function f : Gn → R. By definition, (λ − Ln )f, (λ − S)−1 (λ − Ln )f = f, (λ − S)f + Af, (λ − S)−1 Af . Here we have set A+ = 0 on En since we are considering only Ln . For every 0 < b < 1, f, (λ − S)f + Af, (λ − S)−1 Af ≥ f, (λ − S)f + bAf, (λ − S)−1 Af . By Lemmas 4.3 and 4.5, choosing b = 1/C(λ)3 n4 , the previous expression is bounded below by 1 n2 C(λ)
F, (λ − )F +
1 AF, (λ − )−1 AF . C(λ)3 n4
Since f, w and F, w vanish except for functions of degree 2, it is easy to check that f, w = F, w . Thus we have w, (λ − Ln )−1 w ≤ sup 2F, w −
1 1 −1 AF, (λ − ) AF − F, (λ − )F C(λ)3 n4 C(λ)n2
F=Tf
≤ C(λ)3 n4 w, (λ − Ln )−1 w . This proves the upper bound. Lower Bound. We have w, (λ − Ln )−1 w = sup 2F, w − AF, (λ − )−1 AF − F, (λ − )F , F
where the supremum is carried over finitely supported functions F : ∪n≥1 (Zd )n → R. Since F˜ , w = F, w, from Lemma 4.5 we have w, (λ − Ln )−1 w ≤ sup 2F˜ , w − C −1 F, (λ − )F − F
1 ˜ , (λ − )−1 AF˜ . A F n2 C(λ)2
By Lemma 4.3, F, (λ−)F ≥ C(λ)n−2 F˜ , (λ−)F˜ . Recall R F = f, F˜ = T f . Thus the previous line is bounded above by 1 sup 2 T f, w − 2 T f, (λ − ) T f n C(λ) f=RF 1 −1 A T f, (λ − ) A T f . − 2 n C(λ)2 Clearly, T f, w = f, w. By Lemma 4.3, we have T f, (λ − ) T f ≥ f, (λ − S)f . By Lemma 4.5, we have w, (λ − Ln )−1 w 1 1 f, (λ − S)f − Af, (λ − S)−1 Af 2 C(λ)n C(λ)4 n6 ≤ C(λ)4 n6 w, (λ − LN )−1 w .
≤ 2f, w −
This proves the lemma.
Acknowledgement. H.-T. Yau would like to thank P. Deift, J. Baik and H. Spohn for explaining their results. In particular, Spohn has pointed out the relation (1.1) so that the connection between the current across the zero and the diffusion coefficient becomes transparent. H.-T. Yau would also like to thank A. Sznitman for his hospitality and invitation to lecture on this subject at ETH.
References
1. Baik, J., Deift, P.A., Johansson, K.: On the distribution of the length of the longest increasing subsequence in a random permutation. J. Am. Math. Soc. 12, 1119–1178 (1999)
2. Baik, J., Rains, E.M.: Limiting distributions for a polynuclear growth model with external sources. J. Stat. Phys. 100, 523–542 (2000)
3. van Beijeren, H., Kutner, R., Spohn, H.: Excess noise for driven diffusive systems. Phys. Rev. Lett. 54, 2026–2029 (1985)
4. Chang, C.C., Landim, C., Olla, S.: Equilibrium fluctuations of asymmetric exclusion processes in dimension d ≥ 3. Probab. Theor. Relat. Fields 119, 381–409 (2001)
5. Davies, E.B.: Heat Kernels and Spectral Theory. Cambridge: Cambridge U. Press, 1989
6. Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209, 437–476 (2000)
7. Kardar, M., Parisi, G., Zhang, Y.C.: Dynamical scaling of growing interfaces. Phys. Rev. Lett. 56, 889–892 (1986)
8. Landim, C., Olla, S., Yau, H.-T.: Some properties of the diffusion coefficient for asymmetric simple exclusion processes. Ann. Probab. 24(4), 1779–1808 (1996)
9. Landim, C., Yau, H.-T.: Fluctuation-dissipation equation of asymmetric simple exclusion processes. Probab. Theor. Relat. Fields 108(3), 321–356 (1997)
10. Prähofer, M., Spohn, H.: Current Fluctuations for the Totally Asymmetric Simple Exclusion Process. Preprint cond-mat/0101200 v2
11. Prähofer, M., Spohn, H.: Universal distribution for growth processes in 1+1 dimensions and random matrices. Phys. Rev. Lett. 84, 4882–4885 (2000)
12. Sethuraman, S., Varadhan, S.R.S., Yau, H.-T.: Diffusive limit of a tagged particle in asymmetric simple exclusion processes. Comm. Pure Appl. Math. 53(8), 972–1006 (2000)
13. Spitzer, F.: Principles of Random Walks. New York: Springer-Verlag, 1976
Communicated by H. Spohn
Commun. Math. Phys. 244, 483–525 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0999-x
Communications in
Mathematical Physics
The Hawking Effect for Spin 1/2 Fields Fabrice Melnyk Université Bordeaux I, 351, Cours de la Libération, 33405 Talence Cedex, France. E-mail:
[email protected] Received: 1 July 2002 / Accepted: 26 June 2003 Published online: 12 December 2003 – © Springer-Verlag 2003
Abstract: We prove the Hawking effect for the gravitational collapse of a charged star in the case of a charged massive Dirac field.
1. Introduction In this paper, we investigate the Hawking effect [14] in the case of the Dirac quantum field. We adopt the semi-classical approximation by supposing that the space-time curvature influences the fields, but the back-reaction on the metric is neglected. We then prove the emergence of a thermal state at the last moments of a gravitational collapse, which is interpreted by a static observer at infinity as an outgoing flux of particles and anti-particles. Moreover, the black-hole preferentially emits massive spin 1/2 particles whose charge has the same sign as its own charge. The Hawking effect and, more generally, the quantum effects in the vicinity of a black-hole have been the subject of numerous studies; we mention only the works that we have used: [5, 11, 25, 26]. A first mathematical study of the Hawking radiation was undertaken by J. Dimock and B. S. Kay [10]. In this work the authors consider the case of a Schwarzschild black hole for a Klein-Gordon field. By suitably quantizing this field in the vicinity of the past horizon of the black hole, the authors show that an observer located at future infinity observes the Hawking radiation. The case initially considered by S. Hawking, that of a gravitational collapse in the Fock vacuum, was examined by A. Bachelot. First, for a Klein-Gordon field [1], the author showed that an observer plunging into the future Schwarzschild black hole observes the Hawking radiation when he crosses the horizon of the black hole. In a second paper, for the same field, A. Bachelot obtained the proof of the Hawking effect [3]: a fixed observer in Schwarzschild variables observes, at the last moments of the collapse in his own proper time, an outgoing Hawking thermal flux coming from the horizon of the future Schwarzschild black hole. In [4], the same author
extends his study [1] to the case of a charged Dirac field for an observer plunging into a charged black hole resulting from a gravitational collapse. Just as was done for the Klein-Gordon field in [3], our contribution to this program of study is to prove the Hawking effect for a charged Dirac field from the point of view of a fixed observer in Schwarzschild variables for a collapsing charged star. More precisely, in this work (as in those of A. Bachelot) we consider a very simplified model of gravitational collapse, in which the star is modelled by a reflecting sphere: the properties of the star surface are given by the boundary condition for the Dirac field on this surface. Here, we choose the MIT bag boundary condition [6], which is conservative and which causes a reflection of the fields on the star surface, as a Dirichlet condition does for a bosonic field. These simplifying assumptions enable us to avoid difficult studies of the interactions between the fields and the fluid which composes the star, and of the behavior of this fluid at the time of gravitational collapse via the Einstein-Maxwell equations. Moreover, we suppose that the spherical symmetry of the charged star is preserved during the collapse; hence, outside the star and by the Birkhoff theorem, the DeSitter-Reissner-Nordstrøm or the Reissner-Nordstrøm space-times are relevant. The gravitational collapse occurs in the Fock vacuum. Although this last assumption is not physically correct in the case of DeSitter-Reissner-Nordstrøm space-time (see [13]), the mathematical proof remains valid. Indeed, in this case, it would be preferable to consider a thermal state whose temperature is the Gibbons-Hawking temperature associated with the cosmological horizon. A forthcoming work will study the Hawking effect for the Dirac field in (DeSitter-)Reissner-Nordstrøm space-time by considering the gravitational collapse in a thermal bath of arbitrary temperature.
This article is organized as follows. In the second part, we define the geometrical framework for a charged collapsing star described by the globally hyperbolic manifold $(\mathcal M_{coll}, g)$. This collapse creates the (DeSitter-)Reissner-Nordstrøm space-time $(\mathcal M_{bh}, g)$ produced by a charged black-hole. In the third part, we define the Dirac equation for a massive charged spin 1/2 field on $(\mathcal M_{coll}, g)$ with MIT bag boundary conditions on the star surface. The mixed problem is well-posed. In the fourth part, we study the scattering theory for the massive charged Dirac field in the charged eternal black-hole $(\mathcal M_{bh}, g)$. To do this, we introduce the useful wave operators at the horizon and at infinity. More particularly, we extend the studies of [16, 21, 18 and 19] by proving the asymptotic completeness for the classical wave operators at the horizon and infinity when we consider the curved DeSitter-Reissner-Nordstrøm space-time. In the fifth part, we construct the local algebra of observables $U(\mathcal M_{coll})$ as in [8 and 9], using the Dirac-Fermi Fock representation on some particular Cauchy hyper-surface. We define the KMS-state involving the (Hawking) temperature and the chemical potential. In this same section we state the main theorem of this work using the mathematical objects of the previous part. We interpret the result as a thermal state given by a KMS-state which is independent of the behavior of the collapse and of the boundary condition on the star for the Dirac field. The last section is devoted to the proofs of the technical results used to demonstrate the main theorem of this article.
2. Geometrical Description of a Gravitational Collapse We introduce the general geometrical framework describing the creation of a black-hole by the collapse of an idealized star. First, we consider the (DeSitter-)Reissner-Nordstrøm space-time outside a charged, static eternal black-hole in an expanding universe, as the
globally hyperbolic manifold $(\mathcal M_{bh}, g)$,
$$\mathcal M_{bh} = \mathbb R_t \times ]r_0, r_+[ \times S^2_\omega, \qquad 0 < r_0 < r_+ \le +\infty,$$
$$g_{ab}\, dx^a dx^b = F(r)\, dt^2 - F^{-1}(r)\, dr^2 - r^2 d\omega^2, \tag{1}$$
$$d\omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2, \qquad \omega = (\theta, \varphi) \in [0, \pi] \times [0, 2\pi[,$$
$$F(r) = 1 - \frac{2M}{r} + \frac{Q^2}{r^2} - \frac{\Lambda r^2}{3}.$$
Here, $Q \in \mathbb R$, $M > 0$, $\Lambda \ge 0$, $r_0$ and $r_+$ are respectively the electric charge, the mass, the cosmological constant, the radius of the horizon of the black-hole and the radius of the cosmological horizon. We have
F (r0 ) = F (r+ ) = 0,
2κ0 = F (r0 ) > 0,
2κ+ = F (r+ ) < 0, r ∈]r0 , r+ [⇒ F (r) > 0 with κ0 , κ+ the surface gravity at the black hole horizon and at the cosmological horizon. If = 0 then 2M Q2 F (r) = 1 − + 2 , 0 < |Q| ≤ M, r r r0 = M + M 2 − Q2 , r+ = +∞, and the globally hyperbolic manifold (Mbh , g) describes the Reissner-Nordstrøm spacetime which is asymptotically flat at spatial infinity. We introduce a radial coordinate r∗ , which straightens the radial null geodesics: r 1 1 2κ0 ln(r − r0 ) − − r∗ = dx + c, r ∈]r0 , r+ [, c ∈ R, 2κ0 x − r0 F (x) r0 (2) dr∗ (3) = F −1 . dr This coordinate shifts the horizon of the black-hole to the negative infinity and the cosmological horizon to the positive infinity. As we consider a black-hole created by the collapse of spherical charged star, if the exact spherical symmetry of the star in collapsing is maintained, outside of it, the (DeSitter-)Reissner-Nordstrøm geometry is relevant thanks to Birkoff’s theorem [15, 20]. Hence the space-time outside the spherical charged star with r∗ -radius z(t), t ∈ R, is the manifold (Mcoll , g) such that: Mcoll := (t, r∗ , ω) ∈ Rt × Rr∗ × Sω2 , r∗ ≥ z(t) ,
= ∪t∈R {t}×]z(t), +∞[r∗ ×Sω2 . (4) Following the general geometrical discussion about the same problem in [2 and 4], the reasonable assumptions of generic collapse lead to the following properties for z(t): z ∈ C 2 (R); ∀t ∈ R, − 1 < z˙ (t) ≤ 0, z(t) = −t − Cκ0 e
(5)
−2κ0 t
+ (t), Cκ0 > 0,
| (t)| + | ˙ (t)| = O e−4κ0 t , t → +∞.
(6)
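The conditions imposed on $F$ above can be checked numerically. The following is a minimal sketch, assuming illustrative values $M = 1$, $Q = 0.5$, $\Lambda = 0.01$ (these numbers and all function names are hypothetical and do not come from the article): it locates $r_0$ and $r_+$ by bisection and evaluates the surface gravities $\kappa_0 = F'(r_0)/2$ and $\kappa_+ = F'(r_+)/2$.

```python
def F(r, M, Q, Lam):
    """Metric function F(r) = 1 - 2M/r + Q^2/r^2 - Lam r^2/3 of (1)."""
    return 1.0 - 2.0 * M / r + Q**2 / r**2 - Lam * r**2 / 3.0

def dF(r, M, Q, Lam):
    """Analytic derivative F'(r)."""
    return 2.0 * M / r**2 - 2.0 * Q**2 / r**3 - 2.0 * Lam * r / 3.0

def bisect(f, a, b, steps=200):
    """Plain bisection; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(steps):
        c = 0.5 * (a + b)
        fc = f(c)
        if (fa < 0) == (fc < 0):
            a, fa = c, fc
        else:
            b = c
    return 0.5 * (a + b)

# Illustrative parameters (assumed, not taken from the article).
M, Q, Lam = 1.0, 0.5, 0.01
f = lambda r: F(r, M, Q, Lam)

r0 = bisect(f, 1.5, 3.0)         # black-hole horizon (bracket chosen by hand)
r_plus = bisect(f, 10.0, 20.0)   # cosmological horizon (bracket chosen by hand)
kappa0 = 0.5 * dF(r0, M, Q, Lam)
kappa_plus = 0.5 * dF(r_plus, M, Q, Lam)
```

For these sample values one expects $\kappa_0 > 0 > \kappa_+$ and $F > 0$ between the two horizons, in agreement with the sign conditions stated after (1). The brackets passed to `bisect` must be adapted if other parameters are used.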
We suppose that the star is stationary in the past. Moreover, we arbitrarily choose $c$ in (2) such that, for all $t \le 0$, $z(t) = z(0) < 0$. If we consider a ray of light leaving $x_0$ at $t = 0$, with $z(0) \le x_0 < 0$, then $\tau(x_0)$ is the time at which the ray is reflected by the surface of the star,
\[ \mathcal{S} := \bigcup_{t \in \mathbb{R}} \{(t, z(t))\} \times S^2_\omega, \]
that is, $\tau(x_0)$ is the unique solution of
\[ z(\tau(x_0)) + \tau(x_0) = x_0. \tag{7} \]
Thanks to property (6), we also have (see [1]):
\[ \tau(x_0) = -\frac{1}{2\kappa_0} \ln(-x_0) + \frac{1}{2\kappa_0} \ln(C_{\kappa_0}) + O(x_0), \qquad x_0 \to 0^-, \quad C_{\kappa_0} > 0, \tag{8} \]
\[ 1 + \dot{z}(\tau(x_0)) = -2\kappa_0 x_0 + O(x_0^2), \qquad x_0 \to 0^-. \tag{9} \]
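The asymptotics (8) and (9) can be illustrated on a model trajectory. The sketch below is only a sanity check under stated assumptions: it takes $\kappa_0 = 0.5$, $C_{\kappa_0} = 1$ and a hypothetical remainder $A e^{-4\kappa_0 t}$ with $A = 0.3$ for $t \ge 0$ (the stationary past of the star is not modelled), solves (7) by bisection, and compares with the leading terms.

```python
import math

KAPPA0, C, A = 0.5, 1.0, 0.3  # assumed sample values, not from the article

def offset(t):
    """z(t) + t for t >= 0, with the hypothetical remainder A e^{-4 kappa0 t}."""
    return -C * math.exp(-2 * KAPPA0 * t) + A * math.exp(-4 * KAPPA0 * t)

def z_dot(t):
    """Derivative of the model trajectory z(t) = -t + offset(t)."""
    return (-1.0 + 2 * KAPPA0 * C * math.exp(-2 * KAPPA0 * t)
            - 4 * KAPPA0 * A * math.exp(-4 * KAPPA0 * t))

def tau(x0, t_max=100.0, steps=200):
    """Solve z(tau) + tau = x0, i.e. equation (7), by bisection on [0, t_max]."""
    a, b = 0.0, t_max
    for _ in range(steps):
        c = 0.5 * (a + b)
        if offset(c) < x0:   # offset is increasing in t for these parameters
            a = c
        else:
            b = c
    return 0.5 * (a + b)

def tau_leading(x0):
    """Leading terms of (8): -(1/2k) ln(-x0) + (1/2k) ln(C)."""
    return (-math.log(-x0) + math.log(C)) / (2 * KAPPA0)
```

For small negative `x0`, `tau(x0) - tau_leading(x0)` should be of order `x0`, and `1 + z_dot(tau(x0)) + 2 * KAPPA0 * x0` of order `x0**2`, which is the content of (8) and (9) for this model.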
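3. The Dirac Equation

For spin 1/2 particles with real charge $q$ and mass $m > 0$, the Dirac equation on $(\mathcal{M}_{coll}, g)$ has the general form (see [4 and 22])
\[ \left\{ \frac{i\gamma^0}{\sqrt{F}} \left( \partial_t + i\frac{qQ}{r} \right) + \frac{i\gamma^1}{\sqrt{F}} \left( \partial_{r_*} + \frac{F}{r} + \frac{F'}{4} \right) + \frac{i\gamma^2}{r} \left( \partial_\theta + \frac{1}{2}\cot\theta \right) + \frac{i\gamma^3}{r \sin\theta} \partial_\varphi - m \right\} \Psi = 0, \tag{10} \]
where the Dirac matrices $\gamma^k$ satisfy
\[ \gamma^a \gamma^b + \gamma^b \gamma^a = 2\eta^{ab} I_{\mathbb{R}^4}, \qquad a, b = 0, .., 3, \qquad \eta^{ab} = \mathrm{Diag}(1, -1, -1, -1), \tag{11} \]
\[ \gamma^0 = i \begin{pmatrix} 0 & \sigma^0 \\ -\sigma^0 & 0 \end{pmatrix}, \qquad \gamma^k = i \begin{pmatrix} 0 & \sigma^k \\ \sigma^k & 0 \end{pmatrix}, \quad k = 1, 2, 3, \tag{12} \]
with the Pauli matrices
\[ \sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^3 = i \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{13} \]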
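As a sanity check of the conventions (11)-(13), the following self-contained sketch builds the four matrices $\gamma^a$ from the Pauli matrices and verifies the anticommutation relations. It also evaluates the products $-i\gamma^0\gamma^1\gamma^2\gamma^3$ and $i\gamma^0\gamma^1$, which appear later in the text as $\gamma^5$ and $\Gamma^1$. The helper names are ours and purely illustrative.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

ZERO = [[0, 0], [0, 0]]
SIGMA = [
    [[1, 0], [0, 1]],       # sigma^0
    [[1, 0], [0, -1]],      # sigma^1
    [[0, 1], [1, 0]],       # sigma^2
    [[0, -1j], [1j, 0]],    # sigma^3 = i * [[0, -1], [1, 0]]
]

# gamma^0 and gamma^k as in (12)
GAMMA = [scale(1j, block(ZERO, SIGMA[0], scale(-1, SIGMA[0]), ZERO))]
GAMMA += [scale(1j, block(ZERO, SIGMA[k], SIGMA[k], ZERO)) for k in (1, 2, 3)]
ETA = [1, -1, -1, -1]

def is_clifford():
    """Check gamma^a gamma^b + gamma^b gamma^a = 2 eta^{ab} I for all a, b."""
    for a in range(4):
        for b in range(4):
            anti = add(matmul(GAMMA[a], GAMMA[b]), matmul(GAMMA[b], GAMMA[a]))
            expected = diag([2 * ETA[a] if a == b else 0] * 4)
            if anti != expected:
                return False
    return True

GAMMA5 = scale(-1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))
BIG_GAMMA1 = scale(1j, matmul(GAMMA[0], GAMMA[1]))
```

All entries are small Gaussian integers, so the comparisons are exact. With these conventions `GAMMA5` is `diag(1, 1, -1, -1)` and `BIG_GAMMA1` is `i * diag(-1, 1, 1, -1)`.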
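On the star surface, we impose the following boundary condition, written for $(t, r_*, \omega) \in \mathcal{S}$ as
\[ n_j \gamma^j \Psi(t, r_*, \omega) = \mathcal{B}\Psi(t, r_*, \omega), \tag{14} \]
where $n_j$ is the outgoing normal of $\mathcal{M}_{coll}$, viewed as a subset of $\mathbb{R}_t \times \mathbb{R}_{r_*} \times S^2_\omega$, and $\mathcal{B}$ is some operator, local in time and rotationally invariant, which conserves the $L^2$ norm. We choose $\mathcal{B}$ such that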
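(14) forms a family, indexed by a parameter $\nu$, of non-equivalent boundary conditions defined by the generalized MIT boundary condition $\mathcal{B}^\nu_{MIT}$ (see [6]):
\[ \mathcal{B}^\nu_{MIT}\Psi := i e^{i\nu\gamma^5} \Psi(t, r_*, \omega), \qquad \gamma^5 := -i\gamma^0\gamma^1\gamma^2\gamma^3 = \mathrm{diag}(1, 1, -1, -1), \tag{15} \]
where the parameter $\nu$ is the chiral angle. We suppose that $\nu \in \mathbb{R}$ if $m > 0$ with $r_+ < +\infty$, and $\nu \neq (2k + 1)\pi$, $k \in \mathbb{Z}$, if $m > 0$ with $r_+ = +\infty$. We introduce the Hilbert spaces:
\[ \mathbf{L}^2_t := L^2\left( ]z(t), +\infty[_{r_*} \times S^2_\omega,\; r^2 F^{1/2}(r)\,dr_* d\omega \right)^4, \qquad \mathbf{L}^2_{BH} := L^2\left( \mathbb{R}_{r_*} \times S^2_\omega,\; r^2 F^{1/2}(r)\,dr_* d\omega \right)^4. \tag{16} \]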
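The norms of these spaces are denoted by $\|\cdot\|_t$ and $\|\cdot\|$. Moreover, for $\Psi \in \mathbf{L}^2_t$,
\[ \|\Psi\|_t = \|[\Psi]_L\|, \qquad [\Psi]_L(r_*, \omega) = \begin{cases} \Psi(r_*, \omega), & r_* \in ]z(t), +\infty[_{r_*}, \\ 0, & r_* \in \mathbb{R} \setminus ]z(t), +\infty[_{r_*}. \end{cases} \]
Hence, respectively on $(\mathcal{M}_{coll}, g)$ and on $(\mathcal{M}_{bh}, g)$, we consider the hyperbolic mixed problems:
\[ \partial_t \Psi = i\mathbb{D}_t \Psi, \qquad z(t) < r_*, \tag{17} \]
\[ \frac{\dot{z}\gamma^0 - \gamma^1}{\sqrt{1 - \dot{z}^2}}\, \Psi(t, z(t)) = -i e^{i\nu\gamma^5} \Psi(t, z(t)), \tag{18} \]
\[ \Psi(t = s, .) = \Psi_s(.) \in \mathbf{L}^2_s, \tag{19} \]
and
\[ \partial_t \Psi = i\mathbb{D}_{BH} \Psi, \tag{20} \]
\[ \Psi(t = 0, .) = \Psi_{BH}(.) \in \mathbf{L}^2_{BH}, \tag{21} \]
with $\mathbb{D}_t$ defined on $\mathbf{L}^2_t$ and $\mathbb{D}_{BH}$ defined on $\mathbf{L}^2_{BH}$, such that:
\[ \mathbb{D}_t,\ \mathbb{D}_{BH} = -\frac{qQ}{r} + \Gamma^1 \left( \partial_{r_*} + \frac{F(r)}{r} + \frac{F'(r)}{4} \right) + \sqrt{F(r)} \left( \frac{\Gamma^2}{r} \left( \partial_\theta + \frac{1}{2}\cot\theta \right) + \frac{\Gamma^3}{r \sin\theta} \partial_\varphi + \Gamma^4 \right), \tag{22} \]
\[ \Gamma^1 := i\gamma^0\gamma^1 = i\,\mathrm{Diag}(-1, 1, 1, -1), \quad \Gamma^2 := i\gamma^0\gamma^2, \quad \Gamma^3 := i\gamma^0\gamma^3, \quad \Gamma^4 := -m\gamma^0, \tag{23} \]
\[ D(\mathbb{D}_t) = \left\{ \Psi \in \mathbf{L}^2_t,\ \mathbb{D}_t \Psi \in \mathbf{L}^2_t;\ \frac{\dot{z}\gamma^0 - \gamma^1}{\sqrt{1 - \dot{z}^2}}\, \Psi(z(t), \omega) = -i e^{i\nu\gamma^5} \Psi(z(t), \omega) \right\} \tag{24} \]
and
\[ D(\mathbb{D}_{BH}) = \left\{ \Psi \in \mathbf{L}^2_{BH},\ \mathbb{D}_{BH} \Psi \in \mathbf{L}^2_{BH} \right\}. \tag{25} \]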
Proposition III.2 in [4] gives the solution $\Psi(t)$ of the hyperbolic problem (17), (18) and (19), expressed with the propagator $U(t, s)$:
Proposition 3.1. Given $\Psi_s \in D(\mathbb{D}_s)$, there exists a unique solution $[\Psi(.)]_L = [U(., s)\Psi_s]_L \in C^1(\mathbb{R}_t, \mathbf{L}^2_{BH})$ of (17), (18) and (19) such that, for all $t \in \mathbb{R}$, $\Psi(t) \in D(\mathbb{D}_t)$. Moreover,
\[ \|\Psi(t)\|_t = \|\Psi_s\|_s \]
and $U(t, s)$ can be extended to an isometric, strongly continuous propagator from $\mathbf{L}^2_s$ onto $\mathbf{L}^2_t$.

For the eternal black-hole, we have (see Theorem 4.1 in [17]):

Proposition 3.2. $\mathbb{D}_{BH}$ is a densely defined self-adjoint operator on $\mathbf{L}^2_{BH}$; hence the Cauchy problem (20), (21) has a unique solution $\Psi \in C^0(\mathbb{R}_t, \mathbf{L}^2_{BH})$, given by the strongly continuous unitary group $U(t) := e^{it\mathbb{D}_{BH}}$:
\[ \Psi(t) = U(t)\Psi_{BH}, \qquad \Psi(0) = \Psi_{BH}, \qquad \|\Psi(t)\| = \|\Psi_{BH}\|. \]
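4. Scattering by an Eternal Black-Hole

Since the Hawking effect arises from an asymptotic study of the fields, we define the wave operators for the eternal charged black-hole. Near the black-hole horizon (resp. near the cosmological horizon when $\Lambda \neq 0$), we compare the solution $\Psi$ of (20) on $\mathbf{L}^2_{BH}$ with the solution $\Psi_\leftarrow$ (resp. $\Psi_{\Lambda,\rightarrow}$) of
\[ \partial_t \Psi_\leftarrow = iD_\leftarrow \Psi_\leftarrow \qquad \left( \text{resp. } \partial_t \Psi_{\Lambda,\rightarrow} = iD_{\Lambda,\rightarrow} \Psi_{\Lambda,\rightarrow} \right), \]
where
\[ D_\leftarrow := \Gamma^1 \partial_{r_*} - \frac{qQ}{r_0} \qquad \left( \text{resp. } D_{\Lambda,\rightarrow} := \Gamma^1 \partial_{r_*} - \frac{qQ}{r_+} \right) \]
is self-adjoint on
\[ L^2_\leftarrow := L^2(\mathbb{R}_{r_*} \times S^2_\omega;\ dr_* d\omega)^4 \qquad \left( \text{resp. } L^2_{\Lambda,\rightarrow} := L^2_\leftarrow,\ \Lambda > 0 \right), \]
with the dense domain
\[ D(D_\leftarrow) = H^1(\mathbb{R}_{r_*}; L^2(S^2_\omega))^4 \qquad \left( \text{resp. } D(D_{\Lambda,\rightarrow}) = H^1(\mathbb{R}_{r_*}; L^2(S^2_\omega))^4 \right). \]
Thanks to the form of $\Gamma^1$, we define the subspaces of outgoing and incoming waves $L^{2+}_\leftarrow$ and $L^{2-}_\leftarrow$ such that
\[ L^2_\leftarrow = L^{2+}_\leftarrow \oplus L^{2-}_\leftarrow, \qquad L^{2+}_\leftarrow := \{ \Psi \in L^2_\leftarrow;\ \Psi_2 = \Psi_3 = 0 \}, \qquad L^{2-}_\leftarrow := \{ \Psi \in L^2_\leftarrow;\ \Psi_1 = \Psi_4 = 0 \}, \]
\[ L^2_{\Lambda,\rightarrow} = L^{2+}_{\Lambda,\rightarrow} \oplus L^{2-}_{\Lambda,\rightarrow}, \qquad L^{2+}_{\Lambda,\rightarrow} := L^{2+}_\leftarrow, \qquad L^{2-}_{\Lambda,\rightarrow} := L^{2-}_\leftarrow. \tag{26} \]
We introduce, for the two asymptotic regions, respectively the identifying operator between $L^2_\leftarrow$ and $\mathbf{L}^2_{BH}$ and the one between $L^2_{\Lambda,\rightarrow}$ and $\mathbf{L}^2_{BH}$:
\[ \mathcal{J}_\leftarrow : \Psi^\pm(r_*, \omega) \mapsto \chi_\leftarrow(r_*)\, r^{-1} F^{-1/4}(r)\, \Psi^\pm(r_*, \omega), \qquad \Psi^\pm \in L^{2\pm}_\leftarrow, \]
\[ \mathcal{J}_{\Lambda,\rightarrow} : \Psi^\pm(r_*, \omega) \mapsto \chi_\rightarrow(r_*)\, r^{-1} F^{-1/4}(r)\, \Psi^\pm(r_*, \omega), \qquad \Psi^\pm \in L^{2\pm}_{\Lambda,\rightarrow}, \]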
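where $\chi_\leftarrow$ and $\chi_\rightarrow$ are cut-off functions,
\[ \chi_\leftarrow \in C^\infty(\mathbb{R}_{r_*}), \quad \exists\, a, b \in \mathbb{R},\ 0 < a < b < 1, \quad \chi_\leftarrow(r_*) = \begin{cases} 1, & r_* < a, \\ 0, & r_* > b, \end{cases} \qquad \chi_\rightarrow = 1 - \chi_\leftarrow. \tag{27} \]
We define the wave operators $W^\pm_\leftarrow$ at the black-hole horizon and $W^\pm_{\Lambda,\rightarrow}$ at the cosmological horizon ($\Lambda > 0$) by
\[ W^\pm_\leftarrow \Psi^\pm = \lim_{t \to \pm\infty} U(-t)\, \mathcal{J}_\leftarrow\, e^{itD_\leftarrow} \Psi^\pm \quad \text{in } \mathbf{L}^2_{BH}, \qquad \Psi^\pm \in L^{2\pm}_\leftarrow, \tag{28} \]
\[ W^\pm_{\Lambda,\rightarrow} \Psi^\mp = \lim_{t \to \pm\infty} U(-t)\, \mathcal{J}_{\Lambda,\rightarrow}\, e^{itD_{\Lambda,\rightarrow}} \Psi^\mp \quad \text{in } \mathbf{L}^2_{BH}, \qquad \Psi^\mp \in L^{2\mp}_{\Lambda,\rightarrow}. \tag{29} \]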
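When $\Lambda = 0$, the space-time is asymptotically flat at infinity. Hence, we compare the solutions of (20) on $\mathbf{L}^2_{BH}$ with the solution $\Psi_\rightarrow$ of the Dirac equation on Minkowski space-time with spherical coordinates $(\rho, \omega) \in \mathbb{R}^*_+ \times [0, \pi] \times [0, 2\pi[$, putting $r_* = \rho > 0$ to avoid artificial long-range interactions:
\[ \partial_t \Psi_\rightarrow = iD_{0,\rightarrow} \Psi_\rightarrow, \]
where
\[ D_{0,\rightarrow} := \Gamma^1 \left( \partial_\rho + \frac{1}{\rho} \right) + \frac{\Gamma^2}{\rho} \left( \partial_\theta + \frac{1}{2}\cot\theta \right) + \frac{\Gamma^3}{\rho \sin\theta} \partial_\varphi + \Gamma^4 \]
is self-adjoint on
\[ L^2_{0,\rightarrow} := L^2(\mathbb{R}^+_\rho \times S^2_\omega;\ \rho^2 d\rho\, d\omega)^4 \]
with the dense domain $D(D_{0,\rightarrow}) = H^1(\mathbb{R}^+_\rho \times S^2_\omega;\ \rho^2 d\rho\, d\omega)^4$. We define the Dirac operator with Cartesian coordinates $\bar{D}_{0,\rightarrow}$ on $L^2(\mathbb{R}^3_x)^4$, with the help of the isometry $\mathcal{T}$ between $L^2(\mathbb{R}^3_x)^4$ and $L^2_{0,\rightarrow}$, such that: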
1 2 2 3 − γ γ ¯ ¯ T : (x) → (ρ, ω) = T (x), T = e 2 γ γ e− 4 γ γ e 2 4 , −1 1 2 3 0 ¯ p + mβ, α = i( , , ), β = −γ , p = −i∇. T D0,→ T = D0,→ = α.p ϕ
π
θ
π
1 2
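The previous comparison involves long-range perturbations due to the mass and the charge. Then, as in [17 and 19], we construct the Dollard-modified propagator $U_{0,\rightarrow}(t)$:
\[ U_{0,\rightarrow}(t) := \mathcal{T} u(t) \mathcal{T}^{-1}, \qquad u(t) := e^{it\lambda(\mathbf{p})} e^{iX^+(t)} P^0_+ + e^{-it\lambda(\mathbf{p})} e^{iX^-(t)} P^0_-, \]
\[ X^\pm(t) := \pm m^2 M \frac{\log(t)}{|\mathbf{u}(\mathbf{p})|\,\lambda(\mathbf{p})} - qQ \frac{\log(t)}{|\mathbf{u}(\mathbf{p})|}, \qquad \lambda(\mathbf{p}) := \sqrt{|\mathbf{p}|^2 + m^2}, \qquad \mathbf{u}(\mathbf{p}) := \mathbf{p}/\lambda(\mathbf{p}), \]
\[ \log(t) := t|t|^{-1} \ln|t|, \qquad P^0_\pm := \tfrac{1}{2}\left( 1 \mp \bar{D}_{0,\rightarrow}/\lambda(\mathbf{p}) \right). \tag{30} \]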
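We define the bounded identifying operator $\mathcal{J}_{0,\rightarrow}$ between $L^2_{0,\rightarrow}$ and $\mathbf{L}^2_{BH}$:
\[ \forall \Psi \in L^2_{0,\rightarrow}, \qquad (\mathcal{J}_{0,\rightarrow}\Psi)(r_*, \omega) := \begin{cases} \chi_\rightarrow(\rho = r_*)\, r^{-1} F^{-1/4}(r)\, r_*\, \Psi(\rho = r_*, \omega), & r_* > 0, \\ 0, & r_* \le 0, \end{cases} \]
and, in the case $\Lambda = 0$, the wave operator $W^\pm_{0,\rightarrow}$ at infinity, for all $\Psi \in L^2_{0,\rightarrow}$:
\[ W^\pm_{0,\rightarrow} \Psi = \lim_{t \to \pm\infty} U(-t)\, \mathcal{J}_{0,\rightarrow}\, U_{0,\rightarrow}(t) \Psi \quad \text{in } \mathbf{L}^2_{BH}. \tag{31} \]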
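Then, we state the theorem which is proved in the last part of this work:

Theorem 4.1. The operators $W^\pm_\leftarrow$, $W^\pm_{\Lambda,\rightarrow}$ and $W^\pm_{0,\rightarrow}$, respectively on $L^{2\pm}_\leftarrow$, $L^{2\mp}_{\Lambda,\rightarrow}$ and $L^2_{0,\rightarrow}$, exist and are independent of the choice of $\chi_\leftarrow$ and $\chi_\rightarrow$ satisfying (27). Moreover:
\[ \|W^\pm_\leftarrow \Psi^\pm\| = \|\Psi^\pm\|_{L^2_\leftarrow}, \qquad \forall \Psi^\pm \in L^{2\pm}_\leftarrow, \qquad (\Lambda \ge 0,\ m \ge 0), \]
\[ \|W^\pm_{\Lambda,\rightarrow} \Psi^\mp\| = \|\Psi^\mp\|_{L^2_{\Lambda,\rightarrow}}, \qquad \forall \Psi^\mp \in L^{2\mp}_{\Lambda,\rightarrow}, \qquad (\Lambda > 0,\ m \ge 0), \]
\[ \|W^\pm_{0,\rightarrow} \Psi\| = \|\Psi\|_{L^2_{0,\rightarrow}}, \qquad \forall \Psi \in L^2_{0,\rightarrow}, \qquad (\Lambda = 0,\ m > 0), \]
and
\[ \mathrm{Ran}\left( W^\pm_\leftarrow \oplus W^\pm_{\Lambda,\rightarrow} \right) = \mathbf{L}^2_{BH}, \qquad (\Lambda \ge 0). \]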
5. Dirac Quantum Field and Hawking Effect

5.1. Second quantization of the Dirac fields. We define the framework of Quantum Field Theory used to describe the Hawking effect. We use the approach of the algebras of local observables on curved space-time introduced by J. Dimock in [8 and 9]. First, we define the Fermi-Dirac Fock space, which describes states with an arbitrary number of non-interacting charged fermions. Given a complex Hilbert space $(\mathfrak{H}, \langle ., . \rangle_{\mathfrak{H}})$ and the charge conjugation $\Upsilon$ (see [24], Sect. 1.4.6), we split $\mathfrak{H}$ into two orthogonal spectral subspaces
\[ \mathfrak{H} = \mathfrak{H}_+ \oplus \mathfrak{H}_-, \qquad \mathfrak{H}_+ := P_+ \mathfrak{H}, \qquad \mathfrak{H}_- := \Upsilon P_- \mathfrak{H}, \tag{32} \]
where $P_+$ and $P_-$ are the spectral projectors on the positive and negative subspaces. We define $\mathcal{F}^{(1)}(\mathfrak{H}_+)$ and $\mathcal{F}^{(1)}(\mathfrak{H}_-)$, respectively the one particle space and the one anti-particle space, by
\[ \mathcal{F}^{(1)}(\mathfrak{H}_+) := \mathfrak{H}_+, \qquad \mathcal{F}^{(1)}(\mathfrak{H}_-) := \mathfrak{H}_-. \tag{33} \]
To treat various numbers of particles and anti-particles, we recall the definition of the Fermi-Dirac Fock space:
\[ \mathcal{F}(\mathfrak{H}) := \bigoplus_{n,m=0}^{+\infty} \mathcal{F}^{(n,m)}, \qquad \mathcal{F}^{(n,m)}(\mathfrak{H}) := \mathcal{F}^{(n)}(\mathfrak{H}_+) \otimes \mathcal{F}^{(m)}(\mathfrak{H}_-), \tag{34} \]
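where
\[ \mathcal{F}^{(0)}(\mathfrak{H}_+) := \mathbb{C}, \quad \mathcal{F}^{(0)}(\mathfrak{H}_-) := \mathbb{C}, \quad \mathcal{F}^{(n)}(\mathfrak{H}_+) := \bigwedge_{k=1}^{n} \mathfrak{H}_+, \quad \mathcal{F}^{(m)}(\mathfrak{H}_-) := \bigwedge_{k=1}^{m} \mathfrak{H}_-. \tag{35} \]
An element $\psi$ of $\mathcal{F}(\mathfrak{H})$ consists of the sequence $\psi = (\psi^{(n,m)})_{n,m \in \mathbb{N}}$, with $\psi^{(n,m)} \in \mathcal{F}^{(n,m)}(\mathfrak{H})$. The vacuum vector is the vector $\Omega_{vac} \in \mathcal{F}(\mathfrak{H})$ satisfying
\[ (n, m) = (0, 0) \Rightarrow \Omega^{(0,0)}_{vac} = 1, \qquad (n, m) \neq (0, 0) \Rightarrow \Omega^{(n,m)}_{vac} = 0. \tag{36} \]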
We define the quantized Dirac field operator $\Psi$ and its adjoint $\Psi^*$:
\[ f \in \mathfrak{H} \longmapsto \Psi(f) := a(P_+ f) + b^*(\Upsilon P_- f) \in \mathcal{L}(\mathcal{F}(\mathfrak{H})), \]
\[ f \in \mathfrak{H} \longmapsto \Psi^*(f) := a^*(P_+ f) + b(\Upsilon P_- f) \in \mathcal{L}(\mathcal{F}(\mathfrak{H})), \]
where $a(P_+ f)$, $a^*(P_+ f)$, $b(P_- f)$, $b^*(P_- f)$ are respectively the particle annihilation and creation operators and the anti-particle annihilation and creation operators. The quantized Dirac field is an anti-linear and bounded operator and, thanks to the classical properties of the creation and annihilation operators, it satisfies the canonical anti-commutation relations (CAR). We consider the $C^*$-algebra $\mathfrak{U}(\mathfrak{H})$ generated by the field operators $\Psi^*(f)\Psi(g)$, with $f, g \in \mathfrak{H}$. For an observable $A \in \mathfrak{U}(\mathfrak{H})$, we define the vacuum state as
\[ \omega_{vac}(A) := \langle A\,\Omega_{vac}, \Omega_{vac} \rangle. \]
Then, by a straightforward computation, for $f, g \in \mathfrak{H}$ we have
\[ \omega_{vac}(\Psi^*(f)\Psi(g)) = \langle P_- f, g \rangle_{\mathfrak{H}}. \tag{37} \]
Given a Dirac-type equation with Hamiltonian $H$, satisfied by the one particle field $f_D$: $\partial_t f_D = iHf_D$, we choose the spectral projectors $P_+$ and $P_-$ such that
\[ P_+ := 1_{]-\infty,0]}(H), \qquad P_- := 1_{[0,+\infty[}(H). \tag{38} \]
On $\mathfrak{U}(\mathfrak{H})$, we also introduce the KMS state $\omega^{\delta,\sigma}_{KMS}$, depending on $\sigma > 0$ and $\delta \in \mathbb{R}$, such that for $f, g \in \mathfrak{H}$:
\[ \omega^{\delta,\sigma}_{KMS}(\Psi^*(f)\Psi(g)) := \langle \mu e^{\sigma H} (1 + \mu e^{\sigma H})^{-1} f, g \rangle_{\mathfrak{H}}, \qquad \mu := e^{\sigma\delta}. \tag{39} \]
The restriction of this KMS state to the sub-algebra $\mathfrak{U}(\mathfrak{H}_+)$ (resp. $\mathfrak{U}(\mathfrak{H}_-)$) of $\mathfrak{U}(\mathfrak{H})$ corresponds to the Gibbs equilibrium state describing the thermodynamic models for non-interacting Fermi particles (resp. anti-particles) with temperature $\sigma^{-1} > 0$ and chemical potential $\delta$ (resp. $-\delta$). Following J. Dimock [9], we construct the algebra of local observables in the space-time outside the collapsing star with the help of a given CAR representation on a Cauchy hyper-surface. In fact, this construction does not depend on the choice of the CAR representation, of the spin structure or of the hyper-surface. Then, in particular, we consider the
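Fermi-Dirac Fock representation and the following foliation of the globally hyperbolic manifold:
\[ \mathcal{M}_{coll} = \bigcup_{t \in \mathbb{R}} \Sigma_t, \qquad \Sigma_t := \{t\} \times ]z(t), +\infty[_{r_*} \times S^2_\omega. \]
We consider $\Sigma_0$, and we put
\[ \mathfrak{H} := L^2\left( ]z(0), +\infty[ \times S^2_\omega,\; r^2 F^{1/2}(r)\,dr_* d\omega \right)^4 = \mathbf{L}^2_0, \qquad H := \mathbb{D}_0. \tag{40} \]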
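Using the previous definition of the Dirac quantum field, we define on $\mathbf{L}^2_0$ the quantized Dirac field $\Psi_0$, and $\mathfrak{U}(\mathfrak{H})$ the $C^*$-algebra generated by $\Psi^*_0(\Phi_1)\Psi_0(\Phi_2)$, with $\Phi_1, \Phi_2 \in \mathfrak{H}$. We introduce the following operator:
\[ S_{coll} : \Phi \in C^\infty_0(\mathcal{M}_{coll})^4 \longmapsto S_{coll}\Phi := \int_{\mathbb{R}} U(0, t)\Phi(t)\,dt \in \mathbf{L}^2_0, \tag{41} \]
where $U(0, t)$ is the propagator defined in Proposition 3.1. Then, we define the local quantum field in $\mathcal{M}_{coll}$ by the operator:
\[ \Psi_{coll} : \Phi \in C^\infty_0(\mathcal{M}_{coll})^4 \longmapsto \Psi_{coll}(\Phi) := \Psi_0(S_{coll}\Phi), \tag{42} \]
and, for any open set $O \subset \mathcal{M}_{coll}$, we introduce $\mathfrak{U}(O)$, the $C^*$-algebra generated by $\Psi^*_{coll}(\Phi_1)\Psi_{coll}(\Phi_2)$, $\mathrm{supp}(\Phi_j) \subset O$, $j = 1, 2$. Finally, we have:
\[ \mathfrak{U}(\mathcal{M}_{coll}) = \mathrm{adh}\left( \bigcup_{O \subset \mathcal{M}_{coll}} \mathfrak{U}(O) \right). \]
Then, thanks to (37), (38) and (40), we define on $\mathfrak{U}(\mathcal{M}_{coll})$ a ground state $\omega_{\mathcal{M}_{coll}}$ as follows:
\[ \omega_{\mathcal{M}_{coll}}(\Psi^*_{coll}(\Phi_1)\Psi_{coll}(\Phi_2)) := \omega_{vac}(\Psi^*_0(S_{coll}\Phi_1)\Psi_0(S_{coll}\Phi_2)) = \langle 1_{[0,+\infty[}(\mathbb{D}_0) S_{coll}\Phi_1, S_{coll}\Phi_2 \rangle_{\mathfrak{H}}. \tag{43} \]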
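We describe the quantum field at the horizon of a future black-hole. We consider the stationary space-time $\mathcal{M}_{bh}$ with the associated Dirac Hamiltonian $D_\leftarrow$ for the one particle field. Using the Fermi-Dirac Fock quantization on $\mathbb{R}_{r_*} \times S^2_\omega$, we define the field $\Psi_-(\Phi)$ with $\Phi \in L^2_\leftarrow$, and the operator $S_\leftarrow$ such that
\[ S_\leftarrow : \Phi \in C^\infty_0(\mathcal{M}_{bh})^4 \longmapsto S_\leftarrow\Phi := \int_{\mathbb{R}} e^{-itD_\leftarrow}\Phi(t)\,dt. \tag{44} \]
We also introduce
\[ \Psi_\leftarrow : \Phi \in C^\infty_0(\mathcal{M}_{bh})^4 \longmapsto \Psi_\leftarrow(\Phi) := \Psi_-(S_\leftarrow\Phi), \tag{45} \]
and the $C^*$-algebra $\mathfrak{U}_\leftarrow(\mathcal{M}_{bh})$ generated by $\Psi^*_\leftarrow(\Phi_1)\Psi_\leftarrow(\Phi_2)$, $\Phi_1, \Phi_2 \in L^2_\leftarrow$. Using (39), we consider the Hawking thermal state:
\[ \omega^{\delta,\sigma}_{Haw}(\Psi^*_\leftarrow(\Phi_1)\Psi_\leftarrow(\Phi_2)) := \omega^{\delta,\sigma}_{KMS}(\Psi^*_-(S_\leftarrow\Phi_1)\Psi_-(S_\leftarrow\Phi_2)), \qquad \Phi_1, \Phi_2 \in C^\infty_0(\mathcal{M}_{bh})^4, \tag{46} \]
\[ = \langle \mu e^{\sigma D_\leftarrow} (1 + \mu e^{\sigma D_\leftarrow})^{-1} S_\leftarrow\Phi_1, S_\leftarrow\Phi_2 \rangle_{L^2_\leftarrow}, \tag{47} \]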
with
\[ \mu := e^{\sigma\delta}, \qquad \delta \in \mathbb{R}, \qquad \sigma > 0. \tag{48} \]
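The spectral multiplier $\mu e^{\sigma h}(1 + \mu e^{\sigma h})^{-1}$ appearing in (39) and (47) is a Fermi-Dirac factor. The short sketch below, with function names of our own choosing, compares it with the textbook occupation number $1/(e^{(E - \delta)/T} + 1)$ at temperature $T = \sigma^{-1}$. It relies on convention (38), under which particles sit on the non-positive part of the spectrum, so that a particle of energy $E \ge 0$ corresponds to the spectral value $h = -E$; this reading is our interpretation of the conventions above.

```python
import math

def kms_symbol(h, sigma, delta):
    """mu e^{sigma h} / (1 + mu e^{sigma h}) with mu = e^{sigma delta}, computed stably."""
    x = sigma * (h + delta)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def fermi_dirac(energy, temperature, chem_potential):
    """Standard Fermi-Dirac occupation 1 / (exp((E - mu)/T) + 1)."""
    return 1.0 / (math.exp((energy - chem_potential) / temperature) + 1.0)
```

With `sigma = 2.0` and `delta = 0.3`, `kms_symbol(-E, sigma, delta)` agrees with `fermi_dirac(E, 1/sigma, delta)` for particles, while the complementary factor `1 - kms_symbol(E, sigma, delta)` agrees with `fermi_dirac(E, 1/sigma, -delta)`, which matches the chemical potentials $\delta$ and $-\delta$ quoted for particles and anti-particles.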
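Now, we describe the quantum field at the spatial infinity of the future black-hole. According to whether $\Lambda$ is positive or zero (cosmological horizon or asymptotically flat space-time), we consider the stationary space-times $\mathcal{M}_{bh}$ or $\mathcal{M}_{flat} := \mathbb{R}_t \times \mathbb{R}^+_{r_*} \times S^2_\omega$, with the Dirac Hamiltonians $D_{\Lambda,\rightarrow}$ and $D_{0,\rightarrow}$ associated with a one particle field. As above, using the Fermi-Dirac Fock quantization on $\mathbb{R}_{r_*} \times S^2_\omega$ or $\mathbb{R}^+_{r_*} \times S^2_\omega$, we define the fields $\Psi_{\Lambda,+}(\Phi_1)$ with $\Phi_1 \in L^2_{\Lambda,\rightarrow}$ or $\Psi_{0,+}(\Phi_1)$ with $\Phi_1 \in L^2_{0,\rightarrow}$, and the operators $S_{\Lambda,\rightarrow}$ or $S_{0,\rightarrow}$ characterized by:
\[ S_{\Lambda,\rightarrow} : \Phi \in C^\infty_0(\mathcal{M}_{bh})^4 \longmapsto S_{\Lambda,\rightarrow}\Phi := \int_{\mathbb{R}} e^{-itD_{\Lambda,\rightarrow}}\Phi(t)\,dt, \qquad \Lambda > 0, \tag{49} \]
\[ S_{0,\rightarrow} : \Phi \in C^\infty_0(\mathcal{M}_{flat})^4 \longmapsto S_{0,\rightarrow}\Phi := \int_{\mathbb{R}} U_{0,\rightarrow}(-t)\Phi(t)\,dt, \tag{50} \]
where $U_{0,\rightarrow}$ is the Dollard-modified propagator given by formula (30). Then, we construct the $C^*$-algebras $\mathfrak{U}_\rightarrow(\mathcal{M}_{bh})$ and $\mathfrak{U}_\rightarrow(\mathcal{M}_{flat})$, respectively generated by $\Psi^*_{\Lambda,\rightarrow}(\Phi_1)\Psi_{\Lambda,\rightarrow}(\Phi_2)$ with $\Phi_1, \Phi_2 \in L^2_{\Lambda,\rightarrow}$ and by $\Psi^*_{0,\rightarrow}(\Phi_1)\Psi_{0,\rightarrow}(\Phi_2)$ with $\Phi_1, \Phi_2 \in L^2_{0,\rightarrow}$, where
\[ \Psi_{\Lambda,\rightarrow} : \Phi \in C^\infty_0(\mathcal{M}_{bh})^4 \longmapsto \Psi_{\Lambda,\rightarrow}(\Phi) := \Psi_{\Lambda,+}(S_{\Lambda,\rightarrow}\Phi), \qquad \Lambda > 0, \tag{51} \]
\[ \Psi_{0,\rightarrow} : \Phi \in C^\infty_0(\mathcal{M}_{flat})^4 \longmapsto \Psi_{0,\rightarrow}(\Phi) := \Psi_{0,+}(S_{0,\rightarrow}\Phi). \tag{52} \]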
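With (37), the vacuum states on each algebra $\mathfrak{U}_\rightarrow(\mathcal{M}_{bh})$ and $\mathfrak{U}_\rightarrow(\mathcal{M}_{flat})$ are given by
\[ \omega_{vac}(\Psi^*_{\Lambda,\rightarrow}(\Phi_1)\Psi_{\Lambda,\rightarrow}(\Phi_2)) = \langle P^\Lambda_- S_{\Lambda,\rightarrow}\Phi_1, S_{\Lambda,\rightarrow}\Phi_2 \rangle_{L^2_{\Lambda,\rightarrow}}, \qquad \Lambda > 0, \tag{53} \]
\[ \Phi_1, \Phi_2 \in C^\infty_0(\mathcal{M}_{bh}), \qquad P^\Lambda_- := 1_{[0,\infty[}(D_{\Lambda,\rightarrow}), \tag{54} \]
\[ \omega_{vac}(\Psi^*_{0,\rightarrow}(\Phi_1)\Psi_{0,\rightarrow}(\Phi_2)) = \langle P^0_- S_{0,\rightarrow}\Phi_1, S_{0,\rightarrow}\Phi_2 \rangle_{L^2_{0,\rightarrow}}, \tag{55} \]
\[ \Phi_1, \Phi_2 \in C^\infty_0(\mathcal{M}_{flat}), \qquad P^0_- := 1_{[0,\infty[}(D_{0,\rightarrow}). \tag{56} \]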
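Since we are interested in the state of the quantum field at the last moment of the gravitational collapse, we investigate the following limit:
\[ \lim_{T \to +\infty} \omega_{\mathcal{M}_{coll}}(\Psi^*_{coll}(\Phi^T_1)\Psi_{coll}(\Phi^T_2)), \]
where
\[ \Phi^T_j(t, r_*, \omega) := \Phi_j(t - T, r_*, \omega), \qquad \Phi_j \in C^\infty_0(\mathcal{M}_{coll})^4, \qquad j = 1, 2, \]
and $\omega_{\mathcal{M}_{coll}}$ and $\Psi_{coll}$ are defined by (43) and (42). Then, we state the main theorem of this work.

Theorem 5.1 (Main Result). Given $\Phi_j \in C^\infty_0(\mathcal{M}_{coll})^4$, $j = 1, 2$, we have for $\Lambda \ge 0$,
\[ \lim_{T \to +\infty} \omega_{\mathcal{M}_{coll}}(\Psi^*_{coll}(\Phi^T_1)\Psi_{coll}(\Phi^T_2)) = \omega^{\delta,\sigma}_{Haw}(\Psi^*_\leftarrow(\Omega^-_\leftarrow\Phi_1)\Psi_\leftarrow(\Omega^-_\leftarrow\Phi_2)) + \omega_{vac}(\Psi^*_{\Lambda,\rightarrow}(\Omega^-_{\Lambda,\rightarrow}\Phi_1)\Psi_{\Lambda,\rightarrow}(\Omega^-_{\Lambda,\rightarrow}\Phi_2)), \]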
with
\[ T_{Haw} = \frac{1}{\sigma} = \frac{\kappa_0}{2\pi}, \qquad \delta = \frac{qQ}{r_0}. \]
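To make the two parameters of Theorem 5.1 concrete, the following sketch evaluates $T_{Haw} = \kappa_0/2\pi$ and $\delta = qQ/r_0$ in the asymptotically flat case $\Lambda = 0$, where $r_0 = M + \sqrt{M^2 - Q^2}$ and $\kappa_0 = F'(r_0)/2$ are explicit. Geometric units are assumed, and the function names are illustrative rather than taken from the article.

```python
import math

def rn_horizon(M, Q):
    """Outer horizon r0 = M + sqrt(M^2 - Q^2) of Reissner-Nordstrom (Lambda = 0)."""
    return M + math.sqrt(M * M - Q * Q)

def surface_gravity(M, Q):
    """kappa0 = F'(r0)/2 for F(r) = 1 - 2M/r + Q^2/r^2."""
    r0 = rn_horizon(M, Q)
    return M / r0**2 - Q**2 / r0**3

def hawking_parameters(M, Q, q):
    """Temperature T_Haw = kappa0/(2 pi), inverse temperature sigma, chemical potential delta."""
    r0 = rn_horizon(M, Q)
    kappa0 = surface_gravity(M, Q)
    return {
        "T_Haw": kappa0 / (2.0 * math.pi),
        "sigma": 2.0 * math.pi / kappa0,
        "delta": q * Q / r0,
    }
```

As a consistency check, the uncharged limit `Q = 0` (outside the range $0 < |Q|$ considered in the text, and used here only as a sanity test) gives the familiar value $T_{Haw} = 1/(8\pi M)$.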
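Proof of Theorem 5.1. For $\Phi \in C^\infty_0(\mathcal{M}_{coll})^4$, by the polarization identity, it is sufficient to evaluate
\[ \lim_{T \to +\infty} \omega_{\mathcal{M}_{coll}}(\Psi^*_{coll}(\Phi^T)\Psi_{coll}(\Phi^T)) = \lim_{T \to +\infty} \left\| 1_{[0,+\infty[}(\mathbb{D}_0)\, S_{coll}\Phi^T \right\|^2_0 = \lim_{T \to +\infty} \left\| 1_{[0,+\infty[}(\mathbb{D}_0)\, U(0, T) S_{bh}\Phi \right\|^2_0, \tag{57} \]
because, for $T > 0$ large enough, we have:
\[ S_{coll}\Phi^T = U(0, T) S_{bh}\Phi, \qquad S_{bh}\Phi := \int_{\mathbb{R}} U(-t)\Phi(t)\,dt. \]
Then, we use the key theorem that we prove in the next section:

Theorem 5.2. Given $f \in \mathbf{L}^2_{BH}$, if $\Lambda \ge 0$, then
\[ \lim_{T \to +\infty} \left\| 1_{[0,+\infty[}(\mathbb{D}_0)\, U(0, T) f \right\|^2_0 = \left\| 1_{[0,+\infty[}(D_{\Lambda,\rightarrow})\, \Omega^-_{\Lambda,\rightarrow} f \right\|^2_{L^2_{\Lambda,\rightarrow}} + \left\langle \Omega^-_\leftarrow f,\ \mu e^{\sigma D_\leftarrow} \left( 1 + \mu e^{\sigma D_\leftarrow} \right)^{-1} \Omega^-_\leftarrow f \right\rangle_{L^2_\leftarrow} \tag{58} \]
with
\[ \mu = e^{\sigma\delta}, \quad \delta := \frac{qQ}{r_0}, \quad \sigma = \frac{2\pi}{\kappa_0}, \qquad \Omega^-_\leftarrow := (W^-_\leftarrow)^*, \quad \Omega^-_{\Lambda,\rightarrow} := (W^-_{\Lambda,\rightarrow})^*, \quad \Omega^-_{0,\rightarrow} := (W^-_{0,\rightarrow})^*, \]
where $W^-_\leftarrow$, $W^-_{\Lambda,\rightarrow}$, $W^-_{0,\rightarrow}$ are the wave operators respectively defined in (28), (29) and (31).

According to (57) and the previous theorem, for $\Lambda \ge 0$, we deduce that:
\[ \lim_{T \to +\infty} \omega_{\mathcal{M}_{coll}}(\Psi^*_{coll}(\Phi^T)\Psi_{coll}(\Phi^T)) = \left\| 1_{[0,+\infty[}(D_{\Lambda,\rightarrow})\, \Omega^-_{\Lambda,\rightarrow} S_{bh}\Phi \right\|^2_{L^2_{\Lambda,\rightarrow}} + \left\langle \Omega^-_\leftarrow S_{bh}\Phi,\ \mu e^{\sigma D_\leftarrow} \left( 1 + \mu e^{\sigma D_\leftarrow} \right)^{-1} \Omega^-_\leftarrow S_{bh}\Phi \right\rangle_{L^2_\leftarrow} \]
\[ = \left\| 1_{[0,+\infty[}(D_{\Lambda,\rightarrow})\, S_{\Lambda,\rightarrow}\Omega^-_{\Lambda,\rightarrow}\Phi \right\|^2_{L^2_{\Lambda,\rightarrow}} + \left\langle S_\leftarrow\Omega^-_\leftarrow\Phi,\ \mu e^{\sigma D_\leftarrow} \left( 1 + \mu e^{\sigma D_\leftarrow} \right)^{-1} S_\leftarrow\Omega^-_\leftarrow\Phi \right\rangle_{L^2_\leftarrow} \]
\[ = \omega^{\delta,\sigma}_{Haw}(\Psi^*_\leftarrow(\Omega^-_\leftarrow\Phi)\Psi_\leftarrow(\Omega^-_\leftarrow\Phi)) + \omega_{vac}(\Psi^*_{\Lambda,\rightarrow}(\Omega^-_{\Lambda,\rightarrow}\Phi)\Psi_{\Lambda,\rightarrow}(\Omega^-_{\Lambda,\rightarrow}\Phi)). \]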
5.2. Discussion. The interpretation of the previous theorem in terms of particles is more delicate. Indeed, there are as many definitions of particles as types of observers. In Minkowski space-time, thanks to the Lorentz transformations, we naturally define the particles linked to inertial observers. For general curved space-times, there are no similar transformations and the notion of particle is rather vague. In Theorem 5.1, the state $\omega_{\mathcal{M}_{coll}}(\Psi^*_{coll}(\Phi^T)\Psi_{coll}(\Phi^T))$ gives information, at time $T$, from a detector fixed with respect to the variables $(r_*, \omega)$ and measuring the fluctuations of the quantum field outside the collapsing star. The detector is put in the Boulware vacuum, which corresponds to the classical concept of vacuum state for a static observer. The theorem gives the response of the detector at its own infinite proper time ($T = +\infty$), which corresponds to the last moments of the gravitational collapse. On the one hand, the term $\omega_{vac}(\Psi^*_{\Lambda,\rightarrow}(\Omega^-_{\Lambda,\rightarrow}\Phi_1)\Psi_{\Lambda,\rightarrow}(\Omega^-_{\Lambda,\rightarrow}\Phi_2))$ shows that the detector measures merely a vacuum coming from past infinity and falling into the black hole. On the other hand, $\omega^{\delta,\sigma}_{Haw}(\Psi^*_\leftarrow(\Omega^-_\leftarrow\Phi_1)\Psi_\leftarrow(\Omega^-_\leftarrow\Phi_2))$ corresponds to the emergence of a thermal state at temperature $T_{Haw}$ coming from the vicinity of the black hole. An observer at rest with respect to the coordinates $(r_*, \omega)$ will interpret this thermal state, as $t \to +\infty$, as a flux of fermionic and anti-fermionic particles leaving the future black hole. The result is independent of the history of the collapse and of the boundary condition on the star surface. Indeed, we can easily prove the same theorem with the more general MIT Bag boundary condition (see [4]):
\[ \mathcal{B} := i \sum_{(l,n) \in I} e^{i\nu_{l,n}\gamma^5} P_{ln}, \]
where $P_{ln}$ is the orthogonal $L^2(S^2_\omega)$-projector on $\mathrm{Vect}(Y_{ln})$ (see (68)) and $\nu_{l,n}$ is a sequence which satisfies the same conditions as $\nu$ in the third section on the MIT Bag boundary condition. Moreover, for a Lebesgue measurable subset $B$ of $\mathbb{R}_{r_*} \times S^2_\omega$ with $0 < |B| < +\infty$, Lemma A.2 in [4] gives respectively the expressions of the density of particles $D^+_B(\omega^{\delta,\sigma}_{KMS})$, of the density of antiparticles $D^-_B(\omega^{\delta,\sigma}_{KMS})$ and of the charge density $\rho_{Haw}$ for the gas of fermions created in the vicinity of the black-hole horizon in the subset $B$:
\[ D^+_B(\omega^{\delta,\sigma}_{KMS}) := |B|^{-1} \sum_{j \in \mathbb{N}} \omega^{\delta,\sigma}_{KMS}\left( a^*(P^+_\leftarrow\Phi_j)\, a(P^+_\leftarrow\Phi_j) \right) = \frac{1}{\pi\sigma} \ln(1 + e^{\sigma\delta}), \tag{59} \]
\[ D^-_B(\omega^{\delta,\sigma}_{KMS}) := |B|^{-1} \sum_{j \in \mathbb{N}} \omega^{\delta,\sigma}_{KMS}\left( a^*(P^-_\leftarrow\Phi_j)\, a(P^-_\leftarrow\Phi_j) \right) = \frac{1}{\pi\sigma} \ln(1 + e^{-\sigma\delta}), \tag{60} \]
\[ P^+_\leftarrow := 1_{]-\infty,0]}(D_\leftarrow), \qquad P^-_\leftarrow := 1_{[0,+\infty[}(D_\leftarrow), \tag{61} \]
\[ \rho_{Haw} := q \left( D^+_B(\omega^{\delta,\sigma}_{KMS}) - D^-_B(\omega^{\delta,\sigma}_{KMS}) \right) = \frac{1}{\pi} q\delta = \frac{q^2 Q}{\pi r_0}, \tag{62} \]
where $(\Phi_j)_{j \in \mathbb{N}}$ is an orthonormal basis of $\{ S_\leftarrow\Omega^-_\leftarrow\Phi \in \mathbf{L}^2_{BH} : (r_*, \omega) \notin B \Rightarrow S_\leftarrow\Omega^-_\leftarrow\Phi(r_*, \omega) = 0 \}$. Since $\rho_{Haw}$ and $Q$ have the same sign, we conclude that the black-hole preferentially emits charged particles with the same sign as its own charge.

We emphasize that the interpretation of Theorem 5.1 is valid only in the semiclassical regime. Indeed, we suppose that the black hole under consideration has a sufficiently large mass, so that the classical theory of General Relativity can be used to model the gravitational field and the back reaction of the quantum fields can be neglected. Thanks to Theorem 5.1, we can conjecture that the black hole loses its charge and its mass. Therefore, if we want to study this evaporation, we cannot neglect the back reaction of the Hawking effect. But for that, it would be necessary to study a non-linear problem of very great complexity.
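The value $q\delta/\pi$ of the charge density in (62) follows from (59) and (60) through the elementary identity $\ln(1 + e^{x}) - \ln(1 + e^{-x}) = x$ with $x = \sigma\delta$, taking $\rho_{Haw}$ as $q$ times the difference between the particle and antiparticle densities (the combination that reproduces the stated value). The sketch below, with hypothetical helper names, checks this numerically.

```python
import math

def log1pexp(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def particle_density(sigma, delta):
    """Right-hand side of (59)."""
    return log1pexp(sigma * delta) / (math.pi * sigma)

def antiparticle_density(sigma, delta):
    """Right-hand side of (60)."""
    return log1pexp(-sigma * delta) / (math.pi * sigma)

def charge_density(q, sigma, delta):
    """q times (particle density minus antiparticle density)."""
    return q * (particle_density(sigma, delta) - antiparticle_density(sigma, delta))
```

For any `sigma > 0`, `charge_density(q, sigma, delta)` coincides with `q * delta / math.pi` up to rounding, so the charge density does not depend on the temperature $\sigma^{-1}$.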
6. Proofs of the Main Theorems

This section is organized as follows: in the first subpart, thanks to the spherical symmetry of the geometrical framework, we reduce (17) and (20) to a family of one dimensional problems. This reduction will be useful in the next subparts. In the second part, we prove Theorem 4.1 on the scattering theory in the eternal charged black-hole. In the third part, we prove Theorem 5.2 on the sharp estimate of $1_{[0,+\infty[}(\mathbb{D}_0) U(0, T)$.

6.1. Reduction to a one dimensional problem. To reduce problems (17) and (20), we use the spin-weighted harmonics $Y^l_{\pm\frac{1}{2},n}$ (see [12, 17]). The families
\[ \left\{ Y^l_{\frac{1}{2},n};\ (l, n) \in I \right\}, \qquad \left\{ Y^l_{-\frac{1}{2},n};\ (l, n) \in I \right\}, \qquad I := \left\{ (l, n) : l - \tfrac{1}{2} \in \mathbb{N},\ l - |n| \in \mathbb{N} \right\}, \]
form a Hilbert basis of $L^2(S^2_\omega)$, and each $Y^l_{sn}$, $s = \pm 1/2$, satisfies the recurrence relations
\[ \partial_\theta Y^l_{sn}(\omega) \mp \frac{n - s\cos\theta}{\sin\theta}\, Y^l_{sn}(\omega) = \begin{cases} -i\sqrt{(l \pm s)(l \mp s + 1)}\; Y^l_{s \mp 1, n}(\omega), & \pm l > -s, \\ 0, & l = \mp s, \end{cases} \tag{63} \]
\[ \partial_\varphi Y^l_{sn}(\omega) = -in\, Y^l_{sn}(\omega). \tag{64} \]
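The index set $I$ consists of half-integer pairs. As a small illustration, and assuming that $\mathbb{N}$ contains $0$ (which is needed for $l = 1/2$ to be admissible), the following sketch enumerates all $(l, n) \in I$ up to a cut-off `l_max`, which is a parameter of ours introduced only for the example.

```python
from fractions import Fraction

def index_set(l_max):
    """All (l, n) with l - 1/2 in N and l - |n| in N, restricted to l <= l_max."""
    pairs = []
    l = Fraction(1, 2)
    while l <= l_max:
        n = -l
        while n <= l:          # n runs over -l, -l + 1, ..., l
            pairs.append((l, n))
            n += 1
        l += 1
    return pairs
```

For each admissible $l$ there are $2l + 1$ values of $n$. For instance, `index_set(Fraction(3, 2))` contains two pairs with $l = 1/2$ and four pairs with $l = 3/2$.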
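We introduce the Hilbert spaces used to treat the one dimensional problems, respectively outside the charged collapsing star and outside the eternal black hole:
\[ 0 \le t, \quad L^2_t := L^2(]z(t), +\infty[_{r_*}, dr_*)^4, \qquad L^2_{\mathbb{R}} := L^2(\mathbb{R}_{r_*}, dr_*)^4, \qquad L^2_{BH} := L^2(\mathbb{R}_{r_*}, r^2 F^{1/2}(r)\,dr_*)^4. \tag{65} \]
The norms of $L^2_t$ and $L^2_{\mathbb{R}}$ are respectively denoted by $\|\cdot\|_t$ and $\|\cdot\|$. Moreover, for $\Phi \in L^2(B, dr_*)^4$, $B \subset \mathbb{R}$,
\[ \|\Phi\|_{L^2(B, dr_*)^4} = \|[\Phi]_L\|, \qquad [\Phi]_L(r_*) := \begin{cases} \Phi(r_*), & r_* \in B, \\ 0, & r_* \in \mathbb{R} \setminus B. \end{cases} \]
In the same way, we define
\[ 0 \le t, \quad H^1_t := \left\{ \Phi \in L^2_t,\ \partial_{r_*}\Phi \in L^2_t \right\}, \qquad H^1_{\mathbb{R}} := \left\{ \Phi \in L^2_{\mathbb{R}},\ \partial_{r_*}\Phi \in L^2_{\mathbb{R}} \right\}, \]
and moreover, for $\Phi \in H^1_t$, we have $[\Phi]_H \in H^1_{\mathbb{R}}$,
\[ [\Phi]_H(r_*) := \begin{cases} \Phi(r_*), & r_* \in ]z(t), +\infty[_{r_*}, \\ \Phi(2z(t) - r_*), & r_* \in \mathbb{R} \setminus ]z(t), +\infty[_{r_*}. \end{cases} \]
Hence, for $0 \le t \le +\infty$, and putting
\[ P_r : \Phi \mapsto r^{-1} F^{-1/4}\Phi, \tag{66} \]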
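any $\Psi \in \mathbf{L}^2_t$ or $\mathbf{L}^2_{BH}$ can be written in the following way:
\[ \Psi(r_*, \omega) = \sum_{(l,n) \in I} \Psi_{ln}(r_*) \otimes_4 Y_{ln}(\omega), \tag{67} \]
where $\Psi_{ln} \in P_r L^2_t$ or $P_r L^2_{\mathbb{R}}$, and
\[ \forall u, v \in \mathbb{C}^4, \quad v \otimes_4 u := (u_1 v_1, u_2 v_2, u_3 v_3, u_4 v_4), \qquad Y_{ln} := \left( Y^l_{-\frac{1}{2},n},\ Y^l_{\frac{1}{2},n},\ Y^l_{-\frac{1}{2},n},\ Y^l_{\frac{1}{2},n} \right). \tag{68} \]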
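We define
\[ R^\nu_{ln} : \Psi \in \mathbf{L}^2_t \mapsto e^{i\frac{\nu}{2}\gamma^5} P_r^{-1} \Psi_{ln} \in L^2_t, \tag{69} \]
\[ R^{BH}_{ln} : \Psi \in \mathbf{L}^2_{BH} \mapsto \Psi_{ln} \in P_r L^2_{\mathbb{R}}, \tag{70} \]
\[ E^\nu_{ln} : \Psi_{ln} \in L^2_t \mapsto e^{-i\frac{\nu}{2}\gamma^5} P_r \Psi_{ln} \otimes_4 Y_{ln} \in \mathbf{L}^2_t, \tag{71} \]
\[ E^{BH}_{ln} : \Psi_{ln} \in P_r L^2_{\mathbb{R}} \mapsto \Psi_{ln} \otimes_4 Y_{ln} \in \mathbf{L}^2_{BH}, \tag{72} \]
to express $\mathbf{L}^2_t$ and $\mathbf{L}^2_{BH}$ as direct sums:
\[ \mathbf{L}^2_t = \bigoplus_{(l,n) \in I} E^\nu_{ln} L^2_t, \qquad \mathbf{L}^2_{BH} = \bigoplus_{(l,n) \in I} E^{BH}_{ln} L^2_{BH} = \bigoplus_{(l,n) \in I} E^\nu_{ln} L^2_{\mathbb{R}}. \tag{73} \]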
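With (63), (64) and $s = \pm 1/2$, we obtain
\[ \mathbb{D}_t = \bigoplus_{(l,n) \in I} E^\nu_{ln}\, D_{V_{l,\nu},t}\, R^\nu_{ln} - \frac{qQ}{r_0}, \tag{74} \]
\[ D_{V_{l,\nu},t} := \Gamma^1 \partial_{r_*} + V_{l,\nu}, \tag{75} \]
\[ V_{l,\nu} = qQ \left( \frac{1}{r_0} - \frac{1}{r} \right) - \sqrt{F(r)} \left( m A_\nu + \frac{i}{r} \Gamma^2 (l + 1/2) \right), \tag{76} \]
\[ A_\nu := \begin{pmatrix} 0 & a_\nu \\ \bar{a}_\nu & 0 \end{pmatrix}, \qquad a_\nu := \mathrm{diag}(ie^{i\nu}, ie^{i\nu}), \qquad Z(t) = \sqrt{\frac{1 + \dot{z}(t)}{1 - \dot{z}(t)}}, \]
\[ D(D_{V_{l,\nu},t}) = \left\{ \Phi \in L^2_t;\ D_{V_{l,\nu},t}\Phi \in L^2_t,\ \Phi_2(z(t)) = Z(t)\Phi_4(z(t)),\ \Phi_1(z(t)) = -Z(t)\Phi_3(z(t)) \right\} \tag{77} \]
and
\[ \mathbb{D}_{BH} = \bigoplus_{(l,n) \in I} E^{BH}_{ln}\, D_{BH}\, R^{BH}_{ln}, \qquad D_{BH} = \Gamma^1 \left( \partial_{r_*} + \frac{F(r)}{r} + \frac{F'(r)}{4} \right) + V_{BH}, \tag{78} \]
\[ V_{BH} = -\frac{qQ}{r} - \sqrt{F(r)} \left( \frac{i}{r} \Gamma^2 (l + 1/2) - \Gamma^4 \right), \tag{79} \]
\[ D(D_{BH}) = \left\{ \Phi \in L^2_{BH};\ D_{BH}\Phi \in L^2_{BH} \right\}. \tag{80} \]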
Therefore, $\Psi$ is a solution of problem (17), (18) and (19) if and only if, for all $(l, n) \in I$,
\[ \Phi(t, r_*) := e^{itqQr_0^{-1}} R^\nu_{ln}\Psi(t, r_*) \]
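is a solution of
\[ \partial_t \Phi = iD_{V_{l,\nu},t}\Phi, \qquad t \in \mathbb{R}, \quad r_* > z(t), \tag{81} \]
\[ \Phi_2(t, z(t)) = Z(t)\Phi_4(t, z(t)), \qquad \Phi_3(t, z(t)) = -Z(t)\Phi_1(t, z(t)), \tag{82} \]
\[ \Phi(t = s, .) = \Phi_s(.) := R^\nu_{ln}\Psi_s(.) \in L^2_s. \tag{83} \]
In the same way, $\Psi$ is a solution of problem (20) and (21) if and only if, for all $(l, n) \in I$, $\Phi(t, r_*) := R^{BH}_{ln}\Psi(t, r_*)$ is a solution of
\[ \partial_t \Phi = iD_{BH}\Phi, \tag{84} \]
\[ \Phi(t = 0, .) = \Phi_{BH} := R^{BH}_{ln}\Psi_{BH} \in L^2_{BH}. \tag{85} \]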
In [4], Proposition VI.2 gives a solution $\Phi(t)$ of problem (81), (82) and (83), expressed with the propagator $U_{V_{l,\nu}}(t, s)$:

Proposition 6.1. If $\Phi_s \in D(D_{V_{l,\nu},s})$, then there exists a unique solution $[\Phi(.)]_H = [U_{V_{l,\nu}}(., s)\Phi_s]_H \in C^1(\mathbb{R}_t, L^2_{\mathbb{R}}) \cap C^0(\mathbb{R}_t, H^1_{\mathbb{R}})$ of (81), (82) and (83), with $\Phi(t) \in D(D_{V_{l,\nu},t})$. Moreover,
\[ \|\Phi(t)\|_t = \|\Phi_s\|_s \tag{86} \]
and $U_{V_{l,\nu}}(t, s)$ can be extended to an isometric, strongly continuous propagator from $L^2_s$ onto $L^2_t$, and, for $R > z(s)$,
\[ \left( r_* > R \Rightarrow \Phi_s(r_*) = 0 \right) \Rightarrow \left( r_* > R + |t - s| \Rightarrow [U_{V_{l,\nu}}(t, s)\Phi_s](r_*) = 0 \right). \]
Thanks to the notations (69) and (71), we give the important relation connecting the propagator $U_{V_{l,\nu}}(t, s)$ with $U(t, s)$ defined in Proposition 3.1:
\[ U(t, s) = e^{i(s - t)\frac{qQ}{r_0}} \bigoplus_{(l,n) \in I} E^\nu_{ln}\, U_{V_{l,\nu}}(t, s)\, R^\nu_{ln} : \quad \mathbf{L}^2_s = \bigoplus_{(l,n) \in I} E^\nu_{ln} L^2_s \ \longrightarrow\ \mathbf{L}^2_t = \bigoplus_{(l,n) \in I} E^\nu_{ln} L^2_t. \tag{87} \]
Subsequently, to simplify the notations, we drop the subscripts $ln$ and $\nu$ in the above one dimensional problem. Given an interval $B := (a, b) \subset \mathbb{R}_{r_*}$ and $V \in L^\infty(\mathbb{R}_{r_*})$, we define on $L^2(B)^4$ the self-adjoint operator $D_{V,B}$ with the dense domain $D(D_{V,B})$ such that
\[ D_{V,B} = \Gamma^1 \partial_{r_*} + V, \tag{88} \]
\[ D(D_{V,B}) = \left\{ \Phi \in L^2(B)^4;\ D_{V,B}\Phi \in L^2(B)^4,\ r_* \in \partial B \Rightarrow n\gamma^1\Phi(r_*) = i\Phi(r_*) \right\}, \tag{89} \]
where $n$ is the outgoing normal of $B$ and $\Gamma^1$ is given by (23). Hence, by the Kato-Rellich theorem and the spectral theorem, the problem
\[ \partial_t \Phi = iD_{V,B}\Phi, \qquad \Phi(t = 0) = \Phi_0, \tag{90} \]
is solved with the help of the propagator UV ,B (t), following the proposition: Proposition 6.2. Given 0 ∈ D(DV ,B ), then there exists a unique solution (.) = UV ,B (.)0 ∈ C 0 (Rt , D(DV ,B )) ∩ C 1 (Rt , L2 (B)4 ) and
(t) = 0 . Moreover, UV ,B (t) can be extended, by density and continuity, strongly in the unitary group on L2 (B)4 . In some useful particular cases, we have an explicit formula: Lemma 6.1. Given 0 = (01 , 02 , 03 , 04 ) ∈ L2s for t ≥ s, then (t, r∗ ) = U0 (t, s) 0 (r∗ ) is given by r∗ > z(t) : 2 (t, r∗ ) = 02 (r∗ − t + s), 3 (t, r∗ ) = 03 (r∗ − t + s), r∗ > z(t) + s − t : 1 (t, r∗ ) = 01 (r∗ + t − s), 4 (t, r∗ ) = 03 (r∗ + t − s), z(t) < r∗ < z(t) + s − t : 1 (t, r∗ ) = −Z(τ (r∗ + t))03 (r∗ + t + s − 2τ (r∗ + t)), z(t) < r∗ < z(t) + s − t : 4 (t, r∗ ) = Z(τ (r∗ + t))02 (r∗ + t + s − 2τ (r∗ + t)), where τ is defined by (7). Given 0 ∈ L2 (B)4 , with B =] − ∞, a] or [a, +∞[, a ∈ R ∪ {−∞, +∞} and δ ∈ R, then, if B =] − ∞, a], (t, r∗ ) = Uδ,B (t)0 (r∗ ) is given by (t, r∗ ) iδt e = eiδt iδt e
0 3 (2a − r∗ − t), 02 (r∗ − t), 03 (r∗ − t), −02 (2a − r∗ − t) , r∗ + t ≥ a, t 0 (r + t), 0 (r − t), 0 (r − t), 0 (r + t) , r + t ≤ a, r − t ≤ a, ∗ 2 ∗ 3 ∗ 4 ∗ 1 ∗ ∗ t 0 (r + t), −0 (2a − r + t), 0 (2a − r + t), 0 (r + t) , r − t ≥ a, ∗ ∗ ∗ ∗ ∗ 1 4 1 4 t
and, if B = [a, +∞[, by (t, ∗) riδt e = eiδt iδt e
0 1 (r∗ + t), 04 (2a + t − r∗ ), −01 (2a + t − r∗ ), 04 (r∗ + t) , r∗ − t ≤ a, t 0 (r + t), 0 (r − t), 0 (r − t), 0 (r + t) , r − t ≥ a, r + t ≥ a, ∗ 2 ∗ 3 ∗ 4 ∗ 1 ∗ ∗ t −0 (2a − r − t), 0 (r − t), 0 (r − t), 0 (2a − r − t) , r + t ≤ a. ∗ ∗ ∗ 3 2 ∗ 3 ∗ 2 t
Proof. The result follows from the study of the characteristics of problems (81)–(82) and (84).
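Away from any boundary and with zero potential, the characteristics mentioned in the proof are straight lines. Since $i\Gamma^1 = \mathrm{diag}(1, -1, -1, 1)$ by (23), the equation $\partial_t\Phi = i\Gamma^1\partial_{r_*}\Phi$ transports the components 1 and 4 along $r_* + t$ and the components 2 and 3 along $r_* - t$. The following minimal sketch illustrates only this interior regime (no reflection on the star or at an endpoint of $B$, and $\delta = 0$); the Gaussian initial data are a hypothetical choice.

```python
import math

# d/dt Phi_k = SIGNS[k] * d/dr Phi_k, since i * Gamma^1 = diag(1, -1, -1, 1)
SIGNS = (1, -1, -1, 1)

def initial_data(r):
    """Hypothetical smooth initial spinor: four shifted Gaussian bumps."""
    return [math.exp(-(r - c) ** 2) for c in (0.0, 1.0, -1.0, 2.0)]

def free_transport(t, r):
    """Solution of d/dt Phi = i Gamma^1 d/dr Phi on the whole line, evaluated at (t, r)."""
    return [initial_data(r + s * t)[k] for k, s in enumerate(SIGNS)]
```

A finite-difference check confirms that each component of `free_transport` satisfies its transport equation, and that `free_transport(0, r)` returns the initial data.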
6.2. Proof of Theorem 4.1 on the scattering theory. Before proving Theorem 4.1, we state the following proposition concerning the spectral properties of $D_{BH}$ and $\mathbb{D}_{BH}$.

Proposition 6.3. If $\Lambda \ge 0$, then
\[ \sigma(D_{BH}) = \sigma_{ac}(D_{BH}) = \mathbb{R} \tag{91} \]
and
\[ \sigma(\mathbb{D}_{BH}) = \sigma_{ac}(\mathbb{D}_{BH}) = \mathbb{R}, \tag{92} \]
with $D_{BH}$ and $\mathbb{D}_{BH}$ given by (78)-(80) and (22)-(25).

Proof. When $\Lambda = 0$, the properties (91) and (92) have been proved in [19]. If $\Lambda > 0$, the proof remains essentially similar. Our demonstration in [19] is based on the Mourre theory and, in that work, when $\Lambda = 0$, we wrote
\[ -D_{BH} = -\Gamma^1 \partial_{r_*} + V_q + V_l + V_m, \qquad V_q := \frac{qQ}{r}, \qquad V_l := -(l + 1/2) \frac{i}{r} \Gamma^2 \sqrt{F(r)}, \qquad V_m := -\sqrt{F(r)}\, \Gamma^4 = m\sqrt{F(r)}\, \gamma^0. \]
The main difficulty of this proof is to obtain the Mourre inequality. To do this, we must choose an appropriate conjugate operator $A$. But we remark that
\[ \lim_{r_* \to -\infty} V_q = \frac{qQ}{r_0}. \]
For the positive energies, when $qQ < 0$ (respectively for the negative energies and $qQ > 0$), we easily obtain this inequality if $A$ is the classical generator of dilations. But, when $qQ > 0$ (respectively $qQ < 0$), this choice of conjugate operator does not allow us to obtain the result. Indeed, if we put $h = -D_{BH}$ and consider the case $qQ > 0$, then we obtain the following inequality (in the sense of quadratic forms on $H^1_{\mathbb{R}}$):
\[ \chi(h)\, i[h, A]\, \chi(h) \ge (\varepsilon - qQr^{-1})\chi^2(h) + k, \qquad \varepsilon > 0, \qquad A := -\frac{i}{2}\left( r_* \partial_{r_*} + \partial_{r_*} r_* \right), \]
where $k$ is a compact operator on $L^2_{\mathbb{R}}$ and $\chi \in C^\infty_0(\mathbb{R})$ is such that $\mathrm{supp}\,\chi \subset \mathbb{R}^*_+ - \{m\}$. Then, to overcome the problem, we put:
\[ A := -\frac{i}{2}\left( r_* \partial_{r_*} + \partial_{r_*} r_* \right) + \frac{qQ}{r_0} \gamma^0\gamma^1 r_* J_-(r_*), \qquad J_- \in C^\infty(\mathbb{R}_{r_*}), \qquad J_-(r_*) = \begin{cases} 1, & r_* \le -3, \\ 0, & r_* \ge -2. \end{cases} \]
With this choice, the Mourre assumptions are satisfied and, since $qQr_0^{-1} - qQr^{-1} \ge 0$, we have:
\[ \chi(h)\, i[h, A]\, \chi(h) \ge (\varepsilon + qQr_0^{-1}J_- - qQr^{-1}J_-)\chi^2(h) + k \ge \varepsilon\chi^2(h) + k', \]
with $\varepsilon > 0$ and $k'$ a compact operator on $L^2_{\mathbb{R}}$. When $\Lambda > 0$, the argument extends as follows. We put
\[ h_\Lambda := h - \frac{qQ}{r_+}. \]
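Then, for the difficult cases, we define
\[ A := -\frac{i}{2}\left( r_* \partial_{r_*} + \partial_{r_*} r_* \right) + qQ\left( \frac{1}{r_0} - \frac{1}{r_+} \right) \gamma^0\gamma^1 r_* J_-(r_*). \]
Therefore, for $qQ > 0$ and $\mathrm{supp}\,\chi \subset \mathbb{R}^*_+ - \{m\}$, we obtain:
\[ \chi(h_\Lambda)\, i[h_\Lambda, A]\, \chi(h_\Lambda) \ge \varepsilon\chi^2(h_\Lambda) + qQJ_-(r_0^{-1} - r_+^{-1})\chi^2(h_\Lambda) - qQJ_-(r^{-1} - r_+^{-1})\chi^2(h_\Lambda) + k \ge \varepsilon\chi^2(h_\Lambda) + k', \]
with $\varepsilon > 0$ and $k'$ a compact operator on $L^2_{\mathbb{R}}$. To finish, as in [4], we check that $D_{BH}$ has no eigenvalues when $\Lambda \ge 0$.

Proof of Theorem 4.1. The case $\Lambda = 0$ was proved in [19], and we consider only the case $\Lambda > 0$. Given two self-adjoint operators $A$ on $\mathcal{H}_A$ and $B$ on $\mathcal{H}_B$, we formally define the wave operators
\[ W^\pm(A, B, J) = s\text{-}\lim_{t \to \pm\infty} e^{-itA} J e^{itB} P_{ac}(B), \tag{93} \]
where $P_{ac}(B)$ is the projector on the absolutely continuous subspace of $B$ and $J$ is the bounded identifying operator between $\mathcal{H}_B$ and $\mathcal{H}_A$. When $\mathcal{H}_A = \mathcal{H}_B$ and $J = Id$, we denote $W^\pm(A, B, Id)$ simply by $W^\pm(A, B)$. First, we separate the problems at the horizon and at infinity. To do this, we use the self-adjoint operator $\mathbb{D}^-_{BH} \oplus \mathbb{D}^+_{BH}$ on $\mathbf{L}^2_{BH}$, such that:
\[ \mathbb{D}^-_{BH},\ \mathbb{D}^+_{BH} := -\frac{qQ}{r} + \Gamma^1 \left( \partial_{r_*} + \frac{F(r)}{r} + \frac{F'(r)}{4} \right) + \sqrt{F(r)} \left( \frac{\Gamma^2}{r} \left( \partial_\theta + \frac{1}{2}\cot\theta \right) + \frac{\Gamma^3}{r \sin\theta} \partial_\varphi + \Gamma^4 \right), \]
\[ D(\mathbb{D}^-_{BH}) = \left\{ \Psi \in L^2(]-\infty, 1]_{r_*} \times S^2_\omega, r^2 F^{1/2}(r)\,dr_* d\omega)^4;\ \mathbb{D}^-_{BH}\Psi \in L^2(]-\infty, 1]_{r_*} \times S^2_\omega, r^2 F^{1/2}(r)\,dr_* d\omega)^4,\ \gamma^1\Psi(1, .) = i\Psi(1, .) \right\}, \]
\[ D(\mathbb{D}^+_{BH}) = \left\{ \Psi \in L^2([1, +\infty[_{r_*} \times S^2_\omega, r^2 F^{1/2}(r)\,dr_* d\omega)^4;\ \mathbb{D}^+_{BH}\Psi \in L^2([1, +\infty[_{r_*} \times S^2_\omega, r^2 F^{1/2}(r)\,dr_* d\omega)^4,\ -\gamma^1\Psi(1, .) = i\Psi(1, .) \right\}. \]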
Thanks to formula (78), we reduce $\mathbb{D}_{BH}$ on $\mathbf{L}^2_{BH}$ to $D_{BH}$ on $L^2_{BH}$. In the same way, via the operators (70) and (72), we can also reduce $\mathbb{D}^-_{BH} \oplus \mathbb{D}^+_{BH}$ on $\mathbf{L}^2_{BH}$ to the self-adjoint operator $D^-_{BH} \oplus D^+_{BH}$ with the dense domain $D(D^-_{BH} \oplus D^+_{BH}) = P_r\left[ D(D_{V_{BH},]-\infty,1]}) \oplus D(D_{V_{BH},[1,+\infty[}) \right]$, using definitions (79), (88) and (89). Since
\[ (D_{BH} \pm i)^{-1} - (D^-_{BH} \oplus D^+_{BH} \pm i)^{-1} \]
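is of finite rank and thus trace class on $L^2_{BH}$, the Birman-Kuroda theorem (see [23]) ensures that
\[ W^\pm\left( D_{BH}, D^-_{BH} \oplus D^+_{BH} \right) \]
exists on $L^2_{BH}$ and
\[ \mathrm{Ran}\, W^\pm\left( D_{BH}, D^-_{BH} \oplus D^+_{BH} \right) = P_{ac}(D_{BH})\, L^2_{BH}. \]
Therefore, the following wave operator
\[ W^\pm\left( \mathbb{D}_{BH}, \mathbb{D}^-_{BH} \oplus \mathbb{D}^+_{BH} \right) = \bigoplus_{(l,n) \in I} E^{BH}_{ln}\, W^\pm\left( D_{BH}, D^-_{BH} \oplus D^+_{BH} \right) R^{BH}_{ln} \tag{94} \]
exists on $\mathbf{L}^2_{BH}$, and
\[ \mathrm{Ran}\, W^\pm\left( \mathbb{D}_{BH}, \mathbb{D}^-_{BH} \oplus \mathbb{D}^+_{BH} \right) = P_{ac}(\mathbb{D}_{BH})\, \mathbf{L}^2_{BH}. \tag{95} \]
Now, as
\[ |r - r_0| \le O\left( e^{2\kappa_0 r_*} \right), \quad r_* \to -\infty, \qquad |r - r_+| \le O\left( e^{2\kappa_+ r_*} \right), \quad r_* \to +\infty, \]
we compare respectively the self-adjoint operators $\mathbb{D}^-_{BH}$ and $D^-_\leftarrow$ on $L^2(]-\infty, 1]_{r_*} \times S^2_\omega)^4$, with the dense domain $D(D^-_\leftarrow)$ given by
\[ D^-_\leftarrow := \Gamma^1 \partial_{r_*} - \frac{qQ}{r_0}, \]
\[ D(D^-_\leftarrow) = \left\{ \Psi \in L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4;\ D^-_\leftarrow\Psi \in L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4,\ \gamma^1\Psi(1, .) = i\Psi(1, .) \right\}, \]
and the self-adjoint operators $\mathbb{D}^+_{BH}$ and $D^+_{\Lambda,\rightarrow}$ on $L^2([1, +\infty[_{r_*} \times S^2_\omega)^4$, with the dense domain $D(D^+_{\Lambda,\rightarrow})$ given by
\[ D^+_{\Lambda,\rightarrow} := \Gamma^1 \partial_{r_*} - \frac{qQ}{r_+}, \]
\[ D(D^+_{\Lambda,\rightarrow}) = \left\{ \Psi \in L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4;\ D^+_{\Lambda,\rightarrow}\Psi \in L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4,\ -\gamma^1\Psi(1, .) = i\Psi(1, .) \right\}. \]
We introduce $J_r$ such that
\[ J_r : \Psi(r_*, \omega) \mapsto J_r(\Psi)(r_*, \omega) = r^{-1} F^{-1/4}(r)\, \Psi(r_*, \omega), \tag{96} \]
and we apply respectively Lemma 4.11 in [19] to $W^\pm(J_r^{-1}\mathbb{D}^+_{BH}J_r, D^+_{\Lambda,\rightarrow})$ on $L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4$ and to $W^\pm(J_r^{-1}\mathbb{D}^-_{BH}J_r, D^-_\leftarrow)$ on $L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4$. Hence
\[ W^\pm(\mathbb{D}^-_{BH}, D^-_\leftarrow, J_r) \qquad \left( \text{resp. } W^\pm(\mathbb{D}^+_{BH}, D^+_{\Lambda,\rightarrow}, J_r) \right) \tag{97} \]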
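exists on $L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4$ (resp. $L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4$), and
\[ \mathrm{Ran}\, W^\pm(\mathbb{D}^-_{BH}, D^-_\leftarrow, J_r) = P_{ac}(\mathbb{D}^-_{BH})\, L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4 \tag{98} \]
\[ \left( \text{resp. } \mathrm{Ran}\, W^\pm(\mathbb{D}^+_{BH}, D^+_{\Lambda,\rightarrow}, J_r) = P_{ac}(\mathbb{D}^+_{BH})\, L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4 \right). \tag{99} \]
We introduce the operators $J^*_-$ and $J^*_+$ respectively as the adjoints of
\[ J_- : \Psi \mapsto J_-\Psi = \begin{cases} \chi_-\Psi, & r_* \le 1, \\ 0, & r_* \ge 1, \end{cases} \qquad \chi_- \in C^\infty(\mathbb{R}_{r_*}), \quad \exists\, a, b,\ a < b < 1, \quad \chi_-(r_*) = \begin{cases} 1, & r_* < a, \\ 0, & r_* > b, \end{cases} \tag{100} \]
and
\[ J_+ : \Psi \mapsto J_+\Psi = \begin{cases} \chi_+\Psi, & r_* \ge 1, \\ 0, & r_* \le 1, \end{cases} \qquad \chi_+ \in C^\infty(\mathbb{R}_{r_*}), \quad \exists\, a, b,\ 1 < a < b, \quad \chi_+(r_*) = \begin{cases} 1, & r_* > b, \\ 0, & r_* < a. \end{cases} \tag{101} \]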
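Since $D^-_\leftarrow$ on $L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4$ and $D_\leftarrow$ on $L^{2\pm}_\leftarrow$ have spherical symmetry, we use Lemma 6.1, which gives the explicit calculation of the unitary groups generated by these self-adjoint operators. Hence, for all $\Phi_0 \in C^\infty_0(]-\infty, 1]_{r_*} \times S^2_\omega)^4$, and since $\partial_{r_*}\chi_-$ is compactly supported and $\mathrm{supp}(\chi_-^2 - 1) \subset [a, +\infty[$:
\[ \left\| \left( D_\leftarrow J_- - J_- D^-_\leftarrow \right) e^{itD^-_\leftarrow}\Phi_0 \right\|_{L^2_\leftarrow} = \left\| \partial_{r_*}\chi_-\, e^{itD^-_\leftarrow}\Phi_0 \right\|_{L^2(]-\infty,1]_{r_*} \times S^2_\omega,\, dr_* d\omega)^4} \in L^1(\mathbb{R}_t), \]
\[ \left\| \left( J^*_- J_- - 1 \right) e^{itD^-_\leftarrow}\Phi_0 \right\|_{L^2(]-\infty,1]_{r_*} \times S^2_\omega,\, dr_* d\omega)^4} \to 0, \qquad t \to \pm\infty. \]
Therefore, by a standard density argument, the wave operator $W^\pm(D_\leftarrow, D^-_\leftarrow, J_-)$ exists and is an isometry on $L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4$. Moreover, if we take $\Phi^\pm_0 \in L^{2\pm}_\leftarrow \cap C^\infty_0(\mathbb{R}_{r_*} \times S^2_\omega)^4$ such that, for a real $R > 0$, $\mathrm{supp}\,\Phi^\pm_0 \subset [-R + 1, R + 1]$, we obtain for $\pm T > R$:
\[ J^*_- e^{iTD_\leftarrow}\Phi^\pm_0 = e^{-itD^-_\leftarrow} J^*_- e^{i(T + t)D_\leftarrow}\Phi^\pm_0 \quad \forall t \in \mathbb{R}, \qquad \left\| \left( J_- J^*_- - 1 \right) e^{itD_\leftarrow}\Phi^\pm_0 \right\|_{L^2_\leftarrow} \to 0, \quad t \to \pm\infty, \]
since $\mathrm{supp}(\chi_-^2 - 1) \subset [a, +\infty[$. Therefore, by density, the following wave operator
\[ W^\pm(D^-_\leftarrow, D_\leftarrow, J^*_-) \tag{102} \]
exists on $L^{2\pm}_\leftarrow$, and
\[ \mathrm{Ran}\, W^\pm(D^-_\leftarrow, D_\leftarrow, J^*_-) = P_{ac}(D^-_\leftarrow)\, L^2(]-\infty, 1]_{r_*} \times S^2_\omega, dr_* d\omega)^4. \tag{103} \]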
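Using again Lemma 6.1, which gives the explicit calculation of the unitary group generated by the self-adjoint operator $D^+_{\Lambda,\rightarrow}$ on $L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4$, we deduce in the same way that the wave operator
\[ W^\pm(D^+_{\Lambda,\rightarrow}, D_{\Lambda,\rightarrow}, J^*_+) \tag{104} \]
exists on $L^{2\mp}_{\Lambda,\rightarrow}$, and
\[ \mathrm{Ran}\, W^\pm(D^+_{\Lambda,\rightarrow}, D_{\Lambda,\rightarrow}, J^*_+) = P_{ac}(D^+_{\Lambda,\rightarrow})\, L^2([1, +\infty[_{r_*} \times S^2_\omega, dr_* d\omega)^4. \tag{105} \]
We define the operators:
\[ J^-_\leftarrow : L^2(]-\infty, 1]_{r_*} \times S^2_\omega, r^2 F^{1/2}(r)\,dr_* d\omega)^4 \to \mathbf{L}^2_{BH}; \qquad \Psi \mapsto J^-_\leftarrow\Psi = \begin{cases} \Psi, & r_* \le 1, \\ 0, & r_* \ge 1, \end{cases} \tag{106} \]
\[ J^+_\rightarrow : L^2([1, +\infty[_{r_*} \times S^2_\omega, r^2 F^{1/2}(r)\,dr_* d\omega)^4 \to \mathbf{L}^2_{BH}; \qquad \Psi \mapsto J^+_\rightarrow\Psi = \begin{cases} \Psi, & r_* \ge 1, \\ 0, & r_* \le 1, \end{cases} \tag{107} \]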
and the chain rule applied to (94)(95), (97)(99), (102)(103), (104)(105) assures that − + Jr J−∗ ) ⊕ W ± (D BH , D,→ , J→ Jr J+∗ ) W ± (D BH , D← , J← 2± ⊕ L2∓ . By Proposition 6.3, the spectrum of D exists on L← BH is purely absolutely ,→ continuous when > 0. Hence − + Ran W ± (D BH , D← , J← Jr J−∗ ) ⊕ W ± (D BH , D,→ , J→ Jr J+∗ ) = L2BH .
Finally ± ± − + ⊕ W→ = W ± (D BH , D← , J← Jr J−∗ ) ⊕ W ± (D BH , D,→ , J→ Jr J+∗ ) W←
in
2± = L2± , because for all ± ∈ L← ,→ − Jr J−∗ eitD← ± J← − J← ≤ {χ← − χ− } eitD← ± 2 → 0, t → ±∞, L (]−∞,1]r∗ ×Sω2 , dr∗ dω)4 + Jr J+∗ eitD,→ ∓ J,→ − J→ ≤ {χ→ − χ+ } eitD,→ ∓ 2 → 0, t → ±∞. 2 4 L ([1,+∞[r∗ ×Sω , dr∗ dω)
L2BH
(108)
(109)
2± ∩ C ∞ (R × S 2 )4 = L2± ∩ C ∞ (R × S 2 )4 we have Indeed, taking ± ∈ L← r∗ r∗ ω ω 0 0 ,→ −1
eitD← ± (r∗ ) = eitqQr0 ± (r∗ ± t),
−1
eitD,→ ∓ (r∗ ) = eitqQr+ ∓ (r∗ ± t)
and, since χ← − χ− and χ→ − χ+ are compactly supported, by density we obtain the 2± = L2± . limits (108) and (109) for all ± ∈ L← ,→
6.3. Sharp estimate of 1[0,+∞[ (D 0 )U (0, T ): Proof of Theorem 5.2. We briefly describe the steps of the proof. First, we take advantage of the spherical invariance to reduce our study to a one dimensional problem. Since with (74), (75) and (87) we have
1[0,+∞[ (D 0 )U (0, T ) = e−iT δ
(l,n)∈I
Eνln 1[δ,+∞[ (DVl,ν ,0 )UVl,ν (0, T )Rνln ,
δ :=
qQ , r0
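it is sufficient to study the propagator $\mathbf{1}_{[\delta,+\infty[}(D_{V_{l,\nu},0})\,U_{V_{l,\nu}}(0,T)$. Now, to simplify the notation, we drop the subscripts $ln$ and $\nu$. We choose $J \in C^\infty(\mathbb{R}_{r_*})$ satisfying
\[
\exists\, a, b \in \mathbb{R},\ 0 < a < b < 1, \qquad
J(r_*) = \begin{cases} 1 & r_* < a, \\ 0 & r_* > b, \end{cases}
\tag{110}
\]
and split our investigation into two parts:
\[
\mathbf{1}_{[\delta,+\infty[}(D_{V,0})\,U_V(0,T)
= \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\,J\,U_V(0,T)
+ \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\,(1-J)\,U_V(0,T).
\tag{111}
\]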
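A cut-off with the properties required in (110) can be built explicitly. The following is a minimal sketch of one standard construction, based on the function $x \mapsto e^{-1/x}$; it is not taken from the paper, and the names `cutoff`, `_phi` and the values of `a`, `b` are illustrative assumptions.

```python
import math

def _phi(x: float) -> float:
    """Smooth on the real line, zero for x <= 0, positive for x > 0."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0

def cutoff(r: float, a: float, b: float) -> float:
    """Smooth cut-off: equals 1 for r <= a, 0 for r >= b, monotone in between."""
    if not a < b:
        raise ValueError("need a < b")
    s = (r - a) / (b - a)      # s <= 0 to the left of a, s >= 1 to the right of b
    up = _phi(s)               # vanishes for r <= a
    down = _phi(1.0 - s)       # vanishes for r >= b
    return down / (up + down)  # denominator never vanishes: s > 0 or 1 - s > 0

# Hypothetical choice with 0 < a < b < 1, as required in (110).
a, b = 0.25, 0.75
values = [cutoff(r, a, b) for r in (-3.0, 0.1, 0.5, 0.9, 2.0)]
```

With such a $J$, the two operators $J$ and $1 - J$ in (111) localize respectively near the star and far from it.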
Far from the star, we treat the term 1[δ,+∞[ (DV ,0 )(1 − J )UV (0, T ) using Theorem 4.1 on the scattering by the eternal black-hole. Indeed, we have: 1[δ,+∞[ (DV ,0 )(1 − J )UV (0, T ) = 1[δ,+∞[ (DV ,0 )(1 − J )UV ,R (−T ), seeing that U (t) = e
−it qQ r
0
(l,n)∈I
Eνln UVl,ν ,R (t)Rνln ,
where UVl,ν ,R is defined by Proposition 6.2. Near the star, with 0 in ad-hoc dense
subspace on L2R , we note that J UV (0, T )0 is given by J V ,gT (0, r∗ ). The function V ,gT (t, r∗ ) is the only solution of the mixed characteristic problem (81) and (82) with initial data gT (t) specified on the characteristic sub-manifold := {(t, r∗ ) ∈ Rt × [z(t), +∞[; r∗ = 1 − t} such that gT (t) := t (0, [UV (t, T )0 (1 − t)]2 , [UV (t, T )0 (1 − t)]3 , 0), ∃ tg > 0 : t > tg ⇒ gT (t) = 0. Concurrently, in the L2 norm, we prove that − 0 )(1 − 2t − T ), gT (t) ∼ g 2 (t) := (W0,R T
T → +∞,
− is defined in Lemma 6.3, seeing that where the wave operator W0,R
∗ − ν = Eνln W0,R Rln . Pr W − ←
(112)
(l,n)∈I
Then, in the L20 norm we obtain 1[δ,+∞[ (DV ,0 )J UV (0, T )0 ∼ 1[δ,+∞[ (DV ,0 )J V ,g T /2 ∼ 1[0,+∞[ (D0,0 )J 0,g T /2 ,
T → +∞.
The last term entails asymptotically an explicit calculation which leads to a term of KMS-type depending on $W^-_{0,\mathbb{R}}$. This proof using the characteristic problem allows us to easily introduce the wave operator $W^-_{0,\mathbb{R}}$. This operator is connected with the curvature of the space-time in the vicinity of the eternal black-hole horizon. To finish, we prove that the two terms on the right-hand side in (111) are asymptotically orthogonal as $T \to +\infty$.

6.3.1. Preliminary estimate for $\mathbf{1}_{[\delta,+\infty[}(D_{V,0})U_V(0,T)$. In this part, we use the notations introduced by formulas (74), (75), (88), (89) and Propositions 6.1 and 6.2. Therefore, we note that
\[
D_{V,0} = D_{V,[z(0),+\infty[}, \qquad L^2_0 = L^2([z(0),+\infty[_{r_*}, dr_*)^4.
\tag{113}
\]
Since DBH = e−i 2 γ Pr DV ,R ei 2 γ Pr −1 − qQ r0 , then, thanks to Proposition 6.3, we have σ (DV ,R ) = σac (DV ,R ) and therefore we deduce the following lemma of local energy decay: ν
ν
5
5
Lemma 6.2. If ≥ 0, then lim f UV ,R (t) = 0,
t→±∞
with f ∈ C 0 (R, M4 (C)) and limr∗ →±∞ |f (r∗ )| = 0. Proof. We consider the dense subspace Ld (DV ,R ) in L2R such that Ld (DV ,R ) = ∈ L2R ; B ⊂ R, |B| < +∞, 1B (DV ,R ) = . As σ (DV ,R ) = σac (DV ,R ), we have UV ,R (t) 0, t → ±∞. Then for all ∈ Ld (DV ,R ), lim f 1B (DV ,R )UV ,R (t) = 0,
t→±∞
because f 1B (DV ,R ) is compact on L2R following Proposition B.7.1 in [7]. Hence, by a density argument, the limit is proved for ∈ L2R . We choose a cut-off function χ ∈ C ∞ (Rr∗ ), such that ∃ a, b ∈ R, − ∞ < a < b < +∞
χ (r∗ ) =
1 r∗ < a , 0 r∗ > b
2− 2 and the subspaces L2+ R , LR of LR , satisfying:
2 = ∈ L ; ≡ ≡ 0 , L2+ 2 3 R R Therefore, we state the lemma:
2 L2− = ∈ L ; ≡ ≡ 0 . 1 4 R R
(114)
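Lemma 6.3. The wave operators
\[
W^\pm_{0,\mathbb{R}} = \operatorname*{s-lim}_{t\to\pm\infty} U_{0,\mathbb{R}}(-t)\,\chi\, U_{V,\mathbb{R}}(t) \quad \text{in } L^2_{\mathbb{R}},
\]
\[
W^\pm_{V,[z(0),+\infty[} = \operatorname*{s-lim}_{t\to\pm\infty} U_{V,[z(0),+\infty[}(-t)\,(1-\chi)\, U_{V,\mathbb{R}}(t) \quad \text{in } L^2_0 = L^2([z(0),+\infty[_{r_*}, dr_*)^4
\]
exist and are independent of $\chi$ satisfying (114). Moreover
\[
\operatorname{Ran} W^\pm_{0,\mathbb{R}} = L^{2\pm}_{\mathbb{R}}, \qquad
\operatorname{Ran} W^\pm_{V,[z(0),+\infty[} = P_{ac}\big(D_{V,[z(0),+\infty[}\big) L^2_0,
\tag{115}
\]
where $P_{ac}\big(D_{V,[z(0),+\infty[}\big)$ is the projector onto the absolutely continuous subspace of $D_{V,[z(0),+\infty[}$, and for $f \in H^1_{\mathbb{R}}$,
\[
\lim_{t\to-\infty} \left\| U_{0,\mathbb{R}}(t)\, W^-_{0,\mathbb{R}} f - \chi\, U_{V,\mathbb{R}}(t) f \right\|_{H^1_{\mathbb{R}}} = 0.
\tag{116}
\]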
± , the existence and property (115) are contained in Proof. For the wave operator W0,R 2 . For Theorem 4.1, since (112) exists and is an isometry from L2BH onto L2BH := Pr L← ± WV ,[z(0),+∞[ , we note that
DV ,]−∞,z(0)] ⊕ DV ,[z(0),+∞[ ± i
−1
−1 − DV ,R ± i
is of finite rank. Then, with the notations (93) introduced in the proof of Theorem 4.1, we obtain by the Birman-Kuroda theorem the existence on L2R of the wave operator W ± D0,]−∞,z(0)] , DV ,R , J1 ⊕ W ± DV ,[z(0),+∞[ , DV ,R , J2 = W ± D0,]−∞,z(0)] ⊕ DV ,[z(0),+∞[ , DV ,R ,
where J1 : ∈ L2R → J1 = |]−∞,z(0)] ∈ L2 (] − ∞, z(0)]r∗ , dr∗ )4 , J2 : ∈ L2R → J2 = |[z(0),+∞[ ∈ L2 ([z(0), +∞[r∗ , dr∗ )4 = L20 , with the property Ran W ± DV ,]−∞,z(0)] ⊕ DV ,[z(0),+∞[ , DV ,R ! = Pac DV ,]−∞,z(0)] ⊕ Pac DV ,[z(0),+∞[ L2R . Now, we must show the equality: WV±,[z(0),+∞[ = W ± DV ,[z(0),+∞[ , DV ,R , J2 . It arises from Lemma 6.2. Indeed, for all ∈ L2R , we have [J2 − (1 − χ )]U (t) 2 ≤ 1[z(0),+∞[ χ U (t) → 0, t → ±∞, V ,R V ,R L 0
(117)
because lim|r∗ |→+∞ 1[z(0),+∞[ χ = 0. Now we prove property (116). Since the wave − exists, then operator W0,R − − W0,R DV ,R = D0,R W0,R .
Given f ∈ HR1 = D(DV ,R ), then there exists ∈ L2R such that = DV ,R f . Therefore, with the previous formula,
− f − χ UV ,R (t)f 1 U0,R (t) W0,R HR
− ≤ D0,R U0,R (t) W0,R f − χ DV ,R UV ,R (t)f
− f − χ UV ,R (t)f , + χ V + [χ , D0,R ] UV ,R (t)f + U0,R (t) W0,R
− = U0,R (t) W0,R − χ UV ,R (t)
− + χ V + [χ , D0,R ] UV ,R (t)f + U0,R (t) W0,R f − χ UV ,R (t)f . The first and the third norm on the right-hand side are treated by the previous scattering − , and the second using Lemma 6.2, since lim results for W0,R r∗ →±∞ (|χ V |+|[χ , D0,R ]|) = 0. Now, we solve the characteristic Cauchy problem Lemma 6.4. For any g := t (0, g2 , g3 , 0) ∈ HR1 , such that t > tg ⇒ g(t) = 0, then there exists an unique solution of ∂t = i 1 ∂r∗ + iV , t ∈ R, z(t) < r∗ < −t + 1, 2 (t, z(t)) = Z(t)4 (t, z(t)), 1 (t, z(t)) = −Z(t)3 (t, z(t)), (0, 2 , 3 , 0)(t, −t + 1) = g(t), t ∈ R, t > tg , r∗ ∈ [z(t), −t + 1] ⇒ (t, r∗ ) = 0,
t ∈ R,
(118) (119) (120) (121)
∈ C 1 (Rt , L2 ) ∩ C 0 (Rt , H 1 ) such that with R R (t, r∗ ). t ∈ R, r∗ ∈ [z(t), −t + 1] ⇒ (t, r∗ ) =
(122)
Proof. We prove the uniqueness. Given a solution of the problem for g ≡ 0 such that ∈ C 1 (Rt , L2 ) ∩ C 0 (Rt , H 1 ) and z(t) < r∗ ⇒ (t, r∗ ) = (t, r∗ ). We have for R R t ∈ R: −t+1 d ||2 (t, r∗ )dr∗ dt z(t) = − ||2 (t, −t + 1) − z˙ (t) ||2 (t, z(t)) −t+1 +2 < ∂t , >C4 (t, r∗ )dr∗ , =
z(t) −2 |2 |2 (t, −t −t+1
+2
z(t)
+ 1) − 2 |3 |2 (t, −t + 1)
< ∂t − i 1 ∂r∗ − iV , >C4 (t, r∗ )dr∗ .
Since (t, r∗ ) satisfies Eq. (118), then −t+1 d ||2 (t, r∗ )dr∗ = −2 |2 |2 (t, −t + 1) − 2 |3 |2 (t, −t + 1). dt z(t) Integrating (123) on [t, T ], T > tg with respect to time, we obtain with (121), −t+1 +∞ ||2 (t, r∗ )dr∗ = 2 |g|2 (τ )dτ ≤ 2 g 2 . z(t)
(123)
(124)
t
Therefore, since g ≡ 0 then ≡ 0. Now, we prove the existence of the solution for a regular initial data g = (0, g2 , g3 , 0) ∈ C01 (R)4 . First, we solve the following characteristic problem: ∂t fV = i 1 ∂r∗ fV + iVfV , t ∈ R, r∗ > −t + 1, fV (t, −t + 1) = g(t), t ∈ R, t ∈]1 − r∗ , r∗ + a[ ⇒ fV (t, r∗ ) = 0,
(125) (126) (127)
where a = inf [supp(g)] . The continuous solution fV of (125), (126) and (127) is given by the continuous solution of the following equivalent integral problem: fV (t, r∗ ) = F (X = t + r∗ − 1, T = t − r∗ − a) T +a+1 + BF (X, T ) X ≥ 0, T > 0, g 2 , = 0 X ≥ 0, T ≤ 0,
' % & T X−ξ −a+1 V F (X, ξ ) dξ 2 %0 &
'1 X ξ −T −a+1 V F (ξ, T ) dξ i 0 2 2 . &
' BF (X, T ) = % X ξ −T −a+1 2 dξ V F (ξ, T ) 2 %0 &
'3 T X−ξ −a+1 dξ V F (X, ξ ) 0 2
(128)
(129)
4
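For $X \ge 0$, $T > 0$, putting
\[
F^0(X,T) = g\!\left(\frac{T+a+1}{2}\right); \qquad F^{n+1}(X,T) = BF^n(X,T), \quad n \ge 0,
\]
and since $g$ and $V$ are bounded, we have
\[
\left| BF^{n-1}(X,T) \right| \le \|g\|_{L^\infty}\, \|V\|^n_{L^\infty}\, 6^n\, \frac{(X+T)^n}{n!}, \qquad n \ge 1.
\]
Then the Picard method gives a unique solution $F(X,T) \in C^0([0,+\infty[_X \times \mathbb{R}_T)^4$ of (128) such that
\[
F(X,T) = \sum_{n=0}^{+\infty} F^n(X,T), \qquad
|F(X,T)| \le \|g\|_{L^\infty} \exp\big(6\,\|V\|_{L^\infty}(|X|+|T|)\big).
\]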
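The exponential bound on $F$ follows by summing the term-wise Picard estimates, since $\sum_n x^n/n! = e^x$ with $x = 6\|V\|_{L^\infty}(X+T)$. The sketch below only checks this summation numerically; the function name and the sample values of the sup norms are hypothetical.

```python
import math

def majorant_terms(g_sup: float, v_sup: float, s: float, n_max: int) -> list:
    """Terms g_sup * (6 * v_sup * s)**n / n!, n = 0..n_max, bounding |F^n| where X + T = s."""
    x = 6.0 * v_sup * s
    terms, term = [], g_sup
    for n in range(n_max + 1):
        terms.append(term)
        term *= x / (n + 1)   # next term: multiply by x / (n + 1)
    return terms

# Hypothetical sup norms of g and V, and a sample value of X + T.
g_sup, v_sup, s = 1.5, 0.4, 3.0
terms = majorant_terms(g_sup, v_sup, s, 60)
partial = sum(terms)                          # partial sum of the majorant series
bound = g_sup * math.exp(6.0 * v_sup * s)     # claimed exponential bound
```

The partial sums increase to the exponential bound, which is why the Picard series converges locally uniformly.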
Seeing that (X ≥ 0, T ≤ 0) ⇒ F (X, T ) = 0, V ∈ C ∞ (Rr∗ , M4 (C)) and for X, T ≥ 0, n−1 ∂X F n (X, T ) + ∂Y F n (X, T ) ≤ 16 g L∞ V n ∞ 12n−1 (X + T ) L (n − 1)! n n (X + T ) +2 g L∞ V ∞ V n−1 ∞ 12 L L n! n (X + T ) + g ∞ V nL∞ 6n , n ≥ 1, L n!
we have F (X, T ) ∈ C 1 ({(X, T ) ∈ [0, +∞[X ×RT : 2tg ≥ X + T + a + 1})4 . Hence, [g ]H (., .) = [UV (., tg )φV (tg , .)]H ∈ C 1 (Rt , L2R ) ∩ C 0 (Rt , HR1 ), fV (tg , r∗ ) r∗ > −tg + 1, 1 [φV (tg , .)]H ∈ HR , φV (tg , r∗ ) := 0 z(tg ) < r∗ ≤ −tg + 1,
(130) (131)
is a solution of (81), (82), (120) and (121) and in particular of (118), (119), (120) and (121) with g ∈ C01 (R). Moreover we have
\[
\frac{d}{dt} \int_{-t+1}^{+\infty} |f_V|^2(t,r_*)\,dr_* = 2\,|g|^2(t),
\]
and integrating this formula on $]-\infty, t_g]$ with respect to time, we obtain
\[
\int_{-t_g+1}^{+\infty} |f_V|^2(t_g,r_*)\,dr_* = 2\,\|g\|^2.
\]
Thanks to (130), (131) and (86), 2 2 sup [g (t, .)]L L2 = sup g (t, .)t = 2 g 2 , t∈R
R
t∈R
and by density and continuity, we get the existence with g ∈ HR1 .
(132)
We introduce some notations: For g ∈ L2R , g T (.) := g(. − T ),
T ≥ 0,
and following the previous lemma, when g := t (0, g2 , g3 , 0) ∈ HR1 , t > tg ⇒ g(t) = 0, we define the operator GV (g) such that GV (g)(r∗ ) := J (r∗ )V (0, r∗ ),
r∗ ∈ [z(0), 1],
(133)
with J as in (110) and V (0, r∗ ) the solution of (118), (119), (120) and (121). Moreover, by density and thanks to (124), formula (133) is well defined for g ∈ L2R , t > tg ⇒ g(t) = 0. Therefore, we prove the first important estimate:
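Lemma 6.5. Given $g := {}^t(0, g_2, g_3, 0) \in L^2_{\mathbb{R}}$ such that $t > t_g \Rightarrow g(t) = 0$, then
\[
\lim_{T\to+\infty} \left\| \mathbf{1}_{[0,+\infty[}(D_{0,0}) \left[ G_0\big(g^T\big) \right]_L \right\|_0^2
= 2 \left\langle g,\ e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \left( 1 + e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \right)^{-1} g \right\rangle_{L^2_{\mathbb{R}}},
\tag{134}
\]
and
\[
\left[ G_0\big(g^T\big) \right]_L \rightharpoonup 0, \qquad T \to +\infty, \qquad \text{in } L^2_0.
\tag{135}
\]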
Proof. As the norm of (134) is uniformly bounded in T by (124), it is enough to obtain (134) for g ∈ C0∞ (R)4 such that supp(g) ⊂ [0, R], R > 0 fixed. For T > 1−z(0) 2 , we T have G0 (g ) ∈ [z(0), 0[ and thanks to Lemma 6.1, 1 − r∗ T t T G0 (g )(r∗ ) = Z(τ (r∗ )) (−g3 , 0, 0, g2 ) τ (r∗ ) + , 2 with τ and Z respectively defined by (8) and (76). We define the spinor GT , such that 1 1 1 1 t T T G (r∗ ) := √ , (−g3 , 0, 0, g2 ) − ln(−r∗ ) + ln(Cκ0 ) + 2κ0 2κ0 2 −κ0 r∗ r∗ < 0, (136) with supp(GT ) ⊂] − ∞, 0[ and the real Ck > 0 as in (8). For the first time, we remark that, for f ∈ L2R , < [GT ]L , f >L2 → 0, R
T → +∞.
(137)
Indeed, for f ∈ C0∞ (R)4 , we have T T < [G ]L , f >L2 ≤ f L∞ (R)4 [G ]L (r∗ )dr∗ R R 0 T = f L∞ (R)4 G (r∗ )dr∗ −∞ ≤ κ0 Cκ0 e−2κ0 T +κ0 f L∞ (R)4 × e−κ0 y |g|(y)dy → 0, T → +∞. R
We obtain (137) by density and using the inequality [GT ]L ≤ g . Moreover, for T > 1−z(0) 2 , we have, & ' 2 T T [G ]L − G0 (g ) = L 0
0 z(0)
2 T G (r∗ ) − G0 (g T ) dr∗ .
We remark that:
\[
Z(\tau(r_*)) \in C^0([z(0),0[) \quad \text{and} \quad \lim_{r_*\to 0^-} h(r_*) = 1, \qquad
h(r_*) := \sqrt{-\kappa_0 r_*}\; Z(\tau(r_*)).
\tag{138}
\]
Indeed, thanks to (8) and (9), (138) entails that √ 1 + z˙ (τ (r∗ )) −2κ0 r∗ + O (r∗ 2 ) h(r∗ ) = −κ0 r∗ = , 1 − z˙ (τ (r∗ )) −2κ0 r∗ + O(r∗ 2 ) − 1 < z˙ (τ (r∗ )) ≤ 0,
r∗ ∈ [z(0), 0[.
Therefore, using (8) and putting y(r∗ ) = − 2κ10 ln(−r∗ ) + & ' 2 T [G ]L − G0 (g T ) L 0 +∞ =2
− 2κ1 ln(−z(0))+ 2κ1 ln(Cκ0 )+ 21 −T
0
0
× g y + O(e
−2T κ0 −2yκ0 +κ0
1 2κ0
ln(Cκ0 ) +
1 2
−T,
g(y) − h −Cκ0 e−2T κ0 −2yκ0 +κ0
2 ) dy,
and by the Lebesgue theorem, we obtain: & ' & ' 2 T − G0 (g T ) → 0, G L
T → +∞.
L 0
(139)
With (137), this last limit gives (135). Finally, for T > − 2κ10 ln(−z(0))+ 2κ10 ln(Cκ0 )+ 21 , and denoting F the Fourier transform, we have 2 1[0,+∞[ (D0,0 )[GT ]L 0 +∞
2 1 = F [GT ]L (ξ )dξ 2π 0 2 κ Cκ0 κ0 +∞ iCκ0 ξ eκ0 y 20 y dξ, g(y) e g(y)dy ˜ ˜ = g(−y/2), = e 2π R 0 −1 2π 2π D D g˜ >L2 , (lemma III.6 in [4]), =< g, ˜ e κ0 0,R 1 + e κ0 0,R = 2 < g, e
2π κ0 D0,R
1+e
2π κ0 D0,R
which implies, with (139), limit (134).
−1
R
g >L2 , R
(140)
To prove the following estimate, we need a Gronwall type inequality:

Lemma 6.6. Given $J, E_1, E_2 \in C^0([a,b])$ and $t \in [a,b] \Rightarrow E_1(t), E_2(t) \ge 0$, such that
\[
J(t) \le E_2(t) + E_1(t) \int_a^t J(s)\,ds, \qquad a \le t \le b,
\tag{141}
\]
then
\[
J(t) \le E_2(t) + E_1(t) \exp\left( \int_a^t E_1(s)\,ds \right) \int_a^t E_2(s)\,ds, \qquad a \le t \le b.
\tag{142}
\]
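Proof. We put
\[
R(s) = \exp\left( -\int_a^s E_1(\tau)\,d\tau \right) \int_a^s J(\tau)\,d\tau.
\]
We differentiate $R(s)$ and use (141):
\[
\frac{d}{ds} R(s) = J(s) \exp\left( -\int_a^s E_1(\tau)\,d\tau \right) - E_1(s) \exp\left( -\int_a^s E_1(\tau)\,d\tau \right) \int_a^s J(\tau)\,d\tau
\le E_2(s) \exp\left( -\int_a^s E_1(\tau)\,d\tau \right).
\]
As $R(a) = 0$, integrating the result on $[a,t]$, we obtain
\[
R(t) \le \int_a^t E_2(s) \exp\left( -\int_a^s E_1(\tau)\,d\tau \right) ds.
\]
Since $s \in [a,t]$ and $E_1$ is non-negative:
\[
\exp\left( -\int_a^s E_1(\tau)\,d\tau \right) \le 1.
\]
Hence
\[
\int_a^t J(s)\,ds \le \exp\left( \int_a^t E_1(\tau)\,d\tau \right) \int_a^t E_2(s)\,ds,
\]
and (142) follows.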
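As a sanity check of (142), one can take constant $E_1 = c$, $E_2 = e$ and the function $J(t) = e\,e^{c(t-a)}$, which satisfies (141) with equality. The sketch below compares $J$ with the right-hand side of (142), using a simple trapezoidal rule; the constants and helper names are hypothetical.

```python
import math

def gronwall_rhs(e1, e2, a: float, t: float, steps: int = 20000) -> float:
    """Right-hand side of (142), with trapezoidal quadrature on [a, t]."""
    h = (t - a) / steps

    def integral(f) -> float:
        total = 0.5 * (f(a) + f(t))
        for k in range(1, steps):
            total += f(a + k * h)
        return total * h

    return e2(t) + e1(t) * math.exp(integral(e1)) * integral(e2)

# Hypothetical constants; J below satisfies J(t) = E2 + E1 * int_a^t J(s) ds exactly.
c, e, a = 0.7, 2.0, 0.0
e1 = lambda s: c
e2 = lambda s: e
j = lambda s: e * math.exp(c * (s - a))

checks = [(j(t), gronwall_rhs(e1, e2, a, t)) for t in (0.5, 1.0, 2.0, 4.0)]
```

In this case (142) reduces to the elementary inequality $e^u - 1 \le u\,e^u$ for $u = c(t-a) \ge 0$.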
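Lemma 6.7. Given $g := {}^t(0, g_2, g_3, 0) \in L^2_{\mathbb{R}}$ such that $t > t_g \Rightarrow g(t) = 0$, then
\[
\lim_{T\to+\infty} \left\| \left[ G_0\big(g^T\big) \right]_L - \left[ G_V\big(g^T\big) \right]_L \right\|_0^2 = 0,
\tag{143}
\]
and
\[
\left[ G_V\big(g^T\big) \right]_L \rightharpoonup 0, \qquad T \to +\infty, \qquad \text{in } L^2_0.
\tag{144}
\]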
Proof. With (124), it is enough to obtain the result for g ∈ C0∞ (R)4 such that supp(g) ⊂ [0, R], R > 0 fixed. By Lemma 6.4, formulas (130) and (131), for r∗ ∈ [z(0), 1], we have
GV g T (r∗ ) = J (r∗ ) [UV (0, R + T )φV (R + T , .)] (r∗ ), fV (R + T , r∗ ) r∗ > −R − T + 1, φV (R + T , r∗ ) = (145) 0 z(R + T ) < r∗ ≤ −R − T + 1. Now, for r∗ ∈ [z(0), 1], we write
' & GV g T − G0 g T (r∗ ) = J (r∗ ) [UV (0, R + T )φV (R + T , .) − U0 (0, R + T )φ0 (R + T , .)] (r∗ ), = J (r∗ ) [UV (0, R + T ) {φV (R + T , .) − φ0 (R + T , .)}] (r∗ ) − J (r∗ ) [{U0 (0, R + T ) − UV (0, R + T )} φ0 (R + T , .)] (r∗ ), =: A1 + A2 . We estimate A1 . First, with (145), we have +∞
A1 20 ≤ |φV (R + T , r∗ ) − φ0 (R + T , r∗ )|2 dr∗ , z(R+T ) +∞
=
−R−T +1
|fV (R + T , r∗ ) − f0 (R + T , r∗ )|2 dr∗ ,
=: J (R + T ). But, d J (t) = |fV − f0 |2 (t, −t + 1) dt +∞ < ∂t (fV − f0 ) (t, r∗ ), (fV − f0 ) (t, r∗ ) >C4 dr∗ , + 2 −t+1
=: J1 + 2J2 . Since the solutions fV and f0 have the same characteristic data, J1 = 0. On the other hand, with the help of equations satisfied by fV and f0 , we have: +∞ J2 = < i 1 ∂r∗ (fV − f0 ) (t, r∗ ) + iVfV (t, r∗ ), (fV − f0 ) (t, r∗ ) >C4 dr∗ , −t+1 +∞
=− + =−
−t+1 +∞ −t+1 +∞ −t+1
+ 2
< (fV − f0 ) (t, r∗ ), i 1 ∂r∗ (fV − f0 ) (t, r∗ ) >C4 dr∗ < iVfV (t, r∗ ), (fV − f0 ) (t, r∗ ) >C4 dr∗ , < (fV − f0 ) (t, r∗ ), i 1 ∂r∗ (fV − f0 ) (t, r∗ ) + iVfV (t, r∗ ) >C4 dr∗ +∞
−t+1
< iVfV (t, r∗ ), (fV − f0 ) (t, r∗ ) >C4 dr∗ ,
= − J¯2 + 2
+∞ −t+1
< iVfV (t, r∗ ), (fV − f0 ) (t, r∗ ) >C4 dr∗ .
Then d J (t) = 2 dt
+∞ −t+1
< V (r∗ )fV (t, r∗ ), fV (t, r∗ ) − f0 (t, r∗ ) >C4 dr∗ .
(146)
In Lemma 6.4, we have proved that the solution fV (t, x) propagates at speed one. Therefore, for t ∈ [T , T + R], we have
supp g T ⊂ [T , T + R] ⇒ supp (fV (t, .)) ⊂ [−t + 1, t − 2T + 1], ⇒ J (0) = 0.
T , R > 0, (147)
Hence, integrating (146) on [0, T + R], we obtain: T +R +∞ J (R + T ) = 2 < V (r∗ )fV (t, r∗ ), fV (t, r∗ ) − f0 (t, r∗ ) >C4 dr∗ dt. −t+1
0
By the Cauchy-Schwarz inequality,
\[
J(R+T) \le 2 \int_0^{T+R} \int_{-t+1}^{+\infty} \left| \langle V(r_*) f_V(t,r_*),\ f_V(t,r_*) - f_0(t,r_*) \rangle \right| dr_*\,dt
\le 2 \int_0^{T+R} \left( \int_{-t+1}^{+\infty} |V(r_*) f_V(t,r_*)|^2\,dr_* \right)^{1/2} J(t)^{1/2}\,dt.
\]
Thanks to the remark (147) and as $\sqrt{x} \le x + 1$ for $x \ge 0$, we deduce that
\[
J(R+T) \le E_2(T+R) + E_1(T+R) \int_0^{T+R} J(t)\,dt,
\]
\[
E_1(t) := 4\,\|g\| \sup\{ |V(x)| ;\ x \le -t + 2R + 1 \}, \qquad E_2(t) := t\,E_1(t).
\]
As $E_1, J \in C^0(\mathbb{R})$, by Lemma 6.6, we have
\[
J(T+R) \le E_2(T+R) + E_1(T+R) \exp\left( \int_0^{T+R} E_1(s)\,ds \right) \int_0^{T+R} E_2(s)\,ds.
\]
Since $V(r_*)$ is exponentially decreasing as $r_* \to -\infty$, we get
\[
\|A_1\|_0^2 \le J(R+T) \to 0, \qquad T \to +\infty.
\tag{148}
\]
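To see why the Gronwall bound above tends to zero, note that an exponentially decaying $V$ makes $E_1$ integrable on $[0,+\infty[$ together with $t \mapsto t\,E_1(t)$, while $E_1(T+R)$ and $E_2(T+R)$ themselves decay. The following sketch illustrates this with an assumed envelope $|V(x)| \le \min(v_{\max}, c_v e^{\kappa x})$; the envelope, the constants and all function names are hypothetical and are not taken from the paper.

```python
import math

def trapezoid(f, lo: float, hi: float, steps: int = 20000) -> float:
    """Composite trapezoidal rule for f on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

def make_e1(g_norm: float, v_max: float, c_v: float, kappa: float, big_r: float):
    """Model of E1(t) = 4 ||g|| sup{|V(x)| : x <= -t + 2R + 1} for an assumed envelope of V."""
    def e1(t: float) -> float:
        edge = -t + 2.0 * big_r + 1.0
        # The assumed envelope is nondecreasing in x, so the sup is attained at x = edge.
        return 4.0 * g_norm * min(v_max, c_v * math.exp(kappa * edge))
    return e1

def gronwall_bound(e1, s: float) -> float:
    """E2(s) + E1(s) * exp(int_0^s E1) * int_0^s E2, with E2(t) = t * E1(t)."""
    e2 = lambda t: t * e1(t)
    return e2(s) + e1(s) * math.exp(trapezoid(e1, 0.0, s)) * trapezoid(e2, 0.0, s)

big_r = 1.0
e1 = make_e1(g_norm=0.25, v_max=1.0, c_v=1.0, kappa=1.0, big_r=big_r)
bounds = [gronwall_bound(e1, t + big_r) for t in (5.0, 10.0, 20.0)]
```

In this model the two integrals stay bounded as $T$ grows, so the bound inherits the exponential decay of $E_1(T+R)$.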
To estimate $A_2$, we use the usual formula
\[
\{ U_0(0,R+T) - U_V(0,R+T) \}\, \phi_0(R+T,.) = - \int_0^{R+T} U_V(0,s)\, V\, U_0(s,R+T)\, \phi_0(R+T,.)\,ds.
\]
Hence, we deduce with (86) that
\[
\|A_2\|_0 \le \left\| \{ U_0(0,R+T) - U_V(0,R+T) \}\, \phi_0(R+T,.) \right\|_0
\le \int_0^{R+T} \left\| V\, U_0(s,R+T)\, \phi_0(R+T,.) \right\|_s ds.
\tag{149}
\]
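Now we define the time $\tau_T$ such that $Z(\tau_T) - \tau_T = -2T + 1$. Thanks to (6):
\[
\tau_T = T - \frac{1}{2} + O\big(e^{-2\kappa_0 T}\big), \qquad T \to +\infty,
\tag{150}
\]
and according to Lemma 6.1, we also have
\[
s \in [0,\tau_T] \Rightarrow \left[ U_0(s,R+T)\,\phi_0(R+T,.) \right](r_*)
= Z(\tau(r_*))\ {}^t(-g_3, 0, 0, g_2)^T \left( \tau(r_*) + \frac{1 - r_* - s}{2} \right)
\tag{151}
\]
and
\[
s \in [0,\tau_T] \Rightarrow \operatorname{supp} \left[ U_0(s,R+T)\,\phi_0(R+T,.) \right] \subset \left[ -s - \left| O\big(e^{-2\kappa_0 T}\big) \right|,\ -s \right].
\tag{152}
\]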
Indeed, for s ∈ [0, τT ], supp [U0 (s, R + T )φ0 (R + T , .)] ⊂ [−s + 2τT − 2T + 1, −s] , and with (150), (152) follows. Hence, τT
A2 0 ≤
V U0 (s, R + T )φ0 (R + T , .) s ds 0
+
R+T τT
V U0 (s, R + T )φ0 (R + T , .) s ds,
≤ A21 + A22 . First, we estimate A21 . With the help of (152) and (151), we have, τT I (s)ds, A21 ≤ 0
where I (s) :=
−s −s−|O (e−2κ0 T )|
2 V (r∗ )Z(τ (r∗ )) t (−g3 , 0, 0, g2 )T τ (r∗ ) + 1 − r∗ − s dr∗ . 2 (153)
Using (8) and putting y(r∗ ) = − 2κ10 ln(−r∗ ) + I (s) ≤ 2 V 2L∞
y(−s) y (−s−|O (e−2κ0 T )|)
1 2κ0
ln(Cκ0 ) +
1−s 2
− T , we have
h2 (r∗ (y, s, T )) |g (y + O(r∗ (y, s, T )))|2 dy,
' & ≤ Cz V 2L∞ g 2L∞ ln s + O(e−2κ0 T ) − ln(s) ,
Cz > 0,
with h defined in (138). First, for x ≥ 0, log(x + 1) ≤ x. Hence we obtain τT τT I (s)ds ≤ Cz,V ,g ln s + O(e−2κ0 T ) − ln(s)ds 0 0 +∞ log(x + 1) ≤ Cz,V ,g O(e−2κ0 T ) dx, x2 C(T ) C(T ) := τT−1 O(e−2κ0 T ) ,
√ +∞ 1 log(x + 1) x −2κ0 T ≤ Cz,V ,g O(e ) 2 dx + dx , 2 x2 C(T ) x 1 −2κ0 T −2κ T 0 ≤ Cz,V ,g τT O(e ) + C O(e ) → 0, T → +∞. For s ∈ [τT , T + R] we have: supp[U0 (s, R + T )φ0 (R + T , .)] ⊂ [z(s), 2R + 1 − s]. Hence, thanks to (150) and (132), A22 ≤
R+T τT
≤ 2 g
V U0 (s, R + T )φ0 (R + T , .) s ds
T +R
sup {|V (x)|; z(s) ≤ x ≤ 2R + 1 − s} ds → 0,
T → +∞.
τT
Then, we obtain that
A2 0 → 0,
T → +∞.
(154)
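Now, finally, with (154) and (148), we deduce that
\[
\left\| \left[ G_0\big(g^T\big) \right]_L - \left[ G_V\big(g^T\big) \right]_L \right\|_0 \le \|A_1\|_0 + \|A_2\|_0 \to 0, \qquad T \to +\infty.
\]
Lastly, the above result together with (135) entails (144).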
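Lemma 6.8. Given $g := {}^t(0, g_2, g_3, 0) \in L^2_{\mathbb{R}}$ such that $t > t_g \Rightarrow g(t) = 0$, then
\[
\lim_{T\to+\infty} \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0}) \left[ G_V\big(g^T\big) \right]_L \right\|_0^2
= 2 \left\langle g,\ e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \left( 1 + e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \right)^{-1} g \right\rangle_{L^2_{\mathbb{R}}},
\tag{155}
\]
with $\delta = \dfrac{qQ}{r_0}$.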
Proof. First, we define V∞ thanks to V such that V∞ := δIR4 + ςAν =
lim V (r∗ ),
r∗ →+∞
δ=
qQ , ς = −m F (r+ ), r0
(156)
where Aν as in (76). If ς < 0 ( = 0), thus by assumption ν = (2k + 1)π , k ∈ Z and from the proof of Lemma III-7 in [4], we set that: 1[0,+∞[ Dς Aν ,]−∞,z(0)] ⊕ Dς Aν ,[z(0),+∞[ − 1[0,+∞[ Dς Aν ,R is compact. (157) For g ∈ C0∞ (R)4 such that supp(g) ⊂ [0, R], R > 0 fixed, and T > − 2κ10 ln(−z(0)) + T 1 1 ⊂]z(0), 0[ which entails: 2κ0 ln(Cκ0 ) + 2 , we have supp G & ' & ' 1[0,+∞[ Dς Aν ,]−∞,z(0)] ⊕ Dς Aν ,[z(0),+∞[ GT = 0 ⊕ 1[0,+∞[ Dς Aν ,[z(0),+∞[ GT , L
L
where GT is defined by (136). Since, 1[δ,+∞[ DV∞ ,0 = 1[0,+∞[ Dς Aν ,0 = 1[0,+∞[ Dς Aν ,[z(0),+∞[ and according to (137) and (157), we deduce that & ' & ' 1[0,+∞[ Dς Aν ,[z(0),+∞[ GT − 1[0,+∞[ Dς Aν ,R GT → 0, L
L
(158)
T → +∞. (159)
Seeing that DςAν ,R is the Dirac Hamiltonian, using the Fourier transform F: +
1 1 1 F1[0,+∞[ Dς Aν ,R = iξ + ς Aν F. + 2 2 ξ2 + ς2 We remark that & ' 2 (ξ ) = 4κ0 B(T )|θ (B(T )ξ )|2 , F GT L −2κ0 y θ (B(T )ξ ) := e−κ0 y eiξ B(T )e g(y)dy, B(T ) := Cκ0 e−2κ0 T +κ0 . R
Hence, thanks to Lebesgue’s theorem, & ' & ' 2 1[0,+∞[ D0,R GT − 1[0,+∞[ Dς Aν ,R GT L L 2
1 iξ 1 iξ 1 + ς Aν = C1 − 2 2 |ξ | ξ +ς R & ' 2 × F GT (ξ ) dξ, C1 > 0, L
2 1 iη 1 iη 1 + B(T )ς Aν ≤ C2 − 2 2 2 |η| η + B (T )ς R × |θ(η)|2 dη → 0, T → +∞, C2 > 0.
(160)
By (140), we have, +∞ & ' & ' 1 2 2 (ξ ) dξ 1[0,+∞[ D0,R GT = F GT L L 2π 0 & T' 2 = 1[0,+∞[ D0,0 G . L
(161)
As [GT ]L ≤ g , by density and using (158), (159) ,(160), (161) and (139), we obtain, for g ∈ L2R & T ' 2 1[δ,+∞[ D G0 g V∞ ,0 L & ' 2 (162) − 1[0,+∞[ D0,0 G0 g T → 0, T → +∞. L
If ς = 0 ( > 0), then we have clearly: & ' & ' 2 2 1[δ,+∞[ DV∞ ,0 G0 g T = 1[0,+∞[ D0,0 G0 g T . L
(163)
L
Moreover, from Lemma III-10 in [4], we have 1[δ,+∞[ DV∞ ,0 − 1[δ,+∞[ DV ,0
is compact.
(164)
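Then, using respectively (143), (164)-(135), (162)-(163) and (134), we conclude that:
\[
\begin{aligned}
\lim_{T\to+\infty} \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0}) \left[ G_V\big(g^T\big) \right]_L \right\|_0^2
&= \lim_{T\to+\infty} \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0}) \left[ G_0\big(g^T\big) \right]_L \right\|_0^2, \\
&= \lim_{T\to+\infty} \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V_\infty,0}) \left[ G_0\big(g^T\big) \right]_L \right\|_0^2, \\
&= \lim_{T\to+\infty} \left\| \mathbf{1}_{[0,+\infty[}(D_{0,0}) \left[ G_0\big(g^T\big) \right]_L \right\|_0^2, \\
&= 2 \left\langle g,\ e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \left( 1 + e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \right)^{-1} g \right\rangle_{L^2_{\mathbb{R}}}.
\end{aligned}
\]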
We defined a dense subspace DR of L2R , such that,
DR = ∈ HR1 : ∃R > 0 r∗ ≤ R ⇒ (r∗ ) = 0 .
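For $f \in D_{\mathbb{R}}$, we put
\[
g_T(t) := {}^t\big( 0,\ [U_V(t,T)f]_2,\ [U_V(t,T)f]_3,\ 0 \big)(-t+1),
\tag{165}
\]
\[
g(t) = \big( W^-_{0,\mathbb{R}} f \big)(-2t+1),
\tag{166}
\]
where $[x]_j$ is the $j$-th component of $x \in \mathbb{C}^4$. Moreover
\[
2t \ge T - R + 1 \Rightarrow g_T(t) = 0,
\tag{167}
\]
\[
2t \ge -R + 1 \Rightarrow g(t) = 0.
\tag{168}
\]
Lemma 6.9. Given $f \in D_{\mathbb{R}}$, with the definitions (165) and (166) we have
\[
\int_0^{+\infty} \left| g_T(t) - g^{T/2}(t) \right|^2 dt \to 0, \qquad T \to +\infty.
\tag{169}
\]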
Proof. We define such that (t, r∗ ) := UV ,R (t)f (r∗ ). Since f ∈ DR , then,
− f (r∗ − t) = 0. |t| ≤ R − r∗ ⇒ (t, r∗ ) = W0,R
(170)
Then, using the notation of (165) and for t + r∗ ≤ R, we remark that (t, r∗ ) is solution of r∗
− (t, r∗ ) = W0,R f (r∗ − t) + A(t, r∗ , s)ds, (171) −∞
with j = 1, 4 ⇒ [A(t, r∗ , s)]j = −[iV (s)(r∗ − s + t, s)]j , j = 2, 3 ⇒ [A(t, r∗ , s)]j = [iV (s)(s − r∗ + t, s)]j . From (171), for r∗ ≤ R, we deduce that
(., r∗ ) H 1 (]−∞,R−r∗ ])4 r∗ − ≤ W0,R f 1 + 2 |V (s)| (r∗ − s + ., r∗ ) H 1 (]−∞,R−r∗ ])4 ds HR −∞ r∗ +2 |V (s)| (s − r∗ + ., r∗ ) H 1 (]−∞,R−r∗ ])4 ds, −∞ r∗ − ≤ W0,R f 1 + 2 |V (s)| (., r∗ ) H 1 (]−∞,R−s])4 ds H −∞ r∗ R +2 |V (s)| (., r∗ ) H 1 (]−∞,R−2r∗ +s])4 ds, −∞ r∗ ≤ C1 f H 1 + 4 |V (s)| (., r∗ ) H 1 (]−∞,R−s])4 ds, C1 > 0. R
−∞
Since, V (s) is exponentially decreasing as s → −∞, by Gronwall lemma we obtain sup (., r∗ ) H 1 (]−∞,R−r∗ ])4 < +∞.
(172)
r∗ ≤R
On the other hand, using (171) and (172), we have for r∗ ≤ R, − f (r∗ − .) 1 (., r∗ ) − W0,R H (]−∞,R−r∗ ])4 r∗ ≤4 |V (s)| (., r∗ ) H 1 (]−∞,R−s])4 ds, −∞ r∗ ≤ C2 |V (s)|ds, C2 > 0. −∞
Thanks to the Sobolev embedding, for r∗ ≤ R, we conclude that r∗ − sup (σ, r∗ ) − W0,R f (r∗ − σ ) ≤ C3 |V (s)|ds, σ ≤R−r∗
−∞
C3 > 0.
(173)
We define
\[
I := \int_0^{+\infty} \left| g_T(t) - g^{T/2}(t) \right|^2 dt
\]
and remark that
' & T − g 2 (t) = U0,R (t − T ) W0,R (−2t + 1), ! ! gT (t) = t 0, UV ,R (t − T )f 2 , UV ,R (t − T )f 3 , 0 (−t + 1).
Therefore, choosing χ ∈ C ∞ (Rr∗ ) a cut-off function such that 1 r∗ < a ∃ a, b ∈ R, − ∞ < a < b < 0 χ (r∗ ) = , 0 r∗ > b and for ζ > 0 we deduce that +∞ 2
− f (−t + 1) − χ (−t + 1)UV ,R (t − T )f (−t + 1) dt, I≤ U0,R (t − T ) W0,R 0 2
− f − χ UV ,R (σ )f ∞ 4 ≤ζ sup U0,R (σ ) W0,R L (R)
σ ≤ζ −T +∞
2 − f (−t + 1) − (t − T , −t + 1) dt. U0,R (t − T ) W0,R
+
ζ
By the Sobolev embedding and formula (173), for ζ, T ≥ 1 − R, we obtain 2
− I ≤ ζ sup U0,R (σ ) W0,R f − χ UV ,R (σ )f 1 σ ≤ζ −T +∞
+
ζ
HR
2
− sup U0,R (σ ) W0,R f (−t + 1) − (σ, −t + 1) dt,
σ ≤t−T
T ≥ R − 1,
2
− ≤ ζ sup U0,R (σ ) W0,R f − χ UV ,R (σ )f σ ≤ζ −T
+C3
+∞ ζ
−t+1 −∞
HR1
2 |V (s)|ds
dt.
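Thanks to Lemma 6.3 and since $V(s)$ is exponentially decreasing as $s \to -\infty$, we conclude that $\lim_{T\to+\infty} I = 0$.

Lemma 6.10. Given $f \in L^2_{\mathbb{R}}$, then
\[
\lim_{T\to+\infty} \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, J\, U_V(0,T) f \right\|_0^2
= \left\langle W^-_{0,\mathbb{R}} f,\ e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \left( 1 + e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \right)^{-1} W^-_{0,\mathbb{R}} f \right\rangle_{L^2_{\mathbb{R}}},
\tag{174}
\]
with $\delta = \dfrac{qQ}{r_0}$, and
\[
J\, U_V(0,T) f \rightharpoonup 0, \qquad T \to +\infty, \qquad \text{in } L^2_0.
\tag{175}
\]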
Proof. For f ∈ DR , R > 0 fixed, thanks to (167), (168), (124) and (114), we have & T ' 2 J UV (0, T )f − GV g 2 L 0 & T ' 2 = [GV (gT )]L − GV g 2 , L 0 +∞ 2 T ≤2 gT (t) − g 2 (t) dt → 0,
T → +∞.
(176)
0
According to Lemma 6.9 2 lim 1[δ,+∞[ (DV ,0 )J UV (0, T )f 0 T →+∞ T ' 2 & = lim 1[δ,+∞[ (DV ,0 ) GV g 2 , T →+∞ L 0 −1 2π 2π D0,R D0,R − − κ κ =2 1+e 0 < W0,R f (1 − 2t), e 0 W0,R f (1 − 2t) >C4 dt, R
2π
− f, e κ0 =< W0,R
D0,R
2π
1 + e κ0
D0,R
−1
− W0,R f >L2 . R
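With limit (176) and Lemma 6.7 we obtain (175) for $f \in D_{\mathbb{R}}$. Since all norms are uniformly bounded with respect to $T$, the lemma is proved by density.

Finally, we prove the main result of this subpart:

Proposition 6.4. Given $f \in L^2_{\mathbb{R}}$, then
\[
\lim_{T\to+\infty} \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, U_V(0,T) f \right\|_0^2
= \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, W^-_{V,[z(0),+\infty[} f \right\|_0^2
+ \left\langle W^-_{0,\mathbb{R}} f,\ e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \left( 1 + e^{\frac{2\pi}{\kappa_0} D_{0,\mathbb{R}}} \right)^{-1} W^-_{0,\mathbb{R}} f \right\rangle_{L^2_{\mathbb{R}}},
\tag{177}
\]
with $\delta = \dfrac{qQ}{r_0}$.

Proof. By a simple calculation, we deduce
\[
\begin{aligned}
\left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, U_V(0,T) f \right\|_0^2
&= \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, J\, U_V(0,T) f \right\|_0^2
+ \left\| \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, (1-J)\, U_V(0,T) f \right\|_0^2 \\
&\quad + 2\,\Re \left\langle \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, (1-J)\, U_V(0,T) f,\ \mathbf{1}_{[\delta,+\infty[}(D_{V,0})\, J\, U_V(0,T) f \right\rangle_{L^2_0}.
\end{aligned}
\]
According to limit (175) and Lemma 6.3, the last term tends to zero as $T \to +\infty$. The two norms are treated by Lemma 6.10 and Lemma 6.3.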
6.3.2. Proof of Theorem 5.2. Now, we prove the key estimate. Using operators (69), (71) and the properties (73), (74) and (87), by Lebesgue theorem and Proposition 6.4, we have 1[0,+∞[ (D 0 ) U (0, T )f 2 0 2 = 1[0,+∞[ (DVl,ν ,0 − δ)UVl,ν (0, T )Rνln f , 0
(l,n)∈I
=
2 1[δ,+∞[ (DVl,ν ,0 )UVl,ν (0, T )Rνln f , 0
(l,n)∈I
−→
T →+∞
+
2 1[δ,+∞[ (DVl,ν ,0 )WV− ,[z(0),+∞[ Rνln f l,ν
(l,n)∈I
(l,n)∈I
2π
− ν < W0,R Rln f, e κ0
D0,R
0
2π
1 + e κ0
D0,R
−1
− ν W0,R Rln f >L2 , R
=: S1 + S2 . By Lemma 6.3, the wave operator WV− ,[z(0),+∞[ exists and is an isometry from L2R onto l,ν
Pac (DV ,[z(0),+∞[ )L20 . Hence W+− :=
(l,n)∈I
Eνln WV− ,[z(0),+∞[ Rνln l,ν
exists and is an isometry from L2BH onto Pac (D 0 )L20 . We put
∗ − − ,→ := W,→ ,
resp.
∗
− − 0,→ := W0,→ .
If > 0 (resp. = 0), we define the wave operator: − 2 W,D : Pac (D 0 )L20 → L,→ ,
resp.
− 2 , W0,D : Pac (D 0 )L20 → L0,→
such that
∗ − − W,D := ,→ W+− ,
resp.
∗
− − W0,D := 0,→ W+− .
2 , By the chain rule theorem, these operators are isometries from Pac (D 0 )L20 onto L,→ 2 2 (resp. Pac (D 0 )L0 onto L0,→). From the previous discussion and the intertwining
properties, we have, if ≥ 0 2 − WV ,[z(0),+∞[ 1[δ,+∞[ (DV ,R )Rνln f ,
S1 =
(l,n)∈I
l,ν
0
2 = W+− 1[δ,+∞[ (D BH + δ)f , 0 2 − = W,D W+− 1[0,+∞[ (D BH )f 2 2 − = ,→ 1[0,+∞[ (D BH )f 2
L,→
2 − = 1[0,+∞[ (D,→ ),→ f 2
L,→
,
,
L,→
.
We put − ∗ − ← := W ← , and remark that Pr D← Pr −1 =
Eνln D0,R Rνln − δ,
δ=
(l,n)∈I
qQ . r0
Then, with (73) and (112), we obtain that 2π 2π κ0 D← κ0 D← S2 =< Pr − 1 + µe − f, P µe r ← ← f >L2 , σ D← =< − ← f, µe
µ = eσ δ ,
1 + µeσ D←
2π σ := , κ0
involving the limits (58).
−1
BH
2 L2BH = Pr L← ,
− ← f >L2 , ←
qQ δ := , r0
Acknowledgements. The author would like to thank warmly Professor Alain Bachelot for his enlightening comments.
References
1. Bachelot, A.: Quantum vacuum polarization at the black-hole horizon. Ann. Inst. H. Poincaré Phys. Théor. 67(2), 181–222 (1997)
2. Bachelot, A.: Scattering of scalar fields by spherical gravitational collapse. J. Math. Pures Appl. (9) 76(2), 155–210 (1997)
3. Bachelot, A.: The Hawking effect. Ann. Inst. H. Poincaré Phys. Théor. 70(1), 41–99 (1999)
4. Bachelot, A.: Creation of fermions at the charged black-hole horizon. Ann. Henri Poincaré 1(6), 1043–1095 (2000)
5. Birrell, N.D., Davies, P.C.W.: Quantum fields in curved space. Cambridge: Cambridge University Press, 1982
6. Chodos, A., Jaffe, R.L., Johnson, K., Thorn, C.B., Weisskopf, V.F.: New extended model of hadrons. Phys. Rev. D (3) 9(12), 3471–3495 (1974)
7. Dereziński, J., Gérard, C.: Scattering theory of classical and quantum N-particle systems. Berlin: Springer-Verlag, 1997
8. Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77(3), 219–228 (1980)
9. Dimock, J.: Dirac quantum fields on a manifold. Trans. Am. Math. Soc. 269(1), 133–147 (1982)
10. Dimock, J., Kay, B.S.: Classical and quantum scattering theory for linear scalar fields on the Schwarzschild metric. I. Ann. Phys. 175(2), 366–426 (1987)
11. Fredenhagen, K., Haag, R.: On the derivation of Hawking radiation associated with the formation of a black hole. Commun. Math. Phys. 127(2), 273–284 (1990)
12. Gel'fand, I.M., Šapiro, Z.Ya.: Representations of the group of rotations of 3-dimensional space and their applications. Am. Math. Soc. Transl. (2) 2, 207–316 (1956)
13. Gibbons, G.W., Hawking, S.W.: Cosmological event horizons, thermodynamics, and particle creation. Phys. Rev. D (3) 15(10), 2738–2751 (1977)
14. Hawking, S.W.: Particle creation by black holes. Commun. Math. Phys. 43(3), 199–220 (1975)
15. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge Monographs on Mathematical Physics, No. 1, London: Cambridge Univ. Press, 1973
16. Jin, W.M.: Scattering of massive Dirac fields on the Schwarzschild black hole spacetime. Classical Quantum Gravity 15(10), 3163–3175 (1998)
17. Melnyk, F.: Wave operators for the massive charged linear Dirac field on the Reissner-Nordström metric. Classical Quantum Gravity 17(11), 2281–2296 (2000)
18. Melnyk, F.: Diffusion d'un champ massif et chargé de Dirac en métrique de Reissner-Nordstrøm. C. R. Acad. Sci. Paris Sér. I Math. 333(3), 185–190 (2001)
19. Melnyk, F.: Scattering on Reissner-Nordstrøm metric for the massive charged spin 1/2 fields. Ann. Henri Poincaré, to appear, 2003
20. Misner, C.W., Thorne, K.S., Wheeler, J.A.: Gravitation. San Francisco, CA: W. H. Freeman and Co., 1973
21. Nicolas, J.-P.: Scattering of linear Dirac fields by a spherically symmetric black hole. Ann. Inst. H. Poincaré Phys. Théor. 62(2), 145–179 (1995)
22. Nicolas, J.-P.: Global exterior Cauchy problem for spin 3/2 zero rest-mass fields in the Schwarzschild space-time. Comm. Partial Diff. Eqs. 22(3-4), 465–502 (1997)
23. Reed, M., Simon, B.: Methods of modern mathematical physics. I, II, III, IV. New York: Academic Press Inc. [Harcourt Brace Jovanovich Publishers], 1972-75-78-79
24. Thaller, B.: The Dirac equation. Berlin: Springer-Verlag, 1992
25. Wald, R.M.: On particle creation by black holes. Commun. Math. Phys. 45(1), 9–34 (1975)
26. Wald, R.M.: Quantum field theory in curved spacetime and black hole thermodynamics. Chicago, IL: University of Chicago Press, 1994

Communicated by H. Nicolai
Commun. Math. Phys. 244, 527–569 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-0992-4
Communications in
Mathematical Physics
First Order Asymptotics of Matrix Integrals; A Rigorous Approach Towards the Understanding of Matrix Models Alice Guionnet Ecole Normale Superieure de Lyon, Unite de Mathematiques pures et appliquees, UMR 5669, 46 Allee d’Italie, 69364 Lyon Cedex 07, France Received: 25 October 2002 / Accepted: 18 August 2003 Published online: 27 November 2003 – © Springer-Verlag 2003
Abstract: We investigate the large $N$ limit of spectral measures of matrices which relate to the Gibbs measures of a number of statistical mechanical systems on random graphs. These include the Ising and Potts models on random graphs. For most of these models, we prove that the spectral measures converge almost surely and describe their limit via solutions to an Euler equation for isentropic flow with negative pressure $p(\rho) = -3^{-1}\pi^2\rho^3$.

1. Introduction

Since the work of 't Hooft, it has been appreciated that matrix integrals can provide, via Feynman diagram expansions, generating functions for enumerating maps (or triangulated surfaces). A very useful survey was written by A. Zvonkin [34]. Matrix integrals are used to enumerate maps with a given genus and a given distribution of vertex degrees, whereas integrals over several matrices can be used to consider the case where the vertices can additionally be coloured (i.e. can take different states). Matrix integrals are usually of the form
\[
Z_N(P) = \int e^{-N \operatorname{tr}(P(A_1^N, \cdots, A_d^N))}\, dA_1^N \cdots dA_d^N
\]
with some polynomial function $P$ of $d$ non-commutative variables and the Lebesgue measure $dA$ on some well chosen ensemble of $N \times N$ matrices such as the set $\mathcal{H}_N$ (resp. $\mathcal{S}_N$, resp. $\mathrm{Symp}_N$) of $N \times N$ Hermitian (resp. symmetric, resp. symplectic) matrices. One would like to understand the full expansion of $Z_N(P)$ in powers of $N$. For instance, in the case where the matrices live on $\mathcal{H}_N$, the formal expansion linked with Feynman diagrams is of the type
\[
\frac{1}{N^2} \log Z_N(P) = \sum_{g \ge 0} \frac{1}{N^{2g}}\, C_P(g),
\]
where $C_P(g)$ enumerates some maps with genus $g$. Such an expansion was proved to hold in the one matrix case by K. McLaughlin and N. Ercolani [12]. A related issue is to understand the asymptotic behaviour of the corresponding Gibbs measure
$$\mu_P^N(dA_1^N\cdots dA_d^N) = \frac{1}{Z_N(P)}\, e^{-N\,\mathrm{tr}(P(A_1^N,\cdots,A_d^N))}\, dA_1^N\cdots dA_d^N.$$
More precisely, if for an $N\times N$ selfadjoint matrix $A$, $(\lambda_1(A),\cdots,\lambda_N(A))$ denotes its eigenvalues and $\hat\mu_A^N := N^{-1}\sum_{i=1}^N \delta_{\lambda_i(A)}$ its spectral measure, one would like to understand the asymptotic behaviour of $(\hat\mu^N_{A_1^N},\cdots,\hat\mu^N_{A_d^N})$ under the Gibbs measure $\mu_P^N$ when
$N$ goes to infinity. Of course, this understanding is intimately related with the first order asymptotics of the free energy $F_N(P) = N^{-2}\log Z_N(P)$. In fact, the rigorous approach to the full expansion of matrix integrals when $d=1$ given by K. McLaughlin and N. Ercolani is based on Riemann-Hilbert problem techniques, which themselves require a precise understanding of such asymptotics of the spectral measures. However, only very few matrix integrals could be evaluated in the physics literature, even on non-rigorous grounds. These cases correspond in general to the case where integration holds over Hermitian matrices. Using orthogonal polynomial methods, Mehta [24] obtained the limiting free energy for the Ising model on random graphs, corresponding to $d=2$, $\beta=2$ and $P(A,B) = P(A)+Q(B)-AB$, when $P(x)=Q(x)=gx^4+x^2$. This work was extended in [9, 20] to matrices coupled in a chain, a model corresponding to $P(A_1,\cdots,A_d) = \sum_{i=1}^d P_i(A_i) - \sum_{i=2}^d A_{i-1}A_i$. However, asymptotics of the spectral distribution of the matrices under the corresponding Gibbs measure were not considered in these works. On less rigorous ground, P. Zinn Justin [32, 33] discussed the limiting spectral measures of the matrices following the Gibbs measure of the so-called Potts model on random graphs, described by $P(A_1,\cdots,A_d) = \sum_{i=1}^d P_i(A_i) - \sum_{i=2}^d A_1A_i$. Very interesting work was also achieved by V. Kazakov (in particular for the so-called ABAB interaction case), A. Migdal and B. Eynard, for instance. We refer to the review [14] of B. Eynard for a general survey. A. Matytsin [21] obtained the first order asymptotics for spherical integrals, from which he could study the phase transition of diverse matrix models (see [22] for instance). O. Zeitouni and I gave a complete proof of part of his derivation in [16], and the present paper actually finishes putting his article [21] on firm ground.
In this paper, we investigate the problem of the first order asymptotics of matrix integrals with AB interaction, including the above Ising model, Potts model, matrix model coupled in a chain and induced QCD models. The integration will hold over either Hermitian matrices or symmetric matrices. The case of symplectic matrices could be handled similarly. We obtain, as a consequence of [16], the convergence of the free energy and represent its limit as the solution of a variational problem. We here study this variational problem and characterize its critical points. One of the main outcomes of this study is to show that under the Gibbs measure $\mu_P^N$ of the Ising model described by $P(A,B)=P(A)+Q(B)-AB$ with $P(x)\ge ax^4+b$ and $Q(x)\ge cx^4+d$ with $a, b > 0$, the spectral measures of $(A_1^N, A_2^N)$ converge almost surely. Moreover, we characterize their limits and prove that

Theorem 1.1. 1) $(\hat\mu_A^N,\hat\mu_B^N)$ converges almost surely towards a unique couple $(\mu_A,\mu_B)$ of probability measures on $\mathbb{R}$.
2) (µA , µB ) are compactly supported with finite non-commutative entropy (µ) = log |x − y|dµ(x)dµ(y). 3) There exists a couple (ρ A→B , uA→B ) of measurable functions on R ×(0, 1) such that ρtA→B (x)dx is a probability measure on R for all t ∈ (0, 1) and (µA , µB , ρ A→B , uA→B ) are characterized uniquely as the minimizer of a strictly convex function under a linear constraint (see Theorem 3.3). In particular, (ρ A→B , uA→B ) are solution of the Euler equation for isentropic flow 2 with negative pressure p(ρ) = − π3 ρ 3 such that, for all (x, t) in the interior of = {(x, t) ∈ R × [0, 1]; ρtA→B (x) = 0}, )=0 ∂t ρtA→B + ∂x (ρtA→B uA→B t (1.1) A→B (uA→B )2 − π 2 (ρ A→B )3 ) = 0 ) + ∂ (ρ ∂t (ρtA→B uA→B x t t t t 3 with the probability measure ρtA→B (x)dx weakly converging towards µA (dx) (resp. µB (dx)) as t goes to zero (resp. one). Moreover, we have 2 (x) − H µA (x) = 0 µA − a.s and P (x) − x − uA→B 0 β 2 (x) − H µB (x) = 0 µB − a.s. Q (x) − x + uA→B 1 β Here, H µ stands for the Hilbert transform of the probability measure µ given by (x − y) 1 dµ(y). dµ(y) = lim H µ(x) = P V ↓0 x−y (x − y)2 + 2 A more detailed characterization of (µA , µB , ρ A→B , uA→B ) is given in Theorem 3.3. To obtain such a result, we shall first study the limit obtained in [16] for spherical integrals. This limit was indeed given by the infimum of a rate function over measure-valued processes with given initial and terminal data. We show in Sect. 2 that this infimum is in fact taken at a unique probability measure-valued path, solution of the Euler equation for isentropic flow described in (1.1). Using a saddle point method, we derive from [16] in Theorem 3.1 formulae for the limiting free energy of some matrix models with AB interaction. In the Ising model case, this free energy is indeed written as the infimum of a strictly convex function, from which uniqueness of the minimizers is obtained. 
As a consequence, we obtain the convergence of the spectral measures under the Gibbs measure for the Ising model. A variational study then shows that the limiting spectral measures satisfy the above set of equations (see Theorem 3.3). For the other considered models (q-Potts model, matrices coupled in a chain, induced QCD), obvious convexity arguments, and therefore uniqueness, are lost in general, but still hold in certain cases. However, we can always specify some properties of the limit points (see Theorem 3.4).

In this paper, we shall denote by $C([0,1],\mathcal{P}(\mathbb{R}))$ the set of continuous processes with values in the set $\mathcal{P}(\mathbb{R})$ of probability measures on $\mathbb{R}$, endowed with its usual weak topology. For a measurable set $\Omega$ of $\mathbb{R}\times[0,1]$, $C_b^{p,q}(\Omega)$ denotes the set of real-valued functions on $\Omega$ which are $p$ times continuously differentiable with respect to the (first) space variable and $q$ times continuously differentiable with respect to the (second) time variable, with bounded derivatives. $C_c^{p,q}(\Omega)$ will denote the functions of $C_b^{p,q}(\Omega)$ with
compact support in the interior of the measurable set $\Omega$. $L^p(d\mu)$ will denote the space of measurable functions whose $p$th power is integrable under a given measure $\mu$. We shall say that an equality holds in the sense of distributions on a measurable set $\Omega$ if it holds once integrated with respect to any $C_c^{\infty,\infty}(\Omega)$ function.

2. Study of the Rate Function Governing the Asymptotic Behaviour of Spherical Integrals

In [16], Ofer Zeitouni and I studied the so-called spherical integral
$$I_N^{(\beta)}(D_N,E_N) := \int \exp\Big\{\frac{N\beta}{2}\,\mathrm{tr}(U D_N U^* E_N)\Big\}\, dm_N^\beta(U),$$
where $m_N^\beta$ denotes the Haar measure on the orthogonal group $O_N$ when $\beta=1$ and on the unitary group $U_N$ when $\beta=2$, and $D_N$, $E_N$ are diagonal real matrices whose spectral measures converge to $\mu_D$, $\mu_E$. We proved (see Theorem 1.1 in [16]) the existence of, and represented as a solution to a variational problem, the limit
$$I^{(\beta)}(\mu_D,\mu_E) := \lim_{N\to\infty} N^{-2}\log I_N^{(\beta)}(D_N,E_N).$$
This result in fact was obtained under the additional technical assumptions that there exists a compact subset $K$ of $\mathbb{R}$ such that $\mathrm{supp}\,\hat\mu^N_{D_N}\subset K$ for all $N\in\mathbb{N}$ and that $\hat\mu^N_{E_N}(x^2)$ is uniformly bounded (in $N$). These hypotheses will be made throughout this section. Note here that the term $2^{-1}\beta$ in the definition of $I_N^{(\beta)}(D_N,E_N)$ is irrelevant since it amounts to changing the definition of $D_N$ or $E_N$. However, it simplifies the notations. We forgot this parameter in [16] but wrote an addendum [17].

In this section, we investigate the variational problem which defines $I^{(\beta)}$ and study its minimizer. We indeed prove Matytsin’s heuristics [21] outlined in Sect. 6 of [16]. Let us recall the formula obtained in [16] for $I^{(\beta)}$:
$$I^{(\beta)}(\mu_D,\mu_E) := -J_\beta(\mu_D,\mu_E) + I_\beta(\mu_E) - \inf_{\mu\in\mathcal{P}(\mathbb{R})} I_\beta(\mu) + \frac{\beta}{4}\int x^2\, d\mu_D(x),$$
where, for any $\mu\in\mathcal{P}(\mathbb{R})$,
$$I_\beta(\mu) = \frac{\beta}{4}\int x^2\, d\mu(x) - \frac{\beta}{2}\Sigma(\mu),$$
with $\Sigma(\mu) = \int\!\!\int \log|x-y|\, d\mu(x)d\mu(y)$ the non-commutative entropy. We assume in this section that $\Sigma(\mu_0) > -\infty$, which can be seen to imply, when $J_\beta(\mu_0,\mu_E)<\infty$, that $\Sigma(\mu_E)>-\infty$ (see Lemma 2.4). $J_\beta(\mu_D,.)$ is the rate function governing the deviations of the law of the spectral measure of $X_N = D_N + W_N$ with a Hermitian (resp. symmetric) Gaussian Wigner matrix $W_N$ and a deterministic diagonal matrix $D_N = \mathrm{diag}(d_1,\cdots,d_N)$, $(d_i)_{1\le i\le N}\in\mathbb{R}^N$, with spectral measure $\hat\mu^N_{D_N} = N^{-1}\sum_{i=1}^N\delta_{d_i}$ weakly converging towards $\mu_D\in\mathcal{P}(\mathbb{R})$. It is given (see [16]) by
$$J_\beta(\mu_D,\mu) = \frac{\beta}{2}\inf\{S_{\mu_D}(\nu_.)\,;\ \nu\in C([0,1],\mathcal{P}(\mathbb{R})) : \nu_1 = \mu\}, \tag{2.1}$$
where
$$S_{\mu_D}(\nu) := \begin{cases} +\infty, & \text{if } \nu_0\neq\mu_D,\\ S^{0,1}(\nu) := \sup_{f\in C_b^{2,1}(\mathbb{R}\times[0,1])}\ \sup_{0\le s\le t\le 1} \bar S^{s,t}(\nu,f), & \text{otherwise.}\end{cases}$$
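Here, we have set, for any $f,g\in C_b^{2,1}(\mathbb{R}\times[0,1])$, any $s\le t\in[0,1]$, and any $\nu_.\in C([0,1],\mathcal{P}(\mathbb{R}))$,
$$S^{s,t}(\nu,f) = \int f(x,t)\,d\nu_t(x) - \int f(x,s)\,d\nu_s(x) - \int_s^t\!\!\int \partial_v f(x,v)\,d\nu_v(x)dv - \frac{1}{2}\int_s^t\!\!\int\!\!\int \frac{\partial_x f(x,v)-\partial_y f(y,v)}{x-y}\,d\nu_v(x)d\nu_v(y)dv, \tag{2.2}$$
$$\langle f,g\rangle^{\nu}_{s,t} = \int_s^t\!\!\int \partial_x f(x,v)\,\partial_x g(x,v)\,d\nu_v(x)dv, \tag{2.3}$$
and
$$\bar S^{s,t}(\nu,f) = S^{s,t}(\nu,f) - \frac{1}{2}\langle f,f\rangle^{\nu}_{s,t}. \tag{2.4}$$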
It can be shown by Riesz’s theorem (see such a derivation in [7] for instance) that any measure-valued path $\nu_.\in C([0,1],\mathcal{P}(\mathbb{R}))$ in $\{S_{\mu_D}<\infty\}$ is such that there exists a process $k_.$ so that
1. $\inf_{f\in C_b^{2,1}(\mathbb{R}\times[0,1])} \langle f-k, f-k\rangle^{\nu}_{0,1} = 0$.
2. $\nu_0=\mu_D$ and for any $f\in C_b^{2,1}(\mathbb{R}\times[0,1])$, any $0\le s\le t\le 1$,
$$S^{s,t}(\nu,f) = \langle f,k\rangle^{\nu}_{s,t}. \tag{2.5}$$
Then, it is not hard to show that
$$S_{\mu_D}(\nu_.) = \frac{1}{2}\langle k,k\rangle^{\nu}_{0,1}.$$
Therefore, $J_\beta(\mu_D,\mu)$ is given also by
$$J_\beta(\mu_D,\mu) = \frac{\beta}{4}\inf\{\langle k,k\rangle^{\nu}_{0,1}\,;\ (\nu,k) \text{ satisfies (C)}\}, \tag{2.6}$$
with (C) the condition

(C): $\nu_0=\mu_D$, $\nu_1=\mu$, $\partial_x k\in\overline{C_b^{1,1}(\mathbb{R}\times[0,1])}^{L^2(d\nu_t dt)}$, and, for any $f\in C_b^{2,1}(\mathbb{R}\times[0,1])$, any $s,t\in[0,1]$, $S^{s,t}(\nu,f) = \langle f,k\rangle^{\nu}_{s,t}$.

Here $\overline{C_b^{1,1}(\mathbb{R}\times[0,1])}^{L^2(d\nu_t dt)}$ denotes the closure in $L^2(d\nu_t dt)$ of $C_b^{1,1}(\mathbb{R}\times[0,1])$. The main theorem of this section is stated as follows:

Theorem 2.1. Let $\mu_E\in\{J_\beta(\mu_D,.)<\infty\}$. Then, the infimum in $J_\beta(\mu_D,\mu_E)$ is reached at a unique probability measure-valued path $\mu^*\in C([0,1],\mathcal{P}(\mathbb{R}))$ such that
• $\mu_0^*=\mu_D$, $\mu_1^*=\mu_E$.
• For any $t\in(0,1)$, $\mu_t^*$ is absolutely continuous with respect to Lebesgue measure; $\mu_t^*(dx)=\rho_t^*(x)dx$. $t\in[0,1]\to\mu_t^*\in\mathcal{P}(\mathbb{R})$ is continuous and therefore $\lim_{t\downarrow 0}\mu_t^*=\mu_D$, $\lim_{t\uparrow 1}\mu_t^*=\mu_E$.
• Let $k^*$ be such that the couple $(\mu^*,k^*)$ satisfies (C). Then, if we set $u_t^*(x)=\partial_x k_t^*(x)+H\mu_t^*(x)$, $(\rho^*,u^*)$ satisfies the Euler equation for isentropic flow described by the equations, for $t\in(0,1)$,
$$\partial_t\rho_t^*(x) = -\partial_x(\rho_t^*(x)u_t^*(x)), \tag{2.7}$$
$$\partial_t(\rho_t^*(x)u_t^*(x)) = -\partial_x\Big(\rho_t^*(x)u_t^*(x)^2 - \frac{\pi^2}{3}\rho_t^*(x)^3\Big) \tag{2.8}$$
in the sense of distributions that for all $f\in C_c^{\infty,\infty}(\mathbb{R}\times[0,1])$,
$$\int_0^1\!\!\int \partial_t f(t,x)\,d\mu_t^*(x)dt + \int_0^1\!\!\int \partial_x f(t,x)\,u_t^*(x)\,d\mu_t^*(x)dt = 0$$
and, with $\Omega := \{(x,t)\in\mathbb{R}\times[0,1] : \rho_t^*(x)>0\}$, for any $f\in C_c^{\infty,\infty}(\Omega)$,
$$\int\!\!\int \Big[2u_t^*(x)\,\partial_t f(x,t) + \big(u_t^*(x)^2-\pi^2\rho_t^*(x)^2\big)\,\partial_x f(x,t)\Big]\rho_t^*(x)\,dxdt = 0. \tag{2.9}$$
If we assume that $(\mu_D,\mu_E)$ are compactly supported probability measures, we additionally know that $(\rho^*,u^*)$ are smooth in the interior of $\Omega$, which guarantees that (2.7) and (2.8) hold everywhere in the interior of $\Omega$. Moreover, $\Omega$ is bounded in $\mathbb{R}\times[0,1]$. Furthermore, there exists a sequence $(\phi^\epsilon)_{\epsilon>0}$ of functions such that if we set
$$\rho_t^\epsilon(x) := \pi^{-1}\big(\max\{\partial_t\phi^\epsilon + 4^{-1}(\partial_x\phi^\epsilon)^2,\,0\}\big)^{\frac12}$$
then
$$\int\!\!\int\Big(u_t^*(x)-\frac{\partial_x\phi_t^\epsilon(x)}{2}\Big)^2 d\mu_t^*(x)dt + \frac{\pi^2}{3}\int\!\!\int\big(\rho_t^*(x)-\rho_t^\epsilon(x)\big)^2\big(\rho_t^*(x)+\rho_t^\epsilon(x)\big)\,dxdt + \pi^2\int\!\!\int\big|\partial_t\phi^\epsilon+4^{-1}(\partial_x\phi^\epsilon)^2-\pi^2\rho_t^\epsilon(x)^2\big|\,d\mu_t^*(x)dt \le \epsilon.$$
As a consequence, if we let $\Pi_t^*(x) = \int^x u_t^*(y)\,dy$, which should be thought of as the limit in $H^1(\rho_t^*(x)dxdt)$ of the sequence $(2^{-1}\phi^\epsilon)_{\epsilon>0}$, we find that it satisfies, in the sense of distributions in $\Omega$,
$$\partial_t\Pi_t^* = -\frac{1}{2}(\partial_x\Pi_t^*)^2 + \frac{\pi^2}{2}(\rho_t^*)^2,$$
which is Matytsin’s equation [21].

The (non-trivial) existence of solutions to the Euler equation for isentropic flow (2.7), (2.8) is a consequence of our variational study. Even when such solutions are not unique, we know that our minimizer is unique due to a convexity property of $S_{\mu_D}$ (which is a consequence of Property 2.2.1) below, see Property 2.6). In fact, C. Villani pointed out to me how smooth solutions of (2.7), (2.8) can be shown to minimize $S_{\mu_D}$. Hence, solutions to (2.7), (2.8) are unique under some a priori regularity properties. The study of minimizers of entropies similar to $S_{\mu_D}$ was also undertaken recently in [19].
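The link between Matytsin’s equation and the momentum equation (2.8) can be made explicit by a short formal computation. The sketch below assumes that $\rho$ and $u$ are smooth and that $\rho>0$ on the region considered; the symbol $\Pi$ for a potential with $\partial_x\Pi = u$ is our notational choice for this illustration, and the stars are dropped for readability.

```latex
% Step 1: differentiate Matytsin's equation in x, using u = \partial_x \Pi.
\partial_t \Pi = -\tfrac12 (\partial_x \Pi)^2 + \tfrac{\pi^2}{2}\rho^2
\quad\Longrightarrow\quad
\partial_t u = -u\,\partial_x u + \pi^2 \rho\,\partial_x\rho .

% Step 2: combine with the continuity equation (2.7),
% \partial_t \rho = -\partial_x(\rho u).
\partial_t(\rho u)
  = u\,\partial_t\rho + \rho\,\partial_t u
  = -u\,\partial_x(\rho u) - \rho u\,\partial_x u + \pi^2\rho^2\,\partial_x\rho
  = -\partial_x\!\big(\rho u^2\big) + \partial_x\!\Big(\tfrac{\pi^2}{3}\rho^3\Big),

% which is the momentum equation (2.8) with pressure p(\rho) = -\tfrac{\pi^2}{3}\rho^3.
```

This is only a consistency check in the smooth regime; the statements in the text hold in the sense of distributions.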
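Property 2.2. Let $\mu_E\in\{J_\beta(\mu_D,.)<\infty\}$. Then,
1) For any $\nu\in C([0,1],\mathcal{P}(\mathbb{R}))$, if $(\nu,k)$ verifies (C) and $u_t(x)=\partial_x k_t(x)+H\nu_t(x)$,
$$S_{\mu_D}(\nu) = \frac12\int_0^1\!\!\int (u_t(x))^2\,d\nu_t(x)dt + \frac12\int_0^1\!\!\int (H\nu_t(x))^2\,d\nu_t(x)dt - \frac12\big(\Sigma(\mu_E)-\Sigma(\mu_D)\big).$$
2) Consequently, we can write $J_\beta$ under the following form:
$$J_\beta(\mu_D,\mu_E) = \frac{\beta}{4}\Big[\int_0^1\!\!\int (u_t^*(x))^2\,d\mu_t^*(x)dt + \int_0^1\!\!\int (H\mu_t^*(x))^2\,d\mu_t^*(x)dt - \big(\Sigma(\mu_E)-\Sigma(\mu_D)\big)\Big] \tag{2.10}$$
with $(\mu^*,u^*)$ as in Theorem 2.1. Note here that $\mu_t^*(dx)=\rho_t^*(x)dx$ for $t\in(0,1)$ and $\rho_.^*\in L^3(dxdt)$, so that
$$\int_0^1\!\!\int (H\mu_t^*(x))^2\,d\mu_t^*(x)dt = \frac{\pi^2}{3}\int_0^1\!\!\int (\rho_t^*(x))^3\,dxdt.$$
3) As a consequence,
$$I^{(\beta)}(\mu_D,\mu_E) = -\frac{\beta}{4}\Big[\int_0^1\!\!\int (u_t^*(x))^2\,d\mu_t^*(x)dt + \int_0^1\!\!\int (H\mu_t^*(x))^2\,d\mu_t^*(x)dt\Big] - \frac{\beta}{4}\big(\Sigma(\mu_E)+\Sigma(\mu_D)\big) + \frac{\beta}{4}\int x^2\,d\mu_D(x) + \frac{\beta}{4}\int x^2\,d\mu_E(x) - \inf I_\beta. \tag{2.11}$$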
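The identity between $\int (H\mu)^2 d\mu$ and $\frac{\pi^2}{3}\int\rho^3 dx$ used in item 2) can be sanity-checked on a single explicit measure. The sketch below assumes the standard semicircle law on $[-2,2]$ with density $\sqrt{4-x^2}/(2\pi)$ as the test measure, for which the Hilbert transform equals $x/2$ on the support; the function names and the midpoint quadrature are illustrative choices, not taken from the paper.

```python
import math

def semicircle_density(x):
    """Density of the standard semicircle law on [-2, 2] (test measure, our choice)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)

def hilbert_semicircle(x):
    """Principal-value integral of dmu(y)/(x - y) for the semicircle law, valid for |x| <= 2."""
    return x / 2.0

def midpoint(f, a, b, n=20000):
    """Plain midpoint quadrature of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Left side: integral of (H mu)^2 against mu.
lhs = midpoint(lambda x: hilbert_semicircle(x) ** 2 * semicircle_density(x), -2.0, 2.0)
# Right side: (pi^2 / 3) times the integral of the cubed density.
rhs = (math.pi ** 2 / 3.0) * midpoint(lambda x: semicircle_density(x) ** 3, -2.0, 2.0)

print(lhs, rhs)  # both quadratures should agree closely
```

This checks the identity for one measure only; the general statement is the one proved via Lemma 2.4.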
In [21], a similar result was announced (see formulae (1.4) and (2.8) of [21]). However, it seems (as far as I could understand) that in formulae (2.10,2.11) of [21], the first term has the opposite sign. But, in [22], formula (2.18), the very same result is stated. Let us also notice that the minimizer µ∗t has a nice representation in the free probability context. Free probability is a probability theory for non-commutative variables, furnished with a notion of freeness, analogous to standard independence in the classical probability setting. In this field, random variables are operators which we shall assume hereafter to be selfadjoint. This theory naturally appears as the right set up to consider the large N -limit of random matrices and shall be needed in several places in this paper. Since free probability remains mysterious for many potential readers, let me recall some basic notions from this rapidly growing field. A W ∗ -probability space is a couple (A, τ ) of a unital von Neumann algebra (A, ∗) equipped with a linear form τ on A with real-valued restriction to the self-adjoint elements of A and such that 1. Positivity τ (AA∗ ) ≥ 0, for any A ∈ A. 2. Traciality τ (AB) = τ (BA) for any A, B ∈ A. 3. Total mass τ (I ) = 1, with I the neutral element of A.
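The three axioms above are modelled on the normalized trace of $N\times N$ matrices. As a small illustration, the following sketch checks positivity, traciality and total mass numerically for random complex matrices; the helper names (`random_matrix`, `tau`, and so on) and the matrix size are hypothetical choices for this example.

```python
import random

def random_matrix(n, rng):
    """n x n matrix with independent complex Gaussian entries."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(n)]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def adjoint(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def tau(a):
    """Normalized trace: the model example of a tracial state."""
    n = len(a)
    return sum(a[i][i] for i in range(n)) / n

rng = random.Random(0)
n = 6
A = random_matrix(n, rng)
B = random_matrix(n, rng)
identity = [[(1.0 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]

positivity = tau(matmul(A, adjoint(A)))                          # tau(A A*) should be >= 0
traciality_gap = abs(tau(matmul(A, B)) - tau(matmul(B, A)))      # tau(AB) - tau(BA)
total_mass = tau(identity)                                       # tau(I) should be 1

print(positivity.real, traciality_gap, total_mass.real)
```

The traciality gap is zero only up to floating-point rounding, which is why it is compared against a small tolerance rather than tested for exact equality.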
To compare this notion with the classical definition of a probability space $(\Omega,P)$, $\mathcal{A}$ is the analogue of the set $B(\Omega)$ of bounded measurable functions on $\Omega$. A measure $P$ can be viewed as a linear form on $B(\Omega)$; it is real-valued if for any measurable real-valued function $f$, $P(f)\in\mathbb{R}$. To be a probability measure, $P$ has to be non-negative, which corresponds to condition 1) above, and with total mass one (see 3)). The traciality condition is, in the commutative setting, trivial. The notion of law $\tau_{X_1,\ldots,X_m}$ of $m$ operators $(X_1,\ldots,X_m)$ in a $W^*$-probability space $(\mathcal{A},\tau)$ is simply given by the restriction of the trace $\tau$ to the algebra generated by $(X_1,\ldots,X_m)$, that is by the values
$$\tau_{X_1,\ldots,X_m}(P) = \tau(P(X_1,\ldots,X_m)),\qquad \forall P\in\mathbb{C}\langle X_1,\ldots,X_m\rangle,$$
where $\mathbb{C}\langle X_1,\ldots,X_m\rangle$ is the set of polynomial functions of $m$ non-commutative variables. The natural topology in free probability is the weak topology with respect to the polynomial functions $\mathbb{C}\langle X_1,\ldots,X_m\rangle$. Following the above description, laws of $m$ non-commutative self-adjoint variables can be seen as elements of the set $\mathcal{M}_1^{(m)}$ of linear forms on the set of polynomial functions of $m$ non-commutative variables $\mathbb{C}\langle X_1,\ldots,X_m\rangle$ furnished with the involution
$$\Big(\overrightarrow{\prod_{1\le k\le n}} X_{i_k}\Big)^{*} = \overrightarrow{\prod_{n\le k\le 1}} X_{i_k}$$
which satisfy the above properties of positivity, traciality and mass one. Note that $\mathcal{M}_1^{(1)}$ is the set of probability measures on $\mathbb{R}$ with finite moments.

Let $(M_1^N,\ldots,M_m^N)\in\mathcal{H}_N^m$ and consider the empirical distribution $\hat\mu^N_{M_1^N,\cdots,M_m^N}$ given for any $P\in\mathbb{C}\langle X_1,\ldots,X_m\rangle$ by
$$\hat\mu^N_{M_1^N,\cdots,M_m^N}(P) := \mathrm{tr}_N\big(P(M_1^N,\ldots,M_m^N)\big).$$
It is clear that $\hat\mu^N_{M_1^N,\cdots,M_m^N}\in\mathcal{M}_1^{(m)}$. Take a sequence $(M_1^N,\ldots,M_m^N)_{N\in\mathbb{N}}$ in $\mathcal{H}_N^m$ such that for any $P\in\mathbb{C}\langle X_1,\ldots,X_m\rangle$ the limit
$$\tau(P) := \lim_{N\to\infty}\mathrm{tr}_N\, P(M_1^N,\ldots,M_m^N)$$
exists. Then, it is not hard to see that $\tau\in\mathcal{M}_1^{(m)}$. Hence, $\mathcal{M}_1^{(m)}$ is the natural space in which one should consider large matrices, as far as their trace is concerned. In this framework, Wigner’s [31] results assert that the empirical distributions of Wigner matrices converge towards the semi-circle distribution $\sigma(dx) = C\sqrt{4-x^2}\,dx$. Free probability is not only a theory of probability for non-commutative variables; it also contains the central notion of freeness, which is the analogue of independence in standard probability.

Definition 2.3. The variables $(X_1,\ldots,X_m)$ and $(Y_1,\ldots,Y_n)$ are said to be free iff for any $n\in\mathbb{N}$, any $(P_i,Q_i)_{1\le i\le n}\in(\mathbb{C}\langle X_1,\cdots,X_m\rangle\times\mathbb{C}\langle X_1,\ldots,X_n\rangle)^n$,
$$\tau\Big(\overrightarrow{\prod_{1\le i\le n}} P_i(X_1,\ldots,X_m)\,Q_i(Y_1,\ldots,Y_n)\Big) = 0$$
as soon as
$$\tau(P_i(X_1,\ldots,X_m)) = 0,\qquad \tau(Q_i(Y_1,\ldots,Y_n)) = 0,\qquad \forall i\in\{1,\ldots,n\}.$$
Here, $\overrightarrow{\prod}$ denotes the non-commutative product.
Note that the notion of freeness defines uniquely the law of $\{X_1,\ldots,X_m,Y_1,\ldots,Y_n\}$ once the laws of $(X_1,\ldots,X_m)$ and $(Y_1,\ldots,Y_n)$ are given (in fact, one can check that the expectation of any polynomial is given uniquely by induction over the degree of this polynomial). It was shown by Voiculescu [28] that freeness is the right notion to describe the joint law of independent matrices in the limit where their size goes to infinity. More precisely, if $(A^N,B^N)$ are two diagonal matrices with converging spectral distribution, and $U_N$ a unitary matrix following the Haar measure $m_N^2$, then $(A^N,U_NB^NU_N^*)$ converges in law towards $(A,B)$, $A$ and $B$ being free and each of their laws being given by the limiting spectral distributions. This convergence holds almost surely. As a consequence, two independent Wigner matrices are asymptotically free, since their law is invariant under rotation. In particular, imagine you are given two matrices $(A^N,B^N)$ with given spectrum but unknown eigenvectors, and want to have some idea about the typical spectrum of the sum of these two matrices. Then, the most natural model is to try to identify the spectrum of $A^N+U_NB^NU_N^*$. By Voiculescu’s result, if $\hat\mu^N_{A^N}$ (resp. $\hat\mu^N_{B^N}$) converges weakly towards $\mu_A$ (resp. $\mu_B$), the spectral distribution of $A^N+U_NB^NU_N^*$ converges, as $N$ goes to infinity, towards the law of $A+B$, $A$ and $B$ being free and with distribution $\mu_A$ and $\mu_B$, that is the free convolution $\mu_A\boxplus\mu_B$ of $\mu_A$ and $\mu_B$. In particular, if $X^N$ is a Wigner matrix, $\hat\mu^N_{A^N+X^N}$ converges almost surely towards $\mu_A\boxplus\sigma$.

Let us now come back to the representation of $(\mu_t^*)_{t\in[0,1]}$ in terms of free probability. Let $(\mathcal{A},\tau)$ be a $W^*$-probability space on which an operator $D$ with distribution $\mu_D$, an operator $E$ with distribution $\mu_E$ and a semi-circular variable $S$, free with $(D,E)$, live.
Then, there exists a joint distribution of $(D,E)$ such that $(\mu_t^*)_{t\in[0,1]}$ is the law of a free Brownian bridge $X_t = tE+(1-t)D+\sqrt{t(1-t)}\,S$. The isentropic Euler equation which governs $\mu^*$ hence partially specifies the joint law of $(D,E)$, since it describes uniquely the law $\nu_t^*$ of $tE+(1-t)D$ (which is the unique solution of $\mu_t^* = \nu_t^*\boxplus\sigma_{t(1-t)}$, where $\sigma_\delta$ is the semicircular law with covariance $\delta$).

2.1. Study of $S_{\mu_0}$. Hereafter and to simplify the notations, $\mu_D=\mu_0$ and $\mu_E=\mu_1$ with some probability measures $(\mu_0,\mu_1)$ on $\mathbb{R}$. We shall in this section study the rate function $S_{\mu_0}$ and show that it achieves its minimal value on $\{\nu\in C([0,1],\mathcal{P}(\mathbb{R})) : \nu_1=\mu_1\}$ at a unique continuous measure-valued path $\mu^*$.

2.1.1. $S_{\mu_0}$ achieves its minimal value. Recall that for any probability measure $\mu_0\in\mathcal{P}(\mathbb{R})$, $S_{\mu_0}$ is a good rate function on $C([0,1],\mathcal{P}(\mathbb{R}))$ (see Theorem 2.4(1) of [16]). Therefore, the infimum defining $J_\beta(\mu_0,\mu_1)$ is, when it is finite, achieved in $C([0,1],\mathcal{P}(\mathbb{R}))$. We shall in the sequel restrict ourselves to $(\mu_0,\mu_1)$ such that $J_\beta(\mu_0,\mu_1)$ is finite.
2.1.2. A new formula for $S_{\mu_0}$. In this section, we shall give a simple formula for $S_{\mu_0}(\nu)$ in terms of $u_. = H\nu_.+\partial_x k_.$ and $\nu$ when $(k,\nu)$ satisfies (C). We begin with the following preliminary lemma.

Lemma 2.4. Let $\mu_0\in\{\mu\in\mathcal{P}(\mathbb{R}) : \Sigma(\mu)>-\infty\}$ and $\nu_.\in\{S_{\mu_0}<\infty\}$. Then, $\nu_1\in\{\mu\in\mathcal{P}(\mathbb{R}) : \Sigma(\mu)>-\infty\}$ and for almost all $t\in(0,1)$, $\nu_t(dx)\ll dx$ and
$$\int_0^1\!\!\int (H\nu_t(x))^2\,d\nu_t(x)dt = \frac{\pi^2}{3}\int_0^1\!\!\int\Big(\frac{d\nu_t(x)}{dx}\Big)^3 dxdt < \infty.$$
The idea of the proof of the lemma is quite simple; we make, in the definition of $S_{\mu_0}$, the change of variable $f(x,t)\to f(x,t)-\int\log|x-y|\,d\nu_t(y)$. However, because $(x,t)\to\int\log|x-y|\,d\nu_t(y)$ is not in $C_b^{2,1}(\mathbb{R}\times[0,1])$ in general, the full proof requires approximations of the path $\nu_.$ and becomes rather technical. This is the reason why I defer it to the Appendix, Sect. 4.2. We shall now prove the following

Property 2.5. Let $\mu_0\in\{\mu\in\mathcal{P}(\mathbb{R}) : \Sigma(\mu)>-\infty\}$ and $\nu_.\in\{S_{\mu_0}<\infty\}$. Then, if $(\nu,k)$ satisfies (C) and if we set $u_t := \partial_x k_t(x)+H\nu_t(x)$, we have
$$S_{\mu_0}(\nu) = \frac12\int_0^1\!\!\int (u_t(x))^2\,d\nu_t(x)dt + \frac12\int_0^1\!\!\int (H\nu_t(x))^2\,d\nu_t(x)dt - \frac12\big(\Sigma(\nu_1)-\Sigma(\nu_0)\big).$$
Proof. Let us recall that $(\nu,k)$ satisfying condition (C) implies that for any $f\in C_b^{2,1}(\mathbb{R}\times[0,1])$,
$$\int f(x,1)\,d\nu_1(x) - \int f(x,0)\,d\nu_0(x) = \int_0^1\!\!\int\partial_v f(x,v)\,d\nu_v(x)dv + \frac12\int_0^1\!\!\int\!\!\int\frac{\partial_x f(x,v)-\partial_y f(y,v)}{x-y}\,d\nu_v(x)d\nu_v(y)dv + \int_0^1\!\!\int\partial_x f(x,v)\,\partial_x k(x,v)\,d\nu_v(x)dv \tag{2.12}$$
with $\partial_x k\in L^2(d\nu_t(x)\times dt)$. Observe that by [27], p. 170, for any $s\in[0,1]$ such that $\nu_s$ is absolutely continuous with respect to Lebesgue measure with density $\rho_s\in L^3(dx)$, for any compactly supported measurable function $\partial_x f(.,s)$,
$$\int\!\!\int\frac{\partial_x f(x,s)-\partial_y f(y,s)}{x-y}\,d\nu_s(x)d\nu_s(y) = 2\int\partial_x f(x,s)\,H\nu_s(x)\,d\nu_s(x).$$
Since by Lemma 2.4, for almost all $s\in[0,1]$, $\nu_s(dx)\ll dx$ with a density $\rho_s\in L^3(dx)$, we conclude that, in the sense of distributions on $\mathbb{R}\times[0,1]$, (2.12) implies
$$\partial_s\rho_s + \partial_x(u_s\rho_s) = 0, \tag{2.13}$$
i.e. for any compactly supported $f\in C_c^{\infty,\infty}(\mathbb{R}\times[0,1])$ vanishing at the boundary of $\mathbb{R}\times[0,1]$,
$$\int_0^1\!\!\int_{\mathbb{R}}\big(\partial_s f(x,s)+u_s\,\partial_x f(x,s)\big)\rho_s(x)\,dxds = 0.$$
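Note here that, by the dominated convergence theorem, we can equivalently take $f\in C_b^{1,1}(\mathbb{R}\times[0,1])$. Moreover, since $H\nu_.$ belongs to $L^2(d\nu_s\times ds)$ by Lemma 2.4, we can write
$$2S_{\mu_0}(\nu_.) = \langle k,k\rangle^{\nu}_{0,1} = \int_0^1\!\!\int_{\mathbb{R}}(u_s(x))^2\,d\nu_s(x)ds + \int_0^1\!\!\int_{\mathbb{R}}(H\nu_s(x))^2\,d\nu_s(x)ds - 2\int_0^1\!\!\int_{\mathbb{R}}H\nu_s(x)\,u_s(x)\,d\nu_s(x)ds. \tag{2.14}$$
We shall now see that the last term in the above right-hand side only depends on $(\nu_0,\nu_1)$. The only difficulty in the proof of this point lies in the fact that $(x,s)\in\mathbb{R}\times[0,1]\to\int\log|x-y|\,d\nu_s(y)$ is not in $C_b^{2,1}(\mathbb{R}\times[0,1])$. However, following Lemma 5.16 in [8], if $u_t^\delta$ denotes the field corresponding to $\nu_t\boxplus\sigma_\delta$ by Riesz’s construction,
$$\Sigma(\nu_1\boxplus\sigma_\delta) - \Sigma(\nu_0\boxplus\sigma_\delta) = 2\int_0^1\!\!\int_{\mathbb{R}}H(\nu_s\boxplus\sigma_\delta)(x)\,u_s^\delta(x)\,d(\nu_s\boxplus\sigma_\delta)(x)ds. \tag{2.15}$$
It is well known that $\delta\to\Sigma(\nu\boxplus\sigma_\delta)$ is continuous (see [26], Theorem 2.1 for the lower semicontinuity and use the well known upper semi-continuity). Moreover, if $X_s$ is a random variable with distribution $\nu_s$ and $S$ a semicircular variable, free with $X_s$, living in a non-commutative probability space $(\mathcal{A},\tau)$, by Theorem 4.2 in [8], the field $u^\delta$ is given, $\nu_s\boxplus\sigma_\delta$ almost surely, by
$$u_s^\delta = \tau\big(\partial_x k(X_s,s)\,|\,X_s+\sqrt{\delta}S\big) + H(\nu_s\boxplus\sigma_\delta),$$
where $\tau(.\,|\,X_s+\sqrt{\delta}S)$ is the orthogonal projection in $L^2(\tau)$ on the algebra generated by $X_s+\sqrt{\delta}S$. Consequently,
$$\int_{\mathbb{R}}H(\nu_s\boxplus\sigma_\delta)(x)\,u_s^\delta(x)\,d(\nu_s\boxplus\sigma_\delta)(x) = \tau\big(\partial_x k(X_s,s)\,H(\nu_s\boxplus\sigma_\delta)(X_s+\sqrt{\delta}S)\big) + \tau\big(H(\nu_s\boxplus\sigma_\delta)(X_s+\sqrt{\delta}S)^2\big).$$
Moreover, by Voiculescu [29], Prop. 3.5 and Cor. 6.13, if $\nu_s(dx)=\rho_s(x)dx$ with $\rho_s\in L^3(dx)$,
$$\lim_{\delta\to0}\tau\Big(\big(H(\nu_s\boxplus\sigma_\delta)(X_s+\sqrt{\delta}S)-H\nu_s(X_s)\big)^2\Big) = 0.$$
Therefore, for any such $s\in[0,1]$,
$$\lim_{\delta\to0}\int_{\mathbb{R}}H(\nu_s\boxplus\sigma_\delta)(x)\,u_s^\delta(x)\,d(\nu_s\boxplus\sigma_\delta)(x) = \int_{\mathbb{R}}H\nu_s(x)\,u_s(x)\,d\nu_s(x). \tag{2.16}$$
Note that by Lemma 2.4, this convergence holds for almost all $s\in[0,1]$ since $\rho_.\in L^3(dxdt)$. Finally, by Props. 3.5 and 3.7 of [29], for any $s$ such that $H\nu_s$ is well defined,
$$H(\nu_s\boxplus\sigma_\delta)(X_s+\sqrt{\delta}S) = \tau\big(H\nu_s(X_s)\,|\,X_s+\sqrt{\delta}S\big),$$
so that for any $\delta>0$,
$$\tau\big(H(\nu_s\boxplus\sigma_\delta)(X_s+\sqrt{\delta}S)^2\big) \le \tau\big(H\nu_s(X_s)^2\big).$$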
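Therefore, the dominated convergence theorem and (2.16) imply that
$$\lim_{\delta\to0}\int_0^1\!\!\int_{\mathbb{R}}H(\nu_s\boxplus\sigma_\delta)(x)\,u_s^\delta(x)\,d(\nu_s\boxplus\sigma_\delta)(x)ds = \int_0^1\!\!\int_{\mathbb{R}}H\nu_s(x)\,u_s(x)\,d\nu_s(x)ds.$$
Thus, (2.15) extends to $\delta=0$, which proves, with (2.14), Property 2.5.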
2.1.3. Uniqueness of the minimizers of $S_{\mu_0}$. We shall use the formula for $S_{\mu_0}$ obtained in the last section to prove that

Property 2.6. For any $(\mu_0,\mu_1)\in\mathcal{P}(\mathbb{R})$ with finite entropy $\Sigma$, there exists a unique measure-valued path $\mu^*$ such that
$$J_\beta(\mu_0,\mu_1) = \frac{\beta}{2}\inf\{S_{\mu_0}(\nu_.) : \nu_1=\mu_1\} = \frac{\beta}{2}S_{\mu_0}(\mu^*).$$
In the following, $\mu^*$ shall always denote the minimizer of Property 2.6 and $\partial_x k^*$, $u^*$ its associated field and velocity.

Proof. According to the previous section, the minimizers of $S_{\mu_0}$ also minimize
$$S(u,\rho) = \frac{\pi^2}{3}\int_0^1\!\!\int_{\mathbb{R}}(\rho_t(x))^3\,dxdt + \int_0^1\!\!\int_{\mathbb{R}}(u_t(x))^2\rho_t(x)\,dxdt$$
under the constraint $\partial_t\rho_t+\partial_x(\rho_tu_t)=0$ in the sense of distributions, $\rho_t\ge0$ almost surely w.r.t. Lebesgue measure and $\int\rho_t(x)dx=1$, and with given initial and terminal data for $\rho$ given by
$$\lim_{t\downarrow0}\rho_t(x)dx = \mu_0(dx),\qquad \lim_{t\uparrow1}\rho_t(x)dx = \mu_1(dx),$$
where convergence holds in the weak sense and is simply due to the fact that $S_{\mu_0}$ is finite only on continuous measure-valued paths. Let $m=u\rho$ be the corresponding momentum. In the variables $(m,\rho)$, $S(u,\rho)=S(m,\rho)$ reads
$$S(m,\rho) = \frac{\pi^2}{3}\int_0^1\!\!\int_{\mathbb{R}}(\rho_t(x))^3\,dxdt + \int_0^1\!\!\int_{\mathbb{R}}\frac{(m_t(x))^2}{\rho_t(x)}\,dxdt$$
with the convention $\frac00=0$, whereas the constraint becomes linear,
$$\partial_t(\rho_t(x))+\partial_x(m_t(x))=0,\quad \rho_t(x)dx\in\mathcal{P}(\mathbb{R})\ \ \forall t\in[0,1],\quad \lim_{t\downarrow0}\rho_t(x)dx=\mu_0(dx),\quad \lim_{t\uparrow1}\rho_t(x)dx=\mu_1(dx).$$
We now observe that $S$ is a strictly convex function. Indeed, if $(m^1,\rho^1)$ and $(m^2,\rho^2)$ are any two couples of measurable functions in $\{S<\infty\}$, it is easy to see that for any $\alpha\in(0,1)$,
$$\partial_\alpha^2 S(\alpha m^1+(1-\alpha)m^2,\ \alpha\rho^1+(1-\alpha)\rho^2) = 2\pi^2\int_0^1\!\!\int_{\mathbb{R}}(\rho_t^1(x)-\rho_t^2(x))^2(\alpha\rho_t^1(x)+(1-\alpha)\rho_t^2(x))\,dxdt + 2\int_0^1\!\!\int_{\mathbb{R}}\frac{(\rho_t^1(x)m_t^2(x)-\rho_t^2(x)m_t^1(x))^2}{(\alpha\rho_t^1(x)+(1-\alpha)\rho_t^2(x))^3}\,dxdt.$$
Hence, $\partial_\alpha^2 S(\alpha m^1+(1-\alpha)m^2,\ \alpha\rho^1+(1-\alpha)\rho^2)>0$ for some $\alpha\in(0,1)$ unless for almost all $t\in[0,1]$, $\rho_t^1(x)=\rho_t^2(x)=\rho_t(x)$, and
$$u_t^1(x) = \frac{m_t^1(x)}{\rho_t^1(x)} = \frac{m_t^2(x)}{\rho_t^2(x)} = u_t^2(x)\qquad \rho_t(x)dxdt \text{ a.s.}$$
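The second-derivative formula behind this convexity argument can be checked pointwise, that is, on the integrand $g(m,\rho)=m^2/\rho+\frac{\pi^2}{3}\rho^3$ along a segment between two points. The sketch below compares the closed form with a central finite difference; the sample values, the step size `h` and the function names are arbitrary illustrative choices.

```python
import math

def integrand(m, rho):
    """Pointwise integrand of S in the (momentum, density) variables."""
    return m * m / rho + (math.pi ** 2 / 3.0) * rho ** 3

def second_derivative_closed_form(alpha, m1, rho1, m2, rho2):
    """Pointwise version of the formula for the second derivative in alpha."""
    rho_a = alpha * rho1 + (1.0 - alpha) * rho2
    return (2.0 * math.pi ** 2 * (rho1 - rho2) ** 2 * rho_a
            + 2.0 * (rho1 * m2 - rho2 * m1) ** 2 / rho_a ** 3)

def second_derivative_numeric(alpha, m1, rho1, m2, rho2, h=1e-4):
    """Central finite difference of alpha -> integrand along the segment."""
    def g(a):
        return integrand(a * m1 + (1.0 - a) * m2, a * rho1 + (1.0 - a) * rho2)
    return (g(alpha + h) - 2.0 * g(alpha) + g(alpha - h)) / (h * h)

# Arbitrary sample points with positive densities.
m1, rho1, m2, rho2, alpha = 0.3, 1.2, -0.7, 0.5, 0.4
closed = second_derivative_closed_form(alpha, m1, rho1, m2, rho2)
numeric = second_derivative_numeric(alpha, m1, rho1, m2, rho2)

print(closed, numeric)  # the two values should agree up to discretization error
```

Both terms of the closed form are non-negative whenever the densities are positive, which is the pointwise content of the convexity claim.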
In other words, $S$ is strictly convex. By standard convex analysis, the strict convexity of $S$ results in the uniqueness of its minimizers given a linear constraint, and in particular in $J_\beta$. More precisely, from the above, the minimizer $\mu^*$ in $J_\beta$ is defined uniquely for almost all $t\in[0,1]$ (and then everywhere by continuity of $\mu^*$) and its field $u^*$, or equivalently $\partial_x k^*$, is then defined uniquely $\mu_t^*(dx)dt$ almost surely.

2.2. A priori properties of the minimizer $\mu^*$. In this section, we shall see that the minimizer $\mu^*$ has to be the distribution of a free Brownian bridge when at least one of the probability measures $\mu_0$ or $\mu_1$ is compactly supported, the other having finite variance (since we rely on results in [16]). To simplify the statements, we shall assume throughout this section that both probability measures are compactly supported. This property will enable us to obtain a priori properties on the laws of the minimizers, such as existence, boundedness, and smoothness of their densities.

2.2.1. Free Brownian bridge characterization of the minimizer. Let us state more precisely the theorem obtained in this section. A free Brownian bridge between $\mu_0$ and $\mu_1$ is the law of
$$X_t = (1-t)X_0 + tX_1 + \sqrt{t(1-t)}\,S \tag{2.17}$$
with a semicircular variable $S$, free with $X_0$ and $X_1$, with law $\mu_0$ and $\mu_1$ respectively. We let $FBB(\mu_0,\mu_1)\subset C([0,1],\mathcal{P}(\mathbb{R}))$ denote the set of such laws (which depend of course not only on $\mu_0$, $\mu_1$ but on the joint distribution of $(X_0,X_1)$ too). Then, we shall prove that

Theorem 2.7. Assume $\mu_0$, $\mu_1$ compactly supported. Then,
$$J_\beta(\mu_0,\mu_1) = \frac{\beta}{2}\inf\{S(\nu),\ \nu_0=\mu_0,\ \nu_1=\mu_1\} = \frac{\beta}{2}\inf\{S(\nu)\,;\ \nu\in FBB(\mu_0,\mu_1)\}.$$
Therefore, since $FBB(\mu_0,\mu_1)$ is a closed subset of $C([0,1],\mathcal{P}(\mathbb{R}))$, the unique minimizer $\mu^*$ in the above infimum belongs to $FBB(\mu_0,\mu_1)$. The proof of Theorem 2.7 is rather technical and goes back through the large random matrices origin of $J_\beta$. We therefore defer it to the appendix.

2.2.2. Properties of the free Brownian motion paths. As a consequence of Theorem 2.7, we shall prove that

Corollary 2.8. Assume $\mu_0$ and $\mu_1$ compactly supported. Then,
a) There exists a compact set $K\subset\mathbb{R}$ so that for all $t\in[0,1]$, $\mu_t^*(K^c)=0$. For all $t\in(0,1)$, the support of $\mu_t^*$ is the closure of its interior.
b) $\mu_t^*(dx)\ll dx$ for all $t\in[0,1]$. Let $\rho_t^*(x)=\frac{d\mu_t^*(x)}{dx}$.
c) There exists a finite constant $C$ (independent of $t$) so that, $\mu_t^*$ almost surely,
$$\rho_t^*(x)^2+(H\mu_t^*(x))^2\le(t(1-t))^{-1}\quad\text{and}\quad |u_t^*(x)|\le C(t(1-t))^{-\frac12}.$$
d) $(\rho^*,u^*)$ are analytic in the interior of $\Omega=\{(x,t)\in\mathbb{R}\times[0,1] : \rho_t^*(x)>0\}$.
e) At the boundary of $\Omega_t=\{x\in\mathbb{R} : \rho_t^*(x)>0\}$, for $x\in\Omega_t$,
$$|\rho_t^*(x)^2\,\partial_x\rho_t^*(x)|\le\frac{1}{4\pi^3t^2(1-t)^2}\quad\Rightarrow\quad\rho_t^*(x)\le\Big(\frac{3}{4\pi^3t^2(1-t)^2}\Big)^{\frac13}(x-x_0)^{\frac13}$$
if $x_0$ is the nearest point to $x$ in $\Omega_t^c$. Consequently, the minimizer $\mu^*$ may only have shocks at the boundary of its support.

Proof. This corollary is a direct consequence of Theorem 2.7 and we shall collect these properties for any free Brownian bridge law. Indeed, let $(\mathcal{A},\tau)$ be a non-commutative probability space in which two operators $X_0$, $X_1$ with laws $\mu_0$ and $\mu_1$ and a semicircular variable $S$, free with $(X_0,X_1)$, live. We assume throughout that $X_0$ and $X_1$ are bounded by $C$ for the operator norm (i.e. $\mu_0([-C,C]^c)=\mu_1([-C,C]^c)=0$). Let $\mu_t$ be the distribution of $X_t=tX_1+(1-t)X_0+\sqrt{t(1-t)}\,S$. Clearly, since $S$ is bounded by 2 for the operator norm, $X_t$ is bounded by $C+2$ for all $t\in[0,1]$. Thus, Proposition 4 in [3] finishes the proof of a). Following Voiculescu (see Proposition 3.5 and Corollary 3.9 in [29]), the Hilbert transform of $\mu_t$ is given, $\mu_t$-almost surely, by
$$H\mu_t(x) = \tau\big((2\sqrt{t(1-t)})^{-1}S\,|\,X_t\big)$$
with $\tau(\ \cdot\ |X_t)$ the conditional expectation with respect to $X_t$, i.e. the orthogonal projection on the sigma algebra generated by $X_t$. We deduce that, since $S$ is bounded for the operator norm by 2, $\mu_t$-almost surely,
$$|H\mu_t(x)|\le\frac{1}{\sqrt{t(1-t)}}.$$
Further, following [4], the stochastic differential equation satisfied by $X_t$ shows that, for any twice continuously differentiable function $f$ on $\mathbb{R}$,
$$\mu_t(f) = \mu_0(f) + \frac12\int_0^t\!\!\int\!\!\int\frac{\partial_x f(x)-\partial_x f(y)}{x-y}\,d\mu_s(x)d\mu_s(y)ds + \int_0^t\!\!\int\partial_x f(x)\,\partial_x k_s(x)\,d\mu_s(x)ds \tag{2.18}$$
with $k$ the element of $L^2(d\mu_sds)$ given by
$$\partial_x k_s(x) = \tau\Big(\frac{X_s-X_1}{s-1}\,\Big|\,X_s\Big).$$
Hence,
$$u_t := \partial_x k_t + H\mu_t = \tau\Big(\frac{X_t-X_1}{t-1}\,\Big|\,X_t\Big) + \tau\big((2\sqrt{t(1-t)})^{-1}S\,|\,X_t\big) = \tau\Big(X_1-X_0+\frac{(1-2t)}{2\sqrt{t(1-t)}}S\,\Big|\,X_t\Big). \tag{2.19}$$
Therefore, $\mu_t$-almost surely,
$$|u_t|\le 2C+\frac{1}{\sqrt{t(1-t)}}. \tag{2.20}$$
Moreover, by Biane’s results [3], we know that, for $t\in(0,1)$, $\mu_t$ is absolutely continuous with respect to Lebesgue measure. We denote by $\rho_t$ its density. Then, we also know that for all $t\in(0,1)$, $\mu_t$-almost surely,
$$\rho_t(x)^2+(H\mu_t)^2(x)\le\frac{1}{t(1-t)}. \tag{2.21}$$
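The bound (2.21) can be illustrated on the simplest free Brownian bridge. The sketch below assumes $X_0=X_1=0$ (so $\mu_0=\mu_1=\delta_0$, a degenerate case chosen only for this illustration), in which case $\mu_t$ is the semicircular law of variance $t(1-t)$ with the explicit density and Hilbert transform coded below. The grid sizes and function names are arbitrary.

```python
import math

def bridge_density(x, t):
    """Density of the semicircular law of variance v = t(1 - t)."""
    v = t * (1.0 - t)
    return math.sqrt(max(4.0 * v - x * x, 0.0)) / (2.0 * math.pi * v)

def bridge_hilbert(x, t):
    """Hilbert transform of that law on its support [-2 sqrt(v), 2 sqrt(v)]."""
    return x / (2.0 * t * (1.0 - t))

def worst_ratio(n_t=50, n_x=200):
    """Largest value of (rho^2 + (H mu)^2) * t(1 - t) over a grid of (x, t)."""
    worst = 0.0
    for i in range(1, n_t):
        t = i / n_t
        v = t * (1.0 - t)
        radius = 2.0 * math.sqrt(v)
        for j in range(n_x + 1):
            x = -radius + 2.0 * radius * j / n_x
            value = bridge_density(x, t) ** 2 + bridge_hilbert(x, t) ** 2
            worst = max(worst, value * v)
    return worst

worst = worst_ratio()
print(worst)  # a value not exceeding 1 (up to rounding) is consistent with (2.21)
```

In this example the ratio reaches 1 only at the edge of the support, where the density vanishes and the Hilbert transform is largest.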
Let us mention the regularity properties that (µt )t∈(0,1) will inherit from its free Brownian bridge formula. If νt denotes the law of tX1 + (1 − t)X0 , we have, following Biane [3], Cor. 3, that if we set dνt (x) v(u, t) = inf{v ≥ 0| ≤ (t (1 − t))−1 }, (u − x)2 + v 2 = inf{v ≥ 0|τ ((tX1 + (1 − t)X0 − u)2 + v 2 )−1 ≤ (t (1 − t))−1 }, ψ(u, t) = u + t (1 − t) then
H µt (ψ(u, t)) =
(u − x)dνt (x) , (u − x)2 + v(u, t)2
(u − x)dνt (x) , (u − x)2 + v(u, t)2
while ρt (ψ(u, t)) =
v(u, t) . πt (1 − t)
From these formulae, we observe that $\psi^{-1}$ is analytic in the interior of $\Omega$ since $\psi'$ is bounded below by a positive constant there (see Biane [3], p. 713, and the obvious analyticity in the time parameter $t \in (0, 1)$); it is thus clear that $\rho$ is $C^\infty$ in $\Omega$. Hence, the weak equation (2.18) is verified in the strong sense in $\Omega$ and we find that in the interior of $\Omega$,
\[ u_t(x) = \rho_t(x)^{-1} \int_x^\infty \partial_t \rho_t(y)\, dy \]
is $C^\infty$. At the boundary of $\Omega_t = \{x : (x, t) \in \Omega\}$, Biane ([3], Cor. 5) also noticed that
\[ |\rho_t(x)^2\, \partial_x \rho_t(x)| \le \frac{1}{4\pi^3 t^2 (1-t)^2} \;\Rightarrow\; \rho_t(x) \le \Big( \frac{3}{4\pi^3 t^2 (1-t)^2} \Big)^{\frac13} (x - x_0)^{\frac13} \]
with $x_0$ the nearest point of the boundary of $\Omega_t$ from $x$.
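As a sanity check on the parametrization by $(v, \psi)$ recalled in this proof, the following Python sketch evaluates $v(u,t)$, $\psi(u,t)$, $\rho_t(\psi(u,t))$ and $H\mu_t(\psi(u,t))$ when $\nu_t$ is a finite sum of atoms. The discrete $\nu_t$, the bisection solver and all function names are assumptions made for illustration only. For $\nu_t = \delta_0$ (that is, $X_0 = X_1 = 0$) it should recover the semicircle law of variance $t(1-t)$, for which the bound $|H\mu_t| \le 1/\sqrt{t(1-t)}$ can be read off directly.

```python
import math

def cauchy_sum(u, v, atoms):
    """Integral of d nu(x) / ((u - x)^2 + v^2) for nu = sum of w * delta_x."""
    return sum(w / ((u - x) ** 2 + v ** 2) for x, w in atoms)

def v_of(u, t, atoms, iters=200):
    """v(u, t): smallest v >= 0 with cauchy_sum(u, v) <= 1 / (t (1 - t))."""
    level = 1.0 / (t * (1.0 - t))
    on_atom = any(x == u for x, _ in atoms)
    if not on_atom and cauchy_sum(u, 0.0, atoms) <= level:
        return 0.0
    lo, hi = 0.0, 1.0
    while cauchy_sum(u, hi, atoms) > level:  # enlarge until the constraint holds
        hi *= 2.0
    for _ in range(iters):  # bisection: cauchy_sum is decreasing in v
        mid = 0.5 * (lo + hi)
        if cauchy_sum(u, mid, atoms) <= level:
            hi = mid
        else:
            lo = mid
    return hi

def bridge_point(u, t, atoms):
    """Return (psi(u,t), rho_t(psi(u,t)), H mu_t(psi(u,t)))."""
    s = t * (1.0 - t)
    v = v_of(u, t, atoms)
    h = sum(w * (u - x) / ((u - x) ** 2 + v ** 2) for x, w in atoms)
    return u + s * h, v / (math.pi * s), h

# nu_t = delta_0: compare with the semicircle law of variance t (1 - t).
psi, rho, h = bridge_point(0.1, 0.3, [(0.0, 1.0)])
```

In this example the point $u = 0.1$ is mapped to $\psi = 2u$, and the returned density and Hilbert transform can be compared with the closed forms $\sqrt{4s - x^2}/(2\pi s)$ and $x/(2s)$, $s = t(1-t)$.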
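2.3. The variational problem. We now turn to the analysis of the variational problem defining $J_\beta$; we shall prove that

Property 2.9. Assume that $\mu_0$ is a probability measure such that $\Sigma(\mu_0)$ is finite. Then
a) $J_\beta(\mu_0, \mu_1)$ is infinite unless $\Sigma(\mu_1)$ is finite.
b) The path $\mu^* \in C([0, 1], \mathcal{P}(\mathbb{R}))$ minimizing $S_{\mu_0}$ under the condition $\mu^*_1 = \mu_1$ satisfies:
1) $\mu^*_0 = \mu_0$ and $\mu^*_1 = \mu_1$.
2) For any $t \in (0, 1)$, $\mu^*_t(dx) \ll dx$. Let $(\rho^*_t)_{t\in(0,1)}$ denote the corresponding density. By continuity of $\mu^*$, $\mu^*_t(dx) = \rho^*_t(x)dx$ converges towards $\mu_0$ (resp. $\mu_1$) as $t$ goes to zero (resp. one) in the usual weak sense on $\mathcal{P}(\mathbb{R})$.
3) $\mu^*$ is characterized as the unique continuous measure-valued path such that $\mu^*_0 = \mu_0$ and $\mu^*_1 = \mu_1$ and, for any $\nu \in \{S_{\mu_0} < \infty\}$ so that $(\nu, k)$ satisfies (C) and $\nu_1 = \mu_1$, we have, with $u = \partial_x k + H\nu$,
\[ \int\!\!\int \Big[ 2u^*_t (u_t\, d\nu_t - u^*_t\, d\mu^*_t) - (u^*_t)^2 (d\nu_t - d\mu^*_t) + \pi^2 (\rho^*_t)^2 (d\nu_t - d\mu^*_t) \Big] dt \ge 0. \tag{2.22} \]
4) $(\rho^*, u^*)$ satisfies the Euler equation for isentropic flow described by the equations, for $t \in (0, 1)$,
\[ \partial_t \rho^*_t(x) = -\partial_x\big( \rho^*_t(x) u^*_t(x) \big), \tag{2.23} \]
\[ \partial_t\big( \rho^*_t(x) u^*_t(x) \big) = -\partial_x\Big( \rho^*_t(x) u^*_t(x)^2 - \frac{\pi^2}{3} \rho^*_t(x)^3 \Big) \tag{2.24} \]
in the sense of distributions that for all $f \in C_c^{\infty,\infty}(\mathbb{R} \times [0, 1])$,
\[ \int_0^1\!\!\int \partial_t f(t, x)\, d\mu^*_t(x)\, dt + \int_0^1\!\!\int \partial_x f(t, x)\, u^*_t(x)\, d\mu^*_t(x)\, dt = 0 \]
and, for any $f \in C_c^{\infty,\infty}(\Omega)$ with $\Omega := \{(x, t) \in \mathbb{R} \times [0, 1] : \rho^*_t(x) > 0\}$,
\[ \int \Big[ 2u^*_t(x)\, \partial_t f(x, t) + \big( u^*_t(x)^2 - \pi^2 \rho^*_t(x)^2 \big) \partial_x f(x, t) \Big] dx\, dt = 0. \tag{2.25} \]
Let us now assume that $(\mu_0, \mu_1)$ are compactly supported. Then
5) (2.24) is true everywhere in the interior of $\Omega$.
6) There exists a sequence $(\phi^\epsilon)_{\epsilon>0}$ of functions in $C_b^{1,1}(\mathbb{R} \times [0, 1])$ such that if we set
\[ \rho^\epsilon_t(x) := \pi^{-1} \big( \max\{ \partial_t \phi^\epsilon + 4^{-1} (\partial_x \phi^\epsilon)^2, 0 \} \big)^{\frac12}, \]
then
\[ \int \Big( u^*_t(x) - \frac{\partial_x \phi^\epsilon_t(x)}{2} \Big)^2 d\mu^*_t(x)\, dt + \frac{\pi^2}{3} \int \big( \rho^\epsilon_t(x) - \rho^*_t(x) \big)^2 \big( \rho^*_t(x) + 2\rho^\epsilon_t(x) \big)\, dx\, dt + \pi^2 \int \big| \partial_t \phi^\epsilon + 4^{-1} (\partial_x \phi^\epsilon)^2 - \pi^2 \rho^\epsilon_t(x)^2 \big|\, d\mu^*_t(x)\, dt \le \epsilon. \]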
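The system (2.23)-(2.24) can be checked numerically on one explicit path. The sketch below takes the free Brownian bridge with $X_0 = X_1 = 0$, for which $\mu_t$ is the semicircle law of variance $t(1-t)$ and (2.19) gives $u_t(x) = (1-2t)x/(2t(1-t))$. This example is an assumption chosen for illustration: its endpoints are Dirac masses, so it lies outside the finite-entropy hypothesis of Property 2.9 and is only used to test the equations in the interior of the support. The residuals of both equations are evaluated by central finite differences; the step size and function names are likewise illustrative.

```python
import math

def sigma2(t):
    """Variance t (1 - t) of the semicircular bridge at time t."""
    return t * (1.0 - t)

def rho(x, t):
    """Semicircle density of variance t (1 - t)."""
    s = sigma2(t)
    return math.sqrt(max(4.0 * s - x * x, 0.0)) / (2.0 * math.pi * s)

def u(x, t):
    """Velocity field from (2.19) with X_0 = X_1 = 0."""
    return (1.0 - 2.0 * t) * x / (2.0 * sigma2(t))

def momentum(x, t):
    return rho(x, t) * u(x, t)

def flux(x, t):
    """rho u^2 - (pi^2 / 3) rho^3, the quantity differentiated in (2.24)."""
    r = rho(x, t)
    return r * u(x, t) ** 2 - (math.pi ** 2 / 3.0) * r ** 3

def d_dt(g, x, t, h=1e-5):
    return (g(x, t + h) - g(x, t - h)) / (2.0 * h)

def d_dx(g, x, t, h=1e-5):
    return (g(x + h, t) - g(x - h, t)) / (2.0 * h)

def euler_residuals(x, t):
    """Residuals of (2.23) and (2.24) at an interior point (x, t)."""
    mass = d_dt(rho, x, t) + d_dx(momentum, x, t)
    mom = d_dt(momentum, x, t) + d_dx(flux, x, t)
    return mass, mom

print(euler_residuals(0.2, 0.3))
```

Both residuals should vanish up to discretization error at points with $x^2 < 4t(1-t)$; near the edge of the support the finite differences are no longer reliable.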
Discussion 2.10. Matytsin [21] noticed that if we set $f(x, t) = u^*_t(x) + i\pi\rho^*_t(x)$, then the Euler equation for isentropic flow implies that $f$ solves the Burgers equation. Hence, if one assumes that $f$ can be smoothly extended to the complex plane while still solving the Burgers equation, we find by the usual method of characteristics that for $z \in \mathbb{C}$, $f(f(z, 0)t + z, t) = f(z, 0)$ and therefore, setting $G_+(z) = z + f(z, 0)$ and $G_-(z) = z - f(z, 1)$, we see that our problem boils down to solving
\[ G_+ \circ G_-(z) = G_- \circ G_+(z) = z \tag{2.26} \]
with (G+ )(x) = πρ0 (x) and (G− )(x) = −πρ1 (x) if ρ0 and ρ1 are the densities of µ0 , µ1 respectively. This kind of characterization is in fact reminiscent of the description of minimizers provided by P. Zinn Justin [33]. However, such a result would require more smoothness of (ρ ∗ , u∗ ) than what we proved here. Let us note at this point that since the Euler equation preserves the energy t → (u∗t (x)2 − Hρt∗ (x)2 )ρt∗ (x)dx, we can express I (β) in terms of (µD , µE , ∗0 , ∗1 ) ;
β 1 I (β) (µD , µE ) = − ∗1 dµE − ∗0 dµD + (∂x ∗0 )2 − (H µD )2 dµD 4 2
β − (µE ) + (µD ) − x 2 dµD (x) − x 2 dµE (x) − inf Iβ . 4 (2.27) Since (2.26) furnishes an implicit equation giving ( ∗0 , ∗1 ) in terms of (µD , µE ), it gives, with (2.27), a nice description of I (β) (µD , µE ). Proof of Property 2.9. By Property 2.5, we want to minimize 1 π2 1 (ut (x))2 ρt (x)dxdt + (ρt (x))3 dxdt S(ρ, u) := 3 0 0 under the constraint (C’) : ∂t ρt + ∂x (ut ρt ) = 0,
\[ \lim_{t\downarrow 0} \rho_t(x)dx = \mu_0, \qquad \lim_{t\uparrow 1} \rho_t(x)dx = \mu_1 \]
and when $\rho_t(x)dx \in \mathcal{P}(\mathbb{R})$ for all $t \in [0, 1]$.

To study the variational problem associated with this energy, I know essentially three ways. The first is to make a perturbation with respect to the source. This strategy was followed by D. Serre in [25] but applies only when we know a priori that $(\rho^*, u^*\rho^*)$ are uniformly bounded. Since this case corresponds to the case where $\mu_0, \mu_1$ are compactly supported, we shall consider it in the second part of the proof. One can also use a target type perturbation, which is a standard perturbation on the space of probability measures, viewed as a subspace of the vector space of measures. This method gives (3) in Property 2.9, as we shall see. The last way is to use convex analysis, following for instance Y. Brenier (see [5], Sect. 3.2). We shall also detail these arguments, since they provide some approximation property of the field $u^*$, as described in Property 2.9.6. These three approaches would yield the same results if we knew a priori that the solutions are smooth enough.
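We begin with the target type perturbation. In the following, we denote by $(\rho^*, u^*)$ the minimizer of $S$ under the constraint (C'). Let $(\rho, u) \in \{S < \infty\}$ satisfy the constraint (C'). Then, for any $\alpha \in [0, 1]$, we set, with $m = \rho u$ and $m^* = \rho^* u^*$,
\[ \rho^\alpha = (1-\alpha)\rho^* + \alpha\rho, \qquad m^\alpha := (1-\alpha)(\rho^* u^*) + \alpha(\rho u) =: \rho^\alpha u^\alpha, \qquad u^\alpha = m^\alpha / \rho^\alpha. \]
It is then not hard to check that $S(\rho^\alpha, u^\alpha) < \infty$ for all $\alpha \in [0, 1]$. Moreover, by the convexity of $\phi : \alpha \to (\rho^\alpha_t(x))^{-1} (m^\alpha_t(x))^2 + 3^{-1}\pi^2 (\rho^\alpha_t(x))^3$ for all admissible $(\rho, m)$, we see that $\alpha^{-1}(\phi(\alpha) - \phi(0))$ decreases as $\alpha \to 0$, showing, by the monotone convergence theorem, the existence of $\partial_\alpha S(\rho^\alpha, u^\alpha)(0^+)$ and
\[ \partial_\alpha S(\rho^\alpha, u^\alpha)(0^+) = \int\!\!\int \big[ -(u^*)^2\rho^* - (u^*)^2\rho + 2mu^* + \pi^2(\rho^*)^2(\rho - \rho^*) \big] dx\, dt = \int\!\!\int \big[ 2u^*(m - m^*) - (u^*)^2(\rho - \rho^*) + \pi^2(\rho^*)^2(\rho - \rho^*) \big] dx\, dt. \]
Hence, for any $(\rho, u) \in \{S < \infty\}$, we have
\[ \partial_\alpha S(\rho^\alpha, u^\alpha)(0^+) = \int\!\!\int \big[ 2u^*(m - m^*) - (u^*)^2(\rho - \rho^*) + \pi^2(\rho^*)^2(\rho - \rho^*) \big] dx\, dt \ge 0. \tag{2.28} \]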
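The two facts used in this perturbation argument, the joint convexity of the integrand in $(\rho, m)$ and the formula for the derivative at $\alpha = 0^+$, can be illustrated pointwise. The sketch below works with the integrand $m^2/\rho + \pi^2\rho^3/3$ at a single point $(x,t)$, so that $\rho$, $m$ are plain positive or real numbers; the sample values and function names are assumptions made for illustration.

```python
import math
import random

def phi(rho, m):
    """Pointwise integrand m^2 / rho + (pi^2 / 3) rho^3 of S in (rho, m)."""
    return m * m / rho + (math.pi ** 2 / 3.0) * rho ** 3

def directional_derivative(rho_s, u_s, rho, m):
    """Integrand of (2.28): derivative at alpha = 0+ towards (rho, m)."""
    m_s = rho_s * u_s
    return (2.0 * u_s * (m - m_s)
            - u_s ** 2 * (rho - rho_s)
            + math.pi ** 2 * rho_s ** 2 * (rho - rho_s))

def finite_difference(rho_s, u_s, rho, m, alpha=1e-6):
    """One-sided difference quotient of alpha -> phi(rho^alpha, m^alpha)."""
    m_s = rho_s * u_s
    rho_a = (1.0 - alpha) * rho_s + alpha * rho
    m_a = (1.0 - alpha) * m_s + alpha * m
    return (phi(rho_a, m_a) - phi(rho_s, m_s)) / alpha

def midpoint_convex(trials=1000, seed=0):
    """Check phi(midpoint) <= average of phi on random pairs with rho > 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        r1, r2 = rng.uniform(0.1, 2.0), rng.uniform(0.1, 2.0)
        m1, m2 = rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)
        mid = phi(0.5 * (r1 + r2), 0.5 * (m1 + m2))
        if mid > 0.5 * (phi(r1, m1) + phi(r2, m2)) + 1e-9:
            return False
    return True

print(finite_difference(0.7, -0.4, 1.3, 0.5),
      directional_derivative(0.7, -0.4, 1.3, 0.5))
```

The difference quotient and the closed-form derivative should agree up to an error of order $\alpha$, and convexity gives $\phi(\rho, m) \ge \phi(\rho^*, m^*)$ plus the derivative term, which is the pointwise version of the inequality used right after (2.28).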
Reciprocally, since $S$ is convex in $(\rho, m)$, we know that
\[ S(\rho^\alpha, u^\alpha) \ge S(\rho^*, u^*) + \partial_\alpha S(\rho^\alpha, u^\alpha)(0^+)\,\alpha, \]
so that (2.28) implies that $S(\rho^\alpha, u^\alpha) \ge S(\rho^*, u^*)$ for all $\alpha \in [0, 1]$ and $(\rho, u) \in \{S < \infty\}$. Hence, (2.28) characterizes our unique minimizer, which proves Property 2.9.3. We can apply this result with
\[ \rho = \rho^* + \partial_x \phi, \qquad m = m^* - \partial_t \phi \]
for some $\phi \in C_c^{1,1}(\Omega)$ such that $\partial_x\phi(., 0) = \partial_x\phi(., 1) = 0$, ensuring that $S(\rho, u)$ has finite entropy. This yields the second point of Property 2.9.3. Conditions at the boundary of the support can also be deduced from (2.28), but they are hardly understandable, since the conditions over the potentials $\phi$ such that $S(\rho, u) < \infty$ that do not vanish at the boundary become less transparent.

To prove the last points of our property, which concern the case where $(\mu_0, \mu_1)$ are compactly supported, we follow D. Serre [25] and Y. Brenier [5]. The idea developed in [25] is basically to set $a_t(x) = a(t, x) = (\rho^*_t(x), \rho^*_t(x)u^*_t(x))$ so that $\mathrm{div}(a_t(x)) = 0$ and perturb $a$ by considering a family
\[ a^g = J_g\, (a.\nabla_{x,t} h) \circ g = J_g\, \big( \rho^* (\partial_t h + u^* \partial_x h) \big) \circ g \]
with a $C^\infty$ diffeomorphism $g$ of $Q = [0, 1] \times \mathbb{R}$ with inverse $h = g^{-1}$ and Jacobian $J_g$. Such an approach yields the Euler equation (2.25) of Property 2.9 (use the boundedness of $(\rho^*, u^*)$ obtained in Corollary 2.8 to apply Theorem 2.2 of [25]). Moreover, since we saw in Corollary 2.8.d that $\rho^*$ and $u^*$ are smooth in the interior of $\Omega$, (2.24) results in Property 2.9.5.

We now develop convex analysis for our problem following [5]. By Corollary 2.8.a, we see that there exists a compact $K$ such that $\mu^*_t(K^c) = 0$ for all $t \in [0, 1]$. We set $Q = K \times [0, 1]$ and $E = C_b(Q)^3$.
For any continuous functions (F, G, H ) ∈ E , we set 3 2 |H (x, t)| 2 dxdt α(F, G, H ) = 3π Q if H ≥ 0 and F + ( G2 )2 ≤ 0, on Q, and +∞ otherwise. For any (µ, M, µ) ∈ E , let us consider ∗ µ) = sup F (x, t)µ(dx, dt) + G(x, t)M(dx, dt) α (µ, M, Q
Q + H (x, t) µ(dx, dt) − α(F, G, H ) . Q
It is not hard to see that α ∗ (µ, M, µ) < ∞ iff µ is non negative, M is absolutely continuous w.r.t µ and µ is absolutely continuous w.r.t Lebesgue measure with density in L3 (dxdt). Moreover, if we denote µ(dx, dt) = ρ t (x)dxdt, M(dx, dt) = 2 t (x)3 dxdt. Now, let µ) = u2 (x, t)µ(dx, dt) + π3 ρ ut (x)µ(dx, dt), α ∗ (µ, M, F (x, t)ρt∗ (x)dxdt + G(x, t)u∗t (x)ρt∗ (x)dxdt β(F, G, H ) = Q Q ∗ + H (x, t)ρt (x)dxdt Q
if there exists φ ∈ Cb1,1 (Q) such that F (x, t) + H (x, t) = ∂t φ(x, t),
G(x, t) = ∂x φ(x, t)
for all (x, t) ∈ Q, and is equal to +∞ otherwise. We consider ∗ µ) = sup F (x, t)µ(dx, dt) + G(x, t)M(dx, dt) β (µ, M, Q
Q + H (x, t) µ(dx, dt) − β(F, G, H ) . Q
Then, β ∗ is infinite unless
∗ Q ∂t φ(x, t)(µ(dx, dt)−ρt (x)dxdt)+ Q ∂x φ(x, t)(M(x, t)− 1,1 µ(dx, dt)) = 0 for m∗t (x))dxdt = 0 for all φ ∈ Cb (Q) and H (x, t)(µ(dx, dt) − all H ∈ Cb (Q). µ, µ(dx, dt) = µ(dx, t)dt, ∂t µ + ∂x M = 0 in the sense of Thus, µ = distributions, R µ(dx, t) = 1 for almost all t ∈ [0, 1] and limt↓0 µ(dx, t) = dµ0 (x), limt↑1 µ(dx, t) = dµ1 (x). As a consequence,
µ) + β ∗ (µ, M, µ)} inf{α ∗ (µ, M, = inf{S(ρ, m) : (ρ, m) satisfies (C’) and ρt |K c = 0 ∀t ∈ [0, 1]} = 2 inf{Sµ0 (ν) : ν1 = µ1 } + ((µ0 ) − (µ1 )) := Z(µ0 , µ1 ), where in the last line we have used Property 2.5 and Corollary 2.8.a. Observe that α, β are convex functions with values in ] − ∞, ∞]. Moreover, there is at least one point (F, G, H ) ∈ E, namely F = −1, G = 0, H = 1 for which α is continuous for the uniform topology on E and β finite (this is the reason why we need to
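work on a compact set $K$ instead of $\mathbb{R}$). Thus, following [5], by the Fenchel-Rockafellar duality theorem (see Théorème 1.11 in [6]), we have
\[ \inf\{ \alpha^*(\mu, M, \tilde\mu) + \beta^*(\mu, M, \tilde\mu),\ (\mu, M, \tilde\mu) \in E' \} = \sup\{ -\alpha(F, G, H) - \beta(-F, -G, -H) : (F, G, H) \in E \} \]
and the infimum is achieved. More precisely,
\[ Z(\mu_0, \mu_1) = \sup\Big\{ \int_Q \partial_t \phi_t(x)\, \rho^*_t(x)\, dx\, dt + \int_Q \partial_x \phi_t(x)\, m^*_t(x)\, dx\, dt - \frac{2}{3\pi} \int_Q (H)^{\frac32}\, dx\, dt \Big\}, \]
where the supremum is taken over $\phi \in C_b^{1,1}(Q)$ and $H$ in $C_b(Q)$ such that $H \ge 0$, $\partial_t\phi + (\partial_x\phi/2)^2 \le H$. Optimizing over $H$ yields
\[ Z(\mu_0, \mu_1) = \sup\Big\{ \int_Q \partial_t \phi_t(x)\, \rho^*_t(x)\, dx\, dt + \int_Q \partial_x \phi_t(x)\, m^*_t(x)\, dx\, dt - \frac{2}{3\pi} \int_Q \max\{ \partial_t\phi + (\partial_x\phi/2)^2, 0 \}^{\frac32}\, dx\, dt \Big\}. \]
As a consequence, there exists a sequence of functions $\phi^\epsilon$ in $C_b^{1,1}(Q)$ such that if we set $\pi^2(\rho^\epsilon)^2 = \max\{ \partial_t\phi^\epsilon + 4^{-1}(\partial_x\phi^\epsilon)^2, 0 \}$,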
u∗t (x)2 dµ∗t (x)dt +
π2 3
ρt∗ (x)3 dxdt ≤
∂t φt (x) + u∗t (x)∂x φ dµ∗t (x)dt 2π 2 − ρt (x)3 dxdt + 2 3
for all > 0, which implies
φt (x) 2 ∗ ∗ ut (x) − ∂x dµt (x)dt
≤π
2
2
ρt (x)2 ρt∗ (x)dxdt
−π =−
|∂t φ +
2
π2 3
−π 2
∂x φ 2
−
2
2π 2 3
ρt (x)3 dxdt −
π2 3
ρt∗ (x)3 dxdt
− (π 2 (ρ )2 |ρt∗ (x)dxdt + 2
(ρt∗ (x) − ρt (x))2 (2ρt (x) + ρt∗ (x))dxdt
|∂t φ +
∂x φ 2
2
− π 2 (ρ )2 |ρt∗ (x)dxdt + 2 ,
which completes the proof of the property.
(2.29)
3. Applications to Matrix Integrals

In physics, several matrix integrals have been of interest in the 80's and 90's for their applications to quantum field theory as well as string theory. We refer here to the works of M. Mehta, A. Matytsin, A. Migdal, V. Kazakov, P. Zinn Justin and B. Eynard for instance. Among these integrals, the following are often considered:

• The random Ising model on random graphs described by the Gibbs measure
\[ \mu^N_{Ising}(dA, dB) = \frac{1}{Z^N_{Ising}}\, e^{\frac{N\beta}{2}\mathrm{tr}(AB) - N\mathrm{tr}(P_1(A)) - N\mathrm{tr}(P_2(B))}\, dA\, dB \]
with $Z^N_{Ising}$ the partition function
\[ Z^N_{Ising} = \int e^{\frac{N\beta}{2}\mathrm{tr}(AB) - N\mathrm{tr}(P_1(A)) - N\mathrm{tr}(P_2(B))}\, dA\, dB \]
and two polynomial functions $P_1, P_2$. Again, $\beta = 1$ (resp. $\beta = 2$) if integration holds over $S_N$ (resp. $H_N$). The limiting free energy for this model was calculated by M. Mehta [24] in the case where $P_1(x) = P_2(x) = x^2 + gx^4$ and integration holds over $H_N$. However, the limiting spectral measures of $A$ and $B$ under $\mu^N_{Ising}$ were not considered in that paper. A discussion about this problem can be found in P. Zinn Justin [33].

• One can also define the q-Potts model on random graphs described by the Gibbs measure
\[ \mu^N_{Potts}(dA_0, ..., dA_q) = \frac{1}{Z^N_{Potts}} \prod_{i=1}^q e^{\frac{N\beta}{2}\mathrm{tr}(A_0 A_i) - N\mathrm{tr}(P_i(A_i))}\, dA_i\; e^{-N\mathrm{tr}(P_0(A_0))}\, dA_0. \]
The limiting spectral measures of $(A_0, \cdots, A_q)$ are discussed in [33] when $P_i = gx^3 - x^2$ (!).

• As a straightforward generalization, one can consider matrices coupled by a chain following S. Chadha, G. Mahoux and M. Mehta [9], given by
\[ \mu^N_{chain}(dA_1, ..., dA_q) = \frac{1}{Z^N_{chain}} \prod_{i=2}^q e^{\frac{N\beta}{2}\mathrm{tr}(A_{i-1} A_i) - N\mathrm{tr}(P_i(A_i))}\, dA_i\; e^{-N\mathrm{tr}(P_1(A_1))}\, dA_1. \]
$q$ can eventually go to infinity as in [22].

• Finally, we can mention the so-called induced QCD studied in [21]. It is described, if $\Lambda = [-q, q]^D \subset \mathbb{Z}^D$, by
\[ \mu^N_{QCD}(dA_i, i \in \Lambda) = \frac{1}{Z^N_{QCD}} \prod_{i\in\Lambda} \int e^{\frac{N\beta}{2} \sum_{j=1}^{2D} \mathrm{tr}(U_j A_{i+e_j} U_j^* A_i)} \prod_{j=1}^{2D} dm^\beta_N(U_j) \prod_{i\in\Lambda} e^{-N\mathrm{tr}(P(A_i))}\, dA_i, \]
where $(e_j)_{1\le j\le 2D}$ is a basis of $\mathbb{Z}^D$. The description of the limit behaviour of the spectral measures of $A_1, \cdots, A_q$ is given in [21] in the case $q = \infty$. We impose periodic boundary conditions at the boundary of the lattice points $\Lambda$.
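In this section, we shall study the asymptotic behaviour of the free energy of these models as well as describe the limit behaviour of the spectral measures of the matrices under the corresponding Gibbs measures. The theorem is stated as follows:

Theorem 3.1. Assume that $P_i(x) \ge c_i x^4 + d_i$ with $c_i > 0$ and some finite constants $d_i$. Hereafter, $\beta = 1$ (resp. $\beta = 2$) when $dA$ denotes the Lebesgue measure on $S_N$ (resp. $H_N$). Then,
\[ F_{Ising} = \lim_{N\to\infty} \frac{1}{N^2} \log Z^N_{Ising} = -\inf\Big\{ \mu(P_1) + \nu(P_2) - I^{(\beta)}(\mu, \nu) - \frac{\beta}{2}\Sigma(\mu) - \frac{\beta}{2}\Sigma(\nu) \Big\} - 2 \inf_{\nu\in\mathcal{P}(\mathbb{R})} I_\beta(\nu), \tag{3.1} \]
\[ F_{Potts} = \lim_{N\to\infty} \frac{1}{N^2} \log Z^N_{Potts} = -\inf\Big\{ \sum_{i=0}^q \mu_i(P_i) - \sum_{i=1}^q I^{(\beta)}(\mu_0, \mu_i) - \frac{\beta}{2} \sum_{i=0}^q \Sigma(\mu_i) \Big\} - (q+1) \inf_{\nu\in\mathcal{P}(\mathbb{R})} I_\beta(\nu), \tag{3.2} \]
\[ F_{chain} = \lim_{N\to\infty} \frac{1}{N^2} \log Z^N_{chain} = -\inf\Big\{ \sum_{i=1}^q \mu_i(P_i) - \sum_{i=2}^q I^{(\beta)}(\mu_{i-1}, \mu_i) - \frac{\beta}{2} \sum_{i=1}^q \Sigma(\mu_i) \Big\} - q \inf_{\nu\in\mathcal{P}(\mathbb{R})} I_\beta(\nu), \tag{3.3} \]
\[ F_{QCD} = \lim_{N\to\infty} \frac{1}{N^2} \log Z^N_{QCD} = -\inf\Big\{ \sum_{i\in\Lambda} \mu_i(P) - \sum_{i\in\Lambda} \sum_{j=1}^{2D} I^{(\beta)}(\mu_{i+e_j}, \mu_i) - \frac{\beta}{2} \sum_{i\in\Lambda} \Sigma(\mu_i) \Big\} - 2D|\Lambda| \inf_{\nu\in\mathcal{P}(\mathbb{R})} I_\beta(\nu). \tag{3.4} \]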
Remark 3.2. The above theorem actually extends to polynomial functions going to infinity like $x^2$. However, the case of quadratic polynomials is trivial since it boils down to the Gaussian case, and therefore the next interesting case is a quartic polynomial as above. Moreover, Theorem 3.3 fails in the case where $P$, $Q$ go to infinity only like $x^2$. However, all our proofs would extend easily to functions $P_i$ such that $P_i(x) \ge a|x|^{2+\epsilon} + b$ with some $a > 0$ and $\epsilon > 0$.

Theorem 3.1 will be proved in the next section, but merely boils down to a Laplace (or saddle point) method. We shall then study the variational problems for the above energies. We prove the following for the Ising model.

Theorem 3.3. Assume $P_1(x) \ge ax^4 + b$, $P_2(x) \ge ax^4 + b$ for some positive constant $a$. Then
• The law of $(\hat\mu^N_A, \hat\mu^N_B) \in \mathcal{P}(\mathbb{R})^2$ under $\mu^N_{Ising}$ satisfies a large deviation principle in the scale $N^2$ with good rate function
\[ I_{Ising}(\mu, \nu) = \mu(P_1) + \nu(P_2) - \frac{\beta}{2}\big( \Sigma(\mu) + \Sigma(\nu) \big) - I^{(\beta)}(\mu, \nu) - F_{Ising}. \]
• The infimum of $I_{Ising}$ is achieved at a unique couple $(\mu_A, \mu_B)$ of probability measures. Consequently, $(\hat\mu^N_A, \hat\mu^N_B)$ converges almost surely under $\mu^N_{Ising}$ towards $(\mu_A, \mu_B)$.
• $(\mu_A, \mu_B)$ are compactly supported measures with finite entropy $\Sigma$.
• Let $(\rho^{A\to B}, u^{A\to B})$ be the minimizer of $S_{\mu_A}$ on $\{\nu_1 = \mu_B\}$ as described in Theorem 2.9. Then, $(\mu_A, \mu_B, \rho^{A\to B}, m^{A\to B} = \rho^{A\to B}u^{A\to B})$ is the unique minimizer of the strictly convex energy
\[ L(\mu, \nu, \rho^*, m^*) := \mu\Big( P_1 - \frac{\beta}{4}x^2 \Big) + \nu\Big( P_2 - \frac{\beta}{4}x^2 \Big) - \frac{\beta}{4}\big( \Sigma(\mu) + \Sigma(\nu) \big) + \frac{\beta}{4}\Big( \int_0^1\!\!\int \frac{(m^*_t(x))^2}{\rho^*_t(x)}\, dx\, dt + \frac{\pi^2}{3} \int_0^1\!\!\int \rho^*_t(x)^3\, dx\, dt \Big). \]
Thus, we find that $(\mu_A, \mu_B, \rho^{A\to B}, m^{A\to B})$ are characterized by the property that for any $(\mu, \nu, \rho^*, m^*) \in \{L < \infty\}$,
\[ \int \Big( P_1 - \frac{\beta}{4}x^2 \Big) d(\mu - \mu_A) + \int \Big( P_2 - \frac{\beta}{4}x^2 \Big) d(\nu - \mu_B) - \frac{\beta}{2} \int\!\!\int \log|x-y|\, d\mu_A(y)\, (d\mu - d\mu_A)(x) - \frac{\beta}{2} \int\!\!\int \log|x-y|\, d\mu_B(y)\, (d\nu - d\mu_B)(x) \]
\[ + \frac{\beta}{4} \int\!\!\int \big[ 2u^{A\to B}(m^* - m^{A\to B}) - (u^{A\to B})^2 (\rho^* - \rho^{A\to B}) + \pi^2 (\rho^{A\to B})^2 (\rho^* - \rho^{A\to B}) \big] dx\, dt \ge 0. \]
• $(\rho^{A\to B}, m^{A\to B})$ satisfies the Euler equation for isentropic flow with pressure $p(\rho) = -\frac{\pi^2}{3}\rho^3$ in the strong sense in the interior of $\Omega = \{(x, t) \in \mathbb{R} \times [0, 1] : \rho^{A\to B}_t(x) \ne 0\}$ and satisfies the conclusions of Property 2.9.
• Moreover,
\[ \beta H\mu_A(x) = P_1'(x) - \frac{\beta}{2}x - \frac{\beta}{2}u_0^{A\to B}(x), \quad \mu_A \text{ a.s.}, \]
and
\[ \beta H\mu_B(x) = P_2'(x) - \frac{\beta}{2}x + \frac{\beta}{2}u_1^{A\to B}(x), \quad \mu_B \text{ a.s.} \]

For the other models, we unfortunately lose obvious convexity, and therefore uniqueness of the minimizers in general. We can still prove the following

Theorem 3.4. • For any given $\mu_0$, there exists at most one minimizer $(\mu_1, \cdots, \mu_q)$ in $F_{Potts}$ but uniqueness of $\mu_0$ is unclear in general, except in the case $q = 2$. The critical points in $F_{Potts}$ are compactly supported, with finite entropy $\Sigma$. Let $(\mu_0, \cdots, \mu_q)$ be a critical point and for $i \in \{2, \cdots, q\}$, denote $(\rho^i, u^i)$ the unique
minimizer described in Theorem 2.9 with µi0 (dx) = µ0 (dx) and µi1 (dx) = µi (dx). Then P0 (x)
q qβ β β i = u0 (x) − (q − 3)H µ0 (x) x+ 2 2 2 i=2
µ0 -almost surely and Pi (x) =
β β β x − ui1 (x) − H µi (x), 2 2 2
1≤i≤q
µi -almost surely. • There exists at most one minimizer in FChain . The minimizer (µ1 , · · · , µq ) is compactly supported with finite entropy . It is such that if we denote by (ρ i , ui ) the minimizer described in Theorem 2.9 with µi0 (dx) = µi−1 (dx) and µi1 (dx) = µi (dx), we have P1 (x) =
β β β β x + u20 − H µ1 (x) and Pi (x) = 2x − (ui1 − ui+1 0 ), 2 2 2 2
2 ≤ i ≤ q,
µ1 -almost surely and µi -almost surely respectively. • Again, uniqueness of the critical points in FQCD is unclear in general, except in the case D = 1 where uniqueness holds. In this case, the minimizer µi is symmetric, yielding µi = µ for all i ∈ and the unique path (ρ , u) described in Theorem 2.9 with boundary data (µ, µ), satisfies u∗0 (x) = −u∗1 (x) and P (x) − βx − βu∗0 (x) = 0 µ a.s. 3.1. Proof of Theorem 3.1. The proof of Theorem 3.1 follows a standard Laplace method. We shall only detail it in the Ising model case, the generalization to the other models being straightforward. Let P , Q be two polynomial functions and define, for N ∈ N, N (P , Q) ∈ R∪{+∞} by Nβ β N (P , Q) = exp{−N tr(P (A)) − Ntr(Q(B)) + tr(AB)}dAdB, 2 where the integration holds over orthogonal (resp. Hermitian) matrices if β = 1 (resp. β = 2). We claim that Lemma 3.5. Assume that there exists a, c ∈ R+∗ , and b, d ∈ R such that P (x) ≥ ax 4 + b and Q(x) ≥ cx 4 + d,
for all x ∈ R.
Then, we have lim
N→∞
1 log N (P , Q) = N2
sup
{−µ(P ) − ν(Q) + I (β) (µ, ν)
µ,ν∈P (R)
β + ((µ) + (ν))} − 2 inf Iβ (ν). 2 ν∈P (R)
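The Laplace principle behind Lemma 3.5 can be illustrated in a scalar toy case: for a confining potential $V$, $N^{-1} \log \int e^{-NV(x)}dx$ converges to $-\inf V$. The sketch below is only an analogue at speed $N$ rather than $N^2$, with no matrices involved; the quartic potential $V(x) = x^4 - x^2$ (whose minimum is $-1/4$), the integration window and the function names are assumptions chosen for illustration.

```python
import math

def scaled_log_integral(V, N, a=-3.0, b=3.0, n=20000):
    """(1/N) log of a trapezoidal approximation of int_a^b exp(-N V(x)) dx.

    The largest exponent is factored out before exponentiating so that
    the sum neither overflows nor underflows to zero as a whole.
    """
    h = (b - a) / n
    exponents = [-N * V(a + k * h) for k in range(n + 1)]
    top = max(exponents)
    total = sum(math.exp(e - top) for e in exponents)
    total -= 0.5 * (math.exp(exponents[0] - top) + math.exp(exponents[-1] - top))
    return (top + math.log(total * h)) / N

def quartic(x):
    """Hypothetical confining potential with inf V = -1/4 at x^2 = 1/2."""
    return x ** 4 - x ** 2

for N in (200, 2000):
    print(N, scaled_log_integral(quartic, N))
```

The printed values should approach $-\inf V = 1/4$ as $N$ grows, with a correction of order $N^{-1}\log N$ coming from the Gaussian fluctuations around the two minima.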
Remark here that the result could be extended to P (x) ≥ ax 2 +b and Q(x) ≥ cx 2 +d with ac > 1 but that the Gaussian case being uninteresting, we shall use the above and a simpler hypothesis. Proof. Observe that for any > 0, ! ! ! ! B A !trN (AB) − trN ! ! 1 + A2 1 + B 2 ! ! ! ! ! ! ! ! ! B3 A3 A ! ! ! ! ≤ !trN B ! + !trN 2 2 2 1 + A 1 + A 1 + B ! 21 1 A6 2 2 ≤ trN tr B N (1 + A2 )2 21 1 B6 2 2 tr + trN A N (1 + B 2 )2 1 1 1 1 √ 4 2 2 2 4 2 2 2 trN (B ) + trN (B ) trN A ≤ trN (A ) √ ≤ trN (A4 ) + trN (B 4 ) + trN (A2 ) + trN (B 2 ) . Therefore, if we set µN I sing (dA, dB) =
1 β N (P , Q)
exp{−Ntr(P (A)) − N tr(Q(B)) +
Nβ tr(AB)}dAdB 2
and ! ! A B ! 1 ! exp{−N tr(P (A)) − N tr(Q(B)) + Nβ 2 tr( 1+ A2 1+ B 2 )}dAdB ! ! N ( ) := ! 2 log !, β !N ! N (P , Q) ! ! ! 1 ! B Nβ Nβ A = !! 2 log µN )− tr( tr(AB)} !! , I sing exp{ 2 2 N 2 1 + A 1 + B 2 we get √ √ 1 N 4 2 4 2 exp{ log µ N tr(A + A ) + N tr(B + B )} I sing N2 √ √ 1 N 4 2 4 2 ≤ log µ N tr(A + A ) + q N tr(B + B )} , exp{q I sing qN 2
N ( ) ≤
where we used Jensen’s inequality with √ q > 1. Now, under our hypothesis, and since 2|AB| ≤ A2 + B 2 , it is clear that if q is chosen small enough (e.g smaller than a ∧ c), 1√ and obtain the above right-hand side is bounded uniformly. Hence, we take q = 2a∧c √ lim sup N ( ) ≤ C N→∞
(3.5)
with a finite constant C. Moreover, for any > 0, we can use saddle point method (see [1] for a full rigorous derivation) and Theorem 1.1 of [16] to obtain 1 B Nβ A lim dAdB log exp − N tr(P (A)) − N tr(Q(B)) − tr N →∞ N 2 2 1 + A2 1 + B 2 =
sup
{−µ(P ) − ν(Q) + I (β) (µ ◦ φ −1 , ν ◦ φ −1 )
µ,ν∈P (R)
β + ((µ) + (ν))} − 2 inf Iβ 2 with φ (x) = (1 + x 2 )−1 x and µ ◦ φ −1 (f ) = µ(f ◦ φ ). Thus, (3.5) results with lim
N →∞
1 log N (P , Q) = lim sup {−µ(P ) − ν(Q) + I (β) (µ ◦ φ −1 , ν ◦ φ −1 ) →0 µ,ν∈P (R) N2 β + ((µ) + (ν))} − 2 inf Iβ . 2
Moreover, we can prove as for (3.5) that for any µ, ν such that µ(x 4 ) ≤ M and ν(x 4 ) ≤ M, √ |I (β) (µ ◦ φ −1 , ν ◦ φ −1 ) − I (β) (µ, ν)| ≤ C(M) . Using the a priori bounds |I (β) (µ, ν)| ≤
1 (µ(x 2 ) + ν(x 2 )), and (µ) + (ν) ≤ C(µ(x 2 ) + ν(x 2 ) + 1), 2
for some finite constant C, we see that the supremum above is taken at µ, ν such that µ(x 4 ) and ν(x 4 ) are bounded by some finite constant depending only on P , Q. Hence, we can take the limit going to zero above and conclude. 3.2. Proof of Theorem 3.3 and 3.4. 3.2.1. The Ising model. Let us recall that
β β (β) FI sing + 2 inf Iβ (ν) = − inf µ(P1 ) + ν(P2 ) − I (µ, ν) − (µ) − (ν) . 2 2 ν∈P (R) Observe that since I (β) (µ, ν) ≤ 2−1 µ(x 2 ) + 2−1 ν(x 2 ), the minimizer (µA , µB ) in the above right-hand side is such that β β β β µA P1 − x 2 + µB P2 − x 2 − (µA ) − (µB ) 4 4 2 2 ≤ −FI sing − 2 inf Iβ (ν) < ∞. ν∈P (R)
Hence, since P1 − 4−1 βx 2 and P2 − 4−1 βx 2 are bounded below under our hypotheses (for well chosen a), we conclude that (µA ) and (µB ) are bounded below and hence
finite. For a later purpose, remark also that if $2n_1$ (resp. $2n_2$) is the degree of $P_1$ (resp. $P_2$) for $n_1, n_2 \ge 2$, we also see that
\[ \mu_A(x^{2n_1}) < \infty, \qquad \mu_B(x^{2n_2}) < \infty. \tag{3.6} \]
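Thus, we can use Property 2.5 to get
\[ F_{Ising} + 2 \inf_{\nu\in\mathcal{P}(\mathbb{R})} I_\beta(\nu) = -\inf\Big\{ \mu\Big(P_1 - \frac{\beta}{4}x^2\Big) + \nu\Big(P_2 - \frac{\beta}{4}x^2\Big) - \frac{\beta}{4}\big( \Sigma(\mu) + \Sigma(\nu) \big) + \frac{\beta}{4} \inf_{(u^*, \mu^*) \in (C)_{\mu,\nu}} \Big[ \int_0^1\!\!\int u^*_t(x)^2\, d\mu^*_t(x)\, dt + \int_0^1\!\!\int \big(H\mu^*_t(x)\big)^2 d\mu^*_t(x)\, dt \Big] \Big\} \]
\[ = -\inf_{\substack{\mu,\nu\in\mathcal{P}(\mathbb{R}) \\ (u^*,\mu^*)\in(C)_{\mu,\nu}}} \Big\{ \mu\Big(P_1 - \frac{\beta}{4}x^2\Big) + \nu\Big(P_2 - \frac{\beta}{4}x^2\Big) - \frac{\beta}{4}\big( \Sigma(\mu) + \Sigma(\nu) \big) + \frac{\beta}{4} \Big[ \int_0^1\!\!\int u^*_t(x)^2\, d\mu^*_t(x)\, dt + \frac{\pi^2}{3} \int_0^1\!\!\int \rho^*_t(x)^3\, dx\, dt \Big] \Big\} \]
\[ := -\inf_{\substack{\mu,\nu\in\mathcal{P}(\mathbb{R}) \\ (u^*,\mu^*)\in(C)_{\mu,\nu}}} L(\mu, \nu, \mu^*, u^*), \]
where $(u^*, \mu^*) \in (C)_{\mu,\nu}$ means that, in the sense of distributions,
\[ \partial_t \rho^*_t + \partial_x(\rho^*_t u^*_t) = 0, \qquad \lim_{t\downarrow 0} \mu^*_t(dx) = \mu, \qquad \lim_{t\uparrow 1} \mu^*_t(dx) = \nu, \]
and we have used in the last line that when the above infimum is finite, $\mu^*_t$ is absolutely continuous with respect to Lebesgue measure for almost all $t \in [0, 1]$ and with density $\rho^* \in L^3(dx\, dt)$ (see Lemma 2.4). Observe that if we write $L(\mu, \nu, \mu^*, u^*) = L(\mu, \nu, \rho^*, m^*)$ with $m^* = \rho^* u^*$, $L$ is a strictly convex function of $(\mu, \nu, \rho^*, m^*)$ (recall that $-\Sigma$ is convex, see [1] for instance) and that the constraint $(C)_{\mu,\nu}$ is linear in the variables $(\mu, \nu, \rho^*, m^*)$. Therefore, the above minimum is achieved at a unique point $(\mu_A, \mu_B, \mu^{A\to B}, m^{A\to B})$.

We now perform a measure type perturbation to characterize the infimum. Take $(\mu, \nu, \rho^*, m^*) \in \{L < \infty\}$ and set, for $\alpha \in [0, 1]$,
\[ (\mu^\alpha, \nu^\alpha, \rho^\alpha, m^\alpha) = \alpha(\mu, \nu, \rho^*, m^*) + (1-\alpha)(\mu_A, \mu_B, \rho^{A\to B}, m^{A\to B}). \]
Then, we find that we must have
\[ \int \Big( P_1 - \frac{\beta}{4}x^2 \Big) d(\mu - \mu_A) + \int \Big( P_2 - \frac{\beta}{4}x^2 \Big) d(\nu - \mu_B) - \frac{\beta}{2} \int\!\!\int \log|x-y|\, d\mu_A(y)\, (d\mu - d\mu_A)(x) - \frac{\beta}{2} \int\!\!\int \log|x-y|\, d\mu_B(y)\, (d\nu - d\mu_B)(x) \]
\[ + \frac{\beta}{4} \int\!\!\int \big[ 2u^{A\to B}(m^* - m^{A\to B}) - (u^{A\to B})^2 (\rho^* - \rho^{A\to B}) + \pi^2 (\rho^{A\to B})^2 (\rho^* - \rho^{A\to B}) \big] dx\, dt \ge 0. \tag{3.7} \]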
Taking µ = µA and ν = µB , we see that (ρ A→B , uA→B ) must satisfy Property 2.9. Now, if µ(dx) = µA (dx) + ∂x φ0 (x)dx, ν(dx) = µB (dx) + ∂x φ1 (x)dx and m∗ = mA→B − ∂t φ, ρt∗ = ρ A→B + ∂x φ with φ ∈ Cb∞,∞ (R × [0, 1]) such that (mA→B + ∂t φ)2 (∂t φ)2 t dxdt < ∞, dxdt < ∞, (3.8) A→B + ∂ φ x ρ A→B =0 ρ ρ A→B =0 ∂x φ we obtain by (3.7) β 2 β 2 P1 − x ∂x φ0 (x)dx + P2 − x ∂x φ1 (x)dx 4 4 β β − log |x − y|dµA (y)∂x φ0 (x)dx − log |x − y|dµB (y)∂x φ1 (x)dx 2 2 β + (3.9) [−2uA→B ∂t φ − (uA→B )2 ∂x φ + π 2 (ρ A→B )2 ∂x φ]dxdt ≥ 0 4 which becomes an equality if φ is supported in = {(x, t) ∈ R×[0, 1] : ρtA→B (x) = 0} by symmetry. If we assume that uA→B is sufficiently smooth, in particular continuously differentiable with respect to the time variable around t = 0 and t = 1, we can use integration by parts to see that, if ∂x A→B = uA→B , t t [2uA→B ∂t φ − (uA→B )2 ∂x φ + π 2 (ρ A→B )2 ∂x φ]dxdt = 2[ A→B ∂x φt dx]10 , t yielding that there exists two constants l1 , l2 such that β β β P1 (x) − x 2 − log |x − y|dµA (y) − A→B (x) = l1 4 2 2 0 β β β P2 (x) − x 2 − (x) = l2 log |x − y|dµB (y) + A→B 4 2 2 1 β β β P1 (x) − x 2 − log |x − y|dµA (y) − A→B (x) ≥ l1 4 2 2 0 β β β P2 (x) − x 2 − log |x − y|dµB (y) + A→B (x) ≥ l2 4 2 2 1
µA
a.s,
(3.10)
µB
a.s,
(3.11)
if x ∈ supp(µA )c , if x ∈ supp(µB )c .
Such a result would generalize the usual equations obtained in the one matrix case. However, since we could not prove such a regularity property of (ρ A→B , uA→B ), we shall now obtain a Schwinger-Dyson type formula following [8], Theorem 2.15 and Proposition 2.17, to obtain a weak form of (3.10),(3.11). Let us briefly recall the idea which is based on an infinitesimal change of variables. If, in ZINsing , we change A → A+N −1 h(A, B) with some smooth bounded functions h of two non-commutative variables (take for instance h belonging to the set CCst (C) of Stieljes functionals defined in [7, 8](see also its definition in Appendix 4.1), it turns out that, due to the Kadison-Fuglede determinant formula (see [8], the proof of Theorem 2.15 and Proposition 2.17) ZINsing β βN β = etr(h(A,B)(−P1 (A)+ 2 B))+ 2 tr⊗tr(DA h(A,B))+O(1)−N tr(P1 (A)+P2 (B)− 2 AB) dAdB
with DA the non-commutative derivation with respect to A given by DA (hg) = DA h×1⊗g+h⊗1×DA g,
∀h, g ∈ CCst (C), DA B = 0, DA A = 1⊗1.
Thus, if we set XN (h) =
β (N) (N) (N) ⊗ µˆ A,B (DA h) + µˆ A,B µˆ 2 A,B
−P1 (A) +
β B h(A, B) 2
(N)
with µˆ A,B the empirical distribution of A, B defined by (N)
µˆ A,B (h) = trN (h(A, B)), we have proved that
∀h ∈ CCst (C),
C(h) eNXN (h) dµN I sing ≤ e
for some finite constant C(h). Changing h into −h we deduce that C(h) eN|XN (h)| dµN I sing ≤ 2e and therefore, by Chebyshev’s inequality 2 (N) (N) (N) µN | µ ˆ − P ⊗ µ ˆ (D h(A, B)) + µ ˆ (A) + B h(A, B) | ≤ A I sing β 1 − N+C(h) ≥ 1 − 2e . (3.12) Of course, the same type of formula holds when A is replaced by B. It is not hard to see that µˆ (N) is tight under µN I sing for the topology described in [8], corresponding to the CCst (C)-weak topology (see [8] for proof of similar statements). Let τ be a limit point. Taking, for > 0 and δ > 0, h(A, B) = (1 + δA2 )−p j (A)(1 + B 2 )−1 with j (x) = 1≤i≤n (zi − x)−1 for some zi ∈ C\R and n ∈ N, and p large enough (p larger than half the degree of P1 ) so that DA h(A, B) ∈ CCst (C) ⊗ CCst (C) and (1 + δA2 )−p (P1 (A) − B)(1 + B 2 )−1 j (A) ∈ CCst (C), we deduce from (3.12) that τ must satisfy for any , δ > 0 and p large enough, 2 −p 2 −1 τ ⊗ τ (D A (1 + δA ) j (A) × 1 ⊗ (1 + B ) ) 2 =τ P1 (A) − B (1 + δA2 )−p j (A)(1 + B 2 )−1 . β
(3.13)
Similarly for any , δ > 0, and p large enough, 2 −p 2 −1 τ ⊗ τ (D B (1 + δB ) j (B) × 1 ⊗ (1 + B ) ) 2 =τ P2 (B) − A (1 + δA2 )−p j (B)(1 + A2 )−1 . β
(3.14)
Now, by (3.6), P1 (A) − 2−1 βB and P2 (B) − 2−1 βA belongs to L1 (τ ) so that we can let δ, going to zero to conclude by the dominated convergence theorem that 2 τ ⊗ τ (DA j (A)) = τ (( P1 (A) − B)j (A)), β
τ ⊗ τ (DB j (B)) = τ
2 P2 (B) − A j (B) . β
(3.15)
We next show that (3.15) implies that µA and µB are compactly supported when n1 ≥ 2 2, and n first that all their moments are finite. To this end, taking j (x) =
and n2 ≥ (1 + x 2 )−1 x ∈ CCst (C) for n ∈ N, we get n n 2 2 −1 = τ τ (B|A) (1 + A2 )−1 A µA P1 (x) (1 + x ) x β +τ ⊗ τ (DA j (A)) (3.16) with, since Df can be represented in the tensor product space as Df (x, y) = (x − y)−1 (f (x) − f (y)), τ ⊗ τ (DA j (A)) =
n−1
µA ((1 + x 2 )−1 x)p µA ((1 + x 2 )−1 x)n−1−p
p=0
−
n−1
µA ((1 + x 2 )−1 x)p+1 µA ((1 + x 2 )−1 x)n−p .
p=0
When n is odd, it is not hard to see that we can find c > 0, dn ∈ R such that P (x)x n ≥ cx 2n1 −1+n − dn , so that we deduce from (3.16) that 2 x x 2n1 −1+n p cµA | ≤ d ( | + 2n sup µ ) n A 1 + x 2 1 + x 2 p≤n 1 q 1 x nq +µA | | µB (|x|p ) p , (3.17) 2 1 + x where we have used in the last line H¨older’s inequality with conjugate exponents p, q. We take q = n−1 (2n1 − 1 + n), p = (2n1 − 1)−1 (2n1 − 1 + n). Similarly, we obtain for µB , and q = n−1 (2n2 − 1 + n), p = (2n2 − 1)−1 (2n2 − 1 + n), 2 x x 2n2 −1+n p cµB | ≤ d ( | + 2n sup µ ) n B 1 + x 2 1 + x 2 p≤n 1 q 1 x nq +µB | | µA (|x|p ) p . (3.18) 2 1 + x Now, by (3.6), (3.17), (3.18) yield µA (x 2n1 −1+n ) = sup µA ((1 + x 2 )−1 x)2n1 −1+n < ∞ ≥0
µB (x 2n2 −1+k ) = sup µB ≥0
for n such that 2n1 − 1 + n ≤ 2n2 (2n1 − 1) ((1 + x 2 )−1 x)2n2 −1+k < ∞
for k such that 2n2 − 1 + k ≤ 2n1 (2n2 − 1).
Replacing n1 (resp. n2 ) by n2 (2n1 − 1) (resp. n1 (2n2 − 1)) and proceeding by induction shows that µA , µB have finite moments of any order since 2n1 − 1 > 1 and 2n1 − 1 > 1.
As a consequence, we can extend by the dominated convergence theorem (3.16) to polynomial functions (i.e. take = 0) resulting with µA
2 P (x)x n β 1
n−1
= τ τ (B|A)An + µA x p µA x n−1−p ,
(3.19)
p=0
and a similar equation for the moments of µB . Let us write 2β −1 P1 (x) = α1 x 2n1 −1 + 2n1 2n2 2n1 −p , 2β −1 P (x) = β x 2n2 −1 + 2n2 −p with α > 0, β > 0. 1 1 1 2 p=2 αp x p=2 βp x Setting an = |µA (x n )| and bn = |µB (x n )|, we deduce that α1 a2n1 −1+n ≤
2n1
|αp |a2n1 −p+n +
p=2
β1 b2n1 −1+n ≤
2n2
n−1
1
1
q ap an−1−p + aqn bpp ,
(3.20)
p=0
|βp |b2n1 −p+n +
p=2
n−1
1
1
q bp bn−1−p + bqn app
(3.21)
p=0
with conjugate exponents $(p, q)$ to be chosen later. Now, we make the induction hypothesis that for some $R \in \mathbb{R}^+$, for some $m \in \mathbb{N}$,
\[ a_p \le R^p C_p, \qquad b_p \le R^p C_p, \qquad \text{for } p \le m, \]
with $C_p$ the Catalan numbers given by
\[ C_p = \sum_{n=0}^{p-1} C_n C_{p-1-n}, \qquad C_0 = 1. \]
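The induction relies on two elementary properties of the Catalan numbers: they are non-decreasing in $p$, and they grow at most like $4^p$, with $p^{-1}\log C_p \to \log 4$. A minimal Python sketch of the recursion above, with an illustrative function name, makes both easy to inspect:

```python
import math

def catalan_numbers(m):
    """Return [C_0, ..., C_m] from C_p = sum_{n<p} C_n C_{p-1-n}, C_0 = 1."""
    c = [1]
    for p in range(1, m + 1):
        c.append(sum(c[n] * c[p - 1 - n] for n in range(p)))
    return c

c = catalan_numbers(200)
# Exponential growth rate: log(C_p) / p increases slowly towards log 4.
for p in (10, 50, 200):
    print(p, math.log(c[p]) / p, math.log(4.0))
```

Python integers are exact, so the recursion can be run for a few hundred terms without overflow; the values can also be compared with the closed form $C_p = \binom{2p}{p}/(p+1)$.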
Of course, up to taking R big enough, we can always assume that m ≥ 2n1 ∨ n2 . Now, plugging this hypothesis into (3.20),(3.21) with m + 1 = 2n1 − 1 + n and q = mn−1 , we obtain α1 a2n1 −1+n ≤
2n1
m |αp |R 2n1 −p+n C2n1 −p+n + R n Cn + R n+1 (Cm ) m (C[ m−n ]+1 ) n
p=2
≤ Cm+1 R m+1
2n1
m−n m
|αp |R −p + R n−2−m + R n−m ,
p=2
where we have used that Cm increases with m. Thus, our induction hypothesis is verified as soon as 2n1
|αp |R −p + R −2n1 + R 2(1−n1 ) ≤ α1 ,
p=2 2n2 p=2
|βp |R −p + R −2n1 + R 2(1−n2 ) ≤ β1 ,
which is clearly the case for $R$ large enough since we assumed $n_1 \wedge n_2 \ge 1$. Since $m^{-1} \log C_m$ goes to $\log 4$ as $m$ goes to infinity, we deduce that
\[ \limsup_{m\to\infty} \frac{1}{2m} \log \mu_A(x^{2m}) \le \log R + \log 4, \qquad \limsup_{m\to\infty} \frac{1}{2m} \log \mu_B(x^{2m}) \le \log R + \log 4, \]
implying that $\mu_A$ and $\mu_B$ are supported in $[-4R, 4R]$ for $R$ finite satisfying the above induction hypothesis (plus the condition imposed by the first $2n_1 \vee n_2$ moments).

Let us now go back to (3.15) and notice that since the Stieltjes functions are dense in $C_c(\mathbb{R})$ and $P_1 - \frac{\beta}{2}\tau(A|B)$ belongs to $L^1(\tau)$, it can be extended to $j \in C_b^1(\mathbb{R})$:
j (x) − j (y) dµA (x)dµA (y) = τ x−y
2 P (x) − τ (B|A) j (x) . β
(3.22)
Since (µA , µB ) are compactly supported, we can use the conclusions of Sect. 2.2.2. We see that µA→B -almost surely, t uA→B = τ (B − A|Xt ) + (1 − 2t)H µA→B (x) t t so that uA→B = τ (B|A) − x + H µA 0 at least in the sense of distribution as in (3.22). Thus, by uniqueness of the solutions to the Euler equation given the initial and final data (µA , µB ) proved in Property 2.6, we conclude that 2 H µA (x) = P1 (x) − x − uA→B (x) 0 β in the sense of distribution that 1 h(x) − h(y) 2 dµA (x)dµA (y) = P1 (x) − x − uA→B (x) h(x)dµA (x) 0 2 x−y β (3.23) for all h ∈ Cb1 (R). We now show that this weak equality in fact yields almost equality. Indeed, taking h = P ∗ g with P the Cauchy law with parameter , one obtains from the weak equality 2 H (P ∗ µA )(x)g(x)dP ∗ µA (x) = τ (A) g(A + C ) . P1 (A) − A − uA→B 0 β Therefore, for any bounded measurable function g, if we set MA := supx∈supp(µA ) |2β −1 P1 (x) − x − uA→B (x)|, 0 ! ! ! ! ! H (P ∗ µA )(x)g(x)dP ∗ µA (x)! ≤ MA |g(x)|dP ∗ µA (x) ! ! from which we deduce that, since P ∗ µA dx with a non-zero density everywhere on R, for all > 0, |H (P ∗ µA )(x)| ≤ MA a.s.
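Consequently, for any $\varepsilon > 0$,
\[
\int \Big(\frac{dP_\varepsilon * \mu_A}{dx}\Big)^3(x)\,dx = \frac{3}{\pi^2}\int H(P_\varepsilon * \mu_A)(x)^2\, dP_\varepsilon * \mu_A(x) \le \frac{3}{\pi^2} M_A^2 .
\]
As a consequence, we claim that $\mu_A \ll dx$ and
\[
\int \Big(\frac{d\mu_A}{dx}\Big)^3(x)\,dx \le \frac{3}{\pi^2} M_A^2 .
\]
Indeed, if $f$ is a Lipschitz function with Lipschitz constant $|f|_L$, we know that
\[
\Big|\int f(x)\,d\mu_A(x)\Big| \le \varepsilon |f|_L + \Big|\int f(x)\,dP_\varepsilon * \mu_A(x)\Big| \le \varepsilon |f|_L + \Big(\frac{3}{\pi^2} M_A^2\Big)^{\frac13} \Big(\int dx\, |f(x)|^{\frac32}\Big)^{\frac23} .
\]
We can now let $\varepsilon$ go to zero to conclude that
\[
\Big|\int f(x)\,d\mu_A(x)\Big| \le \Big(\frac{3}{\pi^2} M_A^2\Big)^{\frac13} \Big(\int dx\, |f(x)|^{\frac32}\Big)^{\frac23},
\]
which proves the claim. As a consequence, by Tricomi [27], (3.23) gives, for all $h \in L^\infty(d\mu_A)$,
\[
\int \Big(\frac{2}{\beta}P_1'(x) - x - u_0^{A\to B}(x) - H(\mu_A)(x)\Big)\, h(x)\,d\mu_A(x) = 0,
\]
and hence the $\mu_A$ almost sure equality. The second equation is derived similarly and one finds that
\[
H\mu_B(x) = \frac{2}{\beta}P_2'(x) - x + u_1^{A\to B}(x)
\]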
µB almost surely. Note also that by Property 2.9, the fact that (µA , µB ) are compactly supported implies that (ρ A→B , uA→B ) satisfies the isentropic Euler equation in the strong sense in . 3.3. q-Potts model. In this case, we find that q q q β (β) FP otts = − inf µi (Pi )− I (µ0 , µi )− (µi ) −(q +1) inf Iβ (ν) 2 ν∈P (R) i=0 i=1 i=0 q q β qβ βx 2 + µi P i − inf = − inf µ0 P0 − x 2 + (u∗ ,µ∗ )∈(C)µ1 ,µi 4 4 4 i=1 i=1 (3.24)
1 1 (u∗t (x))2 dµ∗t (x)dt + (H µ∗t (x))2 dµ∗t (x)dt × 0 0 q β β − (µi ) + (q − 2)(µ0 ) − q inf Iβ (ν). 4 4 ν∈P (R) i=1
When q > 2, the above functional is not anymore clearly convex in µ0 since (µ0 ) is concave. Hence, the uniqueness of the minimizers is now unclear. Note however that the Euler-Lagrange term may still contain sufficient convexity in µ1 to insure uniqueness but simply that the above formula does not show it. In the case q = 2, the functional is still convex, and strictly convex in the arguments (µ∗ , m∗ ). Therefore, uniqueness of the minimizers still holds since if (µ, ν, µ∗ , u∗ ) and ( µ, ν, µ∗ , u∗ ), we would still find ∗ ∗ ∗ ∗ that by convexity µ = µ and therefore µ = µ0 = µ0 = µ, ν = µ∗1 = µ∗1 = ν. The above formula already shows that the critical points satisfy µi (Pi ) < ∞ and have finite entropy . We can also obtain the Schwinger-Dyson equations in this case and deduce as for the Ising model that the critical points are compactly supported and satisfy the equations of Theorem 3.4. 3.4. Chain model. In this case, q q q β (β) Fchain = − inf µi (Pi ) − I (µi−1 , µi ) − (µi ) 2 i=1 i=2 i=1 (3.25) −q inf Iβ (ν) ν∈P (R) q q β β 2 β 2 = − inf µ1 P1 − x + µi Pi − x + inf (u∗ ,µ∗ )∈(C)µi ,µi+1 4 4 4 i=2 i=2 1
1 (u∗t (x))2 dµ∗t (x)dt + (H µ∗t (x))2 dµ∗t (x)dt × 0 0
β − (µ1 ) − q inf Iβ (ν). (3.26) 4 ν∈P (R) Here, we still have convexity and strict convexity on the term coming from I (β) . Hence, uniqueness of the minimizers holds. Again, we can prove the conclusions of Theorem 3.4 as for the Ising model. 3.5. Induced QCD model. q 2D β FQCD = − inf µi (P ) − I (β) (µi+ej , µi ) − (µi ) 2 i∈ j =1
i=1
i∈
−2D|| inf Iβ (ν) ν∈P (R) q β β = − inf µi (P − Dx 2 ) − (1 − D) (µi ) 2 2 i=1
+
i∈
2D i∈ j =1
1
× 0
inf
(ui,µ ,µi,µ )∈(C)µi ,µi+µ i,µ
i,µ
(ut (x))2 dµt (x)dt +
−2D|| inf Iβ (ν). ν∈P (R)
1
0
i,µ
i,µ
(H µt (x))2 dµt (x)dt
Again, obvious convexity disappears and uniqueness of the minimizers becomes unclear when $D > 1$. Uniqueness of the minimizers still holds when $D = 1$. Then, clearly $\mu_i = \mu$ for all $i \in \Lambda$ and $u_0^* = -u_1^*$ at the minimizing path, with $(\rho^*, u^*)$ the solution of the Euler equation with boundary data $(\mu,\mu)$. The measure $\mu$ then satisfies $P'(x) - \beta x - \beta u_0^*(x) = 0$ in the sense of distributions in $\mathrm{supp}(\mu)$, which corresponds to the result obtained by Matytsin ([21], (4.3)) when $\beta = 2$. Actually, since we can prove as for the Ising model that $\mu$ is compactly supported, it turns out that $P'(x) - \beta x - \beta u_0^*(x)$ is in every $L^p(d\mu)$ and therefore that $P'(x) - \beta x - \beta u_0^*(x) = 0$ almost everywhere in the support of $\mu$.

4. Appendix

4.1. Free Brownian motion description of the minimizers. Let us return to the probability aspect of the story. In fact, by definition, if $X_t^N = X_0^N + H_t^N$ with a Hermitian (if $\beta = 2$, otherwise symmetric if $\beta = 1$) matrix $X_0^N$ with spectral measure $\hat\mu_0^N$ and a Hermitian (resp. symmetric) Brownian motion $H^N$, if we denote by $\hat\mu_t^N$ the spectral measure of $X_t^N$, then, if $\hat\mu_0^N$ converges towards a compactly supported probability measure $\mu_0$, for any $\mu_1 \in \mathcal{P}(\mathbb{R})$,
\[
\limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) = \liminf_{\delta\to0}\liminf_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) = -J_\beta(\mu_0,\mu_1).
\]
Let us now reconsider the above limit and show that the limit must be taken at a free Brownian bridge. More precisely, we shall see that, if $\tau$ denotes the joint law of $(X_0, X_1)$ (the precise sense of which being given below) and $\mu^\tau$ the law of the free Brownian bridge (2.17) associated with $(X_0, X_1)$,
\[
\limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) \le \sup_{\substack{\tau\circ X_0^{-1}=\mu_0 \\ \tau\circ X_1^{-1}=\mu_1}} \limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\Big(\max_{1\le k\le n} d(\hat\mu_{t_k}^N,\mu_{t_k}^\tau) \le \delta\Big)
\]
for any family $\{t_1,\dots,t_n\}$ of times in $[0,1]$. Therefore, the large deviation estimate obtained in [16] yields
\[
\limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) \le -\frac{\beta}{2}\inf\{S(\mu^\tau),\ \tau\circ X_0^{-1}=\mu_0,\ \tau\circ X_1^{-1}=\mu_1\}.
\]
The lower bound estimate obtained in [16] therefore guarantees that
\[
\inf\{S(\nu),\ \nu_0=\mu_0,\ \nu_1=\mu_1\} = \inf\{S(\mu^\tau),\ \tau\circ X_0^{-1}=\mu_0,\ \tau\circ X_1^{-1}=\mu_1\}.
\]
Results of this kind were already obtained in [8] and [4].
Let us now be more precise. We recall that we can define the joint law of the two matrices $X_0^N, X_1^N$ by the family
\[
\hat\mu^N_{0,1}(F) = \mathrm{tr}_N\big(F(X_0^N, X_1^N)\big)
\]
when $F$ is taken in a natural set $\mathcal{F}$ of test functions of two non-commutative variables and $\mathrm{tr}_N(A) = N^{-1}\sum_{i=1}^N A_{ii}$. It is common in free probability to consider polynomial test functions. In [7], bounded analytic test functions were introduced for self-adjoint non-commutative variables. There, $\mathcal{F} = CC_{st}(\mathbb{C})$ is the complex vector space generated by
\[
F(X_1, X_2) = \prod^{\rightarrow}_{1\le i\le n} \frac{1}{z_i - \alpha_i^1 X_1 - \alpha_i^2 X_2},
\]
where $(z_i)_{1\le i\le n}$ belongs to $(\mathbb{C}\setminus\mathbb{R})^n$, $(\alpha_i^k, 1\le k\le 2)_{i=1}^n$ to $(\mathbb{R}^2)^n$, and $\prod^{\rightarrow}$ is the non-commutative product. We shall here use the very same set of functions and recall then that the space
\[
\mathcal{M}_{0,1} = \{\tau \in \mathcal{F}^* : \tau(I) = 1,\ \tau(FF^*) \ge 0,\ \tau(FG) = \tau(GF)\}
\]
is a compact metric space. We denote by $D$ a metric on $\mathcal{M}_{0,1}$. Let us recall [7] that if one considers the restriction $\mu_k = \tau\circ X_k^{-1}$ of $\tau$ to functions which only depend on one of the variables $X_k$, $k = 1, 2$, then $\mu_k$ is a probability measure on $\mathbb{R}$ (in fact the spectral measure of $X_k$) and that the topology inherited by duality from $\mathcal{F}$ is the vague topology, i.e. the topology generated by continuous compactly supported functions. Since $\mathcal{M}_{0,1}$ is compact, for any $\varepsilon > 0$, we can find $M \in \mathbb{N}$ and $(\tau_k)_{1\le k\le M}$ so that
\[
\mathcal{M}_{0,1} \subset \bigcup_{1\le k\le M} \{\tau : D(\tau,\tau_k) < \varepsilon\}
\]
and therefore
\[
\limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) \le \max_{1\le k\le M}\limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta;\ D(\hat\mu^N_{0,1},\tau_k) < \varepsilon\big).
\]
Now, conditionally to $X_1^N$,
\[
dX_t^N = dH_t^N - \frac{X_t^N - X_1^N}{1-t}\,dt,
\]
or equivalently
\[
X_t^N = tX_1^N + (1-t)X_0^N + (1-t)\int_0^t (1-s)^{-1}\,dH_s^N .
\]
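As a quick consistency check (a sketch added for the reader, not part of the original argument), one can verify that the explicit formula does solve the bridge equation. The shorthand $I_t$ is introduced here for illustration only.

```latex
% Notation (assumption, for illustration): I_t := \int_0^t (1-s)^{-1} dH_s^N,
% so that X_t^N = t X_1^N + (1-t) X_0^N + (1-t) I_t and (1-t) dI_t = dH_t^N.
\begin{align*}
dX_t^N &= (X_1^N - X_0^N)\,dt - I_t\,dt + (1-t)\,dI_t \\
       &= (X_1^N - X_0^N - I_t)\,dt + dH_t^N , \\
-\frac{X_t^N - X_1^N}{1-t}
       &= -\frac{(1-t)(X_0^N - X_1^N) + (1-t)\, I_t}{1-t}
        = X_1^N - X_0^N - I_t .
\end{align*}
% Comparing the two lines: dX_t^N = dH_t^N - (X_t^N - X_1^N)/(1-t) dt for t < 1.
```

No Itô correction term appears because the prefactor $(1-t)$ is deterministic and of finite variation.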
Let us assume that $\hat\mu^N_{0,1}$ converges towards $\tau \in \mathcal{M}_{0,1}$ when $N$ goes to infinity and that $X_1^N, X_0^N$ remain uniformly bounded for the operator norm. In particular, $\hat\mu^N_{tX_1^N+(1-t)X_0^N}$ converges for any $t \in [0,1]$ towards
\[
\nu_t^\tau = \tau\circ(tX_1 + (1-t)X_0)^{-1}, \qquad \nu_t^\tau(f) = \tau\big(f(tX_1 + (1-t)X_0)\big)
\]
for any test function $f$. Therefore, Voiculescu's result implies that $\hat\mu^N_{X_t^N}$ converges towards the distribution $\mu_t^\tau$ of $tX_1 + (1-t)X_0 + (1-t)\int_0^t (1-s)^{-1}\,dS_s$ with a free Brownian motion $S$, free with $tX_1 + (1-t)X_0$. We shall now extend this result in our topology and also control the dependence of this convergence with respect to the speed of convergence of the distribution of $(X_0^N, X_1^N)$ towards $\tau$. We shall work below with given
\[
(X_0^N, X_1^N) \in \{d(\hat\mu_0^N,\mu_0) < \delta;\ d(\hat\mu_1^N,\mu_1) < \delta;\ D(\hat\mu^N_{0,1},\tau) < \varepsilon\}.
\]
Let, for $u \le t$, $X_u^{N,t}$ denote the process
\[
X_u^{N,t} = tX_1^N + (1-t)X_0^N + (1-t)\int_0^u (1-s)^{-1}\,dH_s^N .
\]
Then, one deduces from Itô's calculus that for any test function $f$,
\[
\hat\mu^N_{X_u^{N,t}}(f) = \hat\mu^N_{tX_1^N+(1-t)X_0^N}(f) + \frac{(1-t)^2}{2}\int_0^u \hat\mu^N_{X_s^{N,t}}\otimes\hat\mu^N_{X_s^{N,t}}\Big(\frac{f'(x)-f'(y)}{x-y}\Big)\frac{ds}{(1-s)^2} + M_f^N(u)
\]
with a martingale $M_f^N(u)$ such that
\[
E\Big[\sup_{u\in[0,t]} \big(M_f^N(u)\big)^2\Big] \le \frac{\|f'\|_\infty^2}{N^2}.
\]
Moreover, it is not hard to check that $(\hat\mu^N_{X_u^{N,t}}, u \le t)$ is tight in $C([0,1],\mathcal{P}(\mathbb{R}))$ (see the proof of exponential tightness of the spectral process of $X_0^N + H_t^N$ given in [16]). The limit points $(\mu_{X_u^t}, u \le t)$ (when $D(\hat\mu^N_{0,1},\tau)$ goes to zero) satisfy the equation
\[
\mu_{X_u^t}(f) = \nu_t^\tau(f) + \frac{(1-t)^2}{2}\int_0^u \mu_{X_s^t}\otimes\mu_{X_s^t}\Big(\frac{f'(x)-f'(y)}{x-y}\Big)\frac{ds}{(1-s)^2}.
\]
This equation admits a unique solution, as can be proved following the arguments of [8] or [16], p. 494. Taking $f(x) = e^{i\xi x}$, and subtracting both equations, we find, with
\[
\Delta_u^N(R) = \sup_{|\xi|\le R} E\big[\,|\hat\mu^N_{X_u^{N,t}}(e^{i\xi x}) - \mu_{X_u^t}(e^{i\xi x})|\,\big],
\]
that for $u \le t$,
\[
\Delta_u^N(R) \le \Delta_0^N(R) + 4R^2\int_0^u \Delta_s^N(R)\,ds + \frac{R}{N},
\]
which yields, thanks to the Gronwall lemma and taken at $u = t$, since $\mu_t^\tau = \mu_{X_t^t}$,
\[
\sup_{|\xi|\le R} E\big[\,|\hat\mu^N_{X_t^N}(e^{i\xi x}) - \mu_t^\tau(e^{i\xi x})|\,\big] \le \Big(\frac{R}{N} + \sup_{|\xi|\le R} E\big[\,|\hat\mu^N_{tX_1^N+(1-t)X_0^N}(e^{i\xi x}) - \nu_t^\tau(e^{i\xi x})|\,\big]\Big)\, e^{4R^2 t}.
\]
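The Gronwall step above can be spelled out as follows. This is only a sketch; the symbol $\Delta_u^N(R)$ for the supremum defined before the integral inequality is a naming choice made here, and the integral form of Gronwall's lemma for constant coefficients is assumed.

```latex
% Integral form of Gronwall's lemma (constant coefficients): if
%   \Delta_u \le a + b \int_0^u \Delta_s \, ds   for all u in [0,t],  a, b \ge 0,
% then \Delta_u \le a e^{b u} on [0,t].
% Applied with the quantities of the text:
\begin{align*}
a &= \Delta_0^N(R) + \frac{R}{N}, \qquad b = 4R^2, \\
\Delta_t^N(R) &\le \Big(\Delta_0^N(R) + \frac{R}{N}\Big)\, e^{4R^2 t}.
\end{align*}
% Since X_0^{N,t} = t X_1^N + (1-t) X_0^N, the term \Delta_0^N(R) is the
% supremum over |\xi| \le R of
% E|\hat\mu^N_{tX_1^N+(1-t)X_0^N}(e^{i\xi x}) - \nu_t^\tau(e^{i\xi x})|.
```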
Therefore, if we define the distance $d_F$ on $\mathcal{P}(\mathbb{R})$ by
\[
d_F(\mu,\mu') = \int |\mu(e^{i\xi x}) - \mu'(e^{i\xi x})|\, e^{-4\xi^2}\,d\xi,
\]
we have proved that there exists a finite constant $C$ such that for all $t \in [0,1]$,
\[
E\big[d_F(\hat\mu^N_{X_t^N},\mu_t^\tau)\big] \le C\, d_F\big(\hat\mu^N_{tX_1^N+(1-t)X_0^N},\nu_t^\tau\big) + \frac{C}{N}.
\]
It is not hard to convince ourselves that $d_F$ is a distance compatible with the weak topology on $\mathcal{P}(\mathbb{R})$. Observe now that on $\{d(\hat\mu_1^N,\mu_1) < \delta,\ d(\hat\mu_0^N,\mu_0) < \delta\}$, $(\hat\mu^N_{tX_1+(1-t)X_0}, t \in [0,1])$ is tight for the usual weak topology, so that for any $\varepsilon > 0$ we can find $\kappa > 0$ so that for any $\tau$ and $t \in [0,1]$, $D(\tau,\hat\mu^N_{0,1}) < \varepsilon$ implies
\[
d_F\big(\hat\mu^N_{tX_1^N+(1-t)X_0^N},\nu_t^\tau\big) < \kappa.
\]
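Therefore, for any $t_1,\dots,t_n \in [0,1]$ and any $(X_0^N, X_1^N) \in \{d(\hat\mu_0^N,\mu_0) < \delta;\ d(\hat\mu_1^N,\mu_1) < \delta;\ D(\hat\mu^N_{0,1},\tau) < \varepsilon\}$, the Chebyshev inequality yields
\[
\mathbb{P}\Big(\max_{1\le k\le n} d_F(\hat\mu^N_{X^N_{t_k}},\mu^\tau_{t_k}) > \eta \,\Big|\, X_1^N\Big) \le nC\Big(\kappa + \frac{1}{N}\Big)
\]
with $\mu_t^\tau = \mu_{X_t}$ the distribution of $X_t = tX_1 + (1-t)X_0 + \sqrt{t(1-t)}\,S$ when the law of $(X_0, X_1)$ is $\tau$. Hence for any $\eta$, when $\kappa$ (i.e. $\varepsilon$) is small enough and $N$ large enough,
\[
\mathbb{P}\Big(\max_{1\le k\le n} d_F(\hat\mu^N_{X^N_{t_k}},\mu^\tau_{t_k}) < \eta \,\Big|\, X_1^N\Big) > \frac12 .
\]
Hence
\[
\mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta;\ D(\hat\mu^N_{0,1},\tau) < \varepsilon\big) \le 2\,\mathbb{P}\Big(d(\hat\mu_1^N,\mu_1) < \delta;\ D(\hat\mu^N_{0,1},\tau) < \varepsilon,\ \max_{1\le k\le n} d_F(\hat\mu^N_{X^N_{t_k}},\mu^\tau_{t_k}) < \eta\Big).
\]
We arrive at, for $\varepsilon$ small enough and any $\tau \in \mathcal{M}_{0,1}$,
\[
\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta;\ D(\hat\mu^N_{0,1},\tau) < \varepsilon\big) \le \limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\Big(\max_{1\le k\le n} d_F(\hat\mu^N_{t_k},\mu^\tau_{t_k}) < \delta\Big).
\]
Using the large deviation upper bound for the law of $(\hat\mu_t^N, t \in [0,1])$ from [16], we deduce
\[
\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) \le -\frac{\beta}{2}\min_{1\le p\le M}\ \inf_{\max_{1\le k\le n} d_F(\nu_{t_k},\mu^{\tau_p}_{t_k}) \le \delta} S(\nu).
\]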
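We can now let $\varepsilon$ go to zero, then $\delta$ go to zero, and then $n$ go to infinity, to obtain, since $S$ is a good rate function, that
\[
\limsup_{\delta\to0}\limsup_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) \le -\frac{\beta}{2}\inf_{\substack{\tau:\ \tau\circ X_0^{-1}=\mu_0 \\ \tau\circ X_1^{-1}=\mu_1}} S(\mu^\tau).
\]
Since it was also proved in [16] that
\[
\liminf_{\delta\to0}\liminf_{N\to\infty} \frac{1}{N^2}\log \mathbb{P}\big(d(\hat\mu_1^N,\mu_1) < \delta\big) \ge -\frac{\beta}{2}\inf_{\substack{\nu_0=\mu_0 \\ \nu_1=\mu_1}} S(\nu),
\]
we conclude that
\[
\inf_{\substack{\nu_0=\mu_0 \\ \nu_1=\mu_1}} S(\nu) = \inf_{\substack{\tau:\ \tau\circ X_0^{-1}=\mu_0 \\ \tau\circ X_1^{-1}=\mu_1}} S(\mu^\tau).
\]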
Hence, if $FBB(\mu_0,\mu_1)$ is the set of laws of free Brownian bridges between $\mu_0$ and $\mu_1$, i.e. $FBB(\mu_0,\mu_1) = \{\mu^\tau,\ \tau\circ X_0^{-1} = \mu_0,\ \tau\circ X_1^{-1} = \mu_1\}$, we have seen that
\[
\inf\{S(\nu),\ \nu_0 = \mu_0,\ \nu_1 = \mu_1\} = \inf\{S(\nu),\ \nu \in FBB(\mu_0,\mu_1)\}.
\]
To finish the proof of Theorem 2.7, we need to show that $FBB(\mu_0,\mu_1)$ is a closed subset of $C([0,1],\mathcal{P}(\mathbb{R}))$, so that indeed the infimum is reached in $FBB(\mu_0,\mu_1)$. Observe here that $\mu^\tau$ depends only partially on $\tau$, since it only depends on $\{\nu_t^\tau, t \in [0,1]\}$. Noting that
\[
\nu_t^\tau(x^p) = \sum_{r=0}^p t^r\, \tau\big(P_{r,p}(X_1 - X_0, X_0)\big)
\]
with $P_{r,p}(X,Y)$ the sum over all the monomial functions with total degree $p$ and degree $r$ in $X$, we see that $\mu^\tau$ only depends on the restriction of $\tau$ to polynomial functions $P \in \mathcal{S} = \{P_{r,p},\ 0 \le r \le p < \infty\}$. Of course,
\[
\mathcal{M}^{\mathcal{S},C}_{0,1} = \{\tau|_{\mathcal{S}},\ \tau \in \mathcal{M}_{0,1},\ \tau(X^{2p} + Y^{2p}) \le 2C^{2p},\ \forall p \in \mathbb{N}\}
\]
is closed for the dual topology generated by the polynomial functions of $\mathcal{S}$. Here $C$ denotes a common uniform bound on $X_0$ and $X_1$, and we have
\[
FBB(\mu_0,\mu_1) = \{\mu^{\tau|_{\mathcal{S}}},\ \tau \in \mathcal{M}_{0,1}\} = \{\mu^\kappa,\ \kappa \in \mathcal{M}^{\mathcal{S},C}_{0,1}\}.
\]
We denote, for $\kappa \in \mathcal{M}^{\mathcal{S},C}_{0,1}$ and $t \in [0,1]$, by $\nu_t^\kappa \in \mathcal{P}(\mathbb{R})$ the distribution of $tX_1 + (1-t)X_0$ when the joint distribution of $(X_0, X_1)$ restricted to $\mathcal{S}$ is $\kappa$. Then, $\mu_t^\kappa = \nu_t^\kappa \boxplus \sigma_{t(1-t)}$. We now show that $FBB(\mu_0,\mu_1)$ is a closed set of $C([0,1],\mathcal{P}(\mathbb{R}))$, which ensures, since $S$ is a good rate function on $C([0,1],\mathcal{P}(\mathbb{R}))$, that the infimum is achieved on $FBB(\mu_0,\mu_1)$. Indeed, if $\mu^n$ is a sequence of $FBB(\mu_0,\mu_1)$ given by $\{\nu_t^{\kappa_n} \boxplus \sigma_{t(1-t)},\ t \in [0,1]\}$, the weak convergence of $\mu^n$ implies the weak convergence of $\kappa_n$. Indeed, for any $p \in \mathbb{N}$ and any $t \in [0,1]$,
\[
\mu_t^n(x^p) = \nu_t^{\kappa_n}(x^p) + P_t\big(\mu_t^n(x^l),\ l \le p-1\big)
\]
with a polynomial function $P_t$. Hence, by induction, the convergence of $(\mu_t^n(x^p))_{p\in\mathbb{N}}$ (recall that $\mu^n$ is supported by $[-C-2, C+2]$ for any $n$, so that weak convergence is equivalent to moment convergence) results in the convergence of $(\nu_t^{\kappa_n}(x^p))_{p\in\mathbb{N}}$ and, since $(\nu_t^{\kappa_n})_{n\in\mathbb{N}}$ is supported by $[-C, C]$, in the weak convergence of $\nu_t^{\kappa_n}$ towards some probability measure $\nu_t$. Since this convergence holds for any $t \in [0,1]$, we can expand the moments in powers of the time variable to conclude that $\kappa_n$ converges towards some $\kappa \in \mathcal{M}^{\mathcal{S},C}_{0,1}$. Again by free convolution calculus, this convergence results in the convergence of $\mu^n$ towards $\mu^\kappa \in FBB(\mu_0,\mu_1)$. Hence, $FBB(\mu_0,\mu_1)$ is closed.

4.2. Proof of Lemma 2.4. In [16] (see (2.13) and Lemma 2.10) O. Zeitouni and I proved that for any path $\nu \in C^1([0,1],\mathcal{P}(\mathbb{R}))$, there exists a path $\nu^{\varepsilon,\Delta}$ such that
\[
\limsup_{\varepsilon,\Delta\downarrow0} S^{0,1}(\nu^{\varepsilon,\Delta}) = S_{\mu_0}(\nu).
\]
This path was constructed as follows. Let $P_\varepsilon$ be the Cauchy law with parameter $\varepsilon$ and set $\mu^\varepsilon = P_\varepsilon * \mu$ as the convoluted path with the Cauchy law. Moreover, if $0 = t_1 < t_2 < \dots < t_n = 1$ with $t_i = (i-1)\Delta$, we set, for $t \in [t_k, t_{k+1}[$,
\[
\nu_t^{\varepsilon,\Delta} = \nu^\varepsilon_{t_k} + \frac{(t - t_k)}{\Delta}\big[\nu^\varepsilon_{t_{k+1}} - \nu^\varepsilon_{t_k}\big].
\]
Let us therefore consider S 0,1 (ν , ). Because we took the convolution with respect to the Cauchy law, the Hilbert transform H νt , is well defined, and actually a continuously differentiable function with respect to the time variable and an analytic function with respect to the space variable. Henceforth, in the supremum defining S 0,1 (ν , ), we can actually make the change of function f (t, x) → f (t, x) − log |x − y|dνt , (y). Observing that, with νi = νi ∗ P for i ∈ {0, 1},
1
∂t 0
1
log |x − y|−1 dνt , (y) dνt , (x)dt = (ν1 ) − (ν0 ) , 2
we find that 1 1 1
(H νt , )2 dνt , dt (ν1 ) − (ν0 ) + 2 2 0 + sup f1 dµ 1 − f0 dν0
S 0,1 (ν , ) = −
f ∈Cb2,1 ([0,1]×R)
1 0,1 − − < f, f >ν , 2 0 1 1 1
(H νt , )2 dµ , ≥ − (ν1 ) − (ν0 ) + t dt. 2 2 0
1
∂t ft dνt , dt
Noticing that 0
1
[] 1
(H νt , )2 dµ , t dt
=
k=0
(H νt k )2 dνt k
converges since t → H νt and t → νt are continuous for any ν ∈ C([0, 1], P(R)), we arrive at 1 1 1
(H νt )2 dνt dt. (4.1) lim inf S 0,1 (ν , ) ≥ − (ν1 ) − (ν0 ) + ↓0 2 2 0 Remark that for t ∈ {0, 1}, 1 (νt ) = log |x−y|−1 dP ∗νt (x)dP ∗νt (y) = log((x−y)2 + 2 )−1 dνt (x)dνt (y). 2 Hence, the monotone convergence theorem asserts that lim (νt ) = (νt ). ↓0
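In particular, if $\Sigma(\nu_0)$ is finite, (4.1) implies that $\Sigma(\nu_1)$ is also bounded below, and therefore bounded, since $S^{0,1}(\nu) < \infty$ implies that $\nu_1(x^2) < \infty$, and consequently that $\Sigma(\nu_1)$ is bounded above. Now, recall that for any $\rho \in L^3$, Tricomi [27] p. 169 asserts that
\[
\frac{\pi^2}{2}\rho(x)^2 = \frac12 (H\rho)^2(x) - H(\rho(H\rho))(x),
\]
so that
\[
\int (H\rho)^2(x)\,\rho(x)\,dx = \frac{\pi^2}{3}\int (\rho(x))^3\,dx.
\]
Since, for any $\varepsilon > 0$, $\nu_t^\varepsilon$ is absolutely continuous with respect to Lebesgue measure with density $\rho_t^\varepsilon \in L^3(dx)$ for almost all $t \in [0,1]$, (4.1) implies that
\[
\int_0^1\!\!\int \rho_t^\varepsilon(x)^3\,dx\,dt \le C
\]
with a finite constant $C$ independent of $\varepsilon$. Consequently, for any Lipschitz function $f$, by Hölder's inequality,
\[
\Big|\iint f_t(x)\,d\nu_t(x)\,dt\Big| \le \varepsilon\sup_{t\in[0,1]} |f_t|_L + \Big|\iint f_t(x)\,\rho_t^\varepsilon(x)\,dx\,dt\Big| \le \varepsilon\sup_{t\in[0,1]} |f_t|_L + C\Big(\int_0^1\!\!\int |f_t(x)|^{\frac32}\,dx\,dt\Big)^{\frac23},
\]
so that, letting $\varepsilon$ go to zero, we obtain
\[
\Big|\iint f_t(x)\,d\nu_t(x)\,dt\Big| \le C\Big(\int_0^1\!\!\int |f_t(x)|^{\frac32}\,dx\,dt\Big)^{\frac23},
\]
an inequality which extends readily to $L^{\frac32}(dx\,dt)$ by density. As a consequence, $d\nu_t(x)\,dt \ll dx\,dt$, $d\nu_t(x)\,dt = \rho_t(x)\,dx\,dt$, and $\rho_t^\varepsilon(x)$ converges towards $\rho_t$ almost surely. We conclude by Fatou's lemma that
\[
+\infty > \liminf_{\varepsilon\downarrow0}\liminf_{\Delta\downarrow0} S^{0,1}(\nu^{\varepsilon,\Delta}) \ge -\frac12\big(\Sigma(\nu_1) - \Sigma(\nu_0)\big) + \frac{\pi^2}{6}\liminf_{\varepsilon\downarrow0}\int_0^1\!\!\int (\rho_t^\varepsilon(x))^3\,dx\,dt = -\frac12\big(\Sigma(\nu_1) - \Sigma(\nu_0)\big) + \frac{\pi^2}{6}\int_0^1\!\!\int (\rho_t(x))^3\,dx\,dt.
\]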
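The passage from Tricomi's identity to the cubic integral used in this proof can be checked in two lines. The sketch below assumes the convention $H\rho(x) = \mathrm{PV}\!\int \rho(y)/(x-y)\,dy$ and the antisymmetry of $H$ that follows from it; enough integrability to apply both is taken for granted.

```latex
% Assumptions: H\rho(x) = PV \int \rho(y)/(x-y) dy, so that
%   \int f (Hg) dx = - \int (Hf) g dx   for suitable f, g.
% Multiply Tricomi's identity by \rho and integrate:
\begin{align*}
\frac{\pi^2}{2}\int \rho^3\,dx
  &= \frac12 \int (H\rho)^2 \rho\,dx - \int \rho\, H(\rho\, H\rho)\,dx \\
  &= \frac12 \int (H\rho)^2 \rho\,dx + \int (H\rho)\,\rho\,(H\rho)\,dx
   = \frac32 \int (H\rho)^2 \rho\,dx ,
\end{align*}
% hence \int (H\rho)^2 \rho \, dx = (\pi^2/3) \int \rho^3 \, dx.
```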
Acknowledgements. I am very much indebted to C. Villani and O. Zeitouni whose careful reading of preliminary versions of the manuscript, wise remarks and encouragements were crucial in this research. I am also very grateful to D. Serre and Y. Brenier for stimulating discussions.
References
1. Ben Arous, G., Guionnet, A.: Large deviations for Wigner's law and Voiculescu's non-commutative entropy. Prob. Th. Rel. Fields 108, 517–542 (1997)
2. Bercovici, H., Voiculescu, D.: Free convolution of measures with unbounded support. Indiana Univ. Math. J. 42, 733–773 (1993)
3. Biane, P.: On the Free convolution with a Semi-circular distribution. Indiana Univ. Math. J. 46, 705–718 (1997)
4. Biane, P., Capitaine, M., Guionnet, A.: Large deviation bounds for matrix Brownian motion. Invent. Math. 152, 433–459 (2003)
5. Brenier, Y.: Minimal geodesics on groups of volume-preserving maps and generalized solutions of the Euler equations. Comm. Pure. Appl. Math. 52, 411–452 (1999)
6. Brézis, H.: Functional analysis. Paris: Masson, 1983
7. Cabanal-Duvillard, T., Guionnet, A.: Large deviations upper bounds and non commutative entropies for some matrices ensembles. Ann. Probab. 29, 1205–1261 (2001)
8. Cabanal-Duvillard, T., Guionnet, A.: Discussions around non-commutative entropies. Adv. Math. 174, 167–226 (2003)
9. Chadha, S., Mahoux, G., Mehta, M.L.: A method of integration over matrix variables II. J. Phys. A. 14, 579–586 (1981)
10. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
11. Dembo, A., Zeitouni, O.: Large deviations techniques and applications. Second edition, Berlin-Heidelberg-New York: Springer, 1998
12. Ercolani, N.M., McLaughlin, K.D.T-R.: Asymptotics of the partition function for random matrices via Riemann-Hilbert techniques, and applications to graphical enumeration. To appear in Int. Math. Res. Notes, 2003
13. Eynard, B.: Eigenvalue distribution of large random matrices, from one matrix to several coupled matrices. Nucl. Phys. B. 506, 633–664 (1997)
14. Eynard, B.: Random matrices. http://www-spht.cea.fr/lectures-notes.shtml
15. Guionnet, A.: Large deviation upper bounds and central limit theorems for band matrices. Ann. Inst. H. Poincaré Probab. Statist. 38, 341–384 (2002)
16. Guionnet, A., Zeitouni, O.: Large deviations asymptotics for spherical integrals. J. Funct. Anal. 188, 461–515 (2002)
17. Guionnet, A., Zeitouni, O.: Addendum to: Large deviations asymptotics for spherical integrals. To appear in J. Funct. Anal. (2004)
18. Harer, J., Zagier, D.: The Euler characteristic of the moduli space of curves. Invent. Math. 85, 457–485 (1986)
19. Loeper, G.: The inverse problem for the Euler-Poisson system in cosmology. Preprint, 2003
20. Mahoux, G., Mehta, M.: A method of integration over matrix variables III. Indian J. Pure Appl. Math. 22, 531–546 (1991)
21. Matytsin, A.: On the large N-limit of the Itzykson-Zuber integral. Nucl. Phys. B411, 805–820 (1994)
22. Matytsin, A., Zaugg, P.: Kosterlitz-Thouless phase transitions on discretized random surfaces. Nucl. Phys. B497, 699–724 (1997)
23. Mehta, M.L.: Random matrices. 2nd ed., New York-London: Academic Press, 1991
24. Mehta, M.L.: A method of integration over matrix variables. Comm. Math. Phys. 79, 327–340 (1981)
25. Serre, D.: Sur le principe variationnel des équations de la mécanique des fluides parfaits. Math. Model. Num. Anal. 27, 739–758 (1993)
26. Szarek, S., Voiculescu, D.: Volumes of restricted Minkowski Sums and the Free analogue of the Entropy Power Inequality. Commun. Math. Phys. 178, 563–570 (1996)
27. Tricomi, F.G.: Integral equations. New York: Interscience, 1957
28. Voiculescu, D.: Limit laws for random matrices and free products. Invent. Math. 104, 201–220 (1991)
29. Voiculescu, D.: The analogues of Entropy and Fisher's Information Measure in Free Probability Theory, V: Noncommutative Hilbert Transforms. Invent. Math. 132, 189–227 (1998)
30. Voiculescu, D.: Lectures on free probability theory. In: Springer Lecture Notes Mathematics 1738, Berlin-Heidelberg-New York: Springer-Verlag, 2000, pp. 283–349
31. Wigner, E.: On the distribution of the roots of certain symmetric matrices. Ann. Math. 67, 325–327 (1958)
32. Zinn-Justin, P.: Universality of correlation functions of hermitian random matrices in an external field. Commun. Math. Phys. 194, 631–650 (1998)
33. Zinn-Justin, P.: The dilute Potts model on random surfaces. J. Stat. Phys. 98, 245–264 (2000)
34. Zvonkin, A.: Matrix integrals and Map enumeration: an accessible introduction. Math. Comput. Mod. 26, 281–304 (1997)

Communicated by M. Aizenman
Commun. Math. Phys. 244, 571–594 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-1002-6
Communications in
Mathematical Physics
Integrability Versus Separability for the Multi-Centre Metrics

Galliano Valent 1,2

1 Laboratoire de Physique Théorique et des Hautes Energies, Unité associée au CNRS UMR 7589, 2 Place Jussieu, 75251 Paris Cedex 05, France
2 CNRS Luminy Case 907, Centre de Physique Théorique, 13288 Marseille Cedex 9, France

Received: 28 November 2002 / Accepted: 12 September 2003
Published online: 17 December 2003 – © Springer-Verlag 2003
Abstract: The multi-centre metrics are a family of euclidean solutions of the empty space Einstein equations with self-dual curvature. For this full class, we determine which metrics do exhibit an extra conserved quantity quadratic in the momenta, induced by a Killing-Stäckel tensor. Our systematic approach brings to light a subclass of metrics which correspond to new classically integrable dynamical systems. Within this subclass we analyze on the one hand the separation of coordinates in the Hamilton-Jacobi equation and on the other hand the construction of some new Killing-Yano tensors.

1. Introduction

The discovery of the generalized Runge-Lenz vector for the Taub-NUT metric [8] has been playing an essential role in the analysis of its classical and quantum dynamics. As shown in [5] this triplet of conserved quantities gives quite elegantly the quantum bound states as well as the scattering states. The Killing-Stäckel tensors, which are the roots of the generalized Runge-Lenz vector of Taub-NUT, have been derived in [10] using purely geometric tools. As a result the classical integrability of the Taub-NUT metric was established.

The classical integrability of the Eguchi-Hanson metric was obtained in [15], where the Hamilton-Jacobi equation was separated. This result was further generalized in [10] to cover the two-centre metric.

Despite these successes, a systematic analysis of the full family of the multi-centre metrics was still lacking. It is the aim of this article to fill this gap.

In Sect. 2 we have gathered a summary of known properties of the multi-centre metrics, their geodesic flow and some basic concepts about Killing-Stäckel tensors. In Sect. 3 we obtain the most general structure of the conserved quantity associated to a Killing-Stäckel tensor: it is a bilinear form in the momenta. Taking this quadratic structure as a starting point, we obtain the system of equations which ensure that such a quantity is preserved by the geodesic flow. This system is analyzed and simplified. Its most important consequence is that the existence of an extra conserved quantity
is related to the existence of an extra spatial Killing (besides the tri-holomorphic one), which may be either holomorphic or tri-holomorphic.

In Sect. 3 we first consider the case of an extra spatial Killing which is holomorphic. We find that the extra conserved quantity does exist for the following families, with (minimal) isometry $U(1) \times U(1)$:

1. The most general two-centre metric, with the potential
\[
V = v_0 + \frac{m_1}{|\vec r + \vec c|} + \frac{m_2}{|\vec r - \vec c|}.
\]
Our approach explains quite simply why there are three extra conserved quantities for Taub-NUT and only one for Eguchi-Hanson, and their very different nature.
2. A first dipolar breaking of Taub-NUT, with potential
\[
V = v_0 + \frac{m}{r} + \frac{\vec F\cdot\vec r}{r^3}.
\]
3. A second dipolar breaking of Taub-NUT, with potential
\[
V = v_0 + \frac{m}{r} + \vec E\cdot\vec r.
\]
In the Taub-NUT limit $\vec E \to 0$ there appears a triplet of extra conserved quantities: the generalized Runge-Lenz vector of [8].

The classical integrability of these three dynamical systems follows from our analysis.

In Sect. 4 we consider the case of an extra spatial Killing which is tri-holomorphic, with (minimal) isometry group still $U(1) \times U(1)$. We find four different families of metrics, which share with the previous ones their classical integrability and have, using appropriate coordinates, the potentials:

1. In the first case
\[
V = v_0 + \frac{a\,\xi\sqrt{\xi^2 - c^2} + b\,\eta\sqrt{c^2 - \eta^2}}{\xi^2 - \eta^2}.
\]
2. In the second case
\[
V = v_0 + m\,\frac{\cos(2\phi)}{r^2}.
\]
3. In the third case
\[
V = \frac{a\,\xi + b\,\eta}{\xi^2 + \eta^2}.
\]
4. And in the fourth case $V = v_0 + m\,x$.

As an application we work out in Sects. 5 and 6 the separation of variables for the Hamilton-Jacobi equation, which gives also a check of the results obtained in the former sections. Eventually we present in Sect. 7 some new Killing-Yano tensors, and some conclusions in Sect. 8.
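As an illustration of why such potentials are admissible (they must be harmonic, since the monopole equation requires $\Delta V = 0$), here is a sketch for the first tri-holomorphic family. It assumes, as a convention chosen here and not taken from the text, that $(\xi,\eta)$ are elliptic coordinates in the plane transverse to the translational Killing vector, parametrized as $\xi = c\cosh u$, $\eta = c\cos v$ with $u \ge 0$, $0 \le v \le \pi$ and $x + iy = c\cosh(u + iv)$, and that the numerator carries the square roots $\sqrt{\xi^2-c^2}$ and $\sqrt{c^2-\eta^2}$.

```latex
% Assumed parametrization (not from the text):
%   \xi = c \cosh u,  \eta = c \cos v,  w = u + i v,  x + i y = c \cosh w.
\begin{align*}
\xi\sqrt{\xi^2 - c^2} &= \tfrac{c^2}{2}\sinh 2u, \qquad
\eta\sqrt{c^2 - \eta^2} = \tfrac{c^2}{2}\sin 2v, \qquad
\xi^2 - \eta^2 = \tfrac{c^2}{2}\,(\cosh 2u - \cos 2v), \\
\coth w &= \frac{\sinh 2u - i \sin 2v}{\cosh 2u - \cos 2v}, \\
V - v_0 &= \frac{a \sinh 2u + b \sin 2v}{\cosh 2u - \cos 2v}
         = \mathrm{Re}\,\big[(a + i b)\coth w\big].
\end{align*}
% V - v_0 is the real part of a function analytic in w, and w is locally an
% analytic function of x + i y, so V is harmonic in the (x, y) plane; being
% independent of the third coordinate, it is harmonic in R^3 as well.
```

The second and third families can be checked the same way, as real parts of $m/z^2$ and of $(a + ib)/(\xi + i\eta)$ respectively, under analogous coordinate assumptions.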
Multi-Centre Conservation Law
2. The Multi-Centre Metrics

2.1. Background material. These euclidean metrics on $M_4$ have at least one Killing vector $\tilde K = \partial_t$ and have the local form
\[
g = \frac{1}{V}(dt + \Theta)^2 + V\,\gamma, \qquad V = V(x), \qquad \Theta = \Theta_i(x)\,dx^i, \tag{1}
\]
where the $x^i$ are the coordinates on $\gamma$. They are solutions of the empty space Einstein equations provided that:

1. The three dimensional metric $\gamma$ is flat. Using cartesian coordinates $x^i$ we can write
\[
\gamma = d\vec x\cdot d\vec x. \tag{2}
\]
2. Some monopole equation holds:
\[
dV = \underset{\gamma}{\star}\, d\Theta. \tag{3}
\]
Notice that the integrability condition for the monopole equation is $\Delta V = 0$, hence these metrics display an exact linearization of the empty space Einstein equations. They have been derived in many ways [14, 7, 11, 12]. In this last reference the geometric meaning of the cartesian coordinates $x_i$ was obtained: they are nothing but the momentum maps of the complex structures under the circle action of $\partial_t$.

Let us summarize some background knowledge on the multi-centre metrics for further use. Taking for canonical vierbein $E_a$:
\[
E_0 = \frac{1}{\sqrt V}(dt + \Theta), \qquad E_i = \sqrt V\,dx_i, \tag{4}
\]
and defining as usual the spin connection $\omega_{ab}$ and the curvature $R_{ab}$ by
\[
dE_a + \omega_{ab}\wedge E_b = 0, \qquad R_{ab} = d\omega_{ab} + \omega_{as}\wedge\omega_{sb},
\]
one can check that these metrics have a self-dual spin connection:
\[
\omega_i^{(-)} \equiv \omega_{0i} - \frac12\epsilon_{ijk}\,\omega_{jk} = 0 \quad\Longrightarrow\quad R_i^{(-)} = 0,
\]
which implies the self-duality of their curvature. It follows that they are hyperkähler and hence Ricci-flat. The complex structures are given by the triplet of 2-forms
\[
\Omega_i^{(-)} = E_0\wedge E_i - \frac12\epsilon_{ijk}\,E_j\wedge E_k = (dt + \Theta)\wedge dx_i - \frac12 V\,\epsilon_{ijk}\,dx_j\wedge dx_k, \tag{5}
\]
which are closed, in view of the hyperkähler property of these metrics. Let us note that the self-duality of the complex structures and of the spin connection are opposite and that the Killing vector $\partial_t$ is tri-holomorphic.

It is useful to define the Killing 1-form, dual of the vector $\tilde K = \partial_t$, which reads
\[
K = \frac{dt + \Theta}{V}, \tag{6}
\]
and plays some role in characterizing the multi-centre metrics.
Among these characterizations let us mention:

1. For the multi-centre metrics the differential $dK$ has a self-duality opposite to that of the connection. A proof using spinors may be found in [17] and without spinors in [6].
2. The multi-centre metrics possess at least one tri-holomorphic Killing. For a proof see [10].

2.2. Geodesic flow. The geodesic flow is the Hamiltonian flow of the metric considered as a function on the cotangent bundle of $M_4$. Using the coordinates $(t, x_i)$ we will write a cotangent vector as $\Pi_i\,dx_i + \Pi_0\,dt$. The symplectic form is then
\[
\omega = dx_i\wedge d\Pi_i + dt\wedge d\Pi_0, \tag{7}
\]
and we take for hamiltonian
\[
H = \frac12 g^{\mu\nu}\,\Pi_\mu\Pi_\nu = \frac12\Big[\frac{1}{V}(\Pi_i - \Pi_0\Theta_i)^2 + V\,\Pi_0^2\Big]. \tag{8}
\]
For geodesics orthogonal to the $U(1)$ fibers and affinely parametrized by $\lambda$ the equations for the flow allow on the one hand to express the velocities
\[
\dot t \equiv \frac{dt}{d\lambda} = \frac{\partial H}{\partial\Pi_0} = \Big(V + \frac{\Theta^2}{V}\Big)\Pi_0 - \frac{\Theta_i\Pi_i}{V}, \qquad
\dot x_i \equiv \frac{dx_i}{d\lambda} = \frac{\partial H}{\partial\Pi_i} = \frac{1}{V}\,p_i, \qquad p_i = \Pi_i - \Pi_0\Theta_i, \tag{9}
\]
and on the other hand to get the dynamical evolution equations
\[
\begin{aligned}
\dot\Pi_0 &= -\frac{\partial H}{\partial t} = 0 &&\Longrightarrow\quad q \equiv \Pi_0 = \frac{\dot t + \Theta_i\dot x_i}{V}, &&(a)\\
\dot\Pi_i &= -\frac{\partial H}{\partial x_i} &&\Longrightarrow\quad \dot p_i = \Big(\frac{H}{V} - q^2\Big)\partial_i V + \frac{q}{V}\,(\partial_i\Theta_s - \partial_s\Theta_i)\,p_s. &&(b)
\end{aligned} \tag{10}
\]
Relation (10a) expresses the conservation of the charge $q$, a consequence of the $U(1)$ isometry of the metric. For the multi-centre metrics, use of relation (3) brings the equations of motion (10b) to the nice form
\[
\dot{\vec p} = \Big(\frac{H}{V} - q^2\Big)\vec\nabla V + \frac{q}{V}\,\vec p\wedge\vec\nabla V. \tag{11}
\]
The conservation of the energy
\[
H = \frac12\Big(\frac{p_i^2}{V} + q^2 V\Big) = \frac{V}{2}\,(\dot x_i^2 + q^2) = \frac12\, g_{\mu\nu}\,\dot x^\mu\dot x^\nu \tag{12}
\]
is obvious since it expresses the constancy of the length of the tangent vector $\dot x^\mu$ along a geodesic.
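The conservation of $H$ can also be checked directly from the equations of motion. The following sketch uses only (9), (11) and (12), with $q$ constant by (10a); the vector notation $\vec p$, $\vec\nabla V$ is as in (11).

```latex
% Uses (9): \dot{x}_i = p_i / V, hence \dot V = (\vec p \cdot \vec\nabla V)/V,
% and (11); the magnetic term drops out since \vec p \cdot (\vec p \wedge \vec\nabla V) = 0.
\begin{align*}
\frac{dH}{d\lambda}
 &= \frac{\vec p\cdot\dot{\vec p}}{V}
    - \frac{\vec p^{\,2}}{2V^2}\,\dot V + \frac{q^2}{2}\,\dot V \\
 &= \frac{\vec p\cdot\vec\nabla V}{V}
    \Big[\frac{H}{V} - q^2 - \frac{\vec p^{\,2}}{2V^2} + \frac{q^2}{2}\Big]
  = \frac{\vec p\cdot\vec\nabla V}{V^2}
    \Big[H - \frac{\vec p^{\,2}}{2V} - \frac{q^2 V}{2}\Big] = 0 ,
\end{align*}
% where the last bracket vanishes by the first equality in (12).
```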
2.3. Killing-Stäckel tensors and their conserved quantities. A Killing-Stäckel (KS) tensor is a symmetric tensor $S_{\mu\nu}$ which satisfies
\[
\nabla_{(\mu}S_{\nu\rho)} = 0. \tag{13}
\]
Let us observe that if $K$ and $L$ are two (possibly different) Killing vectors, their symmetrized tensor product $K_{(\mu}L_{\nu)}$ is a KS tensor. So we will define irreducible KS tensors as the ones which cannot be written as linear combinations, with constant coefficients, of symmetrized tensor products of Killing vectors. For a given KS tensor $S_{\mu\nu}$ the quadratic form of the velocities
\[
S = S_{\mu\nu}\,\dot x^\mu\dot x^\nu \tag{14}
\]
is preserved by the geodesic flow. In all that follows we will look for KS tensors, under the assumptions

A1: The KS tensor is preserved by Lie dragging along the tri-holomorphic Killing vector:
\[
\mathcal{L}_{\tilde K}\,S_{\mu\nu} = 0, \qquad \tilde K = \partial_t. \tag{15}
\]
A2: We will consider generic values of $H$ and $q \neq 0$.

Furthermore, instead of focusing on the KS tensor $S_{\mu\nu}$, whose usefulness is just to produce the conserved quantity $S$, let us rather examine more closely the structure of the conserved quantity induced by such a KS tensor. From relation (14) we obtain the following ansatz for the conserved quantity we are looking for:
\[
S = A_{ij}(x_k)\,p_i p_j + 2q\,B_i(x_k)\,p_i + C(x_k), \tag{16}
\]
where the various unknown functions, as a consequence of A1, are independent of the coordinate on the $U(1)$ fiber. It is interesting to notice that the knowledge of $S$ is equivalent to the knowledge of the KS tensor: using (9) one can express $S$ in terms of the velocities and, going backwards, compute the KS tensor components from relation (14). Imposing the conservation of $S$ under the geodesic flow gives:

Proposition 1. Under Assumptions A1 and A2 the quantity $S$, given by (16), is conserved iff the following equations are satisfied (see footnote 1):
\[
\begin{aligned}
&a)\quad q\,\mathcal{L}_{B}V = 0,\\
&b)\quad \partial_{(k}A_{ij)} = 0,\\
&c)\quad q\,\big(\partial_{(i}B_{j)} - A_{s(i}\,\epsilon_{j)su}\,\partial_u V\big) = 0,\\
&d)\quad \partial_i C + 2(H - q^2 V)\,A_{is}\,\partial_s V - 2q^2\,\epsilon_{ist}\,B_s\,\partial_t V = 0.
\end{aligned} \tag{17}
\]
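For orientation, the system (17) can be recovered by a direct computation. The sketch below differentiates the ansatz (16) along the flow, using $\dot x_k = p_k/V$ from (9) and (11) written in components as $\dot p_i = (H/V - q^2)\,\partial_i V + (q/V)\,\epsilon_{isk}\,p_s\,\partial_k V$; the index conventions for $\epsilon$ are the ones assumed in that rewriting.

```latex
% Differentiate S = A_{ij} p_i p_j + 2 q B_i p_i + C along the flow,
% multiply by V, and collect powers of p:
\begin{align*}
V\,\frac{dS}{d\lambda}
 &= \partial_{k}A_{ij}\,p_i p_j p_k
  + 2q\,\big(\partial_{i}B_{j} - A_{si}\,\epsilon_{jsu}\,\partial_u V\big)\, p_i p_j \\
 &\quad + \big(\partial_i C + 2(H - q^2 V)\, A_{is}\,\partial_s V
        - 2q^2\, \epsilon_{ist}\, B_s\, \partial_t V\big)\, p_i
  + 2q\,(H - q^2 V)\, B_i\,\partial_i V .
\end{align*}
% Each power of p must vanish separately, and only the symmetrized
% coefficients contribute: the cubic, quadratic and linear terms give
% (17b), (17c), (17d); the p-independent term gives (17a), since
% H - q^2 V does not vanish identically.
```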
We are now in a position to explain why we assumed, in A2, that $q$ should not vanish. Indeed for $q = 0$ the relations (17a) and (17c) are trivially true and we are left with
\[
\partial_{(k}A_{ij)} = 0, \qquad \partial_i C + 2H\,A_{is}\,\partial_s V = 0,
\]
while the conserved quantity $S$ reduces to $S = A_{ij}(x)\,p_i p_j + C(x)$.

Footnote 1: Assumption A2 implies that $H - q^2 V$ does not vanish identically.
It is interesting to notice that, formally, S is preserved by the hamiltonian flow induced by the classical hamiltonian [16] H=
p 2 − H V, 2
where now H appears as some constant parameter. However the assumption that q = 0 leads to a reduced system which has only three degrees of freedom and as such may exhibit integrability. Since we are interested in genuine four dimensional integrability we have to exclude such a possibility. Let us proceed to the discussion of the system (17). Relation (17a) shows that there are two possible situations: 1. Either the potential V has one (or more) spatial symmetries, with Killing K, and then B has to be conformal to this Killing vector, 2. Or the potential has no spatial symmetry, and in this case B = 0. Let us show that this last possibility does not give any new conserved quantity. Indeed relation (17c) can be written [A, R] = 0,
(R)ij = isj ∂s V .
(18)
Since V has no Killing the matrix R is a generic matrix in the Lie algebra so(3). By the Schur lemma it follows that A has to be proportional to the identity matrix and this does trivialize the corresponding conserved quantity S. So the unique possibility left is the first one. Let us notice that K lifts up to an isometry of the 4 dimensional metric. We have obtained:

Proposition 2. The number of extra conserved quantities, having the structure (16), of a multi-centre metric is at most equal to the number of extra spatial Killing vectors it does possess, besides the tri-holomorphic Killing $\tilde K = \partial_t$.

Using this result we can discuss the triaxial generalization of the Eguchi-Hanson metric, with a tri-holomorphic su(2), discovered in [2]. Its potential and cartesian coordinates were given in [9] and the potential has no spatial Killing. From the previous proposition it follows that this metric will not exhibit a conserved quantity of the form (16) for generic values of H and $q \neq 0$.

2.4. Transformations of the system. As observed above, the vector B has to be conformal to the Killing K. So we define the conformal factor F such that
$$ B_i = -F\,K_i. \tag{19} $$
The conserved quantity (16) becomes
$$ S = A_{ij}(x)\,p_i p_j - 2q\,F\,K_i p_i + C(x), \tag{20} $$
and Eq. (17c) transforms into
$$ K_{(i}\,\partial_{j)} F + A_{s(i}\,\epsilon_{j)su}\,\partial_u V = 0. \tag{21} $$
Taking its trace we see that $\mathcal{L}_K F = 0$, showing that V and F must have the same Killing.
Contracting (21) with $\partial_j V$ gives

Lemma 1. Equation (21) has for a consequence:
$$ (dV \cdot dF)\,K + \star\,(A[dV] \wedge dV) = 0, \qquad A[dV] = A_{is}\,\partial_s V\,dx^i. \tag{22} $$
We can proceed to:

Proposition 3. The relation (21) is equivalent (except possibly at the points where the norm of the Killing K vanishes) to the relations:
$$ \text{a)}\;\; A[K] = a(x)\,K, \qquad \text{b)}\;\; |K|^2\,dF - A[\star(K \wedge dV)] + \star\,(A[K] \wedge dV) = 0. \tag{23} $$

Proof. Contracting relation (21) with $K_j$ gives relation b), while contracting with $K_i K_j$ we have
$$ \epsilon_{stu}\,K_s\,A[K]_t\,\partial_u V = 0 \quad\Rightarrow\quad A[K]_i = a(x)\,K_i + b(x)\,\partial_i V, \tag{24} $$
which is not relation a). To complete the argument we first contract relation (21) with $\epsilon_{iab} K_a$; after some algebra we get
$$ K_j\,\epsilon_{iab}\,\partial_i F\,K_a + 2A[K]_j\,\partial_b V + A[dV]_b\,K_j - K_s A[dV]_s\,\delta_{jb} - A_{ss}\,K_j\,\partial_b V = 0, \tag{25} $$
which, upon contraction with $A[K]_b$, gives eventually
$$ (A[K]_s\,\partial_s V)\,A[K]_i = \{ -\epsilon_{stu}\,K_s\,A[K]_t\,\partial_u F + A_{ss}\,A[K]_t\,\partial_t V - A[K]_s\,A[dV]_s \}\,K_i. \tag{26} $$
Let us now suppose that $A[K]_s\,\partial_s V \neq 0$. The previous relation shows that in (24) we must have b(x) = 0, hence $A[K]_s\,\partial_s V = a\,K_s\,\partial_s V = 0$, which is a contradiction.

Let us prove that the converse is true. From (23b) we get
$$ |K|^2\,K_{(i}\,\partial_{j)} F + \big(K_{(j}\,A_{i)s}\,K_t\,\epsilon_{tsu} + A[K]_s\,K_{(j}\,\epsilon_{i)su}\big)\,\partial_u V = 0. \tag{27} $$
Use of the identity
$$ A_{is}\,K_t K_j\,\epsilon_{tsu}\,\partial_u V = \big(|K|^2\,A_{is}\,\epsilon_{jsu} - A[K]_i\,K_t\,\epsilon_{jtu}\big)\,\partial_u V \tag{28} $$
and of relation (23a) leaves us with (21), up to division by $|K|^2$. Notice that $|K|^2$ vanishes at the fixed points under the Killing action, i.e. in subsets of zero measure in $\mathbb{R}^3$.

We can give, using (23a) and the identity
$$ -A[\star(K \wedge dV)] = \star\,(A[K] \wedge dV) - A_{ss}\,\star(K \wedge dV) + \star\,(K \wedge A[dV]), \tag{29} $$
a simpler form to the relation (23b):

Lemma 2. The relation (23b) is equivalent to
$$ |K|^2\,dF + (2a - \mathrm{Tr}\,A)\,\star(K \wedge dV) + \star\,(K \wedge A[dV]) = 0. \tag{30} $$
For further use let us prove:
Lemma 3. To the spatial Killing K, leaving the potential V invariant, there corresponds a quantity Q invariant under the geodesic flow given by
$$ Q = K_i p_i + q\,G, \qquad \text{with} \qquad i(K)\,F = -dG. \tag{31} $$

Proof. We start from $\mathcal{L}_K V = 0$. Since K is a Killing we have
$$ \mathcal{L}_K(\star\,dV) = \star\,d(\mathcal{L}_K V) = 0, $$
and (3) implies that $\mathcal{L}_K\,d\Theta = 0$. The closedness of $d\Theta$ implies $d(i(K)\,d\Theta) = 0$, and since our analysis is purely local in $\mathbb{R}^3$, we can define
$$ dG = -\,i(K)\,d\Theta \quad\Rightarrow\quad \star\,(K \wedge dV) = dG. \tag{32} $$
Then we multiply (10b) by $K_i$ and get successively
$$ K_i\,\dot p_i = (K_i p_i)^{\cdot} - \dot K_i\,p_i = (K_i p_i)^{\cdot} = \frac{q}{V}\,K_i\,(\partial_i \Theta_s - \partial_s \Theta_i)\,p_s = -q\,\dot x_s\,\partial_s G = -q\,\dot G, $$
which concludes the proof.
Let us point out that if we use the coordinate φ adapted to the Killing $\tilde K = \partial_\phi$, we can write the connection $\Theta = G\,d\phi$, where G does not depend on φ.

2.5. Integrability equations. We will derive now the integrability conditions for Eqs. (17c) and (17d). The first one was written using forms in (30) while the second one is
$$ dC + 2(H - q^2 V)\,A[dV] + 2q^2\,F\,\star(K \wedge dV) = 0. \tag{33} $$
It can now be proved:

Proposition 4. The integrability condition for (33) is
$$ d\,A[dV] = 0 \quad\Rightarrow\quad A[dV] = dU \quad\text{and}\quad \mathcal{L}_K U = 0. \tag{34} $$
Proof. The integrability condition is obtained by differentiating (33). We get
$$ 2(H - q^2 V)\,d\,A[dV] + 2q^2\,A[dV] \wedge dV + 2q^2\,dF \wedge \star(K \wedge dV) + 2q^2\,F\,d\star(K \wedge dV) = 0. \tag{35} $$
The last term in this equation vanishes in view of (32). Furthermore we have the identity specific to three dimensional spaces
$$ dF \wedge \star(K \wedge dV) = -(K \cdot dF)\,\star dV + (dV \cdot dF)\,\star K = (dV \cdot dF)\,\star K $$
because K is a symmetry of F. Relation (35) simplifies to
$$ 2(H - q^2 V)\,d\,A[dV] + 2q^2\,\star\big[(dV \cdot dF)\,K + \star\,(A[dV] \wedge dV)\big] = 0, $$
and Lemma 1 implies the closedness of A[dV]. Since our analysis is purely local, the existence of U is a consequence of Poincaré's lemma. The relations
$$ \mathcal{L}_K U = i(K)\,dU = i(K)\,A[dV] = (A[K] \cdot dV) = a\,(K \cdot dV) = a\,\mathcal{L}_K V = 0 $$
show the invariance of U under the Killing K.
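The two three-dimensional identities used in this subsection, the one in the proof of Proposition 4 and the trace identity (29), can be spot-checked numerically. The sketch below is only an illustration: it assumes that the Hodge dual of a wedge product of two 1-forms in flat $\mathbb{R}^3$ is represented by the ordinary cross product of their components, and it draws random vectors for K, dV, dF and a random symmetric matrix A. All function names are hypothetical.

```python
import random

def cross(u, v):
    """Cross product of two 3-vectors, standing for star(u ^ v) in flat R^3."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(A, u):
    """A[u]_i = A_is u_s."""
    return [dot(row, u) for row in A]

def combo(*terms):
    """Linear combination of 3-vectors given as (coefficient, vector) pairs."""
    return [sum(c * v[i] for c, v in terms) for i in range(3)]

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

random.seed(1)
K, dV, dF = ([random.uniform(-1, 1) for _ in range(3)] for _ in range(3))

# Random symmetric matrix A_ij (illustrative data only).
B = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
A = [[0.5 * (B[i][j] + B[j][i]) for j in range(3)] for i in range(3)]
trA = sum(A[i][i] for i in range(3))

# Identity from the proof of Proposition 4:
# dF x (K x dV) = -(K.dF) dV + (dV.dF) K
err_prop4 = max_diff(cross(dF, cross(K, dV)),
                     combo((-dot(K, dF), dV), (dot(dV, dF), K)))

# Identity (29): -A[K x dV] = A[K] x dV - Tr(A) (K x dV) + K x A[dV]
lhs29 = [-c for c in mat_vec(A, cross(K, dV))]
rhs29 = combo((1.0, cross(mat_vec(A, K), dV)),
              (-trA, cross(K, dV)),
              (1.0, cross(K, mat_vec(A, dV))))
err_29 = max_diff(lhs29, rhs29)

print(err_prop4, err_29)
```

Both residuals should vanish up to floating-point rounding. Note that the second identity relies on A being symmetric, which is why the random matrix is symmetrized first.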
Let us now turn to Eq. (30). We will prove:

Proposition 5. The integrability condition for (30) is
$$ (2a - \mathrm{Tr}\,A)\,dV + dU = |K|^2\,\star d\tau, \qquad \mathcal{L}_K\,d\tau = 0, \tag{36} $$
for some one form τ.

Proof. Let us define the 1-form
$$ Y = (2a - \mathrm{Tr}\,A)\,dV + dU. \tag{37} $$
It allows to write (30) and its integrability condition as
$$ dF = -\star\frac{K \wedge Y}{|K|^2}, \qquad \delta\,\frac{K \wedge Y}{|K|^2} = 0, \tag{38} $$
or switching to components
$$ K_i\,\delta\!\left(\frac{Y}{|K|^2}\right) + \frac{Y_s\,\partial_s K_i - K_s\,\partial_s Y_i}{|K|^2} = 0. \tag{39} $$
Let us examine the last terms. Since a and Tr A are invariant under the Killing K, we obtain
$$ Y_s\,\partial_s K_i - K_s\,\partial_s Y_i = -(2a - \mathrm{Tr}\,A)\,\partial_i(K_s\,\partial_s V) - \partial_i(K_s\,\partial_s U), \tag{40} $$
and both terms vanish because V and U are invariant under K. We are left with the vanishing of the divergence of $Y/|K|^2$ from which we conclude (local analysis!) that it must have the structure $\star d\tau$ for some 1-form τ. From its definition it follows that dτ is invariant under K. Using this result we can simplify (30) to
$$ dF + \star\,(K \wedge \star d\tau) = dF - i(K)\,d\tau = 0. \tag{41} $$
Collecting all these results we have:

Proposition 6. Under Assumptions A1 and A2, the quantity
$$ S = A_{ij}(x)\,p_i p_j - 2q\,F\,K_i p_i + C(x) $$
is preserved by the geodesic flow of the multi-centre metrics provided that the integrability constraints
$$ \Delta V = 0, \qquad A[dV] = dU, \qquad (2a - \mathrm{Tr}\,A)\,dV + dU = |K|^2\,\star d\tau \tag{42} $$
and the following relations hold:
$$ \begin{array}{ll} \text{a)} & \mathcal{L}_K V = 0, \qquad \star(K \wedge dV) = dG, \\ \text{b)} & \partial_{(k} A_{ij)} = 0, \qquad A[K] = a\,K, \\ \text{c)} & dF = i(K)\,d\tau, \\ \text{d)} & d(C + 2H\,U) + 2q^2\,(-V\,dU + F\,dG) = 0. \end{array} \tag{43} $$
2.6. Classification of the spatial Killing vectors. An important point, in view of classification, is whether the extra spatial Killing is tri-holomorphic or not. This can be checked thanks to:

Lemma 4. The spatial Killing vector $K_i\,\partial_i$ is tri-holomorphic iff $\epsilon_{ist}\,\partial_{[s} K_{t]} = 0$. Otherwise it is holomorphic.

Proof. From [3] we know that, for a hyperkähler geometry, a Killing may be either holomorphic or tri-holomorphic. As shown in [10] such a vector will be tri-holomorphic iff the differential of the dual 1-form $K = K_i\,dx_i$ has the self-duality opposite to that of the complex structures. A computation shows that this is equivalent to the vanishing of
$$ dK^{(-)} = -\frac12\,\epsilon_{ijk}\,\partial_{[j} K_{k]}\,\Big(E_0 \wedge E_i - \frac12\,\epsilon_{ist}\,E_s \wedge E_t\Big), $$
from which the lemma follows.

Since we are working in a flat three dimensional space, there are essentially two different cases to consider:
1. The Killing K generates a spatial rotation, which we can take, without loss of generality, around the z axis. In this case we have $K_i p_i = L_z$, and this Killing vector is holomorphic with respect to the complex structure $J_3$, defined in Sect. 2.
2. The Killing K generates a spatial translation, which we can take, without loss of generality, along the z axis. In this case we have $K_i p_i = p_z$, and this Killing vector is tri-holomorphic.
We will discuss successively these two possibilities, under the simplifying additional assumption:

A3: the K-S tensor $S_{\mu\nu}$ is also preserved by Lie dragging along the extra spatial Killing vector K, i.e. $\mathcal{L}_K S_{\mu\nu} = 0$.
3. One Extra Holomorphic Spatial Killing Vector Equation (43b) states that Aij is a Killing tensor in flat space. As shown in [13] such a Killing tensor is totally reducible to symmetrized tensor products of Killing vectors
and involves 20 free parameters. It is most conveniently written in terms of $A(p, p) \equiv A_{ij}\,p_i p_j$. One has:
$$ \begin{aligned} A(p,p) ={}& \alpha L_x^2 + \beta L_y^2 + \gamma L_z^2 + 2\mu\,L_y L_z + 2\nu\,L_z L_x + 2\lambda\,L_x L_y \\ &+ a_1\,p_x L_y + a_2\,p_x L_z + b_1\,p_y L_x + b_2\,p_y L_z + c_1\,p_z L_x + c_2\,p_z L_y \\ &+ d_1\,p_x L_x + d_2\,p_y L_y + a_{ij}\,p_i p_j. \end{aligned} \tag{44} $$
The constraint (A3) for the rotational Killing requires $\mathcal{L}_K A_{ij} = 0$, which allows to bring (44) to the form
$$ A(p,p) = \alpha\,(L_x^2 + L_y^2) + \gamma L_z^2 + b\,(\vec p \wedge \vec L)_z + a_{33}\,p_z^2 + a_{11}\,\vec p^{\,2} + \delta\,p_z L_z. \tag{45} $$
We note that the parameter γ corresponds to a reducible piece which is just the square of $L_z$. We will take γ = α for convenience. The parameter $a_{11}$ is easily seen, upon integration of the remaining equations in (17), to give rise, in the conserved quantity S, to the full piece
$$ a_{11}\,(\vec p^{\,2} - 2H\,V + q^2 V^2), \tag{46} $$
which vanishes thanks to the energy conservation (12). So we can take $a_{11} = 0$. The second relation in (43b) implies the vanishing of δ. Hence, with slight changes in the notation, we end up with
$$ A(p,p) = a\,\vec L^{\,2} + c^2\,p_z^2 + b\,(\vec p \wedge \vec L)_z. \tag{47} $$
Let us note that the parameters a and b are real while the parameter c may be either real or pure imaginary. To take advantage of the rotational symmetry around the z axis we use the coordinates
$$ x = \sqrt{\rho}\,\cos\phi, \qquad y = \sqrt{\rho}\,\sin\phi, \qquad z, $$
and write the connection $\Theta = G\,d\phi$. By Lemma 3, this symmetry gives for conserved quantity
$$ J_z = L_z + q\,G = x\,\Pi_y - y\,\Pi_x. \tag{48} $$
From the system (43) one can check that the functions F and U are to be determined from
$$ F_{,\rho} = (az + b/2)\,V_{,\rho} - \frac{a}{2}\,V_{,z}, \qquad F_{,z} = 2(az^2 + bz - c^2)\,V_{,\rho} - (az + b/2)\,V_{,z}, \tag{49} $$
and
$$ U_{,\rho} = z(az + b)\,V_{,\rho} - \frac12\,(az + b/2)\,V_{,z}, \qquad U_{,z} = -2\rho\,(az + b/2)\,V_{,\rho} + (a\rho + c^2)\,V_{,z}. \tag{50} $$
3.1. The two-centre metric. This case corresponds to the choice a = 1 and $c \neq 0$. Since a = 1, we can get rid of the constant b by a translation of the variable z. So, without loss of generality, we can take b = 0 and use the new variables $r_\pm = \sqrt{x^2 + y^2 + (z \pm c)^2}$. We get the relations
$$ \partial_{r_+} F = -c\,\partial_{r_+} V, \qquad \partial_{r_-} F = +c\,\partial_{r_-} V, $$
which imply
$$ V = f(r_+) + g(r_-), \qquad F = -c\,\big(f(r_+) - g(r_-)\big). $$
Imposing to the potential V the Laplace equation we have
$$ V = v_0 + \frac{m_1}{r_+} + \frac{m_2}{r_-}, \qquad F = -c\,\Delta, \qquad \Delta = \frac{m_1}{r_+} - \frac{m_2}{r_-}, \tag{51} $$
i.e. we recover the most general 2-centre metric. Let us recall that only the double Taub-NUT metric, given by real $m_1 = m_2$, is complete. If in addition we take the limit $v_0 \to 0$, we are led to the Eguchi-Hanson [4] metric. One has then to check the integrability constraint (34) and to determine the functions U and C (footnote 2):
$$ U = -c\,z\,\Delta, \qquad C = -2(H - q^2 V)\,U - q^2\,r^2\,\Delta^2, \qquad r^2 = x^2 + y^2 + z^2. \tag{52} $$
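As a sanity check of the two-centre solution, one can verify numerically that V and F of (51) satisfy system (49), read as $F_{,\rho} = (az + b/2)V_{,\rho} - (a/2)V_{,z}$ and $F_{,z} = 2(az^2 + bz - c^2)V_{,\rho} - (az + b/2)V_{,z}$ with $\rho = x^2 + y^2$, and that V is harmonic. The sketch below uses central finite differences; the parameter values and the evaluation point are arbitrary illustrative choices, not values from the paper, and the helper names are hypothetical.

```python
import math

# Illustrative parameter values (assumptions made for this check only).
A_, B_, C_ = 1.0, 0.0, 0.8      # a, b, c of Eq. (47); two-centre case a = 1, b = 0
V0, M1, M2 = 0.5, 1.3, 0.7      # v_0, m_1, m_2

def r_pm(rho, z, sign):
    """r_+ (sign = +1) or r_- (sign = -1), with rho = x^2 + y^2."""
    return math.sqrt(rho + (z + sign * C_) ** 2)

def V(rho, z):
    return V0 + M1 / r_pm(rho, z, +1) + M2 / r_pm(rho, z, -1)

def F(rho, z):
    return -C_ * (M1 / r_pm(rho, z, +1) - M2 / r_pm(rho, z, -1))

def d_rho(f, rho, z, h=1e-5):
    return (f(rho + h, z) - f(rho - h, z)) / (2 * h)

def d_z(f, rho, z, h=1e-5):
    return (f(rho, z + h) - f(rho, z - h)) / (2 * h)

def residuals_49(rho, z):
    """Left-hand side minus right-hand side for the two equations of (49)."""
    Vr, Vz = d_rho(V, rho, z), d_z(V, rho, z)
    res_rho = d_rho(F, rho, z) - ((A_ * z + B_ / 2) * Vr - 0.5 * A_ * Vz)
    res_z = d_z(F, rho, z) - (2 * (A_ * z * z + B_ * z - C_ ** 2) * Vr
                              - (A_ * z + B_ / 2) * Vz)
    return res_rho, res_z

def laplacian_V(x, y, z, h=1e-3):
    """Second-order finite-difference Laplacian of V in cartesian coordinates."""
    g = lambda X, Y, Z: V(X * X + Y * Y, Z)
    c0 = g(x, y, z)
    total = (g(x + h, y, z) + g(x - h, y, z)
             + g(x, y + h, z) + g(x, y - h, z)
             + g(x, y, z + h) + g(x, y, z - h) - 6 * c0)
    return total / h ** 2

x0, y0, z0 = 1.1, 0.7, 0.4      # a point away from both centres
res = residuals_49(x0 * x0 + y0 * y0, z0)
lap = laplacian_V(x0, y0, z0)
print(res, lap)
```

The residuals are limited only by the finite-difference step, so they should be many orders of magnitude smaller than the individual terms.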
Let us observe that the conserved quantity which we obtain may be real even if c is pure imaginary. In this case $m_1 = m$ may be complex, but if we take $m_2 = \bar m$ the functions V and $c\,\Delta$ are real, as well as S. One obtains quite different metrics (as first observed in the particular case of Eguchi-Hanson metric): real c corresponding to type II metric and c pure imaginary corresponding to type I metric, in the terminology of [4]. The final form of the conserved quantity for the two-centre metric is therefore
$$ S_I = \vec L^{\,2} + c^2\,p_z^2 + 2qc\,\Delta\,L_z + 2cz\,\Delta\,(H - q^2 V) - q^2\,r^2\,\Delta^2, $$
$$ V = v_0 + \frac{m_1}{r_+} + \frac{m_2}{r_-}, \qquad \Delta = \frac{m_1}{r_+} - \frac{m_2}{r_-}, \qquad G = m_1\,\frac{z + c}{r_+} + m_2\,\frac{z - c}{r_-}. \tag{53} $$
The relation of our results with the separability of the Hamilton-Jacobi equation for the two-centre metric, obtained in [10], will be discussed in the next section. From the very definition of the coordinates $r_\pm$ it is clear that the previous analysis is only valid for $c \neq 0$. The special case c = 0 (it is a singular limit), giving a first dipolar breaking of the Taub-NUT metric, will be examined now.

3.2. First dipolar breaking of Taub-NUT. This case corresponds to the choice a = 1 and c = 0. Since a = 1, we can again get rid of the parameter b. Then relation (49) for F implies
$$ V = w_0(r) + w_1(r)\,z, \qquad F_{,r} = -r\,w_1(r), \qquad r = \sqrt{x^2 + y^2 + z^2}. \tag{54} $$

Footnote 2. We discard constant terms in the function C.
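Imposing the Laplace equation we obtain (here $\mathcal{F}$ denotes the constant dipole strength, to be distinguished from the conformal factor F)
$$ V = v_0 + \frac{m}{r} + E\,z + \mathcal{F}\,\frac{z}{r^3}, \qquad F = -\frac{E}{2}\,r^2 + \frac{\mathcal{F}}{r}. \tag{55} $$
The integrability relations for U require that E = 0 and we have
$$ U = \mathcal{F}\,\frac{z}{r}, \qquad C = -2\mathcal{F}\,\frac{z}{r}\,(H - q^2 V) - 2mq^2\,\mathcal{F}\,\frac{z}{r^2} - q^2\,\mathcal{F}^2\,\frac{3z^2 - r^2}{r^4}. \tag{56} $$
The final form of the conserved quantity is therefore
$$ S_{II} = \vec L^{\,2} - 2q\,\frac{\mathcal{F}}{r}\,L_z - 2\mathcal{F}\,\frac{z}{r}\,(H - q^2 v_0) + q^2\,\mathcal{F}^2\,\frac{x^2 + y^2}{r^4}, $$
$$ V = v_0 + \frac{m}{r} + \mathcal{F}\,\frac{z}{r^3}, \qquad G = m\,\frac{z}{r} - \mathcal{F}\,\frac{x^2 + y^2}{r^3}. \tag{57} $$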
Let us now consider the case a = 0, which leads to a second dipolar breaking of Taub-NUT.

3.3. Second dipolar breaking of Taub-NUT. This case corresponds to the choice a = 0 and b = 1. The relation (49) shows that by a translation of z we can take, without loss of generality, c = 0. From the integrability of F we deduce $V = f\big(\sqrt{x^2 + y^2 + (z - c)^2}\big) + g(z)$. Hence by a translation of z we can set c to 0. We are left with
$$ V = f(r) + g(z), \qquad F = \frac12\,\big(f(r) - g(z)\big). \tag{58} $$
Imposing the Laplace equation yields
$$ V = v_0 + \frac{m}{r} + E\,z, \qquad F = \frac12\left(\frac{m}{r} - E\,z\right). \tag{59} $$
Then the integrability conditions for U and C are satisfied and we obtain
$$ U = \frac{m\,z}{2r} - \frac{E}{4}\,(x^2 + y^2), \qquad C = -2U\,(H - q^2 v_0) - 2q^2\,mE\,\frac{x^2 + y^2}{r}. \tag{60} $$
The final form of the conserved quantity is therefore
$$ S_{III} = (\vec p \wedge \vec L)_z - q\left(\frac{m}{r} - E\,z\right) L_z - 2U\,(H - q^2 v_0) - 2q^2\,mE\,\frac{x^2 + y^2}{r}, $$
$$ V = v_0 + \frac{m}{r} + E\,z, \qquad G = m\,\frac{z}{r} + \frac{E}{2}\,(x^2 + y^2). \tag{61} $$
For E = 0 we are back to the Taub-NUT metric. In this case the spatial isometries are lifted up from u(1) to su(2). As a result we have now three possible Killings to start with
$$ K^{(1)}_i p_i = L_x, \qquad K^{(2)}_i p_i = L_y, \qquad K^{(3)}_i p_i = L_z, \tag{62} $$
and we expect that the conserved quantity found above should be part of a triplet. The two missing conserved quantities can be constructed following the same route which led to $S_{III}$ using the new available spatial Killings given by (62). We recover
$$ \vec S = \vec p \wedge \vec L - q\,m\,\frac{\vec L}{r} + m\,(q^2 v_0 - H)\,\frac{\vec r}{r}, \qquad S_{III}(E = 0) \equiv S_z. \tag{63} $$
Lemma 3 lifts up $J_z$, given by (48), to a triplet of conserved quantities
$$ \vec J = \vec L + q\,m\,\frac{\vec r}{r}, \tag{64} $$
which allows to write
$$ \vec S = \vec p \wedge \vec J + m\,(q^2 v_0 - H)\,\frac{\vec r}{r}, \tag{65} $$
on which we recognize the generalized Runge-Lenz vector discovered by Gibbons and Manton [8].

We have therefore obtained, for the three hamiltonians $H_I$, $H_{II}$ (with dipole strength $\mathcal{F} \neq 0$) and $H_{III}$, corresponding respectively to the extra conserved quantities $S_I$, $S_{II}$ and $S_{III}$ (the proof of their irreducibility with respect to the Killing vectors is easy), a set of four independent conserved quantities:
$$ H, \qquad q = \Pi_0, \qquad J_z, \qquad S, $$
which can be checked to be in involution with respect to the Poisson bracket. Hence we conclude:

Proposition 7. The three hamiltonians $H_I$, $H_{II}$ ($\mathcal{F} \neq 0$) and $H_{III}$, defined above are integrable in the Liouville sense.

4. One Extra Tri-Holomorphic Spatial Killing Vector

This time we have for Killing $K_i p_i = p_z$. Imposing (A3) for the translational invariance and the constraint $A[K] \propto K$ restricts A(p, p) to have the form
$$ A(p,p) = a\,L_z^2 - 2b\,p_x L_z + 2c\,p_y L_z + \sum_{i,j=1}^{2} a_{ij}\,p_i p_j. \tag{66} $$
We have omitted a term proportional to $p_z^2$ since it is reducible. The functions F and U, which depend only on the coordinates x and y, using the system (43), are seen to be determined by
$$ \begin{aligned} F_{,x} &= A_{12}\,V_{,x} - A_{11}\,V_{,y}, & U_{,x} &= A_{11}\,V_{,x} + A_{12}\,V_{,y}, \\ F_{,y} &= A_{22}\,V_{,x} - A_{12}\,V_{,y}, & U_{,y} &= A_{12}\,V_{,x} + A_{22}\,V_{,y}, \end{aligned} \tag{67} $$
with
$$ A_{11} = a\,y^2 + 2b\,y + a_{11}, \qquad A_{22} = a\,x^2 + 2c\,x + a_{22}, \qquad A_{12} = -a\,xy - b\,x - c\,y + a_{12}. \tag{68} $$
In order to organize the subsequent discussion, let us observe:
1. For $a \neq 0$, we may take a = 1. The spatial translations in the xy-plane allow to take b = c = 0, and a rotation $a_{12} = 0$ as well. Hence we are left with
$$ A(p,p) = L_z^2 + (a_{11} - a_{22})\,p_x^2 + a_{22}\,(p_x^2 + p_y^2). $$
Adding the reducible term $a_{22}\,p_z^2$ we recover the piece $a_{22}\,\vec p^{\,2}$ which can be discarded, as already explained in Sect. 4. So we will take for our first case
$$ A_1(p,p) = L_z^2 - c^2\,p_x^2, \qquad c \in \mathbb{R} \cup i\mathbb{R}, \qquad c \neq 0. \tag{69} $$
2. Our second case, which is the singular limit $c \to 0$ of the first case, corresponds to
$$ A_2(p,p) = L_z^2. \tag{70} $$
3. For a = 0, a first translation allows to take $a_{12} = 0$, while the second one allows the choice $a_{11} = a_{22}$ and the corresponding term $a_{11}\,(p_x^2 + p_y^2)$ is disposed of as in the first case. Eventually a rotation will bring b to zero and c to 1. Our third case will be
$$ A_3(p,p) = p_y\,L_z. \tag{71} $$
4. For a = b = c = 0, we can discard $p_x^2 + p_y^2$ and we are left with our fourth case
$$ A_4(p,p) = \alpha\,p_y^2 + \beta\,p_x p_y. \tag{72} $$
We will state the results obtained for these four cases without going through the detailed computations, which are greatly simplified by the use of the complex coordinate w = x + iy. In all four cases the metric will have the form
$$ g = \frac{1}{V}\,(dt + \Theta)^2 + V\,(dz^2 + dw\,d\bar w), \qquad \Theta = G\,dz. \tag{73} $$
4.1. First case. Writing the conserved quantity as
$$ S_1 = L_z^2 - c^2\,\Pi_x^2 - 2c^2\,F\,\Pi_0\Pi_z + c^2\,(2v_0 U + D)\,\Pi_0^2 - 2c^2\,U\,H, \qquad c \neq 0, \tag{74} $$
where $\Pi_z = p_z + G\,\Pi_0$, we have:
$$ V + iG = v_0 + 2m\,\frac{w}{\sqrt{w^2 + c^2}}, \qquad v_0 \in \mathbb{R}, \quad m \in \mathbb{C}, $$
$$ U + iF = -m\,\frac{w + \bar w}{\sqrt{w^2 + c^2}}, \qquad D = -2|m|^2\,\frac{w^2 + \bar w^2 + |w|^2 + c^2}{|\sqrt{w^2 + c^2}|^2}. \tag{75} $$
In the particular case where $v_0 = 0$, $c \in \mathbb{R}$, $m \in \mathbb{R}$ (resp. $v_0 = 0$, $c \in \mathbb{R}$, $m \in i\mathbb{R}$) the metric reduces to the Bianchi VII$_0$ (resp. Bianchi VI$_0$) multi-centre metric. The integrability of their geodesic flow was first proved in [1].
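4.2. Second case. Writing the conserved quantity as
$$ S_2 = L_z^2 - 2F\,\Pi_0\Pi_z + 2v_0\,U\,\Pi_0^2 - 2U\,H, \tag{76} $$
we have:
$$ V + iG = v_0 + \frac{m}{w^2}, \qquad U + iF = m\,\frac{\bar w}{w}, \qquad v_0 \in \mathbb{R}, \quad m \in \mathbb{C}. \tag{77} $$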
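The second case gives a compact test of system (67): with $A_2 = L_z^2$ one has $A_{11} = y^2$, $A_{22} = x^2$, $A_{12} = -xy$, and the functions read off from $V + iG = v_0 + m/w^2$ and $U + iF = m\,\bar w/w$ should satisfy all four equations. The following sketch checks this by central finite differences. The values of $v_0$, m and the evaluation point are arbitrary assumptions, and the helper names are hypothetical.

```python
V0 = 0.5                      # v_0 (illustrative value)
M = complex(0.9, -0.4)        # complex parameter m (illustrative value)

def fields(x, y):
    """Return (V, U, F) at the point w = x + i y for the second case."""
    w = complex(x, y)
    f = V0 + M / w ** 2               # V + i G
    W = M * w.conjugate() / w         # U + i F
    return f.real, W.real, W.imag

def grad(index, x, y, h=1e-5):
    """Central-difference gradient (d/dx, d/dy) of fields(...)[index]."""
    fx = (fields(x + h, y)[index] - fields(x - h, y)[index]) / (2 * h)
    fy = (fields(x, y + h)[index] - fields(x, y - h)[index]) / (2 * h)
    return fx, fy

def residuals_67(x, y):
    """Left-hand side minus right-hand side of the four equations in (67)."""
    A11, A22, A12 = y * y, x * x, -x * y     # Eq. (68) with a = 1, all else zero
    Vx, Vy = grad(0, x, y)
    Ux, Uy = grad(1, x, y)
    Fx, Fy = grad(2, x, y)
    return (Fx - (A12 * Vx - A11 * Vy),
            Fy - (A22 * Vx - A12 * Vy),
            Ux - (A11 * Vx + A12 * Vy),
            Uy - (A12 * Vx + A22 * Vy))

res67 = residuals_67(1.1, 0.7)
print(res67)
```

The same routine could be reused for the other cases by swapping the expressions for V + iG, U + iF and the coefficients $A_{ij}$, keeping in mind that (74) factors an overall $c^2$ out of U and F in the first case.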
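4.3. Third case. Writing the conserved quantity as
$$ S_3 = \Pi_y\,L_z - 2F\,\Pi_0\Pi_z + (2v_0 U + D)\,\Pi_0^2 - 2U\,H, \tag{78} $$
we have:
$$ V + iG = v_0 + \frac{2m}{\sqrt{w}}, \qquad v_0 \in \mathbb{R}, \quad m \in \mathbb{C}, $$
$$ U + iF = -\frac{m}{2}\,\frac{w - \bar w}{\sqrt{w}}, \qquad D = |m|^2\,\frac{w + \bar w}{|\sqrt{w}|^2}. \tag{79} $$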
4.4. Fourth case. In this case we take for the driving term $A_4(p,p) = \alpha\,p_y^2 + \beta\,p_x p_y$. Using the freedom of rotations in the xy-plane, at the level of the metric, we can take
$$ V = v_0 + m\,x, \qquad G = m\,y. \tag{80} $$
This time there are two conserved quantities
$$ S_4 = \alpha\,S_4^{(1)} + \beta\,S_4^{(2)}, \tag{81} $$
given by
$$ S_4^{(1)} = \Pi_y^2 + (\Pi_z - m\,y\,\Pi_0)^2, \qquad S_4^{(2)} = \Pi_x \Pi_y - V\,\Pi_0\,(\Pi_z - m\,y\,\Pi_0) - m\,y\,H. \tag{82} $$
We added reducible terms of the form $\Pi_z^2$ and $\Pi_z \Pi_0$ to get a simpler final form. The metric exhibits one further tri-holomorphic Killing vector and a corresponding conserved quantity
$$ \partial_y - m\,z\,\partial_t \quad\Rightarrow\quad \Pi_y - m\,z\,\Pi_0. $$
Let us close the algebra of the conserved quantities under Poisson bracket. For the Killing vectors we recover a Bianchi II Lie algebra
$$ \{\Pi_0, \Pi_z\} = 0, \qquad \{\Pi_z, \Pi_y - m\,z\,\Pi_0\} = m\,\Pi_0, \qquad \{\Pi_y - m\,z\,\Pi_0, \Pi_0\} = 0. $$
The K-S tensors are invariant under the Killing vectors action, and it may be interesting to note that the Schouten bracket of the two K-S tensors is vanishing. This hamiltonian is therefore super-integrable.
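The statements of this subsection lend themselves to a direct numerical test. The sketch below assumes the hamiltonian $H = \frac{1}{2V}\big(\Pi_x^2 + \Pi_y^2 + (\Pi_z - m y\,\Pi_0)^2\big) + \frac{V}{2}\Pi_0^2$ with $V = v_0 + m x$, which is (98) specialized to cartesian coordinates and G = m y, and the bracket convention $\{f, g\} = \sum_i (\partial_{q_i} f\,\partial_{p_i} g - \partial_{p_i} f\,\partial_{q_i} g)$. Brackets are evaluated with central finite differences at an arbitrary phase-space point; all names and numerical values are illustrative assumptions.

```python
V0, M = 0.5, 0.3    # v_0 and m (illustrative values)

# Phase-space point s = (t, x, y, z, Pi_0, Pi_x, Pi_y, Pi_z).
def H(s):
    t, x, y, z, P0, Px, Py, Pz = s
    V = V0 + M * x
    return (Px**2 + Py**2 + (Pz - M * y * P0)**2) / (2 * V) + 0.5 * V * P0**2

def S4_1(s):
    t, x, y, z, P0, Px, Py, Pz = s
    return Py**2 + (Pz - M * y * P0)**2

def S4_2(s):
    t, x, y, z, P0, Px, Py, Pz = s
    V = V0 + M * x
    return Px * Py - V * P0 * (Pz - M * y * P0) - M * y * H(s)

def killing_y(s):
    """Conserved quantity Pi_y - m z Pi_0 of the extra Killing vector."""
    t, x, y, z, P0, Px, Py, Pz = s
    return Py - M * z * P0

def pi_0(s):
    return s[4]

def pi_z(s):
    return s[7]

def partial(f, s, i, h=1e-5):
    """Central-difference derivative of f with respect to the i-th variable."""
    up, dn = list(s), list(s)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)

def poisson(f, g, s):
    """{f, g} with coordinates in slots 0..3 and their momenta in slots 4..7."""
    return sum(partial(f, s, i) * partial(g, s, i + 4)
               - partial(f, s, i + 4) * partial(g, s, i) for i in range(4))

point = [0.2, 0.9, -0.4, 0.7, 0.6, -0.3, 0.8, 0.5]

conservation = [poisson(H, S4_1, point),
                poisson(H, S4_2, point),
                poisson(H, killing_y, point)]
bianchi = [poisson(pi_0, pi_z, point),
           poisson(pi_z, killing_y, point) - M * pi_0(point),
           poisson(killing_y, pi_0, point)]
print(conservation, bianchi)
```

All six numbers should vanish up to discretization and rounding error, which illustrates both the conservation of $S_4^{(1)}$, $S_4^{(2)}$ and $\Pi_y - m z\,\Pi_0$ and the Bianchi II relations quoted above.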
To conclude this Sect. let us notice that, among the four potentials considered, only the second one and the fourth one are uniform functions in the three dimensional flat space. As was the case when the extra spatial Killing was holomorphic, we have obtained for the four hamiltonians considered in this section, a set of (at least) four conserved quantities
$$ H, \qquad q = \Pi_0, \qquad \Pi_z, \qquad S, $$
and in all the four cases S is irreducible with respect to the Killing vectors. One can check that these four independent conserved quantities are in involution with respect to the Poisson bracket, hence we have:

Proposition 8. The four hamiltonians determined in this Sect. are integrable in the Liouville sense.

As is well known the existence of K-S tensors is related to the separability of the Hamilton-Jacobi (H-J) equation, or equivalently to the separability of the Schrödinger equation. In the next sections we will analyze the separability of the H-J equation according to the nature of the extra Killing vector.

5. H-J Separability: Extra Holomorphic Killing

We write the metric
$$ g = \frac{1}{V}\,(dt + G\,d\phi)^2 + V\,(\gamma_1\,d\xi_1^2 + \gamma_2\,d\xi_2^2 + \gamma_3\,d\phi^2), \tag{83} $$
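which makes apparent the two commuting Killing vectors $\tilde K = \partial_t$ and $\tilde L = \partial_\phi$, where only the first one is tri-holomorphic. The hamiltonian is
$$ H = \frac{1}{2V}\left(\frac{\Pi_1^2}{\gamma_1} + \frac{\Pi_2^2}{\gamma_2}\right) + \frac{\Pi_\phi^2}{2\gamma_3 V} - \frac{G}{\gamma_3 V}\,\Pi_0 \Pi_\phi + \frac{G^2 + \gamma_3 V^2}{2\gamma_3 V}\,\Pi_0^2. \tag{84} $$
Since the $\gamma_i$'s depend only on $\xi_1$ and $\xi_2$, it follows that $\Pi_0$ and $\Pi_\phi$ are conserved.

5.1. The two-centre case. The H-J equation separability was first used in [10] to get the corresponding K-S tensor. This reference is muddied by so many misprints that we will present its results anew. Separability relies here on the use of spheroidal coordinates $\xi_1 = \zeta$, $\xi_2 = \lambda$, defined by
$$ x = c\,\sqrt{(\zeta^2 - 1)(1 - \lambda^2)}\,\cos\phi, \qquad y = c\,\sqrt{(\zeta^2 - 1)(1 - \lambda^2)}\,\sin\phi, \qquad z = c\,\zeta\lambda. $$
This implies
$$ \gamma_1 = c^2\,\frac{\zeta^2 - \lambda^2}{\zeta^2 - 1}, \qquad \gamma_2 = c^2\,\frac{\zeta^2 - \lambda^2}{1 - \lambda^2}, \qquad \gamma_3 = c^2\,(\zeta^2 - 1)(1 - \lambda^2). $$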
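The potential and connection are
$$ V = v_0 + \frac{\sigma\zeta - \delta\lambda}{c\,(\zeta^2 - \lambda^2)}, \qquad G = \frac{\sigma\lambda\,(\zeta^2 - 1) + \delta\zeta\,(1 - \lambda^2)}{\zeta^2 - \lambda^2}, \tag{85} $$
with $\sigma = m_1 + m_2$ and $\delta = m_1 - m_2$. The hamiltonian is
$$ H = \frac{1}{2c^2 V}\left[\frac{(\zeta^2 - 1)\,\Pi_\zeta^2 + (1 - \lambda^2)\,\Pi_\lambda^2}{\zeta^2 - \lambda^2} + \frac{(\Pi_\phi - G\,\Pi_0)^2}{(\zeta^2 - 1)(1 - \lambda^2)}\right] + \frac{V}{2}\,\Pi_0^2. \tag{86} $$
The separation constants (footnote 3) are
$$ C_\zeta = (\zeta^2 - 1)\,\Pi_\zeta^2 + \frac{\Pi_\phi^2}{\zeta^2 - 1} - 2\delta\,\frac{\zeta}{\zeta^2 - 1}\,\Pi_0\Pi_\phi - 2c\,(v_0 c\,\zeta^2 + \sigma\zeta)\,H + \left(\frac{\delta^2}{\zeta^2 - 1} + v_0^2 c^2\,\zeta^2 + 2v_0 c\,\sigma\zeta\right)\Pi_0^2, \tag{87} $$
and
$$ C_\lambda = (1 - \lambda^2)\,\Pi_\lambda^2 + \frac{\Pi_\phi^2}{1 - \lambda^2} - 2\sigma\,\frac{\lambda}{1 - \lambda^2}\,\Pi_0\Pi_\phi + 2c\,(v_0 c\,\lambda^2 + \delta\lambda)\,H + \left(\frac{\sigma^2}{1 - \lambda^2} - v_0^2 c^2\,\lambda^2 - 2v_0 c\,\delta\lambda\right)\Pi_0^2. \tag{88} $$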
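The spheroidal expressions above can be cross-checked against the cartesian form of the two-centre data. The sketch below assumes $r_\pm = \sqrt{x^2 + y^2 + (z \pm c)^2}$, $V = v_0 + m_1/r_+ + m_2/r_-$ and $G = m_1 (z + c)/r_+ + m_2 (z - c)/r_-$ as in Sect. 3.1, and verifies at one sample point that $r_\pm = c\,(\zeta \pm \lambda)$, that (85) reproduces V and G, and that $C_\zeta + C_\lambda = 0$ when H is given by (86). The parameter and momentum values are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math

def two_centre_checks(zeta, lam, c=0.8, v0=0.5, m1=1.3, m2=0.7,
                      momenta=(0.4, -0.7, 0.3, 0.9)):
    """Return residuals that should all vanish (up to rounding)."""
    P0, Pphi, Pzeta, Plam = momenta          # Pi_0, Pi_phi, Pi_zeta, Pi_lambda
    sigma, delta = m1 + m2, m1 - m2
    a, b = zeta**2 - 1.0, 1.0 - lam**2       # shorthand for the two factors
    s = zeta**2 - lam**2

    # Cartesian data at the corresponding point (phi drops out).
    rho2 = c * c * a * b                     # x^2 + y^2
    z = c * zeta * lam
    r_plus = math.sqrt(rho2 + (z + c)**2)
    r_minus = math.sqrt(rho2 + (z - c)**2)
    V_cart = v0 + m1 / r_plus + m2 / r_minus
    G_cart = m1 * (z + c) / r_plus + m2 * (z - c) / r_minus

    # Spheroidal expressions, Eq. (85).
    V = v0 + (sigma * zeta - delta * lam) / (c * s)
    G = (sigma * lam * a + delta * zeta * b) / s

    # Hamiltonian, Eq. (86).
    H = ((a * Pzeta**2 + b * Plam**2) / s
         + (Pphi - G * P0)**2 / (a * b)) / (2 * c * c * V) + 0.5 * V * P0**2

    # Separation constants, Eqs. (87) and (88).
    C_zeta = (a * Pzeta**2 + Pphi**2 / a - 2 * delta * zeta / a * P0 * Pphi
              - 2 * c * (v0 * c * zeta**2 + sigma * zeta) * H
              + (delta**2 / a + v0**2 * c**2 * zeta**2
                 + 2 * v0 * c * sigma * zeta) * P0**2)
    C_lam = (b * Plam**2 + Pphi**2 / b - 2 * sigma * lam / b * P0 * Pphi
             + 2 * c * (v0 * c * lam**2 + delta * lam) * H
             + (sigma**2 / b - v0**2 * c**2 * lam**2
                - 2 * v0 * c * delta * lam) * P0**2)

    return (r_plus - c * (zeta + lam), r_minus - c * (zeta - lam),
            V - V_cart, G - G_cart, C_zeta + C_lam)

checks = two_centre_checks(1.6, 0.3)
print(checks)
```

The last entry illustrates the remark of footnote 3 that each couple of separation constants adds up to zero once H is expressed through the momenta.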
The knowledge of these separation constants is of paramount importance since it reduces the integration of the H-J equation to quadratures. Indeed writing
$$ S = t\,\Pi_0 + \phi\,\Pi_\phi + A(\zeta) + B(\lambda), $$
one has just to replace $\Pi_\zeta$ by $dA/d\zeta$ in (87) and $\Pi_\lambda$ by $dB/d\lambda$ in (88) to get the relevant separated differential equations. In practice the final integrations may be quite tough. Some algebra allows us to relate the conserved quantity obtained in Sect. 3 to the separation constants, with the final simple result
$$ S_I = C_\lambda - (\sigma^2 + \delta^2)\,\Pi_0^2. \tag{89} $$
In [10] it was conjectured that in the Taub-NUT limit $c \to 0$ this separation constant could be related to some component of the generalized Runge-Lenz vector. We can check that this is not true since, using relation (53), we get
$$ \lim_{c \to 0} S_I = \vec L^{\,2} - \delta^2\,\Pi_0^2. \tag{90} $$

Footnote 3. In all that follows each couple of separation constants add up to zero.
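5.2. First dipolar breaking. The H-J equation does separate in spherical coordinates $\xi_1 = r$, $\xi_2 = \theta$, for which we have
$$ \gamma_1 = 1, \qquad \gamma_2 = r^2, \qquad \gamma_3 = r^2 \sin^2\theta, $$
and (with $\mathcal{F}$ the constant dipole strength)
$$ V = v_0 + \frac{m}{r} + \mathcal{F}\,\frac{\cos\theta}{r^2}, \qquad G = m\cos\theta - \mathcal{F}\,\frac{\sin^2\theta}{r}. \tag{91} $$
The separation constants in the H-J equation are
$$ C_r = r^2\,\Pi_r^2 + 2\,\frac{\mathcal{F}}{r}\,\Pi_0\Pi_\phi + \left(v_0^2\,r^2 + 2v_0 m\,r + \frac{\mathcal{F}^2}{r^2}\right)\Pi_0^2 - 2\,(v_0\,r^2 + m\,r)\,H, \tag{92} $$
and
$$ C_\theta = \Pi_\theta^2 + \frac{\Pi_\phi^2}{\sin^2\theta} - 2m\,\frac{\cos\theta}{\sin^2\theta}\,\Pi_0\Pi_\phi + \left(\frac{m^2}{\sin^2\theta} + 2v_0\,\mathcal{F}\cos\theta\right)\Pi_0^2 - 2\mathcal{F}\cos\theta\,H. \tag{93} $$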
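The relation with the K-S tensor of Sect. 3 is $S_{II} = C_\theta - m^2\,\Pi_0^2$.

5.3. Second dipolar breaking. The H-J equation does separate in parabolic coordinates $\xi_1 = \xi$, $\xi_2 = \eta$, for which we have
$$ \gamma_1 = \frac{\xi + \eta}{4\xi}, \qquad \gamma_2 = \frac{\xi + \eta}{4\eta}, \qquad \gamma_3 = \xi\eta, $$
and
$$ V = v_0 + \frac{2m}{\xi + \eta} + \frac{E}{2}\,(\xi - \eta), \qquad G = m\,\frac{\xi - \eta}{\xi + \eta} + \frac{E}{2}\,\xi\eta. \tag{94} $$
The separation constants in the H-J equation are
$$ C_\xi = 4\xi\,\Pi_\xi^2 + \frac{\Pi_\phi^2}{\xi} + 2\left(\frac{m}{\xi} - \frac{E}{2}\,\xi\right)\Pi_0\Pi_\phi - 2\left(m + v_0\,\xi + \frac{E}{2}\,\xi^2\right)H + \left(\frac{m^2}{\xi} + 2v_0 m + (v_0^2 + 3mE)\,\xi + v_0 E\,\xi^2 + \frac{E^2}{4}\,\xi^3\right)\Pi_0^2, \tag{95} $$
and
$$ C_\eta = 4\eta\,\Pi_\eta^2 + \frac{\Pi_\phi^2}{\eta} - 2\left(\frac{m}{\eta} + \frac{E}{2}\,\eta\right)\Pi_0\Pi_\phi - 2\left(m + v_0\,\eta - \frac{E}{2}\,\eta^2\right)H + \left(\frac{m^2}{\eta} + 2v_0 m + (v_0^2 - 3mE)\,\eta - v_0 E\,\eta^2 + \frac{E^2}{4}\,\eta^3\right)\Pi_0^2. \tag{96} $$
The relation with the K-S tensor of Sect. 3 is $S_{III} = -\frac12\,C_\xi$. Having settled the case of an extra holomorphic Killing vector let us now consider the case of an extra tri-holomorphic Killing vector.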
6. H-J Separability: Extra Tri-Holomorphic Killing

We write the metric in the form
$$ g = \frac{1}{V}\,(dt + G\,dz)^2 + V\,\big(dz^2 + \gamma_1\,d\xi_1^2 + \gamma_2\,d\xi_2^2\big), \tag{97} $$
where the coordinates $\xi_1$ and $\xi_2$ will be appropriate coordinates in the xy-plane which will ensure separability. The two commuting Killing vectors $\tilde K = \partial_t$ and $\tilde L = \partial_z$, both tri-holomorphic, are apparent. The hamiltonian is
$$ H = \frac{1}{2V}\left(\frac{\Pi_1^2}{\gamma_1} + \frac{\Pi_2^2}{\gamma_2}\right) + \frac{\Pi_z^2}{2V} - \frac{G}{V}\,\Pi_0\Pi_z + \frac{V^2 + G^2}{2V}\,\Pi_0^2. \tag{98} $$
It follows that $\Pi_0$ and $\Pi_z$ are conserved.
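6.1. First case. We restrict ourselves to the case of real c and use elliptic coordinates $\xi_1 = \xi$ and $\xi_2 = \eta$ in the xy-plane defined by
$$ x = \frac{1}{c}\,\sqrt{(\xi^2 - c^2)(c^2 - \eta^2)}, \qquad y = \frac{1}{c}\,\xi\eta. $$
For convenience, we will define
$$ \hat\xi = \xi\,\sqrt{\xi^2 - c^2}, \qquad \hat\eta = \eta\,\sqrt{c^2 - \eta^2}. $$
The first case corresponds to
$$ \gamma_1 = \frac{\xi^2 - \eta^2}{\xi^2 - c^2}, \qquad \gamma_2 = \frac{\xi^2 - \eta^2}{c^2 - \eta^2}, \qquad V = v_0 + \frac{a\,\hat\xi + b\,\hat\eta}{\xi^2 - \eta^2}, \qquad G = \frac{-b\,\hat\xi + a\,\hat\eta}{\xi^2 - \eta^2}. \tag{99} $$
The separation constants in the H-J equation are
$$ C_\xi = (\xi^2 - c^2)\,\Pi_\xi^2 + \big[v_0^2\,\xi^2 + 2v_0\,a\,\hat\xi + (a^2 + b^2)(\xi^2 - c^2/2)\big]\,\Pi_0^2 + 2b\,\hat\xi\,\Pi_0\Pi_z + \xi^2\,\Pi_z^2 - 2\,(v_0\,\xi^2 + a\,\hat\xi)\,H, \tag{100} $$
and
$$ C_\eta = (c^2 - \eta^2)\,\Pi_\eta^2 + \big[-v_0^2\,\eta^2 + 2v_0\,b\,\hat\eta + (a^2 + b^2)(\eta^2 - c^2/2)\big]\,\Pi_0^2 - 2a\,\hat\eta\,\Pi_0\Pi_z - \eta^2\,\Pi_z^2 + 2\,(v_0\,\eta^2 - b\,\hat\eta)\,H. \tag{101} $$
The relation with the K-S tensor obtained in Sect. 4 is
$$ S_1 = -C_\xi + c^2\,(\Pi_z^2 + v_0^2\,\Pi_0^2 - 2v_0\,H). \tag{102} $$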
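The elliptic-coordinate potential of this subsection can be compared with the complex form $V + iG = v_0 + 2m\,w/\sqrt{w^2 + c^2}$ of Sect. 4.1. The sketch below assumes the identification $2m = a - ib$, which is an inference from matching the two expressions and is not stated explicitly in the text, and it uses the principal branch of the complex square root, which is adequate for $\xi > c > \eta > 0$. The numerical values and the function name are illustrative assumptions.

```python
import cmath
import math

def elliptic_mismatch(xi, eta, c=1.0, v0=0.5, a=0.8, b=-0.6):
    """|(V + iG) from the complex form - (V + iG) from the elliptic form|."""
    # Cartesian point from elliptic coordinates (valid for xi > c > eta > 0).
    x = math.sqrt((xi**2 - c**2) * (c**2 - eta**2)) / c
    y = xi * eta / c
    w = complex(x, y)

    # Complex form, with the assumed identification 2m = a - i b.
    m = complex(a, -b) / 2
    f_complex = v0 + 2 * m * w / cmath.sqrt(w * w + c * c)

    # Elliptic form of V and G.
    xi_hat = xi * math.sqrt(xi**2 - c**2)
    eta_hat = eta * math.sqrt(c**2 - eta**2)
    V = v0 + (a * xi_hat + b * eta_hat) / (xi**2 - eta**2)
    G = (-b * xi_hat + a * eta_hat) / (xi**2 - eta**2)

    return abs(f_complex - complex(V, G))

mismatch = elliptic_mismatch(1.5, 0.4)
print(mismatch)
```

The agreement rests on the identity $w/\sqrt{w^2 + c^2} = (\hat\xi + i\hat\eta)/(\xi^2 - \eta^2)$, which follows from $\sqrt{w^2 + c^2} = \big(\xi\sqrt{c^2 - \eta^2} + i\eta\sqrt{\xi^2 - c^2}\big)/c$ in the stated coordinate range.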
6.2. Second case. We use polar coordinates $\xi_1 = r$, $\xi_2 = \phi$ in the xy-plane. The second case corresponds to
$$ \gamma_1 = 1, \qquad \gamma_2 = r^2, \qquad V = v_0 + m\,\frac{\cos(2\phi)}{r^2}, \qquad G = -m\,\frac{\sin(2\phi)}{r^2}. \tag{103} $$
The separation constants in the H-J equation are
$$ C_r = r^2\,(\Pi_r^2 + \Pi_z^2) + \left(v_0^2\,r^2 + \frac{m^2}{r^2}\right)\Pi_0^2 - 2v_0\,r^2\,H, \qquad C_\phi = \Pi_\phi^2 + 2m\,\sin(2\phi)\,\Pi_0\Pi_z + 2m\,\cos(2\phi)\,\big(v_0\,\Pi_0^2 - H\big). \tag{104} $$
The relation with the K-S tensor obtained in Sect. 4 is $S_2 = C_\phi$.
6.3. Third case. We use squared parabolic coordinates $\xi_1 = \xi$, $\xi_2 = \eta$ in the xy-plane. The third case corresponds to
$$ \gamma_1 = \gamma_2 = \xi^2 + \eta^2, \qquad V = \frac{a\,\xi + b\,\eta}{\xi^2 + \eta^2}, \qquad G = \frac{b\,\xi - a\,\eta}{\xi^2 + \eta^2}. \tag{105} $$
The separation constants in the H-J equation are
$$ C_\xi = \Pi_\xi^2 + (\xi\,\Pi_z - b\,\Pi_0)^2 + \tfrac12\,(a^2 - b^2)\,\Pi_0^2 - 2a\,\xi\,H, \qquad C_\eta = \Pi_\eta^2 + (\eta\,\Pi_z + a\,\Pi_0)^2 - \tfrac12\,(a^2 - b^2)\,\Pi_0^2 - 2b\,\eta\,H. \tag{106} $$
The relation with the K-S tensor obtained in Sect. 4 is $S_3 = -\frac12\,C_\xi$.

6.4. Fourth case. We use cartesian coordinates $\xi_1 = x$, $\xi_2 = y$ in the xy-plane. The fourth case corresponds to
$$ \gamma_1 = \gamma_2 = 1, \qquad V = v_0 + m\,x, \qquad G = m\,y. \tag{107} $$
The separation constants in the H-J equation are
$$ C_x = \Pi_x^2 + V^2\,\Pi_0^2 - 2V\,H, \qquad C_y = \Pi_y^2 + (\Pi_z - m\,y\,\Pi_0)^2. \tag{108} $$
The relation with the K-S tensors obtained in Sect. 4 is merely $S_4^{(1)} = C_y$. As a conclusion of these last two sections let us observe that the separable coordinates, known for the various potentials V, lift up, without any modification, to separable coordinates for the four dimensional system. Let us turn now to the Killing-Yano tensors.
7. Killing-Yano Tensors

An antisymmetric tensor $Y_{\mu\nu}$ is a Killing-Yano (K-Y) tensor iff
$$ \nabla_{(\mu} Y_{\nu)\rho} = 0. \tag{109} $$
A complex structure is therefore a K-Y tensor. The usefulness of such a concept is related to the fact that the symmetrized tensor product of two K-Y tensors does give a K-S tensor, as can be checked by an easy computation. Clearly the triplet of complex structures shared by the multi-centre metrics is not very useful since it gives only trivial K-S tensors so we need extra K-Y tensors. It is the aim of this section to give new examples of these extra K-Y tensors which will give some explicit K-S tensors which do not satisfy Assumption A3. We have been able to obtain K-Y tensors for
1. The special case of the second dipolar breaking, corresponding to $V = v_0 + E\,z$.
2. The fourth case with an extra tri-holomorphic Killing vector, with potential $V = v_0 + m\,x$.
Let us consider successively these two cases.

7.1. Special second dipolar breaking. For m = 0 the metric simplifies to
$$ g = \frac{1}{4V}\,(2\,dt - E\,y\,dx + E\,x\,dy)^2 + V\,(dx^2 + dy^2 + dz^2), \qquad V = v_0 + E\,z. \tag{110} $$
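We have four Killing vectors
$$ \partial_t, \qquad x\,\partial_y - y\,\partial_x, \qquad \partial_x + \frac{E\,y}{2}\,\partial_t, \qquad \partial_y - \frac{E\,x}{2}\,\partial_t, \tag{111} $$
and the induced conserved quantities have simple Poisson brackets: $\Pi_0$ is central and for the remaining ones
$$ \{J_z, p_x\} = p_y, \qquad \{J_z, p_y\} = -p_x, \qquad \{p_x, p_y\} = E\,\Pi_0, \tag{112} $$
with
$$ J_z = x\,\Pi_y - y\,\Pi_x, \qquad p_x = \Pi_x + \frac{E\,y}{2}\,\Pi_0, \qquad p_y = \Pi_y - \frac{E\,x}{2}\,\Pi_0. $$
Using the canonical vierbein one gets for the K-Y two-form
$$ Y = -E^2\,E_0 \wedge (x\,E_1 + y\,E_2) + E^2\,(x\,E_2 \wedge E_3 + y\,E_3 \wedge E_1) + 2E\,V\,E_1 \wedge E_2. \tag{113} $$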
From it and the complex structures we can construct four K-S tensors
$$ Y^2, \qquad S_i = Y\,\Omega_i^{(-)} + \Omega_i^{(-)}\,Y, \qquad i = 1, 2, 3. $$
We will quote the corresponding conserved quantities instead of the K-S tensors, for the ease of comparison with our earlier results:
$$ \begin{aligned} \frac{Y^2}{E^2} &\to -4V\,(\Pi_x^2 + \Pi_y^2) + E^2\,(x^2 + y^2)\,V\,\Pi_0^2 + 4E\,\Pi_\phi\,(x\,\Pi_x + y\,\Pi_y) - 2E^2\,(x^2 + y^2)\,H, \\ S_1 &\to 4E\,V\,\Pi_0\,p_y - 4E\,\Pi_\phi\,p_x + 4E^2\,x\,H, \\ S_2 &\to -4E\,V\,\Pi_0\,p_x - 4E\,\Pi_\phi\,p_y + 4E^2\,y\,H, \\ S_3 &\to 4E\,(p_x^2 + p_y^2). \end{aligned} \tag{114} $$
Let us observe that $S_3$ is reducible and that $S_1$ and $S_2$ do not satisfy Assumption A3, so we are left with $Y^2$. Some algebra shows how it is related to the conserved quantity obtained in Sect. 3:
$$ S_{III}(m = 0) = -\frac{Y^2}{4E^3} - \frac{v_0}{4E^2}\,S_3 - v_0\,\Pi_0\,J_z, \tag{115} $$
so that, up to reducible terms, the two conserved quantities are one and the same. This case is quite similar to the Kerr metric (albeit much simpler) for which the Carter K-S tensor is in fact the square of some K-Y tensor.
7.2. The fourth case. Using the canonical vierbein one gets for the K-Y two-form
$$ Y = -m\,y\,\Omega_2^{(-)} - m\,z\,\Omega_3^{(-)} + 2V\,E_2 \wedge E_3. \tag{116} $$
Defining $p_z = \Pi_z - G\,\Pi_0$, we can write the induced conserved quantities:
$$ \begin{aligned} \frac{Y^2}{4} &\to -V\,\Pi_y^2 - V\,p_z^2 + m\,y\,\Pi_x\Pi_y - m\,y\,V\,\Pi_0\,p_z + m\,z\,\Pi_x\,p_z + m\,z\,V\,\Pi_0\Pi_y - \frac{m^2}{2}\,(y^2 + z^2)\,H, \\ S_1 &\to \Pi_y^2 + p_z^2, \\ \frac{S_2}{4} &\to -\Pi_x\Pi_y + V\,\Pi_0\,p_z + m\,y\,H, \\ \frac{S_3}{4} &\to -\Pi_x\,p_z - V\,\Pi_0\Pi_y + m\,z\,H. \end{aligned} \tag{117} $$
We see that $S_1$ and $S_2$ were already obtained in Sect. 4. The other two are missing since they don't satisfy our Assumption A3. Notice also that the conserved quantity $S_4^{(2)}$ cannot be obtained in that way. So this example is of some interest since it shows that there do exist K-S tensors which do not satisfy Assumption A3. However, since the corresponding conserved quantities do not commute with $\Pi_z$, they are of no use to prove integrability.

8. Conclusion

We have settled the problem of finding all the multi-centre metrics which do exhibit some extra conserved quantity, having the structure (16), under the assumptions A1 to A3. Since it is induced by a K-S tensor, this conserved quantity is quadratic with respect to the momenta, and preserved by the geodesic flow. As we have observed, the existence of this extra conserved quantity is essential to obtain integrability.
However one should keep in mind that our analysis does not cover all the integrable multi-centre metrics, since integrability could emerge from the existence of more complicated conserved quantities. In fact the concept of the Killing-Stäckel tensor can be generalized to symmetric (n, 0) tensors with $n \geq 3$ such that $\nabla_{(\lambda} S_{\mu_1 \cdots \mu_n)} = 0$. It follows that the geodesic flow preserves the quantity $S_{\mu_1 \cdots \mu_n}\,\dot x^{\mu_1} \cdots \dot x^{\mu_n}$. The corresponding invariants will be cubic, quartic, etc. with respect to the momenta. Little is known about the existence of such conserved quantities, which could possibly produce new integrable multi-centre metrics.

Let us conclude by putting some emphasis on the purely local nature of our analysis: it makes no difference between complete and non-complete metrics. For instance in Sect. 3 we have seen that the most general two-centre metric is integrable, however it is complete only for real $m_1 = m_2$, i.e. for the double Taub-NUT metric.

References

1. Aliev, A.N., Hortacsu, M., Kalayci, J., Nutku, Y.: Class. Quant. Grav. 16, 189–210 (1999)
2. Belinskii, V.A., Gibbons, G.W., Page, D.N., Pope, C.N.: Phys. Lett. B 76, 433–435 (1978)
3. Boyer, C.P., Finley, J.D.: J. Math. Phys. 23, 1126–1130 (1982)
4. Eguchi, T., Hanson, A.J.: Phys. Lett. B 74, 249–251 (1978)
5. Feher, L.G., Horváthy, P.A.: Phys. Lett. B 183, 182–186 (1987)
6. Gegenberg, J.D., Das, A.: Gen. Rel. Grav. 16, 817–829 (1984)
7. Gibbons, G., Hawking, S.: Phys. Lett. B 78, 430–432 (1978)
8. Gibbons, G.W., Manton, N.S.: Nucl. Phys. B 274, 183–224 (1986)
9. Gibbons, G.W., Olivier, D., Ruback, P.J., Valent, G.: Nucl. Phys. B 296, 679–696 (1988)
10. Gibbons, G.W., Ruback, P.J.: Commun. Math. Phys. 115, 267–300 (1988)
11. Hitchin, N.: Math. Proc. Camb. Phil. Soc. 85, 465–476 (1979)
12. Hitchin, N.: Monopoles, minimal surfaces and algebraic curves. In: NATO Advanced Study Institute n° 105, Montreal, Canada: Presses Université de Montreal, 1987
13. Katzin, H., Levine, J.: Tensor 16, 97 (1965)
14. Kloster, S., Som, M., Das, A.: J. Math. Phys. 15, 1096–1102 (1974)
15. Mignemi, S.: J. Math. Phys. 32, 3047–3054 (1991)
16. Perelomov, A.M.: Integrable systems of classical mechanics and Lie algebras. Basel-Boston-Berlin: Birkhäuser Verlag, 1990
17. Tod, K.P., Ward, R.S.: Proc. Roy. Soc. London A 368, 411–427 (1979)
Communicated by H. Nicolai
Commun. Math. Phys. 244, 595–642 (2004) Digital Object Identifier (DOI) 10.1007/s00220-003-1008-0
Communications in
Mathematical Physics
Ising Models with Four Spin Interaction at Criticality
Vieri Mastropietro
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy
Received: 30 January 2003 / Accepted: 12 June 2003
Published online: 12 December 2003 – © Springer-Verlag 2003
Abstract: We consider two bidimensional Ising models coupled by an interaction quartic in the spins, like in the spin representation of the Eight vertex or the Ashkin-Teller model. By Renormalization Group methods we write a convergent perturbative expansion for the specific heat and for the energy-energy correlation up to the critical temperature. A form of nonuniversality is proved, in the sense that the critical behaviour is described in terms of critical indices which are non trivial functions of the coupling. The logarithmic singularity of the specific heat of the Ising model is removed or changed into a power law (with a non universal critical index) depending on the sign of the interaction.

1. Main Results

1.1. Much of our understanding about phase transitions and critical behaviour of classical spin systems on a 2D lattice is based on some remarkable exact solutions. Onsager [O] solved the Ising model, in which the spins take two values and only nearest-neighbor two spin interactions are considered. Lieb [Li] and Baxter [B] solved respectively the Six vertex and Eight vertex models; in their original formulation such models are vertex models (to each site of a bidimensional lattice is associated a vertex with four arrows) but via a suitable identification of the parameters they can be written as two Ising models coupled by a four spin interaction [W]. The critical exponents describing the behaviour of the system close to the critical point can be exactly computed; it is remarkable that the critical indices in the Ising and in the vertex models are different. The exact solutions provide indeed a lot of detailed information about such integrable models; however even very small and apparently harmless modifications of them completely destroy their integrability. On the other hand one can hope that many relevant properties of the integrable models are quite “robust” under perturbations.
It is believed that a universality property holds for the Ising model, in the sense that by adding to it, for instance, a next to nearest neighbor or a four spin interaction the critical indices remain unchanged. A universality property is believed to hold also for the Eight vertex
model; Kadanoff [K] by “operator algebra and scaling theory” found evidence that the Eight vertex model is in the same class of universality of the Ashkin-Teller model [AT], which is not integrable. Other evidence for such a conclusion was found in [PB] (using second order renormalization group arguments) and [LP, N] (by a heuristic mapping of both models into the massive Luttinger model describing interacting fermions in the continuum). The natural method to relate non-integrable models to integrable ones is given by the Renormalization Group (RG); this was known for a long time but the main open problem in this context was to implement such a method in a rigorous way. While RG methods were generally applied to the spin variables, it was realized in recent times that it can be convenient to do this in the fermionic representation of spin models. The fermionic representation of the Ising model was done in [SML, H, Ka, MW, S, ID] and it was shown that the correlations can be written as Grassmann integrals formally describing noninteracting fermions on a lattice in d = 1 + 1. In the same way Ising models with quartic interactions can be written as Grassmann integrals formally describing interacting nonrelativistic fermions. The rigorous analysis of Grassmann integrals for non-relativistic fermions via RG methods is quite well developed, starting from [G] and [BG1] (see also [BG] or [GM] for extensive reviews) and one can apply such methods to classical 2D spin systems (such methods were already applied to a closely related problem, the XYZ Heisenberg spin chain [BM]; the relation between the Eight vertex and XYZ model is well known, [Su, Ba]). Fermionic RG methods for classical spin models have been applied first in [PS] to the Ising model with a small next to nearest neighbor or four spin interaction. 
A form of universality was established in the sense that the interaction does not change certain critical indices; the fermionic interaction is, in this case, irrelevant in the RG sense and the fixed point of the RG transformation is the free one. The aim of the present paper is to study two Ising models coupled by an interaction quartic in the spins, such that both the Eight vertex and the Ashkin-Teller models are included: the system is, in general, non-integrable. The specific heat and the energy-energy correlation are written as Grassmann integrals and studied by RG methods. In this case the fermionic interaction is marginal and the RG transformation has a line of fixed points. The critical behaviour is different from that of the Ising model, and it is described in terms of critical indices which are analytic non-trivial functions of $\lambda$. In agreement with [K] we find that the behaviour of the system is largely independent of the details of the quartic interaction. In our analysis no use is made of the Six or Eight vertex model exact solutions; we use instead some properties which can be deduced from the solution [ML] of the (massless) Luttinger model following a strategy first outlined (in purely fermionic models) in [BG1]. Our analysis establishes as a mathematically rigorous statement the claim in [K, LP, N, PB] that the Gaussian boson model, the massive Luttinger model, the Eight vertex and the Ashkin-Teller models are in the same universality class.

1.2. We consider two Ising models coupled by a four spin interaction bilinear in the energy densities of the two sublattices. Given a square lattice $\Lambda \subset \mathbb{Z}^2$ with side $M$ and periodic boundary conditions, we call $\mathbf{x} = (x, x_0)$ a site of $\Lambda$. If $\sigma^{(1)}_{\mathbf{x}} = \pm 1$ and $\sigma^{(2)}_{\mathbf{x}} = \pm 1$, we write the following Hamiltonian
\[
H(\sigma^{(1)}, \sigma^{(2)}) = H_I(\sigma^{(1)}) + H_I(\sigma^{(2)}) + V(\sigma^{(1)}, \sigma^{(2)}) \equiv \sum_{x,x_0=1}^{M} H_{\Lambda,\mathbf{x}}(\sigma^{(1)}, \sigma^{(2)}),
\tag{1.1}
\]
Fig. 1. The spins involved in the interaction of the models in Eq. (1.1). The heavy dots and lines or the light dots and lines mark the Ising lattices and the nearest neighbors Ising couplings. The ellipses symbolize the Ashkin–Teller four spin interactions and the circles the Eight vertex four spin interactions couplings
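where, if $\alpha = 1, 2$,
\[
H_I(\sigma^{(\alpha)}) = -J \sum_{x,x_0=1}^{M} \big[ \sigma^{(\alpha)}_{x,x_0} \sigma^{(\alpha)}_{x+1,x_0} + \sigma^{(\alpha)}_{x,x_0} \sigma^{(\alpha)}_{x,x_0+1} \big]
\tag{1.2}
\]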
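is the Ising model hamiltonian and $V(\sigma^{(1)}, \sigma^{(2)})$ is the interaction between the Ising systems
\[
\begin{aligned}
V(\sigma^{(1)}, \sigma^{(2)}) = -\lambda \sum_{x,x_0=1}^{M} \Big\{ & a \big[ \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x+1,x_0} \sigma^{(2)}_{x,x_0} \sigma^{(2)}_{x+1,x_0} + \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x,x_0+1} \sigma^{(2)}_{x,x_0} \sigma^{(2)}_{x,x_0+1} \big] \\
& + b \big[ \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x+1,x_0} \sigma^{(2)}_{x,x_0} \sigma^{(2)}_{x,x_0+1} + \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x,x_0+1} \sigma^{(2)}_{x-1,x_0+1} \sigma^{(2)}_{x,x_0+1} \big] \Big\} .
\end{aligned}
\tag{1.3}
\]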
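If $b = 0$ the Hamiltonian (1.1) coincides with the Hamiltonian of the spin representation [F] of the Ashkin-Teller model [AT]; if $a = 0$ it coincides with the spin representation [W] of the Eight vertex model. For a given observable $O(\mathbf{x})$ localized near $\mathbf{x}$ we define the correlation
\[
\langle O(\mathbf{x}) O(\mathbf{y}) \rangle_{\Lambda} = \frac{1}{Z} \sum_{\sigma^{(1)}_{\mathbf{x}}, \sigma^{(2)}_{\mathbf{x}} = \pm 1,\ \mathbf{x} \in \Lambda} O(\mathbf{x}) O(\mathbf{y})\, e^{-H(\sigma^{(1)}, \sigma^{(2)})},
\tag{1.4}
\]
where $Z = \sum_{\sigma^{(1)}_{\mathbf{x}}, \sigma^{(2)}_{\mathbf{x}} = \pm 1,\ \mathbf{x} \in \Lambda} e^{-H(\sigma^{(1)}, \sigma^{(2)})}$ is the partition function. The truncated correlation of the observable $O(\mathbf{x})$ is
\[
\langle O(\mathbf{x}) O(\mathbf{y}) \rangle_{\Lambda,T} = \langle O(\mathbf{x}) O(\mathbf{y}) \rangle_{\Lambda} - \langle O(\mathbf{x}) \rangle_{\Lambda} \langle O(\mathbf{y}) \rangle_{\Lambda},
\tag{1.5}
\]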
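and the energy-energy truncated correlation function is given by (1.5) with $O(\mathbf{x}) = H_{\Lambda,\mathbf{x}}(\sigma^{(1)}, \sigma^{(2)})$; the specific heat $C_v^{\lambda}$ is
\[
C_v^{\lambda} = \lim_{|\Lambda| \to \infty} \frac{1}{|\Lambda|} \sum_{\mathbf{x},\mathbf{y} \in \Lambda} \langle H_{\Lambda,\mathbf{x}}(\sigma^{(1)}, \sigma^{(2)})\, H_{\Lambda,\mathbf{y}}(\sigma^{(1)}, \sigma^{(2)}) \rangle_{\Lambda,T} .
\tag{1.6}
\]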
If $\lambda = 0$ the model reduces to two independent Ising models and close to the critical temperature (equal for both) it is
\[
C_v^{0} \simeq -C_1 \log \left| \frac{J_c}{J} - 1 \right| + C_2 ,
\tag{1.7}
\]
where $C_1, C_2$ are positive constants and $\tanh J_c = \sqrt{2} - 1$, see [MW] Eq. (3.58). The truncated correlation of the observable $O(\mathbf{x}) = H_{I,\mathbf{x}}(\sigma^{(\alpha)})$ for $\lambda = 0$ has the property $|\langle O(\mathbf{x}) O(\mathbf{y}) \rangle_T| \le C e^{-A |t - t_c| |\mathbf{x} - \mathbf{y}|}$ with $A, C$ suitable constants. We expect that the interaction changes the value of the critical temperature (i.e. of $J_c$) by quantities $O(\lambda)$. However it is convenient to keep the critical singularity at a $\lambda$-independent value; we shall show that this can be done by properly choosing the molecular energy parameter $J$ as a function of $\lambda$. Therefore we consider the model (1.1) with $J_r$ replacing $J$, and we shall choose $J_r = J + O(\lambda)$ so that the critical point is located precisely at $J = \tanh^{-1}(\sqrt{2} - 1)$. Denoting by $N$ an arbitrary positive integer, fixing $a + b \ne 0$ and with the notations $t \equiv \tanh J$, $\tanh J_r \stackrel{\rm def}{=} \tanh J + \nu(\lambda)$ and $t_c \equiv \sqrt{2} - 1$, we shall rigorously derive the following result.
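Theorem. Assume $a = 0$ or $b = 0$. There are $C, C_N, C_1, C_2, \tau, \tilde Z_1$, positive $\lambda$-independent constants, such that for $\lambda$ small enough one can uniquely define $\nu(\lambda)$, analytic in $\lambda$, so that the model in Eq. (1.1), (1.3) and with coupling $J_r = J + \nu(\lambda)$ is critical at $t = t_c$. This means that, for $|t - t_c| > 0$,
\[
\lim_{|\Lambda| \to \infty} \langle H_{\Lambda,\mathbf{x}}(\sigma^{(1)}, \sigma^{(2)})\, H_{\Lambda,\mathbf{y}}(\sigma^{(1)}, \sigma^{(2)}) \rangle_{\Lambda,T} = \Omega_a(\mathbf{x}, \mathbf{y}) + \Omega_b(\mathbf{x}, \mathbf{y}),
\tag{1.8}
\]
and the bounds
\[
|\Omega_a(\mathbf{x}, \mathbf{y})| \le \frac{1}{|\mathbf{x} - \mathbf{y}|^{2 + 2\eta_1}} \frac{C_N}{1 + (\Delta |\mathbf{x} - \mathbf{y}|)^N}, \qquad
|\Omega_b(\mathbf{x}, \mathbf{y})| \le \frac{1}{|\mathbf{x} - \mathbf{y}|^{2 + \tau}} \frac{C_N}{1 + (\Delta |\mathbf{x} - \mathbf{y}|)^N}
\tag{1.9}
\]
hold, with “correlation length” $\Delta^{-1}$ and “critical indices” $\eta_1, \eta_2$ given by
\[
\Delta = |t - t_c|^{1 + \eta_2}, \qquad \eta_1(\lambda) = -a_1 (a + b) \lambda + O(\lambda^2), \qquad \eta_2(\lambda) = -a_2 (a + b) \lambda + O(\lambda^2)
\tag{1.10}
\]
with $a_1 > 0$, $a_2 > 0$ constants and $\eta_1, \eta_2$ analytic in $\lambda$. Furthermore if $1 \le |\mathbf{x}| \le \Delta^{-1}$ the correlation is asymptotic to $\Omega_a$ in the sense that $\Omega_b$ is negligible because
\[
\Omega_a(\mathbf{x}, \mathbf{y}) = \frac{1}{\tilde Z_1^2} \frac{1 + A(\mathbf{x} - \mathbf{y})}{|\mathbf{x} - \mathbf{y}|^{2 + 2\eta_1}}, \qquad
|A(\mathbf{x})| \le C \big[ |\lambda| + (\Delta |\mathbf{x}|)^{1/2} \big] .
\tag{1.11}
\]
Finally the specific heat $C_v^{\lambda}$ (1.6) verifies
\[
C_1 \frac{1}{2\eta_1} \big[ 1 - \Delta^{2\eta_1} \big] \le C_v^{\lambda} \le C_2 \frac{1}{2\eta_1} \big[ 1 - \Delta^{2\eta_1} \big],
\tag{1.12}
\]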
where $C_1, C_2$ are positive constants.

1.3. The above result says that the interaction changes the value of the critical temperature and it qualitatively modifies the critical behaviour of the specific heat and of the energy-energy correlations. As $t$ gets closer and closer to the critical temperature the logarithmic singularity of the specific heat in the Ising model is changed by the four spin interaction into a power law singularity with non-universal critical indices if $\lambda (a + b) > 0$; if $\lambda (a + b) < 0$ the specific heat is instead continuous, but higher derivatives of the free energy are singular, as one can check from the proof of the Theorem. Moreover one can distinguish two different regimes in the asymptotic behaviour of the energy-energy correlation function, discriminated by an intrinsic correlation length $\xi$ of order $|t - t_c|^{-1 - \eta_2}$ with $\eta_2 = O(\lambda)$. If $1 \ll |\mathbf{x} - \mathbf{y}| \ll \xi$, the bound for the correlation function is power-like while if $\xi \ll |\mathbf{x} - \mathbf{y}|$, there is a faster than any power decay with rate of order $\xi^{-1}$. The splitting (1.8) and (1.9) might suggest that the fast decay is modulated by a power $|\mathbf{x} - \mathbf{y}|^{-2 - 2\eta_1}$ but it does not prove that because the first of (1.9) is an inequality rather than an asymptotic expression. We do not study the free energy directly at $t = t_c$; therefore in order to show that $t = t_c$ is a critical point we must study some thermodynamic property like the specific heat by evaluating it at $t \ne t_c$ and $M = \infty$ and then verifying that it has a singular behavior as $t \to t_c$. Moreover (1.11) holds uniformly for all $|t - t_c| > 0$, hence we can draw the remarkable consequence that, assuming continuity for $t \to t_c$, at fixed $|\mathbf{x} - \mathbf{y}|$, of the correlations in (1.8), we obtain at $t = t_c$ a power law behaviour with critical index $\eta_1$.
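The three regimes just described (logarithm, power law, continuity) can be read off from the common shape of the two sides of (1.12). The following is a minimal numerical sketch, not part of the proof: it assumes that the quantity lost in the bound is the inverse correlation length raised to the power 2η1, and the function name `specific_heat_profile` and its arguments are hypothetical. It evaluates (1 − δ^{2η1})/(2η1), which tends to −log δ as η1 → 0, diverges as a power of δ for η1 < 0, and stays below 1/(2η1) for η1 > 0.

```python
import math


def specific_heat_profile(delta, eta1):
    """Shape (1 - delta**(2*eta1)) / (2*eta1) shared by both sides of (1.12).

    delta : inverse correlation length, 0 < delta < 1 (illustrative input)
    eta1  : critical index; eta1 = 0 gives back the Ising logarithm
    """
    x = 2.0 * eta1 * math.log(delta)
    if x == 0.0:
        # Limit eta1 -> 0: the power law degenerates into -log(delta).
        return -math.log(delta)
    # -expm1(x) equals 1 - delta**(2*eta1), evaluated stably for small x.
    return -math.expm1(x) / (2.0 * eta1)


if __name__ == "__main__":
    for delta in (1e-2, 1e-4, 1e-8):
        row = [specific_heat_profile(delta, eta) for eta in (-0.1, 0.0, 0.1)]
        print(delta, row)
```

Since η1 = −a1(a + b)λ + O(λ²) with a1 > 0, the column with negative η1 corresponds to λ(a + b) > 0 (power law divergence) and the column with positive η1 to λ(a + b) < 0 (bounded specific heat).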
We cannot exclude a discontinuity at $t = t_c$ of the correlation in (1.8), not even at fixed $\mathbf{x} - \mathbf{y}$, because, as is the case in various models which can be studied up to the critical point, the case $t$ precisely equal to $t_c$ cannot be discussed at the moment with our techniques in spite of the uniformity of our bounds as $t \to t_c$. In the case of the Eight vertex model our results are in agreement with the exact solution in [B] (see also [W]). For definiteness we have chosen $V(\sigma^{(1)}, \sigma^{(2)})$ of the form (1.3) but the proof of the Theorem does not depend on the details of the interaction but only on a few general properties; one needs essentially that the interaction is short ranged and that it is invariant under the same symmetry transformations which leave invariant the “free” hamiltonian $H_I(\sigma^{(1)}) + H_I(\sigma^{(2)})$. We will describe briefly how the proof of the theorem can be generalized in Appendix O.

1.4. The paper is organized in the following way. We begin by studying the analyticity properties of the partition function. The starting point is the well known representation, due to [H, Ka, MW, S], of the Ising model partition function in terms of Grassmann integrals with a formal action which is quadratic. Also the partition function of the model (1.1) can be written in terms of Grassmann integrals, with a formal action which is however non-quadratic. By suitable linear transformations, see §2, the Grassmann integrals can be written in a form which strongly resembles the partition function of a system of two interacting Dirac fermions on a lattice in $d = 1 + 1$; one fermion (called massive) has an $O(1)$ mass, while the other (light fermion) has a mass $O(t - t_c)$ i.e. vanishing at criticality. In §3 we “integrate out” the massive fermions, thus obtaining an effective theory in terms of the light fermions only. The integration of the light fields is much more
difficult, as their mass is almost zero, and we perform a multiscale analysis based on Renormalization Group ideas, see §4; the result of such analysis is an integration procedure (or a resummation prescription) for the partition function, which is written as a series in a number of functions which are called running coupling constants carrying a scale label $h = 0, -1, \ldots$: for each scale there are only a few such running couplings. Contrary to the naive expansion in powers of $\lambda$ (which cannot be convergent at $t = t_c$), such an expansion is well defined arbitrarily close to the critical temperature if the running coupling constants are small enough. The running coupling constants verify a recursive relation expressing the running couplings on a given scale $h$ as a function of the ones on the previous scales $h' > h$: the latter function is usually called the Beta function and it is defined as long as its arguments are small enough. In §5 we show that the running coupling constants are indeed small, if $\nu$ is chosen properly and $\lambda$ is small enough. In order to prove this one has to use two key results. The first is the exploitation of a number of symmetry cancellations to prove that a number of running coupling constants are exactly vanishing; such symmetries, which are manifest in the original spin variables, become quite involved in the fermionic representation. The second one is the decomposition of the Beta function into the sum of many terms, of which only one is really crucial, while the others would have a small effect in the absence of the first one. One recognizes that such a crucial contribution to the Beta function of our model coincides with the Beta function of the Luttinger model: the latter Beta function was proved to be zero, as a consequence of its exact solution [ML], in [BGPS, GS, BM1] (see [BeM1] for a simplified proof). This means that the apparently largest contribution to the Beta function is essentially zero, if $\nu(\lambda)$ is properly chosen.
Note also that, although the vanishing of the Luttinger model Beta function is believed to be a consequence of suitable Ward identities, converting such an argument into a rigorous proof seems at the moment quite difficult, see [BeM1]; hence the only rigorous proof of such a key result is the one in [BGPS, GS, BM1]. Finally in §5 we define an expansion for the correlation functions and the specific heat; it is similar to the one for the partition function, with the main difference that one has to introduce new terms in the action associated with the external fields introduced to express via functional integrals the correlation functions. The proof establishes rigorously a relationship between spin models with quartic interactions like the model (1.1) and the massive Luttinger model, in agreement with what was conjectured in [LP, N, PB]. Our results extend a previous paper [M1] (where Eq. (1.12) must replace Eq. (1.16) of [M1] which was incorrect). The analysis of ref. [M1] was restricted to the case $|t - t_c| \ge e^{-a/\lambda^2}$, where $a$ is a suitable constant. The paper is self contained aside from a few technical lemmata proved in full detail in [BM]. A very important open problem is to obtain by such fermionic RG methods the asymptotic behaviour of the spin-spin correlation function; its fermionic representation is much more involved than the one for the specific heat or for energy-energy correlations which are the only correlations considered here. One can study also the cases in which the parameters $J$ of the two Ising model hamiltonians are different so that there are two critical temperatures; new fermionic effective marginal interactions appear in such a case and universality will probably be found. Another possible extension is the analysis of four coupled Ising models; in this last case interacting spinning $d = 1$ fermions appear in the fermionic description, which are known to have a behaviour quite different from the spinless one (like in the $d = 1$ Hubbard model).
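2. Fermionic Representation

2.1. The partition function $Z_I^{(\alpha)}$ of the Ising model with Hamiltonian $H_I(\sigma^{(\alpha)})$ in (1.2) can be written as a Grassmann integral; this is a classical result, mainly due to [Ka, H, MW, S] and rederived recently in §3 of [PS] to which we refer for a detailed proof. It is
\[
Z_I^{(\alpha)} = (-1)^S \frac{(\cosh J)^B 2^S}{2} \sum_{\varepsilon, \varepsilon' = \pm} \int \prod_{\mathbf{x} \in \Lambda_M} dH^{(\alpha)}_{\mathbf{x}}\, d\bar H^{(\alpha)}_{\mathbf{x}}\, dV^{(\alpha)}_{\mathbf{x}}\, d\bar V^{(\alpha)}_{\mathbf{x}}\ (-1)^{\delta_\gamma} e^{S^{(\alpha)}_{J;\gamma}},
\tag{2.1}
\]
where $\alpha = 1, 2$ denotes the lattice, $\gamma \stackrel{\rm def}{=} (\varepsilon, \varepsilon')$ and $\delta_\gamma$ is $\delta_{+,+} = 1$, $\delta_{+,-} = \delta_{-,+} = \delta_{-,-} = 2$, $\Lambda_M \stackrel{\rm def}{=} \Lambda$, $B$ is the total number of bonds and $S$ is the total number of sites,
\[
\begin{aligned}
S^{(\alpha)}_{J,\gamma} = {} & \tanh J \sum_{\mathbf{x} \in \Lambda_M} \big[ \bar H^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x+1,x_0} + \bar V^{(\alpha)}_{x,x_0} V^{(\alpha)}_{x,x_0+1} \big] \\
& + \sum_{\mathbf{x} \in \Lambda_M} \big[ \bar H^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x,x_0} + \bar V^{(\alpha)}_{x,x_0} V^{(\alpha)}_{x,x_0} + \bar V^{(\alpha)}_{x,x_0} \bar H^{(\alpha)}_{x,x_0} + V^{(\alpha)}_{x,x_0} \bar H^{(\alpha)}_{x,x_0} + H^{(\alpha)}_{x,x_0} \bar V^{(\alpha)}_{x,x_0} + V^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x,x_0} \big],
\end{aligned}
\tag{2.2}
\]
where $H^{(\alpha)}_{\mathbf{x}}, \bar H^{(\alpha)}_{\mathbf{x}}, V^{(\alpha)}_{\mathbf{x}}, \bar V^{(\alpha)}_{\mathbf{x}}$ are Grassmann variables verifying different boundary conditions depending on the label $\gamma = (\varepsilon, \varepsilon')$ which is not affixed explicitly, to simplify the notations, i.e.
\[
\begin{aligned}
\bar H^{(\alpha)}_{x,x_0+M} &= \varepsilon \bar H^{(\alpha)}_{x,x_0}, & \bar H^{(\alpha)}_{x+M,x_0} &= \varepsilon' \bar H^{(\alpha)}_{x,x_0}, \\
H^{(\alpha)}_{x,x_0+M} &= \varepsilon H^{(\alpha)}_{x,x_0}, & H^{(\alpha)}_{x+M,x_0} &= \varepsilon' H^{(\alpha)}_{x,x_0},
\end{aligned}
\qquad \varepsilon, \varepsilon' = \pm,
\tag{2.3}
\]
and identical definitions are set for the variables $V^{(\alpha)}, \bar V^{(\alpha)}$. We call $D_\gamma$, for $\gamma = (\varepsilon, \varepsilon')$, the set of $\mathbf{k}$'s such that
\[
k = \frac{2\pi n_1}{M} + \frac{(\varepsilon' - 1)\pi}{2M}, \qquad k_0 = \frac{2\pi n_0}{M} + \frac{(\varepsilon - 1)\pi}{2M}
\tag{2.4}
\]
and $-[M/2] \le n_0 \le [(M - 1)/2]$, $-[M/2] \le n_1 \le [(M - 1)/2]$, $n_0, n_1 \in \mathbb{Z}$. We can write, if $\mathbf{k} = (k_0, k)$,
\[
H^{(\alpha)}_{\mathbf{x}} = \frac{1}{M^2} \sum_{\mathbf{k} \in D_{\varepsilon,\varepsilon'}} H^{(\alpha)}_{\mathbf{k}} e^{-i \mathbf{k} \mathbf{x}}, \qquad
\bar H^{(\alpha)}_{\mathbf{x}} = \frac{1}{M^2} \sum_{\mathbf{k} \in D_{\varepsilon,\varepsilon'}} \bar H^{(\alpha)}_{\mathbf{k}} e^{-i \mathbf{k} \mathbf{x}},
\tag{2.5}
\]
and similar expressions hold for $V^{(\alpha)}_{\mathbf{x}}, \bar V^{(\alpha)}_{\mathbf{x}}$.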
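The link between the boundary labels in (2.3) and the momentum sets in (2.4) can be checked directly. The sketch below is only illustrative: it assumes the reading of (2.4) in which each component is 2πn/M + (ε − 1)π/(2M), and the helper names `momenta` and `boundary_phase` are hypothetical. It lists the allowed one-dimensional momenta and verifies that the plane wave exp(−ikx) picks up the factor ε under the shift x → x + M.

```python
import cmath
import math


def momenta(M, eps):
    """One-dimensional momenta of (2.4): 2*pi*n/M + (eps - 1)*pi/(2*M).

    n runs over -[M/2], ..., [(M-1)/2], so there are exactly M values.
    """
    shift = (eps - 1) * math.pi / (2 * M)
    return [2 * math.pi * n / M + shift
            for n in range(-(M // 2), (M - 1) // 2 + 1)]


def boundary_phase(k, M):
    """Factor acquired by exp(-i*k*x) when x is shifted to x + M."""
    return cmath.exp(-1j * k * M)


if __name__ == "__main__":
    M = 8
    for eps in (+1, -1):
        ks = momenta(M, eps)
        phases = {round(boundary_phase(k, M).real) for k in ks}
        print(eps, len(ks), phases, min(abs(k) for k in ks))
```

For ε = −1 the smallest modulus is π/M, so the antiperiodic set never contains the zero mode.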
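The integration $\int \prod_{\mathbf{x}} dH^{(\alpha)}_{\mathbf{x}} d\bar H^{(\alpha)}_{\mathbf{x}}$ or $\int \prod_{\mathbf{x}} dV^{(\alpha)}_{\mathbf{x}} d\bar V^{(\alpha)}_{\mathbf{x}}$ is defined as a linear functional on the Grassmann algebra in the standard way: we recall it in Appendix A below. It will be convenient to use auxiliary models in which $J$ is allowed to depend on $\alpha$ and on the bonds: i.e. we can imagine replacing the coupling $J$ of each bond $b$ joining the nearest neighbors $\mathbf{x}, \mathbf{y}$ on the lattice $\alpha$ by $J^{(\alpha)}_b = J^{(\alpha)}_{\mathbf{x},\mathbf{y}}$. If $J$ is not constant but it depends on the bonds, one expresses the partition function $Z_I^{(\alpha)}(\{J^{(\alpha)}_{\mathbf{x},\mathbf{x}'}\})$ by a formula similar to Eq. (2.1) in which $S^{(\alpha)}_{J,\gamma}$, with $\gamma = (\varepsilon, \varepsilon')$, becomes
\[
\begin{aligned}
S^{(\alpha)}_{J^{(\alpha)},\gamma} = {} & \sum_{\mathbf{x}} \big[ \tanh J^{(\alpha)}_{x,x_0;x+1,x_0}\, \bar H^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x+1,x_0} + \tanh J^{(\alpha)}_{x,x_0;x,x_0+1}\, \bar V^{(\alpha)}_{x,x_0} V^{(\alpha)}_{x,x_0+1} \big] \\
& + \sum_{\mathbf{x}} \big[ \bar H^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x,x_0} + \bar V^{(\alpha)}_{x,x_0} V^{(\alpha)}_{x,x_0} + \bar V^{(\alpha)}_{x,x_0} \bar H^{(\alpha)}_{x,x_0} + V^{(\alpha)}_{x,x_0} \bar H^{(\alpha)}_{x,x_0} + H^{(\alpha)}_{x,x_0} \bar V^{(\alpha)}_{x,x_0} + V^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x,x_0} \big],
\end{aligned}
\tag{2.6}
\]
and the factor $(\cosh J)^B$ is replaced by $\prod_b \cosh J^{(\alpha)}_b$.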
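2.2. The partition function of the model (1.1) with $J_r$ replacing $J$ is
\[
Z_{2I} = \sum_{\sigma^{(1)}_{\mathbf{x}} = \pm 1,\ \mathbf{x} \in \Lambda_M}\ \sum_{\sigma^{(2)}_{\mathbf{x}} = \pm 1,\ \mathbf{x} \in \Lambda_M} e^{-H_I(\sigma^{(1)})} e^{-H_I(\sigma^{(2)})} e^{-V(\sigma^{(1)}, \sigma^{(2)})} .
\tag{2.7}
\]
Setting $\widehat{\lambda a} \stackrel{\rm def}{=} \tanh(\lambda a)$, $\widehat{\lambda b} \stackrel{\rm def}{=} \tanh(\lambda b)$ we see that $Z_{2I}$ becomes $(\cosh \lambda a \cosh \lambda b)^{2S}$ times $\hat Z_{2I}$, with
\[
\begin{aligned}
\hat Z_{2I} = {} & \sum_{\sigma^{(1)} = \pm 1,\ \mathbf{x} \in \Lambda_M}\ \sum_{\sigma^{(2)} = \pm 1,\ \mathbf{x} \in \Lambda_M} e^{-H_I(\sigma^{(1)})} e^{-H_I(\sigma^{(2)})} \\
& \cdot \prod_{\mathbf{x} \in \Lambda_M} \big[ 1 + \widehat{\lambda a}\, \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x+1,x_0} \sigma^{(2)}_{x,x_0} \sigma^{(2)}_{x+1,x_0} \big] \\
& \cdot \prod_{\mathbf{x} \in \Lambda_M} \big[ 1 + \widehat{\lambda a}\, \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x,x_0+1} \sigma^{(2)}_{x,x_0} \sigma^{(2)}_{x,x_0+1} \big] \\
& \cdot \prod_{\mathbf{x} \in \Lambda_M} \big[ 1 + \widehat{\lambda b}\, \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x+1,x_0} \sigma^{(2)}_{x,x_0} \sigma^{(2)}_{x,x_0+1} \big] \\
& \cdot \prod_{\mathbf{x} \in \Lambda_M} \big[ 1 + \widehat{\lambda b}\, \sigma^{(1)}_{x,x_0} \sigma^{(1)}_{x,x_0+1} \sigma^{(2)}_{x-1,x_0+1} \sigma^{(2)}_{x,x_0+1} \big],
\end{aligned}
\tag{2.8}
\]
where $H_I(\sigma^{(\alpha)})$ are defined as in (1.2) with $J_r$ replacing $J$. Note that
\[
\sum_{\sigma^{(\alpha)}_{\mathbf{x}} = \pm 1,\ \mathbf{x} \in \Lambda} \sigma^{(\alpha)}_{\mathbf{x}} \sigma^{(\alpha)}_{\mathbf{x}'}\, e^{-H_I(\sigma^{(\alpha)})} = \frac{\partial}{\partial J^{(\alpha)}_{\mathbf{x},\mathbf{x}'}} Z_I^{(\alpha)}(\{J^{(\alpha)}_{\mathbf{x},\mathbf{x}'}\}) \Big|_{\{J^{(\alpha)}_{\mathbf{x},\mathbf{x}'}\} = \{J_r\}},
\tag{2.9}
\]
where $\mathbf{x}, \mathbf{x}'$ are nearest neighbors on the lattice $\alpha$, and from (2.6) (and remembering that $a = 0$ or $b = 0$) this derivative gives an extra factor $\tanh J_r + {\rm sech}^2 J_r\, \bar H^{(\alpha)}_{x,x_0} H^{(\alpha)}_{x+1,x_0}$ in (2.1). We can therefore write $\hat Z_{2I}$, hence $Z_{2I}$, as a Grassmann integral over the variables $H, V, \bar H, \bar V$. The algebra is straightforward and we reproduce it in Appendix B, and the result is that we can express $Z_{2I}$ as a sum of sixteen partition functions labeled by $\gamma_1, \gamma_2 = (\varepsilon^{(1)}, \varepsilon'^{(1)}), (\varepsilon^{(2)}, \varepsilon'^{(2)})$ (corresponding to choosing each $\varepsilon$ and $\varepsilon'$ as $\pm$)
\[
Z_{2I} = (\cosh \lambda a \cosh \lambda b)^{2S} \sum_{\gamma_1, \gamma_2} (-1)^{\delta_{\gamma_1} + \delta_{\gamma_2}} \hat Z^{\gamma_1,\gamma_2}_{2I},
\tag{2.10}
\]
each of which is given by a functional integral
\[
\hat Z^{\gamma_1,\gamma_2}_{2I} = \frac{(\cosh J_r)^{2B} 2^{2S}}{4} \int \Big[ \prod_{\alpha=1}^{2} \prod_{\mathbf{x} \in \Lambda_M} dH^{(\alpha)}_{\mathbf{x}}\, d\bar H^{(\alpha)}_{\mathbf{x}}\, dV^{(\alpha)}_{\mathbf{x}}\, d\bar V^{(\alpha)}_{\mathbf{x}}\ e^{S^{(\alpha)}_{J,\gamma_\alpha}} \Big] e^{-V},
\tag{2.11}
\]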
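where $V$ is an expression containing linear or bilinear terms in $\bar H^{(\alpha)}_{\mathbf{x}} H^{(\alpha)}_{x+1,x_0}$ or $\bar V^{(\alpha)}_{\mathbf{x}} V^{(\alpha)}_{x,x_0+1}$, see (7.4). It is convenient to rewrite (2.11) in a form closer to an expression more familiar in the theory of fermionic ground states: our aim in fact is to reduce our critical point problem to a rather standard problem on the ground state of Fermi systems. We shall consider for simplicity the partition function $\hat Z^{-}_{2I} \stackrel{\rm def}{=} \hat Z^{-,-,-,-}_{2I}$, i.e. the partition function in which all Grassmann variables verify antiperiodic boundary conditions (see (2.3)). The other fifteen partition functions in (2.10) admit similar expressions. Furthermore it will appear that for $|t - t_c| > 0$ the logarithm of $\hat Z^{\gamma_1,\gamma_2}_{2I}$ divided by its expression for $\lambda = 0$ is insensitive to boundary conditions up to corrections which are exponentially small in the size $M$ of the system in the thermodynamic limit $M \to \infty$ (see Appendix G), so that it is sufficient to study just one of the sixteen partition functions; $\hat Z^{-,-,-,-}_{2I}$ is chosen here (arbitrarily). It is convenient to perform the following change of variables [ID], $\alpha = 1, 2$:
\[
\begin{aligned}
\bar H^{(\alpha)}_{\mathbf{x}} + i H^{(\alpha)}_{\mathbf{x}} &= e^{i\pi/4} \psi^{(\alpha)}_{\mathbf{x}} - e^{i\pi/4} \chi^{(\alpha)}_{\mathbf{x}}, &
\bar H^{(\alpha)}_{\mathbf{x}} - i H^{(\alpha)}_{\mathbf{x}} &= e^{-i\pi/4} \bar\psi^{(\alpha)}_{\mathbf{x}} - e^{-i\pi/4} \bar\chi^{(\alpha)}_{\mathbf{x}}, \\
\bar V^{(\alpha)}_{\mathbf{x}} + i V^{(\alpha)}_{\mathbf{x}} &= \psi^{(\alpha)}_{\mathbf{x}} + \chi^{(\alpha)}_{\mathbf{x}}, &
\bar V^{(\alpha)}_{\mathbf{x}} - i V^{(\alpha)}_{\mathbf{x}} &= \bar\psi^{(\alpha)}_{\mathbf{x}} + \bar\chi^{(\alpha)}_{\mathbf{x}},
\end{aligned}
\tag{2.12}
\]
which replaces the $H, V, \bar H, \bar V$ variables with “Majorana variables” $\psi^{(\alpha)}, \chi^{(\alpha)}$. Subsequently we replace the Majorana variables with Dirac variables by setting
\[
\psi^{\mp}_{1,\mathbf{x}} = \frac{1}{\sqrt{2}} (\psi^{(1)}_{\mathbf{x}} \pm i \psi^{(2)}_{\mathbf{x}}), \qquad
\psi^{\mp}_{-1,\mathbf{x}} = \frac{1}{\sqrt{2}} (\bar\psi^{(1)}_{\mathbf{x}} \pm i \bar\psi^{(2)}_{\mathbf{x}}),
\tag{2.13}
\]
\[
\chi^{\mp}_{1,\mathbf{x}} = \frac{1}{\sqrt{2}} (\chi^{(1)}_{\mathbf{x}} \pm i \chi^{(2)}_{\mathbf{x}}), \qquad
\chi^{\mp}_{-1,\mathbf{x}} = \frac{1}{\sqrt{2}} (\bar\chi^{(1)}_{\mathbf{x}} \pm i \bar\chi^{(2)}_{\mathbf{x}}) .
\tag{2.14}
\]
The final expression, see Appendix C for the algebra, is
\[
\hat Z^{-}_{2I} \stackrel{\rm def}{=} \hat Z^{-,-,-,-}_{2I} = N \int P(d\psi) P(d\chi)\, e^{Q(\chi,\psi) - V(\chi,\psi)},
\tag{2.15}
\]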
where N is a suitable constant and, if φ denotes either ψ or χ , t (+)T + − dφk,ω dφk,ω exp −ξ A (k)ξ P (dφ) = Nφ−1 φ k , k 2M 2 k∈D−,− ω=±1 k∈D−,− i sin k + sin k0 −imφ (k) − − Aφ (k) = , ξ T k = (φk,1 , φk,−1 ) imφ (k) i sin k − sin k0 + + , φk,−1 ) (2.16) ξ +,T k = (φk,1 with Nφ a normalization constant, mφ defined, differently for φ = ψ (choose +) and for φ = χ (choose −), by √ t t (2.17) mφ (k) = (t − (± 2 − 1)) + (cos k0 + cos k − 2). 2 2 √ Remark. Note that we are interested in t close to tc = 2 − 1 hence, for t → tc , mχ is bounded away from 0 and therefore m−1 χ (0) defines a length scale which stays finite in this limit while mψ (0) → 0 and the corresponding length scale diverges. Note also that +,+,+,+ at t = tc is meaningless, as in that case N = 0 (as Nψ = 0); hence (2.15) for Z 2I the assumption |t − tc | > 0.
Finally $Q(\chi, \psi)$ and $V(\chi, \psi)$ are obtained respectively from (7.10) and (7.5) in Appendix C through the change of variables (2.12), (2.13) and (2.14). The final expressions for them are rather intricate and we just extract from them a few properties which will be important in the following. We introduce the discrete derivatives of $\phi = \psi, \chi$ as
\[
\partial_1 \phi_{\mathbf{x}} \stackrel{\rm def}{=} \phi_{x+1,x_0} - \phi_{\mathbf{x}}, \qquad
\partial_0 \phi_{\mathbf{x}} \stackrel{\rm def}{=} \phi_{x,x_0+1} - \phi_{\mathbf{x}} .
\tag{2.18}
\]
It turns out, see Appendix D, that Q and V are given by a sum of terms of the forms b;σ ,σ a;σ ,σ b;σ1 ,σ2 1 2 1 2 Ax;φ,ω or A (2.19) ,ω , ,ω Ax ;φ ,ω ;φ ,ω , ;φ ;φ x;φ,ω 1 2 1 2 x
1
x
2
where x = x or x = (x − 1, x0 + 1) with φ, φ , φ , φ ∈ {ψ, χ }, σ = ± and 1) If ω1 = ω2 then for a suitable numerical coefficient aσ1 ,σ2 ,ω,c,n it is, for n = 1, 2 and c = a, b,
σ1 σ2 1 ,σ2 Ac;σ x;φ,ω;φ ,ω = aσ1 ,σ2 ,ω,c,n φω,x ∂xn φω,x
with
(2.20)
1a) If n = 1 ∂xn = ∂x0 and aσ1 ,σ2 ,ω,c,1 is imaginary; 1b) If n = 2 ∂xn = ∂x and aσ1 ,σ2 ,ω,c,2 is real. 2) If ω1 = −ω2 then for suitable real numerical coefficients bσ1 ,σ2 ,ω,c,m , cσ1 ,σ2 ,ω,c,m it is
σ2 σ1 1 ,σ2 2a) Ac,σ x;φ,ω;φ ,−ω = ibσ1 ,σ2 ,ω,c,m ∂xm φω,x ∂xm φ−ω,x , ∂xm = ∂x0 if m = 1, ∂xm = ∂x if m = 2, σ σ1 1 ,σ2 2 2b) Ac,σ x;φ,ω;φ ,−ω = icσ1 ,σ2 ,ω,c,l φω,xl φ−ω,xl ,
(2.21)
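with $l = 1, 2, 3$ and $\mathbf{x}_l = \mathbf{x}$, $\mathbf{x}_l = (x + 1, x_0)$, $\mathbf{x}_l = (x, x_0 + 1)$ for $l = 1, 2, 3$ respectively.

2.3. The value of $\int P(d\phi) Q(\phi)$, where $Q(\phi)$ is any monomial in the $\phi = \psi, \chi$ variables, is given by the anticommutative Wick rule with propagator $\int P(d\phi)\, \phi^{-}_{\mathbf{x},\omega} \phi^{+}_{\mathbf{y},\omega'} = g^{(\phi)}_{\omega,\omega'}(\mathbf{x} - \mathbf{y})$ given by
\[
g^{(\phi)}_{\omega,\omega'}(\mathbf{x} - \mathbf{y}) = \frac{2}{t M^2} \sum_{\mathbf{k}} e^{-i \mathbf{k} (\mathbf{x} - \mathbf{y})} [A_\phi^{-1}(\mathbf{k})]_{\omega,\omega'} .
\tag{2.22}
\]
If we set $Q_\phi(\mathbf{k}) = \det A_\phi(\mathbf{k}) = -\sin^2 k_0 - \sin^2 k - [m_\phi(\mathbf{k})]^2$, then
\[
A_\phi^{-1}(\mathbf{k}) = \frac{1}{Q_\phi(\mathbf{k})}
\begin{pmatrix}
-\sin k_0 + i \sin k & i m_\phi(\mathbf{k}) \\
-i m_\phi(\mathbf{k}) & \sin k_0 + i \sin k
\end{pmatrix} .
\tag{2.23}
\]
The following bounds hold for the propagators, for any $N > 1$ and for a suitable constant $C_N$:
\[
|g^{(\phi)}_{\omega,\omega}(\mathbf{x} - \mathbf{y})| \le \frac{1}{1 + |\mathbf{d}(\mathbf{x} - \mathbf{y})|} \frac{C_N}{1 + |m_\phi(0)\, \mathbf{d}(\mathbf{x} - \mathbf{y})|^N},
\tag{2.24}
\]
\[
|g^{(\phi)}_{\omega,-\omega}(\mathbf{x} - \mathbf{y})| \le \frac{|m_\phi(0)| \log[1 + (|m_\phi(0)|\, |\mathbf{d}(\mathbf{x} - \mathbf{y})|)^{-1}]\, C_N}{1 + |m_\phi(0)\, \mathbf{d}(\mathbf{x} - \mathbf{y})|^N},
\tag{2.25}
\]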
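where $\mathbf{d}$ is a distance between $\mathbf{x}, \mathbf{y}$ which takes into account the antiperiodicity of the boundary conditions that we are considering, namely
\[
\mathbf{d}(\mathbf{x} - \mathbf{y}) = \Big( \frac{M}{\pi} \sin \frac{\pi (x - y)}{M},\ \frac{M}{\pi} \sin \frac{\pi (x_0 - y_0)}{M} \Big) .
\tag{2.26}
\]
Note that the following parity properties hold:
\[
g^{(\phi)}_{\omega,\omega}(\mathbf{x}) = -g^{(\phi)}_{\omega,\omega}(-\mathbf{x}), \qquad
g^{(\phi)}_{\omega,-\omega}(\mathbf{x}) = g^{(\phi)}_{\omega,-\omega}(-\mathbf{x}) .
\tag{2.27}
\]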
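The determinant and the inverse matrix quoted in (2.23) can be confirmed with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the matrix entries are taken from (2.16) as we read them, the mass m is treated as a free real number rather than computed from (2.17), and all function names are hypothetical.

```python
import math


def a_phi(k0, k, m):
    """2x2 matrix A_phi(k) with entries read off from (2.16); m is a free mass."""
    return ((1j * math.sin(k) + math.sin(k0), -1j * m),
            (1j * m, 1j * math.sin(k) - math.sin(k0)))


def q_phi(k0, k, m):
    """Q_phi(k) = det A_phi(k) as stated above (2.23)."""
    return -math.sin(k0) ** 2 - math.sin(k) ** 2 - m ** 2


def a_phi_inverse(k0, k, m):
    """Inverse matrix as written in (2.23)."""
    q = q_phi(k0, k, m)
    return (((-math.sin(k0) + 1j * math.sin(k)) / q, 1j * m / q),
            (-1j * m / q, (math.sin(k0) + 1j * math.sin(k)) / q))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][l] * b[l][j] for l in range(2))
                       for j in range(2)) for i in range(2))


if __name__ == "__main__":
    k0, k, m = 0.3, -0.8, 0.05
    print(matmul2(a_phi(k0, k, m), a_phi_inverse(k0, k, m)))
```

Because Q_phi is a sum of three non-positive terms, it vanishes only where both sines and the mass vanish; this is the origin of the slow decay of the light propagator as the mass goes to zero.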
Remark. After the change of variables (2.12), (2.13) and (2.14) we have achieved writing $\hat Z^{-}_{2I}$ as (2.15), which can be naturally seen as the partition function of a system of two kinds of bidimensional Dirac fermions on a lattice. The remark following (2.17) says that the $\chi$-fields mass is $O(1)$ while the $\psi$-fields mass is vanishing when $t = t_c$; hence the $\chi$-fields will be called massive fields and the $\psi$-fields will be called light fields. In contrast with this interpretation note, however, that the interaction $V$ has a quite nonstandard form; it is not invariant under global gauge transformations and is not given by products of density operators, unlike in the usual fermionic models.

3. Integration of Massive Fermions

3.1. Considering (2.15) we proceed to perform the Grassmann integration over the massive $\chi$ fields and to reduce the double integration over $\psi, \chi$ to an integration of a (more involved) new exponential $e^{-V^{(1)}(\psi)}$ over the light fields $\psi$ alone,
\[
\hat Z^{-}_{2I} = N \int P(d\psi) \int P(d\chi)\, e^{Q(\chi,\psi)} e^{-V(\psi,\chi)} = \int \tilde P(d\psi)\, e^{M^2 N^{(1)} - V^{(1)}(\psi)},
\tag{3.1}
\]
where $N^{(1)}$ is a constant such that the effective potential $V^{(1)}(\psi)$ vanishes at $\psi = 0$ and $\tilde P$ is suitably defined. Indeed we prove the following result.

3.2. Lemma 1. Assume $a = 0$ or $b = 0$. There exist $\varepsilon$ and $C$ such that, for $|\lambda|, |\nu| \le \varepsilon$,
\[
V^{(1)} = \sum_{n \ge 1} \sum_{\alpha,\omega,\sigma} \sum_{\mathbf{x}_1,\ldots,\mathbf{x}_{2n}} W_{2n,\sigma,\alpha,\omega}(\mathbf{x}_1, \ldots, \mathbf{x}_{2n})\, \partial^{\alpha_1} \psi^{\sigma_1}_{\mathbf{x}_1,\omega_1} \cdots \partial^{\alpha_{2n}} \psi^{\sigma_{2n}}_{\mathbf{x}_{2n},\omega_{2n}},
\]
\[
|\hat W_{2n,\sigma,\alpha,\omega}(\mathbf{k}_1, \ldots, \mathbf{k}_{2n-1})| \le M^2 C^n \varepsilon^{n/2}, \qquad n \ge 2 .
\tag{3.2}
\]
4 J + O(ε 2 ) The addends in (3.2) with n = 2 can be written, for l1 = 2(λ a + λb)sech r real, as + + − − l1 ψ1,x ψ−1,x ψ−1,x ψ1,x + W4,σ ,α,ω (x1 , .., x4 ) x1 ,..,x4 α1 +..α4 ≥1,σ
x
× ∂ α1 ψxσ11,ω1 ∂ α2 ψxσ22,ω2 ∂ α3 ψxσ33,ω3 ∂ α4 ψxσ44,ω4 .
(3.3)
The addend with n = 1 can be written, for ν1 = ν + O(ε), a1 , a2 = ν/2 + O(ε), as − + + − [−iων1 ψx,ω ψx,−ω + ψx,ω (iωa1 ∂0 − a2 ∂1 )ψx,ω ] ω x + W2,σ ,α,ω (x1 , x2 )∂ α1 ψxσ11,ω1 ∂ α2 ψxσ22,ω2 (3.4) x1 ,x2 {ω} α1 +α2 ≥2,σ1 ,σ2
2,σ ,α,ω (k1 )| ≤ M 2 C|ε|. Finally making use of a general notawith ν1 , a1 , a2 real and |W tion for later reference, the Grassmann integration P (dψ) is PZ1 ,m1 ,C1 (dψ), where PZ1 ,m1 ,C1 (dψ) = N −1
k∈D−,− (k)−1 >0 C 1
ω=±1
1 (k) tZ1 C (1) + − + − dψk,ω dψk,ω exp − ψ T (k)ψ k,ω k,ω ω,ω , M2 k∈D −,− (k)−1 >0 C 1
1 def T (1) (k) = C0 + µ0,0 (k) 1 (i sin k + sin k0 ) + µ1,1 (k)Z −1 Z −im1 − iµ1,2 (k)Z1−1 1 × 1 (i sin k − sin k0 ) + µ2,2 (k)Z −1 (3.5) Z im1 + iµ1,2 (k)Z1−1 1 √ √ 1 (k) ≡ 1, m1 = C0 (t − tc ), Z1 = 1, Z 1 = t [(2t + 2 2t) + with C0 = (t + 1 + 2)2 , C 2 √ (2 2 + 3 + t 2 )], µi,j (k) analytic functions in k of size O(k2 ) with µi,i (k), i = 1, 2, odd and µ1,2 (k) even and real; moreover C0 + µ0,0 ≥ 1 and det T (1) (k) is bounded above and below by two constants times −2t (1 − t 2 )(cos k0 + cos k1 − 2) + m21 . The proof of the above proposition is a repetition of standard arguments, see for instance [BGPS] or [BM]: the key is the Gram-Hadamard inequality applied along the lines of Lesniewski, [Le]. For completeness the details are in Appendix E and F. The result says that the integration of the massive fermions has the “only” effect over the remaining (non trivial) ψ–integration of modifying the propagator of the light ψ fields by a few trivial factors of O(1) (analytically dependent on λ for λ small). The only difficulty and novelty is that a detailed analysis of the bilinear and quartic terms in V (1) is necessary. In fact we have to show that the quadratic part can be writσ ψ −σ , or ψ σ ψ σ ten as in (3.4), saying that there are no terms of the form ψx,ω x,ω x,ω x,−ω or σ σ ψx,ω ∂ψx,−ω ; despite the fact that such terms are absent in V, they could be generated by the integration of the χ variables. This is not the case, as a consequence of symmetry properties verified by the model (1.1), as it will be shown in Appendix F. 4. Renormalization Group for Light Fermions − (2.15); after the integration 4.1. Multiscale analysis. We continue the analysis of Z 2I over the χ –fields we have to compute the Grassmann integral over the ψ–fields given by the r.h.s. of (3.1). The problem is quite different from the one treated in Sect. 
3 because the $\psi$-field has propagator, (3.5), with “mass” $O(t - t_c)$ which can be arbitrarily close to 0, and we need estimates that are uniform in this quantity. Therefore we shall proceed via a multiscale analysis following the techniques developed to study the ground state of one-dimensional Fermi systems in [BG], [BGPS] and [BM]. We introduce a scaling parameter $\gamma > 1$ which will be used to define a geometrically growing sequence of length scales $1, \gamma, \gamma^2, \gamma^3, \ldots$, i.e. of geometrically decreasing momentum scales $\gamma^h$, $h = 0, -1, -2, \ldots$. Let $\chi(\mathbf{k}) \in C^\infty$ be a non-negative function such that
\[
\chi(\mathbf{k}) = \chi(-\mathbf{k}) =
\begin{cases}
1 & \text{if } |\mathbf{k}| < 1/\gamma, \\
0 & \text{if } |\mathbf{k}| > 1,
\end{cases}
\qquad \text{where } |\mathbf{k}| = \sqrt{\sin^2 k_0 + \sin^2 k},
\tag{4.1}
\]
Fig. 2. The function χ (γ −h x), χ(γ −(h−1) x), f (γ −h x) def
and for h ≤ 0 integer define fh (k) = χ (γ −h k) − χ (γ −h+1 k) so that, for h < 0, it is χ(k) = 0h=h +1 fh (k) + χ (γ −h k) . Note that, if h ≤ 0, fh (k) = 0 for |k| < γ h−2 or |k| > γ h , and fh (k) = 1, if |k| = γ h−1 . Furthermore with our boundary conditions ε = ε = −, see (2.4), the def
√
momenta k = (k0 , k) are such that |k| > kM = πM2 . Therefore if we define the “minimum” momentum scale larger than kM (i.e. hM = min{h : γ h > kM }) it will be for all such k: 1=
$\sum_{h=h_M}^{1} f_h(\mathbf{k})$, with $f_1 = 1 - \chi(\mathbf{k})$, (4.2)
which can be visualized as in Fig. 2. Note that the fact that $h_M$ is finite plays essentially no role in the subsequent analysis; note also that we are making a multiscale decomposition around $k = k_0 = 0$ as it is the only pole of the propagator corresponding to $P_{Z_1,m_1,C_1}(d\psi)$. The purpose is to perform the integration over the light fermion fields in an iterative way. The iteration steps will be labeled by scale values $h = 1, 0, -1, \ldots, h_M$. The number of iterations will be $-h_M + 2$ and after each iteration we shall be left with a “simpler” Grassmann integration to perform: it will be an integration with respect to a field $\psi^{(\le h)}$, $h = 0, -1, \ldots, h_M$, of
\[
\int P_{Z_h,m_h,C_h}(d\psi^{(\le h)})\, e^{-V^{(h)}(\sqrt{Z_h} \psi^{(\le h)}) - M^2 E_h}, \qquad V^{(h)}(0) = 0,
\tag{4.3}
\]
where the quantities $P_{Z_h,m_h,C_h}(d\psi)$, $Z_h$, $m_h$, $C_h(\mathbf{k})$, $V^{(h)}(\psi)$, $E_h$ have to be defined recursively and the result of the last iteration will be $e^{-M^2 E_{-1+h_M}} \equiv \hat Z_{2I}$, i.e. the value of the partition function.

The $P_{Z_h,m_h,C_h}(d\psi)$ integration is defined by (3.5) in which we replace $Z_1, m_1, C_1(\mathbf{k})$ by other quantities $Z_h, m_h, C_h(\mathbf{k})$ with
\[
C_h(\mathbf{k})^{-1} = \sum_{j=h_M}^{h} f_j(\mathbf{k}),
\tag{4.4}
\]
keeping $\tilde Z_1$ fixed to the value in (3.5) and $Z_h, m_h$ recursively defined as discussed below; moreover
\[
V^{(h)}(\psi^{(\le h)}) = \sum_{n=1}^{\infty} \sum_{\mathbf{x}_1,\ldots,\mathbf{x}_{2n},\ \sigma,\omega,\alpha} \Big[ \prod_{i=1}^{2n} \partial^{\alpha_i} \psi^{(\le h)\sigma_i}_{\mathbf{x}_i,\omega_i} \Big] W^{(h)}_{2n,\sigma,\alpha,\omega}(\mathbf{x}_1, \ldots, \mathbf{x}_{2n}) .
\tag{4.5}
\]
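The scale decomposition of 4.1 is a telescoping partition of unity, and this can be checked numerically. The sketch below makes several assumptions that are not in the text: the cutoff is treated as a function of the modulus r = |k| only, a specific smooth step built from exp(−1/u) is chosen as a concrete example of a cutoff with the properties required in (4.1), and the names `smooth_step`, `chi`, `f` and `scale_sum` are hypothetical.

```python
import math


def smooth_step(t):
    """C-infinity step: equals 0 for t <= 0 and 1 for t >= 1."""
    def g(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))


def chi(r, gamma):
    """One admissible cutoff for (4.1): 1 for r <= 1/gamma, 0 for r >= 1."""
    return 1.0 - smooth_step((r - 1.0 / gamma) / (1.0 - 1.0 / gamma))


def f(h, r, gamma):
    """Single-scale function f_h = chi(gamma^-h r) - chi(gamma^(-h+1) r)."""
    return chi(gamma ** (-h) * r, gamma) - chi(gamma ** (-h + 1) * r, gamma)


def scale_sum(r, h_min, gamma):
    """f_1 + sum of f_h for h = h_min, ..., 0, with f_1 = 1 - chi as in (4.2)."""
    return (1.0 - chi(r, gamma)) + sum(f(h, r, gamma) for h in range(h_min, 1))


if __name__ == "__main__":
    gamma, h_min = 2.0, -6
    for r in (0.02, 0.1, 0.3, 0.7, 1.5):
        print(r, scale_sum(r, h_min, gamma))
```

The sum telescopes to 1 − chi(gamma^(−h_min+1) r), so it equals 1 as soon as r is at least gamma^(h_min−1); this is why only finitely many scales are needed once the momenta are bounded away from zero by the antiperiodic boundary conditions.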
4.2. The localization operator. The effective potential $V^{(h)}$ will be rather involved: to define it recursively it will be convenient to identify in it a part that can be called "irrelevant" and the rest. Here the word irrelevant does not mean "negligible": it identifies a part of $V^{(h)}$ which can be expressed as a (convergent) power series in terms of a number of parameters $v_{h'}$, $h' > h$, which we call running coupling constants. The latter are also defined recursively and they can be isolated from the effective potential $V^{(h)}$ by acting on it with a "localization operator" $L$, which extracts from the sum of monomials in the fields in (3.1) the terms of degree $2n = 2, 4$ in the fields and from each of them extracts the "local part": for $h \le 0$ it acts on the kernels $\hat W$ by simplifying them as follows:

1) If $2n = 4$, then we define

$$L\hat W^{(h)}_{4,\sigma,\alpha,\omega}(k_1,k_2,k_3) = \hat W^{(h)}_{4,\sigma,\alpha,\omega}(\bar k_{++},\bar k_{++},\bar k_{++}), \tag{4.6}$$

where $\bar k_{++} = (\frac{\pi}{M},\frac{\pi}{M})$ is the smallest momentum allowed by the boundary conditions that we are using (see (2.4)).

2) If $2n = 2$ and $\bar k_{\eta\eta'} = (\eta\frac{\pi}{M}, \eta'\frac{\pi}{M})$, then

$$L\hat W^{(h)}_{2,\sigma,\alpha,\omega}(k) = \frac14 \sum_{\eta,\eta'=\pm1} \hat W^{(h)}_{2,\sigma,\alpha,\omega}(\bar k_{\eta\eta'}) \left[1 + a_M \frac{M}{\pi}\,(\eta \sin k + \eta' \sin k_0)\right], \tag{4.7}$$

where $a_M \frac{M}{\pi} \sin\frac{\pi}{M} = 1$.

3) In all other cases $L\hat W^{(h)}_{2n,\sigma,\alpha,\omega}(k_1,\ldots,k_{2n-1}) = 0$.
Remark. Note that in the limit $M \to \infty$ (4.7) becomes simply

$$L\hat W^{(h)}_{2,\sigma,\alpha,\omega}(k) = \left[\hat W^{(h)}_{2,\sigma,\alpha,\omega}(0) + \sin k_0\, \partial_{k_0}\hat W^{(h)}_{2,\sigma,\alpha,\omega}(0) + \sin k\, \partial_k \hat W^{(h)}_{2,\sigma,\alpha,\omega}(0)\right], \tag{4.8}$$

hence $L\hat W^{(h)}_{2,\sigma,\alpha,\omega}(k)$ has to be understood as a discrete version of the Taylor expansion up to order 1. Since $a_M = 1 + O(M^{-2})$ this property would be true also if $a_M = 1$; however the choice (4.7) shares with (4.8) another important property, that is $L^2\hat W^{(h)}_{2,\sigma,\omega}(k) = L\hat W^{(h)}_{2,\sigma,\omega}(k)$, see [BM].
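The projection property $L^2 = L$ can be checked numerically. The following is a minimal sketch, assuming the two-point localization has the form $\frac14\sum_{\eta,\eta'}\hat W(\bar k_{\eta\eta'})[1 + a_M\frac{M}{\pi}(\eta\sin k + \eta'\sin k_0)]$ with $a_M\frac{M}{\pi}\sin\frac{\pi}{M} = 1$; the test kernel `W` and the lattice size `M = 64` are hypothetical choices made only for illustration.

```python
import math

def localize(W, M):
    """Return L W for a two-point kernel W(k, k0), assuming the form of (4.7)."""
    kbar = math.pi / M                # smallest allowed momentum component
    c = 1.0 / math.sin(kbar)          # c = a_M * M / pi, fixed by a_M (M/pi) sin(pi/M) = 1

    def LW(k, k0):
        total = 0.0
        for eta in (1, -1):
            for etap in (1, -1):
                weight = 1.0 + c * (eta * math.sin(k) + etap * math.sin(k0))
                total += W(eta * kbar, etap * kbar) * weight
        return total / 4.0

    return LW

def a_M(M):
    """a_M solving a_M (M/pi) sin(pi/M) = 1."""
    return (math.pi / M) / math.sin(math.pi / M)

def W(k, k0):
    """Hypothetical smooth kernel, used only as test data."""
    return math.exp(0.3 * math.sin(k)) + 0.5 * math.cos(k0) * math.sin(k0 + 0.2)

M = 64
LW = localize(W, M)      # L W
LLW = localize(LW, M)    # L (L W)
print(abs(LLW(0.7, -1.1) - LW(0.7, -1.1)))   # rounding-level difference
print(a_M(M) - 1.0)                          # small and positive, of order M^-2
```

With the alternative normalization $a_M = 1$ the same code (setting `c = M / math.pi`) would still approximate the first-order Taylor expansion, but the difference between `LLW` and `LW` would no longer be at rounding level.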
4.3. Relevant, marginal and irrelevant operators. By (4.6), (4.7) and the symmetry relations in Appendix F we can write $LV^{(h)}$ as:

$$LV^{(h)}(\psi^{(\le h)}) = (s_h + \gamma^h n_h) F_\sigma^{(\le h)} + l_h F_\lambda^{(\le h)} + z_h F_\zeta^{(\le h)} + a_h F_\alpha^{(\le h)}, \tag{4.9}$$

where $s_h, n_h, l_h, z_h, a_h$ are real and, if $|\lambda|, |\nu| \le \varepsilon$, $s_1 = O(m_1\lambda)$, $z_1, a_1 = O(\lambda)$, $l_1 = 2(\lambda a + \lambda b)\,\mathrm{sech}^4 J_r + O(\lambda^2)$, $\gamma n_1 = \nu + O(\lambda)$; moreover

$$
\begin{aligned}
F_\sigma^{(\le h)} &= \frac{1}{M^2}\sum_{k\in D_M}\sum_{\omega=\pm1} i\omega\, \hat\psi^{(\le h)+}_{k,\omega}\hat\psi^{(\le h)-}_{k,-\omega},\\
F_\lambda^{(\le h)} &= \frac{1}{M^8}\sum_{k_1,\ldots,k_4\in D_M} \hat\psi^{(\le h)+}_{k_1,+1}\hat\psi^{(\le h)+}_{k_2,-1}\hat\psi^{(\le h)-}_{k_3,-1}\hat\psi^{(\le h)-}_{k_4,+1}\,\delta(k_1-k_2+k_3-k_4),\\
F_\alpha^{(\le h)} &= \frac{1}{M^2}\sum_{k\in D_M}\sum_{\omega=\pm1} i\sin k\; \hat\psi^{(\le h)+}_{k,\omega}\hat\psi^{(\le h)-}_{k,\omega},\\
F_\zeta^{(\le h)} &= \frac{1}{M^2}\sum_{k\in D_M}\sum_{\omega=\pm1} \omega\sin k_0\; \hat\psi^{(\le h)+}_{k,\omega}\hat\psi^{(\le h)-}_{k,\omega},
\end{aligned}
\tag{4.10}
$$

where $\delta(k) = 0$ if $k \ne 0$ and $\delta(0) = 1$. Applying the operations $L$ to the kernels of the effective potential generates the sum in (4.9), i.e. a linear combination of the Grassmann monomials in (4.10) which, in the renormalization group language, are called "relevant" operators (the first) and "marginal" operators (the three others); while applying the operations $1-L$ generates a sum of (infinitely many, in the limit $M \to \infty$) monomials called irrelevant operators.

Note that one can repeat the analysis in Appendix F to conclude that many terms, which could be a priori present in (4.9), are indeed absent. Hence the constants $n_h, s_h, l_h, z_h, a_h$ are real and many possible marginal interactions (like $\sum_k \sin k\, \hat\psi^{(\le h)+}_{k,\omega}\hat\psi^{(\le h)-}_{k,-\omega}$ or $\sum_k \hat\psi^{(\le h)+}_{k,\omega}\hat\psi^{(\le h)-}_{k,\omega}$) are excluded. This remark is crucial in order to analyze the flow of the running coupling constants, see the next section: it shows that the number of relevant or marginal operators is far smaller than a priori one might expect, due to the symmetries of the hamiltonian.

Note also that we have written the coefficient of $F_\sigma^{(\le h)}$ as $s_h + \gamma^h n_h$ according to a rule which will be specified in (4.17), (4.18) below and for the reasons explained in the subsequent remark.

4.4. Renormalization. We have set all definitions needed to define the recursive procedure leading to the definition of the running couplings and of the effective potentials. Suppose that $Z_k, m_k, C_k, V^{(k)}$ in (4.3) have been defined for $k = 1, 0, \ldots, h+1$. Then we can write $V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})$ as $LV^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) + (1-L)V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})$ and we split from $LV^{(h)}(\sqrt{Z_h}\psi^{(\le h)})$ in (4.9) the three terms quadratic in $\psi^{(\le h)}$ given by $Z_h s_h F_\sigma^{(\le h)} + Z_h z_h (F_\zeta^{(\le h)} + F_\alpha^{(\le h)})$.

Since such terms are quadratic we can imagine including them in the "free integration" $P_{Z_h,m_h,C_h}(d\psi^{(\le h)})$ by simply replacing the integration symbol $P_{Z_h,m_h,C_h}(d\psi^{(\le h)})$ by a new Grassmann integration symbol $P_{\hat Z_{h-1},m_{h-1},C_h}(d\psi^{(\le h)})$, obtained from $P_{Z_h,m_h,C_h}(d\psi^{(\le h)})$ via the substitutions of $Z_h, m_h(k)$ with

$$
\begin{aligned}
\hat Z_{h-1}(k) &= Z_h\left[1 + t^{-1} C_h^{-1}(k)\,\tilde Z_1^{-1}\,(C_0 + \mu_{0,0}(k))\, z_h\right],\\
m_{h-1}(k) &= \frac{Z_h}{\hat Z_{h-1}(k)}\left[m_h(k) + C_h^{-1}(k)\, t^{-1} (C_0 + \mu_{0,0}(k))\, s_h\right];
\end{aligned}
\tag{4.11}
$$

and correspondingly by replacing $V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})$ by $\tilde V^{(h)} = V^{(h)} - Z_h s_h F_\sigma^{(\le h)} - Z_h z_h (F_\zeta^{(\le h)} + F_\alpha^{(\le h)})$. This means that the subtracted terms are imagined included in $P_{\hat Z_{h-1},m_{h-1},C_h}$, as an algebraic check confirms. If $\exp(-M^2 t_h)$ is a suitable constant factor fixing the normalization of the two integrations we get

$$\int P_{Z_h,m_h,C_h}(d\psi^{(\le h)})\, e^{-V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})} = e^{-M^2 t_h}\int P_{\hat Z_{h-1},m_{h-1},C_h}(d\psi^{(\le h)})\, e^{-\tilde V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})}, \tag{4.12}$$

and we try to express the r.h.s. as a double integral by writing $\psi^{(\le h)} = \psi^{(\le h-1)} + \psi^{(h)}$.
We shall call $m_h(0) \equiv m_h$ and $\hat Z_h(0) \equiv Z_h$. The r.h.s. of (4.12) can be written, as an algebraic check will confirm, as

$$e^{-M^2 t_h}\int P_{Z_{h-1},m_{h-1},C_{h-1}}(d\psi^{(\le h-1)}) \int P_{Z_{h-1},m_{h-1},\tilde f_h^{-1}}(d\psi^{(h)})\, e^{-\tilde V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})}, \tag{4.13}$$

where we have set

$$Z_{h-1} = Z_h\,(1 + z_h t^{-1}\tilde Z_1^{-1} C_0), \qquad \tilde f_h(k) = Z_{h-1}\left[\frac{C_h^{-1}(k)}{\hat Z_{h-1}(k)} - \frac{C_{h-1}^{-1}(k)}{Z_{h-1}}\right]. \tag{4.14}$$

Note that $\tilde f_h(k)$ has the same support as $f_h(k)$. The single scale propagator is

$$
\begin{aligned}
\int P_{Z_{h-1},m_{h-1},\tilde f_h^{-1}}(d\psi^{(h)})\, \psi^{(h)-}_{x,\omega}\psi^{(h)+}_{y,\omega'} &= \frac{g^{(h)}_{\omega,\omega'}(x-y)}{Z_{h-1}},\\
g^{(h)}_{\omega,\omega'}(x-y) &\stackrel{\rm def}{=} \frac{1}{tM^2}\sum_k e^{-ik(x-y)}\,\tilde f_h(k)\,[T_h^{-1}(k)]_{\omega,\omega'},
\end{aligned}
\tag{4.15}
$$

and $T_h(k)$ is defined by performing in (3.5) the replacement indicated in (4.4).

If $|\tilde Z_1^{-1} C_0 z_h| \le \frac12$, $|C_0 s_h| \le |m_h/2|$ and $\sup_{k\ge h}\left|\frac{Z_k}{Z_{k-1}}\right| \le e^{c_0|\lambda|}$, the large distance behavior of $g^{(h)}_{\omega,\omega'}(x-y)$ and of its (discrete) derivatives can be established in detail and one finds that it is characterized by a single length scale, namely $\gamma^{-h}$. The analysis leads to naively expected results that will be exploited in the following and it is performed in Appendix H.

We can now specify according to which rule the splitting $\hat W^{(h)}_{2,\sigma,\omega}(\bar k_{\eta\eta'}) = s_h + \gamma^h n_h$ in (4.9) will be done. We write
(h)
(h)
(h)
(h)
gω,−ω (x − y) = gω,−ω (x − y) + gω,−ω (x − y), 1 −ik(x−y) imh (k) def (h) gω,−ω (x − y) = fh (k) 2 2 e , (4.16) 2 sin k 2 + m2 (k) sin k0 + Z tM 2 Z 1 1 h k (h)
gω,−ω and it does not vanish for mh = 0. We write and gω,−ω (x − y) is gω,−ω − a(h) + W b(h) (h) = W W 2,σ ,ω 2,σ ,ω 2,σ ,ω
(4.17)
a(h) given by definition by a sum of terms containing at least a propagator with W 2,σ ,ω (k)
gω,−ω (x − y), k > h and we set 1 sh = δω,−ω 4
η,η =±1
a(h) (k¯ ηη ) W 2,σ ,ω
1 γ h nh = δω,−ω 4
η,η =±1
b(h) (k¯ ηη ) . W 2,σ ,ω (4.18)
Such definitions imply that $\hat W^{a(h)}_{2,\sigma,\omega}$ vanishes at $t = t_c$ for all $h$.
Remark. In a theory of fermions, if there is no mass term in the action then no mass terms are generated by the Renormalization Group iterations, by local Gauge invariance. In our spin model this is not true, as the interaction is not Gauge invariant; hence even if $t = t_c$ (or $m_1 = 0$) a mass term can be generated in the Renormalization Group iterations. Hence we collect all the relevant terms which vanish if $t = t_c$ in $s_h$, which we include in the fermionic integration; the "mass" has a non trivial flow producing at the end the critical index of the correlation length. The remaining terms are left in the effective interaction; they constitute the running coupling constant $\nu_h$, whose flow is controlled by the counterterm $\nu$.

We now rescale the kernels $W^{(h)}_{2n,\sigma,\alpha,\omega}$ in $\tilde V^{(h)}$, see (4.4), by a factor $\sqrt{Z_h}/\sqrt{Z_{h-1}}$ so that the effective potential $\tilde V^{(h)}(\sqrt{Z_h}\psi^{(\le h)})$ can be rewritten as

$$\tilde V^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) = \hat V^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)}); \tag{4.19}$$

and as a consequence, see (4.9),

$$
\begin{aligned}
L\hat V^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)}) &= \gamma^h \nu_h Z_{h-1} F_\sigma^{(\le h)} + \delta_h Z_{h-1} F_\alpha^{(\le h)} + \lambda_h (Z_{h-1})^2 F_\lambda^{(\le h)},\\
\nu_h &\stackrel{\rm def}{=} \frac{Z_h}{Z_{h-1}}\, n_h, \qquad \delta_h \stackrel{\rm def}{=} \frac{Z_h}{Z_{h-1}}\,(a_h - z_h), \qquad \lambda_h \stackrel{\rm def}{=} \left(\frac{Z_h}{Z_{h-1}}\right)^2 l_h.
\end{aligned}
\tag{4.20}
$$

We will call $v_h \stackrel{\rm def}{=} (\lambda_h, \delta_h, \nu_h)$ the running coupling constants and $Z_h, m_h$ the renormalization constants.

If we now define $V^{(h-1)}$, $\tilde E_h$ by

$$e^{-V^{(h-1)}(\sqrt{Z_{h-1}}\psi^{(\le h-1)}) - M^2\tilde E_h} = \int P_{Z_{h-1},m_{h-1},\tilde f_h^{-1}}(d\psi^{(h)})\, e^{-\hat V^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)})}, \tag{4.21}$$

such that $V^{(h-1)}(0) = 0$, we see that $V^{(h-1)}(\sqrt{Z_{h-1}}\psi^{(\le h-1)})$ is of the form (4.4) and $E_{h-1} = E_h + t_h + \tilde E_h$: this is checked by decomposing $\psi^{(\le h)} = \psi^{(\le h-1)} + \psi^{(h)}$ and by means of the relation (which is, essentially, a definition of truncated expectations),

$$M^2\tilde E_h + V^{(h-1)}(\sqrt{Z_{h-1}}\psi^{(\le h-1)}) = \sum_{n=1}^{\infty}\frac{1}{n!}(-1)^{n+1}\, E^T_{h,n}\big(\hat V^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)})\big), \tag{4.22}$$

where $E^T_{h,n}$ denotes the truncated expectation of order $n$ with propagator $Z_{h-1}^{-1} g^{(h)}_{\omega,\omega'}$, see (4.15).

The above procedure allows us to write the kernels $W^{(h)}_{2n,\sigma,\omega,\alpha}$ and $\tilde E_h$ by a convergent expansion in the running coupling constants and the renormalization constants at higher scales; more exactly we will prove in Appendix I the following proposition.
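4.5. Lemma 2. Suppose that $\varepsilon_h < \varepsilon$; then if $\varepsilon$ is small enough and if, for some constant $c_1$,

$$\max_{h'>h}|v_{h'}| \le \varepsilon_h, \qquad \sup_{h'>h}\left|\frac{m_{h'}}{m_{h'-1}}\right| \le e^{c_1\varepsilon_h}, \qquad \sup_{h'>h}\left|\frac{Z_{h'}}{Z_{h'-1}}\right| \le e^{c_1\varepsilon_h^2}, \tag{4.23}$$

then for a suitable $M$-independent constant $c_0$ the kernels in (4.4) satisfy

$$|\hat W^{(h)}_{2n,\sigma,\omega,\alpha}(k_1,\ldots,k_{2n-1})| \le \gamma^{-hD_k(P_{v_0})}\,(c_0\varepsilon_h)^{\max(1,n-1)}, \tag{4.24}$$

where $D_k(P_{v_0}) = -2 + n + k$ and $k = \sum_{i=1}^{2n}\alpha_i$. Moreover

$$(|n_h| + |z_h| + |a_h| + |l_h|) \le c_0\varepsilon_h, \qquad |s_h| \le |m_h|\, c_0\varepsilon_h, \qquad |\tilde E_{h+1}| \le \gamma^{2h} c_0\varepsilon_h. \tag{4.25}$$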
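The exponent $D_k = -2 + n + k$ in (4.24) reproduces the power counting used in §4.3 to separate relevant, marginal and irrelevant terms. The following is a small sketch of that bookkeeping; the function names are hypothetical, and the sign convention (negative dimension is relevant, zero is marginal, positive is irrelevant) is the usual one, assumed here rather than quoted from the text.

```python
def scaling_dimension(two_n, k):
    """D_k = -2 + n + k for a kernel with 2n fields and k derivatives in total."""
    n = two_n // 2
    return -2 + n + k

def classify(two_n, k):
    """Power-counting class of a monomial with 2n fields and k derivatives."""
    d = scaling_dimension(two_n, k)
    if d < 0:
        return "relevant"
    if d == 0:
        return "marginal"
    return "irrelevant"

# The four local terms of (4.9), followed by a six-field monomial.
for two_n, k, label in [(2, 0, "F_sigma"), (2, 1, "F_alpha, F_zeta"),
                        (4, 0, "F_lambda"), (6, 0, "six fields")]:
    print(label, scaling_dimension(two_n, k), classify(two_n, k))
```

Under this reading the only relevant term is the quadratic one without derivatives, while the quadratic terms with one derivative and the quartic term without derivatives are marginal, in agreement with the count "the first" and "the three others" given after (4.10).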
5. The Flow of the Running Coupling Constants

5.1. By the result in Sect. (4.5) it follows that the kernels $W^{(h)}_{2n,\sigma,\omega,\alpha}$ in (4.4) are bounded as soon as the condition (4.23) on the running coupling constants $v_{h'}$ and the renormalization constants $Z_{h'}, m_{h'}$, $h' > h$, is verified. Such quantities verify a set of recursive equations called Beta function relations, of the form

$$
\begin{aligned}
\nu_{h-1} &= \gamma\nu_h + \beta_\nu^h(a_h,\nu_h;\ldots;a_1,\nu_1), & a_{h-1} &= a_h + \beta_a^h(a_h,\nu_h;\ldots;a_1,\nu_1),\\
\frac{m_{h-1}}{m_h} &= 1 + \beta_m^h(a_h,\nu_h;\ldots;a_1,\nu_1), & \frac{Z_{h-1}}{Z_h} &= 1 + \beta_z^h(a_h,\nu_h;\ldots;a_1,\nu_1),
\end{aligned}
\tag{5.1}
$$

where $a_h = (\lambda_h, \delta_h)$. By explicit calculation of the lower order non-zero terms one finds, for $h \le 0$,

$$\beta_z^h(a_h,\nu_h;\ldots;a_1,\nu_1) = b_1\lambda_h^2 + O(\varepsilon_h^3), \qquad \beta_m^h(a_h,\nu_h;\ldots;a_1,\nu_1) = a_2\lambda_h + O(\varepsilon_h^2), \qquad b_1 > 0,\ a_2 > 0. \tag{5.2}$$
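To see how the leading terms in (5.2) produce anomalous power laws, one can iterate the last two relations of (5.1) with the Beta functions truncated at lowest order and with $\lambda_h$ frozen at a constant value $\lambda$. This is only an illustrative sketch: the values of `gamma`, `b1`, `a2`, `Z1` and `m1` below are hypothetical, and the higher order terms and the flow of $\lambda_h, \delta_h, \nu_h$ are ignored.

```python
import math

def run_flow(lam, h_min, b1=1.0, a2=1.0, Z1=1.0, m1=1e-3):
    """Iterate Z_{h-1} = Z_h (1 + b1 lam^2) and m_{h-1} = m_h (1 + a2 lam) from h = 1 down to h_min."""
    Z, m = {1: Z1}, {1: m1}
    for h in range(1, h_min, -1):
        Z[h - 1] = Z[h] * (1.0 + b1 * lam ** 2)
        m[h - 1] = m[h] * (1.0 + a2 * lam)
    return Z, m

lam, gamma = 0.1, 2.0
Z, m = run_flow(lam, h_min=-20)

# In this truncation Z_h = gamma^{eta (1 - h)} with eta = log_gamma(1 + b1 lam^2).
eta = math.log(1.0 + 1.0 * lam ** 2, gamma)
print(Z[-20], gamma ** (eta * 21))
print(m[-20] / m[1])
```

In this truncation $Z_h$ grows like $\gamma^{-\eta h}$ with an exponent of order $\lambda^2$, while $m_h/m_1$ grows like $\gamma^{-\eta_m h}$ with an exponent of order $\lambda$, which is the qualitative content of the two-sided bounds on $m_h/m_1$ and $Z_h$ stated in the lemma that follows.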
It is possible to prove the following proposition, see Appendix L.

5.2. Lemma 3. There are positive constants $c_i$, $i = 1,\ldots,6$, such that for $M$ large, $|t - t_c| > 0$ and small enough and $\lambda$ small enough one can uniquely define $\nu(\lambda)$ such that there exists an integer $h^* \le 0$ such that for $h^* - 1 \le h \le 0$,

$$
\begin{aligned}
&|\delta_h| \le c_1|\lambda|, \qquad |\nu_h| \le c_6|\lambda|\,(\gamma^{-\frac12(h-h^*)} + \gamma^{\kappa h}), \qquad |\lambda_h - \lambda| \le c_1|\lambda|^{3/2},\\
&\gamma^{-\lambda c_2 h} < \frac{m_h}{m_1} < \gamma^{-\lambda c_3 h}, \qquad \gamma^{-c_4\lambda^2 h} < Z_h < \gamma^{-c_5\lambda^2 h}.
\end{aligned}
\tag{5.3}
$$

The scale $h^*$ is such that $\gamma^{k-1} \ge 4|m_k|$ for $1 \ge k \ge h^*$, while $\gamma^{h^*-2} \le 4|m_{h^*}|$; it verifies

$$\frac{\log_\gamma|m_1|}{1 - \lambda c_2} \le h^* < \frac{\log_\gamma|m_1|}{1 - \lambda c_2} + 1. \tag{5.4}$$

5.3. Remark. $h^*$ is the scale at which the mass $m_h$ and the momentum scale $\gamma^h$ become of the same order (at the first steps $|m_h| \ll \gamma^h$ close to criticality). As $m_h$ has a non trivial flow, such scale depends on $\lambda$, see (5.4). It is sufficient to study the flow up to $h^*$ because the integration of $\psi^{(\le h^*)}$ can be performed in a single step, as $m_{h^*}$ acts as an infrared cut-off on the momentum scale $\gamma^{h^*}$; see §5.2. For scales greater than $h^*$, (5.3) says that it is possible to choose the counterterm $\nu$ so that $\nu_h$ stays bounded and $\lambda_h, \delta_h$ remain close to their initial values, while $m_h, Z_h$ have a non trivial anomalous flow.
The main point in order to prove the above proposition is that the functions $\beta_a^h(a_h,\nu_h;\ldots;a_1,\nu_1)$ can be written as the sum of two terms; only one of them is really crucial while the other has little effect on the flow, if the counterterm $\nu$ is chosen properly. In particular we write $\beta_a^h(a_h,\nu_h;\ldots;a_1,\nu_1)$ in the following way:

$$\beta_a^h(a_h,\nu_h;\ldots;a_1,\nu_1) = \beta_{a,L}^h(a_h;\ldots;a_1) + r_a^h(a_h,\nu_h;\ldots;a_1,\nu_1), \tag{5.5}$$

where $\beta_{a,L}^h(a_h;\ldots;a_1)$ is obtained from $\beta_a^h(a_h,\nu_h;\ldots;a_1,\nu_1)$ by setting $\nu_k = m_k = 0$ for all $k \ge h$ and substituting, for all $k \ge h$, each propagator $g^{(k)}_{\omega,\omega}(x-y)$ given by (4.16) with $g^{(k)}_{L;\omega,\omega}(x-y)$ given by (7.63), and each propagator $g^{(k)}_{\omega,-\omega}(x-y)$ with zero. By the estimates of the propagator in Appendix H and proceeding as in Appendix I it follows that, if (4.23) holds and $\bar\varepsilon$ is small enough, for $h \ge h^*$,

$$|r_\lambda^h| + |r_\delta^h| \le C\bar\lambda_h^2\,(\bar\nu_h + \gamma^{-\frac12(h-h^*)} + \gamma^{\kappa h}), \tag{5.6}$$

where $0 < \kappa < 1$ is a constant and $\bar\lambda_h = \sup_{k\ge h}|\lambda_k|$, $\bar\nu_h = \sup_{k\ge h}|\nu_k|$. On the other hand it was proved, following the strategy outlined in [BG], in [BGPS, GS, BM1] (see also [BeM1] for a simplified proof) that, with the latter definition of $\bar\lambda_h$ and for $h^* \le h \le 0$, the following result holds.

5.4. Lemma 4. There exist constants $\varepsilon_0$ and $\eta' > 0$ such that, if $|a_h| \equiv |(\delta_h,\lambda_h)| \le \varepsilon_0$, if the label $a$ is $a = \lambda, \delta$ and if $h \le 0$,

$$|\beta_{a,L}^h(a_h,\ldots,a_h)| \le C\bar\lambda_h^2\,\gamma^{\eta' h}. \tag{5.7}$$
The proof of the above statement is based on a Renormalization Group analysis of the Luttinger model (see for instance [BGM] for the definition of this model). Proceeding as in §4 one gets an expansion for the correlation functions in terms of running coupling constants $\lambda_h^{(L)}, \delta_h^{(L)}$ verifying

$$
\begin{aligned}
\lambda^{(L)}_{h-1} &= \lambda^{(L)}_h + \bar\beta^h_{\lambda,L}(a^{(L)}_h,\ldots,a^{(L)}_1),\\
\delta^{(L)}_{h-1} &= \delta^{(L)}_h + \bar\beta^h_{\delta,L}(a^{(L)}_h,\ldots,a^{(L)}_1),
\end{aligned}
\tag{5.8}
$$

where $\bar\beta^h_{\lambda,L}$, $\bar\beta^h_{\delta,L}$ are the same as the ones in (5.2) up to $O((\bar\lambda_h)^2\gamma^{\eta' h})$ terms for a suitable $\eta' > 0$. The proof of (5.7) is done by comparing the expression for the correlation functions obtained by the exact solution in [ML] with their expression as series in terms of running coupling constants, see [BGPS, GS, BM1] and [BeM1]. Hence by (5.6), (5.7) and some properties of the functions $\beta_i^h(a_h,\nu_h;\ldots;a_1,\nu_1)$ in (5.1) the above proposition on the flow follows, see Appendix L.

5.5. The propagator corresponding to the integration of all the scales between $h^*$ and $h_M$,

$$\frac{g^{(\le h^*)}_{\omega,\omega'}(x-y)}{Z_{h^*-1}} \equiv \int P_{Z_{h^*-1},m_{h^*-1},C_{h^*}}(d\psi^{(\le h^*)})\, \psi^{(\le h^*)-}_{x,\omega}\psi^{(\le h^*)+}_{y,\omega'}, \tag{5.9}$$

obeys the same bound as the propagator of the integration of a single scale greater than $h^*$, see (7.67) in Appendix H; this property can be used to perform the integration of all the scales $\le h^*$ in a single step. We define

$$e^{-M^2\tilde E_{\le h^*}} = \int P_{Z_{h^*-1},m_{h^*-1},C_{h^*}}(d\psi^{(\le h^*)})\, e^{-\hat V^{(h^*)}(\sqrt{Z_{h^*-1}}\psi^{(\le h^*)})}, \tag{5.10}$$

and in Appendix I it is proved that:
5.6. Suppose that $\varepsilon_{h^*} < \varepsilon$; then if $\varepsilon$ is small enough and (4.23) holds with $h = h^* - 1$, then

$$|\tilde E_{\le h^*}| \le \gamma^{2h^*} c_0\varepsilon_{h^*}. \tag{5.11}$$

Then by the statements in §4.5, §5.2, §5.6, for $\lambda$ small enough and $\nu$ suitably chosen we get a convergent expansion for the free energy

$$-\frac{1}{M^2}\log\hat Z^{-}_{2I} = \sum_{h=h^*+1}^{1}[\tilde E_h + t_h] + \tilde E_{\le h^*} + t_{\le h^*}. \tag{5.12}$$

The quantities $\tilde E_h$, $\tilde E_{\le h^*}$ are written by a convergent tree expansion, see Appendix I. Note that the fact that $h_M$ is finite, which is due to the fact that we are considering the addend with $\gamma^{(1)}, \gamma^{(2)} = -,-,-,-$ in (2.10), plays essentially no role in the analysis.
Remark. $\gamma^{h^*}$ is a momentum scale and, roughly speaking, for momenta bigger than $\gamma^{h^*}$ the theory is "essentially" a massless theory (up to $O(m_h\gamma^{-h})$ terms), while for momenta smaller than $\gamma^{h^*}$ it is a "massive" theory with mass $O(\gamma^{h^*})$ which can be integrated without multiscale decomposition.

6. Correlation Functions and the Specific Heat

6.1. In the preceding sections we have found a convergent expansion for the free energy; the latter is not interesting per se until we show that the free energy as a function of $t - t_c$ has some singularity at $t = t_c$. In order to show that $t = t_c$ is a critical point we can study some correlation functions or some thermodynamic property like the specific heat by evaluating them at $t \ne t_c$ and $M = \infty$ and then verifying that they have a singular behavior as $t \to t_c$. We shall study, for this purpose, the energy-energy correlation function (1.8) and the specific heat. We start by considering the following expression:

$$(x, y) = \langle [H_{I,x}(\sigma^{(1)}) + H_{I,x}(\sigma^{(2)})][H_{I,y}(\sigma^{(1)}) + H_{I,y}(\sigma^{(2)})] \rangle_{,T}, \tag{6.1}$$
where HI (σ (α) ) = x HI,x (σ (1) ), and HI (σ (α) ) is the Ising model hamiltonian defined in the first of (1.3). By using (2.9) we get, for x = y (x, y) =
(−1)
δγ1 +δγ2
γ1 ,γ2
γ ,γ
Z2I1 2 γ1 ,γ2 , (x − y), Z2I
(6.2)
where γ1 ,γ2 , (x − y) =
"# 2 α=1
− −
∂ (α)
∂Jx,x0 ;x,x0 +1
! ! V!
∂ (α)
∂Jr;y,y0 ;y+1,y0
{Jr }
! ! V!
! ! V!
∂
α sech2 Jr ∂t Sx,ε (a) ,ε (α) −
(α)
∂Jx,x0 ;x+1,x0 {Jr } $ # 2 α ; sech2 Jr ∂t Sy,ε (a) ,ε (α)
{Jr }
α=1
−
∂ (α)
∂Jy,y0 ;y,y0 +1
! ! V!
$%T {Jr }
.
(6.3)
If A1 , . . . , An are functions of the field, we are using the symbol < A1 ; . . . ; An >T to
(α) denote the truncated expectation w.r.t. the fermionic integration γ11,γ2 [ 2α=1 P (α) (α) ε ,ε Z2I
(dH (α) , dV (α) )]e−V of ni=1 Ai . Let us consider first the following expression, which gives the dominant large distance contribution %T 2 " 2 α α ˜ γ1 ,γ2 , (x − y) = ∂t Sx,ε(a) ,ε (α) ; ∂t Sy,ε(a) ,ε (α) . (6.4) α=1
α=1
Performing the change of variable (2.12), (2.13) we see that the r.h.s. of (6.4) can be ∂ ˜ γ1 ,γ2 , (x − y) = ∂ written as ∂φ(x) ∂φ(y) Sγ1 ,γ2 (φ)|φ=0 where, with the notation of (3.1), eSγ1 ,γ2 (φ) =
P (dψ)
P (dχ )eQ(χ,ψ)−V (ψ,χ) e
x
2 +∂t S 2 2 x,ε 1 ,ε 1 x,ε ,ε
φ(x) ∂t S 1
.
(6.5)
This is a new expression similar but not identical to the ones studied to obtain analyticity of the free energy for t = tc . We can study (6.5) in a similar way, by adapting the free energy analysis for the integration of Sγ1 ,γ2 (φ). def ˜ −,−,−,−, (x − y) def ˜ (x − y). One can Consider S−,−,−,− (φ) = S(φ) and = proceed as in §3 in order to integrate the massive χ fields and finding, for |λ| ≤ ε and with the notations of (3.1), (1) S (φ) M2N =e P (dψ)e−V (ψ)+B(φ,ψ) , (6.6) e where N is a normalization constant and B(ψ, φ) =
∞ ∞
···
m=1 n=1 σ ,α,ω x1
xm
···
y1
·
y2n
m &
· Bm,2n,σ ,α,ω (x1 , . . . , xm ; y1 , . . . , y2n )
i=1
where for n ≥ 2,
2n '&
φ(xi )
' ∂ αi ψyσii,ωi ,(6.7)
i=1
n
|Bm,2n,σ ,α,ω (x1 , . . . , xm ; y1 , . . . , y2n )| ≤ C n ε 2 ,
(6.8)
y1 ,...,y2n
and for n = 1 (1) − + iωZ1 φ(x)ψx,ω ψx,−ω + x
B1,2,σ ,α,ω (x; y1 , y2 )φ(x)
y1 ,y2 x {σ,ω} α1 +α2 ≥1
ψ), × ∂ α1 ψyσ11,ω1 ∂ α2 ψyσ22,ω2 + B(φ, (1)
where Z1 is an O(1) constant,
y1 ,y2
(6.9)
ψ) con|B1,2,σ ,α,ω (x; y1 , y2 )| ≤ C and B(φ, (1)
tains the terms with m ≥ 2. All kernels Bm,2n,σ ,α,ω and Z1 are analytic in λ, as follows by proceeding as in Appendix E. The symmetry considerations in Appendix F apply here as well and imply that the + − only possible local terms with n = m = 1 are of the form φ(x)ψx,1 ψx,−1 .
6.2. We shall evaluate the integral, over the light fermions, in the r.h.s. of (6.6) in a way which is very close to that used for the integration of the r.h.s. of (3.1). We introduce the scale decomposition described in §4 and we perform iteratively the integration of the single scale fields, starting from the field of scale 1. After integrating the fields $\psi^{(1)}, \ldots, \psi^{(h+1)}$, $0 \ge h \ge h^*$, we find

$$e^{S(\phi)} = e^{-M^2 E_h + S^{(h+1)}(\phi)}\int P_{Z_h,m_h,C_h}(d\psi^{(\le h)})\, e^{-V^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) + B^{(h)}(\sqrt{Z_h}\psi^{(\le h)},\phi)}, \tag{6.10}$$

where $P_{Z_h,m_h,C_h}(d\psi^{(\le h)})$ and $V^{(h)}$ are given by (4.4), respectively, while $S^{(h+1)}(\phi)$, which denotes the sum over all terms dependent on $\phi$ but independent of the $\psi$ field, and $B^{(h)}(\psi^{(\le h)},\phi)$, which denotes the sum over all terms containing at least one $\phi$ field and two $\psi$ fields, can be represented in the form

$$S^{(h+1)}(\phi) = \sum_{m=1}^{\infty}\sum_{x_1}\cdots\sum_{x_m} S_m^{(h+1)}(x_1,\ldots,x_m)\Big[\prod_{i=1}^{m}\phi(x_i)\Big], \tag{6.11}$$

$$B^{(h)}(\psi^{(\le h)},\phi) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\sum_{\sigma,\alpha,\omega}\sum_{x_1}\cdots\sum_{x_m}\sum_{y_1}\cdots\sum_{y_{2n}} B^{(h)}_{m,2n,\sigma,\alpha,\omega}(x_1,\ldots,x_m;y_1,\ldots,y_{2n})\Big[\prod_{i=1}^{m}\phi(x_i)\Big]\Big[\prod_{i=1}^{2n}\partial^{\alpha_i}\psi^{(\le h)\sigma_i}_{y_i,\omega_i}\Big]. \tag{6.12}$$

Since the field $\phi$ is equivalent, from the point of view of dimensional considerations, to two $\psi$ fields, the only terms in the r.h.s. of (6.12) which are not irrelevant are those with $m = 1$ and $n = 1$, which are marginal. Repeating the symmetry considerations in Appendix F we can conclude that the only local terms with $n = m = 1$ and $\alpha_1 = \alpha_2 = 0$ have the form $\phi(x)\psi^{(\le h)+}_{x,\omega}\psi^{(\le h)-}_{x,-\omega}$. Hence we extend the definition of the localization operator $L$, so that its action on $B^{(h)}(\psi^{(\le h)},\phi)$ is defined by its action on the kernels $B^{(h)}_{m,2n,\sigma,\alpha,\omega}(x_1,\ldots,x_m;y_1,\ldots,y_{2n})$:

1) if $m = 1$, $n = 1$, $\alpha_1 = \alpha_2 = 0$, then

$$LB^{(h)}_{1,2,\sigma,\alpha,\omega}(x_1;y_1,y_2) = \delta(y_1 - x_1)\,\delta(y_2 - x_1)\int dz_1\, dz_2\, B^{(h)}_{1,2,\sigma,\alpha,\omega}(x_1;z_1,z_2), \tag{6.13}$$

2) in all the other cases,

$$LB^{(h)}_{m,2n,\sigma,\alpha,\omega}(x_1,\ldots,x_m;y_1,\ldots,y_{2n}) = 0. \tag{6.14}$$

Hence, by the symmetry reasons discussed in Appendix F,

$$LB^{(h)}(\psi^{(\le h)},\phi) = \frac{Z_h^{(1)}}{Z_h}\,F_1^{(\le h)}, \tag{6.15}$$

where $Z_h^{(1)}$ is a real number and

$$F_1^{(\le h)} = \sum_x \phi(x)\, i\,[\psi^{(\le h)+}_{x,1}\psi^{(\le h)-}_{x,-1} - \psi^{(\le h)+}_{x,-1}\psi^{(\le h)-}_{x,1}]. \tag{6.16}$$
In the expansion for the energy-energy correlation function there is then a renormal(1) ization constant more, namely Zh . With the notation of §4 we can write the integral in the r.h.s. of (6.10) 2 (≤h) (h) √ (≤h) (h) √ e−M th PZh−1 ,mh−1 ,Ch (dψ (≤h) )e−V ( Zh ψ )+B ( Zh ψ ,φ) −M 2 th PZh−1 ,mh−1 ,Ch−1 (dψ (≤h−1) ) · =e (≤h) (≤h) (h) √ (h) √ · PZh−1 ,mh−1 ,f−1 (dψ (h) )e−V ( Zh−1 ψ )+B ( Zh−1 ψ ,φ) , (6.17) h
√
√ (h) ( Zh−1 ψ (≤h) ) is defined as in (4.19) and B (h) ( Zh−1 ψ (≤h) , φ) = where√ V √ B (h) ( Zh ψ (≤h) , φ); moreover B (h−1) ( Zh−1 ψ (≤h−1) , φ) and S (h) (φ) are then defined through the relation analogue of (4.21), that is √
√
2
(h)
)+B ( Zh−1 ψ ,φ)−M Eh +S (φ) e−V ( Zh−1 ψ √ (h) (≤h) (≤h) (h) √ = PZh−1 ,mh−1 ,f−1 (dψ (h) )e−V ( Zh−1 ψ )+B ( Zh−1 ψ ,φ) . (6.18) (h−1)
(≤h−1)
(h−1)
(≤h−1)
h
As in §5.5, the fields of scale between h∗ and hM are integrated in a single step without any multiscale decomposition. Hence we define, in analogy to (5.10), ∗ ∗ h∗ def S (h ) (φ)−M 2 E = PZh∗ −1 ,mh∗ −1 ,Ch∗ (dψ (≤h ) ) e √ √ (≤h∗ ) )+B (h∗ ) (h∗ ) ( Zh∗ −1 ψ (≤h∗ ) ,φ) . (6.19) × e−V ( Zh∗ −1 ψ It follows that ∗
S(φ) = −M 2 EM + S (h ) (φ) = −M 2 EM +
1
S (h) (φ) ;
(6.20)
h=h∗
hence, if S2 (x, y) = (h)
∂ (h) ∂ ∂φ(x) ∂φ(y) S (φ)|φ=0 , ∗
˜ (x, y) = S (h ) (x, y) = 2
1 h=h∗
(h) S2 (x, y) .
(6.21)
˜ (x, y) = α (x, y) + β (x, y), where 6.3. It is shown in Appendix M that α (x, y) =
1
(Z (1) )2 (h ) h∨h (h) [gω,ω (x − y)g−ω,−ω (y − x)− Zh−1 Zh −1
h,h =h∗ ω=±1
(h) g+1,−1 (x
(h ) − y)g−1,+1 (y
− x)] +
1 h=h∗
(
(1)
Zh Zh
)2 (h),a
G
(x, y),
(6.22)
(h∗ )
(≤h∗ )
where h ∨ h = max{h, h } and gω1 ,ω2 (x) has to be understood as gω1 ,ω2 (x); moreover for all N > 0 there exists a constant CN such that (h),α
|∂xm1 ∂xm00 G
(x, y)| ≤ γ (2+m0 +m1 )h |λ1 |
CN h 1 + (γ |d(x
− y)|)N
.
(6.23)
β
For (x, y) the following bound holds: ( (1) )2 1 Zh CN m1 m0 β |∂x ∂x0 (x, y)| ≤ γ (2+m0 +m1 +τ )h , h Zh 1 + (γ |d(x − y)|)N ∗
(6.24)
h=h
where 0 < τ < 1 is a constant. A similar bound, by dimensional reasons, holds for ˜ (x, y) − −,−,−,−, (x, y). It is shown in Appendix M that sech4 (1)
Zh−1 (1) Zh
(1)
= 1 + zh = 1 + a1 λh + O(µ2h ) ,
so that there exist two constants c1 , c2 such that γ −λ1 c1 h <
(1)
Zh Zh
(6.25)
< γ −λ1 c2 h . If we define
η = logγ (1 + z[h∗ /2] ) , −1 zh = with C0 Z 1
Zh−1 Zh
(6.26)
−ηh
− 1, we can check that | γZh − 1| ≤ Cλ21 and, from (5.2), (1)
η1 = logγ (1 + z[h∗ /2] ), it holds η = b1 λ21 + O(λ3 ). In a similar way, if we define |
η1 h Z1 γ − (1)
(1)
Zh
− 1| ≤ C|λ1 | and η1 = a1 λ1 + O(λ2 ).
Note also that, by reasoning as in Appendix G, for x, y and t − tc fixed lim [γ1 ,γ2 , (x, y) − −,−,−,−, (x, y)] = 0.
M→∞
(6.27)
and the limit is reached exponentially fast. Then (6.2) is equal to the limiting value of −,−,−,−, (x, y). In order to prove the first inequality in (1.9), we write, if m0 +m1 = n and η1 = η − η1 , ! ! ( )2 ! 1 ! ! Zh(1) ! m1 m0 (h),α ! ! ∂ ∂ G (x, 0) x x0 ! ! !h=h∗ Zh ! ≤ CN,n
0 h=h∗
γ (2+2η1 +n)h CN,n HN,2+2η1 +n (|d(x)|), ≤ h N [1 + (γ |d(x)|) ] |d(x)|2+2η1 +n
(6.28)
where η1 = η − η1 , HN,α (r) =
0 h=h∗
(γ h r)α . 1 + (γ h r)N
(6.29)
On the other hand one sees that, if α ≥ 1/2 and N − α ≥ 1, there exists a constant CN,α , HN,α (r) ≤
CN,α , 1 + ( r)N−α
∗
= γh ,
(6.30)
and this implies the first inequality in (1.9). Proceeding in the same way by using (6.24) one can prove the second inequality in (1.9). Moreover by writing the propagators in the first two sums in the r.h.s. of (6.22) as in (7.62), (4.16) and using (7.63),(7.65),(7.66),(4.16) it follows (1.11). Finally firstnote that the specific heat Cvλ differs, by trivial dimensional arguments, from the sum x ε, (x, 0) by terms which are O(λ). By (6.22),(6.23),(6.24) it holds
|a (x, 0)|
≤C
1
(
h=h∗
x
(1)
Zh Zh
)2 ≤ C2
1
∗
γ
2η1 h
h=h∗
1 − γ 2η1 h ≤ C2 . 2η1
(6.31)
On the other hand, by (6.22),(6.23),(6.24), ! ! ! ! 1 (Z (1) )2 (h) ! ! (h ) h∨h ! ! (x, 0) − g (x)g (−x) L,ω,ω L,−ω,−ω ! ! Zh−1 Zh −1 ! ! x ∗ x h,h =h ω=±1 ( (1) )2 1 Z |mh | h ≤ C |λ| + γ τ h + h , (6.32) Z γ h ∗ h=h
so the first of the two inequalities in (1.12) follows. Remark. It is interesting to see how the results in [PS, Spe] can be recovered by our analysis. We can consider the hamiltonian (1.1) with interaction given for instance by V = −λ
2
(α)
(α)
(α)
(α)
(α) (α) [σx,x σ σ (α) σ + σx,x σ σ (α) σ ] 0 x+1,x0 x,x0 x+1,x0 0 x,x0 +1 x,x0 x,x0 +1
(6.33)
x,x0 α=1
describing two independent Ising models with a quartic interaction. We will briefly explain in Appendix N that all the above analysis can be repeated in such a case and, due to the special form of (6.33), formulas (1.8)-(1.12) hold with $\eta_1 = \eta_2 = 0$, i.e. there is universality.

7. Appendices

7.1. Appendix A: Grassmann integration. Grassmann variables $\bar H^{(\alpha)}_x, H^{(\alpha)}_x, \bar V^{(\alpha)}_x, V^{(\alpha)}_x$, $x \in \Lambda_M$, are such that all functions of them are polynomials. The Grassmann integration $\int\prod_{x\in\Lambda_M} dH^{(\alpha)}_x\, d\bar H^{(\alpha)}_x$ of a monomial $Q(H^{(\alpha)},\bar H^{(\alpha)})$ in the variables $H^{(\alpha)}_x, \bar H^{(\alpha)}_x$, $x \in \Lambda_M$, is defined to be zero, except in the case $Q(H^{(\alpha)},\bar H^{(\alpha)}) = \prod_x H^{(\alpha)}_x\bar H^{(\alpha)}_x$, up to a permutation of the variables. In this case the value of the functional is determined, by using the anticommuting properties of the variables, by the condition

$$\int\prod_{x\in\Lambda_M} d\bar H^{(\alpha)}_x\, dH^{(\alpha)}_x \prod_{x\in\Lambda_M} H^{(\alpha)}_x\bar H^{(\alpha)}_x = 1. \tag{7.1}$$

In a similar way the Grassmann integration for $V^{(\alpha)}, \bar V^{(\alpha)}$ is defined likewise, exchanging $H, \bar H$ with $V, \bar V$.
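The normalization (7.1) can be made concrete for a single site with a few lines of code. The following is a minimal sketch of a Grassmann algebra, not taken from the paper: polynomials are stored as dictionaries from sorted tuples of generator indices to coefficients, the functions `gmul` and `berezin` are hypothetical helpers, and the iterated integral $\int d\bar H\, dH$ is assumed to act with the innermost differential ($dH$) first.

```python
def gmul(a, b):
    """Product of two Grassmann polynomials given as {sorted index tuple: coefficient}."""
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            if set(ma) & set(mb):
                continue  # a repeated generator squares to zero
            seq = list(ma + mb)
            # sign of the permutation that sorts the generators (parity of inversions)
            inv = sum(1 for i in range(len(seq))
                      for j in range(i + 1, len(seq)) if seq[i] > seq[j])
            key = tuple(sorted(seq))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def berezin(f, i):
    """Integral over generator i: move it to the front (with sign) and strip it."""
    out = {}
    for mono, c in f.items():
        if i in mono:
            pos = mono.index(i)
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** pos * c
    return out

# One site: generator 0 plays the role of H, generator 1 the role of Hbar.
H, Hbar = {(0,): 1}, {(1,): 1}

# Integral of H Hbar against dHbar dH (inner integral over H first).
print(berezin(berezin(gmul(H, Hbar), 0), 1))
# Reversing the order of the two variables flips the sign.
print(berezin(berezin(gmul(Hbar, H), 0), 1))
```

Monomials that do not contain every generator integrate to zero in this scheme, which mirrors the statement that the integral vanishes except on the top monomial.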
7.2. Appendix B: Expression of the partition function as a Grassmann integral. If a = 0 2I , see (2.8), by making use of (2.9), as in (2.10) with or b = 0 we can write Z γ1 ,γ2 = (cosh Jr )2B 22S 1 Z 2I 4 (1) (2) 2 S (α) (α) S (α) (α) [ dHx dH x dVx dV x ] e Jr ;ε(1) ,ε (1) e Jr ;ε(2) ,ε (2) · α=1 x∈M
·
(1) (2) (1) (2) 1+λ a(tanh Jr + sech2 Jr H x,x0 Hx+1,x )(tanh Jr + sech2 Jr H x,x0 Hx+1,x ) 0
0
x∈M
·
1+λ a(tanh Jr + sech2 Jr V x,x0 Vx,x +1 )(tanh Jr + sech2 Jr V x,x0 Vx,x +1 ) 0 0 (1)
(2)
(1)
(2)
x∈M
·
x∈M
·
1 + λb(tanh Jr + sech2 Jr H x,x0 Hx+1,x )(tanh Jr + sech2 Jr V x,x0 Vx,x +1 ) 0 0 (1)
(2)
(1)
(2)
(1) (2) (1) (2) 1 + λb(tanh Jr + sech2 Jr V x,x0 Vx,x +1 )(tanh Jr + sech2 Jr H x−1,x0 +1 Hx,x +1 ) . 0 0
x∈M
(7.2) def
S
(α)
S
(α)
S
(α),ν
(α)
By writing tanh Jr = tanh J + ν(λ), we have e Jr ;ε,ε = e J ;ε,ε e ε,ε , where SJ ;ε,ε is given by (2.2) and (α) (α) (α),ν (α) (α) Sε,ε = ν [H x,x0 Hx+1,x0 + V x,x0 Vx,x0 +1 ] . (7.3) x∈M
2I can be written as in (2.11) with We can check that Z V = Va + Vb −
2
S
(α),ν , ε(α) ,ε (α)
(7.4)
α=1
* tanh2 Jr ) and [i] = a, b and, if fi = log(1 + λ[i] (1) (2) (1) (2) [fa + λa [H x,x0 Hx+1,x0 + H x,x0 Hx+1,x0 ] −Va = x∈M (1)
(1)
(2)
(2)
+λa H x,x0 Hx+1,x0 H x Hx+1,x0 ] +
−Vb =
[fa + λa [V x,x0 Vx,x0 +1 (1)
(1)
x∈M (2) (1) (1) (2) (2) (2) +V x,x0 Vx,x0 +1 ] + λa V x Vx,x0 +1 V x,x0 Vx,x0 +1 ] (1) (2) (1) (2) (2) (1) (2) (1) [fb + λb [H x,x0 Hx+1,x0 + V x,x0 Vx,x0 +1 ] + λb H x,x0 Hx+1,x0 V x Vx,x0 +1 ] x∈M (1) (2) (1) (2) [fb + λb [V x,x0 Vx,x0 +1 + H x−1,x0 +1 Hx,x0 +1 ] + x∈M (1) (2) (1) (2) (7.5) +λb V x,x0 Vx,x0 +1 H x−1,x0 +1 Hx,x0 +1 ],
where 2 J tanh J , * tanh2 Jr ) = λ[i]sech * λi (1 + λ[i] r r 2 4J . 2 * tanh Jr )(λi + ( * (1 + λ[i] λi ) ) = λ[i]sech r 4 J + O(λ)). * * For small λ it is λi = λ[i](tanh Jr sech2 Jr + O(λ)), λi = λ[i](sech r
(7.6)
(α)
7.3. Appendix C: Change from Majorana to Dirac Grassmann variables. If S (α) (α) J ;ε ,ε (α) = x S (α) (α) we get, from the change of variables (2.12), x,ε
,ε
S
(α) x,ε(α) ,ε (α)
=S
(α,ψ) x,ε(α) ,ε (α)
+S
(α,χ) x,ε(α) ,ε (α)
+Q
(α) , x,ε(α) ,ε (α)
where S
(α,ψ) x,ε(α) ,ε (α)
t (α) t (α) (α) (α) [ψx (∂1 − i∂0 )ψx(α) + ψ x (∂1 + i∂0 )ψ x ] + [−iψ x (∂1 ψx(α) 4 4 √ (α) (α) (α) +∂0 ψx(α) ) + iψx(α) (∂1 ψ x + ∂0 ψ x )] + i( 2 − 1 − t)ψ x ψx(α) (7.7)
=
with the definitions (α)
(α)
∂1 ψx(α) = ψx+1,x0 − ψx(α)
∂0 ψx(α) = ψx,x0 +1 − ψx(α) .
(7.8)
Moreover (α,χ) x,ε(α) ,ε (α)
S
=
and finally Q(χ , ψ) = (α) x,ε(α) ,ε (α)
Q
=
t (α) (α) [χ (∂1 − i∂0 )χx(α) + χ (α) x (∂1 + i∂0 )χ x ] 4 x t (α) (α) + [−iχ (α) x (∂1 χx + ∂0 χx ) 4 √ (α) (α) (α) +iχx(α) (∂1 χ (α) x + ∂0 χ x )] − i( 2 + 1 + t)χ x χx ,
(1) x [Qx,−,−
(7.9)
(2)
+ Qx,−,− ] with
t (α) (α) {−ψx(α) (∂1 χx(α) + i∂0 χx(α) ) − ψ x (∂1 χ (α) x − i∂0 χ x ) 4 (α) (α) (α) (α) −χx(α) (∂1 ψx(α) + i∂0 ψx(α) ) − χ (α) x (∂1 ψ x − i∂0 ψ x ) + iψ x (∂1 χx (α) (α) (α) (α) (α) (α) (α) −∂0 χx ) + iψx (−∂1 χ x + ∂0 χ x ) + iχ x (∂1 ψx − ∂0 ψx ) (α) (α) (7.10) +iχx(α) (−∂1 ψ x + ∂0 ψ x )} .
Moreover (α)
(α)
(α)
(α)
α H x,x0 Hx+1,x0 + V x,x0 Vx,x0 +1 = ∂t Sx,ε (α) ,ε (α) (α),ν
so that Sεα ,ε α = ν Let us define
α x∈ ∂t Sx,ε(α) ,ε (α) .
P−,− (dψ) = Nψ−1 [ (α)
(7.11)
(α)
(α)
dψ k dψk ] exp[
k∈D−,− (α) (α) +ψ k ψ −k (i sin k
t 4M 2
(α)
(α)
[ψk ψ−k (i sin k + sin k0
k∈D−‘,− (α) (α) − sin k0 ) − i2mψ (k)ψ k ψ−k ]],
(7.12)
√ where Nψ is a normalization constant and 2t mψ (k) = (− 2 + 1 + t) + 2t (cos k0 + (α) cos k − 2). Defining in the same way P−,− (dχ ), with the only difference that mχ (k) = √ −−−− as −( 2 + 1 + t) − 2t (cos k0 + cos k − 2) 2t replaces mψ (k), we can rewrite Z 2I −,−,−,− = (cosh Jr )2B 22S 1 Z 2I 4
2
(α) (α) P−,− (dψ)P−,− (dχ )
eQ(χ,ψ)−V (χ,ψ) .
(7.13)
α=1
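We can perform the following change of variables:

$$
\begin{aligned}
\hat\psi^{-}_{1,k} &= \frac{1}{\sqrt2}(\psi^{(1)}_k + i\psi^{(2)}_k), & \hat\psi^{+}_{1,-k} &= \frac{1}{\sqrt2}(\psi^{(1)}_k - i\psi^{(2)}_k),\\
\hat\psi^{-}_{-1,k} &= \frac{1}{\sqrt2}(\bar\psi^{(1)}_k + i\bar\psi^{(2)}_k), & \hat\psi^{+}_{-1,-k} &= \frac{1}{\sqrt2}(\bar\psi^{(1)}_k - i\bar\psi^{(2)}_k),
\end{aligned}
\tag{7.14}
$$

which in coordinate space is (2.13) if $\psi^\sigma_{\omega,x} = \frac{1}{M^2}\sum_k e^{i\sigma kx}\hat\psi^\sigma_{\omega,k}$, $\sigma = \pm$. By this change of variables $P^{(1)}_{-,-}(d\psi^{(1)})P^{(2)}_{-,-}(d\psi^{(2)}) \equiv P(d\psi)$ and $P^{(1)}_{-,-}(d\chi^{(1)})P^{(2)}_{-,-}(d\chi^{(2)}) = P(d\chi)$, where $P(d\psi)$, $P(d\chi)$ are given by (2.16). In the physical language, the change of variables (2.13) means that one is describing the system in terms of Dirac fermions instead of in terms of Majorana fermions.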
7.4. Appendix D: The interaction in fermionic Grassmann variable. Note that (α)
(α)
V x,x0 Vx,x0 +1 = Q1(α) + Q2(α) + Q3(α) x x x ,
(7.15)
where 1 (α) (α) (α) (α) (α) (α) (α) (α) − ψ x,x0 ψ x,x0 +1 + ψ x,x0 ψx,x0 +1 − ψx,x ψ ], [ψ ψ 0 x,x0 +1 4i x,x0 x,x0 +1 1 (α) (α) (α) (α) (α) (α) (α) = [χx,x0 χx,x0 +1 − χ (α) x,x0 χ x,x0 +1 + χ x,x0 χx,x0 +1 − χx,x0 χ x,x0 +1 ], 4i (7.16) 1 (α) (α) (α) (α) (α) (α) (α) (α) = [ψx,x χ − ψ x,x0 χ x,x0 +1 − ψx,x χ + ψ x,x0 χx,x0 +1 0 x,x0 +1 0 x,x0 +1 4i (α) (α) (α) (α) (α) (α) (α) ψ − χ (α) +χx,x x,x0 ψ x,x0 +1 − χx,x0 ψ x,x0 +1 + χ x,x0 ψx,x0 +1 ]. 0 x,x0 +1
Q1(α) = x Q2(α) x Q3(α) x
(α)
(α)
A similar expression holds for H x,x0 Hx+1,x0 . The above expressions can be naturally expressed in terms of (discrete) derivatives of the fields. In fact by looking for instance to the first of (7.16) one finds (α)
(α)
(α)
(α)
(α)
(α) (α) ψ − ψ x,x0 ψ x,x0 +1 = ψx,x ∂ ψ (α) − ψ x,x0 ∂x0 ψ x,x0 ψx,x 0 x,x0 +1 0 x0 x,x0
(7.17)
and (α)
(α)
(α)
(α)
(α)
(α)
(α)
(α) (α) (α) ψ x,x0 ψx,x0 +1 − ψx,x ψ = −∂x0 ψ x,x0 ∂x0 ψx,x + ψ x,x0 ψx,x + ψ x,x0 +1 ψx,x0 +1 . 0 x,x0 +1 0 0 (7.18) (α)
(α)
From (7.5) we see that V is the sum of expressions linear or bilinear in H x Hx+1,x0 or
(α) (α) V x Vx,x0 +1 ; moreover the change of variables (2.13) and (2.14) replaces a ψ, χ field ± ± with a ψ1± , χ1± field, and a ψ, χ field with a ψ−1 , χ−1 field; hence we see that V is a
sum of terms of the form (2.19). Analogous considerations hold for Q.
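7.5. Appendix E: The integration of the χ fields. We start from the definition of truncated expectation:

$$E^T_\chi(X;n) = \frac{\partial^n}{\partial\lambda^n}\log\int P(d\chi)\, e^{\lambda X(\chi)}\Big|_{\lambda=0} \tag{7.19}$$

so that, calling

$$\bar{\mathcal V}(\chi,\psi) = -Q(\chi,\psi) + V(\chi,\psi) \tag{7.20}$$

we obtain

$$M^2 N^{(1)} - V^{(1)}(\psi) = \log\int P(d\chi)\, e^{-\bar{\mathcal V}(\chi,\psi)} = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\, E^T_\chi(\bar{\mathcal V}(\cdot,\psi);n). \tag{7.21}$$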
0 ) in V by an index We label each one of the monomials (whose number will be called C vi , so that each monomial can be written as ε(f ) ε(f ) v(xvi ) ∂ α(f ) ψω(f ),x(f ) ∂ α(f ) χω(f ),x(f ) , (7.22) xvi
v f ∈P i
f ∈Pvi
vi are the set where xvi is the total set of coordinates associated to vi , and Pvi and P of indices labeling the χ or ψ-fields; v(xvi ) are short ranged functions (products of Kronecker deltas, see (7.5)). We can write v0 ) , V (1) (ψ) = V (1) (P (7.23) v =0 P 0
v0 ) = V (1) (P
xv0
KPv (xv0 ) = 0
v f ∈P 0
ε(f ) ∂ α(f ) ψω(f ),x(f ) KPv (xv0 ) , 0
∞ n 1 T Eχ [ χ (Pv1 ), . . . , χ (Pvn )] vi (xvi ) , n! v ,..,v n=1
1
n
(7.24)
(7.25)
i=1
+ ε(f ) n , P v0 = i P vi and xv0 = where χ (Pv ) = f ∈Pv ∂ α(f ) χω(f ),x(f ) , v1 ,...vn 1 ≤ C 0 + T i xvi . We use now the well known expression for Eχ (see for instance [Le]) (χ) T χ (P1 ), ..., χ (Ps )) = gω− ,ω+ (xl − yl ) dPT (t) det GT (t), (7.26) Eχ ( T l∈T
where:
ε(f ) a) P is a set of indices, and χ (P ) = f ∈P ∂ α(f ) χx(f ),ω(f ) . b) T is a set of lines forming an anchored tree between the cluster of points P1 , .., Ps i.e. T is a set of lines which becomes a tree if one identifies all the points in the same clusters. c) t = {ti,i ∈ [0, 1], 1 ≤ i, i ≤ s}, dPT (t) is a probability measure with support on a set of t such that ti,i = ui · ui for some family of vectors ui ∈ Rs of unit norm. d) GT (t) is a (n − s + 1) × (n − s + 1) matrix, whose elements are given by GTij,i j = ti,i gω− ,ω+ (xij − yi j ) with (fij− , fi+ j ) not belonging to T .
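The truncated expectation (7.19) is the $n$-th $\lambda$-derivative at zero of the logarithm of a generating function. For an ordinary commuting random variable this is the $n$-th cumulant, which can be obtained from the moments by a standard recursion. The sketch below illustrates that commuting analogue only; the Grassmann case treated in this appendix carries additional signs, and the function name `cumulants_from_moments` is an assumption of the example.

```python
from fractions import Fraction
from math import comb

def cumulants_from_moments(moments):
    """Given moments[j-1] = E[X^j] for j = 1..n, return the cumulants kappa_1..kappa_n.

    Uses the recursion m_n = sum_{j=1}^{n} C(n-1, j-1) * kappa_j * m_{n-j}, with m_0 = 1,
    which follows from differentiating log E[exp(lambda X)] n times at lambda = 0.
    """
    kappa = []
    for n in range(1, len(moments) + 1):
        k = Fraction(moments[n - 1])
        for j in range(1, n):
            k -= comb(n - 1, j - 1) * kappa[j - 1] * Fraction(moments[n - j - 1])
        kappa.append(k)
    return kappa

# Centered Gaussian with variance 2: moments 0, 2, 0, 12
gaussian = cumulants_from_moments([0, 2, 0, 12])

# Bernoulli variable with p = 1/2: every moment equals 1/2
half = Fraction(1, 2)
bernoulli = cumulants_from_moments([half] * 4)

print(gaussian, bernoulli)
```

For the Gaussian only the second cumulant survives, the commuting counterpart of the fact that truncated expectations with respect to a Gaussian integration are built from propagators alone.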
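If $s = 1$ the sum over $T$ is empty, but we can still use the above equation by interpreting the r.h.s. as $1$ if $P_1$ is empty, and $\det G(P_1)$ otherwise. We bound the determinant using the well known Gram-Hadamard inequality, stating that, if $M$ is a square matrix with elements $M_{ij}$ of the form $M_{ij} = \langle A_i, B_j\rangle$, where $A_i$, $B_j$ are vectors in a Hilbert space with scalar product $\langle\cdot,\cdot\rangle$, then
$$|\det M| \le \prod_i \|A_i\|\cdot\|B_i\|, \tag{7.27}$$
where $\|\cdot\|$ is the norm induced by the scalar product.

Let $\mathcal H = \mathbb R^s\otimes\mathcal H_0$, where $\mathcal H_0$ is the Hilbert space of complex four dimensional vectors $F(\mathbf k) = (F_1(\mathbf k),\ldots,F_4(\mathbf k))$, $F_i(\mathbf k)$ being a function on the set $\mathcal D_{-,-}$, with scalar product
$$\langle F, G\rangle = \frac{1}{M^2}\sum_{\mathbf k}\sum_{i=1}^{4} F_i^*(\mathbf k)\,G_i(\mathbf k), \tag{7.28}$$
and one checks that
$$G^T_{ij,i'j'} = t_{i,i'}\,g^{(\chi)}_{\omega_l^-,\omega_l^+}(\mathbf x_{ij} - \mathbf y_{i'j'}) = \big\langle \mathbf u_i\otimes A_{\mathbf x(f^-_{ij}),\omega(f^-_{ij})},\ \mathbf u_{i'}\otimes B_{\mathbf x(f^+_{i'j'}),\omega(f^+_{i'j'})}\big\rangle, \tag{7.29}$$
where $\mathbf u_i\in\mathbb R^s$, $i = 1,\ldots,s$, are the vectors such that $t_{i,i'} = \mathbf u_i\cdot\mathbf u_{i'}$, and (with $Q(\mathbf k)$ defined in (2.23))
$$\begin{aligned}
A_{\mathbf x,\omega}(\mathbf k) &= e^{i\mathbf k\mathbf x}\,\frac{1}{\sqrt{-Q_\chi(\mathbf k)}}\cdot\begin{cases}(-\sin k_0 + i\sin k,\ 0,\ -i m_\chi(\mathbf k),\ 0), & \text{if } \omega = +1,\\ (0,\ i m_\chi(\mathbf k),\ 0,\ m_\chi(\mathbf k)), & \text{if } \omega = -1,\end{cases}\\
B_{\mathbf y,\omega}(\mathbf k) &= e^{i\mathbf k\mathbf y}\,\frac{1}{\sqrt{-Q_\chi(\mathbf k)}}\cdot\begin{cases}(1,\ 1,\ 0,\ 0), & \text{if } \omega = +1,\\ (0,\ 0,\ 1,\ (\sin k_0 + i\sin k)/m_\chi(\mathbf k)), & \text{if } \omega = -1.\end{cases}
\end{aligned} \tag{7.30}$$
Hence from (7.27) we immediately find
$$|\det G^T| \le C_1^n, \tag{7.31}$$
where $C_1$ is an $O(1)$ constant. Finally we get
$$\sum_{\mathbf x_{v_0}} |K_{\tilde P_{v_0}}(\mathbf x_{v_0})| \le \sum_{n=1}^{\infty}\frac{1}{n!}\sum_{v_1,\ldots,v_n}\ \sum_{\mathbf x_{v_1},\ldots,\mathbf x_{v_n}} C_1^n \sum_T\Big[\prod_{l\in T}|g^{(\chi)}_{\omega^-,\omega^+}(\mathbf x_l - \mathbf y_l)|\Big]\prod_{i=1}^{n}|v_i(\mathbf x_{v_i})|, \tag{7.32}$$
where we have used that $\int dP_T(\mathbf t) = 1$. The number of addends in $\sum_T$ is bounded by $n!\,C_2^n$. Finally $T$ and the $\cup_i\mathbf x_{v_i}$ form a tree connecting all points, so that, using that the propagator is massive and that the interactions are short ranged,
$$\sum_{\mathbf x_{v_1},\ldots,\mathbf x_{v_n}}\Big[\prod_{l\in T}|g^{(\chi)}_{\omega^-,\omega^+}(\mathbf x_l - \mathbf y_l)|\Big]\prod_{i=1}^{n}|v_i(\mathbf x_{v_i})| \le C_3^n\,|\varepsilon|^{\bar n} M^2,$$
where $\bar n$ is the number of couplings $O(\varepsilon)$.

Let us consider the case $|\tilde P_{v_0}|\ge 4$. Note that if to the $v_i$ are associated only terms from $V(\psi,\chi)$, then $\bar n = n$. Let us consider now the case in which there are end-points associated to $Q(\psi,\chi)$, which have $O(1)$ coupling; there are at most $|\tilde P_{v_0}|$ end-points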
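associated with $Q(\psi,\chi)$. In fact in $Q(\psi,\chi)$ there are only terms of the form $\psi\chi$, so at most the number of them is equal to the number of $\psi$ fields. If we call $n_\lambda\le n$ the number of vertices quartic in the fields it is clear that $n_\lambda \ge |\tilde P_{v_0}|/2 - 1 \ge |\tilde P_{v_0}|/4$; hence
$$\sum_{\mathbf x_{v_0}} |K_{\tilde P_{v_0}}(\mathbf x_{v_0})| \le M^2\sum_{n = |\tilde P_{v_0}|/4}^{\infty} C^{\,n + |\tilde P_{v_0}|}\,|\varepsilon|^n \tag{7.33}$$
and the second of (3.2) holds for $|\tilde P_{v_0}|\ge 4$.

Consider now the case $|\tilde P_{v_0}| = 2$; in this case there are terms $O(1)$, obtained when all the $v_i$ are associated with elements of $Q(\psi,\chi)$. It is convenient to include all such terms in the gaussian integration, as they cannot be considered as perturbations (they are not $O(\varepsilon)$). Hence we define
$$N\,\bar P(d\psi) = P(d\psi)\int P(d\chi)\,e^{Q(\psi,\chi)} \tag{7.34}$$
and, if $\langle X\rangle_0 = \int\bar P(d\psi)\,X$, it holds $\langle\psi^-_{\mathbf x,1}\psi^+_{\mathbf y,1}\rangle_0 = \langle\psi^{(1)}_{\mathbf x}\psi^{(1)}_{\mathbf y}\rangle_0$, $\langle\psi^-_{\mathbf x,-1}\psi^+_{\mathbf y,-1}\rangle_0 = \langle\bar\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y}\rangle_0$ and $\langle\psi^-_{\mathbf x,1}\psi^+_{\mathbf y,-1}\rangle_0 = \langle\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y}\rangle_0$. By using the explicit expressions for $\langle\psi^{(1)}_{\mathbf x}\psi^{(1)}_{\mathbf y}\rangle_0$, $\langle\bar\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y}\rangle_0$, $\langle\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y}\rangle_0$ in [MPW], (3.5) follows.

In order to obtain (3.3) we single out the local part of the terms quartic in the fields; the fact that $l_1 = 2\lambda(a+b)\,\mathrm{sech}^4 J + O(\varepsilon^2)$ can be checked by an explicit computation of all the contributions with coupling $O(\lambda)$ to $W_2$, noting that they can only be obtained contracting a term quartic in the $\chi$ fields with one of the addends of (7.10); each of such terms carries a derivative in the coordinate space, hence the Fourier transform of such terms is vanishing at zero momentum.

7.6. Appendix F: Symmetry cancellations in the effective potential.
• There are no local terms in the r.h.s. of (3.2) of the form $\psi^+_{\mathbf x,1}\psi^-_{\mathbf x,1}$; in fact by (2.13) $\psi^+_{\mathbf x,1}\psi^-_{\mathbf x,1} = i\psi^{(1)}_{\mathbf x}\psi^{(2)}_{\mathbf x}$, but the system is invariant under the transformation
$$\begin{aligned}
\psi^{(1)},\ \bar\psi^{(1)},\ \chi^{(1)},\ \bar\chi^{(1)} &\to -\psi^{(1)},\ -\bar\psi^{(1)},\ -\chi^{(1)},\ -\bar\chi^{(1)},\\
\psi^{(2)},\ \bar\psi^{(2)},\ \chi^{(2)},\ \bar\chi^{(2)} &\to \psi^{(2)},\ \bar\psi^{(2)},\ \chi^{(2)},\ \bar\chi^{(2)},
\end{aligned} \tag{7.35}$$
hence such terms cannot be present as they violate such symmetry.
• There are no terms in the r.h.s. of (3.2) of the form $\psi^-_{1,\mathbf x}\psi^-_{-1,\mathbf y}$ or $\psi^+_{1,\mathbf x}\psi^+_{-1,\mathbf y}$; in fact,
$$\psi^-_{1,\mathbf x}\psi^-_{-1,\mathbf y} = \frac12\Big[\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y} - \psi^{(2)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf y} + i\psi^{(1)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf y} + i\psi^{(2)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y}\Big] \tag{7.36}$$
and the last two terms violate the symmetry (7.35). Moreover the first two terms are forbidden a) in the case $b = 0$ by the invariance under the symmetry
$$\psi^{(1)}_{x,x_0},\ \chi^{(1)}_{x,x_0},\ \psi^{(2)}_{x,x_0},\ \chi^{(2)}_{x,x_0} \to \psi^{(2)}_{x,x_0},\ \chi^{(2)}_{x,x_0},\ \psi^{(1)}_{x,x_0},\ \chi^{(1)}_{x,x_0}; \tag{7.37}$$
b) in the case $a = 0$ by the invariance under the symmetry
$$\psi^{(1)}_{x,x_0},\ \chi^{(1)}_{x,x_0},\ \psi^{(2)}_{x,x_0},\ \chi^{(2)}_{x,x_0} \to \psi^{(2)}_{x,x_0},\ \chi^{(2)}_{x,x_0},\ \psi^{(1)}_{x+1,x_0-1},\ \chi^{(1)}_{x+1,x_0-1}. \tag{7.38}$$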
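In fact consider in $V^{(1)}$ (3.1) the terms of the form $\sum_{\mathbf x,\mathbf y}[\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf y} w^{(1)}(\mathbf x,\mathbf y) + \psi^{(2)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf y} w^{(2)}(\mathbf x,\mathbf y)]$; $w^{(1)}(\mathbf x,\mathbf y)$ is obtained by the truncated expectation $E^T_\chi$ of a certain number of $(V+Q)|_{\psi=0}$, of a term $\frac{\partial}{\partial\psi^{(1)}_{\mathbf x}}(V+Q)|_{\psi=0}$ and of a term $\frac{\partial}{\partial\bar\psi^{(1)}_{\mathbf y}}(V+Q)|_{\psi=0}$. If we perform in the truncated expectation the change of variable (7.37) or (7.38) we get that $(V+Q)|_{\psi=0}$ is invariant while $\frac{\partial}{\partial\psi^{(1)}_{\mathbf x}}(V+Q)|_{\psi=0}$ is changed in $\frac{\partial}{\partial\psi^{(2)}_{\mathbf x}}(V+Q)|_{\psi=0}$ and $\frac{\partial}{\partial\bar\psi^{(1)}_{\mathbf y}}(V+Q)|_{\psi=0}$ is changed in $\frac{\partial}{\partial\bar\psi^{(2)}_{\mathbf y}}(V+Q)|_{\psi=0}$; this shows that $w^{(1)}(\mathbf x,\mathbf y) = w^{(2)}(\mathbf x,\mathbf y)$. A similar argument can be repeated for $\psi^+_{1,\mathbf x}\psi^+_{-1,\mathbf y}$.
• There are no terms in the r.h.s. of (3.2) of the form $\psi^-_{\omega,\mathbf x}\psi^-_{\omega,\mathbf y}$ or $\psi^+_{\omega,\mathbf x}\psi^+_{\omega,\mathbf y}$; in fact,
$$\psi^-_{1,\mathbf x}\psi^-_{1,\mathbf y} = \frac12\Big[\psi^{(1)}_{\mathbf x}\psi^{(1)}_{\mathbf y} - \psi^{(2)}_{\mathbf x}\psi^{(2)}_{\mathbf y} + i\psi^{(1)}_{\mathbf x}\psi^{(2)}_{\mathbf y} + i\psi^{(2)}_{\mathbf x}\psi^{(1)}_{\mathbf y}\Big] \tag{7.39}$$
and we can proceed as in the previous case.
• The model is invariant under complex conjugation and the exchange
$$\psi^{(\alpha)}_{\mathbf x},\ \bar\psi^{(\alpha)}_{\mathbf x} \to \bar\psi^{(\alpha)}_{\mathbf x},\ \psi^{(\alpha)}_{\mathbf x}, \qquad \chi^{(\alpha)}_{\mathbf x},\ \bar\chi^{(\alpha)}_{\mathbf x} \to \bar\chi^{(\alpha)}_{\mathbf x},\ \chi^{(\alpha)}_{\mathbf x}; \tag{7.40}$$
this follows from the fact that, from (2.12), $\bar H^{(\alpha)}, H^{(\alpha)}, \bar V^{(\alpha)}, V^{(\alpha)}$, written in terms of $\bar\psi^{(\alpha)}, \psi^{(\alpha)}, \bar\chi^{(\alpha)}, \chi^{(\alpha)}$, are invariant under such transformation. Hence the coefficient of the local part of the quartic (non-vanishing) terms is real; in fact $\psi^+_{1,\mathbf x}\psi^-_{1,\mathbf x}\psi^+_{-1,\mathbf x}\psi^-_{-1,\mathbf x} \equiv \psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf x}\psi^{(2)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf x}$ times $\hat w(0,0,0)$ must be equal, by the above invariance, to $\hat w^*(0,0,0)\,\bar\psi^{(1)}_{\mathbf x}\psi^{(1)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf x}\psi^{(2)}_{\mathbf x}$, hence $\hat w(0,0,0) = \hat w^*(0,0,0)$. Finally the combination of local terms $\psi^+_{\mathbf x,1}\psi^-_{\mathbf x,-1} + \psi^+_{\mathbf x,-1}\psi^-_{\mathbf x,1}$ is equal to $i[\psi^{(1)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf x} - \psi^{(2)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf x}]$, so it cannot be present as it violates the symmetry (7.35). On the other hand $\psi^+_{\mathbf x,1}\psi^-_{\mathbf x,-1} - \psi^+_{\mathbf x,-1}\psi^-_{\mathbf x,1}$ is equal to $[\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf x} + \psi^{(2)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf x}]$; hence the coefficient of the local part is imaginary and odd in $\omega$; in fact $\hat w(0)[\psi^{(1)}_{\mathbf x}\bar\psi^{(1)}_{\mathbf x} + \psi^{(2)}_{\mathbf x}\bar\psi^{(2)}_{\mathbf x}]$ must be equal to $\hat w^*(0)[\bar\psi^{(1)}_{\mathbf x}\psi^{(1)}_{\mathbf x} + \bar\psi^{(2)}_{\mathbf x}\psi^{(2)}_{\mathbf x}]$, by the invariance under complex conjugation and (7.40), hence $\hat w(0) = -\hat w^*(0)$.
• We consider now the addends with $n = 1$ in the r.h.s. of (3.2),
$$\sum_{\mathbf x,\mathbf y} W_{\omega_1,\omega_2}(\mathbf x,\mathbf y)\,\psi^+_{\mathbf x,\omega_1}\psi^-_{\mathbf y,\omega_2}. \tag{7.41}$$
We can represent $W_{\omega_1,\omega_2}(\mathbf x,\mathbf y)$ as the sum over Feynman graphs $g$ in the usual way (see for instance [GM]); the external lines are associated to the $\psi$ fields, and to the internal lines are associated the propagators $g^{\chi}_{\omega,\omega'}(\mathbf x - \mathbf y)$; moreover the vertices associated to the interaction are linear or bilinear in $A^{\varepsilon_1,\varepsilon_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$. We show that
$$\sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,-\omega}(\mathbf x,0) = \sum_{\mathbf x}\sin\frac{\pi x_0}{M}\,W_{\omega,-\omega}(\mathbf x,0) = 0. \tag{7.42}$$
We can consider a single Feynman diagram value $W^g_{\omega,-\omega}(\mathbf x,0)$ and we call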
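1) $n_a^\omega$ is the number in $g$ of terms $A^{\varepsilon_1,\varepsilon_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ with $\omega_1 = \omega_2 = \omega$; $n_a^1 + n_a^{-1} = n_a$.
2) $n_b$ is the number of $A^{\varepsilon_1,\varepsilon_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ with $\omega_1 = -\omega_2 = \omega$.
3) $n_+^\omega$ is the number of diagonal propagators $g^{(\chi)}_{\omega,\omega}$.
4) $n_-$ is the number of non diagonal propagators $g^{(\chi)}_{\omega,-\omega}$.

If we make the transformation $\mathbf x_i\to-\mathbf x_i$ in all the sums in $\sum_{\mathbf x} W^g_{\omega,-\omega}(\mathbf x,0)\sin\frac{\pi x}{M}$ and we use (2.27), then in each Feynman graph each propagator $g^{(\chi)}_{\omega,\omega'}(\mathbf x)$ is replaced by $(-1)^{\delta_{\omega,\omega'}} g^{(\chi)}_{\omega,\omega'}(\mathbf x)$. Moreover the propagators $\partial g^{(\chi)}_{\omega,\omega'}(\mathbf x)$ are replaced by $(-1)^{\delta_{\omega,\omega'}+1}\,\bar\partial g^{(\chi)}_{\omega,\omega'}(\mathbf x)$, where $\bar\partial_{x_0} f_{x,x_0} = f_{x,x_0} - f_{x,x_0-1}$ and $\bar\partial_x f_{x,x_0} = f_{x,x_0} - f_{x-1,x_0}$; finally $g^{(\chi)}_{\omega,\omega'}(\mathbf x + \mathbf a)$ is replaced by $(-1)^{\delta_{\omega,\omega'}} g^{(\chi)}_{\omega,\omega'}(\mathbf x - \mathbf a)$, if $\mathbf a$ is a constant vector.

On the other hand we could equivalently write the interaction (1.3) as
$$\begin{aligned}
V(\sigma^{(1)},\sigma^{(2)}) = -\sum_{x,x_0=1}^{M}\Big\{&\lambda a\big[\sigma^{(1)}_{x-1,x_0}\sigma^{(1)}_{x,x_0}\sigma^{(2)}_{x-1,x_0}\sigma^{(2)}_{x,x_0} + \sigma^{(1)}_{x,x_0-1}\sigma^{(1)}_{x,x_0}\sigma^{(2)}_{x,x_0-1}\sigma^{(2)}_{x,x_0}\big]\\
&+ \lambda b\big[\sigma^{(1)}_{x-1,x_0}\sigma^{(1)}_{x,x_0}\sigma^{(2)}_{x,x_0-1}\sigma^{(2)}_{x,x_0} + \sigma^{(1)}_{x,x_0-1}\sigma^{(1)}_{x,x_0}\sigma^{(2)}_{x,x_0-1}\sigma^{(2)}_{x+1,x_0-1}\big]\Big\}.
\end{aligned} \tag{7.43}$$
Equation (7.43) can be found from (1.3) making the change of variables $\mathbf x\to-\mathbf x$, and then making the transformation $\sigma^{(\alpha)}_{\mathbf x}\to\sigma^{(\alpha)}_{-\mathbf x}$. Starting from this expression and repeating the computations in §2, §3 we get an expression similar to (2.11), where $V$ is now an expression linear or bilinear in $\bar H^{(\alpha)}_{x-1,x_0} H^{(\alpha)}_{x,x_0}$ or $\bar V^{(\alpha)}_{x,x_0-1} V^{(\alpha)}_{x,x_0}$. From (2.12) it holds
$$\bar V^{(\alpha)}_{x,x_0-1} V^{(\alpha)}_{x,x_0} = \tilde Q^{1(\alpha)}_{\mathbf x} + \tilde Q^{2(\alpha)}_{\mathbf x} + \tilde Q^{3(\alpha)}_{\mathbf x}, \tag{7.44}$$
where
$$\begin{aligned}
\tilde Q^{1(\alpha)}_{\mathbf x} &= \frac{1}{4i}\Big[\psi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0} - \bar\psi^{(\alpha)}_{x,x_0-1}\bar\psi^{(\alpha)}_{x,x_0} + \bar\psi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0} - \psi^{(\alpha)}_{x,x_0-1}\bar\psi^{(\alpha)}_{x,x_0}\Big],\\
\tilde Q^{2(\alpha)}_{\mathbf x} &= \frac{1}{4i}\Big[\chi^{(\alpha)}_{x,x_0-1}\chi^{(\alpha)}_{x,x_0} - \bar\chi^{(\alpha)}_{x,x_0-1}\bar\chi^{(\alpha)}_{x,x_0} + \bar\chi^{(\alpha)}_{x,x_0-1}\chi^{(\alpha)}_{x,x_0} - \chi^{(\alpha)}_{x,x_0-1}\bar\chi^{(\alpha)}_{x,x_0}\Big],\\
\tilde Q^{3(\alpha)}_{\mathbf x} &= \frac{1}{4i}\Big[\psi^{(\alpha)}_{x,x_0-1}\chi^{(\alpha)}_{x,x_0} - \bar\psi^{(\alpha)}_{x,x_0-1}\bar\chi^{(\alpha)}_{x,x_0} - \psi^{(\alpha)}_{x,x_0-1}\bar\chi^{(\alpha)}_{x,x_0} + \bar\psi^{(\alpha)}_{x,x_0-1}\chi^{(\alpha)}_{x,x_0}\\
&\qquad\; + \chi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0} - \bar\chi^{(\alpha)}_{x,x_0-1}\bar\psi^{(\alpha)}_{x,x_0} - \chi^{(\alpha)}_{x,x_0-1}\bar\psi^{(\alpha)}_{x,x_0} + \bar\chi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0}\Big].
\end{aligned} \tag{7.45}$$
A similar expression holds for $\bar H^{(\alpha)}_{x-1,x_0} H^{(\alpha)}_{x,x_0}$. Note that, looking for instance at the first of (7.45), we get
$$\psi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0} - \bar\psi^{(\alpha)}_{x,x_0-1}\bar\psi^{(\alpha)}_{x,x_0} = \psi^{(\alpha)}_{x,x_0}\bar\partial_{x_0}\psi^{(\alpha)}_{x,x_0} - \bar\psi^{(\alpha)}_{x,x_0}\bar\partial_{x_0}\bar\psi^{(\alpha)}_{x,x_0} \tag{7.46}$$
and
$$\bar\psi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0} - \psi^{(\alpha)}_{x,x_0-1}\bar\psi^{(\alpha)}_{x,x_0} = -\bar\partial_{x_0}\bar\psi^{(\alpha)}_{x,x_0}\,\bar\partial_{x_0}\psi^{(\alpha)}_{x,x_0} + \bar\psi^{(\alpha)}_{x,x_0}\psi^{(\alpha)}_{x,x_0} + \bar\psi^{(\alpha)}_{x,x_0-1}\psi^{(\alpha)}_{x,x_0-1}. \tag{7.47}$$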
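One verifies that $V$ is a sum of the form $\sum_{\mathbf x}\bar A^{\sigma_1,\sigma_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ or $\sum_{\mathbf x}\bar A^{\sigma_1,\sigma_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}\bar A^{\sigma_1',\sigma_2'}_{\mathbf x';\varphi',\omega_1';\varphi',\omega_2'}$, where $\mathbf x' = \mathbf x$ or $\mathbf x' = (x+1, x_0-1)$ and $\bar A^{\sigma_1,\sigma_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ is identical to $A^{\sigma_1,\sigma_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ up to the substitutions $\partial\to\bar\partial$, $x+1\to x-1$ and $x_0+1\to x_0-1$. Hence it holds
$$\sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,-\omega}(\mathbf x,0) = (-1)^{n_a+n_++1}\sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,-\omega}(\mathbf x,0). \tag{7.48}$$
It holds $2n_a + 2n_b = 2(n_+ + n_-) + 2$ so that
$$\sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,-\omega}(\mathbf x,0) = (-1)^{2n_a+n_b-n_-}\sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,-\omega}(\mathbf x,0) = (-1)^{n_b-n_-}\sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,-\omega}(\mathbf x,0). \tag{7.49}$$
The number of fields with $\omega = 1$ is $2n_a^1 + n_b$ and the number of external fields with $\omega = 1$ is then $2n_a^1 + n_b - 2n_+^1 - n_- = 2(n_a^1 - n_+^1) + n_b - n_-$, which implies that $n_b - n_-$ must be an odd number if the number of external fields with $\omega = 1$ is 1. Hence $\sum_{\mathbf x}\sin\frac{\pi x}{M}W_{\omega,-\omega}(\mathbf x,0) = (-1)\sum_{\mathbf x}\sin\frac{\pi x}{M}W_{\omega,-\omega}(\mathbf x,0)$, so that $\sum_{\mathbf x}\sin\frac{\pi x}{M}W_{\omega,-\omega}(\mathbf x,0) = 0$.

We consider now $W_{\omega,\omega}(\mathbf x;0)$; we have already proved that $\sum_{\mathbf x}W_{\omega,\omega}(\mathbf x;0) = 0$. We want to show that
$$\sum_{\mathbf x}\sin\frac{\pi x_0}{M}\,W_{\omega,\omega}(\mathbf x;0) = i\omega\alpha; \qquad \sum_{\mathbf x}\sin\frac{\pi x}{M}\,W_{\omega,\omega}(\mathbf x;0) = \beta$$
with $\alpha,\beta$ real. From (2.23) we see that $g_{\omega,-\omega}(\mathbf x)$ is even in the exchange $\mathbf x\to-\mathbf x$ and imaginary. Moreover we can write
$$\hat g_{\omega,\omega}(\mathbf k) = \frac{-i\sin k}{\sin^2 k + \sin^2 k_0 + m_\chi^2} + \frac{\omega\sin k_0}{\sin^2 k + \sin^2 k_0 + m_\chi^2} = \hat g^1_{\omega,\omega}(\mathbf k) + \hat g^2_{\omega,\omega}(\mathbf k) \tag{7.50}$$
with $g^1_{\omega,\omega}(\mathbf x)$ real, odd in the exchange $x\to-x$ and even in $x_0\to-x_0$; $g^2_{\omega,\omega}(\mathbf x)$ is imaginary, even in the exchange $x\to-x$ and odd in $x_0\to-x_0$. Remember that (see §2.4) the coefficient of $A^{\sigma_1,\sigma_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ is a) imaginary if $\omega_1 = \omega_2$ and $\alpha = 1$, $\partial^\alpha_{\mathbf x} = \partial_{x_0}$; b) real if $\omega_1 = \omega_2$ and $\alpha = 2$, $\partial^\alpha_{\mathbf x} = \partial_x$; c) imaginary if $\omega_1 = -\omega_2$. Given a Feynman diagram $g$ contributing to $\sum_{\mathbf x}\sin\frac{\pi x}{M}W_{\omega,\omega}$, by parity there must be present a total odd number of $g^1_{\omega,\omega}(\mathbf x)$ propagators and $\partial_x$ derivatives from the interaction. Moreover by parity the number of $g^2_{\omega,\omega}(\mathbf x)$ and $\partial_{x_0}$ from the interaction must be even. Finally, as the external lines have the same $\omega$ index, the sum of the number of non diagonal propagators $g_{\omega,-\omega}(\mathbf x)$ plus the number of $A^{\sigma_1,\sigma_2}_{\mathbf x;\varphi,\omega_1;\varphi',\omega_2}$ with $\omega_1 = -\omega_2$ must be even. Hence $\sum_{\mathbf x}\sin\frac{\pi x}{M}W_{\omega,\omega}(\mathbf x;0)$ is real and $\omega$-independent. In the same way one sees that $\sum_{\mathbf x}\sin\frac{\pi x_0}{M}W_{\omega,\omega}(\mathbf x;0) = i\omega\alpha$.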
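7.7. Appendix G: Independence from boundary conditions. We show that, if $|t - t_c| > 0$,
$$\tilde Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{2I} = Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{2I}\big(Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{0,2I}\big)^{-1}, \tag{7.51}$$
where $Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{0,2I}$ is given by (7.2) with $\lambda = 0$, is exponentially insensitive to boundary conditions. In particular we show that for $|t - t_c| > 0$, $\lambda$ small enough
$$\left|\log\frac{\tilde Z^{\gamma(1),\gamma(2)}_{2I}}{\tilde Z^{-,-,-,-}_{2I}}\right| \le |\lambda| M^2 e^{-c_1|t-t_c|M}, \tag{7.52}$$
where $c_1 > 0$ is a suitable constant. The above equation implies in particular that the partition function is non-vanishing; in fact, from (2.10) $Z_{2I}$ is $(\cosh\lambda a\,\cosh\lambda b)^{2S}$ times
$$\sum_{\varepsilon}(-1)^{\delta_{\gamma_1}+\delta_{\gamma_2}}\tilde Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{2I} Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{0,2I} = \tilde Z^{-,-,-,-}_{2I} Z_{0,2I} + \tilde Z^{-,-,-,-}_{2I}\sum_{\varepsilon}(-1)^{\delta_{\gamma_1}+\delta_{\gamma_2}} Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{0,2I}\left[\frac{\tilde Z^{\varepsilon(1),\varepsilon(2),\varepsilon(3),\varepsilon(4)}_{2I}}{\tilde Z^{-,-,-,-}_{2I}} - 1\right], \tag{7.53}$$
where $Z_{0,2I} = Z_I^2$ and $Z_I = \sum_{\varepsilon(1),\varepsilon(2)}(-1)^{\delta_{\varepsilon(1),\varepsilon(2)}} Z_I^{\varepsilon(1),\varepsilon(2)}$ is the Ising model partition function.

We recall that in §4 of [MW] it was proved that the limit $M\to\infty$ of $|Z_I^{\varepsilon(1),\varepsilon(2)}|$ if $|t - t_c| > 0$ is exponentially independent from boundary conditions; moreover if $t - t_c < 0$ for any choice of $\varepsilon_1,\varepsilon_2$ the functions $Z_I^{\varepsilon(1),\varepsilon(2)}$ have a positive limit, while if $t - t_c > 0$ the limit of $Z_I^{+,+}$ is negative, and for the other choices the limit is a positive number. Hence by (7.52) the second addend in (7.53) is bounded by $C|\tilde Z^{-,-,-,-}_{2I} Z_{0,2I}|\,|\lambda| M^2 e^{-c_1|t-t_c|M}$, so (7.53) is non-vanishing.

In order to prove (7.52) we can write, see (7.12) and (7.20),
$$\log\tilde Z^{\varepsilon(1),\varepsilon'(1),\varepsilon(2),\varepsilon'(2)}_{2I} = \log\int\Big[\prod_{\alpha=1}^{2} P_{\varepsilon^{(\alpha)},\varepsilon'^{(\alpha)}}(d\psi^{(\alpha)})\,P_{\varepsilon^{(\alpha)},\varepsilon'^{(\alpha)}}(d\chi^{(\alpha)})\Big]e^{Q - V}, \tag{7.54}$$
and proceeding as in §3 we see that $\log\tilde Z^{\varepsilon(1),\varepsilon'(1),\varepsilon(2),\varepsilon'(2)}_{2I}$ can be written as in (5.12). The terms $\tilde E_h$ are the sum of addends of the form $\sum_{\mathbf x_1,\ldots,\mathbf x_n} W_\varepsilon(\mathbf x_1,\ldots,\mathbf x_n)$, with $\mathbf x_i$ varying in $[-\frac M2,\frac M2]\times[-\frac M2,\frac M2]$, and the $W$ are truncated expectations for which a formula like (7.26) holds. Note that $W(\mathbf x_1,\ldots,\mathbf x_n)$ is periodic with period $M$ in any of its coordinates, for any $\varepsilon$; this follows from the fact that there is an even number of $\psi,\chi$ fields associated to any $\mathbf x_i$, and from the form of $V$. Moreover $W(\mathbf x_1,\ldots,\mathbf x_n)$ is translation invariant, so that we can fix one variable to $(0,0)$, for instance $\mathbf x_1$; hence it holds
$$\sum_{\mathbf x_1,\ldots,\mathbf x_n} W_\varepsilon(\mathbf x_1,\ldots,\mathbf x_n) = \sum_{\mathbf x_1,\ldots,\mathbf x_n} W_\varepsilon(0,\mathbf x_2,\ldots,\mathbf x_n). \tag{7.55}$$
We can write $\sum_{\mathbf x_1,\ldots,\mathbf x_n} W$ as $\sum^*_{\mathbf x_1,\ldots,\mathbf x_n} W + \sum^{**}_{\mathbf x_1,\ldots,\mathbf x_n} W$, where $\sum^*_{\mathbf x_1,\ldots,\mathbf x_n}$ is over $\mathbf x_i$ varying in $[-\frac M4,\frac M4]\times[-\frac M4,\frac M4]$. Then $\sum^{**}_{\mathbf x_1,\ldots,\mathbf x_n} W$ is $O(e^{-c_1|t-t_c|M})$, as in $\sum^{**}_{\mathbf x_1,\ldots,\mathbf x_n} W$ there is surely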
a chain of propagators exponentially decaying connecting the point $(0,0)$ with a point outside $[-\frac M4,\frac M4]\times[-\frac M4,\frac M4]$. On the other hand in $\sum^*_{\mathbf x_1,\ldots,\mathbf x_n} W$ we can use the Poisson summation formula, stating that
$$\frac1M\sum_{n=0}^{M-1}\hat f\Big(\frac{2\pi n}{M} + \frac{\alpha\pi}{M}\Big) = \sum_{n\in\mathbb Z} f(nM)(-1)^{\alpha n}, \tag{7.56}$$
where $\hat f$ is a $2\pi$-periodic function and $\alpha = (0,1)$. From (7.56) we find, if $g^{(i)}_{M,\varepsilon,\varepsilon'}(x,x_0)$, $i = \psi,\chi$, is the propagator corresponding to $P_{\varepsilon,\varepsilon'}(d\psi)$ or $P_{\varepsilon,\varepsilon'}(d\chi)$ (7.12),
$$\begin{aligned}
g^{(i)}_{M,\varepsilon,\varepsilon'}(x-y,x_0-y_0) &= \sum_{n,n_0\in\mathbb Z}(-1)^{n\delta_\varepsilon}(-1)^{n_0\delta_{\varepsilon'}}\,g^{(i)}(x-y+nM,\,x_0-y_0+n_0M)\\
&\equiv g^{(i)}(x-y,x_0-y_0) + \delta g^{(i)}_{\varepsilon,\varepsilon'}(x-y,x_0-y_0),
\end{aligned} \tag{7.57}$$
where $g^{(i)}(x,x_0) = \lim_{M\to\infty} g^{(i)}_{M,\varepsilon,\varepsilon'}(x,x_0)$ and $\delta_\varepsilon = 1$ if $\varepsilon = -$ and $\delta_\varepsilon = 0$ if $\varepsilon = +$. Note that the only dependence on boundary conditions in the r.h.s. of (7.57) is in $\delta g^{(i)}_{\varepsilon,\varepsilon'}(x-y,x_0-y_0)$ and it holds, if $|x-y|\le\frac M2$, $|x_0-y_0|\le\frac M2$,
$$|\delta g^{(i)}(x-y,x_0-y_0)| \le e^{-c_2|m_i|M}, \tag{7.58}$$
with a proper constant $c_2$. Hence all the terms in $\sum^*_{\mathbf x_1,\ldots,\mathbf x_n} W$ with at least a $\delta g^{(i)}(x-y,x_0-y_0)$ are exponentially bounded, and the part with only $g^{(i)}(x-y,x_0-y_0)$ is independent from boundary conditions. By (7.56) it holds that also the terms $t_h$ are exponentially insensitive to boundary conditions.

7.8. Appendix H: Asymptotic properties of the propagators on scale h. If |Z −1 C0 zh | ≤
(h)
|∂xn00 ∂xn1 gω,−ω (x − y)| ≤ CN,n |
γ h(1+n) , 1 + (γ h |d(x − y)|)N
mh γ h(1+n) | , γ h 1 + (γ h |d(x − y)|)N
(7.59)
(7.60)
where ∂x denotes the discrete derivative. This follows immediately from the compact support properties of f˜h (k) and the fact that (h)
dM (x − y)n1 dM (x0 − y0 )n0 gω,ω (x − y) 1 −ik(x−y) n1 n0 −1 −1 = e−iπ(xM n1 +x0 M n0 ) (−i)n0 +n1 2 e ∂k ∂k0 M k & ' × f˜h (k)[Th−1 (k)]ω,ω , where Th is the quadratic form associated to PZh−1 ,mh−1 ,Ch (dψ).
(7.61)
It will be useful to write (h)
(h) (h) (h) (x − y) = gL;ω,ω (x − y) + gω,ω (x − y) + gω,ω (x − y) gω,ω
(7.62)
with (h)
gL;ω,ω (x − y) =
1 −ik(x−y) fh (k) e , 1 ωk0 + i Z 1 k M2 −Z
(7.63)
k
which is of course obeying the bound (7.59). The decomposition (7.62) is related to the following identity: 1 1 1 [Th−1 (k )]ω,ω = + − −ωk −ω sin k0 + i sin k −ωk0 + ik 0 + ik −ω sin k0 + i sin k 1 + − . (7.64) −ω sin k0 + i sin k sin2 k02 + sin2 k + [mh−1 (k)]2 From (7.64) one shows that γ (2+n)h , 1 + (γ h |d(x − y))|N γ h(1+n) mh (h) |∂xn00 ∂xn1 gω,ω (x − y)| ≤ CN,n | h |2 . γ 1 + (γ h |d(x − y)|)N (h) gω,ω (x − y)| ≤ CN,n |∂xn00 ∂xn1
(7.65) (7.66)
(h)
Analogously the decomposition (4.16) is such that gω,−ω (x − y) verifies (7.60) and (h) gω,−ω (x − y), verifying (7.65). Finally note that, with the definition (5.9), it holds, given the positive integers N, n0 , n1 and putting n = n0 + n1 , that there exists a constant CN,n such that (≤h∗ ) |∂xn00 ∂xn1 gω,ω (x; y)|
∗
γ h (1+n) ≤ CN,n . ∗ 1 + (γ h |d(x − y)|)N
(7.67)
7.9. Appendix I: The integration of the ψ fields. It is possible to write $V^{(h)}$ in terms of Gallavotti-Nicolò trees. We need some definitions and notations.
1) Let us consider the family of all trees which can be constructed by joining a point $r$, the root, with an ordered set of $n\ge 1$ points, the endpoints of the unlabeled tree, so that $r$ is not a branching point. $n$ will be called the order of the unlabeled tree and the branching points will be called the non trivial vertices. The unlabeled trees are partially ordered from the root to the endpoints in the natural way; we shall use the symbol $<$ to denote the partial order. Two unlabeled trees are identified if they can be superposed by a suitable continuous deformation, so that the endpoints with the same index coincide. Then the number of unlabeled trees with $n$ end-points is bounded by $4^n$. We shall consider also the labeled trees (to be called simply trees in the following); they are defined by associating some labels with the unlabeled trees, as explained in the following items.
Fig. 3. A tree with its scale labels
2) We associate a label $h\le 0$ with the root and we denote $\mathcal T_{h,n}$ the corresponding set of labeled trees with $n$ endpoints. Moreover, we introduce a family of vertical lines, labeled by an integer taking values in $[h,2]$, and we represent any tree $\tau\in\mathcal T_{h,n}$ so that, if $v$ is an endpoint or a non trivial vertex, it is contained in a vertical line with index $h_v > h$, to be called the scale of $v$, while the root is on the line with index $h$. There is the constraint that, if $v$ is an endpoint, $h_v > h+1$; if there is only one end-point its scale must be equal to $h+2$, for $h\le 0$. The tree will intersect in general the vertical lines in a set of points different from the root, the endpoints and the non trivial vertices; these points will be called trivial vertices. The set of the vertices of $\tau$ will be the union of the endpoints, the trivial vertices and the non trivial vertices. Note that, if $v_1$ and $v_2$ are two vertices and $v_1 < v_2$, then $h_{v_1} < h_{v_2}$. Moreover, there is only one vertex immediately following the root, which will be denoted $v_0$ and can not be an endpoint; its scale is $h+1$.
3) With each endpoint $v$ of scale $h_v = +2$ we associate one of the contributions to $V^{(1)}$ given by (3.2); with each endpoint $v$ of scale $h_v\le 1$ one of the terms in $\mathcal L V^{(h_v-1)}$ defined in (4.19). Moreover, we impose the constraint that, if $v$ is an endpoint and $h_v\le 1$, $h_v = h_{v'}+1$, if $v'$ is the non trivial vertex immediately preceding $v$.
4) If $v$ is not an endpoint, the cluster $L_v$ with frequency $h_v$ is the set of endpoints following the vertex $v$; if $v$ is an endpoint, it is itself a (trivial) cluster. The tree provides an organization of endpoints into a hierarchy of clusters.
5) We introduce a field label $f$ to distinguish the field variables appearing in the terms associated with the endpoints as in item 3); the set of field labels associated with the endpoint $v$ will be called $I_v$. Analogously, if $v$ is not an endpoint, we shall call $I_v$ the set of field labels associated with the endpoints following the vertex $v$; $\mathbf x(f)$, $\sigma(f)$ and $\omega(f)$ will denote the space-time point, the $\sigma$ index and the $\omega$ index, respectively, of the field variable with label $f$.
6) We associate with any vertex $v$ of the tree a subset $P_v$ of $I_v$, the external fields of $v$. These subsets must satisfy various constraints. First of all, if $v$ is not an endpoint and $v_1,\ldots,v_{s_v}$ are the $s_v$ vertices immediately following it, then $P_v\subset\cup_i P_{v_i}$; if $v$ is an endpoint, $P_v = I_v$. We shall denote $Q_{v_i}$ the intersection of $P_v$ and $P_{v_i}$; this definition implies that $P_v = \cup_i Q_{v_i}$. The subsets $P_{v_i}\setminus Q_{v_i}$, whose union will be made, by definition, of the internal fields of $v$, have to be non empty, if $s_v > 1$, that is if $v$ is a non-trivial vertex.
Given $\tau\in\mathcal T_{j,n}$, there are many possible choices of the subsets $P_v$, $v\in\tau$, compatible with the previous constraints; let us call $\mathbf P$ one of these choices. Given $\mathbf P$, we consider the family $\mathcal G_{\mathbf P}$ of all connected Feynman graphs, such that, for any $v\in\tau$, the internal fields of $v$ are paired by propagators of scale $h_v$, so that the following condition is satisfied: for any $v\in\tau$, the subgraph built by the propagators associated with all vertices $v'\ge v$ is connected. The sets $P_v$ have, in this picture, the role of the external legs of the subgraph associated with $v$. The graphs belonging to $\mathcal G_{\mathbf P}$ will be called compatible with $\mathbf P$ and we shall denote $\mathcal P_\tau$ the family of all choices of $\mathbf P$ such that $\mathcal G_{\mathbf P}$ is not empty. As explained for instance in §3.2 of [BM] we can write, if $h\le 0$,
$$V^{(h)}\big(\sqrt{Z_h}\,\psi^{(\le h)}\big) + M^2\tilde E_{h+1} = \sum_{n=1}^{\infty}\sum_{\tau\in\mathcal T_{h,n}}\sum_{\mathbf P\in\mathcal P_\tau}\sqrt{Z_h}^{\,|P_{v_0}|}\sum_{\mathbf x_{v_0}}\tilde\psi^{(\le h)}(P_{v_0})\,K^{(h+1)}_{\tau,\mathbf P}(\mathbf x_{v_0}), \tag{7.68}$$
where
$$\tilde\psi^{(\le h)}(P_v) = \prod_{f\in P_v}\psi^{(\le h)}_{\mathbf x(f),\omega(f)} \tag{7.69}$$
and $K^{(h+1)}_{\tau,\mathbf P}(\mathbf x_{v_0})$ is a suitable function, which is obtained by summing the values of all the Feynman graphs compatible with $\mathbf P$, see item 6) above, and applying iteratively in the vertices of the tree, different from the endpoints and $v_0$, the $\mathcal R$-operation, starting from the vertices with higher scale.

In order to control, uniformly in $M$, the various terms in (7.68) one has to exploit the Gram-Hadamard inequality (see Appendix E) and to take into account the $\mathcal R$ operation acting on the vertices of the tree, as explained in full detail in [BM], §3. The result of this analysis, which applies essentially unchanged in the present case, is the following bound (see (3.105) of [BM]), if $k = \sum_i\alpha_i$,
$$\sum_{\mathbf x_{v_0}}|K^{(h+1)}_{\tau,\mathbf P}(\mathbf x_{v_0})| \le C^n M^2\varepsilon_h^n\,\gamma^{-hD_k(P_{v_0})}\prod_{v\ \text{not e.p.}}\left\{\frac{1}{s_v!}\,C^{\sum_{i=1}^{s_v}|P_{v_i}| - |P_v|}\Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}}\gamma^{-[-2+\frac{|P_v|}{2}+z(P_v)]}\right\}, \tag{7.70}$$
with $-2 + \frac{|P_v|}{2} + z(P_v) > 0$ and
$$z(P_v) = \begin{cases} 1 & \text{if } |P_v| = 4,\\ 2 & \text{if } |P_v| = 2,\\ 0 & \text{otherwise.}\end{cases} \tag{7.71}$$
The above bound admits a simple dimensional interpretation. If we erase the $\mathcal R$ operation from all the vertices of the tree, then $z(P_v) = 0$ and (7.70) allows us to associate a factor $\gamma^{2-\frac{|P_v|}{2}}$ with any trivial or non-trivial vertex of the tree. This would allow us to control the sums over the scale labels and $\mathcal P_\tau$, provided that $|P_v|$ were larger than 4 in all vertices, which is however not true. The effect of the $\mathcal R$ operation is to improve the bound with the factor $\gamma^{-z(P_v)}$, so that there is a factor $\gamma^{-[-2+\frac{|P_v|}{2}+z(P_v)]}$ smaller than 1 associated with all the vertices.
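The dimensional argument can be tabulated directly from (7.71). The short sketch below, with hypothetical helper names `z` and `vertex_exponent`, evaluates the exponent $-2 + |P_v|/2 + z(P_v)$ with and without the improvement due to the renormalization operation, for an even number $|P_v|$ of external fields.

```python
def z(p):
    """Improvement exponent z(P_v) of (7.71), as a function of p = |P_v|."""
    return {4: 1, 2: 2}.get(p, 0)

def vertex_exponent(p, renormalized=True):
    """Exponent e such that a vertex with p external fields carries gamma**(-e)."""
    return -2 + p / 2 + (z(p) if renormalized else 0)

# columns: |P_v|, exponent without the R operation, exponent with it
for p in (2, 4, 6, 8):
    print(p, vertex_exponent(p, renormalized=False), vertex_exponent(p))
```

Without the improvement the exponent is negative for $|P_v| = 2$ and zero for $|P_v| = 4$, which are exactly the two cases singled out by (7.71); with it, the exponent is at least 1 for every even $|P_v|\ge 2$.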
In order to perform the sums note that the number of unlabeled trees is $\le 4^n$; fix an unlabeled tree, the number of terms in the sum over the various labels of the tree is bounded by $C^n$, except the sums over the scale labels. In order to bound the sums over the scale labels and $\mathbf P$ we first use the inequality
$$\prod_{v\ \text{not e.p.}}\gamma^{-[-2+\frac{|P_v|}{2}+z(P_v)]} \le \Big[\prod_{\tilde v}\gamma^{-2\alpha(h_{\tilde v}-h_{\tilde v'})}\Big]\Big[\prod_{v\ \text{not e.p.}}\gamma^{-2\alpha|P_v|}\Big], \tag{7.72}$$
where $\tilde v$ are the non-trivial vertices, and $\tilde v'$ is the non trivial vertex immediately preceding $\tilde v$ or the root. The factors $\gamma^{-2\alpha(h_{\tilde v}-h_{\tilde v'})}$ in the r.h.s. of (7.72) allow us to bound the sums over the scale labels by $C^n$; $\alpha$ is a suitable constant (one finds $\alpha = \frac{1}{40}$).

Finally the sum over $\mathbf P$ can be bounded by using the following combinatorial inequality, trivial for $\gamma$ large enough. Let $\{p_v, v\in\tau\}$ be a set of integers such that $p_v\le\sum_{i=1}^{s_v}p_{v_i}$ for all $v\in\tau$ which are not endpoints; then
$$\prod_{v\ \text{not e.p.}}\sum_{p_v}\gamma^{-\frac{p_v}{40}} \le C^n. \tag{7.73}$$
It follows that
$$\sum_{\substack{\mathbf P\\ |P_{v_0}| = 2m}}\ \prod_{v\ \text{not e.p.}}\gamma^{-\frac{|P_v|}{40}} \le \prod_{v\ \text{not e.p.}}\sum_{p_v}\gamma^{-\frac{p_v}{40}} \le C^n. \tag{7.74}$$
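7.10. Appendix L: The flow of running coupling constants.

Choice of the counterterm ν. Let us call $\mu_h = \sup_{k\ge h}\max\{|\lambda_k|,|\delta_k|\}$. Let us consider the first of Eqs. (5.1) for fixed values of $a_h$, $Z_{h-1}$ and $m_{h-1}(\mathbf k)$, $\tilde h\le h\le 1$, if $\tilde h$ is a negative integer, satisfying the conditions
$$\mu_h\le\bar\varepsilon_1\le\bar\varepsilon_0, \qquad a_0\gamma^{h-1}\ge 4|m_h|, \tag{7.75}$$
$$\gamma^{-c_0\mu_h}\le\frac{m_{h-1}}{m_h}\le\gamma^{+c_0\mu_h}, \qquad \gamma^{-c_0\mu_h^2}\le\frac{Z_{h-1}}{Z_h}\le\gamma^{+c_0\mu_h^2} \tag{7.76}$$
for some constant $c_0$. We prove that, if $\bar\varepsilon_0$ is small enough, there exist some constants $\bar\varepsilon_1$, $\kappa$, $\gamma'$, $c_1$, $B$, and a family of intervals $I^{(\bar h)}$, $\tilde h\le\bar h\le 0$, such that $\bar\varepsilon_1\le\bar\varepsilon_0$, $0<\kappa<1$, $1<\gamma'<\gamma$, $I^{(\bar h)}\subset I^{(\bar h+1)}$, $|I^{(\bar h)}|\le c_1\bar\varepsilon_1(\gamma')^{\bar h}$ and, if $\nu = \nu_1\in I^{(\bar h)}$,
$$|\nu_h|\le B\bar\varepsilon_1\big[\gamma^{-\frac12(h-\bar h)} + \gamma^{\kappa h}\big]\le\bar\varepsilon_0, \qquad \bar h\le h\le 1. \tag{7.77}$$
In order to show this, note that if $|\nu_h|\le\bar\varepsilon_0$ for $\bar h\le h\le 1$ and $\bar\varepsilon_0$ is small enough, the r.h.s. of the first of (5.1) is well defined for $h = \bar h$ and we can write
$$\nu_{\bar h-1} = \gamma\nu_{\bar h} + b_{\bar h} + r_{\bar h}, \tag{7.78}$$
where $b_{\bar h} = c^\nu_{\bar h-1}\gamma^{\bar h-1}\lambda_{\bar h}$ and $r_{\bar h}$ collects all terms of second or higher order in $\bar\varepsilon_0$. In the tree expansion of $\beta^h_\nu$, there is no contribution from the trees with $n\ge 2$ endpoints, which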
are only of type $\nu$ or $\delta$, because of the support properties of the single scale propagators; hence by (7.75) $|r_{\bar h}|\le c_2\mu_{\bar h}\bar\varepsilon_0$. Let us now fix a positive constant $c$ and consider the intervals
$$J^{(h)} = \Big[-\frac{b_h}{\gamma-1} - c\bar\varepsilon_1,\ -\frac{b_h}{\gamma-1} + c\bar\varepsilon_1\Big]. \tag{7.79}$$
By using (7.78) one can show by an inductive argument (see for instance §4.3 of [BM]) that there exists a decreasing family of intervals $I^{(\bar h)}$, $\tilde h\le\bar h\le 0$, such that, if $\nu = \nu_1\in I^{(\bar h)}$, then the sequence $\nu_h$ is well defined for $h\ge\bar h$ and satisfies the bound $|\nu_h|\le\bar\varepsilon_0$.

In order to prove the bound (7.77) we note that, if we iterate the first of (5.1), we can write, if $\bar h\le h\le 0$ and $\nu_1\in I^{(\bar h)}$,
$$\nu_h = \gamma^{-h+1}\Big[\nu_1 + \sum_{k=h+1}^{1}\gamma^{k-2}\beta^k_\nu(\nu_k,\ldots,\nu_1)\Big], \tag{7.80}$$
where now the functions $\beta^k_\nu$ are thought of as functions of $\nu_k,\ldots,\nu_1$ only. If we put $h = \bar h$ in (7.80), we get the following identity:
$$\nu_1 = -\sum_{k=\bar h+1}^{1}\gamma^{k-2}\beta^k_\nu(\nu_k,\ldots,\nu_1) + \gamma^{\bar h-1}\nu_{\bar h}. \tag{7.81}$$
Equations (7.80) and (7.81) are equivalent to
$$\nu_h = -\gamma^{-h}\sum_{k=\bar h+1}^{h}\gamma^{k-1}\beta^k_\nu(\nu_k,\ldots,\nu_1) + \gamma^{-(h-\bar h)}\nu_{\bar h}, \qquad \bar h<h\le 1. \tag{7.82}$$
By construction, see §4.4, $\beta^k_\nu$ is given by the sum over trees with at least an end-point $\nu_k$, $k\ge h$, or at least a propagator $g_{\omega,-\omega}$, see (4.16), or at least with an end-point at scale 2 to which is associated one of the terms in $\mathcal R V^{(1)}$. Hence, we can write
$$\beta^h_\nu = \mu_h\sum_{k=h}^{1}\nu_k\,\tilde\beta^\nu_{h,k}\,\gamma^{-2\kappa(k-h)} + \gamma^{\kappa h}\mu_h R^\nu_h, \tag{7.83}$$
where $|R^\nu_h|, |\tilde\beta^\nu_{h,k}|\le C$ and $\kappa$ is a constant. The second term in (7.83) comes from the trees with at least a propagator $g_{\omega,-\omega}$ or with an end-point at scale 2, and the first term from the trees with at least a $\nu_k$ end-point. The factor $\gamma^{-2\kappa(k-h)}$ in the r.h.s. of (7.83) follows from the simple remark that the bound over all the trees contributing to $\nu_h$, which have at least one endpoint of fixed scale $k>h$, can be improved by a factor $\gamma^{-\eta'(k-h)}$, with $\eta'$ positive but small enough. It is sufficient to use (7.72), which allows us to extract such a factor from the r.h.s. before performing the sum over the scale indices, and to choose $\eta' = 2\kappa$, which is possible if $\kappa$ is small enough.

Let us now observe that the sequence $\nu_h$, $\bar h<h\le 1$, satisfying (7.77) can be obtained as the limit as $n\to\infty$ of the sequence $\{\nu_h^{(n)}\}$, $\bar h<h\le 1$, $n\ge 0$, parameterized by $\nu_{\bar h}\in J^{(\bar h+1)}$ and defined recursively in the following way:
$$\nu_h^{(0)} = 0, \qquad \nu_h^{(n)} = -\gamma^{-h}\sum_{k=\bar h+1}^{h}\gamma^{k-1}\beta^k_\nu(\nu_k^{(n-1)},\ldots,\nu_1^{(n-1)}) + \gamma^{-(h-\bar h)}\nu_{\bar h}, \qquad n\ge 1. \tag{7.84}$$
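The recursion (7.84) is a fixed point iteration, and its mechanism can be seen on a toy model. The sketch below is only an illustration under explicit assumptions: the beta function `beta` is a hypothetical stand-in (linear in $\nu_k$ with a source $\gamma^{\kappa k}$), not the tree expansion of the paper, and the values of `gamma`, `eps`, `kappa` and `h_bar` are arbitrary. It iterates the map and then checks that the limit satisfies the one-step flow $\nu_{h-1} = \gamma\nu_h + \beta^h_\nu$, of which (7.82) is the iterated form.

```python
def solve_flow(gamma=2.0, eps=0.1, kappa=0.5, h_bar=-20, nu_hbar=0.0, sweeps=60):
    """Iterate the map (7.84) for a toy beta function and return (nu, beta)."""
    scales = range(h_bar + 1, 2)  # the scales h with h_bar < h <= 1

    def beta(k, nu_k):
        # Hypothetical beta function: small, linear in nu_k, plus a decaying source.
        return eps * (nu_k + gamma ** (kappa * k))

    nu = {h: 0.0 for h in scales}  # nu^(0)_h = 0
    for _ in range(sweeps):
        new = {}
        for h in scales:
            s = sum(gamma ** (k - 1) * beta(k, nu[k]) for k in range(h_bar + 1, h + 1))
            new[h] = -gamma ** (-h) * s + gamma ** (-(h - h_bar)) * nu_hbar
        nu = new
    nu[h_bar] = nu_hbar  # boundary value at the last scale
    return nu, beta

nu, beta = solve_flow()
# largest violation of nu_{h-1} = gamma * nu_h + beta_h over all scales
residual = max(abs(nu[h - 1] - (2.0 * nu[h] + beta(h, nu[h]))) for h in range(-19, 2))
print(residual, max(abs(v) for v in nu.values()))
```

Solving the flow backwards from the boundary value $\nu_{\bar h}$, rather than forwards from $\nu_1$, is what keeps the sequence bounded: a generic initial datum $\nu_1$ would be amplified by a factor $\gamma$ at each step.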
In fact, by induction one verifies that, if $\bar\varepsilon_1$ is small enough, $|\nu_h^{(n)}|\le C\bar\varepsilon_1\le\bar\varepsilon_0$, so that (7.84) is meaningful, and $\max_{h^*<h\le 1}|\nu_h^{(n)} - \nu_h^{(n-1)}|\le(C\bar\varepsilon_1)^n$. In fact, for $n>1$ it follows by the fact that $\beta^k_\nu(\nu_k^{(n-1)},\ldots,\nu_1^{(n-1)}) - \beta^k_\nu(\nu_k^{(n-2)},\ldots,\nu_1^{(n-2)})$ can be written as a sum of terms in which there is at least one endpoint of type $\nu$, with a difference $\nu_h^{(n-1)} - \nu_h^{(n-2)}$, $h\ge k$, in place of the corresponding running coupling, and one endpoint of type $\lambda$. Then $\nu_h^{(n)}$ converges as $n\to\infty$, for $\bar h<h\le 1$, to a limit $\nu_h$, satisfying (7.77) and the bound $|\nu_h|\le\bar\varepsilon_0$, if $\bar\varepsilon_1$ is small enough. Hence, if $\bar\varepsilon_1$ is small enough, by (7.83),
$$|\beta^k_\nu|\le C\bar\varepsilon_1\Big[\sum_{m=k}^{1}|\nu_m|\gamma^{-2\kappa(m-k)} + \gamma^{\kappa k}\Big]. \tag{7.85}$$
Hence
$$|\nu_h^{(n)}|\le c\bar\varepsilon_1\Big\{\gamma^{-h}\sum_{k=\bar h+1}^{h}\gamma^{k}\Big[\sum_{m=k}^{1}|\nu_m^{(n-1)}|\gamma^{-2\kappa(m-k)} + \gamma^{\kappa k}\Big] + \gamma^{-(h-\bar h)}\Big\}. \tag{7.86}$$
Let us now suppose that, for some constant $c_{n-1}$,
$$|\nu_m^{(n-1)}|\le c_{n-1}\bar\varepsilon_1\big(\gamma^{\kappa m} + \gamma^{-\frac12(m-\bar h)}\big)\le\bar\varepsilon_0, \tag{7.87}$$
which is true for $n = 1$, since $\nu_m^{(0)} = 0$, if $\bar\varepsilon_1$ is small enough. One then checks that the same bound is verified by $\nu_m^{(n)}$, if $c_{n-1}$ is substituted with $c_n = c(1 + c_4c_{n-1}\bar\varepsilon_1)$, where $c_4$ is a suitable constant. Hence, we can prove the bound (7.77) for $\nu_h = \lim_{n\to\infty}\nu_h^{(n)}$, for $\bar\varepsilon_1$ small enough.

Proof of Lemma 3. We shall proceed by induction. The second part of (5.1) and the above analysis imply that, if $\lambda$ is small enough, there exists an interval $I^{(0)}$, whose size is of order $\lambda$, such that, if $\nu\in I^{(0)}$, then the bound (7.77) is satisfied, together with $|\lambda_0 - \lambda|\le C|\lambda|^2$. Let us now suppose that the solution of (5.1) is well defined for $\bar h\le h\le 0$ and satisfies the conditions (7.75), (7.77), for any $\nu$ belonging to an interval $I^{(\bar h)}$. Suppose also that there exists a constant $c_0$, such that
$$\mu_{\bar h}\le c_0|\lambda|. \tag{7.88}$$
We want to prove that all these conditions are verified also if $\bar h$ is substituted with $\bar h-1$, if $\lambda$ is small enough. The induction will be stopped as soon as the second condition in (7.75) is violated for some $\nu\in I^{(\bar h)}$. We shall put $\nu$ equal to one of these values, so defining $h^*$ as equal to $\bar h+1$.
By using (5.5) we have
$$a_{\bar h-1}=a_{\bar h}+\beta_{\bar h}^{\alpha,L}(a_{\bar h},\ldots,a_{\bar h})+\sum_{k=\bar h+1}^{1}D_{\bar h,k}^{\alpha}+r_{\bar h}^{\alpha}(a_{\bar h},\nu_{\bar h};\ldots;a_1,\nu_1;u), \tag{7.89}$$
where
$$D_{h,k}^{\alpha}=\beta_h^{\alpha,L}(a_h,\ldots,a_h,a_k,a_{k+1},\ldots,a_1)-\beta_h^{\alpha,L}(a_h,\ldots,a_h,a_h,a_{k+1},\ldots,a_1). \tag{7.90}$$
On the other hand, one checks that $D_{h,k}^{\alpha}$ admits a tree expansion similar to that of the functions $\beta_h^{\alpha,L}(a_h,\ldots,a_1)$, with the property that all trees giving a non zero contribution must have an endpoint of scale h, associated with a difference $\lambda_k-\lambda_h$ or $\delta_k-\delta_h$. Hence, if κ is the same constant in (7.83) and $h\le 0$,
$$|D_{h,k}^{\alpha}|\le C|\lambda_h|\,\gamma^{-\kappa(k-h)}\,|a_k-a_h|. \tag{7.91}$$
Let us now suppose that $\bar h\le h\le 0$ and that there exists a constant $c_0$, such that
$$|a_{k-1}-a_k|\le c_0|\lambda|^{3/2}\big[\gamma^{-\frac12(k-\bar h)}+\gamma^{\vartheta k}\big],\qquad h<k\le 0, \tag{7.92}$$
where $\vartheta=\min\{\kappa/2,\eta\}$. Equation (7.92) is certainly verified for k = 0, thanks to the second part of (5.1); we want to show that it is verified also if h is substituted with h − 1, if $\lambda_1$ is small enough. By using (7.89), (5.6), (5.7) and (7.92), we get
$$|a_{h-1}-a_h|\le C\lambda_h\gamma^{\eta h}+C|\lambda_h|^2\big[\gamma^{-\frac12(h-\bar h)}+\gamma^{\vartheta h}\big]+Cc_0|\lambda_h|^{5/2}\sum_{k=h+1}^{1}\gamma^{-\kappa(k-h)}\sum_{h'=h+1}^{k}\big[\gamma^{-\frac12(h'-h^*)}+\gamma^{\vartheta h'}\big], \tag{7.93}$$
which immediately implies (7.92) with $h\to h-1$ and (7.88) with $\bar h\to\bar h-1$. The bound (7.93) implies also the first of (5.3). Finally the second of (5.3) follows from (5.2).

Independence of ν from $t-t_c$. We have shown that by choosing $\nu\in I_{h^*}$ then (5.3) holds; such ν are parametrized by $\nu_{h^*}\in J^{(h^*+1)}$. Assuming (7.75) and $\bar h=h_M$, one can proceed as above to show that there exists a sequence $\tilde\nu_h$, $h_M<h\le 1$, such that (so that $\tilde\nu_{h_M}=0$)
$$\tilde\nu_h=-\gamma^{-h}\sum_{k=h_M+1}^{h}\gamma^{k-1}\beta_k^\nu(\tilde\nu_k,\ldots,\tilde\nu_1). \tag{7.94}$$
If $\nu_h$, $h^*<h\le 1$, verify (7.82) with $\nu_{h^*}=0$, it holds that
$$|\nu_h-\tilde\nu_h|\le C\varepsilon_1\gamma^{\kappa h^*},\qquad h^*\le h\le 1; \tag{7.95}$$
this implies that one can choose $\nu=\tilde\nu_1$ for any $h^*$. Equation (7.95) is proved by induction, assuming that it holds for any $k\ge h+1$ and subtracting (7.82) with $\bar h=h^*$ and $\nu_{h^*+1}=0$ from (7.94), finding
$$\tilde\nu_h-\nu_h=-\gamma^{-h}\sum_{k=h^*+1}^{h}\gamma^{k-1}\big[\beta_k^\nu(\tilde\nu_k,\ldots,\tilde\nu_1)-\beta_k^\nu(\nu_k,\ldots,\nu_1)\big]-\gamma^{-h}\sum_{k=h_M+1}^{h^*}\gamma^{k-1}\beta_k^\nu(\tilde\nu_k,\ldots,\tilde\nu_1). \tag{7.96}$$
By using (7.83) and the inductive hypothesis, (7.95) follows.

7.11. Appendix M: Physical observables. The functionals $B^{(h)}(\sqrt{Z_h}\,\psi^{(\le h)},\phi)$ and $S^{(h)}(\phi)$ defined in (6.11), (6.12) can be written in terms of a tree expansion similar to the one introduced in Appendix I. We introduce, for each n ≥ 0 and each m ≥ 1, a family $T^m_{h,n}$ of trees, which are defined as in Appendix I, with some differences.

1) First of all, if $\tau\in T^m_{h,n}$, the tree has n + m (instead of n) endpoints. Moreover, among the n + m endpoints, there are n endpoints, which we call normal endpoints, which are associated with a contribution to the effective potential on scale $h_v-1$. The m remaining endpoints, which we call special endpoints, are associated with a local term of the form (6.15); we shall say that they are of type $Z^{(1)}$.

2) We associate with each vertex v a new integer $l_v\in[0,m]$, which denotes the number of special endpoints following v, i.e. contained in $L_v$.
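As an illustration of the labelling in points 1) and 2), the following minimal Python sketch stores a rooted tree whose leaves are endpoints marked as normal or special, and computes the integer l_v for every vertex by counting the special endpoints that follow it. The class `Vertex`, the function `special_counts` and the vertex names are hypothetical and introduced only for this example; the scale labels and the sets P_v of the actual expansion are omitted, and an endpoint is assumed to count itself when it is special.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Vertex:
    """A vertex of a rooted tree; vertices without children are endpoints."""
    name: str
    kind: Optional[str] = None  # "normal" or "special" for endpoints, None otherwise
    children: List["Vertex"] = field(default_factory=list)


def special_counts(root: Vertex) -> Dict[str, int]:
    """Return l_v for every vertex v: the number of special endpoints following v."""
    counts: Dict[str, int] = {}

    def visit(v: Vertex) -> int:
        if not v.children:
            # Assumption: a special endpoint contributes 1 to its own l_v.
            l_v = 1 if v.kind == "special" else 0
        else:
            l_v = sum(visit(child) for child in v.children)
        counts[v.name] = l_v
        return l_v

    visit(root)
    return counts


# A tree with n = 2 normal endpoints (e1, e2) and m = 2 special endpoints (x, y).
tree = Vertex("v0", children=[
    Vertex("v1", children=[Vertex("x", "special"), Vertex("e1", "normal")]),
    Vertex("v2", children=[Vertex("y", "special"), Vertex("e2", "normal")]),
])
counts = special_counts(tree)
print(counts)
```

In this toy tree the root v0 plays the role of the smallest cluster containing both special endpoints, so it has l_v = 2, while v1 and v2 each have l_v = 1.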
In order to study the expansion of the correlation function $\Omega(x,0)\equiv\Omega(x)$, which follows from (6.21), we have to consider the trees with two special endpoints, whose space-points we shall denote x and y = 0; moreover, we shall denote by $h_x$ and $h_y$ the scales of the two special endpoints and by $h_{x,y}$ the scale of the smallest cluster containing both special endpoints. The decomposition $\tilde\Omega(x,y)=\Omega^{\alpha}(x,y)+\Omega^{\beta}(x,y)$ is such that $\Omega^{\alpha}(x,y)$ is given by the sum over trees belonging to $T^2_{h,n}$ with endpoints v to which are associated only terms in $\mathcal{L}V^{(h_v-1)}$ or $\mathcal{L}B^{(h_v-1)}$, and $\Omega^{\beta}(x,y)$ is the sum over the remaining trees. The first two addends in (6.22) are the contribution from the trees with n = 0, while $(Z_h^{(1)}/Z_h)^2\,G^{(h),\alpha}(x)$ is given by the sum of trees with n ≥ 1,
$$G^{(h),\alpha}(x)=\sum_{n=1}^{\infty}\ \sum_{h_r=h^*-1}^{h-1}\ \sum_{\substack{\tau\in T^2_{h_r,n}\\ h_{x,y}=h}}\ \sum_{\substack{P\in\mathcal{P}_\tau\\ P_{v_0}=\emptyset}}G^{(h,h_r),\alpha}(x,\tau,P), \tag{7.97}$$
and, as proved in full detail in §5 of [BM], the following bound holds, see (5.60) of [BM],
$$|G^{(h,h_r),\alpha}(x,\tau,P)|\le(C\varepsilon_h)^n\,C_N(2n+1)^N\,\frac{\gamma^{2h}}{1+[\gamma^h d(x)]^N}\cdot\Big(\frac{Z^{(1)}_{h_x}Z_h}{Z_{h_x-1}Z^{(1)}_h}\Big)\Big(\frac{Z^{(1)}_{h_y}Z_h}{Z_{h_y-1}Z^{(1)}_h}\Big)\cdot\prod_{v\ \mathrm{not\ e.p.}}\frac{1}{s_v!}\,C^{\sum_{i=1}^{s_v}|P_{v_i}|-|P_v|}\big(Z_{h_v}/Z_{h_v-1}\big)^{|P_v|/2}\,\gamma^{-[-2+\frac{|P_v|}{2}+l_v+z(P_v,l_v)]}, \tag{7.98}$$
where $z(P_v,l_v)=1$ if $|P_v|=4$, $l_v=0$; $z(P_v,l_v)=2$ if $|P_v|=2$, $l_v=0$; $z(P_v,l_v)=1$ if $|P_v|=2$, $l_v=1$; $z(P_v,l_v)=0$ in all other cases.

We can now perform as in Appendix I the various sums in the r.h.s. of (7.97). There are some differences in the sum over the scale labels, but they can be easily treated. First of all, one has to take care of the factors $(Z^{(1)}_{h_x}Z_h)/(Z_{h_x-1}Z^{(1)}_h)$ and $(Z^{(1)}_{h_y}Z_h)/(Z_{h_y-1}Z^{(1)}_h)$, with the only effect of adding to the final bound a factor $\gamma^{C|\lambda|(h_v-h_{v'})}$ for each non-trivial vertex v containing one of the special endpoints and strictly following the vertex $v_{x,y}$; this has a negligible effect, thanks to a bound analogous to (7.72), valid in this case. The other difference is in the fact that, instead of fixing the scale of the root, we have now to fix the scale of $v_{x,y}$. However, this has no effect, since we bound the sum over the scales with the sum over the differences $h_v-h_{v'}$.

The previous considerations are sufficient to get the bound (6.23) for $G^{(h),\alpha}(x)$. An expression similar to (7.97) holds also for $G^{(h),\beta}(x)$; the extra factor $\gamma^{\tau h}$ in the bound (6.24) (with respect to (6.23)) is due to the fact that the bound over all the trees which have at least one endpoint v of fixed scale $h_v=2$ can be improved by a factor $\gamma^{\tau h}$. It is sufficient to use (7.72), which allows one to extract such a factor from the r.h.s. before performing the sum over the scale indices.

Note also that from (6.15), (6.17) we get (6.25), where $z_h^{(1)}$ is given by
$$z_h^{(1)}=\sum_{n=1}^{\infty}\ \sum_{\substack{\tau\in T^1_{h,n},\ P\in\mathcal{P}_\tau\\ P_{v_0}=(f_1,f_2)}}z_h^{(1)}(\tau,P), \tag{7.99}$$
with
$$|z_h^{(1)}(\tau,P)|\le C^n\varepsilon_h^n\,\gamma^{-h[D_0(P_{v_0})+l_{v_0}]}\prod_{v\ \mathrm{not\ e.p.}}\frac{1}{s_v!}\,C^{\sum_{i=1}^{s_v}|P_{v_i}|-|P_v|}\big(Z_{h_v}/Z_{h_v-1}\big)^{|P_v|/2}\,\gamma^{-[-2+\frac{|P_v|}{2}+l_v+z(P_v,l_v)]}. \tag{7.100}$$
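The exponent $-2+|P_v|/2+l_v+z(P_v,l_v)$ that appears in the products of (7.98) and (7.100) can be tabulated directly from the definition of z. The sketch below is only an arithmetic check, not part of the proof: it assumes that the conditions on $P_v$ in the definition of z refer to the number $|P_v|$ of external fields, that $|P_v|$ is even and at least 2, and the function names `z` and `scaling_exponent` are introduced here for illustration.

```python
def z(p_v: int, l_v: int) -> int:
    """z(P_v, l_v) as defined after (7.98); p_v stands for |P_v| (assumed reading)."""
    if p_v == 4 and l_v == 0:
        return 1
    if p_v == 2 and l_v == 0:
        return 2
    if p_v == 2 and l_v == 1:
        return 1
    return 0


def scaling_exponent(p_v: int, l_v: int) -> int:
    """The exponent -2 + |P_v|/2 + l_v + z(P_v, l_v); p_v is assumed even."""
    return -2 + p_v // 2 + l_v + z(p_v, l_v)


# Tabulate the exponent for a few even |P_v| >= 2 and l_v in {0, 1, 2}.
table = {
    (p_v, l_v): scaling_exponent(p_v, l_v)
    for p_v in (2, 4, 6, 8)
    for l_v in (0, 1, 2)
}
for (p_v, l_v), value in sorted(table.items()):
    print(f"|P_v| = {p_v}, l_v = {l_v}: exponent = {value}")
```

Under these assumptions every tabulated exponent is at least 1, so each factor $\gamma^{-[\ldots]}$ in the product decays in the scale difference; the three cases in which z is non zero are exactly those in which the exponent would otherwise be 0 or negative.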
Finally note that $\mathrm{sech}^4J_r\,\tilde\Omega(x,y)-\Omega^{-,-,-,-}(x,y)$ is given by a sum of terms in which three or four external φ fields are present. Essentially by power counting one gets a bound similar to (7.98) in which $\gamma^{2h}$ is replaced by $\gamma^{3h}$ or $\gamma^{4h}$, depending on whether there are three or four external φ fields.
7.12. Appendix N: Perturbations of a single Ising model. If we consider the Hamiltonian (1.1) with interaction given by (6.33), all the analysis in §2, §3 is still valid; the only place in which we have used the explicit form of V is in Appendix F, but the symmetry cancellations exploited there hold also in the case of V given by (6.33). The integration of the light fermions is done exactly as in §4, but now in (4.9) and (4.20) $F_\lambda^{(\le h)}=0$, i.e. the term quartic in the field is missing in $\mathcal{L}V^{(h)}$; the reason is that
$$\psi^{(\le h)+}_{x,1}\,\psi^{(\le h)-}_{x,1}\,\psi^{(\le h)+}_{x,-1}\,\psi^{(\le h)-}_{x,-1}=\bar\psi_x^{(\le h)(1)}\,\psi_x^{(\le h)(1)}\,\bar\psi_x^{(\le h)(2)}\,\psi_x^{(\le h)(2)}, \tag{7.101}$$
but such a term cannot be present as the (1) and (2) systems are independent. As a consequence, in (5.1) $\beta_m^h,\beta_z^h,\beta_\delta^h$ are all $O(\varepsilon_h\gamma^{\kappa h})$, if κ is a constant, for the same considerations used in Appendix L: there is no contribution from trees with only end-points of type ν or δ, because of the support properties of the single scale propagators. Hence $\beta_m^h,\beta_z^h,\beta_\delta^h$ are given by a sum of trees with at least an end-point of scale $h_v=2$ and by (7.72) the bound for them can be improved by a factor $\gamma^{\kappa h}$. Then, choosing ν properly, $\delta_h=O(\lambda)$, $m_h=m_0(1+O(\lambda))$, $Z_h=1+O(\lambda)$. For the same reasons the analysis in §6 still holds but $Z_h^{(1)}=1+O(\lambda)$ and at the end (1.8)–(1.12) hold with $\eta_1=\eta_2=0$.

7.13. Appendix O: Extensions of the main Theorem. It should be clear from the above analysis that the correlation function or the specific heat behaviour in (1.10) or (1.12) does not depend on the details of the interaction (1.3) but on a few general properties. In fact assume that V verifies the following properties.

1) V is symmetric under the exchange $\{\sigma_x^{(1)}\}_{x\in\Lambda},\{\sigma_x^{(2)}\}_{x\in\Lambda}\to\{\sigma_x^{(2)}\}_{x\in\Lambda},\{\sigma_x^{(1)}\}_{x\in\Lambda}$. This is true for the Ashkin-Teller Hamiltonian, which is invariant under the operation $\sigma^{(1)}_{x,x_0},\sigma^{(2)}_{x,x_0}\to\sigma^{(2)}_{x,x_0},\sigma^{(1)}_{x,x_0}$, and for the Eight vertex model, which is invariant under $\sigma^{(1)}_{x,x_0},\sigma^{(2)}_{x,x_0}\to\sigma^{(2)}_{x,x_0},\sigma^{(1)}_{x+1,x_0-1}$ for any $x\in\Lambda$.

2) V is given by the sum of monomials in the spin variables each one of the form
$$\lambda\,v(x_1,..,x_n)\prod_{i=1}^{n}\sigma^{(\alpha_i)}_{x_i}\sigma^{(\alpha_i)}_{x'_i} \tag{7.102}$$
with $\alpha_i=1,2$, $x_i,x'_i$ nearest neighbor, $v(x_1,..,x_n)$ short ranged and λ small.

The above two properties ensure that the effective potential can be written in the form (3.1), with V given by a sum over short range monomials in the Grassmann variables ψ, χ. Moreover the analysis in Appendix F can be repeated, as the symmetries which were true in the Ashkin-Teller or in the Eight vertex model are true also here, and the marginal or relevant terms in the Renormalization group analysis are the same as in the Eight vertex or Ashkin-Teller models. Note that the interaction in the Ashkin-Teller or the Eight-vertex model verifies an extra symmetry, namely a symmetry in the exchange $x,x_0\to x_0,x$; such an extra symmetry is however not used in our analysis. Finally:

3) V is such that in the effective potential (3.1) there is a non vanishing local term of the form
$$[c\lambda+O(\lambda^2)]\,\psi^+_{1,x}\psi^-_{1,x}\psi^+_{-1,x}\psi^-_{-1,x} \tag{7.103}$$
with $c\ne 0$ a constant.

If such conditions are verified, then a statement identical to the main Theorem follows.

Acknowledgements. This paper was partly written in the stimulating atmosphere of the Institute for Advanced Study in Princeton. I am indebted to Prof. Spencer for his invitation and for many clarifying discussions about his work [PS]. I thank G. Benfatto, G. Gallavotti and A. Giuliani for many important remarks and suggestions.
References

[AT] Ashkin, J., Teller, E.: Statistics of Two-Dimensional Lattices with Four Components. Phys. Rev. 64, 178–184 (1943)
[B] Baxter, R.J.: Eight-Vertex Model in Lattice Statistics. Phys. Rev. Lett. 26, 832–833 (1971)
[Ba] Baxter, R.: Exactly solved models in statistical mechanics. London-New York-San Diego: Academic Press, 1982
[BG1] Benfatto, G., Gallavotti, G.: Perturbation Theory of the Fermi Surface in Quantum Liquid. A General Quasiparticle Formalism and One-Dimensional Systems. J. Stat. Phys. 59, 541–664 (1990)
[BG] Benfatto, G., Gallavotti, G.: Renormalization group. Physics Notes 1, Princeton, NJ: Princeton University Press, 1995
[BGM] Benfatto, G., Gallavotti, G., Mastropietro, V.: Renormalization Group and the Fermi Surface in the Luttinger Model. Phys. Rev. B 45, 5468–5480 (1992)
[BM] Benfatto, G., Mastropietro, V.: Renormalization group, hidden symmetries and approximate Ward identities in the XYZ model. Rev. Math. Phys. 13(11), 1323–143 (2001); Commun. Math. Phys. 231, 97–134 (2002)
[BeM1] Benfatto, G., Mastropietro, V.: Ward identities and Dyson equations in interacting Fermi systems. To appear in J. Stat. Phys.
[BGPS] Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Beta Functions and Schwinger Functions for a Many Fermions System in One Dimension. Commun. Math. Phys. 160, 93–171 (1994)
[BM1] Bonetto, F., Mastropietro, V.: Beta Function and Anomaly of the Fermi Surface for a d = 1 System of Interacting Fermions in a Periodic Potential. Commun. Math. Phys. 172, 57–93 (1995)
[GM] Gentile, G., Mastropietro, V.: Renormalization group for one-dimensional fermions. A review on mathematical results. Phys. Rep. 352(4-6), 273–43 (2001)
[G] Gallavotti, G.: Renormalization theory and ultraviolet stability for scalar fields via renormalization group methods. Rev. Mod. Phys. 57(2), 471–562 (1985)
[GS] Gentile, G., Scoppola, B.: Renormalization group and the ultraviolet problem in the Luttinger model. Commun. Math. Phys. 154, 153–179 (1993)
[K] Kadanoff, L.P.: Connections between the Critical Behavior of the Planar Model and That of the Eight-Vertex Model. Phys. Rev. Lett. 39, 903–905 (1977)
[Ka] Kasteleyn, P.W.: Dimer Statistics and phase transitions. J. Math. Phys. 4, 287 (1963)
[F] Fan, C.: On critical properties of the Ashkin-Teller model. Phys. Lett. 6, 136 (1972)
[H] Hurst, C.: New approach to the Ising problem. J. Math. Phys. 7(2), 305–310 (1966)
[ID] Itzykson, C., Drouffe, J.: Statistical field theory: 1. Cambridge: Cambridge Univ. Press, 1989
[Le] Lesniewski, A.: Effective action for the Yukawa_2 quantum field theory. Commun. Math. Phys. 108, 437–467 (1987)
[Li] Lieb, H.: Exact solution of the problem of entropy of two-dimensional ice. Phys. Rev. Lett. 18, 692–694 (1967)
[LP] Luther, A., Peschel, I.: Calculations of critical exponents in two dimension from quantum field theory in one dimension. Phys. Rev. B 12, 3908–3917 (1975)
[M1] Mastropietro, V.: Non universality in Ising models with quartic interaction. J. Stat. Phys. 111, 201–259 (2003)
[ML] Mattis, D., Lieb, E.: Exact solution of a many fermion system and its associated boson field. J. Math. Phys. 6, 304–312 (1965)
[MW] McCoy, B., Wu, T.: The two-dimensional Ising model. Cambridge, Ma: Harvard Univ. Press, 1973
[MPW] Montroll, E., Potts, R., Ward, J.: Correlation and spontaneous magnetization of the two dimensional Ising model. J. Math. Phys. 4, 308 (1963)
[N] den Nijs, M.P.M.: Derivation of extended scaling relations between critical exponents in two dimensional models from the one dimensional Luttinger model. Phys. Rev. B 23(11), 6111–6125 (1981)
[O] Onsager, L.: Critical statistics. A two dimensional Ising model with an order-disorder transition. Phys. Rev. 65, 117–149 (1944)
[PB] Pruisken, A.M.M., Brown, A.C.: Universality for the critical lines of the eight vertex, Ashkin-Teller and Gaussian models. Phys. Rev. B 23(3), 1459–1468 (1981)
[PS] Pinson, H., Spencer, T.: Universality in 2D critical Ising model. To appear in Commun. Math. Phys.
[S] Samuel, S.: The use of anticommuting variable integrals in statistical mechanics. J. Math. Phys. 21, 2806 (1980)
[Su] Sutherland, S.B.: Two-Dimensional Hydrogen Bonded Crystals. J. Math. Phys. 11, 3183–3186 (1970)
[Spe] Spencer, T.: A mathematical approach to universality in two dimensions. Physica A 279, 250–259 (2000)
[SML] Schultz, T., Mattis, D., Lieb, E.: Two-dimensional Ising model as a soluble problem of many Fermions. Rev. Mod. Phys. 36, 856 (1964)
[W] Wu, F.W.: The Ising model with four spin interaction. Phys. Rev. B 4, 2312–2314 (1971)
Communicated by G. Gallavotti