Communications In Mathematical Physics - Volume 284

Commun. Math. Phys. 284, 1–49 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0525-2

WZW Orientifolds and Finite Group Cohomology

Krzysztof Gawędzki1,*, Rafał R. Suszek2, Konrad Waldorf3

1 Laboratoire de Physique, ENS-Lyon, 46 Allée d’Italie, F-69364 Lyon, France.

E-mail: [email protected]

2 Department of Mathematics, King’s College London Strand, London WC2R 2LS, UK.

E-mail: [email protected]; [email protected]

3 Fachbereich Mathematik, Universität Hamburg, Bundesstrasse 55, D-20146 Hamburg, Germany.

E-mail: [email protected]

Received: 9 February 2007 / Accepted: 17 January 2008
Published online: 12 August 2008 – © Springer-Verlag 2008

* Membre du C.N.R.S.

Abstract: The simplest orientifolds of the WZW models are obtained by gauging a $\mathbb{Z}_2$ symmetry group generated by a combined involution of the target Lie group $G$ and of the worldsheet. The action of the involution on the target is by a twisted inversion $g \mapsto (\zeta g)^{-1}$, where $\zeta$ is an element of the center of $G$. It reverses the sign of the Kalb-Ramond torsion field $H$ given by a bi-invariant closed 3-form on $G$. The action on the worldsheet reverses its orientation. An unambiguous definition of Feynman amplitudes of the orientifold theory requires a choice of a gerbe with curvature $H$ on the target group $G$, together with a so-called Jandl structure introduced in [31]. More generally, one may gauge orientifold symmetry groups $\Gamma = \mathbb{Z}_2 \ltimes Z$ that combine the $\mathbb{Z}_2$-action described above with the target symmetry induced by a subgroup $Z$ of the center of $G$. To define the orientifold theory in such a situation, one needs a gerbe on $G$ with a $Z$-equivariant Jandl structure. We reduce the study of the existence of such structures and of their inequivalent choices to a problem in group-$\Gamma$ cohomology that we solve for all simple simply connected compact Lie groups $G$ and all orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes Z$.

1. Introduction

Unoriented string theory, both in the closed and in the open sector, has a long history [26,32]. From the two-dimensional point of view, it involves conformal field theory defined on unoriented worldsheets. Such a theory may be viewed as an “orientifold” obtained from a conformal field model defined on oriented surfaces by gauging a discrete symmetry including transformations reversing the worldsheet orientation. If the conformal theory is a sigma model whose target space carries a background Kalb-Ramond 2-form field $B$, the worldsheet orientation-changing transformations have to be combined with target-space transformations that change the sign of $B$ so that the $B$-field contribution to the sigma model action functional stays invariant. This leads to subtle issues if the $B$-field is topologically non-trivial, like the one present in the Wess-Zumino-Witten (WZW) sigma models with Lie group targets [35]. In those models, only the closed torsion 3-form $H = dB$, a right-left invariant 3-form on the group manifold, is globally defined.

Orientifolds of the WZW models have been studied intensively within the algebraic approach, following the pioneering work of the Rome group [27,28]. The main tool in this approach was the use of sewing and modular duality constraints in order to find consistent expressions for the crosscap states encoding the action of the orientation inversion in the closed-string sector. The algebraic approach was further developed in the context of more general orientifolds combining simple-current orbifolds and orientation reversal in [5,10,17–19]. It gave rise to an abstract formulation of the relevant topological structures in the language of tensor categories [9]. The interpretation of the results of the algebraic approach in terms of the target geometry was the subject of papers [2] and [4] that studied orientifolds of the $SU(2)$ and $SO(3)$ WZW theories.

In general, one may expect that the intricacies appearing in the algebraic studies of WZW orientifolds have their source in the classical target geometry, more precisely in the non-triviality of the $B$-field background, similarly to the ones involved in the simple-current orbifolds of the WZW models. In the latter case, it was argued in [12], see also [1], that the proper treatment of the non-trivial $B$-field background in the closed-string sector may be achieved by employing the third (real) Deligne cohomology. This approach lay behind the classification of the WZW models on non-simply connected simple compact groups obtained in [8]. The third Deligne cohomology classifies geometric structures called bundle gerbes with connections, introduced in [24,25].
The latter are in a similar relation to the closed 3-forms H as line bundles with connection are to their curvature 2-forms F. Consequently, the 3-form H corresponding to a gerbe is called its curvature. The geometric language of bundle gerbes is sometimes more convenient than the cohomological one of Deligne cohomology. For general simple groups, in particular, it appeared to be easier to construct the bundle gerbes with the curvature given by a bi-invariant 3-form H than the corresponding Deligne cohomology classes. Such a construction was accomplished for the simply connected groups in [23] and was generalized to the non-simply connected ones in [15]. An extension of the geometric analysis including open strings and D-branes required studying gerbe modules carrying Chan-Paton gauge fields twisted by the gerbe [20]. In the algebraic language, the WZW models with non-simply connected target groups are simple-current orbifolds of the models with simply connected targets. The geometric analysis of [13,14], employing (bundle) gerbes and gerbe modules, permitted a systematic classification of symmetric D-branes in the WZW models and exposed the classical origin of the finite group cohomology that appeared in the algebraic analysis of the simple-current orbifolds. Indeed, the relevant cohomological aspects pass undeformed to the quantum theory that is obtained by geometric quantization of the classical one [13]. The recent paper [31] introduced additional data, called a Jandl structure on a gerbe, that are required to define Feynman amplitudes for closed unoriented worldsheets in the presence of a topologically non-trivial B-field. A Jandl structure may be viewed as a symmetry of the gerbe under a transformation of the underlying space that changes the sign of the curvature 3-form. In this paper, we classify such structures on all gerbes on simple compact groups with the gerbe curvature equal to a bi-invariant torsion 3-form H . 
More precisely, on the simply connected group targets $G$, we consider the action of orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes Z$. This action combines the involutive twisted inversion $g \mapsto (\zeta g)^{-1}$, where the twist element $\zeta$ belongs to the center $Z(G)$ of $G$, with the multiplication by elements of the “orbifold” subgroup $Z \subset Z(G)$. The action of $Z$ preserves the bi-invariant 3-form $H$, whereas the action of the twisted inversion changes its sign. We introduce the notion of a $\Gamma$-equivariant structure on the gerbe with curvature $H$ on group $G$. Such a structure may be regarded as a $Z$-equivariant Jandl structure on that gerbe. It determines a genuine Jandl structure on the quotient gerbe on the non-simply connected group $G/Z$ and makes it possible to define unambiguously the contribution of the $B$-field to Feynman amplitudes of unoriented string world histories represented by maps from unoriented closed surfaces to the target $G/\Gamma$.

We show that obstructions to the existence of $\Gamma$-equivariant structures are contained in the cohomology group $H^3(\Gamma, U(1)_\epsilon)$, where the subscript indicates that $U(1)$ is considered with the action $\lambda \mapsto \lambda^{-1}$ of those elements of $\Gamma$ which reverse the sign of $H$. If the obstruction class vanishes, non-equivalent $\Gamma$-equivariant structures may be labeled by elements of the cohomology group $H^2(\Gamma, U(1)_\epsilon)$. Each choice gives a different (closed-string) orientifold theory. Let us recall that obstructions to the existence of the quotient gerbe on $G/Z$ (and of the $Z$-orbifold theory) lie in $H^3(Z, U(1))$ and that ambiguities in its construction (the “discrete torsion” of [33]) take values in $H^2(Z, U(1))$, see [13].

The present paper is devoted to the study of obstruction 3-cocycles for all simple simply connected groups $G$ of the Cartan series and all choices of the orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes Z$. We find the conditions under which the obstruction cocycles are coboundaries, i.e. the obstruction cohomology class is trivial. This provides an extension of the work of [15] from the orbifold to the orientifold case. As in the orbifold case analyzed in [13,15], the cochains trivializing the obstruction cocycles enter directly the construction of $\Gamma$-equivariant structures on the gerbes on groups $G$ and the analysis of the symmetric D-branes in the WZW orientifolds. These topics, involving more geometric considerations as well as a discussion of the relation between our approach and the algebraic one of [5], are postponed to a later publication [16]. In the present paper, we shall avoid geometry by sticking to a local description of gerbes, staying close to the Deligne cohomology approach of [12].

The paper is organized as follows. In Sect. 2, we summarize the description of gerbes by local data and the relation of gerbes on discrete quotients to finite group cohomology. The application to gerbes on simple simply connected compact groups $G$ and their non-simply connected quotients $G/Z$ is recalled from [15]. Finally, we extend the construction to the case of quotients by orientifold groups $\Gamma$ and describe a 3-cocycle whose cohomology class obstructs the existence of $\Gamma$-equivariant structures on gerbes on the simply connected groups $G$ for $\Gamma = \mathbb{Z}_2 \ltimes Z$. In Sect. 3, we study the relevant cohomology groups: the one containing the obstruction classes, $H^3(\Gamma, U(1)_\epsilon)$, and the one classifying non-equivalent $\Gamma$-equivariant structures, $H^2(\Gamma, U(1)_\epsilon)$. They are more difficult to calculate than the corresponding orbifold cohomologies, but information about those groups may be obtained from the Lyndon-Hochschild-Serre spectral sequence that we discuss in some detail. In particular, we are able to calculate the classifying group $H^2(\Gamma, U(1)_\epsilon)$ in all the relevant cases. Section 4 is the most technical part of the paper. It analyzes the obstruction 3-cocycles for all simple groups $G$ of the Cartan series and all choices of the twisted orientifold group actions and finds cohomologically inequivalent trivializing cochains whenever the obstruction cohomology class is trivial. The results are tabulated in the Appendix. In Sect. 5, we collect our conclusions.

2. Bundle Gerbes and Orientifolds

2.1. Local description of bundle gerbes.
(Bundle) gerbes (with hermitian structure and unitary connection) are geometric structures that allow one to define the contribution of the Kalb-Ramond torsion 3-form $H$ to closed-string Feynman amplitudes. A simple, although not always convenient, way to present a gerbe on a manifold $M$ is via its local data. In this paper, we shall stick to such a local description of bundle gerbes that reduces the geometric structures to the cohomological ones described already in [12]. A discussion, in relation to orientifolds, of the geometric structures underlying the notion of bundle gerbes [24,25] is postponed to [16].

Gerbe local data subordinate to a good open covering$^1$ $(O_i)$ of $M$ are a collection $(B_i, A_{ij}, g_{ijk})$, where $B_i$ are 2-forms on the sets $O_i$, $A_{ij} = -A_{ji}$ are 1-forms on $O_{ij}$ and $g_{ijk} = g_{jik}^{-1} = g_{jki} = g_{ikj}^{-1}$ are $U(1)$-valued functions on $O_{ijk}$ such that the following descent equations hold:
\[
\begin{aligned}
B_j - B_i &= dA_{ij} && \text{on } O_{ij}, \\
A_{ij} - A_{ik} + A_{jk} &= i\, g_{ijk}^{-1}\, dg_{ijk} && \text{on } O_{ijk}, \\
g_{ijk}\, g_{ijl}^{-1}\, g_{ikl}\, g_{jkl}^{-1} &= 1 && \text{on } O_{ijkl}.
\end{aligned}
\]

The global closed 3-form $H$ equal to $dB_i$ on the sets $O_i$ is called the curvature of the gerbe. The necessary and sufficient condition for the existence of a gerbe with curvature $H$ (and of the corresponding local data) is that the periods of the 3-form $\frac{1}{2\pi} H$ be integers. The local data $(B_i, A_{ij}, g_{ijk})$ and $(B'_i, A'_{ij}, g'_{ijk})$ are considered equivalent if there exist 1-forms $\Pi_i$ on $O_i$ and $U(1)$-valued functions $\chi_{ij} = \chi_{ji}^{-1}$ on $O_{ij}$ such that
\[
\begin{aligned}
B'_i &= B_i + d\Pi_i, \\
A'_{ij} &= A_{ij} + \Pi_j - \Pi_i - i\, \chi_{ij}^{-1}\, d\chi_{ij}, \\
g'_{ijk} &= g_{ijk}\, \chi_{ij}^{-1}\, \chi_{ik}\, \chi_{jk}^{-1}.
\end{aligned}
\]
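As a consistency check, one can verify directly that the equivalence transformation maps solutions of the first descent equation to solutions and leaves the curvature unchanged. The short computation below is only a sketch; it uses the relations quoted in this subsection together with the identity $d(\chi_{ij}^{-1} d\chi_{ij}) = 0$ for a $U(1)$-valued function $\chi_{ij}$.

```latex
\begin{align*}
B'_j - B'_i &= (B_j - B_i) + d(\Pi_j - \Pi_i) \\
            &= d\bigl(A_{ij} + \Pi_j - \Pi_i\bigr) \\
            &= d\bigl(A'_{ij} + i\,\chi_{ij}^{-1}\, d\chi_{ij}\bigr) = dA'_{ij}, \\
dB'_i       &= dB_i + d^2\Pi_i = dB_i = H .
\end{align*}
```

The remaining two descent equations can be checked in the same way, using the antisymmetry $\chi_{ij} = \chi_{ji}^{-1}$.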

Equivalent local data correspond to gerbes that are called stably isomorphic [25]. Clearly, such gerbes have the same curvature 3-form $H$. In general, two gerbes with the same curvature differ by a flat gerbe with vanishing curvature. Up to equivalence, the local data of a flat gerbe are of the form $(0, 0, u_{ijk})$ with $u_{ijk} \in U(1)$ [12]. Their equivalence classes are in a one-to-one correspondence with elements of the cohomology group $H^2(M, U(1))$. In particular, if $H^2(M, U(1))$ is trivial then all gerbes with the same curvature are stably isomorphic. If there is no torsion in $H^3(M, \mathbb{Z})$ then one may also put the flat gerbe local data into an equivalent form $(B, 0, 1)$, where $B$ is a global closed 2-form. If $\Sigma$ is an oriented closed connected surface and $X$ maps $\Sigma$ to $M$ then, pulling back the gerbe by $X$ to $\Sigma$, one obtains a flat gerbe on $\Sigma$ which, up to a stable isomorphism, is characterized by a cohomology class in $H^2(\Sigma, U(1)) = U(1)$. The corresponding number in $U(1)$ is called the holonomy of the gerbe on $M$ along $X$. If the local data for the pullback gerbe are taken in the form $(B, 0, 1)$ then the holonomy along $X$ is given by $\exp\bigl(i \int_\Sigma B\bigr)$. It enters as a factor in the Feynman amplitude of the closed-string world history $X$.

$^1$ In a good open covering, the sets $O_i$ and all their (non-empty) intersections $O_{i_1 i_2 \ldots i_k} = O_{i_1} \cap O_{i_2} \cap \cdots \cap O_{i_k}$ are contractible.

It is convenient to use the cohomological language to describe gerbe local data and their equivalence classes [12]. We shall denote by $\check{C}^p(\mathcal{S})$ the Abelian group of Čech $p$-cochains with values in an (Abelian) sheaf $\mathcal{S}$. An element $c \in \check{C}^p(\mathcal{S})$ is a collection of sections $c_{i_0 \cdots i_p}$ of $\mathcal{S}$ over the sets $O_{i_0 \cdots i_p}$ that is antisymmetric in the indices $i_0, \ldots, i_p$. The groups $\check{C}^p(\mathcal{S})$ form the Čech complex $\check{C}(\mathcal{S})$,
\[
0 \longrightarrow \check{C}^0(\mathcal{S}) \xrightarrow{\ \check\delta\ } \check{C}^1(\mathcal{S}) \xrightarrow{\ \check\delta\ } \check{C}^2(\mathcal{S}) \xrightarrow{\ \check\delta\ } \cdots, \tag{1}
\]
where the Čech coboundary $\check\delta$ is defined by
\[
(\check\delta c)_{i_0 \cdots i_{p+1}} = \sum_{j=0}^{p+1} (-1)^j\, c_{i_0 \cdots i_{j-1} i_{j+1} \cdots i_{p+1}}.
\]
The Čech cohomology groups $H^p(M, \mathcal{S})$ are composed of Čech $p$-cocycles modulo $p$-coboundaries. In particular, taking the sheaf of locally constant $U(1)$-valued functions, one obtains the cohomology groups $H^p(M, U(1))$. Given a complex $\mathcal{D}$ of sheaves
\[
0 \longrightarrow \mathcal{S}^0 \xrightarrow{\ d_0\ } \mathcal{S}^1 \xrightarrow{\ d_1\ } \mathcal{S}^2 \xrightarrow{\ d_2\ } \cdots,
\]
one may build a double complex
\[
\begin{array}{ccccccc}
 & \downarrow \check\delta & & \downarrow \check\delta & & \downarrow \check\delta & \\
0 \longrightarrow & \check{C}^p(\mathcal{S}^0) & \xrightarrow{\ d_0\ } & \check{C}^p(\mathcal{S}^1) & \xrightarrow{\ d_1\ } & \check{C}^p(\mathcal{S}^2) & \xrightarrow{\ d_2\ } \cdots \\
 & \downarrow \check\delta & & \downarrow \check\delta & & \downarrow \check\delta & \\
0 \longrightarrow & \check{C}^{p+1}(\mathcal{S}^0) & \xrightarrow{\ d_0\ } & \check{C}^{p+1}(\mathcal{S}^1) & \xrightarrow{\ d_1\ } & \check{C}^{p+1}(\mathcal{S}^2) & \xrightarrow{\ d_2\ } \cdots \\
 & \downarrow \check\delta & & \downarrow \check\delta & & \downarrow \check\delta &
\end{array}
\]

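The hypercohomology groups $\mathbb{H}^s(M, \mathcal{D})$ of the complex $\mathcal{D}$ are defined as the cohomology groups of the diagonal complex $K(\mathcal{D})$,
\[
0 \longrightarrow A^0 \xrightarrow{\ D_0\ } A^1 \xrightarrow{\ D_1\ } A^2 \xrightarrow{\ D_2\ } A^3 \xrightarrow{\ D_3\ } \cdots, \tag{2}
\]
where
\[
A^s = \bigoplus_{p+q=s} \check{C}^p(\mathcal{S}^q) \tag{3}
\]
and $D_s = (-1)^{q+1} \check\delta + d_q$ on $\check{C}^p(\mathcal{S}^q)$. We shall denote by $\mathcal{U}$ the sheaf of local (smooth) $U(1)$-valued functions on $M$ and by $\Lambda^q$ the sheaves of (smooth) $q$-forms on $M$. For the complex $\mathcal{D}(2)$,
\[
0 \longrightarrow \mathcal{U} \xrightarrow{\ \frac{1}{i} d\log\ } \Lambda^1 \xrightarrow{\ d\ } \Lambda^2, \tag{4}
\]
where $d$ is the exterior derivative, the groups $A^s$ of (3) are
\[
A^0 = \check{C}^0(\mathcal{U}) = \{(f_i)\}, \tag{5}
\]
\[
A^1 = \check{C}^0(\Lambda^1) \oplus \check{C}^1(\mathcal{U}) = \{(\Pi_i, \chi_{ij})\}, \tag{6}
\]
\[
A^2 = \check{C}^0(\Lambda^2) \oplus \check{C}^1(\Lambda^1) \oplus \check{C}^2(\mathcal{U}) = \{(B_i, A_{ij}, g_{ijk})\}, \tag{7}
\]
\[
A^3 = \check{C}^1(\Lambda^2) \oplus \check{C}^2(\Lambda^1) \oplus \check{C}^3(\mathcal{U}) = \{(F_{ij}, D_{ijk}, \sigma_{ijkl})\}, \tag{8}
\]
where $f_i$, $\chi_{ij}$, $g_{ijk}$, $\sigma_{ijkl}$ are $U(1)$-valued functions on $O_i$, $O_{ij}$, $O_{ijk}$ and $O_{ijkl}$, respectively, $\Pi_i$, $A_{ij}$ and $D_{ijk}$ are 1-forms on $O_i$, $O_{ij}$ and $O_{ijk}$, and $B_i$, $F_{ij}$ are 2-forms on $O_i$ and $O_{ij}$. The differentials $D_i$ combine the exterior derivative with the Čech coboundary:
\[
\begin{aligned}
D_0(f_i) &= \bigl(-i f_i^{-1} df_i,\ f_j^{-1} f_i\bigr), \\
D_1(\Pi_i, \chi_{ij}) &= \bigl(d\Pi_i,\ -i \chi_{ij}^{-1} d\chi_{ij} + \Pi_j - \Pi_i,\ \chi_{jk}^{-1} \chi_{ik} \chi_{ij}^{-1}\bigr), \\
D_2(B_i, A_{ij}, g_{ijk}) &= \bigl(dA_{ij} - B_j + B_i,\ -i g_{ijk}^{-1} dg_{ijk} + A_{jk} - A_{ik} + A_{ij},\ g_{jkl}^{-1} g_{ikl} g_{ijl}^{-1} g_{ijk}\bigr).
\end{aligned}
\]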
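The sign conventions in these differentials can be checked on the lowest composition. The following worked computation is a sketch that verifies $D_1 D_0 = 0$ component by component, using only the formulas for $D_0$ and $D_1$ given above; the shorthand $d\log f_i = f_i^{-1} df_i$ is introduced here for brevity.

```latex
\begin{align*}
D_0(f_i) &= (\Pi_i, \chi_{ij}), \qquad
  \Pi_i = -i\, f_i^{-1} df_i, \quad \chi_{ij} = f_j^{-1} f_i, \\
d\Pi_i &= -i\, d(d\log f_i) = 0, \\
-i\,\chi_{ij}^{-1} d\chi_{ij} + \Pi_j - \Pi_i
  &= -i\,(d\log f_i - d\log f_j) - i\, d\log f_j + i\, d\log f_i = 0, \\
\chi_{jk}^{-1}\,\chi_{ik}\,\chi_{ij}^{-1}
  &= (f_k^{-1} f_j)^{-1}\,(f_k^{-1} f_i)\,(f_j^{-1} f_i)^{-1} = 1 .
\end{align*}
```

The analogous check of $D_2 D_1 = 0$ reproduces the statement that data of the form $D_1 \beta$ automatically satisfy the descent equations with vanishing curvature.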

The hypercohomology of the complex $\mathcal{D}(2)$ of (4), i.e. the cohomology of the complex $K(\mathcal{D}(2))$, see (2), is
\[
\mathbb{H}^0(M, \mathcal{D}(2)) = \ker D_0 \cong H^0(M, U(1)), \qquad
\mathbb{H}^1(M, \mathcal{D}(2)) = \frac{\ker D_1}{\operatorname{im} D_0} \cong H^1(M, U(1)), \tag{9}
\]
and, in the second degree,
\[
\mathbb{H}^2(M, \mathcal{D}(2)) = \frac{\ker D_2}{\operatorname{im} D_1}.
\]
Here, $H^0(M, U(1))$ is the group of locally constant $U(1)$-valued functions on $M$, $H^1(M, U(1))$ is the one of the isomorphism classes of flat line bundles on $M$ and $\mathbb{H}^2(M, \mathcal{D}(2))$ is the third real Deligne cohomology group [6,11]. The local data of a gerbe $c = (B_i, A_{ij}, g_{ijk})$ satisfy the cocycle condition$^2$ $D_2 c = 0$ and equivalent local data differ by a coboundary $D_1 \beta$ with $\beta = (\Pi_i, \chi_{ij})$, so that the elements of the hypercohomology group $\mathbb{H}^2(M, \mathcal{D}(2))$ are in a one-to-one correspondence with stable isomorphism classes of gerbes.

2.2. Gerbes on orbifolds and group cohomology. Suppose now that a discrete group $\Gamma$ acts on $M$ preserving the closed 3-form $H$. Let us assume that the open covering $(O_i)$ is such that $\gamma(O_i) = O_{\gamma i}$ for an action $(\gamma, i) \mapsto \gamma i$ of $\Gamma$ on the index set. We shall call $\Gamma$ the orbifold group. In a natural way, we may lift its action to the Abelian groups $A^n$ of (5)–(8) by defining
\[
\gamma f_i = (\gamma^{-1})^* f_{\gamma^{-1} i}, \qquad \gamma \Pi_i = (\gamma^{-1})^* \Pi_{\gamma^{-1} i}, \tag{10}
\]
etc. This turns the complex $K(\mathcal{D}(2))$ of (2) induced from the sheaf complex (4) into a complex of $\Gamma$-modules.

$^2$ We use the additive notation for the Abelian groups $A^n$.

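Below, we shall employ the language of the group-$\Gamma$ cohomology, see e.g. [3] or Appendix A of [14], defining $p$-cochains on $\Gamma$ with values in a $\Gamma$-module $N$ as maps from $\Gamma^p$ to $N$, and the coboundary operator $\delta$ by
\[
(\delta n)_{\gamma, \gamma', \ldots, \gamma^{(p)}} = \gamma\, n_{\gamma', \ldots, \gamma^{(p)}} - n_{\gamma\gamma', \gamma'', \ldots, \gamma^{(p)}} + \cdots + (-1)^p\, n_{\gamma, \ldots, \gamma^{(p-1)}\gamma^{(p)}} + (-1)^{p+1}\, n_{\gamma, \gamma', \ldots, \gamma^{(p-1)}}.
\]
The Abelian groups $C^p(N)$ of $p$-cochains on $\Gamma$ form the complex $C(N)$,
\[
0 \longrightarrow C^0(N) \xrightarrow{\ \delta\ } C^1(N) \xrightarrow{\ \delta\ } C^2(N) \xrightarrow{\ \delta\ } \cdots.
\]
The cohomology groups $H^p(\Gamma, N)$ are composed of equivalence classes of $p$-cocycles on $\Gamma$ modulo $p$-coboundaries. Given a complex $K$ of $\Gamma$-modules
\[
0 \longrightarrow N^0 \xrightarrow{\ d_0\ } N^1 \xrightarrow{\ d_1\ } N^2 \xrightarrow{\ d_2\ } \cdots,
\]
we may consider again a double complex formed from the groups $C^p(N^q)$ and the induced diagonal complex. The cohomology groups of the latter define the hypercohomology groups $\mathbb{H}^s(\Gamma, K)$.

We shall be interested in gerbes on $M$ with $\Gamma$-equivariant structures ($\Gamma$-gerbes for short) that permit one to define the contribution of the torsion field $H$ to Feynman amplitudes of closed strings moving in the orbifold $M/\Gamma$. $\Gamma$-gerbes may be presented by their local data $(c, b_\gamma, a_{\gamma,\gamma'})$, where $c = (B_i, A_{ij}, g_{ijk}) \in A^2$, $b_\gamma = (\Pi^\gamma_i, \chi^\gamma_{ij}) \in A^1$ and $a_{\gamma,\gamma'} = (f^{\gamma,\gamma'}_i) \in A^0$ satisfy the relations
\[
D_2 c = 0, \qquad (\delta c)_\gamma \equiv \gamma c - c = D_1 b_\gamma, \tag{11}
\]
\[
(\delta b)_{\gamma,\gamma'} \equiv \gamma b_{\gamma'} - b_{\gamma\gamma'} + b_\gamma = -D_0 a_{\gamma,\gamma'}, \tag{12}
\]
\[
(\delta a)_{\gamma,\gamma',\gamma''} \equiv \gamma a_{\gamma',\gamma''} - a_{\gamma\gamma',\gamma''} + a_{\gamma,\gamma'\gamma''} - a_{\gamma,\gamma'} = 0. \tag{13}
\]
The $\Gamma$-gerbe local data $(c', b'_\gamma, a'_{\gamma,\gamma'})$ and $(c, b_\gamma, a_{\gamma,\gamma'})$ will be considered equivalent if there exist $\beta \in A^1$ and $\phi_\gamma \in A^0$ such that
\[
c' = c + D_1 \beta, \tag{14}
\]
\[
b'_\gamma = b_\gamma + \gamma\beta - \beta + D_0 \phi_\gamma \equiv b_\gamma + (\delta\beta)_\gamma + D_0 \phi_\gamma, \tag{15}
\]
\[
a'_{\gamma,\gamma'} = a_{\gamma,\gamma'} - \gamma\phi_{\gamma'} + \phi_{\gamma\gamma'} - \phi_\gamma \equiv a_{\gamma,\gamma'} - (\delta\phi)_{\gamma,\gamma'}. \tag{16}
\]
In particular, $c'$ and $c$ are equivalent local data for gerbes on $M$. $\Gamma$-gerbes with equivalent local data will be called stably isomorphic. Equivalence classes of local data $(c, b_\gamma, a_{\gamma,\gamma'})$ form the hypercohomology group $\mathbb{H}^2(\Gamma, K(\mathcal{D}(2)))$.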
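The group coboundary operator $\delta$ used above can be explored numerically in a small finite model. The sketch below is an illustration under explicit assumptions and is not taken from the paper: $\Gamma$ is taken to be $\mathbb{Z}_2$, the coefficient module $U(1)$ is replaced by its finite subgroup of $M$-th roots of unity, written additively as $\mathbb{Z}/M$, and the non-trivial element acts by inversion, which mimics the action $\lambda \mapsto \lambda^{-1}$ mentioned in the Introduction. All function names are hypothetical.

```python
from itertools import product

M = 4            # assumption: Z/M (additive) stands in for the M-th roots of unity in U(1)
GROUP = (0, 1)   # Gamma = Z_2, written additively


def mul(g, h):
    """Group law of Z_2."""
    return (g + h) % 2


def act(g, x):
    """Twisted module structure: the non-trivial element inverts the coefficient."""
    return (-x) % M if g == 1 else x % M


def coboundary(n, p):
    """Map a p-cochain n (dict from p-tuples of group elements to Z/M) to delta n."""
    out = {}
    for gs in product(GROUP, repeat=p + 1):
        total = act(gs[0], n[gs[1:]])                    # gamma . n_{gamma', ...}
        for j in range(p):                               # multiply neighbours j, j+1
            merged = gs[:j] + (mul(gs[j], gs[j + 1]),) + gs[j + 2:]
            total += (-1) ** (j + 1) * n[merged]
        total += (-1) ** (p + 1) * n[gs[:-1]]            # drop the last argument
        out[gs] = total % M
    return out


def all_cochains(p):
    """Enumerate every p-cochain with values in Z/M."""
    keys = list(product(GROUP, repeat=p))
    for values in product(range(M), repeat=len(keys)):
        yield dict(zip(keys, values))


def is_zero(c):
    """True if the cochain vanishes identically."""
    return all(v == 0 for v in c.values())


def cohomology_order(p):
    """Order of H^p = (number of p-cocycles) / (number of p-coboundaries), p >= 1."""
    cocycles = sum(1 for c in all_cochains(p) if is_zero(coboundary(c, p)))
    coboundaries = {tuple(sorted(coboundary(b, p - 1).items()))
                    for b in all_cochains(p - 1)}
    return cocycles // len(coboundaries)


if __name__ == "__main__":
    # delta composed with delta vanishes on every 1-cochain
    assert all(is_zero(coboundary(coboundary(b, 1), 2)) for b in all_cochains(1))
    print("order of H^2 in the finite model:", cohomology_order(2))
```

In this model the brute-force count gives a group of order 2 in degree two. The finite coefficient group is only a stand-in: cohomology with values in the full group $U(1)$ can differ from the finite model (for instance in degree one), so the sketch illustrates the coboundary formula rather than the classification itself.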

It is easy to see that, up to equivalence, the local data $(c, b_\gamma, a_{\gamma,\gamma'})$ of a flat $\Gamma$-gerbe are of the form
\[
c = (0, 0, u_{ijk}), \qquad b_\gamma = (0, v_{\gamma;ij}^{-1}), \qquad a_{\gamma,\gamma'} = (w_{\gamma,\gamma';i}), \tag{17}
\]
where $u_{ijk}, v_{\gamma;ij}, w_{\gamma,\gamma';i} \in U(1)$. This form is preserved by the transformations (14)–(16) with
\[
\beta = (0, v_{ij}^{-1}), \qquad \phi_\gamma = (w_{\gamma;i}) \tag{18}
\]
for $v_{ij}, w_{\gamma;i} \in U(1)$. The equivalence classes of local data for a flat $\Gamma$-gerbe form the hypercohomology group $\mathbb{H}^2(\Gamma, \check{C}(U(1)))$, where $\check{C}(U(1))$ is the Čech complex (1) for the sheaf of locally constant $U(1)$-valued functions on $M$, viewed as a complex of $\Gamma$-modules.

In general, there are obstructions to the existence of a $\Gamma$-equivariant structure on a gerbe with local data $c$. First, the existence of $b_\gamma \in A^1$ such that (11) holds requires that the equivalence class of the flat-gerbe local data $[\gamma c - c] \in H^2(M, U(1))$ be trivial (or, geometrically, that the pullback of the gerbe by $\gamma^{-1}$ stays in the same stable isomorphism class). This is automatically assured if $H^2(M, U(1)) = 0$. Suppose then that $\gamma c - c = D_1 b_\gamma$ for $b_\gamma \in A^1$. It follows that $D_1 (\delta b)_{\gamma,\gamma'} = 0$ so that $(\delta b)_{\gamma,\gamma'}$ defines a 2-cocycle $r_{\gamma,\gamma'}$ on $\Gamma$ with values in $H^1(M, U(1)) \equiv H^1$, see (9). Its cohomology class $[r_{\gamma,\gamma'}] \in H^2(\Gamma, H^1)$ is the next obstruction to the existence of a $\Gamma$-equivariant structure. If it is trivial, which holds automatically if $H^1 = 0$, then there exist $e_\gamma \in A^1$ with $D_1 e_\gamma = 0$ and $a_{\gamma,\gamma'} \in A^0$ such that $(\delta b)_{\gamma,\gamma'} = (\delta e)_{\gamma,\gamma'} - D_0 a_{\gamma,\gamma'}$. Note that $D_0 (\delta a)_{\gamma,\gamma',\gamma''} = 0$. Hence $u_{\gamma,\gamma',\gamma''} = (\delta a)_{\gamma,\gamma',\gamma''}$ is a 3-cocycle on $\Gamma$ with values in $\ker D_0 = H^0(M, U(1)) \equiv H^0$. Its cohomology class
\[
[u_{\gamma,\gamma',\gamma''}] \in H^3(\Gamma, H^0) \tag{19}
\]
is the last obstruction to the existence of a $\Gamma$-equivariant structure. If it is trivial, i.e. if $u_{\gamma,\gamma',\gamma''} = (\delta v)_{\gamma,\gamma',\gamma''}$ for some $v_{\gamma,\gamma'} \in H^0$, then taking $b_\gamma - e_\gamma$ as a new $b_\gamma$ and $a_{\gamma,\gamma'} - v_{\gamma,\gamma'}$ as a new $a_{\gamma,\gamma'}$, we obtain the relations (11)–(13). Note that in (19), the group $H^0$ of locally constant $U(1)$-valued functions $f$ should be viewed as a $\Gamma$-module with $\gamma f = (\gamma^{-1})^* f$. If $M$ is connected then $H^0 = U(1)$ with the trivial action of $\Gamma$.

An important question arises as to how many inequivalent $\Gamma$-equivariant structures exist on a gerbe on $M$ if all obstructions vanish. Two sets of local data for a $\Gamma$-gerbe with the same underlying gerbe local data $c$ differ by $(b_\gamma, a_{\gamma,\gamma'})$ such that
\[
D_1 b_\gamma = 0, \qquad (\delta b)_{\gamma,\gamma'} = -D_0 a_{\gamma,\gamma'}, \qquad (\delta a)_{\gamma,\gamma',\gamma''} = 0. \tag{20}
\]
The equivalence classes of $(b_\gamma, a_{\gamma,\gamma'})$ satisfying (20) modulo
\[
\bigl((\delta\beta)_\gamma + D_0 \phi_\gamma,\ -(\delta\phi)_{\gamma,\gamma'}\bigr)
\]

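with $D_1 \beta = 0$ label then inequivalent $\Gamma$-equivariant structures on the gerbe with local data $c$. Note that $b_\gamma$ and $a_{\gamma,\gamma'}$ above may be taken in the form (17) and $\beta$ and $\phi_\gamma$ in the form (18). The set of equivalence classes $[b_\gamma, a_{\gamma,\gamma'}]$ forms an Abelian group that we shall denote by $H_\Gamma$. It may be interpreted as the hypercohomology group $\mathbb{H}^2(\Gamma, \check{C}_1(U(1)))$, where $\check{C}_1(U(1))$ is the complex
\[
0 \longrightarrow \check{C}^0(U(1)) \xrightarrow{\ \check\delta\ } \check{Z}^1(U(1))
\]
of $\Gamma$-modules with $\check{Z}^1(U(1)) = \ker \check\delta|_{\check{C}^1(U(1))}$. There is a natural map from $H_\Gamma$ to $H^1(\Gamma, H^1)$ that assigns to $[b_\gamma, a_{\gamma,\gamma'}]$ the cohomology class $[b_\gamma]$ of the image of $b_\gamma$ in $H^1$.

If $H^1(\Gamma, H^1) = 0$, e.g. if $H^1 = 0$, then the group $H_\Gamma$ may be described in simpler terms. Indeed, under this assumption, $[b_\gamma] = 0$ and there exist $(\beta, \phi_\gamma)$ such that $D_1 \beta = 0$ and $b_\gamma = (\delta\beta)_\gamma + D_0 \phi_\gamma$. For $\alpha_{\gamma,\gamma'} = a_{\gamma,\gamma'} + (\delta\phi)_{\gamma,\gamma'}$, one has the relation $D_0 \alpha_{\gamma,\gamma'} = 0$. It follows that $\phi_\gamma$ may be modified so that $a_{\gamma,\gamma'} = -(\delta\phi)_{\gamma,\gamma'}$ if and only if the cohomology class $[\alpha_{\gamma,\gamma'}] \in H^2(\Gamma, H^0)$ is trivial. This results in the isomorphism
\[
H_\Gamma \ni [b_\gamma, a_{\gamma,\gamma'}] \longmapsto [\alpha_{\gamma,\gamma'}] \in H^2(\Gamma, H^0)
\]
of Abelian groups. We infer this way that if $(c, b_\gamma, a_{\gamma,\gamma'})$ are local data for a $\Gamma$-gerbe then, for 2-cocycles $v_{\gamma,\gamma'}$ on $\Gamma$ with values in $H^0$, $(c, b_\gamma, a_{\gamma,\gamma'} + v_{\gamma,\gamma'})$ are also local data for a $\Gamma$-gerbe and, up to equivalence, all $\Gamma$-gerbe local data with the same gerbe local data $c$ are obtained in such a way. The local data $(c, b_\gamma, a_{\gamma,\gamma'})$ and $(c, b_\gamma, a_{\gamma,\gamma'} + v_{\gamma,\gamma'})$ are equivalent if and only if $v_{\gamma,\gamma'} = (\delta w)_{\gamma,\gamma'}$ for $w_\gamma \in H^0$. Hence, elements of $H^2(\Gamma, H^0)$ label inequivalent $\Gamma$-structures on a gerbe on $M$ provided that $H^1(\Gamma, H^1) = 0$.

Suppose now that $\Gamma$ acts on $M$ without fixed points and that $M/\Gamma \equiv M'$ is a manifold. Under the assumption that the open covering $(O_i)$ of $M$ is such that $O_{i(\gamma i)} \neq \emptyset$ only if $\gamma = 1$, the sets $O'_{i'} = \pi(O_i)$, where $\pi: M \to M'$ is the canonical projection and $i'$ denotes the $\Gamma$-orbit of the index $i$, form a covering of $M'$ and
\[
O'_{i'j'} \equiv O'_{i'} \cap O'_{j'} = \bigsqcup_{\tilde{j} = \gamma j} \pi(O_{i\tilde{j}}), \qquad
O'_{i'j'k'} = \bigsqcup_{\substack{\tilde{j} = \gamma j \\ \tilde{k} = \gamma\gamma' k}} \pi(O_{i\tilde{j}\tilde{k}}),
\]
etc. In that situation, a $\Gamma$-gerbe with local data $(c, b_\gamma, a_{\gamma,\gamma'})$, where $c = (B_i, A_{ij}, g_{ijk})$, $b_\gamma = (\Pi^\gamma_i, \chi^\gamma_{ij})$ and $a_{\gamma,\gamma'} = (f^{\gamma,\gamma'}_i)$, induces in a canonical way a gerbe on $M'$ with local data $(B'_{i'}, A'_{i'j'}, g'_{i'j'k'})$ given by the relations [29]
\[
\begin{aligned}
\pi^* B'_{i'} &= B_i && \text{on } O_i, \\
\pi^* A'_{i'j'} &= A_{i\tilde{j}} + \Pi^\gamma_{\tilde{j}} && \text{on } O_{i\tilde{j}} \text{ for } \tilde{j} = \gamma j, \\
\pi^* g'_{i'j'k'} &= g_{i\tilde{j}\tilde{k}} \bigl(\chi^\gamma_{\tilde{j}\tilde{k}}\, f^{\gamma,\gamma'}_{\tilde{k}}\bigr)^{-1} && \text{on } O_{i\tilde{j}\tilde{k}} \text{ for } \tilde{j} = \gamma j,\ \tilde{k} = \gamma\gamma' k.
\end{aligned}
\]
Equivalent $\Gamma$-gerbe local data on $M$ are associated with equivalent gerbe local data on $M'$. Note that the latter correspond to the curvature 3-form $H'$ such that $\pi^* H' = H$. In the more general context where $\Gamma$ acts on $M$ with fixed points, we shall sometimes talk,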

by an abuse of language, of $\Gamma$-gerbes on $M$ as gerbes on the orbifold $M/\Gamma$. A more sophisticated approach to gerbes on orbifolds may be found in [21].

Let $\Sigma = \tilde\Sigma/\pi_1$ be an oriented closed connected surface with $\pi_1$ its fundamental group and $\tilde\Sigma$ its universal covering space. The maps $X: \tilde\Sigma \to M$ such that there exists a homomorphism $x: \pi_1 \to \Gamma$ for which $X(a\tilde\sigma) = x(a) X(\tilde\sigma)$ if $a \in \pi_1$ and $\tilde\sigma \in \tilde\Sigma$ describe world histories of the closed string moving in the orbifold $M/\Gamma$, with $X$ and $\gamma X$ for $\gamma \in \Gamma$ defining the same history. The pullback by $X$ of the local data for a $\Gamma$-gerbe on $M$ defines local data for a flat $\pi_1$-gerbe on $\tilde\Sigma$. Those, in turn, determine local data for a flat gerbe on $\Sigma$ by the construction described above and an element in $H^2(\Sigma, U(1)) = U(1)$ called the holonomy along $X$. The holonomies along $X$ and $\gamma X$ defined this way coincide. They represent the contribution of the Kalb-Ramond field to the Feynman amplitude of the string history in $M/\Gamma$ defined by $X$.

2.3. Gerbes on simple compact Lie groups. Gerbes on Lie groups have been studied in the context of the Wess-Zumino-Witten (WZW) models [35] of conformal field theory describing the motion of strings in group manifolds. Let $G$ be a connected and simply connected compact simple Lie group and let $H_k$ be the bi-invariant closed 3-form on $G$,
\[
H_k = \frac{k}{12\pi}\, \mathrm{tr}\, (g^{-1} dg)^3.
\]
Here, $\mathrm{tr}$ denotes the ad-invariant positive bilinear symmetric (Killing) form on the Lie algebra $\mathfrak{g}$, normalized so that the 3-form $\frac{1}{2\pi} H_k$ has integer periods if and only if $k$ (called the level) is an integer. For such $k$, there exists a gerbe on $G$ with curvature $H_k$ and it is unique up to stable isomorphisms since $H^2(G, U(1)) = 0$. We shall call it the level-$k$ gerbe on $G$. An explicit construction of such gerbes was given in [12] for $G = SU(2)$, in [7] for $G = SU(N)$ and in [23] for all simple simply connected compact Lie groups. In the last two cases, the construction used a more geometric description of gerbes along the lines of [24,25] rather than the one employing local data.

Let $Z(G)$ be the center of the simply connected group $G$ and let $\Gamma = Z \subset Z(G)$ be its subgroup. The case of non-simply connected quotients $G/Z \equiv G'$ was studied in [14] for $G = SU(N)$ and in [15] for other groups $G$. In those references, gerbes on groups $G'$ with curvature $H_k$ were explicitly constructed whenever possible. Equivalently, the construction provides $Z$-equivariant structures on the level-$k$ gerbe on $G$. Since the groups $H^2(G, U(1))$ and $H^1(G, U(1))$ are trivial and $H^0(G, U(1)) = U(1)$ with the trivial action of $Z$, the only obstruction to the existence of such $Z$-equivariant structures is the cohomology class $[u_{z,z',z''}] \in H^3(Z, U(1))$, see (19). The main part of the construction of [14,15] consisted in analyzing the cohomological equation
\[
u_{z,z',z''} = (\delta v)_{z,z',z''} \tag{21}
\]
and finding its solutions for all levels $k$ for which the obstruction cohomology class (19) is trivial. In agreement with the analysis of the last subsection, solutions $v_{z,z'}$ differing by non-cohomologous 2-cocycles gave rise to inequivalent $Z$-equivariant structures and hence to stably non-isomorphic gerbes on $G'$ with curvature $H_k$. The levels $k$ for which the obstruction class is trivial are the ones for which the 3-form $\frac{1}{2\pi} H_k$ on $G'$ has integer periods. They were identified for the first time in [8].
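To make the cocycle condition behind an equation of type (21) concrete, the following sketch checks it for a standard textbook 3-cocycle on a cyclic group $\mathbb{Z}_N$ with trivial action on $U(1)$, namely $u_{a,b,c} = \exp\bigl(2\pi i\, k\, a\, \beta(b,c)/N\bigr)$, where $\beta(b,c) \in \{0,1\}$ records whether $b + c$ wraps around $N$. This model cocycle is an assumption introduced for illustration; it is not claimed to coincide with the obstruction cocycle obtained in [15]. Phases are stored as exact fractions of a full turn, so the checks involve no rounding.

```python
from fractions import Fraction
from itertools import product


def carry(a, b, n):
    """Integer-valued 2-cocycle on Z_n: 1 if a + b wraps around n, else 0 (0 <= a, b < n)."""
    return (a + b) // n


def phase(a, b, c, n, k=1):
    """Phase of the model cochain u_{a,b,c}, as a fraction of a full turn modulo 1."""
    return Fraction(k * a * carry(b, c, n), n) % 1


def delta_phase(a, b, c, d, n, k=1):
    """Phase of (delta u)_{a,b,c,d} for the trivial action of Z_n on U(1)."""
    def u(x, y, z):
        return phase(x % n, y % n, z % n, n, k)
    total = u(b, c, d) - u(a + b, c, d) + u(a, b + c, d) - u(a, b, c + d) + u(a, b, c)
    return total % 1


def is_cocycle(n, k=1):
    """Check that (delta u) is trivial on all quadruples of elements of Z_n."""
    return all(delta_phase(a, b, c, d, n, k) == 0
               for a, b, c, d in product(range(n), repeat=4))


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, is_cocycle(n))
```

For $k$ a multiple of $N$ the model cochain is identically trivial, which is the simplest instance of an obstruction that disappears only at special values of an integer parameter. Whether a given non-trivial cocycle is a coboundary requires solving the analogue of (21), which this sketch does not attempt.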

Let us recall here the form of the obstruction 3-cocycle $u_{z,z',z''}$ obtained in [15]. The cocycle was related to the action of the center $Z(G)$ on the set of conjugacy classes of $G$. Each conjugacy class has a single representative of the form $e^{2\pi i \tau}$, where $\tau$ belongs to the positive Weyl alcove $\mathcal{A}$, a simplex in the Cartan algebra $\mathfrak{t} \subset \mathfrak{g}$ with the vertices
\[
\tau_0 = 0, \qquad \tau_i = \frac{1}{k_i} \lambda_i^\vee \quad \text{for } i = 1, \ldots, r,
\]
where $r = \dim \mathfrak{t}$ is the rank of $G$, $\lambda_i^\vee$ are the fundamental coweights in $\mathfrak{t}$ and $k_i$ are the corresponding Coxeter labels. The latter are defined by the relations
\[
\mathrm{tr}\, \lambda_i^\vee \alpha_j = \delta_{ij}, \qquad \phi = \sum_{i=1}^{r} k_i \alpha_i,
\]
where $\alpha_i$, $i = 1, \ldots, r$, are the simple roots and $\phi$ is the highest root of the Lie algebra $\mathfrak{g}$. Multiplication by an element $z \in Z(G)$ sends conjugacy classes into conjugacy classes and induces an affine map $\tau \mapsto z\tau$ on the positive Weyl alcove. More exactly,
\[
z\, e^{2\pi i \tau} = w_z^{-1}\, e^{2\pi i (z\tau)}\, w_z \tag{22}
\]
for some $w_z$ belonging to the normalizer $N(T)$ of the Cartan subgroup $T \subset G$. For the vertices of $\mathcal{A}$, we have
\[
z\tau_i = \tau_{zi} \quad \text{for } i = 0, \ldots, r.
\]
Upon identification of the set of indices $i = 0, 1, \ldots, r$ with the set of nodes of the extended Dynkin diagram of $\mathfrak{g}$, the action $i \mapsto zi$ induces a symmetry of the diagram. The group elements $w_z$ are determined up to left multiplication by elements of $T$ and, in general, cannot be chosen to depend multiplicatively on $z$, but $w_z w_{z'} w_{zz'}^{-1} \in T$. Let $b_{z,z'} \in \mathfrak{t}$ be such that
\[
w_z\, w_{z'}\, w_{zz'}^{-1} = e^{2\pi i\, b_{z,z'}}.
\]
For $z \in Z(G)$, the vertex $\tau_{z^{-1}0}$ of $\mathcal{A}$ is a fundamental coweight such that $z = e^{-2\pi i\, \tau_{z^{-1}0}}$. The formula
\[
u_{z,z',z''} = e^{-2\pi i k\, \mathrm{tr}\, \tau_{z^{-1}0}\, b_{z',z''}} \tag{23}
\]
defines a 3-cocycle on $Z(G)$ whose cohomology class does not depend on the choices made in the definition.$^3$ The restriction of $u_{z,z',z''}$ to $z, z', z'' \in Z \subset Z(G)$ gives the 3-cocycle whose cohomology class in $H^3(Z, U(1))$ is the obstruction (19) to the existence of a $Z$-equivariant structure on the level-$k$ gerbe on $G$. The cohomological equation (21) was discussed case by case in [15].

$^3$ The 3-cocycle analyzed in [15] differed by a coboundary from the one of (23).
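Relation (22) can be checked by hand in the simplest case. The sketch below assumes $G = SU(2)$ with center $\{\pm 1\}$, parametrizes the alcove by $\tau = \mathrm{diag}(t, -t)$ with $0 \le t \le 1/2$, and uses one particular representative $w_z$ of the non-trivial Weyl group element; these choices, and the helper names, are illustrative assumptions rather than conventions taken from the paper. For $z = -1$ the induced affine map is $t \mapsto 1/2 - t$, which exchanges the two vertices of the alcove.

```python
import cmath


def diag_exp(t):
    """exp(2*pi*i*tau) for tau = diag(t, -t) in the Cartan algebra of su(2)."""
    return [[cmath.exp(2j * cmath.pi * t), 0], [0, cmath.exp(-2j * cmath.pi * t)]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


W = [[0, 1], [-1, 0]]        # assumed representative w_z of the non-trivial Weyl element
W_INV = [[0, -1], [1, 0]]    # its inverse


def center_action(t):
    """Affine map tau -> z tau on the alcove [0, 1/2] for z = -1."""
    return 0.5 - t


def lhs(t):
    """z * exp(2 pi i tau) with z = -1."""
    return [[-x for x in row] for row in diag_exp(t)]


def rhs(t):
    """w_z^{-1} exp(2 pi i (z tau)) w_z."""
    return matmul(matmul(W_INV, diag_exp(center_action(t))), W)


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(all(close(lhs(t), rhs(t)) for t in (0.0, 0.1, 0.25, 0.5)))
```

In the same parametrization, $\tau_{z^{-1}0}$ corresponds to $t = 1/2$, and `diag_exp(-0.5)` reproduces $z = -1$ up to rounding, in line with $z = e^{-2\pi i \tau_{z^{-1}0}}$.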

2.4. Gerbes on orientifolds. A simple generalization of the notion of a $\Gamma$-gerbe developed in Sect. 2.2 is to admit a more general action of the discrete group $\Gamma$ on $M$ such that $\gamma^* H = \epsilon(\gamma) H$ for a homomorphism $\epsilon: \Gamma \to \{\pm 1\}$. The only modification required is in the definition (10) of the action of $\Gamma$ on the groups $A^n$ of (5)–(8) that should read
\[
\gamma f_i = \bigl((\gamma^{-1})^* f_{\gamma^{-1} i}\bigr)^{\epsilon(\gamma)}, \qquad
\gamma \Pi_i = \epsilon(\gamma)\, (\gamma^{-1})^* \Pi_{\gamma^{-1} i},
\]
etc. The change assures, for example, that if $c \in A^2$ with $D_2 c = 0$ gives local data for a gerbe with curvature $H$ then so does $\gamma c$. Let $\Gamma_0 \subset \Gamma$ denote the kernel of $\epsilon$ so that one has the exact sequence of groups
\[
1 \longrightarrow \Gamma_0 \longrightarrow \Gamma \xrightarrow{\ \epsilon\ } \mathbb{Z}_2 \longrightarrow 1.
\]
We shall call $\Gamma$ an orientifold group if $\Gamma_0 \neq \Gamma$. The whole discussion of Sect. 2.2, except for the two end paragraphs about gerbes on non-singular quotients and about the holonomy, extends to the case of orientifold group actions, generalizing the notions of $\Gamma$-equivariant structures and of $\Gamma$-gerbes to that case. We shall loosely talk of $\Gamma$-gerbes for $\Gamma$ an orientifold group as gerbes on the orientifold $M/\Gamma$. As before, if $H^1(M, U(1)) = 0 = H^2(M, U(1))$ then the only obstruction to the existence of a $\Gamma$-equivariant structure on the gerbe with local data $c$ is the class (19), where the group $H^0 \equiv H^0(M, U(1))$ of locally constant $U(1)$-valued functions is now viewed as a $\Gamma$-module with $\gamma f = \bigl((\gamma^{-1})^* f\bigr)^{\epsilon(\gamma)}$. For $M$ connected, $H^0 = U(1)$ with the action $\gamma\lambda = \lambda^{\epsilon(\gamma)}$. If the obstruction (19) is trivial then inequivalent $\Gamma$-equivariant structures are labeled by elements of $H^2(\Gamma, H^0_\epsilon)$, where the subscript indicates that $H^0$ is taken with the $\Gamma$-module structure just described.

The simplest example is that of the inversion group $\Gamma = \{\pm 1\} \cong \mathbb{Z}_2$ with $\epsilon(\pm 1) = \pm 1$. $\Gamma$-equivariant structures on a gerbe on $M$ for such $\Gamma$ were introduced (in an equivalent formulation) in [31] under the name of Jandl structures. A particular case is when $M$ is the orientation double $\hat\Sigma$ of an unoriented closed connected surface $\Sigma = \hat\Sigma/\mathbb{Z}_2$. $\hat\Sigma$ is an oriented closed surface, connected if $\Sigma$ is non-orientable and with two components otherwise. The group of stable isomorphism classes of $\mathbb{Z}_2$-gerbes on $\hat\Sigma$ is $\mathbb{H}^2(\mathbb{Z}_2, \check{C}(U(1)))$, as described in Sect. 2.2. In [31], a natural group homomorphism $\iota$ was constructed that renders the diagram
\[
\begin{array}{ccc}
\mathbb{H}^2(\mathbb{Z}_2, \check{C}(U(1))) & \xrightarrow{\ \ \iota\ \ } & U(1) \\
\downarrow & & \downarrow {\scriptstyle \mathrm{sq}} \\
H^2(\hat\Sigma, U(1)) & \longrightarrow & U(1)
\end{array} \tag{24}
\]
commutative. In the diagram, the left down-arrow is induced by forgetting the $\mathbb{Z}_2$-equivariant structure and the right down-arrow by $\mathrm{sq}(\lambda) = \lambda^2$. If $\hat\Sigma$ has one component then the lower horizontal arrow is the isomorphism $H^2(\hat\Sigma, U(1)) \cong U(1)$ described before. If $\hat\Sigma$ has two components then it is the composition of the isomorphism $H^2(\hat\Sigma, U(1)) \cong U(1) \times U(1)$ with the multiplication map $m: U(1) \times U(1) \to U(1)$. In the latter case, the construction of $\iota$ is straightforward because the image of the forgetting map lies in the diagonal subgroup of $U(1) \times U(1)$. Let $X: \hat\Sigma \to M$ be such that
\[
X(-1 \cdot \hat\sigma) = -1 \cdot X(\hat\sigma)
\]
for $\hat\sigma \in \hat\Sigma$. Such maps $X$, invariant under the combined worldsheet orientation reversal and a target map that changes the sign of the torsion field, describe world histories


of the closed unoriented string moving in the orientifold $M/\mathbb{Z}_2$, with $X$ and $-1 \cdot X$ corresponding to the same history. The pullback by $X$ of a $\mathbb{Z}_2$-gerbe on $M$ defines a $\mathbb{Z}_2$-gerbe on $\hat\Sigma$. The number in $U(1)$ associated to the stable isomorphism class of the latter by the homomorphism $\iota$ is called the holonomy of the $\mathbb{Z}_2$-gerbe on $M$ along $X$. The holonomies of $X$ and of $-1 \cdot X$ are inverses of each other. Halved, their sum represents the contribution of the Kalb-Ramond field to the Feynman amplitude of the world history in the orientifold $M/\mathbb{Z}_2$ defined by $X$.

For more general orientifold groups $\Gamma$, the restriction of the $\Gamma$-equivariant structure to the $\Gamma_0$-equivariant one may be used to define a quotient gerbe on $M' = M/\Gamma_0$ if $\Gamma_0$ acts without fixed points. The full $\Gamma$-equivariant structure induces then a Jandl structure on the quotient gerbe, see [16]. We could, more correctly, call a $\Gamma$-equivariant structure a $\Gamma_0$-equivariant Jandl structure, but we shall stick to the former name in what follows.

The construction of the holonomy of gerbes with Jandl structures described above may be extended to the equivariant case. Let $\Sigma$ be an unoriented closed connected surface, $\hat\Sigma$ its orientation double, $\hat\pi_1$ the fundamental group of $\hat\Sigma$, and $\tilde\Sigma$ the universal covering of $\hat\Sigma$. There exists a natural group $\tilde\pi$ entering the exact sequence of groups

$$1 \longrightarrow \hat\pi_1 \longrightarrow \tilde\pi \xrightarrow{\ \tilde\epsilon\ } \mathbb{Z}_2 \longrightarrow 1$$

and acting on $\tilde\Sigma$ without fixed points in a way that extends the action of $\hat\pi_1$, projects to the natural action of $\mathbb{Z}_2$ on $\hat\Sigma$ and to the identity on $\Sigma$ (if $\hat\Sigma$ has two components, then $\tilde\pi = \hat\pi_1 \times \mathbb{Z}_2$). Suppose that $X : \tilde\Sigma \to M$ is such that, for a homomorphism $x : \tilde\pi \to \Gamma$ with $\epsilon(x(\tilde a)) = \tilde\epsilon(\tilde a)$, one has $X(\tilde a \tilde\sigma) = x(\tilde a) X(\tilde\sigma)$ for $\tilde a \in \tilde\pi$ and $\tilde\sigma \in \tilde\Sigma$. Such maps $X$, covariant with respect to the action of the orientifold group, describe world histories of the closed unoriented string moving in the orientifold $M/\Gamma$, with $X$ and $\gamma X$ for $\gamma \in \Gamma$ corresponding to the same history. The pullback of a $\Gamma$-gerbe on $M$ by $X$ defines a flat $\tilde\pi$-gerbe on $\tilde\Sigma$. Using the restriction of the $\tilde\pi$-equivariant structure to $\hat\pi_1 \subset \tilde\pi$, one may then obtain a flat gerbe on $\hat\Sigma$. The $\tilde\pi$-equivariant structure descends to a Jandl structure on it. Applying the homomorphism $\iota$ of (24) to the $\mathbb{Z}_2$-gerbe on $\hat\Sigma$ obtained this way, one finds then the holonomy of the $\Gamma$-gerbe along $X$. The holonomies along $\gamma X$ defined this way are equal to the one along $X$ or to its inverse depending on whether $\epsilon(\gamma) = 1$ or $\epsilon(\gamma) = -1$. Halved, the sum of the holonomy along $X$ and of its inverse gives the contribution of the Kalb-Ramond field to the Feynman amplitude of the closed unoriented string history in the orientifold $M/\Gamma$ defined by $X$.

2.5. Orientifolds of simple compact Lie groups. We may consider the inversion group $\Gamma = \{\pm 1\} \cong \mathbb{Z}_2$ with $\epsilon(\pm 1) = \pm 1$ and $-1 \equiv z_0$ acting on a connected, simply connected simple compact Lie group $G$ by

$$G \ni g \longmapsto z_0 g = (\zeta g)^{-1} \in G \tag{25}$$

for $\zeta \in Z(G)$ that we shall call the twist element. The action of $z_0$ changes, indeed, the sign of $H_k$ so that the relation $\gamma^* H_k = \epsilon(\gamma) H_k$ holds. More generally, we shall consider orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes Z$ for $Z \subset Z(G)$, with the multiplication table

$$(1, z)\cdot(1, z') = (1, zz'), \qquad (1, z)\cdot(-1, z') = (-1, z^{-1}z'),$$
$$(-1, z)\cdot(1, z') = (-1, zz'), \qquad (-1, z)\cdot(-1, z') = (1, z^{-1}z')$$


and $\epsilon(\pm 1, z) = \pm 1$ so that $\Gamma_0 \cong Z$. Note that $\Gamma$ is a non-Abelian group if $Z$ is non-trivial and different from $\mathbb{Z}_2$ or from a direct product of $\mathbb{Z}_2$ factors. To simplify the notation, we shall write $(1, z) \equiv z$ and $(-1, z) \equiv z_0 z$. For the action of $\Gamma$ on $G$ we shall take the one that combines (25) with the action of $Z$ by multiplication so that $z_0 z g = (\zeta z g)^{-1}$. Note that if $h_{\zeta'}$, for $\zeta' \in Z$, denotes the automorphism of $\Gamma = \mathbb{Z}_2 \ltimes Z$ defined by the relations

$$h_{\zeta'}(z) = z, \qquad h_{\zeta'}(z_0 z) = z_0 \zeta' z, \tag{26}$$

then the composition of the action of $\Gamma$ on $G$ with $h_{\zeta'}$ induces the change $\zeta \to \zeta\zeta'$ of the twist element. Hence, twist elements in the same coset of $Z(G)/Z$ give rise to equivalent orientifold group actions. This is in agreement with the observation [16] that the restriction of a $\Gamma$-equivariant structure on a gerbe on $G$ to the $Z$-equivariant structure induces a gerbe on the non-simply connected group $G' = G/Z$ and the full $\Gamma$-equivariant structure gives rise to a Jandl structure on that gerbe. Indeed, actions of $\Gamma$ on $G$ corresponding to twist elements in the same coset of $Z(G)/Z$ induce the same action of $\mathbb{Z}_2$ on $G'$.

As discussed in the previous subsection, the sole obstruction to the existence of a $\Gamma$-equivariant structure on the level-$k$ gerbe on $G$ is given by the cohomology class $[u_{\gamma,\gamma',\gamma''}] \in H^3(\Gamma, U(1)_\epsilon)$, where the subscript indicates that $U(1)$ is taken with the action $\gamma\lambda = \lambda^{\epsilon(\gamma)}$ of $\Gamma$. The 3-cocycle $u_{\gamma,\gamma',\gamma''}$ may be found by a straightforward generalization of the work done in [15] where the case of orbifold groups $Z$ was treated, see Sect. 2.3 above. We shall only describe the result here, postponing a more detailed exposition to [16].

First, let us observe that the inversion map $g \overset{\kappa}{\longmapsto} g^{-1}$ sends conjugacy classes to conjugacy classes. Upon identification of the set of conjugacy classes in $G$ with the positive Weyl alcove $A \subset t$, see Sect. 2.3, it induces an affine map $\tau \mapsto \kappa\tau$ on $A$ such that $\kappa\tau_i = \tau_{\kappa i}$ for $\kappa 0 = 0$ and $i \mapsto \kappa i$ for $i = 1, \ldots, r$, giving rise to a symmetry of the (unextended) Dynkin diagram of $g$. More precisely,

$$e^{-2\pi i \tau} = w_\kappa^{-1}\, e^{2\pi i (\kappa\tau)}\, w_\kappa \tag{27}$$

for some $w_\kappa$ belonging to the normalizer $N(T)$ of the Cartan subgroup $T \subset G$. The element $w_\kappa$ is determined up to left multiplication by elements of $T$. Combining the relations (27) and (22), we infer that for any $\gamma \in \Gamma = \mathbb{Z}_2 \ltimes Z$ there exist an affine map $\tau \mapsto \gamma\tau$ on the Weyl alcove $A$ and $w_\gamma \in N(T)$ such that

$$\gamma\, e^{2\pi i \tau} = w_\gamma^{-1}\, e^{2\pi i (\gamma\tau)}\, w_\gamma. \tag{28}$$

One has $\gamma\tau = z\tau$ for $\gamma = z$ and $\gamma\tau = \kappa\zeta z\tau$ for $\gamma = z_0 z$, and one may take

$$w_\gamma = w_z \ \text{ for } \gamma = z, \qquad w_\gamma = w_\kappa w_\zeta w_z \ \text{ for } \gamma = z_0 z. \tag{29}$$

The action of $\Gamma$ on the vertices of $A$,

$$\gamma\tau_i = \tau_{\gamma i} \quad \text{for } i = 0, \ldots, r, \tag{30}$$


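induces a symmetry $i \mapsto \gamma i$ of the extended Dynkin diagram of $g$. This symmetry preserves the Coxeter labels: $k_{\gamma i} = k_i$ if one sets $k_0 = 1$. From the relations (28) and (30), one obtains the formula $\gamma\tau = \epsilon(\gamma)\, w_\gamma \tau w_\gamma^{-1} + \tau_{\gamma 0}$ for the action of $\gamma$ on $A$. As before, it is easy to see that $w_\gamma w_{\gamma'} w_{\gamma\gamma'}^{-1} \in T$ so that one may choose $b_{\gamma,\gamma'} \in t$ such that

$$w_\gamma\, w_{\gamma'}\, w_{\gamma\gamma'}^{-1} = e^{2\pi i\, b_{\gamma,\gamma'}}. \tag{31}$$

The 3-cocycle on $\Gamma$ whose cohomology class defines the obstruction to the existence of a $\Gamma$-equivariant structure on the level-$k$ gerbe on $G$ takes the form

$$u_{\gamma,\gamma',\gamma''} = e^{-2\pi i k\, \epsilon(\gamma)\, \mathrm{tr}\, \tau_{\gamma^{-1} 0}\, b_{\gamma',\gamma''}}. \tag{32}$$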
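The group-theoretic setting of the obstruction cocycle can be made concrete with a small sketch. It assumes $Z = \mathbb{Z}_m$ cyclic, encodes $z^a$ as the pair `(0, a)` and $z_0 z^a$ as `(1, a)` (an illustrative encoding, not notation from the text), reads the group law off the multiplication table of $\Gamma = \mathbb{Z}_2 \ltimes Z$, and implements the coboundary on $U(1)$-valued cochains twisted by the action $\gamma\lambda = \lambda^{\epsilon(\gamma)}$. All function names are hypothetical.

```python
import cmath
import random

def elements(m):
    """Elements of Gamma = Z_2 |x Z_m: (0, a) stands for z^a, (1, a) for z_0 z^a."""
    return [(s, a) for s in (0, 1) for a in range(m)]

def mul(g, h, m):
    """Group law: the Z_m part of the first factor is inverted
    exactly when the second factor reverses orientation."""
    (s, a), (t, b) = g, h
    return ((s + t) % 2, ((-a if t else a) + b) % m)

def eps(g):
    """The homomorphism epsilon: Gamma -> {+1, -1}."""
    return -1 if g[0] else 1

def delta2(v, m):
    """Twisted coboundary of a 2-cochain v (dict keyed by pairs of elements)."""
    G = elements(m)
    return {(g, h, l): v[h, l] ** eps(g) / v[mul(g, h, m), l]
                       * v[g, mul(h, l, m)] / v[g, h]
            for g in G for h in G for l in G}

def delta3(u, m):
    """Twisted coboundary of a 3-cochain u (dict keyed by triples of elements)."""
    G = elements(m)
    return {(g, h, l, p): u[h, l, p] ** eps(g) / u[mul(g, h, m), l, p]
                          * u[g, mul(h, l, m), p] / u[g, h, mul(l, p, m)] * u[g, h, l]
            for g in G for h in G for l in G for p in G}

def random_cochain(m, seed=0):
    """A random U(1)-valued 2-cochain, used only to exercise the operators."""
    rng = random.Random(seed)
    G = elements(m)
    return {(g, h): cmath.exp(2j * cmath.pi * rng.random()) for g in G for h in G}
```

Applying `delta3` to `delta2(random_cochain(m), m)` should return values numerically equal to 1, which is the statement that every coboundary is a cocycle in the twisted complex; the same `delta3` can be used to test a candidate obstruction cocycle.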

The cocycle condition means that

$$(\delta u)_{\gamma,\gamma',\gamma'',\gamma'''} \equiv u^{\epsilon(\gamma)}_{\gamma',\gamma'',\gamma'''}\; u^{-1}_{\gamma\gamma',\gamma'',\gamma'''}\; u_{\gamma,\gamma'\gamma'',\gamma'''}\; u^{-1}_{\gamma,\gamma',\gamma''\gamma'''}\; u_{\gamma,\gamma',\gamma''} = 1.$$

It may be verified by direct calculation. The cohomology class of $u_{\gamma,\gamma',\gamma''}$ is independent of the choices made in its definition. Note that the 3-cocycle (32) on $\Gamma$ restricts to the 3-cocycle (23) on $Z = \Gamma_0$. Let us finally remark that, since the orientifold action (25) of $\Gamma = \mathbb{Z}_2 \ltimes Z$ with the twist element $\zeta\zeta'$ for $\zeta' \in Z$ may be obtained from that with the twist element $\zeta$ by composing with the automorphism (26) of $\Gamma$, the cocycle $u_{\gamma,\gamma',\gamma''}$ for the new action defines the same cohomology class in $H^3(\Gamma, U(1)_\epsilon)$ as the one for the original action composed with the automorphism $h_{\zeta'}$. The composition with an automorphism of $\Gamma$ that leaves the homomorphism $\epsilon$ invariant commutes with the coboundary $\delta$ and induces an automorphism of the cohomology groups $H^n(\Gamma, U(1)_\epsilon)$.

3. Lyndon-Hochschild-Serre Spectral Sequence

Recall that the cohomology class $[u_{\gamma,\gamma',\gamma''}] \in H^3(\Gamma, U(1)_\epsilon)$ is the obstruction to the existence of a $\Gamma$-equivariant structure on the level-$k$ gerbe on the simply connected group $G$. The purpose of the present paper is to discuss in detail the cohomological equation

$$u_{\gamma,\gamma',\gamma''} = v^{\epsilon(\gamma)}_{\gamma',\gamma''}\; v^{-1}_{\gamma\gamma',\gamma''}\; v_{\gamma,\gamma'\gamma''}\; v^{-1}_{\gamma,\gamma'} \equiv (\delta v)_{\gamma,\gamma',\gamma''} \tag{33}$$

which is solvable if and only if the cohomology class $[u_{\gamma,\gamma',\gamma''}]$ is trivial. Knowledge of the general structure of the cohomology group $H^3(\Gamma, U(1)_\epsilon)$ will be useful in checking the latter condition. In what follows, we shall call $u_{\gamma,\gamma',\gamma''}$ the obstruction cocycle and $v_{\gamma,\gamma'}$ a trivializing cochain. As will be shown in [16], trivializing cochains enter directly the construction of a $\Gamma$-equivariant structure on the level-$k$ gerbe on $G$, similarly as in


the case of orbifold groups that was discussed in [15]. The classification of inequivalent $\Gamma$-gerbes may, likewise, be formulated in the cohomological language, with inequivalent $\Gamma$-gerbes corresponding to trivializing cochains differing by non-cohomologous 2-cocycles $v_{\gamma,\gamma'}$, defining cohomology classes $[v_{\gamma,\gamma'}] \in H^2(\Gamma, U(1)_\epsilon)$. This way, $H^2(\Gamma, U(1)_\epsilon)$ plays the role of the classifying group for inequivalent $\Gamma$-gerbes on $G$. Its structure will provide valuable insights into certain algebraic properties and the origin of trivializing cochains prior to entering the straightforward yet tedious computations of Sect. 4. It should be stressed at this point that while obstructions to the existence of orientifold gerbes do not, in general, exhaust the obstruction cohomology group $H^3(\Gamma, U(1)_\epsilon)$, it is the entire classifying group $H^2(\Gamma, U(1)_\epsilon)$ that captures inequivalent orientifold gerbe structures.

In consequence of the semi-direct product nature of the orientifold group $\Gamma = \mathbb{Z}_2 \ltimes Z$, the main tool which will be used in exploring the $U(1)_\epsilon$-valued cohomology of $\Gamma$ is the Lyndon-Hochschild-Serre (LHS) spectral sequence [34]

$$E_r^{p,q} \Longrightarrow H^{p+q}(\Gamma, U(1)_\epsilon) \tag{34}$$

associated with the short exact sequence of groups

$$1 \longrightarrow Z \longrightarrow \Gamma \longrightarrow \mathbb{Z}_2 \longrightarrow 1.$$

Recall [22] that the $r$-th page of a spectral sequence with $r \geq 0$ is a collection of Abelian groups $E_r^{p,q}$ vanishing for negative $p$ or $q$, together with the coboundary homomorphisms $d_r^{p,q} : E_r^{p,q} \to E_r^{p+r,q-r+1}$ such that $d_r^{p+r,q-r+1}\, d_r^{p,q} = 0$. The groups of the next page are defined by setting $E_{r+1}^{p,q} = \ker d_r^{p,q} / \mathrm{im}\, d_r^{p-r,q+r-1}$. The second page of the LHS spectral sequence is composed of the groups

$$E_2^{p,q} = H^p(\mathbb{Z}_2, H^q(Z, U(1))_\epsilon),$$

with the action of $\mathbb{Z}_2$ on $H^q(Z, U(1))$ induced by the one on the $q$-cochains on $Z$,

$$(-1 \cdot c)_{z, z', \ldots, z^{(q)}} = c^{-1}_{z^{-1},\, z'^{-1},\, \ldots,\, z^{(q)\,-1}}. \tag{35}$$

The relation (34) of the LHS sequence to the cohomology groups $H^n(\Gamma, U(1)_\epsilon)$ is established with the help of a filtration

$$0 = H^n_{n+1} \subset H^n_n \subset \cdots \subset H^n_1 \subset H^n_0 = H^n(\Gamma, U(1)_\epsilon)$$

such that

$$H^n_p / H^n_{p+1} \cong E_\infty^{p,n-p},$$

where $E_\infty^{p,q}$ denotes the group at which $E_r^{p,q}$ stabilize for $(p, q)$ fixed and $r$ sufficiently large.
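Entries of the second page are cohomology groups of $\mathbb{Z}_2$ with coefficients in a module. For a finite cyclic coefficient group their orders can be computed from the standard periodic description of cyclic-group cohomology: degree 0 gives the invariants, odd degrees give $\ker(1+t)/\mathrm{im}(1-t)$, and positive even degrees give the invariants modulo $\mathrm{im}(1+t)$, with $t$ the action of the generator. The following minimal sketch assumes the coefficient group is $\mathbb{Z}_m$ written additively, with $t$ acting trivially or by inversion; the function name and arguments are illustrative.

```python
def z2_cohomology_orders(m, inversion=False, pmax=4):
    """Orders of H^p(Z_2, Z_m) for p = 0..pmax, with the generator t of Z_2
    acting on Z_m either trivially or by x -> -x."""
    t = (lambda x: (-x) % m) if inversion else (lambda x: x % m)
    A = range(m)
    invariants = {x for x in A if t(x) == x}
    norm_image = {(x + t(x)) % m for x in A}           # image of 1 + t
    norm_kernel = {x for x in A if (x + t(x)) % m == 0}  # kernel of 1 + t
    diff_image = {(x - t(x)) % m for x in A}           # image of 1 - t
    orders = []
    for p in range(pmax + 1):
        if p == 0:
            orders.append(len(invariants))
        elif p % 2 == 1:
            orders.append(len(norm_kernel) // len(diff_image))
        else:
            orders.append(len(invariants) // len(norm_image))
    return orders
```

For the trivial action, `z2_cohomology_orders(3)` gives `[3, 1, 1, 1, 1]` and `z2_cohomology_orders(4)` gives `[4, 2, 2, 2, 2]`, i.e. only the $p = 0$ entry survives for $m$ odd while a $\mathbb{Z}_2$ appears in every positive degree for $m$ even.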


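Already the second page of the LHS spectral sequence provides a great deal of information on the possible structure of the cohomology groups $H^n(\Gamma, U(1)_\epsilon)$, at least for $Z$ cyclic to which case we shall specialize first, taking $Z = \mathbb{Z}_m$ with $m > 0$. The cyclic group cohomology is well known, see [3],

$$H^q(\mathbb{Z}_m, U(1)) = \begin{cases} U(1) & \text{if } q = 0, \\ 0 & \text{if } q > 0 \text{ is even}, \\ \mathbb{Z}_m & \text{if } q \text{ is odd}, \end{cases}$$

for the trivial action of the orbifold group $\mathbb{Z}_m$ on $U(1)$. The action of the generator $-1$ of the orientifold group $\mathbb{Z}_2$ on $H^q(\mathbb{Z}_m, U(1))$ induced by (35) reduces to the inversion for $q$ even and to the trivial action for $q$ odd. Furthermore, one has

$$H^p(\mathbb{Z}_2, U(1)_\epsilon) = \begin{cases} \mathbb{Z}_2 & \text{if } p \text{ is even}, \\ 0 & \text{if } p \text{ is odd}, \end{cases} \tag{36}$$

for the action of $-1$ on $U(1)$ by inversion, and

$$H^p(\mathbb{Z}_2, \mathbb{Z}_m) = \begin{cases} \mathbb{Z}_m & \text{if } p = 0, \\ \mathbb{Z}_2 & \text{if } p > 0 \text{ and } m \text{ is even}, \\ 0 & \text{if } p > 0 \text{ and } m \text{ is odd}, \end{cases}$$

for the trivial action of $\mathbb{Z}_2$ on $\mathbb{Z}_m$. This gives, for the second page of the spectral sequence,

$$\begin{array}{c|cccccc}
q & & & & & & \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \\
3 & \mathbb{Z}_m & 0 & 0 & 0 & 0 & \cdots \\
2 & 0 & 0 & 0 & 0 & 0 & \cdots \\
1 & \mathbb{Z}_m & 0 & 0 & 0 & 0 & \cdots \\
0 & E_2^{0,0} = \mathbb{Z}_2 & 0 & \mathbb{Z}_2 & 0 & \mathbb{Z}_2 & \cdots \\ \hline
E_2^{p,q} & p = 0 & 1 & 2 & 3 & 4 &
\end{array}$$

(with the coboundary homomorphisms $d_2^{0,1} : E_2^{0,1} \to E_2^{2,0}$ and $d_4^{0,3} : E_4^{0,3} \to E_4^{4,0}$ marked by arrows in the original diagram)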


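for $m$ odd, and

$$\begin{array}{c|cccccc}
q & & & & & & \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \\
3 & \mathbb{Z}_m & \mathbb{Z}_2 & \mathbb{Z}_2 & \mathbb{Z}_2 & \mathbb{Z}_2 & \cdots \\
2 & 0 & 0 & 0 & 0 & 0 & \cdots \\
1 & \mathbb{Z}_m & \mathbb{Z}_2 & \mathbb{Z}_2 & \mathbb{Z}_2 & \mathbb{Z}_2 & \cdots \\
0 & E_2^{0,0} = \mathbb{Z}_2 & 0 & \mathbb{Z}_2 & 0 & \mathbb{Z}_2 & \cdots \\ \hline
E_2^{p,q} & p = 0 & 1 & 2 & 3 & 4 &
\end{array}$$

(with the arrows of the original diagram marking the homomorphisms $d_2^{p,1} : E_2^{p,1} \to E_2^{p+2,0}$, $d_3^{0,3} : E_3^{0,3} \to E_3^{3,1}$ and $d_4^{0,3} : E_4^{0,3} \to E_4^{4,0}$) for $m$ even. The images of the coboundary homomorphisms $d_r^{p,q}$ for the second page (the continuous lines) and of the higher ones (the dotted and dashed lines), together with the definition of the groups entering subsequent pages, lead us to the conclusion that the LHS spectral sequence stabilizes quickly for the cohomology groups of interest: the classifying group $H^2(\Gamma, U(1)_\epsilon)$, and the obstruction group $H^3(\Gamma, U(1)_\epsilon)$. For $m$ odd, taking into account that there are no non-trivial homomorphisms from $\mathbb{Z}_m$ to $\mathbb{Z}_2$, we conclude that the sequence collapses to the second page giving

$$H^n(\mathbb{Z}_2 \ltimes \mathbb{Z}_m, U(1)_\epsilon) = \begin{cases} \mathbb{Z}_2 & \text{if } n \text{ is even}, \\ \mathbb{Z}_m & \text{if } n \text{ is odd}. \end{cases}$$

The case of $m$ even is somewhat more complicated. We shall argue that $d_2^{0,1} = 0$ also in this case. It is shown in [30] that, for $\Gamma = \mathbb{Z}_2 \ltimes Z$, there exists a 7-term exact sequence

$$\begin{aligned}
0 &\longrightarrow H^1(\mathbb{Z}_2, H^0(Z, U(1))_\epsilon) \longrightarrow H^1(\Gamma, U(1)_\epsilon) \xrightarrow{\ \rho\ } H^0(\mathbb{Z}_2, H^1(Z, U(1))_\epsilon) \\
&\xrightarrow{\ d_2^{0,1}\ } H^2(\mathbb{Z}_2, H^0(Z, U(1))_\epsilon) \longrightarrow H^2(\Gamma, U(1)_\epsilon)_1 \longrightarrow H^1(\mathbb{Z}_2, H^1(Z, U(1))_\epsilon) \\
&\xrightarrow{\ d_2^{1,1}\ } H^3(\mathbb{Z}_2, H^0(Z, U(1))_\epsilon)
\end{aligned}$$

with $\rho$ denoting the restriction map and $H^2(\Gamma, U(1)_\epsilon)_1$ entering the exact sequence

$$0 \longrightarrow H^2(\Gamma, U(1)_\epsilon)_1 \longrightarrow H^2(\Gamma, U(1)_\epsilon) \longrightarrow H^2(Z, U(1))^{\mathbb{Z}_2},$$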


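where the last group is the subgroup of $\mathbb{Z}_2$-invariant elements of $H^2(Z, U(1))$. Since every 1-cocycle $w_z$ on $Z$ with values in $U(1)$ (i.e. a character of $Z$) extends to a 1-cocycle on $\Gamma$ with values in $U(1)_\epsilon$ upon setting $w_{z_0 z} = w_z^{-1}$, the restriction map is surjective. Besides, $H^2(Z, U(1)) = 0$ for $Z$ cyclic so that $H^2(\Gamma, U(1)_\epsilon)_1 \cong H^2(\Gamma, U(1)_\epsilon)$ in this case. If $Z = \mathbb{Z}_m$ with $m$ even then

$$\begin{aligned}
H^1(\mathbb{Z}_2, H^0(Z, U(1))_\epsilon) &= H^1(\mathbb{Z}_2, U(1)_\epsilon) = 0, \\
H^0(\mathbb{Z}_2, H^1(Z, U(1))_\epsilon) &= H^0(\mathbb{Z}_2, Z) = Z, \\
H^2(\mathbb{Z}_2, H^0(Z, U(1))_\epsilon) &= H^2(\mathbb{Z}_2, U(1)_\epsilon) = \mathbb{Z}_2, \\
H^1(\mathbb{Z}_2, H^1(Z, U(1))_\epsilon) &= H^1(\mathbb{Z}_2, Z) = \mathbb{Z}_2, \\
H^3(\mathbb{Z}_2, H^0(Z, U(1))_\epsilon) &= H^3(\mathbb{Z}_2, U(1)_\epsilon) = 0,
\end{aligned}$$

so that the 7-term exact sequence reduces to

$$0 \longrightarrow H^1(\Gamma, U(1)_\epsilon) \xrightarrow{\ \rho\ } Z \xrightarrow{\ d_2^{0,1}\ } \mathbb{Z}_2 \longrightarrow H^2(\Gamma, U(1)_\epsilon) \longrightarrow \mathbb{Z}_2 \xrightarrow{\ d_2^{1,1}\ } 0.$$

It follows, in particular, that $\rho$ is an isomorphism and $d_2^{0,1} = 0$. Finally, using this information in the LHS spectral sequence, we infer that, for $m$ even,

$$H^n(\mathbb{Z}_2 \ltimes \mathbb{Z}_m, U(1)_\epsilon) = \begin{cases} \mathbb{Z}_2 & \text{if } n = 0, \\ \mathbb{Z}_m & \text{if } n = 1, \\ \mathbb{Z}_4 \text{ or } \mathbb{Z}_2 \times \mathbb{Z}_2 & \text{if } n = 2, \\ \mathbb{Z}_{2k} \text{ or } \mathbb{Z}_2 \times \mathbb{Z}_k, \ k \in \{m, \tfrac{m}{2}, \tfrac{m}{4}\} & \text{if } n = 3. \end{cases} \tag{37}$$

The ambiguity in (37) can actually be resolved for the group $H^2(\mathbb{Z}_2 \ltimes \mathbb{Z}_m, U(1)_\epsilon)$. Indeed, consider its element defined by the cocycle

$$v^{(1)}_{z_0^n z,\; z_0^{n'} z'} = (-1)^{n n'} \tag{38}$$

for $z, z' \in \mathbb{Z}_m$. Suppose that $v^{(1)}$ is a coboundary, $v^{(1)}_{\gamma,\gamma'} = (\delta w)_{\gamma,\gamma'}$, from which it would follow, in particular, that

$$w_{z_0 z'}\, (w_{z_0 z^{-1} z'})^{-1}\, w_z = 1 \qquad \text{and} \qquad (w_{z_0 z'})^{-1}\, (w_{z^{-1} z'})^{-1}\, w_{z_0 z} = -1.$$

The two conditions are, however, contradictory, as can be verified by replacing $z$ by $z^{-1} z'^2$ in the second one. Hence, the class $[v^{(1)}_{\gamma,\gamma'}]$ generates a $\mathbb{Z}_2$ subgroup of $H^2(\mathbb{Z}_2 \ltimes \mathbb{Z}_m, U(1)_\epsilon)$. If $m$ is odd, this is the whole group. If $m$ is even, we may repeat the same reasoning for the 2-cocycle

$$v^{(2)}_{z_0^n z,\; z_0^{n'} z'} = \begin{cases} 1 & \text{if } (z')^{m/2} = 1, \\ (-1)^n & \text{if } (z')^{m/2} \neq 1, \end{cases} \tag{39}$$

and for $v^{(2)}_{\gamma,\gamma'} (v^{(1)}_{\gamma,\gamma'})^{-1}$, whereby we establish that the class $[v^{(2)}_{\gamma,\gamma'}]$ in $H^2(\mathbb{Z}_2 \ltimes \mathbb{Z}_m, U(1)_\epsilon)$ is non-trivial and different from $[v^{(1)}_{\gamma,\gamma'}]$. This immediately implies that, for $m$ even,

$$H^2(\mathbb{Z}_2 \ltimes \mathbb{Z}_m, U(1)_\epsilon) = \mathbb{Z}_2 \times \mathbb{Z}_2,$$

and the group is generated by the cohomology classes $[v^{(1)}_{\gamma,\gamma'}]$ and $[v^{(2)}_{\gamma,\gamma'}]$.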
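That the cochains (38) and (39) satisfy the twisted cocycle condition can be checked mechanically for small $m$. The sketch below assumes the reading of (39) in which the sign depends on the $\mathbb{Z}_m$ part $z'$ of the second argument, encodes $z_0^n z^a$ as the pair `(n, a)`, and uses hypothetical function names. It verifies only the cocycle property, not the non-triviality of the classes.

```python
def is_twisted_2cocycle(v, m):
    """True if v(g, h), with values +1/-1, satisfies
    v(h,l)^eps(g) * v(gh,l)^-1 * v(g,hl) * v(g,h)^-1 = 1 on Z_2 |x Z_m."""
    G = [(s, a) for s in (0, 1) for a in range(m)]

    def mul(g, h):
        (s, a), (t, b) = g, h
        return ((s + t) % 2, ((-a if t else a) + b) % m)

    for g in G:
        for h in G:
            for l in G:
                # values are signs, so the exponent eps(g) and the inverses drop out
                if v(h, l) * v(mul(g, h), l) * v(g, mul(h, l)) * v(g, h) != 1:
                    return False
    return True

def v1(g, h):
    """The cochain (38): (-1)^(n n')."""
    return -1 if (g[0] and h[0]) else 1

def v2(g, h):
    """The cochain (39) for m even: (-1)^n when (z')^(m/2) != 1,
    i.e. when z' is an odd power of the generator."""
    return -1 if (g[0] and h[1] % 2 == 1) else 1
```

For example, `is_twisted_2cocycle(v1, 3)` and `is_twisted_2cocycle(v2, 4)` should both hold, and so should the check for the product of the two cochains at even `m`.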


Finally, we give, for the sake of completeness, the classifying cohomology group for the case of the non-cyclic orbifold subgroup $\mathbb{Z}_2 \times \mathbb{Z}_2$ that will be encountered in the study of the Cartan series $D_{2s}$ of simple groups. Since

$$H^q(\mathbb{Z}_2 \times \mathbb{Z}_2, U(1)) = \begin{cases} U(1) & \text{if } q = 0, \\ \mathbb{Z}_2 \times \mathbb{Z}_2 & \text{if } q = 1, \\ \mathbb{Z}_2 & \text{if } q = 2, \end{cases}$$

see [15], in the LHS sequence

$$E_2^{0,2} = \mathbb{Z}_2, \qquad E_2^{1,1} = \mathbb{Z}_2 \times \mathbb{Z}_2, \qquad E_2^{2,0} = \mathbb{Z}_2.$$

It follows that $H^2(\mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2), U(1)_\epsilon)$ has at most 16 elements. In Sect. 4.4.2, we shall exhibit 16 cohomologically inequivalent 2-cocycles on $\mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$, taking values $\pm 1$. This will establish the equality

$$H^2(\mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2), U(1)_\epsilon) = \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2.$$

Prior to refining the tools of analysis of the obstruction cocycles, let us make one general comment. In the cyclic case, $Z = \mathbb{Z}_m$, the (possibly non-factorizing) component of the obstruction group $H^3(\Gamma, U(1)_\epsilon)$ coming from $H^0(\mathbb{Z}_2, H^3(\mathbb{Z}_m, U(1))_\epsilon)$ is determined completely by the orbifold subgroup. After imposing conditions that allow a trivialization of the obstruction cocycle $u_{z,z',z''}$ restricted to $Z$, we are left with at most a sign obstruction. We shall encounter such sign obstructions for $m$ even in the cases considered below. On the other hand, for $m$ odd, the restriction to $Z$ is clearly the sole source of obstruction. In particular, there are no obstructions to the trivializability of $u_{\gamma,\gamma',\gamma''}$ for trivial $Z$ and, consequently, no obstruction to the existence of Jandl structures on the level-$k$ gerbes on simply connected groups $G$.

4. Case-by-Case Analysis

The trivializability of the obstruction 3-cocycle $u_{\gamma,\gamma',\gamma''}$ given by (32) constrains the admissible values of the level $k$ in terms of the other elements such as the structure of the group $G$, the choice of the orbifold subgroup $Z \subset Z(G)$, and that of the twist element $\zeta \in Z(G)$ entering the action (25) on $G$ of the $\mathbb{Z}_2$ component of the orientifold group $\Gamma = \mathbb{Z}_2 \ltimes Z$. Below, we shall calculate the cocycles $u_{\gamma,\gamma',\gamma''}$ on $\Gamma$ and classify the cases when they may be trivialized, giving also an explicit form of trivializing cochains. The latter provide the main input in the explicit construction of $\Gamma$-equivariant structures on the level-$k$ gerbe on $G$ that will be described in [16]. Cohomologically inequivalent trivializing cochains give rise to inequivalent $\Gamma$-equivariant structures.

The construction of [16] is a direct generalization of the one of gerbes on non-simply connected groups $G/Z$ described in [15]. Below, we shall denote by $z$ a fixed generator of $Z(G)$ for the groups $G$ with a cyclic center $Z(G)$. The essential input in the calculations of the obstruction cocycle $u_{\gamma,\gamma',\gamma''}$ is the identification of the elements $w_z$ and $w_\kappa$ in the normalizer $N(T)$ of the Cartan subgroup $T \subset G$ and of the maps $\tau \mapsto z\tau$ and $\tau \mapsto \kappa\tau$ of the positive Weyl alcove that satisfy (22) and (27). To simplify the notation, we shall abbreviate $z^n \equiv n$ and $z_0 z^n \equiv \bar n$, where $n = 0, \ldots, |Z(G)| - 1$, for the elements of the maximal orientifold group $\Gamma = \mathbb{Z}_2 \ltimes Z(G)$. For any integer $n$, we shall denote by $[n]$ the number equal to $n$


modulo $|Z(G)|$ and such that $0 \leq [n] < |Z(G)|$. In agreement with (29), for general elements $\gamma = z^n,\ z_0 z^n$ of $\Gamma$ with $n = 0, \ldots, |Z(G)| - 1$, one may set

$$w_n \equiv w_{z^n} = w_z^{\,n}, \qquad w_{\bar n} \equiv w_{z_0 z^n} = w_\kappa\, w_z^{\,n_0}\, w_z^{\,n} \tag{40}$$

if the twist element ζ entering the action (25) of z 0 on G is equal to z n 0 ≡ n 0 . With these choices of wγ , the calculation of the obstruction cocycle u γ ,γ ,γ will follow from (31) and (32). For smaller orientifold groups Γ = Z2Z ⊂ Z2Z (G) with Z ⊂ Z (G), the obstruction cocycle will be obtained by restriction of the one for the maximal Γ . Obstructions to the existence of a trivializing cochain coming from the orbifold subgroup Z ⊂ Γ were analyzed in [15]. To look for a further obstruction associated with the subgroup Z2Z2 ⊂ Γ for Z (G) cyclic with |Z (G)| even, we shall consider, for n = 21 |Z (G)|, the combination −2 −1 −1 2 2 −1 X = u −2 n,n,0 u 0,n,n u 0,0,0 u 0,0,0 u n,n,n u n,0,n u n,0,n u 0,n,n u n,n,0 u 0,0,0 .

(41)

By direct substitution, one may check that $X = 1$ if $u_{\gamma,\gamma',\gamma''}$ satisfies identity (21) (or its restriction to $\mathbb{Z}_2 \ltimes \mathbb{Z}_2$). In a few cases, this equality will impose further non-trivial conditions on the existence of trivializing cochains.

Recall from Sect. 3 that, for $Z$ cyclic, the cohomology group $H^2(\mathbb{Z}_2 \ltimes Z, U(1)_\epsilon)$ is equal to $\mathbb{Z}_2$ if the order $|Z|$ of $Z$ is odd, and to $\mathbb{Z}_2 \times \mathbb{Z}_2$ if $|Z|$ is even. If a 2-cochain $v_{\gamma,\gamma'}$ solves the cohomological equation (21) then $v_{\gamma,\gamma'}$ and $v_{\gamma,\gamma'} v^{(1)}_{\gamma,\gamma'}$ give two cohomologically inequivalent solutions if $|Z|$ is odd, and $v_{\gamma,\gamma'}$, $v_{\gamma,\gamma'} v^{(1)}_{\gamma,\gamma'}$, $v_{\gamma,\gamma'} v^{(2)}_{\gamma,\gamma'}$ and $v_{\gamma,\gamma'} v^{(1)}_{\gamma,\gamma'} v^{(2)}_{\gamma,\gamma'}$ give four cohomologically inequivalent solutions if $|Z|$ is even. All other solutions of (21) differ from those by 2-coboundaries. In the notation introduced above, the 2-cocycles $v^{(1)}_{\gamma,\gamma'}$ and $v^{(2)}_{\gamma,\gamma'}$ of (38) and (39) read

$$v^{(1)}_{n,n'} = v^{(1)}_{n,\bar n'} = v^{(1)}_{\bar n,n'} = 1, \qquad v^{(1)}_{\bar n,\bar n'} = -1, \tag{42}$$

$$v^{(2)}_{n,n'} = v^{(2)}_{n,\bar n'} = 1, \qquad v^{(2)}_{\bar n,n'} = v^{(2)}_{\bar n,\bar n'} = (-1)^{n' |Z| / |Z(G)|}. \tag{43}$$

4.1. The case of $G = A_r = SU(r+1)$. The Lie algebra $g = su(r+1)$ is composed of the Hermitian traceless $(r+1) \times (r+1)$-matrices. The Cartan algebra $t \subset g$ is chosen in the standard way as composed of the diagonal matrices in $g$. We shall denote by $e_i$, $i = 1, \ldots, r+1$, the diagonal matrices with matrix elements $(e_i)_{j,j'} = \delta_{i,j}\delta_{i,j'}$. The scalar product $\mathrm{tr}\, e_i e_{i'} = \delta_{i,i'}$ defines the Killing form on $t$ with the required normalization. The center $Z(G) \cong \mathbb{Z}_{r+1}$ is generated by the element $z = e^{-2\pi i \lambda_r^\vee} = e^{-\frac{2\pi i}{r+1}}$, with the fundamental (co)weights

$$\lambda_i^\vee = \sum_{j=1}^{i} e_j - \frac{i}{r+1} \sum_{j=1}^{r+1} e_j$$

Fig. 1. The rotation of the extended Dynkin diagram of Ar under z

Fig. 2. The reflection of the Dynkin diagram of Ar under κ

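corresponding to the standard choice of the simple (co)roots$^4$ $\alpha_i = e_i - e_{i+1}$. The positive Weyl alcove $A$ is the simplex in $t$ with the vertices $\tau_0 = 0$ and $\tau_i = \lambda_i^\vee$. For $\tau \in A$, the relations (22) and (27) hold for

$$(w_z)_{j,j'} = e^{\frac{\pi i r}{r+1}}\, \delta_{j-1,[j']}, \qquad (w_\kappa)_{j,j'} = e^{\frac{\pi i r}{2}}\, \delta_{j,\,r+2-j'},$$

and for the transformations of the positive Weyl alcove acting on the vertices of $A$ by

$$z\tau_i = \tau_{[i+1]}, \qquad \kappa\tau_i = \tau_{[r-i+1]}.$$

The corresponding index transformations induce, respectively, the symmetry of the extended Dynkin diagram and the symmetry of the unextended one that are depicted in Fig. 1 and Fig. 2. For the maximal orientifold group $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_{r+1}$ with the action of the generator $z_0$ of $\mathbb{Z}_2$ given by (25), we shall define $w_n$ and $w_{\bar n}$ according to (40). In order to satisfy the relation (31) for $\gamma, \gamma'$ of the form $n$ or $\bar n$, one may take

$$b_{n,n'} = b_{\bar n,n'} = \begin{cases} 0 & \text{if } n + n' < r+1, \\ \frac{r(r+1)}{2}\, \lambda_r^\vee & \text{if } n + n' \geq r+1, \end{cases}$$

$$b_{n,\bar n'} = \begin{cases} r\, n\, \lambda_r^\vee & \text{if } n' \geq n, \\ r \big(n + \frac{r+1}{2}\big) \lambda_r^\vee & \text{if } n' < n, \end{cases}$$

$$b_{\bar n,\bar n'} = \begin{cases} r \big(n + n_0 + \frac{r+1}{2}\big) \lambda_r^\vee & \text{if } n' \geq n, \\ r\, (n + n_0)\, \lambda_r^\vee & \text{if } n' < n. \end{cases}$$

Using the identity

$$\mathrm{tr}\, \lambda_i^\vee \lambda_r^\vee = \frac{i}{r+1},$$

$^4$ We shall always identify the Cartan algebra $t$ with its dual using the Killing form tr.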


one obtains from (32) the explicit form of the obstruction cocycle on the group $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_{r+1}$,

$$u_{n,n',n''} = \Phi^{\,n\, \frac{n'+n''-[n'+n'']}{r+1}} = u_{n,\bar n',n''}, \tag{44}$$

$$u_{\bar n,n',n''} = \Phi^{\,(n_0+n)\, \frac{n'+n''-[n'+n'']}{r+1}} = u_{\bar n,\bar n',n''}, \tag{45}$$

$$u_{n,n',\bar n''} = \Phi^{\,n\, \frac{n''-n'-[n''-n']}{r+1}}\; \Psi^{-nn'}, \tag{46}$$

$$u_{\bar n,n',\bar n''} = \Phi^{\,(n_0+n)\, \frac{n''-n'-[n''-n']}{r+1}}\; \Psi^{(n_0+n)n'}, \tag{47}$$

$$u_{n,\bar n',\bar n''} = \Phi^{\,n \left(1 + \frac{n''-n'-[n''-n']}{r+1}\right)}\; \Psi^{-n(n_0+n')}, \tag{48}$$

$$u_{\bar n,\bar n',\bar n''} = \Phi^{\,(n_0+n) \left(1 + \frac{n''-n'-[n''-n']}{r+1}\right)}\; \Psi^{(n_0+n)(n_0+n')}, \tag{49}$$

where $\Phi \equiv (-1)^{kr}$ and $\Psi \equiv e^{\frac{2\pi i k}{r+1}}$. In the case when $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_m$ for a proper subgroup $\mathbb{Z}_m \subset \mathbb{Z}_{r+1}$ composed of elements $n$ that are multiples of $\frac{r+1}{m}$, the obstruction cocycle is obtained by restriction of $n$, $n'$ and $n''$ to such values. A necessary condition for the solvability of the cohomological equation (33) is the triviality of the cohomology class $[u_{n,n',n''}] \in H^3(Z, U(1))$. This condition was analyzed in [14,15] where it was shown that it implies that

$$k \text{ is even if } m \text{ is even and } \tfrac{r+1}{m} \text{ is odd.} \tag{50}$$

The latter restriction means that $k r \frac{r+1}{m}$ is even so that the factors $\Phi^n$ in the expression for $u_{\gamma,\gamma',\gamma''}$ may be replaced by 1, and that $u_{n,n',n''} \equiv 1$, in particular. Another necessary condition for the solvability of (33) is the trivializability of the restriction of $u_{\gamma,\gamma',\gamma''}$ to $\mathbb{Z}_2 \subset \Gamma$, i.e. to $\gamma, \gamma', \gamma'' = 0, \bar 0$. This, however, always holds because of the triviality of the cohomology group $H^3(\mathbb{Z}_2, U(1)_\epsilon)$, see (36). The 2-cochain on $\mathbb{Z}_2$ which trivializes the restricted 3-cocycle is

$$\tilde v_{0,0} = \tilde v_{0,\bar 0} = \tilde v_{\bar 0,0} = 1, \qquad \tilde v_{\bar 0,\bar 0} = \pm\, \Psi^{-\frac{1}{2} n_0 \left(n_0 + \frac{r(r+1)}{2}\right)},$$

with the two signs giving cohomologically inequivalent 2-cochains. All other trivializing 2-cochains differ from them by 2-coboundaries (recall that $H^2(\mathbb{Z}_2, U(1)_\epsilon) = \mathbb{Z}_2$). Note again that the triviality of $H^3(\mathbb{Z}_2, U(1)_\epsilon)$ implies that if the orbifold subgroup $Z$ is trivial then the cohomological equation (33) is always solvable.

Returning to the case of non-trivial $Z \cong \mathbb{Z}_m$, further simplification of the 3-cocycle (44)–(49) may be achieved by extracting from it the coboundary $(\delta v')_{\gamma,\gamma',\gamma''}$ with

$$v'_{n,n'} = \Psi^{nn'}, \qquad v'_{n,\bar n'} = \Psi^{nn'}\, c_n, \tag{51}$$

$$v'_{\bar n,n'} = \Psi^{-nn'}, \qquad v'_{\bar n,\bar n'} = \pm\, \Psi^{-\frac{1}{2} n_0 \left(n_0 + \frac{r(r+1)}{2}\right)}\; \Psi^{-n(n_0+n')}\; c_n^{-1}, \tag{52}$$

where $c_n = \Psi^{-\frac{1}{2}\left(n^2 + (r+1)n\right)}$ satisfies

$$c_n\, c^{-1}_{[n+n']}\, c_{n'} = \Psi^{nn'}, \qquad c_{[-n]} = c_n. \tag{53}$$
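The identities (53) are easy to confirm numerically. The sketch below assumes that fractional powers of $\Psi$ are taken in the sense $\Psi^x \equiv e^{2\pi i k x/(r+1)}$ (a convention chosen here for definiteness) and uses a hypothetical helper name.

```python
import cmath

def c_identities_defect(r, k):
    """Largest deviation from the two identities (53) for given rank r and level k,
    with Psi^x defined as exp(2 pi i k x / (r + 1))."""
    N = r + 1

    def Psi(x):
        return cmath.exp(2j * cmath.pi * k * x / N)

    def c(n):
        return Psi(-(n * n + N * n) / 2)

    worst = 0.0
    for n in range(N):
        # c_[-n] = c_n
        worst = max(worst, abs(c((-n) % N) - c(n)))
        for n1 in range(N):
            # c_n c_[n+n']^{-1} c_n' = Psi^(n n')
            worst = max(worst, abs(c(n) * c(n1) / c((n + n1) % N) - Psi(n * n1)))
    return worst
```

The returned defect should vanish up to rounding for any rank and integer level, for instance `c_identities_defect(3, 1)`.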

Note that the lift of the 2-cochain $\tilde v$ to $\Gamma$ appears as an explicit factor in $v'$. Writing $u_{\gamma,\gamma',\gamma''} = u'_{\gamma,\gamma',\gamma''}\, (\delta v')_{\gamma,\gamma',\gamma''}$, we obtain the following formulae by a straightforward calculation using (51)–(53):

$$u'_{n,n',n''} = u'_{n,\bar n',n''} = u'_{n,n',\bar n''} = u'_{n,\bar n',\bar n''} = 1,$$

$$u'_{\bar n,n',n''} = u'_{\bar n,\bar n',n''} = \Phi^{\,n_0\, \frac{n'+n''-[n'+n'']}{r+1}},$$

$$u'_{\bar n,n',\bar n''} = u'_{\bar n,\bar n',\bar n''} = \Phi^{\,n_0\, \frac{n''-n'-[n''-n']}{r+1}}.$$

If $m$ is odd then the cocycle $u'_{\gamma,\gamma',\gamma''}$ may be trivialized by setting

$$v''_{n,n'} = v''_{n,\bar n'} = 1, \qquad v''_{\bar n,n'} = v''_{\bar n,\bar n'} = \Phi^{\,n_0 \frac{m n'}{r+1}}. \tag{54}$$

Indeed, using the fact that $\Phi = \pm 1$ and that

$$\Phi^{\,n_0 \frac{m n}{r+1}}\; \Phi^{-n_0 \frac{m [n+n']}{r+1}}\; \Phi^{\,n_0 \frac{m n'}{r+1}} = \Phi^{\,n_0 \frac{n+n'-[n+n']}{r+1}}$$

for $m$ odd, one easily verifies that $u'_{\gamma,\gamma',\gamma''} = (\delta v'')_{\gamma,\gamma',\gamma''}$. If $m$ is even and $\frac{r+1}{m}$ is odd then the condition (50) implies that $\Phi = 1$ so that the cocycle $u'_{\gamma,\gamma',\gamma''}$ is trivial. For $m$ and $\frac{r+1}{m}$ even, however, there exists a further obstruction to the trivializability of $u'_{\gamma,\gamma',\gamma''}$ that is related to the choice of the twist element $\zeta = n_0$ in the action (25). The analysis of the cohomology group $H^3(\Gamma, U(1)_\epsilon)$ done in Sect. 3 showed that such an obstruction has to lie in $\mathbb{Z}_2$ since the part of the obstruction related to the orbifold group has already been removed by the condition (50). To identify the additional obstructions, we note that the combination $X$ of (41) calculated for the cocycle $u'_{\gamma,\gamma',\gamma''}$ and $n = \frac{r+1}{2}$ is equal to $\Phi^{n_0}$ since $u'_{\bar 0,n,n}$ contributes the only non-trivial factor to it. One obtains this way the equality $\Phi^{n_0} = 1$ showing that if $u'_{\gamma,\gamma',\gamma''}$ is a coboundary then

$$k\, n_0 \text{ is even if } m \text{ is even and } \tfrac{r+1}{m} \text{ is even.} \tag{55}$$

In that case, $v''_{\gamma,\gamma'}$ may be taken trivial or, which amounts to the same, given by (54). Note that $k \frac{r+1}{m}$ is even for $m$ even due to the restriction (50) so that the condition (55) holds or fails simultaneously for all $n_0$ in the same congruence class modulo $\frac{r+1}{m}$, in agreement with the equivalence of the $\Gamma$ actions for the twist elements in the same $Z$-coset that we discussed in Sect. 2.5.
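The trivialization described above can be tested numerically for $m$ odd. The sketch below is a check under stated assumptions rather than the authors' computation: it encodes $n$ as `(0, n)` and $\bar n$ as `(1, n)`, takes the obstruction cocycle in the form $u_{n,n',n''} = u_{n,\bar n',n''} = \Phi^{n\theta_+}$, $u_{\bar n,n',n''} = u_{\bar n,\bar n',n''} = \Phi^{(n_0+n)\theta_+}$, $u_{n,n',\bar n''} = \Phi^{n\theta_-}\Psi^{-nn'}$, $u_{\bar n,n',\bar n''} = \Phi^{(n_0+n)\theta_-}\Psi^{(n_0+n)n'}$, $u_{n,\bar n',\bar n''} = \Phi^{n(1+\theta_-)}\Psi^{-n(n_0+n')}$, $u_{\bar n,\bar n',\bar n''} = \Phi^{(n_0+n)(1+\theta_-)}\Psi^{(n_0+n)(n_0+n')}$ with $\theta_\pm = (n'' \pm n' - [n'' \pm n'])/(r+1)$, and the cochain $v = v'v''$ with $v'_{n,n'} = \Psi^{nn'}$, $v'_{n,\bar n'} = \Psi^{nn'}c_n$, $v'_{\bar n,n'} = \Psi^{-nn'}$, $v'_{\bar n,\bar n'} = \pm\Psi^{-\frac12 n_0(n_0 + r(r+1)/2)}\Psi^{-n(n_0+n')}c_n^{-1}$ and $v''_{\bar n,\cdot} = \Phi^{n_0 m n'/(r+1)}$. Fractional powers of $\Psi$ are taken as $e^{2\pi i k x/(r+1)}$, and the function name is hypothetical.

```python
import cmath

def ar_trivialization_defect(r, k, n0, m, sign=1):
    """Max |u - delta(v' v'')| over Gamma = Z_2 |x Z_m inside Z_2 |x Z_(r+1), m odd."""
    N = r + 1
    assert N % m == 0 and m % 2 == 1
    Z = [j * (N // m) for j in range(m)]
    G = [(s, n) for s in (0, 1) for n in Z]      # (0, n) ~ n, (1, n) ~ n-bar

    def mul(g, h):
        (s, a), (t, b) = g, h
        return ((s + t) % 2, ((-a if t else a) + b) % N)

    def Psi(x):                                   # Psi^x, Psi = exp(2 pi i k/(r+1))
        return cmath.exp(2j * cmath.pi * k * x / N)

    def Phi(e):                                   # Phi^e, Phi = (-1)^(k r)
        return -1 if (k * r * e) % 2 else 1

    def theta_plus(a, b):
        return (a + b - (a + b) % N) // N

    def theta_minus(a, b):
        return (b - a - (b - a) % N) // N

    def u(g, h, l):
        (s, n), (t, n1), (w, n2) = g, h, l
        pref = n0 + n if s else n
        if w == 0:
            return Phi(pref * theta_plus(n1, n2))
        e = theta_minus(n1, n2) + (1 if t else 0)
        return Phi(pref * e) * Psi((1 if s else -1) * pref * (n1 + (n0 if t else 0)))

    def c(n):
        return Psi(-(n * n + N * n) / 2)

    def v(g, h):
        (s, n), (t, n1) = g, h
        if s == 0:
            return Psi(n * n1) * (c(n) if t else 1)
        vpp = Phi(n0 * (m * n1 // N))             # the factor v''
        if t == 0:
            return Psi(-n * n1) * vpp
        K = sign * Psi(-n0 * (n0 + r * (r + 1) / 2) / 2)
        return K * Psi(-n * (n0 + n1)) / c(n) * vpp

    def eps(g):
        return -1 if g[0] else 1

    return max(abs(u(g, h, l)
                   - v(h, l) ** eps(g) / v(mul(g, h), l) * v(g, mul(h, l)) / v(g, h))
               for g in G for h in G for l in G)
```

A vanishing defect, for example for `ar_trivialization_defect(5, 1, 1, 3)`, means that $u = \delta(v'v'')$ on $\mathbb{Z}_2 \ltimes \mathbb{Z}_3 \subset \mathbb{Z}_2 \ltimes \mathbb{Z}_6$; both choices of `sign` should work, reflecting the two inequivalent trivializations.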


To summarize, for the orientifold group $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_m$, the triviality of the cohomology class $[u_{\gamma,\gamma',\gamma''}] \in H^3(\Gamma, U(1)_\epsilon)$ imposes the conditions (50) and (55). If they are satisfied then the 2-cochain trivializing $u_{\gamma,\gamma',\gamma''}$ may be taken in the form $v_{\gamma,\gamma'} = v'_{\gamma,\gamma'}\, v''_{\gamma,\gamma'}$, where $v'$ and $v''$ are given by (51)–(52) and (54), respectively. For $m$ odd, the two choices of the sign in (52) give two cohomologically inequivalent trivializing cochains from which other trivializing cochains differ by 2-coboundaries. Indeed, the sign change is induced by multiplication of $v_{\gamma,\gamma'}$ by the 2-cocycle $v^{(1)}_{\gamma,\gamma'}$ given by (42). For $m$ even, two further cohomologically inequivalent solutions are obtained by additionally multiplying $v_{\gamma,\gamma'}$ by the 2-cocycle $v^{(2)}_{\gamma,\gamma'}$ given by (43).

Let us illustrate how the above analysis provides concrete information about the numbers of inequivalent orientifold gerbes for a few examples of $A_r$ groups of low ranks. For $G = SU(2)$, if $\Gamma = \mathbb{Z}_2$ with its generator acting by (25) then there are two inequivalent $\Gamma$-equivariant (or Jandl) structures on the gerbe of level $k$ on $SU(2)$ for each $k$ and each of the two choices of the twist element $\zeta$. For $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_2$ with the second factor being the center of $SU(2)$, the condition (50) imposes that the level $k$ be even. For each of the two choices of the twist element $\zeta$, there are then four inequivalent $\Gamma$-equivariant structures. The different choices of the twist element lead to equivalent actions of $\Gamma$ on $SU(2)$ and there are altogether four inequivalent Jandl structures on the induced gerbe on $SO(3)$, see the discussion in Sect. 2.5. These results are in agreement with the analysis of refs. [31] and [27,28].

For $G = SU(3)$, there are no obstructions. There are two inequivalent $\Gamma$-structures on the gerbe on $G$ for $\Gamma = \mathbb{Z}_2$ or $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_3$ for each level $k$ and each of the three choices of the twist element. For $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_3$, different choices of the twist element lead to equivalent $\Gamma$-actions. Consequently, there are, altogether, two inequivalent Jandl structures on the induced gerbe on the quotient group $SU(3)/\mathbb{Z}_3$ for each $k$.

For $G = SU(4)$, there are two inequivalent Jandl structures on the gerbe on $G$ for each level $k$ and each of the four choices of the twist element. For $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_2$, there are four inequivalent $\Gamma$-equivariant structures for each $k$ even and each choice of the twist element in $\mathbb{Z}_4$, and for each $k$ odd and each twist element in $\mathbb{Z}_2 \subset \mathbb{Z}_4$. There are no $\Gamma$-equivariant structures for $k$ odd and twist elements in $\mathbb{Z}_4 \setminus \mathbb{Z}_2$. We get this way eight inequivalent Jandl structures on the induced gerbe on the quotient group $SU(4)/\mathbb{Z}_2$ if $k$ is even and four if $k$ is odd. Finally, if $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_4$ there are four inequivalent $\Gamma$-equivariant structures for $k$ even and each of the four choices of the twist element. Overall, they give rise to four inequivalent Jandl structures on the induced gerbe on $SU(4)/\mathbb{Z}_4$. There are no $\Gamma$-equivariant structures for $k$ odd.

4.2. The case of $G = B_r = Spin(2r+1)$. The Lie algebra $g = spin(2r+1)$ is composed of the imaginary antisymmetric $(2r+1) \times (2r+1)$-matrices. We shall denote by $e_i$, $i = 1, \ldots, r$, the matrices with matrix elements $(e_i)_{j,j'} = \mathrm{i}\,(\delta_{j,2i}\delta_{2i-1,j'} - \delta_{j,2i-1}\delta_{2i,j'})$ that span the Cartan algebra $t \subset g$. The Killing form is normalized so that $\mathrm{tr}\, e_i e_{i'} = \delta_{i,i'}$. The center is $Z(G) \cong \mathbb{Z}_2$ with the generator $z = e^{-2\pi i \lambda_1^\vee}$, where

$$\lambda_i^\vee = \sum_{j=1}^{i} e_j$$

Fig. 3. The transformation of the extended Dynkin diagram of Br under z

are the fundamental coweights corresponding to the simple roots αi = ei − ei+1 for i = 1, . . . , r − 1 and αr = er . We have Spin(2r + 1)/Z2 = SO(2r + 1). The vertices 1 ∨ of the positive Weyl alcove are τ0 = 0, τ1 = λ∨ 1 and τi = 2 λi for i = 2, . . . , r . For τ ∈ A, the relations (22) and (27) hold for wz and wκ that project to the SO(2r + 1) matrices with entries (wz ) j, j = −(−1)δ1, j δ j, j , (wκ ) j, j =

r (δ j,2i δ2i−1, j + δ j,2i−1 δ2i, j ) + (−1)r δ j,2r +1 δ2r +1, j , i=1

and for the transformations of the positive Weyl alcove acting on the vertices of A by zτ0 = τ1 , zτ1 = τ0 , zτi = τi for i = 2, . . . , r, κτi = τi for i = 0, . . . , r. The symmetry of the extended Dynkin diagram corresponding to the index transformations under z is represented in Fig. 3. The index transformation under κ induces a trivial symmetry of the Dynkin diagram. It is easy to see, by calculating first the eigenvalues of the projections of wz and wκ to SO(2r + 1), that ∨

$$w_z = z^{n_z}\, O_z\, e^{\pi i \lambda_r^\vee}\, O_z^{-1}, \qquad w_\kappa = z^{n_\kappa}\, O_\kappa\, e^{\pi i \lambda_{r'}^\vee}\, O_\kappa^{-1},$$
where $n_z, n_\kappa = 0$ or $1$, $O_z, O_\kappa \in \mathrm{Spin}(2r+1)$ and $r' = \frac{r}{2}$ for $r$ even and $r' = \frac{r+1}{2}$ for $r$ odd. The coroot lattice of $B_r$ is spanned by the simple coroots $\alpha_i^\vee = e_i - e_{i+1}$ for $i = 1, \dots, r-1$, and $\alpha_r^\vee = 2e_r$. By checking that the coweights $\lambda_r^\vee$ and $\lambda_{r'}^\vee$ belong to the coroot lattice if and only if, respectively, $r$ and $r'$ are even, one infers from the above relations that
$$w_z^2 = z^{r}, \qquad w_\kappa^2 = z^{r'}.$$
As far as $(w_\kappa w_z)^2$ is concerned, we note that it projects to the same matrix in $SO(2r+1)$ as $e^{\pi i \lambda_1^\vee}$ so that
$$(w_\kappa w_z)^2 = e^{\pm \pi i \lambda_1^\vee} \quad \text{for some choice of the sign.} \tag{56}$$
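The lattice statement used above, that $\lambda_r^\vee$ and $\lambda_{r'}^\vee$ lie in the coroot lattice of $B_r$ exactly when $r$, respectively $r'$, is even, can be checked mechanically. The following is a minimal sketch, not part of the original argument: it expands a vector in the simple coroots $\alpha_i^\vee = e_i - e_{i+1}$, $\alpha_r^\vee = 2e_r$ and tests whether all coefficients are integers. The function names are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def coroot_coefficients_B(v):
    """Coefficients c_i with v = sum_i c_i * alpha_i^vee for B_r.

    v is given by its coordinates in the basis e_1, ..., e_r, and the simple
    coroots are alpha_i^vee = e_i - e_{i+1} (i < r) and alpha_r^vee = 2 e_r.
    """
    r = len(v)
    coeffs = []
    partial = Fraction(0)
    for j in range(r):
        partial += Fraction(v[j])
        # c_j = v_1 + ... + v_j for j < r, and c_r is half of the full sum.
        coeffs.append(partial if j < r - 1 else partial / 2)
    return coeffs


def in_coroot_lattice_B(v):
    """True if v is an integer combination of the simple coroots of B_r."""
    return all(c.denominator == 1 for c in coroot_coefficients_B(v))


def fundamental_coweight_B(i, r):
    """lambda_i^vee = e_1 + ... + e_i as a coordinate list of length r."""
    return [1 if j < i else 0 for j in range(r)]


def r_prime(r):
    """r' = r/2 for r even and (r+1)/2 for r odd, as in the text."""
    return r // 2 if r % 2 == 0 else (r + 1) // 2


if __name__ == "__main__":
    for r in range(2, 9):
        top = in_coroot_lattice_B(fundamental_coweight_B(r, r))
        mid = in_coroot_lattice_B(fundamental_coweight_B(r_prime(r), r))
        print(r, r_prime(r), top, mid)
```

The expansion is exact because it uses rational arithmetic; only the last coefficient can fail to be an integer, which reflects the fact that the coroot lattice of $B_r$ consists of the integer vectors with even coordinate sum.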


For the maximal orientifold group Z2Z2 , we define wn and wn for n = 0, 1 according to (40). One can satisfy the relation (31) by taking bn,n = bn,n = m n,n λ∨ 1,

(57)

bn,n = ( 2 + m n,n ) δn,1 λ∨ 1,

(58)

∨ 0 bn,n = ( 2 δ[n 0 +n],1 + m nn,n )λ1 ,

(59)

1

1

with integers m n,n = r + m − 1 + r δn ,0 ,

m n,n = r δn,1 δn ,1 ,

(60)

0 m nn,n = r δn,1 δn ,0 + r δ[n 0 +n],0 + m δ[n 0 +n],1 ,

(61)

and m = 0 (m = 1) for the upper (lower) sign in Eq.(56). Since τz −n 0 = δ[n],1 λ∨ 1,

τ(z 0 z n )−1 0 = δ[n 0 +n],1 λ∨ 1,

2 and tr(λ∨ 1 ) = 1, one readily sees that the contribution of the integer multiplicities of λ∨ to b γ ,γ drops out from the expression (32) for the obstruction cocycle which, 1 accordingly, takes the following form:

u n,n ,n = u n,n ,n = u n,n ,n = u n,n ,n = 1,

(62)

u n,n ,n = (−1)k nn ,

u n,n ,n = (−1)k(n 0 +n)n ,

(63)

u n,n ,n = (−1)k n(n 0 +n ) ,

u n,n ,n = (−1)k(n 0 +n)(n 0 +n ) .

(64)

The cohomological equation (21) can always be solved. Two cohomologically inequivalent solutions are obtained by taking

vn,n = vn,n = (−1)k nn ,

vn,n = (−1)k nn e−

3π i 2 kn

,

πi

vn,n = ±(−1)k n(n 0 +n ) e 2 k(n 0 +3n) .

Two further cohomologically inequivalent solutions for the orientifold group $\mathbb{Z}_2 \ltimes \mathbb{Z}_2$ are obtained by multiplying $v_{\gamma,\gamma'}$ by the 2-cocycle $v^{(2)}_{\gamma,\gamma'}$ given by (43).

In summary, there are no obstructions to the trivialization of the 3-cocycle (62)–(64) on $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_2$. For each $k$ and each choice of the twist element $\zeta \in \mathbb{Z}_2$, there are four cohomologically inequivalent trivializing cochains that give rise to inequivalent $\Gamma$-equivariant structures on the level-$k$ gerbe on $\mathrm{Spin}(2r+1)$. The latter induce altogether four inequivalent Jandl structures on the level-$k$ gerbe on $SO(2r+1)$. Restriction to the inversion group $\Gamma = \mathbb{Z}_2$ reduces the number of inequivalent trivializing cochains to two for each $k$ and each $\zeta$. Altogether, they induce four inequivalent Jandl structures on the level-$k$ gerbe on $\mathrm{Spin}(2r+1)$.

Fig. 4. The transformation of the extended Dynkin diagram of $C_r$ under $z$

4.3. The case of $G = C_r = Sp(2r)$.

The group $Sp(2r)$ is composed of the unitary $(2r) \times (2r)$ matrices $U$ such that $U^T \Omega\, U = \Omega$ for
$$(\Omega)_{j,j'} = \sum_{i=1}^{r} (\delta_{j,2i-1}\,\delta_{2i,j'} - \delta_{j,2i}\,\delta_{2i-1,j'}).$$
The Lie algebra $sp(2r)$ of $Sp(2r)$ is composed of the Hermitian matrices $X$ such that $\Omega X$ is symmetric. The Cartan subalgebra $\mathfrak{t} \subset sp(2r)$ is spanned by the matrices $e_i$, $i = 1, \dots, r$, with $(e_i)_{j,j'} = i\,(\delta_{j,2i}\,\delta_{2i-1,j'} - \delta_{j,2i-1}\,\delta_{2i,j'})$, and the Killing form is normalized so that $\mathrm{tr}\, e_i e_j = 2\delta_{ij}$. The center is $Z(G) \cong \mathbb{Z}_2$ with the generator $z = e^{-2\pi i \lambda_r^\vee} = -1$, where
$$\lambda_i^\vee = \sum_{j=1}^{i} e_j \ \text{ for } i = 1, \dots, r-1, \qquad \lambda_r^\vee = \frac{1}{2} \sum_{j=1}^{r} e_j$$
are the fundamental coweights corresponding to the simple roots $\alpha_i = \frac{1}{2}(e_i - e_{i+1})$ for $i = 1, \dots, r-1$ and $\alpha_r = e_r$. The vertices of the positive Weyl alcove $A$ are $\tau_0 = 0$, $\tau_i = \frac{1}{2}\lambda_i^\vee$ for $i = 1, \dots, r-1$ and $\tau_r = \lambda_r^\vee$. In order to satisfy the relations (22) and (27) for $\tau \in A$, we may take for $w_z$ and $w_\kappa$ the matrices with entries
$$(w_z)_{j,j'} = i\,\delta_{j,2r+1-j'}, \qquad (w_\kappa)_{j,j'} = i \sum_{i=1}^{r} (\delta_{j,2i-1}\,\delta_{2i,j'} + \delta_{j,2i}\,\delta_{2i-1,j'})$$
and the actions of $z$ and $\kappa$ on the positive Weyl alcove reducing to
$$z\tau_i = \tau_{r-i}, \qquad \kappa\tau_i = \tau_i$$

on the vertices. The symmetry of the extended Dynkin diagram corresponding to the action of $z$ is depicted in Fig. 4. Note that $w_z^2 = w_\kappa^2 = -1 = z$ and $(w_\kappa w_z)^2 = 1$. Defining $w_n$ and $w_{\bar n}$ for the orientifold group $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_2$ by (40), we may satisfy (31) by taking
$$b_{n,n'} = b_{\bar n,n'} = b_{n,\bar n'} = n n'\,\lambda_r^\vee, \qquad b_{\bar n,\bar n'} = (1 + n_0 + n n')\,\lambda_r^\vee.$$
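The matrix identities quoted for $Sp(2r)$, namely $w_z^2 = w_\kappa^2 = -1$, $(w_\kappa w_z)^2 = 1$, and the fact that both matrices preserve $\Omega$, can be confirmed by direct multiplication. The sketch below builds the matrices from the index formulas given in this section and checks these relations in exact complex-integer arithmetic; the helper names are hypothetical and the code is only an illustration, not the authors' procedure.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Transpose of a square matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def scaled_identity(c, n):
    """The n x n matrix c * identity."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def sp_matrices(r):
    """Omega, w_z and w_kappa for Sp(2r), built from the index formulas."""
    n = 2 * r
    omega = [[0] * n for _ in range(n)]
    w_z = [[0] * n for _ in range(n)]
    w_kappa = [[0] * n for _ in range(n)]
    for i in range(1, r + 1):
        a, b = 2 * i - 2, 2 * i - 1  # zero-based positions of indices 2i-1, 2i
        omega[a][b], omega[b][a] = 1, -1
        w_kappa[a][b] = w_kappa[b][a] = 1j
    for j in range(n):
        # (w_z)_{j,j'} = i * delta_{j, 2r+1-j'}: i times the antidiagonal
        w_z[j][n - 1 - j] = 1j
    return omega, w_z, w_kappa


def check_sp_relations(r):
    """True if all relations quoted in the text hold for the given r."""
    n = 2 * r
    omega, w_z, w_kappa = sp_matrices(r)
    kz = matmul(w_kappa, w_z)
    squares_ok = (matmul(w_z, w_z) == scaled_identity(-1, n)
                  and matmul(w_kappa, w_kappa) == scaled_identity(-1, n)
                  and matmul(kz, kz) == scaled_identity(1, n))
    symplectic_ok = all(
        matmul(matmul(transpose(u), omega), u) == omega
        for u in (w_z, w_kappa))
    return squares_ok and symplectic_ok


if __name__ == "__main__":
    for r in range(1, 6):
        print(r, check_sp_relations(r))
```

Since all entries are Gaussian integers, the comparisons are exact and no numerical tolerance is needed.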

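Since
$$\tau_{z^{-n}0} = \delta_{[n],1}\,\lambda_r^\vee, \qquad \tau_{(z_0 z^n)^{-1}0} = \delta_{[n_0+n],1}\,\lambda_r^\vee$$
and $\mathrm{tr}(\lambda_r^\vee)^2 = \frac{r}{2}$, one infers from (32) that
$$u_{n,n',n''} = u_{n,\bar n',n''} = u_{n,n',\bar n''} = (-1)^{k r\, n n' n''}, \tag{65}$$
$$u_{\bar n,n',n''} = u_{\bar n,\bar n',n''} = u_{\bar n,n',\bar n''} = (-1)^{k r\,(n_0+n)\, n' n''}, \qquad u_{n,\bar n',\bar n''} = (-1)^{k r\, n\,(1+n_0+n' n'')}, \tag{66}$$
$$u_{\bar n,\bar n',\bar n''} = (-1)^{k r\,(n_0+n)(1+n_0+n' n'')}. \tag{67}$$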

The restriction $u_{n,n',n''}$ of the cocycle $u_{\gamma,\gamma',\gamma''}$ to the orbifold subgroup $\mathbb{Z}_2$ is trivializable if and only if $kr$ is even, i.e. $k$ has to be even if $r$ is odd, see [15]. Under this condition, $u_{\gamma,\gamma',\gamma''} \equiv 1$ and four cohomologically inequivalent solutions of (21) are given by the formulae
$$v_{n,n'} = v_{\bar n,n'} = 1, \qquad v_{n,\bar n'} = \sigma^{n}, \qquad v_{\bar n,\bar n'} = \sigma' \sigma^{n}$$
with $\sigma, \sigma' = \pm 1$. They lead to four inequivalent $\mathbb{Z}_2 \ltimes \mathbb{Z}_2$-equivariant structures on the level-$k$ gerbe on $Sp(2r)$ for each choice of the twist element $\zeta \in \mathbb{Z}_2$ and, altogether, to four inequivalent Jandl structures on the quotient gerbe on $Sp(2r)/\mathbb{Z}_2$. The restriction of the 3-cocycle (65)–(67) to the inversion group $\Gamma = \mathbb{Z}_2$ is trivial for any level $k$ and any choice of the twist element $\zeta \in \mathbb{Z}_2$. For such a restriction, the two cohomologically inequivalent solutions of (21) are given by
$$v_{0,0} = v_{\bar 0,0} = v_{0,\bar 0} = 1, \qquad v_{\bar 0,\bar 0} = \pm 1. \tag{68}$$
Altogether, they lead to four inequivalent Jandl structures on the level-$k$ gerbe on $Sp(2r)$.

4.4. The case of $G = D_r = \mathrm{Spin}(2r)$.

The Lie algebra $\mathfrak{g} = spin(2r)$ is composed of the imaginary antisymmetric $(2r) \times (2r)$ matrices, with the Cartan subalgebra $\mathfrak{t} \subset \mathfrak{g}$ spanned by the matrices $e_i$, $i = 1, \dots, r$, with matrix elements $(e_i)_{j,j'} = i\,(\delta_{j,2i}\,\delta_{2i-1,j'} - \delta_{j,2i-1}\,\delta_{2i,j'})$ and $\mathrm{tr}\, e_i e_{i'} = \delta_{i,i'}$. The vertices of the positive Weyl alcove are
$$\tau_0 = 0, \qquad \tau_1 = \lambda_1^\vee, \qquad \tau_i = \tfrac{1}{2}\lambda_i^\vee \ \text{ for } i = 2, \dots, r-2, \qquad \tau_{r-1} = \lambda_{r-1}^\vee, \qquad \tau_r = \lambda_r^\vee,$$
where
$$\lambda_i^\vee = \sum_{j=1}^{i} e_j \ \text{ for } i = 1, \dots, r-2, \qquad \lambda_{r-1}^\vee = \frac{1}{2}\sum_{j=1}^{r-1} e_j - \frac{1}{2} e_r, \qquad \lambda_r^\vee = \frac{1}{2}\sum_{j=1}^{r} e_j$$
are the fundamental coweights corresponding to the simple roots $\alpha_i = e_i - e_{i+1}$, $i = 1, \dots, r-1$, and $\alpha_r = e_{r-1} + e_r$ that coincide with the simple coroots. The subsequent discussion depends on the parity of $r$ and hence will be split into two parts.

Fig. 5. The transformation of the extended Dynkin diagram of $D_{2s+1}$ under $z$

Fig. 6. The inversion map $\kappa$ flips the last two nodes of the Dynkin diagram of $D_{2s+1}$

4.4.1. The subcase of $r$ odd

When $r = 2s+1$, the center $Z(G) \cong \mathbb{Z}_4$ is generated by the element $z = e^{-2\pi i \lambda_r^\vee}$, with $\mathrm{Spin}(2r)/\{1, z^2\} = SO(2r)$. For $\tau \in A$, the relations (22) and (27) are satisfied if we take for $w_z$ and $w_\kappa$ the elements of $\mathrm{Spin}(2r)$ that project to matrices in $SO(2r)$ with the entries
$$(w_z)_{j,j'} = (-1)^{\delta_{j,2r}}\,\delta_{j,2r+1-j'}, \qquad (w_\kappa)_{j,j'} = \sum_{i=1}^{r-1} (\delta_{j,2i-1}\,\delta_{2i,j'} + \delta_{j,2i}\,\delta_{2i-1,j'}) + \delta_{j,2r-1}\,\delta_{2r-1,j'} + \delta_{j,2r}\,\delta_{2r,j'},$$
with the actions of $z$ and $\kappa$ on the positive Weyl alcove reducing to
$$z\tau_0 = \tau_{r-1}, \quad z\tau_1 = \tau_r, \quad z\tau_i = \tau_{r-i} \ \text{ for } i = 2, \dots, r,$$
$$\kappa\tau_i = \tau_i \ \text{ for } i = 0, \dots, r-2, \quad \kappa\tau_{r-1} = \tau_r, \quad \kappa\tau_r = \tau_{r-1}$$
on the vertices. The corresponding symmetries of the Dynkin diagrams are depicted in Fig. 5 and Fig. 6. Note the adjoint action
$$w_z\, e_i\, w_z^{-1} = -(-1)^{\delta_{i,1}}\, e_{r+1-i}. \tag{69}$$

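It is easy to see, comparing first the eigenvalues of the projections of both sides to $SO(2r)$, that
$$w_z = z^{2n_z}\, O_z\, e^{2\pi i \tau_z}\, O_z^{-1} \quad \text{for} \quad \tau_z = \frac{1}{2}\sum_{i=1}^{s} e_i + \frac{1}{4} e_{s+1},$$
$$w_\kappa = z^{2n_\kappa}\, O_\kappa\, e^{2\pi i \tau_\kappa}\, O_\kappa^{-1} \quad \text{for} \quad \tau_\kappa = \frac{1}{2}\sum_{i=1}^{s} e_i,$$
$$w_\kappa w_z = z^{2n_{\kappa z}}\, O_{\kappa z}\, e^{2\pi i \tau_{\kappa z}}\, O_{\kappa z}^{-1} \quad \text{for} \quad \tau_{\kappa z} = \frac{1}{2}\sum_{i=1}^{s-1} e_i + \frac{3}{8} e_s + \frac{1}{8} e_{s+1}$$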


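for some integers $n_z, n_\kappa, n_{\kappa z}$ and $O_z, O_\kappa, O_{\kappa z} \in \mathrm{Spin}(2r)$. The Cartan algebra elements $\tau_z$, $\tau_\kappa$ and $\tau_{\kappa z}$ belong to the positive Weyl alcove $A$. It is easy to see that $4\tau_z$ belongs also to the coweight lattice but not to the coroot lattice. On the other hand, $2\tau_\kappa = \lambda_s^\vee$ belongs to the coroot lattice if and only if $s$ is even. Since $w_z^4$ and $w_\kappa^2$ project to the identity matrix in $SO(2r)$, it follows that
$$w_z^4 = z^2, \qquad w_\kappa^2 = z^{2s}. \tag{70}$$
We also have
$$O_{\kappa z}^{-1}\,(w_\kappa w_z)^2\, O_{\kappa z} = e^{2\pi i (2\tau_{\kappa z})} = \begin{cases} e^{2\pi i (e_1 + \frac{3}{4} e_s + \frac{1}{4} e_{s+1})} & \text{for } s \text{ even}, \\ e^{2\pi i (\frac{3}{4} e_s + \frac{1}{4} e_{s+1})} & \text{for } s \text{ odd} \end{cases}$$
$$= z^{2(s+1)}\, e^{\frac{\pi i}{2}(3e_s + e_{s+1})} = z^{2(s+1)}\, O\, e^{\frac{\pi i}{2}(3e_1 + e_2)}\, O^{-1}$$
for $O \in \mathrm{Spin}(2r)$ that is straightforward to construct. By the relation (22),
$$z^2\, e^{\frac{\pi i}{2}(3e_1 + e_2)} = z^2\, e^{2\pi i (\frac{1}{2}\tau_1 + \frac{1}{2}\tau_2)} = w_z^{-2}\, e^{2\pi i (\frac{1}{2}\tau_0 + \frac{1}{2}\tau_2)}\, w_z^{2} = w_z^{-2}\, e^{\frac{\pi i}{2}(e_1 + e_2)}\, w_z^{2} = O'\, e^{\frac{\pi i}{2}(e_1 + e_r)}\, O'^{-1}.$$
We infer that $(w_\kappa w_z)^2$ is in the same conjugacy class as $z^{2s}\, e^{\frac{\pi i}{2}(e_1 + e_r)}$ and that the latter is different from the conjugacy class of $z^{2(s+1)}\, e^{\frac{\pi i}{2}(e_1 + e_r)}$. On the other hand, it is easy to check that $(w_\kappa w_z)^2$ projects to the same matrix in $SO(2r)$ as $e^{\frac{\pi i}{2}(e_1 + e_r)}$. It follows that
$$(w_\kappa w_z)^2 = z^{2s}\, e^{\frac{\pi i}{2}(e_1 + e_r)},$$
which, together with the second equality in (70), implies that
$$w_\kappa w_z w_\kappa^{-1} = e^{\frac{\pi i}{2}(e_1 + e_r)}\, w_z^{-1}.$$
Using (69), we obtain the relations:
$$w_\kappa w_z^{n} w_\kappa^{-1} w_z^{n} = e^{2\pi i \Delta_n^{+}}, \qquad w_z^{n} w_\kappa w_z^{n} w_\kappa^{-1} = e^{2\pi i \Delta_n^{-}},$$
where
$$\Delta_n^{\pm} = \begin{cases} 0 & \text{for } n = 0, \\ \pm\frac{1}{4}(e_1 \pm e_r) & \text{for } n = 1, \\ \pm\frac{1}{2} e_1 & \text{for } n = 2, \\ \pm\frac{1}{4}(e_1 \mp e_r) & \text{for } n = 3. \end{cases}$$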

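Together with (70), they are all that is needed to find $b_{\gamma,\gamma'}$ for $\gamma, \gamma'$ in the maximal orientifold group $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_4$. We may set
$$b_{n,n'} = b_{\bar n,n'} = \frac{n + n' - [n + n']}{4}\, e_1,$$
$$b_{n,\bar n'} = \frac{n' - n - [n' - n]}{4}\, e_1 + \Delta_n^{-},$$
$$b_{\bar n,\bar n'} = \frac{n' - n - [n' - n]}{4}\, e_1 + s\, e_1 + \Delta_{[n_0+n]}^{+}$$
for $n, n' = 0, 1, 2, 3$. We also have
$$\tau_{z^{-1}0} = \lambda_r^\vee, \qquad \tau_{z^{-2}0} = \lambda_1^\vee, \qquad \tau_{z^{-3}0} = \lambda_{r-1}^\vee.$$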
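The statements about the $SO(2r)$ projections used in this subsection for $r$ odd (that $w_z^4$ and $w_\kappa^2$ project to the identity and that $(w_\kappa w_z)^2$ projects to the same matrix as $e^{\frac{\pi i}{2}(e_1 + e_r)}$) can be verified with integer matrices. The sketch below is only an illustration with hypothetical helper names. It assumes the matrix conventions stated in Sect. 4.4, under which $e^{\frac{\pi i}{2} e_i}$ acts in the plane spanned by the basis vectors $2i-1$ and $2i$ as the block with rows $(0, 1)$ and $(-1, 0)$.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def d_odd_matrices(r):
    """SO(2r) projections of w_z and w_kappa for Spin(2r), r odd."""
    n = 2 * r
    w_z = [[0] * n for _ in range(n)]
    w_kappa = [[0] * n for _ in range(n)]
    for j in range(1, n + 1):
        # (w_z)_{j,j'} = (-1)^{delta_{j,2r}} delta_{j,2r+1-j'}
        w_z[j - 1][n - j] = -1 if j == n else 1
    for i in range(1, r):
        a, b = 2 * i - 2, 2 * i - 1  # zero-based positions of indices 2i-1, 2i
        w_kappa[a][b] = w_kappa[b][a] = 1
    w_kappa[n - 2][n - 2] = w_kappa[n - 1][n - 1] = 1
    return w_z, w_kappa


def quarter_turns_in_first_and_last_plane(r):
    """Matrix of exp((pi i / 2)(e_1 + e_r)) in the conventions of the text."""
    n = 2 * r
    m = identity(n)
    for a in (0, n - 2):
        m[a][a] = m[a + 1][a + 1] = 0
        m[a][a + 1], m[a + 1][a] = 1, -1
    return m


def check_d_odd(r):
    """True if the projected relations quoted in the text hold."""
    n = 2 * r
    w_z, w_kappa = d_odd_matrices(r)
    w_z2 = matmul(w_z, w_z)
    kz = matmul(w_kappa, w_z)
    return (matmul(w_z2, w_z2) == identity(n)
            and w_z2 != identity(n)
            and matmul(w_kappa, w_kappa) == identity(n)
            and matmul(kz, kz) == quarter_turns_in_first_and_last_plane(r))


if __name__ == "__main__":
    for r in (3, 5, 7):
        print(r, check_d_odd(r))
```

The check only concerns the images in $SO(2r)$; the central factors $z^{2s}$ appearing in (70) and in the formula for $(w_\kappa w_z)^2$ are invisible at this level and have to be fixed by the conjugacy-class argument given in the text.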

Rather than displaying the corresponding obstruction 3-cocycle (32) in full, we shall focus on its specific components. First, for the inversion group $\Gamma = \mathbb{Z}_2$, the only entry of the 3-cocycle different from 1 is
$$u_{\bar 0,\bar 0,\bar 0} = e^{\frac{\pi i}{2} k r\,(n_0 + 2\delta_{n_0,3})}.$$
The trivializing cochain may be given by the formulae
$$v_{\bar 0,\bar 0} = \pm\, e^{-\frac{\pi i}{4} k r\,(n_0 + 2\delta_{n_0,3})}, \qquad v_{0,0} = v_{\bar 0,0} = v_{0,\bar 0} = 1,$$
with the two signs corresponding to cohomologically inequivalent solutions. Next, we pass to the case of orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_m$ with $m = 2, 4$. The restriction of the obstruction 3-cocycle to the orbifold group $\mathbb{Z}_4$ is
$$u_{n,n',n''} = (-1)^{k n\, \frac{n' + n'' - [n' + n'']}{4}}.$$

It is not trivializable if $k$ is odd, see [15]. On the other hand, its further restriction to $\mathbb{Z}_2 \subset \mathbb{Z}_4$ is trivial for all $k$. In order to proceed further, we note that the scalar product $\mathrm{tr}\, \tau_{z^{-n}0}\, e_1$ takes values in integers if $n$ is even and in half-integers if $n$ is odd. It follows that, for $k$ even, only the terms $\Delta^{\pm}$ in $b_{\gamma,\gamma'}$ contribute to $u_{\gamma,\gamma',\gamma''}$ if $m = 4$. This is still the case if $m = 2$. Indeed, if the twist element $\zeta \in \mathbb{Z}_2 \subset \mathbb{Z}_4$, then $\mathrm{tr}\, \tau_{z^{-n}0}\, e_1$ and $\mathrm{tr}\, \tau_{(z_0 z^n)^{-1}0}\, e_1$ take integral values because $n = 0, 2$. Conversely, if $\zeta \in \mathbb{Z}_4 \setminus \mathbb{Z}_2$, then a straightforward check shows that the combination $X$ of (41) is equal to $(-1)^k$ for $n = 2$, thereby contradicting the trivializability of the obstruction cocycle for $k$ odd. Summarizing, we obtain the condition
$$k \ \text{ is even if } \ m = 4 \ \text{ or } \ m = 2 \ \text{ and } \ n_0 \ \text{ is odd}$$
under which only the terms $\Delta^{\pm}$ in $b_{\gamma,\gamma'}$ contribute to the obstruction cocycle $u_{\gamma,\gamma',\gamma''}$. With this observation in mind, we obtain, for $m = 4$ or for $m = 2$ and $n_0$ odd, i.e. in both cases in which $k$ has to be even, the following expressions for the obstruction cocycle:
$$u_{n,n',n''} = u_{\bar n,n',n''} = u_{n,\bar n',n''} = u_{\bar n,\bar n',n''} = 1,$$
$$u_{n,n',\bar n''} = (-1)^{\frac{1}{2} k (1-\delta_{n,0})(1-\delta_{n',0})(1-\delta_{n,n'})},$$
$$u_{\bar n,n',\bar n''} = (-1)^{\frac{1}{2} k (1-\delta_{[n_0+n],0})(1-\delta_{n',0})(1-\delta_{[n_0+n],n'})},$$
$$u_{n,\bar n',\bar n''} = (-1)^{\frac{1}{2} k (1-\delta_{n,0})(1-\delta_{[n_0+n'],0})(1-\delta_{[n_0+n+n'],0})},$$
$$u_{\bar n,\bar n',\bar n''} = (-1)^{\frac{1}{2} k (1-\delta_{[n_0+n],0})(1-\delta_{[n_0+n'],0})(1-\delta_{[2n_0+n+n'],0})}.$$
Similarly, for $m = 2$ and $n_0$ even, in which case $k$ can be any integer,
$$u_{n,n',n''} = u_{\bar n,n',n''} = u_{n,\bar n',n''} = u_{\bar n,\bar n',n''} = 1,$$
$$u_{n,n',\bar n''} = (-1)^{k\, \delta_{n,2}\, \delta_{n',2}}, \qquad u_{\bar n,n',\bar n''} = (-1)^{k\, \delta_{[n_0+n],2}\, \delta_{n',2}},$$
$$u_{n,\bar n',\bar n''} = (-1)^{k\, \delta_{n,2}\, \delta_{[n_0+n'],2}}, \qquad u_{\bar n,\bar n',\bar n''} = (-1)^{k\, \delta_{[n_0+n],2}\, \delta_{[n_0+n'],2}}.$$


In all these cases, there exists a trivializing cochain. It may be taken in the form 1

vn,n = 1,

vn,n = σ 4 mn , 1

πi

vn,n = e 4 k(n+2δn,3 ) ,

πi

vn,n = σ σ 4 mn e− 4 k([n 0 +n]+2δ[n0 +n],3 ) ,

with different signs $\sigma, \sigma' = \pm 1$ giving four cohomologically inequivalent solutions.

In summary, for each $k$ and each choice of the twist element $\zeta \in \mathbb{Z}_4$, there are two inequivalent Jandl structures on the level-$k$ gerbe on $\mathrm{Spin}(2r)$ with $r$ odd. For each $k$ even and each choice of the twist element $\zeta \in \mathbb{Z}_4$, and for each $k$ odd and $\zeta \in \mathbb{Z}_2 \subset \mathbb{Z}_4$, there are four inequivalent $\mathbb{Z}_2 \ltimes \mathbb{Z}_2$-equivariant structures on the level-$k$ gerbe on $\mathrm{Spin}(2r)$, giving rise, altogether, to eight inequivalent Jandl structures on the induced gerbe on $SO(2r)$ when $k$ is even and to four when $k$ is odd. Finally, for each $k$ even and each choice of $\zeta \in \mathbb{Z}_4$, there are four inequivalent $\mathbb{Z}_2 \ltimes \mathbb{Z}_4$-equivariant structures on the level-$k$ gerbe on $\mathrm{Spin}(2r)$, giving rise, altogether, to four inequivalent Jandl structures on the induced gerbe on $\mathrm{Spin}(2r)/\mathbb{Z}_4$. Note that the count is similar to that for the group $SU(4)$.

4.4.2. The subcase of $r$ even

For $r = 2s$, the center $Z(G) \cong \mathbb{Z}_2 \times \mathbb{Z}_2$ is generated by $z_1 = e^{-2\pi i \lambda_r^\vee}$ and $z_2 = e^{-2\pi i \lambda_1^\vee}$, with $\mathrm{Spin}(2r)/\{1, z_2\} = SO(2r)$. For $\tau \in A$ and $z = z_1, z_2$, the relations (22) are satisfied if we take for $w_{z_1}$ and $w_{z_2}$ the elements of $\mathrm{Spin}(2r)$ that project to the $SO(2r)$ matrices with entries (see footnote 5)
$$(w_{z_1})_{j,j'} = \begin{cases} -\delta_{j,2r+1-j'} & \text{for } j = 1, \dots, r, \\ \delta_{j,2r+1-j'} & \text{for } j = r+1, \dots, 2r, \end{cases}$$
$$(w_{z_2})_{j,j'} = \delta_{j,1}\,\delta_{2,j'} + \delta_{j,2}\,\delta_{1,j'} + \delta_{j,2r-1}\,\delta_{j',2r} + \delta_{j,2r}\,\delta_{2r-1,j'} + \sum_{i=3}^{2r-2} \delta_{j,i}\,\delta_{i,j'},$$
with the actions of $z_1$ and $z_2$ on the positive Weyl alcove reducing to
$$z_1\tau_i = \tau_{r-i}, \qquad z_2\tau_0 = \tau_1, \quad z_2\tau_1 = \tau_0, \quad z_2\tau_i = \tau_i \ \text{ for } i = 2, \dots, r-2, \quad z_2\tau_{r-1} = \tau_r, \quad z_2\tau_r = \tau_{r-1}$$
on the vertices. The corresponding symmetries of the extended Dynkin diagram are depicted in Fig. 7 and Fig. 8. The adjoint action of $w_{z_1}$ and $w_{z_2}$ on the Cartan algebra is given by the equations
$$w_{z_1}\, e_i\, w_{z_1}^{-1} = -e_{r-i+1}, \qquad w_{z_2}\, e_i\, w_{z_2}^{-1} = (-1)^{\delta_{i,1} + \delta_{i,r}}\, e_i.$$
The relations (27) are, in turn, satisfied for $\tau \in A$ if we take for $w_\kappa$ the element of $\mathrm{Spin}(2r)$ that projects to an $SO(2r)$ matrix with entries
$$(w_\kappa)_{j,j'} = \sum_{i=1}^{r} (\delta_{j,2i-1}\,\delta_{2i,j'} + \delta_{j,2i}\,\delta_{2i-1,j'}),$$

Footnote 5: For later convenience, we make a choice different from that in [15].

Fig. 7. The transformation of the extended Dynkin diagram of $D_{2s}$ under $z_1$

Fig. 8. The transformation of the extended Dynkin diagram of $D_{2s}$ under $z_2$

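with the trivial action of $\kappa$ on the positive Weyl alcove. We have the relations
$$w_{z_1} = z_2^{n_{z_1}}\, O_{z_1}\, e^{2\pi i \tau_{z_1}}\, O_{z_1}^{-1} \quad \text{for} \quad \tau_{z_1} = -\tfrac{1}{2}\lambda_r^\vee,$$
$$w_{z_2} = z_2^{n_{z_2}}\, O_{z_2}\, e^{2\pi i \tau_{z_2}}\, O_{z_2}^{-1} \quad \text{for} \quad \tau_{z_2} = -\tfrac{1}{2}\lambda_1^\vee,$$
$$w_{z_1} w_{z_2} = z_2^{n_{z_1 z_2}}\, O_{z_1 z_2}\, e^{2\pi i \tau_{z_1 z_2}}\, O_{z_1 z_2}^{-1} \quad \text{for} \quad \tau_{z_1 z_2} = \tfrac{1}{2}(\lambda_1^\vee - \lambda_r^\vee),$$
$$w_\kappa = z_2^{n_\kappa}\, O_\kappa\, e^{2\pi i \tau_\kappa}\, O_\kappa^{-1} \quad \text{for} \quad \tau_\kappa = \tfrac{1}{2}\lambda_s^\vee,$$
$$w_\kappa w_{z_1} = z_2^{n_{\kappa z_1}}\, O_{\kappa z_1}\, e^{2\pi i \tau_{\kappa z_1}}\, O_{\kappa z_1}^{-1} \quad \text{for} \quad \tau_{\kappa z_1} = \tfrac{1}{2}(\lambda_s^\vee - \lambda_r^\vee),$$
$$w_\kappa w_{z_2} = z_2^{n_{\kappa z_2}}\, O_{\kappa z_2}\, e^{2\pi i \tau_{\kappa z_2}}\, O_{\kappa z_2}^{-1} \quad \text{for} \quad \tau_{\kappa z_2} = -\tfrac{1}{2}(\lambda_1^\vee - \lambda_s^\vee),$$
from which it follows that
$$w_{z_1}^2 = z_1, \qquad w_{z_2}^2 = z_2, \qquad (w_{z_1} w_{z_2})^2 = z_1 z_2, \qquad w_\kappa^2 = z_2^{s}, \tag{71}$$
$$(w_\kappa w_{z_1})^2 = z_1 z_2^{s}, \qquad (w_\kappa w_{z_2})^2 = z_2^{s+1}, \qquad (w_\kappa w_{z_1} w_{z_2})^2 = z_1 z_2^{s+1}. \tag{72}$$
The last equality is a consequence of the previous ones since
$$(w_\kappa w_{z_1} w_{z_2})^2\, (w_\kappa w_{z_2})^{-2} = w_\kappa w_{z_1} w_{z_2} w_\kappa w_{z_1} w_\kappa^{-1} w_{z_2}^{-1} w_\kappa^{-1}$$
$$= z_2^{s}\, w_\kappa w_{z_1} w_{z_2}\, (w_\kappa w_{z_1})^2\, w_{z_1}^{-1} w_{z_2}^{-1} w_\kappa^{-1} = z_2\, w_\kappa\, (w_{z_1} w_{z_2})^2\, w_\kappa^{-1} = z_1.$$
Note that the relations (71) and (72) imply that $w_{z_1}$, $w_{z_2}$ and $w_\kappa$ all commute. This will lead to simple expressions for the obstruction cocycle.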


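Similarly as in the case of groups with cyclic centers, we shall use the abbreviated notation
$$z_1^{n_1} z_2^{n_2} \equiv n_1 n_2, \qquad z_0\, z_1^{n_1} z_2^{n_2} \equiv \overline{n_1 n_2}$$
for the elements of the orientifold group $\mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$, setting
$$w_{n_1 n_2} = w_{z_1}^{n_1} w_{z_2}^{n_2}, \qquad w_{\overline{n_1 n_2}} = w_\kappa\, w_{z_1}^{n_{01}} w_{z_2}^{n_{02}}\, w_{z_1}^{n_1} w_{z_2}^{n_2}$$
for $n_1, n_2, n_{01}, n_{02} = 0, 1$ and for the twist element $\zeta = z_1^{n_{01}} z_2^{n_{02}} \equiv n_{01} n_{02}$. It is easy to show, with the help of (71) and (72), that the Cartan algebra elements $b_{\gamma,\gamma'}$ may be taken in the form
$$b_{n_1 n_2,\, n_1' n_2'} = b_{\overline{n_1 n_2},\, n_1' n_2'} = b_{n_1 n_2,\, \overline{n_1' n_2'}} = n_1 n_1'\, \lambda_r^\vee + n_2 n_2'\, \lambda_1^\vee,$$
$$b_{\overline{n_1 n_2},\, \overline{n_1' n_2'}} = (n_{01} + n_1 n_1')\, \lambda_r^\vee + (s + n_{02} + n_2 n_2')\, \lambda_1^\vee.$$
Employing the relations
$$\tau_{(z_1^{n_1} z_2^{n_2})^{-1} 0} = (1 - n_1)\, n_2\, \lambda_1^\vee + n_1 n_2\, \lambda_{r-1}^\vee + n_1 (1 - n_2)\, \lambda_r^\vee,$$
together with
$$\mathrm{tr}(\lambda_1^\vee)^2 = 1, \qquad \mathrm{tr}\, \lambda_1^\vee \lambda_{r-1}^\vee = \mathrm{tr}\, \lambda_1^\vee \lambda_r^\vee = \tfrac{1}{2}, \qquad \mathrm{tr}\, \lambda_{r-1}^\vee \lambda_r^\vee = \tfrac{s-1}{2}, \qquad \mathrm{tr}(\lambda_r^\vee)^2 = \tfrac{s}{2},$$
we obtain from the definition (32) the explicit expressions for the obstruction cocycle
$$u_{n_1 n_2,\, n_1' n_2',\, n_1'' n_2''} = u_{n_1 n_2,\, \overline{n_1' n_2'},\, n_1'' n_2''} = u_{n_1 n_2,\, n_1' n_2',\, \overline{n_1'' n_2''}} = (-1)^{k (s\, n_1 n_1' n_1'' + n_1 n_2' n_2'' + n_2 n_1' n_1'')}, \tag{73}$$
$$u_{n_1 n_2,\, \overline{n_1' n_2'},\, \overline{n_1'' n_2''}} = (-1)^{k (s\, n_1 (1 + n_{01} + n_1' n_1'') + n_1 (n_{02} + n_2' n_2'') + n_2 (n_{01} + n_1' n_1''))}, \tag{74}$$
$$u_{\overline{n_1 n_2},\, n_1' n_2',\, n_1'' n_2''} = u_{\overline{n_1 n_2},\, \overline{n_1' n_2'},\, n_1'' n_2''} = u_{\overline{n_1 n_2},\, n_1' n_2',\, \overline{n_1'' n_2''}} = (-1)^{k (s (n_{01} + n_1) n_1' n_1'' + (n_{01} + n_1) n_2' n_2'' + (n_{02} + n_2) n_1' n_1'')}, \tag{75}$$
$$u_{\overline{n_1 n_2},\, \overline{n_1' n_2'},\, \overline{n_1'' n_2''}} = (-1)^{k (s (n_{01} + n_1)(1 + n_{01} + n_1' n_1'') + (n_{01} + n_1)(n_{02} + n_2' n_2'') + (n_{02} + n_2)(n_{01} + n_1' n_1''))} \tag{76}$$
that can easily be analyzed. First, we note that the restriction of $u_{\gamma,\gamma',\gamma''}$ to the inversion group $\Gamma = \mathbb{Z}_2$ is trivial, with the formulae
$$v_{00,00} = v_{\overline{00},00} = v_{00,\overline{00}} = 1, \qquad v_{\overline{00},\overline{00}} = \pm 1,$$
providing two cohomologically inequivalent trivializing cochains. As the next case, let us consider the restriction of the obstruction cocycle to the orientifold subgroup $\Gamma = \mathbb{Z}_2 \ltimes Z$ with $Z = \{1, z_2\}$. Since the combination $X$ of (41)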


with $n = 01$ is easily calculated to be equal to $(-1)^{k n_{01}}$, we infer that the obstruction cocycle restricted to $\Gamma$ may be trivialized only if
$$k \ \text{ is even if } \ n_{01} = 1.$$
Under this condition, the restricted cocycle becomes trivial for all choices of the twist element. Passing to the orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes Z$ with $Z = \{1, z_1\}$, $Z = \{1, z_1 z_2\}$ or $Z = \mathbb{Z}_2 \times \mathbb{Z}_2$, we recall from [15] that, in all these three cases, the restriction of the obstruction cocycle to the orbifold group $Z$ may be trivialized only under the condition that
$$k \ \text{ is even if } \ s = \tfrac{r}{2} \ \text{ is odd.} \tag{77}$$
If $k$ is even the whole obstruction cocycle becomes trivial. Suppose then that $k$ is odd but $s$ is even so that the terms multiplied by $s$ may be dropped in the explicit expression for the cocycle. The combinations $X$ of (41) with $n = 10$ and $n = 11$ are now easily calculated to take the values $(-1)^{k n_{02}}$ and $(-1)^{k (n_{01} + n_{02})}$, respectively. For $Z = \{1, z_1\}$, we then obtain the condition
$$k \ \text{ is even if } \ n_{02} = 1, \tag{78}$$
and, for $Z = \{1, z_1 z_2\}$, the condition
$$k \ \text{ is even if } \ n_{01} + n_{02} \ \text{ is odd.} \tag{79}$$
The obstruction cocycle restricted to $\Gamma = \mathbb{Z}_2 \ltimes Z$ with $Z = \{1, z_1\}$ or $Z = \{1, z_1 z_2\}$ becomes trivial under the conditions (77) and (78) or (77) and (79), respectively. Finally, for the maximal orientifold group, the conditions (77), (78) and (79) must hold simultaneously, implying that if the twist element $\zeta \neq 1$ then the obstruction cocycle can be trivialized only if $k$ is even. On the other hand, the trivializability of $u_{\gamma,\gamma',\gamma''}$ cannot depend on the choice of the twist element in this case so that for $\Gamma = \mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$ the cohomological equation (21) has a solution only if
$$k \ \text{ is even,} \tag{80}$$
whatever the choice of the twist element. Indeed, if $\zeta = 1$ then the combination $X$ of (41) with $n = 10$ takes the value $(-1)^k$ for the cocycle $u'_{\gamma,\gamma',\gamma''}$ obtained by composing $u_{\gamma,\gamma',\gamma''}$ with the automorphism $h_{z_1}$ of $\Gamma$, see (26). Since $u'_{\gamma,\gamma',\gamma''}$ is trivializable if and only if $u_{\gamma,\gamma',\gamma''}$ is, the condition (80) for the trivial twist element follows. For all orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes Z$ with a non-trivial orbifold subgroup $Z$, the obstruction cocycle $u_{\gamma,\gamma',\gamma''}$ of (73)–(76) is then trivial whenever it may be trivialized. Sixteen cohomologically inequivalent trivializing 2-cocycles $v_{\gamma,\gamma'}$ on $\Gamma = \mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$ are given by the formulae

vn 1 n 2 ,n 1 n 2 = vn 1 n 2 ,n 1 n 2 = σ n 2 n 1 , n

n

vn 1 n 2 ,n 1 n 2 = σ n 2 n 1 σ1 1 σ2 2 ,

n

n

vn 1 n 2 ,n 1 n 2 = σ σ n 2 n 1 σ1 1 σ2 2


with $\sigma, \sigma_1, \sigma_2, \sigma' = \pm 1$. In particular, the choice of $\sigma$ distinguishes two inequivalent restrictions of the 2-cocycle $v_{\gamma,\gamma'}$ to the orbifold group $\mathbb{Z}_2 \times \mathbb{Z}_2$ that give rise to two inequivalent gerbes on $\mathrm{Spin}(2r)/(\mathbb{Z}_2 \times \mathbb{Z}_2)$, see [15]. For $\Gamma = \mathbb{Z}_2 \ltimes Z$ with $Z \cong \mathbb{Z}_2$, four inequivalent cohomologically non-trivial 2-cocycles are obtained from the above expressions (with, say, $\sigma = 1$) by restriction.

Let us summarize the results for the group $\mathrm{Spin}(2r)$ with $r$ even. First, for each $k$ and each of the four choices of the twist element, there are two inequivalent Jandl structures on the level-$k$ gerbe on $\mathrm{Spin}(2r)$. Next, for each $k$ even and each choice of the twist element, there are four inequivalent $\Gamma$-equivariant structures on the level-$k$ gerbe on $\mathrm{Spin}(2r)$ for $\Gamma = \mathbb{Z}_2 \ltimes Z$ with $Z \cong \mathbb{Z}_2$. They give rise to eight inequivalent Jandl structures on the induced gerbe on $\mathrm{Spin}(2r)/Z$. For such orientifold groups and $k$ odd, there exist four inequivalent $\Gamma$-equivariant structures only if the twist belongs to $Z$ and, for $Z = \{1, z_1\}$ or $Z = \{1, z_1 z_2\}$, if, additionally, $s = \frac{r}{2}$ is even. For fixed $Z$, we thus obtain four inequivalent Jandl structures on the induced gerbe on $\mathrm{Spin}(2r)/Z$. Finally, for the maximal orientifold group $\Gamma = \mathbb{Z}_2 \ltimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$ and each $k$ even, there exist sixteen inequivalent $\Gamma$-equivariant structures on the level-$k$ gerbe on $\mathrm{Spin}(2r)$ for each choice of the twist element. They give rise to, altogether, eight inequivalent Jandl structures on each of the two inequivalent gerbes induced on $\mathrm{Spin}(2r)/(\mathbb{Z}_2 \times \mathbb{Z}_2)$.

4.5. The case of $G = E_6$.

As in Sect. 4.7 of [15], we identify the Cartan algebra $\mathfrak{t}$ of $E_6$ with the subspace of $\mathbb{R}^7$ composed of the vectors with the first six coordinates summing to zero. The Killing form is inherited from the scalar product in $\mathbb{R}^7$. The vertices of the positive Weyl alcove $A$ are
$$\tau_0 = 0, \quad \tau_1 = \lambda_1^\vee, \quad \tau_2 = \tfrac{1}{2}\lambda_2^\vee, \quad \tau_3 = \tfrac{1}{3}\lambda_3^\vee, \quad \tau_4 = \tfrac{1}{2}\lambda_4^\vee, \quad \tau_5 = \lambda_5^\vee, \quad \tau_6 = \tfrac{1}{2}\lambda_6^\vee$$
for the fundamental coweights $\lambda_i^\vee$ corresponding to the simple roots $\alpha_i = e_i - e_{i+1}$ for $i = 1, \dots, 5$, $\alpha_6 = \frac{1}{2}(-e_1 - e_2 - e_3 + e_4 + e_5 + e_6) + \frac{1}{\sqrt{2}} e_7$, where $e_i$ are the vectors of the canonical basis of $\mathbb{R}^7$. The positive roots have the form $e_i - e_j$ for $1 \le i < j \le 6$ and $\frac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4 \pm e_5 \pm e_6) + \frac{1}{\sqrt{2}} e_7$ with three signs $+$ and three signs $-$, and $\phi = \sqrt{2}\, e_7$ (the highest root). The center $Z(E_6) \cong \mathbb{Z}_3$ is generated by $z = e^{-2\pi i \lambda_5^\vee}$. We shall construct the elements $w_z$ and $w_\kappa$ entering the relations (22) and (27) in terms of group elements $w_\alpha = e^{\frac{\pi i}{2}(e_\alpha + e_{-\alpha})}$ that implement the Weyl reflections $r_\alpha$ in roots $\alpha$, acting on the Cartan algebra by
$$\tau \longmapsto w_\alpha\, \tau\, w_\alpha^{-1} = \tau - \alpha^\vee\, \mathrm{tr}\, \tau \alpha \equiv r_\alpha(\tau),$$
where $e_{\pm\alpha}$ and $\alpha^\vee$ are, respectively, the step generators and the coroot associated to $\alpha$. One has
$$w_\alpha^2 = e^{\pi i \alpha^\vee}.$$
Besides, since $[e_\alpha, e_\beta]$ does not vanish only if $\alpha + \beta$ is a root, $w_\alpha$ and $w_\beta$ commute if neither $\alpha + \beta$ nor $\alpha - \beta$ is a root. The relation (27) is satisfied for $\tau \in A$ if we take
$$w_\kappa = w_{\alpha_3}\, w_{\alpha_2 + \alpha_3 + \alpha_4}\, w_{\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 + \alpha_5}\, w_\phi, \tag{81}$$

Fig. 9. The Weyl reflection of the Dynkin diagram of $E_6$ under $\kappa$

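with the action of $\kappa$ on $A$ reducing to
$$\kappa\tau_0 = \tau_0, \qquad \kappa\tau_i = \tau_{6-i} \ \text{ for } i = 1, \dots, 5, \qquad \kappa\tau_6 = \tau_6$$
on the vertices and thereby giving rise to the symmetry of the Dynkin diagram represented in Fig. 9. It is easy to check that all the factors on the right-hand side of (81) commute so that
$$w_\kappa^2 = w_{\alpha_3}^2\, w_{\alpha_2 + \alpha_3 + \alpha_4}^2\, w_{\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 + \alpha_5}^2\, w_\phi^2 = e^{\pi i (\alpha_3^\vee + \alpha_2^\vee + \alpha_3^\vee + \alpha_4^\vee + \alpha_1^\vee + \alpha_2^\vee + \alpha_3^\vee + \alpha_4^\vee + \alpha_5^\vee + \phi^\vee)} = 1.$$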
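The last equality, $w_\kappa^2 = 1$, amounts to the statement that half of the exponent $X = \alpha_3^\vee + (\alpha_2 + \alpha_3 + \alpha_4)^\vee + (\alpha_1 + \dots + \alpha_5)^\vee + \phi^\vee$ lies in the coroot lattice, because $e^{2\pi i Y} = 1$ for $Y$ in the coroot lattice of a simply connected group. In the realization of the roots of $E_6$ given above all roots have squared length 2, so coroots coincide with roots. The sketch below checks the stronger fact that $X/2$ is itself a positive root. It is an illustration only: vectors in $\mathbb{R}^7$ are stored as six rational coordinates plus a rational coefficient $c$ of $e_7/\sqrt{2}$, a representation chosen here for exactness, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def vec(coords, c=0):
    """Vector sum_i coords[i] e_{i+1} + (c / sqrt(2)) e_7, stored exactly."""
    return (tuple(Fraction(x) for x in coords), Fraction(c))


def add(u, v):
    """Sum of two vectors in the representation used by vec()."""
    return (tuple(a + b for a, b in zip(u[0], v[0])), u[1] + v[1])


def half(u):
    """One half of a vector in the representation used by vec()."""
    return (tuple(a / 2 for a in u[0]), u[1] / 2)


def unit_diff(i, j):
    """The root e_i - e_j with 1 <= i, j <= 6."""
    coords = [0] * 6
    coords[i - 1] = 1
    coords[j - 1] = -1
    return vec(coords)


def positive_roots_e6():
    """Positive roots of E_6 in the realization described in the text."""
    roots = {unit_diff(i, j) for i in range(1, 7) for j in range(i + 1, 7)}
    for plus in combinations(range(6), 3):
        coords = [Fraction(1, 2) if idx in plus else Fraction(-1, 2)
                  for idx in range(6)]
        roots.add(vec(coords, 1))   # (1/2)(+-e_1 ... +-e_6) + e_7 / sqrt(2)
    roots.add(vec([0] * 6, 2))      # highest root phi = sqrt(2) e_7
    return roots


def exponent_of_w_kappa_squared():
    """X = alpha_3 + (alpha_2+alpha_3+alpha_4) + (alpha_1+...+alpha_5) + phi."""
    alpha = {i: unit_diff(i, i + 1) for i in range(1, 6)}
    phi = vec([0] * 6, 2)
    total = alpha[3]
    total = add(total, add(add(alpha[2], alpha[3]), alpha[4]))
    for i in range(1, 6):
        total = add(total, alpha[i])
    return add(total, phi)


if __name__ == "__main__":
    x = exponent_of_w_kappa_squared()
    print(len(positive_roots_e6()), half(x) in positive_roots_e6())
```

Here $X/2 = \frac{1}{2}(e_1 + e_2 + e_3 - e_4 - e_5 - e_6) + \frac{1}{\sqrt{2}} e_7$, which has three signs $+$ and three signs $-$ and hence appears in the list of positive roots.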

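As observed in [15], there is another set of simple roots
$$\beta_1 = \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4, \qquad \beta_2 = \alpha_3 + \alpha_4 + \alpha_5 + \alpha_6,$$
$$\beta_3 = -\alpha_1 - \alpha_2 - 2\alpha_3 - \alpha_4 - \alpha_5 - \alpha_6,$$
$$\beta_4 = \alpha_1 + \alpha_2 + \alpha_3 + \alpha_6, \qquad \beta_5 = \alpha_2 + \alpha_3 + \alpha_4 + \alpha_5,$$
$$\beta_6 = \alpha_3$$
such that (22) may be satisfied for $\tau \in A$ if the adjoint action of $w_z$ on the Cartan algebra is given by the product of the Weyl reflections
$$w_z\, \tau\, w_z^{-1} = r_{\beta_1} r_{\beta_4} r_{\beta_5} r_{\beta_2}(\tau),$$
with the action of $z$ on $A$ reducing to
$$z\tau_0 = \tau_1, \quad z\tau_1 = \tau_5, \quad z\tau_2 = \tau_4, \quad z\tau_3 = \tau_3, \quad z\tau_4 = \tau_6, \quad z\tau_5 = \tau_0, \quad z\tau_6 = \tau_2$$
on the vertices and thus giving rise to the symmetry of the extended Dynkin diagram depicted in Fig. 10. Note that $w_\kappa\, \beta_1^\vee\, w_\kappa^{-1} = -\beta_5^\vee$ and $w_\kappa\, \beta_2^\vee\, w_\kappa^{-1} = -\beta_4^\vee$. It follows that
$$w_\kappa\, e_{\pm\beta_1}\, w_\kappa^{-1} = \mu_1^{\mp 1}\, e_{\mp\beta_5}, \qquad w_\kappa\, e_{\pm\beta_2}\, w_\kappa^{-1} = \mu_2^{\mp 1}\, e_{\mp\beta_4}$$
for some $\mu_1$ and $\mu_2$ of absolute value 1. Hence,
$$w_\kappa\, w_{\beta_1}\, w_\kappa^{-1} = e^{\frac{\pi i}{2}(\mu_1 e_{\beta_5} + \bar\mu_1 e_{-\beta_5})}, \qquad w_\kappa\, w_{\beta_2}\, w_\kappa^{-1} = e^{\frac{\pi i}{2}(\mu_2 e_{\beta_4} + \bar\mu_2 e_{-\beta_4})}.$$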

Fig. 10. The transformation of the extended Dynkin diagram of $E_6$ under $z$

Since conjugation by $e^{\frac{\pi i}{2}(\mu e_\alpha + \bar\mu e_{-\alpha})}$ induces the Weyl reflection $r_\alpha$ on the Cartan algebra for all $\mu$ with $|\mu| = 1$, we may set
$$w_z = w_{\beta_1}\, w_\kappa w_{\beta_2} w_\kappa^{-1}\, w_\kappa w_{\beta_1} w_\kappa^{-1}\, w_{\beta_2}.$$
The elements $e_{\pm\beta_i}$ with $i = 1, \dots, 5$ generate an $su(6)$ subalgebra of the Lie algebra of $E_6$. The coroots $\beta_i^\vee$ may be taken as its simple coroots and $e_{\pm\beta_i}$ as its step generators. Clearly, $w_z$ belongs to the $SU(6)$ subgroup of $E_6$ corresponding to this subalgebra and, with the standard identification of the simple roots and the step generators of $su(6)$ in terms of matrices,
$$w_z = \begin{pmatrix} 0 & 0 & -1 & 0 & 0 & 0 \\ i & 0 & 0 & 0 & 0 & 0 \\ 0 & i & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -\mu_1\mu_2 \\ 0 & 0 & 0 & i\bar\mu_2 & 0 & 0 \\ 0 & 0 & 0 & 0 & i\bar\mu_1 & 0 \end{pmatrix} \in SU(6) \subset E_6.$$
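The third power of this $SU(6)$ matrix can be computed numerically for arbitrary phases $\mu_1, \mu_2$. The sketch below assumes the block form read off from the text (a $3 \times 3$ block with entries $-1, i, i$ and a $3 \times 3$ block with entries $-\mu_1\mu_2, i\bar\mu_2, i\bar\mu_1$); the angles `theta1` and `theta2` parametrizing the phases are hypothetical inputs, and the result is compared with the identity up to floating-point rounding.

```python
import cmath


def w_z_matrix(theta1, theta2):
    """6 x 6 matrix of w_z for mu_1 = exp(i theta1), mu_2 = exp(i theta2)."""
    mu1 = cmath.exp(1j * theta1)
    mu2 = cmath.exp(1j * theta2)
    i = 1j
    return [
        [0, 0, -1, 0, 0, 0],
        [i, 0, 0, 0, 0, 0],
        [0, i, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, -mu1 * mu2],
        [0, 0, 0, i * mu2.conjugate(), 0, 0],
        [0, 0, 0, 0, i * mu1.conjugate(), 0],
    ]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def deviation_from_identity(m):
    """Largest absolute entry of m minus the identity matrix."""
    n = len(m)
    return max(abs(m[r][c] - (1 if r == c else 0))
               for r in range(n) for c in range(n))


def cube_deviation(theta1, theta2):
    """How far w_z^3 is from the identity for the given phases."""
    w = w_z_matrix(theta1, theta2)
    return deviation_from_identity(matmul(matmul(w, w), w))


if __name__ == "__main__":
    for angles in [(0.0, 0.0), (0.3, 1.1), (2.0, -0.7)]:
        print(angles, cube_deviation(*angles))
```

Each $3 \times 3$ block is a cyclic permutation matrix with phases whose product is $1$, which is why the cube is the identity independently of $\mu_1$ and $\mu_2$.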

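The relation $w_z^3 = 1$ follows by raising the above matrix to the third power. Let us further note that, since $[e_{\beta_1}, e_{\pm\beta_4}] = 0$ and $[e_{\beta_2}, e_{\pm\beta_5}] = 0$, we have the commutation relations
$$w_{\beta_1}\, w_\kappa w_{\beta_2} w_\kappa^{-1} = w_\kappa w_{\beta_2} w_\kappa^{-1}\, w_{\beta_1}, \qquad w_{\beta_2}\, w_\kappa w_{\beta_1} w_\kappa^{-1} = w_\kappa w_{\beta_1} w_\kappa^{-1}\, w_{\beta_2}.$$
Using these identities and the equality $w_\kappa = w_\kappa^{-1}$, we infer that
$$(w_\kappa w_z)^2 = w_\kappa w_{\beta_1} w_\kappa^{-1}\, w_{\beta_2}\, w_{\beta_1}^2\, w_\kappa w_{\beta_2}^2 w_\kappa^{-1}\, w_\kappa w_{\beta_1} w_\kappa^{-1}\, w_{\beta_2}$$
$$= w_\kappa w_{\beta_1} w_\kappa^{-1}\, w_{\beta_2}\, e^{\pi i (\beta_1^\vee + \beta_4^\vee)}\, w_{\beta_2}\, w_\kappa w_{\beta_1} w_\kappa^{-1}.$$
Next, the relations $[\beta_1^\vee + \beta_4^\vee, e_{\pm\beta_2}] = \mp e_{\pm\beta_2}$ imply that
$$e^{\pi i (\beta_1^\vee + \beta_4^\vee)}\, w_{\beta_2}\, e^{-\pi i (\beta_1^\vee + \beta_4^\vee)} = w_{\beta_2}^{-1}.$$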


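Similarly,
$$e^{\pi i (\beta_1^\vee + \beta_4^\vee)}\, w_\kappa w_{\beta_1} w_\kappa^{-1}\, e^{-\pi i (\beta_1^\vee + \beta_4^\vee)} = w_\kappa w_{\beta_1}^{-1} w_\kappa^{-1},$$
so that we obtain the identities
$$(w_\kappa w_z)^2 = e^{\pi i (\beta_1^\vee + \beta_4^\vee)} = e^{\pi i (\alpha_4^\vee + \alpha_6^\vee)},$$
$$(w_z w_\kappa)^2 = w_\kappa\, (w_\kappa w_z)^2\, w_\kappa^{-1} = e^{\pi i (\alpha_2^\vee + \alpha_6^\vee)}.$$
It follows easily that we may choose
$$b_{n,n'} = b_{\bar n,n'} = 0,$$
$$b_{n,\bar n'} = \begin{cases} 0 & \text{for } n = 0, \\ \frac{1}{2}(\alpha_2^\vee + \alpha_6^\vee) & \text{for } n = 1, \\ \frac{1}{2}(\alpha_4^\vee + \alpha_6^\vee) & \text{for } n = 2, \end{cases} \qquad b_{\bar n,\bar n'} = \begin{cases} 0 & \text{for } [n_0 + n] = 0, \\ \frac{1}{2}(\alpha_4^\vee + \alpha_6^\vee) & \text{for } [n_0 + n] = 1, \\ \frac{1}{2}(\alpha_2^\vee + \alpha_6^\vee) & \text{for } [n_0 + n] = 2. \end{cases}$$
Since $\tau_{z^{-1}0} = \lambda_5^\vee$ and $\tau_{z^{-2}0} = \lambda_1^\vee$, it follows from the definition (32) that the obstruction cocycle $u_{\gamma,\gamma',\gamma''}$ is trivial on both orientifold groups $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_3$ and $\Gamma = \mathbb{Z}_2$ so that two cohomologically inequivalent cocycles may be taken in the form
$$v_{n,n'} = v_{\bar n,n'} = v_{n,\bar n'} = 1, \qquad v_{\bar n,\bar n'} = \pm 1.$$
In short, for each orientifold group, each $k$ and each of the three choices of the twist element, there are two inequivalent $\Gamma$-equivariant structures on the level-$k$ gerbe on $E_6$. They give rise to six inequivalent Jandl structures on that gerbe and to two inequivalent Jandl structures on the induced gerbe on $E_6/\mathbb{Z}_3$.

4.6. The case of $G = E_7$.

As in Sect. 4.8 of [15], we identify the Cartan algebra of $E_7$ with the subspace in $\mathbb{R}^8$ composed of the vectors whose coordinates sum to zero, with the Killing form inherited from the scalar product in $\mathbb{R}^8$. The vertices of the positive Weyl alcove $A$ are
$$\tau_0 = 0, \quad \tau_1 = \lambda_1^\vee, \quad \tau_2 = \tfrac{1}{2}\lambda_2^\vee, \quad \tau_3 = \tfrac{1}{3}\lambda_3^\vee, \quad \tau_4 = \tfrac{1}{4}\lambda_4^\vee, \quad \tau_5 = \tfrac{1}{3}\lambda_5^\vee, \quad \tau_6 = \tfrac{1}{2}\lambda_6^\vee, \quad \tau_7 = \tfrac{1}{2}\lambda_7^\vee$$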

for the fundamental coweights λi∨ corresponding to the simple roots


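$$\alpha_i = e_i - e_{i+1} \ \text{ for } i = 1, \dots, 6, \qquad \alpha_7 = \tfrac{1}{2}(-e_1 - e_2 - e_3 - e_4 + e_5 + e_6 + e_7 + e_8),$$
where $e_i$ are the vectors of the canonical basis of $\mathbb{R}^8$. Roots have the form $e_i - e_j$ for $i \neq j$ and $\frac{1}{2}(\pm e_1 \pm e_2 \pm e_3 \pm e_4 \pm e_5 \pm e_6 \pm e_7 \pm e_8)$ with four signs $+$ and four signs $-$. The highest root is $\phi = -e_7 + e_8$. The center $Z(E_7) \cong \mathbb{Z}_2$ is generated by $z = e^{-2\pi i \lambda_1^\vee}$ with $\lambda_1^\vee = \frac{1}{4}(3e_1 - e_2 - e_3 - e_4 - e_5 - e_6 - e_7 + 3e_8)$. The relation (27) may be satisfied for $\tau \in A$ if we take
$$w_\kappa = w_{\alpha_1}\, w_{\alpha_3}\, w_{\alpha_5}\, w_{\alpha_7}\, w_{\alpha_3 + 2\alpha_4 + \alpha_5 + \alpha_7}\, w_{\alpha_1 + 2\alpha_2 + 2\alpha_3 + 2\alpha_4 + \alpha_5 + \alpha_7}\, w_\phi, \tag{82}$$
with the trivial action of $\kappa$ on $A$. All the factors on the right-hand side of (82) commute so that
$$w_\kappa^2 = e^{\pi i (\alpha_1^\vee + \alpha_3^\vee + \alpha_7^\vee)} = z.$$
The roots
$$\begin{aligned}
\beta_1 &= \alpha_1 + 2\alpha_2 + 2\alpha_3 + 2\alpha_4 + \alpha_5 + \alpha_7, \\
\beta_2 &= -(\alpha_1 + \alpha_2 + 2\alpha_3 + 2\alpha_4 + \alpha_5 + \alpha_7), \\
\beta_3 &= \alpha_1 + \alpha_2 + 2\alpha_3 + 2\alpha_4 + \alpha_5 + \alpha_6 + \alpha_7, \\
\beta_4 &= -(\alpha_1 + \alpha_2 + \alpha_3 + 2\alpha_4 + \alpha_5 + \alpha_6 + \alpha_7), \\
\beta_5 &= \alpha_4, \\
\beta_6 &= \alpha_7, \\
\beta_7 &= \alpha_1 + \alpha_2 + \alpha_3 + 2\alpha_4 + 2\alpha_5 + \alpha_6 + \alpha_7
\end{aligned}$$
form another system of simple roots such that (22) may be satisfied for $\tau \in A$ if the adjoint action of $w_z$ on the Cartan algebra is given by the product of the Weyl reflections
$$w_z\, \tau\, w_z^{-1} = r_{\beta_1} r_{\beta_3} r_{\beta_7}(\tau),$$
with the action of $z$ on $A$ reducing to
$$z\tau_0 = \tau_1, \quad z\tau_1 = \tau_0, \quad z\tau_i = \tau_{8-i} \ \text{ for } i = 2, \dots, 6, \quad z\tau_7 = \tau_7$$
on the vertices, as illustrated in Fig. 11. Since $w_\kappa\, \beta_i^\vee\, w_\kappa^{-1} = -\beta_i^\vee$, we must have $w_\kappa\, e_{\pm\beta_i}\, w_\kappa^{-1} = \mu_i^{\mp 2}\, e_{\mp\beta_i}$ for some $\mu_i$ with $|\mu_i| = 1$. Let
$$\tilde w_{\beta_i} = e^{\frac{\pi i}{2}(\mu_i e_{\beta_i} + \bar\mu_i e_{-\beta_i})}.$$
Conjugation with $\tilde w_{\beta_i}$ still induces the Weyl reflections $r_{\beta_i}$ on the Cartan algebra and, similarly as for $w_{\beta_i}$, $\tilde w_{\beta_i}^2 = e^{\pi i \beta_i^\vee}$. We may then take $w_z = \tilde w_{\beta_1} \tilde w_{\beta_3} \tilde w_{\beta_7}$. Since $\tilde w_{\beta_1}$, $\tilde w_{\beta_3}$ and $\tilde w_{\beta_7}$ commute, the relation
$$w_z^2 = e^{\pi i (\beta_1^\vee + \beta_3^\vee + \beta_7^\vee)} = e^{\pi i (\alpha_1^\vee + \alpha_3^\vee + \alpha_7^\vee)} = z$$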

Fig. 11. The transformation of the extended Dynkin diagram of $E_7$ under $z$

holds. By construction, $w_\kappa\, \tilde w_{\beta_i}\, w_\kappa^{-1} = \tilde w_{\beta_i}$. Hence, $(w_\kappa w_z)^2 = w_\kappa w_z w_\kappa^{-1} w_z^{-1} = 1$. It follows that we may set
$$b_{n,n'} = b_{\bar n,n'} = b_{n,\bar n'} = n n'\, \lambda_1^\vee, \qquad b_{\bar n,\bar n'} = (1 + n_0 + n n')\, \lambda_1^\vee,$$
which, with the help of the relations $\tau_{z^{-n}0} = \delta_{[n],1}\, \lambda_1^\vee$ and $\mathrm{tr}(\lambda_1^\vee)^2 = \frac{3}{2}$, gives rise to the obstruction cocycle (32) on $\Gamma = \mathbb{Z}_2 \ltimes \mathbb{Z}_2$ of the form
$$u_{n,n',n''} = u_{n,\bar n',n''} = u_{n,n',\bar n''} = (-1)^{k\, n n' n''}, \qquad u_{n,\bar n',\bar n''} = (-1)^{k\, n (1 + n_0 + n' n'')},$$
$$u_{\bar n,n',n''} = u_{\bar n,\bar n',n''} = u_{\bar n,n',\bar n''} = (-1)^{k (n_0 + n)\, n' n''}, \qquad u_{\bar n,\bar n',\bar n''} = (-1)^{k (n + n_0)(1 + n_0 + n' n'')}.$$
The restriction of this cocycle to the orbifold subgroup $Z(E_7)$ may be trivialized if and only if
$$k \ \text{ is even,}$$
in which case the whole cocycle becomes trivial. As four cohomologically inequivalent trivializing cochains we may take the cocycles
$$v_{n,n'} = 1, \qquad v_{\bar n,n'} = 1, \qquad v_{n,\bar n'} = \sigma^{n}, \qquad v_{\bar n,\bar n'} = \sigma' \sigma^{n} \tag{83}$$
for $\sigma, \sigma' = \pm 1$. On the other hand, the restriction of the obstruction cocycle to the inversion group $\mathbb{Z}_2$ is trivial for all $k$. Two cohomologically nonequivalent trivializing cocycles may be obtained by restriction of (83) to $n = n' = 0$.

To summarize, for each $k$ and each of the two choices of the twist element, there are two inequivalent Jandl structures on the level-$k$ gerbe on $E_7$. For $k$ even and each choice of the twist element, there are four inequivalent $\mathbb{Z}_2 \ltimes \mathbb{Z}_2$-equivariant structures on the level-$k$ gerbe on $E_7$, giving rise to, altogether, four Jandl structures on the induced gerbe on $E_7/\mathbb{Z}_2$. There are no $\mathbb{Z}_2 \ltimes \mathbb{Z}_2$-equivariant structures for $k$ odd.
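The data entering the $E_7$ cocycle, namely the explicit coweight $\lambda_1^\vee = \frac{1}{4}(3e_1 - e_2 - \dots - e_7 + 3e_8)$ and the value $\mathrm{tr}(\lambda_1^\vee)^2 = \frac{3}{2}$, can be cross-checked against the simple roots listed in Sect. 4.6. The sketch below does this in exact rational arithmetic, using the standard scalar product of $\mathbb{R}^8$ as the Killing form as stated in the text; the variable and function names are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    """Standard scalar product in R^8."""
    return sum(a * b for a, b in zip(u, v))


def e7_simple_roots():
    """Simple roots alpha_1, ..., alpha_7 of E_7 as listed in the text."""
    roots = []
    for i in range(6):
        v = [Fraction(0)] * 8
        v[i], v[i + 1] = Fraction(1), Fraction(-1)   # alpha_{i+1} = e_{i+1} - e_{i+2}
        roots.append(v)
    roots.append([Fraction(s, 2) for s in (-1, -1, -1, -1, 1, 1, 1, 1)])
    return roots


# lambda_1^vee = (1/4)(3, -1, -1, -1, -1, -1, -1, 3)
LAMBDA_1 = [Fraction(x, 4) for x in (3, -1, -1, -1, -1, -1, -1, 3)]


def pairings_with_simple_roots():
    """The numbers tr(lambda_1^vee alpha_i) for i = 1, ..., 7."""
    return [dot(LAMBDA_1, a) for a in e7_simple_roots()]


if __name__ == "__main__":
    print(sum(LAMBDA_1), pairings_with_simple_roots(), dot(LAMBDA_1, LAMBDA_1))
```

The pairings confirm that $\lambda_1^\vee$ is dual to $\alpha_1$, and the norm $\frac{3}{2}$ is what turns $e^{2\pi i k\, \mathrm{tr}(\lambda_1^\vee)^2 (\dots)}$ into the signs $(-1)^{k(\dots)}$ appearing in the cocycle.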


4.7. The cases of $G = E_8, F_4, G_2$.

These are the simple groups with a trivial center and no non-trivial Dynkin diagram symmetries. The only possible orientifold group is the inversion group $\Gamma = \mathbb{Z}_2$ and, whatever the values of $b_{\gamma,\gamma'}$, the obstruction 3-cocycle (32) is trivial since $\tau_{\gamma^{-1}0} = \tau_0 = 0$ for all $\gamma \in \Gamma$. Two cohomologically inequivalent trivializing cochains are given by the 2-cocycles of (68). They give rise to two inequivalent Jandl structures on the level-$k$ gerbe on $G$ from the list for each $k$.

5. Conclusions We have studied orientifolds of the WZW theories with simple compact simply connected groups G as targets. For orientifold groups Γ = Z2Z , where the generator of Z2 acts on G by a twisted inversion g → (ζ g)−1 , with ζ ∈ Z (G), and where Z is a subgroup of the center of G, we have classified all inequivalent Γ -equivariant structures on the level-k gerbes on groups G. Such structures are required to unambiguously define Feynman amplitudes of classical fields of the orientifold theory. For Z of even order, there may be obstructions to the existence of the orientifold theory with a given twist ζ even if the Z -orbifold theory exists. The classification of the Γ -equivariant structures on the level-k gerbe on G descends to the classification of the Jandl structures [31] on the induced gerbe on the quotient group G/Z . There exists an even number, at least two, of such induced Jandl structures, giving rise to different orientifold extensions of the Z -orbifold theory, i.e. to different unoriented closed-string theories with the target space G/(Z2Z ). Our results also show that, in all cases except for G = Spin(8n) and Z = Z2 × Z2 , the only obstructions to the existence of a Γ -equivariant structure with the trivial twist element ζ = 1 are the ones that obstruct the existence of a Z -equivariant structure. In the exceptional case, Z -equivariant structures exist (two inequivalent ones [15]) for all integer levels k, whereas Γ -equivariant ones with the trivial twist element exist only for k even. In [8], an additional condition was imposed on the Z -orbifold theory, see (2.15) therein, that is equivalent to the existence of a Γ -equivariant structure with the trivial twist element. This condition, that was unjustly related to unitarity of the Z -orbifold theory, eliminated odd levels k for the SO(8n)/Z2 WZW theory (in fact, unitarity holds also for k-odd theories; what fails is the left-right symmetry of the toroidal partition functions). 
As we shall discuss in [16], our results, based on a systematic geometric approach to the classical orientifold theory, are in agreement with the ones obtained in [5] by studying the sewing and modular invariance constraints for the crosscap states in the simple-current orbifolds of the WZW theory.

Acknowledgements. K.G. and R.R.S. acknowledge the support of the European Commission under the contract EUCLID/HPRN-CT-2002-00325 and the funding by the Agence Nationale de la Recherche grant ANR-05BLAN-0029-03. K.W. was partly supported by the Rudolf und Erika Koch-Stiftung.

6. Appendix

Here is a short list of results, with the signs σ = ±1, σ1 = ±1, σ2 = ±1 and σ′ = ±1 describing different choices of trivializing cochains.


============================================ Group Ar center Zr +1 twist element n 0 = 0, 1, . . . , r −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Zm , m odd level k∈Z r +1 r +1 trivializing cochain for n, n = 0, m , . . . , m (m − 1) 2π ik

mn

π ik

vn,n = e r +1 (2nn −n

2π ik

vn,n = (−1)kr n 0 r +1 e− r +1 nn ,

vn,n = e r +1 nn , 2 −(r +1)n)

,

vn,n = σ π ik

−n (n + r (r +1) )−2n(n +n )+n 2 +(r +1)n

0 0 0 2 · e r +1 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Zm , m even

level

k ∈ Z if

r +1 m

trivializing cochain for

n, n = 0,

r +1 r +1 , . . . , m (m m

2π ik

mn

π ik

− 1) mn

2π ik

vn,n = σ r +1 (−1)kr n 0 r +1 e− r +1 nn ,

vn,n = e r +1 nn , vn,n = e r +1 (2nn −n

and n 0 are even, k ∈ 2Z otherwise

2 −(r +1)n)

,

mn

vn,n = σ σ r +1 π ik r +1

−n 0 (n 0 + r (r2+1) )−2n(n 0 +n )+n 2 +(r +1)n

·e ============================================ Group Br center Z2 twist element n 0 = 0, 1 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2 level k∈Z trivializing cochain π ik

v0,0 = v0,0 = v0,0 = 1, v0,0 = σ e 2 n 0 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Z2 level k∈Z trivializing cochain for n, n = 0, 1


vn,n = σ n (−1)k nn ,

vn,n = (−1)k nn ,


−3π ik

π ik

vn,n = (−1)k nn e 2 n , vn,n = σ σ n (−1)k n(n 0 +n ) e 2 (n 0 +3n) ============================================ Group Cr center Z2 twist element n 0 = 0, 1 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2 level k∈Z trivializing cochain v0,0 = v0,0 = v0,0 = 1, v0,0 = σ −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Z2 level k ∈ Z if r is even, k ∈ 2Z otherwise trivializing cochain for n, n = 0, 1 vn,n = 1,

vn,n = σ n ,

vn,n = σ σ n vn,n = 1, ============================================ Group Dr for r odd center Z4 twist element n 0 = 0, 1, 2, 3 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group level trivializing cochain

Z2 k∈Z

πi

v0,0 = v0,0 = v0,0 = 1, v0,0 = σ e− 4 kr (n 0 +2δn0 ,3 ) −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Z2 level k ∈ Z if n 0 is even, k ∈ 2Z otherwise trivializing cochain for n, n = 0, 2 vn,n = 1, π ik

1

vn,n = σ 2 n , 1

π ik

vn,n = e 4 n , vn,n = σ σ 2 n e− 4 ([n 0 +n]+2δ[n0 +n],3 ) −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−


orientifold group level trivializing cochain for

Z2Z4 k ∈ 2Z n, n = 0, 1, 2, 3

vn,n = 1,

vn,n = σ n ,

π ik

π ik

vn,n = e 4 (n+2δn,3 ) , vn,n = σ σ n e− 4 ([n 0 +n]+2δ[n0 +n],3 ) ============================================ Group Dr for r even center Z2 × Z2 = {n 1 n 2 | n 1 , n 2 = 0, 1} twist element n 01 n 02 = 00, 10, 01, 11 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2 level k∈Z trivializing cochain v00,00 = v00,00 = v00,00 = 1, v00,00 = σ −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2{n0 | n = 0, 1} r level k ∈ Z if 2 is even and n 02 = 0, k ∈ 2Z otherwise trivializing cochain for vn0,n 0 = 1,

n, n = 0, 1

vn0,n 0 = σ n ,

vn0,n 0 = 1, vn0,n0 = σ σ n −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2{0n | n = 0, 1} level k ∈ Z if n 01 = 0, k ∈ 2Z otherwise trivializing cochain for n, n = 0, 1 v0n,0n = 1,

v0n,0n = σ n ,

v0n,0n = 1, v0n,0n = σ σ n −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2{nn | n = 0, 1} r level k ∈ Z if 2 and n 01 = n 02 , k ∈ 2Z otherwise trivializing cochain for vnn,n n = 1,

n, n = 0, 1

vnn,n n = σ n ,

vnn,n n = 1, vnn,n n = σ σ n −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−


orientifold group level trivializing cochain for

vn 1 n 2 ,n 1 n 2 = σ n 2 n 1 ,


Z2(Z2 × Z2 ) k ∈ 2Z n 1 , n 2 , n 1 , n 2 = 0, 1

n

n

vn 1 n 2 ,n 1 n 2 = σ n 2 n 1 σ1 1 σ2 2 ,

n

n

vn 1 n 2 ,n 1 n 2 = σ n 2 n 1 , vn 1 n 2 ,n 1 n 2 = σ σ n 2 n 1 σ1 1 σ2 2 ============================================ Group E6 center Z3 twist element n 0 = 0, 1, 2 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2 level k∈Z trivializing cochain v0,0 = v0,0 = v0,0 = 1, v0,0 = σ −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Z3 level k∈Z trivializing cochain for n, n = 0, 1, 2 vn,n = σ vn,n = vn,n = vn,n = 1, ============================================ Group E7 center Z2 twist element n 0 = 0, 1 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2 level k∈Z trivializing cochain v0,0 = v0,0 = v0,0 = 1, v0,0 = σ −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− orientifold group Z2Z2 level k ∈ 2Z trivializing cochain for n, n = 0, 1, 2 vn,n = 1,

vn,n = σ n ,

vn,n = 1, vn,n = σ σ n ============================================


Group E8    center Z1    twist element n0 = 0
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
orientifold group Z2
level k ∈ Z
trivializing cochain v0,0 = v0,0 = v0,0 = 1, v0,0 = σ
============================================
Group F4    center Z1    twist element n0 = 0
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
orientifold group Z2
level k ∈ Z
trivializing cochain v0,0 = v0,0 = v0,0 = 1, v0,0 = σ
============================================
Group G2    center Z1    twist element n0 = 0
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
orientifold group Z2
level k ∈ Z
trivializing cochain v0,0 = v0,0 = v0,0 = 1, v0,0 = σ

References

1. Alvarez, O.: Topological quantization and cohomology. Commun. Math. Phys. 100, 279–309 (1985)
2. Bachas, C., Couchoud, N., Windey, P.: Orientifolds of the 3-sphere. JHEP 0112, 003 (2001)
3. Brown, K.S.: Cohomology of Groups. Graduate Texts in Mathematics 87, New York: Springer-Verlag, 1982
4. Brunner, I.: On orientifolds of WZW models and their relation to geometry. JHEP 0201, 007 (2002)
5. Brunner, I., Hori, K.: Notes on orientifolds of rational conformal field theories. JHEP 0407, 023 (2004)
6. Brylinski, J.-L.: Loop Spaces, Characteristic Classes, and Geometric Quantization. Progress in Mathematics 107, Boston: Birkhäuser, 1993
7. Chatterjee, D.S.: On gerbs. Ph.D. thesis, Trinity College, Cambridge, 1998, available online at http://www2.maths.ox.ac.uk/hitchin/hitchinstudents/chatterjee.pdf


8. Felder, G., Gawędzki, K., Kupiainen, A.: Spectra of Wess–Zumino–Witten models with arbitrary simple groups. Commun. Math. Phys. 117, 127–158 (1988)
9. Fjelstad, J., Fuchs, J., Runkel, I., Schweigert, C.: Topological and conformal field theory as Frobenius algebras. In: Contemp. Math. 431, Providence RI: Amer. Math. Soc., 2007, pp. 225–248
10. Fuchs, J., Huiszoon, L.R., Schellekens, A.N., Schweigert, C., Walcher, J.: Boundaries, crosscaps and simple currents. Phys. Lett. B495, 427–434 (2000)
11. Gajer, P.: Geometry of Deligne cohomology. Invent. Math. 127, 155–207 (1997)
12. Gawędzki, K.: Topological actions in two-dimensional quantum field theory. In: 't Hooft, G., Jaffe, A., Mack, G., Mitter, P.K., Stora, R. (eds.), Non-perturbative Quantum Field Theory. Proceedings, Cargèse 1987, New York: Plenum Press, 1988, pp. 101–142
13. Gawędzki, K.: Abelian and non-Abelian branes in WZW models and gerbes. Commun. Math. Phys. 258, 23–73 (2005)
14. Gawędzki, K., Reis, N.: WZW branes and gerbes. Rev. Math. Phys. 14, 1281–1334 (2002)
15. Gawędzki, K., Reis, N.: Basic gerbe over non simply connected compact groups. J. Geom. Phys. 50, 28–55 (2004)
16. Gawędzki, K., Suszek, R.R., Waldorf, K.: In preparation
17. Huiszoon, L.R., Schellekens, A.N.: Crosscaps, boundaries and T-duality. Nucl. Phys. B584, 705–718 (2000)
18. Huiszoon, L.R., Schellekens, A.N., Sousa, N.: Klein bottles and simple currents. Phys. Lett. B470, 95–102 (1999)
19. Huiszoon, L.R., Schellekens, A.N., Sousa, N.: Open descendants of non-diagonal invariants. Nucl. Phys. B575, 401–415 (2000)
20. Kapustin, A.: D-branes in a topologically nontrivial B-field. Adv. Theor. Math. Phys. 4, 127–154 (2000)
21. Lupercio, E., Uribe, B.: An introduction to gerbes on orbifolds. Annales Math. Blaise Pascal 11, 155–180 (2004)
22. McCleary, J.: A User's Guide to Spectral Sequences. Cambridge Studies in Advanced Mathematics 58, Cambridge: Cambridge University Press, 2001
23. Meinrenken, E.: The basic gerbe over a compact simple Lie group. L'Enseignement Mathématique 49, 307–333 (2003)
24. Murray, M.K.: Bundle gerbes. J. London Math. Soc. (2) 54, 403–416 (1996)
25. Murray, M.K., Stevenson, D.: Bundle gerbes: stable isomorphisms and local theory. J. London Math. Soc. (2) 62, 925–937 (2000)
26. Pradisi, G., Sagnotti, A.: New developments in open-string theories. http://arXiv.org/abs/hep-th/9211084, 1992
27. Pradisi, G., Sagnotti, A., Stanev, Y.: Planar duality in SU(2) WZW models. Phys. Lett. B354, 279–286 (1995)
28. Pradisi, G., Sagnotti, A., Stanev, Y.: The open descendants of nondiagonal SU(2) WZW models. Phys. Lett. B356, 230–238 (1995)
29. Reis, N.: Interprétation géométrique des théories conformes des champs à bord. Ph.D. thesis, École Normale Supérieure de Lyon, 2003
30. Sah, C.-H.: Cohomology of split group extensions. J. Algebra 29, 255–302 (1974)
31. Schreiber, U., Schweigert, C., Waldorf, K.: Unoriented WZW models and holonomy of bundle gerbes. Commun. Math. Phys. 274, 31–64 (2007)
32. Schwarz, J.H.: Superstring theory. Phys. Rept. 89, 223–322 (1982)
33. Vafa, C.: Modular invariance and discrete torsion on orbifolds. Nucl. Phys. B273, 592–606 (1986)
34. Weibel, C.: An Introduction to Homological Algebra. Cambridge Studies in Advanced Mathematics 38, Cambridge: Cambridge University Press, 1995
35. Witten, E.: Non-Abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455–472 (1984)

Communicated by M. R. Douglas

Commun. Math. Phys. 284, 51–77 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0605-3


On Asymptotic Stability in Energy Space of Ground States for Nonlinear Schrödinger Equations

Scipio Cuccagna1, Tetsu Mizumachi2

1 DISMI University of Modena and Reggio Emilia, via Amendola 2, Padiglione Morselli, Reggio Emilia 42100, Italy. E-mail: [email protected]; [email protected]
2 Faculty of Mathematics, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka 812-8581, Japan. E-mail: [email protected]

Received: 7 May 2007 / Accepted: 22 May 2008
Published online: 9 August 2008 – © Springer-Verlag 2008

Abstract: We consider nonlinear Schrödinger equations $iu_t + \Delta u + \beta(|u|^2)u = 0$, for $(t,x) \in \mathbb{R}\times\mathbb{R}^d$, where $d \ge 3$ and $\beta$ is smooth. We prove that symmetric finite energy solutions close to orbitally stable ground states converge to a sum of a ground state and a dispersive wave as $t \to \infty$, assuming the so-called Fermi Golden Rule (FGR) hypothesis. We improve the "sign condition" required in a recent paper by Gang Zhou and I.M. Sigal.

1. Introduction

We consider asymptotic stability of standing wave solutions of nonlinear Schrödinger equations
$$\begin{cases} iu_t + \Delta u + \beta(|u|^2)u = 0, & \text{for } (t,x) \in \mathbb{R}\times\mathbb{R}^d,\\ u(0,x) = u_0(x) & \text{for } x \in \mathbb{R}^d, \end{cases} \tag{NLS}$$
where $d \ge 3$ and $\beta$ is smooth. In this paper, we discuss the asymptotic stability of ground states in the energy class. Following Soffer and Weinstein [31], the papers [2–4,7,8,26,27,32,35–37] studied the case when the initial data are rapidly decreasing and the linearized operators of (NLS) at the ground states have at most one pair of eigenvalues that lie close to the continuous spectrum. Cases when the linearized operators have many eigenvalues were considered in [34]. One of the difficulties in proving asymptotic stability is the possible existence of invariant tori corresponding to eigenvalues of the linearization. A large amount of effort has been spent to show that "metastable" tori decay like $t^{-1/2}$ as $t \to \infty$ by means of a mechanism called the Fermi Golden Rule (FGR) introduced by Sigal [29] and by a normal form expansion. Recently, thanks to a significant improvement of the


normal form expansion, Zhou and Sigal [45] were able to prove asymptotic stability of ground states in the case when the linearized operators have two eigenvalues not necessarily close to the continuous spectrum. In a different direction, Gustafson et al. [18] proved that small solitons are asymptotically stable in $H^1(\mathbb{R}^d)$ if $d \ge 3$ and if the linearized operators do not have eigenvalues except for the 0 eigenvalue. Recently, [23,24] extended [18] to the lower dimensional cases ($d = 1, 2$). The papers [18,23,24] utilize the endpoint Strichartz estimate or local smoothing estimates.

In the present paper, we unify the methods in [45] and [18] and show that the result proved by [45] in a weighted space holds also in $H^1(\mathbb{R}^d)$. Furthermore, our assumption on (FGR) is weaker than that of [45]. In [45], a sign hypothesis is assumed on a coefficient of the ODE for the discrete mode. See [44] for a conjecture behind this assumption. By exploiting the orbital stability of solitons, we show that it is enough to assume the nondegeneracy of the coefficient, without any need to assume anything about its sign.

To be more precise, let us introduce our assumptions:

(H1) $\beta(0) = 0$, $\beta \in C^\infty(\mathbb{R},\mathbb{R})$;

(H2) There exists a $p \in (1, \frac{d+2}{d-2})$ such that for every $k = 0, 1$,
$$\left|\frac{d^k}{dv^k}\beta(v^2)\right| \lesssim |v|^{p-k-1} \quad \text{if } |v| \ge 1.$$

(H3) There exists an open interval $\mathcal{O}$ such that
$$\Delta u - \omega u + \beta(u^2)u = 0 \quad \text{for } x \in \mathbb{R}^d, \tag{1.1}$$
admits a $C^1$-family of ground states $\phi_\omega(x)$ for $\omega \in \mathcal{O}$.

We also assume the following:

(H4)
$$\frac{d}{d\omega}\|\phi_\omega\|^2_{L^2(\mathbb{R}^d)} > 0 \quad \text{for } \omega \in \mathcal{O}. \tag{1.2}$$

(H5) Let $L_+ = -\Delta + \omega - \beta(\phi_\omega^2) - 2\beta'(\phi_\omega^2)\phi_\omega^2$ be the operator whose domain is $H^2_{rad}(\mathbb{R}^d)$. Then $L_+$ has exactly one negative eigenvalue and does not have kernel.

(H6) For any $x \in \mathbb{R}^d$, $u_0(x) = u_0(-x)$. That is, the initial data $u_0$ of (NLS) is even.

(H7) Let $H_\omega$ be the linearized operator around $e^{it\omega}\phi_\omega$ (see Sect. 2 for the precise definition). $H_\omega$ has a positive simple eigenvalue $\lambda(\omega)$ for $\omega \in \mathcal{O}$. There exists an $N \in \mathbb{N}$ such that $N\lambda(\omega) < \omega < (N+1)\lambda(\omega)$.

(H8) (FGR) is nondegenerate (see Hypothesis 3.5 in Sect. 3).

(H9) The point spectrum of $H_\omega$ consists of 0 and $\pm\lambda(\omega)$. The points $\pm\omega$ are not resonances.

Theorem 1.1. Let $d \ge 3$. Let $\omega_0 \in \mathcal{O}$ and $\phi_{\omega_0}(x)$ be a ground state of (1.1). Let $u(t,x)$ be a solution to (NLS). Assume (H1)–(H9). Then, there exist an $\varepsilon_0 > 0$ and a $C > 0$ such that if $\varepsilon := \inf_{\gamma\in[0,2\pi]} \|u_0 - e^{i\gamma}\phi_\omega\|_{H^1} < \varepsilon_0$, there exist $\omega_+ \in \mathcal{O}$, $\theta \in C^1(\mathbb{R};\mathbb{R})$ and $h_\infty \in H^1$ with $\|h_\infty\|_{H^1} + |\omega_+ - \omega_0| \le C\varepsilon$ such that
$$\lim_{t\to\infty} \|u(t,\cdot) - e^{i\theta(t)}\phi_{\omega_+} - e^{it\Delta}h_\infty\|_{H^1} = 0.$$
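As a worked illustration of the growth condition (H2) and the slope condition (H4), the following sketch takes the pure power nonlinearity $\beta(s) = s^{(p-1)/2}$ for $s \ge 0$. This is only an example chosen here; the scaling family $\phi_\omega(x) = \omega^{1/(p-1)}\phi_1(\sqrt{\omega}\,x)$ is the standard one for pure powers and is an assumption of the sketch, not a statement taken from the hypotheses above. The computation suggests that (H2) holds with the same exponent $p$, while (H4) reduces to the $L^2$-subcritical range $p < 1 + 4/d$.

```latex
% Sketch: checking (H2) and (H4) for the illustrative choice
% \beta(s) = s^{(p-1)/2}, s >= 0 (an assumption of this example).
\begin{align*}
\beta(v^2) &= |v|^{p-1},
&
\Big|\tfrac{d}{dv}\beta(v^2)\Big| &= (p-1)\,|v|^{p-2},
\end{align*}
% so (H2) holds for k = 0, 1 with constant max(1, p-1).
% For (H4), use the scaling family (assumed):
\begin{align*}
\phi_\omega(x) &= \omega^{\frac{1}{p-1}}\,\phi_1(\sqrt{\omega}\,x),
\\
\|\phi_\omega\|_{L^2(\mathbb{R}^d)}^2
 &= \omega^{\frac{2}{p-1}}\int_{\mathbb{R}^d}\phi_1(\sqrt{\omega}\,x)^2\,dx
  = \omega^{\frac{2}{p-1}-\frac{d}{2}}\,\|\phi_1\|_{L^2(\mathbb{R}^d)}^2,
\\
\frac{d}{d\omega}\|\phi_\omega\|_{L^2(\mathbb{R}^d)}^2 > 0
 &\iff \frac{2}{p-1} > \frac{d}{2}
 \iff p < 1 + \frac{4}{d}.
\end{align*}
```

Note that the smoothness requirement in (H1) is a separate constraint on $p$, which is one reason combinations such as $\beta(s) = s^{(p-1)/2} - s^{(q-1)/2}$ are also of interest.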


Remark 1.1. Under the assumption (H1)–(H5), it is well known that the standing waves are stable (see [6,16,17,28,40] and the references in [5]).

Remark 1.2. Ground states of (1.1) are known to be unique for typical nonlinearities like $\beta(s) = s^{(p-1)/2}$ or $\beta(s) = s^{(p-1)/2} - s^{(q-1)/2}$ (see [14,21,22] and [41]). The assumption (H5) is satisfied for those cases (see [19,22]).

Remark 1.3. Hypothesis (H9) is generic because resonances and embedded eigenvalues can be eliminated by perturbations following the ideas in [11,12].

Remark 1.4. Hypothesis (H8), that is Hypothesis 3.5 in Sect. 3, probably holds generically.

Remark 1.5. Hypothesis (H6), that is the symmetry assumption $u_0(x) = u_0(-x)$, can be dropped while maintaining the same proof, if we add some inhomogeneity to the equation, for example a linear term $V(x)u$. In particular our result holds in the setting of [45].

Remark 1.6. Theorem 1.1 supports the conjecture by Soffer and Weinstein in [33] about the sign in "dispersive" normal forms for 1-dimensional Hamiltonian systems coupled to dispersive equations, since we prove in our case that the sign is the expected one.

Conclusions similar to Theorem 1.1 can be obtained allowing more eigenvalues for the linearization, replacing (H7)–(H9) with:

(H7') $H_\omega$ has a certain number of simple positive eigenvalues with $0 < N_j\lambda_j(\omega) < \omega < (N_j+1)\lambda_j(\omega)$ with $N_j \ge 1$.

(H8') The (FGR) Hypothesis 5.2 in Sect. 5 holds.

(H9') $H_\omega$ has no other eigenvalues except for 0 and the $\pm\lambda_j(\omega)$. The points $\pm\omega$ are not resonances.

(H10') For multi-indices $m = (m_1, m_2, \ldots)$ and $n = (n_1, \ldots)$, setting $\lambda(\omega) = (\lambda_1(\omega), \ldots)$ and $(m-n)\cdot\lambda = \sum_j (m_j - n_j)\lambda_j$, we have the following non-resonance hypotheses: $(m-n)\cdot\lambda(\omega) = 0$ implies $m = n$, and $(m-n)\cdot\lambda(\omega) \ne \omega$ for all $(m,n)$.

Theorem 1.2. The same conclusions of Theorem 1.1 hold assuming (H1)–(H6) and (H7')–(H10').

Remark 1.7.
The (FGR) Hypothesis 5.2 is an analogue of the (FGR) in [45] and is a sign hypothesis on the coefficients of the equation of the discrete modes. In particular it is stronger than Hypothesis 3.5. In the case N j = 1 for all j, one can replace Hypothesis 5.2 with an hypothesis similar to Hypothesis 3.5 in the sense that it is known that if certain coefficients are non zero, then they have a specific sign. Remark 1.8. If we do not assume (H6), the solitary waves can move around. This causes technical difficulties when trying to show asymptotic stability in the energy space. However the results of this paper go through if we break the translation invariance of (NLS) by adding for example a linear term V (x)u(t, x) as in [45] or by replacing the nonlinearity by V (x)β(|u|2 )u, for appropriate V (x). Remark 1.9. The result in [45] is restricted to initial data satisfying a certain symmetry assumption and to an (NLS) with an additional linear term V (x)u(t, x) with V (x) = V (|x|). The argument of Theorem 1.2 can be used to generalize the result in [45] to generic, not spherically symmetric, V (x) and for initial data in H 1 not required to satisfy symmetry assumptions. The case when V (x) is spherically symmetric is untouched by our argument, because in that case the linearization admits a nonzero eigenvalue which is non simple.


Remark 1.10. Theorem 1.2 is relevant to a question in [33] on whether in the multi eigenvalues case the interaction of distinct discrete modes causes persistence of some excited states or radiation always wins. Theorem 1.2 suggests that the latter case is the correct one. Remark 1.11. Theorems 1.1 and 1.2 can be proved also in dimensions 1 and 2 extending to the linearizations the smoothing estimates for Schrödinger operators proved in [23,24]. See [9,13]. Remark 1.12. Theorem 1.2 seems also relevant to L 2 critical Schrödinger equations with a spatial inhomogeneity in the nonlinearity treated by Fibich and Wang [15], in the sense that if certain spectral assumptions and a (FGR) hold, it should be possible to prove that the ground states which are shown to be stable in [15], are also asymptotically stable, at least in the low dimensions d = 1, 2 when the critical nonlinearity is smooth. Remark 1.13. The ideas in this paper can also be used to give partial proof of the orbital instability of standing waves with nodes, even in the case when these waves are linearly stable, see [10]. Gustafson, Nakanishi and Tsai have announced Theorem 1.1 in the case N = 1 for the equation of [35] where some small ground states are obtained by bifurcation. Our proof is valid in their case and has the advantage that it can treat large solitons and the case where eigenvalues are not necessarily close to the edge of continuous spectrum. Our paper is planned as follows. In Sect. 2, we introduce the ansatz and linear estimates that will be used later. In Sect. 3, we introduce normal form expansions on the dispersive part and discrete modes of solutions. In Sect. 4, we prove a priori estimates of transformed equations and prove Theorem 1.1. In Sect. 5 we sketch the proof of Theorem 1.2. In the Appendix, we give the proof of the normal form expansion used in Theorem 1.1 following [3,4,45]. Finally, let us introduce several notations. 
Given an operator $L$, we denote by $N(L)$ the kernel of $L$ and by $N_g(L)$ the generalized kernel of $L$. We denote by $R_L(\lambda)$ the resolvent operator $(L-\lambda)^{-1}$. A vector or a matrix will be called real when all of their components are real valued. Let $\langle x\rangle = (1+|x|^2)^{1/2}$ and let $H_a$ be a set of functions defined by
$$H_a(\mathbb{R}^d) = \left\{u \in \mathcal{S}(\mathbb{R}^d) : \|e^{a\langle x\rangle}u\|_{H^k(\mathbb{R}^d)} < \infty \text{ for every } k \in \mathbb{Z}_{\ge 0}\right\}.$$
For any Banach spaces $X$, $Y$, we denote by $B(X,Y)$ the space of bounded linear operators from $X$ to $Y$. Various constants will be simply denoted by $C$ in the course of calculations.

2. Linearization, Modulation and Set up

Now, we review some well known facts about the linearization at a ground state. We can write the ansatz
$$u(t,x) = e^{i\theta(t)}\left(\phi_{\omega(t)}(x) + r(t,x)\right), \quad \theta(t) = \int_0^t \omega(s)\,ds + \gamma(t). \tag{2.1}$$

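Inserting the ansatz into the equation we get
$$\begin{aligned} i r_t = {} & -\Delta r + \omega(t) r - \beta(\phi_{\omega(t)}^2) r - \beta'(\phi_{\omega(t)}^2)\phi_{\omega(t)}^2 r - \beta'(\phi_{\omega(t)}^2)\phi_{\omega(t)}^2 \bar r \\ & + \dot\gamma(t)\phi_{\omega(t)} - i\dot\omega(t)\partial_\omega\phi_{\omega(t)} + \dot\gamma(t) r + O(r^2). \end{aligned} \tag{2.2}$$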


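Because of $\bar r$, we write the above as a system. Let
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix};$$
$$H_{\omega,0} = \sigma_3(-\Delta + \omega), \quad V_\omega = -\sigma_3\left[\beta(\phi_\omega^2) + \beta'(\phi_\omega^2)\phi_\omega^2\right] + i\beta'(\phi_\omega^2)\phi_\omega^2\sigma_2; \tag{2.3}$$
$$H_\omega = H_{\omega,0} + V_\omega, \quad R = {}^t(r, \bar r), \quad \Phi_\omega = {}^t(\phi_\omega, \phi_\omega).$$
Then (2.2) is rewritten as
$$i R_t = H_{\omega(t)} R + \sigma_3\dot\gamma R + \sigma_3\dot\gamma\Phi_{\omega(t)} - i\dot\omega\partial_\omega\Phi_{\omega(t)} + N, \tag{2.4}$$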
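The passage from the scalar equation (2.2) to the system (2.4) can be checked componentwise. The sketch below writes $\Phi_\omega = {}^t(\phi_\omega, \phi_\omega)$ for the vector ground state (this symbol name is an assumption of the sketch) and uses the matrix $\sigma_2$ exactly as printed in (2.3); it only verifies the linear terms in $r$ and $\bar r$.

```latex
% Sketch: first component of H_\omega R, with R = {}^t(r, \bar r).
% Abbreviations (assumed): \beta = \beta(\phi_\omega^2),
% \beta' = \beta'(\phi_\omega^2).
\begin{align*}
(H_{\omega,0}R)_1 &= (-\Delta+\omega)\,r,
\\
(\sigma_2 R)_1 &= i\,\bar r,
\\
(V_\omega R)_1 &= -\big[\beta+\beta'\phi_\omega^2\big]\,r
                  + i\,\beta'\phi_\omega^2\,(i\,\bar r)
                = -\beta\,r-\beta'\phi_\omega^2\,r-\beta'\phi_\omega^2\,\bar r,
\\
(H_\omega R)_1 &= -\Delta r+\omega r-\beta\,r
                  -\beta'\phi_\omega^2\,r-\beta'\phi_\omega^2\,\bar r .
\end{align*}
% These are exactly the linear terms on the right-hand side of (2.2);
% the remaining terms of (2.2) give the first components of
% \sigma_3\dot\gamma R, \sigma_3\dot\gamma\Phi_\omega and
% -i\dot\omega\partial_\omega\Phi_\omega.
```

The second component of (2.4) is obtained in the same way from the complex conjugate of (2.2), which is where the factor $\sigma_3$ in front of $\dot\gamma$ comes from.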

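where
$$N = \sigma_3\Big\{\beta(|\Phi_\omega + R|^2/2)(\Phi_\omega + R) - \beta(|\Phi_\omega|^2/2)\Phi_\omega - \partial_\varepsilon\big[\beta(|\Phi_\omega + \varepsilon R|^2/2)(\Phi_\omega + \varepsilon R)\big]\big|_{\varepsilon=0}\Big\} = O(R^2) \quad \text{as } R \to 0.$$
The essential spectrum of $H_\omega$ consists of $(-\infty, -\omega] \cup [\omega, +\infty)$. It is well known (see [40]) that under the assumption (H3)–(H6), 0 is an isolated eigenvalue of $H_\omega$, $\dim N_g(H_\omega) = 2$ and
$$H_\omega\sigma_3\Phi_\omega = 0, \quad H_\omega\partial_\omega\Phi_\omega = -\sigma_3\Phi_\omega.$$
Since $H_\omega^* = \sigma_3 H_\omega\sigma_3$, we have $N_g(H_\omega^*) = \mathrm{span}\{\Phi_\omega, \sigma_3\partial_\omega\Phi_\omega\}$. Let $\xi(\omega)$ be a real eigenfunction with eigenvalue $\lambda(\omega)$. Then we have
$$H_\omega\xi(\omega) = \lambda(\omega)\xi(\omega), \quad H_\omega\sigma_1\xi(\omega) = -\lambda(\omega)\sigma_1\xi(\omega).$$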
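The two identities describing the generalized kernel can be verified directly from (1.1) and (2.3). The following sketch again writes $\Phi_\omega = {}^t(\phi_\omega,\phi_\omega)$ (an assumed symbol name), uses $\sigma_2$ as printed in (2.3), and abbreviates $\beta = \beta(\phi_\omega^2)$, $\beta' = \beta'(\phi_\omega^2)$. With these conventions the second identity comes out as $H_\omega\partial_\omega\Phi_\omega = -\sigma_3\Phi_\omega$, which is also what the description $N_g(H_\omega^*) = \mathrm{span}\{\Phi_\omega, \sigma_3\partial_\omega\Phi_\omega\}$ requires.

```latex
% Sketch: componentwise check of the kernel identities.
\begin{align*}
% (a) H_\omega \sigma_3 \Phi_\omega = 0, with
%     \sigma_3\Phi_\omega = {}^t(\phi_\omega, -\phi_\omega):
(H_\omega\sigma_3\Phi_\omega)_1
 &= (-\Delta+\omega)\phi_\omega-(\beta+\beta'\phi_\omega^2)\phi_\omega
    + i\beta'\phi_\omega^2\,\big(i\cdot(-\phi_\omega)\big)
\\
 &= (-\Delta+\omega-\beta)\phi_\omega = 0
 \qquad\text{by (1.1)}.
\\[4pt]
% (b) differentiate (1.1) in \omega:
L_+\partial_\omega\phi_\omega
 &= \big(-\Delta+\omega-\beta-2\beta'\phi_\omega^2\big)\partial_\omega\phi_\omega
  = -\phi_\omega ,
\\
(H_\omega\partial_\omega\Phi_\omega)_1
 &= L_+\partial_\omega\phi_\omega = -\phi_\omega ,
\qquad
(H_\omega\partial_\omega\Phi_\omega)_2
  = -L_+\partial_\omega\phi_\omega = \phi_\omega ,
\\
H_\omega\partial_\omega\Phi_\omega
 &= {}^t(-\phi_\omega,\ \phi_\omega) = -\sigma_3\Phi_\omega .
\\[4pt]
% (c) adjoint side, using H_\omega^* = \sigma_3 H_\omega \sigma_3:
H_\omega^*\Phi_\omega &= \sigma_3H_\omega\sigma_3\Phi_\omega = 0,
\qquad
H_\omega^*\sigma_3\partial_\omega\Phi_\omega
  = \sigma_3H_\omega\partial_\omega\Phi_\omega = -\Phi_\omega .
\end{align*}
```

The second component in (a) vanishes by the same computation up to an overall sign.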

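Note that $\langle\xi, \sigma_3\xi\rangle > 0$ since $\langle\sigma_3 H_\omega\cdot, \cdot\rangle$ is positive definite on ${}^\perp N_g(H_\omega^*)$. Both $\phi_\omega$ and $\xi(\omega, x)$ are smooth in $\omega \in \mathcal{O}$ and $x \in \mathbb{R}^d$ and satisfy
$$\sup_{\omega\in K,\, x\in\mathbb{R}^d} e^{a|x|}\left(|\phi_\omega(x)| + |\xi(\omega,x)|\right) < \infty$$
for every $a \in (0, \inf_{\omega\in K}\sqrt{\omega - \lambda(\omega)})$ and every compact subset $K$ of $\mathcal{O}$. For $\omega \in \mathcal{O}$, we have the $H_\omega$-invariant Jordan block decomposition
$$L^2(\mathbb{R}^d, \mathbb{C}^2) = N_g(H_\omega) \oplus \left(\oplus_\pm N(H_\omega \mp \lambda(\omega))\right) \oplus L^2_c(H_\omega), \tag{2.5}$$
where $L^2_c(H_\omega) := {}^\perp\left\{N_g(H_\omega^*) \oplus \left(\oplus_\pm N(H_\omega^* \mp \lambda(\omega))\right)\right\}$. Correspondingly, we set
$$R(t) = z(t)\xi(\omega(t)) + \bar z(t)\sigma_1\xi(\omega(t)) + f(t), \tag{2.6}$$
$$R(t) \in {}^\perp N_g(H_{\omega(t)}^*) \quad \text{and} \quad f(t) \in L^2_c(H_{\omega(t)}). \tag{2.7}$$
By using the implicit function theorem, we obtain the following (see e.g. [25] for the proof).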


Lemma 2.1. Let $I$ be a compact subset of $\mathcal{O}$ and let $u(t)$ be a solution to (NLS). Then there exist a $\delta_1 > 0$ and a $C > 0$ satisfying the following. If
$$\delta := \sup_{0\le t\le T}\|u(t) - e^{i\theta_0}\phi_{\omega_0}\|_{H^1(\mathbb{R}^d)} < \delta_1$$
holds for a $T \ge 0$, an $\omega_0 \in I$ and a $\theta_0 \in \mathbb{R}$, then there exist $C^1$-functions $z(t)$, $\omega(t)$ and $\theta(t)$ satisfying (2.1), (2.6) and (2.7) for $0 \le t \le T$, and
$$\sup_{0\le t\le T}\left(|z(t)| + |\omega(t) - \omega_0| + |\theta(t) - \theta_0|\right) \le C\delta.$$

Remark 2.1. Let $\varepsilon$ and $\varepsilon_0$ be as in Theorem 1.1 and let $\delta$ and $\delta_1$ be as in Lemma 2.1. By (H4) and (H5), we have orbital stability of $e^{i\omega_0 t}\phi_{\omega_0}$ and it follows that
$$\sup_{t\ge 0}\left(\|f(t)\|_{H^1} + |z(t)| + |\omega(t) - \omega_0|\right) \lesssim \varepsilon.$$
(See [39] and also [30].) Thus there exists $\varepsilon_0 > 0$ such that
$$\inf_{\gamma\in\mathbb{R}}\|u(t) - e^{i\gamma}\phi_{\omega_0}\|_{H^1} < \delta_1/2.$$
By continuation argument (see e.g. [25]), we see that there exist $z \in C^1([0,\infty); \mathbb{C})$ and $\omega, \theta \in C^1([0,\infty); \mathbb{R})$ such that (2.6) and (2.7) are satisfied for $t \in [0,\infty)$.

Substituting (2.6) into (2.4), we have
$$i f_t = \left(H_{\omega(t)} + \sigma_3\dot\gamma\right) f + l + N, \tag{2.8}$$

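where
$$\begin{aligned} l = {} & \sigma_3\dot\gamma\Phi_{\omega(t)} - i\dot\omega\partial_\omega\Phi_{\omega(t)} + (z\lambda(\omega(t)) - i\dot z)\xi(\omega(t)) - (\bar z\lambda(\omega(t)) + i\dot{\bar z})\sigma_1\xi(\omega(t)) \\ & + \sigma_3\dot\gamma\left(z\xi(\omega(t)) + \bar z\sigma_1\xi(\omega(t))\right) - i\dot\omega\left(z\partial_\omega\xi(\omega(t)) + \bar z\sigma_1\partial_\omega\xi(\omega(t))\right). \end{aligned}$$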
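The expression for $l$ follows by inserting the decomposition (2.6) into (2.4) and using the eigenvalue relations for $\xi$ and $\sigma_1\xi$. The sketch below spells out this bookkeeping; it writes $\xi = \xi(\omega(t))$, $\lambda = \lambda(\omega(t))$ and $\Phi_\omega = {}^t(\phi_\omega,\phi_\omega)$, the last symbol name being an assumption of the sketch.

```latex
% Sketch: deriving (2.8) from (2.4) and (2.6).
\begin{align*}
R &= z\,\xi+\bar z\,\sigma_1\xi+f,
\\
i\,\partial_t R &= i\dot z\,\xi+i\dot{\bar z}\,\sigma_1\xi
   + i\dot\omega\,\big(z\,\partial_\omega\xi+\bar z\,\sigma_1\partial_\omega\xi\big)
   + i f_t ,
\\
H_\omega R &= \lambda z\,\xi-\lambda\bar z\,\sigma_1\xi+H_\omega f .
\end{align*}
% Equating i R_t with the right-hand side of (2.4) and solving for i f_t:
\begin{align*}
i f_t ={}& \big(H_\omega+\sigma_3\dot\gamma\big) f + N
   + \sigma_3\dot\gamma\,\Phi_\omega - i\dot\omega\,\partial_\omega\Phi_\omega
\\
 &+ (\lambda z-i\dot z)\,\xi-(\lambda\bar z+i\dot{\bar z})\,\sigma_1\xi
\\
 &+ \sigma_3\dot\gamma\,\big(z\,\xi+\bar z\,\sigma_1\xi\big)
   - i\dot\omega\,\big(z\,\partial_\omega\xi+\bar z\,\sigma_1\partial_\omega\xi\big).
\end{align*}
% Everything after "+ N" on the right-hand side is the term l in (2.8).
```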

We expand N in (2.2) as N (R) =

m,n (ω)z m z¯ n +

2≤|m+n|≤2N +1

+Oloc (| f | ω + R

2

z m z¯ n Am,n (ω) f

1≤|m+n|≤N p−3

) + O(|β(| f |2 ) f |) + Oloc (|z 2N +2 |),

(2.9)

where m,n (ω) and Am,n (ω) are real vectors and matrices which decay like e−a|x| as |x| → ∞, with σ1 m,n = −n,m and Am,n = −σ1 An,m σ1 . In the sequel, we denote by Oloc (g) terms with g multiplied by a function which decays like e−a|x| . By taking the L 2 -inner product of the equation with generators of N g (H ∗ ) and N (H ∗ − λ), we obtain a system of ordinary differential equations on modulation and discrete modes. ⎛ ⎞ ⎛ ⎞ i ω˙ N , ω

A ⎝ γ˙ ⎠ = ⎝ N , σ3 ∂ω ω ⎠, (2.10) i z˙ − λz N , σ3 ξ(ω)


where


A = diag dφω 2L 2 /dω, −dφω 2L 2 /dω, ξ, σ3 ξ

+O(|z| + e−a|x| f L 2 ).

Finally, we introduce linear estimates which will be used later. Let Pc (ω) be the spectral projection from L 2 (Rd , C2 ) onto L 2c (Hω ) associated to the splitting (2.5). Lemma 2.2 (The Strichartz Estimate). Let d ≥ 3. Assume (H3)–(H9). Let ω ∈ O and k ∈ Z≥0 . Then ∇ k eit Hω Pc (ω)ϕ L ∞ L 2 ∩L 2 L 2d/(d−2) ∇ k ϕ L 2 t

x

for any ϕ ∈ L 2 (Rd ; C2 ), and t k −is Hω ∇ e P (ω)g(s)ds c 0

L 2x

(2.11)

x

t

∇ k g L 1 L 2 +L 2 L 2d/(d+2) , t

x

t

x

(2.12) (2.13)

t k i(t−s)Hω ∇ e Pc (ω)g(s)ds 0

2d/(d+2)

for any g ∈ L 1t L 2x + L 2t L x

2 2 L∞ t L x ∩L t L x

2d/(d−2)

∇ k g L 1 L 2 +L 2 L 2d/(d+2) t

x

t

x

(2.14)

.

Proof. As is explained in Yajima [42,43], Lemma 2.2 follows from the Strichartz estimates in the flat case and $W^{k,p}$-boundedness of wave operators and their inverses. Specifically, let $W(\omega) = \lim_{t\to\infty} e^{-itH_\omega}e^{it\sigma_3(-\Delta+\omega)}$. By [7,12], $W(\omega) : W^{k,p}(\mathbb{R}^d; \mathbb{C}^2) \to W^{k,p}(\mathbb{R}^d; \mathbb{C}^2) \cap {}^\perp N_g(H_\omega^*)$ and its inverse are bounded for $k \in \mathbb{N}\cup\{0\}$ and $1 \le p \le \infty$. By $e^{-itH_\omega}P_c(\omega) = W(\omega)e^{it\sigma_3(\Delta-\omega)}W^{-1}(\omega)$ and by Keel and Tao [20], we obtain (2.11)–(2.14).

By our hypotheses and by regularity theory, the map $\omega \to V_\omega$ which associates to $\omega$ the vector potential in (2.3), is a continuous function with values in the Schwartz space $\mathcal{S}(\mathbb{R}^d; \mathbb{C}^4)$. The following holds also under weaker hypotheses.

Lemma 2.3. Let $s_1 = s_1(d) > 0$ be a fixed sufficiently large number. Let $K$ be a compact subset of $\mathcal{O}$ and let $I$ be a compact subset of $(\omega, \infty) \cup (-\infty, -\omega)$. Assume that $\omega \to V_\omega$ is continuous with values in the Schwartz space $\mathcal{S}(\mathbb{R}^d; \mathbb{C}^4)$. Assume furthermore that for any $\omega \in \mathcal{O}$ there are no eigenvalues of $H_\omega$ in the continuous spectrum and the points $\pm\omega$ are not resonances. Then there exists a $C > 0$ such that
$$\|\langle x\rangle^{-s_1}e^{-iH_\omega t}R_{H_\omega}(\mu + i0)P_c(\omega)g\|_{L^2(\mathbb{R}^d)} \le C\langle t\rangle^{-\frac{d}{2}}\|\langle x\rangle^{s_1}g\|_{L^2(\mathbb{R}^d)}$$
for every $t \ge 0$, $\mu \in I$, $\omega \in K$ and $g \in \mathcal{S}(\mathbb{R}^d; \mathbb{C}^2)$.

We skip the proof. See [8] for $d = 3$ and $I \subset (\omega, \infty)$, see also [33]. The proof for $d = 3$ and $I \subset (-\infty, -\omega)$ is almost the same. Finally for $d > 3$ a similar proof to [8] holds, changing the formulas for $R_{-\Delta}(\mu + i0)$.


3. Normal Form Expansions In this section, following [45] we introduce normal form expansions on the dispersive part f , the modulation mode ω and the discrete mode z. First, we will expand f into normal forms isolating the slowly decaying part of solutions that arises from the nonlinear interaction of discrete and continuous modes of the wave. Lemma 3.1. Assume (H1)–(H9) and that ε∗ > 0 in Theorem 1.1 is sufficiently small. √ (N ) Let a ∈ (0, inf ω∈K ω − λ(ω)). Then there exist m,n (ω) ∈ Ha (Rd , R2 ) ∩ L 2c (Hω ) for (m, n) ∈ Z≥0 with m + n = N + 1 and m,n (ω) ∈ Ha (Rd , R2 ) ∩ L 2c (Hω ) for (m, n) ∈ Z≥0 with 2 ≤ m + n ≤ N such that for t ≥ 0, f (t) = f N (t) +

m,n (ω(t))z(t)m z(t)n ,

(3.1)

2≤m+n≤N

˙ 3 fN i Pc (ω(t))∂t f N − Hω(t) + Pc (ω(t))γ (t)σ ) m n =

(N m,n (ω(t))z(t) z(t) + N N ,

(3.2)

m+n=N +1

where N N is the remainder term satisfying |N N | (|z| N +2 + |z f N | + | f N |2 )e−a|x| + | f N |1+4/d + | f N |(d+2)/(d−2) +|z|(|z|e−a|x| f N L 2 + e−a|x|/2 f N 2H 1 )e−a|x| .

(3.3)

Before we start to prove Lemma 3.1, we observe the following. Lemma 3.2. Suppose (H1)–(H9) and that ε∗ > 0 is a sufficiently small number. Then for t ≥ 0, ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ i ω˙ p(z, z¯ ) f, αm,n (ω)

⎝ γ˙ ⎠ = ⎝ q(z, z¯ ) ⎠ + ⎝ f, βm,n (ω) ⎠ z m z¯ n i z˙ − λz r (z, z¯ ) 1≤m+n≤N f, γm,n (ω)

+O(|z|2N +2 + e−a|x|/2 f 2H 1 ),

(3.4)

where p(x, y), q(x, y), r (x, y) are real polynomials of degree (2N + 1) satisfying | p(x, y)| + |q(x, y)| + |r (x, y)| = O(x 2 + y 2 ) as (x, y) → (0,√0) and αm,n (ω), βm,n (ω), γm,n (ω) ∈ Ha (Rd ; R2 ) ∩ L 2c (Hω∗ ) with 0 < a < inf ω∈K ω − λ(ω). Proof. Let us substitute (2.9) into (2.10). Since N (R) = O(R 2 ) as R → 0, the resulting equation can be written as (3.4). The components of the matrix A in (2.10) are given by real linear expressions of z, z¯ and f, ω , f, σ ∂ω ω and f, σ3 ξ . Hence it follows that p(x, y), q(x, y), r (x, y) are real polynomials and αm,n (ω), βm,n (ω), γm,n (ω) ∈ Ha (Rd ; R2 ). Since f ∈ L 2c (Hω ), we choose αm,n (ω, x), βm,n (ω, x) and γm,n (ω, x) in L 2c (Hω∗ ).


Proof of Lemma 3.1. We will prove Lemma 3.1 by induction. Let f 1 = f and let (k) f k+1 (t) = f k (t) + z(t)m z(t)n m,n (ω(t)) for 1 ≤ k ≤ N − 1, (3.5) m+n=k+1

where O ω → m,n (ω) → Ha (Rd , R2 ) ∩ L 2c (Hω ) is C 1 in ω. We will choose (k) (k) m,n (ω) so that for k = 1, · · · , N , there exist m,n (ω) ∈ Ha (Rd ; R2 ) ∩ L 2c (Hω ) (m, 2 n ∈ Z≥0 , m + n = k + 1) and Nk ∈ L c (Hω ) such that

Pc (ω)i∂t f k − (Hω + Pc (ω)γ˙ σ3 ) f k =

m n

(k) m,n (ω)z z¯ + Nk ,

(3.6)

m+n=k+1

|Nk | (|z|k+2 + |z f k | + | f k |2 f k p−3 )e−a|x| + |β(| f k |2 ) f k | +|z|(|z|e−a|x| f k L 2 + e−a|x|/2 f k 2H 1 )e−a|x| . (1)

(1)

(3.7)

(1)

By (2.8), (2.9) and Lemma 3.2, there exist 2,0 (ω), 1,1 (ω), 0,2 (ω) ∈ Ha (Rd ; R2 )∩ such that

L 2c (Hω )

(1)

(1)

(1)

Pc (ω)(l + N ) = 2,0 (ω)z 2 + 1,1 (ω)|z|2 + 0,2 (ω)¯z 2 + N1 , and |N1 | e−a|x| (|z|3 + |z f | + | f |2 f p−3 ) + |β(| f |2 ) f | +e−a|x| |z|(|z|e−a|x| f L 2 + e−a|x|/2 f 2H 1 ). Thus we have (3.6) and (3.7) for k = 1. (k) Suppose that there exist m,n ∈ Ha (Rd ; R2 ) ∩ L 2c (Hω(t) ) satisfying (3.6) and (3.7). Substituting (3.5) into (3.6), we have i Pc (ω)∂t f k+1 − (Hω + γ˙ σ3 ) f k+1 (k) Pc (ω) γ˙ σ3 m,n (ω) − i ω∂ ˙ ω m,n (ω) z m z¯ n = Nk + m+n=k+1

+

(k) z m z n (Hω − (m − n)λ)m,n (ω) +

m+n=k+1

−

m+n=k+1

z m z n (k) m,n (ω)

(k) mz m−1 z¯ n (i z˙ − λz) − nz m z¯ n−1 (i z˙ − λz) m,n (ω).

m+n=k+1

Put (k) m,n (ω) = −R Hω ((m − n)λ) (k) m,n (ω).

Then by (3.4), the right-hand side of (3.8) can be rewritten as m n

(k+1) m,n (ω)z z¯ + Nk+1 m+n=k+2

(3.8)


S. Cuccagna, T. Mizumachi (k+1)

for some m,n ∈ L 2c (Hω ) ∩ Ha (Rd ; R2 ) (m, n ∈ Z≥0 and m + n = k + 2) and Nk+1 satisfying |Nk+1 | (|z|k+3 + |z f k | + | f k |2 f k p−3 )e−a|x| + |β(| f k |2 ) f k | +|z|(|z|e−a|x| f k L 2 + e−a|x|/2 f k 2H 1 )e−a|x| . By (H1) and (H2), 4

d+2

|β(|u|2 )u| |u|3 u p−3 |u|1+ d + |u| d−2 . Thus we have (3.3). Let f˜N = Pc (ω0 ) f N and

f N +1 = f˜N +

(N ) m,n (ω0 )z m z¯ n ,

(3.9)

m+n=N +1

where (N ) ) (ω0 ) = −R Hω0 ((m − n)λ) (N m,n m,n (ω0 ) for |m − n| ≤ N, (N )

(N )

N +1,0 (ω0 ) = −R Hω0 ((N + 1)λ + i0) N +1,0 (ω0 ), (N ) 0,N +1 (ω0 )

= −R Hω0 (−(N

(3.10)

(N ) + 1)λ + i0) 0,N +1 (ω0 ).

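To simplify (3.4), we will introduce new variables
$$\tilde\omega := \omega + P(z,\bar z) + \sum_{1\le m+n\le N} z^m\bar z^n \langle f_N, \tilde\alpha_{m,n}(\omega)\rangle,$$
$$\tilde z := z + Q(z,\bar z) + \sum_{1\le m+n\le N} z^m\bar z^n \langle f_N, \tilde\beta_{m,n}(\omega)\rangle,$$
where $P(x,y)$ and $Q(x,y)$ are real polynomials and $\tilde\alpha_{m,n}, \tilde\beta_{m,n} \in H_a(\mathbb{R}^d;\mathbb{R}^2)$.

Lemma 3.3. Assume (H1)–(H9) and that ε∗ is sufficiently small. Then there exist a polynomial $P(x,y)$ of degree $2N+1$ satisfying $P(x,y) = O(x^2+y^2)$ as $(x,y)\to(0,0)$ and $\tilde\alpha_{m,n}(\omega) \in L^2_c(H_\omega^*) \cap H_a(\mathbb{R}^d;\mathbb{R}^2)$ such that
$$i\dot{\tilde\omega} = O\bigl(|z|^{2N+2} + \|e^{-a|x|/2} f_{N+1}\|^2_{L^2}\bigr) \quad\text{for } t\ge 0. \tag{3.11}$$

Lemma 3.4. Assume (H1)–(H9) and that ε∗ is sufficiently small. Then there exist a polynomial $Q(x,y)$ of degree $2N+1$ satisfying $Q(x,y) = O(x^2+y^2)$ as $(x,y)\to(0,0)$, and $\tilde\beta_{m,n}(\omega) \in L^2_c(H_\omega^*) \cap H_a(\mathbb{R}^d;\mathbb{R}^2)$ such that for $t\ge 0$,
$$i\dot{\tilde z} - \lambda\tilde z = \sum_{1\le m\le N} a_m(\omega,\omega_0)|\tilde z|^{2m}\tilde z + \bar{\tilde z}^{N}\langle f_{N+1}, \tilde\gamma^{(N)}_{0,N}(\omega)\rangle + O\bigl(|\tilde z|^{2N+2} + \|e^{-a|x|/2} f_{N+1}\|^2_{L^2}\bigr), \tag{3.12}$$
where $a_m(\omega,\omega_0)$ ($1\le m\le N-1$) are real numbers, and $\tilde\gamma^{(N)}_{0,N}(\omega) \in H_a(\mathbb{R}^d;\mathbb{C}^2)$.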


Lemmas 3.3 and 3.4 can be obtained in the same way as [45]. See the Appendix for the proof. Now, let us introduce our assumption on (FGR). Let $\Gamma(\omega,\omega_0) := \Im a_N(\omega,\omega_0)$.

Hypothesis 3.5. There exists a positive constant $\Gamma$ such that
$$\inf_{\omega\in\mathcal{O}} |\Gamma(\omega,\omega)| > \Gamma.$$

Under the above assumption, we have the following.

Lemma 3.6. Assume (H1)–(H9) and that ε∗ > 0 is sufficiently small. Then there exists a positive constant $C$ such that for every $T\ge 0$,
$$\int_0^T |z(t)|^{2N+2}\,dt \le C\Bigl(|z(T)|^2 + |z(0)|^2 + \int_0^T \|e^{-a|x|/2} f_{N+1}(t)\|^2_{L^2(\mathbb{R}^d)}\,dt\Bigr).$$

Proof. Choosing ε∗ smaller if necessary, we may assume that $|\Gamma(\omega(t),\omega_0)| > \Gamma/2$ for every $t\ge 0$. Multiplying (3.12) by $\bar{\tilde z}$ and taking the imaginary part of the resulting equation, we have
$$\frac{d}{dt}\frac{|\tilde z|^2}{2} = \Gamma(\omega,\omega_0)|\tilde z|^{2N+2} + \Im\bigl(\bar{\tilde z}^{N+1}\langle f_{N+1}, \tilde\gamma^{(N)}_{0,N}(\omega)\rangle\bigr) + O\bigl(|\tilde z|^{2N+3} + |\tilde z|\,\|e^{-a|x|/2} f_{N+1}\|^2_{L^2}\bigr). \tag{3.13}$$
By the Schwarz inequality, we have, for some $C>0$,
$$\bigl|\bar z^{N+1}\langle f_{N+1}, \gamma^{(N)}_{0,N}(\omega)\rangle\bigr| \le \frac{\Gamma}{4}|z|^{2N+2} + C\|e^{-a|x|/2} f_{N+1}\|^2_{L^2}. \tag{3.14}$$
Combining (3.13) and (3.14), we obtain Lemma 3.6.
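The combination of (3.13) and (3.14) can be spelled out as follows. This is only a sketch under stated assumptions: it applies (3.14) with $\tilde z$, $\tilde\gamma$ in place of $z$, $\gamma$, it assumes $|\tilde z|$ and $|z|$ are comparable for small ε∗, and $C$ denotes a generic constant. The estimate is written so that it does not depend on the sign of $\Gamma(\omega(t),\omega_0)$, only on the lower bound $|\Gamma(\omega(t),\omega_0)| > \Gamma/2$.

```latex
% Integrate (3.13) over [0,T]. Since t -> \Gamma(\omega(t),\omega_0) is continuous
% and never vanishes, it has a fixed sign, hence
\frac{\Gamma}{2}\int_0^T |\tilde z|^{2N+2}\,dt
  \le \frac{|\tilde z(T)|^2 + |\tilde z(0)|^2}{2}
  + \int_0^T \Bigl|\Im\bigl(\bar{\tilde z}^{\,N+1}
      \langle f_{N+1}, \tilde\gamma^{(N)}_{0,N}(\omega)\rangle\bigr)\Bigr|\,dt
  + C\int_0^T \Bigl(|\tilde z|^{2N+3}
      + |\tilde z|\,\|e^{-a|x|/2} f_{N+1}\|_{L^2}^2\Bigr)\,dt .
% By (3.14) the middle term is at most
%   (\Gamma/4) \int_0^T |\tilde z|^{2N+2} dt + C \int_0^T \|e^{-a|x|/2} f_{N+1}\|_{L^2}^2 dt,
% and |\tilde z|^{2N+3} \le (\sup_t |\tilde z|)\,|\tilde z|^{2N+2}. Therefore
\Bigl(\frac{\Gamma}{4} - C\sup_{0\le t\le T}|\tilde z(t)|\Bigr)
  \int_0^T |\tilde z|^{2N+2}\,dt
  \le \frac{|\tilde z(T)|^2 + |\tilde z(0)|^2}{2}
  + C\int_0^T \|e^{-a|x|/2} f_{N+1}\|_{L^2}^2\,dt ,
% and for \sup_t |\tilde z(t)| small the bracket is bounded below by \Gamma/8.
```

Replacing $\tilde z$ by $z$ on both sides, under the comparability assumption above, gives the form stated in Lemma 3.6.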

4. Proof of Theorem 1.1 To begin with, we restate Theorem 1.1 in a more precise form. Theorem 4.1. Assume (H1)–(H9) and that d ≥ 3. Let u be a solution of (NLS), U = t(u, u), and let m,n (ω) be as in Lemma 3.1. Then if ε∗ in Theorem 1.1 is sufficiently small, there exist C 1 -functions ω(t) and θ (t), a constant ω+ ∈ O such that supt≥0 |ω(t) − ω0 | = O(ε), limt→+∞ ω(t) = ω+ , and we can write U (t, x) = eiθ(t)σ3 ω(t) (x) + z(t)ξ(ω(t)) + z(t)σ1 ξ(ω(t)) n m,n (ω(t))z(t)m z(t) + eiθ(t)σ3 f N (t, x), +eiθ(t)σ3 2≤m+n≤N m,n∈Z≥0

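with
$$\|z(t)\|^{N+1}_{L_t^{2N+2}} + \|f_N(t,x)\|_{L_t^\infty H_x^1 \cap L_t^2 W_x^{1,2d/(d-2)}} \le C\epsilon.$$
Furthermore, there exists $f_\infty \in H^1(\mathbb{R}^d,\mathbb{C}^2)$ such that
$$\lim_{t\to\infty}\bigl\|e^{i\theta(t)\sigma_3} f_N(t) - e^{it\Delta\sigma_3} f_\infty\bigr\|_{H^1} = 0.$$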


Theorem 4.1 shows that a solution to (NLS) around the ground state can be decomposed into a main solitary wave, a well-localized slowly decaying part, and a dispersive part that decays like a solution to $iu_t + \Delta u = 0$. To prove Proposition 3.1, we will apply the endpoint Strichartz estimate. Let $T>0$ and let
$$X_T = L^\infty(0,T;L^2(\mathbb{R}^d)) \cap L^2(0,T;L^{2d/(d-2)}(\mathbb{R}^d)),$$
$$Y_T = L^1(0,T;L^2(\mathbb{R}^d)) + L^2(0,T;L^{2d/(d+2)}(\mathbb{R}^d)),$$
$$Z_T = L^2(0,T;L^2(\mathbb{R}^d;\langle x\rangle^{-2s_1}dx)),$$
where $s_1$ is the positive number given in Lemma 2.3. To prove Theorem 4.1, we need the following.

Lemma 4.2. Assume (H1)–(H9) and assume that ε∗ is sufficiently small. Then there exists a $C>0$ such that for every $T\ge 0$,
$$\|\tilde f_N\|_{X_T} + \|\nabla\tilde f_N\|_{X_T} \le C\varepsilon + C\sup_{0\le t\le T}\bigl(1 + |\omega(t)-\omega_0| + |z(t)|\bigr)\|z\|^{N+1}_{L^{2N+2}(0,T)} + C\Bigl(\sup_{0\le t\le T}|z(t)| + \|\tilde f_N\|^{\min(1,4/d)}_{X_T}\Bigr)\bigl(\|\tilde f_N\|_{X_T} + \|\nabla\tilde f_N\|_{X_T}\bigr). \tag{4.1}$$
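As a sanity check on the exponents in $X_T$ and $Y_T$ defined above, the following sketch verifies the usual Schrödinger admissibility relation $2/q + d/r = d/2$ for the two pairs appearing in $X_T$, and that $Y_T$ is built from the dual exponents. The helper names `is_admissible`, `endpoint_pair` and `dual` are illustrative and not taken from the paper; exponents are stored as reciprocals so that $q=\infty$ corresponds to `0`.

```python
from fractions import Fraction


def is_admissible(inv_q, inv_r, d):
    """Check 2/q + d/r = d/2 with 2 <= q, r <= infinity (reciprocals passed in)."""
    in_range = 0 <= inv_q <= Fraction(1, 2) and 0 <= inv_r <= Fraction(1, 2)
    return in_range and 2 * inv_q + d * inv_r == Fraction(d, 2)


def endpoint_pair(d):
    """Reciprocals (1/q, 1/r) of the endpoint pair (2, 2d/(d-2)), valid for d >= 3."""
    return Fraction(1, 2), Fraction(d - 2, 2 * d)


def dual(inv_p):
    """Reciprocal of the Hoelder conjugate exponent p' (1/p + 1/p' = 1)."""
    return 1 - inv_p


for d in range(3, 8):
    # (q, r) = (infinity, 2): the energy component of X_T
    assert is_admissible(Fraction(0), Fraction(1, 2), d)
    # (q, r) = (2, 2d/(d-2)): the endpoint component of X_T
    inv_q, inv_r = endpoint_pair(d)
    assert is_admissible(inv_q, inv_r, d)
    # Y_T uses the dual pairs (1, 2) and (2, 2d/(d+2))
    assert dual(Fraction(0)) == 1 and dual(Fraction(1, 2)) == Fraction(1, 2)
    assert dual(inv_q) == Fraction(1, 2) and dual(inv_r) == Fraction(d + 2, 2 * d)
```

The restriction $d\ge 3$ is what keeps $2d/(d-2)$ finite and positive, which matches the standing assumption of Theorem 4.1.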

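Lemma 4.3. Assume (H1)–(H9). Let $s_1$ be as in Lemma 2.3 and let ε∗ > 0 be a sufficiently small number. Then there exists a $C>0$ such that for every $T>0$,
$$\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T} \le C\varepsilon + C\sup_{0\le t\le T}\bigl(|\omega(t)-\omega_0| + |\dot\gamma(t)| + |z(t)|\bigr)\|z\|^{N+1}_{L^{2N+2}(0,T)} + C\Bigl(\sup_{0\le t\le T}|z(t)| + \|\tilde f_N\|^{\min(1,4/d)}_{X_T}\Bigr)\bigl(\|\tilde f_N\|_{X_T} + \|\nabla\tilde f_N\|_{X_T}\bigr) + C\sup_{0\le t\le T}|z(t)|^N\bigl(\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T}\bigr)^2. \tag{4.2}$$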

As in [3,8], let P+ (ω) and P− (ω) be the spectral projections defined by 1 P+ (ω) f = R Hω (λ + i0) − R Hω (λ − i0) f dλ, 2πi λ≥ω 1 P− (ω) f = R Hω (λ + i0) − R Hω (λ − i0) f dλ. 2πi λ≤−ω To apply the Strichartz estimate (Lemma 2.2) to (3.2), we will use a gauge transformation introduced by [3] and give a priori estimates for the remainder terms. Lemma 4.4. Assume (H1)–(H9) and that ε∗ is sufficiently small. For t ≥ 0, i∂t f˜N = Hω0 + (θ˙ − ω0 )(P+ (ω0 ) − P− (ω0 )) f˜N ) m n +

(N m,n (ω0 )z z¯ + N N , m+n=N +1

(4.3)


i∂t f N +1 = Hω0 + (θ˙ − ω0 )(P+ (ω0 ) − P− (ω0 )) f N +1 N +1 , + N N +1 + N where

(4.4)

(N +1) (N +1) N N +1 = (N + 1) z N (i z˙ − λz) N +1,0 (ω) − z¯ N (i z˙ − λz)0,N +1 (ω) (N +1)

(N +1)

−(θ˙ − ω0 )(P+ (ω) − P− (ω))( N +1,0 z N +1 + 0,N +1 (ω)¯z N +1 ), and N YT + ∇ N N YT + N N +1 YT + ∇ N N +1 YT N +1 sup (|ω(t) − ω0 | + |z(t)|) z LN2N +2 0≤t≤T

+

sup 0≤t≤T

min(1, 4 ) (|ω(t) − ω0 | + |z(t)|) + f N X T d

( f N X T + ∇ f N X T ). (4.5)

To obtain Lemma 4.4, we need the following, which holds also under weaker hypotheses. Lemma 4.5 ([8]). Assume that ω → Vω is continuous with values in the Schwartz space S(Rd ; C4 ). Assume furthermore that for any ω ∈ O there are no eigenvalues of Hω in the continuous spectrum and the points ±ω are not resonances. Then Pc (ω)σ3 − (P+ (ω) − P− (ω)) B(L q ,L p ) ≤ c p,q (ω) < ∞ for any p ∈ [1, 2] q ∈ [2, ∞), where c p,q (ω) is a constant upper semicontinuous in ω. Proof of Lemma 4.4. By a simple computation, we have (4.3) and (4.4) with ◦ N +1 = N N + N N +1 , where N = Pc (ω0 )N N + δN N , N N δN N = Pc (ω0 ) i ω∂ ˙ ω Pc (ω) + (θ˙ − ω0 ) (Pc (ω)σ3 − P+ (ω0 ) + P− (ω0 )) f N ) (N ) m n + Pc (ω0 ) (N m,n (ω) − m,n (ω0 ) z z¯ , m+n=N +1

and ◦

N N +1 =

(N +1) mz m−1 z¯ n (i z˙ − λz) − nz m z¯ n−1 (i z˙ − λz) m,n (ω0 )

m,n∈N m+n=N +1

−(θ˙ − ω0 )

(N ) (P+ (ω0 ) − P− (ω0 ))m,n (ω0 )z m z¯ n .

m,n∈N m+n=N +1

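Applying Hölder's inequality to (3.3), we have
$$\|N_N\|_{Y_T} \lesssim \sup_{0\le t\le T}|z(t)|\Bigl(\|z\|^{N+1}_{L^{2(N+1)}(0,T)} + \|f_N\|_{X_T}\Bigr) + \|f_N\|^2_{X_T} + \|f_N\|^{\frac{d+4}{d}}_{X_T} + \|f_N\|_{X_T}\|f_N\|^{\frac{4}{d-2}}_{L^\infty(0,T;L^{\frac{2d}{d-2}})}.$$
Similarly, we have
$$\|\nabla N_N\|_{Y_T} \lesssim \sup_{0\le t\le T}|z(t)|\Bigl(\|z\|^{N+1}_{L^{2(N+1)}(0,T)} + \|f_N\|_{X_T} + \|\nabla f_N\|_{X_T}\Bigr) + \|\nabla f_N\|_{X_T}\Bigl(\|f_N\|_{X_T} + \|f_N\|^{\frac{4}{d}}_{X_T} + \|f_N\|^{\frac{4}{d-2}}_{L^\infty(0,T;L^{\frac{2d}{d-2}})}\Bigr).$$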
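The powers $\frac{d+4}{d}$ and $\frac{4}{d-2}$ in the two estimates above come from Hölder's inequality applied to the two pieces of the bound $|\beta(|u|^2)u| \lesssim |u|^{1+4/d} + |u|^{(d+2)/(d-2)}$. The sketch below checks the spatial exponent bookkeeping exactly with rational arithmetic; the function name `holder_exponents_match` is illustrative, and the splitting of factors between $L^2$ and $L^{2d/(d-2)}$ is the natural one rather than something stated explicitly in the text.

```python
from fractions import Fraction


def holder_exponents_match(d):
    """Check that both nonlinear terms land in L^{2d/(d+2)} in space (d >= 3)."""
    inv_sobolev = Fraction(d - 2, 2 * d)   # 1/r for r = 2d/(d-2)
    inv_dual = Fraction(d + 2, 2 * d)      # 1/r' for r' = 2d/(d+2)
    inv_two = Fraction(1, 2)               # 1/2 for L^2

    # |f|^{1+4/d}: one factor in L^{2d/(d-2)}, the remaining 4/d powers in L^2
    first = inv_sobolev + Fraction(4, d) * inv_two == inv_dual

    # |f|^{(d+2)/(d-2)} = |f| * |f|^{4/(d-2)}: every factor in L^{2d/(d-2)}
    second = inv_sobolev + Fraction(4, d - 2) * inv_sobolev == inv_dual

    return first and second


# The identity holds in every dimension d >= 3
assert all(holder_exponents_match(d) for d in range(3, 12))
```

In time, one factor is taken in $L^2(0,T)$ and the others in $L^\infty(0,T)$, so the product lies in $L^2(0,T;L^{2d/(d+2)})$, which is the second component of $Y_T$.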

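See [18] for the details. By (2.10), we have
$$|\dot\omega| + |\dot\theta - \omega| + |i\dot z - \lambda z| \lesssim |z|^2 + \|f_N\|^2_{L^{\frac{2d}{d-2}}}.$$
From the definition, it is obvious that $\partial_\omega P_c(\omega) \in B(L^{\frac{2d}{d+2}}, L^{\frac{2d}{d-2}})$. Thus by Lemma 4.5, it follows that
$$\|\delta N_N\|_{Y_T} + \|\nabla\delta N_N\|_{Y_T} \lesssim \sup_{0\le t\le T}\bigl(|z(t)|^2 + \|f(t)\|^2_{H^1}\bigr)\|f_N\|_{X_T} + \sup_{0\le t\le T}|\omega(t)-\omega_0|\Bigl(\|z\|^{N+1}_{L^{2(N+1)}(0,T)} + \|f_N\|_{X_T}\Bigr).$$
Similarly, we have
$$\|N_{N+1}\|_{Y_T} + \|\nabla N_{N+1}\|_{Y_T} \lesssim \|z\|^{N+2}_{L^{2(N+2)}(0,T)} + \sup_{0\le t\le T}|z(t)|^N\|f_N\|^2_{X_T}.$$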

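Combining the above, we obtain (4.5). Thus we complete the proof.

Proof of Lemma 4.2. Let $f_\pm = P_\pm(\omega_0)\tilde f_N$ and
$$U_\pm(t,s) = e^{\pm i\int_s^t(\omega_0-\dot\theta)\,d\tau}\,P_\pm(\omega_0)e^{-i(t-s)H_{\omega_0}}P_\pm(\omega_0).$$
It follows from Lemma 2.2 that there exists a $C>0$ such that
$$\|U_\pm(\cdot,s)\varphi\|_{X_T} \le C\|\varphi\|_{L^2} \tag{4.6}$$
for every $T\ge 0$, $s\in\mathbb{R}$ and $\varphi\in L^2(\mathbb{R}^d)$, and
$$\Bigl\|\int_0^t U_\pm(t,s)g(s)\,ds\Bigr\|_{X_T} \le C\|g\|_{Y_T} \tag{4.7}$$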

for every T ≥ 0 and g ∈ S(Rd+1 ). By Lemma 4.4, t f ± (t) = U± (t, 0) f ± (0) − i U± (t, s) 0

(4.6)

) m n

(N m,n (ω0 )z z¯

N . +N

(4.8)

m+n=N +1

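In view of Lemma 2.1 and the definition of $f_\pm(t)$, we have $\|f_\pm(0)\|_{H^1} \lesssim \varepsilon$. Applying (4.6) and (4.7) to (4.8), we have
$$\|f_\pm(t)\|_{X_T} + \|\nabla f_\pm(t)\|_{X_T} \lesssim \|f_\pm(0)\|_{H^1} + \|z\|^{N+1}_{L^{2(N+1)}(0,T)} + \|\widetilde N_N\|_{Y_T} + \|\nabla\widetilde N_N\|_{Y_T}$$
$$\lesssim \varepsilon + \sup_{0\le t\le T}\bigl(1 + |\omega(t)-\omega_0| + |z(t)|\bigr)\|z\|^{N+1}_{L^{2N+2}(0,T)} + \Bigl(\sup_{0\le t\le T}|z(t)| + \|f_N\|^{\min(1,4/d)}_{X_T}\Bigr)\bigl(\|f_N\|_{X_T} + \|\nabla f_N\|_{X_T}\bigr). \tag{4.9}$$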


By the definition of $P_c(\omega)$,
$$\|f_N - \tilde f_N\|_{H^1} \lesssim |\omega-\omega_0|\,\|e^{-a|x|} f_N\|_{L^2}. \tag{4.10}$$

Substituting (4.10) into (4.9), we obtain (4.1). Thus we complete the proof of Lemma 4.2. Proof of Lemma 4.3. Let h ± (t) = P± (ω0 ) f N +1 . Using the variation of constants formula, we have t N +1 )ds. h ± (t) = U± (t, 0)h ± (0) − i U± (t, s)(N N +1 + N 0

Put h ± (0) = h 0,1,± + h 0,2,± , where h 0,2,± = f ± (0) +

(N ) m,n (ω0 )z(0)m z(0)n .

m+n=N +1 m,n≥1 (N )

(N )

(N )

Note that m,n (0) ∈ H 1 if m, n ≥ 1, whereas N +1,0 (0) and 0,N +1 (0) may not belong to L 2 . Since s1 > 0, we have f Z T f X T . Applying (4.6) and (4.7), we have U± (t, 0)h 0,2,± Z T + ∇U± (t, 0)h 0,2,± Z T U± (t, 0)h 0,2,± X T + ∇U± (t, 0)h 0,2,± X T ε, and

t t U± (t, s)N N +1 ds + ∇ U± (t, s)N N +1 ds 0 0 ZT ZT t t ∇ U (t, s) N ds + U (t, s) N ds ± N +1 ± N +1 0

sup 0≤t≤T

0 N +1 (|ω(t) − ω0 | + |z(t)|) z L 2N +2

+

sup 0≤t≤T

XT

min(1, 4 ) |z(t)| + f˜N X T d

XT

( f˜N X T + ∇ f˜N X T )

in the same way as the proof of Lemma 4.2. (N ) (N ) By Lemma 2.3 and the definition of N +1,0 (0) and 0,N +1 (0), we have U± (t, 0)h 0,1,± Z T + ∇U± (t, 0)h 0,2,± Z T (N ) (N ) t −d/2 x s1 N +1,0 (0) H 1 + x s1 0,N +1 (0) H 1

L 2 (0,T )

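$\lesssim \varepsilon$. It follows from Lemma 3.2 that
$$|i\dot z - \lambda z| \lesssim |z|^2 + \|e^{-a|x|/2} f_{N+1}\|^2_{H^1},$$
$$|\dot\theta - \omega_0| \le |\dot\theta - \omega| + |\omega-\omega_0| \lesssim |\omega-\omega_0| + |z|^2 + \|e^{-a|x|/2} f_{N+1}\|^2_{H^1}.$$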


Thus by Lemma 2.3, t U± (t, s)N N +1 ds i=0,1

0

ZT

t + ∇ U± (t, s)N N +1 ds 0

ZT

t −d/2 N +1 N −a|x|/2 2 t − s

(ε|z(s)| + |z(s)| e f (s) )ds N +1 H1 0

+1 εz LN2N +2 (0,T )

L 2 (0,T )

+ sup |z(t)| ( f N +1 Z T + ∇ f N +1 Z T ) . N

2

0≤t≤T

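Combining the above, we obtain (4.2).

Now, we are in position to prove Theorems 1.1 and 4.1.

Proof of Theorems 1.1 and 4.1. Since $e^{i\omega_0 t}\phi_{\omega_0}$ is orbitally stable, Lemma 3.2 and Remark 2.1 imply that
$$\sup_{t\ge 0}\bigl(|z(t)| + |\omega(t)-\omega_0| + |\dot\gamma(t)|\bigr) \lesssim \varepsilon.$$
We have
$$\|f_N\|_{W^{k,p}} \approx \|\tilde f_N\|_{W^{k,p}} \tag{4.11}$$
for every $k\in\mathbb{Z}_{\ge 0}$ and $1\le p\le\infty$ because
$$\|\tilde f_N - f_N\|_{W^{k,p}} = \|(P_c(\omega) - P_c(\omega_0)) f_N\|_{W^{k,p}} \lesssim |\omega-\omega_0|\,\|f_N\|_{W^{k,p}}.$$
Thus by Lemmas 3.6, 4.2 and 4.3, it holds that for every $T\ge 0$,
$$\|z\|^{N+1}_{L^{2N+2}(0,T)} \lesssim \varepsilon + \|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T}, \tag{4.12}$$
$$\|f_N\|_{X_T} + \|\nabla f_N\|_{X_T} \lesssim \varepsilon + \|z\|^{N+1}_{L^{2N+2}(0,T)} + \Bigl(\varepsilon + \|f_N\|^{\min(1,4/d)}_{X_T}\Bigr)\bigl(\|f_N\|_{X_T} + \|\nabla f_N\|_{X_T}\bigr), \tag{4.13}$$
$$\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T} \lesssim \varepsilon + \varepsilon\|z\|^{N+1}_{L^{2N+2}(0,T)} + \Bigl(\varepsilon + \|f_N\|^{\min(1,4/d)}_{X_T}\Bigr)\bigl(\|f_N\|_{X_T} + \|\nabla f_N\|_{X_T}\bigr) + \varepsilon^N\bigl(\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T}\bigr)^2. \tag{4.14}$$
Let $A>0$ be a sufficiently large number. Adding (4.13) to (4.14) multiplied by $A$ and substituting (4.12) into the resulting equation, we have
$$\|f_N\|_{X_T} + \|\nabla f_N\|_{X_T} + \frac{A}{2}\bigl(\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T}\bigr) \lesssim \varepsilon + \|f_N\|^{\min(1,4/d)}_{X_T}\bigl(\|f_N\|_{X_T} + \|\nabla f_N\|_{X_T}\bigr) + \varepsilon^N\bigl(\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T}\bigr)^2.$$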
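The last inequality is closed by a standard continuity (bootstrap) argument. The toy sketch below illustrates the mechanism in a scalar model that is an assumption of this note, not the paper's exact inequality: if a continuous quantity $X(T)$ starts below $C\varepsilon$ and always satisfies $X \le C\varepsilon + C X^{1+\alpha}$ with $\alpha = \min(1,4/d)$, then $X$ can never reach the value $2C\varepsilon$ once $\varepsilon$ is small, because that value violates the inequality. The function name `barrier_holds` and the sample constants are hypothetical.

```python
def barrier_holds(C, eps, alpha):
    """Return True if X = 2*C*eps violates X <= C*eps + C*X**(1 + alpha).

    When this holds, a continuous X(T) with X(0) <= C*eps that satisfies the
    inequality for all T cannot cross the level 2*C*eps, so X(T) <= 2*C*eps.
    """
    level = 2.0 * C * eps
    return C * eps + C * level ** (1.0 + alpha) < level


# Small data: the barrier at 2*C*eps is effective (alpha = min(1, 4/d))
assert barrier_holds(C=2.0, eps=1e-3, alpha=1.0)   # e.g. d = 3 or d = 4
assert barrier_holds(C=2.0, eps=1e-3, alpha=0.8)   # e.g. d = 5

# Large data: the argument gives no information
assert not barrier_holds(C=2.0, eps=0.5, alpha=1.0)
```

The quadratic term with the prefactor $\varepsilon^N$ in the actual inequality is handled by the same reasoning, since it is also superlinear in the quantity being bounded.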


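Letting $T\to\infty$, we obtain
$$\|f_N\|_{L_t^\infty H_x^1 \cap L_t^2 W_x^{1,\frac{2d}{d-2}}} + \|\langle x\rangle^{-s_1} f_{N+1}\|_{L_t^2 H_x^1} \lesssim \varepsilon, \tag{4.15}$$
$$\int_0^\infty |z(t)|^{2N+2}\,dt \lesssim \varepsilon. \tag{4.16}$$
Since $\dot z$ is bounded from (2.10), it follows from (4.16) that $\lim_{t\to\infty} z(t) = 0$. Furthermore, Lemma 3.3, (4.15) and (4.16) imply that there exists an $\omega_+\in\mathcal{O}$ such that
$$\lim_{t\to\infty}\omega(t) = \lim_{t\to\infty}\tilde\omega(t) = \omega_+.$$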
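The passage from (4.16) to $z(t)\to 0$ uses a standard fact about integrable functions with bounded derivative. A short sketch, in which $g$, $M$, $\delta$ and $t_k$ are auxiliary symbols introduced here and not used in the paper:

```latex
% Set g(t) = |z(t)|^{2N+2} \ge 0. By (4.16), g is integrable on [0,\infty).
% Since z and \dot z are bounded, there is M > 0 with
|g'(t)| \le (2N+2)\,|z(t)|^{2N+1}\,|\dot z(t)| \le M \qquad (t \ge 0).
% Suppose g does not tend to 0: there are \delta > 0 and t_k \to \infty,
% with t_{k+1} - t_k \ge \delta/(2M), such that g(t_k) \ge \delta. Then
g(t) \ge \frac{\delta}{2} \quad\text{for } t \in \Bigl[t_k,\; t_k + \frac{\delta}{2M}\Bigr],
\qquad\text{so}\qquad
\int_0^\infty g(t)\,dt \;\ge\; \sum_{k} \frac{\delta}{2}\cdot\frac{\delta}{2M} \;=\; \infty ,
% contradicting (4.16). Hence g(t) \to 0, that is, z(t) \to 0 as t \to \infty.
```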

Thus we prove Theorem 1.1. Finally, we will prove that is f N (t) is asymptotically free as t → ∞. Let U (t, s) = U+ (t, s) + U− (t, s) and t2 ≥ t1 ≥ 0. Lemma 2.2 and (4.3) yield that as t1 → ∞, U (0, t2 ) f˜N (t2 ) − U (0, t1 ) f˜N (t1 ) 1 H t2 (N ) m n = U (0, s)

m,n (ω0 )z z¯ + N N ds t1 m+n=N +1

+1 z LN2N +2 (t ,t ) 1 2

N +1 + N

H1

L 1 (t1 ,t2 ;H 1 (Rd ))+L 2 (t1 ,t2 ;W

2d 1, d+2

(Rd ))

→ 0.

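Hence there exists $\tilde f_\infty \in H^1(\mathbb{R}^d)$ such that
$$\lim_{t\to\infty}\|\tilde f_N(t) - U(t,0)\tilde f_\infty\|_{H^1} = 0.$$
For $q\in(2,\frac{2d}{d-2})$, we have
$$\lim_{t\to\infty}\|\tilde f_N(t)\|_{L^q} = \lim_{t\to\infty}\|U(t,0)\tilde f_\infty\|_{L^q} = 0.$$
By the definition of $f_N$ and $\tilde f_N$ and (4.11),
$$\|\tilde f_N(t) - f_N(t)\|_{H^1} = \|(P_c(\omega) - P_c(\omega_0)) f_N\|_{H^1} \lesssim |\omega-\omega_0|\,\|f_N\|_{L^q} \lesssim \|\tilde f_N\|_{L^q} \to 0$$
as $t\to\infty$. Combining the above, we have by the definition of $U(t,0)$,
$$\lim_{t\to+\infty}\bigl\|f_N(t) - e^{i[t\omega_0-\theta(t)+\theta(0)](P_+(\omega_0)-P_-(\omega_0))}e^{-itH_{\omega_0}}\tilde f_\infty\bigr\|_{H^1} = 0.$$
Consider the strong limit $W(\omega_0) = \lim_{t\to\infty} e^{itH_{\omega_0}}e^{it(\Delta-\omega_0)\sigma_3}$ and set $f_\infty = W(\omega_0)^{-1}e^{i\theta(0)(P_+(\omega_0)-P_-(\omega_0))}\tilde f_\infty$. Notice that since $e^{it\omega_0\sigma_3}$ is a unitary matrix periodic in $t$ and $e^{it\omega_0\sigma_3}f_\infty$ describes a circle in $L^2$, we have
$$\lim_{t\to+\infty}\bigl\|W(\omega_0)e^{it\omega_0\sigma_3}f_\infty - e^{itH_{\omega_0}}e^{it(\Delta-\omega_0)\sigma_3}e^{it\omega_0\sigma_3}f_\infty\bigr\|_{H^1} = 0.$$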


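Since, by Lemma 2.2, $\|e^{itH_{\omega_0}}\|_{L_t^\infty B(L^2_c(H_{\omega_0}),L^2_c(H_{\omega_0}))} \lesssim 1$, we have
$$\bigl\|e^{-itH_{\omega_0}}W(\omega_0)e^{it\omega_0\sigma_3}f_\infty - e^{it(\Delta-\omega_0)\sigma_3}e^{it\omega_0\sigma_3}f_\infty\bigr\|_{H^1} \approx \bigl\|W(\omega_0)e^{it\omega_0\sigma_3}f_\infty - e^{itH_{\omega_0}}e^{it(\Delta-\omega_0)\sigma_3}e^{it\omega_0\sigma_3}f_\infty\bigr\|_{H^1},$$
so the above limit implies
$$\lim_{t\to+\infty}\bigl\|e^{-itH_{\omega_0}}W(\omega_0)e^{it\omega_0\sigma_3}f_\infty - e^{it(\Delta-\omega_0)\sigma_3}e^{it\omega_0\sigma_3}f_\infty\bigr\|_{H^1} = 0.$$
Since $W(\omega_0)$ conjugates $H_{\omega_0}$ into $\sigma_3(-\Delta+\omega_0)$, we get
$$e^{(it\omega_0+i\theta(0))(P_+(\omega_0)-P_-(\omega_0))}e^{-itH_{\omega_0}}\tilde f_\infty = e^{-itH_{\omega_0}}W(\omega_0)e^{it\omega_0\sigma_3}f_\infty.$$
Thus we get the following, completing the proof of Theorem 4.1:
$$\lim_{t\to+\infty}\bigl\|e^{i\theta(t)\sigma_3}f_N(t) - e^{it\Delta\sigma_3}f_\infty\bigr\|_{H^1} = 0.$$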

Corollary 4.6. If Hypothesis 3.5 holds, then $\Gamma(\omega,\omega) > \Gamma$ holds.

Suppose we have $\Gamma(\omega,\omega_0) < -\Gamma/2$. We can pick initial datum so that $f_{N+1}(0) = 0$ and $z(0)\approx\epsilon$. Then from Lemma 4.3 we get $\|f_{N+1}\|_{Z_T} + \|\nabla f_{N+1}\|_{Z_T} \le C\epsilon^2$ for any $T$, for fixed $C>0$. Then integrating (3.13) we get
$$|\tilde z(t)|^2 - |\tilde z(0)|^2 \ge \frac{\Gamma}{2}\int_0^t|\tilde z|^{2N+2} + o(\epsilon)\Bigl(\int_0^t|\tilde z|^{2N+2}\Bigr)^{1/2} + o(\epsilon^2).$$
For large $t$, $|\tilde z(t)| < |\tilde z(0)|$ since $\tilde z(t)\to 0$, so for large $t$ we get $\int_0^t|\tilde z|^{2N+2} = o(\epsilon^2)$. In particular, for $t\to\infty$ we get $\epsilon^2 \le o(\epsilon^2)$, which is absurd for $\epsilon\to 0$.

5. Proof of Theorem 1.2

We will provide only a sketch of the proof. The argument is essentially the same as that of Theorem 1.1. However, when we select the main terms of the equations of the discrete modes, we have more than just one dominating term. Since these dominating terms could cancel each other, the situation is harder than the one in (3.13). We resolve all problems by assuming Hypothesis 5.2, which is very close in spirit to the (FGR) hypothesis in [45].

The eigenvalues $\lambda_j(\omega)$ have corresponding real eigenvectors $\xi_j(\omega)$, normalized so that $\langle\xi_j,\sigma_3\xi_\ell\rangle = \delta_{j\ell}$. $\sigma_1\xi(\omega)$ generates $N(H_\omega + \lambda(\omega))$. The $\xi_j(\omega)$ can be chosen real because $H_\omega$ has real coefficients. The functions $(\omega,x)\in\mathcal{O}\times\mathbb{R}^d \to \xi_j(\omega,x)$ are $C^2$; $|\xi_j(\omega,x)| < ce^{-a|x|}$ for fixed $c>0$ and $a>0$ if $\omega\in K\subset\mathcal{O}$, $K$ compact. $\xi_j(\omega,x)$ is even in $x$ since by assumption we are restricting ourselves to the category of such functions. We order the indexes so that $N_1\le N_2\le\cdots$. We set
$$R(t) = (z\cdot\xi + \bar z\cdot\sigma_1\xi) + f(t) \in \Bigl[\bigoplus_{j,\pm} N(H(t)\mp\lambda_j(t))\Bigr]\oplus L^2_c(H(t)),$$


" m z j ξ j . In the sequel we use the multi index notation z m = j z j j . − → → a ≤ b if a j ≤ b j Denote by N the largest of the N j . Given two vectors we will write − − → − → for all components. If this happens we write a ω and if M 1 < M then − → M 1 · λ < ω. Then we have: where z · ξ =

Theorem 5.1. Assume (H1)–(H6), (H7’)–(H10’) (in particular Hypothesis 5.2 below) and that d ≥ 3. Let u be a solution of (NLS), U = t(u, u). Let m,n (ω) ∈ S(Rd , R2 ) be the vectors rapidly decreasing for |x| → ∞, with real entries, and with continuous dependence on ω. Then if ε∗ is sufficiently small, there exist C 1 -functions ω(t) and θ (t), a constant ω+ ∈ O such that supt≥0 |ω(t) − ω0 | = O(ε), limt→+∞ ω(t) = ω+ , and we can write U (t, x) = eiθ(t)σ3 ω(t) (x) + ζ (t) · ξ(ω(t)) + ζ (t) · σ1 ξ(ω(t)) n m,n (ω(t))ζ (t)m ζ (t) + eiθ(t)σ3 f N (t, x), +eiθ(t)σ3 2≤|m+n|≤N |(m−n)·λ(ω)|<ω

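with, for a fixed $C>0$,
$$\sum_{M\in\mathrm{Res}}\|\zeta^M(t)\|_{L_t^2} + \|f_N(t,x)\|_{L_t^\infty H_x^1\cap L_t^2 W_x^{1,2d/(d-2)}} \le C\epsilon.$$
Furthermore, there exists $f_+\in H^1(\mathbb{R}^d,\mathbb{C}^2)$ such that
$$\lim_{t\to\infty}\bigl\|f_N(t) - e^{-i\theta(t)\sigma_3}e^{it\Delta\sigma_3}f_+\bigr\|_{H^1} = 0.$$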

We consider k = 1, 2, . . . N and set f = f k and z (k), j = z j for k = 1. The other f k and z (k), j are defined below by induction:

E ODE (k) =

M 2 M O(|z (k) | ) + O(z (k) f k ) + O( f k2 ) + O(β(| f k |2 f k )).

M∈Res

In the PDE’s there will be error terms of the form E PDE (k) =

M∈Res

Oloc (|z (k) | M |)|z (k) | + Oloc (z (k) f k ) + O( f k2 ) + O(β(| f k |2 f k )).


For k = 1, f 1 = f and z (k), j = z j thanks to (2.9) we have # m n (k) i ω , ˙ ∂ω = m,n (ω)z (k) z¯ (k) 2≤|m+n|≤2N +1

+

$

m n z (k) z¯ (k) A(k) m,n (ω) f k

+ E O D E (k), ,

1≤|m+n|≤N

i z˙ j,(k) − λ j z j,(k) =

N |m|=1

+

(5.1)

# (k) m 2 a j,m (ω)|z (k) | z (k), j

+

m n (k) m,n (ω)z (k) z¯ (k)

k+1≤|m+n|≤2N +1 (m−n)·λ=λ j

$

m n z (k) z¯ (k) A(k) m,n (ω) f k

+ E O D E (k), σ3 ξ j

1≤|m+n|≤N

i∂t f k = (Hω + σ3 γ˙ ) f k + E P D E (k) (k) m n Rm,n (ω)z (k) z¯ (k) (sum over pairs with |(m − n) · λ| < ω) + k+1≤|m+n|≤N +1

+

(k) m n Rm,n (ω)z (k) z¯ (k) (sum over pairs with |(m − n) · λ| > ω)

2≤|m+n|≤N +1

(k) with a j,m = 0 and (k) (k) A(k) m,n , Rm,n and m,n real, rapidly decreasing in x,

(5.2)

(k) (k) = −Rn,m . continuous in (ω, x), with σ1 Rm,n

We set f 1 = f and, summing only over (m, n) with |(m − n) · λ| < ω, we define inductively f k with k ≤ N by (k−1) m n f k = f k−1 + R Hω ((m − n) · λ)Pc (Hω )Rm,n (ω)z (k−1) z¯ (k−1) . |m+n|=k

(k−1)

(k−1)

(k−1)

By σ1 Rm,n = −Rn,m , by [σ1 , Pc (Hω )] = 0, by the fact that Rm,n is real and by σ1 Hω = −Hω σ1 , we get σ1 f k = f k . Summing only over (m, n) with λ j (ω) = (m − n) · λ(ω), we set z (k), j = z (k−1), j +

|m+n|=k

m n z (k−1) z¯ (k−1)

λ j − (m − n) · λ

(k−1) m,n , σ3 ξ j .

By induction f k and z (k) solve (5.2) and (5.2). At the step k = N , we can define

ζ j = z (N ), j + p j (z (N ) , z (N ) ) ω = ω + q(ζ, ζ¯ ) +

m n z (N ) z (N ) f N , α jmn ,

1≤|m+n|≤N

1≤|m+n|≤N

ζ m ζ¯ n f N , βmn ,

(5.3)


with: α jmn and βmn vectors with entries which are real valued exponentially decreasing functions; p j polynomials in (z (N ) , z (N ) ) with real coefficients and whose monomials have degree not smaller than N + 1; q a polynomial in (ζ, ζ ) with real coefficients and monomials at least quadratic. The above transformation can be chosen so that with a j,m (ω) real we have i ω˙ = E PDE (N ),

i ζ˙j − λ j (ω)ζ j =

+

a j,m (ω)|ζ m |2 ζ j + E ODE (N ), σ3 ξ j

(5.4)

1≤|m|≤N

ζ

n

(N ) A0,n (ω) f N , σ3 ξ j .

n+δ j ∈Res

Now we fix ω0 = ω(0), set H = H (ω(0)) and rewrite the equation for f N , i∂t Pc (ω0 ) f N = H + (θ˙ − ω0 )(P+ (ω0 ) − P− (ω0 )) Pc (ω0 ) f N (N ) PDE (N ) + +Pc (ω0 ) E Pc (ω0 )Rm,n (ω0 )ζ m ζ¯ n ,

(5.5)

2≤|m+n|≤N +1

where in the summation |m + n| ≤ N implies |(m − n) · λ| > ω and with (N ) (N ) PDE (N ) = E PDE (N ) + E Pc (ω0 ) Rm,n (ω) − Rm,n (ω0 ) ζ m ζ¯ n 2≤|m+n|≤N +1

+(θ˙ − ω0 ) (Pc (ω0 )σ3 − (P+ (ω0 ) − P− (ω0 ))) f N + (V (ω) − V (ω0 )) f N +(θ˙ − ω0 ) (Pc (ω) − Pc (ω0 )) σ3 f N . (5.6) Next, recall H = H (ω(0)), We set (N ) fN = − R H ((m − n) · λ(ω0 ) + i0)Pc (ω)Rm,n (ω0 )ζ m ζ¯ n + f N +1 , (5.7) 2≤|m+n|≤N +1

where in the summation |m + n| ≤ N implies |(m − n) · λ| > ω. Substituting in (5.4) we get i ω˙ = E P D E (N ), ,

i ζ˙j − λ j (ω)ζ j =

1≤|m|≤N (N ) × A0,n (ω)R H ((m

+

n

a j,m (ω)|ζ m |2 ζ j −

N +1

ζ mζ

n+ n

n |≥2 n+δ j ∈Res |m+ (N )

− n ) · λ(ω0 ) + i0)Pc (ω)Rm, n (ω0 ), σ3 ξ j

(N )

ζ A0,n (ω) f N +1 , σ3 ξ j + E O D E (N ), σ3 ξ j .

(5.8)

n+δ j ∈Res

Substituting in (5.5), where k = N , and writing as in (5.6) we get i∂t Pc (ω0 ) f N +1 = H + (θ˙ − ω0 )(P+ (ω0 ) − P− (ω0 )) Pc (ω0 ) f N +1 (N ) + O(|ζ ||m+n|+1 )R H ((m − n) · λ(ω0 ) + i0)Rm,n (ω0 ) 2≤|m+n|≤N +1

P D E (N ) +Pc (ω0 ) E

(5.9)


where O(|ζ ||m+n|+1 ) = O(|ζ M ζ |) with M ∈ Res for the factors in the above sum. In (5.8) we eliminate by a new change of variables ζ j = ζ j + p j (ζ, ζ ) the terms with n+ n

n ζ mζ not of the form |ζ m |ζ j . The p j (z, z) are polynomials with monomials z m z n+ M M which, by (m + n ) · λ > ω, are O(z ) for M ∈ Res. This implies M∈Res ζ (t) L 2 ≈ t M M∈Res ζ (t) L 2 . In the new variables t

i ω˙ = E P D E (N ),

i ζ˙j − λ j (ω) ζj − ζj = a j,m (ω)| ζ m |2

1≤|m|≤N

m+δ j ∈Res

ζj | ζ m |2

) (N ) × A(N 0,m (ω)R H (m · λ(ω0 ) + λ j (ω0 ) + i0)Rm+δ j ,0 (ω0 ), σ3 ξ j (ω)

m (N ) ζ A0,m (ω) f N +1 , σ3 ξ j (ω) + E O D E (N ), σ3 ξ j

+

(5.10)

m+δ j ∈Res (N )

(N )

with a j,m , A0,m and Rm+δ j ,0 real and with all the m such that m + δ j ∈ Res. We can denote by m+δ j , j (ω, ω0 ) the quantity ) (N ) (ω)R (m · λ(ω ) + λ (ω ) + i0)R (ω )σ ξ (ω)

m+δ j , j (ω, ω0 ) = A(N H 0 j 0 0 3 j 0,m m+δ j ,0 (N )

(N )

= π A0,m (ω)δ(H − m · λ(ω) − λ j (ω))Pc (ω0 )Rm+δ j ,0 (ω)σ3 ξ j (ω) .

(5.11)

Then ζ j |2 d | =− dt 2 +(

ζ j |2 m+δ j , j (ω, ω0 )| ζ m

m+δ j ∈Res m

(N ) ζ ζ j + E O D E (N ), σ3 ξ j (ω) ζ j ). (5.12) A0,m (ω) f N +1 , σ3 ξ j (ω)

m+δ j ∈Res

Notice that (5.12) contains more terms than (3.13) and that the signs of $\Gamma_{m+\delta_j,j}$ now matter. Denote by $\mathrm{Res}_j$ the subset of Res whose elements have at least 1 in the $j$-th component. We assume the following hypothesis:

Hypothesis 5.2. For $m\in\mathrm{Res}$, let $J(m) = \{j : m\in\mathrm{Res}_j\}$. There is a fixed $C_0>0$ such that for $|z|<\epsilon$,
$$\sum_{m\in\mathrm{Res}}|z^m|^2\sum_{j\in J(m)}\Gamma_{m,j}(\omega,\omega) \ge C_0\sum_{m\in\mathrm{Res}}|z^m|^2.$$

Assuming Hypothesis 5.2 we obtain Theorem 5.1 proceeding along the lines of the proof of Theorem 1.1. Remark 5.1. It is possible that a formula of the following form might be true (N ) (N ) A0,m−δ j (ω)R Hω (m · λ(ω) + i0)Pc (ω)Rm,0 (ω0 ), σ3 ξ j (ω)

j∈J (m)

(N )

(N )

= Cm δ(Hω − m · λ(ω))Rm,0 (ω), σ3 Rm,0 (ω)

(5.13)

for some constant $C_m>0$. It is elementary to show (5.13) if we replace $A^{(N)}_{0,m-\delta_j}$ with $A_{0,m-\delta_j}$ and $R^{(N)}_{m,0}$ with $R_{m,0}$, from the Taylor expansion in (2.9). For $N=1$ this yields Theorem 1.2, substituting Hypothesis 5.2 with a generic hypothesis similar to Hypothesis 3.5. Indeed if $N=1$, it is easy to see that $A^{(N)}_{0,\delta_j} = A_{0,\delta_j}$ and $R^{(N)}_{\delta_j+\delta_k,0} = R_{\delta_j+\delta_k,0}$. To get (5.13) in the general case, one should exploit the Hamiltonian nature of (NLS), which has been lost in our proof.

A. Appendix

Proof of Lemma 3.3. Following the idea of [4, Prop. 4.1], we will transform (3.4) into (3.11) and (3.12) by induction. Let $\omega_1 = \omega$ and let
$$\omega_{k+1} = \omega_k + \sum_{\substack{m,n\ge 0\\ m+n=k}}\langle f_N, \tilde\alpha^{(k)}_{m,n}(\omega)\rangle z^m\bar z^n. \tag{A.1}$$
We will determine $\tilde\alpha^{(k)}_{m,n}(\omega)\in H_a(\mathbb{R}^d,\mathbb{R}^2)$ so that
$$i\dot\omega_k = \sum_{2\le m+n\le 2N+1} b^{(k)}_{m,n}(\omega)z^m\bar z^n + \sum_{k+1\le m+n\le N}\langle f_N, \alpha^{(k)}_{m,n}\rangle z^m\bar z^n + O\bigl(|z|^{2N+2} + \|e^{-a|x|/2}f_{N+1}\|^2_{H^1}\bigr) \tag{A.2}$$

for k = 1, . . . N . For k = 1, Eq. (A.2) follows from Lemma 3.2. Furthermore, we have (1) (1) (1) (1) (1) bm,n (ω) = −bn,m (ω), αm,n (ω) = αn,m (ω) and σ1 αm,n (ω) = −αn,m (ω) because ω is a real number and f N = σ1 f N .

(A.3)

Suppose that (A.2), that ωk is a real number, and that (k) (k) (k) bm,n (ω) are real numbers with bm,n (ω) = −bn,m (ω),

(A.4)

(k) αm,n (ω)

(A.5)

∈ Ha (R , R ), d

2

(k) σ1 αm,n (ω)

=

(k) −αn,m (ω)

are true for k = l with l ≤ N . Differentiating (A.1) with respect to t and substituting (3.4), (3.6) and (A.2) with k = l into the resulting equation, we obtain (l) i∂t f N , α˜ m,n (ω) z m z¯ n i ω˙ l+1 = i ω˙ l + m+n=l

& % d (l) (l) i f N , α˜ m,n + (ω) (z m z¯ n ) + i ω ˙ f N , ∂ω α˜ m,n (ω) z m z¯ n dt m+n=l ( ' (l) m n (l) (l) f N , αm,n = bm,n z z¯ + + (Hω∗ + (m − n)λ)α˜ m,n 2≤m+n≤2N +1

+

m+n=l

#

p+1=N +1

m+n=l ) p q

(N p,q (ω)z z¯

$ (l) + N N , α˜ m,n

z m z¯ n

(l) (l) + γ˙ Pc (ω)σ3 f N , α˜ m,n (ω) + i ω ˙ f N , ∂ω α˜ m,n (ω) z m z¯ n


+

(l+1) f N , α˜ m,n (ω) mz m−1 z¯ n (i z˙ − λz) − nz m z¯ n−1 (i z˙ − λz)

m+n=l

+O(|z|2N +2 + e−a|x|/2 f N +1 2H 1 ). (l)

(l)

Put α˜ m,n (ω) = R Hω∗ ((n − m)λ)αm,n (ω). Then by Lemma 3.2, the definition of N N and |ω| ˙ + |γ˙ | + |i z˙ − λz| + e−a|x| N N H 1 |z|2 + e−a|x|/2 f N +1 2H 1 , (l+1)

it holds that (A.2) with k = l + 1 is true for some bm,n (ω) ∈ R (2 ≤ m + n ≤ 2N + 1) (l+1) and αm,n (ω) ∈ Ha (Rd ; R2 ) ∩ L 2c (Hω∗ ) (m + n = l + 1). Note that N N can be expanded into a formal power series of z, z¯ and f N whose coefficients are real. (l) , (A.5) with k = l and the fact that σ1 Hω σ1 = −Hω , By the definition of α˜ m,n (l) (l) σ1 α˜ m,n (ω) = α˜ m,n (ω).

(A.6)

From (A.3), (A.6) and (A.2) for k = l + 1, we see that ωl+1 is a real number and that (A.4) and (A.5) are true for k = l + 1. Thus we prove

$$i\dot\omega_{N+1} = \sum_{2\le m+n\le 2N+1} b^{(N+1)}_{m,n}(\omega)z^m\bar z^n + O\bigl(|z|^{2N+2} + \|e^{-a|x|/2}f_{N+1}\|^2_{H^1}\bigr), \tag{A.7}$$
where $b^{(N+1)}_{m,n}(\omega)$ are real numbers satisfying $b^{(N+1)}_{m,n}(\omega) = -b^{(N+1)}_{n,m}(\omega)$. In particular, we have $b^{(N+1)}_{n,n} = 0$ for $n = 1,\dots,N$. Using
$$\frac{d}{dt}(z^m\bar z^n) = z^m\bar z^n\Bigl(-i\lambda(m-n) + O\bigl(|z|^2 + \|e^{-a|x|/2}f\|^2_{L^2}\bigr)\Bigr),$$
we can find a real polynomial $\tilde p(x,y)$ of degree $2N+1$ such that
$$\tilde\omega = \omega_{N+1} + \tilde p(z,\bar z), \qquad \dot{\tilde\omega} = O\bigl(|z|^{2N+2} + \|e^{-a|x|/2}f_{N+1}\|^2_{H^1}\bigr).$$
Thus we complete the proof.

Proof of Lemma 3.4. Let $z_1 = z$ and
$$z_{k+1} = z_k + \sum_{\substack{m+n=k\\ n\ne N}}\langle f_N, \tilde\gamma^{(k)}_{m,n}(\omega)\rangle z^m\bar z^n \quad\text{for } k = 1,\dots,N. \tag{A.8}$$
For $k = 1,\dots,N+1$, we will choose $\tilde\gamma^{(k)}_{m,n}\in H_a(\mathbb{R}^d;\mathbb{R}^2)\cap L^2_c(H_\omega^*)$ such that
$$i\dot z_k - \lambda z_k = r_k(z_k,\bar z_k) + \langle f_N, \gamma^{(k)}(z)\rangle + O\bigl(|z_k|^{2N+2} + \|e^{-a|x|/2}f_N\|^2_{H^1}\bigr), \tag{A.9}$$


where $r_k$ is a real polynomial of degree $2N+1$ with $r_k(x,y) = O(x^2+y^2)$ as $(x,y)\to(0,0)$,
$$\gamma^{(k)}(z) = \begin{cases}\displaystyle\sum_{k\le m+n\le N}\gamma^{(k)}_{m,n}z^m\bar z^n & \text{for } k = 1,\dots,N,\\[2mm] \gamma^{(N)}_{0,N}\bar z^N & \text{for } k = N+1,\end{cases}$$
and $\gamma^{(k)}_{m,n}(\omega)\in H_a(\mathbb{R}^d;\mathbb{R}^2)\cap L^2_c(H_\omega^*)$. This is true for $k=1$. Assume (A.9) for $k = l\le N$ and substitute (A.8) into (A.9). Then $i\dot z_{l+1} - \lambda z_{l+1}$

= i z˙l − λzl +

(l) (Hω − λ(m − n − 1)) f N , γ˜m,n

z m z¯ n

m+n=l,n= N

(

' (l) (l) i Pc (ω)∂t f N +1 − Hω f N , γ˜m,n + i ω ˙ f N , ∂ω γ˜m,n +

z m z¯ n m+n=l n= N

+

(l) mz m−1 z¯ n (i z˙ − λz) − nz m z¯ n−1 (i z˙ − λz) f N , γm,n

z m z¯ n .

(A.10)

m+n=l n= N

Substituting (3.4) into (A.10) and letting (l) (l) γ˜m,n (ω) = R Hω∗ ((m − n − 1)λ)γm,n (ω),

we see that (A.9) is true for k = l + 1. Thus we complete the induction. By (3.9), (A.9) with k = N + 1 and the fact that |z N +1 − z| = O(z 2N +1 ), f N − f˜N H 1 |ω − ω0 |(e−a|x| f N +1 H 1 + |z| N +1 ), we have i z˙ N +1 − λz N +1

= r N +1 (z N +1 , z N +1 ) +

(N ) (N +1) n+N m,n (ω0 ), γ0,N (ω) z m N +1 z N +1

m+n=N +1

O |z|2N +2 + e−a|x|/2 f N +1 2H 1 + O |ω − ω0 |(|z| N e−a|x| f N +1 H 1 + |z|2N +1 .

(N ) + z N +1 f N +1 , γ0,N

+ N

(A.11)

The standard theory of normal forms (see [1]) tells us that by introducing a new variable n z˜ = z N +1 + c˜m,n (ω)z m N +1 z N +1 , 2≤m+n≤2N +1 m,n≥0, m−n=1 (N +1)

we can transform (A.11) into (3.12). Since r N +1 is a real polynomial and m,n (ω) ∈ Ha (Rd , R2 ) for m, n ∈ N with m + n = N + 1, it follows that c˜m,n (ω) ∈ R for n ≤ 2N and an (ω, ω0 ) ∈ R for 1 ≤ n ≤ N − 1 and by (3.10) with (N )

(N )

a N (ω, ω0 ) = R Hω0 ((N + 1)λ + i0) N +1,0 (ω0 ), γ0,N (ω) .

(A.12)
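The normal form step invoked above can be illustrated on a single monomial. The following is a sketch under simplifying assumptions: $w$ stands for $z_{N+1}$, $c$ and $\tilde c$ are placeholder coefficients, and all terms of higher order (including those involving $f_{N+1}$) are collected in "h.o.t.".

```latex
% Model equation with one monomial of degree m+n >= 2:
i\dot w - \lambda w = c\,w^m\bar w^n + \text{h.o.t.}
% To leading order \dot w = -i\lambda w and \dot{\bar w} = i\lambda\bar w, hence
i\frac{d}{dt}\bigl(w^m\bar w^n\bigr) = \lambda(m-n)\,w^m\bar w^n + \text{h.o.t.}
% For the new variable \tilde z = w + \tilde c\,w^m\bar w^n this gives
i\dot{\tilde z} - \lambda\tilde z
  = \bigl(c + \tilde c\,\lambda(m-n-1)\bigr)\,w^m\bar w^n + \text{h.o.t.}
% If m - n \ne 1, the monomial is removed by the choice
\tilde c = -\frac{c}{\lambda(m-n-1)} .
% If m - n = 1, then w^m\bar w^n = |w|^{2n} w and no choice of \tilde c removes it:
% these are the resonant terms a_n(\omega,\omega_0)|\tilde z|^{2n}\tilde z kept in (3.12).
```

This matches the restriction $m-n\ne 1$ in the sum defining $\tilde z$ and the shift by $(m-n-1)\lambda$ in the resolvent used to define $\tilde\gamma^{(l)}_{m,n}$.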


Remark A.1. By

1 x−i0

(N )

= P V x1 + iπ δ0 (x), by [8] and by the fact that N +1,0 (ω0 ) and

(N ) γ0,N (ω) have real entries, we have

(N )

(N )

R Hω0 ((N + 1)λ(ω0 ) + i0) N +1,0 (ω0 ), γ0,N (ω)

) (N ) = π δ0 Hω0 − (N + 1)λ(ω0 ) (N N +1,0 (ω0 ), γ0,N (ω) .

(A.13)
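The identity $\frac{1}{x-i0} = PV\frac{1}{x} + i\pi\delta_0(x)$ used above can be checked numerically on a scalar test function. The sketch below is an illustration only: it replaces $i0$ by a small $i\epsilon$, uses the Gaussian $e^{-x^2}$ as test function, and compares the imaginary part of the pairing with $\pi$ times the value of the test function at $0$. The function name `imag_pairing` and the quadrature parameters are choices made here.

```python
import math


def imag_pairing(eps, half_width=10.0, step=1e-4):
    """Midpoint-rule value of Im of the integral of exp(-x^2) / (x - i*eps) dx.

    Since 1/(x - i*eps) = (x + i*eps)/(x^2 + eps^2), the imaginary part of the
    integrand is exp(-x^2) * eps / (x^2 + eps^2), a Lorentzian of width eps.
    """
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * step
        total += math.exp(-x * x) * eps / (x * x + eps * eps)
    return total * step


eps = 0.01
value = imag_pairing(eps)

# Closed form for this test function: pi * exp(eps^2) * erfc(eps)
exact = math.pi * math.exp(eps * eps) * math.erfc(eps)
assert abs(value - exact) < 1e-6

# As eps -> 0 this tends to pi * exp(-0^2) = pi, the delta-function contribution
assert abs(value - math.pi) < 0.05
```

The real part of the same integral vanishes here because the test function is even, so only the $i\pi\delta_0$ term contributes, in the same way that only the $\delta$ term survives in (A.13) when the vectors involved have real entries.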

If Hypothesis 3.5 fails because (N )

δ(Hω − (N + 1)λ(ω)) N +1,0 (ω) = 0

(A.14)

(N )

identically in ω, then by [12] the vector N +1,0 (ω) is real and rapidly decreasing to 0 as |x| → ∞. This suggests that we can continue the normal form expansion one more step. Acknowledgements. Scipio Cuccagna wishes to thank Professor Yoshio Tsutsumi for supporting a visit at Kyushu and Kyoto Universities where part of this work was carried out, and Gang Zhou for information about [44,45]. Tetsu Mizumachi is supported by Grant-in-Aid for Scientific Research (No. 17740079).

References

1. Arnold, V.I.: Geometrical methods in the theory of ordinary differential equations. Grundlehren der Mathematischen Wissenschaften 250, New York: Springer-Verlag, 1983
2. Buslaev, V.S., Perelman, G.S.: Scattering for the nonlinear Schrödinger equation: states close to a soliton. St. Petersburg Math. J. 4, 1111–1142 (1993)
3. Buslaev, V.S., Perelman, G.S.: On the stability of solitary waves for nonlinear Schrödinger equations. In: Nonlinear evolution equations, N.N. Uraltseva, ed. Transl. Ser. 2, 164, Providence, RI: Amer. Math. Soc., 1995, pp 75–98
4. Buslaev, V.S., Sulem, C.: On the asymptotic stability of solitary waves of Nonlinear Schrödinger equations. Ann. Inst. H. Poincaré. An. Nonlin. 20, 419–475 (2003)
5. Cazenave, T.: Semilinear Schrodinger equations. Courant Lecture Notes in Mathematics 10, New York University, Courant Institute of Mathematical Sciences, Providence, RI: Amer. Math. Soc., 2003
6. Cazenave, T., Lions, P.L.: Orbital stability of standing waves for nonlinear Schrödinger equations. Commun. Math. Phys. 85, 549–561 (1982)
7. Cuccagna, S.: Stabilization of solutions to nonlinear Schrödinger equations, Comm. Pure App. Math. 54, 1110–1145 (2001); Comm. Pure Appl. Math. 58, 147 (2005)
8. Cuccagna, S.: On asymptotic stability of ground states of NLS. Rev. Math. Phys. 15, 877–903 (2003)
9. Cuccagna, S.: Dispersion for Schrödinger equation with periodic potential in 1D. To appear J. Diff. Eq.
10. Cuccagna, S.: On instability of excited states of the nonlinear Schrödinger equation. http://arxiv.org/abs/0801.4237v2 [math.AP], 2008
11. Cuccagna, S., Pelinovsky, D.: Bifurcations from the endpoints of the essential spectrum in the linearized nonlinear Schrodinger problem. J. Math. Phys. 46, 053520 (2005)
12. Cuccagna, S., Pelinovsky, D., Vougalter, V.: Spectra of positive and negative energies in the linearization of the NLS problem. Comm. Pure Appl. Math. 58, 1–29 (2005)
13. Cuccagna, S., Tarulli, M.: On asymptotic stability in energy space of ground states of NLS in 2D. http://arxiv.org/abs/0801.1277v1 [math.AP], 2008
14. Dancer, E.N.: A note on asymptotic uniqueness for some nonlinearities which change sign. Bull. Austral. Math. Soc. 61, 305–312 (2000)
15. Fibich, G., Wang, X.P.: Stability of solitary waves for nonlinear Schrödinger equations with inhomogeneous nonlinearities. Physica D 175, 96–108 (2003)
16. Grillakis, M., Shatah, J., Strauss, W.: Stability of solitary waves in the presence of symmetries, I. J. Funct. An. 74, 160–197 (1987)
17. Grillakis, M., Shatah, J., Strauss, W.: Stability of solitary waves in the presence of symmetries, II. Jour. Funct. An. 94, 308–348 (1990)
18. Gustafson, S., Nakanishi, K., Tsai, T.P.: Asymptotic Stability and Completeness in the Energy Space for Nonlinear Schrödinger Equations with Small Solitary Waves. Int. Math. Res. Notices 66, 3559–3584 (2004)
19. Kabeya, Y., Tanaka, K.: Uniqueness of positive radial solutions of semilinear elliptic equations in R^N and Sere's non-degeneracy condition. Comm. Partial Differ. Eqs. 24, 563–598 (1999)
20. Keel, M., Tao, T.: Endpoint Strichartz estimates. Amer. J. Math. 120, 955–980 (1998)
21. Kwong, M.K.: Uniqueness of positive solutions of Δu − u + u^p = 0 in R^n. Arch. Rat. Mech. Anal. 105, 243–266 (1989)
22. McLeod, K.: Uniqueness of positive radial solutions of Δu + f(u) = 0 in R^n, II. Trans. Amer. Math. Soc. 339, 495–505 (1993)
23. Mizumachi, T.: Asymptotic stability of small solitons to 1D NLS with potential. http://arxiv.org/abs/math.AP/0605031, 2006, to appear in J. Math. Kyoto Univ
24. Mizumachi, T.: Asymptotic stability of small solitons for 2D Nonlinear Schrödinger equations with potential. http://arxiv.org/abs/math.AP/0609323, 2006
25. Pillet, C.A., Wayne, C.E.: Invariant manifolds for a class of dispersive, Hamiltonian partial differential equations. J. Diff. Eq. 141, 310–326 (1997)
26. Perelman, G.S.: Asymptotic stability of solitons for nonlinear Schrödinger equations. Comm. in PDE 29, 1051–1095 (2004)
27. Rodnianski, I., Schlag, W., Soffer, A.: Asymptotic stability of N-soliton states of NLS. http://arxiv.org/abs/math.AP/0309114, 2003
28. Shatah, J., Strauss, W.: Instability of nonlinear bound states. Commun. Math. Phys. 100, 173–190 (1985)
29. Sigal, I.M.: Nonlinear wave and Schrödinger equations. I. Instability of periodic and quasi-periodic solutions. Commun. Math. Phys. 153, 297–320 (1993)
30. Stuart, D.M.A.: Modulation approach to stability for non topological solitons in semilinear wave equations. J. Math. Pures Appl. 80, 51–83 (2001)
31. Soffer, A., Weinstein, M.: Multichannel nonlinear scattering II. The Case of Anisotropic Potentials and Data. J. Diff. Eq. 98, 376–390 (1992)
32. Soffer, A., Weinstein, M.: Selection of the ground state for nonlinear Schrödinger equations. Rev. Math. Phys. 16, 977–1071 (2004)
33. Soffer, A., Weinstein, M.: Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations. Invent. Math. 136, 9–74 (1999)
34. Tsai, T.P.: Asymptotic dynamics of nonlinear Schrödinger equations with many bound states. J. Diff. Eq. 192, 225–282 (2003)
35. Tsai, T.P., Yau, H.T.: Asymptotic dynamics of nonlinear Schrödinger equations: resonance dominated and radiation dominated solutions. Comm. Pure Appl. Math. 55, 153–216 (2002)
36. Tsai, T.P., Yau, H.T.: Relaxation of excited states in nonlinear Schrödinger equations. Int. Math. Res. Not. 31, 1629–1673 (2002)
37. Tsai, T.P., Yau, H.T.: Classification of asymptotic profiles for nonlinear Schrödinger equations with small initial data. Adv. Theor. Math. Phys. 6, 107–139 (2002)
38. Weder, R.: Center manifold for nonintegrable nonlinear Schrödinger equations on the line. Commun. Math. Phys. 170, 343–356 (2000)
39. Weinstein, M.: Lyapunov stability of ground states of nonlinear dispersive equations. Comm. Pure Appl. Math. 39, 51–68 (1986)
40. Weinstein, M.: Modulation stability of ground states of nonlinear Schrödinger equations. Siam J. Math. Anal. 16, 472–491 (1985)
41. Wei, J., Winter, M.: On a cubic-quintic Ginzburg-Landau equation with global coupling. Proc. Amer. Math. Soc. 133, 1787–1796 (2005)
42. Yajima, K.: The W^{k,p}-continuity of wave operators for Schrödinger operators. J. Math. Soc. Japan 47, 551–581 (1995)
43. Yajima, K.: The W^{k,p}-continuity of wave operators for Schrödinger operators III. J. Math. Sci. Univ. Tokyo 2, 311–346 (1995)
44. Zhou, G.: Perturbation Expansion and Nth Order Fermi Golden Rule of the Nonlinear Schrödinger Equations. http://arxiv.org/abs/math.AP/0610381, 2006
45. Zhou, G., Sigal, I.M.: Relaxation of Solitons in Nonlinear Schrödinger Equations with Potential. http://arxiv.org/abs/math-ph/0603060, 2006

Communicated by H.-T. Yau

Commun. Math. Phys. 284, 79–91 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0606-2

Communications in

Mathematical Physics

A Rigorous Path Integral for Supersymmetric Quantum Mechanics and the Heat Kernel

Dana S. Fine1, Stephen F. Sawin2

1 Department of Mathematics, University of Massachusetts Dartmouth, North Dartmouth, MA 02747, USA. E-mail: [email protected]

2 Department of Mathematics and Computer Science, Fairfield University, Fairfield, CT 06824, USA. E-mail: [email protected]

Received: 19 July 2007 / Accepted: 7 April 2008
Published online: 19 August 2008 – © Springer-Verlag 2008

Abstract: In a rigorous construction of the path integral for supersymmetric quantum mechanics on a Riemann manifold, based on Bär and Pfäffle’s use of piecewise geodesic paths, the kernel of the time evolution operator is the heat kernel for the Laplacian on forms. The path integral is approximated by the integral of a form on the space of piecewise geodesic paths which is the pullback by a natural section of Mathai and Quillen’s Thom form of a bundle over this space. In the case of closed paths, the bundle is the tangent space to the space of geodesic paths, and the integral of this form passes in the limit to the supertrace of the heat kernel.

Introduction

In [B-P] Bär and Pfäffle construct a path integral representation of the heat kernel for a general Laplacian on a Riemann manifold. They express the path integral as an integral over piecewise geodesic paths in the limit as $n$, the number of pieces, approaches infinity. In this note, we begin with the Lagrangian for $N = 1$ supersymmetric quantum mechanics (SUSYQM), restrict the action to piecewise geodesic paths, and identify the resulting expression as a form on a finite-dimensional manifold. This form derives directly from Mathai and Quillen’s universal Thom form. We interpret the integral of the top part of this form over the finite-dimensional space as defining an approximation to the path integral representing the kernel of the SUSYQM time evolution operator. Applying Bär and Pfäffle’s arguments to evaluate the appropriate large-$n$ limit shows the partition functions for piecewise geodesic paths with fixed endpoints converge to the heat kernel for the Laplacian on forms. Precisely, we prove as a corollary to Bär and Pfäffle’s Theorems 2.8 and 6.1:

Theorem 3.2. For any sequence of partitions $t_1, t_2, \ldots, t_n$ such that $\max_i(t_i) \to 0$ and $\sum_i t_i \to t$, and for any form $\alpha$ on $M$,

\[ \lim \mathbf{K}^{MQ}(t_1)\,\mathbf{K}^{MQ}(t_2) \cdots \mathbf{K}^{MQ}(t_n)\,\alpha = e^{-t\Delta/2}\alpha, \]

where $\Delta$ is the Laplace-Beltrami operator on forms. Moreover, for some such sequence of partitions

\[ \lim K^{MQ}(t_1) * K^{MQ}(t_2) * \cdots * K^{MQ}(t_n) = K^{\Delta}(x, y; t) \]

uniformly, where $K^{\Delta}$ is the heat kernel of $\Delta$ (the kernel of $e^{-t\Delta/2}$).

Here the kernel $K^{MQ}(t)$ of the operator $\mathbf{K}^{MQ}(t)$ is the pullback by a certain natural section of Mathai and Quillen’s Thom form on a vector bundle over a finite-dimensional (open) manifold. In fact, the indicated $n$-fold $*$-product expresses an integration of the analogous Mathai-Quillen Thom form on a bundle over $M^{n+1}$ restricted to an open subset and pulled back by a section. The import of this theorem is that the finite-dimensional partition functions which directly approximate the kernel of the time evolution operator $e^{-t\Delta/2}$ converge to the heat kernel. Further, for closed paths based at a given point, this yields a rigorous path integral expression for the supertrace of the heat kernel. This path integral is the large-$n$ limit of the Mathai-Quillen Euler form integrated over the finite-dimensional manifold.

Getzler [G] uses stochastic integrals due to Stroock [S], and asymptotics of the heat operator for the Laplacian on spinors due to Patodi [P], to calculate the supertrace of this heat operator as a rigorous path integral. Rogers [R1] uses stochastic analysis techniques to express the heat operator on forms in terms of a supersymmetric generalization of Wiener integrals and thereby obtains a path integral expression for the supertrace of the heat operator. The novelty of our approach is in constructing a rigorous path integral that directly links the heat operator to the SUSYQM time evolution operator and the Mathai-Quillen construction. These results confirm Alvarez-Gaumé’s [A] and Witten’s [W] now-standard arguments, which express the supertrace of the heat operator heuristically as a path integral.
Our approach to rigorizing these arguments is sufficiently direct to see the relation, as derived formally by Blau [B], between SUSYQM and Mathai & Quillen’s universal Thom form [M-Q].

1. Preliminaries and Notation

We review the key facts needed from Riemannian geometry and fix notation, most of which follows Berline, Getzler and Vergne [B-G-V].

1.1. Notation for Riemannian geometry. Let $M$ be a compact oriented $2m$-dimensional Riemann manifold. In a coordinate patch let $\partial_\mu$ be the corresponding basis of tangent fields, $\psi^\mu$ be the dual basis of one-forms¹, and $\iota_\mu$ be the odd derivation on forms defined by $\iota_\mu \psi^\nu = \delta_\mu^\nu$. The metric $g_{\mu\nu} = (\partial_\mu, \partial_\nu)$ determines Christoffel symbols

\[ \Gamma^{\gamma}_{\mu\nu} = \frac{1}{2} g^{\gamma\eta}\left(\partial_\nu g_{\mu\eta} + \partial_\mu g_{\nu\eta} - \partial_\eta g_{\mu\nu}\right) = \frac{1}{2} g^{\gamma\eta}\left(g_{\mu\eta,\nu} + g_{\nu\eta,\mu} - g_{\mu\nu,\eta}\right), \tag{1.1} \]

1 The element of the dual basis is more commonly denoted $dx^\mu$. We use $\psi^\mu$ in anticipation of the interpretation in terms of supersymmetric variables in (1.4) below.

Rigorous SUSYQM Path Integral and the Heat Kernel

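(indices after the comma denote differentiation in that coordinate) in terms of which the Levi-Civita connection is

\[ \nabla_\mu (Y^\nu \partial_\nu) = (\partial Y^\nu / \partial x^\mu)\,\partial_\nu + \Gamma^{\nu}_{\mu\eta} Y^\eta \partial_\nu. \]

The operator $\nabla$ extends to a one-form with values in differential operators on forms by

\[ \nabla_\mu = \iota_\mu d - \Gamma^{\eta}_{\mu\nu} \psi^\nu \iota_\eta. \]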
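As an illustration of Eq. (1.1), the following minimal sketch evaluates the Christoffel symbols numerically for the round unit 2-sphere. The choice of example metric, the function names, and the use of central finite differences are all assumptions made for this illustration and are not part of the paper; the helper `inverse_2x2` restricts the sketch to two dimensions.

```python
import math


def unit_sphere_metric(x):
    """Round metric g = diag(1, sin^2 theta) in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix (keeps the sketch free of external libraries)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Return Gamma[c][a][b] = Gamma^c_{ab} from Eq. (1.1), via central differences."""
    n = len(x)
    dg = []  # dg[k][i][j] approximates the derivative of g_ij along x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[c][e] * (dg[b][a][e] + dg[a][b][e] - dg[e][a][b])
                        for e in range(n))
              for b in range(n)] for a in range(n)] for c in range(n)]


gamma = christoffel(unit_sphere_metric, (1.0, 0.3))
# Compare with the closed forms -sin(theta)cos(theta) and cot(theta).
print(gamma[0][1][1], -math.sin(1.0) * math.cos(1.0))
print(gamma[1][0][1], math.cos(1.0) / math.sin(1.0))
```

The two printed pairs should agree up to the finite-difference error, which gives a quick check on the index placement in Eq. (1.1).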

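The curvature $R$ of the Levi-Civita connection is a smooth two-form with values in linear transformations on the fiber. Acting on the coordinate basis, it is $R(\partial_\pi, \partial_\eta) \cdot \partial_\mu = R_{\pi\eta\mu}{}^{\nu} \partial_\nu$, where

\[ R_{\mu\nu\gamma}{}^{\delta} = \Gamma^{\delta}_{\nu\gamma,\mu} - \Gamma^{\delta}_{\mu\gamma,\nu} + \Gamma^{\delta}_{\mu\chi}\Gamma^{\chi}_{\nu\gamma} - \Gamma^{\delta}_{\nu\chi}\Gamma^{\chi}_{\mu\gamma}. \tag{1.2} \]

We will freely raise and lower all four indices on $R$ with the metric, keeping track of the order by spacing. With this convention, the symmetries of $R$ are

\[ R_{\mu\nu\pi\eta} = R_{\pi\eta\mu\nu} = -R_{\nu\mu\pi\eta}, \qquad R_{\mu\nu\pi\eta} + R_{\pi\mu\nu\eta} + R_{\nu\pi\mu\eta} = 0. \]

The Ricci tensor is

\[ \mathrm{Ricci}_{\sigma\tau} = R_{\sigma\mu\tau}{}^{\mu}. \tag{1.3} \]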

1.2. Laplace-Beltrami and heat kernels. The Laplace-Beltrami operator $\Delta$ on the space $\Omega(M)$ of forms is

\[ \Delta = -g^{\mu\nu}\left(\nabla_\mu \nabla_\nu - \Gamma^{\sigma}_{\mu\nu}\nabla_\sigma\right) - \mathrm{Ricci}^{\pi}{}_{\eta}\,\psi^\eta \iota_\pi - \frac{1}{2} R_{\mu\eta}{}^{\nu\pi}\,\psi^\mu \psi^\eta \iota_\nu \iota_\pi. \tag{1.4} \]

The evolution operator $e^{-t\Delta/2}$ is a semigroup of operators on $\Omega(M)$ depending on a parameter $t \in [0, \infty)$ such that for $\alpha \in \Omega(M)$, $\alpha_t = e^{-t\Delta/2}\alpha$ is a solution to the heat equation $(\Delta/2 + \partial_t)\alpha_t = 0$ with $\alpha_0 = \alpha$ as initial conditions. The heat kernel, a smooth map $K^{\Delta}$ from $(0, \infty)$ to sections of $\Omega(M \times M)$, provides an integral representation of the time evolution operator. Explicitly, for $\alpha \in \Omega(M)$,

\[ (e^{-t\Delta/2}\alpha)(x) = K^{\Delta} * \alpha = \int_{y \in M} K^{\Delta}(x, y; t)\,\alpha(y), \]

where on the right-hand side we wedge the forms over $y$ together, take the top form piece, and integrate over the second factor of $M$. In general operators on $\Omega(M)$ are represented by forms in $\Omega(M \times M)$, with operator composition

\[ K_1 * K_2(x, z) = \int_{y \in M} K_1(x, y)\,K_2(y, z). \tag{1.5} \]

1.3. Riemann Normal Coordinates. Orthonormal coordinates on $T_y M$ extend via $\exp_y$ to coordinates on a patch of $M$ called Riemann normal coordinates. In Riemann normal coordinates lines from the origin are geodesics with length consistent with the coordinates, and the following hold, where $\vec{x} = x^\mu \partial_\mu$ is the tangent vector at $y$ corresponding to $x$ and $|\vec{x}|$ is its length:

\[ g^{\mu\nu}(x) = \delta^{\mu\nu} - \frac{1}{3} R^{\mu}{}_{\sigma}{}^{\nu}{}_{\tau}(0)\,x^\sigma x^\tau + O(|\vec{x}|^3), \tag{1.6} \]

\[ \Gamma^{\delta}_{\mu\gamma}(x) = -\frac{1}{3}\left(R_{\mu\nu\gamma}{}^{\delta}(0) + R_{\gamma\nu\mu}{}^{\delta}(0)\right) x^\nu + O(|\vec{x}|^2). \tag{1.7} \]

Finally, any vector $v \in T_x M$ defines two vectors in $T_y M$: the first is $\vec{v} = (d\exp_y)^{-1} v$; the second is the parallel translate $v^{\parallel}$ of $v$ along the geodesic from $x$ to $y$. These are related by

\[ v^{\parallel} = \vec{v} + \frac{1}{6} R(\vec{x}, \vec{v}) \cdot \vec{x} + O(|\vec{x}|^3)\,|v|. \tag{1.8} \]

Whenever we work in Riemann normal coordinates we implicitly restrict attention to a patch within the injectivity radius of the center, small enough that there is a unique geodesic from the center to each point.

1.4. Supersymmetric variables. If $V$ is a vector space, we represent elements of $\Lambda(V^*)$ by formulas involving an anticommuting element $\psi$ of $V$. For example, given a basis $e_1, \ldots, e_n$ of $V$, an antisymmetric matrix $\omega_{\mu\nu}$ determines an element $\omega(\psi)$ of $\Lambda^2(V^*)$ via $\omega(\psi) = \frac{1}{2}\omega_{\mu\nu}\psi^\mu\psi^\nu$, with $\psi = \psi^\mu e_\mu$. In the latter expansion of $\psi$, each $\psi^\mu$ is an anticommuting numerical variable. On the other hand, each $\psi^\mu$ in the expansion of an element of $\Lambda(V^*)$ is a map sending the real element $v \in V$ to a real number; namely, its component $v^\mu$ in the given basis. Thus $\psi^1, \ldots, \psi^n$ also represents the basis of $V^*$ dual to $e_1, \ldots, e_n$. In this interpretation $\psi = \psi^\mu e_\mu$ is then an expression for the identity map $dx^\mu e_\mu$ on $V$ composed with the natural map from $V$ to the exterior algebra $\Lambda(V)$. In calculations it is usually easier to work with $\psi$ as denoting an anticommuting tangent vector; to interpret the resulting expressions, it is helpful to remember it means this identity map.
By the same token we can consider $\rho$ as an anticommuting variable in $V^*$ which will be used in formulas representing elements of $\Lambda(V)$. In this context $\rho_\mu$ replaces $e_\mu$ in the usual expressions. Equivalently, $\rho$ represents the identity map on $V^*$. Most often $\rho$ will be used inside a Berezin integral. This integration is defined for $f$ an anticommuting polynomial, in terms of a volume form on $V$, by: $\int f(\rho)\,d\rho$ is the volume of the $\dim(V)$-degree piece of $f$. For example if $V$ is $2m$-dimensional, with a basis chosen so $\psi^1 \cdots \psi^{2m}$ is the volume form, and $g(\psi)$ is an anticommuting polynomial in $\psi$, then the Berezin integral over $\rho$ is

\[
\begin{aligned}
\int e^{i\langle \rho, \psi \rangle} g(\psi)\,d\rho
&= \int \sum_k \frac{(i\rho_\mu \psi^\mu)^k}{k!}\, g_{\nu_1 \cdots \nu_l}\psi^{\nu_1} \cdots \psi^{\nu_l}\,d\rho \\
&= \frac{(-1)^m}{(2m)!} \int \rho_{\mu_1}\psi^{\mu_1} \cdots \rho_{\mu_{2m}}\psi^{\mu_{2m}}\, g_{\nu_1 \cdots \nu_l}\psi^{\nu_1} \cdots \psi^{\nu_l}\,d\rho \\
&= \psi^1 \cdots \psi^{2m}\, g_{\nu_1 \cdots \nu_l}\psi^{\nu_1} \cdots \psi^{\nu_l} \\
&= g(0)\,\psi^1 \cdots \psi^{2m}.
\end{aligned}
\]
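The Berezin integral displayed above can be checked mechanically in low dimensions. The sketch below is only an illustration under stated assumptions: it stores Grassmann polynomials as dictionaries from sorted generator labels to coefficients, labels $\rho_\mu$ by `mu` and $\psi^\mu$ by `n + mu`, and adopts the sign convention that the Berezin integral returns the coefficient of $\rho_1 \cdots \rho_n$ once all $\rho$'s are moved to the left in ascending order. None of these names or conventions are taken from the paper.

```python
def normal_order(gens):
    """Sort generator labels, tracking the sign of the permutation.

    Returns (0, ()) when a generator repeats, since its square vanishes.
    """
    gens = list(gens)
    if len(set(gens)) < len(gens):
        return 0, ()
    sign = 1
    for i in range(len(gens)):
        for j in range(len(gens) - 1 - i):
            if gens[j] > gens[j + 1]:
                gens[j], gens[j + 1] = gens[j + 1], gens[j]
                sign = -sign
    return sign, tuple(gens)


def mul(a, b):
    """Product of two Grassmann polynomials stored as {sorted labels: coefficient}."""
    out = {}
    for ka, ca in a.items():
        for kb, cb in b.items():
            sign, key = normal_order(ka + kb)
            if sign:
                out[key] = out.get(key, 0) + sign * ca * cb
    return {k: c for k, c in out.items() if c != 0}


def exp_i_pairing(n):
    """exp(i<rho, psi>), with rho_mu labelled mu and psi^mu labelled n + mu."""
    x = {(mu, n + mu): 1j for mu in range(n)}
    total, term = {(): 1}, {(): 1}
    for k in range(1, n + 1):  # higher powers vanish identically
        term = {key: c / k for key, c in mul(term, x).items()}
        for key, c in term.items():
            total[key] = total.get(key, 0) + c
    return total


def berezin_rho(poly, n):
    """Integrate out rho: keep the coefficient of rho_0 ... rho_{n-1} (leftmost)."""
    top = tuple(range(n))
    return {tuple(g - n for g in key[n:]): c
            for key, c in poly.items() if key[:n] == top}


n = 2                      # the case 2m = 2
g = {(): 3, (n,): 1}       # g(psi) = 3 + psi^1, so g(0) = 3
result = berezin_rho(mul(exp_i_pairing(n), g), n)
print(result)              # a single term proportional to psi^1 psi^2
```

With these conventions the only surviving term is the top form in $\psi$, with coefficient $g(0)$, in agreement with the last line of the displayed computation.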

The right-hand side of this Berezin integral, $g(0)\,\psi^1 \cdots \psi^{2m}$, denotes the 0-degree part of $g$ times the volume form on $V$. In this paper $\psi$ and $\rho$ will be anticommuting elements of the tangent and cotangent spaces, respectively, at a point in $M$, so that the formulas involving them will describe forms on $M$.

Two examples serve to illustrate Berezin integration in this context and to provide formulas we will require in Sect. 3. With $\psi_x$, $\rho^y$ and $\psi_y$ denoting anticommuting tangent and cotangent vectors at points $x$ and $y$ in $M$, $\psi_x^{\parallel}$ representing the parallel transport of $\psi_x$ from $x$ to $y$ along some path connecting them, and $\alpha \in \Lambda(T_y^* M)$,

\[ \int e^{i\langle \rho^y,\, \psi_x^{\parallel} - \psi_y \rangle}\,\alpha\; d\rho^y\, d\psi_y = \alpha^{\parallel}. \tag{1.9} \]

Here $\alpha^{\parallel}$ is $\alpha$ parallel transported along the given path. Thus we have an operator that can implement parallel transport. (Of course, parallel transport could be replaced by any linear map.) The key to this calculation is that the coefficient of the top form in $\rho$ is proportional to $[(\psi_x^{\parallel})^1 - \psi_y^1] \cdots [(\psi_x^{\parallel})^{2m} - \psi_y^{2m}]$. The top-form piece of the product of this with $\alpha$ will include terms like $(\psi_x^{\parallel})^1 \psi_y^2 \psi_y^3 (\psi_x^{\parallel})^4 \cdots \psi_y^{2m}\, \alpha_{14}(y)\, \psi_y^1 \psi_y^4$. Rearranging the order of factors, this contributes the term $\psi_y^1 \cdots \psi_y^{2m}\, \alpha_{14}(y) (\psi_x^{\parallel})^1 (\psi_x^{\parallel})^4$, which leads, after integration with respect to $\psi_y$, to the $\alpha_{14}(y)(\psi_x^{\parallel})^1 (\psi_x^{\parallel})^4$ part of $\alpha^{\parallel}$ on the right-hand side. Likewise, for $\mu \in \{1, \ldots, 2m\}$,

\[ \int \rho^y_\mu\, e^{i\langle \rho^y,\, \psi_x^{\parallel} - \psi_y \rangle}\,\alpha\; d\rho^y\, d\psi_y = i\,\iota_\mu \alpha^{\parallel}. \tag{1.10} \]

In this notation, $f(x, \psi_x)$, for $f$ smoothly varying in $x$ and an antisymmetric multinomial in $\psi_x$, corresponds to a smooth differential form $f$ on $M$. Moreover, $\int f(x, \psi_x)\, d\psi_x\, dx$ is the integral $\int_M f$ of the top-form part of $f$ over $M$.

2. Discrete Approximation to the SUSYQM Lagrangian

In this section we define a sequence of finite-dimensional subspaces of the space of paths in $M$ on which we interpret the $N = 1$ supersymmetric quantum mechanical Lagrangian as a form. This form describes a kernel which is an operator product of a number of copies of a simpler kernel described by a form $K^{MQ}$ on a $4m$-dimensional space. We will ultimately apply Bär & Pfäffle’s arguments to show that, as the dimension of the subspaces increases, the product of kernels converges uniformly to the kernel of the Laplace-Beltrami heat operator.

2.1. Short geodesics. A short geodesic is a geodesic of length less than the injectivity radius of $M$. The space of short geodesics is isomorphic to $M^{(2)}$, the subspace of $M^2$ consisting of pairs of points within the injectivity radius of each other. (We take our paths as oriented but not parameterized; later we will choose parameterizations.) Let $\mathrm{Path}_n$ denote the space of $n$-segment piecewise short geodesic paths in $M$, and let $\mathrm{Path}_n(x, y)$ denote the subspace of those going from $y$ to $x$. $\mathrm{Path}_n$ is isomorphic to $M^{(n+1)}$, the subspace of $M^{n+1}$ in which each successive point is within the injectivity radius of the previous. If $\sigma_t$ is a short geodesic in $\mathrm{Path}_1$ the isomorphism with $M^{(2)}$ sends $\sigma$ to $(x, y)$, where $x = \sigma_1$ and $y = \sigma_0$. Note the unconventional choice of a path going from $y$ to $x$. This is necessitated by the standard conventions of kernels and operators.

2.2. Tangents to short geodesics. If $\sigma_t$ is a geodesic, represent a tangent vector to it in the space of geodesics by a tangent field $\psi_t \in T_{\sigma_t} M$ along $\sigma$.² Let $\psi_t^{\parallel} \in T_{\sigma_0} M$ be the parallel translate of $\psi_t$ from $\sigma_t$ to $\sigma_0$ along $\sigma$ according to the Levi-Civita connection. Suppose $\sigma$ from $t = 0$ to $t = 1$ is a short geodesic, and take $\psi_t$ to be tangent to a one-parameter family of short geodesics. Since each geodesic in this family is determined by its endpoints, $\psi_t$ should be determined by $\psi_0$ and $\psi_1$. In fact,

Lemma 2.1. If $\sigma_t$ is a geodesic path mapping $[0, 1]$ to $M$, $\psi_t$ is a tangent field along $\sigma$, $d = d(\sigma_0, \sigma_1)$ is the distance between the endpoints of $\sigma$, and $|\psi| = \max(|\psi_0|, |\psi_1|)$, then

\[ \psi_t^{\parallel} = t\,\psi_1^{\parallel} + (1 - t)\,\psi_0 + \frac{t^3 - t}{6}\, R(\dot\sigma_0, \psi_1^{\parallel}) \cdot \dot\sigma_0 - \frac{t^3 - 3t^2 + 2t}{6}\, R(\dot\sigma_0, \psi_0) \cdot \dot\sigma_0 + O(d^3)\,|\psi|, \tag{2.1} \]

where $R$ is computed at $\sigma_0$, and $\dot\sigma_t = \partial_t \sigma_t$.

Proof. Since the result is linear in $\psi_1$ and $\psi_0$, we prove it when $\psi_0 = 0$. The case $\psi_1 = 0$ and thus the general case follow from reversing the parameterization. In Riemann normal coordinates centered at $\sigma_0$, $\psi_1 = (d\exp_{\sigma_0})\vec\psi_1$ for some $\vec\psi_1 \in T_{\sigma_0} M$. Extend $\vec\psi_1$ to a path of tangent vectors as $\vec\psi_t = t\,\vec\psi_1$. Note that because lines through the origin are geodesics in Riemann normal coordinates, this path of tangent vectors describes a tangent vector to the space of geodesics. Applying Eq. (1.8) to $\vec\psi_t$ and $\vec\psi_1$ gives

\[ \psi_t^{\parallel} = \vec\psi_t + \frac{t^2}{6}\, R(\dot\sigma_0, \vec\psi_t) \cdot \dot\sigma_0 + O(d^3)\,|\psi|, \]

\[ \psi_1^{\parallel} = \vec\psi_1 + \frac{1}{6}\, R(\dot\sigma_0, \vec\psi_1) \cdot \dot\sigma_0 + O(d^3)\,|\psi|, \]

so

\[ \psi_t^{\parallel} = t\,\psi_1^{\parallel} + \frac{t^3 - t}{6}\, R(\dot\sigma_0, \psi_1^{\parallel}) \cdot \dot\sigma_0 + O(d^3)\,|\psi|. \]

Reversing the parameterization and assuming $\psi_1 = 0$ introduces the terms

\[ (1 - t)\,\psi_0^{\parallel} + \frac{(1 - t)^3 - (1 - t)}{6}\, R(\dot\sigma_1, \psi_0^{\parallel}) \cdot \dot\sigma_1, \]

where the parallel transport is from $\sigma_0$ to $\sigma_1$. After parallel transporting back to $\sigma_0$ in the second term, these become the additional terms the lemma requires. Note this substitution is permitted to the given order, since $\dot\sigma$ is parallel along $\sigma$, and the difference between applying the curvature and metric at $\sigma_1$ and applying them at $\sigma_0$ after parallel transport is of order $d^3|\psi|$.

Remark 2.1. The scale of the parameterization is of course arbitrary in the above lemma. $\psi$ is determined by its value at any two points of $\sigma$, and Eq. (2.1) continues to describe this dependence with the parameter $t$ adjusted appropriately.

2 In what follows, the components of this vector field could be either real or anticommuting numbers. Since our application of the lemma below will be to the anticommuting case, we use $\psi$ to denote a generic vector.

2.3. The SUSYQM Lagrangian. The action for $N = 1$ supersymmetric quantum mechanics on the manifold $M$ is

\[ S(\sigma, \psi, \rho, t) = \int_0^t \left[ -\frac{\dot\sigma_r^2}{2} + i\,\langle \rho^r, (\nabla_{\dot\sigma}\psi)_r \rangle - \frac{1}{4}\,\langle \rho^r, R(\psi_r, \psi_r) \cdot \rho^r \rangle \right] dr, \]

where $\sigma$ is an element of the space of paths in $M$, $\psi_r$ is an anticommuting element of the tangent to the space of paths, and $\rho^r$ is an anticommuting variable modeled on the dual to the tangent space of the space of paths. In a pairing $\int_0^t \langle \rho^r, \psi_r \rangle\, dr$, the end result is (at least formally) a one-form on the space of paths with values in linear functions in $\rho$. The partition function for SUSYQM on $M$ is

\[ Z = \int e^{S(\sigma, \psi, \rho, t)}. \]

The (formal) Berezin integration in $\rho$ produces a form on the space of paths. The “top form piece” of this form is integrated over the space of paths to give the partition function. Taking the paths to have fixed endpoints, the partition function is a path integral representation for the kernel of the time evolution operator or the Feynman propagator.

Given a family of paths $\sigma$, we may think of $\dot\sigma$ and $\psi$ as vector fields on $M$, which must necessarily commute, since the paths locally define coordinate curves which are integral curves for $\psi$ and $\dot\sigma$. Thus, in the action we may replace $\nabla_{\dot\sigma}\psi$ with $\nabla_\psi \dot\sigma$. We thereby recognize the Lagrangian as (formally) exactly the Mathai-Quillen Thom form on the tangent bundle to the space of paths, pulled back by the section $\dot\sigma$. The connection is the Levi-Civita connection determined by the metric $\int_0^t (X_t, Y_t)\, dt$. This observation and its formal consequences are due to Blau [B].

It is of course the integral over the infinite-dimensional space of paths that makes the links between the heat kernel, the partition function, and a Mathai-Quillen integral purely formal. However, if we interpret the path integral by restricting it to a sequence of finite-dimensional subspaces that in a reasonable sense approximate the whole space of paths, the arguments are correct on the finite-dimensional approximating spaces.
We approximate the space of continuous paths $\sigma : [0, t] \to M$ with $\sigma(0) = y$ and $\sigma(t) = x$ by $\mathrm{Path}_n(x, y)$. We choose positive numbers $t_1, \ldots, t_n$ such that $t = \sum_{i=1}^{n} t_i$ and parameterize each path in $\mathrm{Path}_n(x, y)$ so that the first segment is the image of $[0, t_1]$ parameterized proportionally to arclength (so the segment is a parameterized geodesic), the second segment is the image of $[t_1, t_1 + t_2]$ parameterized proportionally to arclength, and so forth. Let $\mathrm{Path}_n(x, y; t_1, \ldots, t_n)$ denote the space of paths in $\mathrm{Path}_n$ parameterized in this way so that the parameter length of the $i$th geodesic segment is $t_i$.

In the computation of the approximation to the partition function, $\psi$ will become an anticommuting vector tangent to $\mathrm{Path}_n$. This tangent space has dimension $2m(n + 1)$, since it consists of vectors $\psi_1, \ldots, \psi_{n+1}$ with $\psi_i \in T_{x_i} M$ and with the $x_i$ denoting the $(n + 1)$ endpoints of the geodesic segments. The situation with $\rho$ is a bit more complicated: The quantum mechanical state space consists of sums of anticommuting polynomials in $\psi_i^\mu$ with coefficients depending on $x_i$; these correspond to forms on $M^{n+1}$. The path integral will give a kernel of the time evolution operator which will act on the form on $\sigma_0$ representing the initial state. Thus we should think of the form at $\sigma_0$ as already being determined, so that the space in which $\psi$ lives is the space of all tangent vectors extending a given tangent vector at $\sigma_0$. The variable $\rho$ should thus not be a dual tangent vector at each of $n + 1$ terminal points of the geodesic pieces, as we might naively expect, because it should have no component dual

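to the tangent space at $\sigma_0$. Thus $\rho$ will consist of dual vectors $\rho^1, \rho^2, \ldots, \rho^n$, with each $\rho^i$ an anticommuting element in $T^*_{x_i} M$, where $x_i$ is the final point of the $i$th segment. The pairing of $\rho$ and $\psi$ is given by

\[ \int_0^t \langle \rho^r, \psi_r \rangle\, dr = \sum_{i=1}^{n} t_i\, \langle \rho^i, \psi_i \rangle. \]

Note that in local coordinates $x_i^\mu$ in a neighborhood of $x_i$, the pointwise pairing on the right-hand side is $\langle \rho^i, \psi_i \rangle = \rho^i_\mu \psi_i^\mu$. Thus the natural restriction of the path integral to the space of piecewise geodesic paths is

\[ K_n^{MQ}(x, y; t_1, \ldots, t_n) = \prod_{i=1}^{n} (2\pi t_i)^{-m} \int_{\mathrm{Path}_n(x, y; t_1, \ldots, t_n)} \cdots \int \exp\left( \sum_{i=1}^{n} \left[ -\frac{|\dot\sigma_i|^2 t_i}{2} + i\, t_i\, \langle \rho^i, (\nabla_{\dot\sigma}\psi)_i \rangle - \frac{t_i}{4}\, \langle \rho^i, R(\psi_i, \psi_i) \cdot \rho^i \rangle \right] \right) d\rho^1 \cdots d\rho^n, \tag{2.2} \]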

where $\dot\sigma_i$ denotes the tangent at the final point $x_i$ of the $i$th geodesic segment. The normalization factor out front is chosen to make the trivial case $M = \mathbb{R}^{2m}$ with the Euclidean metric work out right.

The expression on the right-hand side is the kernel of the product of $n$ operators, one for each geodesic segment, each of which has $K_1^{MQ}$ as its kernel. That is, the approximation to the kernel of the time-evolution operator is (suppressing the spatial variables)

\[ K_n^{MQ}(t_1, t_2, \ldots, t_n) = K_1^{MQ}(t_1) * K_1^{MQ}(t_2) * \cdots * K_1^{MQ}(t_n). \]

2.4. Expressing $K_1^{MQ}$ as a form on $M^{(2)}$. In this section we explicitly evaluate the form $K_1^{MQ}$ in terms of geometric invariants. It is natural to rescale the parameterization length to 1, and adjust the meaning of $\dot\sigma$ accordingly, to obtain the following form on the same path parameterized from 0 to 1:

\[ K^{MQ}(x, y; t) = (2\pi t)^{-m} \int \exp\left( -\frac{|\dot\sigma|^2}{2t} + i\, \langle \rho^x, (\nabla_{\dot\sigma}\psi)_x \rangle - \frac{t}{4}\, \langle \rho^x, R(\psi_x, \psi_x) \cdot \rho^x \rangle \right) d\rho^x. \]

This is a form on $\mathrm{Path}_1 \cong M^{(2)}$, and can be expressed as such. First, $\dot\sigma$ in Riemann normal coordinates centered at $x$ is $-\vec{y}$. From Lemma 2.1, by taking the derivative with respect to $t$ at $t = 1$ in Eq. (2.1) and parallel transporting everything to $x = \sigma_1$, we get

\[ (\nabla_{\dot\sigma}\psi)_x = \psi_x - \psi_y^{\parallel} + \frac{1}{3}\, R(\vec{y}, \psi_x) \cdot \vec{y} + \frac{1}{6}\, R(\vec{y}, \psi_y^{\parallel}) \cdot \vec{y} + O(|\vec{y}|^3)\,|\psi|. \]
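The coefficients $1/3$ and $1/6$ come from differentiating the polynomial prefactors of Eq. (2.1) at $t = 1$. The following sketch checks only that arithmetic, in exact rational numbers; it does not model the geometry or the parallel transport, and the variable names are illustrative. Since the curvature terms are quadratic in $\dot\sigma$, replacing $\dot\sigma$ by $-\vec{y}$ does not affect their sign.

```python
from fractions import Fraction as F


def poly_eval(coeffs, t):
    """Evaluate sum(coeffs[k] * t**k) for coefficients listed lowest degree first."""
    return sum(c * t**k for k, c in enumerate(coeffs))


def poly_deriv(coeffs):
    """Coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


# Polynomial prefactors appearing in Eq. (2.1), lowest degree first.
prefactors = {
    "psi_1": [F(0), F(1)],                          # t
    "psi_0": [F(1), F(-1)],                         # 1 - t
    "R_psi_1": [F(0), F(-1, 6), F(0), F(1, 6)],     # (t^3 - t)/6
    "R_psi_0": [F(0), F(-2, 6), F(3, 6), F(-1, 6)], # -(t^3 - 3t^2 + 2t)/6
}

# Derivative of each prefactor at t = 1.
slopes = {name: poly_eval(poly_deriv(c), F(1)) for name, c in prefactors.items()}
print(slopes)

# Reversing the parameterization, t -> 1 - t, maps (t^3 - t)/6 to the psi_0 curvature prefactor.
samples = [F(0), F(1, 3), F(1, 2), F(1), F(7, 5)]
reversal_ok = all(
    ((1 - t) ** 3 - (1 - t)) / 6 == poly_eval(prefactors["R_psi_0"], t)
    for t in samples
)
print(reversal_ok)
```

The slopes $1$, $-1$, $1/3$ and $1/6$ reproduce the four coefficients in the expression for $(\nabla_{\dot\sigma}\psi)_x$, and the reversal check confirms the sign of the second curvature term in Eq. (2.1).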

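So

\[ K^{MQ}(x, y; t) = (2\pi t)^{-m} \int \exp\left( -\frac{|\vec{y}|^2}{2t} - \frac{t}{4}\, \langle \rho^x, R(\psi_x, \psi_x) \cdot \rho^x \rangle + i\, \Big\langle \rho^x,\; \psi_x - \psi_y^{\parallel} + \frac{1}{3}\, R(\vec{y}, \psi_x) \cdot \vec{y} + \frac{1}{6}\, R(\vec{y}, \psi_y^{\parallel}) \cdot \vec{y} \Big\rangle + O(|\vec{y}|^3) \right) d\rho^x. \]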

2.5. Shifting $\psi_y$ to $\psi_x$. Suppose $\eta$ and $\pi$ are indices for an orthonormal basis of $T_y M$. Suppose $f(\rho)$ is an anticommuting polynomial in the $\rho_1, \ldots, \rho_{2m}$ excluding $\rho_\eta$, and $g(\psi)$ is an anticommuting polynomial in the $\psi^1, \ldots, \psi^{2m}$ excluding $\psi^\pi$. Then

\[ \int i\rho_\pi \left( \psi_x - \psi_y^{\parallel} \right)^{\eta} f(\rho)\, g(\psi)\, \exp\left( i\, \langle \rho, \psi_x - \psi_y^{\parallel} \rangle \right) d\rho = \int f(\rho)\, g(\psi)\, \delta_\pi^\eta\, \exp\left( i\, \langle \rho, \psi_x - \psi_y^{\parallel} \rangle \right) d\rho. \]

In particular, within an integral against $\exp\left( i\, \langle \rho, \psi_x - \psi_y^{\parallel} \rangle \right)$,

\[ \frac{i}{6}\, \langle \rho, R(\vec{y}, \psi_y^{\parallel}) \cdot \vec{y} \rangle = \frac{i}{6}\, \langle \rho, R(\vec{y}, \psi_x) \cdot \vec{y} \rangle - \frac{1}{6}\, (\vec{y}, \mathrm{Ricci} \cdot \vec{y}). \]

So defining

\[ H(x, y; t) = (2\pi t)^{-m} \exp\left( -\frac{1}{2t}\, d(x, y)^2 \right) \tag{2.3} \]

for $x, y \in M$ within the injectivity radius of each other and $t > 0$ (and zero otherwise) this gives

\[ K^{MQ}(x, y; t) = H(x, y; t) \int \exp\left( i\, \Big\langle \rho^x,\; \psi_x - \psi_y^{\parallel} + \frac{1}{2}\, R(\vec{y}, \psi_x) \cdot \vec{y} \Big\rangle - \frac{1}{6}\, (\vec{y}, \mathrm{Ricci} \cdot \vec{y}) - \frac{t}{4}\, \langle \rho^x, R(\psi_x, \psi_x) \cdot \rho^x \rangle + O(|\vec{y}|^3) \right) d\rho^x. \tag{2.4} \]

2.6. Mathai-Quillen on paths and loops. The map $e_1 : \mathrm{Path}_1(x, y; t) \to M$, with $e_1(\sigma) = \sigma(1)$, pulls the tangent bundle $TM$ over $M$ back to a bundle $e_1^* TM$ whose fiber over $\sigma \in \mathrm{Path}_1(x, y; t)$ is $T_{\sigma(1)} M$. The Levi-Civita connection pulls back to a connection $e_1^* \nabla$ on $e_1^* TM$. Then a calculation shows that the Mathai-Quillen Thom form built from the connection $e_1^* \nabla$ on the bundle $e_1^* TM$, when pulled back via the section $s(\sigma) = -\dot\sigma(1)$ of $e_1^* TM$, gives the kernel $K_1^{MQ}(t)$ interpreted as a form on $\mathrm{Path}_1(x, y; t)$. We note that for nonzero $t$ this gives a closed form on $\mathrm{Path}_1(x, y; t)$, but not a compactly-supported closed form. Likewise the form on $\mathrm{Path}_n(M)$ whose integral gives $K_n^{MQ}$ is the pullback by the corresponding section of the Mathai-Quillen form of a connection built similarly from the Levi-Civita connection on the bundle whose fiber at $\sigma$ is $T_{\sigma(t_1)} M \times T_{\sigma(t_1 + t_2)} M \times \cdots \times T_{\sigma(t_1 + \cdots + t_n)} M$.

Instead of paths we can consider piecewise geodesic loops. Here it is natural to consider the kernel Eq. (2.2) with not only the points $x_0$ and $x_n$ identified but also $\psi_0$ and $\psi_n$ identified. That is, we identify $x$ and $y$ and wedge the form over $x$ with the form over $y$ (the form over $x$ coming first). The relevant bundle over $\mathrm{Path}_n$ can then

be identified with the tangent space to Pathn . The integral of the resulting form on M is the supertrace of the kernel on the left-hand side of Eq. (2.2). Theorem 3.2 below implies this integral approaches the supertrace of the heat kernel as n goes to infinity. The ability to connect the supertrace of the heat kernel to the integral of the pullback of the Mathai-Quillen form for a tangent bundle, through an intervening limit, is strong circumstantial evidence that this is a productive way of interpreting the supersymmetric path integral.

3. Strong Convergence of the Time Evolution Operator

Bär and Pfäffle [B-P] offer a rigorous expression for various heat kernels as a kind of path integral. Specifically they use a form of Chernoff’s theorem to prove the following result:

Theorem 3.1 (Bär, Pfäffle). Suppose $K(x, y; t) \in E_x \otimes E_y^*$ is a smooth one-parameter family of kernels (with positive real parameter $t$) representing the family of operators $\mathbf{K}(t)$ on a Euclidean vector bundle $E$ that satisfy the following three assumptions:

1. $\|\mathbf{K}(t)\| = 1 + O(t)$ for small $t$, where the norm is as an operator on the space of smooth functions with the supremum norm.

2. On each $\alpha \in \Gamma(M, E)$,

\[ \lim_{t \to 0} \left( \mathbf{K}(t)\alpha - \alpha \right) / t = -\frac{\Delta}{2}\,\alpha \]

in the supremum norm, where $\Delta$ is a generalized Laplacian on $E$.

3. For each $y$,

\[ \lim_{t \to 0} K(x, y; t) = \delta(x, y) \]

as a distribution.

If $t_1, t_2, \ldots, t_n$ is called a partition, then for any sequence of partitions in which $\max_i t_i \to 0$ and $\sum_i t_i \to t$ and for any form $\alpha$ on $M$,

\[ \lim \mathbf{K}(t_1)\,\mathbf{K}(t_2) \cdots \mathbf{K}(t_n)\,\alpha = e^{-t\Delta/2}\alpha. \]

Moreover, for some such sequence of partitions

\[ \lim K(t_1) * K(t_2) * \cdots * K(t_n) = K^{\Delta}(x, y; t) \]

uniformly, where $K^{\Delta}$ is the heat kernel of $\Delta$, i.e. the kernel of $e^{-t\Delta/2}$, and we suppress the spatial variables in $K$.

Remark 3.1. Bär and Pfäffle work with $\Delta$ rather than $\Delta/2$, which of course amounts to nothing more than a rescaling of $t$ by a factor of 2. However, in the usual scaling of the physics literature, the time evolution operator corresponds to $e^{-t\Delta/2}$, so we follow this convention.
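The mechanism behind Theorem 3.1 can be seen in a scalar toy model, which is an assumption of this illustration and not the setting of the theorem: replace $\Delta$ by a single nonnegative number `lam` (think of one eigenvalue) and take the one-step approximation $K(s) = 1 - s\,\mathrm{lam}/2$, which has the derivative at $s = 0$ required by Assumption 2. The $n$-fold product over the uniform partition $t_i = t/n$ then converges to $e^{-t\,\mathrm{lam}/2}$.

```python
import math


def chernoff_product(lam, t, n):
    """n-fold product of the one-step approximation K(s) = 1 - s*lam/2 with s = t/n."""
    step = 1.0 - (t / n) * lam / 2.0
    return step ** n


lam, t = 3.0, 2.0
exact = math.exp(-t * lam / 2.0)
errors = [abs(chernoff_product(lam, t, n) - exact) for n in (10, 100, 1000, 10000)]
for n, err in zip((10, 100, 1000, 10000), errors):
    print(n, err)  # the error shrinks roughly like 1/n
```

In this toy case the convergence is just the classical limit $(1 - a/n)^n \to e^{-a}$; the content of the theorem is that the same product structure survives for operators on sections of $E$ under the three stated assumptions.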

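3.1. Applying the theorem to $K^{MQ}$. Bär and Pfäffle apply this theorem to operators constructed from heat kernel asymptotics to give their path integral formulation. It is possible to relate Eq. (2.4) to the kernel in their Theorem 6.1 (note that their paths are parameterized in the opposite direction, and thus signs on all integrals are reversed), thus showing that the supersymmetric quantum mechanics path integral restricted to piecewise short geodesic paths approaches the heat kernel for the Laplace-Beltrami operator on forms as the number of pieces goes to infinity (for certain sequences of parameterization lengths). Instead we will check directly that the SUSYQM Lagrangian satisfies the assumptions of Theorem 3.1, thus achieving the same result. The check is a simple calculation that involves no sophisticated understanding of heat kernel asymptotics and seems closer in spirit to path integral arguments.

Write $K^{MQ}(t)$ for $K^{MQ}(x, y; t)$ when the spatial variables are to be understood, and $\mathbf{K}^{MQ}(t)$ for the operator represented by this kernel.

Proof of Assumption 1. The operator norm of $\mathbf{K}^{MQ}(t)$ is $1 + O(t)$. By compactness we can check this pointwise at each $x$, and because $K^{MQ}(t)$ is zero outside the injectivity radius we can do the calculation inside a coordinate patch in Riemann normal coordinates. It suffices to let $\mathbf{K}^{MQ}$ act on a function times a covariantly constant form, and the result follows from the fact that $H(x, y; t)$ has operator norm 1.

Proof of Assumptions 2 and 3. If $\alpha$ is a form on $M$, we must show

\[ \lim_{t \to 0} \left( \mathbf{K}^{MQ}(t)\alpha - \alpha \right) / t = -\frac{\Delta}{2}\,\alpha, \]

where $\Delta$ is the Laplace-Beltrami operator on forms, Eq. (1.4). Again, we may check at a specific point $x$, and we may assume $\alpha$ is zero outside the geodesic neighborhood of $x$. We may also assume $\alpha$ is simply a function times a covariantly constant form, so that $\alpha^{\parallel}(y, \psi_y) = f(y)\,\alpha(x, \psi_x)$, where the parallel transport from $y$ to $x$ is along the minimal geodesic. Working in Riemann normal coordinates centered at $x$ so that

\[ {\det}^{1/2}(g)(\vec{y}) = 1 + \frac{1}{6}\, \mathrm{Ricci}_{\sigma\tau}\, y^\sigma y^\tau + O(|\vec{y}|^3), \tag{3.1} \]

and writing $H(\vec{y}; t)$ for the expression of $H(x, y; t)$ in these coordinates, gives

\[
\begin{aligned}
\mathbf{K}^{MQ}(t)\alpha
&= \int H(\vec{y}; t) \exp\left( i\, \Big\langle \rho^x,\; \psi_x - \psi_y^{\parallel} + \frac{1}{2}\, R(\vec{y}, \psi_x) \cdot \vec{y} \Big\rangle - \frac{1}{6}\, (\vec{y}, \mathrm{Ricci} \cdot \vec{y}) - \frac{t}{4}\, \langle \rho^x, R(\psi_x, \psi_x) \cdot \rho^x \rangle + O(|\vec{y}|^3) \right) \alpha(\vec{y}, \psi_y)\; d\rho^x\, d\psi_y\, d\vec{y} \\
&= \int H(\vec{y}; t) \exp\left( i\, \Big\langle \rho^y,\; \psi_x^{\parallel} - \psi_y + \frac{1}{2}\, R(\vec{y}, \psi_x^{\parallel}) \cdot \vec{y} \Big\rangle - \frac{1}{6}\, (\vec{y}, \mathrm{Ricci} \cdot \vec{y}) - \frac{t}{4}\, \langle \rho^y, R(\psi_x^{\parallel}, \psi_x^{\parallel}) \cdot \rho^y \rangle + O(|\vec{y}|^3) \right) \alpha(\vec{y}, \psi_y)\; d\rho^y\, d\psi_y\, d\vec{y} \\
&= \int H(\vec{y}; t) \left[ 1 - \frac{1}{6}\, \mathrm{Ricci}_{\sigma\tau}\, y^\sigma y^\tau + \frac{i}{2}\, \rho^y_\tau\, R_{\pi\eta\sigma}{}^{\tau}\, y^\pi (\psi^{\parallel})_x^{\eta}\, y^\sigma + O(|\vec{y}|^3) \right] \\
&\qquad \cdot \left[ 1 - \frac{t}{4}\, \rho^y_\nu\, R_{\mu\eta}{}^{\nu\pi} (\psi^{\parallel})_x^{\mu} (\psi^{\parallel})_x^{\eta}\, \rho^y_\pi + O(t^{3/2}) \right] \exp\left( i\, \langle \rho^y, \psi_x^{\parallel} - \psi_y \rangle \right) \alpha(\vec{y}, \psi_y)\; d\rho^y\, d\psi_y\, d\vec{y} \\
&= \int H(\vec{y}; t)\, f(y) \left[ 1 - \frac{1}{6}\, \mathrm{Ricci}_{\sigma\tau}\, y^\sigma y^\tau + \frac{1}{2}\, R_{\pi\eta\sigma}{}^{\tau}\, y^\pi \psi_x^\eta\, y^\sigma \iota_\tau + O(|\vec{y}|^3) \right] {\det}^{1/2}(g)\, d\vec{y} \\
&\qquad \cdot \left[ 1 + \frac{t}{4}\, R_{\mu\eta}{}^{\nu\pi}\, \psi_x^\mu \psi_x^\eta\, \iota_\nu \iota_\pi + O(t^{3/2}) \right] \alpha(0, \psi_x),
\end{aligned}
\]

where we have applied Eqs. (1.9) and (1.10). Now, if $f$ is a smooth function on $\mathbb{R}^{2m}$, then

\[ \int H(\vec{y}; t)\, f(\vec{y})\; dy^1 \cdots dy^{2m} = (2\pi t)^{-m} \int \exp\left( -\frac{1}{2t}\, |\vec{y}|^2 \right) f(\vec{y})\; dy^1 \cdots dy^{2m} = f(0) - t\, \frac{(\Delta f)(0)}{2} + O(t^2), \tag{3.2} \]

where $\Delta f = -\delta^{\mu\nu} \partial_\mu \partial_\nu f$. According to Eq. (3.1), $\left( 1 - \frac{1}{6}\, \mathrm{Ricci}_{\sigma\tau}\, y^\sigma y^\tau \right) {\det}^{1/2}(g) = 1 + O(|\vec{y}|^3)$, so Eq. (3.2) implies the term linear in $t$ coming from the integral over $\mathbb{R}^{2m}$ is just

\[ -\frac{1}{2}\, \Delta \left[ f(\vec{y}) \left( 1 + \frac{1}{2}\, R_{\pi\eta\sigma}{}^{\tau}\, y^\pi \psi_0^\eta\, y^\sigma \iota_\tau \right) \right]_{\vec{y} = 0}. \]

That is,

\[ \mathbf{K}^{MQ}(t)\alpha = f(0) \left[ 1 + \frac{t}{2}\, \mathrm{Ricci}_{\pi\eta}\, \psi_0^\eta \iota_\pi + \frac{t}{4}\, R_{\mu\eta\nu\pi}\, \psi_0^\mu \psi_0^\eta\, \iota_\nu \iota_\pi \right] \alpha(0, \psi_x) + \frac{t}{2}\, \delta^{\mu\nu} \partial_\mu \partial_\nu f(0)\, \alpha(0, \psi_x) + O(t^{3/2}). \]

Thus the required $t$-derivative is

\[ \lim_{t \to 0} \left( \mathbf{K}^{MQ}(t)\alpha - \alpha \right) / t = \frac{1}{2}\, \delta^{\mu\nu} \partial_\mu \partial_\nu f(0)\, \alpha(0, \psi_x) + \left[ \frac{1}{2}\, \mathrm{Ricci}_{\pi\eta}\, \psi_0^\eta \iota_\pi + \frac{1}{4}\, R_{\mu\eta\nu\pi}\, \psi_0^\mu \psi_0^\eta\, \iota_\nu \iota_\pi \right] f(0)\, \alpha(0, \psi_x). \]

On the other hand $\nabla_\mu \alpha = 0$ since it is covariantly constant, so in Riemann normal coordinates, with the derivatives acting at 0, the right-hand side of Assumption 2 is

\[ -\frac{\Delta}{2}\,\alpha = -\frac{1}{2}\, \Delta_0\, f(x)\, \alpha(0, \psi_x) = \frac{1}{2}\, \delta^{\mu\nu} \partial_\mu \partial_\nu f(0)\, \alpha(0, \psi_x) + \left[ \frac{1}{2}\, \mathrm{Ricci}^{\pi}{}_{\eta}\, \psi^\eta \iota_\pi + \frac{1}{4}\, R_{\mu\eta}{}^{\nu\pi}\, \psi^\mu \psi^\eta\, \iota_\nu \iota_\pi \right] f(0)\, \alpha(0, \psi_x). \]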
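The Gaussian expansion (3.2) used in this computation can be checked numerically in the lowest-dimensional case $m = 1$, i.e. on $\mathbb{R}^2$. The sketch below is an illustration only: the test function, the grid parameters, and the function name are assumptions, and the integral is approximated by a plain uniform-grid sum over a box of half-width ten standard deviations.

```python
import math


def gaussian_average_2d(f, t, half_width=10.0, n=201):
    """Approximate the integral of H(y; t) f(y) over R^2, with H as in Eq. (2.3), m = 1."""
    s = math.sqrt(t)
    h = 2 * half_width * s / (n - 1)       # grid spacing
    total = 0.0
    for a in range(n):
        y1 = -half_width * s + a * h
        for b in range(n):
            y2 = -half_width * s + b * h
            total += math.exp(-(y1 * y1 + y2 * y2) / (2 * t)) * f(y1, y2)
    return total * h * h / (2 * math.pi * t)


def f(y1, y2):
    # f(0) = 1 and the sum of second derivatives at 0 is 2, so (Delta f)(0) = -2.
    return 1 + y1 ** 2 + y1 * y2 + y2 ** 4


t = 0.01
value = gaussian_average_2d(f, t)
prediction = 1 + t                         # f(0) - t * (Delta f)(0) / 2
print(value, prediction, value - prediction)
```

For this polynomial the Gaussian moments give the exact value $1 + t + 3t^2$, so the difference from the first-order prediction is $3t^2$, consistent with the $O(t^2)$ remainder in Eq. (3.2).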

Assumption 3 is an analogous but simpler calculation where we consider $\int K(x, y; t)\,\alpha(x)\,dx$ for a smooth $\alpha$ and require it to converge to $\alpha(y)$ as $t$ goes to zero. This proves the following holds for the operator $\mathbf{K}^{MQ}$ and its kernel $K^{MQ}$:

Theorem 3.2. For any sequence of partitions $t_1, t_2, \ldots, t_n$ such that $\max_i(t_i) \to 0$ and $\sum_i t_i \to t$, and for any form $\alpha$ on $M$,

\[ \lim \mathbf{K}^{MQ}(t_1)\,\mathbf{K}^{MQ}(t_2) \cdots \mathbf{K}^{MQ}(t_n)\,\alpha = e^{-t\Delta/2}\alpha, \]

where $\Delta$ is the Laplace-Beltrami operator on forms. Moreover, for some such sequence of partitions

\[ \lim K^{MQ}(t_1) * K^{MQ}(t_2) * \cdots * K^{MQ}(t_n) = K^{\Delta}(x, y; t) \]

uniformly, where $K^{\Delta}$ is the heat kernel of $\Delta$ (the kernel of $e^{-t\Delta/2}$).

Remark 3.2. Thus the approximation $K_n^{MQ}(x, y; t_1, \ldots, t_n)$ to the kernel of the time evolution operator for supersymmetric quantum mechanics converges to the heat kernel for $\Delta$ in the large partition limit.

Acknowledgement. It is our pleasure to thank Christian Bär and Steve Rosenberg for helpful comments on the draft of this paper.

References

[A] Alvarez-Gaumé, L.: Supersymmetry and the Atiyah-Singer index theorem. Commun. Math. Phys. 90, 161 (1983)
[B-G-V] Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. Berlin: Springer, 2004
[B] Blau, M.: The Mathai-Quillen formalism and topological field theory. J. Geom. Phys. 11 (1–4), 95–127 (1993) Lecture notes given at the School, Infinite-dimensional Geometry in Physics Karpacz, 1992
[B-P] Bär, C., Pfäffle, F.: Path integrals on manifolds by finite dimensional approximations. J. Reine Angew. Math. in press, available at http://arxiv.org/abs/math.AP/07032731v1, 2007
[M-Q] Mathai, V., Quillen, D.: Superconnections, Thom classes, and equivariant characteristic classes. Topology 25, 85–110 (1986)
[G] Getzler, E.: A short proof of the local Atiyah-Singer index theorem. Topology 25, 111–117 (1986)
[R1] Rogers, A.: Stochastic calculus in superspace. II. Differential forms, supermanifolds and the Atiyah-Singer index theorem. J. Phys. A 25(22), 6043–6062 (1992)
[R2] Rogers, A.: Supersymmetry and Brownian motion on supermanifolds. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6(suppl.), 83–102 (2003)
[P] Patodi, V.K.: Curvature and the eigenforms of the Laplace operator. J. Diff. Geom. 5, 233–249 (1971)
[S] Stroock, D.W.: On certain systems of parabolic equations. Comm. Pure Appl. Math. 23, 447–457 (1970)
[W] Witten, E.: Supersymmetry and Morse theory. J. Diff. Geom. 17, 661–692 (1982)

Communicated by A. Connes

Commun. Math. Phys. 284, 93–116 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0608-0

Communications in

Mathematical Physics

Extended Connection in Yang-Mills Theory

Gabriel Catren¹, Jorge Devoto²

¹ Instituto de Astronomia y Fisica del Espacio, Casilla de Correo 67, Sucursal 28, 1428 Buenos Aires, Argentina. E-mail: [email protected]

² Math. Dept. FCEN, Universidad de Buenos Aires, Ciudad Universitaria, Pabellon 1, 1428 Buenos Aires, Argentina. E-mail: [email protected]

Received: 1 August 2007 / Accepted: 29 May 2008
Published online: 19 August 2008 – © Springer-Verlag 2008

Abstract: The three fundamental geometric components of Yang-Mills theory –gauge field, gauge fixing and ghost field– are unified in a new object: an extended connection in a properly chosen principal fiber bundle. To do this, it is necessary to generalize the notion of gauge fixing by using a gauge fixing connection instead of a section. From the equations for the extended connection’s curvature, we derive the relevant BRST transformations without imposing the usual horizontality conditions. We show that the gauge field’s standard BRST transformation is only valid in a local trivialization and we obtain the corresponding global generalization. By using the Faddeev-Popov method, we apply the generalized gauge fixing to the path integral quantization of Yang-Mills theory. We show that the proposed gauge fixing can be used even in the presence of a Gribov’s obstruction.

I. Introduction

It is our objective in the present work to show how the main geometric structures of Yang-Mills theory can be unified in a single geometric object, namely a connection in an infinite dimensional principal fiber bundle. We will show how this geometric formalism can be useful to the path integral quantization of Yang-Mills theory.

Some of the historical motivations for the study of an extended connection in Yang-Mills theory are the following. At the beginning of the 1980s Yang-Mills theory was at the center of important mathematical developments, especially Donaldson’s theory of invariants of four-manifolds [9] and Witten’s interpretation of this theory in terms of a topological quantum field theory [23] (for a general review on topological field theories see Refs. [5,8]). A central aspect of these theories is the study of the topological properties of the space $\mathcal{A}/\mathcal{G}$, where $\mathcal{A}$ is the configuration space of connections in a $G$-principal bundle $P \to M$ and $\mathcal{G}$ is the gauge group of vertical automorphisms of $P$. It is possible to show that under certain hypotheses one obtains a $\mathcal{G}$-principal bundle structure $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ [9].

The non-triviality of many invariants is then intimately linked with the topological non-triviality of this bundle. In Ref. [3] Baulieu and Singer showed that Witten’s theory can be interpreted in terms of the gauge fixed version of a topological action through a standard BRST procedure. To do so, the authors unify the gauge field $A$ and the ghost field $c$ for Yang-Mills symmetry in an extended connection $\omega = A + c$ defined in a properly chosen principal bundle. The curvature $\mathcal{F}$ of $\omega$ splits naturally as $\mathcal{F} = F + \psi + \phi$, where $\psi$ and $\phi$ are the ghost for the topological symmetry and the ghost for ghost respectively (the necessity of a ghost for ghost is due to the dependence of the topological symmetry on the Yang-Mills symmetry). By expanding the expressions for the curvature $\mathcal{F}$ and the corresponding Bianchi identity, the BRST transformations for this topological gauge theory are elegantly recovered. It is commonly stated that the passage from this topological Yang-Mills theory to the ordinary (i.e. non-topological) case is mediated by the horizontality (or flatness) conditions, i.e. by the conditions $\psi = \phi = 0$ (see Refs. [2,4,22]). In this case, an extended connection $\omega = A + c$ can also be defined for an ordinary Yang-Mills theory with a horizontal curvature of the form $\mathcal{F} = F$.

In this work, an extended connection $\mathbb{A}$ for an ordinary Yang-Mills theory will be defined as the sum of two terms. The first term is a universal family $A^U$, parameterized by $\mathcal{A}$, of connections in the $G$-principal bundle $P \to M$. The second term is an arbitrarily chosen connection $\eta$ in the $\mathcal{G}$-principal fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. We will show that the connection $\eta$ encodes the ghost field that generates the BRST complex. In a local trivialization the ghost field can be identified with the canonical vertical part of the connection $\eta$, which corresponds to the Maurer-Cartan form of the gauge group $\mathcal{G}$ [6]. Alternatively, the ghost field can be considered a universal connection in the gauge group’s Weil algebra. The connection $\eta$ can then be defined as the image of this universal connection under a particular Chern-Weil homomorphism.

We will then argue that the connection $\eta$ also defines what might be called generalized gauge fixing. In fact, the connection $\eta$ defines a horizontal subspace at each point of the fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. These subspaces can be considered first order infinitesimal germs of sections. The significant difference is that the connection $\eta$ induces a global section $\sigma_\eta$ in a fiber bundle associated to the space of paths in $\mathcal{A}/\mathcal{G}$. In fact, the section $\sigma_\eta$ assigns the horizontal lift defined by $\eta$ to each path $\gamma \subset \mathcal{A}/\mathcal{G}$. Since the path integral is not an integral in the space of fields $\mathcal{A}$, but rather an integral in the space of paths in $\mathcal{A}$, the section $\sigma_\eta$ allows us to eliminate the gauge group’s infinite volume in the corresponding path integral. We will then show that the gauge fixed action corresponding to the generalized gauge fixing defined by $\eta$ can be obtained by means of the usual Faddeev-Popov method. One of the main advantages over the usual formulation is that the connection $\eta$ is globally well-defined even when the topology of the fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ is not trivial (Gribov’s obstruction). In this way, the existence of a generalized notion of gauge fixing demonstrates that the Faddeev-Popov method is still valid even in the presence of a Gribov’s obstruction.

It is worth stressing that the extended connection $\mathbb{A}$ encodes not only the gauge and ghost fields (as the connection $\omega = A + c$ in Ref. [3]), but also the definition of a global gauge fixing of the theory. We will then show that the proposed formalism allows us to recover the BRST transformations of the relevant fields without imposing the usual horizontality conditions [2,4,22]. This implies that the extended connection’s curvature $\mathbb{F}$ does not necessarily have the horizontal form $\mathbb{F} = F$. Moreover, we will show that the gauge field’s standard BRST transformation is valid only in a local trivialization of the fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. We will then find the corresponding global generalization.

The paper is organized as follows. In Sect. II we define the extended connection. In Sect. III we study the curvature of this connection and show how the curvature forms induce the BRST transformations of the different fields. In Sect. IV we study the relation between the gauge fixing connection and the ghost field. In Sect. V we define the generalized gauge fixing at the level of path integrals. In Sect. VI we calculate the gauge fixed action by means of the Faddeev-Popov method. In the final section we summarize the proposed formalism.

II. Extended Connection

Let $\mathbf{M}$ denote space-time. We will suppose that it is possible to define a foliation of $\mathbf{M}$ by spacelike hypersurfaces. This foliation is defined by means of a diffeomorphism $\iota : \mathbf{M} \to \mathbb{R} \times M$, where $M$ is a smooth 3-dimensional Riemannian manifold. We will also assume for the sake of simplicity that $M$ is compact. Let $G$ be a compact Lie group with a fixed invariant inner product in its Lie algebra $\mathfrak{g}$ and let $\mathbf{P} \to \mathbf{M}$ be a fixed $G$-principal bundle. Using the diffeomorphism $\iota$ and the fact that $\mathbb{R}$ is contractible, we can assume that the fiber bundle $\mathbf{P} \to \mathbf{M}$ over the space-time $\mathbf{M}$ is the pullback of a fixed $G$-principal bundle $P \to M$ over the space $M$:
\[
\begin{array}{ccc}
\mathbf{P} \cong p_{st}^*(P) & \longrightarrow & P \\
\downarrow & & \downarrow \\
\mathbf{M} & \xrightarrow{\;p_{st}\;} & M,
\end{array}
\]
where $p_{st} : \mathbf{M} \cong \mathbb{R} \times M \to M$ is the projection onto the second factor.

We will denote by $\mathrm{Ad}(P)$ the fiber bundle $P \times_G G \to M$ associated to the adjoint action of $G$ on itself and by $\mathrm{ad}(P)$ the vector bundle $P \times_G \mathfrak{g} \to M$ associated to the adjoint representation of $G$ on $\mathfrak{g}$. The gauge group $\mathcal{G}$ is the group of vertical automorphisms of $P$. It can be naturally identified with the space of sections of $\mathrm{Ad}(P)$. Its Lie algebra $\mathrm{Lie}(\mathcal{G})$ is the space of sections of $\mathrm{ad}(P)$. Its elements can be identified with $G$-equivariant maps $g : P \to \mathfrak{g}$.

In the case of a principal fiber bundle over a finite dimensional manifold with a compact structure group, there are three equivalent definitions of connections [9, Chap. 2]. In what follows, we will also consider connections on infinite dimensional spaces (see Refs. [17,18]). For the general case, we will use the following as the basic definition [16].

Definition 1. Let $K$ be a Lie group with Lie algebra $\mathfrak{k}$ and let $E \xrightarrow{\;\pi\;} X$ be a $K$-principal bundle over a manifold $X$ (both $K$ and $X$ can have infinite dimension). A connection on $E$ is an equivariant distribution $\mathcal{H}$, i.e. a smooth field of vector spaces $\mathcal{H}_p \subset TE_p$ (with $p \in E$) such that

1. For all $p \in E$ there is a direct sum decomposition $TE_p = \mathcal{H}_p \oplus \ker d\pi_p$.
2. The field is preserved by the induced action of $K$ on $TE$, i.e.
\[ \mathcal{H}_{pg} = R_{g*}\mathcal{H}_p, \tag{1} \]
where $R_{g*}$ denotes the differential of the right translation by $g \in K$.

As in the finite dimensional case, we can assign to each connection a $K$-equivariant $\mathfrak{k}$-valued 1-form $\omega$ on $E$ such that $\mathcal{H}_p = \ker \omega_p$. The action of $K$ on $\mathfrak{k}$ is the adjoint action. A connection can also be considered a $K$-invariant splitting $\sigma : \pi^*TX \to TE$ of the exact sequence
\[ 0 \longrightarrow \ker d\pi \xrightarrow{\;\iota\;} TE \xrightarrow{\;d\pi\;} \pi^*TX \longrightarrow 0, \tag{2} \]
where $\pi^*TX \to E$ is the pullback induced by the projection $\pi : E \to X$ of the bundle $TX \to X$. In this sequence $\mathcal{H}_p = \sigma(\pi^*T_{\pi(p)}X)$. Given a connection $\omega$ on $E$ the splitting considered in Eq. (2) induces an isomorphism $TE \cong \pi^*TX \oplus \ker d\pi$. The bundle $T_V E \doteq \ker d\pi$ is called the bundle of vertical tangent vectors. This bundle is intrinsically associated to the definition of principal bundles. Given a connection, the isomorphism $TE \cong \pi^*TX \oplus \ker d\pi$ induces a projection $\omega_V : TE \to T_V E$. By definition, the vertical cotangent bundle $T_V^*E$ is the annihilator of $\sigma(\pi^*TX)$. In other words, a $k$-form $\alpha$ is vertical if and only if it vanishes whenever one of its arguments is a vector in $\sigma(\pi^*TX)$. It is worth noticing that the definition of the bundle $T_V^*E$ requires a connection. The sections of $\Omega_V^k(E) \doteq \wedge^k T_V^*E$ are called vertical $k$-forms. By definition, the connection form $\omega$ is a vertical form. Given a connection there is a decomposition of the de Rham differential on $E$,
\[ d_E = d_H + d_V \tag{3} \]
into a horizontal and a vertical part. The horizontal part corresponds to the covariant derivative. The vertical part is defined by the expression
\[ d_V\alpha(X_1, \ldots, X_n) = d\alpha(\omega_V X_1, \ldots, \omega_V X_n), \tag{4} \]

where $\alpha$ is an $(n-1)$-form. The vertical forms $\Omega_V^*(E)$ equipped with the vertical differential $d_V$ define the vertical complex.

Let’s now suppose that it is possible to define a global section $\sigma : X \to E$. This section defines a global trivialization $\varphi_\sigma : X \times K \to E$, where $\varphi_\sigma(x, g) = \sigma(x) \cdot g$. This trivialization induces a distinguished connection $\omega_\sigma$ on $E$ such that the pullback connection $\tilde{\omega}_\sigma \doteq \varphi_\sigma^*\omega_\sigma$ coincides with the canonical flat connection on $X \times K$ [16]. Roughly speaking, the horizontal distribution defined by $\omega_\sigma$ at $p = \sigma(x)$ is tangent to the section $\sigma$. The vertical complex defined by the connection $\tilde{\omega}_\sigma$ can be naturally identified with the de Rham complex of $K$. This implies that $d_V = d_K$. Since the connection form $\tilde{\omega}_\sigma$ is a $\mathfrak{k}$-valued $K$-invariant vertical form, it can be identified with the Maurer-Cartan form $\theta_{MC}$ of the group $K$. By contrast, a general connection $\omega$ defines a splitting of $TE$ which does not coincide with the splitting induced by the section $\sigma$. In other words, the horizontal distribution defined by $\omega$ is not tangent to $\sigma$. In fact, the pullback connection $\varphi_\sigma^*\omega$ at $(x, g)$ can be written as a sum
\[ \varphi_\sigma^*\omega = \mathrm{ad}_{g^{-1}}\bar{\omega} + \theta_{MC}, \tag{5} \]
where $\bar{\omega} = \sigma^*\omega \in \Omega^1(X) \otimes \mathfrak{k}$ and $\theta_{MC} \in \mathfrak{k}^* \otimes \mathfrak{k}$ is the Maurer-Cartan form of $K$ [7].

We will denote connections on the $G$-principal fiber bundle $P \to M$ by the letter $A$. In a local trivialization $P|_U \cong U \times G$ we have an induced local $\mathfrak{g}$-valued 1-form $A_U$. The forms $A_U$ are the so-called gauge fields [11].¹ The configuration space of all connections is an affine space modelled on the vector space $\Omega^1(M, \mathfrak{g})$ consisting of 1-forms with values in the adjoint bundle $\mathrm{ad}(P)$. The gauge group $\mathcal{G}$ acts on this configuration space by affine transformations. We will fix a metric $g$ on $M$ and an invariant scalar product $\mathrm{tr}$ on $\mathfrak{g}$. These data together with the corresponding Hodge operator $*$ induce a metric on $\Omega^1(M, \mathfrak{g})$. Hence, a metric can be defined in the spaces $\Omega^k(M, \mathfrak{g})$, $k \geq 1$, by means of the expression
\[ \langle \omega_1, \omega_2 \rangle = \int_M \mathrm{tr}(\omega_1 \wedge *\omega_2). \tag{6} \]

Since the action of $\mathcal{G}$ on the configuration space of connections is not free, the quotient generally is not a manifold. This problem can be solved by using framed connections [9]. The letter $\mathcal{A}$ will denote the space of framed connections of Sobolev class $L^2_{l-1}$ for the metric defined in (6), where $l$ is a fixed number bigger than 2. The group $\mathcal{G}$ is the group of gauge transformations of Sobolev class $L^2_l$. The action of $\mathcal{G}$ on $\mathcal{A}$ is free (see Ref. [9, Sect. 5.1.1]). We will denote by $\mathcal{B}$ the quotient $\mathcal{A}/\mathcal{G}$. Uhlenbeck’s Coulomb gauge fixing theorem (see Ref. [9, Sect. 2.3.3]) implies, for a generic metric $g$, local triviality. Hence $\mathcal{A} \to \mathcal{B}$ is a $\mathcal{G}$-principal bundle.

The initial geometric arena for our construction is the pullback $G$-principal bundle $p^*(P) \to \mathcal{A} \times M$, which is obtained by taking the pullback of the bundle $P \to M$ by the projection $\mathcal{A} \times M \xrightarrow{\;p\;} M$. The bundle $p^*(P)$ can be identified with $\mathcal{A} \times P$:
\[
\begin{array}{ccc}
p^*(P) = \mathcal{A} \times P & \longrightarrow & P \\
\downarrow & & \downarrow \\
\mathcal{A} \times M & \xrightarrow{\;p\;} & M.
\end{array}
\]

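The gauge group $\mathcal{G}$ has an action on $\mathcal{A} \times M$ induced by its action on $\mathcal{A}$. This action is covered by the action of $\mathcal{G}$ on $p^*(P)$ induced by its action on both $\mathcal{A}$ and $P$.

Proposition 2. The bundle $p^*(P) \to \mathcal{A} \times M$ induces a $G$-principal bundle [1]
\[ Q = (\mathcal{A} \times P)/\mathcal{G} \xrightarrow{\;\rho\;} \mathcal{B} \times M. \]

We therefore have the following tower of principal bundles:
\[
\begin{array}{ccc}
\mathcal{G} & \longrightarrow & p^*(P) = \mathcal{A} \times P \\
 & & \downarrow q \\
G & \longrightarrow & Q = (\mathcal{A} \times P)/\mathcal{G} \\
 & & \downarrow \rho \\
 & & \mathcal{B} \times M.
\end{array}
\]

¹ In a covariant framework, the connection $A$ can be regarded as the spatial part of a connection $\mathbf{A}$ on $\mathbf{P} \to \mathbf{M}$. In fact, since we can identify $\mathbf{P}$ with $\mathbb{R} \times P$, each connection $\mathbf{A}$ has a canonical decomposition $\mathbf{A} = A(t) + A_0(t)\,dt$, where $A(t)$ is a time-evolving connection on $P$ and $A_0(t)$ is a time-evolving section of $\mathrm{ad}(P) = P \times_G \mathfrak{g}$. The action of the gauge group $\mathcal{G}$ on $A_0$ is induced by the natural action on associated bundles. This action is the restriction of the action of the automorphism group of $\mathbf{P}$ to $\{t\} \times M$.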

The fiber bundle $p^*(P) = \mathcal{A} \times P \to \mathcal{A} \times M$ can be considered a universal family (parameterized by the space $\mathcal{A}$) of fiber bundles $P \xrightarrow{\;\pi\;} M$ with tautological connections $A^U$. The universal family $A^U$ of tautological connections is defined in the following way. Let $(A, x)$ be a point of $\mathcal{A} \times M$. Then the elements of the fiber of $\mathcal{A} \times P$ over $(A, x)$ have the form $(A, p)$, with $p \in \pi^{-1}(x)$. Let us fix one of these elements. Let $v \in T(\mathcal{A} \times P)_{(A,p)}$ be a tangent vector such that $\pi_*(v) \in TM \subset T(\mathcal{A} \times M)$ (i.e., $v$ is tangent to a copy of $P$ in $\mathcal{A} \times P$). Then $A^U(v) = A(v)$. We will write $\mathcal{H}^U$ for the distribution associated to the family $A^U$. For each element $A \in \mathcal{A}$, the distribution $\mathcal{H}^U$ induces the distribution $\mathcal{H}_A$ on $TP$ defined by the connection $A$. The universal family of connections $A^U$ allows us to define parallel transports along paths contained in any copy of $M$ inside $\mathcal{A} \times M$.

Let’s now pick a connection in the $\mathcal{G}$-principal bundle $\mathcal{A} \to \mathcal{B}$. This connection will be defined by means of a 1-form $\eta \in \Omega^1(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G})$. We will denote by $\mathcal{H}_\eta$ the corresponding equivariant distribution. This connection will define a generalized notion of gauge fixing.² In fact, let’s suppose that it is possible to define a global gauge fixing section $\sigma : \mathcal{B} \to \mathcal{A}$. This section defines a global trivialization $\varphi_\sigma : \mathcal{B} \times \mathcal{G} \to \mathcal{A}$. One can then define an induced flat connection $\eta_\sigma$ on $\mathcal{A} \to \mathcal{B}$ such that the corresponding distribution $\mathcal{H}_{\eta_\sigma}$ is always tangent to $\sigma$. In other words, the pullback $\varphi_\sigma^*\eta_\sigma$ coincides with the canonical flat connection on $\mathcal{B} \times \mathcal{G}$ (see Ref. [16]). This shows that a global gauge fixing section $\sigma$ can always be expressed in terms of a flat connection $\eta_\sigma$. By contrast, a connection $\eta$ cannot in general be integrated to a section. A local obstruction is the curvature and a global one the monodromy. Moreover, it is always possible to define a global connection $\eta$, even when the topology of the fiber bundle $\mathcal{A} \to \mathcal{B}$ is not trivial (Gribov’s obstruction). In Sects. V and VI we will use the connection $\eta$ as a generalized gauge fixing for the path integral quantization of Yang-Mills theory. In particular, we will show that the connection $\eta$ does not have to be flat in order to induce a well defined gauge fixing.

Let’s consider now a particular example of a gauge fixing connection. Due to the affine structure of $\mathcal{A}$ there is a canonical diffeomorphism $T\mathcal{A} \cong \mathcal{A} \times \Omega^1(M, \mathfrak{g})$. We will consider $\Omega^1(M, \mathfrak{g})$ with the inner product defined by (6). Since $\mathrm{ad}(P)$ is a vector bundle associated to the principal bundle $P$, any connection $A$ on $P$ induces a covariant derivative $d_A : \Omega^k(M, \mathfrak{g}) \to \Omega^{k+1}(M, \mathfrak{g})$. Let $d_A^* : \Omega^{k+1}(M, \mathfrak{g}) \to \Omega^k(M, \mathfrak{g})$ be the adjoint operator. This is a differential operator of first order. Define
\[ \mathcal{H}_A := \mathrm{Ker}\{ d_A^* : \Omega^1(M, \mathfrak{g}) \to \Omega^0(M, \mathfrak{g}) \}. \tag{7} \]
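A short worked step, given here only as a sketch, shows why the kernel in (7) is complementary to the gauge orbits. It assumes that the tangent space to the orbit through $A$ consists of the vectors $d_A\xi$ with $\xi \in \mathrm{Lie}(\mathcal{G})$ (compare the infinitesimal gauge transformation $\delta A = -d_A\xi$ used later in the paper), that the inner product (6) is extended to $\Omega^0(M, \mathfrak{g})$ in the same way, and that $d_A^*$ is the adjoint of $d_A$ for this inner product. The symbol $a$ for a tangent vector at $A$ is introduced here for illustration.

```latex
% a \in \Omega^1(M,\mathfrak g) is a tangent vector at A (notation assumed here)
\begin{align*}
a \perp d_A\xi \quad \forall\, \xi \in \mathrm{Lie}(\mathcal{G})
  &\iff \langle d_A\xi,\, a \rangle = 0 \quad \forall\, \xi \\
  &\iff \langle \xi,\, d_A^* a \rangle = 0 \quad \forall\, \xi \\
  &\iff d_A^* a = 0 .
\end{align*}
```

Under these assumptions the space defined in (7) is the orthogonal complement, for the inner product (6), of the tangent space to the orbit through $A$.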

Then $\mathcal{H}_A$ defines a connection on $\mathcal{A}$ called the Coulomb connection (see for example Ref. [9, p. 56]).

The distribution $\mathcal{H}_\eta$ together with $\mathcal{H}^U$ define a smooth distribution $\widehat{\mathcal{H}}$ on $Tp^*(P)$. If $(A, p) \in p^*(P)$, then
\[ \widehat{\mathcal{H}}_{(A,p)} = \mathcal{H}_\eta(A) \oplus \mathcal{H}^U(A)(p). \]

² In Ref. [19] the authors analyze the particular case of the Coulomb connection for an $SU(2)$ Yang-Mills theory on $S^3 \times \mathbb{R}$. The authors point out that in the absence of a global section, the gauge can be consistently fixed by means of such a connection.

Proposition 3. The distribution $\widehat{\mathcal{H}}$ is transversal to the orbits of the action of $\mathcal{G}$ on $\mathcal{A} \times P$ and to the fibers of $p^*(P) \to \mathcal{A} \times M$.

Proof. Let $(A, p) \in p^*(P)$. Then we have two homomorphisms of vector spaces $\iota : \mathfrak{g} \to T_pP$ and $\kappa : \mathrm{Lie}(\mathcal{G}) \to T_A\mathcal{A}$. These homomorphisms are induced by the principal bundle structures of $P$ and $\mathcal{A}$. For each point $p$ there is also a homomorphism of Lie algebras $\tau_p : \mathrm{Lie}(\mathcal{G}) \to \mathfrak{g}$ given by $\tau_p(g) = g(p)$, where we use the identification of the elements of $\mathrm{Lie}(\mathcal{G})$ with equivariant maps $g : P \to \mathfrak{g}$. With these definitions the tangent space to the orbit $F_{\mathcal{G}}$ of the action of $\mathcal{G}$ at the point $(A, p)$ is equal to
\[ TF_{\mathcal{G}}(A, p) = \{ v - v' \in T_A\mathcal{A} \oplus T_pP \mid v = \kappa(g) \text{ and } v' = \iota(\tau_p(g)) \}. \tag{8} \]
The tangent spaces to the orbits are contained in the sum of the tangent spaces to the orbits in $\mathcal{A}$ and $P$. The proposition follows from the fact that the connections are transversal to these spaces.

Remark 4. The distribution $\widehat{\mathcal{H}}$ does not define a connection on $p^*(P) \to \mathcal{A} \times M$ due to the fact that the tangent space at $(A, p)$ has a decomposition
\[ Tp^*(P)_{(A,p)} = TF_{\mathcal{G}}(A, p) \oplus \underbrace{\mathcal{H}_\eta(A) \oplus \mathcal{H}^U(A)(p)}_{\widehat{\mathcal{H}}_{(A,p)}} \oplus\, \iota(\mathfrak{g}), \]
$\iota(\mathfrak{g})$ being the vertical subspace. Hence, the distribution $\widehat{\mathcal{H}}$ does not define vector spaces complementary to the vertical subspace $\iota(\mathfrak{g})$. However there is a reason for the introduction of this distribution which is explained by the following lemma.

Lemma 5. The distribution $\widehat{\mathcal{H}}$ is $\mathcal{G}$-invariant and induces a connection $\mathcal{H}$ on $Q = (\mathcal{A} \times P)/\mathcal{G} \to \mathcal{B} \times M$.

This lemma follows from the invariance of the distributions $\mathcal{H}_\eta$ and $\mathcal{H}^U$. Let $\mathbb{E}$ be the connection on the bundle $p^*(P) \to \mathcal{A} \times M$ obtained as the pullback of the connection $\mathcal{H}$ by the projection $\mathcal{A} \times M \to \mathcal{B} \times M$. This pullback can be understood either in the language of distributions or in the language of forms. Let’s consider the diagram
\[
\begin{array}{ccccc}
 & & p^*(P) = \mathcal{A} \times P & & \\
 & \swarrow & & \searrow\,{\scriptstyle q} & \\
\mathcal{A} \times M & & & & Q = (\mathcal{A} \times P)/\mathcal{G} \\
 & \searrow & & \swarrow & \\
 & & \mathcal{B} \times M. & &
\end{array}
\]
The map $q : p^*(P) \to Q$ induces a map $q_* : Tp^*(P) \to TQ$. At each point $(A, p)$ the subspace of $T_{(A,p)}p^*(P)$ which defines $\mathbb{E}$ is $q_*^{-1}(\mathcal{H}_{q(A,p)})$. The $\mathfrak{g}$-valued 1-form $\mathbb{A} \in \Omega^1(\mathcal{A} \times P) \otimes \mathfrak{g}$ associated to $\mathbb{E}$ is the pullback by $q$ of the 1-form associated to $\mathcal{H}$. We will now identify this distribution and this 1-form. The distribution which defines the new connection at each point $(A, p)$ is the direct sum
\[ \mathbb{E}_{(A,p)} = TF_{\mathcal{G}}(A, p) \oplus \underbrace{\mathcal{H}_\eta(A) \oplus \mathcal{H}^U(A)(p)}_{\widehat{\mathcal{H}}_{(A,p)}}. \tag{9} \]

If $(A, p) \in \mathcal{A} \times P$ and $v = v_1 + v_2 \in T_A\mathcal{A} \oplus T_pP$, then we define a $\mathfrak{g}$-valued 1-form $\mathbb{A}$ on $\mathcal{A} \times P$ given by
\[ \mathbb{A}(v) = A^U_{(A,p)}(v_2) + \eta(v_1)(p). \tag{10} \]
We will now show that the horizontal distribution defined by $\mathbb{A}$ is indeed given by (9).

Lemma 6. If $v \in TF_{\mathcal{G}}(A, p) \oplus \widehat{\mathcal{H}}_{(A,p)}$, then $\mathbb{A}(v) = 0$.

Proof.

(i) If $v \in \mathcal{H}^U(A)(p) \subset \widehat{\mathcal{H}}_{(A,p)}$, then $\mathbb{A}(v) = A^U_{(A,p)}(v) = A(v) = 0$ by definition of the connection $A$.

(ii) If $v \in \mathcal{H}_\eta(A) \subset \widehat{\mathcal{H}}_{(A,p)}$, then $\mathbb{A}(v) = \eta(v)(p) = 0$ by definition of the connection $\eta$.

(iii) If $v = \kappa(g) - \iota(\tau_p(g)) \in TF_{\mathcal{G}}(A, p)$, then
\[ \mathbb{A}(v) = -A^U_{(A,p)}(\iota(\tau_p(g))) + \eta(\kappa(g))(p) = -A(\iota(\tau_p(g))) + \eta(\kappa(g))(p) = -\tau_p(g) + g(p) = 0, \]
where we have used that by definition of connection $A \circ \iota = \mathrm{id}_{\mathfrak{g}}$ and $\eta \circ \kappa = \mathrm{id}_{\mathrm{Lie}(\mathcal{G})}$.

Remark 7. It is the gauge fixing connection $\mathcal{H}_\eta$ that allows us to make the decompositions (9) and (10) of the horizontal distribution $\mathbb{E}$ and the corresponding 1-form $\mathbb{A}$. The reason is that these kinds of decompositions require the choice of a complement to a subspace of a vector space.

Remark 8. An important difference with the work of Baulieu and Singer for topological Yang-Mills theory is that in Ref. [3] the connection $\omega$ is a natural connection, which is defined by using the orthogonal complements to the orbits of $G$. In order to define these orthogonal complements one uses the fact that the space $\mathcal{A} \times P$ has a Riemannian metric invariant under $\mathcal{G} \times G$ (see Ref. [1] for details). In our case the connection $\mathbb{A}$, being tautological in the factor $P$, is not natural in the factor $\mathcal{A}$, in the sense that the gauge fixing connection $\eta$ can be freely chosen. This freedom is in fact the freedom to choose the gauge.

III. Extended Curvature and the BRST Complex

In this section we will begin to consider the rich geometric structure induced by the connection $\mathcal{H}$. We will do this through the pullback form $\mathbb{A}$ and its curvature $\mathbb{F}$. Since we have a diffeomorphism $p^*(P) \cong \mathcal{A} \times P$, the de Rham complex of $\mathfrak{g}$-valued forms on $p^*(P)$ is the graded tensor product of the de Rham complexes of $\mathcal{A}$ and $P$, i.e.
\[ \Omega^*(p^*(P)) \otimes \mathfrak{g} \cong \Omega^*(\mathcal{A}) \otimes \Omega^*(P) \otimes \mathfrak{g}. \tag{11} \]
This fact has two consequences. Firstly, the forms we are considering are naturally bigraded. Secondly, the exterior derivative $\mathbf{d}$ in $p^*(P)$ can be decomposed as $\mathbf{d} = \delta + d$, where $\delta$ and $d$ are the exterior derivatives in $\mathcal{A}$ and $P$ respectively. Since the forms that we are considering are equivariant forms, the right complex to study these forms is
\[ \Omega^*(\mathcal{A}) \otimes \left( \Omega^*(P) \otimes_G \mathfrak{g} \right). \tag{12} \]
Since
\[ \Omega^*(\mathcal{A}) \otimes \left( \Omega^0(P) \otimes_G \mathfrak{g} \right) \cong \Omega^*(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G}), \tag{13} \]
the $\mathrm{Lie}(\mathcal{G})$-valued equivariant $k$-forms on $\mathcal{A}$ can be considered as elements of bidegree $(k, 0)$ of the complex (12). In particular the connection form $\eta$ defines an element of bidegree $(1, 0)$ of this complex. Using the splitting of the exterior derivative and the decomposition of forms we obtain the following decomposition of the curvature $\mathbb{F}$:
\[ \mathbb{F} = \mathbf{d}\mathbb{A} + \tfrac{1}{2}[\mathbb{A}, \mathbb{A}] = \mathbb{F}^{(2,0)} + \mathbb{F}^{(1,1)} + \mathbb{F}^{(0,2)}, \]
where
\begin{align}
\mathbb{F}^{(2,0)} &= \delta\eta + \tfrac{1}{2}[\eta, \eta] \equiv \phi, \tag{14} \\
\mathbb{F}^{(1,1)} &= \delta A^U + d\eta + [A^U, \eta] \equiv \psi, \tag{15} \\
\mathbb{F}^{(0,2)} &= dA^U + \tfrac{1}{2}[A^U, A^U] \equiv F^U. \tag{16}
\end{align}
The $(0, 2)$-form $F^U$ is the universal family of curvature forms corresponding to the universal family of connections $A^U$. In this sense, Eq. (16) is the extension to families of the usual Cartan structure equation. The $(2, 0)$-form $\phi$ is the curvature of the connection $\eta$. The $(1, 1)$-form $\psi$ is a mixed term which involves both the gauge fields and the gauge fixing connection. This last term shows that this construction mixes in a non-trivial way the geometric structures coming from the fiber bundles $P \to M$ and $\mathcal{A} \to \mathcal{B}$.

We will now decompose Eqs. (14) and (15) in order to recover the usual BRST transformations of the gauge and ghost fields. Let $\Omega_V^k(\mathcal{A})$ for $k > 0$ be the vertical differential forms induced by $\eta$. We define $\Omega_V^0(\mathcal{A}) = C^\infty(\mathcal{A})$. The differential of the de Rham complex of $\mathcal{A}$ induces a differential $\delta_V$ on the vertical forms. The complex $\Omega_V^*(\mathcal{A}) \otimes \left( \Omega^*(P) \otimes_G \mathfrak{g} \right)$ will be termed the vertical complex. This decomposition shows that we can identify the vertical complex with a subcomplex of $\Omega^*(\mathcal{A}) \otimes \left( \Omega^*(P) \otimes_G \mathfrak{g} \right)$. Via this identification, the $(1, 0)$-form $\eta$ can be identified with a vertical form. In particular, we see that the $(1, 1)$-forms have a decomposition in two terms, one of which is the part of degree $(1, 1)$ of the vertical complex.
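The bidegree splitting in Eqs. (14)–(16) can be checked by a direct expansion. The following is a sketch under stated assumptions: consistently with Eq. (10), the extended connection is written as the sum $\mathbb{A} = \eta + A^U$ of a $(1,0)$-form and a $(0,1)$-form, $\mathbb{F}$ denotes its curvature, $\mathbf{d} = \delta + d$ is a symbol chosen here for the total exterior derivative, and the graded bracket on Lie-algebra-valued forms of odd total degree is taken to be symmetric, $[\alpha, \beta] = [\beta, \alpha]$.

```latex
\begin{align*}
\mathbb{F}
  &= (\delta + d)(\eta + A^U) + \tfrac{1}{2}\,[\eta + A^U,\ \eta + A^U] \\
  &= \underbrace{\delta\eta + \tfrac{1}{2}[\eta,\eta]}_{\text{bidegree }(2,0)}
   + \underbrace{\delta A^U + d\eta + [A^U,\eta]}_{\text{bidegree }(1,1)}
   + \underbrace{dA^U + \tfrac{1}{2}[A^U,A^U]}_{\text{bidegree }(0,2)} .
\end{align*}
% The cross terms combine as (1/2)([\eta, A^U] + [A^U, \eta]) = [A^U, \eta].
```

Reading off the three groups of terms gives $\phi$, $\psi$ and $F^U$ respectively.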

We will now write the explicit decomposition of both sides of Eq. (15). Let $\delta = \delta_V + \delta_H$ be the decomposition of the de Rham differential on $\mathcal{A}$ induced by $\eta$. On degree $(0, *)$ the decomposition is given by the following definition. Let $p_V$ and $p_H$ be the projectors onto the factors associated with the decomposition into vertical and horizontal forms. If we think of elements of $\Omega^0(\mathcal{A}) \otimes \left( \Omega^1(P) \otimes_G \mathfrak{g} \right)$ as $\Omega^1(P) \otimes_G \mathfrak{g}$-valued functions on $\mathcal{A}$, then $\delta$ has a natural decomposition $\delta = \delta_V + \delta_H$, where $\delta_V = p_V \circ \delta$ and $\delta_H = p_H \circ \delta$. The universal family of connections $A^U$ can be interpreted as a function $A^U : \mathcal{A} \to \Omega^1(P) \otimes_G \mathfrak{g}$. We have a splitting $\delta A^U = \delta_V A^U + \delta_H A^U \in \Omega^1(\mathcal{A}) \otimes \left( \Omega^1(P) \otimes_G \mathfrak{g} \right)$, where $\delta_V A^U$ is an element of the vertical complex. The universal family of connections $A^U$ induces a family of connections on each of the vector bundles associated to $P$ and in particular on $\mathrm{ad}(P)$. This family of connections can be seen as a homomorphism
\[ d_{A^U} : p^*(\mathrm{Lie}(\mathcal{G})) \to p^*\left( \Omega^1(P) \otimes \mathfrak{g} \right), \tag{17} \]
given on each copy $\{A\} \times \mathrm{Lie}(\mathcal{G})$ by the covariant derivative $d_A : \mathrm{Lie}(\mathcal{G}) \to \Omega^1(P) \otimes \mathrm{Lie}(\mathcal{G})$ associated to the connection $A$. Recall that sections of $\mathrm{ad}(P)$ can be seen as equivariant functions $P \to \mathfrak{g}$ and that the covariant derivative is $d_A = d \circ \pi_A$, where $d$ is the exterior derivative in $P$ and $\pi_A : TP \to TP$ is the horizontal projection. These constructions are equivariant, which explains the codomain in Eq. (17). The term $d_{A^U}\eta$ is by definition the extension $1 \otimes d_{A^U} : \Omega^1(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G}) = \Omega^1(\mathcal{A}) \otimes \left( \Omega^0(P) \otimes_G \mathfrak{g} \right) \to \Omega^1(\mathcal{A}) \otimes \left( \Omega^1(P) \otimes_G \mathfrak{g} \right)$ applied to $\eta$. Since the homomorphism $1 \otimes d_{A^U}$ acts on the second factor, it preserves vertical forms. It follows that $d_{A^U}\eta = d\eta + [A^U, \eta]$ is a vertical form. From these remarks we see that the vertical summand of the left hand side of Eq. (15) is
\[ \delta_V A^U + d\eta + [A^U, \eta]. \tag{18} \]

We will consider now the right hand side of Eq. (15). We will demonstrate the following proposition.

Proposition 9. The element $\psi$ is horizontal for the connection $\eta$.

Proof. Since the connection $\mathbb{A}$ is the pullback of a connection on the $G$-fiber bundle $Q = (\mathcal{A} \times P)/\mathcal{G} \to \mathcal{A}/\mathcal{G} \times M$, the same is true for the curvature $\mathbb{F}$. If we write $\omega_{\mathcal{H}}$ and $F_{\mathcal{H}}$ for the connection and curvature forms of the distribution $\mathcal{H}$ on $Q$, then one has
\[ \mathbb{A} = q^*\omega_{\mathcal{H}}, \qquad \mathbb{F} = q^*F_{\mathcal{H}}, \]
where $q$ is the projection $\mathcal{A} \times P \xrightarrow{\;q\;} Q = (\mathcal{A} \times P)/\mathcal{G}$. If $X$ is a vector tangent to the fibers $TF_{\mathcal{G}}$ of the action of $\mathcal{G}$ on $\mathcal{A} \times P$, then the contraction $\imath_X\mathbb{F}$ is equal to $\imath_X q^*F_{\mathcal{H}} = \imath_{q_*X}F_{\mathcal{H}}$. Since $X$ has the form $X = (v, -v') \in TF_{\mathcal{G}}$, with $v = \kappa(g)$ and $v' = \iota(\tau_p(g))$ as in (8), we have $q_*X = 0$. This results from the fact that the vectors tangent to the orbits of $\mathcal{G}$ are projected to zero when we take the quotient by the action of $\mathcal{G}$. Hence, the contraction $\imath_X\mathbb{F}$ of the curvature $\mathbb{F}$ with a vector $X = (v, -v')$ tangent to the fibers given by (8) is zero:
\[ \imath_{(v,-v')}\mathbb{F} = \imath_{q_*(v,-v')}F_{\mathcal{H}} = 0. \tag{19} \]
An analysis of the different components of $\mathbb{F}$ shows that

• $\imath_{(v,-v')}\mathbb{F}^{(2,0)} = \imath_v\mathbb{F}^{(2,0)} = 0$, since $\mathbb{F}^{(2,0)}$ is induced by the connection $\eta$ and $v$ is vertical for this connection.
• $\imath_{(v,-v')}\mathbb{F}^{(0,2)} = \imath_{-v'}\mathbb{F}^{(0,2)} = 0$, since $\mathbb{F}^{(0,2)}$ is induced by the connection $A$ and $-v'$ is tangent to the fibers of $p^*(P) \to \mathcal{A} \times M$.
• $\imath_{-v'}\mathbb{F}^{(1,1)} = 0$, since $-v'$ is tangent to the fibers of $p^*(P) \to \mathcal{A} \times M$.

From these remarks and Eq. (19) it follows that
\[ 0 = \imath_{(v,-v')}\mathbb{F} = \imath_{(v,-v')}\mathbb{F}^{(1,1)} = \imath_v\mathbb{F}^{(1,1)} = \imath_v\psi. \]

Equation (15) then splits into the equations
\begin{align}
\delta_V A^U &= -d_{A^U}\eta, \tag{20} \\
\delta_H A^U &= \psi. \tag{21}
\end{align}
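The step from Eq. (15) to Eqs. (20) and (21) can be spelled out as follows. This is a sketch that uses only the vertical summand identified in Eq. (18), the identity $d_{A^U}\eta = d\eta + [A^U, \eta]$, and Proposition 9, which states that $\psi$ has no vertical component with respect to $\eta$.

```latex
\begin{align*}
\psi &= \delta A^U + d\eta + [A^U,\eta] \\
     &= \underbrace{\bigl(\delta_V A^U + d_{A^U}\eta\bigr)}_{\text{vertical part}}
      + \underbrace{\delta_H A^U}_{\text{horizontal part}} .
\end{align*}
% \psi is horizontal (Proposition 9), so the two parts must satisfy
\begin{align*}
\delta_V A^U + d_{A^U}\eta &= 0 , &
\delta_H A^U &= \psi .
\end{align*}
```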

The equation for the $(2, 0)$-form $\phi$ can also be canonically decomposed in components belonging to the vertical and horizontal complexes. The differential $\delta$ acting on elements of $\Omega^1(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G}) = \Omega^1(\mathcal{A}) \otimes \left( \Omega^0(P) \otimes_G \mathfrak{g} \right)$ also has a decomposition $\delta = \delta_V + \delta_H$, where $\delta_V = \delta \circ p_V$ and $\delta_H = \delta \circ p_H$. The horizontal part $\delta_H$ corresponds to the covariant derivative with respect to the connection $\eta$. Therefore we have a splitting $\delta\eta = \delta_V\eta + \delta_H\eta \in \Omega^2(\mathcal{A}) \otimes \left( \Omega^0(P) \otimes_G \mathfrak{g} \right) = \Omega^2(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G})$. By definition of the curvature $\phi$ associated to the connection $\eta$ we have
\[ \delta_H\eta = \phi. \tag{22} \]
The vertical component of the equation is such that $\delta_V\eta + \delta_H\eta = \delta\eta = \phi - \tfrac{1}{2}[\eta, \eta]$. We have then
\[ \delta_V\eta = -\tfrac{1}{2}[\eta, \eta]. \tag{23} \]
In the next section we will show how the BRST transformations of the gauge and ghost fields can be obtained from Eqs. (20) and (23) respectively.

IV. The Relationship Between the Gauge Fixing Connection and the Ghost Field

The proposed formalism allows us to further clarify the relationship between the gauge fixing and the ghost field. To do so, we shall first work in a local trivialization $\varphi_{\sigma_i} : U_i \times \mathcal{G} \to \pi^{-1}(U_i)$ defined by a local gauge fixing section $\sigma_i : U_i \to \mathcal{A}$ over an open subset $U_i \subset \mathcal{A}/\mathcal{G}$. Let $\tilde{\eta}$ be the pull-back by $\varphi_{\sigma_i}$ of the connection form $\eta$ restricted to $\pi^{-1}(U_i)$. As we have seen in Sect. II, the connection form $\tilde{\eta}$ at $([A], g) \in U_i \times \mathcal{G}$ takes the form
\[ \tilde{\eta} = \mathrm{ad}_{g^{-1}}\eta_i + \theta_{MC}, \tag{24} \]
where $\eta_i = \sigma_i^*\eta \in \Omega^1(U_i) \otimes \mathrm{Lie}(\mathcal{G})$ is the local form of the connection $\eta$ and $\theta_{MC} \in \mathrm{Lie}(\mathcal{G})^* \otimes \mathrm{Lie}(\mathcal{G})$ is the Maurer-Cartan form of the gauge group $\mathcal{G}$ [7]. The Maurer-Cartan form satisfies the equation
\[ \delta_{\mathcal{G}}\theta_{MC} = -\tfrac{1}{2}[\theta_{MC}, \theta_{MC}]. \tag{25} \]

The formal resemblance between this equation and the BRST transformation of the ghost field $c$ led in Ref. [6] to the identification of $\delta_{BRST}$ and $c$ with the differential $\delta_{\mathcal{G}}$ and the Maurer-Cartan form $\theta_{MC}$ of $\mathcal{G}$ respectively. Hence, Eq. (24) shows that the ghost field can be identified with the canonical vertical part of the gauge fixing connection $\eta$ expressed in a local trivialization.

We will now show that it is possible to recover the standard BRST transformation of the gauge field $\delta_{BRST}A = -d_Ac$ from Eq. (20). To do so, we will first suppose that it is possible to define a global gauge fixing section $\sigma : \mathcal{B} \to \mathcal{A}$. As we have seen in Sect. II, the associated trivialization $\varphi_\sigma$ induces a distinguished connection $\eta_\sigma$ such that $\varphi_\sigma^*\eta_\sigma = \theta_{MC}$. Therefore, Eq. (20) yields in this trivialization
\[ \delta_{\mathcal{G}}A^U = -d_{A^U}\theta_{MC}. \tag{26} \]
This equation is an extension to families of the usual BRST transformation of the gauge field $A$. If a global gauge fixing section cannot be defined, then it is possible to show that the usual BRST transformation of $A$ is valid locally. In fact, since $\delta_VA^U$ is a vertical form, the substitution of the local decomposition (24) in Eq. (20) yields the BRST transformation (26). We can thus conclude that the usual BRST transformation of $A$ given by (26) is only valid in a local trivialization of $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. Therefore, Eq. (20) can be considered the globally valid BRST transformation of the gauge field $A$.

In fact, we will now show that Eq. (20) plays the same role as the usual BRST transformation of the gauge field. To do so, we have to take into account that the BRST transformation $\delta_{BRST}A = -d_Ac$ defines a general infinitesimal gauge transformation of $A$ [6]. In order to obtain a particular gauge transformation from this general expression, it is necessary to choose an element $\xi \in \mathrm{Lie}(\mathcal{G})$. In doing so, the usual gauge transformation of $A$ is recovered:
\[ \delta A = (\delta_{BRST}A)(\xi) = -d_A(c(\xi)) = -d_A\xi. \tag{27} \]
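As a simple illustration of Eq. (27), consider the abelian case. The choice $G = U(1)$ is an example added here and is not treated in the text; for it the Lie bracket vanishes, so the covariant derivative of $\xi$ reduces to the exterior derivative.

```latex
% Abelian illustration (assumption: G = U(1), so [A, \xi] = 0)
\begin{align*}
d_A\xi   &= d\xi + [A,\xi] = d\xi , \\
\delta A &= -d_A\xi = -d\xi , \\
\delta F &= d(\delta A) = -d^2\xi = 0 \qquad (F = dA) .
\end{align*}
```

This is the familiar gauge transformation of the electromagnetic potential, which leaves the field strength unchanged.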

Let’s now consider Eq. (20). According to the definition of connections, η(ξ ) = ξ , where ξ is the fundamental vector field in T A corresponding to ξ ∈ Lie (G ). Therefore, Eq. (20) yields δAU = (δV AU )(ξ ) = −dAU (η(ξ )) = −dAU ξ.


This equation is the universal family of infinitesimal gauge transformations defined in (27). Therefore, Eq. (20) can be consistently considered the globally valid extension to families of the usual BRST transformation of $A$.

The identification of the ghost field with the canonical vertical part of the gauge fixing connection $\eta$ depends on a particular trivialization of the fiber bundle. Nevertheless, we will now show that it is also possible to understand the relationship between $c$ and $\eta$ without using such a trivialization. To do so it is necessary to introduce the Weil algebra of the gauge group $\mathcal{G}$ (see Refs. [10,13,21]). The Weil algebra of a Lie algebra $\mathfrak{k}$ is the tensor product $W(\mathfrak{k}) = S^*\mathfrak{k}^* \otimes \Lambda^*\mathfrak{k}^*$ of the symmetric algebra $S^*\mathfrak{k}^*$ and the exterior algebra $\Lambda^*\mathfrak{k}^*$ of $\mathfrak{k}^*$ (where $\mathfrak{k}$ and $\mathfrak{k}^*$ are dual spaces). Let $T_a$ and $\vartheta^a$ be bases of $\mathfrak{k}$ and $\mathfrak{k}^*$ respectively. The Weil algebra is then generated by the elements $\theta^a = 1 \otimes \vartheta^a$ and $\zeta^a = \vartheta^a \otimes 1$. The grading is defined by assigning degree 1 to $\theta^a$ and degree 2 to $\zeta^a$. Let's define the elements $\theta$ and $\zeta$ in $W(\mathfrak{k}) \otimes \mathfrak{k}$ as $\theta = \theta^a \otimes T_a$ and $\zeta = \zeta^a \otimes T_a$ respectively. In fact, the element $\theta$ is the Maurer-Cartan form $\theta_{MC}$ of the group. Weil's differential $\delta_W$ acts on these elements by means of the expressions

\[ \delta_W \theta = \zeta - \frac{1}{2}[\theta, \theta], \qquad \delta_W \zeta = -[\theta, \zeta]. \]

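These equations reproduce in the Weil algebra the Cartan structure equation and the Bianchi identity respectively. What is important to note here is that the connection $\eta$ in the $\mathcal{G}$-principal bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ can be defined as the image of $\theta_{MC}$ under a particular Chern-Weil homomorphism

\[ \omega : (W(\mathrm{Lie}(\mathcal{G})) \otimes \mathrm{Lie}(\mathcal{G}), \delta_W) \longrightarrow (\Omega^*(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G}), \delta), \qquad \theta_{MC} \mapsto \eta. \]

Therefore, the Weil algebra is a universal model for the algebra of a connection and its curvature. In this way, a gauge fixing of the theory by means of a connection $\eta$ can be defined by choosing a particular Chern-Weil immersion $\omega$ of the "universal connection" $\theta_{MC}$ into $\Omega^*(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G})$. Hence, the ghost field can be considered a universal connection whose different immersions $\omega$ define different gauge fixings of the theory.

We will now show that the usual BRST transformation of the ghost field can be recovered from Eq. (23). To do so, it is necessary to restrict the attention to the vertical complexes. Indeed, the connection $\eta \in \Omega^1(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G})$ defines a homomorphism of differential algebras

\[ \omega_V : (\Lambda\,\mathrm{Lie}(\mathcal{G})^* \otimes \mathrm{Lie}(\mathcal{G}), \delta_{\mathcal{G}}) \to (\Omega_V^*(\mathcal{A}) \otimes \mathrm{Lie}(\mathcal{G}), \delta_V), \]

with $\omega_V(\theta_{MC}) = \eta$. This means that $\omega_V(\delta_{\mathcal{G}} \alpha) = \delta_V(\omega_V(\alpha))$ for $\alpha \in \Lambda\,\mathrm{Lie}(\mathcal{G})^* \otimes \mathrm{Lie}(\mathcal{G})$. Therefore, equation (23) yields

\[ \delta_V(\omega_V(\theta_{MC})) = -\frac{1}{2}[\omega_V(\theta_{MC}), \omega_V(\theta_{MC})], \]
\[ \omega_V(\delta_{\mathcal{G}} \theta_{MC}) = \omega_V\!\left(-\frac{1}{2}[\theta_{MC}, \theta_{MC}]\right). \]

Therefore

\[ \delta_{\mathcal{G}} \theta_{MC} = -\frac{1}{2}[\theta_{MC}, \theta_{MC}], \]

which coincides with the ghost field's BRST transformation.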


Remark 10. Contrary to what is commonly done in order to reobtain the BRST transformations for the ordinary (non-topological) Yang-Mills case, it has not been necessary to impose the horizontality conditions $\phi = \psi = 0$ on the extended curvature $\mathbb{F}$ (see for example Refs. [2,4,22]).

V. Path Integral Gauge Fixing

V.1. Usual gauge fixing. The central problem in the quantization of Yang-Mills theory is computing the transition amplitudes

\[ \langle [A_0] \mid [A_1] \rangle = \int_{T^*\mathcal{P}([A_0],[A_1])} \exp\{iS\}\, \mathcal{D}A\, \mathcal{D}\pi, \tag{28} \]

where $S$ is the canonical action, $\mathcal{D}A$ is the Feynman measure on the space of paths

\[ \mathcal{P}([A_0],[A_1]) = \{\gamma : [0,1] \to \mathcal{A}/\mathcal{G} \mid \gamma(i) = [A_i],\ i = 0, 1\} \]

in $\mathcal{A}/\mathcal{G}$, and $\mathcal{D}\pi$ is a Feynman measure on the space of momenta. The canonical Yang-Mills action is given by the expression

\[ S = \int dt\, d^3x \left[ \dot{A}^a_k \pi_a^k - H_0(\pi_a^k, B_a^k) - A^a_0 \phi_a \right], \tag{29} \]

with $\pi_a^k = F_a^{k0}$ and $B^a_k = \frac{1}{2} \varepsilon_{kmn} F^{a\,mn}$ (where $F^a_{mn}$ are the field strengths). The Yang-Mills Hamiltonian $H_0(\pi_a^k, B_a^k)$ is

\[ H_0(\pi_a^k, B_a^k) = \frac{1}{2} \left( \pi_a^k \pi^a_k + B_a^k B^a_k \right), \tag{30} \]

and the functions $\phi_a$ are

\[ \phi_a = -\partial_k \pi_a^k + f_{ab}^{\;\;c}\, \pi_c^k A^b_k. \tag{31} \]

The pairs $(A^a_k, \pi_a^k)$ are the canonical variables of the theory. The temporal component $A^a_0$ is not a dynamical variable, but the Lagrange multiplier for the generalized Gauss constraint $\phi_a \approx 0$.

The geometry of the quotient space $\mathcal{A}/\mathcal{G}$ is generally quite complicated. The usual approach is to replace the integral (28) with an integral over the space of paths in the affine space $\mathcal{A}$. To do so, one must pick two elements $A_i \in \pi^{-1}[A_i]$ in the fibres over $[A_i] \in \mathcal{A}/\mathcal{G}$ ($i = 0, 1$). Then one replaces the integral (28) by

\[ \langle A_0 \mid A_1 \rangle = \int_{T^*\mathcal{P}(A_0, A_1)} \exp\{iS\}\, \mathcal{D}A\, \mathcal{D}\pi, \tag{32} \]

where the integral is now defined on the cotangent bundle of the following space of paths in $\mathcal{A}$:

\[ \mathcal{P}(A_0, A_1) = \{\gamma : [0,1] \to \mathcal{A} \mid \gamma(0) = A_0,\ \gamma(1) = A_1\}. \]

The problem with this approach is that it introduces an infinite volume in the path integral, which corresponds to the integration over unphysical degrees of freedom. The projection $\pi : \mathcal{A} \to \mathcal{A}/\mathcal{G}$ induces a projection

\[ \pi : \mathcal{P}(A_0, A_1) \to \mathcal{P}([A_0],[A_1]). \]


The path group $\mathcal{P}\mathcal{G} = \{g(t) : [0,1] \to \mathcal{G} \mid g(0) = g(1) = \mathrm{id}_{\mathcal{G}}\}$ acts on $\mathcal{P}(A_0, A_1)$ by pointwise multiplication and the fibers of $\pi$ consist of the orbits of the action of $\mathcal{P}\mathcal{G}$. Since the action $S$ is invariant under the action of $\mathcal{P}\mathcal{G}$, one needs to extract the volume of this group from the integral (32). In order to get rid of this infinite volume the usual approach is to fix the gauge by defining a section $\sigma : \mathcal{A}/\mathcal{G} \to \mathcal{A}$ such that $\sigma([A_i]) = A_i$, $i = 0, 1$. The gauge fixing section $\sigma$ induces a map $\tilde{\sigma}$,

\[ \mathcal{P}(A_0, A_1) \;\underset{\tilde{\sigma}}{\overset{\pi}{\rightleftarrows}}\; \mathcal{P}([A_0],[A_1]), \]

defined by $\tilde{\sigma}(\gamma) = \sigma \circ \gamma$, which is a section of $\pi$. This section $\tilde{\sigma}$ induces a trivialization

\[ \mathcal{P}(A_0, A_1) \simeq \mathcal{P}([A_0],[A_1]) \times \mathcal{P}\mathcal{G} \]

and a similar decomposition at the level of cotangent bundles. The ghost field appears when one computes the Jacobian which relates the corresponding measures. One can then use Fubini's theorem in order to extract the irrelevant and problematic factor. It is worth noting that the definition of a gauge fixing section $\sigma$ in the $\mathcal{G}$-principal fiber bundle of fields $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ is only an auxiliary step for defining a section $\tilde{\sigma}$ of the $\mathcal{P}\mathcal{G}$-projection $\mathcal{P}(A_0, A_1) \xrightarrow{\pi} \mathcal{P}([A_0],[A_1])$ in the space of paths where the path integral is actually defined.

V.2. Generalized gauge fixing. We will now consider in which sense the connection $\eta$ can be used to fix the gauge. This gauge fixing will be globally well-defined, even if there is a Gribov obstruction. We shall begin by considering paths in $\mathcal{A}$ such that the initial condition $A_0$ is fixed and the final condition is defined only up to a gauge transformation (see Ref. [19, p. 123]). This means that the final condition can be any element of the final fiber $\pi^{-1}[A_1]$. The corresponding space of paths is

\[ \mathcal{P}(A_0, \pi^{-1}[A_1]) = \{\gamma : [0,1] \to \mathcal{A} \mid \gamma(0) = A_0,\ \pi(\gamma(1)) = [A_1]\}. \]

The relevant path group is now $\mathcal{P}\mathcal{G} = \{g(t) : [0,1] \to \mathcal{G} \mid g(0) = \mathrm{id}_{\mathcal{G}}\}$. This group acts on $\mathcal{P}(A_0, \pi^{-1}[A_1])$. This action defines the projection $\pi : \mathcal{P}(A_0, \pi^{-1}[A_1]) \to \mathcal{P}([A_0],[A_1])$. It is easy to show that the action of the path group $\mathcal{P}\mathcal{G}$ on $\mathcal{P}(A_0, \pi^{-1}[A_1])$ is free. We will not need to assume that it is a principal bundle. The gauge fixing by means of the connection $\eta$ is defined by taking parallel transports along paths in $\mathcal{A}/\mathcal{G}$ of the initial condition $A_0 \in \pi^{-1}[A_0]$ (as has already been suggested in Ref. [19]). This procedure defines a section $\sigma^\eta$ of the projection $\pi$,

\[ \mathcal{P}(A_0, \pi^{-1}[A_1]) \;\underset{\sigma^\eta}{\overset{\pi}{\rightleftarrows}}\; \mathcal{P}([A_0],[A_1]). \tag{33} \]


The section $\sigma^\eta$ sends each path $\gamma \in \mathcal{P}([A_0],[A_1])$ to its $\eta$-horizontal lift $\tilde{\gamma} = \sigma^\eta \gamma \in \mathcal{P}(A_0, \pi^{-1}[A_1])$ starting at $A_0$ (see footnote 3). A path $\gamma \in \mathcal{P}(A_0, \pi^{-1}[A_1])$ is in the image of $\sigma^\eta$ if and only if the tangent vectors to $\gamma$ at each $A \in \mathcal{A}$ belong to the horizontal subspaces $H_\eta(A)$ defined by $\eta$. Recalling that $H_\eta(A) = \mathrm{Ker}\, \eta(A)$, this local condition leads to the gauge fixing equation

\[ \eta(\dot{\gamma}(t)) = 0, \quad \forall t. \tag{34} \]

In local bundle coordinates this condition defines a non-linear ordinary differential equation. The explicit form of this equation is given in (57) (see Ref. [18] for details). In the case of the Coulomb connection defined in (7), Eq. (34) becomes

\[ d^*_{\gamma(t)}\, \dot{\gamma}(t) = 0, \quad \forall t. \tag{35} \]

This equation can be expressed in local terms on $M$ and $P$ for each $t$. If $\pi : X \to X/G$ is a quotient space by a principal action, then any section $\sigma : X/G \to X$ induces a global trivialization $\Phi : X/G \times G \to X$, where $\Phi([x], g) = \sigma([x]) \cdot g$. It can be shown that $\Phi$ is a diffeomorphism. It follows that the space of paths $\mathcal{P}(A_0, \pi^{-1}[A_1])$ can be factorized as $\mathcal{P}([A_0],[A_1]) \times \mathcal{P}\mathcal{G}$. A similar decomposition is induced at the level of cotangent bundles. Strictly speaking the path integral is not an integral on the space of fields $\mathcal{A}$, but rather an integral on the space of paths. Hence, the section $\sigma^\eta$ induced by the connection $\eta$ suffices to get rid of the infinite volume of the path group $\mathcal{P}\mathcal{G}$.

Remark 11. The generalized gauge fixing can also be defined as the null space of a certain functional as follows. The Lie algebra $\mathrm{Lie}(\mathcal{G})$ of the gauge group $\mathcal{G}$ can be identified with the sections of the adjoint bundle $\mathrm{ad}(P)$. The invariant metric on $\mathfrak{g}$ induces a metric $\langle\, ,\, \rangle_{\mathrm{ad}(P)}$ on $\mathrm{ad}(P)$. Using this metric we define a $\mathcal{G}$-invariant metric on $\mathrm{Lie}(\mathcal{G}) = \Gamma(\mathrm{ad}(P))$ by

\[ \langle \sigma_1, \sigma_2 \rangle_{\mathrm{Lie}(\mathcal{G})} = \int_M \langle \sigma_1(p), \sigma_2(p) \rangle_{\mathrm{ad}(P)}\, dx. \tag{36} \]

Then we define the functional $F : \mathcal{P}(A_0, \pi^{-1}[A_1]) \to \mathbb{R}$ by

\[ F(\gamma) = \int_0^1 \| \eta(\dot{\gamma}(t)) \|^2_{\mathrm{Lie}(\mathcal{G})}\, dt. \tag{37} \]

This is a positive functional and the image of the section $\sigma^\eta$ is the null space of $F$.

VI. Faddeev-Popov Method Revisited

We will now proceed to implement the proposed generalized gauge fixing at the level of the path integral. To do so, we will show that the usual Faddeev-Popov method can also be used with this generalized gauge fixing. We will then start by introducing our gauge fixing condition at the level of the transition amplitude

\[ \langle A_0 \mid \pi^{-1}[A_1] \rangle = \int_{T^*\mathcal{P}(A_0, \pi^{-1}[A_1])} \exp\{iS\}\, \mathcal{D}A\, \mathcal{D}\pi. \tag{38} \]

Footnote 3. The theorem of existence of parallel transport has been extended to infinite dimensions in Ref. [17, Theorem 39.1]. It can be shown that under suitable assumptions the parallel transport depends smoothly on the path. We will therefore assume that the section $\sigma^\eta$ is smooth and that its image is a smooth submanifold of $\mathcal{P}(A_0, \pi^{-1}[A_1])$. By definition, this submanifold is transversal to the action of $\mathcal{P}\mathcal{G}$.

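The first possible form of the gauge fixing condition is $\delta(F(\gamma))$, where $\delta$ is the Dirac delta function on $\mathbb{R}$ and $F(\gamma)$ the functional (37). This form is mathematically consistent and does not require any product of distributions. This gauge fixing condition has the direct exponential representation

\[ \delta(F(\gamma)) = \int d\lambda\, e^{i\lambda F(\gamma)} = \int d\lambda\, \exp\left\{ i\lambda \int_\gamma \| \eta(\dot{\gamma}(t)) \|^2_{\mathrm{Lie}(\mathcal{G})}\, dt \right\} = \int d\lambda\, \exp\left\{ i\lambda \int_\gamma \int_M \| \eta(\dot{\gamma}(t)) \|^2_{\mathfrak{g}}\, dx\, dt \right\}. \]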

The second form is based on the elementary observation that the integral of a continuous positive function is zero if and only if the function is zero at all points. One can then define the gauge fixing condition

\[ \delta(\eta(\dot{\gamma})) = \lim_N \prod_{k=1}^{N} \delta_{\mathrm{Lie}(\mathcal{G})}(\eta(\dot{\gamma}(t_k))) = \lim_{N,M} \prod_{k=1}^{N} \prod_{j=1}^{M} \delta_{\mathfrak{g}}(\eta(\dot{\gamma}(t_k))(x_j)), \tag{39} \]

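where $\delta_{\mathrm{Lie}(\mathcal{G})}$ is the delta function on $\mathrm{Lie}(\mathcal{G})$ defined as an infinite product of the Dirac delta $\delta_{\mathfrak{g}}$ on $\mathfrak{g}$. If $T_a$ is a fixed basis of $\mathfrak{g}$, we can write $\delta_{\mathfrak{g}}(\eta(\dot{\gamma}(t_k))(x_j))$ in terms of $\delta(\eta(\dot{\gamma}(t_k))(x_j)^a)$, where now the delta function is the usual delta function on $\mathbb{R}^3$. As usual we define the element $\Delta^{-1}[\gamma]$ as

\[ \Delta^{-1}[\gamma] = \int_{\mathcal{P}\mathcal{G}} \mathcal{D}g\, \delta(\eta(\dot{\gamma}^g)), \]

where $\gamma^g$ denotes the right action of $\mathcal{P}\mathcal{G}$ on $\mathcal{P}(A_0, \pi^{-1}[A_1])$.

Proposition 12. The element $\Delta^{-1}[\gamma]$ is $\mathcal{P}\mathcal{G}$ invariant.

Proof.

\[ \Delta^{-1}[\gamma^g] = \int_{\mathcal{P}\mathcal{G}} \mathcal{D}g'\, \delta(\eta(\dot{\gamma}^{gg'})) = \int_{\mathcal{P}\mathcal{G}} \mathcal{D}(gg')\, \delta(\eta(\dot{\gamma}^{gg'})) = \int_{\mathcal{P}\mathcal{G}} \mathcal{D}(g'')\, \delta(\eta(\dot{\gamma}^{g''})) = \Delta^{-1}[\gamma]. \tag{40} \]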


We can then express the number one in the following way:

\[ 1 = \Delta[\gamma] \int_{\mathcal{P}\mathcal{G}} \mathcal{D}g\, \delta(\eta(\dot{\gamma}^g)). \tag{41} \]

Roughly speaking, the element $\Delta[\gamma]$ corresponds to the determinant of the operator which measures the gauge fixing condition's variation under infinitesimal gauge transformations. It can be shown that the element $\Delta[\gamma]$ is never zero. The local gauge fixing condition $\eta(\dot{\gamma}(t)) = 0$ induces a well-defined section $\sigma^\eta$ of the $\mathcal{P}\mathcal{G}$-projection $\pi : \mathcal{P}(A_0, \pi^{-1}[A_1]) \to \mathcal{P}([A_0],[A_1])$ in the space of paths. Since the submanifold defined by the image of $\sigma^\eta$ is by definition transversal to the action of $\mathcal{P}\mathcal{G}$, an infinitesimal gauge transformation of the gauge fixing condition $\eta(\dot{\gamma}(t)) = 0$ will always be non-trivial. In this way we can argue that the element $\Delta[\gamma]$ is never zero. This fact ensures that the connection $\eta$ induces a well-defined global gauge fixing, even when the topology of the fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ is not trivial. By inserting (41) in (38) we obtain

\[ \langle A_0 \mid \pi^{-1}[A_1] \rangle = \int_{T^*\mathcal{P}(A_0, \pi^{-1}[A_1])} \mathcal{D}A\, \mathcal{D}\pi\, \Delta[\gamma] \int_{\mathcal{P}\mathcal{G}} \mathcal{D}g\, \delta(\eta(\dot{\gamma}^g))\, \exp\{iS\}. \]

If we now perform in the usual manner a gauge transformation taking $\gamma^g$ to $\gamma$ we obtain

\[ \langle A_0 \mid \pi^{-1}[A_1] \rangle = \int_{\mathcal{P}\mathcal{G}} \mathcal{D}g \int_{T^*\mathcal{P}(A_0, \pi^{-1}[A_1])} \mathcal{D}A\, \mathcal{D}\pi\, \Delta[\gamma]\, \delta(\eta(\dot{\gamma}))\, \exp\{iS\}, \]

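where we have used that $\mathcal{D}A\, \mathcal{D}\pi$, $\Delta^{-1}[\gamma]$ and $S$ are gauge invariant. In this way we have isolated the infinite volume of the path group $\mathcal{P}\mathcal{G}$. We will now follow the common procedure for finding the new terms in the action coming from the Dirac delta $\delta(\eta(\dot{\gamma}))$ and the element $\Delta[\gamma]$.

In order to find an integral representation of the gauge condition's Dirac delta $\delta(\eta(\dot{\gamma}))$ we will start by finding the integral representation of the Dirac delta function $\delta_{\mathrm{Lie}(\mathcal{G})}$ on $\mathrm{Lie}(\mathcal{G})$ used in (39). If $\xi \in \mathrm{Lie}(\mathcal{G})$, the Dirac delta $\delta_{\mathrm{Lie}(\mathcal{G})}(\xi)$ defined as

\[ \delta_{\mathrm{Lie}(\mathcal{G})}(\xi) = \lim_M \prod_{j=1}^{M} \delta_{\mathfrak{g}}(\xi(x_j)), \]

can be expressed in terms of the integral representations of the Dirac delta $\delta_{\mathfrak{g}}(\xi(x_j))$ on $\mathfrak{g}$ as

\[ \delta_{\mathrm{Lie}(\mathcal{G})}(\xi) = \lim_M \int \prod_{j=1}^{M} d\lambda(x_j)\, e^{i \sum_{j=1}^{M} \langle \lambda(x_j), \xi(x_j) \rangle_{\mathfrak{g}}} = \int \tilde{\mathcal{D}}\lambda\, e^{i \int dx\, \langle \lambda(x), \xi(x) \rangle_{\mathfrak{g}}} = \int \tilde{\mathcal{D}}\lambda\, e^{i \langle \lambda, \xi \rangle_{\mathrm{Lie}(\mathcal{G})}}, \]

where $\lambda$ is a section of $\mathrm{ad}(P) = P \times_G \mathfrak{g}$ and $\tilde{\mathcal{D}}\lambda = \lim_M \prod_{j=1}^{M} d\lambda(x_j)$. The Dirac delta $\delta(\eta(\dot{\gamma}))$ of the gauge fixing condition can then be expressed as

\[ \delta(\eta(\dot{\gamma})) = \lim_N \prod_{k=1}^{N} \delta_{\mathrm{Lie}(\mathcal{G})}(\eta(\dot{\gamma}(t_k))) = \lim_N \int \prod_{k=1}^{N} \tilde{\mathcal{D}}\lambda_k\, e^{i \sum_{k=1}^{N} \langle \lambda_k, \eta(\dot{\gamma}(t_k)) \rangle_{\mathrm{Lie}(\mathcal{G})}} \]
\[ = \int \mathcal{D}\lambda\, e^{i \int_\gamma \langle \lambda, \eta(\dot{\gamma}(t)) \rangle_{\mathrm{Lie}(\mathcal{G})}\, dt} = \int \mathcal{D}\lambda\, e^{i \int_\gamma \int_M \langle \lambda, \eta(\dot{\gamma}(t)) \rangle_{\mathfrak{g}}\, dx\, dt}, \]

where $\lambda$ is a time-evolving section of $\mathrm{ad}(P) = P \times_G \mathfrak{g}$. The final measure $\mathcal{D}\lambda$ is then

\[ \mathcal{D}\lambda = \lim_N \prod_{k=1}^{N} \tilde{\mathcal{D}}\lambda_k = \lim_{N,M} \prod_{k=1}^{N} \prod_{j=1}^{M} d\lambda_k(x_j). \]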

Let's now compute explicitly the element $\Delta[\gamma]$. Let $X$ be an element of the Lie algebra $\mathrm{Lie}(\mathcal{P}\mathcal{G})$ identified with the tangent space of $\mathcal{P}\mathcal{G}$ at the identity element. Given a path $\gamma(t) \in \mathcal{P}(A_0, \pi^{-1}[A_1])$, one must calculate the variation of $\eta(\dot{\gamma})$ under an infinitesimal gauge transformation defined by $X \in \mathrm{Lie}(\mathcal{P}\mathcal{G})$. Let $u \mapsto k_u$ be the one-parameter subgroup of $\mathcal{P}\mathcal{G}$ generated by $X$ by means of the exponential map $\exp : \mathrm{Lie}(\mathcal{P}\mathcal{G}) \to \mathcal{P}\mathcal{G}$. We have then $X = \frac{d}{du} k_u |_{u=0} \in \mathrm{Lie}(\mathcal{P}\mathcal{G})$.

Remark 13. The gauge fixing condition has a natural interpretation in terms of the geometry of $\mathcal{P}(A_0, \pi^{-1}[A_1])$. The tangent space $T_\gamma \mathcal{P}(A_0, \pi^{-1}[A_1])$ to $\mathcal{P}(A_0, \pi^{-1}[A_1])$ at a path $\gamma$ can be identified with the sections of the pullback $\gamma^*(T\mathcal{A})$. The connection $\eta$ induces a map from $T_\gamma \mathcal{P}(A_0, \pi^{-1}[A_1])$ to the Lie algebra of $\mathcal{P}\mathcal{G}$. The tangent field $\dot{\gamma}$ represents a marked point in $T_\gamma \mathcal{P}(A_0, \pi^{-1}[A_1])$. The gauge fixing condition means that we will only consider paths such that the connection $\eta$ vanishes on the marked point $\dot{\gamma}$.

Using the description of the tangent spaces to path spaces given in the previous remark we can identify $X$ with a map $t \mapsto X_t$ with $0 \le t \le 1$ and $X_t \in \mathrm{Lie}(\mathcal{G})$. In order to find an expression for the vectors $X_t$ one must take into account that $k_u$ describes a family of paths in the gauge group $\mathcal{G}$ parameterized by $t$. This means that $k_u = g_u(t) \subset \mathcal{G}$ for $u$ fixed. To emphasize this, let us write $k_u(t)$. For a given $t$, the vector $X_t \in \mathrm{Lie}(\mathcal{G})$ is then given by $X_t = \frac{d}{du} k_u(t) |_{u=0}$. If $\gamma \in \mathcal{P}(A_0, \pi^{-1}[A_1])$ we must compute

\[ \frac{d}{du}\, \eta_{\gamma(t) k_u(t)} \left( \frac{d}{dt} R_{k_u(t)} \gamma(t) \right) \Big|_{u=0}. \tag{42} \]

At a fixed time $t_0$ the time derivative in (42) is equal to

\[ \frac{d}{dt} R_{k_u(t)} \gamma(t) \Big|_{t=t_0} = \frac{d}{dt} R_{k_u(t_0) k_u(t_0)^{-1} k_u(t)} \gamma(t) \Big|_{t=t_0} \]
\[ = \frac{d}{dt} R_{k_u(t_0)} \gamma(t) \Big|_{t=t_0} + \frac{d}{dt} R_{k_u(t_0)^{-1} k_u(t)} \left( R_{k_u(t_0)} \gamma(t_0) \right) \Big|_{t=t_0} \]
\[ = dR_{k_u(t_0)}(\dot{\gamma}(t_0)) + \iota_{(\gamma(t_0) k_u(t_0))}(X_u), \tag{43} \]


where $X_u = k_u(t_0)^{-1} \frac{d}{dt} k_u(t) |_{t=t_0}$. The first term in the last equation is the differential of the action of $k_u(t)$. We have used that the differential of a function $f$ is defined as $df(X) = \frac{d f(\gamma(t))}{dt} |_{t=t_0}$ with $X = \frac{d\gamma(t)}{dt} |_{t=t_0}$. The second term is the homomorphism from the Lie algebra of $\mathcal{G}$ to the vertical vector fields (see Appendix A for detailed definitions; see also footnote 4). Let's apply the connection form $\eta$ to each term in Eq. (43). Firstly, we have

\[ \eta_{\gamma(t_0) k_u(t_0)} \left( dR_{k_u(t_0)}(\dot{\gamma}(t_0)) \right) = \mathrm{Ad}(k_u^{-1}(t_0))\, \eta_{\gamma(t_0)}(\dot{\gamma}(t_0)). \tag{44} \]

The equality (44) follows from one of the connection's defining properties (see (55) in Appendix A). The second term is

\[ \eta_{\gamma(t_0) k_u(t_0)} \left( \iota_{(\gamma(t_0) k_u(t_0))}(X_u) \right) = X_u, \tag{45} \]

where we have used Eq. (54). The infinitesimal variation defined by $X \in \mathrm{Lie}(\mathcal{P}\mathcal{G})$ is given by the sum of

\[ \frac{d}{du}\, \mathrm{Ad}(k_u(t_0)^{-1})\, \eta_{\gamma(t_0)}(\dot{\gamma}(t_0)) \Big|_{u=0} = \mathrm{Ad}(-X_{t_0})\, \eta_{\gamma(t_0)}(\dot{\gamma}(t_0)) = -\left[ X_{t_0}, \eta_{\gamma(t_0)}(\dot{\gamma}(t_0)) \right] \]

and $\frac{d}{du} X_u |_{u=0}$. Let's now calculate this last term:

\[ \frac{d}{du} X_u \Big|_{u=0} = \frac{d}{du} \left( k_u(t_0)^{-1} \frac{d}{dt} k_u(t) \right) \Big|_{u=0,\, t=t_0} \]
\[ = \left( -k_u(t_0)^{-2}\, \frac{dk_u(t_0)}{du}\, \frac{dk_u(t)}{dt} + k_u(t_0)^{-1}\, \frac{d}{dt} \frac{dk_u(t)}{du} \right) \Big|_{u=0,\, t=t_0} \]
\[ = -k_0(t_0)^{-2}\, \frac{dk_u(t_0)}{du} \Big|_{u=0}\, \frac{dk_0(t)}{dt} \Big|_{t=t_0} + k_0(t_0)^{-1}\, \frac{d}{dt} \left( \frac{dk_u(t)}{du} \Big|_{u=0} \right) \Big|_{t=t_0} = \dot{X}_t(t_0), \]

where in the last step we used that $k_0(t) = \mathrm{id}_{\mathcal{G}}$ for all $t$, so that the first term vanishes. The infinitesimal variation of the gauge fixing condition is then

\[ \frac{d}{du}\, \eta_{\gamma(t) k_u(t)} \left( \frac{d}{dt} R_{k_u(t)} \gamma(t) \right) \Big|_{u=0} = -\left[ X_t, \eta_{\gamma(t)}(\dot{\gamma}(t)) \right] + \dot{X}_t. \]

This expression defines a linear endomorphism $M_\gamma$ in $\mathrm{Lie}(\mathcal{P}\mathcal{G})$ for each path $\gamma(t)$. Equivalently, it defines a linear endomorphism $M_\gamma(t)$ in $\mathrm{Lie}(\mathcal{G})$ for each $t$.

Footnote 4. If we use the usual form for the gauge transformation of connections (49) the time derivative in (42) can be expressed as

\[ \mathrm{ad}(k_u^{-1}(t_0))\, \dot{\gamma}(t_0) + \frac{d}{dt} \left( \mathrm{ad}(k_u^{-1}(t))\, \gamma(t_0) + k_u^{-1}(t)\, dk_u(t) \right) \Big|_{t=t_0}. \]

The first term is equal to the differential of the action $dR_{k_u(t_0)}(\dot{\gamma}(t_0))$ (see (53)). The second term is equal to the infinitesimal symmetry $\iota_{(\gamma(t_0) k_u(t_0))}(X_u)$ (see (52)).


In order to find the exponential representation of the element $\Delta[\gamma]$, we will introduce a Grassmann algebra generated by the anticommuting variables $c$ and $\bar{c}$. By following the common procedure, we can express the element $\Delta[\gamma]$ as

\[ \Delta[\gamma] = \int \mathcal{D}\bar{c}\, \mathcal{D}c\, \exp\left\{ \int \langle \bar{c}_t, M_\gamma(t) c_t \rangle_{\mathfrak{g}}\, d^3x\, dt \right\}, \]

where

\[ \mathcal{D}c = \lim_N \prod_{k=1}^{N} \tilde{\mathcal{D}}c_{t_k} = \lim_{N,M} \prod_{k=1}^{N} \prod_{j=1}^{M} dc_{t_k}(x_j), \]

and the same for $\mathcal{D}\bar{c}$ (see Ref. [24] for a precise definition of $c$ and $\bar{c}$). By gathering all the pieces together, the path integral takes the form

\[ \langle A_0 \mid \pi^{-1}[A_1] \rangle = \int_{\mathcal{P}\mathcal{G}} \mathcal{D}g \int \mathcal{D}A\, \mathcal{D}\pi\, \mathcal{D}\lambda\, \mathcal{D}c\, \mathcal{D}\bar{c}\, \exp\{i S_{gf}\}, \tag{46} \]

where $S_{gf}$ is the gauge fixed action

\[ S_{gf} = \int d^4x \left( \dot{A}_k \pi^k - H_0 - A_0 \phi + \langle \lambda, \eta(\dot{\gamma}) \rangle_{\mathfrak{g}} - i \langle \bar{c}, M_\gamma c \rangle_{\mathfrak{g}} \right). \]

By explicitly introducing the indices of the Lie algebra $\mathfrak{g}$, the endomorphisms $M_\gamma(t)(x)$ can be expressed as

\[ M_\gamma(t)(x)_a^{\;c} = -\eta_{\gamma(t)}(\dot{\gamma}(t))^b f_{ab}^{\;\;c} + \delta_a^c\, \partial_0, \]

where $f_{ab}^{\;\;c}$ are the structure constants of $\mathfrak{g}$. The gauge fixed action takes then the form

\[ S_{gf} = \int d^4x \left( \dot{A}^a_k \pi_a^k - H_0 - A^a_0 \phi_a + \lambda_a \eta(\dot{\gamma})^a + i \bar{c}^a \eta(\dot{\gamma})^b f_{ab}^{\;\;c} c_c - i \bar{c}^a \dot{c}_a \right). \tag{47} \]

The last term can be recast as $+i \dot{c}_a \bar{c}^a$. Therefore, this term can be interpreted as the kinetic term corresponding to the new pair of canonical variables $(c, i\bar{c})$.

VII. Conclusions

The principal aim of this work was to study the quantization of Yang-Mills theory by using an extended connection $\mathbb{A}$ defined in a properly chosen principal bundle. This connection unifies the three fundamental geometric objects of Yang-Mills theory, namely the gauge field, the gauge fixing and the ghost field. This unification is an extension of the known fact that the gauge and ghost fields can be assembled together as $\omega = A + c$ [3]. The first step in the unification process was to generalize the gauge fixing procedure by replacing the usual gauge fixing section $\sigma$ with a gauge fixing connection $\eta$ in the $\mathcal{G}$-principal bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. We have then shown that the connection $\eta$ also encodes the ghost field of the BRST complex. In fact, the ghost field can either be considered the canonical vertical part of $\eta$ in a local trivialization or the universal connection in the gauge group's Weil algebra. The unification process continues by demonstrating that the universal family of gauge fields $A_U$ and the gauge fixing connection $\eta$ can be unified in the single extended connection $\mathbb{A} = A_U + \eta$ on the $G$-principal bundle $\mathcal{A} \times P \to \mathcal{A} \times M$.


In this way, we have shown that the extended connection $\mathbb{A}$ encodes the gauge field, the gauge fixing and the ghost field. A significant result is that it is possible to derive the BRST transformations of the relevant fields without imposing the usual horizontality or flatness conditions on the extended curvature $\mathbb{F} = \phi + \psi + F_U$ ([2,4,22]). In other words, it is not necessary to assume that $\phi = \psi = 0$. Moreover, the proposed formalism allows us to show that the standard BRST transformation of the gauge field $A$ is only valid in a local trivialization of the fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$. In fact, Eq. (20) can be considered the corresponding global generalization.

We then applied this geometric formalism to the path integral quantization of Yang-Mills theory. Rather than selecting a fixed representative for each $[A] \in \mathcal{A}/\mathcal{G}$ by means of a section $\sigma$, the gauge fixing connection $\eta$ allows us to parallel transport any initial condition $A_0 \in \mathcal{A}$ belonging to the orbit $[A_0]$. A significant advantage of this procedure is that one can always define a section $\sigma^\eta$ of the projection $\mathcal{P}(A_0, \pi^{-1}[A_1]) \to \mathcal{P}([A_0],[A_1])$ in the space of paths, even when it is not possible to define a global section $\sigma : \mathcal{A}/\mathcal{G} \to \mathcal{A}$ in the space of fields. Since the path integral is not an integral in the space of fields $\mathcal{A}$, but rather an integral in the space of paths in $\mathcal{A}$, such a section $\sigma^\eta$ suffices for eliminating the infinite volume of the group of paths $\mathcal{P}\mathcal{G}$. Hence, this generalized gauge fixing procedure is globally well-defined even when the topology of the fiber bundle $\mathcal{A} \to \mathcal{A}/\mathcal{G}$ is not trivial (Gribov obstruction). We then used the standard Faddeev-Popov method in order to introduce the generalized gauge fixing defined by $\eta$ at the level of the path integral. The corresponding gauge fixed extended action $S_{gf}$ was thereby obtained. We have thus shown that the Faddeev-Popov method can be used even when there is a Gribov obstruction.

Acknowledgement. We wish to thank Marc Henneaux for his helpful comments and the University of Buenos Aires for its financial support (Projects No. X103 and X193).

A. The Geometry of $\eta$

In this appendix we will review some geometric properties of the connection $\eta$. The gauge group $\mathcal{G}$ consists of diffeomorphisms $\varphi : P \to P$ such that $\pi \varphi = \pi$ and $\varphi(pg) = \varphi(p) g$ (with $g \in G$). Each element $\varphi \in \mathcal{G}$ can be associated to a map $g : P \to G$ by $\varphi(p) = p\, g(p)$. This map $g$ satisfies $g(ph) = h^{-1} g(p) h = \mathrm{ad}(h^{-1}) g(p)$. This description of the elements of $\mathcal{G}$ also allows one to describe the elements of the Lie algebra $\mathrm{Lie}(\mathcal{G})$. These elements consist of maps $g : P \to \mathfrak{g}$ such that $g(ph) = \mathrm{Ad}(h^{-1}) g(p)$. The gauge group $\mathcal{G}$ acts on the right on $\mathcal{A}$ via the pullback of connections. If $\varphi : P \to P$ is an element of $\mathcal{G}$ and $A \in \mathcal{A}$, then the action is given by

\[ R_\varphi A = \varphi^* A. \tag{48} \]

This action is commonly described in terms of the function $g$ by the transformation formula

\[ R_\varphi A = g^{-1} A g + g^{-1} dg. \tag{49} \]

The first term on the right hand side of Eq. (49) denotes the composition

\[ T_p P \xrightarrow{\;A_p\;} \mathfrak{g} \xrightarrow{\;\mathrm{ad}_{g^{-1}(p)}\;} \mathfrak{g}. \tag{50} \]


The second part $g^{-1} dg$ is the pullback of the Maurer-Cartan form $\omega$ defined by the map $g : P \to G$. The action (48) induces two constructions of interest. The first one is the infinitesimal action. This is a morphism of Lie algebras $\iota : \mathrm{Lie}(\mathcal{G}) \to \Gamma(TP)$, where $\Gamma(TP)$ is the Lie algebra of the vector fields on $P$. We will identify elements of a Lie algebra with tangent vectors at the identity. If $\zeta \in \mathrm{Lie}(\mathcal{G})$ and $\varphi_s$ is a curve such that

\[ \frac{d}{ds} \varphi_s \Big|_{s=0} = \zeta, \]

and $A \in \mathcal{A}$, then we define

\[ \iota_A \zeta = \frac{d}{ds} R_{\varphi_s} A \Big|_{s=0}. \tag{51} \]

In terms of the explicit description of Eq. (49) we obtain

\[ \iota_A \zeta = \mathrm{Ad}(\zeta) A + \frac{d}{ds}\, g^{-1} dg \Big|_{s=0}. \tag{52} \]

The second action is the differential of the action of $\mathcal{G}$. Since $\mathcal{A}$ is an affine space, there is a natural identification $T\mathcal{A} = \mathcal{A} \oplus \mathcal{A}$. An element $V \in T_A \mathcal{A}$ can then be identified with a connection. Let $A_s$ be a curve in $\mathcal{A}$ such that $\frac{d}{ds} A_s |_{s=0} = V$. Then we define

\[ dR_\varphi V = \frac{d}{ds}\, \mathrm{ad}(g) A_s \Big|_{s=0}. \tag{53} \]

Remark 14. The infinitesimal action and the differential of the action are geometric intrinsic constructions. They only depend on the vectors $\zeta$ and $V$ and not on the particular curves used to compute them.

The connection $\eta$ has the following properties:

\[ \eta(\iota(\zeta)) = \zeta, \tag{54} \]
\[ \eta(dR_\varphi V) = \mathrm{ad}(g^{-1})\, \eta(V). \tag{55} \]
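As a sanity check on the transformation formula (49), the following minimal sketch verifies numerically that it composes as a right action under pointwise multiplication of the maps $g$. Everything concrete here is an assumption made for illustration: the base is one-dimensional (so $g^{-1}dg$ reduces to $g^{-1}g'(x)$), the gauge field `A` and the SU(2)-valued maps `g1`, `g2` are hypothetical sample functions, and all helper names are invented for this example.

```python
import cmath
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def gauge_act(A, g, dg):
    """Return x -> g^{-1} A g + g^{-1} g', a 1D version of Eq. (49)."""
    def transformed(x):
        g_inv = mat_inv(g(x))
        return mat_add(mat_mul(g_inv, mat_mul(A(x), g(x))),
                       mat_mul(g_inv, dg(x)))
    return transformed

# Hypothetical sample data: an su(2)-valued field and two SU(2)-valued maps
# together with their analytic derivatives.
def A(x):
    return [[1j * x, 1.0], [-1.0, -1j * x]]

def g1(x):   # exp(i x sigma_z)
    return [[cmath.exp(1j * x), 0.0], [0.0, cmath.exp(-1j * x)]]

def dg1(x):
    return [[1j * cmath.exp(1j * x), 0.0], [0.0, -1j * cmath.exp(-1j * x)]]

def g2(x):   # exp(i x^2 sigma_x)
    c, s = math.cos(x * x), math.sin(x * x)
    return [[c, 1j * s], [1j * s, c]]

def dg2(x):
    c, s = math.cos(x * x), math.sin(x * x)
    return [[-2 * x * s, 2j * x * c], [2j * x * c, -2 * x * s]]

def g12(x):  # pointwise product g1 g2
    return mat_mul(g1(x), g2(x))

def dg12(x):  # Leibniz rule for the derivative of the product
    return mat_add(mat_mul(dg1(x), g2(x)), mat_mul(g1(x), dg2(x)))

def max_difference(x):
    """Largest entry of (A.g1).g2 - A.(g1 g2) at the point x."""
    left = gauge_act(gauge_act(A, g1, dg1), g2, dg2)(x)
    right = gauge_act(A, g12, dg12)(x)
    return max(abs(left[i][j] - right[i][j])
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(max_difference(0.7))
```

The printed difference should be at the level of floating-point rounding, which reflects the identity $g_2^{-1}(g_1^{-1} A g_1 + g_1^{-1} dg_1) g_2 + g_2^{-1} dg_2 = (g_1 g_2)^{-1} A (g_1 g_2) + (g_1 g_2)^{-1} d(g_1 g_2)$.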

B. The Local Form of $\eta$

The local form of a connection in a principal bundle is commonly described in terms of Christoffel symbols. This description has been extended to infinite dimensions in Ref. [17, Chap. VIII]. We will give a brief description of the constructions involved. Let $U_\alpha$ be a trivializing covering of $\mathcal{A}/\mathcal{G}$ with trivializing functions $\psi_\alpha : U_\alpha \times \mathcal{G} \to \mathcal{A}|_{U_\alpha}$. Then the pullback of $\eta$ defines a $\mathrm{Lie}(\mathcal{G})$-valued 1-form on $U_\alpha \times \mathcal{G}$. Let $v \in T_x U_\alpha$ and $w \in T_g \mathcal{G}$ be two tangent vectors. Then

\[ \psi_\alpha^* \eta(v, w) =: -\Gamma^\alpha(v, g) + w \tag{56} \]

for a certain 1-form $\Gamma^\alpha$ with values on the vector fields on $\mathcal{G}$. The 1-form $\Gamma^\alpha$ is called the local Christoffel form of $\eta$ in the trivialization $(U_\alpha, \psi_\alpha)$. If $\gamma$ is a path in $U_\alpha$, then any lift $\tilde{\gamma}$ of $\gamma$ to $U_\alpha \times \mathcal{G}$ has the form $\tilde{\gamma}(t) = (\gamma(t), \tau(t))$ for a path $\tau$ in $\mathcal{G}$. Then the gauge fixing condition is given in local coordinates by the equation

\[ \frac{d}{dt} \tau(t) - \Gamma^\alpha\!\left( \frac{d}{dt} \gamma(t), \tau(t) \right) = 0. \tag{57} \]
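To make the local gauge fixing equation concrete, here is a small finite-dimensional sketch of a horizontal lift. It is only a toy model under stated assumptions: the base is the plane, the fiber is the abelian group $\mathbb{R}$ (so the Christoffel form does not depend on the fiber coordinate), and the form `christoffel` with $\Gamma(v) = \frac{1}{2}(-y\, v_x + x\, v_y)$, the loop, and all function names are hypothetical choices made for illustration. The fiber coordinate $\tau$ is obtained by integrating $\dot{\tau} = \Gamma(\dot{\gamma}, \tau)$ with a midpoint scheme.

```python
import math

def christoffel(point, velocity, tau):
    """Toy Christoffel form Gamma(v, tau) = (-y*vx + x*vy)/2 (assumed).

    tau is accepted to mirror the general ODE, but this abelian toy
    model ignores it.
    """
    x, y = point
    vx, vy = velocity
    return 0.5 * (-y * vx + x * vy)

def horizontal_lift(path, velocity, steps=1000):
    """Integrate d(tau)/dt = Gamma(gamma'(t), tau) on [0, 1], tau(0) = 0.

    Uses the explicit midpoint (second-order Runge-Kutta) rule and
    returns the final fiber coordinate tau(1).
    """
    tau = 0.0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        slope_start = christoffel(path(t), velocity(t), tau)
        tau_mid = tau + 0.5 * dt * slope_start
        t_mid = t + 0.5 * dt
        tau += dt * christoffel(path(t_mid), velocity(t_mid), tau_mid)
    return tau

# A closed loop in the base: the unit circle centred at (1, 0).
def loop(t):
    angle = 2.0 * math.pi * t
    return (1.0 + math.cos(angle), math.sin(angle))

def loop_velocity(t):
    angle = 2.0 * math.pi * t
    return (-2.0 * math.pi * math.sin(angle), 2.0 * math.pi * math.cos(angle))

if __name__ == "__main__":
    # The lift of a closed loop need not close: tau(1) is the holonomy.
    print(horizontal_lift(loop, loop_velocity))
```

For this particular form the curvature is constant and equal to one, so by Stokes' theorem the holonomy of the loop equals the enclosed area, $\pi$. The lifted path does not close even though the base loop does, which is the finite-dimensional analogue of the final condition being fixed only up to a gauge transformation.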


References

1. Atiyah, M.F., Singer, I.M.: Dirac operators coupled to vector potentials. Proc. National Acad. Sci. 81, 2597 (1984)
2. Baulieu, L., Bellon, M.: p-forms and Supergravity: Gauge symmetries in curved space. Nucl. Phys. B 266, 75 (1986)
3. Baulieu, L., Singer, I.M.: Topological Yang-Mills symmetry. Nucl. Phys. 5 (Proc. Suppl.), 12 (1988)
4. Baulieu, L., Thierry-Mieg, J.: The principle of BRST symmetry: An alternative approach to Yang-Mills theories. Nucl. Phys. B 197, 477 (1982)
5. Birmingham, D., Blau, M., Rakowski, M., Thompson, G.: Topological Field Theory. Phys. Rep. 209, 129 (1991)
6. Bonora, L., Cotta-Ramusino, P.L.: Some remarks on BRS transformations, anomalies and the cohomology of the Lie algebra of the group of gauge transformations. Commun. Math. Phys. 87, 589 (1983)
7. Choquet-Bruhat, Y., DeWitt-Morette, C.: Analysis, Manifolds and Physics. Part II: 92 Applications. New York: Elsevier Science Publishers B.V., 1989
8. Cordes, S., Moore, G., Ramgoolam, S.: Lectures on 2D Yang-Mills Theory, Equivariant Cohomology and Topological Field Theories. Nucl. Phys. 41 (Proc. Suppl.), 184 (1995)
9. Donaldson, S.K., Kronheimer, P.B.: The geometry of four-manifolds. Oxford: Oxford University Press, 1990
10. Dubois-Violette, M.: The Weil-B.R.S. algebra of a Lie algebra and the anomalous terms in gauge theory. J. Geom. Phys. 3, 525 (1986)
11. Faddeev, L., Slavnov, A.: Gauge fields: An introduction to quantum theory. Second ed., Frontiers in Physics, Cambridge: Perseus Books, 1991
12. Gribov, V.: Quantization of non-Abelian gauge theories. Nucl. Phys. B 139, 1 (1978)
13. Guillemin, V.W., Sternberg, S.: Supersymmetry and equivariant de Rham theory. Berlin-Heidelberg-New York: Springer-Verlag, 1999
14. Henneaux, M.: Hamiltonian form of the path integral for theories with a gauge freedom. Phys. Rep. 126(1), 1 (1985)
15. Henneaux, M., Teitelboim, C.: Quantization of gauge systems. Princeton, NJ: Princeton Univ. Press, 1994
16. Kobayashi, S., Nomizu, K.: Foundations of differential geometry. Vol. I, New York: Wiley, 1963
17. Kriegl, A., Michor, P.: A convenient setting for global analysis. Mathematical Surveys and Monographs, Vol. 53, Amer. Math. Soc., 1997
18. Michor, P.: Gauge theory for fiber bundles. Monographs and Textbooks in Physical Sciences, Lecture Notes 19, Napoli: Bibliopolis, 1991
19. Narasimhan, M.S., Ramadas, T.R.: Geometry of SU(2) Gauge Fields. Commun. Math. Phys. 67, 121 (1979)
20. Singer, I.: Some remarks on the Gribov Ambiguity. Commun. Math. Phys. 60, 7 (1978)
21. Szabo, R.: Equivariant cohomology and localization of path integrals. Berlin-Heidelberg-New York: Springer-Verlag, 2000
22. Thierry-Mieg, J.: Geometrical reinterpretation of Faddeev-Popov ghost particles and BRS transformations. J. Math. Phys. 21, 2834 (1980)
23. Witten, E.: Topological quantum field theory. Commun. Math. Phys. 117, 353 (1988)
24. Witten, E.: Dynamics of Quantum Field Theory. In: Quantum Fields and Strings: A course for mathematicians. Vol. 2, Providence, RI: Amer. Math. Soc., 1999, pp. 1119–1424

Communicated by N.A. Nekrasov

Commun. Math. Phys. 284, 117–185 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0566-6

Communications in

Mathematical Physics

Entanglement Entropy in Quantum Spin Chains with Finite Range Interaction

A. R. Its$^1$, F. Mezzadri$^2$, M. Y. Mo$^2$

$^1$ The Department of Mathematical Sciences, Indiana University-Purdue University Indianapolis, 402 N. Blackford Street, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]

$^2$ Department of Mathematics, University of Bristol, Bristol BS8 1TW, UK. E-mail: [email protected]; [email protected]

Received: 16 August 2007 / Accepted: 17 March 2008
Published online: 12 August 2008 – © Springer-Verlag 2008

Abstract: We study the entropy of entanglement of the ground state in a wide family of one-dimensional quantum spin chains whose interaction is of finite range and translation invariant. Such systems can be thought of as generalizations of the XY model. The chain is divided in two parts: one containing the first consecutive $L$ spins; the second the remaining ones. In this setting the entropy of entanglement is the von Neumann entropy of either part. At the core of our computation is the explicit evaluation of the leading order term as $L \to \infty$ of the determinant of a block-Toeplitz matrix with symbol

\[ \Phi(z) = \begin{pmatrix} i\lambda & g(z) \\ g^{-1}(z) & i\lambda \end{pmatrix}, \]

where $g(z)$ is the square root of a rational function and $g(1/z) = g^{-1}(z)$. The asymptotics of such determinant is computed in terms of multi-dimensional theta-functions associated to a hyperelliptic curve $\mathcal{L}$ of genus $g \ge 1$, which enter into the solution of a Riemann-Hilbert problem. Phase transitions for these systems are characterized by the branch points of $\mathcal{L}$ approaching the unit circle. In these circumstances the entropy diverges logarithmically. We also recover, as particular cases, the formulae for the entropy discovered by Jin and Korepin [14] for the XX model and Its, Jin and Korepin [12,13] for the XY model.

Contents

1. Introduction
2. Statement of Results
3. Quantum Spin Chains with Anisotropic Hamiltonians
4. The von Neumann Entropy and Block-Toeplitz Determinants
5. The Asymptotics of Block Toeplitz Determinants. Widom's Theorem
6. The Wiener-Hopf Factorization of $\Phi(z)$
7. The Asymptotics of $d \log D_L(\lambda)/d\lambda$ and $D_L(\lambda)$
8. The Limiting Entropy
9. Integrability at ±1. The Final Formula for the Entropy
10. Critical Behavior as Roots of $g(z)$ Approaches the Unit Circle
10.1 The limit of two real roots approaching 1
10.2 The limit of complex roots approaching the unit circle
10.2.1 Case 1. $r < n$.
10.2.2 Case 2: $r = n$.
10.3 Pairs of complex roots approaching the unit circle together with one pair of real roots approaching 1
Appendix A. The Density Matrix of a Subchain
Appendix B. The Correlation Matrix $C_M$
Appendix C. Thermodynamic Limit of the Correlation Matrix $C_M$
Appendix D. The Riemann Constant $K$
Appendix E. The Cycle Basis (10.18)
Appendix F. Solvability of the Wiener-Hopf Factorization Problem
References

A. R. Its was partially supported by the NSF grants DMS-0401009 and DMS-0701768. F. Mezzadri and M. Y. Mo acknowledge financial support by the EPSRC grant EP/D505534/1.

1. Introduction

One-dimensional quantum spin chains were introduced by Lieb et al. [17] in 1961 as a model to study the magnetic properties of solids. Usually such systems depend on some parameter, e.g. the magnetic field. One of their most important features is that at zero temperature, when the system is in the ground state, they undergo a phase transition at a critical value of the parameter as the number of spins tends to infinity. As a consequence, the decay of correlations changes suddenly from exponential to algebraic at the critical point. Furthermore, many examples of such chains are exactly solvable. For these reasons the statistical mechanical properties of quantum spin chains have been investigated in great detail over the years.

More recently, Osterloh et al. [20], and Osborne and Nielsen [22] realized that the existence of non-local physical correlations at a phase transition is a manifestation of the entanglement among the constituent parts of the chain. Entangled quantum states are characterized by non-local correlations that cannot be described by classical mechanics. Such correlations play an important role in the transmission of quantum information. It is therefore essential to be able to quantify entanglement. In its full generality this is still an open problem. However, when a physical system is in a pure state and is bipartite, i.e. is made of two separate parts, say A and B, a suitable measure of the entanglement shared between the two constituents is the von Neumann entropy of either part [2]. In this situation the Hilbert space of the whole system is $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$, where $\mathcal{H}_A$ and $\mathcal{H}_B$ are the Hilbert spaces associated to A and B respectively. Now, if $\rho_{AB}$ is the density matrix of the composite system, then the reduced density matrices of A and B are

\[
\rho_A = \mathrm{tr}_B\, \rho_{AB} \quad\text{and}\quad \rho_B = \mathrm{tr}_A\, \rho_{AB}, \tag{1.1}
\]

where $\mathrm{tr}_A$ and $\mathrm{tr}_B$ are partial traces over the degrees of freedom of A and B respectively. The entropy of the entanglement of formation is

\[
S(\rho_A) = -\mathrm{tr}\,\rho_A \log \rho_A = S(\rho_B) = -\mathrm{tr}\,\rho_B \log \rho_B. \tag{1.2}
\]
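The equality $S(\rho_A) = S(\rho_B)$ in (1.2) can be checked numerically on a small example. The following is a minimal sketch, assuming NumPy; the function names and the choice of a random pure state on $\mathbb{C}^2 \otimes \mathbb{C}^4$ are illustrative and not taken from the paper.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Return -tr(rho log rho), treating 0 log 0 as 0."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

def reduced_density_matrices(psi, dim_a, dim_b):
    """Partial traces (1.1) of |psi><psi| for a pure state on H_A (x) H_B."""
    m = psi.reshape(dim_a, dim_b)   # coefficient matrix psi_{ab}
    rho_a = m @ m.conj().T          # trace over B
    rho_b = m.T @ m.conj()          # trace over A
    return rho_a, rho_b

# Illustrative random pure state (hypothetical example data).
rng = np.random.default_rng(0)
dim_a, dim_b = 2, 4
psi = rng.normal(size=dim_a * dim_b) + 1j * rng.normal(size=dim_a * dim_b)
psi /= np.linalg.norm(psi)

rho_a, rho_b = reduced_density_matrices(psi, dim_a, dim_b)
s_a = von_neumann_entropy(rho_a)
s_b = von_neumann_entropy(rho_b)

# A maximally entangled two-qubit state for comparison.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
s_bell = von_neumann_entropy(reduced_density_matrices(bell, 2, 2)[0])
```

Here `s_a` and `s_b` agree up to rounding, and the two-qubit state gives $\log 2$, the largest value possible for a single spin-1/2.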

Entanglement Entropy in Quantum Spin Chains


In this paper we compute the entropy of entanglement of the ground state of a vast class of spin chains whose interaction among the constituent spins is non-local and translation invariant. These systems can be mapped into quadratic chains of fermionic operators by a suitable transformation and are generalizations of the XY model. We study the ground state of such systems, divide the chain into two parts and compute the von Neumann entropy of one of the two parts in the thermodynamic limit. If the ground state is not degenerate, then $\rho_{AB} = |\Psi_g\rangle\langle\Psi_g|$.

At the core of our derivation of the entropy of entanglement is the computation of determinants of Toeplitz matrices for a wide class of 2 × 2 matrix symbols. The explicit expressions for such determinants were not available in the literature. The appearance of Toeplitz matrices and their invariants in the study of lattice models is a simple consequence of the translation invariance of the interaction among the spins. Thus, Toeplitz determinants appear in the computation of many other physical quantities, such as spin-spin correlations or the emptiness formation probability, and not only in the entropy of entanglement. Therefore, our results have consequences that go beyond the application to the study of bipartite entanglement that we discuss.

Vidal et al. [23] were the first to investigate the entanglement of formation of the ground state of spin chains by dividing them into two parts. The models they considered were the XX, XY and XXZ models. They computed numerically the von Neumann entropy of one part of the chain and discovered that at a phase transition it grows logarithmically with its length $L$. Jin and Korepin [14] computed the von Neumann entropy of the ground state of the XX model using the Fisher-Hartwig formula for Toeplitz determinants. They showed that at the phase transition the entropy grows like $\frac{1}{3}\log L$, which is in agreement with the numerical observations of Vidal et al.

For lattice systems that have an associated conformal field theory, the logarithmic growth of the entropy was first discovered by Holzhey et al. [10] in 1994. This approach was later developed by Korepin [15], and by Calabrese and Cardy [4]. Its, Jin and Korepin [12,13] determined the entropy for the XY model by computing an explicit formula for the asymptotics of the determinant of a block-Toeplitz matrix. They expressed the entropy of entanglement in terms of an integral of Jacobi theta functions.

Consider a $p \times p$ matrix-valued function on the unit circle $\Xi$:

\[
\varphi(z) = \sum_{k=-\infty}^{\infty} \varphi_k z^k, \qquad |z| = 1.
\]

A block-Toeplitz matrix with symbol $\varphi$ is defined by $T_L[\varphi] = (\varphi_{j-k})_{0 \le j,k \le L-1}$. Furthermore, we shall denote its determinant by $D_L = \det T_L[\varphi]$. The main ingredient of the computation of Its, Jin and Korepin was to use the Riemann-Hilbert approach to derive an asymptotic formula for the Fredholm determinant

\[
D_L(\lambda) = \det T_L[\varphi] = \det\,(I - K_L), \tag{1.3}
\]

where $K_L$ is an appropriate integral operator on $L^2(\Xi, \mathbb{C}^2)$. The symbol of the Toeplitz matrix $T_L[\varphi]$ was

\[
\varphi\!\left(e^{i\theta}\right) = \begin{pmatrix} i\lambda & g(\theta) \\ -g^{-1}(\theta) & i\lambda \end{pmatrix}, \tag{1.4}
\]

where

\[
g(\theta) = \frac{\alpha\cos\theta - 1 - i\gamma\alpha\sin\theta}{|\alpha\cos\theta - 1 - i\gamma\alpha\sin\theta|}.
\]
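The definition $T_L[\varphi] = (\varphi_{j-k})_{0\le j,k\le L-1}$ can be made concrete with a short sketch. It assumes NumPy, and the helper name `block_toeplitz` and the three-term example symbol are hypothetical choices made for illustration only.

```python
import numpy as np

def block_toeplitz(coeffs, L, p):
    """Assemble T_L[phi] = (phi_{j-k}), 0 <= j, k <= L-1.

    `coeffs` maps an integer index k to the p x p Fourier coefficient phi_k;
    indices that are absent are treated as zero blocks.
    """
    T = np.zeros((p * L, p * L), dtype=complex)
    for j in range(L):
        for k in range(L):
            block = coeffs.get(j - k)
            if block is not None:
                T[p * j:p * (j + 1), p * k:p * (k + 1)] = block
    return T

# Hypothetical 2 x 2 symbol phi(z) = phi_{-1} / z + phi_0 + phi_1 * z.
phi = {
    -1: np.array([[0, 1], [0, 0]]),
    0: np.array([[2, 0], [0, 2]]),
    1: np.array([[0, 0], [1, 0]]),
}
T3 = block_toeplitz(phi, L=3, p=2)
D3 = np.linalg.det(T3)  # the determinant D_L for L = 3
```

The block in position $(j,k)$ depends only on $j-k$, so $\varphi_1$ fills the first block sub-diagonal and $\varphi_{-1}$ the first block super-diagonal.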

Keating and Mezzadri [18,19] introduced families of spin chains that are characterized by the symmetries of the spin-spin interaction. The entropy of entanglement of the ground state of these systems, as well as other thermodynamical quantities like the spin-spin correlation function, can be determined by computing averages over the classical compact groups, which in turn means computing determinants of Toeplitz matrices or of sums of Hankel matrices. These models are solvable and can be mapped into a quadratic chain of Fermi operators via the Jordan-Wigner transformations. One of the main features of these families is that the symmetries of the interaction can be put in one-to-one correspondence with the structure of the invariant measure of the group to be averaged over. If the Hamiltonian is translation invariant and the interaction is isotropic, then the relevant group to be averaged over is U(N) equipped with Haar measure. In turn such averages are equivalent to Toeplitz determinants with a scalar symbol. These systems are generalizations of the XX model.

In this paper we consider spin chains whose interaction is translation invariant but whose Hamiltonian is not isotropic. These are generalizations of the XY model. The Fredholm determinant that we need to compute has the same structure as (1.3), but now the 2 × 2 matrix symbol is

\[
\Phi(z) := \begin{pmatrix} i\lambda & g(z) \\ -g^{-1}(z) & i\lambda \end{pmatrix}, \tag{1.5}
\]

where the function $g(z)$ is defined by

\[
g(z) := \sqrt{\frac{p(z)}{z^{2n}\, p(1/z)}} \tag{1.6}
\]

and $p(z)$ is a polynomial of degree $2n$. We recover the XY model if we set

\[
p(z) = \frac{\alpha(1-\gamma)}{2}\, z^2 - z + \frac{\alpha(1+\gamma)}{2}. \tag{1.7}
\]

In the above equation $\alpha = 2/h$, where $h$ is the magnetic field, and $\gamma$ measures the anisotropy of the Hamiltonian in the XY plane.

2. Statement of Results

Following [14] and [13], we will identify the limiting von Neumann entropy for the systems that we study with the double limit

\[
S(\rho_A) = \lim_{\varepsilon \to 0^+} \lim_{L \to \infty} \frac{1}{4\pi i} \oint_{\Gamma(\varepsilon)} e(1+\varepsilon, \lambda)\, \frac{d}{d\lambda} \log\!\left( D_L(\lambda)\,(\lambda^2 - 1)^{-L} \right) d\lambda. \tag{2.1}
\]

In the above formula $\Gamma(\varepsilon)$ is the contour in Fig. 1, $D_L(\lambda)$ is the determinant of the block-Toeplitz matrix $T_L[\Phi]$ with symbol (1.5) and

\[
e(x, \nu) := -\frac{x+\nu}{2} \log\frac{x+\nu}{2} - \frac{x-\nu}{2} \log\frac{x-\nu}{2}. \tag{2.2}
\]
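For orientation, the function $e(x,\nu)$ of (2.2) at $x = 1$ is the entropy of a two-level system with probabilities $(1\pm\nu)/2$. A minimal sketch follows; the name `entropy_density` is illustrative, and the convention $0 \log 0 = 0$ at the endpoints is an assumption made here for numerical convenience.

```python
import math

def entropy_density(x, nu):
    """e(x, nu) of Eq. (2.2) for real -x <= nu <= x, with 0 log 0 := 0."""
    def xlogx(t):
        return 0.0 if t == 0.0 else t * math.log(t)
    return -xlogx((x + nu) / 2) - xlogx((x - nu) / 2)

e_max = entropy_density(1.0, 0.0)    # equal probabilities 1/2, 1/2
e_pure = entropy_density(1.0, 1.0)   # probabilities 1, 0
```

The value at $\nu = 0$ is $\log 2$, the function is even in $\nu$, and it vanishes at $\nu = \pm 1$.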

Fig. 1. The contour $\Gamma(\varepsilon)$ of the integral in Eq. (2.1). The bold lines $(-\infty, -1-\varepsilon)$ and $(1+\varepsilon, \infty)$ are the cuts of the integrand $e(1+\varepsilon, \lambda)$. The zeros of $D_L(\lambda)$ are located on the bold line $(-1, 1)$

The explicit Hamiltonians for the family of spin systems that we consider and their connection to formula (2.1) will be discussed in detail in Sects. 3 and 4. One of the main objectives of this paper is to compute the double limit (2.1), which, as we shall see, can be expressed as an integral of multi-dimensional theta functions defined on Riemann surfaces. Thus, in order to state our main results, we need to introduce some definitions and notation.

Let us rewrite the function (1.6) as

\[
g^2(z) = \prod_{j=1}^{2n} \frac{z - z_j}{1 - z_j z}, \tag{2.3}
\]

where the $z_j$'s are the $2n$ roots of the polynomial $p(z)$. This representation of $g(z)$ will be used throughout the paper. We fix the branch of $g(z)$ by requiring that $g(\infty) > 0$ on the real axis. The function $g(z)$ has jump discontinuities on the complex $z$-plane. In order to define its branch cuts we need to introduce an ordering of the roots $z_j$. Let

\[
\{\lambda_1, \lambda_2, \ldots, \lambda_{4n}\} = \{z_1, \ldots, z_{2n}, z_1^{-1}, \ldots, z_{2n}^{-1}\}, \tag{2.4}
\]

where the above is merely an equality between sets, and we do not necessarily have, for example, $\lambda_i = z_i$. We order the $\lambda_i$'s such that

\[
\begin{aligned}
&\mathrm{Re}(\lambda_i) \le \mathrm{Re}(\lambda_j), && i < j,\\
&\mathrm{Im}(\lambda_i) \le \mathrm{Im}(\lambda_j), && i < j,\ |\lambda_i|, |\lambda_j| < 1,\ \mathrm{Re}(\lambda_i) = \mathrm{Re}(\lambda_j),\\
&\mathrm{Im}(\lambda_i) \le \mathrm{Im}(\lambda_j), && i > j,\ |\lambda_i|, |\lambda_j| > 1,\ \mathrm{Re}(\lambda_i) = \mathrm{Re}(\lambda_j).
\end{aligned} \tag{2.5}
\]

This ordering need not coincide with the ordering of the $z_j$'s. If necessary, we can always assume that one of the $z_j^{-1}$ has the smallest real part and set $\lambda_1 = z_j^{-1}$. This choice is equivalent to taking the transpose of $T_L[\Phi]$. The branch cuts for $g(z)$ are defined by the intervals $\Sigma_i$ joining $\lambda_{2i-1}$ and $\lambda_{2i}$:

\[
\Sigma_i = [\lambda_{2i-1}, \lambda_{2i}], \qquad i = 1, \ldots, 2n. \tag{2.6}
\]

Therefore, $g(z)$ has the following jump discontinuities:

\[
g_+(z) = -g_-(z), \qquad z \in \Sigma_i, \tag{2.7}
\]

where $g_\pm(z)$ are the boundary values of $g(z)$ on the left/right hand side of the branch cut.

Fig. 2. The choice of cycles on the hyperelliptic curve $\mathcal{L}$. The arrows denote the orientations of the cycles and branch cuts. Note that we have $\lambda_1 = z_1^{-1}$

Now, let $\mathcal{L}$ be the hyperelliptic curve

\[
\mathcal{L}:\quad w^2 = \prod_{i=1}^{4n} (z - \lambda_i). \tag{2.8}
\]

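The genus of $\mathcal{L}$ is $g = 2n - 1$. We now choose a canonical basis for the cycles $\{a_i, b_i\}$ on $\mathcal{L}$ as shown in Fig. 2, and define $d\omega_i$ to be 1-forms dual to this basis, i.e.

\[
\oint_{a_i} d\omega_j = \delta_{ij}, \qquad \oint_{b_i} d\omega_j = \Pi_{ij}. \tag{2.9}
\]

Furthermore, let us define the $g \times g$ matrix $\Pi$ by setting $(\Pi)_{ij} = \Pi_{ij}$. The theta function $\theta : \mathbb{C}^g \to \mathbb{C}$ associated to $\mathcal{L}$ is defined by

\[
\theta(\vec{s}) := \sum_{\vec{n} \in \mathbb{Z}^g} e^{\,i\pi\, \vec{n}\cdot\Pi\vec{n} \,+\, 2i\pi\, \vec{s}\cdot\vec{n}}, \tag{2.10}
\]

while the theta function with characteristics $\vec{\epsilon}$ and $\vec{\delta}$ is defined by

\[
\theta\!\begin{bmatrix} \vec{\epsilon} \\ \vec{\delta} \end{bmatrix}\!(\vec{s}) := \exp\!\left[ 2i\pi \left( \frac{\vec{\epsilon}\cdot\Pi\cdot\vec{\epsilon}}{8} + \frac{\vec{\epsilon}\cdot\vec{s}}{2} + \frac{\vec{\epsilon}\cdot\vec{\delta}}{4} \right) \right] \theta\!\left( \vec{s} + \frac{\vec{\delta}}{2} + \frac{\Pi\vec{\epsilon}}{2} \right), \tag{2.11}
\]

where $\vec{\epsilon}$ and $\vec{\delta}$ are $g$-dimensional complex vectors.

Our main results are summarised by the following two theorems.

Theorem 1. Let $H_\alpha$ be the Hamiltonian of the one-dimensional quantum spin system defined in Eq. (3.11). Let A be the subsystem made of the first $L$ spins and B the one formed by the remaining $M - L$. We also assume that the system is in a non-degenerate ground state $|\Psi_g\rangle$ and that the thermodynamic limit, i.e. $M \to \infty$, has already been taken. Then, the limiting (as $L \to \infty$) von Neumann entropy (2.1) is

\[
S(\rho_A) = \frac{1}{2} \int_1^\infty \log \frac{\theta\!\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right) \theta\!\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\!\left(\frac{\tau}{2}\right)}\, d\lambda, \tag{2.12}
\]

where $\vec{e}$ is a $(2n-1)$-vector whose last $n$ entries are 1 and whose first $n-1$ entries are 0. The parameter $\tau$ in the argument of $\theta$ is introduced in Sect. 6 and is defined in Eq. (6.11), while the expression of $\beta(\lambda)$ is

\[
\beta(\lambda) := \frac{1}{2\pi i} \log \frac{\lambda+1}{\lambda-1}. \tag{2.13}
\]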
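The lattice sum (2.10) converges rapidly when the imaginary part of $\Pi$ is positive definite, so it can be evaluated by truncation. The sketch below assumes NumPy; the truncation radius `N`, the sample matrix `Pi` and the test point `s` are arbitrary illustrative values and are not period matrices computed from a curve (2.8). It checks the standard periodicity properties of the series, $\theta(\vec{s} + \vec{e}_j) = \theta(\vec{s})$ and $\theta(\vec{s} + \Pi\vec{e}_j) = e^{-i\pi\Pi_{jj} - 2i\pi s_j}\,\theta(\vec{s})$.

```python
import itertools
import numpy as np

def riemann_theta(s, Pi, N=8):
    """Truncated sum (2.10) over integer vectors n with entries in [-N, N]."""
    s = np.asarray(s, dtype=complex)
    Pi = np.asarray(Pi, dtype=complex)
    total = 0j
    for n in itertools.product(range(-N, N + 1), repeat=len(s)):
        n = np.array(n, dtype=float)
        total += np.exp(1j * np.pi * (n @ Pi @ n) + 2j * np.pi * (s @ n))
    return total

# Illustrative genus-2 data: symmetric Pi with positive definite imaginary part.
Pi = 1j * np.array([[1.0, 0.3], [0.3, 1.2]])
s = np.array([0.1 + 0.2j, -0.3 + 0.1j])
e0 = np.array([1.0, 0.0])

theta_s = riemann_theta(s, Pi)
theta_a_shift = riemann_theta(s + e0, Pi)        # shift by a lattice vector
theta_b_shift = riemann_theta(s + Pi[:, 0], Pi)  # shift by a column of Pi
b_factor = np.exp(-1j * np.pi * Pi[0, 0] - 2j * np.pi * s[0])

# Genus one: the same sum is the Jacobi theta function theta_3.
tau, s1 = 0.8j, 0.17
theta_g1 = riemann_theta([s1], [[tau]])
theta3 = 1 + 2 * sum(np.exp(1j * np.pi * tau * k * k) * np.cos(2 * np.pi * k * s1)
                     for k in range(1, 9))
```

For genus $g$ the sum has $(2N+1)^g$ terms, so this direct approach is only practical for small genus.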

Theorem 1 generalizes the result by Its et al. [12,13] for the XY model. In that case the genus of $\mathcal{L}$ is one, and the theta function in the integral reduces to the Jacobi theta function $\theta_3$. However, for the XY model the integral (2.12) can be expressed in terms of the infinite series

\[
S(\rho_A) = \sum_{m=-\infty}^{\infty} (1 + \mu_m) \log \frac{2}{1 + \mu_m} = 2 \sum_{m=0}^{\infty} e(1, \mu_m), \tag{2.14}
\]

where the numbers $\mu_m$ are the solutions of the equation

\[
\theta_3\!\left( \beta(\lambda) + \frac{\sigma\tau}{2} \right) = 0, \tag{2.15}
\]

and $\sigma$ is 0 or 1 depending on the strength of the magnetic field. The zeros of the one-dimensional theta function are all known, so that the numbers $\mu_m$ can be described by the explicit formula

\[
\mu_m = -i \tan\!\left( \left( m + \frac{1-\sigma}{2} \right) \pi\tau \right).
\]

Moreover, as was shown by Peschel [21] (who also suggested an alternative heuristic derivation of Eq. (2.14) based on the work of Calabrese and Cardy [3]), the series (2.14) can be summed up to an elementary function of the complete elliptic integrals corresponding to the modular parameter $\tau$. It is an open problem whether an analogous representation of the integral (2.12) exists for $g > 1$.

The next step consists of understanding what happens to formula (2.12) when we approach a phase transition. The hyperelliptic curve $\mathcal{L}$, and hence all the parameters in the integral (2.12), are determined by the roots of the polynomial $p(z)$ which defines the symbol (1.5). In Sect. 3 we discuss how the coefficients of $p(z)$ are related to the Hamiltonians of the spin chains. In the case of the XY model $p(z)$ is given by Eq. (1.7); since the degree of $p(z)$ is two, the roots $\lambda_j$ can be easily determined as functions of the parameters $\alpha$ and $\gamma$. It was shown by Calabrese and Cardy [4] that when $\alpha = 1$, i.e. when the magnetic field is $h = 2$, the XY model undergoes a phase transition and the entropy diverges. Jin and Korepin [14] showed that when $\gamma$ approaches 0, i.e. the XY model approaches the XX model, and $\alpha \le 1$, then the entanglement entropy diverges logarithmically. Its et al. [12,13] discovered that the divergence of the entropy for the XY and XX models corresponds to the roots (2.4) of (2.8) approaching the unit circle.

This phenomenon extends to the family of systems that we study. In other words, a phase transition manifests itself when pairs of roots of (2.8) approach the unit circle; one root in each pair is inside the unit circle, the other outside. As we shall see, in these circumstances the entropy of entanglement diverges logarithmically. From (2.4) we see that if $\lambda_j$ is a root of (2.8) so is $\lambda_j^{-1}$. Moreover, since the polynomial in (2.8) has real coefficients, if $\lambda_j$ is complex then $\bar\lambda_j$ and $\bar\lambda_j^{-1}$ will be roots of (2.8) too (see Fig. 3).

Fig. 3. The location of one of the roots (2.4), say $\lambda_j$, determines the positions of the other three: $\bar\lambda_j$, $1/\lambda_j$ and $1/\bar\lambda_j$

Now, suppose that $\lambda_j$ approaches the unit circle and $|\lambda_j| < 1$; then $|\bar\lambda_j^{-1}| > 1$ and $\bar\lambda_j^{-1}$ will also be approaching the unit circle with $\lambda_j - \bar\lambda_j^{-1} \to 0$. At a phase transition the behavior of the entropy of entanglement is captured by

Theorem 2. Let the $m$ pairs of roots $\lambda_j$, $\bar\lambda_j^{-1}$, $j = 1, \ldots, m$, approach the unit circle together, in such a way that the limiting values of $\lambda_j$, $\bar\lambda_j^{-1}$ are distinct from those of $\lambda_k$, $\bar\lambda_k^{-1}$ if $j \ne k$. Then the entanglement entropy is asymptotic to

\[
S(\rho_A) = -\frac{1}{6} \sum_{j=1}^{m} \log\left| \lambda_j - \bar\lambda_j^{-1} \right| + O(1), \qquad \lambda_j \to \bar\lambda_j^{-1}, \quad j = 1, \ldots, m. \tag{2.16}
\]

From the integral (2.1) it is evident that in order to prove Theorems 1 and 2 we need an explicit asymptotic formula for the determinant $D_L(\lambda)$. Indeed, the following proposition gives us an asymptotic representation for the determinants of block-Toeplitz matrices whose symbols belong to the family defined in Eqs. (1.5) and (1.6).

Proposition 1. Let $\Omega_\varepsilon$ be the set

\[
\Omega_\varepsilon := \{\lambda \in \mathbb{R} : |\lambda| \ge 1 + \varepsilon\}. \tag{2.17}
\]

Then the Toeplitz determinant $D_L(\lambda)$ admits the following asymptotic representation, which is uniform in $\lambda \in \Omega_\varepsilon$:

\[
D_L(\lambda) = (1 - \lambda^2)^L\, \frac{\theta\!\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right) \theta\!\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\!\left(\frac{\tau}{2}\right)} \left( 1 + O\!\left(\rho^{-L}\right) \right), \qquad L \to \infty. \tag{2.18}
\]

Here $\rho$ is any real number satisfying the inequality $1 < \rho < \min\{|\lambda_j| : |\lambda_j| > 1\}$.

Remark 1. The first factor on the right hand side of Eq. (2.18) corresponds to the "trivial" factor $G[\Phi]$ of the general Widom formula (5.1), which we discuss in detail in Sect. 5, while the ratio of the theta functions provides an explicit expression of the most interesting part of the formula, Widom's pre-factor $E[\Phi] \equiv \det\left( T_\infty[\Phi]\, T_\infty[\Phi^{-1}] \right)$, which is given in formula (5.2).

Remark 2. The asymptotic representation (2.18) is actually valid in a much wider domain of the complex $\lambda$-plane. Indeed, it is true everywhere away from the zeros of the right hand side, which, unfortunately, in the case of genus $g > 1$ are very difficult to express in a simple closed form: one faces a highly transcendental object, the theta-divisor. This constitutes an important difference between the general case and the case $g = 1$ studied in [12] and [13], where the zeros of Eq. (2.15) can be easily evaluated.

3. Quantum Spin Chains with Anisotropic Hamiltonians

The XY model is a spin-1/2 ferromagnetic chain with an exchange coupling $\alpha$ in a constant transverse magnetic field $h$. The Hamiltonian is $H = h H_\alpha$ with $H_\alpha$ given by

\[
H_\alpha = -\frac{\alpha}{2} \sum_{j=0}^{M-1} \left[ (1+\gamma)\, \sigma_j^x \sigma_{j+1}^x + (1-\gamma)\, \sigma_j^y \sigma_{j+1}^y \right] - \sum_{j=0}^{M-1} \sigma_j^z, \tag{3.1}
\]

where $\{\sigma^x, \sigma^y, \sigma^z\}$ are the Pauli matrices. The parameter $\gamma$ lies in the interval $[0, 1]$ and measures the anisotropy of $H_\alpha$. When $\gamma = 0$, (3.1) becomes the Hamiltonian of the XX model. In the limit $M \to \infty$ the XY model undergoes a phase transition at $\alpha_c = 1$.

It is well known that the Hamiltonian (3.1) can be mapped into a quadratic form of Fermi operators and then diagonalized. To this purpose, we introduce the Jordan-Wigner transformations. Let us define

\[
m_{2l+1} = \left( \prod_{j=0}^{l-1} \sigma_j^z \right) \sigma_l^x \quad\text{and}\quad m_{2l} = \left( \prod_{j=0}^{l-1} \sigma_j^z \right) \sigma_l^y. \tag{3.2}
\]


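The inverse relations are

\[
\begin{aligned}
\sigma_l^z &= i\, m_{2l}\, m_{2l+1},\\
\sigma_l^x &= \left( \prod_{j=0}^{l-1} i\, m_{2j}\, m_{2j+1} \right) m_{2l+1},\\
\sigma_l^y &= \left( \prod_{j=0}^{l-1} i\, m_{2j}\, m_{2j+1} \right) m_{2l}.
\end{aligned} \tag{3.3}
\]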

These operators obey the anticommutation relations $\{m_j, m_k\} = 2\delta_{jk}$ but are not quite Fermi operators since they are Hermitian. Thus, we define

\[
b_l = (m_{2l+1} - i\, m_{2l})/2 \quad\text{and}\quad b_l^\dagger = (m_{2l+1} + i\, m_{2l})/2,
\]

which are proper Fermi operators, as $\{b_j, b_k\} = 0$ and $\{b_j, b_k^\dagger\} = \delta_{jk}$. In terms of the operators $b_j$ the Hamiltonian (3.1) becomes¹

\[
H_\alpha = \frac{\alpha}{2} \sum_{j=0}^{M-1} \left[ b_j^\dagger b_{j+1} + b_{j+1}^\dagger b_j + \gamma \left( b_j^\dagger b_{j+1}^\dagger - b_j b_{j+1} \right) \right] - 2 \sum_{j=0}^{M-1} b_j^\dagger b_j. \tag{3.4}
\]
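The algebra of the Jordan-Wigner operators can be verified on a short chain with explicit matrices. The following sketch assumes NumPy and builds the operators (3.2) for a chain of $M = 3$ spins; the helper names `site_op`, `majoranas` and `anticomm` are illustrative.

```python
from functools import reduce
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
ID2 = np.eye(2, dtype=complex)

def site_op(op, l, M):
    """Operator `op` acting on site l of an M-site chain, identity elsewhere."""
    return reduce(np.kron, [op if j == l else ID2 for j in range(M)])

def majoranas(M):
    """The 2M Hermitian operators m_0, ..., m_{2M-1} of Eq. (3.2)."""
    ms = [None] * (2 * M)
    string = np.eye(2 ** M, dtype=complex)   # product of sigma^z over sites < l
    for l in range(M):
        ms[2 * l] = string @ site_op(SY, l, M)
        ms[2 * l + 1] = string @ site_op(SX, l, M)
        string = string @ site_op(SZ, l, M)
    return ms

def anticomm(a, b):
    return a @ b + b @ a

M = 3
m = majoranas(M)
b = [(m[2 * l + 1] - 1j * m[2 * l]) / 2 for l in range(M)]
identity = np.eye(2 ** M)

majorana_ok = all(
    np.allclose(anticomm(m[j], m[k]), 2 * identity * (j == k))
    for j in range(2 * M) for k in range(2 * M)
)
fermi_ok = all(
    np.allclose(anticomm(b[j], b[k]), 0)
    and np.allclose(anticomm(b[j], b[k].conj().T), identity * (j == k))
    for j in range(M) for k in range(M)
)
inverse_ok = all(
    np.allclose(1j * m[2 * l] @ m[2 * l + 1], site_op(SZ, l, M))
    for l in range(M)
)
```

All three flags evaluate to `True`: the $m_j$ satisfy $\{m_j, m_k\} = 2\delta_{jk}$, the $b_j$ satisfy the canonical anticommutation relations, and $\sigma_l^z = i\, m_{2l}\, m_{2l+1}$ as in (3.3).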

It turns out that the expectation values of the operators (3.2) with respect to the ground state $|\Psi_g\rangle$ are

\[
\langle\Psi_g|\, m_k\, |\Psi_g\rangle = 0, \tag{3.5}
\]
\[
\langle\Psi_g|\, m_j m_k\, |\Psi_g\rangle = \delta_{jk} + i\,(C_M)_{jk}, \tag{3.6}
\]

where the correlation matrix $C_M$ has the block structure

\[
C_M = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1M} \\ C_{21} & C_{22} & \cdots & C_{2M} \\ \cdots & \cdots & \cdots & \cdots \\ C_{M1} & C_{M2} & \cdots & C_{MM} \end{pmatrix}
\quad\text{with}\quad
C_{jk} = \begin{pmatrix} 0 & g_{j-k} \\ -g_{k-j} & 0 \end{pmatrix}. \tag{3.7}
\]

For large $M$, the real numbers $g_l$ are the Fourier coefficients of

\[
g(\theta) = \frac{\alpha\cos\theta - 1 - i\gamma\alpha\sin\theta}{|\alpha\cos\theta - 1 - i\gamma\alpha\sin\theta|}.
\]

In other words, $C_M$ is a block-Toeplitz matrix with symbol

\[
\varphi(\theta) = \begin{pmatrix} 0 & g(\theta) \\ -g^{-1}(\theta) & 0 \end{pmatrix}. \tag{3.8}
\]

¹ This is strictly true only for open-end Hamiltonians. If we impose periodic boundary conditions, then the term $b_{M-1}^\dagger b_0$ in (3.4) should be replaced by $\left[ \prod_{j=0}^{M-1} \left( 2 b_j^\dagger b_j - 1 \right) \right] b_{M-1}^\dagger b_0$. However, because we are interested in the limit $M \to \infty$, the extra factor in front of $b_{M-1}^\dagger b_0$ can be neglected.

(We outline the derivations of formulae (3.5) and (3.6) for the family of systems (3.10) that we study in Appendices B and C.) Equation (3.5) is a straightforward consequence of the invariance of $H_\alpha$ under the map $b_j \to -b_j$; for the same reason the expectation value of the product of an odd number of $m_j$'s must be zero. Formula (3.6) was derived for the first time by Lieb et al. [17]. The expectation values of the product of an even number of the $m_j$'s can be computed using Wick's theorem:

\[
\langle\Psi_g|\, m_{j_1} m_{j_2} \cdots m_{j_{2n}}\, |\Psi_g\rangle = \sum_{\text{all pairings}} (-1)^p \prod_{\text{all pairs}} (\text{contraction of the pair}), \tag{3.9}
\]

where a contraction of a pair is defined by $\langle\Psi_g|\, m_{j_l} m_{j_m}\, |\Psi_g\rangle$ and $p$ is the signature of the permutation, for a given pairing, necessary to bring operators of the same pair next to one another from the original order. Many important physical quantities, including the von Neumann entropy and the spin-spin correlation functions, are expressed in terms of the expectation values (3.9).

In this paper we study generalizations of the Hamiltonian (3.4) that are quadratic in the Fermi operators and translation invariant. More explicitly, we consider the family of systems

\[
H_\alpha = \alpha \sum_{j,k=0}^{M-1} \left[ b_j^\dagger A_{jk} b_k + \frac{\gamma}{2} \left( b_j^\dagger B_{jk} b_k^\dagger - b_j B_{jk} b_k \right) \right] - 2 \sum_{j=0}^{M-1} b_j^\dagger b_j \tag{3.10}
\]

with cyclic boundary conditions. In terms of Pauli operators this Hamiltonian becomes

\[
H_\alpha = -\frac{\alpha}{2} \sum_{0 \le j \le k \le M-1} \left[ (A_{jk} + \gamma B_{jk})\, \sigma_j^x \sigma_k^x \left( \prod_{l=j+1}^{k-1} \sigma_l^z \right) + (A_{jk} - \gamma B_{jk})\, \sigma_j^y \sigma_k^y \left( \prod_{l=j+1}^{k-1} \sigma_l^z \right) \right] - \sum_{j=0}^{M-1} \sigma_j^z. \tag{3.11}
\]

The translation invariance of the interaction implies that $A_{jk} = A_{j-k}$ and $B_{jk} = B_{j-k}$, and the cyclic boundary conditions force $A$ and $B$ to be circulant matrices. Furthermore, since $H_\alpha$ is a Hermitian operator, the matrices $A$ and $B$ must be symmetric and antisymmetric respectively. Now, let us introduce two real functions, $a : \mathbb{Z}/M\mathbb{Z} \to \mathbb{R}$ and $b : \mathbb{Z}/M\mathbb{Z} \to \mathbb{R}$, such that

\[
a(j-k) = \alpha A_{j-k} - 2\delta_{jk} \quad\text{and}\quad b(j-k) = \alpha B_{j-k}, \qquad j, k \in \mathbb{Z}/M\mathbb{Z}. \tag{3.12}
\]

Since $A$ is symmetric and $B$ antisymmetric, we must have $a(-j) = a(j)$ and $b(-j) = -b(j)$.


We shall consider systems with finite range interaction, which implies that there exists a fixed $n < M$ such that

\[
a(j) = b(j) = 0 \quad\text{for } j > n. \tag{3.13}
\]

In Appendices B and C we derive the expectation values in the ground state of the Jordan-Wigner operators $m_j$. They have the same structure as the expectation values (3.5) and (3.6), but now in the limit as $M \to \infty$ the symbol (3.8) of the correlation matrix $C_M$ is replaced by

\[
\varphi(z) = \begin{pmatrix} 0 & g(z) \\ -g^{-1}(z) & 0 \end{pmatrix}, \qquad |z| = 1, \tag{3.14}
\]

where

\[
g(z) = \sqrt{\frac{q(z)}{q(1/z)}} = \sqrt{\frac{p(z)}{z^{2n}\, p(1/z)}}, \tag{3.15}
\]
\[
q(z) = \sum_{j=-n}^{n} \left( a(j) - \gamma\, b(j) \right) z^j, \tag{3.16}
\]
\[
p(z) = z^n q(z). \tag{3.17}
\]
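As a consistency check of (3.15) and (3.17) in the simplest case, one can take the XY polynomial (1.7), for which $n = 1$, and compare $p(z)/(z^{2n} p(1/z))$ on the unit circle with the square of the XY function $g(\theta)$ given after (3.7). The sketch below assumes NumPy; the parameter values $\alpha = \gamma = 0.5$ are arbitrary illustrative choices. It also checks the product representation (2.3) over the roots of $p$.

```python
import numpy as np

def p_xy(z, alpha, gamma):
    """The XY polynomial of Eq. (1.7)."""
    return 0.5 * alpha * (1 - gamma) * z ** 2 - z + 0.5 * alpha * (1 + gamma)

alpha, gamma, n = 0.5, 0.5, 1     # illustrative, non-critical parameters
theta = np.linspace(0.0, 2 * np.pi, 200, endpoint=False)
z = np.exp(1j * theta)

# g(z)^2 from Eq. (3.15), evaluated on the unit circle.
g_squared = p_xy(z, alpha, gamma) / (z ** (2 * n) * p_xy(1 / z, alpha, gamma))

# g(theta) of the XY model.
q = alpha * np.cos(theta) - 1 - 1j * gamma * alpha * np.sin(theta)
g_theta = q / np.abs(q)

# Product representation (2.3) over the 2n roots z_j of p(z).
roots = np.roots([0.5 * alpha * (1 - gamma), -1.0, 0.5 * alpha * (1 + gamma)])
g_squared_product = np.prod([(z - zj) / (1 - zj * z) for zj in roots], axis=0)
```

The three expressions agree to rounding error, and $|g| = 1$ on the unit circle, as required for $g^{-1} = \bar{g}$ there.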

4. The von Neumann Entropy and Block-Toeplitz Determinants

We now concentrate our attention on the entanglement of formation of the ground state $|\Psi_g\rangle$ of the family of Hamiltonians (3.10). Since the ground state is not degenerate, the density matrix is simply the projection operator $|\Psi_g\rangle\langle\Psi_g|$. We then divide the system into two subchains: the first one, A, containing $L$ spins; the second one, B, made of the remaining $M - L$. We shall further assume that $1 \ll L \ll M$. This division creates a bipartite system. The Hilbert space of the whole system is the tensor product $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$, where $\mathcal{H}_A$ and $\mathcal{H}_B$ are spanned by the vectors

\[
\prod_{j=0}^{L-1} (b_j^\dagger)^{r_j}\, |\mathrm{vac}\rangle \quad\text{and}\quad \prod_{j=L}^{M-1} (b_j^\dagger)^{r_j}\, |\mathrm{vac}\rangle, \qquad r_j = 0, 1,
\]

respectively. The vector $|\mathrm{vac}\rangle$ is the vacuum state, which is defined by

\[
b_j\, |\mathrm{vac}\rangle = 0, \qquad j = 0, \ldots, M-1.
\]

Our goal is to determine the asymptotic behavior for large $L$, with $L = o(M)$, of the von Neumann entropy

\[
S(\rho_A) = -\mathrm{tr}\,\rho_A \log \rho_A, \tag{4.1}
\]

where $\rho_A = \mathrm{tr}_B\, \rho_{AB}$ and $\rho_{AB} = |\Psi_g\rangle\langle\Psi_g|$.

It turns out that after computing the partial trace of $\rho_{AB}$ over the degrees of freedom of B, the reduced density matrix $\rho_A$ can be expressed in terms of the first $L$ Fermi operators that generate a basis spanning $\mathcal{H}_A$. As a consequence, only the submatrix $C_L$ formed by the first $2L$ rows and columns of the correlation matrix (3.7) will be relevant in the computation of the entropy (4.1). Now, $C_L$ is even dimensional and skew-symmetric. Furthermore, since $g\!\left(e^{-i\theta}\right) = \overline{g\!\left(e^{i\theta}\right)}$, its Fourier coefficients are real; therefore there exists an orthogonal matrix $V$ that block-diagonalizes $C_L$:

\[
V C_L V^t = \bigoplus_{j=0}^{L-1} \nu_j \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{4.2}
\]

where the $\pm i\nu_j$ are imaginary numbers and are the eigenvalues of the block-Toeplitz matrix $C_L = T_L[\varphi]$, where $\varphi$ is the symbol (3.14). Let us introduce the operators

\[
c_j = (d_{2j+1} - i\, d_{2j})/2, \qquad j = 0, \ldots, L-1, \tag{4.3}
\]

where

\[
d_j = \sum_{k=0}^{2L-1} V_{jk}\, m_k. \tag{4.4}
\]

Since $V$ is orthogonal, $\{d_j, d_k\} = 2\delta_{jk}$ and the $c_j$'s are Fermi operators. Combining Eqs. (4.2), (4.3) and (4.4), we obtain the expectation values

\[
\langle\Psi_g|\, c_j\, |\Psi_g\rangle = \langle\Psi_g|\, c_j c_k\, |\Psi_g\rangle = 0, \tag{4.5}
\]
\[
\langle\Psi_g|\, c_j^\dagger c_k\, |\Psi_g\rangle = \delta_{jk}\, \frac{1 - \nu_j}{2}. \tag{4.6}
\]

The reduced density matrix $\rho_A$ can be computed directly from these expectation values. We report this computation in Appendix A. We have

\[
\rho_A = \prod_{j=0}^{L-1} \left( \frac{1 - \nu_j}{2}\, c_j^\dagger c_j + \frac{1 + \nu_j}{2}\, c_j c_j^\dagger \right). \tag{4.7}
\]

In other words, as Eqs. (4.5) and (4.6) already suggest, these fermionic modes are in a product of uncorrelated states; therefore the density matrix is the direct product

\[
\rho_A = \bigotimes_{j=0}^{L-1} \rho_j \quad\text{with}\quad \rho_j = \frac{1 - \nu_j}{2}\, c_j^\dagger c_j + \frac{1 + \nu_j}{2}\, c_j c_j^\dagger. \tag{4.8}
\]

Since $(1 + \nu_j)/2$ and $(1 - \nu_j)/2$ are eigenvalues of density matrices they must lie in the interval $(0, 1)$; therefore,

\[
-1 < \nu_j < 1, \qquad j = 0, \ldots, L-1.
\]

L−1 j=0

e(1, ν j ),

(4.9)

130

A. R. Its, F. Mezzadri, M. Y. Mo

where e(x, ν) is defined in Eq. (2.2). Using the residue theorem, formula (4.9) can be rewritten as ⎛ ⎞ L−1 1 2λ ⎝(−1) L ⎠ e(1 + , λ)dλ S(ρA ) = lim+ 2 − ν2

→0 4πi ( ) λ j j=0 1 d log D L (λ) dλ, (4.10) e(1 + , λ) = lim+

→0 4πi ( ) dλ where ( ) is the contour in Fig. 1 and D L (λ) = (−1) L

L−1

(λ2 − ν 2j )

(4.11)

j=0

is the determinant of the block-Toeplitz matrix $T_L[\Phi](\lambda)$ with symbol (1.5).

The integral (4.10) was introduced for the first time by Jin and Korepin [14] to compute the entropy of entanglement in the XX model. In this case $g^{-1}(\theta) = g(\theta)$ and $D_L(\lambda)$ becomes the determinant of a Toeplitz matrix with a scalar symbol. Keating and Mezzadri [18,19] generalized it to lattice models where $D_L(\lambda)$ becomes an average over one of the classical compact groups. Its et al. [12,13] computed the same integral for the XY model, for which $D_L(\lambda)$ is the determinant of a block-Toeplitz matrix with symbol (1.4). Following the same approach as Its et al., in this paper we express $D_L(\lambda)$ as a Fredholm determinant of an integrable operator on $L^2(\Xi, \mathbb{C}^2)$ and solve the Riemann-Hilbert problem associated to it. This will give an explicit formula for $D_L(\lambda)$, which can then be used to compute the integral (4.10).

5. The Asymptotics of Block Toeplitz Determinants. Widom's Theorem

A generalization of the strong Szegő theorem to determinants of block-Toeplitz matrices was first discovered by Widom [24,25]. Consider a $p \times p$ matrix symbol $\varphi$ and assume that

\[
\|\varphi\| = \sum_{k=-\infty}^{\infty} \|\varphi_k\| + \left( \sum_{k=-\infty}^{\infty} |k|\, \|\varphi_k\|^2 \right)^{1/2} < \infty.
\]

The norm that appears on the right-hand side of this equation is the Hilbert-Schmidt norm of the $p \times p$ matrices that occur. In addition, we shall require that $\det \varphi(z) \ne 0$ and $\Delta_{|z|=1} \arg \det \varphi(z) = 0$. Widom showed that if one defines

\[
G[\varphi] := \exp\!\left( \frac{1}{2\pi i} \oint_{\Xi} \log \det \varphi(z)\, \frac{dz}{z} \right) \tag{5.1}
\]

then

\[
E[\varphi] := \lim_{L \to \infty} \frac{D_L[\varphi]}{G[\varphi]^{L+1}} = \det\left( T_\infty[\varphi]\, T_\infty[\varphi^{-1}] \right), \tag{5.2}
\]

where $T_\infty[\varphi]$ is a semi-infinite Toeplitz matrix acting on the Hilbert space of semi-infinite sequences of $p$-vectors:

\[
l^2 = \left\{ \{v_k\}_{k=0}^{\infty} \;:\; v_k \in \mathbb{C}^p, \ \sum_{k=0}^{\infty} \|v_k\|^2 < \infty \right\}.
\]

Formulae (5.1) and (5.2) reduce to Szegő's strong limit theorem when $p = 1$. Although this beautiful formula is very general, it is difficult to extract information from the right-hand side of Eq. (5.2) and to determine formulae that can be used in applications. The advantage of our approach is precisely that it yields an explicit formula for the leading order term of the asymptotics of block-Toeplitz determinants whose symbols $\Phi(z)$ belong to the one-parameter family defined in (1.5).

A starting point of our analysis is the asymptotic representation of the logarithmic derivative (with respect to the parameter $\lambda$) of the determinant $D_L(\lambda) = \det T_L[\Phi](\lambda)$ in terms of 2 × 2 matrix-valued functions, denoted by $U_\pm(z)$ and $V_\pm(z)$, which solve the following Wiener-Hopf factorization problem:

\[
\Phi(z) = U_+(z)\, U_-(z) = V_-(z)\, V_+(z), \tag{5.3}
\]
\[
U_-(z) \text{ and } V_-(z) \ (U_+(z) \text{ and } V_+(z)) \text{ are analytic outside (inside) the unit circle } \Xi, \tag{5.4}
\]
\[
U_-(\infty) = V_-(\infty) = I. \tag{5.5}
\]

Now, let us fix $\varepsilon > 0$ and define the set

\[
\Omega_\varepsilon := \{\lambda \in \mathbb{R} : |\lambda| \ge 1 + \varepsilon\}. \tag{5.6}
\]

In the next section we will show that for every $\lambda \in \Omega_\varepsilon$ the solution of the above Wiener-Hopf factorization problem exists, and the corresponding matrix functions $U_\pm(z)$ and $V_\pm(z)$ satisfy the following uniform estimate:

\[
\left| \frac{1}{\lambda}\, U_+(z) \right|, \ \left| \frac{1}{\lambda}\, V_+(z) \right|, \ |U_-(z)|, \ |V_-(z)| < C_\varepsilon, \qquad \forall z \in D_\pm, \ \forall \lambda \in \Omega_\varepsilon, \tag{5.7}
\]

where the notation $D_+$ ($D_-$) is used for the interior (exterior) of the unit circle $\Xi$. Moreover, generalizing the approach of [12,13], we will obtain explicit formulae for the functions $U_\pm(z)$ and $V_\pm(z)$ in terms of multidimensional theta functions.

The asymptotic representation of the logarithmic derivative $d \log D_L(\lambda)/d\lambda$ is given by the following theorem:

Theorem 3. Let $\lambda \in \Omega_\varepsilon$, and fix a positive number $R > 0$. Then, we have the following asymptotic representation for the logarithmic derivative of the determinant $D_L(\lambda) = \det T_L[\Phi]$:

\[
\frac{d}{d\lambda} \log D_L(\lambda) = -\frac{2\lambda}{1 - \lambda^2}\, L + \frac{1}{2\pi} \oint_{\Xi} \mathrm{tr}\left[ \left( U_+'(z)\, U_+^{-1}(z) + V_+^{-1}(z)\, V_+'(z) \right) \Phi^{-1}(z) \right] dz + r_L(\lambda), \tag{5.8}
\]

where $(')$ means the derivative with respect to $z$, the error term $r_L(\lambda)$ satisfies the estimate

\[
|r_L(\lambda)| \le C \rho^{-L}, \qquad \lambda \in \Omega_\varepsilon \cap \{|\lambda| \le R\}, \quad L \ge 1, \tag{5.9}
\]

and $\rho$ is any real number such that $1 < \rho < \min\{|\lambda_j| : |\lambda_j| > 1\}$.


This theorem, without the error term estimate, is a specialization of one of the classical results of H. Widom [24] to the case of matrix generators $\Phi(z)$ whose dependence on the extra parameter $\lambda$ is given by the equation

\[
\Phi(z) \equiv \Phi(z; \lambda) = i\lambda I + \Phi(z; 0).
\]

The estimate (5.9) of the error term, as well as an alternative proof of the theorem itself in the case of curves of genus one, is given in [12] and [13]. The method of [12] and [13] is based on the Riemann-Hilbert approach to Toeplitz determinants [5] and on the theory of integrable Fredholm operators [9,11]; its extension to symbols (1.5), where the polynomial $p(z)$ entering in (1.6) is of arbitrary degree, is straightforward. Indeed, the following generalization of Theorem 3 follows directly from the analytic considerations of [13].

Theorem 4. Suppose that the matrix generator $\Phi(z)$ is analytic in the annulus

\[
D_\delta = \{1 - \delta < |z| < 1 + \delta\}.
\]

Suppose also that $\Phi(z)$ depends analytically on an extra parameter $\mu$ and that it admits a Wiener-Hopf factorisation for all $\mu$ from a certain set $\mathcal{M}$. Finally, we shall assume that the matrix functions

\[
\Phi(z), \quad \Phi^{-1}(z), \quad \frac{\partial \Phi(z)}{\partial \mu}, \quad U_\pm(z), \quad\text{and}\quad V_\pm(z)
\]

are uniformly bounded for all $\mu \in \mathcal{M}$ and all $z$ from the respective domains, i.e. $D_\delta$ in the case of $\Phi(z)$, $\Phi^{-1}(z)$, and $\partial\Phi(z)/\partial\mu$, and $D_\pm$ in the case of $U_\pm(z)$ and $V_\pm(z)$. Then, the logarithmic derivative of the determinant $D_L(\mu) = \det T_L[\Phi]$ has the following asymptotic representation:

\[
\begin{aligned}
\frac{d}{d\mu} \log D_L(\mu) ={}& \frac{L}{2\pi i} \oint_{\Xi} \mathrm{tr}\left[ \Phi^{-1}(z)\, \frac{\partial \Phi(z)}{\partial \mu} \right] \frac{dz}{z} + \frac{1}{2\pi i} \oint_{\Xi} \mathrm{tr}\left[ (\Phi^{-1})'(z)\, \frac{\partial \Phi(z)}{\partial \mu} \right] \frac{dz}{z} \\
&+ \frac{1}{2\pi i} \oint_{\Xi} \mathrm{tr}\left[ U_+'(z)\, U_+^{-1}(z)\, \frac{\partial \Phi(z)}{\partial \mu}\, \Phi^{-1}(z) + V_+^{-1}(z)\, V_+'(z)\, \Phi^{-1}(z)\, \frac{\partial \Phi(z)}{\partial \mu} \right] dz + r_L(\mu),
\end{aligned} \tag{5.10}
\]

where the error term $r_L(\mu)$ satisfies the uniform estimate

\[
|r_L(\mu)| \le C \rho^{-L}, \qquad \mu \in \mathcal{M}, \quad L \ge 1, \tag{5.11}
\]

and $\rho$ is any positive number such that $1 < \rho < 1 + \delta$.

This theorem, without the estimate of the error term and with much weaker assumptions on the generator $\Phi(z)$, is exactly the classical result of Widom from [24].

Remark 3. Denote

\[
u_\pm(z) = V_\pm^{-1}(z), \quad\text{and}\quad v_\pm(z) = U_\pm^{-1}(z),
\]

so that

\[
\Phi^{-1}(z) = u_+(z)\, u_-(z) = v_-(z)\, v_+(z).
\]


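Then, Eq. (5.10) can be re-written in a more compact way:

\[
\frac{d}{d\mu} \log D_L(\mu) = \frac{L}{2\pi i} \oint_{\Xi} \mathrm{tr}\left[ \Phi^{-1}(z)\, \frac{\partial \Phi(z)}{\partial \mu} \right] \frac{dz}{z} + \frac{i}{2\pi} \oint_{\Xi} \mathrm{tr}\left[ \left( u_+'(z)\, u_-(z) - v_-'(z)\, v_+(z) \right) \frac{\partial \Phi(z)}{\partial \mu} \right] dz + r_L(\mu). \tag{5.12}
\]

This is the form in which the result is formulated in [24].

Theorem 4 can be used to strengthen the statement of Theorem 3 by removing the dependence of the constant $C$ on $R$ in the estimate (5.9). This leads to the following extension of Theorem 3:

Theorem 5. Let $\Omega_\varepsilon$ be the set defined in (5.6) and let $\Phi(z)$ be the symbol defined in (1.5). Then we have the following asymptotic representation of the logarithmic derivative of the determinant $D_L(\lambda) = \det T_L[\Phi]$ for all $\lambda \in \Omega_\varepsilon$:

\[
\frac{d}{d\lambda} \log D_L(\lambda) = -\frac{2\lambda}{1 - \lambda^2}\, L + \frac{1}{2\pi} \oint_{\Xi} \mathrm{tr}\left[ \left( U_+'(z)\, U_+^{-1}(z) + V_+^{-1}(z)\, V_+'(z) \right) \Phi^{-1}(z) \right] dz + r_L(\lambda), \tag{5.13}
\]

where $(')$ means the derivative with respect to $z$, the error term $r_L(\lambda)$ satisfies the uniform estimate

\[
|r_L(\lambda)| \le \frac{C}{|\lambda|^3}\, \rho^{-L}, \qquad \lambda \in \Omega_\varepsilon, \quad L \ge 1, \tag{5.14}
\]

and $\rho$ is any real number such that $1 < \rho < \min\{|\lambda_j| : |\lambda_j| > 1\}$.

Proof. Let $R > 1 + \varepsilon$ and denote by $C_1$ the constant $C$ from estimate (5.9). Take now $\lambda \in \Omega_\varepsilon$, $|\lambda| \ge R$, and set

\[
\mu = \frac{1}{\lambda} \in \mathcal{M} \equiv \left\{ \mu \in \mathbb{R} : |\mu| \le \frac{1}{R} < \frac{1}{1 + \varepsilon} \right\}.
\]

By trivial algebra, we arrive at

\[
D_L(\lambda) = (-\lambda^2)^L\, \tilde{D}_L(\mu),
\]

where $\tilde{D}_L(\mu) \equiv \det T_L[\tilde\Phi]$ and

\[
\tilde\Phi(z) \equiv \frac{1}{i\lambda}\, \Phi(z) = I - i\mu\, \Phi(z; 0) \equiv \begin{pmatrix} 1 & -i\mu\, g(z) \\ i\mu\, g^{-1}(z) & 1 \end{pmatrix}. \tag{5.15}
\]

From this relation it also follows that

\[
\frac{d}{d\lambda} \log D_L(\lambda) = \frac{2L}{\lambda} - \frac{1}{\lambda^2}\, \frac{d}{d\mu} \log \tilde{D}_L(\mu), \tag{5.16}
\]

and hence the asymptotic analysis of the logarithmic derivative $d \log D_L(\lambda)/d\lambda$ for $|\lambda| \ge R$ is reduced to that of the logarithmic derivative $d \log \tilde{D}_L(\mu)/d\mu$ for $\mu \in \mathcal{M} \equiv \left\{ \mu \in \mathbb{R} : |\mu| \le \frac{1}{R} < \frac{1}{1+\varepsilon} \right\}$.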


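Firstly, we notice that for all $\mu \in M$ and $z \in D_\delta$ the functions $\tilde\Phi(z)$, $\tilde\Phi^{-1}(z)$ and $\partial\tilde\Phi(z)/\partial\mu$ are uniformly bounded. Secondly, we have that

\[
\tilde\Phi(z) = \frac{1}{i\lambda}\Phi(z) = \frac{1}{i\lambda}U_+(z)U_-(z) = \frac{1}{i\lambda}V_-(z)V_+(z),
\]

and hence the matrix valued functions $\tilde U_\pm(z)$ and $\tilde V_\pm(z)$ defined by the relations

\[
\tilde U_+(z) = \frac{1}{i\lambda}U_+(z), \quad \tilde V_+(z) = \frac{1}{i\lambda}V_+(z), \quad \tilde U_-(z) = U_-(z), \quad \tilde V_-(z) = V_-(z)
\]

provide the Wiener-Hopf factorization of the generator $\tilde\Phi(z)$. Moreover, because of the estimates (5.7), the functions $\tilde U_\pm(z)$ and $\tilde V_\pm(z)$ are uniformly bounded for all $\mu \in M$ and $z \in D_\pm$. Hence, all the conditions of Theorem 4 are met, and we can claim the uniform asymptotic representation (5.10) of the logarithmic derivative of the determinant $\tilde D_L(\mu)$ with the symbols $\Phi$, $U$, and $V$ replaced by $\tilde\Phi$, $\tilde U$ and $\tilde V$ respectively. We shall also use the notation $\tilde r_L(\mu)$ and $C_2$ for the error term and constant $C$ from the corresponding estimate (5.11) respectively.

The specific form (5.15) of dependence of the generator $\tilde\Phi(z)$ on the parameter $\mu$ implies that

\[
\tilde\Phi^{-1}(z)\frac{\partial\tilde\Phi(z)}{\partial\mu} = \frac{1}{1-\mu^2}\begin{pmatrix} -\mu & -ig(z) \\ ig^{-1}(z) & -\mu \end{pmatrix}, \tag{5.17}
\]

and

\[
(\tilde\Phi^{-1})'(z)\frac{\partial\tilde\Phi(z)}{\partial\mu} = \frac{1}{1-\mu^2}\begin{pmatrix} -\mu g^{-1}(z)g'(z) & 0 \\ 0 & \mu g^{-1}(z)g'(z) \end{pmatrix}. \tag{5.18}
\]

Hence

\[
\mathrm{tr}\left[\tilde\Phi^{-1}(z)\frac{\partial\tilde\Phi(z)}{\partial\mu}\right] = -\frac{2\mu}{1-\mu^2} = \frac{2\lambda}{1-\lambda^2}, \qquad \mathrm{tr}\left[(\tilde\Phi^{-1})'(z)\frac{\partial\tilde\Phi(z)}{\partial\mu}\right] = 0,
\]

and Eq. (5.10) for the determinant $\tilde D_L(\mu)$ becomes

\[
\frac{d}{d\mu}\log\tilde D_L(\mu) = \frac{2\lambda}{1-\lambda^2}L + \frac{1}{2\pi i}\int_{\Xi}\mathrm{tr}\left[\tilde U_+'(z)\tilde U_+^{-1}(z)\frac{\partial\tilde\Phi(z)}{\partial\mu}\tilde\Phi^{-1}(z) + \tilde V_+^{-1}(z)\tilde V_+'(z)\tilde\Phi^{-1}(z)\frac{\partial\tilde\Phi(z)}{\partial\mu}\right]dz + \tilde r_L(\mu), \tag{5.19}
\]

with

\[
|\tilde r_L(\mu)| \le C_2\rho^{-L}, \qquad \mu \in M,\quad L \ge 1.
\]

Observe now that Eq. (5.17) can be rewritten as

\[
\tilde\Phi^{-1}(z)\frac{\partial\tilde\Phi(z)}{\partial\mu} = \frac{\partial\tilde\Phi(z)}{\partial\mu}\tilde\Phi^{-1}(z) = \lambda I - i\lambda^2\Phi^{-1}(z). \tag{5.20}
\]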
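The matrix identities (5.17) and (5.20), and the trace value used in (5.19), can be spot-checked numerically. The sketch below is only a sanity check at a single sample point: the values of `lam` and `g` are arbitrary test inputs (assumptions, not data from the paper), and the explicit form of $\Phi$ is the one implied by (5.15), namely $\Phi = i\lambda\tilde\Phi$.

```python
# Sanity check of Eqs. (5.17) and (5.20) and of tr[...] = 2*lam/(1 - lam^2).
# lam and g are arbitrary test values (assumptions), not taken from the paper.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def max_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

lam = 2.5        # sample spectral parameter with |lam| > 1
mu = 1 / lam
g = 0.7 + 0.4j   # stand-in for the value of g(z) at some point z

phi = [[1j * lam, g], [-1 / g, 1j * lam]]          # Phi = i*lam*Phi_tilde
phi_tilde = [[1, -1j * mu * g], [1j * mu / g, 1]]  # Eq. (5.15)
dphi_tilde = [[0, -1j * g], [1j / g, 0]]           # d(Phi_tilde)/d(mu)

lhs = matmul(inverse(phi_tilde), dphi_tilde)

# Right-hand side of (5.17)
rhs_517 = [[-mu / (1 - mu**2), -1j * g / (1 - mu**2)],
           [1j / (g * (1 - mu**2)), -mu / (1 - mu**2)]]

# Right-hand side of (5.20): lam*I - i*lam^2*Phi^{-1}
phi_inv = inverse(phi)
rhs_520 = [[lam * (i == j) - 1j * lam**2 * phi_inv[i][j] for j in range(2)]
           for i in range(2)]

err_517 = max_diff(lhs, rhs_517)
err_520 = max_diff(lhs, rhs_520)
err_commute = max_diff(lhs, matmul(dphi_tilde, inverse(phi_tilde)))
err_trace = abs(lhs[0][0] + lhs[1][1] - 2 * lam / (1 - lam**2))
```

All four error values should be at the level of floating-point rounding; a single sample point does not prove the identities, it only guards against sign slips.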


This relation, together with the obvious fact that $\tilde U_+'(z)\tilde U_+^{-1}(z) = U_+'(z)U_+^{-1}(z)$ and $\tilde V_+^{-1}(z)\tilde V_+'(z) = V_+^{-1}(z)V_+'(z)$, allows us to transform (5.19) into the asymptotic formula

\[
\frac{d}{d\mu}\log\tilde D_L(\mu) = \frac{2\lambda}{1-\lambda^2}L - \frac{\lambda^2}{2\pi}\int_{\Xi}\mathrm{tr}\left[\left(U_+'(z)U_+^{-1}(z) + V_+^{-1}(z)V_+'(z)\right)\Phi^{-1}(z)\right]dz + \tilde r_L(\mu). \tag{5.21}
\]

The substitution of this relation into the right hand side of Eq. (5.16) yields the following asymptotic formula, which is complementary to Eq. (5.8):

\[
\frac{d}{d\lambda}\log D_L(\lambda) = -\frac{2\lambda}{1-\lambda^2}L + \frac{1}{2\pi}\int_{\Xi}\mathrm{tr}\left[\left(U_+'(z)U_+^{-1}(z) + V_+^{-1}(z)V_+'(z)\right)\Phi^{-1}(z)\right]dz + r_L(\lambda), \tag{5.22}
\]

with the error term $r_L(\lambda)$ satisfying the estimate

\[
|r_L(\lambda)| \le \frac{C_2}{|\lambda|^2}\rho^{-L}, \qquad \lambda \in \Omega_\epsilon \cap \{|\lambda| \ge R\},\quad L \ge 1. \tag{5.23}
\]

Choosing $C = \max\{C_1 R, C_2\}$, we arrive at the statement of the theorem, but with a weaker estimate for the error term $r_L(\lambda)$ than the one in (5.14).

In order to improve the estimate (5.23), we notice that since $\tilde\Phi(z)$ becomes the identity matrix as $\mu \to 0$, the Wiener-Hopf factorization of $\tilde\Phi(z)$ exists for all $\mu$ from the small complex neighbourhood

\[
M_0 \equiv \left\{\mu \in \mathbb{C} : |\mu| < \varepsilon_0 \le \frac{1}{R}\right\}
\]

of the point $\mu = 0$. In particular, this implies that the Wiener-Hopf factors, $\tilde U_\pm(z)$ and $\tilde V_\pm(z)$, admit an analytic continuation to the disc $M_0$ and that the validity of the formulae (5.19) and (5.20) can be extended to the set $M_0 \cup M$. Moreover, from Eq. (5.19) it follows that $\tilde r_L(\mu)$ is analytic in the disc $M_0$ and that $\tilde r_L(0) = 0$. In order to see that the latter equality is true, one has to take into account that $\tilde U_\pm(z) = \tilde V_\pm(z) = I$ for all $z$ and $\mu = 0$, and the evenness of $\tilde D_L(\mu)$ as a function of $\mu$. Now, define

\[
\hat r_L(\mu) = \frac{\tilde r_L(\mu)}{\mu}.
\]


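The function $\hat r_L(\mu)$ is analytic in the disc $M_0$ and satisfies the estimate (5.20) uniformly for $\mu \in C_\varepsilon \equiv \{|\mu| = \varepsilon\}$ and for any $0 < \varepsilon < \varepsilon_0$. With the help of the Cauchy formula,

\[
\hat r_L(\mu) = \frac{1}{2\pi i}\oint_{|\mu'| = \varepsilon_0/2}\frac{\hat r_L(\mu')}{\mu' - \mu}\,d\mu',
\]

we conclude that

\[
|\hat r_L(\mu)| < C\rho^{-L}, \qquad |\mu| \le \varepsilon_0/3,\quad L > 1,
\]

or

\[
|\tilde r_L(\mu)| < C|\mu|\rho^{-L}, \qquad |\mu| \le \varepsilon_0/3,\quad L > 1.
\]

The last inequality combined with (5.20) allows us to replace it by the estimate

\[
|\tilde r_L(\mu)| < C|\mu|\rho^{-L}, \qquad \mu \in M,\quad L > 1,
\]

which, in turn, transforms (5.23) into the estimate

\[
|r_L(\lambda)| \le \frac{C_2}{|\lambda|^3}\rho^{-L}, \qquad \lambda \in \Omega_\epsilon \cap \{|\lambda| \ge R\},\quad L \ge 1, \tag{5.24}
\]

and hence yields the error estimate announced in (5.14). This completes the proof of the theorem.

6. The Wiener-Hopf Factorization of $\Phi(z)$

In this section we will compute the Wiener-Hopf factorization of $\Phi(z)$. We will express the solution in terms of theta functions on a hyperelliptic curve $\mathcal{L}$. From the equality

\[
(1-\lambda^2)\,\sigma_3\Phi^{-1}(z)\sigma_3 = \Phi(z), \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\]

we can express $V$ in terms of $U$ as follows:

\[
V_-(z) = \sigma_3 U_-^{-1}(z)\sigma_3, \qquad V_+(z) = \sigma_3 U_+^{-1}(z)\sigma_3(1-\lambda^2), \qquad \lambda \ne \pm 1. \tag{6.1}
\]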

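Therefore, we only need to compute $U(z)$. To do so, first note that $\Phi(z)$ can be diagonalized by the matrix

\[
Q(z) = \begin{pmatrix} g(z) & -g(z) \\ i & i \end{pmatrix}. \tag{6.2}
\]

Indeed, it is straightforward to see that

\[
\Phi(z) = Q(z)\Lambda Q^{-1}(z), \qquad \Lambda = i\begin{pmatrix} \lambda+1 & 0 \\ 0 & \lambda-1 \end{pmatrix}.
\]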
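The diagonalization by $Q(z)$ and the $\sigma_3$-symmetry that leads to (6.1) can both be verified at a sample point. This is a minimal sketch under stated assumptions: the explicit symbol $\Phi = \begin{pmatrix} i\lambda & g \\ -g^{-1} & i\lambda\end{pmatrix}$ is inferred from (5.15), and `lam`, `g` are arbitrary test values.

```python
# Check Phi = Q * Lambda * Q^{-1} and (1 - lam^2) * sigma3 * Phi^{-1} * sigma3 = Phi.
# The form of Phi is inferred from Eq. (5.15); lam and g are test values.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def max_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

lam = 1.7
g = -0.3 + 1.2j

phi = [[1j * lam, g], [-1 / g, 1j * lam]]
q = [[g, -g], [1j, 1j]]                                 # Eq. (6.2)
big_lambda = [[1j * (lam + 1), 0], [0, 1j * (lam - 1)]]
sigma3 = [[1, 0], [0, -1]]

# Diagonalization: Q Lambda Q^{-1} should reproduce Phi.
err_diag = max_diff(matmul(matmul(q, big_lambda), inverse(q)), phi)

# Symmetry used to express V in terms of U in Eq. (6.1).
conj = matmul(matmul(sigma3, inverse(phi)), sigma3)
scaled = [[(1 - lam**2) * conj[i][j] for j in range(2)] for i in range(2)]
err_sym = max_diff(scaled, phi)
```

Both `err_diag` and `err_sym` should vanish up to rounding for any nonzero `g` and any `lam` different from $\pm 1$.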


The function $Q(z)$ has the following jump discontinuities on the $z$-plane:

\[
Q_+(z) = Q_-(z)\sigma_1, \quad z \in \Sigma_i, \qquad \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
\]

where the branch cuts $\Sigma_i$ are defined in (2.4), (2.5) and (2.6) and $Q_\pm(z)$ are the boundary values of $Q(z)$ to the left/right of $\Sigma_i$. It also has square-root singularities at each branch point with the following behavior:

\[
Q(z) = Q_{\pm i}(z)\begin{pmatrix} (z - z_i^{\pm 1})^{\pm\frac{1}{2}} & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad z \to z_i^{\pm 1},
\]

where $Q_{\pm i}(z)$ are functions that are holomorphic and invertible at $z_i^{\pm 1}$. Let us define

\[
S(z) = U_-(z)Q(z)\Lambda^{-1}, \quad |z| \ge 1, \qquad S(z) = U_+^{-1}(z)Q(z), \quad |z| \le 1. \tag{6.3}
\]

By direct computation we see that $S(z)$ is the unique solution of the following Riemann-Hilbert problem:

\[
\begin{aligned}
S_+(z) &= S_-(z)\sigma_1, & z \in \Sigma_i,\ & i = 1, \ldots, n, \\
S_+(z) &= S_-(z)\Lambda\sigma_1\Lambda^{-1}, & z \in \Sigma_i,\ & i = n+1, \ldots, 2n, \\
\lim_{z\to\infty} S(z) &= Q(\infty)\Lambda^{-1},
\end{aligned}
\tag{6.4}
\]

where, as before, $S_\pm(z)$ denotes the boundary values of $S(z)$ to the left/right of the branch cuts. The matrix function $S(z)$ is holomorphic and invertible everywhere, except on the cuts $\Sigma_j$, where it has the jump discontinuities given in (6.4), and in proximity of the branch points, where it behaves like

\[
\begin{aligned}
S(z) &= S_{\pm i}(z)\begin{pmatrix} (z - z_i^{\pm 1})^{\pm\frac{1}{2}} & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, & z \to z_i^{\pm 1},\ & |z_i^{\pm 1}| < 1, \\
S(z) &= S_{\pm i}(z)\begin{pmatrix} (z - z_i^{\pm 1})^{\pm\frac{1}{2}} & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\Lambda^{-1}, & z \to z_i^{\pm 1},\ & |z_i^{\pm 1}| > 1,
\end{aligned}
\tag{6.5}
\]

where $S_{\pm i}(z)$ are holomorphic and invertible at $z_i^{\pm 1}$.

The Riemann-Hilbert problem (6.4) can be solved in terms of the multi-dimensional theta functions (2.10). However, before we compute $S(z)$ explicitly, we need to introduce further notions and properties of $\theta$. Throughout the rest of this section we shall use the definitions (2.8) of the hyperelliptic curve $\mathcal{L}$ and (2.10) of the theta function associated to $\mathcal{L}$. Furthermore, recall that the choice of the canonical basis for the cycles is described in Fig. 2 and that the normalized 1-forms dual to this basis are defined in Eq. (2.9).

Let us introduce some basic properties of the theta functions. The proofs of such properties can be found in many standard textbooks on Riemann surfaces, for example [6].


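Proposition 2. The theta function is quasi-periodic with the following properties:

\[
\theta(\vec{s} + \vec{M}) = \theta(\vec{s}), \tag{6.6}
\]

\[
\theta(\vec{s} + \Pi\vec{M}) = \exp\left[2\pi i\left(-\langle\vec{M},\vec{s}\rangle - \frac{\langle\vec{M},\Pi\vec{M}\rangle}{2}\right)\right]\theta(\vec{s}), \tag{6.7}
\]

where $\langle\cdot,\cdot\rangle$ denotes the usual inner product in $\mathbb{C}^g$.

A divisor $D$ of degree $m$ on a hyperelliptic curve $\mathcal{L}$ is a formal sum of $m$ points on $\mathcal{L}$, i.e.

\[
D := \sum_{i=1}^{m} d_i, \qquad d_i \in \mathcal{L}.
\]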
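The quasi-periodicity relations (6.6) and (6.7) of Proposition 2 can be illustrated in genus one. The sketch below assumes the series $\theta(s) = \sum_{n\in\mathbb{Z}} \exp(\pi i n^2\Pi + 2\pi i n s)$, which is the standard form compatible with (6.6) and (6.7); the definition (2.10) itself lies outside this section, so this form, the truncation `cutoff`, and the sample values `period`, `s`, `m` are all assumptions made for illustration.

```python
import cmath

def theta(s, period, cutoff=12):
    """Truncated genus-one theta series (assumed form of Eq. (2.10) for g = 1):
    sum over n of exp(pi*i*n^2*period + 2*pi*i*n*s)."""
    return sum(cmath.exp(cmath.pi * 1j * (n * n * period + 2 * n * s))
               for n in range(-cutoff, cutoff + 1))

period = 0.3 + 1.1j   # sample period with positive imaginary part
s = 0.17 - 0.05j      # sample argument
m = 2                 # sample integer shift

base = theta(s, period)

# Eq. (6.6): invariance under integer shifts.
err_66 = abs(theta(s + m, period) - base)

# Eq. (6.7): shift by period*m produces an explicit exponential factor.
factor = cmath.exp(2j * cmath.pi * (-m * s - m * period * m / 2))
shifted = theta(s + period * m, period)
rel_err_67 = abs(shifted - factor * base) / abs(shifted)
```

With the imaginary part of `period` of order one, the series converges very quickly, so a cutoff of 12 terms on each side is more than enough for double precision.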

Let us introduce the Abel map $\omega : \mathcal{L} \to \mathbb{C}^g$ by setting

\[
\omega(p) := \left(\int_{p_0}^{p} d\omega_1, \ldots, \int_{p_0}^{p} d\omega_g\right),
\]

where $p_0$ is a chosen base point on $\mathcal{L}$ and $d\omega_i$ are the normalized 1-forms given in (2.9). In what follows we shall set $p_0 = z_1 = \lambda_1$. The composition of the theta function with the Abel map has $g$ zeros on $\mathcal{L}$. The following lemma tells us where the zeros are.

Lemma 1. Let $D = \sum_{i=1}^{g} d_i$ be a divisor of degree $g$ on $\mathcal{L}$, then the multivalued function $\theta(\omega(p) - \omega(D) - K)$ has precisely $g$ zeros located at the points $d_i$, $i = 1, \ldots, g$. The vector $K = (K_1, \ldots, K_g)$ is the Riemann constant

\[
K_j = \frac{2\pi i + \Pi_{jj}}{2} - \frac{1}{2\pi i}\sum_{l \ne j}\oint_{a_l}\left(d\omega_l(p)\int_{z_1}^{p} d\omega_j\right).
\]

The hyperelliptic curve $\mathcal{L}$ can be thought of as a branched cover of the Riemann sphere $\mathbb{C} \cup \{\infty\}$. Indeed, a point $p \in \mathcal{L}$ can be identified by two complex variables, $p = (z, w)$, where $w$ and $z$ are related by Eq. (2.8). We shall denote by $C_1$ the Riemann sheet where $g(\infty) > 0$ on the real axis, and by $C_2$ the other Riemann sheet in $\mathcal{L}$. Thus, a function $f$ on $\mathcal{L}$ can be thought of as a function in two complex variables: $f(p) = f(z, w)$. Consider the map

\[
T : \mathbb{C}/\cup_{i=1}^{2n}\Sigma_i \to \mathcal{L}, \qquad T(z) = (z, w),
\]

where the branch of $w$ is chosen such that $(z, w)$ is on $C_1$. A function $f$ on $\mathcal{L}$ then defines the function $f \circ T$ on $\mathbb{C}/\cup_{i=1}^{2n}\Sigma_i$ by $f \circ T(z) = f(z, w)$. For the sake of simplicity, and when there is no ambiguity, we shall write $f(z)$ instead of $f \circ T(z)$ and $f(p)$ instead of $f(z, w)$.

Fig. 4. The Jordan arc $\Sigma$ connects all the branch points and extends to infinity on the left hand side of $\lambda_1$ and on the right hand side of $\lambda_{4n}$. All branch cuts belong to $\Sigma$ and are denoted by $\Sigma_i$, while the intervals between the branch cuts are denoted by $\tilde\Sigma_i$

Abelian integrals on $\mathcal{L}$ can be represented as integrals on the Riemann sheet with jump discontinuities. To do so, let us first define a Jordan arc $\Sigma$ as in Fig. 4. Let $f(z, w)$ be a function on $\mathcal{L}$ and $f(z) = f \circ T(z)$. Then an Abelian integral on $\mathcal{L}$,

\[
I(p) = \int_{\lambda_1}^{p} f(p')\,dp',
\]

defines the following integral on $\mathbb{C}$:

\[
I(z) = \int_{\lambda_1}^{z} f \circ T(z')\,dz',
\]

where the path of the integration does not intersect $\Sigma/\{\lambda_1\}$. Such an integral will in general have jump discontinuities along $\Sigma$, and its value on the left hand side of $\Sigma$ will be denoted by $I(z)_+$, while its value on the right hand side of $\Sigma$ will be denoted by $I(z)_-$. Let $\rho$ be the hyperelliptic involution that interchanges the two sheets of $\mathcal{L}$, i.e. $\rho(z, w) = (z, -w)$. The action of $\rho$ on $f(z)$ is given by

\[
\rho(f)(z) = f(z, -w), \tag{6.8}
\]

i.e. it is the function evaluated on $C_2$. Similarly, the action of $\rho$ on an integral $I(z)$ is defined by

\[
\rho(I)(z) = \int_{\lambda_1}^{z}\rho(f)(z')\,dz'. \tag{6.9}
\]

From Proposition 2 we see that the composition of the Abel map $\omega$ with $\theta$ has the following jump discontinuities when considered as a function on $\mathbb{C}$:

Lemma 2. Let $z$ be a point on $\mathbb{C}$, and let $\Sigma$ be a Jordan arc joining all the branch cuts as in Fig. 4, then the quotient of theta functions has the following jump discontinuities on $\Sigma$:

\[
\begin{aligned}
\left(\frac{\theta(\omega(z) + A)}{\theta(\omega(z) + B)}\right)_+ &= \left(\frac{\theta(\omega(z) + A)}{\theta(\omega(z) + B)}\right)_-, & z &\in \tilde\Sigma_j, \\
\left(\frac{\theta(\omega(z) + A)}{\theta(\omega(z) + B)}\right)_+ &= e^{-2\pi i(A_{j-1} - B_{j-1})}\left(\frac{\theta(-\omega(z) + A)}{\theta(-\omega(z) + B)}\right)_-, & z &\in \Sigma_j,
\end{aligned}
\]

where $A$ and $B$ are arbitrary $(2n-1)$-vectors and $A_0 = B_0 = 0$.

Proof. The holomorphic differentials $d\omega_i$ are given by

\[
d\omega_i = \frac{P_i(z)}{w(z)}\,dz,
\]

for some polynomial $P_i(z)$ of degree less than $2n-1$ in $z$. This means that, under the action of $\rho$, $d\omega_i$ becomes $-d\omega_i$. In particular, we have

\[
\rho(\omega)(z) = -\omega(z), \tag{6.10}
\]

where the action of $\rho$ on $\omega$ is given by (6.8) and (6.9).

We first consider the jumps across the gaps $\tilde\Sigma_j$. Take two distinct paths from $\lambda_1$ to a point $z \in \tilde\Sigma_j$. Assume also that both curves do not intersect $\Sigma$ and that one extends to the left of $\Sigma$, while the other to its right. The union of these paths lifts to a loop $\tilde\gamma$ on $\mathcal{L}$. Moreover, $\tilde\gamma$ is a linear combination of $a$-cycles, i.e.

\[
\tilde\gamma = \sum_{i=1}^{g} N_i a_i,
\]

where the $N_i$'s are non-negative integers. Therefore, we have

\[
\left(\frac{\theta(\omega(z) + \aleph + A)}{\theta(\omega(z) + \aleph + B)}\right)_+ = \left(\frac{\theta(\omega(z) + A)}{\theta(\omega(z) + B)}\right)_+ = \left(\frac{\theta(\omega(z) + A)}{\theta(\omega(z) + B)}\right)_-, \quad z \in \tilde\Sigma_j, \qquad \aleph = \sum_{i=1}^{g} N_i E_i,
\]

where $E_i$ is the column vector with 1 in the $i$-th entry and zero elsewhere.

Now consider the jumps on the branch cuts $\Sigma_j$. Let $z \in \Sigma_j$, then take a loop $\gamma$ on $\mathcal{L}$ consisting of two distinct, non-intersecting curves joining $\lambda_1$ to $z$: one on the left of the cut in $C_1$, the other on the right of the cut in $C_2$. This closed loop $\gamma$ is homologous to the $b$-cycle $b_j$. Therefore,

\[
\left(\frac{\theta(\omega(z) + \Pi_k + A)}{\theta(\omega(z) + \Pi_k + B)}\right)_- = e^{-2\pi i(A_{j-1} - B_{j-1})}\left(\frac{\theta(\omega(z) + A)}{\theta(\omega(z) + B)}\right)_- = \left(\frac{\theta(-\omega(z) + A)}{\theta(-\omega(z) + B)}\right)_+, \quad z \in \Sigma_j, \qquad (\Pi_k)_i = \Pi_{ki}.
\]

This proves the lemma.

We can now solve the Riemann-Hilbert problem (6.4), (6.5). Let us define τ := − ω(z i−1 ) − K , 2 i=2 z d, (z) := 2n

+∞

(6.11)


where d is the normalized differential of third type with simple poles at ∞± and residues ± 21 respectively. In addition, we write κ :=

1 2πi

1 d, . . . , 2πi b1

d . bg

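Proposition 3. Let $\infty^\pm$ be the points on $\mathcal{L}$ that project to $\infty$ on $C_1$. The unique solution of the Riemann-Hilbert problem (6.4), (6.5) is given by

\[
S(z) = Q(\infty)\Lambda^{-1}\Theta^{-1}(\infty)\Theta(z), \tag{6.12}
\]

where the entries of $\Theta(z)$ are given by

\[
\begin{aligned}
\Theta_{11}(z) &= \sqrt{z - \lambda_1}\,e^{-\Delta(z)}\,\frac{\theta\left(\omega(z) + \beta(\lambda)\vec{e} - \kappa + \frac{\tau}{2}\right)}{\theta\left(\omega(z) + \frac{\tau}{2}\right)}, \\
\Theta_{12}(z) &= -\sqrt{z - \lambda_1}\,e^{\Delta(z)}\,\frac{\theta\left(\omega(z) - \beta(\lambda)\vec{e} + \kappa - \frac{\tau}{2}\right)}{\theta\left(\omega(z) - \frac{\tau}{2}\right)}, \\
\Theta_{21}(z) &= -\sqrt{z - \lambda_1}\,e^{\Delta(z)}\,\frac{\theta\left(\omega(z) + \beta(\lambda)\vec{e} + \kappa - \frac{\tau}{2}\right)}{\theta\left(\omega(z) - \frac{\tau}{2}\right)}, \\
\Theta_{22}(z) &= \sqrt{z - \lambda_1}\,e^{-\Delta(z)}\,\frac{\theta\left(\omega(z) - \beta(\lambda)\vec{e} - \kappa + \frac{\tau}{2}\right)}{\theta\left(\omega(z) + \frac{\tau}{2}\right)},
\end{aligned}
\tag{6.13}
\]

where $\vec{e}$ is a $2n-1$ dimensional vector whose last $n$ entries are 1 and the first $n-1$ entries are 0. The branch cut of $\sqrt{z - \lambda_1}$ is defined to be $\Sigma/\tilde\Sigma_0$.

Proof. By using Lemma 2, we see that $\Theta(z)$ has the following jump discontinuities:

\[
\begin{aligned}
(\Theta_{11}(z))_+ &= (\Theta_{12}(z))_-, & (\Theta_{12}(z))_+ &= (\Theta_{11}(z))_-, \\
(\Theta_{21}(z))_+ &= (\Theta_{22}(z))_-, & (\Theta_{22}(z))_+ &= (\Theta_{21}(z))_-,
\end{aligned}
\qquad z \in \Sigma_i,\ i = 1, \ldots, n,
\]

\[
\begin{aligned}
(\Theta_{11}(z))_+ &= \frac{\lambda-1}{\lambda+1}(\Theta_{12}(z))_-, & (\Theta_{12}(z))_+ &= \frac{\lambda+1}{\lambda-1}(\Theta_{11}(z))_-, \\
(\Theta_{21}(z))_+ &= \frac{\lambda-1}{\lambda+1}(\Theta_{22}(z))_-, & (\Theta_{22}(z))_+ &= \frac{\lambda+1}{\lambda-1}(\Theta_{21}(z))_-,
\end{aligned}
\qquad z \in \Sigma_i,\ i = n+1, \ldots, 2n.
\]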

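This means that $\Theta(z)$ has the same jump discontinuities as in (6.4). To see that $\Theta(z)$ has the singularity structure given by (6.5), note that the function

\[
\tilde U_+ = Q(z)\Theta^{-1}(z), \quad |z| < 1, \qquad \tilde U_- = \Theta(z)\Lambda Q^{-1}(z), \quad |z| > 1,
\]

has no jump discontinuities across the branch cuts $\Sigma_j$. It can only have singularities of order less than or equal to $\frac{1}{2}$ at the points $z_j^{\pm 1}$. This means that, if it was singular at $z_j^{\pm 1}$, then it would have jump discontinuities across $\Sigma_j$ due to the branch point type singularities. Therefore it is holomorphic at the points $z_j^{\pm 1}$. Hence, the function $\Theta(z)$ must have the singularity structure of the form (6.5).

To show that $S(z)$ has the correct asymptotic behavior at $z = \infty$, we only need to prove that $\Theta(z)$ is invertible at $z = \infty$. The asymptotic behavior of $\Theta(z)$ is given by

\[
\begin{aligned}
\Theta_{11}(\infty) &= \theta\left(\omega(\infty) - \kappa + \beta(\lambda)\vec{e} + \frac{\tau}{2}\right)e^{-\Delta_0}, \\
\Theta_{22}(\infty) &= \theta\left(\omega(\infty) - \kappa - \beta(\lambda)\vec{e} + \frac{\tau}{2}\right)e^{-\Delta_0}, \\
\Theta_{12}(\infty) &= \Theta_{21}(\infty) = 0,
\end{aligned}
\]

where $\Delta_0 = \lim_{z\to\infty}\left(\Delta(z) - \frac{1}{2}\log(z - \lambda_1)\right)$.

We will now show that $\omega(\infty) = \kappa$. Let $\eta$ be a third type differential with simple poles at the points $x_i \in \mathcal{L}$ and $\tilde\eta$ be a holomorphic differential. Let $\Pi_i$ and $\tilde\Pi_i$ be their periods

\[
\int_{a_i}\eta = \Pi_i, \quad \int_{b_i}\eta = \Pi_{i+g}, \qquad \int_{a_i}\tilde\eta = \tilde\Pi_i, \quad \int_{b_i}\tilde\eta = \tilde\Pi_{i+g}.
\]

Now, by the Riemann bilinear relation [16] we have

\[
\sum_{i=1}^{g}\left(\Pi_i\tilde\Pi_{i+g} - \Pi_{g+i}\tilde\Pi_i\right) = 2\pi i\sum_{x_i}\mathrm{Res}_{x_i}(\eta)\int_{p_0}^{x_i}\tilde\eta,
\]

where $p_0$ is an arbitrary point on $\mathcal{L}$. By substituting $\eta = d\Delta$ and $\tilde\eta = d\omega_j$ for $j = 1, \ldots, g$, we see that

\[
\kappa_j = \frac{1}{2}\left(\omega_j(\infty^+) - \omega_j(\infty^-)\right) = \omega_j(\infty),
\]

where the last equality follows from (6.10). Therefore, we obtain

\[
\begin{aligned}
\Theta_{11}(\infty) &= \theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)e^{-\Delta_0}, \\
\Theta_{22}(\infty) &= \theta\left(-\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)e^{-\Delta_0}, \\
\Theta_{12}(\infty) &= \Theta_{21}(\infty) = 0.
\end{aligned}
\tag{6.14}
\]

Therefore $\Theta(z)$ is invertible at $\infty$ as long as

\[
\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(-\beta(\lambda)\vec{e} + \frac{\tau}{2}\right) \ne 0. \tag{6.15}
\]

Thus, $S(z)$ is the unique solution of the Riemann-Hilbert problem (6.4).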


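Remark 4. In Appendix F, we will show that the Wiener-Hopf factorization is solvable for $\beta(\lambda) \in i\mathbb{R}$, i.e. the Riemann-Hilbert problem (6.4) is solvable for these $\beta(\lambda)$. This in turn implies that (6.15) is true for all $\beta(\lambda) \in i\mathbb{R}$. Define (cf. (5.6))

\[
\Omega_\epsilon = \{\lambda \in \mathbb{R} : |\lambda| \ge 1 + \epsilon\}.
\]

The function $\lambda \to \beta(\lambda)$ maps $\Omega_\epsilon$ onto the bounded subset $N_\epsilon \equiv \{\alpha \in i\mathbb{R} : 0 < |\alpha| \le \frac{1}{2\pi}\log(2\epsilon^{-1} + 1)\}$. By continuity, the inequality (6.15) is valid for all $\alpha$ from the closure of $N_\epsilon$. This fact, together with the explicit formulae (6.12), (6.3) and (6.1), implies the uniform estimates which have been stated in (5.7) and used in the proof of Theorem 5.

7. The Asymptotics of $d\log D_L(\lambda)/d\lambda$ and $D_L(\lambda)$

We are now ready to compute the derivative of the determinant $D_L(\lambda)$. First we notice that by virtue of (6.1), Eq. (5.8) can be re-written as

\[
\frac{d}{d\lambda}\log D_L(\lambda) = -\frac{2\lambda}{1-\lambda^2}L + \frac{1}{2\pi}\int_{|z|=1}\mathrm{tr}\left[U_+'(z)U_+^{-1}(z)\left(\Phi^{-1}(z) - \sigma_3\Phi^{-1}(z)\sigma_3\right)\right]dz + r_L(\lambda). \tag{7.1}
\]

Note that

\[
\Phi^{-1}(z) - \sigma_3\Phi^{-1}(z)\sigma_3 = \frac{2}{1-\lambda^2}\begin{pmatrix} 0 & -g(z) \\ g^{-1}(z) & 0 \end{pmatrix}.
\]

From Eqs. (6.3) and (6.12) we have

\[
U_+^{-1}(z) = A\,\Theta(z)Q^{-1}(z), \qquad U_+'(z) = Q'(z)\Theta^{-1}(z)A^{-1} + Q(z)(\Theta^{-1})'(z)A^{-1},
\]

where we denote $A = Q(\infty)\Lambda^{-1}\Theta^{-1}(\infty)$. Furthermore, from Eq. (6.2) we obtain

\[
Q^{-1}(z) = \frac{1}{2}\begin{pmatrix} g^{-1}(z) & -i \\ -g^{-1}(z) & -i \end{pmatrix}.
\]

Therefore, formula (7.1) transforms into the relation

\[
\frac{d}{d\lambda}\log D_L(\lambda) = -\frac{2\lambda}{1-\lambda^2}L + \frac{i}{\pi(1-\lambda^2)}\int_{|z|=1}\mathrm{tr}\left[\Theta^{-1}(z)\frac{d\Theta(z)}{dz}\sigma_3\right]dz + r_L(\lambda). \tag{7.2}
\]

We will now prove the following: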


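Theorem 6. Let $s(\lambda)$ be given by

\[
s(\lambda) = \frac{i}{\pi(1-\lambda^2)}\int_{|z|=1}\alpha(z)\,dz, \qquad \alpha(z) = \mathrm{tr}\left[\Theta^{-1}(z)\frac{d\Theta(z)}{dz}\sigma_3\right], \tag{7.3}
\]

where the entries of the $2 \times 2$ matrix $\Theta(z)$ are given by (6.13). Then $s(\lambda)$ can be written as

\[
s(\lambda) = -\frac{i}{\pi(1-\lambda^2)}\frac{d}{d\beta}\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right].
\]

Proof. To begin with, we would like to treat $\alpha(z)dz$ as a 1-form on the hyperelliptic curve $\mathcal{L}$. We will show that it is, in fact, the holomorphic 1-form

\[
\alpha(z)\,dz = \sum_{i=1}^{2n-1}\partial_i\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right]d\omega_i,
\]

where $d\omega_i$ are the normalized holomorphic differentials on $\mathcal{L}$ and $\partial_i$ is the partial derivative with respect to the $i$-th argument. Suppose this is true, then by deforming the contour of the integral (7.3), we see that it can be written as

\[
\begin{aligned}
\frac{\pi(1-\lambda^2)}{i}\,s(\lambda) &= -\sum_{k=n}^{2n-1}\oint_{a_k}\alpha(z)\,dz \\
&= -\sum_{k=n}^{2n-1}\oint_{a_k}\sum_{j=1}^{2n-1}\partial_j\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right]d\omega_j \\
&= -\sum_{k=n}^{2n-1}\partial_k\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right] \\
&= -\frac{d}{d\beta}\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right].
\end{aligned}
\]

To see that $\alpha(z)dz$ is given by the corresponding 1-form, let us first compute $\alpha(z)dz$. We have

\[
\alpha(z)\,dz = (\det\Theta(z))^{-1}\left[\Theta_{22}(z)\Theta_{11}'(z) - \Theta_{11}(z)\Theta_{22}'(z) - \Theta_{12}(z)\Theta_{21}'(z) + \Theta_{21}(z)\Theta_{12}'(z)\right]dz, \tag{7.4}
\]

where the prime denotes the derivative with respect to $z$.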

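We can simplify Eq. (7.4) by observing that

\[
\begin{aligned}
\Theta_{11}(z) &= h_1(z)\,\theta\left(\omega(z) + \beta(\lambda)\vec{e} - \kappa + \frac{\tau}{2}\right), &
\Theta_{22}(z) &= h_1(z)\,\theta\left(\omega(z) - \beta(\lambda)\vec{e} - \kappa + \frac{\tau}{2}\right), \\
\Theta_{12}(z) &= h_2(z)\,\theta\left(\omega(z) - \beta(\lambda)\vec{e} + \kappa - \frac{\tau}{2}\right), &
\Theta_{21}(z) &= h_2(z)\,\theta\left(\omega(z) + \beta(\lambda)\vec{e} + \kappa - \frac{\tau}{2}\right),
\end{aligned}
\]

\[
h_1(z) = \frac{\sqrt{z - \lambda_1}\,e^{-\Delta(z)}}{\theta\left(\omega(z) + \frac{\tau}{2}\right)}, \qquad h_2(z) = -\frac{\sqrt{z - \lambda_1}\,e^{\Delta(z)}}{\theta\left(\omega(z) - \frac{\tau}{2}\right)}.
\]

Therefore, we have

\[
\begin{aligned}
\Theta_{22}(z)\Theta_{11}'(z) - \Theta_{11}(z)\Theta_{22}'(z) &= (h_1(z))^2\left(\theta_2\theta_1' - \theta_1\theta_2'\right), \\
\Theta_{12}(z)\Theta_{21}'(z) - \Theta_{21}(z)\Theta_{12}'(z) &= (h_2(z))^2\left(\theta_3\theta_4' - \theta_4\theta_3'\right),
\end{aligned}
\]

where the $\theta_i$'s are given by

\[
\begin{aligned}
\theta_1 &= \theta\left(\omega(z) + \beta(\lambda)\vec{e} - \kappa + \frac{\tau}{2}\right), &
\theta_2 &= \theta\left(\omega(z) - \beta(\lambda)\vec{e} - \kappa + \frac{\tau}{2}\right), \\
\theta_3 &= \theta\left(\omega(z) - \beta(\lambda)\vec{e} + \kappa - \frac{\tau}{2}\right), &
\theta_4 &= \theta\left(\omega(z) + \beta(\lambda)\vec{e} + \kappa - \frac{\tau}{2}\right).
\end{aligned}
\]

Now, the $\theta_i'$'s are just

\[
\theta_i'\,dz = \sum_{k=1}^{2n-1}(\partial_k\theta_i)\,d\omega_k.
\]

By substituting the right hand side of this equation into (7.4) we obtain

\[
\alpha(z)\,dz = \det\Theta(z)^{-1}\sum_{k=1}^{2n-1}d\omega_k\left[(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z)\right],
\]

\[
G_{1k}(z) = \theta_2\partial_k\theta_1 - \theta_1\partial_k\theta_2, \qquad G_{2k}(z) = \theta_3\partial_k\theta_4 - \theta_4\partial_k\theta_3.
\]

We would like to show that the expression

\[
\det\Theta(z)^{-1}\left[(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z)\right]
\]

is a constant. First note that, by considering the jump and singularity structure of $\det\Theta(z)$, we have

\[
\det\Theta(z) = g(z)\det\Theta(\infty)\,g(\infty)^{-1},
\]

where $g(z)$ is given by (2.3). Since the $\Theta_{ij}(z)$'s have square root singularities at the points $z = z_j^{-1}$, the functions $(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z)$ can have at most simple poles at the points $(z_j)^{\pm 1}$, $j = 1, \ldots, 2n$. Near each of these points, they behave like

\[
\begin{aligned}
(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z) &= A_0^j + A_1^j (z - z_j)^{\frac{1}{2}} + O(z - z_j), & z &\to z_j, \\
(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z) &= B_0^j (z - z_j^{-1})^{-1} + B_1^j (z - z_j^{-1})^{-\frac{1}{2}} + O(1), & z &\to z_j^{-1}.
\end{aligned}
\]

Since $\rho(\Delta)(z) = -\Delta(z)$, $\rho(\omega)(z) = -\omega(z)$ and $\rho(z - \lambda_1) = z - \lambda_1$, we have $\rho(h_1^2)(z) = h_2^2(z)$, $\rho(\theta_1)(z) = \theta_3(z)$, $\rho(\theta_2)(z) = \theta_4(z)$ and

\[
(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z) = (h_1(z))^2 G_{1k}(z) - \rho\left((h_1)^2 G_{1k}\right)(z). \tag{7.5}
\]

Since the action of $\rho$ on a Laurent series near a branch point $\lambda_j$ is given by

\[
\rho\left(\sum_{k=-\infty}^{\infty} X_k (z - \lambda_j)^{\frac{k}{2}}\right) = \sum_{k=-\infty}^{\infty} X_k \left(-(z - \lambda_j)\right)^{\frac{k}{2}},
\]

by (7.5) we obtain $A_0^j = B_0^j = 0$ for all $j$. Hence, the function

\[
\det\Theta(z)^{-1}\left[(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z)\right] \tag{7.6}
\]

does not have any pole on $\mathcal{L}$. To see that it does not have jumps either, let us consider

\[
(h_1(z))^2 G_{1k}(z) = (h_1(z))^2\left(\theta_2\partial_k\theta_1 - \theta_1\partial_k\theta_2\right).
\]

The periodicity of the term inside the brackets is given by Proposition 2:

\[
\begin{aligned}
\theta_1\partial_k\theta_2\,(z + a_j) &= \theta_1\partial_k\theta_2, \\
\theta_1\partial_k\theta_2\,(z + b_j) &= \theta_1\partial_k\theta_2\,e^{-2\pi i(2\omega_j(z) - 2\kappa_j + \tau_j + \Pi_{jj})} - \theta_1\theta_2\,(2\pi i\delta_{jk})\,e^{-2\pi i(2\omega_j(z) - 2\kappa_j + \tau_j + \Pi_{jj})}, \\
\theta_2\partial_k\theta_1\,(z + a_j) &= \theta_2\partial_k\theta_1, \\
\theta_2\partial_k\theta_1\,(z + b_j) &= \theta_2\partial_k\theta_1\,e^{-2\pi i(2\omega_j(z) - 2\kappa_j + \tau_j + \Pi_{jj})} - \theta_2\theta_1\,(2\pi i\delta_{jk})\,e^{-2\pi i(2\omega_j(z) - 2\kappa_j + \tau_j + \Pi_{jj})},
\end{aligned}
\]

where $\omega_j(z) = \int_{\lambda_1}^{z} d\omega_j$ is the $j$-th component of the vector $\omega(z)$. Hence the multiplicative factor picked up by $G_{1k}(z)$ after going around a $b$-cycle cancels exactly with the factor picked up by $(h_1(z))^2$. It follows that the function (7.6) does not have jumps on $\mathcal{L}$ either. Hence, they are holomorphic functions on $\mathcal{L}$ without any pole and must be constants. These constants can be computed by taking $z = \infty$. In other words, they are given by (6.14). We therefore have

\[
\det\Theta(z)^{-1}\left[(h_1(z))^2 G_{1k}(z) - (h_2(z))^2 G_{2k}(z)\right] = \partial_k\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right].
\]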

This proves the theorem.
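Passing from the $\beta$-derivative in Theorem 6 to a $\lambda$-derivative relies on the chain rule with $d\beta/d\lambda = -\,i/(\pi(1-\lambda^2))$, so that $s(\lambda)$ is exactly $\frac{d}{d\lambda}\log[\theta(\beta\vec{e}+\tau/2)\,\theta(\beta\vec{e}-\tau/2)]$. The sketch below checks this derivative by finite differences. It assumes $\beta(\lambda) = \frac{1}{2\pi i}\log\frac{\lambda+1}{\lambda-1}$, which is defined outside this section; the assumption is at least consistent with the bound $|\beta| \le \frac{1}{2\pi}\log(2\epsilon^{-1}+1)$ quoted in Remark 4. The sample values `lam`, `step` and `eps` are arbitrary.

```python
import cmath
import math

def beta(lam):
    """Assumed definition: beta(lam) = log((lam + 1)/(lam - 1)) / (2*pi*i)."""
    return cmath.log((lam + 1) / (lam - 1)) / (2j * math.pi)

def dbeta_exact(lam):
    """Claimed closed form of d(beta)/d(lam)."""
    return -1j / (math.pi * (1 - lam**2))

lam = 1.8      # sample point on the half-line lam > 1
step = 1e-6    # finite-difference step

# Central difference approximation of d(beta)/d(lam).
dbeta_numeric = (beta(lam + step) - beta(lam - step)) / (2 * step)
err_derivative = abs(dbeta_numeric - dbeta_exact(lam))

# Boundary value from Remark 4: |beta(1 + eps)| = log(2/eps + 1) / (2*pi).
eps = 0.01
err_bound = abs(abs(beta(1 + eps)) - math.log(2 / eps + 1) / (2 * math.pi))

# beta is purely imaginary for real lam > 1.
real_part = abs(beta(lam).real)
```

Under this assumption the prefactor $-\,i/(\pi(1-\lambda^2))$ in Theorem 6 is precisely $d\beta/d\lambda$.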

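Theorem 6, in its turn, yields our main asymptotic result.

Theorem 7. Let $\Omega_\epsilon$ be the domain of solvability (5.6). Then the logarithmic derivative of the Toeplitz determinant $D_L(\lambda)$ admits the following asymptotic representation, which is uniform in $\lambda \in \Omega_\epsilon$:

\[
\frac{d}{d\lambda}\log D_L(\lambda) = -\frac{2\lambda}{1-\lambda^2}L + \frac{d}{d\lambda}\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right] + O\left(\frac{\rho^{-L}}{\lambda^2}\right), \qquad L \to \infty. \tag{7.7}
\]

Here $\rho$ is any real number satisfying the inequality $1 < \rho < \min\{|\lambda_j| : |\lambda_j| > 1\}$.

The uniformity of the estimate (7.7) with respect to $\lambda \in \Omega_\epsilon$ allows its integration over $\Omega_\epsilon$, which yields the equation

\[
\log\left[D_L(\lambda)(1-\lambda^2)^{-L}\right] - \lim_{s\to\infty}\log\left[D_L(s)(1-s^2)^{-L}\right] = \log\frac{\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\left(\frac{\tau}{2}\right)} + r(L),
\]

where $r(L) = O(\rho^{-L})$ as $L \to \infty$. Taking into account (4.11), the second term in the left hand side is zero. This proves Proposition 1.

8. The Limiting Entropy

Observe that Eq. (4.10) can also be rewritten as

\[
S_L(\rho_A) = \lim_{\epsilon\to 0^+}\frac{1}{4\pi i}\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\frac{d}{d\lambda}\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right]d\lambda. \tag{8.1}
\]

The right hand side of this equation follows from

\[
\begin{aligned}
\lim_{\epsilon\to 0^+}\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\frac{d}{d\lambda}\log(\lambda^2-1)^{-L}\,d\lambda
&= L\lim_{\epsilon\to 0^+}\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\frac{2\lambda}{1-\lambda^2}\,d\lambda \\
&= 2\pi i L\lim_{\epsilon\to 0^+}\left[\mathrm{res}_{\lambda=1}\, e(1+\epsilon, \lambda)\frac{2\lambda}{1-\lambda^2} + \mathrm{res}_{\lambda=-1}\, e(1+\epsilon, \lambda)\frac{2\lambda}{1-\lambda^2}\right] \\
&= 2\pi i L\lim_{\epsilon\to 0^+}\left[(2+\epsilon)\log\frac{2+\epsilon}{2} + \epsilon\log\frac{\epsilon}{2}\right] = 0.
\end{aligned}
\]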
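The residue computation above can be reproduced numerically. The sketch assumes the usual entropy kernel $e(x,\nu) = -\frac{x+\nu}{2}\log\frac{x+\nu}{2} - \frac{x-\nu}{2}\log\frac{x-\nu}{2}$, whose definition lies outside this section; this form is consistent with the factors $\frac{1+\epsilon\pm\lambda}{2}\log\frac{1+\epsilon\pm\lambda}{2}$ appearing in (8.3). It also uses the fact that $2\lambda/(1-\lambda^2)$ has residue $-1$ at both $\lambda = 1$ and $\lambda = -1$.

```python
import math

def e(x, nu):
    """Assumed entropy kernel e(x, nu); defined only for x > |nu|."""
    return (-(x + nu) / 2 * math.log((x + nu) / 2)
            - (x - nu) / 2 * math.log((x - nu) / 2))

def residue_sum(eps):
    """Sum of the residues of e(1+eps, lam) * 2*lam/(1 - lam^2) at lam = +1 and -1.
    The rational factor has residue -1 at each pole, and e is regular there."""
    return -(e(1 + eps, 1.0) + e(1 + eps, -1.0))

def closed_form(eps):
    """Closed form quoted in the text: (2+eps)*log((2+eps)/2) + eps*log(eps/2)."""
    return (2 + eps) * math.log((2 + eps) / 2) + eps * math.log(eps / 2)

# The two expressions agree for a generic eps ...
err_match = abs(residue_sum(0.1) - closed_form(0.1))

# ... and the closed form tends to zero as eps -> 0+.
small_values = [abs(closed_form(10.0 ** (-k))) for k in (2, 4, 6, 8)]
```

The decay is slow, of order $\epsilon\log\epsilon$, which is why the limit $\epsilon \to 0^+$ is taken only after the $L$-dependent term has been removed.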


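We identify the limiting entropy $S(\rho_A)$ as the following double limit (cf. [13]):

\[
S(\rho_A) = \lim_{\epsilon\to 0^+}\left[\lim_{L\to\infty}\frac{1}{4\pi i}\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\frac{d}{d\lambda}\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right]d\lambda\right]. \tag{8.2}
\]

We now want to apply Theorem 7 and evaluate the large $L$ limit in the right hand side of this equation. To this end we need first to replace the integration along the contour $\Gamma(\epsilon)$ by the integration along a subset of the set $\Omega_\epsilon$ where we can use the uniform asymptotic formula (7.7). Let us define

\[
\delta(\lambda) := \frac{d}{d\lambda}\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right].
\]

The function $\delta(\lambda)$ satisfies the following properties.

1. $\delta(\lambda)$ is analytic outside of the interval $[-1, 1]$.
2. $\delta(-\lambda) = -\delta(\lambda)$.
3. $\delta(\lambda) = O(\lambda^{-3})$, $\lambda \to \infty$.
4. $\delta(\lambda) = O(\log|1-\lambda^2|)$, $\lambda \to \pm 1$.

Consider the identity

\[
\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\frac{d}{d\lambda}\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right]d\lambda \equiv \oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\delta(\lambda)\,d\lambda.
\]

Property 1 allows us to replace the contour of integration $\Gamma(\epsilon)$ by the large contour $\Gamma'$ as depicted in Fig. 1, so that

\[
\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\delta(\lambda)\,d\lambda = \oint_{\Gamma'} e(1+\epsilon, \lambda)\,\delta(\lambda)\,d\lambda.
\]

Simultaneously, Property 3 allows us to push $R \to \infty$ in the right hand side of the last formula and hence re-write it as the relation

\[
\begin{aligned}
\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\delta(\lambda)\,d\lambda
={}& \int_{-\infty}^{-1-\epsilon}\delta(\lambda)\left[-\frac{1+\epsilon+\lambda}{2}\left(\log_+\frac{1+\epsilon+\lambda}{2} - \log_-\frac{1+\epsilon+\lambda}{2}\right)\right]d\lambda \\
&+ \int_{1+\epsilon}^{\infty}\delta(\lambda)\left[-\frac{1+\epsilon-\lambda}{2}\left(\log_+\frac{1+\epsilon-\lambda}{2} - \log_-\frac{1+\epsilon-\lambda}{2}\right)\right]d\lambda.
\end{aligned}
\tag{8.3}
\]

Here $\log_+\frac{1+\epsilon\pm\lambda}{2}$ and $\log_-\frac{1+\epsilon\pm\lambda}{2}$ denote, respectively, the upper and lower boundary values of the functions $\log\frac{1+\epsilon\pm\lambda}{2}$ on the real axis. We note that

\[
\log_+\frac{1+\epsilon+\lambda}{2} - \log_-\frac{1+\epsilon+\lambda}{2} = 2\pi i, \qquad \text{for all } \lambda < -1-\epsilon,
\]

and

\[
\log_+\frac{1+\epsilon-\lambda}{2} - \log_-\frac{1+\epsilon-\lambda}{2} = -2\pi i, \qquad \text{for all } \lambda > 1+\epsilon.
\]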

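Therefore, Eq. (8.3) becomes

\[
\begin{aligned}
\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\delta(\lambda)\,d\lambda
&= -\pi i\int_{-\infty}^{-1-\epsilon}(1+\epsilon+\lambda)\,\delta(\lambda)\,d\lambda + \pi i\int_{1+\epsilon}^{\infty}(1+\epsilon-\lambda)\,\delta(\lambda)\,d\lambda \\
&= 2\pi i\int_{1+\epsilon}^{\infty}(1+\epsilon-\lambda)\,\delta(\lambda)\,d\lambda,
\end{aligned}
\tag{8.4}
\]

where we have also taken into account the oddness of the function $\delta(\lambda)$, i.e. Property 2. Recalling the definition of the function $\delta(\lambda)$, we arrive at

\[
\oint_{\Gamma(\epsilon)} e(1+\epsilon, \lambda)\,\frac{d}{d\lambda}\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right]d\lambda = 2\pi i\int_{1+\epsilon}^{\infty}(1+\epsilon-\lambda)\,\frac{d}{d\lambda}\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right]d\lambda. \tag{8.5}
\]

The estimate (7.7) can be used in the right hand side of formula (8.5). This enables us to perform an explicit evaluation of the large $L$ limit in (8.2) so that the formula for the entropy $S(\rho_A)$ becomes

\[
\begin{aligned}
S(\rho_A) &= \frac{1}{2}\lim_{\epsilon\to 0^+}\int_{1+\epsilon}^{\infty}(1+\epsilon-\lambda)\,\frac{d}{d\lambda}\log\left[\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)\right]d\lambda \\
&= \frac{1}{2}\lim_{\epsilon\to 0^+}\int_{1+\epsilon}^{\infty}\log\frac{\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\left(\frac{\tau}{2}\right)}\,d\lambda.
\end{aligned}
\tag{8.6}
\]

To complete the evaluation of the entropy, we need to prove the existence of this limit.

9. Integrability at ±1. The Final Formula for the Entropy

We will now prove the integrability of the function

\[
\log\frac{\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\left(\frac{\tau}{2}\right)}
\]

at $\pm 1$. First let us denote the real and imaginary parts of the period matrix $\Pi$ by $\mathrm{Re}\,\Pi$ and $\mathrm{Im}\,\Pi$. Since $\mathrm{Im}\,\Pi$ is non-singular, there exists a real vector $\vec{v}$ such that

\[
\vec{e} = \mathrm{Im}\,\Pi\,\vec{v}.
\]

We can now write

\[
i\vec{e} = (\Pi - \mathrm{Re}\,\Pi)\,\vec{v}.
\]

Let $Q$ be a large real number, and let $\vec{m}$ be an integer vector such that

\[
Q\vec{v} = \vec{m} + \vec{q},
\]

where the entries of $\vec{q}$ are between 0 and 1. In particular, we have

\[
\vec{m} = Q\,(\mathrm{Im}\,\Pi)^{-1}\vec{e} - \vec{q}. \tag{9.1}
\]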


Then, from the periodicity of the theta function (6.6), we see that → − → → → m +− q )T ( − Re) + − θ i Q− e +→ c 0 = θ (− c0 ( ) → → e , (Im )−1 Re (Im )−1 − e = exp Q 2 π i −

) ( → → e + − e , (Im )−1 − ) ) ( ( → → → → → → e ,− q q + − e , (Im )−1 − c 0 + 2i − − 2iπ Q − e , (Im )−1 Re − 0 /→ − → − + iπ − q , → q +2 − q ,→ c0 → → → → c 0+− q T ×θ −(− m +− q )T Re + −

(9.2)

→ for some bounded constant − c 0. − → → Note that there exists an integer vector l and real vector − r with entries between 0 and 1 such that − → → → → r . (− m +− q )T Re = l + − Therefore, we have → → → → → → → θ −(− m +− q )T Re + − c 0+− q T = θ −− r +− c 0+− q T . → − If log θ i Q − e +→ c 0 is non-zero for all Q, then from (9.2) we see that ) ( → − → → log θ i Q − e +→ c 0 = Q2π i − e e , (Im )−1 Re (Im )−1 − ) ( N (Q, c0 ) → − → −1 − −1 + O(Q ) , e + 2i + e , (Im ) Q2

Q → ∞,

where N (Q, c0 ) is an integer that depends on the branch of the logarithm. It may depend → on Q and − c 0 . This term arises because in the integral expression of the entropy, → → θ β(λ)− e + τ2 θ β(λ)− e − τ2 1 ∞ log dλ, (9.3) 2 1+

θ 2 τ2 the branch of the logarithm must be chosen so that the integrand is continuous in λ. We shall determine the asymptotic behavior of N (Q, c0 ) as Q → ∞. Due to Theorem 9, the inequality (6.15) is true when β(λ) ∈ iR. Therefore, we can apply the above result to compute the asymptotic behavior of the integrand in (9.3): → → ( ) θ β(λ)− e + τ2 θ β(λ)− e − τ2 − → → 2 −1 − τ π e , ) log e = −2β(λ) (Im θ2 2 ) ( → → e +i − e , (Im )−1 Re (Im )−1 − N (β(λ), τ2 ) + N (β(λ), − τ2 ) β(λ)2 +O(β(λ)−1 ) .

−2i

(9.4)


Since $D_L(\lambda)$ in (4.11) is real and positive for $\lambda \in (1, \infty)$, and $\log\left[D_L(\lambda)(\lambda^2-1)^{-L}\right]$ has to be zero at $\lambda = \infty$ (which is needed to deform the contour to obtain (8.6)), we see that $\log D_L(\lambda)$ has to be real for $\lambda \in (1, \infty)$. Therefore, the imaginary part of the leading order term in (9.4) must be zero. In particular, this means that

\[
\left\langle\vec{e}, (\mathrm{Im}\,\Pi)^{-1}\,\mathrm{Re}\,\Pi\,(\mathrm{Im}\,\Pi)^{-1}\vec{e}\right\rangle - 2\,\frac{N(\beta(\lambda), \frac{\tau}{2}) + N(\beta(\lambda), -\frac{\tau}{2})}{\beta(\lambda)^2} = O(\beta(\lambda)^{-1}).
\]

Thus, the asymptotic behavior of the integrand in (9.3) is

\[
\log\frac{\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\left(\frac{\tau}{2}\right)} = -2\pi\beta(\lambda)^2\left(\left\langle\vec{e}, (\mathrm{Im}\,\Pi)^{-1}\vec{e}\right\rangle + O(\beta(\lambda)^{-1})\right), \qquad \lambda \to 1^+. \tag{9.5}
\]

The left hand side of this equation is therefore integrable at $\lambda = 1^+$ and we can take the limit $\epsilon \to 0$ in (8.6) to obtain our final result for the entropy:

\[
S(\rho_A) = \frac{1}{2}\int_{1}^{\infty}\log\frac{\theta\left(\beta(\lambda)\vec{e} + \frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e} - \frac{\tau}{2}\right)}{\theta^2\left(\frac{\tau}{2}\right)}\,d\lambda. \tag{9.6}
\]

10. Critical Behavior as Roots of g(z) Approaches the Unit Circle

The purpose of this section is to prove Theorem 2. We shall study the critical behavior of the entropy of entanglement as some pairs of the roots (2.4) approach the unit circle. As we discussed in Sect. 2, in each pair one root lies inside the unit circle, while the other lies outside. In this limit the entropy becomes singular. We shall study all the possible cases of such degeneracy, namely the following three:

1. the limit of two real roots approaching 1;
2. the limit of $2r$ pairs of complex roots approaching the unit circle;
3. the limit of $2r$ pairs of complex roots approaching the unit circle together with one pair of real roots approaching 1.

When pairs of roots in (2.4) approach the unit circle, the period matrix $\Pi$ in the definition of the theta function (2.10) becomes degenerate and some of its entries tend to zero. This will lead to a divergence in the sum (2.10) and hence a divergence in the entropy. It is very difficult to study such divergence directly from the sum (2.10). In order to compute such limits, we need to perform modular transformations to the theta functions. In particular, the following theorem from [7] will be used throughout the whole section.

Theorem 8. If the canonical bases of cycles $(\tilde A, \tilde B)$ and $(A, B)$ are related by

\[
\begin{pmatrix}\tilde A \\ \tilde B\end{pmatrix} = Z\begin{pmatrix} A \\ B\end{pmatrix} = \begin{pmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22}\end{pmatrix}\begin{pmatrix} A \\ B\end{pmatrix},
\]

where the matrix $Z$ is symplectic, i.e.

\[
Z^T\begin{pmatrix} 0 & -I_{2n-1} \\ I_{2n-1} & 0\end{pmatrix}Z = \begin{pmatrix} 0 & -I_{2n-1} \\ I_{2n-1} & 0\end{pmatrix}, \qquad Z^{-1} = \begin{pmatrix} Z_{22}^T & -Z_{12}^T \\ -Z_{21}^T & Z_{11}^T\end{pmatrix},
\]


then we have the following relations between the theta functions with different period matrices: ε˜ ε T T ˜ T −1 T ˜ ˜ ˜ (10.1) θ (ξ, ) = ς exp −πi ξ (−Z 12 + Z 22 ) Z 12 ξ θ (ξ˜ , ), ε˜ ε where

T ˜ T T ξ˜ = (−Z 12 + Z 22 ) ξ

(10.2)

and ς is a constant. The characteristics of the theta functions are related by T T T ε = Z 22 ε˜ + Z 12 ε˜ − diag Z 12 Z 22 , T T T ε = Z 21 ε˜ + Z 11 ε˜ − diag Z 11 Z 21 , where diag(C D T ) is a column vector whose entries are the diagonal elements of C D T . The new period matrix is given by ˜ = (Z 22 + Z 21 ) (Z 12 + Z 11 )−1

(10.3)

and the normalized one forms are related by

\[
d\tilde\Omega = \left(-Z_{12}^T\tilde\Pi + Z_{22}^T\right)^T d\Omega, \tag{10.4}
\]

\[
d\tilde\Omega^T = (d\tilde\omega_1, \ldots, d\tilde\omega_{2n-1})^T, \qquad d\Omega^T = (d\omega_1, \ldots, d\omega_{2n-1})^T,
\]

which is the same transformation as in (10.2).

Our aim is to find a good choice of basis $(\tilde A, \tilde B)$ such that $\theta(\tilde\xi, \tilde\Pi)$ remains finite while some entries of $\tilde\Pi$ tend to infinity as certain pairs of roots $\lambda_j$ approach the unit circle. This would confine the divergence of the entropy within the exponential factor in (10.1), which can be computed.

10.1. The limit of two real roots approaching 1. In this section the choice of the basis $(\tilde A, \tilde B)$ described in Theorem 8 is the one shown in Fig. 5. In the notation of Theorem 8, the new basis $(\tilde A, \tilde B)$ and the old one $(A, B)$ are related by

\[
\begin{aligned}
&\begin{pmatrix}\tilde A \\ \tilde B\end{pmatrix} = Z\begin{pmatrix} A \\ B\end{pmatrix}, \qquad Z = \begin{pmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22}\end{pmatrix} = \begin{pmatrix} 0 & -C_2 \\ C_1 & 0\end{pmatrix}, \\
&\tilde A^T = (\tilde a_1, \ldots, \tilde a_{2n-1})^T, \qquad \tilde B^T = (\tilde b_1, \ldots, \tilde b_{2n-1})^T, \\
&A^T = (a_1, \ldots, a_{2n-1})^T, \qquad B^T = (b_1, \ldots, b_{2n-1})^T, \\
&(C_1)_{ij} = 1,\ j \ge i, \qquad (C_1)_{ij} = 0,\ j < i, \\
&(C_2)_{ii} = 1, \qquad (C_2)_{i,i-1} = -1, \qquad (C_2)_{ij} = 0,\ j \ne i, i-1, \\
&C_1^T = C_2^{-1}.
\end{aligned}
\tag{10.5}
\]
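The algebraic claims packed into (10.5) are easy to test for a small genus. The following sketch builds $C_1$, $C_2$ and $Z$ for the sample value $n = 3$ (an arbitrary choice) and checks that $C_1^T = C_2^{-1}$, that $Z$ satisfies the symplectic condition of Theorem 8, and that $C_2$ maps the vector $\vec{e}$ (last $n$ entries equal to 1) to the $n$-th unit vector. The last fact appears to be the reason why only the $n$-th column of $\tilde\Pi$ multiplies $\beta(\lambda)$ in Lemma 3, although that reading is an inference from (10.2) rather than a statement made in the text.

```python
def matmul(a, b):
    """Product of two square integer matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def negate(a):
    return [[-x for x in row] for row in a]

def identity(size):
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]

def zeros(size):
    return [[0] * size for _ in range(size)]

def block(a, b, c, d):
    """Assemble the block matrix [[a, b], [c, d]] from equal-size square blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

n = 3                 # sample value; the genus is 2n - 1
size = 2 * n - 1

# (C1)_ij = 1 for j >= i; (C2)_ii = 1 and (C2)_{i,i-1} = -1, as in Eq. (10.5).
c1 = [[1 if j >= i else 0 for j in range(size)] for i in range(size)]
c2 = [[1 if j == i else (-1 if j == i - 1 else 0) for j in range(size)]
      for i in range(size)]

z = block(zeros(size), negate(c2), c1, zeros(size))
j_mat = block(zeros(size), negate(identity(size)), identity(size), zeros(size))

inverse_ok = matmul(transpose(c1), c2) == identity(size)       # C1^T = C2^{-1}
symplectic_ok = matmul(matmul(transpose(z), j_mat), z) == j_mat

# C2 applied to e = (0, ..., 0, 1, ..., 1) with n trailing ones.
e_vec = [0] * (n - 1) + [1] * n
c2_e = [sum(c2[i][k] * e_vec[k] for k in range(size)) for i in range(size)]
unit_n = [1 if i == n - 1 else 0 for i in range(size)]
```

All arithmetic here is over the integers, so the comparisons are exact; changing `n` to any value of at least 2 should leave the three checks true.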


Fig. 5. The choice of cycles on the hyperelliptic curve L. The arrows denote the orientations of the cycles and branch cuts

Fig. 6. As λ2n → λ−1 2n , integration around a˜ n becomes a residue integral around z = 1

The relation between the two period matrices can be found using (10.3),

$$\tilde{\Pi} = -C_1\,\Pi^{-1}\,C_2^{-1}. \tag{10.6}$$

To study the behavior of the entropy as the real roots $\lambda_{2n} \to \lambda_{2n}^{-1}$, we need to know the behavior of the period matrix $\tilde{\Pi}$ in this limit. Now, we have

$$w_0 = \lim_{\lambda_{2n}\to\lambda_{2n}^{-1}} \left(\prod_{i=1}^{4n}(z-\lambda_i)\right)^{1/2} = (z-1)\left(\prod_{i\ne 2n,\,2n+1}(z-\lambda_i)\right)^{1/2}. \tag{10.7}$$

Furthermore, as $\lambda_{2n} \to \lambda_{2n}^{-1}$ the integration around $\tilde{a}_n$ tends to the residue at $z = 1$; the hyperelliptic curve $\mathcal{L}$ becomes a singular hyperelliptic curve $\mathcal{L}_0$ of genus $2n-2$; the tilded basis of canonical cycles on this curve reduces to

$$\tilde{A}_0^{T} = (\tilde{a}_1,\dots,\tilde{a}_{n-1},\tilde{a}_{n+1},\dots,\tilde{a}_{2n-1})^{T}, \qquad \tilde{B}_0^{T} = (\tilde{b}_1,\dots,\tilde{b}_{n-1},\tilde{b}_{n+1},\dots,\tilde{b}_{2n-1})^{T}. \tag{10.8}$$

The holomorphic 1-forms $d\tilde{\omega}_j$ tend to the following limit [1]:

$$d\tilde{\omega}_j^{0} = \frac{\varphi_j(z)}{w_0}\,dz,$$

where $\varphi_j(z)$ are degree $2n-2$ polynomials determined by the normalization conditions

$$\oint_{\tilde{a}_j} d\tilde{\omega}_k^{0} = \delta_{kj}, \quad j \ne n, \qquad 2\pi i\,\mathrm{Res}_{z=1,\,w=w_0(1)}\, d\tilde{\omega}_k^{0} = \delta_{kn}.$$

Therefore, the 1-forms $d\tilde{\omega}_k^{0}$, $k \ne n$, become the holomorphic 1-forms that are dual to the basis $\tilde{A}_0$ on $\mathcal{L}_0$. Furthermore, $d\tilde{\omega}_n^{0}$ becomes a normalized meromorphic 1-form with simple poles at the points above $z = 1$ on $\mathcal{L}_0$.


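As in [1], we see that the entries of the period matrix $\tilde{\Pi}$ tend to the following limits:

$$\lim_{\lambda_{2n}\to\lambda_{2n}^{-1}} \tilde{\Pi}_{jk} = \tilde{\Pi}^{0}_{jk}, \quad (j,k) \ne (n,n),$$

$$\tilde{\Pi}_{nn} = 2\sum_{j=1}^{n}\int_{\lambda_{2j-1}}^{\lambda_{2j}} d\tilde{\omega}_n = \frac{1}{\pi i}\log\left|\lambda_{2n}^{-1} - \lambda_{2n}\right| + O(1), \quad \lambda_{2n}\to\lambda_{2n}^{-1},$$

where $\tilde{\Pi}^{0}_{ij}$ is finite for $(i,j) \ne (n,n)$.

Let us adopt the notation of Theorem 8 and denote the argument of the theta function in the entropy (9.6) by $\xi$, that is

$$\xi = \beta(\lambda)\,\vec{e} \pm \frac{\tau}{2}. \tag{10.9}$$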

We will now compute the behavior of the argument $\tilde{\xi}$ in (10.1) with $\xi$ given by (10.9). We have

Lemma 3. Let $\xi$ be given by (10.9) and $\tilde{\xi}$ be

$$\tilde{\xi} = \left(-Z_{12}^{T}\tilde{\Pi} + Z_{22}^{T}\right)^{T}\xi,$$

where $Z_{ij}$ are given by (10.5). Then as $\lambda_{2n}\to\lambda_{2n}^{-1}$ we have

$$\tilde{\xi}_i = \beta(\lambda)\,\tilde{\Pi}_{in} \pm \eta_i, \quad i = 1,\dots,2n-1, \tag{10.10}$$

where $\eta_i$ remains finite as $\lambda_{2n}\to\lambda_{2n}^{-1}$.

Proof. To begin with, we will need to express $\frac{\tau}{2}$ in terms of the Abel map. Recall that the term $\frac{\tau}{2}$ in (2.12) is given by

$$\frac{\tau}{2} = -\sum_{j=2}^{2n}\omega(z_j^{-1}) - K,$$

where $K$ is the Riemann constant. As in [6] (see also Appendix D), the Riemann constant can be expressed as a sum of images of branch points under the Abel map. In particular, we have

$$K = -\sum_{j=2}^{2n}\omega(\lambda_{2j-1}).$$

Therefore we have

$$\frac{\tau}{2} = -\sum_{j=2}^{2n}\omega(z_j^{-1}) + \sum_{j=2}^{2n}\omega(\lambda_{2j-1}).$$


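Now by substituting (10.5) into (10.2) and making use of (10.4) and (10.6), we see that the argument $\tilde{\xi}$ in $\theta(\tilde{\xi},\tilde{\Pi})$ can be expressed as follows:

$$\tilde{\xi}_i = \beta(\lambda)\,\tilde{\Pi}_{in} \pm \left(\sum_{j=2}^{2n}\tilde{\omega}_i(z_j^{-1}) - \sum_{j=1}^{2n}\tilde{\omega}_i(\lambda_{2j-1})\right), \quad i = 1,\dots,2n-1, \tag{10.11}$$

where $\tilde{\omega}$ is the Abel map with $d\omega$ replaced by $d\tilde{\omega}$ and $\tilde{\omega}_i$ is the $i$th component of the map. We would like to show that the term

$$\sum_{j=2}^{2n}\tilde{\omega}_i(z_j^{-1}) - \sum_{j=1}^{2n}\tilde{\omega}_i(\lambda_{2j-1})$$

in (10.11) remains finite as $\lambda_{2n}\to\lambda_{2n}^{-1}$. To see this, note that the set of points $\{z_j^{-1}\}$ must contain either one of the points $\lambda_{2n}$ or $\lambda_{2n}^{-1}$, but not both, while $\{\lambda_{2i-1}\}$ contains $\lambda_{2n}^{-1}$ only. As $\lambda_{2n}\to\lambda_{2n}^{-1}$, the terms $\tilde{\omega}_n(\lambda_{2n})$ and $\tilde{\omega}_n(\lambda_{2n}^{-1})$ in the sum in Eq. (10.11) will tend to $-\infty$. However, since they appear in the sum with opposite signs, these contributions cancel and the quantity

$$\sum_{j=2}^{2n}\tilde{\omega}_n(z_j^{-1}) - \sum_{j=1}^{2n}\tilde{\omega}_n(\lambda_{2j-1})$$

remains finite as $\lambda_{2n}\to\lambda_{2n}^{-1}$. We can therefore write $\tilde{\xi}$ as

$$\tilde{\xi}_i = \beta(\lambda)\,\tilde{\Pi}_{in} \pm \eta_i, \quad i = 1,\dots,2n-1,$$

where $\eta_i$ remains finite as $\lambda_{2n}\to\lambda_{2n}^{-1}$.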

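We are now ready to apply Theorem 8 to compute the theta function as $\lambda_{2n}\to\lambda_{2n}^{-1}$.

Lemma 4. In the limit $\lambda_{2n}\to\lambda_{2n}^{-1}$ the theta function $\theta(\xi,\Pi)$ behaves like

$$\theta(\xi,\Pi) = \exp\left(\log\left|\lambda_{2n}-\lambda_{2n}^{-1}\right|\beta^{2}(\lambda) + O(1)\right), \tag{10.12}$$

where $\xi$ is given by (10.9).

Proof. Firstly, let us use (10.1) and (10.6) to express $\theta(\xi,\Pi)$ in terms of $\theta(\tilde{\xi},\tilde{\Pi})$; we have

$$\theta(\xi,\Pi) = \varsigma\exp\left(\pi i\,\tilde{\xi}^{T}\tilde{\Pi}^{-1}\tilde{\xi}\right)\theta\left(\tilde{\xi},\tilde{\Pi}\right). \tag{10.13}$$

Let us now use (10.10) to compute the asymptotics of the exponential term in (10.13). We obtain

$$\tilde{\xi}^{T}\tilde{\Pi}^{-1}\tilde{\xi} = \sum_{i,j}\left(\tilde{\Pi}^{-1}\right)_{ij}\tilde{\xi}_i\tilde{\xi}_j.$$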


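The behavior of the entries in $\tilde{\Pi}^{-1}$ can be calculated by computing the determinant and the minors. As $\lambda_{2n}\to\lambda_{2n}^{-1}$ we have

$$\left(\tilde{\Pi}^{-1}\right)_{ij} = O(1), \quad i,j \ne n,$$

$$\left(\tilde{\Pi}^{-1}\right)_{nj} = O\left(\log^{-1}\left|\lambda_{2n}-\lambda_{2n}^{-1}\right|\right), \quad j \ne n,$$

$$\left(\tilde{\Pi}^{-1}\right)_{nn} = \pi i\,\log^{-1}\left|\lambda_{2n}-\lambda_{2n}^{-1}\right| + O\left(\log^{-2}\left|\lambda_{2n}-\lambda_{2n}^{-1}\right|\right).$$

Therefore, the sum above becomes

$$\pi i\sum_{i,j}\left(\tilde{\Pi}^{-1}\right)_{ij}\tilde{\xi}_i\tilde{\xi}_j = \log\left|\lambda_{2n}-\lambda_{2n}^{-1}\right|\beta^{2}(\lambda) + O(1), \quad \lambda_{2n}\to\lambda_{2n}^{-1}. \tag{10.14}$$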
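The leading term of (10.14) comes entirely from the $nn$ entry. As a sketch, write $L = \log|\lambda_{2n}-\lambda_{2n}^{-1}|$ (a shorthand introduced here) and use only $\tilde{\Pi}_{nn} = L/(\pi i) + O(1)$, $\tilde{\xi}_n = \beta(\lambda)\tilde{\Pi}_{nn} + O(1)$ and the estimate for $(\tilde{\Pi}^{-1})_{nn}$:

```latex
\pi i\,\bigl(\tilde{\Pi}^{-1}\bigr)_{nn}\,\tilde{\xi}_n^{\,2}
  = \pi i\left(\frac{\pi i}{L} + O\bigl(L^{-2}\bigr)\right)
    \left(\frac{\beta(\lambda)\,L}{\pi i} + O(1)\right)^{2}
  = \beta^{2}(\lambda)\,L + O(1).
```

The mixed terms $(\tilde{\Pi}^{-1})_{nj}\tilde{\xi}_n\tilde{\xi}_j$ with $j \ne n$ are $O(L^{-1})\cdot O(L)\cdot O(1) = O(1)$, and the terms with $i,j \ne n$ are $O(1)$, so they do not affect the leading behavior.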

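Next, we will use the definition (2.10) of the theta function to compute its limit as $\lambda_{2n}\to\lambda_{2n}^{-1}$. We have

$$\theta(\tilde{\xi},\tilde{\Pi}) = \sum_{\vec{m}\in\mathbb{Z}^{2n-1}}\exp\Biggl[\pi i\sum_{jk\ne nn}\tilde{\Pi}_{jk}m_jm_k + 2\pi i\sum_{j\ne n}\left(\beta(\lambda)\tilde{\Pi}_{jn}\pm\eta_j\right)m_j + \pi i\,\tilde{\Pi}_{nn}\left(m_n^{2} + 2\beta(\lambda)m_n\right) \pm 2\pi i\,\eta_n m_n\Biggr]. \tag{10.15}$$

Since

$$\lim_{\lambda_{2n}\to\lambda_{2n}^{-1}}\mathrm{Re}\left(2\pi i\,\tilde{\Pi}_{nn}\right) = -\infty$$

and $\beta(\lambda)$ is purely imaginary, we see that in the limit only the terms with $m_n = 0$ contribute to the sum. Therefore, Eq. (10.15) reduces to

$$\lim_{\lambda_{2n}\to\lambda_{2n}^{-1}}\theta(\tilde{\xi},\tilde{\Pi}) = \theta\left(\tilde{\xi}^{0},\tilde{\Pi}^{0}\right), \qquad \tilde{\xi}^{0} = (\tilde{\xi}_1,\dots,\hat{\tilde{\xi}}_n,\dots,\tilde{\xi}_{2n-1})^{T}, \tag{10.16}$$

where the $\hat{\tilde{\xi}}_n$ in the above equation means that the $n$th entry of the vector is removed. The period matrix $\tilde{\Pi}^{0}$ is a $(2n-2)\times(2n-2)$ matrix obtained by removing the $n$th row and $n$th column of the period matrix $\tilde{\Pi}$. Thus, the theta function $\theta(\tilde{\xi}^{0},\tilde{\Pi}^{0})$ remains finite as $\lambda_{2n}\to\lambda_{2n}^{-1}$. This fact, together with (10.14), shows that $\theta(\xi,\Pi)$ behaves like

$$\theta(\xi,\Pi) = \varsigma\exp\left(\log\left|\lambda_{2n}-\lambda_{2n}^{-1}\right|\beta^{2}(\lambda) + O(1)\right)\theta\left(\tilde{\xi}^{0},\tilde{\Pi}^{0}\right), \quad \lambda_{2n}\to\lambda_{2n}^{-1}.$$

Since $\theta(\tilde{\xi}^{0},\tilde{\Pi}^{0})$ and $\varsigma$ remain finite as $\lambda_{2n}\to\lambda_{2n}^{-1}$, the above equation becomes (10.12). This proves the lemma.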
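The mechanism by which only the $m_n = 0$ terms survive can be illustrated in a one-dimensional toy model. The sketch below is not the multi-dimensional theta function of the paper: it assumes a single period $\Pi = \log(\epsilon)/(\pi i)$ and argument $\xi = \beta\Pi + \eta$ with $\beta$ purely imaginary and $\eta$ real, and the function name `toy_theta` and its parameters are hypothetical. Each term then has modulus $\epsilon^{m^2}$, so the sum tends to the $m = 0$ term as $\epsilon \to 0$.

```python
import cmath
import math


def toy_theta(eps, beta_im, eta, m_max=5):
    """One-dimensional analogue of the sum (10.15).

    Assumptions (illustrative only): period = log(eps) / (pi i),
    xi = beta * period + eta, with beta = i * beta_im purely imaginary
    and eta real. The term with index m has modulus eps ** (m * m).
    """
    beta = 1j * beta_im
    period = math.log(eps) / (1j * math.pi)
    xi = beta * period + eta
    total = 0j
    for m in range(-m_max, m_max + 1):
        total += cmath.exp(1j * math.pi * period * m * m
                           + 2j * math.pi * m * xi)
    return total
```

For small `eps` the value returned is close to 1, the contribution of `m = 0`, with corrections of order `eps` from `m = ±1`.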


Fig. 7. Two pairs of roots, labelled according to the ordering (2.5), approaching the unit circle in the critical limit. We have λ2( j+1) → λ2(2n− j)−1 , λ2n → λ2n+1 and λ2 j+1 → λ2(2n− j) respectively

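Finally, by substituting (10.12) into (9.6), we have

$$S(\rho_A) = \frac{1}{2}\int_1^{\infty}\log\frac{\theta\left(\beta(\lambda)\vec{e}+\frac{\tau}{2}\right)\theta\left(\beta(\lambda)\vec{e}-\frac{\tau}{2}\right)}{\theta^{2}\left(\frac{\tau}{2}\right)}\,d\lambda = \int_1^{\infty}\left(\log\left|\lambda_{2n}-\lambda_{2n}^{-1}\right|\beta^{2}(\lambda) + O(1)\right)d\lambda.$$

Since

$$\int_1^{\infty}\beta^{2}(\lambda)\,d\lambda = -\frac{1}{6},$$

we arrive at the following expression for the entropy of entanglement:

$$S(\rho_A) = -\frac{1}{6}\log\left|\lambda_{2n}-\lambda_{2n}^{-1}\right| + O(1), \quad \lambda_{2n}\to\lambda_{2n}^{-1}.$$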
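The value $-\frac{1}{6}$ of the integral of $\beta^{2}$ can be confirmed numerically. The sketch below assumes $\beta(\lambda) = \frac{1}{2\pi i}\log\frac{\lambda+1}{\lambda-1}$; the definition of $\beta$ lies outside this section, so this form is an assumption here. With the substitution $(\lambda-1)/(\lambda+1) = e^{-u}$ the integral of $\log^{2}\frac{\lambda+1}{\lambda-1}$ over $[1,\infty)$ becomes the integral of $2u^{2}e^{-u}/(1-e^{-u})^{2}$ over $[0,\infty)$, which is smooth and is evaluated with a composite Simpson rule. All function names are hypothetical.

```python
import math


def integrand(u):
    """2 u^2 e^{-u} / (1 - e^{-u})^2, the transformed log^2 integrand."""
    if u == 0.0:
        return 2.0  # limiting value as u -> 0
    return 2.0 * u * u * math.exp(-u) / math.expm1(-u) ** 2


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (made even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3


def beta_squared_integral(upper=60.0, steps=6000):
    """Integral of beta^2 over [1, inf), assuming
    beta = log((lam + 1) / (lam - 1)) / (2 pi i), so beta^2 = -log^2 / (4 pi^2).
    The tail beyond `upper` is exponentially small and is dropped."""
    log_squared = simpson(integrand, 0.0, upper, steps)
    return -log_squared / (4 * math.pi ** 2)
```

Under the stated assumption, `beta_squared_integral()` agrees with $-\frac{1}{6}$ to well within the quadrature error.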

10.2. The limit of complex roots approaching the unit circle. We will now study the case when $2r$ pairs of complex roots approach each other towards the unit circle. Let $\lambda_{2j+1}$ be a complex root with $n-r \le j \le n-1$. As we discussed in Sect. 2, $\bar{\lambda}_{2j+1}$, $1/\lambda_{2j+1}$ and $1/\bar{\lambda}_{2j+1}$ are roots too. The ordering (2.5) implies (see Fig. 7)

$$\lambda_{2(2n-j)-1} = \bar{\lambda}_{2(2n-j)}, \quad \lambda_{2(j+1)} = \bar{\lambda}_{2j+1}, \quad \lambda_{2(2n-j)} = 1/\lambda_{2(j+1)}, \quad \lambda_{2(2n-j)-1} = 1/\lambda_{2j+1}. \tag{10.17}$$

The critical limit occurs as $\lambda_{2(j+1)} \to \lambda_{2(2n-j)-1}$. From the relations (10.17) this implies $\lambda_{2j+1} \to \lambda_{2(2n-j)}$. Thus, in what follows we shall mainly discuss the limit $\lambda_{2(j+1)} \to \lambda_{2(2n-j)-1}$.


Fig. 8. The choice of cycles on the hyperelliptic curve L. The arrows denote the orientations of the cycles and branch cuts

10.2.1. Case 1. $r < n$. We now choose the tilded canonical basis of the cycles $(\tilde{A}, \tilde{B})$ as in Fig. 8. Namely, we have

$$\tilde{a}_j = a_j, \qquad \tilde{b}_j = b_j, \qquad j < n-r, \quad j > n+r-1,$$

$$\tilde{a}_{n-k} = b_{n-k} - b_{n+k-1} + \sum_{j=n-k+1}^{n+k-2} a_j, \quad k = 1,\dots,r, \tag{10.18}$$

$$\tilde{a}_{n+k} = b_{n+k} - b_{n-k-1} + \sum_{j=n-k-1}^{n+k} a_j, \quad k = 0,\dots,r-1,$$

$$\tilde{b}_{n-k} = b_{n-k} - \sum_{j=n-k}^{n+k-2} a_j - \sum_{j=n-r}^{n-k-1}(-1)^{n-k-j}\left(a_j - 2b_j\right), \quad k = 1,\dots,r,$$

$$\tilde{b}_{n+k} = b_{n+k} + \sum_{j=n-r}^{n-k-2}(-1)^{n-k-j}\left(a_j - 2b_j\right), \quad k = 0,\dots,r-1.$$


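We will show in Appendix E that this is indeed a canonical basis of cycles. We can partition this basis as follows:

$$\tilde{A} = \begin{pmatrix}\tilde{a}^{I}\\ \tilde{a}^{II}\\ \tilde{a}^{III}\end{pmatrix},$$

$$\tilde{a}^{I}_j = \tilde{a}_j, \quad 1 \le j \le n-r-1, \qquad \tilde{a}^{II}_j = \tilde{a}_{n-r+j-1}, \quad 1 \le j \le 2r, \qquad \tilde{a}^{III}_j = \tilde{a}_j, \quad n+r \le j \le 2n-1.$$

The relations among the b-cycles and the untilded basis are analogous. If we write this relation in matrix form as in Theorem 8, then the corresponding transformation matrix is given by

$$\begin{pmatrix}\tilde{A}\\ \tilde{B}\end{pmatrix} = Z\begin{pmatrix}A\\ B\end{pmatrix} = \begin{pmatrix}Z_{11} & Z_{12}\\ Z_{21} & Z_{22}\end{pmatrix}\begin{pmatrix}A\\ B\end{pmatrix},$$

where the blocks $Z_{ij}$ can be written as

$$Z_{ij} = \begin{pmatrix}\delta_{ij}I_{n-r-1} & 0 & 0\\ 0 & C_{ij} & 0\\ 0 & 0 & \delta_{ij}I_{n-r}\end{pmatrix},$$

where $I_{n-r-1}$ is the identity matrix of dimension $n-r-1$ and the $C_{ij}$'s are the following $2r\times 2r$ matrices:

$$(C_{11})_{kl} = \begin{cases}1, & k+1 \le l \le 2r-k,\\ 0, & \text{otherwise},\end{cases} \quad 1 \le k \le r,$$

$$(C_{11})_{kl} = \begin{cases}1, & 2r-k+1 \le l \le k,\\ 0, & \text{otherwise},\end{cases} \quad r+1 \le k \le 2r,$$

$$(C_{12})_{kl} = \delta_{kl} - \delta_{l,2r-k+1}, \quad 1 \le k, l \le 2r,$$

$$(C_{21})_{kl} = \begin{cases}(-1)^{k-l+1}, & 1 \le l \le k-1,\\ -1, & k \le l \le 2r-k,\\ 0, & 2r-k+1 \le l,\end{cases} \quad 1 \le k \le r,$$

$$(C_{21})_{kl} = \begin{cases}(-1)^{k+l}, & 1 \le l \le 2r-k,\\ 0, & \text{otherwise},\end{cases} \quad r+1 \le k \le 2r,$$

$$(C_{22})_{kl} = \delta_{kl} + \begin{cases}2(-1)^{k-l}, & 1 \le l \le k-1,\\ 0, & \text{otherwise},\end{cases} \quad 1 \le k \le r,$$

$$(C_{22})_{kl} = \delta_{kl} - 2\,(C_{21})_{kl}, \quad r+1 \le k \le 2r.$$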
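The entrywise definitions of the $C_{ij}$ can be tabulated and tested for small $r$. The sketch below builds the four $2r\times 2r$ blocks and checks that they preserve the canonical intersection numbers, assuming the convention $a_i\circ b_j = \delta_{ij}$. For rows $r+1 \le k \le 2r$ of $C_{11}$ it takes the nonzero range to be $2r-k+1 \le l \le k$, which is what the relation for $\tilde{a}_{n+k}$ in (10.18) gives. The function names are hypothetical; this is an illustration for small $r$, not a substitute for the proof in Appendix E.

```python
def c_blocks(r):
    """Return (C11, C12, C21, C22) as 2r x 2r integer matrices.

    Indices k, l run from 1 to 2r as in the text.
    """
    n = 2 * r
    idx = range(1, n + 1)

    def c11(k, l):
        if k <= r:
            return 1 if k + 1 <= l <= n - k else 0
        return 1 if n - k + 1 <= l <= k else 0

    def c12(k, l):
        return (1 if k == l else 0) - (1 if l == n - k + 1 else 0)

    def c21(k, l):
        if k <= r:
            if l <= k - 1:
                return (-1) ** (k - l + 1)
            return -1 if l <= n - k else 0
        return (-1) ** (k + l) if l <= n - k else 0

    def c22(k, l):
        delta = 1 if k == l else 0
        if k <= r:
            return delta + (2 * (-1) ** (k - l) if l <= k - 1 else 0)
        return delta - 2 * c21(k, l)

    def build(f):
        return [[f(k, l) for l in idx] for k in idx]

    return build(c11), build(c12), build(c21), build(c22)


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def subtract(a, b):
    return [[x - y for x, y in zip(p, q)] for p, q in zip(a, b)]


def is_canonical(z11, z12, z21, z22):
    """True if the new cycles satisfy a~.a~ = 0, a~.b~ = I, b~.b~ = 0
    (assumed convention: a_i . b_j = delta_ij for the old basis)."""
    m = len(z11)
    zero = [[0] * m for _ in range(m)]
    eye = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    aa = subtract(matmul(z11, transpose(z12)), matmul(z12, transpose(z11)))
    ab = subtract(matmul(z11, transpose(z22)), matmul(z12, transpose(z21)))
    bb = subtract(matmul(z21, transpose(z22)), matmul(z22, transpose(z21)))
    return aa == zero and ab == eye and bb == zero
```

For example, `is_canonical(*c_blocks(2))` checks the $4\times 4$ blocks; the identity blocks of $Z_{ij}$ outside the middle block are canonical trivially.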


These are matrices of the form

$$C_{11} = \begin{pmatrix}
0 & 1 & 1 & \cdots & \cdots & 1 & 1 & 0\\
0 & 0 & 1 & \cdots & \cdots & 1 & 0 & 0\\
\vdots & & & & & & & \vdots\\
0 & 0 & \cdots & 0 & 0 & \cdots & 0 & 0\\
0 & 0 & \cdots & 1 & 1 & \cdots & 0 & 0\\
\vdots & & & & & & & \vdots\\
0 & 1 & 1 & \cdots & \cdots & 1 & 1 & 0\\
1 & 1 & 1 & \cdots & \cdots & 1 & 1 & 1
\end{pmatrix},$$

$$C_{12} = I_{2r} - J_{2r}, \qquad J_r = \begin{pmatrix}
0 & \cdots & \cdots & 1\\
0 & \cdots & 1 & 0\\
\vdots & & & \vdots\\
1 & 0 & \cdots & 0
\end{pmatrix},$$

$$C_{21} = \begin{pmatrix}
-1 & -1 & -1 & \cdots & \cdots & -1 & -1 & 0\\
1 & -1 & -1 & \cdots & \cdots & -1 & 0 & 0\\
-1 & 1 & -1 & \cdots & \cdots & 0 & 0 & 0\\
\vdots & & & & & & & \vdots\\
\cdots & -1 & 1 & -1 & 0 & \cdots & \cdots & 0\\
\cdots & 1 & -1 & 1 & 0 & \cdots & \cdots & 0\\
\vdots & & & & & & & \vdots\\
-1 & 1 & 0 & \cdots & \cdots & \cdots & 0 & 0\\
1 & 0 & 0 & \cdots & \cdots & \cdots & 0 & 0\\
0 & 0 & 0 & \cdots & \cdots & \cdots & 0 & 0
\end{pmatrix}, \tag{10.19}$$

$$C_{22} = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0\\
-2 & 1 & 0 & \cdots & 0 & 0 & 0\\
\vdots & & \ddots & & & & \vdots\\
\cdots & 2 & -2 & 1 & \cdots & 0 & 0\\
\cdots & 2 & -2 & 0 & 1 & \cdots & 0\\
\vdots & & & & & \ddots & \vdots\\
-2 & 0 & 0 & \cdots & 0 & 1 & 0\\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}.$$

As in Sect. 10.1, some holomorphic 1-forms $d\tilde{\omega}_j$ will become meromorphic as the roots approach the unit circle. In this case, the holomorphic 1-form $d\tilde{\omega}_j$, $n-r \le j \le n+r-1$, becomes a meromorphic 1-form with a simple pole at $\lambda_{2(j+1)}$. All the other holomorphic 1-forms become normalized holomorphic 1-forms in the resulting surface. In particular, we have the following:


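Lemma 5. The entries of the period matrix $\tilde{\Pi}$ behave like

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\tilde{\Pi}_{ij} = \tilde{\Pi}^{0}_{ij}, \quad i \ne j,$$

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\tilde{\Pi}_{jj} = \tilde{\Pi}^{0}_{jj}, \quad j > n+r-1, \quad j < n-r,$$

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\tilde{\Pi}_{jj} = \gamma_j + \tilde{\Pi}^{0}_{jj}, \quad n-r \le j \le n+r-1,$$

$$\gamma_j = \frac{1}{\pi i}\log\left|\lambda_{2(j+1)} - \lambda_{2(2n-j)-1}\right|, \tag{10.20}$$

where $\tilde{\Pi}^{0}_{ij}$ are finite.

Let us now consider the behavior of the terms $\tilde{\xi}$ in (10.1).

Lemma 6. Let $\xi$ be given by (10.9) and $\tilde{\xi}$ be

$$\tilde{\xi} = \left(-Z_{12}^{T}\tilde{\Pi} + Z_{22}^{T}\right)^{T}\xi,$$

where $Z_{ij}$ are given by (10.19). Then in the limit $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$ we have

$$\tilde{\xi}_i = \eta_i^{\pm}, \quad i > n+r-1, \quad i < n-r,$$

$$\tilde{\xi}_i = \epsilon_i\,\beta(\lambda)\gamma_i + \eta_i^{\pm}, \quad n-r \le i \le n+r-1, \tag{10.21}$$

$$\epsilon_i = 1, \quad i < n, \qquad \epsilon_i = -1, \quad i \ge n,$$

where $\eta_i^{\pm}$ remains finite as $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$.

Proof. Let

$$Z_{12}^{T}\tilde{\Pi} - Z_{22}^{T} = \begin{pmatrix}0 & 0 & 0\\ 0 & (I_{2r}-J_{2r})D_r & 0\\ 0 & 0 & 0\end{pmatrix} + W, \qquad D_r = \mathrm{diag}(\gamma_{n-r},\gamma_{n-r+1},\dots,\gamma_{n-r+1},\gamma_{n-r}), \tag{10.22}$$

where $W$ is a matrix that remains finite as $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$. Then from (10.2) and (10.4), we see that $\tilde{\xi}$ is given by

$$\tilde{\xi}_i = \beta(\lambda)\sum_{j=1}^{n-1}W_{n+j,i} \pm \frac{\tilde{\tau}_i}{2}, \quad i > n+r-1, \quad i < n-r,$$

$$\tilde{\xi}_i = \epsilon_i\,\beta(\lambda)\gamma_i + \beta(\lambda)\sum_{j=1}^{n-1}W_{n+j,i} \pm \frac{\tilde{\tau}_i}{2}, \quad n-r \le i \le n+r-1,$$

where

$$\epsilon_i = 1, \quad i < n, \qquad \epsilon_i = -1, \quad i \ge n, \qquad \frac{\tilde{\tau}_i}{2} = \sum_{j=1}^{2n}\tilde{\omega}_i(z_j^{-1}) - \sum_{j=1}^{2n}\tilde{\omega}_i(\lambda_{2j-1}). \tag{10.23}$$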


Let $\lambda_{2(j+1)}$, $\lambda_{2(2n-j)-1}$ and $\lambda_{2j+1}$, $\lambda_{2(2n-j)}$, $n-r \le j \le n-1$, be the pairs of points that approach each other. From their ordering we have $\lambda_{2(j+1)} = \lambda^{-1}_{2(2n-j)}$ and $\lambda_{2j+1} = \lambda^{-1}_{2(2n-j)-1}$. For each fixed $j$, the point $\lambda_{2j+1}$ is a pole of $d\tilde{\omega}_{2n-j-1}$, while $\lambda_{2(2n-j)-1}$ is a pole of $d\tilde{\omega}_j$. Therefore, the Riemann constant behaves like

$$\sum_{j=1}^{2n}\tilde{\omega}_i(\lambda_{2j-1}) = \frac{1}{2}\gamma_i + O(1), \quad \lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}, \quad n-r \le i \le n+r-1.$$

Moreover, among these 4 points there are exactly two points of the form $z_k^{-1}$ for some $k$. However, since $z_k$ are the roots of a polynomial with real coefficients, if $\lambda_j = z_k^{-1}$ for some $k$, then its complex conjugate $\bar{\lambda}_j$ is also of the form $z_{k'}^{-1}$ for some $k'$. This means that either of the following is true:

1. Both $\lambda_{2(j+1)}$ and $\lambda_{2j+1}$ are of the form $z_k^{-1}$,
2. Both $\lambda_{2(2n-j)}$ and $\lambda_{2(2n-j)-1}$ are of the form $z_k^{-1}$.

Either way, we have

$$\sum_{j=1}^{2n}\tilde{\omega}_i(z_j^{-1}) = \frac{1}{2}\gamma_i + O(1), \quad n-r \le i \le n+r-1.$$

Therefore, we can rewrite (10.23) as

$$\tilde{\xi}_i = \eta_i^{\pm}, \quad i > n+r-1, \quad i < n-r, \qquad \tilde{\xi}_i = \epsilon_i\,\beta(\lambda)\gamma_i + \eta_i^{\pm}, \quad n-r \le i \le n+r-1,$$

where $\eta_i^{\pm}$ remains finite as $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$.

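We now compute the behavior of the theta function $\theta(\xi,\Pi)$ in this limit.

Lemma 7. In the limit $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$, $n-r \le j \le n+r-1$, the theta function $\theta(\xi,\Pi)$ behaves like

$$\theta(\xi,\Pi) = \exp\left(2\pi i\,\beta^{2}(\lambda)\sum_{j=n-r}^{n-1}\gamma_j + O(1)\right), \tag{10.24}$$

where $\xi$ is given by (10.9) and $\gamma_j$ by (10.20).

Proof. From (10.1) we see that

$$\theta(\xi,\Pi) = \varsigma\exp\left[\pi i\,\tilde{\xi}^{T}Z_{12}\left(Z_{12}^{T}\tilde{\Pi} - Z_{22}^{T}\right)^{-1}\tilde{\xi}\right]\theta\begin{bmatrix}\varepsilon\\ \varepsilon'\end{bmatrix}(\tilde{\xi},\tilde{\Pi}), \tag{10.25}$$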

where the characteristics on the right-hand side are obtained by solving the linear equations T T T diag Z 12 Z 22 = Z 22 ε + Z 12 ε, T T T diag Z 11 Z 21 = Z 21 ε + Z 11 ε.


The solution of this system is

$$\varepsilon_j = 0 \bmod 2, \quad j = 1,\dots,2n-1, \qquad \varepsilon'_j = \begin{cases}1 \bmod 2, & n-r \le j \le n-1;\\ 0 \bmod 2, & \text{otherwise}.\end{cases} \tag{10.26}$$

Note that, from (2.11) and the periodicity properties of the theta function Proposition 2, characteristics that differ by an even integer vector give the same theta function. That is 7 − →8 ε ε+2N − → − → θ (ξ, ) = θ − → (ξ, ), N , M ∈ Z2n−1 . ε ε + 2M We will now compute the exponential term of (10.25). By performing rows and columns T ˜ − Z T , we can transform its determinant into the form operations on Z 12 22 ⎛⎛ ⎞ ⎞ 0n−r −1 0 0 T ˜ T det Z 12 − Z 22 = det ⎝⎝ 0 S Dr 0 ⎠ + W ⎠ , 0 0 0n−r $ 0, 1 ≤ i ≤ r , Si j = δi j , r + 1 ≤ i ≤ 2r , for some matrix W that remains finite as λ2( j+1) → λ2(2n−1)−1 . This means that the leading order term of the determinant is of the order of n−1 k=n−r γk . That is T ˜ T − Z 22 = Dr + O(γir −1 ), λ2( j+1) → λ2(2n− j)−1 , det Z 12 n−1

Dr = W

γk ,

k=n−r

where the notation O(γir −1 ) means

α r −1 i γi αi ≤ r − 1, O(γi ) = O , i

(10.27)

i

Furthermore, W is the determinant of the (2n − r − 1) × (2n − r − 1) matrix formed by removing the (n − r )th up to the (n − 1)th rows and columns in W . T ˜ − Z T cannot contain more than r facSimilarly, we see that the minors of Z 12 22 −1 T ˜ − ZT tors of γ . In particular, this means that the inverse matrix Z 12 is finite as 22 λ2( j+1) → λ2(2n− j)−1 . −1 T ˜ − ZT behaves like Therefore the inverse matrix Z 12 22

T ˜ T − Z 22 Z 12

−1

= X 0 + X −1 + O(γi−2 ), λ2( j+1) → λ2(2n− j)−1 , (10.28)

where X −1 is a term of order −1 in γi and X 0 is a finite matrix.


From (10.28) and (10.22), we see that the leading order term of −1 T ˜ T T ˜ T − Z 22 − Z 22 Z 12 Z 12 = I2n−1

(10.29)

gives the following: ⎛ 0 0 X 0 ⎝0 (I2r − J2r )Dr 0 0

⎞ 0 0⎠ = 0, 0

while the leading order term of −1 T ˜ T T ˜ T − Z 22 − Z 22 Z 12 Z 12 = I2n−1 gives ⎛ 0 0 ⎝0 (I2r − J2r )Dr 0 0

⎞ 0 0⎠ X 0 = 0. 0

This implies that 0 X i,0 j = X i,2n− j−1 , 1 ≤ i ≤ 2n − 1, n − r ≤ j ≤ n + r − 1, 0 X i,0 j = X 2n−i−1, j , n − r ≤ i ≤ n + r − 1, 1 ≤ j ≤ 2n − 1.

(10.30)

The leading order term of the bilinear product in (10.25) then becomes ⎛ ⎞ 0 0 0 −1 T ˜ T T ˜ − Z 22 ξ = β 2 (λ) T Dn X −1 ⎝0 (I2r − J2r ) 0⎠ Dn

Z 12 ξ˜ T Z 12 0 0 0

i

i

i Dn Let us denote P by

+O(1), λ2( j+1) → λ2(2n− j)−1 , = 0, i < n − r, i > n + r − 1, = 1, n − r ≤ i < n, = −1, n ≤ i < n + r − 1, ⎛ ⎞ 0n−r −1 0 0 Dr 0 ⎠ . =⎝ 0 0 0 0n−r

⎛ ⎞ 0 0 0 P = X −1 ⎝0 (I2r − J2r ) 0⎠ Dn . 0 0 0

Then constant term of (10.29) gives the following: ⎞ ⎛ 0 0 0 X −1 ⎝0 (I2r − J2r ) 0⎠ Dn + X 0 W = I2n−1 . 0 0 0

(10.31)


By applying (10.30) to the above, we see that the entries of $P$ are related by

$$P_{l,j} = P_{2n-l-1,j} + \delta_{l,j} + \delta_{2n-l-1,j}, \quad n-r \le l \le n-1, \quad n-r \le j \le n+r-1.$$

By substituting this back into (10.31), we see that the exponential factor in (10.25) behaves like

$$\tilde{\xi}^{T}Z_{12}\left(Z_{12}^{T}\tilde{\Pi} - Z_{22}^{T}\right)^{-1}\tilde{\xi} = 2\beta^{2}(\lambda)\sum_{j=n-r}^{n-1}\gamma_j + O(1). \tag{10.32}$$

We will now show that the limit of the theta function with characteristics remains finite. By using the definition (2.10), we have θ

ε ε

˜ = (ξ˜ , )

n−1 εj mj + 2β(λ) + m j exp πi γj 2 j=n−r m j ∈Z εj ε2n− j−1 + m 2n− j−1 + + 2 2 ε2n− j−1 + O(1) , λ2( j+1) → λ2(2n− j)−1 . × −2β(λ) + m 2n− j−1 + 2

As before, since $\beta(\lambda)$ is purely imaginary, only terms such that

$$\left(m_j + \frac{\varepsilon_j}{2}\right)^{2} + \left(m_{2n-j-1} + \frac{\varepsilon_{2n-j-1}}{2}\right)^{2} = 0, \quad n-r \le j \le n-1,$$

contribute. Recall that from (10.26) we have $\varepsilon_j = \varepsilon_{2n-j-1} = 0$, therefore $m_j = m_{2n-j-1} = 0$, $n-r \le j \le n-1$. Thus, as before, the theta function with characteristics reduces to a $2n-2r-1$ dimensional theta function

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\theta\begin{bmatrix}\varepsilon\\ \varepsilon'\end{bmatrix}(\tilde{\xi},\tilde{\Pi}) = \theta(\tilde{\xi}^{0},\tilde{\Pi}^{0}), \tag{10.33}$$

where the arguments on the right-hand side are obtained by removing the $(n-r)$th up to the $(n+r-1)$th entries, and $\theta(\tilde{\xi}^{0},\tilde{\Pi}^{0})$ is finite in the limit. By combining (10.32) and (10.33), we see that the theta function $\theta(\xi,\Pi)$ behaves like

$$\theta(\xi,\Pi) = \varsigma\exp\left(2\pi i\,\beta^{2}(\lambda)\sum_{j=n-r}^{n-1}\gamma_j + O(1)\right)\theta(\tilde{\xi}^{0},\tilde{\Pi}^{0}).$$

This concludes the proof of the lemma.

Finally, from Lemma 7 we see that the entropy (9.6) behaves like

$$S(\rho_A) = -\frac{1}{3}\sum_{j=n-r}^{n-1}\log\left|\lambda_{2(j+1)} - \lambda_{2(2n-j)-1}\right| + O(1), \quad \lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}.$$


10.2.2. Case 2: $r = n$. We will now consider the case when $r = n$. That is, all roots are complex and they all approach each other pairwise. The canonical basis will be chosen as in (10.18) but with $r = n-1$ (not $n$), while the last elements in the basis are given by $\tilde{a}_{2n-1} = b_{2n-1}$, $\tilde{b}_{2n-1} = -a_{2n-1}$. In other words, we have

$$\tilde{a}_{n-k} = b_{n-k} - b_{n+k-1} + \sum_{j=n-k+1}^{n+k-2} a_j, \quad k = 1,\dots,n-1,$$

$$\tilde{a}_{n+k} = b_{n+k} - b_{n-k-1} + \sum_{j=n-k-1}^{n+k} a_j, \quad k = 0,\dots,n-2, \tag{10.34}$$

$$\tilde{b}_{n-k} = b_{n-k} - \sum_{j=n-k}^{n+k-2} a_j - \sum_{j=1}^{n-k-1}(-1)^{n-k-j}\left(a_j - 2b_j\right), \quad k = 1,\dots,n-1,$$

$$\tilde{b}_{n+k} = b_{n+k} + \sum_{j=1}^{n-k-2}(-1)^{n-k-j}\left(a_j - 2b_j\right), \quad k = 0,\dots,n-2,$$

$$\tilde{a}_{2n-1} = b_{2n-1}, \qquad \tilde{b}_{2n-1} = -a_{2n-1}. \tag{10.35}$$

As before, we can partition the basis as follows: I a˜ , A˜ , = a˜ I I a˜ Ij = a˜ j , 1 ≤ k ≤ n − r − 1, a˜ 1I I

(10.36)

= a˜ 2n−1 .

Furthermore, the b-cycles and the untilded basis are connected by analogous relations. In the notation of Theorem 8 we have A A˜ A Z 11 Z 12 , =Z = ˜ Z Z B B B 21 22 where the transformation matrix Z can be written in block form according to the partition (10.36): Ci j 0 Zi j = , (10.37) 0 Ei j where Ci j are 2(n − 2) × 2(n − 2) matrices defined as in (10.19), and E is given by Ei j = 0, i = j, E12 = 1, E21 = −1. By deformation of the contours, we see that the cycles a˜ j become close loops around λ2( j+1) in the critical limit.


Fig. 9. The curve going around λ2

Let a˜ 0 be the closed curve that becomes a loop around λ2 as λ2 → λ4n−1 (see Fig. 9). We have a˜ 0 = −b2n−1 +

2n−2

aj,

j=1

a˜ 0 = −a˜ 2n−1 +

n−1 2n−2 (−1) j+1 a˜ j + (−1) j a˜ j . j=1

j=n

In particular, this means that in the limit, the 1-form $d\tilde{\omega}_j$ will have a simple pole at $\lambda_{2(j+1)}$ with residue $\frac{1}{2\pi i}$ and a simple pole at $\lambda_2$ with residue $(-1)^{j+1}\frac{1}{2\pi i}$ for $1 \le j \le n-1$, $(-1)^{j}\frac{1}{2\pi i}$ for $n \le j \le 2n-2$ and $-\frac{1}{2\pi i}$ for $j = 2n-1$. Thus, we arrive at the following

Lemma 8. The entries of the period matrix behave like

$$\lim\tilde{\Pi}_{ij} = \tilde{\Pi}^{0}_{ij}, \quad i \ne j, \quad i,j \ne 2n-1,$$

$$\lim\tilde{\Pi}_{jj} = \gamma_j + \tilde{\Pi}^{0}_{jj}, \quad 1 \le j \le 2n-2,$$

$$\lim\tilde{\Pi}_{2n-1,2n-1} = 2\gamma_{2n-1} + \tilde{\Pi}^{0}_{2n-1,2n-1},$$

$$\lim\tilde{\Pi}_{j,2n-1} = (-1)^{j}\gamma_{2n-1} + \tilde{\Pi}^{0}_{j,2n-1}, \quad 1 \le j \le n-1,$$

$$\lim\tilde{\Pi}_{j,2n-1} = (-1)^{j+1}\gamma_{2n-1} + \tilde{\Pi}^{0}_{j,2n-1}, \quad n \le j \le 2n-2,$$

$$\tilde{\Pi}_{2n-1,j} = \tilde{\Pi}_{j,2n-1}, \qquad \gamma_j = \frac{1}{\pi i}\log\left|\lambda_{2(j+1)} - \lambda_{2(2n-j)-1}\right|,$$

where all limits are taken as $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$ and $\tilde{\Pi}^{0}_{ij}$ are finite in this limit.

In this case, the argument $\tilde{\xi}$ in (10.1) behaves as follows.

Lemma 9. Let $\xi$ be given by (10.9) and $\tilde{\xi}$ be

$$\tilde{\xi} = \left(-Z_{12}^{T}\tilde{\Pi} + Z_{22}^{T}\right)^{T}\xi,$$

where $Z_{ij}$ are given by (10.37). Then in the limit $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$ we have

$$\tilde{\xi}_i = \sigma_i\,\beta(\lambda)\gamma_i + \eta_i^{\pm}, \quad 1 \le i \le 2n-1, \tag{10.38}$$

$$\sigma_i = \left(1 + (-1)^{i+1}\right), \quad 1 \le i \le n-1, \qquad \sigma_i = -\left(1 + (-1)^{i+1}\right), \quad n \le i \le 2n-1,$$

where $\eta_i^{\pm}$ remains finite as $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$.

Proof. In this case the matrix $Z_{12}^{T}\tilde{\Pi} - Z_{22}^{T}$ takes the form

0 (I2r − J2r )Dn−1 + W, − → 2γ2n−1 D n−1 = diag(γ1 , γ2 , . . . , γ2 , γ1 ),

T ˜ T − Z 22 Z 12 =

Dn−1 − → D n−1 = (−γ1 , γ2 , . . . , γ2 , −γ1 ),

(10.39)

where W is a finite matrix as λ2( j+1) → λ2(2n− j)−1 . Therefore, ξ˜ behaves like ξ˜i = σi β(λ)γi + β(λ)

n−1

Wn+ j,i ±

j=1

τ˜i , 1 ≤ i ≤ 2n − 1, 2

σi = (1 + (−1) ), 1 ≤ i ≤ n − 1, σi = −(1 + (−1)i+1 ), n ≤ i ≤ 2n − 1, 2n 2n τ˜i = ω˜ i (z −1 ) − ω˜ i (λ2 j−1 ). j 2 i+1

j=1

j=1

As in Sect. 10.2.1, the leading order terms of

τ˜i 2

are zero. We can therefore rewrite ξ˜ as

ξ˜i = σi β(λ)γi + ηi± , 1 ≤ i ≤ 2n − 1, where ηi± are finite in the limit.

The behavior of the theta function for this case is given by


Lemma 10. In the limit λ2( j+1) → λ2(2n− j)−1 , 1 ≤ j ≤ 2n − 1, the theta function θ (ξ, ) behaves like ⎛ θ (ξ, ) = exp ⎝2πiβ 2 (λ)

n−1

⎞ γ j + O(1)⎠ ,

(10.40)

j=1

where ξ is given by (10.9) and γ j by Lemma 8. Proof. As in Sect. 10.2.1, from (10.1) we have, −1 ε T T ˜ T T ˜ ˜ ˜ Z 12 − Z 22 θ (ξ, ) = ς exp πi ξ Z 12 ξ θ (ξ˜ , ), ε

(10.41)

where the characteristics on the right-hand side are given by the same formula as before, with r replaced by n − 1: ε j = 0 mod 2, j = 1, . . . , 2n − 1, $ 1 mod 2, 1 ≤ j ≤ n − 1; εj = 0 mod 2, otherwise. Since there is no non-zero matrix X 0 that is independent of γ j such that the leading order term of T ˜ T Z 12 − Z 22 X0 −1 T ˜ − ZT is zero, we can write the inverse matrix Z 12 as 22

T ˜ T Z 12 − Z 22

−1

= X −1 + O(γi−2 ), λ2( j+1) → λ2(2n− j)−1 ,

where X −1 is a term that is of order −1 in the γ j . Then, the leading order term of the bilinear product in (10.41) is −1 T ˜ T T ˜ 2 T −1 (I2n−2 − J2n−2 ) 0 ˜ξ T Z 12 Dn σ − Z 22 Z 12 ξ = β (λ)σ Dn X 0 1 + O(1), λ2( j+1) → λ2(2n− j)−1 , σi = (1 + (−1)i+1 ), 1 ≤ i ≤ n − 1, σi = −(1 + (−1)i+1 ), n ≤ i ≤ 2n − 1, Dn = diag(γ1 , γ2 , . . . , γ2 , γ1 , 2γ2n−1 ). ˜ 1 be the leading order term of : ˜ Let

− →T D D n−1 n−1 ˜ = − . → D n−1 2γ2n−1 1

(10.42)


Equation (10.42) can now be rewritten as −1 (I2n−2 − J2n−2 ) 0 ˜ 1 T ˜ T T ˜ 2 T ˜ 1 −1 ˜ξ T Z 12

− Z 22 Z 12 ξ = β (λ) X , 0 1 +O(1), λ2( j+1) → λ2(2n− j)−1 , (10.43)

i = 1, 1 ≤ i ≤ n − 1,

i = −1, n ≤ i ≤ 2n − 1. The constant term of

T ˜ T − Z 22 Z 12

−1

T ˜ T − Z 22 Z 12 = I2n−1

now gives X −1

I2n−2 − J2n−2 0 ˜ 1 = I2n−1 . 0 1

By substituting this back into (10.43), we obtain 2n−1 −1 T ˜ T T ˜ − Z 22 ξ= Z 12 log λ2( j+1) − λ2(2n− j)−1 + O(1). πi ξ˜ T Z 12 j=1

To complete the proof, note that in this case, the theta function in the right-hand side of (10.41) becomes 1: lim

λ2( j+1) →λ2(2n− j)−1

θ

ε ε

˜ = 1. (ξ˜ , )

Therefore, we have ⎛ θ (ξ, ) = ς exp ⎝πi

2n−1

⎞ γ j + O(1)⎠ , λ2( j+1) → λ2(2n− j)−1 .

j=1

This completes the proof of the lemma.

Finally, by substituting (10.40) into (9.6), we find that the entropy behaves like

$$S(\rho_A) = -\frac{1}{3}\sum_{j=1}^{2n-1}\log\left|\lambda_{2(j+1)} - \lambda_{2(2n-j)-1}\right| + O(1), \quad \lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}.$$


Fig. 10. The choice of cycles on the hyperelliptic curve L. The arrows denote the orientations of the cycles and branch cuts

10.3. Pairs of complex roots approaching the unit circle together with one pair of real roots approaching 1. The canonical basis used in this section is shown in Fig. 10: a˜ k = −bk + bk−1 , k < n − r, k > n + r b0 = 0, b˜k =

2n−1 j=k

aj −

n+r −1

a j , k < n − r,

j=n−r

a˜ n−k = bn−k − bn+k−1 +

n+k−2

a j , k = 1, . . . , r,

j=n−k+1 n+k

a˜ n+k = bn+k −bn−k−1 +

a j , k = 0, . . . , r −1,

j=n−k−1 2n−1

b˜n−k = bn−k +(−1)r −k

n+k−2

aj−

j=n+r

b˜n+k = bn+k + (−1)r −k

j=n−k

2n−1 j=n+r

a˜ n+r = bn−r −1 − bn+r +

r −1 j=0

b˜k =

2n−1 j=k

a j , k ≥ n + r.

aj−

aj +

n−k−2

n−k−1

(−1)n−k− j a j −2b j , k = 1, . . . , r,

j=n−r

(−1)n−k− j a j − 2b j , k = 0, . . . , r − 1,

j=n−r

(−1)r − j−1 2bn+ j + an+ j − 2bn− j−1 + an− j−1 ,


In the notation of Theorem 8, the two bases are related by Z 11 Z 12 A A˜ A = , =Z ˜ B B Z Z B 21 22 ⎛ ⎞ 0 0 0 Z 11 = ⎝0 C11 0⎠ , 0 T 32 0 ⎛ ⎞ 0 −C2n−r −1 0 ⎠, 0 C12 0 Z 12 = ⎝ n−r −1 31 32 V −C2 V ⎛ n−r −1 ⎞ 0 U13 C1 C21 U23 ⎠ , Z 21 = ⎝ 0 0 0 C1n−r −1 ⎛ ⎞ 0 0 0 Z 22 = ⎝0 C22 0⎠ , 0 0 0

(10.44)

where Ci j are defined in (10.19) and Cik are k × k matrix with entries defined as in (10.5). All the entries of the matrices U 13 are 1, while the entries of V 31 , V 32 and U 23 are defined in j+1 32 32 Ti 32 , Ti,2r j = δi1 (−1) − j+1 = Ti j , 1 ≤ j ≤ r,

Vi31 j = δi1 δ j,n−r −1 , j Vi32 j = 2(−1) δi1 , i+1 Ui23 j = (−1) .

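Performing the same analysis as in Sect. 10.2.1 we arrive at

Lemma 11. The entries of the period matrix $\tilde{\Pi}$ behave like

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\tilde{\Pi}_{ij} = \tilde{\Pi}^{0}_{ij}, \quad i \ne j,$$

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\tilde{\Pi}_{jj} = \tilde{\Pi}^{0}_{jj}, \quad j > n+r, \quad j < n-r,$$

$$\lim_{\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}}\tilde{\Pi}_{jj} = \gamma_j + \tilde{\Pi}^{0}_{jj}, \quad n-r \le j \le n+r,$$

$$\gamma_j = \frac{1}{\pi i}\log\left|\lambda_{2(j+1)} - \lambda_{2(2n-j)-1}\right|, \tag{10.45}$$

where $\tilde{\Pi}^{0}_{ij}$ are finite.

In this case, the argument $\tilde{\xi}$ is given by the following

Lemma 12. Let $\xi$ be given by (10.9) and $\tilde{\xi}$ be

$$\tilde{\xi} = \left(-Z_{12}^{T}\tilde{\Pi} + Z_{22}^{T}\right)^{T}\xi,$$

where $Z_{ij}$ are given by (10.44). Then in the limit $\lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}$ we have

$$\tilde{\xi}_i = \eta_i^{\pm}, \quad i > n+r, \quad i < n-r,$$

$$\tilde{\xi}_i = \epsilon_i\,\beta(\lambda)\gamma_i + \eta_i^{\pm}, \quad n-r \le i \le n+r-1, \tag{10.46}$$

$$\tilde{\xi}_{n+r} = \beta(\lambda)\gamma_{n+r} + \eta_{n+r}^{\pm},$$

$$\epsilon_i = 1, \quad i < n, \qquad \epsilon_i = -1, \quad i > n-1,$$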

where ηi± remains finite as λ2( j+1) → λ2(2n− j)−1 , n − r ≤ j ≤ n + r . The proof of this lemma follows from exactly the same type of argument as in Sect. 10.2.1. We will now compute the limit of the theta function. Lemma 13. In the limit λ2( j+1) → λ2(2n− j)−1 , n − r ≤ j ≤ n + r , the theta function θ (ξ, ) behaves like ⎛ θ (ξ, ) = exp ⎝2πiβ 2 (λ)

n−1

⎞ γ j + β 2 (λ)γn+r + O(1)⎠ ,

(10.47)

j=1

where ξ is given by (10.9) and γ j by (10.45). Proof. The characteristics in the theta function in (10.1) are once more given by (10.26). T ˜ − Z T can now be written as The matrix Z 12 22 ⎞ ⎛ 0 0 0 0n−r −1 0 ⎟ ⎜ 0 (I2r − J2r )Dr 0 T ˜ T − Z 22 + W, =⎝ Z 12 0 0 γn+r 0 ⎠ 0 0 0 0n−r −1 Dr = diag(γn−r , γn−r +1 , . . . , γn−r +1 , γn−r ), where W is finite in the limit and 0n−r −1 is the zero matrix of dimension n − r − 1. T − ˜ As in Sect. 10.2.1, by performing rows and columns operations on the matrix Z 12 T Z 22 , we see that the determinants has the following asymptotic behavior: T ˜ T det Z 12 − Z 22 = γn+r Dr + O(γir ), λ2( j+1) → λ2(2n− j)−1 , Dr = W

n−1

γk ,

k=n−r

where the notation O(γir ) was defined in Eq. (10.27) and W is some constant.


−1 T ˜ − ZT The inverse matrix Z 12 can now be written as in (10.28): 22

T ˜ T − Z 22 Z 12

−1

= X 0 + X −1 + O(γi−2 ), λ2( j+1) → λ2(2n− j)−1 ,

where the entries of the 2r dimensional matrix X 0 satisfy (10.30) with (X 0 )n+r,n+r = 0, −1 and X −1 is a matrix of order −1 in the γ j with (X −1 )n+r,n+r = γn+r . Following exactly the same analysis in Sect. 10.2, we see that the leading order term in the exponential factor in (10.25) is ⎛ ⎞ n−1 −1 T ˜ T T ˜ − Z 22 ξ = β 2 (λ) ⎝2 ξ˜ T Z 12 Z 12 γ j + γn+r ⎠ + O(1). j=n−r

˜ in (10.1). As in Sect. 10.2.1, we see that the theta We now look at the term θ ξ˜ , function becomes 2n − 2r − 2 dimensional: ε ˜ = θ (ξ˜ 0 , ˜ 0 ), θ (ξ˜ , ) lim λ2( j+1) →λ2(2n− j)−1 ε where the arguments on the right-hand side are obtained from removing the (n − r )th up to the (n + r − 1)th entries. Therefore the theta function θ (ξ, ) behaves like ⎛ ⎛ ⎞ ⎞ n−1 ˜ 0 ). θ (ξ, ) = ς exp ⎝2πiβ 2 (λ) ⎝2 γ j + γn+r ⎠ + O(1)⎠ θ (ξ˜ 0 , j=n−r

This completes the proof of the lemma.

By substituting (10.47) into (9.6), we see that the entropy is asymptotic to

$$S(\rho_A) = -\frac{1}{3}\sum_{j=n-r}^{n-1}\log\left|\lambda_{2(j+1)} - \lambda_{2(2n-j)-1}\right| - \frac{1}{6}\log\left|\lambda_{2(n-r)} - \lambda_{2(n+r)+1}\right| + O(1), \quad \lambda_{2(j+1)}\to\lambda_{2(2n-j)-1}.$$

This concludes the proof of Theorem 2.

Appendix A. The Density Matrix of a Subchain

Let $\{\psi_j\}$ be a basis of the Hilbert space $\mathcal{H}$ of a system composed of two parts, A and B, so that $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. The density matrix of a statistical ensemble expressed in the basis $\{\psi_j\}$ is a positive Hermitian matrix given by

$$\rho_{AB} = \sum_{jk} c_{jk}\,|\psi_j\rangle\langle\psi_k|,$$


with the condition tr AB ρAB = 1. Let us introduce the operators S( j, k) and S( j, k) defined by the relations S( j, k) = ψ j ψk | , S( j, k)S(k, l) = δ jl ψ j ψ j and S( j, k)S(k, l) = δ jl ψ j ψ j . (In this formula repeated indices are not summed over.) Clearly, we have / 0 c jk = tr AB ρAB S(k, j) . Let us now suppose that the Hamiltonian of our physical system is (3.10) and that the subsystem P is composed of the first L oscillators. Then a set of operators S( j, k) for the subchain P can be generated by products of the type Lj=1 G j , where G j can be any of the operators {c j , c†j , c†j c j , c j c†j } and the c j ’s are Fermi operators that span HA ; † L it is straightforward to check that S(k, j) = j=1 G j . We then have ⎡

ρA =

⎢ tr P ⎣ρA ⎝

All the S(l,k)

=

All the S(l,k)

=

⎛

L

⎞† ⎤ L ⎥ G j⎠ ⎦ Gj

j=1

⎡

j=1

⎛

⎢ tr P ⎣tr B (ρAB ) ⎝

L

⎞† ⎤ L ⎥ ⎠ Gj ⎦ Gj

j=1

⎡

j=1

⎞† ⎤ L L

⎢ ⎥ tr PQ ⎣ρAB ⎝ G j⎠ ⎦ G j. ⎛

j=1

All the S(l,k)

j=1

Since ρAB = g g , this expression simply reduces to

ρA =

All the S(l,k)

⎛ ⎞† L L g ⎝ G j ⎠ g G j.

j=1

j=1

The correlation functions in the above sum can be computed using Wick’s theorem (3.9). Finally, if the correlations of the c j ’s are given by (4.5) and (4.6), we immediately obtain formula (4.7).

Appendix B. The Correlation Matrix $C_M$

The purpose of this appendix is to provide an explicit derivation of the expectation values

$$\langle\Psi_g|\,m_j m_k\,|\Psi_g\rangle \tag{B.1}$$

when the dynamics is determined by the Hamiltonian (3.10).


First, we need to diagonalize $H_\alpha$, which is achieved by finding a linear transformation of the operators $b_j$ of the form

$$\eta_k = \sum_{j=0}^{M-1}\left(g_{kj}b_j + h_{kj}b_j^{\dagger}\right), \tag{B.2}$$

such that the Hamiltonian (3.10) becomes

$$H_\alpha = \sum_{k=0}^{M-1}|\Lambda_k|\,\eta_k^{\dagger}\eta_k + C, \tag{B.3}$$

where the coefficients $g_{kj}$ and $h_{kj}$ are real, the $\eta_k$'s are Fermi operators and $C$ is a constant. The quadratic form (3.10) can be transformed into (B.3) by (B.2) if the system of equations

$$[\eta_k, H_\alpha] - |\Lambda_k|\,\eta_k = 0, \quad k = 0,\dots,M-1 \tag{B.4}$$

has a solution. Substituting (3.10) and (B.2) into (B.4) we obtain the eigenvalue equations

$$|\Lambda_k|\,g_{kj} = \sum_{l=0}^{M-1}\left(g_{kl}\bar{A}_{lj} - h_{kl}\bar{B}_{lj}\right), \qquad |\Lambda_k|\,h_{kj} = \sum_{l=0}^{M-1}\left(g_{kl}\bar{B}_{lj} - h_{kl}\bar{A}_{lj}\right), \tag{B.5}$$

where $\bar{A} = \alpha A - 2I$ and $\bar{B} = \alpha\gamma B$. These equations can be simplified by setting

$$\phi_{kj} = g_{kj} + h_{kj}, \qquad \psi_{kj} = g_{kj} - h_{kj}, \tag{B.6}$$

in terms of which Eqs. (B.5) become

$$(\bar{A} + \bar{B})\,\phi_k = |\Lambda_k|\,\psi_k, \tag{B.7}$$

$$(\bar{A} - \bar{B})\,\psi_k = |\Lambda_k|\,\phi_k. \tag{B.8}$$

Combining these two expressions, we obtain

$$(\bar{A} - \bar{B})(\bar{A} + \bar{B})\,\phi_k = |\Lambda_k|^{2}\,\phi_k, \tag{B.9}$$

$$(\bar{A} + \bar{B})(\bar{A} - \bar{B})\,\psi_k = |\Lambda_k|^{2}\,\psi_k. \tag{B.10}$$

When $\Lambda_k \ne 0$, $\phi_k$ and $|\Lambda_k|$ can be determined by solving the eigenvalue Eq. (B.9), then $\psi_k$ can be computed using (B.7). Alternatively, one can solve Eq. (B.10) and then obtain $\phi_k$ from (B.8). When $\Lambda_k = 0$, $\phi_k$ and $\psi_k$ differ at most by a sign and can be deduced directly either from (B.7) and (B.8) or from (B.9) and (B.10).


Since $\bar{A}$ and $\bar{B}$ are real, the matrices $(\bar{A} - \bar{B})(\bar{A} + \bar{B})$ and $(\bar{A} + \bar{B})(\bar{A} - \bar{B})$ are symmetric and positive, which guarantees that all of their eigenvalues are positive. Furthermore, the $\phi_k$'s and $\psi_k$'s can be chosen to be real and orthonormal. As a consequence the coefficients $g_{kj}$ and $h_{kj}$ obey the constraints

$$\sum_{k=0}^{M-1}\left(g_{kj}g_{kl} + h_{kj}h_{kl}\right) = \delta_{jl}, \tag{B.11}$$

$$\sum_{k=0}^{M-1}\left(g_{kj}h_{kl} + h_{kj}g_{kl}\right) = 0, \tag{B.12}$$

which are necessary and sufficient conditions for the $\eta_k$'s to be Fermi operators. The constant in Eq. (B.3) can be computed by taking the trace of $H_\alpha$ using the two expressions (3.10) and (B.3):

$$\mathrm{tr}\,H_\alpha = 2^{M-1}\sum_{k=0}^{M-1}\left(\alpha A_{kk} - 2\right) = 2^{M-1}\sum_{k=0}^{M-1}|\Lambda_k| + 2^{M}C.$$

Therefore, we have

$$C = \frac{1}{2}\sum_{k=0}^{M-1}\left(\alpha A_{kk} - 2 - |\Lambda_k|\right).$$
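The symmetry and positivity of the two products entering (B.9) and (B.10) follow in one line under the assumption, standard for quadratic fermionic Hamiltonians of this type but not restated in this appendix, that the first matrix is symmetric and the second antisymmetric. Writing $A$ and $B$ here for the matrices that enter (B.7)–(B.10), so that $(A+B)^{T} = A - B$, a sketch of the argument is:

```latex
(A - B)(A + B) = (A + B)^{T}(A + B),
\qquad
v^{T}(A - B)(A + B)\,v = \bigl\|(A + B)\,v\bigr\|^{2} \ge 0
\quad \text{for all real } v .
```

The same computation with $A + B$ and $A - B$ exchanged covers the product in (B.10). Strict positivity of all eigenvalues additionally requires $A + B$ to be invertible; otherwise zero eigenvalues occur, which is the case $\Lambda_k = 0$ treated separately above.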

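We are now in a position to compute the contraction pair (B.1). Substituting (B.6) into (B.2) we have

$$\eta_k = \frac{1}{2}\sum_{j=0}^{M-1}\left(\phi_{kj}\,m_{2j+1} - i\,\psi_{kj}\,m_{2j}\right). \tag{B.13}$$

Since the $\phi_k$'s and $\psi_k$'s are two sets of real and orthogonal vectors, (B.13) can be inverted to give

$$m_{2j} = i\sum_{k=0}^{M-1}\psi_{kj}\left(\eta_k - \eta_k^{\dagger}\right), \tag{B.14}$$

$$m_{2j+1} = \sum_{k=0}^{M-1}\phi_{kj}\left(\eta_k + \eta_k^{\dagger}\right). \tag{B.15}$$

Since the vacuum state of the operators $\eta_k$ coincides with $|\Psi_g\rangle$, the expectation values (B.1) are easily computed from the expressions (B.14) and (B.15). We have

$$\langle\Psi_g|\,m_{2j}m_{2k}\,|\Psi_g\rangle = \sum_{l=0}^{M-1}\psi_{lj}\psi_{lk} = \delta_{jk}, \tag{B.16}$$

$$\langle\Psi_g|\,m_{2j+1}m_{2k+1}\,|\Psi_g\rangle = \sum_{l=0}^{M-1}\phi_{lj}\phi_{lk} = \delta_{jk} \tag{B.17}$$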

178

A. R. Its, F. Mezzadri, M. Y. Mo

and

M−1 g m 2 j m 2k+1 g = i ψl j φlk ,

(B.18)

l=0 M−1 g m 2 j+1 m 2k g = −i ψlk φl j .

(B.19)

l=0

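Finally, by introducing the real $M \times M$ matrix

$$(T_M)_{jk} = \sum_{l=0}^{M-1} \psi_{lj}\,\phi_{lk}, \qquad j, k = 0, \ldots, M-1, \tag{B.20}$$

and combining the expressions (B.16), (B.17), (B.18) and (B.19) we obtain

$$\langle \Psi_g | m_j\, m_k | \Psi_g \rangle = \delta_{jk} + i\,(C_M)_{jk}, \tag{B.21}$$

where the matrix $C_M$ has the block structure

$$C_M = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1M} \\ C_{21} & C_{22} & \cdots & C_{2M} \\ \cdots & \cdots & \cdots & \cdots \\ C_{M1} & C_{M2} & \cdots & C_{MM} \end{pmatrix} \tag{B.22}$$

with

$$C_{jk} = \begin{pmatrix} 0 & (T_M)_{jk} \\ -(T_M)_{kj} & 0 \end{pmatrix}. \tag{B.23}$$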

We call $C_M$ the correlation matrix. It is worth noting that because of the definition (B.20), the matrix $T_M$ contains all of the physical information relating to the ground state of $H_\alpha$.

Appendix C. Thermodynamic Limit of the Correlation Matrix $C_M$

In this appendix we prove the following

Lemma 14. Let $H_\alpha$ be the Hamiltonian (3.10) and consider the correlation matrix (B.22) associated to $H_\alpha$. We have

$$\lim_{M \to \infty} C_M = T_\infty[\Phi], \tag{C.1}$$

where $T_\infty[\Phi]$ is the semi-infinite block-Toeplitz matrix with symbol

$$\Phi = \begin{pmatrix} 0 & g(e^{i\theta}) \\ -g^{-1}(e^{i\theta}) & 0 \end{pmatrix},$$

where the function $g(z)$ is defined in (3.15).


Proof. From the definitions (3.15) and (3.16) we have that $g(e^{-i\theta}) = \overline{g(e^{i\theta})} = g^{-1}(e^{i\theta})$. Thus, from Eq. (B.23) it suffices to show that

$$\lim_{M \to \infty} (T_M)_{jk} = \frac{1}{2\pi} \int_0^{2\pi} g(e^{i\theta})\, e^{-i(j-k)\theta}\, d\theta, \tag{C.2}$$

where $g(z)$ is defined in (3.15). The first step consists in determining the vectors $\phi_k$ and $\psi_k$, and the numbers $\Lambda_k$ via the eigenvalue Eqs. (B.7), (B.8), (B.9) and (B.10). If we use the definitions (3.12), we can write $(\tilde{A} + \tilde{B})_{jk} = a(j-k) + \gamma\, b(j-k)$ and $(\tilde{A} - \tilde{B})_{jk} = a(j-k) - \gamma\, b(j-k)$. Two arbitrary circulant matrices commute and a common set of normalised eigenvectors is given by

$$\psi_{kj} = \frac{\exp\left( \frac{2\pi i\, jk}{M} \right)}{\sqrt{M}}, \qquad j, k = 0, \ldots, M-1, \tag{C.3}$$

where the index $j$ labels the component of the $k$th eigenvector. As a consequence, the $\psi_k$'s are a set of common eigenvectors of both $(\tilde{A} + \tilde{B})(\tilde{A} - \tilde{B})$ and $(\tilde{A} - \tilde{B})$. Now, combining Eqs. (B.8) and (B.10) we can write

$$\sum_{l=0}^{M-1} \left[ a(j-l) - \gamma\, b(j-l) \right] \psi_{kl} = \tilde{\Lambda}_k\, \psi_{kj} = |\Lambda_k|\, \phi_{kj}, \tag{C.4}$$

with $\phi_k = \psi_k\, \tilde{\Lambda}_k / |\Lambda_k|$. Because both $\phi_k$ and $\psi_k$ are normalized, $\tilde{\Lambda}_k / |\Lambda_k|$ must be a complex number with modulus one and we can set $\Lambda_k = \tilde{\Lambda}_k$. The eigenvalues $\Lambda_k$ can be computed by directly substituting the eigenvectors (C.3) into the left-hand side of (C.4) and using the parity properties of the functions $a(j)$ and $b(j)$. We obtain

$$\Lambda_k = \begin{cases} \displaystyle\sum_{j=-(M-1)/2}^{(M-1)/2} \left( a(j) - \gamma\, b(j) \right) e^{ikj}, & \text{if } M \text{ is odd}, \\[2ex] \displaystyle\sum_{j=-(M/2-1)}^{M/2-1} \left( a(j) - \gamma\, b(j) \right) e^{ikj} + (-1)^l\, a(M/2), & \text{if } M \text{ is even}, \end{cases} \tag{C.5}$$

where $k$ does not denote an integer but the wave number

$$k = \frac{2\pi l}{M}, \qquad l = 0, \ldots, M-1.$$
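The two facts about circulant matrices used in this step can be checked on a small example. The following sketch is only an illustration: the sequences standing in for a(j) and b(j) are hypothetical (one even, one odd, on five sites) and the helper names are ours. It verifies that two circulant matrices commute and that the Fourier vectors of the form (C.3) are eigenvectors of both.

```python
import cmath

def circulant(c):
    """M x M circulant matrix with entries C[j][l] = c[(j - l) mod M]."""
    M = len(c)
    return [[c[(j - l) % M] for l in range(M)] for j in range(M)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def fourier_vector(k, M):
    """Components exp(2 pi i j k / M) / sqrt(M), j = 0, ..., M - 1, cf. Eq. (C.3)."""
    return [cmath.exp(2j * cmath.pi * j * k / M) / M ** 0.5 for j in range(M)]

def eigen_residual(C, k):
    """Largest entry of C v - mu v for the k-th Fourier vector v."""
    M = len(C)
    v = fourier_vector(k, M)
    Cv = [sum(C[j][l] * v[l] for l in range(M)) for j in range(M)]
    mu = Cv[0] / v[0]  # v[0] = 1/sqrt(M) is never zero
    return max(abs(Cv[j] - mu * v[j]) for j in range(M))

# Hypothetical sequences (assumptions, not from the paper): even a(j), odd b(j) on Z_5.
C1 = circulant([2.0, 1.0, 0.0, 0.0, 1.0])
C2 = circulant([0.0, 0.5, 0.0, 0.0, -0.5])

P, Q = matmul(C1, C2), matmul(C2, C1)
commutator_norm = max(abs(P[i][j] - Q[i][j]) for i in range(5) for j in range(5))
worst_residual = max(eigen_residual(C, k) for C in (C1, C2) for k in range(5))
```

Both quantities should vanish up to rounding error, whatever sequences are chosen.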

We now define the matrix

$$(T_M)_{jk} = \sum_{l=0}^{M-1} \overline{\psi}_{lj}\, \phi_{lk}. \tag{C.6}$$

Note that for convenience we have used the complex eigenvectors (C.3), while the matrix (B.20) is defined in terms of the real eigenvectors of $(\tilde{A} - \tilde{B})(\tilde{A} + \tilde{B})$ and $(\tilde{A} + \tilde{B})(\tilde{A} - \tilde{B})$. However, these are related by the transformations $\phi_k \to U\phi_k$ and $\psi_k \to U\psi_k$ with the same unitary matrix $U$. This mapping leaves the right-hand side of Eq. (C.6) unchanged. Therefore, the two matrices (B.20) and (C.6) coincide. The matrix (C.6) now becomes

$$(T_M)_{jl} = \frac{1}{2\pi} \sum_{k=0}^{2\pi(1 - 1/M)} \frac{\Lambda_k}{|\Lambda_k|}\, e^{-ik(j-l)}\, \Delta k. \tag{C.7}$$

For $M$ large enough there exists an integer $n < M$ such that $a(j) = b(j) = 0$ for $j > n$. Therefore,

$$\lim_{M \to \infty} \Lambda_k(M) = q(e^{i\theta}) = \sum_{j=-n}^{n} \left( a(j) - \gamma\, b(j) \right) e^{ij\theta}.$$

By taking the limit as $M \to \infty$ of the left-hand side of Eq. (C.7) we obtain (C.2).
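The passage from the Riemann sum (C.7) to the integral (C.2) can be illustrated numerically. The sketch below is an assumption-laden toy: the function q is a hypothetical trigonometric polynomial that does not vanish on the unit circle, and g is taken to be q/|q|, as the comparison of (C.7) with (C.2) suggests. It evaluates the sum for two values of M and compares the results.

```python
import cmath
import math

def q(theta):
    """Hypothetical symbol (not from the paper); its real part is at least 1."""
    return 2.0 + math.cos(theta) + 0.5j * math.sin(theta)

def g(theta):
    """Unimodular function q / |q| on the unit circle (assumed form of g)."""
    value = q(theta)
    return value / abs(value)

def toeplitz_entry(M, d):
    """Riemann sum (1/2pi) * sum_k g(k) exp(-i k d) * Delta k, with Delta k = 2pi/M."""
    total = 0.0
    for l in range(M):
        k = 2.0 * math.pi * l / M  # wave number k = 2 pi l / M
        total += g(k) * cmath.exp(-1j * k * d)
    return total / M

coarse = toeplitz_entry(64, 1)    # j - l = 1 with M = 64
fine = toeplitz_entry(4096, 1)    # same entry with M = 4096
```

Because the integrand is smooth and periodic, the equally spaced sum converges very quickly, so the two values agree to many digits.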

Appendix D. The Riemann Constant K

In this appendix we will show that the Riemann constant $K$ is given by

$$K = -\sum_{j=2}^{2n} \omega(\lambda_{2j-1}).$$

As in [6], let $Q_1, \ldots, Q_{2n-1}$ be the zeros of the theta function $\theta(\omega(z))$. Then the function

$$\theta\Big( \omega(z) - \sum_{j=1}^{2n-1} \omega(Q_j) - K \Big)$$

has the same zeros as $\theta(\omega(z))$. Therefore, the quotient of these two functions can be written as an Abelian integral of a holomorphic 1-form $\nu$:

$$\frac{\theta\big( \omega(z) - \sum_{j=1}^{2n-1} \omega(Q_j) - K \big)}{\theta(\omega(z))} = \int^{z} \nu.$$

Moreover, all the a-periods of $\nu$ must vanish. Thus, the right-hand side of the above equation is in fact a constant $C$:

$$\frac{\theta\big( \omega(z) - \sum_{j=1}^{2n-1} \omega(Q_j) - K \big)}{\theta(\omega(z))} = C.$$

Therefore, we have

$$\sum_{j=1}^{2n-1} \omega(Q_j) = -K.$$

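We will now compute the values of $\omega(\lambda_i)$ in the basis $a_1, \ldots, a_{2n-1}, b_1, \ldots, b_{2n-1}$ and show that the $2n-1$ points $\lambda_3, \ldots, \lambda_{4n-1}$ are the zeros of $\theta(\omega(z))$. We have

$$\omega_j(\lambda_{2k+1}) = \tfrac{1}{2}\,\Pi_{j,k}, \qquad 0 < j < k \le 2n-1,$$

$$\omega_j(\lambda_{2k+1}) = -\tfrac{1}{2} + \tfrac{1}{2}\,\Pi_{j,k}, \qquad 0 < k \le j \le 2n-1,$$

$$\omega_j(\lambda_{2k}) = \tfrac{1}{2}\,\Pi_{j,k-1}, \qquad 0 < j < k \le 2n,$$

$$\omega_j(\lambda_{2k}) = -\tfrac{1}{2} + \tfrac{1}{2}\,\Pi_{j,k-1}, \qquad 1 < k \le j \le 2n.$$

If we write $\omega(\lambda_i)$ as

$$\omega(\lambda_i) = \tfrac{1}{2} N_i + \tfrac{1}{2}\,\Pi M_i,$$

then, from the periodicity (6.6) of the theta function, we have

$$\theta(\omega(\lambda_i)) = \exp\left( -\pi i\, \langle N_i, M_i \rangle \right) \theta(-\omega(\lambda_i)).$$

Since $\langle N_{2i+1}, M_{2i+1} \rangle$ are odd for $1 \le i \le 2n-1$, we see that $\theta(\omega(\lambda_{2i+1})) = 0$ and hence the $g$ zeros of $\theta(\omega(z))$ are the points $\lambda_3, \ldots, \lambda_{4n-1}$. Therefore, we have

$$K = -\sum_{j=2}^{2n} \omega(\lambda_{2j-1}).$$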

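Appendix E. The Cycle Basis (10.18)

In this appendix we will show that the basis defined in (10.18) is canonical. First note that, by direct computation, it is easy to check that the intersections between the a-cycles are zero:

$$\tilde{a}_{n-j-1} \cdot \tilde{a}_{n+l} = 0, \qquad 0 \le j, l \le r-1.$$

We will now compute the other intersection numbers by induction. First let us compute the intersection numbers between the tilded basis and the untilded basis. We have

$$a_{n-k-1} \cdot \tilde{a}_{n-j-1} = \delta_{k,j}, \tag{E.1}$$

$$a_{n-k-1} \cdot \tilde{a}_{n+j} = -\delta_{k,j}, \tag{E.2}$$

$$a_{n+k} \cdot \tilde{a}_{n-j-1} = -\delta_{k,j}, \tag{E.3}$$

$$a_{n+k} \cdot \tilde{a}_{n+j} = \delta_{k,j}, \tag{E.4}$$

$$a_{n-k-1} \cdot \tilde{b}_{n-j-1} = \begin{cases} 1, & k = j; \\ 2(-1)^{k-j}, & j+1 \le k; \\ 0, & 0 \le k \le j-1; \end{cases} \tag{E.5}$$

$$a_{n-k-1} \cdot \tilde{b}_{n+j} = \begin{cases} 0, & 0 \le k \le j; \\ 2(-1)^{k-j}, & j+1 \le k; \end{cases} \tag{E.6}$$

$$a_{n+k} \cdot \tilde{b}_{n-j-1} = 0, \tag{E.7}$$

$$a_{n+k} \cdot \tilde{b}_{n+j} = \delta_{k,j}, \tag{E.8}$$

$$b_{n+k} \cdot \tilde{a}_{n-j-1} = \begin{cases} -1, & 0 \le k \le j-1; \\ 0, & j \le k; \end{cases} \tag{E.9}$$

$$b_{n-k-1} \cdot \tilde{a}_{n-j-1} = \begin{cases} -1, & 0 \le k \le j-1; \\ 0, & j \le k; \end{cases} \tag{E.10}$$

$$b_{n+k} \cdot \tilde{a}_{n+j} = \begin{cases} -1, & 0 \le k \le j; \\ 0, & j+1 \le k; \end{cases} \tag{E.11}$$

$$b_{n-k-1} \cdot \tilde{a}_{n+j} = \begin{cases} -1, & 0 \le k \le j; \\ 0, & j+1 \le k; \end{cases} \tag{E.12}$$

$$b_{n+k} \cdot \tilde{b}_{n-j-1} = \begin{cases} 1, & 0 \le k \le j-1; \\ 0, & j \le k; \end{cases} \tag{E.13}$$

$$b_{n-k-1} \cdot \tilde{b}_{n-j-1} = \begin{cases} 1, & 0 \le k \le j; \\ (-1)^{k-j}, & j+1 \le k; \end{cases} \tag{E.14}$$

$$b_{n-k-1} \cdot \tilde{b}_{n+j} = \begin{cases} 0, & 0 \le k \le j; \\ (-1)^{k-j}, & j+1 \le k; \end{cases} \tag{E.15}$$

$$b_{n+k} \cdot \tilde{b}_{n+j} = 0, \tag{E.16}$$

where $j, k$ range from $0$ to $r-1$. Now, we have $\tilde{b}_{n+r-1} = b_{n+r-1}$. Then from (E.9)–(E.16), we obtain the following intersection numbers:

$$\tilde{b}_{n+r-1} \cdot \tilde{a}_j = -\delta_{n+r-1,j}, \qquad \tilde{b}_{n+r-1} \cdot \tilde{b}_j = 0.$$

Next, from (10.18) we have

$$\tilde{b}_{n+k} + \tilde{b}_{n+k-1} = b_{n+k} + b_{n+k-1} + a_{n-k-1} - 2 b_{n-k-1}, \qquad k = 1, \ldots, r-1.$$

From this relation and Eqs. (E.1)–(E.16), we obtain

$$\left( \tilde{b}_{n+k} + \tilde{b}_{n+k-1} \right) \cdot \tilde{a}_j = -\delta_{j,n+k} - \delta_{j,n+k-1}, \qquad \left( \tilde{b}_{n+k} + \tilde{b}_{n+k-1} \right) \cdot \tilde{b}_j = 0, \qquad j = 1, \ldots, 2n-1.$$

Therefore, if we assume that $\tilde{b}_{n+k}$ has the intersection numbers

$$\tilde{b}_{n+k} \cdot \tilde{a}_j = -\delta_{j,n+k}, \qquad \tilde{b}_{n+k} \cdot \tilde{b}_j = 0, \qquad j = 1, \ldots, 2n-1,$$

then $\tilde{b}_{n+k-1}$ will have the intersection numbers

$$\tilde{b}_{n+k-1} \cdot \tilde{a}_j = -\delta_{j,n+k-1}, \qquad \tilde{b}_{n+k-1} \cdot \tilde{b}_j = 0, \qquad j = 1, \ldots, 2n-1, \quad 1 \le k.$$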


Therefore, by induction we see that

$$\tilde{b}_{n+k} \cdot \tilde{a}_j = -\delta_{j,n+k}, \qquad \tilde{b}_{n+k} \cdot \tilde{b}_j = 0, \qquad j = 1, \ldots, 2n-1, \quad k = 0, \ldots, r-1. \tag{E.17}$$

We can now compute the intersection numbers of the $\tilde{b}_{n-k-1}$. We have

$$\tilde{b}_{n-k-1} - \tilde{b}_{n+k} = -\tilde{a}_{n+k} + a_{n+k}, \qquad k = 0, \ldots, r-1.$$

Therefore, by using (E.1)–(E.17) we obtain

$$\left( \tilde{b}_{n+k} - \tilde{b}_{n-k-1} \right) \cdot \tilde{a}_j = -\delta_{j,n+k} + \delta_{j,n-k-1}, \qquad \left( \tilde{b}_{n+k} - \tilde{b}_{n-k-1} \right) \cdot \tilde{b}_j = 0, \qquad j = 1, \ldots, 2n-1.$$

From (E.17), we see that the intersection numbers for the $\tilde{b}_{n-k-1}$ are indeed given by

$$\tilde{b}_{n-k-1} \cdot \tilde{a}_j = -\delta_{j,n-k-1}, \qquad \tilde{b}_{n-k-1} \cdot \tilde{b}_j = 0, \qquad j = 1, \ldots, 2n-1, \quad k = 0, \ldots, r-1.$$

Appendix F. Solvability of the Wiener-Hopf Factorization Problem

We now show that the Wiener-Hopf factorization problem (5.3) is solvable when $\beta(\lambda)$ is purely imaginary. In other words, we have

Theorem 9. The following Riemann-Hilbert problem

$$T_+(z) = \Phi(z)\, T_-(z), \quad |z| = 1, \qquad \Phi(z) = \begin{pmatrix} i\lambda & g(z) \\ -g^{-1}(z) & i\lambda \end{pmatrix}, \tag{F.1}$$

where $T_+(z)$ is holomorphic for $|z| < 1$ and $T_-(z)$ is holomorphic for $|z| > 1$ with $T_-(\infty) = 1$, is solvable when $\beta(\lambda) \in i\mathbb{R}$.

Proof. We will use the vanishing lemma to prove this theorem. As in [8], we need to show that a certain singular integral operator is a bijection. The solvability of the Riemann-Hilbert problem is related to the bijectivity of a singular integral operator. Let $C$ be the Cauchy operator

$$C(f)(z) = \frac{1}{2\pi i} \int_{\Xi} \frac{f(s)}{s - z}\, ds, \qquad f \in L^2(\Xi),$$

and let $C_+$, $C_-$ be its limit on the positive and negative side of the real axis

$$C_{\pm}(f)(z) = \lim_{\epsilon \to 0} C(f)(z \pm i\epsilon), \qquad z \in \Xi.$$

Now, define the singular integral operator $C_\Phi$ as in [8],

$$C_\Phi(f) = C_+\left( f\,(I - \Phi^{-1}) \right). \tag{F.2}$$


Suppose that $I - C_\Phi$ is invertible in $L^2(\Xi)$, and let $\mu = (I - C_\Phi)^{-1} C_+(I - \Phi^{-1})$: then the function

$$\hat{T}(z) = I + C\left( (I + \mu)(I - \Phi^{-1}) \right)$$

is a solution to the Riemann-Hilbert problem (F.1). In fact, we have

$$\hat{T}_+(z) = I + C_+(I - \Phi^{-1}) + C_\Phi\, \mu = I + \mu(z),$$

$$\hat{T}_-(z) = \hat{T}_+(z) - I - \mu(z) + \Phi^{-1}(z)\,(I + \mu(z)) = \Phi^{-1}(z)\, \hat{T}_+(z), \qquad |z| = 1,$$

where the second equation follows from the identity $C_+ - C_- = I$. Therefore, in order to show that (F.1) is solvable when $\beta(\lambda) \in i\mathbb{R}$, we need to show that $I - C_\Phi$ is invertible in $L^2(\Xi)$. Using standard analysis (see, e.g., [8]), we can show that the operator $C_\Phi$ is Fredholm and has index zero. Therefore we only need to show that its kernel is $\{0\}$. Suppose that the kernel is non-trivial and let $(I - C_\Phi)\mu_0 = 0$. Then the function

$$\hat{T}_0(z) = C\left( \mu_0\,(I - \Phi^{-1}) \right)$$

will solve the Riemann-Hilbert problem (F.1), but its asymptotic behavior will be

$$\hat{T}_0(z) = O(z^{-1}), \qquad z \to \infty.$$

This means that the function $R(z) = \hat{T}_0^{\dagger}(\bar{z}^{-1})\, \hat{T}_0(z)$, where $A^{\dagger}$ is the Hermitian conjugate of $A$, is analytic outside the unit circle and behaves like $O(z^{-2})$ at infinity. Thus, by Cauchy's theorem, we have

$$\oint_{\Xi} R_-(z)\, dz = 0.$$

By making use of the jump conditions, we obtain

$$\oint_{\Xi} R_-(z)\, dz = \oint_{\Xi} \big( \hat{T}_0^{\dagger}(z) \big)_+ \big( \hat{T}_0(z) \big)_-\, dz = \oint_{\Xi} \big( \hat{T}_0^{\dagger}(z) \big)_-\, \Phi^{\dagger}(z)\, \big( \hat{T}_0(z) \big)_-\, dz = 0. \tag{F.3}$$

From (6.2), we see that the eigenvalues of $\Phi(z)$ are $i(\lambda + 1)$ and $i(\lambda - 1)$. Therefore the matrix $i\Phi^{\dagger}(z)$ is Hermitian and is either positive definite or negative definite for $\beta(\lambda) \in i\mathbb{R}$. This means that the boundary value of $\big( \hat{T}_0(z) \big)_-$ on the unit circle is zero. In particular, it implies that $\hat{T}_0(z) = 0$ and hence the kernel of the singular integral operator $I - C_\Phi$ is trivial. This concludes the proof of the theorem.
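For the reader's convenience, the eigenvalues of the jump matrix quoted in the proof can also be read off directly from the form of the matrix in (F.1), without appealing to (6.2). The short computation below is a sketch that assumes only that form; the symbol $\mu$ here denotes an eigenvalue and is unrelated to the function $\mu$ used earlier in the proof.

```latex
\det\bigl(\Phi(z) - \mu I\bigr)
  = \det\begin{pmatrix} i\lambda - \mu & g(z) \\ -g^{-1}(z) & i\lambda - \mu \end{pmatrix}
  = (i\lambda - \mu)^2 + g(z)\,g^{-1}(z)
  = (i\lambda - \mu)^2 + 1 = 0
  \quad\Longrightarrow\quad
  \mu = i\lambda \pm i = i(\lambda \pm 1).
```

In particular the eigenvalues do not depend on $z$, which is what makes the definiteness argument uniform on the unit circle.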


References

1. Belokolos, E.D., Bobenko, A.I., Enolskii, V.Z., Its, A.R., Matveev, V.B.: Algebro-geometric approach to nonlinear integrable equations. Springer series in nonlinear dynamics, Berlin-Heidelberg-New York: Springer-Verlag, 1995
2. Bennett, C.H., Bernstein, H.J., Popescu, S., Schumacher, B.: Concentrating partial entanglement by local operations. Phys. Rev. A 53, 2046–2052 (1996)
3. Calabrese, P., Cardy, J.: Entanglement entropy and quantum field theory. J. Stat. Mech. Theor. Exp., P06002 (2004)
4. Calabrese, P., Cardy, J.: Evolution of entanglement entropy in one-dimensional systems. J. Stat. Mech. Theor. Exp., P04010 (2005)
5. Deift, P.A.: Integrable operators. Amer. Math. Soc. Transl. (2) 189, 69–84 (1999)
6. Farkas, H.M., Kra, I.: Riemann surfaces. Graduate Texts in Mathematics, 71. New York-Berlin: Springer-Verlag, 1980
7. Rauch, H.E., Farkas, H.M.: Theta functions with applications to Riemann surfaces. Baltimore, MD: The Williams and Wilkins Co., 1974
8. Fokas, A.S., Zhou, X.: On the solvability of Painlevé II and IV. Commun. Math. Phys. 144(3), 601–622 (1992)
9. Harnad, J., Its, A.R.: Integrable Fredholm operators and dual isomonodromic deformations. Commun. Math. Phys. 226, 497–530 (2002)
10. Holzhey, C., Larsen, F., Wilczek, F.: Geometric and renormalized entropy in conformal field theory. Nucl. Phys. B 424, 443–467 (1994)
11. Its, A.R., Izergin, A.G., Korepin, V.E., Slavnov, N.A.: Differential equations for quantum correlation functions. Proceedings of the Conference on Yang-Baxter Equations, Conformal Invariance and Integrability in Statistical Mechanics and Field Theory. Int. J. Mod. Phys. B 4, 1003–1037 (1990)
12. Its, A.R., Jin, B.Q., Korepin, V.E.: Entanglement in the XY spin chain. J. Phys. A 38, 2975–2990 (2005)
13. Its, A.R., Jin, B.Q., Korepin, V.E.: Entropy of XY Spin Chain and Block Toeplitz Determinants. In: Fields Inst. Commun. Universality and Renormalization, I. Binder, D. Kreimer (eds.), Vol. 50, Providence, RI: Amer. Math. Soc., 2007, p. 151
14. Jin, B.Q., Korepin, V.E.: Entanglement, Toeplitz determinants and Fisher-Hartwig conjecture. J. Stat. Phys. 116, 79–95 (2004)
15. Korepin, V.E.: Universality of entropy scaling in 1D gap-less models. Phys. Rev. Lett. 92, 096402 (2004)
16. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: Wiley Interscience, 1978
17. Lieb, E., Schultz, T., Mattis, D.: Two soluble models of an antiferromagnetic chain. Ann. Phys. 16, 407–466 (1961)
18. Keating, J.P., Mezzadri, F.: Random matrix theory and entanglement in quantum spin chains. Commun. Math. Phys. 242, 543–579 (2004)
19. Keating, J.P., Mezzadri, F.: Entanglement in quantum spin chains, symmetry classes of random matrices, and conformal field theory. Phys. Rev. Lett. 94, 050501 (2005)
20. Osterloh, A., Amico, L., Falci, G., Fazio, R.: Scaling of entanglement close to a quantum phase transition. Nature 416, 608–610 (2002)
21. Peschel, I.: On the entanglement entropy for an XY spin chain. J. Stat. Mech. Theor. Exp., P12005 (2004)
22. Osborne, T.J., Nielsen, M.A.: Entanglement in a simple quantum phase transition. Phys. Rev. A 66, 032110 (2002)
23. Vidal, G., Latorre, J.I., Rico, E., Kitaev, A.: Entanglement in quantum critical phenomena. Phys. Rev. Lett. 90, 227902 (2003)
24. Widom, H.: Asymptotic behavior of block Toeplitz matrices and determinants. Adv. Math. 13, 284–322 (1974)
25. Widom, H.: On the limit of block Toeplitz determinants. Proc. Amer. Math. Soc. 50, 167–173 (1975)

Communicated by P. Sarnak

Commun. Math. Phys. 284, 187–202 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0645-8

Communications in

Mathematical Physics

Magnetic Flows on Sol-Manifolds: Dynamical and Symplectic Aspects

Leo T. Butler1, Gabriel P. Paternain2

1 School of Mathematics and Maxwell Institute for Mathematical Sciences, The University of Edinburgh, James Clerk Maxwell Building, King’s Buildings, Edinburgh, EH9 3JZ, UK. E-mail: [email protected]

2 Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge CB3 0WB, UK. E-mail: [email protected]

Received: 20 August 2007 / Accepted: 24 June 2008 Published online: 23 September 2008 – © Springer-Verlag 2008

Abstract: We consider magnetic flows on compact quotients of the 3-dimensional solvable geometry Sol determined by the usual left-invariant metric and the distinguished monopole. We show that these flows have positive Liouville entropy and therefore are never completely integrable. This should be compared with the known fact that the underlying geodesic flow is completely integrable in spite of having positive topological entropy. We also show that for a large class of twisted cotangent bundles of solvable manifolds every compact set is displaceable.

1. Introduction

The Lie group Sol is the semidirect product associated with the action of $\mathbb{R}$ on $\mathbb{R}^2$ given by $u \cdot (y_0, y_1) = (e^{u} y_0, e^{-u} y_1)$. The group Sol is diffeomorphic to $\mathbb{R}^3$ and the product is

$$(y_0, y_1, u)\,(y_0', y_1', u') = (e^{u} y_0' + y_0,\ e^{-u} y_1' + y_1,\ u + u').$$

It is not difficult to see that Sol admits cocompact lattices. Let $A \in SL(2, \mathbb{Z})$ be such that there is $P \in GL(2, \mathbb{R})$ with

$$P A P^{-1} = \begin{pmatrix} \lambda & 0 \\ 0 & 1/\lambda \end{pmatrix}$$

and $\lambda > 1$. There is an injective homomorphism

$$\mathbb{Z}^2 \ltimes_A \mathbb{Z} \to \mathrm{Sol}$$

given by $(m, n, l) \mapsto (P(m, n), (\log\lambda)\, l)$ which defines a cocompact lattice $\Gamma$ in Sol. The closed 3-manifold $\Sigma := \Gamma \backslash \mathrm{Sol}$ is a 2-torus bundle over the circle with hyperbolic gluing map $A$. The Riemannian metric

$$ds^2 = e^{-2u}\, dy_0^2 + e^{2u}\, dy_1^2 + du^2$$

is left-invariant and descends to a Riemannian metric on $\Sigma$. It is a remarkable fact discovered by A. Bolsinov and I. Taimanov [2] that the geodesic flow of $(\Sigma, ds^2)$ is completely integrable in the sense of Liouville with the two additional integrals

$$f = p_{y_0}\, p_{y_1}, \qquad F = \exp\left( \frac{-1}{p_{y_0}^2\, p_{y_1}^2} \right) \sin\left( 2\pi\, \frac{\log |p_{y_0}|}{\log\lambda} \right).$$

The geodesic flow has topological entropy $h_{top} = 1$ but Liouville (or metric) entropy $h_\mu = 0$. It is the simplest example of a geodesic flow on a compact homogeneous space with these properties. Note that the lattice $\Gamma$ has exponential word growth and the entropy is all carried in the minimizing Aubry-Mather sets given by $p_u = \pm 1$, $p_{y_0} = p_{y_1} = 0$. The dynamics on these sets is Anosov and given by the suspension of $A$. We refer to [1] for a detailed description of the foliation by Liouville tori and for spectral properties of the Laplace-Beltrami operator of $(\Sigma, ds^2)$.

The manifold $\Sigma$ has a distinguished monopole, i.e. a closed non-exact 2-form which generates $H^2(\Sigma, \mathbb{R})$, given by $\Omega = dy_0 \wedge dy_1$. This form is harmonic and Hodge dual to the generator $du$ of $H^1(\Sigma, \mathbb{R})$. The Aubry-Mather sets we mentioned before are calibrated by the closed 1-forms $\pm du$.

The first goal of this paper is the study of the dynamics of the magnetic flow determined by the metric $ds^2$ and the monopole $\Omega$. We will modulate the intensity of the magnetic field with a parameter $s \in [0, \infty)$ and we will always consider the magnetic flow $\varphi^s$ running with speed one. The analysis of the flow is carried out in Sect. 3. One of our findings is that the magnetic flow ceases to be Liouville integrable as soon as the magnetic field is switched on. The reason is that the Liouville entropy becomes positive. In fact, one can compute the Liouville entropy exactly as we now explain.

Since all the objects involved are left-invariant the flow $\varphi^s$ may be reduced to an Euler flow $\psi^s$ on $\mathfrak{s}^*$, the dual of the Lie algebra $\mathfrak{s}$ of Sol. With respect to the basis of left-invariant 1-forms $\{e^{-u} dy_0, e^{u} dy_1, du\}$, a point in $\mathfrak{s}^*$ will have coordinates $(\alpha_0, \alpha_1, \nu)$. It is easy to see that $f = \alpha_0 \alpha_1 + s\nu$ is a Casimir and thus an integral of $\psi^s$. Observe that $\psi^s$ leaves invariant the sphere $S$ given by $\alpha_0^2 + \alpha_1^2 + \nu^2 = 1$. Let $d\theta$ be its canonical probability area measure.

Theorem A. The Liouville entropy of $\varphi^s$ is given by

$$h_\mu(\varphi^s) = \int_S |\bar{\nu}|\, d\theta,$$

where $\bar{\nu}$ is the average of $\nu$ over the level sets of the Casimir $f$. Moreover, $h_\mu(\varphi^s) > 0$ for all $s > 0$ and approaches $1/2$ as $s \to \infty$, while $h_{top}(\varphi^s) \equiv 1$.

This result should be compared with the well-known example of the magnetic flow on a compact hyperbolic surface with magnetic field given by the area form. In this example, as the intensity $s$ increases the flow becomes “simpler”. Indeed, topological


entropy decreases; at $s = 1$ we hit the horocycle flow and for $s > 1$, the flow has all its orbits closed and becomes integrable. The opposite seems to be happening for our magnetic flow on Sol. On the other hand, the well-known Rydberg model of a hydrogen atom in a strong magnetic field is believed to exhibit behaviour similar to that described in Theorem A. We are unaware of any proof, as opposed to evidence, that the Rydberg model has positive Liouville entropy.

The second goal of this paper is to try to explain these drastic changes in the dynamics in terms of changes in the symplectic topology of twisted cotangent bundles. Let $\Sigma$ be a closed manifold and let $\omega_0$ be the canonical symplectic form of the cotangent bundle $\tau : T^*\Sigma \to \Sigma$. Given a closed 2-form $\sigma$ we let $\omega_\sigma := \omega_0 - \tau^*\sigma$ be the twisted symplectic form determined by $\sigma$. Recall that given a compact set $K$, the displacement energy of $K$ is defined as

$$e(K) := \inf\{ \rho(1, h) : h \in \mathrm{Ham}_c(T^*\Sigma, \omega_\sigma),\ h(K) \cap K = \emptyset \},$$

where $\rho$ is Hofer's distance and $\mathrm{Ham}_c(T^*\Sigma, \omega_\sigma)$ is the set of compactly supported Hamiltonian diffeomorphisms. Recall also that a compact set $K$ is said to be displaceable if there exists $h \in \mathrm{Ham}_c(T^*\Sigma, \omega_\sigma)$ such that $h(K) \cap K = \emptyset$. Thus $K$ is displaceable iff $e(K)$ is finite. A well known result of M. Gromov [6] asserts that the zero section of $(T^*\Sigma, \omega_0)$ is not displaceable. On the other hand, if $\sigma$ is non-zero and $\Sigma$ has zero Euler characteristic, results of F. Laudenbach and J.-C. Sikorav [8] and L. Polterovich [12] imply that the zero section of $(T^*\Sigma, \omega_\sigma)$ is actually displaceable (if $\sigma$ is non-zero, the zero section of $T^*\Sigma$ ceases to be Lagrangian). Finite displacement energy has important implications. According to a recent result of F. Schlenk [13], if a compact energy level of an autonomous Hamiltonian is displaceable, then it will have finite $\pi_1$-sensitive Hofer-Zehnder capacity which in turn yields almost everywhere existence of contractible closed orbits (i.e. there is a full measure set of values of the energy for which the corresponding energy level has a contractible closed orbit).

Let us illustrate this discussion with the following example. Consider a closed hyperbolic 3-manifold and let $\sigma$ be any non-zero closed 2-form. For high values of the energy the magnetic flow will be Anosov, since it can be seen as a perturbation of a geodesic flow on a negatively curved manifold. Thus for high energies, the magnetic flow will have no contractible closed orbits (the magnetic flow will be topologically conjugate to the geodesic flow and it is well known that the latter has no contractible closed geodesics). Schlenk's result now implies that high energy levels are not displaceable, while low energy levels are by the results of Laudenbach-Sikorav and Polterovich. If we take the closed 3-manifold to have non-zero first Betti number, then it will have non-zero second Betti number and we may choose magnetic fields $\sigma$ with non-zero cohomology classes (monopoles).

Returning to our example on Sol we note that the geodesic flow of $(\Sigma, ds^2)$ has no contractible closed orbits, but as soon as the magnetic field is switched on, contractible closed orbits appear. These orbits are related to the vanishing of $\bar{\nu}$, see Remark 3.4 where these observations are proved. It turns out that every compact set in $(T^*(\Gamma \backslash \mathrm{Sol}), \omega_\Omega)$ is displaceable. Our last result shows that this is also true for a large class of solvable manifolds. We say that a Lie group $G$ is completely solvable if it is a closed subgroup of the group of upper triangular matrices with positive diagonal entries. The class of completely solvable groups lies strictly in between nilpotent and solvable groups. Given a Lie algebra $\mathfrak{g}$, let $L : \Lambda^2(\mathfrak{g}) \to \mathfrak{g}$ be the linear map induced by the Lie bracket, where $\Lambda^2(\mathfrak{g})$ is the second exterior power of $\mathfrak{g}$. Recall that 2-vectors are elements in $\Lambda^2(\mathfrak{g})$ of the form $x \wedge y$ with $x, y \in \mathfrak{g}$.


Theorem B. Let $G$ be a simply connected completely solvable group and suppose $\mathrm{Ker}\, L$ is generated by 2-vectors. Let $\Gamma$ be a cocompact lattice and $\Sigma := \Gamma \backslash G$. Then, for any monopole $\sigma$ and any compact set $K \subset (T^*\Sigma, \omega_\sigma)$, $e(K) < \infty$.

Certainly, our example $(T^*(\Gamma \backslash \mathrm{Sol}), \omega_\Omega)$ fits the hypotheses of the theorem. For tori, the theorem also follows from the proof of Theorem 3.1 in [5]. It is quite likely that Theorem B holds for any simply connected solvable Lie group with lattice. We do not know of an example of a solvable Lie algebra where $\mathrm{Ker}\, L$ is not generated by 2-vectors. In Sect. 4 we show how Theorem B applies to compact quotients of some of the standard nilpotent Lie algebras, like the Heisenberg Lie algebra $\mathfrak{h}_{2n+1}$ and the Lie algebra of upper triangular matrices $\mathfrak{u}_n$. Finally, in Subsect. 4.2 we discuss these results in the context of Aubry-Mather theory and Mañé's critical values.

2. Preliminaries

Let Sol be the semidirect product of $\mathbb{R}^2$ with $\mathbb{R}$, with coordinates $(u, y_0, y_1)$ and multiplication

$$(y_0, y_1, u)\,(y_0', y_1', u') = (y_0 + e^{u} y_0',\ y_1 + e^{-u} y_1',\ u + u'). \tag{1}$$

The map $(y_0, y_1, u) \mapsto u$ is the epimorphism $\mathrm{Sol} \to \mathbb{R}$ whose kernel is the normal subgroup $\mathbb{R}^2$. The group Sol is isomorphic to the matrix group

$$\begin{pmatrix} e^{u} & 0 & y_0 \\ 0 & e^{-u} & y_1 \\ 0 & 0 & 1 \end{pmatrix}.$$

If one denotes by $p_u$, $p_{y_0}$ and $p_{y_1}$ the momenta that are canonically conjugate to $u$, $y_0$ and $y_1$ respectively, then the functions

$$\alpha_0 = e^{u} p_{y_0}, \qquad \alpha_1 = e^{-u} p_{y_1}, \qquad \nu = p_u \tag{2}$$
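As a quick consistency check of the multiplication rule (1) against the matrix realisation of Sol given above, the following sketch multiplies group elements in both ways and compares the results. The test elements are arbitrary and the function names are ours, introduced only for illustration.

```python
import math

def sol_mul(g, h):
    """Product (1): (y0, y1, u)(y0', y1', u') = (y0 + e^u y0', y1 + e^-u y1', u + u')."""
    y0, y1, u = g
    z0, z1, v = h
    return (y0 + math.exp(u) * z0, y1 + math.exp(-u) * z1, u + v)

def sol_matrix(g):
    """Upper triangular 3x3 matrix representing the element (y0, y1, u)."""
    y0, y1, u = g
    return [[math.exp(u), 0.0, y0],
            [0.0, math.exp(-u), y1],
            [0.0, 0.0, 1.0]]

def matmul(X, Y):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_diff(X, Y):
    """Largest entrywise difference between two 3x3 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(3) for j in range(3))

# Arbitrary test elements (not from the paper).
g, h, k = (0.3, -1.2, 0.7), (2.0, 0.5, -0.4), (-1.0, 0.25, 1.1)

# The matrix of a product should equal the product of the matrices.
homomorphism_error = max_diff(matmul(sol_matrix(g), sol_matrix(h)),
                              sol_matrix(sol_mul(g, h)))

# The group law should be associative.
assoc_error = max(abs(a - b) for a, b in
                  zip(sol_mul(sol_mul(g, h), k), sol_mul(g, sol_mul(h, k))))
```

Both errors should be zero up to floating-point rounding.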

are left-invariant functions on $T^*\mathrm{Sol}$. The closed 2-form

$$\Omega = dy_0 \wedge dy_1 \tag{3}$$

is also left-invariant, and consequently,

$$\omega_s = dp_u \wedge du + dp_{y_0} \wedge dy_0 + dp_{y_1} \wedge dy_1 - s\, dy_0 \wedge dy_1 \tag{4}$$

is a left-invariant twisted symplectic form on $T^*\mathrm{Sol}$ for any real number $s$. The Poisson bracket induced by $\omega_s$ is denoted by $\{\,,\}_s$. The Poisson brackets of the coordinate functions are

$$\{\nu, u\}_s = 1, \quad \{\alpha_0, y_0\}_s = e^{u}, \quad \{\alpha_1, y_1\}_s = e^{-u}, \quad \{\alpha_0, \alpha_1\}_s = s, \quad \{\nu, \alpha_0\}_s = \alpha_0, \quad \{\nu, \alpha_1\}_s = -\alpha_1, \tag{5}$$

and all others vanish. Define the Hamiltonian $H$ on $T^*\mathrm{Sol}$ by

$$2H = \nu^2 + \alpha_0^2 + \alpha_1^2, \tag{6}$$


so that when $s = 0$, $H$ is the Hamiltonian of the left-invariant Riemannian metric mentioned in the Introduction. The equations of the magnetic flow induced by $H$ are

$$X_H = \begin{cases} \dot{u} = \nu, & \dot{\nu} = -\alpha_0^2 + \alpha_1^2, \\ \dot{y}_0 = e^{u} \alpha_0, & \dot{\alpha}_0 = -\alpha_1 s + \nu\alpha_0, \\ \dot{y}_1 = e^{-u} \alpha_1, & \dot{\alpha}_1 = \alpha_0 s - \nu\alpha_1, \end{cases} \tag{7}$$

or $X_H(\bullet) = \{H, \bullet\}_s$. The Lie algebra of left-invariant functions on $T^*\mathrm{Sol}$ has a non-trivial centre generated by the Casimir

$$f = s\nu + \alpha_0 \alpha_1. \tag{8}$$
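The last three equations of (7) close on the variables ν, α0 and α1, so the conservation of the Hamiltonian (6) and of the Casimir (8) can be checked numerically. The sketch below integrates these three equations with a classical fourth-order Runge-Kutta step; the value of s, the initial point and the step size are arbitrary choices made for illustration, and the function names are ours.

```python
def euler_field(x, s):
    """Right-hand side of the (nu, alpha0, alpha1) equations in (7)."""
    nu, a0, a1 = x
    return (-a0 * a0 + a1 * a1,    # d(nu)/dt
            -s * a1 + nu * a0,     # d(alpha0)/dt
            s * a0 - nu * a1)      # d(alpha1)/dt

def rk4_step(x, s, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(point, slope, c):
        return tuple(p + c * q for p, q in zip(point, slope))
    k1 = euler_field(x, s)
    k2 = euler_field(shifted(x, k1, dt / 2), s)
    k3 = euler_field(shifted(x, k2, dt / 2), s)
    k4 = euler_field(shifted(x, k3, dt), s)
    return tuple(xi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))

def energy(x):
    """H from Eq. (6): 2H = nu^2 + alpha0^2 + alpha1^2."""
    return 0.5 * sum(xi * xi for xi in x)

def casimir(x, s):
    """f from Eq. (8): f = s nu + alpha0 alpha1."""
    return s * x[0] + x[1] * x[2]

s = 0.5                  # arbitrary field strength (illustrative choice)
x0 = (0.6, 0.8, 0.0)     # a point on the unit sphere nu^2 + alpha0^2 + alpha1^2 = 1
x = x0
for _ in range(5000):    # integrate up to time 5 with step 1e-3
    x = rk4_step(x, s, 1e-3)

energy_drift = abs(energy(x) - energy(x0))
casimir_drift = abs(casimir(x, s) - casimir(x0, s))
```

Both drifts stay at the level of the integration error, in line with H and f being integrals of the reduced flow.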

Remark 2.1. The 2-form $\Omega$ defines a central extension of Sol: $\mathbb{R} \to G \to \mathrm{Sol}$. The Lie algebra $\mathfrak{g}$ of $G$ is isomorphic to the Lie algebra with basis $s, \nu, \alpha_0, \alpha_1$ and Lie bracket $\{\,,\}_s$. The equations of the magnetic Hamiltonian $H$ (Eq. 7) may be viewed as the symplectic reduction of a Kaluza-Klein metric Hamiltonian on $T^*G$ at a non-zero level of momentum. From this point of view, $f$ and $s$ are Casimirs of the Poisson bracket on $\mathfrak{g}^*$. Actually, the group $G$ may be identified with one of the solvable 4-dimensional geometries, namely $\mathrm{Sol}_1^4$ [15]. It has a matrix representation

$$\begin{pmatrix} 1 & y & z \\ 0 & e^{t} & x \\ 0 & 0 & 1 \end{pmatrix},$$

where $x, y, z, t \in \mathbb{R}$. Via the Kaluza-Klein metric, Theorem A could be reinterpreted as follows: the geodesic flow on compact quotients of $\mathrm{Sol}_1^4$ has positive Liouville entropy and is not completely integrable.

3. Analysis of the Magnetic Flow

Since the Hamiltonian vector field $X_H$ (Eq. 7) is left-invariant, the vector field factors onto a vector field $E_h$ on $\mathfrak{s}^*$ through the projection map $T^*\mathrm{Sol} \to \mathfrak{s}^*$ induced by the left-framing of $T^*\mathrm{Sol}$. The Euler vector field $E_h$ is a Hamiltonian vector field on $\mathfrak{s}^*$ equipped with the Lie bracket $\{\,,\}_s$. The Hamiltonian $h : \mathfrak{s}^* \to \mathbb{R}$ is the Hamiltonian which induces $H$. It is clear the dynamics of $X_H$ can be reconstructed from the dynamics of $E_h$. Let $S = h^{-1}(\tfrac{1}{2})$ be the unit sphere in $\mathfrak{s}^*$; the unit-sphere bundle $H^{-1}(\tfrac{1}{2})$ is naturally diffeomorphic to $\mathrm{Sol} \times S$. The functions $\nu, \alpha_0, \alpha_1$ will be regarded as coordinate functions on $\mathfrak{s}^*$. Define the standard smooth measure $\theta$ on $S$ by

$$4\pi \times \theta = \left( \nu\, d\alpha_0 \wedge d\alpha_1 + \alpha_0\, d\alpha_1 \wedge d\nu + \alpha_1\, d\nu \wedge d\alpha_0 \right)\big|_S. \tag{9}$$

The measure $\theta$ may be decomposed as $\theta = m \wedge m_f$. The measure $m$ is defined so that for each connected component of $f^{-1}(c) \cap S$, call it $f_c$, $m$ induces a smooth probability measure on $f_c$ that is $E_h$-invariant. Let $\bar{\nu} : S \to \mathbb{R}$ be defined by

$$\bar{\nu}(\mu) := \int_{f_{f(\mu)}} \nu\, dm \qquad \forall \mu \in S, \tag{10}$$

that is, $\bar{\nu}(\mu)$ is the mean value of $\nu$ along the connected component of the level set of $f|S$ containing $\mu$.


Here is a more prosaic definition of $m$. Because the vector field $E_h$ preserves the volume form $d\nu \wedge d\alpha_0 \wedge d\alpha_1$ on $\mathfrak{s}^*$, and $E_h$ is tangent to the unit sphere $S$, the vector field $E_h|S$ is Hamiltonian with respect to the symplectic form $\theta$ (the Hamiltonian is $g = 4\pi \times f$). Therefore, if $c$ is a non-trivial regular value of the integral $f$, then a neighbourhood of $f_c$ in $S$ admits action-angle coordinates $(I, \phi \bmod 1)$ such that $g = g(I)$,

$$E_h = \begin{cases} \dot{\phi} = \dfrac{\partial g(I)}{\partial I}, \\[1ex] \dot{I} = 0, \end{cases} \tag{11}$$

and

$$\theta = d\phi \wedge dI. \tag{12}$$

The measure $m$ in these coordinates is $m = d\phi$, while

$$\bar{\nu} = \int_0^1 \nu(\phi, I)\, d\phi. \tag{13}$$

Proposition 3.1. For $s \neq 0$, $\bar{\nu} : S \to \mathbb{R}$ is a continuous, $\psi^s$-invariant function which is real-analytic off the set of non-elliptic singular levels of $f|S$.

Proof. The real-analyticity of $\bar{\nu}$ on the regular-point set follows from the fact that $\bar{\nu}$ and $f$ are real-analytic and the action-angle coordinates are real-analytic.

Case 1, $|s| \neq 0, 1$. When $|s| < 1$, $f$ has a pair of peaks (resp. pits) at $\alpha_0 = \alpha_1 = \pm\alpha$, $\nu = s$ (resp. $\alpha_0 = -\alpha_1 = \pm\alpha$, $\nu = -s$) where $\alpha = \sqrt{\tfrac{1}{2}(1 - s^2)}$. When $|s| \ge 1$, $f$ has a single peak (resp. pit) at $\alpha_0 = \alpha_1 = 0$, $\nu = 1$ (resp. $\alpha_0 = \alpha_1 = 0$, $\nu = -1$). These critical points are all non-degenerate for $|s| \neq 0, 1$.

Case 1a, elliptic singularity. Let $p \in S$ be a peak or pit for $f|S$, hence an elliptic singularity of $E_h$ on $S$. There is a canonical system of coordinates $(x, y)$ defined on a neighbourhood of $p$ such that the Hamiltonian $g$ of $E_h|S$ is in Birkhoff normal form:

$$g(x, y) = g_1 I + g_2 I^2 + \cdots, \qquad I = \tfrac{1}{2}\left( x^2 + y^2 \right), \qquad x + iy = \sqrt{2I}\, e^{2\pi i\phi}. \tag{14}$$

It is well-known that $g$ has a formal Birkhoff normal form; Zung has proven that the formal Birkhoff normal form converges when $g$ is completely integrable [16]. Inspection of Eqs. (11–13) shows that $\bar{\nu}$ may be written as

$$\bar{\nu}(\mu) = \frac{1}{T} \times \int_0^T \nu \circ \psi_t^s(\mu)\, dt, \qquad \forall \mu \in S, \tag{15}$$

where $\psi^s$ is the Euler flow of $E_h|S$ and $T$ is the period of the orbit through $\mu$. In an action-angle chart $T = \frac{\partial g}{\partial I}$, and one sees that $T$ extends over the critical point at $I = 0$ as a real-analytic function. Therefore, define

$$t \cdot \mu = \psi^s_{tT(\mu)}(\mu), \qquad \forall t \in S^1 = \mathbb{R}/\mathbb{Z}. \tag{16}$$

This defines a real-analytic action of $S^1$ on a neighbourhood of the critical point $p$. In angle-action coordinates, this action is just $t \cdot (\phi, I) = (\phi + t \bmod 1, I)$. The integral in Eq. (15) is then

$$\bar{\nu}(\mu) = \int_0^1 \nu(t \cdot \mu)\, dt, \qquad \forall \mu \in S, \tag{17}$$


i.e. $\bar{\nu}$ is the average of $\nu$ under the real-analytic action of $S^1$. This shows that $\bar{\nu}$ is real-analytic in a neighbourhood of the elliptic critical point $p$.

Case 1b, hyperbolic singularity. In this case, it is known that there are canonical coordinates $(x, y)$ which send the hyperbolic fixed point to $(0, 0)$, its stable and unstable manifolds to the $x$- and $y$-axes respectively, and in which the Hamiltonian is of the form

$$g = g_1 \tau + g_2 \tau^2 + \cdots, \qquad \text{where } \tau = xy. \tag{18}$$

In this coordinate system, the flow is simply

$$\psi_t^s(x, y) = \left( x e^{-t\omega(\tau)},\ y e^{t\omega(\tau)} \right), \tag{19}$$

where $\omega = \frac{\partial g(\tau)}{\partial \tau}$ [9]. Without loss of generality, one may assume that the coordinate system is defined on a square centred on the origin, as in Fig. 2. For a point $p$ along the right-hand face of the square above the $x$-axis, let $q$ be the corresponding point along the orbit which intersects the top face, with the convention that when $p = P$ lies on the stable manifold, the corresponding point is $q = Q$ on the unstable manifold. The orbit consists of two segments: the segment $pq$ inside the box, and the segment $qp$ lying in the complement of the box. The period $T = T(p)$ of this orbit is the sum of the time $T_0(p)$ that the orbit spends on the segment $pq$ plus the time $T_1(p)$ that the orbit spends on the segment $qp$. The time $T_1(p)$ is a real-analytic function that approaches the finite limit $T_1(P)$ as $p \to P$; $T_0(p)$ is also real-analytic and approaches $+\infty$ as $p \to P$. From Eq. (15), one has the equation

$$\bar{\nu}(p) = \frac{T_0}{T} \times \frac{1}{T_0} \int_0^{T_0} \nu \circ \psi_t^s(p)\, dt + \frac{T_1}{T} \times \frac{1}{T_1} \int_{T_0}^{T} \nu \circ \psi_t^s(p)\, dt. \tag{20}$$

The second term is bounded by a constant times $T_1/T$, which converges to 0 as $p \to P$. The first term converges to $\nu(0) = \bar{\nu}(0) = \bar{\nu}(P)$ as $p \to P$. A similar, but slightly more involved, argument shows that if $p$ lies in the right-hand face of the square below the $x$-axis, then $\bar{\nu}(p)$ converges to $\bar{\nu}(P)$, also. By symmetry and invariance of $\bar{\nu}$ under $\psi^s$, this proves that $\bar{\nu}$ is a continuous function in a neighbourhood of the hyperbolic singularity and its stable and unstable manifold. The reader may verify by direct computation that, if $\nu = y$ in the coordinate box, then $\frac{\partial \bar{\nu}}{\partial y}$ diverges to $+\infty$ as $p \to P$ ($y \to 0$).

Case 2, $|s| = 1$. In this case, $f|S$ has two critical points, at $\alpha_0 = \alpha_1 = 0$, $\nu = \pm 1$, that are both degenerate. The argument of case 1a may be adapted to show that $\bar{\nu}$ is a continuous function at each of these critical points.

Here are some further properties of $\bar{\nu}$. Since $\bar{\nu}$ is $\psi^s$-invariant, one may view it as a function defined on the image of $f|S$. In this case, it makes sense to say that $\bar{\nu}$ is monotone increasing.

Proposition 3.2. If $s > 1$ (resp. $s < -1$), then $\bar{\nu}$ is a monotone increasing (resp. decreasing) function that vanishes only on the zero level of $f|S$.

Proof. The symmetry of $f$ and the symplectic form $\theta$ dictate that $\bar{\nu}(c)$ be an odd function of $c$. Therefore $\bar{\nu}$ always vanishes on the zero level of $f$. Let us suppose that $s > 0$; the case where $s < 0$ is analogous. From the previous proposition, it suffices to prove that $\bar{\nu}$ is monotone increasing on the regular levels of $f|S$. From Eq. (13), one sees that if $\nu_2(\phi, I) > \nu_1(\phi, I)$ for all $\phi, I$, then $\bar{\nu}_2(I) > \bar{\nu}_1(I)$


Fig. 1. S seen from the point of view of f , 0 < |s| < 1

for all $I$. For our purposes, let $\nu_1 = \nu$ and let $\nu_2 = \nu\circ\gamma_\tau$, where $\gamma$ is a gradient-like flow for $f|S$, which takes the form $\gamma_\tau(\phi, I) = (\phi, I + \tau)$ in angle-action coordinates, and $\tau > 0$ is a small positive number. That is, if the derivative of $\nu$ in the direction of the gradient-like flow $\gamma$ is positive, then $\bar\nu$ is a monotone increasing function. Let us remark that to test the positivity of this directional derivative, it suffices to use any gradient-like vector field; in particular, it suffices to compute the directional derivative of $\nu$ with respect to the standard gradient vector field of $f|S$. One computes that

$$\langle d\nu, \nabla(f|S)\rangle = \begin{pmatrix}\alpha_0 & \alpha_1\end{pmatrix}\begin{pmatrix} s & \nu\\ \nu & s\end{pmatrix}\begin{pmatrix}\alpha_0\\ \alpha_1\end{pmatrix}. \tag{21}$$

The symmetric matrix is positive definite if $s > 0$ and $s^2 > \nu^2$. If $s > 1$, then the matrix is always positive definite, whence the right-hand side vanishes only at $\alpha_0 = \alpha_1 = 0$, $\nu = \pm 1$. This proves the proposition.

Remark 3.3. When $|s| < 1$, the function $\bar\nu$ cannot be monotone increasing. As one can see in Fig. 1, $\bar\nu$ attains its maximum value of unity at the hyperbolic fixed point $\alpha_0 = \alpha_1 = 0$, $\nu = 1$; at the same point $f = s$. On the other hand, at the elliptic critical points $\alpha_0 = \alpha_1 = \pm\sqrt{\tfrac12(1 - s^2)}$, $\nu = s$, $\bar\nu$ attains a value of $s$ while $f = \tfrac12(1 + s^2)$. Thus $s < \tfrac12(1 + s^2)$ while $1 = \bar\nu(s) > \bar\nu(\tfrac12(1 + s^2)) = s$. Numerical calculations do suggest that $\bar\nu$ is monotone increasing on $[-s, s]$ and decreasing on the two complementary subintervals (see Fig. 3).

A related issue concerns the monotone nature of the function $s \mapsto h_\mu(\varphi^s)$. In Fig. 4 we give evidence from numerical computations that this function is a monotone function on $(-\infty, 0]$ and $[0, \infty)$. The function $\bar\nu$ is approximated by integrating the Euler equations in the almost canonical variables $\nu, \phi$ (see the discussion around Eq. (33)) using the Runge-Kutta 4-step method and averaging $\nu$ over a numerically computed period. The function $h_\mu(\varphi^s)$ is approximated by numerically integrating $\bar\nu$ over a grid using Simpson's rule. Data and source code are available from http://www.maths.ed.ac.uk/~lbutler/dsol.html.

Remark 3.4. Consider an orbit of the magnetic flow on Sol that projects onto a closed orbit of $E_h$. From Eq. (7) it is clear that $u$ is a periodic function of time if and only if


Fig. 2.

Fig. 3. The function ν¯ as a function of f for selected values of s. Note the loss of differentiability at the hyperbolic critical level f = s and the lack of monotonicity

$\bar\nu = 0$. Left-invariance, or an easy check using (7), gives that the functions $p_{y_0} + s y_1$ and $p_{y_1} - s y_0$ are first integrals in Sol. Since $\alpha_0$ and $\alpha_1$ are periodic, we conclude that $p_{y_0} = e^{-u}\alpha_0$ and $p_{y_1} = e^{u}\alpha_1$ are periodic if $u$ is periodic. Thus, if $s > 0$ and $\bar\nu = 0$,


Fig. 4. The function $h_\mu(\varphi^s)$ as a function of $s$. Inset (left): on the interval $[0, 5 \times 10^{-3}]$; Inset (right): on the interval $[0, 1 \times 10^{-4}]$

the orbit of the magnetic flow on Sol is periodic. Since there are always closed orbits of $E_h$ with $\bar\nu = 0$ we conclude that for $s > 0$ the magnetic flow on $\Gamma\backslash\mathrm{Sol}$ always has contractible closed orbits. Observe that for the geodesic flow ($s = 0$) no closed orbit is contractible, since if $u$ is periodic, $y_0$ and $y_1$ must diverge linearly.

3.1. Cocompact subgroups of Sol. To compute the metric entropy of the magnetic flow, it is useful to view the lattice subgroup $\Gamma$ of Sol, especially the diagonalizing transformation $P$ described in the introduction, intrinsically. Given a lattice subgroup $\Gamma$ of Sol, there is an exact sequence $\mathbf{Z}^2 \to \Gamma \to \mathbf{Z}$ induced by the exact sequence $\mathbf{R}^2 \to \mathrm{Sol} \to \mathbf{R}$ [14, pp. 470–472]. The quotient group $\mathbf{Z}$ acts on $\mathbf{Z}^2$ via a representation $\rho : \mathbf{Z} \to SL(2, \mathbf{Z})$. The generator $\rho(1)$ ($= A$ from the introduction) is a hyperbolic matrix with eigenvalues $\lambda^{\pm 1}$, $|\lambda| > 1$. In terms of the coordinate system (Eq. 1), the group $\Gamma$ can be described as follows. Let $F = \mathbf{Q}(\lambda)$ be the quadratic number field obtained by adjoining $\lambda$ to the rationals. The ring of integers $O$ of $F$ is isomorphic to $\mathbf{Z}^2$ as an abelian group, and the unit group $U$ of $O$ acts on $O$ as an automorphism group. The group $\Gamma$ is naturally isomorphic to a finite-index subgroup of the semi-direct product $U \ltimes O$. We shall henceforth identify $\Gamma$ with a subgroup of $U \ltimes O$. The volume of $\Gamma \subset U \ltimes O$ can be defined to be

$$\mathrm{vol}_\Gamma := \log|\lambda| \times \det\begin{pmatrix} a_0^{(0)} & a_1^{(0)}\\ a_0^{(1)} & a_1^{(1)}\end{pmatrix}, \tag{22}$$

where $a_0, a_1$ generate $\Gamma \cap O$ and $a_i^{(j)}$ is the $j$-th conjugate of $a_i$.¹ One can see that $\mathrm{vol}_\Gamma$ is the determinant of the injection of $\mathbf{Z}^2 \rtimes_A \mathbf{Z}$ into Sol defined in the introduction; indeed, the matrix $P$ introduced there is effectively the matrix on the right-hand side of Eq. (22). It is clear that $\mathrm{vol}_\Gamma$ is the volume of a fundamental region for $\Gamma$ in Sol relative to the volume form $du \wedge dy_0 \wedge dy_1$. That is

$$\mathrm{vol}(\Gamma\backslash\mathrm{Sol}) = \mathrm{vol}_\Gamma. \tag{23}$$

Let $\Sigma = \Gamma\backslash\mathrm{Sol}$ and let $\mu$ be the $X_H$-invariant probability measure on $\Sigma \times S$ induced by $\omega_s^3 = -du \wedge dy_0 \wedge dy_1 \wedge d\nu \wedge d\alpha_0 \wedge d\alpha_1$, i.e.

$$\mu = \frac{1}{\mathrm{vol}_\Gamma} \times du \wedge dy_0 \wedge dy_1 \wedge \theta.$$

¹ The field $\mathbf{Q}(\lambda)$ is a quadratic number field and so equals $\mathbf{Q}(\sqrt{d})$ for some positive, square-free integer $d$. The map $\sqrt{d} \to -\sqrt{d}$ induces a field automorphism, and the image of $a = a^{(0)}$ under this automorphism is referred to as a conjugate of $a$ and denoted by $a^{(1)}$.

3.2. Metric entropy of the magnetic flow.

Theorem A. Let $s \neq 0$ and $\varphi^s : \mathbf{R} \times \Sigma \times S \to \Sigma \times S$ be the magnetic flow with infinitesimal generator $X_H$. The metric entropy of the time-1 map $\varphi_1^s$ is

$$h_\mu(\varphi_1^s) = \int_S |\bar\nu|\, d\theta. \tag{24}$$

Therefore, since $\bar\nu$ is non-zero on a positive measure set, $h_\mu(\varphi_1^s) > 0$. Moreover, $h_\mu(\varphi_1^s)$ approaches $1/2$ as $s \to \infty$ and $h_{top}(\varphi_1^s) \equiv 1$.

Remark 3.5. As is proven in Proposition 3.1, $\bar\nu$ is a continuous function that is real-analytic on the complement of the critical levels of $f|S$. Therefore $\bar\nu$ is non-zero on a set of full measure. It is almost certain that $\bar\nu$ vanishes only on one level of $f|S$; Proposition 3.2 proves this when $|s| > 1$.

Proof. First, consider a flow $\varphi : \mathbf{R} \times \mathrm{Sol} \times \mathbf{T}^1 \to \mathrm{Sol} \times \mathbf{T}^1$ which is a skew product over a translation

$$\varphi_t(g, \phi) = (\gamma(t, g, \phi),\ \phi + at \bmod 1) \qquad \forall g \in \mathrm{Sol},\ \phi \in \mathbf{T}^1. \tag{25}$$

If $\varphi$ is assumed to be left-invariant, then the cocycle $\gamma$ satisfies $\gamma(t, g, \phi) = g\gamma(t, 1, \phi)$. Therefore, if $T = 1/a$, then

$$\varphi_T(g, \phi) = (g\,\gamma(\phi),\ \phi \bmod 1) \qquad \forall g \in \mathrm{Sol},\ \phi \in \mathbf{T}^1, \tag{26}$$

where $\gamma(\phi) = \gamma(T, 1, \phi)$. Therefore, for all $n \in \mathbf{Z}$,

$$\varphi_{nT}(g, \phi) = (g\,\gamma(\phi)^n,\ \phi \bmod 1) \qquad \forall g \in \mathrm{Sol},\ \phi \in \mathbf{T}^1. \tag{27}$$

The cocycle $\gamma(\phi) \in \mathrm{Sol}$ either takes values in the non-hyperbolic subgroup $\mathbf{R}^2$ or it has a non-trivial projection to $\mathbf{R}$. In the former case, $\varphi_T$ has zero entropy. In the latter case, $T\,\mathrm{Sol}$ splits into 3 complementary, left-invariant line-bundles $E^+$, $E^-$ and $E^0$. These line bundles are determined by their value at the identity of Sol. If we identify $T_I\mathrm{Sol}$ as the Lie algebra s, then $E^+$ is the unstable subspace, $E^-$ is the stable subspace and $E^0$ is the centralizer of $\mathrm{Ad}_{\gamma(\phi)}$, respectively. Let $\lambda_+(g)$ be the log of the largest eigenvalue of $\mathrm{Ad}_g$, $g \in \mathrm{Sol}$. One sees using Pesin's formula that

$$h_{\mu_c}(\varphi_1) = \frac{1}{T} \times \int_0^1 \lambda_+(\gamma(\phi))\, d\phi, \tag{28}$$

where $\mu_c = \frac{1}{\mathrm{vol}_\Gamma} \times du \wedge dy_0 \wedge dy_1 \wedge d\phi$ is a $\varphi$-invariant probability measure on $\Sigma \times \mathbf{T}^1$.


A simple computation shows that $\lambda_+(g)$ is the projection $g \to |u(g)|$ induced by $u : \mathrm{Sol} \to \mathbf{R}$. In addition, if we observe that $\gamma(\phi) = \gamma(0, 1, \phi)^{-1}\gamma(T, 1, \phi)$ and use the fact that $u$ is a group homomorphism, then

$$|\Delta u| = |u(\gamma(\phi))| = \lambda_+(\gamma(\phi)), \tag{29}$$

where $\Delta u$ is the change in $u$ over the time interval $[0, T]$.

Let us turn to the magnetic flow: Let $c$ be a regular value of $f|S$ and introduce action-angle variables in a neighbourhood of $f_c \subset S$. The flow, $\varphi^s$, of $X_H$ restricted to $\Sigma \times f_c$ is of the form described by Eq. (25), with $a = \partial g/\partial I$, see Eq. (11). The Liouville measure $\mu$ on $\Sigma \times S$ induces the invariant conditional probability measure $\mu_c$ on $\Sigma \times f_c$. Inspection of Eq. (7) shows that over the period $T$, $u$ changes by

$$\Delta u = \int_0^T \nu(t)\, dt. \tag{30}$$

Since $\nu$ is a periodic function, the integral for $\Delta u$ is independent of the angle variable $\phi$. Equations (28–29) therefore show that $|\Delta u|/T$ is the metric entropy of $\varphi^s\,|\,\Sigma \times f_c$ with respect to the conditional probability measure $\mu_c$. Using Eq. (11) in action-angle coordinates, one obtains

$$\frac{\Delta u}{T} = \frac{1}{T} \times \int_0^1 \nu(\phi, I)\, d\phi \times \frac{\partial I}{\partial g} = \int_0^1 \nu(\phi, I)\, d\phi = \bar\nu, \tag{31}$$

since $T = \partial I/\partial g$. Therefore, we can integrate to obtain the metric entropy of the magnetic flow on the unit-sphere bundle

$$h_\mu(\varphi^s) = \int_{\Sigma \times S} \frac{|\Delta u|}{T}\, d\mu = \int_S |\bar\nu|\, d\theta. \tag{32}$$

This proves Eq. (24). Let us prove the remaining two points in Theorem A.

Topological entropy. We note that the arguments above also imply that the sum of the non-negative Liapunov exponents of $\varphi^s$ is given by $|\bar\nu| \le 1$. Thus by Ruelle's inequality and the variational principle for topological entropy we see that $h_{top}(\varphi^s) \le 1$. Since the flow on the set $p_u = \pm 1$, $p_{y_0} = p_{y_1} = 0$ is the same for all $s$ and carries entropy 1 we conclude that $h_{top}(\varphi^s) \equiv 1$.

The limit of metric entropy. To compute $\lim_{s\to\infty} h_\mu(\varphi^s)$, note that $\frac{1}{s} \times f = \nu + \alpha_0\alpha_1 \times \frac{1}{s}$. Therefore, as $s \to \infty$, the regular level sets of $f|S$ converge uniformly in the $C^1$ topology to the level sets of $\nu$, i.e. the regular level sets of $f|S$ converge to circles at a constant height off the $\alpha_0$-$\alpha_1$ plane (and $f|S$ has only the points $\alpha_0 = \alpha_1 = 0$, $\nu = \pm 1$ as critical points). Let us coordinatize $S - \{(0, 0, \pm 1)\}$ by spherical coordinates

$$\alpha_0 = \cos(2\pi\xi)\sin(\eta), \quad \alpha_1 = \sin(2\pi\xi)\sin(\eta), \quad \nu = \cos(\eta), \qquad 0 < \eta < \pi,\ 0 \le \xi < 1.$$

The angle $\xi$ is the normalized longitudinal angle which vanishes along $\{\alpha_1 = 0, \alpha_0 > 0\}$ and has $\partial\xi/\partial\alpha_1 > 0$ along the same privileged longitude. In spherical coordinates, the normalized area form is

$$\theta = \frac{1}{2} \times \sin(\eta)\, d\eta \wedge d\xi. \tag{33}$$
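As a numerical sanity check on the normalisation in Eq. (33), the following Python sketch integrates $\theta$ over the sphere with Simpson's rule, and also evaluates $\int_S |\nu|\, d\theta$, the value that Theorem A gives for the limit of the metric entropy as $s \to \infty$. This is not the authors' published code; the function names `simpson` and `sphere_integral` are illustrative assumptions.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def sphere_integral(func_of_nu):
    """Integrate func_of_nu(nu) against theta = (1/2) sin(eta) d(eta) ^ d(xi).

    The integrand does not depend on xi, and xi ranges over [0, 1),
    so the xi-integral contributes a factor of 1.
    """
    return simpson(
        lambda eta: 0.5 * math.sin(eta) * func_of_nu(math.cos(eta)),
        0.0,
        math.pi,
    )


total_area = sphere_integral(lambda nu: 1.0)  # normalisation of theta
mean_abs_nu = sphere_integral(abs)            # integral of |nu| over S

print(f"area of S with respect to theta: {total_area:.6f}")
print(f"integral of |nu| d(theta):       {mean_abs_nu:.6f}")
```

Up to quadrature error, the first value should agree with 1 (the area form is normalized) and the second with 1/2.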


For $s > 1$, we normalize the action-angle coordinates $(I, \phi) = (I_s, \phi_s)$ on $S - \{(0, 0, \pm 1)\}$ as follows: first, $\phi_s = 0$ and $\partial\phi_s/\partial\alpha_1 > 0$ along the privileged longitude $\{\alpha_1 = 0, \alpha_0 > 0\}$; second, $I_s(\mu)$ is defined to be the area of the sublevel set $\{f \le f(\mu)\}$ in $S$. The above paragraph shows that as $s \to \infty$, $I_s$ converges to the function $I_\infty$ which gives the area of the region in $S$ below height $\nu$. A computation shows that $I_\infty = \frac{1+\nu}{2}$. On the other hand, $\phi_s$ converges to the normalized longitudinal angle $\xi$. Inspection of Eqs. (12,13) shows that the mean value of $\nu$ averaged with respect to the measure $d\phi_s$ converges to $\nu$ as $s \to \infty$. This convergence is in the uniform $C^0$ topology. Therefore

$$\lim_{s\to\infty} h_\mu(\varphi_1^s) = \int_0^1\!\!\int_0^\pi |\cos(\eta)\sin(\eta)|\, d\eta\, d\xi \times \frac{1}{2} = \frac{1}{2}, \tag{34}$$

as asserted.

3.3. A variation. There is an interesting variation of the previous example. Consider the group $G = \mathrm{Sol} \times \mathbf{R}$ and the left-invariant 2-form given by $\Omega := du \wedge dt$, where $t$ denotes the variable on the $\mathbf{R}$-factor. We consider on $G$ the left-invariant metric given by $ds^2 + dt^2$ and the cocompact lattice $\Gamma \times \mathbf{Z}$. The magnetic flow $\varphi^s$ on the compact quotient thus obtained has the following remarkable properties for $s \neq 0$ (as before $s$ is the intensity):
• $h_{top}(\varphi^s) = 0$ for $s \neq 0$. This shows that topological entropy may be discontinuous when a twist in the symplectic structure is introduced;
• $\varphi^s$ is completely integrable with real analytic integrals. If we let $\tau := p_t$, then the integrals are $\alpha_0\alpha_1$, $\alpha_0 e^{-\tau/s}$ (the two Casimirs) and $\tau - su$, which can be made invariant under the lattice just by composing with a suitable periodic function.
We leave the details of the proofs of these claims to the reader, but they do follow in a straightforward fashion from an analysis quite similar to the one done in this section.

4. Proof of Theorem B

We first prove the following easy lemma.

Lemma 4.1. Let $\mathfrak{g}$ be a Lie algebra such that $\mathrm{Ker}\, L$ is generated by 2-vectors. Let $\Omega$ be an antisymmetric bilinear form on $\mathfrak{g}$ such that $\Omega(x, y) = 0$ for all $x, y$ with $[x, y] = 0$. Then $\Omega$ is exact, that is, there exists $b \in \mathfrak{g}^*$ such that $\Omega(x, y) = b([x, y])$ for all $x, y \in \mathfrak{g}$.

Proof. Let $L^* : \mathfrak{g}^* \to (\Lambda^2(\mathfrak{g}))^*$ be the dual of $L : \Lambda^2(\mathfrak{g}) \to \mathfrak{g}$. It suffices to show that $\Omega$ is in the image of $L^*$. But the image of $L^*$ coincides with the annihilator of $\mathrm{Ker}\, L$, so it suffices to check that $\Omega(q) = 0$ for all $q \in \mathrm{Ker}\, L$. But if $\mathrm{Ker}\, L$ is generated by 2-vectors we may write $q = \sum_i x_i \wedge y_i$ with $x_i \wedge y_i \in \mathrm{Ker}\, L$. Thus $\Omega(q) = \sum_i \Omega(x_i, y_i)$. But since $[x_i, y_i] = 0$, $\Omega(x_i, y_i) = 0$ by hypothesis and $\Omega(q) = 0$.

We now break the proof of Theorem B into a few simple steps.

(1) Let $\sigma$ be a closed 2-form in $\Sigma$ with non-zero cohomology class. By a theorem of A. Hattori [7] (which in turn is a generalization of a theorem of K. Nomizu for


nilmanifolds [10]), there exists a left-invariant closed 2-form $\Omega$ cohomologous to $\sigma$. This is the only part of the proof in which we use that $G$ is completely solvable. We denote by the same symbol $\Omega$ the 2-form on $G$ or on $\Sigma$. Write $\sigma = \Omega + d\theta$ for some smooth 1-form $\theta$. The fibrewise shift $(x, p) \to (x, p - \theta)$ takes compact sets to compact sets and is a symplectomorphism between $(T^*\Sigma, \omega_\sigma)$ and $(T^*\Sigma, \omega_\Omega)$. Hence, from now on we may suppose that the monopole is given by a closed left-invariant 2-form $\Omega$.

(2) We identify $T^*G$ with $G \times \mathfrak{g}^*$ using left translations. Smooth left-invariant functions on $T^*G$ are then identified with $C^\infty(\mathfrak{g}^*)$. The twisted symplectic structure $\omega_\Omega$ determines a Poisson bracket $\{\,,\}_\Omega$. Given $f, g \in C^\infty(\mathfrak{g}^*)$ we have

$$\{f, g\}_\Omega(m) = m([d_m f, d_m g]) + \Omega(d_m f, d_m g) \tag{35}$$

for every $m \in \mathfrak{g}^*$, where $d_m f, d_m g \in \mathfrak{g}$ using the canonical isomorphism $(\mathfrak{g}^*)^* = \mathfrak{g}$. This formula is a simple consequence of the definition of the twisted symplectic form on $T^*G$ plus left-invariance.

(3) If $f, g \in C^\infty(\mathfrak{g}^*)$, then they induce functions on $T^*\Sigma = \Sigma \times \mathfrak{g}^*$ which only depend on the $\mathfrak{g}^*$-variables, and their Poisson bracket is computed, of course, also using (35).

(4) Since the cohomology class of $\Omega$ is not zero, we now invoke Lemma 4.1 to obtain two vectors $x, y \in \mathfrak{g}$ such that $[x, y] = 0$ but $\Omega(x, y) \neq 0$.

(5) The vectors $x, y$ are obviously linearly independent. Consider a basis $\{e_1 = x, e_2 = y, e_3, \ldots, e_n\}$ of $\mathfrak{g}$ and let $\{e_1^*, \ldots, e_n^*\}$ be its dual basis. Given $m \in \mathfrak{g}^*$ write $m = \sum_i m_i e_i^*$ and let $f_i(m) := m_i$. Using (35) we have $\{f_1, f_2\}_\Omega(m) = \Omega(x, y) \neq 0$.

(6) Consider $f_1$ as Hamiltonian on $T^*\Sigma$. Along the Hamiltonian flow of $f_1$ we have $\dot{m}_2 = \Omega(x, y) \neq 0$, which readily implies that any compact set in $T^*\Sigma$ may be displaced using the Hamiltonian flow of a suitable cut-off of $f_1$. This finishes the proof of Theorem B.

4.1. Examples. Consider the Heisenberg Lie algebra $\mathfrak{h}_{2n+1}$ with basis $\{x_1, \ldots, x_n, y_1, \ldots, y_n, z\}$ and non-zero brackets $[x_i, y_i] = z$ for $i = 1, \ldots, n$. The image of $L : \Lambda^2(\mathfrak{h}_{2n+1}) \to \mathfrak{h}_{2n+1}$ is obviously one dimensional and generated by $z$. All the vectors $x_i \wedge x_j$, $y_i \wedge y_j$, $x_i \wedge z$, $y_i \wedge z$ and $x_i \wedge y_j$ for $i \neq j$ are in the kernel of $L$. An additional $n - 1$ 2-vectors in the kernel of $L$ are given by $(x_i + y_1) \wedge (x_1 + y_i)$ for $i = 2, \ldots, n$. Thus $\mathrm{Ker}\, L$ is generated by 2-vectors. Another well known nilpotent Lie algebra $\mathfrak{g}_{2n+1}$ is given by a basis $\{x_1, \ldots, x_n, y_1, \ldots, y_n, z\}$ and non-zero brackets $[z, x_i] = y_i$ for $i = 1, \ldots, n$. Here the kernel of $L$ is generated by $x_i \wedge x_j$, $y_i \wedge y_j$, $x_i \wedge y_j$ and $z \wedge y_i$.
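The claim for the Heisenberg algebra can be checked by machine for small $n$. The Python sketch below is an illustration only (the helper names `z_coeff`, `bracket_map`, `wedge`, `rank` and `check_heisenberg` are assumptions, not notation from the text). It builds the 2-vectors listed above, verifies that each one is sent to zero by $L$, and compares the dimension of their span in $\Lambda^2(\mathfrak{h}_{2n+1})$ with $\dim \Lambda^2 - 1$, which is the dimension of $\mathrm{Ker}\, L$ because the image of $L$ is spanned by $z$.

```python
from fractions import Fraction
from itertools import combinations


def z_coeff(a, b):
    """Coefficient of z in [a, b] for basis vectors of h_{2n+1}: [x_i, y_i] = z."""
    if a[0] == "x" and b[0] == "y" and a[1:] == b[1:]:
        return 1
    if a[0] == "y" and b[0] == "x" and a[1:] == b[1:]:
        return -1
    return 0


def bracket_map(u, v):
    """z-coefficient of L(u ^ v) = [u, v]; u and v are dicts {basis name: coefficient}."""
    return sum(cu * cv * z_coeff(a, b) for a, cu in u.items() for b, cv in v.items())


def wedge(u, v, index):
    """Coordinates of u ^ v in the basis {e_a ^ e_b : a listed before b} of Lambda^2."""
    coords = [Fraction(0)] * len(index)
    for a, cu in u.items():
        for b, cv in v.items():
            if a == b:
                continue
            if (a, b) in index:
                coords[index[(a, b)]] += cu * cv
            else:
                coords[index[(b, a)]] -= cu * cv
    return coords


def rank(rows):
    """Rank over the rationals, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [p - factor * q for p, q in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def check_heisenberg(n):
    xs = [f"x{i}" for i in range(1, n + 1)]
    ys = [f"y{i}" for i in range(1, n + 1)]
    names = xs + ys + ["z"]
    index = {pair: k for k, pair in enumerate(combinations(names, 2))}

    def e(name):
        return {name: 1}

    two_vectors = []
    for i, j in combinations(range(n), 2):
        two_vectors += [(e(xs[i]), e(xs[j])), (e(ys[i]), e(ys[j]))]
    for i in range(n):
        two_vectors += [(e(xs[i]), e("z")), (e(ys[i]), e("z"))]
    for i in range(n):
        for j in range(n):
            if i != j:
                two_vectors.append((e(xs[i]), e(ys[j])))
    for i in range(1, n):  # (x_i + y_1) ^ (x_1 + y_i) for i = 2, ..., n
        two_vectors.append(({xs[i]: 1, ys[0]: 1}, {xs[0]: 1, ys[i]: 1}))

    all_in_kernel = all(bracket_map(u, v) == 0 for u, v in two_vectors)
    span_dim = rank([wedge(u, v, index) for u, v in two_vectors])
    kernel_dim = len(index) - 1  # dim Lambda^2 minus dim of the image of L
    return all_in_kernel, span_dim, kernel_dim


for n in (1, 2, 3):
    print(n, check_heisenberg(n))
```

For each $n$ tried, the second and third entries of the returned triple should coincide, which is what "generated by 2-vectors" means here.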


Finally consider the nilpotent Lie algebra $\mathfrak{u}_n$ of upper triangular $n \times n$ matrices (with zeros along the diagonal). If $e_{ij}$ denotes the matrix which has a 1 in its $(i, j)$-entry and zero everywhere else, then the non-zero brackets are $[e_{ij}, e_{jl}] = e_{il}$, where $i < j < l$. As in the case of the Heisenberg Lie algebra it is easy to check that $\mathrm{Ker}\, L$ is generated by 2-vectors and we leave this to the reader. The simply connected nilpotent Lie groups associated with these Lie algebras admit cocompact lattices and monopoles. The corresponding second Betti numbers are:

$$b_2(\mathfrak{h}_{2n+1}) = 2n^2 - n - 1,\ n \ge 2; \qquad b_2(\mathfrak{g}_{2n+1}) = n(n+1); \qquad b_2(\mathfrak{u}_n) = \frac{(n-2)(n+1)}{2}.$$

To all of them Theorem B applies.

4.2. Relation with Mañé's critical value. Let $M$ be a closed manifold and $\sigma$ a non-zero closed 2-form. We say that a compact set $K \subset (T^*M, \omega_\sigma)$ is stably displaceable if $K \times S^1$ is displaceable in $(T^*M \times T^*S^1, \omega_\sigma \oplus \omega_0)$. Let $g$ be a Riemannian metric on $M$. Following Schlenk [13] we define $d(g, \sigma)$ as the supremum of the values of $k \in \mathbf{R}$ such that the set of $(x, p) \in T^*M$ with $|p|_x^2 \le 2k$ is stably displaceable. The results of Laudenbach-Sikorav [8] and Polterovich [12] that we mentioned in the Introduction imply that $d(g, \sigma) > 0$. We have introduced stable displacement to include the case in which the Euler characteristic of $M$ is different from zero. Note that this was unnecessary before because all the manifolds we discussed had vanishing Euler characteristic.

Suppose now that $\sigma$ is weakly exact, that is, its lift $\tilde\sigma$ to the universal covering $\tilde M$ of $M$ is exact. Mañé's critical value $c(g, \sigma)$ is defined as [3]:

$$c(g, \sigma) := \inf_{u \in C^\infty(\tilde M, \mathbf{R})}\ \sup_{x \in \tilde M}\ \frac{1}{2}|d_x u + \theta_x|^2,$$

where $\theta$ is any primitive of $\tilde\sigma$. As $u$ ranges over $C^\infty(\tilde M, \mathbf{R})$ the form $\theta + du$ ranges over all primitives of $\tilde\sigma$, because any two primitives differ by a closed 1-form which must be exact since $\tilde M$ is simply connected. The critical value $c(g, \sigma) < \infty$ if and only if $\tilde\sigma$ has bounded primitives.

Question. Is $d(g, \sigma) = c(g, \sigma)$ always?

As far as we are aware, there are no counterexamples to this equality, which is motivated by the desire to relate Aubry-Mather theory with Symplectic Topology. A full motivation for this question together with more examples where equality holds may be found in [4]. Suppose that $\pi_1(M)$ is amenable. Then (see [11, Cor. 5.4]) $c(g, \sigma) = \infty$ if and only if $[\sigma] \neq 0$. Thus, to test the Question when $\pi_1(M)$ is amenable and $\sigma$ is a monopole, we must show that $d(g, \sigma) = \infty$. This is exactly the content of Theorem B, which could then be interpreted as evidence of a positive answer to the Question (recall that solvable groups are amenable).

Acknowledgements. We would like to thank L. Polterovich and F. Schlenk for useful comments and discussions about Theorem B.


References
1. Bolsinov, A.V., Dullin, H.R., Veselov, A.P.: Spectra of Sol-manifolds: arithmetic and quantum monodromy. Commun. Math. Phys. 264, 583–611 (2006)
2. Bolsinov, A.V., Taimanov, I.A.: Integrable geodesic flows with positive topological entropy. Invent. Math. 140, 639–650 (2000)
3. Burns, K., Paternain, G.P.: Anosov magnetic flows, critical values and topological entropy. Nonlinearity 15, 281–314 (2002)
4. Cieliebak, K., Frauenfelder, U., Paternain, G.P.: Symplectic topology of Mañé's critical value. In preparation
5. Ginzburg, V.L., Kerman, E.: Periodic orbits in magnetic fields in dimensions greater than two. Geometry and Topology in Dynamics (Winston-Salem, NC, 1998/San Antonio, TX, 1999), Contemp. Math. 246. Providence, RI: Amer. Math. Soc., 1999, pp. 113–121
6. Gromov, M.: Pseudo-holomorphic curves in symplectic manifolds. Invent. Math. 82, 307–347 (1985)
7. Hattori, A.: Spectral sequence in the de Rham cohomology of fibre bundles. J. Fac. Sci. Univ. Tokyo Sect. I 8, 289–331 (1960)
8. Laudenbach, F., Sikorav, J.-C.: Hamiltonian disjunction and limits of Lagrangian submanifolds. Internat. Math. Res. Notices 4, 161–168 (1994)
9. Siegel, C.L., Moser, J.: Lectures on Celestial Mechanics. Berlin-Heidelberg-New York: Springer-Verlag, 1971
10. Nomizu, K.: On the cohomology of compact homogeneous spaces of nilpotent Lie groups. Ann. of Math. 59, 531–538 (1954)
11. Paternain, G.P.: Magnetic rigidity of horocycle flows. Pacific J. Math. 225, 301–323 (2006)
12. Polterovich, L.: An obstacle to non-Lagrangian intersection. In: The Floer memorial volume, Progr. Math. 133. Basel: Birkhäuser, 1995, pp. 575–586
13. Schlenk, F.: Applications of Hofer's geometry to Hamiltonian dynamics. Comment. Math. Helv. 81, 105–121 (2006)
14. Scott, P.: The Geometries of 3-manifolds. Bull. London Math. Soc. 15, 401–487 (1983)
15. Wall, C.T.C.: Geometric structures on compact complex analytic surfaces. Topology 25, 119–153 (1986)
16. Zung, N.T.: Convergence versus integrability in Birkhoff normal form. Ann. Math. 161, 141–156 (2005)

Communicated by P. Sarnak

Commun. Math. Phys. 284, 203–225 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0604-4


Scattering Below Critical Energy for the Radial 4D Yang-Mills Equation and for the 2D Corotational Wave Map System

Raphaël Côte1, Carlos E. Kenig2, Frank Merle3

1 Centre de Mathématiques Laurent Schwartz, École Polytechnique, 91128 Palaiseau Cedex, France. E-mail: [email protected]
2 Department of Mathematics, University of Chicago, 5734 University Avenue, Chicago, IL 60637-1514, USA. E-mail: [email protected]
3 Département de Mathématiques, Université de Cergy-Pontoise / Saint-Martin, 2, avenue Adolphe Chauvin, 95 302 Cergy-Pontoise Cedex, France. E-mail: [email protected]

Received: 19 September 2007 / Accepted: 25 April 2008
Published online: 29 August 2008 – © Springer-Verlag 2008

Abstract: Given $g$ and $f = gg'$, we consider solutions to the following non-linear wave equation:

$$u_{tt} - u_{rr} - \frac{1}{r}u_r = -\frac{f(u)}{r^2}, \qquad (u, u_t)|_{t=0} = (u_0, u_1).$$

Under suitable assumptions on $g$, this equation admits non-constant stationary solutions: we denote by $Q$ one with least energy. We characterize completely the behavior as time goes to $\pm\infty$ of solutions $(u, u_t)$ corresponding to data with energy less than or equal to the energy of $Q$: either it is $(Q, 0)$ up to scaling, or it scatters in the energy space. Our results include the cases of the 2-dimensional corotational wave map system, with target $S^2$, in the critical energy space, as well as the 4-dimensional, radially symmetric Yang-Mills fields on Minkowski space, in the critical energy space.

Centre National de la Recherche Scientifique. Institut des Hautes Études Scientifiques. The work of R.C. and F.M. has been supported in part by ANR grant ONDE NONLIN, and the work of C.E.K. has been supported in part by NSF.

1. Introduction

In this paper we study the asymptotic behavior of solutions to a class of non-linear wave equations in $\mathbf{R} \times \mathbf{R}$, with data in the natural energy space. The equations covered by our results include the 2-dimensional corotational wave map system, with target $S^2$, in the critical energy space, as well as the 4-dimensional, radially symmetric Yang-Mills fields on Minkowski space, in the critical energy space. The equations under consideration admit non-constant solutions that are independent of time, of minimal energy, the so-called harmonic maps $Q$ (see [3] and the discussion below). It is known, from the work of Struwe [13], that if the data has energy smaller


than or equal to the energy of Q, then the corresponding solution exists globally in time (see Proposition 1 below). (A recent result [8] shows that large energy data may lead to a finite time blow up solution for the 2 dimensional corotational wave map system, with target S2 – see also [9]). In this paper, we show that, for this class of solutions, an alternative holds : either the data is (Q, 0) (or (−Q, 0) if −Q is also a harmonic map), modulo the natural symmetries of the problem, and the solution is independent of time, or a (suitable) space-time norm is finite, which results in the scattering at times ±∞. Thus the asymptotic behavior as t → ±∞ for solutions of energy smaller than or equal to that of Q, is completely described. Because of the existence of Q, the result is clearly sharp. The result is inspired by the recent works [6,5] of the last two authors, who developed a method to attack such problems, reducing them, by a concentration-compactness approach, to a rigidity theorem. An important element in the proof of the rigidity theorem in [6,5] is the use of a virial identity. This is also the case in this work, where the virial identity we use in the proof of Lemma 8 is very close to the one used in Lemma 5.4 of [5]. Lemma 8 in turn follows from Lemma 7, which has its origin in the work of the first author [3]. The concentration-compactness approach we use here is the same as the one in [5], with an important proviso. The results in [5] are established for dimension N = 3, 4, 5, while here, in order to include the case of radial Yang-Mills in R4 , we need to deal with a case similar to N = 6 ; it also establishes the result in [5] for N = 6. This is carried out in Theorem 2 below. It is conjectured that similar results will hold without the restriction to data with symmetry (for wave maps or Yang-Mills fields). These are extremely challenging problems for future research. We now turn to a more detailed description of our results. 
Let $g : \mathbf{R} \to \mathbf{R}$ be $C^3$ such that $g(0) = 0$, $g'(0) = k \in \mathbf{N}^*$, denote $f = gg'$, and let $N$ be the surface of revolution with polar coordinates $(\rho, \theta) \in [0, \infty) \times S^1$ and metric $ds^2 = d\rho^2 + g^2(\rho)\,d\theta^2$ (hence $N$ is fully determined by $g$). We consider $u$, an equivariant wave map in dimension 2 with target $N$, or a radial solution to the critical Yang-Mills equations in dimension 4, that is, a solution to the following problem (see [10] for the derivation of the equation):

$$\begin{cases} u_{tt} - u_{rr} - \dfrac{1}{r}u_r = -\dfrac{f(u)}{r^2},\\[4pt] (u, u_t)|_{t=0} = (u_0, u_1). \end{cases} \tag{1}$$

At least formally, the energy is conserved by such wave maps:

$$E(u, u_t) = \int \left(u_t^2 + u_r^2 + \frac{g^2(u)}{r^2}\right) r\,dr = E(u_0, u_1).$$

Shatah and Tahvildar-Zadeh [11] proved that (1) is locally well posed in the energy space $H \times L^2 = \{(u_0, u_1)\ |\ E(u_0, u_1) < \infty\}$. For such wave maps, energy is preserved. From Struwe [13] we have the following dichotomy regarding long time existence of solutions to (1), depending on the geometry of the target manifold $N$, and thus on $g$:
• If $g(\rho) > 0$ for all $\rho > 0$ (and $\int_0^\infty g(\rho)\,d\rho = \infty$, to prevent a sphere at infinity), then any finite energy wave map is global in time.


• Otherwise there exists a non-constant harmonic map $Q$, and one may have blow up (cf. [9,8]).
Our goal in this paper is to study the latter case, and to describe the dynamics of equivariant wave maps and of radial solutions to the critical Yang-Mills equations in dimension 4, with energy smaller than or equal to $E(Q)$.

1.1. Statement of the result. Notations and Assumptions: Denote by $v = W(t)(u_0, u_1)$ the solution to

$$\begin{cases} u_{tt} - u_{rr} - \dfrac{1}{r}u_r + \dfrac{k^2}{r^2}u = 0,\\[4pt] (u, u_t)|_{t=0} = (u_0, u_1). \end{cases} \tag{2}$$

$W(t)$ is the linear operator associated with the wave equation with a quadratic potential. For a single function $u$, we use $E(u)$ for $E(u, 0)$, with a slight abuse of notation, and we also use

$$E_a^b(u) = \int_a^b \left(u_r^2 + \frac{g^2(u)}{r^2}\right) r\,dr.$$

To avoid degeneracy (existence of infinitely small spheres), we assume that the set of points where $g$ vanishes is discrete. Denote $G(\rho) = \int_0^\rho |g|$. $G$ is an increasing function. We make the following assumptions on $g$ (that is on $N$, the wave map target):
(A1) $g$ vanishes at some point other than 0, and we denote by $C^* > 0$ the smallest positive real satisfying $g(C^*) = 0$.
(A2) $g'(0) = k \in \{1, 2\}$ and if $k = 1$, we also have $g''(0) = 0$.
(A3) $g'(-\rho) \ge g'(\rho)$ for $\rho \in [0, C^*]$ and $g'(\rho) \ge 0$ for all $\rho \in [0, D^*]$, where we denote by $D^*$ the point in $[0, C^*]$ such that $G(D^*) = G(C^*)/2$.

The first assumption is a necessary and sufficient condition on $g$ for the existence of stationary solutions to (1), that is, non-constant harmonic maps. Hence denote by $Q \in H$ the solution to $rQ_r = g(Q)$, with $Q(0) = 0$, $Q(\infty) = C^*$ and $Q(1) = C^*/2$, so that $(Q, 0)$ is a stationary wave map (see [3] for more details). Note that $E(Q) = 2G(C^*)$.

The second assumption is a technical one: the restriction on the range of $k$ should be removable using harmonic analysis. Recall that $k \in \mathbf{N}^*$, and for equivariant wave maps, one usually assumes $g$ odd. To remain at a lower level of technicality, we stick to the two assumptions in (A2), which encompass the cases of greater interest (see below).

The first part of the third assumption is a way to ensure that $Q$ is a non-constant harmonic map (with $Q(0) = 0$) with least energy. The second part arises crucially in the proof of some positivity estimates. This assumption could be somehow relaxed, but as such encompasses the two cases below, avoiding technicalities which are beside the point. We conjecture that this assumption is removable.

These assumptions encompass
• Corotational equivariant wave maps to the sphere $S^2$ in energy critical dimension $n = 2$ ($g(u) = \sin u$, $f(u) = \sin(2u)/2$, $k = 1$; we refer to [10] for more details).


• The critical (4-dimensional) radial Yang-Mills equation ($f(u) = 2u(1 - u^2)$, $g(u) = (1 - u^2)$; notice that to enter our setting we should consider $\tilde g(u) = g(u - 1) = u(2 - u)$, $k = 2$; we refer to [2] for more details).

Recall that if $u \in H$, then $u$ has finite limits at $r \to 0$ and $r \to \infty$, which are zeroes of $g$: we denote them by $u(0)$ and $u(\infty)$ (see [3, Lemma 1]). We can now introduce

$$V(\delta) = \{(u_0, u_1) \in H \times L^2\ |\ E(u_0, u_1) < E(Q) + \delta,\ u_0(0) = u_0(\infty) = 0\}. \tag{3}$$

Denote $H = \left\{u\ \middle|\ \|u\|_H^2 = \int \left(u_r^2 + \frac{u^2}{r^2}\right) r\,dr < \infty\right\}$. As we shall see below (Lemma 2), for $\delta \le E(Q)$, $V(\delta)$ is naturally endowed with the Hilbert norm

$$\|(u_0, u_1)\|_{H \times L^2}^2 = \|u_0\|_H^2 + \|u_1\|_{L^2}^2 = \int \left(u_1^2 + u_{0r}^2 + \frac{u_0^2}{r^2}\right) r\,dr. \tag{4}$$

Finally, for $I$ an interval of time, introduce the Strichartz space $S(I) = L_{t \in I}^{\frac{2k+3}{k}}(dt)\, L_r^{\frac{2k+3}{k}}(r^{-2}dr)$ and

$$\|u\|_{S(I)} = \|u\|_{L_{t \in I}^{2+3/k}(dt)\, L_r^{2+3/k}(r^{-2}dr)}.$$

Notice that $S(I)$ is simply the Strichartz space $L_{t,x}^{2+3/k}$ adapted to the energy critical wave equation in dimension $2k + 2$ (see [5]), under the conjugation by the map $u \to u/r^k$. This space appears naturally, see Sect. 3 for further details.

Theorem 1. Assume $k = 1$ or $k = 2$, and $g$ satisfies (A1), (A2) and (A3). There exists $\delta = \delta(g) > 0$ such that the following holds. Let $(u_0, u_1) \in V(\delta)$ and denote by $u(t)$ the corresponding wave map. Then $u(t)$ is global in time, and scatters, in the sense that $\|u\|_{S(\mathbf{R})} < \infty$. As a consequence, there exist $(u_0^\pm, u_1^\pm) \in H \times L^2$ such that

$$\|u(t) - W(t)(u_0^\pm, u_1^\pm)\|_{H \times L^2} \to 0 \quad \text{as } t \to \pm\infty.$$

As a direct consequence, we have the following

Corollary 1. Let $(u_0, u_1)$ be such that $E(u_0, u_1) \le E(Q, 0)$, and denote by $u(t)$ the corresponding wave map. Then $u(t)$ is global and we have the following dichotomy:
• If $u_0 = Q$ (or $u_0 = -Q$ if $-Q$ is a harmonic map) up to scaling, then $u(t)$ is a constant harmonic map ($u_t(t) = 0$).
• Otherwise $u(t)$ scatters, in the sense that there exist $(u_0^\pm, u_1^\pm) \in H \times L^2$ such that $\|u(t) - W(t)(u_0^\pm, u_1^\pm)\|_{H \times L^2} \to 0$ as $t \to \pm\infty$.

Remark 1. The fact that $u(t)$ is global in time is a direct corollary of [13] (in fact one has global well posedness in $V(E(Q))$ as recalled in Proposition 1). The new point in our result is linear scattering.

Remark 2. We conjecture that $\delta = E(Q)$. The only point missing for this is to improve Lemma 7 to $\delta = E(Q)$.

Remark 3. This result corresponds to what is expected in a "focusing" setting. Similarly, there is a defocusing setting, in the case $g(\rho) > 0$ for $\rho > 0$. Arguing in the same way as in Theorem 1, we can prove that if $g$ satisfies (A2), (A3) and $g'(\rho) \ge 0$ for all $\rho \in \mathbf{R}$, then any wave map is global and scatters in the sense of Theorem 1. Again, we conjecture that the correct assumptions for this result are $g(\rho) > 0$ for $\rho > 0$ and $G(\rho) \to \pm\infty$ as $\rho \to \pm\infty$ (to prevent a sphere at infinity).
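For the two model cases listed above, the relations $rQ_r = g(Q)$, $Q(1) = C^*/2$ and $E(Q) = 2G(C^*)$ can be checked numerically. The Python sketch below is an illustration under stated assumptions: the closed forms $Q(r) = 2\arctan r$ (for $g = \sin$, $C^* = \pi$) and $Q(r) = 2r^2/(1 + r^2)$ (for $\tilde g(u) = u(2 - u)$, $C^* = 2$) are not taken from the text; they are candidate solutions that the code itself tests against the harmonic map equation. The helper names `simpson`, `energy`, `big_g` and `cases` are hypothetical.

```python
import math


def simpson(func, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def energy(Q, g, half_width=20.0, dx=1e-5):
    """E(Q) = int (Q_r^2 + g(Q)^2 / r^2) r dr, computed in the variable x = log r.

    With r = e^x one has r Q_r = dQ/dx and dr / r = dx, so the integrand
    becomes (dQ/dx)^2 + g(Q)^2. The x-derivative is a central difference,
    and the integral is truncated to |x| <= half_width.
    """
    def integrand(x):
        dq_dx = (Q(math.exp(x + dx)) - Q(math.exp(x - dx))) / (2 * dx)
        return dq_dx ** 2 + g(Q(math.exp(x))) ** 2

    return simpson(integrand, -half_width, half_width)


def big_g(g, rho):
    """G(rho) = int_0^rho |g|."""
    return simpson(lambda s: abs(g(s)), 0.0, rho)


# (g, C*, candidate harmonic map Q); the formulas for Q are assumptions to be tested.
cases = {
    "corotational wave map, k = 1": (math.sin, math.pi, lambda r: 2 * math.atan(r)),
    "radial Yang-Mills, k = 2": (
        lambda u: u * (2 - u),
        2.0,
        lambda r: 2 * r * r / (1 + r * r),
    ),
}

results = {}
for name, (g, c_star, Q) in cases.items():
    # Largest defect of r Q_r = g(Q) over a few sample radii (central difference in r).
    residual = max(
        abs(r * (Q(r + 1e-6) - Q(r - 1e-6)) / 2e-6 - g(Q(r)))
        for r in (0.1, 0.5, 1.0, 2.0, 10.0)
    )
    results[name] = (energy(Q, g), 2 * big_g(g, c_star), Q(1.0), c_star / 2, residual)
    print(name, results[name])
```

In each tuple the first two entries compare $E(Q)$ with $2G(C^*)$, the next two compare $Q(1)$ with $C^*/2$, and the last entry is the residual of the harmonic map equation, which should be close to zero.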


2. Variational Results and Global Well Posedness in $V(E(Q))$

First recall the pointwise bound derived from the energy

$$\forall r, r' \in \mathbf{R}^+, \quad |G(u(r)) - G(u(r'))| \le \frac{1}{2}E_r^{r'}(u), \tag{5}$$

with equality at points $r, r'$ if and only if $u$ is harmonic on $(r, r')$, i.e., there exists $\varepsilon \in \{-1, 1\}$ such that $\forall\rho \in (r, r')$, $\rho u_\rho(\rho) = \varepsilon g(u(\rho))$. (See [3, Prop. 1].)

Lemma 1 ($V(\delta)$ is stable through the wave map flow). If $u \in H$, $u$ is continuous and has limits at 0 and $\infty$ which are points where $g$ vanishes; we denote them $u(0)$ and $u(\infty)$. Furthermore if $u(t)$ is a finite energy wave map defined on some interval $I$ containing 0, then

$$\forall t \in I, \quad u(t, 0) = u(0, 0) \ \text{ and } \ u(t, \infty) = u(0, \infty).$$

In particular, for all $\delta \ge 0$, $V(\delta)$ is preserved under the wave map flow.

Proof. The properties of $u$ are well known: see [10] or [3]. Let us prove that $u(t, 0)$ is constant in time by a continuity argument. For all $y$ such that $g(y) = 0$, denote $I_y = \{t \in I\ |\ u(t, 0) = y\}$. Let $t \in I$. As $g$ vanishes on a discrete set, denote $\varepsilon > 0$ such that if $g(\rho) = 0$ and $\rho \neq u(t, 0)$, then $|G(\rho) - G(u(t, 0))| \ge 2\varepsilon$. Since $u$ is defined in $I$, it does not concentrate energy in a neighbourhood of $(t, 0)$: there exist $\delta_0, \delta_1 > 0$ such that

$$\forall\tau \in [t - \delta_0, t + \delta_0], \quad E_0^{\delta_1}(u(\tau)) \le \varepsilon.$$

From this and the pointwise bound, we deduce

$$\forall\tau \in [t - \delta_0, t + \delta_0],\ \forall r \in [0, \delta_1], \quad |G(u(\tau, 0)) - G(u(\tau, r))| \le \varepsilon/2.$$

Now compute for $t' \in [t - \delta_0, t + \delta_0]$:

$$\left|\int_0^{\delta_1} G(u)(t, \rho)\,d\rho - \int_0^{\delta_1} G(u)(t', \rho)\,d\rho\right| \le \int_0^{\delta_1}\left|\int_t^{t'} |g(u(\tau, \rho))|\,|u_t(\tau, \rho)|\,d\tau\right| d\rho \le \frac{1}{2}\left|\int_t^{t'} E(u)\,d\tau\right| \le \frac{1}{2}E(u)|t - t'|.$$

Suppose $t'$ is such that $u(t, 0) \neq u(t', 0)$, and then $|G(u)(t, 0) - G(u)(t', 0)| \ge 2\varepsilon$. Then

$$\left|\int_0^{\delta_1} G(u)(t, \rho)\,d\rho - \int_0^{\delta_1} G(u)(t', \rho)\,d\rho\right| \ge \left|\int_0^{\delta_1} \Big( (G(u)(t, \rho) - G(u)(t, 0)) + (G(u)(t, 0) - G(u)(t', 0)) + (G(u)(t', 0) - G(u)(t', \rho)) \Big)\,d\rho\right| \ge \delta_1(2\varepsilon - \varepsilon/2 - \varepsilon/2) \ge \delta_1\varepsilon.$$


We just proved that $\frac{1}{2} E(u)\,|t' - t| \ge \varepsilon \delta_1$. This means that $I_{u(t,0)}$ is open in $I$. In the same way, $I \setminus I_{u(t,0)} = \bigcup_{y,\, y \ne u(t,0)} I_y$ is also open in $I$, so that $I_{u(t,0)}$ is closed in $I$. As $I$ is connected, $I = I_{u(t,0)}$. Similarly, one can prove that $u(t, \infty)$ is constant in time. The rest of the lemma follows from conservation of energy.
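The pointwise bound (5) and its equality case can be checked numerically in a model situation. The sketch below assumes, purely for illustration, the choice g = sin with k = 1 (so that G(ρ) = 1 − cos ρ on [0, π]) and the profile Q(r) = 2 arctan r, which satisfies rQ′(r) = sin(Q(r)); these choices, the function names and the midpoint quadrature are assumptions of the example and are not taken from the text.

```python
import math

def G(rho):
    # Assumed model case g = sin: G(rho) = int_0^rho |sin| = 1 - cos(rho) on [0, pi].
    return 1.0 - math.cos(rho)

def energy_on(u, u_r, a, b, steps=20000):
    """Midpoint-rule value of E_a^b(u) = int_a^b (u_r^2 + sin(u)^2 / r^2) r dr."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        r = a + (i + 0.5) * h
        total += (u_r(r) ** 2 + (math.sin(u(r)) / r) ** 2) * r * h
    return total

def gap(u, u_r, a, b):
    """Right side minus left side of (5): nonnegative, and zero when u is harmonic."""
    return 0.5 * energy_on(u, u_r, a, b) - abs(G(u(b)) - G(u(a)))

# Harmonic profile for the assumed g = sin, k = 1.
Q = lambda r: 2.0 * math.atan(r)
Q_r = lambda r: 2.0 / (1.0 + r * r)

# A non-harmonic comparison profile with values in [0, pi/2).
w = lambda r: math.atan(r)
w_r = lambda r: 1.0 / (1.0 + r * r)

gap_Q = gap(Q, Q_r, 0.5, 3.0)  # equality case: numerically close to 0
gap_w = gap(w, w_r, 0.5, 3.0)  # strict inequality: clearly positive
```

In this model case the gap vanishes (up to quadrature error) for the harmonic profile and is strictly positive for the comparison profile, in line with the equality statement following (5).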

Lemma 2. There exist an increasing function $K : [0, 2E(Q)) \to [0, C^*)$ and a decreasing function $\delta : [0, 2E(Q)) \to (0, 1]$ such that the following holds. For all $u \in H$ such that $E(u) < 2E(Q)$ and $u(0) = u(\infty) = 0$, one has the pointwise bound
\[
\forall r, \quad |u(r)| \le K(E(u)) < C^*.
\]
Moreover, one has
\[
\delta(E(u)) \|u\|_H \le E(u) \le \|g'\|_{L^\infty} \|u\|_H.
\]

Proof. From the pointwise bound (5), we have
\[
|G(u)(r)| = |G(u)(r) - G(u)(0)| \le \frac{1}{2} E_0^r(u), \qquad |G(u)(r)| \le \frac{1}{2} E_r^\infty(u).
\]
So that $2|G(u)(r)| \le E(u) < 2E(Q)$. As $G$ is an increasing function on $[-E(Q), E(Q)]$, and $|G(-\rho)| \ge G(\rho)$ for $\rho \in [0, C^*]$, we obtain
\[
|u(r)| \le G^{-1}(E(u)/2) < G^{-1}(E(Q)) = C^*.
\]
Then $K(\rho) = G^{-1}(\rho/2)$ fits. We now turn to the second line. For the upper bound, notice that $g(0) = 0$ so that $g^2(\rho) \le \|g'\|_{L^\infty}^2 \rho^2$, and $\|g'\|_{L^\infty} \ge |g'(0)| \ge 1$. For the lower bound, notice that as $|u| \le K(E(u)) < C^*$, then $g^2(u) \ge \delta(E(u)) u^2$ for some positive continuous function $\delta : (-C^*, C^*) \to (0, 1]$ ($g(\rho)/\rho$ is a continuous positive function on $(-C^*, C^*)$, $\delta(\rho) = \min(1, \inf\{g(r)/r \mid |r| \le \rho\})$).

Proposition 1 (Struwe [13]). Let $(u_0, u_1) \in \mathcal{V}(E(Q))$. Then the corresponding wave map is global in time, and satisfies the bound
\[
\forall t, r, \quad |u(t, r)| \le K(E(u_0, u_1)).
\]

Proof. Indeed suppose that $u$ blows up, say at time $T$. By Struwe [13], there exist a non-constant harmonic map $\tilde{Q}$ and two sequences $t_n \uparrow T$ and $\lambda(t_n)$ such that $\lambda(t_n)|T - t_n| \to \infty$ and
\[
u_n(t, r) = u\left( t_n + \frac{t}{\lambda(t_n)}, \frac{r}{\lambda(t_n)} \right) \to \tilde{Q}(r) \quad \text{in } H_{\mathrm{loc}}(]-1, 1[_t \times \mathbb{R}_r).
\]
From Lemma 1, one deduces $\tilde{Q}(0) = 0$, and hence (with assumption (A3)) $|\tilde{Q}(\infty)| \ge C^*$. However, as $(u, u_t) \in \mathcal{V}(E(Q))$, from Lemma 2, $|u(t, r)| \le K(E(u_0, u_1)) < C^*$ (uniformly in $t$). Now $\{r \ge 0 \mid |\tilde{Q}(r)| \ge (K(E(u)) + C^*)/2\}$ is an interval of the form $[A_{E(u)}, \infty)$ ($\tilde{Q}$ is monotone), so that
\[
\int_{t \in [-1/2, 1/2]} \int_{[A_{E(u)}, A_{E(u)} + 1]} |u_n(t, r) - \tilde{Q}(r)|^2\, r\,dr\,dt \ge (C^* - K(E(u)))^2/4 > 0.
\]
This is in contradiction with the $H_{\mathrm{loc}}$ convergence: hence $u$ is global.


3. Local Cauchy Problem Revisited

Denote $\Delta = \partial_{rr} + \frac{2k+1}{r} \partial_r = \frac{1}{r^{2k+1}} \partial_r (r^{2k+1} \partial_r)$ the radial Laplacian in $\mathbb{R}^{2k+2}$ and $U(t)$ the linear wave operator in $\mathbb{R}^{2k+2}$:
\[
U(t)(v_0, v_1) = \cos(t\sqrt{-\Delta})\, v_0 + \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}}\, v_1.
\]
Notice that
\[
W(t)(u_0, u_1) = r^k\, U(t)(u_0/r^k, u_1/r^k), \tag{6}
\]
as $v$ solves $v_{tt} - \Delta v = 0$ if and only if $r^k v$ solves (2). Given an interval $I$ of $\mathbb{R}$, denote
\[
\|v\|_{N(I)} = \|v(t, x)\|_{N(t \in I)} = \|v\|_{L^\infty_{t \in I} \dot{H}^1_x} + \|v\|_{L^{\frac{2k+3}{k}}_{t \in I, x}} + \|v\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I} \dot{W}^{1/2, \frac{2(2k+3)}{2k+1}}_x} + \|v\|_{W^{1,\infty}_{t \in I} L^2_x}, \tag{7}
\]
where the space variable $x$ belongs to $\mathbb{R}^{2k+2}$. This norm appears in the Strichartz estimate (Lemma 6).

Theorem 2. Assume $k = 1$ or $2$. Problem (1) is locally well-posed in the space $H$ in the sense that there exist two functions $\delta_0, C : [0, \infty) \to (0, \infty)$ such that the following holds. Let $(u_0, u_1) \in H \times L^2$ be such that $\|(u_0, u_1)\|_{H \times L^2} \le A$, and let $I$ be an open interval containing $0$ such that
\[
\|W(t)(u_0, u_1)\|_{S(I)} = \eta \le \delta_0(A).
\]
Then there exists a unique solution $u \in C(I, H) \cap S(I)$ to Problem (1), and $\|u\|_{S(I)} \le C(A)\eta$ (we also have $\|u/r^k\|_{N(I)} \le C(A)$ and $E(u, u_t) = E(u_0, u_1)$). As a consequence, if $u$ is such a solution defined on $I = \mathbb{R}^+$, satisfying $\|u\|_{S(\mathbb{R}^+)} < \infty$, there exist $(u_0^+, u_1^+) \in H \times L^2$ such that
\[
\|u(t) - W(t)(u_0^+, u_1^+)\|_{H \times L^2} \to 0 \quad \text{as } t \to +\infty.
\]

3.1. Preliminary lemmas. Let us first recall some useful lemmas. We consider $D^s = (-\Delta)^{s/2}$ the fractional derivative operator and the homogeneous Sobolev space
\[
\dot{W}^{s,p} = \dot{W}^{s,p}(\mathbb{R}^n) \overset{\mathrm{def}}{=} \left\{ \varphi \in \mathcal{S}'(\mathbb{R}^n) \,\middle|\, \|\varphi\|_{\dot{W}^{s,p}} = \|D^s \varphi\|_{L^p} < \infty \right\}.
\]
For integer $s$, it is well known that $\|\cdot\|_{\dot{W}^{s,p}}$ is equivalent to the Sobolev semi-norm: $\|\varphi\|_{\dot{W}^{s,p}} \sim \|\nabla^s \varphi\|_{L^p}$.

Lemma 3 (Hardy-Sobolev embedding). Let $n \ge 3$, and $p, q, \alpha, \beta \ge 0$ be such that $1 \le q \le p \le \infty$ and $0 < (\beta - \alpha) q < n$. There exists $C = C(n, p, q, \alpha, \beta)$ such that for all $\varphi$ radial in $\mathbb{R}^n$,
\[
\left\| r^{\frac{n}{q} - \frac{n}{p} - \beta + \alpha} \varphi \right\|_{\dot{W}^{\alpha, p}} \le C \|\varphi\|_{\dot{W}^{\beta, q}}.
\]


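Proof. Given $n$, $p$, $q$ and $\beta$, we show the estimate for $\alpha$ in the suitable range. The case $\alpha = 0$ is the standard Hardy inequality in $L^p$ combined with the Sobolev embedding (see [11] and the references therein, where the conditions $n \ge 3$, $1 \le q \le p \le \infty$ and $0 < \beta < n$ are required). If $\alpha$ is an integer, we use the Sobolev semi-norm: as
\[
\partial_r^\alpha (r^\gamma v) = \sum_{k=0}^{\alpha} c_k\, r^{\gamma - k}\, \partial_r^{\alpha - k} v,
\]
the inequality follows from the case $\alpha = 0$. In the general case, let $\alpha = k + \theta$ for $k \in \mathbb{N}$ and $\theta \in\, ]0, 1[$, and $\gamma = \frac{n}{q} - \frac{n}{p} - \beta + \alpha$. We define $\ell$ so that $\beta = \ell + \theta$, hence $\frac{n}{q} - \frac{n}{p} - \ell + k = \gamma$. We consider the operator $T : \varphi \mapsto D^k (r^\gamma D^{-\ell} \varphi)$: $T$ maps $L^q$ to $L^p$ and $\dot{W}^{1,q}$ to $\dot{W}^{1,p}$ (integer case). By complex interpolation (see [12]), $T$ maps $[L^q, \dot{W}^{1,q}]_\theta = \dot{W}^{\theta, q}$ to $[L^p, \dot{W}^{1,p}]_\theta = \dot{W}^{\theta, p}$. This means that
\[
\|r^\gamma \varphi\|_{\dot{W}^{k + \theta, p}} \le C \|\varphi\|_{\dot{W}^{\ell + \theta, q}},
\]
which is what we needed to prove.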

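Lemma 4. If $v = u/r^k$, then
\[
\frac{1}{3} \int \left( u_r^2 + \frac{u^2}{r^2} \right) r\,dr \le \int v_r^2\, r^{2k+1}\,dr \le (k^2 + 1) \int \left( u_r^2 + \frac{u^2}{r^2} \right) r\,dr.
\]

Proof. First notice that $v_r = -ku/r^{k+1} + u_r/r^k$, hence $v_r^2 \le (k^2 + 1)(u^2/r^{2k+2} + u_r^2/r^{2k})$ and
\[
\int v_r^2\, r^{2k+1}\,dr \le (k^2 + 1) \int \left( u_r^2 + \frac{u^2}{r^2} \right) r\,dr.
\]
Then from the Hardy-Sobolev inequality in dimension $2k + 2 \ge 3$ (optimal constant is $1/k^2$),
\[
\int \frac{u^2}{r^2}\, r\,dr = \int \frac{v^2}{r^2}\, r^{2k+1}\,dr \le \frac{1}{k^2} \int v_r^2\, r^{2k+1}\,dr.
\]
As $u_r = r^k v_r + ku/r$, $u_r^2 \le 2 r^{2k} v_r^2 + 2k^2 u^2/r^2$ and
\[
\int \left( u_r^2 + \frac{u^2}{r^2} \right) r\,dr \le \left( 2 + \frac{1}{k^2} \right) \int v_r^2\, r^{2k+1}\,dr.
\]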
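The two-sided comparison of Lemma 4 can be tested on a concrete profile. The sketch below is only a numerical illustration under assumptions that are not in the text: it takes k = 2 and the sample function u(r) = r² e^{−r} (so that v = u/r² = e^{−r}), truncates the integrals at r = 40, and uses a midpoint rule; all names are hypothetical.

```python
import math

def integrate(f, a, b, steps=100000):
    """Midpoint rule on [a, b]; the integrand is never evaluated at r = 0."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

def lemma4_integrals(k, u, u_r, r_max=40.0):
    """Return (I_u, I_v) with I_u = int (u_r^2 + u^2/r^2) r dr and
    I_v = int v_r^2 r^(2k+1) dr, where v = u / r^k."""
    def p_u(r):
        return (u_r(r) ** 2 + (u(r) / r) ** 2) * r

    def p_v(r):
        v_r = u_r(r) / r ** k - k * u(r) / r ** (k + 1)
        return v_r ** 2 * r ** (2 * k + 1)

    return integrate(p_u, 0.0, r_max), integrate(p_v, 0.0, r_max)

# Sample data (an assumption of this example): k = 2, u(r) = r^2 exp(-r).
k = 2
u = lambda r: r ** 2 * math.exp(-r)
u_r = lambda r: (2 * r - r ** 2) * math.exp(-r)

I_u, I_v = lemma4_integrals(k, u, u_r)
lemma4_holds = I_u / 3 <= I_v <= (k ** 2 + 1) * I_u
```

For this sample the two integrals can also be computed by hand (I_u = 3/4 and I_v = 15/8), which gives an independent check of the quadrature.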

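Lemma 5 (Derivation rules). Let $1 < p < \infty$, $0 < \alpha < 1$. Then
\[
\begin{aligned}
\|D^\alpha (\varphi \psi)\|_{L^p} &\le C \|\varphi\|_{L^{p_1}} \|D^\alpha \psi\|_{L^{p_2}} + \|D^\alpha \varphi\|_{L^{p_3}} \|\psi\|_{L^{p_4}}, \\
\|D^\alpha (h(\varphi))\|_{L^p} &\le C \|h'(\varphi)\|_{L^{p_1}} \|D^\alpha \varphi\|_{L^{p_2}}, \\
\|D^\alpha (h(\varphi) - h(\psi))\|_{L^p} &\le C \left( \|h'(\varphi)\|_{L^{p_1}} + \|h'(\psi)\|_{L^{p_1}} \right) \|D^\alpha (\varphi - \psi)\|_{L^{p_2}} \\
&\quad + C \left( \|h''(\varphi)\|_{L^{r_1}} + \|h''(\psi)\|_{L^{r_1}} \right) \left( \|D^\alpha \varphi\|_{L^{r_2}} + \|D^\alpha \psi\|_{L^{r_2}} \right) \|\varphi - \psi\|_{L^{r_3}},
\end{aligned}
\]
where $\frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2} = \frac{1}{p_3} + \frac{1}{p_4} = \frac{1}{r_1} + \frac{1}{r_2} + \frac{1}{r_3}$, and $1 < p_2, p_3, r_1, r_2, r_3 < \infty$.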


Proof. See [7, Theorem A.6 and A.8], [7, Theorem A.7 and A.12] and [5, Lemma 2.5].

For the rest of this section, we work in dimension $2k + 2$ (radial), and the underlying measure, in particular to define Lebesgue and Sobolev spaces, is $r^{2k+1}\,dr$ unless otherwise stated. In particular, notice that from Lemma 5, we have:
\[
\|D^{1/2} (\varphi \psi)\|_{L^{\frac{2(2k+3)}{2k+5}}} \le \|D^{1/2} \varphi\|_{L^{\frac{2(2k+3)}{2k+5}}} \|\psi\|_{L^\infty} + \|\varphi\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}} \|D^{1/2} \psi\|_{L^{4(k+1)}}. \tag{8}
\]
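The second term of (8) is an instance of Lemma 5 only if the three Lebesgue exponents satisfy the Hölder relation 1/p = 1/p₃ + 1/p₄. The following sketch checks this bookkeeping with exact rational arithmetic for k = 1 and k = 2; the function names are hypothetical and the snippet is a sanity check, not part of the argument.

```python
from fractions import Fraction

def exponents_in_8(k):
    """Exponents appearing in (8): the target space, the space for phi,
    and the space for D^{1/2} psi."""
    target = Fraction(2 * (2 * k + 3), 2 * k + 5)
    q = Fraction(4 * (2 * k * k + 5 * k + 3), 4 * k * k + 12 * k + 7)
    s = Fraction(4 * (k + 1))
    return target, q, s

def holder_ok(k):
    """Check 1/target == 1/q + 1/s, the Hölder relation behind the second term of (8)."""
    target, q, s = exponents_in_8(k)
    return 1 / target == 1 / q + 1 / s

checks = {k: holder_ok(k) for k in (1, 2)}
```

For k = 1 the exponents are 10/7, 40/23 and 8, and indeed 7/10 = 23/40 + 1/8.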

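Recall that
\[
w = \cos(t\sqrt{-\Delta})\, v_0 + \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}}\, v_1 + \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}}\, \chi(s)\,ds
\]
solves the problem
\[
w_{tt} - \Delta w = \chi, \qquad (w, w_t)|_{t=0} = (v_0, v_1).
\]

Lemma 6 (Strichartz estimate). Let $I$ be an interval. There exists a constant $C$ (not depending on $I$) such that (in dimension $2k + 2$),
\[
\begin{aligned}
\left\| \cos(t\sqrt{-\Delta})\, v_0 \right\|_{N(\mathbb{R})} &\le C \|v_0\|_{\dot{H}^1_x}, \\
\left\| \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}}\, v_1 \right\|_{N(\mathbb{R})} &\le \|v_1\|_{L^2_x}, \\
\left\| \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}}\, \chi(s)\,ds \right\|_{N(I)} &\le \left\| D_x^{1/2} \chi \right\|_{L^{\frac{2(2k+3)}{2k+5}}_{t \in I} L^{\frac{2(2k+3)}{2k+5}}_x}.
\end{aligned}
\]

Proof. This result is well-known: see [5] and the references therein.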

3.2. Proofs of Theorem 2 in the case k = 1 and k = 2.

Proof of Theorem 2. Denote $v = u/r^k$. Then
\[
v_r = \frac{u_r}{r^k} - \frac{ku}{r^{k+1}}, \qquad v_{rr} = \frac{u_{rr}}{r^k} - \frac{2k u_r}{r^{k+1}} + k(k+1) \frac{u}{r^{k+2}},
\]
so that
\[
\begin{cases}
v_{tt} - v_{rr} - (2k+1) \dfrac{v_r}{r} = - \dfrac{f(r^k v) - k^2 r^k v}{(r^k v)^{1+2/k}}\, v^{1+2/k}, \\[2mm]
(v, v_t)|_{t=0} = (v_0, v_1) = (u_0/r^k, u_1/r^k).
\end{cases} \tag{9}
\]
This is something like the (radial) energy critical wave equation in dimension $2k + 2$. This is why we stated (8) and the Strichartz estimates in this dimension. Denote
\[
h(\rho) = \frac{f(\rho) - k^2 \rho}{\rho^{1+2/k}}.
\]
Assume that $h$, $h'$ and $h''$ are bounded on compact sets: this is automatic if $g$ is $C^3$ and satisfies (A2). Indeed if $k = 2$, $1 + 2/k = 2$ and it is a direct application of Taylor's expansion, and if $k = 1$, $1 + 2/k = 3$, and it suffices to notice additionally that $f''(0) = 3k g''(0) = 0$.
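To see the boundedness of h near ρ = 0 in a concrete case, one can take the model nonlinearity g = sin with k = 1, which is an assumption made here only for illustration. Then f = gg′ = sin ρ cos ρ, and a Taylor expansion gives h(ρ) = (f(ρ) − ρ)/ρ³ → −2/3 as ρ → 0. The sketch below evaluates h at small positive ρ.

```python
import math

K = 1  # g'(0) = k; the assumed model g = sin has k = 1

def f(rho):
    # f = g * g' for the assumed g = sin.
    return math.sin(rho) * math.cos(rho)

def h(rho):
    """h(rho) = (f(rho) - k^2 rho) / rho^(1 + 2/k), evaluated for rho > 0."""
    return (f(rho) - K ** 2 * rho) / rho ** (1 + 2 / K)

# Values approach the Taylor limit -2/3 as rho decreases.
samples = [(rho, h(rho)) for rho in (1e-1, 1e-2)]
```

The cancellation of the linear term k²ρ is what makes the quotient finite at 0; for k = 1 the vanishing of the quadratic term (f″(0) = 0) is needed as well, as noted above.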


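Our assumptions on $(u_0, u_1)$ translate to:
\[
\|v_0\|_{\dot{H}^1} + \|v_1\|_{L^2} \le CA, \qquad \|U(t)(v_0, v_1)\|_{L^{2+3/k}_{t \in I} L^{2+3/k}_r} \le C\eta.
\]
Consider the map
\[
\Phi : v \mapsto \cos(t\sqrt{-\Delta})\, v_0 + \frac{\sin(t\sqrt{-\Delta})}{\sqrt{-\Delta}}\, v_1 + \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}} \left( v^{1+2/k}(s)\, h(r^k v)(s) \right) ds,
\]
that is, $\Phi(v)$ solves the (linear in $\Phi(v)$) equation
\[
\begin{cases}
\Phi(v)_{tt} - \Phi(v)_{rr} - (2k+1) \dfrac{\Phi(v)_r}{r} = -h(r^k v)\, v^{1+2/k}, \\[2mm]
(v, v_t)|_{t=0} = (v_0, v_1) = (u_0/r^k, u_1/r^k).
\end{cases} \tag{10}
\]
We will find a fixed point for $\Phi$, related to smallness in the norms
\[
\|v\|_{L^{2+3/k}_{t \in I, r}} \quad \text{and} \quad \|D^{1/2} v\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}}.
\]
The Strichartz estimate shows that we are to control $\left\| D_r^{1/2} (v^{1+2/k} h(r^k v)) \right\|_{L^{\frac{2(2k+3)}{2k+5}}_{t \in I, r}}$. For convenience in the following, denote:
\[
p = \frac{4(k+2)(2k^2 + 5k + 3)}{k(4k^2 + 12k + 7)}.
\]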

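Now, we use (8) together with Lemma 3 and Lemma 5:
\[
\begin{aligned}
\left\| D^{1/2} (v^{1+2/k} h(r^k v)) \right\|_{L^{\frac{2(2k+3)}{2k+5}}}
&\le \left\| D^{1/2} (v^{1+2/k}) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \|h(r^k v)\|_{L^\infty} + \left\| v^{1+2/k} \right\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}} \left\| D^{1/2} h(r^k v) \right\|_{L^{4(k+1)}} \\
&\le C \left\| v^{2/k} \right\|_{L^{k+3/2}} \left\| D^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}} \|h(r^k v)\|_{L^\infty} + C \|v\|_{L^p}^{1+2/k} \|h'(r^k v)\|_{L^\infty} \|r^k v\|_{\dot{W}^{1/2, 4(k+1)}} \\
&\le C \|v\|_{L^{2+3/k}}^{2/k} \left\| D^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}} \|h(r^k v)\|_{L^\infty} + C \|v\|_{L^p}^{1+2/k} \|h'(r^k v)\|_{L^\infty} \|v_r\|_{L^2}.
\end{aligned}
\]
From interpolation of Lebesgue spaces and the Hölder inequality,
\[
\begin{aligned}
\left\| \|v\|_{L^p_r}^{1+2/k} \right\|_{L_t^{\frac{2(2k+3)}{2k+5}}}
&= \|v\|^{1+2/k}_{L_t^{(1+2/k)\frac{2(2k+3)}{2k+5}} L_r^{\frac{4(k+2)(2k^2+5k+3)}{k(4k^2+12k+7)}}} \\
&\le \left\| \|v\|_{L_r^{2+3/k}}^{2/k}\, \|v\|_{L_r^{\frac{4(2k+3)(k+1)}{4k^2+4k-1}}} \right\|_{L_t^{\frac{2(2k+3)}{2k+5}}} \\
&\le \left\| \|v\|_{L_r^{2+3/k}}^{2/k}\, \left\| D_r^{1/2} v \right\|_{L_r^{\frac{2(2k+3)}{2k+1}}} \right\|_{L_t^{\frac{2(2k+3)}{2k+5}}} \\
&\le \|v\|_{L^{2+3/k}_{t,r}}^{2/k} \left\| D_r^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t,r}},
\end{aligned} \tag{11}
\]
and
\[
\left\| \left\| v^{2/k} \right\|_{L_r^{k+3/2}} \left\| D^{1/2} v \right\|_{L_r^{\frac{2(2k+3)}{2k+1}}} \right\|_{L_t^{\frac{2(2k+3)}{2k+5}}}
\le \|v\|_{L^{2+3/k}_{t,r}}^{2/k} \left\| D_r^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t,r}}. \tag{12}
\]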
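The chain leading to (11) and (12) rests on a few exponent identities: the Sobolev embedding of Ẇ^{1/2,s} into L^{s*} in dimension n = 2k + 2, the relation p = (1 + 2/k)q between p and the exponent q from (8), and two Hölder relations (one in space, one in time). The sketch below verifies them with exact fractions for k = 1, 2; the dictionary keys and function name are hypothetical labels for this example.

```python
from fractions import Fraction

def check_exponents(k):
    """Exact checks of the exponent relations used for (11) and (12)."""
    n = 2 * k + 2                                        # space dimension
    s = Fraction(2 * (2 * k + 3), 2 * k + 1)             # D^{1/2} v lies in L^s
    s_star = Fraction(4 * (2 * k + 3) * (k + 1), 4 * k * k + 4 * k - 1)
    q = Fraction(4 * (2 * k * k + 5 * k + 3), 4 * k * k + 12 * k + 7)
    p = Fraction(4 * (k + 2) * (2 * k * k + 5 * k + 3), k * (4 * k * k + 12 * k + 7))
    r = 2 + Fraction(3, k)                               # v lies in L^{2+3/k}
    t = Fraction(2 * (2 * k + 3), 2 * k + 5)             # target exponent
    return {
        # Sobolev: 1/s* = 1/s - (1/2)/n
        "sobolev": 1 / s_star == 1 / s - Fraction(1, 2 * n),
        # ||v^{1+2/k}||_{L^q} = ||v||_{L^p}^{1+2/k}
        "p_from_q": p == (1 + Fraction(2, k)) * q,
        # Hölder in r: v^{2/k} in L^{k+3/2}, v in L^{s*}
        "holder_space": 1 / q == Fraction(2, k) / r + 1 / s_star,
        # Hölder in t between the L^{2+3/k} and L^s factors
        "holder_time": 1 / t == Fraction(2, k) / r + 1 / s,
    }

results = {k: check_exponents(k) for k in (1, 2)}
```

All four relations hold identically in k, so the restriction to k = 1, 2 in the check only mirrors the cases treated in Theorem 2.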


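Using again Lemma 3 to show $\|r^k v\|_{L^\infty} \le C \|v_r\|_{L^2}$, we hence get our main estimate, for some increasing function $\omega$ ($\omega$ is a function of $h$, $h'$ and essentially the constant in the Strichartz estimate, and does not depend on $I$ or $v$):
\[
\left\| D_x^{1/2} (v^{1+2/k} h(r^k v)) \right\|_{L^{\frac{2(2k+3)}{2k+5}}_{t \in I, r}} \le \omega\left( \|v_r\|_{L^\infty_{t \in I} L^2_r} \right) \|v\|_{L^{2+3/k}_{t \in I, r}}^{2/k} \left\| D^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}}. \tag{13}
\]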

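We now turn to difference estimates. Using the same inequalities, we get:
\[
\begin{aligned}
&\left\| D^{1/2} \left( v^{1+2/k} h(r^k v) - w^{1+2/k} h(r^k w) \right) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \\
&\quad \le C \left\| D^{1/2} \left( (v^{1+2/k} - w^{1+2/k}) h(r^k v) \right) \right\|_{L^{\frac{2(2k+3)}{2k+5}}}
+ C \left\| D^{1/2} \left( r^k w^{1+2/k} (v - w) \int_0^1 h'(\theta r^k (v - w) + r^k w)\,d\theta \right) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \\
&\quad \le \left\| D^{1/2} (v^{1+2/k} - w^{1+2/k}) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \|h(r^k v)\|_{L^\infty}
+ \left\| v^{1+2/k} - w^{1+2/k} \right\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}} \left\| D^{1/2} h(r^k v) \right\|_{L^{4(k+1)}} \\
&\qquad + \left\| D^{1/2} \left( r^k w^{1+2/k} (v - w) \right) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \left\| \int_0^1 h'(\theta r^k (v - w) + r^k w)\,d\theta \right\|_{L^\infty} \\
&\qquad + \left\| r^k w^{1+2/k} (v - w) \right\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}} \left\| \int_0^1 D^{1/2} \left( h'(\theta r^k (v - w) + r^k w) \right) d\theta \right\|_{L^{4(k+1)}} \\
&\quad \le \left\| D^{1/2} (v^{1+2/k} - w^{1+2/k}) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \|h(r^k v)\|_{L^\infty}
+ \|v - w\|_{L^p} \left( \|v\|_{L^p}^{2/k} + \|w\|_{L^p}^{2/k} \right) \|h'(r^k v)\|_{L^\infty} \|v_r\|_{L^2} \\
&\qquad + \left( \left\| D^{1/2} (w^{2/k} (v - w)) \right\|_{L^{\frac{2(2k+3)}{2k+5}}} \|r^k w\|_{L^\infty}
+ \left\| w^{2/k} (v - w) \right\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}} \left\| D^{1/2} (r^k w) \right\|_{L^{4(k+1)}} \right) \\
&\qquad\qquad \times \sup_{\theta \in [0,1]} \left\| h'(r^k v + \theta r^k (w - v)) \right\|_{L^\infty} \\
&\qquad + \|r^k w\|_{L^\infty} \left\| w^{2/k} (v - w) \right\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}} \\
&\qquad\qquad \times \sup_{\theta \in [0,1]} \left\| h''(\theta r^k (v - w) + r^k w) \right\|_{L^\infty} \left\| D^{1/2} (r^k (\theta v + (1 - \theta) w)) \right\|_{L^{4(k+1)}}.
\end{aligned}
\]
Then we have as previously:
\[
\begin{aligned}
\left\| D^{1/2} (w^{2/k} (v - w)) \right\|_{L^{\frac{2(2k+3)}{2k+5}}}
&\le C \left\| D^{1/2} (v - w) \right\|_{L^{\frac{2(2k+3)}{2k+1}}} \left\| w^{2/k} \right\|_{L^{k+3/2}} + C \|v - w\|_{L^{2+3/k}} \left\| D^{1/2} (w^{2/k}) \right\|_{L^{\frac{2(2k+3)}{5}}} \\
&\le C \|w\|_{L^{2+3/k}}^{2/k} \left\| D^{1/2} (v - w) \right\|_{L^{\frac{2(2k+3)}{2k+1}}} + \|w\|_{L^{2+3/k}}^{2/k - 1} \left\| D^{1/2} w \right\|_{L^{\frac{2(2k+3)}{2k+1}}} \|v - w\|_{L^{2+3/k}}, \\
\left\| w^{2/k} (v - w) \right\|_{L^{\frac{4(2k^2+5k+3)}{4k^2+12k+7}}}
&\le \|w\|_{L^p}^{2/k} \|v - w\|_{L^p}.
\end{aligned}
\]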


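Doing the computations in each case $k = 1$ or $k = 2$, we have that
\[
\begin{aligned}
\left\| D^{1/2} (v^3 - w^3) \right\|_{L^{10/7}} &\le \left\| D^{1/2} (v - w) \right\|_{L^{10/3}} \left( \|v\|_{L^5}^2 + \|w\|_{L^5}^2 \right) \\
&\quad + \|v - w\|_{L^5} \left( \left\| D^{1/2} v \right\|_{L^{10/3}} + \left\| D^{1/2} w \right\|_{L^{10/3}} \right) \left( \|v\|_{L^5} + \|w\|_{L^5} \right)
\end{aligned}
\]
and
\[
\begin{aligned}
\left\| D^{1/2} (v^2 - w^2) \right\|_{L^{14/9}} &= \left\| D^{1/2} ((v - w)(v + w)) \right\|_{L^{14/9}} \\
&\le C \left\| D^{1/2} (v - w) \right\|_{L^{14/5}} \left( \|v\|_{L^{7/2}} + \|w\|_{L^{7/2}} \right) + C \|v - w\|_{L^{7/2}} \left( \left\| D^{1/2} v \right\|_{L^{14/5}} + \left\| D^{1/2} w \right\|_{L^{14/5}} \right),
\end{aligned}
\]
so that in both cases
\[
\begin{aligned}
\left\| D^{1/2} (v^{1+2/k} - w^{1+2/k}) \right\|_{L^{\frac{2(2k+3)}{2k+5}}}
&\le C \left( \|v\|_{L^{2+3/k}} + \|w\|_{L^{2+3/k}} \right)^{2/k - 1} \Big( \left( \|v\|_{L^{2+3/k}} + \|w\|_{L^{2+3/k}} \right) \left\| D^{1/2} (v - w) \right\|_{L^{\frac{2(2k+3)}{2k+1}}} \\
&\qquad + \left( \left\| D^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}} + \left\| D^{1/2} w \right\|_{L^{\frac{2(2k+3)}{2k+1}}} \right) \|v - w\|_{L^{2+3/k}} \Big).
\end{aligned}
\]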

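Here, the assumption $k \le 2$ is crucially needed. Finally observe that
\[
|\theta v + (1 - \theta) w| \le |v| + |w|, \qquad |D^{1/2} (\theta v + (1 - \theta) w)| \le |D^{1/2} v| + |D^{1/2} w|.
\]
We can now summarize these computations, and using (11) and (12), we obtain the space time difference estimate (up to a change in the function $\omega$, which now depends on $h$, $h'$ and $h''$, but not on $I$ or $v$):
\[
\begin{aligned}
&\left\| D^{1/2} \left( v^{1+2/k} h(r^k v) - w^{1+2/k} h(r^k w) \right) \right\|_{L^{\frac{2(2k+3)}{2k+5}}_{t \in I, r}} \\
&\quad \le \left( \omega\big( \|v\|_{L^\infty_{t \in I} \dot{H}^1_r} \big) + \omega\big( \|w\|_{L^\infty_{t \in I} \dot{H}^1_r} \big) \right)
\times \left( \|v\|_{L^{2+3/k}_{t \in I, r}}^{2/k - 1} + \|w\|_{L^{2+3/k}_{t \in I, r}}^{2/k - 1} \right) \\
&\qquad \times \Big( \big( \|v\|_{L^{2+3/k}_{t \in I, r}} + \|w\|_{L^{2+3/k}_{t \in I, r}} \big) \left\| D^{1/2} (v - w) \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}}
+ \big( \left\| D^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}} + \left\| D^{1/2} w \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}} \big) \|v - w\|_{L^{2+3/k}_{t \in I, r}} \Big).
\end{aligned}
\]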

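Given $a, b, A \in \mathbb{R}^+$ and $I$ a time interval, introduce
\[
B(a, b, A, I) = \left\{ v \,\middle|\, \|v\|_{L^{2+3/k}_{t \in I, r}} \le a,\ \left\| D^{1/2} v \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}} \le b,\ \|v\|_{C(t \in I, \dot{H}^1_r)} \le 2CA \right\}.
\]
Hence for $v \in B(a, b, A, I)$, we have
\[
\begin{aligned}
\|\Phi(v)\|_{L^{2+3/k}_{t \in I, r}} &\le \|U(t)(v_0, v_1)\|_{L^{2+3/k}_{t \in I, r}} + \omega(2CA)\, a^{2/k} b, \\
\left\| D^{1/2} \Phi(v) \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}} &\le \left\| D^{1/2} U(t)(v_0, v_1) \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}} + \omega(2CA)\, a^{2/k} b, \\
\|\Phi(v)\|_{C(t \in I, \dot{H}^1)} &\le \|(v_0, v_1)\|_{\dot{H}^1 \times L^2} + \omega(2CA)\, a^{2/k} b, \\
\|\Phi(v) - \Phi(w)\|_{N(I)} &\le 2\omega(2CA)\, a^{2/k - 1} b \left( \left\| D^{1/2} (v - w) \right\|_{L^{\frac{2(2k+3)}{2k+1}}_{t \in I, r}} + \|v - w\|_{L^{2+3/k}_{t \in I, r}} \right).
\end{aligned}
\]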


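Case $k = 1$. We compute $2 + 3/k = 5$ and $\frac{2(2k+3)}{2k+1} = 10/3$. Given $A$, set $b = 2CA$ and $\delta_0(A) = \min\left(1, \frac{1}{C}, \frac{1}{8CA\omega(2CA)}\right)$. Then for $(v_0, v_1)$ such that $\|(v_0, v_1)\|_{\dot{H}^1 \times L^2} \le A$ and $\|U(t)(v_0, v_1)\|_{L^5_{t \in I, r}} = \eta \le \delta_0(A)$, set $a = 2\eta$. Notice that the Strichartz estimate gives
\[
\left\| D^{1/2} U(t)(v_0, v_1) \right\|_{L^{10/3}_{t \in I, r}} \le CA.
\]
Our relations now are written (the main point is $2/k - 1 = 1 > 0$):
\[
\begin{aligned}
\|\Phi(v)\|_{L^5_{t \in I, r}} &\le \frac{a}{2} + \omega(2CA)(2\delta_0 a)(2CA) \le a, \\
\left\| D^{1/2} \Phi(v) \right\|_{L^{10/3}_{t \in I, r}} &\le CA + \omega(2CA)(2\delta_0 a)(2CA) \le 2CA, \\
\|\Phi(v)\|_{C(t \in I, \dot{H}^1)} &\le A + \omega(2CA)(2\delta_0 a)(2CA) \le 2A, \\
\|\Phi(v) - \Phi(w)\|_{N(I)} &\le \frac{1}{2} \left( \left\| D^{1/2} (v - w) \right\|_{L^{10/3}_{t \in I, r}} + \|v - w\|_{L^5_{t \in I, r}} \right).
\end{aligned}
\]
Hence $\Phi : B(a, 2CA, A, I) \to B(a, 2CA, A, I)$ is a well defined $1/2$-Lipschitz map, so that $\Phi$ has a unique fixed point, which is our solution.

Case $k = 2$. We compute $2 + 3/k = 7/2$, $\frac{2(2k+3)}{2k+1} = 14/5$ and $\frac{2(2k+3)}{2k+5} = 14/9$. In this case $2/k - 1 = 0$, so that the procedure used in the case $k = 1$ no longer applies (it is the same problem as for the energy critical wave equation in dimension 6). However, we still have a solution on an interval $I$ where both quantities $\|U(t)(v_0, v_1)\|_{L^{7/2}_{t \in I, r}}$ and $\left\| D^{1/2} U(t)(v_0, v_1) \right\|_{L^{14/5}_{t \in I, r}}$ are small.

Indeed, given $A$, set $\delta_1(A) = \min\left(1, \frac{1}{C}, \frac{1}{8\omega(2CA)}\right)$. For $(v_0, v_1)$ such that $\|(v_0, v_1)\|_{\dot{H}^1 \times L^2} \le A$, $\|U(t)(v_0, v_1)\|_{L^{7/2}_{t \in I, r}} = \eta \le \delta_1(A)$, and $\left\| D^{1/2} U(t)(v_0, v_1) \right\|_{L^{14/5}_{t \in I, r}} = \eta' \le \delta_1(A)$, we set $a = 2\eta$ and $b = 2\eta'$. Then we have
\[
\begin{aligned}
\|\Phi(v)\|_{L^{7/2}_{t \in I, r}} &\le \frac{a}{2} + \omega(2CA)\, a\, (2\delta_1(A)) \le a, \\
\left\| D^{1/2} \Phi(v) \right\|_{L^{14/5}_{t \in I, r}} &\le \frac{b}{2} + \omega(2CA)(2\delta_1(A))\, b \le b, \\
\|\Phi(v)\|_{C(t \in I, \dot{H}^1)} &\le A + \omega(2CA)(2\delta_1(A))^2 \le 2A, \\
\|\Phi(v) - \Phi(w)\|_{N(I)} &\le \frac{1}{2} \left( \left\| D^{1/2} (v - w) \right\|_{L^{14/5}_{t \in I, r}} + \|v - w\|_{L^{7/2}_{t \in I, r}} \right).
\end{aligned}
\]
Hence $\Phi : B(a, b, A, I) \to B(a, b, A, I)$ has a unique fixed point. We just proved the following

Claim. Let $A > 0$. There exists $\delta_1(A) > 0$ such that for $(v_0, v_1)$ with $\|(v_0, v_1)\|_{\dot{H}^1 \times L^2} \le A$, and $I$ such that
\[
\|U(t)(v_0, v_1)\|_{L^{7/2}_{t \in I, r}} = \eta \le \delta_1(A) \quad \text{and} \quad \left\| D^{1/2} U(t)(v_0, v_1) \right\|_{L^{14/5}_{t \in I, r}} = \eta' \le \delta_1(A),
\]
there exists a unique solution $v(t)$ to (9) satisfying
\[
\|(v, v_t)\|_{L^\infty_{t \in I}(\dot{H}^1 \times L^2)} \le 2A, \qquad \|v\|_{L^{7/2}_{t \in I, r}} \le 2\eta, \qquad \left\| D^{1/2} v \right\|_{L^{14/5}_{t \in I, r}} \le 2\eta'.
\]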


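Let us now do a small computation. Given $h$, $n \in \mathbb{N}$ and $0 = t_0 < t_1 < \ldots < t_n = T$ (with $T \in (0, \infty]$), we have for $i = 0, \ldots, n$,
\[
\begin{aligned}
&\left\| \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}}\, \chi(s)\,ds \right\|_{N(t_i, t_{i+1})} \\
&\quad \le \sum_{j=0}^{i-1} \left\| \int_{t_j}^{t_{j+1}} \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}}\, \chi(s)\,ds \right\|_{N(t_i, t_{i+1})}
+ \left\| \int_{t_i}^{t} \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}}\, \chi(s)\,ds \right\|_{N(t_i, t_{i+1})} \\
&\quad \le \sum_{j=0}^{i-1} \left\| \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}} \left( \chi(s) \mathbb{1}_{s \in [t_j, t_{j+1}]} \right) ds \right\|_{N(t_i, t_{i+1})}
+ \left\| \int_{t_i}^{t} \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}} \left( \chi(s) \mathbb{1}_{s \in [t_i, t_{i+1}]} \right) ds \right\|_{N(t_i, t_{i+1})} \\
&\quad \le \sum_{j=0}^{i-1} \left\| \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}} \left( \chi(s) \mathbb{1}_{s \in [t_j, t_{j+1}]} \right) ds \right\|_{N(\mathbb{R})}
+ \left\| \int_0^t \frac{\sin((t - s)\sqrt{-\Delta})}{\sqrt{-\Delta}} \left( \chi(s) \mathbb{1}_{s \in [t_i, t_{i+1}]} \right) ds \right\|_{N(\mathbb{R})} \\
&\quad \le C \sum_{j=0}^{i} \left\| D_x^{1/2} \chi(s) \mathbb{1}_{s \in [t_j, t_{j+1}]} \right\|_{L^{14/9}_{s,x}}
\le C \sum_{j=0}^{i} \left\| D_x^{1/2} \chi \right\|_{L^{14/9}_{t \in [t_j, t_{j+1}]} L^{14/9}_x}.
\end{aligned} \tag{14}
\]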

Let us now complete the case $k = 2$. Let $A > 0$, and define $n = n(A)$ to be an integer such that $n \ge 4CA\omega(2CA)$, so that $2CA\omega(2CA)/n \le 1/2$, and $\delta_0(A) = \delta_1(A)/2^{n+2}$ (recall $\delta_1(A) = \min\left(1, \frac{1}{C}, \frac{1}{8CA\omega(2CA)}\right)$). Let $(v_0, v_1)$ be such that $\|(v_0, v_1)\|_{\dot{H}^1 \times L^2} \le A$ and, for $I = (T_0, T_1)$ an interval (possibly with infinite endpoints),
\[
\|U(t)(v_0, v_1)\|_{L^{7/2}_{t \in I, r}} = \eta \le \delta_0(A).
\]
From the Strichartz estimate, we also have
\[
\left\| D^{1/2} U(t)(v_0, v_1) \right\|_{L^{14/5}_{t \in I, r}} \le CA.
\]
From $(v_0, v_1)$, we have a solution $v$ defined on an interval $\tilde{I} = [0, T)$. We choose $J = (T_0', T_1') \subset \tilde{I}$ to be maximal such that
\[
\|v\|_{L^{7/2}_{t \in J, r}} \le \delta_1(A), \qquad \left\| D^{1/2} v \right\|_{L^{14/5}_{t \in J, r}} \le 2CA, \qquad \|v\|_{C(J, \dot{H}^1)} \le 2CA.
\]
From the claim, we can choose $J$ non empty. Let $T_0' = t_0 < t_1 < \ldots < t_n = T_1'$ be such that
\[
\forall i \in [[0, n - 1]], \quad \left\| D^{1/2} v \right\|_{L^{14/5}_{t \in [t_i, t_{i+1}], r}} \le \frac{1}{n}\, 2CA \le \frac{1}{2} \frac{1}{\omega(2CA)}.
\]


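From (13) and (14), we obtain
\[
\begin{aligned}
\|v\|_{N(J)} &\le CA + \omega(2CA) \|v\|_{L^{7/2}_{t \in J, r}} \|v\|_{N(J)}, \\
\|v\|_{L^{7/2}_{t \in [t_i, t_{i+1}], r}} &\le \|U(t)(v_0, v_1)\|_{L^{7/2}_{t \in [t_i, t_{i+1}], r}} + \omega(2CA) \sum_{j=0}^{i} \|v\|_{L^{7/2}_{t \in [t_j, t_{j+1}], r}} \left\| D^{1/2} v \right\|_{L^{14/5}_{t \in [t_j, t_{j+1}], r}}.
\end{aligned}
\]
Let us denote $a_i = \|v\|_{L^{7/2}_{t \in [t_i, t_{i+1}], r}}$ for $i \in [[0, n - 1]]$. Then we have
\[
\begin{aligned}
\|v\|_{N(J)} &\le CA + \frac{1}{4} \left\| D^{1/2} v \right\|_{L^{14/5}_{t \in I, r}} \le \frac{3}{2} CA < 2CA, \\
a_i &\le \eta + \omega(2CA) \sum_{j=0}^{i} \frac{a_j}{2\omega(2CA)} \quad \text{or equivalently} \quad a_i \le 2\eta + \sum_{j=0}^{i-1} a_j.
\end{aligned} \tag{15}
\]
By recurrence, we deduce that $a_i \le 2^{i+1} \eta$. In particular,
\[
\|v\|_{L^{7/2}_{t \in J} L^{7/2}_r} = \sum_{i=0}^{n-1} a_i \le 2^{n+1} \eta \le 2^{n+1} \delta_0(A) < \delta_1(A). \tag{16}
\]
Hence, from (15) and (16) and a standard continuity argument, we deduce that $J = \tilde{I} = I$, $\|v\|_{N(I)} \le 2CA$ and $\|v\|_{L^{7/2}_{t \in I, r}} \le 2^{n+1} \eta = c(A) \eta$.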
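The step from the recurrence in (15) to the bounds a_i ≤ 2^{i+1}η and (16) can be illustrated by iterating the extremal case a_i = 2η + Σ_{j<i} a_j, which dominates any sequence satisfying the inequality. The sketch below does this with exact integer arithmetic; the function name and the sample values η = 1, n = 6 are choices of this example only.

```python
def worst_case_sequence(eta, n):
    """Iterate a_i = 2*eta + sum_{j<i} a_j, the equality case of the
    recurrence a_i <= 2*eta + sum_{j<i} a_j, for i = 0, ..., n-1."""
    a = []
    for _ in range(n):
        a.append(2 * eta + sum(a))
    return a

eta, n = 1, 6
a = worst_case_sequence(eta, n)

# Termwise bound a_i <= 2^(i+1) * eta (an equality in the extremal case).
termwise_ok = all(a[i] <= 2 ** (i + 1) * eta for i in range(n))

# Summed bound used in (16): sum_i a_i <= 2^(n+1) * eta.
total_ok = sum(a) <= 2 ** (n + 1) * eta
```

In the extremal case the sum equals (2^{n+1} − 2)η, so the bound 2^{n+1}η in (16) holds with a small margin; the choice δ₀(A) = δ₁(A)/2^{n+2} then leaves a further factor 2 of room.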

Going back to $u$, we obtain the first part of Theorem 2, in both cases $k = 1$ and $k = 2$ (conservation of energy is clear from the construction). Let us now prove the consequence mentioned in Theorem 2. Given $u$, we associate $v(t, r) = u(t, r)/r^k$: $v$ is defined on $\mathbb{R}^+$, and satisfies (9). If we denote $A = \|(u, u_t)\|_{L^\infty_t(H \times L^2)}$, then there exists $T$ large enough such that $\|u\|_{S([T, \infty))} \le \delta_0(A)$. From the previous part, we have that
\[
\|v\|_{N[T, \infty)} \le 2CA, \qquad \|v\|_{L^{2+3/k}_{t \in [T, \infty), r}} \le \delta_0(A).
\]
Denote $\nu(t) = U(-t) v(t)$. Then
\[
\nu(t) - \nu(s) = \int_s^t U(-\tau)\, v^{1+2/k}(\tau)\, h(r^k v)(\tau)\,d\tau.
\]
Hence, for $t \ge s \ge T$, from the Strichartz estimate and (13), we have
\[
\begin{aligned}
\|\nu(t) - \nu(s)\|_{\dot{H}^1} + \|\nu_t(t) - \nu_t(s)\|_{L^2}
&\le \|\nu(\tau) - \nu(s)\|_{N(\tau \in [s, t])} \\
&\le \left\| D^{1/2} \left( v^{1+2/k}(\tau)\, h(r^k v)(\tau) \right) \right\|_{L^{\frac{2(2k+3)}{2k+5}}_{\tau \in [s, t], r}} \\
&\le \omega(2CA) \|v\|_{L^{2+3/k}_{\tau \in [s, t], r}}^{2/k} (2CA) \to 0 \quad \text{as } s, t \to +\infty.
\end{aligned}
\]
This means that $(\nu(t), \nu_t(t))$ is a Cauchy sequence in $\dot{H}^1 \times L^2$, hence converges to some $(v^+, v_t^+) \in \dot{H}^1 \times L^2$. Going back to $u$, using Lemma 4 and Remark (6), we obtain the second part of Theorem 2.


4. Rigidity Property

Recall that $g$ is such that $g(0) = 0$, $g'(0) = k \in \mathbb{N}^*$, with $C^*$ the smallest positive real $\rho$ such that $g(C^*) = 0$, $f = g g'$ and $G(\rho) = \int_0^\rho |g|(\rho')\,d\rho'$; $D^* \in [0, C^*]$ is such that $G(D^*) = G(C^*)/2$.

Introduce the energy density $e(u, v) = v^2 + u_r^2 + \frac{g^2(u)}{r^2}$ and $p(u) = u_r^2 + \frac{g^2(u)}{r^2}$. Denote
\[
E(u, v) = \int e(u, v)\, r\,dr, \qquad E_a^b(u, v) = \int_a^b e(u, v)\, r\,dr,
\]
and similarly for a single function $u$,
\[
E(u) = \int p(u)\, r\,dr, \qquad E_a^b(u) = \int_a^b p(u)\, r\,dr.
\]
We will also need the function $d(\rho) = \rho f(\rho)$, which is linked to the virial identity, and
\[
F(u) = \int \left( u_r^2 + \frac{d(u)}{r^2} \right) r\,dr.
\]
The following variational lemma is at the heart of the rigidity theorem. Here is the only point where we use Assumption (A3), which ensures that $g'(\rho) \ge 0$ for $\rho \in [-D^*, D^*]$.

Lemma 7. There exist $c > 0$ and $\delta \in (0, E(Q))$ such that for all $u$ such that $(u, 0) \in \mathcal{V}(\delta)$, we have
\[
c E(u) \le F(u) \le \frac{1}{c} E(u).
\]

Proof. Fix $\delta < E(Q)$. $g^2(u) \ge \omega(\delta) u^2$ for some function $\omega : [0, E(Q)) \to \mathbb{R}^{+*}$, and $|d(x)| \le \|g'\|^2_{L^\infty(-C^*, C^*)} x^2$ for $|x| < C^*$, so that
\[
F(u) \le \left( 1 + \frac{\|g'\|^2_{L^\infty(-C^*, C^*)}}{\omega(\delta)} \right) E(u),
\]
which is the upper bound. For the lower bound, we need Assumption (A3) on $g$. Hence on $[-D^*, D^*]$, $d(x) \ge 0$, and on $[0, D^*]$, $d(-x) \ge d(x)$. Denote $A = \int_0^{D^*} \sqrt{d(x)}\,dx > 0$. One easily sees that for a function $v : [a, b] \to [-D^*, D^*]$ such that $v(a) = 0$, $|v(b)| = D^*$, then
\[
\int_a^b \left( v_r^2 + \frac{d(v)}{r^2} \right) r\,dr \ge 2 \int_a^b \left| v_r \sqrt{d(v(r))} \right| dr \ge 2 \int_0^{D^*} \sqrt{d(x)}\,dx = 2A.
\]
In the same way,
\[
\int_a^b \left( v_r^2 + \frac{g^2(v)}{r^2} \right) r\,dr \ge 2G(D^*) = G(C^*).
\]
Let $\delta > 0$ to be determined later and $u$ be such that $(u, 0) \in \mathcal{V}(\delta)$. Recall that $\|u\|_{L^\infty} \le K(E(Q) + \delta) < C^*$ (Lemma 2), and hence $g(u) \ge \omega(E(Q) + \delta)|u|$.


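Assume first $\|u\|_{L^\infty} > D^*$. Then let $A_1, A_2$ be such that $u \in [-D^*, D^*]$ on both intervals $[0, A_1]$ and $[A_2, \infty)$ and $|u(A_1)| = |u(A_2)| = D^*$. Then
\[
\int \left( u_r^2 + \frac{d(u)}{r^2} \right) r\,dr = \int_0^{A_1} + \int_{A_1}^{A_2} + \int_{A_2}^{\infty} \ge 4A + \int_{A_1}^{A_2} \left( u_r^2 + \frac{d(u)}{r^2} \right) r\,dr.
\]
Doing the same with the energy density, one gets
\[
\int_0^{A_1} \left( u_r^2 + \frac{g^2(u)}{r^2} \right) r\,dr + \int_{A_2}^{\infty} \left( u_r^2 + \frac{g^2(u)}{r^2} \right) r\,dr \ge 4G(D^*) = 2G(C^*) = E(Q).
\]
Hence $E_{A_1}^{A_2}(u) < \delta$. Now, we have
\[
|d(u)| = |u|\,|g'(u)|\,|g(u)| \le \|g'\|_{L^\infty} |u|\, g(u) \le \frac{\|g'\|_{L^\infty}}{\omega(E(Q) + \delta)}\, g^2(u),
\]
so that
\[
\int_{A_1}^{A_2} \left( u_r^2 + \frac{d(u)}{r^2} \right) r\,dr \ge \int_{A_1}^{A_2} \left( u_r^2 - \frac{\|g'\|_{L^\infty}}{\omega(E(Q) + \delta)} \frac{g^2(u)}{r^2} \right) r\,dr \ge - \frac{\|g'\|_{L^\infty}}{\omega(E(Q) + \delta)}\, \delta.
\]
Finally, choosing $\delta > 0$ small enough so that $\frac{\|g'\|_{L^\infty}}{\omega(E(Q) + \delta)}\, \delta \le 2A$, we get
\[
\int \left( u_r^2 + \frac{d(u)}{r^2} \right) r\,dr \ge 4A - \frac{\|g'\|_{L^\infty}}{\omega(E(Q) + \delta)}\, \delta \ge 2A \ge \frac{A}{E(Q)} E(u).
\]
This gives the lower bound with constant $\frac{A}{E(Q)}$. Assume now that $\|u\|_{L^\infty} \le D^*$. Then $d(u) \ge 0$. As $f(x) \sim k^2 x$ as $x \to 0$, let $D > 0$ be such that $|f(x)| \ge k^2 |x|/2$ on the interval $[-D, D]$. If $\|u\|_{L^\infty} \le D$, then of course
\[
F(u) \ge \int u_r^2\, r\,dr + \frac{k^2}{2\|g'\|^2_{L^\infty}} \int \frac{g^2(u)}{r^2}\, r\,dr \ge \min\left( 1, \frac{k^2}{2\|g'\|^2_{L^\infty}} \right) E(u).
\]
Otherwise, arguing as before, $\|u\|_{L^\infty} \in [D, D^*]$ and we see that $F(u) \ge 4 \int_0^D \sqrt{d}$, so that (as $E(u) < E(Q) + \delta \le 2E(Q)$)
\[
F(u) \ge \frac{2 \int_0^D \sqrt{d}}{E(Q)}\, E(u).
\]
Choosing $\delta > 0$ small enough and $c = \min\left( 2\left( \int_0^D \sqrt{d} \right)/E(Q),\ A/E(Q),\ k^2/(2\|g'\|^2_{L^\infty}),\ 1 \right)$ ends the proof.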

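Let $\varphi$ be such that $\varphi(r) = 1$ if $r \le 1$, $\varphi(r) = 0$ if $r \ge 2$, and $\varphi(r) \in [0, 1]$. Denote $\varphi_R(r) = \varphi(r/R)$. In the notation $O$, constants are absolute (do not depend on $R$ or $t$ or $u$).

Lemma 8. Let $(u, u_t) \in \mathcal{V}(\delta)$ be a solution to (1). One has
\[
\begin{aligned}
\frac{d}{dt} \int u_t u_r\, r^2 \varphi_R(r)\,dr &= - \int u_t^2\, r\,dr + O(E_R^\infty(u, u_t)), \\
\frac{d}{dt} \int u u_t\, r \varphi_R(r)\,dr &= \int \left( u_t^2 - u_r^2 - \frac{u f(u)}{r^2} \right) r\,dr + O(E_R^\infty(u, u_t)).
\end{aligned}
\]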


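Remark 4. For the $O$, we can consider the rest of the energy $E_R^\infty$ or equivalently the tail in $H \times L^2$
\[
\tau(R, u, u_t) = \int_R^\infty \left( u_t^2 + u_r^2 + \frac{u^2}{r^2} \right) r\,dr.
\]

Proof. One computes
\[
\begin{aligned}
\frac{d}{dt} \int u_t u_r\, r^2 \varphi_R(r)\,dr
&= \int u_{tt} u_r\, r^2 \varphi_R(r)\,dr + \int u_t u_{rt}\, r^2 \varphi_R(r)\,dr \\
&= \int \left( u_{rr} + \frac{1}{r} u_r - \frac{f(u)}{r^2} \right) u_r\, r^2 \varphi_R(r)\,dr - \frac{1}{2} \int u_t^2 \left( 2r \varphi_R(r) + r^2 \varphi_R'(r) \right) dr \\
&= - \frac{1}{2} \int u_t^2 \left( 2r \varphi_R(r) + r^2 \varphi_R'(r) \right) dr + \int u_r^2 \left( r \varphi_R(r) - \frac{1}{2} (r^2 \varphi_R(r))' \right) dr + \frac{1}{2} \int g^2(u) \varphi_R'(r)\,dr \\
&= - \int u_t^2\, r \varphi_R(r)\,dr - \frac{1}{2} \int \left( u_t^2 + u_r^2 - \frac{g^2(u)}{r^2} \right) r^2 \varphi_R'(r)\,dr.
\end{aligned}
\]
Now notice that
\[
\begin{aligned}
\left| \int u_t^2\, r (1 - \varphi_R(r))\,dr - \frac{1}{2} \int \left( u_t^2 + u_r^2 - \frac{g^2(u)}{r^2} \right) r^2 \varphi_R'(r)\,dr \right|
&\le \int e(u, u_t)(1 - \varphi_R(r))\, r\,dr + \int e(u, u_t)\, r^2 |\varphi_R'(r)|\,dr \\
&\le E_R^\infty(u, u_t) + \frac{1}{R} \int e(u, u_t)\, r^2 |\varphi'(r/R)|\,dr \\
&\le (1 + 2\|\varphi'\|_{L^\infty}) E_R^\infty(u, u_t).
\end{aligned}
\]
From this, we immediately deduce
\[
\frac{d}{dt} \int u_t u_r\, r^2 \varphi_R(r)\,dr = - \int u_t^2\, r\,dr + O(E_R^\infty(u, u_t)).
\]
In the same way,
\[
\begin{aligned}
\frac{d}{dt} \int u u_t\, r \varphi_R(r)\,dr
&= \int u_t^2\, r \varphi_R(r)\,dr + \int u u_{tt}\, r \varphi_R(r)\,dr \\
&= \int u_t^2\, r \varphi_R(r)\,dr + \int u \left( u_{rr} + \frac{1}{r} u_r - \frac{f(u)}{r^2} \right) r \varphi_R(r)\,dr \\
&= \int \left( u_t^2 - u_r^2 - \frac{u f(u)}{r^2} \right) r \varphi_R(r)\,dr + \frac{1}{2} \int u^2 (r \varphi_R(r))''\,dr - \frac{1}{2} \int u^2 \varphi_R'(r)\,dr.
\end{aligned}
\]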


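Then similarly,
\[
\begin{aligned}
&\left| - \int \left( u_t^2 - u_r^2 - \frac{u f(u)}{r^2} \right) r (1 - \varphi_R(r))\,dr + \frac{1}{2} \int u^2 (r \varphi_R(r))''\,dr - \frac{1}{2} \int u^2 \varphi_R'(r)\,dr \right| \\
&\quad \le \int \left| u_t^2 - u_r^2 - \frac{u f(u)}{r^2} \right| (1 - \varphi_R(r))\, r\,dr + \frac{1}{2} \int \frac{u^2}{r^2} \left| r \varphi_R'(r) + r^2 \varphi_R''(r) \right| r\,dr \\
&\quad \le C \int e(u, u_t)(1 - \varphi_R(r))\, r\,dr + C \int \frac{g^2(u)}{r^2} \left| \frac{r}{R} \varphi'(r/R) + \frac{r^2}{R^2} \varphi''(r/R) \right| r\,dr \\
&\quad \le C E_R^\infty(u, u_t) + C (4\|\varphi''\|_{L^\infty} + 2\|\varphi'\|_{L^\infty}) E_R^\infty(u, u_t).
\end{aligned}
\]
(The bounds on the third line come respectively from the pointwise bounds $|u f(u)| \le C g^2(u)$ and $u^2 \le C g^2(u)$, which hold according to the proof of Lemma 7.)

Theorem 3 (Rigidity property). Let $(u_0, u_1) \in \mathcal{V}(\delta)$, and denote by $u(t)$ the associated solution. Suppose that for all $t \ge 0$, there exists $\lambda(t) \ge A_0 > 0$ such that
\[
K = \left\{ \left( u\left( t, \frac{r}{\lambda(t)} \right), \frac{1}{\lambda(t)} u_t\left( t, \frac{r}{\lambda(t)} \right) \right) \,\middle|\, t \in \mathbb{R}^+ \right\} \quad \text{is precompact in } H \times L^2.
\]
Then $u \equiv 0$.

Proof. Recall that $u$ is global due to Proposition 1. As $K$ is precompact and $\lambda(t) \ge A_0 > 0$, for all $\varepsilon > 0$, there exists $R(\varepsilon)$ such that
\[
\forall t \ge 0, \quad E^\infty_{R(\varepsilon)}(u, u_t) < \varepsilon.
\]
This means that
\[
\lim_{R \to \infty} \sup_{t \ge 0} E_R^\infty(u, u_t) = 0.
\]
Due to Lemma 8 and 7, we have
\[
\begin{aligned}
\frac{d}{dt} \left( \int u_t u_r\, r^2 \varphi_R(r)\,dr + \frac{1}{2} \int u u_t\, r \varphi_R(r)\,dr \right)
&= - \frac{1}{2} \int \left( u_t^2 + u_r^2 + \frac{u f(u)}{r^2} \right) r\,dr + O(E_R^\infty(u, u_t)) \\
&\le - \frac{c}{2} E(u, u_t) + O(E_R^\infty(u, u_t)).
\end{aligned}
\]
Fix $R$ large enough so that $\sup_{t \ge 0} O(E_R^\infty(u, u_t)) \le \frac{c E(u, u_t)}{4}$. Then by integration between $\tau = 0$ and $\tau = t$ and conservation of energy:
\[
\int u_t u_r\, r^2 \varphi_R(r)\,dr + \frac{1}{2} \int u u_t\, r \varphi_R(r)\,dr \le - \frac{c}{4} E(u, u_t)\, t + C_0.
\]
However, from finiteness of energy and $u^2 \le C g^2(u)$, we have for all $t$,
\[
\begin{aligned}
\left| \int u_t u_r\, r^2 \varphi_R(r)\,dr + \frac{1}{2} \int u u_t\, r \varphi_R(r)\,dr \right|
&\le \frac{1}{2} \int (u_t^2 + u_r^2)\, r^2 \varphi_R(r)\,dr + \frac{1}{4} \int r^2 \varphi_R(r) \left( u_t^2 + C \frac{g^2(u)}{r^2} \right) dr \\
&\le R E(u, u_t) + \frac{1}{2C} R E(u, u_t),
\end{aligned}
\]
so that this quantity is bounded, hence $t \le 4(R + R/(2C) + C_0)/c$. This is a contradiction with the fact that $u$ is global in time.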


5. Proofs of Theorem 1 and Corollary 1

Proof of Theorem 1. The proof follows the general framework of Kenig and Merle [6,5]. For a detailed exposition of the various steps and lemmas, we refer to [6, Sect. 4]. Let $\delta \in (0, E(Q)]$ as in Lemma 7. All the wave maps considered below in the proof will have initial data in $\mathcal{V}(\delta)$; from Proposition 1, they are all defined globally in time. Hence we are left to show that all wave maps $u$ with initial data $(u_0, u_1) \in \mathcal{V}(\delta)$ scatter at $t \to \pm\infty$. From Theorem 2, we only need to show that $\|u\|_{S(\mathbb{R})} < \infty$. We consider the critical energy
\[
E_c = \sup\{ E \in [0, E(Q) + \delta] \mid \forall (u_0, u_1) \in \mathcal{V}(\delta),\ E(u_0, u_1) < E \implies \|u(t)\|_{S(\mathbb{R})} < \infty \}.
\]
Theorem 1 is the assertion $E_c = E(Q) + \delta$. Assume this is not the case, namely $E_c < E(Q) + \delta$, and we will reach a contradiction; this will complete the proof of Theorem 1. Due to Theorem 2, notice that
\[
E_c \ge \delta_0 \overset{\mathrm{def}}{=} \delta_0(E(Q) + \delta) > 0. \tag{17}
\]
The compensated compactness procedure of Kenig and Merle in [5] provides us with a critical element $u^c$ (in the case $E_c < E(Q) + \delta$):

Proposition 2. There exists $(u_0^c, u_1^c) \in H \times L^2$, satisfying $(u_0^c, u_1^c) \in \mathcal{V}(\delta)$, $E(u_0^c, u_1^c) = E_c$, and if we denote $u^c(t)$ the associated solution to Problem (1), $u^c(t)$ is global and $\|u^c\|_{S(\mathbb{R})} = \|u^c\|_{S(\mathbb{R}^+)} = \infty$.

Moreover, a critical element enjoys the following properties:

Proposition 3. Let $u^c$ be as in Proposition 2. Then there exists a continuous function $\lambda : \mathbb{R}^+ \to \mathbb{R}^{+*}$ such that the set
\[
K = \left\{ v(t) \in H \times L^2 \,\middle|\, v(t, r) = \left( u^c\left( t, \frac{r}{\lambda(t)} \right), \frac{1}{\lambda(t)} u_t^c\left( t, \frac{r}{\lambda(t)} \right) \right) \right\}
\]
has compact closure in $H \times L^2$. Up to considering a different critical element, we can furthermore assume that $\lambda(t) \ge A_0$ for all $t \ge 0$, for some $A_0 > 0$.

We thus consider $u^c$ given by Propositions 2 and 3. From Theorem 3, we deduce that $(u^c, u_t^c) = (0, 0)$, which is a contradiction with $E(u^c, u_t^c) = E_c > 0$ (in view of (17)). Hence $E_c = E(Q) + \delta$. This completes the proof of Theorem 1.

For the convenience of the reader, we sketch the proof of Proposition 2 and 3; a complete proof can be derived (with minor modifications) from Proposition 4.1, 4.2 and Lemma 4.9 of [6, Sect. 4]. We consider a sequence of (global in time) wave maps $u_n$ and their initial data $(u_{0n}, u_{1n}) \in \mathcal{V}(\delta)$ such that $\|W(t)(u_{0n}, u_{1n})\|_{S(\mathbb{R})} \ge \delta_0$, $\|u_n\|_{S(I_n)} = \infty$ and $E(u_n, u_{nt}) \to E_c$ (hence $E_c \le E(u_n, u_{nt}) < E(Q) + \delta$). Using the result by Bahouri and Gérard [1] on the operator $W(t)$, we apply the (linear) profile decomposition to the sequence $(u_{0n}, u_{1n})_{n \ge 1}$:


Lemma 9 (Profile decomposition). Let $(u_{0n}, u_{1n})_n$ be a bounded sequence of $H \times L^2$. Then there exist sequences $(V_{0,j}, V_{1,j})_{j \ge 1} \in H \times L^2$ and $(\lambda_{j,n}, t_{j,n}) \in \mathbb{R}^{+*} \times \mathbb{R}$ with
\[
\frac{\lambda_{j,n}}{\lambda_{j',n}} + \frac{\lambda_{j',n}}{\lambda_{j,n}} + \frac{|t_{j,n} - t_{j',n}|}{\lambda_{j,n}} \to \infty \quad \text{as } n \to \infty,
\]
for $j \ne j'$ (orthogonal couple), such that the following holds. Denote $V_j(t) = W(t)(V_{0,j}, V_{1,j})$ the linear profiles, then for all $J \ge 1$, there exist $(w_{0n}^J, w_{1n}^J) \in H \times L^2$ such that (up to a subsequence of $(u_n)$, which we still denote $(u_n)$)
\[
\begin{aligned}
u_{0n}(r) &= \sum_{j=1}^{J} V_j\left( - \frac{t_{j,n}}{\lambda_{j,n}}, \frac{r}{\lambda_{j,n}} \right) + w_{0n}^J(r), \\
u_{1n}(r) &= \sum_{j=1}^{J} \frac{1}{\lambda_{j,n}} V_{j,t}\left( - \frac{t_{j,n}}{\lambda_{j,n}}, \frac{r}{\lambda_{j,n}} \right) + w_{1n}^J(r),
\end{aligned}
\]
with
\[
\begin{aligned}
&\lim_{J \to \infty} \limsup_{n \to \infty} \|W(t)(w_{0n}^J, w_{1n}^J)\|_{S(\mathbb{R})} = 0, \\
&\forall J \ge 1, \quad E(u_{0n}, u_{1n}) = \sum_{j=1}^{J} E\left( V_j\left( - \frac{t_{j,n}}{\lambda_{j,n}} \right), V_{j,t}\left( - \frac{t_{j,n}}{\lambda_{j,n}} \right) \right) + E(w_{0n}^J, w_{1n}^J) + o_{n \to \infty}(1).
\end{aligned}
\]
(Notice there is no shift in the space variable $r$ as we are in a radial setting.)

Then one can prove the following technical lemma. First recall the notion of non-linear profile: given data $(V_0, V_1) \in H \times L^2$ and a sequence $(s_n)$, with $s_n \to \bar{s} \in \mathbb{R}$, it is the unique wave map $U$ defined on a neighbourhood of $\bar{s}$ such that
\[
\|(U(s_n), U_t(s_n)) - W(s_n)(V_0, V_1)\|_{H \times L^2} \to 0 \quad \text{as } n \to \infty.
\]
$U$ exists by virtue of Theorem 2, possibly doing a fixed point at infinity. If $(V_0, V_1) \in \mathcal{V}(\delta)$, $U$ is global in time due to Proposition 1.

Lemma 10. Let $(u_{0n}, u_{1n}) \in H \times L^2$ be such that $E(u_{0n}, u_{1n}) \to E_c$ and $\|W(t)(u_{0n}, u_{1n})\|_{S(\mathbb{R})} \ge \delta_0$. Let $(V_j)_{j \ge 1}$ and $\lambda_{j,n}$, $t_{j,n}$ be as in Lemma 9. Assume that one of the following conditions holds (denote $s_n = -t_{1,n}/\lambda_{1,n}$):
• $\liminf_{n \to \infty} E(V_1(s_n), \lambda_{1,n}^{-1} V_{1t}(s_n)) < E_c$, or
• $\liminf_{n \to \infty} E(V_1(s_n), \lambda_{1,n}^{-1} V_{1t}(s_n)) = E_c$ and that after passing to a subsequence so that $s_n \to \bar{s} \in \mathbb{R}$ and $E(V_1(s_n), \lambda_{1,n}^{-1} V_{1t}(s_n)) \to E_c$, if $U_1$ is the non linear profile associated to $(V_{0,1}, V_{1,1})$ and $(s_n)$, then $U_1$ is global and $\|U_1\|_{S(\mathbb{R})} < \infty$.
Denote $u_n$ the wave map with initial data $(u_{0n}, u_{1n})$. Then after passing to a subsequence, $u_n$ is global and $\|u_n\|_{S(\mathbb{R})} < \infty$.

The proof of this lemma relies on the profile decomposition and a perturbation result which is a by-product of Theorem 2. We refer to [6, Lemma 4.9] for further details. From this, one can prove that all the profiles $(V_{0,j}, V_{1,j})$ associated to $(u_{0n}, u_{1n})_n$ are zero, except for (exactly) one, say $(V_{0,1}, V_{1,1})$, and that $E(V_{0,1}, V_{1,1}) = E_c$. Then denote $s_n = -\frac{t_{1,n}}{\lambda_{1,n}}$ and consider the non-linear profile $U$ associated to $(V_{0,1}, V_{1,1})$ and $(s_n)$


(up to a subsequence such that $s_n$ has a limit in $\mathbb{R}$). Then one can prove that $U$ is global and $\|U\|_{S(\mathbb{R})} = +\infty$: $U(t)$ or $U(-t)$ satisfies the conclusion of Proposition 2.

We now turn to Proposition 3; for the compactness result, we argue by contradiction. Assume that there exist $\eta_0 > 0$ and a sequence $(t_n)_{n \ge 1}$ such that for all $\lambda > 0$, $n \ne n'$,
\[
\left\| \left( U\left( t_n, \frac{r}{\lambda} \right), \frac{1}{\lambda} U_t\left( t_n, \frac{r}{\lambda} \right) \right) - \left( U(t_{n'}, r), U_t(t_{n'}, r) \right) \right\|_{H \times L^2} \ge \eta_0. \tag{18}
\]
Up to considering a subsequence, we can assume that $t_n$ has a limit in $[0, \infty]$; by continuity of the flow, $t_n \to \infty$. Now consider the profile decomposition of the sequence $(U(t_n), U_t(t_n))$. Again using Lemma 10, one can prove that all profiles are zero, except for one. Then one can obtain for this profile a statement similar to (18), and from there, reach a contradiction. Hence there exists $\lambda : \mathbb{R}^+ \to \mathbb{R}^{+*}$ such that the set
\[
K = \left\{ v(t) \in H \times L^2 \,\middle|\, v(t, r) = \left( U\left( t, \frac{r}{\lambda(t)} \right), \frac{1}{\lambda(t)} U_t\left( t, \frac{r}{\lambda(t)} \right) \right) \right\}
\]
has compact closure in $H \times L^2$. It remains to prove that, up to changing the critical element $U$, one can further assume $\lambda(t) \ge A_0 > 0$. Indeed, if it is not the case for $U$, by compactness, there exist $\lambda_n \to 0$ and $t_n \to \infty$ such that $\left( U\left( t_n, \frac{r}{\lambda_n} \right), U_t\left( t_n, \frac{r}{\lambda_n} \right) \right) \to (W_0, W_1)$ in $H \times L^2$. Then one can prove that the wave map $u^c$ with initial data $(W_0, W_1)$ satisfies all the properties of Propositions 2 and 3.

Proof of Corollary 1. Notice that if $(u_0, u_1)$ is such that $E(u_0, u_1) \le E(Q)$ and $(u_0, u_1) \notin \mathcal{V}(\delta)$, then (as $u_0(0) = 0$) $|u_0(\infty)| \ge C^*$, and from the pointwise inequality (5), $|u_0(\infty)| = C^*$, $u_0(r) = \varepsilon Q(\lambda r)$ for some $\lambda > 0$ and $\varepsilon \in \{-1, 1\}$, and $u_1 = 0$. Hence in our case, $(u_0, u_1) \in \mathcal{V}(\delta)$, and the result follows from Theorem 1.

References

1. Bahouri, H., Gérard, P.: High frequency approximation of solutions to critical nonlinear wave equations. Amer. J. Math. 121, 131–175 (1999)
2. Cazenave, T., Shatah, J., Tahvildar-Zadeh, A.S.: Harmonic maps of the hyperbolic space and development of singularities in wave maps and Yang-Mills fields. Ann. Inst. H. Poincaré Phys. Théor. 68(3), 315–349 (1998)
3. Côte, R.: Instability of harmonic maps for the (1+2)-dimensional equivariant Wave map system. Int. Math. Res. Not. 57, 3525–3549 (2005)
4. Gérard, P.: Description du défaut de compacité de l’injection de Sobolev. ESAIM Control Optim. Calc. Var. 3, 213–233 (1998) (electronic)
5. Kenig, C.E., Merle, F.: Global well-posedness, scattering and blow-up for the energy-critical, focusing, non-linear wave equation. To appear, Acta. Math
6. Kenig, C.E., Merle, F.: Global well-posedness, scattering and blow-up for the energy-critical, focusing, non-linear Schrödinger equation in the radial case. Invent. Math. 166(3), 645–675 (2006)
7. Kenig, C.E., Ponce, G., Vega, L.: Well-posedness and scattering result for the generalized Korteweg-De Vries equation via contraction principle. Comm. Pure Appl. Math. 46, 527–620 (1993)
8. Krieger, J., Schlag, W., Tataru, D.: Renormalization and blow up for charge one equivariant critical wave maps. http://arXiv.org/list/math.AP/0610248 (2006)
9. Rodnianski, I., Sterbenz, J.: On the formation of singularities in the critical O(3) sigma-model. math.AP/0605023, 2006
10. Shatah, J., Struwe, M.: Geometric wave equations. Courant Lecture Notes in Mathematics, Vol. 2, New York: New York University / Courant Institute of Mathematical Sciences, 1998
11. Shatah, J., Tahvildar-Zadeh, A.S.: On the Cauchy problem for equivariant wave maps. Comm. Pure Appl. Math. 47(5), 719–754 (1994)


12. Stein, E.M.: Singular integrals and differentiability properties of functions. Princeton Mathematical Series, No. 30, Princeton, N.J: Princeton University Press, 1970
13. Struwe, M.: Equivariant wave maps in two space dimensions. Comm. Pure Appl. Math. 56(7), 815–823 (2003)

Communicated by P. Constantin

Commun. Math. Phys. 284, 227–261 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0520-7

Communications in

Mathematical Physics

The Asymptotic Behavior of the Takhtajan-Zograf Metric

Kunio Obitsu1, Wing-Keung To2, Lin Weng3

1 Department of Mathematics and Computer Science, Faculty of Science, Kagoshima University, 1-21-35 Korimoto, Kagoshima 890-0065, Japan. E-mail: [email protected]

2 Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543, Republic of Singapore. E-mail: [email protected]

3 Graduate School of Mathematics, Kyushu University, Hakozaki, Higashi-ku, Fukuoka 812-8581, Japan. E-mail: [email protected]

Received: 30 September 2007 / Accepted: 8 January 2008
Published online: 4 June 2008 – © Springer-Verlag 2008

The first author is partially supported by JSPS Grant-in-Aid for Exploratory Research 2005-2007. The second author is partially supported by the research grant R-146-000-106-112 from the National University of Singapore and the Ministry of Education.

Abstract: We obtain the asymptotic behavior of the Takhtajan-Zograf metric on the Teichmüller space of punctured Riemann surfaces.

0. Introduction

We consider the Teichmüller space $T_{g,n}$ and the associated Teichmüller curve $\mathcal{T}_{g,n}$ of Riemann surfaces of type $(g, n)$ (i.e., Riemann surfaces of genus $g$ and with $n > 0$ punctures). We will assume that $2g-2+n > 0$, so that each fiber of the holomorphic projection map $\pi : \mathcal{T}_{g,n} \to T_{g,n}$ is stable or, equivalently, admits the complete hyperbolic metric of constant sectional curvature $-1$. The kernel of the differential $T\mathcal{T}_{g,n} \to TT_{g,n}$ forms the so-called vertical tangent bundle over $\mathcal{T}_{g,n}$, which is denoted by $T^V\mathcal{T}_{g,n}$. The hyperbolic metrics on the fibers induce naturally a Hermitian metric on $T^V\mathcal{T}_{g,n}$. In the study of the family of $\bar\partial_k$-operators acting on the $k$-differentials on Riemann surfaces (i.e., cross-sections of $(T^V\mathcal{T}_{g,n})^{-k}|_{\pi^{-1}(s)} \to \pi^{-1}(s)$, $s \in T_{g,n}$), Takhtajan and Zograf introduced in [TZ1] and [TZ2] a Kähler metric on $T_{g,n}$, which is known as the Takhtajan-Zograf metric. In [TZ2], they showed that the Takhtajan-Zograf metric is invariant under the natural action of the Teichmüller modular group $\mathrm{Mod}_{g,n}$ and it satisfies the following remarkable identity on $T_{g,n}$:

\[ c_1(\lambda_k, \rho_{Q,k}) = \frac{6k^2 - 6k + 1}{12} \cdot \frac{1}{\pi^2}\, \omega_{WP} - \frac{1}{9}\, \omega_{TZ}. \]

Here $\lambda_k = \det(\operatorname{ind} \bar\partial_k) = \wedge^{\max} \operatorname{Ker} \bar\partial_k \otimes (\wedge^{\max} \operatorname{Coker} \bar\partial_k)^{-1}$ denotes the determinant line bundle on $T_{g,n}$, $\rho_{Q,k}$ denotes the Quillen metric on $\lambda_k$, and $\omega_{WP}$ and $\omega_{TZ}$ denote the

Kähler forms of the Weil-Petersson metric and the Takhtajan-Zograf metric on $T_{g,n}$ respectively. In [We], Weng studied the Takhtajan-Zograf metric in terms of Arakelov intersection, and he expressed the class of $\omega_{TZ}$ as a rational multiple of the first Chern class of an associated Takhtajan-Zograf line bundle over the moduli space $\mathcal{M}_{g,n} = T_{g,n}/\mathrm{Mod}_{g,n}$. Recently, Wolpert [Wolp5] gave a natural definition of a Hermitian metric on the Takhtajan-Zograf line bundle whose first Chern form gives $\omega_{TZ}$.

Motivated in part by these developments, we are interested in studying the boundary behavior of the Takhtajan-Zograf metric on $T_{g,n}$. Along this direction is an earlier result of Obitsu [O1], who showed that the Takhtajan-Zograf metric is incomplete. We are also inspired by Masur’s beautiful paper [M], which gave the asymptotic boundary behavior of the Weil-Petersson metric on $T_g := T_{g,0}$ (see also [Wolp5] and [OW] for recent improvements of this result).

Our main result in this paper is to give the asymptotic behavior of the Takhtajan-Zograf metric near the boundary of $T_{g,n}$, which we describe heuristically as follows. Near the boundary of $T_{g,n}$, the tangent space at any point in $T_{g,n}$ can be roughly considered as the direct sum of the pinching directions and the non-pinching directions (that are ‘parallel’ to the boundary). Roughly speaking, our result shows that the Takhtajan-Zograf metric is smaller than the Weil-Petersson metric by an additional factor of $1/|\log|t||$ along each pinching tangential direction, i.e. it is essentially of the order of growth $1/(|t|^2 (\log|t|)^4)$ along the pinching direction corresponding to a pinching coordinate $t$. Also, we show that the Takhtajan-Zograf metric extends continuously along the non-pinching tangential directions to the “nodally-depleted Takhtajan-Zograf metrics” on the boundary Teichmüller spaces, which, unlike the case of the Weil-Petersson metric, are only positive semi-definite on the boundary Teichmüller spaces.
Our result also leads immediately to an alternative proof of the above-mentioned result of Obitsu on the non-completeness of the Takhtajan-Zograf metric (see Theorem 1 for the precise statements of our results).

An important ingredient in the proof of our main result is to obtain certain estimates on the degenerative behavior of the Eisenstein series in the setting of holomorphic families of degenerating punctured Riemann surfaces, which seem to be of considerable independent interest. These estimates are largely obtained by geometrically constructing suitable germs of comparison functions for the Eisenstein series near the nodes and punctures. We also need to make certain adaptations from Masur’s paper [M].

This paper is organized as follows. In Sect. 1, we introduce some notation and state our main results. In Sect. 2, we describe the behavior of the hyperbolic metrics on the punctured Riemann surfaces upon degenerations. In Sect. 3, we recall Masur’s construction of a certain local basis of regular quadratic differentials for a degenerating family of punctured Riemann surfaces. In Sect. 4, we derive the necessary estimates of the Eisenstein series near the punctures and nodes of a degenerating family of punctured Riemann surfaces. Finally we complete the proof of our main result in Sect. 5.

1. Notation and Statement of Results

1.1. For $g \ge 0$ and $n > 0$, we denote by $T_{g,n}$ the Teichmüller space of Riemann surfaces of type $(g, n)$. Each point of $T_{g,n}$ is a Riemann surface $X$ of type $(g, n)$, i.e., $X = \bar X \setminus \{p_1, \dots, p_n\}$, where $\bar X$ is a compact Riemann surface of genus $g$, and the punctures $p_1, \dots, p_n$ of $X$ are $n$ distinct points in $\bar X$. We will always assume that $2g-2+n > 0$, so that $X$ admits the complete hyperbolic metric of constant sectional curvature $-1$. By the uniformization theorem, $X$ can be represented as a quotient $\mathbb{H}/\Gamma$ of the upper half


plane $\mathbb{H} := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$ by the natural action of a Fuchsian group $\Gamma \subset PSL(2, \mathbb{R})$ of the first kind. $\Gamma$ is generated by $2g$ hyperbolic transformations $A_1, B_1, \dots, A_g, B_g$ and $n$ parabolic transformations $P_1, \dots, P_n$ satisfying the relation

\[ A_1 B_1 A_1^{-1} B_1^{-1} \cdots A_g B_g A_g^{-1} B_g^{-1} P_1 P_2 \cdots P_n = \mathrm{Id}. \]

Let $z_1, \dots, z_n \in \mathbb{R} \cup \{\infty\}$ be the fixed points of the parabolic transformations $P_1, \dots, P_n$ respectively, which are also called cusps. The cusps $z_1, \dots, z_n$ correspond to the punctures $p_1, \dots, p_n$ of $X$ under the projection $\mathbb{H} \to \mathbb{H}/\Gamma \cong X$ respectively. For each $i = 1, 2, \dots, n$, it is well-known that $P_i$ generates an infinite cyclic subgroup $\langle P_i \rangle$ of $\Gamma$, and we can select $\sigma_i \in PSL(2, \mathbb{R})$ so that $\sigma_i(\infty) = z_i$ and $\sigma_i^{-1} P_i \sigma_i$ is the transformation $z \mapsto z + 1$ on $\mathbb{H}$. For each $i = 1, 2, \dots, n$ and $s \in \mathbb{C}$, the Eisenstein series $E_i(z, s)$ attached to the cusp $z_i$ is given by

\[ E_i(z, s) := \sum_{\gamma \in \langle P_i \rangle \backslash \Gamma} \operatorname{Im}(\sigma_i^{-1} \gamma z)^s, \quad z \in \mathbb{H}. \tag{1.1.1} \]

If $\operatorname{Re} s > 1$, then the above series is uniformly convergent on compact subsets of $\mathbb{H}$. Moreover, $E_i(z, s)$ is invariant under $\Gamma$, and thus it descends to a function on $X$, which we denote by the same symbol. Furthermore, it is well-known that

\[ \Delta_{hyp} E_j = s(s - 1) E_j \quad \text{on } X, \tag{1.1.2} \]

where $\Delta_{hyp}$ denotes the hyperbolic Laplacian on $X$ (see e.g. [Ku]).

The Teichmüller space $T_{g,n}$ is naturally a complex manifold of dimension $3g - 3 + n$. To describe its tangent and cotangent spaces at a point $X$, we first denote by $Q(X)$ the space of holomorphic quadratic differentials $\phi = \phi(z)\, dz^2$ on $X$ with finite $L^1$ norm, i.e., $\int_X |\phi| < \infty$. Also, we denote by $B(X)$ the space of $L^\infty$ measurable Beltrami differentials $\mu = \mu(z)\, d\bar z/dz$ on $X$ (i.e., $\|\mu\|_\infty := \operatorname{ess\,sup}_{z \in X} |\mu(z)| < \infty$). Let $HB(X)$ be the subspace of $B(X)$ consisting of elements of the form $\bar\phi/\rho$ for some $\phi \in Q(X)$. Here $\rho = \rho(z)\, dz\, d\bar z$ denotes the hyperbolic metric on $X$. Elements of $HB(X)$ are called harmonic Beltrami differentials. There is a natural Kodaira-Serre pairing $\langle\,,\rangle : B(X) \times Q(X) \to \mathbb{C}$ given by

\[ \langle \mu, \phi \rangle = \int_X \mu(z) \phi(z)\, dz\, d\bar z \tag{1.1.3} \]

for $\mu \in B(X)$ and $\phi \in Q(X)$. Let $Q(X)^\perp \subset B(X)$ be the annihilator of $Q(X)$ under the above pairing. Then one has the decomposition $B(X) = HB(X) \oplus Q(X)^\perp$. It is well-known that one has the following natural isomorphisms:

\[ T_X T_{g,n} \cong B(X)/Q(X)^\perp \cong HB(X), \quad \text{and} \quad T_X^* T_{g,n} \cong Q(X) \tag{1.1.4} \]

with the duality between $T_X T_{g,n}$ and $T_X^* T_{g,n}$ given by (1.1.3). It should be remarked that Bers was responsible for many of the concepts described above (see [Be]).

The Weil-Petersson metric $g^{WP}$ and the Takhtajan-Zograf metric $g^{TZ}$ on $T_{g,n}$ (the latter being introduced in [TZ1] and [TZ2]) are defined as follows (see e.g. [IT,Wolp2] and the references therein for background material on $g^{WP}$): for $X \in T_{g,n}$ and $\mu, \nu \in HB(X)$, one has

\[ g^{WP}(\mu, \nu) = \int_X \mu \bar\nu\, \rho, \qquad g^{TZ}(\mu, \nu) = \sum_{i=1}^{n} g^{(i)}(\mu, \nu), \quad \text{where} \quad g^{(i)}(\mu, \nu) = \int_X E_i(\cdot, 2)\, \mu \bar\nu\, \rho, \quad i = 1, 2, \dots, n \tag{1.1.5} \]

(see (1.1.1)). It follows from results in [A2,Ch,Wolp1,TZ2,O1] that the metrics $g^{WP}$, $g^{(i)}$, $g^{TZ}$ are all Kählerian and non-complete. Note that $g^{TZ}$ is well-defined only when $n > 0$. Moreover, each $g^{(i)}$ is intrinsic to the corresponding cusp $p_i$ in the sense that if an element $\gamma$ in the Teichmüller modular group $\mathrm{Mod}_{g,n}$ carries the cusp $p_i$ to another cusp $p_j$, then $\gamma$ also carries $g^{(i)}$ to $g^{(j)}$. To facilitate subsequent discussion, we will call $g^{(i)}$ the Takhtajan-Zograf cuspidal metric on $T_{g,n}$ associated to the cusp $z_i$ (or the puncture $p_i$).

The moduli space $\mathcal{M}_{g,n}$ of Riemann surfaces of type $(g, n)$ is obtained as the quotient of $T_{g,n}$ by the Teichmüller modular group $\mathrm{Mod}_{g,n}$, i.e., $\mathcal{M}_{g,n} \cong T_{g,n}/\mathrm{Mod}_{g,n}$ (see e.g. [N]). As such, $\mathcal{M}_{g,n}$ is naturally endowed with the structure of a complex V-manifold ([Ba]). The metrics $g^{WP}$ and $g^{TZ}$ (but not each individual $g^{(i)}$ unless $n = 1$) are invariant under $\mathrm{Mod}_{g,n}$ and thus they descend to Kähler metrics on (the smooth points of) $\mathcal{M}_{g,n}$, which we denote by the same names and symbols.
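To make the Eisenstein series (1.1.1) entering (1.1.5) concrete, the following Python sketch evaluates a truncated version for the modular group PSL(2, Z) and its single cusp at infinity. This choice of group is an assumption made purely for illustration: the modular surface is an orbifold rather than one of the surfaces of type (g, n) considered here, and the function name and the truncation parameter `bound` are hypothetical. For this group the cosets of the subgroup generated by z -> z + 1 correspond to coprime pairs (c, d) up to sign, and Im(gamma z) = Im z / |cz + d|^2.

```python
from math import gcd

def eisenstein_truncated(z: complex, s: float, bound: int = 60) -> float:
    """Truncated Eisenstein series for PSL(2, Z) at the cusp infinity.

    Sums Im(z)^s / |c z + d|^(2 s) over coprime pairs (c, d), taken up to
    the sign (c, d) ~ (-c, -d), with max(|c|, |d|) <= bound.
    """
    y = z.imag
    total = y ** s  # the pair (c, d) = (0, 1), i.e. the identity coset
    for c in range(1, bound + 1):          # c > 0 fixes the sign
        for d in range(-bound, bound + 1):
            if gcd(c, abs(d)) == 1:
                total += y ** s / abs(c * z + d) ** (2 * s)
    return total

z = complex(0.3, 1.1)
e_z = eisenstein_truncated(z, 2.0)
e_sz = eisenstein_truncated(-1 / z, 2.0)   # value at the image under z -> -1/z
print(e_z, e_sz)
```

The two printed values agree up to rounding, because the truncation box is preserved by (c, d) -> (d, -c), which is how z -> -1/z permutes the cosets. Invariance under z -> z + 1 is only recovered in the limit of large `bound`. The exponent s = 2 is the value used in (1.1.5).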

1.2. To facilitate the ensuing discussion, we consider some related pseudo-metrics on the associated boundary Teichmüller spaces of $T_{g,n}$. As in [M] (in the case of $T_{g,0}$), we denote by $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ the boundary Teichmüller space of $T_{g,n}$ arising from pinching $m$ distinct points. Take a point $X_0 \in \delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$. Then $X_0$ is a Riemann surface with $n$ punctures $p_1, \dots, p_n$ and $m$ nodes $q_1, \dots, q_m$. Observe that $X_0^o := X_0 \setminus \{q_1, \dots, q_m\}$ is a non-singular Riemann surface with $n + 2m$ punctures. Each node $q_i$ corresponds to two punctures on $X_0^o$ (other than $p_1, \dots, p_n$). Denote the components of $X_0^o$ by $S_\alpha$, $\alpha = 1, 2, \dots, d$. Each $S_\alpha$ is a Riemann surface of genus $g_\alpha$ and with $n_\alpha$ punctures, i.e., $S_\alpha$ is of type $(g_\alpha, n_\alpha)$. It will be clear in Sect. 1.3 that we will only need to consider the case where $2g_\alpha - 2 + n_\alpha > 0$ for each $\alpha$, so that each $S_\alpha$ also admits the complete hyperbolic metric of constant sectional curvature $-1$. It is easy to see that $\sum_{\alpha=1}^{d} (3g_\alpha - 3 + n_\alpha) + m = 3g - 3 + n$. With respect to the disjoint union $X_0^o = \cup_{\alpha=1}^{d} S_\alpha$, one easily sees that $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ is a product of lower dimensional Teichmüller spaces given by

\[ \delta_{\gamma_1,\dots,\gamma_m} T_{g,n} = T_{g_1,n_1} \times T_{g_2,n_2} \times \cdots \times T_{g_d,n_d} \tag{1.2.1} \]

with each $S_\alpha \in T_{g_\alpha,n_\alpha}$, $\alpha = 1, 2, \dots, d$. Recall that the punctures of $S_\alpha$ arise from either the punctures or the nodes of $X_0$, and for simplicity, they will be called old cusps and new cusps of $S_\alpha$ respectively. Denote the number of old cusps (resp. new cusps) of $S_\alpha$ by $n'_\alpha$ (resp. $n''_\alpha$), so that $n_\alpha = n'_\alpha + n''_\alpha$. We index the punctures of $S_\alpha$ such that $\{p_{\alpha,i}\}_{1 \le i \le n'_\alpha}$ denotes the set of old cusps, and $\{p_{\alpha,i}\}_{n'_\alpha + 1 \le i \le n_\alpha}$ denotes the set of new cusps. For each $\alpha$ and $i$, we denote by $g^{(\alpha,i)}$ the Takhtajan-Zograf cuspidal metric on $T_{g_\alpha,n_\alpha}$ with respect to the puncture $p_{\alpha,i}$ (cf. (1.1.5)). Now we define a pseudo-metric $\hat g^{TZ,\alpha}$ on $T_{g_\alpha,n_\alpha}$ by summing the $g^{(\alpha,i)}$’s over the old cusps, i.e.,

\[ \hat g^{TZ,\alpha} := \sum_{1 \le i \le n'_\alpha} g^{(\alpha,i)}. \tag{1.2.2} \]

If none of the punctures of $S_\alpha$ are old cusps, then $\hat g^{TZ,\alpha}$ is simply defined to be zero identically. As such, $\hat g^{TZ,\alpha}$ is positive definite precisely when $S_\alpha$ possesses at least one old cusp. Note that by contrast, the Takhtajan-Zograf metric $g^{TZ,\alpha}$ on $T_{g_\alpha,n_\alpha}$ is given by $g^{TZ,\alpha} := \sum_{1 \le i \le n_\alpha} g^{(\alpha,i)}$, and $g^{TZ,\alpha}$ is always positive definite.

Definition 1.2.1. The nodally depleted Takhtajan-Zograf pseudo-metric $\hat g^{TZ,(\gamma_1,\dots,\gamma_m)}$ on $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ is defined to be the product pseudo-metric of the $\hat g^{TZ,\alpha}$’s on the $T_{g_\alpha,n_\alpha}$’s, i.e.,

\[ \left( \delta_{\gamma_1,\dots,\gamma_m} T_{g,n},\ \hat g^{TZ,(\gamma_1,\dots,\gamma_m)} \right) = \prod_{\alpha=1}^{d} \left( T_{g_\alpha,n_\alpha},\ \hat g^{TZ,\alpha} \right). \tag{1.2.3} \]
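The dimension count of Sect. 1.2, namely that the dimensions of the factors in (1.2.1) plus the number m of nodes add up to 3g - 3 + n, can be checked on small examples. The sketch below is illustrative only; the function names are hypothetical, and the two consistency conditions it asserts (each node contributes two new cusps, and 2g - 2 + n is additive over the components) are standard facts assumed here rather than taken from the text.

```python
def dim_teich(g: int, n: int) -> int:
    """Complex dimension 3g - 3 + n of T_{g,n} (requires 2g - 2 + n > 0)."""
    if 2 * g - 2 + n <= 0:
        raise ValueError("type (g, n) is not hyperbolic")
    return 3 * g - 3 + n

def boundary_dimension_check(g, n, m, components):
    """components: list of types (g_alpha, n_alpha) of the pieces S_alpha."""
    # Every node contributes two new cusps, so the cusp counts must add up.
    assert sum(n_a for _, n_a in components) == n + 2 * m
    # The quantity 2g - 2 + n is additive over the pieces.
    assert sum(2 * g_a - 2 + n_a for g_a, n_a in components) == 2 * g - 2 + n
    boundary_dim = sum(dim_teich(g_a, n_a) for g_a, n_a in components)
    return boundary_dim + m == dim_teich(g, n)

# Type (2, 1): pinch one separating curve, or one non-separating curve.
print(boundary_dimension_check(2, 1, 1, [(1, 2), (1, 1)]))
print(boundary_dimension_check(2, 1, 1, [(1, 3)]))
```

In both examples the boundary space has dimension 3, one less than the dimension 4 of the Teichmüller space of type (2, 1), in agreement with m = 1.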

1.3. Let $\mathcal{M}_{g,n}$ be the moduli space of Riemann surfaces of type $(g, n)$ as in (1.1), and let $\overline{\mathcal{M}}_{g,n}$ denote the Knudsen-Deligne-Mumford stable curve compactification of $\mathcal{M}_{g,n}$ ([DM,KM,Kn]). Like $\mathcal{M}_{g,n}$, $\overline{\mathcal{M}}_{g,n}$ admits a V-manifold structure, which we describe as follows. A similar description for $\overline{\mathcal{M}}_g$ (i.e., when $n = 0$) can be found in [M] or [Wolp3]. Take a point $X_0 \in \overline{\mathcal{M}}_{g,n} \setminus \mathcal{M}_{g,n}$. Then $X_0$ is a stable Riemann surface with $n$ punctures $p_1, \dots, p_n$ and $m$ nodes $q_1, \dots, q_m$ for some $m > 0$. Thus we may regard $X_0$ as a point in $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ (cf. (1.2)). Write $X_0 \setminus \{q_1, \dots, q_m\} = \cup_{1 \le \alpha \le d} S_\alpha$ and write $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n} = \prod_{\alpha=1}^{d} T_{g_\alpha,n_\alpha}$ with each component $S_\alpha \in T_{g_\alpha,n_\alpha}$ as in Sect. 1.2. Note that since $X_0$ is stable, each $S_\alpha$ admits the complete hyperbolic metric of constant sectional curvature $-1$. Also, for some $0 < r < 1$, each node $q_j$ in $X_0$ admits an open neighborhood

\[ N_j = \{(z_j, w_j) \in \mathbb{C}^2 : |z_j|, |w_j| < r,\ z_j \cdot w_j = 0\} \tag{1.3.1} \]

so that $N_j = N_j^1 \cup N_j^2$, where $N_j^1 = \{(z_j, 0) \in \mathbb{C}^2 : |z_j| < r\}$ and $N_j^2 = \{(0, w_j) \in \mathbb{C}^2 : |w_j| < r\}$ are the coordinate discs in $\mathbb{C}^2$. Without loss of generality, we will assume that $r$ is independent of $j$, upon shrinking $r$ if necessary. For each $\alpha$, we choose $3g_\alpha - 3 + n_\alpha$ linearly independent Beltrami differentials $\nu_i^{(\alpha)}$, $1 \le i \le 3g_\alpha - 3 + n_\alpha$, which are supported on $S_\alpha \setminus \cup_{j=1}^{m} N_j$, so that their harmonic projections form a basis of $T_{S_\alpha} T_{g_\alpha,n_\alpha}$ (cf. (1.1.4)). For simplicity, we rewrite $\{\nu_i^{(\alpha)}\}_{1 \le \alpha \le d,\, 1 \le i \le 3g_\alpha - 3 + n_\alpha}$ as $\{\nu_i\}_{1 \le i \le 3g-3+n-m}$. Then one has an associated local coordinate neighborhood $V$ of $X_0$ in $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ with holomorphic coordinates $\tau = (\tau_1, \dots, \tau_{3g-3+n-m})$ such that $X_0$ corresponds to $0$. Shrinking and reparametrizing $V$ if necessary, we may assume $V \cong \Delta^{3g-3+n-m}$, where $\Delta = \{z \in \mathbb{C} : |z| < 1\}$ denotes the unit disc in $\mathbb{C}$. For a point $\tau \in V$, one has the associated Beltrami differential $\mu(\tau) = \sum_{i=1}^{3g-3+n-m} \tau_i \nu_i$ and a quasi-conformal homeomorphism $w^{\mu(\tau)} : X_0 \to X_\tau$ onto a Riemann surface $X_\tau$ satisfying

\[ \frac{\partial w^{\mu(\tau)}}{\partial \bar z} = \mu(\tau)\, \frac{\partial w^{\mu(\tau)}}{\partial z}. \tag{1.3.2} \]


The map $w^{\mu(\tau)}$ is conformal on each $N_j$, $j = 1, \dots, m$, so that we may regard $N_j \subset X_\tau$ for each $j$. Then for each $t = (t_1, \dots, t_m)$ with each $|t_j| < r$, we obtain a new Riemann surface $X_{t,\tau}$ from $X_\tau$ by removing the discs $\{z_j \in N_j^1 : |z_j| < |t_j|\}$ and $\{w_j \in N_j^2 : |w_j| < |t_j|\}$ and identifying $z_j \in N_j^1$ with $w_j = t_j/z_j \in N_j^2$, $j = 1, \dots, m$. Then one obtains a holomorphic family of noded Riemann surfaces $\{X_{t,\tau}\}$ parametrized by the coordinates $(t, \tau) = (t_1, \dots, t_m, \tau_1, \dots, \tau_{3g-3+n-m})$ of $\Delta^m(r) \times V \cong \Delta^m(r) \times \Delta^{3g-3+n-m}$, where $\Delta^m(r)$ denotes the $m$-fold Cartesian product of the disc $\Delta(r) = \{z \in \mathbb{C} : |z| < r\}$ in $\mathbb{C}$. Moreover, the Riemann surfaces $X_{t,\tau}$ with $(t, \tau) \in (\Delta^*(r))^m \times V$ are of type $(g, n)$, where $\Delta^*(r) = \Delta(r) \setminus \{0\}$. The coordinates $t = (t_1, \dots, t_m)$ will be called pinching coordinates, and $\tau = (\tau_1, \dots, \tau_{3g-3+n-m})$ will be called boundary coordinates. For $1 \le j \le m$, let $\alpha_j$ denote the simple closed curve $|z_j| = |w_j| = |t_j|^{1/2}$ on $X_{t,\tau}$. Shrinking $\Delta^m(r)$ and $V$ if necessary, it is known that the universal cover of $(\Delta^*(r))^m \times V$ is naturally a domain in $T_{g,n}$ and the corresponding covering transformations are generated by Dehn twists about the $\alpha_j$’s. Since Dehn twists are elements of $\mathrm{Mod}_{g,n}$, the $\mathrm{Mod}_{g,n}$-invariant metrics $g^{WP}$ and $g^{TZ}$ descend to metrics on $(\Delta^*(r))^m \times V$, which we denote by the same symbols and names. It is well-known that each $X_0 \in \overline{\mathcal{M}}_{g,n} \setminus \mathcal{M}_{g,n}$ admits an open neighborhood $\hat U$ in $\overline{\mathcal{M}}_{g,n}$ together with a local uniformizing chart $\chi : U \cong \Delta^m(r) \times V \to \hat U$ for some $\Delta^m(r) \times V$ as described above, where $\chi$ is a finite ramified cover. Obviously the metrics $g^{WP}$ and $g^{TZ}$ on $(\Delta^*(r))^m \times V \subset U$ may also be regarded as extensions of the pull-back of the corresponding metrics on the smooth points of $\hat U \cap \mathcal{M}_{g,n}$ via the map $\chi$.

1.4. Before we state our main result, we first need to make the following definition.

Definition 1.4.1. Let $X_0$ be a Riemann surface with $n$ punctures $p_1, \dots, p_n$ and $m$ nodes $q_1, \dots, q_m$. A node $q_i$ is said to be adjacent to punctures (resp. a puncture $p_j$) if the component of $X_0 \setminus \{q_1, \dots, q_{i-1}, q_{i+1}, \dots, q_m\}$ containing $q_i$ also contains at least one of the $p_j$’s (resp. the puncture $p_j$). Otherwise, it is said to be non-adjacent to punctures (resp. the puncture $p_j$).

Now we are ready to state our main result in the following

Theorem 1. For $g \ge 0$ and $n > 0$, let $X_0 \in \overline{\mathcal{M}}_{g,n} \setminus \mathcal{M}_{g,n}$ be a stable Riemann surface with $n$ punctures $p_1, \dots, p_n$ and $m$ nodes $q_1, \dots, q_m$ arranged in such a way that $q_i$ is adjacent (resp. non-adjacent) to punctures for $1 \le i \le m'$ (resp. $m' + 1 \le i \le m$). Let $\hat U$ be an open neighborhood of $X_0$ in $\overline{\mathcal{M}}_{g,n}$, together with a local uniformizing chart $\psi : U \cong \Delta^m(r) \times V \to \hat U$, where $V \cong \Delta^{3g-3+n-m}$ is a domain in the boundary Teichmüller space $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ corresponding to $X_0$ and with each $\gamma_i$ corresponding to $q_i$. Let $(s_1, \dots, s_{3g-3+n}) = (t_1, \dots, t_m, \tau_1, \dots, \tau_{3g-3+n-m}) = (t, \tau)$ be the pinching and boundary coordinates of $U$, and let the components of the Takhtajan-Zograf metric $g^{TZ}$ be given by

\[ g^{TZ}_{i\bar j} = g^{TZ}\!\left( \frac{\partial}{\partial s_i}, \frac{\partial}{\partial s_j} \right), \quad 1 \le i, j \le 3g - 3 + n, \tag{1.4.1} \]

on $U^* := (\Delta^*(r))^m \times V \subset U$. Then the following statements hold:

(i) For each $1 \le j \le m$ and any $\varepsilon > 0$, one has

\[ \limsup_{(t,\tau) \in U^* \to (0,0)} |t_j|^2 (-\log|t_j|)^{4-\varepsilon}\, g^{TZ}_{j\bar j}(t, \tau) = 0. \tag{1.4.2} \]

(ii) For each $1 \le j \le m$ and any $\varepsilon > 0$, one has

\[ \liminf_{(t,\tau) \in U^* \to (0,0)} |t_j|^2 (-\log|t_j|)^{4+\varepsilon}\, g^{TZ}_{j\bar j}(t, \tau) = +\infty. \tag{1.4.3} \]

(iii) For each $1 \le j, k \le m$ with $j \ne k$, one has

\[ g^{TZ}_{j\bar k}(t, \tau) = O\!\left( \frac{1}{|t_j|\, |t_k|\, (\log|t_j|)^3 (\log|t_k|)^3} \right) \quad \text{as } (t, \tau) \in U^* \to (0, 0). \tag{1.4.4} \]

(iv) For each $j, k \ge m + 1$, one has

\[ \lim_{(t,\tau) \in U^* \to (0,0)} g^{TZ}_{j\bar k}(t, \tau) = \hat g^{TZ,(\gamma_1,\dots,\gamma_m)}_{j\bar k}(0, 0). \tag{1.4.5} \]

(v) For each $j \le m$ and $k \ge m + 1$, one has

\[ g^{TZ}_{j\bar k}(t, \tau) = O\!\left( \frac{1}{|t_j| (-\log|t_j|)^3} \right) \quad \text{as } (t, \tau) \in U^* \to (0, 0). \tag{1.4.6} \]

Here in (1.4.5), $\hat g^{TZ,(\gamma_1,\dots,\gamma_m)}_{j\bar k}$ denotes the $(j, k)$th component of the nodally depleted Takhtajan-Zograf pseudo-metric on $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ (cf. Definition 1.2.1).
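The role of the exponents 4 - ε and 4 + ε in Theorem 1(i) and (ii) can be seen on a model function. The sketch below uses the density 1/(|t|^2 (-log|t|)^4) as a stand-in; this is an assumption for illustration only, since whether the metric component itself is comparable to this density is exactly the question left open in Remark 1.4.2(ii). The function names are hypothetical.

```python
import math

def model_rate(t_abs: float) -> float:
    """Model density 1 / (|t|^2 (-log|t|)^4); an assumed stand-in, not g^TZ."""
    L = -math.log(t_abs)
    return 1.0 / (t_abs ** 2 * L ** 4)

def weighted(t_abs: float, exponent: float) -> float:
    """|t|^2 (-log|t|)^exponent times the model density."""
    L = -math.log(t_abs)
    return t_abs ** 2 * L ** exponent * model_rate(t_abs)

eps = 0.5
for k in (2, 10, 50, 100):
    t_abs = 10.0 ** (-k)
    # First column tends to 0, second column tends to infinity, both slowly.
    print(k, weighted(t_abs, 4 - eps), weighted(t_abs, 4 + eps))
```

For the model density the weighted quantities reduce to (-log|t|)^(-ε) and (-log|t|)^(ε), which explains why the convergence in (1.4.2) and the divergence in (1.4.3) can only be logarithmically slow.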

Remark 1.4.2. (i) Theorem 1(i) is equivalent to the following statement: For each $1 \le j \le m$ and any $\varepsilon > 0$, there exists a constant $C_{1,\varepsilon} > 0$ (depending on $\varepsilon$) such that

\[ g^{TZ}_{j\bar j}(t, \tau) \le \frac{C_{1,\varepsilon}}{|t_j|^2 (-\log|t_j|)^{4-\varepsilon}} \quad \text{for all } (t, \tau) \in U^*. \tag{1.4.7} \]

Similarly, Theorem 1(ii) is equivalent to the following statement: For each $1 \le j \le m$ and any $\varepsilon > 0$, there exists a constant $C_{2,\varepsilon} > 0$ (depending on $\varepsilon$) such that

\[ g^{TZ}_{j\bar j}(t, \tau) \ge \frac{C_{2,\varepsilon}}{|t_j|^2 (-\log|t_j|)^{4+\varepsilon}} \quad \text{for all } (t, \tau) \in U^*. \tag{1.4.8} \]

(ii) In view of Theorem 1(i) and (ii), it is natural to ask the following question: Does the stronger estimate

\[ g^{TZ}_{j\bar j}(t, \tau) \sim \frac{1}{|t_j|^2 (-\log|t_j|)^4} \quad \text{hold for } 1 \le j \le m \text{ and } (t, \tau) \in U^*? \tag{1.4.9} \]

The methods of this paper do not seem to generalize easily to answer this question.

2. The Hyperbolic Metric

2.1. In Sect. 2, we are going to give some uniform estimates for the family of hyperbolic metrics near the punctures and nodes of degenerating Riemann surfaces. For a degenerating family of compact Riemann surfaces (i.e. $n = 0$), Wolpert [Wolp3] has developed from the prescribed curvature equation results which are stronger than what is described in this section. Since the estimates in the form that we need in our ensuing discussion were discussed explicitly only in the case when $n = 0$ in [M] and [Wolp3], we include here the modifications arising from the punctures for the convenience of the reader. Throughout this article, hyperbolic metrics will always be normalized to be of constant sectional curvature $-1$. First we have


Lemma 2.1.1. Let $S$ be a hyperbolic punctured Riemann surface with hyperbolic metric $\rho$. Let $\Delta^*(r) = \{z \in \mathbb{C} : 0 < |z| < r\}$ (with $r > 0$) be a punctured coordinate neighborhood of a puncture $p$ of $S$ with the origin $0$ corresponding to $p$. Write $\rho = \rho(z)\, dz \otimes d\bar z$ on $\Delta^*(r)$. Then one has

\[ \lim_{z \to 0} |z|^2 (\log|z|)^2 \rho(z) = 1. \]

Proof. First recall from (1.1) that we may write $S = \mathbb{H}/\Gamma$, where $\mathbb{H} = \{Z = X + iY \in \mathbb{C} : Y > 0\}$ and $\Gamma \subset PSL(2, \mathbb{R})$. Moreover, upon conjugation by an element in $PSL(2, \mathbb{R})$ if necessary, we may assume that the puncture $p$ corresponds to the cusp $\infty$ and the subgroup $\Gamma_\infty$ of $\Gamma$ fixing $\infty$ is generated by the transformation $Z \mapsto Z + 1$. It is well-known that for some $R > 0$, one has $\gamma A \cap A = \emptyset$ for all $\gamma \in \Gamma \setminus \Gamma_\infty$, where $A = \{Z = X + iY \in \mathbb{H} : Y > R\}$ (cf. e.g. [FK, p. 216] or Remark-Definition 2.1.2(ii) below). It follows that the function

\[ w(Z) = e^{2\pi i Z} \tag{2.1.1} \]

on $A$ descends to the coordinate function on the punctured coordinate neighborhood $\Delta^*(r_0) = \{w \in \mathbb{C} : 0 < |w| < r_0\}$ of $p$ in $S$ with $p$ corresponding to the origin $0$, where $0 < r_0 = e^{-2\pi R} < 1$. Since $\rho$ descends from the hyperbolic metric $dZ \otimes d\bar Z / Y^2$ on $\mathbb{H}$, one easily sees that

\[ \rho = \frac{dw \otimes d\bar w}{|w|^2 (\log|w|)^2} \quad \text{on } \Delta^*(r_0). \tag{2.1.2} \]

Now let $z$ be any coordinate function of $S$ near $p$ with $z(p) = 0$. Then $w$ can be regarded as a holomorphic function of $z$ near $p$ with $w(0) = 0$ and $C := w'(0) \ne 0$. By Taylor’s theorem, we have $w = Cz + O(z^2)$ and $w'(z) = C + O(z)$ as $z \to 0$. Together with (2.1.2), it follows that $\rho$ is given in terms of $z$ near $p$ by

\[ \rho = \frac{|w'(z)|^2\, dz \otimes d\bar z}{|w(z)|^2 (\log|w(z)|)^2} = \frac{|C + O(z)|^2\, dz \otimes d\bar z}{|z|^2\, |C + O(z)|^2 \big(\log|z| + \log|C + O(z)|\big)^2}, \]

and upon letting $z \to 0$, Lemma 2.1.1 follows immediately.

Remark-Definition 2.1.2. (i) For simplicity, a local holomorphic coordinate function $w$ of a hyperbolic Riemann surface $S$ defined near a puncture $p$ with $w(p) = 0$ will be said to be standard if it is descended from the Euclidean coordinate function on $\mathbb{H}$ via (2.1.1) (so that the hyperbolic metric $\rho$ of $S$ satisfies (2.1.2) near $p$). If $S$ has nodes, such a definition will also be applied to the punctures of $S \setminus \{\text{nodes}\}$ (instead of $S$). As seen above, such standard coordinate functions always exist near the punctures of $S$. (ii) It follows from the collar lemma for non-compact surfaces (cf. e.g. [Bu, Theorem 4.4.6, p.111-112]) that we may always take $R = \frac{1}{2}$ (and thus $r_0 = e^{-\pi}$) in the proof of Lemma 2.1.1.

Notation as in §1. Let $X_0 \in \overline{\mathcal{M}}_{g,n} \setminus \mathcal{M}_{g,n}$ be a Riemann surface with $n$ punctures $p_1, \dots, p_n$ and $m$ nodes $q_1, \dots, q_m$, and let $\hat U$ be an open neighborhood of $X_0$ in $\overline{\mathcal{M}}_{g,n}$ together with a local uniformizing chart $\chi : U \to \hat U$, where $U \cong \Delta^m(r) \times V = \{(t, \tau) = (t_1, \dots, t_m, \tau_1, \dots, \tau_{3g-3+n-m}) : t \in \Delta^m(r),\ \tau \in V\}$ and $V \cong \Delta^{3g-3+n-m}$ is an open coordinate neighborhood of $X_0$ in $\delta_{\gamma_1,\dots,\gamma_m} T_{g,n}$ as in (1.3). Let $\mathcal{X} := \{X_{t,\tau}\}_{(t,\tau) \in U}$ be the corresponding family of Riemann surfaces parametrized by $U$ with $X_0 = X_{(0,0)}$, and let $\pi : \mathcal{X} \to U$ denote the holomorphic projection map. Fix a puncture $p_i$ of $X_0$.


Shrinking $U$ if necessary, it is easy to see that there exists an open coordinate subset $W_i \cong \Delta^*(R) \times U$ of $\mathcal{X}$ such that $\pi|_{W_i}$ is given by the projection onto the second factor, and each point $(0, (t, \tau))$ corresponds to the puncture on $X_{t,\tau}$ associated to $p_i$ (in particular, $(0, (0, 0))$ corresponds to $p_i$ itself). Shrinking $R$ and $V$ if necessary, we will assume without loss of generality that $R$ is independent of $i$, and each $W_i \subset\subset W_i'$ for some similarly defined open coordinate subset of $\mathcal{X}$ of the form

\[ W_i' \cong \Delta^*(R') \times U' \quad \text{with } U' = \Delta^m(r') \times V', \quad V' \cong \Delta^{3g-3+n-m}(\delta) \tag{2.1.3} \]

for some $0 < R < R' < 1$, $0 < r < r' < 1$ and $\delta > 1$. For each $(t, \tau) \in U'$, we denote the hyperbolic metric on $X_{t,\tau}$ by $\rho_{t,\tau}$, and we denote $W_{i,t,\tau} := W_i \cap X_{t,\tau} \cong \Delta^*(R)$ and $W'_{i,t,\tau} := W_i' \cap X_{t,\tau} \cong \Delta^*(R')$. We also write

\[ \rho_{t,\tau} = \rho_{t,\tau}(z_i)\, dz_i \otimes d\bar z_i = \rho(z_i, t, \tau)\, dz_i \otimes d\bar z_i \quad \text{on } W'_{i,t,\tau}. \tag{2.1.4} \]

Then it follows from a result of Bers [Be] that the function $\rho(z_i, t, \tau)$ on $W_i'$ is locally uniformly continuous in all the variables.

Proposition 2.1.3. (i) For each $1 \le i \le n$, there exist constants $C_1, C_2 > 0$ such that for all $(t, \tau) \in U$, one has

\[ \frac{C_1}{|z_i|^2 (\log|z_i|)^2} \le \rho_{t,\tau}(z_i) \le \frac{C_2}{|z_i|^2 (\log|z_i|)^2} \quad \text{on } W_{i,t,\tau}. \tag{2.1.5} \]

(ii) (Strengthened version of (i)) If, in addition, $z_i$ is a standard local holomorphic coordinate function for $X_0$ (cf. Remark-Definition 2.1.2), then the inequality in (2.1.5) remains valid with the constants $C_1, C_2$ replaced by positive continuous functions $C_{1,t,\tau}, C_{2,t,\tau}$ (depending on $t, \tau$) respectively and satisfying

\[ C_{1,t,\tau},\ C_{2,t,\tau} \to 1 \quad \text{as } (t, \tau) \to (0, 0). \tag{2.1.6} \]

Proof. For simplicity, we will drop the subscript $i$, so that $W = W_i$, $W_{t,\tau} = W_{i,t,\tau}$, $W'_{t,\tau} = W'_{i,t,\tau}$, $z = z_i$, etc. First we remark that it is well-known (and follows also from the arguments in Lemma 2.1.1) that (2.1.5) holds for a fixed punctured Riemann surface; in other words, there exist constants $C_{1,0,0}, C_{2,0,0} > 0$ such that

\[ \frac{C_{1,0,0}}{|z|^2 (\log|z|)^2} \le \rho_{0,0}(z) \le \frac{C_{2,0,0}}{|z|^2 (\log|z|)^2} \quad \text{on } W'_{0,0} \cong \Delta^*(R'). \tag{2.1.7} \]

For each $(t, \tau) \in U'$, since $\rho_{t,\tau}$ is of constant sectional curvature $-1$, it follows that one has

\[ \Delta_0 \log \rho(z, t, \tau) = 2\rho(z, t, \tau) \quad \text{on } \Delta^*(R'), \tag{2.1.8} \]

where $\Delta_0 := \partial^2/\partial x^2 + \partial^2/\partial y^2$ (with $z = x + iy$) is the Euclidean Laplacian. Consider the continuous function

\[ f(z, t, \tau) = \log \frac{\rho(z, t, \tau)}{\rho_{0,0}(z)} \quad \text{on } W' \cong \Delta^*(R') \times U'. \tag{2.1.9} \]

We extend $f$ to a function on $\Delta(R') \times U'$ by letting

\[ f(0, t, \tau) = 0 \quad \text{for all } (t, \tau) \in U'. \tag{2.1.10} \]


Then it follows from Lemma 2.1.1 that for fixed $(t, \tau) \in U'$, $f(z, t, \tau)$ is continuous in the variable $z \in \Delta(R')$. By applying the Mean Value Theorem to the real exponential function, one easily sees that for $(t, \tau) \in U'$ and $z \in \Delta^*(R)$,

\[ \Delta_0 f(z, t, \tau) = 2\big(\rho(z, t, \tau) - \rho_{0,0}(z)\big) \quad \text{(by (2.1.8), (2.1.9))} \]
\[ = 2\big(e^{\log \rho(z,t,\tau)} - e^{\log \rho_{0,0}(z)}\big) = 2 e^{\eta} f(z, t, \tau) \tag{2.1.11} \]

for some real number $\eta = \eta(z, t, \tau)$ between $\log \rho(z, t, \tau)$ and $\log \rho_{0,0}(z)$. By the maximum principle, it follows from (2.1.10) and (2.1.11) that for each $(t, \tau) \in U'$, one has

\[ \max_{z \in \bar\Delta(R)} f(z, t, \tau) \le \max\Big\{ 0,\ \max_{z \in \partial\Delta(R)} f(z, t, \tau) \Big\}, \tag{2.1.12} \]

where $\bar\Delta(R) = \{z \in \mathbb{C} : |z| \le R\}$ and $\partial\Delta(R) = \{z \in \mathbb{C} : |z| = R\}$. By applying the above arguments to the function $-f$, one also easily sees that for each $(t, \tau) \in U'$,

\[ \min_{z \in \bar\Delta(R)} f(z, t, \tau) \ge \min\Big\{ 0,\ \min_{z \in \partial\Delta(R)} f(z, t, \tau) \Big\}. \tag{2.1.13} \]

Observe also that $f(z, 0, 0) = 0$ for all $z \in \Delta(R')$. Together with the uniform continuity of $f(z, t, \tau)$ on the compact set $\partial\Delta(R) \times \bar U \subset W'$, where $\bar U \cong \bar\Delta^m(r) \times \bar\Delta^{3g-3+n-m} \subset U'$, it follows readily that there exist positive continuous functions $C_{1,t,\tau}, C_{2,t,\tau}$ on $U$ (which can be taken to be the exponential of the right-hand side of (2.1.13) and (2.1.12) respectively) such that $C_{1,0,0} = C_{2,0,0} = 1$ and

\[ C_{1,t,\tau}\, \rho_{0,0}(z) \le \rho(z, t, \tau) \le C_{2,t,\tau}\, \rho_{0,0}(z) \quad \text{for all } (t, \tau) \in U \text{ and } z \in \Delta^*(R), \tag{2.1.14} \]

which, together with (2.1.7), lead to Proposition 2.1.3(i). Proposition 2.1.3(ii) is an immediate consequence of (2.1.14).

2.2. Next we consider the behavior of the family of hyperbolic metrics near the nodes. Let $U = \Delta^m(r) \times V$ be as in Sect. 2.1, and fix a node $q_j$ of $X_0$, where $1 \le j \le m$. Then it follows readily from Sect. 1.3 (and with slight abuse of notation (cf. (1.3.1))) that there exists a local coordinate neighborhood $N_j \cong \Delta^{m+1}(r) \times V$ of $q_j$ in $\mathcal{X}$ such that for fixed $(t, \tau) \in U$ with $t = (t_1, \dots, t_m)$, the set $N_{j,t,\tau} := N_j \cap X_{t,\tau}$ is given by

\[ N_{j,t,\tau} = \Big\{ (t_1, \dots, t_{j-1}, z_j, w_j, t_{j+1}, \dots, t_m, \tau) \in N_j : z_j w_j = t_j,\ \tfrac{|t_j|}{r} < |z_j| < r \Big\} \]
\[ = \Big\{ (t_1, \dots, t_{j-1}, z_j, w_j, t_{j+1}, \dots, t_m, \tau) \in N_j : z_j w_j = t_j,\ \tfrac{|t_j|}{r} < |w_j| < r \Big\}. \tag{2.2.1} \]

When $t_j \ne 0$, one can identify $N_{j,t,\tau}$ as an annulus via coordinate projections as

\[ N_{j,t,\tau} \leftrightarrow \Big\{ z_j \in \mathbb{C} : \tfrac{|t_j|}{r} < |z_j| < r \Big\} \leftrightarrow \Big\{ w_j \in \mathbb{C} : \tfrac{|t_j|}{r} < |w_j| < r \Big\}. \tag{2.2.2} \]

Note that when $t_j = 0$, $N_{j,t,\tau}$ consists of two open coordinate discs of radius $r$ corresponding to the cases when $|z_j| < r$, $w_j = 0$ and when $|w_j| < r$, $z_j = 0$ respectively.
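The identification (2.2.2) rests on the relation z_j w_j = t_j. As a minimal numerical sketch (with hypothetical function names and arbitrarily chosen sample values for r, t and z), one can check that w = t/z maps the annulus |t|/r < |z| < r to itself and preserves the circle of radius |t|^(1/2):

```python
import cmath

def glue(z: complex, t: complex) -> complex:
    """Plumbing identification z -> w = t / z (assumes t != 0 and z != 0)."""
    return t / z

def in_collar(u: complex, t: complex, r: float) -> bool:
    """True if |t|/r < |u| < r, the annulus appearing in (2.2.2)."""
    return abs(t) / r < abs(u) < r

r = 0.5
t = 0.01 * cmath.exp(0.7j)
z = 0.2 * cmath.exp(1.3j)
w = glue(z, t)
print(in_collar(z, t, r), in_collar(w, t, r))  # membership of z and of w
print(abs(z * w - t))                          # defect in the relation z * w = t
core = abs(t) ** 0.5                           # radius of the circle |z| = |w| = |t|^(1/2)
print(abs(glue(core, t)) - core)               # change of radius on that circle
```

The inversion exchanges the inner and outer boundary circles of the annulus, which is why the same set can be described in either coordinate.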


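In terms of the coordinates $t$, $\tau$ and either $z_j$ or $w_j$, we may also write $N_j = N_j^1 \cup N_j^2$, where

\[ N_j^1 := \big\{ (z_j, t, \tau) \in \Delta(r) \times U : |t_j|^{1/2} \le |z_j| < r \big\}, \quad \text{and} \quad N_j^2 := \big\{ (w_j, t, \tau) \in \Delta(r) \times U : |t_j|^{1/2} \le |w_j| < r \big\}. \tag{2.2.3} \]

For each $(t, \tau) \in U$, we also denote

\[ N^1_{j,t,\tau} := N_j^1 \cap X_{t,\tau} \cong \big\{ z_j \in \mathbb{C} : |t_j|^{1/2} \le |z_j| < r \big\}, \quad \text{and} \quad N^2_{j,t,\tau} := N_j^2 \cap X_{t,\tau} \cong \big\{ w_j \in \mathbb{C} : |t_j|^{1/2} \le |w_j| < r \big\}. \tag{2.2.4} \]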

Recall also from Sect. 2.1 that, shrinking $r$ if necessary, we will assume without loss of generality that each $N_j \subset\subset N_j'$ for some similarly defined local coordinate neighborhood $N_j' = \Delta^{m+1}(r')\times V$ of $q_j$ in $X$ with $r<r'<1$, and thus we have corresponding similarly defined sets $N_j'^{1}$, $N_j'^{2}$, $N'^{1}_{j,t,\tau}$, $N'^{2}_{j,t,\tau}$, etc. For $(t,\tau)\in U$ with $t_j\ne0$, we define the function on $N_{j,t,\tau}$ given by

\[ \rho^*_{j,t,\tau}(z_j) := \left( \frac{\pi}{|z_j|\log|t_j|}\,\csc\frac{\pi\log|z_j|}{\log|t_j|} \right)^{2} \tag{2.2.5} \]

via the first identification of (2.2.2). Observe that the expression for $\rho^*_{j,t,\tau}$ actually does not depend on $\tau$ or $t_k$ for $k\ne j$. It is also easy to see that $\rho^*_{j,t,\tau}$ is given by a similar expression in terms of the coordinate $w_j$. For $(t,\tau)\in U$ with $t_j=0$, we define $\rho^*_{j,t,\tau}$ by

\[ \rho^*_{j,t,\tau}(z_j) = \frac{1}{|z_j|^2(\log|z_j|)^2} \quad \text{on the } z_j\text{-coordinate disc}, \tag{2.2.6} \]

and by a similar expression on the $w_j$-coordinate disc. Then it is well-known and easy to see that the $\rho^*_{j,t,\tau}$'s glue together to form a continuous function on $N_j$ (with singularity along the complex analytic subset $z_j=w_j=0$ of complex codimension two), which we denote by $\rho^*_j$. Moreover, for each $(t,\tau)\in U$ with $t_j\ne0$,

\[ \rho^*_{j,t,\tau} := \rho^*_{j,t,\tau}(z_j)\,dz_j\otimes d\bar z_j = \rho^*_{j,t,\tau}(w_j)\,dw_j\otimes d\bar w_j \tag{2.2.7} \]

is the restriction of the complete hyperbolic metric on the annulus $\{z_j\in\mathbb{C} : |t_j|<|z_j|<1\}\ (\supset N_{j,t,\tau})$; when $t_j=0$, similar statements also hold for the two corresponding punctured coordinate discs. For fixed $(t,\tau)\in U$ with $t_j\ne0$, we write

\[ \rho_{t,\tau} = \rho_{t,\tau}(z_j)\,dz_j\otimes d\bar z_j = \rho_{t,\tau}(w_j)\,dw_j\otimes d\bar w_j \quad \text{on } N_{j,t,\tau}. \tag{2.2.8} \]
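Two properties of the model density stated around (2.2.5)–(2.2.7) can be checked numerically: its invariance under the coordinate change $w_j = t_j/z_j$, and its convergence to the cusp density (2.2.6) as $t_j\to0$. The sketch below is only an illustration under the formulas (2.2.5) and (2.2.6); the function names are hypothetical.

```python
import math

def rho_annulus(abs_z: float, abs_t: float) -> float:
    """Density (2.2.5) of the hyperbolic metric on {|t| < |z| < 1}."""
    a = math.pi / math.log(abs_t)
    return (a / (abs_z * math.sin(a * math.log(abs_z)))) ** 2

def rho_cusp(abs_z: float) -> float:
    """Density (2.2.6) of the hyperbolic metric on the punctured unit disc."""
    return 1.0 / (abs_z ** 2 * math.log(abs_z) ** 2)

def pulled_back_density(abs_z: float, abs_t: float) -> float:
    """Density in the w-coordinate, rewritten in z via w = t/z, |dw| = |t| |dz| / |z|^2."""
    abs_w = abs_t / abs_z
    return rho_annulus(abs_w, abs_t) * (abs_t / abs_z ** 2) ** 2
```

With $|t_j| = 10^{-6}$ and $|z_j| = 0.1$, `rho_annulus` and `pulled_back_density` agree up to rounding, which reflects the second equality in (2.2.7); and the ratio `rho_annulus(0.1, 1e-300) / rho_cusp(0.1)` is close to 1, in line with the gluing statement.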

Proposition 2.2.1. (i) For each $1\le j\le m$, there exist constants $C_3, C_4>0$ such that for all $(t,\tau)\in U$ with $t_j\ne0$, one has

\[ C_3\,\rho^*_{j,t,\tau}(z_j) \le \rho_{t,\tau}(z_j) \le C_4\,\rho^*_{j,t,\tau}(z_j) \quad \text{on } N_{j,t,\tau}. \tag{2.2.9} \]

A similar inequality also holds for the coordinate $w_j$. In particular, there exist constants $C_5, C_6>0$ such that for all $(t,\tau)\in U$ with $t_j\ne0$, one has

\[ \frac{C_5}{|z_j|^2(\log|z_j|)^2} \le \rho_{t,\tau}(z_j) \le \frac{C_6}{|z_j|^2(\log|z_j|)^2} \quad \text{on } N^1_{j,t,\tau}. \tag{2.2.10} \]

A similar inequality (with $z_j$ replaced by $w_j$) also holds for the region $N^2_{j,t,\tau}$.


(ii) (Strengthened version of (i)) If, in addition, $z_j$ and $w_j$ are standard local holomorphic coordinate functions for $X_0$ (cf. Remark-Definition 2.1.2), then the inequalities in (2.2.9) and (2.2.10) remain valid with the constants $C_3, C_4, C_5, C_6$ replaced by positive continuous functions $C_{3,t,\tau}, C_{4,t,\tau}, C_{5,t,\tau}, C_{6,t,\tau}$ (depending on $(t,\tau)\in U$ with $t_j\ne0$) respectively and satisfying

\[ C_{3,t,\tau},\ C_{4,t,\tau},\ C_{5,t,\tau},\ C_{6,t,\tau} \to 1 \quad \text{as } (t,\tau)\to(0,0). \tag{2.2.11} \]

Proof. The proof of (i) for $T_{g,n}$ with $n>0$ is the same as the case of $T_{g,0}$ given in [M, p. 632]. Next we recall Bers' result [Be], which implies that $\rho(z_j,t,\tau)$ is locally uniformly continuous in all variables at points where $z_j\ne0$. Then the proof of (ii) follows from this result and a simple adaptation of that of (i) in a manner similar to Proposition 2.1.3, which will be left to the reader.

3. Regular Quadratic Differentials and the Weil-Petersson Metric

3.1. To facilitate the ensuing discussion, we recall in this section Masur's construction in [M] of a certain local basis of regular quadratic differentials for a degenerating family of punctured Riemann surfaces. The concept of regular quadratic differentials dates back to earlier works of Bers (see e.g. [Be]). Since only the case of a degenerating family of compact Riemann surfaces was explicitly discussed in [M], we will indicate briefly the necessary modifications arising from the punctures of the Riemann surfaces for the convenience of the reader. Similar to [M, p. 627], we first have

Definition 3.1.1. (a) Let $X$ be a Riemann surface with possibly both punctures and nodes, and denote the smooth part of $X$ by $X^o$. For $k=1,2$, a regular $k$-differential $\phi$ on $X$ is a holomorphic section of $K^k_{X^o}$ such that (i) $\phi$ has at most a simple pole at each puncture of $X$; and (ii) $\phi$ has at most a pole of order $k$ at each of the two punctures of $X^o$ associated to a node of $X$; moreover, the residues of $\phi$ at each such pair of punctures are equal if $k=2$ and opposite if $k=1$. Here we recall that the residue of $\phi=\phi(z)\,dz^k$ at a point $z=0$ is given locally by the residue of the abelian differential $\phi(z)z^{k-1}\,dz$.

(b) For $k=1,2$, a regular $k$-differential on a family of Riemann surfaces with punctures and nodes is a holomorphic function element on the total space which restricts to a regular $k$-differential on each fiber.

We remark that when $X$ has no punctures, the above definition is standard and well-known (see e.g. [M, §4]). When $X$ has no nodes, the space of regular 2-differentials on $X$ coincides with the space of integrable holomorphic quadratic differentials on $X$.

Let $X_0\in\overline{M}_{g,n}\setminus M_{g,n}$ be a Riemann surface with punctures $p_1,\dots,p_n$ and nodes $q_1,\dots,q_m$, and let $\hat U$ be an open neighborhood of $X_0$ in $\overline{M}_{g,n}$ with a local uniformizing chart $\psi: U\cong\Delta^m(r)\times V\to\hat U$, where $V^{3g-3+n-m}$ is a domain in a suitable boundary Teichmüller space $\delta_{\gamma_1,\dots,\gamma_m}T_{g,n}$ as in Sect. 1.3. Also, we let $\pi: X=\{X_{t,\tau}\}_{(t,\tau)\in U}\to U$ be the corresponding degenerating family of Riemann surfaces associated to a choice of Beltrami differentials $\nu_1,\dots,\nu_{3g-3+n-m}$ on $X_0$ as in Sect. 1.3. As in the case of $M_{g,0}$ in [M, pp. 625-626] and for each $1\le i\le 3g-3+n-m$, the coordinate tangent vector $\partial/\partial\tau_i$ at $(t,\tau)=(t_1,\dots,t_m,\tau_1,\dots,\tau_{3g-3+n-m})\in U$ is identified with the Beltrami differential

\[ \left( \frac{\nu_i}{1-|\mu(\tau)|^2}\cdot\frac{w^{\mu(\tau)}_z}{\bar w^{\mu(\tau)}_{\bar z}} \right)\circ\bigl(w^{\mu(\tau)}\bigr)^{-1} \quad \text{on } X_{t,\tau}, \tag{3.1.1} \]


where $w^{\mu(\tau)}$ is as in Sect. 1.3, and like the $\nu_i$'s, it is easily seen to be of compact support away from the punctures and nodes of $X_{t,\tau}$. In addition, for each $1\le j\le m$, the tangent vector $\partial/\partial t_j$ at $(t,\tau)\in U$ is identified with the Beltrami differential

\[ \frac{\partial}{\partial t_j}(t,\tau) \leftrightarrow \frac{1}{2t_j\log|t_j|}\,\frac{z_j\,d\bar z_j}{\bar z_j\,dz_j} = \frac{1}{2t_j\log|t_j|}\,\frac{w_j\,d\bar w_j}{\bar w_j\,dw_j} \tag{3.1.2} \]

supported on $N_{j,t,\tau}\subset X_{t,\tau}$, where $N_{j,t,\tau}$ is as in (2.2.1) (cf. [M, p. 626]). Recall from (2.2) the open coordinate neighborhood $N_j=\Delta^{m+1}(r)\times V$ of each node $q_j$ of $X_0$ in $X$ with the corresponding decomposition $N_j=N_j^1\cup N_j^2$ given in (2.2.3). Recall also from (2.1) the open coordinate neighborhood $W_i=\Delta^*(R)\times U$ of each puncture $p_i$ of $X_0$ in $X$. It is also clear from the constructions in (1.3) that $X\setminus\bigl(\bigcup_{i=1}^n W_i\cup\bigcup_{j=1}^m N_j\bigr)$ can be covered by a finite number of coordinate neighborhoods $\{A_\ell\}_{1\le\ell\le\ell_o}$ of $X$, where each $A_\ell$ is of the form $A_\ell=\Delta(r)\times U$ with

\[ A_{\ell,t,\tau} := A_\ell\cap X_{t,\tau} = \Delta(r)\times\{(t,\tau)\} \tag{3.1.3} \]

for each $(t,\tau)\in U$. Here, $\ell_o\in\mathbb{Z}_+$, and $\Delta(r)=\{z_\ell\in\mathbb{C} \mid |z_\ell|<r\}$ with $r>0$. For each non-empty subset $J\subset\{1,2,\dots,m\}$, let $B(J)=\{(t,\tau)\in U \mid t_j=0 \text{ for all } j\in J\}$, and let $T(J)=\{\partial/\partial t_j \mid 1\le j\le m,\ j\notin J\}\cup\{\partial/\partial\tau_\ell \mid 1\le\ell\le 3g-3+n-m\}$. Let $U^*\cong(\Delta^*(r))^m\times V\subset U$ be as in Theorem 1. Shrinking $U$ if necessary, one has

Proposition 3.1.2. ([M]). There exist regular 2-differentials $\phi_k=\phi_k(z,t,\tau)\,dz^2$, $k=1,2,\dots,3g-3+n$, on $X=\{X_{t,\tau}\}_{(t,\tau)\in U}$ satisfying the following properties:

(i) At each $(t,\tau)\in U^*$, $\{\phi_k\}_{1\le k\le 3g-3+n}$ forms a basis of regular 2-differentials on $X_{t,\tau}$ dual to the ordered set of tangent vectors $\{\partial/\partial t_j\}_{1\le j\le m}\cup\{\partial/\partial\tau_\ell\}_{1\le\ell\le 3g-3+n-m}$ via the identifications (3.1.1), (3.1.2) and with respect to the pairing in (1.1.3).

(ii) For each non-empty subset $J\subset\{1,2,\dots,m\}$, $\phi_k\equiv0$ on $B(J)$ for each $k\in J$, and $\{\phi_k\}_{k\in\{1,\dots,3g-3+n\}\setminus J}$ is dual to the ordered set $T(J)$ on $B(J)$ with respect to the pairing in (1.1.3).

(iii) For each $1\le k,j\le m$, one has, on $N_j^1$,

\[ \phi_k(z_j,t,\tau) = -\frac{t_k}{\pi}\left( \frac{\delta_{kj}}{z_j^2} + a_{-1}(z_j,t,\tau) + \frac{1}{z_j^2}\sum_{\ell=1}^{\infty}\left(\frac{t_j}{z_j}\right)^{\ell} t_j^{\kappa(\ell)}\,a_\ell(t,\tau) \right), \tag{3.1.4} \]

where $\delta_{kj}$ is the Kronecker symbol, each integer $\kappa(\ell)\ge0$, $a_{-1}$ has at most a simple pole at $z_j=0$, and $a_\ell$ ($\ell\ge1$) is holomorphic. In particular, there exist constants $C_1, C_2, C_3>0$ such that on $N_j^1$, one has

\[ \begin{cases} C_1\dfrac{|t_j|}{|z_j|^2} \le |\phi_j(z_j,t,\tau)| \le C_2\dfrac{|t_j|}{|z_j|^2} & \text{if } 1\le j\le m, \\[2mm] |\phi_k(z_j,t,\tau)| \le C_3\dfrac{|t_k|}{|z_j|} & \text{if } 1\le k\ne j\le m. \end{cases} \tag{3.1.5} \]

Similar expressions hold on $N_j^2$ with respect to the $(w_j,t,\tau)$-coordinates.


(iv) For each $m+1\le k\le 3g-3+n$ and $1\le j\le m$, one has, on $N_j^1$,

\[ \phi_k(z_j,t,\tau) = \phi_k(z_j,0,0) + \frac{1}{z_j^2}\sum_{\ell=1}^{\infty}\left(\frac{t_j}{z_j}\right)^{\ell} t_j^{\tilde\kappa(\ell)}\,b_\ell(t,\tau) + \sum_{\ell=-1}^{\infty} z_j^{\ell}\,c_\ell(t,\tau), \tag{3.1.6} \]

where each integer $\tilde\kappa(\ell)\ge0$, $\phi_k(z_j,0,0)$ has at most a simple pole at $z_j=0$, and $b_\ell$, $c_\ell$ are holomorphic with $c_\ell(0,0)=0$. In particular, there exists a constant $C_4>0$ such that on $N_j^1$, one has

\[ |\phi_k(z_j,t,\tau)| \le \frac{C_4}{|z_j|} \quad \text{if } m+1\le k\le 3g-3+n \text{ and } 1\le j\le m. \tag{3.1.7} \]

Similar expressions hold on $N_j^2$ with respect to the $(w_j,t,\tau)$-coordinates.

(v) For each $1\le i\le n$, one has, on $W_i=\Delta^*(R)\times U$,

\[ \phi_k(z_i,t,\tau) = \begin{cases} -\dfrac{t_k\,d_k(z_i,t,\tau)}{\pi z_i} & \text{if } 1\le k\le m, \\[2mm] \dfrac{d_k(z_i,t,\tau)}{z_i} & \text{if } m+1\le k\le 3g-3+n, \end{cases} \tag{3.1.8} \]

where each $d_k(z_i,t,\tau)$ is holomorphic on $W_i$. In particular, there exist constants $C_5, C_6>0$ such that on $W_i$, one has

\[ |\phi_k(z_i,t,\tau)| \le \begin{cases} C_5\dfrac{|t_k|}{|z_i|} & \text{if } 1\le k\le m, \\[2mm] \dfrac{C_6}{|z_i|} & \text{if } m+1\le k\le 3g-3+n. \end{cases} \tag{3.1.9} \]

(vi) For each $1\le\ell\le\ell_o$ and $1\le k\le 3g-3+n$, $\phi_k(z_\ell,t,\tau)$ is holomorphic on $A_\ell=\Delta(r)\times U$. Moreover, for $1\le k\le m$, one has, on $A_\ell$,

\[ \phi_k(z_\ell,t,\tau) = -\frac{t_k}{\pi}\,e_k(z_\ell,t,\tau) \tag{3.1.10} \]

for some holomorphic function $e_k(z_\ell,t,\tau)$. In particular, upon shrinking $r$ if necessary, there exist constants $C_7, C_8>0$ such that on $A_\ell$, one has

\[ |\phi_k(z_\ell,t,\tau)| \le \begin{cases} C_7|t_k| & \text{if } 1\le k\le m, \\ C_8 & \text{if } m+1\le k\le 3g-3+n. \end{cases} \tag{3.1.11} \]

Proof. The proof in the general case when $n>0$ follows mutatis mutandis from the discussions of the case when $n=0$ in [M, §4, §5 and §7], to which we refer the reader for details. Here we only indicate the necessary modifications arising from the punctures. By adjoining $n$ points to each fiber $X_{t,\tau}$ corresponding to the punctures, one has an associated family of compact Riemann surfaces $\bar\pi:\bar X\to U$, where the punctures of the $X_{t,\tau}$'s correspond to $n$ non-intersecting holomorphic sections of $\bar\pi$, which we denote by $\sigma_1^{(p)},\dots,\sigma_n^{(p)}$. Applying the arguments of [M, Lemma 4.3], one can produce a regular 1-differential $\psi=\psi(z,t,\tau)\,dz$ and $2g-2$ disjoint holomorphic sections $\sigma_1,\dots,\sigma_{2g-2}$ of the family $\bar\pi:\bar X\to U$ such that each $\sigma_i(t,\tau)$ is a zero of $\psi(z,t,\tau)\,dz$ and each $\sigma_i(t,\tau)$ misses the nodes and the punctures of $X_{t,\tau}$. Then using $\psi$ and the $2g-2+n$ disjoint sections $\sigma_1,\dots,\sigma_{2g-2},\sigma_1^{(p)},\dots,\sigma_n^{(p)}$, one can produce the desired regular 2-differentials by following the arguments in [M, §5 and §7]. Finally we remark that (3.1.5) (resp. (3.1.7)) follows readily from (3.1.4) (resp. (3.1.6)) and the inequality $|t_j|^{1/2}\le|z_j|<r$ which holds on $N_j^1$ (cf. (2.2.3)).

3.2. Next we recall the well-known result of Masur [M] on the asymptotic behavior of the Weil-Petersson metric $g^{WP}$ on $T_{g,n}$. It should be remarked that this result has been improved recently by Wolpert [Wolp5] and Obitsu-Wolpert [OW], where information on higher order terms is obtained. Masur's original result will be sufficient for our purpose. As in Sect. 3.1, since only the case when $n=0$ was explicitly discussed in [M], we will indicate briefly the modifications needed for the case when $n>0$ for the convenience of the reader.

Proposition 3.2.1. ([M]). For $g\ge0$ and $n>0$, let $X_0\in\overline{M}_{g,n}\setminus M_{g,n}$ with local uniformizing chart $\psi: U\cong\Delta^m(r)\times V\to\hat U$, where $V^{3g-3+n-m}\subset\delta_{\gamma_1,\dots,\gamma_m}T_{g,n}$, $U^*\cong(\Delta^*(r))^m\times V\subset U$, and corresponding local coordinates $(s_1,\dots,s_{3g-3+n})=(t_1,\dots,t_m,\tau_1,\dots,\tau_{3g-3+n-m})=(t,\tau)$ be as in Theorem 1. Denote the components of the Weil-Petersson metric $g^{WP}$ by

\[ g^{WP}_{i\bar j} = g^{WP}\!\left(\frac{\partial}{\partial s_i},\frac{\partial}{\partial s_j}\right), \quad 1\le i,j\le 3g-3+n, \]

on $U^*$. Then the following statements hold:

(i) For each $1\le j\le m$, one has

\[ 0 < \liminf_{(t,\tau)\in U^*\to(0,0)} |t_j|^2(-\log|t_j|)^3\,g^{WP}_{j\bar j}(t,\tau) \le \limsup_{(t,\tau)\in U^*\to(0,0)} |t_j|^2(-\log|t_j|)^3\,g^{WP}_{j\bar j}(t,\tau) < \infty. \tag{3.2.1} \]

(ii) For each $1\le j,k\le m$ with $j\ne k$, one has

\[ \bigl|g^{WP}_{j\bar k}(t,\tau)\bigr| = O\!\left(\frac{1}{|t_j|\,|t_k|\,(\log|t_j|)^3(\log|t_k|)^3}\right) \quad \text{as } (t,\tau)\in U^*\to(0,0). \tag{3.2.2} \]

(iii) For each $j,k\ge m+1$, one has

\[ \lim_{(t,\tau)\in U^*\to(0,0)} g^{WP}_{j\bar k}(t,\tau) = g^{WP}_{j\bar k}(0,0), \tag{3.2.3} \]

where $g^{WP}_{j\bar k}(0,0)$ denotes the $(j,k)$th component of the Weil-Petersson metric on the boundary Teichmüller space $\delta_{\gamma_1,\dots,\gamma_m}T_{g,n}$ at $X_0$.

(iv) For each $1\le j\le m$ and $k\ge m+1$, one has

\[ \bigl|g^{WP}_{j\bar k}(t,\tau)\bigr| = O\!\left(\frac{1}{|t_j|(-\log|t_j|)^3}\right) \quad \text{as } (t,\tau)\in U^*\to(0,0). \tag{3.2.4} \]
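To get a feeling for the growth rate in (3.2.1), one may look at the one-variable model density $g(t)=1/(|t|^2(-\log|t|)^3)$, which is an assumption made purely for illustration and is not the Weil-Petersson metric itself. Along a radius, the length element is $\sqrt{g}\,d|t| = d|t|/(|t|(-\log|t|)^{3/2})$, and the substitution $u=-\log|t|$ gives the closed form $2(-\log r_0)^{-1/2}-2(-\log r_1)^{-1/2}$ for the length between radii $r_1<r_0<1$. The sketch below, with hypothetical function names, compares this closed form with a midpoint-rule quadrature.

```python
import math

def model_length_closed_form(r1: float, r0: float) -> float:
    """Radial length between radii r1 < r0 < 1 for the model density
    g(t) = 1 / (|t|^2 (-log|t|)^3)  (an illustrative assumption)."""
    return 2.0 / math.sqrt(-math.log(r0)) - 2.0 / math.sqrt(-math.log(r1))

def model_length_numeric(r1: float, r0: float, steps: int = 100000) -> float:
    """Same length via the midpoint rule in the variable u = -log|t|."""
    u0, u1 = -math.log(r0), -math.log(r1)
    h = (u1 - u0) / steps
    return sum((u0 + (k + 0.5) * h) ** -1.5 for k in range(steps)) * h
```

In this model the closed form stays bounded as $r_1\to0$, so the origin lies at finite radial distance.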


Proof. The proof in the general case when $n>0$ follows mutatis mutandis from the arguments for the case when $n=0$ in [M, §7, proof of Theorem 1] with [M, Prop. 7.1] replaced by Proposition 3.1.2. For $1\le i\le n$ and $(t,\tau)\in U$, let $W_{i,t,\tau}$ be as in (2.1). We remark that the only extra integral estimates needed are those on the $W_{i,t,\tau}$'s as follows:

\[ \int_{W_{i,t,\tau}} \frac{\phi_k\overline{\phi_\ell}}{\rho_{t,\tau}} = \begin{cases} O(|t_k|\,|t_\ell|) & \text{if } 1\le k,\ell\le m, \\ O(|t_k|) & \text{if } 1\le k\le m \text{ and } m+1\le\ell\le 3g-3+n, \end{cases} \tag{3.2.5} \]

as $(t,\tau)\to(0,0)$, and for $m+1\le k,\ell\le 3g-3+n$,

\[ \lim_{(t,\tau)\to(0,0)} \int_{W_{i,t,\tau}} \frac{\phi_k\overline{\phi_\ell}}{\rho_{t,\tau}} = \int_{W_{i,0,0}} \frac{\phi_k\overline{\phi_\ell}}{\rho_{0,0}}. \tag{3.2.6} \]

The estimates in (3.2.5) and the limit in (3.2.6) follow readily from a straightforward calculation using Proposition 2.1.3(i), Proposition 3.1.2 and the dominated convergence theorem.

4. Estimates on the Eisenstein Series

In this section, we are going to obtain some estimates on the Eisenstein series $E(z,s)$ in the setting of holomorphic families of degenerating punctured Riemann surfaces, which will be needed for the ensuing discussion in §5. We should clarify that by an Eisenstein series, we will mean here the real-analytic (non-holomorphic) series as in (1.1.1) or [Ku] rather than the more customary holomorphic ones. Our approach is geometrical in nature, and it consists largely of constructing suitable germs of comparison functions for the Eisenstein series near the nodes and punctures. Starting from Sect. 4.2, we will restrict our discussions to $E(z,2)$, although most of our discussions will also be valid for $E(z,s)$ with $\operatorname{Re}s>1$.

4.1. First we extend the definition of Eisenstein series to the case of punctured Riemann surfaces with nodes. Let $X$ be a stable connected Riemann surface with $n$ punctures $p_1,\dots,p_n$ and $m$ nodes $q_1,\dots,q_m$. Then $X^\circ := X\setminus\{q_1,\dots,q_m\}$ is a smooth punctured Riemann surface with $n+2m$ punctures, and we denote the connected components of $X^\circ$ by $S_\alpha$, $\alpha=1,\dots,d$ (cf. Sect. 1.2). We denote the new punctures by $p_{n+1},\dots,p_{n+2m}$. Each old or new puncture $p_i$, $1\le i\le n+2m$, of $X^\circ$ is a puncture of a unique $S_{\alpha(i)}$ for some $1\le\alpha(i)\le d$.

Definition 4.1.1. For $1\le i\le n+2m$ and $s\in\mathbb{C}$ with $\operatorname{Re}s>1$, the Eisenstein series $E_i(\cdot,s)$ on $X$ attached to $p_i$ is defined by

\[ E_i(z,s) = \begin{cases} E_{i,S_{\alpha(i)}}(z,s) & \text{if } z\in S_{\alpha(i)}, \\ 0 & \text{if } z\in X\setminus S_{\alpha(i)}, \end{cases} \tag{4.1.1} \]

where $E_{i,S_{\alpha(i)}}(\cdot,s)$ is the corresponding Eisenstein series on $S_{\alpha(i)}$ attached to $p_i$ given as in (1.1.1).


In the case when $X$ has no nodes, it is well known that for $1\le i,j\le n$ and in terms of the Euclidean coordinate $Z=X+iY$ on $\mathbb{H}$ with $p_j$ corresponding to $\infty$ (as in the proof of Lemma 2.1.1), there exists some constant $c>0$ such that

\[ E_i(Z,s) = \delta_{ij}Y^s + \phi_{ij}(s)Y^{1-s} + o(e^{-cY}) \quad \text{as } Y\to\infty, \tag{4.1.2} \]

where $\delta_{ij}$ is the Kronecker symbol and $(\phi_{ij}(s))$ is a symmetric $n\times n$ matrix (cf. e.g. [Ku] and [Wolp4, p. 260]). In this section, we are going to give a variant of (4.1.2) for a Riemann surface $X$ with nodes. For a point $z\in X^\circ$, we denote by $\operatorname{injrad}(z)$ the injectivity radius of $X^\circ$ at $z$ with respect to the complete hyperbolic metric on $X^\circ$.

Proposition 4.1.2. Notation as above. Fix an integer $1\le i\le n+2m$, and let $s\in\mathbb{C}$ be a fixed number with $\operatorname{Re}s>1$.

(i) Let $z_i$ be a standard local holomorphic coordinate function around $p_i$ (cf. Remark-Definition 2.1.2 (i) and (ii)). Then for any $\epsilon>0$, there exists a constant $C_{s,\epsilon}>0$ (depending only on $s$ and $\epsilon$, and independent of $X$) such that

\[ \left| E_i(z_i,s) - \left(\frac{-\log|z_i|}{2\pi}\right)^{s} \right| \le C_{s,\epsilon} \quad \text{on } \Delta^*(e^{-2\pi e^{\epsilon}}) := \{z_i\in\mathbb{C} \mid 0<|z_i|<e^{-2\pi e^{\epsilon}}\}. \tag{4.1.3} \]

(ii) For any $\kappa>0$, there exists a constant $C_{s,\kappa}>0$ (depending only on $s$ and $\kappa$) such that

\[ |E_i(z,s)| \le C_{s,\kappa} \quad \text{for any } z\in X^\circ \text{ with } \operatorname{injrad}(z)\ge\kappa. \tag{4.1.4} \]

(iii) For $1\le i\ne j\le n+2m$, one has

\[ E_i(z,s)\to0 \quad \text{as } z\to p_j. \tag{4.1.5} \]
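The leading term in (4.1.3) is simply $Y^s$ from (4.1.2) rewritten in the cusp coordinate $z_i=e^{2\pi iZ}$, since $|z_i|=e^{-2\pi Y}$ gives $Y=-\log|z_i|/(2\pi)$. The following sketch, with hypothetical function names, makes this change of variable explicit and also checks that the region $\operatorname{Im}Z\ge e^{\epsilon}$ corresponds to $|z_i|\le e^{-2\pi e^{\epsilon}}$.

```python
import cmath
import math

def cusp_coordinate(Z: complex) -> complex:
    """Coordinate z = exp(2*pi*i*Z) near the puncture corresponding to infinity."""
    return cmath.exp(2j * math.pi * Z)

def leading_term(z: complex, s: complex) -> complex:
    """Leading term (-log|z| / (2*pi))**s appearing in (4.1.3)."""
    return (-math.log(abs(z)) / (2.0 * math.pi)) ** s

def horoball_radius(eps: float) -> float:
    """Radius exp(-2*pi*e^eps) of the punctured disc in (4.1.3)."""
    return math.exp(-2.0 * math.pi * math.exp(eps))
```

For example, for $Z=0.3+2i$ and $s=2$ the leading term equals $Y^2=4$ up to rounding.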

Proof. We remark that to prove (i), (ii) and (iii), it follows readily from (4.1.1) that we may assume without loss of generality that $X^\circ$ is connected with $p_i$ (and possibly $p_j$) as one of its punctures. To prove (i), we recall from (1.1) that we may write $X^\circ=\mathbb{H}/\Gamma$, where $\Gamma$ is a Fuchsian group which uniformizes $X$ with $\infty$ corresponding to $p_i$, and the infinite cyclic subgroup $\Gamma_\infty\subset\Gamma$ generated by $Z\mapsto Z+1$, $Z\in\mathbb{H}$, corresponds to the parabolic transformations of $\Gamma$ fixing $p_i$. Let $C_\infty := \{Z\in\mathbb{C} \mid \operatorname{Im}Z\ge1\}$ be a horoball around $\infty$ in $\mathbb{H}$. As mentioned in Remark-Definition 2.1.2(ii), it follows from the collar lemma for non-compact surfaces ([Bu, Theorem 4.4.6, p. 112]) that $C_\infty$ descends under the projection map $z_i = e^{2\pi iZ}$

(4.1.6) to the punctured coordinate neighborhood ∗ (e−2π ) := {z i ∈ C 0 < |z i | < e−2π } around pi . Let > 0 be a given constant. We recall from [Ku, Sect. 1.3] (cf. also [O p.146]) the following integral representation for the Eisenstein series: For Z ∈ H, one has s−2 (s)E i (Z , s) = Y d X dY , where (4.1.7) γ ∈∞ \

(s) : =

B(γ Z ,)

Y B(i,)

s−2

d X dY .

(4.1.8)


Here Z = X + iY denotes the Euclidean coordinate function on H, and B(Z , ) denotes the hyperbolic geodesic ball in H of radius and with center at Z . From (4.1.7), we have s−2 s−2 (s)E i (Z , s) = Y d X dY + Y d X dY B(Z ,)

= (s)(Im Z ) + s

γ ∈∞ \ γ =id

γ ∈∞ \ γ =id

B(γ Z ,)

B(γ Z ,)

Y

s−2

d X dY ,

(4.1.9)

where the second line is obtained by making the change of variable Z = (Im Z )−1 · (Z − Re Z ) in the first integral of the first line and then invoking the definition of (s) in (4.1.8). (Note that the above change of variable corresponds to a hyperbolic isometry on H.) Next we find an absolute bound for the last term of (4.1.9) by adapting the proof := {Z ∈ C | Im Z ≥ e }, which is easily of Theorem 2.1.2 in [Ku, p.12]. Let C∞ seen to descend under the map in (4.1.6) to the punctured coordinate neighborhood and γ ∈ ,

∗ (e−2π e ) (⊂ ∗ (e−2π )) in (4.1.3). It is easy to see that for any Z ∈ C∞ one has B(Z , ) ⊂ C∞ , and B(γ Z , ) = γ (B(Z , )) ⊂ γ (C∞ ) = Cγ (∞) ,

(4.1.10)

where Cγ (∞) denotes the corresponding horoball around the cusp γ (∞), which is isometric to C(∞) via γ . By the collar lemma mentioned above, all the horoballs Cγ (∞) , γ ∈ ∞ \ , are mutually disjoint. It follows that all the hyperbolic geodesic balls B(γ Z , ), id = γ ∈ ∞ \, are mutually disjoint, and thus they may be considered to be disjoint subsets of {Z ∈ H | − 1 ≤ Re Z ≤ 2, 0 < Im Z ≤ e },

(4.1.11)

after we choose suitable representatives in the coset decomposition of ∞ \. Together , one has with (4.1.9), it follows that for any Z ∈ C∞ s−2 (s) E i (Z , s) − (Im Z )s ≤ Y d X dY −1≤X ≤2 0≤Y ≤e

=

3 e(Re s−1) . Re s − 1

(4.1.12)

Re s−2 d X dY > 0 and it depends Observe from (4.1.8) that (s) = B(i,) Y to the corresponding only on s and . By descending the inequality in (4.1.12) on C∞ ∗ −2π e ) via the map in (4.1.6), one easily sees that (4.1.3) holds with inequality on (e the constant given by 3 e(Re s−1) , (4.1.13) Cs, = (s)(Re s − 1) and this finishes the proof of (i). Next we proceed to give the proof of (ii), which is similar to that of (i). Let p: H → X ◦ denote the covering space projection. Let z ∈ X ◦ be a point with injrad(z) ≥ κ, and fix a point Z ∈ H such that p(Z ) = z. With ∞ ⊂ and other notations as in (i) above, it is easy to see that injrad( p(Z )) ≥ κ2 for any Z ∈ B(γ Z , κ2 ) and any γ ∈ ∞ \. For any Z ∈ H, it is easy to calculate that the


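hyperbolic length of the horizontal line segment from $Z$ to $Z+1$ is $\frac{1}{\operatorname{Im}Z}$, which implies readily that $\operatorname{injrad}(p(Z))\le\frac{1}{\operatorname{Im}Z}$ (since $p(Z)=p(Z+1)$). Hence we have

\[ \operatorname{Im}Z' \le \frac{2}{\kappa} \quad \text{for all } Z'\in B(\gamma Z,\tfrac{\kappa}{2}),\ \gamma\in\Gamma_\infty\backslash\Gamma. \tag{4.1.14} \]

The condition $\operatorname{injrad}(z)\ge\kappa$ also implies readily that the geodesic balls $B(\gamma Z,\frac{\kappa}{2})$, $\gamma\in\Gamma_\infty\backslash\Gamma$, are mutually disjoint, and thus similar to (4.1.11), they may be regarded as disjoint subsets of

\[ \{Z\in\mathbb{H} \mid -1\le\operatorname{Re}Z\le2,\ 0<\operatorname{Im}Z\le\tfrac{2}{\kappa}\}. \tag{4.1.15} \]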

Together with (4.1.7) and (4.1.8) (and with = κ2 ), it follows as in (4.1.12) that one has s−2 Y d X dY κ (s) E i (Z , s) ≤ −1≤X ≤2 2 0≤Y ≤ κ2

3 · = (Re s − 1)

Re s−1 2 . κ

(4.1.16)
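The right-hand side of (4.1.16) is the elementary integral of $Y^{\operatorname{Re}s-2}$ over the strip (4.1.15), namely $\int_{-1}^{2}\int_0^{2/\kappa}Y^{\sigma-2}\,dY\,dX = \frac{3}{\sigma-1}\bigl(\frac{2}{\kappa}\bigr)^{\sigma-1}$ with $\sigma=\operatorname{Re}s>1$. The sketch below compares this closed form with a midpoint-rule quadrature; the function names are hypothetical, and the quadrature is only meant for $\sigma\ge2$, where the integrand is bounded near $Y=0$.

```python
def strip_integral_closed_form(sigma: float, height: float) -> float:
    """Integral of Y**(sigma-2) over -1 <= X <= 2, 0 <= Y <= height, for sigma > 1."""
    return 3.0 * height ** (sigma - 1.0) / (sigma - 1.0)

def strip_integral_midpoint(sigma: float, height: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the same integral (intended for sigma >= 2)."""
    h = height / steps
    inner = sum(((k + 0.5) * h) ** (sigma - 2.0) for k in range(steps)) * h
    return 3.0 * inner  # the X-interval [-1, 2] has length 3
```

For $s=2$, which is the case used from Sect. 4.2 onwards, the closed form reduces to $6/\kappa$.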

By descending the above inequality to X o , one easily sees that (4.1.4) holds with the constant given by Re s−1 2 3 · = , (4.1.17) Cs,κ κ (s)(Re s − 1) κ 2 and this finishes the proof of (ii). Finally one easily sees that (4.1.5) is a direct consequence of (4.1.2), and this finishes the proof of Proposition 4.1.2.

4.2. Upper bound of $E_{i,t,\tau}$ near a node. Notation as in §1. Let $X_0\in\overline{M}_{g,n}\setminus M_{g,n}$ be a Riemann surface with $n$ punctures at $p_1,\dots,p_n$ and $m$ nodes at $q_1,\dots,q_m$, and let $\hat U$ be an open neighborhood of $X_0$ in $\overline{M}_{g,n}$ together with a local uniformizing chart $\chi:U\to\hat U$, where $U\cong\Delta^m(r)\times V=\{(t,\tau)=(t_1,\dots,t_m,\tau_1,\dots,\tau_{3g-3+n-m}) : t\in\Delta^m(r),\ \tau\in V\}$, and $V^{3g-3+n-m}$ is an open coordinate neighborhood of $X_0$ in $\delta_{\gamma_1,\dots,\gamma_m}T_{g,n}$ as in (1.3). Let $X:=\{X_{t,\tau}\}_{(t,\tau)\in U}$ be the corresponding family of Riemann surfaces parametrized by $U$ with $X_0=X_{(0,0)}$. Let $U^*\cong(\Delta^*(r))^m\times V\subset U$ be as in Theorem 1. For each $1\le i\le n$ and $(t,\tau)\in U$, we denote the Eisenstein series with $s=2$ on $X_{t,\tau}$ associated to the puncture corresponding to $p_i$ by $E_{i,t,\tau}$ (à la Definition 4.1.1 when some $t_j=0$). It is well-known that $\{E_{i,t,\tau}\}_{(t,\tau)\in U^*}$ form a continuous family of functions on $\{X_{(t,\tau)}\}_{(t,\tau)\in U^*}$. The following proposition follows from previous work of Obitsu [O2]:

Proposition 4.2.1. ([O2]). For each $i=1,\dots,n$, $E_{i,t,\tau}$ converges uniformly on compact subsets of $X_0\setminus\{p_1,\dots,p_n,q_1,\dots,q_m\}$ to $E_{i,0,0}$ as $(t,\tau)\in U^*\to(0,0)$.

Here, it is easy to see that a compact set $K\subset X_0\setminus\{p_1,\dots,p_n,q_1,\dots,q_m\}$ can be extended to a neighborhood of the form $K\times U$ in the total space of $\{X_{t,\tau}\}_{(t,\tau)\in U}$, shrinking $U$ if necessary. Therefore, $E_{i,t,\tau}$ may be regarded as a function on $K$ for $(t,\tau)$ sufficiently close to $(0,0)$.


For a fixed integer $i$ with $1\le i\le n$, we are going to give a pointwise upper bound of $E_{i,t,\tau}$ near a node $q_j$ with $1\le j\le m$. Let $N_j=\Delta^{m+1}(r)\times V\subset\subset\Delta^{m+1}(r')\times V=N_j'$ (with $0<r<r'<1$), $N_{j,t,\tau}$, $N'_{j,t,\tau}$, $N^1_{j,t,\tau}$, $N^2_{j,t,\tau}$, $z_j$, $w_j$ be as in Sect. 2.2. Motivated by (2.2.5) and (2.2.6), we consider a family of comparison functions for the $E_{i,t,\tau}$'s as follows: For each $(t,\tau)\in U$ with $t_j\ne0$, we let

\[ E^*_{t,\tau}(z_j) := -\frac{\pi}{\log|t_j|\cdot\sin\dfrac{\pi\log|z_j|}{\log|t_j|}} \quad \text{on } N_{j,t,\tau}. \tag{4.2.1} \]

For each $(t,\tau)\in U$ with $t_j=0$, recall that $N_{j,t,\tau}$ consists of two discs $\{z_j\in\mathbb{C} \mid |z_j|<r\}$ and $\{w_j\in\mathbb{C} \mid |w_j|<r\}$, and we let

\[ E^*_{t,\tau}(\cdot) := \begin{cases} -\dfrac{1}{\log|z_j|} & \text{on the } z_j\text{-disc}, \\[2mm] -\dfrac{1}{\log|w_j|} & \text{on the } w_j\text{-disc}. \end{cases} \tag{4.2.2} \]

As in Sect. 2.2, it is easy to see that the $E^*_{t,\tau}$'s glue together to form a positive continuous function on $N_j\setminus\{\text{nodes}\}$. We write $\|(t,\tau)\| = \sqrt{\sum_{j=1}^{m}|t_j|^2+\sum_{k=1}^{3g-3+n-m}|\tau_k|^2}$ for $(t,\tau)\in U$.

Proposition 4.2.2. For fixed $1\le i\le n$, $1\le j\le m$ and $0<\alpha<1$, there exist constants $C_1, C_2, \delta>0$ such that for all $(t,\tau)\in U$ with $t_j\ne0$ and satisfying $\|(t,\tau)\|<\delta$, one has

\[ E_{i,t,\tau} \le C_1(E^*_{t,\tau})^{\alpha} \quad \text{on } N_{j,t,\tau}, \tag{4.2.3} \]

so that

\[ E_{i,t,\tau}(z_j) \le \frac{C_2}{(-\log|z_j|)^{\alpha}} \quad \text{on } N^1_{j,t,\tau}, \tag{4.2.4} \]

and a similar inequality (with $z_j$ replaced by $w_j$) holds on $N^2_{j,t,\tau}$.

Proof. First we consider the special case when at $(t,\tau)=(0,0)$, $z_j$, $w_j$ are standard local holomorphic coordinates for $X_0$ (cf. Remark-Definition 2.1.2). Consider the operator

\[ \Delta_j := 4\frac{\partial^2}{\partial z_j\,\partial\bar z_j} \quad \text{on } N_{j,t,\tau}. \tag{4.2.5} \]

(Note that in terms of real coordinates, one has $\Delta_j=\frac{\partial^2}{\partial x_j^2}+\frac{\partial^2}{\partial y_j^2}$, where $z_j=x_j+iy_j$.) By direct calculation, one can check that for $(t,\tau)\in U$ with $t_j\ne0$,

\[ \begin{aligned} \Delta_jE^*_{t,\tau}(z_j) &= \left(1+\cos^2\frac{\pi\log|z_j|}{\log|t_j|}\right)E^*_{t,\tau}(z_j)\,\rho^*_{j,t,\tau}(z_j) \\ &\le \frac{2}{C_{3,t,\tau}}\,E^*_{t,\tau}(z_j)\,\rho_{t,\tau}(z_j) \quad \text{on } N_{j,t,\tau} \text{ (by Proposition 2.2.1)}, \end{aligned} \tag{4.2.6} \]


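where $\rho^*_{j,t,\tau}$, $C_{3,t,\tau}$ and $\rho_{t,\tau}(z_j)$ are as in (2.2.5), (2.2.11) and (2.2.8) respectively. For $(t,\tau)\in U$ with $t_j\ne0$, it follows from the chain rule that

\[ \begin{aligned} \Delta_j(E^*_{t,\tau})^{\alpha} &= 4\alpha(\alpha-1)(E^*_{t,\tau})^{\alpha-2}\,|\partial_{z_j}E^*_{t,\tau}|^2 + \alpha(E^*_{t,\tau})^{\alpha-1}\,\Delta_jE^*_{t,\tau} \\ &\le \frac{2\alpha}{C_{3,t,\tau}}\,(E^*_{t,\tau}(z_j))^{\alpha}\,\rho_{t,\tau}(z_j) \quad \text{on } N_{j,t,\tau} \end{aligned} \tag{4.2.7} \]

(by (4.2.6) and since $0<\alpha<1$).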
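The identity in the first line of (4.2.6) can be checked numerically. For a function of $|z_j|$ alone, the operator in (4.2.5) reduces to $\frac{1}{r^2}\frac{d^2}{du^2}$ in the variable $u=\log r$, $r=|z_j|$, so a central second difference in $u$ suffices. The sketch below assumes the formulas (2.2.5) and (4.2.1); the function names and the step size are illustrative choices.

```python
import math

def e_star(r: float, abs_t: float) -> float:
    """Comparison function (4.2.1) as a function of r = |z_j|."""
    a = math.pi / math.log(abs_t)
    return -a / math.sin(a * math.log(r))

def rho_star(r: float, abs_t: float) -> float:
    """Model hyperbolic density (2.2.5) as a function of r = |z_j|."""
    a = math.pi / math.log(abs_t)
    return (a / (r * math.sin(a * math.log(r)))) ** 2

def laplacian_of_radial(f, r: float, h: float = 1e-3) -> float:
    """Flat Laplacian of a radial function f(r), via (1/r^2) d^2/du^2 with u = log r."""
    u = math.log(r)
    second = (f(math.exp(u + h)) - 2.0 * f(math.exp(u)) + f(math.exp(u - h))) / (h * h)
    return second / (r * r)

def both_sides_of_4_2_6(r: float, abs_t: float):
    """Return (Laplacian of E*, (1 + cos^2 x) E* rho*) with x = pi log r / log|t|."""
    lhs = laplacian_of_radial(lambda x: e_star(x, abs_t), r)
    x = math.pi * math.log(r) / math.log(abs_t)
    rhs = (1.0 + math.cos(x) ** 2) * e_star(r, abs_t) * rho_star(r, abs_t)
    return lhs, rhs
```

Since $1+\cos^2x\le2$, the same computation also illustrates the bound $\Delta_jE^*_{t,\tau}\le2E^*_{t,\tau}\rho^*_{j,t,\tau}$ that feeds into the second line of (4.2.6).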

On the other hand, for $(t,\tau)\in U$ with $t_j\ne0$, it follows from (1.1.2) and Definition 4.1.1 that

\[ \Delta_jE_{i,t,\tau}(z_j) = 2E_{i,t,\tau}(z_j)\,\rho_{t,\tau}(z_j) \quad \text{on } N_{j,t,\tau}, \tag{4.2.8} \]

and a similar expression holds for the $w_j$-coordinate. For each $(t,\tau)\in U$, the boundary $\partial N_{j,t,\tau}$ of $N_{j,t,\tau}$ consists of two circles $|z_j|=r$ and $|w_j|=r$, and it is easy to see that $\bigcup_{(t,\tau)\in U}\partial N_{j,t,\tau}$ forms a compact subset of $N_j'$. It follows readily from Proposition 2.2.1 that for any $(t,\tau)\in U$ and any point $z$ on $\partial N_{j,t,\tau}$, the injectivity radius of $(X_{t,\tau},\rho_{t,\tau})$ at $z$ is uniformly bounded below by some constant $\kappa>0$ independent of $(t,\tau)$. Thus, by Proposition 4.1.2(ii), there exists a constant $C>0$ such that

\[ E_{i,t,\tau}(z) \le C \quad \text{for all } (t,\tau)\in U \text{ and } z\in\partial N_{j,t,\tau}. \tag{4.2.9} \]

It is also easy to see from (4.2.1) and (4.2.2) that there exists a constant $C^*>0$ such that

\[ E^*_{t,\tau}(z) \ge C^* \quad \text{for all } (t,\tau)\in U \text{ and } z\in\partial N_{j,t,\tau}. \tag{4.2.10} \]

Let $C_1=\dfrac{C}{(C^*)^{\alpha}}>0$. Then it follows from (4.2.9) and (4.2.10) that for all $(t,\tau)\in U$, one has

\[ E_{i,t,\tau}(z_j) - C_1(E^*_{t,\tau}(z_j))^{\alpha} \le 0 \quad \text{on } \partial N_{j,t,\tau}. \tag{4.2.11} \]

Since $\alpha<1$, it follows from Proposition 2.2.1 that there exists a constant $\delta>0$ such that

\[ C_{3,t,\tau} \ge \alpha \quad \text{for all } (t,\tau)\in U \text{ satisfying } t_j\ne0 \text{ and } \|(t,\tau)\|<\delta. \tag{4.2.12} \]

Combining (4.2.7), (4.2.8) and (4.2.12), one easily sees that for all $(t,\tau)\in U$ satisfying $t_j\ne0$ and $\|(t,\tau)\|<\delta$, one has

\[ \Delta_j\bigl(E_{i,t,\tau}(z_j) - C_1(E^*_{t,\tau}(z_j))^{\alpha}\bigr) \ge 2\bigl(E_{i,t,\tau}(z_j) - C_1(E^*_{t,\tau}(z_j))^{\alpha}\bigr)\rho_{t,\tau}(z_j) \quad \text{on } N_{j,t,\tau}. \tag{4.2.13} \]

By using the maximum principle, one easily obtains (4.2.3) as a consequence of (4.2.11) and (4.2.13). Then (4.2.4) follows readily from (4.2.3), (2.2.4) and the boundedness of the function $x/\sin x$ for $0<x\le\frac{\pi}{2}$, and this finishes the proof of the proposition in the special case when at $(t,\tau)=(0,0)$, $z_j$, $w_j$ are standard local holomorphic coordinates for $X_0$. Finally we remark that the general case of the proposition follows readily from the above special case by performing a change of variable and adjusting the values of $C_1$ and $C_2$ in (4.2.3) and (4.2.4) if necessary.


4.3. Integral lower bound of $E_{i,t,\tau}$ near an adjacent node. Settings, notations and definitions are as in Sect. 4.2. We are going to derive a desired integral lower bound for $E_{i,t,\tau}$ on the region $N_{j,t,\tau}$ associated to any node $q_j$ adjacent to $p_i$ (cf. Remark 4.3.2).

Proposition 4.3.1. Let $1\le i\le n$, $1\le j\le m$ be such that the node $q_j$ of $X_0$ is adjacent to the puncture $p_i$, and let $\phi_j=\phi_j(z,t,\tau)\,dz^2$ be as in Proposition 3.1.2. Then for any fixed $\beta>1$, there exist constants $C=C(\beta)$, $\delta=\delta(\beta)>0$ such that for all $(t,\tau)\in U^*$ satisfying $\|(t,\tau)\|<\delta$, one has

\[ \int_{N_{j,t,\tau}} E_{i,t,\tau}\,\frac{\phi_j\overline{\phi_j}}{\rho_{t,\tau}} \ge C|t_j|^2(-\log|t_j|)^{3-\beta}. \tag{4.3.1} \]

Proof. As in Proposition 4.2.2, we will assume without loss of generality that at $(t,\tau)=(0,0)$, $z_j$, $w_j$ are standard local holomorphic coordinates for $X_0$. Consider the biholomorphism $\sigma$ from $N_j$ onto itself given by $\sigma(t_1,\dots,t_{j-1},z_j,w_j,t_{j+1},\dots,t_m,\tau)=(t_1,\dots,t_{j-1},w_j,z_j,t_{j+1},\dots,t_m,\tau)$ in terms of the coordinates in (2.2.1). For each fixed $(t,\tau)\in U$ with $t_j\ne0$ (and upon suppressing the coordinates $t_1,\dots,t_{j-1},t_{j+1},\dots,t_m,\tau$), it is easy to see that $\sigma$ restricts to a biholomorphism $\sigma_{j,t,\tau}:N^1_{j,t,\tau}\to N^2_{j,t,\tau}$ given by

\[ \sigma_{j,t,\tau}(z_j,w_j) = (w_j,z_j) \quad \text{with } z_jw_j=t_j \tag{4.3.2} \]

(cf. (2.2.3) and (2.2.4)). With $\rho^*_{j,t,\tau}$ as given in (2.2.7) (see also (2.2.5)), it is easy to see that $\sigma_{j,t,\tau}$ induces the following isometry between $N^1_{j,t,\tau}$ and $N^2_{j,t,\tau}$:

\[ \sigma^*_{j,t,\tau}\,\rho^*_{j,t,\tau} = \rho^*_{j,t,\tau}. \tag{4.3.3} \]

Let $C_1>0$ be as in (3.1.5). Then it follows from Proposition 3.1.2 and (4.3.2) that for all $(t,\tau)\in U$, one has

\[ |\phi_j(z_j,t,\tau)|,\ |\phi_j(\sigma_{j,t,\tau}(z_j),t,\tau)| \ge C_1\frac{|t_j|}{|z_j|^2} \quad \text{on } N^1_{j,t,\tau}. \tag{4.3.4} \]

Using the limit $\lim_{x\to0}x\csc x=1$, it is easy to see from (2.2.5) that for all $(t,\tau)\in U$ with $t_j\ne0$, one has

\[ \rho^*_{j,t,\tau}(z_j) \le \frac{C_{t_j}}{|z_j|^2(\log|z_j|)^2} \quad \text{on } N^1_{j,t,\tau}, \tag{4.3.5} \]

where $C_{t_j}$ is a positive continuous function in the variable $t_j$ such that

\[ C_{t_j}\to1 \quad \text{as } t_j\to0. \tag{4.3.6} \]


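Let $C_{4,t,\tau}$ be as in (2.2.11). By Proposition 2.2.1, we have

\[ \begin{aligned} \int_{N_{j,t,\tau}} E_{i,t,\tau}\frac{\phi_j\overline{\phi_j}}{\rho_{t,\tau}} &\ge \frac{1}{C_{4,t,\tau}}\left( \int_{N^1_{j,t,\tau}} E_{i,t,\tau}\frac{\phi_j\overline{\phi_j}}{\rho^*_{j,t,\tau}} + \int_{N^2_{j,t,\tau}} E_{i,t,\tau}\frac{\phi_j\overline{\phi_j}}{\rho^*_{j,t,\tau}} \right) \\ &= \frac{1}{C_{4,t,\tau}} \int_{N^1_{j,t,\tau}} \left( E_{i,t,\tau}\frac{\phi_j\overline{\phi_j}}{\rho^*_{j,t,\tau}} + \sigma^*_{j,t,\tau}E_{i,t,\tau}\,\frac{\sigma^*_{j,t,\tau}\phi_j\,\overline{\sigma^*_{j,t,\tau}\phi_j}}{\sigma^*_{j,t,\tau}\rho^*_{j,t,\tau}} \right) \\ &= \frac{1}{C_{4,t,\tau}} \int_{N^1_{j,t,\tau}} \left( \frac{E_{i,t,\tau}(z_j)\,|\phi_j(z_j,t,\tau)|^2}{\rho^*_{j,t,\tau}(z_j)} + \frac{E_{i,t,\tau}(\sigma_{j,t,\tau}(z_j))\,|\phi_j(\sigma_{j,t,\tau}(z_j),t,\tau)|^2}{\rho^*_{j,t,\tau}(z_j)} \right) dz_j\,d\bar z_j \quad \text{(by (4.3.3))} \\ &\ge \frac{C_1^2}{C_{4,t,\tau}C_{t_j}} \int_{N^1_{j,t,\tau}} \bigl( E_{i,t,\tau}(z_j) + E_{i,t,\tau}(\sigma_{j,t,\tau}(z_j)) \bigr)\cdot\frac{|t_j|^2}{|z_j|^4}\cdot|z_j|^2(\log|z_j|)^2\,dz_j\,d\bar z_j \\ &\qquad \text{(by (4.3.2), (4.3.4) and (4.3.5)).} \end{aligned} \tag{4.3.7} \]

In polar coordinates, we write $z_j=r_je^{i\theta_j}$, $t_j=|t_j|e^{i\psi_j}$, and write $E_{i,t,\tau}(z_j)=E_{i,t,\tau}(r_j,\theta_j)$, so that $E_{i,t,\tau}(\sigma_{j,t,\tau}(z_j))=E_{i,t,\tau}\bigl(\frac{|t_j|}{r_j},\psi_j-\theta_j\bigr)$. Then (4.3.7) can be re-written in the following form:

\[ \begin{aligned} \int_{N_{j,t,\tau}} E_{i,t,\tau}\frac{\phi_j\overline{\phi_j}}{\rho_{t,\tau}} &\ge \frac{C_1^2|t_j|^2}{C_{4,t,\tau}C_{t_j}} \int_{|t_j|^{1/2}}^{r} f_{t,\tau}(r_j)\cdot\frac{(\log r_j)^2}{r_j}\,dr_j, \quad \text{where} \\ f_{t,\tau}(r_j) &:= \int_0^{2\pi} \Bigl( E_{i,t,\tau}(r_j,\theta_j) + E_{i,t,\tau}\bigl(\tfrac{|t_j|}{r_j},\psi_j-\theta_j\bigr) \Bigr)\,d\theta_j \\ &= \int_0^{2\pi} \bigl( E_{i,t,\tau} + \sigma^*_{j,t,\tau}E_{i,t,\tau} \bigr)(r_j,\theta_j)\,d\theta_j. \end{aligned} \tag{4.3.8} \]

It is easy to see that the $f_{t,\tau}$'s (with $(t,\tau)\in U^*$) form a continuous family of functions and each $f_{t,\tau}$ is a smooth function in the variable $r_j$. Moreover, one also has $f_{t,\tau}(r_j)=f_{t,\tau}\bigl(\frac{|t_j|}{r_j}\bigr)$ for all $\frac{|t_j|}{r}\le r_j\le r$, which implies readily that

\[ f'_{t,\tau}(|t_j|^{1/2}) = 0. \]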

Consider the differential operator

∂ 1 ∂ 1 ∂2 j := j +

rj , so that j =

, r j ∂r j ∂r j r 2j ∂θ 2j

(4.3.9)

(4.3.10)

where j is as in (4.2.5). Also, we denote the hyperbolic Laplacian on N j,t,τ with respect to ρ ∗j,t,τ by ∗j,t,τ , so that in terms of the z j -coordinate, one has

∗j,t,τ = (ρ ∗j,t,τ (z j ))−1 j ,

(4.3.11)


and a similar expression holds for the w j -coordinate. The isometric property of σ j,t,τ in (4.3.3) implies readily that ∗ ∗

∗j,t,τ (σ j,t,τ E i,t,τ ) = σ j,t,τ ( ∗j,t,τ E i,t,τ ).

(4.3.12)

From the analogues of (4.2.8) and (4.3.11) for the w j -coordinate, one has

∗j,t,τ E i,t,τ (w j ) = 2E i,t,τ (w j ) ·

ρt,τ (w j ) on N 2j,t,τ . ρ ∗j,t,τ (w j )

(4.3.13)

Upon pulling back by σ j,t,τ and using Proposition 2.2.1, one obtains from (4.3.12) and (4.3.13) that on N 1j,t,τ , ∗ ∗

∗j,t,τ (σ j,t,τ E i,t,τ )(z j ) ≤ 2σ j,t,τ E i,t,τ (z j ) · C4,t,τ ∗ ∗ =⇒ j (σ j,t,τ E i,t,τ )(z j ) ≤ 2σ j,t,τ E i,t,τ (z j ) · C4,t,τ · ρ ∗j,t,τ (z j ) (by (4.3.11)) ∗ ≤ 2σ j,t,τ E i,t,τ (z j ) · C4,t,τ ·

|z j

Ct j 2 | (log |z

j |)

2

(by (4.3.5)) (4.3.14)

Similarly, it follows from Proposition 2.2.1, (4.2.8) and (4.3.5) that one has

j E i,t,τ (z j ) ≤ 2E i,t,τ (z j ) · C4,t,τ ·

|z j

Ct j 2 | (log |z

j |)

2

on N 1j,t,τ .

(4.3.15)

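It follows readily from (4.3.8) and (4.3.10) that, with $\widetilde{\Delta}_j$ the radial operator of (4.3.10),

$$\begin{aligned}
\widetilde{\Delta}_j f_{t,\tau}(r_j) &= \int_0^{2\pi}\widetilde{\Delta}_j\big(E_{i,t,\tau} + \sigma^*_{j,t,\tau}E_{i,t,\tau}\big)(r_j,\theta_j)\,d\theta_j\\
&= \int_0^{2\pi}\Delta_j\big(E_{i,t,\tau} + \sigma^*_{j,t,\tau}E_{i,t,\tau}\big)(r_j,\theta_j)\,d\theta_j - \frac{1}{r_j^2}\int_0^{2\pi}\frac{\partial^2}{\partial\theta_j^2}\Big(E_{i,t,\tau}(r_j,\theta_j) + E_{i,t,\tau}\Big(\frac{|t_j|}{r_j},\psi_j-\theta_j\Big)\Big)\,d\theta_j.
\end{aligned} \tag{4.3.16}$$

Observe that

$$\int_0^{2\pi}\frac{\partial^2}{\partial\theta_j^2}\Big(E_{i,t,\tau}(r_j,\theta_j) + E_{i,t,\tau}\Big(\frac{|t_j|}{r_j},\psi_j-\theta_j\Big)\Big)\,d\theta_j = 0, \tag{4.3.17}$$

since the expression $E_{i,t,\tau}(r_j,\theta_j) + E_{i,t,\tau}(|t_j|/r_j,\psi_j-\theta_j)$ is periodic in $\theta_j$ with period $2\pi$. By (4.3.14) and (4.3.15), we also have

$$\begin{aligned}
\int_0^{2\pi}\Delta_j\big(E_{i,t,\tau} + \sigma^*_{j,t,\tau}E_{i,t,\tau}\big)(r_j,\theta_j)\,d\theta_j
&\le \int_0^{2\pi}\big(E_{i,t,\tau} + \sigma^*_{j,t,\tau}E_{i,t,\tau}\big)(r_j,\theta_j)\cdot\frac{2C_{4,t,\tau}C_{t_j}}{r_j^2(\log r_j)^2}\,d\theta_j\\
&= \frac{2C_{4,t,\tau}C_{t_j}}{r_j^2(\log r_j)^2}\cdot f_{t,\tau}(r_j).
\end{aligned} \tag{4.3.18}$$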


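Combining (4.3.16), (4.3.17) and (4.3.18), it follows that we have

$$\widetilde{\Delta}_j f_{t,\tau}(r_j) \le \frac{2C_{4,t,\tau}C_{t_j}}{r_j^2(\log r_j)^2}\cdot f_{t,\tau}(r_j). \tag{4.3.19}$$

For any fixed number $\beta > 1$, a direct calculation gives

$$\frac{d}{dr_j}\,\frac{1}{(-\log r_j)^\beta} = \frac{\beta}{r_j(-\log r_j)^{\beta+1}} > 0 \quad\text{for } 0 < r_j < 1, \text{ and} \tag{4.3.20}$$

$$\widetilde{\Delta}_j\,\frac{1}{(-\log r_j)^\beta} = \frac{\beta(\beta+1)}{r_j^2(-\log r_j)^{\beta+2}}. \tag{4.3.21}$$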
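The direct calculation behind (4.3.21) can be sanity-checked numerically. The following is a minimal sketch, assuming the radial operator of (4.3.10) is $(1/r)\,\partial_r(r\,\partial_r)$; the function names, step size and sample radii are illustrative choices and not part of the paper.

```python
import math


def comparison(r, beta):
    """The comparison function 1 / (-log r)^beta on 0 < r < 1."""
    return (-math.log(r)) ** (-beta)


def radial_operator(func, r, h=1e-4):
    """Central-difference value of (1/r) d/dr (r d(func)/dr) = func'' + func'/r."""
    first = (func(r + h) - func(r - h)) / (2 * h)
    second = (func(r + h) - 2 * func(r) + func(r - h)) / (h * h)
    return second + first / r


def closed_form(r, beta):
    """Right-hand side of (4.3.21)."""
    return beta * (beta + 1) / (r ** 2 * (-math.log(r)) ** (beta + 2))


def max_relative_error(beta, radii):
    """Largest relative gap between the finite-difference and closed-form values."""
    worst = 0.0
    for r in radii:
        numeric = radial_operator(lambda x: comparison(x, beta), r)
        exact = closed_form(r, beta)
        worst = max(worst, abs(numeric - exact) / exact)
    return worst


if __name__ == "__main__":
    print(max_relative_error(1.5, [0.1, 0.2, 0.3, 0.5]))
```

The finite-difference step is only accurate away from $r = 0$ and $r = 1$, so the sample radii are kept in the interior of the interval.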

Since the node $q_j$ of $X_0$ is adjacent to the puncture $p_i$, it follows that $E_{i,0,0}$ is positive (and thus bounded below by some constant $C_2 > 0$) on at least one of the boundary circles of $N_{j,0,0}$, namely $|z_j| = r$ or $|w_j| = r$. (We remark that $E_{i,0,0}$ may be identically zero on the other boundary circle of $N_{j,0,0}$.) Together with Proposition 4.2.1, it follows that there exists a constant $\delta_1 > 0$ such that $E_{i,t,\tau} \ge C_2/2$ on the corresponding boundary circle of $N_{j,t,\tau}$ for all $(t,\tau) \in U^*$ satisfying $|(t,\tau)| < \delta_1$. Together with (4.3.8), it follows that

$$f_{t,\tau}(r) \ge 2\pi\cdot\frac{C_2}{2} = \pi C_2 \quad\text{for all } (t,\tau) \in U^* \text{ satisfying } |(t,\tau)| < \delta_1. \tag{4.3.22}$$

Let $C_3 := \pi C_2\cdot(-\log r)^\beta > 0$. For each $(t,\tau) \in U^*$, consider the function

$$F_{t,\tau}(r_j) := \frac{C_3}{(-\log r_j)^\beta} - f_{t,\tau}(r_j), \quad |t_j|^{1/2} \le r_j \le r. \tag{4.3.23}$$

Then it follows from (4.3.22) that

$$F_{t,\tau}(r) \le 0 \quad\text{for all } (t,\tau) \in U^* \text{ satisfying } |(t,\tau)| < \delta_1. \tag{4.3.24}$$

Moreover, it follows from (4.3.9) and (4.3.20) that for all $(t,\tau) \in U^*$, one has

$$F'_{t,\tau}(|t_j|^{1/2}) \ge 0. \tag{4.3.25}$$

Since $\beta > 1$, we have $\beta(\beta+1) > 2$. Together with (2.2.11) and (4.3.6), it follows that there exists a constant $\delta_2 > 0$ such that

$$2C_{4,t,\tau}C_{t_j} < \beta(\beta+1) \quad\text{for all } (t,\tau) \in U^* \text{ satisfying } |(t,\tau)| < \delta_2. \tag{4.3.26}$$

Together with (4.3.19), (4.3.21) and (4.3.23), and using $f_{t,\tau} \ge 0$, it follows that

$$\widetilde{\Delta}_j F_{t,\tau}(r_j) \ge \frac{\beta(\beta+1)}{r_j^2(-\log r_j)^2}\,F_{t,\tau}(r_j) \quad\text{for all } (t,\tau) \in U^* \text{ satisfying } |(t,\tau)| < \delta_2. \tag{4.3.27}$$

Regarding $F_{t,\tau}$ also as a function in the new variable $s_j = \log r_j$, one easily sees from (4.3.10) that the inequality in (4.3.27) can be re-written as

$$\frac{d^2}{ds_j^2}F_{t,\tau} \ge \frac{\beta(\beta+1)}{s_j^2}\,F_{t,\tau}. \tag{4.3.28}$$


By using the maximum principle, one easily sees from (4.3.24), (4.3.25) and (4.3.28) that for all $(t,\tau) \in U^*$ satisfying $|(t,\tau)| < \min\{\delta_1,\delta_2\}$, one has $F_{t,\tau}(r_j) \le 0$, or equivalently,

$$f_{t,\tau}(r_j) \ge \frac{C_3}{(-\log r_j)^\beta} \quad\text{for all } |t_j|^{1/2} \le r_j \le r. \tag{4.3.29}$$

We remark that to prove (4.3.1), it is clear that we may assume without loss of generality that $\beta < 3$. From (4.3.8) and (4.3.29), one has

$$\int_{N_{j,t,\tau}}\frac{E_{i,t,\tau}\,\phi_j\overline{\phi_j}}{\rho_{t,\tau}} \ge \int_{|t_j|^{1/2}}^{r}\frac{C_3}{(-\log r_j)^\beta}\cdot\frac{C_1^2|t_j|^2(\log r_j)^2}{C_{4,t,\tau}C_{t_j}}\,\frac{dr_j}{r_j} = \frac{C_3C_1^2|t_j|^2}{C_{4,t,\tau}C_{t_j}}\cdot\frac{(-\log|t_j|^{1/2})^{3-\beta} - (-\log r)^{3-\beta}}{3-\beta}. \tag{4.3.30}$$

It follows from (2.2.11) and (4.3.6) that there exists $\delta_3 > 0$ such that $C_{4,t,\tau} \le 2$ and $C_{t_j} \le 2$ for all $(t,\tau) \in U^*$ satisfying $|(t,\tau)| < \delta_3$. Clearly there also exists $\delta_4 > 0$ such that $(-\log r)^{3-\beta} < \frac12(-\log|t_j|^{1/2})^{3-\beta}$ if $0 < |t_j| < \delta_4$. Now let $\delta = \min\{\delta_1,\delta_2,\delta_3,\delta_4\} > 0$. Then it follows readily from (4.3.30) that (4.3.1) holds for all $(t,\tau) \in U^*$ satisfying $|(t,\tau)| < \delta$ (and with the constant $C = \dfrac{C_3C_1^2}{8\cdot 2^{3-\beta}\cdot(3-\beta)} > 0$).
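The radial integral evaluated in (4.3.30) and the constant extracted from it at the end of the proof can be checked numerically. The sketch below assumes the integrand is $(-\log r_j)^{2-\beta}\,dr_j/r_j$ on $|t_j|^{1/2} \le r_j \le r$ after the constants are pulled out; the helper names and the sample values of $|t_j|$, $r$ and $\beta$ are illustrative only.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def radial_integral_numeric(t_abs, r, beta):
    """Integral of (-log x)^(2-beta) dx/x over sqrt(t_abs) <= x <= r, in s = log x."""
    lo, hi = 0.5 * math.log(t_abs), math.log(r)
    return simpson(lambda s: (-s) ** (2 - beta), lo, hi)


def radial_integral_closed(t_abs, r, beta):
    """Closed form appearing on the right-hand side of (4.3.30)."""
    a = -0.5 * math.log(t_abs)   # -log |t_j|^(1/2)
    b = -math.log(r)             # -log r
    return (a ** (3 - beta) - b ** (3 - beta)) / (3 - beta)


def final_lower_bound(t_abs, beta):
    """(-log|t_j|)^(3-beta) / (2 * 2^(3-beta) * (3-beta)), valid once
    (-log r)^(3-beta) < (1/2) (-log|t_j|^(1/2))^(3-beta)."""
    return (-math.log(t_abs)) ** (3 - beta) / (2 * 2 ** (3 - beta) * (3 - beta))


if __name__ == "__main__":
    t_abs, r, beta = 1e-6, 0.5, 1.5
    print(radial_integral_numeric(t_abs, r, beta), radial_integral_closed(t_abs, r, beta))
    print(final_lower_bound(t_abs, beta))
```

Multiplying `final_lower_bound` by $C_3C_1^2|t_j|^2/4$ (from $C_{4,t,\tau} \le 2$ and $C_{t_j} \le 2$) gives the constant $C$ stated at the end of the proof.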

Remark 4.3.2. The proof of Proposition 4.3.1 does not work for the case when the node $q_j$ is not adjacent to the puncture $p_i$, since $E_{i,0,0}$ is identically zero near such $q_j$ (cf. (4.1.1)). One expects that (4.3.1) will not hold for such $q_j$. For a similar reason, one expects that a pointwise lower bound in the spirit of Proposition 4.2.2 will not hold on the entire $N_{j,t,\tau}$ even in the case when $q_j$ is adjacent to $p_i$ (unless both branches of $N_{j,0,0}$ are "adjacent to" $p_i$).

4.4. Upper and lower bounds of $E_{i,t,\tau}$ near $p_i$. For a fixed integer $i$ with $1 \le i \le n$, we are going to give a pointwise upper bound of $E_{i,t,\tau}$ near the puncture $p_i$. Let $W_i = \Delta^*(R)\times U$, $W_i' = \Delta^*(R')\times U$ (with $0 < R' < R < 1$), $W_{i,t,\tau}$, $W'_{i,t,\tau}$, $z_i$ be as in (2.1).

Proposition 4.4.1. There exist constants $\delta, c_1, c_2 > 0$ such that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, one has

$$\Big(\frac{\log|z_i|}{2\pi}\Big)^2 + c_1\log|z_i| \le E_{i,t,\tau}(z_i) \le \Big(\frac{\log|z_i|}{2\pi}\Big)^2 - c_2\log|z_i| \quad\text{on } W_{i,t,\tau}, \tag{4.4.1}$$

shrinking $R$ if necessary. In particular, there exist constants $c_3, c_4 > 0$ such that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, one has

$$c_3(\log|z_i|)^2 \le E_{i,t,\tau}(z_i) \le c_4(\log|z_i|)^2 \quad\text{on } W_{i,t,\tau}. \tag{4.4.2}$$

Proof. For any $(t,\tau) \in U$, we let $z_{i,t,\tau}$ be a standard local holomorphic coordinate function around the puncture $p_i$ on $X_{t,\tau}$ (cf. Remark-Definition 2.1.2). As mentioned in Remark-Definition 2.1.2(ii), $\Delta^*_{t,\tau}(e^{-\pi}) := \{z_{i,t,\tau} \in \mathbb{C} : 0 < |z_{i,t,\tau}| < e^{-\pi}\}$ is a bona fide local coordinate neighborhood around $p_i$ in $X_{t,\tau}$. It follows readily from the collar lemma for non-compact surfaces ([Bu, Theorem 4.4.6, p.112]) and Proposition 2.1.3(ii) that there exists $\delta > 0$ such that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, one has


$W_{i,t,\tau} \subset \Delta^*_{t,\tau}(e^{-\pi})$, upon shrinking $R$ (and possibly also $R'$) if necessary; in particular, $z_{i,t,\tau}$ (in addition to $z_i$) provides a holomorphic coordinate function on $W_{i,t,\tau}$ vanishing only at $p_i$. In terms of the Euclidean coordinate $Z = X + iY$ on the upper half plane $\mathbb{H}$, it is easy to calculate that the hyperbolic distance from a point $Z$ to $Z + 1$ is given by $2\coth^{-1}(\sqrt{4Y^2+1})$ (with the hyperbolic geodesic joining $Z$ to $Z + 1$ given by the Euclidean circular arc joining the two points and with center at $X + \frac12$). Shrinking $R$ again if necessary, it follows that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, the injectivity radius of $W_{i,t,\tau}$ at any point $a \in W_{i,t,\tau}$ with respect to the restriction of the hyperbolic metric

1 −1 2 ρt,τ on X t,τ is given by f (|z i,t,τ (a)|), where f (t) := 2 coth 1 + ( π log t) . Next

we recall from Proposition 2.1.3(i) the comparison of ρt,τ with the two model hyperbolic metrics on Wi,t,τ with respect to the z i -coordinate (the comparison was stated for Wi,t,τ there, but clearly it holds for Wi,t,τ here). Upon shrinking R further if necessary, one easily sees that it leads readily to a corresponding comparison of the injectivity radii of Wi,t,τ at the point a with respect to these metrics given by C1 f (R) ≤ f (|z i,t,τ (a)|) ≤ C2 f (R),

(4.4.3)

where $C_1, C_2 > 0$ are as in Proposition 2.1.3(i). It is easy to see that $f : (0,1) \to (0,\infty)$ is a continuous strictly increasing and bijective function. Since $z_i$ and $z_{i,t,\tau}$ are both coordinate functions on $W_{i,t,\tau}$ vanishing at $p_i$, it follows that the function $h_{t,\tau} = z_{i,t,\tau}/z_i$ extends across $p_i$ as a non-vanishing holomorphic function. By applying the maximum and minimum modulus principles to the (extended) function $h_{t,\tau}$ on the disc $|z_i| \le R$ and varying $(t,\tau)$, it follows from (4.4.3) that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, one has

$$C_3|z_i| \le |z_{i,t,\tau}| \le C_4|z_i| \quad\text{on } W_{i,t,\tau}, \tag{4.4.4}$$

where

$$C_3 = \frac{f^{-1}(\sqrt{C_1}\,f(R))}{R} > 0 \quad\text{and}\quad C_4 = \frac{f^{-1}(\sqrt{C_2}\,f(R))}{R} > 0. \tag{4.4.5}$$

Fix a number > 0. Then shrinking R and δ if necessary, we may assume that Wi,t,τ ⊂ −

∗t,τ (e−2π e ) for all (t, τ ) ∈ U satisfying (t, τ ) < δ. Thus by Proposition 4.1.2(i) (with s = 2), there exists a constant C2, > 0 such that for all (t, τ ) ∈ U satisfying (t, τ ) < δ, one has − C2, ≤ E i,t,τ (z i,t,τ ) −

log |z i,t,τ | 2π

2 ≤ C2, on Wi,t,τ .

(4.4.6)

By replacing $C_3$ by $\min\{C_3, 1\}$, etc., we may assume that (4.4.4) holds with $C_3 \le 1$ and $C_4 \ge 1$. Note also that $|z_i|, |z_{i,t,\tau}| < 1$ on $W_{i,t,\tau}$. Thus (4.4.4) leads to the inequality

$$(\log|z_i| + \log C_4)^2 \le (\log|z_{i,t,\tau}|)^2 \le (\log|z_i| + \log C_3)^2 \quad\text{on } W_{i,t,\tau}, \tag{4.4.7}$$

which, together with (4.4.6), leads readily to (4.4.1). We just remark that the constant terms in (4.4.6) and (4.4.7) can be absorbed by the terms linear in $\log|z_i|$ in (4.4.1) by adjusting $c_1$ and $c_2$ suitably, if necessary. Finally one easily sees that (4.4.2) is a direct consequence of (4.4.1).
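The distance formula $2\coth^{-1}(\sqrt{4Y^2+1})$ quoted in the proof above can be compared with the standard expression $\cosh d = 1 + |Z_1 - Z_2|^2/(2\,\mathrm{Im}\,Z_1\,\mathrm{Im}\,Z_2)$ for the hyperbolic distance on the upper half plane with the metric of curvature $-1$. The following is a small numerical sketch; the function names are illustrative.

```python
import math


def acoth(x):
    """Inverse hyperbolic cotangent for x > 1."""
    return 0.5 * math.log((x + 1) / (x - 1))


def translation_distance(height):
    """Distance from Z to Z + 1 at height Y = Im Z, as stated in the proof."""
    return 2 * acoth(math.sqrt(4 * height * height + 1))


def distance_from_cosh_formula(height):
    """Same distance from cosh d = 1 + |Z1 - Z2|^2 / (2 Im Z1 Im Z2) with |Z1 - Z2| = 1."""
    return math.acosh(1 + 1 / (2 * height * height))


if __name__ == "__main__":
    for y in (0.5, 1.0, 3.0):
        print(y, translation_distance(y), distance_from_cosh_formula(y))
```

The two expressions agree because $\sinh(d/2) = 1/(2Y)$ gives $\coth(d/2) = \sqrt{4Y^2+1}$.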


4.5. Upper bound of $E_{i,t,\tau}$ near a puncture $p_j$ with $j \ne i$. For fixed integers $i, j$ with $1 \le i \ne j \le n$, we are going to give a pointwise upper bound of $E_{i,t,\tau}$ near the puncture $p_j$. Let $W_j = \Delta^*(R)\times U$ (with $0 < R < 1$), $W_{j,t,\tau}$, $z_j$ be as in (2.1).

Proposition 4.5.1. For fixed integers $1 \le i \ne j \le n$ and real number $\alpha$ satisfying $0 < \alpha < 1$, there exist constants $C, \delta > 0$ such that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, one has

$$E_{i,t,\tau}(z_j) \le \frac{C}{(-\log|z_j|)^\alpha} \quad\text{on } W_{j,t,\tau}. \tag{4.5.1}$$

Proof. The proof is similar to that of Proposition 4.2.2, and as in there, we will assume without loss of generality that at (t, τ ) = (0, 0), z j is a standard local holomorphic coordinate for X 0 . For each (t, τ ) ∈ U , the boundary ∂ W j,t,τ of W j,t,τ consists of the circle |z j | = R. As in (4.2.9), it follows from Proposition 2.1.3 and Proposition 4.1.2(ii) that there exists a constant C1 > 0 such that for all (t, τ ) ∈ U , one has E i,t,τ (z j ) ≤ C1 on ∂ W j,t,τ .

(4.5.2)

Thus for all $(t,\tau) \in U$, one has

$$E_{i,t,\tau}(z_j) - \frac{C}{(-\log|z_j|)^\alpha} \le 0 \quad\text{on } \partial W_{j,t,\tau}, \quad\text{where } C := C_1\cdot(-\log R)^\alpha > 0. \tag{4.5.3}$$

Since $0 < \alpha < 1$, it follows from Proposition 4.1.2(iii) that for all $(t,\tau) \in U$, one has

$$E_{i,t,\tau}(z_j) - \frac{C}{(-\log|z_j|)^\alpha} \to 0 \quad\text{as } z_j \to 0. \tag{4.5.4}$$

Let $\Delta_j := 4\,\dfrac{\partial^2}{\partial z_j\,\partial\bar z_j}$ be as in (4.2.5). Then a direct calculation gives

$$\Delta_j\,\frac{1}{(-\log|z_j|)^\alpha} = \frac{\alpha(\alpha+1)}{|z_j|^2(-\log|z_j|)^{\alpha+2}}. \tag{4.5.5}$$

Let $\rho_{t,\tau}(z_j)$ be as in (2.1.4). For all $(t,\tau) \in U$, it follows from (1.1.2) that one has

$$\Delta_j E_{i,t,\tau}(z_j) = 2E_{i,t,\tau}(z_j)\,\rho_{t,\tau}(z_j) \ge \frac{2C_{1,t,\tau}E_{i,t,\tau}(z_j)}{|z_j|^2(\log|z_j|)^2} \quad\text{on } W_{j,t,\tau} \text{ (by Proposition 2.1.3)}, \tag{4.5.6}$$

where $C_{1,t,\tau}$ is as in (2.1.13). Since $\alpha(\alpha+1) < 2$, it follows from Proposition 2.1.3(ii) that there exists a constant $\delta > 0$ such that

$$C_{1,t,\tau} > \frac{\alpha(\alpha+1)}{2} \quad\text{for all } (t,\tau) \in U \text{ satisfying } |(t,\tau)| < \delta. \tag{4.5.7}$$

Together with (4.5.5) and (4.5.6), it follows that for all $(t,\tau) \in U$ satisfying $|(t,\tau)| < \delta$, one has

$$\Delta_j\Big(E_{i,t,\tau}(z_j) - \frac{C}{(-\log|z_j|)^\alpha}\Big) \ge \frac{\alpha(\alpha+1)}{|z_j|^2(\log|z_j|)^2}\cdot\Big(E_{i,t,\tau}(z_j) - \frac{C}{(-\log|z_j|)^\alpha}\Big) \quad\text{on } W_{j,t,\tau}. \tag{4.5.8}$$

By using the maximum principle, one easily obtains (4.5.1) as a consequence of (4.5.3), (4.5.4) and (4.5.8).
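The identity (4.5.5) for the barrier function in the proof of Proposition 4.5.1 can be checked with a five-point stencil, since $4\,\partial^2/\partial z\,\partial\bar z = \partial_x^2 + \partial_y^2$ for $z = x + iy$. The code below is a minimal sketch; the sample point, step size and function names are arbitrary illustrative choices.

```python
import math


def barrier(x, y, alpha):
    """(-log|z|)^(-alpha) for z = x + iy with 0 < |z| < 1."""
    return (-math.log(math.hypot(x, y))) ** (-alpha)


def laplacian(func, x, y, h=1e-4):
    """Five-point stencil for d^2/dx^2 + d^2/dy^2, i.e. 4 d^2/(dz d zbar)."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4 * func(x, y)) / (h * h)


def closed_form(x, y, alpha):
    """Right-hand side of (4.5.5)."""
    mod = math.hypot(x, y)
    return alpha * (alpha + 1) / (mod ** 2 * (-math.log(mod)) ** (alpha + 2))


def relative_error(x, y, alpha):
    """Relative gap between the stencil value and the closed form at z = x + iy."""
    numeric = laplacian(lambda u, v: barrier(u, v, alpha), x, y)
    exact = closed_form(x, y, alpha)
    return abs(numeric - exact) / exact


if __name__ == "__main__":
    print(relative_error(0.2, 0.1, 0.5))
```

For $0 < \alpha < 1$ the coefficient $\alpha(\alpha+1)$ stays below 2, which is the room used in (4.5.7).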


5. Asymptotic Behavior of the Takhtajan-Zograf Metric 5.1. Let X 0 ∈ Mg,n \Mg,n be a stable Riemann surface with n punctures p1 , . . . , pn and m nodes q1 , . . . , qm , and let ψ: U m × V → Uˆ with V 3g−3+n−m and coordinates (s1 , . . . , s3g−3+n ) = (t1 , . . . , tm , τ1 , . . . , τ3g−3+n−m ) = (t, τ ) be as in Theorem 1. For (t, τ ) ∈ U ∗ = ( ∗ )m × V , thecomponents of the Takhtajan-Zograf metric . For the Weil-Petersson given as in (1.4.1) form a matrix G TZ := g TZ j k¯ 1≤ j,k≤3g−3+n metric, we similarly denote the matrix G WP := g WP , where the g WP ’s j k¯ j k¯ 1≤ j,k≤3g−3+n

are as in Proposition 3.2.1. Let φk , k = 1, . . . , 3g − 3 + n, be the regular 2-differentials given by Proposition 3.1.2. For 1 ≤ j, k ≤ 3g − 3 + n, we define h TZ j k¯

=

n

E i,t,τ φ j φk , ρt,τ

X t,τ

i=1

and denote the corresponding matrix by TZ :=

(5.1.1)

. We remark that it is easy h TZ ¯ jk

to see from Proposition 3.1.2 and Definition 4.1.1 that TZ is actually well-defined on the entirety of U ; moreover, for each non-empty subset J ⊂ {1, . . . , m} and with B(J ) = {(t, τ ) ∈ U t j = 0 for all j ∈ J } as defined in (3.1), one has, at any point (t, τ ) ∈ B(J ), h TZ = 0 whenever either j ∈ J or k ∈ J (cf. Proposition 3.1.2(ii)). In j k¯ particular, at (t, τ ) = (0, 0), we have h TZ (0, 0) = 0 if j ≤ m + 1 or k ≤ m + 1. j k¯

(5.1.2)

Proposition 5.1.1. On U ∗ , we have G TZ = G WP TZ G WP , or equivalently, g TZ = j k¯

3g−3+n ,r =1

g WP h TZ g WP , 1 ≤ j, k ≤ 3g − 3 + n. j ¯ ¯r r k¯

(5.1.3)

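Proof. For $(t,\tau) \in U^*$, let $\mu_k = \mu_k(z,t,\tau)\,d\bar z/dz$, $k = 1,\dots,3g-3+n$, be a basis of harmonic Beltrami differentials on $X_{t,\tau}$ dual to $\{\phi_k\}_{1\le k\le 3g-3+n}$ with respect to the pairing in (1.1.3). From the definition of harmonic Beltrami differentials in (1.1) and Proposition 3.1.2(i), one easily sees that for each $1 \le j \le 3g-3+n$, $\mu_j = \sum_{k=1}^{3g-3+n} c_{jk}\,\overline{\phi_k}/\rho_{t,\tau}$ for some constants $c_{jk}$. Now for each $j, k$, we have

$$g^{WP}_{j\bar k} = \int_{X_{t,\tau}}\mu_j\overline{\mu_k}\,\rho_{t,\tau} = \sum_{\ell=1}^{3g-3+n} c_{j\ell}\int_{X_{t,\tau}}\overline{\phi_\ell}\,\overline{\mu_k} = \sum_{\ell=1}^{3g-3+n} c_{j\ell}\,\delta_{k\ell} = c_{jk}.$$

It follows that

$$\mu_j = \sum_{\ell=1}^{3g-3+n} g^{WP}_{j\bar\ell}\,\frac{\overline{\phi_\ell}}{\rho_{t,\tau}} \tag{5.1.4}$$

for each $j$. Now, for each $1 \le j, k \le 3g-3+n$, we have

$$\begin{aligned}
g^{TZ}_{j\bar k} &= \sum_{i=1}^{n}\int_{X_{t,\tau}} E_{i,t,\tau}\,\mu_j\overline{\mu_k}\,\rho_{t,\tau}\\
&= \sum_{\ell,r=1}^{3g-3+n}\sum_{i=1}^{n}\int_{X_{t,\tau}}\frac{E_{i,t,\tau}}{\rho_{t,\tau}}\,g^{WP}_{j\bar\ell}\,\overline{\phi_\ell}\;\overline{g^{WP}_{k\bar r}\,\overline{\phi_r}} \quad\text{(by (5.1.4))}\\
&= \sum_{\ell,r=1}^{3g-3+n} g^{WP}_{j\bar\ell}\Big(\sum_{i=1}^{n}\int_{X_{t,\tau}}\frac{E_{i,t,\tau}\,\phi_r\overline{\phi_\ell}}{\rho_{t,\tau}}\Big)\,g^{WP}_{r\bar k}\\
&= \sum_{\ell,r=1}^{3g-3+n} g^{WP}_{j\bar\ell}\,h^{TZ}_{\bar\ell r}\,g^{WP}_{r\bar k}.
\end{aligned}$$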
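The algebra in the proof of Proposition 5.1.1 can be illustrated in a finite-dimensional toy model in which every integral over $X_{t,\tau}$ is replaced by a weighted sum over a few sample points. This is only a sketch under that assumption: the sample values, the two-dimensional "space of differentials" and all function names below are hypothetical and carry no geometric meaning.

```python
def weighted_gram(phis, weights, rho, density):
    """M[l][r] = sum_p w_p * density_p * conj(phi_l[p]) * phi_r[p] / rho_p."""
    n, pts = len(phis), range(len(weights))
    return [[sum(weights[p] * density[p] * phis[l][p].conjugate() * phis[r][p] / rho[p]
                 for p in pts) for r in range(n)] for l in range(n)]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dual_beltrami(phis, rho, coeffs):
    """mu_j[p] = sum_l coeffs[j][l] * conj(phi_l[p]) / rho_p (discrete analogue of (5.1.4))."""
    return [[sum(coeffs[j][l] * phis[l][p].conjugate() for l in range(len(phis))) / rho[p]
             for p in range(len(rho))] for j in range(len(phis))]


def metric(mus, weights, rho, density):
    """g[j][k] = sum_p w_p * density_p * mu_j[p] * conj(mu_k[p]) * rho_p."""
    n = len(mus)
    return [[sum(weights[p] * density[p] * mus[j][p] * mus[k][p].conjugate() * rho[p]
                 for p in range(len(weights))) for k in range(n)] for j in range(n)]


# Hypothetical sample data: two "differentials" on four sample points.
PHIS = [[1 + 0j, 0.5 + 1j, -0.3 + 0.2j, 0.7 - 0.4j],
        [0.2 - 1j, 1 + 0.5j, 0.9 + 0j, -0.6 + 0.3j]]
WEIGHTS = [0.5, 0.3, 0.7, 0.4]      # area elements
RHO = [1.0, 2.0, 1.5, 0.8]          # stand-in for the hyperbolic density
EISENSTEIN = [0.2, 1.3, 0.0, 2.1]   # stand-in for the sum of the E_i
ONES = [1.0] * 4


def factorisation_defect():
    """Largest entry of G_TZ - G_WP * H_TZ * G_WP in the toy model."""
    gram = weighted_gram(PHIS, WEIGHTS, RHO, ONES)
    mus = dual_beltrami(PHIS, RHO, inverse_2x2(gram))   # dual basis to PHIS
    g_wp = metric(mus, WEIGHTS, RHO, ONES)
    g_tz = metric(mus, WEIGHTS, RHO, EISENSTEIN)
    h_tz = weighted_gram(PHIS, WEIGHTS, RHO, EISENSTEIN)
    product = matmul(matmul(g_wp, h_tz), g_wp)
    return max(abs(g_tz[j][k] - product[j][k]) for j in range(2) for k in range(2))


if __name__ == "__main__":
    print(factorisation_defect())
```

In this model the Weil-Petersson matrix is the inverse of the Gram matrix of the $\phi_k$, which mirrors the identity $g^{WP}_{j\bar k} = c_{jk}$ in the proof; the defect printed at the end should vanish up to rounding.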

5.2. We obtain the asymptotic behavior of the matrix $(h^{TZ}_{j\bar k})$ as follows:

Proposition 5.2.1. Notation as in Theorem 1 and (5.1). Then the following statements hold:

(i) For each $1 \le j \le m$ and any $\varepsilon > 0$, there exists a constant $C_1 > 0$ (depending on $\varepsilon$) such that

$$h^{TZ}_{j\bar j}(t,\tau) \le C_1|t_j|^2(-\log|t_j|)^{2+\varepsilon} \tag{5.2.1}$$

for all $(t,\tau) \in U^*$.

(ii) For each $1 \le j \le m$ and any $\varepsilon > 0$, there exists a constant $C_2 > 0$ (depending on $\varepsilon$) such that

$$h^{TZ}_{j\bar j}(t,\tau) \ge C_2|t_j|^2(-\log|t_j|)^{2-\varepsilon} \tag{5.2.2}$$

for all $(t,\tau) \in U^*$.

(iii) For each $1 \le j, k \le m$ with $j \ne k$,

$$h^{TZ}_{j\bar k}(t,\tau) = O(|t_j|\,|t_k|) \quad\text{as } (t,\tau) \in U^* \to (0,0). \tag{5.2.3}$$

(iv) For each $j, k \ge m+1$,

$$\lim_{\substack{(t,\tau)\to(0,0)\\ (t,\tau)\in U^*}} h^{TZ}_{j\bar k}(t,\tau) = h^{TZ}_{j\bar k}(0,0). \tag{5.2.4}$$

(v) For each $j \le m$ and $k \ge m+1$,

$$h^{TZ}_{j\bar k}(t,\tau) = O(|t_j|) \quad\text{as } (t,\tau) \in U^* \to (0,0). \tag{5.2.5}$$


Proof. First we prove (i). It is easy to see that we only need to verify (5.2.1) for (t, τ ) ∈U ∗ with small (t, τ ). Recall from (1.3) and (3.1.3) the covering of X by coordinate neighborhoods {N j }1≤ j≤m , {Wi }1≤i≤n and {A }1≤≤o , and the corresponding fibers N j,t,τ , Wi,t,τ , A,t,τ . For each 1 ≤ j ≤ m and each 1 ≤ i ≤ n, we have E i,t,τ φ j φ j ρt,τ X t,τ ⎞ ⎛ ⎟ E i,t,τ φ j φ j ⎜ ⎟ ≤⎜ + + + + ⎠ ⎝ ρt,τ N j,t,τ N j ,t,τ Wi,t,τ Wi ,t,τ A,t,τ 1≤ j ≤n j = j

1≤i ≤n i =i

1≤≤o

=: I1 + I2 + I3 + I4 + I5 .

(5.2.6)

Fix an ε with 0 < ε < 1, and recall the decomposition N j,t,τ = N 1j,t,τ ∪ N 2j,t,τ in (2.2.4). By Proposition 4.2.2 (with α = 1 − ), the first line of (3.1.5) in Proposition 3.1.2 and Proposition 2.2.1, it follows that there exist constants C1 , δ1 > 0 such that for all (t, τ ) ∈ U with t j = 0 and satisfying (t, τ ) < δ1 , one has E i,t,τ φ j φ j 1 ρt,τ N j,t,τ |t j | |t j | 1 ≤ C1 · · · |z j |2 (log |z j |)2 dz j d z¯ j 1 1−ε 2 |z j | |z j |2 |t j | 2 <|z j |
E i,t,τ φ j φ j as (t, τ ) ∈ U ∗ → (0, 0). (5.2.8) = O |t j |2 (− log |t j |)2+ε ρt,τ

For each 1 ≤ j ≤ n with j = j, one easily performs a computation similar to (5.2.7) with the first line of (3.1.5) replaced by the second line of (3.1.5) to see that there exist constants C2 , δ2 > 0 such that for all (t, τ ) ∈ U with t j = 0 and satisfying (t, τ ) < δ2 , one has E i,t,τ φ j φ j ≤ C2 |t j |2 |z j |2 (log |z j |)1+ε dz j d z¯ j 1 1 ρ 2 t,τ N |t j | <|z j |
≤ C3 |t j |2

(5.2.9)

for some constant C3 > 0, and a similar estimate holds on N 2j ,t,τ . By summing (5.2.9) over the j ’s, we get an estimate of the form E i,t,τ φ j φ j I2 = = O(|t j |2 ) as (t, τ ) ∈ U ∗ → (0, 0). (5.2.10) ρ t,τ N j ,t,τ 1≤ j ≤n j = j


Using Proposition 4.4.1, the first line of (3.1.9) in Proposition 3.1.2 and Proposition 2.1.3, one easily checks that for each 1 ≤ i ≤ n and each 1 ≤ j ≤ m, there exist constants C4 , δ3 > 0 such that for all (t, τ ) ∈ U satisfying (t, τ ) < δ3 , one has E i,t,τ φ j φ j I3 = ρt,τ Wi,t,τ |t j |2 ≤ C4 (log |z i |)2 · · |z i |2 (log |z i |)2 dz i d z¯ i |z i |2 0<|z i |
(5.2.11)

for some constant C5 > 0. A calculation similar to (5.2.11) (using Proposition 4.5.1 in place of Proposition 4.4.1) easily shows that E i,t,τ φ j φ j I4 = = O(|t j |2 ) as (t, τ ) ∈ U ∗ → (0, 0). (5.2.12) ρ t,τ Wi ,t,τ 1≤i ≤n i =i

For each 1 ≤ ≤ o , it follows readily from the result of Bers [Be] mentioned in (2.1) that there exist constants C5 , C6 > 0 such that for all (t, τ ) ∈ U , one has C5 dz ⊗ d z¯ ≤ ρt,τ ≤ C6 dz ⊗ d z¯ on A,t,τ .

(5.2.13)

Together with the first line of (3.1.11) in Proposition 3.1.2 and Proposition 4.2.1, it follows easily that for each 1 ≤ i ≤ n, there exist constants C7 , δ4 > 0 such that for all (t, τ ) ∈ U ∗ satisfying (t, τ ) < δ4 , one has E i,t,τ φ j φ j I5 = ≤ C7 |t j |2 . (5.2.14) ρt,τ A,t,τ 1≤≤o

By using (5.1.1), (5.2.8), (5.2.10), (5.2.11), (5.2.12) and (5.2.14), one easily sees that (5.2.1) can be obtained readily by summing (5.2.6) with the index i running from 1 to n, and this finishes the proof of (i). We remark that I1 is the dominant term on the right-hand side of (5.2.6). Next one easily sees that (5.2.2) is a direct consequence of Proposition 4.3.1 (by setting β = 1 + in (4.3.1)), which gives (ii). The proof of (iii) is similar to that of (i), and thus it will be skipped. To prove (iv), we first observe from 1 (2.2.4) that for each (t, τ ) ∈ U , N 1j,t,τ can be identified with the subset |t j | 2 ≤ |z j | < r in N 1j,0,0 via the projection map in the z j -coordinate, and similar description holds for N 2j,t,τ . Similarly, each Wi,t,τ and A,t,τ can be identifed with Wi,0,0 and A,0,0 respectively. Next we recall the pointwise upper bounds for the E i,t,τ ’s in Proposition 4.2.2, Proposition 4.4.1 and Proposition 4.5.1, the pointwise upper bounds for the φ j ’s (with j ≥ m + 1) in (3.1.7), the second line of (3.1.9) and that of (3.1.11) in Proposition 3.1.2, and the pointwise lower bounds for the ρt,τ ’s in Proposition 2.1.3, Proposition 2.2.1 and (5.2.13). Recall also the pointwise convergence of the E i,t,τ ’s given by Proposition 4.2.1, that of the φk ’s given by Proposition 3.1.2 and that of the ρt,τ ’s given by Bers’ result [Be] as (t, τ ) ∈ U ∗ → (0, 0). Together with a partition of unity of X with respect to the coverings {N j }, {Wi } and {A }, one can easily apply the dominated convergence theorem to show that for each 1 ≤ i ≤ n and j, k ≥ m + 1, one has E i,t,τ φ j φk E i,0,0 φ j φk → as (t, τ ) ∈ U ∗ → (0, 0), (5.2.15) ρ ρ0,0 t,τ X t,τ X 0,0


which together with (5.1.1), leads to (5.2.4) readily, and this finishes the proof of (iv). Finally the proof of (v) is similar to those of (i) and (iii) (and involves the use of the pointwise upper bounds for the $\phi_j$'s with $j \le m$ needed in (i) and those for the $\phi_k$'s with $k \ge m+1$ needed in (iv) above), and thus it will be skipped.

5.3. Finally we are ready to give the proof of Theorem 1 as follows:

Proof of Theorem 1. We are going to deduce Theorem 1 from Proposition 3.2.1, Proposition 5.1.1 and Proposition 5.2.1, and it amounts to estimating terms of the form

$$g^{WP}_{j\bar\ell}\,h^{TZ}_{\bar\ell r}\,g^{WP}_{r\bar k}, \quad 1 \le j, \ell, r, k \le 3g-3+n$$

(cf. Proposition 5.2.1). To prove Theorem 1(i) or equivalently (1.4.7), we fix an $\varepsilon$ with $0 < \varepsilon < 1$, and fix a $j$ with $1 \le j \le m$. Then it follows from (3.2.1) and (5.2.1) that

$$g^{WP}_{j\bar j}\,h^{TZ}_{\bar j j}\,g^{WP}_{j\bar j} = O\Big(\frac{1}{|t_j|^2(-\log|t_j|)^3}\cdot|t_j|^2(-\log|t_j|)^{2+\varepsilon}\cdot\frac{1}{|t_j|^2(-\log|t_j|)^3}\Big) = O\Big(\frac{1}{|t_j|^2(-\log|t_j|)^{4-\varepsilon}}\Big) \quad\text{as } (t,\tau) \in U^* \to (0,0). \tag{5.3.1}$$

Similarly, it follows from (3.2.1), (3.2.2), (5.2.1) and (5.2.3) that

$$g^{WP}_{j\bar\ell}\,h^{TZ}_{\bar\ell r}\,g^{WP}_{r\bar j} =
\begin{cases}
O\Big(\dfrac{1}{|t_j|^2(-\log|t_j|)^3}\cdot|t_j|\,|t_r|\cdot\dfrac{1}{|t_r|\,|t_j|\,(\log|t_r|)^3(\log|t_j|)^3}\Big) & \text{if } \ell = j\ \&\ 1 \le r \ne j \le m,\\[2ex]
O\Big(\dfrac{1}{|t_j|\,|t_\ell|\,(\log|t_j|)^3(\log|t_\ell|)^3}\cdot|t_\ell|^2(-\log|t_\ell|)^{2+\varepsilon}\cdot\dfrac{1}{|t_\ell|\,|t_j|\,(\log|t_\ell|)^3(\log|t_j|)^3}\Big) & \text{if } 1 \le \ell = r \ne j \le m,\\[2ex]
O\Big(\dfrac{1}{|t_j|\,|t_\ell|\,(\log|t_j|)^3(\log|t_\ell|)^3}\cdot|t_\ell|\,|t_j|\cdot\dfrac{1}{|t_j|^2(-\log|t_j|)^3}\Big) & \text{if } 1 \le \ell \ne j \le m\ \&\ r = j,
\end{cases}$$

$$= O\Big(\frac{1}{|t_j|^2(\log|t_j|)^6}\Big) \quad\text{as } (t,\tau) \in U^* \to (0,0) \text{ if } 1 \le \ell, r \le m\ \&\ (\ell,r) \ne (j,j). \tag{5.3.2}$$

Similarly one easily checks from (3.2.1), (3.2.2), (3.2.3), (3.2.4), (5.2.4), (5.2.5) that

$$g^{WP}_{j\bar\ell}\,h^{TZ}_{\bar\ell r}\,g^{WP}_{r\bar j} = O\Big(\frac{1}{|t_j|^2(\log|t_j|)^6}\Big) \quad\text{as } (t,\tau) \in U^* \to (0,0) \text{ if } \ell \ge m+1 \text{ or } r \ge m+1. \tag{5.3.3}$$
260

K. Obitsu, W.-K. To, L. Weng

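Combining Proposition 5.1.1, (5.2.1), (5.2.2) and (5.2.3), we have

$$\begin{aligned}
g^{TZ}_{j\bar j} &= g^{WP}_{j\bar j}\,h^{TZ}_{\bar j j}\,g^{WP}_{j\bar j} + \sum_{\substack{1\le\ell,r\le 3g-3+n\\ (\ell,r)\ne(j,j)}} g^{WP}_{j\bar\ell}\,h^{TZ}_{\bar\ell r}\,g^{WP}_{r\bar j}\\
&= O\Big(\frac{1}{|t_j|^2(-\log|t_j|)^{4-\varepsilon}}\Big) + O\Big(\frac{1}{|t_j|^2(\log|t_j|)^6}\Big)\\
&= O\Big(\frac{1}{|t_j|^2(-\log|t_j|)^{4-\varepsilon}}\Big) \quad\text{as } (t,\tau) \in U^* \to (0,0),
\end{aligned} \tag{5.3.4}$$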

and this gives Theorem 1(i). A calculation similar to (5.3.1) using (5.2.2) in place of (5.2.1) implies that for $1 \le j \le m$ and $0 < \varepsilon < 1$, there exists a constant $C > 0$ such that

$$g^{WP}_{j\bar j}\,h^{TZ}_{\bar j j}\,g^{WP}_{j\bar j} \ge \frac{C}{|t_j|^2(-\log|t_j|)^{4+\varepsilon}} \tag{5.3.5}$$

for all $(t,\tau) \in U^*$. Then a calculation similar to (5.3.4) using (5.2.3) in place of (5.2.1) leads readily to (1.4.8), which, in turn, leads to Theorem 1(ii). The proof of Theorem 1(iii) is similar to that of Theorem 1(i), and thus it will be skipped. To prove Theorem 1(iv), we first observe that from (1.2), (5.1.1), Proposition 3.1.2(ii) and using a calculation similar to Proposition 5.1.1, one has, for each $j, k \ge m+1$,

$$\hat g^{TZ,(\gamma_1,\dots,\gamma_m)}_{j\bar k}(0,0) = \sum_{\ell,r=m+1}^{3g-3+n} g^{WP}_{j\bar\ell}(0,0)\,h^{TZ}_{\bar\ell r}(0,0)\,g^{WP}_{r\bar k}(0,0). \tag{5.3.6}$$

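For each $j, k \ge m+1$ and each $1 \le \ell, r \le m$, it follows from (3.2.4), (5.2.1), (5.2.3) that

$$g^{WP}_{j\bar\ell}(t,\tau)\,h^{TZ}_{\bar\ell r}(t,\tau)\,g^{WP}_{r\bar k}(t,\tau) =
\begin{cases}
O\Big(\dfrac{1}{|t_\ell|(-\log|t_\ell|)^3}\cdot|t_\ell|\,|t_r|\cdot\dfrac{1}{|t_r|(-\log|t_r|)^3}\Big) & \text{if } \ell \ne r,\\[2ex]
O\Big(\dfrac{1}{|t_\ell|(-\log|t_\ell|)^3}\cdot|t_\ell|^2(-\log|t_\ell|)^{2+\varepsilon}\cdot\dfrac{1}{|t_\ell|(-\log|t_\ell|)^3}\Big) & \text{if } \ell = r,
\end{cases}$$

$$\to 0 \quad\text{as } (t,\tau) \in U^* \to (0,0). \tag{5.3.7}$$

Similarly, for each $j, k \ge m+1$, one also easily sees from (3.2.3), (3.2.4), (5.2.4) and (5.2.5) that

$$g^{WP}_{j\bar\ell}(t,\tau)\,h^{TZ}_{\bar\ell r}(t,\tau)\,g^{WP}_{r\bar k}(t,\tau) \to 0 \quad\text{as } (t,\tau) \in U^* \to (0,0), \text{ if } \ell \le m < r \text{ or } r \le m < \ell. \tag{5.3.8}$$

Thus, one has, for $j, k \ge m+1$,

$$\begin{aligned}
\lim_{(t,\tau)\in U^*\to(0,0)} g^{TZ}_{j\bar k}(t,\tau)
&= \lim_{(t,\tau)\in U^*\to(0,0)}\sum_{\ell,r=m+1}^{3g-3+n} g^{WP}_{j\bar\ell}(t,\tau)\,h^{TZ}_{\bar\ell r}(t,\tau)\,g^{WP}_{r\bar k}(t,\tau) \quad\text{(by Proposition 5.1.1, (5.3.7), (5.3.8))}\\
&= \sum_{\ell,r=m+1}^{3g-3+n} g^{WP}_{j\bar\ell}(0,0)\,h^{TZ}_{\bar\ell r}(0,0)\,g^{WP}_{r\bar k}(0,0) \quad\text{(by (3.2.3), (5.2.4))}\\
&= \hat g^{TZ,(\gamma_1,\dots,\gamma_m)}_{j\bar k}(0,0) \quad\text{(by (5.3.6))},
\end{aligned} \tag{5.3.9}$$

and this finishes the proof of Theorem 1(iv). Finally the proof of Theorem 1(v) is similar to that of Theorem 1(i) and (iv), and thus it will be skipped.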
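The order-of-magnitude bookkeeping in (5.3.1) and (5.3.4) amounts to adding exponents of $|t_j|$ and of $-\log|t_j|$. A small sketch of that arithmetic follows, encoding a rate $|t|^a(-\log|t|)^b$ as the pair `(a, b)`; the encoding, the choice $\varepsilon = 1/2$ and the function names are illustrative assumptions.

```python
import math


def multiply(*terms):
    """Product of growth rates |t|^a * (-log|t|)^b, each encoded as a pair (a, b)."""
    return (sum(a for a, _ in terms), sum(b for _, b in terms))


def evaluate(term, t_abs):
    """Numerical value of |t|^a * (-log|t|)^b at |t| = t_abs, 0 < t_abs < 1."""
    a, b = term
    return t_abs ** a * (-math.log(t_abs)) ** b


EPS = 0.5
G_WP_DIAGONAL = (-2, -3)        # g^WP_{j jbar} = O(1 / (|t_j|^2 (-log|t_j|)^3)), from (3.2.1)
H_TZ_DIAGONAL = (2, 2 + EPS)    # h^TZ_{j jbar} = O(|t_j|^2 (-log|t_j|)^(2+eps)), from (5.2.1)

DIAGONAL_TERM = multiply(G_WP_DIAGONAL, H_TZ_DIAGONAL, G_WP_DIAGONAL)   # as in (5.3.1)
REMAINING_TERMS = (-2, -6)                                              # as in (5.3.2), (5.3.3)


def remainder_ratio(t_abs):
    """Size of the remaining terms relative to the diagonal term."""
    return evaluate(REMAINING_TERMS, t_abs) / evaluate(DIAGONAL_TERM, t_abs)


if __name__ == "__main__":
    print(DIAGONAL_TERM)
    print([remainder_ratio(t) for t in (1e-2, 1e-4, 1e-8)])
```

The ratio equals $(-\log|t_j|)^{-(2+\varepsilon)}$, which tends to zero, so the diagonal term dominates in (5.3.4).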


Acknowledgements. The authors would like to thank Professor Ngaiming Mok and Professor Scott Wolpert for their interest, comments and suggestions related to this work.

References

[A1] Ahlfors, L.: An extension of Schwarz’s lemma. Trans. Amer. Math. Soc. 43, 359–364 (1938)
[A2] Ahlfors, L.: Curvature properties of Teichmüller’s space. J. Analyse Math. 9, 161–176 (1961/1962)
[AB] Ahlfors, L., Bers, L.: Riemann’s mapping theorem for variable metrics. Ann. of Math. 72, 385–404 (1960)
[Ba] Baily, W.: The decomposition theorem for V-manifolds. Amer. J. Math. 78, 862–888 (1956)
[Be] Bers, L.: Spaces of degenerating Riemann surfaces. In: Discontinuous groups and Riemann surfaces. Ann. of Math. Studies, No. 79, Princeton, N.J: Princeton Univ. Press, 1974, pp. 43–55
[Bu] Buser, P.: Geometry and Spectra of compact Riemann surfaces. Boston: Birkhäuser, 1992
[Ch] Chu, T.: The Weil-Petersson metric in the moduli space. Chinese J. Math. 4, 29–51 (1976)
[FK] Farkas, H.M., Kra, I.: Riemann surfaces. New York: Springer-Verlag, 1992
[FM] Fulton, W., MacPherson, R.: Compactification of configuration spaces. Ann. of Math. 139, 183–225 (1994)
[He] Hejhal, D.A.: The Selberg trace formula for PSL(2, R). Vol. 2, Lecture Notes in Mathematics 1001, Berlin: Springer-Verlag, 1983
[IT] Imayoshi, Y., Taniguchi, M.: An introduction to Teichmüller spaces. Tokyo: Springer Verlag, 1992
[Kn] Knudsen, F.: The projectivity of the moduli space of stable curves, II, and III. Math. Scand. 52, 161–199, 200–212 (1983)
[KM] Knudsen, F., Mumford, D.: The projectivity of the moduli space of stable curves, I. Math. Scand. 39, 19–55 (1976)
[Ku] Kubota, T.: Elementary theory of Eisenstein series. Tokyo: Kodansha, New York-London-Sydney: John Wiley and Sons, 1973
[M] Masur, H.: Extension of the Weil-Petersson metric to the boundary of Teichmüller space. Duke Math. J. 43, 623–635 (1976)
[N] Nag, S.: The complex analytic theory of Teichmüller spaces. New York: John Wiley & Sons, 1988
[O1] Obitsu, K.: Non-completeness of Zograf-Takhtajan’s Kähler metric for Teichmüller space of punctured Riemann surfaces. Commun. Math. Phys. 205, 405–420 (1999)
[O2] Obitsu, K.: The asymptotic behavior of Eisenstein series and a comparison of the Weil-Petersson and the Zograf-Takhtajan metrics. Publ. Res. Inst. Math. Sci. 37, 459–478 (2001)
[OW] Obitsu, K., Wolpert, S.: Grafting hyperbolic metrics and Eisenstein series. Math. Ann. 341, 685–706 (2008)
[Q] Quillen, D.: Determinants of Cauchy-Riemann operators on Riemann surfaces. Funct. Anal. Appl. 19, 37–41 (1985)
[TZ1] Takhtajan, L.A., Zograf, P.G.: The Selberg zeta function and a new Kähler metric on the moduli space of punctured Riemann surfaces. J. Geom. Phys. 5, 551–570 (1988)
[TZ2] Takhtajan, L.A., Zograf, P.G.: A local index theorem for families of ∂-operators on punctured Riemann surfaces and a new Kähler metric on their moduli spaces. Commun. Math. Phys. 137, 399–426 (1991)
[We] Weng, L.: -admissible theory, II. Deligne pairings over moduli spaces of punctured Riemann surfaces. Math. Ann. 320, 239–283 (2001)
[Wolf] Wolf, M.: Infinite energy harmonic maps and degeneration of hyperbolic surfaces in moduli space. J. Differ. Geom. 33, 487–539 (1991)
[Wolp1] Wolpert, S.: Noncompleteness of the Weil-Petersson metric for Teichmüller space. Pacific J. Math. 61, 573–577 (1975)
[Wolp2] Wolpert, S.: Chern forms and the Riemann tensor for the moduli space of curves. Invent. Math. 85, 119–145 (1986)
[Wolp3] Wolpert, S.A.: The hyperbolic metric and the geometry of the universal curve. J. Differ. Geom. 31, 417–472 (1990)
[Wolp4] Wolpert, S.: Disappearance of cusp forms in special families. Ann. of Math. 139, 239–291 (1994)
[Wolp5] Wolpert, S.: Geometry of the Weil-Petersson completion of Teichmüller. Surveys in differential geometry, Vol. VIII (Boston, MA, 2002), Surv. Differ. Geom., VIII, Somerville, MA: Int. Press, 2003, pp. 357–393
[Wolp6] Wolpert, S.: Cusps and the family hyperbolic metric. Duke Math. J. 138, 423–443 (2007)
[Y] Yamada, S.: On the geometry of Weil-Petersson completion of Teichmüller spaces. Math. Res. Lett. 11, 327–344 (2004)

Communicated by L. Takhtajan

Commun. Math. Phys. 284, 263–280 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0624-0

Communications in

Mathematical Physics

Counterexamples to the Maximal p-Norm Multiplicativity Conjecture for all p > 1 Patrick Hayden1 , Andreas Winter2,3 1 School of Computer Science, McGill University, Montreal, Quebec, H3A 2A7, Canada.

E-mail: [email protected]

2 Department of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK.

E-mail: [email protected]

3 Centre for Quantum Technologies, National University of Singapore, 2 Science Drive 3,

Singapore 117542, Singapore Received: 10 September 2007 / Accepted: 29 July 2008 Published online: 10 September 2008 – © Springer-Verlag 2008

Abstract: For all p > 1, we demonstrate the existence of quantum channels with nonmultiplicative maximal output p-norms. Equivalently, for all p > 1, the minimum output Rényi entropy of order p of a quantum channel is not additive. The violations found are large; in all cases, the minimum output Rényi entropy of order p for a product channel need not be significantly greater than the minimum output entropy of its individual factors. Since p = 1 corresponds to the von Neumann entropy, these counterexamples demonstrate that if the additivity conjecture of quantum information theory is true, it cannot be proved as a consequence of any channel-independent guarantee of maximal p-norm multiplicativity. We also show that a class of channels previously studied in the context of approximate encryption lead to counterexamples for all p > 2.

1. Introduction

The oldest problem of quantum information theory is arguably to determine the capacity of a quantum-mechanical communications channel for carrying information, specifically classical bits of information. (Until the 1990’s it would have been unnecessary to add that additional qualification, but today the field is equally concerned with other forms of information like qubits and ebits that are fundamentally quantum-mechanical.) The classical capacity problem long predates the invention of quantum source coding [1,2] and was of concern to the founders of information theory themselves [3]. The first major result on the problem came with the resolution of a conjecture of Gordon’s [4] by Alexander Holevo in 1973, when he published the first proof [5] that the maximum amount of information that can be extracted from an ensemble of states $\sigma_i$ occurring with probabilities $p_i$ is bounded above by

$$\chi(\{p_i,\sigma_i\}) = H\Big(\sum_i p_i\sigma_i\Big) - \sum_i p_i H(\sigma_i), \tag{1}$$


where $H(\sigma) = -\operatorname{Tr}\sigma\ln\sigma$ is the von Neumann entropy of the density operator $\sigma$. For a quantum channel $\mathcal{N}$, one can then define the Holevo capacity

$$\chi(\mathcal{N}) = \max_{\{p_i,\rho_i\}}\chi(\{p_i,\mathcal{N}(\rho_i)\}), \tag{2}$$

where the maximization is over all ensembles of input states. Writing $C(\mathcal{N})$ for the classical capacity of the channel $\mathcal{N}$, this leads easily to an upper bound of

$$C(\mathcal{N}) \le \lim_{n\to\infty}\frac{1}{n}\,\chi(\mathcal{N}^{\otimes n}). \tag{3}$$

It then took more than two decades for further substantial progress to be made on the problem, but in 1996, building on recent advances [6], Holevo [7] and Schumacher-Westmoreland [8] managed to show that the upper bound in Eq. (3) is actually achieved. This was a resolution of sorts to the capacity problem, but the limit in the equation makes it in practice extremely difficult to evaluate. If the codewords used for data transmission are restricted such that they are not entangled across multiple uses of the channel, however, the resulting product state capacity $C_{1\infty}(\mathcal{N})$ has the simpler expression

$$C_{1\infty}(\mathcal{N}) = \chi(\mathcal{N}). \tag{4}$$
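To make the quantity in Eq. (1) concrete, here is a minimal sketch that evaluates $\chi$ for an ensemble of qubit states. It assumes each state is given by its Bloch vector $\vec r$, so that the density operator $(I + \vec r\cdot\vec\sigma)/2$ has eigenvalues $(1 \pm |\vec r|)/2$; the function names and the example ensembles are illustrative and do not come from the paper.

```python
import math


def binary_entropy_nats(lam):
    """-lam ln lam - (1 - lam) ln(1 - lam), with the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in (lam, 1 - lam) if x > 0)


def qubit_entropy(bloch):
    """von Neumann entropy (natural log) of the qubit state with the given Bloch vector."""
    radius = min(1.0, math.sqrt(sum(c * c for c in bloch)))
    return binary_entropy_nats((1 + radius) / 2)


def holevo_chi(probs, blochs):
    """chi({p_i, sigma_i}) = H(sum_i p_i sigma_i) - sum_i p_i H(sigma_i), as in Eq. (1)."""
    average = [sum(p * b[k] for p, b in zip(probs, blochs)) for k in range(3)]
    return qubit_entropy(average) - sum(p * qubit_entropy(b) for p, b in zip(probs, blochs))


if __name__ == "__main__":
    # Two orthogonal pure states with equal weights, then two non-orthogonal pure states.
    print(holevo_chi([0.5, 0.5], [(0, 0, 1), (0, 0, -1)]))
    print(holevo_chi([0.5, 0.5], [(0, 0, 1), (1, 0, 0)]))
```

Averaging density operators corresponds to averaging Bloch vectors, which is why the ensemble average can be formed componentwise.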

The additivity conjecture for the Holevo capacity asserts that for all channels N1 and N2 , χ (N1 ⊗ N2 ) = χ (N1 ) + χ (N2 ).

(5)

This would imply, in particular, that $C_{1\infty}(\mathcal{N}) = C(\mathcal{N})$, or that entangled codewords do not increase the classical capacity of a quantum channel. In 2003, Peter Shor [9], building on several previously established connections [10–12], demonstrated that the additivity of the Holevo capacity, the additivity of the entanglement of formation [13–16] and the superadditivity of the entanglement of formation [17] are all equivalent to another conjecture of Shor’s which is particularly simple to express mathematically, known as the minimum output entropy conjecture [18]. For a channel $\mathcal{N}$, define

$$H^{\min}(\mathcal{N}) = \min_{|\varphi\rangle} H(\mathcal{N}(\varphi)), \tag{6}$$

where the minimization is over all pure input states $|\varphi\rangle$. The minimum output entropy conjecture asserts that for all channels $\mathcal{N}_1$ and $\mathcal{N}_2$,

$$H^{\min}(\mathcal{N}_1\otimes\mathcal{N}_2) = H^{\min}(\mathcal{N}_1) + H^{\min}(\mathcal{N}_2). \tag{7}$$

There has been a great deal of previous work on these conjectures, particularly inconclusive numerical searches for counterexamples, necessarily in low dimension, at Caltech, IBM, in Braunschweig (IMaPh) and Tokyo (ERATO) [19], as well as proofs of many special cases. For example, the minimum output entropy conjecture has been shown to hold if one of the channels is the identity channel [20,21], a unital qubit channel [22], a generalized depolarizing channel [23,24] or an entanglement-breaking channel [25–27]. In addition, the weak additivity conjecture was confirmed for generalized dephasing channels [28], the conjugates of all these channels [29] and some other special classes of channels [16,30–32]. Further evidence for qubit channels was supplied in [18]. This list is by no means exhaustive. The reader is directed to Holevo’s reviews for a detailed account of the history of the additivity problem [33,34].


For the past several years, the most commonly used strategy for proving these partial results has been to demonstrate the multiplicativity of maximal p-norms of quantum channels for p approaching 1 [20]. For a quantum channel $\mathcal{N}$ and $p > 1$, define the maximal p-norm of $\mathcal{N}$ to be

$$\nu_p(\mathcal{N}) = \sup\left\{ \|\mathcal{N}(\rho)\|_p \,;\ \rho \ge 0,\ \operatorname{Tr}\rho = 1 \right\}. \tag{8}$$

In the equation, $\|\sigma\|_p = (\operatorname{Tr}|\sigma|^p)^{1/p}$. The maximal p-norm multiplicativity conjecture [20] asserts that for all quantum channels $\mathcal{N}_1$ and $\mathcal{N}_2$,

$$\nu_p(\mathcal{N}_1 \otimes \mathcal{N}_2) = \nu_p(\mathcal{N}_1)\,\nu_p(\mathcal{N}_2). \tag{9}$$

This can be re-expressed in an equivalent form more convenient to us using Rényi entropies. Define the Rényi entropy of order p to be

$$H_p(\rho) = \frac{1}{1-p} \ln \operatorname{Tr} \rho^p \tag{10}$$

for $p > 0$, $p \neq 1$. Since $\lim_{p \downarrow 1} H_p(\rho) = H(\rho)$, we will also define $H_1(\rho)$ to be $H(\rho)$. All these entropies have the property that they are 0 for pure states and achieve their maximum value of the logarithm of the dimension on maximally mixed states. Define the minimum output Rényi entropy $H_p^{\min}$ by substituting $H_p$ for $H$ in Eq. (6). Since $H_p^{\min}(\mathcal{N}) = \frac{p}{1-p} \ln \nu_p(\mathcal{N})$, Eq. (9) can then be written equivalently as

$$H_p^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) = H_p^{\min}(\mathcal{N}_1) + H_p^{\min}(\mathcal{N}_2), \tag{11}$$

in which form it is clear that the maximal p-norm multiplicativity conjecture is a natural strengthening of the original minimum output entropy conjecture (7). This conjecture spawned a significant literature of its own which we will not attempt to summarize. Holevo's reviews are again an excellent source [33,34]. Some more recent important references include [35–40]. Unlike the von Neumann entropy case, however, some counterexamples had already been found prior to this paper. Namely, Werner and Holevo found a counterexample to Eq. (11) for p > 4.79 [41] that nonetheless doesn't violate the p-norm multiplicativity conjecture for 1 2 [45]. In light of these developments, the standing conjecture was that the maximal p-norm multiplicativity held for $1 \le p \le 2$, corresponding to the region in which the map $X \mapsto X^p$ is operator convex [35]. More conservatively, it was conjectured to hold at least in an open interval $(1, 1 + \epsilon)$, which would be sufficient to imply the minimum output entropy conjecture. On the contrary, shortly after Winter's discovery, Hayden showed that the conjecture is false for all $1 < p < 2$ [46].

The current paper merges and slightly strengthens [45] and [46]. We begin in Sect. 2 by presenting Winter's counterexamples from [45], which share some important features with [46] but are simpler to analyze. Section 3 then presents Hayden's counterexamples from [46] with an improved analysis showing that they work for all $p > 1$, not just $1 < p < 2$. For all $p > 1$, we show that there exist channels $\mathcal{N}_1$ and $\mathcal{N}_2$ with output dimension d such that both $H_p^{\min}(\mathcal{N}_1)$ and $H_p^{\min}(\mathcal{N}_2)$ are equal to $\ln d - O(1)$, but $H_p^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) = \ln d + O(1)$, so

$$H_p^{\min}(\mathcal{N}_1) + H_p^{\min}(\mathcal{N}_2) - H_p^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) = \ln d - O(1). \tag{12}$$
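As a small numerical illustration of the Rényi entropies of Eq. (10), the following sketch evaluates $H_p$ on a given eigenvalue list and checks the properties quoted above: the value 0 on pure states, the value $\ln d$ on the maximally mixed state, the limit $p \downarrow 1$, and the relation between $H_p$ and the p-norm. The function names and the sample spectrum are illustrative choices, not taken from the paper.

```python
import math

def renyi_entropy(eigenvalues, p):
    """Renyi entropy H_p (natural log) of a spectrum; p = 1 is von Neumann."""
    lam = [x for x in eigenvalues if x > 0]
    if p == 1:
        return -sum(x * math.log(x) for x in lam)
    if p == math.inf:
        return -math.log(max(lam))
    return math.log(sum(x ** p for x in lam)) / (1 - p)

def p_norm(eigenvalues, p):
    """Schatten p-norm of a positive operator, computed from its spectrum."""
    return sum(x ** p for x in eigenvalues) ** (1 / p)

d = 4
pure = [1.0, 0.0, 0.0, 0.0]           # spectrum of a pure state
mixed = [1 / d] * d                   # maximally mixed state
spectrum = [0.5, 0.25, 0.125, 0.125]  # hypothetical sample spectrum

for p in (1, 1.5, 2, 4, math.inf):
    print(p, renyi_entropy(spectrum, p))
```

The printed values decrease as p grows, which is the monotonicity of the Rényi entropies in p used later in the paper.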


P. Hayden, A. Winter

Thus, one finds that the minimum output entropy of the product channel need not be significantly larger than the minimum output entropy of the individual factors. Since [20, 24]

$$H_p^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) \ge H_p^{\min}(\mathcal{N}_1) = \ln d - O(1), \tag{13}$$

these counterexamples are essentially the strongest possible for all p > 1, up to a constant additive term. (Note that the dependence of $H_p^{\min}$ on p is absorbed here in the asymptotic notation.) At p = 1 itself, however, we see no evidence of a violation of the additivity conjecture for the channels we study. Thus, the conjecture stands and it is still an open question whether entangled codewords can increase the classical capacity of a quantum channel.

Notation. If A and B are finite dimensional Hilbert spaces, we write $AB \equiv A \otimes B$ for their tensor product and $|A|$ for $\dim A$. The Hilbert spaces on which linear operators act will be denoted by a superscript. For instance, we write $\varphi^{AB}$ for a density operator on AB. Partial traces will be abbreviated by omitting superscripts, such as $\varphi^A \equiv \operatorname{Tr}_B \varphi^{AB}$. We use a similar notation for pure states, e.g. $|\psi\rangle^{AB} \in AB$, while abbreviating $\psi^{AB} \equiv |\psi\rangle\langle\psi|^{AB}$. We associate to any two isomorphic Hilbert spaces $A \simeq A'$ a unique maximally entangled state which we denote $|\Phi\rangle^{AA'}$. Given any orthonormal basis $\{|i\rangle^A\}$ for A, if we define $|i\rangle^{A'} = V|i\rangle^A$, where V is the associated isomorphism, we can write this state as $|\Phi\rangle^{AA'} = |A|^{-1/2} \sum_{i=1}^{|A|} |i\rangle^A |i\rangle^{A'}$. We will also make use of the asymptotic notation $f(n) = O(g(n))$ if there exists $C > 0$ such that for sufficiently large n, $|f(n)| \le C g(n)$. $f(n) = \Omega(g(n))$ is defined similarly but with the reverse inequality $|f(n)| \ge C g(n)$. Finally, $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.

2. Random Unitary Channels: p > 2

This class of counterexamples, while only working for p > 2, has the advantage of being a straightforward application of well-known results. Later in the paper we will present stronger counterexamples that reuse the same basic strategy, albeit with some additional technical complications. A random unitary channel is a map of the form

$$\mathcal{N} : \rho \longmapsto \frac{1}{n} \sum_{i=1}^{n} V_i \rho V_i^\dagger, \tag{14}$$

with the $V_i$ unitary transformations of an underlying (finite dimensional) Hilbert space. Let d be the dimension of this space. Following [47], we call $\mathcal{N}$ $\epsilon$-randomizing if for all $\rho$,

$$\left\| \mathcal{N}(\rho) - \frac{1}{d} I \right\|_\infty \le \frac{\epsilon}{d}. \tag{15}$$

In that paper, it was shown that for $0 < \epsilon < 1$, $\epsilon$-randomizing channels exist in all dimensions $d > 10/\epsilon$, with $n = \frac{134}{\epsilon^2} d \ln d$. In fact, randomly picking the $V_i$ from the Haar measure on the unitary group will, with high probability, yield such a channel. Recently, it was shown by Aubrun [48] that n can in fact be taken to be $O(d/\epsilon^2)$ for Haar distributed $V_i$, and $O(d (\ln d)^4/\epsilon^2)$ for $V_i$ drawn from any ensemble of exactly randomizing unitaries.
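To make the definitions in Eqs. (14) and (15) concrete, here is a minimal sketch for a single qubit (d = 2), written in plain Python without external libraries. It applies a random unitary channel built from a fixed list of unitaries and measures the operator-norm distance of the output from the maximally mixed state. The choice of the Pauli matrices and of the input state is purely illustrative: the four Paulis form an exactly randomizing ensemble, while the pair {I, Z} does not.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def random_unitary_channel(unitaries, rho):
    """Apply rho -> (1/n) sum_i V_i rho V_i^dagger, as in Eq. (14)."""
    n, d = len(unitaries), len(rho)
    out = [[0j] * d for _ in range(d)]
    for v in unitaries:
        term = matmul(matmul(v, rho), dagger(v))
        for i in range(d):
            for j in range(d):
                out[i][j] += term[i][j] / n
    return out

def deviation_from_maximally_mixed(sigma):
    """Operator norm of sigma - I/2 for a 2x2 Hermitian matrix sigma."""
    a = sigma[0][0].real - 0.5
    c = sigma[1][1].real - 0.5
    b = abs(sigma[0][1])
    mean, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    return max(abs(mean + radius), abs(mean - radius))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

rho = [[0.8, 0.3 - 0.1j], [0.3 + 0.1j, 0.2]]  # arbitrary qubit state

pauli_out = random_unitary_channel([I2, X, Y, Z], rho)
dephased_out = random_unitary_channel([I2, Z], rho)
print(deviation_from_maximally_mixed(pauli_out))     # essentially zero
print(deviation_from_maximally_mixed(dephased_out))  # 0.3, i.e. epsilon/d with epsilon = 0.6
```

For the dephasing pair {I, Z} the deviation on this input already forces $\epsilon \ge 0.6$ in Eq. (15), whereas the full Pauli ensemble sends every input to I/2.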


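Lemma 2.1. For a random unitary channel $\mathcal{N}$ and its complex conjugate, $\bar{\mathcal{N}} : \rho \mapsto \frac{1}{n} \sum_{i=1}^{n} \bar{V}_i \rho \bar{V}_i^\dagger$, one has $\nu_p(\mathcal{N} \otimes \bar{\mathcal{N}}) \ge \frac{1}{n}$.

Proof. We use the maximally entangled state $|\Phi\rangle = d^{-1/2} \sum_i |i\rangle|i\rangle$ as a test state, abbreviating $\Phi = |\Phi\rangle\langle\Phi|$:

$$\begin{aligned}
\nu_p(\mathcal{N} \otimes \bar{\mathcal{N}}) &\ge \left\| (\mathcal{N} \otimes \bar{\mathcal{N}})\Phi \right\|_p \\
&= \left\| \frac{1}{n^2} \sum_{i,j=1}^{n} (V_i \otimes \bar{V}_j)\, \Phi\, (V_i \otimes \bar{V}_j)^\dagger \right\|_p \\
&= \left\| \frac{1}{n} \Phi + \frac{1}{n^2} \sum_{i \ne j} (V_i \otimes \bar{V}_j)\, \Phi\, (V_i \otimes \bar{V}_j)^\dagger \right\|_p \ge \frac{1}{n},
\end{aligned}$$

where in the third line we have invoked the $U \otimes \bar{U}$-invariance of $\Phi$ for the n terms when i = j. For the final inequality, observe that the largest eigenvalue $\lambda_1$ of $(\mathcal{N} \otimes \bar{\mathcal{N}})\Phi$ is at least $\frac{1}{n}$. Denoting the other eigenvalues $\lambda_\alpha$, $\|(\mathcal{N} \otimes \bar{\mathcal{N}})\Phi\|_p = \left( \sum_\alpha \lambda_\alpha^p \right)^{1/p} \ge \lambda_1$, and we are done.

Lemma 2.2. If the channel $\mathcal{N}$ is $\epsilon$-randomizing, then when p > 1,

$$\nu_p(\mathcal{N}) = \nu_p(\bar{\mathcal{N}}) \le \left( \frac{1+\epsilon}{d} \right)^{1-1/p}.$$

Proof. Clearly, $\mathcal{N}$ and $\bar{\mathcal{N}}$ have the same maximum output p-norm. For the former, observe that the $\epsilon$-randomizing condition implies that for an arbitrary input state $\rho$, $\|\mathcal{N}(\rho)\|_\infty \le \frac{1+\epsilon}{d}$. In other words, all the eigenvalues $\lambda_\alpha$ of the output state $\mathcal{N}(\rho)$ are bounded between 0 and $\frac{1+\epsilon}{d}$. In addition, because $\mathcal{N}(\rho)$ is a density operator, the eigenvalues sum to 1. Subject to these constraints, however, the convexity of the function $x \mapsto x^p$ ensures that the p-norm $\|\mathcal{N}(\rho)\|_p = \left( \sum_\alpha \lambda_\alpha^p \right)^{1/p}$ is maximized when the largest eigenvalue is $\frac{1+\epsilon}{d}$ and it occurs with multiplicity $\left\lfloor \frac{d}{1+\epsilon} \right\rfloor$, and all but possibly one remaining eigenvalue is 0. Thus,

$$\|\mathcal{N}(\rho)\|_p = \left( \sum_\alpha \lambda_\alpha^p \right)^{1/p} \le \left( \frac{d}{1+\epsilon} \left( \frac{1+\epsilon}{d} \right)^p \right)^{1/p} = \left( \frac{1+\epsilon}{d} \right)^{1-1/p}. \tag{16}$$

Theorem 2.3. Fix any $0 < \epsilon < 1$ and a family of $\epsilon$-randomizing maps $\mathcal{N}$ as in Eq. (14) with $n > 134\, d \ln d/\epsilon^2$. Then, for any p > 2 and sufficiently large d,

$$\nu_p(\mathcal{N})\, \nu_p(\bar{\mathcal{N}}) \le \left( \frac{1+\epsilon}{d} \right)^{2-2/p} < \frac{1}{n} \le \nu_p(\mathcal{N} \otimes \bar{\mathcal{N}}). \tag{17}$$

In other words, for this family of channels, the maximum output p-norm is strictly supermultiplicative for sufficiently large d when p > 2.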
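The phrase "sufficiently large d" in Theorem 2.3 can be made tangible with a short calculation. The sketch below compares the single-copy bound $((1+\epsilon)/d)^{2-2/p}$ of Lemma 2.2 with the two-copy bound $1/n$ of Lemma 2.1, under the assumption (made here only for illustration) that n is taken equal to $134\, d \ln d/\epsilon^2$. The function names and the sample values of d, $\epsilon$ and p are hypothetical.

```python
import math

def single_copy_bound(d, eps, p):
    """Upper bound on nu_p(N) * nu_p(N-bar) from Lemma 2.2."""
    return ((1 + eps) / d) ** (2 - 2 / p)

def two_copy_bound(d, eps):
    """Lower bound 1/n on nu_p(N tensor N-bar) from Lemma 2.1,
    assuming n = 134 d ln(d) / eps^2."""
    n = 134 * d * math.log(d) / eps ** 2
    return 1 / n

def violates_multiplicativity(d, eps, p):
    """True when the two bounds force strict supermultiplicativity."""
    return single_copy_bound(d, eps, p) < two_copy_bound(d, eps)

for d in (1e6, 1e12, 1e15):
    print(d, violates_multiplicativity(d, eps=0.5, p=3))
```

With $\epsilon = 1/2$ and p = 3 the bounds only separate for d well beyond $10^{12}$, and at p = 2 they never separate, since the single-copy bound then scales as 1/d while 1/n carries an extra factor of $1/\ln d$.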


Proof. Follows from Lemmas 2.1 and 2.2 since $2 - 2/p > 1$.

These counterexamples to the multiplicativity of the output p-norm for p > 2 are interesting in that they are random unitary channels, which are among the simplest truly quantum maps. In fact, the first proofs of multiplicativity for unital qubit channels [22] and depolarizing channels [24] exploited this type of structure. Indeed, unital qubit channels are always random unitary channels (with d = 2) [18]. Despite the fact that King showed multiplicativity for such channels at all p > 1 [22], there is no conflict with the result here, as the bound on n becomes better than $d^2$ only for rather large dimension d. We observe, furthermore, that p = 2 is indeed the limit of validity of this class of counterexamples, since $n \ge d$ for any $\epsilon$-randomizing map.

3. Generic Quantum Channels: All p > 1

Let E, F and G be finite dimensional quantum systems, then define R = E, S = FG, A = EF and B = G, so that RS = AB = EFG. Our second and stronger class of counterexamples will be channels from S to A of the form

$$\mathcal{N}(\rho) = \operatorname{Tr}_B \left[ U \left( |0\rangle\langle 0|^R \otimes \rho \right) U^\dagger \right] \tag{18}$$

for U unitary and $|0\rangle$ some fixed state on R. Another, slightly more flexible way of writing this is in the language of isometric Stinespring dilations: namely, the Hilbert space isometry $V : S \to AB$ defined by $V|\varphi\rangle = U|0\rangle^R|\varphi\rangle^S$. In this notation, to which we will adhere from now, $\mathcal{N}(\rho) = \operatorname{Tr}_B V \rho V^\dagger$. Our method will be to fix the dimensions of the systems involved, select U (i.e., the isometry V) at random, and show that the resulting channel is likely to violate additivity. The rough intuition motivating our examples is the same as in the previous section: we will exploit the fact that there are channels that appear to be highly depolarizing for product state inputs despite the fact that they are not close to the depolarizing channel in, for example, the norm of complete boundedness [49].

Consider a single copy of $\mathcal{N}$ and the associated map $V : |\varphi\rangle^S \mapsto U|0\rangle^R|\varphi\rangle^S$. This map takes S to a subspace of $A \otimes B$, and if U is selected according to the Haar measure, then the image of S is itself a random subspace, distributed according to the unitarily invariant measure. In [50], it was shown that if |S| is chosen appropriately, then the image is likely to contain only almost maximally entangled states, as measured by the entropy of entanglement. After tracing over B, this entropy of entanglement becomes the entropy of the output state. Thus, for S of suitable size, all input states get mapped to high entropy output states. We will repeat the analysis below, finding that the maximum allowable size of S will depend on p as described by the following two lemmas.

Lemma 3.1. The maps $f_p(|\varphi\rangle) = H_p(\varphi^A)$ on unit vectors (states) $|\varphi\rangle \in A \otimes B$, $2 \le |A| \le |B|$, have expectation

$$\mathbb{E} f_p \ge \mathbb{E} f_\infty \ge \ln|A| - \gamma \sqrt{|A|/|B|}, \tag{19}$$

for a uniformly random state $\varphi$, with a universal constant $\gamma$ which may be chosen arbitrarily close to 3 for sufficiently large |A|.
Furthermore, for p > 1, the functions $f_p$ are all Lipschitz continuous, with the Lipschitz constant $\Lambda_p$ bounded above by

$$\Lambda_p^2 \le \frac{4p^2}{(1-p)^2} |A|^{1-\frac{1}{p}}. \tag{20}$$


Proof. The first inequality in Eq. (19) is by the monotonicity of the Rényi entropies in p. For the second, observe $f_\infty(|\varphi\rangle) = -\ln \|\varphi^A\|_\infty$, so

$$\mathbb{E} f_\infty(|\varphi\rangle) = \mathbb{E}\left( -\ln \|\varphi^A\|_\infty \right) \ge -\ln \mathbb{E} \|\varphi^A\|_\infty.$$

The expectation of the largest eigenvalue of $\varphi^A$ has been widely studied in random matrix theory. Just note that $|\varphi\rangle$ is well-approximated by a Gaussian unit vector $|\Gamma\rangle$, that is, a random vector all of whose real and imaginary components (in any basis) are i.i.d. normal with expectation 0 and variance $1/(2|A||B|)$. (See [51, Appendix].) Indeed, by the triangle inequality,

$$\mathbb{E}_\varphi \|\varphi^A\|_\infty \le \mathbb{E}_\Gamma \|\Gamma^A\|_\infty,$$

and the right hand side, for large A and B, is known [52,53] to be asymptotically

$$\frac{\left( \sqrt{|A|} + \sqrt{|B|} \right)^2}{|A||B|} = \frac{1}{|A|} + \frac{2}{\sqrt{|A||B|}} + \frac{1}{|B|} \le \frac{1}{|A|} \left( 1 + 3 \sqrt{\frac{|A|}{|B|}} \right).$$

The explicit upper bound of $\frac{(\sqrt{|A|} + \sqrt{|B|})^2}{|A||B|}$ has been obtained for matrices with real Gaussian entries [54], but the analogous statement for complex Gaussian entries seems to be unknown.

Now, for the Lipschitz bound: we proceed as in [50], inferring the general bound from a Lipschitz bound for the Rényi entropy of a dephased version of $\varphi^A$. Fix bases $\{|j\rangle\}$ and $\{|k\rangle\}$ of A and B, respectively, so that we can write $|\varphi\rangle = \sum_{jk} \varphi_{jk} |j\rangle|k\rangle$, where the coefficients are to be decomposed into real and imaginary parts: $\varphi_{jk} = t_{jk0} + i\, t_{jk1}$. We actually show that

$$g_p(|\varphi\rangle) = \frac{1}{1-p} \ln \operatorname{Tr}\left[ \Big( \sum_j |j\rangle\langle j| \varphi^A |j\rangle\langle j| \Big)^p \right] = \frac{1}{1-p} \ln \sum_j \langle j|\varphi^A|j\rangle^p = \frac{1}{1-p} \ln \sum_j \Big( \sum_{kz} t_{jkz}^2 \Big)^p$$

is $\frac{2p}{p-1} |A|^{1/2 - 1/2p}$-Lipschitz. This implies the result for $f_p$ as follows. Note first that $g_p(|\varphi\rangle) \ge f_p(|\varphi\rangle)$, with equality if $\{|j\rangle\}$ is an eigenbasis of $\varphi^A$. Now, for two vectors $|\varphi\rangle$, $|\psi\rangle$, we may without loss of generality assume that $f_p(|\psi\rangle) \ge f_p(|\varphi\rangle)$, and that $\{|j\rangle\}$ is the eigenbasis of $\varphi^A$. Thus, by assumption,

$$f_p(|\psi\rangle) - f_p(|\varphi\rangle) \le g_p(|\psi\rangle) - g_p(|\varphi\rangle) \le \frac{2p}{p-1} |A|^{1/2 - 1/2p}\, \big\| |\psi\rangle - |\varphi\rangle \big\|_2.$$

To bound the Lipschitz constant of $g_p$, it is sufficient to find an upper bound on its gradient. It is straightforward to see that

$$\frac{\partial g_p}{\partial t_{jkz}} = \frac{1}{1-p} \cdot \frac{1}{\sum_{j'} \big( \sum_{k'z'} t_{j'k'z'}^2 \big)^p} \cdot 2p\, t_{jkz} \Big( \sum_{k'z'} t_{jk'z'}^2 \Big)^{p-1},$$


so introducing the notation $x_j = \sum_{kz} t_{jkz}^2$, we have

$$\|\nabla g_p\|_2^2 = \frac{4p^2}{(1-p)^2} \frac{\sum_j x_j^{2p-1}}{\big( \sum_j x_j^p \big)^2} = \frac{4p^2}{(1-p)^2} \frac{\sum_j (x_j^p)^{(2p-1)/p}}{\big( \sum_j x_j^p \big)^2},$$

which we need to maximize subject to the constraint $\sum_j x_j = 1$. Since $(2p-1)/p \ge 1$, the function $y^{(2p-1)/p}$ is convex. Therefore, for fixed $s = \sum_j x_j^p \ge |A|^{1-p}$, the right hand side is maximal when all the $x_j^p$ except for one are 0. Thus,

$$\|\nabla g_p\|_2^2 \le \max_{|A|^{1-p} \le s \le 1} \frac{4p^2}{(1-p)^2} s^{[(2p-1)/p] - 2} = \frac{4p^2}{(1-p)^2} |A|^{1-1/p},$$

and we are done.

Lemma 3.2. Let A and B be quantum systems with $2 \le |A| \le |B|$ and $1 < p \le \infty$. Then there exists a subspace $S \subset A \otimes B$ of dimension

$$|S| = \left\lfloor \frac{c}{4} \left( 1 - \frac{1}{p} \right)^2 \frac{\alpha^2}{\ln(5/\delta)} |A|^{1/p} |B| \right\rfloor \tag{21}$$

(with a universal constant c), that contains only states $|\varphi\rangle \in S$ with high entanglement, in the sense that

$$H_p(\varphi^A) \ge \ln|A| - \alpha - \beta + \ln(1-\delta), \tag{22}$$

where $\beta = \gamma \sqrt{|A|/|B|}$ is as in Lemma 3.1. The probability that a subspace of dimension |S| chosen at random according to the unitarily invariant measure will not have this property is bounded above by

$$2 \left( \frac{5}{\delta} \right)^{2|S|} \exp\left( -c \left( 1 - \frac{1}{p} \right)^2 \alpha^2 |A|^{1/p} |B| \right). \tag{23}$$

The universal constant c may be chosen to be $1/72\pi^3$.

Proof. The argument is nearly identical to the proof of Theorem IV.1 in [50], but with an improvement, possible due to the fact that we are looking at a function defined via a norm. (See [55] and [48].) First of all, by Levy's Lemma, for a function f on pure states of $A \otimes B$ with Lipschitz constant $\Lambda$, the random variable $f(|\varphi\rangle)$ for a uniformly distributed $|\varphi\rangle$ on the unit sphere in $A \otimes B$ obeys

$$\Pr\{ f < \mathbb{E} f - \alpha \} \le 2 \exp\left( -\frac{2}{9\pi^3} \frac{\alpha^2}{\Lambda^2} |A||B| \right).$$

(See [50, Lemma III.1] for an exposition.) We apply this to $f_p$, for which we have a Lipschitz bound by Lemma 3.1. Furthermore, we can find a $\delta$-net M of cardinality $|M| \le (5/\delta)^{2|S|}$ on the unit vectors in S [50, Lemma III.6]. In other words, for each


unit vector $|\varphi\rangle \in S$ there exists a $|\tilde\varphi\rangle \in M$ such that $\big\| |\varphi\rangle - |\tilde\varphi\rangle \big\|_2 \le \delta$. Combining the net, the Lipschitz constant and the union bound, we get

$$\Pr_S \left\{ \exists\, |\varphi\rangle \in M :\ f_p(|\varphi\rangle) < \ln|A| - \alpha/2 - \beta \right\} \le 2 \left( \frac{5}{\delta} \right)^{2|S|} \exp\left( -\frac{2}{9\pi^3} \frac{\alpha^2 (1 - 1/p)^2}{16 |A|^{1-1/p}} |A||B| \right),$$

which is the probability inequality claimed in the theorem. Moreover, the right hand side is less than 1 if |S| is chosen as stated in the theorem. Now, assume we have a subspace S with a $\delta$-net M such that $(\forall |\varphi\rangle \in M)\ f_p(\varphi) \ge \ln|A| - \alpha - \beta$, i.e.

$$r := \max_{|\varphi\rangle \in M} \|\varphi^A\|_p \le e^{-(1-1/p)(\ln|A| - \alpha - \beta)}. \tag{24}$$

Denote $R := \max_{|\varphi\rangle \in S \text{ unit vector}} \|\varphi^A\|_p = \max_{\rho \text{ d.o. supported on } S} \|\rho^A\|_p$, where the latter equality is due to the convexity of the norm. Hence, for each unit vector $|\varphi\rangle \in S$ and corresponding $|\tilde\varphi\rangle \in M$ such that $\|\varphi - \tilde\varphi\|_1 \le \delta$,

$$\|\varphi^A\|_p \le \|\tilde\varphi^A\|_p + \|\varphi^A - \tilde\varphi^A\|_p \le r + \delta R,$$

where we have used the triangle inequality and the trace norm bound on $\varphi - \tilde\varphi$. Consequently, $R \le r/(1-\delta)$, and inserting that into Eq. (24) finishes the proof.

Consider now the product channel $\mathcal{N} \otimes \bar{\mathcal{N}}$, where $\bar{\mathcal{N}}(\rho) = \operatorname{Tr}_B \bar{V} \rho V^T$ is the complex conjugate of $\mathcal{N}$. We will exploit an approximate version of the symmetry used in the random unitary channel counterexamples. Fix orthonormal bases of S, A and B to be used in the definition of maximally entangled states involving these systems. (These have to be the same product bases with respect to which we define the complex conjugate.) In the trivial case where $|S| = |A \otimes B|$, the isometry V is unitary and the identity $(V \otimes \bar{V})|\Phi\rangle = (V \bar{V}^T \otimes I)|\Phi\rangle = |\Phi\rangle$ for the maximally entangled state $|\Phi\rangle^{S_1 S_2}$ implies that

$$(\mathcal{N} \otimes \bar{\mathcal{N}})\left( |\Phi\rangle\langle\Phi|^{S_1 S_2} \right) = \operatorname{Tr}_{B_1 B_2} \left[ |\Phi\rangle\langle\Phi|^{A_1 A_2} \otimes |\Phi\rangle\langle\Phi|^{B_1 B_2} \right] = |\Phi\rangle\langle\Phi|^{A_1 A_2}. \tag{25}$$

The output of $\mathcal{N} \otimes \bar{\mathcal{N}}$ will thus be a pure state. In the general case, we will choose $|S|/|A \otimes B|$ to be large but not trivial, in which case useful bounds can still be placed on the largest eigenvalue of the output state for an input state maximally entangled between $S_1$ and $S_2$.

Lemma 3.3. Let $|\Phi\rangle^{S_1 S_2}$ be a state maximally entangled between $S_1$ and $S_2$ as in the previous paragraph. Then $(\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi^{S_1 S_2})$ has an eigenvalue of at least $\frac{|S|}{|A||B|}$.

Proof. This is an easy calculation again exploiting the $U \otimes \bar{U}$ invariance of the maximally entangled state. Note that whereas V is an isometric embedding, $V^\dagger$ is a partial isometry.


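More precisely, it can be understood as a unitary $U^\dagger$ on $A \otimes B$ followed by a fixed projection P, say onto the first |S| coordinates of $A \otimes B$. Now,

$$\begin{aligned}
\left\| (\mathcal{N} \otimes \bar{\mathcal{N}})\big( |\Phi\rangle\langle\Phi|^{S_1 S_2} \big) \right\|_\infty
&\ge \operatorname{Tr}\left[ (\mathcal{N} \otimes \bar{\mathcal{N}})\big( |\Phi\rangle\langle\Phi|^{S_1 S_2} \big)\, |\Phi\rangle\langle\Phi|^{A_1 A_2} \right] \\
&\ge \operatorname{Tr}\left[ (V \otimes \bar{V}) |\Phi\rangle\langle\Phi|^{S_1 S_2} (V \otimes \bar{V})^\dagger \big( |\Phi\rangle\langle\Phi|^{A_1 A_2} \otimes |\Phi\rangle\langle\Phi|^{B_1 B_2} \big) \right] \\
&= \operatorname{Tr}\left[ (P \otimes \bar{P}) |\Phi\rangle\langle\Phi|^{S_1 S_2} (P \otimes \bar{P}) (U \otimes \bar{U})^\dagger \big( |\Phi\rangle\langle\Phi|^{A_1 A_2} \otimes |\Phi\rangle\langle\Phi|^{B_1 B_2} \big) (U \otimes \bar{U}) \right] \\
&= \operatorname{Tr}\left[ (P \otimes \bar{P}) |\Phi\rangle\langle\Phi|^{S_1 S_2} (P \otimes \bar{P}) \big( |\Phi\rangle\langle\Phi|^{A_1 A_2} \otimes |\Phi\rangle\langle\Phi|^{B_1 B_2} \big) \right] = \frac{|S|}{|A||B|},
\end{aligned}$$

and we are done.

In order to demonstrate violations of additivity, the first step is to bound the minimum output entropy from below for a single copy of the channel. Fix $1 < p \le \infty$, let |B| = |A| so that $\beta = \gamma$, set $\alpha = \delta = 1/2$, and then choose |S| according to Lemma 3.2. With probability approaching 1 as $|A| \to \infty$,

$$H_p^{\min}(\mathcal{N}) \ge \ln|A| - \gamma - 1/2 - \ln 2, \tag{26}$$

when the subspace S defining the channel is chosen according to the unitary invariant measure. (Since we're interested in $|A| \to \infty$, we may choose any $\gamma > 3$.) The same obviously holds for $H_p^{\min}(\bar{\mathcal{N}})$. Recall that the entropy of the uniform distribution is $\ln|A|$ so the minimum entropy is near the maximum possible. Fix a channel such that these lower bounds on $H_p^{\min}(\mathcal{N})$ and $H_p^{\min}(\bar{\mathcal{N}})$ are satisfied. By Lemma 3.3,

$$H_p\big( (\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi) \big) = \frac{1}{1-p} \ln \sum_\alpha \lambda_\alpha^p \le \frac{1}{1-p} \ln \left( \frac{|S|}{|A||B|} \right)^p = \frac{p}{1-p} \ln \frac{|S|}{|A||B|}, \tag{27}$$

where the $\lambda_\alpha$ are the eigenvalues of $(\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi)$. Substituting the value of |S| from Lemma 3.2 into this inequality yields

$$H_p\big( (\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi) \big) \le \ln|A| + O\left( 1 + \frac{p}{p-1} \ln \frac{p}{p-1} \right) \le \ln|A| + O\left( (1 - 1/p)^{-2} \right), \tag{28}$$

where the O notation hides only an absolute constant, independent of |A| and p > 1. Thus, the Rényi entropy of $(\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi)$ is strictly less than $H_p^{\min}(\mathcal{N}) + H_p^{\min}(\bar{\mathcal{N}}) \ge 2 \ln|A| - O(1)$. This is a violation of conjecture (11), with the size of the gap approaching $\ln|A| - O(1)$ for large |A|.

Theorem 3.4. For all $1 < p \le \infty$, there exists a quantum channel for which the inequalities (26) and (28) both hold. The inequalities are inconsistent with the maximal p-norm multiplicativity conjecture.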
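The size of the dimensions at which inequalities (26) and (27) actually become inconsistent with additivity can be estimated with a few lines of arithmetic. The sketch below evaluates the subspace dimension of Eq. (21) with $c = 1/72\pi^3$, |B| = |A| and $\alpha = \delta = 1/2$, then compares twice the single-copy lower bound (26) with the two-copy upper bound (27). The function names are illustrative, and the default $\gamma = 3$ is an assumption standing in for "any $\gamma > 3$" in the limit of large |A|.

```python
import math

C_UNIVERSAL = 1 / (72 * math.pi ** 3)  # value of c quoted in Lemma 3.2

def subspace_dimension(dim_a, dim_b, p, alpha=0.5, delta=0.5):
    """|S| from Eq. (21)."""
    value = (C_UNIVERSAL / 4 * (1 - 1 / p) ** 2 * alpha ** 2
             / math.log(5 / delta) * dim_a ** (1 / p) * dim_b)
    return math.floor(value)

def single_copy_lower_bound(dim_a, gamma=3.0):
    """Right-hand side of Eq. (26), with |B| = |A| and alpha = delta = 1/2."""
    return math.log(dim_a) - gamma - 0.5 - math.log(2)

def two_copy_upper_bound(dim_a, p):
    """Right-hand side of Eq. (27), with |B| = |A|."""
    dim_s = subspace_dimension(dim_a, dim_a, p)
    return p / (1 - p) * math.log(dim_s / dim_a ** 2)

def additivity_gap(dim_a, p):
    """Positive values mean the bounds force a violation of Eq. (11)."""
    return 2 * single_copy_lower_bound(dim_a) - two_copy_upper_bound(dim_a, p)

for dim_a in (10 ** 6, 10 ** 12, 10 ** 20):
    print(dim_a, additivity_gap(dim_a, p=2))
```

With these constants the gap at p = 2 is negative for $|A| = 10^6$ and turns positive only at astronomically large |A|, which reflects the small universal constant c rather than any weakness of the asymptotic statement.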


Note, however, that changing p also requires changing |S| according to Lemma 3.2, so we have a sequence of channels violating additivity of the minimal output Rényi entropy as p decreases to 1, as opposed to a single channel doing so for every p. This prevents us from drawing conclusions about the von Neumann entropy by taking the limit $p \to 1$. Likewise, an examination of Eq. (28) reveals that we also lose control over the two-copy minimum output entropy of a fixed channel as $p \to 1$. Another observation comes from the fact that our examples violate additivity by so much: namely that, due to Lemma 3.3, the dimension of the subspace S in Lemma 3.2 is essentially optimal up to constant factors (depending on p). Any stronger violations of additivity would contradict Eq. (13), the inequality $H_p^{\min}(\mathcal{N} \otimes \bar{\mathcal{N}}) \ge H_p^{\min}(\mathcal{N})$. As an aside, it is interesting to observe that violating maximal p-norm multiplicativity has structural consequences for the channels themselves. For example, because entanglement-breaking channels do not violate multiplicativity [56], there must be states $|\psi\rangle^{S_1 S_2}$ such that $(\mathcal{N} \otimes I^{S_2})(\psi)$ is entangled, despite the fact that $\mathcal{N}$ will be a rather noisy channel. (The same conclusions apply to the maps of Sect. 2, where the conclusion takes the form that $\epsilon$-randomizing random unitary channels need not be entanglement-breaking.)

4. The von Neumann entropy case

Despite the large violations found for p close to 1, the class of examples presented here does not appear to contradict the minimum output entropy conjecture for the von Neumann entropy. The reason is that the upper bound demonstrated for $H_p\big( (\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi) \big)$ in the previous section rested entirely on the existence of one large eigenvalue for $(\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi)$. The von Neumann entropy is not as sensitive to the value of a single eigenvalue as are the Rényi entropies for p > 1 and, consequently, does not appear to exhibit additivity violations. With a bit of work, it is possible to make these observations more rigorous.

Lemma 4.1. Let $|\Phi\rangle^{S_1 S_2}$ be a maximally entangled state between $S_1$ and $S_2$. Assuming that $|A| \le |B| \le |S|$,

$$\int \operatorname{Tr}\left[ \Big( (\mathcal{N} \otimes \bar{\mathcal{N}})(|\Phi\rangle\langle\Phi|) \Big)^2 \right] dU = \frac{|S|^2}{|A|^2 |B|^2} + O\left( \frac{1}{|A|^2} \right), \tag{29}$$

where "dU" is the normalized Haar measure on $R \otimes S \cong A \otimes B$.

A description of the calculation can be found in Appendix A. Let the eigenvalues of $(\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi)$ be equal to $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{|A|^2}$. For a typical U, Lemmas 3.3 and 4.1 together imply that

$$\sum_{j > 1} \lambda_j^2 = O\left( \frac{1}{|A|^2} \right). \tag{30}$$

Thus, aside from $\lambda_1$, the eigenvalues $\lambda_j$ must be quite small. A typical eigenvalue distribution is plotted in Fig. 1. If we define $\tilde\lambda_j = \lambda_j/(1 - \lambda_1)$, then $\sum_{j>1} \tilde\lambda_j = 1$ and

$$H_1(\tilde\lambda) \ge H_2(\tilde\lambda) = -\ln \sum_{j>1} \tilde\lambda_j^2 = 2 \ln|A| - O(1). \tag{31}$$


[Figure 1: plot titled "Spectrum of the output density operator", showing the eigenvalue (logarithmic axis, from $10^{-10}$ to $10^{0}$) against the eigenvalue index (0 to 600).]

Fig. 1. Typical eigenvalue spectrum of $(\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi)$ when |R| = 3 and |A| = |B| = 24. The eigenvalues are plotted in increasing order from left to right. The green dashed line corresponds to $|S|/(|A||B|) = 1/3$, which is essentially equal to the largest eigenvalue. The red solid line represents the value $\left( 1 - \frac{|S|}{|A||B|} \right)/|A|^2 = 1/864$. If the density operator were maximally mixed aside from its largest eigenvalue, all but that one eigenvalue would fall on this line. While that is not the case here or in general, the remaining eigenvalues are nonetheless sufficiently small to ensure that the density operator has high von Neumann entropy
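The contrast between the von Neumann and Rényi entropies for a spectrum with one large eigenvalue can be checked on the idealized case mentioned in the caption. The sketch below assumes, purely for illustration, a spectrum with $\lambda_1 = 1/3$ and the remaining weight spread evenly over the other $|A|^2 - 1 = 575$ eigenvalues (so each is close to, but not exactly, 1/864). It also verifies numerically that the entropy splits into the binary entropy of $\lambda_1$ plus $(1 - \lambda_1)$ times the entropy of the renormalized remainder $\tilde\lambda$.

```python
import math

def shannon(probs):
    """Entropy -sum x ln x of a probability vector (natural log)."""
    return -sum(x * math.log(x) for x in probs if x > 0)

def renyi(probs, p):
    """Renyi entropy of order p != 1."""
    return math.log(sum(x ** p for x in probs)) / (1 - p)

def binary_entropy(x):
    return shannon([x, 1 - x])

dim_a = 24
lam1 = 1 / 3                                  # largest eigenvalue, as in Fig. 1
count = dim_a ** 2 - 1                        # number of remaining eigenvalues
rest = [(1 - lam1) / count] * count           # idealized flat remainder
spectrum = [lam1] + rest

tilde = [x / (1 - lam1) for x in rest]        # renormalized remainder
grouped = binary_entropy(lam1) + (1 - lam1) * shannon(tilde)

print("H_1        :", shannon(spectrum))
print("grouped    :", grouped)
print("H_2        :", renyi(spectrum, 2))
print("H_infinity :", -math.log(lam1))
print("2 ln|A|    :", 2 * math.log(dim_a))
```

In this toy spectrum $H_2$ and $H_\infty$ are pinned near $-2\ln\lambda_1$ and $-\ln\lambda_1$ by the single large eigenvalue, while $H_1$ is dominated by the many small eigenvalues.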

An application of the grouping property then gives us a good lower bound on the von Neumann entropy:

$$H_1\big( (\mathcal{N} \otimes \bar{\mathcal{N}})(\Phi) \big) = H_1(\lambda) = h(\lambda_1) + (1 - \lambda_1) H_1(\tilde\lambda) = 2 \ln|A| - O(1), \tag{32}$$

where h is the binary entropy function. This entropy is nearly as large as it can be and, in particular, as large as $H^{\min}(\mathcal{N}) + H^{\min}(\bar{\mathcal{N}})$ according to Theorem IV.1 of [50], the von Neumann entropy version of Lemma 3.2.

5. Discussion

The counterexamples presented here demonstrate that the maximal p-norm multiplicativity conjecture and, equivalently, the minimum output p-Rényi entropy conjecture are false for all $1 < p \le \infty$. The primary motivation for studying this conjecture was that it is a natural strengthening of the minimum (von Neumann) output entropy conjecture, which is of fundamental importance in quantum information theory. In particular, since the multiplicativity conjecture was formulated, most attempts to prove the minimum output entropy conjecture for special cases actually proved maximal p-norm multiplicativity and then took the limit as p decreases to 1. This strategy, we now know, cannot be used to prove the conjecture in general. From that perspective, it would seem that the results in this paper cast doubt on the validity of the minimum output entropy conjecture itself. However, as we have shown,


the examples explored here appear to be completely consistent with the conjecture, precisely because the von Neumann entropy is more difficult to perturb than the Rényi entropies of order p > 1. It is therefore still possible that the p = 1 conjecture could be demonstrated using subtle variants of p-norm multiplicativity such as exact or approximate multiplicativity in a channel-dependent interval (1, 1 + δ). Another strategy that is still open would be to approach the von Neumann minimum output entropy via Rényi entropies for p < 1. It is possible that additivity holds there even as it fails for p > 1. That is not, unfortunately, a very well-informed speculation. With few exceptions [57], there has been very little research on the additivity question in the regime p < 1, even though many arguments can be easily adapted to this parameter region. (Equation (13), for example, holds for all 0 < p.) Unfortunately, since the time the examples presented here were first circulated, counterexamples for p close to 0 were also discovered [58], casting doubt on the conjecture for the whole set of Rényi entropies with p < 1. Indeed, as in the current paper, those examples are based on influencing a single eigenvalue of the output state of the tensor product channel; while here we increase the largest one, there the smallest is suppressed. Thus, while it seems doubtful that the examples of channels presented here will have direct implications for the additivity of the minimum von Neumann entropy, we think that they are still very useful as a new class of test cases. Indeed, as we remarked earlier, our examples eliminate what had been the previously favoured route to the conjecture via the output p-norms. As a final comment, while this paper has demonstrated that the maximal p-norm additivity conjecture fails for p > 1, all the counterexamples presented here have been nonconstructive. 
For the examples based on $\epsilon$-randomizing maps, all the known explicit constructions (by Ambainis and Smith [59] or via iterated quantum expander maps [60,61]) only give bounds in the 2-norm, which do imply bounds on the output p-norm but those are too weak to yield counterexamples to multiplicativity. Likewise, the counterexamples based on generic quantum channels rely on the existence of large subspaces containing only highly entangled states. Even when the entanglement is quantified using von Neumann entropy, in which case the existence of these subspaces was demonstrated in 2003 [50], not a single explicit construction is known. The culprit, as in many other related contexts [62], is our use of the probabilistic method. Since we don't have any explicit counterexamples, only a proof that counterexamples exist, it remains an open problem to "derandomize" our argument.

Acknowledgements. We would like to thank Frédéric Dupuis and Debbie Leung for an inspiring late-night conversation at the Perimeter Institute, Aram Harrow for several insightful suggestions, and Mary Beth Ruskai for discussions on the additivity conjecture. We also thank BIRS for their hospitality during the Operator Structures in Quantum Information workshop, which rekindled our interest in the additivity problem. PH was supported by the Canada Research Chairs program, a Sloan Research Fellowship, CIFAR, FQRNT, MITACS, NSERC and QuantumWorks. AW received support from the U.K. EPSRC, the Royal Society and the European Commission (project "QAP").

Appendix A: Proof of Lemma 4.1

We will estimate the integral, in what is perhaps not the most illuminating way, by expressing it in terms of the matrix entries of U. Let $U_{s,ab} = {}^R\langle 0|\, {}^S\langle s|\, U\, |a\rangle^A |b\rangle^B$.


Expanding gives

$$\int \operatorname{Tr}\left[ \Big( (\mathcal{N} \otimes \bar{\mathcal{N}})(|\Phi\rangle\langle\Phi|) \Big)^2 \right] dU \tag{A1}$$

$$= \frac{1}{|S|^2} \sum_{\substack{a_1, a_2 \\ a_1', a_2'}} \sum_{\substack{b_1, b_2 \\ b_1', b_2'}} \sum_{\substack{s_1, s_2 \\ s_1', s_2'}} \int \bar{U}_{s_1, a_2 b_2} \bar{U}_{s_2, a_1' b_1} \bar{U}_{s_1', a_2' b_2'} \bar{U}_{s_2', a_1 b_1'}\, U_{s_1, a_1 b_1} U_{s_2, a_2' b_2} U_{s_1', a_1' b_1'} U_{s_2', a_2 b_2'}\, dU.$$

Following [63,64], the non-zero terms in the sum can be represented using a simple graphical notation. Make two parallel columns of four dots, then label the left-hand dots by the indices $(s_1, s_2, s_1', s_2')$ and the right-hand dots by the indices $v = (a_2 b_2, a_1' b_1, a_2' b_2', a_1 b_1')$. Join dots with a solid line if the corresponding $\bar{U}$ matrix entry appears in Eq. (A1). Since terms integrate to a non-zero value only if the vector of U indices $w = (a_1 b_1, a_2' b_2, a_1' b_1', a_2 b_2')$ is a permutation of the vector of $\bar{U}$ indices, a non-zero integral can be represented by using a dotted line to connect left-hand and right-hand dots whenever the corresponding U matrix entry appears in the integral. Assuming for the moment that the vertex labels in the left column are all distinct and likewise for the right column, the integral evaluates to the Weingarten function $\operatorname{Wg}(\pi)$, where $\pi$ is the permutation such that $w_i = v_{\pi(i)}$. For the rough estimate required here, it is sufficient to know that $\operatorname{Wg}(\pi) = \Theta\big( (|A||B|)^{-4-|\pi|} \big)$, where $|\pi|$ is the minimal number of factors required to write $\pi$ as a product of transpositions, and that $\operatorname{Wg}(e) = (|A||B|)^{-4} \big( 1 + O(|A|^{-2}|B|^{-2}) \big)$ [65]. The dominant contribution to Eq. (A1) comes from the "stack" diagram

[Diagram: four rows in which each left-hand dot $s_1, s_2, s_1', s_2'$ is joined to the right-hand dot of the same row, imposing $a_2 b_2 = a_1 b_1$, $a_1' b_1 = a_2' b_2$, $a_2' b_2' = a_1' b_1'$ and $a_1 b_1' = a_2 b_2'$ respectively.]

in which the solid and dashed lines are parallel and for which the contribution is positive and approximately equal to

$$\frac{1}{|S|^2} \sum_{\substack{a_1, a_2 \\ a_1', a_2'}} \sum_{\substack{b_1, b_2 \\ b_1', b_2'}} \sum_{\substack{s_1, s_2 \\ s_1', s_2'}} \delta_{a_1 a_2} \delta_{a_1' a_2'} \delta_{b_1 b_2} \delta_{b_1' b_2'} \operatorname{Wg}(\mathrm{id}) = \frac{|S|^2}{|A|^2 |B|^2} \left( 1 + O(|A|^{-2}|B|^{-2}) \right). \tag{A2}$$

(The expression on the left-hand side would be exact but for the terms in which vertex labels are not distinct.) To obtain an estimate of Eq. (A1), it is then sufficient to examine the other terms and confirm that they are all of smaller asymptotic order than this. There are six diagrams representing transpositions, and their associated (negative) contributions are

[Diagrams: the six transposition diagrams. In the order shown, their contributions are $\Theta(|S|^2|A|^{-4}|B|^{-2})$, $\Theta(|S|^2|A|^{-4}|B|^{-4})$, $\Theta(|S|^2|A|^{-2}|B|^{-4})$, $\Theta(|S|^2|A|^{-2}|B|^{-4})$, $\Theta(|S|^2|A|^{-4}|B|^{-4})$ and $\Theta(|S|^2|A|^{-4}|B|^{-2})$.]

For permutations π such that |π | > 1, the Weingarten function is significantly suppressed: Wg(π ) = O(|A|−6 |B|−6 ). Moreover, for a given diagram type, the requirement that wi = vπ(i) can only hold if at least two pairs of the indices a1 , a2 , a1 , a2 , b1 , b2 , b1 , b2 are identical. The contribution from such diagrams is therefore O(|S|2 |A|−4 |B|−2 ). To finish the proof, it is necessary to consider integrals in which the vertex labels on the left- or the right-hand side of a diagram are not all distinct. In this more general case, choosing a set C of representatives for the conjugacy classes of the permutation group on four elements, the value of the integral can be written N (c) Wg(c), (A3) c∈C

where N (c) =

σ ∈S4 :

τ ∈S4 :

δ(τ π σ ∈ c).

(A4)

v=σ ( v ) w=τ (w)

These formulas have a simple interpretation. Symmetry in the vertex labels introduces ambiguities in the diagrammatic notation; the formula states that every one of the diagrams consistent with a given vertex label set must be counted, and with a defined dimension-independent multiplicity. Conveniently, our crude estimates have already done exactly that, ignoring the multiplicities. The only case for which we need to know the multiplicities, moreover, is for contributions to the dominant term, which we want to know exactly and not just up to a constant multiple.

We claim that in the sum (A1) there are at most $O(|S|^4|A||B|^3)$ terms with vertex label symmetry. The total contribution for terms with vertex label symmetries $\tau$ and $\sigma$ in which $|\tau\pi\sigma| \ge 1$ is therefore of size $O(|S|^2|A|^{-4}|B|^{-2})$ and does not affect the dominant term. To see why the claim holds, fix a diagram type and recall that the requirement $w_i = v_{\pi(i)}$ for a permutation $\pi$ can only hold if at least two pairs of the indices $a_1, a_2, a_1', a_2', b_1, b_2, b_1', b_2'$ are identical. Equality is achieved only when all the $A$ indices or all the $B$ indices are aligned, corresponding to the following two diagrams:

[Two diagrams on the vertices $s_1$, $s_2$, with row labels $a_2 b_2 = a_2' b_2'$ and $a_1 b_1 = a_1' b_1'$.]

For the first diagram, using the fact that $|A| \le |B| \le |S|$, it is easy to check that imposing the extra constraint that either the top or bottom two $S$ or $AB$ vertex labels match singles out at most $O(|S|^4|A||B|^3)$ terms from Eq. (A1). Similar reasoning applies to the second diagram, but imposing the constraint instead on rows one and four, or two and three. For all other diagram types, at least four pairs of the indices $a_1, a_2, a_1', a_2', b_1, b_2, b_1', b_2'$ are identical. (The number of matching $A$ and $B$ indices is necessarily even.) In a term for which the vertex labels are not all distinct, either a pair of $S$ indices or a further pair of $A$ or $B$ indices must be identical. In the latter case, there must exist an identical $A$ pair and an identical $B$ pair among all the pairs. Again using $|A| \le |B| \le |S|$, there can be at most $O(|S|^4|B|^3)$ such terms per diagram type, which demonstrates the claim.

We are thus left to consider integrals with vertex label symmetry and $N(e) \neq 0$ in Eq. (A3). If $N(e) = 1$, then our counting was correct and there is no problem. It is therefore sufficient to bound the number of integrals in which $N(e) > 1$. This can occur only in terms with at least 2 vertex label symmetries. Running the argument of the previous paragraph again, for the two diagrams with $A$ or $B$ indices all aligned, this occurs in at most $O(|S|^4|B|^2)$ terms. For the rest of the cases, it is necessary to impose equality on yet another pair of indices, leading again to at most $O(|S|^4|B|^2)$ terms. Since $\mathrm{Wg}(e) = O(|A|^{-4}|B|^{-4})$, these contributions are collectively $O(|S|^2|A|^{-4}|B|^{-2})$.

The bound on the error term in Eq. (29) arises by substituting the inequalities $|S| \le |A||B|$ and $|A| \le |B|$ into each of the estimates calculated above.

Communicated by M.B. Ruskai

Commun. Math. Phys. 284, 281–290 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0625-z

Communications in

Mathematical Physics

Counterexamples to Additivity of Minimum Output p-Rényi Entropy for p Close to 0

Toby Cubitt1, Aram W. Harrow2, Debbie Leung3, Ashley Montanaro2, Andreas Winter1,4

1 Department of Mathematics, University of Bristol, Bristol BS8 1TW, UK. E-mail: [email protected]
2 Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK
3 Institute for Quantum Computing, University of Waterloo, Waterloo N2L 3G1, Ontario, Canada
4 Centre for Quantum Technologies, National University of Singapore, 2 Science Drive 3, Singapore 117542, Singapore

Received: 14 February 2008 / Accepted: 3 July 2008 Published online: 10 September 2008 – © Springer-Verlag 2008

Abstract: Complementing recent progress on the additivity conjecture of quantum information theory, showing that the minimum output p-Rényi entropies of channels are not generally additive for $p > 1$, we demonstrate here by a careful random selection argument that also at $p = 0$, and consequently for sufficiently small $p$, there exist counterexamples. An explicit construction of two channels from 4 to 3 dimensions is given, which have non-multiplicative minimum output rank; for this pair of channels, numerics strongly suggest that the p-Rényi entropy is non-additive for all $p \lesssim 0.11$. We conjecture however that violations of additivity exist for all $p < 1$.

I. Introduction and Definitions

For a quantum channel (i.e. a completely positive and trace preserving linear map) $\mathcal{N}$ between finite quantum systems, and $p \ge 0$, define

\[ S_p^{\min}(\mathcal{N}) := \min_{\rho} \frac{1}{1-p} \log \mathrm{Tr}\,\bigl(\mathcal{N}(\rho)\bigr)^p, \]

where the minimisation is over all states (normalised density operators) on the input space of $\mathcal{N}$. The quantity $S_p(\sigma) = \frac{1}{1-p} \log \mathrm{Tr}\,\sigma^p$ is known as the p-Rényi entropy of the state $\sigma$ ($0 < p < \infty$ and $p \neq 1$), with the definition extended to $p = 0, 1, \infty$ by taking limits; $S_1(\sigma) = S(\sigma) = -\mathrm{Tr}\,\sigma \log \sigma$ is the von Neumann entropy, $S_\infty(\sigma) = -\log \|\sigma\|_\infty$ is the min-entropy, and $S_0(\sigma) = \log \operatorname{rank} \sigma$. Due to the concavity of the Rényi entropies in $\rho$, the minimum in the above definition is attained at a pure input state $\rho = |\psi\rangle\langle\psi|$. The additivity problem is the question whether for all channels $\mathcal{N}_1$ and $\mathcal{N}_2$, it holds that

\[ S_p^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) \stackrel{?}{=} S_p^{\min}(\mathcal{N}_1) + S_p^{\min}(\mathcal{N}_2). \tag{1} \]
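The definition of the p-Rényi entropy, including its limiting cases, can be made concrete with a short numerical sketch. The following is only an illustration under stated assumptions: the function name `renyi_entropy`, the eigenvalue cutoff `tol` used to decide the rank, and the choice of base-2 logarithms are ours and are not part of the text above.

```python
import numpy as np

def renyi_entropy(sigma, p, tol=1e-12):
    """p-Renyi entropy (base-2 logarithm) of a density matrix sigma.

    Assumption: eigenvalues below `tol` are treated as exactly zero.
    """
    eigs = np.linalg.eigvalsh(sigma)
    eigs = eigs[eigs > tol]                      # keep the support only
    if p == 0:
        return float(np.log2(len(eigs)))         # S_0 = log rank
    if p == 1:
        return float(-np.sum(eigs * np.log2(eigs)))  # von Neumann entropy
    if np.isinf(p):
        return float(-np.log2(eigs.max()))       # min-entropy
    return float(np.log2(np.sum(eigs ** p)) / (1.0 - p))

# Two reference states: maximally mixed and pure, both on a 3-dimensional space.
maximally_mixed = np.eye(3) / 3
pure_state = np.diag([1.0, 0.0, 0.0])
```

For the maximally mixed state all Rényi entropies coincide with $\log 3$, and for a pure state they all vanish, which is a quick sanity check on the limiting cases $p = 0, 1, \infty$.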


Note that the direction “≤” here is trivial, so proofs and counterexamples have to concentrate on the direction “≥”. This was indeed proved for special channels and some $p$; for example, it is known for $p \ge 1$ if one of the channels is entanglement-breaking [1,2], unital on a qubit space [3], or depolarising of any dimension [4], and in addition for a number of other cases. King [5] has furthermore shown that it holds for $p < 1$ if one of the channels is entanglement-breaking. Holevo and Werner [6] exhibited the first counterexamples to Eq. (1), for $p > 4.79$. It was demonstrated recently [7,8] that for every $p > 1$ there exist channels violating Eq. (1). Here we show that Eq. (1) is also false at $p = 0$, and by continuity of $S_p$ in $p$, it is thus violated for all $p \le p_0$ with some small but positive $p_0$. Since $S_0(\sigma)$ is the logarithm of the rank of the density matrix $\sigma$, $S_0^{\min}(\mathcal{N})$ is the logarithm of the minimum output rank of the channel, i.e. of the smallest rank of an output state. In the next section we prove our main existence result of counterexamples, in Sect. III we exhibit an explicit example, and in Sect. IV we explore up to which $p < 1$ we can violate additivity of $S_p^{\min}$.

II. Main Result

Theorem 1. If $d_A > 2$, $d_B > 2$ and $d_A d_B$ is even then there exist quantum channels $\mathcal{N}_1$, $\mathcal{N}_2$ with $d_A$-dimensional input spaces and $d_B$-dimensional output spaces, such that $S_0^{\min}(\mathcal{N}_1) = S_0^{\min}(\mathcal{N}_2) = \log d_B$, but $S_0^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) \le \log(d_B^2 - 1) < 2 \log d_B$.

Proof. Our approach is the following: let $\rho_{AB} = (\mathrm{id} \otimes \mathcal{N})\Psi_{AA'}$ be a Choi-Jamiołkowski state of the channel $\mathcal{N}$, with a particular choice of reference state $|\Psi\rangle \in A \otimes A' = \mathbb{C}^{d_A} \otimes \mathbb{C}^{d_A}$. Note that while usually people use a fixed maximally entangled state, for the isomorphism it is sufficient that it is of maximal Schmidt rank. In [8,7], additivity counterexamples were found for $p > 1$ by choosing $\mathcal{N}$ randomly subject to a certain constraint.
Our approach will be instead to choose the Choi-Jamiołkowski state randomly, again subject to a certain constraint that helps guarantee the additivity counterexample. First note that $\mathcal{N}(\varphi)$ has maximal rank $d_B$ for every input state $\varphi$ iff the orthogonal complement of the support of $\rho$ doesn’t contain any product vectors, i.e. for all pure states $|\varphi\rangle \in A$, $|\psi\rangle \in B$,

\[ \mathrm{Tr}\bigl(\rho_{AB}(\varphi \otimes \psi)\bigr) \neq 0. \tag{2} \]
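The criterion in Eq. (2) rests on the fact that the overlap $\mathrm{Tr}(\mathcal{N}(\varphi)\psi)$ can be read off from a Choi-Jamiołkowski operator. As a hedged numerical sketch, the code below checks this for the standard (unnormalised) operator $\sum_{ij} |i\rangle\langle j| \otimes \mathcal{N}(|i\rangle\langle j|)$, for which $\mathrm{Tr}(\mathcal{N}(\varphi)\psi) = \mathrm{Tr}[\Omega(\bar\varphi \otimes \psi)]$. All function names and the way the random channel is drawn (Kraus operators cut from a random isometry) are illustrative assumptions, not the construction used in the proof.

```python
import numpy as np

def random_kraus(d_in, d_out, n_kraus, rng):
    """Kraus operators K_k (d_out x d_in) with sum_k K_k^dagger K_k = 1."""
    shape = (d_out * n_kraus, d_in)
    g = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    iso, _ = np.linalg.qr(g)                  # isometry with orthonormal columns
    return iso.reshape(n_kraus, d_out, d_in)

def apply_channel(kraus, x):
    """N(x) = sum_k K_k x K_k^dagger."""
    return np.einsum('kab,bc,kdc->ad', kraus, x, kraus.conj())

def choi_operator(kraus):
    """Omega = sum_ij |i><j| (x) N(|i><j|), from the unnormalised sum_i |i>|i>."""
    d_out, d_in = kraus.shape[1], kraus.shape[2]
    omega = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1.0
            omega += np.kron(e_ij, apply_channel(kraus, e_ij))
    return omega

def random_pure_state(d, rng):
    """Projector onto a random unit vector in C^d."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

rng = np.random.default_rng(1)
kraus = random_kraus(d_in=4, d_out=3, n_kraus=6, rng=rng)
omega = choi_operator(kraus)
phi = random_pure_state(4, rng)
psi = random_pure_state(3, rng)

# Overlap computed directly and via the Choi-Jamiolkowski operator.
lhs = np.trace(apply_channel(kraus, phi) @ psi).real
rhs = np.trace(omega @ np.kron(phi.conj(), psi)).real
```

In this sketch `lhs` and `rhs` agree up to rounding, so a vanishing overlap of the operator with some product $\bar\varphi \otimes \psi$ is the same as an output state $\mathcal{N}(\varphi)$ that is not of full rank.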

The easy justification of this is as follows: in Appendix A we show that the action of the channel $\mathcal{N}$ can be written

\[ \mathcal{N}(\varphi) = \mathrm{Tr}_A \Bigl[ \rho_{AB} \Bigl( \rho_A^{-1/2} U^\dagger \bar{\varphi}\, U \rho_A^{-1/2} \otimes \mathbb{1} \Bigr) \Bigr], \tag{3} \]

where $\bar{\cdot}$ denotes the complex conjugate with respect to a fixed computational basis and $U$ is a unitary depending on $\Psi$ (see Appendix A for details). Full rank of the output means that for all pure states $\varphi$, $\psi$,

\[ 0 \neq \mathrm{Tr}\bigl(\mathcal{N}(\varphi)\psi\bigr) = \mathrm{Tr} \Bigl[ \rho_{AB} \Bigl( \rho_A^{-1/2} U^\dagger \bar{\varphi}\, U \rho_A^{-1/2} \otimes \psi \Bigr) \Bigr] \propto \mathrm{Tr} \bigl[ \rho_{AB} (\varphi' \otimes \psi) \bigr], \]

where we used Eq. (3) and the fact that $\rho_A^{-1/2} U^\dagger |\bar{\varphi}\rangle$ is, up to normalisation, another pure state $|\varphi'\rangle$. Note that any unitary $U$ on $A$ will serve to create a channel, so we shall fix it to be the identity from now on – this is only a matter of redefining $\Psi_{AA'}$, which we can do if only given $\rho_{AB}$. So, our task is to find two states $\rho_{AB}$ and $\sigma_{A'B'}$ on $A \otimes B$ with this property, such that $\omega_{AA'BB'} = \rho_{AB} \otimes \sigma_{A'B'}$ does have a product state in its orthogonal complement; we’ll choose it to be the maximally entangled state $\Phi_{AA'} \otimes \Phi_{BB'}$. Then the condition we seek to enforce is

\[ 0 = \mathrm{Tr}\bigl( (\rho_{AB} \otimes \sigma_{A'B'}) (\Phi_{AA'} \otimes \Phi_{BB'}) \bigr) = \frac{1}{d_A d_B} \mathrm{Tr}(\rho^\top \sigma), \]

where $\top$ signifies the matrix transpose. Note that the channel input will not be $\Phi_{AA'}$, but rather the normalised version of $(\sqrt{\rho_A} \otimes \sqrt{\sigma_{A'}}) |\Phi_{AA'}\rangle$. What we will do is simply pick $\rho$ to be the (normalised) projection onto a $d_A d_B / 2$-dimensional random subspace, drawn according to the unitary invariant measure on $AB$, and $\sigma$ the (normalised) projection onto the orthogonal complement of $\rho^\top$:

\[ \rho = \frac{2}{d_A d_B}\, \Pi, \qquad \sigma = \frac{2}{d_A d_B}\, (\mathbb{1} - \Pi^\top). \]
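A quick numerical sanity check of this construction may help. The sketch below is an illustration only, and rests on assumptions of ours: the random subspace is drawn as the column space of a complex Gaussian matrix, $\sigma$ is built from the transposed complementary projector so that $\mathrm{Tr}(\rho^\top \sigma) = 0$, and all variable names are hypothetical. It compares the overlap of $\rho_{AB} \otimes \sigma_{A'B'}$ with $\Phi_{AA'} \otimes \Phi_{BB'}$, computed on the full space, against the formula $\mathrm{Tr}(\rho^\top \sigma)/(d_A d_B)$.

```python
import numpy as np

def random_projector(dim, rank, rng):
    """Projector onto the span of `rank` complex Gaussian vectors in C^dim."""
    g = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    q, _ = np.linalg.qr(g)                     # orthonormal basis of the subspace
    return q @ q.conj().T

d_a, d_b = 4, 3
dim = d_a * d_b
rng = np.random.default_rng(0)

proj = random_projector(dim, dim // 2, rng)
rho = (2.0 / dim) * proj                       # state on AB
sigma = (2.0 / dim) * (np.eye(dim) - proj.T)   # state on A'B'

# |Phi_AA'> (x) |Phi_BB'>, stored with index order (A, B, A', B').
phi = np.einsum('ac,bd->abcd', np.eye(d_a), np.eye(d_b)).reshape(-1)
phi = phi / np.sqrt(dim)

overlap_full = (phi.conj() @ np.kron(rho, sigma) @ phi).real
overlap_formula = np.trace(rho.T @ sigma).real / dim
```

Both quantities vanish up to rounding for any draw of the subspace, reflecting that the orthogonality condition is enforced deterministically rather than with high probability.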

This enforces the condition $\mathrm{Tr}(\rho^\top \sigma) = 0$ deterministically, while both the supporting subspaces of $\rho$ and $\sigma$ are individually uniformly random. Thus we are done once we prove Lemma 2, stated below, since it implies for large enough $d_A$ and $d_B$, that with probability 1 neither the orthogonal complement of $\rho$ nor that of $\sigma$ (which are themselves uniformly random subspaces) contains a product vector.

Lemma 2. Let $\Pi$ be a uniformly random projector in $\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ of rank $d_E$ such that $d_A d_B > d_A + d_B + d_E - 2$. Then,

\[ \Pr\Bigl[ \exists\, \varphi_A \in \mathbb{C}^{d_A},\ \varphi_B \in \mathbb{C}^{d_B} : \mathrm{Tr}\bigl( \Pi (\varphi_A \otimes \varphi_B) \bigr) = 1 \Bigr] = 0. \tag{4} \]

In words: a random subspace of “small” dimension contains no product states, almost surely. Note that if $d_A d_B$ is even then $d_E = d_A d_B / 2$ is an integer, and so the inequality $(d_A - 2)(d_B - 2) > 0$ can be rearranged to obtain $d_A d_B > d_A + d_B + d_E - 2$, thus justifying the application to Theorem 1.

Proof. Geometrically, we want to show that the probability for a random subspace of dimension $d_E$ to contain a product state is zero. Using the isomorphism between bipartite vectors and $d_A \times d_B$-matrices (which identifies Schmidt rank with matrix rank) [9], we can reformulate the task as describing the $d_E$-dimensional subspaces of $d_A \times d_B$-matrices not containing any nonzero elements of rank 1. In other words, we are interested in the set of subspaces intersecting the determinantal variety of vanishing $2 \times 2$-minors only in the zero matrix. The dimension of this variety – known as the Segre embedding – is easily seen to be $d_A + d_B - 1$, so a generic subspace of dimension $d_E \le d_A d_B - (d_A + d_B - 1) = (d_A - 1)(d_B - 1)$ will not intersect it except trivially, by standard algebraic-geometric arguments [10,11]; a more explicit argument for this fact was given recently by Walgate and Scott [12].
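The dimension count behind Lemma 2 and its application to Theorem 1 is simple arithmetic, and can be sketched as follows. The helper name `lemma2_applies` and the search range are illustrative choices of ours; the code only encodes the two conditions stated above, namely that $d_E = d_A d_B / 2$ is an integer and that $d_A d_B > d_A + d_B + d_E - 2$.

```python
def lemma2_applies(d_a, d_b):
    """True if a projector of rank d_a*d_b/2 meets the hypothesis of Lemma 2."""
    if (d_a * d_b) % 2 != 0:
        return False                  # d_E = d_A d_B / 2 must be an integer
    d_e = d_a * d_b // 2
    return d_a * d_b > d_a + d_b + d_e - 2

# All admissible pairs in a small range, sorted by total dimension d_A * d_B.
admissible = sorted(
    (d_a * d_b, d_a, d_b)
    for d_a in range(2, 7)
    for d_b in range(2, 7)
    if lemma2_applies(d_a, d_b)
)
```

Within this range the smallest admissible total dimension is 12, attained by the pairs $(3, 4)$ and $(4, 3)$, while $(3, 3)$ fails because $d_A d_B$ is odd and any pair containing a 2 fails the inequality.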


III. An Explicit Construction in Small Dimension

Since our additivity violation takes the form of only a single zero eigenvalue in the two-copy output, it is strongest when the channel dimensions are smallest. Indeed, violations for large dimension can be constructed from channels of small dimension by tensoring the channel with a trivial channel, such as a completely depolarising channel. Thus, we are most interested in finding counterexamples with small dimension. One such counterexample, with $d_A = 4$ and $d_B = 3$, is described here. Based on the constructions in [9], and indeed a slight variation of it, we show now – using the same methodology as above – how to construct two channels $\mathcal{N}_i : \mathcal{B}(\mathbb{C}^4) \to \mathcal{B}(\mathbb{C}^3)$ ($i = 1, 2$) such that $S_0^{\min}(\mathcal{N}_1) = S_0^{\min}(\mathcal{N}_2) = \log 3$, but $S_0^{\min}(\mathcal{N}_1 \otimes \mathcal{N}_2) \le \log 8 < 2 \log 3$. These happen to be the smallest dimensions that satisfy Lemma 2.

As we have discussed above, we describe them via their Choi-Jamiołkowski states $\rho_{AB}$ and $\sigma_{AB}$ (with $A$ and $B$ being a 4- and 3-dimensional system, respectively) such that $\mathrm{Tr}\,\rho^\top \sigma = 0$ and neither $\rho$ nor $\sigma$ contains a product state in the respective orthogonal complement of their supports. Resorting to the supporting subspaces of $\rho$ and $\sigma$, denoted $R, S < A \otimes B$, respectively, we have nothing to do but choose them to be orthogonal and of dimension 6, such that neither contains a product state. Using the customary notation of vectors in $\mathbb{C}^4 \otimes \mathbb{C}^3$ as $3 \times 4$ matrices [9], and with $\omega = e^{2\pi i/3}$, we let $R$ be spanned by

\[
\begin{pmatrix} 0&0&0&0\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix},\
\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix},\
\begin{pmatrix} 1&0&0&0\\ 0&\omega&0&0\\ 0&0&\omega^2&0 \end{pmatrix},\
\begin{pmatrix} 0&1&0&0\\ 0&0&\omega^2&0\\ 0&0&0&\omega \end{pmatrix},\
\begin{pmatrix} 0&0&1&0\\ 0&0&0&-1\\ 0&0&0&0 \end{pmatrix},\ \text{and}\
\begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ -1&0&0&0 \end{pmatrix};
\]

whereas $S$ is spanned by

\[
\begin{pmatrix} 0&0&0&0\\ 1&0&0&0\\ 0&-1&0&0 \end{pmatrix},\
\begin{pmatrix} 1&0&0&0\\ 0&\omega^2&0&0\\ 0&0&\omega&0 \end{pmatrix},\
\begin{pmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix},\
\begin{pmatrix} 0&1&0&0\\ 0&0&\omega&0\\ 0&0&0&\omega^2 \end{pmatrix},\
\begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 0&0&0&0 \end{pmatrix},\ \text{and}\
\begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ 1&0&0&0 \end{pmatrix}.
\]

Since these twelve vectors are clearly orthogonal, the subspaces $R$ and $S$ are each of dimension 6, and orthogonal to each other. The proof that they don’t contain a product state is by algebraic geometry: one has to show that every nonzero element in $R$ and $S$ has some nonzero $2 \times 2$-minor. Such a demonstration can be found by off-the-shelf symbolic algebra software using Gröbner basis algorithms. A short argument goes as follows, utilizing the way we found the subspaces in the first place. The first five vectors of $R$ and $S$ span respective 5-dimensional subspaces $R_0$ and $S_0$. Notice that they are entirely symmetric to each other, and that they don’t contain product states by the arguments of [9] – indeed they are a special case of the construction presented there. Also, the sixth vector is clearly not product in either case. Hence, to obtain a product vector in $R$, say (the argument for $S$ is very similar), we would need to form the sum of


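the sixth vector with an element from $R_0$:

\[
\begin{aligned}
M &= \alpha \begin{pmatrix} 0&0&0&0\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix}
 + \beta \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix}
 + \gamma \begin{pmatrix} 1&0&0&0\\ 0&\omega&0&0\\ 0&0&\omega^2&0 \end{pmatrix}
 + \delta \begin{pmatrix} 0&1&0&0\\ 0&0&\omega^2&0\\ 0&0&0&\omega \end{pmatrix}
 + \varepsilon \begin{pmatrix} 0&0&1&0\\ 0&0&0&-1\\ 0&0&0&0 \end{pmatrix} \\
 &\quad + \begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ -1&0&0&0 \end{pmatrix} \\
 &= \begin{pmatrix} \beta+\gamma & \delta & \varepsilon & 1\\ \alpha & \beta+\omega\gamma & \omega^2\delta & -\varepsilon\\ -1 & \alpha & \beta+\omega^2\gamma & \omega\delta \end{pmatrix}.
\end{aligned}
\]

For this to be a product vector, all its $2 \times 2$-minors have to vanish, but we need to look at only a few to obtain a contradiction: the minors $\{1,2\} \times \{3,4\}$, $\{1,3\} \times \{2,4\}$ and $\{2,3\} \times \{1,4\}$ imply

\[ 0 = -\varepsilon^2 - \omega^2\delta = \omega\delta^2 - \alpha = \omega\alpha\delta - \varepsilon, \]

which in turn allow us to express all other variables in terms of $\varepsilon$:

\[ \delta = -\omega\varepsilon^2, \qquad \alpha = \omega\delta^2 = \varepsilon^4, \qquad \varepsilon = \omega\alpha\delta = -\omega^2\varepsilon^6, \]

leaving for $\varepsilon$ only the possibilities of being 0 or a fifth root of $-\omega^{-2} = -\omega$. If $\varepsilon = 0$, so are $\alpha$ and $\delta$, and in this case the $\{1,3\} \times \{1,4\}$-minor equals 1 and is therefore non-vanishing. Hence we continue with $\varepsilon^5 = -\omega$, and look at the minors $\{1,3\} \times \{1,4\}$, $\{1,2\} \times \{2,4\}$ and $\{1,3\} \times \{3,4\}$: these yield the constraints

\[ 0 = (\beta+\gamma)\omega\delta + 1 = -\delta\varepsilon - (\beta+\omega\gamma) = \omega\delta\varepsilon - (\beta+\omega^2\gamma), \]

in other words

\[ \beta+\gamma = -\omega^2/\delta = \omega/\varepsilon^2 = -\varepsilon^3, \qquad \beta+\omega\gamma = -\delta\varepsilon = \omega\varepsilon^3, \qquad \beta+\omega^2\gamma = \omega\delta\varepsilon = -\omega^2\varepsilon^3, \]

which implies $\gamma = -\varepsilon^3$ and $\beta = 0$ from the 1st and 3rd equation, but then the 2nd is violated, since it demands $\gamma = \varepsilon^3$ with $\varepsilon \neq 0$. Hence, in conclusion, $R$ cannot contain a product state, and the argument for $S$ is similar in nature.

IV. Larger Rényi Parameter

Now we can use the explicit pair of channels constructed in the previous section to look for larger values of $p$ for which additivity of $S_p^{\min}$ is violated. The simplest thing is to take the Choi-Jamiołkowski states to be the normalised projections onto the subspaces $R$ and $S$, respectively. However, we may clearly take any state of rank 6 supported on the respective subspace to obtain a bona fide generalised Choi-Jamiołkowski state. We performed some numerics in both cases: for the first (Choi-Jamiołkowski states proportional to the subspace projections), using $S_p\bigl((\mathcal{N} \otimes \mathcal{N}')\Phi\bigr)$ as an upper bound of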

$S_p^{\min}(\mathcal{N} \otimes \mathcal{N}')$ and numerical calculations of $S_p^{\min}(\mathcal{N})$ and $S_p^{\min}(\mathcal{N}')$, we see violations of additivity for values of $p$ up to ≈0.096. For the second, it turns out that a very good choice is to have $\rho_{AB}$ and $\sigma_{A'B'}$ diagonal in the above bases of $R$ and $S$, respectively, with specific probability weights

Fig. 1. Plots of the output entropy of the tensor product channel with the input state corresponding to the maximally entangled state (red line, shallow slope), versus the numerically obtained minimum when restricted to tensor product input states (blue line, steep slope). On the left the Choi-Jamiołkowski states are simply the normalised projections of the subspaces $R$ and $S$; on the right, one chooses appropriately weighted density operators with support $R$ and $S$, respectively. (Both panels: horizontal axis $p$ from 0.05 to 0.15, vertical axis $S_p$; legend: “Minimum over product state inputs”, “Maximally entangled state”.)

obtained by another numerical search. The weights of the basis vectors of $R$ and $S$, in the above order, are

0.172776, 0.118738, 0.199229, 0.136705, 0.306899, 0.0656529, and
0.344911, 0.124908, 0.120721, 0.156968, 0.162754, 0.089738,

respectively. This results in a numerical violation of additivity for $p$ up to ≈0.112. We see no reason to believe that this value should be the limit of additivity violations.

To obtain a rigorous interval $[0; p_0]$ of violations of additivity, we turn to the ideas of measure concentration explored in [13] in the context of quantum information theory. We will not make everything explicit, but the idea is as follows: we need to put rather tight lower bounds on $S_p^{\min}$ of the two individual channels; in fact, each of the two channels $\mathcal{N}$, $\mathcal{N}'$ is individually random from the class of channels with Stinespring dilations $A \to B \otimes \mathbb{C}^{d_A d_B/2}$ (random meaning: according to the unitary invariant measure on $B \otimes \mathbb{C}^{d_A d_B/2}$). For the output properties of each of the channels, only the embedded $d_A$-dimensional subspace $S, S' < B \otimes \mathbb{C}^{d_A d_B/2}$ is relevant, which is a random subspace in the same sense [13]. Now in [13], Lemmas III.4 and III.6, it is shown that the spectrum of all states in a random subspace $S < B \otimes \mathbb{C}^{d_A d_B/2}$ is tightly concentrated around the value $1/d_B$, for large enough dimensions $d_A$ and $d_B$ such that $d_A d_B \ge (\log d_A)$. I.e., with high probability the minimum Schmidt coefficient of any state in $S$, $S'$ is, say, $\ge \frac{1}{2} \frac{1}{d_B}$. In other words, the output states of the channels have spectrum bounded away from 0 by this amount. Then for $0 \le p < 1$, clearly,

\[ S_p^{\min}(\mathcal{N}),\ S_p^{\min}(\mathcal{N}') > \frac{1}{1-p} \log \Bigl[ d_B \Bigl( \frac{1}{2 d_B} \Bigr)^{p} \Bigr] = \log d_B + \frac{p}{1-p} \log \frac{1}{2} = \log d_B - \frac{p}{1-p}. \]

However,

\[ S_p^{\min}(\mathcal{N} \otimes \mathcal{N}') \le S_0^{\min}(\mathcal{N} \otimes \mathcal{N}') \le \log(d_B^2 - 1) = 2 \log d_B + \log\Bigl( 1 - \frac{1}{d_B^2} \Bigr). \]


In conclusion, a violation is obtained as soon as

\[ \frac{2p}{1-p} \le -\log\Bigl( 1 - \frac{1}{d_B^2} \Bigr), \]

which follows if $p \le \frac{1}{1 + 2 (\ln 2)\, d_B^2}$. We omit here any estimate of the $d_B$ required in the above concentration of the reduced state spectrum, which depends on the exact constants one uses in the probability bounds, but it is possible to get $p_0$ in the range of $10^{-3}$ to $10^{-2}$ by this approach.

There is yet another way to get rigorous estimates of $p_0$ for every example, like for the explicit construction in the previous section. Namely, get a lower bound on the minimum minimal eigenvalue of an output state of the single copy channel $\mathcal{N}$, which can be relaxed to a convex optimisation problem, and then use the argument above. In detail, consider the usual Choi-Jamiołkowski operator of the channel, $\Omega_{AB} = (\mathrm{id} \otimes \mathcal{N})\Phi$, with $|\Phi\rangle = \sum_{i=1}^{d_A} |i\rangle|i\rangle$. Then, $\mathcal{N}(\varphi) = \mathrm{Tr}_A[\Omega_{AB}(\bar{\varphi} \otimes \mathbb{1})]$ (see Appendix A), and

\[ \min_{\varphi} \lambda_{\min}(\mathcal{N}(\varphi)) = \min_{\varphi,\psi} \mathrm{Tr}[\Omega_{AB}(\varphi \otimes \psi)] = \min_{\rho\ \text{separable}} \mathrm{Tr}[\Omega_{AB}\,\rho] \ge \min_{\rho\ \text{PPT}} \mathrm{Tr}[\Omega_{AB}\,\rho]. \]

The latter is a semidefinite program, so duality theory will yield rigorous lower bounds on the minimum minimal eigenvalue of an output state. Doing that for our example in Sect. III yields again a rather poor bound for $p_0$ of the order $10^{-2}$.

V. Discussion

After the disproof of the additivity conjecture for $S_p^{\min}$ at $p > 1$, and the close shave by which the original and main conjecture at $p = 1$ has escaped, some hope was raised that one could prove additivity for $p < 1$, and hence by taking the limit for $p = 1$. This suggestion didn’t seem so unreasonable after King [5] showed additivity if one of the channels is entanglement-breaking. Also, it can be seen quite easily that arbitrary numbers of copies of the Holevo-Werner channel [6] obey additivity for $p \le 1$, via the result of [14]. In this respect the log of the minimum output rank, $S_0^{\min}$, took prominence as an important test case, and the finding of a counterexample here is putting into doubt possible programmes to prove the “standard additivity conjectures” by approaching $p = 1$ from below.

We feel that, with the minimum output rank not multiplicative, it is rather unlikely that any of the $S_p^{\min}$ for $p < 1$ should be additive. It is to be noted however, that the present technique doesn’t really yield massive violations of additivity, even at $p = 0$, and presumably less so at other $0 < p < 1$ […] $p > 1$ [7,8], but it can be understood pretty well in terms of control by the random selection to engineer a certain conspiracy between the two channels: while for $p > 1$ we only need to fix one large eigenvalue of the two-copy output corresponding to the maximally entangled input state, at $p < 1$ (and most extremely so at $p = 0$) all non-zero eigenvalues are relevant, and even to make $d$ of them zero exhausts the possibilities of the random selection performing well on the single-copy level.
It is amusing to note, however, that we still exploit the peculiar symmetries, and indeed the multiplicativity, of the maximally entangled state to construct a violation.


It is our hope that the present work will spark the search for further counterexamples, potentially finding a unified principle behind the constructions for $p > 1$ and $p < 1$ – and eventually helping to decide the original additivity conjecture(s) at $p = 1$. Note that the construction presented here and those in [7,8] already share a couple of important traits. First, the candidate channels are individually random from the unitary invariant ensemble of Stinespring dilations with fixed input, output and environment dimensions – to get strong lower bounds on the minimum output entropy. Second, the pair of channels is chosen to be in some fixed relation to each other, so as to make the output state corresponding to the maximally entangled input (or, in our case, something very close to it) special; for $p > 1$ we want it to have an unusually large eigenvalue (which is why we choose the channels to be complex conjugate to each other), here we want an eigenvalue to vanish (which is why we impose orthogonality on the Choi-Jamiołkowski states). A possible extension or unification of the constructions thus concerns not so much how the channels are individually selected, but has to address the way the two channels are related to each other.

Note added. After this work was presented at the AQIS’07 workshop in Kyoto (September 2007), Duan and Shi [15] used the methodology of our explicit construction in their surprising results on quantum zero-error capacity; they also exhibit a single channel from 4 to 4 dimensions violating additivity of $S_0^{\min}$ – as opposed to our using the tensor product of two different channels – in the sense that $S_0^{\min}(\mathcal{N}^{\otimes 2}) < 2 S_0^{\min}(\mathcal{N})$.

Acknowledgement. The authors thank Patrick Hayden, Richard Low, Koenraad Audenaert, Runyao Duan and Yaoyun Shi for their interest in the present work, and for encouraging as well as interesting discussions. AWH and AW acknowledge support by the European Commission under a Marie Curie Fellowship (ASTQIT, FP-022194).
TC, AWH, AM and AW acknowledge support through the integrated EC project “QAP” (contract no. IST-2005-15848), as well as by the U.K. EPSRC, project “QIP IRC”. AH was furthermore supported by the Army Research Office under grant W9111NF-05-1-0294. AM was additionally supported by the ECFP6-STREP network QICS. DL thanks NSERC, CRC, CFI, ORF, MITACS, ARO, and CIFAR for support. AW furthermore acknowledges support through an Advanced Research Fellowship of the U.K. EPSRC and a Royal Society Wolfson Research Merit Award.

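Appendix A: On Choi-Jamiołkowski states

Here we give a detailed explanation of Eq. (3) by describing how the channel can be recovered from our non-standard Choi-Jamiołkowski operator. Recall how to reconstruct the channel from the “standard” Choi-Jamiołkowski operator $\Omega_{AB} = (\mathrm{id} \otimes \mathcal{N})\Phi$, where $|\Phi\rangle = \sum_{i=1}^{d_A} |i\rangle|i\rangle$. The key is the identity
\[
\mathcal{N}(\varphi) = \operatorname{Tr}_A\bigl[\Omega_{AB}\,(\overline{\varphi} \otimes 1)\bigr],
\]
with the complex conjugation with respect to the basis $\{|i\rangle\}_{i=1}^{d_A}$ denoted by $\overline{\,\cdot\,}$. Now, if we have any entangled state $|\Psi\rangle$ of maximal Schmidt rank, it has a Schmidt form
\[
|\Psi\rangle = \sum_{i=1}^{d_A} \sqrt{\lambda_i}\; |e_i\rangle^{A} |f_i\rangle^{A'},
\]
with local bases $\{|e_i\rangle^{A}\}_{i=1}^{d_A}$ and $\{|f_i\rangle^{A'}\}_{i=1}^{d_A}$, and strictly positive Schmidt coefficients $\lambda_i > 0$. This means that $\Psi_A = \operatorname{Tr}_{A'} \Psi_{AA'} = \sum_{i=1}^{d_A} \lambda_i |e_i\rangle\langle e_i|$ has full rank (in particular it is invertible), so its inverse is well-defined, and
\[
\bigl(\Psi_A^{-1/2} \otimes 1\bigr)|\Psi\rangle^{AA'} = \sum_{i=1}^{d_A} |e_i\rangle|f_i\rangle .
\]
Thus, introducing the unitary basis change $U : |e_i\rangle \mapsto |\overline{f_i}\rangle$, we finally get
\[
\bigl(U\Psi_A^{-1/2} \otimes 1\bigr)|\Psi\rangle^{AA'} = \sum_{i=1}^{d_A} |\overline{f_i}\rangle|f_i\rangle = \sum_{i=1}^{d_A} |i\rangle|i\rangle = |\Phi\rangle,
\]
due to the $\overline{V} \otimes V$-invariance of $|\Phi\rangle$ for unitary $V$. So, since the mapping from $\Psi$ to $\Phi$ only acts on $A$ while the Choi-Jamiołkowski mapping acts only on $B$, and using the fact that $\rho_A = \Psi_A$ for the generalised Choi-Jamiołkowski state $\rho_{AB} = (\mathrm{id} \otimes \mathcal{N})\Psi_{AA'}$, we finally find that we can recover the “standard” operator $\Omega_{AB}$ as
\[
\Omega_{AB} = \bigl(U\Psi_A^{-1/2} \otimes 1\bigr)\, \rho_{AB}\, \bigl(\Psi_A^{-1/2} U^{\dagger} \otimes 1\bigr).
\]
In other words, using the above identities, the channel can be written
\begin{align*}
\mathcal{N}(\varphi) &= \operatorname{Tr}_A\bigl[\Omega_{AB}\,(\overline{\varphi} \otimes 1)\bigr] \\
&= \operatorname{Tr}_A\bigl[\bigl(U\Psi_A^{-1/2} \otimes 1\bigr)\, \rho_{AB}\, \bigl(\Psi_A^{-1/2} U^{\dagger} \otimes 1\bigr)(\overline{\varphi} \otimes 1)\bigr] \\
&= \operatorname{Tr}_A\bigl[\rho_{AB}\,\bigl(\rho_A^{-1/2} U^{\dagger}\, \overline{\varphi}\, U \rho_A^{-1/2} \otimes 1\bigr)\bigr], \tag{A1}
\end{align*}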

which is Eq. (3) needed in the proof of Theorem 1. Different U correspond to choosing different initial reference states with the same Schmidt spectrum, with respect to which to formulate the Choi-Jamiołkowski isomorphism. Since in our random selection argument we do not mention the reference state to begin with, we are free to set the unitary to U = 1.

References
1. Shor, P.W.: Additivity of the classical capacity of entanglement-breaking quantum channels. J. Math. Phys. 43, 4334–4340 (2002)
2. King, C.: Maximization of capacity and p-norms for some product channels. http://arXiv.org/list/quant-ph/0103086 (2001)
3. King, C.: Additivity for unital qubit channels. J. Math. Phys. 43, 4641–4653 (2002)
4. King, C.: The capacity of the quantum depolarizing channel. IEEE Trans. Inf. Theory 49, 221–229 (2003)
5. King, C.: Announced at the 1st joint AMS-PTM meeting, Warsaw 31 July – 3 Aug (2007)
6. Holevo, A.S., Werner, R.F.: Counterexample to an additivity conjecture for output purity of quantum channels. J. Math. Phys. 43, 4353–4357 (2002)
7. Winter, A.: The maximum output p-norm of quantum channels is not multiplicative for any p > 2. http://arXiv.org/abs/:0707.0402[quant-ph], (2007)
8. Hayden, P.: The maximal p-norm multiplicativity conjecture is false. http://arXiv.org/abs/:0707.3291[quant-ph], (2007)
9. Cubitt, T., Montanaro, A., Winter, A.: On the dimension of subspaces with bounded Schmidt rank. J. Math. Phys. 49, 022107 (2008)
10. Eisenbud, D.: Linear Sections of Determinantal Varieties. Amer. J. Math. 110(3), 541–575 (1988)
11. Ilic, B., Landsberg, J.M.: On symmetric degeneracy loci, spaces of symmetric matrices of constant rank and dual varieties. Math. Ann. 314, 159–174 (1999)
12. Walgate, J., Scott, A.J.: Generic local distinguishability and completely entangled subspaces. http://arXiv.org/abs/:0709.4238[quant-ph], (2007)
13. Hayden, P., Leung, D., Winter, A.: Aspects of generic entanglement. Commun. Math. Phys. 265, 95–117 (2006)
14. Alicki, R., Fannes, M.: Note on Multiple Additivity of Minimal Rényi Entropy Output of the Werner-Holevo Channels. Open Systems Inf. Dyn. 11(4), 339–342 (2004)
15. Duan, R.Y., Shi, Y.: Entanglement between Two Uses of a Noisy Multipartite Quantum Channel Enables Perfect Transmission of Classical Information. http://arXiv.org/abs/:0712.3700[quant-ph], (2007)

Communicated by M.B. Ruskai

Commun. Math. Phys. 284, 291–316 (2008)
Digital Object Identifier (DOI) 10.1007/s00220-008-0572-8

Communications in Mathematical Physics

Conformal Operators on Forms and Detour Complexes on Einstein Manifolds

A. Rod Gover1, Josef Šilhan2

1 Department of Mathematics, The University of Auckland, Private Bag 92019, Auckland 1, New Zealand. E-mail: [email protected]
2 Eduard Čech Center, Department of Algebra and Geometry, Masaryk University, Janáčkovo nám. 2a, 602 00, Brno, Czech Republic. E-mail: [email protected]

Received: 21 September 2007 / Accepted: 14 March 2008
Published online: 7 August 2008 – © Springer-Verlag 2008

Abstract: For even dimensional conformal manifolds several new conformally invariant objects were found recently: invariant differential complexes related to, but distinct from, the de Rham complex (these are elliptic in the case of Riemannian signature); the cohomology spaces of these; conformally stable form spaces that we may view as spaces of conformal harmonics; operators that generalise Branson’s Q-curvature; global pairings between differential form bundles that descend to cohomology pairings. Here we show that these operators, spaces, and the theory underlying them, simplify significantly on conformally Einstein manifolds. We give explicit formulae for all the operators concerned. The null spaces for these, the conformal harmonics, and the cohomology spaces are expressed explicitly in terms of direct sums of subspaces of eigenspaces of the form Laplacian. For the case of non-Ricci flat spaces this applies in all signatures and without topological restrictions. In the case of Riemannian signature and compact manifolds, this leads to new results on the global invariant pairings, including for the integral of Q-curvature against the null space of the dimensional order conformal Laplacian of Graham et al. 1. Introduction Differential forms provide a fundamental domain for the study of smooth manifolds. In Riemannian geometry the de Rham complex, its associated Hodge theory, and distinguished forms representing characteristic classes are among the most basic and important tools (e.g. [14,15]). In physics the study of forms is partly motivated by Maxwell theory and its generalisations. Operators on differential forms feature strongly in string and brane theories. In both mathematics and physics Einstein manifolds have a central position [2] and thus they give an important class of special structures for the study of geometric objects. Among the differential operators that are natural for pseudo–Riemannian structures only a select class are conformally invariant. 
Conformal invariance is a subtle


property which reflects an independence of the point dependent scale. This symmetry is manifest in the equations of massless particles. It is linked to CR geometry (hence complex analysis) through the Fefferman metric [19]; the natural equations on the Fefferman space are conformally invariant. This symmetry also underpins the conformal approach to Riemannian geometry. For example, it is essentially exploited in the Yamabe problem (see [36] and references therein) of prescribing the scalar curvature. Recently there has been a focus on variations of this idea, including the conformal prescription of Branson’s Q-curvature [5,13,16]. These problems use the conformal Laplacian on functions (or densities) and its higher order analogues due to Paneitz, Graham et al. [32]. The use of conformal operators on forms provides a setting where, on the one hand, there is potential to formally generalise such theories, but which, on the other hand, should yield access to rather different geometric data. An immediate difficulty is that forms are more difficult to work with than functions and so, while there was much early work in this direction (e.g. [3,4]), this did not yield a clear picture. In dimension 4, and inspired by constructions from twistor theory, some rather interesting directions and applications to global geometry were pioneered in the work of Eastwood and Singer [17,18]. Links between this result and the tractor calculus of [1,9,10] were established in [6]. On the other hand in [11,27] it is shown that the conformal tractor connection may be recovered as a suitable linearisation of the ambient metric of Fefferman and Graham [20] (and see also [21]). Exploiting both developments a rather complete theory of conformal operators on forms was derived in the joint works [7,8] of the first author with Branson. 
The main point of these articles was not simply to construct conformal operators on differential forms, but rather, to expose and develop the discovery of preferred versions of such operators and the rather elegant picture that these yield: one may immediately construct, on even dimensional conformal manifolds, a host of new global conformal invariants. Some of these generalise, in a natural way, the integral of Q-curvature. For most of these new operators explicit formulae are not available. For any particular operator a formula may be obtained algorithmically via tractor calculus and the theory developed in [27,28]. However the resulting operators, when presented in the usual way, are given by extremely complicated formulae. It turns out there are striking simplifications when these operators are studied on conformally Einstein manifolds. The purpose of this article is to expose this, via a comprehensive but concise treatment, and use the results to study, in the Einstein setting, the related global conformal invariants and spaces. To describe the content in more detail we first review the relevant results from [7] and [8]. On conformal manifolds of even dimension n ≥ 4 there is a family of formally self-adjoint conformally invariant differential complexes:
\[
\mathcal{E}^0 \xrightarrow{\;d\;} \cdots \xrightarrow{\;d\;} \mathcal{E}^{k-1} \xrightarrow{\;d\;} \mathcal{E}^k \xrightarrow{\;L_k\;} \mathcal{E}_k \xrightarrow{\;\delta\;} \mathcal{E}_{k-1} \xrightarrow{\;\delta\;} \cdots \xrightarrow{\;\delta\;} \mathcal{E}_0 . \tag{1}
\]
Here, for each $k \in \{0, 1, \ldots, n/2 - 1\}$, $\mathcal{E}^k$ denotes the space of k-forms, $\mathcal{E}_k$ denotes an appropriate density twisting of that space, $d$ is the exterior derivative and $\delta$ its formal adjoint. An interesting feature of these complexes is that the operators $L_k$ have the structure of a composition
\[
L_k = \delta\, Q_{k+1}\, d,
\]
where $Q_{k+1}$ is from a family of differential operators, parametrised by $k = -1, \ldots, n/2 - 1$, and which, as operators on closed forms, generalise Branson’s Q-curvature; in particular under conformal rescaling of the metric $g \mapsto \hat{g} = e^{2\omega} g$ ($\omega \in C^\infty$) these have the


conformal transformation formula
\[
\widehat{Q}_k u = Q_k u + L_k(\omega u)
\]
for $u$ a closed k-form; $Q_0 1$ is the Q-curvature and $L_0$ is the dimension order GJMS operator of [32]. On closed forms these Q-operators have the form $Q_{k+1} = (d\delta)^{n/2-k-1} + \text{lower order terms}$, so in the case of Riemannian signature the complexes (1) are elliptic. Writing $H_L^k$ for the (conformally invariant) cohomology at k, for the complex (1), it follows that on compact Riemannian manifolds $H_L^k$ is finite dimensional. The composition $G_k := \delta Q_k$ is a conformal gauge companion operator for $L_k$ and also for the exterior derivative $d$. What this means is that the systems $(L_k, G_k)$ and $(d, G_k)$ are, in a suitable sense, conformally invariant and, in the case of Riemannian signature, are graded injectively elliptic. For example the null space $\mathcal{H}^k_G$ of $(d, G_k)$ is conformally stable and, as pointed out in [7], is a candidate for a space of “conformal harmonics”. Some perspective on these objects is given by the sequence
\[
0 \to H^{k-1} \to H_L^{k-1} \xrightarrow{\;d\;} \mathcal{H}^k_G \to H^k, \qquad k \in \{1, 2, \ldots, n/2\}, \tag{2}
\]

where H k indicates the usual de Rham cohomology. The map H k−1 → HLk−1 is the obvious inclusion, since L factors through the exterior derivative. The map HkG → H k is that which simply takes solutions of (d, G k ) to their cohomology class. It is immediate from the definitions of the spaces that the sequence is exact, but it is an open question whether in general the final map HkG → H k is surjective. When it is we say that the space is (k − 1)-regular [7]. We present here a study of all of these spaces and operators specialised to the setting of an Einstein structure. By exploiting some recent developments we obtain a treatment which, surprisingly, obtains most of the results in a uniform way in all signatures and without assuming the manifold is compact. As indicated above the motivation is manifold. The cohomology spaces, and related structures, mentioned above are clearly fundamental to conformal geometry. An important problem is to discover what data they capture. On the other hand there is the opportunity to shed light on Einstein structures which form an important class of geometries which remain rather mysterious; for example there are very few non-existence results for compact Riemannian Einstein spaces, while the construction of examples is primarily through Kähler geometry. The idea that this might be a rewarding approach is suggested by the intimate relationship between conformal geometry and Einstein structures. Conformal structures admit a natural conformally invariant connection on a prolonged structure: this is the Cartan connection of [12], or equivalently the induced structure is the conformal standard tractor connection that was already mentioned. An Einstein structure is equivalent to a suitably generic parallel section of this tractor bundle and so is, in this sense, a type of symmetry of conformal structure. 
For the case of the conformal Laplacian type operators this last point was exploited heavily in [24] where two of the three main results are as follows: on Einstein manifolds the GJMS operators of [32] factor into compositions of operators each of which is of the form of a constant potential Helmholtz Laplacian; the Q-curvature is constant and (up to a universal constant) simply a power of the scalar curvature. (See also [21] where similar results are obtained using different techniques.) Here we develop the analogous theory for operators on differential forms. In fact we do much more. A first step is that we obtain factorisations of the key operators which generalise those from [24] (to our knowledge such factorisations are new even for the conformally flat Einstein setting). On


the other hand in [30,31] we show that if the factors Pi : V → V (for some vector space V), in a composition P := P0 P1 · · · P of mutually commuting operators, are suitably “relatively invertible” then the general inhomogeneous problem Pu = f decomposes into an equivalent system Pi u i = f , i = 0, . . . , . This is used extensively in the current work to reduce, on non-Ricci flat Einstein manifolds, the generally high order conformal operators to equivalent lower order systems. The outcome is that in any signature (and without any assumption of compactness) on non-Ricci flat Einstein manifolds we can describe the spaces N (L k ) (the null space of L k ), and HkG explicitly as a direct sum of Hσk := N (d) ∩ N (δ) and the (possibly trivial) “eigenspaces”, k k := { f ∈ E k | δd f = λ f }, Hσ,λ := { f ∈ E k | dδ f = λ f }, H σ,λ

for various explicitly known $\lambda \in \mathbb{R}$. (Here σ denotes the Einstein scale in the conformal class, see Sect. 5.) See in particular Proposition 5.3 and (25), and note that for $\lambda \neq 0$ the displayed spaces give the $\mathcal{R}(d)$ (range of d) and $\mathcal{R}(\delta)$ parts of the form Laplacian “eigenspace” $\{ f \in \mathcal{E}^k \mid (d\delta + \delta d) f = \lambda f \}$. We also come to a simple decomposition for $H_L^k$ (see expression (27)) and other conformal spaces from [7]. Stronger results are available in the compact Riemannian setting and these are summarised in Theorem 6.4. Observe, in particular, that this shows that all even dimensional compact Riemannian Einstein manifolds are k-regular for $k = 0, 1, \ldots, n/2 - 1$, and that, in this setting, $\mathcal{H}^k_\sigma$ agrees with the usual space of harmonics for the form Laplacian. From the k-regularity it follows that the global conformally invariant pairing on $\mathcal{H}^k_G$, as defined in [8], descends to a conformal quadratic form on de Rham cohomology. See Theorem 6.5, and also Proposition 6.2 which shows that, in the Einstein case, the pairing is given by a power of the scalar curvature; in fact by a formula which generalises the formula from [24] for Q-curvature on Einstein manifolds. Also in Theorem 6.5 we show that the conformal pairing, via Q-operators, of closed forms against forms in $\mathcal{N}(L_k)$ descends to a closed form pairing. See the Remark following Theorem 6.5, which emphasises that this also gives a result for the usual Q-curvature. Some of the results for compact Riemannian manifolds could be obtained by using, at the outset, the complete spectral resolution of the form Laplacian. However doing this conceals the fact that, for the most part, the same results are available even when we do not have access to diagonalisations of the basic operators. The development is as follows. Section 2.1 summarises some basic conformal geometry, tractor results, and identities to be used. In Sect. 3 we construct Laplacian operators on weighted tractor bundles.
This is in the spirit of [24], but there is an algebraic adjustment to the basic operators. Using these Laplacian power operators, in Sect. 4 we derive formulae for the key operators, L k , Q k , and so forth, in the Einstein setting. The main result is Theorem 4.2. In fact in contrast to the construction in [7] (which heavily uses the Fefferman-Graham ambient metric), these operators are developed and defined directly using invariant tractor operators. That we recover the operators from [7] is the main subject of Sect. 7. In each case the operators are given in terms of compositions of commuting operators. This enables, in Sect. 5, the use of the tools from [30] as recounted in Theorem 5.1. 2. Background: Einstein Metrics and Conformal Geometry We first sketch here notation and background for general conformal structures and their tractor calculus following [11,27]. The latter is then used to describe operators that we


will need acting on tractor forms and some key identities are developed. Some parts of the treatment are specialised to Einstein manifolds.

2.1. Conformal geometry and tractor calculus. Let M be a smooth manifold of dimension n ≥ 3. Recall that a conformal structure of signature (p, q) on M is a smooth ray subbundle $\mathcal{Q} \subset S^2T^*M$ whose fibre over x consists of conformally related signature-(p, q) metrics at the point x. Sections of $\mathcal{Q}$ are metrics g on M. The principal bundle $\pi : \mathcal{Q} \to M$ has structure group $\mathbb{R}_+$, and each representation $\mathbb{R}_+ \ni x \mapsto x^{-w/2} \in \operatorname{End}(\mathbb{R})$ induces a natural line bundle on (M, [g]) that we term the conformal density bundle $E[w]$. We shall write $\mathcal{E}[w]$ for the space of sections of this bundle. Here and throughout, sections, tensors, and functions are always smooth. When no confusion is likely to arise, we will use the same notation for a bundle and its section space. We write $\boldsymbol{g}$ for the conformal metric, that is the tautological section of $S^2T^*M \otimes E[2]$ determined by the conformal structure. This will be used to identify $TM$ with $T^*M[2]$. For many calculations we will use abstract indices in an obvious way. Given a choice of metric g from the conformal class, we write ∇ for the corresponding Levi-Civita connection. With these conventions the Laplacian Δ is given by $\Delta = \boldsymbol{g}^{ab}\nabla_a\nabla_b = \nabla^b\nabla_b$. Note $E[w]$ is trivialised by a choice of metric g from the conformal class, and we write ∇ for the connection arising from this trivialisation. It follows immediately that (the coupled) $\nabla_a$ preserves the conformal metric. Since the Levi-Civita connection is torsion-free, the (Riemannian) curvature $R_{ab}{}^{c}{}_{d}$ is given by $[\nabla_a, \nabla_b]v^c = R_{ab}{}^{c}{}_{d}\, v^d$, where $[\cdot,\cdot]$ indicates the commutator bracket. The Riemannian curvature can be decomposed into the totally trace-free Weyl curvature $C_{abcd}$ and a remaining part described by the symmetric Schouten tensor $P_{ab}$, according to $R_{abcd} = C_{abcd} + 2\boldsymbol{g}_{c[a}P_{b]d} + 2\boldsymbol{g}_{d[b}P_{a]c}$, where $[\cdots]$ indicates antisymmetrisation over the enclosed indices. We shall write $J := P^{a}{}_{a}$. The Cotton tensor is defined by $A_{abc} := 2\nabla_{[b}P_{c]a}$.
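As a quick consistency check on these conventions, the curvature decomposition can be traced to relate the Schouten tensor to the Ricci tensor. This is a sketch added for illustration and is not part of the original text: the names $\mathrm{Ric}$ and $\mathrm{Sc}$, and the contraction convention $\mathrm{Ric}_{bd} := R_{ab}{}^{a}{}_{d}$, are assumptions made here. Contracting with $\boldsymbol{g}^{ac}$ and using that the Weyl curvature is trace-free gives:

```latex
\begin{align*}
\boldsymbol{g}^{ac}\, 2\boldsymbol{g}_{c[a}P_{b]d}
  &= \boldsymbol{g}^{ac}\bigl(\boldsymbol{g}_{ca}P_{bd} - \boldsymbol{g}_{cb}P_{ad}\bigr)
   = (n-1)\,P_{bd}, \\
\boldsymbol{g}^{ac}\, 2\boldsymbol{g}_{d[b}P_{a]c}
  &= \boldsymbol{g}^{ac}\bigl(\boldsymbol{g}_{db}P_{ac} - \boldsymbol{g}_{da}P_{bc}\bigr)
   = J\,\boldsymbol{g}_{bd} - P_{bd}, \\
\mathrm{Ric}_{bd} &= (n-2)\,P_{bd} + J\,\boldsymbol{g}_{bd},
\qquad
\mathrm{Sc} := \boldsymbol{g}^{bd}\,\mathrm{Ric}_{bd} = 2(n-1)\,J .
\end{align*}
```

Under these assumptions, and since n ≥ 3, a metric satisfies $\mathrm{Ric}_{bd} = \frac{\mathrm{Sc}}{n}\boldsymbol{g}_{bd}$ exactly when $P_{ab} = \frac{J}{n}\boldsymbol{g}_{ab}$, which is the form in which the Einstein condition is used in the computations of Sect. 2.2.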
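Under a conformal transformation we replace a choice of metric g by the metric $\hat{g} = e^{2\omega} g$, where ω is a smooth function. Explicit formulae for the corresponding transformation of the Levi-Civita connection and its curvatures are given in e.g. [27]. We recall that, in particular, the Weyl curvature is conformally invariant: $\widehat{C}_{abcd} = C_{abcd}$.

We next define the standard tractor bundle over (M, [g]). It is a vector bundle of rank n + 2 defined, for each g ∈ [g], by $[\mathcal{E}^A]_g = \mathcal{E}[1] \oplus \mathcal{E}_a[1] \oplus \mathcal{E}[-1]$. If $\hat{g} = e^{2\Upsilon} g$, we identify $(\alpha, \mu_a, \tau) \in [\mathcal{E}^A]_g$ with $(\hat\alpha, \hat\mu_a, \hat\tau) \in [\mathcal{E}^A]_{\hat g}$ by the transformation
\[
\begin{pmatrix} \hat\alpha \\ \hat\mu_a \\ \hat\tau \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 \\
\Upsilon_a & \delta_a{}^{b} & 0 \\
-\tfrac{1}{2}\Upsilon_c\Upsilon^c & -\Upsilon^b & 1
\end{pmatrix}
\begin{pmatrix} \alpha \\ \mu_b \\ \tau \end{pmatrix},
\tag{3}
\]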

where ϒa := ∇a ϒ. It is straightforward to verify that these identifications are consistent upon changing to a third metric from the conformal class, and so taking the quotient by this equivalence relation defines the standard tractor bundle T , or E A in an abstract index notation, over the conformal manifold. (Alternatively the standard tractor bundle may be constructed as a canonical quotient of a certain 2-jet bundle or as an associated bundle to the normal conformal Cartan bundle [10].) On a conformal structure of signature ( p, q), the bundle E A admits an invariant metric h AB of signature ( p + 1, q + 1)


and an invariant connection, which we shall also denote by $\nabla_a$, preserving $h_{AB}$. In a conformal scale g, these are given by
\[
h_{AB} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & \boldsymbol{g}_{ab} & 0 \\ 1 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\nabla_a \begin{pmatrix} \alpha \\ \mu_b \\ \tau \end{pmatrix}
= \begin{pmatrix} \nabla_a\alpha - \mu_a \\ \nabla_a\mu_b + \boldsymbol{g}_{ab}\tau + P_{ab}\alpha \\ \nabla_a\tau - P_{ab}\mu^b \end{pmatrix}.
\tag{4}
\]
It is readily verified that both of these are conformally well-defined, i.e., independent of the choice of a metric g ∈ [g]. Note that $h_{AB}$ defines a section of $\mathcal{E}_{AB} = \mathcal{E}_A \otimes \mathcal{E}_B$, where $\mathcal{E}_A$ is the dual bundle of $\mathcal{E}^A$. Hence we may use $h_{AB}$ and its inverse $h^{AB}$ to raise or lower indices of $\mathcal{E}_A$, $\mathcal{E}^A$ and their tensor products. In computations, it is often useful to introduce the ‘projectors’ from $\mathcal{E}^A$ to the components $\mathcal{E}[1]$, $\mathcal{E}_a[1]$ and $\mathcal{E}[-1]$ which are determined by a choice of scale. They are respectively denoted by $X_A \in \mathcal{E}_A[1]$, $Z_{Aa} \in \mathcal{E}_{Aa}[1]$ and $Y_A \in \mathcal{E}_A[-1]$, where $\mathcal{E}_{Aa}[w] = \mathcal{E}_A \otimes \mathcal{E}_a \otimes \mathcal{E}[w]$, etc. Using the metrics $h_{AB}$ and $\boldsymbol{g}_{ab}$ to raise indices, we define $X^A$, $Z^{Aa}$, $Y^A$. Then we immediately see that
\[
Y_A X^A = 1, \qquad Z_{Ab} Z^{A}{}_{c} = \boldsymbol{g}_{bc},
\]
and that all other quadratic combinations that contract the tractor index vanish. In (3) note that $\hat\alpha = \alpha$, and hence $X^A$ is conformally invariant.

Given a choice of conformal scale, the tractor-D operator $D_A : \mathcal{E}_{B\cdots E}[w] \to \mathcal{E}_{AB\cdots E}[w-1]$ is defined by
\[
D_A V := (n + 2w - 2)\,w\,Y_A V + (n + 2w - 2)\,Z_{Aa}\nabla^a V - X_A \Box V,
\tag{5}
\]
where $\Box V := \Delta V + wJV$. This also turns out to be conformally invariant as can be checked directly using the formulae above (or alternatively there are conformally invariant constructions of D, see e.g. [22]). The curvature Ω of the tractor connection is defined by $[\nabla_a, \nabla_b]V^C = \Omega_{ab}{}^{C}{}_{E} V^E$ for $V^C \in \mathcal{E}^C$. Using (4) and the formulae for the Riemannian curvature yields
\[
\Omega_{abCE} = Z_C{}^{c} Z_E{}^{e}\, C_{abce} - 2X_{[C}Z_{E]}{}^{e} A_{eab}.
\tag{6}
\]
We will also need a conformally invariant curvature quantity defined as follows (cf. [22,23])
\[
W_{BC}{}^{E}{}_{F} := \frac{3}{n-2}\, D^A X_{[A}\Omega_{BC]}{}^{E}{}_{F},
\tag{7}
\]
where $\Omega_{BC}{}^{E}{}_{F} := Z_B{}^{b} Z_C{}^{c}\, \Omega_{bc}{}^{E}{}_{F}$. In a choice of conformal scale, $W_{ABCE}$ is given by
\begin{align*}
(n-4)\Bigl( & Z_A{}^{a} Z_B{}^{b} Z_C{}^{c} Z_E{}^{e}\, C_{abce} - 2 Z_A{}^{a} Z_B{}^{b} X_{[C}Z_{E]}{}^{e} A_{eab} \\
& - 2 X_{[A}Z_{B]}{}^{b} Z_C{}^{c} Z_E{}^{e} A_{bce} \Bigr) + 4 X_{[A}Z_{B]}{}^{b} X_{[C}Z_{E]}{}^{e} B_{eb},
\tag{8}
\end{align*}
where $B_{ab} := \nabla^c A_{acb} + P^{dc}C_{dacb}$ is known as the Bach tensor. From the formula (8) it is clear that $W_{ABCD}$ has Weyl tensor type symmetries.
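The statement above that the tractor metric is independent of the choice of metric in the conformal class can be checked by hand. The following short computation is a sketch added for illustration, not part of the original text; it writes $h(V,V)$ for the pairing of a section $V = (\alpha, \mu_a, \tau)$ with itself and $|\Upsilon|^2 := \Upsilon_c\Upsilon^c$, both of which are shorthand assumed here. Reading $h_{AB}$ off from (4) and substituting the transformation (3):

```latex
\begin{align*}
h(V,V) &= 2\alpha\tau + \mu^a\mu_a, \\
\hat\alpha &= \alpha, \qquad
\hat\mu_a = \mu_a + \Upsilon_a\alpha, \qquad
\hat\tau = \tau - \Upsilon^b\mu_b - \tfrac12|\Upsilon|^2\alpha, \\
2\hat\alpha\hat\tau + \hat\mu^a\hat\mu_a
  &= 2\alpha\tau - 2\alpha\,\Upsilon^b\mu_b - |\Upsilon|^2\alpha^2
     + \mu^a\mu_a + 2\alpha\,\Upsilon^a\mu_a + |\Upsilon|^2\alpha^2 \\
  &= 2\alpha\tau + \mu^a\mu_a .
\end{align*}
```

The cross terms and the $|\Upsilon|^2\alpha^2$ terms cancel in pairs, so the quadratic form, and hence by polarisation $h_{AB}$ itself, does not depend on the scale. Indices are raised with the conformal metric $\boldsymbol{g}^{ab}$, so no additional weight factors appear.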


We will work with conformally Einstein manifolds. That is, conformal structures with an Einstein metric in the conformal class. This is the same as the existence of a non-vanishing section $\sigma \in \mathcal{E}[1]$ satisfying $\bigl(\nabla_{(a}\nabla_{b)_0} + P_{(ab)_0}\bigr)\sigma = 0$, where $(\ldots)_0$ indicates the trace-free symmetric part over the enclosed indices. Equivalently (see e.g. [1,26]) there is a standard tractor $I_A$ that is parallel with respect to the normal tractor connection ∇ and such that $\sigma := X^A I_A$ is non-vanishing. It follows that $I_A := \frac{1}{n} D_A\sigma = Y_A\sigma + Z_A^a\nabla_a\sigma - \frac{1}{n} X_A(\Delta + J)\sigma$, for some section $\sigma \in \mathcal{E}[1]$, and so $X^A I_A = \sigma$ is non-vanishing. If we compute in the scale σ, then for example $W_{ABCD} = (n-4)\, Z_A^a Z_B^b Z_C^c Z_D^d\, C_{abcd}$.

2.2. Tractor forms. Following [7] we write $\mathcal{E}^k[w]$ for the space of sections of $(\Lambda^k T^*M) \otimes E[w]$ (and $\mathcal{E}^k = \mathcal{E}^k[0]$). Further we put $\mathcal{E}_k[w] := \mathcal{E}^k[w + 2k - n]$. The space of closed k-forms shall be denoted by $\mathcal{C}^k \subseteq \mathcal{E}^k$. In order to be explicit and efficient in calculations involving bundles of possibly high rank it is necessary to employ abstract index notation as follows. In the usual abstract index conventions one would write $\mathcal{E}_{[ab\cdots c]}$ (where there are implicitly k indices skewed over) for the space $\mathcal{E}^k$. To simplify subsequent expressions we use the following conventions. Firstly indices labelled with sequential superscripts which are at the same level (i.e. all contravariant or all covariant) will indicate a completely skew set of indices. Formally we set $a^1 \cdots a^k = [a^1 \cdots a^k]$ and so, for example, $\mathcal{E}_{a^1\cdots a^k}$ is an alternative notation for $\mathcal{E}^k$ while $\mathcal{E}_{a^1\cdots a^{k-1}}$ and $\mathcal{E}_{a^2\cdots a^k}$ both denote $\mathcal{E}^{k-1}$. Next, following [29] we abbreviate this notation via multi-indices: we will use the forms indices
\begin{align*}
\mathbf{a}^k &:= a^1 \cdots a^k = [a^1 \cdots a^k], \quad k \ge 0, \\
\dot{\mathbf{a}}^k &:= a^2 \cdots a^k = [a^2 \cdots a^k], \quad k \ge 1.
\end{align*}
If k = 1 then $\dot{\mathbf{a}}^k$ simply means the index is absent. The corresponding notations will be used for tractor indices so e.g. the bundle of tractor k-forms $\mathcal{E}_{[A^1\cdots A^k]}$ will be denoted by $\mathcal{E}_{A^1\cdots A^k}$ or $\mathcal{E}_{\mathbf{A}^k}$. The structure of $\mathcal{E}_{\mathbf{A}^k}$ is
\[
\mathcal{E}_{[A^1\cdots A^k]} = \mathcal{E}_{\mathbf{A}^k} \simeq \mathcal{E}^{k-1}[k] \mathbin{⨭} \bigl(\mathcal{E}^{k}[k] \oplus \mathcal{E}^{k-2}[k-2]\bigr) \mathbin{⨭} \mathcal{E}^{k-1}[k-2];
\tag{9}
\]
in a choice of scale the semidirect sums ⨭ may be replaced by direct sums and otherwise they indicate the composition series structure arising from the tensor powers of (3). In a choice of metric g from the conformal class, the projectors (or splitting operators) X, Y, Z for $\mathcal{E}_A$ determine corresponding projectors $\mathbb{X}, \mathbb{Y}, \mathbb{Z}, \mathbb{W}$ for $\mathcal{E}_{\mathbf{A}^{k+1}}$, k ≥ 1. These execute the splitting of this space into four components and are given as follows:
\begin{align*}
\mathbb{Y}^k &= \mathbb{Y}_{A^0 A^1\cdots A^k}^{\quad\; a^1\cdots a^k} = \mathbb{Y}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k} = Y_{A^0} Z_{A^1}^{a^1} \cdots Z_{A^k}^{a^k} \in \mathcal{E}_{\mathbf{A}^{k+1}}^{\mathbf{a}^k}[-k-1], \\
\mathbb{Z}^k &= \mathbb{Z}_{A^1\cdots A^k}^{a^1\cdots a^k} = \mathbb{Z}_{\mathbf{A}^k}^{\mathbf{a}^k} = Z_{A^1}^{a^1} \cdots Z_{A^k}^{a^k} \in \mathcal{E}_{\mathbf{A}^{k}}^{\mathbf{a}^k}[-k], \\
\mathbb{W}^k &= \mathbb{W}_{A' A^0 A^1\cdots A^k}^{\qquad a^1\cdots a^k} = \mathbb{W}_{A' A^0\mathbf{A}^k}^{\qquad \mathbf{a}^k} = X_{[A'} Y_{A^0} Z_{A^1}^{a^1} \cdots Z_{A^k]}^{a^k} \in \mathcal{E}_{\mathbf{A}^{k+2}}^{\mathbf{a}^k}[-k], \\
\mathbb{X}^k &= \mathbb{X}_{A^0 A^1\cdots A^k}^{\quad\; a^1\cdots a^k} = \mathbb{X}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k} = X_{A^0} Z_{A^1}^{a^1} \cdots Z_{A^k}^{a^k} \in \mathcal{E}_{\mathbf{A}^{k+1}}^{\mathbf{a}^k}[-k+1],
\end{align*}
where k ≥ 0. The superscript k in $\mathbb{Y}^k$, $\mathbb{Z}^k$, $\mathbb{W}^k$ and $\mathbb{X}^k$ shows the corresponding tensor valence. (This is slightly different than in [7], where k is the relevant tractor valence.) Note that $Y = \mathbb{Y}^0$, $Z = \mathbb{Z}^1$ and $X = \mathbb{X}^0$ and $\mathbb{W}^0 = X_{[A'}Y_{A^0]}$. From (4) we immediately


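see $\nabla_p Y_A = Z_A^a P_{pa}$, $\nabla_p Z_A^a = -\delta_p^a Y_A - P_p{}^{a} X_A$ and $\nabla_p X_A = Z_{Ap}$. From this we obtain the formulae (cf. [29])
\begin{align*}
\nabla_p \mathbb{Y}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k} &= P_{pa^0}\, \mathbb{Z}_{A^0\mathbf{A}^k}^{a^0\mathbf{a}^k} + k\, P_p{}^{a^1}\, \mathbb{W}_{A^0\mathbf{A}^k}^{\quad \dot{\mathbf{a}}^k}, \\
\nabla_p \mathbb{Z}_{A^0\mathbf{A}^k}^{a^0\mathbf{a}^k} &= -(k+1)\,\delta_p^{a^0}\, \mathbb{Y}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k} - (k+1)\, P_p{}^{a^0}\, \mathbb{X}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k}, \\
\nabla_p \mathbb{W}_{A^0\mathbf{A}^k}^{\quad \dot{\mathbf{a}}^k} &= -\boldsymbol{g}_{pa^1}\, \mathbb{Y}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k} + P_{pa^1}\, \mathbb{X}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k}, \tag{10} \\
\nabla_p \mathbb{X}_{A^0\mathbf{A}^k}^{\quad \mathbf{a}^k} &= \boldsymbol{g}_{pa^0}\, \mathbb{Z}_{A^0\mathbf{A}^k}^{a^0\mathbf{a}^k} - k\,\delta_p^{a^1}\, \mathbb{W}_{A^0\mathbf{A}^k}^{\quad \dot{\mathbf{a}}^k},
\end{align*}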

which determine the tractor connection on form tractors in a conformal scale. Similarly, one can compute the Laplacian Δ applied to the tractors $\mathbb{X}$, $\mathbb{Y}$, $\mathbb{Z}$ and $\mathbb{W}$. As an operator on form tractors we have the opportunity to modify Δ by adding some amount of $W\sharp\sharp$, where $\sharp\sharp$ denotes the natural tensorial action of sections in $\operatorname{End}(\mathcal{E}^A)$. Analogously, we shall use $C\sharp\sharp$ to modify the Laplacian on forms; here $\sharp\sharp$ denotes the natural tensorial action of sections in $\operatorname{End}(\mathcal{E}^a)$. It turns out (cf. [7]) that it will be convenient for us to use the operator

1 + n−4 W

n = 4 / = n = 4. (Note = ∇ a ∇a .) Since the Laplacian is of the second order, it is convenient to con/ YAa˙ τa˙ , where τa˙ ∈ Ea˙ [w]. It will be sufficient for our purpose to calculate sider e.g. this only in an Einstein scale. For example, using (10) and then that Pab = g ab J/n, we have 2 ∇ p ∇ p YAa˙ τa˙ = ∇ p Ppa 1 ZaA + (k − 1)Ppa WaA¨ + YAa˙ ∇ p τa˙ 2(k −1)(n−k +1) δd + dδ + (1− )J + C

τ = −YAa˙ n a˙ 2 a 2(k −1) a¨ n−2k +2 a˙ 2 Z (J dτ )a − WA (J δτ )a¨ − + XA J τa˙ , nk A n n2 where, as usual, A = Ak and a = ak . Summarising, one can compute that in an Einstein scale we obtain 2(k −1)(n−k +1) a˙ a˙ / YA τa˙ = YA δd + dδ + (1− − )J τ n a˙ 2 a 2(k −1) a¨ n−2k +2 a˙ 2 WA (J δτ )a¨ + − ZA (J dτ )a + X J τa˙ , nk n n 2 A 2k(n−k −1) / ZAa µa = − 2kYAa˙ (δµ)a˙ + ZAa δd + dδ − J µ − n a 2k a˙ − XA (J δµ)a˙ , n 2(k −3)(n−k +2) 2 a¨ / WA νa¨ = δd + dδ − YAa˙ (dν)a˙ + WAa¨ J ν − k −1 n a¨ 2 a˙ X (dν)a˙ , − n(k −1) A


/ XAa˙ ρa˙ = (n−2k +2)YAa˙ ρa˙ − 2(k −1)WAa¨ (δρ)a¨ − 2(k −1)(n−k +1) 2 δd + dδ + (1− )J ρ , − ZAa (dρ)a + XAa˙ k n a˙ (11) if either n = 4, k = 1 or n = 4, cf. [37, (1.50)]. Here τa˙ ∈ Ea˙ [w], µa ∈ Ea [w], νa¨ ∈ Ea¨ [w] and ρa˙ ∈ Ea˙ [w], where a = ak , k ≥ 1 and w is any conformal weight. 2.3. Useful identities. Here we first introduce and discuss some identities that hold on a general conformal manifold. Recall that sequentially labelled indices are assumed to be skew over, e.g. A1 A2 = [A1 A2 ]. The operator D A1 A2 = −2(wW A1 A2 + X A1aA2 ∇a )

(12)

was introduced in [23]. Also recall the definition of the tractor W B 1 B 2 C 1 C 2 in (7). By / in (5) we obtain the conformally invariant operator replacing by

1 D A − n−4 X A W

n = 4 D /A= DA n = 4. The case n = 4 of this operator was introduced in [7]. In contrast to D, and surprisingly, / is algebraic (cf. the commutator of D in [27]) for n = 4: the commutator of D

n + 2w − 2 / B] = / A, D (n + 2w − 4)W AB − (D AB W )

f, n = 4. (13) [D n−4 This can be checked by a direct computation, or alternatively by a rather simple calculation using the ambient metric and its links to tractors as in Sect. 7. For n = 4 we have / A, D / B ] = [D A , D B ], see [27] for the latter. Note one can moreover show that for [D n = 4, the operator D A − y X A W

, y ∈ R has an algebraic commutator only for the 1 value y = n−4 . Proposition 2.1. D A1 A2 and W B 1 B 2 C 1 C 2 have the following properties: (i) D[A1 A2 W B 1 B 2 ]C 1 C 2 = 0, (ii) D A1 P W P A2 B 1 B 2 = −W A1 A2 B 1 B 2 . Proof. We shall use the form indices A = A2 , B = B2 and C = C2 throughout the proof. Both identities can be verified by the direct computation. To simplify the computation note that alternatively [37] we have 2 1 WBC = (n − 4)ZbB − 2XbB ∇ b bC ∈ EBC [−2]. b = Xa Xb = Xb W (i) Using the relations W[A XB] [A B] [A B] = 0 (which follow from X [A X B] = 0) we obtain a b YB] abC D[A WB]C = − 2(n − 4)W[A ZbB] bC − 2(n − 4)X[A a b a b + (n − 4)X[A ZB] ∇a bC − 2X[A ZB] g ab1 ∇ p pb2 C .

Now clearly the first two terms on the right-hand side add up to 0 and the remaining ones both vanish.

1 2

1

2

(ii) Clearly W A1 P Z Pa Aa2 = X A1 Pa X P aA2 = 0. Thus 2 D A1 P W P A2 B = − 2 4(n − 4)W A1 P X P aA2 ∇ q qa 2 B 2 1 2 + (n − 4)X A1 P p − 2Y P aA2 pa 2 B + Z Pa Aa2 ∇ p aB 1 2 2 + X A1 P p − 2Z Pa Aa2 g pa 1 + 2W P A2 δ ap ∇ q qa 2 B . 2

Now using W A1 P X P aA2 = pa 2

1 a2 4 XA ,

1 2

X A1 P p Z Pa Aa2 =

2

− 41 ZA − 41 WA g pa , the proposition follows.

1 a 2 pa 1 2 XA g

2

and X A1 P p Y P aA2 =

Lemma 2.2. (i) Let $I^A \in \mathcal{E}^A$ be a parallel tractor. Then $I^P D_{PQ} W^{Q}{}_{A^2B^1B^2} = 0$ and $I^P D_{P[A^0} W_{A^1A^2]B^1B^2} = 0$. (ii) Let $I^A, \bar{I}^A \in \mathcal{E}^A$ be two parallel tractors. Then $I^{A^1} \bar{I}^{A^2} D_{A^1A^2} W_{B^1B^2C^1C^2} = 0$.

Proof. Recall any parallel tractor $I^A$ satisfies $I^{A^1} W_{A^1A^2B^1B^2} = 0$ [26]. Thus the first relation of (i) follows by applying $I^{A^1}$ to Proposition 2.1 (ii) and the second relation of (i) by applying $I^{A^1}$ to Proposition 2.1 (i). Similarly, (ii) follows by applying $I^{A^1} \bar{I}^{A^2}$ to Proposition 2.1 (i).

3. Einstein Manifolds: Conformal Laplacian Operators on Tractors

We assume that the structure $(M, [g])$ is conformally Einstein, and write $\sigma \in \mathcal{E}[1]$ for some Einstein scale from the conformal class. Then $I_A := \frac{1}{n} D_A \sigma$ is parallel and $X^A I_A = \sigma$ is nonvanishing. The operator $\Box = \Delta + wJ$ acting on tractor bundles of the weight $w$ is invariant only if $n + 2w - 2 = 0$. On the other hand the scale $\sigma$ (or equivalently $I^A$), yields the operator

$$\slashed{\Box}_\sigma := I^A \slashed{D}_A : \mathcal{E}_{B\cdots E}[w] \longrightarrow \mathcal{E}_{B\cdots E}[w-1] \tag{14}$$

which is well defined for any $w$, cf. [24]. Thus we can consider the composition $(\slashed{\Box}_\sigma)^p$, $p \in \mathbb{N}$, and we set $(\slashed{\Box}_\sigma)^0 := \mathrm{id}$. These operators generally depend on the choice of the Einstein scale but one has the following modification of [24, Theorem 3.1].

Theorem 3.1. Let $\sigma, \bar{\sigma}$ be two Einstein scales in the conformal class and consider the operators

$$\frac{1}{\sigma^p}(\slashed{\Box}_\sigma)^p, \quad \frac{1}{\bar{\sigma}^p}(\slashed{\Box}_{\bar{\sigma}})^p : \mathcal{E}_{B\cdots E}[w] \longrightarrow \mathcal{E}_{B\cdots E}[w-2p],$$

for $p \in \mathbb{Z}_{\geq 0}$. If $w = p - n/2$ then

$$\frac{1}{\sigma^p}(\slashed{\Box}_\sigma)^p = \frac{1}{\bar{\sigma}^p}(\slashed{\Box}_{\bar{\sigma}})^p.$$

Proof. Assume $w = p - n/2$ and denote the Einstein metric corresponding to $\sigma$ by $g := \sigma^{-2}\boldsymbol{g}$. Then $\frac{1}{\sigma^p}(\slashed{\Box}_\sigma)^p$ is very similar to the operator $P_p^g$ given by [24, (19)]. The difference is, beside the sign, that we have replaced $D_A$ in the definition of $P_p^g$ by $\slashed{D}_A$ in the definition of $\frac{1}{\sigma^p}(\slashed{\Box}_\sigma)^p$. Thus our statement is analogous to [24, Theorem 3.1] for the operators $P_p^g$. We can follow the proof of the latter theorem literally; the difference between $D_A$ and $\slashed{D}_A$ appears only in the commutator in the last display in the proof of [24, Theorem 3.1]. In our case, we need the commutator $[\slashed{D}_A, \slashed{D}_B]$ instead of that display and to finish the proof it remains to show that $I^A \bar{I}^B [\slashed{D}_A, \slashed{D}_B]$ vanishes on density valued tractor fields. Here $I_A = \frac{1}{n} D_A \sigma$, $\bar{I}_B = \frac{1}{n} D_B \bar{\sigma}$ are parallel tractors corresponding to the Einstein scales $\sigma$, $\bar{\sigma}$. But this follows from (13), Lemma 2.2 (ii) and the fact $I^{A^1} W_{A^1A^2B^1B^2} = 0$.

4. Einstein Manifolds: Q-Operators, Gauge Operators and Detour Complexes

The main aim here is to recover, in the Einstein setting, the differential complexes (1), the Q-operators and the related conformal spaces, as defined in [7]. In that source the Fefferman-Graham ambient metric is used to generate the operators which form the “building blocks” of the theory. In contrast here, in the conformally Einstein setting, all the operators are “rediscovered” directly using the tractor operators. However in Sect. 7 we use the ambient metric to establish that we do have exactly the specialisation of the operators and spaces from [7]; see the comment following expression (19) and Proposition 7.1.

Here we work on an Einstein manifold in an Einstein scale $\sigma \in \mathcal{E}[1]$. The first step is the conformally invariant differential splitting operator $M_{\mathbf{A}}{}^{\mathbf{a}} : \mathcal{E}_{\mathbf{a}} \longrightarrow \mathcal{E}_{\mathbf{A}}[-k]$,

$$M_{\mathbf{A}}{}^{\mathbf{a}} f_{\mathbf{a}} = \frac{n-2k}{k}\, \mathbb{Z}_{\mathbf{A}}^{\mathbf{a}} f_{\mathbf{a}} + \mathbb{X}_{\mathbf{A}}^{\dot{\mathbf{a}}} (\delta f)_{\dot{\mathbf{a}}}, \tag{15}$$

where $\mathbf{A} = A^k$ and $\mathbf{a} = a^k$. This is a special case of the operator $M$ from [29] (up to the multiple $k$). Let $I_A := \frac{1}{n} D_A \sigma$ be the Einstein tractor for $\sigma \in \mathcal{E}[1]$. Since $I_A = Y_A \sigma - \frac{1}{n} X_A \sigma J$ in the scale $\sigma$, it follows from (14) that

$$\slashed{\Box}_\sigma = \sigma\Big(-\slashed{\Delta} - \frac{2w}{n}(n+w-1)J\Big) : \mathcal{E}_{B\cdots E}[w] \longrightarrow \mathcal{E}_{B\cdots E}[w-1] \tag{16}$$

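in the scale $\sigma$. For any differential operator $E : \mathcal{E}^k \to \mathcal{E}^k$ we have the compositions

$$P_k^p[E] := \prod_{i=1}^{p}\Big(E + \frac{2i(n-2k-i+1)}{n} J\Big) \tag{17}$$

for $p \geq 1$. We set $P_k^p[E] := \mathrm{id}$ for $p \leq 0$. Note that $P_k^p$ can be considered as a polynomial in $E$. Next we define the operator

$$\mathbb{L}_k^p := \frac{1}{\sigma^p}(\slashed{\Box}_\sigma)^p M_{\mathbf{A}}{}^{\mathbf{a}} : \mathcal{E}_{\mathbf{a}}[w] \longrightarrow \mathcal{E}_{\mathbf{A}}[w-k-2p] \tag{18}$$

for $p \geq 0$ where $(\slashed{\Box}_\sigma)^0 := \mathrm{id}$. We put $\mathbb{L}_k := \mathbb{L}_k^p$ for $p = \frac{n+2w-2k}{2}$. It follows from Theorem 3.1 that $\mathbb{L}_k$ is independent of the choice of the Einstein scale $\sigma$. Now we are ready to state the main technical step of our construction.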


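Theorem 4.1. Let $f_{\mathbf{a}} \in \mathcal{E}_{\mathbf{a}}$, where $\mathbf{a} = a^k$ and $p \geq 1$. Then computing in the Einstein scale $\sigma$ we obtain

$$\begin{aligned}
(\mathbb{L}_k^p f)_{\mathbf{A}} ={}& -p(n-2k-2p)\, \mathbb{Y}_{\mathbf{A}}^{\dot{\mathbf{a}}} \big[\delta P_k^{p-1}[d\delta] f\big]_{\dot{\mathbf{a}}} \\
& + \frac{1}{k}\, \mathbb{Z}_{\mathbf{A}}^{\mathbf{a}} \big[(n-2k-2p)\, d\delta P_k^{p-1}[d\delta] f + (n-2k)\, \delta d P_{k+1}^{p-1}[\delta d] f\big]_{\mathbf{a}} \\
& + \mathbb{X}_{\mathbf{A}}^{\dot{\mathbf{a}}} \Big[\delta\Big(d\delta + \frac{p(n-2k+2)}{n} J\Big) P_k^{p-1}[d\delta] f\Big]_{\dot{\mathbf{a}}}.
\end{aligned}$$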

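Proof. It is easy to show by a direct computation (using (10) and (11)) that the theorem holds for $p = 1$. Now assume, by induction, that the theorem holds for a fixed $p \in \mathbb{N}$. To verify the theorem for $p + 1$ we need to compute

$$\begin{aligned}
(\mathbb{L}_k^{p+1} f)_{\mathbf{A}} &= \frac{1}{\sigma^{p+1}}\, \slashed{\Box}_\sigma (\slashed{\Box}_\sigma)^p M_{\mathbf{A}}{}^{\mathbf{a}} f_{\mathbf{a}} \\
&= \frac{1}{\sigma^{p+1}}\, \sigma\Big(-\slashed{\Delta} + 2\frac{k+p}{n}(n-k-p-1)J\Big)(\slashed{\Box}_\sigma)^p M_{\mathbf{A}}{}^{\mathbf{a}} f_{\mathbf{a}} \\
&= \Big(-\slashed{\Delta} + 2\frac{k+p}{n}(n-k-p-1)J\Big)(\mathbb{L}_k^p f)_{\mathbf{A}},
\end{aligned}$$

where the multiple of $J$ follows from (16) and from the conformal weight of $(\slashed{\Box}_\sigma)^p M_{\mathbf{A}}{}^{\mathbf{a}} f_{\mathbf{a}} \in \mathcal{E}_{\mathbf{A}}[-k-p]$. Since the theorem yields the formula for $(\mathbb{L}_k^p f)_{\mathbf{A}}$ we can continue in the computation using (10) and (11). The rest of the proof is to finish this computation. Let us show at least the top slot. Using (10), (11) and the formula for $(\mathbb{L}_k^p f)_{\mathbf{A}}$ we obtain

$$\begin{aligned}
\mathbb{Y}_{\mathbf{A}}^{\dot{\mathbf{a}}} \Big[ & -p(n-2k-2p)\, \delta d\delta P_k^{p-1}[d\delta] f \\
& - p\Big(1 - \frac{2(k-1)(n-k+1)}{n}\Big)(n-2k-2p)\, J \delta P_k^{p-1}[d\delta] f \\
& - 2(n-2k-2p)\, \delta d\delta P_k^{p-1}[d\delta] f \\
& + (n-2k+2)\, \delta\Big(d\delta + \frac{p(n-2k+2)}{n} J\Big) P_k^{p-1}[d\delta] f \\
& - 2\frac{k+p}{n}(n-k-p-1)\, p(n-2k-2p)\, J \delta P_k^{p-1}[d\delta] f \Big]_{\dot{\mathbf{a}}}.
\end{aligned}$$

Summing up appropriate terms, we obtain that this is exactly $-(p+1)(n-2k-2(p+1))\, \mathbb{Y}_{\mathbf{A}}^{\dot{\mathbf{a}}} \big[\delta P_k^{p}[d\delta] f\big]_{\dot{\mathbf{a}}}$.

The computation of remaining slots is similar and the theorem follows.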
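The "summing up appropriate terms" step in the top slot can be sanity-checked numerically. The sketch below is only an illustration, not part of the original proof: it assumes that $J$ is constant (as on an Einstein manifold), tracks just the scalar coefficients of $\delta d\delta g$ and $J\delta g$ with $g = P_k^{p-1}[d\delta] f$, and uses hypothetical function names.

```python
from fractions import Fraction


def top_slot_sum(n, k, p):
    """Coefficients of (delta d delta g, J delta g) in the five-term top-slot sum.

    Here g stands for P_k^{p-1}[d delta] f; only scalar coefficients are tracked.
    """
    N = n - 2 * k
    # Terms proportional to delta d delta g.
    lead = Fraction(-p * (N - 2 * p) - 2 * (N - 2 * p) + (N + 2))
    # Terms proportional to J delta g.
    curv = (
        -p * (1 - Fraction(2 * (k - 1) * (n - k + 1), n)) * (N - 2 * p)
        + Fraction(p * (N + 2) ** 2, n)
        - Fraction(2 * (k + p), n) * (n - k - p - 1) * p * (N - 2 * p)
    )
    return lead, curv


def top_slot_target(n, k, p):
    """Coefficients of -(p+1)(n-2k-2(p+1)) delta P_k^p[d delta] f."""
    N = n - 2 * k
    lead = Fraction(-(p + 1) * (N - 2 * (p + 1)))
    # P_k^p = (d delta + c_p J) P_k^{p-1}, with c_p the i = p coefficient in (17).
    c_p = Fraction(2 * p * (N - p + 1), n)
    return lead, lead * c_p


def induction_step_holds(max_n=16):
    """Compare both sides for all 1 <= k, p < n and 3 <= n <= max_n."""
    return all(
        top_slot_sum(n, k, p) == top_slot_target(n, k, p)
        for n in range(3, max_n + 1)
        for k in range(1, n)
        for p in range(1, n)
    )
```

The comparison uses exact rational arithmetic, so agreement on the sampled range is not affected by rounding.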

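Now let us assume that the dimension $n$ is even and $k \in \{1, \ldots, \frac{n}{2}\}$. Then putting $p = \frac{n-2k}{2} \geq 0$ in Theorem 4.1, we obtain

$$(\mathbb{L}_k f)_{\mathbf{A}} = \frac{2p}{k}\, \mathbb{Z}_{\mathbf{A}}^{\mathbf{a}} \big[\delta d P_{k+1}^{p-1}[\delta d] f\big]_{\mathbf{a}} + \mathbb{X}_{\mathbf{A}}^{\dot{\mathbf{a}}} \big[\delta P_k^{p}[d\delta] f\big]_{\dot{\mathbf{a}}} \tag{19}$$

for $f_{\mathbf{a}} \in \mathcal{E}_{\mathbf{a}}$. It follows from Proposition 7.1 that $\mathbb{L}_k$ acting on $\mathcal{E}_{\mathbf{a}}$ agrees with the operator $\mathbb{L}_k$ defined in [7] up to a nonzero scalar multiple. Thus we have the following results for the operators $G_k^\sigma$, $Q_k^\sigma$ and $L_k$ from [7].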
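The passage from Theorem 4.1 to (19) rests on three coefficient facts at $p = \frac{n-2k}{2}$: the $\mathbb{Y}$ coefficient $-p(n-2k-2p)$ vanishes, the $\mathbb{Z}$ slot keeps only the $\delta d$ term with coefficient $\frac{n-2k}{k} = \frac{2p}{k}$, and the extra factor $d\delta + \frac{p(n-2k+2)}{n}J$ in the $\mathbb{X}$ slot is the $i = p$ factor of (17). A minimal sketch of this bookkeeping follows; the function and key names are illustrative assumptions.

```python
from fractions import Fraction


def product_coefficient(n, k, i):
    """Coefficient of J in the i-th factor of P_k^p[E], as in (17)."""
    return Fraction(2 * i * (n - 2 * k - i + 1), n)


def critical_slots(n, k):
    """Scalar coefficients of Theorem 4.1 evaluated at p = (n - 2k)/2, p >= 1."""
    if (n - 2 * k) % 2 or n - 2 * k < 2 or k < 1:
        raise ValueError("need n - 2k even and positive, and k >= 1")
    p = (n - 2 * k) // 2
    return {
        "p": p,
        "Y": -p * (n - 2 * k - 2 * p),                 # top slot coefficient
        "Z_ddelta": Fraction(n - 2 * k - 2 * p, k),     # coefficient of d delta P_k^{p-1}
        "Z_deltad": Fraction(n - 2 * k, k),             # coefficient of delta d P_{k+1}^{p-1}
        "X_factor": Fraction(p * (n - 2 * k + 2), n),   # J-coefficient of the extra X factor
    }
```

For instance, `critical_slots(6, 1)` has `p = 2`, a vanishing `Y` entry, and `X_factor` equal to `product_coefficient(6, 1, 2)`.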


Theorem 4.2. In an Einstein scale $\sigma$ we have, up to a nonzero scalar multiple, the formula

$$\delta \prod_{i=1}^{\frac{n-2k}{2}} \Big(d\delta + \frac{2i(n-2k-i+1)}{n} J\Big) \tag{20}$$

for the operator $G_k^\sigma$ of [7], $G_k^\sigma : \mathcal{E}^k \to \mathcal{E}_{k-1}$. As an operator on closed $k$-forms, up to a nonzero scalar multiple, we have:

$$Q_k^\sigma = \prod_{i=1}^{\frac{n-2k}{2}} \Big(d\delta + \frac{2i(n-2k-i+1)}{n} J\Big). \tag{21}$$

Note that at the $k = n/2$ extreme these are taken to mean $G_{n/2}^\sigma = \delta$ and $Q_{n/2}^\sigma = \mathrm{id}$.

Proof. Note that setting $p = \frac{n-2k}{2}$ in Theorem 4.1, the coefficient of $\mathbb{Y}$ vanishes in the display and the left-hand-side of the display is the operator $\mathbb{L}_k$ of [7]. The coefficient of $\mathbb{X}$ is thus $G_k^\sigma$ as in Theorem 4.5 of [7]. This gives the formula presented. Here we have used the fact that the factor $d\delta + \frac{p(n-2k+2)}{n} J$, with $p = \frac{n-2k}{2}$, of the coefficient of $\mathbb{X}$ appears as a composition factor in $P_k^p[d\delta]$ (the factor with $i = p$ in (17)). Now by Theorem 2.8 of [7] and its proof we have that $G_k^\sigma = \delta Q_k^\sigma$ and $Q_k^\sigma$ is formally self-adjoint. Thus the formal adjoint $G_k^{\sigma,*}$ of $G_k^\sigma$ is $Q_k^\sigma d : \mathcal{E}^{k-1} \to \mathcal{E}_k$. On the other hand since $Q_k^\sigma$ is a differential operator it follows from this that $G_k^{\sigma,*}$ determines the given formula for $Q_k^\sigma$ as an operator on closed forms.

Observe that from (21) we have immediately the following useful observation.

Corollary 4.3. On Einstein manifolds and in an Einstein scale $Q_k^\sigma : \mathcal{C}^k \to \mathcal{C}^k$.

The operator $G_k^\sigma$ acting on closed forms will be denoted by $G_k : \mathcal{C}^k \to \mathcal{E}_{k-1}$. It follows from (19) that $G_k$ is conformally invariant (recall it is defined as a projection to the $\mathbb{X}$-slot of (19)).

Corollary 4.4. In an Einstein scale $\sigma$ the conformally invariant detour operator $L_k : \mathcal{E}^k \to \mathcal{E}_k$ is given, up to a nonzero scalar multiple, by

$$L_k = \delta \prod_{i=1}^{\frac{n-2k-2}{2}} \Big(d\delta + \frac{2i(n-2k-i-1)}{n} J\Big)\, d$$

for $k < n/2$. Moreover we set $L_{n/2} = 0$.

Proof. On a general manifold we have $L_k = \delta Q_{k+1}^\sigma d$ from Theorem 2.8 of [7]. Hence the statement follows from the formula for $Q_{k+1}^\sigma$ from (21).


Remark 4.5. Observe that the operators $L_k$, and the fact that they have the form $\delta M d$, may be extracted directly from expression (19). Thus, in this conformally Einstein setting, we obtain detour complexes as in (1) from (19). Of course these were developed on general conformal manifolds in [7] but the construction in the conformally Einstein setting here is independent of [7]. Proposition 7.1 is only used here to verify that it is (the specialisation of) the same complex and also to make the connection to the Q-operators from [7].

We have constructed the operator $\mathbb{L}_k$ (thus also $G_k^\sigma$, $Q_k^\sigma$ and $L_k$) for $k \in \{1, \ldots, \frac{n}{2}\}$ as we assumed the latter range in (19). However formulae (20) for $G_k^\sigma$, (21) for $Q_k^\sigma$ and Corollary 4.4 for $L_k$ make sense also for $k = 0$ [24]; the operators $Q_0^\sigma$ and $L_0$ formally agree, up to a nonzero scalar multiple, with the corresponding operators $Q^g$ and $\Box_{n/2}$ from [24]. Further we put $\mathbb{L}_0 := L_0$. Here and below $g = \sigma^{-2}\boldsymbol{g}$ is the Einstein metric. So henceforth we shall assume $k \in \{0, \ldots, \frac{n}{2}\}$.

5. Decompositions of the Conformal Spaces

We work on a general (possibly noncompact) conformally Einstein even dimensional manifold $(M, [g])$ of signature $(p, q)$. As usual we write $\sigma$ to denote an Einstein scale. If not stated otherwise, we assume $k \in \{0, \ldots, \frac{n}{2}\}$ and we put $\mathcal{E}_{-1} := 0$. The space $\mathcal{H}_G^k = \mathcal{N}(G_k : \mathcal{C}^k \longrightarrow \mathcal{E}_{k-1})$, where $G_k := \delta Q_k^\sigma$, is conformally invariant [7], and we shall term it the space of conformal harmonics. We shall describe $\mathcal{H}_G^k$ in more detail. As mentioned in the Introduction, we will use the notation

$$\mathcal{H}^k_{\sigma,\lambda} := \{ f \in \mathcal{E}^k \mid d\delta f = \lambda f \}, \qquad \widetilde{\mathcal{H}}^k_{\sigma,\lambda} := \{ f \in \mathcal{E}^k \mid \delta d f = \lambda f \}$$

and $\mathcal{H}^k_\sigma := \mathcal{N}(d) \cap \mathcal{N}(\delta)$, where $\lambda \in \mathbb{R}$. Then $\mathcal{H}^k_\sigma \subseteq \mathcal{H}^k_{\sigma,0} \cap \widetilde{\mathcal{H}}^k_{\sigma,0}$. Note that if $\lambda \neq 0$ then $\mathcal{H}^k_{\sigma,\lambda} \subset \mathcal{R}(d)$, and similarly $\widetilde{\mathcal{H}}^k_{\sigma,\lambda} \subset \mathcal{R}(\delta)$. As well as $\mathcal{H}_G^k$, we shall also study the null spaces of the operators $G_k$ and $L_k$. Our treatment relies on the following observation:

Theorem 5.1. Let $V$ be a vector space over a field $\mathbb{F}$. Suppose that $E$ is a linear endomorphism on $V$, and $P = P[E] : V \to V$ is a linear operator polynomial in $E$ which factors as $P[E] = (E - \lambda_1) \cdots (E - \lambda_p)$, where the scalars $\lambda_1, \ldots, \lambda_p \in \mathbb{F}$ are mutually distinct. Then the solution space $V_P$, for $P$, admits a canonical and unique direct sum decomposition

$$V_P = \bigoplus_{i=1}^{p} V_{\lambda_i}, \tag{22}$$

where, for each $i$ in the sum, $V_{\lambda_i}$ is the solution space for $E - \lambda_i$. The projection $\mathrm{Proj}_i : V_P \to V_{\lambda_i}$ is given by the formula

$$\mathrm{Proj}_i = y_i \prod_{\substack{j=1 \\ j \neq i}}^{p} (E - \lambda_j) \quad \text{where} \quad y_i = \prod_{\substack{j=1 \\ j \neq i}}^{p} \frac{1}{\lambda_i - \lambda_j}.$$
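The projection formula of Theorem 5.1 can be illustrated in finite dimensions. The sketch below is a toy example chosen for illustration, not taken from the paper: $E$ is a $3 \times 3$ upper triangular matrix with distinct eigenvalues $1, 2, 4$, so that $P[E] = (E-1)(E-2)(E-4) = 0$ and $V_P$ is all of $\mathbb{Q}^3$. Each $\mathrm{Proj}_i$ is assembled as the product over $j \neq i$ of $(E - \lambda_j)/(\lambda_i - \lambda_j)$; helper names are hypothetical.

```python
from fractions import Fraction


def identity(m):
    """m x m identity matrix with exact rational entries."""
    return [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(size)) for c in range(size)]
            for r in range(size)]


def shifted(E, lam):
    """Return the matrix E - lam * id."""
    size = len(E)
    return [[E[r][c] - (lam if r == c else 0) for c in range(size)]
            for r in range(size)]


def projections(E, lambdas):
    """Proj_i = prod_{j != i} (E - lambda_j) / (lambda_i - lambda_j)."""
    projs = []
    for i, lam_i in enumerate(lambdas):
        proj = identity(len(E))
        for j, lam_j in enumerate(lambdas):
            if j == i:
                continue
            y = Fraction(1) / (lam_i - lam_j)
            proj = [[y * x for x in row] for row in matmul(proj, shifted(E, lam_j))]
        projs.append(proj)
    return projs


# Toy endomorphism with mutually distinct eigenvalues 1, 2, 4 (illustrative choice).
E = [[1, 1, 0], [0, 2, 1], [0, 0, 4]]
LAMBDAS = [1, 2, 4]
PROJS = projections(E, LAMBDAS)
```

In this example the matrices in `PROJS` sum to the identity, each one is idempotent, and each is annihilated by the corresponding `shifted(E, lam)`, which is the finite-dimensional content of the decomposition (22).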


This is a special case of Theorem 1.1 from [30]. To use this result in our setting we need the following result.

Lemma 5.2. The constants

$$-\frac{2i(n-2k-i+1)}{n}, \quad i \in \mathbb{N},\ k \in \{0, \ldots, n-1\},$$

are mutually distinct and negative for $i = 1, \ldots, \frac{n-2k}{2}$.

Proof. Assume that the scalar $2i(n-2k-i+1) - 2j(n-2k-j+1) = 2(i-j)[n-2k-(i+j)+1]$ is equal to zero for some $i, j = 1, \ldots, \frac{n-2k}{2}$. This can happen only if $i = j$ or $n-2k-(i+j)+1 = 0$. But the latter possibility cannot happen as $i, j \leq \frac{n-2k}{2}$ means $i + j \leq n - 2k$. Thus the discussed scalars are mutually distinct. The scalars are negative since for the ranges considered $i \geq 1$, and $2i \leq n - 2k$ implies that $i + 2k \leq n - i < n + 1$.

Definition. We define the scalars:

$$\lambda_i^k := -\frac{2i(n-2k-i+1)}{n} J, \quad i \in \mathbb{N},\ k \in \{0, \ldots, n-1\}, \tag{23}$$

where, recall, $J$ is the trace of the Schouten tensor. So on Einstein manifolds these scalars are constant and, if $J \neq 0$ then these are non-zero and mutually distinct.

Proposition 5.3. Let $(M, g)$ be an Einstein manifold which is not Ricci flat. We will use the scalars $\lambda_i^k$ from (23) and put $p = \frac{n-2k}{2}$. The null space of the conformally invariant operator $L_k : \mathcal{E}^k \to \mathcal{E}_k$ defined in Corollary 4.4 is

$$\mathcal{N}(L_k) = \widetilde{\mathcal{H}}^k_{\sigma,0} \oplus \bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}}, \quad k \in \{0, \ldots, \tfrac{n}{2} - 1\},$$

and $\mathcal{N}(L_{n/2}) = \mathcal{E}^{n/2} = \mathcal{E}_{n/2}$.

Proof. The case $k = n/2$ is obvious so assume $k \in \{0, \ldots, \frac{n}{2} - 1\}$. Since $L_k = \delta P_{k+1}^{p-1}[d\delta]\, d$ according to Corollary 4.4 and $P_{k+1}^{p-1}[d\delta]\, d = d\, P_{k+1}^{p-1}[\delta d]$ we get $L_k = \delta d\, P_{k+1}^{p-1}[\delta d]$. Now the proposition follows from Theorem 5.1 for $E = \delta d$ and Lemma 5.2.

Note that $\mathcal{C}^k \subseteq \widetilde{\mathcal{H}}^k_{\sigma,0}$, $\mathcal{C}^k \cap \bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}} = \{0\}$, and $\bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}} \subseteq \mathcal{R}(\delta)$ for $k \neq \frac{n}{2}$.
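Lemma 5.2 is what allows Theorem 5.1 to be applied in Proposition 5.3: the constants $-\frac{2i(n-2k-i+1)}{n}$ for $i = 1, \ldots, \frac{n-2k}{2}$ must be pairwise distinct and negative. A brute-force check over small even dimensions can be sketched as follows; the function names are illustrative and the check is no substitute for the proof above.

```python
from fractions import Fraction


def lemma_constants(n, k):
    """Constants -2i(n-2k-i+1)/n for i = 1, ..., (n-2k)/2 (n even, 0 <= k <= n/2)."""
    p = (n - 2 * k) // 2
    return [Fraction(-2 * i * (n - 2 * k - i + 1), n) for i in range(1, p + 1)]


def distinct_and_negative(n, k):
    """True when the constants are pairwise distinct and all negative."""
    constants = lemma_constants(n, k)
    return len(set(constants)) == len(constants) and all(c < 0 for c in constants)


def lemma_holds_up_to(max_n=40):
    """Check every even n <= max_n and every k in {0, ..., n/2}."""
    return all(
        distinct_and_negative(n, k)
        for n in range(2, max_n + 1, 2)
        for k in range(0, n // 2 + 1)
    )
```

Multiplying these constants by a nonzero constant $J$ gives the scalars $\lambda_i^k$ of (23), which are then the distinct nonzero roots used in the decomposition.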

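Lemma 5.4. Let $(M, g)$ be an Einstein manifold which is not Ricci flat. In the Einstein scale $\sigma$, the null space of the operator $G_k^\sigma : \mathcal{E}^k \to \mathcal{E}_{k-1}$ given by (20) is

$$\mathcal{N}(G_k^\sigma) = \mathcal{N}(\delta) \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}, \tag{24}$$

where the scalars $\lambda_i^k$ are from (23) and $p = \frac{n-2k}{2}$.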


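Proof. Observe $G_k^\sigma = \delta P_k^p[d\delta] = P_k^p[\delta d]\, \delta$. We will work in the Einstein scale $\sigma$ throughout the proof. The case $k = \frac{n}{2}$ is obvious so we assume $k < \frac{n}{2}$. Let us start with the inclusion $\supseteq$. Clearly $\mathcal{N}(\delta) \subseteq \mathcal{N}(G_k^\sigma)$. Further suppose that $f \in \mathcal{H}^k_{\sigma,\lambda_j^k}$ for some $j \in \{1, \ldots, \frac{n-2k}{2}\}$. This means $d\delta f = \lambda_j^k f$ using the definition of $\mathcal{H}^k_{\sigma,\lambda_j^k}$. The composition factors in (21), which yield the formula for $P_k^p[d\delta]$, commute and one of these factors is $d\delta - \lambda_j^k$. Hence $P_k^p[d\delta] f = 0$ which means $f \in \mathcal{N}(G_k^\sigma)$.

Now we discuss the inclusion $\mathcal{N}(G_k^\sigma) \subseteq \mathcal{N}(\delta) \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}$. From Lemma 5.2 it follows that the $\lambda_i^k$ for $i = 1, \ldots, p$ are mutually distinct. First observe $\mathcal{N}(G_k^\sigma) \subseteq \mathcal{N}(d\delta P_k^p[d\delta] : \mathcal{E}^k \to \mathcal{E}^k)$ since $G_k^\sigma = \delta P_k^p[d\delta]$. It follows from Theorem 5.1 (where we put $E = d\delta$) that $f \in \mathcal{N}(d\delta P_k^p[d\delta])$ can be uniquely written in the form $f = \bar{f} + \sum_{i=1}^{p} f_i$, where

$$\bar{f} = \bar{y} \Big[\prod_{j=1}^{p} (d\delta - \lambda_j^k)\Big] f \quad \text{and} \quad f_i = y_i\, d\delta \Big[\prod_{\substack{j=1 \\ j \neq i}}^{p} (d\delta - \lambda_j^k)\Big] f, \quad i = 1, \ldots, p,$$

where $\bar{y}$ and $y_i$ are appropriate scalars, and this decomposition satisfies $d\delta f_i = \lambda_i^k f_i$ for $i = 1, \ldots, p$ and $d\delta \bar{f} = 0$. Note the last display means $\bar{f} = \bar{y} P_k^p[d\delta] f$.

Now consider this decomposition for $f \in \mathcal{N}(G_k^\sigma)$. The condition $d\delta f_i = \lambda_i^k f_i$ means $f_i \in \mathcal{H}^k_{\sigma,\lambda_i^k}$. Further applying $\delta$ to $\bar{f} = \bar{y} P_k^p[d\delta] f$, we obtain $\delta \bar{f} = \bar{y}\, \delta P_k^p[d\delta] f = \bar{y}\, G_k^\sigma f = 0$. Hence $\bar{f} \in \mathcal{N}(\delta)$.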

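Theorem 5.5. Let $(M, g)$ be an Einstein manifold which is not Ricci flat. We use $\lambda_i^k$ to denote the scalars from (23) and put $p = \frac{n-2k}{2}$. The conformally invariant space $\mathcal{H}_G^k = \mathcal{N}(G_k : \mathcal{C}^k \longrightarrow \mathcal{E}_{k-1})$ is given by the direct sum

$$\mathcal{H}_G^k = \mathcal{H}^k_\sigma \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}. \tag{25}$$

Proof. By the definition, $\mathcal{H}_G^k$ is equal to the intersection $\mathcal{N}(G_k^\sigma) \cap \mathcal{N}(d)$. Since $\mathcal{N}(G_k^\sigma) = \mathcal{N}(\delta) \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}$ according to Lemma 5.4, and $\mathcal{H}^k_{\sigma,\lambda_i^k} \subseteq \mathcal{R}(d) \subseteq \mathcal{N}(d)$, we obtain $\mathcal{H}_G^k = (\mathcal{N}(\delta) \cap \mathcal{N}(d)) \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}$.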

As discussed in the Introduction, a conformally invariant cohomology space may be defined: $H_L^k := \mathcal{N}(L_k)/\mathcal{R}(d)$, where of course $d$ means $d : \mathcal{E}^{k-1} \to \mathcal{E}^k$, as is clear by context. From the definitions of the various spaces it follows automatically that this fits into the complex (2) of [7]. Here and below we put $H^{-1} := 0$ and $H_L^{-1} := 0$. Let us work on a non Ricci-flat Einstein manifold. To discuss (2) we note that we have mappings

$$\mathcal{H}^k_{\sigma,\lambda} \xrightarrow{\ \delta\ } \widetilde{\mathcal{H}}^{k-1}_{\sigma,\lambda} \quad \text{and} \quad \widetilde{\mathcal{H}}^{k-1}_{\sigma,\lambda} \xrightarrow{\ d\ } \mathcal{H}^k_{\sigma,\lambda}. \tag{26}$$

In the case $\lambda \neq 0$ this is a bijective correspondence as $f \mapsto \delta f \mapsto d\delta f = \lambda f$ for $f \in \mathcal{H}^k_{\sigma,\lambda}$ and similarly for the opposite direction. In particular

$$d : \bigoplus_{i=1}^{\frac{n-2k}{2}} \widetilde{\mathcal{H}}^{k-1}_{\sigma,\lambda_i^k} \to \bigoplus_{i=1}^{\frac{n-2k}{2}} \mathcal{H}^k_{\sigma,\lambda_i^k}$$

is a bijection. Using this it is easily verified that our results above are consistent with (2) in the sense that from Proposition 5.3 and (25) one verifies that (2) is exact at $\mathcal{H}_G^k$. Also since $\bigoplus_{i=1}^{\frac{n-2k}{2}} \widetilde{\mathcal{H}}^{k-1}_{\sigma,\lambda_i^k}$ intersects trivially with $\mathcal{R}(d)$ it follows that, in the Einstein scale $\sigma$,

$$H_L^{k-1} = \bigoplus_{i=1}^{\frac{n-2k}{2}} \widetilde{\mathcal{H}}^{k-1}_{\sigma,\lambda_i^k} \oplus \big(\widetilde{\mathcal{H}}^{k-1}_{\sigma,0}/\mathcal{R}(d)\big). \tag{27}$$

The spaces $H_L^*$ also contribute to another exact complex on even conformal manifolds,

$$0 \to H^{k-1} \to H_L^{k-1} \xrightarrow{\ d\ } \mathcal{H}_{\mathbb{L}}^k \to H_L^k, \quad k \in \{0, 1, \ldots, n/2\} \tag{28}$$

which is a tautological consequence of the definition: $\mathcal{H}_{\mathbb{L}}^k = \mathcal{N}(\mathbb{L}_k)$. In analogy with (2), surjectivity of the last map is not known in general. This motivates studying $\mathcal{N}(\mathbb{L}_k)$. From the results for the null spaces of $L_k$ and $G_k$ we have the following.

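Corollary 5.6. The null space $\mathcal{N}(\mathbb{L}_k)$, of $\mathbb{L}_k$, is conformally invariant. On $(M, g)$, an Einstein manifold which is not Ricci flat, $\mathcal{N}(\mathbb{L}_k)$ is given by the direct sum:

$$\mathcal{N}(\mathbb{L}_k) = (\mathcal{N}(\delta) \cap \mathcal{N}(\delta d)) \oplus \bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}} \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}, \quad 0 \leq k \leq \frac{n}{2} - 1,$$

where the scalars $\lambda_i^k$ are given in (23) and $p = \frac{n-2k}{2}$. Further $\mathcal{N}(\mathbb{L}_{n/2}) = \mathcal{N}(\delta)$.

Proof. The case $k = 0$ follows from Proposition 5.3 and the case $k = \frac{n}{2}$ is obvious. Assume $1 \leq k \leq \frac{n}{2} - 1$. The conformal invariance follows from the conformal invariance of $\mathbb{L}_k$. In the scale $\sigma$, we have $\mathcal{N}(\mathbb{L}_k) = \mathcal{N}(L_k) \cap \mathcal{N}(G_k^\sigma)$ hence we need the intersection of the direct sums in the displays of Proposition 5.3 and Lemma 5.4. Since $\mathcal{H}^k_{\sigma,\lambda_i^k} \subseteq \mathcal{N}(\delta d) = \widetilde{\mathcal{H}}^k_{\sigma,0}$ (by the definition of these spaces) for $i = 1, \ldots, p$, we obtain

$$\mathcal{N}(L_k) \cap \mathcal{N}(G_k^\sigma) = \Big[\mathcal{N}(\delta) \cap \Big(\widetilde{\mathcal{H}}^k_{\sigma,0} \oplus \bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}}\Big)\Big] \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}.$$

Since similarly $\widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}} \subseteq \mathcal{N}(\delta)$ for $i = 1, \ldots, p-1$, the statement follows.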

Returning to the sequence (28), exactness is easily verified using Corollary 5.6 and the bijections (26). Note the direct summand N (δ) ∩ N (δd) from the previous corollary satisfies N (δ) ∩ N (d) ⊆ N (δ) ∩ N (δd) ⊆ N (dδ + δd).

(29)


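Remark. We can also study the space $\mathcal{N}(Q_k^\sigma : \mathcal{C}^k \to \mathcal{E}_k)$. A direct application of Theorem 5.1 shows that

$$\mathcal{N}(Q_k^\sigma : \mathcal{C}^k \to \mathcal{E}_k) = \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k} \subseteq \mathcal{R}(d),$$

on $(M, g)$ an Einstein manifold which is not Ricci flat and with notation as above.

Let us, as usual, assume that $(M, g)$ is Einstein and not Ricci flat. The operator $Q_k^\sigma$ simplifies on $\mathcal{H}_G^k$. Considering (25), observe that $Q_k^\sigma$ vanishes on $\mathcal{H}^k_{\sigma,\lambda_i^k} \subseteq \mathcal{H}_G^k$, for each of the nonzero scalars $\lambda_i^k$; this is because the composition factor $d\delta - \lambda_i^k$ of $Q_k^\sigma$ vanishes on $\mathcal{H}^k_{\sigma,\lambda_i^k}$. Further, $Q_k^\sigma$ is a multiple of the identity on $\mathcal{H}^k_\sigma$ because $\delta$ vanishes on $\mathcal{H}^k_\sigma$. Using (25), we summarise this.

Proposition 5.7. Let $(M, g)$ be an Einstein manifold which is not Ricci flat. The restriction of $Q_k^\sigma : \mathcal{C}^k \to \mathcal{E}_k$ to the conformal harmonics $\mathcal{H}_G^k$ is given in the Einstein scale $\sigma$ as follows:

$$Q_k^\sigma\big|_{\mathcal{H}^k_{\sigma,\lambda_i^k}} = 0 \quad \text{and} \quad Q_k^\sigma\big|_{\mathcal{H}^k_\sigma} = s^k J^{(n-2k)/2}\, \mathrm{id}, \quad \text{where} \quad s^k = \prod_{i=1}^{\frac{n-2k}{2}} \frac{2i(n-2k-i+1)}{n}.$$

Note the last display means $s^{n/2} = 1$.

The space $\mathcal{B}^k = \{ d f \mid Q_k^\sigma d f \in \mathcal{R}(\delta) \} \subseteq \mathcal{H}_G^k$ is conformally invariant and in [7] plays a role in studying $\mathcal{H}_G^k$. Clearly any $d f \in \mathcal{H}^k_{\sigma,\lambda_i^k}$, $i = 1, \ldots, p := \frac{n-2k}{2}$, satisfies $Q_k^\sigma d f = 0$, thus trivially $d f \in \mathcal{B}^k$. Therefore $\bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k} \subseteq \mathcal{B}^k$.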
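The constant $s^k$ of Proposition 5.7 is a finite product of rational numbers and is easy to tabulate. The following sketch computes it exactly; the function name `s_constant` is an illustrative choice, and the empty product at $k = n/2$ reproduces $s^{n/2} = 1$.

```python
from fractions import Fraction


def s_constant(n, k):
    """s^k = prod_{i=1}^{(n-2k)/2} 2i(n-2k-i+1)/n for even n and 0 <= k <= n/2."""
    if n % 2 or not 0 <= k <= n // 2:
        raise ValueError("expected even n and 0 <= k <= n/2")
    result = Fraction(1)
    for i in range(1, (n - 2 * k) // 2 + 1):
        result *= Fraction(2 * i * (n - 2 * k - i + 1), n)
    return result


# Small table of s^k in dimension n = 6 (k = 0, ..., 3).
TABLE_N6 = {k: s_constant(6, k) for k in range(4)}
```

Every factor is positive in this range, so $s^k > 0$, and the sign of $s^k J^{(n-2k)/2}$ on $\mathcal{H}^k_\sigma$ is governed by $J$ alone.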

6. Compact Conformally-Einstein Spaces

Recall that for $\varphi \in \mathcal{E}^k$, $\psi \in \mathcal{E}_k$, and $(M, [g])$ compact of signature $(p, q)$, there is the natural conformally invariant global pairing

$$\varphi, \psi \mapsto \langle \varphi, \psi \rangle := \int_M \varphi \cdot \psi\, d\mu_{\boldsymbol{g}}, \tag{30}$$

where $\varphi \cdot \psi \in \mathcal{E}[-n]$ denotes a complete contraction between $\varphi$ and $\psi$. When $M$ is orientable we have

$$\langle \varphi, \psi \rangle = \int_M \varphi \wedge \star \psi,$$

where $\star$ is the conformal Hodge star operator. This pairing combines with the operator $Q_k^\sigma$ to yield other global pairings. For example on compact pseudo-Riemannian manifolds, there is a conformally invariant pairing between $\mathcal{N}(L_k)$ and $\mathcal{C}^k$ given by

$$(u, w) \mapsto \langle u, Q_k w \rangle, \quad k = 0, 1, \ldots, n/2 \tag{31}$$


for w ∈ C k and u ∈ N (L k ) [7, Theorem 2.9,(ii)]. We want to examine this in the Einstein setting. The case k = n/2 just recovers (30) and so we focus on the remaining cases. We assume that (M, g) is even dimensional, Einstein and not Ricci-flat. Consider u, Q k w with w ∈ C k , u ∈ N (L k ), k ∈ {0, · · · n/2 − 1}. By Proposition 5.3 u decomk k+1 . Now u 1 = δu k and u 1 ∈ ⊕ p−1 H poses directly: u = u 0 + u 1 , where u 0 ∈ H σ,0 i=1 σ,λi

for some (k + 1)-form u so, integrating by parts,

u 1 , Q σk w = u , d Q σk w. But using Corollary 4.3 we have d Q σk w = 0. We summarise this simplification of the pairing. Lemma 6.1. On an even, Einstein, and non Ricci-flat, compact manifold (M, g = σ −2 g) k × C k. the pairing on N (L k ) × C k descends to H σ,0 k is the null space of the Laplacian. Note that for k = 0, H σ,0 Next, on compact pseudo-Riemannian manifolds, we recall that Q σk also gives a conformally invariant quadratic form on HkG [8]; this is given by (31) with now u, w ∈ HkG . We write this as ˜ : HkG × HkG → R.

(32)

By Proposition 5.7, this specialises as follows: Proposition 6.2. On non Ricci-flat compact even Einstein manifolds (M, g) the qua˜ : Hk × Hk → R descends to dratic form G G Hσk × Hσk → R

k ∈ {0, 1, . . . , n/2}

given by (u, w) → s k J

n−2k 2

u, w,

where the constant s k is given in Proposition 5.7. 6.1. Compact Riemannian spaces. We now assume (M, g) is a compact Einstein manifold of Riemannian signature. As above we relate g to σ ∈ E+ [1] by g = σ −2 g, we assume k ∈ {0, . . . , n2 } and we set E−1 := 0. Many results from the previous section simplify in this setting. In particular we may use the de Rham Hodge decomposition E k = R(d) ⊕ R(δ) ⊕ Hσk and Hσk is the usual space of de Rham harmonics, that is Hσk = N (dδ + δd) ∼ = H k . It also follows that the containments in (29) may be replaced by set equalities. Note also that, for example, N (δd) = N (d). Next observe that, since the operators δd and dδ are positive, we have the following from Lemma 5.2. Proposition 6.3. If (M, g) is a positive scalar curvature compact Riemannian Einstein manifold then

k k = 0 and Hkσ,λk = 0 H σ,λ i

for the λik as in (23).

i

k ∈ {0, . . . , n}


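Using (29) and the related observations we have the following specialisations of the results of Sect. 5.

Theorem 6.4. Let $(M, g)$ be a compact Riemannian Einstein manifold of even dimension. We have the exact sequences

$$0 \to H^{k-1} \to H_L^{k-1} \xrightarrow{\ d\ } \mathcal{H}_G^k \to H^k \to 0,$$

and

$$0 \to H^{k-1} \to H_L^{k-1} \xrightarrow{\ d\ } \mathcal{H}_{\mathbb{L}}^k \to H_L^k \to 0.$$

In particular $(M, [g])$ is $(k-1)$-regular for $k = 1, \ldots, n/2$. Assume $(M, g)$ is not Ricci-flat. With the scalars $\lambda_i^k$ as in (23) and $p = \frac{n-2k}{2}$ we have:

$$\mathcal{N}(L_k) = \mathcal{C}^k \oplus \bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}}, \quad k < n/2,$$

$$\mathcal{H}_G^k = \mathcal{H}^k_\sigma \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k},$$

$$\mathcal{N}(\mathbb{L}_k) = \mathcal{H}^k_\sigma \oplus \bigoplus_{i=1}^{p-1} \widetilde{\mathcal{H}}^k_{\sigma,\lambda_i^{k+1}} \oplus \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}, \quad k < n/2,$$

while trivially we have $\mathcal{N}(L_{n/2}) = \mathcal{E}^{n/2}$ and $\mathcal{N}(\mathbb{L}_{n/2}) = \mathcal{N}(\delta)$. In particular, in the case of positive scalar curvature: $\mathcal{N}(L_k) = \mathcal{C}^k$ and $\mathcal{N}(\mathbb{L}_k) = \mathcal{H}^k_\sigma$ for $k < n/2$, and $\mathcal{H}_G^k = \mathcal{H}^k_\sigma$. If $(M, g)$ is Ricci-flat then $\mathcal{N}(L_k) = \mathcal{C}^k$ and $\mathcal{H}_G^k = \mathcal{N}(\mathbb{L}_k) = \mathcal{H}^k_\sigma$ for $k < n/2$.

Note that in the non Ricci-flat case $\mathcal{H}_G^k$ is formally as in (25), but now we have $\mathcal{H}^k_\sigma \cong H^k$. The implications for the global pairings are as follows. The first statement in the theorem follows from Lemma 6.1, Theorem 6.4, and that $\widetilde{\mathcal{H}}^k_{\sigma,0} = \mathcal{C}^k$ in the compact Riemannian setting.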

Theorem 6.5. Let (M, g) be a compact Riemannian Einstein manifold of even dimension. The pairing on N (L k )×C k , by (u, w) → u, Q k w, with k = 0, 1, . . . , n/2 descends to ˜ from (32) yields a conformally invariant C k × C k . By Theorem 6.4, the quadratic form quadratic form H k × H k → R. In the Einstein scale this is given by Proposition 6.2 where Hσk are the usual harmonics for g. In the Ricci–flat case this quadratic form is zero for k < n/2 and recovers (30) for k = n/2. The last statement of the theorem uses expression (21) and Theorem 6.4.


Remark 6.6. Note that, for the case of $k = 0$ and $M$ connected, the first result of the theorem states that for $f$ in the null space of the dimension order GJMS operator (recall $L_0 = \Delta^{n/2}$ + lower order terms)

$$\int_M f Q = c \int_M Q,$$

where $c$ is a unique constant such that $c - f \in \mathcal{R}(\delta^\sigma)$. (Here we write $\delta^\sigma$ to emphasise that, although the display is conformally invariant, to write the difference $c - f$ as a divergence requires working in the Einstein scale.)

Corollary 6.7. Put $p := \frac{n-2k}{2}$. In an Einstein scale the space $\mathcal{B}^k$ is given as follows:

$$\mathcal{B}^k = \begin{cases} \bigoplus_{i=1}^{p} \mathcal{H}^k_{\sigma,\lambda_i^k}, & J \neq 0,\\[4pt] 0, & J = 0. \end{cases}$$

7. The Fefferman-Graham Ambient Metric Thus let us review briefly the basic relationship between the Fefferman-Graham ambient metric construction and tractor calculus as described in [11] for general conformal manifolds. Let π : Q → M be a conformal structure of signature ( p, q). Let us use ρ to denote the R+ action on Q given by ρ(s)(x, gx ) = (x, s 2 gx ). An ambient manifold is a smooth (n + 2)-manifold M˜ endowed with a free R+ –action ρ and an R+ –equivariant embedding ˜ We write X ∈ X( M) ˜ for the fundamental field generating the R+ –action, i : Q → M. ˜ and u ∈ M˜ we have X f (u) = (d/dt) f (ρ(et )u)|t=0 . that is for f ∈ C ∞ ( M) If i : Q → M˜ is an ambient manifold, then an ambient metric is a pseudo– Riemannian metric h of signature ( p + 1, q + 1) on M˜ such that the following conditions hold: (i) The metric h is homogeneous of degree 2 with respect to the R+ –action, i.e. if L X denotes the Lie derivative by X, then we have L X h = 2h. (I.e. X is a homothetic vector field for h.) (ii) For u = (x, gx ) ∈ Q and ξ, η ∈ Tu Q, we have h(i ∗ ξ, i ∗ η) = gx (π∗ ξ, π∗ η). To simplify the notation we will usually identify Q with its image in M˜ and suppress the embedding map i. To link the geometry of the ambient manifold to the underlying conformal structure on M one requires further conditions. In [20,21] Fefferman and Graham treat the construction of a formal power series solution, along Q, for the Goursat problem of finding an ambient metric h satisfying (i) and (ii) and the condition that it be Ricci flat, i.e. Ric(h) = 0. In even dimensions for a general conformal structure this is obstructed at finite order. However when the underlying conformal structure is (conformally) Einstein then an explicit Ricci-flat ambient metric is available [33–35]. (In fact also more generally a similar result is available for certain products of Einstein manifolds [25].) Here we shall use only the existence part of the Ricci-flat ambient metric. 
The uniqueness of the operators we will construct is a consequence of the fact that they can be uniquely expressed in terms of the underlying conformal structure as in [11,27]. It turns out that one may arrange that h is a metric satisfying the conditions above (i.e. (i) and (ii) and with h Ricci flat to the order possible) with Q := h(X, X) a defining


function for Q, and 2h(X, ·) = d Q to all orders in both odd and even dimensions. We write ∇ for the ambient Levi-Civita connection determined by h. We will use ˜ For example, if v B is a vector upper case abstract indices A, B, . . . for tensors on M. ˜ then the ambient Riemann tensor will be denoted R AB C D and defined by field on M, [∇ A , ∇ B ]v C = R AB C D v D. In this notation the ambient metric is denoted h AB and, with its inverse, this is used to raise and lower indices in the usual way. We will not normally distinguish tensors related in this way even in index free notation; the meaning should be clear from the context. Thus for example we shall use X to mean both the Euler vector field X A and the 1-form X A = h AB X B. ˜ Let E(w) denote the space of functions on M˜ which are homogeneous of degree ˜ w ∈ R with respect to the action ρ. That is f ∈ E(w) means that X f = w f . Similarly a tensor field F on M˜ is said to be homogeneous of degree w if ρ(s)∗ F = s w F or equiv˜ alently L X F = wF. Just as sections of E[w] are equivalent to functions in E(w)| Q we will see that the restriction of homogeneous tensor fields to Q have interpretations on M as weighted sections of tractor bundles [11,27]. On the ambient tangent bundle T M˜ we define an action of R+ by s · ξ := s −1 ρ(s)∗ ξ . The sections of T M˜ which are fixed by this action are those which are homogeneous of degree −1. Let us denote by T the space of such sections and write T (w) for sections ˜ ˜ in T ⊗ E(w), where the ⊗ here indicates a tensor product over E(0). Along Q the R+ ˜ Q )/R+ , is action on T M˜ is compatible with the R+ action on Q, so the quotient (T M| a rank n + 2 vector bundle over Q/R+ = M; in fact this is (up to isomorphism) the normal standard tractor bundle T (or E A ) [11,27] and the composition structure of T ˜ Q . Sections of T are equivalent to sections reflects the vertical subbundle T Q in T M| ˜ Q which are homogeneous of degree −1, that is sections of T |Q . 
Using this of T M| relationship one sees that the ambient metric h and the ambient connection ∇ descend to, respectively the tractor metric h, and the tractor connection ∇ T . For the metric this is obvious. We discuss the connection briefly. For U ∈ T , let U˜ be the corresponding section of T |Q . A tangent vector field ξ on M has a lift to a homogeneous degree 0 ˜ Q , which is everywhere tangent to Q. This is unique up to adding section ξ˜ , of T M| ˜ ˜ and ξ˜ smoothly and homogeneously to fields f X, where f ∈ E(0)| Q . We extend U ˜ ˜ on M. Then we can form ∇ξ˜ U ; along Q, this is clearly independent of the extensions. Since ∇X U˜ = 0, the section ∇ξ˜ U˜ is also independent of the choice of ξ˜ as a lift of ξ . ˜ Q and so Finally, the restriction of ∇ ˜ U˜ is a homogeneous degree −1 section of T M| ξ

determines a section of T which depends only on U and ξ . This is ∇ T U. Finally we will say that an ambient tensor F is homogeneous of weight w if ∇ X F = wF. The weight is a convenient shifting of homogeneity degree. Note, for example, that an ambient 1-form U˜ which is homogeneous of degree −1 is homogeneous of weight 0 and this means that ∇ X U˜ = 0.

7.1. The main result. In Sect. 4 several operators were defined on conformally Einstein manifolds directly using tractor calculus and the parallel tractor of the Einstein structure. On the other hand in [7] operators with the same notation were defined on general conformal manifolds via the Fefferman-Graham ambient metric, and its link to tractor calculus. The aim of this section is simply to show that these agree (up to a nonzero multiple).


Proposition 7.1. Assume n even and k ∈ {1, . . . , n2 }. On Einstein manifolds the operator Lk defined by (19) agrees with the operator with the same notation in [7]. The operators L k and G σk from Sect. 4 also agree with the operators of the same notation in [7]. Here “agree” means the operator is the same up to a non-zero multiple, and we will not pay attention to the detail of what this constant factor is. On the ambient manifold a special role is played by differential operators P on ambient tensor bundles which act tangentially along Q, in the sense that P Q = Q P for some operator P (or equivalently [P, Q] = Q P for some P ). Note that compositions of tangential operators are tangential. If tangential operators are homogeneous (i.e. the commutator with the Lie derivative L X recovers a constant multiple of the operator) then they descend to operators on M. An example of a tangential operator is given by / =: D / : T (w) → T ⊗ T (w − 1), (n + 2w − 2)∇ + X where T (w) indicates the space of sections, homogeneous of weight w, of some ambient tensor bundle, and / = − R

. Here we use the Laplacian := −∇ A ∇ A for compatibility with [7]. We will leave / is tangential to the reader, but note this also follows from the the verification that D result that (n + 2w − 2)∇ + X =: D : T (w) → T ⊗ T (w − 1) is tangential as discussed in [27,11]. Since this is tangential and homogeneous it descends to an operator on weighted tractors. In fact it gives the usual tractor-D operator [27,11]. The ambient R

similarly descends (in dimensions n = 4) to a multiple of W

, acting / : T (w) → T ⊗ T (w − 1) descends to on weighted tractor bundles [28]. Thus D D / : T (w) → T ⊗ T (w − 1) in dimensions other than 4. (Here T means the tractor / := D, bundle corresponding to T .) Henceforth for (M, [g]) of dimension 4 we take D rather than the definition above. Now if (M, [g]) is conformally Einstein and I a parallel tractor corresponding to an Einstein scale then along Q in M˜ we have a corresponding parallel vector field I. From the explicit formula for the ambient metric over an Einstein manifold ones sees that I ˜ (In fact when the Einstein scale is not Ricci flat extends to a parallel vector field on M. then the ambient metric is given as a product of the metric cone with a line.) We have (on T [w]) / A = (n + 2w − 2)I A ∇ A + σ /, I AD ˜ Note that σ is a homogeneous function on Q corresponding where σ = I A X A ∈ E(1). to σ = I A X A . Thus if we extend a tensor field U ∈ T (w)|Q off Q in such a way that I A ∇ A U = 0 (which implies U ∈ T (w)) then we get simply / A U = σ / U. I AD Note that I A ∇ A U = 0 can be achieved by starting with a section along Q and then extending off Q by parallel transport. The key point here is that I A X A is non-vanishing, at least in a neighbourhood of Q, and so I A ∇ A is not tangential to Q. Next observe that, since σ = I A X A and I A is parallel, we have ∇Aσ = I A,

which is parallel. Thus
\[
[\slashed{\Delta}, \sigma] = [\Delta, \sigma] = 2I^A\nabla_A, \tag{33}
\]
where we consider $\sigma$ as a multiplication operator. The following observations will be useful.

Lemma 7.2. If $R$ denotes the ambient curvature then $I^A\nabla_A R = 0$.

Proof. By the Bianchi identity $I^A\nabla_A R_{BCDE} + I^A\nabla_C R_{ABDE} + I^A\nabla_B R_{CADE} = 0$. But $I$ is parallel which implies that $[\nabla, I] = 0$ and $I^A R_{ABDE} = 0 = I^A R_{CADE}$. So the result follows.

Lemma 7.3. If $U$ is an ambient tensor such that $I^A\nabla_A U = 0$ then, for any $p \in \mathbb{N} \cup \{0\}$, $I^A\nabla_A(\slashed{\Delta}^p U) = 0$.

Proof. Clearly, acting on any ambient tensor, we have $[I^A\nabla_A, \nabla_B] = 0$. Thus $I^A\nabla_A$ commutes with the Bochner Laplacian $\Delta$. On the other hand by definition $\slashed{\Delta}$ differs from the Bochner $\Delta$ by a curvature action: $\slashed{\Delta} - \Delta = -R\sharp\sharp$, while from the previous lemma the ambient curvature is parallel along the flow of $I^A\nabla_A$.

The main technical result we need is this.

Proposition 7.4. For $f$ an ambient form homogeneous of weight $k - n/2$ we have
\[
(I^A\slashed{D}_A)^k f = \sigma^k\slashed{\Delta}^k f,
\]
along $Q$.

Proof. First note that both sides are tangential operators. For the right-hand-side this is proved in [7]. For the left-hand-side it holds simply because $\slashed{D}$ is tangential and $I$ is parallel on the ambient manifold. So neither side can depend on the transverse (to $Q$) derivatives of the homogeneous $f$. Now the result is true if $k = 1$. Also, calculating along $Q$,
\[
(I^A\slashed{D}_A)^k f = (I^B\slashed{D}_B)^{k-1} I^A\slashed{D}_A f,
\]
and so by induction
\[
(I^A\slashed{D}_A)^k f = \sigma^{k-1}\slashed{\Delta}^{k-1} I^A\slashed{D}_A f.
\]
Since the result is independent of transverse derivatives we may choose the extension off $Q$ to suit. Thus we assume without loss of generality that $I^A\nabla_A f = 0$. Then $I^A\slashed{D}_A f = \sigma\slashed{\Delta} f$ and so
\[
\sigma^{k-1}\slashed{\Delta}^{k-1}(I^A\slashed{D}_A) f = \sigma^{k-1}\slashed{\Delta}^{k-1}(\sigma\slashed{\Delta} f).
\]
So from (33) and Lemma 7.3 the result follows.
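The last step of this proof can be spelled out as follows. This is only a sketch: $\slashed{\Delta}$ is the slashed Laplacian of this section, $c$ stands for whatever constant appears in (33), and $h := \slashed{\Delta} f$ and the index $j$ are notation introduced here for illustration, with $f$ extended off $Q$ so that $I^A\nabla_A f = 0$.

\[
\begin{aligned}
[\slashed{\Delta}, \sigma] &= c\, I^A\nabla_A \quad \text{(by (33))},\\
\slashed{\Delta}^{j}(\sigma h) &= \sigma\slashed{\Delta}^{j} h + \sum_{i=0}^{j-1} \slashed{\Delta}^{i}\,[\slashed{\Delta}, \sigma]\,\slashed{\Delta}^{j-1-i} h
 = \sigma\slashed{\Delta}^{j} h + c\sum_{i=0}^{j-1} \slashed{\Delta}^{i}\, I^A\nabla_A\big(\slashed{\Delta}^{j-i} f\big),\\
I^A\nabla_A\big(\slashed{\Delta}^{j-i} f\big) &= 0 \quad \text{(Lemma 7.3)},\\
\sigma^{k-1}\slashed{\Delta}^{k-1}(\sigma\slashed{\Delta} f) &= \sigma^{k}\slashed{\Delta}^{k} f \quad \text{(taking } j = k-1\text{)}.
\end{aligned}
\]

Only the fact that the commutator in (33) is a multiple of $I^A\nabla_A$ is used, so the value of the constant plays no role.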

By Proposition 3.2 of [7], the operator $\slashed{\Delta}^m$ is homogeneous and acts tangentially on ambient differential forms of weight $m - n/2$. Thus it descends to an operator that we denote $\slashed{\Delta}_m$ on form-tractors of weight $m - n/2$. From the above proposition we obtain immediately the following results.


Corollary 7.5. On conformally Einstein manifolds $(M, [g])$ the invariant operator $\slashed{\Delta}_m : T^k[m - n/2] \to T^k[-m - n/2]$, $m \in \{0, 1, 2, \dots\}$, is formally self-adjoint and given by $\slashed{\Delta}_m = \sigma^{-m}(I^A\slashed{D}_A)^m$, where $\sigma^{-2}g$ is an Einstein metric on $M$ and $I = \frac{1}{n}D\sigma$. In odd dimensions these are natural operators. In even dimensions the same is true with the restrictions that either $m \le n/2 - 2$; or $m \le n/2 - 1$ and $k = 1$; or $m \le n/2$ and $k = 0$. In the conformally flat case the operators are natural with no restrictions on $m \in \{1, 2, \dots\}$.

Proof. The statements on naturality are extracted from [7]. It only remains to establish the claim that the operator is formally self-adjoint. But this is immediate from the formula for the right-hand-side from (16) because $I^A\slashed{D}_A$ = / σ according to (14).

Finally we are ready to prove the main result:

Proof of Proposition 7.1. By expression (40) from [7] and the fact that $\slashed{\Delta}_m$, as in Corollary 7.5, is formally self-adjoint we have that the operator $L_k$ from [7] is given by $L_k :=$ / $\iota(\slashed{D})\varepsilon(X)q_k$, where the notation is from that source. But it is a straightforward calculation to verify that, up to a non-zero multiple, $\iota(\slashed{D})\varepsilon(X)q_k$ is exactly the operator $M$ from (15). (See also [37, 2.1.2 and (2.8)] where the special case $k = 2$ is treated in detail.) So the result now follows from the corollary and (18), where $w = 0$ and $p = \frac{n-2k}{2}$.

Acknowledgements. The first author would like to thank the Royal Society of New Zealand for support via Marsden Grant no. 06-UOA-029. The second author was supported from Basic Research Center no. LC505 (Eduard Čech Center for Algebra and Geometry) of the Ministry of Education of Czech Republic. We are appreciative of the careful reading by the referee; this exposed a number of typographical errors in the original manuscript.

References

1. Bailey, T.N., Eastwood, M.G., Gover, A.R.: Thomas’s structure bundle for conformal, projective and related structures. Rocky Mountain J. Math. 24, 1191–1217 (1994)
2. Besse, A.L.: Einstein manifolds. Ergebnisse der Mathematik und ihrer Grenzgebiete (3) (Results in Mathematics and Related Areas (3)), 10. Berlin, Springer-Verlag, 1987
3. Branson, T.: Differential operators canonically associated to a conformal structure. Math. Scand. 57, 293–345 (1985)
4. Branson, T.P.: Group representations arising from Lorentz conformal geometry. J. Funct. Anal. 74, 199–291 (1987)
5. Branson, T.: Sharp inequalities, the functional determinant, and the complementary series. Trans. Amer. Math. Soc. 347, 3671–3742 (1995)
6. Branson, T., Gover, A.R.: Electromagnetism, metric deformations, ellipticity and gauge operators on conformal 4-manifolds. 8th International Conference on Differential Geometry and its Applications (Opava, 2001). Diff. Geom. Appl. 17(2–3), 229–249 (2002)
7. Branson, T., Gover, A.R.: Conformally invariant operators, differential forms, cohomology and a generalisation of Q-curvature. Commun. Part. Diff. Eqs. 30, 1611–1669 (2005)
8. Branson, T., Gover, A.R.: Pontrjagin forms and invariant objects related to the Q-curvature. Commun. Contemp. Math. 9, 335–358 (2007)
9. Čap, A., Gover, A.R.: Tractor calculi for parabolic geometries. Trans. Amer. Math. Soc. 354, 1511–1548 (2002)
10. Čap, A., Gover, A.R.: Tractor bundles for irreducible parabolic geometries. In: Global analysis and harmonic analysis (Marseille-Luminy, 1999), Sémin. Congr., 4, Paris: Soc. Math. France, 2000, pp. 129–154
11. Čap, A., Gover, A.R.: Standard tractors and the conformal ambient metric construction. Ann. Global Anal. Geom. 24(3), 231–259 (2003)
12. Cartan, E.: Les espaces à connexion conforme. Ann. Soc. Pol. Math. 2, 171–202 (1923)
13. Chang, S.-Y.A., Qing, J., Yang, P.: On the Chern-Gauss-Bonnet integral for conformal metrics on $\mathbb{R}^4$. Duke Math. J. 103, 523–544 (2000)
14. Chern, S.-S., Simons, J.: Characteristic forms and geometric invariants. Ann. Math. 99, 48–69 (1974)
15. de Rham, G.: Variétés différentiables. Formes, courants, formes harmoniques. Actualités Sci. Ind., no. 1222 = Publ. Inst. Math. Univ. Nancago III. Paris: Hermann et Cie, 1955
16. Djadli, Z., Malchiodi, A.: Existence of conformal metrics with constant Q-curvature. http://arxiv.org/list/math.AP/0410141, 2004
17. Eastwood, M.G., Singer, M.: A conformally invariant Maxwell gauge. Phys. Lett. 107A, 73–74 (1985)
18. Eastwood, M.G., Singer, M.: The Fröhlicher spectral sequence on a twistor space. J. Diff. Geom. 38, 653–669 (1993)
19. Fefferman, C.: Monge–Ampère equations, the Bergman kernel and geometry of pseudoconvex domains. Ann. of Math. 103, 395–416 (1976); Erratum 104, 393–394 (1976)
20. Fefferman, C., Graham, C.R.: Conformal invariants. The mathematical heritage of Élie Cartan (Lyon, 1984). Astérisque 1985, Numero Hors Série, 95–116 (1985)
21. Fefferman, C., Graham, C.R.: The ambient metric, http://arxiv.org/list/:0710.0919, 2007
22. Gover, A.R.: Aspects of parabolic invariant theory. In: The 18th Winter School “Geometry and Physics” (Srní 1998), Rend. Circ. Mat. Palermo (2) Suppl. No. 59, 1999, pp. 25–47
23. Gover, A.R.: Invariant theory and calculus for conformal geometries. Adv. Math. 163, 206–257 (2001)
24. Gover, A.R.: Laplacian operators and Q-curvature on conformally Einstein manifolds. Mathematische Annalen 336, 311–334 (2006)
25. Gover, A.R., Leitner, F.: A sub-product construction of Poincare-Einstein metrics, http://arxiv.org/list/math/0608044, 2006
26. Gover, A.R., Nurowski, P.: Obstructions to conformally Einstein metrics in n dimensions. J. Geom. Phys. 56, 450–484 (2006)
27. Gover, A.R., Peterson, L.J.: Conformally invariant powers of the Laplacian, Q-curvature, and tractor calculus. Commun. Math. Phys. 235, 339–378 (2003)
28. Gover, A.R., Peterson, L.J.: The ambient obstruction tensor and the conformal deformation complex. Pac. J. Math. 226, 309–351 (2006)
29. Gover, A.R., Šilhan, J.: Conformal Killing equation on forms – prolongations and applications. Diff. Geom. Appl. 26(3), 244–266 (2008)
30. Gover, A.R., Šilhan, J.: A decomposition theorem for linear operators; application to Einstein manifolds. http://arxiv.org/list/math.AC/0701377, 2007
31. Gover, A.R., Šilhan, J.: Commuting linear operators and algebraic decompositions. Archivum Mathematicum 43(5), 373–387 (2007)
32. Graham, C.R., Jenne, R., Mason, L.J., Sparling, G.A.: Conformally invariant powers of the Laplacian, I: Existence. J. London Math. Soc. 46, 557–565 (1992)
33. Graham, C.R., Hirachi, K.: The ambient obstruction tensor and Q-curvature. In: AdS/CFT correspondence: Einstein metrics and their conformal boundaries, IRMA Lect. Math. Theor. Phys., 8, Zürich: Eur. Math. Soc., 2005, pp. 59–71
34. Leistner, T.: Conformal holonomy of C-spaces, Ricci-flat, and Lorentzian manifolds. Diff. Geom. Appl. 24(5), 458–478 (2006)
35. Leitner, F.: Conformal Killing forms with normalisation condition. Rend. Circ. Mat. Palermo (2) Suppl. No. 75, 279–292 (2005)
36. Schoen, R.: Conformal deformation of a Riemannian metric to constant scalar curvature. J. Diff. Geom. 20, 479–495 (1984)
37. Šilhan, J.: Invariant differential operators in conformal geometry. PhD thesis, The University of Auckland, 2006

Communicated by G.W. Gibbons

Commun. Math. Phys. 284, 317–343 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0647-6

Communications in

Mathematical Physics

Optical Aharonov-Bohm Effect: An Inverse Hyperbolic Problems Approach

G. Eskin

Department of Mathematics, UCLA, Los Angeles, CA 90095-1555, USA. E-mail: [email protected]

Received: 25 September 2007 / Accepted: 8 July 2008
Published online: 18 September 2008 – © Springer-Verlag 2008

Abstract: We describe the general setting for the optical Aharonov-Bohm effect based on the inverse problem of identifying the coefficients of the governing hyperbolic equation from boundary measurements. We interpret the inverse problem result as a possibility in principle to detect the optical Aharonov-Bohm effect by boundary measurements.

1. Introduction

In this section we will review the quantum mechanical Aharonov-Bohm (AB) effect (cf. [AB,WY,OP,W,E4]). Let $\Omega$ be a smooth bounded domain in $\mathbb{R}^n$ having the form $\Omega = \Omega_0 \setminus \cup_{j=1}^m \overline{\Omega}_j$, where $\Omega_0$ is a simply-connected domain and $\Omega_j$, $1 \le j \le m$, are smooth domains called obstacles. We assume that $\overline{\Omega}_j \subset \Omega_0$ for $1 \le j \le m$, and $\overline{\Omega}_j \cap \overline{\Omega}_k = \emptyset$ when $j \neq k$, $1 \le j, k \le m$. Consider the stationary Schrödinger equation in $\Omega$ with magnetic potential $A(x) = (A_1(x), \dots, A_n(x))$ and electric potential $V(x)$:
\[
Hu \overset{\mathrm{def}}{=} \sum_{j=1}^n \left(-i\frac{\partial}{\partial x_j} - A_j(x)\right)^2 u(x) + V(x)u(x) = k^2 u(x), \tag{1.1}
\]
describing the nonrelativistic quantum electron in the classical electromagnetic field. We assume that
\[
u|_{\partial\Omega_j} = 0, \quad 1 \le j \le m, \tag{1.2}
\]
i.e. $\Omega_j$ are impenetrable for the electron, and
\[
u|_{\partial\Omega_0} = f(x'). \tag{1.3}
\]


Let $\Lambda(k)f$ be the Dirichlet-to-Neumann (DN) operator on $\partial\Omega_0$, i.e.
\[
\Lambda(k) f = \left.\left(\frac{\partial u}{\partial \nu} - i(A \cdot \nu)u\right)\right|_{\partial\Omega_0}, \tag{1.4}
\]
where $u(x)$ is the solution of (1.1), (1.2), (1.3) and $\nu$ is the unit outward normal vector at $x \in \partial\Omega_0$. Denote by $G(\overline{\Omega})$ the group of all complex-valued $C^\infty(\overline{\Omega})$ functions $c(x)$ in $\overline{\Omega}$ such that $|c(x)| = 1$. If $c(x) \in G(\overline{\Omega})$ and $u' = c^{-1}(x)u(x)$ then $u'$ satisfies the Schrödinger equation of the form (1.1) with $A(x), V(x)$ replaced by $A'(x), V'(x)$, where
\[
A'_j(x) = A_j(x) - i c^{-1}(x)\frac{\partial c}{\partial x_j}, \quad 1 \le j \le n, \qquad V'(x) = V(x). \tag{1.5}
\]
We shall call the electromagnetic potentials $A'(x), V'(x)$ and $A(x), V(x)$ gauge equivalent. We also call the DN operators $\Lambda(k)$ and $\Lambda'(k)$, corresponding to $A(x), V(x)$ and $A'(x), V'(x)$, respectively, gauge equivalent if there exists $c(x) \in G(\overline{\Omega})$ such that $\Lambda'(k) = c_0^{-1}\Lambda(k)c_0$, where $c_0$ is the restriction of $c(x)$ to $\partial\Omega_0$.

Let $B(x) = \operatorname{curl} A(x)$ or, equivalently, $B = dA$, where $A = \sum_{j=1}^n A_j(x)\,dx_j$, be the magnetic field in $\Omega$. It follows from (1.5) that $B(x) = B'(x)$ in $\Omega$ if $A(x)$ and $A'(x)$ are gauge equivalent. If $\Omega$ is simply-connected then the converse is true: $B(x) = B'(x)$ in $\Omega$ implies that $A(x)$ and $A'(x)$ are gauge equivalent. When $\Omega$ is not simply-connected this is no longer true. It was shown in the seminal paper of Aharonov and Bohm [AB] that if $\operatorname{curl} A' = \operatorname{curl} A = 0$, but $A'(x)$ and $A(x)$ belong to distinct gauge equivalence classes, they have a different physical impact that is detectable in experiments. This fact is called the Aharonov-Bohm effect.

An important description of gauge equivalence classes was given by Wu and Yang [WY]: Let $\gamma$ be any closed path in $\Omega$. It is easy to see that $A(x)$ and $A'(x)$ belong to the same gauge equivalence class iff
\[
\exp\left(i\int_\gamma A \cdot dx\right) = \exp\left(i\int_\gamma A' \cdot dx\right) \tag{1.6}
\]
for all paths $\gamma$ in $\Omega$, or, equivalently,
\[
\int_\gamma A \cdot dx - \int_\gamma A' \cdot dx = 2\pi p, \tag{1.7}
\]

where $p \in \mathbb{Z}$. In the original paper [AB] Aharonov and Bohm consider the case of one obstacle $\Omega_1$ in $\mathbb{R}^2$ and the magnetic field confined to $\Omega_1$. Then $\int_\gamma A \cdot dx = \alpha$ is the magnetic flux and $\alpha$ is independent of any simple path $\gamma$ encircling $\Omega_1$. The quantity $e^{i\alpha}$ that determines the gauge equivalence class of $A(x)$ was measured in this experiment. If $\alpha \neq 2\pi p$, $p \in \mathbb{Z}$, then the gauge equivalence class of $A(x)$ is nonzero despite the fact that $B = 0$ in $\Omega = \Omega_0 \setminus \overline{\Omega}_1$.

Consider now the case of several obstacles $\Omega_1, \dots, \Omega_m$. Suppose that the magnetic field is hidden inside each of these obstacles. Let $\alpha_k = \int_{\gamma_k} A \cdot dx$ be the magnetic fluxes, where $\gamma_k$ encircles $\Omega_k$ only. Suppose that some of $\frac{\alpha_k}{2\pi}$ are not integers and $\sum_{k=1}^m \alpha_k = 0$, i.e. the total magnetic flux is zero. In this case the gauge equivalence classes are determined by $m$ parameters $e^{i\alpha_k}$, $1 \le k \le m$; however, the AB experiment will not find a gauge equivalence class different from zero. To identify an arbitrary gauge equivalence class one needs to use broken rays (i.e. the rays reflected at the obstacles) belonging to the basis of the homology group of $\Omega$ (cf. [E5], p. 1512). It is necessary to perform at least $m$ AB type experiments to determine all $e^{i\alpha_k}$, $1 \le k \le m$.

When $B(x) = \operatorname{curl} A$ is not zero in $\Omega$ it is not enough to perform a finite number of AB type experiments to identify the gauge equivalence class of $A$. Therefore the following question arises: Is it possible by the measurements on the boundary $\partial\Omega_0$ to detect the difference in the gauge equivalence classes of $A(x)$ and $A'(x)$? The answer to this question is affirmative, and it is given by the following theorem (cf. [E4,W,N,KL] and further references there):

Theorem 1.1. Consider two boundary value problems (1.1), (1.2), (1.3) corresponding to electromagnetic potentials $A(x), V(x)$ and $A'(x), V'(x)$. Then $A(x), V(x)$ and $A'(x), V'(x)$ belong to the same gauge equivalence class iff the DN operators $\Lambda(k)$ and $\Lambda'(k)$ are gauge equivalent for all $k$.

We consider each boundary measurement as an experiment. Theorem 1.1 asserts that the boundary measurements are able to identify an arbitrary gauge equivalence class. We interpret this theorem as a confirmation of the Aharonov-Bohm effect.
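As a concrete illustration of (1.5) and (1.7), consider the following standard textbook example (it is not taken from [AB]; the polar angle $\theta$ and the symbol $A^{(\alpha)}$ are notation introduced here). Take one obstacle in $\mathbb{R}^2$ containing the origin, so that $\theta$ is defined up to multiples of $2\pi$ in $\Omega$:

\[
\begin{aligned}
A^{(\alpha)}(x) &= \frac{\alpha}{2\pi}\,\nabla\theta = \frac{\alpha}{2\pi}\,\frac{(-x_2,\, x_1)}{x_1^2 + x_2^2}, \qquad \operatorname{curl} A^{(\alpha)} = 0 \ \text{ for } x \neq 0,\\
\int_\gamma A^{(\alpha)} \cdot dx &= \frac{\alpha}{2\pi}\int_\gamma d\theta = \alpha \quad \text{for a simple path } \gamma \text{ encircling the origin once counterclockwise},\\
c(x) = e^{ip\theta},\ p \in \mathbb{Z}: \quad -i\,c^{-1}\frac{\partial c}{\partial x_j} &= p\,\frac{\partial\theta}{\partial x_j}, \quad \text{so by (1.5)} \quad A' = A^{(\alpha)} + p\nabla\theta = A^{(\alpha + 2\pi p)}.
\end{aligned}
\]

The function $c = e^{ip\theta}$ is smooth and single-valued in $\Omega$ exactly when $p$ is an integer, so fluxes differing by $2\pi p$ correspond to gauge equivalent potentials, in agreement with (1.7), while $e^{i\alpha}$ is unchanged.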

In Sect. 2 we develop the same approach in the case of the optical Aharonov-Bohm effect, and we shall formulate the unique identification theorem for the optical AB effect. In Sect. 3 we prove the main unique identification theorem (Theorem 2.3). Our approach to the hyperbolic inverse problems is based on a modification of the BC-method given in [E1,E2]. The powerful BC-method was discovered by M. Belishev and extended by M. Belishev, Y. Kurylev, M. Lassas and others (cf. [B,KKL,KL] and additional references there). An important part of the BC-method is the unique continuation theorem by Tataru [T]. The approach of [E1,E2] allows one to consider new problems that were not accessible by the BC-method, such as the inverse hyperbolic problems with time-dependent coefficients (see [E3]). The inverse problem results of this paper are also new.

2. The Optical Aharonov-Bohm Effect

In this section we consider a hyperbolic (wave) equation of the form:
\[
\sum_{j,k=0}^n \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x_j}\left(\sqrt{|g|}\, g^{jk}(x)\frac{\partial u(x_0, x)}{\partial x_k}\right) = 0, \tag{2.1}
\]
where $x = (x_1, \dots, x_n) \in \Omega$, $x_0$ is the time variable, $([g^{jk}]_{j,k=0}^n)^{-1}$ is the pseudo-Riemannian metric tensor with Minkowski signature, i.e. the quadratic form $\sum_{j,k=0}^n g^{jk}(x)\xi_j\xi_k$ has the signature $(1, -1, \dots, -1)$, and $g(x) = (\det[g^{jk}])^{-1}$. We assume that $g^{jk}(x)$ are smooth in $\overline{\Omega}$ and independent of $x_0$.


We make two additional assumptions:
\[
g^{00}(x) > 0, \quad x \in \overline{\Omega}, \tag{2.2}
\]
and the plane $\xi_0 = 0$ intersects the cone $\sum_{j,k=0}^n g^{jk}(x)\xi_j\xi_k = 0$ at $(\xi_1, \xi_2, \dots, \xi_n) = (0, 0, \dots, 0)$ only, i.e.
\[
\text{the form } -\sum_{j,k=1}^n g^{jk}(x)\xi_j\xi_k \text{ is positive definite}, \quad x \in \overline{\Omega}. \tag{2.3}
\]

The important physical example of an equation of the form (2.1) is the equation of the propagation of light in a moving medium. Here the tensor $g^{jk}(x)$ has the following form (see Gordon (1923), [NVV,LP1]):
\[
g^{jk} = \eta^{jk} + (n^2(x) - 1)u^j u^k, \quad 0 \le j, k \le n, \quad n = 3, \tag{2.4}
\]
where $[\eta^{jk}]^{-1}$ is the Lorentz metric tensor, $\eta^{jk} = 0$ when $j \neq k$, $\eta^{00} = 1$, $\eta^{jj} = -1$ for $1 \le j \le n$, $x_0 = ct$, $n(x) = \sqrt{\varepsilon(x)\mu(x)}$ is the refraction index, $(u^0, u^1, u^2, u^3)$ is the four-velocity of the medium flow, $(u^0, u^1, u^2, u^3) = \left(1 - \frac{|w|^2}{c^2}\right)^{-1/2}\left(1, \frac{w}{c}\right)$, and $w(x) = (w_1, w_2, w_3)$ is the velocity of the flow (cf. [LP,LP1,LP2]).

In the case of a slowly moving medium one drops the terms of order $\left(\frac{|w|}{c}\right)^2$ (cf. [LP1,LP2,CFM]). Then the metric of the slowly moving medium has the form:
\[
g^{jk} = \eta^{jk} \ \text{ for } 1 \le j, k \le n, \qquad g^{00} = n^2(x), \qquad g^{0j} = g^{j0} = v_j(x) \overset{\mathrm{def}}{=} (n^2 - 1)\frac{w_j(x)}{c}, \quad 1 \le j \le n, \quad n = 3, \tag{2.5}
\]

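and the corresponding equation is
\[
n^2(x)\frac{\partial^2 u}{\partial x_0^2}
+ \sum_{j=1}^n \frac{1}{\sqrt{|g(x)|}}\frac{\partial}{\partial x_j}\left(\sqrt{|g(x)|}\, v_j(x)\frac{\partial u}{\partial x_0}\right)
+ \sum_{j=1}^n \frac{1}{\sqrt{|g(x)|}}\frac{\partial}{\partial x_0}\left(\sqrt{|g(x)|}\, v_j(x)\frac{\partial u}{\partial x_j}\right)
- \sum_{j=1}^n \frac{1}{\sqrt{|g(x)|}}\frac{\partial}{\partial x_j}\left(\sqrt{|g(x)|}\frac{\partial u}{\partial x_j}\right) = 0. \tag{2.6}
\]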
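The passage from the Gordon metric (2.4) to the slow-medium metric (2.5) can be checked term by term. The following is a sketch of that expansion, using only the definitions of $u^j$ and $v_j$ given above and $\eta^{0j} = 0$:

\[
\begin{aligned}
u^0 &= \left(1 - \tfrac{|w|^2}{c^2}\right)^{-1/2} = 1 + O\!\left(\tfrac{|w|^2}{c^2}\right), \qquad u^j = u^0\,\frac{w_j}{c} = \frac{w_j}{c} + O\!\left(\tfrac{|w|^3}{c^3}\right), \quad 1 \le j \le n,\\
g^{00} &= \eta^{00} + (n^2 - 1)(u^0)^2 = 1 + (n^2 - 1) + O\!\left(\tfrac{|w|^2}{c^2}\right) = n^2(x) + O\!\left(\tfrac{|w|^2}{c^2}\right),\\
g^{0j} &= (n^2 - 1)\,u^0 u^j = (n^2 - 1)\frac{w_j}{c} + O\!\left(\tfrac{|w|^3}{c^3}\right) = v_j(x) + O\!\left(\tfrac{|w|^3}{c^3}\right),\\
g^{jk} &= \eta^{jk} + (n^2 - 1)\,u^j u^k = \eta^{jk} + O\!\left(\tfrac{|w|^2}{c^2}\right), \quad 1 \le j, k \le n.
\end{aligned}
\]

Dropping the terms of order $(|w|/c)^2$ and higher gives exactly (2.5); only the mixed components $g^{0j}$ retain a first-order dependence on the flow.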

We shall also consider, in addition to Eq. (2.6), the following equation:
\[
n^2(x)\frac{\partial^2 u}{\partial x_0^2} - \sum_{j=1}^n \frac{1}{\sqrt{|g(x)|}}\left(\frac{\partial}{\partial x_j} - v_j(x)\frac{\partial}{\partial x_0}\right)\sqrt{|g(x)|}\left(\frac{\partial}{\partial x_j} - v_j(x)\frac{\partial}{\partial x_0}\right)u = 0, \tag{2.7}
\]
where $v_j(x)$ and $n^2(x)$ are the same as in (2.5), $1 \le j \le n$. Equation (2.7) differs from Eq. (2.6) by the term $\sum_{j=1}^n v_j^2(x)\frac{\partial^2 u}{\partial x_0^2}$. Since $v_j^2 = O\!\left(\left(\frac{|w|}{c}\right)^2\right)$, Eq. (2.7) also describes the propagation of light in the slowly moving medium. We consider (2.7) to have a closer analogy with the quantum mechanical AB effect, although the addition of extra terms affects the uniqueness of the inverse problem (compare Theorems 2.1 and 2.2). Note that the nonuniqueness is of the first order in $\frac{|w|}{c}$ (see Theorem 2.1).

We consider the initial-boundary value problem for (2.6) and (2.7) in the infinite cylinder $\Omega \times (-\infty, +\infty)$, where $\Omega$ is the same domain as in Sect. 1:
\[
u(x_0, x) = 0 \quad \text{for } x_0 \ll 0, \tag{2.8}
\]
\[
u(x_0, x)|_{\partial\Omega_j \times (-\infty, +\infty)} = 0, \quad 1 \le j \le m, \qquad u(x_0, x)|_{\partial\Omega_0 \times (-\infty, +\infty)} = f(x_0, x'), \quad x' \in \partial\Omega_0, \tag{2.9}
\]
where $f(x_0, x')$ has a compact support on $\partial\Omega_0 \times (-\infty, +\infty)$. Denote by $\Lambda$ the hyperbolic DN operator:
\[
\Lambda f = \left.\left(\frac{\partial u}{\partial \nu} - (v \cdot \nu)\frac{\partial u}{\partial x_0}\right)\right|_{\partial\Omega_0 \times (-\infty, +\infty)}, \tag{2.10}
\]

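where, as in Sect. 1, $\nu$ is the outward unit normal to $\partial\Omega_0$.

In studying Eq. (2.7) we shall use the following change of variables in $\Omega \times (-\infty, +\infty)$:
\[
\hat{x}_0 = x_0 + a(x), \qquad \hat{x}_j = x_j, \quad 1 \le j \le n, \tag{2.11}
\]
where $a(x) \in C^\infty(\overline{\Omega})$, $a(x) = 0$ on $\partial\Omega_0$. If $\hat{u}(\hat{x}_0, \hat{x})$ is $u(x_0, x)$ in new coordinates, then $\hat{u}(\hat{x}_0, \hat{x})$ also satisfies an equation of the form (2.7):
\[
\hat{L}\hat{u} \overset{\mathrm{def}}{=} \hat{n}^2(x)\frac{\partial^2 \hat{u}(\hat{x}_0, x)}{\partial \hat{x}_0^2} - \sum_{j=1}^n \frac{1}{\sqrt{|\hat{g}(x)|}}\left(\frac{\partial}{\partial x_j} - \hat{v}_j(x)\frac{\partial}{\partial \hat{x}_0}\right)\sqrt{|\hat{g}(x)|}\left(\frac{\partial}{\partial x_j} - \hat{v}_j(x)\frac{\partial}{\partial \hat{x}_0}\right)\hat{u} = 0, \tag{2.12}
\]
where $\hat{n}(x) = n(x)$ and $v_j(x)$ is replaced by
\[
\hat{v}_j(x) = v_j(x) - a_{x_j}(x), \quad 1 \le j \le n. \tag{2.13}
\]
We assume that $\sum_{j=1}^n \hat{v}_j^2(x) < n^2(x)$ to preserve the hyperbolicity of (2.12). Note that
\[
\hat{u} = 0 \quad \text{for } \hat{x}_0 \ll 0 \tag{2.14}
\]
and
\[
\hat{u}|_{\partial\Omega_j \times (-\infty, +\infty)} = 0, \quad 1 \le j \le m, \qquad \hat{u}|_{\partial\Omega_0 \times (-\infty, +\infty)} = \hat{f}(\hat{x}_0, x'), \tag{2.15}
\]
where $\hat{f}(\hat{x}_0, x') = f(x_0, x')$ since $a = 0$ on $\partial\Omega_0$.

We shall say that $\hat{v}_j$, $1 \le j \le n$, and $v_j$, $1 \le j \le n$, belong to the same equivalence class if (2.13) holds. If $v(x) = (v_1(x), \dots, v_n(x))$ and $\hat{v}(x) = (\hat{v}_1, \dots, \hat{v}_n)$ belong to the same equivalence class then
\[
\int_\gamma v \cdot dx - \int_\gamma \hat{v} \cdot dx = 0 \tag{2.16}
\]
for all closed paths $\gamma$ in $\Omega$, since $\sum_{j=1}^n \int_\gamma a_{x_j}\,dx_j = 0$.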
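Two routine computations lie behind the statements above and may be worth recording. This is only a sketch: the shorthand $D_j := \partial/\partial x_j - v_j\,\partial/\partial x_0$ is introduced here for illustration, and it is assumed, as in the text, that $v_j$ and $|g|$ do not depend on $x_0$.

\[
\begin{aligned}
\text{(i)}\quad \frac{1}{\sqrt{|g|}}\,D_j\!\left(\sqrt{|g|}\,D_j u\right)
&= \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x_j}\!\left(\sqrt{|g|}\frac{\partial u}{\partial x_j}\right)
 - \frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x_j}\!\left(\sqrt{|g|}\,v_j\frac{\partial u}{\partial x_0}\right)
 - v_j\frac{\partial^2 u}{\partial x_0\,\partial x_j}
 + v_j^2\frac{\partial^2 u}{\partial x_0^2},\\
\frac{1}{\sqrt{|g|}}\frac{\partial}{\partial x_0}\!\left(\sqrt{|g|}\,v_j\frac{\partial u}{\partial x_j}\right) &= v_j\frac{\partial^2 u}{\partial x_0\,\partial x_j}
\quad\Longrightarrow\quad
\text{l.h.s. of (2.7)} = \text{l.h.s. of (2.6)} - \sum_{j=1}^n v_j^2\frac{\partial^2 u}{\partial x_0^2};\\[4pt]
\text{(ii)}\quad u(x_0, x) &= \hat{u}(x_0 + a(x), x):\qquad
\frac{\partial u}{\partial x_0} = \frac{\partial \hat{u}}{\partial \hat{x}_0},\qquad
\frac{\partial u}{\partial x_j} = \frac{\partial \hat{u}}{\partial \hat{x}_j} + a_{x_j}\frac{\partial \hat{u}}{\partial \hat{x}_0},\\
\left(\frac{\partial}{\partial x_j} - v_j\frac{\partial}{\partial x_0}\right)u
&= \left(\frac{\partial}{\partial \hat{x}_j} - \big(v_j - a_{x_j}\big)\frac{\partial}{\partial \hat{x}_0}\right)\hat{u}.
\end{aligned}
\]

Computation (i) accounts for the extra term distinguishing (2.7) from (2.6), and (ii) gives the rule (2.13). The vanishing of $\sum_j \int_\gamma a_{x_j}\,dx_j$ on closed paths is the statement that $a$ is a single-valued smooth function on $\overline{\Omega}$, so the integral of its gradient around a closed path is zero.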


It is easy to see that if $v$ and $\hat{v}$ belong to the same equivalence class then the DN operators $\Lambda$ and $\hat{\Lambda}$ are equal on $\partial\Omega_0 \times (-\infty, +\infty)$. A nontrivial fact is that the converse statement is also true. The following unique identification theorem holds:

Theorem 2.1. Let $Lu = 0$, $\hat{L}\hat{u} = 0$ be equations of the form (2.7), (2.12) in domains $\Omega$, $\hat{\Omega}$, $\hat{\Omega} = \Omega_0 \setminus \cup_{j=1}^{\hat{m}} \overline{\hat{\Omega}}_j$, respectively. Suppose that the DN operators $\Lambda$ and $\hat{\Lambda}$ are equal on $\partial\Omega_0 \times (-\infty, +\infty)$ for all $f \in C_0^\infty(\partial\Omega_0 \times (-\infty, +\infty))$. Then $\hat{\Omega} = \Omega$, $\hat{n}(x) = n(x)$ and the corresponding velocity flows $v(x)$, $\hat{v}(x)$ belong to the same equivalence class, i.e. (2.13) holds for some $a(x) \in C^\infty(\overline{\Omega})$, $a(x)|_{\partial\Omega_0} = 0$.

Note that we did not assume a priori that the number of obstacles $\hat{m}$ in $\hat{\Omega}$ and their location are the same as in $\Omega$. A consequence of Theorem 2.1 is that boundary measurements on $\partial\Omega_0 \times (-\infty, +\infty)$ uniquely determine the integrals $\int_\gamma v \cdot dx$ for all closed paths $\gamma$ in $\Omega$. As in Sect. 1 we view the optical Aharonov-Bohm effect as the fact that the different equivalence classes of the velocity flow have different physical impacts. Theorem 2.1 confirms that the boundary measurements (experiments) allow one to distinguish different equivalence classes, i.e. to detect the Aharonov-Bohm effect.

Remark 2.1. There is a difference between the optical Aharonov-Bohm effect and the quantum mechanical AB effect. In the case of the optical AB effect the boundary measurements allow one to recover $\int_\gamma v \cdot dx$. In the case of the quantum mechanical AB effect we can recover only $\int_\gamma A \cdot dx \pmod{2\pi p}$, $p \in \mathbb{Z}$.

Let $\tilde{u}(k, x) = \int_{-\infty}^{\infty} u(x_0, x)e^{-ix_0 k}\,dx_0$ be the Fourier-Laplace transform of $u(x_0, x)$ in $x_0$, or let $e^{ikx_0}\tilde{u}(k, x)$ be a monochromatic wave. Then $\tilde{u}(k, x)$ satisfies the Schrödinger equation:
\[
-k^2 n^2(x)\tilde{u}(k, x) - \sum_{j=1}^n \frac{1}{\sqrt{|g(x)|}}\left(\frac{\partial}{\partial x_j} - ikv_j(x)\right)\sqrt{|g(x)|}\left(\frac{\partial}{\partial x_j} - ikv_j(x)\right)\tilde{u}(k, x) = 0 \tag{2.17}
\]
with the boundary conditions
\[
\tilde{u}(k, x)|_{\partial\Omega_j} = 0, \qquad \tilde{u}(k, x)|_{\partial\Omega_0} = \tilde{f}(k, x'). \tag{2.18}
\]
Now $kv(x)$ plays the role of the vector potential and it depends on $k$. Note also that the Fourier-Laplace image $\tilde{T}$ of the transformation (2.11) is the multiplication by $e^{ika(x)}$, i.e. $\tilde{T}$ is a gauge transformation depending on the parameter $k$. When $\Omega$ is multi-connected one can expect that the Aharonov-Bohm effect takes place for (2.17). This problem was studied in optics (cf. [LP,LP1,LP2,CFM]). An analogous problem was considered for the water waves and for the acoustic waves in [BCLUW,RdeRTF,VMCL]. These authors considered the case of one obstacle $\Omega_1 \subset \mathbb{R}^2$ and irrotational flow in $\Omega_0 \setminus \overline{\Omega}_1$. Performing an Aharonov-Bohm type experiment they measured $\exp(ik\int_\gamma v \cdot dx)$ as in the quantum mechanical AB effect. Since such experiments are based on the geometric optics considerations, i.e. when $k \to \infty$, it was assumed that the light rays are straight lines and $kv(x)$ is not large. A natural question arises whether some form of the AB effect takes place when these conditions are not satisfied. Note that a rigorous geometric optics approach when $k \to \infty$ for Eq. (2.17) is more delicate than for Eq. (1.1). In particular, the eiconal equation depends on $v(x)$ and the light rays are not straight lines.

Remark 2.2. Let $\operatorname{curl} v = 0$ in $\Omega = (\Omega_0 \setminus \overline{\Omega}_1) \subset \mathbb{R}^2$. In this case the equivalence class of $v(x)$ depends only on one parameter $\alpha = \int_\gamma v \cdot dx$, where $\gamma$ encircles $\Omega_1$. There is a simple solution of the inverse problem in this case that uses neither the geometric optics nor Theorem 2.1: Let $v(x)$ and $\hat{v}(x)$ be two irrotational velocity flows in $\Omega_0 \setminus \overline{\Omega}_1$. Consider two Schrödinger equations of the form (2.17) in $\Omega = \Omega_0 \setminus \overline{\Omega}_1$ assuming that $\hat{n}(x) = n(x)$ in $\Omega$ and $\hat{\Lambda}(k) = \Lambda(k)$ on $\partial\Omega_0$ for some fixed $k$. It was shown in [NSU,ER], using the parametrix of the DN operators, that $\hat{v} \cdot \tau = v \cdot \tau$ on $\partial\Omega_0$, where $\tau(x)$ is the tangent vector to $\partial\Omega_0$ at $x \in \partial\Omega_0$. It follows from $\hat{v} \cdot \tau = v \cdot \tau$ on $\partial\Omega_0$ that $\alpha = \int_{\partial\Omega_0} v \cdot dx = \int_{\partial\Omega_0} \hat{v} \cdot dx$. Since $v$ and $\hat{v}$ are irrotational this implies that there exists $a(x) \in C^\infty(\overline{\Omega})$ such that $\hat{v} - v = \frac{\partial a}{\partial x}$. Since $\frac{\partial a}{\partial x} \cdot \tau = 0$ on $\partial\Omega_0$ we get that $a|_{\partial\Omega_0} = a_0 = \text{const}$. Replacing $a(x)$ by $a(x) - a_0$ we obtain that $\hat{v}$ and $v$ belong to the same equivalence class.

Similar arguments apply in the case of Eqs. (2.6) and (2.1) with the metric (2.4). Using the parametrix of the DN operator we can recover the restriction of the metric to $\partial\Omega_0$ (cf. [LU] or [E1], Remark 2.2). In particular, we can determine $w(x) \cdot \tau(x)$ on $\partial\Omega_0$. Therefore we can recover $\alpha = \int_{\partial\Omega_0} w(x) \cdot dx$. In the case of irrotational flow and one obstacle $\alpha$ is the same for any simple path in $\Omega = \Omega_0 \setminus \overline{\Omega}_1$.
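The remark made after (2.18), that the change of variables (2.11) becomes multiplication by $e^{ika(x)}$ after the Fourier-Laplace transform, can be verified directly. The sketch below assumes enough decay in $x_0$ for the integrals to converge and reads $\tilde{T}$ as the map taking the transform of $\hat{u}$ to that of $u$ (the text does not spell out the direction); $\tilde{\hat{u}}$ denotes the transform of $\hat{u}$ in $\hat{x}_0$.

\[
\begin{aligned}
\tilde{u}(k, x) &= \int_{-\infty}^{\infty} \hat{u}(x_0 + a(x), x)\,e^{-ix_0 k}\,dx_0
 = e^{ika(x)}\int_{-\infty}^{\infty} \hat{u}(s, x)\,e^{-isk}\,ds
 = e^{ika(x)}\,\tilde{\hat{u}}(k, x),\\
\left(\frac{\partial}{\partial x_j} - ikv_j\right)\!\left(e^{ika}\,\tilde{\hat{u}}\right)
 &= e^{ika}\left(\frac{\partial}{\partial x_j} - ik\big(v_j - a_{x_j}\big)\right)\tilde{\hat{u}}
 = e^{ika}\left(\frac{\partial}{\partial x_j} - ik\hat{v}_j\right)\tilde{\hat{u}}.
\end{aligned}
\]

So $k v$ transforms like a vector potential under the gauge factor $e^{ika(x)}$, with $\hat{v}_j = v_j - a_{x_j}$ as in (2.13).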

We shall investigate now the inverse problem for Eq. (2.6). The case of Eq. (2.1) with the metric (2.4) will be studied in another paper.

Theorem 2.2. Consider two initial-boundary value problems in domains $\Omega \times (-\infty, +\infty)$ and $\hat{\Omega} \times (-\infty, +\infty)$ for operators of the form (2.6), corresponding to the metric tensors $[g^{jk}(x)]^{-1}$, $[\hat{g}^{jk}(x)]^{-1}$ of the form (2.5), respectively. Assume that the DN operators $\Lambda$ and $\hat{\Lambda}$, corresponding to $L$ and $\hat{L}$, are equal on $\partial\Omega_0 \times (-\infty, +\infty)$. Assume also that there exists an open, connected and dense set $O \subset \Omega$ such that the velocity flow $\hat{v}(x) = (\hat{v}_1, \dots, \hat{v}_n)$ does not vanish on $O$. Then $\hat{\Omega} = \Omega$, $\hat{n}(x) = n(x)$, $\hat{v}_j(x) = v_j(x)$, $1 \le j \le n$, unless $\hat{v}(x)$ is a gradient flow, i.e. there exists $b(x) \in C^\infty(\overline{\Omega})$ such that $\hat{v}(x) = \frac{\partial b}{\partial x}$ and $b(x) = 0$ on $\partial\Omega_0$. In the case of the gradient flow there are two solutions $\hat{v}(x) = v(x)$ and $\hat{v}(x) = -v(x)$.

The proofs of Theorems 2.1 and 2.2 will be given at the end of this section.

Now we shall consider the general case of the initial-boundary value problem (2.8), (2.9) for Eq. (2.1). The DN operator for (2.1) has the following form:
\[
\Lambda f = \left.\sum_{j,k=0}^n g^{jk}(x)\frac{\partial u}{\partial x_j}\,\nu_k\left(\sum_{p,r=1}^n g^{pr}(x)\nu_p\nu_r\right)^{-\frac{1}{2}}\right|_{\partial\Omega_0 \times (-\infty, +\infty)}, \tag{2.19}
\]
where $\nu$ is the unit normal as in (2.10).


Consider a diffeomorphism of the form:
\[
\hat{x}_0 = x_0 + a(x), \qquad \hat{x} = \varphi(x), \tag{2.20}
\]
where $a(x)|_{\partial\Omega_0} = 0$ and $\varphi(x)$ is a diffeomorphism of $\overline{\Omega}$ onto $\overline{\hat{\Omega}}$, where $\hat{\Omega}$ is a domain of the form $\hat{\Omega} = \Omega_0 \setminus \cup_{j=1}^{\hat{m}} \overline{\hat{\Omega}}_j$ and $\varphi = I$ on $\partial\Omega_0$. Note that (2.20) transforms (2.1) into an equation of the same form. More precisely, (2.1) has the following form in $(\hat{x}_0, \hat{x})$ coordinates:
\[
\hat{L}\hat{u} = \sum_{j,k=0}^n \frac{1}{\sqrt{|\hat{g}(\hat{x})|}}\frac{\partial}{\partial \hat{x}_j}\left(\sqrt{|\hat{g}(\hat{x})|}\,\hat{g}^{jk}(\hat{x})\frac{\partial \hat{u}}{\partial \hat{x}_k}\right) = 0, \tag{2.21}
\]
where
\[
[\hat{g}^{jk}(\hat{x})] = J(x)[g^{jk}(x)]J^T(x), \qquad \hat{g}(\hat{x}) = (\det[\hat{g}^{jk}(\hat{x})])^{-1}, \quad 0 \le j, k \le n, \tag{2.22}
\]
and $J(x)$ is the Jacobi matrix of (2.20).

Theorem 2.3. Consider Eqs. (2.1) and (2.21) in domains $\Omega \times (-\infty, +\infty)$ and $\hat{\Omega} \times (-\infty, +\infty)$, respectively, with initial-boundary conditions (2.8), (2.9) and (2.14), (2.15), respectively, where $f = \hat{f}$. Assume that the DN operators $\Lambda$ and $\hat{\Lambda}$ are equal on $\partial\Omega_0 \times (-\infty, +\infty)$ and the conditions (2.2), (2.3) hold for $L$ and $\hat{L}$. Then there exists a map $\psi$ of the form (2.20) such that
\[
\psi \circ \hat{L} = L \quad \text{in } \Omega \times (-\infty, +\infty). \tag{2.23}
\]
Note that (2.23) is equivalent to (2.22). Note also that since $\varphi$ is a diffeomorphism of $\overline{\Omega}$ onto $\overline{\hat{\Omega}}$, we have that $\hat{m} = m$ and $\partial\Omega_j$ are diffeomorphic to $\partial\hat{\Omega}_j$, $1 \le j \le m$. The proof of Theorem 2.3 will be given in Sect. 3.

Remark 2.3. Making the Fourier-Laplace transform in (2.1) we obtain
\[
L\left(ik, \frac{\partial}{\partial x}\right)\tilde{u}(k, x) = 0 \quad \text{in } \Omega, \tag{2.24}
\]
where $L\left(\frac{\partial}{\partial x_0}, \frac{\partial}{\partial x}\right)$ is the operator (2.1). Let $\Lambda(k)$ be the Fourier-Laplace image of the DN operator (2.19). Using well known estimates for the initial-boundary value problem (2.1), (2.8), (2.9) one can prove that the hyperbolic DN operator (2.19) on $\partial\Omega_0 \times (-\infty, +\infty)$ uniquely determines the DN operator $\Lambda(k)$ for the elliptic boundary value problem (2.24), (2.18) and vice versa (see, for example, [KKLM]). Here $k \in \mathbb{C} \setminus Z$, where $Z$ is a discrete set. Suppose $g^{jk} - \eta^{jk} = 0$ when $|x| > R$, and suppose that $\Omega_0 \supset \{x : |x| \le R\}$. It is well known that $\Lambda(k)$ given on $\partial\Omega_0$ for fixed $k = k_0$ uniquely determines the scattering amplitude $a(\theta, \omega, k)$ for $k = k_0$ and any $\theta \in S^{n-1}$, $\omega \in S^{n-1}$, and vice versa (see, for example, the recent work [OD] and additional references there). Therefore one can consider the inverse scattering problem for (2.24) in $\mathbb{R}^n$ instead of the inverse boundary value problem for (2.24), (2.18). In the case when there are no obstacles and the principal part of (2.24) is the Laplacian, such inverse problems were studied for $n \ge 3$ and fixed $k$ (see, for example, [NSU] and [ER1], where the case of exponentially decreasing electromagnetic potentials was considered). When obstacles are present or when the metric is not Euclidean the hyperbolic inverse problem approach is much more powerful.
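To see how the transformation rule (2.22) works in the simplest case, one can apply it to the map (2.11), i.e. (2.20) with $\varphi(x) = x$. The block computation below is a sketch: the block notation and the reading $J_{pq} = \partial\hat{x}_p/\partial x_q$ are conventions adopted here, and the metric is taken with $g^{0j} = v_j$ and $g^{jk} = -\delta_{jk}$ for $1 \le j, k \le n$.

\[
\begin{aligned}
J &= \begin{pmatrix} 1 & (\nabla a)^T \\ 0 & I_n \end{pmatrix}, \qquad
[g^{jk}] = \begin{pmatrix} g^{00} & v^T \\ v & -I_n \end{pmatrix},\\
J\,[g^{jk}]\,J^T &= \begin{pmatrix} g^{00} + 2\,\nabla a \cdot v - |\nabla a|^2 & (v - \nabla a)^T \\ v - \nabla a & -I_n \end{pmatrix},\\
\hat{g}^{0j} &= v_j - a_{x_j} = \hat{v}_j, \qquad \hat{g}^{jk} = -\delta_{jk}, \quad 1 \le j, k \le n.
\end{aligned}
\]

Expanding (2.7) shows that its coefficient of $\partial^2/\partial x_0^2$ is $n^2 - |v|^2$; with $g^{00} = n^2 - |v|^2$ the corner entry becomes $n^2 - |v - \nabla a|^2 = n^2 - |\hat{v}|^2$, which is consistent with $\hat{n} = n$ and (2.13) in (2.12).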

We shall show now how Theorem 2.3 implies Theorem 2.1 and Theorem 2.2.

Proof of Theorem 2.1. Consider two equations of the form (2.7) and (2.12), i.e. $g^{jk} = -\delta_{jk}$ in (2.1) and $\hat{g}^{jk} = -\delta_{jk}$ in (2.21), $1 \le j, k \le n$. We assume that $\Lambda = \hat{\Lambda}$ on $\partial\Omega_0 \times (-\infty, +\infty)$. It follows from Theorem 2.3 that there exists a map $\psi$ of the form (2.20) such that (2.22) holds. It follows from (2.22) that
\[
\hat{g}^{jk} = \sum_{p,r=1}^n g^{pr}\frac{\partial\varphi_j}{\partial x_p}\frac{\partial\varphi_k}{\partial x_r}, \quad 1 \le j, k \le n, \tag{2.25}
\]
where
\[
\varphi = (\varphi_1, \dots, \varphi_n) = I \quad \text{on } \partial\Omega_0. \tag{2.26}
\]
Since $\hat{g}^{jk} = -\delta_{jk}$, $g^{pr} = -\delta_{pr}$ we have that $\varphi_j(x) = x_j$, $1 \le j \le n$, is the solution of (2.25). Fix $j$ and consider (2.25) for $j = k$. Then $\varphi_j(x)$ is the solution of the scalar first order differential equation satisfying $\varphi_j = x_j$ on $\partial\Omega_0$. The Cauchy problem for $\varphi_j(x)$ with the initial conditions on any surface $S(x) = 0$ can be uniquely solved in a small neighborhood of $S(x) = 0$. This implies that $\varphi_j(x) = x_j$, $1 \le j \le n$, is the unique global solution of (2.25), (2.26). Therefore the map (2.20) reduces to the map (2.11). This implies that $\hat{\Omega} = \Omega$. Making the change of variables (2.11) with the same $a(x)$ as in (2.20) we get two identical operators. Therefore (2.13) holds and, subsequently, $\hat{n}(x) = n(x)$.

Proof of Theorem 2.2. It follows from Theorem 2.3 that there exists a map of the form (2.20) such that (2.22) holds. Let $[g_{jk}(x)] = [g^{jk}(x)]^{-1}$, $[\hat{g}_{jk}(\hat{x})] = [\hat{g}^{jk}(\hat{x})]^{-1}$. Then (2.22) is equivalent to
\[
\sum_{j,k=0}^n \hat{g}_{jk}(\hat{x})\,d\hat{x}_j\,d\hat{x}_k = \sum_{j,k=0}^n g_{jk}(x)\,dx_j\,dx_k, \tag{2.27}
\]
where $(\hat{x}_0, \hat{x})$ are related to $(x_0, x)$ by (2.20). Note that (cf. [LP1])
\[
g_{00} = n^{-2}(x), \qquad g_{jk} = -\delta_{jk} \ \text{ for } 1 \le j, k \le n, \qquad g_{0j} = g_{j0} = -(n^{-2}(x) - 1)\frac{w_j(x)}{c} = n^{-2}(x)v_j(x), \tag{2.28}
\]
and $\hat{g}_{jk}$ have a similar form. Here $v_j(x)$ is the same as in (2.5). Since $g_{jk} = \hat{g}_{jk} = -\delta_{jk}$ for $1 \le j, k \le n$, we have, as in the proof of Theorem 2.1, that $\hat{x} = \varphi(x) = x$. Therefore $\hat{\Omega} = \Omega$. Note that
\[
d\hat{x}_0 = dx_0 + \sum_{j=1}^n \frac{\partial a(x)}{\partial x_j}\,dx_j. \tag{2.29}
\]


Substitute (2.29) into (2.27). Taking into account that $\hat{x}_j = x_j$, $1 \le j \le n$, and that $dx_0, dx_1, \dots, dx_n$ are arbitrary, we get from (2.27) and (2.29):
\[
\hat{n}^{-2}(x) = n^{-2}(x), \tag{2.30}
\]
\[
2\hat{n}^{-2}(x)a_{x_j} + 2\hat{n}^{-2}(x)\hat{v}_j(x) = 2n^{-2}(x)v_j(x), \quad 1 \le j \le n, \tag{2.31}
\]
\[
\hat{n}^{-2}(x)\left(\sum_{j=1}^n a_{x_j}\,dx_j\right)^2 + 2\left(\sum_{j=1}^n \hat{n}^{-2}(x)\hat{v}_j(x)\,dx_j\right)\left(\sum_{j=1}^n a_{x_j}\,dx_j\right) = 0. \tag{2.32}
\]
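For the reader who wants to see where (2.30), (2.31) and (2.32) come from, here is a sketch of the substitution. The shorthand $\delta a := \sum_{j=1}^n a_{x_j}\,dx_j$ is introduced here only to keep the formulas short; the metric components are those of (2.28) and their hatted analogues.

\[
\begin{aligned}
\sum_{j,k=0}^n \hat{g}_{jk}\,d\hat{x}_j\,d\hat{x}_k
&= \hat{n}^{-2}\,(dx_0 + \delta a)^2 + 2\sum_{j=1}^n \hat{n}^{-2}\hat{v}_j\,dx_j\,(dx_0 + \delta a) - \sum_{j=1}^n dx_j^2\\
&= \hat{n}^{-2}\,dx_0^2 + 2\,dx_0\sum_{j=1}^n \hat{n}^{-2}\big(a_{x_j} + \hat{v}_j\big)\,dx_j
 + \hat{n}^{-2}(\delta a)^2 + 2\Big(\sum_{j=1}^n \hat{n}^{-2}\hat{v}_j\,dx_j\Big)\delta a - \sum_{j=1}^n dx_j^2,\\
\sum_{j,k=0}^n g_{jk}\,dx_j\,dx_k
&= n^{-2}\,dx_0^2 + 2\,dx_0\sum_{j=1}^n n^{-2}v_j\,dx_j - \sum_{j=1}^n dx_j^2.
\end{aligned}
\]

Comparing the coefficients of $dx_0^2$ gives (2.30), those of $dx_0\,dx_j$ give (2.31), and the remaining terms, which are quadratic in $dx_1, \dots, dx_n$, give (2.32).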

It follows from (2.30) that n(x) ˆ = n(x). Multiplying (2.31) by n 2 (x) we get vˆ j (x) + ax j (x) = v j (x), 1 ≤ j ≤ n.

(2.33)

If there exists x such that not all ax j (x) = 0, 1 ≤ j ≤ n, then we can cancel nˆ −2 (x) nj=1 ax j (x)d x j in (2.32) and get nj=1 ax j (x)d x j + 2 nj=1 vˆ j (x)d x j = 0, i.e. ax j (x) + 2vˆ j (x) = 0, 1 ≤ j ≤ n,

(2.34)

since $dx_j$, $1 \le j \le n$, are arbitrary. Comparing (2.33) and (2.34) we get $\hat v_j(x) = -v_j(x)$, $1 \le j \le n$, when $\frac{\partial a}{\partial x} = \big(\frac{\partial a}{\partial x_1}, \dots, \frac{\partial a}{\partial x_n}\big) \ne 0$.

Let $O \subset \Omega$ be an open connected set and $\hat v(x) \ne 0$ on $O$. Then either $\frac{\partial a}{\partial x} = 0$ on $O$, or $\frac{\partial a}{\partial x} \ne 0$ for all $x \in O$. To prove the last assertion suppose we have $\frac{\partial a(x^{(1)})}{\partial x} \ne 0$ and $\frac{\partial a(x^{(2)})}{\partial x} = 0$ for some $x^{(1)}, x^{(2)} \in O$. Let $x = x(s)$, $0 \le s \le 1$, be a curve connecting $x^{(1)}$ and $x^{(2)}$, $x(0) = x^{(1)}$, $x(1) = x^{(2)}$. Let $s_0 \in (0, 1]$ be such that $\frac{\partial a(x(s))}{\partial x} \ne 0$ when $s < s_0$ and $\frac{\partial a(x(s_0))}{\partial x} = 0$. It follows from (2.34) that $\frac{\partial a(x(s))}{\partial x} = -2\hat v(x(s))$ for $s < s_0$. Taking the limit when $s \to s_0$ we get $\frac{\partial a(x(s_0))}{\partial x} = -2\hat v(x(s_0)) \ne 0$ and this is a contradiction. Therefore, in the second case, $\frac{\partial a(x)}{\partial x} = -2\hat v(x)$ on $O$. If $O$ is dense in $\Omega$ we get $\hat v(x) = -\frac12 \frac{\partial a(x)}{\partial x}$ in $\Omega$, i.e. $\hat v(x)$ is a gradient flow and $v(x) = -\hat v(x)$ is also a solution in addition to the trivial solution $v(x) = \hat v(x)$ that corresponds to $a(x) = 0$. Note that the boundary measurements cannot distinguish between these two solutions $v(x)$ and $-v(x)$.

Suppose that the set $O$ where $\hat v(x) \ne 0$ consists of several open components $O_1, \dots, O_r$. For each $O_j$, $1 \le j \le r$, either $a(x) = 0$ on $O_j$ or $\frac{\partial a(x)}{\partial x} = -2\hat v(x) \ne 0$ on $O_j$. It follows from (2.32) that $\frac{\partial a(x)}{\partial x} = 0$ when $\hat v(x) = 0$. Denote by $b_j(x)$ a function such that $b_j(x) = 0$ in $\Omega \setminus O_j$ and $\frac{\partial b_j(x)}{\partial x} = \hat v(x)$ in $O_j$. Suppose $b_j \in C^\infty(\Omega)$, $b_j = 0$ on $\partial\Omega_0$. Then we have $2^r$ solutions of the inverse problem, where each of these solutions is equal either to $\frac{\partial b_j(x)}{\partial x}$ or to $-\frac{\partial b_j(x)}{\partial x}$ on $O_j$, $1 \le j \le r$.

If $v(x)$ and $\hat v(x)$ are any two solutions of the inverse problem then (2.33) implies that $\int_\gamma \hat v(x)\cdot dx = \int_\gamma v(x)\cdot dx$ for any closed curve $\gamma$ in $\Omega$. Therefore the boundary measurements uniquely determine $\int_\gamma v(x)\cdot dx$. This fact can be considered as an analogue of the Aharonov-Bohm effect.
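The invariance of the circulation can be spelled out in one line. The following is a sketch only, assuming that $\gamma$ is a closed curve in $\Omega$ and that $a(x)$ is a smooth single-valued function there, so that (2.33) reads $v - \hat v = \partial a/\partial x$:

```latex
\oint_\gamma v\cdot dx \;-\; \oint_\gamma \hat v\cdot dx
  \;=\; \oint_\gamma \sum_{j=1}^{n} a_{x_j}(x)\,dx_j   % by (2.33)
  \;=\; \oint_\gamma da
  \;=\; 0 .
```

So the two velocity fields may differ pointwise, but only by a gradient, which closed-curve integrals do not see.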

Optical Aharonov-Bohm Effect: An Inverse Hyperbolic Problems Approach


3. The Proof of the Main Theorem

As in [E1] we start the proof of Theorem 2.3 with the introduction of a convenient system of coordinates that simplifies the equation. Let $U_0$ be a neighborhood of some part of $\partial\Omega_0$ and let $(x', x_n)$ be a system of coordinates in $U_0$ such that $x_n = 0$ is the equation of $\partial\Omega_0 \cap U_0$. Let $T$ be small. Denote by $\psi^\pm$ the solutions of the eiconal equations in $U_0$,

\[
\sum_{j,k=0}^{n} g^{jk}(x)\,\psi^\pm_{x_j}(x_0, x)\,\psi^\pm_{x_k}(x_0, x) = 0, \tag{3.1}
\]
\[
\psi^+ = x_0 \ \text{when } x_n = 0, \qquad \psi^- = T - x_0 \ \text{when } x_n = 0, \tag{3.2}
\]

such that

\[
\psi^\pm_{x_n}\big|_{x_n=0} = \frac{\mp g^{0n}(x) + \sqrt{(g^{0n}(x))^2 - g^{00}(x)\,g^{nn}(x)}}{g^{nn}(x)}\bigg|_{x_n=0}. \tag{3.3}
\]
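Formula (3.3) is the root of a quadratic equation. A short check, assuming only that $g^{jk}$ is symmetric and that the boundary condition (3.2) is differentiated along $x_n = 0$ (so that the tangential derivatives $\psi^\pm_{x_j}$, $1 \le j \le n-1$, vanish there and $\psi^\pm_{x_0} = \pm 1$); the symbol $p$ is introduced here for illustration only:

```latex
% On x_n = 0, put p = \psi^\pm_{x_n}. Then (3.1) reduces to
g^{00} \pm 2 g^{0n}\,p + g^{nn}\,p^2 = 0 ,
% whose roots are
p = \frac{\mp g^{0n} \pm' \sqrt{(g^{0n})^2 - g^{00} g^{nn}}}{g^{nn}} ,
% where \pm' is an independent choice of sign.
% Condition (3.3) selects the root with "+" in front of the square root.
```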

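Solutions $\psi^\pm(x_0, x)$ exist for $0 \le x_n \le \delta$, where $\delta$ is small. We assume that surfaces $\psi^+ = 0$ and $\psi^- = 0$ intersect when $x_n \le \delta$. In the case when $g^{jk}(x)$ are independent of $x_0$ we have

\[
\psi^+ = x_0 + \varphi^+(x), \qquad \psi^- = T - x_0 + \varphi^-(x), \tag{3.4}
\]

where $\varphi^\pm(x)$ satisfy

\[
g^{00}(x) \pm 2\sum_{j=1}^{n} g^{0j}(x)\,\varphi^\pm_{x_j} + \sum_{j,k=1}^{n} g^{jk}(x)\,\varphi^\pm_{x_j}\varphi^\pm_{x_k} = 0,
\]
\[
\varphi^\pm\big|_{x_n=0} = 0, \qquad
\varphi^\pm_{x_n}\big|_{x_n=0} = \frac{\mp g^{0n}(x) + \sqrt{(g^{0n}(x))^2 - g^{00}(x)\,g^{nn}(x)}}{g^{nn}(x)}\bigg|_{x_n=0}.
\]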

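Denote by $\varphi_p(x)$ the solutions of

\[
\sum_{j,k=0}^{n} g^{jk}(x)\,\psi^-_{x_j}\,\varphi_{p x_k}(x) = 0 \tag{3.5}
\]

with the initial conditions $\varphi_p|_{x_n=0} = x_p$, $1 \le p \le n-1$. Note that $\varphi_{p x_0} = 0$, $\psi^-_{x_0} = -1$. Therefore we have

\[
\sum_{j,k=1}^{n} g^{jk}\,\varphi^-_{x_j}\,\varphi_{p x_k} - \sum_{j=1}^{n} g^{j0}\,\varphi_{p x_j} = 0, \quad 1 \le p \le n-1. \tag{3.6}
\]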
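The passage from (3.5) to (3.6) is a matter of splitting the double sum. A sketch, assuming the $x_0$-independent case (3.4), so that $\psi^-_{x_j} = \varphi^-_{x_j}$ for $1 \le j \le n$:

```latex
\sum_{j,k=0}^{n} g^{jk}\,\psi^-_{x_j}\,\varphi_{p x_k}
 = \underbrace{\sum_{j=0}^{n} g^{j0}\,\psi^-_{x_j}\,\varphi_{p x_0}}_{=\,0,\ \text{since } \varphi_{p x_0}=0}
 \;+\; \underbrace{\psi^-_{x_0}}_{=\,-1}\sum_{k=1}^{n} g^{0k}\,\varphi_{p x_k}
 \;+\; \sum_{j,k=1}^{n} g^{jk}\,\varphi^-_{x_j}\,\varphi_{p x_k} .
% Setting the left-hand side to zero and using g^{0k} = g^{k0} gives (3.6).
```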


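Make the following change of variables in $U_0 \times [0, T]$:

\[
s = \psi^+(x_0, x) = x_0 + \varphi^+(x), \qquad
\tau = \psi^-(x_0, x) = T - x_0 + \varphi^-(x), \qquad
y_j = \varphi_j(x), \quad 1 \le j \le n-1. \tag{3.7}
\]

We shall call $(s, \tau, y')$ the Goursat coordinates. Let $\hat u(s, \tau, y') = u(x_0, x)$. Then $\hat u(s, \tau, y')$ satisfies the equation

\[
\begin{aligned}
\hat L \hat u \overset{\mathrm{def}}{=}{}
& -\frac{2}{\sqrt{|\hat g|}}\frac{\partial}{\partial s}\Big(\sqrt{|\hat g|}\,\hat g^{+,-}(s,\tau,y')\frac{\partial \hat u}{\partial \tau}\Big)
  -\frac{2}{\sqrt{|\hat g|}}\frac{\partial}{\partial \tau}\Big(\sqrt{|\hat g|}\,\hat g^{+,-}(s,\tau,y')\frac{\partial \hat u}{\partial s}\Big) \\
& +\sum_{j=1}^{n-1}\frac{2}{\sqrt{|\hat g|}}\frac{\partial}{\partial s}\Big(\sqrt{|\hat g|}\,\hat g^{+,j}(s,\tau,y')\frac{\partial \hat u}{\partial y_j}\Big)
  +\sum_{j=1}^{n-1}\frac{2}{\sqrt{|\hat g|}}\frac{\partial}{\partial y_j}\Big(\sqrt{|\hat g|}\,\hat g^{+,j}(s,\tau,y')\frac{\partial \hat u}{\partial s}\Big) \\
& +\sum_{j,k=1}^{n-1}\frac{1}{\sqrt{|\hat g|}}\frac{\partial}{\partial y_j}\Big(\sqrt{|\hat g|}\,\hat g^{j,k}(s,\tau,y')\frac{\partial \hat u}{\partial y_k}\Big) = 0.
\end{aligned} \tag{3.8}
\]

The terms containing $\frac{\partial^2}{\partial s^2}$, $\frac{\partial^2}{\partial \tau^2}$, $\frac{\partial^2}{\partial y_j \partial\tau}$ vanished because of (3.1), (3.5). Here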

−1 . gˆ = −4(gˆ +,− )−2 det[gˆ jk ]n−1 j,k=1

(3.9)

It follows from (3.7) that $s + \tau - T = \varphi^+(x) + \varphi^-(x)$, $s - \tau + T = 2x_0 + \varphi^+(x) - \varphi^-(x)$. Denote (cf. [E1], (2.23))

\[
y_n = \frac{T - s - \tau}{2} = -\frac{\varphi^+(x) + \varphi^-(x)}{2}, \qquad
y_0 = \frac{s - \tau + T}{2} = x_0 + \frac{\varphi^+(x) - \varphi^-(x)}{2}, \tag{3.10}
\]

$y_j = \varphi_j(x)$, $1 \le j \le n-1$. We shall also use the coordinates (3.10). Note that $\varphi^+ = \varphi^- = 0$ when $x_n = 0$. Therefore the map (3.10) is the identity on $x_n = 0$.


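Since $u_s = \frac12(u_{y_0} - u_{y_n})$, $u_\tau = -\frac12(u_{y_0} + u_{y_n})$, Eq. (3.8) has the following form in $(y_0, y', y_n)$ coordinates:

\[
\begin{aligned}
\hat L \hat u ={}
& \hat g^{+,-}\frac{\partial^2 \hat u}{\partial y_0^2}
  - \frac{1}{\sqrt{|\hat g|}}\frac{\partial}{\partial y_n}\Big(\sqrt{|\hat g|}\,\hat g^{+,-}(s,\tau,y')\frac{\partial \hat u}{\partial y_n}\Big) \\
& + \sum_{j=1}^{n-1}\frac{1}{\sqrt{|\hat g|}}\frac{\partial}{\partial y_j}\Big(\sqrt{|\hat g|}\,\hat g^{+,j}(s,\tau,y')\Big(\frac{\partial}{\partial y_0} - \frac{\partial}{\partial y_n}\Big)\hat u\Big) \\
& + \sum_{j=1}^{n-1}\frac{1}{\sqrt{|\hat g|}}\Big(\frac{\partial}{\partial y_0} - \frac{\partial}{\partial y_n}\Big)\Big(\sqrt{|\hat g|}\,\hat g^{+,j}(s,\tau,y')\frac{\partial \hat u}{\partial y_j}\Big) \\
& + \sum_{j,k=1}^{n-1}\frac{1}{\sqrt{|\hat g|}}\frac{\partial}{\partial y_j}\Big(\sqrt{|\hat g|}\,\hat g^{j,k}(s,\tau,y')\frac{\partial \hat u}{\partial y_k}\Big) = 0.
\end{aligned} \tag{3.11}
\]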
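The first two terms of (3.11) can be recovered from (3.8) by the chain rule. The following is a sketch under two stated assumptions: that the first two terms of (3.8) are $-\frac{2}{\sqrt{|\hat g|}}\partial_s(G\,\partial_\tau \hat u) - \frac{2}{\sqrt{|\hat g|}}\partial_\tau(G\,\partial_s \hat u)$ with the shorthand $G = \sqrt{|\hat g|}\,\hat g^{+,-}$ (introduced here for illustration), and that $G$ does not depend on $y_0$:

```latex
% From (3.10): \partial_s = \tfrac12(\partial_{y_0} - \partial_{y_n}),
%              \partial_\tau = -\tfrac12(\partial_{y_0} + \partial_{y_n}).
-2\,\partial_s\big(G\,\partial_\tau \hat u\big) - 2\,\partial_\tau\big(G\,\partial_s \hat u\big)
 = \tfrac12(\partial_{y_0} - \partial_{y_n})\big(G(\partial_{y_0} + \partial_{y_n})\hat u\big)
 + \tfrac12(\partial_{y_0} + \partial_{y_n})\big(G(\partial_{y_0} - \partial_{y_n})\hat u\big)
 = G\,\partial_{y_0}^2 \hat u - \partial_{y_n}\big(G\,\partial_{y_n}\hat u\big).
% The mixed terms G \hat u_{y_0 y_n} and \partial_{y_n}(G \hat u_{y_0}) cancel in pairs.
% Dividing by \sqrt{|\hat g|} gives the first line of (3.11).
```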

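We used above that $\hat g^{jk}$, $\hat g^{+,-}$, $\hat g^{+,j}$ depend on $(y', y_n)$ and do not depend on $y_0$. Divide (3.11) by $\hat g^{+,-}$. As in [E1] put

\[
u' = |\hat g|^{\frac14}\,(\hat g^{+,-})^{\frac12}\,\hat u. \tag{3.12}
\]

Then $u'$ will be the solution of the equation

\[
\begin{aligned}
L_1 u' \overset{\mathrm{def}}{=}{}
& u'_{y_0^2} - u'_{y_n^2}
  + \sum_{j,k=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{jk}\frac{\partial u'}{\partial y_k}\Big)
  + \sum_{j=1}^{n-1}\Big(\frac{\partial}{\partial y_0} - \frac{\partial}{\partial y_n}\Big)\Big(g_0^{0j}\frac{\partial u'}{\partial y_j}\Big) \\
& + \sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{0j}\Big(\frac{\partial}{\partial y_0} - \frac{\partial}{\partial y_n}\Big)u'\Big)
  + V_1 u' = 0,
\end{aligned} \tag{3.13}
\]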

where g0 = (gˆ +,− )−1 gˆ jk , g0 = −g0 = (gˆ +,− )−1 gˆ +, j , 1 ≤ j, k ≤ n − 1, V1 has a form similar to (2.8) in [E1]: n−1 n−1 ∂ ∂2 A ∂A 2 jk ∂ A jk ∂ A ∂ A V1 (s, τ, y ) = g − + − g0 0 ∂ yn2 ∂ yn ∂yj ∂ yk ∂ y j ∂ yk jk

0j

nj

j,k=1

+

n−1 j=1 1

∂ ∂ yn 1

0j

g0

∂A ∂yj

j,k=1

+

∂ ∂yj

1

0j

g0

∂A ∂ yn

0j

+ 2g0

∂A ∂A ∂ y j ∂ yn

,

(3.14)

where $A = \ln\big[(\hat g^{+,-})^{\frac12}|\hat g|^{\frac14}\big] = \ln\big(\frac{1}{\sqrt2}\,g_1^{\frac14}\big)$, $g_1 = \big(\det[\hat g^{jk}]_{j,k=1}^{n-1}\big)^{-1}$ (cf. (3.9) and (3.12)). Note that $L_1$ is formally self-adjoint. The DN operator $\Lambda_1$ corresponding to $L_1$ has the following form:

\[
\Lambda_1 f = \Big(\frac{\partial u'}{\partial y_n} + \sum_{j=1}^{n-1} g_0^{nj}\frac{\partial u'}{\partial y_j}\Big)\Big|_{y_n=0}, \tag{3.15}
\]

where $f = u'|_{y_n=0}$. It follows from Remark 2.2 in [E1] that $e^A = (\hat g^{+,-})^{\frac12}|\hat g|^{\frac14} = \frac{1}{\sqrt2}\,g_1^{\frac14}$ and its derivatives on $y_n = 0$ can be determined by the DN operator $\Lambda$ of $L$. Therefore the DN operator $\Lambda_1$ of $L_1$ is determined by the DN operator $\Lambda$ of $L$ (cf. [E1], (2.9)-(2.12)).

Introduce notations similar to [E1], p. 819. Let $\Gamma \subset \Gamma^{(1)} \subset \Gamma^{(2)} \subset U_0 \cap \partial\Omega_0$. Denote by $D_{js_0}$, $1 \le j \le 2$, $0 \le s_0 \le T$, the forward domain of influence of $\Gamma^{(j)} \times [s_0, T]$ in the half-space $y_n \ge 0$. Denote by $D_j^-$ the backward domain of influence of $\Gamma^{(j)} \times [0, T]$ for $y_n \ge 0$. Let $Y_{js_0}$, $s_0 \in [0, T)$, $1 \le j \le 2$, be the intersection of $D_{js_0}$ with the plane $T - y_n - y_0 = 0$. Denote by $X_{js_0}$ the part of $D_{js_0}$ below $Y_{js_0}$. Let $Z_{js_0} = \partial X_{js_0} \setminus (Y_{js_0} \cup \{y_n = 0\})$. We assume that $X_{20} \cap \partial\Omega_0 \subset U_0$ and $X_{20}$ does not intersect $\partial\Omega$ for $y_n > 0$. We shall call $D_{js_0} \cap D_j^-$ the double cone of influence of $\Gamma^{(j)} \times [s_0, T]$. Denote by $R_{js_0}$ the intersection of $D_{js_0} \cap D_j^-$ with $Y_{js_0}$. We shall assume that $\Gamma^{(j)}$, $1 \le j \le 2$, are such that $D_{10} \cap \partial\Omega_0 \subset \Gamma^{(2)} \times [0, T]$. Let $Q_j$ be the rectangle in the plane $\tau = 0$: $Q_j = \{(s, \tau, y') : \tau = 0,\ 0 \le s \le T,\ y' \in \Gamma^{(j)}\}$. Note that $Q_j$ is the intersection of $D_j^-$ with the plane $\tau = 0$. Therefore $R_{js_0}$ is the intersection of $Y_{js_0}$ with $Q_j$, $j = 1, 2$. Note also that if $(s', 0, y') \in Y_{js_0}$ then the line segment $(s, 0, y')$, $s' \le s \le T$, also belongs to $Y_{js_0}$.

The following theorem is a generalization of Lemma 2.1 in [E1]:

Theorem 3.1. Let $\hat L^{(1)}$ and $\hat L^{(2)}$ be two operators of the form (3.11) and let $\hat\Lambda^{(i)}$ be the corresponding DN operators. Assume that $\hat\Lambda^{(1)} = \hat\Lambda^{(2)}$ on $\Gamma^{(2)} \times (0, T)$ and $T$ is small. Then there exist changes of variables $\hat y_0 = y_0$, $\hat y_n = y_n$, $\hat y' = \alpha^{(i)}(y_n, y')$, $i = 1, 2$, such that $\alpha^{(1)}(0, y') = \alpha^{(2)}(0, y') = y'$ and $\tilde L^{(1)} = \tilde L^{(2)}$ when $\hat y' \in \Gamma$, $y_n \in [0, \frac T2]$. Here $\tilde L^{(i)}$ are the differential operators $\hat L^{(i)}$ in the coordinates $(\hat y_0, \hat y_n, \hat y')$.

Many parts of the proof of Theorem 3.1 are the same as in Lemma 2.1 in [E1]. We shall skip the proofs in such cases and concentrate only on the new elements. We shall start with the derivation of Green's formulas analogous to formulas (2.33) and (2.24) in [E1]. Consider the following initial-boundary value problem for $L_1$:

\[
L_1 u = 0 \ \text{for } y_n > 0, \qquad u = u_{y_0} = 0 \ \text{for } y_0 = 0,\ y_n > 0, \qquad u|_{y_n=0} = f,
\]

where $\operatorname{supp} f \subset \Gamma^{(2)} \times (0, T]$, $\Gamma^{(2)} \subset U_0 \cap \{y_n = 0\}$. Let $v$ be such that

\[
L_1^* v = 0, \ y_n > 0, \qquad v = v_{y_0} = 0 \ \text{when } y_0 = 0,\ y_n > 0, \qquad v|_{y_n=0} = g, \quad \operatorname{supp} g \subset \Gamma^{(2)} \times (0, T].
\]

Here $L_1^*$ is formally adjoint to $L_1$. We have

\[
0 = (L_1 u, v) - (u, L_1^* v),
\]

where $(u, v) = \int_{X_{20}} u(y_0, y)\,v(y_0, y)\,dy_0\,dy'\,dy_n$. Integrating by parts we get

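\[
\begin{aligned}
& \Big(2\sum_{j=1}^{n-1}\frac{\partial}{\partial s}\Big(g_0^{0j}\frac{\partial u}{\partial y_j}\Big), v\Big)
 + \Big(2\sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{0j}\frac{\partial u}{\partial s}\Big), v\Big) \\
&\quad = \Big(-2\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}, v_s\Big)
 + \Big(-2\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial s}, \frac{\partial v}{\partial y_j}\Big)
 + \int_{y_n=0}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}\,v\,dy_0\,dy' \\
&\quad = \Big(u, 2\sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{0j}\frac{\partial v}{\partial s}\Big)\Big)
 + \Big(u, 2\sum_{j=1}^{n-1}\frac{\partial}{\partial s}\Big(g_0^{0j}\frac{\partial v}{\partial y_j}\Big)\Big) \\
&\qquad - \int_{y_n=0} u\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial v}{\partial y_j}\,dy_0\,dy'
 + \int_{y_n=0}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}\,v\,dy_0\,dy'.
\end{aligned} \tag{3.16}
\]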

We used here that $u$, $v$ vanish on $Z_{20}$. Note that other terms in $L_1$ are the same as in [E1], formula (2.33). Therefore integrating these terms by parts as in [E1], (2.33), and combining with (3.16) we get the following Green's formula:

\[
\int_{Y_{20}}\Big(\frac{\partial u}{\partial s}\,v - u\,\frac{\partial v}{\partial s}\Big)\,ds\,dy'
 = -\int_{\Gamma^{(2)}\times[0,T]}\big(\Lambda_1 f\,g - f\,\Lambda_1 g\big)\,dy'\,dy_0, \tag{3.17}
\]

where $\Lambda_1$ is the DN operator (3.15). Note that $L_1^* = L_1$ in our case. Therefore the left-hand side of (3.17) is determined by the boundary data. Now we shall derive another Green's formula similar to (2.24) in [E1]. Consider

\[
0 = \Big(L_1 u, \frac{\partial v}{\partial y_0}\Big) + \Big(\frac{\partial u}{\partial y_0}, L_1 v\Big).
\]

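Integrating by parts in $y_j$ and $s$ we get

\[
\begin{aligned}
& \Big(2\sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{0j}\frac{\partial u}{\partial s}\Big), v_{y_0}\Big)
 + \Big(2\sum_{j=1}^{n-1}\frac{\partial}{\partial s}\Big(g_0^{0j}\frac{\partial u}{\partial y_j}\Big), v_{y_0}\Big) \\
&\quad = \Big(-2\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial s}, v_{y_j y_0}\Big)
 + \Big(-2\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}, v_{y_0 s}\Big)
 + \int_{y_n=0}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}\,v_{y_0}\,dy'\,dy_0.
\end{aligned} \tag{3.18}
\]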

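Now integrate by parts in $y_0$ and then again in $s$ and $y_j$. We get

\[
\begin{aligned}
& \Big(2\sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{0j}\frac{\partial u}{\partial s}\Big), v_{y_0}\Big)
 + \Big(2\sum_{j=1}^{n-1}\frac{\partial}{\partial s}\Big(g_0^{0j}\frac{\partial u}{\partial y_j}\Big), v_{y_0}\Big) \\
&\quad = -\int_{Y_{20}}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial s}\frac{\partial v}{\partial y_j}\,ds\,dy'
 - \int_{Y_{20}}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}\frac{\partial v}{\partial s}\,ds\,dy' \\
&\qquad + \int_{y_n=0}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_j}\frac{\partial v}{\partial y_0}\,dy'\,dy_0
 + \int_{y_n=0}\sum_{j=1}^{n-1} g_0^{0j}\frac{\partial u}{\partial y_0}\frac{\partial v}{\partial y_j}\,dy'\,dy_0 \\
&\qquad - \Big(u_{y_0}, 2\sum_{j=1}^{n-1}\frac{\partial}{\partial s}\Big(g_0^{0j}\frac{\partial v}{\partial y_j}\Big)\Big)
 - \Big(u_{y_0}, 2\sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_0^{0j}\frac{\partial v}{\partial s}\Big)\Big).
\end{aligned} \tag{3.19}
\]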

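The remaining terms in (3.18) are the same as in [E1], formulas (2.18)-(2.25). Therefore, combining all terms after the integration by parts we get (cf. [E1], (2.25)):

\[
0 = (L_1 u, v_{y_0}) + (u_{y_0}, L_1 v) = \tilde Q(u, v) + \tilde\Lambda_0(f, g), \tag{3.20}
\]

where

\[
\tilde Q(u, v) = \frac12\int_{Y_{20}}\Big[4u_s v_s - \sum_{j,k=1}^{n-1} g_0^{jk}\frac{\partial u}{\partial y_j}\frac{\partial v}{\partial y_k}
 - 2\sum_{j=1}^{n-1}\Big(g_0^{0j}\frac{\partial u}{\partial s}\frac{\partial v}{\partial y_j} + g_0^{0j}\frac{\partial u}{\partial y_j}\frac{\partial v}{\partial s}\Big) + V_1 uv\Big]\,dy'\,ds \tag{3.21}
\]

and

\[
\tilde\Lambda_0(f, g) = \int_{\Gamma^{(2)}\times[0,T]}\big(\Lambda_1 f\,g_{y_0} + f_{y_0}\,\Lambda_1 g\big)\,dy'\,dy_0. \tag{3.22}
\]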

Again in the derivation of (3.20) we used that $u = v = 0$ on $Z_{20}$. We shall show now that the "ellipticity" condition (2.3), i.e. that the reduced quadratic form is negative definite, implies that $\tilde Q(u, u)$ is positive definite. Note that the map of the form (3.7) and, consequently, the map (3.10), preserves the ellipticity condition. The reduced quadratic form in (3.13) has the form:

\[
\sum_{j,k=1}^{n-1} g_0^{jk}(x)\,\xi_j\xi_k - \xi_n^2 - 2\sum_{j=1}^{n-1} g_0^{0j}\,\xi_j\xi_n. \tag{3.23}
\]

The "ellipticity" condition (2.3) implies that (3.23) is negative definite. Replacing in the complexification of (3.23) $\xi_n$ by $2u_s$ and $\xi_j$ by $-u_{y_j}$, $1 \le j \le n-1$, we get that $\tilde Q(u, u)$ is positive definite assuming that $T$ is small.

Having Green's formulas (3.20) with positive definite $\tilde Q(u, u)$ we can proceed as in [E1]. Let $L^{(i)}$, $i = 1, 2$, be two operators of the form (2.1) and let $(y_0, y) = \Phi_i(x_0, x)$, $i = 1, 2$, be two maps of the form (3.10) that transform $L^{(i)}$ to $\hat L^{(i)}$ of the form (3.11), $i = 1, 2$. Let $L_1^{(1)}$ and $L_1^{(2)}$ be two operators of the form (3.13).
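The substitution that links (3.23) to the integrand of (3.21) can be checked directly. A sketch for real-valued $u$, writing $q(\xi)$ for the form (3.23) (a name introduced here for illustration) and assuming the integrand of $\tilde Q$ is as in (3.21):

```latex
q(\xi) = \sum_{j,k=1}^{n-1} g_0^{jk}\xi_j\xi_k - \xi_n^2 - 2\sum_{j=1}^{n-1} g_0^{0j}\xi_j\xi_n ,
\qquad \xi_n = 2u_s,\quad \xi_j = -u_{y_j} :
\\[4pt]
-\,q(\xi) = 4u_s^2 - \sum_{j,k=1}^{n-1} g_0^{jk}u_{y_j}u_{y_k} - 4\sum_{j=1}^{n-1} g_0^{0j}u_s u_{y_j} ,
\\[4pt]
\tilde Q(u,u) = \frac12\int_{Y_{20}}\big[-q(\xi) + V_1 u^2\big]\,dy'\,ds .
% Since q is negative definite, -q(\xi) \ge c\,(u_s^2 + |u_{y'}|^2) for some c > 0;
% the zero-order term V_1 u^2 is the one that requires T to be small.
```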

Let $v_i^g$, $u_i^f$, $i = 1, 2$, be such that $L_1^{(i)} u_i^f = 0$, $L_1^{(i)} v_i^g = 0$ in $X_{20}^{(i)}$, $u_i^f|_{y_n=0} = f$, $v_i^g|_{y_n=0} = g$, $u_i^f = v_i^g = u^f_{i y_0} = v^g_{i y_0} = 0$ for $y_0 = 0$, $y_n > 0$, $i = 1, 2$. We shall denote by $[g_{i0}^{jk}]_{j,k=0}^{n}$ the matrices of $L_1^{(i)}$ in $(y_0, y', y_n)$ coordinates, $i = 1, 2$. As in (3.13) we have $g_{i0}^{0j} = -g_{i0}^{nj}$, $g_{i0}^{+,j} = g_{i0}^{0j}$, $g_{i0}^{-,j} = 0$.

We assume that $\operatorname{supp} f$ and $\operatorname{supp} g$ are contained in $\Gamma^{(2)} \times (0, T]$ and $\Lambda_1^{(1)} = \Lambda_1^{(2)}$ on $\Gamma^{(2)} \times (0, T)$, where $\Lambda_1^{(i)}$ are the DN operators for $L_1^{(i)}$, $i = 1, 2$.

Let $\Gamma_i^{(j)}$, $D_{js_0}^{(i)}$, $Y_{js_0}^{(i)}$, $X_{js_0}^{(i)}$, $j = 1, 2$, correspond to $L_1^{(i)}$, $i = 1, 2$. It was proven in Lemma 2.4 in [E1] that if $\Gamma_1^{(1)} = \Gamma_2^{(1)}$ then $D_{10}^{(1)} \cap \{y_n = 0\} = D_{10}^{(2)} \cap \{y_n = 0\}$. Therefore we can take $\Gamma_1^{(2)} = \Gamma_2^{(2)}$, i.e. the sets $\Gamma^{(1)}$, $\Gamma^{(2)}$ can be chosen the same for $i = 1, 2$.

Denote by $\mathring H^1(Y_{js_0}^{(i)})$ the closure of $C_0^\infty(Y_{js_0}^{(i)})$ in the Sobolev norm $\|u\|_{1, Y_{js_0}^{(i)}}$ and denote by $H_0^1(Y_{js_0}^{(i)})$ the closure of $C^\infty$ functions in $Y_{js_0}^{(i)}$ equal to zero on $\partial Y_{js_0}^{(i)} \setminus \{y_n = 0\}$. Analogously one defines $\mathring H^1(\Gamma^{(j)} \times [s_0, T])$ and $H_0^1(\Gamma^{(j)} \times [s_0, T])$ (cf. [E1]).

Lemma 3.1 (cf. Lemma 3.4 in [E1]). Assuming that $\Lambda_1^{(1)} = \Lambda_1^{(2)}$ on $\Gamma^{(2)} \times (0, T)$ we have

\[
C_1\|u_1^f\|_{1, Y_{2s_0}^{(1)}} \le \|u_2^f\|_{1, Y_{2s_0}^{(2)}} \le C_2\|u_1^f\|_{1, Y_{2s_0}^{(1)}} \tag{3.24}
\]

for all $f \in H_0^1(\Gamma^{(2)} \times (s_0, T))$.

Proof. Applying the Green's formula (3.20) for $i = 1, 2$ and taking into account that $\Lambda_1^{(1)} = \Lambda_1^{(2)}$ we get

\[
Q^{(1)}(u_1^f, u_1^f) = Q^{(2)}(u_2^f, u_2^f),
\]

where $Q^{(i)}$ corresponds to $L_1^{(i)}$, $i = 1, 2$. The inequality (3.24) follows from the ellipticity of $Q^{(i)}$, $i = 1, 2$.

Denote by 1 the domain in Rn+1 bounded by the planes: 2 = {τ = T −yn −y0 = 0, 0 ≤ yn ≤ T2 , y ∈ Rn−1 }, 3 = {s = y0 − yn = 0, T2 ≤ yn ≤ T, y ∈ Rn−1 } and 4 = {y0 = T, 0 ≤ yn ≤ T, y ∈ Rn−1 }. Let L 1 be an operator of the form (3.13) in 1 . ◦

Lemma 3.2 (cf. Lemma 3.1 in [E1] and Lemma 3.1 in [E3]). For any v0 ∈ H 1 (2 ) ◦

there exists u ∈ H 1 (1 ), w0 ∈ H 1 (4 ), w1 ∈ L 2 (4 ) such that L 1 u = 0 in 1 , u|2 = v0 , u|3 = 0, u|4 = w0 , u y0 |4 = w1 . Proof. Integrating by part as in the proof of (3.20) and taking into account that u|3 = 0 we get an identity (cf. (3.1) in [E1]): Q(v0 , v0 ) = E(u, u),

(3.25)


where E(u, u) =

4

n

(|u y0 |2 −

jk

g0 u y j u u k + V1 |u|2 )dy.

j,k=1

Note that $g_0^{nn} = -1$, $g_0^{nj} = -g_0^{0j}$ and the quadratic form $E(u, u)$ is positive definite assuming that $T$ is small. Once the identity (3.25) is established, the proof of Lemma 3.2 proceeds as in [E1], Lemma 3.1.

Lemma 3.3 (Density lemma) (cf. Lemma 2.2 in [E1]). For any $w \in H_0^1(R_{js_0})$ there exists a sequence $u^{f_n} \in H_0^1(Y_{js_0})$, $f_n \in H_0^1(\Gamma^{(j)} \times (s_0, T))$, such that $\|w - u^{f_n}\|_{1, Y_{js_0}} \to 0$ when $n \to \infty$.

Note that $H_0^1(R_{js_0}) \subset H_0^1(Y_{js_0})$. The proof of Lemma 3.3 is based on the Green's formula (3.20), Lemma 3.2 and the unique continuation theorem of Tataru (cf. [T]), and it is identical to the proof of Lemma 2.2 in [E1].

The main lemma used in the proof of Theorem 3.1 is the following.

Lemma 3.4 (cf. (2.40) in [E1]). Let $u_i^f$, $v_i^g$, $i = 1, 2$, be the solutions of $L_1^{(i)} u = 0$ in $X_{10}^{(i)}$, $i = 1, 2$, with zero initial conditions and $u_i^f|_{y_n=0} = f$, $v_i^g|_{y_n=0} = g$, where $f$, $g$ belong to $H_0^1(\Gamma^{(1)} \times [0, T])$. Assume for simplicity that $f$ is smooth. Suppose $\Lambda_1^{(1)} = \Lambda_1^{(2)}$ on $\Gamma^{(2)} \times (0, T)$. Then for any $s_0 \in [0, T]$ we have

\[
\int_{Y_{10}^{(1)} \cap \{s \ge s_0\}} \frac{\partial u_1^f}{\partial s}\,v_1^g\,ds\,dy'
 = \int_{Y_{10}^{(2)} \cap \{s \ge s_0\}} \frac{\partial u_2^f}{\partial s}\,v_2^g\,ds\,dy'. \tag{3.26}
\]

The proof of Lemma 3.4 uses Lemmas 3.1 and 3.3, and is exactly the same as the proof of (2.40) in [E1]. We shall repeat this proof here for the convenience of the reader. Integrating by parts we obtain

\[
\int_{Y_{20}} (u_s^f v^g - u^f v_s^g)\,ds\,dy'
 = 2\int_{Y_{20}} u_s^f v^g\,ds\,dy' - \int_{\Gamma^{(2)}} u^f(T, y', 0)\,v^g(T, y', 0)\,dy'. \tag{3.27}
\]

Since $u^f(T, y', 0) = f(T, y')$, $v^g(T, y', 0) = g(T, y')$ we get, using (3.17), that $(u_s^f, v^g) \overset{\mathrm{def}}{=} \int_{Y_{20}} u_s^f v^g\,ds\,dy'$ is determined by the DN operator. Therefore we have

\[
\Big(\frac{\partial u_1^f}{\partial s}, v_1^g\Big) = \Big(\frac{\partial u_2^f}{\partial s}, v_2^g\Big) \tag{3.28}
\]

for all $f, g \in H_0^1(\Gamma^{(2)} \times (0, T))$. Consider $f, g \in H_0^1(\Gamma^{(1)} \times (0, T))$. Then $\operatorname{supp} u_i^f$ and $\operatorname{supp} v_i^g$ are contained in $Y_{10}^{(i)}$, $i = 1, 2$. Take any $s_0 \in [0, T)$. Note that $Y_{10}^{(i)} \cap \{s \ge s_0\} \subset R_{2s_0}^{(i)}$. Let $w_i$ be such that $\frac{\partial w_i}{\partial s} = 0$ when $s > s_0$ and $w_i|_{s=s_0} = u_i^f|_{s=s_0}$.

Let $u_0^{(i)} = u_i^f - w_i$ when $s \ge s_0$, $u_0^{(i)} = 0$ when $s < s_0$. Assume that $f$, and therefore $u_i^f$, is smooth. Then $u_0^{(i)} \in H_0^1(R_{2s_0}^{(i)}) \subset H_0^1(Y_{2s_0}^{(i)})$. We shall prove that

\[
(u_{0s}^{(1)}, v_1^g) = (u_{0s}^{(2)}, v_2^g) \tag{3.29}
\]

for any $g \in H_0^1(\Gamma^{(1)} \times [0, T])$.

By Lemma 3.3 there exists $u_1^{f_n} \in H_0^1(Y_{2s_0}^{(1)})$ such that $\|u_0^{(1)} - u_1^{f_n}\|_{1, Y_{2s_0}^{(1)}} \to 0$. By Lemma 3.1, $\|u_2^{f_n} - v^{(2)}\|_{1, Y_{2s_0}^{(2)}} \to 0$ for some $v^{(2)} \in H_0^1(Y_{2s_0}^{(2)})$. Substituting $f = f_n$ in (3.28) and passing to the limit when $n \to \infty$ we get

\[
(u_{0s}^{(1)}, v_1^g) = (v_s^{(2)}, v_2^g). \tag{3.30}
\]

Note that (3.30) holds for any $g \in H_0^1(\Gamma^{(2)} \times (0, T))$. Take $g' \in H_0^1(\Gamma^{(2)} \times (s_0, T))$, i.e. $v_i^{g'} \in H_0^1(Y_{2s_0}^{(i)})$. Since $u_{0s}^{(i)} = \frac{\partial u_i^f}{\partial s}$ when $s \ge s_0$, and $v_i^{g'} = 0$ for $s < s_0$, $i = 1, 2$, we have (cf. (3.28))

\[
(u_{0s}^{(1)}, v_1^{g'}) = (u_{0s}^{(2)}, v_2^{g'}), \quad \forall\, v_i^{g'} \in H_0^1(Y_{2s_0}^{(i)}). \tag{3.31}
\]

Comparing (3.30) and (3.31) for $g = g'$, we obtain

\[
(u_{0s}^{(2)}, v_2^{g'}) = (v_s^{(2)}, v_2^{g'}). \tag{3.32}
\]

Since $v_i^{g'} \in H_0^1(Y_{2s_0}^{(i)})$ is arbitrary we get by Lemma 3.3 that

\[
v_s^{(2)} = u_{0s}^{(2)} \quad \text{on } R_{2s_0}^{(2)}. \tag{3.33}
\]

When $g \in H_0^1(\Gamma^{(1)} \times [0, T])$ we have that $(\operatorname{supp} v_2^g) \cap \{s \ge s_0\} \subset Y_{10}^{(2)} \cap \{s \ge s_0\} \subset R_{2s_0}^{(2)}$. Therefore we can replace $v_s^{(2)}$ by $u_{0s}^{(2)}$ in (3.30) when $v_2^g \in H_0^1(Y_{10}^{(2)})$, and this proves (3.29). Finally, subtracting (3.29) from (3.28) we get (3.26).

The next step of the proof of Theorem 3.1 will use the geometric optics solutions. Since the constructions here differ from [E1], p. 824, we shall proceed with more details. As in (2.41) in [E1] we are looking for $u_i^f$ in the form:

\[
u_i^f = e^{ik(s - s_0)}\sum_{p=0}^{N}\frac{1}{(ik)^p}\,a_p^{(i)}(s, \tau, y') + u_i^{(N+1)}, \tag{3.34}
\]

where $k$ is a large parameter, $i = 1, 2$,

\[
4\frac{\partial a_0^{(i)}}{\partial\tau} - 4\sum_{j=1}^{n-1} g_{i0}^{0j}(y)\frac{\partial a_0^{(i)}}{\partial y_j} = 0, \tag{3.35}
\]
\[
a_0^{(i)}\big|_{y_n=0} = \chi_1(s)\chi_2(y'), \quad i = 1, 2,
\]

$a_p^{(i)}$, $p \ge 1$, satisfy nonhomogeneous equations of the form (3.35) that we will not write here, and $u^{(N+1)}$ is the same as in (2.41) in [E1] (cf. [E1], p. 824). Here $\chi_1(s) \in C_0^\infty(\mathbb{R}^1)$, $\chi_1(s) = 1$ for $|s - s_0| < \delta$, $\chi_1(s) = 0$ for $|s - s_0| > 2\delta$, $\delta$ is small, and $\chi_2(y') \in C_0^\infty(\Gamma^{(1)})$ is arbitrary.

Let $\beta_j^{(i)}(y_n, \alpha)$ be the solution of the system of differential equations

\[
\frac{d\beta_j^{(i)}}{dy_n} = 2g_{i0}^{0j}(\beta^{(i)}, y_n), \qquad \beta_j^{(i)}(0, \alpha) = \alpha_j, \quad 1 \le j \le n-1,\ i = 1, 2. \tag{3.36}
\]

For each $y_n$, $\beta^{(i)} = \beta^{(i)}(y_n, \alpha)$ is a diffeomorphism equal to the identity when $y_n = 0$. Let $\alpha^{(i)} = \{\alpha_j^{(i)}(y_n, y')\}$ be the inverse to $\beta^{(i)} = \{\beta_j^{(i)}(y_n, \alpha)\}$, i.e. $\alpha_j^{(i)}(y_n, \beta^{(i)}(y_n, \hat y')) = \hat y_j$, $1 \le j \le n-1$, $i = 1, 2$. Differentiating this identity in $y_n$ we get

\[
\frac{\partial\alpha_j^{(i)}\big(\frac{T - s - \tau}{2}, y'\big)}{\partial\tau} - \sum_{k=1}^{n-1} g_{i0}^{k0}(y)\frac{\partial\alpha_j^{(i)}}{\partial y_k} = 0, \qquad \alpha_j^{(i)}\big|_{y_n=0} = y_j, \quad 1 \le j \le n-1. \tag{3.37}
\]

Therefore $a_0^{(i)}(s, \tau, y') = \chi_1(s)\chi_2\big(\alpha^{(i)}\big(\frac{T - s - \tau}{2}, y'\big)\big)$ is the solution of (3.35), $a_0^{(i)}|_{y_n=0} = \chi_1(s)\chi_2(y')$. Substituting the geometric optics solutions (3.34) in (3.26), integrating by parts and taking the limit when $k \to \infty$ we obtain (cf. (2.42) in [E1]):

\[
\int_{\mathbb{R}^{n-1}} \chi_2\big(\alpha^{(1)}\big(\tfrac{T - s_0}{2}, y'\big)\big)\,\bar v_1^g(s_0, 0, y')\,dy'
 = \int_{\mathbb{R}^{n-1}} \chi_2\big(\alpha^{(2)}\big(\tfrac{T - s_0}{2}, y'\big)\big)\,\bar v_2^g(s_0, 0, y')\,dy'. \tag{3.38}
\]

Note that $\tau = 0$ on $Y_{10}^{(i)}$, $i = 1, 2$. Changing $T$ to $T - \tau$, $0 < \tau \le T$, we get (3.38) for any $0 < \tau < T$. Consider the following change of coordinates:

\[
\hat s = s, \quad \hat\tau = \tau, \quad \hat y' = \alpha^{(i)}\big(\tfrac{T - s - \tau}{2}, y'\big), \quad i = 1, 2. \tag{3.39}
\]

The inverse change of variables has the form:

\[
s = \hat s, \quad \tau = \hat\tau, \quad y' = \beta^{(i)}\big(\tfrac{T - \hat s - \hat\tau}{2}, \hat y'\big), \quad i = 1, 2. \tag{3.40}
\]
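That $a_0^{(i)} = \chi_1(s)\,\chi_2(\alpha^{(i)})$ solves the transport equation (3.35) follows from the chain rule and (3.37). A sketch, assuming $g_{i0}^{k0} = g_{i0}^{0k}$ and writing $\partial_m\chi_2$ for the derivative of $\chi_2$ in its $m$-th argument (notation introduced here for illustration):

```latex
4\,\partial_\tau a_0^{(i)} - 4\sum_{j=1}^{n-1} g_{i0}^{0j}\,\partial_{y_j} a_0^{(i)}
 = 4\,\chi_1(s)\sum_{m=1}^{n-1} (\partial_m\chi_2)(\alpha^{(i)})
   \Big[\partial_\tau\alpha_m^{(i)} - \sum_{j=1}^{n-1} g_{i0}^{0j}\,\partial_{y_j}\alpha_m^{(i)}\Big]
 = 0 \quad \text{by (3.37)} .
% \chi_1 depends on s only, and (3.35) contains no s-derivative, so \chi_1 factors out.
% At y_n = 0, \alpha^{(i)}(0, y') = y', which gives the initial condition \chi_1(s)\chi_2(y').
```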

Note that y = β (i) (yn , yˆ ) is the endpoint of the curve (3.36) starting at yˆ ∈ when yn = 0 and yˆ = α (i) (yn , y ). (1) Let = {(s, τ ) : s ≥ 0, τ ≥ 0, s + τ ≤ T }. Denote by β (i) ( × ) the image of (i) (1) (1) × under the map (3.40), i = 1, 2. Note that β (i) ( × ) is contained in X 10 . (i) (1) (i) de f (i) = Q 1 ∩ β (i) ( × ) is contained in R = Q 1 ∩ X 10 , i = 1, 2. Therefore R˜ 10

10

(1)

(i)

Here Q 1 is the rectangle {(s, τ, y ) : τ = 0, s ∈ [0, T ], y ∈ }. Denote by B10 (i) (i) the image of R˜ 10 under the map (3.39). Finally, denote by B (i) the projection of B10 (1)

on the plane $y_0 = 0$. Note that $B^{(i)} \subset \Gamma^{(1)} \times [0, \frac T2]$, $i = 1, 2$. We shall assume that $B^{(1)} \supset \Gamma \times [0, \frac T2]$. This assumption can always be satisfied when $T$ is small enough. Make the change of variables (3.40) in (3.38). We get

\[
\int_{\Gamma^{(1)}} \chi_2(\hat y')\,\bar v_1^g\big(s, \tau, \beta^{(1)}\big(\tfrac{T - \tau - s}{2}, \hat y'\big)\big)\,J_1(y_n, \hat y')\,d\hat y'
 = \int_{\Gamma^{(1)}} \chi_2(\hat y')\,\bar v_2^g\big(s, \tau, \beta^{(2)}\big(\tfrac{T - \tau - s}{2}, \hat y'\big)\big)\,J_2(y_n, \hat y')\,d\hat y', \tag{3.41}
\]

where $y_n = \frac{T - s - \tau}{2}$ and $J_i\big(\frac{T - s - \tau}{2}, \hat y'\big)$ is the Jacobian of the map (3.40). Since $\chi_2(y')$ is any $C_0^\infty(\Gamma^{(1)})$ function, we get that for any $\hat y' \in \Gamma^{(1)}$,

\[
v_1^g\big(s, \tau, \beta^{(1)}\big(\tfrac{T - \tau - s}{2}, \hat y'\big)\big)\,J_1(y_n, \hat y')
 = v_2^g\big(s, \tau, \beta^{(2)}\big(\tfrac{T - \tau - s}{2}, \hat y'\big)\big)\,J_2(y_n, \hat y'). \tag{3.42}
\]

Note that (3.42) holds for (s, τ, yˆ ) ∈ × . g Let χ1 (s) be the same as before, and χ3 (y ) ∈ C0∞ ( (1) ) be arbitrary. Construct vi,k as geometric optics solution (3.34) with g = χ1 (s)χ3 (y ). Take s = s0 and k → ∞. We get vi∞ = χ1 (s0 )χ3 (α (i) (s0 , τ, y )), g

(3.43)

where vi,∞ = limk→∞ vi,k . Note that eik(s−s0 ) = 1 when s = s0 . Substituting vi,k in (3.38) and taking the limit when k → ∞ we obtain χ2 (α (1) (s0 , τ, y )χ3 (α (1) (s0 , τ, y )dy Rn−1 = χ2 (α (2) (s0 , τ, y )χ3 (α (2) (s0 , τ, y )dy . g

g

g

Rn−1

Make the change of variables (3.40). Since χ2 , χ3 are arbitrary we get, as in (3.42), that J1 (y) = J2 (y). Therefore v1 (s, τ, β (1) ( g

T −τ −s T −τ −s g , yˆ )) = v2 (s, τ, β (2) ( , yˆ )), 2 2

(3.44)

(1)

where (s, τ, yˆ ) ∈ × . g g g g Let wi (s, τ, yˆ ) = vi (s, τ, β (i) ), i = 1, 2. Then w1 (s, τ, yˆ ) = w2 (s, τ, yˆ ), ∀(s, τ, (1)

yˆ ) ∈ × . Our strategy to complete the proof of Theorem 3.1 will be the following: (i) g (i) g Making the changes of variables (3.40) in L 1 vi = 0 we get L˜ 1 wi = 0, i = 1, 2. g g Using that w1 = w2 for all g ∈ H01 ( (1) × (0, T )) and using the density Lemma 3.4 we (1) (2) shall prove that the coefficients of L˜ 1 and L˜ 1 are equal. Since the density property holds for τ fixed we have to take care of terms in L˜ (i) 1 that contain derivatives in τ .

Note that integrating by parts as in (3.27) we get

\[
\int_{Y_{20}} (u_s^f v^g - u^f v_s^g)\,ds\,dy'
 = -2\int_{Y_{20}} u^f v_s^g\,ds\,dy' + \int_{\Gamma^{(2)}} u^f(T, y', 0)\,v^g(T, y', 0)\,dy'.
\]

Therefore as in (3.28) we conclude that

\[
(u_1^f, v_{1s}^g) = (u_2^f, v_{2s}^g). \tag{3.45}
\]

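We shall assume that both $f$ and $g$ are smooth. Using (3.45) instead of (3.28) we get an equality of the form (3.26) with the roles of $u^f$ and $v^g$ reversed:

\[
\int_{Y_{10}^{(1)} \cap \{s \ge s_0\}} u_1^f\,\frac{\partial v_1^g}{\partial s}\,dy'\,ds
 = \int_{Y_{10}^{(2)} \cap \{s \ge s_0\}} u_2^f\,\frac{\partial v_2^g}{\partial s}\,dy'\,ds. \tag{3.46}
\]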

From (3.46) we get, analogously to (3.44), that g ∂v (s, τ, β (2) ( T −τ2 −s , yˆ )) ∂v1 (s, τ, β (1) ) (1) = 2 on × . ∂s ∂s g

We used here again that J1 (yn , yˆ ) = J2 (yn , yˆ ) in × g vi (s, τ, β i ) in s and yˆ we get

(1)

(3.47)

. Differentiating wi (s, τ, yˆ ) = g

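\[
\frac{\partial w_i^g(s, \tau, \hat y')}{\partial s}
 = \frac{\partial v_i^g(s, \tau, \beta^{(i)})}{\partial s}
 + \sum_{k=1}^{n-1}\frac{\partial v_i^g(s, \tau, \beta^{(i)})}{\partial y_k}\,\beta_{ks}^{(i)}, \tag{3.48}
\]
\[
\frac{\partial w_i^g(s, \tau, \hat y')}{\partial \hat y_j}
 = \sum_{k=1}^{n-1}\frac{\partial v_i^g(s, \tau, \beta^{(i)})}{\partial y_k}\,\frac{\partial\beta_k^{(i)}}{\partial \hat y_j}, \tag{3.49}
\]

where $\beta^{(i)} = \beta^{(i)}(y_n, \hat y')$, $y_n = \frac{T - s - \tau}{2}$. It follows from (3.49) that

\[
\frac{\partial v_i^g(s, \tau, \beta^{(i)})}{\partial y_k}
 = \sum_{j=1}^{n-1}\frac{\partial\alpha_j^{(i)}\big(\frac{T - s - \tau}{2}, \beta^{(i)}\big)}{\partial y_k}\,\frac{\partial w_i^g(s, \tau, \hat y')}{\partial \hat y_j}, \tag{3.50}
\]

where $\Big[\frac{\partial\alpha_j^{(i)}(y_n, \beta^{(i)})}{\partial y_k}\Big]$ is the inverse matrix to $\Big[\frac{\partial\beta_k^{(i)}(y_n, \hat y')}{\partial \hat y_j}\Big]$.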
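The inversion of (3.49) that produces (3.50) can be verified with two lines of index bookkeeping. A sketch, with the shorthand matrices $A_{jk} = \partial\alpha_j^{(i)}/\partial y_k$ and $B_{kj} = \partial\beta_k^{(i)}/\partial\hat y_j$ introduced here for illustration only:

```latex
% Differentiating \alpha^{(i)}(y_n, \beta^{(i)}(y_n, \hat y')) = \hat y' in \hat y' gives AB = I, hence BA = I:
\sum_{j=1}^{n-1} B_{kj}\,A_{jl} = \delta_{kl} .
% Multiply (3.49) by A_{jl} and sum over j:
\sum_{j=1}^{n-1} A_{jl}\,\frac{\partial w_i^g}{\partial\hat y_j}
 = \sum_{k=1}^{n-1}\frac{\partial v_i^g}{\partial y_k}\sum_{j=1}^{n-1} B_{kj}\,A_{jl}
 = \frac{\partial v_i^g}{\partial y_l} ,
% which is (3.50), with the sum taken over the index j of \hat y_j.
```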

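Substituting (3.50) into (3.48), using (3.47) and that $w_1^g(s, \tau, \hat y') = w_2^g(s, \tau, \hat y')$, we get

\[
\sum_{j,k=1}^{n-1}\frac{\partial\alpha_j^{(1)}(y_n, \beta^{(1)})}{\partial y_k}\,\beta_{ks}^{(1)}\,\frac{\partial w_1^g(s, \tau, \hat y')}{\partial \hat y_j}
 = \sum_{j,k=1}^{n-1}\frac{\partial\alpha_j^{(2)}(y_n, \beta^{(2)})}{\partial y_k}\,\beta_{ks}^{(2)}\,\frac{\partial w_1^g(s, \tau, \hat y')}{\partial \hat y_j}, \tag{3.51}
\]

where $y_n = \frac{T - s - \tau}{2}$, $\tau = 0$, $(s, \hat y') \in \Gamma^{(1)} \times [0, T]$.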

Since $\{v_1^g(s, \tau, y'),\ g \in C_0^\infty(\Gamma^{(1)} \times (0, T])\}$ are dense in $\mathring H^1(R_{10}^{(1)})$ (cf. Lemma 3.3), we get that $\{w_1^g(s, \tau, \hat y')\}$ are dense in $\mathring H^1(B_{10}^{(1)})$, where $B_{10}^{(1)}$ is the image of $\tilde R_{10}^{(1)} \subset R_{10}^{(1)}$ under the map (3.39). The density lemma implies (cf. the end of Sect. 2 in [E3]) that

\[
\sum_{k=1}^{n-1}\frac{\partial\alpha_j^{(1)}(y_n, \beta^{(1)})}{\partial y_k}\,\beta_{ks}^{(1)}(y_n, \hat y')
 = \sum_{k=1}^{n-1}\frac{\partial\alpha_j^{(2)}(y_n, \beta^{(2)})}{\partial y_k}\,\beta_{ks}^{(2)}(y_n, \hat y') \tag{3.52}
\]

on $B_{10}^{(1)}$. Here $\tau = 0$, $y_n = \frac{T - s}{2}$. Note that $B^{(1)}$ is the projection of $B_{10}^{(1)}$ on the plane $y_0 = 0$. Therefore (3.52) holds on $B^{(1)}$ since $\alpha^{(i)}$ and $\beta^{(i)}$ do not depend on $y_0$. As before, differentiating in $s$ the identity

\[
\alpha_j^{(i)}\big(\tfrac{T - s - \tau}{2}, \beta^{(i)}\big(\tfrac{T - s - \tau}{2}, \hat y'\big)\big) = \hat y_j, \quad 1 \le j \le n-1,\ i = 1, 2, \tag{3.53}
\]

we get:

\[
\alpha_{js}^{(i)}\big(\tfrac{T - s - \tau}{2}, \beta^{(i)}\big(\tfrac{T - s - \tau}{2}, \hat y'\big)\big)
 + \sum_{k=1}^{n-1}\alpha_{j y_k}^{(i)}\big(\tfrac{T - s - \tau}{2}, \beta^{(i)}\big)\,\beta_{ks}^{(i)}\big(\tfrac{T - s - \tau}{2}, \hat y'\big) = 0, \quad i = 1, 2. \tag{3.54}
\]

Combining (3.54) and (3.52) we get

\[
\alpha_{js}^{(1)}(y_n, \beta^{(1)}(y_n, \hat y')) = \alpha_{js}^{(2)}(y_n, \beta^{(2)}(y_n, \hat y')), \quad 1 \le j \le n-1,\ (y_n, \hat y') \in B^{(1)}. \tag{3.55}
\]

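Consider the equations $L_1^{(i)} v_i^g = 0$ in $X_{10}^{(i)}$. They have the following form in $(s, \tau, y')$ coordinates:

\[
L_1^{(i)} v_i^g = -4\frac{\partial^2 v_i^g}{\partial s\,\partial\tau}
 + \sum_{j,k=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_{i0}^{jk}\frac{\partial v_i^g}{\partial y_k}\Big)
 + 2\sum_{j=1}^{n-1}\frac{\partial}{\partial s}\Big(g_{i0}^{+,j}\frac{\partial v_i^g}{\partial y_j}\Big)
 + 2\sum_{j=1}^{n-1}\frac{\partial}{\partial y_j}\Big(g_{i0}^{+,j}\frac{\partial v_i^g}{\partial s}\Big)
 + V_1^{(i)} v_i^g = 0, \tag{3.56}
\]

where $g_{i0}^{+,j} = g_{i0}^{0j}$. Note that $g_{i0}^{-,j}$, i.e. the coefficient of $\frac{\partial^2 v_i^g}{\partial\tau\,\partial y_j}$, is zero. Making the change of variables (3.39) we get equations of the form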

g g ∂w ∂w ∂ ∂ J1 i + J1 i −2J1−1 (yn , yˆ ) ∂s ∂τ ∂τ ∂s g g n−1 ∂ ∂ (i) (i) −1 (i) ∂wi (i) ∂wi J1 α js (yn , β ) − 2J1 + J1 α js (yn , β ) ∂τ ∂ yˆ j ∂yj ∂τ j=1 g n−1 (1) jk ∂wi g (i) −1 ∂ J1 J1 g˜i0 + V1 (yn , β (i) )wi (s, τ, yˆ ) = 0, (s, τ, yˆ ) ∈ × , + ∂ yˆ j ∂ yˆk

(i) g de f L˜ 1 wi =

j,k=1

(3.57) where jk g˜i0 (yn , yˆ )

=

(i) ∂α j (yn , β (i) (yn , yˆ )) pr (i) gi0 (yn , β ) ∂ yp p,r =1 n

∂αk(i) (yn , β (i) (yn , yˆ )) , (3.58) ∂ yr

340

G. Eskin pr

nn = −1. We used in (3.57) that (cf. 1 ≤ j, k ≤ n − 1, gi0 are the same as in (3.13), gi0 (3.37)) +, j g˜i0

=

(i) ∂α (i) j (yn , β (yn , yˆ )) +, p (i) gi0 (yn , β ) ∂ yp p=1

n−1

(i)

−, j

g˜i0 = −

∂α j (yn , β (i) (yn , yˆ )) ∂s

−

g (2) L˜ 1 )w1

=

n−1 j,k=1

= 0,

(3.59)

−, j

g

(1) ( L˜ 1

∂τ

, since gi0 = 0, 1 ≤ j ≤ n − 1.

Since w1 (s, τ, yˆ ) = w2 (s, τ, yˆ ) in × g

−

∂α (i) j

∂ J1−1 ∂ yˆ j

(1)

(1)

, we have in B10 :

jk J1 (g˜ 10

−

g jk ∂w1 g˜ 20 ) ∂ yˆk

+(V1(1) (yn , β (1) ) − V1(2) (yn , β (2) ))w1 = 0. g

(1)

(3.60) (2)

We took into account that J1 (yn , yˆ ) = J2 (yn , yˆ ) and α js (yn , β (1) (yn , yˆ )) = α js (yn ,

(1) . β (2) (yn , yˆ )), 1 ≤ j ≤ n − 1, hold on B10

◦

(1)

Since {wi , g ∈ C0∞ ( (1) × (0, T ))} are dense in H 1 (B10 ), we get, as in [E3] (see the end of Sect. 2 in [E3]), that g

(1) g˜ 10 = g˜ 20 , V1(1) (yn , β (1) ) = V1(2) (yn , β (2) ) in B10 , 1 ≤ j, k ≤ n − 1. jk

jk

(3.61)

Noting that the coefficients in (3.61) do not depend on y0 and B (1) is the projection of (1) (1) (2) B10 on y0 = 0 we have that (3.61) holds in B (1) . Therefore we proved that L˜ 1 = L˜ 1 in B (1) . Now we shall prove that also L˜ (1) = L˜ (2) in B (1) , where L˜ (i) are the operators Lˆ (i) (see (3.8)) in (s, τ, yˆ ) coordinates. Operators L˜ (i) have the following form (cf. (3.57)): ∂ +,− ∂ +,− ∂ (i) (i) ∂ ˜L (i) = − 2 gˆi (yn , β (yn , yˆ )) + gˆ (yn , β ) ∂τ ∂τ i ∂s |gˆi | ∂s +

n−1

j,k=1

−

n−1

∂ jk ∂ |g˜i |g˜i ∂ yk |g˜i | ∂ y j 1

2

∂ ∂ (i) |g˜i |gˆi+,− (yn , β (i) )α js (yn , β (i) ) ∂τ ∂ yk |g˜i |

2

∂ ∂ (i) |g˜i |gˆi+,− (yn , β (i) )α js (yn , β (i) ) , ∂τ |g˜i | ∂ y j

j=1

−

n−1 j=1

1

1

(3.62)

where g˜i has the form (3.58) with gi0 replaced by gˆi (yn , β (i) (yn , yˆ )). Since pr pr gi0 = (gˆi+,− )−1 gˆi we get that jk

pr

pr

g˜i (yn , yˆ ) = (gˆi+,− (yn , β (i) ))−1 g˜i0 (yn , yˆ ). jk

jk

(3.63)

Optical Aharonov-Bohm Effect: An Inverse Hyperbolic Problems Approach −, j

We used in (3.62) that gˆi

= 0 for 1 ≤ j ≤ n − 1, i = 1, 2, and that (3.37) implies

n−1

(i)

+, j

gˆi

j=1 0j

since gi0 =

341

∂α j

(i)

− gˆi+,−

∂yj

∂α j

∂τ

= 0,

+, j

gˆi . gˆ +,−

Therefore to prove that $\tilde L^{(1)} = \tilde L^{(2)}$ it remains to prove that

\[
\hat g_1^{+,-}(y_n, \beta^{(1)}) = \hat g_2^{+,-}(y_n, \beta^{(2)}). \tag{3.64}
\]

Making the change of coordinates (3.39) in (3.14) we get n n (i) ˜ ˜ (i) ∂ A˜ (i) ∂ ∂ A jk jk ∂ A (i) Ji−1 g˜i0 , (3.65) Ji g˜i0 − V1 (yn , β (i) (yn , yˆ )) = − ∂ yˆ j ∂ yˆk ∂ yˆ j ∂ yˆk j,k=1

j,k=1

jk where yˆn = yn , A˜ (i) (yn , yˆ ) = A(i) (yn , β (i) (yn , yˆ )), g˜i0 , 1 ≤ j, k ≤ n − 1, are the jn nj nn (i) same as in (3.57), g˜i0 = g˜i0 = −α (i) js (yn , β ), 1 ≤ j ≤ n − 1, g˜ i0 = −1. jk

jk

Taking into account that g˜ 10 = g˜ 20 , J1 = J2 , and (1) (1) (2) (2) (1) (2) (1) (1) (2) (2) A˜ yˆ A˜ yˆ − A˜ yˆ A˜ yˆ = ( A˜ yˆ − A˜ yˆ ) A˜ yˆ + ( A˜ yˆ − A˜ yˆ ) A˜ yˆ , j

k

j

k

j

j

k

k

k

j

we can rewrite 0 = V1(1) (yn , β (1) ) − V1(2) (yn , β (2) ) as a homogeneous second order elliptic equation for A(1) (yn , β (1) ) − A(2) (yn , β (2) ), 1 1 jk − 14 (cf. (3.9)). where A(i) (yn , y ) = ln((gˆi+,− (y)) 2 |gˆi | 4 ) = ln( √1 (det[gˆi (y)]n−1 j,k=1 ) 2 (1) (2) ˜ ˜ Since A and A have the same Cauchy data when yn = 0 (see Remark 2.2 in [E1]) we get, by the unique continuation theorem for the elliptic equations, that A(1) (yn , β (1) ) = A(2) (yn , β (2) ) in B (1) . Since gˆ i (yn , β (i) ) = jk

gi0 (yn , β (i) ) jk

gˆi+,− (yn , β (i) )

and g10 (yn , β (1) ) = g20 (yn , β (2) ) jk

jk

we get (3.64). Therefore $\tilde L^{(1)} = \tilde L^{(2)}$ in $B^{(1)}$. Note that by the assumption $B^{(1)} \supset \Gamma \times [0, \frac T2]$.

Theorem 3.1 concludes the local step of the proof of the main Theorem 2.3. The global step of the proof is similar to the proof in [E2]: Consider the initial-boundary value problems for L (i) u i = 0 in domains (i) = (i) i 0 \ ∪mj=1 j , i = 1, 2. Let i ⊂ (i) be the image of ϕi−1 ◦ α (i) ( × [0, T2 ]), where αi is the map (3.39) and i (x0 , x) = (x0 + ai (x), ϕi (x)) is the map (3.10). Denote by


(1) ◦ β ◦ , where β has the form (3.40). Note that is 3 the map 3 = −1 2 2 2 3 1 ◦α a diffeomorphism of the form (3.10), 3 = I on (2 ∩ ∂0 ) × (−∞, +∞) and

3 ◦ L (2) = L (1) on 1 . Note that any map of the form (3.10) can be represented as a composition = a1 ◦ ϕ1 = ϕ2 ◦ a2 , where ϕi are the diffeomorphisms of 2 onto 1 and maps ai have the form y0 = x0 + ai (x), y = x, ai (x) ∈ C ∞ , ai (x) = 0 on ∂0 . ˜ 3 of the map 3 such It follows from [Hi], Chap. 8, that there exists an extension that 3 |∂0 ×(−∞,∞) = I, 3 has a form (3.10), i.e. 3 = a3 ◦ ϕ3 , ϕ3 is a diffeomor(2)

(3) de f

(2)

phism of onto = ϕ3 ( ). Denote L (3) = a3 ◦ ϕ3 ◦ L (2) . Then L (3) is a differential operator of the form (2.1) on (3) , 1 ⊂ (3) and L (3) = L (1) on 1 . The proof of the following lemma is the same as in [E1], Lemma 3.3 (cf. [KKL1], Lemma 9):

Lemma 3.5. Let 1 ⊂ 1 be such that 1 \1 has a smooth boundary, γ1 = ∂0 ∩∂1 is connected and L (1) = L (3) on 1 . Let γ2 = ∂1 \γ1 . Suppose (1) = (2) on ∂0 ×(−∞, +∞), where (i) are DN operators corresponding to L (i) , respectively, i = (3) 1, 2, 3. Note that (3) = (2) on ∂0 × (−∞, +∞). Then the DN operators (1) 1 , 1 corresponding to the operators L (1) , L (3) in the smaller domains (1) \1 , (3) \1 are equal on ((∂0 \γ 1 ) ∪ γ2 ) × (−∞, +∞). Therefore Theorem 3.1 and Lemma 3.5 reduce the inverse problem for L (1) , L (2) in × (−∞, +∞), (2) × (−∞, +∞) to the inverse problem for L (1) , L (3) in smaller domains ((1) \1 ) × (−∞, +∞), ((3) \1 ) × (−∞, +∞). Continuing this process as in [E2] we can prove the main Theorem 2.3. Note that it is enough to have (1) = (2) on × (0, T0 ), where T0 is large enough, to prove Theorem 2.3. (1)

References

[AB] Aharonov, Y., Bohm, D.: Significance of electromagnetic potentials in quantum theory. Phys. Rev., Second Series 115, 485–491 (1959)
[B] Belishev, M.: Boundary control in reconstruction of manifolds and metrics (the BC method). Inverse Problems 13, R1–R45 (1997)
[BCLUW] Berry, M., Chambers, R., Large, M., Upstill, C., Walmsley, J.: Eur. J. Phys. 1, 154 (1980)
[CFM] Cook, R., Fearn, H., Millouni, P.: Am. J. Phys. 63, 705 (1995)
[E1] Eskin, G.: A new approach to the hyperbolic inverse problems. Inverse Problems 22(3), 815–831 (2006)
[E2] Eskin, G.: A new approach to the hyperbolic inverse problems ii: global step. Inverse Problems 23, 2343–2356 (2007)
[E3] Eskin, G.: Inverse hyperbolic problems with time-dependent coefficients. Comm. in PDE 32, 1737–1758 (2007)
[E4] Eskin, G.: Inverse problems for the Schrödinger equations with time-dependent electromagnetic potentials and the Aharonov-Bohm effect. J. Math. Phys. 49, 022105 (2008)
[E5] Eskin, G.: Inverse boundary value problems in domains with several obstacles. Inverse Problems 20, 1497–1516 (2004)
[ER] Eskin, G., Ralston, J.: Inverse scattering problem for the Schrödinger equation with magnetic and electric potentials. The IMA Volumes in Mathematics and its Applications, Vol. 90, New York: Springer, 1997, pp. 147–166
[ER1] Eskin, G., Ralston, J.: Inverse scattering problem for the Schrödinger equation with magnetic potential at a fixed energy. Commun. Math. Phys. 173, 199–224 (1995)
[G] Gordon, W.: Ann. Phys. (Leipzig) 72, 421 (1923)
[Hi] Hirsch, M.: Differential Topology, New York: Springer, 1976
[KKL] Katchalov, A., Kurylev, Y., Lassas, M.: Inverse Boundary Spectral Problems, Boca Raton: Chapman & Hall, 2001
[KKL1] Katchalov, A., Kurylev, Y., Lassas, M.: Energy measurements and equivalence of boundary data for inverse problems on noncompact manifolds. IMA Volumes 137, 183–214 (2004)
[KL] Kurylev, Y., Lassas, M.: Hyperbolic inverse problems with data on a part of the boundary. AMS/IP Stud. Adv. Math. 16, 259–272 (2000)
[KKLM] Katchalov, A., Kurylev, Y., Lassas, M., Mandache, N.: Equivalence of time-domain inverse problems and boundary spectral problems. Inverse Problems 20(2), 419–436 (2004)
[LP] Leonhardt, V., Philbin, T.: General relativity in Electrical Engineering. New J. Phys. 8, 247 (2006)
[LP1] Leonhardt, V., Piwnicki, P.: Phys. Rev. A60, 4301 (1999)
[LP2] Leonhardt, V., Piwnicki, P.: Phys. Rev. Lett. 84, 822 (2000)
[LU] Lee, J., Uhlmann, G.: Determining anisotropic real-analytic conductivities by boundary measurements. Comm. Pure Appl. Math. 42, 1097–1112 (1989)
[N] Nicoleau, F.: An inverse scattering problem with the Aharonov-Bohm effect. J. Math. Phys. 41, 5223–5237 (2000)
[NSU] Nakamura, G., Sun, Z., Uhlmann, G.: Global identifiability for inverse problem for the Schrödinger equation in a magnetic field. Math. Ann. 303(1), 377–88 (1995)
[NVV] Novello, M., Visser, M., Volovik, G. (eds): Artificial Black Holes, Singapore: World Scientific, 2002
[OD] O’Dell, S.: Inverse scattering for the Laplace-Beltrami operators with complex-valued electromagnetic potentials and embedded obstacles. Inverse Problems 22(5), 1579–1603 (2006)
[OP] Olarin, S., Popescu, I. Iovitzu: The quantum effects of electromagnetic fluxes. Rev. Mod. Phys. 57(N2), 339–436 (1985)
[P] Quan, Pham Mau: Arch. Rat. Mech. Anal. 1, 54 (1957)
[RdeRTF] Roux, P., de Rosny, J., Tanter, M., Fink, M.: Phys. Rev. Lett. 79, 317 (1997)
[VMCL] Vivanco, F., Melo, F., Coste, C., Lund, F.: Surface Wave Scattering by a Vertical Vortex and the Symmetry of the Aharonov-Bohm Wave Function. Phys. Rev. Lett. 83, 1966–1969 (1999)
[W] Weder, R.: The Aharonov-Bohm effect and time-dependent inverse scattering theory. Inverse Problems 18(4), 1041–1056 (2002)
[WY] Wu, T., Yang, C.: Phys. Rev. D 12, 3845 (1975)

Communicated by P. Constantin

Commun. Math. Phys. 284, 345–389 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0570-x


Spacetime Singularity Resolution by M-Theory Fivebranes: Calibrated Geometry, Anti-de Sitter Solutions and Special Holonomy Metrics

Oisín A. P. Mac Conamhna1,2

1 Theoretical Physics Group, Blackett Laboratory, Imperial College London, London SW7 2AZ, U.K.
2 The Institute for Mathematical Sciences, Imperial College London, London SW7 2PE, U.K.

E-mail: [email protected] Received: 26 September 2007 / Accepted: 19 March 2008 Published online: 18 July 2008 – © Springer-Verlag 2008

Abstract: The supergravity description of various configurations of supersymmetric M-fivebranes wrapped on calibrated cycles of special holonomy manifolds is studied. The description is provided by solutions of eleven-dimensional supergravity which interpolate smoothly between a special holonomy manifold and an event horizon with Anti-de Sitter geometry. For known examples of Anti-de Sitter solutions, the associated special holonomy metric is derived. One explicit Anti-de Sitter solution of M-theory is so treated for fivebranes wrapping each of the following cycles: Kähler cycles in Calabi-Yau two-, three- and four-folds; special lagrangian cycles in three- and four-folds; associative three- and co-associative four-cycles in $G_2$ manifolds; complex lagrangian four-cycles in $Sp(2)$ manifolds; and Cayley four-cycles in $Spin(7)$ manifolds. In each case, the associated special holonomy metric is singular, and is a hyperbolic analogue of a known metric. The analogous known metrics are respectively: Eguchi-Hanson, the resolved conifold and the four-fold resolved conifold; the deformed conifold, and the Stenzel four-fold metric; the Bryant-Salamon-Gibbons-Page-Pope $G_2$ metrics on an $\mathbb{R}^4$ bundle over $S^3$, and an $\mathbb{R}^3$ bundle over $S^4$ or $\mathbb{CP}^2$; the Calabi hyper-Kähler metric on $T^*\mathbb{CP}^2$; and the Bryant-Salamon-Gibbons-Page-Pope $Spin(7)$ metric on an $\mathbb{R}^4$ bundle over $S^4$. By the AdS/CFT correspondence, a conformal field theory is associated to each of the new singular special holonomy metrics, and defines the quantum gravitational physics of the resolution of their singularities.

1. Introduction

The AdS/CFT correspondence [1] provides a conceptual framework for consistently encoding the geometry of Anti-de Sitter and special holonomy solutions of M-/string theory in a quantum theory. Though the class of spacetimes to which it can be applied is restricted, and unfortunately does not include FLRW cosmologies, it provides the only complete proposal extant for the definition of a quantum theory of gravity. For the prototypical example of $AdS_5 \times S^5 / \mathbb{R}^{10}$ and $\mathcal{N} = 4$ super Yang-Mills, the Maldacena
conjecture is by now approaching the status of proof [2,3]. The literature on the correspondence is enormous, from applications in pure mathematics to phenomenological investigations. On the phenomenological front, much effort has been devoted to extending the AdS/CFT correspondence from N = 4 super Yang-Mills to more realistic field theories [4] and even QCD itself [5,6]. Also, recent developments have raised the hope that we may soon be able to use AdS/CFT to test M-/string theory in the lab [7,10]. On the mathematical front, the motivation provided by the AdS/CFT correspondence has stimulated spectacular progress in differential geometry; early work on the correspondence showed that there is a deep interplay between Anti-de Sitter solutions of M-/string theory, singular special holonomy manifolds and conformal field theories [11,12]. This relationship has since been the topic of intense investigation; a recent highlight has been the beautiful work on Sasaki-Einstein geometry, toric Calabi-Yau three-folds and the associated conformal field theories [13,19]. What has become clear is that the geometry of a supersymmetric AdS/CFT dual involves an Anti-de Sitter manifold, a singular special holonomy manifold1 and a supergravity solution which, in a sense that will be made more precise, interpolates smoothly between them. This geometrical relationship, between Anti-de Sitter manifolds and singular special holonomy manifolds, in the context of the AdS/CFT correspondence in M-theory, is the subject of this paper. The canonical example of this relationship, from IIB, is that between conically singular Calabi-Yau three-folds and Sasaki-Einstein Ad S5 solutions of IIB supergravity. Each of these geometries, individually, is a supersymmetric solution of IIB, preserving eight supercharges. Furthermore, the manifolds may be superimposed2 to obtain another supersymmetric solution of IIB, admitting four supersymmetries. 
This interpolating solution - the supergravity description of D3 branes at a conical Calabi-Yau singularity - has metric

\[ ds^2 = \left(A + \frac{B}{r^4}\right)^{-1/2} ds^2(\mathbb{R}^{1,3}) + \left(A + \frac{B}{r^4}\right)^{1/2} \left[ dr^2 + r^2\, ds^2(SE_5) \right], \tag{1.1} \]

for constants $A$, $B$ and a Sasaki-Einstein five-metric $ds^2(SE_5)$. Setting $B = 0$ gives the IIB solution $\mathbb{R}^{1,3} \times CY_3$, while setting $A = 0$ gives the solution $AdS_5 \times SE_5$. For positive $A$, $B$, the solution is globally smooth, and contains two distinct asymptotic regions: a spacelike infinity where the metric asymptotes to that of the Calabi-Yau, and an internal spacelike infinity, where the metric asymptotes to that of the Anti-de Sitter, on an event horizon at infinite proper distance. The causal structure of these solutions is discussed in detail in [20]. The Calabi-Yau singularity is excised in the interpolating solution, and removed to infinity; an important feature of the interpolating solution is that it admits a globally-defined $SU(3)$ structure. The geometry of branes at Calabi-Yau and other special holonomy conical singularities in M-theory and IIB is systematically studied in [21] and (with a particular focus on three-folds and D3 branes) [22].

1 With the obviously special non-singular exception of flat space.
2 Because, with a suitable ansatz including both, the supergravity field equations linearise.

The AdS/CFT correspondence tells us how to perform this geometrical interpolation in a quantum framework. Open string theory on the singular Calabi-Yau reduces, at low energies, to a conformally invariant quiver gauge theory, at weak ’t Hooft coupling. This is the low-energy effective field theory on the world-volume of a stack of probe D3 branes located at the singularity. The gauge theory encodes the toric data of the Calabi-Yau. The same quiver gauge theory, at strong ’t Hooft coupling, is identical to IIB string theory on the $AdS_5 \times SE_5$; by the AdS/CFT dictionary, the CFT also encodes the Sasaki-Einstein
data of the Ad S solution. Clearly, it can only do this for both the Calabi-Yau and the Ad S5 if their geometry is intimately related. In the classical regime, this relationship is provided by the interpolating solution. In the quantum regime, the relationship is provided by the CFT itself; the interpolation parameter is the ’t Hooft coupling. In effect, the CFT is telling us how to cut out the Calabi-Yau singularity quantum gravitationally, and replace it with an event horizon with the geometry of Anti-de Sitter. The correspondence is best understood for branes at conical singularities of special holonomy manifolds. However, starting from the work of Maldacena and Nuñez [23], many supersymmetric Ad S solutions of M-/string theory have been discovered, [13,24,31], which cannot be interpreted as coming from a stack of branes at a conical singularity. Instead, they have been interpreted as the near-horizon limits of the supergravity description of branes wrapped on calibrated cyles of special holonomy manifolds. The CFT dual of the Ad S/special holonomy manifolds is the low-energy effective theory on the unwrapped worldvolume directions of the branes. A brane, heuristically envisioned as a hypersurface in spacetime, can wrap a calibrated cycle in a special holonomy manifold, while preserving supersymmetry. A heuristic physical argument as to why this is possible is that a calibrated cycle is volume-minimising in its homology class; as a probe brane has a tension, it will always try to contract, and so a wrapped probe brane is only stable if it wraps a minimal cycle. The supergravity description of a stack of wrapped branes, by analogy with that of branes at conical singularities, should be a supergravity solution which smoothly interpolates between a special holonomy manifold with an appropriate calibrated cycle, and an event horizon with Anti-de Sitter geometry. 
As the notion of an interpolating solution is central to this paper, a more careful definition of what is meant by these words will now be given.

Definition 1. Let $M_{AdS}$ be a $d$-dimensional manifold admitting a warped-product AdS metric $g_{AdS}$ that, together with a matter content $F_{AdS}$, gives a supersymmetric solution of a supergravity theory in $d$ dimensions. Let $M_{SH}$ be a $d$-dimensional manifold admitting a special holonomy metric $g_{SH}$, which gives a supersymmetric vacuum solution of the supergravity with holonomy $G \subset Spin(d-1)$. Let $M_I$ be a $d$-dimensional manifold admitting a globally-defined $G$-structure, together with a metric $g_I$ and a matter content $F_I$ that give a supersymmetric solution of the supergravity. Then we say that $(M_I, g_I, F_I)$ is an interpolating solution if for all $\epsilon, \zeta > 0$, there exist open sets $O_{AdS} \subset M_{AdS}$, $O_I, O'_I \subset M_I$, $O_{SH} \subset M_{SH}$, such that for all points $p_{AdS} \in O_{AdS}$, $p_I \in O_I$, $p'_I \in O'_I$, $p_{SH} \in O_{SH}$,
\[ |g_{AdS}(p_{AdS}) - g_I(p_I)| < \epsilon, \qquad |g_{SH}(p_{SH}) - g_I(p'_I)| < \zeta. \tag{1.2} \]

We also define the following useful pieces of vocabulary:

Definition 2. If for a given pair $(M_{AdS}, g_{AdS}, F_{AdS})$, $(M_{SH}, g_{SH}, F_{SH})$, there exists an interpolating solution, then we say that $M_{SH}$ is a special holonomy interpolation of $M_{AdS}$ and that $M_{AdS}$ is an Anti-de Sitter interpolation of $M_{SH}$. Collectively, we refer to $(M_{AdS}, g_{AdS}, F_{AdS})$ and $(M_{SH}, g_{SH}, F_{SH})$ as an interpolating pair.

The objective of this paper is to derive candidate special holonomy interpolations of some of the wrapped fivebrane near-horizon limit AdS solutions of [24,27]. In [33], candidate special holonomy interpolations of the $AdS_5$ M-theory solutions of [23] were derived. These AdS solutions describe the near-horizon limit of fivebranes wrapped on Kähler two-cycles in Calabi-Yau two-folds and three-folds. As these results fit nicely into the more extensive picture presented here, they will be reviewed briefly below. The
new special holonomy metrics that will be derived here are candidate interpolations of: the Ad S3 solution of [26], describing the near-horizon limit of fivebranes wrapped on a Kähler four-cycle in a four-fold; the Ad S4 solution of [25], interpreted in [26] as the near-horizon limit of fivebranes on a special lagrangian (SLAG) three-cycle in a threefold; the Ad S3 solution of [26], for fivebranes on a SLAG four-cycle in a four-fold; the Ad S4 solution of [24], for fivebranes on an associative three-cycle in a G 2 manifold; the Ad S3 solution of [26], for fivebranes on a co-associative four-cycle in a G 2 manifold; the Ad S3 solution of [27], for fivebranes on a complex lagrangian (CLAG) four-cycle in an Sp(2) manifold; and the Ad S3 solution of [26], for fivebranes on a Cayley four-cycle in a Spin(7) manifold. This paper therefore provides one candidate interpolating pair for every type of cycle on which M-theory fivebranes can wrap, in all manifolds of dimension less than ten with irreducible holonomy, with the exception of Kähler four-cycles in three-folds and quaternionic Kähler four-cycles in Sp(2) manifolds, for which no Ad S solutions are known to the author. No interpolating solutions of eleven-dimensional supergravity which describe wrapped branes are known. However, based on various symmetry and supersymmetry arguments, the differential equations they satisfy are known, for all types of calibrated cycles in all special holonomy manifolds that play a rˆole in M-theory. These equations will be called the wrapped brane equations; there is an extensive literature on their derivation [34,43]; the most general results are those of [41,43]. The key point that will be exploited here is that both members of an interpolating pair should individually be a solution of the wrapped brane equations, with a suitable ansatz for the interpolating solution. This is just like what happens for an interpolating solution associated to a conical special holonomy manifold. 
One of the many important results of [13] was to show how any Ad S5 solution of M-theory, coming from fivebranes on a Kähler two-cycle in a three-fold, satisfies the appropriate wrapped brane equations. The canonical frame of the Ad S5 solutions, defined by their eight Killing spinors, admits an SU (2) structure. The Ad S5 solutions may also be re-written in such a way that the canonical Ad S5 frame is obscured, but a canonical R1,3 frame is made manifest. This frame admits an SU (3) structure, and is defined by half the Killing spinors of the Ad S5 solution. And it is this Minkowski SU (3) structure which satisfies the wrapped brane equations. By definition, any interpolating solution describing fivebranes on a Kähler two-cycle in a three-fold admits a globallydefined SU (3) structure; this structure smoothly matches on to the SU (3) structure of the Calabi-Yau and also to the canonical SU (3) structure of the Ad S5 solution. This construction has since been systematically extended to all calibrated cycles in manifolds with irreducible holonomy of relevance to M-theory in [41–43], and, starting from the wrapped brane equations, has been used to classify (i.e., derive the differential equation satisfied by) all supersymmetric Ad S solutions of M-theory which have a wrapped-brane origin. The strategy used here to construct candidate special holonomy interpolations of the Ad S solutions is therefore the following. We first construct the canonical Minkowski frames and structures of the Ad S solutions, which satisfy the appropriate wrapped brane equations. We then use these as a guide to formulating a suitable ansatz for an interpolating solution. It is then a (reasonably) straightforward matter to determine the most general special holonomy solution of the Ad S-inspired ansatz for the interpolating solution. In each case, the special holonomy metric thus obtained is the proposed interpolation of the Ad S solution. 
No attempt has been made to determine the interpolating solutions themselves. It is therefore a matter of conjecture whether the special holonomy
metrics obtained are indeed interpolations of the AdS solutions. However, the results are sufficiently striking that it is reasonable to believe that for the proposed interpolating pairs an interpolating solution does indeed exist. As an illustration of this procedure, consider the results of [33] for the proposed interpolation of the $\mathcal{N} = 2$ $AdS_5$ solution of [23], describing the near-horizon limit of fivebranes on a Kähler two-cycle in a two-fold. When re-written in the canonical Minkowski frame, the AdS solution is of the form
\[ ds^2 = L^{-1}\left[ ds^2(\mathbb{R}^{1,3}) + \frac{F}{2}\, ds^2(H^2) \right] + L^2\left[ F^{-1}\left( du^2 + u^2 (d\psi - P)^2 \right) + dt^2 + t^2\, ds^2(S^2) \right], \tag{1.3} \]
where$^3$ $dP = \mathrm{Vol}[H^2]$, the period of $\psi$ is $2\pi$ and $F$, $L$ are known functions of the coordinates $u$ and $t$. The ansatz for the interpolating solution is then simply that $F$, $L$ are allowed to be arbitrary functions of $u$, $t$. The most general special holonomy solution with this ansatz is
\[ ds^2 = ds^2(\mathbb{R}^{1,6}) + ds^2(N_\tau), \tag{1.4} \]
where, up to an overall scale,
\[ ds^2(N_\tau) = \frac{R^2}{4}\left[ ds^2(H^2) + \left(\frac{1}{R^4} - 1\right)(d\psi - P)^2 \right] + \left(\frac{1}{R^4} - 1\right)^{-1} dR^2. \tag{1.5} \]
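The behaviour of (1.5) near $R = 1$ can be checked with a short expansion. The following is only a sketch and not part of the original derivation; the coordinate $x$ is an auxiliary variable introduced here for illustration.

```latex
% Sketch: near the endpoint R = 1 of (1.5), set R = 1 - x^2 with x small
% (x is an assumed auxiliary coordinate, not used in the paper).
\begin{align*}
\frac{1}{R^4} - 1 &= 4x^2 + O(x^4), \qquad dR^2 = 4x^2\, dx^2, \\
\left(\frac{1}{R^4} - 1\right)^{-1} dR^2 &\approx dx^2, \qquad
\frac{R^2}{4}\left(\frac{1}{R^4} - 1\right)(d\psi - P)^2 \approx x^2 (d\psi - P)^2, \\
ds^2(N_\tau) &\approx \tfrac{1}{4}\, ds^2(H^2) + dx^2 + x^2 (d\psi - P)^2 .
\end{align*}
```

With $\psi$ of period $2\pi$, the last two terms look locally like flat space in polar coordinates, while the $H^2$ factor keeps a finite size, which is consistent with a smooth bolt at $R = 1$.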

The range of $R$ is $R \in (0, 1]$. At $R = 1$, an $S^2$ degenerates smoothly, and an $H^2$ bolt stabilises. At $R = 0$, the metric is singular, where the Kähler $H^2$ cycle degenerates. In the probe-brane picture, the fivebranes should be thought of as wrapping the $H^2$ at the singularity. Otherwise, they can always decrease their worldvolume by moving to smaller $R$. This incomplete special holonomy metric is to be compared with the Eguchi-Hanson metric [44], which is
\[ ds^2(EH) = \frac{R^2}{4}\left[ ds^2(S^2) + \left(1 - \frac{1}{R^4}\right)(d\psi - P)^2 \right] + \left(1 - \frac{1}{R^4}\right)^{-1} dR^2, \tag{1.6} \]

where now $dP = \mathrm{Vol}[S^2]$. As is well known, this metric is complete in the range $R \in [1, \infty)$. At $R = 1$, an $S^2$ degenerates smoothly and a Kähler $S^2$ bolt stabilises. In every case, the conjectured special holonomy interpolations of the AdS solutions derived in this paper are singular, and they have exactly the same relationship with known complete special holonomy metrics as that of (1.5) with Eguchi-Hanson. To make the pattern clear, it is worth quoting one more example now. The conjectured special holonomy interpolation of the $AdS_3$ solution of [26] for fivebranes on a Cayley four-cycle in a $Spin(7)$ manifold is
\[ ds^2 = ds^2(\mathbb{R}^{1,2}) + ds^2(N_\tau), \tag{1.7} \]

3 Here, and throughout, $ds^2(AdS_n)$, $ds^2(H^n)$, $ds^2(S^n)$ denote the maximally symmetric Einstein metrics on $n$-dimensional AdS manifolds, $n$-hyperboloids or $n$-spheres with unit radius of curvature, respectively. The cartesian metric on flat space will be denoted by $ds^2(\mathbb{R}^n)$. The volume form on a unit $n$-hyperboloid or $n$-sphere will be denoted by $\mathrm{Vol}[H^n]$, $\mathrm{Vol}[S^n]$, respectively.

where, up to an overall scale,
\[ ds^2(N_\tau) = \frac{9}{20} R^2\, ds^2(H^4) + \frac{36}{100} R^2 \left(\frac{1}{R^{10/3}} - 1\right) DY^a DY^a + \left(\frac{1}{R^{10/3}} - 1\right)^{-1} dR^2, \tag{1.8} \]

where the $Y^a$ are constrained coordinates on an $S^3$ and $D$ will be defined later. The range of $R$ is $R \in (0, 1]$; at $R = 1$ the $S^3$ degenerates smoothly and an $H^4$ bolt stabilises. At $R = 0$ the metric is singular where the $H^4$ Cayley four-cycle degenerates. This metric is to be compared with the $Spin(7)$ metric on an $\mathbb{R}^4$ bundle over $S^4$, first found by Bryant and Salamon [45] and later independently by Gibbons, Page and Pope [46]:
\[ ds^2(BSGPP) = \frac{9}{20} R^2\, ds^2(S^4) + \frac{36}{100} R^2 \left(1 - \frac{1}{R^{10/3}}\right) DY^a DY^a + \left(1 - \frac{1}{R^{10/3}}\right)^{-1} dR^2. \tag{1.9} \]

This metric is complete in the range R ∈ [1, ∞); at R = 1 an S 4 degenerates smoothly and a Cayley S 4 bolt stabilises. This relationship with known complete special holonomy metrics is a universal feature of all the proposed special holonomy interpolations of this paper. As this series of incomplete special holonomy metrics has so many features in common, they will be given a collective name, the Nτ series. Though they have been derived here from the Ad S M-theory solutions ab initio, they may be obtained in a much simpler way a posteriori, by analytic continuation of known complete metrics4 . In every case, they may be obtained from a known complete metric with a radial coordinate of semi-infinite range, at the endpoint of which an S m degenerates and a calibrated S n (or, as appropriate, CP2 ) cycle stabilises. The Nτ series is obtained by changing the sign of the scalar curvature of the bolt and analytically continuing the dependence of the metric on the radial coordinate. This generates a special holonomy metric with a “radial” coordinate of finite range, with a smoothly degenerating S m and a stabilised H n (or Bergman) bolt at one endpoint, and a singular degeneration at the other. For the Calabi-Yau Nτ with Kähler cycles in three-folds and four-folds, the analogous known metrics are the resolved conifold of [47,48], and its four-fold analogue (see [49] for useful additional background on the resolved conifold). For the Calabi-Yau Nτ with SLAG cycles, the analogous known metrics are the Stenzel metrics [50] (see [51,52] for useful background on the Stenzel metrics). The Stenzel two-fold metric coincides with Eguchi-Hanson, and the Stenzel three-fold metric coincides with the deformed conifold metric of [47] (see [49,53] for additional background on the deformed conifold). For the G 2 Nτ metrics with co-associative cycles, the analogous known metrics are the BSGPP metrics [45,46] on R3 bundles over S 4 or CP2 . 
For the $G_2$ $N_\tau$ metric with an associative cycle, the analogous known metric is the BSGPP metric [45,46] on an $\mathbb{R}^4$ bundle over $S^3$. See [52,54,55] for more background on the complete $G_2$ metrics. For the $Sp(2)$ $N_\tau$ metric with a CLAG cycle, the analogous known metric is the Calabi metric on $T^*\mathbb{CP}^2$ [56]; the Calabi metric is the unique complete regular hyper-Kähler eight-manifold of co-homogeneity one [57]; for further background on the Calabi metric, see [58].

4 The $N_\tau$ metrics have almost certainly been found before, though because they are incomplete, they have presumably been rejected hitherto as pathological and uninteresting. What now makes them interesting is their interpretation as special holonomy interpolations of AdS solutions, for which their incompleteness is probably a pre-requisite: see Conjecture 2 below.

Finally, for the
Spin(7) Nτ metric with a Cayley four-cycle, we have seen that the analogous known metric is the BSGPP metric on an R4 bundle over S 4 ; see [52,54,55] for more details. What is most striking about the conjectured special holonomy interpolations obtained here is that they are all singular. As occurs in the conical context, the expectation is that the singularity of the special holonomy manifold is excised in the interpolating solution, and that the conformal dual of the geometry gives a quantum gravitational definition of this process. If this is correct, then a singularity of the special holonomy manifold is an essential ingredient of the geometry of AdS/CFT. It would also explain a hitherto rather puzzling feature of the Ad S solutions studied here, all of which were originally constructed in gauged supergravity. While for the Nτ series it is possible to obtain the known special holonomy manifolds by replacing the H n factors with S n factors, for their Ad S interpolations this does not seem to be possible; the Ad S solutions exist only for hyperbolic cycles. This makes sense if an AdS/CFT dual can exist only for a singular special holonomy manifold; otherwise, if Ad S solutions like those studied here, but with S n cycles, existed, their special holonomy interpolations would be non-singular. Another way of saying this is that it seems that a conformal field theory can be associated to the singular Nτ series of special holonomy metrics, but not to their non-singular known analogues. If this idea is correct, it means that what the AdS/CFT correspondence is ultimately describing is the quantum gravity of singularity resolution for special holonomy manifolds. We formalise the geometry of this idea in the following two conjectures. Conjecture 1. Every supersymmetric Anti-de Sitter solution of M-/string theory admits a special holonomy interpolation. Conjecture 2. 
With the exception of flat space, the metric on every special holonomy manifold admitting an Anti-de Sitter interpolation is incomplete. The organisation of the remainder of this paper is as follows. In Section Two, as useful introductory material, we will review the relationship between the canonical Ad S and Minkowski frames for Ad S solutions, how to pass from one to the other by means of a frame rotation, and the relationship between the Ad S and wrapped brane structures. In Section Three, we will derive the conjectured special holonomy interpolations of Ad S solutions for fivebranes wrapped on cycles in Calabi-Yau manifolds. Section Four is devoted to the proposed Sp(2) interpolating pair, Section Five to the G 2 interpolating pairs and Section Six to the Spin(7) interpolating pair. In Section Seven we conclude and discuss interesting future directions. 2. Canonical Minkowski Frames for AdS Manifolds In this section we will review how the canonical Ad S frame defined by all the Killing spinors of a supersymmetric Ad S solution is related to its canonical Minkowski frame defined by half its Killing spinors; for more details, the reader is referred to [13,41,43]. The canonical Minkowski structure of an Ad S solution is the one which can match on to the G-structure of an interpolating solution. This phenomenon–the matching of the structure defined by half the supersymmetries of the Ad S manifold to that of an interpolating solution–is another, more precise way of stating the familiar feature of supersymmetry doubling in the near-horizon limit of a supergravity brane solution. We will in fact distinguish two cases, which will be discussed separately. The Ad S solutions we study for fivebranes on cycles in manifolds of SU (2), SU (3) or G 2 holonomy have purely magnetic fluxes. This means that no membranes are present in the

geometry. However, the AdS solutions for fivebranes on four-cycles in eight-manifolds ($Spin(7)$, $SU(4)$ or $Sp(2)$ holonomies) have both electric and magnetic fluxes. In probe-brane language, we can think of a stack of fivebranes wrapped on a four-cycle in the eight-manifold. We also have a stack of membranes extended in the three overall transverse directions to the eight-manifold. The membrane stack intersects the fivebrane stack in a string; the low-energy effective field theory on the string worldvolume is then the two-dimensional dual of the $AdS_3$ solutions that come from these geometries. The presence of the membranes complicates the relationship of the AdS and Minkowski frames a little, so first we will discuss the case of fivebranes alone, and purely magnetic fluxes.

2.1. AdS spacetimes from fivebranes on cycles in $SU(2)$, $SU(3)$ and $G_2$ manifolds.

The metric of an interpolating solution describing a stack of fivebranes wrapped on a calibrated cycle in a Calabi-Yau two- or three-fold, or a $G_2$ manifold, takes the form
\[ ds^2 = L^{-1} ds^2(\mathbb{R}^{1,p}) + ds^2(M_q) + L^2\left[ dt^2 + t^2\, ds^2(S^{10-p-q}) \right], \tag{2.1} \]
where $M_q$ admits a globally-defined $SU(2)$, $SU(3)$ or $G_2$ structure respectively. The Minkowski isometries are isometries of the full solution, and the flux has no components along the Minkowski directions. The dimensionality of $M_q$ is $q = 4, 6, 7$, respectively. The dimensionality of the unwrapped fivebrane worldvolume is $p + 1$, so $p = 3$ for a Kähler two-cycle, $p = 2$ for a SLAG or associative three-cycle, and $p = 1$ for a co-associative four-cycle. The intrinsic torsion of the $G$-structure on $M_q$ must satisfy certain conditions, implied by supersymmetry and the four-form Bianchi identity. These conditions are what are called the wrapped brane equations; they will be given for each case below, and need not concern us now. For more details, the reader is referred to [41].
Our interest here is how to obtain a warped-product AdS metric from the wrapped-brane metric (2.1), and vice versa. The first step is to recognise that every warped-product $AdS_{p+2}$ metric, written in Poincaré coordinates, may be thought of as a special case of a warped $\mathbb{R}^{1,p}$ metric. If the AdS warp factor is denoted by $\lambda$, and is independent of the AdS coordinates, then
\[ \lambda^{-1} ds^2(AdS_{p+2}) = \lambda^{-1}\left[ e^{-2r} ds^2(\mathbb{R}^{1,p}) + dr^2 \right]. \tag{2.2} \]
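As a quick check of the Poincaré form used in (2.2), the bracket can be rewritten in the more familiar half-space coordinate. This is a sketch; the coordinate $z$ is introduced here for illustration and does not appear in the paper.

```latex
% Sketch: with z = e^{r} (an assumed auxiliary coordinate), dr = dz / z, so
\begin{align*}
e^{-2r}\, ds^2(\mathbb{R}^{1,p}) + dr^2
  &= \frac{ds^2(\mathbb{R}^{1,p}) + dz^2}{z^2},
\end{align*}
% which is the unit-radius AdS_{p+2} metric in Poincare half-space form.
```

Comparing the $\mathbb{R}^{1,p}$ term of (2.2) with that of (2.1) then amounts to matching $L^{-1}$ with $\lambda^{-1} e^{-2r}$.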

Therefore our first step is to identify $L = \lambda e^{2r}$ in (2.1), with $r$ the AdS radial coordinate. The next step is to pick out the AdS radial direction $\hat r = \lambda^{-1/2} dr$ from the space transverse to the $\mathbb{R}^{1,p}$ factor in (2.1). In the cases of interest to us, the AdS radial direction is a linear combination of the radial direction $\hat v = L\,dt$ on the overall transverse space, and a radial direction in $M_q$, transverse to the wrapped cycle. We denote this radial basis one-form on $M_q$ by $\hat u$. Thus we can obtain the AdS radial basis one-form by a local rotation of the frame of (2.1):
\[ \hat r = \sin\theta\, \hat u + \cos\theta\, \hat v, \tag{2.3} \]
for some local angle $\theta$ which we take to be independent of $r$. Denoting the orthogonal linear combination in the AdS frame by $\hat\rho$, we have
\[ \hat\rho = \cos\theta\, \hat u - \sin\theta\, \hat v. \tag{2.4} \]
Now, imposing closure of $dt$ and $r$-independence of $\theta$, we get
\[ \hat\rho = \frac{\lambda}{2\sin\theta}\, d\!\left(\lambda^{-3/2}\cos\theta\right). \tag{2.5} \]

Defining a coordinate $\rho$ for the AdS frame according to $\rho = \lambda^{-3/2}\cos\theta$, we get
\[ t = -\frac{\rho}{2}\, e^{-2r}, \qquad \hat\rho = \frac{\lambda}{2\sqrt{1 - \lambda^3\rho^2}}\, d\rho. \tag{2.6} \]
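The step from the closure of $dt$ to (2.5) and (2.6) can be sketched as follows. The sketch uses only the definitions $\hat v = L\,dt$, $L = \lambda e^{2r}$, $\hat r = \lambda^{-1/2} dr$ and the rotation (2.3), (2.4); the extra assumption that $\hat\rho$ has no $dr$ component and does not depend on $r$ is made here for illustration.

```latex
% Sketch: closure of dt, assuming \hat\rho has no dr component (an assumption).
\begin{align*}
% invert the rotation (2.3)-(2.4)
\hat v &= \cos\theta\, \hat r - \sin\theta\, \hat\rho, \\
% use \hat v = L\, dt with L = \lambda e^{2r}
dt &= e^{-2r}\left( \lambda^{-3/2}\cos\theta\, dr - \lambda^{-1}\sin\theta\, \hat\rho \right), \\
% the dr-terms of d(dt) = 0 fix \hat\rho
\hat\rho &= \frac{\lambda}{2\sin\theta}\, d\!\left(\lambda^{-3/2}\cos\theta\right), \\
% with \rho = \lambda^{-3/2}\cos\theta the one-form dt is exact
dt &= e^{-2r}\left( \rho\, dr - \tfrac{1}{2}\, d\rho \right)
    = d\!\left( -\tfrac{\rho}{2}\, e^{-2r} \right), \\
\sin\theta &= \sqrt{1 - \cos^2\theta} = \sqrt{1 - \lambda^3\rho^2}.
\end{align*}
```

The fourth line reproduces the expression for $t$, and the last line turns the prefactor $1/\sin\theta$ into the square root appearing in $\hat\rho$.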

Finally, we impose that the metric on the space transverse to the AdS factor is independent of the AdS radial coordinate, and (in deriving the AdS supersymmetry conditions from the wrapped brane equations) that the flux has no components along the AdS radial direction. Thus we obtain the (for our purposes) general $AdS_{p+2}$ metric contained in (2.1):
\[ ds^2 = \lambda^{-1}\left[ ds^2(AdS_{p+2}) + \frac{\lambda^3}{4}\left( \frac{d\rho^2}{1 - \lambda^3\rho^2} + \rho^2\, ds^2(S^{10-p-q}) \right) \right] + ds^2(M_{q-1}), \tag{2.7} \]
where $ds^2(M_{q-1})$ is defined by
\[ ds^2(M_q) = ds^2(M_{q-1}) + \hat u \otimes \hat u. \tag{2.8} \]
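The individual terms of (2.7) can be matched against (2.1) one by one. The following bookkeeping is a sketch under the identifications already stated in the text ($L = \lambda e^{2r}$, $\hat r = \lambda^{-1/2} dr$, $\hat v = L\,dt$, and $t$, $\hat\rho$ as in (2.6)).

```latex
% Sketch: term-by-term comparison of (2.1) with (2.7).
\begin{align*}
% the rotation (2.3)-(2.4) is orthogonal
\hat u \otimes \hat u + \hat v \otimes \hat v
  &= \hat r \otimes \hat r + \hat\rho \otimes \hat\rho, \\
% Minkowski factor plus AdS radial direction
L^{-1} ds^2(\mathbb{R}^{1,p}) + \hat r \otimes \hat r
  &= \lambda^{-1}\left[ e^{-2r} ds^2(\mathbb{R}^{1,p}) + dr^2 \right]
   = \lambda^{-1} ds^2(AdS_{p+2}), \\
% the \rho direction
\hat\rho \otimes \hat\rho
  &= \frac{\lambda^2}{4}\, \frac{d\rho^2}{1 - \lambda^3\rho^2}
   = \lambda^{-1}\, \frac{\lambda^3}{4}\, \frac{d\rho^2}{1 - \lambda^3\rho^2}, \\
% coefficient of the sphere metric
L^2 t^2 &= \lambda^2 e^{4r} \cdot \frac{\rho^2}{4}\, e^{-4r}
   = \lambda^{-1}\, \frac{\lambda^3}{4}\, \rho^2 .
\end{align*}
```

All dependence on $r$ outside the AdS factor cancels, in line with the requirement that the transverse metric be independent of the AdS radial coordinate.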

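In addition, we have
\[ \hat u = \lambda\left( \sqrt{\frac{1 - \lambda^3\rho^2}{\lambda^3}}\; dr + \frac{\sqrt{\lambda^3}\,\rho}{2\sqrt{1 - \lambda^3\rho^2}}\; d\rho \right). \tag{2.9} \]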
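Equation (2.9) follows from inverting the frame rotation. A brief sketch, using $\cos\theta = \lambda^{3/2}\rho$ and $\sin\theta = \sqrt{1 - \lambda^3\rho^2}$ from the definition of $\rho$:

```latex
% Sketch: invert (2.3)-(2.4) and substitute \hat r and \hat\rho from (2.6).
\begin{align*}
\hat u &= \sin\theta\, \hat r + \cos\theta\, \hat\rho \\
  &= \sqrt{1 - \lambda^3\rho^2}\; \lambda^{-1/2}\, dr
     + \lambda^{3/2}\rho\; \frac{\lambda}{2\sqrt{1 - \lambda^3\rho^2}}\, d\rho \\
  &= \lambda\left( \sqrt{\frac{1 - \lambda^3\rho^2}{\lambda^3}}\; dr
     + \frac{\sqrt{\lambda^3}\,\rho}{2\sqrt{1 - \lambda^3\rho^2}}\; d\rho \right).
\end{align*}
```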

Since in general we know the relationship between the Minkowski-frame coordinate $t$ and the AdS frame coordinates $r$, $\rho$, when we know $\lambda$ explicitly for a particular solution, we can integrate (2.9) to find an explicit coordinatisation of the AdS solution in the Minkowski frame. Thus we can pass freely from one frame to the other, for any explicit solution.

Having discussed the relationship of the frames, let us now discuss the relationship between the structures. Since, in passing from (2.1) to (2.7) we pick out a preferred direction on $M_q$, the $G$-structure of (2.1) on $M_q$ is reduced to a $G'$ structure on $M_{q-1}$ in (2.7). For $q = 4$, the $SU(2)$ structure on $M_4$ is reduced to an identity structure on $M_3$; the $SU(2)$ forms on $M_4$ decompose according to
\[ J_4 = e^1 \wedge e^2 + e^3 \wedge \hat u, \tag{2.10} \]
\[ \Omega_4 = (e^1 + i e^2) \wedge (e^3 + i \hat u), \tag{2.11} \]
with
\[ ds^2(M_4) = e^1 \otimes e^1 + e^2 \otimes e^2 + e^3 \otimes e^3 + \hat u \otimes \hat u. \tag{2.12} \]
For $q = 6$, the $SU(3)$ structure on $M_6$ reduces to an $SU(2)$ structure on $M_5$; the $SU(3)$ structure forms decompose according to
\[ J_6 = J_4 + e^5 \wedge \hat u, \qquad \Omega_6 = \Omega_4 \wedge (e^5 + i \hat u), \tag{2.13} \]


with

\[ ds^2(M_6) = ds^2(M_5) + \hat u\otimes\hat u = ds^2(M_4) + e^5\otimes e^5 + \hat u\otimes\hat u, \tag{2.14} \]

and the $SU(2)$ structure of the AdS frame is defined on $M_4$. For $q = 7$, the $G_2$ structure on $M_7$ reduces to an $SU(3)$ structure on $M_6$; the $G_2$ structure forms decompose according to

\[ \Phi = J_6\wedge\hat u - \mathrm{Im}\,\Omega_6, \qquad \Upsilon = \tfrac12\,J_6\wedge J_6 + \mathrm{Re}\,\Omega_6\wedge\hat u, \tag{2.15} \]

with

\[ ds^2(M_7) = ds^2(M_6) + \hat u\otimes\hat u, \tag{2.16} \]

and the $SU(3)$ structure of the AdS frame is defined on $M_6$.

2.2. AdS spacetimes from fivebranes on four-cycles in eight-manifolds of Spin(7), SU(4) or Sp(2) holonomy. As discussed above, because of the presence of non-zero electric flux for $\mathrm{AdS}_3$ solutions from fivebranes on four-cycles in eight-manifolds, the relationship between the canonical AdS and Minkowski frames of the AdS solutions is a little more complicated. These systems are the subject of [43], to which the reader is referred for more detail⁵. The metric of an interpolating solution describing a stack of fivebranes wrapped on a four-cycle in an eight-manifold, with a stack of membranes extended in the transverse directions, takes the form

\[ ds^2 = L^{-1}\,ds^2(\mathbb{R}^{1,1}) + ds^2(M_8) + C^2\,dt^2. \tag{2.17} \]

Again, the Minkowski isometries are isometries of the full solution, the electric flux contains a factor proportional to the Minkowski volume form, and the magnetic flux has no components along the Minkowski directions. The Minkowski directions represent the unwrapped fivebrane worldvolume directions; the membranes extend in these directions and also along $dt$. Note that in this case the warp factor of the overall transverse space (the $\mathbb{R}$ coordinatised by $t$) is independent of the Minkowski warp factor. The global $G$-structure is defined on $M_8$; the structure group is Spin(7), SU(4) or Sp(2), as appropriate. Again, supersymmetry, the four-form Bianchi identity, and now, the four-form field equation imply restrictions on the intrinsic torsion of the global $G$-structure. These equations, the wrapped brane equations for these systems, are given in [43].

To obtain an $\mathrm{AdS}_3$ metric from (2.17), we again require that $L = \lambda e^{2r}$, with $r$ the AdS radial coordinate and $\lambda$ the AdS warp factor, which we require to be independent of the AdS coordinates. As before, we must now pick out the AdS radial direction $\hat r = \lambda^{-1/2}dr$ from the space transverse to the Minkowski factor. In the generic case of interest to us, the AdS radial direction is a linear combination of the overall transverse direction $e^9 = C\,dt$ and a radial direction in $M_8$ transverse to the cycle that we denote by $e^8$. Thus, as before, we write the frame rotation relating the Minkowski and AdS frames as

\[ \hat r = \sin\theta\,e^8 + \cos\theta\,e^9, \qquad \hat\rho = \cos\theta\,e^8 - \sin\theta\,e^9, \tag{2.18} \]

5 In [43], somewhat more general wrapped brane metrics were considered than those of this discussion. However, the discussion of this section is sufficiently general for the applications of interest in this paper.


for a local rotation angle $\theta$ which we take to be independent of the AdS radial coordinate. Imposing AdS isometries on the electric and magnetic flux, and requiring that the metric on the space transverse to the AdS factor is independent of the AdS coordinates, we find that we may introduce an AdS frame coordinate $\rho$ such that

\[ \lambda^{-3/2}\cos\theta = f(\rho), \qquad \hat\rho = \frac{\lambda}{2\sqrt{1-\lambda^3f^2}}\,d\rho, \tag{2.19} \]

for some arbitrary function $f(\rho)$. See [43] for a fuller discussion of this point. Then the general AdS metric contained in (2.17) is

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_3) + \frac{\lambda^3}{4(1-\lambda^3f^2)}\,d\rho^2 \right] + ds^2(M_7), \tag{2.20} \]

where $ds^2(M_7)$ is defined by

\[ ds^2(M_8) = ds^2(M_7) + e^8\otimes e^8. \tag{2.21} \]

The basis one-forms of the Minkowski frame are given in terms of the basis one-forms of the AdS frame by

\[ e^8 = \lambda\left( \sqrt{\frac{1-\lambda^3f^2}{\lambda^3}}\;dr + \sqrt{\frac{\lambda^3}{1-\lambda^3f^2}}\;\frac{f}{2}\,d\rho \right), \qquad C\,dt = \lambda\left( f\,dr - \tfrac12\,d\rho \right). \tag{2.22} \]

For an explicit $\mathrm{AdS}_3$ solution we know $\lambda$ and $f$ explicitly, and so we can integrate these expressions to get an explicit coordinatisation of the AdS solution in the Minkowski frame. Thus we can freely pass between the canonical AdS and Minkowski frames for known AdS solutions.

As in the previous subsection, because we are picking out a preferred direction on $M_8$, the Minkowski-frame structure on $M_8$ is reduced, in the AdS frame, to a structure on $M_7$. A Spin(7) structure on $M_8$ is reduced to a $G_2$ structure on $M_7$; the decomposition of the Cayley four-form is

\[ -\phi = \Upsilon + \Phi\wedge e^8. \tag{2.23} \]

An $SU(4)$ structure on $M_8$ is reduced to an $SU(3)$ structure on $M_7$. The decomposition of the $SU(4)$ structure forms is

\[ J_8 = J_6 + e^7\wedge e^8, \qquad \Omega_8 = \Omega_6\wedge(e^7 + i e^8), \tag{2.24} \]

with

\[ ds^2(M_8) = ds^2(M_7) + e^8\otimes e^8 = ds^2(M_6) + e^7\otimes e^7 + e^8\otimes e^8, \tag{2.25} \]

with the $SU(3)$ structure forms defined on $ds^2(M_6)$. Finally, an $Sp(2)$ structure on $M_8$ reduces to an $SU(2)$ structure on $ds^2(M_7)$. The decomposition of the triplet of


$Sp(2)$ almost complex structures (which obey the algebra $J^AJ^B = -\delta^{AB} + \epsilon^{ABC}J^C$, $A = 1, 2, 3$) under $SU(2)$ is

\[ J^1 = K^3 + e^5\wedge e^6 + e^7\wedge e^8, \qquad J^2 = K^2 - e^5\wedge e^7 + e^6\wedge e^8, \qquad J^3 = K^1 + e^6\wedge e^7 + e^5\wedge e^8, \tag{2.26} \]

with

\[ ds^2(M_8) = ds^2(M_4) + e^5\otimes e^5 + e^6\otimes e^6 + e^7\otimes e^7 + e^8\otimes e^8, \tag{2.27} \]

and the $K^A$ are a triplet of self-dual $SU(2)$-invariant two-forms on $M_4$, which satisfy the algebra⁶ $K^AK^B = -\delta^{AB} - \epsilon^{ABC}K^C$. Having concluded the introductory review, we now move on to the main results of the paper.

3. Calabi-Yau Interpolating Pairs

In this section, we will give conjectured interpolating pairs for fivebranes wrapped on calibrated cycles in Calabi-Yau manifolds. First we will discuss Kähler cycles, then SLAG cycles. In order to present a complete picture, we will summarise the results of [33] for Kähler two-cycles in two-folds and three-folds. In the new cases, we will first present the pair, and then give the derivation of the special holonomy interpolation from the AdS solution.

3.1. Kähler cycles. In this subsection, the AdS solutions for which we give a conjectured special holonomy interpolation are: the half-BPS $\mathrm{AdS}_5$ solution of [23], describing the near-horizon limit of fivebranes on a two-cycle in a two-fold; the quarter-BPS $\mathrm{AdS}_5$ solution of [23], for a two-cycle in a three-fold; and the $\mathrm{AdS}_3$ solution of [26], admitting four Killing spinors, for a four-cycle in a four-fold. The special holonomy interpolations of the first two cases are derived in [33]; here we will just describe the conjectured pair. All the other pairs given in this paper are new, and their derivation will be given.

3.1.1. Two-fold. The conjectured interpolating pair. The metric of the half-BPS $\mathrm{AdS}_5$ solution of [23] is given by

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_5) + \frac12\,ds^2(H^2) + (1-\lambda^3\rho^2)(d\psi - P)^2 + \frac{\lambda^3}{4}\left( \frac{d\rho^2}{1-\lambda^3\rho^2} + \rho^2\,ds^2(S^2) \right)\right], \qquad \lambda^3 = \frac{8}{1+4\rho^2}, \tag{3.1} \]

6 The slightly eccentric labelling of the $SU(2)$ structure forms is chosen to coincide with an unfortunate conventional quirk of [43].


where $dP = \mathrm{Vol}[H^2]$. The range of the coordinate $\rho$, which without loss of generality we take to be non-negative, is $\rho \in [0, 1/2]$. At $\rho = 0$, the R-symmetry $S^2$ degenerates smoothly⁷. At $\rho = 1/2$, the R-symmetry $U(1)$, with coordinate $\psi$, degenerates smoothly, provided that $\psi$ is identified with period $2\pi$. As discussed in the introduction, the conjectured special holonomy interpolation of this manifold is

\[ ds^2 = ds^2(\mathbb{R}^{1,6}) + ds^2(N_\tau), \tag{3.2} \]

where, up to an overall scale,

\[ ds^2(N_\tau) = \frac{R^2}{4}\left[ ds^2(H^2) + \left(\frac{1}{R^4} - 1\right)(d\psi - P)^2 \right] + \left(\frac{1}{R^4} - 1\right)^{-1} dR^2. \tag{3.3} \]

The range of $R$ is $R \in (0, 1]$. At $R = 1$, an $S^2$ degenerates smoothly, provided that $\psi$ has the same period as in the AdS solution. At $R = 0$, the metric is singular, where the Kähler $H^2$ cycle degenerates.

3.1.2. Three-fold. The conjectured interpolating pair. The metric of the quarter-BPS $\mathrm{AdS}_5$ solution of [23] is

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_5) + \frac13\,ds^2(H^2) + \frac19(1-\lambda^3\rho^2)\left( ds^2(S^2) + (d\psi + P - P')^2 \right) + \frac{\lambda^3}{4(1-\lambda^3\rho^2)}\,d\rho^2 \right], \qquad \lambda^3 = \frac{4}{4+\rho^2}, \tag{3.4} \]

where now $dP = \mathrm{Vol}[S^2]$, $dP' = \mathrm{Vol}[H^2]$. This time, the range of $\rho$ is $[-2/\sqrt3,\, 2/\sqrt3]$; at $\rho = \pm 2/\sqrt3$, an $S^3$ degenerates smoothly, provided that $\psi$ is periodically identified with period $4\pi$. The conjectured special holonomy interpolation of this manifold is

\[ ds^2 = ds^2(\mathbb{R}^{1,4}) + ds^2(N_\tau), \tag{3.5} \]

where, up to an overall scale,

\[ ds^2(N_\tau) = \frac12(1+\sin\xi)\,ds^2(H^2) + \frac{\cos^2\xi}{2(1+\sin\xi)}\,ds^2(S^2) + \frac{1}{\cos^2\xi}\left[ dR^2 + R^2(d\psi + P - P')^2 \right], \qquad -\frac13\sin^3\xi + \sin\xi = \frac23 - R^2. \tag{3.6} \]

The range of $R$ is $R \in [0, 2/\sqrt3)$. At $R = 0$ (corresponding to $\xi = \pi/2$) an $S^3$ degenerates smoothly, provided that $\psi$ has the same periodicity as for the AdS coordinate. The metric is singular at $R = 2/\sqrt3$ (corresponding to $\xi = -\pi/2$) where the Kähler $H^2$ cycle degenerates. This metric is the hyperbolic analogue of the resolved conifold metric of [47,48].

7 The R-symmetry of the dual theory is $SU(2)\times U(1)$.


3.1.3. Four-folds. The interpolating pairs. This is the first new case we encounter. A set of $\mathrm{AdS}_3$ solutions was constructed by Gauntlett, Kim and Waldram (GKW) in [26], that describe the near-horizon limit of M5 branes on a Kähler four-cycle in a Calabi-Yau four-fold, intersecting membranes extended in the directions transverse to the four-fold. The AdS solutions admit four Killing spinors, and are as follows. The metrics are

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_3) + \frac34\,ds^2(KE_4^-) + \frac14(1-\lambda^3f^2)\left( ds^2(S^2) + (d\psi + P + P')^2 \right) + \frac{\lambda^3}{4(1-\lambda^3f^2)}\,d\rho^2 \right], \qquad \lambda^3 = \frac{9}{12+\rho^2}, \quad f = \frac{2\rho}{3}. \tag{3.7} \]

Here $KE_4^-$ is an arbitrary negative scalar curvature Kähler-Einstein manifold, normalised such that the Ricci form $\mathcal{R}_4$ is given by $\mathcal{R}_4 = -\hat J_4$, with $\hat J_4$ the Kähler form of $KE_4^-$. In addition,

\[ dP = \mathrm{Vol}[S^2], \qquad dP' = \mathcal{R}_4. \tag{3.8} \]

The range of $\rho$ is $\rho \in [-2, 2]$; at the end-points, an $S^3$ smoothly degenerates, provided that $\psi$ is periodically identified with period $4\pi$. These manifolds admit an $SU(3)$ structure, which was obtained in [43], and will be given below (in somewhat more transparent coordinates), together with the magnetic flux (the electric flux, which is irrelevant to the discussion, can be obtained from [26] or [43]). The conjectured special holonomy interpolation of these manifolds is

\[ ds^2 = ds^2(\mathbb{R}^{1,2}) + ds^2(N_\tau), \tag{3.9} \]

where, up to an overall scale,

\[ ds^2(N_\tau) = \frac12(1+\sin\xi)\,ds^2(KE_4^-) + \frac{\cos^2\xi}{2(1+\sin\xi)}\,ds^2(S^2) + \frac{1}{\cos^2\xi}\left[ dR^2 + R^2(d\psi + P - P')^2 \right], \qquad -\frac13\sin^3\xi + \sin\xi = \frac23 - R^2. \tag{3.10} \]
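The cubic relating $\xi$ and $R$, namely $-\tfrac13\sin^3\xi + \sin\xi = \tfrac23 - R^2$, can be checked at the two degeneration points quoted for the three-fold metric. The following evaluation is only a sketch and uses nothing beyond that cubic:

```latex
\begin{align*}
\xi = \tfrac{\pi}{2}:\quad & -\tfrac13 + 1 = \tfrac23 = \tfrac23 - R^2
   \;\Longrightarrow\; R = 0,\\
\xi = -\tfrac{\pi}{2}:\quad & \tfrac13 - 1 = -\tfrac23 = \tfrac23 - R^2
   \;\Longrightarrow\; R^2 = \tfrac43,\quad R = \tfrac{2}{\sqrt3},\\
\frac{d}{d\xi}\left(-\tfrac13\sin^3\xi + \sin\xi\right) &= \cos\xi\,(1-\sin^2\xi) = \cos^3\xi \;\ge\; 0
   \quad\text{for}\quad \xi\in\left[-\tfrac{\pi}{2},\tfrac{\pi}{2}\right].
\end{align*}
```

So $R^2$ decreases monotonically from $4/3$ to $0$ as $\xi$ runs from $-\pi/2$ to $\pi/2$, which is consistent with the range $R\in[0,2/\sqrt3)$ and with the identification of $\xi=\pi/2$ as the smooth end.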

This is identical to the three-fold metric of the previous subsection, but with the $H^2$ replaced by a $KE_4^-$. It has the same regularity properties, and is the hyperbolic analogue of the four-fold resolved conifold. Now we will discuss its derivation.


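The G-structure of the AdS solutions. First we will give the $SU(3)$ structure of the AdS solutions, defined by all four Killing spinors. Defining the frame

\[ e^a = \sqrt{\frac{3}{4\lambda}}\;\hat e^a, \qquad e^5 + i e^6 = \frac12\sqrt{\frac{1-\lambda^3f^2}{\lambda}}\;e^{i\psi}(d\theta + i\sin\theta\,d\phi), \qquad e^7 = \frac12\sqrt{\frac{1-\lambda^3f^2}{\lambda}}\;(d\psi + P + P'), \tag{3.11} \]

where $a = 1, \ldots, 4$, the $\hat e^a$ furnish a basis for $KE_4^-$, $\hat J_4 = \hat e^{12} + \hat e^{34}$ and $\hat\Omega_4 = (\hat e^1 + i\hat e^2)(\hat e^3 + i\hat e^4)$, the $SU(3)$ structure is given by

\[ J_6 = e^{12} + e^{34} + e^{56}, \qquad \Omega_6 = (e^1 + ie^2)(e^3 + ie^4)(e^5 + ie^6). \tag{3.12} \]

This structure is a solution of the torsion conditions of [43] for the near-horizon limit of fivebranes on a Kähler four-cycle in a four-fold, which are

\[ \hat\rho\wedge d\left(\lambda^{-1}J_6\wedge J_6\right) = 0, \tag{3.13} \]
\[ d\left(\lambda^{-3/2}\sqrt{1-\lambda^3f^2}\;\mathrm{Im}\,\Omega_6\right) = 2\lambda^{-1}\left( e^7\wedge\mathrm{Re}\,\Omega_6 - \lambda^{3/2}f\,\hat\rho\wedge\mathrm{Im}\,\Omega_6 \right), \tag{3.14} \]
\[ J_6\lrcorner\,de^7 = \frac{2\lambda^{1/2}}{\sqrt{1-\lambda^3f^2}}\,(1+\lambda^3f^2) - \lambda^{3/2}f\,\hat\rho\lrcorner\,d\log\left(\frac{\lambda^3 f}{1-\lambda^3f^2}\right). \tag{3.15} \]

In addition it is a solution of the Bianchi identity for the magnetic flux, $dF_{\mathrm{mag}} = 0$, which in this case is not implied by the torsion conditions. The magnetic flux is given by

\[ F_{\mathrm{mag}} = \frac{\lambda^{3/2}}{\sqrt{1-\lambda^3f^2}}\left(\lambda^{3/2}f + \star_8\right)\left( d\left[\lambda^{-3/2}\sqrt{1-\lambda^3f^2}\;J_6\wedge e^7\right] - 2\lambda^{-1}J_6\wedge J_6 \right) + 2\lambda^{1/2}J_6\wedge e^7\wedge\hat\rho, \tag{3.16} \]

where $\star_8$ is the Hodge dual on the space transverse to the AdS factor, with positive orientation defined with respect to

\[ \mathrm{Vol} = \frac{1}{3!}\,J_6\wedge J_6\wedge J_6\wedge e^7\wedge\hat\rho. \tag{3.17} \]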

The AdS solutions in the Minkowski frame. Now we use the discussion of Sect. 2 to frame-rotate the AdS solutions to the canonical Minkowski frame. Defining the coordinates

\[ t = -\frac12\,e^{-4r/3}\rho, \qquad u = -\frac13\sqrt{12-3\rho^2}\;e^{-r}, \tag{3.18} \]


the one-forms $e^8$, $e^9$ in the Minkowski frame are given by

\[ e^8 = \lambda e^{r}\,du, \qquad e^9 = \lambda e^{4r/3}\,dt, \tag{3.19} \]

and the metric in the Minkowski frame takes the form

\[ ds^2 = \frac{1}{H_{M5}^{1/3}H_{M2}^{2/3}}\,ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\,dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\,\frac34\,F\,ds^2(KE_4^-) + H_{M2}^{1/3}H_{M5}^{2/3}\,\frac{1}{F}\left[ du^2 + \frac{u^2}{4}\left( ds^2(S^2) + (d\psi + P + P')^2 \right) \right], \tag{3.20} \]

where

\[ H_{M5} = \lambda^3 e^{14r/3}, \qquad H_{M2} = e^{2r/3}, \qquad F = e^{4r/3}. \tag{3.21} \]

These three functions have been chosen so that the metric takes a form reminiscent of the harmonic function superposition rule for intersecting branes, in line with the probe brane picture. The fivebrane worldvolume directions are the Minkowski and $KE_4^-$ directions; the membranes extend along the Minkowski and $t$ directions. Also, $e^{2r}$ is given in terms of $t$ and $u$ by the root, inducing a metric of positive signature, of the quartic

\[ t^6 e^{8r} - \left(1 - \frac34\,u^2 e^{2r}\right)^3 = 0. \tag{3.22} \]
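The quartic for $e^{2r}$ follows by eliminating $\rho$ between the two coordinate definitions in (3.18). A short sketch of that elimination, assuming only $t = -\tfrac12 e^{-4r/3}\rho$ and $u = -\tfrac13\sqrt{12-3\rho^2}\,e^{-r}$:

```latex
\begin{align*}
t = -\tfrac12 e^{-4r/3}\rho
   &\;\Longrightarrow\; \rho^2 = 4t^2 e^{8r/3},\\
u = -\tfrac13\sqrt{12-3\rho^2}\;e^{-r}
   &\;\Longrightarrow\; 9u^2e^{2r} = 12 - 3\rho^2
   \;\Longrightarrow\; \rho^2 = 4 - 3u^2e^{2r},\\
\text{hence}\quad t^2 e^{8r/3} = 1 - \tfrac34 u^2 e^{2r}
   &\;\Longrightarrow\; t^6 e^{8r} - \left(1 - \tfrac34 u^2 e^{2r}\right)^3 = 0 .
\end{align*}
```

Cubing removes the fractional power of $e^{r}$, which is why the result is a quartic in $e^{2r}$.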

The wrapped-brane $SU(4)$ structure of the $\mathrm{AdS}_3$ solutions, defined by two of their Killing spinors, is given by

\[ J_8 = J_6 + e^7\wedge e^8, \qquad \Omega_8 = \Omega_6\wedge(e^7 + ie^8). \tag{3.23} \]

By construction, this structure is a solution of the wrapped brane equations for a Kähler four-cycle in a four-fold. These comprise the torsion conditions [62,43],

\[ J_8\lrcorner\,de^9 = 0, \qquad d\left(L^{-1}\,\mathrm{Re}\,\Omega_8\right) = 0, \qquad e^9\wedge\left[ J_8\lrcorner\,dJ_8 - L\,e^9\lrcorner\,d\left(L^{-1}e^9\right) \right] = 0, \tag{3.24} \]

and the Bianchi identity and field equation for the four-form, which is given in the Minkowski frame in [43,62].


The conjectured Calabi-Yau interpolation. We now make the following ansatz for an interpolating solution:

\[ ds^2 = \frac{1}{H_{M5}^{1/3}H_{M2}^{2/3}}\,ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\,dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\,\alpha^2F_1^2F_2^2\,ds^2(KE_4^-) + H_{M2}^{1/3}H_{M5}^{2/3}\left[ \frac{1}{F_1^2}\left( du^2 + \frac{u^2}{4}(d\psi + P + P')^2 \right) + \frac{u^2}{4F_2^2}\,ds^2(S^2) \right], \tag{3.25} \]

with $H_{M5,M2}$, $F_{1,2}$ arbitrary functions of $u$, $t$, and $\alpha$ a constant. To determine the Calabi-Yau interpolation with this ansatz, we set $H_{M5,M2} = 1$ and require that $F_{1,2}$ are functions only of $u$. The derivation of the Calabi-Yau metric is now identical to that for the three-fold interpolation of the previous subsection, as given in [33]. This close analogy between fivebranes wrapped on Kähler four-cycles in four-folds and two-cycles in three-folds has recently been used to construct infinite families of $\mathrm{AdS}_3$ solutions [30–32] motivated by the analogous $\mathrm{AdS}_5$ solutions [13]. In any event, to determine the special holonomy metric, observe that closure of $\Omega_8$, with the obvious frame inherited from the AdS solution, is automatic. Closure of $J_8$ results in the pair of equations

\[ \alpha^2\,\partial_u\!\left(F_1^2F_2^2\right) + \frac{u}{2F_1^2} = 0, \qquad \partial_u\!\left(\frac{u^2}{4F_2^2}\right) - \frac{u}{2F_1^2} = 0. \tag{3.26} \]
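One immediate consequence of the pair (3.26) is worth making explicit. The term $u/(2F_1^2)$ enters the two equations with opposite signs, so adding them gives a first integral. In the sketch below $c_0$ is a name introduced here for the integration constant; it does not appear in the original.

```latex
\[
\partial_u\!\left( \alpha^2F_1^2F_2^2 + \frac{u^2}{4F_2^2} \right) = 0
\quad\Longrightarrow\quad
\alpha^2F_1^2F_2^2 + \frac{u^2}{4F_2^2} = c_0 .
\]
```

With $H_{M5,M2}=1$, the two terms on the left are the coefficients of $ds^2(KE_4^-)$ and of $ds^2(S^2)$ in the ansatz, so their sum is constant along $u$.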

As in [33,61], the general solution of these equations inducing a metric with only one singular degeneration point is given by

\[ F_1^2 = \frac{a^4}{\alpha^2u^2}\cos^2\xi, \qquad F_2^2 = \frac{u^2}{2a^2}\,\frac{1+\sin\xi}{\cos^2\xi}, \qquad -\frac13\sin^3\xi + \sin\xi = \frac23 - \frac{\alpha^2u^4}{4a^6}, \tag{3.27} \]

for some constant $a$. Defining the coordinate

\[ R^2 = \frac{\alpha^2u^4}{4a^6}, \tag{3.28} \]

the metric takes the form given above.

3.2. Special Lagrangian cycles. In this subsection we will give conjectured interpolating pairs for fivebranes wrapped on SLAG cycles in three- and four-folds. The AdS solutions for which a Calabi-Yau interpolation is derived are respectively the $\mathrm{AdS}_4$ solution of [25], admitting eight Killing spinors, and the $\mathrm{AdS}_3$ solution of [26], admitting four Killing spinors. In each case we will first give the conjectured pair, then the derivation of the Calabi-Yau interpolation from the AdS solution.


3.2.1. Three-fold. The interpolating pair. The eleven-dimensional lift of the $\mathrm{AdS}_4$ solution of [25] was later interpreted [26] as the near-horizon limit of fivebranes wrapped on a SLAG three-cycle in a three-fold. The metric is given by

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_4) + ds^2(H^3) + (1-\lambda^3\rho^2)\,DY^aDY^a + \frac{\lambda^3}{4}\left( \frac{d\rho^2}{1-\lambda^3\rho^2} + \rho^2\,ds^2(S^1) \right)\right], \qquad \lambda^3 = \frac{2}{8+\rho^2}. \tag{3.29} \]

The flux, which in this case is purely magnetic and irrelevant to the discussion, may be obtained from [26] or [41]. Here the $Y^a$, $a = 1, 2, 3$, are constrained coordinates on an $S^2$, $Y^aY^a = 1$, and

\[ DY^a = dY^a + \omega^a{}_b\,Y^b, \tag{3.30} \]

where the $\omega_{ab}$ are the spin-connection one-forms of $H^3$. The range of $\rho$, which without loss of generality we take to be positive, is $\rho \in [0, \sqrt8]$. At $\rho = 0$ the R-symmetry $S^1$ degenerates smoothly⁸, while at $\rho = \sqrt8$ the $S^2$ degenerates smoothly. Denoting a basis for $H^3$ by $e^a$, the metric of the conjectured Calabi-Yau interpolation of this solution is

\[ ds^2 = ds^2(\mathbb{R}^{1,4}) + ds^2(N_\tau), \tag{3.31} \]

where, up to an overall scale,

\[ ds^2(N_\tau) = \frac{(2\theta - \sin2\theta)^{1/3}}{\sin\theta}\left[ \frac12(1-\cos\theta)\,(e^a - Y^aY^be^b)^2 + \frac12(1+\cos\theta)\,DY^aDY^a + \frac13\,\frac{\sin^3\theta}{2\theta - \sin2\theta}\left( d\theta^2 + 4(Y^ae^a)^2 \right) \right]. \tag{3.32} \]

The range of $\theta$ is $\theta \in (0, \pi]$. Near $\theta = \pi$, the $S^2$ degenerates smoothly; up to a scale, near $\theta = \pi$ the metric is

\[ ds^2 = ds^2(H^3) + \frac14\left( d\theta^2 + \theta^2\,DY^aDY^a \right). \tag{3.33} \]

The metric is singular at $\theta = 0$; up to a scale, near $\theta = 0$ it is

\[ ds^2 = \frac14\left[ d\theta^2 + \theta^2\,(e^a - Y^aY^be^b)^2 \right] + (Y^ae^a)^2 + DY^aDY^a. \tag{3.34} \]

This Calabi-Yau is the hyperbolic analogue of the deformed conifold [47] (which coincides with the Stenzel three-fold metric [50]); the $S^3$ SLAG cycle of the deformed conifold is replaced by an $H^3$ in the $N_\tau$ metric. Now we discuss the derivation of this interpolation from the AdS solution.

8 The R-symmetry of the dual conformal theory is $U(1)$.


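The G-structure of the AdS solution. The $\mathrm{AdS}_4$ solution admits an $SU(2)$ structure defined by all eight Killing spinors. It is given by [41]

\[ e^5 = \frac{1}{\lambda^{1/2}}\,Y^ae^a, \qquad J^1 = \frac{1}{\lambda}\sqrt{1-\lambda^3\rho^2}\;DY^a\wedge e^a, \qquad J^2 = \frac{1}{\lambda}\sqrt{1-\lambda^3\rho^2}\;\epsilon^{abc}\,Y^aDY^b\wedge e^c, \]
\[ J^3 = \frac12\,\epsilon^{abc}\left[ \frac{1}{\lambda}(1-\lambda^3\rho^2)\,Y^aDY^b\wedge DY^c - \frac{1}{\lambda}\,Y^ae^b\wedge e^c \right]. \tag{3.35} \]

This structure satisfies the torsion conditions of [41] for the near-horizon limit of fivebranes on a SLAG three-cycle in a three-fold, which are

\[ d\left(\lambda^{-1}\sqrt{1-\lambda^3\rho^2}\;e^5\right) = \lambda^{-1/2}J^1 + \lambda\rho\,e^5\wedge\hat\rho, \qquad d\left(\lambda^{-3/2}J^3\wedge e^5 - \rho\,J^2\wedge\hat\rho\right) = 0, \qquad d\left(J^2\wedge e^5 + \lambda^{-3/2}\rho^{-1}J^3\wedge\hat\rho\right) = 0. \tag{3.36} \]

The following identities, valid for an $H^3$ or $S^3$ with scalar curvature $R$, are useful in verifying this claim:

\[ d(Y^ae^a) = DY^a\wedge e^a, \]
\[ d\left(\epsilon^{abc}\,Y^aDY^b\wedge DY^c\right) = -\frac{R}{3}\,\epsilon^{abc}\,Y^aDY^b\wedge e^c\wedge Y^de^d, \]
\[ d\left(\epsilon^{abc}\,Y^ae^b\wedge e^c\right) = 2\,\epsilon^{abc}\,Y^aDY^b\wedge e^c\wedge Y^de^d, \]
\[ d\left(\epsilon^{abc}\,Y^aDY^b\wedge e^c\right) = \epsilon^{abc}\left( Y^aDY^b\wedge DY^c - \frac{R}{6}\,Y^ae^b\wedge e^c \right)\wedge Y^de^d. \tag{3.37} \]

In this case, the Bianchi identity for the flux is implied by the torsion conditions [41].

The AdS solution in the Minkowski frame. From Sect. 2, defining the Minkowski-frame coordinates

\[ t = -\frac{\rho}{2}\,e^{-2r}, \qquad u = -\sqrt{\frac{8-\rho^2}{2}}\;e^{-r}, \tag{3.38} \]

the metric of the AdS solution in the Minkowski frame is given by

\[ ds^2 = L^{-1}\left[ ds^2(\mathbb{R}^{1,2}) + F\,ds^2(H^3) \right] + L^2\left[ F^{-1}\left( du^2 + u^2\,DY^aDY^a \right) + ds^2(\mathbb{R}^2) \right], \tag{3.39} \]

where $L = \lambda e^{2r}$, $F = e^{2r}$ and

\[ e^{2r} = \frac{u^2}{4t^2}\left( -1 + \sqrt{1 + \frac{32t^2}{u^4}} \right). \tag{3.40} \]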
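The closed form for $e^{2r}$ can be recovered by eliminating $\rho$ between the coordinate definitions (3.38). The following sketch assumes $t = -\tfrac{\rho}{2}e^{-2r}$ and $u = -\sqrt{(8-\rho^2)/2}\;e^{-r}$, and writes $x \equiv e^{2r}$ (a shorthand introduced here):

```latex
\begin{align*}
\rho^2 &= 4t^2e^{4r} = 4t^2x^2, \qquad
\rho^2 = 8 - 2u^2e^{2r} = 8 - 2u^2x,\\
&\Longrightarrow\quad 2t^2x^2 + u^2x - 4 = 0,\\
x &= \frac{-u^2 + \sqrt{u^4 + 32t^2}}{4t^2}
   = \frac{u^2}{4t^2}\left(-1 + \sqrt{1 + \frac{32t^2}{u^4}}\right).
\end{align*}
```

The positive root is the one selected, since $x = e^{2r} > 0$. The combination $t^2/u^4$ is independent of $r$, as it should be for the argument of the square root.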


The wrapped-brane $SU(3)$ structure of the AdS solution, defined by four of its Killing spinors, is given by

\[ J_6 = J^1 + e^5\wedge\hat u, \qquad \Omega_6 = (J^3 + iJ^2)\wedge(e^5 + i\hat u), \tag{3.41} \]

with $\hat u = LF^{-1/2}du$. By construction, this structure is a solution of the wrapped brane equations for fivebranes wrapped on a SLAG cycle in a three-fold, which are [39]

\[ \mathrm{Vol}[\mathbb{R}^2]\wedge d\,\mathrm{Im}\,\Omega_6 = 0, \qquad d\left(L^{-1/2}J_6\right) = 0, \qquad \mathrm{Re}\,\Omega_6\wedge d\,\mathrm{Re}\,\Omega_6 = 0, \qquad d\star_8\left( L^{3/2}\,d\left(L^{-3/2}\,\mathrm{Re}\,\Omega_6\right) \right) = 0, \tag{3.42} \]

where $\star_8$ denotes the Hodge dual on the space transverse to the Minkowski factor.

The conjectured Calabi-Yau interpolation. We now make the following ansatz for an interpolating solution:

\[ ds^2 = L^{-1}\left[ ds^2(\mathbb{R}^{1,2}) + F_1^2\,(e^a - Y^aY^be^b)^2 + F_2^2\,(Y^ae^a)^2 \right] + L^2\left[ F_4^2\,du^2 + F_3^2\,DY^aDY^a + ds^2(\mathbb{R}^2) \right], \tag{3.43} \]

with $L$, $F_{1,\ldots,4}$ arbitrary functions of $u$ and $t$. To determine the Calabi-Yau interpolation with this ansatz, we set $L = 1$, and require that $F_{1,\ldots,4}$ are functions only of $u$. Then $F_4$ is at our disposal and we set it to unity. The Calabi-Yau condition is

\[ dJ_6 = d\Omega_6 = 0, \tag{3.44} \]

with $J_6$ and $\Omega_6$ as inherited from the AdS solution in the Minkowski frame,

\[ J_6 = F_1F_3\,DY^a\wedge e^a + F_2\,Y^ae^a\wedge du, \]
\[ \mathrm{Re}\,\Omega_6 = \frac12\left( F_2F_3^2\,\epsilon^{abc}\,Y^aDY^b\wedge DY^c - F_1^2F_2\,\epsilon^{abc}\,Y^ae^b\wedge e^c \right)\wedge Y^de^d - F_1F_3\,\epsilon^{abc}\,Y^aDY^b\wedge e^c\wedge du, \]
\[ \mathrm{Im}\,\Omega_6 = F_1F_2F_3\,\epsilon^{abc}\,Y^aDY^b\wedge e^c\wedge Y^de^d + \frac12\left( F_3^2\,\epsilon^{abc}\,Y^aDY^b\wedge DY^c - F_1^2\,\epsilon^{abc}\,Y^ae^b\wedge e^c \right)\wedge du. \tag{3.45} \]

Then using Eqs. (3.37), closure of $J_6$ implies

\[ \partial_u(F_1F_3) + F_2 = 0. \tag{3.46} \]
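The step from $dJ_6 = 0$ to (3.46) is short enough to spell out. The sketch below assumes $J_6 = F_1F_3\,DY^a\wedge e^a + F_2\,Y^ae^a\wedge du$ with $F_{1,2,3}$ functions of $u$ only, and uses the first identity of (3.37), $d(Y^ae^a) = DY^a\wedge e^a$, which also gives $d(DY^a\wedge e^a) = d^2(Y^ae^a) = 0$:

```latex
\begin{align*}
dJ_6 &= \partial_u(F_1F_3)\,du\wedge DY^a\wedge e^a
        + F_1F_3\,d\!\left(DY^a\wedge e^a\right)
        + F_2\,d\!\left(Y^ae^a\right)\wedge du\\
     &= \partial_u(F_1F_3)\,du\wedge DY^a\wedge e^a
        + F_2\,DY^a\wedge e^a\wedge du\\
     &= \left[\partial_u(F_1F_3) + F_2\right] du\wedge DY^a\wedge e^a .
\end{align*}
```

The term $dF_2\wedge Y^ae^a\wedge du$ drops out because $dF_2\propto du$, and the last line uses that the two-form $DY^a\wedge e^a$ commutes with the one-form $du$.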

Closure of $\mathrm{Re}\,\Omega_6$ implies

\[ \frac12\,\partial_u(F_2F_3^2) + F_1F_3 = 0, \qquad \frac12\,\partial_u(F_2F_1^2) - F_1F_3 = 0. \tag{3.47} \]


Closure of $\mathrm{Im}\,\Omega_6$ implies

\[ \partial_u(F_1F_2F_3) - F_3^2 + F_1^2 = 0, \tag{3.48} \]

and this equation is implied by the other three. Solving (3.46) and (3.47) is straightforward. Adding the two equations of (3.47) we immediately get

\[ F_2 = \frac{a}{F_1^2 + F_3^2}, \tag{3.49} \]

for constant $a$. Next, subtracting the two equations of (3.47), and defining a new coordinate $x$ according to

\[ \partial_u = -\frac{4}{a}\,F_1F_3\,\partial_x, \tag{3.50} \]

we get

\[ \frac{F_3^2 - F_1^2}{F_1^2 + F_3^2} = x + b, \tag{3.51} \]

for a constant $b$ which may be eliminated by a shift of $x$. Solving for $F_3$, inserting in (3.46), and defining $x = \cos\theta$, we obtain

\[ F_1^6 = \frac{3a^2}{32}\left(\frac{1-\cos\theta}{1+\cos\theta}\right)^{3/2}(2\theta - \sin2\theta + c), \tag{3.52} \]

for constant $c$. The metric has pathological behaviour unless $c = 0$, so we choose this value. Then, up to an overall scale of $(3a^2/4)^{1/3}$, we obtain the three-fold metric given above.

3.2.2. Four-folds. The interpolating pair. The GKW solution for the $\mathrm{AdS}_3$ near-horizon limit of a string intersection of fivebranes wrapped on a SLAG four-cycle in a four-fold, with membranes extended in the directions transverse to the four-fold, was constructed in [26]. The metric is given by

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_3) + \frac83\,ds^2(H^4) + (1-\lambda^3f^2)\,DY^aDY^a + \frac{\lambda^3}{4(1-\lambda^3f^2)}\,d\rho^2 \right], \qquad \lambda^3 = \frac{16}{24+3\rho^2}, \quad f = \frac{3\rho}{4}. \tag{3.53} \]

Here the $Y^a$, $a = 1, \ldots, 4$ are constrained coordinates on a three-sphere, $Y^aY^a = 1$, and

\[ DY^a = dY^a + \omega^a{}_b\,Y^b, \tag{3.54} \]

with $\omega_{ab}$ the spin connection one-forms of $H^4$. The range of $\rho$ is $\rho \in [-2, 2]$; at the endpoints, the $S^3$ degenerates smoothly. The electric flux may be obtained from [26] or [43]; the magnetic flux will be given below.


Denoting a basis for $H^4$ by $e^a$, the metric of the conjectured Calabi-Yau interpolation of this solution is

\[ ds^2 = ds^2(\mathbb{R}^{1,2}) + ds^2(N_\tau), \tag{3.55} \]

where, up to an overall scale,

\[ ds^2(N_\tau) = \frac{(2+\cos2\theta)^{1/4}}{\cos\theta}\left[ \cos^2\theta\,(e^a - Y^aY^be^b)^2 + \sin^2\theta\,DY^aDY^a + \frac{3\cos\theta\,\sin^32\theta}{8\sin^3\theta\,(2+\cos2\theta)}\left( (Y^ae^a)^2 + d\theta^2 \right) \right]. \tag{3.56} \]

This metric is the hyperbolic analogue of the Stenzel four-fold. Without loss of generality, we can take the range of $\theta$ to be $\theta \in [0, \pi/2)$. Near $\theta = 0$, the $S^3$ degenerates smoothly, and up to a scale the metric is given by

\[ ds^2 = ds^2(H^4) + d\theta^2 + \theta^2\,DY^aDY^a. \tag{3.57} \]
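The limiting form near $\theta = 0$ can be read off from the coefficients of the $N_\tau$ metric. The sketch below assumes the coefficient of $(Y^ae^a)^2 + d\theta^2$ is $3\cos\theta\,\sin^32\theta\,/\,[8\sin^3\theta\,(2+\cos2\theta)]$, and uses $\sin2\theta = 2\sin\theta\cos\theta$ and $Y^aY^a = 1$:

```latex
\begin{align*}
\frac{3\cos\theta\,\sin^32\theta}{8\sin^3\theta\,(2+\cos2\theta)}
   &= \frac{3\cos^4\theta}{2+\cos2\theta} \;\longrightarrow\; 1,
\qquad
\frac{(2+\cos2\theta)^{1/4}}{\cos\theta} \;\longrightarrow\; 3^{1/4},\\
(e^a - Y^aY^be^b)^2 + (Y^ae^a)^2 &= e^ae^a = ds^2(H^4),\\
ds^2(N_\tau) &\;\longrightarrow\; 3^{1/4}\left[ ds^2(H^4) + d\theta^2 + \theta^2\,DY^aDY^a \right].
\end{align*}
```

Up to the constant $3^{1/4}$ this is the local form $ds^2(H^4) + d\theta^2 + \theta^2 DY^aDY^a$, with the $S^3$ shrinking as the origin of $\mathbb{R}^4$ in polar coordinates.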

The other degeneration point, $\theta = \pi/2$, is singular. Now we give the derivation of the conjectured interpolation.

The G-structure of the AdS solution. The $\mathrm{AdS}_3$ solution admits an $SU(3)$ structure defined by all four Killing spinors. The structure satisfies the torsion conditions of [43] for the near-horizon limit of fivebranes on a SLAG four-cycle in a four-fold, together with the Bianchi identity for the magnetic flux, $dF_{\mathrm{mag}} = 0$, which in this case is not implied by the torsion conditions. The $SU(3)$ structure is [43]

\[ e^7 = -\sqrt{\frac{8}{3\lambda}}\;Y^ae^a, \qquad J_6 = \sqrt{\frac{8(1-\lambda^3f^2)}{3\lambda^2}}\;e^a\wedge DY^a, \]
\[ \mathrm{Re}\,\Omega_6 = \left(\sqrt{\frac{8}{3\lambda}}\right)^{3}\frac{1}{3!}\,\epsilon^{abcd}\,Y^ae^b\wedge e^c\wedge e^d - \sqrt{\frac{8}{3\lambda^3}}\,(1-\lambda^3f^2)\,\frac12\,\epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge e^d, \]
\[ \mathrm{Im}\,\Omega_6 = \frac83\sqrt{\frac{1-\lambda^3f^2}{\lambda^3}}\;\frac12\,\epsilon^{abcd}\,Y^aDY^b\wedge e^c\wedge e^d - \left(\sqrt{\frac{1-\lambda^3f^2}{\lambda}}\right)^{3}\frac{1}{3!}\,\epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge DY^d. \tag{3.58} \]


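The torsion conditions are

\[ e^7\wedge\hat\rho\wedge d\left( \frac{\mathrm{Re}\,\Omega_6}{\sqrt{1-\lambda^3f^2}} \right) = 0, \tag{3.59} \]
\[ d\left( \lambda^{-1}\sqrt{1-\lambda^3f^2}\;e^7 \right) = \lambda^{-1/2}\left( J_6 + \lambda^{3/2}f\,e^7\wedge\hat\rho \right), \tag{3.60} \]
\[ \mathrm{Im}\,\Omega_6\wedge d\,\mathrm{Im}\,\Omega_6 = \frac{\lambda^{1/2}}{\sqrt{1-\lambda^3f^2}}\,(6 + 4\lambda^3f^2)\,\mathrm{Vol}[M_6]\wedge e^7 - 2\lambda^{3/2}f\,\star_8 d\log\left( \frac{\lambda^3f}{1-\lambda^3f^2} \right), \tag{3.61} \]

where

\[ \mathrm{Vol}[M_6] = \frac{1}{3!}\,J_6\wedge J_6\wedge J_6, \tag{3.62} \]

and $\star_8$ denotes the Hodge dual on the space transverse to the AdS factor, with positive orientation defined with respect to

\[ \mathrm{Vol} = \mathrm{Vol}[M_6]\wedge e^7\wedge\hat\rho. \tag{3.63} \]

The magnetic flux is

\[ F_{\mathrm{mag}} = -\frac{\lambda^{3/2}}{\sqrt{1-\lambda^3f^2}}\left( \lambda^{3/2}f + \star_8 \right)\left( d\left[ \lambda^{-3/2}\sqrt{1-\lambda^3f^2}\;\mathrm{Im}\,\Omega_6 \right] + 4\lambda^{-1}\,\mathrm{Re}\,\Omega_6\wedge e^7 \right) - 2\lambda^{1/2}\,\mathrm{Im}\,\Omega_6\wedge\hat\rho. \tag{3.64} \]

The following identities, valid for an $H^4$ or an $S^4$ with scalar curvature $R$, are useful in verifying the torsion conditions and Bianchi identity:

\[ d\left( \epsilon^{abcd}\,Y^ae^b\wedge e^c\wedge e^d \right) = -3\,\epsilon^{abcd}\,Y^aDY^b\wedge e^c\wedge e^d\wedge Y^ee^e, \]
\[ d\left( \epsilon^{abcd}\,Y^aDY^b\wedge e^c\wedge e^d \right) = \left( -2\,\epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge e^d + \frac{R}{12}\,\epsilon^{abcd}\,Y^ae^b\wedge e^c\wedge e^d \right)\wedge Y^ee^e, \]
\[ d\left( \epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge e^d \right) = \left( \frac{R}{6}\,\epsilon^{abcd}\,Y^aDY^b\wedge e^c\wedge e^d - \epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge DY^d \right)\wedge Y^ee^e, \]
\[ d\left( \epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge DY^d \right) = \frac{R}{4}\,\epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge e^d\wedge Y^ee^e. \tag{3.65} \]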


The AdS solutions in the Minkowski frame. Using Sect. 2, we define the coordinates

\[ t = -\frac12\,e^{-3r/2}\rho, \qquad u = -\sqrt{\frac{24-6\rho^2}{16}}\;e^{-r}, \tag{3.66} \]

so that the one-forms $e^8$, $e^9$ in the Minkowski frame are given by

\[ e^8 = \lambda e^{r}\,du, \qquad e^9 = \lambda e^{3r/2}\,dt, \tag{3.67} \]

and the AdS metric in the Minkowski frame takes the form

\[ ds^2 = \frac{1}{H_{M5}^{1/3}H_{M2}^{2/3}}\,ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\,dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\,\frac83\,F\,ds^2(H^4) + H_{M2}^{1/3}H_{M5}^{2/3}\,\frac{1}{F}\left( du^2 + u^2\,DY^aDY^a \right), \tag{3.68} \]

where

\[ H_{M5} = \lambda^3 e^{5r}, \qquad H_{M2} = e^{r/2}, \qquad F = e^{3r/2}. \tag{3.69} \]

The function $e^{r}$ is given in terms of $t$ and $u$ by the root, inducing a metric of positive signature, of the cubic

\[ t^2e^{3r} + \frac23\,u^2e^{2r} - 1 = 0. \tag{3.70} \]
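As in the Kähler case, the cubic is obtained by eliminating $\rho$ between the two coordinate definitions. A sketch, assuming $t = -\tfrac12 e^{-3r/2}\rho$ and $u = -\sqrt{(24-6\rho^2)/16}\;e^{-r}$:

```latex
\begin{align*}
t = -\tfrac12 e^{-3r/2}\rho
   &\;\Longrightarrow\; \rho^2 = 4t^2e^{3r},\\
16\,u^2e^{2r} = 24 - 6\rho^2
   &\;\Longrightarrow\; \rho^2 = 4 - \tfrac83\,u^2e^{2r},\\
\text{hence}\quad t^2e^{3r} + \tfrac23\,u^2e^{2r} - 1 &= 0 .
\end{align*}
```

No further power needs to be taken here, because $e^{3r}$ and $e^{2r}$ are already integer powers of $e^{r}$.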

The wrapped brane $SU(4)$ structure of the $\mathrm{AdS}_3$ solution, defined by two of its Killing spinors, is given by

\[ J_8 = J_6 + e^7\wedge e^8, \qquad \Omega_8 = \Omega_6\wedge(e^7 + ie^8). \tag{3.71} \]

By construction, this structure is a solution of the wrapped brane equations for a SLAG four-cycle in a four-fold, which comprise the torsion conditions [43]

\[ d\left(L^{-1/2}J_8\right) = 0, \qquad \mathrm{Im}\,\Omega_8\wedge d\,\mathrm{Re}\,\Omega_8 = 0, \qquad e^9\wedge\left[ \mathrm{Re}\,\Omega_8\lrcorner\,d\,\mathrm{Re}\,\Omega_8 - 2L^{3/2}\,e^9\lrcorner\,d\left(L^{-3/2}e^9\right) \right] = 0, \tag{3.72} \]

together with the Bianchi identity and field equation for the four-form, which is given in [43].


The conjectured Calabi-Yau interpolation. We make the following ansatz for an interpolating solution:

\[ ds^2 = \frac{1}{H_{M5}^{1/3}H_{M2}^{2/3}}\,ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\,dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\left[ F_1^2\,(e^a - Y^aY^be^b)^2 + F_2^2\,(Y^be^b)^2 \right] + H_{M2}^{1/3}H_{M5}^{2/3}\left[ F_4^2\,du^2 + F_3^2\,DY^aDY^a \right], \tag{3.73} \]

with $H_{M5,M2}$, $F_{1,\ldots,4}$ arbitrary functions of $u$, $t$. To determine the Calabi-Yau interpolation with this ansatz, we set $H_{M5,M2} = 1$ and require that $F_{1,\ldots,4}$ are functions only of $u$. Then $F_4$ is at our disposal, and we set it to 1. Requiring $SU(4)$ holonomy, we set

\[ dJ_8 = d\Omega_8 = 0, \tag{3.74} \]

with

\[ J_8 = J_6 + e^7\wedge du, \qquad \Omega_8 = \Omega_6\wedge(e^7 + i\,du), \qquad e^7 = -F_2\,Y^ae^a, \qquad J_6 = F_1F_3\,e^a\wedge DY^a, \]
\[ \mathrm{Re}\,\Omega_6 = \frac{1}{3!}\,F_1^3\,\epsilon^{abcd}\,Y^ae^b\wedge e^c\wedge e^d - \frac12\,F_1F_3^2\,\epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge e^d, \]
\[ \mathrm{Im}\,\Omega_6 = \frac12\,F_1^2F_3\,\epsilon^{abcd}\,Y^aDY^b\wedge e^c\wedge e^d - \frac{1}{3!}\,F_3^3\,\epsilon^{abcd}\,Y^aDY^b\wedge DY^c\wedge DY^d. \tag{3.75} \]

Closure of $J_8$ implies

\[ \partial_u(F_1F_3) + F_2 = 0. \tag{3.76} \]

Using the identities (3.65) with $R = -12$, closure of $\mathrm{Re}\,\Omega_8$ implies

\[ \partial_u(F_1^3F_2) - 3F_1^2F_3 = 0, \qquad \partial_u(F_3^3F_2) + 3F_1F_3^2 = 0, \tag{3.77} \]

while closure of $\mathrm{Im}\,\Omega_8$ implies

\[ \partial_u(F_1^2F_2F_3) + F_1^3 - 2F_1F_3^2 = 0, \qquad \partial_u(F_1F_2F_3^2) - F_3^3 + 2F_1^2F_3 = 0. \tag{3.78} \]

It may be verified that the last two equations are implied by the first three. Solving for $F_{1,2,3}$ is straightforward. First define a new coordinate $\theta$ according to

\[ -F_2\,\partial_u = \partial_\theta. \tag{3.79} \]

Then we have that

\[ \partial_\theta\left(\frac{F_1}{F_3}\right) = -1 - \left(\frac{F_1}{F_3}\right)^2, \tag{3.80} \]


which has solution

\[ \frac{F_1}{F_3} = \frac{\cos\theta}{\sin\theta}, \tag{3.81} \]

up to an irrelevant constant which may be eliminated by a shift of $\theta$. Using this, we find that

\[ \partial_\theta\log\left( \frac{F_1F_2^{2/3}F_3}{\sin2\theta} \right) = 0, \tag{3.82} \]

and hence that

\[ F_1F_2^{2/3}F_3 = \alpha\,\sin2\theta, \tag{3.83} \]

for constant $\alpha$. Finally we get

\[ \partial_\theta\left( \frac{\sin2\theta}{F_2^{2/3}} \right) = \frac{1}{\alpha}\,F_2^2, \tag{3.84} \]

which has solution

\[ F_2 = \left(\frac{3\alpha}{8}\right)^{3/8}\left( \frac{\sin2\theta}{\left(\beta + [2+\cos2\theta]\sin^4\theta\right)^{1/4}} \right)^{3/2}, \tag{3.85} \]

for constant $\beta$. As was the case for the three-fold solution of the previous subsection, the metric has pathological behaviour unless $\beta = 0$. Choosing this value, the metric, up to an overall scale of $(8\alpha^3/3)^{1/4}$, is as given above.

4. Sp(2) Interpolating Pair

In this section, we will give a conjectured interpolating pair for fivebranes wrapped on a complex Lagrangian four-cycle in an $Sp(2)$ manifold. First we give the pair, then the derivation of the $Sp(2)$ interpolation from the AdS solution.

The interpolating pair. In [27], an $\mathrm{AdS}_3$ solution admitting six Killing spinors and describing the near-horizon limit of fivebranes wrapped on a CLAG four-cycle in an $Sp(2)$ manifold was constructed. In addition to the fivebranes, there are membranes extended in the directions transverse to the $Sp(2)$ manifold, which intersect the fivebranes in a string. The quantum dual of the AdS solution is the two-dimensional low energy effective theory on the string worldvolume. The metric of the AdS solution is given by

\[ ds^2 = \frac{1}{\lambda}\left[ ds^2(\mathrm{AdS}_3) + \frac52\,ds^2(B_4) + (1-\lambda^3f^2)\,DY^aDY^a + \frac{\lambda^3}{4(1-\lambda^3f^2)}\,d\rho^2 \right], \qquad \lambda^3 = \frac{50}{60+3\rho^2}, \quad f = \frac{3\rho}{5}. \tag{4.1} \]


Here $ds^2(B_4)$ is the Bergman metric on two-dimensional complex hyperbolic space, normalised such that the scalar curvature is $R = -12$; explicitly, this metric is

\[ ds^2(B_4) = 2\left[ dz^2 + \frac14\sinh^2 z\left( \sigma_1^2 + \sigma_2^2 + \cosh^2 z\,\sigma_3^2 \right) \right], \tag{4.2} \]

with $d\sigma_1 = \sigma_2\wedge\sigma_3$, together with cyclic permutations. In the AdS metric (4.1), the $Y^a$, $a = 1, \ldots, 4$ parameterise an $S^3$, $Y^aY^a = 1$, and

\[ DY^a = dY^a + \omega^a{}_b\,Y^b, \tag{4.3} \]

with $\omega_{ab}$ the spin connection one-forms of $B_4$. The electric flux is irrelevant to the discussion, and may be obtained from [27] or [43]; the magnetic flux will be given below. To give the conjectured special holonomy interpolation of this metric, we first make the following definitions. Let $e^a$ denote a basis for the Bergman metric (4.2). Let $J^A$, $A = 1, 2, 3$, denote a basis of self-dual $SU(2)$-invariant two-forms on $B_4$, obeying the algebra $J^AJ^B = -\delta^{AB} - \epsilon^{ABC}J^C$, and let $J^3$ be the Kähler form of $B_4$. Define

\[ E_1 = J^1_{ab}\,Y^ae^b, \quad E_2 = -J^2_{ab}\,Y^ae^b, \quad E_3 = J^1_{ab}\,Y^aDY^b, \quad E_4 = J^2_{ab}\,Y^aDY^b, \quad E_5 = J^3_{ab}\,Y^ae^b, \quad E_6 = Y^ae^a, \quad E_7 = J^3_{ab}\,Y^aDY^b. \tag{4.4} \]

Then the conjectured hyper-Kähler interpolation of the $\mathrm{AdS}_3$ solution is

\[ ds^2 = ds^2(\mathbb{R}^{1,2}) + ds^2(N_\tau), \tag{4.5} \]

where, up to an overall scale,

\[ ds^2(N_\tau) = (1+R^2)\left( E_1^2 + E_2^2 \right) + 2(1-R^2)\left( E_3^2 + E_4^2 \right) + 2R^2\left( E_5^2 + E_6^2 \right) + R^2\left(\frac{1}{R^4} - 1\right)E_7^2 + 4\left(\frac{1}{R^4} - 1\right)^{-1}dR^2. \tag{4.6} \]

The range of $R$ is $R \in (0, 1]$. At $R = 1$, the $S^3$ degenerates smoothly. Defining $R = 1 - y^2/4$, the metric near $y = 0$ is

\[ ds^2(N_\tau) = 2\,ds^2(B_4) + dy^2 + y^2\,DY^aDY^a. \tag{4.7} \]

The metric is singular at $R = 0$. This $N_\tau$ metric is the hyperbolic analogue of the Calabi metric on $T^*\mathbb{CP}^2$ [56]. Now we give its derivation from the AdS solution.

The G-structure of the AdS solution. The $\mathrm{AdS}_3$ solution admits an $SU(2)$ structure defined by all six Killing spinors. This structure satisfies the torsion conditions of [43], for the near-horizon limit of fivebranes on a CLAG four-cycle, together with the Bianchi identity for the flux $dF_{\mathrm{mag}} = 0$, which in this case is not implied by the torsion conditions. The $SU(2)$ structure is given by

\[ e^5 = \sqrt{\frac{5}{2\lambda}}\;E_5, \qquad e^6 = \sqrt{\frac{5}{2\lambda}}\;E_6, \qquad e^7 = \sqrt{\frac{1-\lambda^3f^2}{\lambda}}\;E_7, \]
\[ K^1 = \frac{1}{\lambda}\sqrt{\frac{5(1-\lambda^3f^2)}{2}}\left( E_1\wedge E_4 + E_2\wedge E_3 \right), \qquad K^2 = \frac{1}{\lambda}\sqrt{\frac{5(1-\lambda^3f^2)}{2}}\left( -E_1\wedge E_3 + E_2\wedge E_4 \right), \]
\[ K^3 = \frac{5}{2\lambda}\,E_1\wedge E_2 + \frac{1-\lambda^3f^2}{\lambda}\,E_3\wedge E_4. \tag{4.8} \]

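The triplet of SU(2) structure forms $K^A$ (not to be confused with the $J^A$ forms on $B_4$) obey the algebra $K^A K^B = -\delta^{AB} - \epsilon^{ABC} K^C$. The relevant torsion conditions of [43] are
\[
\begin{aligned}
\hat\rho \wedge d\big[\lambda^{-1}\big(\mathrm{Vol}[M_4] + K^3 \wedge e^{56}\big)\big] &= 0, \\
(K^3 + e^{56}) \lrcorner\, de^7 &= \frac{2\lambda^{1/2}}{\sqrt{1 - \lambda^3 f^2}}\,(1 + \lambda^3 f^2) - \lambda^{3/2} f\, \hat\rho \lrcorner\, d\log\Big(\frac{\lambda^3 f}{1 - \lambda^3 f^2}\Big),
\end{aligned}
\tag{4.9}
\]
\[ d\big[\lambda^{-1}\sqrt{1 - \lambda^3 f^2}\; e^5\big] = \lambda^{-1/2}\big(K^1 + e^{67}\big) + \lambda^{3/2} f\, e^5 \wedge \hat\rho, \tag{4.10} \]
\[ d\big[\lambda^{-1}\sqrt{1 - \lambda^3 f^2}\; e^6\big] = \lambda^{-1/2}\big(K^2 + e^{75}\big) + \lambda^{3/2} f\, e^6 \wedge \hat\rho, \tag{4.11} \]
with
\[ \mathrm{Vol}[M_4] = \frac{1}{2} K^3 \wedge K^3. \tag{4.12} \]
The magnetic flux is
\[
\begin{aligned}
F_{\rm mag} = {} & \frac{\lambda^{3/2}}{\sqrt{1 - \lambda^3 f^2}}\,\big(\lambda^{3/2} f + \star_8\big)\Big( d\big[\lambda^{-3/2}\sqrt{1 - \lambda^3 f^2}\,\big(K^3 \wedge e^7 + e^{567}\big)\big] - 4\lambda^{-1}\big(\mathrm{Vol}[M_4] + K^3 \wedge e^{56}\big) \Big) \\
& + 2\lambda^{1/2}\big(K^3 \wedge e^7 + e^{567}\big) \wedge \hat\rho,
\end{aligned}
\tag{4.13}
\]
with
\[ \mathrm{Vol}[M_8] = \mathrm{Vol}[M_4] \wedge e^{567} \wedge \hat\rho. \tag{4.14} \]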

In verifying that the given structure indeed solves the torsion conditions and Bianchi identity, and in the derivation of the Sp(2) metric to follow, the following is useful. Defining
\[ Q = \frac{1}{2} J^{3ab} \omega_{ab}, \tag{4.15} \]
the exterior derivatives of the $E_i$ are given by
\[
\begin{aligned}
dE_1 &= -E_2 \wedge (Q + E_7) - E_3 \wedge E_6 + E_4 \wedge E_5, \\
dE_2 &= E_1 \wedge (Q + E_7) + E_3 \wedge E_5 + E_4 \wedge E_6, \\
dE_3 &= E_4 \wedge (Q + 2E_7) - \tfrac{1}{2} E_1 \wedge E_6 + \tfrac{1}{2} E_2 \wedge E_5, \\
dE_4 &= -E_3 \wedge (Q + 2E_7) + \tfrac{1}{2} E_2 \wedge E_6 + \tfrac{1}{2} E_1 \wedge E_5, \\
dE_5 &= E_1 \wedge E_4 + E_2 \wedge E_3 + E_6 \wedge E_7, \\
dE_6 &= -E_1 \wedge E_3 + E_2 \wedge E_4 + E_7 \wedge E_5, \\
dE_7 &= -E_1 \wedge E_2 + 2E_3 \wedge E_4 - 2E_5 \wedge E_6.
\end{aligned}
\tag{4.16}
\]
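As a quick consistency check on (4.16), the following minimal Python sketch encodes the structure equations in a small exterior algebra and confirms three identities that they imply: $d(E_1 \wedge E_2) = -d(E_5 \wedge E_6)$, $d(E_3 \wedge E_4) = \tfrac{1}{2} d(E_5 \wedge E_6)$ and $d(-E_1 \wedge E_3 + E_2 \wedge E_4) = d(E_5 \wedge E_7)$. These are the relations behind the algebraic constraints in (4.26) and (4.27). The dictionary representation of forms and the helper names (`wedge`, `d2`, and so on) are illustrative assumptions, not part of the original derivation; $Q$ is treated as an independent one-form labelled 0 (its derivative is never needed), and the factors of 1/2 are assumed to multiply the $E_1 \wedge E_6$, $E_2 \wedge E_5$, $E_2 \wedge E_6$ and $E_1 \wedge E_5$ terms.

```python
from fractions import Fraction as Fr

def sort_sign(idx):
    """Sort wedge indices; return (sign, sorted tuple), with sign 0 if an index repeats."""
    idx = list(idx)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):              # bubble sort, counting transpositions
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def add(*forms):
    """Sum of forms, each stored as {sorted index tuple: coefficient}."""
    out = {}
    for f in forms:
        for k, v in f.items():
            out[k] = out.get(k, Fr(0)) + v
    return {k: v for k, v in out.items() if v != 0}

def scale(c, f):
    return {k: Fr(c) * v for k, v in f.items()}

def wedge(a, b):
    out = {}
    for ka, va in a.items():
        for kb, vb in b.items():
            s, k = sort_sign(ka + kb)
            if s:
                out[k] = out.get(k, Fr(0)) + s * va * vb
    return {k: v for k, v in out.items() if v != 0}

def E(i):
    return {(i,): Fr(1)}

def EE(i, j):
    return wedge(E(i), E(j))

Q = E(0)  # label 0 stands for the one-form Q of (4.15)

# Structure equations (4.16)
dE = {
    1: add(scale(-1, wedge(E(2), add(Q, E(7)))), scale(-1, EE(3, 6)), EE(4, 5)),
    2: add(wedge(E(1), add(Q, E(7))), EE(3, 5), EE(4, 6)),
    3: add(wedge(E(4), add(Q, scale(2, E(7)))),
           scale(Fr(-1, 2), EE(1, 6)), scale(Fr(1, 2), EE(2, 5))),
    4: add(scale(-1, wedge(E(3), add(Q, scale(2, E(7))))),
           scale(Fr(1, 2), EE(2, 6)), scale(Fr(1, 2), EE(1, 5))),
    5: add(EE(1, 4), EE(2, 3), EE(6, 7)),
    6: add(scale(-1, EE(1, 3)), EE(2, 4), EE(7, 5)),
    7: add(scale(-1, EE(1, 2)), scale(2, EE(3, 4)), scale(-2, EE(5, 6))),
}

def d2(i, j):
    """d(E_i ^ E_j) = dE_i ^ E_j - E_i ^ dE_j."""
    return add(wedge(dE[i], E(j)), scale(-1, wedge(E(i), dE[j])))

check_a = add(d2(1, 2), d2(5, 6))                                   # should vanish
check_b = add(d2(3, 4), scale(Fr(-1, 2), d2(5, 6)))                 # should vanish
check_c = add(scale(-1, d2(1, 3)), d2(2, 4), scale(-1, d2(5, 7)))   # should vanish
```

An empty dictionary represents the zero form, so all three checks are expected to come out empty. In particular, the $Q$-dependent terms cancel without any knowledge of $dQ$.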

The AdS solution in the Minkowski frame. We now use Sect. 2 to frame-rotate the AdS solution. Defining the coordinates
\[ t = -\frac{1}{2} e^{-6r/5} \rho, \qquad u = -\sqrt{\frac{12 - 3\rho^2}{10}}\; e^{-r}, \tag{4.17} \]
the one-forms $e^8$, $e^9$ in the Minkowski frame are given by
\[ e^8 = \lambda e^{r} du, \qquad e^9 = \lambda e^{6r/5} dt, \tag{4.18} \]
and the AdS metric in the Minkowski frame takes the form
\[ ds^2 = \frac{1}{H_{M5}^{1/3} H_{M2}^{2/3}}\, ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\, dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\, \frac{5}{2} F\, ds^2(B_4) + H_{M2}^{1/3} H_{M5}^{2/3}\, \frac{1}{F}\big( du^2 + u^2 DY^a DY^a \big), \tag{4.19} \]
where
\[ H_{M5} = \lambda^3 e^{22r/5}, \qquad H_{M2} = e^{4r/5}, \qquad F = e^{6r/5}. \tag{4.20} \]
The function $e^{2r}$ is given in terms of $t$ and $u$ by a positive signature metric inducing root of the sextic
\[ t^{10} e^{12r} - \Big(1 - \frac{5}{6} u^2 e^{2r}\Big)^5 = 0. \tag{4.21} \]
The wrapped brane Sp(2) structure of the AdS$_3$ solution, defined by three of its Killing spinors, is given by
\[
\begin{aligned}
J^1 &= K^3 + e^5 \wedge e^6 + e^7 \wedge e^8, \\
J^2 &= K^2 - e^5 \wedge e^7 + e^6 \wedge e^8, \\
J^3 &= K^1 + e^6 \wedge e^7 + e^5 \wedge e^8.
\end{aligned}
\tag{4.22}
\]
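The frame rotation (4.17)-(4.21) can be sanity-checked with a few lines of Python. The sketch below, which is only an illustration and uses hypothetical helper names, does two things. It tracks the powers of $\lambda$ and $e^r$ in each warp-factor combination of (4.19), confirming that they reproduce $\lambda^{-1} e^{-2r}$ on $\mathbb{R}^{1,1}$, $(e^9)^2$ and $(e^8)^2$ from (4.18), and an $r$-independent coefficient on $B_4$. It also checks numerically that the coordinates (4.17) satisfy the sextic in the form $t^{10} e^{12r} = (1 - \tfrac{5}{6} u^2 e^{2r})^5$, which is what follows from eliminating $\rho$; the square root in $u$ is an assumption of the reconstruction.

```python
from fractions import Fraction as Fr
import math

# Each function of (4.20) as (power of lambda, power of e^r).
H_M5 = (Fr(3), Fr(22, 5))
H_M2 = (Fr(0), Fr(4, 5))
F = (Fr(0), Fr(6, 5))

def product(*factors):
    """Overall (lambda power, e^r power) of a product of (function, exponent) pairs."""
    lam = sum(p * f[0] for f, p in factors)
    r = sum(p * f[1] for f, p in factors)
    return lam, r

flat = product((H_M5, Fr(-1, 3)), (H_M2, Fr(-2, 3)))                # ds^2(R^{1,1})
time = product((H_M5, Fr(2, 3)), (H_M2, Fr(-2, 3)))                 # dt^2
base = product((H_M2, Fr(1, 3)), (H_M5, Fr(-1, 3)), (F, Fr(1)))     # ds^2(B_4)
trans = product((H_M2, Fr(1, 3)), (H_M5, Fr(2, 3)), (F, Fr(-1)))    # du^2

def sextic_residual(r, rho):
    """Left-hand side of (4.21) evaluated on the coordinates (4.17)."""
    t = -0.5 * math.exp(-6 * r / 5) * rho
    u = -math.sqrt((12 - 3 * rho**2) / 10) * math.exp(-r)
    return t**10 * math.exp(12 * r) - (1 - 5 / 6 * u**2 * math.exp(2 * r))**5
```

The same bookkeeping applies to any other choice of exponents: replace the entries of `H_M5`, `H_M2` and `F` and compare the resulting powers with the squares of the corresponding frame one-forms.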

By construction, this structure is a solution of the wrapped brane equations for a CLAG four-cycle in a hyper-Kähler eight-manifold, which comprise the torsion conditions [43]
\[
\begin{aligned}
d(L^{-1/2} J^2) = d(L^{-1/2} J^3) &= 0, \\
e^9 \wedge \big[ J^1 \lrcorner\, dJ^1 - L\, e^9 \lrcorner\, d(L^{-1} e^9) \big] &= 0,
\end{aligned}
\tag{4.23}
\]
together with the Bianchi identity and field equation for the four-form, which is given in [43].

The conjectured hyper-Kähler interpolation. We make the following ansatz for an interpolating solution:
\[
\begin{aligned}
ds^2 = {} & \frac{1}{H_{M5}^{1/3} H_{M2}^{2/3}}\, ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\, dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}} \big[ F_1^2 (E_1^2 + E_2^2) + F_2^2 (E_5^2 + E_6^2) \big] \\
& + H_{M2}^{1/3} H_{M5}^{2/3} \big[ F_5^2\, du^2 + F_3^2 (E_3^2 + E_4^2) + F_4^2 E_7^2 \big],
\end{aligned}
\tag{4.24}
\]

with $H_{M5,M2}$, $F_{1,\dots,5}$ arbitrary functions of $u$, $t$. To determine the hyper-Kähler interpolation with this ansatz, we set $H_{M5,M2} = 1$ and require that $F_{1,\dots,5}$ are functions only of $u$. Then $F_5$ is at our disposal, and we set it to 1. Requiring Sp(2) holonomy, we set $dJ^A = 0$. From
\[ dJ^1 = 0, \tag{4.25} \]
we derive the conditions
\[ \partial_u F_1^2 = F_4, \qquad \partial_u F_2^2 = 2F_4, \qquad \partial_u F_3^2 = -2F_4, \qquad F_1^2 = F_2^2 + \frac{1}{2} F_3^2. \tag{4.26} \]
The algebraic constraint, combined with any two of the differential equations, implies the third. From $dJ^2$, we get
\[ \partial_u (F_1 F_3) = -F_2, \qquad \partial_u (F_2 F_4) = -F_2, \qquad F_1 F_3 = F_2 F_4, \tag{4.27} \]
and from $dJ^3$ we again obtain the Eqs. (4.27). The algebraic constraint in (4.27), combined with either of the differential equations, implies the second. Therefore the system we need to solve is
\[ \partial_u F_1^2 = F_4, \qquad \partial_u F_2^2 = 2F_4, \qquad \partial_u (F_2 F_4) = -F_2, \qquad F_3^2 = 2\big(F_1^2 - F_2^2\big), \qquad F_1 F_3 = F_2 F_4. \tag{4.28} \]

To solve the system, define a new coordinate $x$ such that
\[ \partial_u = F_4\, \partial_x. \tag{4.29} \]
Then the first two equations of (4.28) give
\[ F_1^2 = x + a, \qquad F_2^2 = 2x + b, \tag{4.30} \]
for constants $a$, $b$. We eliminate $b$ by a shift of $x$. Integrating the third equation we get
\[ F_4^2 = \frac{c}{x} - x, \tag{4.31} \]
for a constant $c$. Then the algebraic conditions imply that
\[ F_3^2 = 2(a - x), \qquad c = a^2. \tag{4.32} \]

Finally, defining a new coordinate $x = aR^2$, up to an overall scale of $a$, we get the hyper-Kähler $N_\tau$ metric given above.

5. $G_2$ Interpolating Pairs

In this section, we will give conjectured interpolating pairs for fivebranes wrapped on calibrated cycles in $G_2$ manifolds. First we will discuss co-associative four-cycles, then associative three-cycles. In each case we will first give the conjectured pairs, followed by the derivation of the $G_2$ interpolations from the AdS solutions.

5.1. Co-associative cycles. The interpolating pairs. The GKW AdS$_3$ solutions [26], describing the near-horizon limit of M-fivebranes wrapped on a co-associative cycle in a manifold of $G_2$ holonomy, admit four Killing spinors, and have metrics
\[
ds^2 = \frac{1}{\lambda}\Big[ ds^2(AdS_3) + \frac{9}{4} ds^2(\Sigma_4) + \frac{9}{4}(1 - \lambda^3 \rho^2) DY^a DY^a + \frac{\lambda^3}{4}\Big( \frac{d\rho^2}{1 - \lambda^3 \rho^2} + \rho^2 ds^2(S^1) \Big) \Big], \qquad \lambda^3 = \frac{81}{64 + 54\rho^2}. \tag{5.1}
\]
In this case the flux is purely magnetic, and is irrelevant to the discussion; it may be obtained from [26] or [41]. The wrapped cycle $\Sigma_4$ is an arbitrary conformally half-flat Einstein manifold, with scalar curvature normalised such that $R = -12$. This means that the Ricci tensor of $\Sigma_4$ is given by
\[ R_{ij} = -3 g_{ij}, \tag{5.2} \]

and the Weyl tensor is anti-self-dual,
\[ J_4^{a\,ij}\, C_{ijkl} = 0, \tag{5.3} \]
for a triplet of self-dual two-forms $J_4^a$, $a = 1, 2, 3$, on $\Sigma_4$. An example of such a manifold is the hyperbolic four-space $H^4$. The $Y^a$ are constrained coordinates on $S^2$, $Y^a Y^a = 1$, and
\[ DY^a = dY^a - \frac{1}{2} \epsilon^{abc}\, Y^b\, \omega_{ij}\, J_4^{c\,ij}, \tag{5.4} \]

where $\omega_{ij}$ are the spin connection one-forms of $\Sigma_4$. The range of $\rho$, which without loss of generality is taken to be non-negative, is $\rho \in [0, 8/(3\sqrt{3})]$. At $\rho = 0$ the R-symmetry $S^1$ degenerates smoothly$^{9}$, while at $\rho = 8/(3\sqrt{3})$ the $S^2$ parameterised by the $Y^a$ degenerates smoothly. The metric of the conjectured $G_2$ interpolation of these AdS solutions is
\[ ds^2 = ds^2(\mathbb{R}^{1,3}) + ds^2(N_\tau), \tag{5.5} \]
where up to an overall scale,
\[ ds^2(N_\tau) = \frac{R^2}{2}\, ds^2(\Sigma_4) + \frac{R^2}{4}\Big(\frac{1}{R^4} - 1\Big) DY^a DY^a + \Big(\frac{1}{R^4} - 1\Big)^{-1} dR^2. \tag{5.6} \]

The range of $R$ is $R \in (0, 1]$. At $R = 1$, the $S^2$ degenerates smoothly. The metric is singular at $R = 0$ where the co-associative $\Sigma_4$ degenerates. These metrics are the analogues, for negatively curved conformally half-flat Einstein $\Sigma_4$, of the regular BSGPP $G_2$ metrics on $\mathbb{R}^3$ bundles over $S^4$ or $\mathbb{CP}^2$ [45,46]. Now we give their derivation from the AdS solutions.

The G-structure of the AdS solutions. The SU(3) structure of the AdS solutions, defined by all four of their Killing spinors, is given by [41]
\[
\begin{aligned}
J_6 &= \frac{9}{4\lambda}\, Y^a J_4^a + \frac{9}{4\lambda}(1 - \lambda^3 \rho^2)\, \frac{1}{2} \epsilon^{abc}\, Y^a DY^b \wedge DY^c, \\
\Omega_6 &= \frac{27}{8} \sqrt{\frac{1 - \lambda^3 \rho^2}{\lambda^3}}\; \big( \epsilon^{abc}\, Y^a DY^b \wedge J_4^c + i\, DY^a \wedge J_4^a \big).
\end{aligned}
\tag{5.7}
\]
This structure is a solution of the AdS torsion conditions of [41] for the near-horizon limit of fivebranes on a co-associative four-cycle, which are
\[
\begin{aligned}
d\Big[ \frac{1}{\lambda^{3/2} \rho}\, J_6 \wedge \hat\rho - \mathrm{Im}\,\Omega_6 \Big] &= 0, \\
d\Big[ \frac{1}{2\lambda}\, J_6 \wedge J_6 + \lambda^{1/2} \rho\, \mathrm{Re}\,\Omega_6 \wedge \hat\rho \Big] &= 0.
\end{aligned}
\tag{5.8}
\]

$^{9}$ The R-symmetry of the conformal duals is U(1).

The following identities, valid for an arbitrary conformally half-flat Einstein manifold of scalar curvature $R$, are useful in verifying this claim:
\[
\begin{aligned}
d(Y^a J_4^a) &= DY^a \wedge J_4^a, \\
d\Big( \frac{1}{2} \epsilon^{abc}\, Y^a DY^b \wedge DY^c \Big) &= \frac{R}{12}\, DY^a \wedge J_4^a, \\
d\big( \epsilon^{abc}\, Y^a DY^b \wedge J_4^c \big) &= \frac{R}{3}\, \mathrm{Vol}[\Sigma_4] + Y^d J_4^d \wedge \epsilon^{abc}\, Y^a DY^b \wedge DY^c.
\end{aligned}
\tag{5.9}
\]
In this case the Bianchi identity for the four-form is implied by the torsion conditions [41].

The AdS solution in the Minkowski frame. Using Sect. 2, we now frame-rotate these solutions to the canonical Minkowski frame. The one-form $\hat u$ is given by
\[ \hat u = L e^{-4r/3}\, d\Big[ -\frac{1}{6} \sqrt{64 - 27\rho^2}\; e^{-2r/3} \Big]. \tag{5.10} \]
Defining the Minkowski frame coordinate $u$,
\[ u = -\frac{1}{6} \sqrt{64 - 27\rho^2}\; e^{-2r/3}, \tag{5.11} \]
the AdS$_3$ solutions in the Minkowski frame are given by
\[ ds^2 = L^{-1}\Big[ ds^2(\mathbb{R}^{1,1}) + \frac{9}{4} F\, ds^2(\Sigma_4) \Big] + L^2 \Big[ F^{-4/3}\big( du^2 + u^2 DY^a DY^a \big) + ds^2(\mathbb{R}^2) \Big], \tag{5.12} \]
where
\[ F = e^{2r}, \qquad L = \lambda F, \tag{5.13} \]
and $e^{4r}$ is a positive signature metric inducing root of the cubic
\[ \Big( \frac{16}{9} - t^2 e^{4r} \Big)^3 - u^6 e^{4r} = 0. \tag{5.14} \]
The wrapped-brane $G_2$ structure of the AdS$_3$ solutions is defined by two of their Killing spinors, and is given by
\[ \Phi = J_6 \wedge \hat u - \mathrm{Im}\,\Omega_6, \qquad \Upsilon = \frac{1}{2} J_6 \wedge J_6 + \mathrm{Re}\,\Omega_6 \wedge \hat u. \tag{5.15} \]
By construction, this structure is a solution of the wrapped brane equations for fivebranes on a co-associative four-cycle. From [41,59], these equations are
\[
\begin{aligned}
\mathrm{Vol}[\mathbb{R}^2] \wedge d\Phi &= 0, \\
d(L^{-1}\, \Phi \wedge \Upsilon) &= 0, \\
\Phi \wedge d\Phi &= 0, \\
d\big[ L \star_9 d(L^{-1} \Upsilon) \big] &= 0.
\end{aligned}
\tag{5.16}
\]

In the last equation, which comes from the four-form Bianchi identity, $\star_9$ denotes the Hodge dual on the space transverse to the Minkowski factor.

The conjectured $G_2$ interpolation. We now make the following ansatz for an interpolating solution:
\[ ds^2 = L^{-1}\big[ ds^2(\mathbb{R}^{1,1}) + F_1^2\, ds^2(\Sigma_4) \big] + L^2 \big[ F_3^2\, du^2 + F_2^2\, DY^a DY^a + ds^2(\mathbb{R}^2) \big], \tag{5.17} \]
with $L$, $F_{1,2,3}$ functions of $u$, $t$. For special holonomy we must have $L = $ constant, which we take to be unity. We also must have that $F_{1,2,3}$ are functions of $u$ only; the function $F_3$ is then at our disposal, and we set it to 1. The condition of $G_2$ holonomy is then
\[ d\Phi = d\Upsilon = 0, \tag{5.18} \]
for the metric
\[ ds^2(N_\tau) = F_1^2\, ds^2(\Sigma_4) + F_2^2\, DY^a DY^a + du^2, \tag{5.19} \]
with the $G_2$ structure inherited from the AdS frame,
\[
\begin{aligned}
\Phi &= J_6 \wedge du - \mathrm{Im}\,\Omega_6, \\
\Upsilon &= \frac{1}{2} J_6 \wedge J_6 + \mathrm{Re}\,\Omega_6 \wedge du, \\
J_6 &= F_1^2\, Y^a J_4^a + F_2^2\, \frac{1}{2} \epsilon^{abc}\, Y^a DY^b \wedge DY^c, \\
\Omega_6 &= F_1^2 F_2 \big( \epsilon^{abc}\, Y^a DY^b \wedge J_4^c + i\, DY^a \wedge J_4^a \big).
\end{aligned}
\tag{5.20}
\]

With $R = -12$, closure of $\Phi$ implies
\[ \partial_u (F_1^2 F_2) = F_2^2 - F_1^2, \tag{5.21} \]
while closure of $\Upsilon$ implies
\[ \partial_u (F_1^4) = 4 F_1^2 F_2, \qquad 2\, \partial_u (F_1^2 F_2^2) = -4 F_1^2 F_2. \tag{5.22} \]
It is readily verified that (5.22) imply (5.21). Integrating (5.22) is straightforward. Adding, we find that $F_1^4 + 2 F_1^2 F_2^2$ is constant, so that
\[ F_2^2 = \frac{\alpha^2}{2 F_1^2} - \frac{F_1^2}{2}, \tag{5.23} \]
for some constant $\alpha$. Defining a new coordinate $x$ such that
\[ \partial_u = 4 F_1^2 F_2\, \partial_x, \tag{5.24} \]
we then get
\[ F_1^4 = x + \beta, \tag{5.25} \]
for an irrelevant constant $\beta$ which can be eliminated by a shift in $x$. The constant $\alpha^2$ may be set to unity, up to an overall scale in the metric. Defining a new coordinate $R^4 = x/4$, the $G_2$ metrics conjectured to be the interpolation of the co-associative AdS$_3$ solutions of [26] are as given above.
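The claim that the metric (5.6) solves the closure conditions can be checked directly. The sketch below is an illustration under stated assumptions: it reads $F_1^2 = R^2/2$, $F_2^2 = (R^2/4)(1/R^4 - 1)$ and $du = (1/R^4 - 1)^{-1/2} dR$ off (5.6) with the overall scale set to one, and evaluates the residuals of (5.21) and (5.22) with a central finite difference. The helper names are hypothetical.

```python
import math

def h(R):
    """The combination 1/R^4 - 1 appearing in (5.6)."""
    return 1.0 / R**4 - 1.0

def F1sq(R): return R**2 / 2.0
def F2sq(R): return (R**2 / 4.0) * h(R)

def ddu(g, R, eps=1e-6):
    """d/du of g(R), with du = h(R)**-0.5 dR read off from the dR^2 term of (5.6)."""
    return math.sqrt(h(R)) * (g(R + eps) - g(R - eps)) / (2.0 * eps)

def residuals_g2(R):
    F1s, F2s = F1sq(R), F2sq(R)
    F2 = math.sqrt(F2s)
    return [
        ddu(lambda r: F1sq(r) * math.sqrt(F2sq(r)), R) - (F2s - F1s),   # (5.21)
        ddu(lambda r: F1sq(r)**2, R) - 4.0 * F1s * F2,                  # (5.22), first equation
        2.0 * ddu(lambda r: F1sq(r) * F2sq(r), R) + 4.0 * F1s * F2,     # (5.22), second equation
    ]

worst_g2 = max(abs(x) for R in (0.3, 0.6, 0.9) for x in residuals_g2(R))

def conserved(R):
    """F1^4 + 2 F1^2 F2^2, which (5.22) implies is constant along the flow."""
    return F1sq(R)**2 + 2.0 * F1sq(R) * F2sq(R)
```

With this normalisation the conserved combination evaluates to $1/4$ for every $R$, which fixes the constant $\alpha^2$ of (5.23) for this particular choice of overall scale.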

5.2. Associative cycle. The interpolating pair. The AdS$_4$ solution of [24], describing the near-horizon limit of M-fivebranes wrapped on an associative three-cycle in a $G_2$ manifold, admits four Killing spinors, and is as follows. The metric is given by
\[
ds^2 = \frac{1}{\lambda}\Big[ ds^2(AdS_4) + \frac{4}{5} ds^2(H^3) + \frac{4}{25}(1 - \lambda^3 \rho^2)\, \mu^a \mu^a + \frac{\lambda^3}{4} \frac{d\rho^2}{1 - \lambda^3 \rho^2} \Big], \qquad \lambda^3 = \frac{8}{5 + 3\rho^2}. \tag{5.26}
\]
The flux is purely magnetic and is irrelevant to the discussion; it may be obtained from [24] or [41]. The $\mu^a$, $a = 1, 2, 3$, are given by
\[ \mu^a = \sigma^a - \frac{1}{2} \epsilon^{abc} \omega_{bc}, \tag{5.27} \]
where the $\sigma^a$ are left-invariant one-forms on an $S^3$, $d\sigma^a = \frac{1}{2} \epsilon^{abc} \sigma^b \wedge \sigma^c$, and the $\omega_{ab}$ are the spin-connection one-forms of $H^3$. The range of $\rho$ is $\rho \in [-1, 1]$, with the $S^3$ degenerating smoothly at $\rho = \pm 1$. The conjectured $G_2$ interpolation of this metric is
\[ ds^2 = ds^2(\mathbb{R}^{1,3}) + ds^2(N_\tau), \tag{5.28} \]
where up to an overall scale
\[ ds^2(N_\tau) = \frac{R^2}{3}\, ds^2(H^3) + \frac{R^2}{9}\Big(\frac{1}{R^3} - 1\Big) \mu^a \mu^a + \Big(\frac{1}{R^3} - 1\Big)^{-1} dR^2. \tag{5.29} \]

This metric is singular where the associative $H^3$ degenerates, at $R = 0$. At $R = 1$, the $S^3$ degenerates smoothly. This $N_\tau$ metric is a singular hyperbolic analogue of the BSGPP $G_2$ metric on an $\mathbb{R}^4$ bundle over $S^3$ of [45,46]. This $N_\tau$ metric was also found in [61], as the conjectured $G_2$ interpolation of the AdS$_2$ IIB solution of [28], for D3 branes wrapped on an associative three-cycle. If it is indeed the interpolation of both these AdS solutions, then there are two distinct conformal theories that have their origins in this geometry. The first is a superconformal quantum mechanics, arising on the unwrapped (time) direction of D3-branes on the $H^3$ of (5.29); the second is a three-dimensional superconformal theory, arising on the unwrapped worldvolume directions of M5-branes on the $H^3$. Now we discuss the derivation of $N_\tau$ from the M-theory AdS$_4$ solution.

The G-structure of the AdS solution. With $e^a$ a basis for $H^3$, the SU(3) structure of the AdS solution, defined by all its four Killing spinors, is [41]
\[
\begin{aligned}
J_6 &= \frac{4}{5\sqrt{5}\,\lambda} \sqrt{1 - \lambda^3 \rho^2}\; \mu^a \wedge e^a, \\
\mathrm{Im}\,\Omega_6 &= \frac{8}{25\sqrt{5}\,\lambda^{3/2}} (1 - \lambda^3 \rho^2)\, \frac{1}{2} \epsilon^{abc} e^a \wedge \mu^b \wedge \mu^c - \frac{8}{5\sqrt{5}\,\lambda^{3/2}}\, \mathrm{Vol}[H^3], \\
\mathrm{Re}\,\Omega_6 &= \frac{8}{25} \sqrt{\frac{1 - \lambda^3 \rho^2}{\lambda^3}} \Big[ \frac{1 - \lambda^3 \rho^2}{5}\, \frac{1}{3!} \epsilon^{abc} \mu^a \wedge \mu^b \wedge \mu^c - \frac{1}{2} \epsilon^{abc} \mu^a \wedge e^b \wedge e^c \Big].
\end{aligned}
\tag{5.30}
\]
This structure is a solution of the AdS torsion conditions of [60], interpreted in [41] as the conditions defining the near-horizon limit of fivebranes wrapped on an associative three-cycle, which are
\[
\begin{aligned}
d\Big[ \rho\, J_6 \wedge \hat\rho - \frac{1}{\lambda^{3/2}}\, \mathrm{Im}\,\Omega_6 \Big] &= 0, \\
d\Big[ \frac{1}{2\lambda\rho}\, J_6 \wedge J_6 + \frac{1}{\lambda^{5/2} \rho^2}\, \mathrm{Re}\,\Omega_6 \wedge \hat\rho \Big] &= 0.
\end{aligned}
\tag{5.31}
\]
Some useful identities in verifying this claim are
\[
\begin{aligned}
d(\mu^a \wedge e^a) &= \frac{1}{2} \epsilon^{abc} \mu^a \wedge \mu^b \wedge e^c + 3\, \mathrm{Vol}[H^3], \\
d\Big( \frac{1}{3!} \epsilon^{abc} \mu^a \wedge \mu^b \wedge \mu^c \Big) &= d\Big( \frac{1}{2} \epsilon^{abc} \mu^a \wedge e^b \wedge e^c \Big) = \frac{1}{2}\, e^a \wedge e^b \wedge \mu^a \wedge \mu^b.
\end{aligned}
\tag{5.32}
\]

The AdS solution in the Minkowski frame. Now we use Sect. 2 to frame-rotate to the canonical Minkowski frame. The one-form $\hat u$ is given by
\[ \hat u = L e^{-3r/4} du, \tag{5.33} \]
with the Minkowski-frame coordinate $u$ given by
\[ u = -\frac{4}{5} \sqrt{\frac{5 - 5\rho^2}{8}}\; e^{-5r/4}. \tag{5.34} \]

Then the associative AdS$_4$ solution in the Minkowski frame is
\[ ds^2 = L^{-1}\Big[ ds^2(\mathbb{R}^{1,2}) + \frac{4}{5} e^{2r} ds^2(H^3) \Big] + L^2 \Big[ F^{-3/4}\Big( du^2 + \frac{u^2}{4} \mu^a \mu^a \Big) + dt^2 \Big], \tag{5.35} \]
where
\[ F = e^{2r}, \qquad L = \lambda F, \tag{5.36} \]
and $e^r$ is a positive signature metric inducing root of the octic
\[ \frac{4}{25} \big( 1 - 4t^2 e^{4r} \big)^2 - u^4 e^{5r} = 0. \tag{5.37} \]
The wrapped-brane $G_2$ structure of the associative AdS$_4$ solution, defined by two of its Killing spinors, is given by
\[ \Phi = J_6 \wedge \hat u - \mathrm{Im}\,\Omega_6, \qquad \Upsilon = \frac{1}{2} J_6 \wedge J_6 + \mathrm{Re}\,\Omega_6 \wedge \hat u. \tag{5.38} \]
By construction, this structure is a solution of the wrapped brane equations for an associative three-cycle. From [39,41], these are
\[
\begin{aligned}
dt \wedge d(L^{-1} \Upsilon) &= 0, \\
d(L^{-5/2}\, \Phi \wedge \Upsilon) &= 0, \\
\Phi \wedge d\Phi &= 0, \\
d\big[ L^{3/2} \star_8 d(L^{-3/2} \Phi) \big] &= 0,
\end{aligned}
\tag{5.39}
\]

where in the last equation (the four-form Bianchi identity), $\star_8$ denotes the Hodge dual on the space transverse to the Minkowski factor.

The conjectured $G_2$ interpolation. We now conjecture the existence of a solution of (5.39) which smoothly interpolates between (5.35) and a manifold of $G_2$ holonomy. We make the following metric ansatz for this solution:
\[ ds^2 = L^{-1}\big[ ds^2(\mathbb{R}^{1,2}) + F_1^2\, ds^2(H^3) \big] + L^2 \big[ F_3^2\, du^2 + F_2^2\, \mu^a \mu^a + dt^2 \big], \tag{5.40} \]
with $L$, $F_{1,2,3}$ functions of $u$ and $t$. For special holonomy we set $L = 1$, and require that $F_{1,2,3}$ are arbitrary functions of $u$. In fact, the determination of the $G_2$ metric from this point on exactly follows that of [61], where a conjectured $G_2$ interpolation of the AdS$_2$ solution of [28] for a D3 brane wrapped on an associative three-cycle was studied. The ansatz for the $G_2$ manifold is exactly the same, and the reader is referred to [61] for the rest of the derivation, or invited to perform it as a useful exercise.

6. Spin(7) Interpolating Pairs

In this section, we will give conjectured interpolating pairs for fivebranes wrapped on Cayley four-cycles in Spin(7) manifolds. First we give the pairs, then the derivation of the Spin(7) interpolations.

The interpolating pairs. The GKW AdS$_3$ solutions [26] describing the near-horizon limit of fivebranes on Cayley four-cycles, with membranes in the overall transverse directions, admit two Killing spinors and have metrics given by
\[
ds^2 = \frac{1}{\lambda}\Big[ ds^2(AdS_3) + \frac{7}{4} ds^2(\Sigma_4) + (1 - \lambda^3 f^2) DY^a DY^a + \frac{\lambda^3}{4(1 - \lambda^3 f^2)} d\rho^2 \Big], \qquad \lambda^3 = \frac{49}{84 + 15\rho^2}, \qquad f = \frac{6\rho}{7}. \tag{6.1}
\]
The electric flux may be obtained from [26] or [43], and the magnetic flux will be given below. The wrapped cycle $\Sigma_4$ is an arbitrary conformally half-flat negative scalar curvature Einstein four-manifold, normalised such that the Ricci scalar is $R = -12$. We have flipped the definition of orientation on $\Sigma_4$ relative to [26]; the conformally half-flat condition reads $J^{A\,ij} C_{ijkl} = 0$, with $J^A$, $A = 1, 2, 3$, a basis of self-dual two-forms on $\Sigma_4$ and $C_{ijkl}$ the Weyl tensor on $\Sigma_4$. The $Y^a$, $a = 1, \dots, 4$, are constrained coordinates on an $S^3$, $Y^a Y^a = 1$, and
\[ DY^a = dY^a + \frac{1}{4}\, \omega_{cd}\, J^{A\,cd} J^A_{ab}\, Y^b, \tag{6.2} \]

where $\omega_{ab}$ are the spin connection one-forms of $\Sigma_4$. The range of $\rho$ is $\rho \in [-2, 2]$; at the end-points, the $S^3$ degenerates smoothly. The conjectured Spin(7) interpolation of this metric is
\[ ds^2 = ds^2(\mathbb{R}^{1,2}) + ds^2(N_\tau), \tag{6.3} \]
where up to an overall scale
\[ ds^2(N_\tau) = \frac{9}{20} R^2\, ds^2(\Sigma_4) + \frac{36}{100} R^2 \Big( \frac{1}{R^{10/3}} - 1 \Big) DY^a DY^a + \Big( \frac{1}{R^{10/3}} - 1 \Big)^{-1} dR^2. \tag{6.4} \]

These metrics are singular at $R = 0$, where the Cayley four-cycle $\Sigma_4$ degenerates. At $R = 1$, the $S^3$ degenerates smoothly. As discussed in the introduction these metrics are the analogues, for negatively curved conformally half-flat Einstein $\Sigma_4$, of the regular BSGPP Spin(7) metric on an $\mathbb{R}^4$ bundle over $S^4$ [45,46]. We now give the derivation of the $N_\tau$ metric from the AdS metric.

The G-structure of the AdS solution. The solution admits a $G_2$ structure, defined by both its Killing spinors, which satisfies the torsion conditions of [39]$^{10}$ together with the Bianchi identity for the magnetic flux (also given in [39]) which in this case is not implied by the torsion conditions. The torsion conditions of [39] were interpreted in [43] as the conditions defining the near-horizon limit of fivebranes wrapped on a Cayley four-cycle. These conditions are satisfied by all supersymmetric AdS$_3$ solutions of M-theory. If $e^a$ denote a basis for $\Sigma_4$, the $G_2$ structure of the AdS solutions is given by
\[
\begin{aligned}
\Phi = {} & -\frac{7}{4} \sqrt{\frac{1 - \lambda^3 f^2}{\lambda^3}} \Big( Y^a e^a \wedge e^b \wedge DY^b + \frac{1}{2} \epsilon^{abcd}\, Y^a DY^b \wedge e^c \wedge e^d \Big) \\
& + \Bigg( \sqrt{\frac{1 - \lambda^3 f^2}{\lambda}} \Bigg)^{3} \frac{1}{3!} \epsilon^{abcd}\, Y^a DY^b \wedge DY^c \wedge DY^d,
\end{aligned}
\tag{6.5}
\]
\[
\Upsilon = -\frac{7}{4\lambda^2} (1 - \lambda^3 f^2) \Big( \frac{1}{2}\, e^a \wedge e^b \wedge DY^a \wedge DY^b + \frac{1}{4} \epsilon^{abcd}\, DY^a \wedge DY^b \wedge e^c \wedge e^d \Big) + \frac{49}{16\lambda^2}\, \mathrm{Vol}[\Sigma_4]. \tag{6.6}
\]
The torsion conditions of [39] are
\[ \hat\rho \wedge d(\lambda^{-1} \Upsilon) = 0, \tag{6.7} \]
\[ d\big[ \lambda^{-5/2} \sqrt{1 - \lambda^3 f^2}\; \mathrm{Vol}[M_7] \big] = -4 \lambda^{-1/2} f\, \hat\rho \wedge \mathrm{Vol}[M_7], \tag{6.8} \]
\[ \Phi \wedge d\Phi = \frac{4\lambda^{1/2}}{\sqrt{1 - \lambda^3 f^2}}\, (3 - \lambda^3 f^2)\, \mathrm{Vol}[M_7] - 2\lambda^{3/2} f \star_8 d\log\Big( \frac{\lambda^3 f}{1 - \lambda^3 f^2} \Big), \tag{6.9} \]

$^{10}$ The conditions of [39] contain a minor error which is corrected in [43].

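where
\[ \mathrm{Vol}[M_7] = \frac{1}{7}\, \Phi \wedge \Upsilon. \tag{6.10} \]
The four-form Bianchi identity is $dF_{\rm mag} = 0$, with
\[
F_{\rm mag} = \frac{\lambda^{3/2}}{\sqrt{1 - \lambda^3 f^2}}\, \big( \lambda^{3/2} f + \star_8 \big) \Big( d\big[ \lambda^{-3/2} \sqrt{1 - \lambda^3 f^2}\; \Phi \big] - 4\lambda^{-1} \Upsilon \Big) + 2\lambda^{1/2}\, \Phi \wedge \hat\rho, \tag{6.11}
\]
where $\star_8$ denotes the Hodge dual on the space transverse to the AdS$_3$ factor, with positive orientation defined with respect to
\[ \mathrm{Vol} = \mathrm{Vol}[M_7] \wedge \hat\rho. \tag{6.12} \]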

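It may be verified that the structure (6.5) is indeed a solution of the torsion conditions and Bianchi identity, by using the following identities, valid for any conformally half-flat Einstein $\Sigma_4$ with scalar curvature $R$:
\[
\begin{aligned}
d\Big( Y^a e^a \wedge e^b \wedge DY^b + \frac{1}{2} \epsilon^{abcd}\, Y^a DY^b \wedge e^c \wedge e^d \Big) &= -\frac{R}{4}\, \mathrm{Vol}[\Sigma_4] + e^a \wedge e^b \wedge DY^a \wedge DY^b + \frac{1}{2} \epsilon^{abcd}\, DY^a \wedge DY^b \wedge e^c \wedge e^d, \\
d\Big( \frac{1}{3!} \epsilon^{abcd}\, Y^a DY^b \wedge DY^c \wedge DY^d \Big) &= -\frac{R}{48} \Big( e^a \wedge e^b \wedge DY^a \wedge DY^b + \frac{1}{2} \epsilon^{abcd}\, DY^a \wedge DY^b \wedge e^c \wedge e^d \Big).
\end{aligned}
\tag{6.13}
\]

The AdS solutions in the Minkowski frame. Defining the coordinates
\[ t = -\frac{1}{2} e^{-12r/7} \rho, \qquad u = -\sqrt{\frac{12 - 3\rho^2}{7}}\; e^{-r}, \tag{6.14} \]
the one-forms $e^8$, $e^9$ in the Minkowski frame are given by
\[ e^8 = \lambda e^{r} du, \qquad e^9 = \lambda e^{12r/7} dt, \tag{6.15} \]
and the metric in the Minkowski frame takes the form
\[ ds^2 = \frac{1}{H_{M5}^{1/3} H_{M2}^{2/3}}\, ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\, dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\, \frac{7}{4} F\, ds^2(\Sigma_4) + H_{M2}^{1/3} H_{M5}^{2/3}\, \frac{1}{F} \big( du^2 + u^2 DY^a DY^a \big), \tag{6.16} \]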

where
\[ H_{M5} = \lambda^3 e^{38r/7}, \qquad H_{M2} = e^{2r/7}, \qquad F = e^{12r/7}. \tag{6.17} \]
The function $e^{2r}$ is given in terms of $t$ and $u$ by a positive signature metric inducing root of the twelfth order polynomial
\[ t^{14} e^{24r} - \Big( 1 - \frac{7}{12} u^2 e^{2r} \Big)^7 = 0. \tag{6.18} \]
The wrapped brane Spin(7) structure of the AdS$_3$ solutions, defined by one of their Killing spinors, is given by
\[ \phi = -\Phi \wedge e^8 - \Upsilon. \tag{6.19} \]
By construction, this structure is a solution of the wrapped brane equations for a Cayley four-cycle in a Spin(7) manifold, which comprise the torsion conditions [43,63]
\[ e^9 \wedge \Big[ -L^3\, e^9 \lrcorner\, d(L^{-3} e^9) + \frac{1}{2}\, \phi \lrcorner\, d\phi \Big] = 0, \tag{6.20} \]
\[ (e^9 \wedge {} + \star_9) \big[ e^9 \lrcorner\, d(L^{-1} \phi) \big] = 0, \tag{6.21} \]

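together with the Bianchi identity and field equation for the four-form, which is given in [43,63].

The conjectured Spin(7) interpolation. We make the following ansatz for an interpolating solution:
\[ ds^2 = \frac{1}{H_{M5}^{1/3} H_{M2}^{2/3}}\, ds^2(\mathbb{R}^{1,1}) + \frac{H_{M5}^{2/3}}{H_{M2}^{2/3}}\, dt^2 + \frac{H_{M2}^{1/3}}{H_{M5}^{1/3}}\, F_1^2\, ds^2(\Sigma_4) + H_{M2}^{1/3} H_{M5}^{2/3} \big( F_3^2\, du^2 + F_2^2\, DY^a DY^a \big), \tag{6.22} \]
with $H_{M5,M2}$, $F_{1,2,3}$ arbitrary functions of $u$, $t$. To determine the Spin(7) interpolation with this ansatz, we set $H_{M5,M2} = 1$ and require that $F_{1,2,3}$ are functions only of $u$. Then $F_3$ is at our disposal, and we set it to 1. Requiring Spin(7) holonomy, we set
\[ d\phi = 0, \tag{6.23} \]
with
\[
\begin{aligned}
\phi &= -\Phi \wedge du - \Upsilon, \\
\Phi &= -F_1^2 F_2 \Big( Y^a e^a \wedge e^b \wedge DY^b + \frac{1}{2} \epsilon^{abcd}\, Y^a DY^b \wedge e^c \wedge e^d \Big) + F_2^3\, \frac{1}{3!} \epsilon^{abcd}\, Y^a DY^b \wedge DY^c \wedge DY^d, \\
\Upsilon &= -F_1^2 F_2^2 \Big( \frac{1}{2}\, e^a \wedge e^b \wedge DY^a \wedge DY^b + \frac{1}{4} \epsilon^{abcd}\, DY^a \wedge DY^b \wedge e^c \wedge e^d \Big) + F_1^4\, \mathrm{Vol}[\Sigma_4].
\end{aligned}
\tag{6.24}
\]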

Using (6.13) with $R = -12$, the Spin(7) condition reduces to
\[ \partial_u (F_1^4) = 3 F_1^2 F_2, \qquad \partial_u (F_1^2 F_2^2) = \frac{1}{2} F_2^3 - 2 F_1^2 F_2. \tag{6.25} \]
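As in the $G_2$ case, the metric (6.4) can be tested against the reduced Spin(7) condition. The sketch below is illustrative only: it reads $F_1^2 = \tfrac{9}{20} R^2$, $F_2^2 = \tfrac{36}{100} R^2 (R^{-10/3} - 1)$ and $du = (R^{-10/3} - 1)^{-1/2} dR$ off (6.4) with unit overall scale, and checks the pair of equations $\partial_u (F_1^4) = 3 F_1^2 F_2$ and $\partial_u (F_1^2 F_2^2) = \tfrac{1}{2} F_2^3 - 2 F_1^2 F_2$. The second equation is written in the form obtained by inserting (6.24) into $d\phi = 0$ and using (6.13) with $R = -12$; that normalisation is an assumption of this check, as are the helper names.

```python
import math

def h7(R):
    """The combination R**(-10/3) - 1 appearing in (6.4)."""
    return R ** (-10.0 / 3.0) - 1.0

def G1sq(R): return 0.45 * R**2            # F_1^2 = (9/20) R^2
def G2sq(R): return 0.36 * R**2 * h7(R)    # F_2^2 = (36/100) R^2 (R^(-10/3) - 1)

def ddu7(g, R, eps=1e-6):
    """d/du of g(R), with du = h7(R)**-0.5 dR read off from the dR^2 term of (6.4)."""
    return math.sqrt(h7(R)) * (g(R + eps) - g(R - eps)) / (2.0 * eps)

def residuals_spin7(R):
    F1s, F2s = G1sq(R), G2sq(R)
    F2 = math.sqrt(F2s)
    return [
        ddu7(lambda r: G1sq(r)**2, R) - 3.0 * F1s * F2,
        ddu7(lambda r: G1sq(r) * G2sq(r), R) - (0.5 * F2**3 - 2.0 * F1s * F2),
    ]

worst_spin7 = max(abs(x) for R in (0.3, 0.5, 0.8) for x in residuals_spin7(R))
```

Both residuals should vanish up to finite-difference error. The numerical coefficients 9/20 and 36/100 are tied to the exponent 10/3: changing any one of them spoils the second residual.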

Defining a new coordinate $x$ such that
\[ \partial_u = \frac{3 F_2}{4 F_1}\, \partial_x, \tag{6.26} \]
we get
\[ F_1 = x + \alpha, \tag{6.27} \]
for a constant $\alpha$ which may be eliminated by a shift in $x$. Then
\[ F_2^2 = \frac{1}{x^{4/3}} \Big( \beta - \frac{4}{5} x^{10/3} \Big), \tag{6.28} \]
for a constant $\beta$ which may be set to unity up to an overall scale in the metric. Defining a new coordinate $x^{10/3} = 5R^{10/3}/4$, up to an overall scale we obtain the $N_\tau$ metric given above.

7. Conclusions and Outlook

In this paper, the notion of an interpolation between Anti-de Sitter and special holonomy manifolds has been defined. The importance of this concept in the geometry of the supersymmetric AdS/CFT correspondence has been stressed. Two conjectures have been made: that all supersymmetric AdS solutions of M-/string theory admit a special holonomy interpolation, and that, with the exception of flat space, all metrics on special holonomy manifolds admitting an AdS interpolation are incomplete. For a representative sample of known supersymmetric AdS solutions of M-theory, a series of candidate incomplete special holonomy interpolations has been derived. The series of interpolations is closely related to a set of celebrated complete special holonomy metrics.

Several interesting directions for future research are suggested by the results of this paper. The geometrical question of most importance is undoubtedly the construction of an interpolating solution describing a wrapped brane, for one of the proposed interpolating pairs of this paper. Since the whole series of pairs share many common features, understanding how to do this for one of them would almost certainly facilitate the construction of an interpolating solution for all. A reasonable guess for what the boundary conditions of an interpolating solution for these pairs should be is the following. It should match on to an $N_\tau$ metric at its regular degeneration point. It should also match on the AdS solution at a degeneration point of its transverse space. There is an unfixed volume modulus in all of the $N_\tau$ metrics; this will be fixed, in an interpolating solution, by the global topological requirement of matching onto an AdS solution.
For the AdS solutions without R-symmetry isometries, the degeneration points of the transverse space are symmetric; an interpolating solution should match on to one of them. For the AdS solutions with R-symmetry isometries, the degeneration points of the transverse space are asymmetric; in this case, it seems plausible that an interpolating solution should match on to the AdS solution at its R-symmetry degeneration point. Understanding how this comes about, and solving the wrapped brane equations for an interpolating solution, is not just a problem in Riemannian geometry. It seems very likely that the Lorentzian character of an interpolating solution will enter in an essential way, with the causal structure of the interpolating solution playing a key part. This is because (at least by analogy with conical interpolations) an interpolating solution should match on to the special holonomy manifold at a spacelike infinity, and the AdS manifold at an event horizon. Of the two coordinates which play a rôle in the frame rotation underlying the relationship between the interpolating pairs of this paper, one has a finite range while the range of the other is infinite. Though they cannot really be separated, in a rough sense the non-compact direction should determine the Lorentzian, causal structure, and the compact direction the Riemannian. A very delicate interplay between the two is required, to fulfill the appropriate Lorentzian and Riemannian boundary conditions for an interpolating solution. Understanding the geometry of the frame rotation in more depth may reveal how to linearise the wrapped brane equations, and so superimpose the interpolating pair, just as for conical interpolations. Another intriguing point about the frame rotation is that the relationship between the AdS and Minkowski frame coordinates is in every case given by the root of a polynomial. This strongly suggests some deeper underlying algebraic geometry which has not been appreciated.

Other interesting geometrical questions raised by this work include the following. For branes wrapped on Kähler cycles, there exist rich classes of AdS solutions that have not been studied here. These include AdS$_5$ solutions from M-fivebranes on two-cycles in three-folds [13], AdS$_3$ solutions from M-fivebranes on four-cycles in four-folds [30,32] and AdS$_3$ solutions from D3-branes on two-cycles in four-folds [31,32].
It would be interesting to apply the methods of this paper to these other solutions, and so determine candidate interpolations. For the Ad S-from-D3-brane solutions of [31,32] it should be particularly feasible to construct the interpolating solutions, since in this case the fourfold geometry is essentially conical [32,61,64]. Also Ad S2 M-theory solutions have not been discussed in this paper at all; a rich class has recently been discovered in [32], and some older ones are to be found in [26]. Using the classification results of [42,65], it would be interesting to determine their candidate interpolations. It should also be possible to use the notion of an interpolating pair to construct new Ad S solutions. For all cases other than Kähler cycles, to the knowledge of the author, only a single Ad S solution is known to exist - the one studied in this paper. On the other hand, numerous complete cohomogeneity-one special holonomy metrics are known; for example, for G 2 and Spin(7), several complete metrics, whose construction was inspired by the BSGPP metrics, were given in [54,55]. Hyperbolic analogues of these metrics should also exist, and if so, it will almost certainly be possible to map them to new Ad S solutions. A more long-term project concerns the construction of the conformal quantum duals of the interpolating pairs. In M-theory, this problem is hampered by the notoriously intractable question of the effective field theory on the worldvolume of a stack of fivebranes (for membranes, some interesting progress on the world-volume theory, highlighting its non-associativity, has recently been made in [66]). In IIB, this is less of a problem, and it should be possible to make progress constructing the duals of wrapped D3-brane geometries, even with existing techniques. 
In the geometry of wrapped brane physics, we have for so long been restricted to the near-horizon limit, the AdS geometry, that it has become commonplace to think that only this geometry is of relevance to investigations of the CFT. Indeed, recently it has been shown that it is in fact possible in principle to reconstruct the CFT from the near-horizon geometry alone$^{11}$ using holographic renormalisation techniques [67,68]. However, doing this for AdS geometries of the complexity of those studied in this paper is likely to be very difficult indeed, if not impossible, in practice. And focussing on the AdS geometry alone ignores the central message of this paper: that the geometry of AdS/CFT involves, in an essential way, both an Anti-de Sitter and a special holonomy manifold. It is also possible, as a matter of principle, to construct the CFT dual from the geometry of the special holonomy manifold alone. It is worth recalling that this is how the quiver gauge theory duals of the $Y^{p,q}$ manifolds were in fact constructed [16,17]; as, indeed, was $\mathcal{N} = 4$ super Yang-Mills itself in this context [1]. Knowing both members of an interpolating pair means that CFT construction techniques can be brought to bear on both geometries; knowing both significantly enriches our understanding of the correspondence.

Acknowledgements. I would like to express my gratitude to Jerome Gauntlett for ongoing discussions, collaboration and debate; and also to Dina Daskalopoulou, for general inspiration. This work was supported by EPSRC.

References 1. Maldacena, J.M.: The Large N Limit of Superconformal Field Theories and Supergravity. Adv. Theor. Math. Phys. 2, 231–252 (1998); Int. J. Theor. Phys. 38, 1113–1133 (1999) 2. Beisert, N.: The S-Matrix of AdS/CFT and Yangian symmetry. PoS (Solvay) 002 (2007) 3. Zarembo, K.: Semiclassical Bethe ansatz and AdS/CFT. Comptes Rendus Physique 5, 1081, (2004); Fortsch. Phys. 53, 647 (2005) 4. Maldacena, J., Nuñez, C.: Towards the large N limit of pure N = 1 super Yang Mills. Phys. Rev. Lett. 86, 588–591 (2001) 5. Gursoy, U., Kiritsis, E.: Exploring improved holographic theories for QCD: Part 1. JHEP 02, 032 (2008) 6. Gursoy, U., Kiritsis, E., Nitti, F.: Exploring improved holographic theories for QCD: part II. JHEP 02, 019 (2008) 7. Hartnoll, S.A., Herzog, C.P.: Ohm’s law at strong coupling: S duality and the cyclotron resonance. http:// arXiv.org/list/0706.3228, 2007 8. Hartnoll, S.A., Kovtun, P.K., Mueller, M., Sachdev, S.: Theory of the Nernst effect near quantum phase transitions in condensed matter, and in dyonic black holes. Phys. Rev. B 76, 144502 9. Hartnoll, S.A., Kovtun, P.: Hall conductivity from dyonic black holes. http://arXiv.org/list/0704.1160, 2007 10. Mateos, D., Myers, R.C., Thomson, R.M.: Holographic viscosity of fundamental matter. Phys. Rev. Lett. 98, 101601 (2007) 11. Klebanov, I.R., Strassler, M.J.: Supergravity and a Confining Gauge Theory: Duality Cascades and χ SB-Resolution of Naked Singularities. JHEP 0008 (2000) 052 12. Klebanov, I.R., Witten, E.: Superconformal Field Theory on Threebranes at a Calabi-Yau Singularity. Nucl. Phys. B 536, 199–218 (1998) 13. Gauntlett, J.P., Martelli, D., Sparks, J., Waldram, D.: Supersymmetric AdS5 solutions of M-theory. Class. Quant. Grav. 21, 4335–4366 (2004) 14. Gauntlett, J.P., Martelli, D., Sparks, J., Waldram, D.: Sasaki-Einstein Metrics on S 2 x S 3 . Adv. Theor. Math. Phys. 8, 711–734 (2004) 15. 
Feng, B., Hanany, A., He, Y.-H.: D-Brane Gauge Theories from Toric Singularities and Toric Duality. Nucl. Phys. B 595, 165–200 (2001) 16. Martelli, D., Sparks, J.: Toric Geometry, Sasaki-Einstein Manifolds and a New Infinite Class of AdS/CFT Duals. Commun. Math. Phys. 262, 51–89 (2006) 17. Benvenuti, S., Franco, S., Hanany, A., Martelli, D., Sparks, J.: An Infinite Family of Superconformal Quiver Gauge Theories with Sasaki-Einstein Duals. JHEP 0506, 064 (2005) 18. Intrilligator, K., Wecht, B.: The Exact Superconformal R-Symmetry Maximizes a. Nucl. Phys. B 667, 183–200 (2003) 19. Martelli, D., Sparks, J., Yau, S.-T.: The Geometric Dual of a-maximisation for Toric Sasaki-Einstein Manifolds. Commun. Math. Phys. 268, 39–65 (2006) 11 I thank Marika Taylor for pointing this out to me.


20. Duff, M.J., Gibbons, G.W., Townsend, P.K.: Macroscopic superstrings as interpolating solitons. Phys. Lett. B 332, 321–328 (1994) 21. Acharya, B.S., Figueroa-O’Farrill, J.M., Hull, C.M., Spence, B.: Branes at conical singularities and holography. Adv. Theor. Math. Phys. 2, 1249–1286 (1999) 22. Morrison, D.R., Plesser, M.R.: Non-spherical horizons, I. Adv. Theor. Math. Phys. 2, 1249–1286 (1999) 23. Maldacena, J., Nuñez, C.: Supergravity description of field theories on curved manifolds and a no go theorem. Int. J. Mod. Phys. A 16, 822–855 (2001) 24. Acharya, B.S., Gauntlett, J.P., Kim, N.: Fivebranes wrapped on associative three-cycles. Phys. Rev. D 63, 106003 (2001) 25. Pernici, M., Sezgin, E.: Spontaneous compactification of seven-dimensional supergravity theories. Class. Quant. Grav. 2, 673 (1985) 26. Gauntlett, J.P., Kim, N., Waldram, D.: M-fivebranes wrapped on supersymmetric cycles. Phys. Rev. D 63, 126001 (2001) 27. Gauntlett, J.P., Kim, N.: M-fivebranes wrapped on supersymmetric cycles II. Phys. Rev. D 65, 086003 (2002) 28. Nieder, H., Oz, Y.: Supergravity and D-branes wrapping supersymmetric cycles. JHEP 0103, 008 (2001) 29. Naka, M.: Various wrapped branes from gauged supergravities. http://arXiv.org/list/hep-th/0206141, 2002 30. Gauntlett, J.P., Mac Conamhna, O.A.P., Mateos, T., Waldram, D.: New supersymmetric AdS3 solutions. Phys. Rev. D 74, 106007 (2006) 31. Gauntlett, J.P., Mac Conamhna, O.A.P., Mateos, T., Waldram, D.: Supersymmetric AdS3 solutions of type IIB supergravity. Phys. Rev. Lett 97, 171601 (2006) 32. Gauntlett, J.P., Kim, N., Waldram, D.: Supersymmetric AdS(3), AdS(2) and bubble solutions. JHEP 0704, 005 (2007) 33. Mac Conamhna, O.A.P.: Inverting geometric transitions: explicit Calabi-Yau metrics for the MaldacenaNuñez solutions. http://arXiv.org/list/0706.1795, 2007 34. Fayyazuddin, A., Smith, D.J.: Localized intersections of M5-branes and four-dimensional superconformal field theories. JHEP 9904, 030 (1999) 35. 
Cho, H., Emam, M., Kastor, D., Traschen, J.: Calibrations and Fayyazuddin-Smith Spacetimes. Phys. Rev. D 63, 064003 (2001) 36. Husain, T.Z.: That’s a wrap! JHEP 0304, 053 (2003) 37. Brinne, B., Fayyazuddin, A., Husain, T.Z., Smith, D.J.: N = 1 M5-brane geometries. JHEP 0103, 052 (2001) 38. Fayyazuddin, A., Husain, T.Z., Pappa, I.: The geometry of wrapped M5-branes in Calabi-Yau 2-folds. Phys. Rev. D73, 126004 (2006) 39. Martelli, D., Sparks, J.: G-structures, fluxes and calibrations in M-theory. Phys. Rev. D 68, 085014 (2003) 40. Husain, T.Z.: M2-branes wrapped on holomorphic curves. JHEP 0312, 037 (2003) 41. Gauntlett, J.P., Mac Conamhna, O.A.P., Mateos, T., Waldram, D.: AdS spacetimes from wrapped M5 branes. JHEP 0611, 053 (2006) 42. Mac Conamhna, O.A.P., Ó Colgáin, E.: Supersymmetric wrapped membranes, AdS(2) spaces, and bubbling geometries. JHEP 0703, 115 (2007) 43. Figueras, P., Mac Conamhna, O.A.P., Ó Colgáin, E.: Global geometry of the supersymmetric Ad S3 /C F T2 correspondence in M-theory. Phys. Rev. D 76, 046007 (2007) 44. Eguchi, T., Hanson, A.J.: Asymptotically flat self-dual solutions to Euclidean gravity. Phys. Lett. B 74, 249 (1978) 45. Bryant, R.L., Salamon, S.: On the construction of some complete metrics with exceptional holonomy. Duke Math. J. 58, 829 (1989) 46. Gibbons, G.W., Page, D.N., Pope, C.N.: Einstein metrics on S 3 , R 3 and R 4 bundles. Commun. Math. Phys. 127, 529 (1990) 47. Candelas, P., de la Ossa, X.C.: Comments on Conifolds. Nucl. Phys. B 342, 246 (1990) 48. Pando Zayas, L.A., Tseytlin, A.A.: 3-branes on resolved conifold. JHEP 0011, 028 (2000) 49. Papadopoulos, G., Tseytlin, A.A.: Complex geometry of conifolds and 5-brane wrapped on 2-sphere. Class. Quant. Grav. 18, 1333 (2001) 50. Stenzel, M.B.: Ricci-flat metrics on the complexification of a compact rank one symmetric space. Manus. Math. 80, 151 (1993) 51. Cvetic, M., Gibbons, G.W., Lü, H., Pope, C.N.: Ricci-flat metrics, harmonic forms and brane resolutions. Commun. Math. Phys. 
232, 457 (2003) 52. Cvetic, M., Gibbons G.W., Lü, H., Pope, C.N.: Special holonomy spaces and M-theory. http://arXiv.org./ list/hep-th/0206154, 2002, to apper in 2001 Les Houchesproceedings 53. Ohta, K., Yokono, T.: Deformation of conifold and intersecting branes. JHEP 0002, 023 (2000) 54. Cvetic, M., Gibbons, G.W., Lü, H., Pope, C.N.: Cohomogeneity one manifolds of Spin(7) and G 2 holonomy. Phys. Rev. D 65, 106004 (2002)

Spacetime Singularity Resolution by M-Theory Fivebranes

55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66. 67. 68.

389

Gukov, S., Sparks, J.: M-theory on Spin(7) manifolds. Nucl. Phys. B 625, 3 (2002) Calabi, E.: Métriques Kählériennes et fibrés holomorphe. Ann. Scient. École Norm. Sup. 12, 269 (1979) Dancer, A., Swann, A.: Hyperkähler metrics of cohomogeneity one. J. Geom. Phys. 21, 218 (1997) Cvetic, M., Gibbons, G.W., Lü, H., Pope, C.N.: Hyper-Kähler Calabi Metrics, L 2 harmonic forms, resolved M2-branes, and Ad S4 /C F T3 correspondence. Nucl. Phys. B 617, 151 (2001) Mac Conamhna, O.A.P.: The geometry of extended null supersymmetry in M-theory. Phys. Rev. D 73, 045012 (2006) Lukas, A., Saffin, P.: M-theory compactification, fluxes and Ad S4 . Phys. Rev. D 71, 046005 (2005) Gauntlett, J.P., Mac Conamhna, O.A.P.: AdS spacetimes from wrapped D3 branes. http://arXiv.org/list/ hep-th/0707.3105, 2007 Mac Conamhna, O.A.P.: Eight-manifolds with G-structure in eleven dimensional supergravity. Phys.Rev. D 72, 086007 (2005) Gauntlett, J.P., Gutowski, J.B., Pakis, S.: The Geometry of D = 11 Null Killing Spinors. JHEP 0312, 049 (2003) Kim, N.: AdS(3) solutions of IIB supergravity from D3-branes. JHEP 0601, 094 (2006) Kim, N., Park, J.D.: Comments on AdS(2) solutions of D = 11 supergravity. JHEP 0609, 041 (2006) Bagger, J., Lambert, N.: Modelling multiple M2’s. Phys. Rev. D 75, 045020 (2007) Skenderis, K.: Lecture notes on holographic renormalisation. Class. Quant. Grav. 19, 5849 (2002) Skenderis, K., Taylor, M.: Kaluza-Klein holography. JHEP 0605, 057 (2006)

Communicated by G.W. Gibbons

Commun. Math. Phys. 284, 391–406 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0568-4

Communications in

Mathematical Physics

Invariant Measures Satisfying an Equality Relating Entropy, Folding Entropy and Negative Lyapunov Exponents∗

Pei-Dong Liu

School of Mathematical Sciences, Peking University, Beijing 100871, P. R. China. E-mail: [email protected]

Received: 10 October 2007 / Accepted: 30 March 2008 Published online: 18 July 2008 – © Springer-Verlag 2008

Abstract: In this paper we prove that, for a $C^2$ (non-invertible but non-degenerate) map on a compact manifold, an invariant measure satisfies an equality relating entropy, folding entropy and negative Lyapunov exponents if and, under a condition on the Jacobian of the map, only if the measure has absolutely continuous conditional measures on the stable manifolds.

∗ This work is supported by National Basic Research Program of China (973 Program) (2007CB814800).

1. Introduction

Let $M$ be a connected compact Riemannian manifold without boundary, $f : M \to M$ a $C^2$ non-invertible map and $\mu$ an $f$-invariant measure. The entropy production $e_\mu(f)$ of the dynamical system $(f, \mu)$ is defined by Ruelle [10] as

$$e_\mu(f) := F_\mu(f) - \int \log |\det T_x f| \, d\mu,$$

where $F_\mu(f) := H_\mu(\epsilon \mid f^{-1}\epsilon)$, with $\epsilon$ being the partition of $M$ into single points, is called the folding entropy of $(f, \mu)$. Let $h_\mu(f)$ be the (measure-theoretic) entropy of $(f, \mu)$ and, for $\mu$-a.e. $x$, let $-\infty \le \lambda_1(x) < \lambda_2(x) < \cdots < \lambda_{r(x)}(x) < +\infty$ be the Lyapunov exponents of $f$ at $x$ with $m_i(x)$ denoting the multiplicity of $\lambda_i(x)$. Under a set of conditions on degenerate points of the map, Liu [3] proved the following inequality conjectured by Ruelle [10]:

$$h_\mu(f) \le F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu \tag{1.1}$$

(where $a^- := \min\{a, 0\}$). When $\mu$ satisfies the Pesin entropy formula

$$h_\mu(f) = \int \sum_i \lambda_i(x)^+ m_i(x) \, d\mu \tag{1.2}$$

with $a^+ := \max\{a, 0\}$ (such a measure $\mu$ is sometimes called an SRB measure), one obtains the non-negativity of the entropy production since

$$e_\mu(f) = F_\mu(f) - \int \log |\det T_x f| \, d\mu = \Big[ F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu - h_\mu(f) \Big] + \Big[ h_\mu(f) - \int \sum_i \lambda_i(x)^+ m_i(x) \, d\mu \Big] \ge 0.$$

In this article, we further investigate the question of when $e_\mu(f) = 0$ or $e_\mu(f) > 0$. We show that, when $f$ has no degenerate points, the formula

$$h_\mu(f) = F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu \tag{1.3}$$

holds if and, under a somewhat restrictive condition on the Jacobian of $(f, \mu)$, only if $\mu$ has absolutely continuous conditional measures on the stable manifolds of $(f, \mu)$.

This paper is organized in the following way. Section 2 is devoted to the definitions and statement of the results. The rest of the sections are devoted to the proofs.

2. Definitions and Statement of the Results

Let $f : M \to M$ be a $C^2$ non-invertible map such that $T_x f$ is non-degenerate at every $x \in M$ (i.e. $\det T_x f \ne 0$ at every $x \in M$), and let $\mu$ be an invariant measure of $f$. Choose a Borel set $\Gamma$ such that $\mu(\Gamma) = 1$, $f\Gamma \subset \Gamma$ and every point $x \in \Gamma$ is regular in the sense of Oseledec, that is, there exists a sequence of subspaces of $T_x M$, $\{0\} = V_0(x) \subset V_1(x) \subset \cdots \subset V_{r(x)}(x) = T_x M$, such that

$$\lim_{n \to +\infty} \frac{1}{n} \log |T_x f^n v| = \lambda_i(x)$$

for all $v \in V_i(x) \setminus V_{i-1}(x)$, $1 \le i \le r(x)$. Set $I = \{x \in \Gamma : \lambda_i(x) \ge 0 \text{ for all } 1 \le i \le r(x)\}$ and $\Delta = \Gamma \setminus I$. For $x \in I$, define $W^s(x) = \{x\}$. For $x \in \Delta$, define

$$W^s(x) = \Big\{ y \in M : \limsup_{n \to +\infty} \frac{1}{n} \log d(f^n y, f^n x) < 0 \Big\}$$

($\log 0 := -\infty$) and call it the stable manifold of $f$ at $x$. The arguments in Liu and Qian [5, Sects. III.1-III.3] restricted to a deterministic map show that, for $\mu$-a.e. $x \in \Delta$, there exists a sequence of $C^{1,1}$ embedded $k$-dimensional discs $\{W(f^n x)\}_{n=0}^{+\infty}$ ($k = \dim E^s(x)$ and $E^s(x) = \bigcup_{\lambda_i(x) < 0} V_i(x)$) such that $f W(f^n x) \subset W(f^{n+1} x)$ for all $n \ge 0$ and

$$W^s(x) = \bigcup_{n=0}^{+\infty} f^{-n} W(f^n x).$$

Let $V^s(x)$ denote the arc connected component of $W^s(x)$ which contains $x$. It is a $C^{1,1}$ immersed submanifold of $M$. Let $\mathcal{B}_\mu(M)$ denote the completion of the Borel $\sigma$-algebra of $M$ with respect to $\mu$ so that $(M, \mathcal{B}_\mu(M), \mu)$ constitutes a Lebesgue space.

Definition 2.1. A measurable partition $\xi$ of $(M, \mathcal{B}_\mu(M), \mu)$ is said to be subordinate to the $W^s$-manifolds of $(f, \mu)$ if for $\mu$-a.e. $x$ one has $\xi(x) \subset W^s(x)$ ($\xi(x)$ denotes the element of $\xi$ which contains $x$) and $\xi(x)$ contains an open neighborhood of $x$ in $V^s(x)$ (with respect to the submanifold topology of $V^s(x)$).

Definition 2.2. $\mu$ is said to have absolutely continuous conditional measures (abbreviated as accm) on the stable manifolds if for every measurable partition $\xi$ subordinate to the $W^s$-manifolds of $(f, \mu)$ one has for $\mu$-a.e. $x$,

$$\mu_x^\xi \ll \lambda_x^s,$$

where $\mu_x^\xi$ is the conditional measure of $\mu$ on $\xi(x)$ and $\lambda_x^s$ denotes the Lebesgue measure on $W^s(x)$ induced by its inherited Riemannian structure as a submanifold of $M$ ($\lambda_x^s := \delta_x$ if $W^s(x) = \{x\}$).

Theorem 2.3 (Sufficiency). Let $(f, \mu)$ be as given at the beginning of Sect. 2. If $\mu$ has accm on the stable manifolds, then the equality (1.3) holds true.

Remark 2.4. If $f$ has no negative Lyapunov exponents at $\mu$-a.e. $x$, one has $h_\mu(f) = F_\mu(f)$. This follows directly from the inequality (1.1) and the fact that $h_\mu(f) \ge H_\mu(\epsilon \mid f^{-1}\epsilon) = F_\mu(f)$.

In order to prove that $\mu$ having accm on $W^s$-manifolds is necessary for the equality (1.3), we make further assumptions. Recall now the notion of Jacobian of measure-preserving transformations (Parry [7]). Let $T : (X, \mathcal{A}, \nu) \to (Y, \mathcal{B}, \rho)$ be a measure-preserving transformation between two probability spaces. Assume that there is a countable partition of $X$ ($\nu$-mod 0) into measurable sets $\alpha = \{A_i\}$ such that for each $A_i$ the map $T_i := T|_{A_i} : A_i \to Y$ is absolutely continuous (with respect to $\nu$ and $\rho$), i.e.,

(i) $T_i$ is injective;
(ii) $T_i A$ is measurable if $A$ is a measurable subset of $A_i$;
(iii) $\rho(T_i A) = 0$ if $A \subset A_i$ is measurable and $\nu(A) = 0$.

(i) and (ii) allow us to define a measure $\nu_{T_i}$ on each $A_i$ by $\nu_{T_i}(A) := \rho(T_i A)$ for measurable $A \subset A_i$. By (iii), $\nu_{T_i} \ll \nu$. Define

$$J_T(x) = \frac{d\nu_{T_i}}{d\nu}(x) \quad \text{if } x \in A_i.$$


Clearly the definition of $J_T$ is $\nu$-mod 0 independent of the choice of the partition $\alpha$. $J_T$ is called the Jacobian of $T$. It is clear that

$$J_T(x) \ge 1, \quad \nu\text{-a.e. } x. \tag{2.1}$$

When $(X, \mathcal{A}, \nu)$ and $(Y, \mathcal{B}, \rho)$ are both Lebesgue spaces and $\epsilon$ is the partition of $Y$ into single points, Parry [7, Lemma 10.5] gives a very useful property of the Jacobian:

$$-\log \nu_x^{T^{-1}\epsilon}(\{x\}) = \log J_T(x), \quad \nu\text{-a.e. } x. \tag{2.2}$$

For a $C^1$ measure-preserving map $g : (M, \mathcal{B}(M), \nu) \circlearrowleft$ with $\nu(\Sigma) = 0$, where $\Sigma = \{x \in M : \det T_x g = 0\}$, it is always possible to define the Jacobian of $g$ on a measurable set of full $\nu$-measure. In fact, since $T_x g$ is non-degenerate for any $x \in M \setminus \Sigma$, a countable Borel partition $\alpha = \{A_i\}$ of $M \setminus \Sigma$ satisfying (i)–(ii) above clearly exists. Let

$$\Omega = \{x \in M \setminus \Sigma : \nu_x^{g^{-1}\epsilon}(\{x\}) > 0\}.$$

Clearly $\nu(\Omega) = 1$ and it is easy to check that $g|_{A_i \cap \Omega} : A_i \cap \Omega \to M$ and $\nu$ satisfy (iii) for each $A_i$ and hence $J_g$ is well defined on $\Omega$. Moreover, if $\nu(\Sigma \cup g\Sigma) = 0$, then, since $g$ preserves $\nu$, one has for $\nu$-a.e. $y \in M$,

$$\sum_{z : gz = y} \frac{1}{J_g(z)} = 1. \tag{2.3}$$

We now make an assumption on the Jacobian of $f : (M, \mu) \circlearrowleft$ which seems rather restrictive.

(H) There is a Hölder continuous function $J_f : M \to [1, +\infty)$ such that

$$\mu(f B) = \int_B J_f(y) \, d\mu(y) \tag{2.4}$$

for any Borel $B \subset M$ such that $f : B \to f B$ is injective.

Assumption (H) clearly implies that (2.3) is true for every $y \in M$. Actually we need only the following weaker condition.

(H)' For $\mu$-a.e. $x$, $J_f(y)$ is well defined on $V^s(x)$, $\prod_{k=0}^{+\infty} \frac{J_f(f^k x)}{J_f(f^k y)}$ converges and is bounded away from $0$ and $+\infty$ on any given neighborhood of $x$ in $V^s(x)$ whose $d^s$-diameter is finite, where $d^s$ is the distance along $V^s(x)$; moreover, (2.3) is true $\lambda_x^s$ almost everywhere on $V^s(x)$.

Assumption (H) clearly implies (H)'. The author does not know how often (H)' is satisfied, but it is an almost necessary condition for $\mu$ having accm on the $W^s$-manifolds (see Subsect. 4.1). In some particular cases, $J_f = l$ is constant everywhere, where $l = \# f^{-1}\{x\}$ for any $x \in M$.

Example 2.5. Let $f : M \to M$ be a $C^1$ map so that $T_x f$ is non-degenerate for every $x \in M$. Let $l = \# f^{-1}\{x\}$ for all $x \in M$. Take $x_0 \in M$. Let $\mu_k$ be the probability so that $\mu_k(\{z\}) = \frac{1}{l^k}$ for any $z \in f^{-k}\{x_0\}$. Let $\mu$ be any weak limit point of $\{\frac{1}{n} \sum_{k=0}^{n-1} \mu_k\}_{n \ge 1}$. Then $\mu$ is an $f$-invariant measure and $f : (M, \mu) \circlearrowleft$ has constant Jacobian $J_f = l$ which satisfies (2.4).
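The construction in Example 2.5 can be tried out on the simplest case. The sketch below is only an illustration under assumptions not made in the paper: $M$ is the circle $[0,1)$, $f(x) = l x \bmod 1$ with the hypothetical choices $l = 3$, $x_0 = 1/7$ and $k = 4$, and exact rational arithmetic is used so that preimages are computed without rounding. It builds $\mu_k$ as the uniform probability on $f^{-k}\{x_0\}$ and checks relation (2.3) for the constant Jacobian $J_f = l$.

```python
from fractions import Fraction

def preimages(y, l):
    """All l preimages of y under f(x) = l*x mod 1 on the circle [0, 1)."""
    return [(y + j) / l for j in range(l)]

def mu_k(x0, l, k):
    """Uniform probability on f^{-k}{x0}: every atom carries mass 1 / l**k."""
    points = [x0]
    for _ in range(k):
        points = [z for y in points for z in preimages(y, l)]
    mass = Fraction(1, l ** k)
    return {z: mass for z in points}

l, k = 3, 4                     # hypothetical degree and depth
x0 = Fraction(1, 7)             # hypothetical base point
measure = mu_k(x0, l, k)

# Relation (2.3) with the constant Jacobian J_f = l: each of the l preimages contributes 1/l.
jacobian_sum = sum(Fraction(1, l) for _ in preimages(x0, l))

# Mass that mu_k gives to the half circle [0, 1/2); it approaches 1/2 as k grows.
half_mass = sum(m for z, m in measure.items() if z < Fraction(1, 2))
```

Averaging such measures over $k$ gives the Cesàro sums of the example; for this particular map the atoms of $\mu_k$ are equally spaced, so the weak limit is the Lebesgue measure on the circle.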


Theorem 2.6 (Necessity). Let $(f, \mu)$ be as given at the beginning of Sect. 2 and assume (H) or (H)'. If (1.3) holds true, then $\mu$ has accm on the stable manifolds.

Guess 2.7. If $\mu$ is hyperbolic (that is, $\lambda_i(x) \ne 0$, $1 \le i \le r(x)$, for $\mu$-a.e. $x$), has property (H) or (H)' and satisfies both formulae (1.2) and (1.3), then $\mu$ is absolutely continuous with respect to the Lebesgue measure on $M$.

Note that, when $\mu$ satisfies the Pesin formula (1.2), the entropy production $e_\mu(f) = 0$ if and only if $\mu$ satisfies the formula (1.3). Hence, Guess 2.7 implies that, if $\mu$ is moreover hyperbolic and satisfies (H) or (H)', $e_\mu(f) = 0$ if (see Remark 2.9 below) and only if $\mu$ is absolutely continuous with respect to the Lebesgue measure on $M$.

Remark 2.8. The referee indicated to the author the following outline of an argument which hopefully can confirm that Guess 2.7 is true (a rigorous proof is, however, still lacking). In the natural extension or inverse limit system $(\bar f, \bar\mu)$ of $(f, \mu)$, $\bar\mu$ has absolutely continuous conditional measures along the unstable manifolds ([8]). On $M$, the stable foliation is absolutely continuous. To describe the transverse measure, one should take a transversal $T$ and project $\mu$ on $T$ along local stable leaves. By projection from $\bar\mu$, the measure $\mu$ is an average of the projections from the natural extension of the conditional measures on unstable manifolds. Each of these projections is an absolutely continuous measure on the projection of the corresponding unstable manifold, which is transversal to the stable foliation. By absolute continuity of the stable foliation, these are carried to $T$ into an absolutely continuous measure on $T$. The transversal measure on $T$, which is an average of these last ones, is also absolutely continuous. Now, by Theorem 2.6, the measure $\mu$ has moreover absolutely continuous conditional measures on the stable manifolds. Pairing with an absolutely continuous measure on transversals yields an absolutely continuous measure on $M$.

Remark 2.9. For any $C^2$ measure-preserving map $f : (M, \mu) \circlearrowleft$ (possibly with degenerate points), if $\mu$ is absolutely continuous with respect to the Lebesgue measure, then $\mu$ satisfies both formulae (1.2) and (1.3) and hence $e_\mu(f) = 0$. In fact, $\mu$ satisfying (1.2) is proved in Liu [4]. On the other hand,

$$J_f(x) = \frac{\phi(f x)}{\phi(x)} |\det T_x f|, \quad \mu\text{-a.e. } x,$$

where $\phi = d\mu / d\mathrm{Leb}$. By (2.2),

$$-\log \mu_x^{f^{-1}\epsilon}(\{x\}) = \log J_f(x), \quad \mu\text{-a.e. } x.$$

By Liu [4], $\log |\det T_x f| \in L^1(M, \mu)$ (this follows from the fact that $\mu \ll \mathrm{Leb}$) and $\int \log \frac{\phi \circ f}{\phi} \, d\mu = 0$. Hence

$$H_\mu(\epsilon \mid f^{-1}\epsilon) = -\int \log \mu_x^{f^{-1}\epsilon}(\{x\}) \, d\mu(x) = \int \log \frac{\phi(f x)}{\phi(x)} \, d\mu + \int \log |\det T_x f| \, d\mu = \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu + \int \sum_i \lambda_i(x)^+ m_i(x) \, d\mu,$$

and thus

$$F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu = \int \sum_i \lambda_i(x)^+ m_i(x) \, d\mu = h_\mu(f).$$
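The identities of Remark 2.9 can be checked numerically on a concrete case. The following is a minimal sketch under assumptions that are not part of the paper: $f$ is the endomorphism of the 2-torus induced by the hypothetical integer matrix $A = \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}$ and $\mu$ is the Lebesgue measure, so that $\phi \equiv 1$ and $J_f = |\det A| = 2$. The Lyapunov exponents are the logarithms of the absolute values of the eigenvalues $2 \pm \sqrt{2}$, one positive and one negative, and the entropy is taken from the Pesin formula (1.2).

```python
import math

# Hypothetical example (not from the paper): the torus endomorphism induced by
# A = [[3, 1], [1, 1]], which preserves the Lebesgue measure.
a, b, c, d = 3, 1, 1, 1
det = a * d - b * c                       # = 2, so the map is 2-to-1
tr = a + d
disc = math.sqrt(tr * tr - 4 * det)       # both eigenvalues are real

lam_plus = math.log((tr + disc) / 2)      # positive Lyapunov exponent
lam_minus = math.log((tr - disc) / 2)     # negative Lyapunov exponent

folding_entropy = math.log(abs(det))      # F = integral of log J_f, with J_f = |det A|
entropy = lam_plus                        # Pesin formula (1.2)

# Entropy production: F minus the integral of log |det T_x f|, the sum of the exponents.
entropy_production = folding_entropy - (lam_plus + lam_minus)

# Right-hand side of (1.3): F minus the sum of the negative exponents.
rhs_13 = folding_entropy - lam_minus
```

Here `entropy` and `rhs_13` agree up to rounding because $(2+\sqrt{2})(2-\sqrt{2}) = 2 = |\det A|$, which is the identity $F_\mu(f) - \lambda^- = \lambda^+$ in this linear case.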


3. Proof of Theorem 2.3

By Liu [3], (1.1) holds true for $(f, \mu)$ since we assume here that $f$ has no degenerate points. It remains to prove

$$h_\mu(f) \ge F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu. \tag{3.1}$$

With a little modification of the sets chosen in Sect. 2, take a Borel set $\Gamma$ with $\mu(\Gamma) = 1$, $f\Gamma \subset \Gamma$ and $\Gamma = I \cup \Delta$ so that, for every $x \in I$, $\lambda_i(x) \ge 0$ for all $i$ and $W^s(x) = \{x\}$, and, for every $x \in \Delta$, $\lambda_1(x) < 0$ and $W^s(x) = \bigcup_{n=0}^{+\infty} f^{-n} W(f^n x)$. We may take $\Delta$ so that $W^s(x) \subset \Delta$ for every $x \in \Delta$.

Lemma 3.1. There exists a measurable partition $\eta$ of $(\Delta, \mu|_\Delta)$ which has the following properties:

(1) $f^{-1}\eta \le \eta$ (meaning that $(f^{-1}\eta)(x) \supset \eta(x)$ for $\mu$-a.e. $x \in \Delta$);
(2) $\eta$ is subordinate to the $W^s$-manifolds of $(f, \mu)$;
(3) for every Borel set $B$ the function $P_B(x) = \lambda_x^s(\eta(x) \cap B)$ is measurable and is $\mu$ almost everywhere finite on $\Delta$.

The proof of this lemma is omitted here since it is almost the same as that of Liu and Qian [5, Prop. IV.2.1] restricted to the case of a deterministic map.

Property (3) just above allows one to define a $\sigma$-finite Borel measure $\lambda^*$ on $\Delta$ by

$$\lambda^*(B) = \int \lambda_x^s(\eta(x) \cap B) \, d\mu$$

for each Borel $B \subset \Delta$. From the assumption of $\mu$ having accm on $W^s$-manifolds it follows that $\mu \ll \lambda^*$. Put

$$h = \frac{d\mu}{d\lambda^*}.$$

By arguments similar to Ledrappier and Strelcyn [1, Prop. 4.1] or [5, Prop. IV.2.2] we have for $\mu$-a.e. $x \in \Delta$,

$$h = \frac{d\mu_x^\eta}{d\lambda_x^s} \quad \lambda_x^s \text{ almost everywhere on } \eta(x). \tag{3.2}$$

Let $x \in \Delta$ and consider the measure-preserving map between Lebesgue spaces

$$f_x := f|_{(f^{-1}\eta)(x)} : \big( (f^{-1}\eta)(x), \mu_x^{f^{-1}\eta} \big) \longrightarrow \big( \eta(f x), \mu_{f x}^{\eta} \big).$$

Since $T_x f$ is non-degenerate at every $x \in M$ and $\mu_x^{f^{-1}\eta} \ll \lambda_x^s$ and $\mu_{f x}^{\eta} \ll \lambda_{f x}^s$, we know that $f_x$ admits a Jacobian which, using (3.2), is given by

$$J_{f_x}(z) = \frac{1}{\mu_x^{f^{-1}\eta}(\eta(z))} \cdot \frac{h(f z)}{h(z)} \cdot |\det(T_z f|_{E^s(z)})|$$

for $\mu_x^{f^{-1}\eta}$-a.e. $z \in (f^{-1}\eta)(x)$. With a bit of abuse of notation, let $\epsilon$ be the partition of $\eta(x)$ into single points. By (2.2), for $\mu_x^{f^{-1}\eta}$-a.e. $z \in (f^{-1}\eta)(x)$,

$$-\log \big( \mu_x^{f^{-1}\eta} \big)_z^{(f_x)^{-1}\epsilon}(\{z\}) = \log J_{f_x}(z)$$

which, by the transitivity of conditional measures, implies that for $\mu$-a.e. $z \in \Delta$,

$$-\log \mu_z^{f^{-1}\epsilon}(\{z\}) = -\log \mu_z^{f^{-1}\eta}(\eta(z)) + \log \frac{h(f z)}{h(z)} + \log |\det(T_z f|_{E^s(z)})|. \tag{3.3}$$

Since $-\int \log \mu_z^{f^{-1}\epsilon}(\{z\}) \, d\mu(z) \le l$ (to recall, $l = \# f^{-1}\{x\}$ for all $x \in M$), we know that the left hand side of (3.3) is $\mu$-integrable. Since, by the Ruelle inequality [11], $h_\mu(f) < +\infty$, we know that $-\log \mu_z^{f^{-1}\eta}(\eta(z)) \in L^1(\mu)$ since $H_\mu(\eta \mid f^{-1}\eta) \le h_\mu(f)$. The last term on the right side of (3.3) is clearly integrable. Thus

$$\int \log \frac{h(f z)}{h(z)} \, d\mu(z) = 0.$$

Integrating the two sides of (3.3), we have

$$-\int_\Delta \log \mu_z^{f^{-1}\epsilon}(\{z\}) \, d\mu = -\int_\Delta \log \mu_z^{f^{-1}\eta}(\eta(z)) \, d\mu + \int_\Delta \sum_i \lambda_i(z)^- m_i(z) \, d\mu.$$

Letting $\eta = \epsilon$ on $I$, the above equality clearly holds true with $\Delta$ being replaced by $I$. Thus

$$H_\mu(\epsilon \mid f^{-1}\epsilon) = H_\mu(\eta \mid f^{-1}\eta) + \int_M \sum_i \lambda_i(z)^- m_i(z) \, d\mu,$$

which implies

$$h_\mu(f) \ge H_\mu(\eta \mid f^{-1}\eta) = F_\mu(f) - \int_M \sum_i \lambda_i(z)^- m_i(z) \, d\mu.$$

4. Proof of Theorem 2.6

In this section we largely use the strategy of Ledrappier and Young [2] which deals with unstable manifolds of diffeomorphisms. Our maps under consideration are non-invertible and unstable manifolds cannot be defined for the system $f : (M, \mu) \circlearrowleft$ (but can be defined for its inverse limit system). We deal with stable manifolds and use the Jacobian and the inverse limit space.

4.1. Increasing partitions subordinate to $W^s$-manifolds and the necessity.

We first assume that $(f, \mu)$ is ergodic. Let $-\infty < \lambda_1 < \lambda_2 < \cdots < \lambda_r < +\infty$ be the Lyapunov exponents of $(f, \mu)$ with $m_i$ being the multiplicity of $\lambda_i$. If $\lambda_1 \ge 0$, then $W^s(x) = \{x\}$ for $\mu$-a.e. $x \in M$ and the conditional measure of $\mu$ on $W^s(x)$ is $\delta_x$, so Theorem 2.6 is trivial in this case (cf. Remark 2.4). We will assume that $\lambda_1 < 0$ and $\lambda_1 < \lambda_2 < \cdots < \lambda_s < 0$ are all the negative exponents.


Let $\bar M = \{\bar x = (\cdots, x_{-1}, x_0, x_1, \cdots) : x_i \in M, \ f x_i = x_{i+1}, \ i \in \mathbb{Z}\}$ be the inverse limit space of $(M, f)$, $\pi : \bar M \to M$, $\bar x \mapsto x_0$ the natural projection, $\tau : \bar M \to \bar M$ the left shift transformation, and $\bar\mu$ the unique $\tau$-invariant measure such that $\pi\bar\mu = \mu$.

Proposition 4.1.1. There exists a measurable partition $\xi$ of $(M, \mathcal{B}_\mu(M), \mu)$ with the following properties:

(1) $f^{-1}\xi \le \xi$, and $\xi$ is subordinate to the $W^s$-manifolds of $(f, \mu)$;
(2) $\bigvee_{n=0}^{+\infty} \tau^n(\pi^{-1}\xi)$ is the partition of $\bar M$ into single points.

This result and its proof are similar to Lemma 3.1 (see [5, Prop. IV.2.1] for details). We will give an outline of the proof here since it produces certain additional properties of the partition that will be useful in the next subsection. This is similar to [2, Lemma 3.1.1].

Outline of construction. There is a measurable set $S$ with the following properties:

(a) $\mu(S) > 0$;
(b) $S$ is the disjoint union of a continuous family of embedded discs $\{D_\alpha\}$ where each $D_\alpha$ is an open neighborhood of $x_\alpha$ in $V^s(x_\alpha)$;
(c) For $\mu$-a.e. $x$, there is an open neighborhood $U_x$ of $x$ in $V^s(x)$ such that, for each $n \ge 0$, either $f^n U_x \cap S = \emptyset$ or $f^n U_x \subset D_\alpha$ for some $\alpha$;
(d) There is $\gamma > 0$ such that: i) the $d^s$-diameter of every $D_\alpha$ in $S$ is less than $\gamma$; ii) if $x, y \in S$ are such that $y \in V^s(x)$ and $d^s(x, y) > \gamma$, then $x, y$ lie on distinct $D_\alpha$-discs.

Let $\hat\xi$ be the partition of $M$ defined by

$$\hat\xi(x) = \begin{cases} D_\alpha & \text{if } x \in D_\alpha, \\ M \setminus S & \text{if } x \notin S. \end{cases}$$

Then $\xi := \hat\xi^- = \bigvee_{n=0}^{+\infty} f^{-n}\hat\xi$ is the partition we desire.

The partitions whose construction is just outlined have the following alternate characterization: There is a set $S$ satisfying (a)–(d) such that, if $\sigma = \bigvee_{n=0}^{+\infty} f^{-n}\{S, M \setminus S\}$, then, for every $x \in M$, $y \in \xi(x)$ if and only if $y \in \sigma(x)$ and $d^s(f^n y, f^n x) \le \gamma$ whenever $f^n x \in S$.

Proposition 4.1.2. Let $\xi$ be a partition given in Proposition 4.1.1. Then $h_\mu(f) = H_\mu(\xi \mid f^{-1}\xi)$.

A discussion of the proof of this proposition will be given in Subsect. 4.2. We first show

$$H_\mu(\xi \mid f^{-1}\xi) = F_\mu(f) - \sum_i \lambda_i^- m_i \implies \mu_x^\xi \ll \lambda_x^s \text{ for } \mu\text{-a.e. } x. \tag{4.1.1}$$

Let $D^s(x) = |\det(T_x f|_{E^s(x)})|$. Suppose we know that $\mu_x^\xi \ll \lambda_x^s$ for $\mu$-a.e. $x$. Then $d\mu_x^\xi = h \, d\lambda_x^s$ $\mu$ almost everywhere for some function $h$ (see (3.2)). By Liu [4, Proof of Claim 2.1], this function must satisfy

$$\mu_x^{f^{-1}\xi}(\xi(x)) = \frac{1}{J_f(y)} \cdot \frac{h(f y)}{h(y)} \cdot |\det(T_y f|_{E^s(y)})|$$

for $\lambda_x^s$-a.e. $y \in \xi(x)$ and hence

$$\frac{h(y)}{h(x)} = \frac{h(f y)}{h(f x)} \cdot \frac{J_f(x)}{J_f(y)} \cdot \frac{|\det(T_y f|_{E^s(y)})|}{|\det(T_x f|_{E^s(x)})|} = \cdots = \prod_{k=0}^{+\infty} \frac{J_f(f^k x)}{J_f(f^k y)} \cdot \frac{D^s(f^k y)}{D^s(f^k x)} =: \Delta(x, y)$$

as long as $\frac{h(f^k y)}{h(f^k x)} \to 1$ as $k \to +\infty$ and $\Delta(x, y)$ is well defined. A candidate for $h$ is then

$$h(y) = \frac{\Delta(x, y)}{\int_{\xi(x)} \Delta(x, y) \, d\lambda_x^s(y)}, \quad \forall y \in \xi(x). \tag{4.1.2}$$

Also note that the same observation holds if we replace $\xi$ by $f^{-m}\xi$ for $m \ge 0$.

Lemma 4.1.3. Let $m \ge 0$. There exists a measurable function $h_m : M \to (0, +\infty)$ such that for $\mu$-a.e. $x$,

$$h_m(y) = \frac{\Delta(x, y)}{\int_{(f^{-m}\xi)(x)} \Delta(x, y) \, d\lambda_x^s(y)}, \quad \forall y \in (f^{-m}\xi)(x). \tag{4.1.3}$$

This lemma follows from our assumption (H) or (H)' in Sect. 2 and the fact that $E^s(y)$ is Lipschitz continuous along each $V^s(x)$. The detailed proof is the same as that of [5, Lemma VI.8.2] and is omitted here.

Lemma 4.1.4. For $\mu$-a.e. $x$ one has

$$\int_{(f^{-1}\xi)(x)} \Delta(x, y) \, d\lambda_x^s(y) = \frac{J_f(x)}{D^s(x)} \int_{\xi(f x)} \Delta(f x, y) \, d\lambda_{f x}^s(y). \tag{4.1.4}$$

Proof. Let $y_0 \in \xi(f x)$. Since $T_z f$ is assumed to be non-degenerate at any $z \in M$, there is an open neighborhood $U_{y_0}$ of $y_0$ in $M$ such that $f^{-1}U_{y_0} = \bigcup_{i=1}^{l} V_{z_i}$, where $z_i \in f^{-1}\{y_0\}$ and $V_{z_i}$ is an open neighborhood of $z_i$ so that $f_i := f|_{V_{z_i}} : V_{z_i} \to U_{y_0}$ is a diffeomorphism. For any Borel set $B \subset U_{y_0} \cap \xi(f x)$, put $C_i = V_{z_i} \cap f^{-1}B$. Then

$$\begin{aligned}
\int_{f^{-1}B} \Delta(x, z) \, d\lambda_x^s(z)
&= \sum_{i=1}^{l} \int_{C_i} \Delta(x, z) \, d\lambda_x^s(z) \\
&= \sum_{i=1}^{l} \int_B \Delta(x, f_i^{-1}y) \, |\det(T_y f_i^{-1}|_{E^s(y)})| \, d\lambda_{f x}^s(y) \\
&= \sum_{i=1}^{l} \int_B \frac{\Delta(x, f_i^{-1}y)}{|\det(T_{f_i^{-1}y} f|_{E^s(f_i^{-1}y)})|} \, d\lambda_{f x}^s(y) \\
&= \sum_{i=1}^{l} \int_B \prod_{k=0}^{+\infty} \frac{J_f(f^k x)}{J_f(f^k(f_i^{-1}y))} \cdot \frac{D^s(f^k(f_i^{-1}y))}{D^s(f^k x)} \cdot \frac{1}{D^s(f_i^{-1}y)} \, d\lambda_{f x}^s(y) \\
&= \frac{J_f(x)}{D^s(x)} \int_B \sum_{i=1}^{l} \frac{1}{J_f(f_i^{-1}y)} \, \Delta(f x, y) \, d\lambda_{f x}^s(y) \\
&= \frac{J_f(x)}{D^s(x)} \int_B \Delta(f x, y) \, d\lambda_{f x}^s(y),
\end{aligned}$$

where the last equality uses (2.3). Taking a finite cover of $\xi(f x)$ by open sets of the type of $U_{y_0}$, we get (4.1.4).

Let $(\bar M, \tau, \bar\mu)$ be the inverse limit system of $(M, f, \mu)$ and set $\bar\xi = \pi^{-1}\xi$. We now define a Borel probability $\bar\nu$ on $\bar M$ by letting $\bar\nu = \bar\mu$ on $\mathcal{B}(\bar\xi)$, the $\sigma$-algebra generated by $\bar\xi$, and by introducing a conditional measure $\bar\nu_{\bar x}^{\bar\xi}$ on $\bar\xi(\bar x)$ (where $\bar x = (\cdots, x_{-1}, x_0, x_1, \cdots) \in \bar M$) in the following way. For every cylinder set $\bar C = \{\bar y = (\cdots, y_{-1}, y_0, y_1, \cdots) \in \bar M : y_i \in A_i, \ i = -p, \cdots, -1, 0, 1, \cdots, q\}$, let $C = \{y \in (f^{-p}\xi)(x_{-p}) : y \in A_{-p}, \ f y \in A_{-p+1}, \cdots, f^{p+q}y \in A_q\}$ and define

$$\bar\nu_{\bar x}^{\bar\xi}(\bar C) = \frac{\int_C \Delta(x_{-p}, y) \, d\lambda_{x_{-p}}^s(y)}{\int_{(f^{-p}\xi)(x_{-p})} \Delta(x_{-p}, y) \, d\lambda_{x_{-p}}^s(y)}.$$

From Lemma 4.1.4 and its proof we know that for any Borel $B \subset \xi(f x)$,

$$\frac{\int_{f^{-1}B} \Delta(x, y) \, d\lambda_x^s(y)}{\int_{(f^{-1}\xi)(x)} \Delta(x, y) \, d\lambda_x^s(y)} = \frac{\int_B \Delta(f x, y) \, d\lambda_{f x}^s(y)}{\int_{\xi(f x)} \Delta(f x, y) \, d\lambda_{f x}^s(y)}.$$

This implies that $\bar\nu_{\bar x}^{\bar\xi}$ is well defined. Replacing $\xi$ with $f^{-(m-1)}\xi$ in Lemma 4.1.4, we get for $m \ge 1$,

$$\int_{(f^{-m}\xi)(x)} \Delta(x, y) \, d\lambda_x^s(y) = \frac{J_f(x) \cdots J_f(f^{m-1}x)}{D^s(x) \cdots D^s(f^{m-1}x)} \int_{\xi(f^m x)} \Delta(f^m x, y) \, d\lambda_{f^m x}^s(y). \tag{4.1.5}$$

Lemma 4.1.5. For $m \ge 1$,

$$\frac{1}{m} \int_{\bar M} -\log \bar\nu_{\bar x}^{\bar\xi}((\tau^m\bar\xi)(\bar x)) \, d\bar\mu(\bar x) = F_\mu(f) - \sum_i \lambda_i^- m_i.$$

Proof. Put $L(\bar x) = \int_{\xi(x_0)} \Delta(x_0, y) \, d\lambda_{x_0}^s(y)$. Then

$$q_m(\bar x) := \bar\nu_{\bar x}^{\bar\xi}((\tau^m\bar\xi)(\bar x)) = \frac{\int_{\xi(x_{-m})} \Delta(x_{-m}, y) \, d\lambda_{x_{-m}}^s(y)}{\int_{(f^{-m}\xi)(x_{-m})} \Delta(x_{-m}, y) \, d\lambda_{x_{-m}}^s(y)} = \frac{L(\tau^{-m}\bar x)}{L(\bar x)} \cdot \frac{D^s(x_{-m}) \cdots D^s(x_{-1})}{J_f(x_{-m}) \cdots J_f(x_{-1})}.$$

Since $q_m(\bar x) \le 1$ and $\log D^s$ and $\log J_f$ are $\mu$-integrable, we know $\int \log^+ \frac{L \circ \tau^{-m}}{L} \, d\bar\mu < +\infty$ and hence $\int \log \frac{L \circ \tau^{-m}}{L} \, d\bar\mu = 0$. This yields

$$\begin{aligned}
-\int \log q_m(\bar x) \, d\bar\mu
&= \sum_{j=1}^{m} \int \log J_f(x_{-j}) \, d\bar\mu(\bar x) - \sum_{j=1}^{m} \int \log D^s(x_{-j}) \, d\bar\mu(\bar x) \\
&= m \int \log J_f(x) \, d\mu(x) - m \int \log D^s(x) \, d\mu(x) \\
&= m \Big[ F_\mu(f) - \sum_i \lambda_i^- m_i \Big].
\end{aligned}$$

This proves the lemma.

Lemma 4.1.6. $\frac{1}{m} H_{\bar\mu}(\tau^m\bar\xi \mid \bar\xi) = F_\mu(f) - \sum_i \lambda_i^- m_i$ implies $\bar\nu = \bar\mu$ on $\mathcal{B}(\tau^m\bar\xi)$.

The proof of this lemma is similar to that of [2, Lemma 6.1.3] and is omitted here. Noting that $\frac{1}{m} H_{\bar\mu}(\tau^m\bar\xi \mid \bar\xi) = \frac{1}{m} H_\mu(\xi \mid f^{-m}\xi) = h_\mu(f)$ and that $\bigvee_{m=0}^{+\infty} \tau^m\bar\xi$ is the partition of $\bar M$ into single points, we know that $\bar\nu = \bar\mu$, and hence $\pi\bar\nu = \pi\bar\mu = \mu$. This completes the proof of the ergodic case of Theorem 2.6.

In what follows we complete the proof of Theorem 2.6 in the non-ergodic case. The arguments are similar to [2, Subsect. (6.2)] or [5, Subsect. VI.8.B] and are only outlined. Given $(f, \mu)$, by Rokhlin [9], there is a unique ($\mu$-mod 0) measurable partition $\zeta = \{C\}$ of $(M, \mathcal{B}_\mu(M), \mu)$ such that $f^{-1}C = C$ for each $C \in \zeta$ and $f|_C : (C, \mu_C) \circlearrowleft$ is ergodic for $\mu_\zeta$-a.e. $C \in M/\zeta$, where $\mu_C$ is the conditional measure of $\mu$ on $C$ and $(M/\zeta, \mu_\zeta)$ is the factor space of $(M, \mu)$ with respect to $\zeta$. Suppose

$$h_\mu(f) = F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu.$$

Since

$$h_\mu(f) = \int_{M/\zeta} h_{\mu_C}(f) \, d\mu_\zeta(C)$$

and

$$F_\mu(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu = \int_{M/\zeta} \Big[ F_{\mu_C}(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu_C \Big] d\mu_\zeta(C),$$

by the inequality (1.1), we have

$$h_{\mu_C}(f) = F_{\mu_C}(f) - \int \sum_i \lambda_i(x)^- m_i(x) \, d\mu_C$$

for $\mu_\zeta$-a.e. $C$. Let $\xi$ be a measurable partition of $M$ subordinate to the $W^s$-manifolds of $(M, \mu)$. Since, by [5, Lemma IV.2.2], $\xi$ refines $\zeta$, we have

$$\mu_x^\xi = (\mu_C)_x^\xi \quad \text{if } x \in C,$$

and hence $\mu_x^\xi \ll \lambda_x^s$ for $\mu$-a.e. $x$.

4.2. Proof of Proposition 4.1.2 in the hyperbolic case.

The proof follows largely the arguments of Ledrappier and Young [2], but we will use the inverse limit space and some modifications are necessary. A complete proof is quite long and it is in fact more similar to the arguments in [5, Chap. V] where a version of [2] for random diffeomorphisms is presented. In order to avoid repeating similar arguments, we will only present a proof for the case when $(f, \mu)$ is hyperbolic, that is, $(f, \mu)$ does not have a zero Lyapunov exponent. Though it is much simpler than that for the general case, such a presentation is sufficient for the reader to get the full flavor of the necessary modifications of [2] for the complete proof.

Lyapunov charts. Since $T_x f$ is assumed to be non-degenerate for any $x \in M$, there are $\rho_0, \rho_1 > 0$ such that, for any $x \in M$, $f_x := f|_{B(x, \rho_0)} : B(x, \rho_0) \to M$ is a diffeomorphism to the image which contains $B(f x, \rho_1)$. Let $f_x^{-1} : f B(x, \rho_0) \to B(x, \rho_0)$ denote the local inverse. Assume $(f, \mu)$ is ergodic and let $\lambda_1 < \lambda_2 < \cdots < \lambda_r$ be all the Lyapunov exponents with $\lambda_i \ne 0$ for all $i$. Then there is a Borel set $\Gamma_0 \subset \bar M$ with $\bar\mu(\Gamma_0) = 1$ and for each $\bar x \in \Gamma_0$ there exists a measurable (in $\bar x$) splitting

$$T_{x_0}M = E_1(\bar x) \oplus E_2(\bar x) \oplus \cdots \oplus E_r(\bar x)$$

such that for each $1 \le i \le r$,

$$\lim_{n \to \pm\infty} \frac{1}{n} \log |D(\bar x, n)v| = \lambda_i \quad \text{for } 0 \ne v \in E_i(\bar x),$$

where $D(\bar x, n) = T_{x_0}f^n$ for $n \ge 0$ and $D(\bar x, n) = (T_{x_n}f)^{-1} \circ \cdots \circ (T_{x_{-1}}f)^{-1}$ for $n < 0$. Put $E^s(\bar x) = \oplus_{\lambda_i < 0} E_i(\bar x)$, $E^u(\bar x) = \oplus_{\lambda_i > 0} E_i(\bar x)$, $s = \dim E^s(\bar x)$, $u = \dim E^u(\bar x)$, $d = \dim M$. For $(v^s, v^u) \in \mathbb{R}^s \times \mathbb{R}^u$, define $\|(v^s, v^u)\| = \max\{\|v^s\|_s, \|v^u\|_u\}$ where $\|\cdot\|_s$ and $\|\cdot\|_u$ are the usual Euclidean norms on $\mathbb{R}^s$ and $\mathbb{R}^u$ respectively. The closed disk in $\mathbb{R}^s$ of radius $\rho$ centered at $0$ is denoted by $R^s(\rho)$ and $R(\rho) := R^s(\rho) \times R^u(\rho)$.

Put $\lambda^- = \max\{\lambda_i : \lambda_i < 0\}$ and $\lambda^+ = \min\{\lambda_i : \lambda_i > 0\}$. Let $0 < \varepsilon < \min\{-\lambda^-/100, \lambda^+/100\}$ be given. Then there is a Borel set $\Gamma' \subset \Gamma_0$ with $\bar\mu(\Gamma') = 1$ and $\tau\Gamma' = \Gamma'$ and there is a measurable function $l : \Gamma' \to [1, +\infty)$ with $l(\tau^{\pm 1}\bar x) \le e^{\varepsilon} l(\bar x)$ such that for each $\bar x \in \Gamma'$ one can define an embedding $\Phi_{\bar x} : R(l(\bar x)^{-1}) \to M$ with the following properties:

i) $\Phi_{\bar x}(0) = x_0$, and $T_0\Phi_{\bar x}$ takes $\mathbb{R}^s$, $\mathbb{R}^u$ to $E^s(\bar x)$, $E^u(\bar x)$ respectively.
ii) Put $f_{\bar x} := \Phi_{\tau\bar x}^{-1} \circ f \circ \Phi_{\bar x}$ and $f_{\bar x}^{-1} := \Phi_{\tau^{-1}\bar x}^{-1} \circ f_{x_{-1}}^{-1} \circ \Phi_{\bar x}$, defined wherever they make sense. Then $\|T_0 f_{\bar x} v\| \le e^{\lambda^- + \varepsilon}\|v\|$ for $v \in \mathbb{R}^s$ and $\|T_0 f_{\bar x} v\| \ge e^{\lambda^+ - \varepsilon}\|v\|$ for $v \in \mathbb{R}^u$.
iii) Let $L(g)$ denote the Lipschitz constant of a map $g$. Then $L(f_{\bar x} - T_0 f_{\bar x}) \le \varepsilon$, $L(f_{\bar x}^{-1} - T_0 f_{\bar x}^{-1}) \le \varepsilon$ and $L(T_\cdot f_{\bar x}) \le l(\bar x)$, $L(T_\cdot f_{\bar x}^{-1}) \le l(\bar x)$.
iv) $\|f_{\bar x} v\| \le e^{\lambda}\|v\|$ and $\|f_{\bar x}^{-1} v\| \le e^{\lambda}\|v\|$ for all $v \in R(e^{-\lambda - \varepsilon} l(\bar x)^{-1})$, where $\lambda > 0$ is a number depending only on $\varepsilon$ and the exponents. In particular, $f_{\bar x}^{-1} R(e^{-\lambda - \varepsilon} l(\bar x)^{-1}) \subset R(l(\tau^{-1}\bar x)^{-1})$.
v) For any $v, v' \in R(l(\bar x)^{-1})$ we have $K^{-1} d(\Phi_{\bar x}v, \Phi_{\bar x}v') \le \|v - v'\| \le l(\bar x) \, d(\Phi_{\bar x}v, \Phi_{\bar x}v')$ for some universal constant $K > 0$.

The proof of the above facts is similar to [2, Appendix] or [5, Proof of Proposition VI.3.1] (by replacing $\omega$ with $\bar x$) and is omitted here. Any system of local charts $\{\Phi_{\bar x} : \bar x \in \Gamma'\}$ satisfying i)–v) above will be referred to as $(\varepsilon, l)$-charts.

Let $\{\Phi_{\bar x} : \bar x \in \Gamma'\}$ be a system of $(\varepsilon, l)$-charts and let $0 < \delta \le 1$ be a reduction factor. For $\bar x \in \Gamma'$ define

$$S_\delta^s(\bar x) = \{z \in R(l(\bar x)^{-1}) : \|\Phi_{\tau^n\bar x}^{-1} \circ f^n \circ \Phi_{\bar x} z\| \le \delta l(\tau^n\bar x)^{-1}, \ \forall n \ge 0\}.$$

Then $\Phi_{\bar x} S_\delta^s(\bar x) \subset V^s(x_0)$ for $\bar\mu$-a.e. $\bar x \in \Gamma'$. And, when $\delta > 0$ is small, $S_\delta^s(\bar x)$ is the graph of a function $h_{\bar x} : R^s(\delta l(\bar x)^{-1}) \to R^u(\delta l(\bar x)^{-1})$ with $h_{\bar x}(0) = 0$ and $\|T_\cdot h_{\bar x}\| \le \frac{1}{3}$.

Partitions adapted to Lyapunov charts. A measurable partition $P$ of $(\bar M, \bar\mu)$ is said to be adapted to $(\{\Phi_{\bar x}\}, \delta)$ if for $\bar\mu$-a.e. $\bar x \in \Gamma'$ one has $\pi P^-(\bar x) \subset \Phi_{\bar x} S_\delta^s(\bar x)$, where $P^- = \bigvee_{n=0}^{+\infty} \tau^{-n}P$.

Lemma 4.2.1. Given $\{\Phi_{\bar x}\}$ and $0 < \delta \le 1$, there is a finite entropy partition $P$ of $(\bar M, \bar\mu)$ such that $P$ is adapted to $(\{\Phi_{\bar x}\}, \delta)$.

Proof. Fix some $l_0 > 0$ so that $\Lambda_0 := \{\bar x \in \Gamma' : l(\bar x) \le l_0\}$ has positive $\bar\mu$ measure. For $\bar x \in \Lambda_0$, let $r(\bar x)$ be the smallest positive integer $k$ such that $\tau^{-k}\bar x \in \Lambda_0$. Define $\psi : \bar M \to (0, +\infty)$ by

$$\psi(\bar x) = \begin{cases} \min\{\delta, \rho_0\} & \text{if } \bar x \notin \Lambda_0, \\ \min\{\delta l_0^{-2} e^{-(\lambda + \varepsilon) r(\bar x)}, \rho_0\} & \text{if } \bar x \in \Lambda_0. \end{cases}$$

Then $\psi$ is defined $\bar\mu$ almost everywhere and $\log\psi$ is $\bar\mu$-integrable since $\int_{\Lambda_0} r(\bar x) \, d\bar\mu = 1$.

Take numbers $C > 0$ and $r_0 > 0$ such that, for any $0 < r \le r_0$, there exists a measurable partition $\alpha_r$ of $M$ which satisfies

$$\operatorname{diam} \alpha_r(x) \le r \quad \text{for all } x \in M$$

and

$$|\alpha_r| \le C \Big( \frac{1}{r} \Big)^d,$$

where $|\alpha_r|$ denotes the number of elements of $\alpha_r$. Put $U_n = \{\bar x \in \bar M : e^{-(n+1)} < \psi(\bar x) \le e^{-n}\}$. Define a partition $P$ of $\bar M$ by requiring that $P \ge \{U_n : n \ge 0\}$ and $P|_{U_n} = \{\pi^{-1}A : A \in \alpha_{r_n}\}|_{U_n}$, where $r_n = e^{-(n+1)}$. Clearly $\operatorname{diam} \pi P(\bar x) \le \psi(\bar x)$ for any $\bar x \in \bar M$, and, by the $\bar\mu$-integrability of $\log\psi$, one has $H_{\bar\mu}(P) < +\infty$ (see Mané [6]).

We now check that $\pi P^-(\bar x) \subset \Phi_{\bar x} R(\delta l(\bar x)^{-1})$ for $\bar\mu$-a.e. $\bar x \in \bigcup_{n \ge 0} \tau^{-n}\Lambda_0$. This clearly implies that $\pi P^-(\bar x) \subset \Phi_{\bar x} S_\delta^s(\bar x)$ for $\bar\mu$-a.e. $\bar x \in \bar M$. First consider $\bar x \in \Lambda_0$. By the choice of $P$, we have $\pi P^-(\bar x) \subset \pi P(\bar x) \subset B(x_0, \psi(\bar x))$, which is contained in $\Phi_{\bar x} R(\delta l(\bar x)^{-1})$ because $l(\bar x)\psi(\bar x) \le l(\bar x) \cdot \delta l_0^{-2} e^{-(\lambda + \varepsilon) r(\bar x)} \le \delta l(\bar x)^{-1}$. Suppose now $\bar x \notin \Lambda_0$ and $n > 0$ is the smallest positive integer such that $\tau^n\bar x \in \Lambda_0$. Then

$$\pi\tau^n P^-(\bar x) \subset \pi P^-(\tau^n\bar x) \subset B(x_n, \psi(\tau^n\bar x)) \subset \Phi_{\tau^n\bar x} R\big(\delta l(\tau^n\bar x)^{-1} e^{-(\lambda + \varepsilon) r(\tau^n\bar x)}\big).$$

Now

$$\begin{aligned}
\pi P^-(\bar x)
&\subset \Phi_{\bar x} \, f_{\tau\bar x}^{-1} \circ \cdots \circ f_{\tau^{n-1}\bar x}^{-1} \circ f_{\tau^n\bar x}^{-1} R\big(\delta l(\tau^n\bar x)^{-1} e^{-(\lambda + \varepsilon) r(\tau^n\bar x)}\big) \\
&\subset \Phi_{\bar x} R\big(\delta l(\tau^n\bar x)^{-1} e^{-(\lambda + \varepsilon) r(\tau^n\bar x)} e^{\lambda n}\big) \\
&\subset \Phi_{\bar x} R(\delta l(\bar x)^{-1}),
\end{aligned}$$

since $n \le r(\tau^n\bar x)$. This completes the proof.

n x) ¯

)

eλn )

f −1 ξ )

Proof of Hµ (ξ | = h µ ( f ). Fix arbitrarily κ > 0. Given {x¯ } and 0 < δ ≤ 1, ¯ µ) take a finite entropy partition P of ( M, ¯ such that P refines π −1 {S, M\S} (where S is the set given in the outline of the proof of Proposition 4.1.1), P is adapted to ({x¯ }, δ) and h µ¯ (τ, P) ≥ h µ¯ (τ ) − κ = h µ ( f ) − κ. Put η1 = ξ¯ ∨ P −

and

η2 = P −

(recall that ξ¯ = π −1 ξ ). Then h µ¯ (τ, η2 ) = h µ¯ (τ, P)

(4.2.1)

h µ¯ (τ, η1 ) = Hµ¯ (ξ¯ | τ −1 ξ¯ ).

(4.2.2)

and


The equality (4.2.1) is straightforward. The proof of (4.2.2) is similar to [2, Lemma 3.2.1] and we present it here for completeness. We have

$$h_{\bar\mu}(\tau,\eta_1) = h_{\bar\mu}(\tau,\bar\xi\vee\tau^{-n}P^-) = H_{\bar\mu}(\bar\xi\vee\tau^{-n}P^- \mid \tau^{-1}\bar\xi\vee\tau^{-(n+1)}P^-)$$
$$= H_{\bar\mu}(\bar\xi \mid \tau^{-1}\bar\xi\vee\tau^{-(n+1)}P^-) + H_{\bar\mu}(P^- \mid \tau^{n}\bar\xi\vee\tau^{-1}P^-),$$

where the first term is $\le H_{\bar\mu}(\bar\xi \mid \tau^{-1}\bar\xi)$ and the second term goes to 0 as $n\to+\infty$ since $\tau^{n}\bar\xi$ goes to the partition of $\bar M$ into single points. On the other hand, $h_{\bar\mu}(\tau,\eta_1) = h_{\bar\mu}(\tau,\bar\xi\vee P) \ge h_{\bar\mu}(\tau,\bar\xi)$, since $H_{\bar\mu}(P)<+\infty$. This proves (4.2.2). We now show that for sufficiently small $\delta>0$ we have

$$P^-(\bar x) = (\bar\xi\vee P^-)(\bar x), \quad \bar\mu\text{-a.e. } \bar x, \tag{4.2.3}$$

which implies

$$H_{\bar\mu}(\bar\xi \mid P^-) = 0. \tag{4.2.4}$$

¯ then y0 ∈ ξ(x0 ). Since In order to prove (4.2.3), it is sufficient to show that, if y¯ ∈ P − (x), P refines π −1 {S, M \ S} and y¯ ∈ P − (x), ¯ it suffices to prove that d s ( f n y0 , f n x0 ) ≤ γ whenever f n x0 ∈ S. This is in fact true for all n ≥ 0 since n −1 d s ( f n y0 , f n x0 ) ≤ K · f x¯n −1 x¯ y0 − f x¯ x¯ x 0 − +2ε)n

≤ K · e(λ

−1 −1 ¯ −1 ≤ γ . x¯ y0 − x¯ x 0 ≤ K · 2δl( x)

Then, by (4.2.4), we know that Hµ (ξ | f −1 ξ ) = h µ¯ (τ, η1 ) ≥ h µ¯ (τ, η2 ) ≥ h µ ( f ) − κ. Since κ > 0 is arbitrary, we get Hµ (ξ | f −1 ξ ) = h µ ( f ). Acknowledgement. The author expresses his sincere thanks to the anonymous referee for careful reading of the manuscript and, in particular, for indicating the arguments in Remark 2.8.

References
1. Ledrappier, F., Strelcyn, J.-M.: A proof of the estimation from below in Pesin's entropy formula. Ergod. Th. Dynam. Syst. 2, 203–219 (1982)
2. Ledrappier, F., Young, L.-S.: The metric entropy of diffeomorphisms. Part I: Characterization of measures satisfying Pesin's formula. Ann. Math. 122, 509–539 (1985)
3. Liu, P.-D.: Ruelle inequality relating entropy, folding entropy and negative Lyapunov exponents. Commun. Math. Phys. 240, 531–538 (2003)
4. Liu, P.-D.: Pesin's entropy formula for endomorphisms. Nagoya Math. J. 150, 197–209 (1998)
5. Liu, P.-D., Qian, M.: Smooth Ergodic Theory of Random Dynamical Systems. Lect. Not. Math. 1606, Berlin-Heidelberg-New York: Springer, 1995
6. Mané, R.: A proof of Pesin's formula. Ergod. Th. Dynam. Syst. 1, 95–102 (1981)
7. Parry, W.: Entropy and Generators in Ergodic Theory. New York: W. A. Benjamin, Inc., 1969
8. Qian, M., Shu, Z.: SRB measures and Pesin's entropy formula for endomorphisms. Trans. Amer. Math. Soc. 354, 1453–1471 (2002)


9. Rokhlin, V.A.: Lectures on the theory of entropy of transformations with invariant measures. Russ. Math. Surv. 22, No. 5, 1–54 (1967)
10. Ruelle, D.: Positivity of entropy production in nonequilibrium statistical mechanics. J. Stat. Phys. 85, Nos. 1/2, 1–23 (1996)
11. Ruelle, D.: An inequality for the entropy of differentiable maps. Bol. Soc. Bras. Math. 9, 83–87 (1978)

Communicated by G. Gallavotti

Commun. Math. Phys. 284, 407–424 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0577-3

Communications in

Mathematical Physics

On Distribution of Energy and Vorticity for Solutions of 2D Navier-Stokes Equation with Small Viscosity

Sergei B. Kuksin1,2

1 C.M.L.S, École Polytechnique, 91128 Palaiseau, France.

E-mail: [email protected]

2 Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, UK

Received: 10 October 2007 / Accepted: 4 April 2008 Published online: 9 August 2008 – © Springer-Verlag 2008

Abstract: We study distributions of some functionals of space-periodic solutions for the randomly perturbed 2D Navier-Stokes equation, and of their limits when the viscosity goes to zero. The results obtained give explicit information on distribution of the velocity field of space-periodic turbulent 2D flows.

0. Introduction

We consider the 2D Navier-Stokes equation (NSE) under periodic boundary conditions, perturbed by a random force:

$$v_\tau - \varepsilon\Delta v + (v\cdot\nabla)v + \nabla\tilde p = \varepsilon^{a}\,\tilde\eta(\tau,x), \qquad \operatorname{div} v = 0, \tag{0.1}$$
$$v = v(\tau,x)\in\mathbb R^2,\quad \tilde p = \tilde p(\tau,x),\quad x\in\mathbb T^2 = \mathbb R^2/(l_1\mathbb Z\times l_2\mathbb Z).$$

Here $0<\varepsilon\ll 1$, the scaling exponent $a$ is a real number and $l_1,l_2>0$. We assume that $a<\tfrac32$ since $a\ge\tfrac32$ corresponds to non-interesting equations with small solutions (see [Kuk06a], Sect. 10.3). It is also assumed that $\int v\,dx \equiv \int\tilde\eta\,dx \equiv 0$ and that the force $\tilde\eta$ is a divergence-free Gaussian random field, white in time and smooth in $x$. Under a mild non-degeneracy assumption on $\tilde\eta$ (see Sect. 1) the Markov process which the equation defines in the function space $\mathcal H$,

$$\mathcal H = \Big\{u(x)\in L^2(\mathbb T^2;\mathbb R^2) \;\Big|\; \operatorname{div}u = 0,\ \int_{\mathbb T^2}u\,dx = 0\Big\},$$

has a unique stationary measure. We are interested in asymptotic (as $\varepsilon\to0$) properties of this measure and of the corresponding stationary solution. The substitution

$$v = \varepsilon^{b}u,\qquad \tau = \varepsilon^{-b}t,\qquad \nu = \varepsilon^{3/2-a},$$

where $b = a-1/2$, reduces Eq. (0.1) to

$$\dot u - \nu\Delta u + (u\cdot\nabla)u + \nabla p = \sqrt{\nu}\,\eta(t,x), \qquad \operatorname{div}u = 0, \tag{0.2}$$

where $\dot u = u_t$ and $\eta(t) = \varepsilon^{b/2}\,\tilde\eta(\varepsilon^{-b}t)$ is a new random field, distributed as $\tilde\eta$ (see [Kuk06a]). Below we study Eq. (0.2). Let $\mu_\nu$ be the unique stationary measure for (0.2) and $u_\nu(t)\in\mathcal H$ be the corresponding stationary solution, i.e., $\mathcal D u_\nu(t)\equiv\mu_\nu$ (here and below $\mathcal D$ signifies the distribution of a random variable). Compared to the other equations (0.1), Eq. (0.2) has a special advantage: when $\nu\to0$ along a subsequence $\{\nu_j\}$, the stationary solution $u_{\nu_j}$ converges in distribution to a stationary process $U(t)\in\mathcal H$, formed by solutions of the Euler equation

$$\dot u(t,x) + (u\cdot\nabla)u + \nabla p = 0, \qquad \operatorname{div}u = 0.$$

(0.3)
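The exponents in the substitution leading from (0.1) to (0.2) can be checked by tracking powers of $\varepsilon$. The sketch below is only bookkeeping; it assumes the usual Brownian scaling of a white-in-time field, $\tilde\eta(ct,\cdot)\overset{d}{=}c^{-1/2}\tilde\eta(t,\cdot)$, and the rescaled pressure $p=\varepsilon^{-2b}\tilde p$, neither of which is spelled out in the text.

```latex
% Substituting v = \varepsilon^{b} u, \tau = \varepsilon^{-b} t into (0.1):
\begin{align*}
v_\tau &= \varepsilon^{b}\,\frac{dt}{d\tau}\,u_t = \varepsilon^{2b}\,u_t, &
\varepsilon\Delta v &= \varepsilon^{1+b}\,\Delta u, &
(v\cdot\nabla)v &= \varepsilon^{2b}\,(u\cdot\nabla)u .
\end{align*}
% Dividing (0.1) by \varepsilon^{2b}, with p = \varepsilon^{-2b}\tilde p:
\[
u_t - \varepsilon^{1-b}\Delta u + (u\cdot\nabla)u + \nabla p
   = \varepsilon^{a-2b}\,\tilde\eta(\varepsilon^{-b}t,x).
\]
% With b = a - 1/2:
\[
\varepsilon^{1-b} = \varepsilon^{3/2-a} = \nu,
\qquad
\varepsilon^{a-2b} = \varepsilon^{1-a}.
\]
% White-noise scaling with c = \varepsilon^{-b} (equality in distribution):
\[
\varepsilon^{1-a}\,\tilde\eta(\varepsilon^{-b}t,x)
  \overset{d}{=} \varepsilon^{1-a+b/2}\,\tilde\eta(t,x)
  = \varepsilon^{3/4-a/2}\,\tilde\eta(t,x)
  = \sqrt{\nu}\,\tilde\eta(t,x).
\]
```

So the viscosity and the force amplitude are tied together as $\nu$ and $\sqrt\nu$, and $\nu\to0$ precisely when $a<3/2$.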

Accordingly, $\mu_{\nu_j}\rightharpoonup\mu_0$, where $\mu_0 = \mathcal D U(0)$ is an invariant measure for (0.3) (see Theorem 1.1 below). The solution $U$ is called the Eulerian limit. This is a random process of order one since $\mathbf E|\nabla_xU(t,\cdot)|^2_{\mathcal H}$ equals an explicit non-zero constant. The goal of this paper is to study properties of the measure $\mu_0$ since they are responsible for asymptotic properties of solutions for Eq. (0.1).

The first main difficulty in this study is to rule out the possibility that with a positive probability the energy $E(u_\nu)$ of the process $u_\nu$, equal to $\tfrac12\int|u_\nu(t,x)|^2\,dx$, becomes very small with $\nu$ (and that the energy of the Eulerian limit vanishes with a positive probability). In Sect. 2 we show that this is not the case and that

$$\mathbf P\{E(u_\nu)<\delta\}\le C\delta^{1/4}, \qquad \forall\,\delta>0, \tag{0.4}$$

for each $\nu$. To prove the estimate we develop further some ideas, exploited in [KP08] in a similar situation. Namely, we construct a new process $\tilde u_\nu\in\mathcal H$, coupled to the process $u_\nu$, such that $E(\tilde u_\nu(\tau)) = E(u_\nu(\tau\nu^{-1}))$ and $\tilde u_\nu$ satisfies an Ito equation, independent from $\nu$. Next we use Krylov's result [Kry87] on distribution of Ito integrals to estimate $\mathcal D\tilde u_\nu(\tau)$ and recover (0.4).

In Sect. 3 we use (0.4) to prove that the distribution of energy of the Eulerian limit $U$ has a density against the Lebesgue measure, i.e. $\mathcal D E(U) = e(x)\,dx$, $e\in L^1(\mathbb R_+)$.

The functionals $f(u(\cdot)) = \int f(\operatorname{rot}u(x))\,dx$ are integrals of motion for the Euler equation. An analogy with the averaging theory for finite-dimensional stochastic equations (e.g., see [FW03]) suggests that their distributions behave well when $\nu\to0$. Accordingly, in Sect. 4 we study the distributions of vector-valued random variables

$$\mathbf f(u_\nu(t)) = \big(f_1(u_\nu(t)),\dots,f_m(u_\nu(t))\big)\in\mathbb R^m,$$

and of $\mathbf f(U(t))$. Assuming that the functions $f_j$ are analytic, linearly independent and satisfy a certain restriction on growth, we show that the distribution of $\mathbf f(U(t))$ has a density against the Lebesgue measure:

$$\mathcal D\big(\mathbf f(U(t))\big) = p_{\mathbf f}(x)\,dx, \qquad p_{\mathbf f}\in L^1(\mathbb R^m).$$

To prove this result we show that the measures $\mathcal D\,\mathbf f(u_\nu(t))$ are absolutely continuous with respect to the Lebesgue measure, uniformly in $\nu$. The proof crucially uses (0.4) as well as the bounds on exponential moments of the random variables $\operatorname{rot}(u_\nu(t,x))$, uniform in $\nu$, obtained in [Kuk06b].


Since $m$ is arbitrary, this result implies that the measure $\mu_0$ is genuinely infinite dimensional in the sense that any compact set of finite Hausdorff dimension has zero $\mu_0$-measure.

Other equations. The results and the methods of this work apply to other PDEs of the form

$$\langle\text{Hamiltonian equation}\rangle + \nu\,\langle\text{dissipation}\rangle = \sqrt{\nu}\,\langle\text{random force}\rangle, \tag{0.5}$$

provided that the corresponding Hamiltonian PDE has at least two 'good' integrals of motion. In particular, they apply to the randomly forced complex Ginzburg-Landau equation

$$\dot u - (\nu+i)\Delta u + i|u|^2u = \sqrt{\nu}\,\eta(t,x), \qquad \dim x\le4, \tag{0.6}$$

supplemented with the odd periodic boundary conditions. The corresponding Hamiltonian PDE is the NLS equation, which has two 'good' integrals: the Hamiltonian $H$ and the total number of particles $E = \tfrac12\int|u|^2\,dx$. Equation (0.6) was considered in [KS04], where it was proved that for stationary in time solutions $u_\nu$ of (0.6) an inviscid limit $V(t)$ (as $\nu\to0$ along a subsequence) exists and possesses properties similar to those stated in Theorem 1.1. The methods of this work allow one to prove that the random variable $E(u_\nu(t))$ satisfies (0.4) uniformly in $\nu>0$, that $H(u_\nu(t))$ meets similar estimates and that $V$ is distributed in such a way that $\mathcal D(H(V(t)))$ and $\mathcal D(E(V(t)))$ are absolutely continuous with respect to the Lebesgue measure. If $\dim x = 1$, then the NLS equation is integrable and the inviscid limit $V$ may be analysed further, using the methods developed in [KP08] to study the damped/driven KdV equation (which is another example of the system (0.5)).

Our methods also apply to some finite-dimensional systems of the form (0.5), in particular to Galerkin approximations for the 3D NSE under periodic boundary conditions, perturbed by a random force similar to (1.2). It is easy to establish for that system analogies of the results in Sects. 1–3. A more interesting example is given by system (0.5), where the Hamiltonian equation is the Euler equation for a rotating solid body [Arn89]. This system can be cautiously regarded as a finite-dimensional model for (0.1); see Appendix.¹

1. Preliminaries

Using the Leray projector $\Pi: L^2(\mathbb T^2;\mathbb R^2)\to\mathcal H$ we rewrite Eq. (0.2) as the equation for $u(t) = u(t,\cdot)\in\mathcal H$:

$$\dot u + \nu A(u) + B(u) = \sqrt{\nu}\,\eta(t). \tag{1.1}$$

Here $A(u) = -\Pi\Delta u$ and $B(u) = \Pi(u\cdot\nabla)u$. We denote by $\|\cdot\|$ and by $(\cdot,\cdot)$ the $L^2$-norm and scalar product in $\mathcal H$. Let $(e_s,\ s\in\mathbb Z^2\setminus\{0\})$ be the standard trigonometric basis of this space:

$$e_s(x) = \frac{f\big(s_1\tfrac{2\pi}{l_1}x_1 + s_2\tfrac{2\pi}{l_2}x_2\big)}{\sqrt{\tfrac{l_2}{2l_1}s_1^2 + \tfrac{l_1}{2l_2}s_2^2}}\begin{pmatrix}-s_2/l_2\\ s_1/l_1\end{pmatrix},$$

¹ We are thankful to V. V. Kozlov and members of his seminar in MSU for drawing our attention to this equation.


where $f = \sin$ or $f = \cos$, depending on whether $s_1 + s_2\delta_{s_1,0}>0$ or $s_1 + s_2\delta_{s_1,0}<0$. Then $\|e_s\| = 1$ and

$$Ae_s = \lambda_se_s, \qquad \lambda_s = (2\pi)^2\big((s_1/l_1)^2 + (s_2/l_2)^2\big), \qquad \forall s.$$

The force $\eta$ is assumed to be a Gaussian random field, white in time and smooth in $x$:

$$\eta = \frac{d}{dt}\zeta(t,x), \qquad \zeta = \sum_{s\in\mathbb Z^2\setminus\{0\}} b_s\beta_s(t)e_s(x), \tag{1.2}$$

where $\{b_s\}$ is a set of real constants, satisfying

$$\sum|s|^2b_s^2<\infty, \qquad b_s = b_{-s}\ne0 \quad \forall s,$$

and $\{\beta_s(t)\}$ are standard independent Wiener processes. This equation is known to have a unique stationary measure $\mu_\nu$.² This is a probability Borel measure in the space $\mathcal H$ which attracts distributions of all solutions for (1.1) as $t\to\infty$ (e.g., see [Kuk06a]). Let $u_\nu(t,x)$ be a corresponding stationary solution, i.e. $\mathcal D u_\nu(t)\equiv\mu_\nu$. Apart from being stationary in $t$, this solution is known to be stationary (= homogeneous) in $x$.

For any $l\ge0$ we denote by $\mathcal H^l$ the Sobolev space $\mathcal H\cap H^l(\mathbb T^2;\mathbb R^2)$, given the norm

$$\|u\|_l = \Big(\int\big|(-\Delta)^{l/2}u(x)\big|^2\,dx\Big)^{1/2} \tag{1.3}$$

(so $\|u\|_0 = \|u\|$). A straightforward application of Ito's formula to $\|u_\nu(t)\|^2$ and $\|u_\nu(t)\|_1^2$ implies that

$$\mathbf E\|u_\nu(t)\|_1^2\equiv\tfrac12B_0, \qquad \mathbf E\|u_\nu(t)\|_2^2\equiv\tfrac12B_1, \tag{1.4}$$

where for $l\in\mathbb R$ we denote $B_l = \sum|s|^{2l}b_s^2$ (note that $B_0,B_1<\infty$ by assumption); e.g. see [Kuk06a].

The theorem below describes what happens to the stationary solutions $u_\nu(t,x)$ as $\nu\to0$. For the theorem's proof see [Kuk06a] (there the result is stated for the square torus $\mathbb R^2/(2\pi\mathbb Z^2)$, but the proof does not use this assumption).

Theorem 1.1. Any sequence $\tilde\nu_j\to0$ contains a subsequence $\nu_j\to0$ such that

$$\mathcal D u_{\nu_j}(\cdot)\rightharpoonup\mathcal D U(\cdot) \quad\text{in } \mathcal P\big(C(0,\infty;\mathcal H^1)\big). \tag{1.5}$$

² If $\mathbb T^2$ is a square torus $\mathbb R^2/(l\mathbb Z^2)$, then by the results of [HM06] the stationary measure $\mu_\nu$ is unique if $b_s\ne0$ for $|s|\le N$, where $N$ is a $\nu$-independent constant. Accordingly, if $\mathbb T^2$ is a square torus, then Theorems 1.1 and 2.1 below remain true under this weaker assumption on the numbers $b_s$. But our arguments in Sects. 3, 4 use essentially that all coefficients $b_s$ are non-zero.


The limiting process $U(t)\in\mathcal H^1$, $U(t) = U(t,x)$, is stationary in $t$ and in $x$. Moreover,

1) a) Every trajectory $U(t,x)$ is such that $U(\cdot)\in L^2_{\rm loc}(0,\infty;\mathcal H^2)$, $\dot U(\cdot)\in L^1_{\rm loc}(0,\infty;\mathcal H^1)$.
b) It satisfies the free Euler equation (0.3), so $\mu_0 = \mathcal D(U(0))$ is an invariant measure for (0.3).
c) $\|U(t)\|_0$ and $\|U(t)\|_1$ are time-independent quantities. If $g$ is a bounded continuous function, then $\int_{\mathbb T^2}g(\operatorname{rot}U(t,x))\,dx$ also is a time-independent quantity.

2) For each $t\ge0$ we have $\mathbf E\|U(t)\|_1^2 = \tfrac12B_0$, $\mathbf E\|U(t)\|_2^2\le\tfrac12B_1$ and $\mathbf E\exp\big(\sigma\|U(t)\|_1^2\big)\le C$ for some $\sigma>0$, $C\ge1$.

Amplification. If $B_2<\infty$, then the convergence (1.5) holds in the space $\mathcal P(C(0,\infty;\mathcal H^\kappa))$, for any $\kappa<2$. See [Kuk06a], Remark 10.4.

Due to 1b), the measure $\mu_0 = \mathcal D U(0)$ is invariant for the Euler equation. By 2) it is supported by the space $\mathcal H^2$ and is not a $\delta$-measure at the origin. The process $U$ is called the Eulerian limit for the stationary solutions $u_\nu$ of (1.1). Note that a priori the process $U$ and the measure $\mu_0$ depend on the sequence $\nu_j$.

Since $\|u\|_1^2\le\|u\|_0\|u\|_2$ and $\mathbf E\|u\|_1^2\le(\mathbf E\|u\|_0^2)^{1/2}(\mathbf E\|u\|_2^2)^{1/2}$, then (1.4) implies that

$$\tfrac12B_0^2B_1^{-1}\le\mathbf E\|u_\nu(t)\|_0^2\le\tfrac12B_1 \tag{1.6}$$

for all $\nu$. That is, the characteristic size of the solution $u_\nu$ remains $\sim1$ when $\nu\to0$. Since the characteristic space-scale also is $\sim1$, then the Reynolds number of $u_\nu$ grows as $\nu^{-1}$ when $\nu$ decays to zero. Hence, Theorem 1.1 describes a transition to turbulence for space-periodic 2D flows, stationary in time. Recall that Eq. (0.2) is the only 2D NSE (0.1), having a limit of order one as $\nu\to0$ (cf. [Kuk06a], Sect. 10.3). Thus the various Eulerian limits as in Theorem 1.1 with different coefficients $\{b_s\}$ (corresponding to different spectra of the applied random forces) describe all possible 2D space-periodic stationary turbulent flows.

Our goal is to study further properties of the Eulerian limits.

2. Estimate for Energy of Solutions

2.1. The result. The energy $E_\nu(t) = \tfrac12\|u_\nu(t)\|_0^2$ of a stationary solution $u_\nu$, $\nu\in(0,1]$, is a stationary process. It satisfies the relations

$$\tfrac14B_0^2B_1^{-1}\le\mathbf E E_\nu(t) = \tfrac14B_0, \qquad \mathbf E\exp(\sigma E_\nu(t))\le C, \tag{2.1}$$

where $\sigma,C>0$ are independent from $\nu$ (see (1.6) and [Kuk06a], Sect. 4.3). The energy $E_0(t)$ of the Eulerian limit $U$ also meets (2.1).

Let $\{|\tilde b_j|,\ j\in\mathbb N\}$ be the rearrangement of the numbers $\{|b_s|,\ s\in\mathbb Z^2\setminus0\}$ in decreasing order: $|\tilde b_1|\ge|\tilde b_2|\ge\dots$.


Theorem 2.1. Assume that $B_2<\infty$. Then there exists a constant $C>0$, depending only on $B_1$ and $|\tilde b_2|$, such that

$$\mathbf P\{E_\nu(t)<\delta\}\le C\delta^{1/4}, \tag{2.2}$$

uniformly in $\nu\in(0,1]$.

Due to the convergence (1.5), the energy $E_0(t) = \tfrac12\|U(t)\|_0^2$ of the Eulerian limit also satisfies (2.2). Introducing the fast time $\tau = \nu t$ we get for $u(\tau) = u(\tau,x)$ the equation

$$du(\tau) = \big(-Au - \nu^{-1}B(u)\big)\,d\tau + \sum_sb_se_s\,d\beta_s(\tau), \tag{2.3}$$

where $\{\beta_s(\tau) := \sqrt\nu\,\beta_s(\nu^{-1}\tau),\ s\in\mathbb Z^2\setminus0\}$ are new standard independent Wiener processes.

2.2. Beginning of proof. The proof goes in five steps. We start with a geometrical lemma which is used below in the heart of the construction. Let us denote by $S$ the sphere $\{u\in\mathcal H \mid \|u\|_0 = 1\}$. Let $\{e_j,\ j\ge1\}$ be the basis $\{e_s,\ s\in\mathbb Z^2\setminus\{0\}\}$, re-parameterised by natural numbers in such a way that $e_j = e_{s(j)}$, where $\lambda_{s(j)}\ge\lambda_{s(i)}$ if $j\ge i$.

Lemma 2.2. There exists $\delta>0$ with the following property. Let $v_0,\tilde v_0$ be any two points in $S$. Then for $(v,\tilde v)\in S\times S$ such that

$$\|v-v_0\|_0<\delta, \qquad \|\tilde v-\tilde v_0\|_0<\delta \tag{2.4}$$

there exists a unitary operator $U_{(v,\tilde v)} = U^{(v_0,\tilde v_0)}_{(v,\tilde v)}$ of the space $\mathcal H$, satisfying

i) $U$ is an operator-valued Lipschitz function of $v$ and $\tilde v$ with a Lipschitz constant $\le2$;
ii) $U_{(v,\tilde v)}(\tilde v) = v$;
iii) there exists a unit vector $\eta = \eta(v,\tilde v)$ in the plane $\operatorname{span}\{e_1,e_2\}$ such that the vector $U_{(v,\tilde v)}(\eta)$ makes with this plane an angle $\le\pi/4$. Accordingly,

$$\max_{i,j\in\{1,2\}}\big|(U_{(v,\tilde v)}e_i,e_j)\big|\ge c_*, \tag{2.5}$$

where $c_*>0$ is an absolute constant.

Proof. Let us start with the following observation: there exists $\delta>0$ such that for any $v_0\in S$ and $v_1\in\{v\in S \mid \|v-v_0\|_0<\delta\}$ there exists a unitary transformation $W_{v_1,v_0}$ of the space $\mathcal H$ with the following property: $W_{v_0,v_0} = \mathrm{id}$, $W_{v_1,v_0}(v_0) = v_1$ and $W$ is a Lipschitz function of $v_1$ and $v_0$ with a Lipschitz constant $\le2$.

To prove the assertion let us denote by $\mathcal A$ the linear space of bounded anti self-adjoint operators in $\mathcal H$ (given the operator norm), and consider the map

$$\mathcal A\times S\to S, \qquad (A,v)\mapsto e^Av.$$

Note that the differential of this map in $A$, evaluated at $A = 0$, $v = v_0$, is the map $A\mapsto Av_0$, which sends $\mathcal A$ to the space $T_{v_0}S = \{v\in\mathcal H \mid (v,v_0) = 0\}$ and admits a right inverse operator of unit norm. So the assertion with $W = e^A$, where $A$ satisfies the equation $e^Av_0 = v_1$, follows from the implicit function theorem.

To prove the lemma we choose unit vectors $\eta_0,\tilde\eta_0\in\operatorname{span}\{e_1,e_2\}$ such that $(v_0,\eta_0) = 0$ and $(\tilde v_0,\tilde\eta_0) = 0$. Next we choose a unitary transformation $U$ such that $U(\tilde v_0) = v_0$ and $U(\tilde\eta_0) = \eta_0$. For vectors $v,\tilde v$ satisfying (2.4), denote $U(\tilde v) = \tilde\xi$. Then $\|\tilde\xi-v_0\|_0<\delta$. Let $W_{v,\tilde\xi}$ be the operator from the assertion above. We set $U_{v,\tilde v} = W_{v,\tilde\xi}\circ U$. This operator obviously satisfies i) and ii). Since $\|U_{v,\tilde v}(\tilde\eta_0)-\eta_0\|_0\le C\delta$, then choosing $\delta<C^{-1}2^{-1/2}$ we achieve iii) with $\eta = \tilde\eta_0$.

Remark. Let $j_1$ and $j_2$ be any two different natural numbers. The same arguments as above prove existence of a unitary operator $U$, satisfying i), ii) and such that $\max_{i\in\{1,2\},\,j\in\{j_1,j_2\}}|(Ue_i,e_j)|\ge c_*$.

For any $(v_0,\tilde v_0)\in S\times S$, let $O_\delta(v_0,\tilde v_0)\subset S\times S$ be the open domain, formed by all pairs $(v,\tilde v)$ satisfying (2.4). Let $O_1,O_2,\dots$ be a countable system of domains $O_{\delta/2}(v_j,\tilde v_j) =: O_j$, $j\ge1$, which cover $S\times S$. We call $(v_j,\tilde v_j)$ the centre of a domain $O_j$. Consider the mapping

$$S\times S\to\mathbb N, \qquad (v,\tilde v)\mapsto n(v,\tilde v) = \min\{j \mid (v,\tilde v)\in O_j\}. \tag{2.6}$$

It is measurable with respect to the Borel sigma-algebras. Finally, for $j = 1,2,\dots$ and $(v,\tilde v)\in O_j$ we define the operators

$$U^j_{v,\tilde v} = U^{(v_j,\tilde v_j)}_{v,\tilde v}.$$
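The Borel measurability of the index map (2.6) follows from a one-line description of its level sets. The sketch below records it; the only input is that each $O_j$ is open in $S\times S$.

```latex
% Level sets of the index map n from (2.6):
\[
\{(v,\tilde v)\in S\times S : n(v,\tilde v) = j\}
   \;=\; O_j\setminus\bigcup_{i<j}O_i , \qquad j\ge1 .
\]
% Each O_i is open, so every level set is the difference of two open sets,
% hence Borel. The level sets are pairwise disjoint and cover S x S because
% the O_j do, so n is a Borel map into the countable discrete space N.
```

In particular, the random indices $n_0, n_1,\dots$ defined from $n$ in Step 1 are genuine random variables.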

2.3. Step 1: Equation for $\tilde u(\tau)$. Till the end of Sect. 2 for any $u\in\mathcal H$ we will denote

$$v = v(u) = u/\|u\|_0 \ \text{ if } u\ne0 \quad\text{and}\quad v = e_1 \ \text{ if } u = 0. \tag{2.7}$$

Let us fix any $T_0>0$. We start to construct a process $\tilde u(\tau)$, $0\le\tau\le T_0$, with continuous trajectories, satisfying $\|\tilde u(\tau)\|_0\equiv\|u(\tau)\|_0$. The process will be constructed as a solution of a stochastic equation, in terms of some stopping times $0 = \tau_0\le\tau_1<\tau_2<\dots$.

We set $\tau_0 = 0$ and define a random variable $n_0 = n(v(0),v(0))\in\mathbb N$ (see (2.6)). Let us consider the following stochastic equation for $\mathbf u(\tau) = (u(\tau),\tilde u(\tau))\in\mathcal H\times\mathcal H$:

$$du(\tau) = \big(-Au - \nu^{-1}B(u)\big)\,d\tau + \sum_sb_se_s\,d\beta_s(\tau), \tag{2.8}$$
$$d\tilde u(\tau) = -U^*_{\mathbf u}Au\,d\tau + \sum_sU^*_{\mathbf u}b_se_s\,d\beta_s(\tau). \tag{2.9}$$

Here $U^*_{\mathbf u}$ is the adjoint to the unitary operator $U_{\mathbf u} = U^{n_0(\omega)}_{v,\tilde v}$, where $v = v(u)$ and $\tilde v = v(\tilde u)$, see (2.7). Let us fix any $\gamma\in(0,1]$ and define the stopping times

$$T_\gamma = \inf\{\tau\in[0,T_0] \mid \|u(\tau)\|_0\wedge\|\tilde u(\tau)\|_0\le\gamma \ \text{ or } \ \|u(\tau)\|_2\ge\gamma^{-1}\},$$
$$\tau_1 = \inf\{\tau\in[0,T_0] \mid \mathbf u(\tau)\notin O_\delta(v_{n_0},\tilde v_{n_0})\}\wedge T_\gamma.$$

Here and in similar situations below, $\inf\emptyset = T_0$, and $(v_{n_0},\tilde v_{n_0})$ is the centre of the domain $O_{n_0}$.

For $0\le\tau\le\tau_1$ the operator $U_{\mathbf u}$ is a Lipschitz function of $\mathbf u$ since $\|u\|_0\ge\gamma$ and $\|\tilde u\|_0\ge\gamma$. As $\|u(\tau)\|_2\le\gamma^{-1}$ for $\tau\le T_\gamma$, then it is not hard to see that the system (2.8), (2.9), supplemented with the initial condition

$$\mathbf u(0) = (u(0),u(0)), \tag{2.10}$$

has a unique strong solution $\mathbf u(\tau)$, $0\le\tau\le\tau_1$, satisfying

$$\mathbf E\sup_{0\le\tau\le\tau_1}\|\tilde u(\tau)\|_0^2\le C(T_0,\nu,\gamma). \tag{2.11}$$

Next we set $n_1 = n(v(\tau_1),\tilde v(\tau_1))$ and for $\tau\ge\tau_1$ re-define the operator $U_{\mathbf u}$ in (2.9) as $U^{n_1(\omega)}_{v,\tilde v}$ (as before, $v = v(u(\tau))$ and $\tilde v = v(\tilde u(\tau))$). We set

$$\tau_2 = \inf\{\tau\in[\tau_1,T_0] \mid \mathbf u(\tau)\notin O_\delta(v_{n_1},\tilde v_{n_1})\}\wedge T_\gamma,$$

where $(v_{n_1},\tilde v_{n_1})$ is the centre of $O_{n_1}$, and consider the system (2.8), (2.9) for $\tau_1\le\tau\le\tau_2$ with the initial condition at $\tau_1$, obtained by continuity. The system has a unique strong solution and (2.11) holds with $\tau_1$ replaced by $\tau_2$.

Iterating this construction we obtain stopping times $\tau_0\le\tau_1\le\tau_2\le\dots$, the operator $U_{\mathbf u}(\tau)$, piecewise constant in $\tau$ and discontinuous at points $\tau = \tau_j$, as well as a strong solution $\mathbf u(\tau)$ of (2.8)–(2.10), defined for $0\le\tau<\lim_{j\to\infty}\tau_j\le T_\gamma$, and satisfying (2.11) with $\tau_1$ replaced by any $\tau_j$. Clearly $\tau_j<\tau_{j+1}$, unless $\tau_j = \tau_{j+1} = T_\gamma$.

2.4. Step 2: Growth of stopping times $\tau_j$. For any $\tau\ge0$ and $N\ge1$ let us write $\tilde u(\tau\wedge T_N)$ as

$$\tilde u(\tau\wedge T_N) = u(0) - \int_0^{\tau\wedge T_N}U^*A(u)\,d\theta + \int_0^{\tau\wedge T_N}\sum_sb_sU^*e_s\,d\beta_s =: \tilde u_1(\tau) + \tilde u_2(\tau).$$

Since $\|u\|_2\le\gamma^{-1}$, then the process $\tilde u_1(\tau)\in\mathcal H$ is Lipschitz in $\tau$. A straightforward application of the Kolmogorov criterion implies that the process $\tilde u_2(\tau)\in\mathcal H$ a.s. satisfies the Hölder condition with the exponent $1/3$. So the process $\tilde u(\tau\wedge T_N)$ is a.s. Hölder. The process $u(\tau\wedge T_N)$ is Hölder as well, so

$$\|\mathbf u((\tau_j+\Delta)\wedge T_N;\omega) - \mathbf u(\tau_j;\omega)\|_0\le K(\omega)\Delta^{1/3}$$

($K(\omega)$ is independent from $N$). Since $\|\mathbf u(\tau_{j+1}) - \mathbf u(\tau_j)\|_0\ge\delta/2$ unless $\tau_{j+1} = T_\gamma$, then $|\tau_{j+1}-\tau_j|\ge(\delta/2K(\omega))^3$ or $\tau_{j+1} = T_\gamma$. As $\tau_j\le T_\gamma\le T_0$, then

$$\tau_j = T_\gamma \quad\text{for } j\ge j(\gamma;\omega), \tag{2.12}$$

where $j(\gamma)<\infty$ a.s.

For any $0<\gamma\le1$ we have constructed a process $\mathbf u(\tau) = (u(\tau),\tilde u(\tau))$, $\tau\in[0,T_\gamma]$, satisfying (2.8)–(2.10), where the operator $U_{\mathbf u}$ is a piecewise constant function of $\tau$.
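The passage from the Hölder bound to (2.12) is a counting argument. The following sketch makes it explicit, with $K=K(\omega)$ the Hölder constant from Step 2 and writing $g$ for the resulting minimal gap between consecutive stopping times ($g$ is notation introduced here only for illustration).

```latex
% If \tau_{j+1} < T_\gamma, the pair of normalised vectors has moved by at
% least \delta/2 between \tau_j and \tau_{j+1}, while the 1/3-Hoelder bound gives
\[
\frac{\delta}{2} \;\le\; K\,(\tau_{j+1}-\tau_j)^{1/3}
\quad\Longrightarrow\quad
\tau_{j+1}-\tau_j \;\ge\; g := \Big(\frac{\delta}{2K}\Big)^{3}.
\]
% If \tau_j < T_\gamma for some j >= 1, then each of the j gaps
% \tau_1-\tau_0, ..., \tau_j-\tau_{j-1} is at least g, so
\[
j\,g \;\le\; \tau_j \;\le\; T_0
\quad\Longrightarrow\quad
j \;\le\; T_0\Big(\frac{2K}{\delta}\Big)^{3}.
\]
% Hence \tau_j = T_\gamma as soon as j > T_0 (2K(\omega)/\delta)^3,
% which is an admissible choice of j(\gamma;\omega) in (2.12).
```

Since $K(\omega)<\infty$ almost surely, this threshold is almost surely finite, as claimed.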


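2.5. Step 3: $\|\tilde u(\tau)\|_0\equiv\|u(\tau)\|_0$ for $\tau\le T_\gamma$. For $j = 0,1,\dots$ we will prove the following assertion:

$$\text{if } \|\tilde u(\tau_j)\|_0 = \|u(\tau_j)\|_0 \text{ a.s., then } \|\tilde u(\tau)\|_0 = \|u(\tau)\|_0 \text{ for } \tau_j\le\tau\le\tau_{j+1}, \text{ a.s.} \tag{2.13}$$

Since $\tilde u(\tau_0) = u(\tau_0)$, then (2.12) and (2.13) would imply that

$$\|\tilde u(\tau)\|_0 = \|u(\tau)\|_0 \qquad \forall\,0\le\tau\le T_\gamma, \tag{2.14}$$

for any $\gamma>0$. To prove (2.13) we consider (following Lemma 7.1 in [KP08]) the quantities $E(\tau) = \tfrac12\|u(\tau)\|_0^2$ and $\tilde E(\tau) = \tfrac12\|\tilde u(\tau)\|_0^2$. Due to Ito's formula we have

$$dE = (u,-Au)\,d\tau + \tfrac12B_0\,d\tau + \Big(u,\sum_sb_se_s\,d\beta_s(\tau)\Big)$$

and

$$d\tilde E = (\tilde u,-U^*Au)\,d\tau + \tfrac12\sum_sb_s^2|U^*e_s|^2\,d\tau + \Big(\tilde u,\sum_sb_s(U^*e_s)\,d\beta_s(\tau)\Big)$$
$$= \frac{\|\tilde u\|_0}{\|u\|_0}(u,-Au)\,d\tau + \tfrac12B_0\,d\tau + \frac{\|\tilde u\|_0}{\|u\|_0}\Big(u,\sum_sb_se_s\,d\beta_s(\tau)\Big).$$

Therefore,

$$d(E-\tilde E)^2 = 2(E-\tilde E)\,\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\,(u,-Au)\,d\tau + \Big(\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\Big)^2\sum_sb_s^2(u,e_s)^2\,d\tau + dM_\tau,$$

where $M_\tau$ stands for the corresponding stochastic integral.

For $0\le\tau\le T_\gamma$ let us denote $J(\tau) = (E-\tilde E)^2\big((\tau\vee\tau_j)\wedge\tau_{j+1}\big)$. Then

$$\frac{d}{d\tau}\mathbf E J(\tau) = 2\,\mathbf E\Big[(E-\tilde E)\,\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\,(u,-Au)\,I_{\tau_j\le\tau\le\tau_{j+1}}\Big] + \mathbf E\Big[\Big(\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\Big)^2\sum_sb_s^2(u,e_s)^2\,I_{\tau_j\le\tau\le\tau_{j+1}}\Big].$$

Since $\|u\|_0-\|\tilde u\|_0 = \dfrac{2(E-\tilde E)}{\|u\|_0+\|\tilde u\|_0}$, $\|u\|_0,\|\tilde u\|_0\ge\gamma$ and $|(u,-Au)|\le\gamma^{-2}$, then $\frac{d}{d\tau}\mathbf E J(\tau)\le C_\gamma\,\mathbf E J(\tau)$. As $J(0) = 0$, then $\mathbf E J(\tau)\equiv0$ and (2.13) is established. Accordingly (2.14) also is proved.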
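The Gronwall step closing Step 3 can be made explicit. The sketch below spells out one admissible constant $C_\gamma$; it assumes in addition that $\|u\|_0\le\|u\|_2$ (true when all $\lambda_s\ge1$), which is used only to bound $\sum_s b_s^2(u,e_s)^2$, and it abbreviates $b_{\max}^2=\max_s b_s^2$ (notation introduced here).

```latex
% On [\tau_j,\tau_{j+1}] with \tau \le T_\gamma:
%   \|u\|_0, \|\tilde u\|_0 \ge \gamma, \qquad \|u\|_2 \le \gamma^{-1}.
\[
\Big|\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\Big|
  = \frac{2\,|E-\tilde E|}{(\|u\|_0+\|\tilde u\|_0)\,\|u\|_0}
  \;\le\; \frac{|E-\tilde E|}{\gamma^{2}} .
\]
% Drift term, using |(u,-Au)| \le \gamma^{-2}:
\[
\Big|\,2(E-\tilde E)\,\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\,(u,-Au)\Big|
  \;\le\; 2\gamma^{-4}\,(E-\tilde E)^2 .
\]
% Ito correction, using \sum_s b_s^2 (u,e_s)^2 \le b_{\max}^2 \|u\|_0^2
%                                              \le b_{\max}^2 \gamma^{-2}:
\[
\Big(\frac{\|u\|_0-\|\tilde u\|_0}{\|u\|_0}\Big)^{2}\sum_s b_s^2 (u,e_s)^2
  \;\le\; b_{\max}^2\,\gamma^{-6}\,(E-\tilde E)^2 .
\]
% Hence, with C_\gamma = 2\gamma^{-4} + b_{\max}^2\gamma^{-6},
\[
\frac{d}{d\tau}\,\mathbf E J(\tau) \le C_\gamma\,\mathbf E J(\tau),
\qquad \mathbf E J(0)=0
\quad\Longrightarrow\quad
\mathbf E J(\tau) \le e^{C_\gamma \tau}\,\mathbf E J(0) = 0 .
\]
```

Because $J\ge0$, the vanishing of $\mathbf E J(\tau)$ forces $E=\tilde E$ almost surely on the interval, which is (2.13).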


2.6. Step 4: Limit $\gamma\to0$. Since $B_2<\infty$, then $u(\tau)$ satisfies the $\gamma$-independent estimate

$$\mathbf E\sup_{0\le\tau\le T_0}\|u(\tau)\|_2\le C(T_0,\nu)$$

(see [Kuk06a], Sect. 4.3). Accordingly

$$\mathbf P\Big\{\sup_{0\le\tau\le T_0}\|u(\tau)\|_2\le\gamma^{-1}\Big\}\to1 \quad\text{as } \gamma\to0. \tag{2.15}$$

Denote by $\hat u(\tau)$ the two-vector $(u_1(\tau),u_2(\tau))$, where $u(\tau) = \sum u_j(\tau)e_j$ (we recall that $e_1,e_2,\dots$ are the basis vectors $e_s$, re-parameterised by natural numbers). Then

$$\hat u_j(\tau) = u_j(0) + \int_0^\tau F_j\,ds + b_j\beta_j(\tau), \qquad j = 1,2,$$

where $F_j$ is the $j$-th component of the drift in (2.3). Since $\hat u$ is a stationary process, then $\mathbf P\{\hat u(0) = 0\} = 0$ (this follows, say, from Krylov's result, used in the next subsection). Setting $F_j^R = F_j\wedge R$, we denote by $\hat u^R(\tau)\in\mathbb R^2$ the process

$$\hat u_j^R(\tau) = u_j(0) + \int_0^\tau F_j^R\,ds + b_j\beta_j(\tau), \qquad j = 1,2.$$

By the Girsanov theorem, the distribution of the process $\hat u^R(\tau)$, $0\le\tau\le T_0$, is absolutely continuous with respect to that of the process $(b_1\beta_1,b_2\beta_2) + \hat u(0)$. Therefore

$$\mathbf P\Big\{\min_{0\le\tau\le T_0}|\hat u^R(\tau)| = 0\Big\} = 0, \tag{2.16}$$

for any $R$. Since $\max_{0\le\tau\le T_0}|\hat u^R(\tau)-\hat u(\tau)|\to0$ as $R\to\infty$ in probability, then the process $\hat u(\tau)$ also satisfies (2.16). Jointly with (2.15) this implies that

$$\mathbf P\{T_\gamma = T_0\}\to1 \quad\text{as } \gamma\to0,$$

and we derive from (2.14) the relation

$$\|\tilde u(\tau)\|_0 = \|u(\tau)\|_0 \qquad \forall\,0\le\tau\le T_0, \ \text{a.s.}$$

2.7. Step 5: End of proof. The advantage of the process $\tilde u$ compared to $u$ is that it satisfies the $\nu$-independent Ito Eq. (2.9). Let us consider the first two components of the process:

$$d\tilde u_j = -\big(U^*_{u,\tilde u}(\tau)A(u)\big)_j\,d\tau + \sum_{l=1}^\infty\big(U^*_{u,\tilde u}(\tau)\big)_{jl}\,b_l\,d\beta_l(\tau), \tag{2.17}$$

where $j = 1,2$. Denoting $a_j(\tau) = \sum_{l=1}^\infty(U^*_{jl}b_l)^2 = \sum_{l=1}^\infty(U_{lj}b_l)^2$ and using (2.5) we find that a.s.

$$C\ge a_1(\tau)+a_2(\tau)\ge c>0 \qquad \forall\,\tau, \tag{2.18}$$

where $C = 2B_0$ and $c$ depends only on $|b_1|\wedge|b_2|$. Due to (1.4) for each $\tau\ge0$ we have $\mathbf E\big|(U^*A(u(\tau)))_j\big|\le\sqrt{B_1/2}$. This bound and the first estimate in (2.18) imply that Lemma 5.1 from [Kry87] applies to the Ito equation (2.17) uniformly in $\nu$ if we choose the lemma's parameters as follows:

$$d = 1,\quad \gamma = 1,\quad A_s = s,\quad r_s = 1,\quad c_s = 1,\quad y_t = t,\quad \varphi_t = t. \tag{2.19}$$

Taking in the lemma for $f(t,x)$ the characteristic function of the segment $[-\delta,\delta]$, we get

$$\mathbf E\int_0^{\gamma_R}e^{-\tau}a_j(\tau)^{1/2}\,I_{\{|\tilde u_j(\tau)|\le\delta\}}\,d\tau\le C\sqrt\delta, \qquad j = 1,2,$$

where $\gamma_R\le1$ is the first exit time $\le1$ of the process $\tilde u_j$ from the segment $[-R,R]$. Sending $R$ to $\infty$ we get that

$$\mathbf E\int_0^1a_j(\tau)^{1/2}\,I_{\{|\tilde u_j(\tau)|\le\delta\}}\,d\tau\le C_1\sqrt\delta, \qquad j = 1,2, \tag{2.20}$$

uniformly in $\nu$. For $c$ as in (2.18) and any fixed $\tau$ let us consider the event $Q^1_\tau = \{a_1(\tau)\ge\tfrac12c\}$. Denote by $Q^2_\tau$ its complement. Then

$$a_1(\tau)\ge\tfrac12c \ \text{ on } Q^1_\tau \quad\text{and}\quad a_2(\tau)\ge\tfrac12c \ \text{ on } Q^2_\tau. \tag{2.21}$$

Let us set $Q_\tau = \{|\tilde u_1(\tau)|+|\tilde u_2(\tau)|\le\delta\}$. Then

$$\mathbf P(Q_\tau) = \mathbf E\big(I_{Q_\tau}I_{Q^1_\tau} + I_{Q_\tau}I_{Q^2_\tau}\big)\le\mathbf E\big(I_{\{|\tilde u_1(\tau)|\le\delta\}}I_{Q^1_\tau} + I_{\{|\tilde u_2(\tau)|\le\delta\}}I_{Q^2_\tau}\big).$$

By (2.21) the r.h.s. is bounded by

$$\sqrt{\tfrac2c}\;\mathbf E\big(I_{\{|\tilde u_1(\tau)|\le\delta\}}\sqrt{a_1} + I_{\{|\tilde u_2(\tau)|\le\delta\}}\sqrt{a_2}\big).$$

Jointly with (2.20) the obtained inequality shows that

$$\int_0^1\mathbf P(Q_\tau)\,d\tau\le C_2\sqrt\delta.$$

Since

$$\mathbf P\{\|u(\tau)\|_0\le\tfrac\delta2\} = \mathbf P\{\|\tilde u(\tau)\|_0\le\tfrac\delta2\}\le\mathbf P(Q_\tau),$$

where the l.h.s. is independent from $\tau$, then

$$\mathbf P\{\|u(\tau)\|_0\le\tfrac\delta2\}\le C_2\sqrt\delta$$

for any $\delta>0$. This relation implies (2.2). The constant $C$ in (2.2), as well as all other constants in this section, depend only on $B_1$ and $|b_1|\wedge|b_2|$. Using the Remark in Section 2.2 we may replace $|b_1|\wedge|b_2|$ by $|\tilde b_1|\wedge|\tilde b_2|$. This completes the theorem's proof.
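The exponent $1/4$ in (2.2) comes from converting the bound on the norm, which is of order $\sqrt\delta$, into a bound on the energy. The sketch below does the conversion; $\delta'$ denotes the energy threshold and is notation introduced here.

```latex
% Energy versus norm: E_\nu = \tfrac12 \|u\|_0^2, so for \delta' > 0
\[
\{E_\nu(\tau) < \delta'\} \;=\; \{\|u(\tau)\|_0 < \sqrt{2\delta'}\}
   \;\subset\; \{\|u(\tau)\|_0 \le \tfrac{\delta}{2}\}
\quad\text{with}\quad \delta := 2\sqrt{2\delta'} .
\]
% Applying P{ \|u\|_0 \le \delta/2 } \le C_2 \sqrt{\delta}:
\[
\mathbf P\{E_\nu(\tau) < \delta'\}
   \;\le\; C_2\,\sqrt{2\sqrt{2\delta'}}
   \;=\; 2^{3/4}\,C_2\,(\delta')^{1/4}.
\]
```

So (2.2) holds with $C = 2^{3/4}C_2$, and the constant inherits its dependence on $B_1$ and $|b_1|\wedge|b_2|$ from $C_2$.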


3. Distribution of Energy

Again, let $u_\nu(\tau)$ be a stationary solution of (1.1), written in the form (2.3), let $E_\nu(\tau)$ be its energy and $E_0(\tau) = \tfrac12\|U(\tau)\|_0^2$ be the energy of the Eulerian limit.

Theorem 3.1. For any $R>0$, let $Q\subset[-R,R]$ be a Borel set. Then

$$\mathbf P\{E_\nu(\tau)\in Q\}\le p_R(|Q|) \tag{3.1}$$

uniformly in $\nu\in(0,1]$, where $p_R(t)\to0$ as $t\to0$. In particular, the measures $\mathcal D(E_\nu(\tau))$ are absolutely continuous with respect to the Lebesgue measure.

Since $\mathcal D(E_{\nu_j})\rightharpoonup\mathcal D(E_0(\tau))$, then $E_0(\tau)$ satisfies (3.1) for any open set $Q\subset[-R,R]$. Accordingly, $\mathbf P\{E_0(\tau)\in Q\} = 0$ if $|Q| = 0$ since the Lebesgue measure is regular. We get

Corollary 3.2. The measure $\mathcal D(E_0(\tau))$ is absolutely continuous with respect to the Lebesgue measure.

Proof of the theorem. For any $\delta>0$ let us consider the set

$$O = O(\delta) = \{u\in\mathcal H^2 \mid \|u\|_2\le\delta^{-1/4},\ \|u\|_0\ge\delta\}.$$

Writing $u = u_\nu$ as $u = \sum u_se_s$, we set $u_I = \sum_{|s|\le N}u_se_s$ and $u_{II} = u-u_I$. For any $u\in O$ we have $\|u_{II}\|_0^2\le N^{-4}\|u_{II}\|_2^2\le\delta^{-1/2}N^{-4}$. So $\|u_I\|_0^2\ge\delta^2-\delta^{-1/2}N^{-4}$. Choosing $N = N(\delta) = 2^{1/4}\delta^{-5/8}$ we achieve

$$\|u_I\|_0^2\ge\tfrac12\delta^2 \qquad \forall\,u\in O.$$

The stationary process $E(u_\nu(\tau))$ satisfies the Ito equation

$$dE = \Big(-\|u(\tau)\|_1^2 + \tfrac12B_0\Big)\,d\tau + \sum b_su_s(\tau)\,d\beta_s(\tau)$$

(see Sect. 2.5). The diffusion coefficient $a(\tau)$ satisfies

$$a(\tau) = \sum b_s^2|u_s(\tau)|^2\ge b_N^2\|u_I(\tau)\|_0^2,$$

where $b_N = \min_{|s|\le N}|b_s|>0$. So,

$$a(\tau)\ge\tfrac12b_N^2\delta^2 \quad\text{if } u(\tau)\in O. \tag{3.2}$$

Besides,

$$\mathbf E\Big|-\|u(\tau)\|_1^2 + \tfrac12B_0\Big|\le B_0, \qquad \mathbf E|a(\tau)|\le\frac{\max_sb_s^2}{2}B_0$$

(see (1.4)). Let $Q\subset[-R,R]$ be a Borel set and $f$ be its indicator function. Applying the Krylov lemma with the same choices of parameters as in (2.19), passing to the limit as $R\to\infty$ as in Sect. 2.7 and taking into account that $E(\tau)$ is a stationary process, we get that

$$\mathbf E\big(a(\tau)^{1/2}f(E(\tau))\big)\le C|Q|^{1/2}, \tag{3.3}$$

uniformly in $\nu>0$. Due to (1.4) and (2.2),

$$\mathbf P\{u(\tau)\notin O\}\le\tfrac12B_1\sqrt\delta + C\sqrt\delta.$$

Jointly with (3.2) and (3.3) this estimate implies that

$$\mathbf P(E_\nu(\tau)\in Q) = \mathbf E f(E(\tau))\le C\big(|Q|^{1/2}b_N^{-1}\delta^{-1}\big) + C_1\sqrt\delta \qquad \forall\,0<\delta\le1,$$

where $N = N(\delta)$. Now (3.1) follows.
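The cut-off $N(\delta)=2^{1/4}\delta^{-5/8}$ in the proof is exactly what makes the high-mode tail take away half of $\delta^2$. The arithmetic is sketched below, under the same tail bound $\|u_{II}\|_0^2\le N^{-4}\|u_{II}\|_2^2$ that the proof uses.

```latex
% For u in O(\delta): \|u\|_2 \le \delta^{-1/4} and \|u\|_0 \ge \delta, hence
\[
\|u_{II}\|_0^2 \;\le\; N^{-4}\,\|u_{II}\|_2^2 \;\le\; N^{-4}\,\delta^{-1/2}.
\]
% With N = 2^{1/4}\delta^{-5/8}:
\[
N^{-4} = \tfrac12\,\delta^{5/2}
\quad\Longrightarrow\quad
N^{-4}\,\delta^{-1/2} = \tfrac12\,\delta^{2}.
\]
% Orthogonality of u_I and u_{II} in H then gives
\[
\|u_I\|_0^2 = \|u\|_0^2 - \|u_{II}\|_0^2
   \;\ge\; \delta^2 - \tfrac12\delta^2 = \tfrac12\,\delta^2 .
\]
```

In the final bound one then lets $|Q|\to0$ first and $\delta\to0$ afterwards: for fixed $\delta$ the factor $b_{N(\delta)}^{-1}\delta^{-1}$ is a finite constant, so the right-hand side can be made arbitrarily small, which gives a function $p_R$ with $p_R(t)\to0$.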

4. Distributions of Functionals of Vorticity In this section we assume that B6 < ∞. The vorticity ζ = rot u(t, x) of a solution u for (1.1), written in the fast time τ = νt, satisfies the equation

Here ξ =

d dt

ζτ − ζ + ν −1 (u · ∇)ζ = ξ(τ, x). s∈Z2 \{0} βs (τ )ϕs (x)

(4.1)

and

2π λs f s1 2π l 1 x 1 + s2 l 2 x 2 ϕs = rot es = π · 2 ll21 s12 + ll21 s22

for any s, where f = cos or f = − sin, depending on whether s1 + s2 δs1,0 > 0 or not. We will study Eq. (4.1) in Sobolev spaces, l l 2 H = {ζ ∈ H (T ) | ζ d x = 0}, l ≥ 0, given the norms · l , defined as in (1.3). Let us fix m ∈ N and choose any m analytic functions f 1 (ζ ), . . . , f m (ζ ), linear independent modulo constant functions.3 We assume that the functions f j (ζ ), . . . , f j (ζ ), 1 ≤ j ≤ m, have at most a polynomial growth as |ζ | → ∞ and that f j (ζ ) ≥ −C ∀ j, ∀ζ (for example, each f j (ζ ) is a trigonometric polynomial, or a polynomial of an even degree with a positive leading coefficient). Consider the map F : H l → Rm , ζ → (F1 (ζ ), . . . , Fm (ζ )), Fj = f j (ζ (x)) d x, T2

where 0 < l < 1. Since for any P < ∞ we have H l ⊂ L P (T2 ) if l is sufficiently close to 1, then choosing a suitable l = l(F) we achieve that the map F is C 2 -smooth. Let us fix this l. We have

d F(ζ )(ξ ) = f m (ζ (x))ξ(x) d x . f 1 (ζ (x))ξ(x) d x, . . . , 3 I.e., C f (ζ ) + · · · + C f (ζ ) = const, unless C = · · · = C = 0. m m m 1 1 1

420

S. B. Kuksin

Lemma 4.1. If ζ ≡ 0, then the rank of d F(ζ ) is m. Proof. Assume that the rank is < m. Then there exist the numbers C1 , . . . , Cm , not all equal to zero, such that (C1 f 1 (ζ ) + · · · + Cm f m (ζ ))ξ d x = 0 ∀ ξ ∈ H l . (4.2) Denote P(ζ ) = C1 f 1 (ζ ) + · · · + Cm f m (ζ ). This is a non-constant analytic function. Due 2 to (4.2), P(ζ (x)) = const. Denote this constant C∗ . Then the connected set ζ (T ) lies −1 2 in the discrete set P (C∗ ). So ζ (T ) is a point, i.e. ζ (x) ≡ const. Since ζ d x = 0, then ζ (x) ≡ 0. Now let ζ (t) = rot u ν (t), where u ν is a stationary solution of (1.1). Applying Ito’s formula to the process F(ζ (τ )) ∈ Rm and using that F j is an integral of motion for the Euler equation, we get that 1 d F j (τ ) = f j (ζ (τ, x))ζ (τ, x) d x + b2 f j (ζ (τ, x))ϕs2 (x) d x) dτ 2 s s bs + f j (ζ (τ, x))ϕs (x) d x dβs (τ ). s 2 ≡ |s|2 /2π 2 , then Since bs ≡ b−s and ϕs2 + ϕ−s

1 B1 ) d x dτ d F j (τ ) = f j (ζ )(−|∇x ζ |2 + 4π bs + f j (ζ (τ, x))ϕs (x) d x dβs (τ ) s

:= H j (ζ (τ )) dτ +

h js (ζ (τ )) dβs (τ ) .

s

Ito's formula applies since under our assumptions all moments of the random variables $\zeta(\tau,x)$ and $|\nabla_x \zeta(\tau,x)|$ are finite (see [Kuk06a], Sect. 4.3). Using that $F_j(\tau)$ is a stationary process, we get from the last relation that $\mathbf{E} H_j = 0$, i.e.

$$\mathbf{E} \int f_j''(\zeta(\tau,x))\,|\nabla_x \zeta(\tau,x)|^2\,dx = \frac{B_1}{4\pi}\, \mathbf{E} \int f_j''(\zeta(\tau,x))\,dx. \tag{4.3}$$

Since $B_6 < \infty$, all moments of the random variables $|\zeta(\tau,x)|$ are bounded uniformly in $\nu \in (0,1]$, see [Kuk06b] and (10.11) in [Kuk06a]. As the random field $\zeta$ is stationary in $\tau$ and in $x$, the r.h.s. of (4.3) is bounded uniformly in $\nu$. Using the assumption $f_j'' \ge -C$ we find that $|f_j''|\,|\nabla_x \zeta|^2 \le f_j''\,|\nabla_x \zeta|^2 + 2C|\nabla_x \zeta|^2$. Since

$$\mathbf{E} \int |\nabla_x \zeta(\tau,x)|^2\,dx = \mathbf{E}\, \|u_\nu(\tau)\|_2^2 = \frac12 B_1,$$

then we get that

$$\mathbf{E}\,|H_j(\zeta(\tau))| \le C_j < \infty, \tag{4.4}$$

uniformly in $\nu$ (and for all $\tau$). Consider the diffusion matrix $a(\zeta(\tau))$, $a_{jl}(\zeta) = \sum_s h_{js}(\zeta)\,h_{ls}(\zeta)$. Clearly

$$\mathbf{E}\, \operatorname{tr}\,(a_{jl})(\zeta(\tau)) \le C, \tag{4.5}$$

uniformly in $\nu$. Let us denote $D(\zeta) = \det a_{jl}(\zeta) \ge 0$. Noting that $h_{js}(\zeta) = b_s\,(dF(\zeta))_{js}$, we obtain from Lemma 4.1:

Lemma 4.2. The function $D$ is continuous on $H^l$ and $D > 0$ outside the origin.

Now we regard (4.1) as an equation in $H^1$ and set

$$O_\delta = \{ \zeta \in H^1 \mid \|\zeta\|_1 \le \delta^{-1},\ \|\zeta\|_l \ge \delta \}.$$

Since $H^1 \Subset H^l$, then $D \ge c(\delta) > 0$ everywhere in $O_\delta$. Estimates (4.4), (4.5) allow us to apply Krylov's lemma with $p = d = m$ to the stationary process $F(\zeta_\nu(\tau)) \in \mathbb{R}^m$, uniformly in $\nu$. Choosing there for $f$ the characteristic function of a Borel set $Q \subset \{|z| \le R\}$, we find that

$$\mathbf{P}\{F(\zeta_\nu(\tau)) \in Q\} \le \mathbf{P}\{\zeta_\nu(\tau) \notin O_\delta\} + c(\delta)^{-1/(m+1)}\, C_R\, |Q|^{1/(m+1)} \tag{4.6}$$

(cf. the arguments in Sect. 3). Since $\|\zeta\|_1 = \|u\|_2$ and $\|\zeta\|_l \ge \|\zeta\|_0 \ge \|u\|_0$ for $\zeta = \operatorname{rot} u$, then due to (1.4) and (2.2) the first term in the r.h.s. of (4.6) goes to zero with $\delta$ uniformly in $\nu$, and we get that

$$\mathbf{P}\{F(\zeta_\nu(\tau)) \in Q\} \le p_R(|Q|), \qquad p_R(t) \to 0 \ \text{as}\ t \to 0, \tag{4.7}$$

uniformly in $\nu$. Evoking the Amplification to Theorem 1.1 we derive from (4.7) that the vorticity $\zeta_0$ of the Eulerian limit $U$ satisfies (4.7), if $Q$ is an open subset of $B_R$. We have

Theorem 4.3. If $B_6 < \infty$, then the distribution of the stationary solution for the 2D NSE, written in terms of vorticity (4.1), satisfies (4.7) uniformly in $\nu$. The vorticity $\zeta_0$ of the Eulerian limit $U$ is distributed in such a way that the law of $F(\zeta_0(\tau))$ is absolutely continuous with respect to the Lebesgue measure in $\mathbb{R}^m$.

Corollary 4.4. Let $X \subset H \cap C^1(\mathbb{T}^2; \mathbb{R}^2)$ be a compact set of finite Hausdorff dimension. Then $\mu_0(X) = 0$.

Proof. Denote the Hausdorff dimension of $X$ by $d$ and choose any $m > d$. Then $(F \circ \operatorname{rot})(X)$ is a subset of $\mathbb{R}^m$ of positive codimension. So its measure with respect to $\mathcal{D}(F(\zeta_0(t)))$ equals zero. Since $\mathcal{D}(F(\zeta_0(t))) = (F \circ \operatorname{rot}) \circ \mu_0$, then $\mu_0(X) = 0$.


5. Appendix: Rotation of Solid Body

The Euler equation for a freely rotating solid body, written in terms of its momentum $M \in \mathbb{R}^3$, is

$$\dot M + [M, A^{-1} M] = 0, \tag{5.1}$$

where $A$ is the operator of inertia and $[\cdot,\cdot]$ is the vector product. The corresponding damped/driven Eq. (0.5) is

$$\dot M + [M, A^{-1} M] + \nu M = \sqrt{\nu}\,\eta(t), \tag{5.2}$$

where the random force is $\eta(t) = \frac{d}{dt} \sum_{j=1}^{3} b_j \beta_j(t)\, e_j$ with non-zero $b_j$'s, and $\{e_1, e_2, e_3\}$ is the eigenbasis of the operator $A^{-1}$. Let us denote by $0 < \lambda_1 \le \lambda_2 \le \lambda_3$ the eigenvalues corresponding to the eigenvectors $e_j$. Equation (5.2) has a unique stationary measure. Let $M_\nu(t)$ be a corresponding stationary solution. An inviscid limit, similar to that in Theorem 1.1, holds:

$$\mathcal{D} M_{\nu_j}(\cdot) \rightharpoonup \mathcal{D} M_0(\cdot) \quad \text{as}\ \nu_j \to 0, \tag{5.3}$$

where $M_0(t) \in \mathbb{R}^3$ is a stationary process, formed by solutions of (5.1). The Euler equation has two quadratic integrals of motion: $H_1(M) = \frac12 |M|^2$ and $H_2(M) = \frac12 (A^{-1} M, M)$. Distributions of the random variables $H_1(M_\nu(t))$ and $H_2(M_\nu(t))$, $0 \le \nu \le 1$, satisfy direct analogies of the assertions in Sects. 2, 3.

To analyse further the processes $M_\nu$ with $\nu \ll 1$ and the inviscid limit $M_0$, we note that a.e. non-empty level set of the vector-integral $H = (H_1, H_2)$ is formed by two periodic trajectories of (5.1) (see [Arn89]). Denote them $S^{\pm}_{(H_1,H_2)}$. It is easy to see that the conditional probabilities for $M_\nu(t)$ to belong to $S^{+}_{(H_1,H_2)}$ or to $S^{-}_{(H_1,H_2)}$ are equal. Since the dynamics, defined by (5.1) on each set $S^{\pm}_{(H_1,H_2)}$, obviously is ergodic with respect to a corresponding invariant measure $\nu^{\pm}_{(H_1,H_2)}$,4 then the methods of [FW98,FW03,KP08] apply to the process

$$H^{\nu_j}(\tau) = H(M_{\nu_j}(\tau)) \in K = \{ (h_1, h_2) \in \mathbb{R}^2 \mid 0 \le \lambda_1 h_1 \le h_2 \le \lambda_3 h_1 \},$$

where $\tau = \nu_j t$. They allow one to prove that a limiting (as $\nu_j \to 0$) process $H^0(\tau)$ exists and is a stationary solution of an SDE, obtained from the Ito equation for $H(M(\tau))$ by the usual stochastic averaging with respect to the ergodic measures $\nu^{\pm}_{(H_1,H_2)}$ on the curves $S^{\pm}_{(H_1,H_2)}$. This is a stochastic equation in $K$.

4 The density of the measure $\nu^{\pm}_{(H_1,H_2)}$ against the Lebesgue measure on the curve $S^{\pm}_{(H_1,H_2)}$ is inverse-proportional to the velocity of the trajectory.

Assume that the matrix $A^{-1}$ is non-degenerate:

$$0 < \lambda_1 < \lambda_2 < \lambda_3. \tag{5.4}$$

Then the diffusion matrix for the averaged equation is non-degenerate outside $\partial K$ and the ray $\{h_2 = \lambda_2 h_1,\ h_1 \ge 0\}$. The process $H^0$ is a stationary solution of the averaged equation such that

• it has finite quadratic exponential moments, cf. (2.1),
• its marginal distribution $\mathcal{D}(H^0(0))$ is absolutely continuous with respect to the Lebesgue measure on $K$.

We claim that under the non-degeneracy assumption (5.4) the averaged equation has a unique stationary measure $\theta$ satisfying the two properties above. Accordingly $\mathcal{D}(H(M_0(0))) = \theta$ and

$$\mathcal{D}(M_0(0)) = \sum_{\alpha \in \{+,-\}} \int_{\mathbb{R}^2} \pi_\alpha\, \nu^{\alpha}_{(H_1,H_2)}\, \theta(dH_1\, dH_2), \tag{5.5}$$

where $\pi_+ = \pi_- = 1/2$, cf. Theorem 6.6 in [KP08]. In particular, the convergence (5.3) holds as $\nu \to 0$ (i.e., the limit does not depend on a sequence $\nu_j \to 0$).

The representation (5.5) for the measure $\mathcal{D}(M_0(0))$ is called its disintegration with respect to the map $H : \mathbb{R}^3 \to \mathbb{R}^2$. It may be obtained independently from the arguments above (see references in [Kuk07]). The role of the arguments is to represent the measure $\theta$ in terms of the averaged equation.

The measure $\mu_0 = \mathcal{D} U(0)$, corresponding to the Eulerian limit $U$ (Theorem 1.1), also admits a disintegration, similar to (5.5), but with much more complicated ingredients, see [Kuk07]. The main difficulty in studying this disintegration (and the measure $\mu_0$ itself) comes from the fact that, in contrast to the sets $\{H = \mathrm{const}\}$, the iso-integral sets for the Euler equation

$$\Big\{ U \ \Big|\ E(U) = \mathrm{const},\ \int f(\operatorname{rot}(U(x)))\,dx = \mathrm{const}\ \ \forall f \Big\}, \tag{5.6}$$

and the Eulerian dynamics on them are understood very poorly. In particular, nothing is known about the measures on the sets (5.6) which enter the disintegration for the Eulerian limit. Still, an analogy with Eq. (5.2) and with the damped/driven KdV equation allows us in [Kuk07] to conjecture an averaging procedure of the Whitham type to find the measures involved in the disintegration of $\mu_0$.

References

[Arn89] Arnold, V.: Mathematical Methods in Classical Mechanics. 2nd ed., Berlin: Springer-Verlag, 1989
[FW98] Freidlin, M., Wentzell, A.: Random Perturbations of Dynamical Systems. 2nd ed., New York: Springer-Verlag, 1998
[FW03] Freidlin, M.I., Wentzell, A.D.: Averaging principle for stochastic perturbations of multifrequency systems. Stochastics and Dynamics 3, 393–408 (2003)
[HM06] Hairer, M., Mattingly, J.: Ergodicity of the 2D Navier-Stokes equations with degenerate stochastic forcing. Ann. Math. 164(3), 993–1032 (2006)
[KP08] Kuksin, S.B., Piatnitski, A.L.: Khasminskii - Whitham averaging for randomly perturbed KdV equation. J. Math. Pur. Appl. 89, 400–428 (2008)
[Kry87] Krylov, N.V.: Estimates of the maximum of the solution of a parabolic equation and estimates of the distribution of a semimartingale. Math. USSR Sbornik 58, 207–221 (1987)
[KS04] Kuksin, S.B., Shirikyan, A.: Randomly forced CGL equation: stationary measures and the inviscid limit. J. Phys. A: Math. Gen. 37, 1–18 (2004)
[Kuk06a] Kuksin, S.B.: Randomly Forced Nonlinear PDEs and Statistical Hydrodynamics in 2 Space Dimensions. Zürich: Eur. Math. Soc. 2006, also available at http://www.ma.utexas.edu/ mp_arc 06-178
[Kuk06b] Kuksin, S.B.: Remarks on the balance relations for the two-dimensional Navier–Stokes equation with random forcing. J. Stat. Physics 122, 101–114 (2006)
[Kuk07] Kuksin, S.B.: Eulerian limit for 2D Navier-Stokes equation and damped/driven KdV equation as its model. Proc. Steklov Inst. Math. 259, 128–136 (2007)

Communicated by G. Gallavotti

Commun. Math. Phys. 284, 425–457 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0569-3


Nonlinear Dynamical Stability of Newtonian Rotating and Non-rotating White Dwarfs and Rotating Supermassive Stars

Tao Luo1, Joel Smoller2

1 Department of Mathematics, Georgetown University, Washington, DC 20057-1233, USA.

E-mail: [email protected]

2 Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1109, USA.

E-mail: [email protected] Received: 11 October 2007 / Accepted: 28 March 2008 Published online: 15 July 2008 – © Springer-Verlag 2008

Abstract: We prove general nonlinear stability and existence theorems for rotating star solutions which are axi-symmetric steady-state solutions of the compressible isentropic Euler-Poisson equations in 3 spatial dimensions. We apply our results to rotating white dwarf and high density supermassive (extreme relativistic) stars, stars which are in convective equilibrium and have uniform chemical composition. Also, we prove nonlinear dynamical stability of non-rotating white dwarfs with general perturbation without any symmetry restrictions. This paper is a continuation of our earlier work ([26]).

Contents

1. Introduction
2. Rotating Star Solutions
3. General Existence and Stability Theorems
   3.1 Compactness of minimizing sequence
   3.2 Stability
4. Applications to White Dwarf and Supermassive Stars
5. Nonlinear Dynamical Stability of Non-Rotating White Dwarf Stars With General Perturbations

1. Introduction

The motion of a compressible isentropic perfect fluid with self-gravitation is modeled by the Euler-Poisson equations in three space dimensions (cf. [5]):

$$\begin{cases} \rho_t + \nabla \cdot (\rho v) = 0, \\ (\rho v)_t + \nabla \cdot (\rho v \otimes v) + \nabla p(\rho) = -\rho \nabla \Phi, \\ \Delta \Phi = 4\pi \rho. \end{cases} \tag{1.1}$$


Here $\rho$, $v = (v_1, v_2, v_3)$, $p(\rho)$ and $\Phi$ denote the density, velocity, pressure and gravitational potential, respectively. The gravitational potential is given by

$$\Phi(x) = -\int_{\mathbb{R}^3} \frac{\rho(y)}{|x - y|}\,dy = -\rho * \frac{1}{|x|}, \tag{1.2}$$

where $*$ denotes convolution. System (1.1) is used to model the evolution of a Newtonian gaseous star ([5]). In the study of time-independent solutions of system (1.1), there are two cases, non-rotating stars and rotating stars. An important question concerns the stability of such solutions. Physicists call such star solutions stable provided that they are minima of an associated energy functional ([37], p.305 & [33]). Mathematicians, on the other hand, consider dynamical nonlinear stability via solutions of the Cauchy problem. The main purpose of this paper is to prove a general theorem which relates these two notions and shows that for a wide class of Newtonian rotating stars, minima of the energy functional are, in fact, dynamically stable. This is done for various equations of state $p = p(\rho)$ which include polytropes, supermassive, and white dwarf stars. For non-rotating stars, Rein ([32]) has proved nonlinear stability under various hypotheses on the equation of state, including in particular polytropes where $p = k\rho^\gamma$, $\gamma > 4/3$; his theory applies to neither white dwarf nor supermassive stars. In a recent paper, [26], we studied nonlinear stability of rotating polytropic stars, where $p = k\rho^\gamma$, $\gamma > 4/3$. In this paper, we generalize these results to rotating white dwarf and supermassive stars, thereby completing the nonlinear stability theory for rotating (and nonrotating) compressible Newtonian stars.1 Our main theorem applies to minimizers of an energy functional with a total mass constraint. The crucial hypotheses are that the infimum of the energy functional in the requisite class be finite and negative.
This is verified for both white dwarf and supermassive stars by combining a scaling technique used by Rein ([31]), together with our method in [26] where we use some particular solutions of the Euler-Poisson equations in order to simplify the energy functional. It should be noticed that neither the scaling technique in [31] nor the method in [26] using particular solutions of Euler-Poisson equations applies to white dwarf stars directly. As a by-product of our method, we prove the existence of a minimizer for the energy functional, which is a rotating white dwarf star solution, in a class of functions having less symmetry than those solutions obtained in [1] and [10]. The method in [1] and [10] is to construct a specific minimizing sequence of the energy functional, each element in the sequence being a local minimizer of the energy functional. In contrast, our method is to show that any minimizing sequence of the energy functional must be compact (cf. Theorem 3.1 below). This fact is crucial for both existence and stability results.

1 In all cases under consideration, stability is only "conditional" because no global in time solutions have been constructed so far for compressible Euler-type equations in three spatial dimensions; this is a major open problem. In the stability result in [32], it was assumed that the solutions of the Cauchy problem for the evolutionary Euler-Poisson equations exist and preserve the total mass and energy. In general, shock waves appear in compressible fluid flows. In the presence of shock waves, the total energy should be non-increasing in time due to the entropy condition. We prove the conservation of total mass for general weak solutions and the non-increase of the total energy for entropy weak solutions if the weak solutions are in certain $L^p$ spaces (see Theorem 3.2). Those two properties are important for our stability analysis.

For a white dwarf star (a star in which gravity is balanced by electron degeneracy pressure), the pressure function $p(\rho)$ obeys the following asymptotics ([5], Chap. 10):

$$\begin{aligned} p(\rho) &= c_1 \rho^{4/3} - c_2 \rho^{2/3} + \cdots, & \rho &\to \infty, \\ p(\rho) &= d_1 \rho^{5/3} - d_2 \rho^{7/3} + O(\rho^3), & \rho &\to 0, \end{aligned} \tag{1.3}$$

where $c_1$, $c_2$, $d_1$ and $d_2$ are positive constants. The existence theory for non-rotating white dwarf stars is classical provided the mass $M$ of the star is not greater than a critical mass $M_c$ ($M \le M_c$) ([5]). For rotating white dwarf stars with prescribed total mass and angular momentum distribution, Auchmuty and Beals ([1]) proved that if the angular momentum distribution is nonnegative, then existence holds if $M \le M_c$. Friedman and Turkington ([10]) proved existence for any mass provided that the angular momentum distribution is everywhere positive; see Li ([21]), Chanillo & Li ([6]) and Luo & Smoller ([25]) for related results for rotating star solutions with prescribed constant angular velocity. To the best of our knowledge, our stability theorem in this paper for rotating and non-rotating white dwarf stars with $M \le M_c$ is the first nonlinear dynamical stability theorem for such stars. For a supermassive star (a star which is supported by the pressure of radiation rather than that of matter; sometimes called an extreme relativistic degenerate star [33]), the pressure $p(\rho)$ is given by ([37]):

$$p(\rho) = k\rho^\gamma, \qquad \gamma = 4/3, \tag{1.4}$$

where k > 0 is a constant. For non-rotating spherically symmetric solutions for supermassive stars, Weinberg ([37]) showed that the total energy vanishes; thus to quote Weinberg ([37], p. 327) “the polytrope with γ = 4/3 is trembling between stability and instability”, and he remarks that one needs to use general relativity to settle this stability problem. For rotating supermassive star solutions, we show here that the energy is negative E < 0 due to the rotational kinetic energy (see (4.26) below). Thus the stability problem falls within the framework of Newtonian mechanics and so our general stability theorem applies to show that rotating supermassive stars are nonlinearly stable, provided that M ≤ Mc . For the stability of both white dwarfs and supermassive stars, we require that the total mass of each one lies below a corresponding critical mass, a “Chandrasekhar” limit. We show that this holds because the pressure function for both is of the order ρ 4/3 as ρ → ∞. The above dynamical stability results for rotating stars apply for axi-symmetric perturbations with some restrictions on angular momentum. For non-rotating stars, G. Rein ([32]) proved nonlinear dynamical stability for general perturbations. However, his result does not apply to white dwarf stars. For non-rotating white dwarf stars, the problem was formulated by Chandrasekhar [4] in 1931 (and also in [8] and [16]) and leads to an equation for the density which was called the “ Chandrasekhar equation ” by Lieb and Yau in [22]. This equation predicts the gravitational collapse at some critical mass ([4] and [5]). This gravitational collapse was also verified by Lieb and Yau ([22]) as the limit of Quantum Mechanics. In Sect. 5, we prove the nonlinear dynamical stability for nonrotating white dwarf stars with general perturbations without any symmetry assumption provided that the total mass is below some critical mass. 
Other related results besides those mentioned above for compressible fluid rotating stars can be found in [2, 3, 9, and 25]. The linearized stability and instability for non-rotating and rotating stars were discussed by Lin ([23]), Lebovitz ([18]) and Lebovitz & Lifschitz ([19]). Related nonlinear stability and instability results for galaxies, globular and gaseous stellar objects can be found in Guo & Rein ([12,13]) and Jang ([11]). Related results for the Euler- Poisson equations of self-gravitating fluids can be found in [7, 15, 28 and 36].


2. Rotating Star Solutions

We now introduce some notation which will be used throughout this paper. We use $\int$ to denote $\int_{\mathbb{R}^3}$, and use $\|\cdot\|_q$ to denote $\|\cdot\|_{L^q(\mathbb{R}^3)}$. For any point $x = (x_1, x_2, x_3) \in \mathbb{R}^3$, let

$$r(x) = \sqrt{x_1^2 + x_2^2}, \qquad z(x) = x_3, \qquad B_R(x) = \{ y \in \mathbb{R}^3,\ |y - x| < R \}. \tag{2.1}$$

For any function $f \in L^1(\mathbb{R}^3)$, we define the operator $B$ by

$$Bf(x) = \int \frac{f(y)}{|x - y|}\,dy = f * \frac{1}{|x|}. \tag{2.2}$$

Also, we use $\nabla$ to denote the spatial gradient, i.e., $\nabla = \nabla_x = (\partial_{x_1}, \partial_{x_2}, \partial_{x_3})$. $C$ will denote a generic positive constant.

A rotating star solution $(\tilde\rho, \tilde v, \tilde\Phi)(r, z)$, where $r = \sqrt{x_1^2 + x_2^2}$ and $z = x_3$, $x = (x_1, x_2, x_3) \in \mathbb{R}^3$, is an axi-symmetric time-independent solution of system (1.1), which models a star rotating about the $x_3$-axis. Suppose the angular momentum (per unit mass) $J(m_{\tilde\rho}(r))$ is prescribed, where

$$m_{\tilde\rho}(r) = \int_{\sqrt{x_1^2 + x_2^2} < r} \tilde\rho(x)\,dx = \int_0^r 2\pi s \int_{-\infty}^{+\infty} \tilde\rho(s, z)\,dz\,ds \tag{2.3}$$

is the mass in the cylinder $\{ x = (x_1, x_2, x_3) : \sqrt{x_1^2 + x_2^2} < r \}$, and $J$ is a given function. In this case, the velocity field $\tilde v(x) = (v_1, v_2, v_3)$ takes the form

$$\tilde v(x) = \Big( -\frac{x_2 J(m_{\tilde\rho}(r))}{r^2},\ \frac{x_1 J(m_{\tilde\rho}(r))}{r^2},\ 0 \Big).$$

Substituting this in (1.1), we find that $\tilde\rho(r, z)$ satisfies the following two equations:

$$\begin{cases} \partial_r p(\tilde\rho) = \tilde\rho\, \partial_r (B\tilde\rho) + \tilde\rho\, L(m_{\tilde\rho}(r))\, r^{-3}, \\ \partial_z p(\tilde\rho) = \tilde\rho\, \partial_z (B\tilde\rho), \end{cases} \tag{2.4}$$

where the operator $B$ is defined in (2.2), and $L(m_{\tilde\rho}) = J^2(m_{\tilde\rho})$ is the square of the angular momentum. We define

$$A(\rho) = \rho \int_0^\rho \frac{p(s)}{s^2}\,ds. \tag{2.5}$$

It is easy to verify that (cf. [1]) (2.4) is equivalent to

$$A'(\tilde\rho(x)) + \int_{r(x)}^\infty L(m_{\tilde\rho}(s))\, s^{-3}\,ds - B\tilde\rho(x) = \lambda, \quad \text{where}\ \tilde\rho(x) > 0, \tag{2.6}$$

for some constant $\lambda$. Here $r(x)$ and $z(x)$ are as in (2.1). Let $M$ be a positive constant and let $W_M$ be the set of functions $\rho$ defined by

$$W_M = \Big\{ \rho : \mathbb{R}^3 \to \mathbb{R},\ \rho\ \text{is axisymmetric},\ \rho \ge 0\ \text{a.e.},\ \int \rho(x)\,dx = M,\ \int \Big( A(\rho(x)) + \frac{\rho(x) L(m_\rho(r(x)))}{r(x)^2} + \rho(x) B\rho(x) \Big)\,dx < +\infty \Big\}.$$

For $\rho \in W_M$, we define the energy functional $F$ by

$$F(\rho) = \int \Big[ A(\rho(x)) + \frac12 \frac{\rho(x) L(m_\rho(r(x)))}{r(x)^2} - \frac12 \rho(x) B\rho(x) \Big]\,dx. \tag{2.7}$$

In (2.7), the first term denotes the potential energy, the middle term denotes the rotational kinetic energy and the third term is the gravitational energy. For a white dwarf star, the pressure function $p(\rho)$ satisfies the following conditions:

$$\lim_{\rho \to 0+} \frac{p(\rho)}{\rho^{4/3}} = 0, \qquad \lim_{\rho \to \infty} \frac{p(\rho)}{\rho^{4/3}} = K, \qquad p'(\rho) > 0 \ \text{as}\ \rho > 0, \tag{2.8}$$

where $K$ is a finite positive constant. Assuming that the function $L \in C^1[0, M]$ and satisfies

$$L(0) = 0, \qquad L(m) \ge 0 \ \text{for}\ 0 \le m \le M, \tag{2.9}$$

Auchmuty and Beals (cf. [1]) proved the existence of a minimizer of the functional $F(\rho)$ in the class of functions $W_{M,S} = W_M \cap W_{sym}$, where

$$W_{sym} = \{ \rho : \mathbb{R}^3 \to \mathbb{R},\ \rho(x_1, x_2, -x_3) = \rho(x_1, x_2, x_3),\ x_i \in \mathbb{R},\ i = 1, 2, 3 \}. \tag{2.10}$$

Their result is given in the following theorem.

Theorem 2.1 ([1]). If the pressure function $p$ satisfies (2.8) (for either $0 < K < +\infty$ or $K = +\infty$) and (2.9) holds, then there exists a constant $M_c > 0$ depending on the constant $K$ in (2.8) (if $K = +\infty$ then $M_c = +\infty$; if $0 < K < +\infty$, then $0 < M_c < +\infty$) such that, if

$$M < M_c, \tag{2.11}$$

then there exists a function $\hat\rho(x) \in W_{M,S}$ which minimizes $F(\rho)$ in $W_{M,S}$. Moreover, if

$$G = \{ x \in \mathbb{R}^3 : \hat\rho(x) > 0 \}, \tag{2.12}$$

then $\bar G$ is a compact set in $\mathbb{R}^3$, and $\hat\rho \in C^1(G) \cap C^\beta(\mathbb{R}^3)$ for some $0 < \beta < 1$. Furthermore, there exists a constant $\mu < 0$ such that

$$\begin{cases} A'(\hat\rho(x)) + \int_{r(x)}^\infty L(m_{\hat\rho}(s))\, s^{-3}\,ds - B\hat\rho(x) = \mu, & x \in G, \\ \int_{r(x)}^\infty L(m_{\hat\rho}(s))\, s^{-3}\,ds - B\hat\rho(x) \ge \mu, & x \in \mathbb{R}^3 - G. \end{cases} \tag{2.13}$$

Remark 1. When $0 < K < \infty$, the constant $0 < M_c < +\infty$ in (2.11) is called the critical mass. The critical mass was first found by Chandrasekhar (cf. [5]) in the study of non-rotating white dwarf stars. When $0 < K < \infty$, it was proved by Friedman and Turkington ([10]) that, if the angular momentum satisfies the condition

$$J \in C^1([0, M]), \quad J(m) \ge 0 \ \text{for}\ 0 \le m \le M, \quad J(0) = 0, \quad J(m) > 0 \ \text{for}\ 0 < m \le M, \tag{2.14}$$

where $J$ is the angular momentum, then the condition (2.11) can be removed, i.e., the above theorem holds for any positive total mass $M$.
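The white dwarf conditions (2.8) and the asymptotics (1.3) can be illustrated numerically. The sketch below assumes the standard Chandrasekhar equation of state for a fully degenerate electron gas, $p \propto x(2x^2 - 3)\sqrt{1 + x^2} + 3\sinh^{-1} x$ with $\rho \propto x^3$. This explicit formula is not written out in the text above, and the proportionality constants are set to 1, so the printed ratios are dimensionless. The function names are illustrative only.

```python
import math


def chandrasekhar_f(x: float) -> float:
    """Dimensionless degenerate-electron pressure; the density is proportional to x**3."""
    return x * (2.0 * x * x - 3.0) * math.sqrt(1.0 + x * x) + 3.0 * math.asinh(x)


def pressure_over_density_power(x: float, exponent: float) -> float:
    """Return p / rho**exponent in units where p = f(x) and rho = x**3."""
    return chandrasekhar_f(x) / (x ** 3) ** exponent


if __name__ == "__main__":
    # Small x (low density): p / rho^(5/3) is nearly constant, p / rho^(4/3) tends to 0.
    # Large x (high density): p / rho^(4/3) approaches a finite limit, the K of (2.8).
    for x in (0.05, 0.1, 10.0, 100.0):
        low = pressure_over_density_power(x, 5.0 / 3.0)
        high = pressure_over_density_power(x, 4.0 / 3.0)
        print(f"x = {x:7.2f}   p/rho^(5/3) = {low:.5f}   p/rho^(4/3) = {high:.5f}")
```

In these units the high-density ratio tends to 2 and the low-density ratio $p/\rho^{5/3}$ tends to $8/5$, which matches the leading exponents $4/3$ and $5/3$ in (1.3).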


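In this paper, we are interested in minimizers of the functional $F$ in the larger class $W_M$. By the same argument as in [1], it is easy to prove the following theorem on the regularity of a minimizer.

Theorem 2.2. Suppose that the pressure function $p$ satisfies

$$\lim_{\rho \to 0+} \frac{p(\rho)}{\rho^{6/5}} = 0, \qquad \lim_{\rho \to \infty} \frac{p(\rho)}{\rho^{6/5}} = \infty, \qquad p'(\rho) > 0 \ \text{as}\ \rho > 0, \tag{2.15}$$

and the angular momentum satisfies (2.9). Let $\tilde\rho$ be a minimizer of the energy functional $F$ in $W_M$ and let

$$\Gamma = \{ x \in \mathbb{R}^3 : \tilde\rho(x) > 0 \}, \tag{2.16}$$

then $\tilde\rho \in C(\mathbb{R}^3) \cap C^1(\Gamma)$. Moreover, there exists a constant $\lambda$ such that

$$\begin{cases} A'(\tilde\rho(x)) + \int_{r(x)}^\infty L(m_{\tilde\rho}(s))\, s^{-3}\,ds - B\tilde\rho(x) = \lambda, & x \in \Gamma, \\ \int_{r(x)}^\infty L(m_{\tilde\rho}(s))\, s^{-3}\,ds - B\tilde\rho(x) \ge \lambda, & x \in \mathbb{R}^3 - \Gamma. \end{cases} \tag{2.17}$$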

We call such a minimizer $\tilde\rho$ a rotating star solution with total mass $M$ and angular momentum $\sqrt{L(m)}$.

3. General Existence and Stability Theorems

For the angular momentum, besides the condition (2.9), we also assume that it satisfies the following conditions:

$$L(am) \ge a^{4/3} L(m), \qquad 0 < a \le 1,\ 0 \le m \le M, \tag{3.1}$$

$$L'(m) \ge 0, \qquad 0 \le m \le M. \tag{3.2}$$

Condition (3.2) is called the Sölberg stability criterion ([35]).

3.1. Compactness of minimizing sequence. In this section, we first establish a compactness result for the minimizing sequences of the functional $F$. This compactness result is crucial for the existence and stability analyses.

Theorem 3.1. Suppose that the square of the angular momentum $L$ satisfies (2.9), (3.1) and (3.2), and the pressure function $p$ satisfies the following conditions:

$$p \in C^1[0, +\infty), \qquad \int_0^1 \frac{p(\rho)}{\rho^2}\,d\rho < +\infty, \qquad \lim_{\rho \to \infty} \frac{p(\rho)}{\rho^\gamma} = K, \qquad p(\rho) \ge 0,\ p'(\rho) > 0 \ \text{for}\ \rho > 0, \tag{3.3}$$

where $0 < K < +\infty$ and $\gamma \ge 4/3$. If

(1)
$$\inf_{\rho \in W_M} F(\rho) < 0, \tag{3.4}$$

and

(2) for $\rho \in W_M$,
$$\int \Big[ A(\rho)(x) + \frac12 \frac{\rho(x) L(m_\rho(r(x)))}{r(x)^2} \Big]\,dx \le C_1 F(\rho) + C_2, \tag{3.5}$$

for some positive constants $C_1$ and $C_2$, then the following hold:

(a) If $\{\rho^i\} \subset W_M$ is a minimizing sequence for the functional $F$, then there exist a sequence of vertical shifts $a_i e_3$ ($a_i \in \mathbb{R}$, $e_3 = (0, 0, 1)$), a subsequence of $\{\rho^i\}$ (still labeled $\{\rho^i\}$), and a function $\tilde\rho \in W_M$, such that for any $\epsilon > 0$ there exists $R > 0$ with

$$\int_{|x| \ge R} T\rho^i(x)\,dx \le \epsilon, \qquad i \in \mathbb{N}, \tag{3.6}$$

and

$$T\rho^i(x) \rightharpoonup \tilde\rho \ \text{weakly in}\ L^\gamma(\mathbb{R}^3), \ \text{as}\ i \to \infty, \tag{3.7}$$

where $T\rho^i(x) := \rho^i(x + a_i e_3)$. Moreover,

(b)
$$\nabla B(T\rho^i) \to \nabla B(\tilde\rho) \ \text{strongly in}\ L^2(\mathbb{R}^3), \ \text{as}\ i \to \infty. \tag{3.8}$$

(c) $\tilde\rho$ is a minimizer of $F$ in $W_M$. Thus $\tilde\rho$ is a rotating star solution with total mass $M$ and angular momentum $\sqrt{L}$.

Remark 2. i) The assumption (3.4) is crucial for our compactness and stability analysis. The physical meaning of this is that the gravitational energy, the negative part of the energy $F$, should be greater than the positive part, which means the gravitation should be strong enough to hold the star together. In Sect. 4, we will verify this assumption. Roughly speaking, in addition to (3.3), if we require

$$\lim_{\rho \to 0+} \frac{p(\rho)}{\rho^{\gamma_1}} = \alpha, \tag{3.9}$$

for some constants $\gamma_1 > 4/3$ and $0 < \alpha < +\infty$, then (3.4) holds for the following cases:

(a) When $\gamma = 4/3$ (where $\gamma$ is the constant in (3.3)), if the total mass $M$ is less than a "critical mass" $M_c$, then (3.4) holds. This case includes white dwarf stars. For a white dwarf star, $\gamma_1 = 5/3$.

(b) When $\gamma > 4/3$, (3.4) holds for arbitrary positive total mass $M$. This generalizes our previous result in [26] for the polytropic stars with $p(\rho) = \rho^\beta$, $\beta > 4/3$.

It should be noted that (3.9) does not apply to the supermassive star, i.e. $p(\rho) = k\rho^{4/3}$. For the supermassive star, in order that (3.4) holds, in addition to requiring that the total mass is less than a "critical mass", we also require that the angular momentum (per unit mass) $J$ is not identically zero.


ii) Assumption (2) in the above theorem implies that the functional $F$ is bounded below, i.e.,

$$\inf_{\rho \in W_M} F(\rho) > -\infty. \tag{3.10}$$

We will verify this assumption in Sect. 4 (see Theorem 4.1).

iii) The inequality (3.6) is crucial for the compactness result (3.8). One of the difficulties in the analysis is the loss of compactness because we consider the problem in an unbounded space, $\mathbb{R}^3$. The inequality (3.6) means the masses of the elements in the minimizing sequence $T\rho^i(x)$ "almost" concentrate in a ball $B_R(0)$.

iv) It is easy to verify that the functional $F$ is invariant under any vertical shift, i.e., if $\rho(\cdot) \in W_M$, then $\bar\rho(x) := \rho(x + a e_3) \in W_M$ and $F(\bar\rho) = F(\rho)$ for any $a \in \mathbb{R}$. Therefore, if $\{\rho^i\}$ is a minimizing sequence of $F$ in $W_M$, then $\{T\rho^i\} := \{\rho^i(x + a_i e_3)\}$ is also a minimizing sequence in $W_M$.

Theorem 3.1 is proved in a sequence of lemmas with some modifications of the arguments in [26]. We only sketch the proofs of those lemmas and Theorem 3.1. Complete details can be followed as in [26]. We first give some inequalities which will be used later. We begin with Young's inequality (see [14], p. 146).

Lemma 3.1. If $f \in L^p \cap L^r$, $1 \le p < q < r \le +\infty$, then

$$\|f\|_q \le \|f\|_p^a\, \|f\|_r^{1-a}, \qquad a = \frac{q^{-1} - r^{-1}}{p^{-1} - r^{-1}}. \tag{3.11}$$
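The interpolation inequality (3.11) follows from Hölder's inequality and therefore holds on any measure space. As a quick sanity check, the sketch below tests it for finite sequences (counting measure) rather than for functions on $\mathbb{R}^3$; this discrete setting and the helper names are assumptions made for illustration only.

```python
def lp_norm(values, p):
    """l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def interpolation_exponent(p, q, r):
    """Exponent a from (3.11): a = (1/q - 1/r) / (1/p - 1/r), with 1/inf read as 0."""
    inv_r = 0.0 if r == float("inf") else 1.0 / r
    return (1.0 / q - inv_r) / (1.0 / p - inv_r)


def interpolation_gap(values, p, q, r):
    """Right-hand side of (3.11) minus its left-hand side; should be nonnegative."""
    a = interpolation_exponent(p, q, r)
    bound = lp_norm(values, p) ** a * lp_norm(values, r) ** (1.0 - a)
    return bound - lp_norm(values, q)
```

For example, `interpolation_exponent(1, 6/5, 4/3)` gives $a = 1/3$, the exponent that bounds an $L^{6/5}$ norm by the $L^1$ and $L^{4/3}$ norms used repeatedly in this section.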

The following two lemmas are proved in [1].

Lemma 3.2. Suppose the function $f \in L^1(\mathbb{R}^3) \cap L^q(\mathbb{R}^3)$. If $1 < q \le 3/2$, then $Bf := f * \frac{1}{|x|}$ is in $L^r(\mathbb{R}^3)$ for $3 < r < 3q/(3 - 2q)$, and

$$\|Bf\|_r \le C \big( \|f\|_1^b\, \|f\|_q^{1-b} + \|f\|_1^c\, \|f\|_q^{1-c} \big), \tag{3.12}$$

for some constants $C > 0$, $0 < b < 1$, and $0 < c < 1$. If $q > 3/2$, then $Bf(x)$ is a bounded continuous function, and satisfies (3.12) with $r = \infty$.

Lemma 3.3. For any function $f \in L^1(\mathbb{R}^3) \cap L^{4/3}(\mathbb{R}^3)$, $\nabla Bf \in L^2(\mathbb{R}^3)$. Moreover,

$$\Big| \int f(x) Bf(x)\,dx \Big| = \frac{1}{4\pi} \|\nabla Bf\|_2^2 \le C \Big( \int |f|^{4/3}(x)\,dx \Big) \Big( \int |f|(x)\,dx \Big)^{2/3}, \tag{3.13}$$

for some constant $C$.

We also need the following lemma.

Lemma 3.4. Suppose that the pressure function $p$ satisfies (3.3) and that (3.5) holds. Let $\{\rho^i\} \subset W_M$ be a minimizing sequence for the functional $F$. Then there exists a constant $C > 0$ such that

$$\int \Big[ (\rho^i)^\gamma(x) + \frac12 \frac{\rho^i(x) L(m_{\rho^i}(r(x)))}{r(x)^2} \Big]\,dx \le C, \quad \text{for all}\ i \ge 1, \tag{3.14}$$

where $\gamma \ge 4/3$ is the constant in (3.3). So, the sequence $\{\rho^i\}$ is bounded in $L^\gamma(\mathbb{R}^3)$.

Proof. By (3.5), we know that

$$\int \Big[ A(\rho^i)(x) + \frac12 \frac{\rho^i(x) L(m_{\rho^i}(r(x)))}{r(x)^2} \Big]\,dx \le C, \quad \text{for all}\ i \ge 1, \tag{3.15}$$

for any minimizing sequence $\{\rho^i\} \subset W_M$ for the functional $F$, where we have used that $\{F(\rho^i)\}$ is bounded from above since it converges to $\inf_{W_M} F$. It is easy to verify that, by virtue of (3.3) and (2.5),

$$\lim_{\rho \to \infty} \frac{A(\rho)}{\rho^\gamma} = \frac{K}{\gamma - 1}, \qquad A(\rho) > 0 \ \text{for}\ \rho > 0. \tag{3.16}$$

Therefore, there exists a constant $\rho^* > 0$ such that

$$\alpha A(\rho) \ge \rho^\gamma \quad \text{for}\ \rho \ge \rho^*, \tag{3.17}$$

where $\alpha = \frac{2(\gamma - 1)}{K}$. Hence, for $\rho \in W_M$,

$$\int \rho^\gamma\,dx = \int_{\rho < \rho^*} \rho^\gamma\,dx + \int_{\rho \ge \rho^*} \rho^\gamma\,dx \le (\rho^*)^{\gamma - 1} \int_{\rho < \rho^*} \rho\,dx + \alpha \int_{\rho \ge \rho^*} A(\rho)\,dx \le (\rho^*)^{\gamma - 1} M + \alpha \int A(\rho)\,dx. \tag{3.18}$$

Applying this inequality to $\rho^i$, we conclude that the sequence $\{\rho^i\}$ is bounded in $L^\gamma(\mathbb{R}^3)$ by using (3.15).

For any $M > 0$, we let

$$f_M = \inf_{\rho \in W_M} F(\rho). \tag{3.19}$$
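As an illustration of the asymptotics (3.16) used in the proof of Lemma 3.4, consider the pure polytrope $p(s) = k s^\gamma$ with $\gamma > 1$. This is an assumed special case, chosen because the integral in (2.5) can be evaluated in closed form:

```latex
\[
A(\rho) \;=\; \rho \int_0^{\rho} \frac{k s^{\gamma}}{s^{2}}\,ds
        \;=\; \rho \cdot \frac{k\,\rho^{\gamma-1}}{\gamma-1}
        \;=\; \frac{k}{\gamma-1}\,\rho^{\gamma},
\qquad
A'(\rho) \;=\; \frac{k\gamma}{\gamma-1}\,\rho^{\gamma-1}.
\]
\[
\text{Hence } \frac{A(\rho)}{\rho^{\gamma}} = \frac{K}{\gamma-1} \text{ with } K = k,
\qquad\text{and}\qquad
\alpha A(\rho) = \frac{2(\gamma-1)}{K}\cdot\frac{K}{\gamma-1}\,\rho^{\gamma} = 2\rho^{\gamma} \;\ge\; \rho^{\gamma}.
\]
```

For this equation of state (3.17) holds for every $\rho \ge 0$, so any $\rho^* > 0$ may be used in (3.18). For a general pressure satisfying (3.3), the same bound holds only for large $\rho$, which is why the threshold $\rho^*$ appears.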

Lemma 3.5. If (3.1) holds, then $f_{\bar M} \ge (\bar M / M)^{5/3} f_M$ for every $M > \bar M > 0$.

Proof. The proof follows from a scaling argument as in [31] and [26]. Take $a = (M / \bar M)^{1/3}$ and let $\bar\rho(x) = \rho(ax)$ for any $\rho \in W_M$. It is easy to verify that $\bar\rho \in W_{\bar M}$. Moreover, for $r \ge 0$, it is easy to verify (as in [26]) that

$$m_{\bar\rho}(r) = \frac{1}{a^3}\, m_\rho(ar). \tag{3.20}$$

Since $L$ satisfies (3.1) and $a > 1$, we have

$$L(m_{\bar\rho}(r)) \ge \frac{1}{a^4}\, L(m_\rho(ar)). \tag{3.21}$$

Thus, as in [26], we can show that

$$\int \frac{\bar\rho(x) L(m_{\bar\rho}(r(x)))}{r(x)^2}\,dx \ge \frac{1}{a^5} \int \frac{\rho(x) L(m_\rho(r(x)))}{r(x)^2}\,dx. \tag{3.22}$$
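The powers of $a$ in this scaling argument can be checked by the change of variables $y = ax$, $dy = a^3\,dx$. The following is a sketch of the computation under the definitions (2.2), (2.5) and $\bar\rho(x) = \rho(ax)$, not a replacement for the details in [26]:

```latex
\[
\int \bar\rho(x)\,dx = a^{-3}\!\int \rho(y)\,dy = a^{-3}M = \bar M,
\qquad
\int A(\bar\rho(x))\,dx = a^{-3}\!\int A(\rho(y))\,dy,
\]
\[
B\bar\rho(x) = \int \frac{\rho(ay)}{|x-y|}\,dy
             = a^{-3}\!\int \frac{\rho(y')}{|x - y'/a|}\,dy'
             = a^{-2}\,(B\rho)(ax),
\qquad
\int \bar\rho\,B\bar\rho\,dx = a^{-5}\!\int \rho\,B\rho\,dy,
\]
\[
\int \frac{\rho(ax)\,L(m_\rho(a\,r(x)))}{a^{4}\,r(x)^{2}}\,dx
 \;=\; a^{-4}\cdot a^{2}\cdot a^{-3}\!\int \frac{\rho(y)\,L(m_\rho(r(y)))}{r(y)^{2}}\,dy
 \;=\; a^{-5}\!\int \frac{\rho(y)\,L(m_\rho(r(y)))}{r(y)^{2}}\,dy .
\]
```

The last line uses $r(ax) = a\,r(x)$ together with (3.21), and gives the factor $a^{-5}$ in (3.22). The gravitational term scales with the same factor $a^{-5}$, while the internal energy term scales with $a^{-3} \ge a^{-5}$.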

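Therefore, since $a \ge 1$, it follows from (3.21) and (3.22) that

$$\begin{aligned} F(\bar\rho) &\ge a^{-3} \int A(\rho)\,dx - \frac{a^{-5}}{2} \int \rho B\rho\,dx + \frac{a^{-5}}{2} \int \frac{\rho(x) L(m_\rho(r(x)))}{r(x)^2}\,dx \\ &\ge a^{-5} \Big( \int A(\rho)\,dx - \frac12 \int \rho B\rho\,dx + \frac12 \int \frac{\rho(x) L(m_\rho(r(x)))}{r(x)^2}\,dx \Big) = (\bar M / M)^{5/3} F(\rho). \end{aligned} \tag{3.23}$$

Since $\rho \to \bar\rho$ is one-to-one between $W_M$ and $W_{\bar M}$, this proves the lemma.

Lemma 3.6. Let $\{\rho^i\} \subset W_M$ be a minimizing sequence for $F$. Then there exist constants $r_0 > 0$, $\delta_0 > 0$, $i_0 \in \mathbb{N}$ and $x^i \in \mathbb{R}^3$ with $r(x^i) \le r_0$, such that

$$\int_{B_1(x^i)} \rho^i(x)\,dx \ge \delta_0, \qquad i \ge i_0. \tag{3.24}$$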

Proof. First, since $\lim_{i \to \infty} F(\rho^i) = f_M$ and $f_M < 0$ (see (3.4)), for large $i$,

$$-\frac{f_M}{2} \le -F(\rho^i) \le \frac12 \int \rho^i B\rho^i\,dx. \tag{3.25}$$

For any $i$, let

$$\delta_i = \sup_{x \in \mathbb{R}^3} \int_{|y - x| < 1} \rho^i(y)\,dy. \tag{3.26}$$

Now

$$\int \rho^i B\rho^i(x)\,dx = \int_{\mathbb{R}^3} \rho^i(x) \Big( \int_{|y - x| < 1} + \int_{1 < |y - x| < r} + \int_{|y - x| > r} \Big) \frac{\rho^i(y)}{|y - x|}\,dy\,dx \tag{3.27}$$

$$=: D_1 + D_2 + D_3, \tag{3.28}$$

and $D_3 \le M^2 r^{-1}$. The shell $1 < |y - x| < r$ can be covered by at most $C r^3$ balls of radius 1, so $D_2 \le C M \delta_i r^3$. By using Hölder's inequality and applying (3.12) to the restriction of $\rho^i$ to $\{ y : |y - x| < 1 \}$, we get

$$\begin{aligned} D_1 &\le \|\rho^i\|_{4/3}\, \Big\| \int_{|y - x| < 1} \frac{\rho^i(y)}{|y - x|}\,dy \Big\|_4 \\ &\le C \|\rho^i\|_{4/3} \big( \|\chi_{B_1(x)} \rho^i\|_1^b\, \|\rho^i\|_{4/3}^{1-b} + \|\chi_{B_1(x)} \rho^i\|_1^c\, \|\rho^i\|_{4/3}^{1-c} \big) \\ &\le C \|\rho^i\|_{4/3} \big( \delta_i^b\, \|\rho^i\|_{4/3}^{1-b} + \delta_i^c\, \|\rho^i\|_{4/3}^{1-c} \big), \end{aligned} \tag{3.29}$$

where $0 < b < 1$ and $0 < c < 1$. Now since $\{\|\rho^i\|_\gamma\}$ is bounded, it follows that $\{\|\rho^i\|_{4/3}\}$ is bounded due to the fact $\gamma \ge 4/3$ in view of (3.11) and $\|\rho^i\|_1 = M$; this gives $D_1 \le C(\delta_i^b + \delta_i^c)$. It follows that we could choose $r$ so large that the above estimates give $\int \rho^i B\rho^i(x)\,dx < -f_M$ if $\delta_i$ were small enough. This would contradict (3.25). So there exists $\delta_0 > 0$ such that $\delta_i \ge \delta_0$ for large $i$. Thus, as $i$ is large, there exist $x^i \in \mathbb{R}^3$ and $i_0 \in \mathbb{N}$ such that

$$\int_{B_1(x^i)} \rho^i(x)\,dx \ge \delta_0, \qquad i \ge i_0. \tag{3.30}$$

We now prove that there exists $r_0 > 0$ independent of $i$ such that $x^i$ must satisfy $r(x^i) \le r_0$ for $i$ large. Namely, since $\rho^i$ has mass at least $\delta_0$ in the unit ball centered at $x^i$, and is axially symmetric, it has mass $\ge C r(x^i) \delta_0$ in the torus obtained by revolving this ball around the $x_3$-axis (or $z$-axis). Therefore $r(x^i) \le (C\delta_0)^{-1} M$.

In order to prove Theorem 3.1, we will need the following lemma.

Lemma 3.7. Let $\{f^i\}$ be a bounded sequence in $L^\gamma(\mathbb{R}^3)$ ($\gamma \ge 4/3$) and suppose $f^i \rightharpoonup f^0$ weakly in $L^\gamma(\mathbb{R}^3)$. Then

(a) For any $R > 0$, $\nabla B(\chi_{B_R(0)} f^i) \to \nabla B(\chi_{B_R(0)} f^0)$ strongly in $L^2(\mathbb{R}^3)$, where $\chi$ is the characteristic function.

(b) If in addition $\{f^i\}$ is bounded in $L^1(\mathbb{R}^3)$, $f^0 \in L^1(\mathbb{R}^3)$, and for any $\epsilon > 0$ there exist $R > 0$ and $i_0 \in \mathbb{N}$ such that

$$\int_{|x| > R} |f^i(x)|\,dx < \epsilon, \qquad i \ge i_0, \tag{3.31}$$

then $\nabla B f^i \to \nabla B f^0$ strongly in $L^2(\mathbb{R}^3)$.

Proof. This lemma follows easily from the proof of Lemma 3.7 in [31], due to the following observation: the map $\rho \in L^\gamma(\mathbb{R}^3) \mapsto \chi_{B_R(0)} \nabla B(\chi_{B_R(0)} \rho)$ is compact for any $R > 0$, if $\gamma \ge 4/3$, where $\chi$ denotes the characteristic function.

With the above lemmas, the proof of Theorem 3.1 is similar to that in [26]. So we only outline the main steps.

Proof of Theorem 3.1. Step 1. Splitting. We begin with a splitting as in [31]. For $\rho \in W_M$, for any $0 < R_1 < R_2$, we have

$$\rho = \rho \chi_{|x| \le R_1} + \rho \chi_{R_1 < |x| \le R_2} + \rho \chi_{|x| > R_2} =: \rho_1 + \rho_2 + \rho_3, \tag{3.32}$$


where again $\chi$ is the characteristic function. It is easy to verify that
\[
\int \frac{\rho(x)L(m_\rho(r(x)))}{r^2(x)}\,dx
= \sum_{j=1}^{3}\int \frac{\rho_j(x)L(m_{\rho_j}(r(x)))}{r^2(x)}\,dx
+ \sum_{j=1}^{3}\int \frac{\rho_j(x)\big(L(m_\rho(r(x))) - L(m_{\rho_j}(r(x)))\big)}{r^2(x)}\,dx
\ge \sum_{j=1}^{3}\int \frac{\rho_j(x)L(m_{\rho_j}(r(x)))}{r^2(x)}\,dx. \tag{3.33}
\]

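In the last inequality above, we have used (3.2). So, we have
\[
F(\rho) \ge \sum_{j=1}^{3} F(\rho_j) - \sum_{1\le i<j\le 3} I_{ij}, \tag{3.34}
\]
where
\[
I_{ij} = \int_{\mathbb{R}^3}\int_{\mathbb{R}^3} |x-y|^{-1}\rho_i(x)\rho_j(y)\,dx\,dy, \qquad 1 \le i < j \le 3.
\]
If we choose $R_2 > 2R_1$ in the splitting (3.32), then
\[
I_{13} \le \frac{C}{R_2}. \tag{3.35}
\]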
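The bound (3.35) can be checked directly from the supports of $\rho_1$ and $\rho_3$. The following is a short sketch, assuming only the splitting (3.32) with $R_2 > 2R_1$ and total mass $\int\rho\,dx = M$; the explicit constant $2M^2$ is one admissible choice of $C$ for illustration, not a value stated in the text.

```latex
% For x in supp(rho_1) and y in supp(rho_3) we have |x| <= R_1 and |y| > R_2 > 2 R_1, so
\[
  |x-y| \;\ge\; |y|-|x| \;>\; R_2 - R_1 \;>\; \tfrac12 R_2 .
\]
% Inserting this lower bound into the definition of I_{13}:
\[
  I_{13} = \int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\frac{\rho_1(x)\,\rho_3(y)}{|x-y|}\,dx\,dy
  \;\le\; \frac{2}{R_2}\int\rho_1\,dx\int\rho_3\,dy
  \;\le\; \frac{2M^2}{R_2}.
\]
```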

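By (3.12) and (3.13), we have
\[
I_{12} + I_{23} = \frac{1}{4\pi}\int \nabla(B\rho_1 + B\rho_3)\cdot\nabla B\rho_2\,dx
\le C\,\|\nabla(B\rho_1 + B\rho_3)\|_2\,\|\nabla B\rho_2\|_2
\le C M^{1/3}\,\|\rho_1+\rho_3\|_{4/3}^{2/3}\,\|\nabla B\rho_2\|_2
\le C M^{1/3}\,\|\rho\|_{4/3}^{2/3}\,\|\nabla B\rho_2\|_2. \tag{3.36}
\]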

Using Lemma 3.5, (3.4), (3.34), (3.35) and (3.36), and following an argument as in the proof of Theorem 3.1 in [31], we can show that
\[
f_M - F(\rho) \le \Big(1 - \Big(\frac{M_1}{M}\Big)^{5/3} - \Big(\frac{M_2}{M}\Big)^{5/3} - \Big(\frac{M_3}{M}\Big)^{5/3}\Big) f_M + C\big(R_2^{-1} + M^{1/3}\|\rho\|_{4/3}^{2/3}\,\|\nabla B\rho_2\|_2\big)
\le C f_M M_1 M_3 + C\big(R_2^{-1} + M^{1/3}\|\rho\|_{4/3}^{2/3}\,\|\nabla B\rho_2\|_2\big), \tag{3.37}
\]
by choosing $R_2 > 2R_1$ in the splitting (3.32), where
\[
M_i = \int \rho_i(x)\,dx \qquad (i = 1, 2, 3).
\]

Step 2. Compactness. Let $\{\rho^i\}$ be a minimizing sequence of $F$ in $W_M$. By Lemma 3.6, we know that there exist $i_0 \in \mathbb{N}$ and $\delta_0 > 0$ independent of $i$ such that
\[
\int_{a_i e_3 + B_{R_0}(0)} \rho^i(x)\,dx \ge \delta_0, \qquad \text{if } i \ge i_0, \tag{3.38}
\]


where $a_i = z(x^i)$ and $R_0 = r_0 + 1$; $x^i$ and $r_0$ are the quantities in Lemma 3.6, and $e_3 = (0, 0, 1)$. Having proved (3.38), we can follow the argument in the proof of Theorem 3.1 in [31] to verify (3.31) for $f^i(x) = T\rho^i(x) := \rho^i(\cdot + a_i e_3)$ by using (3.34) and (3.38) and choosing suitable $R_1$ and $R_2$ in the splitting (3.32). We sketch this as follows. The sequence $T\rho^i := \rho^i(\cdot + a_i e_3)$, $i \ge i_0$, is a minimizing sequence of $F$ in $W_M$ (see Remark 2 after Theorem 3.1). We rewrite (3.38) as
\[
\int_{B_{R_0}(0)} T\rho^i(x)\,dx \ge \delta_0, \qquad i \ge i_0. \tag{3.39}
\]
Applying (3.37) with $T\rho^i$ replacing $\rho$, and noticing that $\{T\rho^i\}$ is bounded in $L^\gamma(\mathbb{R}^3)$ (see Lemma 3.4) (so $\{\|T\rho^i\|_{4/3}\}$ is bounded if $\gamma \ge 4/3$, in view of (3.11) and the fact $\|\rho^i\|_1 = M$), we obtain, if $R_2 > 2R_1$,
\[
-C f_M M_1^i M_3^i \le C\big(R_2^{-1} + \|\nabla B T\rho_2^i\|_2\big) + F(T\rho^i) - f_M, \tag{3.40}
\]
where $M_1^i = \int T\rho_1^i(x)\,dx = \int_{|x|\le R_1} T\rho^i(x)\,dx$, $M_3^i = \int T\rho_3^i(x)\,dx = \int_{|x|>R_2} T\rho^i(x)\,dx$ and $T\rho_2^i = \chi_{R_1<|x|\le R_2} T\rho^i$. Since $\{T\rho^i\}$ is bounded in $L^\gamma(\mathbb{R}^3)$, there exist a subsequence, still labeled $\{T\rho^i\}$, and a function $\tilde\rho \in W_M$ such that $T\rho^i \rightharpoonup \tilde\rho$ weakly in $L^\gamma(\mathbb{R}^3)$. This proves (3.7). By (3.39), we know that $M_1^i$ in (3.40) satisfies $M_1^i \ge \delta_0$ for $i \ge i_0$ by choosing $R_1 \ge R_0$, where $R_0$ is the constant in (3.39). Therefore, by (3.40) and the fact that $f_M < 0$ (cf. (3.4)), we have
\[
-C f_M \delta_0 M_3^i \le C R_2^{-1} + C\|\nabla B\tilde\rho_2\|_2 + C\|\nabla B T\rho_2^i - \nabla B\tilde\rho_2\|_2 + F(T\rho^i) - f_M, \tag{3.41}
\]
where $\tilde\rho_2 = \chi_{|x|>R_2}\tilde\rho$. Given any $\varepsilon > 0$, by the same argument as in [31], we can increase $R_1 > R_0$ such that the second term on the right-hand side of (3.41) is small, say less than $\varepsilon/4$. Next choose $R_2 > 2R_1$ such that the first term is small. Now that $R_1$ and $R_2$ are fixed, the third term on the right-hand side of (3.41) converges to zero by Lemma 3.7(a). Since $\{T\rho^i\}$ is a minimizing sequence of $F$ in $W_M$, we can make $F(T\rho^i) - f_M$ small by taking $i$ large. Therefore, for $i$ sufficiently large, we can make
\[
M_3^i := \int_{|x|>R_2} T\rho^i(x)\,dx < \varepsilon. \tag{3.42}
\]

This verifies (3.31) in Lemma 3.7 for $f^i = T\rho^i$. By weak convergence we have that for any $\varepsilon > 0$ there exists $R > 0$ such that
\[
M - \varepsilon \le \int_{B_R(0)} \tilde\rho(x)\,dx \le M,
\]
which implies $\tilde\rho \in L^1(\mathbb{R}^3)$ with $\int \tilde\rho\,dx = M$. Therefore, by Lemma 3.7(b), we have
\[
\|\nabla B T\rho^i - \nabla B\tilde\rho\|_2 \to 0, \qquad i \to +\infty. \tag{3.43}
\]
This proves (3.8). Equation (3.6) in Theorem 3.1 follows from (3.42) by taking $R = R_2$.


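Step 3. Lower Semi-Continuity. Let $\{\rho^i\}$ be a minimizing sequence of the energy functional $F$, and let $\tilde\rho$ be a weak limit of $\{T\rho^i\}$ in $L^\gamma(\mathbb{R}^3)$. We will prove that $\tilde\rho$ is a minimizer of $F$ in $W_M$; that is,
\[
F(\tilde\rho) \le \liminf_{i\to\infty} F(T\rho^i). \tag{3.44}
\]
By (3.3), there exist positive constants $C$ and $\rho^*$ such that
\[
A'(\rho) \le C\rho^{\gamma-1}, \qquad \text{for } \rho \ge \rho^*, \tag{3.45}
\]
where $\gamma \ge 4/3$ is the constant in (3.3). Since $\tilde\rho \in L^\gamma$ and $\int\tilde\rho\,dx = M$, we can conclude $A'(\tilde\rho) \in L^{\gamma'}$, where $L^{\gamma'}$ is the dual space of $L^\gamma$, i.e., $\gamma' = \frac{\gamma}{\gamma-1}$. In view of (2.5) and (3.3), we have
\[
A''(\rho) = p'(\rho)/\rho > 0, \qquad \text{for } \rho > 0, \tag{3.46}
\]
so that
\[
\int A(T\rho^i)\,dx \ge \int A(\tilde\rho)\,dx + \int A'(\tilde\rho)(T\rho^i - \tilde\rho)\,dx, \qquad \text{for } i \ge 1. \tag{3.47}
\]
Since $A'(\tilde\rho) \in L^{\gamma'}$ and $T\rho^i$ converges weakly to $\tilde\rho$ in $L^\gamma$,
\[
\int A'(\tilde\rho)(T\rho^i - \tilde\rho)\,dx \to 0, \qquad \text{as } i \to +\infty. \tag{3.48}
\]
Therefore,
\[
\int A(\tilde\rho)\,dx \le \liminf_{i\to\infty}\int A(T\rho^i)\,dx. \tag{3.49}
\]
Next, following the proof in [26], we can show that
\[
\liminf_{i\to\infty}\int \frac{T\rho^i(x)L(m_{T\rho^i}(r(x))) - \tilde\rho(x)L(m_{\tilde\rho}(r(x)))}{r^2(x)}\,dx \ge 0, \tag{3.50}
\]
by showing that the mass function
\[
m_{\tilde\rho}(r) := \int_{\sqrt{x_1^2+x_2^2}\,\le r} \tilde\rho(x)\,dx
\]
is continuous for $r \ge 0$, and using (3.6). Then (3.44) follows from (3.43), (3.49) and (3.50).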
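The step from (3.46) to (3.47) and (3.49) is the standard tangent-line inequality for a convex function. A minimal sketch, assuming $A$ is differentiable on $[0,\infty)$ (one-sided at $0$) and that all integrals below are finite:

```latex
% A'' > 0 on (0, infinity) by (3.46), so A lies above each of its tangent lines:
\[
  A(s) \;\ge\; A(t) + A'(t)\,(s-t), \qquad s,\,t \ge 0 .
\]
% Take s = T\rho^i(x), t = \tilde\rho(x) and integrate over R^3 to obtain (3.47):
\[
  \int A(T\rho^i)\,dx \;\ge\; \int A(\tilde\rho)\,dx
  + \int A'(\tilde\rho)\,(T\rho^i-\tilde\rho)\,dx .
\]
% The last integral tends to 0 by (3.48), because A'(\tilde\rho) lies in the dual
% space L^{\gamma'} and T\rho^i converges weakly to \tilde\rho in L^\gamma.
% Taking liminf as i -> infinity gives (3.49).
```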


3.2. Stability. In this section, we assume that the pressure function $p$ satisfies
\[
p \in C^1[0, +\infty), \qquad \lim_{\rho\to 0+}\frac{p(\rho)}{\rho^{6/5}} = 0, \qquad \lim_{\rho\to\infty}\frac{p(\rho)}{\rho^{\gamma}} = K, \qquad p'(\rho) > 0 \ \text{for } \rho > 0, \tag{3.51}
\]
where $0 < K < +\infty$ and $\gamma \ge 4/3$ are constants. It should be noticed that (3.51) implies both (2.15) and (3.3). We consider the Cauchy problem for (1.1) with the initial data
\[
\rho(x, 0) = \rho_0(x), \qquad v(x, 0) = v_0(x). \tag{3.52}
\]

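We begin by giving the definition of a weak solution.

Definition 3.1. Let $\rho v = m$. The triple $(\rho, m, \Phi)(x, t)$ ($x \in \mathbb{R}^3$, $t \in [0, T]$, $T > 0$), with $\Phi$ given by (1.2) and with $\rho \ge 0$, $p(\rho)$, $m$, $m\otimes m/\rho$ and $\rho\nabla\Phi$ being in $L^1(\mathbb{R}^3\times[0, T])$, is called a weak solution of the Cauchy problem (1.1) and (3.52) on $\mathbb{R}^3\times[0, T]$ if for any Lipschitz continuous test function $\psi$ with compact support in $\mathbb{R}^3\times[0, T]$,
\[
\int_0^T\!\!\int \big(\rho\psi_t + m\cdot\nabla\psi + p(\rho)\nabla\psi\big)\,dx\,dt + \int \rho_0(x)\psi(x, 0)\,dx = 0, \tag{3.53}
\]
and
\[
\int_0^T\!\!\int \Big(m\psi_t + \Big(p(\rho)I + \frac{m\otimes m}{\rho}\Big)\nabla\psi\Big)\,dx\,dt + \int m_0(x)\psi(x, 0)\,dx = \int_0^T\!\!\int \rho\nabla\Phi\,\psi\,dx\,dt, \tag{3.54}
\]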

where $I$ is the $3\times 3$ unit matrix. The total energy of system (1.1) at time $t$ is
\[
E(t) = E(\rho(t), v(t)) = \int \Big(A(\rho) + \frac12\rho|v|^2\Big)(x, t)\,dx - \frac{1}{8\pi}\int |\nabla\Phi|^2(x, t)\,dx, \tag{3.55}
\]
where, as before,
\[
A(\rho) = \rho\int_0^\rho \frac{p(s)}{s^2}\,ds. \tag{3.56}
\]
For a solution of (1.1) without shock waves, the total energy is conserved, i.e., $E(t) = E(0)$ ($t \ge 0$) (cf. [35]). For solutions with shock waves, the energy should be non-increasing in time, so that for all $t \ge 0$,
\[
E(t) \le E(0), \tag{3.57}
\]
due to the entropy conditions, which are described below.

Definition 3.2. A weak solution (defined above) on $\mathbb{R}^3\times[0, T]$ is called an entropy weak solution of (1.1) if it satisfies the following "entropy inequality":
\[
\partial_t\eta + \sum_{j=1}^{3}\partial_{x_j} q_j + \rho\sum_{j=1}^{3}\eta_{m_j}\Phi_{x_j} \le 0, \tag{3.58}
\]


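in the sense of distributions; i.e.,
\[
\int_0^T\!\!\int_{\mathbb{R}^3}\Big(\eta\beta_t + q\cdot\nabla\beta - \rho\sum_{j=1}^{3}\eta_{m_j}\Phi_{x_j}\beta\Big)\,dx\,dt + \int_{\mathbb{R}^3}\beta(x, 0)\eta(x, 0)\,dx \ge 0, \tag{3.59}
\]
for any nonnegative Lipschitz continuous test function $\beta$ with compact support in $[0, T)\times\mathbb{R}^3$. Here the "entropy" function $\eta$ and the "entropy flux" functions $q_j$ and $q$ are defined by
\[
\begin{cases}
\eta = \dfrac{|m|^2}{2\rho} + \rho\displaystyle\int_0^\rho \frac{p(s)}{s^2}\,ds,\\[2mm]
q_j = \dfrac{|m|^2 m_j}{2\rho^2} + m_j\displaystyle\int_0^\rho \frac{p'(s)}{s}\,ds,\\[2mm]
q = (q_1, q_2, q_3).
\end{cases} \tag{3.60}
\]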

Remark 3. The inequality (3.58) is motivated by the second law of thermodynamics ([17]), and plays an important role in shock wave theory ([34]). For smooth solutions, the inequality in (3.58) can be replaced by equality.

Some properties of entropy weak solutions are given in the following theorem.

Theorem 3.2. If $(\rho, m) \in L^\infty([0, T]; L^1(\mathbb{R}^3))$ satisfies the first equation in (1.1) in the sense of distributions, then
\[
\int_{\mathbb{R}^3}\rho(x, t)\,dx = \int_{\mathbb{R}^3}\rho(x, 0)\,dx =: M, \qquad 0 < t < T. \tag{3.61}
\]
Let $(\rho, m, \Phi)$ be a weak solution as defined in Definition 3.1. Suppose $(\rho, m, \Phi)$ satisfies the entropy condition (3.58), $\rho \in L^\infty([0, T]; L^1(\mathbb{R}^3)) \cap L^\infty([0, T]; L^r(\mathbb{R}^3))$ for some $r$ satisfying $r > 3/2$ and $r \ge \gamma$ ($\gamma \ge 4/3$ is the constant in (3.51)), $m \in L^\infty([0, T]; L^s(\mathbb{R}^3))$ ($s > 3$), and $(\eta, q) \in L^\infty([0, T]; L^1(\mathbb{R}^3))$, where $\eta$ and $q$ are given in (3.60). Moreover, we assume that $(\rho, m)$ has the following additional regularity:
\[
\lim_{h\to 0}\int_0^t\!\!\int_{\mathbb{R}^3}|\rho(x, \tau + h) - \rho(x, \tau)|\,dx\,d\tau = 0, \qquad t \in (0, T), \ \text{a.e.} \tag{3.62}
\]
Then
\[
E(t) \le E(0), \qquad 0 < t < T, \tag{3.63}
\]
where $E(t)$ is defined in (3.55).

The proof of this theorem is the same as that for Theorem 5.1 in [26], so we omit it.

Remark 4. The local existence of smooth solutions of the Cauchy problem (1.1) and (3.52) can be found in [29]. The local existence of solutions with shock fronts for the equations of compressible fluids can be found in [27]. The global existence of solutions for compressible fluids in three dimensions has been a major open problem. It would be possible to prove the global existence of entropy weak solutions with symmetry, by using some ideas for compressible Euler equations as in [20]. In this paper, we consider the weak solutions of the Cauchy problem satisfying some physically reasonable properties.


We consider axi-symmetric initial data, which take the form
\[
\rho_0(x) = \rho_0(r, z), \qquad v_0(x) = v_0^r(r, z)e_r + v_0^\theta(r, z)e_\theta + v_0^3(r, z)e_3. \tag{3.64}
\]
Here $r = \sqrt{x_1^2 + x_2^2}$, $z = x_3$, $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ (as before), and
\[
e_r = (x_1/r, x_2/r, 0)^T, \qquad e_\theta = (-x_2/r, x_1/r, 0)^T, \qquad e_3 = (0, 0, 1)^T. \tag{3.65}
\]
We seek axi-symmetric solutions of the form
\[
\rho(x, t) = \rho(r, z, t), \qquad v(x, t) = v^r(r, z, t)e_r + v^\theta(r, z, t)e_\theta + v^3(r, z, t)e_3, \tag{3.66}
\]
\[
\Phi(x, t) = \Phi(r, z, t) = -B\rho(r, z, t). \tag{3.67}
\]
We call a vector field $u(x) = (u_1, u_2, u_3)(x)$ ($x \in \mathbb{R}^3$) axi-symmetric if it can be written in the form $u(x) = u^r(r, z)e_r + u^\theta(r, z)e_\theta + u^3(r, z)e_3$. For the velocity field $v = (v_1, v_2, v_3)(x, t)$, we define the angular momentum (per unit mass) $j(x, t)$ about the $x_3$-axis at $(x, t)$, $t \ge 0$, by
\[
j(x, t) = x_1 v_2 - x_2 v_1. \tag{3.68}
\]
For an axi-symmetric velocity field
\[
v(x, t) = v^r(r, z, t)e_r + v^\theta(r, z, t)e_\theta + v^3(r, z, t)e_3, \tag{3.69}
\]
\[
v_1 = \frac{x_1}{r}v^r - \frac{x_2}{r}v^\theta, \qquad v_2 = \frac{x_2}{r}v^r + \frac{x_1}{r}v^\theta, \qquad v_3 = v^3, \tag{3.70}
\]
so that
\[
j(x, t) = r v^\theta(r, z, t). \tag{3.71}
\]
In view of (3.69) and (3.71), we have
\[
|v|^2 = |v^r|^2 + \frac{j^2}{r^2} + |v^3|^2. \tag{3.72}
\]
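The identities (3.71) and (3.72) follow from a two-line computation with the components in (3.70). A worked sketch, assuming only $r^2 = x_1^2 + x_2^2 > 0$ and the orthonormality of $e_r, e_\theta, e_3$ from (3.65):

```latex
% Angular momentum: substitute v_1, v_2 from (3.70) into (3.68).
\[
  x_1 v_2 - x_2 v_1
  = x_1\Big(\frac{x_2}{r}v^r + \frac{x_1}{r}v^\theta\Big)
  - x_2\Big(\frac{x_1}{r}v^r - \frac{x_2}{r}v^\theta\Big)
  = \frac{x_1^2 + x_2^2}{r}\,v^\theta = r\,v^\theta .
\]
% Kinetic energy density: e_r, e_theta, e_3 are orthonormal, and v^theta = j / r.
\[
  |v|^2 = |v^r|^2 + |v^\theta|^2 + |v^3|^2
        = |v^r|^2 + \frac{j^2}{r^2} + |v^3|^2 .
\]
```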

Therefore, the total energy at time $t$ can be written as
\[
E(\rho(t), v(t)) = \int A(\rho)(x, t)\,dx + \frac12\int \frac{\rho j^2(x, t)}{r^2(x)}\,dx - \frac{1}{8\pi}\int |\nabla B\rho|^2(x, t)\,dx + \frac12\int \rho\big(|v^r|^2 + |v^3|^2\big)(x, t)\,dx. \tag{3.73}
\]
There is an important conserved quantity for the Euler-Poisson equations (1.1), namely the angular momentum. In order to describe it, we define $D_t$, the non-vacuum region at time $t \ge 0$ of the solution, by
\[
D_t = \{x \in \mathbb{R}^3 : \rho(x, t) > 0\}. \tag{3.74}
\]


We will make the following assumption of the conservation of angular momentum for the axi-symmetric solutions of the Cauchy problem (1.1), which is motivated by physical considerations (cf. [35]).

A1) For any $t \ge 0$, there exists a measurable subset $G_t \subset D_t$ with $\mathrm{meas}(D_t - G_t) = 0$ (meas denotes Lebesgue measure) such that, for any $x \in G_t$, the angular momentum $j(x, t)$ defined in (3.68) only depends on the mass in the cylinder with radius $r(x)$, i.e.,
\[
j(x, t) = j_t\big(m_{\rho_t}(r(x))\big), \tag{3.75}
\]
where
\[
m_{\rho_t}(r(x)) = \int_{\sqrt{y_1^2+y_2^2}\,\le r(x)} \rho(y, t)\,dy, \qquad y = (y_1, y_2, y_3) \in \mathbb{R}^3.
\]
Moreover, for $t \ge 0$ and $x \in G_t$, there exists a point $x_0(t) \in G_0$ satisfying
\[
m_{\rho_t}(r(x)) = m_{\rho_0}(r(x_0(t))), \tag{3.76}
\]
and
\[
j(x, t) = j_t\big(m_{\rho_t}(r(x))\big) = j_0\big(m_{\rho_0}(r(x_0(t)))\big). \tag{3.77}
\]

Remark 5. For axi-symmetric motion, we have formally
\[
\frac{Dj}{Dt} = 0, \tag{3.78}
\]
where $\frac{D}{Dt}$ is the material derivative, i.e., $\frac{Dj}{Dt} := \frac{\partial j}{\partial t} + v\cdot\nabla j$. This means that the angular momentum (per unit mass) is transported by the fluid. On the other hand, by the conservation of mass, the mass enclosed within any material volume cannot change as we follow the volume in its motion ([35], p. 47). Mathematically, this means that, for any point $x_0 \in G_0$, along the particle path $x = \psi(t)$ satisfying $\frac{d\psi}{dt} = v(\psi(t), t)$ and $\psi(0) = x_0$,
\[
m_{\rho(t)}(r(\psi(t))) = m_{\rho_0}(r(x_0)) \quad \text{and} \quad j(\psi(t), t) = j(x_0, 0).
\]
Also, we need a technical assumption; namely,

A2)
\[
\lim_{r\to 0+}\frac{L\big(m_{\rho(t)}(r) + m_{\tilde\rho}(r)\big)\,m_{\sigma(t)}(r)}{r^2} = 0, \tag{3.79}
\]
for $t \ge 0$, where $\sigma(t) = \rho(t) - \tilde\rho$ and $L$ is the distribution of the square of angular momentum for the rotating star solution.


Remark 6. Equation (3.79) can be understood as follows. For any $\rho \in W_M$, we have $\lim_{r\to 0+} m_\rho(r) = 0$. Therefore $\lim_{r\to 0+} L(m_{\rho(t)}(r) + m_{\tilde\rho}(r)) = L(0) = 0$, so if we define
\[
\hat\rho(s, t) - \hat{\tilde\rho}(s) = \int_{-\infty}^{+\infty}\big(\rho(s, z, t) - \tilde\rho(s, z)\big)\,dz,
\]
then if
\[
\frac{m_{\sigma(t)}(r)}{r^2} = \frac{\int_0^r 2\pi s\big(\hat\rho(s, t) - \hat{\tilde\rho}(s)\big)\,ds}{r^2} \in L^\infty(0, \delta) \quad \text{for some } \delta > 0, \tag{3.80}
\]
(3.79) will hold. If $\hat\rho(\cdot, t) - \hat{\tilde\rho}(\cdot) \in L^\infty(0, \delta)$, then (3.80) holds. This can be assured by assuming that $\rho(r, z, t) - \tilde\rho(r, z) \in L^\infty((0, \delta)\times\mathbb{R}\times\mathbb{R}^+)$ and decays fast enough in the $z$ direction. For example, when $\rho(x, t) - \tilde\rho(x)$ has compact support in $\mathbb{R}^3$ and $\rho(\cdot, t) - \tilde\rho(\cdot) \in L^\infty(\mathbb{R}^3)$, then (3.79) holds.

We next make some assumptions on the initial data; namely, we assume that the initial data is such that the initial total mass and angular momentum are the same as those of the rotating star solution (those two quantities are conserved quantities). Therefore, we require

I1)
\[
\int \rho_0(x)\,dx = \int \tilde\rho(x)\,dx = M. \tag{3.81}
\]
Moreover we assume

I2) For the initial angular momentum $j(x, 0) = r v_0^\theta(r, z) =: j_0(r, z)$ ($r = \sqrt{x_1^2 + x_2^2}$, $z = x_3$ for $x = (x_1, x_2, x_3)$), we assume $j(x, 0)$ only depends on the total mass in the cylinder $\{y \in \mathbb{R}^3 : r(y) \le r(x)\}$, i.e.,
\[
j(x, 0) = j_0\big(m_{\rho_0}(r(x))\big). \tag{3.82}
\]
(This implies that we require that $v_0^\theta(r, z)$ only depends on $r$.) Finally, we assume that the initial profile of the angular momentum per unit mass is the same as that of the rotating star solution, i.e.,

I3)
\[
j_0^2(m) = L(m), \qquad 0 \le m \le M, \tag{3.83}
\]
where $L(m)$ is the profile of the square of the angular momentum of the rotating star defined in Sect. 2.

In order to state our stability result, we need some notation. Let $\lambda$ be the constant in Theorem 2.2, i.e.,
\[
\begin{cases}
A'(\tilde\rho(x)) + \displaystyle\int_{r(x)}^\infty L(m_{\tilde\rho}(s))s^{-3}\,ds - B\tilde\rho(x) = \lambda, & x \in \Gamma,\\[2mm]
\displaystyle\int_{r(x)}^\infty L(m_{\tilde\rho}(s))s^{-3}\,ds - B\tilde\rho(x) \ge \lambda, & x \in \mathbb{R}^3 - \Gamma,
\end{cases} \tag{3.84}
\]
with $A$ defined in (3.56) and $\Gamma$ defined in (2.16).


For $\rho \in W_M$, we define
\[
d(\rho, \tilde\rho) = \int\Big\{[A(\rho) - A(\tilde\rho)] + (\rho - \tilde\rho)\Big[\int_{r(x)}^\infty \frac{L(m_{\tilde\rho}(s))}{s^3}\,ds - \lambda - B\tilde\rho\Big]\Big\}\,dx. \tag{3.85}
\]
For $x \in \Gamma$, in view of the convexity of the function $A$ (cf. (3.46)) and (3.84), we have
\[
(A(\rho) - A(\tilde\rho))(x) + \Big(\int_{r(x)}^\infty \frac{L(m_{\tilde\rho}(s))}{s^3}\,ds - \lambda - B\tilde\rho(x)\Big)(\rho - \tilde\rho)
= \big(A(\rho) - A(\tilde\rho) - A'(\tilde\rho)(\rho - \tilde\rho)\big)(x) \ge 0. \tag{3.86}
\]
For $x \in \mathbb{R}^3 - \Gamma$, $\tilde\rho(x) = 0$, so we have $A(\tilde\rho(x)) = 0$. This is because $A(0) = 0$, due to $p(0) = 0$ (cf. (3.3)) and (2.5). Therefore, by (3.84) and $\rho \ge 0$, we have, for $\rho \in W_M$ and $x \in \mathbb{R}^3 - \Gamma$,
\[
(A(\rho) - A(\tilde\rho))(x) + \Big(\int_{r(x)}^\infty \frac{L(m_{\tilde\rho}(s))}{s^3}\,ds - \lambda - B\tilde\rho(x)\Big)(\rho - \tilde\rho)
\ge A(\rho) \ge 0. \tag{3.87}
\]
Thus, for $\rho \in W_M$,
\[
d(\rho, \tilde\rho) \ge 0. \tag{3.88}
\]
We also define
\[
d_1(\rho, \tilde\rho) = \frac12\int \frac{\rho(x)L(m_\rho(r(x))) - \tilde\rho(x)L(m_{\tilde\rho}(r(x)))}{r^2(x)}\,dx - \int\!\int_{r(x)}^\infty s^{-3}L(m_{\tilde\rho}(s))\,ds\,(\rho(x) - \tilde\rho(x))\,dx, \tag{3.89}
\]
for $\rho \in W_M$. We shall show later that $d_1 \ge 0$. Our main stability result in this paper is the following global-in-time stability theorem.

Theorem 3.3. Suppose that the pressure function satisfies (3.51), and both (3.4) and (3.5) hold. Let $\tilde\rho$ be a minimizer of the functional $F$ in $W_M$, and assume that it is unique up to a vertical shift. Assume that I1)-I3) [(3.81)-(3.83)] hold. Moreover, assume that the angular momentum of the rotating star solution $\tilde\rho$ satisfies (2.9), (3.1) and (3.2). Let $(\rho, v, \Phi)(x, t)$ be an entropy weak solution of the Cauchy problem (1.1) and (3.52) satisfying (3.61) and (3.63) with axi-symmetry. If the angular momentum $j$ satisfies Assumption A1) and Assumption A2) holds, then for every $\varepsilon > 0$, there exists a number $\delta > 0$ such that if
\[
d(\rho_0, \tilde\rho) + \frac{1}{8\pi}\|\nabla B\rho_0 - \nabla B\tilde\rho\|_2^2 + |d_1(\rho_0, \tilde\rho)| + \frac12\int \rho_0(x)\big(|v_0^r|^2 + |v_0^3|^2\big)(x)\,dx < \delta, \tag{3.90}
\]
then for every $t > 0$, there is a vertical shift $a(t)e_3$ ($a(t) \in \mathbb{R}$, $e_3 = (0, 0, 1)$) such that
\[
d(\rho(t), T^{a(t)}\tilde\rho) + \frac{1}{8\pi}\|\nabla B\rho(t) - \nabla B T^{a(t)}\tilde\rho\|_2^2 + |d_1(\rho(t), T^{a(t)}\tilde\rho)| + \frac12\int \rho(x, t)\big(|v^r(x, t)|^2 + |v^3(x, t)|^2\big)\,dx < \varepsilon, \tag{3.91}
\]
where $T^{a(t)}\tilde\rho(x) := \tilde\rho(x + a(t)e_3)$.


Remark 7. The above stability results for rotating star solutions apply to axi-symmetric perturbations. For the stability of non-rotating star solutions, we can consider general perturbations without axi-symmetry. Also, Assumptions A1)-A2) and I2)-I3) in the above theorem are used to control the angular momentum; for the stability of non-rotating stars, those assumptions are not needed. Moreover, the uniqueness assumption for minimizers of the energy functional is not needed for non-rotating star solutions since this uniqueness was proved in [22]. We give a general result on the stability of non-rotating white dwarf stars in Sect. 5, for which the stability results for non-rotating stars in [32] do not apply.

Remark 8. The integral terms in (3.90) and (3.91) can be understood as follows: for rotating stars, the velocity has no $r$ or $z$ components, so it is natural that these terms be small.

Remark 9. Without the uniqueness assumption for the minimizer of $F$ in $W_M$, we can have the following type of stability result, as observed in [32] for the non-rotating star solutions. Suppose the assumptions in Theorem 3.3 hold. Let $S_M$ be the set of all minimizers of $F$ in $W_M$ and let $(\rho, v, \Phi)(x, t)$ be an axi-symmetric weak entropy solution of the Cauchy problem (1.1) and (3.52) satisfying (3.61) and (3.63). Then for every $\varepsilon > 0$, there exists a number $\delta > 0$ such that if
\[
\inf_{\tilde\rho\in S_M}\Big[d(\rho_0, \tilde\rho) + \frac{1}{8\pi}\|\nabla B\rho_0 - \nabla B\tilde\rho\|_2^2 + |d_1(\rho_0, \tilde\rho)|\Big] + \frac12\int \rho_0(x)\big(|v_0^r|^2 + |v_0^3|^2\big)(x)\,dx < \delta, \tag{3.92}
\]
then for every $t > 0$, there is a vertical shift $a(t)e_3$ ($a \in \mathbb{R}$, $e_3 = (0, 0, 1)$) such that
\[
\inf_{\tilde\rho\in S_M}\Big[d(\rho(t), T^{a(t)}\tilde\rho) + \frac{1}{8\pi}\|\nabla B\rho(t) - \nabla B T^{a(t)}\tilde\rho\|_2^2 + |d_1(\rho(t), T^{a(t)}\tilde\rho)|\Big] + \frac12\int \rho(x, t)\big(|v^r(x, t)|^2 + |v^3(x, t)|^2\big)\,dx < \varepsilon, \tag{3.93}
\]
where $T^{a(t)}\tilde\rho(x) := \tilde\rho(x + a(t)e_3)$. In the case of non-rotating stars, i.e. $L = 0$, the uniqueness of minimizers of the energy functional was proved by Lieb and Yau in [22]. There have been no uniqueness results for the case of rotating stars. It might be expected that this problem can be solved by using some ideas in [22].

The proof of Theorem 3.3 follows from several lemmas. The proofs of these lemmas are similar to those in [26], and therefore we only sketch them. First we have

Lemma 3.8. Suppose the angular momentum of the rotating star solutions satisfies (2.9), (3.1) and (3.2). For any $\rho(x) \in W_M$, if
\[
\lim_{r\to 0+} L\big(m_\rho(r) + m_{\tilde\rho}(r)\big)\,m_\sigma(r)\,r^{-2} = 0, \tag{3.94}
\]
where $\sigma = \rho - \tilde\rho$, then
\[
d_1(\rho, \tilde\rho) \ge 0, \tag{3.95}
\]
where $d_1$ is defined by (3.89).


Proof. For an axi-symmetric function $f(x) = f(r, z)$ ($r = \sqrt{x_1^2 + x_2^2}$, $z = x_3$ for $x = (x_1, x_2, x_3)$), we let
\[
\hat f(r) = 2\pi r\int_{-\infty}^{+\infty} f(r, z)\,dz, \tag{3.96}
\]
\[
m_f(r) = \int_{\{x:\ \sqrt{x_1^2+x_2^2}\,\le r\}} f(x)\,dx = \int_0^r \hat f(s)\,ds, \tag{3.97}
\]
so that
\[
m_f'(r) = \hat f(r). \tag{3.98}
\]
In order to show (3.95), we let
\[
\sigma(x) = (\rho - \tilde\rho)(x), \tag{3.99}
\]
and for $0 \le \alpha \le 1$, we define
\[
Q(\alpha) = \frac12\int \frac{(\tilde\rho + \alpha\sigma)(x)L(m_{\tilde\rho+\alpha\sigma}(r(x))) - \tilde\rho(x)L(m_{\tilde\rho}(r(x)))}{r^2(x)}\,dx - \alpha\int\!\int_{r(x)}^\infty s^{-3}L(m_{\tilde\rho}(s))\,ds\,\sigma(x)\,dx. \tag{3.100}
\]
Then
\[
Q(0) = 0, \qquad Q(1) = d_1(\rho, \tilde\rho). \tag{3.101}
\]
By the assumption that $L'(m) \ge 0$ for $0 \le m \le M$ (cf. (3.2)) and (3.94), we can show that
\[
Q'(\alpha) = \int_0^{+\infty}\hat\sigma(r)\int_r^\infty s^{-3}\big(L(m_{\tilde\rho+\alpha\sigma}(s)) - L(m_{\tilde\rho}(s))\big)\,ds\,dr, \tag{3.102}
\]
and therefore
\[
Q(0) = Q'(0) = 0. \tag{3.103}
\]
This is done by interchanging the order of integration and integrating by parts (details can be found in [26]). Differentiating (3.102) again and interchanging the order of integration, we get
\[
\frac{d^2 Q(\alpha)}{d\alpha^2} = \int_0^{+\infty} s^{-3}L'(m_{\tilde\rho+\alpha\sigma}(s))\,(m_\sigma(s))^2\,ds. \tag{3.104}
\]
Therefore, if $L'(m) \ge 0$ for $0 \le m \le M$, then
\[
\frac{d^2 Q(\alpha)}{d\alpha^2} \ge 0, \qquad \text{for } 0 \le \alpha \le 1. \tag{3.105}
\]
This, together with (3.103) and (3.101), yields $d_1(\rho, \tilde\rho) = Q(1) \ge 0$.
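The passage from (3.102) to (3.104) is an application of Fubini's theorem on the region $\{0 < r < s\}$. A sketch of the computation, assuming that differentiation under the integral sign and the interchange of integrals are justified (the text attributes the details to [26]):

```latex
% Since m_{\tilde\rho+\alpha\sigma} = m_{\tilde\rho} + \alpha\, m_\sigma, differentiating (3.102) in alpha gives
\[
  Q''(\alpha) = \int_0^{\infty}\hat\sigma(r)\int_r^{\infty}
     s^{-3}\,L'\big(m_{\tilde\rho+\alpha\sigma}(s)\big)\,m_\sigma(s)\,ds\,dr .
\]
% Interchange the order of integration and use (3.97): \int_0^s \hat\sigma(r)\,dr = m_\sigma(s).
\[
  Q''(\alpha) = \int_0^{\infty} s^{-3}\,L'\big(m_{\tilde\rho+\alpha\sigma}(s)\big)\,
     m_\sigma(s)\Big(\int_0^{s}\hat\sigma(r)\,dr\Big)ds
  = \int_0^{\infty} s^{-3}\,L'\big(m_{\tilde\rho+\alpha\sigma}(s)\big)\,\big(m_\sigma(s)\big)^2\,ds \;\ge\; 0 .
\]
% With Q(0) = Q'(0) = 0, Taylor's formula with integral remainder then gives
\[
  d_1(\rho,\tilde\rho) = Q(1) = \int_0^1 (1-\alpha)\,Q''(\alpha)\,d\alpha \;\ge\; 0 .
\]
```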


Lemma 3.9. Let $(\rho, v)$ be a solution of the Cauchy problem (1.1) and (3.52) as stated in Theorem 3.3. Then
\[
E(\rho, v)(t) - F(\tilde\rho) = d(\rho(t), \tilde\rho) + d_1(\rho(t), \tilde\rho) - \frac{1}{8\pi}\|\nabla B\rho(\cdot, t) - \nabla B\tilde\rho\|_2^2 + \frac12\int \rho\big(|v^r|^2 + |v^3|^2\big)(x, t)\,dx. \tag{3.106}
\]
Proof. From (3.75) and (3.77) in A1), we have, for $x \in G_t$,
\[
j^2(x, t) = \big(j_t(m_{\rho_t}(r(x)))\big)^2 = \big(j_0(m_{\rho_0}(r(x_0(t))))\big)^2 = L\big(m_{\rho_0}(r(x_0(t)))\big) = L\big(m_{\rho_t}(r(x))\big). \tag{3.107}
\]
Therefore, by (3.73), we have
\[
E(\rho(t), v(t)) = \int A(\rho)(x, t)\,dx + \frac12\int \frac{\rho(x, t)L(m_{\rho(t)}(r(x)))}{r^2(x)}\,dx - \frac{1}{8\pi}\int |\nabla B\rho|^2(x, t)\,dx + \frac12\int \rho\big(|v^r|^2 + |v^3|^2\big)(x, t)\,dx. \tag{3.108}
\]
Equation (3.106) follows from (3.108) and the following identities:
\[
\|\nabla B\rho(\cdot, t)\|_2^2 - \|\nabla B\tilde\rho\|_2^2
= \|\nabla B\rho(\cdot, t) - \nabla B\tilde\rho\|_2^2 + 2\int \nabla B\tilde\rho(x)\cdot\big(\nabla B\rho(x, t) - \nabla B\tilde\rho(x)\big)\,dx
= \|\nabla B\rho(\cdot, t) - \nabla B\tilde\rho\|_2^2 + 8\pi\int B\tilde\rho(x)\big(\rho(x, t) - \tilde\rho(x)\big)\,dx,
\]
and
\[
\int \rho(x, t)\,dx = \int \tilde\rho(x)\,dx = M.
\]
Having established these lemmas, the proof of Theorem 3.3 is similar to the proof of Theorem 3.1 in [26]. We sketch it as follows.

Proof of Theorem 3.3. Assume the theorem is false. Then there exist $\varepsilon_0 > 0$, $t_n > 0$ and initial data $\rho_n(x, 0) \in W_M$ and $v_n(x, 0)$ such that for all $n \in \mathbb{N}$,
\[
d(\rho_n(0), \tilde\rho) + d_1(\rho_n(0), \tilde\rho) + \frac{1}{8\pi}\|\nabla B\rho_n(0) - \nabla B\tilde\rho\|_2^2 + \frac12\int \rho_n(x, 0)\big(|v_n^r(x, 0)|^2 + |v_n^3(x, 0)|^2\big)\,dx < \frac1n, \tag{3.109}
\]
but for any $a(t_n) \in \mathbb{R}$,
\[
d(\rho_n(t_n), T^{a(t_n)}\tilde\rho) + d_1(\rho_n(t_n), T^{a(t_n)}\tilde\rho) + \frac{1}{8\pi}\|\nabla B\rho_n(t_n) - \nabla B T^{a(t_n)}\tilde\rho\|_2^2 + \frac12\int \rho_n(x, t_n)\big(|v_n^r(x, t_n)|^2 + |v_n^3(x, t_n)|^2\big)\,dx \ge \varepsilon_0. \tag{3.110}
\]


By (3.106) and (3.109), we have
\[
\lim_{n\to\infty} E(\rho_n(0), v_n(0)) = F(\tilde\rho). \tag{3.111}
\]
Since $E(\rho_n(t), v_n(t))$ is non-increasing in time,
\[
\limsup_{n\to\infty} F(\rho_n(t_n)) \le \lim_{n\to\infty} E(\rho_n(t_n), v_n(t_n)) \le \lim_{n\to\infty} E(\rho_n(0), v_n(0)) = F(\tilde\rho). \tag{3.112}
\]
Therefore $\{\rho_n(\cdot, t_n)\} \subset W_M$ is a minimizing sequence for the functional $F$. We can then apply Theorem 3.1 to conclude that there exists a sequence $\{a_n\} \subset \mathbb{R}$ such that, up to a subsequence,
\[
\|\nabla(B\rho_n(t_n) - B T^{a_n}\tilde\rho)\|_2 \to 0, \tag{3.113}
\]
as $n \to \infty$; this is where we use the assumption that the minimizer is unique up to a vertical shift. By (3.106), the fact that the energy is non-increasing in time, and $F(T^a\rho) = F(\rho)$ for any $\rho \in W_M$ and $a \in \mathbb{R}$, we have
\[
E(\rho_n(t_n), v_n(t_n)) - F(T^{a_n}\tilde\rho)
= d(\rho_n(t_n), T^{a_n}\tilde\rho) + d_1(\rho_n(t_n), T^{a_n}\tilde\rho) - \frac{1}{8\pi}\|\nabla(B\rho_n(t_n) - B T^{a_n}\tilde\rho)\|_2^2 + \frac12\int \rho_n\big(|v_n^r|^2 + |v_n^3|^2\big)(x, t_n)\,dx
\le E(\rho_n(0), v_n(0)) - F(T^{a_n}\tilde\rho) = E(\rho_n(0), v_n(0)) - F(\tilde\rho) \to 0, \tag{3.114}
\]
as $n \to \infty$. Since $\|\nabla B\rho_n(t_n) - \nabla B T^{a_n}\tilde\rho\|_2 \to 0$ as $n \to \infty$, and since $d(\rho_n(t_n), T^{a_n}\tilde\rho) \ge 0$ and $d_1(\rho_n(t_n), T^{a_n}\tilde\rho) \ge 0$, we obtain
\[
d(\rho_n(t_n), T^{a_n}\tilde\rho) + d_1(\rho_n(t_n), T^{a_n}\tilde\rho) + \frac{1}{8\pi}\|\nabla(B\rho_n(t_n) - B T^{a_n}\tilde\rho)\|_2^2 + \frac12\int \rho_n\big(|v_n^r|^2 + |v_n^3|^2\big)(x, t_n)\,dx \to 0, \tag{3.115}
\]
as $n \to \infty$. This contradicts (3.110), and completes the proof.

4. Applications to White Dwarf and Supermassive Stars

In this section, we want to verify the assumptions (3.4) and (3.5) in Theorem 3.1 for both white dwarfs and supermassive stars. Once we verify (3.4) and (3.5), we can apply Theorems 3.1 and 3.3. We begin with the following theorem, which verifies (3.5) for white dwarfs, supermassive stars, and polytropes with $\gamma > 4/3$, in both the rotating and non-rotating cases.


Theorem 4.1. Assume that the pressure function $p$ satisfies (3.3). Then there exists a constant $M_c$ satisfying $0 < M_c < \infty$ if $\gamma = 4/3$ and $M_c = \infty$ if $\gamma > 4/3$, such that if $M < M_c$, then (3.5) holds for $\rho \in W_M$.

Proof. Using (3.13), we have, for $\rho \in W_M$,
\[
F(\rho) = \int\Big[A(\rho) + \frac12\frac{\rho(x)L(m_\rho(r(x)))}{r(x)^2} - \frac12\rho B\rho\Big]\,dx
\ge \int\Big[A(\rho) + \frac12\frac{\rho(x)L(m_\rho(r(x)))}{r(x)^2}\Big]\,dx - C\int\rho^{4/3}\,dx\,\Big(\int\rho\,dx\Big)^{2/3}
= \int\Big[A(\rho) + \frac12\frac{\rho(x)L(m_\rho(r(x)))}{r(x)^2}\Big]\,dx - C M^{2/3}\int\rho^{4/3}\,dx. \tag{4.1}
\]
Taking $p = 1$, $q = 4/3$, $r = \gamma$, and $a = \frac{\frac34\gamma - 1}{\gamma - 1}$ in Young's inequality (3.11) (where $\gamma \ge 4/3$ is the constant in (3.3)), we obtain
\[
\|\rho\|_{4/3} \le \|\rho\|_1^{a}\,\|\rho\|_\gamma^{1-a} = M^a\,\|\rho\|_\gamma^{1-a}. \tag{4.2}
\]
This, together with (3.16)-(3.18), yields
\[
\int\rho^{4/3}\,dx \le M^{\frac43 a}\Big(\int\rho^\gamma\,dx\Big)^{b} \le M^{\frac43 a}\Big((\rho^*)^{\gamma-1}M + \alpha\int A(\rho)\,dx\Big)^{b}
\le C M^{\frac43 a + b}(\rho^*)^{1/3} + \alpha M^{\frac43 a}\Big(\int A(\rho)\,dx\Big)^{b}, \tag{4.3}
\]
where $b = \frac{1}{3(\gamma-1)}$, $\alpha$ and $\rho^*$ are the constants in (3.17), and we have used the elementary inequality $(x + y)^b \le C(x^b + y^b)$ for $x, y > 0$, $0 < b \le 1$. [...] If $\gamma > 4/3$, then $0 < b < 1$; if $\gamma = 4/3$, then $b = 1$. Therefore (4.4) implies (3.5).

The next result shows that (3.4) holds for a wide class of (rotating or non-rotating) stars, including White Dwarfs.

Theorem 4.2. Suppose that the pressure function $p$ satisfies (3.3) and
\[
\lim_{\rho\to 0+}\frac{p(\rho)}{\rho^{\gamma_1}} = \beta, \tag{4.5}
\]
for some constants $\gamma_1 > 4/3$ and $0 < \beta < +\infty$, and assume that the angular momentum (per unit mass) satisfies (2.9). Then there exists $M_c$ satisfying $0 < M_c < +\infty$ if $\gamma = 4/3$ and $M_c = +\infty$ if $\gamma > 4/3$ such that if $M < M_c$, then (3.4) holds, where $\gamma$ is the constant in (3.3).

Remark 10. White dwarfs satisfy (3.3) and (4.5) with $\gamma = 4/3$ and $\gamma_1 = 5/3$.
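Returning to the exponents used in the proof of Theorem 4.1: the values of $a$ and $b$ can be checked by a short computation. The sketch below assumes only the interpolation inequality (4.2); the closing comment on the role of $b = 1$ is an interpretation, not a statement taken from the text.

```latex
% Interpolation between L^1 and L^gamma requires 3/4 = a + (1-a)/gamma, hence
\[
  a = \frac{\tfrac34\gamma - 1}{\gamma - 1}, \qquad
  1 - a = \frac{\gamma}{4(\gamma-1)} .
\]
% Raising (4.2) to the power 4/3 and writing the L^gamma norm as an integral:
\[
  \int\rho^{4/3}\,dx = \|\rho\|_{4/3}^{4/3}
  \le M^{\frac43 a}\,\|\rho\|_\gamma^{\frac43(1-a)}
  = M^{\frac43 a}\Big(\int\rho^\gamma\,dx\Big)^{\frac{4(1-a)}{3\gamma}},
  \qquad \frac{4(1-a)}{3\gamma} = \frac{1}{3(\gamma-1)} = b .
\]
% For gamma = 4/3: a = 0 and b = 1.  For gamma > 4/3: 3(gamma-1) > 1, so 0 < b < 1.
% When b = 1 the term M^{2/3} * alpha * \int A(\rho) dx from (4.1) and (4.3) is linear
% in \int A(\rho) dx, which suggests why a smallness condition on M is needed in that case.
```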


Proof of Theorem 4.2. Due to (3.3) and (4.5), we can apply Theorem 2.1. Let $\hat\rho \in W_{M,S}$ be a minimizer of $F(\rho)$ in $W_{M,S}$ as described in Theorem 2.1, and let $G = \{x \in \mathbb{R}^3 : \hat\rho(x) > 0\}$. Then $\bar G$ is a compact set in $\mathbb{R}^3$, and $\hat\rho \in C^1(G)$. Furthermore, there exists a constant $\mu < 0$ such that
\[
\begin{cases}
A'(\hat\rho(x)) + \displaystyle\int_{r(x)}^\infty L(m_{\hat\rho}(s))s^{-3}\,ds - B\hat\rho(x) = \mu, & x \in G,\\[2mm]
\displaystyle\int_{r(x)}^\infty L(m_{\hat\rho}(s))s^{-3}\,ds - B\hat\rho(x) \ge \mu, & x \in \mathbb{R}^3 - G.
\end{cases} \tag{4.6}
\]
It follows from [1] that there exists $\hat\rho \in W_{M,S} \subset W_M$ such that $F(\hat\rho) = \inf_{\rho\in W_{M,S}} F(\rho)$. It is easy to verify that the triple $(\hat\rho, \hat v, \hat\Phi)$ is a time-independent solution of the Euler-Poisson equations (1.1) in the region $G = \{x \in \mathbb{R}^3 : \hat\rho(x) > 0\}$, where $\hat v = \big(-\frac{x_2 J(m_{\hat\rho}(r))}{r}, \frac{x_1 J(m_{\hat\rho}(r))}{r}, 0\big)$ and $\hat\Phi = -B\hat\rho$. Therefore
\[
\nabla_x p(\hat\rho) = \hat\rho\,\nabla_x(B\hat\rho) + \hat\rho\,L(m_{\hat\rho})\,r(x)^{-3}e_r, \qquad x \in G, \tag{4.7}
\]
where $e_r = \big(\frac{x_1}{r(x)}, \frac{x_2}{r(x)}, 0\big)$. Moreover, it is proved in [3] that the boundary $\partial G$ of $G$ is smooth enough to apply the Gauss-Green formula on $G$. Applying the Gauss-Green formula on $G$ and noting that $\hat\rho|_{\partial G} = 0$, we obtain
\[
\int_G x\cdot\nabla_x p(\hat\rho)\,dx = -3\int_G p(\hat\rho)\,dx = -3\int p(\hat\rho)\,dx. \tag{4.8}
\]
As in [26], we have
\[
\int_G x\cdot\hat\rho\,\nabla_x B\hat\rho\,dx = -\frac12\int_G \hat\rho B\hat\rho\,dx = -\frac12\int \hat\rho B\hat\rho\,dx. \tag{4.9}
\]
Next, since $x\cdot e_r = r(x)$, we have
\[
\int_G x\cdot\hat\rho(x)L(m_{\hat\rho}(r(x)))\,r^{-3}(x)e_r\,dx = \int_G \hat\rho(x)L(m_{\hat\rho}(r(x)))\,r^{-2}(x)\,dx = \int \hat\rho(x)L(m_{\hat\rho}(r(x)))\,r^{-2}(x)\,dx. \tag{4.10}
\]
Therefore, from (4.8)-(4.10) we have
\[
-3\int p(\hat\rho)\,dx = -\frac12\int \hat\rho B\hat\rho\,dx + \int \hat\rho(x)L(m_{\hat\rho}(r(x)))\,r^{-2}(x)\,dx. \tag{4.11}
\]

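Let $\bar\rho(x) = b^3\hat\rho(bx)$ for $b > 0$; then $\bar\rho \in W_M$. Also, it is easy to verify that the following identities hold:
\[
\int \bar\rho B\bar\rho\,dx = \int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\frac{\bar\rho(x)\bar\rho(y)}{|x-y|}\,dx\,dy = b\int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\frac{\hat\rho(x)\hat\rho(y)}{|x-y|}\,dx\,dy = b\int \hat\rho B\hat\rho\,dx, \tag{4.12}
\]
\[
\int A(\bar\rho)\,dx = b^{-3}\int A(b^3\hat\rho(x))\,dx. \tag{4.13}
\]
Moreover, for $r \ge 0$,
\[
m_{\bar\rho}(r) = 2\pi\int_0^r s\int_{-\infty}^{\infty}\bar\rho(s, z)\,dz\,ds
= 2\pi b^3\int_0^r s\int_{-\infty}^{\infty}\hat\rho(bs, bz)\,dz\,ds
= 2\pi\int_0^{br} s'\int_{-\infty}^{\infty}\hat\rho(s', z')\,dz'\,ds'
= m_{\hat\rho}(br). \tag{4.14}
\]
Therefore,
\[
\int \frac{\bar\rho(x)L(m_{\bar\rho}(r(x)))}{r(x)^2}\,dx = \int \frac{b^3\hat\rho(bx)L(m_{\hat\rho}(br(x)))}{r(x)^2}\,dx = b^2\int \frac{\hat\rho(x)L(m_{\hat\rho}(r(x)))}{r(x)^2}\,dx. \tag{4.15}
\]
It follows from (4.12)-(4.15) that
\[
F(\bar\rho) = b^{-3}\int A(b^3\hat\rho)\,dx - \frac{b}{2}\int \hat\rho B\hat\rho\,dx + \frac{b^2}{2}\int \frac{\hat\rho(x)L(m_{\hat\rho}(r(x)))}{r(x)^2}\,dx. \tag{4.16}
\]
Hence, (4.11) and (4.16) give
\[
F(\bar\rho) = \int\big[b^{-3}A(b^3\hat\rho) - 3b\,p(\hat\rho(x))\big]\,dx + \Big(\frac{b^2}{2} - b\Big)\int \frac{\hat\rho(x)L(m_{\hat\rho}(r(x)))}{r(x)^2}\,dx. \tag{4.17}
\]
In view of (2.9), we have
\[
\Big(\frac{b^2}{2} - b\Big)\int \frac{\hat\rho(x)L(m_{\hat\rho}(r(x)))}{r(x)^2}\,dx \le 0, \tag{4.18}
\]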

(4.18)

if b > 0 is small. It follows from (3.9) that 1 γ1 βρ ≤ p(ρ) ≤ 2βρ γ1 , for small ρ. 2

(4.19)

Thus, when b is small, since ρˆ is bounded, we have β 2β 3γ1 γ1 b3γ1 (ρ) b (ρ) ˆ γ1 (x) ≤ A(b3 ρ(x)) ˆ ≤ ˆ (x), 2(γ1 − 1) γ1 − 1

(4.20)


for $x \in \mathbb{R}^3$. Hence, (4.19) and (4.20) imply
\[
\int\big[b^{-3}A(b^3\hat\rho) - 3b\,p(\hat\rho(x))\big]\,dx \le \beta\Big(\frac{2}{\gamma_1 - 1}\,b^{3\gamma_1 - 3} - \frac32\,b\Big)\int \hat\rho^{\gamma_1}\,dx. \tag{4.21}
\]
Since $\gamma_1 > 4/3$, we have $3\gamma_1 - 3 > 1$. Therefore, we conclude that
\[
\int\big[b^{-3}A(b^3\hat\rho) - 3b\,p(\hat\rho(x))\big]\,dx < 0, \tag{4.22}
\]
for small $b$. Equation (3.4) follows from (4.17), (4.18) and (4.22). This completes the proof of Theorem 4.2.

We show next that if the angular momentum distribution is everywhere positive, we may apply the existence theorem of Friedman and Turkington [10] to conclude that (3.4) holds with no total mass restriction. This result applies also to White Dwarfs.

Theorem 4.3. Suppose that the pressure function $p$ satisfies (3.3) with $\gamma = 4/3$ and (3.9) holds. Assume that the angular momentum (per unit mass) $J(m) = \sqrt{L(m)}$ satisfies (2.14). Then (3.4) holds for any $0 < M < +\infty$.

Proof. By the existence theorem in [10], if (2.14) is satisfied, then for any $0 < M < +\infty$, there exists $\tilde\rho \in W_{M,S}$ such that $F(\tilde\rho) = \inf_{\rho\in W_{M,S}} F(\rho)$. Also, all the properties of $\tilde\rho$ in Theorem 2.1 are satisfied. Moreover, the boundary $\partial G$ is smooth enough to apply the Gauss-Green formula (cf. [3]). The proof now follows exactly as in Theorem 4.2.

We finally turn to the case of rotating supermassive stars.

Theorem 4.4. Consider a supermassive star; i.e.,
\[
p(\rho) = k\rho^{4/3}, \qquad k > 0 \text{ is a constant}. \tag{4.23}
\]
If there exists $\hat\rho \in W_M$ such that $\hat\rho \in C^1(G) \cap C(\mathbb{R}^3)$ and $(\hat\rho, \hat v)$ is a steady state solution of the Euler-Poisson equation, where $\hat v = \big(-\frac{x_2\sqrt{L(m_{\hat\rho}(r))}}{r}, \frac{x_1\sqrt{L(m_{\hat\rho}(r))}}{r}, 0\big)$, in an open bounded set $G \subset \mathbb{R}^3$ with Lipschitz boundary $\partial G$, i.e.,
\[
\begin{cases}
\nabla_x p(\hat\rho) = \hat\rho\,\nabla_x(B\hat\rho) + \hat\rho\,L(m_{\hat\rho})\,r(x)^{-3}e_r, & x \in G,\\
\hat\rho = 0, & x \in \mathbb{R}^3 - G,
\end{cases} \tag{4.24}
\]
then (3.4) holds provided $L$ satisfies (2.9) and
\[
L(m_0) > 0, \qquad \text{for some } m_0 \in (0, M). \tag{4.25}
\]
Remark 11. The existence of $\hat\rho$ described above is unknown. The significance of this theorem is that if there exists such a $\hat\rho$ which solves the Euler-Poisson equation, together with the induced velocity field $\hat v = \big(-\frac{x_2\sqrt{L(m_{\hat\rho}(r))}}{r}, \frac{x_1\sqrt{L(m_{\hat\rho}(r))}}{r}, 0\big)$, then we can apply the stability theorem, Theorem 3.3.
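The proof of Theorem 4.4 rests on the fact that for the pressure law (4.23) the function $A$ from (3.56) is exactly $3p$, so that the pressure and gravitational terms cancel in $F(\hat\rho)$. A sketch of that computation, assuming the form of $F$ displayed in (4.1) and the identity (4.11):

```latex
% With p(s) = k s^{4/3}, definition (3.56) gives
\[
  A(\rho) = \rho\int_0^\rho k\,s^{-2/3}\,ds = 3k\,\rho^{4/3} = 3\,p(\rho) .
\]
% Insert this into F and use (4.11) in the form
% (1/2) \int \hat\rho B\hat\rho dx = 3 \int p(\hat\rho) dx + \int \hat\rho L r^{-2} dx:
\[
  F(\hat\rho) = 3\int p(\hat\rho)\,dx
  + \frac12\int\frac{\hat\rho\,L(m_{\hat\rho})}{r^2}\,dx
  - \frac12\int\hat\rho\,B\hat\rho\,dx
  = -\frac12\int\frac{\hat\rho(x)\,L(m_{\hat\rho}(r(x)))}{r(x)^2}\,dx .
\]
```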


Proof. Following along the same lines as (4.7)-(4.10), we obtain the same equality as (4.11). Therefore,
\[
F(\hat\rho) = -\frac12\int \hat\rho(x)L(m_{\hat\rho}(r(x)))\,r^{-2}(x)\,dx, \tag{4.26}
\]
in view of (4.23) and (4.11). Since $\hat\rho \in C^1(G) \cap C(\mathbb{R}^3)$ and $\hat\rho = 0$ for $x \in \mathbb{R}^3 - G$, it is easy to show that $m_{\hat\rho}(r)$ is continuous in $r$. Moreover, $m_{\hat\rho}(0) = 0$ and $m_{\hat\rho}(R) = M$, where $R = \max_{x\in\bar G} r(x)$. Therefore, there exists $r_0 \in (0, M)$ such that
\[
m_{\hat\rho}(r_0) = m_0, \tag{4.27}
\]
where $m_0$ is the constant in (4.25). Thus,
\[
L(m_{\hat\rho}(r_0)) > 0, \tag{4.28}
\]
in view of (4.25). Since $m_{\hat\rho}(r)$ is continuous in $r$ and $L(m)$ is continuous in $m$, we conclude that
\[
\int \hat\rho(x)L(m_{\hat\rho}(r(x)))\,r^{-2}(x)\,dx > 0. \tag{4.29}
\]
The inequality (3.4) now follows from (4.26).

The preceding theorems, together with Theorem 3.3, show that polytropes ($p(\rho) = k\rho^\gamma$) with $\gamma > 4/3$ and White Dwarf stars, in both the rotating and non-rotating cases, as well as rotating supermassive stars, are dynamically stable. Moreover, if the angular momentum distribution is not everywhere positive and the pressure $p$ behaves asymptotically near infinity like $\rho^{4/3}$, then dynamic stability holds only under a (Chandrasekhar) mass restriction, $M \le M_c$.

5. Nonlinear Dynamical Stability of Non-Rotating White Dwarf Stars With General Perturbations

The dynamical stability results in Sect. 3 apply to axi-symmetric perturbations. Also, for the stability of rotating stars, Assumptions A1), A2) and I2), I3) are made in Theorem 3.3 to control the angular momentum. Moreover, the uniqueness of minimizers of the energy functional for rotating stars is not known. However, uniqueness for non-rotating stars was proved by Lieb and Yau in [22]. In this section, we prove a very general nonlinear dynamical stability result for non-rotating white dwarf stars without Assumptions A1), A2) and I2), I3), and for general perturbations. For white dwarf stars, as mentioned before, the pressure function satisfies
\[
p \in C^1[0, +\infty), \qquad \lim_{\rho\to 0+}\frac{p(\rho)}{\rho^{\gamma_1}} = \beta, \qquad \lim_{\rho\to\infty}\frac{p(\rho)}{\rho^{\gamma}} = K, \qquad p'(\rho) > 0 \ \text{for } \rho > 0, \tag{5.1}
\]
where $\gamma_1 > 4/3$, $0 < \beta < +\infty$ and $0 < K < +\infty$ are constants. In this section, we always assume that the pressure function satisfies (5.1). First, we define, for $0 < M < +\infty$,
\[
X_M = \Big\{\rho : \mathbb{R}^3 \to \mathbb{R},\ \rho \ge 0 \text{ a.e.},\ \int \rho(x)\,dx = M,\ \int\Big[A(\rho(x)) + \frac12\rho(x)B\rho(x)\Big]\,dx < +\infty\Big\}, \tag{5.2}
\]


where $A(\rho)$ is the function given in (2.5). For $\rho \in X_M$, we define the energy functional $G$ for non-rotating stars by

\[ G(\rho) = \int \big[A(\rho(x)) - \tfrac12 \rho(x) B\rho(x)\big]\,dx. \tag{5.3} \]

We begin with the following theorem.

Theorem 5.1. Suppose that the pressure function $p$ satisfies (5.1). Let $\tilde\rho_N$ be a minimizer of the energy functional $G$ in $X_M$ and let $\Gamma_N = \{x \in \mathbb{R}^3 : \tilde\rho_N(x) > 0\}$. Then there exists a constant $\lambda_N$ such that

\[ A'(\tilde\rho_N(x)) - B\tilde\rho_N(x) = \lambda_N, \quad x \in \Gamma_N, \tag{5.4} \]

\[ -B\tilde\rho_N(x) \ge \lambda_N, \quad x \in \mathbb{R}^3 - \Gamma_N. \tag{5.5} \]

The proof of this theorem is well known, cf. [32] or [1].

Remark 12. 1) We call the minimizer $\tilde\rho_N$ of the functional $G$ in $X_M$ a non-rotating star solution. 2) It follows from [22] that the minimizer $\tilde\rho_N$ of the functional $G$ in $X_M$ is actually radial and has compact support.

Similar to Theorem 3.1, we have the following compactness theorem.

Theorem 5.2. Suppose that the pressure function $p$ satisfies (5.1). There exists a constant $M_c$ ($0 < M_c < \infty$) such that if $M < M_c$, then the following hold:

(1)
\[ \inf_{\rho \in X_M} G(\rho) < 0, \tag{5.6} \]

(2) for $\rho \in X_M$,
\[ \int A(\rho)(x)\,dx \le C_1 G(\rho) + C_2, \tag{5.7} \]
for some positive constants $C_1$ and $C_2$,

(3) if $\{\rho^i\} \subset X_M$ is a minimizing sequence for the functional $G$, then there exist a sequence of translations $\{x^i\} \subset \mathbb{R}^3$, a subsequence of $\{\rho^i\}$ (still labeled $\{\rho^i\}$), and a function $\tilde\rho_N \in X_M$, such that for any $\epsilon > 0$ there exists $R > 0$ with
\[ \int_{|x| \ge R} T\rho^i(x)\,dx \le \epsilon, \quad i \in \mathbb{N}, \tag{5.8} \]
and
\[ T\rho^i(x) \rightharpoonup \tilde\rho_N \ \text{weakly in } L^{4/3}(\mathbb{R}^3), \ \text{as } i \to \infty, \tag{5.9} \]
where $T\rho^i(x) := \rho^i(x + x^i)$. Moreover

Newtonian Rotating and Non-rotating White Dwarfs and Rotating Supermassive Stars


(4)
\[ \nabla B(T\rho^i) \to \nabla B(\tilde\rho_N) \ \text{strongly in } L^2(\mathbb{R}^3), \ \text{as } i \to \infty, \tag{5.10} \]
and

(5) $\tilde\rho_N$ is a minimizer of $G$ in $X_M$.

(6) The minimizers of $G$ in $X_M$ are unique up to a translation $\rho_N(x) \to \rho_N(x + y)$.

Proof. First, the proofs of (1) and (2) are the same as those of Theorems 4.1 and 4.2 with $L = 0$ (it is easy to check that the axial symmetry is not used in the proofs of Theorems 4.1 and 4.2 if $L = 0$). Lemmas 3.4, 3.5 and 3.7 still hold by taking $\gamma = 4/3$ and $L = 0$, and replacing $W_M$ by $X_M$, $F$ by $G$ and $f_M$ by $\inf_{\rho \in X_M} G(\rho)$. Also, it is easy to check that (3.25)–(3.29) in the proof of Lemma 3.6 still hold by replacing $f_M$ by $\inf_{\rho \in X_M} G(\rho)$. Therefore, following the proof of Lemma 3.6, we conclude: if $\{\rho^i\} \subset X_M$ is a minimizing sequence for $G$, then there exist constants $\delta_0 > 0$, $i_0 \in \mathbb{N}$ and points $x^i \in \mathbb{R}^3$ such that

\[ \int_{B_1(x^i)} \rho^i(x)\,dx \ge \delta_0, \quad i \ge i_0. \]

Therefore, if we let $T\rho^i(x) := \rho^i(x + x^i)$, then

\[ \int_{B_1(0)} T\rho^i(x)\,dx \ge \delta_0, \quad i \ge i_0. \tag{5.11} \]

This is similar to (3.39). Having established this inequality and the analogues of Lemmas 3.4, 3.5 and 3.7, we can prove this theorem in the same manner as Theorem 3.1. The uniqueness of minimizers is proved in [22].

For the stability, we consider the Cauchy problem (1.1) with the initial data (3.53). We do not assume that the initial data have any symmetry. Let $\tilde\rho_N$ be the minimizer of $G$ on $X_M$ and $\lambda_N$ be the constant in (5.5). For $\rho \in X_M$, we define

\[ d(\rho, \tilde\rho_N) = \int \big\{[A(\rho) - A(\tilde\rho_N)] - (\rho - \tilde\rho_N)(\lambda_N + B\tilde\rho_N)\big\}\,dx = \int \big\{[A(\rho) - A(\tilde\rho_N)] - B\tilde\rho_N\,(\rho - \tilde\rho_N)\big\}\,dx, \tag{5.12} \]

where we have used the identity

\[ \int \rho\,dx = \int \tilde\rho_N\,dx = M \]

for $\rho \in X_M$. By an argument similar to (3.86)–(3.88), we have

\[ d(\rho, \tilde\rho_N) \ge 0 \tag{5.13} \]

for any $\rho \in X_M$, in view of (4.6). Our nonlinear stability theorem for non-rotating white dwarf star solutions is the following, which extends the results in [32].


Theorem 5.3. Suppose that the pressure function satisfies (5.1). Let $\tilde\rho_N$ be the minimizer of the functional $G$ in $X_M$. Let $(\rho, v, \Phi)(x, t)$ be an entropy weak solution of the Cauchy problem (1.1) and (3.52) stated in Theorem 3.2 satisfying (3.61) and (3.63). If the initial data satisfy $\int \rho_0(x)\,dx = \int \tilde\rho_N(x)\,dx = M$, then there exists a constant $M_c$ ($0 < M_c < \infty$) such that if $M < M_c$, then for every $\epsilon > 0$ there exists a number $\delta > 0$ such that if

\[ d(\rho_0, \tilde\rho_N) + \frac{1}{8\pi}\|\nabla B\rho_0 - \nabla B\tilde\rho_N\|_2^2 + \frac12 \int \rho_0(x)\,|v_0|^2(x)\,dx < \delta, \tag{5.14} \]

then for every $t > 0$ there is a translation $y(t) \in \mathbb{R}^3$ such that

\[ d(\rho(t), T^{y(t)}\tilde\rho_N) + \frac{1}{8\pi}\|\nabla B\rho(t) - \nabla B\,T^{y(t)}\tilde\rho_N\|_2^2 + \frac12 \int \rho(x,t)\,|v(x,t)|^2\,dx < \epsilon, \tag{5.15} \]

where $T^{y(t)}\tilde\rho_N(x) := \tilde\rho_N(x + y(t))$.

The proof of this theorem follows from the compactness result (Theorem 5.2) and the arguments in the proof of Theorem 3.3 and in [32], and is thus omitted.

Acknowledgements. Luo was supported in part by the National Science Foundation under Grants DMS-0606853 and DMS-0742834. Smoller was supported in part by the National Science Foundation under Grant DMS-0603754. The authors are grateful to the referee, whose suggestions have helped to improve the presentation of the paper greatly. Part of this work was done during Luo’s stay at Worcester Polytechnic Institute (WPI). Support received from WPI is gratefully acknowledged.

References

1. Auchmuty, G., Beals, R.: Variational solutions of some nonlinear free boundary problems. Arch. Rat. Mech. Anal. 43, 255–271 (1971)
2. Auchmuty, G.: The global branching of rotating stars. Arch. Rat. Mech. Anal. 114, 179–194 (1991)
3. Caffarelli, L., Friedman, A.: The shape of axi-symmetric rotating fluid. J. Funct. Anal. 694, 109–142 (1980)
4. Chandrasekhar, S.: Phil. Mag. 11, 592 (1931); Astrophys. J. 74, 81 (1931); Monthly Notices Roy. Astron. Soc. 91, 456 (1931); Rev. Mod. Phys. 56, 137 (1984)
5. Chandrasekhar, S.: Introduction to the Stellar Structure. Chicago, IL: University of Chicago Press, 1939
6. Chanillo, S., Li, Y.Y.: On diameters of uniformly rotating stars. Commun. Math. Phys. 166(2), 417–430 (1994)
7. Deng, Y., Liu, T.P., Yang, T., Yao, Z.: Solutions of Euler-Poisson equations for gaseous stars. Arch. Rat. Mech. Anal. 164(3), 261–285 (2002)
8. Fowler, R.H.: Monthly Notices Roy. Astron. Soc. 87, 114 (1926)
9. Friedman, A., Turkington, B.: Asymptotic estimates for an axi-symmetric rotating fluid. J. Funct. Anal. 37, 136–163 (1980)
10. Friedman, A., Turkington, B.: Existence and dimensions of a rotating white dwarf. J. Diff. Eqns. 42, 414–437 (1981)
11. Jang, J.: Nonlinear instability in gravitational Euler-Poisson system for γ = 6/5. Arch. Rat. Mech. Anal. 188(2), 265–307 (2008)
12. Guo, Y., Rein, G.: Stable steady states in stellar dynamics. Arch. Rat. Mech. Anal. 147, 225–243 (1999)
13. Guo, Y., Rein, G.: Stable models of elliptical galaxies. Mon. Not. R. Astron. Soc. 344, 1296–1306 (2002)
14. Gilbarg, D., Trudinger, N.: Elliptic Partial Differential Equations of Second Order. 2nd ed., Berlin-Heidelberg-New York: Springer, 1983


15. Humi, M.: Steady states of self-gravitating incompressible fluid in two dimensions. J. Math. Phys. 47(9), 093101 (2006)
16. Landau, L.: Phys. Z. Sowjetunion 1, 285 (1932)
17. Lax, P.: Shock Waves and Entropy. In: Nonlinear Functional Analysis, E. A. Zarantonello, ed., New York: Academic Press, 1971, pp. 603–634
18. Lebovitz, N.R.: The virial tensor and its application to self-gravitating fluids. Astrophys. J. 134, 500–536 (1961)
19. Lebovitz, N.R., Lifschitz, A.: Short-wavelength instabilities of Riemann ellipsoids. Philos. Trans. Roy. Soc. London Ser. A 354(1709), 927–950 (1996)
20. LeFloch, P., Westdickenberg, M.: Finite energy solutions to the isentropic Euler equations with geometric effects. J. Math. Pures Appl. (9) 88(5), 389–429 (2007)
21. Li, Y.Y.: On uniformly rotating stars. Arch. Rat. Mech. Anal. 115(4), 367–393 (1991)
22. Lieb, E.H., Yau, H.T.: The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics. Commun. Math. Phys. 112(1), 147–174 (1987)
23. Lin, S.S.: Stability of gaseous stars in spherically symmetric motions. SIAM J. Math. Anal. 28(3), 539–569 (1997)
24. Lions, P.L.: The concentration-compactness principle in the calculus of variations. The locally compact case. Part I. Ann. Inst. H. Poincaré Anal. Non Linéaire 1, 109–145 (1984)
25. Luo, T., Smoller, J.: Rotating fluids with self-gravitation in bounded domains. Arch. Rat. Mech. Anal. 173(3), 345–377 (2004)
26. Luo, T., Smoller, J.: Existence and Nonlinear Stability of Rotating Star Solutions of the Compressible Euler-Poisson Equations. (To appear in Arch. Rat. Mech. Anal.) DOI: 10.1007/s00205-007-0108-y, 2008
27. Majda, A.: The existence of multidimensional shock fronts. Mem. Amer. Math. Soc. 43(281), Providence, RI: Amer. Math. Soc., 1983
28. Makino, T.: Blowing up of the Euler-Poisson equation for the evolution of gaseous star. Trans. Th. Stat. Phys. 21, 615–624 (1992)
29. Makino, T.: On a local existence theorem for the evolution equation of gaseous stars. In: Patterns and waves, Stud. Math. Appl., 18, Amsterdam: North-Holland, 1986, pp. 459–479
30. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
31. Rein, G.: Reduction and a concentration-compactness principle for energy-Casimir functionals. SIAM J. Math. Anal. 33(4), 896–912 (2001)
32. Rein, G.: Non-linear stability of gaseous stars. Arch. Rat. Mech. Anal. 168(2), 115–130 (2003)
33. Shapiro, S.H., Teukolsky, S.A.: Black Holes, White Dwarfs, and Neutron Stars. New York: WILEY-VCH, 2004
34. Smoller, J.: Shock Waves and Reaction-Diffusion Equations. 2nd ed., Berlin-New York: Springer, 1994
35. Tassoul, J.L.: Theory of Rotating Stars. Princeton, NJ: Princeton University Press, 1978
36. Wang, D.: Global Solutions and Stability for Self-Gravitating Isentropic Gases. J. of Math. Anal. Appl. 229, 530–542 (1999)
37. Weinberg, S.: Gravitation and Cosmology. New York: John Wiley and Sons, 1972

Communicated by H.-T. Yau

Commun. Math. Phys. 284, 459–479 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0521-6

Communications in

Mathematical Physics

The Lieb-Liniger Model as a Limit of Dilute Bosons in Three Dimensions

Robert Seiringer, Jun Yin

Department of Physics, Jadwin Hall, Princeton University, Princeton, NJ 08542-0708, USA. E-mail: [email protected]; [email protected]

Received: 12 October 2007 / Accepted: 14 January 2008
Published online: 4 June 2008 – © The Authors 2008

Abstract: We show that the Lieb-Liniger model for one-dimensional bosons with repulsive δ-function interaction can be rigorously derived via a scaling limit from a dilute three-dimensional Bose gas with arbitrary repulsive interaction potential of finite scattering length. For this purpose, we prove bounds on both the eigenvalues and corresponding eigenfunctions of three-dimensional bosons in strongly elongated traps and relate them to the corresponding quantities in the Lieb-Liniger model. In particular, if both the scattering length $a$ and the radius $r$ of the cylindrical trap go to zero, the Lieb-Liniger model with coupling constant $g \sim a/r^2$ is derived. Our bounds are uniform in $g$ in the whole parameter range $0 \le g \le \infty$, and apply to the Hamiltonian for three-dimensional bosons in a spectral window of size $\sim r^{-2}$ above the ground state energy.

© 2008 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.

1. Introduction

Given the success of the Lieb-Liniger model [11,12], both as a toy model in statistical mechanics and as a concrete model of dilute atomic gases in strongly elongated traps, it is worth investigating rigorously its connection to three-dimensional models with genuine particle interactions. A first step in this direction was taken in [16,17], where it was shown that in an appropriate scaling limit the ground state energy of a dilute three-dimensional Bose gas is given by the ground state energy of the Lieb-Liniger model. The purpose of this paper is to extend this result to excited energy eigenvalues and the corresponding eigenfunctions.

The Lieb-Liniger model has recently received a lot of attention as a model for dilute Bose gases in strongly elongated traps [2,3,6,9,16,21–24]. Originally introduced as a toy model of a quantum many-body system, it has now become relevant for the description of actual quasi one-dimensional systems. The recent advances in experimental techniques have made it possible to create such quasi one-dimensional systems in the laboratory [5,7,10,19,20,26,28]. These provide a unique setting for studying matter under extreme conditions where quantum effects dominate.

The Lieb-Liniger model describes $n$ non-relativistic bosons in one spatial dimension, interacting via a δ-function potential with strength $g \ge 0$. In appropriate units, the Hamiltonian is given by

\[ H_{1d}^{n,\ell,g} = \sum_{i=1}^{n} \Big( -\partial_i^2 + \ell^{-2} V^{\parallel}(z_i/\ell) \Big) + g \sum_{1 \le i < j \le n} \delta(z_i - z_j). \tag{1.1} \]

Here, we use the notation $\partial_i = \partial/\partial z_i$ for brevity. The trap is represented by the potential $V^{\parallel}$, which is assumed to be locally bounded and to tend to infinity as $|z| \to \infty$. The scaling parameter $\ell$ is a measure of the size of the trap. Instead of the trap potential $V^{\parallel}$, one can also confine the system to an interval of length $\ell$ with appropriate boundary conditions; in fact, periodic boundary conditions were considered in [11,12].

The Lieb-Liniger Hamiltonian $H_{1d}^{n,\ell,g}$ acts on totally symmetric wavefunctions $\phi \in L^2(\mathbb{R}^n)$, i.e., square-integrable functions satisfying $\phi(z_1, \dots, z_n) = \phi(z_{\pi(1)}, \dots, z_{\pi(n)})$ for any permutation $\pi$. In the following, all wavefunctions will be considered symmetric unless specified otherwise.

In the case of periodic boundary conditions on the interval $[0, \ell]$, Lieb and Liniger have shown that the spectrum and corresponding eigenfunctions of $H_{1d}^{n,\ell,g}$ can be obtained via the Bethe ansatz [12]. In [11] Lieb has specifically studied the excitation spectrum, which has an interesting two-branch structure. This structure has recently received a lot of attention [3,9] in the physics literature. Our results below show that the excitation spectrum of the Lieb-Liniger model is a genuine property of dilute three-dimensional bosons in strongly elongated traps in an appropriate parameter regime.

In the following, we shall consider dilute three-dimensional Bose gases in strongly elongated traps. Here, dilute means that $a\rho^{1/3} \ll 1$, where $a$ is the scattering length of the interaction potential, and $\rho$ is the average particle density. Strongly elongated means that $r \ll \ell$, where $r$ is the length scale of confinement in the directions perpendicular to $z$. We shall show that, for fixed $n$ and $\ell$, the spectrum of a three-dimensional Bose gas in an energy interval of size $\sim r^{-2}$ above the ground state energy is approximately equal to the spectrum of the Lieb-Liniger model (1.1) as long as $a \ll r$ and $r \ll \ell$. The effective coupling parameter $g$ in (1.1) is of the order $g \sim a/r^2$ and can take any value in $[0, \infty]$. The same result applies to the corresponding eigenfunctions. They are approximately given by the corresponding eigenfunctions of the Lieb-Liniger Hamiltonian, multiplied by a product function of the variables orthogonal to $z$. The precise statement of our results will be given in the next section.

We remark that the problem considered here is somewhat analogous to the one-dimensional behavior of atoms in extremely strong magnetic fields, where the Coulomb interaction behaves like an effective one-dimensional δ-potential when the magnetic field shrinks the cyclotron radius of the electrons to zero. For such systems, the asymptotics of the ground state energy was studied in [1], and later the excitation spectrum and corresponding eigenfunctions were investigated in [4]. In this case, the effective one-dimensional potential can be obtained formally by integrating out the variables transverse to the magnetic field in a suitably scaled Coulomb potential. Our case considered here is much more complicated, however. The correct one-dimensional physics emerges only if the kinetic and potential parts of the Hamiltonian are considered together.
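As a small numerical illustration of the periodic case solved by the Bethe ansatz, the following sketch computes the ground state energy of $n = 2$ bosons on a ring of length $\ell$ with no trap potential. It is only a sketch under stated assumptions: for two particles with zero total momentum, the relative wavefunction is taken as $\cos(k(y - \ell/2))$, the δ-interaction of strength $g$ in (1.1) gives the condition $k\ell = 2\arctan(g/(4k))$, and the energy is $2k^2$. The function name `two_boson_ground_energy` and the bisection solver are illustrative choices, not part of the paper.

```python
import math


def two_boson_ground_energy(ell, g, iterations=200):
    """Ground state energy of two bosons on a ring of length ell.

    Assumes the Hamiltonian -d^2/dz1^2 - d^2/dz2^2 + g*delta(z1 - z2)
    with periodic boundary conditions and coupling g >= 0.
    The wave number k solves k*ell = 2*atan(g / (4k)); the energy is 2*k^2.
    """
    def f(k):
        # atan2 handles k = 0 (limit pi/2 for g > 0) and g = 0 (value 0).
        return k * ell - 2.0 * math.atan2(g, 4.0 * k)

    # f is increasing in k, f(0) <= 0 and f(pi/ell) >= 0, so bisection applies.
    lo, hi = 0.0, math.pi / ell
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    return 2.0 * k * k


if __name__ == "__main__":
    for g in (0.0, 1.0, 10.0, 1000.0):
        print(g, two_boson_ground_energy(1.0, g))
```

In this two-particle example the energy grows with $g$ but stays below $2\pi^2/\ell^2$, and rescaling $\ell$ only changes the energy by the factor $\ell^{-2}$ once $g$ is replaced by $\ell g$.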


2. Main Results

Consider the Hamiltonian for $n$ spinless bosons in three space dimensions, interacting via a pair-potential $v$. We shall write $x = (x^\perp, z) \in \mathbb{R}^3$, with $x^\perp \in \mathbb{R}^2$. Let $V^\perp \in L^\infty_{\rm loc}(\mathbb{R}^2)$ and $V^\parallel \in L^\infty_{\rm loc}(\mathbb{R})$ denote the (real-valued) confining potentials in the $x^\perp$ and the $z$-direction, respectively. Then

\[ H_{3d}^{n,r,\ell,a} = \sum_{i=1}^{n} \Big( -\Delta_i + r^{-2} V^\perp(x_i^\perp/r) + \ell^{-2} V^\parallel(z_i/\ell) \Big) + \sum_{1 \le i < j \le n} a^{-2} v(|x_i - x_j|/a). \tag{2.1} \]

The trap potentials $V^\parallel$ and $V^\perp$ confine the motion in the longitudinal ($z$) and the transversal ($x^\perp$) directions, respectively, and are assumed to be locally bounded and tend to $\infty$ as $|z|$ and $|x^\perp|$ tend to $\infty$. More precisely, $V^\parallel \in L^\infty_{\rm loc}(\mathbb{R})$ and $V^\perp \in L^\infty_{\rm loc}(\mathbb{R}^2)$, with $\lim_{R\to\infty} \inf_{|z| \ge R} V^\parallel(z) = \lim_{R\to\infty} \inf_{|x^\perp| \ge R} V^\perp(x^\perp) = +\infty$. Without loss of generality, we can assume that $V^\parallel \ge 0$. The scaling parameters $r$ and $\ell$ measure the size of the traps.

The interaction potential $v$ is assumed to be a measurable, nonnegative function with finite range $R_0$, i.e., $v(r) = 0$ for $r > R_0$. We assume that its scattering length equals 1; the scaled potential $a^{-2} v(|\cdot|/a)$ then has scattering length $a$ [14,15] and range $aR_0$. We do not assume any smoothness or even integrability of $v$. In particular, we allow $v$ to take the value $+\infty$ on an interval $[0, R_1]$, corresponding to hard-sphere particles. In this case, the Hamiltonian (2.1) has to be restricted to the subset of $\mathbb{R}^{3n}$ where $|x_i - x_j| \ge aR_1$ for any pair $i \ne j$, with Dirichlet boundary conditions on the boundary of this set. To be precise, let us assume that $v(r) = \infty$ for $0 \le r \le R_1$ for some $0 \le R_1 \le R_0$, and that $v(r)$ is bounded on $[R_1 + \varepsilon, \infty)$ for any $\varepsilon > 0$. The Hamiltonian (2.1) is then defined as the Friedrichs extension [25, Thm. X.23] of the operator on $C_0^\infty(\Omega)$, where $\Omega$ denotes the open set $\mathbb{R}^{3n} \setminus \{(x_1, \dots, x_n) \in \mathbb{R}^{3n} : |x_i - x_j| \le aR_1 \text{ for some } i \ne j\}$. In general, all the operators considered here are defined via the Friedrichs extension of the associated quadratic forms, and all estimates are meant in the form sense.

Let $e^\perp$ and $b(x^\perp)$ denote the ground state energy and the normalized ground state wave function of $-\Delta^\perp + V^\perp(x^\perp)$, respectively. Note that $b$ is a bounded and strictly positive function and, in particular, $b \in L^p(\mathbb{R}^2)$ for any $2 \le p \le \infty$. Let also $\tilde e^\perp > 0$ denote the gap above the ground state energy of $-\Delta^\perp + V^\perp(x^\perp)$. The corresponding quantities for $-\Delta^\perp + r^{-2} V^\perp(x^\perp/r)$ are then given by $e^\perp/r^2$, $\tilde e^\perp/r^2$ and $b_r(x^\perp) = r^{-1} b(x^\perp/r)$, respectively.

The eigenvalues of $H_{3d}^{n,r,\ell,a}$ will be denoted by $E_{3d}^k(n, r, \ell, a)$, with $k = 1, 2, 3, \dots$. Moreover, the eigenvalues of $H_{1d}^{n,\ell,g}$ in (1.1) will be denoted by $E_{1d}^k(n, \ell, g)$. Theorem 1 shows that $E_{3d}^k(n, r, \ell, a)$ is approximately equal to $E_{1d}^k(n, \ell, g) + ne^\perp/r^2$ for small $a/r$ and $r/\ell$, for an appropriate value of the parameter $g$. In fact, $g$ turns out to be given by

\[ g = \frac{8\pi a}{r^2} \int_{\mathbb{R}^2} |b(x^\perp)|^4\, d^2x^\perp. \tag{2.2} \]

Theorem 1. Let $g$ be given in (2.2). There exist constants $C > 0$ and $D > 0$, independent of $a$, $r$, $\ell$, $n$ and $k$, such that the following bounds hold:

(a)
\[ E_{3d}^k(n, r, \ell, a) \ge \frac{ne^\perp}{r^2} + E_{1d}^k(n, \ell, g)\,(1 - \eta_L)\Big(1 - \frac{r^2}{\tilde e^\perp} E_{1d}^k(n, \ell, g)\Big) \tag{2.3} \]
as long as $E_{1d}^k(n, \ell, g) \le \tilde e^\perp/r^2$, with
\[ \eta_L = D\left[\Big(\frac{na}{r}\Big)^{1/8} + n^2 \Big(\frac{na}{r}\Big)^{3/8}\right]. \tag{2.4} \]

(b)
\[ E_{3d}^k(n, r, \ell, a) \le \Big(\frac{ne^\perp}{r^2} + E_{1d}^k(n, \ell, g)\Big)(1 - \eta_U)^{-1} \tag{2.5} \]
whenever $\eta_U < 1$, where
\[ \eta_U = C \Big(\frac{na}{r}\Big)^{2/3}. \tag{2.6} \]

We note that $E_{1d}^k(n, \ell, g)$ is monotone increasing in $g$, and uniformly bounded in $g$ for fixed $k$. In fact, $E_{1d}^k(n, \ell, g) \le E_f^k(n, \ell)$ for all $g \ge 0$ and for all $k$, where $E_f^k(n, \ell)$ are the eigenenergies of $n$ non-interacting fermions in one dimension [8]. Using this property of uniform boundedness, we can obtain the following corollary from Theorem 1.

Corollary 1. Fix $k$, $n$ and $\ell$. If $r \to 0$, $a \to 0$ in such a way that $a/r \to 0$, then

\[ \lim \frac{E_{3d}^k(n, r, \ell, a) - ne^\perp/r^2}{E_{1d}^k(n, \ell, g)} = 1. \tag{2.7} \]

Note that by simple scaling $E_{3d}^k(n, r, \ell, a) = \ell^{-2} E_{3d}^k(n, r/\ell, 1, a/\ell)$, and likewise $E_{1d}^k(n, \ell, g) = \ell^{-2} E_{1d}^k(n, 1, \ell g)$. Hence it is no restriction to fix $\ell$ in the limit considered in Corollary 1. In fact, $\ell$ could be set equal to 1 without loss of generality. Note also that the convergence stated in Corollary 1 is uniform in $g$, in the sense that $g \sim ar^{-2}$ is allowed to go to $+\infty$ as $r \to 0$, $a \to 0$, as long as $gr \sim a/r \to 0$.

In order to state our results on the corresponding eigenfunctions of $H_{3d}^{n,r,\ell,a}$ and $H_{1d}^{n,\ell,g}$, we first have to introduce some additional notation to take into account the possible degeneracies of the eigenvalues. Let $k_1 = 1$, and let $k_i$ be recursively defined by

\[ k_i = \min\big\{ k : E_{1d}^k(n, \ell, g) > E_{1d}^{k_{i-1}}(n, \ell, g) \big\}. \]

Then $E_{1d}^{k_i}(n, \ell, g) < E_{1d}^{k_{i+1}}(n, \ell, g)$, while $E_{1d}^{k_i + j}(n, \ell, g) = E_{1d}^{k_i}(n, \ell, g)$ for $0 \le j < k_{i+1} - k_i$. That is, $k$ counts the energy levels including multiplicities, while $i$ counts the levels without multiplicities. Hence, if $k_i \le k < k_{i+1}$, the energy eigenvalue $E_{1d}^k(n, \ell, g)$ is $(k_{i+1} - k_i)$-fold degenerate. Note that $k_i$ depends on $n$, $\ell$ and $g$, of course, but we suppress this dependence in the notation for simplicity.

Our main result concerning the eigenfunctions of $H_{3d}^{n,r,\ell,a}$ is as follows.


Theorem 2. Let $g$ be given in (2.2). Let $\Psi_k$ be an eigenfunction of $H_{3d}^{n,r,\ell,a}$ with eigenvalue $E_{3d}^k(n, r, \ell, a)$, and let $\psi_l$ ($k_i \le l < k_{i+1}$) be orthonormal eigenfunctions of $H_{1d}^{n,\ell,g}$ corresponding to eigenvalue $E_{1d}^k(n, \ell, g)$. Then

\[
\begin{aligned}
\sum_{l=k_i}^{k_{i+1}-1} \Big| \Big\langle \Psi_k \Big| \psi_l \prod_{i=1}^{n} b_r(x_i^\perp) \Big\rangle \Big|^2 \ge 1 &- \left( \frac{\big(1 - (r^2/\tilde e^\perp) E_{1d}^k(n,\ell,g)\big)^{-1}}{(1-\eta_U)(1-\eta_L)} - 1 \right) \\
&\times \left[ \sum_{j=1}^{k_{i+1}-1} \frac{E_{1d}^j(n,\ell,g)}{E_{1d}^{k_{i+1}}(n,\ell,g) - E_{1d}^{k_i}(n,\ell,g)} + \sum_{j=1}^{k_i-1} \frac{E_{1d}^j(n,\ell,g)}{E_{1d}^{k_i}(n,\ell,g) - E_{1d}^{k_i-1}(n,\ell,g)} \right]
\end{aligned}
\tag{2.8}
\]

as long as $\eta_U < 1$, $\eta_L < 1$ and $E_{1d}^k(n, \ell, g) < \tilde e^\perp/r^2$.

In particular, this shows that $\Psi_k(x_1, \dots, x_n)$ is approximately of the product form $\psi_k(z_1, \dots, z_n) \prod_i b_r(x_i^\perp)$ for small $r/\ell$ and $a/r$, where $\psi_k$ is an eigenfunction of $H_{1d}^{n,\ell,g}$ with eigenvalue $E_{1d}^k(n, \ell, g)$. Note that although $\Psi_k$ is close to such a product in the $L^2(\mathbb{R}^{3n})$ sense, it is certainly not close to a product in a stronger norm involving the energy (more precisely, the kinetic or interaction energy). In fact, a product wavefunction will have infinite energy if the interaction potential $v$ contains a hard core; in any case, even if $v$ is smooth, the energy of a product wave function will be too big and not related to the scattering length of $v$ at all.

Corollary 2. Fix $k$, $n$, $\ell$ and $g_0 \ge 0$. Let $P^k_{g_0,1d}$ denote the projection onto the eigenspace of $H_{1d}^{n,\ell,g_0}$ with eigenvalue $E_{1d}^k(n, \ell, g_0)$, and let $P_r^\perp$ denote the projection onto the function $\prod_{i=1}^n b_r(x_i^\perp) \in L^2(\mathbb{R}^{2n})$. If $r \to 0$, $a \to 0$ in such a way that $g = 8\pi \|b\|_4^4\, a/r^2 \to g_0$, then

\[ \lim \big\langle \Psi_k \big| P_r^\perp \otimes P^k_{g_0,1d} \big| \Psi_k \big\rangle = 1. \tag{2.9} \]

Here, the tensor product refers to the decomposition $L^2(\mathbb{R}^{3n}) = L^2(\mathbb{R}^{2n}) \otimes L^2(\mathbb{R}^n)$ into the transversal ($x^\perp$) and longitudinal ($z$) variables. We note that Corollary 2 holds also in case $g_0 = \infty$. In this case, $P^k_{g_0,1d}$ has to be defined as the spectral projection with respect to the limiting energies $\lim_{g\to\infty} E_{1d}^k(n, \ell, g)$. Using compactness, it is in fact easy to see that the limit of $P^k_{g_0,1d}$ as $g_0 \to \infty$ exists in the operator norm topology. Alternatively, one may directly define $H_{1d}^{n,\ell,\infty}$ with Dirichlet boundary conditions replacing the δ-function interaction, and $P^k_{\infty,1d}$ as its spectral projections. We omit the details.

Our results imply, in particular, that $H_{3d}^{n,r,\ell,a}$ converges to $H_{1d}^{n,\ell,g}$ in a certain norm-resolvent sense. In fact, if $a \to 0$ and $r \to 0$ in such a way that $g = 8\pi \|b\|_4^4\, a/r^2 \to g_0$, and $\lambda \in \mathbb{C} \setminus [E_{1d}^1(n, \ell, g_0), \infty)$ is fixed, then it follows easily from Corollaries 1 and 2 that

\[ \lim \left\| \frac{1}{\lambda + ne^\perp/r^2 - H_{3d}^{n,r,\ell,a}} - P_r^\perp \otimes \frac{1}{\lambda - H_{1d}^{n,\ell,g_0}} \right\| = 0. \tag{2.10} \]

In this sense, the Hamiltonian $H_{3d}^{n,r,\ell,a}$ is close to $P_r^\perp \otimes H_{1d}^{n,\ell,g}$. We emphasize, however, that $H_{1d}^{n,\ell,g}$ cannot be obtained by simply projecting $H_{3d}^{n,r,\ell,a}$ onto the function $\prod_{i=1}^n b_r(x_i^\perp)$. For instance, if $v$ is not integrable, any eigenfunction of $P_r^\perp$ is necessarily outside of the form domain of $H_{3d}^{n,r,\ell,a}$, and hence has infinite energy! In this respect, the problem studied in this paper is different from and more difficult than seemingly similar problems where, e.g., the Feshbach projection method could be employed.

Our main results, Theorems 1 and 2, can be extended in several ways, as we will explain now. For simplicity and transparency, we shall not formulate the proofs in the most general setting.

• Instead of allowing the whole of $\mathbb{R}^2$ as the configuration space for the $x^\perp$ variables, one could restrict it to a subset, with appropriate boundary conditions for the Laplacian $\Delta^\perp$. For instance, if $V^\perp$ is zero and the motion is restricted to a disk with Dirichlet boundary conditions, the corresponding ground state function $b$ in the transversal directions is given by a Bessel function.

• Similarly, instead of taking $\mathbb{R}$ as the configuration space for the $z$ variables, one can work on an interval with appropriate boundary conditions. In particular, the case $V^\parallel = 0$ with periodic boundary conditions on $[0, \ell]$ can be considered, which is the special case studied by Lieb and Liniger in [11,12].

• As noted in previous works on dilute Bose gases [14,15], the restriction of $v$ having a finite range can be dropped. Corollaries 1 and 2 remain true for all repulsive interaction potentials $v$ with finite scattering length. Also Theorems 1 and 2 remain valid, with possibly modified error terms, however, depending on the rate of decay of $v$ at infinity. We refer to [14,15] for details.

• Our results can be extended to any symmetry type of the 1D wavefunctions, not just symmetric ones. In particular, one can allow the particles to have internal degrees of freedom, like spin.

• As mentioned in the Introduction, in the special case $k = 1$, i.e., for the ground state energy, similar bounds as in Theorem 1 have been obtained in [17]. In spite of the fact that the error terms are not uniform in the particle number $n$, these bounds have then been used to estimate the ground state energy in the thermodynamic limit, using the technique of Dirichlet–Neumann bracketing. Combining this technique with the results of Theorem 1, one can obtain appropriate bounds on the free energy at positive temperature and other thermodynamic potentials in the thermodynamic limit as well. In fact, since our energy bounds apply to all (low-lying) energy eigenvalues, bounds on the free energy in a finite volume are readily obtained. The technique employed in [17] then allows for an extension of these bounds to infinite volume (at fixed particle density).

In the following, we shall give the proof of Theorems 1 and 2. The next Sect. 3 gives the proof of the upper bound to the energies $E_{3d}^k(n, r, \ell, a)$, as stated in Theorem 1(b). The corresponding lower bounds in Theorem 1(a) are proved in Sect. 4. Finally, the proof of Theorem 2 will be given in Sect. 5.

3. Upper Bounds

This section contains the proof of the upper bounds to the 3D energies $E_{3d}^k(n, r, \ell, a)$, stated in Theorem 1(b). Our strategy is similar to the one in [17, Sect. 3.1] and we will use some of the estimates derived there. The main improvements presented here concern the extension to excited energy eigenvalues, and the derivation of a bound that is uniform in the effective coupling constant $g$ in (2.2). In contrast, the upper bound in [17, Thm. 3.1] applies only to the ground state energy $E_{3d}^1(n, r, \ell, a)$, and is not uniform in $g$ for large $g$.

Before proving the upper bounds to the 3D energies $E_{3d}^k(n, r, \ell, a)$, we will prove a simple lemma that will turn out to be useful in the following.

Lemma 1. Let $H$ be a non-negative Hamiltonian on a Hilbert space $\mathcal{H}$, with eigenvalues $0 \le E^1 \le E^2 \le E^3 \le \dots$. For $k \ge 1$, let $f_1, \dots, f_k \in \mathcal{H}$. If, for any $\{a_i\}$ with $\sum_{i=1}^k |a_i|^2 = 1$, we have

\[ \Big\| \sum_{i=1}^k a_i f_i \Big\|^2 \ge 1 - \varepsilon \]

for $\varepsilon < 1$ and

\[ \Big\langle \sum_{i=1}^k a_i f_i \Big| H \Big| \sum_{i=1}^k a_i f_i \Big\rangle \le E, \]

then $E^k \le E(1 - \varepsilon)^{-1}$.

Proof. For any $f$ in the $k$-dimensional subspace spanned by the $f_i$, we have $\langle f | H | f \rangle \le E(1 - \varepsilon)^{-1} \langle f | f \rangle$ by assumption. Hence $E^k \le E/(1 - \varepsilon)$ by the variational principle.
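The statement of Lemma 1 can be checked on a toy example. The sketch below uses a hypothetical $3 \times 3$ diagonal Hamiltonian with eigenvalues 1, 2, 5 and two slightly non-orthonormal trial vectors; these numbers are assumptions chosen for illustration only. It scans real unit coefficient vectors $(a_1, a_2) = (\cos\theta, \sin\theta)$ on a grid, which suffices here because the matrix and the vectors are real, and confirms $E^2 \le E/(1 - \varepsilon)$ for $k = 2$.

```python
import math

# Toy data (illustrative assumptions, not from the paper).
H_DIAG = (1.0, 2.0, 5.0)   # eigenvalues E^1 <= E^2 <= E^3 of a diagonal H
F1 = (0.95, 0.0, 0.1)      # trial vector close to the first eigenvector
F2 = (0.0, 0.95, 0.1)      # trial vector close to the second eigenvector


def combination(theta):
    """Return a1*F1 + a2*F2 with (a1, a2) = (cos theta, sin theta)."""
    a1, a2 = math.cos(theta), math.sin(theta)
    return tuple(a1 * u + a2 * v for u, v in zip(F1, F2))


def norm_squared(f):
    return sum(c * c for c in f)


def energy(f):
    """Quadratic form <f|H|f> for the diagonal H."""
    return sum(e * c * c for e, c in zip(H_DIAG, f))


def lemma_bound(samples=3600):
    """Scan unit coefficient vectors and return (E, eps, E / (1 - eps))."""
    thetas = [math.pi * s / samples for s in range(samples)]
    eps = 1.0 - min(norm_squared(combination(t)) for t in thetas)
    big_e = max(energy(combination(t)) for t in thetas)
    return big_e, eps, big_e / (1.0 - eps)


if __name__ == "__main__":
    big_e, eps, bound = lemma_bound()
    print("E =", big_e, "eps =", eps, "bound on E^2:", bound)
```

In this example the bound is close to the true second eigenvalue because the trial vectors are nearly orthonormal and nearly eigenvectors.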

Pick $\{a_i\}$ with $\sum_{i=1}^k |a_i|^2 = 1$, and let $\phi = \sum_{i=1}^k a_i \psi_i$, where the $\psi_i$ are orthonormal eigenfunctions of $H_{1d}^{n,\ell,g}$ with eigenvalues $E_{1d}^i(n, \ell, g)$. Consider a 3D trial wave function of the form

\[ \Psi(x_1, \dots, x_n) = \phi(z_1, \dots, z_n)\, F(x_1, \dots, x_n) \prod_{k=1}^{n} b_r(x_k^\perp), \]

where $F$ is defined by

\[ F(x_1, \dots, x_n) = \prod_{i<j} f(|x_i - x_j|). \]

Here, $f$ is a function with $0 \le f \le 1$, monotone increasing, such that $f(t) = 1$ for $t \ge R$ for some $R \ge aR_0$. For $t \le R$ we shall choose $f(t) = f_0(t)/f_0(R)$, where $f_0$ is the solution to the zero-energy scattering equation for $a^{-2} v(|\cdot|/a)$ [14,15]. That is,

\[ \left( -\frac{d^2}{dt^2} - \frac{2}{t}\frac{d}{dt} + \frac{1}{2a^2}\, v(t/a) \right) f_0(t) = 0, \]

normalized such that $\lim_{t\to\infty} f_0(t) = 1$. This $f_0$ has the properties that $f_0(t) = 1 - a/t$ for $t \ge aR_0$, and $f_0'(t) \le t^{-1} \min\{1, a/t\}$.

To be able to apply Lemma 1, we need a lower bound on the norm of $\Psi$. This can be obtained in the same way as in [17, Eq. (3.9)]. Let $\rho_\phi^{(2)}$ denote the two-particle density of $\phi$, normalized as

\[ \int_{\mathbb{R}^2} \rho_\phi^{(2)}(z, z')\,dz\,dz' = 1. \]


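Since $F$ is 1 if no pair of particles is closer together than a distance $R$, we can estimate the norm of $\Psi$ by

\[
\begin{aligned}
\langle \Psi | \Psi \rangle &\ge 1 - \frac{n(n-1)}{2} \int_{\mathbb{R}^6} \rho_\phi^{(2)}(z, z')\, b_r(x^\perp)^2\, b_r(y^\perp)^2\, \theta(R - |x - y|)\,dz\,dz'\,d^2x^\perp\,d^2y^\perp \\
&\ge 1 - \frac{n(n-1)}{2} \int_{\mathbb{R}^4} b_r(x^\perp)^2\, b_r(y^\perp)^2\, \theta(R - |x^\perp - y^\perp|)\,d^2x^\perp\,d^2y^\perp \\
&\ge 1 - \frac{n(n-1)}{2} \frac{\pi R^2}{r^2} \|b\|_4^4,
\end{aligned}
\tag{3.1}
\]

where Young’s inequality [13, Thm. 4.2] has been used in the last step. Since $F \le 1$, the norm of $\Psi$ is less than 1, and hence

\[ 1 \ge \langle \Psi | \Psi \rangle \ge 1 - \frac{n(n-1)}{2} \frac{\pi R^2}{r^2} \|b\|_4^4. \tag{3.2} \]

Next, we will derive an upper bound on $\langle \Psi | H_{3d}^{n,r,\ell,a} | \Psi \rangle$. Define $G$ by $\Psi = GF$. Using partial integration and the fact that $F$ is real-valued, we have

\[ \langle \Psi | -\Delta_j | \Psi \rangle = -\int_{\mathbb{R}^{3n}} F^2\, \bar G\, \Delta_j G + \int_{\mathbb{R}^{3n}} |G|^2 |\nabla_j F|^2. \]

Using $-\Delta^\perp b_r = (e^\perp/r^2)\, b_r - r^{-2} V^\perp(\cdot/r)\, b_r$, we therefore get

\[
\begin{aligned}
\langle \Psi | H_{3d}^{n,r,\ell,a} | \Psi \rangle ={}& \frac{ne^\perp}{r^2} \langle \Psi | \Psi \rangle + \int_{\mathbb{R}^{3n}} F^2 \prod_{j=1}^{n} b_r(x_j^\perp)^2\, \bar\phi\, \big( H_{1d}^{n,\ell,g} \phi \big) \\
& - g \Big\langle \Psi \Big| \sum_{i<j} \delta(z_i - z_j) \Big| \Psi \Big\rangle \\
& + \int_{\mathbb{R}^{3n}} |G|^2 \Big( \sum_{j=1}^{n} |\nabla_j F|^2 + \sum_{i<j} a^{-2} v(|x_i - x_j|/a)\, |F|^2 \Big).
\end{aligned}
\tag{3.3}
\]

With the aid of Schwarz’s inequality for the integration over the $z$ variables, as well as $F \le 1$,

\[
\begin{aligned}
\int_{\mathbb{R}^{3n}} F^2 \prod_{j=1}^{n} b_r(x_j^\perp)^2\, \bar\phi\, \big( H_{1d}^{n,\ell,g} \phi \big)
&\le \Big( \int_{\mathbb{R}^{3n}} F^2 \prod_{j=1}^{n} b_r(x_j^\perp)^2\, |\phi|^2 \Big)^{1/2} \Big( \int_{\mathbb{R}^{3n}} F^2 \prod_{j=1}^{n} b_r(x_j^\perp)^2\, \big| H_{1d}^{n,\ell,g} \phi \big|^2 \Big)^{1/2} \\
&\le \|\phi\|_2\, \big\| H_{1d}^{n,\ell,g} \phi \big\|_2 \le E_{1d}^k(n, \ell, g).
\end{aligned}
\tag{3.4}
\]

Here, we have used that $\phi$ is a linear combination of the first $k$ eigenfunctions of $H_{1d}^{n,\ell,g}$ to obtain the final inequality. The term in the second line of (3.3) is bounded by

\[ \Big\langle \Psi \Big| \sum_{i<j} \delta(z_i - z_j) \Big| \Psi \Big\rangle \ge \frac{n(n-1)}{2} \int_{\mathbb{R}} \rho_\phi^{(2)}(z, z)\,dz\, \Big( 1 - \frac{n(n-1)}{2} \frac{\pi R^2}{r^2} \|b\|_4^4 \Big), \tag{3.5} \]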
The Lieb-Liniger Model as a Limit of Dilute Bosons in Three Dimensions

467

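as an argument similar to (3.1) shows. The remaining last term in (3.3) can be bounded in a similar way as in [17, Eqs. (3.12)–(3.19)]. For completeness, we repeat the arguments here. Since $0 \le f \le 1$ and $f$ is monotone increasing by assumption, $F^2 \le f(|x_i - x_j|)^2$, and

\[ \sum_{j=1}^{n} |\nabla_j F|^2 \le 2 \sum_{i<j} f'(|x_i - x_j|)^2 + 4 \sum_{k<i<j} f'(|x_k - x_i|)\, f'(|x_k - x_j|). \tag{3.6} \]

Consider the first term on the right side of (3.6), together with the last term in (3.3) containing the interaction potential $v$. These terms are bounded above by

\[
\begin{aligned}
& 2 \int_{\mathbb{R}^{3n}} |G|^2 \sum_{i<j} \Big[ f'(|x_i - x_j|)^2 + \tfrac12 a^{-2} v(|x_i - x_j|/a)\, f(|x_i - x_j|)^2 \Big] \\
&\quad = n(n-1) \int_{\mathbb{R}^6} b_r(x^\perp)^2\, b_r(y^\perp)^2\, \rho_\phi^{(2)}(z, z') \Big[ f'(|x - y|)^2 + \tfrac12 a^{-2} v(|x - y|/a)\, f(|x - y|)^2 \Big]\,d^3x\,d^3y.
\end{aligned}
\tag{3.7}
\]

Let

\[ h(z) = \int_{\mathbb{R}^2} \Big[ f'(|x|)^2 + \tfrac12 a^{-2} v(|x|/a)\, f(|x|)^2 \Big]\,d^2x^\perp. \]

Note that $h$ is supported in $[-R, R]$, and $\int_{\mathbb{R}} h(z)\,dz = 4\pi a (1 - a/R)^{-1}$. Using Young’s inequality for the integration over the $\perp$-variables, we get

\[ (3.7) \le n(n-1) \frac{\|b\|_4^4}{r^2} \int_{\mathbb{R}^2} \rho_\phi^{(2)}(z, z')\, h(z - z')\,dz\,dz'. \tag{3.8} \]

Consider now the contribution from the last term in (3.6). We can write it as

\[
\begin{aligned}
& 4 \int_{\mathbb{R}^{3n}} |G|^2 \sum_{k<i<j} f'(|x_k - x_i|)\, f'(|x_k - x_j|) \\
&\quad = \frac{2}{3}\, n(n-1)(n-2) \int_{\mathbb{R}^9} b_r(x_1^\perp)^2\, b_r(x_2^\perp)^2\, b_r(x_3^\perp)^2\, \rho_\phi^{(3)}(z_1, z_2, z_3)\, f'(|x_1 - x_2|)\, f'(|x_2 - x_3|)\,d^3x_1\,d^3x_2\,d^3x_3,
\end{aligned}
\tag{3.9}
\]

where $\rho_\phi^{(3)}$ denotes the three-particle density of $\phi$, normalized to have integral 1. Let

\[ m(z) = \int_{\mathbb{R}^2} f'(|x|)\,d^2x^\perp. \]

(The function $m$ was called $k$ in [17].) Like $h$, also $m$ is supported in $[-R, R]$. For the integration over $x_1^\perp$ we use

\[ \int_{\mathbb{R}^2} f'(|x_1 - x_2|)\, b_r(x_1^\perp)^2\,d^2x_1^\perp \le \frac{\|b\|_\infty^2}{r^2}\, m(z_1 - z_2) \le \frac{\|b\|_\infty^2}{r^2}\, \|m\|_\infty. \]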
468

R. Seiringer, J. Yin

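For the remaining integrations, we proceed as in (3.8) to obtain

\[ (3.9) \le \frac{2}{3}\, n(n-1)(n-2)\, \frac{\|b\|_\infty^2}{r^2} \frac{\|b\|_4^4}{r^2}\, \|m\|_\infty \int_{\mathbb{R}^2} \rho_\phi^{(2)}(z, z')\, m(z - z')\,dz\,dz'. \tag{3.10} \]

Altogether, we have thus shown that

\[
\begin{aligned}
& \int_{\mathbb{R}^{3n}} |G|^2 \Big( \sum_{j=1}^{n} |\nabla_j F|^2 + \sum_{i<j} a^{-2} v(|x_i - x_j|/a)\, |F|^2 \Big) \\
&\quad \le n(n-1) \frac{\|b\|_4^4}{r^2} \int_{\mathbb{R}^2} \rho_\phi^{(2)}(z, z')\, h(z - z')\,dz\,dz' \\
&\qquad + \frac{2}{3}\, n(n-1)(n-2)\, \frac{\|b\|_\infty^2}{r^2} \frac{\|b\|_4^4}{r^2}\, \|m\|_\infty \int_{\mathbb{R}^2} \rho_\phi^{(2)}(z, z')\, m(z - z')\,dz\,dz'.
\end{aligned}
\tag{3.11}
\]

We now proceed differently than in [17]. Our analysis here has the advantage of yielding an upper bound that is uniform in $g$. Let $\varphi \in H^1(\mathbb{R})$. Then

\[
\begin{aligned}
\big| |\varphi(z)|^2 - |\varphi(z')|^2 \big| &= \left| \int_{z'}^{z} \frac{d|\varphi(t)|^2}{dt}\,dt \right| \\
&\le 2 |\varphi(z')| \left| \int_{z'}^{z} \Big| \frac{d\varphi(t)}{dt} \Big|\,dt \right| + 2 \left( \int_{z'}^{z} \Big| \frac{d\varphi(t)}{dt} \Big|\,dt \right)^2 \\
&\le 2 |\varphi(z')|\, |z - z'|^{1/2} \left( \int_{\mathbb{R}} \Big| \frac{d\varphi}{dz} \Big|^2 \right)^{1/2} + 2 |z - z'| \int_{\mathbb{R}} \Big| \frac{d\varphi}{dz} \Big|^2.
\end{aligned}
\tag{3.12}
\]

Here, we used $|\varphi(t)| \le |\varphi(z')| + \big| \int_{z'}^{z} |d\varphi(t)/dt|\,dt \big|$ for $z' \le t \le z$ for the first inequality and applied Schwarz’s inequality for the second.

We apply the bound (3.12) to $\sqrt{\rho_\phi^{(2)}(z, z')}$ for fixed $z'$. Using the fact that the support of $h$ is contained in $[-R, R]$, we get

\[
\begin{aligned}
& \left| \int_{\mathbb{R}^2} \rho_\phi^{(2)}(z, z')\, h(z - z')\,dz\,dz' - \int_{\mathbb{R}} h(z)\,dz \int_{\mathbb{R}} \rho_\phi^{(2)}(z, z)\,dz \right| \\
&\quad \le 2 R^{1/2} \int_{\mathbb{R}} h(z)\,dz \int_{\mathbb{R}} \sqrt{\rho_\phi^{(2)}(z', z')} \left( \int_{\mathbb{R}} \Big| \partial_z \sqrt{\rho_\phi^{(2)}(z, z')} \Big|^2 dz \right)^{1/2} dz' \\
&\qquad + 2R \int_{\mathbb{R}} h(z)\,dz \int_{\mathbb{R}^2} \Big| \partial_z \sqrt{\rho_\phi^{(2)}(z, z')} \Big|^2 dz\,dz' \\
&\quad \le \frac{2}{n}\, E_{1d}^k(n, \ell, g) \int_{\mathbb{R}} h(z)\,dz \left[ \left( \frac{2R}{(n-1)g} \right)^{1/2} + R \right].
\end{aligned}
\]

Here, we used Schwarz’s inequality and the fact that $g\, \tfrac12 n(n-1) \int_{\mathbb{R}} \rho_\phi^{(2)}(z, z)\,dz \le E_{1d}^k(n, \ell, g)$ and

\[ \int_{\mathbb{R}^2} \Big| \partial_z \sqrt{\rho_\phi^{(2)}(z, z')} \Big|^2 dz\,dz' \le \Big\langle \phi \Big| -\frac{d^2}{dz_1^2} \Big| \phi \Big\rangle \le E_{1d}^k(n, \ell, g)/n. \]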
The Lieb-Liniger Model as a Limit of Dilute Bosons in Three Dimensions


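The same argument is used with $h$ replaced by $m$. Now $\int_{\mathbb{R}} h(z)\, dz = 4\pi a (1 - a/R)^{-1}$, and [17, Eq. (3.22)]

\[
\|m\|_\infty \le \frac{2\pi a}{1 - a/R}\, \big( 1 + \ln(R/a) \big) , \qquad \int_{\mathbb{R}} m(z)\, dz \le \frac{2\pi a R}{1 - a/R} \left( 1 - \frac{a}{2R} \right) .
\]

Therefore

\[
(3.11) \le \frac{n(n-1)}{2}\, \frac{1 + K}{1 - a/R}\, g \left[ \int_{\mathbb{R}} \rho^{(2)}_\phi(z, z)\, dz + \frac{2}{n} \left( \left( \frac{2R}{(n-1) g} \right)^{1/2} + R \right) E_{1d}^{k}(n, \ell, g) \right] , \tag{3.13}
\]

where we denoted

\[
K = \frac{2\pi}{3}\, (n-2)\, \frac{a}{R}\, \frac{1 + \ln(R/a)}{1 - a/R} \left( \frac{R}{r} \right)^{2} \|b\|_\infty^2 .
\]

Putting together the bounds (3.4), (3.5) and (3.13), and using again the fact that $g\, \tfrac12 n(n-1) \int_{\mathbb{R}} \rho^{(2)}_\phi(z, z)\, dz \le E_{1d}^{k}(n, \ell, g)$, we obtain the upper bound

\[
\begin{aligned}
\Big\langle \Psi \Big| H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \Big| \Psi \Big\rangle \le E_{1d}^{k}(n, \ell, g) \Bigg( & 1 + \frac{n(n-1)}{2}\, \frac{\pi R^2}{r^2}\, \|b\|_4^4 + \frac{a/R + K}{1 - a/R} \\
& + \frac{1 + K}{1 - a/R} \Big[ \big( 2 R g (n-1) \big)^{1/2} + (n-1) R g \Big] \Bigg) .
\end{aligned}
\]

It remains to choose $R$. If we choose

\[
R^3 = \frac{a r^2}{n^2} ,
\]

then

\[
\Big\langle \Psi \Big| H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \Big| \Psi \Big\rangle \le E_{1d}^{k}(n, \ell, g) \left( 1 + C \left( \frac{n a}{r} \right)^{2/3} \right)
\]

for some constant $C > 0$. Moreover, from (3.2) we see that

\[
\langle \Psi | \Psi \rangle \ge 1 - C \left( \frac{n a}{r} \right)^{2/3} .
\]

Hence the upper bound (2.5) of Theorem 1 follows with the aid of Lemma 1.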


4. Lower Bounds

In this section, we will derive a lower bound on the operator $H_{3d}^{n,r,\ell,a}$ in terms of the Lieb-Liniger Hamiltonian $H_{1d}^{n,\ell,g}$. In particular, this will prove the desired lower bounds on the energies $E_{3d}^{k}(n, r, \ell, a)$. Our method is based on [17, Thm. 3.1], but extends it in several important ways which we shall explain.

Let $\Psi$ be a normalized wavefunction in $L^2(\mathbb{R}^{3n})$. We define $f \in L^2(\mathbb{R}^n)$ by

\[
f(z_1, \dots, z_n) = \int_{\mathbb{R}^{2n}} \Psi(x_1, \dots, x_n) \prod_{k=1}^{n} b_r(x_k^\perp)\, d^2x_k^\perp . \tag{4.1}
\]

Moreover, we define $F$ by

\[
\Psi(x_1, \dots, x_n) = F(x_1, \dots, x_n) \prod_{k=1}^{n} b_r(x_k^\perp) .
\]

Note that $F$ is well-defined, since $b_r$ is a strictly positive function. Finally, let $G$ be given by

\[
G(x_1, \dots, x_n) = \Psi(x_1, \dots, x_n) - f(z_1, \dots, z_n) \prod_{k=1}^{n} b_r(x_k^\perp) . \tag{4.2}
\]

Using partial integration and the eigenvalue equation for $b_r$, we obtain

\[
\Big\langle \Psi \Big| H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \Big| \Psi \Big\rangle = \sum_{i=1}^{n} \int_{\mathbb{R}^{3n}} \Bigg[ |\nabla_i F|^2 + \ell^{-2} V(z_i/\ell)\, |F|^2 + \frac12 \sum_{j,\, j \ne i} a^{-2} v(|x_i - x_j|/a)\, |F|^2 \Bigg] \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^3x_k . \tag{4.3}
\]

Now choose some $R > a R_0$, and let

\[
U(r) = \begin{cases} 3 \big( R^3 - (a R_0)^3 \big)^{-1} & \text{for } a R_0 \le r \le R , \\ 0 & \text{otherwise} . \end{cases} \tag{4.4}
\]

For $\delta > 0$ define $B_\delta \subset \mathbb{R}^2$ by

\[
B_\delta = \big\{ x^\perp \in \mathbb{R}^2 : b(x^\perp)^2 \ge \delta \big\} .
\]

In the following, we proceed along the same lines as in [17, Eqs. (3.31)–(3.36)]. Consider, for fixed $i$ and $x_j$, $j \ne i$, the Voronoi cell $\Omega_j$ around particle $j$, i.e.,

\[
\Omega_j = \{ x : |x - x_j| \le |x - x_k| \text{ for all } k \ne j \} .
\]


Denote by $B_j$ the ball of radius $R$ around $x_j$. We can then bound

\[
\begin{aligned}
& \int_{\Omega_j \cap B_j} b_r(x_i^\perp)^2 \Big[ |\nabla_i F|^2 + \tfrac12 a^{-2} v(|x_i - x_j|/a)\, |F|^2 \Big] d^3x_i \\
&\quad \ge \min_{x \in B_j} b_r(x^\perp)^2\; a \int_{\Omega_j \cap B_j} U(|x_i - x_j|)\, |F|^2\, d^3x_i \\
&\quad \ge \frac{\min_{x \in B_j} b_r(x^\perp)^2}{\max_{x \in B_j} b_r(x^\perp)^2}\; a \int_{\Omega_j \cap B_j} b_r(x_i^\perp)^2\, U(|x_i - x_j|)\, |F|^2\, d^3x_i .
\end{aligned}
\]

Here, we used Lemma 1 of [18], which states that

\[
\int_{D} \Big[ |\nabla F(x)|^2 + \tfrac12 a^{-2} v(|x|/a)\, |F(x)|^2 \Big] d^3x \ge a \int_{D} U(|x|)\, |F(x)|^2\, d^3x
\]

for any convex domain $D$ containing the origin. Estimating $\max_{x \in B_j} b_r(x^\perp)^2 \le \min_{x \in B_j} b_r(x^\perp)^2 + 2 (R/r^3) \|\nabla b^2\|_\infty$, we obtain

\[
\frac{\min_{x \in B_j} b_r(x^\perp)^2}{\max_{x \in B_j} b_r(x^\perp)^2} \ge \chi_{B_\delta}(x_j^\perp / r) \left[ 1 - 2\, \frac{R}{r}\, \frac{\|\nabla b^2\|_\infty}{\delta} \right] .
\]

(It is easy to see that $\nabla b^2$ is a bounded function; see, e.g., the proof of Lemma 1 in the Appendix in [17].) Here $\chi_{B_\delta}$ denotes the characteristic function of $B_\delta$. If $x_{k(i)}$ denotes the nearest neighbor of $x_i$ among the $x_j$ with $j \ne i$, and $x_{k(i)}^\perp$ is its $\perp$-component, we conclude that, for $0 \le \varepsilon \le 1$,

\[
\begin{aligned}
& \sum_{i=1}^{n} \int_{\mathbb{R}^{3n}} \Bigg[ |\nabla_i F|^2 + \frac12 \sum_{j,\, j \ne i} a^{-2} v(|x_i - x_j|/a)\, |F|^2 \Bigg] \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^3x_k \\
&\quad \ge \sum_{i=1}^{n} \int_{\mathbb{R}^{3n}} \Big[ \varepsilon |\nabla_i^\perp F|^2 + a'\, U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r)\, |F|^2 \Big] \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^3x_k \\
&\qquad + \sum_{i=1}^{n} \int_{\mathbb{R}^{3n}} \Big[ \varepsilon |\partial_i \Psi|^2 + (1 - \varepsilon) |\partial_i \Psi|^2\, \chi_{\min_k |x_i - x_k| \ge R}(x_i) \Big] \prod_{k=1}^{n} d^3x_k ,
\end{aligned} \tag{4.5}
\]

where $a' = a (1 - \varepsilon) \big( 1 - 2R \|\nabla b^2\|_\infty / (r\delta) \big)$. Here, the factor $\chi_{\min_k |x_i - x_k| \ge R}$ restricts the $x_i$ integration to the complement of the balls of radius $R$ centered at the $x_k$ for $k \ne i$. That is, for the lower bound in (4.5) only the kinetic energy inside these balls gets used. Part of the kinetic energy in the $x^\perp$ direction has been dropped, which is legitimate for a lower bound.

For a lower bound, the characteristic function $\chi_{\min_k |x_i - x_k| \ge R}$ could be replaced by the smaller quantity $\chi_{\min_k |z_i - z_k| \ge R}$, as was done in [17]. We do not do this here, however, and this point will be important in the following. In particular, it allows us to have the full kinetic energy (in the $z$ direction) at our disposal in the effective one-dimensional problem that is obtained after integrating out the $x^\perp$ variables. In contrast, only the kinetic energy in the regions $|z_i - z_k| \ge R$ was used in [17] to derive a lower bound on the ground state energy. The improved method presented in the following leads to an operator lower bound, however.


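We now give a lower bound on the two terms on the right side of (4.5). We start with the second term, which, for each $i$, is bounded from below by

\[
\int_{\mathbb{R}^{3n}} |\partial_i \Psi|^2 \prod_{k=1}^{n} d^3x_k - (1 - \varepsilon) \int_{\mathbb{R}^{3n}} |\partial_i \Psi|^2\, \chi_{\min_k |z_i - z_k| \le R}(z_i)\, \chi_{\min_k |x_i^\perp - x_k^\perp| \le R}(x_i^\perp) \prod_{k=1}^{n} d^3x_k .
\]

To estimate the last term from below, consider first the integral over the $x^\perp$ variables for fixed values of $z_1, \dots, z_n$. Using (4.2), this integral equals

\[
\begin{aligned}
& \int_{\mathbb{R}^{2n}} \Bigg| \partial_i f \prod_{k=1}^{n} b_r(x_k^\perp) + \partial_i G \Bigg|^2 \chi_{\min_k |x_i^\perp - x_k^\perp| \le R}(x_i^\perp) \prod_{k=1}^{n} d^2x_k^\perp \\
&\quad \le \big( 1 + \eta^{-1} \big) |\partial_i f|^2 \int_{\mathbb{R}^{2n}} \chi_{\min_k |x_i^\perp - x_k^\perp| \le R}(x_i^\perp) \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^2x_k^\perp + (1 + \eta) \int_{\mathbb{R}^{2n}} |\partial_i G|^2 \prod_{k=1}^{n} d^2x_k^\perp
\end{aligned}
\]

for any $\eta > 0$, by Schwarz's inequality. It is easy to see that

\[
\int_{\mathbb{R}^{2n}} \chi_{\min_k |x_i^\perp - x_k^\perp| \le R}(x_i^\perp) \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^2x_k^\perp \le \frac{n(n-1)}{2}\, \frac{\pi R^2}{r^2}\, \|b\|_4^4
\]

using Young's inequality, as in (3.1). Now

\[
\int_{\mathbb{R}^{3n}} |\partial_i \Psi|^2 \prod_{k=1}^{n} d^3x_k = \int_{\mathbb{R}^{3n}} |\partial_i f|^2 \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^3x_k + \int_{\mathbb{R}^{3n}} |\partial_i G|^2 \prod_{k=1}^{n} d^3x_k ,
\]

since, by the definition of $G$ in (4.2), $\partial_i G$ is orthogonal to any function of the form $\xi(z_1, \dots, z_n) \prod_k b_r(x_k^\perp)$. We have thus shown that the second term on the right side of (4.5) satisfies, for each $i$, the lower bound

\[
\begin{aligned}
& \int_{\mathbb{R}^{3n}} \Big[ \varepsilon |\partial_i \Psi|^2 + (1 - \varepsilon) |\partial_i \Psi|^2\, \chi_{\min_k |x_i - x_k| \ge R}(x_i) \Big] \prod_{k=1}^{n} d^3x_k \\
&\quad \ge \left[ 1 - (1 - \varepsilon) \big( 1 + \eta^{-1} \big) \frac{n(n-1)}{2}\, \frac{\pi R^2}{r^2}\, \|b\|_4^4 \right] \int_{\mathbb{R}^{n}} |\partial_i f|^2 \prod_{k=1}^{n} dz_k \\
&\qquad + \big( 1 - (1 - \varepsilon)(1 + \eta) \big) \int_{\mathbb{R}^{3n}} |\partial_i G|^2 \prod_{k=1}^{n} d^3x_k
\end{aligned} \tag{4.6}
\]

for any $\eta > 0$. We are going to choose $\eta$ small enough such that $(1 - \varepsilon)(1 + \eta) \le 1$, in which case the last term is non-negative and can be dropped for a lower bound. Note that the right side of (4.6) contains the full kinetic energy of the function $f$. Had we replaced $\chi_{\min_k |x_i - x_k| \ge R}$ by the smaller quantity $\chi_{\min_k |z_i - z_k| \ge R}$ on the left side of (4.6),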


as in [17], only the kinetic energy for $|z_i - z_k| \ge R$ would be at our disposal. This is sufficient for a lower bound on the ground state energy, as in [17], but would not lead to the desired operator lower bound which is derived in this section.

We now investigate the first term on the right side of (4.5). The bound we shall derive on this term is obtained in the same way as in [17, Eqs. (3.39)–(3.46)]. Consider, for fixed $z_1, \dots, z_n$, the expression

\[
\sum_{i=1}^{n} \int_{\mathbb{R}^{2n}} \Big[ \varepsilon |\nabla_i^\perp F|^2 + a'\, U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r)\, |F|^2 \Big] \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^2x_k^\perp . \tag{4.7}
\]

To estimate this term from below, we use Temple's inequality [27], as in [18], which says that for any Hamiltonian $H$ with lowest eigenvalues $E_0 < E_1$ and expectation value $\langle H \rangle < E_1$ in some state,

\[
E_0 \ge \langle H \rangle - \frac{\big\langle (H - \langle H \rangle)^2 \big\rangle}{E_1 - \langle H \rangle} . \tag{4.8}
\]
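Temple's inequality can be tried out numerically in a finite-dimensional toy setting. The Python sketch below is an illustration only and plays no role in the proof: it assumes a Hamiltonian that is already diagonal, with hypothetical energy levels `energies`, and a trial state described by its unnormalized occupation weights `weights`. The function name `temple_bound` is ours.

```python
def temple_bound(energies, weights):
    """Right side of Temple's inequality, <H> - <(H-<H>)^2> / (E1 - <H>).

    energies: eigenvalues of a diagonal toy Hamiltonian (hypothetical data)
    weights:  unnormalized probabilities of the trial state on each level
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    # expectation value <H> and variance <(H - <H>)^2> in the trial state
    mean = sum(p * e for p, e in zip(probs, energies))
    variance = sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))
    # second-lowest eigenvalue E1; the inequality requires <H> < E1
    e1 = sorted(energies)[1]
    if mean >= e1:
        raise ValueError("Temple's inequality requires <H> < E1")
    return mean - variance / (e1 - mean)


if __name__ == "__main__":
    levels = [0.0, 1.0, 3.0]
    bound = temple_bound(levels, [0.9, 0.05, 0.05])
    print("lower bound:", bound, " exact E0:", min(levels))
```

For a two-level system the bound reproduces $E_0$ exactly; with more levels it is a strict lower bound whose quality depends on how much weight the trial state puts above $E_1$.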

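Recall that $\tilde e^\perp$ denotes the gap above zero in the spectrum of $-\Delta^\perp + V^\perp - e^\perp$, i.e., the lowest non-zero eigenvalue. By scaling, $\tilde e^\perp / r^2$ is the gap in the spectrum of $-\Delta^\perp + V_r^\perp - e^\perp / r^2$. Note that under the transformation $\phi \mapsto b_r^{-1} \phi$ this latter operator is unitarily equivalent to $\nabla^{\perp *} \cdot \nabla^\perp$ as an operator on $L^2(\mathbb{R}^2, b_r(x^\perp)^2\, d^2x^\perp)$, which appears in (4.7). Hence also this operator has $\tilde e^\perp / r^2$ as its energy gap. For $l \in \mathbb{N}$, denote

\[
\langle U^l \rangle = \int_{\mathbb{R}^{2n}} \Bigg( \sum_{i=1}^{n} U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r) \Bigg)^{l} \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^2x_k^\perp .
\]

Temple's inequality (4.8) implies (under the assumption that the denominator in the last term is positive)

\[
(4.7) \ge a' \langle U \rangle \left( 1 - a'\, \frac{\langle U^2 \rangle}{\langle U \rangle}\, \frac{1}{\varepsilon\, \tilde e^\perp / r^2 - a' \langle U \rangle} \right) \int_{\mathbb{R}^{2n}} |\Psi(x_1, \dots, x_n)|^2 \prod_{k=1}^{n} d^2x_k^\perp . \tag{4.9}
\]

Now, using (4.4) and Schwarz's inequality, $\langle U^2 \rangle \le 3n \big( R^3 - (a R_0)^3 \big)^{-1} \langle U \rangle$, and

\[
\begin{aligned}
\langle U \rangle &\le n(n-1) \int_{\mathbb{R}^4} U(|x - y|)\, b_r(x^\perp)^2\, b_r(y^\perp)^2\, d^2x^\perp\, d^2y^\perp \\
&\le n(n-1)\, \frac{\|b\|_4^4}{r^2} \int_{\mathbb{R}^2} U(|x|)\, d^2x^\perp \le n(n-1)\, \frac{\|b\|_4^4}{r^2}\, \frac{3\pi R^2}{R^3 - (a R_0)^3} .
\end{aligned} \tag{4.10}
\]

Using this bound, as well as $a' \le a$ in the error term, we obtain

\[
(4.9) \ge a'' \langle U \rangle \int_{\mathbb{R}^{2n}} |\Psi(x_1, \dots, x_n)|^2 \prod_{k=1}^{n} d^2x_k^\perp ,
\]

where

\[
a'' = a' \left[ 1 - \frac{3n}{\varepsilon\, \tilde e^\perp}\, \frac{a r^2}{R^3}\, \frac{1}{1 - (a R_0 / R)^3} \left( 1 - \frac{n^2 a}{\varepsilon\, \tilde e^\perp R}\, 3\pi \|b\|_4^4\, \frac{1}{1 - (a R_0 / R)^3} \right)^{-1} \right] ,
\]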


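with the understanding that the term in square brackets is positive. Now let

\[
d(z - z') = \int_{\mathbb{R}^4} b_r(x^\perp)^2\, b_r(y^\perp)^2\, U(|x - y|)\, \chi_{B_\delta}(y^\perp / r)\, d^2x^\perp\, d^2y^\perp .
\]

Note that $d(z) = 0$ if $|z| \ge R$. We bound $\langle U \rangle$ from below by

\[
\begin{aligned}
\langle U \rangle &\ge \sum_{i \ne j} \int_{\mathbb{R}^{2n}} U(|x_i - x_j|)\, \chi_{B_\delta}(x_j^\perp / r) \prod_{k,\, k \ne i,j} \theta(|x_k - x_i| - R) \prod_{l=1}^{n} b_r(x_l^\perp)^2\, d^2x_l^\perp \\
&\ge \sum_{i \ne j} \int_{\mathbb{R}^{2n}} U(|x_i - x_j|)\, \chi_{B_\delta}(x_j^\perp / r) \Bigg( 1 - \sum_{k,\, k \ne i,j} \theta(R - |x_k - x_i|) \Bigg) \prod_{l=1}^{n} b_r(x_l^\perp)^2\, d^2x_l^\perp \\
&\ge \sum_{i \ne j} d(z_i - z_j) \left[ 1 - (n-2)\, \frac{\pi R^2}{r^2}\, \|b\|_\infty^2 \right] .
\end{aligned} \tag{4.11}
\]

If we denote

\[
a''' = a'' \left( 1 - (n-2)\, \frac{\pi R^2}{r^2}\, \|b\|_\infty^2 \right) ,
\]

we thus conclude that

\[
(4.7) \ge a''' \sum_{i<j} d(z_i - z_j) \int_{\mathbb{R}^{2n}} |\Psi(x_1, \dots, x_n)|^2 \prod_{k=1}^{n} d^2x_k^\perp .
\]

By using (4.1)–(4.2),

\[
\int_{\mathbb{R}^{2n}} |\Psi(x_1, \dots, x_n)|^2 \prod_{k=1}^{n} d^2x_k^\perp \ge |f(z_1, \dots, z_n)|^2 .
\]

Let $g' = a''' \int_{\mathbb{R}} d(z)\, dz$. A simple bound as in [17, Eqs. (3.49)–(3.53)] shows that, for any $\kappa > 0$ and $\varphi \in H^1(\mathbb{R})$,

\[
a''' \int_{\mathbb{R}} d(z - z')\, |\varphi(z)|^2\, dz \ge g' \max_{z,\, |z - z'| \le R} |\varphi(z)|^2 \left( 1 - \sqrt{\frac{2 g' R}{\kappa}} \right) - \kappa \int_{\mathbb{R}} |\partial \varphi|^2\, dz
\]

for fixed $z'$. In particular, applying this estimate to $f$ for fixed $z_j$, $j \ne i$,

\[
a''' \sum_{i<j} \int_{\mathbb{R}^n} d(z_i - z_j)\, |f|^2 \prod_{k=1}^{n} dz_k \ge g'' \sum_{i<j} \int_{\mathbb{R}^n} \delta(z_i - z_j)\, |f|^2 \prod_{k=1}^{n} dz_k - \kappa (n-1) \sum_{i=1}^{n} \int_{\mathbb{R}^n} |\partial_i f|^2 \prod_{k=1}^{n} dz_k ,
\]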


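with

\[
g'' = g' \left( 1 - \sqrt{\frac{2 g' R}{\kappa}} \right) . \tag{4.12}
\]

Hence we have shown that

\[
\begin{aligned}
& \sum_{i=1}^{n} \int_{\mathbb{R}^{3n}} \Big[ \varepsilon |\nabla_i^\perp F|^2 + a'\, U(|x_i - x_{k(i)}|)\, \chi_{B_\delta}(x_{k(i)}^\perp / r)\, |F|^2 \Big] \prod_{k=1}^{n} b_r(x_k^\perp)^2\, d^3x_k \\
&\quad \ge g'' \sum_{i<j} \int_{\mathbb{R}^n} \delta(z_i - z_j)\, |f|^2 \prod_{k=1}^{n} dz_k - \kappa (n-1) \sum_{i=1}^{n} \int_{\mathbb{R}^n} |\partial_i f|^2 \prod_{k=1}^{n} dz_k .
\end{aligned} \tag{4.13}
\]

This concludes our bound on the first term on the right side of (4.5). For the term involving $V$ in (4.3), we can again use the orthogonality properties of $G$ in (4.2), as well as the assumed positivity of $V$, to conclude that

\[
\sum_{i=1}^{n} \int_{\mathbb{R}^{3n}} \ell^{-2} V(z_i/\ell)\, |\Psi(x_1, \dots, x_n)|^2 \prod_{k=1}^{n} d^3x_k \ge \sum_{i=1}^{n} \int_{\mathbb{R}^{n}} \ell^{-2} V(z_i/\ell)\, |f(z_1, \dots, z_n)|^2 \prod_{k=1}^{n} dz_k . \tag{4.14}
\]

By combining (4.3) and (4.5) with the estimates (4.6), (4.13) and (4.14), we thus obtain

\[
\begin{aligned}
\Big\langle \Psi \Big| H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \Big| \Psi \Big\rangle \ge{} & \left[ 1 - (1 - \varepsilon) \big( 1 + \eta^{-1} \big) \frac{n(n-1)}{2}\, \frac{\pi R^2}{r^2}\, \|b\|_4^4 - \kappa (n-1) \right] \sum_{i=1}^{n} \int_{\mathbb{R}^n} |\partial_i f|^2 \prod_{k=1}^{n} dz_k \\
& + \int_{\mathbb{R}^n} \Bigg[ \sum_{i=1}^{n} \ell^{-2} V(z_i/\ell) + g'' \sum_{i<j} \delta(z_i - z_j) \Bigg] |f|^2 \prod_{k=1}^{n} dz_k .
\end{aligned} \tag{4.15}
\]

Recall that $g''$ is given in (4.12). This estimate is valid for all $0 \le \varepsilon \le 1$, $\kappa > 0$ and $\eta > 0$ such that $(1 - \varepsilon)(1 + \eta) \le 1$ in order to be able to drop the last term in (4.6).

To complete the estimate, it remains to give a lower bound to $\int_{\mathbb{R}} d(z)\, dz$. As in [17, Eq. (3.48)], we can use $|b(x^\perp)^2 - b(y^\perp)^2| \le R \|\nabla b^2\|_\infty$ for $|x^\perp - y^\perp| \le R$ to estimate

\[
\int_{\mathbb{R}} d(z)\, dz \ge \frac{4\pi}{r^2} \left( \int_{B_\delta} b(x^\perp)^4\, d^2x^\perp - R \|\nabla b^2\|_\infty / r \right) \ge \frac{4\pi}{r^2} \Big( \|b\|_4^4 - \delta - R \|\nabla b^2\|_\infty / r \Big) .
\]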


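This leads to the bound

\[
\begin{aligned}
\frac{g''}{g} \ge{} & \left( 1 - \sqrt{\frac{2 g R}{\kappa}} \right) \left( 1 - \frac{\delta}{\|b\|_4^4} - \frac{R \|\nabla b^2\|_\infty / r}{\|b\|_4^4} \right) \\
& \times (1 - \varepsilon) \left( 1 - \frac{2R \|\nabla b^2\|_\infty}{r \delta} \right) \left( 1 - (n-2)\, \frac{\pi R^2}{r^2}\, \|b\|_\infty^2 \right) \\
& \times \left[ 1 - \frac{3n}{\varepsilon\, \tilde e^\perp}\, \frac{a r^2}{R^3}\, \frac{1}{1 - (a R_0 / R)^3} \left( 1 - \frac{n^2 a}{\varepsilon\, \tilde e^\perp R}\, 3\pi \|b\|_4^4\, \frac{1}{1 - (a R_0 / R)^3} \right)^{-1} \right] .
\end{aligned}
\]

It remains to choose the free parameters $R$, $\delta$, $\kappa$, $\varepsilon$ and $\eta$. If we choose

\[
R = r \left( \frac{n a}{r} \right)^{1/4} , \qquad \delta = \varepsilon = \eta = \left( \frac{n a}{r} \right)^{1/8} , \qquad \kappa = \frac{1}{n} \left( \frac{n a}{r} \right)^{5/12} ,
\]

Eq. (4.15) implies

\[
\Big\langle \Psi \Big| H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \Big| \Psi \Big\rangle \ge \left( 1 - D \left( \frac{n a}{r} \right)^{1/8} - D n^2 \left( \frac{n a}{r} \right)^{3/8} \right) \big\langle f \big| H_{1d}^{n,\ell,g} \big| f \big\rangle
\]

for some constant $D > 0$. In particular, if $P_r^\perp$ denotes the projection onto the function $\prod_{k=1}^{n} b_r(x_k^\perp) \in L^2(\mathbb{R}^{2n})$, we have the operator inequality

\[
H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \ge (1 - \eta_L)\, P_r^\perp \otimes H_{1d}^{n,\ell,g}
\]

in the sense of quadratic forms. (Here, $\eta_L$ is defined in (2.4).)

Recall that the gap above the ground state energy of $-\Delta^\perp + r^{-2} V^\perp(x^\perp / r)$ is $\tilde e^\perp / r^2$. Hence, using the positivity of both $V$ and $v$,

\[
H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \ge \frac{\tilde e^\perp}{r^2} \big( 1 - P_r^\perp \big) \otimes 1 .
\]

In particular,

\[
H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \ge (1 - \gamma)(1 - \eta_L)\, P_r^\perp \otimes H_{1d}^{n,\ell,g} + \gamma\, \frac{\tilde e^\perp}{r^2} \big( 1 - P_r^\perp \big) \otimes 1 \tag{4.16}
\]

for any $0 \le \gamma \le 1$. If we choose $\gamma = (r^2 / \tilde e^\perp)\, E_{1d}^{k}(n, \ell, g)$, the lowest $k$ eigenvalues of the operator on the right side of (4.16) are given by $(1 - \gamma)(1 - \eta_L)\, E_{1d}^{i}(n, \ell, g)$, $1 \le i \le k$. This implies the lower bound (2.3) of Theorem 1.

5. Estimates on Eigenfunctions

In this final section, we shall show how the estimates in the previous two sections can be combined to yield a relation between the eigenfunctions $\Psi_k$ of $H_{3d}^{n,r,\ell,a}$ and the eigenfunctions $\psi_k$ of $H_{1d}^{n,\ell,g}$. We start with the following simple lemma.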


Lemma 2. Let $H$ be a non-negative Hamiltonian on a Hilbert space $\mathcal{H}$, with eigenvalues $0 \le E_1 \le E_2 \le \dots$ and corresponding (orthonormal) eigenstates $\psi_i$. For $k \ge 1$, let $f_i$, $1 \le i \le k$, be orthonormal states in $\mathcal{H}$, with the property that $\langle f_i | H | f_i \rangle \le \eta E_i$ for some $\eta > 1$. If $E_{k+1} > E_k$, then

\[
\sum_{i,j=1}^{k} |\langle f_i | \psi_j \rangle|^2 \ge k - \frac{(\eta - 1) \sum_{i=1}^{k} E_i}{E_{k+1} - E_k} .
\]

Proof. Let $P_k$ be the projection onto the first $k$ eigenfunctions of $H$. Then $H \ge H P_k + E_{k+1} (1 - P_k)$. By the variational principle,

\[
\sum_{i=1}^{k} \langle f_i | H P_k + E_k (1 - P_k) | f_i \rangle \ge \sum_{i=1}^{k} E_i ,
\]

and hence

\[
\sum_{i=1}^{k} \langle f_i | H | f_i \rangle \ge \sum_{i=1}^{k} E_i + (E_{k+1} - E_k) \sum_{i=1}^{k} \langle f_i | 1 - P_k | f_i \rangle .
\]

By assumption, $\langle f_i | H | f_i \rangle \le \eta E_i$, implying the desired bound

\[
\sum_{i=1}^{k} \langle f_i | P_k | f_i \rangle \ge k - \frac{(\eta - 1) \sum_{i=1}^{k} E_i}{E_{k+1} - E_k} .
\]
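The statement of Lemma 2 can be checked by hand in the simplest case $k = 1$. The Python sketch below is an illustration under our own assumptions, not part of the proof: the Hamiltonian is taken diagonal with hypothetical eigenvalues `energies` (sorted, with $0 < E_1 < E_2$), the trial state $f_1$ is given by its expansion coefficients `coeffs` in the eigenbasis, and $\eta$ is set to the smallest admissible value $\langle f_1 | H | f_1 \rangle / E_1$. The function name `lemma2_sides` is ours.

```python
def lemma2_sides(energies, coeffs):
    """Both sides of the Lemma 2 inequality for k = 1.

    energies: sorted eigenvalues E_1 < E_2 <= ... of a diagonal toy Hamiltonian
    coeffs:   real expansion coefficients of the trial state f_1 (any norm)
    Returns (lhs, rhs) with lhs = |<f_1|psi_1>|^2 and
    rhs = 1 - (eta - 1) * E_1 / (E_2 - E_1).
    """
    e1, e2 = energies[0], energies[1]
    if not 0 < e1 < e2:
        raise ValueError("need 0 < E_1 < E_2")
    norm2 = sum(c * c for c in coeffs)
    probs = [c * c / norm2 for c in coeffs]
    # energy expectation value of the normalized trial state
    exp_h = sum(p * e for p, e in zip(probs, energies))
    eta = exp_h / e1  # smallest eta with <f_1|H|f_1> <= eta * E_1
    lhs = probs[0]
    rhs = 1.0 - (eta - 1.0) * e1 / (e2 - e1)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = lemma2_sides([1.0, 2.0, 5.0], [0.9, 0.3, 0.3])
    print(f"overlap {lhs:.4f} >= bound {rhs:.4f}")
```

When the trial state only mixes the two lowest levels, the two sides coincide, so the bound of Lemma 2 cannot be improved in general.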

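Corollary 3. Assume in addition that $E_{l+1} > E_l$ for some $1 \le l < k$. Then

\[
\sum_{i,j=l+1}^{k} |\langle f_i | \psi_j \rangle|^2 \ge k - l - \frac{(\eta - 1) \sum_{i=1}^{k} E_i}{E_{k+1} - E_k} - \frac{(\eta - 1) \sum_{i=1}^{l} E_i}{E_{l+1} - E_l} .
\]

Proof. Applying Lemma 2 and using the orthonormality of the $\psi_i$ and the $f_i$, we have

\[
\begin{aligned}
\sum_{i,j=l+1}^{k} |\langle f_i | \psi_j \rangle|^2 &= \sum_{i,j=1}^{k} |\langle f_i | \psi_j \rangle|^2 + \sum_{i,j=1}^{l} |\langle f_i | \psi_j \rangle|^2 - \sum_{i=1}^{l} \sum_{j=1}^{k} |\langle f_i | \psi_j \rangle|^2 - \sum_{i=1}^{k} \sum_{j=1}^{l} |\langle f_i | \psi_j \rangle|^2 \\
&\ge k - \frac{(\eta - 1) \sum_{i=1}^{k} E_i}{E_{k+1} - E_k} + l - \frac{(\eta - 1) \sum_{i=1}^{l} E_i}{E_{l+1} - E_l} - 2l .
\end{aligned}
\]

Let again $P_r^\perp$ denote the projection onto the function $\prod_{k=1}^{n} b_r(x_k^\perp) \in L^2(\mathbb{R}^{2n})$, and recall inequality (4.16), which states that

\[
H_{3d}^{n,r,\ell,a} - \frac{n e^\perp}{r^2} \ge (1 - \gamma)(1 - \eta_L)\, P_r^\perp \otimes H_{1d}^{n,\ell,g} + \gamma\, \frac{\tilde e^\perp}{r^2} \big( 1 - P_r^\perp \big) \otimes 1 \tag{5.1}
\]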


for any $0 \le \gamma \le 1$. As already noted, the choice

\[
\gamma = (r^2 / \tilde e^\perp)\, E_{1d}^{k}(n, \ell, g)
\]

implies that the lowest $k$ eigenvalues of the operator on the right side of (5.1) are given by $(1 - \gamma)(1 - \eta_L)\, E_{1d}^{i}(n, \ell, g)$, $1 \le i \le k$. Here, we have to assume that $\tilde e^\perp / r^2 \ge E_{1d}^{k}(n, \ell, g)$ in order for $\gamma$ not to be greater than one. The corresponding eigenfunctions are

\[
\psi_i(z_1, \dots, z_n) \prod_{k=1}^{n} b_r(x_k^\perp) .
\]

We can now apply Corollary 3, with $H$ equal to the operator on the right side of (5.1), $f_i = \Psi_i$, $l = k_i - 1$ and $k = k_{i+1} - 1$. Since $E_{3d}^{k}(n, r, \ell, a) \le E_{1d}^{k}(n, \ell, g) / (1 - \eta_U)$ by Theorem 1(b), we conclude that

\[
\begin{aligned}
& \sum_{k=k_i}^{k_{i+1}-1}\; \sum_{l=k_i}^{k_{i+1}-1} \Bigg| \Big\langle \Psi_k \Big| \psi_l \prod_{j=1}^{n} b_r(x_j^\perp) \Big\rangle \Bigg|^2 \\
&\quad \ge k_{i+1} - k_i - \left( \frac{1}{(1 - \gamma)(1 - \eta_U)(1 - \eta_L)} - 1 \right) \left[ \frac{\sum_{j=1}^{k_{i+1}-1} E_{1d}^{j}}{E_{1d}^{k_{i+1}} - E_{1d}^{k_i}} + \frac{\sum_{j=1}^{k_i - 1} E_{1d}^{j}}{E_{1d}^{k_i} - E_{1d}^{k_i - 1}} \right]
\end{aligned}
\]

as long as $\gamma < 1$, $\eta_U < 1$ and $\eta_L < 1$. Here, we used the notation $E_{1d}^{i} = E_{1d}^{i}(n, \ell, g)$ for short. Since

\[
\sum_{l=k_i}^{k_{i+1}-1} \Bigg| \Big\langle \Psi_k \Big| \psi_l \prod_{j=1}^{n} b_r(x_j^\perp) \Big\rangle \Bigg|^2 \le 1
\]

for every $k$, this implies Theorem 2.

Acknowledgements. We are grateful to Zhenqiu Xie for helpful discussions. R.S. acknowledges partial support by U.S. NSF grant PHY-0652356 and by an A.P. Sloan Fellowship.

References

1. Baumgartner, B., Solovej, J.P., Yngvason, J.: Atoms in Strong Magnetic Fields: The High Field Limit at Fixed Nuclear Charge. Commun. Math. Phys. 212, 703–724 (2000)
2. Bergeman, T., Moore, M.G., Olshanii, M.: Atom-Atom Scattering under Cylindrical Harmonic Confinement: Numerical and Analytic Studies of the Confinement Induced Resonance. Phys. Rev. Lett. 91, 163201 (2003)
3. Bloch, I., Dalibard, J., Zwerger, W.: Many-Body Physics with Ultracold Gases. http://arxiv.org/list/0704.3011, 2007
4. Brummelhuis, R., Duclos, P.: Effective Hamiltonians for atoms in very strong magnetic fields. J. Math. Phys. 47, 033501 (2006); On the One-Dimensional Behaviour of Atoms in Intense Homogeneous Magnetic Fields. In: Partial Differential Equations and Spectral Theory, PDE2000 Conference in Clausthal, Germany, Demuth, M., Schulze, B.-W. (eds.), Basel: Birkhäuser 2001, pp. 25–35
5. Dettmer, S. et al.: Observation of Phase Fluctuations in Elongated Bose-Einstein Condensates. Phys. Rev. Lett. 87, 160406 (2001)
6. Dunjko, V., Lorent, V., Olshanii, M.: Bosons in Cigar-Shaped Traps: Thomas-Fermi Regime, Tonks-Girardeau Regime, and In Between. Phys. Rev. Lett. 86, 5413–5416 (2001)
7. Esteve, J. et al.: Observations of Density Fluctuations in an Elongated Bose Gas: Ideal Gas and Quasicondensate Regimes. Phys. Rev. Lett. 96, 130403 (2006)
8. Girardeau, M.: Relationship between Systems of Impenetrable Bosons and Fermions in One Dimension. J. Math. Phys. 1, 516–523 (1960)
9. Jackson, A.D., Kavoulakis, G.M.: Lieb Mode in a Quasi-One-Dimensional Bose-Einstein Condensate of Atoms. Phys. Rev. Lett. 89, 070403 (2002)
10. Kinoshita, T., Wenger, T., Weiss, D.S.: Observation of a One-Dimensional Tonks-Girardeau Gas. Science 305, 1125–1128 (2004)
11. Lieb, E.H.: Exact Analysis of an Interacting Bose Gas. II. The Excitation Spectrum. Phys. Rev. 130, 1616–1624 (1963)
12. Lieb, E.H., Liniger, W.: Exact Analysis of an Interacting Bose Gas. I. The General Solution and the Ground State. Phys. Rev. 130, 1605–1616 (1963)
13. Lieb, E.H., Loss, M.: Analysis. Second edition, Providence, RI: American Math Soc. 2001
14. Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: The Mathematics of the Bose Gas and its Condensation. Oberwolfach Seminars, Vol. 34, Basel-Boston: Birkhäuser 2005
15. Lieb, E.H., Seiringer, R., Yngvason, J.: Bosons in a trap: A rigorous derivation of the Gross-Pitaevskii energy functional. Phys. Rev. A. 61, 043602 (2000)
16. Lieb, E.H., Seiringer, R., Yngvason, J.: One-dimensional Bosons in Three-dimensional Traps. Phys. Rev. Lett. 91, 150401 (2003)
17. Lieb, E.H., Seiringer, R., Yngvason, J.: One-Dimensional Behavior of Dilute, Trapped Bose Gases. Commun. Math. Phys. 244, 347–393 (2004)
18. Lieb, E.H., Yngvason, J.: Ground State Energy of the Low Density Bose Gas. Phys. Rev. Lett. 80, 2504–2507 (1998)
19. Moritz, H., Stöferle, T., Köhl, M., Esslinger, T.: Exciting Collective Oscillations in a Trapped 1D Gas. Phys. Rev. Lett. 91, 250402 (2003)
20. Moritz, H., Stöferle, T., Günter, K., Köhl, M., Esslinger, T.: Confinement Induced Molecules in a 1D Fermi Gas. Phys. Rev. Lett. 94, 210401 (2005)
21. Olshanii, M.: Atomic Scattering in the Presence of an External Confinement and a Gas of Impenetrable Bosons. Phys. Rev. Lett. 81, 938–941 (1998)
22. Olshanii, M., Dunjko, V.: Short-Distance Correlation Properties of the Lieb-Liniger System and Momentum Distributions of Trapped One-Dimensional Atomic Gases. Phys. Rev. Lett. 91, 090401 (2003)
23. Petrov, D.S., Shlyapnikov, G.V., Walraven, J.T.M.: Regimes of Quantum Degeneracy in Trapped 1D Gases. Phys. Rev. Lett. 85, 3745–3749 (2000)
24. Petrov, D.S., Gangardt, D.M., Shlyapnikov, G.V.: Low-dimensional trapped gases. J. Phys. IV. 116, 5–46 (2004)
25. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II. Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
26. Richard, S. et al.: Momentum Spectroscopy of 1D Phase Fluctuations in Bose-Einstein Condensates. Phys. Rev. Lett. 91, 010405 (2003)
27. Temple, G.: The theory of Rayleigh's Principle as Applied to Continuous Systems. Proc. Roy. Soc. London A. 119, 276–293 (1928)
28. Tolra, B.L. et al.: Observation of Reduced Three-Body Recombination in a Correlated 1D Degenerate Bose Gas. Phys. Rev. Lett. 92, 190401 (2004)

Communicated by B. Simon

Commun. Math. Phys. 284, 481–507 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0574-6

Communications in

Mathematical Physics

Polynomial-Time Algorithm for Simulation of Weakly Interacting Quantum Spin Systems

Sergey Bravyi1, David DiVincenzo1, Daniel Loss2

1 IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA. E-mail: [email protected]

2 Department of Physics, University of Basel, Klingelbergstrasse 82, 4056 Basel, Switzerland

Received: 13 October 2007 / Accepted: 11 April 2008 Published online: 15 July 2008 – © Springer-Verlag 2008

Abstract: We describe an algorithm that computes the ground state energy and correlation functions for 2-local Hamiltonians in which interactions between qubits are weak compared to single-qubit terms. The running time of the algorithm is polynomial in $n$ and $\delta^{-1}$, where $n$ is the number of qubits, and $\delta$ is the required precision. Specifically, we consider Hamiltonians of the form $H = H_0 + \epsilon V$, where $H_0$ describes non-interacting qubits, $V$ is a perturbation that involves arbitrary two-qubit interactions on a graph of bounded degree, and $\epsilon$ is a small parameter. The algorithm works if $|\epsilon|$ is below a certain threshold value $\epsilon_0$ that depends only upon the spectral gap of $H_0$, the maximal degree of the graph, and the maximal norm of the two-qubit interactions. The main technical ingredient of the algorithm is a generalized Kirkwood-Thomas ansatz for the ground state. The parameters of the ansatz are computed using perturbative expansions in powers of $\epsilon$. Our algorithm is closely related to the coupled cluster method used in quantum chemistry.

Contents

1. Introduction and Summary of Results
2. Kirkwood-Thomas Ansatz for the Ground State
   2.1 Creation operators
   2.2 Ansatz for the ground state
   2.3 Kirkwood-Thomas equations
   2.4 A lower bound on the spectral gap
3. Solution of the Kirkwood-Thomas Equations
   3.1 Solution by formal power series
   3.2 Convergence of C-series
   3.3 Convergence of E-series
4. Linked Cluster Theorems
   4.1 Linked cluster expansion for the ground state


   4.2 Upper bound on the number of linked clusters
   4.3 Linked cluster expansion for the ground state energy
5. Computational Algorithms
   5.1 Computing the coefficients $C_p(M)$
   5.2 Computing the ground state energy
   5.3 Computing spin-spin correlation functions
6. Discussion and Open Problems
Acknowledgements
Appendix A

1. Introduction and Summary of Results

Perturbation theory provides a systematic way of getting approximations to eigenvalues and eigenvectors for a variety of quantum spin models. Arguably, a significant part of analytical and numerical results of condensed matter physics has been obtained using perturbative expansions in some small parameter. Quite recently the methods of perturbation theory have been successfully employed in quantum complexity theory. In Ref. [1] Kempe, Kitaev, and Regev used perturbative reductions to show that the problem of computing the ground state energy of a Hamiltonian with two-qubit interactions is QMA-complete. After that Terhal and Oliveira [2] generalized this result to local Hamiltonians on a 2D lattice.

Our main goal is to examine whether the methods of perturbation theory provide an efficient computational algorithm for the simulation of quantum spin systems. In this paper we focus on the simulation of low-temperature properties, namely computing the ground state energy and spin-spin correlation functions for the ground state. An efficient algorithm must have a running time $T = O(n^{\alpha} \delta^{-\beta})$, where $n$ is the number of spins, $\delta$ is a precision up to which we need to compute the ground state energy or a correlation function, and $\alpha, \beta > 0$ are some constants.

Before stating the results, let us describe the spin models that we shall consider. Let $G = (L, E)$ be a graph with a set of vertices $L$, $|L| = n$, and set of edges $E$. Suppose $n$ spins-1/2 (qubits) are located at vertices $u \in L$ and spin-spin interactions are located on edges $(u, v) \in E$. The Hamiltonian is

\[
H(\epsilon) = H_0 + \epsilon V , \qquad H_0 = \sum_{u \in L} \Delta_u\, |1\rangle\langle 1|_u , \qquad V = \sum_{(u,v) \in E} V_{u,v} . \tag{1}
\]

Here $V_{u,v}$ is an arbitrary operator acting on a pair of qubits $u, v$, and $\epsilon$ is a real number. The operators $H_0$ and $V$ are called the unperturbed Hamiltonian and the perturbation. We shall always assume that $\Delta_u > 0$ for all $u \in L$. Accordingly, the unperturbed Hamiltonian $H_0$ has a non-degenerate ground state

\[
|\Omega\rangle = |0, 0, \dots, 0\rangle , \qquad H_0 |\Omega\rangle = 0 .
\]

Most of the time, all we will need to know about $H_0$ and $V$ are the following parameters:

\[
\Delta = \min_{u \in L} \Delta_u , \qquad J = \max_{(u,v) \in E} \|V_{u,v}\| . \tag{2}
\]

The parameter $\Delta$ is the gap between the smallest and the second smallest eigenvalue of $H_0$, while the parameter $J$ characterizes the strength of the perturbation $V$. Let $d$ be the maximum vertex degree of the graph $G$,

\[
d = \max_{u \in L} |\{ v : (u, v) \in E \}| . \tag{3}
\]


The quantity we are interested in is the smallest eigenvalue of $H(\epsilon)$, which we shall denote by $E(\epsilon)$. Clearly, $E(\epsilon)$ is a continuous concave function of $\epsilon$ and $E(0) = 0$. Besides, since we assume that $\Delta > 0$, the standard perturbation theory arguments [3] show that $E(\epsilon)$ is analytic at $\epsilon = 0$ and the Taylor series

\[
E(\epsilon) = \sum_{p=1}^{\infty} E_p\, \epsilon^p \tag{4}
\]

converges absolutely for $|\epsilon|\, \|V\| < \Delta/2$. The following theorem proved by Yarotsky [4] asserts that $E(\epsilon)$ is non-degenerate for sufficiently small $\epsilon$ and sets a lower bound on the spectral gap.

Theorem 1. Suppose $|\epsilon| \le 2\epsilon_0$, where

\[
\epsilon_0 = \frac{2^{-18} \Delta}{d J} . \tag{5}
\]

Then the smallest eigenvalue $E(\epsilon)$ has multiplicity 1 and the gap between $E(\epsilon)$ and the second smallest eigenvalue of $H(\epsilon)$ is at least $\Delta/2$.

(The explicit value of $\epsilon_0$ has not been stated in Ref. [4].) We shall provide an alternative proof of Theorem 1 in Sects. 2, 3. As was shown by Osborne in Ref. [5], Theorem 1 implies that expectation values of local observables on the ground state of $H(\epsilon)$ can be efficiently computed within any constant precision $\delta$ by simulating quantum adiabatic evolution along the path connecting $H(0)$ and $H(\epsilon)$. However, the running time of such simulation scales exponentially as a function of $\delta^{-1}$. As was noted in Ref. [5], it means that simulation of the adiabatic evolution does not yield a polynomial-time algorithm for computing the ground state energy.

The perturbation theory provides an approximation to the ground state energy by truncating the series Eq. (4) at sufficiently high order $p$. In order to understand whether this approach can be used to construct an efficient computational algorithm, two separate issues have to be addressed:

Q1: What is the convergence radius of the perturbative series?
Q2: What is the computational cost of finding the coefficients in the perturbative series?

Note that the radius of convergence of the series Eq. (4) is a property of the Hamiltonian $H(\epsilon)$ only. It does not depend upon what particular perturbative expansion has been used to find the coefficients $E_p$. The following theorem allows one to answer the first question.

Theorem 2. The Taylor series $E(\epsilon) = \sum_{p=1}^{\infty} E_p\, \epsilon^p$ converges absolutely for $|\epsilon| \le 2\epsilon_0$. Furthermore,

\[
\left| E(\epsilon) - \sum_{q=1}^{p} E_q\, \epsilon^q \right| \le n \Delta\, 2^{-16 - p} \quad \text{if } |\epsilon| \le \epsilon_0 . \tag{6}
\]

Thus if one needs to compute $E(\epsilon)$ with a specified precision $\delta$, it suffices to compute the coefficients $E_1, \dots, E_p$, where $p = \log_2(n \delta^{-1}) + O(1)$ (assuming that $\Delta$ is a constant that does not depend on $n$).
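The truncation order implied by Theorem 2 is easy to make concrete. The Python sketch below is illustrative and rests on an assumption about the garbled source: it takes the right side of (6) to read $n \Delta\, 2^{-16-p}$, with $\Delta$ the gap of $H_0$, and returns the smallest $p \ge 1$ for which that bound is at most $\delta$. The function name `truncation_order` and its arguments are ours.

```python
import math


def truncation_order(n, delta, gap=1.0):
    """Smallest order p >= 1 with n * gap * 2**(-16 - p) <= delta.

    Assumes the error bound (6) has the form n * gap * 2**(-16 - p),
    valid for |eps| <= eps_0.
    n:     number of qubits
    delta: target absolute precision on the ground state energy
    gap:   spectral gap of the unperturbed Hamiltonian (our parameter name)
    """
    if n <= 0 or delta <= 0 or gap <= 0:
        raise ValueError("n, delta and gap must be positive")
    p = math.ceil(math.log2(n * gap / delta) - 16)
    return max(p, 1)


if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        print(n, truncation_order(n, delta=1e-6))
```

The order grows only logarithmically in $n \delta^{-1}$, which is why a cost of $n \exp(O(p))$ for computing the coefficients still gives a running time polynomial in $n$ and $\delta^{-1}$.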


Answering the second question has nothing to do with the convergence radius of the series Eq. (4) (as long as it is non-zero). One can compute the coefficients $E_p$ by choosing $\epsilon$ so small that $|\epsilon|\, \|V\| \ll \Delta$. In this regime the standard perturbation theory is applicable, for example, the self-energy operator formalism, see Refs. [1,6], or the Rayleigh-Schrödinger expansion, see Ref. [7]. Clearly, the computational cost of finding the coefficients $E_p$ varies for different methods. In the present paper we compute the coefficients $E_p$ using the Kirkwood-Thomas ansatz for the ground state. It was originally proposed in Ref. [8] for translation-invariant Ising-like Hamiltonians with a transverse magnetic field. The translation-invariance constraint has been removed in the later work by Datta and Kennedy [9]. We use the generalized Kirkwood-Thomas ansatz proposed by Yarotsky [4] which is applicable to any spin Hamiltonian with sufficiently weak interactions. It allows us to prove the following.

Theorem 3. Suppose $d$ is a fixed constant independent of $n$. Then there exists an algorithm with a running time $n \exp(O(p))$ that takes as input a triple $(H_0, V, p)$ and outputs $E_1, \dots, E_p$.

An immediate consequence of Theorems 2, 3 is

Corollary 1. Suppose $\Delta$, $J$, $d$ are fixed constants independent of $n$ and $|\epsilon| \le \epsilon_0$. Then there exists an algorithm with a running time $\mathrm{poly}(n, \delta^{-1})$ that computes $E(\epsilon)$ with an absolute error at most $\delta$.

Besides, it follows from Theorems 2, 3 that the energy density $E(\epsilon)/n$ can be computed with a precision $\delta$ in a time $n \cdot \mathrm{poly}(\delta^{-1})$. Note that while computing the coefficients $E_1, \dots, E_p$ we cannot afford the running time to grow faster than $\exp(O(p))$ (for fixed $n$) since we need $p \sim \log(n \delta^{-1})$ to achieve the desired accuracy. The perturbative expansion based on the Kirkwood-Thomas ansatz has two special features that make the scaling $\exp(O(p))$ possible: (i) The parameters of the ansatz are complex amplitudes $C(M)$ assigned to subsets of vertices $M \subseteq L$. The recursive equations specifying the amplitudes $C(M)$ are described by a polynomial of a constant degree, see Sect. 3.1; (ii) The perturbative expansion $C(M) = \sum_{p=1}^{\infty} C_p(M)\, \epsilon^p$ has a property known as the linked cluster theorem, namely, $C_p(M) = 0$ unless $M$ can be covered by a connected subgraph of size $O(p)$, see Sect. 4.1. The number of such subgraphs grows only exponentially with $p$, see Sect. 4.2. It implies that the number of non-zero coefficients $C_p(M)$ grows as $n \exp(O(p))$, see Sect. 5.1.

Naturally, one could run the algorithm from Theorem 3 to compute the truncated series for $E(\epsilon)$ even if $|\epsilon| > \epsilon_0$. The running time will be polynomial in $n$ and $\delta^{-1}$ as long as $|\epsilon|$ is smaller than the convergence radius $R$ of the series Eq. (4). Although we believe that $R$ must be close to $\Delta/(dJ)$ (see the discussion in Sect. 6), its exact value cannot be easily found. In practical simulations, one could evaluate $R$ by computing sufficiently many coefficients $E_p$ and using the fact that $R^{-1}$ is the largest accumulation point of a sequence $|E_p|^{1/p}$, $p = 1, \dots, \infty$, see Ref. [10]. Note that in general the singular point (points) of $E(\epsilon)$ with $|\epsilon| = R$ does not lie on the real axis and thus cannot be identified with a quantum phase transition point of $H(\epsilon)$ (since we consider finite systems, the latter is not even well defined).

Obviously, efficient computation of $E(\epsilon)$ is possible due to the presence of a small parameter in the problem. However it should be emphasized that the condition $|\epsilon| \le \epsilon_0$ does not imply that the ground state $|\psi\rangle$ of $H(\epsilon)$ is close to the ground state $|\Omega\rangle$ of the

Polynomial-Time Algorithm for Weakly Interacting Quantum Spin Systems


unperturbed Hamiltonian $H_0$. In fact, one should expect that $|\psi\rangle$ and $|\Omega\rangle$ are almost orthogonal for large $n$.$^1$ To illustrate this statement, consider as an example the perturbation $V = -J\sum_{u\in L} X_u$, where $X$ is the Pauli $\sigma^x$ operator, and the unperturbed Hamiltonian $H_0 = \Delta\sum_{u\in L} |1\rangle\langle 1|_u$. Clearly, the ground state of $H(\epsilon)$ is a product of one-qubit states, $|\psi\rangle = \bigotimes_{u\in L} |\psi_u\rangle$. A simple calculation shows that $\langle 0|\psi_u\rangle = \cos(\theta/2)$, where $\cos\theta = (1 + 4\epsilon^2 J^2/\Delta^2)^{-1/2}$. Thus for any fixed $\epsilon$ the overlap $\langle\Omega|\psi\rangle = (\cos(\theta/2))^n$ gets exponentially small as $n$ increases. However, the reduced density matrices of the ground states $|\psi\rangle$ and $|\Omega\rangle$ for any subset of qubits of constant size are indeed close to each other for small $\epsilon$. In other words, for small $\epsilon$ the state $|\psi\rangle$ describes small density quantum fluctuations of the background state $|\Omega\rangle$. One could speculate that this statement remains true for arbitrary weak perturbations as well. The Kirkwood-Thomas ansatz for the ground state of $H(\epsilon)$ used in the present paper provides a convenient way to quantify the "density of quantum fluctuations" and prove that it is indeed small for $|\epsilon| \le \epsilon_0$. Unfortunately, our approach does not allow us to make any statements about the validity of the area law or to decide whether the ground state can be well approximated using the PEPS ansatz, see Ref. [12].

The simulation algorithm based on the Kirkwood-Thomas ansatz is closely related to the coupled cluster method originally introduced by Coester [13]. The coupled cluster method is extensively used for numerical simulations in quantum chemistry, see a review [14], as well as in condensed matter physics, see a review [15] and the references therein. Accordingly, from the perspective of practical simulations, the algorithm described in the present paper is certainly not a new one. However, we believe that our results may be regarded as the first rigorous proof that the coupled cluster method yields a polynomial-time simulation algorithm for spin Hamiltonians with weak interactions.
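The product-state example above can be checked numerically. The following is a minimal sketch, assuming the single-qubit Hamiltonian $\Delta|1\rangle\langle 1| - \epsilon J X$ from that example; the function names `one_qubit_overlap`, `overlap_from_formula` and `n_qubit_overlap` are hypothetical and are introduced only for this illustration.

```python
import math


def one_qubit_overlap(eps, J, Delta):
    """Overlap <0|psi_u> for the normalized ground state of Delta*|1><1| - eps*J*X."""
    g = eps * J
    if g == 0.0:
        return 1.0  # the ground state is exactly |0>
    # Lowest eigenvalue of the 2x2 matrix [[0, -g], [-g, Delta]].
    lam = 0.5 * (Delta - math.sqrt(Delta ** 2 + 4.0 * g ** 2))
    # Unnormalized eigenvector (v0, v1): the first row gives -lam*v0 - g*v1 = 0.
    v0, v1 = g, -lam
    return abs(v0) / math.hypot(v0, v1)


def overlap_from_formula(eps, J, Delta):
    """cos(theta/2) with cos(theta) = (1 + 4 eps^2 J^2 / Delta^2)^(-1/2)."""
    cos_theta = (1.0 + 4.0 * (eps * J / Delta) ** 2) ** -0.5
    return math.sqrt(0.5 * (1.0 + cos_theta))


def n_qubit_overlap(eps, J, Delta, n):
    """Overlap of the n-qubit product ground state with |0...0>."""
    return one_qubit_overlap(eps, J, Delta) ** n


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, n_qubit_overlap(0.1, 1.0, 1.0, n))
```

The printed overlaps decay geometrically in $n$ even though each single-qubit overlap stays close to 1, which is the "orthogonality" effect described in the text.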
The Kirkwood-Thomas ansatz is not well suited for computing spin-spin correlation functions because it provides an unnormalized ground state. We avoid this problem using the standard relation between the correlation functions and the linear response of the ground state energy to a small perturbation. It allows us to prove

Theorem 4. Let $O_{u,v}$ be a Hermitian operator acting non-trivially only on qubits $u, v \in L$. Suppose $\|O_{u,v}\| \le 1$. The expectation value of $O_{u,v}$ on the ground state of $H(\epsilon)$ can be computed with a precision $\delta$ in a time $T = \mathrm{poly}(\delta^{-1})$ as long as $|\epsilon| \le \epsilon_0/2(d+1)$.

It should be emphasized that all results discussed in the paper concern only the ground state properties. We do not know whether efficient computation of thermodynamic quantities such as the free energy is possible for non-zero temperature. Perturbative expansions for partition functions of quantum spin systems at non-zero temperature have been studied by Datta et al. in [16] and by Borgs et al. in [17] using generalized Pirogov-Sinai theory. Whether or not these methods provide an efficient algorithm for computing the free energy is an open question.

Our results demonstrate that perturbation theory can be useful for quantum complexity theory not only as a tool for proving hardness of simulation, see [1,2], but also as a systematic method of constructing efficient simulation algorithms. An interesting open problem is whether the Kirkwood-Thomas ansatz can be adapted to the case when the unperturbed Hamiltonian $H_0$ has a degenerate ground state (i.e. $\Delta_u = 0$ for some qubits $u$). It could yield more powerful techniques for analyzing perturbative series for the low-energy effective Hamiltonian, see [21] for the recent progress.

$^1$ This effect is analogous to the well-known "orthogonality catastrophe" observed by Anderson in Ref. [11] for non-interacting fermions in the presence of a scattering potential.


S. Bravyi, D. DiVincenzo, D. Loss

The rest of the paper is organized as follows. Section 2 provides the necessary background on the generalized Kirkwood-Thomas ansatz. It mostly follows Ref. [4], although some of our proofs are technically different (in particular, Lemma 4). Section 3 shows how to solve the Kirkwood-Thomas equations using a power series and proves Theorem 2. In Sect. 4 we prove that our perturbative expansion obeys the well-known linked cluster theorem and establish an upper bound on the number of linked clusters on a graph. The algorithms for computing the ground state energy and spin-spin correlation functions are explicitly described in Sect. 5, which provides a proof of Theorems 3, 4. Some open problems are discussed in Sect. 6. Appendix A contains a technical lemma proving submultiplicativity of the norm of creation operators.

2. Kirkwood-Thomas Ansatz for the Ground State

2.1. Creation operators. Define the one-qubit operator $a^\dagger = |1\rangle\langle 0|$. Let $a_u^\dagger$ be the operator $a^\dagger$ on qubit $u$ tensored with the identity on all other qubits. For any non-empty subset of vertices $M \subseteq L$ denote $a_M^\dagger = \prod_{u\in M} a_u^\dagger$. Note that the operators $a_M^\dagger$ are nilpotent, $(a_M^\dagger)^2 = 0$, and that they pairwise commute: $a_M^\dagger a_K^\dagger = a_K^\dagger a_M^\dagger$. Also, one can easily check that the operators $\{a_M^\dagger\}$, $\emptyset \ne M \subseteq L$, are linearly independent. (All these definitions and properties apply to the operators $a$ and $a_M$ as well.)

Definition 1. A creation operator is an operator that can be written as

$$C = \sum_{\emptyset \ne M \subseteq L} C(M)\, a_M^\dagger$$

for some complex numbers $C(M)$. For any given creation operator $C$ the coefficients $C(M)$ are uniquely defined by $C(M) = \langle\Omega| a_M C |\Omega\rangle$.

Claim 1. Any state $|\psi\rangle$ satisfying $\langle\Omega|\psi\rangle = 1$ can be uniquely written as $|\psi\rangle = \exp(-C)\,|\Omega\rangle$ for some creation operator $C$.

Remark. The exponent above is defined by its Taylor series. The nilpotence of the operators $a_M^\dagger$ implies that $C^k = 0$ for any $k$ greater than the number of qubits $n = |L|$, so the Taylor series can be truncated at $k = n$.

Proof. Clearly, the states $\{a_M^\dagger |\Omega\rangle\}$, $M \subseteq L$, constitute an orthonormal basis of the $n$-qubit Hilbert space. Let $|\psi\rangle = \sum_{M\subseteq L} \psi(M)\, a_M^\dagger |\Omega\rangle$. The equation $|\psi\rangle = \exp(-C)\,|\Omega\rangle$ is equivalent to the system of equations

$$\psi(\emptyset) = 1, \qquad C(M) = -\psi(M) + \sum_{k=2}^{|M|} \frac{(-1)^k}{k!} \sum_{M = M_1\cup\ldots\cup M_k} C(M_1)\cdots C(M_k), \qquad M \subseteq L,\ M \ne \emptyset. \tag{7}$$

Here the second summation is over all partitions of $M$ into $k$ disjoint non-empty sets $M_1, \ldots, M_k$. Suppose we have already found all coefficients $C(M)$ with $|M| \le p$. Then Eq. (7) assigns a unique value to all coefficients $C(M)$ with $|M| = p + 1$. Thus the system Eq. (7) has a unique solution.
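The recursion in the proof of Claim 1 can be illustrated on a few qubits. The sketch below is not part of the original algorithm; it assumes subsets $M$ are encoded as bitmasks, stores an operator $\sum_M x(M)\,a_M^\dagger$ as a dictionary from masks to exact rational coefficients, and sums Eq. (7) over ordered tuples $(M_1,\ldots,M_k)$, which is what the expansion of $C^k$ produces. The helper names `mul`, `exp_minus` and `recover_c` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def mul(a, b):
    """Product of two operators sum_M x(M) a_M^dagger stored as {bitmask: coeff}.

    Overlapping subsets give zero because (a_u^dagger)^2 = 0.
    """
    out = {}
    for ma, ca in a.items():
        for mb, cb in b.items():
            if ma & mb == 0:
                out[ma | mb] = out.get(ma | mb, Fraction(0)) + ca * cb
    return out


def exp_minus(c, n):
    """Coefficients psi(M) of exp(-C)|Omega>; the Taylor series stops at k = n."""
    psi = {0: Fraction(1)}
    power = {0: Fraction(1)}
    for k in range(1, n + 1):
        power = mul(power, c)
        for m, v in power.items():
            psi[m] = psi.get(m, Fraction(0)) + Fraction((-1) ** k, factorial(k)) * v
    return psi


def recover_c(psi, n):
    """Solve Eq. (7) for C(M), processing subsets in order of increasing size."""
    c = {}
    for m in sorted(range(1, 1 << n), key=lambda x: bin(x).count("1")):
        size = bin(m).count("1")
        val = -psi.get(m, Fraction(0))
        power = dict(c)  # C^1 built from the coefficients known so far
        for k in range(2, size + 1):
            power = mul(power, c)  # C^k; its M-coefficient only uses proper subsets of M
            val += Fraction((-1) ** k, factorial(k)) * power.get(m, Fraction(0))
        c[m] = val
    return c


if __name__ == "__main__":
    c0 = {1: Fraction(1, 2), 2: Fraction(-1, 3), 3: Fraction(2), 4: Fraction(1),
          5: Fraction(0), 6: Fraction(5, 7), 7: Fraction(-3, 4)}
    print(recover_c(exp_minus(c0, 3), 3) == c0)
```

Exact rational arithmetic is used so that the round trip from $C$ to $\psi$ and back can be compared without rounding tolerance.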


2.2. Ansatz for the ground state. Our goal is to find an eigenvector $|\psi\rangle$ satisfying $H(\epsilon)\,|\psi\rangle = E(\epsilon)\,|\psi\rangle$, where $E(\epsilon)$ is the smallest eigenvalue of $H(\epsilon)$. We shall use the following ansatz for $|\psi\rangle$ (we do not care about the normalization):

$$|\psi\rangle = \exp(-C)\,|\Omega\rangle, \qquad C = \sum_{\emptyset \ne M \subseteq L} C(M)\, a_M^\dagger. \tag{8}$$

Claim 1 asserts that the ground state can be represented in this form unless it is orthogonal to $|\Omega\rangle$. Since we do not require $|\psi\rangle$ to be a normalized state, the ansatz Eq. (8) is meaningful only if $C$ is a bounded operator. We shall define a norm of a creation operator as

$$\|C\|_1 = \max_{u\in L} \sum_{M \ni u} |C(M)|. \tag{9}$$

Thus the ansatz Eq. (8) must be supplemented by a requirement that $C$ is a creation operator with a finite norm $\|C\|_1$. Of course, it may happen that $H$ has several eigenvectors of the form Eq. (8). One has to invoke some extra arguments to select an eigenvector corresponding to the smallest eigenvalue, see Subsect. 2.4.

The "physical meaning" of the norm $\|C\|_1$ can be illustrated by considering a product state: $|\psi\rangle = \bigotimes_{u\in L} |\psi_u\rangle$, where $|\psi_u\rangle = |0\rangle + \alpha_u |1\rangle$. Obviously, $|\psi\rangle = \exp(-C)\,|\Omega\rangle$ with $C = -\sum_{u\in L} \alpha_u a_u^\dagger$. Accordingly, $\|C\|_1 = \max_{u\in L} |\alpha_u|$. Thus one can think about $\|C\|_1$ as a density of quantum fluctuations.

Using the identity $\exp(C)\exp(-C) = I$ valid for an arbitrary operator $C$, see [22], one can rewrite the Schrödinger equation $H(\epsilon)\,|\psi\rangle = E(\epsilon)\,|\psi\rangle$ as

$$\exp(\hat{C})(H_0)\,|\Omega\rangle + \epsilon\,\exp(\hat{C})(V)\,|\Omega\rangle = E(\epsilon)\,|\Omega\rangle. \tag{10}$$

Here we introduced a linear map $\hat{C}$ on the space of $n$-qubit operators whose action is

$$\hat{C}(X) = CX - XC.$$

The exponential function $\exp(\hat{C})$ is defined by the Taylor series. The advantage of the ansatz Eq. (8) is that we can truncate the expansion of the function $\exp(\hat{C})$ after a few lowest orders, since all higher order terms turn out to be identically zero. It follows from the two lemmas stated below.

Lemma 1. Let $C_1, C_2$ be creation operators. Then

$$\hat{C}_1 \hat{C}_2 (H_0) = 0. \tag{11}$$

Proof. To simplify notations we shall consider operators $a$ instead of $a^\dagger$. Let $u \in L$, $M_1, M_2 \subseteq L$ and $X = \hat{a}_{M_1} \hat{a}_{M_2} (|1\rangle\langle 1|_u)$. By linearity, it is enough to prove that $X = 0$. Since the operators $a_{M_1}$ and $a_{M_2}$ commute, $X = 0$ unless $u \in M_1 \cap M_2$. Then $[a_{M_2}, |1\rangle\langle 1|_u] = a_{M_2}$ and $X = [a_{M_1}, a_{M_2}] = 0$.

Lemma 2. Let $C_1, C_2, \ldots, C_5$ be creation operators. Then

$$\hat{C}_1 \hat{C}_2 \cdots \hat{C}_5 (V) = 0. \tag{12}$$


Proof. To simplify notations we shall consider operators $a$ instead of $a^\dagger$. Let $M_1, M_2, \ldots, M_5 \subseteq L$, $(u, v) \in E$ and $X = \hat{a}_{M_1} \hat{a}_{M_2} \cdots \hat{a}_{M_5} (V_{u,v})$. By linearity it is enough to prove that $X = 0$. Since the operators $a_{M_1}, \ldots, a_{M_5}$ commute with each other, $X = 0$ unless each of the subsets $M_j$ contains at least one of the vertices $u, v$. Therefore, expanding the commutators one can represent $X$ as a linear combination of $2^5$ terms, where each term contains at least five operators $a_u$, $a_v$ on the pair of qubits $u, v$. Some of these operators $a$ are on the right of $V_{u,v}$ and some of them are on the left. Thus at least three operators $a$ are on the same side of $V_{u,v}$. Then at least two operators $a$ act on the same side of $V_{u,v}$ and on the same qubit. Thus each of the $2^5$ terms in $X$ contains either $a_u^2$ or $a_v^2$. Thus $X = 0$.

Combining Lemmas 1, 2 we get the following truncations:

$$\exp(\hat{C})(H_0) = H_0 + \hat{C}(H_0), \tag{13}$$

$$\exp(\hat{C})(V) = \sum_{k=0}^{4} \frac{1}{k!}\, \hat{C}^k (V). \tag{14}$$

Here a convention $\hat{C}^0(V) = V$ is adopted.

Let us point out an analogy between the truncation effect observed above and the Lieb-Robinson bound [18,19]. The latter asserts that for any local observable $O_u$ acting only on a qubit $u$ and for any Hamiltonian $H$ with short-range interactions of bounded norm, the time evolved observable $O_u(t) = \exp(i\hat{H}t)(O_u)$ can be approximated very well by an operator acting only on spins within distance $v|t|$ from $u$, where $v$ is a group velocity. If one takes a creation operator $C$ for which the coefficients $C(M)$ are non-zero only for subsets $M$ of size $O(1)$ (an analogue of short-range interactions), then the "time-evolved" observable $\exp(\hat{C})(O_u)$ acts only on the spins within distance $O(1)$ from $u$ (apply the same arguments as in the proof of Lemma 2). As opposed to the Lieb-Robinson bound scenario, the size of a region acted on by the evolved operator does not depend on the norm of $C$ (which is analogous to the evolution time) and no approximations are involved.

2.3. Kirkwood-Thomas equations. Substituting Eq. (13) into the Schrödinger equation Eq. (10) and taking into account that $H_0\,|\Omega\rangle = 0$ one gets

$$-\sum_{\emptyset \ne M \subseteq L} C(M)\, H_0\, a_M^\dagger |\Omega\rangle + \epsilon\,\exp(\hat{C})(V)\,|\Omega\rangle = E(\epsilon)\,|\Omega\rangle. \tag{15}$$

Let us introduce eigenvalues of the unperturbed Hamiltonian $E_0(M)$ such that

$$H_0\, a_M^\dagger |\Omega\rangle = E_0(M)\, a_M^\dagger |\Omega\rangle, \qquad E_0(M) = \sum_{u\in M} \Delta_u. \tag{16}$$

Multiplying Eq. (15) on the left by $\langle\Omega| a_M$, $M \ne \emptyset$, and employing Eq. (14) one arrives at

$$C(M) = \frac{\epsilon}{E_0(M)} \sum_{k=0}^{4} \frac{1}{k!}\, \langle\Omega| a_M \hat{C}^k (V) |\Omega\rangle, \qquad \emptyset \ne M \subseteq L. \tag{17}$$


Following [4], we shall refer to Eq. (17) as Kirkwood-Thomas equations. Similarly, multiplying Eq. (15) by $\langle\Omega|$ on the left one gets

$$E(\epsilon) = \epsilon \sum_{k=0}^{4} \frac{1}{k!}\, \langle\Omega| \hat{C}^k (V) |\Omega\rangle. \tag{18}$$

It is clear that the Kirkwood-Thomas equations Eq. (17) may have several solutions $C$, since the equations do not explicitly include the eigenvalue $E(\epsilon)$. In the worst case, when no eigenvector of $H(\epsilon)$ is orthogonal to $|\Omega\rangle$, the Kirkwood-Thomas equations would have $2^n$ solutions, since any eigenvector could be represented in the form Eq. (8). We shall explain how to select the solution corresponding to the smallest eigenvalue in the next subsection.

The following lemma asserts that the norm $\|\cdot\|_1$ has a property analogous to submultiplicativity. It is the main technical tool that allows one to easily manipulate equations like Eq. (17).

Lemma 3. Let $k$ be any integer and $C_1, \ldots, C_k$ be creation operators. Define a creation operator $C$ such that

$$C = \sum_{\emptyset \ne M \subseteq L} C(M)\, a_M^\dagger \quad \text{where} \quad C(M) = \frac{1}{E_0(M)}\, \langle\Omega| a_M \hat{C}_1 \cdots \hat{C}_k (V) |\Omega\rangle.$$

Then

$$\|C\|_1 \le \frac{2^{13} d J}{\Delta} \prod_{j=1}^{k} \|C_j\|_1. \tag{19}$$

Besides,

$$|\langle\Omega| \hat{C}_1 \cdots \hat{C}_k (V_{u,v}) |\Omega\rangle| \le 2^4 J \prod_{j=1}^{k} \|C_j\|_1 \quad \text{for any } (u, v) \in E. \tag{20}$$

The proof of the lemma is presented in Appendix A.

2.4. A lower bound on the spectral gap. Suppose we can find some eigenvalue $E'(\epsilon)$ of the Hamiltonian $H(\epsilon)$ such that $E'(0) = 0$, $E'(\epsilon)$ is a continuous function of $\epsilon$, and $E'(\epsilon)$ has multiplicity 1 for $|\epsilon| \le \epsilon_c$. Then it follows immediately that $E'(\epsilon)$ is the smallest eigenvalue of $H(\epsilon)$ for all $|\epsilon| \le \epsilon_c$. Of course, the main difficulty in using this argument is proving non-degeneracy of an eigenvalue. The following lemma asserts that a solution of the Kirkwood-Thomas equations Eq. (17) with a sufficiently small norm $\|C\|_1$ corresponds to a non-degenerate eigenvalue separated from the rest of the spectrum by a constant gap.

Lemma 4. Suppose $H(\epsilon)\,|\psi\rangle = E(\epsilon)\,|\psi\rangle$, where $|\psi\rangle = \exp(-C)\,|\Omega\rangle$ and $C$ is a creation operator with a finite norm $\|C\|_1$ satisfying the inequality

$$1 > \frac{2^{14} d J |\epsilon|}{\Delta} \sum_{k=0}^{3} \frac{(\|C\|_1)^k}{k!}. \tag{21}$$

Then $E(\epsilon)$ has multiplicity 1 and any other eigenvalue of $H(\epsilon)$ is separated from $E(\epsilon)$ by a gap at least $\Delta/2$.


Proof. Let us abbreviate $H \equiv H(\epsilon)$. Assume that $H\,|\phi\rangle = (E(\epsilon) + \delta)\,|\phi\rangle$, where $|\delta| < \Delta/2$ and the states $|\psi\rangle$, $|\phi\rangle$ are linearly independent (the latter condition is fulfilled automatically if $\delta \ne 0$). We can always write $|\phi\rangle$ as

$$\exp(C)\,|\phi\rangle = \sum_{M\subseteq L} B(M)\, a_M^\dagger |\Omega\rangle \tag{22}$$

for some complex numbers $B(M)$. Note that $B(M) \ne 0$ for some non-empty set $M$, since otherwise $|\phi\rangle$ is proportional to $|\psi\rangle$. Thus we can define a creation operator $B = \sum_{\emptyset \ne M \subseteq L} B(M)\, a_M^\dagger$ with a non-zero norm $\|B\|_1 > 0$. Using commutativity $[C, B] = 0$ we can represent $|\phi\rangle$ as $|\phi\rangle = B\,|\psi\rangle + B(\emptyset)\,|\psi\rangle$. Then the eigenvalue equations $H\,|\phi\rangle = (E(\epsilon) + \delta)\,|\phi\rangle$ and $H\,|\psi\rangle = E(\epsilon)\,|\psi\rangle$ imply

$$[B, H]\,|\psi\rangle = [B, H - E(\epsilon) I]\,|\psi\rangle = B(H - E(\epsilon) I)\,|\psi\rangle - (H - E(\epsilon) I)\,|\phi\rangle = -\delta\,|\phi\rangle = -\delta B\,|\psi\rangle - \delta B(\emptyset)\,|\psi\rangle. \tag{23}$$

Commutativity $[C, B] = 0$ yields $\exp(\hat{C})(B) = B$. Hence, multiplying Eq. (23) by $\exp(C)$ on the left one arrives at

$$[B, \exp(\hat{C})(H)]\,|\Omega\rangle + \delta B\,|\Omega\rangle + \delta B(\emptyset)\,|\Omega\rangle = 0. \tag{24}$$

From Lemma 1 we know that $[B, \exp(\hat{C})(H_0)] = [B, H_0]$. Choosing any $M \ne \emptyset$ and multiplying Eq. (24) by $\langle\Omega| a_M$ on the left one gets

$$B(M) = \epsilon\, \frac{\langle\Omega| a_M \hat{B} \exp(\hat{C})(V) |\Omega\rangle}{E_0(M) - \delta} = \epsilon\, \frac{E_0(M)}{E_0(M) - \delta} \sum_{k=0}^{3} \frac{1}{k!}\, \frac{1}{E_0(M)}\, \langle\Omega| a_M \hat{B} \hat{C}^k (V) |\Omega\rangle.$$

Here we have taken into account that $\hat{B}\hat{C}^k(V) = 0$ for $k \ge 4$, see Lemma 2. Note that the condition $|\delta| < \Delta/2$ implies a bound $|E_0(M)/(E_0(M) - \delta)| \le 2$. Applying Lemma 3 to the operator $B$ and using the triangle inequality for the norm one gets

$$\|B\|_1 \le \frac{2^{14} d J |\epsilon|}{\Delta}\, \|B\|_1 \sum_{k=0}^{3} \frac{1}{k!}\, (\|C\|_1)^k.$$

Since $\|B\|_1 > 0$ we can divide both sides by $\|B\|_1$, getting an inequality opposite to the one stated in the lemma. Thus the assumption from which we started the proof leads to a contradiction.

Remark. Note that at $\epsilon = 0$ the Hamiltonian $H(\epsilon) = H_0$ has many degenerate eigenvalues, so one can certainly find two eigenvalues with separation $|\delta| < \Delta/2$. It might seem to be in contradiction with the lemma above. However, at $\epsilon = 0$ the condition that $\|C\|_1$ is finite cannot be fulfilled for degenerate eigenvalues, since the corresponding eigenvectors are orthogonal to $|\Omega\rangle$.


Corollary 2. Suppose $H(\epsilon)$ has an eigenvector $|\psi\rangle = \exp(-C)\,|\Omega\rangle$ with an eigenvalue $E'(\epsilon)$ such that $E'(\epsilon)$ is a continuous function of $\epsilon$, $E'(0) = 0$, and $\|C\|_1 \le c_{max}$ for all $|\epsilon| \le \epsilon_c'$. Define $\epsilon_c''$ such that

$$1 = \frac{2^{14} d J \epsilon_c''}{\Delta} \sum_{k=0}^{3} \frac{1}{k!}\, (c_{max})^k.$$

Let $\epsilon_c = \min(\epsilon_c', \epsilon_c'')$. Then for all $|\epsilon| \le \epsilon_c$,
(1) $E'(\epsilon)$ is the smallest eigenvalue of $H(\epsilon)$,
(2) $E'(\epsilon)$ has multiplicity 1,
(3) $E'(\epsilon)$ is separated from the rest of the spectrum by a gap at least $\Delta/2$.

Proof. (1) Indeed, Lemma 4 implies that no level crossings involving the eigenvector $|\psi\rangle$ can occur for $|\epsilon| \le \epsilon_c$. Since $|\psi\rangle$ is the ground state for $\epsilon = 0$, it is the ground state for all $|\epsilon| \le \epsilon_c$. (2) and (3) follow immediately from Lemma 4.

3. Solution of the Kirkwood-Thomas Equations

In their original paper [8] Kirkwood and Thomas employed the expansion in powers of $\epsilon$ in order to find the ground state. An alternative approach, proposed by Datta and Kennedy in Ref. [9] and generalized by Yarotsky in Ref. [4], is to regard Eq. (17) as a fixed point equation for a non-linear map on the space of creation operators. One can prove that this map is a contraction in the unit ball (for a properly defined metric) if $\epsilon$ is below a certain threshold value. Then one can invoke the Brouwer fixed point theorem to argue that the unit ball contains a unique fixed point. Although the latter method is more elegant, we adopt the original Kirkwood-Thomas approach based on power series, because it naturally lends itself to approximating the ground state energy with a controllable error.

3.1. Solution by formal power series. Let us first solve the Kirkwood-Thomas equation (17) in terms of formal power series, ignoring the convergence issue. Recall that $C = \sum_{M\subseteq L} C(M)\, a_M^\dagger$, where the sum is over all non-empty sets. Define a series

$$C(M) = \sum_{p=1}^{\infty} C_p(M)\, \epsilon^p, \qquad \emptyset \ne M \subseteq L. \tag{25}$$

Let us agree that $C_0(M) = 0$ for any $M$. Define also

$$C_p = \sum_{\emptyset \ne M \subseteq L} C_p(M)\, a_M^\dagger, \qquad \hat{C}_p = \sum_{\emptyset \ne M \subseteq L} C_p(M)\, \hat{a}_M^\dagger, \tag{26}$$

so that $C = \sum_{p=1}^{\infty} C_p\, \epsilon^p$ and $\hat{C} = \sum_{p=1}^{\infty} \hat{C}_p\, \epsilon^p$. Substituting the series Eqs. (25,26) into the Kirkwood-Thomas equation (17) and equating the coefficients for each power of $\epsilon$ one gets

$$C_1(M) = E_0(M)^{-1}\, \langle\Omega| a_M V |\Omega\rangle, \tag{27}$$

$$C_p(M) = E_0(M)^{-1} \sum_{k=1}^{4} \frac{1}{k!} \sum_{p_1+\ldots+p_k = p-1} \langle\Omega| a_M \hat{C}_{p_1} \cdots \hat{C}_{p_k} (V) |\Omega\rangle, \qquad p \ge 2. \tag{28}$$


Clearly, the equations above have a unique solution. Substituting Eqs. (25,26) into the formula for the ground state energy Eq. (18) one gets

$$E(\epsilon) = \sum_{p=1}^{\infty} E_p\, \epsilon^p, \qquad E_1 = \langle\Omega| V |\Omega\rangle, \qquad E_p = \sum_{k=1}^{4} \frac{1}{k!} \sum_{p_1+\ldots+p_k = p-1} \langle\Omega| \hat{C}_{p_1} \cdots \hat{C}_{p_k} (V) |\Omega\rangle, \quad p \ge 2. \tag{29}$$

Of course, formal power series do not represent an actual solution of the Kirkwood-Thomas equations unless we prove their convergence.

3.2. Convergence of C-series. We would like to prove that the series $C = \sum_{p=1}^{\infty} C_p\, \epsilon^p$ is convergent with respect to the norm Eq. (9) with a non-zero convergence radius. We shall need to get a lower bound on the convergence radius in terms of $\Delta$, $d$, and $J$. Clearly, it is enough to analyze convergence of the series

$$\chi(\epsilon) = \sum_{p=1}^{\infty} \chi_p\, \epsilon^p, \qquad \chi_p = \|C_p\|_1. \tag{30}$$

Note that $\|C\|_1 \le \chi(|\epsilon|)$.

Lemma 5. The series $\chi(\epsilon) = \sum_{p=1}^{\infty} \chi_p\, \epsilon^p$ converges absolutely for

$$|\epsilon| \le 2\epsilon_0 = \frac{2^{-17}\Delta}{dJ}. \tag{31}$$

Besides, for any $\epsilon$ as above one has the following bounds:

$$|\chi(\epsilon)| \le 2^{-15}, \qquad \chi_p \le \frac{2^{-15}}{(2\epsilon_0)^p} \quad \text{for } p \ge 1. \tag{32}$$

Proof. Let us first get an upper bound on $\|C_1\|_1$. From Eq. (27) it is clear that $C_1(M) = 0$ unless $M \subseteq \{u, v\}$ for some edge $(u, v) \in E$. Let $u \in L$ be the vertex that achieves the maximum in $\|C_1\|_1 = \max_{u\in L} \sum_{M\ni u} |C_1(M)|$. Then the sum contains at most $d + 1$ sets $M$, namely, $M = \{u\}$ and $M = \{u, v\}$ for $(u, v) \in E$. Therefore $\|C_1\|_1 \le (d + 1)J/\Delta \le 2dJ/\Delta$, that is

$$\chi_1 \le \frac{2dJ}{\Delta}. \tag{33}$$

Define a polynomial function $F_p$ of real variables $x_1, \ldots, x_{p-1}$ according to

$$F_p(x_1, \ldots, x_{p-1}) = x_{p-1} + \frac{1}{2} \sum_{p_1+p_2 = p-1} x_{p_1} x_{p_2} + \frac{1}{6} \sum_{p_1+p_2+p_3 = p-1} x_{p_1} x_{p_2} x_{p_3} + \frac{1}{24} \sum_{p_1+\ldots+p_4 = p-1} x_{p_1} x_{p_2} x_{p_3} x_{p_4}. \tag{34}$$


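Applying Lemma 3 and the triangle inequality to Eq. (28) one gets

$$\chi_p \le \frac{2^{13} d J}{\Delta}\, F_p(\chi_1, \ldots, \chi_{p-1}), \qquad p \ge 2. \tag{35}$$

To simplify notations, define constants

$$a = \frac{2dJ}{\Delta}, \qquad b = \frac{2^{13} d J}{\Delta}, \tag{36}$$

so that $\chi_1 \le a$ and $\chi_p \le b F_p(\chi_1, \ldots, \chi_{p-1})$ for $p \ge 2$. Consider the formal power series

$$\mu(\epsilon) = \sum_{p=1}^{\infty} \mu_p\, \epsilon^p, \qquad \mu_1 = a, \qquad \mu_p = b F_p(\mu_1, \ldots, \mu_{p-1}), \quad p \ge 2. \tag{37}$$

Since the polynomial $F_p$ has non-negative coefficients one can prove inductively that $\chi_p \le \mu_p$ for all $p \ge 1$. Hence it suffices to prove that the series Eq. (37) converges absolutely for $|\epsilon| \le 2\epsilon_0$. Our strategy will be to guess a function $\mu(\epsilon)$ analytic for $|\epsilon| \le 2\epsilon_0$ whose Taylor series at $\epsilon = 0$ coincides with the series Eq. (37). By inspecting the recursive relation Eq. (37) one can easily convince oneself that $\mu(\epsilon)$ has to obey the following equation:

$$\mu(\epsilon) = a\epsilon + b\epsilon \left( \mu(\epsilon) + \frac{1}{2}\mu^2(\epsilon) + \frac{1}{6}\mu^3(\epsilon) + \frac{1}{24}\mu^4(\epsilon) \right). \tag{38}$$

We can use it to write down the inverse function

$$\epsilon(\mu) = \frac{\mu}{Q(\mu)}, \qquad Q(\mu) = a + b \left( \mu + \frac{1}{2}\mu^2 + \frac{1}{6}\mu^3 + \frac{1}{24}\mu^4 \right). \tag{39}$$

Simple algebra shows that

$$|Q(\mu)| \ge \frac{a}{2} \quad \text{if} \quad |\mu| \le \frac{a}{4b} = 2^{-14}. \tag{40}$$

Thus $\epsilon(\mu)$ is analytic for $|\mu| \le 2^{-14}$. Define a set $M = \{\mu : |\mu| \le 2^{-15}\}$.

Claim 2. Let $\epsilon$ be a complex number such that $|\epsilon| \le 2\epsilon_0$. Then the equation $\epsilon(\mu) = \epsilon$ has a unique solution $\mu \in M$.

Proof. One can easily show that for any $\mu_1, \mu_2 \in M$,

$$|Q(\mu_1) - Q(\mu_2)| \le 2b\,|\mu_1 - \mu_2|. \tag{41}$$

Assume $\epsilon(\mu_1) = \epsilon(\mu_2) = \epsilon$ for some $\mu_1, \mu_2 \in M$. If $\epsilon = 0$ then $\mu_1 = \mu_2 = 0$. Assume $\epsilon \ne 0$. Then $\mu_1, \mu_2 \ne 0$ and

$$\mu_1 - \mu_2 = \frac{\mu_2\, (Q(\mu_1) - Q(\mu_2))}{Q(\mu_2)}.$$

Applying the lower bound Eq. (40) and the upper bound Eq. (41) we get

$$|\mu_1 - \mu_2| \le \frac{2^{-15}\, |Q(\mu_1) - Q(\mu_2)|}{a/2} \le \frac{2^{-13}\, b\, |\mu_1 - \mu_2|}{a} \le \frac{1}{2}\, |\mu_1 - \mu_2|.$$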


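Thus $\mu_1 = \mu_2$ and the equation $\epsilon(\mu) = \epsilon$ has at most one solution $\mu \in M$. Therefore $\epsilon : M \to \epsilon(M)$ is an injection. Let us prove that $\epsilon(M)$ contains a ball of radius $2\epsilon_0$. Indeed, $\epsilon(M)$ is an open set and $0 \in \epsilon(M)$. Let $\gamma$ be the boundary of $M$, i.e., a circle of radius $2^{-15}$ centered at 0. Then $\epsilon(\gamma)$ is the boundary of $\epsilon(M)$. For any $\mu \in \gamma$ one has $|Q(\mu)| \le a + 2b|\mu| = a + 2^{-14} b \le 2a$. Thus $\epsilon(M)$ contains a ball of radius

$$R = \min_{\mu\in\gamma} |\epsilon(\mu)| \ge \frac{2^{-15}}{2a} = \frac{2^{-17}\Delta}{dJ} = 2\epsilon_0.$$

It completes the proof of the claim.

Let $K = \{\epsilon : |\epsilon| \le 2\epsilon_0\}$. From Claim 2 we infer that $\epsilon(\mu)$ is an analytic bijection from the set $\epsilon^{-1}(K) \subseteq M$ to the set $K$. It follows from the inverse function theorem for analytic functions, see [10], that the inverse function $\mu(\epsilon)$ is analytic for $\epsilon \in K$. Therefore the series Eq. (37) converges absolutely for $|\epsilon| \le 2\epsilon_0$. The upper bound on $\mu_p$ can be obtained by standard methods using Cauchy's formula:

$$\mu_p = \frac{1}{2\pi i} \oint_{|\epsilon| = 2\epsilon_0} \frac{\mu(\epsilon)\, d\epsilon}{\epsilon^{p+1}}.$$

Thus

$$|\mu_p| \le \frac{1}{(2\epsilon_0)^p} \max_{\epsilon\,:\,|\epsilon| = 2\epsilon_0} |\mu(\epsilon)| \le \frac{2^{-15}}{(2\epsilon_0)^p}.$$

Recall that $\chi_p \le \mu_p$, so the lemma is proved.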
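The majorant recursion of Eq. (37) is easy to evaluate exactly, which gives a quick sanity check of the bound in Lemma 5. The sketch below is only an illustration under stated assumptions: it works in units where $dJ/\Delta = 1$, so that $a = 2$, $b = 2^{13}$ and $2\epsilon_0 = 2^{-17}$, and it uses the observation that $F_p(\mu_1,\ldots,\mu_{p-1})$ is the coefficient of $\epsilon^{p-1}$ in $\mu + \mu^2/2 + \mu^3/6 + \mu^4/24$. The function names are hypothetical.

```python
from fractions import Fraction


def truncated_product(x, y, deg):
    """Product of two power series (coefficient lists), truncated at degree `deg`."""
    out = [Fraction(0)] * (deg + 1)
    for i, xi in enumerate(x):
        if xi == 0:
            continue
        for j, yj in enumerate(y):
            if i + j > deg:
                break
            out[i + j] += xi * yj
    return out


def majorant_coefficients(a, b, order):
    """Return [mu_1, ..., mu_order] with mu_1 = a and mu_p = b * F_p(mu_1..mu_{p-1})."""
    mu = [Fraction(0)] * (order + 1)  # mu[p] is the coefficient of eps^p
    mu[1] = Fraction(a)
    for p in range(2, order + 1):
        total = Fraction(0)
        power = [Fraction(1)] + [Fraction(0)] * (p - 1)  # series of mu^0
        fact = 1
        for k in range(1, 5):
            power = truncated_product(power, mu[:p], p - 1)  # series of mu^k
            fact *= k
            total += power[p - 1] / fact  # contribution of mu^k / k! to F_p
        mu[p] = Fraction(b) * total
    return mu[1:]


if __name__ == "__main__":
    two_eps0 = Fraction(1, 2 ** 17)
    for p, mu_p in enumerate(majorant_coefficients(2, 2 ** 13, 8), start=1):
        bound = Fraction(1, 2 ** 15) / two_eps0 ** p
        print(p, float(mu_p), float(bound), mu_p <= bound)
```

Because $\chi_p \le \mu_p$, the same numbers also bound the norms $\|C_p\|_1$ in these units.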

One can summarize the results of this subsection as follows.

Corollary 3. Suppose $|\epsilon| \le 2\epsilon_0$. Then the Kirkwood-Thomas equations (17) have a unique solution $C$ defined by the power series Eq. (25) with $\|C\|_1 \le 2^{-15}$.

3.3. Convergence of E-series. In this subsection we analyze convergence of the series $E(\epsilon) = \sum_{p=1}^{\infty} E_p\, \epsilon^p$ for the eigenvalue obtained from the Kirkwood-Thomas equation, see Eq. (4).

Lemma 6. The series $E(\epsilon) = \sum_{p=1}^{\infty} E_p\, \epsilon^p$ converges absolutely for

$$|\epsilon| \le 2\epsilon_0 = \frac{2^{-17}\Delta}{dJ}. \tag{42}$$

Besides,

$$|E_p| \le \frac{2^{-16}\, n \Delta}{(2\epsilon_0)^p} \quad \text{for } p \ge 1. \tag{43}$$

Proof. Applying Lemma 3 and the triangle inequality to Eq. (29) one gets

$$|E_1| \le n d J, \qquad |E_p| \le 2^4\, n d J\, F_p(\chi_1, \ldots, \chi_{p-1}), \quad p \ge 2, \tag{44}$$

where $\chi_p = \|C_p\|_1$ and the polynomial $F_p$ is defined in Eq. (34). Define a formal series

$$e(\epsilon) = \sum_{p=1}^{\infty} e_p\, \epsilon^p, \qquad e_1 = n d J, \qquad e_p = 2^4\, n d J\, F_p(\chi_1, \ldots, \chi_{p-1}), \quad p \ge 2. \tag{45}$$


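By definition, $|E_p| \le e_p$ for all $p$. Besides, $e(\epsilon)$ can be expressed in terms of $\chi(\epsilon) = \sum_{p=1}^{\infty} \chi_p\, \epsilon^p$ as

$$e(\epsilon) = 2^4\, n d J\, \epsilon \left( \chi(\epsilon) + \frac{1}{2}\chi^2(\epsilon) + \frac{1}{6}\chi^3(\epsilon) + \frac{1}{24}\chi^4(\epsilon) \right) + n d J\, \epsilon.$$

This equality can be verified by equating coefficients for each power of $\epsilon$. Lemma 5 implies that $\chi(\epsilon)$ is analytic for $|\epsilon| \le 2\epsilon_0$. Therefore $e(\epsilon)$ and $E(\epsilon)$ are analytic for $|\epsilon| \le 2\epsilon_0$ and the first statement of the lemma is proved. In order to get an upper bound on $e_p$ (and thus on $E_p$), use Cauchy's formula:

$$e_p = \frac{1}{2\pi i} \oint_{|\epsilon| = 2\epsilon_0} \frac{e(\epsilon)\, d\epsilon}{\epsilon^{p+1}}.$$

It follows from Lemma 5 that $|\chi(\epsilon)| \le 2^{-15}$ for $|\epsilon| \le 2\epsilon_0$. Therefore

$$|e_p| \le \frac{1}{(2\epsilon_0)^p} \max_{\epsilon\,:\,|\epsilon| = 2\epsilon_0} |e(\epsilon)| \le \frac{1}{(2\epsilon_0)^p} \left( 2^4\, n d J\, (2\epsilon_0)\, 2^{-14} + n d J\, (2\epsilon_0) \right) \le \frac{2^{-16}\, n \Delta}{(2\epsilon_0)^p}.$$

The lemma is proved.

Corollary 4. Suppose $|\epsilon| \le 2\epsilon_0$. Then the series Eq. (29) converges absolutely to the smallest eigenvalue of $H(\epsilon)$. The smallest eigenvalue is non-degenerate and is separated from the rest of the spectrum by a gap at least $\Delta/2$.

Proof. It follows from Corollary 2. Indeed, we have already shown that the conditions of Corollary 2 are satisfied with $\epsilon_c' = 2\epsilon_0$ and $c_{max} = 2^{-15}$, see Corollary 3. Since $\sum_{k=0}^{3} (c_{max})^k/k! < 2$, it yields $\epsilon_c'' \ge 2^{-15}\Delta/(dJ)$. Thus $\epsilon_c = \min(\epsilon_c', \epsilon_c'') = 2\epsilon_0$.

Lemma 6 allows one to estimate an error resulting from truncation of the series for the ground state energy at a finite order $p$.

Corollary 5. Suppose $|\epsilon| \le \epsilon_0$. Then

$$\left| E(\epsilon) - \sum_{q=1}^{p} E_q\, \epsilon^q \right| \le n \Delta\, 2^{-16-p}. \tag{46}$$

Proof. Use Eq. (43).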
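The truncation bound of Corollary 5 translates directly into a choice of the expansion order. As a small sketch, assuming the bound $n\Delta\,2^{-16-p}$ on the truncation error and a target absolute precision $\delta$, the helper below (the hypothetical name `truncation_order` and the parameter `gap`, standing for $\Delta$, are introduced only here) returns the smallest order $p \ge 1$ for which the bound does not exceed $\delta$.

```python
def truncation_bound(n, p, gap=1.0):
    """Right-hand side n * Delta * 2^(-16 - p) of the truncation estimate."""
    return n * gap * 2.0 ** (-16 - p)


def truncation_order(n, delta, gap=1.0):
    """Smallest p >= 1 such that truncation_bound(n, p, gap) <= delta."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    p = 1
    while truncation_bound(n, p, gap) > delta:
        p += 1
    return p


if __name__ == "__main__":
    # A million qubits and nine digits of absolute precision.
    print(truncation_order(10 ** 6, 1e-9))
```

The required order grows only logarithmically in $n\delta^{-1}$, in line with the estimate $p \sim \log(n\delta^{-1})$ used when discussing the running time.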

Summarizing, we have proved Theorems 1, 2.

4. Linked Cluster Theorems

Throughout this section we shall use the term linked cluster, which refers to a subset of vertices inducing a connected subgraph of $G$. More formally,

Definition 2. A subset $M \subseteq L$ is called a linked cluster iff for any $u, v \in M$ there exists a sequence of vertices $u_0, u_1, \ldots, u_t \in M$ such that $u_0 = u$, $u_t = v$ and $(u_j, u_{j+1}) \in E$ for all $j = 0, \ldots, t - 1$.

Definition 3. A connected size of a subset $M \subseteq L$ is the minimal size of a linked cluster that contains all vertices of $M$. We shall denote a connected size of $M$ as $|M|_c$.
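Definitions 2 and 3 can be made concrete with a brute-force sketch for small graphs. The code below is only an illustration: it assumes the graph is given as a list of vertices and a list of undirected edges, and the function names `is_linked_cluster` and `connected_size` are hypothetical. The search over supersets is exponential in the number of vertices, so it is meant for toy examples only.

```python
from itertools import combinations


def is_linked_cluster(vertices, edges):
    """True if the non-empty vertex set induces a connected subgraph (Definition 2)."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y in vertices and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == vertices


def connected_size(subset, all_vertices, edges):
    """Minimal size of a linked cluster containing `subset` (Definition 3).

    Returns None if no linked cluster contains the whole subset.
    """
    subset = set(subset)
    others = [v for v in all_vertices if v not in subset]
    for extra in range(len(others) + 1):
        for added in combinations(others, extra):
            if is_linked_cluster(subset | set(added), edges):
                return len(subset) + extra
    return None


if __name__ == "__main__":
    path_vertices = [0, 1, 2, 3, 4]
    path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(connected_size({0, 4}, path_vertices, path_edges))
```

On the five-vertex path used in the usage example, the two end points can only be joined through all intermediate vertices, so their connected size equals the length of the path.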


4.1. Linked cluster expansion for the ground state.

Lemma 7. Let $C(M) = \sum_{p=1}^{\infty} C_p(M)\, \epsilon^p$ be the solution of the Kirkwood-Thomas equations obtained in Sect. 3. Then

$$C_p(M) = 0 \quad \text{unless} \quad |M|_c \le p + 1. \tag{47}$$

Proof. We shall prove the lemma by induction in $p$. From Eq. (27) one infers that $C_1(M) = 0$ unless $M \subseteq \{u, v\}$ for some edge $(u, v) \in E$. In particular, $C_1(M) = 0$ unless $|M|_c \le 2$. It proves the statement of the lemma for $p = 1$. Suppose the statement is proved for $p = 1, \ldots, q - 1$. From Eq. (28) one infers that $C_q(M)$ is a linear combination of terms like

$$x = C_{p_1}(M_1) \cdots C_{p_k}(M_k)\, \langle\Omega| a_M\, \hat{a}_{M_1}^\dagger \cdots \hat{a}_{M_k}^\dagger (V_{u,v}) |\Omega\rangle, \tag{48}$$

where $p_1 + \ldots + p_k = q - 1$. Let us figure out under what circumstances the matrix element in Eq. (48) can be non-zero.

Claim 3. Let $M, M_1, \ldots, M_k \subseteq L$ be non-empty sets, $N = M_1 \cup \ldots \cup M_k$, $(u, v) \in E$. Denote $y = \langle\Omega| a_M\, \hat{a}_{M_1}^\dagger \cdots \hat{a}_{M_k}^\dagger (V_{u,v}) |\Omega\rangle$. Then $y = 0$ unless the following conditions are met:
(i) Each set $M_1, \ldots, M_k$ contains at least one of the vertices $u, v$.
(ii) $N \setminus \{u, v\} \subseteq M \subseteq N \cup \{u, v\}$.

Remark. This claim is true even for $M = \emptyset$ if one adopts a convention $a_\emptyset = I$.

Proof. Suppose some set $M_j$ contains neither $u$ nor $v$. Then $a_{M_j}^\dagger$ commutes with $V_{u,v}$ as well as with all operators $a_{M_i}^\dagger$ for $i \ne j$. Thus $y = 0$. Suppose the condition $N \setminus \{u, v\} \subseteq M$ is violated. Then there exists a set $M_j$ and a vertex $w \in M_j$ such that $w \ne u, v$ and $w \notin M$. Thus $a_{M_j}^\dagger$ contains a factor $a_w^\dagger$ which commutes with all other operators involved in $y$. By moving $a_w^\dagger$ leftwards one can show that each of the $2^k$ terms in $y$ starts with $\langle\Omega| a_w^\dagger$, that is, $y = 0$. Suppose the condition $M \subseteq N \cup \{u, v\}$ is violated, that is, there exists a vertex $w \in M$ such that $w \notin N$ and $w \ne u, v$. Then the operator $a_M$ contains a factor $a_w$ which commutes with all other operators involved in $y$. By moving $a_w$ rightwards one can show that each of the $2^k$ terms in $y$ ends with $a_w |\Omega\rangle$, that is, $y = 0$.

Returning to Eq. (48) we conclude that $x = 0$ unless each set $M_j$ contains either $u$ or/and $v$, and $M \subseteq N \cup \{u, v\}$. Let $\tilde{M}_j$ be a linked cluster of minimal size containing $M_j$, that is, $|M_j|_c = |\tilde{M}_j|$. Let $\tilde{N} = \tilde{M}_1 \cup \ldots \cup \tilde{M}_k$ and $C = \tilde{N} \cup \{u\} \cup \{v\}$. Then $C$ is a linked cluster and $M \subseteq C$. Note that

$$|C| \le \sum_{j=1}^{k} |\tilde{M}_j| + 2 - k = \sum_{j=1}^{k} |M_j|_c + 2 - k,$$


where we have taken into account that each $\tilde{M}_j$ contains either $u$ or/and $v$. By the induction hypothesis we have $C_{p_j}(M_j) = 0$ unless $|M_j|_c \le p_j + 1$. Thus for any non-zero term $x$ one has

$$|C| \le \sum_{j=1}^{k} (p_j + 1) + 2 - k = q - 1 + k + 2 - k = q + 1.$$

Thus $C_q(M) = 0$ unless $|M|_c \le q + 1$.

4.2. Upper bound on the number of linked clusters. The following lemma asserts that the number of linked clusters of size $p$ containing a given vertex grows at most exponentially with $p$ (if the maximal degree of the graph $d$ is a constant). To the best of our knowledge, this lemma was originally proved in Ref. [23] by Aliferis, Gottesman, and Preskill in the context of fault-tolerant quantum computation.$^2$

Lemma 8. Let $N_p(u)$ be the number of linked clusters with $p$ vertices that contain a vertex $u$ and $N_p = \max_{u\in L} N_p(u)$. Then

$$N_p \le (4d)^{p-1}. \tag{49}$$

Proof. Let $\mathcal{T}_p(u)$ be a set of trees with $p$ vertices that contain a vertex $u$ (naturally, we consider only those trees that are subgraphs of $G$). Let $T_p(u) = |\mathcal{T}_p(u)|$ be the number of such trees. For any tree $T \in \mathcal{T}_p(u)$, the set of vertices of $T$ is a linked cluster that contains $u$. Conversely, if $M \ni u$ is a linked cluster, $|M| = p$, consider a subgraph $G_M$ induced by $M$. Then any spanning tree of $G_M$ belongs to $\mathcal{T}_p(u)$. Thus $N_p(u) \le T_p(u)$. Denote $T_p = \max_{u\in L} T_p(u)$. Obviously, $T_1 = 1$ and $T_2 \le d$. Let us prove that

$$T_p \le d \sum_{p_1+p_2 = p} T_{p_1} T_{p_2}, \tag{50}$$

where the convention $T_0 = 0$ is adopted. Indeed, for any edge $e$ incident to a vertex $u$ define a set $\mathcal{T}_p(u, e)$ that includes all trees $T \in \mathcal{T}_p(u)$ that contain an edge $e$. Let $T_p(u, e) = |\mathcal{T}_p(u, e)|$. Clearly,

$$\mathcal{T}_p(u) = \bigcup_e \mathcal{T}_p(u, e), \qquad T_p(u) \le \sum_e T_p(u, e) \le d \max_e T_p(u, e). \tag{51}$$

Let $e = (u, v)$ be the edge that achieves the maximum. Note that any tree $T \in \mathcal{T}_p(u, e)$ consists of the edge $(u, v)$ and two disjoint trees $T_1 \in \mathcal{T}_{p_1}(u)$ and $T_2 \in \mathcal{T}_{p_2}(v)$, where $p_1 + p_2 = p$. Thus we have an upper bound

$$T_p(u, e) \le \sum_{p_1+p_2 = p} T_{p_1}(u)\, T_{p_2}(v) \le \sum_{p_1+p_2 = p} T_{p_1} T_{p_2}.$$

Substituting it into Eq. (51) and taking the maximum over $u \in L$ we obtain Eq. (50). Define a sequence $S_1, S_2, \ldots$ such that

$$S_1 = 1, \qquad S_p = d \sum_{p_1+p_2 = p} S_{p_1} S_{p_2} \quad \text{for } p \ge 2. \tag{52}$$

$^2$ The authors became aware of it after completion of the present work.


Clearly, $T_1 = S_1 = 1$ and $T_2 \le d = S_2$. It follows that $T_p \le S_p$ for all $p$. In order to derive an explicit formula for $S_p$ define a generating function $S(x) = \sum_{p=1}^{\infty} S_p\, x^p$. It obeys an equation $S(x) = d\, S(x)^2 + x$, which implies

$$S(x) = \frac{1 - \sqrt{1 - 4dx}}{2d}.$$

Taking the derivatives one gets

$$S_p = \frac{1}{p!} \left. \frac{d^p S}{dx^p} \right|_{x=0} = -\frac{(4d)^p}{2d\,(p!)} \prod_{a=0}^{p-1} \left( a - \frac{1}{2} \right).$$

It follows that

$$S_p \le \frac{(4d)^p\, (p-1)!}{4d\,(p!)} \le \frac{(4d)^{p-1}}{p} \le (4d)^{p-1}.$$

Summarizing, $N_p \le T_p \le S_p \le (4d)^{p-1}$.
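The sequence $S_p$ from Eq. (52) can be tabulated directly, which makes the final bound easy to check for small $p$. The sketch below is an illustration only; the function name `cluster_bound_sequence` is hypothetical. It also compares the recursion with the closed form $S_p = d^{\,p-1}\,\mathrm{Cat}(p-1)$, where $\mathrm{Cat}(m) = \binom{2m}{m}/(m+1)$ are the Catalan numbers; this identification is not stated in the text and is added here as an independent consistency check following from the generating function $S(x)$.

```python
from math import comb


def cluster_bound_sequence(d, order):
    """Return [S_1, ..., S_order] with S_1 = 1 and S_p = d * sum_{p1+p2=p} S_p1 * S_p2."""
    s = [0, 1]  # s[0] = 0 by convention, s[1] = S_1
    for p in range(2, order + 1):
        s.append(d * sum(s[p1] * s[p - p1] for p1 in range(1, p)))
    return s[1:]


def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


if __name__ == "__main__":
    d = 3
    for p, s_p in enumerate(cluster_bound_sequence(d, 8), start=1):
        print(p, s_p, d ** (p - 1) * catalan(p - 1), (4 * d) ** (p - 1))
```

For each $p$ the first two printed numbers agree and stay below the third, which is the bound $(4d)^{p-1}$ of Lemma 8.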

4.3. Linked cluster expansion for the ground state energy. This subsection provides the necessary tools for computing spin-spin correlators. A reader interested only in computing the ground state energy can safely skip it. Let us consider a more general family of Hamiltonians for which the parameter $\epsilon$ may be different on different edges. Let a variable $\epsilon_{u,v}$ be assigned to an edge $(u, v) \in E$. For any subset of edges $A \subseteq E$ denote by $\epsilon[A]$ the collection of variables assigned to edges of $A$. The Hamiltonian is

$$H(\epsilon[E]) = H_0 + \sum_{(u,v)\in E} \epsilon_{u,v}\, V_{u,v}. \tag{53}$$

Let $E(\epsilon[E])$ be the ground state energy of $H(\epsilon[E])$. We shall consider the multivariate Taylor series for the function $E(\epsilon[E])$.

Lemma 9. The multivariate Taylor series for $E(\epsilon[E])$ at the point $\epsilon[E] = 0$ converges absolutely if $|\epsilon_{u,v}| \le \epsilon_0$ for all $(u, v) \in E$.

Proof. Consider the set of all $\epsilon[E]$ such that $|\epsilon_{u,v}| \le 2\epsilon_0$ for all $(u, v) \in E$. Let us first show that $E(\epsilon[E])$ is an analytic function of each individual variable $\epsilon_{u,v}$ on this set. Indeed, let $\tilde{E}$ be the set of all edges except $(u, v)$. Define an unperturbed Hamiltonian $\tilde{H}_0 = H_0 + \sum_{(u',v')\in\tilde{E}} \epsilon_{u',v'}\, V_{u',v'}$ and a perturbation $\epsilon_{u,v}\, V_{u,v}$. It follows from Corollary 4 that $\tilde{H}_0$ has a non-degenerate ground state and a spectral gap at least $\Delta/2$. Applying standard perturbation theory to the perturbed Hamiltonian $\tilde{H}_0 + \epsilon_{u,v}\, V_{u,v}$ we conclude that $E(\epsilon[E])$ is an analytic function of $\epsilon_{u,v}$ as long as the Weyl condition $\|\epsilon_{u,v}\, V_{u,v}\| < \Delta/4$ is satisfied. Since we assumed that $|\epsilon_{u,v}| \le 2\epsilon_0$, one has $\|\epsilon_{u,v}\, V_{u,v}\| < 2\epsilon_0 J = 2^{-17}\Delta/d < \Delta/4$. Thus $E(\epsilon[E])$ is analytic on this set with respect to each individual variable $\epsilon_{u,v}$. Repeatedly using Cauchy's formula one gets

$$E(\epsilon[E]) = \left( \prod_{(u,v)\in E} \frac{1}{2\pi i} \oint_{|z_{u,v}| = 2\epsilon_0} \frac{dz_{u,v}}{z_{u,v} - \epsilon_{u,v}} \right) E(z[E]). \tag{54}$$


Since $H_0$ and V are bounded operators, the absolute value $|E(z[E])|$ can be bounded by a constant (maybe depending on n). The Taylor series in $\epsilon_{u,v}$ at the point $\epsilon_{u,v} = 0$ for any factor $1/(z_{u,v} - \epsilon_{u,v})$ in Eq. (54) converges absolutely as long as $|\epsilon_{u,v}| < 2\epsilon_0$. Thus the Taylor series for $E(\epsilon[E])$ converges absolutely if $|\epsilon_{u,v}| \le \epsilon_0$ for all (u, v) ∈ E.

The Taylor series for $E(\epsilon[E])$ can be uniquely written in the form

$$E(\epsilon[E]) = \sum_{A\subseteq E} \Big( \prod_{(u,v)\in A} \epsilon_{u,v} \Big)\, p_A(\epsilon[A]), \tag{55}$$

where the sum is over all subsets of edges A and $p_A(\epsilon[A])$ is a series that involves only the variables $\epsilon_{u,v}$ pertaining to A. Clearly, the coefficients of $p_A(\epsilon[A])$ are functionals of the interactions $V_{u,v}$ with (u, v) ∈ A only. The main goal of this section is to show that the expansion Eq. (55) involves only linked clusters of edges. Let us first define this notion.

Definition 4. A subset of edges A ⊆ E is called a linked cluster iff the subset of vertices induced by A is a linked cluster.

Lemma 10. The series Eq. (55) involves only linked clusters of edges A.

Proof. Suppose A ⊆ E is not a linked cluster of edges. Let M ⊆ L be the set of vertices induced by A. Since M is not a linked cluster, it can be represented as a disjoint union $M = M_1 \cup M_2$, where $M_1, M_2 \subseteq L$, $M_1 \cap M_2 = \emptyset$, and no edge connects $M_1$ and $M_2$. Accordingly, A can be represented as a union $A = A_1 \cup A_2$, where $A_1$ and $A_2$ are the sets of edges inducing $M_1$ and $M_2$ respectively. Let us choose the variables $\epsilon[E]$ such that $\epsilon_{u,v} = 0$ unless (u, v) ∈ A. Then it is clear that the Hamiltonian $H(\epsilon[E])$ splits into a sum of three terms acting on non-overlapping sets of qubits:

$$H(\epsilon[E]) = H_1 + H_2 + H_{\rm else}, \qquad H_j = \sum_{u\in M_j} \Delta_u\, |1\rangle\langle 1|_u + \sum_{(u,v)\in A_j} \epsilon_{u,v} V_{u,v}, \qquad H_{\rm else} = \sum_{u\in L\setminus(M_1\cup M_2)} \Delta_u\, |1\rangle\langle 1|_u.$$

The ground state energy of $H(\epsilon[E])$ is equal to the sum of the ground state energies of $H_1$, $H_2$, and $H_{\rm else}$. It implies that

$$E(\epsilon[E]) = E(\epsilon[A_1]) + E(\epsilon[A_2]). \tag{56}$$

If we assume that $p_A(\epsilon[A]) \neq 0$, then $E(\epsilon[E])$ would include at least one monomial involving variables from both sets $A_1$, $A_2$, which contradicts Eq. (56).

The following implication of Lemma 10 will simplify the computation of spin-spin correlators.

Corollary 6. Consider a Hamiltonian $H = H_0 + \epsilon V$, where $V = \sum_{(u,v)\in E} V_{u,v}$. Let $E(\epsilon) = \sum_{p=1}^{\infty} E_p\,\epsilon^p$ be the series for the ground state energy of H. Suppose the interaction $V_{s,t}$ depends on a parameter η for some edge (s, t) ∈ E. Then the derivative

$$K_p = \left.\frac{\partial E_p}{\partial \eta}\right|_{\eta=0}$$

can be computed by setting $V_{u,v} = 0$ for all edges (u, v) having distance p + 1 or greater from the edge (s, t).


S. Bravyi, D. DiVincenzo, D. Loss

Proof. Indeed, $E_p$ can be obtained from Eq. (55) by setting $\epsilon_{u,v} = \epsilon$ on all edges, restricting the sum to linked clusters A of size at most p and collecting all monomials of total degree p. Clusters A that do not contain the edge (s, t) will not contribute to $K_p$. Clusters A that contain the edge (s, t) cannot contain any edge (u, v) having distance p + 1 or greater from the edge (s, t).

5. Computational Algorithms

In this section we describe an algorithm that takes as input a description of the Hamiltonians $H_0$, V and an integer p. The algorithm returns a list of coefficients $E_1, \dots, E_p$ in the series for the ground state energy $E(\epsilon) = \sum_{p=1}^{\infty} E_p\,\epsilon^p$. The running time of the algorithm is n exp(O(p)). In Sect. 5.3 we describe a generalization of the algorithm that allows one to compute spin-spin correlation functions.

5.1. Computing the coefficients $C_p(M)$. The first part of the algorithm is to compute the coefficients $C_q(M)$, $\emptyset \neq M \subseteq L$, sequentially for q = 1, . . . , p using the solution of the Kirkwood-Thomas equations, see Eqs. (27,28). This gives an approximate description of the ground state. We shall store triples $(M, q, C_q(M))$ in n bins (memory registers) $B_u$ labeled by vertices of the graph u ∈ L. Once a coefficient $C_q(M)$ is computed, the triple $(M, q, C_q(M))$ is placed into every bin $B_u$ for which u ∈ M. From Lemma 7 we learn that $C_q(M) = 0$ unless M is a subset of some linked cluster $\tilde M$ of size at most q + 1. According to Lemma 8, the number of linked clusters $\tilde M$ of size q + 1 containing vertex u is bounded by exp(O(q)), where the coefficient in the exponent depends only on d. Each linked cluster of size q + 1 containing vertex u has $2^q$ subsets containing vertex u. Thus we can bound the number of entries in the bin $B_u$ at the moment when all coefficients $C_1(M), \dots, C_p(M)$ have been computed as $|B_u| \le \sum_{q=1}^{p} 2^q \exp(O(q)) = \exp(O(p))$. Suppose we have already computed all non-zero coefficients $C_1(M), \dots, C_{q-1}(M)$, M ⊆ L.
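The bin bookkeeping described above can be sketched in a few lines of Python. This is a hypothetical illustration rather than the authors' implementation: the class name `Bins`, its methods, and the coefficient values are assumptions introduced here, and a set M is represented as a frozenset of integer vertex labels.

```python
from collections import defaultdict


class Bins:
    """Hypothetical container with one bin B_u per vertex u."""

    def __init__(self):
        # bins[u] holds the triples (M, q, C_q(M)) for which u is in M
        self.bins = defaultdict(list)

    def store(self, M, q, value):
        """Place the triple (M, q, C_q(M)) into every bin B_u with u in M."""
        M = frozenset(M)
        for u in M:
            self.bins[u].append((M, q, value))

    def candidates(self, u, v):
        """Triples found in B_u or B_v, with duplicates removed."""
        seen = {}
        for triple in self.bins[u] + self.bins[v]:
            seen[(triple[0], triple[1])] = triple
        return list(seen.values())


bins = Bins()
bins.store({0, 1}, 1, 0.25)    # made-up coefficient values, illustration only
bins.store({1, 2}, 1, -0.5)
bins.store({3}, 2, 0.125)
```

Looking up `bins.candidates(u, v)` returns only the stored sets that touch the edge (u, v), which is the kind of lookup the bins are meant to make cheap when the sums for a fixed edge are evaluated.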
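The next step is to compute the coefficients $C_q(M)$ for all sets M ⊆ L satisfying the condition of Lemma 7, that is $|M|_c \le q + 1$. Expanding Eq. (28) one gets

$$C_q(M) = E_0(M)^{-1} \sum_{(u,v)\in E} \sum_{k=1}^{4} \frac{1}{k!} \sum_{p_1+\dots+p_k=q-1}\ \sum_{M_1,\dots,M_k\subseteq L} C_{p_1}(M_1)\cdots C_{p_k}(M_k)\, \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle, \tag{57}$$

where $|0^n\rangle$ denotes the state with all n qubits in $|0\rangle$.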

Note that the right hand side of this equation involves only coefficients $C_{p_j}(M_j)$ that have been already computed. For simplicity let us assume that computation of any term in Eq. (57) requires one unit of time.$^3$ Denote

$$x = \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle. \tag{58}$$

Recall, see Claim 3, that x = 0 unless the following conditions are met:
(i) Each set $M_1, \dots, M_k$ contains at least one of the vertices u, v.
(ii) $N\setminus\{u,v\} \subseteq M \subseteq N\cup\{u,v\}$, where $N = M_1\cup\dots\cup M_k$.

$^3$ This assumption might seem unjustified, because the precision up to which the coefficient $C_q(M)$ must be computed depends upon δ. However, taking into account these subtleties will lead to an additional overhead poly(log n, log $\delta^{-1}$) which can be neglected since the algorithm has running time poly(n, $\delta^{-1}$).


The property (i) implies that for a fixed edge (u, v) we can restrict the three rightmost sums in Eq. (57) by taking triples $(M_j, p_j, C_{p_j}(M_j))$ either from the bin $B_u$ or from the bin $B_v$. Therefore for a fixed (u, v), the overall number of non-zero terms in the three rightmost sums in Eq. (57) can be bounded by $(|B_u| + |B_v|)^k \le (|B_u| + |B_v|)^4 = \exp(O(q))$.

We shall now prove that only a small number of edges (u, v) can give a non-zero contribution to $C_q(M)$. Indeed, there are two cases: (1) M ⊆ {u, v}; (2) there exists w ∈ M such that w ∉ {u, v}. Clearly only O(1) edges (u, v) can lead to case (1), so let us focus on case (2). Consider any term x as in Eq. (58). Properties (i),(ii) above imply that x = 0 unless there exists a set $M_j$ such that $w \in M_j$ and one of the vertices u, v belongs to $M_j$. Without loss of generality, $w, u \in M_j$. Lemma 7 implies that $C_{p_j}(M_j) = 0$ unless $|M_j|_c \le p_j + 1 \le q$. Therefore the distance between u and w is at most q. Taking into account that $|M| \le |M|_c \le q + 1$, we can bound the number of edges (u, v) that can give a non-zero contribution to $C_q(M)$ by $|M|\, d^{q+1} \le (q + 1)\, d^{q+1} = \exp(O(q))$. Summarizing, the overall number of non-zero terms in Eq. (57) is exp(O(q)). In order to compute all non-zero coefficients $C_q(M)$ we will have to repeat the procedure above for each subset M satisfying the condition of Lemma 7, that is $|M|_c \le q + 1$. By Lemma 8, the number of such sets is n exp(O(q)). Summarizing, the overall time one needs to compute all the coefficients $C_1(M), \dots, C_p(M)$ is n exp(O(p)).

5.2. Computing the ground state energy. The final step of the algorithm is to compute the coefficients $E_1, \dots, E_p$ using Eq. (29). This equation can be expanded as

$$E_p = \sum_{(u,v)\in E} \sum_{k=1}^{4} \frac{1}{k!} \sum_{p_1+\dots+p_k=p-1}\ \sum_{M_1,\dots,M_k\subseteq L} C_{p_1}(M_1)\cdots C_{p_k}(M_k)\, \langle 0^n|\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle. \tag{59}$$

Expanding the commutators in the matrix element one gets $2^k$ terms. However, only the term in which $V_{u,v}$ is the leftmost operator gives a non-zero contribution. Thus

$$\langle 0^n|\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle = (-1)^k\, \langle 0^n|\, V_{u,v}\, a^\dagger_{M_k}\cdots a^\dagger_{M_1}\,|0^n\rangle.$$

In particular, we can restrict the summation over $M_1, \dots, M_k$ to subsets of {u, v} only. There are only 3 such subsets: {u}, {v}, and {u, v}. This observation implies that the number of terms in the rightmost sum in Eq. (59) is O(1). Since there are $O(p^3)$ partitions $p_1 + \dots + p_k = p - 1$, k ≤ 4, the overall number of terms in Eq. (59) is $O(np^3)$. Combining the results of Subsects. 5.1, 5.2 we conclude that the overall time needed to compute the coefficients $E_1, \dots, E_p$ scales as n exp(O(p)).

In the above analysis we did not keep track of the coefficients in the exponents O(p). If one computes the exact coefficient, it yields the overall running time $n\,2^{(15+6\log(d))\,p}$, where log stands for the base two logarithm. Accordingly, the running time as a function of n and δ scales as $T(n, \delta) \sim n\,(n\delta^{-1})^{15+6\log(d)}$. For example, implementing the algorithm on a 2D square lattice (d = 4) would require a running time $T(n, \delta) \sim n\,(n\delta^{-1})^{27}$, which is certainly not practical. Note however, that the power of $n\delta^{-1}$ depends upon the ratio $|\epsilon|/R$, where R is the convergence radius of the series Eq. (4). The power 15 + 6 log(d) corresponds to the most pessimistic scenario $R = 2\epsilon_0$ (the best lower bound on the convergence radius that we can prove) and $|\epsilon| = \epsilon_0$.
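As a quick arithmetic check of the estimates above, the following Python sketch evaluates the exponent 15 + 6 log(d) and counts the index tuples $(p_1, \dots, p_k)$ appearing in Eq. (59). The function names are hypothetical, and the count assumes that every $p_j$ is a positive integer, which the text does not state explicitly.

```python
from math import comb, log2


def runtime_exponent(d):
    """Power of n/delta in T(n, delta) ~ n * (n/delta)**(15 + 6*log2(d))."""
    return 15 + 6 * log2(d)


def count_index_tuples(total, max_k=4):
    """Count tuples (p_1, ..., p_k), k = 1..max_k, with p_1 + ... + p_k = total.

    Assumption (not stated in the text): every p_j is a positive integer,
    so for fixed k the count is the binomial coefficient C(total-1, k-1).
    """
    if total < 1:
        return 0
    return sum(comb(total - 1, k - 1) for k in range(1, max_k + 1))


exponent_square_lattice = runtime_exponent(4)   # lattice degree d = 4
tuples_for_p_11 = count_index_tuples(10)        # p - 1 = 10
```

For d = 4 the exponent evaluates to 27, in agreement with the square-lattice estimate, and under the stated assumption the tuple count is dominated by the k = 4 term, which grows cubically in p.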


5.3. Computing spin-spin correlation functions. Let s, t ∈ L be any pair of vertices. It may or may not be an edge of the graph G. Let us add (s, t) to the set of edges E (by creating a double edge between s and t if necessary). The modified graph has maximal degree $d^* = d + 1$. Let $O_{s,t}$ be a Hermitian operator acting non-trivially only on qubits s, t. We shall assume that $\|O_{s,t}\| \le J$. The quantity we are interested in is the expectation value

$$K = \frac{\langle\psi|O_{s,t}|\psi\rangle}{\langle\psi|\psi\rangle},$$

where $|\psi\rangle$ is the ground state of $H(\epsilon) = H_0 + \epsilon V$. Our goal is to compute K with a specified precision δ. To this end let us define a Hamiltonian

$$H(\epsilon, \eta) = H_0 + \epsilon V + \eta\, O_{s,t}. \tag{60}$$

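Let $E(\epsilon, \eta)$ be the smallest eigenvalue of $H(\epsilon, \eta)$. As we know from Lemma 9, the Taylor series

$$E(\epsilon, \eta) = \sum_{p,q=0}^{\infty} E_{p,q}\, \epsilon^p \eta^q \tag{61}$$

converges absolutely for $|\epsilon|, |\eta| \le \epsilon_0^*$, where

$$\epsilon_0^* = \frac{2^{-18}\Delta}{d^* J} = \frac{2^{-18}\Delta}{(d+1)J}.$$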

The Hellmann-Feynman theorem asserts that

$$K = \left.\frac{\partial E(\epsilon, \eta)}{\partial \eta}\right|_{\eta=0} = \sum_{p=0}^{\infty} E_{p,1}\, \epsilon^p. \tag{62}$$
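The identity in Eq. (62) can be checked numerically on a toy model. The sketch below is an illustration only: it uses a single qubit with made-up 2×2 matrices for $H_0$, V and O (whereas $O_{s,t}$ in the text acts on two qubits), and compares a finite-difference derivative of the ground state energy with the expectation value of O in the ground state.

```python
from math import hypot

GAP = 1.0                          # Delta in H0 = Delta |1><1|
V = [[0.0, 1.0], [1.0, 0.0]]       # toy interaction (assumption)
O = [[1.0, 0.3], [0.3, -1.0]]      # toy observable (assumption)


def hamiltonian(eps, eta):
    """H(eps, eta) = H0 + eps*V + eta*O as a real symmetric 2x2 matrix."""
    h0 = [[0.0, 0.0], [0.0, GAP]]
    return [[h0[i][j] + eps * V[i][j] + eta * O[i][j] for j in range(2)]
            for i in range(2)]


def ground_state(h):
    """Smallest eigenvalue and a normalized eigenvector of a symmetric 2x2 matrix."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    energy = (a + d) / 2 - hypot((a - d) / 2, b)
    if b == 0.0:
        return energy, ((1.0, 0.0) if a <= d else (0.0, 1.0))
    x, y = b, energy - a           # solves (a - E) x + b y = 0
    norm = hypot(x, y)
    return energy, (x / norm, y / norm)


eps, step = 0.1, 1e-6
_, psi = ground_state(hamiltonian(eps, 0.0))
# <psi|O|psi> for the normalized ground state at eta = 0
expectation = sum(psi[i] * O[i][j] * psi[j] for i in range(2) for j in range(2))
# central finite difference of E(eps, eta) with respect to eta at eta = 0
derivative = (ground_state(hamiltonian(eps, step))[0]
              - ground_state(hamiltonian(eps, -step))[0]) / (2 * step)
```

The two numbers agree up to the finite-difference error, which is what makes it possible to read K off from the coefficients of the energy series instead of constructing the ground state explicitly.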

Our algorithm will get an approximation to K by computing a truncation of the series in Eq. (62). The following lemma provides a bound on the error resulting from the truncation.

Lemma 11. Suppose $|\epsilon| \le \epsilon_0^*/(2d)$. Then

$$\Big| K - \sum_{q=0}^{p} E_{q,1}\, \epsilon^q \Big| \le 2^{-16-p}\, J\, d(d+1). \tag{63}$$

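Proof. Let us first prove that

$$|E_{p,1}| \le \frac{2^{-16}\Delta\, d^{p+1}}{(\epsilon_0^*)^{p+1}}. \tag{64}$$

Indeed, use Cauchy's formula:

$$E_{p,1} = \frac{1}{(2\pi i)^2} \oint_{|\epsilon|=\epsilon_0^*} \oint_{|\eta|=\epsilon_0^*} \frac{E(\epsilon, \eta)}{\epsilon^{p+1}\eta^2}\, d\epsilon\, d\eta. \tag{65}$$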


From Lemma 6 we infer that $|E(\epsilon, \eta)| \le 2^{-16}\Delta\, n$. However, we would like to have an upper bound independent of n. To this end we employ Corollary 6 according to which $E_{p,1}$ can be computed by restricting the Hamiltonian to the (p + 1)-neighborhood of the edge (s, t). The number of spins in this neighborhood is at most $n^* = d^{p+1}$. Therefore $|E(\epsilon, \eta)| \le 2^{-16}\Delta\, d^{p+1}$. Substituting this bound into Eq. (65) one gets Eq. (64). Finally, using the condition $|\epsilon| \le \epsilon_0^*/(2d)$ we bound the sum $\sum_{q=p+1}^{\infty} |E_{q,1}|\,|\epsilon|^q$ as in Eq. (63).

Lemma 11 shows that in order to compute K with an absolute error δ it is enough to compute the coefficients $E_{0,1}, E_{1,1}, \dots, E_{p,1}$ in the series Eq. (61) with $p = \log(\delta^{-1}) + O(1)$. Computation of the coefficients $E_{p,1}$ requires only minor modifications of the algorithm described in Sects. 5.1, 5.2. Indeed, consider a function $\tilde E(\epsilon, \eta) = E(\epsilon, \epsilon\eta)$. Using the series Eq. (61) one gets

$$\tilde E(\epsilon, \eta) = \sum_{r=1}^{\infty} \tilde E_r(\eta)\, \epsilon^r, \qquad \tilde E_r(\eta) = \sum_{p+q=r} E_{p,q}\, \eta^q.$$

In particular,

$$E_{p,1} = \left.\frac{\partial \tilde E_{p+1}(\eta)}{\partial \eta}\right|_{\eta=0}. \tag{66}$$

On the other hand, $\tilde E(\epsilon, \eta)$ is the ground state energy of the Hamiltonian $H_0 + \epsilon(V + \eta\, O_{s,t})$. Thus we can compute the coefficients $\tilde E_1(\eta), \dots, \tilde E_{p+1}(\eta)$ using the already available algorithm for the ground state energy. Moreover, from Corollary 6 we know that the coefficients $E_{0,1}, E_{1,1}, \dots, E_{p,1}$ can be computed by restricting the Hamiltonian to the (p + 1)-neighborhood of the edge (s, t). Thus we can apply Theorem 3 with n replaced by $n^* = d^{p+1}$, obtaining an algorithm with a running time exp(O(p)) for computing $\tilde E_1(\eta), \dots, \tilde E_{p+1}(\eta)$. In fact, at every step of this algorithm we have to retain only the terms independent of η and the terms linear in η, see Eq. (66). Since we have chosen $p = \log(\delta^{-1}) + O(1)$, the running time of the algorithm is poly($\delta^{-1}$).

6. Discussion and Open Problems

We have proved that the ground state properties of a spin Hamiltonian with sufficiently weak interactions between qubits can be computed efficiently. We hope that this result could be generalized in several different directions. Firstly, one could try to consider a more general class of unperturbed Hamiltonians $H_0$, for example, classical Ising-like Hamiltonians. In addition, one could consider systems of fermionic modes rather than spins. Secondly, one could investigate possible generalizations of the Kirkwood-Thomas ansatz to the case of a degenerate ground state. In this case the ansatz should be constructed for an effective Hamiltonian acting on a low-energy subspace rather than for the ground state. Results of this kind could provide a rigorous basis for perturbative derivations of low-energy effective Hamiltonians, for example the mapping from the half-filled Hubbard model to the Heisenberg model. Thirdly, one could try to get a stronger lower bound on the convergence radius R of the series $E(\epsilon) = \sum_{p=1}^{\infty} E_p\, \epsilon^p$. We note that a stronger lower bound $R \ge \Delta/(dJ)$ can be easily obtained for classical Hamiltonians, when all


interactions $V_{u,v}$ are diagonal in the $|0\rangle$, $|1\rangle$ basis. Therefore, one could speculate that in the quantum case R should be close to $\Delta/(dJ)$.

Acknowledgements. The authors gratefully acknowledge useful discussions with Panos Aliferis, Barbara Terhal, and Frank Verstraete. S.B. and D.D. acknowledge support by the DTO through ARO contract number W911NF-04-C-0098, and D.L. by the Swiss NF and the NCCR Nanoscience.

Appendix A

In this section we prove Lemma 3. By definition of the norm, $\|C\|_1 = \max_{u\in L} Y_u$, where

$$Y_u = \sum_{M\ni u} E_0(M)^{-1}\, \big| \langle 0^n|\, a_M\, \hat C_1\cdots \hat C_k(V)\,|0^n\rangle \big|.$$

Here $|0^n\rangle$ denotes the state with all n qubits in $|0\rangle$. Applying the triangle inequality one can bound $Y_u$ as

$$Y_u \le X_u := \sum_{M\ni u} E_0(M)^{-1} \sum_{(v,w)\in E}\ \sum_{M_1,\dots,M_k} \big| \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{v,w})\,|0^n\rangle \big| \times |C_1(M_1)|\cdots|C_k(M_k)|. \tag{67}$$

Here the last sum is over all non-empty subsets $M_1, \dots, M_k \subseteq L$. Claim 3 allows one to restrict the summation in Eq. (67) to tuples $(M, M_1, \dots, M_k, v, w)$ satisfying conditions (i),(ii). We shall partition $X_u$ into k + 1 (possibly overlapping) sums that will be dealt with separately. We define $X_u^{(j)}$, j = 1, . . . , k, as the sum of all terms in Eq. (67) for which $u \in M_j$. We define $X_u^{(0)}$ as the sum of all terms in Eq. (67) for which u ∈ {v, w}. In other words,

$$X_u^{(j)} = \sum_{M\ni u} E_0(M)^{-1} \sum_{(v,w)\in E}\ \sum_{M_1,\dots,M_k} \chi_{M_j}(u)\, \big| \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{v,w})\,|0^n\rangle \big| \times |C_1(M_1)|\cdots|C_k(M_k)|,$$

where $\chi_{M_j}$ is the characteristic function$^4$ of $M_j$ and

$$X_u^{(0)} = \sum_{M\ni u} E_0(M)^{-1} \sum_{v\,:\,(u,v)\in E}\ \sum_{M_1,\dots,M_k} \big| \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle \big| \times |C_1(M_1)|\cdots|C_k(M_k)|.$$

Condition (ii) in Claim 3 implies that $u \in M \subseteq N \cup \{v, w\}$, so that each term in Eq. (67) appears at least once in the sums $X_u^{(0)}, \dots, X_u^{(k)}$, hence

$$X_u \le \sum_{j=0}^{k} X_u^{(j)}. \tag{68}$$

$^4$ That is, $\chi_{M_j}(u) = 1$ if $u \in M_j$ and $\chi_{M_j}(u) = 0$ otherwise.

Upper bound on $X^{(j)}$, 1 ≤ j ≤ k. The property (i) in Claim 3 implies that at least one end-point of the edge (v, w) belongs to $M_j$. W.l.o.g. $v \in M_j$. Then property (ii)


implies $M_j \subseteq M \cup \{w\}$, so that $|M_j| \le 2|M|$ (recall that M is a non-empty set because u ∈ M). It gives us a bound $E_0(M) \ge \Delta|M| \ge (\Delta/2)|M_j|$. Note also that for any fixed $M_1, \dots, M_k$ and v, w there exist at most four sets M satisfying condition (ii) of Claim 3 (take N and add/subtract vertices v and w). Therefore

$$X_u^{(j)} \le \frac{8}{\Delta} \sum_{(v,w)\in E}\ \sum_{M_1,\dots,M_k} \frac{1}{|M_j|}\, \chi_{M_j}(u)\chi_{M_j}(v)\, \max_M \big| \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{v,w})\,|0^n\rangle \big| \times |C_1(M_1)|\cdots|C_k(M_k)|.$$

Now we can bound the matrix element by $2^k J$ and add a restriction $M_i \cap \{v, w\} \neq \emptyset$ to the summations over sets $M_i$, i ≠ j, see Claim 3, property (i). Taking into account that

$$\sum_{M_i\,:\,M_i\cap\{v,w\}\neq\emptyset} |C_i(M_i)| \le \sum_{M_i\ni v} |C_i(M_i)| + \sum_{M_i\ni w} |C_i(M_i)| \le 2\|C_i\|_1, \tag{69}$$

we arrive at

$$X_u^{(j)} \le \frac{2^{2k+2} J}{\Delta} \prod_{i\neq j} \|C_i\|_1 \sum_{(v,w)\in E}\ \sum_{M_j} \frac{1}{|M_j|}\, \chi_{M_j}(u)\chi_{M_j}(v)\, |C_j(M_j)|.$$

Changing the order of summations and bounding the sum over (v, w) by $d|M_j|$ one gets

$$X_u^{(j)} \le \frac{2^{2k+2} dJ}{\Delta} \prod_{i\neq j} \|C_i\|_1 \sum_{M_j} \chi_{M_j}(u)\, |C_j(M_j)| \le \frac{2^{2k+2} dJ}{\Delta} \prod_{i=1}^{k} \|C_i\|_1.$$

Finally, Lemma 2 implies that it is enough to consider k ≤ 4, so that

$$\sum_{j=1}^{k} X_u^{(j)} \le \frac{2^{12} dJ}{\Delta} \prod_{j=1}^{k} \|C_j\|_1. \tag{70}$$

Upper bound on $X^{(0)}$. Claim 3 implies that for any fixed $(M_1, \dots, M_k, v)$ there exist at most four sets M satisfying (ii). Using a bound $E_0(M) \ge \Delta$ we arrive at

$$X_u^{(0)} \le \frac{4}{\Delta} \sum_{v\,:\,(u,v)\in E}\ \sum_{M_1,\dots,M_k} \max_M \big| \langle 0^n|\, a_M\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle \big|\, |C_1(M_1)|\cdots|C_k(M_k)|.$$

Claim 3, property (i) allows us to bound the matrix element by $2^k J$ and add a restriction $M_i \cap \{u, v\} \neq \emptyset$ to the summations over sets $M_i$. Using Eq. (69) we arrive at

$$X_u^{(0)} \le \frac{2^{2k+2} J}{\Delta} \prod_{j=1}^{k} \|C_j\|_1 \sum_{v\,:\,(u,v)\in E} 1 \le \frac{2^{10} dJ}{\Delta} \prod_{j=1}^{k} \|C_j\|_1. \tag{71}$$

Combining Eqs. (68,70,71) we prove the upper bound Eq. (19). The second bound Eq. (20) of Lemma 3 is much easier to prove. Applying the triangle inequality one gets

$$\big| \langle 0^n|\, \hat C_1\cdots \hat C_k(V_{u,v})\,|0^n\rangle \big| \le \sum_{M_1,\dots,M_k} \big| \langle 0^n|\, \hat a^\dagger_{M_1}\cdots \hat a^\dagger_{M_k}(V_{u,v})\,|0^n\rangle \big|\, |C_1(M_1)|\cdots|C_k(M_k)|.$$


Clearly the matrix element is zero unless $M_j \subseteq \{u, v\}$ for all j. Expanding the commutators one gets $2^k$ terms, but only the term in which all creation operators stand on the right of $V_{u,v}$ gives a non-zero contribution. Taking into account that

$$\sum_{M_j\,:\,M_j\subseteq\{u,v\}} |C_j(M_j)| \le 2\|C_j\|_1,$$

one arrives at

$$\big| \langle 0^n|\, \hat C_1\cdots \hat C_k(V_{u,v})\,|0^n\rangle \big| \le 2^k J \prod_{j=1}^{k} \|C_j\|_1 \le 2^4 J \prod_{j=1}^{k} \|C_j\|_1,$$

where we have applied Lemma 2 to argue that k ≤ 4.

Communicated by M.B. Ruskai

Commun. Math. Phys. 284, 509–535 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0602-6

Communications in

Mathematical Physics

$L^1$ Stability of Spatially Periodic Solutions in Relativistic Gas Dynamics

Daniela Calvo1, Rinaldo M. Colombo2, Hermano Frid3

1 Dipartimento di Matematica, Università degli studi di Torino, Via Carlo Alberto 10, 10123 Torino, Italy. E-mail: [email protected]

2 Dipartimento di Matematica, Università degli studi di Brescia, Via Branze 38, 25123 Brescia, Italy. E-mail: [email protected]

3 Instituto de Matemática Pura e Aplicada-IMPA, Estrada Dona Castorina 110, 22460-320 Rio de Janeiro, Brazil. E-mail: [email protected]

Received: 16 October 2007 / Accepted: 14 May 2008 Published online: 7 August 2008 – © Springer-Verlag 2008

Abstract: This paper proves the well posedness of spatially periodic solutions of the relativistic isentropic gas dynamics equations. The pressure is given by a γ-law with initial data of large amplitude, provided γ − 1 is sufficiently small. As a byproduct of our techniques, we obtain the same results for the classical case. At the limit c → +∞, the solutions of the relativistic system converge to the solutions of the classical one, the convergence rate being $1/c^2$. We also construct the semigroup of solutions of the Cauchy problem for initial data with bounded total variation, which can be large, as long as γ − 1 is small.

1. Introduction

We consider the 2 × 2 hyperbolic system of conservation laws describing the one-dimensional motion of an isentropic relativistic gas in Euler coordinates, which reads

$$\begin{cases} \partial_t \left( \rho\, \dfrac{1 + \left(\frac{v}{c}\right)^2 \frac{p(\rho)}{c^2\rho}}{1 - \left(\frac{v}{c}\right)^2} \right) + \partial_x \left( \rho\, v\, \dfrac{1 + \frac{p(\rho)}{c^2\rho}}{1 - \left(\frac{v}{c}\right)^2} \right) = 0, \\[3ex] \partial_t \left( \rho\, v\, \dfrac{1 + \frac{p(\rho)}{c^2\rho}}{1 - \left(\frac{v}{c}\right)^2} \right) + \partial_x \left( \dfrac{\rho\, v^2 + p(\rho)}{1 - \left(\frac{v}{c}\right)^2} \right) = 0. \end{cases} \tag{1.1}$$

Here, ρ is the gas density, v its velocity and p the pressure. We consider the case of a polytropic gas, in which the pressure is given by the so-called γ-law, $p(\rho) = \zeta^2\rho^\gamma$, with 1 ≤ γ < 2. The main result of this paper states the existence of a Standard Riemann Semigroup (SRS, cf. [4]) of periodic solutions to (1.1), which may have large amplitude, provided γ − 1 is sufficiently small. In particular, this means that the initial value problem with


D. Calvo, R. M. Colombo, H. Frid

periodic initial data is well posed in $L^1$ globally in time, as long as γ − 1 is sufficiently small. In this case, the total variation per period of the initial data may be taken arbitrarily large, according to the smallness of γ − 1. While proving the $L^1$-stability of periodic solutions, we also construct the SRS for the Cauchy problem. So, our results contain, in particular, the existence of a SRS for the Cauchy problem, with initial data with arbitrarily large amplitude and total variation, as long as γ − 1 is sufficiently small. The above system has been considered in the literature by many authors, such as [9,10,15] and, in the case γ = 1, [3,11,18]. It is immediate to see that in the classical limit c → +∞, system (1.1) formally converges to the classical Euler equations of isentropic gas dynamics

$$\begin{cases} \partial_t \rho + \partial_x (\rho\, v) = 0, \\ \partial_t (\rho\, v) + \partial_x \left( \rho\, v^2 + p(\rho) \right) = 0, \end{cases} \qquad p(\rho) = \zeta^2\rho^\gamma. \tag{1.2}$$

The present analytical techniques apply (more easily) also to the non-relativistic case (1.2) yielding, in particular, the well posedness of the solutions constructed in [16,18] and, more importantly, the well posedness of the periodic solutions constructed in [15]. Furthermore, we prove that in the classical limit c → +∞, the SRS generated by (1.1) converges to that of (1.2), the rate of convergence being $1/c^2$, recovering, in particular, the results in [9]. In the next section we state the main results of the paper, and at the end we describe the sections along which the main results, as well as the additional results concerning the non-relativistic case and the limit as c → +∞, are presented.

2. Statements of the Main Results

Bakhvalov introduced in [2] a class of 2 × 2 strictly hyperbolic and genuinely non-linear systems, characterized by the particular geometry of the shock curves in the plane of Riemann invariants, for which a global existence result can be proved for initial data with large oscillation and only locally bounded total variation. Namely, consider a strictly hyperbolic, genuinely nonlinear 2 × 2 system

$$\partial_t u + \partial_x f(u) = 0, \tag{2.1}$$

where $u = (u_1, u_2)$ and $f(u) = (f_1(u), f_2(u))$. Let z, w be a pair of Riemann invariants for (2.1) such that the map $(u_1, u_2) \to (z, w)$ is one-to-one in its domain. Parametrize the shock curves of the first and second family by

$$\begin{array}{ll} z = R_1(w; z_0, w_0),\ w \le w_0; & z = L_1(w; z_0, w_0),\ w \ge w_0; \\ z = R_2(w; z_0, w_0),\ w \le w_0; & z = L_2(w; z_0, w_0),\ w \ge w_0. \end{array} \tag{2.2}$$

In (2.2), the state (z, w) can be connected on the left by $L_i$ and on the right by $R_i$ to $(z_0, w_0)$ by a shock of the i-th family. For fixed W, Z ∈ R, let

$$\Omega = \{(z, w) : z \ge Z \text{ and } w \le W\}. \tag{2.3}$$

The next hypotheses impose conditions on the shock curves under which the solvability of the Cauchy problem with locally bounded variation is obtained.

Stability of Periodic Solutions in Relativistic Gas Dynamics

A1: $\max_{i=1,2}\ \sup_{(z,w)\in\Omega} |\lambda_i(z, w)| < \infty$.

A2: For all $(z, w) \in \Omega$ with $w \neq w_0$, $1 < \dfrac{\partial R_2}{\partial w}, \dfrac{\partial L_2}{\partial w} < +\infty$ and $0 < \dfrac{\partial R_1}{\partial w}, \dfrac{\partial L_1}{\partial w} < 1$.

A3: For i = 1, 2, let $z_r = R_i(w_r; z_l, w_l)$. Then the shock curves $z = R_i(w; z_l, w_l)$, for $w \le w_l$, and $z = L_i(w; z_r, w_r)$, for $w \ge w_r$, intersect only in the points $(z_l, w_l)$, $(z_r, w_r)$.

A4: If four points $(z_l, w_l)$, $(z_r, w_r)$, $(z_m, w_m)$ and $(\hat z_m, \hat w_m)$ satisfy $z_m = R_2(w_m; z_l, w_l)$, $z_r = R_1(w_r; z_m, w_m)$, $\hat z_m = R_1(\hat w_m; z_l, w_l)$ and $z_r = R_2(w_r; \hat z_m, \hat w_m)$, then $(z_l - \hat z_m) + (\hat w_m - w_r) \le (w_l - w_m) + (z_m - z_r)$.

System (2.1) belongs to Bakhvalov's class over $\Omega$ if it satisfies A1 – A4.

Theorem 2.1 ([2, Theorem 1]). Fix $\Omega$ as in (2.3) with $(Z, W) \in \mathbb{R}^2$. Let system (2.1) be strictly hyperbolic and genuinely nonlinear in $\Omega$. If (2.1) belongs to Bakhvalov's class over $\Omega$, then for all $u_o \in BV_{loc}(\mathbb{R}; \Omega)$, the Cauchy problem for (2.1) with datum $u_o$ admits a global weak entropy solution.

Remark 2.1. The region considered by Bakhvalov is a subset of $\Omega$, so his theorem is a little stronger than the above statement.

The proof of Theorem 2.1 involves the construction of a functional F, non-increasing in time, defined on approximate solutions obtained by the Glimm scheme, see [2]. This functional is constant along solutions to Riemann problems and, hence, may be seen as a function of the two initial states. Let $u_l$ and $u_r$ be the left and right constant states of a given Riemann problem whose solution consists of a shock (or rarefaction) wave of first family, say $\sigma_1$, followed by a shock (or rarefaction) wave of second family, say $\sigma_2$.

Definition 2.1. Define $F(u_l, u_r) = [[\Delta z(\sigma_1)]]_- + [[\Delta w(\sigma_2)]]_-$, with $[[s]]_- = \max\{-s, 0\}$ denoting the negative part of s.

As usual, the Riemann coordinates are assumed to have a positive increment across rarefactions and a negative one across a shock.

Lemma 2.1 ([2, Lemma 1]). Under the same assumptions of Theorem 2.1, for any three states $u_l$, $u_m$ and $u_r$ in $\Omega$,

$$F(u_l, u_r) \le F(u_l, u_m) + F(u_m, u_r). \tag{2.4}$$

The equality holds in (2.4) if and only if um is a value attained by the solution corresponding to the Riemann data ul , for x < 0, and ur , for x > 0. As in [15], it is sufficient to use a local version of the above lemma. For any set B in the (z, w) plane, define R[B] to be the set of all values attained by the solution to any Riemann problem with initial data in B. The following is [15, Lemma 3.2]. Lemma 2.2. Let B0 , B1 be rectangles in the (z, w) plane with the property that R[R[B0 ]] ⊂ B1 and system (2.1) verifies Bakhvalov conditions Ai , i = 1,.., 4, when restricted to B1 . Then, for any states ul , um and ur in B0 , F(ul , ur ) ≤ F(ul , um ) + F(um , ur ),

(2.5)

and equality holds in (2.5) if $u_m$ is a value assumed by the Riemann solution corresponding to the Riemann data $u_l$, for x < 0, and $u_r$, for x > 0.


Fig. 1. Hypothesis B4

It is convenient to substitute condition A4 by the following stronger condition introduced by DiPerna in [12]. Define

$$\mathcal{R}_1(z_0, w_0) = \{(z, w) : z = R_1(w; z_0, w_0),\ w \le w_0\}, \qquad \mathcal{L}_2(z_0, w_0) = \{(z, w) : z = L_2(w; z_0, w_0),\ w \ge w_0\},$$

and $\Delta w = w - w_0$, $\Delta\hat w = \hat w - \hat w_0$, $\Delta z = z - z_0$, $\Delta\hat z = \hat z - \hat z_0$. Condition B4 consists of the following:

B4.1: Let $(\hat z_0, \hat w_0) \in \mathcal{R}_1(z_0, w_0)$. If $z = L_2(w; z_0, w_0)$, $\hat z = L_2(\hat w; \hat z_0, \hat w_0)$ and $\Delta\hat w = \Delta w$, then $\Delta\hat z \ge \Delta z$.

B4.2: Let $(\hat z_0, \hat w_0) \in \mathcal{L}_2(z_0, w_0)$. If $z = R_1(w; z_0, w_0)$, $\hat z = R_1(\hat w; \hat z_0, \hat w_0)$ and $\Delta\hat z = \Delta z$, then $\Delta\hat w \ge \Delta w$.

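The above conditions depend on the choice of the pair of Riemann invariants. System (1.1) falls within (2.1) by setting

$$u_1 = \rho\, \frac{1 + \left(\frac{v}{c}\right)^2 \frac{p(\rho)}{c^2\rho}}{1 - \left(\frac{v}{c}\right)^2}, \qquad u_2 = \rho\, v\, \frac{1 + \frac{p(\rho)}{c^2\rho}}{1 - \left(\frac{v}{c}\right)^2}, \qquad f(u_1, u_2) = \begin{bmatrix} \rho\, v\, \dfrac{1 + \frac{p(\rho)}{c^2\rho}}{1 - \left(\frac{v}{c}\right)^2} \\[3ex] \dfrac{\rho\, v^2 + p(\rho)}{1 - \left(\frac{v}{c}\right)^2} \end{bmatrix}. \tag{2.6}$$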

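The characteristic speeds of (1.1) are:

$$\lambda_1 = \frac{v - \sqrt{p'(\rho)}}{1 - \dfrac{v\sqrt{p'(\rho)}}{c^2}} \qquad \text{and} \qquad \lambda_2 = \frac{v + \sqrt{p'(\rho)}}{1 + \dfrac{v\sqrt{p'(\rho)}}{c^2}}.$$

For later use, introduce the physically relevant set

$$\mathcal{U} = \left\{ \mathbf{u} \in \mathring{\mathbb{R}}_+ \times \mathbb{R} : \rho > 0,\ p'(\rho) < c^2,\ v^2 < c^2 \right\}. \tag{2.7}$$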
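For states with $p'(\rho) < c^2$ and $v^2 < c^2$ the characteristic speeds stay strictly between −c and c, and a quick numerical check can be sketched as follows. This Python snippet is an illustration under stated assumptions: the function name and the sample values of ρ, v, c, ζ and γ are chosen here and are not taken from the paper. It evaluates $\lambda_1$, $\lambda_2$ for the γ-law $p(\rho) = \zeta^2\rho^\gamma$ and also probes the classical regime of very large c, where the speeds approach $v \mp \sqrt{p'(\rho)}$.

```python
from math import sqrt


def characteristic_speeds(rho, v, c, zeta=0.5, gamma=1.4):
    """Return (lambda_1, lambda_2) for the gamma-law p(rho) = zeta**2 * rho**gamma."""
    sound = sqrt(gamma * zeta**2 * rho ** (gamma - 1))   # sqrt(p'(rho))
    lam1 = (v - sound) / (1 - v * sound / c**2)
    lam2 = (v + sound) / (1 + v * sound / c**2)
    return lam1, lam2


c = 1.0
# sample state with p'(rho) < c^2 and |v| < c (values chosen for illustration)
lam1, lam2 = characteristic_speeds(rho=0.5, v=0.9, c=c)
# same state with a very large c, close to the classical regime
cl1, cl2 = characteristic_speeds(rho=0.5, v=0.9, c=1.0e8)
```

Both formulas have the shape of the relativistic addition of the velocities v and $\mp\sqrt{p'(\rho)}$, which is why the result cannot leave the interval ]−c, c[ when both inputs lie inside it.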


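It is immediate to verify that if $\mathbf{u} \in \mathcal{U}$, then $-c < \lambda_1(\mathbf{u}) < \lambda_2(\mathbf{u}) < c$. Throughout the paper, $\mathbf{u} \in \mathcal{U}$ denotes the conserved variables (2.6), while $\mathbf{v} \in \mathcal{V}$, with $\mathcal{V} = \mathbf{v}(\mathcal{U})$, denotes the Riemann coordinates $\mathbf{v} = (v_1, v_2)$, see [10, formulæ (2.24)–(2.25)]:

$$v_1 = \frac{c}{2} \log\frac{c+v}{c-v} - \int_0^\rho \frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c^2}}\, dr, \qquad v_2 = \frac{c}{2} \log\frac{c+v}{c-v} + \int_0^\rho \frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c^2}}\, dr. \tag{2.8}$$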

We remark that $\mathbf{v}$ is defined only for |v| < c and that the vacuum state corresponds to the line $v_1 = v_2$, while $v_2 > v_1$ corresponds to ρ > 0. In the case of the γ-law $p = \zeta^2\rho^\gamma$, the above integrals can be explicitly computed:

$$\int_0^\rho \frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c^2}}\, dr = \frac{2\sqrt{\gamma}}{\gamma - 1}\, c\, \arctan\left( \frac{\zeta}{c}\, \rho^{(\gamma-1)/2} \right) \quad \text{if } \gamma \in\, ]1, 2],$$

$$\int_{\rho_*}^\rho \frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c^2}}\, dr = \frac{\zeta}{1 + \left(\frac{\zeta}{c}\right)^2}\, \ln(\rho/\rho_*) \quad \text{if } \gamma = 1.$$

In [15, Theorem 4.1], the existence of domains $\mathcal{U}_\gamma$, 1 < γ < 2, is established, satisfying:
(i) $K \subset \mathcal{U}_\gamma$, for any compact $K \subset \mathcal{U}$ and for γ − 1 sufficiently small;
(ii) In $\mathcal{V}_\gamma := \mathbf{v}(\mathcal{U}_\gamma)$ it is possible to define new Riemann invariants $z = z(v_1, \gamma)$ and $w = w(v_2, \gamma)$ with respect to which the corresponding shock curves satisfy Bakhvalov's conditions, recalled below.

The referred to result in [15] extends to the relativistic system (1.1) a previous result of DiPerna [12] for the corresponding non-relativistic system (1.2). With the classical Riemann coordinates $(v_1, v_2)$, (see (2.8)), system (1.1) satisfies conditions A1 – A3. This follows from the lemmas in [15, Sect. 2]. However, neither B4 nor A4 hold for the classical Riemann invariants $v_1$ and $v_2$ of system (1.1). The situation is parallel to that of the system of non-relativistic isentropic gas dynamics, in which DiPerna showed in [12] that it is still possible to find a pair of Riemann invariants $z = z(v_1, \gamma)$, $w = w(v_2, \gamma)$ for which the system satisfies A1 – A3 and B4, at least locally. Using these new Riemann invariants $z = z(v_1, \gamma)$ and $w = w(v_2, \gamma)$ we next define a functional on periodic piecewise constant functions

$$\bar{\mathbf{u}}(x) = \sum_{\alpha=0}^{N} \mathbf{u}_\alpha\, \chi_{]x_{\alpha-1}, x_\alpha]}(x), \qquad x \in \Pi,\ \mathbf{u}_0 = \mathbf{u}_N,\ \mathbf{u}_\alpha \in \mathcal{U},\ \alpha = 0, \dots, N, \tag{2.9}$$

where $\Pi$ denotes the interval of periodicity. We set

$$L[\bar{\mathbf{u}}] := \sum_{\alpha=1}^{N} F(\mathbf{u}_{\alpha-1}, \mathbf{u}_\alpha),$$

where F is as in Def. 2.1. The construction of wave front tracking approximate solutions does not use the exact Riemann solution, but an approximate solution, which depends on the approximation parameter ε > 0 (see [5]). Accordingly, we use a function F ε (ul , ur ) whose definition


D. Calvo, R. M. Colombo, H. Frid

is similar to that of F(ul, ur), the only difference being that it uses the approximate Riemann solution instead of the exact one. Accordingly, we define

$$L^{\varepsilon}[\bar u] := \sum_{\alpha=1}^{N} F^{\varepsilon}(u_{\alpha-1},u_\alpha).$$
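The closed-form γ-law expression for the integral entering the Riemann coordinates (2.8), namely 2c√γ/(γ − 1) · arctan((ζ/c) ρ^((γ−1)/2)), can be sanity-checked numerically. The sketch below is only an illustration: the function names and the sample values of γ, ζ and c are assumptions and do not come from the paper. It compares a central finite difference of the closed form with the integrand √p′(r)/(r + p(r)/c²) and evaluates (v1, v2) for a given state (ρ, v).

```python
import math


def theta(rho, gamma, zeta, c):
    """Closed form of the integral in (2.8) for p = zeta^2 * rho^gamma, 1 < gamma <= 2."""
    return 2.0 * c * math.sqrt(gamma) / (gamma - 1.0) * math.atan(
        zeta / c * rho ** ((gamma - 1.0) / 2.0)
    )


def integrand(r, gamma, zeta, c):
    """sqrt(p'(r)) / (r + p(r)/c^2) for the gamma-law pressure."""
    p = zeta ** 2 * r ** gamma
    dp = gamma * zeta ** 2 * r ** (gamma - 1.0)
    return math.sqrt(dp) / (r + p / c ** 2)


def riemann_coordinates(rho, v, gamma, zeta, c):
    """Return (v1, v2) as in (2.8); requires |v| < c."""
    if abs(v) >= c:
        raise ValueError("the velocity must satisfy |v| < c")
    kinematic = 0.5 * c * math.log((c + v) / (c - v))
    th = theta(rho, gamma, zeta, c)
    return kinematic - th, kinematic + th
```

At the vacuum (ρ = 0) both coordinates coincide, and v2 − v1 = 2θ(ρ) > 0 for ρ > 0, in agreement with the remark following (2.8).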

For the construction of periodic wave front-tracking approximate solutions of (1.1), uε (t), through an interval [0, T ], for any T > 0, if ε > 0 is sufficiently small, a key point is the fact that Lε [uε (t)] is non-increasing for t ∈ [0, T ]. Let BV (R, U) be the space of periodic functions on R, with periodic interval , assuming values in U, of bounded total variation per period. Given u ∈ U, our approximate SRS, Stε , “almost” preserves the domains Dγ ⊂ BV (R, U), consisting of -periodic piecewise constant functions u(x) satisfying L[u] < Mγ , for some Mγ > 0, with Mγ → ∞, as γ → 1, and 1 u(x) d x = u , (2.10) in the sense that Lε [Stε u] < Mγ , for t ∈ [0, T ], and 1 ε St u(x) d x − u < δ,

t ∈ [0, T ],

for any δ > 0, if ε > 0 is sufficiently small. We measure the total variation per period of a periodic function u : R → U, denoted TV (u|), by means of the sum of the total variations in one period of each of the Riemann coordinates: ⎧ ⎫ 2 N N ∈N ⎨ ⎬ |vi (u(xα )) − vi (u(xα−1 ))| : xα−1 < xα TV (u|) = sup . ⎩ x0 , . . . , x N ∈ ⎭ i=1 α=1 We can now state our main theorem establishing the existence of a Standard Riemann Semigroup of periodic solutions with large oscillation and total variation per period. The definition of SRS in the periodic case is the obvious adaptation from [4, Def. 9.1]. Theorem 2.2. Let u ∈ U and 1 < γ < 2. Then, there exist domains Uγ ⊂ U, and constants Mγ > 0 satisfying Mγ → ∞ as γ → 1, p (ρ) < c2 for u ∈ Uγ and, for any given compact K ⊂ U, K ⊂ Uγ , provided γ − 1 is sufficiently small. Moreover, there exists a Standard Riemann Semigroup S : [0, +∞[ × Dγ → Dγ , in the sense that (a) For any u ∈ Dγ , ! !

St u − St u L1 () ≤ C !t − t !, for t, t ∈ [0, ∞[ , (2.11) for a constant C > 0 depending only on bounds on the total variation per period for u ∈ Dγ ; (b) For u, u ∈ Dγ , St u − St u 1 ≤ eCt u − u L1 () , (2.12) L () also for some constant C > 0 depending only on bounds on the total variation per period for u, u ∈ Dγ ;


(c) If u ∈ Dγ is piecewise constant, then for T > 0 sufficiently small, St u (t ∈ [0, T ]) coincides with the function obtained by piecing together the Riemann solutions corresponding to each of the jump discontinuities in u. Further, we"have also the following# properties: $ 1 (d) Dγ ⊇ u ∈ BV (R, Uγ ) : u(x) d x = u , TV (u|) ≤ Mγ . (e) For all u ∈ Dγ , the trajectory t → St u coincides with the Glimm solution constructed in [15]. Theorem 2.2 follows immediately from two major results which are stated subsequently. The first, Theorem 2.3, establishes the existence of periodic wave front tracking approxi¯ defined for t ∈ [0, T ], for any T > 0, and any periodic mate solutions, u¯ ε (t, ·) = Stε u, piecewise constant function u¯ assuming values in U, provided that ε > 0 and γ − 1 > 0 are sufficiently small. The second, Theorem 2.4, establishes the stability in L1 () of the periodic wave front tracking approximate solutions with respect to their initial data. Theorem 2.3. Given any periodic piecewise constant function u : R → U of the form (2.9) and any T > 0, it is possible to construct periodic wave front tracking approximate solutions, uε (t, ·) = Stε u, defined for t ∈ [0, T ], for any T > 0, provided that ε > 0 and γ − 1 > 0 are sufficiently small. The approximate solutions satisfy

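$$L^{\varepsilon}[S^{\varepsilon}_t u] \le L^{\varepsilon}[S^{\varepsilon}_{t'} u], \qquad\text{for } 0\le t' < t\le T, \tag{2.13}$$

$$\int_{\text{one period}}\bigl|S^{\varepsilon}_t u - S^{\varepsilon}_{t'} u\bigr|\,dx \le C\,|t-t'|, \qquad\text{for } t, t'\in[0,T], \tag{2.14}$$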

where C > 0 is a constant depending only on TV (u|), and u is given by (2.10). Moreover, there exists a subsequence εi → 0 for which Stεi u → u(t, ·) =: St u as εi → 0, where u(t, ·) is an entropy solution of (1.1) with initial data u. Theorem 2.4. For u ∈ U and γ ∈ ]1, 2[, there exist constants Mγ > 0, with Mγ → ∞, as γ → 1+, and domains Uγ ⊂ U, with Uγ ⊃ K , for any compact K ⊂ U, for γ sufficiently close to 1, with the following property. If u , u

are two periodic piecewise constant functions, of the form (2.9), with mean values 1 1

u := u (x) d x, u := u

(x) d x, u , u ∈ Uγ , || || respectively, assuming values in Uγ , with ! ! ! !

!u − u !, !u

− u !, TV (u|), TV (u |) < Mγ , then Stε u and Stε u are defined for t ∈ [0, T ], for any T > 0, provided that ε > 0 is sufficiently small. Moreover, we have ! ! ! ε ! ! S u(x) − S ε u (x)! d x ≤ eCt !u(x) − u (x)! d x, (2.15) t t

for some constant C > 0 depending only on Mγ . The fact that St u coincides with the Glimm solution constructed in [15] follows from well known uniqueness theorems, cf. [4,6–8,14]. The rest of this paper is organized as follows. In Sect. 3, we construct the wave front tracking approximate solutions and prove Theorem 2.3.


Section 4 is devoted to the proof of Theorem 2.4. The latter involves the construction of wave front tracking approximate solutions for the usual Cauchy problem with initial data of bounded total variation, and the proof of the L1-stability of such approximate solutions. In Sect. 5, we briefly show how our results immediately apply also to the nonrelativistic system (1.2). Finally, in Sect. 6, we show the convergence of the semigroup solutions of the relativistic system (1.1) to the semigroup solution of (1.2) when c → ∞.

3. Periodic Wave Front Tracking Approximate Solutions

This section is devoted to the proof of Theorem 2.3. The latter is based on the fact that, with respect to the new Riemann invariants z(v1, γ) and w(v2, γ), system (1.1) satisfies Bakhvalov’s conditions A1 – A4. We recall that conditions A1 – A3 are satisfied also by v1, v2 (see [10,15]). The precise definition of z(v1, γ) and w(v2, γ) is given in [15, formula (4.2)] and is immaterial for our purposes here. We only need the properties of z(v1, γ) and w(v2, γ) stated in the following lemma; see [15, Sect. 6] for more details. Recall the definition (2.7) of U.

Lemma 3.1. The new Riemann invariants z(v1, γ), w(v2, γ), with respect to which system (1.1) satisfies Bakhvalov’s conditions A1 – A4, may be defined for v belonging to a domain Vγ and u ∈ Uγ := v−1(Vγ), with Uγ ⊂ U and Uγ ⊃ K, for any compact K ⊂ U, provided that γ − 1 > 0 is sufficiently small. Moreover, after a suitable normalization, z(v1, γ) and w(v2, γ) satisfy

$$\begin{aligned}
\lim_{\gamma\to1} z(v_1,\gamma) &= v_1, & \lim_{\gamma\to1} w(v_2,\gamma) &= v_2,\\
\lim_{\gamma\to1}\frac{\partial z}{\partial v_1}(v_1,\gamma) &= 1, & \lim_{\gamma\to1}\frac{\partial w}{\partial v_2}(v_2,\gamma) &= 1,\\
\lim_{\gamma\to1}\frac{\partial^k z}{\partial v_1^k}(v_1,\gamma) &= 0 \ \text{ for } k\ge2, & \lim_{\gamma\to1}\frac{\partial^k w}{\partial v_2^k}(v_2,\gamma) &= 0 \ \text{ for } k\ge2,
\end{aligned} \tag{3.1}$$

locally uniformly in v1 and v2 . Although the conserved variable u depends on γ through the pressure p = pγ (ρ) = ζ 2 ρ γ , we may consider u as independent of γ , because of the nice behavior of pγ as γ → 1+, on compact subsets of ]0, +∞[, as stated in the following lemma, whose elementary proof is left to the reader. Lemma 3.2. As γ → 1+, the pressure law pγ converges to p1 uniformly in Ck , for any k ∈ N, on any compact subset of ]0, +∞[. In the following, we will always use the Riemann coordinates z(v1 , γ ), w(v2 , γ ), whose relevant properties are described in Lemma 3.1, but henceforth we denote them simply by v1 and v2 , respectively. Except for the fact that now we assume that the pair (v1 , v2 ) also satisfies A4 , for any other property that will be needed in the following, we can mix up these two pairs of Riemann coordinates without any problem, due to (3.1). Lemma 3.3. In the Riemann coordinates, the Lax curves of (1.1) departing from v can be parametrized as L1 (v, σ ) = (v1 + σ + ψ(v, γ , σ ), v2 + ψ(v, γ , σ )) , L2 (v, σ ) = (v1 + ψ(v, γ , σ ), v2 + σ + ψ(v, γ , σ )) ,

(3.2)


with a suitable function ψ of class C2,1 such that ψ(v, γ, σ) = 0 for all σ ≥ 0 and ψ(v, γ, σ) → ϕ(σ) in C2 uniformly on compact sets as γ → 1, where

$$\phi(\sigma) = \begin{cases} 0 & \sigma\ge0,\\[4pt] -\dfrac{\sigma}{2} + c\,\operatorname{arcsinh}\Bigl(\dfrac{2\zeta_c}{c}\,\sinh\dfrac{\sigma}{4\zeta_c}\Bigr) & \sigma\le0, \end{cases} \qquad \zeta_c = \frac{\zeta}{1+(\zeta/c)^2}. \tag{3.3}$$

Moreover, ψ is locally Lipschitz and, for all σ ≤ 0, ψ ≤ 0 and ∂σψ ≥ 0. Setting

$$H(\gamma,M,u^-,u^+) = \sup\bigl\{\partial_\sigma\psi(v,\gamma,\sigma) : v\in V,\ \gamma\in[1,\bar\gamma] \text{ and } \psi(v,\gamma,\sigma)\in K\bigr\}, \tag{3.4}$$

where K is any compact subset of V, we have H(γ, M, u−, u+) < +∞ and

$$\lim_{\gamma\to1} H(\gamma,M,u^-,u^+) = H(1,M,u^-,u^+).$$

Proof. 1. We observe that when γ > 1, the shock curves are not translation invariant as in the case γ = 1 (see [11,18]) and therefore the function ψ depends also on v.

2. The parametrization (3.3) is in [11, Sect. 4]. The regularity of ψ follows from that of pγ and, hence, of the flux function defining (1.1), see also [18, Prop. 1].

3. The inequalities ψ ≤ 0 and ∂σψ ≥ 0 are consequences of the following two facts. First, a tedious but straightforward computation shows that

$$\phi(0) = \phi'(0) = \phi''(0) = 0, \qquad \lim_{\sigma\to0-}\phi'''(\sigma) > 0.$$

It is also easy to see that, for each fixed v ∈ V, the derivatives ∂σ^k ψ(v, γ, σ) converge to ϕ^(k)(σ) as γ → 1+, uniformly in [−M, 0[, for any M > 0, k ∈ N. Hence, given δ > 0 such that ϕ′′′(σ) > 0 for σ ∈ [−δ, 0[, we conclude that

$$\partial_\sigma^3\psi(v,\gamma,\sigma) > 0, \qquad\text{for } \sigma\in[-\delta,0[\,,$$

for all γ > 1 sufficiently close to 1. We thus obtain, in particular,

$$\partial_\sigma^2\psi(v,\gamma,\sigma) < 0, \qquad \partial_\sigma\psi(v,\gamma,\sigma) > 0, \qquad\text{for } \sigma\in[-\delta,0[\,.$$

4. Second, note that for σ ≤ −δ we have

$$\frac{2\zeta_c}{c}\,\sinh\frac{\sigma}{4\zeta_c} < \sinh\Bigl(\frac{2\zeta_c}{c}\cdot\frac{\sigma}{4\zeta_c}\Bigr), \qquad\text{where } \frac{2\zeta_c}{c}\in\,]0,1[ \ \text{ and } \ \frac{\sigma}{2c} < 0.$$

This inequality follows from the strict concavity of s → sinh s for s ≤ −δ, and it gives ϕ(σ) < 0. Moreover, we have ∂σϕ > 0 since

$$\cosh\frac{\sigma}{4\zeta_c} > \sqrt{1+\Bigl(\frac{2\zeta_c}{c}\Bigr)^2\sinh^2\frac{\sigma}{4\zeta_c}},$$

which holds thanks to 2ζc/c ∈ ]0, 1[. Hence, we obtain the corresponding inequalities for ψ(v, γ, σ) and ∂σψ(v, γ, σ), for each fixed v ∈ V, for σ ∈ [−M, −δ], for any M > 0, if γ > 1 is sufficiently close to 1.

5. The boundedness of H(γ, M, u−, u+) follows from the regularity of ψ and the compactness of K. The final limit is a consequence of the locally uniform convergence as γ → 1, see Lemma 3.2.
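The sign properties used in the proof can be illustrated numerically for the limit profile ϕ. The following sketch is an illustration under stated assumptions: it reads (3.3) as ϕ(σ) = −σ/2 + c·arcsinh((2ζc/c)·sinh(σ/(4ζc))) for σ ≤ 0 with ζc = ζ/(1 + (ζ/c)²), it evaluates the Lax curves (3.2) with ψ replaced by its γ → 1 limit ϕ, and all function names and sample parameter values are hypothetical.

```python
import math


def zeta_c(zeta, c):
    """zeta_c = zeta / (1 + (zeta/c)^2), as in (3.3)."""
    return zeta / (1.0 + (zeta / c) ** 2)


def phi(sigma, zeta, c):
    """Limit profile of psi as gamma -> 1; vanishes on rarefactions (sigma >= 0)."""
    if sigma >= 0.0:
        return 0.0
    zc = zeta_c(zeta, c)
    return -0.5 * sigma + c * math.asinh(2.0 * zc / c * math.sinh(sigma / (4.0 * zc)))


def dphi(sigma, zeta, c):
    """Derivative of phi, obtained by differentiating the formula above."""
    if sigma >= 0.0:
        return 0.0
    zc = zeta_c(zeta, c)
    a = 2.0 * zc / c
    s = sigma / (4.0 * zc)
    return -0.5 + 0.5 * math.cosh(s) / math.sqrt(1.0 + (a * math.sinh(s)) ** 2)


def lax_curve(i, v, sigma, zeta, c):
    """Parametrization (3.2) of the i-th Lax curve through v, with psi replaced by phi."""
    v1, v2 = v
    shift = phi(sigma, zeta, c)
    if i == 1:
        return (v1 + sigma + shift, v2 + shift)
    if i == 2:
        return (v1 + shift, v2 + sigma + shift)
    raise ValueError("the family index must be 1 or 2")
```

On a grid of negative σ one can check that ϕ < 0 and ϕ′ > 0, which is the γ = 1 counterpart of the inequalities ψ ≤ 0 and ∂σψ ≥ 0 in Lemma 3.3.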


Fig. 2. The parametrization of the approximate Lax curves (3.5)

Following [5], for a fixed ε > 0, we consider the interpolation between the i-shock and the i-rarefaction wave (i = 1, 2) (approximate Lax curves): Lε1 (v, σ ) = (v1 + σ + ψε (v, γ , σ ), v2 + ψε (v, γ , σ )) , Lε2 (v, σ ) = (v1 + √ ψε (v, γ , σ ), v2 + σ + ψε (v, γ , σ )) , ψε (v, γ , σ ) = (σ/ ε) ψ(v, γ , σ ),

(3.5)

and is any C∞ function satisfying (s) = 1 for s ≤ −2, (s) = 0 for s ≥ −1, (s) ∈ [0, 1] and (s) ∈ [−2, 0] for s ∈ [−2, −1]. We note that the interpolated Lax curves admit the parametrization (3.5) for ε sufficiently small. Indeed, ∂σ Lε1 (v, 0) = [1 0]T , hence the half-line exiting v¯ and parallel to v1 = v2 does not intersect the approximate Lax curve between v¯ and vo , see Fig. 2, provided ε is sufficiently small. We thus have the following analog of Lemma 3.3. ( ( ( ) Lemma 3.4. There exist γo ∈ 1, γ¯ , εo > 0 such that for ε ∈ ]0, εo ] and γ ∈ 1, γo , the function ψε in (3.5) satisfies ψε ≤ 0, ∂σ ψ ε ≥ 0 and, for any compact K ⊂ V, & % v ∈ K, ψε (v, γ , σ ) ∈ K ε − + < +∞. H (γ , M, u , u ) = sup ∂σ ψε (v, γ , σ ) : and γ ∈ [1, γo ] Moreover, ψε (·, γ , ·) → ψ and H ε (γ , M, ul , ur ) → H (γ , M, ul , ur ) as ε → 0, uniformly on compact sets. Given ε > 0, a left and right state ul , ur , with Riemann coordinates vl = (v1l , v2l ) and vr = (v1r , v2r ), as in [11, Sect. 2] or [10, Theorem 4.1], we construct an approximate solution of the Riemann problem associated to (1.1). First, determine the unique values σ1 and σ2 and a middle state vm such that vr = Lε2 (vm , σ2 ) and vm = Lε1 (vl , σ1 ). If σ1 ≥ 0, then the states vl and vm are connected by a 1-rarefaction wave. We approximate this rarefaction wave using a fixed ε grid in the (v1 , v2 ) plane. Let the integers h, k be such j that hε ≤ v1l < (h + 1)ε and kε ≤ v1m < (k + 1)ε. Introducing the states ω1 = ( jε, v2l ) j and * ω1 = ( j + 21 )ε, v2l for j = h, . . . , k, we construct the ε-approximate solution through the following rarefaction fan: ⎧ l ( ( v if x/t ∈ −∞, λ1 (* ω1h ) ⎪ ⎪ ⎨ + , j−1 j vε (t, x) = ω1j if x/t ∈ λ1 (* ω1 ), λ1 (* ω1 ) , j = h + 1, . . . , k. ⎪ ⎪ ) ) ⎩ m v if x/t ∈ λ(* ω1k ), +∞ . On the other hand, if σ1 < 0, the states vl and vm are connected by a shock: % l l v if x < λ 1 (v , σ1 )t vε (t, x) = m v if x > λ1 (vl , σ1 )t.


Let λ1 (v, σ ) denote the Rankine-Hugoniot speed of the (exact) shock between the states v and L1 (v, σ ). Then, the shock speed λ 1 is defined as σ1 σ1 l s l λ1 (v , σ1 ) = √ λ1 (v , σ1 ) + 1 − √ λr1 (vl , σ1 ), ε ε λs1 (vl , σ1 ) = λ1 vl , L1 (vl , σ1 ) , meas [ jε, ( j + 1)ε] ∩ [v m , vl ] j 1 1 r l λ1 (v , σ1 ) = λ1 (ωˆ 1 ). |σ1 | j
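The interpolation between shocks and rarefactions can be sketched in code. The block below is a hypothetical illustration, not the construction of [5]: the name `cutoff` stands for the smooth cutoff function of (3.5), realized here by one standard exp(−1/x) construction (an assumption), and `psi` is passed as a function of σ only, that is, for frozen v and γ. The same weight is used to blend a shock speed with a rarefaction speed, as in the definition of the interpolated shock speed above.

```python
import math


def _bump(x):
    """exp(-1/x) for x > 0 and 0 otherwise; smooth on the whole real line."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0


def cutoff(s):
    """Smooth cutoff: equals 1 for s <= -2, 0 for s >= -1, non-increasing in between."""
    t = -1.0 - s
    return _bump(t) / (_bump(t) + _bump(1.0 - t))


def psi_eps(psi, sigma, eps):
    """Interpolated profile: cutoff(sigma / sqrt(eps)) * psi(sigma)."""
    return cutoff(sigma / math.sqrt(eps)) * psi(sigma)


def interpolated_speed(shock_speed, rarefaction_speed, sigma, eps):
    """Convex combination of the two speeds, weighted by the cutoff."""
    w = cutoff(sigma / math.sqrt(eps))
    return w * shock_speed + (1.0 - w) * rarefaction_speed
```

With this choice, waves with σ ≥ −√ε are treated as rarefaction-like (the profile vanishes), while waves with σ ≤ −2√ε follow the exact shock profile and speed. The slope of this particular cutoff stays in [−2, 0], which is the bound required of the cutoff in the text.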

The construction of the ε-approximate solution for waves of the second family is analogous to the previous case; we refer to [5,11] for details. Let now u(x) be a periodic piecewise constant initial condition as in (2.9). A piecewise constant ε-approximate solution to the Cauchy problem for (1.1) is constructed as follows. At time τ0 = 0 solve the Riemann problems defined by the jumps in u(x) using the above algorithm. This yields a piecewise constant approximate solution (t, x) → uε(t, x) defined up to the time τ1 > τ0, where the first set of interactions takes place. The Riemann problems arising at time τ1 are again approximately solved using the algorithm above. Then, uε can be defined up to the time τ2 when the next interaction takes place, etc. As usual, we denote Stε u := uε(t, ·).

Of great importance in the wave front tracking technique is the control of the interactions. Aiming at continuous dependence, the usual slight modifications of the wave speeds to avoid multiple interactions cannot be adopted. On the other hand, we treat below in detail the case of simple interactions, leaving to the inductive procedures developed in [5,11] the case of many waves interacting simultaneously.

Let D̄ be the set of periodic piecewise constant functions with values in U as in (2.9). We denote by σi,α (i = 1, 2) the total size of the waves of the i-th family in the ε-approximate solution of the Riemann problem for (1.1) at xα with states uα and uα+1. In the Riemann coordinates, this means

$$v_\alpha = L^{\varepsilon}_2\bigl(L^{\varepsilon}_1(v_{\alpha-1},\sigma_{1,\alpha}),\sigma_{2,\alpha}\bigr), \qquad\text{for } \alpha = 0,\dots,N, \tag{3.6}$$

where u0 = uN. Given u−, u+ ∈ U, let v−, v+ be their respective images in the plane of Riemann coordinates. We solve the corresponding ε-approximate Riemann problem obtaining v+ = Lε2(Lε1(v−, σ1), σ2). Let

$$F^{\varepsilon}(u^-,u^+) := [[\,v_1(\sigma_1)\,]]_- + [[\,v_2(\sigma_2)\,]]_-,$$

similarly to Definition 2.1. Now if u(·) ∈ D̄, we set

$$L^{\varepsilon}[u] := \sum_{\alpha=1}^{N} F^{\varepsilon}(u_{\alpha-1},u_\alpha).$$

Proof of Theorem 2.3. 1. Given any T > 0, to prove that we can construct the approximate solution Stε u throughout the whole interval [0, T ] we need to show that after all possible interactions between wave fronts in Stε u, as t increases, Stε u keeps assuming values in Uγ , where the special Riemann coordinates v1 , v2 are defined. We achieve this


Fig. 3. Notation for the interaction estimates

by showing that TV (Stε u|) keeps being always uniformly bounded and that the mean value 1 (Stε u) := S ε u(x) d x t can be made arbitrarily close to u := (S0ε u) = (u(·)) if ε > 0 is sufficiently small. 2. As in [13], we construct the approximate solutions in a number of time-steps of fixed length T0 , independent of ε, using (2.13)–(2.14) and the convergence of the approximate solutions at each time-step as ε → 0 to an entropy solution St u of (1.1) with initial data u(·), in order to pass from one time-step to the next one. 3. The control of TV (Stε u|) is achieved once we show (2.13). Observe that, by the geometry of the approximate wave curves, both v1 and v2 decrease across approximate shocks of both families, and increase across approximate rarefactions of both families. Observe also that, because of property A2 , the absolute value of the change in v1 across approximate shocks of the first family dominates that of v2 across the same waves, while the absolute value of the change in v2 across approximate shocks of the second family dominates that of v1 across the same waves. Clearly, then, Lε [Stε u] is equivalent to the negative variation per period of Stε u. By periodicity, the total variation per period equals twice the negative variation per period, so Lε [Stε u] is equivalent to TV (Stε u|), that is, 1 ε ε L [St u] ≤ TV (Stε u|) ≤ CLε [Stε u] , C

(3.7)

for some constant C > 0 depending only on (1.1) and Uγ.

4. As usual, we say that a wave on the left approaches a wave to its right if the former belongs to a family of order greater than that of the latter, or if they both belong to the same family and at least one of them is a shock. Now, suppose that a wave connecting a state ul to a state um interacts with a wave connecting um to ur (see Fig. 3). Assume also that the interaction produces two wave fronts of total size σ1+ and σ2+, connecting ul to a new middle state and this middle state to ur, respectively. We are going to show that

$$F^{\varepsilon}(u_l,u_r) \le F^{\varepsilon}(u_l,u_m) + F^{\varepsilon}(u_m,u_r). \tag{3.8}$$

5. We must analyze all possibilities according to whether σ′ is a shock or rarefaction of the first or second family and σ″ is a shock or rarefaction of the first or second family. There are in total 10 cases of approaching waves. Of all these cases, the most delicate


Fig. 4. Bakhvalov’s condition A4 through DiPerna’s conditions B4 .1, B4 .2

is the one in which σ2− is a shock of the second family, and σ1− is a shock of the first family, see Fig. 3, left. Bakhvalov’s condition A4 refers exactly to this interaction (see Fig. 4). 6. In the case of the γ -law systems of gas dynamics, as pointed out by DiPerna [12], Bakhvalov’s condition A4 is satisfied in the plane of the special Riemann coordinates introduced in [12], as a consequence of the validity of DiPerna’s conditions B4 .1, B4 .2. The latter can be viewed also as follows. Let v0 be a given reference state, let R1 := {(v1 , v2 ) : v2 = R1 (v1 ; v0 ), v1 ≤ v01 } be the right shock curve of the first family departing from v0 (i.e., curve whose points can be connected on the right of v0 by a 1-shock). Let L 2 := {(v1 , v2 ) : v1 = L 2 (v2 ; v0 ), v2 ≥ v02 } be the left shock curve of the second family departing from v0 (i.e., curve whose points can be connected on the left of v0 by a 2-shock). If v∗ is any point in R1 , the left shock curve of the second family departing from v∗ is the graph of a function v1 = g(v2 ) := L 2 (v2 ; v∗ ). If Tv∗ : R2 → R2 is the translation in the Riemann coordinates plane taking v0 to v∗ , the Tv∗ -translate of L 2 is the graph of the function v1 = g(v ¯ 2 ) := v∗1 + L 2 (v2 − v∗2 ; v0 ). DiPerna’s condition B4 .1 is equivalent to the fact that the inequality g(v ¯ 2 ) ≤ g(v2 ) holds. Similarly, from any point v∗∗ ∈ L 2 , the right shock curve of the first family is the graph of the function v2 = h(v1 ) := R1 (v1 ; v∗∗ ) and the Tv∗∗ -translate of R1 is the graph ¯ 1 ) = v∗∗2 + R1 (v1 − v∗∗1 ; v0 ). Again, DiPerna’s condition B4 .2 of the function v2 = h(v means exactly that we must have h¯ 1 (v1 ) ≥ h(v1 ). See Fig. 4, where v0 = um , v∗ = ur , + + v∗∗ = ul , R1 = R1− , L 2 = L − 2 , L 2 ( · ; v∗ ) = L 2 , R1 ( · ; v∗∗ ) = R1 , and the translates of − − R1 and L 2 are R1 and L 2 , respectively. As we see in Fig. 
4, the polygon formed by − − − + + the curves L − 2 , R1 , L 2 , R1 is contained in the polygon formed by the curves L 2 , R1 , − −

L 2 and R1 and this implies A4 due to the validity of A2 , which impose constraints on the inclinations of the curves Ri and L i . 7. The analysis in [12], for the classical case, and [15], for the relativistic case, shows that DiPerna’s transformation is C 2 -stable in the sense that if R1ε := {(v1 , v2 ) : v2 = R1ε (v1 ; v0 ), v1 ≤ v01 } and L ε2 := {(v1 , v2 ) : v1 = L ε2 (v1 ; v0 ), v2 ≥ v02 } are curves sufficiently close (in a compact interval) to the curves R1 and L 2 , defined in the last step,


in the C 2 -norm, we still have the inequalities between the corresponding functions g, g, ¯ ¯ where now v∗ runs along R ε and v∗∗ runs along L ε . h and h, 1 2 8. Now observe that we may define a parametrization similar to (3.2) for the left wave curves departing from a given point v, that is, ˜ ˜ L˜ 1 (v, σ ) = v1 + σ + ψ(v, γ , σ ), v2 + ψ(v, γ, σ) , (3.9) ˜ ˜ ˜ L2 (v, σ ) = v1 + ψ(v, γ , σ ), v2 + σ + ψ(v, γ , σ ) ˜ γ , σ ) = 0 for all σ ≤ 0 which with a suitable function ψ˜ of class C2,1 such that ψ(v, converges in C 2 as γ → 1 to the function corresponding to γ = 1. We can also define the approximate left wave curves similarly to (3.5), that is, L˜ ε1 (v, σ ) = v1 + σ + ψ˜ ε (v, γ , σ ), v2 + ψ˜ ε (v, γ , σ ) , L˜ ε2 (v, σ ) = v1 + ψ˜ ε (v, γ , σ ), v2 + σ + ψ˜ ε (v, γ , σ ) , √ ˜ γ , σ ). ψ˜ ε (v, γ , σ ) = (−σ/ ε) ψ(v,

(3.10)

9. To simplify the reasoning we assume here that our wave front tracking approximate solution is constructed exactly as described before with the only difference that the approximate Riemann problems are solved using the approximate right 1-wave curve Lε1 and the approximate left 2-wave curve L˜ ε2 . This choice would not cause any change in the development of the theory of [5]. So given vl , vr we find vm as the intersection of ε Lε1 (vl , σ ) with L˜2 (vr , σ ), and construct the approximate Riemann solution as before. Hence, from the considerations in Steps 6 and 7 and the form in which the approximate wave curves are defined (as convex combinations of the coordinate lines and the exact wave curves, see (3.5) and (3.10)), we conclude the following. If g ε , g¯ ε , h ε and h¯ ε are the functions defined as above replacing the exact right 1-shock and left 2-shock curves by the approximate ones, with v∗ running along R1ε and v∗∗ running along L ε2 , we still have g¯ ε (v2 ) ≤ g ε (v2 ) and h¯ ε (v1 ) ≥ h ε (v1 ). This implies that Fig. 4 also describes the interaction between two approximate shock waves. Therefore, we conclude that (3.8) holds for this type of interaction. 10. For the other possible types of interactions, the fact that (3.8) holds is immediate and we only need to draw pictures to see that clearly. For instance, Fig. 5 describes three examples of possible interactions:(i) a 2-rarefaction R R2− with a 1-shock S1− giving a 1-shock S1+ and a 2-rarefaction R R2+ ; (ii) a 1-shock

S1− with a 1-shock S1− giving a 1-shock S1+ and a 2-rarefaction R R2+ ; (iii) a 2-shock S2− with a 2-rarefaction R R2− giving a 1-shock S1+ and a 2-shock S2+ . 11. From the validity of (3.8) at each possible interaction, we conclude that Lε [Stε u] decreases at each interaction time, being constant in time intervals that do not contain any interaction. Therefore, (2.13) holds. 12. Since Lε [Stε u] is non-increasing with time and, by construction, Stε u is spatially -periodic, it follows from (3.7) that the total variation per period of Stε u¯ is uniformly bounded. 13. Now, the proof of (2.14) follows similarly to the one of the corresponding property of the wave front tracking approximate solution for the usual Cauchy problem (see [5, 11]), by using the periodicity of Stε u and the uniform boundedness of TV (Stε u|).


Fig. 5. Three examples of possible interactions

14. The properties (2.13) and (2.14) satisfied by the -periodic wave front tracking approximate solution Stε u allow us to repeat the reasoning in [13,15] and construct the approximate solutions in an arbitrary time-interval [0, T ], as long as ε > 0 is sufficiently small. 15. Indeed, from (2.13) and (2.14) we first obtain a T0 > 0 such that the approximate solutions may be constructed in the time-interval [0, T0 ]. This T0 > 0 is such that the ¯ do not leave a square image in the Riemann coordinates plane of the mean-values (Stε u) box Q(v0 , R) whose side length equals a certain R > 0 and is centered at a certain point v0 , during the time-interval [0, T0 ], as long as u lies in the concentric box Q(v0 , R/2) of side length R/2. Such T0 > 0 always exists due to (2.13) and (2.14) (cf. [13,15]). 16. Now since the approximate solutions converge to an entropy solution of (1.1), which follows in a standard way (see [5]), we have that, for ε > 0 sufficiently small, the ¯ , for t ∈ [0, T0 ], belong to the box Q(v0 , R/2) since they converge mean-values (Stε u) to u , uniformly in t ∈ [0, T0 ]. Therefore, we may construct the approximate solutions also in the time-interval [T0 , 2T0 ] if ε > 0 is sufficiently small, and so on. In this way, we can cover the given time-interval [0, T ] with a finite number of intervals of the form [(k − 1)T0 , kT0 ], k ∈ N, and obtain that the approximate solutions can be constructed in any time-interval [0, T ], as long as ε > 0 is sufficiently small. 17. As already mentioned, the convergence of the approximate solutions to an entropy solution of (1.1) is standard. This concludes the proof.

4. The L1 -Stability of the Periodic Wave Front Tracking Approximate Solutions This section is devoted to the proof of Theorem 2.4. Since the L1 -stability is a local property, in the periodic case we can reduce its proof to the usual case of the Cauchy problem as follows. We define approximate wave-front tracking solutions, StC,ε u∗ , StC,ε u∗ as in [5,11], for initial data u∗ , u∗ which coincides with the -periodic piecewise constant initial

outside these interdata u, u on three period intervals, and is constant, equal to u , u vals. So, if = [−L , L], these initial data coincide on [−3L , 3L]. The corresponding approximate wave front tracking solutions, StC,ε u∗ and StP,ε u, coincide over , on a time-interval [0, T∗ ], where T∗ depends only on an upper bound of the characteristic speeds, because of the finite speed of propagation property. The same is true for the approximate solutions StC,ε u∗ and StP,ε u .


If we prove that any two approximate solutions StC,ε u′∗ and StC,ε u″∗ satisfy

$$\bigl\|S^{C,\varepsilon}_t u'_* - S^{C,\varepsilon}_t u''_*\bigr\|_{L^1(\mathbb{R})} \le C_0\,\bigl\|u'_* - u''_*\bigr\|_{L^1(\mathbb{R})}, \tag{4.1}$$
for some constant C0 not depending on ε, u∗ , u∗ , but only on the bounds for the data of the problem, we then obtain P,ε P,ε ≤ 3C0 u − u L1 () , t ∈ [0, T∗ ]. St u − St u 1 L ()

This reasoning can be repeated for the intervals [(k − 1)T∗ , kT∗ ], k ∈ N, as long as they are contained in the [0, Tε ], [0, Tε ], where StP,ε u and StP,ε u are defined, for which we have shown that Tε , Tε → +∞ as ε → 0. Indeed, the possibility of repeating the procedure follows from the fact that Lε [StP,ε u] and Lε [StP,ε u ] do not increase with time, which guarantees the uniform boundedness of TV (StP,ε u|) for t ∈ [0, Tε ] and TV (StP,ε u |) for t ∈ [0, Tε ]. We thus get, for t ∈ [(k − 1)T∗ , kT∗ ], and k ∈ N, P,ε P,ε ≤ (3C0 )k u − u L1 () , St u − St u 1 L ()

which then gives the desired stability property (2.15). By the above arguments, in this section we consider only the stability property for the Cauchy problem as in [5,11]. From now on, we follow closely the notation in [11]. Before we begin, we state the following simple proposition which shows that we may prevent vacuum by means of a suitable bound on the total variation of the function which we measure in the Riemann coordinates by

$$\mathrm{TV}(u) = \sup\Bigl\{\sum_{i=1}^{2}\sum_{\alpha}\bigl|v_i(u(x_\alpha)) - v_i(u(x_{\alpha-1}))\bigr| : x_{\alpha-1} < x_\alpha\Bigr\}. \tag{4.2}$$

Proposition 4.1. Fix a positive M and two states u−, u+ ∈ ]0, +∞[ × R such that

$$\int_0^{\rho^-}\frac{\sqrt{p'(r)}}{r+\frac{p(r)}{c^2}}\,dr > \frac{M}{4} \qquad\text{and}\qquad \int_0^{\rho^+}\frac{\sqrt{p'(r)}}{r+\frac{p(r)}{c^2}}\,dr > \frac{M}{4}. \tag{4.3}$$

Then, any function u : R → ]0, +∞[ × R satisfying

$$\lim_{x\to-\infty}u(x) = u^-, \qquad \lim_{x\to+\infty}u(x) = u^+ \qquad\text{and}\qquad \mathrm{TV}(u) < M$$

does not attain as value the vacuum state.

Proof. Assume that u∗ : R → ]0, +∞[ × R satisfies (4.3), limx→−∞ u∗(x) = u−, limx→+∞ u∗(x) = u+ and ρ∗(x∗) = 0 for some x∗ ∈ R. Then, by (4.2) and since v1 = v2 at the vacuum,

$$\mathrm{TV}(u^*) \ge \sum_{i=1}^{2}\Bigl(\bigl|v_i^- - v_i^*(x_*)\bigr| + \bigl|v_i^*(x_*) - v_i^+\bigr|\Bigr) \ge \bigl|v_1^- - v_2^-\bigr| + \bigl|v_1^+ - v_2^+\bigr| = 2\left(\int_0^{\rho^-}\frac{\sqrt{p'(r)}}{r+\frac{p(r)}{c^2}}\,dr + \int_0^{\rho^+}\frac{\sqrt{p'(r)}}{r+\frac{p(r)}{c^2}}\,dr\right) > M,$$

completing the proof.
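The mechanism behind Proposition 4.1 can be illustrated for piecewise constant profiles. The sketch below is a hypothetical example: it assumes the γ-law closed form of the integral in (2.8), measures the total variation in the Riemann coordinates as in (4.2), and uses made-up sample states. Any profile that touches the vacuum has total variation at least 2(θ(ρ−) + θ(ρ+)), where θ denotes the integral appearing in (4.3).

```python
import math


def theta(rho, gamma, zeta, c):
    """Closed form of the integral in (4.3) for p = zeta^2 * rho^gamma, 1 < gamma <= 2."""
    return 2.0 * c * math.sqrt(gamma) / (gamma - 1.0) * math.atan(
        zeta / c * rho ** ((gamma - 1.0) / 2.0)
    )


def riemann_coordinates(rho, v, gamma, zeta, c):
    """Riemann coordinates (v1, v2) of the state (rho, v), with |v| < c."""
    kinematic = 0.5 * c * math.log((c + v) / (c - v))
    th = theta(rho, gamma, zeta, c)
    return kinematic - th, kinematic + th


def total_variation(states, gamma, zeta, c):
    """Total variation (4.2) of a piecewise constant profile.

    `states` lists the values (rho, v) from left to right.
    """
    coords = [riemann_coordinates(rho, v, gamma, zeta, c) for rho, v in states]
    return sum(
        abs(right[i] - left[i])
        for left, right in zip(coords, coords[1:])
        for i in (0, 1)
    )
```

For instance, with the end states (ρ, v) = (1, 0) and (2, −0.5), inserting a vacuum value between them forces the total variation above 2(θ(1) + θ(2)); choosing M below that threshold, as in (4.3), therefore excludes the vacuum.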


We are going to prove the following theorem. − + (4.3) for some Theorem 4.1. Choose a positive M( and ( states u , u satisfying ) ) γ¯ ∈ ]1, 2]. Then, there exists γo ∈ 1, γ¯ such that for all γ ∈ 1, γo , system (1.1) generates a Standard Riemann Semigroup S : [0, +∞[ × D → D. Moreover, for a suitable κγ ∈ ]0, 1[, ⎧ ⎫ ⎨ lim x→−∞ u(x) = u− ⎬ (1) D ⊇ clL1 u ∈ BV(R; U) : lim x→+∞ u(x) = u+ ; ⎩ ⎭ TV (u) ≤ κγ M (2) if TV (uo ) ≤ κγ M, then TV (St uo ) ≤ M for all t ≥ 0; (3) limγ →1 κγ = 1/ 1 + H (1, M, u− , u+ ) .

Let D̄ denote the set of piecewise constant functions with values in U. For any u ∈ D̄,

$$u = u^-\chi_{]-\infty,x_0]} + \sum_{\alpha} u_\alpha\,\chi_{]x_{\alpha-1},x_\alpha]} + u^+\chi_{]x_N,+\infty[}, \tag{4.4}$$

we denote by σi,α (i = 1, 2) the total size of the waves of the i-th family in the ε-approximate solution of the Riemann problem for (1.1) at xα with states uα and uα+1, see (3.6). Let A denote the set of all couples of approaching waves. We say that a pair of waves (σi,α, σj,β) is approaching if α < β and either j < i, or j = i and min{σi,α, σi,β} < 0, see [4, Chap. 7] or [17]. Following [11, (2.11)], we introduce the functionals

$$\begin{aligned}
V^{\varepsilon}(u) &= \sum_{\alpha}\sum_{i}\bigl(1-\eta\,\operatorname{sgn}\sigma_{i,\alpha}\bigr)\,|\sigma_{i,\alpha}|,\\
Q^{\varepsilon}(u) &= \sum_{(\sigma_{i,\alpha},\sigma_{j,\beta})\in A}|\sigma_{i,\alpha}\,\sigma_{j,\beta}|,\\
\Upsilon^{\varepsilon}(u) &= V^{\varepsilon}(u) + \frac{1}{K}\cdot Q^{\varepsilon}(u),
\end{aligned} \tag{4.5}$$

where η ∈ ]0, 1[ and K > 0 are constants depending only on u− , u+ , M and their values will be defined below. The dependence of ϒ ε on ε is due to the dependence on ε of the wave sizes in the ε–solution to Riemann problems. Clearly, the functional ϒ ε (u) is equivalent to the total variation of u. Our first goal will be to show that ϒ ε (Stε u) decreases with time. The decrease of ε ϒ (Stε u) with time will then be used to prove (4.1). To simplify the notation, in the remainder of this paper, C denotes a generic “large” constant dependent only on the domain U where the conserved quantities may vary. The actual value of C is unimportant for the results obtained here. Throughout this section, we refer to Fig. 3 for the interaction estimates. Lemma 4.1. Let n ∈ N with n ≥ 1 and A ⊂ R3 be a compact set with the origin in its interior. If g ∈ C2,1 (; Rn ) satisfies g(0, y, z) = g(x, 0, z) = g(x, y, 1) = 0, then

g(x, y, z) ≤ Lip(D 2 g) · |x| · |y| · |z − 1|. The proof is a straightforward extension of [4, Lemma 2.5].
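The functionals in (4.5) are straightforward to evaluate once the waves of a piecewise constant function are listed. The following sketch is an illustration under assumptions: each wave is encoded as a hypothetical tuple (family, position index, size), the approaching condition is the one stated before (4.5), and the values of η and K in the usage note are arbitrary.

```python
from itertools import combinations


def sign(x):
    """Sign of x as -1, 0 or 1."""
    return (x > 0) - (x < 0)


def approaching(left, right):
    """True if `left` (located to the left) approaches `right`.

    Waves are tuples (family, position index, size); negative sizes are shocks.
    """
    i, alpha, size_left = left
    j, beta, size_right = right
    if alpha >= beta:
        return False
    return j < i or (j == i and min(size_left, size_right) < 0.0)


def functionals(waves, eta, K):
    """Return (V, Q, Upsilon) as in (4.5)."""
    V = sum((1.0 - eta * sign(size)) * abs(size) for _, _, size in waves)
    Q = sum(
        abs(a[2] * b[2])
        for a, b in combinations(waves, 2)
        if approaching(a, b) or approaching(b, a)
    )
    return V, Q, V + Q / K
```

For example, a 2-shock of size −1 located to the left of a 1-rarefaction of size 0.5 forms one approaching pair; with η = 0.5 and K = 2 this gives V = 1.75, Q = 0.5 and Υ = 2.0. Shocks are weighted by 1 + η and rarefactions by 1 − η in V, which is why Υ is equivalent to the total variation when Q/K is controlled.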


Lemma 4.2. Let ul, um, ur belong to U and let the waves σ1− of the first family and σ2− of the second family interact. Call σ1+, σ2+ the total sizes of the waves exiting the interaction, see Fig. 3, left. Then, there exists a constant C dependent only on U such that

$$\bigl|\sigma_1^+ - \sigma_1^-\bigr| + \bigl|\sigma_2^+ - \sigma_2^-\bigr| \le C\cdot(\gamma-1)\cdot\bigl|\sigma_1^-\,\sigma_2^-\bigr|, \tag{4.6}$$

provided γ − 1 is sufficiently small. Moreover

$$\operatorname{sgn}\sigma_1^+ = \operatorname{sgn}\sigma_1^- \qquad\text{and}\qquad \operatorname{sgn}\sigma_2^+ = \operatorname{sgn}\sigma_2^-. \tag{4.7}$$

Proof. Consider first (4.6) and introduce, for a fixed vl, the function

$$g(\sigma_1^-,\sigma_2^-,\gamma) = \begin{pmatrix}\sigma_1^+ - \sigma_1^-\\ \sigma_2^+ - \sigma_2^-\end{pmatrix}.$$

Clearly, g(0, σ2−, γ) = g(σ1−, 0, γ) = 0. Moreover, if γ = 1, the equality g(σ1−, σ2−, 1) = 0 is equivalent to

$$L^{\varepsilon}_1\bigl(L^{\varepsilon}_2(v^l,\sigma_2^+),\sigma_1^+\bigr) = L^{\varepsilon}_2\bigl(L^{\varepsilon}_1(v^l,\sigma_1^-),\sigma_2^-\bigr).$$

A solution to the latter equality is σ1− = σ1+ and σ2− = σ2+. It is the unique solution, since the map (σ1−, σ2−) → (σ1+(σ1−, σ2−), σ2+(σ1−, σ2−)) is globally invertible by the Hadamard global inverse function theorem, see [1, Theorem 1.8] and [11, Lemma 3.2]. The estimate (4.6) now follows from Lemma 4.1. To prove (4.7), choose γ so that C(γ − 1)M ≤ 1/2 and apply (4.6).

Lemma 4.3. Fix $u_l, u_m, u_r \in U$ and let the waves $\sigma'$, $\sigma''$, both of the first family, interact, and call $\sigma_1^+, \sigma_2^+$ the total sizes of the waves exiting the interaction, see Fig. 3, right. Then, if $\gamma - 1$ is sufficiently small,

(1) if $\sigma'' < 0$ and $\sigma' < 0$, then $\sigma_1^+ - \sigma_2^+ = \sigma' + \sigma''$;

(2) if $\sigma'' > 0$, $\sigma' < 0$ and $\sigma_1^+ < 0$, then $\sigma_1^+ - \sigma_2^+ = \sigma' + \sigma''$. Moreover

$$|\sigma_1^+| - |\sigma'| < 0, \qquad |\sigma_2^+| \le C(\gamma-1)|\sigma'\sigma''| + C\big(|\sigma'| - |\sigma_1^+|\big);$$

(3) if $\sigma' < 0$, $\sigma'' > 0$ and $\sigma_1^+ > 0$, then $\sigma_2^+ = 0$ and $\sigma_1^+ = \sigma' + \sigma''$.

Moreover, if $\sigma'\sigma'' < 0$, there is a $C > 0$, depending only on $U$, such that

$$|\sigma_1^+ - (\sigma' + \sigma'')| + |\sigma_2^+| \le C\cdot|\sigma'\sigma''|\cdot\big(|\sigma'| + |\sigma''|\big). \qquad (4.8)$$

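Proof. The proof of (1) follows directly from the parametrization (3.5), see Fig. 6, left. Similarly, to prove the first equality in (2) see Fig. 6, right. Consider the function

$$F(v_l,\gamma,\sigma',\sigma'') = \sigma_2^+(v_l,\sigma',\sigma'') - \sigma''.$$

It is known that $F(v_l,1,\sigma',\sigma'') < 0$ for all $v_l, \sigma', \sigma''$. Therefore, on the compact set $U$, if $\gamma - 1$ is sufficiently small, also $F(v_l,\gamma,\sigma',\sigma'') < 0$. Hence, $|\sigma_2^+| < \sigma''$ and $|\sigma_1^+| - |\sigma'| < 0$. Moreover,

$$\begin{aligned}
\psi^\varepsilon(v_m,\gamma,\sigma') &= \psi^\varepsilon(v_l,\gamma,\sigma_1^+) + \sigma_2^+ + \psi^\varepsilon(v_*,\gamma,\sigma_2^+),\\
\psi^\varepsilon(v_m,\gamma,\sigma') &\le \psi^\varepsilon(v_l,\gamma,\sigma_1^+) + \sigma_2^+,\\
\sigma_2^+ &\ge \psi^\varepsilon(v_m,\gamma,\sigma') - \psi^\varepsilon(v_l,\gamma,\sigma_1^+),\\
|\sigma_2^+| &\le C(\gamma-1)|\sigma'\sigma''| + C\,|\sigma' - \sigma_1^+|\\
&\le C(\gamma-1)|\sigma'\sigma''| + C\big(|\sigma'| - |\sigma_1^+|\big),
\end{aligned}$$

where we applied Lemma 4.1 to the map $(\sigma',\sigma'',\gamma) \mapsto \psi^\varepsilon\big(v_m(\sigma''),\gamma,\sigma'\big) - \psi^\varepsilon(v_l,\gamma,\sigma')$. The bound (4.8) is as [11, Lemma 3.1], while case (3) is immediate.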

Stability of Periodic Solutions in Relativistic Gas Dynamics


Fig. 6. Left, proof of (1) and, right, proof of (2) in Lemma 4.3

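Entirely analogous estimates hold for waves of the second family. Finally, introduce the set

$$D^\varepsilon_M = \big\{u \in \bar D : \Upsilon^\varepsilon(u) \le 2M\big\}.$$

We observe that $D^\varepsilon_M$ depends only on $\varepsilon$, $u^-$, $u^+$ and $M$.

Lemma 4.4. If $u \in D^\varepsilon_M$, $H_\gamma$ is as in (3.4), $\Upsilon^\varepsilon$ is as in (4.5) with $K \ge M$, and the total variation is measured as in (4.2), then

$$\frac{1-\eta}{1+H_\gamma}\cdot TV(u) \le \Upsilon^\varepsilon(u) \le 2(1+\eta)\,TV(u).$$

Proof. Write $u$ as in (4.4). By (4.5), thanks to $K \ge M$ and $u \in D^\varepsilon_M$,

$$\begin{aligned}
\Upsilon^\varepsilon(u) &\le (1+\eta)\sum_{i,\alpha}|\sigma_{i,\alpha}| + \frac{1}{K}\Big(\sum_{i,\alpha}|\sigma_{i,\alpha}|\Big)^2\\
&\le (1+\eta)\sum_{i,\alpha}|\sigma_{i,\alpha}| + \frac{TV(u)}{K}\sum_{i,\alpha}|\sigma_{i,\alpha}|\\
&\le 2(1+\eta)\sum_{i,\alpha}|\sigma_{i,\alpha}| \le 2(1+\eta)\,TV(u).
\end{aligned}$$

Using the definition (4.2) of the total variation, the form (3.5) of the interpolated Lax curves and the bound on $\partial_\sigma\psi^\varepsilon$ provided by Lemma 3.4,

$$TV(u) = \sum_{i,\alpha}\Big(|\sigma_{i,\alpha}| + |\psi^\varepsilon(v_\alpha,\gamma,\sigma_{i,\alpha})|\Big) \le (1+H_\gamma)\sum_{i,\alpha}|\sigma_{i,\alpha}| \le \frac{1+H_\gamma}{1-\eta}\,\Upsilon^\varepsilon(u),$$

completing the proof.

This allows us to choose $M$ so that all functions $u$ of the form (4.4) satisfying (3.6) with $K\,TV(u) + (TV(u))^2 \le M$ are also in $D^\varepsilon_M$ for all $\varepsilon$. As usual, below we use $\Delta V(t)$ to denote the variation of the functional $V$ at the interaction time $t$, and similarly for $Q$ and $\Upsilon$.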


Lemma 4.5. There exist constants $\eta \in\, ]0,1[$, $K \in [M,+\infty[$, $\gamma_o > 1$ and $\varepsilon_o > 0$ such that for all $\gamma \in [1,\gamma_o[$, for all $\varepsilon \in\, ]0,\varepsilon_o]$ and at any time $\bar t > 0$ at which two waves $\sigma_1^-$ and $\sigma_2^-$ of different families interact (see Fig. 3, left), the following estimate holds:

$$\Delta\Upsilon^\varepsilon(\bar t) \le -\frac{1}{2K}\cdot|\sigma_1^-\sigma_2^-|.$$

Moreover, at any time $\bar t > 0$ at which two waves $\sigma'$ and $\sigma''$ of the same family interact (see Fig. 3, right),

$$\Delta\Upsilon^\varepsilon(\bar t) \le -\frac{1}{2K}\cdot|\sigma'\sigma''|.$$

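Proof. Consider the different possible interactions separately.

1. First, the interaction between waves of different families. Recall (4.6) in Lemma 4.2. Therefore,

$$\begin{aligned}
\Delta V^\varepsilon(\bar t) &= (1-\eta\operatorname{sgn}\sigma_1^+)|\sigma_1^+| + (1-\eta\operatorname{sgn}\sigma_2^+)|\sigma_2^+| - (1-\eta\operatorname{sgn}\sigma_1^-)|\sigma_1^-| - (1-\eta\operatorname{sgn}\sigma_2^-)|\sigma_2^-|\\
&\le |\sigma_1^+ - \sigma_1^-| + |\sigma_2^+ - \sigma_2^-| \le C\cdot(\gamma-1)\cdot|\sigma_1^-\sigma_2^-|,\\
\Delta Q^\varepsilon(\bar t) &= \sum_{\sigma_{j,\beta}\in A(\sigma_1^+)}|\sigma_1^+\sigma_{j,\beta}| + \sum_{\sigma_{j,\beta}\in A(\sigma_2^+)}|\sigma_2^+\sigma_{j,\beta}|\\
&\quad - \sum_{\sigma_{j,\beta}\in A(\sigma_1^-)\setminus\{\sigma_2^-\}}|\sigma_1^-\sigma_{j,\beta}| - \sum_{\sigma_{j,\beta}\in A(\sigma_2^-)\setminus\{\sigma_1^-\}}|\sigma_2^-\sigma_{j,\beta}| - |\sigma_1^-\sigma_2^-|\\
&\le |\sigma_1^+ - \sigma_1^-|\sum_{\sigma_{j,\beta}\in A(\sigma_1^+)}|\sigma_{j,\beta}| + |\sigma_2^+ - \sigma_2^-|\sum_{\sigma_{j,\beta}\in A(\sigma_2^+)}|\sigma_{j,\beta}| - |\sigma_1^-\sigma_2^-|\\
&\le \big(CM(\gamma-1) - 1\big)\,|\sigma_1^-\sigma_2^-|,\\
\Delta\Upsilon^\varepsilon(\bar t) &\le \frac{1}{K}\big(C(K+M)(\gamma-1) - 1\big)\,|\sigma_1^-\sigma_2^-|,
\end{aligned}$$

so that if $\gamma < 1 + (2(K+M)C)^{-1}$, then the desired estimate holds.

Consider an interaction between waves of the same family. Following the same lines of [11, Lemma 3.1], we consider the different cases.

2. $\sigma' < 0$, $\sigma'' < 0$ and $\sigma_1^+ < 0$, so that $\sigma_2^+ \ge 0$. Computations similar to [11, 1 of Lemma 3.1] yield that $\Delta\Upsilon^\varepsilon(u) \le -\frac{1}{K}|\sigma'\sigma''|$ as soon as $K \ge M/\eta$.

3. $\sigma' < 0$, $\sigma'' > 0$ and $\sigma_1^+ < 0$. Following (2) in Lemma 4.3,

$$\begin{aligned}
\Delta V^\varepsilon(u) &= 2\big(|\sigma_1^+| - |\sigma'|\big) + \eta|\sigma_2^+|\\
&\le 2(C\eta - 1)\big(|\sigma'| - |\sigma_1^+|\big) + C\eta(\gamma-1)|\sigma'\sigma''|,\\
\Delta Q^\varepsilon(u) &\le \big(|\sigma_1^+| - |\sigma'|\big)\sum_{\sigma_\alpha\in A(\sigma')}|\sigma_\alpha| + |\sigma_2^+|\,M - |\sigma'\sigma''|\\
&\le M\big(|\sigma'| - |\sigma_1^+| + |\sigma_2^+|\big) - |\sigma'\sigma''|\\
&\le M(C+1)\big(|\sigma'| - |\sigma_1^+|\big) + \big(CM\eta(\gamma-1) - 1\big)|\sigma'\sigma''|,\\
\Delta\Upsilon^\varepsilon(u) &\le \Big(2C\eta + \frac{M(1+C)}{K} - 1\Big)\big(|\sigma'| - |\sigma_1^+|\big) + \frac{1}{K}\big((K+M)C\eta(\gamma-1) - 1\big)|\sigma'\sigma''|\\
&\le -\frac{1}{2K}|\sigma'\sigma''|,
\end{aligned}$$

as soon as $\eta < 1/(4C)$, $K > 2M(1+C)$ and $\gamma < 1 + 1/(2C\eta(K+M))$.

4. $\sigma' < 0$, $\sigma'' > 0$ and $\sigma_1^+ > 0$. By (3) in Lemma 4.3,

$$\begin{aligned}
\Delta V^\varepsilon(u) &= -2|\sigma'| \le 0,\\
\Delta Q^\varepsilon(u) &\le \big(|\sigma_1^+| - |\sigma''|\big)\sum_{\sigma\in A(\sigma_1^+)}|\sigma| - |\sigma'\sigma''| \le -|\sigma'\sigma''|,\\
\Delta\Upsilon^\varepsilon(u) &\le -\frac{1}{K}|\sigma'\sigma''|.
\end{aligned}$$

To complete the proof, we only choose the parameters $K$, $\eta$ and $\gamma$ as follows: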

$\eta < 1/(4C)$ (Case 3),
$K > \max\{M/\eta,\ 2M(1+C)\}$ (Cases 2 and 3),
$\gamma < 1 + \min\{1/(2(K+M)C),\ 1/(2C\eta(K+M))\}$ (Cases 1 and 3),

to satisfy all the above requirements.

We thus proved the following proposition.

Proposition 4.2. Fix $M > 0$. Then, there exists a constant $\eta > 0$, independent of $\varepsilon$, such that, for any $\bar u \in D^\varepsilon_M$, the wave-front tracking algorithm constructs an approximate solution $u^\varepsilon$ of (1.1), defined on $[0,+\infty[\,\times\mathbb{R}$, with:
(i) $u^\varepsilon(t,\cdot) \in D^\varepsilon_M$ for all $t \ge 0$;
(ii) the function $t \mapsto \Upsilon^\varepsilon(u^\varepsilon(t,\cdot))$ is non-increasing;
(iii) any strip $[0,T]\times\mathbb{R}$ contains finitely many interaction points of $u^\varepsilon$;
(iv) $TV(u^\varepsilon(t,\cdot))$ is uniformly bounded.

To denote the globally defined, $\varepsilon$-approximate solution, we use the notation

$$u^\varepsilon(t,\cdot) = S^\varepsilon_t\bar u. \qquad (4.9)$$
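The parameter choice that closes the proof of Lemma 4.5 can be checked numerically. The following Python sketch is only an illustration: the function names, the safety factor `margin` and the sample values of $C$ and $M$ are assumptions made here, not part of the paper. It picks $\eta$, $K$ and an upper bound for $\gamma$ satisfying the three stated constraints, and evaluates the coefficients appearing in Cases 1 and 3.

```python
def choose_parameters(C, M, margin=0.5):
    """Pick (eta, K, gamma_max) obeying the three constraints of Lemma 4.5.

    C is the generic "large" constant (assumed >= 1 here), M the bound on
    the total variation; `margin` in ]0,1[ is a hypothetical safety factor.
    """
    eta = margin / (4.0 * C)                         # eta < 1/(4C)
    K = max(M / eta, 2.0 * M * (1.0 + C)) / margin   # K > max{M/eta, 2M(1+C)}
    gamma_max = 1.0 + min(1.0 / (2.0 * (K + M) * C),
                          1.0 / (2.0 * C * eta * (K + M)))
    return eta, K, gamma_max


def case1_coefficient(C, M, K, gamma):
    """Coefficient of |sigma_1^- sigma_2^-| in the Case 1 bound."""
    return (C * (K + M) * (gamma - 1.0) - 1.0) / K


def case3_coefficients(C, M, K, eta, gamma):
    """Coefficients of (|sigma'| - |sigma_1^+|) and of |sigma' sigma''| in Case 3."""
    shock_part = 2.0 * C * eta + M * (1.0 + C) / K - 1.0
    product_part = ((K + M) * C * eta * (gamma - 1.0) - 1.0) / K
    return shock_part, product_part


if __name__ == "__main__":
    C, M = 10.0, 2.0                        # sample values, chosen arbitrarily
    eta, K, gamma_max = choose_parameters(C, M)
    gamma = 1.0 + 0.9 * (gamma_max - 1.0)   # any gamma strictly below gamma_max
    print(eta, K, gamma_max)
    print(case1_coefficient(C, M, K, gamma) <= -1.0 / (2.0 * K))
    print(case3_coefficients(C, M, K, eta, gamma))
```

With such a choice, the Case 1 coefficient and the second Case 3 coefficient stay below $-1/(2K)$, while the first Case 3 coefficient is negative, which is what the proof requires.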

The proof then works towards an estimate, independent of $\varepsilon$, of the Lipschitz constant for the semigroup $S^\varepsilon$ in the $L^1$ norm. The basic technique is to shift the locations $x_\alpha$ of the jumps in the initial condition $\bar u$ at constant rates $\xi_\alpha$, and estimate the rates at which the jumps in the corresponding solution $u^\varepsilon(t,\cdot)$ are shifted, for any fixed $t > 0$. To this aim, we use the technique based on pseudopolygonals, see [5,11]. Recall that a pseudopolygonal is an $L^1$-continuous countable concatenation of elementary paths and an elementary path in $D^\varepsilon_M$ is a map $\Gamma\colon ]a,b[\, \to D^\varepsilon_M$ of the form

$$\Gamma(\theta) = u^-\chi_{]-\infty,\,x_0^\theta]} + \sum_{\alpha=1}^{n} u_\alpha\,\chi_{]x_{\alpha-1}^\theta,\,x_\alpha^\theta]} + u^+\chi_{]x_n^\theta,\,+\infty[},$$

with $x_\alpha^\theta = \bar x_\alpha + \xi_\alpha\theta$ and $x_{\alpha-1}^\theta < x_\alpha^\theta$ for all $\theta \in\, ]a,b[$ and $\alpha = 0,\dots,n$. As in [5,11], the following key result holds.


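Proposition 4.3 ([5, Prop. 5] or [11, Prop. 2.5]). Let $\Gamma_0$ be a pseudopolygonal in $D^\varepsilon_M$. Then, for all $\tau > 0$, the path $\Gamma_\tau = S^\varepsilon_\tau\Gamma_0$ is also a pseudopolygonal.

For $\varepsilon > 0$, the weighted length of the elementary path $\Gamma\colon ]a,b[\, \to D^\varepsilon_M$ is

$$\|\Gamma\|_\varepsilon = (b-a)\cdot\Upsilon^\varepsilon_\xi(u),$$

where the functional $\Upsilon^\varepsilon_\xi$ will be defined below. In the above definition, $\|\Gamma\|_\varepsilon$ does not depend on the particular choice of $\theta$ such that $\Gamma(\theta) = u$, since the map $\theta \mapsto \Upsilon^\varepsilon_\xi(\Gamma(\theta))$ is constant along elementary paths. The weighted length of a pseudopolygonal is then defined as the sum of the weighted lengths of its elementary paths. For any two piecewise constant functions $u, w \in D^\varepsilon_M$, their weighted distance is

$$d_\varepsilon(u,w) = \inf\big\{\|\Gamma\|_\varepsilon \ \text{such that}\ \Gamma\colon [0,1] \to D^\varepsilon_M \ \text{is a pseudopolygonal joining } u \text{ to } w\big\}.$$

The functional $\Upsilon^\varepsilon_\xi$ in the definition of the length of pseudopolygonals is

$$\begin{aligned}
S^\varepsilon_{i,\alpha} &= 2\Big(\sum_{\beta}\sum_{j=1}^{2}[[\sigma_{j,\beta}]]_-\Big) - [[\sigma_{i,\alpha}]]_-,\\
R^\varepsilon_\alpha &= \sum_{\beta<\alpha}|\sigma_{2,\beta}| + \sum_{\beta>\alpha}|\sigma_{1,\beta}|,\\
\Upsilon^\varepsilon_\xi &= \sum_{\alpha}\sum_{i=1}^{2}|\sigma_{i,\alpha}\xi_\alpha|\exp\big(K_1 S^\varepsilon_{i,\alpha} + K_2 R^\varepsilon_\alpha + K_3\Upsilon^\varepsilon\big),
\end{aligned} \qquad (4.10)$$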

where $\Upsilon^\varepsilon$ is as in (4.5). The constants $K_1$, $K_2$ and $K_3$ are determined below. The basic interaction estimates on shifting interactions are the following.

Lemma 4.6 ([11, Lemma 3.4]). In an interaction as in Fig. 3, left:

$$\sum_{i=1}^{2}\big(|\sigma_i^+\xi_i^+| - |\sigma_i^-\xi_i^-|\big) < C\cdot|\sigma_1^-\sigma_2^-|\cdot\big(|\xi_1^-| + |\xi_2^-|\big),$$

while in the case of Fig. 3, right,

$$|\sigma_1^+\xi_1^+| - |\sigma'\xi'| - |\sigma''\xi''| + \sum_{\alpha}|\sigma_{2,\alpha}^+\xi_{2,\alpha}^+| < C\cdot|\sigma'\sigma''|\,\big(|\xi'| + |\xi''|\big).$$

An estimate often used in the proofs below is $e^a - e^b \le (a-b)\,e^a$, for $a, b \in \mathbb{R}$.

Lemma 4.7. There exist constants $K_1$, $K_2$ and $K_3$ such that at any interaction $\Upsilon^\varepsilon_\xi$ does not increase.

Proof. Following the lines of the proof of [11, Lemma 3.4], we consider several different cases. For the sake of notational simplicity, we omit the dependence on $\varepsilon$ in the functionals below and we keep it fixed throughout this proof.

1. Interaction between two waves of different families. Using the notation in Fig. 3, left, the estimate (4.6) and Lemma 4.5, we have:

$$\Delta S_i \le C(\gamma-1)|\sigma_1^-\sigma_2^-|, \qquad \Delta R_1 = -|\sigma_2^-|, \qquad \Delta R_2 = -|\sigma_1^-|, \qquad \Delta\Upsilon \le -\frac{1}{2K}|\sigma_1^-\sigma_2^-|.$$


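Therefore

$$\begin{aligned}
\Delta\Upsilon_\xi &\le \sum_{i=1}^{2}\big(|\sigma_i^+\xi_i^+| - |\sigma_i^-\xi_i^-|\big)\exp\big(K_1S_i^+ + K_2R_i^+ + K_3\Upsilon^+\big)\\
&\quad + \sum_{i=1}^{2}|\sigma_i^-\xi_i^-|\,\big(K_1\Delta S_i + K_2\Delta R_i + K_3\Delta\Upsilon\big)\exp\big(K_1S_i^+ + K_2R_i^+ + K_3\Upsilon^+\big)\\
&\le C|\sigma_1^-\sigma_2^-|\big(|\xi_1^-| + |\xi_2^-|\big)\exp\big(K_1S_1^+ + K_2R_1^+ + K_3\Upsilon^+\big)\\
&\quad + |\sigma_1^-\xi_1^-|\,\big(K_1\Delta S_1 + K_2\Delta R_1 + K_3\Delta\Upsilon\big)\exp\big(K_1S_1^+ + K_2R_1^+ + K_3\Upsilon^+\big)\\
&\quad + C|\sigma_1^-\sigma_2^-|\big(|\xi_1^-| + |\xi_2^-|\big)\exp\big(K_1S_2^+ + K_2R_2^+ + K_3\Upsilon^+\big)\\
&\quad + |\sigma_2^-\xi_2^-|\,\big(K_1\Delta S_2 + K_2\Delta R_2 + K_3\Delta\Upsilon\big)\exp\big(K_1S_2^+ + K_2R_2^+ + K_3\Upsilon^+\big)\\
&\le \big(|\xi_1^-| + |\xi_2^-|\big)|\sigma_1^-|\,\big(C|\sigma_2^-| + K_1\Delta S_1 + K_2\Delta R_1 + K_3\Delta\Upsilon\big)\exp\big(K_1S_1^+ + K_2R_1^+ + K_3\Upsilon^+\big)\\
&\quad + \big(|\xi_1^-| + |\xi_2^-|\big)|\sigma_2^-|\,\big(C|\sigma_1^-| + K_1\Delta S_2 + K_2\Delta R_2 + K_3\Delta\Upsilon\big)\exp\big(K_1S_2^+ + K_2R_2^+ + K_3\Upsilon^+\big)\\
&\le \big(|\xi_1^-| + |\xi_2^-|\big)|\sigma_1^-|\,\Big((C-K_2)|\sigma_2^-| + \Big(CK_1(\gamma-1) - \frac{K_3}{2K}\Big)|\sigma_1^-\sigma_2^-|\Big)\exp\big(K_1S_1^+ + K_2R_1^+ + K_3\Upsilon^+\big)\\
&\quad + \big(|\xi_1^-| + |\xi_2^-|\big)|\sigma_2^-|\,\Big((C-K_2)|\sigma_1^-| + \Big(CK_1(\gamma-1) - \frac{K_3}{2K}\Big)|\sigma_1^-\sigma_2^-|\Big)\exp\big(K_1S_2^+ + K_2R_2^+ + K_3\Upsilon^+\big)\\
&\le 0,
\end{aligned}$$

provided $K_2 \ge C$ and $\gamma \le 1 + K_3/(2CKK_1)$.

In the next cases, it is useful to separate the waves taking part in the interaction from those on the left or on the right of the interaction point:

$$\Delta\Upsilon_\xi = \Delta\Upsilon_\xi^{\mathrm{left}} + \Delta\Upsilon_\xi^{\mathrm{right}} + \Delta\Upsilon_\xi^{\mathrm{int}}.$$

2. Interaction between shocks of the same family. Concerning the waves on the left of the interaction point, by (4.5), (4.10) and Lemma 4.5, we have

$$\Delta S_{i,\alpha} = -2|\sigma_2^+|, \qquad \Delta R_\alpha = -|\sigma_2^+|, \qquad \Delta\Upsilon^\varepsilon \le -\frac{1}{2K}|\sigma'\sigma''|,$$

therefore $\Delta\Upsilon_\xi^{\mathrm{left}} \le 0$. Concerning the waves on the right of the interaction point, we get

$$\Delta S_{i,\alpha} = -2|\sigma_2^+|, \qquad \Delta R_\alpha = |\sigma_2^+|, \qquad \Delta\Upsilon^\varepsilon \le -\frac{1}{2K}|\sigma'\sigma''|,$$

therefore, if $K_1 \ge K_2/2$, then $\Delta\Upsilon_\xi^{\mathrm{right}} \le 0$.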


Now let us consider the waves entering and exiting the interaction point. By (4.5), (4.10), (1) in Lemma 4.3, (4.8) and Lemma 4.5, we get

$$\begin{aligned}
S_1^+ - S' &= -|\sigma''| - |\sigma_2^+|, & S_1^+ - S'' &= -|\sigma'| - |\sigma_2^+|,\\
S_{2,\alpha}^+ - S_1^+ &\le |\sigma_2^+|, & R_1^+ - R'' &= -|\sigma'|,\\
R_1^+ &= R', & R_{2,\alpha}^+ - R' &\le |\sigma_2^+|.
\end{aligned}$$

Then, thanks to Lemma 4.6, (1) in Lemma 4.3 and (4.8), following computations very similar to those in [11, 2. in the proof of Proposition 2.7], we get $\Delta\Upsilon_\xi^{\mathrm{int}} \le 0$ if $K_1 \ge C$ and $K_3 \ge 2CKM(K_1 + K_2)$.

3. Interaction between a 1-shock $\sigma'$ and a 1-rarefaction $\sigma''$ resulting in a 1-shock. In this case, $\sigma_2^+$ is a 2-shock. Let us consider the waves on the left of the interaction point. Thanks to Lemma 4.5, (2) in Lemma 4.3 and (4.8),

$$\Delta S_{i,\alpha}^{\mathrm{left}} \le 2|\sigma_2^+|, \qquad \Delta R_\alpha^{\mathrm{left}} \le |\sigma_2^+|, \qquad \Delta\Upsilon^{\mathrm{left}} \le -\frac{1}{2K}|\sigma'\sigma''|.$$

And therefore

$$\begin{aligned}
K_1\Delta S_{i,\alpha}^{\mathrm{left}} + K_2\Delta R_\alpha^{\mathrm{left}} + K_3\Delta\Upsilon^{\mathrm{left}} &\le (2K_1 + K_2)|\sigma_2^+| - \frac{K_3}{2K}|\sigma'\sigma''|\\
&\le \Big((2K_1 + K_2)CM - \frac{K_3}{2K}\Big)|\sigma'\sigma''|,
\end{aligned}$$

so that $\Delta\Upsilon_\xi^{\mathrm{left}} \le 0$, provided $K_3 > 2K(2K_1 + K_2)CM$. Considering the waves on the right of the interaction point, we have

$$\Delta S_{i,\alpha}^{\mathrm{right}} \le 2|\sigma_2^+|, \qquad \Delta R_\alpha^{\mathrm{right}} = |\sigma_2^+|, \qquad \Delta\Upsilon^{\mathrm{right}} \le -\frac{1}{2K}|\sigma'\sigma''|,$$

so that $\Delta\Upsilon_\xi^{\mathrm{right}} \le 0$ under the same conditions as above. Concerning the interacting waves, using (2) in Lemma 4.3, we get

$$\begin{aligned}
S_1^+ - S' &= |\sigma_1^+| - |\sigma'| + 2|\sigma_2^+| = 3|\sigma_2^+| - |\sigma''|,\\
S_1^+ - S'' &= |\sigma_1^+| - 2|\sigma'| + 2|\sigma_2^+| \le |\sigma_2^+| - |\sigma'|,\\
S_2^+ - S_1^+ &= |\sigma_1^+| - |\sigma_2^+| = |\sigma'| - |\sigma''| \le |\sigma'|,\\
R_2^+ - R'' &= -|\sigma'|, \qquad R_1^+ - R'' = -|\sigma'|, \qquad R' - R'' = -|\sigma'| \le 0.
\end{aligned}$$

Therefore, by Lemma 4.5 and (2) in Lemma 4.3, following computations very similar to those at [11, 3, proof of Proposition 2.7], we get $\Delta\Upsilon_\xi^{\mathrm{int}} \le 0$ as soon as $K_1 \ge C$, $K_2 \ge K_1$ and $K_3 \ge 6CKK_1$.

4. Interaction between a 1-shock $\sigma'$ and a 1-rarefaction $\sigma''$ resulting in a 1-rarefaction. As observed before, $\sigma_2^+ = 0$. For the waves not taking part in the interaction we have $\Delta S_{i,\alpha} = -2|\sigma'|$. For the waves on the left of the interaction point, $\Delta R_\alpha \le -|\sigma'|$, while for waves on the right, $\Delta R_\alpha = 0$. Therefore $\Delta\Upsilon^{\mathrm{left}} \le 0$ and $\Delta\Upsilon^{\mathrm{right}} \le 0$. Regarding the waves entering or exiting the interaction point, we compute

$$S_1^+ - S' \le -|\sigma'|, \qquad S_1^+ - S'' \le -2|\sigma'|, \qquad R_1^+ - R' \le -|\sigma'|, \qquad R_1^+ - R'' \le -|\sigma_1^+|.$$

Hence, following computations similar to those at [11, 4, proof of Proposition 2.7], we get $\Delta\Upsilon_\xi^{\mathrm{int}} \le 0$ as soon as $K_2 \ge C$ and $2(K_1 + K_3) \ge C$.


The following results are proved as in [5,11].

Proposition 4.4 ([5, Prop. 7] or [11, Prop. 2.7]). Let us consider the system (1.1) and let us take $M$ as in Theorem 4.1. Then there exist positive constants $K_1$, $K_2$ and $K_3$, independent of $\varepsilon$ in (4.10), such that the following holds: if $\Gamma_0$ is a pseudopolygonal, then the weighted length $\|\Gamma_\tau\|_\varepsilon$ of the pseudopolygonal $\Gamma_\tau = S^\varepsilon_\tau\Gamma_0$ is a non-increasing function of time, i.e. the map $t \mapsto \Upsilon^\varepsilon_\xi(S^\varepsilon_t\Gamma)$ is non-increasing.

Proposition 4.5 ([5, Prop. 8] or [11, Prop. 2.8]). Any two functions $u, u'$ in $D^\varepsilon_M$ can be joined by a pseudopolygonal $\Gamma$ entirely contained in $D^\varepsilon_M$. Moreover, the weighted length of this pseudopolygonal is uniformly equivalent to the usual $L^1$-distance, i.e.,

$$\frac{1}{C}\cdot\|u - u'\|_{L^1} \le (b-a)\cdot\Upsilon^\varepsilon_\xi(\Gamma) \le C\cdot\|u - u'\|_{L^1}.$$

Proposition 4.6 ([11, Prop. 2.9]). Let $M$ be as in Theorem 4.1. Then, the semigroup $S^\varepsilon\colon [0,+\infty[\,\times D^\varepsilon_M \to D^\varepsilon_M$ defined by (4.9) is uniformly Lipschitz continuous with respect to the $L^1$ distance, with a constant independent of $\varepsilon$.

As in [5], to complete the proof of Theorem 4.1, we consider a sequence of semigroups $S^{\varepsilon_n}$ with $\lim_{n\to+\infty}\varepsilon_n = 0$, and construct the limit semigroup as $S = \lim_{n\to+\infty}S^{\varepsilon_n}$. More precisely, for $\bar u \in D$ and $t \ge 0$, we define

$$S_t\bar u = \lim_{n\to+\infty}S^{\varepsilon_n}_t\bar u_n,$$

where $\bar u_n \in D^{\varepsilon_n}_M$ is any sequence converging to $\bar u$ in $L^1$. The limit is unique and depends continuously on the initial data. With easy computations, we can verify that if $TV(u_o) \le \kappa_\gamma M$, then $TV(S_tu_o) \le M$ for all $t \ge 0$, and $\lim_{\gamma\to1}\kappa_\gamma = 1/\big(1 + H(1,M,u^-,u^+)\big)$; therefore the conclusion of the proof of Theorem 4.1 follows as in [5].

5. The Classical Case

This section describes the modifications necessary to show that also the classical case (1.2) fits in Theorem 4.1. Again, $u$ denotes the conserved quantities and $f$ the flow:

$$u_1 = \rho, \quad u_2 = \rho v, \qquad f(u_1,u_2) = \begin{bmatrix}\rho v\\ \rho v^2 + p(\rho)\end{bmatrix}.$$

Conditions (4.3) reduce to the known inequalities [17, (XX)]

$$\int_0^{\rho^+}\frac{\sqrt{p'(r)}}{r}\,dr > \frac{M}{4} \quad\text{and}\quad \int_0^{\rho^-}\frac{\sqrt{p'(r)}}{r}\,dr > \frac{M}{4}. \qquad (5.1)$$

All the results obtained for the relativistic system (1.1) have an immediate analogue for the classical case (1.2). For instance, Theorem 4.1 can be restated as follows. (Recall the definition (2.7) of $U$.)

Theorem 5.1. Choose a positive $M$ and states $u^-$, $u^+$ satisfying (5.1) for some $\bar\gamma \in\, ]1,2]$. Then, there exists $\gamma_o \in\, ]1,\bar\gamma]$ such that for all $\gamma \in [1,\gamma_o[$, system (1.2) generates a Standard Riemann Semigroup $S\colon [0,+\infty[\,\times D \to D$. Moreover, for a suitable $\kappa_\gamma \in\, ]0,1[$,

(1) $D \supseteq \mathrm{cl}_{L^1}\big\{u \in BV(\mathbb{R};U) : \lim_{x\to-\infty}u(x) = u^-,\ \lim_{x\to+\infty}u(x) = u^+,\ TV(u) \le \kappa_\gamma M\big\}$;
(2) if $TV(u_o) \le \kappa_\gamma M$, then $TV(S_tu_o) \le M$ for all $t \ge 0$;
(3) $\lim_{\gamma\to1}\kappa_\gamma = 1/\big(1 + H(1,M,u^-,u^+)\big)$ with $H(1,M,u^-,u^+)$ as in (3.4).

The proof is entirely similar to that of Theorem 4.1, so we only sketch it. Coherently with the limit $c \to +\infty$ in (2.8), the Riemann coordinates are

$$v_1 = v - \int_0^\rho\frac{\sqrt{p'(r)}}{r}\,dr \quad\text{and}\quad v_2 = v + \int_0^\rho\frac{\sqrt{p'(r)}}{r}\,dr,$$

and, using the $\gamma$-law,

$$\int_0^\rho\frac{\sqrt{p'(r)}}{r}\,dr = \frac{2\sqrt{\gamma}\,\zeta}{\gamma-1}\,\rho^{(\gamma-1)/2} \ \text{ if } \gamma > 1, \qquad \int_{\rho_*}^\rho\frac{\sqrt{p'(r)}}{r}\,dr = \zeta\ln(\rho/\rho_*) \ \text{ if } \gamma = 1. \qquad (5.2)$$

The function $\varphi$ in (3.3) becomes (see [11, § 4])

$$\varphi_\infty(\sigma) = \begin{cases} 0 & \sigma \ge 0,\\ -\tfrac12\sigma + 2\zeta\sinh\dfrac{\sigma}{4\zeta} & \sigma < 0.\end{cases}$$

Note that, in both cases, the relations above coincide with the formal limits for $c \to +\infty$ of the analogous relativistic conditions. In the classical case, Theorem 4.1 can thus be entirely rephrased, providing Lipschitz continuous dependence to the solutions constructed in [16].

6. The Limit c → +∞

Now, we can extend to the case "$\gamma$ near to 1" the rigorous classical limit $c \to +\infty$ obtained in [3, Theorem 4.1] for $\gamma = 1$, see also [9]. We prove that as $c \to +\infty$ any solution of (1.1) converges to the corresponding solution of the classical p-system (1.2) with $1/c^2$ as rate of convergence. Below, we denote by $S^c\colon [0,+\infty[\,\times D^{c_o} \to D^c$ the semigroup constructed in Theorem 4.1 and by $S\colon [0,+\infty[\,\times D \to D$ the one defined in Theorem 5.1. Throughout this section, $\gamma \in\, ]1,\gamma_o]$.

Proposition 6.1. Fix $\gamma \in [1,3]$. Let $c_o \in\, ]0,+\infty[$ be fixed. Choose a positive $M$ and states $u^-$, $u^+$ satisfying (4.3) for $c = c_o$. Then, $M$, $u^-$ and $u^+$ satisfy (4.3) for all $c \in [c_o,+\infty[$ and also (5.1). Moreover, for all $u$ such that $\lim_{x\to-\infty}u(x) = u^-$, $\lim_{x\to+\infty}u(x) = u^+$ and $TV(u) \le \kappa_\gamma M$,

$$\big\|S^c_tu - S_tu\big\|_{L^1} \le C\cdot\frac{1}{c^2}\cdot t,$$

where the constant $C$ depends only on $M$, $u^\pm$ and $TV(u)$.


Proof. To prove the first statement, simply observe that for all $c > c_o$,

$$\int_0^\rho\frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c^2}}\,dr > \int_0^\rho\frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c_o^2}}\,dr \quad\text{and}\quad \int_0^\rho\frac{\sqrt{p'(r)}}{r}\,dr > \int_0^\rho\frac{\sqrt{p'(r)}}{r + \frac{p(r)}{c^2}}\,dr.$$

Thanks to the constructions of the SRS provided by Theorem 4.1, the latter statement follows from [3, Cor. 2.5].

Acknowledgement. D. Calvo would like to acknowledge the support from CNPq, Brazil, while she was visiting IMPA, during 2006, through the post-doctoral fellowship 15.1452/2005-9. H. Frid would like to acknowledge the support from CNPq, through the grant 306137/2006-2, and FAPERJ, grant E-26/152.192-2002.

References

1. Ambrosetti, A., Prodi, G.: A primer of nonlinear analysis. Volume 34 of Cambridge Studies in Advanced Mathematics, Cambridge: Cambridge University Press, 1995, Corrected reprint of the 1993 original
2. Bakhvalov, N.: The existence in the large of a regular solution of a quasilinear hyperbolic system. USSR Comp. Math. and Math. Phys. 10, 205–219 (1970)
3. Bianchini, S., Colombo, R.M.: On the stability of the standard Riemann semigroup. Proc. Amer. Math. Soc. 130(7), 1961–1973 (2002) (electronic)
4. Bressan, A.: Hyperbolic systems of conservation laws. The one-dimensional Cauchy problem. Volume 20 of Oxford Lecture Series in Mathematics and its Applications, Oxford: Oxford University Press, 2000
5. Bressan, A., Colombo, R.M.: The semigroup generated by 2 × 2 conservation laws. Arch. Rat. Mech. Anal. 133(1), 1–75 (1995)
6. Bressan, A., Colombo, R.M.: Unique solutions of 2 × 2 conservation laws with large data. Indiana Univ. Math. J. 44(3), 677–725 (1995)
7. Bressan, A., LeFloch, P.: Uniqueness of weak solutions to systems of conservation laws. Arch. Rat. Mech. Anal. 140(4), 301–317 (1997)
8. Bressan, A., Lewicka, M.: A uniqueness condition for hyperbolic systems of conservation laws. Discrete Contin. Dyn. Syst. 6(3), 673–682 (2000)
9. Chen, G.-Q., Christoforou, C., Zhang, Y.: Dependence of entropy solutions in the large for the Euler equations on nonlinear flux functions. Preprints on Conservation Laws, 2006-027, http://www.math.ntnu.no/conservation/2006/, 2006
10. Chen, J.: Conservation laws for the relativistic p-system. Comm. Part. Differ. Eqs. 20(9–10), 1605–1646 (1995)
11. Colombo, R.M., Risebro, N.H.: Continuous dependence in the large for some equations of gas dynamics. Comm. Part. Differ. Eqs. 23(9–10), 1693–1718 (1998)
12. DiPerna, R.J.: Global solutions to a class of nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math. 26, 1–28 (1973)
13. Frid, H.: Periodic solutions of conservation laws constructed through Glimm scheme. Trans. Amer. Math. Soc. 353(11), 4529–4544 (2001)
14. Frid, H., LeFloch, P.G.: Uniqueness for multidimensional hyperbolic systems with commuting Jacobians. Arch. Rat. Mech. Anal. 182(1), 25–47 (2006)
15. Frid, H., Perepelitsa, M.: Spatially periodic solutions in relativistic isentropic gas dynamics. Commun. Math. Phys. 250(2), 335–370 (2004)
16. Nishida, T., Smoller, J.A.: Solutions in the large for some nonlinear hyperbolic conservation laws. Comm. Pure Appl. Math. 26, 183–200 (1973)
17. Smoller, J.: Shock waves and reaction-diffusion equations. Second ed., New York: Springer-Verlag, 1994
18. Smoller, J., Temple, B.: Global solutions of the relativistic Euler equations. Commun. Math. Phys. 156(1), 67–99 (1993)

Communicated by P. Constantin

Commun. Math. Phys. 284, 537–552 (2008)
Digital Object Identifier (DOI) 10.1007/s00220-008-0539-9

Communications in Mathematical Physics

Poisson Groups and Differential Galois Theory of Schroedinger Equation on the Circle

Ian Marshall1, Michael Semenov-Tian-Shansky2,3

1 Mathematics Department, University of Loughborough, Loughborough, UK. E-mail: [email protected]
2 Institute Mathématique de Bourgogne, Dijon, France. E-mail: [email protected]
3 Steklov Mathematical Institute, St. Petersburg, Russia

Received: 17 October 2007 / Accepted: 10 January 2008
Published online: 24 June 2008 – © Springer-Verlag 2008

Abstract: We combine the projective geometry approach to Schroedinger equations on the circle and differential Galois theory with the theory of Poisson Lie groups to construct a natural Poisson structure on the space of wave functions (at the zero energy level). Applications to KdV-like nonlinear equations are discussed. The same approach is applied to 2nd order difference operators on a one-dimensional lattice, yielding an extension of the lattice Poisson Virasoro algebra.

0. Introduction

It is well known that the space $H$ of Schroedinger operators on the circle

$$H_u = -\partial_x^2 - u, \qquad u \in C^\infty(S^1), \quad S^1 \simeq \mathbb{R}/2\pi\mathbb{Z},$$

may be regarded as the phase space for the KdV hierarchy (with periodic boundary conditions). It carries a family of natural Poisson structures which play an important rôle in the Hamiltonian description of the KdV flows. In this letter we shall be concerned with the so-called second Poisson structure for the KdV equation associated with the third order differential operator

$$l = \tfrac12\partial_x^3 + u\partial_x + \partial_xu. \qquad (1)$$

This Poisson structure may be regarded as the Lie–Poisson bracket associated with the Virasoro algebra and arises as a result of the identification of $H$ with (a hyperplane in) the dual space of the Virasoro algebra. Our aim is to describe its extension to the space of wave functions, i.e., of solutions of the Schroedinger equations (at zero energy level). Despite its apparent simplicity, this question involves several nontrivial points and has not been fully explored in the existing literature.¹

¹ We do not discuss the generalization to the case of higher order differential operators, as well as the relation to the Drinfeld–Sokolov theory [DS1]. These questions will be addressed in a separate publication.


According to elementary theory, for a given $u$ the space $V = V_u$ of solutions of the Schroedinger equation

$$-\psi'' - u\psi = 0 \qquad (2)$$

is 2-dimensional and for any two solutions $\varphi, \psi$ their wronskian $W = \varphi\psi' - \varphi'\psi$ is constant. An element $w \in V$ may be regarded as a non-degenerate quasi-periodic plane curve (the non-degeneracy condition means that $w \wedge w'$ is nowhere zero). There exists a matrix $M \in SL(2)$ (the monodromy matrix) such that, writing elements of $V$ as row vectors $w = (\varphi,\psi)$,

$$w(x + 2\pi n) = w(x)M^n, \qquad n \in \mathbb{Z}.$$

The group $G = SL(2)$ acts naturally on $V$ (preserving the wronskian) by right multiplication. $G$ plays a key rôle in the geometry of $H$ in its double guise of the differential Galois group of Eq. (2) and of the group of projective transformations. Both aspects are completely classical; the novel element introduced in the present paper consists in their interaction with the Poisson geometry.

Let us recall how the Schroedinger equation is seen from the viewpoint of projective geometry. The following assertion is well known (see [OT]).

Theorem. (i) Any pair of linearly independent solutions of the Schroedinger equation defines a non-degenerate quasi-periodic projective curve $\gamma\colon \mathbb{R} \to \mathbb{CP}^1$ such that $\gamma(x + 2\pi) = \gamma(x)M$. Any two projective curves associated with a given Schroedinger equation are related by a global projective transformation.
(ii) Conversely, any non-degenerate quasi-periodic projective curve may be lifted to a non-degenerate curve in $\mathbb{C}^2$ such that its wronskian is equal to 1.

In more abstract language, $H$ is the space of projective connections on the circle. For a given $H_u = -\partial^2 - u \in H$ there is a natural projective line bundle $P_u \to S^1$; the quasi-periodic projective curve referred to above is its covariantly constant section, and the group $G = SL(2)$, or, more precisely, the associated projective group $PSL(2) = SL(2)/\{\pm1\}$, its structure group.
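The quasi-periodicity relation $w(x + 2\pi n) = w(x)M^n$ and the right action of $G$ can be illustrated on the simplest example. The Python sketch below assumes a constant potential $u = k^2$ (an assumption made here for illustration only), for which $\varphi = \sin(kx)/k$ and $\psi = \cos(kx)$ solve Eq. (2); all function names are hypothetical. It also checks the direct consequence that, if $w$ is replaced by $w\cdot g$ with $g \in SL(2)$, the monodromy matrix is replaced by $g^{-1}Mg$. For this particular pair the wronskian $\varphi\psi' - \varphi'\psi$ equals $-1$, a non-zero constant.

```python
import math


def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_inv_sl2(A):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = A
    return ((d, -b), (-c, a))


def row_times(w, A):
    """Row vector w = (phi, psi) multiplied on the right by the matrix A."""
    return tuple(sum(w[k] * A[k][j] for k in range(2)) for j in range(2))


def wave_functions(x, k):
    """w(x) = (phi, psi) solving -psi'' - u psi = 0 for constant u = k**2."""
    return (math.sin(k * x) / k, math.cos(k * x))


def monodromy(k):
    """Matrix M with w(x + 2 pi) = w(x) M for the solutions above."""
    theta = 2.0 * math.pi * k
    return ((math.cos(theta), -k * math.sin(theta)),
            (math.sin(theta) / k, math.cos(theta)))


if __name__ == "__main__":
    k, x = 0.3, 0.7
    M = monodromy(k)
    print(wave_functions(x + 2.0 * math.pi, k))
    print(row_times(wave_functions(x, k), M))       # same vector as above

    g = ((2.0, 1.0), (3.0, 2.0))                    # det g = 1
    M_g = mat_mul(mat_mul(mat_inv_sl2(g), M), g)    # conjugated monodromy
    print(row_times(wave_functions(x + 2.0 * math.pi, k), g))
    print(row_times(row_times(wave_functions(x, k), g), M_g))
```

The two pairs of printed vectors agree up to rounding, for any choice of `x`.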
Without restricting the generality we may fix an affine coordinate on $\mathbb{CP}^1$ in such a way that $\infty$ corresponds to the zeros of the second coordinate $\psi$ of the point on the plane curve; with this choice $\gamma$ is replaced with the affine curve $x \mapsto \eta(x) = \varphi(x)/\psi(x)$. The potential $u$ may be restored from $\eta$ by the formula $u = \tfrac12S(\eta)$, where $S$ is the Schwarzian derivative

$$S(\eta) = \frac{\eta'''}{\eta'} - \frac32\left(\frac{\eta''}{\eta'}\right)^2,$$

which has the crucial property of being invariant under projective transformations

$$\eta \mapsto \frac{a\eta + c}{b\eta + d}$$

induced by the right action of $G$. The space $V$ of all quasi-periodic plane curves with wronskian 1, or the equivalent space of projective curves (together with the associated monodromy matrices)


encodes all information about Schroedinger operators. In [W1] G. Wilson considered the extension of the KdV hierarchy to this space. To put it in a more formal way let us note that the natural "algebra of observables" associated with the KdV equation consists of local functionals of the form

$$F[u] = \int_0^{2\pi}F(u,\partial_xu,\partial_x^2u,\dots)\,dx,$$

where $F$ is a polynomial (or, more generally, a rational) function of $u$ and of its derivatives. We can identify the observable $F[u]$ and the corresponding density; in other words, our basic algebra of observables is identified with the differential field $\mathbb{C}\langle u\rangle$. In the same way, we can associate with the space of solutions of the Schroedinger equation a bigger differential field $\mathbb{C}\langle\varphi,\psi\rangle$. Clearly, $\mathbb{C}\langle\varphi,\psi\rangle \supset \mathbb{C}\langle u\rangle$; as a matter of fact, $\mathbb{C}\langle u\rangle$ is isomorphic to the differential subfield of $G$-invariants and hence $\mathbb{C}\langle\varphi,\psi\rangle \supset \mathbb{C}\langle u\rangle$ is a differential Galois extension with differential Galois group $G = SL(2)$ (we shall speak below simply of Galois groups and Galois extensions, for short). Various subgroups of $G$ give rise to intermediate differential fields. In particular, for $Z = \{\pm I\}$ the associated subfield of invariants is naturally isomorphic to $\mathbb{C}\langle\eta\rangle$; since $Z$ is the center of $G$, the extension $\mathbb{C}\langle\eta\rangle \supset \mathbb{C}\langle u\rangle$ is again a Galois extension with the Galois group $PSL(2) = SL(2)/Z$.² Let $B$ be the subgroup of lower triangular matrices; its field of invariants $\mathbb{C}\langle\varphi,\psi\rangle^B$ may be identified with $\mathbb{C}\langle v\rangle$, where $v = \tfrac12\,\eta''/\eta'$. One has $u = v' - v^2$, which is the classical Miura transform. Note that since $B$ is not normal in $G$, $\mathbb{C}\langle v\rangle \supset \mathbb{C}\langle u\rangle$ is not a Galois extension, and hence, as noted by Wilson [W1], the treatment of the Miura transform requires the introduction of the 'universal covering' algebra $\mathbb{C}\langle\varphi,\psi\rangle$.

The natural idea explored in [W1] is the possibility to lift the KdV flows originally defined on $H$ to the bigger space $V$. An important ingredient of such an extension is to equip $V$ with a Poisson structure or its substitute. Wilson's point of view is to look at the symplectic form, because it may be naturally pulled back (at the expense of becoming degenerate, see [W1]).
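The relations $u = \tfrac12S(\eta)$ and $u = v' - v^2$, together with the projective invariance of the Schwarzian derivative, can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes the constant potential $u = k^2$, for which $\eta = \tan(kx)/k$ is a ratio of two solutions, and it approximates derivatives by central finite differences, so the agreement holds only up to discretization error. All function names and the step size are choices made here. Since $v = \tfrac12\,\eta''/\eta'$, the derivative $v'$ is expressed through $\eta', \eta'', \eta'''$ as $v' = \tfrac12\big(\eta'''/\eta' - (\eta''/\eta')^2\big)$.

```python
import math


def derivatives(f, x, h=1e-3):
    """Central finite-difference approximations of f', f'' and f''' at x."""
    f1 = (f(x + h) - f(x - h)) / (2.0 * h)
    f2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2
    f3 = (f(x + 2.0 * h) - 2.0 * f(x + h)
          + 2.0 * f(x - h) - f(x - 2.0 * h)) / (2.0 * h ** 3)
    return f1, f2, f3


def schwarzian(f, x, h=1e-3):
    """S(f) = f'''/f' - (3/2) (f''/f')**2, evaluated numerically at x."""
    f1, f2, f3 = derivatives(f, x, h)
    return f3 / f1 - 1.5 * (f2 / f1) ** 2


def miura_potential(f, x, h=1e-3):
    """u = v' - v**2 with v = f''/(2 f'), using v' = (f'''/f' - (f''/f')**2)/2."""
    f1, f2, f3 = derivatives(f, x, h)
    v = 0.5 * f2 / f1
    v_prime = 0.5 * (f3 / f1 - (f2 / f1) ** 2)
    return v_prime - v ** 2


def mobius(f, a, b, c, d):
    """Projective transformation eta -> (a eta + c) / (b eta + d)."""
    return lambda x: (a * f(x) + c) / (b * f(x) + d)


if __name__ == "__main__":
    k, x0 = 1.5, 0.3

    def eta(x):
        # ratio phi/psi for u = k**2
        return math.tan(k * x) / k

    print(0.5 * schwarzian(eta, x0), k ** 2)              # close to each other
    print(miura_potential(eta, x0), k ** 2)               # close to each other
    print(schwarzian(mobius(eta, 2.0, 1.0, 3.0, 2.0), x0),
          schwarzian(eta, x0))                            # close to each other
```

A symbolic computation would give exact identities; finite differences are used here only to keep the sketch free of external dependencies.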
A closer look at the situation reveals yet another difficulty: the relevant 'variational' 2-form is an integral of a density whose differential is not identically zero; rather it is a closed form on the circle and hence its contribution disappears only if we may discard 'total derivatives'. This convention, adopted in formal variational calculus, greatly simplifies many formulae, but sometimes hides important "obstruction terms". In Wilson's paper this difficulty is avoided by the tacit assumption that the monodromy matrix is equal to 1. Without this assumption the degenerate 2-forms discussed in his paper are not closed; hence finally his approach is intrinsically close to the quasi-Hamiltonian formalism of Alekseev, Malkin and Meinrenken [AMM].

An alternative approach, followed in the present paper, is to look at the Poisson structure. Of course, Poisson brackets cannot be pulled back, and hence we have to guess a Poisson structure on the extended algebra and then check its consistency with the original bracket. Our strategy is based on the projective point of view outlined above. Although the space of projective curves is our main object, it is natural to start with the much bigger space $W$ of all quasi-periodic plane curves,

$$W = \{(w = (\varphi,\psi), M) \mid w(x + 2\pi) = w(x)M\}.$$

² The Galois theory point of view was implicit in the old paper of Drinfeld and Sokolov [DS2], where wave functions for different values of energy are considered, leading to an extended class of "equations of KdV type". Generically, the associated Galois group becomes in this case the product of several copies of $SL(2)$.


The space $W$ contains the set $W'$ of all non-degenerate plane curves with non-zero wronskian as an open subset. Let $C := C^\infty(\mathbb{R}/2\pi\mathbb{Z}, \mathbb{C}^\times)$ be the scaling group which acts on $W'$ via

$$f\cdot(w', M) = (fw', M). \qquad (3)$$

Clearly, $C$ acts freely on $W'$ and the quotient may be identified with $V$. The action of the linear group $G = SL(2)$ on $W$ is via $g\colon w \mapsto w\cdot g$, $M \mapsto g^{-1}Mg$.

The key condition which we use to restrict the choice of the Poisson structure on $W$ is its covariance with respect to the group action. This condition puts us in the framework of Poisson group theory, as it allows both $C$ and $G$ to carry nontrivial Poisson structures, although it does not presume any a priori choice of these structures. As it happens, the covariance condition together with the natural constraint on the wronskian make their choice almost completely canonical. (In particular, the Poisson bracket on $G$ is fixed up to scaling and conjugation; it is of the standard "quasitriangular" type and the case of zero bracket is excluded.) The Poisson structure on $W$ constructed in this way is closely related to the so-called exchange algebras discovered at the end of the 1980s [B1]. We believe that the point of view adopted in the present paper provides a useful and nontrivial complement to these old results in making explicit the hidden Poisson group aspects of differential Galois theory.

Our main result is summarized in Theorem 2.13 describing the essentially unique covariant Poisson structure on the space of projective curves and on all levels of the associated tower of differential Galois extensions. At the bottom of the tower we recover the Poisson Virasoro algebra. It is remarkable that the Poisson Virasoro algebra, as well as the entire tower of its Galois extensions, arise from simple covariance requirements for the Poisson structure on the top space. As a corollary we get a natural tower of compatible Hamiltonian flows associated with equations of the KdV type. The whole construction has a natural finite difference version discussed in Sect. 3.
The discrete counterpart of the space of projective curves is the space of projective configurations carrying a covariant Poisson structure which again is unique and gives rise to Poisson structures on all levels of the associated tower of difference Galois extensions. This time the Poisson algebra located at the bottom of the tower is the discrete analogue of the Virasoro algebra, already discussed in [FT,V,B2] and [FRS].

1. A Review of Poisson Lie Groups

Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. A Poisson structure on $G$ is called multiplicative if the multiplication $m\colon G \times G \to G$ is a Poisson mapping. A Lie group equipped with a multiplicative Poisson bracket is called a Poisson Lie group. Any multiplicative Poisson bracket on $G$ identically vanishes at its unit element $e \in G$; its linearization at $e$ gives rise to the structure of a Lie algebra on the dual space $\mathfrak{g}^*$; multiplicativity then implies that the dual of the commutator map $[\ ,\ ]\colon \mathfrak{g}^* \times \mathfrak{g}^* \to \mathfrak{g}^*$ is a 1-cocycle on $\mathfrak{g}$. A pair $(\mathfrak{g}, \mathfrak{g}^*)$ with these properties is called a Lie bialgebra. A fundamental theorem, due to Drinfeld, asserts that a multiplicative Poisson bracket on $G$ is completely determined by its linearization and hence there is an equivalence between


the category of Poisson Lie groups (whose morphisms are Lie group homomorphisms which are also Poisson mappings) and the category of Lie bialgebras (whose morphisms are homomorphisms of Lie algebras such that their duals are homomorphisms of the dual algebras).

An action G × M → M of a Poisson group on a Poisson manifold M is called a Poisson action if this mapping is Poisson; in other words, for F, H ∈ Fun(M), their Poisson bracket at the transformed point g · m ∈ M may be computed as follows:

\[ \{F, H\}_M(g \cdot m) = \{\hat F(m, \cdot), \hat H(m, \cdot)\}_G(g) + \{\hat F(\cdot, g), \hat H(\cdot, g)\}_M(m), \tag{4} \]

where in the r.h.s. we set F̂(m, g) = F(g · m), Ĥ(m, g) = H(g · m) and treat them as functions of two variables g ∈ G, m ∈ M. In that case we shall also say that the Poisson bracket on M is G-covariant. The choice of the basic ring of functions on M depends on the context; we may work, for instance, in the C∞-setting or, alternatively, consider the rings of polynomial or rational functions on the appropriate manifolds.

It is sometimes useful to restrict the action G × M → M to a subgroup of G. A natural class of subgroups of G are those Lie subgroups which are also Poisson submanifolds, for which the inherited Poisson structure is of course multiplicative. This class, however, is too restricted, since a Poisson Lie group may have very few Poisson subgroups; a wider class consists of the so-called admissible subgroups. A subgroup H ⊂ G of a Poisson Lie group G is called admissible if, for any Poisson G-space M, the subalgebra of H-invariants Fun(M)^H ⊂ Fun(M) is closed with respect to the Poisson bracket. A simple admissibility criterion is stated as follows. Let h ⊂ g be the Lie algebra of H and h⊥ ⊂ g∗ its annihilator in g∗. Then H ⊂ G is admissible if and only if h⊥ ⊂ g∗ is a Lie subalgebra; H ⊂ G is a Poisson subgroup if and only if h⊥ is an ideal in g∗. Let us assume that H is admissible and that the quotient space M/H is smooth. In that case we may identify Fun(M/H) with Fun(M)^H and hence the quotient space inherits the Poisson structure. This is the basis of Poisson reduction, originally introduced by Lie.

The only nontrivial example which we need in the present paper is the projective group G = SL(2) (or PSL(2)). The group G = SL(2, C) carries a family of natural Poisson structures called the Sklyanin brackets which make it a Poisson Lie group.
These Poisson structures are parameterized by the choice of a classical r-matrix r ∈ g ∧ g; for g = sl(2) the classical Yang–Baxter equation does not impose any restrictions on the choice of r, so any element of g ∧ g gives rise to a Poisson bracket on G. It is specified by the set of Poisson bracket relations for the matrix coefficients of G (regarded as generators of its affine ring). In the usual tensor notation we have

\[ \{g_1, g_2\} = [r, g_1 g_2], \tag{5} \]

where in the r.h.s. we regard r ∈ g ∧ g and g_1 g_2 = g ⊗ g as elements of Mat(2) ⊗ Mat(2) ≃ Mat(4) and compute the commutator in Mat(4). Let h, e, f be the standard generators of sl(2). Up to the natural equivalence there exist three types of classical r-matrices: (a) r = 0, (b) r = h ∧ f, (c) r = ε e ∧ f, where ε is a scaling parameter. They correspond to three types of G-orbits in g. Case (a) gives the trivial bracket; case (c) is generic; case (b) (the so-called triangular r-matrix) is degenerate. The standard Poisson


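bracket on G which corresponds to case (c), with scaling parameter ε, is given by the following set of relations for the matrix coefficients of g = \begin{pmatrix} α & β \\ γ & δ \end{pmatrix}:

\[ \{α, β\} = ε\,αβ, \quad \{α, γ\} = ε\,αγ, \quad \{β, δ\} = ε\,βδ, \quad \{γ, δ\} = ε\,γδ, \quad \{β, γ\} = 0, \quad \{α, δ\} = 2ε\,βγ. \tag{6} \]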
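The passage from the tensor formula (5) to the component relations (6) can be checked mechanically. The following is a minimal sketch, assuming the normalization e ∧ f = e ⊗ f − f ⊗ e and the index convention that the bracket {g_ik, g_jl} sits in row (i, j), column (k, l) of the 4 × 4 matrix; both conventions and all function names are our assumptions, not taken from the paper.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices: entry [2i+j][2k+l] = a[i][k] * b[j][l]."""
    return [[a[i][k] * b[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sklyanin_matrix(g, eps=1):
    """Right-hand side of (5): [r, g (x) g] with r = eps * (e (x) f - f (x) e) (assumed normalization)."""
    e = [[0, 1], [0, 0]]
    f = [[0, 0], [1, 0]]
    ef, fe = kron(e, f), kron(f, e)
    r = [[eps * (ef[i][j] - fe[i][j]) for j in range(4)] for i in range(4)]
    gg = kron(g, g)
    rg, gr = matmul(r, gg), matmul(gg, r)
    return [[rg[i][j] - gr[i][j] for j in range(4)] for i in range(4)]

def bracket(g, ik, jl, eps=1):
    """Read off {g_ik, g_jl} from the 4x4 matrix (0-based indices)."""
    (i, k), (j, l) = ik, jl
    return sklyanin_matrix(g, eps)[2 * i + j][2 * k + l]

def relations_6(g, eps=1):
    """The six relations (6) evaluated at the numeric matrix g = [[alpha, beta], [gamma, delta]]."""
    (al, be), (ga, de) = g
    return {
        ((0, 0), (0, 1)): eps * al * be,      # {alpha, beta}
        ((0, 0), (1, 0)): eps * al * ga,      # {alpha, gamma}
        ((0, 1), (1, 1)): eps * be * de,      # {beta, delta}
        ((1, 0), (1, 1)): eps * ga * de,      # {gamma, delta}
        ((0, 1), (1, 0)): 0,                  # {beta, gamma}
        ((0, 0), (1, 1)): 2 * eps * be * ga,  # {alpha, delta}
    }

g = [[2, 3], [5, 8]]  # an integer matrix with determinant 1
agrees = all(bracket(g, ik, jl) == value
             for (ik, jl), value in relations_6(g).items())
```

Since both sides are polynomial in the matrix entries and linear in ε, agreement at a few numeric points is only a sanity check of the conventions, not a proof.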

(Notice that det g = αδ − βγ is a Casimir function and hence the Poisson bracket is well defined on the coordinate ring of SL(2) and even of PSL(2).) In the sequel we shall be mainly concerned with the standard bracket (6). We shall see that the covariance condition together with the wronskian constraint fix the Poisson structure on G uniquely up to scaling and conjugation; in particular, r-matrices of types (a) and (b) are excluded.

It will be important for us to have an explicit description of the dual Poisson group associated with the standard r-matrix (of type (c)) on g. Let b± ⊂ g be the opposite Borel subalgebras of g = sl(2) which consist of upper (respectively, lower) triangular matrices. The dual Lie algebra g∗ associated with the standard r-matrix may be identified with the subalgebra of b+ ⊕ b−,

\[ \mathfrak g^* = \{(X_+, X_-) \in \mathfrak b_+ \oplus \mathfrak b_- \mid \operatorname{diag} X_+ + \operatorname{diag} X_- = 0\}. \tag{7} \]

We conclude, in particular, that the standard Cartan subgroup A, Borel subgroups B± and unipotent subgroups N± ⊂ B± are admissible subgroups of G. (Of course, this is not true for conjugate subgroups!) The Lie group G∗ associated with g∗ may be identified with the subgroup in B+ × B−,

\[ G^* = \{(b_+, b_-) \in B_+ \times B_- \mid \operatorname{diag} b_+ \cdot \operatorname{diag} b_- = I\}. \]

It carries a natural Poisson bracket which makes it a Poisson Lie group; this is the dual Poisson Lie group of G. The mapping

\[ G^* \to G : (b_+, b_-) \mapsto M = b_+ b_-^{-1} \]

maps G∗ onto an open dense subset in G; the Poisson structure induced on this subset extends smoothly to the entire manifold G. Explicitly it is described by the following formula:

\[ \{M_1, M_2\} = M_1 M_2\, r + r\, M_1 M_2 - M_2\, r_+ M_1 - M_1\, r_- M_2, \tag{8} \]

where r± = r ± t and t ∈ g ⊗ g stands for the tensor Casimir element. This Poisson structure on G has a number of remarkable properties; in particular, its symplectic leaves are conjugacy classes in G; moreover, this bracket is covariant with respect to the action of G (equipped with the bracket (6)) by conjugation. Conversely, the only Poisson structure on G (now regarded as a G-space, not as a group) which is Poisson covariant with respect to the action of G by conjugation is that given by (8).


2. The Space of Wave Functions as a Poisson Space

We shall assume in the sequel that all functions take values in C. For M ∈ SL(2, C) let W_M be the space of smooth quasi-periodic plane curves,

\[ W_M = \{\, w : \mathbb R \to \mathbb C^2 \mid w(x + 2\pi) = w(x) M \ \text{for all } x \,\}, \tag{9} \]

where w is denoted by a row vector. Let W be the set of pairs, W = {(w, M) | M ∈ SL(2, C), w ∈ W_M}. The wronskian of a curve w = (φ, ψ) is defined by the standard formula

\[ W(\phi, \psi) = \phi \psi' - \phi' \psi, \tag{10} \]

and we restrict attention to the open subset of W consisting of non-degenerate curves, i.e. those having non-zero wronskian, keeping the notation W for this subset. We want to find the most general Poisson structure on W which is covariant with respect to the right action of G = SL(2, C) and to the action of the scaling group C. This structure appears to be partially rigid. It is convenient to describe this Poisson structure by giving the Poisson brackets of the ‘evaluation functionals’ which assign to wave functions φ, ψ their values at the running point x ∈ R. The covariance with respect to the local scaling group implies that these brackets are quadratic and local, i.e., depend only on the values of φ, ψ at the given points.

Lemma 2.1. Assume that the Poisson bracket on W is covariant with respect to the action of C. Then the Poisson structure on C is trivial and, writing w = (φ, ψ), the bracket of evaluation functionals has the form

\[
\begin{aligned}
\{\phi(x), \phi(y)\} &= A(x, y)\,\phi(x)\phi(y), \\
\{\psi(x), \psi(y)\} &= D(x, y)\,\psi(x)\psi(y), \\
\{\phi(x), \psi(y)\} &= B(x, y)\,\phi(x)\psi(y) + C(x, y)\,\phi(y)\psi(x).
\end{aligned}
\tag{11}
\]

It is natural to assume that the bracket (11) is translation invariant, i.e., the structure functions depend only on the difference x − y. Using tensor notation, we can write these Poisson brackets in the following condensed form:

\[ \{w_1(x), w_2(y)\} = w_1(x) w_2(y)\, R(x, y), \tag{12} \]

where w(x) = (φ(x), ψ(x)) and we write the tensor product w_1(x)w_2(y) as a row vector of length 4, w_1(x)w_2(y) = (φ(x)φ(y), φ(x)ψ(y), ψ(x)φ(y), ψ(x)ψ(y)); the matrix R(x, y) ∈ Mat(4) is given by

\[
R(x, y) = \begin{pmatrix}
A(x - y) & 0 & 0 & 0 \\
0 & B(x - y) & -C(y - x) & 0 \\
0 & C(x - y) & -B(y - x) & 0 \\
0 & 0 & 0 & D(x - y)
\end{pmatrix}.
\]

Poisson brackets of this type were first studied in [B1] (for a special choice of R). It is convenient to drop temporarily the Jacobi identity condition and to consider all (generalized) Poisson brackets which are covariant with respect to the Galois group action.


Lemma 2.2. Let us assume that the Poisson bracket (12) is right-G-invariant; then the exchange matrix has the structure

\[
R_0(x, y) = a(x - y)\, I + \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & c(x - y) & -c(x - y) & 0 \\
0 & c(x - y) & -c(x - y) & 0 \\
0 & 0 & 0 & 0
\end{pmatrix},
\tag{13}
\]

where a and c are arbitrary odd functions.

Lemma 2.3. Fix an arbitrary r-matrix r ∈ g ∧ g and equip G with the corresponding Sklyanin bracket (5). Let us assume that the Poisson bracket (12) is right-G-covariant; then the exchange matrix has the structure

\[ R_r(x, y) = R_0(x, y) + r, \tag{14} \]

where we write r ∈ g ∧ g ⊂ Mat(2) ⊗ Mat(2) as a 4 × 4 matrix in the standard way.

For g = sl(2) the classical Yang–Baxter equation does not impose any restrictions on the choice of r; indeed, it amounts to the requirement that the Schouten bracket [r, r] ∈ g ∧ g ∧ g should be ad g-invariant, but for g = sl(2) we have ∧³g ≃ C. Still, we must distinguish two cases:
– [r, r] = 0, which happens when r = 0 or r is triangular (cases (a) and (b) of the classification in Sect. 1 above).
– [r, r] = −ε² ≠ 0, with ε the scaling parameter of r, which happens when r is quasitriangular (case (c)).
Since R_r in (14) is the sum of two terms, the Schouten bracket [r, r] gives an extra term to the Jacobi identity for the corresponding exchange bracket.

Lemma 2.4. The exchange bracket (12) with exchange matrix (14) satisfies the Jacobi identity if and only if

\[ c(x - y)c(y - z) + c(y - z)c(z - x) + c(z - x)c(x - y) = 0 \tag{15} \]

in cases (a) and (b) and

\[ c(x - y)c(y - z) + c(y - z)c(z - x) + c(z - x)c(x - y) = -ε^2 \tag{16} \]

in case (c).

Functional equation (16) is a version of the so-called Rota–Baxter equation. To solve it, one can put c(x) = εC(x) and express C as a Cayley transform,

\[ C(x) = \frac{f(x) + 1}{f(x) - 1}; \]

then (16) immediately yields for f the standard 2-cocycle relation f(x − y) f(y − z) f(z − x) = 1. The obvious solution is thus C_λ(x − y) = coth λ(x − y), where λ is a parameter. Setting λ → ∞, we obtain a particular solution C(x − y) = sign(x − y). We shall see that this special solution is the only one which is compatible with the constraint W = 1. The solution of the degenerate equation (15) is c(x) = 1/x.

So far, the most general Poisson structure on W still contains functional moduli and a free parameter. As is easy to check, the Poisson brackets for the ratio η = φ/ψ do not depend on a:


Proposition 2.5. We have

\[ \{\eta(x), \eta(y)\} = \eta(x)^2 - \eta(y)^2 - c(x - y)\,\bigl(\eta(x) - \eta(y)\bigr)^2. \tag{17} \]
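The solutions of the functional equations (15) and (16) quoted after Lemma 2.4 are easy to test numerically. The sketch below, with the scaling parameter normalized to 1 (our choice) and with function names that are ours, evaluates the cyclic sum on the left-hand side for C(x) = coth λx, for C(x) = sign x, and for c(x) = 1/x at three pairwise distinct points.

```python
import math

def cyclic_sum(c, x, y, z):
    """Left-hand side of (15)/(16): c(x-y)c(y-z) + c(y-z)c(z-x) + c(z-x)c(x-y)."""
    p, q, s = c(x - y), c(y - z), c(z - x)
    return p * q + q * s + s * p

def coth_solution(lam):
    """C_lambda(t) = coth(lambda * t), defined for t != 0."""
    return lambda t: 1.0 / math.tanh(lam * t)

def sign(t):
    """sign(t) with sign(0) = 0."""
    return (t > 0) - (t < 0)

def rational_solution(t):
    """c(t) = 1/t, the solution of the degenerate equation (15)."""
    return 1.0 / t

x, y, z = 0.3, 1.7, -2.2  # arbitrary pairwise distinct sample points
quasitriangular = cyclic_sum(coth_solution(0.8), x, y, z)  # close to -1, as in (16)
limiting = cyclic_sum(sign, x, y, z)                       # exactly -1
degenerate = cyclic_sum(rational_solution, x, y, z)        # close to 0, as in (15)
```

The coth case rests on the addition formula for coth; the 1/x case reduces to (a + b + d)/(abd) with a + b + d = 0.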

Remark 2.6. Formula (17) defines a family of G-covariant Poisson brackets on the space of projective curves. As usual, we must understand it in the distribution sense. The smooth terms in (17) represent a finite-dimensional perturbation of the Poisson operator associated with the kernel c(x − y); these ‘perturbation terms’ are imposed by the covariance condition. In order to establish a connection between these brackets and Schroedinger operators we must take into account the wronskian constraint which restricts the choice of c. The second structure function a drops out after projectivization and is not restricted by the Jacobi identity. We shall see, however, that the wronskian constraint suggests a natural way to choose a as well. An interpretation of the general family (17) of Poisson brackets remains an open question.

Our next proposition describes the basic Poisson bracket relations for the wronskian:

Proposition 2.7. We have

\[ \{W(x), \phi(y)\} = \bigl(c(x - y) - 2a(x, y)\bigr) W(x)\phi(y) - c'(x - y)\,\phi(x)\,[\phi(x)\psi(y) - \psi(x)\phi(y)]. \tag{18} \]

By symmetry, a similar formula holds for {W(x), ψ(y)}. Formula (18) immediately leads to the following crucial observation:

Proposition 2.8. The constraint W = 1 is compatible with the Poisson brackets for scaling invariant η if and only if the last term in (18) is identically zero; this is possible if and only if C′(x − y) is a multiple of δ(x − y), i.e., if C(x − y) is a multiple of sign(x − y).

It is important that the wronskian constraint excludes the possibility that the scaling parameter ε vanishes, and hence the corresponding Poisson structure on G is conjugate to the standard one (case (c)). From now on, without restricting the generality, we fix ε = 1.

Proposition 2.9. Let us assume that c(x − y) = sign(x − y); then the Poisson bracket relations for the wronskian are given by:

\[ \{W(x), W(y)\} = \bigl(\operatorname{sign}(x - y) - 2a(x, y)\bigr) W(x)W(y), \tag{19} \]

or, equivalently,

\[ \{\log W(x), \log W(y)\} = \operatorname{sign}(x - y) - 2a(x, y). \tag{20} \]

Formulae (19) and (20) suggest the following distinguished choice of a:

Proposition 2.10. Assume that a is so chosen that

\[ \operatorname{sign}(x - y) - 2a(x, y) = \delta'(x - y). \]

(In other words, a(x, y) is the distribution kernel of the operator ½(∂⁻¹ − ∂).) Then: (i) The logarithms of wronskians form a Heisenberg Lie algebra, the central extension of the abelian Lie algebra of C. (ii) Consider the quotient C/C∗ of the scaling group by the subgroup of constants; log W is the moment map for the action of C/C∗ on W.


Recall that according to the general theory the Poisson bracket relations for the moment map may reproduce the commutation relations for a central extension of the original Lie algebra. This is precisely what happens in the present case. With this choice of a and C the Poisson geometry of the space V of wave functions becomes finally quite transparent: V arises as a result of Hamiltonian reduction with respect to C over the zero level of the associated moment map. The constraint set log W = 0 is (almost) non-degenerate (i.e., this is a 2nd class constraint, according to Dirac). The projective invariants commute with the wronskian and hence their Poisson brackets are not affected by the constraint.³ The description of the Poisson structure on V is completed by the Poisson brackets for the monodromy.

Proposition 2.11. The Poisson covariant brackets for the monodromy have the form

\[
\begin{aligned}
\{w_1(x), M_2\} &= w_1(x)\,\bigl(M_2\, r_+ - r_- M_2\bigr), \\
\{M_1, M_2\} &= M_1 M_2\, r + r\, M_1 M_2 - M_2\, r_+ M_1 - M_1\, r_- M_2.
\end{aligned}
\tag{21}
\]

The Poisson bracket for the monodromy is precisely the Poisson bracket of the dual group G∗ described in (8). In other words, the ‘forgetting map’ µ : (w, M) → M is a Poisson morphism from W into the dual group G∗.⁴ This mapping is of special importance.

Proposition 2.12. The mapping µ is the non-abelian moment map⁵ associated with the right action of G on W.

Putting together the previous lemmas, we can now state our main assertion which describes the Poisson bracket relations in the differential algebra C⟨η⟩ and in its various subalgebras which correspond to different admissible subgroups of G.

Theorem 2.13. (i) The covariance condition with respect to the action of the Galois group G and the compatibility with the wronskian constraint fix in an essentially unique way (i.e. up to scaling and conjugation) both the Poisson structure on the space of projective curves and the Poisson structure on G. The Poisson structure on G is necessarily of the quasitriangular type.
(ii) The basic Poisson bracket relations in C⟨η⟩ are given by

\[ \{\eta(x), \eta(y)\} = \eta(x)^2 - \eta(y)^2 - \operatorname{sign}(x - y)\,\bigl(\eta(x) - \eta(y)\bigr)^2. \tag{22} \]

³ With this choice of a the Poisson structure on W becomes non-degenerate; for other possible choices this may not be true. For example, the opposite possibility is to set sign(x − y) − 2a(x, y) = 0. This makes the bracket on W highly degenerate; its kernel is eliminated by the wronskian constraint, and the reduced structure remains the same. While logically possible, the resulting picture is much less attractive.
⁴ The Poisson bracket (8) is ubiquitous in various problems related to monodromy; another striking example, which is very close to our present context, is its rôle in the theory of isomonodromic deformations described in the very interesting paper of P. Boalch [Bo].
⁵ We refer the reader for instance to [BB] for the general definition of non-abelian moment maps associated with Poisson group actions.


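(iii) Consider the tower of differential extensions

\[
\begin{array}{ccccc}
 & & \mathbb C\langle\eta\rangle & & \\
 & \nearrow & & \nwarrow & \\
\mathbb C\langle\eta\rangle^{N} & & & & \mathbb C\langle\eta\rangle^{A} \\
 & \nwarrow & & \nearrow & \\
 & & \mathbb C\langle\eta\rangle^{B} & & \\
 & & \uparrow & & \\
 & & \mathbb C\langle\eta\rangle^{G} & &
\end{array}
\]

All arrows in this commutative diagram are Poisson morphisms.
(iv) We have C⟨η⟩^N ≃ C⟨θ⟩, where θ := η′; moreover,

\[ \{\theta(x), \theta(y)\} = 2 \operatorname{sign}(x - y)\,\theta(x)\theta(y). \tag{23} \]

(v) The subalgebra of B-invariants is generated by v := ½ η″/η′ = ½ θ′/θ; we have:

\[ \{v(x), v(y)\} = \tfrac12\, \delta'(x - y). \tag{24} \]

(vi) The subalgebra of G-invariants is generated by u = ½ S(η) = v′ − v²; we have:

\[ \{u(x), u(y)\} = \tfrac12\, \delta'''(x - y) + \delta'(x - y)\,[u(x) + u(y)]. \tag{25} \]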

Formula (25) reproduces the standard Virasoro algebra; in other words, the Poisson algebra (22) constructed from general covariance principles is indeed an extension of the Poisson–Virasoro algebra.

Remark 2.14. The Poisson bracket relations (23)–(25) listed above are particularly simple, since their r.h.s. is algebraic. Because the basic Poisson bracket relations (22) are nonlocal, this need not always be the case. This is what happens in the case of A-invariants:

Proposition 2.15. 1. The differential subalgebra of A-invariants in C⟨η⟩ is generated by ρ = η′/η. 2. The Poisson brackets for ρ have the form

\[ \{\rho(x), \rho(y)\} = 2\rho(x)\rho(y) \left( \sinh \int_x^y \rho(s)\, ds + \operatorname{sign}(x - y) \cosh \int_x^y \rho(s)\, ds \right). \]
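As an illustration of how the brackets on the lower levels of the tower follow from (22), here is a short worked derivation of (23). It is a sketch under our own conventions: the shorthand s and Δ is ours, and we use ∂_x sign(x − y) = 2δ(x − y) together with the fact that δ(x − y) multiplied by a smooth function vanishing on the diagonal is zero.

```latex
% Shorthand (ours): s = sign(x - y), \Delta = \eta(x) - \eta(y), \theta = \eta'.
% Then \partial_x s = 2\delta(x - y) and \partial_y s = -2\delta(x - y).
\begin{align*}
\{\theta(x), \theta(y)\}
  &= \partial_x \partial_y \{\eta(x), \eta(y)\}
   = \partial_x \partial_y \bigl[\eta(x)^2 - \eta(y)^2 - s\,\Delta^2\bigr] \\
  &= \partial_x \bigl[\, 2\delta(x - y)\,\Delta^2 + 2 s\,\Delta\,\theta(y) \bigr] \\
  &= \partial_x \bigl[\, 2 s\,\Delta\,\theta(y) \bigr]
     && \text{since } \delta(x - y)\,\Delta^2 = 0 \\
  &= 4\delta(x - y)\,\Delta\,\theta(y) + 2 s\,\theta(x)\theta(y) \\
  &= 2 \operatorname{sign}(x - y)\,\theta(x)\theta(y)
     && \text{since } \delta(x - y)\,\Delta = 0 .
\end{align*}
```

The same computation applied to log η, whose derivative is ρ, produces the hyperbolic functions of Proposition 2.15, because η(y)/η(x) is the exponential of the integral of ρ from x to y.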

It is well known that the standard KdV equation is generated with respect to the Virasoro bracket by the Hamiltonian

\[ h[u] = \int u^2 \, dx. \tag{26} \]

The Hamiltonians of all higher KdV equations are associated with trace identities for H_u and hence are G-invariant. Thus we may lift them to all levels of the extension tower, and on each level they serve as commuting Hamiltonians generating a system of compatible commuting flows.


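Proposition 2.16. The following commutative diagram, which is formed by Poisson maps, summarizes all information on the evolution equations generated by the standard Hamiltonian (26) and on the differential substitutions which relate these equations.

Evolution equations, from the top of the diagram to the bottom:

\[
\begin{aligned}
\eta_t &= S(\eta)\,\eta_x, \\
\rho_t &= \rho_{xxx} - \tfrac32 (\rho_x^2/\rho)_x - \tfrac12 (\rho^3)_x, \qquad
\theta_t = \theta_{xxx} - \tfrac32 (\theta_x^2/\theta)_x, \\
v_t &= v_{xxx} - 6 v^2 v_x, \\
u_t &= u_{xxx} + 6 u u_x.
\end{aligned}
\]

Arrows (differential substitutions):

\[
\begin{aligned}
\eta \to \rho &: \ \rho = \eta'/\eta, &\qquad \eta \to \theta &: \ \theta = \eta', \\
\eta \to v &: \ v = \eta''/\eta', &\qquad \eta \to u &: \ u = S(\eta), \\
\rho \to v &: \ v = \rho + \rho' \rho^{-1}, &\qquad \theta \to v &: \ v = \theta'/\theta, \\
v \to u &: \ u = v' - v^2. & &
\end{aligned}
\]

The associated Hamiltonian flows on each level of this diagram factorize over those which lie beneath. All equations in the diagram belong to the well known class of “equations of the KdV type”. Their mutual relations were discussed by Wilson [W1], although the Hamiltonian description which we propose is totally different. Equation

\[ \eta_t = S(\eta)\,\eta_x = \eta_{xxx} - \tfrac32\, \eta_{xx}^2/\eta_x \tag{27} \]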

is sometimes called the Schwarz–KdV equation; in [W1] George Wilson suggested for it the name “ur-KdV equation”, due to its position atop the extension tower.

Remark 2.17. Equations which appear in the diagram form a rather small part of the general class of “equations of the KdV type” discussed in [SS], where a classification theorem is given for evolution equations of the form u_t = u_xxx + F(u, u_x, u_xx) which admit nontrivial conservation laws. General equations of this type depend on several parameters and may include elliptic functions, as was first noticed by Calogero and Degasperis [CD]. We expect that rational equations from this list will also fit into the Poisson group setting by bringing into play the wave functions for different values of energy, as suggested in [DS2].

3. Discrete Case

The theory of the Schroedinger equation has a simple and natural lattice counterpart. Consider the 2nd order difference equation on the one-dimensional lattice with periodic potential

\[ \phi_{n+2} + u_n \phi_{n+1} + \phi_n = 0, \qquad u_{n+N} = u_n. \tag{28} \]

Let τ be the shift operator, (τφ)_n = φ_{n+1}. Eq. (28) may be written in operator form as

\[ \bigl(\tau^2 + u\,\tau + 1\bigr)\phi = 0. \tag{29} \]


For a given u, the space of its solutions is two-dimensional; any two solutions φ, ψ have constant wronskian W = φ_n ψ_{n−1} − φ_{n−1} ψ_n. The monodromy matrix M is defined in the standard way. The projective description of discrete Schroedinger equations is given by the following theorem. To state it we need a few elementary notions. An ordered projective configuration is a map γ : Z → CP¹; we shall simply speak of projective configurations, for short. A configuration is called non-degenerate if γ_n ≠ γ_{n+1} for all n. A plane configuration is a map w : Z → C²; it is called non-degenerate if w_n ∧ w_{n+1} ≠ 0. We denote by w_n the row vector (φ_n, ψ_n).

Theorem. (i) Any pair of linearly independent solutions of the discrete Schroedinger equation defines a non-degenerate quasi-periodic projective configuration γ : Z → CP¹ such that γ_{n+N} = γ_n · M. Any two projective configurations associated with a given discrete Schroedinger equation are related by a global projective transformation. (ii) Conversely, any non-degenerate quasi-periodic projective configuration may be lifted to a non-degenerate plane configuration such that its wronskian is equal to 1.

As before, we replace the projective line with its affine model putting η_n = φ_n/ψ_n. The group G = SL(2) is the (difference) Galois group of Eq. (28). Curiously, the potential u itself is not a rational Galois invariant. A natural finite difference analog of the Schwarzian derivative is the cross-ratio,

\[ s_n[\eta] := [\eta_n, \eta_{n+1}, \eta_{n+2}, \eta_{n+3}] = \frac{\eta_n - \eta_{n+2}}{\eta_n - \eta_{n+1}} \cdot \frac{\eta_{n+1} - \eta_{n+3}}{\eta_{n+2} - \eta_{n+3}}; \]

an elementary calculation yields

\[ s_n = u_n u_{n+1}. \tag{30} \]
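The elementary calculation behind (30) can be confirmed in exact arithmetic. The sketch below iterates the recurrence (28) for two solutions, forms η_n = φ_n/ψ_n and compares the cross-ratio with u_n u_{n+1}; the sample potential, the initial data and the function names are arbitrary choices of ours.

```python
from fractions import Fraction

def solve(u, first, second, length):
    """Iterate phi_{n+2} = -u_n phi_{n+1} - phi_n, i.e. Eq. (28), for a periodic potential u."""
    sol = [Fraction(first), Fraction(second)]
    for n in range(length - 2):
        sol.append(-u[n % len(u)] * sol[n + 1] - sol[n])
    return sol

def cross_ratio(eta, n):
    """s_n[eta] = [eta_n, eta_{n+1}, eta_{n+2}, eta_{n+3}] as defined before (30)."""
    return ((eta[n] - eta[n + 2]) * (eta[n + 1] - eta[n + 3])
            / ((eta[n] - eta[n + 1]) * (eta[n + 2] - eta[n + 3])))

u = [2, 3, 5, 7, 11]       # sample potential with odd period N = 5
phi = solve(u, 1, 2, 12)   # phi_0 = 1, phi_1 = 2
psi = solve(u, 1, 1, 12)   # psi_0 = 1, psi_1 = 1
eta = [p / q for p, q in zip(phi, psi)]

# Wronskian W_n = phi_n psi_{n-1} - phi_{n-1} psi_n, constant along the lattice.
wronskians = [phi[n] * psi[n - 1] - phi[n - 1] * psi[n] for n in range(1, 12)]

# Relation (30): s_n = u_n u_{n+1}, with indices of u taken modulo N.
matches = [cross_ratio(eta, n) == u[n % 5] * u[(n + 1) % 5] for n in range(8)]
```

With these initial data the wronskian equals 1, so the pair (φ, ψ) is a plane configuration of the kind described in part (ii) of the theorem.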

From now on we shall assume that the period N of the lattice is odd. In this case the potential may be restored as the periodic solution of (30) (regarded as an equation for u for given {s_m[η]}, m = 1, …, N); it belongs to a quadratic extension of C(η)^G = C(u u^τ) ⊂ C(u). Note that the resulting formula is non-local, that is, it depends on the values of η_m for all m.

The Poisson structure on the space of discrete Schroedinger operators is much less obvious than in the continuous case; it may be regarded as a lattice analog of the Virasoro algebra. One version of its definition was proposed in [FRS] as a part of a more general theory, the q-difference version of the Drinfeld–Sokolov theory [DS1] which applies to q-difference equations of arbitrary order (see also [STSS]). Another definition of the lattice Virasoro algebra had been proposed earlier by Faddeev and Takhtajan [FT]. The projective point of view outlined in the present paper also yields a natural Poisson structure on the space of discrete Schroedinger operators; we shall see that it is identical to that introduced in [FRS] and is simply related to the Faddeev–Takhtajan bracket.

In this section we shall denote by W the space of all plane quasi-periodic configurations and by C the discrete scaling group.

Proposition 3.1. (i) Let us assume that the Poisson structure on W is covariant with respect to the right action of G and to the natural action of the scaling group. Then the bracket between the evaluation functionals is given by

\[ \{w_m^1, w_n^2\} = w_m^1 w_n^2\, R(m - n), \tag{31} \]

where

\[
R(k) = R_0(k) + r, \qquad
R_0(k) = a_k I + \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & c_k & -c_k & 0 \\
0 & c_k & -c_k & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\tag{32}
\]

(we omitted Poisson bracket relations for the monodromy which remain the same as before). Here a_k is an arbitrary odd function and c_k is an odd function which satisfies

\[ c_{n-m} c_{m-k} + c_{m-k} c_{k-n} + c_{k-n} c_{n-m} = \alpha, \tag{33} \]

where α = 0 when r is a trivial or triangular r-matrix and α = −ε² for r quasitriangular (case (c)), with ε the scaling parameter of r.

The wronskian W of a plane configuration w = (φ, ψ) is defined by the obvious formula W[w]_n = φ_n ψ_{n−1} − ψ_n φ_{n−1}. The space V ⊂ W of wave functions of discrete Schroedinger operators is defined by the constraint W[w] = 1.

Proposition 3.2. We have

\[ \{W_n, \phi_m\} = (a_{n-m} + a_{n-1-m} - c_{n-m})\, W_n \phi_m + (c_{n-m} - c_{n-m-1})(\phi_n \phi_{n-1} \psi_m - \phi_n \psi_{n-1} \phi_m). \tag{34} \]

A similar formula holds for {W_n, ψ_m}. Scaling invariants η_n commute with the wronskian if and only if the second term in (34) is also proportional to W_n φ_m; this condition implies that

\[ c_{n-m} - c_{n-m-1} = ε\,(\delta_{nm} + \delta_{n,m+1}). \tag{35} \]

Without restricting the generality we may assume that ε = 1 and in that case we get

\[ \{W_n, \phi_m\} = (a_{n-m} + a_{n-1-m} - c_{n-m} + \delta_{nm} + \delta_{n,m+1})\, W_n \phi_m. \tag{36} \]
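The statement that the lattice sign function meets condition (35) is a finite check on any index window. The sketch below tests (35) with the scaling parameter set to 1 and also evaluates the cyclic relation (33) with α = −1 on all index triples that are not fully coincident; the restriction to such triples, the window sizes and the function names are our assumptions.

```python
def sign(k):
    """Lattice sign function with sign(0) = 0."""
    return (k > 0) - (k < 0)

def satisfies_35(c, eps=1, span=6):
    """Check c_{n-m} - c_{n-m-1} = eps * (delta_{n,m} + delta_{n,m+1}) on a window of indices."""
    idx = range(-span, span + 1)
    return all(c(n - m) - c(n - m - 1) == eps * ((n == m) + (n == m + 1))
               for n in idx for m in idx)

def satisfies_33(c, alpha=-1, span=4):
    """Check the cyclic relation (33) on triples (n, m, k) that are not all equal."""
    idx = range(-span, span + 1)
    return all(c(n - m) * c(m - k) + c(m - k) * c(k - n) + c(k - n) * c(n - m) == alpha
               for n in idx for m in idx for k in idx
               if not (n == m == k))

sign_ok_35 = satisfies_35(sign)
sign_ok_33 = satisfies_33(sign)
linear_ok_35 = satisfies_35(lambda k: k)  # an odd function that violates (35)
```

For the fully coincident triple n = m = k every term of (33) contains c_0 = 0, so the left-hand side vanishes there; this is why that triple is excluded from the check.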

Fortunately, condition (35) is again satisfied by the sign function and hence the Poisson structure on the space of projective configurations remains basically the same as in the continuous case. Moreover, if η is a projective curve, which defines a Schroedinger equation, we may fix a generic set of values {x_1, …, x_N} of the coordinate x on the circle such that η(x_n) ≠ η(x_{n+1}); then {η(x_n)} is a non-degenerate projective configuration which gives rise to a difference Schroedinger equation and the evaluation functionals η → η(x_n) form a Poisson subalgebra in the big Poisson algebra (22). Explicitly we have

\[ \{\eta_n, \eta_m\} = \eta_n^2 - \eta_m^2 - \operatorname{sign}(n - m)\,(\eta_n - \eta_m)^2. \tag{37} \]

Note that the solutions of this difference equation are, of course, not the values of the wave functions for the continuous equation: indeed, the wronskian constraints are different in the two cases. It is noteworthy that nevertheless the conditions imposed by these constraints on the structure function c are satisfied by the same standard function.


In order to compute the Poisson structure induced by (37) on the set of potentials let us start with the subfields of rational N- and B-invariants in C(η); in complete analogy with the continuous case we have C(η)^N = C(θ), where θ_m := η_{m+1} − η_m, and C(η)^B = C(λ), where

\[ \lambda_m := \frac{\eta_{m+2} - \eta_{m+1}}{\eta_{m+1} - \eta_m}. \]

An easy computation yields

\[ \{\theta_m, \theta_n\} = -2 \operatorname{sign}(m - n)\,\theta_m \theta_n, \qquad \{\lambda_m, \lambda_n\} = 2\,(\delta_{m+1,n} - \delta_{m,n+1})\,\lambda_m \lambda_n. \tag{38} \]

A natural interpretation of the variables λ_n is connected with the Miura transform for the discrete Schroedinger equation. Let us assume that the difference operator (29) is factorized,

\[ \tau^2 + u\,\tau + 1 = (\tau + v)(\tau + v^{-1}). \tag{39} \]

The potentials u, v are related by the difference Miura map,

\[ u_n = v_n + v_{n+1}^{-1}. \tag{40} \]

We may assume without restricting the generality that ψ is the solution of (28) which satisfies the first order equation (τ + v⁻¹)ψ = 0. Let φ be the second solution of this equation such that W(φ, ψ) = 1 and η = φ/ψ; then

\[ \eta_{n+1} - \eta_n = \frac{1}{\psi_n \psi_{n+1}}. \]

Clearly, v_n = −ψ_n/ψ_{n+1}, and hence

\[ v_n v_{n+1} = \frac{\psi_n}{\psi_{n+2}} = \frac{\psi_n \psi_{n+1}}{\psi_{n+2} \psi_{n+1}} = \frac{\eta_{n+2} - \eta_{n+1}}{\eta_{n+1} - \eta_n} = \lambda_n. \tag{41} \]

Thus λ_n is the product of two neighbouring potentials in the factorized Schroedinger operator (39). The potentials themselves again are not rational Galois invariants of B and belong to a quadratic extension of C(λ). From (40), (41) we easily derive that

\[ s_n = u_n u_{n+1} = \frac{(1 + \lambda_n)(1 + \lambda_{n+1})}{\lambda_{n+1}}. \tag{42} \]
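For completeness, the step from (40) and (41) to (42) can be written out; this is a routine expansion supplied here as a worked check, using only λ_n = v_n v_{n+1} and the Miura map.

```latex
\begin{align*}
u_n u_{n+1}
  &= \bigl(v_n + v_{n+1}^{-1}\bigr)\bigl(v_{n+1} + v_{n+2}^{-1}\bigr)
     && \text{by (40)} \\
  &= \frac{v_n v_{n+1} + 1}{v_{n+1}} \cdot \frac{v_{n+1} v_{n+2} + 1}{v_{n+2}} \\
  &= \frac{(1 + v_n v_{n+1})(1 + v_{n+1} v_{n+2})}{v_{n+1} v_{n+2}} \\
  &= \frac{(1 + \lambda_n)(1 + \lambda_{n+1})}{\lambda_{n+1}}
     && \text{by (41)} .
\end{align*}
```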

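Proposition 3.3. We have

\[
\begin{aligned}
\{\lambda_m, \lambda_n\} &= (\delta_{m+1,n} - \delta_{m,n+1})\,\lambda_m \lambda_n, \\
\{s_m, s_n\} &= (\delta_{m+1,n} - \delta_{m,n+1})(s_m + s_n - s_m s_n) + s_m s_n \bigl( s_{m+1}^{-1}\,\delta_{m+2,n} - s_{n+1}^{-1}\,\delta_{m,n+2} \bigr).
\end{aligned}
\tag{43}
\]

Formula (43) implies the following Poisson bracket relations for the potentials:

Proposition 3.4. Let ε_n = (−1)^n sign n for n ≠ 0, and ε_0 = 0. Then

\[ \{v_n, v_m\} = 2ε_{n-m}\, v_n v_m \quad \text{and} \quad \{u_n, u_m\} = 2ε_{n-m}\, u_n u_m + 2(\delta_{m+1,n} - \delta_{m,n+1}). \tag{44} \]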


Formula (44) coincides with the lattice Virasoro algebra introduced in [FRS], while (43) coincides with the Faddeev–Takhtajan version of the lattice Virasoro algebra. The non-locality of the Poisson bracket relations in (44) is due to the non-locality of the formula for potentials v and u in terms of η. The same structure constants ε_{n−m} = (−1)^{n−m} sign(n − m) arise in [FRS] in the framework of the discrete Drinfeld–Sokolov theory, which provides for this formula a totally different (and more direct) explanation.

Acknowledgement. The authors would like to thank L. D. Faddeev, V. Fock and V. Sokolov for useful discussions. The work of the second author was partially supported by the INTAS-OPEN grant 03-51-3350, the RFFI grant 05-01-00922 and the ANR program “GIMP” ANR-05-BLAN-0029-01. The first author is grateful to the Association Suisse-Russe for financing his visit to the Steklov Institute, with special thanks to J.-P. Periat and S. Yu. Sergueeva.

References

[AMM] Alekseev, A., Malkin, A., Meinrenken, E.: Lie group valued moment maps. J. Diff. Geom. 48(3), 445–495 (1998)
[B1] Babelon, O.: Extended conformal algebra and the Yang-Baxter equation. Phys. Lett. B 215(3), 523–529 (1988)
[B2] Babelon, O.: Exchange formula and lattice deformation of the Virasoro algebra. Phys. Lett. B 238(2-4), 234–238 (1990)
[BB] Babelon, O., Bernard, D.: Dressing symmetries. Commun. Math. Phys. 149(2), 279–306 (1992)
[Bo] Boalch, P.: Stokes matrices, Poisson Lie groups and Frobenius manifolds. Invent. Math. 146, 479–506 (2001)
[CD] Calogero, F., Degasperis, A.: Reduction technique for matrix nonlinear evolution equations solvable by the spectral transform. J. Math. Phys. 22(1), 23–31 (1981)
[DS1] Drinfeld, V.G., Sokolov, V.V.: Lie algebras and equations of Korteweg-de Vries type. Sov. Math. Dokl. 23, 457–462 (1981)
[DS2] Drinfeld, V.G., Sokolov, V.V.: On equations related to the Korteweg-de Vries type. Sov. Math. Dokl. 32, 361–365 (1985)
[FT] Faddeev, L.D., Takhtajan, L.A.: Liouville model on the lattice. Lect. Notes in Phys. 246, 166–179 (1986)
[FRS] Frenkel, E., Reshetikhin, N., Semenov-Tian-Shansky, M.A.: Drinfeld-Sokolov reduction for difference operators and deformations of W-algebras. I. The case of Virasoro algebra. Commun. Math. Phys. 192, 605–629 (1998)
[KN] Krichever, I.M., Novikov, S.P.: Holomorphic bundles over algebraic curves, and nonlinear equations. Russ. Math. Surv. 35(6), 53–80 (1980)
[OT] Ovsienko, V., Tabachnikov, S.: Projective differential geometry old and new: from Schwarzian derivative to cohomology of diffeomorphism group. Cambridge Tracts in Mathematics, 165. Cambridge: Cambridge University Press, 2005
[S1] Semenov-Tian-Shansky, M.A.: Dressing action transformations and Poisson-Lie group actions. Publ. RIMS 21, 1237–1260 (1985)
[S2] Semenov-Tian-Shansky, M.A.: Monodromy map and classical r-matrices. Zapiski nauchn. semin. POMI 200, 156–166 (1992) (Russian); http://arxiv.org/list/hep-th/9402054, 1994
[STSS] Semenov-Tian-Shansky, M.A., Sevostyanov, A.V.: Drinfeld-Sokolov reduction for difference operators and deformations of W-algebras. II. The general semisimple case. Commun. Math. Phys. 192, 631–647 (1998)
[SS] Svinolupov, S.I., Sokolov, V.V.: Evolution equations with nontrivial conservation laws. Funct. Anal. Appl. 16, 317–319 (1983)
[V] Volkov, A.Yu.: Miura transformation on a lattice. Theor. Math. Phys. 74, 96–99 (1988)
[W1] Wilson, G.: On the quasi-hamiltonian formalism of the KdV equation. Phys. Lett. A 132, 445–450 (1988)
[W2] Wilson, G.: On antiplectic pairs in the hamiltonian formalism of evolution equations. Quart. J. Math. Oxford Ser. (2) 42, 227–256 (1991)
[W3] Wilson, G.: On the antiplectic pair connected with the Adler-Gel’fand-Dikii bracket. Nonlinearity 5, 109–131 (1992)

Communicated by L. Takhtajan

Commun. Math. Phys. 284, 553–581 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0580-8

Communications in

Mathematical Physics

Random Repeated Interaction Quantum Systems

Laurent Bruneau1, Alain Joye2, Marco Merkli3

1 CNRS-UMR 8088 and Département de Mathématiques, Université de Cergy-Pontoise, Site Saint-Martin, BP 222, 95302 Cergy-Pontoise, France. http://www.u-cergy.fr/bruneau

2 Institut Fourier, UMR 5582, CNRS-Université de Grenoble I, BP 74, 38402 Saint-Martin d’Hères, France.

3 Department of Mathematics and Statistics, Memorial University, St. John’s, NL, A1C 5S7, Canada. http://www.math.mun.ca/~merkli

Received: 31 October 2007 / Accepted: 1 April 2008
Published online: 29 July 2008 – © Springer-Verlag 2008

Abstract: We consider a quantum system S interacting sequentially with independent systems Em, m = 1, 2, . . . Before interacting, each Em is in a possibly random state, and each interaction is characterized by an interaction time and an interaction operator, both possibly random. We prove that any initial state converges to an asymptotic state almost surely in the ergodic mean, provided the couplings satisfy a mild effectiveness condition. We analyze the macroscopic properties of the asymptotic state and show that it satisfies a second law of thermodynamics. We solve exactly a model in which S and all the Em are spins: we find the exact asymptotic state, in case the interaction time, the temperature, and the excitation energies of the Em vary randomly. We analyze a model in which S is a spin and the Em are thermal fermion baths and obtain the asymptotic state by rigorous perturbation theory, for random interaction times varying slightly around a fixed mean, and for small values of a coupling constant.

1. Introduction

This paper is a contribution to rigorous non-equilibrium quantum statistical mechanics, examining the asymptotic properties of random repeated interaction systems. The paradigm of a repeated interaction system is a cavity containing the quantized electromagnetic field, through which an atom beam is shot in such a way that only a single atom is present in the cavity at all times. Such systems are fundamental in the experimental and theoretical investigation of basic processes of interaction between matter and radiation, and they are of practical importance in quantum optics and quantum state engineering [15–17].

Partly supported by the Ministère français des affaires étrangères through a séjour scientifique haut niveau.


A repeated interaction system is described by a “small” quantum system $\mathcal{S}$ (cavity) interacting successively with independent quantum systems $\mathcal{E}_1, \mathcal{E}_2, \ldots$ (atoms). At each moment in time, $\mathcal{S}$ interacts with precisely one $\mathcal{E}_m$ (with increasing index as time increases), while the other elements in the chain $\mathcal{C} = \mathcal{E}_1 + \mathcal{E}_2 + \cdots$ evolve freely according to their intrinsic (uncoupled) dynamics. The complete evolution is described by the intrinsic dynamics of $\mathcal{S}$ and of $\mathcal{E}_m$, plus an interaction between $\mathcal{S}$ and $\mathcal{E}_m$, for each $m$. The latter consists of an interaction time $\tau_m > 0$ and an interaction operator $V_m$ (acting on $\mathcal{S}$ and $\mathcal{E}_m$); during the time interval $[\tau_1 + \cdots + \tau_{m-1}, \tau_1 + \cdots + \tau_m)$, $\mathcal{S}$ is coupled to $\mathcal{E}_m$ via the coupling operator $V_m$. One may view $\mathcal{C}$ as a “large system”, and hence $\mathcal{S}$ as an open quantum system. From this perspective, the main interest is the effect of the coupling on the system $\mathcal{S}$. Does the system approach a time-asymptotic state? If so, at what rate, and what are the macroscopic (thermodynamic) properties of the asymptotic state?

Idealized models with constant repeated interaction, where $\mathcal{E}_m = \mathcal{E}$, $\tau_m = \tau$, $V_m = V$, have been analyzed in [7,17]. It is shown in [7] that the coupling drives the system to a $\tau$-periodic asymptotic state, at an exponential rate. The asymptotic state satisfies the second law of thermodynamics: energy changes are proportional to entropy changes, with ratio equal to the temperature of the chain $\mathcal{C}$. In experiments, where repeated interaction systems can be realized as “One-Atom Masers” [15–17], $\mathcal{S}$ represents one or several modes of the quantized electromagnetic field in a cavity, and the $\mathcal{E}$ describe atoms injected into the cavity, one by one, interacting with the radiation while passing through the cavity, and then exiting. It is clear that neither the interaction $(\tau_m, V_m)$ nor the state of the incoming elements $\mathcal{E}_m$ can be considered exactly the same in each interaction step $m$. Indeed, in experiments, the atoms are ejected from an atom oven, then cooled down before entering the cavity, a process that cannot be controlled entirely. It is therefore natural to build a certain randomness into the description. For instance, we may consider the temperature of the incoming $\mathcal{E}$ or the interaction time $\tau$ to be random. (Other parameters may vary randomly as well.)

We develop in this work a theory that allows us to treat repeated interaction processes with time-dependent (piecewise constant) interactions, and in particular, with random interactions. We are not aware of any theoretical work dealing with variable or random interactions, other than [8]. Moreover, to our knowledge, this is the only work, next to [8], where random positive temperature Hamiltonians (random Liouville operators) are examined.

The purpose of the present paper is twofold:

– Firstly, we establish a general framework for random repeated interaction systems and we prove convergence results for the dynamics. The dynamical process splits into a decaying and a fluctuating part, the latter converging to an explicitly identified limit in the ergodic mean. To prove the main convergence result, Theorem 1.3 (see also Theorems 3.2 and 3.3), we combine techniques of non-equilibrium quantum statistical mechanics developed in [7] with techniques of [8], developed to analyze infinite products of random operators. We generalize results of [8] to time-dependent, “instantaneous” observables. This is necessary in order to be able to extract physically relevant information about the final state, such as energy and entropy variations. We examine the macroscopic properties of the asymptotic state and show in Theorem 1.4 that it satisfies a second law of thermodynamics. This law is universal in the sense that it does not depend on the particular features of the repeated interaction system, and it holds regardless of the initial state of the system.
– Secondly, we apply the general results to concrete models where $\mathcal{S}$ is a spin and the $\mathcal{E}$ are either spins as well, or they are thermal fermion fields. We solve the spin-spin system exactly: Theorem 1.5 gives the explicit form of the final state in case the interaction time, the excitation level of spins $\mathcal{E}$ or the temperatures of the $\mathcal{E}$ are random. The spin-fermion system is not exactly solvable. We show in Theorem 7.1 that, for small coupling, and for random interaction times $\tau$ and random temperatures $\beta$ of the thermal fermi fields $\mathcal{E}$, the system approaches a deterministic limit state. We give in Theorem 1.6 the explicit, rigorous expansion of the limit state for small fluctuations of $\tau$ around a given value $\tau_0$. This part of our work is based on a careful execution of rigorous perturbation theory of certain non-normal “reduced dynamics operators”, in which random parameters as well as other, deterministic interaction parameters must be controlled simultaneously.

1.1. Setup. The purpose of this section is to explain parts of the formalism, with the aim of making our main results, presented in the next section, easily understandable. We first present the deterministic description. According to the fundamental principles of quantum mechanics, states of the systems $\mathcal{S}$ and $\mathcal{E}_m$ are given by normalized vectors (or density matrices) on Hilbert spaces $\mathcal{H}_{\mathcal{S}}$ and $\mathcal{H}_{\mathcal{E}_m}$, respectively. We assume that $\dim \mathcal{H}_{\mathcal{S}} < \infty$, while the $\mathcal{H}_{\mathcal{E}_m}$ may be infinite dimensional. Observables of $\mathcal{S}$ and $\mathcal{E}_m$ are bounded operators forming von Neumann algebras $\mathfrak{M}_{\mathcal{S}} \subset \mathcal{B}(\mathcal{H}_{\mathcal{S}})$ and $\mathfrak{M}_{\mathcal{E}_m} \subset \mathcal{B}(\mathcal{H}_{\mathcal{E}_m})$. Observables $A_{\mathcal{S}} \in \mathfrak{M}_{\mathcal{S}}$ and $A_{\mathcal{E}_m} \in \mathfrak{M}_{\mathcal{E}_m}$ evolve according to the Heisenberg dynamics $\mathbb{R} \ni t \mapsto \alpha_{\mathcal{S}}^t(A_{\mathcal{S}})$ and $\mathbb{R} \ni t \mapsto \alpha_{\mathcal{E}_m}^t(A_{\mathcal{E}_m})$ respectively, where $\alpha_{\mathcal{S}}^t$ and $\alpha_{\mathcal{E}_m}^t$ are $*$-automorphism groups of $\mathfrak{M}_{\mathcal{S}}$ and $\mathfrak{M}_{\mathcal{E}_m}$, respectively, see e.g. [5]. The Hilbert space of the total system is the tensor product $\mathcal{H} = \mathcal{H}_{\mathcal{S}} \otimes \mathcal{H}_{\mathcal{C}}$, where $\mathcal{H}_{\mathcal{C}} = \bigotimes_{m \geq 1} \mathcal{H}_{\mathcal{E}_m}$ is the Hilbert space of the chain, and the non-interacting dynamics is defined on the algebra $\mathfrak{M}_{\mathcal{S}} \bigotimes_{m \geq 1} \mathfrak{M}_{\mathcal{E}_m}$ by $\alpha_{\mathcal{S}}^t \bigotimes_{m \geq 1} \alpha_{\mathcal{E}_m}^t$.
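The infinite tensor product $\mathcal{H}$ is taken with respect to distinguished “reference states” of the systems $\mathcal{S}$ and $\mathcal{E}_m$, represented by vectors $\psi_{\mathcal{S}} \in \mathcal{H}_{\mathcal{S}}$ and $\psi_{\mathcal{E}_m} \in \mathcal{H}_{\mathcal{E}_m}$.¹ Typically, one takes the reference states to be equilibrium (KMS) states for the dynamics $\alpha_{\mathcal{S}}^t$, $\alpha_{\mathcal{E}_m}^t$, at inverse temperatures $\beta_{\mathcal{S}}$, $\beta_{\mathcal{E}_m}$. It is useful to consider the dynamics in the Schrödinger picture. For this, we implement the dynamics via unitaries, generated by self-adjoint operators $L_{\mathcal{S}}$ and $L_{\mathcal{E}_m}$, acting on $\mathcal{B}(\mathcal{H}_{\mathcal{S}})$ and $\mathcal{B}(\mathcal{H}_{\mathcal{E}_m})$, respectively. The generators, called Liouville operators, are uniquely determined by

$$\alpha_\#^t(A_\#) = e^{itL_\#} A_\# e^{-itL_\#}, \quad t \in \mathbb{R}, \quad \text{and} \quad L_\# \psi_\# = 0, \tag{1.1}$$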

where $\#$ stands for either $\mathcal{S}$ or $\mathcal{E}_m$.² In particular, (1.1) holds if the reference states are equilibrium states. Let $\tau_m > 0$ and $V_m \in \mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{E}_m}$ be the interaction time and interaction operator associated to $\mathcal{S}$ and $\mathcal{E}_m$. We define the (discrete) repeated interaction Schrödinger dynamics of a state vector $\psi \in \mathcal{H}$, for $m \geq 0$, by

$$U(m)\psi = e^{-i\tau_m \widetilde{L}_m} \cdots e^{-i\tau_2 \widetilde{L}_2} e^{-i\tau_1 \widetilde{L}_1} \psi, \tag{1.2}$$

where

$$\widetilde{L}_k = L_k + \sum_{n \neq k} L_{\mathcal{E}_n} \tag{1.3}$$

describes the dynamics of the system during the time interval $[\tau_1 + \cdots + \tau_{k-1}, \tau_1 + \cdots + \tau_k)$, which corresponds to the time step $k$ of the discrete process, with

$$L_k = L_{\mathcal{S}} + L_{\mathcal{E}_k} + V_k, \tag{1.4}$$

acting on $\mathcal{H}_{\mathcal{S}} \otimes \mathcal{H}_{\mathcal{E}_k}$. (We understand that the operator $L_{\mathcal{E}_n}$ in (1.3) acts nontrivially only on the $n$-th factor of the chain Hilbert space $\mathcal{H}_{\mathcal{C}}$.)

¹ Those vectors are to be taken cyclic and separating for the algebras $\mathfrak{M}_{\mathcal{S}}$ and $\mathfrak{M}_{\mathcal{E}_m}$, respectively [5]. Their purpose is to fix macroscopic properties of the system. However, since $\dim \mathcal{H}_{\mathcal{S}} < \infty$, the vector $\psi_{\mathcal{S}}$ does not play any significant role. In practice, it is chosen so that it makes computations as simple as possible.

² The existence and uniqueness of $L_\#$ satisfying (1.1) is well known under general assumptions on the reference states $\psi_\#$ [5].

An operator $\rho$ on $\mathcal{H}$ which is self-adjoint, non-negative, and has unit trace is called a density matrix. A state $\varrho(\cdot) = \mathrm{Tr}(\rho\, \cdot)$, where $\mathrm{Tr}$ is the trace over $\mathcal{H}$, is called a normal state. Our goal is to understand the large-time asymptotics ($m \to \infty$) of expectations

$$\varrho\big(U(m)^* O\, U(m)\big) \equiv \varrho\big(\alpha^m(O)\big), \tag{1.5}$$

for normal states $\varrho$ and certain observables $O$. Important physical observables are represented by operators that act either just on $\mathcal{S}$ or ones that describe exchange processes between $\mathcal{S}$ and the chain $\mathcal{C}$. The latter are represented by time-dependent operators because they act on $\mathcal{S}$ and, at step $m$, on the element $\mathcal{E}_m$ which is in contact with $\mathcal{S}$. We define instantaneous observables to be those of the form

$$O = A_{\mathcal{S}} \otimes_{j=-l}^{r} B_m^{(j)}, \tag{1.6}$$

where $A_{\mathcal{S}} \in \mathfrak{M}_{\mathcal{S}}$ and $B_m^{(j)} \in \mathfrak{M}_{\mathcal{E}_{m+j}}$ (we do not write identity operators in the tensor product). The class of instantaneous observables allows us to study all properties of $\mathcal{S}$ alone, as well as exchange properties between $\mathcal{S}$ and $\mathcal{C}$.

Let us illustrate our strategy to analyze (1.5) for the initial state determined by the vector $\psi_0 = \psi_{\mathcal{S}} \otimes \psi_{\mathcal{C}}$, where $\psi_{\mathcal{C}} = \otimes_{m \geq 1} \psi_{\mathcal{E}_m}$. We use ideas stemming from the algebraic approach to quantum dynamical systems far from equilibrium to obtain the following representation for large $m$ (Proposition 2.5):

$$\langle \psi_0, \alpha^m(O) \psi_0 \rangle = \langle \psi_0, P M_1 \cdots M_{m-l-1} N_m(O) P \psi_0 \rangle. \tag{1.7}$$

Here, $P$ is the orthogonal projection onto $\mathcal{H}_{\mathcal{S}}$, along $\psi_{\mathcal{C}}$, projecting out the degrees of freedom of $\mathcal{C}$. The $M_k$ are effective operators which act on $\mathcal{H}_{\mathcal{S}}$ only, encoding the effects of the interactions on the system $\mathcal{S}$. They are called reduced dynamics operators (RDO), and have the form $M_k = P e^{i\tau_k K_k} P$, where $K_k$ is an (unbounded, non-normal) operator acting on $\mathcal{H}_{\mathcal{S}} \otimes \mathcal{H}_{\mathcal{E}_k}$, satisfying $e^{itK_k} A e^{-itK_k} = e^{itL_k} A e^{-itL_k}$ for all $A \in \mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{E}_k}$, and $K_k\, \psi_{\mathcal{S}} \otimes \psi_{\mathcal{E}_k} = 0$.³ The operator $N_m(O)$ acts on $\mathcal{H}_{\mathcal{S}}$ and has the expression (Proposition 2.4)

$$N_m(O)\psi_0 = P e^{i\tau_{m-l} \widetilde{L}_{m-l}} \cdots e^{i\tau_m \widetilde{L}_m} \big(A_{\mathcal{S}} \otimes_{j=-l}^{r} B_m^{(j)}\big) e^{-i\tau_m \widetilde{L}_m} \cdots e^{-i\tau_{m-l} \widetilde{L}_{m-l}} \psi_0. \tag{1.8}$$

The asymptotics $m \to \infty$ of (1.7) for identical matrices $M_k \equiv M$ has been studied in [7]. In the present work we consider the $M_k$ to be random operators. We allow for randomness through random interactions (interaction times, interaction operators) as well as random initial states of the $\mathcal{E}_m$ (random temperatures, energy spectra, etc).

³ These are the defining properties of $K_k$; $K_k$ has an explicit form expressible in terms of the modular data of $(\mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{E}_k}, \psi_{\mathcal{S}} \otimes \psi_{\mathcal{E}_k})$, see Sect. 2.2.


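Let $(\Omega, \mathcal{F}, \mathrm{p})$ be a probability space. To describe the stochastic dynamic process at hand, we introduce the standard probability measure $d\mathbb{P}$ on $\Omega_{\mathrm{ext}} := \Omega^{\mathbb{N}^*}$,

$$d\mathbb{P} = \prod_{j \geq 1} d\mathrm{p}_j, \quad \text{where} \quad d\mathrm{p}_j \equiv d\mathrm{p}, \ \forall j \in \mathbb{N}^*. \tag{1.9}$$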

We make the following randomness assumptions:

(R1) The reduced dynamics operators $M_k$ are independent, identically distributed (iid) random operators. We write $M_k = M(\omega_k)$, where $M : \Omega \to \mathcal{B}(\mathbb{C}^d)$ is an operator valued random variable.

(R2) The operator $N_m(O)$ is independent of the $M_k$ with $1 \leq k \leq m - l - 1$, and it has the form $N(\omega_{m-l}, \ldots, \omega_{m+r})$, where $N : \Omega^{r+l+1} \to \mathcal{B}(\mathbb{C}^d)$ is an operator valued random variable.

Since the operator $M_k$ describes the effect of the $k$-th interaction on $\mathcal{S}$, assumption (R1) means that we consider iid random repeated interactions. The random variable $N$ in (R2) does not depend on the time step $m$. This is a condition on the observables: it means that the nature of the quantities measured at various times $m$ is the same. For instance, the $B_m^{(j)}$ in (1.6) can represent the energy of $\mathcal{E}_{m+j}$, or the part of the interaction energy $V_{m+j}$ belonging to $\mathcal{E}_{m+j}$, etc. Both assumptions are verified in a wide variety of physical systems: we may take random interaction times $\tau_k = \tau(\omega_k)$, random coupling operators $V_k = V(\omega_k)$, random energy levels of the $\mathcal{E}_k$ encoded in $L_{\mathcal{E}_k} = L_{\mathcal{E}}(\omega_k)$, random temperatures $\beta_{\mathcal{E}_k} = \beta_{\mathcal{E}}(\omega_k)$ of the initial states of $\mathcal{E}_k$, and so on; see Sects. 6 and 7 for concrete models.

1.2. Main results. Our main results are: the existence and identification of the limit of infinite products of random reduced dynamics operators; the proof of the approach of a random repeated interaction system to an asymptotic state, together with its identification; the analysis of the macroscopic properties of the asymptotic state; explicit expressions of that state for spin-spin and spin-fermion systems. We present here some main results and refer to subsequent sections for more information and for proofs.

– Ergodic limit of infinite products of random operators. The asymptotics of the dynamics (1.7), in the random case, is encoded in the product $M(\omega_1) \cdots M(\omega_{m-l-1}) N(\omega_{m-l}, \ldots, \omega_{m+r})$. It is not hard to see that the spectrum of the operators $M(\omega)$ is contained inside the closed complex unit disk, and that $M(\omega)\psi_{\mathcal{S}} = \psi_{\mathcal{S}}$ (see Lemma 2.3).

Definition 1.1. Let $\mathcal{M}_{(E)}$ denote the set of reduced dynamics operators whose spectrum on the complex unit circle consists only of a simple eigenvalue 1.

The following is our main result on convergence of products of random reduced dynamics operators (see also Theorem 3.3). We denote by $\mathbb{E}[M]$ the expectation of $M(\omega)$.

Theorem 1.2 (Ergodic limit of infinite operator product). Suppose that $\mathrm{p}(M(\omega) \in \mathcal{M}_{(E)}) \neq 0$. Then $\mathbb{E}[M] \in \mathcal{M}_{(E)}$. Moreover, there exists a set $\widetilde{\Omega} \subset \Omega^{\mathbb{N}^*}$ of probability one s.t. for any $\omega = (\omega_n)_{n \in \mathbb{N}} \in \widetilde{\Omega}$,

$$\lim_{\nu \to \infty} \frac{1}{\nu} \sum_{n=1}^{\nu} M(\omega_1) \cdots M(\omega_n) N(\omega_{n+1}, \ldots, \omega_{n+l+r+1}) = |\psi_{\mathcal{S}}\rangle\langle\theta|\, \mathbb{E}[N],$$

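where $\theta = P^*_{1,\mathbb{E}[M]} \psi_{\mathcal{S}}$, $P_{1,X}$ is the (Riesz) spectral projection of $X$ associated to the eigenvalue 1, and $*$ denotes the adjoint.

– Asymptotic state of random repeated interaction systems. We use the result of Theorem 1.2 in (1.7), where we replace $\alpha^m$ by the random dynamics, denoted $\alpha_\omega^m$. It follows that the ergodic limit of (1.7) is $\varrho_+(\mathbb{E}[N])$, where

$$\varrho_+(A) := \langle \theta, A \psi_{\mathcal{S}} \rangle, \quad A \in \mathfrak{M}_{\mathcal{S}}. \tag{1.10}$$

A density argument using the cyclicity of the reference state $\psi_0$ extends the argument leading to (1.7) to all normal initial states $\varrho$ on $\mathfrak{M}$.

Theorem 1.3 (Asymptotic State). Suppose that $\mathrm{p}(M(\omega) \in \mathcal{M}_{(E)}) \neq 0$. There exists a set $\widetilde{\Omega} \subset \Omega^{\mathbb{N}^*}$ of probability one s.t. for any $\omega \in \widetilde{\Omega}$, for any instantaneous observable $O$, (1.6), and for any normal initial state $\varrho$, we have

$$\lim_{\mu \to \infty} \frac{1}{\mu} \sum_{m=1}^{\mu} \varrho\big(\alpha_\omega^m(O)\big) = \varrho_+\big(\mathbb{E}[N]\big). \tag{1.11}$$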
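The ergodic limit in Theorems 1.2 and 1.3 can be illustrated on a classical toy model. The sketch below is not the construction of the paper: as an assumption, random row-stochastic matrices with positive entries stand in for the reduced dynamics operators, since they share the structural properties used above (a common invariant vector, here the constant vector, and a simple eigenvalue 1 on the unit circle). We take $N$ to be the identity, so the expected limit is the Riesz projection of $\mathbb{E}[M]$ at eigenvalue 1. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 3, 4  # toy dimension and number of possible matrices (arbitrary choices)

def random_row_stochastic(d, rng):
    """Positive matrix with unit row sums, so that M @ ones == ones."""
    a = rng.random((d, d)) + 0.1
    return a / a.sum(axis=1, keepdims=True)

# Finite probability space: M(omega) is drawn uniformly from K fixed matrices.
matrices = [random_row_stochastic(d, rng) for _ in range(K)]
mean_M = sum(matrices) / K  # E[M]

# Riesz projection of E[M] at eigenvalue 1: |psi><pi| / <pi, psi>,
# with psi the right and pi the left eigenvector for eigenvalue 1.
psi = np.ones(d)
evals, left_vecs = np.linalg.eig(mean_M.T)
pi = np.real(left_vecs[:, np.argmin(np.abs(evals - 1.0))])
P1 = np.outer(psi, pi) / (pi @ psi)

def cesaro_mean_of_products(n_steps, rng):
    """(1/nu) * sum_{n=1}^{nu} M(omega_1) ... M(omega_n) for one sample path."""
    prod = np.eye(d)
    total = np.zeros((d, d))
    for _ in range(n_steps):
        prod = prod @ matrices[rng.integers(K)]  # new factor enters on the right
        total += prod
    return total / n_steps

ergodic_mean = cesaro_mean_of_products(20000, rng)
error = np.abs(ergodic_mean - P1).max()
print("max deviation from the Riesz projection of E[M]:", error)
```

In this toy setting the individual products keep fluctuating with $n$ (their rows follow a randomly driven Markov chain), and only the Cesàro mean settles down, which mirrors the splitting into a decaying and a fluctuating part described in the introduction.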

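– Macroscopic properties of the asymptotic state. Since we deal with open systems, it is generally not meaningful to speak about the total energy (which is typically infinite). However, variations (fluxes) in total energy are often well defined. Using an argument of [7] (see also [6] for a heuristic argument based on the Hamiltonian approach) one shows that the formal expression for the total energy is constant during all time-intervals $[\tau_{m-1}, \tau_m)$, and that it undergoes a jump

$$j(m, \omega) := \varrho\big(\alpha_\omega^m(V(\omega_{m+1}) - V(\omega_m))\big) \tag{1.12}$$

at time step $m$. The variation of the total energy between the instants 0 and $m$ is then

$$\Delta E(m, \omega) = \sum_{k=1}^{m} j(k, \omega).$$

The relative entropy of $\varrho$ with respect to $\varrho_0$, two normal states on $\mathfrak{M}$, is denoted by $\mathrm{Ent}(\varrho|\varrho_0)$. Our definition of relative entropy differs from that given in [5] by a sign, so that in our case, $\mathrm{Ent}(\varrho|\varrho_0) \geq 0$. For a thermodynamic interpretation of entropy and its relation to energy, we assume for the next result that $\psi_{\mathcal{S}}$ is a $(\beta_{\mathcal{S}}, \alpha_{\mathcal{S}}^t)$–KMS state on $\mathfrak{M}_{\mathcal{S}}$, and that the $\psi_{\mathcal{E}_m}$ are $(\beta_{\mathcal{E}_m}, \alpha_{\mathcal{E}_m}^t)$–KMS states on $\mathfrak{M}_{\mathcal{E}_m}$, where $\beta_{\mathcal{S}}$ is the inverse temperature of $\mathcal{S}$, and $\beta_{\mathcal{E}_m}$ are random inverse temperatures of the $\mathcal{E}_m$. Let $\varrho_0$ be the state on $\mathfrak{M}$ determined by the vector $\psi_{\mathcal{S}} \otimes \psi_{\mathcal{C}} = \psi_{\mathcal{S}} \otimes_m \psi_{\mathcal{E}_m}$. The change of relative entropy is denoted $\Delta S(m, \omega) := \mathrm{Ent}(\varrho \circ \alpha_\omega^m | \varrho_0) - \mathrm{Ent}(\varrho|\varrho_0)$.

Theorem 1.4 (Energy and entropy productions, 2nd law of thermodynamics). Let $\varrho$ be a normal state on $\mathfrak{M}$. Then

$$\lim_{m \to \infty} \frac{\Delta E(m, \omega)}{m} =: dE_+ = \varrho_+\Big(\mathbb{E}\big[P(L_{\mathcal{S}} + V - e^{i\tau L}(L_{\mathcal{S}} + V)e^{-i\tau L})P\big]\Big) \quad \text{a.s.},$$

$$\lim_{m \to \infty} \frac{\Delta S(m, \omega)}{m} =: dS_+ = \varrho_+\Big(\mathbb{E}\big[\beta_{\mathcal{E}}\, P(L_{\mathcal{S}} + V - e^{i\tau L}(L_{\mathcal{S}} + V)e^{-i\tau L})P\big]\Big) \quad \text{a.s.}$$

We call $dE_+$ and $dS_+$ the asymptotic energy and entropy productions; they are independent of the initial state $\varrho$. If $\beta_{\mathcal{E}}$ is deterministic, i.e., $\omega$-independent, then the system satisfies the second law of thermodynamics: $dS_+ = \beta_{\mathcal{E}}\, dE_+$.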


– Explicit expressions for asymptotic states. We apply our general results to spin-spin and spin-fermion systems, presenting here a selection of results, and referring the reader to Sects. 6 and 7 for additional results and more detail.

Spin-spin systems. Both $\mathcal{S}$ and $\mathcal{E}$ are two-level atoms with Hamiltonians $h_{\mathcal{S}}$, $h_{\mathcal{E}}$ having ground state energy zero, and excited energies $E_{\mathcal{S}}$ and $E_{\mathcal{E}}$, respectively. The Hamiltonian describing the interaction of $\mathcal{S}$ with one $\mathcal{E}$ is given by $h = h_{\mathcal{S}} + h_{\mathcal{E}} + \lambda v$, where $\lambda$ is a coupling parameter, and $v$ induces energy exchange processes,

$$v := a_{\mathcal{S}} \otimes a_{\mathcal{E}}^* + a_{\mathcal{S}}^* \otimes a_{\mathcal{E}}. \tag{1.13}$$

Here, $a_\#$ denotes the annihilation operators and $a_\#^*$ the creation operators of $\# = \mathcal{S}, \mathcal{E}$. The Gibbs state at inverse temperature $\beta$ is given by

$$\varrho_{\beta,\#}(A) = \frac{\mathrm{Tr}(e^{-\beta h_\#} A)}{Z_{\beta,\#}}, \quad \text{where} \quad Z_{\beta,\#} = \mathrm{Tr}(e^{-\beta h_\#}). \tag{1.14}$$
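The spin-spin model defined by (1.13) and (1.14) is small enough to simulate directly. The following sketch is an illustration under stated assumptions, not the Liouvillean formalism of the paper: it works in the Hamiltonian (density matrix) picture, where one interaction step maps the state of $\mathcal{S}$ to the partial trace over $\mathcal{E}$ of $e^{-i\tau h}(\rho_{\mathcal{S}} \otimes \rho_{\mathcal{E},\beta})e^{i\tau h}$. The parameter values and the uniform distribution of $\tau$ are arbitrary choices. For these choices the state of $\mathcal{S}$ approaches the Gibbs state at inverse temperature $\beta E_{\mathcal{E}}/E_{\mathcal{S}}$, the value that appears in Theorem 1.5, Point 1.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from the paper).
E_S, E_E, lam, beta = 1.0, 1.5, 0.3, 2.0

a = np.array([[0.0, 1.0], [0.0, 0.0]])  # annihilation operator, basis (ground, excited)
num = a.T @ a                            # number operator diag(0, 1)
I2 = np.eye(2)

# h = h_S + h_E + lambda * v, with v as in (1.13); ordering is S (x) E.
h = (E_S * np.kron(num, I2) + E_E * np.kron(I2, num)
     + lam * (np.kron(a, a.T) + np.kron(a.T, a)))
evals, evecs = np.linalg.eigh(h)

def gibbs(inv_temp, energy):
    """Gibbs density matrix of a two-level atom with levels 0 and `energy`."""
    w = np.exp(-inv_temp * energy * np.array([0.0, 1.0]))
    return np.diag(w / w.sum())

def step(rho_S, tau):
    """One interaction with a fresh atom E in its Gibbs state, then trace out E."""
    U = evecs @ np.diag(np.exp(-1j * tau * evals)) @ evecs.conj().T
    rho = U @ np.kron(rho_S, gibbs(beta, E_E)) @ U.conj().T
    return np.einsum("ikjk->ij", rho.reshape(2, 2, 2, 2))  # partial trace over E

rng = np.random.default_rng(1)
T = 2 * np.pi / np.sqrt((E_S - E_E) ** 2 + 4 * lam ** 2)

rho_S = np.full((2, 2), 0.5, dtype=complex)  # pure initial state with coherences
for _ in range(500):
    tau = rng.uniform(0.2 * T, 0.8 * T)      # random time, never a multiple of T
    rho_S = step(rho_S, tau)

target = gibbs(beta * E_E / E_S, E_S)
print(np.round(rho_S.real, 6))
print(np.round(target, 6))
```

Choosing $\tau$ equal to a multiple of $T$ in every step would leave the populations of $\mathcal{S}$ unchanged in this sketch, which is the degenerate situation excluded by the condition on $\tau$ in Theorem 1.5.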

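We take the reference state to be $\psi_0 = \psi_{\mathcal{S}} \otimes_{m \geq 1} \psi_{\mathcal{E}_m, \beta_{\mathcal{E}_m}}$, where $\psi_{\mathcal{S}}$ is the tracial state on $\mathcal{S}$, and $\psi_{\mathcal{E}_m, \beta_{\mathcal{E}_m}}$ is the Gibbs state of $\mathcal{E}_m$ (represented by a single vector in an appropriate “GNS” Hilbert space, see Sect. 6). The following result deals with three situations:

1. The interaction time $\tau$ is random. It is physically reasonable to assume that $\tau(\omega)$ varies within an interval of uncertainty, since it cannot be controlled exactly in experiments.
2. The excitation energy of $\mathcal{E}$ is random. This situation occurs if various kinds of atoms are injected into the cavity, or if some impurity atoms enter it.
3. The temperature of the incoming atoms is random. This is physically reasonable since the incoming atom beam’s temperature cannot be controlled exactly in experiments.

Theorem 1.5 (Random spin-spin system). Set $T := \dfrac{2\pi}{\sqrt{(E_{\mathcal{S}} - E_{\mathcal{E}})^2 + 4\lambda^2}}$.

1. Random interaction time. Suppose that $\beta_{\mathcal{E}_m} = \beta$ is constant, and that $\tau(\omega) > 0$ is a random variable satisfying $\mathrm{p}(\tau \notin T\mathbb{N}) \neq 0$. Then there exists a set $\widetilde{\Omega} \subset \Omega^{\mathbb{N}^*}$ of probability one, such that for all $\omega \in \widetilde{\Omega}$, for all normal states $\varrho$ on $\mathfrak{M}$ and for all observables $A$ of $\mathcal{S}$,

$$\lim_{\mu \to \infty} \frac{1}{\mu} \sum_{m=1}^{\mu} \varrho\big(\alpha_\omega^m(A)\big) = \varrho_{\beta',\mathcal{S}}(A), \tag{1.15}$$

with $\beta' = \beta'_1 := \beta E_{\mathcal{E}}/E_{\mathcal{S}}$.

2. Random excitation energy of $\mathcal{E}$. Suppose that $\tau$ and $\beta_{\mathcal{E}_m} = \beta$ are constant, and that $E_{\mathcal{E}}(\omega) > 0$ is a random variable satisfying $\mathrm{p}(\tau \notin T\mathbb{N}) \neq 0$. (Here, $T = T(\omega)$ is random via $E_{\mathcal{E}}(\omega)$.) Then there exists a set $\widetilde{\Omega} \subset \Omega^{\mathbb{N}^*}$ of probability one s.t. for all $\omega \in \widetilde{\Omega}$, for all normal initial states $\varrho$ on $\mathfrak{M}$ and for all observables $A$ of $\mathcal{S}$, (1.15) holds with

$$\beta' = \beta'_2 := -E_{\mathcal{S}}^{-1} \log\Big( 2\Big[1 - (1 - \mathbb{E}[e_0])^{-1}\, \mathbb{E}\big[(1 - e_0)(1 - 2Z^{-1}_{\beta E_{\mathcal{E}}/E_{\mathcal{S}},\, \mathcal{S}})\big]\Big]^{-1} - 1 \Big),$$

and where

$$e_0 = \frac{\Big(E_{\mathcal{S}} - E_{\mathcal{E}} - \sqrt{(E_{\mathcal{S}} - E_{\mathcal{E}})^2 + 4\lambda^2}\Big)^2 + 4\lambda^2 e^{i\tau\sqrt{(E_{\mathcal{S}} - E_{\mathcal{E}})^2 + 4\lambda^2}}}{\Big(E_{\mathcal{S}} - E_{\mathcal{E}} - \sqrt{(E_{\mathcal{S}} - E_{\mathcal{E}})^2 + 4\lambda^2}\Big)^2 + 4\lambda^2}. \tag{1.16}$$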


3. Random temperature of $\mathcal{E}$. Suppose that $\beta(\omega)$ is a random variable, and that $\tau > 0$ satisfies $\tau \notin T\mathbb{N}$. Then there exists a set $\widetilde{\Omega} \subset \Omega^{\mathbb{N}^*}$ of probability one s.t. for all $\omega \in \widetilde{\Omega}$, for all normal initial states $\varrho$ on $\mathfrak{M}$ and for all observables $A$ of $\mathcal{S}$, (1.15) holds with

$$\beta' = \beta'_3 := -E_{\mathcal{S}}^{-1} \log\Big( \mathbb{E}\big[Z^{-1}_{\beta(\omega) E_{\mathcal{E}}/E_{\mathcal{S}},\, \mathcal{S}}\big]^{-1} - 1 \Big).$$

Remarks. 1. In the situation of Point 1 of Theorem 1.5, we obtain the following result, which is sharper than (1.15). There are constants $C, \alpha > 0$ and a random variable $n_0(\omega)$ satisfying $\mathbb{E}[e^{\alpha n_0}] < \infty$ such that, for each $\omega \in \widetilde{\Omega}$,

$$\big|\varrho\big(\alpha_\omega^n(A)\big) - \varrho_{\beta',\mathcal{S}}(A)\big| \leq C e^{-\alpha n},$$

for all $n \geq n_0(\omega)$, all observables $A$ and all normal initial states $\varrho$.

2. If $E_{\mathcal{S}} = E_{\mathcal{E}}$ then $\beta'_1 = \beta$. In the case of identical interactions (no randomness), the system $\mathcal{S}$ is therefore “thermalized” by the elements of the chain, a fact which was already noticed in [2]. One might expect that for a randomly fluctuating temperature of the $\mathcal{E}$, the system $\mathcal{S}$ would be thermalized at an asymptotic temperature equal to the average of the chain temperature. However, Point 3 of the above theorem shows that this is not the case: the asymptotic temperature is in general not the average temperature. The random repeated interaction process induces a more complicated thermalization effect on $\mathcal{S}$ than simple temperature averaging.

Spin-fermion systems. Let $\mathcal{S}$ be a spin-1/2 system with Hilbert space of pure states $\mathbb{C}^2$, and Hamiltonian given by the Pauli matrix $\sigma_z$. We take the systems $\mathcal{E}$ to be infinitely extended thermal fermi fields. They model dispersive environments. Let $a(k)$ and $a^*(k)$ denote the usual fermionic annihilation and creation operators, and let $a(f) = \int_{\mathbb{R}^3} \overline{f(k)}\, a(k)\, d^3k$, $a^*(f) = \int_{\mathbb{R}^3} f(k)\, a^*(k)\, d^3k$, for square-integrable $f$. We take the state $\varrho_\beta$ of $\mathcal{E}$ to be the equilibrium state at inverse temperature $\beta$. It is characterized by $\varrho_\beta(a^*(f)a(f)) = \langle f, (1 + e^{\beta h})^{-1} f \rangle$, where the $h$ appearing in the scalar product is the Hamiltonian of a single fermion. We represent the one-body fermion space as $\mathfrak{h} = L^2(\mathbb{R}_+, d\mu(r); \mathfrak{g})$, where $\mathfrak{g}$ is an auxiliary Hilbert space, and we take $h$ to be the operator of multiplication by $r \in \mathbb{R}_+$.⁴

At each interaction step, $\mathcal{S}$ interacts with a fresh system $\mathcal{E}$ for a duration $\tau$. The interaction induces energy exchanges between the two interacting subsystems; it is represented by the operator $\lambda V$, where $\lambda$ is a small coupling constant, and $V = \sigma_x \otimes [a^*(g) + a(g)]$. Here, $\sigma_x$ is the Pauli matrix and $g = g(k) \in L^2(\mathbb{R}^3, d^3k)$ is a form factor determining the relative strength of interaction between $\mathcal{S}$ and modes of the thermal field. We consider random interaction times of the form $\tau(\omega) = \tau_0 + \sigma(\omega)$, where $\tau_0$ is a fixed value, and $\sigma(\omega) \in [-\epsilon, \epsilon]$ is a random variable with small amplitude $\epsilon$.

Theorem 1.6 (Random spin-fermion system). Assume that the form factor satisfies $\|(1 + e^{\beta h/2}) g\|_{L^2(\mathbb{R}^3, d^3k)} < \infty$, and that $\mathrm{p}(\sigma(\omega) \in \frac{\pi}{2}\mathbb{N} - \tau_0) \neq 1$. There is a constant $\lambda_0 > 0$ s.t. if $0 < |\lambda| < \lambda_0$, then Theorem 1.3 applies, and the asymptotic state $\varrho_+$, (1.10), has the following expansion: for any $A \in \mathfrak{M}_{\mathcal{S}}$,

$$\varrho_+(A) = q(\sigma) A_{00} + (1 - q(\sigma)) A_{11} + R_{\sigma,\lambda}(A), \tag{1.17}$$

where $A_{ij} = \langle i, A j \rangle$, $i, j = 0, 1$, and $|0\rangle$, $|1\rangle$ are the eigenvectors of $\sigma_z$ with eigenvalues $\pm 1$. The remainder term satisfies $|R_{\sigma,\lambda}(A)| \leq C\|A\|(\epsilon^3 + \lambda^2)$, where $C$ is independent of $\epsilon, \sigma, \lambda, A$.

⁴ For instance, for usual non-relativistic, massive fermions, the single-particle Hilbert space is $L^2(\mathbb{R}^3, d^3k)$ (Fourier space), and the Hamiltonian is the multiplication by $|k|^2$. This corresponds to $\mathfrak{g} = L^2(S^2, d\Sigma)$ (uniform measure on $S^2$), and $d\mu(r) = \frac{1}{2} r^{1/2} dr$.


The probability $q(\sigma)$ is given by

$$q(\sigma) = \frac{\alpha_+}{\alpha_+ + \alpha_-} + 2\mathbb{E}[\sigma]\, \frac{\alpha_-\xi_+ - \alpha_+\xi_-}{\tau_0^2(\alpha_+ + \alpha_-)^2} + 4(\mathbb{E}[\sigma])^2\, \frac{(\alpha_-\xi_+ - \alpha_+\xi_-)(\xi_+ + \xi_-)}{\tau_0^4(\alpha_+ + \alpha_-)^3} + \mathbb{E}[\sigma^2]\, \frac{\alpha_-\eta_+ - \alpha_+\eta_-}{\tau_0^2(\alpha_+ + \alpha_-)^2},$$

where, with $\mathrm{sinc}(x) = \sin(x)/x$,

$$\alpha_\pm = \int d\mu(r)\, \frac{\|g(r)\|_{\mathfrak{g}}^2}{1 + e^{-\beta r}} \Big( e^{-\beta r}\, \mathrm{sinc}^2\Big[\frac{(r \mp 2)\tau_0}{2}\Big] + \mathrm{sinc}^2\Big[\frac{(r \pm 2)\tau_0}{2}\Big] \Big),$$

$$\xi_\pm = \tau_0 \int d\mu(r)\, \frac{\|g(r)\|_{\mathfrak{g}}^2}{1 + e^{-\beta r}} \Big( e^{-\beta r}\, \mathrm{sinc}[(r \mp 2)\tau_0] + \mathrm{sinc}[(r \pm 2)\tau_0] \Big), \tag{1.18}$$

$$\eta_\pm = \int d\mu(r)\, \frac{\|g(r)\|_{\mathfrak{g}}^2}{1 + e^{-\beta r}} \Big( e^{-\beta r} \cos[(r \mp 2)\tau_0] + \cos[(r \pm 2)\tau_0] \Big).$$

Expansion (1.17) shows in particular that to lowest order in $\lambda$, the final state is diagonal in the energy basis. This is a sign of decoherence of $\mathcal{S}$ due to contact with the environment $\mathcal{C}$.

Organization of the paper. In Sect. 2 we cast the dynamical problem into a shape suitable for further analysis. Our main result there is Proposition 2.5. Section 3 contains the proof of Theorem 1.2, and in Sects. 4 and 5 we present the proof of Theorems 1.3 and 1.4, respectively. In Sects. 6 and 7 we present the setup and main results for spin-spin and spin-fermion systems. In particular, we give the proofs of Theorems 1.5 and 1.6.

2. Repeated Interactions and Matrix Products

In this section, we link the repeated interaction dynamics to products of matrices. This reduction is a purely “algebraic” procedure and randomness plays no role here. Throughout the paper, we assume, without further mentioning it, that

(A1) $\dim \mathcal{H}_{\mathcal{S}} = d < \infty$, and the reference vectors $\psi_\#$ are cyclic and separating for $\mathfrak{M}_\#$ ($\# = \mathcal{S}$ or $\mathcal{E}_m$).

Recall that cyclicity means that $\mathfrak{M}_\# \psi_\#$ is dense in $\mathcal{H}_\#$, and the separating property means that $A_\# \psi_\# = 0 \Rightarrow A_\# = 0$, $\forall A_\# \in \mathfrak{M}_\#$, and is equivalent to $\mathfrak{M}'_\# \psi_\#$ being dense in $\mathcal{H}_\#$, where $\mathfrak{M}'_\#$ is the commutant von Neumann algebra of $\mathfrak{M}_\#$.

2.1. Splitting off the trivial dynamics. We isolate the “free part” of the dynamics given in (1.2)–(1.4), i.e. that of the elements $\mathcal{E}_k$ which do not interact with $\mathcal{S}$ at a given time step $m$.

Proposition 2.1. For any $m$, we have

$$U(m) = U_m^-\, e^{-i\tau_m L_m} \cdots e^{-i\tau_1 L_1}\, U_m^+, \tag{2.1}$$

where

$$U_m^- = \exp\Big[-i \sum_{j=1}^{m} \sum_{k=1}^{j-1} \tau_j L_{\mathcal{E}_k}\Big] \quad \text{and} \quad U_m^+ = \exp\Big[-i \sum_{j=1}^{m} \sum_{k > j} \tau_j L_{\mathcal{E}_k}\Big] \tag{2.2}$$

are unitary operators which act trivially on $\mathcal{H}_{\mathcal{S}}$ and satisfy $U_m^\pm \psi_{\mathcal{C}} = \psi_{\mathcal{C}}$, $\forall m \in \mathbb{N}^*$.
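The operator identity of Proposition 2.1 can be verified numerically on a finite toy chain. The sketch below is a sanity check under stated assumptions, not part of the paper: the chain is truncated to three two-level elements, random Hermitian matrices stand in for the Liouville operators and couplings, and the sum over $k > j$ in $U_m^+$ is cut off at the chain length. All names are hypothetical.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(2)
n_chain, m = 3, 2  # chain elements E_1..E_3; evolve for m = 2 steps

def random_hermitian(dim):
    x = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (x + x.conj().T) / 2

def site_op(op, site):
    """`op` on one site (0 is S, k >= 1 is E_k), identity on all other sites."""
    factors = [np.eye(2, dtype=complex)] * (n_chain + 1)
    factors[site] = op
    return reduce(np.kron, factors)

def exp_herm(H, t):
    """exp(-i t H) for Hermitian H via its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * t * w)) @ v.conj().T

L_S = site_op(random_hermitian(2), 0)
L_E = {k: site_op(random_hermitian(2), k) for k in range(1, n_chain + 1)}
# V_k couples S and E_k only (a product of commuting Hermitian factors).
V = {k: site_op(random_hermitian(2), 0) @ site_op(random_hermitian(2), k)
     for k in range(1, m + 1)}
tau = {k: rng.uniform(0.5, 1.5) for k in range(1, m + 1)}

L = {k: L_S + L_E[k] + V[k] for k in V}                               # (1.4)
L_tilde = {k: L[k] + sum(L_E[n] for n in L_E if n != k) for k in L}   # (1.3)

def ordered_product(gens):
    """exp(-i tau_m G_m) ... exp(-i tau_1 G_1), latest step on the left."""
    U = np.eye(2 ** (n_chain + 1), dtype=complex)
    for k in range(1, m + 1):
        U = exp_herm(gens[k], tau[k]) @ U
    return U

U_full = ordered_product(L_tilde)  # U(m) as in (1.2)
U_int = ordered_product(L)         # middle factor of (2.1)

gen_minus = sum(tau[j] * L_E[k] for j in range(1, m + 1) for k in range(1, j))
gen_plus = sum(tau[j] * L_E[k]
               for j in range(1, m + 1) for k in range(j + 1, n_chain + 1))
U_minus, U_plus = exp_herm(gen_minus, 1.0), exp_herm(gen_plus, 1.0)  # (2.2)

print(np.allclose(U_full, U_minus @ U_int @ U_plus))
```

The check relies only on the fact that $L_k$ commutes with $L_{\mathcal{E}_n}$ for $n \neq k$, which is the commutation property used in the proof.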


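Proof. As the interaction Liouvillean at time $m$, $L_m$, and the free Liouvillean $L_{\mathcal{E}_k}$ commute provided $k \neq m$, we can write successively

$$\begin{aligned}
e^{-i\tau_1 \widetilde{L}_1} &= e^{-i\tau_1 L_1}\, e^{-i\tau_1 \sum_{k>1} L_{\mathcal{E}_k}}, \\
e^{-i\tau_2 \widetilde{L}_2} &= e^{-i\tau_2 L_{\mathcal{E}_1}}\, e^{-i\tau_2 L_2}\, e^{-i\tau_2 \sum_{k>2} L_{\mathcal{E}_k}}, \\
&\ \ \vdots \\
e^{-i\tau_m \widetilde{L}_m} &= e^{-i\tau_m \sum_{k<m} L_{\mathcal{E}_k}}\, e^{-i\tau_m L_m}\, e^{-i\tau_m \sum_{k>m} L_{\mathcal{E}_k}},
\end{aligned} \tag{2.3}$$

and then use this decomposition in (1.2).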

2.2. Choosing a suitable generator of dynamics. We follow an idea developed recently in the study of open quantum systems far from equilibrium, which allows one to represent the dynamics in a suitable way [9,7,8,12–14]. Let $J_m$ and $\Delta_m$ denote the modular conjugation and the modular operator of the pair $(\mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{E}_m}, \psi_{\mathcal{S}} \otimes \psi_{\mathcal{E}_m})$, respectively. For more detail see the above references as well as [5] for a textbook exposition. Throughout this paper, we assume the following condition on the interaction, without further mentioning it:

(A2) $\Delta_m^{1/2} V_m \Delta_m^{-1/2} \in \mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{E}_m}$, $\forall m \geq 1$.

We present explicit formulae for the modular conjugation and the modular operator for the spin-fermion system in Sect. 7. The Liouville operator $K_m$ at time $m$ associated to the reference state $\psi_{\mathcal{S}} \otimes \psi_{\mathcal{E}_m}$ is defined as

$$K_m = L_{\mathcal{S}} + L_{\mathcal{E}_m} + V_m - J_m \Delta_m^{1/2} V_m \Delta_m^{-1/2} J_m.$$

It satisfies $\|e^{\pm iK_m}\| \leq \exp\{\|\Delta_m^{1/2} V_m \Delta_m^{-1/2}\|\}$. (In [9], such operators are called C-Liouville operators.) The main dynamical features of $K_m$ are the relations

$$e^{itL_m} A\, e^{-itL_m} = e^{itK_m} A\, e^{-itK_m}, \quad \forall A \in \mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{C}},\ m \geq 1,\ t \in \mathbb{R}, \tag{2.4}$$

$$K_m\, \psi_{\mathcal{S}} \otimes \psi_{\mathcal{E}_m} = 0. \tag{2.5}$$

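Relation (2.4) means that $K_m$ implements the same dynamics as $L_m$. This is seen to hold by noting that the difference $K_m - L_m = -J_m \Delta_m^{1/2} V_m \Delta_m^{-1/2} J_m$ commutes with all $A \in \mathfrak{M}_{\mathcal{S}} \otimes \mathfrak{M}_{\mathcal{C}}$ (since $J\mathfrak{M}J = \mathfrak{M}'$, as is known from the Tomita-Takesaki theory of von Neumann algebras, see e.g. [5]). The advantage of using $K_m$ instead of $L_m$ is that $e^{itK_m}$ leaves $\psi_{\mathcal{S}} \otimes \psi_{\mathcal{E}_m}$ invariant. However, while $L_m$ is self-adjoint, $K_m$ is unbounded and not even normal.

We want to examine the large time behaviour of the evolution of a normal state $\varrho$ on $\mathfrak{M}$, defined by $\varrho \circ \alpha^m$ (see (1.5)). Since a normal state is a convex combination of vector states, it is not hard to see that one has to examine the large time evolution of vector states only. More precisely, by diagonalizing the density matrix, we can write $\rho = \sum_{j \geq 1} p_j |\phi_j\rangle\langle\phi_j|$, where $p_j \geq 0$ and $\sum_{j \geq 1} p_j = 1$, and where the $\phi_j$ are normalized vectors in $\mathcal{H}$. If we can show that $\lim_{m \to \infty} \langle \phi, \alpha^m(A)\phi \rangle = \varrho_\phi(A)$ exists for any normalized vector $\phi \in \mathcal{H}$, then any normal state $\varrho$ satisfies

$$\lim_{m \to \infty} \varrho(\alpha^m(A)) = \lim_{m \to \infty} \sum_{j \geq 1} p_j \langle \phi_j, \alpha^m(A)\phi_j \rangle = \sum_{j \geq 1} p_j\, \varrho_{\phi_j}(A). \tag{2.6}$$

In other words, we only have to analyze vector states $\varrho(\cdot) = \langle \phi, \cdot\, \phi \rangle$. If the asymptotic states $\varrho_\phi$ do not depend on the vector $\phi$, i.e. $\varrho_\phi \equiv \varrho_+$, then any normal initial state has asymptotic state $\varrho_+$, by (2.6). The above argument works equally well if the pointwise limit $m \to \infty$ is replaced by the ergodic limit.

Next, since, by Assumption (A1), $\psi_0 = \psi_{\mathcal{S}} \otimes \psi_{\mathcal{C}}$, where $\psi_{\mathcal{C}} = \otimes_{m \geq 1} \psi_{\mathcal{E}_m}$, is cyclic for the commutant $\mathfrak{M}'$ (which is equivalent to being separating for $\mathfrak{M}$), we can approximate any vector in $\mathcal{H}$ arbitrarily well by vectors

$$\phi = B' \psi_0, \tag{2.7}$$

for some

$$B' = B'_{\mathcal{S}} \otimes_{n=1}^{N} B'_n \otimes_{n > N} \mathbb{1}_{\mathcal{E}_n} \in \mathfrak{M}', \tag{2.8}$$

with $B'_{\mathcal{S}} \in \mathfrak{M}'_{\mathcal{S}}$, $B'_n \in \mathfrak{M}'_{\mathcal{E}_n}$ (with vanishing error as $N \to \infty$; see also [7]). Hence, we may restrict our attention to taking the limit $m \to \infty$ of expressions

$$\langle \psi_0, (B')^* \alpha^m(A) B' \psi_0 \rangle = \langle \psi_0, (B')^* B' \alpha^m(A) \psi_0 \rangle. \tag{2.9}$$

2.3. Observables of the small system. To present the essence of our arguments in an unencumbered way, we first consider the Heisenberg evolution of observables $A_{\mathcal{S}} \in \mathfrak{M}_{\mathcal{S}}$, and we treat more general observables in the next section. Consider expression (2.9). Using Proposition 2.1, we obtain

$$\alpha^m(A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) = U(m)^* (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}})\, U(m) = (U_m^+)^* e^{i\tau_1 L_1} \cdots e^{i\tau_m L_m} (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}})\, e^{-i\tau_m L_m} \cdots e^{-i\tau_1 L_1} U_m^+, \tag{2.10}$$

where we made use of the fact that $U_m^-$ acts trivially on $\mathcal{H}_{\mathcal{S}}$. Due to the properties of the unitary $U_m^+$ specified in Proposition 2.1, and due to (2.4), (2.5), we have

$$\begin{aligned}
\alpha^m(A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}})\, \psi_0 &= (U_m^+)^* e^{i\tau_1 L_1} \cdots e^{i\tau_m L_m} (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}})\, e^{-i\tau_m L_m} \cdots e^{-i\tau_1 L_1} \psi_0 \\
&= (U_m^+)^* e^{i\tau_1 K_1} \cdots e^{i\tau_m K_m} (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}})\, \psi_0.
\end{aligned} \tag{2.11}$$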

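Let us introduce $P_N = \mathbb{1}_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{E}_1} \otimes \cdots \mathbb{1}_{\mathcal{E}_N} \otimes P_{\psi_{\mathcal{E}_{N+1}}} \otimes P_{\psi_{\mathcal{E}_{N+2}}} \otimes \cdots$, where $P_{\psi_{\mathcal{E}_k}} = |\psi_{\mathcal{E}_k}\rangle\langle\psi_{\mathcal{E}_k}|$. From the definition of $B'$, (2.8), we see that $\langle \psi_0 | (B')^* B' = \langle \psi_0 | (B')^* B' P_N$. Moreover, introducing the $m$-independent unitary operator

$$\widetilde{U}_N^+ := \exp\Big[-i \sum_{j=1}^{N-1} \sum_{k=j+1}^{N} \tau_j L_{\mathcal{E}_k}\Big] = P_N U_m^+,$$

we can write, for $m > N$,

$$\begin{aligned}
\langle \psi_0, (B')^* B' \alpha^m(A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0 \rangle &= \langle \psi_0, (B')^* B' (\widetilde{U}_N^+)^* P_N\, e^{i\tau_1 K_1} \cdots e^{i\tau_m K_m} (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0 \rangle \\
&= \langle \psi_0, (B')^* B' (\widetilde{U}_N^+)^* e^{i\tau_1 K_1} \cdots e^{i\tau_N K_N} P_N\, e^{i\tau_{N+1} K_{N+1}} \cdots e^{i\tau_m K_m} (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0 \rangle.
\end{aligned}$$

We define the projection

$$P = \mathbb{1}_{\mathcal{S}} \otimes |\psi_{\mathcal{C}}\rangle\langle\psi_{\mathcal{C}}|, \tag{2.12}$$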


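and observe that

$$P_N\, e^{i\tau_{N+1} K_{N+1}} \cdots e^{i\tau_m K_m} (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0 = P_N\, e^{i\tau_{N+1} K_{N+1}} \cdots e^{i\tau_m K_m} P (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0 = P\, e^{i\tau_{N+1} K_{N+1}} \cdots e^{i\tau_m K_m} P (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0.$$

By a simple argument using the independence of the elements $\mathcal{E}_k$ of $\mathcal{C}$, we show exactly as in Proposition 4.1 of [7], that for any $q \geq 1$ and any distinct integers $n_1, \ldots, n_q$,

$$P e^{i\tau_{n_1} K_{n_1}} e^{i\tau_{n_2} K_{n_2}} \cdots e^{i\tau_{n_q} K_{n_q}} P = P e^{i\tau_{n_1} K_{n_1}} P e^{i\tau_{n_2} K_{n_2}} P \cdots P e^{i\tau_{n_q} K_{n_q}} P. \tag{2.13}$$

Therefore, introducing operators $M_j$ acting on $\mathcal{H}_{\mathcal{S}}$ by

$$P e^{i\tau_j K_j} P = M_j \otimes |\psi_{\mathcal{C}}\rangle\langle\psi_{\mathcal{C}}|, \quad \text{or} \quad M_j \simeq P e^{i\tau_j K_j} P, \tag{2.14}$$

we have proven the following result.

Proposition 2.2. Let $A_{\mathcal{S}} \in \mathfrak{M}_{\mathcal{S}}$ and $\phi = B'\psi_0$ with $B'$ as in (2.8). Then for any $m > N$ we have

$$\langle \phi, \alpha^m(A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \phi \rangle = \langle \psi_0, (B')^* B' (\widetilde{U}_N^+)^* e^{i\tau_1 K_1} \cdots e^{i\tau_N K_N} P\, M_{N+1} M_{N+2} \cdots M_m (A_{\mathcal{S}} \otimes \mathbb{1}_{\mathcal{C}}) \psi_0 \rangle. \tag{2.15}$$

Proposition 2.2 shows how the large time dynamics of a repeated interaction system is described by products

$$\Psi_m = M_1 M_2 \cdots M_m \quad \text{on } \mathcal{H}_{\mathcal{S}}. \tag{2.16}$$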
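The factorization (2.13), which turns the compressed dynamics into the matrix product (2.16), depends only on the tensor product structure and can be checked numerically. The sketch below is an illustration under stated assumptions: the chain is truncated to two elements, and generic (non-normal) random matrices `X1`, `X2`, acting on $\mathcal{S} \otimes \mathcal{E}_1$ and $\mathcal{S} \otimes \mathcal{E}_2$ respectively, stand in for the operators $e^{i\tau_j K_j}$. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
d_S, d_E = 2, 3  # toy dimensions for S and for each chain element

def rand_op(dim):
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

def rand_unit_vector(dim):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

I_S, I_E = np.eye(d_S), np.eye(d_E)
psi_C = np.kron(rand_unit_vector(d_E), rand_unit_vector(d_E))  # psi_E1 (x) psi_E2
P = np.kron(I_S, np.outer(psi_C, psi_C.conj()))                # P = 1_S (x) |psi_C><psi_C|

# X1 acts nontrivially on S and E_1 only, X2 on S and E_2 only.
X1 = sum(np.kron(np.kron(rand_op(d_S), rand_op(d_E)), I_E) for _ in range(2))
X2 = sum(np.kron(np.kron(rand_op(d_S), I_E), rand_op(d_E)) for _ in range(2))

def reduce_to_S(X):
    """Operator M on H_S such that P X P = M (x) |psi_C><psi_C|."""
    iso = np.kron(I_S, psi_C.reshape(-1, 1))  # isometry H_S -> H_S (x) H_C
    return iso.conj().T @ X @ iso

lhs = P @ X1 @ X2 @ P
rhs = P @ X1 @ P @ X2 @ P
print(np.allclose(lhs, rhs))
print(np.allclose(reduce_to_S(X1 @ X2), reduce_to_S(X1) @ reduce_to_S(X2)))
```

If both operators acted on the same chain element, the two sides would in general differ, which is why (2.13) requires distinct indices $n_1, \ldots, n_q$.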

The main features of the matrices $M_j$, inherited from those of $e^{i\tau_j K_j}$, are given in the following lemma.

Lemma 2.3 ([7], Prop. 2.1). Assuming (A1), we have $M_j \psi_{\mathcal{S}} = \psi_{\mathcal{S}}$, for all $j \in \mathbb{N}^*$. Moreover, to any $\phi \in \mathcal{H}_{\mathcal{S}}$ there corresponds a unique $A \in \mathfrak{M}_{\mathcal{S}}$ such that $\phi = A\psi_{\mathcal{S}}$. $|||\phi||| := \|A\|_{\mathcal{B}(\mathcal{H}_{\mathcal{S}})}$ defines a norm on $\mathcal{H}_{\mathcal{S}}$, and as operators on $\mathcal{H}_{\mathcal{S}}$ endowed with this norm, the $M_j$ are contractions for any $j \in \mathbb{N}^*$.

Remark. It follows from Lemma 2.3 that the spectrum of $M_j$ lies in the closed complex unit disk, and that 1 is an eigenvalue of each $M_j$ (with common eigenvector $\psi_{\mathcal{S}}$).

2.4. Instantaneous observables. So far, we have only considered observables of the system $\mathcal{S}$. In this section, we extend the analysis to the more general class of instantaneous observables, defined in (1.6). Those are time-dependent observables, which, at time $m$, measure quantities of the system $\mathcal{S}$ and of a finite number of elements $\mathcal{E}_k$ of the chain, namely the element interacting at the given time-step, plus the $l$ preceding elements and the $r$ following elements in the chain. Physically important instantaneous observables are those with indices $j = -1, 0$: they appear naturally in the study of the energy exchange process between the system $\mathcal{S}$ and the chain (see Sect. 5); they also appear in experiments where one makes a measurement on the element right after it has interacted with $\mathcal{S}$ (the atom which exits the cavity) in order to get indirect information on the state of the latter.


The Heisenberg evolution of instantaneous observables is computed in a straightforward way, as for observables of the form AS ⊗ 1lC . We refrain from presenting all details of the derivation and present the main steps only. Let αkm,n (B) := ei(

m j=n

τ j )L Ek

Be−i(

m j=n

τ j )L Ek

, n ≤ m,

(2.17)

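denote the free evolution from time $n-1$ to $m$ of an observable $B$ acting non-trivially on $\mathcal{H}_{\mathcal{E}_k}$ only, with the understanding that $\alpha_k^{m,n}$ equals the identity for $n > m$. With this definition and (2.2), we get

$$\begin{aligned}
(U_m^-)^*\big(A_S \otimes_{j=-l}^{r} B_m^{(j)}\big)U_m^-
&= A_S \otimes \alpha_{m-l}^{m,m-l+1}(B_m^{(-l)}) \otimes \cdots \otimes \alpha_{m-1}^{m,m}(B_m^{(-1)}) \otimes B_m^{(0)} \otimes \cdots \otimes B_m^{(r)} \\
&= A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)}).
\end{aligned} \tag{2.18}$$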

Hence,

$$\alpha^m\big(A_S \otimes_{j=-l}^{r} B_m^{(j)}\big) = (U_m^+)^* e^{i\tau_1 L_1}\cdots e^{i\tau_m L_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)e^{-i\tau_m L_m}\cdots e^{-i\tau_1 L_1}U_m^+. \tag{2.19}$$

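Consider a vector state $\langle\phi, \cdot\,\phi\rangle$, where $\phi$ is given by (2.7). We proceed as in the previous section to obtain

$$\begin{aligned}
\big\langle\phi, \alpha^m\big(A_S \otimes_{j=-l}^{r} B_m^{(j)}\big)\phi\big\rangle
&= \big\langle B'\psi_0,\, B'(\widetilde{U}_N^+)^* e^{i\tau_1 L_1}\cdots e^{i\tau_m L_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)e^{-i\tau_m L_m}\cdots e^{-i\tau_1 L_1}\psi_0\big\rangle \\
&= \big\langle B'\psi_0,\, B'(\widetilde{U}_N^+)^* P_N\, e^{i\tau_1 K_1}\cdots e^{i\tau_m K_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)\psi_0\big\rangle.
\end{aligned} \tag{2.20}$$

The vector to the right of $(\widetilde{U}_N^+)^*$ can be further expanded as

$$\begin{aligned}
& e^{i\tau_1 K_1}\cdots e^{i\tau_N K_N} P_N\, e^{i\tau_{N+1} K_{N+1}}\cdots e^{i\tau_m K_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)\psi_0 \\
&\quad= e^{i\tau_1 K_1}\cdots e^{i\tau_N K_N} P\, e^{i\tau_{N+1} K_{N+1}}\cdots e^{i\tau_m K_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)\psi_0 \\
&\quad= e^{i\tau_1 K_1}\cdots e^{i\tau_N K_N} P\, M_{N+1}\cdots M_{m-l-1} \\
&\qquad\times P\, e^{i\tau_{m-l} K_{m-l}}\cdots e^{i\tau_m K_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)\psi_0,
\end{aligned} \tag{2.21}$$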

where $P$ has been defined in (2.12), and where we have proceeded as in the derivation of (2.15) to arrive at the product of the matrices $M_{N+1}\cdots M_{m-l-1}$. We now define the operator $N_m = N_m(O)$, see (1.6), acting on $\mathcal{H}_S$ by

$$(N_m\psi_S)\otimes\psi_{\mathcal{C}} := P\, e^{i\tau_{m-l} K_{m-l}}\cdots e^{i\tau_m K_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)\psi_0. \tag{2.22}$$

We will also denote the l.h.s. simply by $N_m\psi_0$. The operator $N_m$ depends on the instantaneous observable, $N_m = N_m(A_S, B_m^{(-l)}, \ldots, B_m^{(r)})$. It can be expressed as follows.


L. Bruneau, A. Joye, M. Merkli

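Proposition 2.4. Let $\alpha^{m,n}$ denote the dynamics from time $n$ to time $m$, i.e., $\alpha^{m,n}(\cdot) = U(m,n)^* \cdot\, U(m,n)$, where $U(m,n) = U(m)U(n)^*$, and $U(m)$ is given in (2.1). Then we have

$$\begin{aligned}
N_m\psi_0 &= P\,\alpha^{m,m-l-1}\big(A_S \otimes_{j=-l}^{r} B_m^{(j)}\big)\psi_0 \\
&= P\,\alpha^{m,m-l-1}\big(A_S \otimes_{j=-l}^{0} B_m^{(j)}\big)\psi_0 \prod_{k=1}^{r}\big\langle\psi_{\mathcal{E}_{m+k}}, B_m^{(k)}\psi_{\mathcal{E}_{m+k}}\big\rangle.
\end{aligned} \tag{2.23}$$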

Proof. The second equality is clear, since the dynamics involves only the $\mathcal{E}_k$ with indices $k \le m$. To prove the first equality, we use the properties of the operators $K_j$ and the definition (2.22) to see that

$$N_m\psi_0 = P\, e^{i\tau_{m-l} L_{m-l}}\cdots e^{i\tau_m L_m}\big(A_S \otimes_{j=-l}^{r} \alpha_{m+j}^{m,m+j+1}(B_m^{(j)})\big)\, e^{-i\tau_m L_m}\cdots e^{-i\tau_{m-l} L_{m-l}}\psi_0. \tag{2.24}$$

Next, we write the $\alpha_{m+j}^{m,m+j+1}$ in terms of the generators $L_{\mathcal{E}_{m+j}}$, see (2.17),

$$\alpha_{m+j}^{m,m+j+1}(\cdot) = e^{i(\tau_{m+j+1}+\cdots+\tau_m)L_{\mathcal{E}_{m+j}}} \,\cdot\; e^{-i(\tau_{m+j+1}+\cdots+\tau_m)L_{\mathcal{E}_{m+j}}}.$$

Inserting this expression into (2.24) we can distribute the generators $L_{\mathcal{E}_{m+j}}$ among the propagators in (2.24), and we see that

$$N_m\psi_0 = P\, e^{i\tau_{m-l} L_{m-l}}\cdots e^{i\tau_m L_m}\big(A_S \otimes_{j=-l}^{r} B_m^{(j)}\big)e^{-i\tau_m L_m}\cdots e^{-i\tau_{m-l} L_{m-l}}\psi_0,$$

where the $L_k$, (1.2), give the full dynamics.

Finally, $N_m$ can be defined on all of $\mathcal{H}_S$ in the following way. From Proposition 2.4, it is immediate that for all observables $A_S'$ in the commutant $\mathfrak{M}_S'$, we can set $N_m A_S'\psi_0 := A_S' N_m\psi_0$. Since $\mathfrak{M}_S'\psi_S = \mathcal{H}_S$ (separability of $\psi_S$), $N_m$ is defined on all of $\mathcal{H}_S$. We have proven the following result.

Proposition 2.5. Let $O$ be an instantaneous observable, (1.6), and let $\phi = B'\psi_0$ with $B'$ as in (2.8). Then we have for any $m > N + l + 1$,

$$\langle\phi, \alpha^m(O)\phi\rangle = \big\langle\psi_0, (B')^*B'(\widetilde{U}_N^+)^* e^{i\tau_1 K_1}\cdots e^{i\tau_N K_N} P\, M_{N+1}\cdots M_{m-l-1} N_m(O)\psi_0\big\rangle,$$

where the $M_j$ are defined in (2.14), and $N_m(O)$ is given in (2.22).

To understand the large time behaviour of instantaneous observables, we study the $n \to \infty$ asymptotics of products

$$\Psi_n N_{n+l+1} = M_1 M_2\cdots M_n N_{n+l+1} \quad\text{on } \mathcal{H}_S, \tag{2.25}$$

where $N_{n+l+1}$ involves only quantities of the systems $\mathcal{S}$ and $\mathcal{E}_k$, with $k = n+1, \ldots, n+l+r+1$. The numbers $l, r$ are determined by the instantaneous observable $O$ (1.6).


3. Proof of Theorem 1.2

According to Proposition 2.5, the large time dynamics is described by products of operators of the form (2.25), in the limit $n \to \infty$. We will use in this section our basic assumptions (R1) and (R2), saying that the $M_j$ form a set of iid random matrices, and that $N_{n+l+1}$ is a random matrix independent of the $M_j$, $j = 1, \ldots, n$. In this section, we review results of [8] on products of the form $M_1\cdots M_n$, and we extend them to products of random matrices of the form (2.25). Our main result here is Theorem 3.3.

3.1. Decomposition of random reduced dynamics operators. Let $P_{1,j}$ denote the spectral projection of $M_j$ for the eigenvalue 1 (cf. Lemma 2.3) and define

$$\psi_j := P_{1,j}^*\psi_S, \qquad P_j := |\psi_S\rangle\langle\psi_j|, \tag{3.1}$$

where $P_{1,j}^*$ is the adjoint operator of $P_{1,j}$. Note that $\langle\psi_j|\psi_S\rangle = 1$ so that $P_j$ is a projection and, moreover, $M_j^*\psi_j = \psi_j$. We introduce the following decomposition:

$$M_j := P_j + Q_j M_j Q_j, \quad\text{with } Q_j = \mathbf{1} - P_j. \tag{3.2}$$

The following are basic properties of products of operators $M_k$.

Proposition 3.1 ([8]). We define $M_{Q_j} := Q_j M_j Q_j$. For any $n$, we have

$$M_1\cdots M_n = |\psi_S\rangle\langle\theta_n| + M_{Q_1}\cdots M_{Q_n}, \tag{3.3}$$

where

$$\theta_n = \psi_n + M_{Q_n}^*\psi_{n-1} + \cdots + M_{Q_n}^*\cdots M_{Q_2}^*\psi_1 \tag{3.4}$$

$$\phantom{\theta_n} = M_n^*\cdots M_2^*\psi_1, \tag{3.5}$$

and where $\langle\psi_S, \theta_n\rangle = 1$. Moreover, there exists $C_0$ such that

1. For any $j \in \mathbb{N}^*$, $\|P_j\| = \|\psi_j\| \le C_0$ and $\|Q_j\| \le 1 + C_0$.
2. $\sup\{\|M_{Q_{j_n}} M_{Q_{j_{n-1}}}\cdots M_{Q_{j_1}}\|,\ n \in \mathbb{N}^*,\ j_k \in \mathbb{N}^*\} \le C_0(1 + C_0)$.
3. For any $n \in \mathbb{N}^*$, $\|\theta_n\| \le C_0^2$.

Typically, for matrices $M_k \in \mathcal{M}_{(E)}$ (recall Definition 1.1), we expect the first part in the decomposition (3.3) to be oscillatory and the second one to be decaying.

3.2. The probabilistic setting. We use the notation introduced at the end of Sect. 1.1. Let us define the shift $T : \Omega_{\rm ext} \to \Omega_{\rm ext}$ by

$$(T\omega)_j = \omega_{j+1}, \qquad \forall\, \omega = (\omega_j)_{j\in\mathbb{N}} \in \Omega_{\rm ext}. \tag{3.6}$$

$T$ is an ergodic transformation of $\Omega_{\rm ext}$. The random reduced dynamics operators are characterized by a measurable map

$$\omega_1 \mapsto M(\omega_1) \in M_d(\mathbb{C}), \tag{3.7}$$

where the target space is that of all $d \times d$ matrices with complex entries, $d$ being the dimension of $\mathcal{H}_S$. With a slight abuse of notation, we write sometimes $M(\omega)$ instead


of $M(\omega_1)$. Hence, for any subset $B \subset M_d(\mathbb{C})$, $p(M(\omega) \in B) = p(M^{-1}(B)) = \int_{M^{-1}(B)} dp(\omega)$, and similarly for other random variables. According to (R1) the product (2.16) is

$$\Psi_n(\omega) := M(\omega_1)M(\omega_2)\cdots M(\omega_n) = M(T^0\omega)M(T^1\omega)\cdots M(T^{n-1}\omega).$$

In the same way as in (3.1), we introduce the random variable $\psi(\omega) \in \mathbb{C}^d$ defined as

$$\psi(\omega) := P_1(\omega)^*\psi_S, \tag{3.8}$$

where $P_1(\omega)$ denotes the spectral projection of $M(\omega)$ for the eigenvalue 1, and where $^*$ stands for the adjoint. We decompose

$$M(\omega) := |\psi_S\rangle\langle\psi(\omega)| + M_Q(\omega) = P(\omega) + M_Q(\omega) \tag{3.9}$$

as in (3.2). Note that $\psi(\omega)$ and $M_Q(\omega)$ define bona fide random variables: $\omega \mapsto P_1(\omega)$ is measurable since $\omega \mapsto M(\omega)$ is [4]. In the next section, we will consider the process (see (3.4), (3.5))

$$\begin{aligned}
\theta_n(\omega) &= M^*(T^{n-1}\omega)M^*(T^{n-2}\omega)\cdots M^*(T\omega)\psi(\omega) \\
&= \sum_{j=1}^{n} M_Q^*(\omega_n)M_Q^*(\omega_{n-1})\cdots M_Q^*(\omega_{j+1})\psi(\omega_j).
\end{aligned} \tag{3.10}$$

Note that $\theta_n$ is a Markov process, since $\theta_{n+1}(\omega) = M^*(\omega_{n+1})\theta_n(\omega)$. Finally, the operators $N_m = N_m(O)$, given in (1.8) and Proposition 2.4, have the form

$$N_m(O) = N(\omega_{m-l}, \ldots, \omega_{m+r}) = N(T^{m-l-1}\omega), \tag{3.11}$$

see also condition (R2).

3.3. Convergence results for random matrix products. We have pointed out after Lemma 2.3 that the spectrum of any RDO lies inside the complex unit disk, and 1 is an eigenvalue (with the deterministic, i.e., $\omega$-independent, eigenvector $\psi_S$). The following result on the product of an iid sequence of RDO's is the main result of [8].

Theorem 3.2 ([8]). Let $M(\omega)$ be a random reduced dynamics operator. Suppose that $p(M(\omega) \in \mathcal{M}_{(E)}) \neq 0$. Then we have $E[M] \in \mathcal{M}_{(E)}$. Moreover, there exist a set $\Omega_1 \subset \Omega^{\mathbb{N}^*}$ with $\mathbb{P}(\Omega_1) = 1$, and constants $C, \alpha > 0$, s.t. for any $\omega \in \Omega_1$ there is an $n_0(\omega)$ so that

$$\|M_Q(\omega_1)\cdots M_Q(\omega_n)\| \le C e^{-\alpha n}, \quad\text{for all } n \ge n_0(\omega), \tag{3.12}$$

and

$$\lim_{\nu\to\infty}\frac{1}{\nu}\sum_{n=1}^{\nu}\theta_n(\omega) = \theta. \tag{3.13}$$

Also, $n_0(\omega)$ is a random variable satisfying $E[e^{\alpha n_0}] < \infty$, and

$$\theta = \big(\mathbf{1} - E[M_Q]^*\big)^{-1}E[\psi] = P^*_{1,E[M]}E[\psi] = P^*_{1,E[M]}\psi_S. \tag{3.14}$$

As a consequence,

$$\lim_{\nu\to\infty}\frac{1}{\nu}\sum_{n=1}^{\nu}M(\omega_1)\cdots M(\omega_n) = |\psi_S\rangle\langle\theta| = P_{1,E[M]}. \tag{3.15}$$
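The decomposition (3.3) and the decay (3.12) can be sanity-checked on a toy family. The sketch below is an illustration only: the matrices `toy_M(p)` (symmetric 2×2 stochastic matrices) are a hypothetical stand-in and not a model from the paper. For them $\psi_S = (1,1)/\sqrt{2}$ is an eigenvector of both $M$ and $M^*$ for the eigenvalue 1, so $P = |\psi_S\rangle\langle\psi_S|$, $M_Q(p) = (1-2p)(\mathbf{1}-P)$, and the ordered product should equal $P + \prod_k (1-2p_k)\,(\mathbf{1}-P)$. In this special case the product itself converges, which is stronger than the Cesàro statement (3.15).

```python
import random

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def toy_M(p):
    """Hypothetical toy RDO: symmetric stochastic matrix with flip probability p."""
    return [[1 - p, p], [p, 1 - p]]

def product(ps):
    """Ordered product M(p_1) M(p_2) ... M(p_n)."""
    out = [[1.0, 0.0], [0.0, 1.0]]
    for p in ps:
        out = matmul(out, toy_M(p))
    return out

def predicted(ps):
    """P + prod(1 - 2 p_k) * (1 - P), where P has all entries equal to 1/2."""
    c = 1.0
    for p in ps:
        c *= 1 - 2 * p
    return [[0.5 + 0.5 * c, 0.5 - 0.5 * c], [0.5 - 0.5 * c, 0.5 + 0.5 * c]]

rng = random.Random(0)                                  # fixed seed, illustrative only
ps = [rng.uniform(0.1, 0.4) for _ in range(60)]         # |1 - 2p| <= 0.8 for every factor
prod_n = product(ps)
pred_n = predicted(ps)
# Distance of the product from the rank-one limit |psi_S><psi_S|
dist_to_P = max(abs(prod_n[i][j] - 0.5) for i in range(2) for j in range(2))
```

Here `dist_to_P` is bounded by $\tfrac12\cdot 0.8^{60}$, the analogue of the bound $Ce^{-\alpha n}$ in (3.12).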


Remark. In the setting of Theorem 3.2, if not only $M(\omega)$, but also $M^*(\omega)$ has a deterministic eigenvector with eigenvalue 1 (denoted $\psi_S^*$ and normalized as $\langle\psi_S^*, \psi_S\rangle = 1$), then $\theta = \psi_S^*$ and one can sharpen (3.15) as follows (see Proposition 3.1 and Eq. (3.12)): There are constants $C, \alpha > 0$, and there is a random variable $n_0(\omega)$ with $E[e^{\alpha n_0}] < \infty$, s.t. for all $\omega \in \Omega_1$ and all $n \ge n_0(\omega)$, we have

$$\big\|M(\omega_1)\cdots M(\omega_n) - |\psi_S\rangle\langle\psi_S^*|\big\| \le C e^{-\alpha n}.$$

While this result allows us to study the large time behaviour of observables of the small system $\mathcal{S}$ (see Sect. 2.3), in order to study the physically relevant instantaneous observables, we need to understand products of the form (2.25). In our probabilistic setting, they read $M(\omega_1)\cdots M(\omega_n)N(T^n\omega)$.

Theorem 3.3. Let $M(\omega)$ be a random reduced dynamics operator and let $N(\omega)$ be a random matrix, uniformly bounded in $\omega$. Suppose that $p(M(\omega) \in \mathcal{M}_{(E)}) \neq 0$. Then there exists a set $\Omega_2 \subset \Omega^{\mathbb{N}^*}$ s.t. $\mathbb{P}(\Omega_2) = 1$ and s.t. for any $\omega \in \Omega_2$,

$$\lim_{\nu\to\infty}\frac{1}{\nu}\sum_{n=1}^{\nu}M(\omega_1)\cdots M(\omega_n)N(T^n\omega) = |\psi_S\rangle\langle\theta|\,E[N]. \tag{3.16}$$

Remark. In our dynamical process, $N(\omega)$ depends only on finitely many variables $\omega_{m-l}, \ldots, \omega_{m+r}$, see (3.11), so measurability and boundedness of the random matrix $N$ are easily established in concrete applications.

Proof of Theorem 3.3. Using the decomposition (3.3) together with (3.12), it suffices to show that

$$\lim_{\nu\to\infty}\frac{1}{\nu}\sum_{n=1}^{\nu}N^*(T^n\omega)\theta_n(\omega) = E[N]^*\theta. \tag{3.17}$$

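We follow the strategy of [8] used to prove (3.13) of the present paper. From (3.4) we get

$$\begin{aligned}
\sum_{n=1}^{\nu}N^*(T^n\omega)\theta_n(\omega)
&= \sum_{n=1}^{\nu}\sum_{j=0}^{n-1}N^*(T^n\omega)M_Q^*(T^{n-1}\omega)\cdots M_Q^*(T^{j+1}\omega)\psi(T^j\omega) \\
&= \sum_{k=1}^{\nu}\sum_{j=0}^{\nu-k}N^*(T^{k+j}\omega)M_Q^*(T^{k+j-1}\omega)\cdots M_Q^*(T^{j+1}\omega)\psi(T^j\omega).
\end{aligned} \tag{3.18}$$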
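The second equality in (3.18) is a re-indexing of the double sum by $k = n - j$. A minimal Python sketch of that bookkeeping follows; the summand `sample_term` is an arbitrary integer-valued placeholder chosen for this illustration (it stands in for the operator-valued term and has no meaning in the paper).

```python
def lhs(f, nu):
    """Sum over n = 1..nu and j = 0..n-1 of f(n, j), as in the first line of (3.18)."""
    return sum(f(n, j) for n in range(1, nu + 1) for j in range(n))

def rhs(f, nu):
    """Same sum re-indexed by k = n - j: k = 1..nu and j = 0..nu-k."""
    return sum(f(k + j, j) for k in range(1, nu + 1) for j in range(nu - k + 1))

def sample_term(n, j):
    """Arbitrary integer test summand (placeholder for the operator-valued term)."""
    return (n * n + 3 * j) * (n - j)

# Both orderings enumerate the same set of pairs (n, j) with 0 <= j < n <= nu.
checks = [lhs(sample_term, nu) == rhs(sample_term, nu) for nu in range(1, 12)]
```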

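Let us introduce the random vectors

$$\theta^{(k)}(\omega) = N^*(T^k\omega)M_Q^*(T^{k-1}\omega)M_Q^*(T^{k-2}\omega)\cdots M_Q^*(T\omega)\psi(T^0\omega), \tag{3.19}$$

so that, by (3.18),

$$\begin{aligned}
\frac{1}{\nu}\sum_{n=1}^{\nu}N^*(T^n\omega)\theta_n(\omega)
&= \frac{1}{\nu}\sum_{k=1}^{\nu}\sum_{j=0}^{\nu-k}\theta^{(k)}(T^j\omega) \\
&= \sum_{k=1}^{\infty}\chi_{\{k\le\nu\}}\frac{1}{\nu}\sum_{j=0}^{\nu-k}\theta^{(k)}(T^j\omega)
=: \sum_{k=1}^{\infty}g(k,\nu,\omega).
\end{aligned} \tag{3.20}$$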


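For each fixed $k$, by ergodicity, there exists a set $\Omega^{(k)} \subset \Omega^{\mathbb{N}^*}$ of probability one, such that, for all $\omega \in \Omega^{(k)}$, the following limit exists:

$$\begin{aligned}
\lim_{\nu\to\infty}g(k,\nu,\omega)
&= \lim_{\nu\to\infty}\frac{\nu-k+1}{\nu}\,\frac{1}{\nu-k+1}\sum_{j=0}^{\nu-k}\theta^{(k)}(T^j\omega) \\
&= \lim_{J\to\infty}\frac{1}{J+1}\sum_{j=0}^{J}\theta^{(k)}(T^j\omega) = E[\theta^{(k)}].
\end{aligned}$$

Therefore, on the set $\Omega_\infty := \cap_{k\in\mathbb{N}}\Omega^{(k)}$ of probability one, for any $k \in \mathbb{N}$, we have by independence of the $M(\omega_j)$, $1 \le j \le k$, and of $N^*(T^k\omega)$,

$$\lim_{\nu\to\infty}g(k,\nu,\omega) = E[\theta^{(k)}] = E[N^*]\,E[M_Q^*]^{k-1}E[\psi]. \tag{3.21}$$

It follows from Proposition 3.1, Theorem 3.2 and the boundedness of $N(\omega)$ that for $\omega \in \Omega_2 = \Omega_1 \cap \Omega_\infty$, we have $\|\theta^{(k)}(T^j\omega)\| \le C e^{\alpha n_0(T^j\omega)}e^{-\alpha(k-1)}$. Therefore, for all $\nu$ large enough, and for all $1 \le k \le \nu$,

$$\|g(k,\nu,\omega)\| \le C\,\frac{1}{\nu}\sum_{j=0}^{\nu-k}e^{\alpha n_0(T^j\omega)}e^{-\alpha(k-1)} \le 2C\,E[e^{\alpha n_0}]\,e^{-\alpha(k-1)}, \tag{3.22}$$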

where we have used ergodicity in the last estimate. Of course, the same upper bound (3.22) holds for $k > \nu$, since then $g(k,\nu,\omega) = 0$. The r.h.s. of (3.22) is summable w.r.t. $k \in \mathbb{N}$, so we can use the Lebesgue Dominated Convergence Theorem in (3.20) to conclude that, almost surely on $\Omega_2$,

$$\lim_{\nu\to\infty}\frac{1}{\nu}\sum_{n=1}^{\nu}N^*(T^n\omega)\theta_n(\omega) = E[N^*]\sum_{k=0}^{\infty}E[M_Q^*]^k E[\psi].$$

Relation (3.17), and thus the proof of the theorem, now follow from (3.14).

4. Proof of Theorem 1.3

Let $\phi$ be a normalized vector in $\mathcal{H}$. Fix $\epsilon > 0$ and $\omega \in \Omega_{\rm ext}$. There exists a $B' = B'(\epsilon, \omega) \in \mathfrak{M}'$ of the form (2.8) (with $N$ depending on $\epsilon, \omega$), s.t.

$$\|\phi - B'\psi_0\| < \epsilon. \tag{4.1}$$

Here, both $\phi$ and $\psi_0$ may depend on $\omega$. It follows that

$$\big|\langle\phi, \alpha_\omega^m(O)\phi\rangle - \langle B'\psi_0, \alpha_\omega^m(O)B'\psi_0\rangle\big| < 2\epsilon\|O\|. \tag{4.2}$$

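Using Proposition 2.2 and Theorem 3.3, and that $B'$ commutes with $\alpha_\omega^m(O)$, we arrive at the relations

$$\begin{aligned}
\lim_{\mu\to\infty}\frac{1}{\mu}\sum_{m=1}^{\mu}\langle\psi_0, (B')^*B'\alpha_\omega^m(O)\psi_0\rangle
&= \lim_{\mu\to\infty}\frac{1}{\mu}\sum_{m=N+1}^{\mu}\langle\psi_0, (B')^*B'\alpha_\omega^m(O)\psi_0\rangle \\
&= \lim_{\mu\to\infty}\frac{1}{\mu}\sum_{m=N+1}^{\mu}\big\langle\psi_0, (B')^*B'(\widetilde{U}_N^+)^* e^{i\tau(\omega_1)K(\omega_1)}\cdots e^{i\tau(\omega_N)K(\omega_N)}P \\
&\qquad\times M(\omega_{N+1})\cdots M(\omega_{m-l-1})N(\omega_{m-l}, \omega_{m-l+1}, \ldots, \omega_{m+r})\psi_0\big\rangle \\
&= \langle\psi_0, (B')^*B'\psi_0\rangle\,\langle\theta, E[N(O)]\psi_S\rangle,
\end{aligned} \tag{4.3}$$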


for all $\omega$ in a set $\Omega_2$ of measure one. It follows from (4.1) that $(1-\epsilon)^2 < \langle\psi_0, (B')^*B'\psi_0\rangle = \|B'\psi_0\|^2 < (1+\epsilon)^2$. Since $\epsilon$ is arbitrary, using the latter bound in (4.3) and taking into account (4.2), we conclude that (1.11) holds for any vector initial state $\varrho(\cdot) = \langle\phi, \cdot\,\phi\rangle$. Finally, the argument leading to (2.6) shows that (1.11) holds for all normal initial states. The proof of Theorem 1.3 is complete.

5. Proof of Theorem 1.4

An easy application of Theorem 1.3 shows that for any normal initial state $\varrho$,

$$\lim_{m\to\infty}\frac{\varrho(\Delta E(m,\omega))}{m} = \varrho_+(j_+), \quad\text{a.s.}, \tag{5.1}$$

where

$$j_+ = E\big[PVP - Pe^{i\tau L}Ve^{-i\tau L}P\big] = E\big[P(V - \alpha^\tau(V))P\big]. \tag{5.2}$$

The energy grows linearly in time almost surely, at the rate $dE_+$.$^5$ In order to show the expression for $dE_+$ given in Theorem 1.4, it suffices to prove that $\varrho_+[E(P(L_S - \alpha^\tau(L_S))P)] = 0$. Let $\varrho$ be a normal state. Although $L_S \notin \mathfrak{M}$, still $\alpha^k(L_S) - \alpha^{k-1}(L_S)$ is an instantaneous observable belonging to $\mathfrak{M}$. This follows from $e^{i\tau_k L_k}L_S e^{-i\tau_k L_k} - L_S \in \mathfrak{M}_S \otimes \mathfrak{M}_{\mathcal{E}_k}$, which in turn is proven by noting that

$$e^{i\tau_k L_k}L_S e^{-i\tau_k L_k} - L_S = \int_0^{\tau_k}e^{itL_k}[iL_k, L_S]e^{-itL_k}\,dt = \int_0^{\tau_k}e^{itL_k}[iV_k, L_S]e^{-itL_k}\,dt,$$

where $[iV_k, L_S] = -\frac{d}{dt}e^{itL_S}V_k e^{-itL_S}\big|_{t=0} \in \mathfrak{M}_S \otimes \mathfrak{M}_{\mathcal{E}_k}$. As a consequence, $\alpha^k(L_S) - \alpha^{k-1}(L_S) \in \mathfrak{M}$, and we can apply Theorem 1.3 to obtain

$$\lim_{m\to\infty}\frac{1}{m}\sum_{k=1}^{m}\varrho\big(\alpha^k(L_S) - \alpha^{k+1}(L_S)\big) = \varrho_+\big(E[P(L_S - \alpha^\tau(L_S))P]\big) \quad\text{a.s.}$$

On the other hand, the sum telescopes: $\frac{1}{m}\sum_{k=1}^{m}\varrho\big(\alpha^k(L_S) - \alpha^{k+1}(L_S)\big) = \frac{1}{m}\varrho\big(\alpha^1(L_S) - \alpha^{m+1}(L_S)\big)$, which tends to zero as $m \to \infty$. This proves the formula for $dE_+$ given in Theorem 1.4. Next we show the expression for $dS_+$ in Theorem 1.4. The following result is deterministic; we consider $\omega$ fixed and do not display it.

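Proposition 5.1. Let $\varrho$ be a normal state on $\mathfrak{M}$. Then for any $m \ge 1$, we have

$$\begin{aligned}
\Delta S(m) &:= \mathrm{Ent}(\varrho\circ\alpha^m|\varrho_0) - \mathrm{Ent}(\varrho|\varrho_0) \\
&= \varrho\Big(\sum_{k=1}^{m}\beta_{\mathcal{E}_k}\big[j(k) + \alpha^{k-1}(L_S + V_k) - \alpha^k(L_S + V_{k+1})\big] + \beta_S\big(\alpha^m(L_S) - L_S\big)\Big),
\end{aligned}$$

where the energy jump $j(k)$ has been defined in (1.12).

$^5$ The definition of $dE_+$ differs from the one of [7] by a factor $\frac{1}{\tau}$: here $dE_+$ represents the asymptotic average energy production per interaction and not per unit of time. One could also study the average energy production per unit of time. It is easy to see that

$$\lim_{m\to\infty}\frac{\varrho(\Delta E(m,\omega))}{\tau(\omega_1) + \cdots + \tau(\omega_m)} = \frac{dE_+}{E[\tau]}, \quad\text{a.s.}$$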


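Proof. The proof is similar to that of Proposition 2.6 in [7]. Using the entropy production formula [10], we have

$$\Delta S(m) = \varrho\Big(\alpha^m\Big(\beta_S L_S + \sum_k\beta_{\mathcal{E}_k}L_{\mathcal{E}_k}\Big) - \sum_k\beta_{\mathcal{E}_k}L_{\mathcal{E}_k} - \beta_S L_S\Big). \tag{5.3}$$

Clearly, the sums in the argument of $\varrho$ in the right-hand side only extend from $k = 1$ to $k = m$. We examine the difference of the two terms with index $k$,

$$\begin{aligned}
\alpha^m(\beta_{\mathcal{E}_k}L_{\mathcal{E}_k}) - \beta_{\mathcal{E}_k}L_{\mathcal{E}_k}
&= \alpha^k(\beta_{\mathcal{E}_k}L_{\mathcal{E}_k}) - \beta_{\mathcal{E}_k}L_{\mathcal{E}_k} \\
&= \beta_{\mathcal{E}_k}\alpha^k(L_k) - \beta_{\mathcal{E}_k}L_{\mathcal{E}_k} - \beta_{\mathcal{E}_k}\alpha^k(L_S + V_k) \\
&= \beta_{\mathcal{E}_k}\alpha^{k-1}(L_k) - \beta_{\mathcal{E}_k}L_{\mathcal{E}_k} + \beta_{\mathcal{E}_k}j(k) - \beta_{\mathcal{E}_k}\alpha^k(L_S + V_{k+1}) \\
&= \beta_{\mathcal{E}_k}\alpha^{k-1}(L_S + V_k) + \beta_{\mathcal{E}_k}j(k) - \beta_{\mathcal{E}_k}\alpha^k(L_S + V_{k+1}),
\end{aligned}$$

where we use $\alpha^m(L_{\mathcal{E}_k}) = \alpha^k(L_{\mathcal{E}_k})$ in the first step, $\alpha^k(L_k) = \alpha^{k-1}(L_k)$ and (1.12) in the third step, and in the last one $\alpha^{k-1}(L_{\mathcal{E}_k}) = L_{\mathcal{E}_k}$.

By Proposition 5.1, we have for all $m \ge 1$, $\omega \in \Omega_{\rm ext}$,

$$\begin{aligned}
\frac{1}{m}\Delta S(m,\omega) &= \frac{1}{m}\sum_{k=1}^{m}\beta_{\mathcal{E}_k}\varrho\big(j(k)\big) + \frac{1}{m}\sum_{k=1}^{m}\beta_{\mathcal{E}_k}\varrho\big(\alpha^{k-1}(V_k) - \alpha^k(V_{k+1})\big) \\
&\quad + \frac{1}{m}\sum_{k=1}^{m}\beta_{\mathcal{E}_k}\varrho\big(\alpha^{k-1}(L_S) - \alpha^k(L_S)\big) + \frac{1}{m}\varrho\big(\beta_S(\alpha^m(L_S) - L_S)\big).
\end{aligned}$$

Using Theorem 1.3 we see that with probability one (and where $M$ denotes the reduced dynamics operator)

$$\begin{aligned}
\lim_{m\to\infty}\frac{\Delta S(m,\omega)}{m} &= \varrho_+\big(E[\beta_{\mathcal{E}}M]\,E[PVP] - E[\beta_{\mathcal{E}}P\alpha^\tau(V)P]\big) + \varrho_+\big(E[\beta_{\mathcal{E}}PVP] - E[\beta_{\mathcal{E}}M]\,E[PVP]\big) \\
&\quad + \varrho_+\big(E[\beta_{\mathcal{E}}P(L_S - \alpha^\tau(L_S))P]\big) \\
&= \varrho_+\big(E\big[\beta_{\mathcal{E}}P((L_S + V) - \alpha^\tau(L_S + V))P\big]\big).
\end{aligned}$$

This completes the proof of Theorem 1.4.

6. Spin-spin Models and Proof of Theorem 1.5

In this section, we consider both $\mathcal{S}$ and $\mathcal{E}$ to be two-level systems, with interaction given by (1.13). This is a particular case of the third example in [7]. The main results of this section have been announced in [8]. The observable algebra for $\mathcal{S}$ and for $\mathcal{E}$ is $\mathfrak{A}_S = \mathfrak{A}_{\mathcal{E}} = M_2(\mathbb{C})$. Let $E_S, E_{\mathcal{E}} > 0$ be the "excited" energy level of $\mathcal{S}$ and of $\mathcal{E}$, respectively. Accordingly, the Hamiltonians are given by

$$h_S = \begin{pmatrix} 0 & 0 \\ 0 & E_S \end{pmatrix} \quad\text{and}\quad h_{\mathcal{E}} = \begin{pmatrix} 0 & 0 \\ 0 & E_{\mathcal{E}} \end{pmatrix}.$$

The dynamics are given by $\alpha_S^t(A) = e^{ith_S}Ae^{-ith_S}$ and $\alpha_{\mathcal{E}}^t(A) = e^{ith_{\mathcal{E}}}Ae^{-ith_{\mathcal{E}}}$. We choose (for computational convenience) the reference state of $\mathcal{E}$ to be the Gibbs state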


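at inverse temperature $\beta$, see (1.14), and we choose the reference state for $\mathcal{S}$ to be the tracial state, $\varrho_{0,S}(A) = \frac{1}{2}\mathrm{Tr}(A)$. The interaction operator is defined by $\lambda v$, where $\lambda$ is a coupling constant, and $v$ is given in (1.13). The creation and annihilation operators are represented by the matrices

$$a_\# = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad a_\#^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$

The Heisenberg dynamics of $\mathcal{S}$ coupled to one element $\mathcal{E}$ is given by the $*$-automorphism group $t \mapsto e^{ith_\lambda}Ae^{-ith_\lambda}$, $A \in \mathfrak{A}_S \otimes \mathfrak{A}_{\mathcal{E}}$, $h_\lambda = h_S + h_{\mathcal{E}} + \lambda v$. To find a Hilbert space description of the system, one performs the GNS construction of $(\mathfrak{A}_S, \varrho_{0,S})$ and $(\mathfrak{A}_{\mathcal{E}}, \varrho_{\beta,\mathcal{E}})$, see e.g. [5,7]. In this representation, the Hilbert spaces are given by $\mathcal{H}_S = \mathcal{H}_{\mathcal{E}} = \mathbb{C}^2 \otimes \mathbb{C}^2$, the von Neumann algebras by $\mathfrak{M}_S = \mathfrak{M}_{\mathcal{E}} = M_2(\mathbb{C}) \otimes \mathbf{1}_{\mathbb{C}^2} \subset \mathcal{B}(\mathbb{C}^2 \otimes \mathbb{C}^2)$, and the vectors representing $\varrho_{0,S}$ and $\varrho_{\beta,\mathcal{E}}$ are

$$\psi_S = \frac{1}{\sqrt{2}}\big(|0\rangle\otimes|0\rangle + |1\rangle\otimes|1\rangle\big) \quad\text{and}\quad \psi_{\mathcal{E}} = \frac{1}{\sqrt{\mathrm{Tr}\,e^{-\beta h_{\mathcal{E}}}}}\big(|0\rangle\otimes|0\rangle + e^{-\beta E_{\mathcal{E}}/2}|1\rangle\otimes|1\rangle\big),$$

respectively, i.e., we have $\varrho_{\beta_\#,\#}(A) = \langle\psi_\#, (A \otimes \mathbf{1})\psi_\#\rangle$, $\# = S, \mathcal{E}$, $\beta_{\mathcal{E}} = \beta$, $\beta_S = 0$, and where $|0\rangle$ (resp. $|1\rangle$) denote the ground (resp. excited) state of $h_S$ and $h_{\mathcal{E}}$. Finally, the Liouvillean $L$ is given by

$$\begin{aligned}
L &= (h_S \otimes \mathbf{1}_{\mathbb{C}^2} - \mathbf{1}_{\mathbb{C}^2} \otimes h_S) \otimes (\mathbf{1}_{\mathbb{C}^2} \otimes \mathbf{1}_{\mathbb{C}^2}) + (\mathbf{1}_{\mathbb{C}^2} \otimes \mathbf{1}_{\mathbb{C}^2}) \otimes (h_{\mathcal{E}} \otimes \mathbf{1}_{\mathbb{C}^2} - \mathbf{1}_{\mathbb{C}^2} \otimes h_{\mathcal{E}}) \\
&\quad + \lambda(a_S \otimes \mathbf{1}_{\mathbb{C}^2}) \otimes (a_{\mathcal{E}}^* \otimes \mathbf{1}_{\mathbb{C}^2}) + \lambda(a_S^* \otimes \mathbf{1}_{\mathbb{C}^2}) \otimes (a_{\mathcal{E}} \otimes \mathbf{1}_{\mathbb{C}^2}).
\end{aligned}$$

6.1. Spectral analysis of the reduced dynamics operator M. The RDO $M$ is defined by (2.14). However, in this example, where the Hamiltonian $h_\lambda$ is explicitly diagonalizable, we shall use another expression for it, which may look less simple but has the advantage that it only makes use of the self-adjoint Hamiltonian. Since $\psi_S$ is cyclic for $\mathfrak{M}_S$ and $\mathcal{H}_S$ has finite dimension, for every $\phi \in \mathcal{H}_S$ there is a unique $A_S = A \otimes \mathbf{1}_{\mathbb{C}^2} \in \mathfrak{M}_S$ such that $\phi = A_S\psi_S$. It is then easy to see that

$$M(A \otimes \mathbf{1}_{\mathbb{C}^2})\psi_S = (\mathcal{M}(A) \otimes \mathbf{1}_{\mathbb{C}^2})\psi_S, \tag{6.1}$$

where the map $\mathcal{M}$ acts on $\mathfrak{A}_S$ and is defined as

$$\mathcal{M}(A) := \mathrm{Tr}_{\mathcal{E}}\big(e^{i\tau h_\lambda}\,A \otimes \mathbf{1}\,e^{-i\tau h_\lambda}\big), \tag{6.2}$$

where $\mathrm{Tr}_{\mathcal{E}}(A_S \otimes A_{\mathcal{E}}) := \varrho_{\beta,\mathcal{E}}(A_{\mathcal{E}})A_S$ denotes the partial trace over $\mathcal{E}$. Similarly, if $\mathcal{M}_*$ denotes the map dual to $\mathcal{M}$, i.e. $\mathrm{Tr}(\rho\,\mathcal{M}(A)) = \mathrm{Tr}(\mathcal{M}_*(\rho)A)$ for all $\rho, A \in M_2(\mathbb{C})$, then we have, for any density matrix $\rho$,

$$\big((\mathcal{M}_*(\rho))^* \otimes \mathbf{1}\big)\psi_S = M^*(\rho^* \otimes \mathbf{1})\psi_S. \tag{6.3}$$

In particular, the spectrum of the map $\mathcal{M}_*$ is in one-to-one correspondence with the spectrum of the operator $M^*$ (via complex conjugation), and if $\rho$ is an eigenvector of $\mathcal{M}_*$ for the eigenvalue 1 (which we know to exist), then the "corresponding eigenvector" of $M^*$ is $\psi_S^* = (\rho^* \otimes \mathbf{1})\psi_S$. A simple computation shows that the four eigenvalues of $h_\lambda$ are $E_0^+ = 0$, $E_0^- = E_S + E_{\mathcal{E}}$ and

$$E_1^\pm = \frac{1}{2}(E_S + E_{\mathcal{E}}) \pm \frac{1}{2}\sqrt{(E_S - E_{\mathcal{E}})^2 + 4\lambda^2}. \tag{6.4}$$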


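The corresponding normalized eigenvectors are given by $\psi_0^+ = |0\rangle\otimes|0\rangle$, $\psi_0^- = |1\rangle\otimes|1\rangle$, and $\psi_1^\pm = a_1^\pm|1\rangle\otimes|0\rangle + b_1^\pm|0\rangle\otimes|1\rangle$, respectively, where

$$a_1^\pm = -\frac{\lambda}{\sqrt{\lambda^2 + (E_S - E_1^\pm)^2}}, \qquad b_1^\pm = \frac{E_S - E_1^\pm}{\sqrt{\lambda^2 + (E_S - E_1^\pm)^2}}. \tag{6.5}$$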
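Formulas (6.4) and (6.5) can be checked numerically on the two-dimensional block of $h_\lambda$ spanned by $|1\rangle\otimes|0\rangle$ and $|0\rangle\otimes|1\rangle$. The following Python sketch assumes that, in this ordered basis, the block is the real symmetric matrix $\begin{pmatrix} E_S & \lambda \\ \lambda & E_{\mathcal{E}} \end{pmatrix}$ (this reading of the coupling $v$ from (1.13) is an assumption of the illustration), and the parameter values are arbitrary.

```python
import math

def block_spectrum(E_S, E_E, lam):
    """Eigenvalues (E_1^-, E_1^+) of [[E_S, lam], [lam, E_E]] as given in (6.4)."""
    mean = 0.5 * (E_S + E_E)
    half_gap = 0.5 * math.sqrt((E_S - E_E) ** 2 + 4 * lam ** 2)
    return mean - half_gap, mean + half_gap

def eigvec(E_S, lam, E):
    """Coefficients (a, b) from (6.5) for the eigenvalue E (requires lam != 0)."""
    norm = math.sqrt(lam ** 2 + (E_S - E) ** 2)
    return -lam / norm, (E_S - E) / norm

def residual(E_S, E_E, lam, E):
    """Max-norm of (H - E) v for H = [[E_S, lam], [lam, E_E]] and v = (a, b)."""
    a, b = eigvec(E_S, lam, E)
    return max(abs(E_S * a + lam * b - E * a), abs(lam * a + E_E * b - E * b))

E_S, E_E, lam = 1.3, 0.7, 0.25          # illustrative values, not taken from the paper
E_minus, E_plus = block_spectrum(E_S, E_E, lam)
residuals = [residual(E_S, E_E, lam, E) for E in (E_minus, E_plus)]
norms = [sum(c * c for c in eigvec(E_S, lam, E)) for E in (E_minus, E_plus)]
```

Both residuals vanish up to rounding and both vectors have unit norm, as (6.5) requires.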

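We finally denote $a_0^+ = b_0^- = 1$ and $a_0^- = b_0^+ = 0$. Inserting the spectral decomposition of $h_\lambda$ into (6.2) gives the following result.

Lemma 6.1. For any $A \in \mathfrak{A}$,

$$\begin{aligned}
\mathcal{M}(A) = Z_{\beta,\mathcal{E}}^{-1}\sum_{n,\sigma,n',\sigma'} & e^{i\tau(E_n^\sigma - E_{n'}^{\sigma'})}\Big(\bar{a}_n^\sigma a_{n'}^{\sigma'}\langle n|An'\rangle + \bar{b}_n^\sigma b_{n'}^{\sigma'}\langle 1-n|A(1-n')\rangle\Big) \\
&\times\Big(a_n^\sigma\bar{a}_{n'}^{\sigma'}|n\rangle\langle n'| + e^{-\beta E_{\mathcal{E}}}b_n^\sigma\bar{b}_{n'}^{\sigma'}|1-n\rangle\langle 1-n'|\Big),
\end{aligned} \tag{6.6}$$

where $n, n' \in \{0, 1\}$ and $\sigma, \sigma' \in \{-, +\}$ and $Z_{\beta,\mathcal{E}} = \mathrm{Tr}(e^{-\beta h_{\mathcal{E}}})$. Similarly, for any density matrix $\rho$,

$$\begin{aligned}
\mathcal{M}_*(\rho) = Z_{\beta,\mathcal{E}}^{-1}\sum_{n,\sigma,n',\sigma'} & e^{i\tau(E_n^\sigma - E_{n'}^{\sigma'})}\Big(a_n^\sigma\bar{a}_{n'}^{\sigma'}\langle n'|\rho n\rangle + e^{-\beta E_{\mathcal{E}}}b_n^\sigma\bar{b}_{n'}^{\sigma'}\langle 1-n'|\rho(1-n)\rangle\Big) \\
&\times\Big(\bar{a}_n^\sigma a_{n'}^{\sigma'}|n'\rangle\langle n| + \bar{b}_n^\sigma b_{n'}^{\sigma'}|1-n'\rangle\langle 1-n|\Big).
\end{aligned} \tag{6.7}$$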

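The above lemma allows us to make a complete spectral analysis of $\mathcal{M}$.

Proposition 6.2. 1. The eigenvalues of $\mathcal{M}$ are $1, e_0, e_-, e_+$, where $e_0$ is given in (1.16),

$$e_- = \frac{\Big(E_S - E_{\mathcal{E}} - \sqrt{(E_S - E_{\mathcal{E}})^2 + 4\lambda^2}\Big)^2 + 4\lambda^2 e^{i\tau\sqrt{(E_S - E_{\mathcal{E}})^2 + 4\lambda^2}}}{\Big(E_S - E_{\mathcal{E}} - \sqrt{(E_S - E_{\mathcal{E}})^2 + 4\lambda^2}\Big)^2 + 4\lambda^2}\times e^{i\tau\big(E_S + E_{\mathcal{E}} - \sqrt{(E_S - E_{\mathcal{E}})^2 + 4\lambda^2}\big)}, \qquad e_+ = \bar{e}_-.$$

Moreover, the eigenstates of $M^*$ for the eigenvalues $1, e_0, e_-, e_+$ are respectively $\psi_S^* = (e^{-\beta' h_S} \otimes \mathbf{1}_{\mathbb{C}^2})\psi_S$, where $\beta' := \beta E_{\mathcal{E}}/E_S$, $\phi_0 = |0\rangle\otimes|0\rangle - |1\rangle\otimes|1\rangle$, $\phi_- = |0\rangle\otimes|1\rangle$ and $\phi_+ = |1\rangle\otimes|0\rangle$.

2. The functions $|e_0(\tau)|$, $|e_+(\tau)|$ and $|e_-(\tau)|$ are continuous and periodic of period $T := \dfrac{2\pi}{\sqrt{(E_S - E_{\mathcal{E}})^2 + 4\lambda^2}}$. Moreover, they have modulus strictly less than 1 if and only if $\tau \notin T\mathbb{N}$.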
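Point 2 can be illustrated for $e_-$ directly from the formula in Point 1. The Python sketch below evaluates only the modulus $|e_-(\tau)|$, so the unimodular exponential prefactor is omitted; the parameter values are arbitrary choices for this illustration.

```python
import cmath
import math

def gap(E_S, E_E, lam):
    """sqrt((E_S - E_E)^2 + 4 lam^2), the frequency that sets the period T."""
    return math.sqrt((E_S - E_E) ** 2 + 4 * lam ** 2)

def modulus_e_minus(tau, E_S, E_E, lam):
    """|e_-(tau)| from Proposition 6.2; the overall phase factor has modulus one."""
    D = gap(E_S, E_E, lam)
    x = (E_S - E_E - D) ** 2
    return abs(x + 4 * lam ** 2 * cmath.exp(1j * tau * D)) / (x + 4 * lam ** 2)

E_S, E_E, lam = 1.3, 0.7, 0.25            # illustrative values, not from the paper
T = 2 * math.pi / gap(E_S, E_E, lam)

# At multiples of T the modulus equals one; elsewhere it is strictly smaller.
at_period = [modulus_e_minus(k * T, E_S, E_E, lam) for k in (1, 2, 3)]
off_period = [modulus_e_minus(s * T, E_S, E_E, lam) for s in (0.3, 0.5, 1.7)]
```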

Remark 6.3. Since $e_0$ is positive, Point 2 proves that 1 is a non-degenerate eigenvalue for $\mathcal{M}_*$ if and only if $\tau \notin T\mathbb{N}$, i.e. for all but a discrete set of interaction times. This condition agrees with the corresponding assumption of [7] in the perturbative regime.

Proof. Point 2 follows from Point 1. Point 1 is proven by direct computation using (6.4), (6.5) and (6.7).


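6.2. Proof of Theorem 1.5. Point 2 of Proposition 6.2 shows that $M \in \mathcal{M}_{(E)}$ if and only if $\tau \notin T\mathbb{N}$. Hence, for this spin-spin model, Theorem 1.3 applies if and only if $p(\tau \notin T\mathbb{N}) \neq 0$, which is precisely the assumption we have in each of the three situations of Theorem 1.5. It remains to compute the asymptotic state $\varrho_+$ in each of these three situations. Using the complete spectral decomposition of $M^*(\omega)$ (see Proposition 6.2), we compute explicitly its expectation $E[M^*]$ and then the spectral projection $P_{1,E[M^*]}$. After computation, we get:

1. Random interaction time: $P_{1,E[M^*]} = \Big|\psi_S\Big\rangle\Big\langle\Big(\dfrac{2e^{-\beta' h_S}}{\mathrm{Tr}(e^{-\beta' h_S})} \otimes \mathbf{1}_{\mathbb{C}^2}\Big)\psi_S\Big|$,

2. Random excitation energy of $\mathcal{E}$: $P_{1,E[M^*]} = |\psi_S\rangle\langle(\rho_E \otimes \mathbf{1}_{\mathbb{C}^2})\psi_S|$, where

$$\begin{aligned}
\rho_E &= \Big(1 - (1 - E(e_0))^{-1}E\big((1 - e_0)(1 - 2Z_{\beta',S}^{-1})\big)\Big)|0\rangle\langle 0| \\
&\quad + \Big(1 + (1 - E(e_0))^{-1}E\big((1 - e_0)(1 - 2Z_{\beta',S}^{-1})\big)\Big)|1\rangle\langle 1|,
\end{aligned}$$

3. Random temperature of $\mathcal{E}$: $P_{1,E[M^*]} = \Big|\psi_S\Big\rangle\Big\langle\Big(\dfrac{2e^{-\tilde\beta h_S}}{\mathrm{Tr}(e^{-\tilde\beta h_S})} \otimes \mathbf{1}_{\mathbb{C}^2}\Big)\psi_S\Big|$, where $\tilde\beta = -E_S^{-1}\log\big(E[Z_{\beta'(\omega),S}^{-1}]^{-1} - 1\big)$.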
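The effective inverse temperature $\tilde\beta$ of item 3 can be computed as follows. This Python sketch assumes $Z_{\beta',S} = \mathrm{Tr}(e^{-\beta' h_S}) = 1 + e^{-\beta' E_S}$, by analogy with the definition of $Z_{\beta,\mathcal{E}}$ in Lemma 6.1 (the symbol is not defined in this section), and models the random variable $\beta'(\omega)$ by a hypothetical finite list of values with probabilities.

```python
import math

def inv_partition(beta_prime, E_S):
    """1 / Z_{beta',S}, assuming Z_{beta',S} = 1 + exp(-beta' * E_S)."""
    return 1.0 / (1.0 + math.exp(-beta_prime * E_S))

def effective_beta(beta_primes, weights, E_S):
    """beta_tilde = -(1/E_S) * log(1 / E[1/Z] - 1) for a discrete distribution."""
    mean_inv_Z = sum(w * inv_partition(b, E_S) for b, w in zip(beta_primes, weights))
    return -math.log(1.0 / mean_inv_Z - 1.0) / E_S

E_S = 1.0                                    # illustrative value
beta_primes, weights = [0.5, 2.0], [0.5, 0.5]  # hypothetical two-point distribution
beta_tilde = effective_beta(beta_primes, weights, E_S)
mean_inv_Z = sum(w * inv_partition(b, E_S) for b, w in zip(beta_primes, weights))
```

By construction, the Gibbs state at $\tilde\beta$ has inverse partition function equal to the average $E[Z_{\beta'(\omega),S}^{-1}]$, and for a deterministic $\beta'$ the formula returns $\tilde\beta = \beta'$.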

Combining these formulas with (1.10) gives the various expressions for the asymptotic state $\varrho_+$. Finally, when the interaction time $\tau$ is random, the map $M^*(\omega)$ has a deterministic eigenvector for the eigenvalue 1. This allows for the stronger convergence result mentioned in Remark 1 after Theorem 1.5 (see also the remark after Theorem 3.2).

7. Spin-Fermion Models and Proof of Theorem 1.6

We combine our convergence results with a rigorous perturbation theory in the coupling strength between $\mathcal{S}$ and $\mathcal{C}$. We take $\mathcal{S}$ to be a 2-level atom and the $\mathcal{E}$ are large quantum systems, each one modeled by an infinitely extended gas of free thermal fermions. The random parameters are the temperature of the system $\mathcal{E}_k$, $T_k = \beta_k^{-1}$, as well as the interaction time $\tau_k$. The state space and the reference vector of $\mathcal{S}$ are

$$\mathcal{H}_S = \mathbb{C}^2 \otimes \mathbb{C}^2, \qquad \psi_S = \frac{1}{\sqrt{2}}\big(|0\rangle\otimes|0\rangle + |1\rangle\otimes|1\rangle\big), \tag{7.1}$$

where $\{|0\rangle = [1, 0]^T, |1\rangle = [0, 1]^T\}$ is the canonical basis of $\mathbb{C}^2$. Equation (7.1) gives the GNS representation of the trace state on the algebra of complex matrices $M_2(\mathbb{C})$, $\frac{1}{2}\mathrm{Tr}(A_S) = \langle\psi_S, (A_S \otimes \mathbf{1}_S)\psi_S\rangle$, for all $A_S \in M_2(\mathbb{C})$. The von Neumann algebra of observables represented on $\mathcal{H}_S$ is thus $\mathfrak{M}_S = M_2(\mathbb{C}) \otimes \mathbf{1} \subset \mathcal{B}(\mathbb{C}^2 \otimes \mathbb{C}^2)$. The Heisenberg dynamics of $\mathcal{S}$ is given by $e^{it\sigma_z}A_S e^{-it\sigma_z}$. The Pauli matrices $\sigma_z$ and $\sigma_x$ (the latter plays a role in the interaction) are

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{7.2}$$

On the algebra $\mathfrak{M}_S$, the dynamics is implemented as $\tau_S^t(A_S \otimes \mathbf{1}) = e^{itL_S}(A_S \otimes \mathbf{1})e^{-itL_S}$, with standard Liouville operator

$$L_S = \sigma_z \otimes \mathbf{1} - \mathbf{1} \otimes \sigma_z. \tag{7.3}$$
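Two properties of the Liouville operator (7.3) that are used in this section, namely that its spectrum is $\{-2, 0, 0, 2\}$ and that it annihilates $\psi_S$ of (7.1), can be verified with a few lines of plain Python. The helper `kron` and the basis ordering $|a\rangle\otimes|b\rangle \mapsto$ index $2a + b$ are conventions of this sketch.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

sigma_z = [[1, 0], [0, -1]]
identity = [[1, 0], [0, 1]]

left = kron(sigma_z, identity)     # sigma_z (x) 1
right = kron(identity, sigma_z)    # 1 (x) sigma_z
L_S = [[left[i][j] - right[i][j] for j in range(4)] for i in range(4)]

# psi_S is proportional to |0>|0> + |1>|1>; the normalisation does not matter here.
psi_S = [1, 0, 0, 1]
L_S_psi = [sum(L_S[i][j] * psi_S[j] for j in range(4)) for i in range(4)]

# L_S is diagonal in this basis, so its spectrum is read off the diagonal.
off_diagonal = [L_S[i][j] for i in range(4) for j in range(4) if i != j]
spectrum = sorted(L_S[i][i] for i in range(4))
```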


Note that $L_S\psi_S = 0$, as required in (1.1). It is easily verified that the modular operator $\Delta_S$ and the modular conjugation $J_S$ are given by

$$\Delta_S = \mathbf{1} \otimes \mathbf{1}, \qquad J_S(\psi \otimes \chi) = \bar\chi \otimes \bar\psi, \tag{7.4}$$

for vectors $\psi, \chi \in \mathbb{C}^2$, and where the bar means taking complex conjugation of coordinates in the canonical basis.

We now describe a single element $\mathcal{E}$ of the chain, a free Fermi gas at inverse temperature $\beta$ in the thermodynamic limit. We refer the reader to [5] for a detailed presentation. Let $\mathfrak{h}$ and $h$ be the Hilbert space and the Hamiltonian for a single fermion, respectively. We represent $\mathfrak{h}$ as $\mathfrak{h} = L^2(\mathbb{R}_+, d\mu(r); \mathfrak{g})$, where $\mathfrak{g}$ is an auxiliary Hilbert space, and we take $h$ to be the operator of multiplication by $r \in \mathbb{R}_+$. (See also footnote 4 at the end of Sect. 1.) The fermionic annihilation and creation operators $a(f)$ and $a^*(f)$ act on the fermionic Fock space $\Gamma_-(\mathfrak{h})$. They satisfy the canonical anticommutation relations (CAR). As a consequence of the CAR, the operators $a(f)$ and $a^*(f)$ are bounded and satisfy $\|a^\#(f)\| = \|f\|$, where $a^\#$ stands for either $a$ or $a^*$. The algebra of observables of a free Fermi gas is the $C^*$-algebra of operators $\mathfrak{A}$ generated by $\{a^\#(f)\,|\,f \in \mathfrak{h}\}$. The dynamics is given by $\tau_f^t(a^\#(f)) = a^\#(e^{ith}f)$, where $h$ is the Hamiltonian of a single particle, acting on $\mathfrak{h}$. It is well known (see e.g. [5]) that for any $\beta > 0$, there is a unique $(\tau_f, \beta)$-KMS state $\varrho_\beta$ on $\mathfrak{A}$, determined by the two point function $\varrho_\beta(a^*(f)a(f)) = \langle f, (1 + e^{\beta h})^{-1}f\rangle$. Let us denote by $\Omega_f$ the Fock vacuum vector, and by $N$ the number operator of $\Gamma_-(\mathfrak{h})$. We fix a complex conjugation (anti-unitary involution) $f \mapsto \bar{f}$ on $\mathfrak{h}$ which commutes with the energy operator $h$. It naturally extends to a complex conjugation on the Fock space $\Gamma_-(\mathfrak{h})$ and we denote it by the same symbol, i.e. $\Phi \mapsto \bar\Phi$. The GNS representation of the algebra $\mathfrak{A}$ associated to the KMS state $\varrho_\beta$ is the triple $(\mathcal{H}_{\mathcal{E}}, \pi_\beta, \psi_{\mathcal{E}})$ [1], where $\mathcal{H}_{\mathcal{E}} = \Gamma_-(\mathfrak{h}) \otimes \Gamma_-(\mathfrak{h})$, $\psi_{\mathcal{E}} = \Omega_f \otimes \Omega_f$, and

$$\pi_\beta(a(f)) = a\Big(\frac{e^{\beta h/2}}{\sqrt{1 + e^{\beta h}}}f\Big) \otimes \mathbf{1} + (-1)^N \otimes a^*\Big(\frac{1}{\sqrt{1 + e^{\beta h}}}\bar{f}\Big) =: a_\beta(f), \tag{7.5}$$

$$\pi_\beta(a^*(f)) = a^*\Big(\frac{e^{\beta h/2}}{\sqrt{1 + e^{\beta h}}}f\Big) \otimes \mathbf{1} + (-1)^N \otimes a\Big(\frac{1}{\sqrt{1 + e^{\beta h}}}\bar{f}\Big) =: a_\beta^*(f). \tag{7.6}$$

The von Neumann algebra of observables for an element $\mathcal{E}$ of the chain is $\mathfrak{M}_{\mathcal{E}} = \pi_\beta(\mathfrak{A})''$, acting on the Hilbert space $\mathcal{H}_{\mathcal{E}}$. The dynamics on $\pi_\beta(\mathfrak{A})$ is given by $\tau_{\mathcal{E}}^t(\pi_\beta(A)) = \pi_\beta(\tau_f^t(A))$; it extends to $\mathfrak{M}_{\mathcal{E}}$ in a unique way. The standard Liouville operator is given by

$$L_{\mathcal{E}} = d\Gamma(h) \otimes \mathbf{1} - \mathbf{1} \otimes d\Gamma(h). \tag{7.7}$$

Note that $L_{\mathcal{E}}\psi_{\mathcal{E}} = 0$. Finally, the modular conjugation and the modular operator associated to $(\mathfrak{M}_{\mathcal{E}}, \psi_{\mathcal{E}})$ are

$$J_{\mathcal{E}}(\Phi \otimes \Psi) = (-1)^{N(N-1)/2}\bar\Psi \otimes (-1)^{N(N-1)/2}\bar\Phi, \qquad \Delta_{\mathcal{E}} = e^{-\beta L_{\mathcal{E}}}. \tag{7.8}$$

The combined, uncoupled system has product structure, with Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_{\mathcal{E}}$, algebra $\mathfrak{M}_S \otimes \mathfrak{M}_{\mathcal{E}}$, reference state $\psi_S \otimes \psi_{\mathcal{E}}$. The uncoupled dynamics is generated by the Liouville operator

$$L_0 = L_S + L_{\mathcal{E}}. \tag{7.9}$$


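We now specify the interaction between the small system and the elements of the chain. Let $g \in \mathfrak{h}$ be a form factor. The interaction operator is given by

$$V := \sigma_x \otimes \mathbf{1}_{\mathbb{C}^2} \otimes \big(a_\beta(g) + a_\beta^*(g)\big) \tag{7.10}$$

(where $\sigma_x$ is defined in (7.2)). It produces energy exchange processes between $\mathcal{S}$ and $\mathcal{E}$. Using (7.4), (7.8), one readily calculates

$$\begin{aligned}
(\Delta_S \otimes \Delta_{\mathcal{E}})^{1/2}\,V\,(\Delta_S \otimes \Delta_{\mathcal{E}})^{-1/2}
= \sigma_x \otimes \mathbf{1}_{\mathbb{C}^2} \otimes \Big[\, & a\Big(\frac{e^{\beta h}}{\sqrt{1 + e^{\beta h}}}g\Big) \otimes \mathbf{1} + a^*\Big(\frac{1}{\sqrt{1 + e^{\beta h}}}g\Big) \otimes \mathbf{1} \\
& + (-1)^N \otimes a^*\Big(\frac{e^{\beta h/2}}{\sqrt{1 + e^{\beta h}}}\bar{g}\Big) + (-1)^N \otimes a\Big(\frac{e^{-\beta h/2}}{\sqrt{1 + e^{\beta h}}}\bar{g}\Big)\Big].
\end{aligned} \tag{7.11}$$

We assume that $e^{\beta h/2}g \in \mathfrak{h}$. Then (7.11) shows that $(\Delta_S \otimes \Delta_{\mathcal{E}})^{1/2}V(\Delta_S \otimes \Delta_{\mathcal{E}})^{-1/2} \in \mathfrak{M}_S \otimes \mathfrak{M}_{\mathcal{E}}$, i.e., Condition (A2) of Sect. 2 is satisfied.

Theorem 7.1 (Convergence to asymptotic state). Let $0 < \tau_{\min} < \tau_{\max} < \infty$ and $0 < \beta_{\max} < \infty$ be given. Let $\tau : \Omega \to [\tau_{\min}, \tau_{\max}]$ and $\beta : \Omega \to (0, \beta_{\max}]$ be random variables. Suppose that $\|(1 + e^{\beta_{\max}h/2})g\| < \infty$, and that there is a $\delta > 0$ such that

$$p\big(\mathrm{dist}(\tau, \tfrac{\pi}{2}\mathbb{N}) > \delta\big) \neq 0. \tag{7.12}$$

Then there is a constant $\lambda_0 > 0$, depending on $\tau_{\min}, \tau_{\max}, \beta_{\max}, \delta$, and on the form factor $g$, s.t. if $0 < |\lambda| < \lambda_0$, then $p(M(\omega) \in \mathcal{M}_{(E)}) > 0$. In particular, the results of Theorem 1.3, applied to the spin-fermion system, hold: the system approaches the repeated interaction asymptotic state $\varrho_+$, defined in (1.10).

Proof. We expand the operator $M$ in a power (Dyson) series in $\lambda$:

$$M = e^{i\tau L_S}P + \sum_{n\ge 1}\lambda^{2n}\int_0^\tau dt_1\cdots\int_0^{t_{2n-1}}dt_{2n}\; e^{i\tau L_S}P\,e^{it_{2n}K_0}We^{-it_{2n}K_0}\cdots e^{it_1K_0}We^{-it_1K_0}P, \tag{7.13}$$

where only the even powers appear since the interaction is linear in creation and annihilation operators, and $P$ projects onto the vacuum. $W$ is the operator

$$W = V - J\Delta^{1/2}V\Delta^{-1/2}J, \tag{7.14}$$

where $V$ is given in (7.10) (see also (7.11)), and $J = J_S \otimes J_{\mathcal{E}}$, $\Delta = \Delta_S \otimes \Delta_{\mathcal{E}}$ are the modular conjugation and the modular operator associated to $(\mathfrak{M}_S \otimes \mathfrak{M}_{\mathcal{E}}, \psi_S \otimes \psi_{\mathcal{E}})$, see also (7.4), (7.8). Using the Canonical Anticommutation Relations, one easily sees that $\|a_\beta^\#(g)\| = \|g\|$ (independent of $\beta$; see (7.6) for the definition of the thermal creation and annihilation operators). Using (7.14) and (7.11), it is easy to find the upper bound

$$\|W\| \le 3\,\|(1 + e^{\beta h/2})g\|. \tag{7.15}$$

We apply standard analytic perturbation theory to the operator (7.13). For $\lambda = 0$, the eigenvalues of $M$, $\{1, e^{\pm 2i\tau}\}$, lie apart by the distance

$$r_0(\tau) := \min\{2|\sin(2\tau)|,\ 2|\sin(\tau)|\}. \tag{7.16}$$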
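The formula (7.16) is just the smallest pairwise distance between the three points $1, e^{2i\tau}, e^{-2i\tau}$ on the unit circle, since $|1 - e^{\pm 2i\tau}| = 2|\sin\tau|$ and $|e^{2i\tau} - e^{-2i\tau}| = 2|\sin 2\tau|$. A short Python check of this identity, with arbitrarily chosen sample times:

```python
import cmath
import math

def r0(tau):
    """Spectral gap r_0(tau) of the unperturbed operator, formula (7.16)."""
    return min(2 * abs(math.sin(2 * tau)), 2 * abs(math.sin(tau)))

def min_pairwise_distance(tau):
    """Smallest distance between the eigenvalues 1, e^{2i tau}, e^{-2i tau}."""
    eigs = [1, cmath.exp(2j * tau), cmath.exp(-2j * tau)]
    return min(abs(eigs[i] - eigs[j]) for i in range(3) for j in range(i + 1, 3))

sample_taus = [0.1, 0.7, 1.0, 2.5, 4.2]          # illustrative values
diffs = [abs(r0(t) - min_pairwise_distance(t)) for t in sample_taus]

# The gap closes exactly on the lattice (pi/2) N excluded by condition (7.12).
gaps_on_lattice = [r0(k * math.pi / 2) for k in (1, 2, 3)]
```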


(Note the spectrum of $L_S$ is $\{-2, 0, 0, 2\}$, cf. (7.3).) We assume that the interaction time is such that $r_0(\tau)$ is strictly positive. Below, this condition appears as $\mathrm{dist}(\tau, \frac{\pi}{2}\mathbb{N}) > \delta > 0$. The following result gives an estimate of the eigenvalues of $M$, which will be needed in verifying that $M$ is in the family $\mathcal{M}_{(E)}$.

Proposition 7.2. Suppose that $|\lambda| < \frac{1}{4}r_0(\tau)$. Denote by $1, e_0, e_\pm$ the four eigenvalues of $M$. We have

$$e_0 = 1 - \lambda^2\tau^2\alpha + \varepsilon_0, \tag{7.17}$$

$$e_\pm = e^{\pm 2i\tau}\Big[1 - \frac{\lambda^2\tau^2}{2}\alpha \pm i\lambda^2\tau^2\int d\mu(r)\,\|g(r)\|_{\mathfrak{g}}^2\Big(\frac{1 - \mathrm{sinc}(\tau(2 - r))}{2 - r} + \frac{1 - \mathrm{sinc}(\tau(2 + r))}{2 + r}\Big) + \varepsilon_\pm\Big], \tag{7.18}$$

where $\mathrm{sinc}(x) = \sin(x)/x$, and where

$$\alpha = \int d\mu(r)\,\|g(r)\|_{\mathfrak{g}}^2\Big[\mathrm{sinc}^2\Big(\frac{\tau(r - 2)}{2}\Big) + \mathrm{sinc}^2\Big(\frac{\tau(r + 2)}{2}\Big)\Big]. \tag{7.19}$$

The error terms $\varepsilon_\#$, $\# = 0, \pm$, satisfy the bound

$$|\varepsilon_\#| \le 12\lambda^4\tau^4\|W\|^4\cosh^2(|\lambda|\tau\|W\|)\Big(1 + \frac{1 + \lambda^2\tau^2\|W\|^2\cosh(|\lambda|\tau\|W\|)}{r_0(\tau)}\Big). \tag{7.20}$$

Proof. Expansions (7.17), (7.18) of the eigenvalues have already been calculated in [7], Sect. 4.8, but the error estimate (7.20), allowing the control of $\tau, \beta$, has not been given there. This error estimate is obtained by performing perturbation theory in a straightforward, but careful fashion. One proceeds as in [11], Chap. II.2.

By knowing this expansion of the eigenvalues of $M$, we can impose a smallness condition on $\lambda$ which guarantees that the eigenvalues $e_\#$ have modulus strictly less than one, which is equivalent to saying that $M \in \mathcal{M}_{(E)}$.

Proposition 7.3. Suppose that $\tau_{\min} < \tau < \tau_{\max}$, $\beta < \beta_{\max}$, and that $\mathrm{dist}(\tau, \frac{\pi}{2}\mathbb{N}) > \delta$, for some constants $0 < \tau_{\min} < \tau_{\max}$ and $\beta_{\max}, \delta > 0$. Then there is a constant $\lambda_0 > 0$, depending on $\tau_{\min}, \tau_{\max}, \beta_{\max}, \delta$, as well as on the form factor $g$, s.t. if $0 < |\lambda| < \lambda_0$, then

$$|e_\#| < 1 - \frac{\lambda^2\tau^2}{8}\alpha < 1, \tag{7.21}$$

$\# = 0, \pm$. In particular, $M \in \mathcal{M}_{(E)}$.

End of proof of Theorem 7.1, given Proposition 7.3. Fix $\tau_{\min}, \tau_{\max}, \beta_{\max}$ and $\delta$, and suppose that (7.12) holds. Denote by $\Omega'$ the set of $\omega$ for which $\mathrm{dist}(\tau(\omega), \frac{\pi}{2}\mathbb{N}) > \delta$. Then $p(\Omega') \neq 0$, and for each $\omega \in \Omega'$, we have $M(\omega) \in \mathcal{M}_{(E)}$, by Proposition 7.3. Consequently, $p(M(\omega) \in \mathcal{M}_{(E)}) \ge p(\Omega') > 0$.

Proof of Proposition 7.3. We impose conditions s.t. the three eigenvalues given in (7.17), (7.18) have modulus strictly less than one. We have

$$|e_0| < 1 - \frac{\lambda^2\tau^2}{2}\alpha, \tag{7.22}$$


provided

$$|\varepsilon_0| < \frac{\lambda^2\tau^2}{2}\alpha. \tag{7.23}$$

Next, since $e_-$ is the complex conjugate of $e_+$, it suffices to consider the latter. We write, with obvious identifications in (7.18), $e_+ = e^{2i\tau}[1 - x + iy + \varepsilon_+]$. We have

$$|e_+| \le |1 - x + iy| + |\varepsilon_+| = \sqrt{1 - 2x + x^2 + y^2} + |\varepsilon_+|. \tag{7.24}$$

Since $x^2 + y^2$ is just the square of the modulus of the second order ($\lambda^2$) contribution to the eigenvalue $e_+$, it is easy to see that $x^2 + y^2 \le \lambda^4\tau^4\|W\|^4\big(1 + \frac{1}{r_0(\tau)}\big)^2$. We now impose the condition

$$2\lambda^2\tau^2\|W\|^4\Big(1 + \frac{1}{r_0(\tau)}\Big)^2 < \alpha, \tag{7.25}$$

which implies that $x^2 + y^2 < x$. Combining this latter inequality with (7.24) gives

$$|e_+| < \sqrt{1 - x} + |\varepsilon_+| \le 1 - x/2 + |\varepsilon_+|. \tag{7.26}$$

Finally we impose the condition

$$|\varepsilon_+| < x/4 = \frac{\lambda^2\tau^2}{8}\alpha, \tag{7.27}$$

so that we get from (7.26),

$$|e_+| < 1 - x/4 = 1 - \frac{\lambda^2\tau^2}{8}\alpha. \tag{7.28}$$
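The chain (7.24) to (7.28) uses only the two hypotheses $x^2 + y^2 < x$ and $|\varepsilon_+| < x/4$. The Python sketch below checks the resulting bound $|1 - x + iy + \varepsilon_+| < 1 - x/4$ on a few hand-picked triples; the sample values are arbitrary and serve only to illustrate the inequality.

```python
def bound_holds(x, y, eps):
    """Check |1 - x + i y + eps| < 1 - x/4, given x^2 + y^2 < x and |eps| < x/4."""
    # Hypotheses corresponding to (7.25) and (7.27), expressed in terms of x and y
    assert x * x + y * y < x and abs(eps) < x / 4
    return abs(complex(1 - x, y) + eps) < 1 - x / 4

# (x, y, eps) triples; eps may be complex since it is an arbitrary error term
samples = [
    (0.2, 0.3, 0.04),
    (0.05, 0.2, -0.01 + 0.005j),
    (0.5, 0.49, 0.1j),
]
results = [bound_holds(x, y, eps) for x, y, eps in samples]
```

The mechanism is the one in the text: $x^2 + y^2 < x$ gives $|1 - x + iy| < \sqrt{1 - x} \le 1 - x/2$, and the error term can then consume at most $x/4$.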

This last bound, combined with (7.22), proves that (7.21) holds, provided the conditions (7.27), (7.25) and (7.23) are imposed. Taking into account the bound (7.20), we see that a sufficient condition for (7.27), (7.25) and (7.23) to hold is that

$$96\lambda^2\tau^2\|W\|^4\cosh^2(|\lambda|\tau\|W\|)\Big(1 + \frac{1 + \lambda^2\tau^2\|W\|^2\cosh(|\lambda|\tau\|W\|)}{r_0(\tau)}\Big)^2 < \alpha. \tag{7.29}$$

One may now use (7.15), (7.16), to find a constant $\lambda_0$, depending only on the parameters as stated in the proposition, s.t. if $|\lambda| < \lambda_0$, then (7.29) holds. (Note that $\alpha$, (7.19), does not depend on $\beta$, and the minimum of $\alpha$, taken over $\tau > 0$ varying in any compact set, must be strictly positive.) This completes the proof of Proposition 7.3, and hence that of Theorem 7.1.


7.1. Proof of Theorem 1.6. Since $E[M] \in \mathcal{M}_{(E)}$ (by Theorems 7.1 and 1.2), 1 is a simple eigenvalue of $E[M]$. Let $\psi_S^*$ denote the unique vector invariant under $E[M^*]$, normalized as $\langle \psi_S^*, \psi_S \rangle = 1$, where $\psi_S$ is given in (7.1). We have $P_{1,E[M]} = |\psi_S\rangle\langle\psi_S^*|$, and thus, by (3.14), $\theta = \psi_S^*$. To calculate $\psi_S^*$, we note first that for any $\omega$, $M(\omega)$ is block-diagonal:

Lemma 7.4. Let $P_0 = |0\rangle\langle 0| + |1\rangle\langle 1|$ be the spectral projection of $L_S$ associated to {1}. The operator $M(\omega)$ leaves the subspace Ran $P_0$ invariant. In the ordered orthonormal basis $\{|0\rangle, |1\rangle\}$ of Ran $P_0$, we have the representation
$$P_0 M(\omega) P_0 = \mathbb{1} - \lambda^2\tau^2(\omega) \begin{pmatrix} \alpha_-(\omega) & -\alpha_-(\omega) \\ -\alpha_+(\omega) & \alpha_+(\omega) \end{pmatrix} + O(\lambda^4),$$

(7.30)

where the $\alpha_\pm(\omega)$ are given by (1.18) with $\tau(\omega)$ replaced by $\tau_0$. The remainder term is uniform in $\tau$ varying in compact sets.

Proof of Lemma 7.4. As explained at the beginning of the proof of Theorem 7.1, only even powers of the interaction are present in the Dyson series expansion for $M$, (7.13). It follows from (7.10) and (7.14) that each term in the Dyson series (7.13) leaves Ran $P_0$ invariant; this is so because the operator $\sigma_x$ shows up an even number of times, and $\sigma_x|0\rangle = |1\rangle$ and $\sigma_x|1\rangle = |0\rangle$. The calculation of the explicit form (7.30) is not hard. This concludes the proof of Lemma 7.4.

The expansions of $M^*$, and hence of $E[M^*]$, in powers of $\lambda$ follow directly from (7.30). One then performs an expansion in powers of $\sigma$ and finds for the $O(\lambda^2)$-term:

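$$-\lambda^2\tau_0^2 \begin{pmatrix} \alpha_- & -\alpha_+ \\ -\alpha_- & \alpha_+ \end{pmatrix} - 2\lambda^2 E[\sigma] \begin{pmatrix} \xi_- & -\xi_+ \\ -\xi_- & \xi_+ \end{pmatrix} - \lambda^2 E[\sigma^2] \begin{pmatrix} \eta_- & -\eta_+ \\ -\eta_- & \eta_+ \end{pmatrix} + \lambda^2 O(\sigma^3).$$

The following expansion of the invariant vector $\psi_S^*$ follows:
$$\begin{aligned}
\frac{1}{\sqrt{2}}\,\psi_S^* ={}& \frac{\alpha_+}{\alpha_+ + \alpha_-}\,|0\rangle\otimes|0\rangle + \frac{\alpha_-}{\alpha_+ + \alpha_-}\,|1\rangle\otimes|1\rangle \\
&+ 2E[\sigma]\,\frac{\alpha_-\xi_+ - \alpha_+\xi_-}{\tau_0^2(\alpha_+ + \alpha_-)^2}\,\big(|0\rangle\otimes|0\rangle - |1\rangle\otimes|1\rangle\big) \\
&+ 4(E[\sigma])^2\,(\xi_- + \xi_+)\,\frac{\alpha_-\xi_+ - \alpha_+\xi_-}{\tau_0^4(\alpha_+ + \alpha_-)^3}\,\big(|0\rangle\otimes|0\rangle - |1\rangle\otimes|1\rangle\big) \\
&+ E[\sigma^2]\,\frac{\alpha_-\eta_+ - \alpha_+\eta_-}{\tau_0^2(\alpha_+ + \alpha_-)^2}\,\big(|0\rangle\otimes|0\rangle - |1\rangle\otimes|1\rangle\big) + O(\sigma^3) + O(\lambda^2). \qquad (7.31)
\end{aligned}$$

Formula (1.17) now follows directly from (7.31) and (1.10). This concludes the proof of Theorem 1.6.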


Communicated by I.M. Sigal

Commun. Math. Phys. 284, 583–647 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0641-z

Communications in Mathematical Physics

Classification of Superpotentials

A. Dancer¹, M. Wang²

¹ Jesus College, Oxford University, Oxford OX1 3DW, United Kingdom. E-mail: [email protected]
² Department of Mathematics and Statistics, McMaster University, Hamilton, ON L8S 4K1, Canada. E-mail: [email protected]

Received: 5 February 2007 / Accepted: 26 June 2008
Published online: 1 October 2008 – © Springer-Verlag 2008

Abstract: We extend our previous classification [DW4] of superpotentials of “scalar curvature type” for the cohomogeneity one Ricci-flat equations. We now consider the case not covered in [DW4], i.e., when some weight vector of the superpotential lies outside (a scaled translate of) the convex hull of the weight vectors associated with the scalar curvature function of the principal orbit. In this situation we show that either the isotropy representation has at most 3 irreducible summands or the first order subsystem associated to the superpotential is of the same form as the Calabi-Yau condition for submersion type metrics on complex line bundles over a Fano Kähler-Einstein product.

0. Introduction

In this paper we continue the study we began in [DW4] of superpotentials for the cohomogeneity one Einstein equations. These equations are the ODE system obtained as a reduction of the Einstein equations by requiring that the Einstein manifold admits an isometric Lie group action whose principal orbits $G/K$ have codimension one [BB,EW]. As discussed in [DW3], these equations can be viewed as a Hamiltonian system with constraint for a suitable Hamiltonian $H$, in which the potential term depends on the Einstein constant and the scalar curvature of the principal orbit, and the kinetic term is essentially the Wheeler-de Witt metric, which is of Lorentz signature.

For any Hamiltonian system with Hamiltonian $H$ and position variable $q$, a superpotential is a globally defined function $u$ on configuration space that satisfies the equation
$$H(q, du_q) = 0.$$

(0.1)

(Author footnote: Partly supported by NSERC grant No. OPG0009421.)

From the classical physics viewpoint, $u$ is a $C^2$ (rather than a viscosity) solution of a time-independent Hamilton-Jacobi equation. The literature for implicitly defined first order partial differential equations then suggests that such solutions are fairly rare. It
is therefore not unreasonable to expect in our case that one can classify (at least under appropriate conditions) those principal orbits where the associated cohomogeneity one Einstein equations admit a superpotential. The existence of such a superpotential u in our case leads naturally to a subsystem of equations of half the dimension of the full Einstein system. One way to see this is via generalised first integrals which are linear in momenta, described in [DW4]. Schematically, the subsystem may be written as q˙ = J ∇u, where J is an endomorphism related to the kinetic term of the Einstein Hamiltonian. String theorists have exploited the superpotential idea in their search for explicit metrics of special holonomy (see for example [CGLP1,CGLP2,CGLP3,BGGG] and references in [DW4]). The point here is that the subsystem defined by the superpotential often (though not always) represents the condition that the metric has special holonomy. Also, the subsystem can often be integrated explicitly. In [DW4], Sect. 6, we obtained classification results for superpotentials of the cohomogeneity one Ricci-flat equations. Besides assuming that G and K are both compact, connected Lie groups such that the isotropy representation of G/K is multiplicity-free, we also mainly restricted our attention to superpotentials which are of the same form as the scalar curvature function of G/K , i.e., a finite sum with constant coefficients of exponential terms. Almost all the known superpotentials are of this kind. However, the above classification results were further subject to the technical assumption that the extremal weights for the superpotential did not lie in the null cone of the Wheeler-de Witt metric. In [DW4] we gave some examples of superpotentials which do not satisfy this hypothesis. These included several new examples which do not seem to be associated to special holonomy. 
In this paper, therefore, we attempt to solve the classification problem without the non-null assumption on the extremal weights. As in [DW4], we use techniques of convex geometry to analyse the two polytopes naturally associated to the classification problem. The first is (a rescaled translate of) the convex hull conv(W) of the weight vectors appearing in the scalar curvature function of the principal orbit. The second is the convex hull conv(C) of the weight vectors in the superpotential. In [DW4] we showed that the non-null assumption forces these polytopes to be equal, so we could analyse the existence of superpotentials by looking at the geometry of conv(W). In the current paper, conv(C) may be strictly bigger than conv(W) because of the existence of vertices outside conv(W) but lying on the null cone of the Wheeler-de Witt metric. Our strategy is to consider such a vertex c and project conv(W) onto an affine hyperplane separating c from conv(W). We can now analyse the existence of superpotentials in terms of the projected polytope. The analysis becomes considerably more complicated because, whereas in [DW4] we could analyse the situation by looking at the vertices and edges of conv(W), now, because we have projected onto a subspace of one lower dimension, we have to consider the 2-dimensional faces of conv(W) also. We find that in this situation the only polytopes conv(W) arising from principal orbits with more than three irreducible summands in their isotropy representations are precisely those coming from principal orbits which are circle bundles over a (homogeneous) Fano product. In the latter case, the solutions of the subsystem defined by the superpotential correspond to Calabi-Yau metrics, as discussed in [DW4]. After a review of basic material in Sect. 1, we state the main classification theorem of the paper in Sect. 2 and give an outline of the strategy of the proof there.


1. Review and Notation

In this section we fix notation for the problem and review the set-up of [DW4]. Let $G$ be a compact Lie group, $K \subset G$ be a closed subgroup, and $M$ be a cohomogeneity one $G$-manifold of dimension $n + 1$ with principal orbit type $G/K$, which is assumed to be connected and almost effective. A $G$-invariant metric $g$ on $M$ can be written in the form $g = \varepsilon\, dt^2 + g_t$, where $t$ is a coordinate transverse to the principal orbits, $\varepsilon = \pm 1$, and $g_t$ is a 1-parameter family of $G$-homogeneous Riemannian metrics on $G/K$. When $\varepsilon = 1$, the metric $g$ is Riemannian, and when $\varepsilon = -1$, the metric $g$ is spatially homogeneous Lorentzian, i.e., the principal orbits are space-like hypersurfaces.

We choose an ${\rm Ad}(K)$-invariant decomposition $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$, where $\mathfrak{g}$ and $\mathfrak{k}$ are respectively the Lie algebras of $G$ and $K$, and $\mathfrak{p}$ is identified with the isotropy representation of $G/K$. Let
$$\mathfrak{p} = \mathfrak{p}_1 \oplus \cdots \oplus \mathfrak{p}_r$$

(1.1)

be a decomposition of $\mathfrak{p} \approx T_{(K)}(G/K)$ into irreducible real $K$-representations. We let $d_i$ be the real dimension of $\mathfrak{p}_i$, and $n = \sum_{i=1}^r d_i$ be the dimension of $G/K$ (so $\dim M = n + 1$). We use $d$ for the vector of dimensions $(d_1, \ldots, d_r)$. We shall assume that the isotropy representation of $G/K$ is multiplicity free, i.e., all the summands $\mathfrak{p}_i$ in (1.1) are distinct as $K$-representations. In particular, if there is a trivial summand it must be 1-dimensional.

We use $q = (q_1, \ldots, q_r)$ to denote exponential coordinates on the space of $G$-invariant metrics on $G/K$. The Hamiltonian $H$ for the cohomogeneity one Einstein equations with principal orbit $G/K$ is now given by:
$$H = v^{-1} J + \varepsilon v\,\big((n-1)\Lambda - S\big),$$
where $\Lambda$ is the Einstein constant, $v = \frac{1}{2} e^{d\cdot q}$ is the relative volume and
$$J(p, p) = \frac{1}{n-1}\Big(\sum_{i=1}^r p_i\Big)^2 - \sum_{i=1}^r \frac{p_i^2}{d_i},$$

(1.2)

which has signature $(1, r-1)$. The scalar curvature $S$ of $G/K$ above can be written as
$$S = \sum_{w \in \mathcal{W}} A_w\, e^{w\cdot q},$$
where $A_w$ are nonzero constants and $\mathcal{W}$ is a finite collection of vectors $w \in \mathbb{Z}^r \subset \mathbb{R}^r$. The set $\mathcal{W}$ depends only on $G/K$ and its elements will be referred to as weight vectors. These are of three types:
(i) type I: one entry of $w$ is $-1$, the others are zero,
(ii) type II: one entry is $1$, two are $-1$, the rest are zero,
(iii) type III: one entry is $1$, one is $-2$, the rest are zero.

Notation 1.1. As in [DW4] we use $(-1^i, -1^j, 1^k)$ to denote the type II vector $w \in \mathcal{W} \subset \mathbb{R}^r$ with $-1$ in places $i$ and $j$, and $1$ in place $k$. Similarly, $(-2^i, 1^j)$ will denote the type III vector with $-2$ in place $i$ and $1$ in place $j$, and $(-1^i)$ the type I vector with $-1$ in place $i$.
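To make the three types of weight vectors and the form $J$ of (1.2) concrete, here is a small illustrative sketch. It enumerates all candidate vectors of each type for a given $r$ (which of them actually occur in $\mathcal{W}$ depends on $G/K$ and is not modelled here) and evaluates the bilinear form obtained by polarising (1.2). The function names and the sample dimension vector `d = (2, 3, 4)` are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations


def type_I(r):
    """Candidate type I vectors (-1^i): a single entry equal to -1."""
    return [tuple(-1 if k == i else 0 for k in range(r)) for i in range(r)]


def type_II(r):
    """Candidate type II vectors (-1^i, -1^j, 1^k) with i, j, k distinct."""
    vecs = []
    for i, j in combinations(range(r), 2):
        for k in range(r):
            if k not in (i, j):
                v = [0] * r
                v[i], v[j], v[k] = -1, -1, 1
                vecs.append(tuple(v))
    return vecs


def type_III(r):
    """Candidate type III vectors (-2^i, 1^j) with i != j."""
    vecs = []
    for i, j in permutations(range(r), 2):
        v = [0] * r
        v[i], v[j] = -2, 1
        vecs.append(tuple(v))
    return vecs


def J(p, q, d):
    """Polarisation of (1.2): (sum p)(sum q)/(n-1) - sum_i p_i q_i / d_i."""
    n = sum(d)
    cross = sum(Fraction(a * b, di) for a, b, di in zip(p, q, d))
    return Fraction(sum(p) * sum(q), n - 1) - cross


d = (2, 3, 4)  # hypothetical dimensions d_i, so n = 9
candidates = type_I(3) + type_II(3) + type_III(3)
print(len(candidates), all(sum(w) == -1 for w in candidates))
print(J(d, d, d))  # n/(n-1): the vector d spans the positive direction of J
```

Every candidate vector has entries summing to $-1$, which is why the translates $d + w$ all lie in one affine hyperplane, a fact used repeatedly in the text.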


Remark 1.2. We collect below various useful facts from [DW4] and [WZ1]. Also, we shall use standard terminology from convex geometry, as given, e.g., in [Zi]. In particular, a “face” is not necessarily 2-dimensional. However, a vertex and an edge are respectively zero and one-dimensional. The convex hull of a set $X$ in $\mathbb{R}^r$ will be denoted by ${\rm conv}(X)$.
(a) For a type I vector $w$, the coefficient $A_w > 0$, while for type II and type III vectors, $A_w < 0$.
(b) The type I vector with $-1$ in the $i$th position is absent from $\mathcal{W}$ iff the corresponding summand $\mathfrak{p}_i$ is an abelian subalgebra which satisfies $[\mathfrak{k}, \mathfrak{p}_i] = 0$ and $[\mathfrak{p}_i, \mathfrak{p}_j] \subset \mathfrak{p}_j$ for all $j \neq i$. If the isotropy group $K$ is connected, these last conditions imply that $\mathfrak{p}_i$ is 1-dimensional, and the $\mathfrak{p}_j$, $j \neq i$, are irreducible representations of the (compact) analytic group whose Lie algebra is $\mathfrak{k} \oplus \mathfrak{p}_i$.
(c) If $(1^i, -1^j, -1^k)$ occurs in $\mathcal{W}$ then its permutations $(-1^i, 1^j, -1^k)$ and $(-1^i, -1^j, 1^k)$ do also.
(d) If $\dim \mathfrak{p}_i = 1$ then no type III vector with $-2$ in place $i$ is present in $\mathcal{W}$. If in addition $K$ is connected, then no type II vector with nonzero entry in place $i$ is present.
(e) If $I$ is a subset of $\{1, \ldots, r\}$, then each of the equations $\sum_{i\in I} x_i = 1$ and $\sum_{i\in I} x_i = -2$ defines a face (possibly empty) of ${\rm conv}(\mathcal{W})$. In particular, all type III vectors in $\mathcal{W}$ are vertices and $(-1^i, -1^k, 1^j) \in \mathcal{W}$ is a vertex unless both $(-2^i, 1^j)$ and $(-2^k, 1^j)$ lie in $\mathcal{W}$.
(f) For $v, w \in \mathcal{W}$ (or indeed for any $v, w$ such that $\sum v_i$ or $\sum w_i = -1$), we have
$$J(v + d, w + d) = 1 - \sum_{i=1}^r \frac{v_i w_i}{d_i}.$$

(1.3)
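Identity (1.3) is a short computation from the definition of $J$, and it can be confirmed in exact rational arithmetic. The sketch below is an illustration under stated assumptions: `J` is the polarisation of the quadratic form (1.2), the dimension vector `d` is an arbitrary sample, and only `v` is forced to have entries summing to $-1$, in line with the parenthetical remark in (f).

```python
from fractions import Fraction
import random


def J(p, q, d):
    """Polarisation of (1.2): (sum p)(sum q)/(n-1) - sum_i p_i q_i / d_i."""
    n = sum(d)
    cross = sum(Fraction(a * b, di) for a, b, di in zip(p, q, d))
    return Fraction(sum(p) * sum(q), n - 1) - cross


def rhs_1_3(v, w, d):
    """Right-hand side of (1.3)."""
    return 1 - sum(Fraction(a * b, di) for a, b, di in zip(v, w, d))


def shifted(v, d):
    """The translate v + d."""
    return [a + di for a, di in zip(v, d)]


def identity_holds(v, w, d):
    return J(shifted(v, d), shifted(w, d), d) == rhs_1_3(v, w, d)


rng = random.Random(0)
d = [1, 2, 4, 6]  # sample dimensions, n = 13
results = []
for _ in range(200):
    v = [rng.randint(-3, 3) for _ in range(len(d) - 1)]
    v.append(-1 - sum(v))                             # force sum(v) = -1
    w = [rng.randint(-3, 3) for _ in range(len(d))]   # sum(w) is arbitrary
    results.append(identity_holds(v, w, d))
print(all(results))
```

If neither vector has entries summing to $-1$ the identity generally fails; for example `identity_holds([0, 0, 0, 0], [0, 0, 0, 0], d)` is false, since $J(d, d) = n/(n-1) \neq 1$.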

For the remainder of the paper, we shall work in the Ricci-flat Riemannian case, that is, we take $\varepsilon = 1$ and the Einstein constant $\Lambda = 0$. As in [DW4], any argument that does not use the sign of $A_w$ would be valid in the Lorentzian case. We shall also assume that ${\rm conv}(\mathcal{W})$ is $(r-1)$-dimensional. This is certainly the case if $G$ is semisimple, as $\mathcal{W}$ spans $\mathbb{R}^r$ (see the proof of Theorem 3.11 in [DW3]).

The superpotential equation (0.1) now becomes
$$J(\nabla u, \nabla u) = e^{d\cdot q}\, S,$$

(1.4)

where $\nabla$ denotes the Euclidean gradient in $\mathbb{R}^r$. As in [DW4] we shall look for solutions to Eq. (1.4) of the form
$$u = \sum_{\bar c \in \mathcal{C}} F_{\bar c}\, e^{\bar c\cdot q},$$

(1.5)

where $\mathcal{C}$ is a finite set in $\mathbb{R}^r$, and the $F_{\bar c}$ are nonzero constants. Now Eq. (1.4) reduces to, for each $\xi \in \mathbb{R}^r$,
$$\sum_{\bar a + \bar c = \xi} J(\bar a, \bar c)\, F_{\bar a} F_{\bar c} = \begin{cases} A_w & \text{if } \xi = d + w \text{ for some } w \in \mathcal{W}, \\ 0 & \text{if } \xi \notin d + \mathcal{W}. \end{cases}$$

(1.6)
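The left-hand side of (1.6) is simply the coefficient of $e^{\xi\cdot q}$ in $J(\nabla u, \nabla u)$ for $u$ of the form (1.5), with the sum running over ordered pairs $(\bar a, \bar c)$. A minimal sketch of that bookkeeping follows. The two-element set and the coefficients `F` are hypothetical toy data with $r = 2$ and $d = (1, 2)$, chosen so that one element is $J$-null; they are not taken from a specific example in the text.

```python
from collections import defaultdict
from fractions import Fraction


def J(p, q, d):
    """Polarisation of (1.2): (sum p)(sum q)/(n-1) - sum_i p_i q_i / d_i."""
    n = sum(d)
    cross = sum(Fraction(a) * b / di for a, b, di in zip(p, q, d))
    return Fraction(sum(p)) * sum(q) / (n - 1) - cross


def quadratic_coefficients(F, d):
    """Map xi -> sum of J(a, c) F[a] F[c] over ordered pairs with a + c = xi."""
    coeffs = defaultdict(Fraction)
    for a, Fa in F.items():
        for c, Fc in F.items():
            xi = tuple(x + y for x, y in zip(a, c))
            coeffs[xi] += J(a, c, d) * Fa * Fc
    return dict(coeffs)


d = (1, 2)
c_bar = (Fraction(0), Fraction(1))   # (d + (-1, 0)) / 2, a J-null vector
v_bar = (Fraction(1), Fraction(0))   # (d + (1, -2)) / 2
F = {c_bar: Fraction(1), v_bar: Fraction(1)}   # illustrative coefficients

coeffs = quadratic_coefficients(F, d)
for xi, value in sorted(coeffs.items()):
    w = tuple(x - di for x, di in zip(xi, d))   # xi = d + w
    print(w, value)
```

In this toy case the null vector contributes nothing at $\xi = 2\bar c$, the mixed term gives a positive coefficient at the type I vector $(0, -1)$, and the remaining term gives a negative coefficient at the type III vector $(1, -2)$, which matches the sign pattern of Remark 1.2(a).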

We shall assume henceforth that r ≥ 2 since the superpotential equation always has a solution in the r = 1 case, as was noted in [DW4], and J is of Lorentz signature only when r ≥ 2. The following facts were deduced in [DW4] from Eq.(1.6).


Proposition 1.3. ${\rm conv}(\frac{1}{2}(d + \mathcal{W})) \subset {\rm conv}(\mathcal{C})$.

Proof. If $w \in \mathcal{W}$, then Eq. (1.6) implies that $d + w = \bar a + \bar c$ for some $\bar a, \bar c \in \mathcal{C}$, and hence that $\frac{1}{2}(d + w) = \frac{1}{2}(\bar a + \bar c) \in {\rm conv}(\mathcal{C})$.

Proposition 1.4. If $\bar a, \bar c \in \mathcal{C}$ and $\bar a + \bar c$ cannot be written as the sum of two non-orthogonal elements of $\mathcal{C}$ distinct from $\bar a, \bar c$ then either $J(\bar a, \bar c) = 0$ or $\bar a + \bar c \in d + \mathcal{W}$. In particular, if $\bar c$ is a vertex of $\mathcal{C}$, then either $J(\bar c, \bar c) = 0$, or $2\bar c = d + w$ for some $w \in \mathcal{W}$ and $J(\bar c, \bar c)\, F_{\bar c}^2 = A_w$. In the latter case, $J(d + w, d + w)$ has the same sign as $A_w$ so is $> 0$ if $w$ is type I and $< 0$ if $w$ is type II or III.

As mentioned in the Introduction, for the classification in [DW4] we made the assumption that all vertices $\bar c$ of $\mathcal{C}$ are non-null. Under this assumption, the second assertion of Prop. 1.4 implies that all vertices of $\mathcal{C}$ lie in $\frac{1}{2}(d + \mathcal{W})$. Hence ${\rm conv}(\mathcal{C})$ is contained in ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$, and by Prop. 1.3 they are equal. This meant that in [DW4], subject to the non-null assumption, we could study the existence of a superpotential in terms of the convex geometry of $\mathcal{W}$.

The aim of the current paper is to drop this assumption. We still have
$${\rm conv}(\tfrac{1}{2}(d + \mathcal{W})) \subset {\rm conv}(\mathcal{C}),$$
but can no longer deduce that these sets are equal. The problem is that a vertex $\bar c$ of ${\rm conv}(\mathcal{C})$ may lie outside ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$ if it is null. In fact, it is clear from the above discussion that ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$ is strictly contained in ${\rm conv}(\mathcal{C})$ if and only if $\mathcal{C}$ has a null vertex. For if $\bar c$ is a null vertex of $\mathcal{C}$ and $2\bar c = d + w$ for some $w \in \mathcal{W}$, then Eq. (1.6) fails for $\xi = d + w$.

We conclude this section by proving an analogue of Proposition 2.5 in [DW4]. The arguments below using Prop. 1.4 are ones which will recur throughout this paper. Henceforth when we use the term “orthogonal” we mean orthogonal with respect to $J$ unless otherwise stated.

Theorem 1.5. $\mathcal{C}$ lies in the hyperplane $\{\bar x : \sum \bar x_i = \frac{1}{2}(n-1)\}$ (possibly after subtracting a constant from the superpotential).

Proof.
We can assume $0 \notin \mathcal{C}$ by subtracting a constant from the superpotential. We shall also use repeatedly below the fact that as $J$ has signature $(1, r-1)$ there are no null planes, only null lines.

Denote by $H_\lambda$ the hyperplane $\sum \bar x_i = \lambda$, so $\frac{1}{2}(d + \mathcal{W})$ lies in $H_{\frac{1}{2}(n-1)}$. Suppose there exist elements of $\mathcal{C}$ with $\sum \bar x_i > \frac{1}{2}(n-1)$. Let $\lambda_{\max}$ denote the greatest value of $\sum \bar x_i$ over $\mathcal{C}$. If $\tilde a \tilde c$ is an edge of ${\rm conv}(\mathcal{C}) \cap H_{\lambda_{\max}}$, then Prop. 1.4 shows that $\tilde a, \tilde c$ are null, and that $\tilde c$ is orthogonal to the element of $\mathcal{C}$ closest to it on the edge. Hence $\tilde c$ is orthogonal to the whole edge. Now $J$ is totally null on ${\rm Span}\{\tilde a, \tilde c\}$, so since there are no null planes, $\tilde a, \tilde c$ are proportional, which is impossible as they are both in $H_{\lambda_{\max}}$. So $\mathcal{C} \cap H_{\lambda_{\max}}$ is a single point $\tilde c_{\max}$, which is null.

Next we claim that all elements of $\mathcal{C}$ lying in the half-space $\sum \bar x_i > \frac{1}{2}(n-1)$ must be multiples of $\tilde c_{\max}$. If not, let $\lambda_*$ be the greatest value such that there is an element of $\mathcal{C}$, not proportional to $\tilde c_{\max}$, in $H_{\lambda_*}$. Let $\tilde a$ be a vertex of ${\rm conv}(\mathcal{C}) \cap H_{\lambda_*}$, not proportional to $\tilde c_{\max}$. Now, by Prop. 1.4, $J(\tilde a, \tilde c_{\max}) = 0$, and so $\tilde a$ is not null. Since $\lambda_* > \frac{1}{2}(n-1)$, we see $\tilde a + \tilde a$ must be written in another way as a sum of two non-orthogonal elements of $\mathcal{C}$. This sum must be of the form $\mu \tilde c_{\max} + \tilde f$. But $\tilde c_{\max}$ is orthogonal to $\tilde a$ and to itself, hence to $\tilde f$, a contradiction establishing our claim.

Similarly, all elements of $\mathcal{C}$ lying in $\sum \bar x_i < \frac{1}{2}(n-1)$ are multiples of an element $\tilde c_{\min}$, should they occur. (Note that $J$ is negative definite on $H_0$ and we have assumed $0 \notin \mathcal{C}$ so $\lambda_{\min} \neq 0$.) We denote the sets of elements lying in these open half-spaces by $\mathcal{C}_+$ and $\mathcal{C}_-$ respectively.

Note that, when non-empty, $\mathcal{C}_+$ and $\mathcal{C}_-$ are orthogonal to all elements of $\mathcal{C} \cap H_{\frac{1}{2}(n-1)}$. (For if $\tilde a \in \mathcal{C} \cap H_{\frac{1}{2}(n-1)}$ then $\tilde a + \tilde c_{\max}$ cannot be written in another way as a sum of two non-orthogonal elements of $\mathcal{C}$.) In particular, if $\tilde c_{\max}$ and $\tilde c_{\min}$ are orthogonal, then $\tilde c_{\max}$ is orthogonal to all of ${\rm conv}(\mathcal{C})$, which is $r$-dimensional by assumption. So $\tilde c_{\max}$ is zero, a contradiction. The same argument implies that $\mathcal{C}_+$ and $\mathcal{C}_-$ are both non-empty.

Let $\nu \tilde c_{\min}$ and $\mu \tilde c_{\max}$ be respectively the elements of $\mathcal{C}_-$ and $\mathcal{C}_+$ closest to $H_{\frac{1}{2}(n-1)}$. Suppose that $\tilde c_{\max} + \nu \tilde c_{\min} = \tilde c^{(1)} + \tilde c^{(2)}$ with $\tilde c^{(i)} \in \mathcal{C}$ and $J(\tilde c^{(1)}, \tilde c^{(2)}) \neq 0$. Non-orthogonality means the $\tilde c^{(i)}$ cannot belong to the same side of $H_{\frac{1}{2}(n-1)}$ and by the choice of $\nu$, they cannot belong to opposite sides of $H_{\frac{1}{2}(n-1)}$. Both therefore lie in $H_{\frac{1}{2}(n-1)}$. But by the previous paragraph, $J(\tilde c_{\max} + \nu \tilde c_{\min}, \tilde c^{(1)} + \tilde c^{(2)}) = 0$. This means that $\tilde c_{\max} + \nu \tilde c_{\min}$ is null, which contradicts $J(\tilde c_{\max}, \tilde c_{\min}) \neq 0$. Hence $\tilde c_{\max} + \nu \tilde c_{\min}$ lies in $d + \mathcal{W} \subset H_{n-1}$. Applying the same argument to $\tilde c_{\min} + \mu \tilde c_{\max}$, we find that in fact $\mu = \nu = 1$, i.e., $\mathcal{C}_+ = \{\tilde c_{\max}\}$ and $\mathcal{C}_- = \{\tilde c_{\min}\}$.

Now $\mathcal{C} \cap H_{\frac{1}{2}(n-1)}$ (and hence its convex hull) is contained in the hyperplanes $\tilde c_{\max}^\perp$, $\tilde c_{\min}^\perp$ in $H_{\frac{1}{2}(n-1)}$. These hyperplanes are distinct as $\tilde c_{\max}$ is orthogonal to itself but not to $\tilde c_{\min}$. Hence $d + \mathcal{W} \subset (\mathcal{C} + \mathcal{C}) \cap H_{n-1}$ is contained in the union of the point $\tilde c_{\max} + \tilde c_{\min}$ and the codimension 2 subspace $\tilde c_{\max}^\perp \cap \tilde c_{\min}^\perp$ of $H_{n-1}$. So ${\rm conv}(d + \mathcal{W})$ is contained in a codimension 1 subspace of $H_{n-1}$, contradicting our assumption that $\dim {\rm conv}(d + \mathcal{W}) = r - 1$.

Remark 1.6. A notational difficulty arises from the fact that, as seen above, points of $\mathcal{C}$ are on the same footing as points in $\frac{1}{2}(d + \mathcal{W})$ rather than points of $\mathcal{W}$. Accordingly, we shall use letters $c, u, v, \ldots$ to denote elements of the hyperplane $\sum u_i = -1$ (such as elements of $\mathcal{W}$), and $\bar c, \bar u, \bar v, \ldots$ to denote the associated elements $\frac{1}{2}(d + c), \frac{1}{2}(d + u), \frac{1}{2}(d + v), \ldots$ of the hyperplane $\sum \bar u_i = \frac{1}{2}(n-1)$ (such as elements of $\mathcal{C}$ or of $\frac{1}{2}(d + \mathcal{W})$). Note that for any convex or indeed affine sum $\sum_j \lambda_j \xi^{(j)}$ of vectors $\xi^{(j)}$ in $\mathbb{R}^r$, we have
$$\overline{\sum_j \lambda_j \xi^{(j)}} = \sum_j \lambda_j\, \overline{\xi^{(j)}}.$$

Since we now know that the set $\mathcal{C}$, like $\frac{1}{2}(d + \mathcal{W})$, lies in $H_{\frac{1}{2}(n-1)} := \{\bar x : \sum \bar x_i = \frac{1}{2}(n-1)\}$, we will adopt the convention, as in the last paragraph, that when we refer to hyperplanes such as $\bar c^\perp$ in the rest of the paper, we mean “affine hyperplanes in $H_{\frac{1}{2}(n-1)}$”.

2. The Classification Theorem and the Strategy of its Proof

We can now state the main theorem of the paper.

Theorem 2.1. Let $G$ be a compact connected Lie group and $K$ a closed connected subgroup such that the isotropy representation of $G/K$ is the direct sum of $r$ pairwise inequivalent $\mathbb{R}$-irreducible summands. Assume that $\dim {\rm conv}(\mathcal{W}) = r - 1$, where $\mathcal{W}$ is the set of weights of the scalar curvature function of $G/K$ (cf. Sect. 1). (This holds, for example, if $G$ is semisimple.) If the cohomogeneity one Ricci-flat equations with $G/K$ as principal orbit admit a superpotential of form (1.5) where $\mathcal{C}$ contains a $J$-null vertex, then we are in one of the following situations (up to permutations of the irreducible summands):
(i) $\mathcal{W} = \{(-1^i), (1^1, -2^i) : 2 \le i \le r\}$, $d_1 = 1$, $\mathcal{C} = \frac{1}{2}(d + \{(-1^1), (1^1, -2^i) : 2 \le i \le r\})$ and $r \ge 2$;
(ii) $r \le 3$.

Remark 2.2. As mentioned before, the situation where C has no null vertex was analysed in [DW4]. Hence, except for the r ≤ 3 case, Theorem 2.1 completes the classification of superpotentials of scalar curvature type subject to the above assumptions on G and K . Remark 2.3. The first case of the above theorem is realized by certain circle bundles over a product of r − 1 Fano (homogeneous) Kähler-Einstein manifolds (cf. Example 8.1 in [DW4], and [BB,WW,CGLP3]), and the subsystem of the Ricci-flat equations singled out by the superpotential in these examples corresponds to the Calabi-Yau condition. For more on the r = 2 case, see the concluding remarks in Sect. 10. Remark 2.4. Theorem 2.1 remains true if we replace the connectedness of G and K by the connectedness of G/K and the extra condition on the isotropy representation given by the second statement in Remark 1.2(d), i.e., if pi is an irreducible summand of dimension 1 in the isotropy representation of G/K , then [pi , p j ] ⊂ k ⊕ p j for all j = i. This weaker property does hold in practice. For example, the exceptional AloffWallach space N1,1 can be written as (SU (3) × )/(U1,1 · ), where U1,1 is the set of diagonal matrices of the form diag(exp(iθ ), exp(iθ ), exp(−2iθ )) and is the dihedral group with generators ⎛ ⎞ ⎛ 2πi/3 ⎞ 0 0 010 e ⎝ −1 0 0 ⎠ ⎝ 0 e−2πi/3 0 ⎠. 001 0 0 1 In order to prove Theorem 2.1 we have to analyse the situation when there is a null vertex c¯ ∈ C. As discussed in Sect. 1, conv(C) now strictly includes conv( 21 (d + W)) as c¯ is not in conv( 21 (d + W)). Our strategy is to take an affine hyperplane H separating c¯ from conv( 21 (d + W)), and consider the projection c¯ of conv( 21 (d + W)) onto H from c. ¯ Roughly speaking, whereas in [DW4] we could analyse the situation by looking at the vertices and edges of conv( 21 (d + W)), now, because we have projected onto a subspace of one lower dimension, we have to consider the 2-dimensional faces of conv( 21 (d + W)) also. 
This is a natural method of dealing with the situation of a point outside a convex polytope. It has some relation to the notion of “lit set” introduced in a quite different context by Ginzburg-Guillemin-Karshon [GGK]. The analysis in the next section will show that the vertices of the projected polytope can be divided into three types (Theorem 3.8). We label these types (1A), (1B) and (2). Roughly, these correspond to vertices orthogonal to c, ¯ vertices ξ¯ such that the line through c¯ and ξ¯ meets conv( 21 (d + W)) at a vertex, and vertices ξ¯ such that this line meets conv( 21 (d + W)) in an edge.


In the remainder of the paper we shall gradually narrow down the possibilities for each type. In Sect. 3 we begin a classification of type (2) vertices. In Sect. 4 we are able to deduce that conv( 21 (d + W)) lies in the half space J (c, ¯ ·) ≥ 0. We are able to deduce an orthogonality result for vectors on edges in conv( 21 (d + W)) ∩ c¯⊥ . This is analogous to the key result Theorem 3.5 of [DW4] that held (in the more restrictive situation of that paper) for general edges in conv( 21 (d + W)). In Sect. 5 we exploit this result and some estimates to classify the possible configurations of (1A) vertices (i.e. vertices in c¯⊥ ), see Theorem 5.18. In Sect. 6 we attack the (1B) vertices, exploiting the fact that adjacent (1B) vertices give rise to a 2-dimensional face of conv( 21 (d + W)). This is the most laborious part of the paper, as it involves a case-by-case analysis of such faces. We show that adjacent (1B) vertices can arise only in a very small number of situations (Theorem 6.18). In Sect. 7 we exploit the listing of 2-dim faces to show that there is at most one type (2) vertex, except in two special situations (Theorem 7.1). In Sect. 8 and 9, we eliminate more possibilities for adjacent (1B) and type (2) vertices. We find that if r ≥ 4 then we are either in case (i) of the theorem or there are no type (2) vertices and no adjacent type (1B) vertices. Using the results of Sect. 4, in the latter case we find that all vertices are (1A) except for a single (1B). Building on the results of Sect. 5 for (1A) vertices, we are able to rule out this situation in Sect. 10, see Theorem 10.15 and Corollary 10.16. 3. Projection onto a Hyperplane We first present some results about null vectors in H 1 (n−1) . 2

Remark 3.1. From Eq. (1.3), the set of null vectors in the hyperplane $H_{\frac{1}{2}(n-1)}$ forms an ellipsoid $\{\sum x_i^2/d_i = 1\}$. If $\bar c$ is null, then the hyperplane $\bar c^\perp$ in $H_{\frac{1}{2}(n-1)}$ is the tangent space to this ellipsoid. So any element $\bar x \neq \bar c$ of $\bar c^\perp$ satisfies $J(\bar x, \bar x) < 0$.

Lemma 3.2. Let $x, y$ satisfy $\sum x_i = \sum y_i = -1$. Suppose that $J(\bar x, \bar x)$ and $J(\bar y, \bar y) \ge 0$. Then $J(\bar x, \bar y) \ge 0$, with equality iff $\bar x$ is null and $\bar x = \bar y$. In particular, if $\bar x, \bar y$ are distinct null vectors then $J(\bar x, \bar y) > 0$.

Proof. This follows from Eq. (1.3) and the Cauchy-Schwarz inequality.
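Lemma 3.2 can be probed experimentally. By Eq. (1.3), for $x$ with $\sum x_i = -1$ one has $J(\bar x, \bar y) = \frac{1}{4}\big(1 - \sum x_i y_i/d_i\big)$, so the hypotheses say that $x$ and $y$ lie in the closed ellipsoid $\sum x_i^2/d_i \le 1$, and the Cauchy-Schwarz inequality for the weighted inner product then gives $\sum x_i y_i/d_i \le 1$. The sketch below samples rational vectors and tests this in exact arithmetic; the helper names, the grid of sample values and the dimension vector `[1, 2, 3]` are illustrative assumptions.

```python
from fractions import Fraction
import random


def J_bar(x, y, d):
    """J(x_bar, y_bar) with x_bar = (d + x)/2, via Eq. (1.3); needs sum(x) = -1."""
    return (1 - sum(Fraction(a) * b / di for a, b, di in zip(x, y, d))) / 4


def random_vector(rng, d, grid=12):
    """Random rational vector whose entries sum to -1."""
    head = [Fraction(rng.randint(-2 * grid, 2 * grid), grid)
            for _ in range(len(d) - 1)]
    return head + [Fraction(-1) - sum(head)]


def check_lemma(d, trials=2000, seed=1):
    """Return (no counterexample found, number of pairs meeting the hypotheses)."""
    rng = random.Random(seed)
    checked = 0
    for _ in range(trials):
        x, y = random_vector(rng, d), random_vector(rng, d)
        if J_bar(x, x, d) >= 0 and J_bar(y, y, d) >= 0:
            checked += 1
            if J_bar(x, y, d) < 0:
                return False, checked
    return True, checked


ok, checked = check_lemma([1, 2, 3])
print(ok, checked)
```

For a concrete instance with $d = (1, 2)$: the vector $x = (-1, 0)$ is null, the vector $y = (1, -2)$ has $J(\bar y, \bar y) = -\frac{1}{2}$, and $J(\bar x, \bar y) = \frac{1}{2}$.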

Proposition 3.3. Let $H = \{\bar x : h(\bar x) = \lambda\}$ be an affine hyperplane, where $h$ is a linear functional such that ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$ lies in the open half-space $\{\bar x : h(\bar x) < \lambda\}$. Then there is at most one element of $\mathcal{C}$ in the complementary open half-space $\{\bar x : h(\bar x) > \lambda\}$. Such an element is a null vertex of ${\rm conv}(\mathcal{C})$. Hence any element of $\mathcal{C}$ outside ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$ is a null vertex of $\mathcal{C}$.

Proof. Suppose the points of $\mathcal{C}$ with $h(\bar x) > \lambda$ are $\bar c^{(1)}, \ldots, \bar c^{(m)}$ with $m > 1$. Our result is stable with respect to sufficiently small perturbations of $H$, so we can assume that $h(\bar c^{(1)}) > h(\bar c^{(2)}) \ge h(\bar c^{(3)}), \ldots, h(\bar c^{(m)})$. Now $\bar c^{(1)} + \bar c^{(1)}$ and $\bar c^{(1)} + \bar c^{(2)}$ cannot be written in any other way as the sum of two elements of $\mathcal{C}$. Hence, by Prop. 1.4, $\bar c^{(1)}$ is null and $J(\bar c^{(1)}, \bar c^{(2)}) = 0$. The only other way $\bar c^{(2)} + \bar c^{(2)}$ can be written is as $\bar c^{(1)} + \bar c$ for some $\bar c \in \mathcal{C}$. But then $\bar c = 2\bar c^{(2)} - \bar c^{(1)}$, so $J(\bar c^{(1)}, \bar c) = 0$, and such sums will not contribute. Hence $J(\bar c^{(2)}, \bar c^{(2)}) = 0$, contradicting Lemma 3.2.


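Corollary 3.4. For distinct elements $\bar c, \bar a$ of $\mathcal{C}$, the line segment $\bar c \bar a$ meets ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$.

This gives us some control over the extent to which ${\rm conv}(\mathcal{C})$ can be bigger than the set ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$.

Lemma 3.5. Let $A \subset H_{\frac{1}{2}(n-1)}$ be an affine subspace such that $A \cap {\rm conv}(\frac{1}{2}(d + \mathcal{W}))$ is a face of ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$. Suppose there exists $\bar c \in \mathcal{C} \cap A$ with $\bar c \notin {\rm conv}(\frac{1}{2}(d + \mathcal{W}))$. Let $\bar x \in A$. If $\bar x = \frac{1}{2}(\bar a + \bar a')$ with $\bar a, \bar a' \in \mathcal{C}$, then in fact $\bar a, \bar a' \in A$.

Proof. If $\bar a$ or $\bar a'$ equals $\bar c$ this is clear. We know by Cor. 3.4 that if $\bar a, \bar a' \neq \bar c$ then the segments $\bar a \bar c$, $\bar a' \bar c$ meet ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$. So there exist $0 < s, t \le 1$ with $t\bar a + (1-t)\bar c$ and $s\bar a' + (1-s)\bar c$ in ${\rm conv}(\frac{1}{2}(d + \mathcal{W}))$. Hence
$$\frac{2st}{s+t}\,\bar x + \Big(1 - \frac{2st}{s+t}\Big)\bar c = \frac{s}{s+t}\big(t\bar a + (1-t)\bar c\big) + \frac{t}{s+t}\big(s\bar a' + (1-s)\bar c\big) \in {\rm conv}(\tfrac{1}{2}(d + \mathcal{W})).$$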
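The displayed identity in the proof of Lemma 3.5 expresses one point in two ways: as an affine combination of $\bar x$ and $\bar c$, and as a convex combination of the two points on the segments towards $\bar c$. It can be checked directly in exact arithmetic. The following sketch does so for arbitrary sample vectors and parameters; all numerical values are hypothetical and serve only to exercise the algebra.

```python
from fractions import Fraction


def combo(t, p, q):
    """The affine combination t*p + (1 - t)*q of two vectors."""
    return [t * a + (1 - t) * b for a, b in zip(p, q)]


def both_sides(a, a2, c, s, t):
    """Left and right sides of the identity, with x = (a + a2)/2."""
    x = [(u + v) / 2 for u, v in zip(a, a2)]
    lam = 2 * s * t / (s + t)
    lhs = combo(lam, x, c)
    p = combo(t, a, c)     # t*a + (1 - t)*c
    q = combo(s, a2, c)    # s*a2 + (1 - s)*c
    # weights s/(s+t) and t/(s+t) are positive and sum to 1
    rhs = combo(s / (s + t), p, q)
    return lhs, rhs


a = [Fraction(1), Fraction(0), Fraction(2)]
a2 = [Fraction(0), Fraction(3), Fraction(-1)]
c = [Fraction(2), Fraction(2), Fraction(2)]
lhs, rhs = both_sides(a, a2, c, Fraction(1, 3), Fraction(3, 4))
print(lhs == rhs)
```

Because the weights $s/(s+t)$ and $t/(s+t)$ are positive, the right-hand side is a genuine convex combination, which is what allows the face property to be applied in the next step of the proof.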

As it is an affine combination of x, ¯ c¯ this point also lies in A, so it lies in A ∩ conv( 21 (d + W)). Also, it is a convex linear combination of the points t a¯ + (1 − t)c¯ and s a¯ + (1 − s)c¯ of conv( 21 (d + W)). Hence, by our face assumption, both these points lie in A, so a, ¯ a¯ lie in A. Remark 3.6. The above lemma will be very useful because it means that in all our later calculations using Prop. 1.4 for a face defined by an affine subspace A, we need only consider elements of C lying in A. Proposition 3.7. Let vw be an edge of conv(W) and suppose v, ¯ w¯ ∈ C. (i) If there are no points of W in the interior of vw, then J (v, ¯ w) ¯ = 0. (ii) If u = 21 (v + w) is the unique point of W in the interior of vw, J (v, ¯ w) ¯ > 0, and u is type II or III, then Fv¯ , Fw¯ are of opposite signs. Proof. Part(i) is a generalization of Theorem 3.5 in [DW4] and we will be able to apply the proof of that result after the following argument. Let the edge v¯ w¯ of conv( 21 (d + W)) be defined by equations x, ¯ u (i) = λi : i ∈ I, where x, ¯ u (i) ≤ λi for i ∈ I and 1 x¯ ∈ conv( 2 (d + W)). (In the above, , is the Euclidean inner product in Rr .) Note that Span {u (i) : i ∈ I} is the , -orthogonal complement of the direction of the edge. Let H be a hyperplane whose intersection with conv( 21 (d + W)) is the edge v¯ w. ¯ We (i) = can take H to be defined by the equation x, ¯ b u b λ , where b i are i∈I i i∈I i i arbitrary positive numbers summing to 1. If a, ¯ a¯ are elements of C whose midpoint lies in v¯ w, ¯ then either they are both in H or one of them, a¯ say, is on the opposite side of H from conv( 21 (d + W)). In the latter case a¯ is null and the only element of C on this side of H so, by Prop. 1.4, is J -orthogonal to v, ¯ w. ¯ Hence, as 21 (a¯ + a¯ ) is an affine combination of v, ¯ w, ¯ we see that J (a, ¯ a¯ ) = 0, and so such sums do not contribute in Eq.(1.6). We may therefore assume that a, ¯ a¯ are


in H . But as this is true for all H of the above form, the only sums that will contribute are those where a, ¯ a¯ are collinear with v¯ w. ¯ Now if a, ¯ say, lies outside the line segment v¯ w, ¯ then it is null and J -orthogonal to v¯ or w, ¯ and hence to the whole line. So the only sums which contribute are those where a, ¯ a¯ lie on the line segment v¯ w. ¯ Now the proof of Theorem 3.5 in [DW4] gives (i). Turning to (ii), note first that the above arguments and Prop. 1.4 give (ii) immediately if no interior points of the edge v¯ w¯ lie in C. If there are m interior points in C, we again proceed as in the proof of Theorem 3.5 in [DW4] and use the notation there. We may assume that Lemma 3.2 (and hence Cor. 3.3 and Lemma 3.4) of [DW4] still holds; for the only issue is the statement for λm+1 , but if c(0) + c(λm+1 ) cannot be written as c(λ j ) + c(λk ) (0 < λ j , λk < m + 1) then what we want to prove is already true. Now Lemma 3.4 in [DW4] and our hypothesis J (v, ¯ w) ¯ > 0 imply that J00 < 0 and Jλi ,λ j > 0 except in the three cases listed there. The proof that the elements of C are equi-distributed in v¯ w¯ carries over from [DW4] since the midpoint u¯ is not involved in the arguments. Suppose next that the points in C ∩ v¯ w¯ are equi-distributed. In the special case where m = 1, we have J (v, ¯ u) ¯ = 0 = J (u, ¯ w), ¯ which imply J (u, ¯ u) ¯ = 0. So the midpoint does not contribute to the equation from c(0) +c(λm+1 ) . If m > 1, we write down the equations arising from c(0) + c(λm+1 ) and c(λm−1 ) + c(λm+1 ) . The formula for Fλ j in [DW4] still holds for 1 ≤ j ≤ m, and using this and the second equation we obtain the analogous formula for Fλm+1 . Putting all the above information together in the first equation and using Au < 0, we /F0m−1 is positive if m is even and negative if m is odd. In either case it see that Fλm+1 1 follows immediately that F0 Fλm+1 < 0, as required. 
We shall now set up the basic machinery of the projection of our convex hull onto an affine hyperplane. Let $\bar c$ be a null vector in $C$ and let $H$ be an affine hyperplane separating $\bar c$ from $\mathrm{conv}(\tfrac12(d+W))$. Define a map $P : \mathrm{conv}(\tfrac12(d+W)) \to H$ by letting $P(\bar z)$ be the intersection point of the ray $\bar c\bar z$ with $H$. We denote by $\Delta$ the image of $P$ in $H$. ($P$ and $\Delta$ of course depend on $\bar c$ and the choice of $H$. When considering projections from several null vertices, we will use the vertices as superscripts to distinguish the cases, e.g., $\Delta^{\bar c}, \Delta^{\bar b}$.)

Let us now consider a vertex $\bar\xi$ of $\Delta$. We know that $\bar c$ and $\bar\xi$ are collinear with a subset $P^{-1}(\bar\xi)$ of $\mathrm{conv}(\tfrac12(d+W))$. As $\bar\xi$ does not lie in the interior of a positive-dimensional subset of $\Delta$, we see that no point of $P^{-1}(\bar\xi)$ lies in the interior of a subset of $\mathrm{conv}(\tfrac12(d+W))$ of dimension $> 1$. So $P^{-1}(\bar\xi)$ is a vertex or an edge of $\mathrm{conv}(\tfrac12(d+W))$. If $P^{-1}(\bar\xi)$ is a vertex $\bar x$, then $2\bar x \in d + W$ and in Lemma 3.5 we can take the affine subspace $A$ to be the line through $\bar c, \bar\xi, \bar x$. Using this lemma and also Prop. 1.4 and Cor. 3.4 we see that either $\bar x \in C$ (in which case $J(\bar x, \bar c) = 0$), or $\bar x \notin C$ and $\bar x = (\bar a + \bar c)/2$ for some null element $\bar a \in C \cap A$. We have therefore deduced

Theorem 3.8. Let $\bar\xi$ be a vertex of $\Delta$. Then exactly one of the following must hold:
(1A) $\bar\xi$ (and hence $P^{-1}(\bar\xi)$) is orthogonal to $\bar c$;
(1B) The line through $\bar c, \bar\xi$ meets $\mathrm{conv}(\tfrac12(d+W))$ in a unique point $\bar x$, and there exists a null $\bar a \in C$ such that $(\bar a + \bar c)/2 = \bar x$;
(2) $\bar\xi$ is not orthogonal to $\bar c$, and $\bar c$ and $\bar\xi$ are collinear with an edge $\bar v\bar w$ of $\mathrm{conv}(\tfrac12(d+W))$ (and hence $c$ and $\xi$ are collinear with the corresponding edge $vw$ of $\mathrm{conv}(W)$).
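The projection $P$ used above is an ordinary central projection from the null vertex onto a hyperplane. The following is a minimal sketch of that map in exact rational arithmetic; the function name `project`, the description of $H$ by a normal vector and an offset, and the toy data are assumptions made for illustration and do not come from the paper.

```python
from fractions import Fraction

def project(c, z, normal, offset):
    """Point where the ray from c through z meets H = {x : <normal, x> = offset}."""
    direction = [zi - ci for zi, ci in zip(z, c)]
    denom = sum(n * d for n, d in zip(normal, direction))
    if denom == 0:
        raise ValueError("ray is parallel to H")
    t = (offset - sum(n * ci for n, ci in zip(normal, c))) / denom
    if t <= 0:
        raise ValueError("H is not met by the ray from c through z")
    return [ci + t * di for ci, di in zip(c, direction)]

# Toy data (hypothetical): c at the origin, H = {x_1 + x_2 = 1}.
c = [Fraction(0), Fraction(0)]
H_normal, H_offset = [Fraction(1), Fraction(1)], Fraction(1)

# Two points collinear with c have the same image, which is why the
# preimage of a vertex of the image polytope is a vertex or an edge.
image = project(c, [Fraction(2), Fraction(1)], H_normal, H_offset)
image_far = project(c, [Fraction(4), Fraction(2)], H_normal, H_offset)
```

Because $H$ separates $\bar c$ from the convex hull, the ray parameter `t` lies strictly between 0 and 1 for every point of the hull.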

Remark 3.9. If (1B) occurs, then $\bar a = 2\bar x - \bar c$ being null is equivalent to $J(\bar x, \bar x) = J(\bar x, \bar c)$, that is,
\[ \sum_{i=1}^r \frac{x_i^2}{d_i} = \sum_{i=1}^r \frac{x_i c_i}{d_i}. \tag{3.1} \]
In particular $x_i$ and $c_i$ are nonzero for some common index $i$. We will from now on refer to this situation by saying that the vectors $x$ and $c$ overlap.

We make a preliminary remark about (1A) vertices.

Lemma 3.10. Suppose that $u \in W$ and $\bar u \in \bar c^{\perp}$.
(a) If $u = (-2^i, 1^j)$ then $c_i \neq 0$.
(b) Suppose that $K$ is connected. If $u = (-1^i, -1^j, 1^k)$, then $c_i, c_j, c_k$ are all nonzero.

Proof. After a suitable permutation, we may let $1, \dots, s$ be the indices $a$ with $c_a \neq 0$. We need
\[ \sum_{a=1}^s \frac{u_a c_a}{d_a} = \sum_{a=1}^s \frac{c_a^2}{d_a} = 1. \]
In case (a) this is impossible if $c_i = 0$ (that is, $i \notin \{1, \dots, s\}$) as then we need $d_j = 1 = c_j$ and $c_a = 0$ for $a \neq j$, contradicting $\sum_{k=1}^r c_k = -1$.
Next, Cauchy-Schwarz on $(u_a/\sqrt{d_a})_{a=1}^s$, $(c_a/\sqrt{d_a})_{a=1}^s$ shows $\sum_{a=1}^s u_a^2/d_a \ge 1$. In case (b), if, say, $c_k = 0$, then since $\frac{1}{d_i} + \frac{1}{d_j} \ge 1$ and $d_i, d_j \ge 2$ (see Remark 1.2(d)) we must have $d_i = d_j = 2$. The equations then imply $c_i = c_j = -1$ and $c_a = 0$ for $a \neq i, j$, also giving a contradiction. Similar arguments rule out $c_i = 0$ or $c_j = 0$.

In the next two sections we shall get stronger results on (1A) vertices. Let us now consider type (2) vertices.

Theorem 3.11. Consider a type (2) vertex $\bar\xi$ of $\Delta$. So $c$ and $\xi$ are collinear with an edge $vw$ of $\mathrm{conv}(W)$. Suppose there are no points of $W$ in the interior of $vw$. Then we have (i) $c = 2v - w$ or (ii) $c = (4v - w)/3$. In (i) the points of $C$ on the line through $\bar c, \bar\xi$ are $\bar c$ and $\bar w$. In (ii) they are $\bar c$, $\bar w$ and $\bar c^{(1)} = (2\bar v + \bar w)/3 = (\bar c + \bar w)/2$. We need $J(\bar c^{(1)}, \bar w) = 0$.

Proof. This is very similar to the arguments of Sect. 3 in [DW4]. We apply Lemma 3.5 to the line through $\bar v, \bar w$.

(A) We write the elements of $C$ on the line as $\bar c = \bar c^{(0)}, \bar c^{(1)}, \dots, \bar c^{(m+1)}$ with $m \ge 0$. So $\bar c^{(m+1)}$ is either null or is $\bar w$. No other $\bar c^{(j)}$ can lie beyond $\bar w$, by Cor. 3.4. By assumption $\bar c = \bar c^{(0)}$ is not orthogonal to the whole line. As $\bar c$ is null, this means $\bar c$ is not orthogonal to any other point on the line. So $\bar c^{(0)} + \bar c^{(j)}$ is either $2\bar v$, $2\bar w$ or else is a sum of two other $\bar c^{(i)}$. In particular, $\bar c^{(0)} + \bar c^{(1)} = 2\bar v$. In fact $\bar c^{(0)} + \bar c^{(j)}$ is never $2\bar w$; for the only possibility is for $\bar c^{(0)} + \bar c^{(m+1)} = 2\bar w$, in which case $\bar c^{(m+1)}$ is null, and so $\bar c^{(m)} + \bar c^{(m+1)} = 2\bar w$, contradicting $\bar v \neq \bar w$. We deduce that for $j > 1$, we have $\bar c^{(0)} + \bar c^{(j)} = \bar c^{(k)} + \bar c^{(p)}$ for some $1 \le k, p \le j - 1$.

(B) Let $\bar c^{(m+1)}$ be null. Since the segment $\bar c^{(0)}\bar c^{(m+1)}$ lies in the interior of the null ellipsoid, Lemma 3.2 implies that $J(\bar c^{(i)}, \bar c^{(j)}) > 0$ unless $i = j = 0$ or $m+1$. Arguments very similar to those in Sect. 3 of [DW4] enable us to determine the signs of the $F_{\bar c^{(j)}}$ in (1.5) and show that the contributions from the pairs summing to $\bar c^{(1)} + \bar c^{(m+1)}$ cannot cancel. So we have a contradiction unless $\bar c^{(1)} + \bar c^{(m+1)} = 2\bar w$, which can only happen if $m = 1$, i.e., $\bar c^{(0)} + \bar c^{(1)} = 2\bar v$, $\bar c^{(1)} + \bar c^{(2)} = 2\bar w$ and $\bar c^{(0)} + \bar c^{(2)} = 2\bar c^{(1)}$ (otherwise $\bar c^{(0)} + \bar c^{(2)}$ cannot cancel). Hence we have
\[ c = (3v - w)/2; \qquad c^{(1)} = (v + w)/2; \qquad c^{(2)} = (3w - v)/2. \]
Writing $F_j$ for $F_{\bar c^{(j)}}$, we need $2F_0F_2\,J(\bar c, \bar c^{(2)}) + F_1^2\,J(\bar c^{(1)}, \bar c^{(1)}) = 0$ so that the contributions from $\bar c^{(0)} + \bar c^{(2)}$ and $\bar c^{(1)} + \bar c^{(1)}$ cancel. As $J(\bar c, \bar c^{(2)})$ and $J(\bar c^{(1)}, \bar c^{(1)}) > 0$, we need $F_0$ and $F_2$ to have opposite signs. Now, as $J(\bar c, \bar c^{(1)}), J(\bar c^{(1)}, \bar c^{(2)}) > 0$, we see that $A_v$ and $A_w$ have opposite signs. So we may let $w$ be type I and $v$ be type II or III, as long as the asymmetry between $\bar c^{(0)}$ and $\bar c^{(2)}$ is removed. Note that $v, w$ cannot overlap if $v$ is type II, as then Remark 1.2(c) means $w$ is not a vertex. The possibilities are (up to permutation)

       v                       w                     c^(0) = (3v − w)/2              c^(2) = (3w − v)/2
(1)    (−2, 1, 0, ...)         (−1, 0, ...)          (−5/2, 3/2, 0, ...)             (−1/2, −1/2, 0, ...)
(2)    (−2, 1, 0, ...)         (0, −1, 0, ...)       (−3, 2, 0, ...)                 (1, −2, 0, ...)
(3)    (−2, 1, 0, ...)         (0, 0, −1, 0, ...)    (−3, 3/2, 1/2, ...)             (1, −1/2, −3/2, 0, ...)
(4)    (1, −1, −1, 0, ...)     (0, 0, 0, −1, ...)    (3/2, −3/2, −3/2, 1/2, ...)     (−1/2, 1/2, 1/2, −3/2, ...)

Now, it is clear in (1) and (2) that $\bar c^{(0)}$ and $\bar c^{(2)}$ cannot both give null vectors. For (3) and (4), we find that the nullity equations for $\bar c^{(0)}$ and $\bar c^{(2)}$ have no integral solutions in $d_i$ (in fact $d_3$ (resp. $d_4$) must be $5/2$). Therefore in fact $\bar c^{(m+1)}$ cannot be null.

(C) Now suppose that $\bar c^{(m+1)} = \bar w$ and $m > 0$. Since $J(\bar c, \bar v) \neq 0$, $\bar v$ must lie between $\bar c^{(0)}$ and $\bar c^{(1)}$. So $J(\bar c^{(0)}, \cdot)$ and $J(\cdot, \bar c^{(m+1)})$ are affine functions on the line, vanishing at $\bar c^{(0)}$ and $\bar c^{(m)}$ respectively. Hence $J(\bar c^{(0)}, \bar c^{(i)})$ ($i \ge 1$) and $J(\bar c^{(i)}, \bar c^{(m+1)})$ ($0 \le i \le m-1$) are the same sign as $J(\bar c^{(0)}, \bar c^{(1)})$. It follows that $J(\bar c^{(i)}, \cdot)$ is an affine function on the line, taking the same sign as $J(\bar c^{(0)}, \bar c^{(1)})$ at $\bar c^{(0)}, \bar c^{(m+1)}$ (for $1 \le i \le m-1$). Thus $J(\bar c^{(i)}, \bar c^{(j)})$ is the same sign as $J(\bar c^{(0)}, \bar c^{(1)})$ except for the cases
\[ J(\bar c^{(0)}, \bar c^{(0)}) = 0 = J(\bar c^{(m)}, \bar c^{(m+1)}); \qquad \mathrm{sign}\, J(\bar c^{(m+1)}, \bar c^{(m+1)}) = -\,\mathrm{sign}\, J(\bar c^{(0)}, \bar c^{(1)}). \]
It then follows that the sign and non-cancellation arguments of (B) (taken from Sect. 3 of [DW4]) still hold, except in the case $m = 1$. These give the two cases of the theorem. If $m = 0$, we have $c^{(1)} = w$ and $c^{(0)} = 2v - w$ as $c^{(0)} + c^{(1)} = 2v$. If $m = 1$, then $c^{(2)} = w$, $c^{(0)} + c^{(1)} = 2v$ and $c^{(0)} + c^{(2)} = 2c^{(1)}$ (for cancellation). Hence $c^{(0)} = (4v - w)/3$, $c^{(1)} = (2v + w)/3$, as well as $J(\bar c^{(1)}, \bar c^{(2)}) = 0$.

Remark 3.12. If there are points of $W$ in the interior of $vw$, we can still conclude that $c^{(0)} + c^{(1)} = 2v$. Hence $c = \lambda v + (1 - \lambda)w$ for $1 < \lambda \le 2$, since if $\lambda > 2$ then $\bar c^{(1)}$ is beyond $\bar w$. It must then be null, and $m = 0$, so there is no way of getting $2\bar w$ as a sum of two elements in $C$.
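The arithmetic in part (B) of the proof of Theorem 3.11 can be rechecked mechanically. The sketch below recomputes the endpoints $c^{(0)} = (3v-w)/2$ and $c^{(2)} = (3w-v)/2$ for the four listed rows and solves the two nullity equations $\sum_i c_i^2/d_i = 1$ in the unknowns $y_i = 1/d_i$. The helper names are hypothetical, and the elimination in `forced_last_dimension` relies on an observation about rows (3) and (4) (the squared entries of $c^{(0)}$ and $c^{(2)}$ are proportional in all but the last coordinate), which the code asserts rather than assumes silently.

```python
from fractions import Fraction as F

# (v, w) for rows (1)-(4); vectors are truncated to their relevant entries.
ROWS = {
    1: ((-2, 1), (-1, 0)),
    2: ((-2, 1), (0, -1)),
    3: ((-2, 1, 0), (0, 0, -1)),
    4: ((1, -1, -1, 0), (0, 0, 0, -1)),
}

def endpoints(v, w):
    """Return c^(0) = (3v - w)/2 and c^(2) = (3w - v)/2 as tuples of Fractions."""
    c0 = tuple(F(3 * x - y, 2) for x, y in zip(v, w))
    c2 = tuple(F(3 * y - x, 2) for x, y in zip(v, w))
    return c0, c2

def inverse_dims_2x2(c0, c2):
    """Rows (1), (2): solve both nullity equations for (y_1, y_2) by Cramer's rule."""
    a, b = c0[0] ** 2, c0[1] ** 2
    c, d = c2[0] ** 2, c2[1] ** 2
    det = a * d - b * c
    return (d - b) / det, (a - c) / det

def forced_last_dimension(c0, c2):
    """Rows (3), (4): eliminate every y_i except the last and return the forced d."""
    sq0 = [x * x for x in c0]
    sq2 = [x * x for x in c2]
    k = sq0[0] / sq2[0]
    # One multiple k must clear all coordinates but the last for this to work.
    assert all(s0 == k * s2 for s0, s2 in zip(sq0[:-1], sq2[:-1]))
    y_last = (k - 1) / (k * sq2[-1] - sq0[-1])
    return 1 / y_last

table = {n: endpoints(v, w) for n, (v, w) in ROWS.items()}
y_row1 = inverse_dims_2x2(*table[1])  # a negative 1/d_1 appears
y_row2 = inverse_dims_2x2(*table[2])  # 1/d_1 is forced to vanish
d3 = forced_last_dimension(*table[3])
d4 = forced_last_dimension(*table[4])
```

In rows (1) and (2) the solution has a non-positive $1/d_1$, and in rows (3) and (4) the forced dimension is not an integer, which matches the conclusion that $\bar c^{(m+1)}$ cannot be null.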

Lemma 3.13. For case (i) in Theorem 3.11 (i.e., $c = 2v - w$), either $w$ is type I, or $w$ is type III and $v_i = -1$, $w_i = -2$ for some index $i$.

Proof. It follows from above that $J(\bar w, \bar w)F_{\bar w}^2 = A_w$ so $J(\bar w, \bar w)$ is positive if $w$ is type I and negative if $w$ is type II or III. In the latter case, $\sum_i w_i^2/d_i > 1$, but by nullity, $c = 2v - w$ satisfies $\sum_i c_i^2/d_i = 1$. Hence for some $i$ we have $|w_i| > |c_i| = |2v_i - w_i|$. As $v_i, w_i \in \{-2, -1, 0, 1\}$, it follows that $v_i = -1$, $w_i = -2$.

We are now able to characterise the case where $c$ is a type I vector.

Theorem 3.14. If $c$ is a type I vector, say $(-1, 0, \dots)$ for definiteness, then $W$ is given by $\{(-1^i), (1^1, -2^i) : i = 2, \dots, r\}$.

Remark 3.15. Equivalently, $W$ is as in Ex 8.1 of [DW4], where the hypersurface in the Ricci-flat manifold is a circle bundle over a product of Kähler-Einstein Fano manifolds. A superpotential was found for this example in [CGLP3].

Proof. Nullity of $\bar c$ implies $d_1 = 1$, so $(-2^1, 1^i) \notin W$. Also $(-1^1, -1^j, 1^k) \notin W$, as then $c$ would be in $\mathrm{conv}(W)$. Let us consider the vertices $\bar\xi$ in $\Delta$. $\bar\xi$ cannot be of type (1A); otherwise $\xi_1 = -1$, which implies the existence of a type II vector in $W$ with a nonzero first component, contradicting the above. There can also be no $\bar\xi$ of type (1B) since by Remark 3.9 the vector $\bar x$ satisfies $0 < -x_1$, which we ruled out above. Hence all vertices of $\Delta$ are of type (2), i.e., correspond to edges $vw$ of $\mathrm{conv}(W)$ such that $c = \lambda v + (1 - \lambda)w$ and $\lambda > 1$. From this equation it follows that $v, w$ are of the form $v = (-1^i)$, $w = (1^1, -2^i)$ for some $i > 1$. As $\Delta$ (being an $(r-2)$-dimensional polytope in an $(r-2)$-dimensional affine space) has at least $r - 1$ vertices, such vectors occur for all $i \neq 1$. Now no type II vector can be in $W$, otherwise $v$ would not be a vertex. Also $(1^i, -2^j)$ with $i, j \neq 1$ cannot be in $W$, as then $(-1^j)$ would not be a vertex. We have already seen $(-2^1, 1^i)$ is not in $W$. So $W$ is as claimed.

We shall henceforth exclude this case, i.e. case (i) of Theorem 2.1, from our discussion.

We conclude this section by giving a preliminary listing of the possibilities for $c$ when we have a type (2) vertex. These are given by cases (i) and (ii) of Theorem 3.11, as well as the possible cases when there is a point of $W$ in the interior of $vw$. For Theorem 3.11(i) the possible $v, w, c$ are:

Table 1. c = 2v − w cases

       v                       w                      c = 2v − w
(1)    (−1, 1, −1, ...)        (−2, 1, ...)           (0, 1, −2, ...)
(2)    (−1, −1, 1, ...)        (−2, 1, ...)           (0, −3, 2, ...)
(3)    (−1, 0, −1, 1, ...)     (−2, 1, ...)           (0, −1, −2, 2, ...)
(4)    (−2, 1, ...)            (−1, 0, ...)           (−3, 2, ...)
(5)    (−2, 1, ...)            (0, 0, −1, ...)        (−4, 2, 1, ...)
(6)    (−1, 0, ...)            (0, −1, ...)           (−2, 1, ...)
(7)    (1, −1, −1, ...)        (0, 0, 0, −1, ...)     (2, −2, −2, 1, ...)

where ... denotes zeros as usual. To arrive at this list, recall from Lemma 3.13 that $w$ is either type I or type III with $v_i = -1$, $w_i = -2$ for some $i$. Note also that if $w$ is type I and $v$ is type II then $v$ cannot overlap with $w$ as $w$ cannot then be a vertex. Furthermore, the other possibility with $w$ type I and $v$ type III is excluded as we are assuming in Theorem 3.11 that there are no points of $W$ in the interior of $vw$. Finally, the case $w = (-2, 1, \dots)$, $v = (-1, 0, \dots)$ can be excluded as this just gives the example in Theorem 3.14.

In order to list the possibilities under Theorem 3.11(ii), recall that we need $J(\bar c^{(1)}, \bar w) = 0$, where $c^{(1)} = (2v + w)/3$. Equivalently, we need
\[ 2J(\bar v, \bar w) + J(\bar w, \bar w) = 0. \tag{3.2} \]

This puts constraints on the possibilities for $v, w$. For instance, $w$ cannot be type I, as for such vectors $J(\bar v, \bar w) \ge 0$ and $J(\bar w, \bar w) > 0$. Also, if $w$ is type II or III, then from the superpotential equation we need $J(\bar w, \bar w) < 0$, so $J(\bar v, \bar w) > 0$. If $w$ is type III, say $(-2, 1, 0, \dots)$, then since $d_1 \ge 2$, we have $\frac{4}{d_1} + \frac{1}{d_2} \le 3$, and the above equation gives $J(\bar v, \bar w) \le \frac14$ with equality iff $d_1 = 2$, $d_2 = 1$.

By the above remarks and the nullity of $\bar c$, after a moderate amount of routine computations, we arrive at the following possibilities, up to permutation of entries. In the table we have listed only the minimum number of components for each vector and all unlisted components are zero. Note that the entries (12)–(16) can occur only if $K$ is not connected (cf. Remark 5.9).

Table 2. c = (4v − w)/3 cases

       v                        w                        c^(1) = (2v + w)/3                      c = (4v − w)/3
(1)    (0, 0, −2, 1)            (−2, 1, 0, 0)            (−2/3, 1/3, −4/3, 2/3)                  (2/3, −1/3, −8/3, 4/3)
(2)    (−2, 0, 1)               (−2, 1, 0)               (−2, 1/3, 2/3)                          (−2, −1/3, 4/3)
(3)    (−1, 0, 0)               (−2, 1, 0)               (−4/3, 1/3, 0)                          (−2/3, −1/3, 0)
(4)    (0, 0, −1)               (−2, 1, 0)               (−2/3, 1/3, −2/3)                       (2/3, −1/3, −4/3)
(5)    (−1, 0, 1, −1)           (−2, 1, 0, 0)            (−4/3, 1/3, 2/3, −2/3)                  (−2/3, −1/3, 4/3, −4/3)
(6)    (−1, 1, −1)              (−2, 1, 0)               (−4/3, 1, −2/3)                         (−2/3, 1, −4/3)
(7)    (0, 0, 1, −1, −1)        (−2, 1, 0, 0, 0)         (−2/3, 1/3, 2/3, −2/3, −2/3)            (2/3, −1/3, 4/3, −4/3, −4/3)
(8)    (0, 1, −1, −1)           (−2, 1, 0, 0)            (−2/3, 1, −2/3, −2/3)                   (2/3, 1, −4/3, −4/3)
(9)    (0, −1, −1, 1)           (1, −1, −1, 0)           (1/3, −1, −1, 2/3)                      (−1/3, −1, −1, 4/3)
(10)   (1, −1, 0, −1)           (1, −1, −1, 0)           (1, −1, −1/3, −2/3)                     (1, −1, 1/3, −4/3)
(11)   (1, −2, 0)               (1, −1, −1)              (1, −5/3, −1/3)                         (1, −7/3, 1/3)
(12)   (1, 0, 0, −2)            (1, −1, −1, 0)           (1, −1/3, −1/3, −4/3)                   (1, 1/3, 1/3, −8/3)
(13)   (0, 0, 0, −2, 1)         (1, −1, −1, 0, 0)        (1/3, −1/3, −1/3, −4/3, 2/3)            (−1/3, 1/3, 1/3, −8/3, 4/3)
(14)   (0, −1, 0, 1, −1)        (1, −1, −1, 0, 0)        (1/3, −1, −1/3, 2/3, −2/3)              (−1/3, −1, 1/3, 4/3, −4/3)
(15)   (1, 0, 0, −1, −1)        (1, −1, −1, 0, 0)        (1, −1/3, −1/3, −2/3, −2/3)             (1, 1/3, 1/3, −4/3, −4/3)
(16)   (0, 0, 0, 1, −1, −1)     (1, −1, −1, 0, 0, 0)     (1/3, −1/3, −1/3, 2/3, −2/3, −2/3)      (−1/3, 1/3, 1/3, 4/3, −4/3, −4/3)
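The entries of Tables 1 and 2 are determined by $v$ and $w$ alone, so they can be recomputed as a consistency check. The sketch below does this in exact arithmetic, using $c = 2v - w$ for Table 1 and $c^{(1)} = (2v+w)/3$, $c = (4v-w)/3$ for Table 2; the variable and function names are illustrative only, and vectors are zero-padded to a common length.

```python
from fractions import Fraction as F

# Table 1 rows as (v, w, c).
TABLE1 = [
    ((-1, 1, -1), (-2, 1, 0), (0, 1, -2)),
    ((-1, -1, 1), (-2, 1, 0), (0, -3, 2)),
    ((-1, 0, -1, 1), (-2, 1, 0, 0), (0, -1, -2, 2)),
    ((-2, 1), (-1, 0), (-3, 2)),
    ((-2, 1, 0), (0, 0, -1), (-4, 2, 1)),
    ((-1, 0), (0, -1), (-2, 1)),
    ((1, -1, -1, 0), (0, 0, 0, -1), (2, -2, -2, 1)),
]

# Table 2 rows as (v, w); c^(1) and c are recomputed below.
TABLE2 = [
    ((0, 0, -2, 1), (-2, 1, 0, 0)),
    ((-2, 0, 1), (-2, 1, 0)),
    ((-1, 0, 0), (-2, 1, 0)),
    ((0, 0, -1), (-2, 1, 0)),
    ((-1, 0, 1, -1), (-2, 1, 0, 0)),
    ((-1, 1, -1), (-2, 1, 0)),
    ((0, 0, 1, -1, -1), (-2, 1, 0, 0, 0)),
    ((0, 1, -1, -1), (-2, 1, 0, 0)),
    ((0, -1, -1, 1), (1, -1, -1, 0)),
    ((1, -1, 0, -1), (1, -1, -1, 0)),
    ((1, -2, 0), (1, -1, -1)),
    ((1, 0, 0, -2), (1, -1, -1, 0)),
    ((0, 0, 0, -2, 1), (1, -1, -1, 0, 0)),
    ((0, -1, 0, 1, -1), (1, -1, -1, 0, 0)),
    ((1, 0, 0, -1, -1), (1, -1, -1, 0, 0)),
    ((0, 0, 0, 1, -1, -1), (1, -1, -1, 0, 0, 0)),
]

def table1_ok(v, w, c):
    """Check the listed c against 2v - w."""
    return tuple(2 * x - y for x, y in zip(v, w)) == c

def table2_row(v, w):
    """Return (c^(1), c) = ((2v + w)/3, (4v - w)/3)."""
    c1 = tuple(F(2 * x + y, 3) for x, y in zip(v, w))
    c = tuple(F(4 * x - y, 3) for x, y in zip(v, w))
    return c1, c

all_table1_ok = all(table1_ok(*row) for row in TABLE1)
table2 = [table2_row(v, w) for v, w in TABLE2]

# c^(1) is the midpoint of c and w, as stated in Theorem 3.11(ii).
midpoint_ok = all(
    tuple(2 * a for a in c1) == tuple(b + y for b, y in zip(c, w))
    for (v, w), (c1, c) in zip(TABLE2, table2)
)
```

Printing `table2` reproduces the last two columns of Table 2 row by row.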

We will also need a listing of those cases for which vw has interior points lying in conv(W).

Table 3. Cases with interior points

       v                      w                      c
(1)    (1, −2, ...)           (−2, 1, ...)           (3λ − 2, 1 − 3λ, ...)
(2)    (1, −2, ...)           (−1, 0, ...)           (2λ − 1, −2λ, ...)
(3)    (−1, 0, ...)           (1, −2, ...)           (1 − 2λ, 2λ − 2, ...)
(4)    (−2, 1, 0, ...)        (0, 1, −2, ...)        (−2λ, 1, 2λ − 2, ...)
(5)    (1, −1, −1, ...)       (−1, 1, −1, ...)       (2λ − 1, 1 − 2λ, −1, ...)

Recall from Remark 3.12 that $1 < \lambda \le 2$ and ... denote zeros. Note that except in (4) all interior points which may lie in $W$ actually do.

4. The Sign of $J(\bar c, \bar w)$

Theorem 4.1. $\mathrm{conv}(\tfrac12(d+W))$ lies in the closed half-space $J(\bar c, \cdot) \ge 0$, i.e., the same closed half-space in which the null ellipsoid lies.

Proof. We know that if $\bar\xi$ is a vertex of $\Delta^{\bar c}$ then there are three possibilities, given by (1A), (1B) and (2) of Theorem 3.8. If (1A) occurs, then by definition $J(\bar c, \bar\xi) = 0$. If (1B) occurs, let $\bar a$ be the null vector in Theorem 3.8. Then by Lemma 3.2, $J(\bar c, \bar a) > 0$, which in turn implies that $J(\bar c, \bar\xi) > 0$. It is now enough to show that $J(\bar c, \bar\xi) \ge 0$ if $\bar\xi$ is a type (2) vertex of $\Delta^{\bar c}$, since it then follows that $\Delta^{\bar c}$, and hence $\mathrm{conv}(\tfrac12(d+W))$, lies in the half-space $J(\bar c, \cdot) \ge 0$.

Suppose then that $\bar\xi$ is a type (2) vertex with $J(\bar c, \bar\xi) < 0$. By Remark 3.12, $c = \lambda v + (1-\lambda)w$ for some $v, w \in W$ with $1 < \lambda \le 2$, and both $J(\bar c, \bar v), J(\bar c, \bar w) < 0$. In particular, from Remark 3.1 and Lemma 3.2, $J(\bar v, \bar v), J(\bar w, \bar w) < 0$ since $\bar v, \bar w$ lie on the side of $\bar c^{\perp}$ opposite to the null ellipsoid. But
\begin{align*}
0 = 4J(\bar c, \bar c) &= J(d + \lambda v + (1-\lambda)w,\; d + \lambda v + (1-\lambda)w) \\
&= J(\lambda(d+v) + (1-\lambda)(d+w),\; \lambda(d+v) + (1-\lambda)(d+w)) \\
&= \lambda^2 J(d+v, d+v) + 2\lambda(1-\lambda)J(d+v, d+w) + (1-\lambda)^2 J(d+w, d+w).
\end{align*}
It follows from the above remarks that $J(d+v, d+w) < 0$, that is
\[ \sum_i \frac{v_i w_i}{d_i} > 1. \]
One then checks that this condition is only satisfied in the following cases (up to permutation of indices and interchange of $v$ and $w$):
(a) $v = (-2, 1, 0, \dots)$, $w = (-2, 0, 1, 0, \dots)$ with $1 < d_1 < 4$;
(b) $v = (-2, 1, 0, \dots)$, $w = (-1, 1, -1, 0, \dots)$ with $d_1 = 2$, or $(d_1, d_2) = (3, 2)$, or $d_2 = 1$;
(c) $v = (1, -1, -1, 0, \dots)$, $w = (1, -1, 0, -1, 0, \dots)$ with $d_1 = 1$ or $d_2 = 1$;
(d) $v = (1, -1, -1, 0, \dots)$, $w = (0, -1, -1, 1, 0, \dots)$ with $d_2 = 1$ or $d_3 = 1$.

In case (a), $c = (-2, \lambda, 1-\lambda, 0, \dots)$. The condition $d_1 < 4$ is incompatible with the nullity of $\bar c$. Interchanging $v$ and $w$ reverses only the role of $\lambda$ and $1 - \lambda$. A similar argument rules out case (b) with $v, w$ as shown, as here $c = (-\lambda - 1, 1, \lambda - 1)$. If we interchange $v$ and $w$, then $c = (\lambda - 2, 1, -\lambda, \dots)$. Theorem 3.11 tells us $\lambda = 4/3$ or $2$, so $c = (-2/3, 1, -4/3, \dots)$ or $(0, 1, -2, \dots)$. In the former case $c^{(1)} := (2v + w)/3 = (-4/3, 1, -2/3, \dots)$, so the condition $J(\bar w, \bar c^{(1)}) = 0$ gives $\frac{8}{3d_1} + \frac{1}{d_2} = 1$. Thus $(d_1, d_2) = (3, 9)$ or $(4, 3)$ but in

neither case is $\bar c$ null. In the latter case nullity means $\frac{1}{d_2} + \frac{4}{d_3} = 1$, so $J(\bar c, \bar v) = \frac14\bigl(1 - \frac{1}{d_2} - \frac{2}{d_3}\bigr) > 0$, a contradiction.

In case (c), $c = (1, -1, -\lambda, \lambda - 1)$ and if $v$ and $w$ are interchanged, the last two components of $c$ are interchanged. But $\bar c$ cannot be null if $d_1 = 1$ or $d_2 = 1$. A similar argument works for case (d).

Corollary 4.2. $\mathrm{conv}(\tfrac12(d+W)) \cap \bar c^{\perp}$ is a (possibly empty) face of $\mathrm{conv}(\tfrac12(d+W))$.

This enables us to adapt Theorem 3.5 of [DW4] to the elements of $\bar c^{\perp}$.

Corollary 4.3. Let $vw$ be an edge of $\mathrm{conv}(W)$ and suppose $\bar v$ and $\bar w$ are in $\bar c^{\perp}$. Suppose further that there are no elements of $W$ in the interior of $vw$. Then $J(\bar v, \bar w) = 0$.

Proof. This is essentially the same as the proof of Theorem 3.5 of [DW1]. As $\mathrm{conv}(\tfrac12(d+W)) \cap \bar c^{\perp}$ is a face of $\mathrm{conv}(\tfrac12(d+W))$, Lemma 3.5 shows that for calculations in $\bar c^{\perp}$ we need only consider elements of $C$ in this hyperplane. Note that by Cor. 3.4, no elements of $C$ lie on the opposite side of $\bar c^{\perp}$ to $\mathrm{conv}(\tfrac12(d+W))$. Any vertex of $\mathrm{conv}(C) \cap \bar c^{\perp}$ outside $\mathrm{conv}(\tfrac12(d+W)) \cap \bar c^{\perp}$ is, by Prop. 1.4, null, so must be $\bar c$ by Lemma 3.2. Now Cor. 3.4 shows that $\bar c$ is the only element of $\mathrm{conv}(C) \cap \bar c^{\perp}$ outside $\mathrm{conv}(\tfrac12(d+W)) \cap \bar c^{\perp}$. But any sum $\bar c + \bar a$ with $\bar a \in \bar c^{\perp}$ does not contribute, so in fact we are in the situation of Theorem 3.5 of [DW4].

We introduce the following sets:
\begin{align*}
\hat S_1 &= \{ i \in \{1, \dots, r\} : \exists \text{ unique } w \in W \text{ with } \bar w \in \bar c^{\perp} \text{ and } w_i = -2 \}, \\
\hat S_{\ge 2} &= \{ i \in \{1, \dots, r\} : \exists \text{ more than one } w \in W \text{ with } \bar w \in \bar c^{\perp} \text{ and } w_i = -2 \}.
\end{align*}
These are similar to the sets $S_1, S_{\ge 2}$ of [DW4], but now we require that the vectors $w$ lie in $\bar c^{\perp}$. It is immediate from Cor. 4.3 that $d_i = 4$ if $i \in \hat S_{\ge 2}$ (cf. Prop. 4.2 in [DW4]).

We next prove a useful result about which elements of $\tfrac12(d+W)$ can be orthogonal to $\bar c$. This will give us information about when (1A) vertices can occur.

Lemma 4.4. Assume that we are not in the situation of Theorem 3.14 (i.e., $c$ is not of type I). Let $u \in W$ be such that $\bar u \in \bar c^{\perp}$. Then: 1.
there exists $i$ with $c_i \neq 0$ and $-2 < c_i < 1$; 2. if $c \in \mathbb Z^r$ then there is at most one such $u$, and hence at most one (1A) vertex (wrt $c$).

Proof. (a) The condition $J(\bar u, \bar c) = 0$ means $\sum_i \frac{u_i c_i}{d_i} = 1$, and nullity of $\bar c$ means $\sum_i \frac{c_i^2}{d_i} = 1$. As $u_i \in \{-2, -1, 0, 1\}$, if the condition in (a) does not hold, then $u_i c_i \le c_i^2$ for all $i$ so we must have equality for all $i$. Now $c_i = u_i$ for all $i$ with $c_i$ nonzero. As $\sum c_i = -1$ and $c \neq u$ (since $c \notin W$ by definition), this means $c$ is a type I vector and we are in the situation of Theorem 3.14.

(b) We see from the previous paragraph that we need $u_i c_i > c_i^2$ for some $i$. If $c \in \mathbb Z^r$ this means $c_i = -1$ and $u_i = -2$. The orthogonality condition is now $\frac{2}{d_i} + \frac{c_j}{d_j} = 1$, where $u_j = 1$. As $d_i \neq 1$ we see $c_j \ge 0$. If $c_j = 0$ then $d_i = 2$. If $c_j > 0$ then $d_i \ge 3$ so $\frac13 \le \frac{c_j}{d_j} < \frac{1}{c_j}$, where the second inequality is due to the nullity requirement $\frac{1}{d_i} + \frac{c_j^2}{d_j} \le 1$. So $c_j = 1$ or $2$. Moreover, the latter implies $(d_i, d_j) = (3, 6)$ and $c = (-1^i, 2^j)$, which contradicts $\sum c_i = -1$.

We see that either $c_j = 1$ and $(d_i, d_j) = (4, 2)$ or $(3, 3)$, or $c_j = 0$ and $d_i = 2$. Corollary 4.3 implies that if there is more than one such $u$ (say $(-2^i, 1^j)$ and $(-2^i, 1^k)$) for a given $i$, then $d_i = 4$, so $(d_i, d_j, d_k) = (4, 2, 2)$, and $(c_i, c_j, c_k) = (-1, 1, 1)$, contradicting the nullity of $c$. It now readily follows that the nullity condition prevents there being more than one $u \in W$ with $\bar u \in \bar c^{\perp}$ except when $c = (-1, -1, 1, 0, \dots)$ with $d = (4, 4, 2, \dots)$ or $(3, 3, 3, \dots)$ and $u = (-2, 0, 1, 0, \dots), (0, -2, 1, 0, \dots)$. But in this case if both $u$ occur then $c \in \mathrm{conv}(W)$, a contradiction.

We shall study (1A) vertices for non-integral $c$ in the next section. The following results will be useful.

Proposition 4.5. Let $v = (-2^i, 1^j)$ and $w = (-2^k, 1^l)$ be elements of $W$ such that $\bar v, \bar w \in \bar c^{\perp}$. Suppose that $i \in \hat S_1$ and $\{i, j\} \cap \{k, l\} = \emptyset$. Then $k \in \hat S_{\ge 2}$ and $(d_i, d_k, d_l) = (2, 4, 2)$.

Proof. By Remark 1.2(e) the affine subspace $\{\bar x : x_i + x_k = -2,\ x_j + x_l = 1\} \cap \bar c^{\perp}$ meets $\mathrm{conv}(\tfrac12(d+W))$ in a face, whose possible elements are $v$, $w$, $u = (-2^k, 1^j)$, $y = (-1^i, 1^j, -1^k)$ and $z = (-1^i, -1^k, 1^l)$ (since $i \in \hat S_1$). As $J(\bar v, \bar w) = \frac14$, we see from Cor. 4.3 that $vw$ is not an edge so $z$ is present in the face. Now Cor. 4.3 on $vz$ implies $d_i = 2$. Also, $u$ must be present, otherwise $y$ is present and Cor. 4.3 on $zw$ and $yw$ gives a contradiction. So $k \in \hat S_{\ge 2}$, and Cor. 4.3 on $uw$ implies $d_k = 4$. Now considering $zw$ implies $d_l = 2$.

Remark 4.6. This is similar to the proof of Prop. 4.6 in [DW4]. But we cannot now deduce that $d_j = 1$ as the proof of this in [DW4] relied on the existence of $t = (-1^i, -1^j, 1^k)$, and although we know this is in $W$ we do not know if $\bar t$ lies in $\bar c^{\perp}$.

Proposition 4.7. If $i \in \hat S_1$ and $v = (-2^i, 1^j)$ gives an element of $\bar c^{\perp}$ then $w = (-1^i, -1^j, 1^k)$ cannot give an element of $\bar c^{\perp}$.

Proof. This is similar to Prop. 4.3 in [DW4]. Since $i \in \hat S_1$, the vectors $\bar v, \bar w$ lie on an edge in the face $\{\bar x : 2x_i + x_j = -3\} \cap \bar c^{\perp}$ of $\mathrm{conv}(\tfrac12(d+W))$, and $J(\bar v, \bar w) = \frac14\bigl(1 - \frac{2}{d_i} + \frac{1}{d_j}\bigr) \neq 0$ since $d_i \neq 1$.

Corollary 4.8. With $v$ as in Prop. 4.7, there are no elements $w = (-2^j, 1^k)$ with $\bar w$ in $\bar c^{\perp}$.

Proof. This is similar to Prop. 4.4 in [DW4]. If $k = i$, then the type I vector $u := (-1^i) = \frac13(2v + w)$ lies in $W$ and $\bar u \in \bar c^{\perp}$. By Lemma 5.1 below, $u = c$, contradicting $c \notin W$. We can therefore take $k \neq i$. Now $\bar v, \bar w$ lie on an edge in the face $\{\bar x : 3x_i + 2x_j = -4\} \cap \bar c^{\perp}$ (this is a face by Prop. 4.7 and the assumption $i \in \hat S_1$). But $J(\bar v, \bar w) = \frac14\bigl(1 + \frac{2}{d_j}\bigr) \neq 0$.

5. Vectors Orthogonal to a Null Vertex

In this section we analyse the possibilities for $\tfrac12(d+W) \cap \bar c^{\perp}$. This will give us an understanding of the vertices of type (1A). We first dispose of the case of type I vectors.

Lemma 5.1. If $u$ is a type I vector and $\bar u \in \bar c^{\perp}$ then $c = u$, so we are in the situation of Theorem 3.14.

Proof. Up to a permutation we may let $u = (-1, 0, \dots)$. The orthogonality condition implies $c_1 = -d_1$. But nullity implies $\sum c_i^2/d_i = 1$, so $d_1 = 1$ and $c_i = 0$ for $i > 1$. (Note that in particular $u \notin W$.)

We shall therefore assume from now on there are no type I vectors giving points of $\bar c^{\perp}$.

Lemma 5.2. (i) Two type II vectors whose nonzero entries lie in the same set of three indices cannot both give elements of $\bar c^{\perp}$.
(ii) Two type III vectors $(-2^i, 1^j)$ and $(1^i, -2^j)$ cannot both give elements in $\bar c^{\perp}$.
(iii) Three type III vectors whose nonzero entries all lie in the same set of three indices cannot all give rise to elements in $\bar c^{\perp}$.

Proof. These all follow from Lemma 5.1 by exhibiting an affine combination of the given vectors which is of type I.

Let $u, v \in W$ be such that $\bar u$ and $\bar v \in \bar c^{\perp}$. It follows that $\lambda\bar u + (1-\lambda)\bar v \in \bar c^{\perp}$ for all $\lambda$. Hence Remark 3.1 shows that for all $\lambda$,
\begin{align*}
0 &\ge J(d + \lambda u + (1-\lambda)v,\; d + \lambda u + (1-\lambda)v) \\
&= J(\lambda(d+u) + (1-\lambda)(d+v),\; \lambda(d+u) + (1-\lambda)(d+v)) \\
&= \lambda^2\bigl(J(d+u, d+u) + J(d+v, d+v) - 2J(d+u, d+v)\bigr) \\
&\quad + 2\lambda\bigl(J(d+u, d+v) - J(d+v, d+v)\bigr) + J(d+v, d+v).
\end{align*}
Equality occurs if and only if $\lambda u + (1-\lambda)v = c$, as $\bar c$ is the only null vector in $\bar c^{\perp}$. Multiplying by $-1$, using Eq.(1.3), and recalling that the minimum value of a quadratic $\alpha\lambda^2 + \beta\lambda + \gamma$ with $\alpha > 0$ is $\gamma - (\beta^2/4\alpha)$, we deduce the following result.

Lemma 5.3. If $u, v \in W$ and $\bar u, \bar v \in \bar c^{\perp}$, then
\[ \sum_{i=1}^r \frac{u_i^2}{d_i} + \sum_{i=1}^r \frac{v_i^2}{d_i} - \Bigl(\sum_{i=1}^r \frac{u_i^2}{d_i}\Bigr)\Bigl(\sum_{i=1}^r \frac{v_i^2}{d_i}\Bigr) \le 2\sum_{i=1}^r \frac{u_i v_i}{d_i} - \Bigl(\sum_{i=1}^r \frac{u_i v_i}{d_i}\Bigr)^2. \tag{5.1} \]
Moreover, equality occurs if and only if $c = \lambda u + (1-\lambda)v$ for some $\lambda$.

Remark 5.4. By definition, $c$ does not lie in $\mathrm{conv}(W)$. So in the case of equality in Eq.(5.1) we cannot have $0 \le \lambda \le 1$. This observation will in many cases show that equality cannot occur.

Remark 5.5. The right-hand side of Eq.(5.1) is maximised when $\sum \frac{u_i v_i}{d_i} = 1$ (i.e., $J(\bar u, \bar v) = 0$). In this case Eq.(5.1) just follows from $\sum \frac{u_i^2}{d_i}, \sum \frac{v_i^2}{d_i} \ge 1$, which is true for any two vectors in $\bar c^{\perp}$. If $J(\bar u, \bar v) \neq 0$, we get sharper information.

Corollary 5.6. Suppose that $K$ is connected. If $u, v$ are type II vectors in $W$ with $\bar u, \bar v \in \bar c^{\perp}$ then
\[ \frac12 \le \sum_{i=1}^r \frac{u_i v_i}{d_i} \le \frac32, \]
with equality if and only if $c = \lambda u + (1-\lambda)v$ for some $\lambda$, in which case all the $d_i = 2$ whenever $i$ is an index such that $u_i$ or $v_i$ is nonzero.

Proof. Writing $X = \sum \frac{u_i^2}{d_i}$ and $Y = \sum \frac{v_i^2}{d_i}$ we see that $1 \le X, Y \le \frac32$. The lower bound arises from $\bar u, \bar v$ being in $\bar c^{\perp}$, while the upper bound follows from Remark 1.2(d) and the assumption that $u, v$ are type II vectors. Now $X + Y - XY = 1 - (1-X)(1-Y)$ is minimised for $X, Y$ in this range if $X = Y = \frac32$, when it takes the value $\frac34$. The inequality Eq.(5.1) now gives the result.

When $K$ is connected, it follows that any two such type II vectors must overlap. Moreover, if they have only one common index then we are in the case of equality in Cor. 5.6. The nullity of $\bar c$ implies that $\lambda = \frac12$ in this case, contradicting Remark 5.4. Combining this remark with Cor. 5.2 and Lemma 5.2 (i), we deduce the following result.

Corollary 5.7. Assume $K$ is connected. If $u, v$ are type II vectors in $W$ with $\bar u, \bar v \in \bar c^{\perp}$, then either $u = (-1^a, -1^b, 1^i)$, $v = (-1^a, -1^b, 1^j)$ or $u = (1^a, -1^b, -1^i)$, $v = (1^a, -1^b, -1^j)$. Hence the collection of all such type II vectors is of the form, for some fixed $a, b$:
(i) $(-1^a, -1^b, 1^i)$ : $i \in I$ for some set $I$; or
(ii) $(1^a, -1^b, -1^i)$ : $i \in I$ for some set $I$; or
(iii) $(1, -1, -1, 0, \dots)$, $(1, -1, 0, -1, \dots)$, $(1, 0, -1, -1, \dots)$.
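The bounds in Cor. 5.6 come from comparing the two sides of Eq.(5.1) as functions of $X$, $Y$ and $S = \sum u_i v_i/d_i$. The sketch below checks this numerically on rational grids and then evaluates both sides of Eq.(5.1) on one equality configuration. The grids, the helper names and the sample vectors `u`, `v`, `d` are assumptions chosen for illustration; only the formulas come from the text.

```python
from fractions import Fraction as F

def lhs(X, Y):
    """Left-hand side of Eq.(5.1): X + Y - XY."""
    return X + Y - X * Y

def rhs(S):
    """Right-hand side of Eq.(5.1): 2S - S^2."""
    return 2 * S - S * S

# Minimise the left-hand side over a grid of 1 <= X, Y <= 3/2.
grid = [F(1) + F(k, 20) for k in range(11)]
min_val, argmin = min((lhs(X, Y), (X, Y)) for X in grid for Y in grid)

# Values of S compatible with rhs(S) >= min_val, on a grid of [0, 2].
S_grid = [F(k, 100) for k in range(201)]
admissible = [S for S in S_grid if rhs(S) >= min_val]

def sides(u, v, d):
    """Both sides of Eq.(5.1) for integer vectors u, v and dimensions d."""
    X = sum(F(a * a, n) for a, n in zip(u, d))
    Y = sum(F(b * b, n) for b, n in zip(v, d))
    S = sum(F(a * b, n) for a, b, n in zip(u, v, d))
    return lhs(X, Y), rhs(S)

# Hypothetical equality case: two type II vectors sharing one index, all d_i = 2.
u = (-1, -1, 1, 0, 0)
v = (-1, 0, 0, -1, 1)
d = (2, 2, 2, 2, 2)
left, right = sides(u, v, d)
```

In the sample configuration both sides equal $3/4$ and the midpoint $(u+v)/2$ satisfies the nullity equation, which is the $\lambda = \frac12$ situation excluded by Remark 5.4.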

We now investigate type III vectors.

Lemma 5.8. Suppose $K$ is connected. If $u$ is a type II vector and $v$ a type III vector in $W$ with $\bar u, \bar v \in \bar c^{\perp}$, then $\sum_{i=1}^r \frac{u_i v_i}{d_i} > 0$.

Proof. With the notation of Cor. 5.6 we have $1 \le X \le \frac32$ and $1 \le Y \le 3$. So $X + Y - XY = 1 - (1-X)(1-Y) \ge 0$, and Eq.(5.1) gives the desired inequality. Also, the case of equality (i.e., $X = \frac32$, $Y = 3$) leads to $\lambda = \frac23$, again contradicting Remark 5.4.

Remark 5.9. While Cor. 5.6 through Lemma 5.8 are stated under the assumption that $K$ is connected, the actual property we used is that in Remark 2.4. By contrast, the next two results do not require this property.

Lemma 5.10. Any two type III vectors $u, v$ giving elements of $\bar c^{\perp}$ must overlap.

Proof. Write $u = (-2^i, 1^j)$ and $v = (-2^k, 1^l)$. By Cor. 4.3, if $i, k \in \hat S_{\ge 2}$ then $d_i = d_k = 4$. Since $J(\bar c, \bar u) = 0$ we have (by Cauchy-Schwarz)
\[ 1 = \Bigl(-\frac{2c_i}{d_i} + \frac{c_j}{d_j}\Bigr)^2 \le \Bigl(\frac{4}{d_i} + \frac{1}{d_j}\Bigr)\Bigl(\frac{c_i^2}{d_i} + \frac{c_j^2}{d_j}\Bigr). \]
Hence
\[ \frac{c_i^2}{d_i} + \frac{c_j^2}{d_j} \ge \frac{d_j}{1 + d_j} \ge \frac12. \]
If $u$ and $v$ do not overlap, then the above and the analogous result from considering $J(\bar c, \bar v) = 0$, together with the nullity of $\bar c$, imply that $d_j = 1 = d_l$ and the only nonzero components of $c$ are $c_i = c_k = -1$, $c_j = c_l = \frac12$. But then $c$ is the midpoint of $uv$, contradicting $c \notin \mathrm{conv}(W)$.

So if $u$ and $v$ do not overlap, we can take $i \in \hat S_1$. Proposition 4.5 shows that $k \in \hat S_{\ge 2}$ and $(d_i, d_k, d_l) = (2, 4, 2)$. Hence $2 < X \le 3$ and $Y = \frac32$, so $X + Y - XY \ge 0$ and $\sum \frac{u_i v_i}{d_i} \ge 0$. Non-overlap means that equality holds. But then $\lambda = 1/3$, contradicting Remark 5.4.

Lemma 5.10, together with Lemmas 5.2 and 4.8, implies the following corollary.

Corollary 5.11. The type III vectors associated to elements of $\tfrac12(d+W) \cap \bar c^{\perp}$ are, up to permutation of indices, either of the form
(a) $(-2^1, 1^i)$, $i \in I$ (with $d_1 = 4$ if $|I| \ge 2$), or
(b) $(1^1, -2^i)$, $i \in I$,
for some subset $I \subset \{2, \dots, r\}$.

Having found the possible configurations for type III vectors in $\bar c^{\perp}$, we start to analyse the type II vectors for each such configuration. For the rest of this section we will assume that $K$ is connected (cf. Remark 5.9).

Remark 5.12. Lemma 5.8 now shows that in case (a) of Cor. 5.11, if $|I| \ge 2$, then every type II vector associated to an element of $\bar c^{\perp}$ must have "−1" in place 1. Similarly, in case (b), if $|I| \ge 3$, then every such type II vector has "1" in place 1. (So if a type II is present then $d_1 \neq 1$.) If $|I| = 2$, the only possible type II vectors with "0" in place 1 are $(0^1, -1^2, -1^3, 1^i)$ where $i \ge 4$, and all type II vectors whose first entry is nonzero actually must have first entry equal to 1.

Lemma 5.13. In case (a) of Cor. 5.11 with $|I| \ge 2$ there are no type II vectors associated to elements of $\bar c^{\perp}$.

Proof. Let $v = (-2^1, 1^k)$ and $w = (-1^1, 1^i, -1^j)$ give elements of $\bar c^{\perp}$ with $k \neq i, j$. Consider the face $\{\bar x : x_i + x_k = 1,\ x_1 + x_j = -2\} \cap \bar c^{\perp}$. Other than $v, w$ the possible elements in this face come from $u = (-1^1, 1^k, -1^j)$ and $s = (-2^1, 1^i)$. As $d_1 = 4$, $J(\bar v, \bar w) \neq 0$, so $vw$ is not an edge and $u$ must be present. But $J(\bar u, \bar w) = \frac14\bigl(1 - \frac{1}{d_1} - \frac{1}{d_j}\bigr) \neq 0$ since $d_1 = 4$, giving a contradiction. So $k = i$ or $j$ for every such $v, w$. Hence if such a $w$ exists there are at most two type III vectors. Now if $|I| = 2$ and the type IIIs are $(-2, 1, 0, \dots)$, $(-2, 0, 1, \dots)$, we cannot have $w = (-1, 1, -1, \dots)$ or $(-1, -1, 1, \dots)$ as then a suitable affine combination of the above vectors gives a type I vector (cf. Lemmas 5.1, 5.2). So in fact no type II vectors give rise to elements of $\bar c^{\perp}$.

Lemma 5.14. The vectors $v = (-2, 1, 0, \dots)$ and $w = (0, 1, -1, -1, 0, \dots)$ are not both associated to elements of $\bar c^{\perp}$, unless $(0, 1, -2, 0, \dots)$ or $(0, 1, 0, -2, 0, \dots)$ is also.

Proof. Suppose $(0, 1, -2, 0, \dots)$, $(0, 1, 0, -2, 0, \dots)$ are absent. Consider the face $\{\bar x : x_2 = 1,\ x_1 + x_3 + x_4 = -2\} \cap \bar c^{\perp}$. The other possible elements of this face come from $t = (-1, 1, -1, 0, \dots)$ and $y = (-1, 1, 0, -1, 0, \dots)$. Both these must be present, as $J(\bar v, \bar w) \neq 0$. Applying Cor. 4.3 to $wt$, $vt$ and $wy$ we obtain $(d_1, d_2, d_3, d_4) = (4, 2, 2, 2)$. Now we have equality (for $y, t$) in Eq.(5.1), as both sides equal $15/16$. We find that $\lambda = 1/2$, giving a contradiction again to Remark 5.4.

Combining this with Lemma 5.8 (and using Lemma 4.7) yields:

Corollary 5.15. If there is a unique type III vector $u = (-2^1, 1^2)$ with $\bar u$ in $\bar c^{\perp}$, then the type II vectors associated to elements of $\bar c^{\perp}$ all have "−1" in place 1. Moreover $(-1^1, -1^2, 1^i)$ cannot be present. Also, if $(-1^1, 1^2, -1^i)$ is present for some $i \ge 3$ then $(d_1, d_2) = (4, 2)$ or $(3, 3)$ and the index $i$ is unique.

For the last assertion, observe that $(-1^1, 1^2, -1^i)$ and the type III vector are joined by an edge, so Cor. 4.3 shows the dimensions are as stated. If we have two such type II for $i_0$ and $i_1$ then Eq.(5.1) implies $d_{i_0} + d_{i_1} \le 4$. Hence since $K$ is connected, $d_{i_0} = d_{i_1} = 2$ and we have equality in Eq.(5.1) with $\lambda = \frac12$, giving a contradiction.

Lemma 5.16. Let the type III vectors be as in Cor. 5.11(b), i.e., they are $(1^1, -2^a)$, $a \in I$. Assume that $|I| \ge 2$. If we have a type II vector $w = (1^1, -1^i, -1^j)$ with $\bar w$ in $\bar c^{\perp}$ then $i, j \in I$.

Proof. Suppose for a contradiction that $w = (1^1, -1^i, -1^j)$ is present (so $d_1 \neq 1$) and $(1^1, -2^j)$ absent (i.e. $j \notin I$). Since $|I| \ge 2$, we can consider $v = (1^1, -2^k)$ where $k \in I$ (so $k \neq j$) and $k \neq i$. Consider the face $\{\bar x : x_1 = 1,\ x_i + x_j + x_k = -2\} \cap \bar c^{\perp}$. As well as $v, w$ the possible elements of $W$ in the face giving elements of $\bar c^{\perp}$ are $y = (1^1, -1^i, -1^k)$, $t = (1^1, -1^j, -1^k)$ and $u = (1^1, -2^i)$. As $d_1 \neq 1$, $vw$ is not an edge so $t$ is present. Now Cor. 4.3 applied to $vt$ and $tw$ gives $d_1 = d_j = 2$ and $d_k = 4$. Moreover, if $i \in I$ then $u$ is present, so the edge $wu$ gives $d_i = 4$. Thus we have shown that $d_a = 4$ for all $a \in I$. Now considering $(1^1, -2^a)$ and $(1^1, -2^b)$ with $a, b \in I$, we see that we have equality in Eq.(5.1) (both sides equal $\frac34$). In fact $c$ is the average of these two vectors (i.e., $\lambda = \frac12$), so as in Remark 5.4 we have a contradiction.

Lemma 5.17. Let the type III vectors be as in Cor. 5.11(b), i.e., they are $(1^1, -2^a)$, $a \in I$. Assume that $|I| \ge 3$. Then $d_1 = 1$.

Proof. Each pair $v, w$ of type III vectors gives an edge, and if $d_1 \neq 1$, then we have $J(\bar v, \bar w) > 0$. By Theorem 4.3 all the midpoint vectors $(1^1, -1^a, -1^b)$ are present for $a, b \in I$. Now Prop. 3.7 shows that $F_{\bar v}$ and $F_{\bar w}$ have opposite signs, so we have a contradiction if $|I| \ge 3$.

Putting together our results so far, we obtain a description of the possibilities for $\bar c^{\perp} \cap \tfrac12(d+W)$.

Theorem 5.18. Assume that $r \ge 3$ and $K$ is connected, and that we are not in the situation of Thm. 3.14. Up to permutation of the irreducible summands, the following are the possible configurations of vectors in $W$ associated to elements of $\tfrac12(d+W) \cap \bar c^{\perp}$.
(1) $\{(-2^1, 1^i),\ 2 \le i \le m\}$ for fixed $m \ge 2$. There are no type II vectors, and $d_1 = 4$ if $m \ge 3$.
(2) $\{(1^1, -2^i),\ 2 \le i \le m\}$ for fixed $m \ge 3$ and $d_1 = 1$. There are no type II vectors.
(3) (i) $\{(1^1, -2^2), (1^1, -2^3), (-1^2, -1^3, 1^i),\ 4 \le i \le m\}$ with $d_1 = 1$, $d_2 = d_3 = 2$.
    (ii) $\{(1^1, -1^2, -1^3), (1^1, -2^2), (1^1, -2^3), (-1^2, -1^3, 1^i),\ 4 \le i \le m\}$, $d_1 \neq 1$, $d_2 = d_3 = 2$.
(4) $\{(1, -2, 0, 0, \dots), (1, 0, -2, 0, \dots), (1, -1, -1, 0, \dots)\}$ with $d_1 \neq 1$.
(5) A unique type III $(-2, 1, 0, \dots)$. Possible type II vectors are
    (i) $(-1, 1, -1, 0, \dots)$ with either $(d_1, d_2) = (4, 2)$ or $(3, 3)$; or
    (ii) $\{(-1^1, 1^3, -1^i),\ 4 \le i \le m\}$ for fixed $m \le r$ and with $d_1 = 2$; or
    (iii) $\{(-1^1, -1^3, 1^i),\ 4 \le i \le m\}$ for fixed $m \le r$ and with $d_1 = 2$.


(6) No type III vectors. Possible type II vectors are
    (i) $\{(-1^1, -1^2, 1^i),\ 3 \leq i \leq m\}$ for fixed $m \leq r$, with $d_1 = d_2 = 2$ if $m \geq 4$; or
    (ii) $\{(1^1, -1^2, -1^i),\ 3 \leq i \leq m\}$ for fixed $m \leq r$, with $d_1 = d_2 = 2$ if $m \geq 4$; or
    (iii) $\{(1^1, -1^2, -1^3), (1^1, -1^2, -1^4), (1^1, -1^3, -1^4)\}$ with $d_1 = d_2 = d_3 = d_4 = 2$.

Proof. Corollary 5.11 gives the possibilities for the type III vectors in $\bar{c}^{\perp}$. If there are none then Cor. 5.7 gives the possibilities in (6). If there is a unique type III vector, then Cor. 5.15 and Cor. 5.7 give us the cases listed in (5) (or (1) with $m = 2$ if there are no type II vectors). If we have two or more type III vectors with $-2$ in the same place then Lemma 5.13 shows we are in case (1). If we have more than two type III vectors with 1 in the same place $a$, then $d_a = 1$ by Lemma 5.17. Remark 5.12 then implies there are no type II vectors and we are in case (2). If we have exactly two type III vectors with 1 in the same place, e.g., $(1, -2, 0, \ldots)$ and $(1, 0, -2, 0, \ldots)$, then the proof of Lemma 5.17 shows that if the type II vector $(1, -1, -1, 0, \ldots)$ is absent we must have $d_1 = 1$. On the other hand, if $d_1 = 1$ we are, by Remark 5.12 and the connectedness of $K$, in case (2) or (3)(i). If $d_1 \neq 1$, then by the above, Remark 5.12, and Cor. 5.7, we are in case (3)(ii) or (4). The statements about values of the $d_i$ follow from straightforward applications of Cor. 4.3 to the obvious edges of $\mathrm{conv}(\frac{1}{2}(d + \mathcal{W})) \cap \bar{c}^{\perp}$.

Remark 5.19. The possibilities in Theorem 5.18 can be somewhat sharpened. In cases (1), (2), and (3), $m$ cannot be $r$; in other words the maximum number of vectors is not allowed. This follows easily from looking at the system of equations expressing the nullity of $\bar{c}$, the orthogonality of the vectors to $\bar{c}$ and the fact that the entries of $c$ sum up to $-1$. Similarly, $r \neq 3$ in (5)(i) and $r \neq 4$ in (6)(iii). When $m \geq 5$ in (5)(ii) or (5)(iii), the segment joining two type II vectors is an edge, so Cor. 4.3 gives $d_3 = 2$.

6. Adjacent (1B) Vertices

We now turn to (1B) vertices. Let $\bar{\xi}, \bar{\xi}'$ be adjacent (1B) vertices of $\Delta^{\bar{c}}$. Then there exist vertices $\bar{x}, \bar{x}'$ of $\mathrm{conv}(\frac{1}{2}(d + \mathcal{W}))$ such that $\bar{c}, \bar{\xi}, \bar{x}$ are collinear and $\bar{c}, \bar{\xi}', \bar{x}'$ are collinear. Moreover, there exist null vectors $\bar{a}, \bar{a}'$ such that $\bar{x} = (\bar{a} + \bar{c})/2$ and $\bar{x}' = (\bar{a}' + \bar{c})/2$. By Cor. 3.4, there must be an element $\bar{y}$ of $\mathrm{conv}(\frac{1}{2}(d + \mathcal{W}))$ on $\bar{a}\bar{a}'$, so $P^{-1}(\bar{\xi}\bar{\xi}')$ contains the convex hull of $\bar{x}, \bar{x}', \bar{y}$ and hence is 2-dimensional. As $\bar{\xi}\bar{\xi}'$ is by assumption an edge of $\Delta^{\bar{c}}$, $P^{-1}(\bar{\xi}\bar{\xi}')$ is a 2-dimensional face of $\mathrm{conv}(\frac{1}{2}(d + \mathcal{W}))$. So we need to analyse the 2-dimensional faces of $\mathrm{conv}(\mathcal{W})$ containing vertices $x, x'$ such that

\[
x = (a + c)/2, \qquad x' = (a' + c)/2, \qquad \bar{a},\ \bar{a}' \ \text{null}, \tag{6.1}
\]

and such that $c$ lies in the 2-dimensional plane defining this face. The lines through $x, c$ (resp. $x', c$) only meet $\mathrm{conv}(\mathcal{W})$ at $x$ (resp. $x'$). Most 2-faces of $\mathrm{conv}(\mathcal{W})$ are triangular. We list below (up to permutation of components) all the possible non-triangular faces. For further details regarding how this listing is arrived at, see [DW5]. We emphasize that only the full faces are being listed, i.e., configurations formed by all the possible elements of $\mathcal{W}$ in a given 2-dimensional plane. As the set of weight vectors for a given principal orbit may be a subset of the full set of possible weight vectors, these full faces may degenerate to subfaces or even lower-dimensional faces (see Remark 6.2).


Listing convention. In the interest of economy and clarity, we make the convention that when we list vectors in $\mathcal{W}$ belonging to a 2-face we will use the freedom of permuting the summands to place nonzero components of the vectors first and we will only put down the minimum number of components necessary to specify the vectors.

Hexagons. There are 3 possibilities:

(H1) This is the face in the plane $\{x_1 + x_2 + x_3 = -1;\ x_a = 0 \text{ for } a > 3\}$. Points of $\mathcal{W}$ are $(-2^i, 1^j)$, $(-1^i, 1^j, -1^k)$, $(-1^i)$, where $i, j, k \in \{1, 2, 3\}$. The type III vectors form the vertices of the hexagon.

(H2) The plane here is $\{x_1 + x_2 = -1,\ x_3 + x_4 = 0,\ x_i = 0\ (i > 4)\}$. Points of $\mathcal{W}$ are the vertices
    u = (−2, 1, 0, 0),    v = (1, −2, 0, 0),
    y = (−1, 0, 1, −1),   y′ = (0, −1, 1, −1),
    z = (−1, 0, −1, 1),   z′ = (0, −1, −1, 1),
and the interior points α = (−1, 0, 0, 0), β = (0, −1, 0, 0).

(H3) The plane is $\{x_2 = -1,\ x_1 + x_3 + x_4 = 0,\ x_i = 0\ (i > 4)\}$. Points of $\mathcal{W}$ are the vertices u = (−1, −1, 1, 0), v = (0, −1, 1, −1), w = (1, −1, 0, −1), x = (1, −1, −1, 0), y = (0, −1, −1, 1), z = (−1, −1, 0, 1) and the centre t = (0, −1, 0, 0).

Square. (S) with midpoint t = (0, −1, 0, 0, 0) and vertices v = (−1, −1, 1, 0, 0), u = (0, −1, 0, 1, −1), s = (0, −1, 0, −1, 1), w = (1, −1, −1, 0, 0).

Trapezia. We have vertices $v, u, s, w, t$ with $2v - s = 2u - w$ and $t = \frac{1}{2}(s + w)$, i.e., these are symmetric trapezia. Below we list the possible $v, u, s, w$.

Table 4. Possible trapezoidal faces

        v                   u                   s                   w
(T1)    (−2, 1, 0, 0)       (−2, 0, 1, 0)       (0, 0, −2, 1)       (0, −2, 0, 1)
(T2)    (−2, 0, 1, 0)       (−2, 1, 0, 0)       (0, −1, 1, −1)      (0, 1, −1, −1)
(T3)    (−1, −1, 0, 1)      (0, −1, −1, 1)      (−2, 1, 0, 0)       (0, 1, −2, 0)
(T4)    (0, 0, 1, −1, −1)   (1, 0, 0, −1, −1)   (−2, 1, 0, 0, 0)    (0, 1, −2, 0, 0)
(T5)    (−1, 0, 0, 1, −1)   (0, 0, −1, 1, −1)   (−2, 1, 0, 0, 0)    (0, 1, −2, 0, 0)
(T6)    (1, −1, −1, 0, 0)   (1, −1, 0, −1, 0)   (0, 0, −1, 1, −1)   (0, 0, 1, −1, −1)

Note that the configuration with vertices (−1, −1, 1, 0, 0), (−1, −1, 0, 1, 0), (0, 0, 1, −1, −1), and (0, 0, −1, 1, −1) is equivalent to (T6) under the composition of a permutation and a J -isometric involution.
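The defining relation $2v - s = 2u - w$ of a symmetric trapezium can be checked mechanically against the rows of Table 4. The following is only a sanity-check sketch: the dictionary `TRAPEZIA` and the helper `is_symmetric_trapezium` are names introduced here for illustration, and the vertex data are transcribed by hand from the table. It also confirms that the entries of each listed vertex sum to −1, as they do for every vector listed in this section.

```python
# Vertices (v, u, s, w) of the trapezoidal faces, transcribed from Table 4.
TRAPEZIA = {
    "T1": ((-2, 1, 0, 0), (-2, 0, 1, 0), (0, 0, -2, 1), (0, -2, 0, 1)),
    "T2": ((-2, 0, 1, 0), (-2, 1, 0, 0), (0, -1, 1, -1), (0, 1, -1, -1)),
    "T3": ((-1, -1, 0, 1), (0, -1, -1, 1), (-2, 1, 0, 0), (0, 1, -2, 0)),
    "T4": ((0, 0, 1, -1, -1), (1, 0, 0, -1, -1), (-2, 1, 0, 0, 0), (0, 1, -2, 0, 0)),
    "T5": ((-1, 0, 0, 1, -1), (0, 0, -1, 1, -1), (-2, 1, 0, 0, 0), (0, 1, -2, 0, 0)),
    "T6": ((1, -1, -1, 0, 0), (1, -1, 0, -1, 0), (0, 0, -1, 1, -1), (0, 0, 1, -1, -1)),
}


def is_symmetric_trapezium(v, u, s, w):
    """Check 2v - s = 2u - w componentwise, and that each vertex sums to -1."""
    relation = all(2 * a - c == 2 * b - d for a, b, c, d in zip(v, u, s, w))
    sums = all(sum(x) == -1 for x in (v, u, s, w))
    return relation and sums


results = {name: is_symmetric_trapezium(*verts) for name, verts in TRAPEZIA.items()}
print(results)
```

Every entry of `results` should be `True`; a `False` would point to a transcription error in the corresponding row rather than to a failure of the classification.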


Parallelograms. We have vertices $v, u, s, w$ with $v - u = s - w$.

Table 5. Possible parallelogram faces

         v                      u                      s                      w
(P1)     (−2, 1, 0, 0)          (−1, 0, −1, 1)         (−2, 0, 1, 0)          (−1, −1, 0, 1)
(P2)     (−2, 1, 0, 0, 0)       (−2, 0, 1, 0, 0)       (0, 1, 0, −1, −1)      (0, 0, 1, −1, −1)
(P3)     (−2, 1, 0, 0, 0)       (−2, 0, 1, 0, 0)       (0, 0, −1, −1, 1)      (0, −1, 0, −1, 1)
(P4)     (−2, 1, 0, 0)          (−1, 0, 1, −1)         (−1, −1, 0, 1)         (0, −2, 1, 0)
(P5)     (−2, 1, 0, 0, 0)       (−1, 0, 0, 1, −1)      (−1, 0, 1, −1, 0)      (0, −1, 1, 0, −1)
(P6)     (−2, 1, 0, 0, 0)       (−1, 0, 0, −1, 1)      (−1, 0, −1, 1, 0)      (0, −1, −1, 0, 1)
(P7)     (−2, 1, 0, 0, 0)       (0, 1, −1, −1, 0)      (−1, 0, 0, 1, −1)      (1, 0, −1, 0, −1)
(P8)     (1, −1, −1, 0, 0, 0)   (0, 0, 0, 1, −1, −1)   (1, −1, 0, −1, 0, 0)   (0, 0, 1, 0, −1, −1)
(P9)     (1, −1, −1, 0, 0, 0)   (1, −1, 0, 0, −1, 0)   (0, 0, −1, 1, 0, −1)   (0, 0, 0, 1, −1, −1)
(P10)    (1, −1, −1, 0, 0, 0)   (1, 0, 0, 0, −1, −1)   (0, −1, −1, 1, 0, 0)   (0, 0, 0, 1, −1, −1)
(P11)    (0, 0, 1, −1, −1, 0)   (0, −1, 0, −1, 0, 1)   (1, 0, 0, 0, −1, −1)   (1, −1, −1, 0, 0, 0)
(P12)    (1, 0, −1, 0, −1)      (1, −1, −1, 0, 0)      (0, 0, −1, 1, −1)      (0, −1, −1, 1, 0)
(P13)    (−1, 0, −1, 0, 1)      (0, 0, −1, −1, 1)      (0, −1, −1, 1, 0)      (1, −1, −1, 0, 0)
(P14)    (−1, 0, 1, 0, −1)      (−1, −1, 1, 0, 0)      (0, 0, −1, 1, −1)      (0, −1, −1, 1, 0)
(P15)    (−1, 0, 1, 0, −1)      (−1, −1, 1, 0, 0)      (0, 0, 1, −1, −1)      (0, −1, 1, −1, 0)
(P16)    (−2, 1, 0, 0)          (0, 1, −2, 0)          (−1, 0, 1, −1)         (1, 0, −1, −1)
(P17)    (−2, 1, 0, 0)          (0, 1, 0, −2)          (−2, 0, 1, 0)          (0, 0, 1, −2)
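As with the trapezia, the relation $v - u = s - w$ can be verified row by row, together with the midpoints of (P16) and (P17) quoted in Remark 6.1. This is a sketch for checking the transcription only; `PARALLELOGRAMS`, `sub` and `midpoint` are illustrative names introduced here, and the vertex data are copied by hand from Table 5.

```python
from fractions import Fraction

# Vertices (v, u, s, w) of the parallelogram faces, transcribed from Table 5.
PARALLELOGRAMS = {
    "P1": ((-2, 1, 0, 0), (-1, 0, -1, 1), (-2, 0, 1, 0), (-1, -1, 0, 1)),
    "P2": ((-2, 1, 0, 0, 0), (-2, 0, 1, 0, 0), (0, 1, 0, -1, -1), (0, 0, 1, -1, -1)),
    "P3": ((-2, 1, 0, 0, 0), (-2, 0, 1, 0, 0), (0, 0, -1, -1, 1), (0, -1, 0, -1, 1)),
    "P4": ((-2, 1, 0, 0), (-1, 0, 1, -1), (-1, -1, 0, 1), (0, -2, 1, 0)),
    "P5": ((-2, 1, 0, 0, 0), (-1, 0, 0, 1, -1), (-1, 0, 1, -1, 0), (0, -1, 1, 0, -1)),
    "P6": ((-2, 1, 0, 0, 0), (-1, 0, 0, -1, 1), (-1, 0, -1, 1, 0), (0, -1, -1, 0, 1)),
    "P7": ((-2, 1, 0, 0, 0), (0, 1, -1, -1, 0), (-1, 0, 0, 1, -1), (1, 0, -1, 0, -1)),
    "P8": ((1, -1, -1, 0, 0, 0), (0, 0, 0, 1, -1, -1), (1, -1, 0, -1, 0, 0), (0, 0, 1, 0, -1, -1)),
    "P9": ((1, -1, -1, 0, 0, 0), (1, -1, 0, 0, -1, 0), (0, 0, -1, 1, 0, -1), (0, 0, 0, 1, -1, -1)),
    "P10": ((1, -1, -1, 0, 0, 0), (1, 0, 0, 0, -1, -1), (0, -1, -1, 1, 0, 0), (0, 0, 0, 1, -1, -1)),
    "P11": ((0, 0, 1, -1, -1, 0), (0, -1, 0, -1, 0, 1), (1, 0, 0, 0, -1, -1), (1, -1, -1, 0, 0, 0)),
    "P12": ((1, 0, -1, 0, -1), (1, -1, -1, 0, 0), (0, 0, -1, 1, -1), (0, -1, -1, 1, 0)),
    "P13": ((-1, 0, -1, 0, 1), (0, 0, -1, -1, 1), (0, -1, -1, 1, 0), (1, -1, -1, 0, 0)),
    "P14": ((-1, 0, 1, 0, -1), (-1, -1, 1, 0, 0), (0, 0, -1, 1, -1), (0, -1, -1, 1, 0)),
    "P15": ((-1, 0, 1, 0, -1), (-1, -1, 1, 0, 0), (0, 0, 1, -1, -1), (0, -1, 1, -1, 0)),
    "P16": ((-2, 1, 0, 0), (0, 1, -2, 0), (-1, 0, 1, -1), (1, 0, -1, -1)),
    "P17": ((-2, 1, 0, 0), (0, 1, 0, -2), (-2, 0, 1, 0), (0, 0, 1, -2)),
}


def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def midpoint(a, b):
    """Exact midpoint (a + b) / 2 with rational entries."""
    return tuple(Fraction(x + y, 2) for x, y in zip(a, b))


# v - u = s - w for every row.
is_parallelogram = {
    name: sub(v, u) == sub(s, w) for name, (v, u, s, w) in PARALLELOGRAMS.items()
}

# Midpoints y = (u + v)/2 and z = (s + w)/2 for the two rows singled out in Remark 6.1.
midpoints = {}
for name in ("P16", "P17"):
    v, u, s, w = PARALLELOGRAMS[name]
    midpoints[name] = (midpoint(u, v), midpoint(s, w))

print(all(is_parallelogram.values()))
print(midpoints)
```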

Remark 6.1. (P1), (P2), (P3), and (P17) are actually rectangles. (P16) also includes the midpoints $y = (u + v)/2 = (-1, 1, -1, 0)$ and $z = (s + w)/2 = (0, 0, 0, -1)$. The rectangle (P17) also includes the midpoints $y = (u + v)/2 = (-1, 1, 0, -1)$ and $z = (s + w)/2 = (-1, 0, 1, -1)$.

Remark 6.2. We must also consider subshapes of the above. Each symmetric trapezium contains two parallelograms. The two rectangles with midpoints (P17), (P16) will contain asymmetric trapezia. (P17) also contains parallelograms and squares. (For (P16), note that $s$ is present iff $w$ is.) Furthermore, there are numerous subshapes of the hexagons. The regular hexagon (H3) contains rectangles with midpoint (by omitting opposite pairs of vertices). Besides triangles, the hexagon (H2) contains pentagons, rectangles and squares (with midpoints), and kite-shaped quadrilaterals (e.g. $y'uz'v$). For (H1) see the discussion before Theorem 6.12. Finally, the triangle with midpoints of all sides (where the vertices are the three type III vectors with 1 in the same place) contains a trapezium (by omitting one vertex) and hence parallelograms.

Remark 6.3. We also note for future reference that there are examples where we can have four or more coplanar elements of $\mathcal{W}$ but the plane cannot be a face. These examples are not of course relevant to the case of adjacent (1B) vertices, but some will be relevant when we consider multiple vertices of type (2). The examples which we will need in that context are the following three trapezia:

Table 6. Further trapezia

         v                  u                  s                  w
(T*1)    (0, 1, −1, −1)     (1, 0, −1, −1)     (−2, 1, 0, 0)      (1, −2, 0, 0)
(T*2)    (0, −1, 1, −1)     (1, −1, 0, −1)     (−2, 1, 0, 0)      (0, 1, −2, 0)
(T*3)    (1, −1, −1, 0)     (1, −1, 0, −1)     (−1, 0, −1, 1)     (−1, 0, 1, −1)

In (T*2), (T*3), as in (T1)-(T6), we have $2v - s = 2u - w$. In these examples $t = \frac{1}{2}(s + w)$ may also be present. In (T*1) we have $s - w = 3(v - u)$, and the vectors $t = (2s + w)/3 = (-1, 0, 0, 0)$ and $r = (s + 2w)/3 = (0, -1, 0, 0)$ will also be present.
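The relations just stated for Table 6, and the non-face argument for (T*2) that follows, reduce to exact rational arithmetic on the listed vectors. The sketch below is illustrative only: `lincomb` and `in_plane` are helper names introduced here, the two extra vectors `u1 = (-1, 1, 0, -1)` and `u2 = (-1, -1, 0, 1)` are the additional elements of the weight set used in the (T*2) argument, and the plane membership test simply solves for two coefficients by Cramer's rule and checks the remaining coordinates.

```python
from fractions import Fraction


def lincomb(*terms):
    """Return sum(coef * vec) over (coef, vec) pairs, with exact rational entries."""
    n = len(terms[0][1])
    return tuple(sum(Fraction(c) * vec[i] for c, vec in terms) for i in range(n))


def in_plane(point, origin, p, q):
    """True if point - origin is a linear combination of the directions p and q."""
    r = [Fraction(a) - b for a, b in zip(point, origin)]
    n = len(r)
    for i in range(n):
        for j in range(i + 1, n):
            det = p[i] * q[j] - p[j] * q[i]
            if det != 0:
                alpha = (r[i] * q[j] - r[j] * q[i]) / det  # Cramer's rule
                beta = (p[i] * r[j] - p[j] * r[i]) / det
                return all(alpha * p[k] + beta * q[k] == r[k] for k in range(n))
    raise ValueError("p and q are linearly dependent")


# Rows of Table 6, transcribed by hand.
T1S = dict(v=(0, 1, -1, -1), u=(1, 0, -1, -1), s=(-2, 1, 0, 0), w=(1, -2, 0, 0))
T2S = dict(v=(0, -1, 1, -1), u=(1, -1, 0, -1), s=(-2, 1, 0, 0), w=(0, 1, -2, 0))
T3S = dict(v=(1, -1, -1, 0), u=(1, -1, 0, -1), s=(-1, 0, -1, 1), w=(-1, 0, 1, -1))

# 2v - s = 2u - w for (T*2) and (T*3).
trapezium_ok = [
    lincomb((2, t["v"]), (-1, t["s"])) == lincomb((2, t["u"]), (-1, t["w"]))
    for t in (T2S, T3S)
]

# s - w = 3(v - u) for (T*1), with the extra points t and r on the long side.
t1_ok = lincomb((1, T1S["s"]), (-1, T1S["w"])) == lincomb((3, T1S["v"]), (-3, T1S["u"]))
t_point = lincomb((Fraction(2, 3), T1S["s"]), (Fraction(1, 3), T1S["w"]))
r_point = lincomb((Fraction(1, 3), T1S["s"]), (Fraction(2, 3), T1S["w"]))

# (T*2): a point of the plane that is a convex combination of u1 and u2.
u1, u2 = (-1, 1, 0, -1), (-1, -1, 0, 1)
mixed = lincomb((Fraction(2, 3), u1), (Fraction(1, 3), u2))
same_point = mixed == lincomb((Fraction(2, 3), T2S["s"]), (Fraction(1, 3), T2S["u"]))

p = tuple(a - b for a, b in zip(T2S["v"], T2S["u"]))  # direction v - u
q = tuple(a - b for a, b in zip(T2S["s"], T2S["u"]))  # direction s - u
mixed_in_plane = in_plane(mixed, T2S["u"], p, q)
u1_in_plane = in_plane(u1, T2S["u"], p, q)
u2_in_plane = in_plane(u2, T2S["u"], p, q)

print(trapezium_ok, t1_ok, t_point, r_point)
print(same_point, mixed_in_plane, u1_in_plane, u2_in_plane)
```

Under these inputs the mixed point lies in the plane of (T*2) while neither of the two vectors it is built from does, which is exactly the obstruction to the plane cutting out a face.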


As an example, we explain why the trapezium (T*2) can never be a face. As $u$ is present in $\mathcal{W}$, so are $u' = (-1, 1, 0, -1)$ and $u'' = (-1, -1, 0, 1)$. Now $(2u' + u'')/3 = (2s + u)/3 = (-1, \frac{1}{3}, 0, -\frac{1}{3})$ is in the plane, but $u'$ is not, so this plane cannot give a face. Similar arguments involving $(1, 0, -1, -1)$, $(-1, 0, 0, 0)$ (resp. $(-1, 0, -1, 1)$, $(-1, 0, 1, -1)$) show (T*1) (resp. (T*3)) cannot be faces. These arguments also show several parallelograms cannot be faces, but these will not be relevant for our purposes.

We now begin to classify the possible 2-faces which arise from adjacent (1B) vertices. We shall repeatedly use Prop. 1.4, Cor. 3.4, and Lemma 3.5. Let $E$ denote the affine 2-plane determined by the 2-face being studied.

Theorem 6.4. Suppose we have adjacent (1B) vertices corresponding to a parallelogram face $vusw$ of $\mathrm{conv}(\mathcal{W})$. So we have $\bar{u} = (\bar{a} + \bar{c})/2$ and $\bar{w} = (\bar{a}' + \bar{c})/2$ for null $\bar{a}, \bar{a}'$. Suppose the vertices $v, u, s, w$ are the only elements of $\mathcal{W}$ in the face. Then $u, w$ are adjacent vertices of the parallelogram, and either
(i) $\mathcal{C} \cap E = \{\bar{c}, \bar{a}, \bar{a}', \bar{e}\}$, where $\bar{e}$ is null with $v = (a + e)/2$ and $s = (a' + e)/2$; or
(ii) $\bar{v}, \bar{s} \in \mathcal{C}$ and $J(\bar{a}, \bar{v}) = J(\bar{a}', \bar{s}) = J(\bar{s}, \bar{v}) = 0$.
Moreover, if none of $v, u, s, w$ is type I, then (i) cannot occur.

Proof. We may introduce coordinates in the 2-plane $E$ using the sides $sv$ and $sw$ to define the coordinate axes. In this way we can speak of "left" or "right", "up" or "down". If we extend the sides of the parallelogram to infinite lines, these lines divide the part of the plane outside the parallelogram into 8 regions, and $\bar{c}$ must be in the interior of one such region. We first observe that if $\bar{c}$ is in one of the four regions which only meet the parallelogram at a vertex, then $\bar{a}\bar{a}'$ does not meet the parallelogram, contradicting Lemma 3.4.

(A) Let $\bar{c}$ then lie in a region which meets the parallelogram in an edge. Without loss of generality we may assume the edge is $uw$. By Cor. 3.4, all elements of $\mathcal{C} \cap E$ lie on or between the rays from $\bar{c}$ through $\bar{a}, \bar{a}'$. Hence, by Lemma 3.2, $J(\bar{b}, \bar{c}) > 0$ for all $\bar{b} \in \mathcal{C} \setminus \{\bar{c}\}$. If $\bar{b}$ is a rightmost element of $(\mathcal{C} \cap E) \setminus \{\bar{c}\}$, then as $\bar{b} + \bar{c}$ cannot be written in another way as a sum of two elements of $\mathcal{C}$, we deduce from Proposition 1.4 that $\bar{b} + \bar{c} \in d + \mathcal{W}$. So $\bar{b}$ is either $\bar{a}$ or $\bar{a}'$. All other elements of $\mathcal{C} \cap E$ lie to the left of $\bar{a}\bar{a}'$. Note also that a rightmost element of $(\mathcal{C} \cap E) \setminus \{\bar{c}, \bar{a}, \bar{a}'\}$ satisfies $b + c = a + a'$, $2v$ or $2s$.

(B) Next let $\bar{e} = 2\bar{v} - \bar{a}$. Observe that as well as $\bar{v} = (\bar{a} + \bar{e})/2$, we have $\bar{s} = (\bar{a}' + \bar{e})/2$, since $2\bar{v} - \bar{a} = 2(\bar{v} - \bar{u}) + \bar{c} = 2(\bar{s} - \bar{w}) + \bar{c} = 2\bar{s} - \bar{a}'$. If $\bar{e} \in \mathcal{C}$, then it must be null, and the same argument as above shows that no elements of $(\mathcal{C} \cap E) \setminus \{\bar{e}\}$ lie to the left of $\bar{a}, \bar{a}'$, so we are in case (i). Now, Lemma 3.2 shows $J(\bar{h}, \bar{k}) > 0$ for all $\bar{h} \neq \bar{k} \in \mathcal{C} \cap E$. If $v, u, s, w$ are all type II/III, we see that $F_{\bar{c}}, F_{\bar{e}}$ are of one sign and $F_{\bar{a}}, F_{\bar{a}'}$ the other sign. But now the contributions from $\bar{a} + \bar{a}'$ and $\bar{c} + \bar{e}$ in the superpotential equation cannot cancel.

If $\bar{e} \notin \mathcal{C}$ then, as in the argument before Theorem 3.8, $\bar{s}, \bar{v} \in \mathcal{C}$ and we are in case (ii). Proposition 3.7 shows $\bar{v}, \bar{s}$ are orthogonal. Moreover, note that the remark at the end of (A) shows that $v + c$ or $s + c$ is left of $a + a'$.

Lemma 6.5. In case (ii) of Theorem 6.4, we have $J(\bar{v}, \bar{v}) = J(\bar{s}, \bar{s})$.

Proof. As $\bar{c}$ and $\bar{a} = 2\bar{u} - \bar{c}$ are both null, and similarly $\bar{c}$ and $\bar{a}' = 2\bar{w} - \bar{c}$ are both null, we deduce (cf. Remark 3.9)

\[
J(\bar{u}, \bar{u}) = J(\bar{u}, \bar{c}), \qquad J(\bar{w}, \bar{w}) = J(\bar{w}, \bar{c}). \tag{6.2}
\]
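For the reader's convenience, the first relation in (6.2) can be unpacked as follows, assuming only that $J$ is a symmetric bilinear form (which is how it is expanded throughout this proof) and that $\bar{c}$ and $\bar{a} = 2\bar{u} - \bar{c}$ are null:

\[
0 = J(\bar{a}, \bar{a}) = J(2\bar{u} - \bar{c},\, 2\bar{u} - \bar{c})
  = 4J(\bar{u}, \bar{u}) - 4J(\bar{u}, \bar{c}) + J(\bar{c}, \bar{c})
  = 4\bigl(J(\bar{u}, \bar{u}) - J(\bar{u}, \bar{c})\bigr).
\]

The second relation follows in the same way with $\bar{w}$ and $\bar{a}'$ in place of $\bar{u}$ and $\bar{a}$.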


We also have

\[
2J(\bar{u}, \bar{v}) = J(\bar{c}, \bar{v}), \qquad 2J(\bar{w}, \bar{s}) = J(\bar{c}, \bar{s}) \tag{6.3}
\]

from the orthogonality conditions on $\bar{a}, \bar{v}$ and $\bar{a}', \bar{s}$. Now $J(\bar{s}, \bar{s}) - J(\bar{v}, \bar{v}) = J(\bar{s}, \bar{s}) - J(\bar{w} - \bar{u} - \bar{s},\, \bar{w} - \bar{u} - \bar{s})$, which, on expanding out and using the second relations of Eqs.(6.2),(6.3), becomes $J(2\bar{u} - \bar{c},\, \bar{w} - \bar{s}) - J(\bar{u}, \bar{u})$. Now
\[
J(2\bar{u} - \bar{c},\, \bar{w} - \bar{s}) - J(\bar{u}, \bar{u}) = J(2\bar{u} - \bar{c},\, \bar{u} - \bar{v}) - J(\bar{u}, \bar{u}) = J(2\bar{u} - \bar{c},\, \bar{u}) - J(\bar{u}, \bar{u}) = J(\bar{u} - \bar{c},\, \bar{u}) = 0.
\]
We have used the first relations of Eqs.(6.3), (6.2) in the second and fourth equalities.

Remark 6.6. We must also consider the case when the midpoint of one side or a pair of opposite sides of the parallelogram face is in $\mathcal{W}$. This can happen for (P16) and (P17). Note that $v, u, s, w$ are type II/III in these cases. In fact, the argument of Theorem 6.4 is still valid if one or both of the midpoints of $vu$, $sw$ is in $\mathcal{W}$ and $c$ lies in the region to the right of $uw$ (or the left of $vs$).

Keeping $c$ in the region to the right of $uw$, we now need to consider the case where one or both of the midpoints of $vs$, $uw$ is in $\mathcal{W}$. The conclusions (in 6.4(ii)) still hold except that we no longer have $J(\bar{v}, \bar{s}) = 0$. However, we have to make slight modifications to the arguments as $\frac{1}{2}(\bar{a} + \bar{a}')$ may be in $\mathcal{C} \cap E$. If $\bar{e} \in \mathcal{C}$, then, as $\bar{a} + \bar{a}'$ is not in $d + \mathcal{W}$, the usual sign argument shows that the terms in the superpotential equation summing up to $\bar{a} + \bar{a}'$ do not cancel, which is a contradiction. So $\bar{e} \notin \mathcal{C}$ and our previous arguments hold except for the use of Prop. 3.7.

Note that we also have to consider the possibility that $a$, $a'$, and $e$ lie on the line through $vs$. But now the midpoint of $uw$ must be present and $\mathcal{C} \cap E = \{\bar{c}, \bar{a}, \bar{a}', \frac{1}{2}(\bar{a} + \bar{a}')\}$, with $v + s = a + a'$. The usual sign argument then forces the midpoints of $uw$ and $vs$ to be present and of type I. Hence this special configuration cannot occur in (P16) or (P17).

Lastly, since the proof of Lemma 6.5 makes no mention of midpoints, it remains valid if midpoints are present.

The conditions of Theorem 6.4 and Lemma 6.5, together with the nullity of $\bar{a}, \bar{a}', \bar{c}$, put very strong constraints on $vusw$ and the dimensions. In fact, one can check that these constraints cannot be satisfied for any of our parallelograms (including those of Remark 6.2) with one exception. This is the rectangle $yy'z'z$ in (H2) with $c = (-2, 1, 0, \ldots)$ and $\frac{1}{d_3} + \frac{1}{d_4} = \frac{1}{d_1}$, which will be dealt with in Lemma 8.5. We now give an example of how to apply the above conditions in a specific case.

Example 6.7. Consider parallelogram (P8). The equation of the 2-plane $E$ containing the parallelogram is

\[
x_2 = -x_1, \qquad x_5 = x_6, \qquad x_2 + x_5 = -1, \qquad x_1 + \cdots + x_6 = -1 \tag{6.4}
\]

and $x_i = 0$ for $i > 6$. As all vertices are type II/III, we must be in case (ii) of Theorem 6.4.

(A) Take $c$ to face the side $uw$. Note that $vs$ and $uw$ have equation $x_1 = 1$, $x_1 = 0$ respectively, so $c_1 < 0$. Also, the remarks at the end of parts (A) and (B) in the proof of Theorem 6.4 show that $c_1 > -\frac{1}{3}$, as $v + c$ or $s + c$ is left of $a + a'$ so $1 + c_1 > -2c_1$.


The condition $J(\bar{v}, \bar{s}) = 0$ implies $d_1 = d_2 = 2$ and Lemma 6.5 implies $d_3 = d_4$. Equations (6.2) and (6.3) give four linear equations in $c_i$. Now $d_3 = d_4$ and Eq.(6.3) show $c_3 = c_4$, so the equations for the plane give $c = (\frac{1}{2} - c_4,\ -\frac{1}{2} + c_4,\ c_4,\ c_4,\ -\frac{1}{2} - c_4,\ -\frac{1}{2} - c_4)$. Next $d_1 = d_2 = 2$ and Eq.(6.3) show $c_4 = 3d_4/(2d_4 + 2)$ and $c_1 = (1 - 2d_4)/(2d_4 + 2)$. But the condition $-\frac{1}{3} < c_1 < 0$ now implies $d_3 = d_4 = 1$, and it follows that $c$ cannot be null.

(B) The argument if $c$ faces $vs$ is very similar. We have $d_3 = d_4$ and $d_5 = d_6 = 2$, and the orthogonality equations imply $c_3 = c_4$. So $c$ has the same form as in the second paragraph of (A) above. We find $c_4 = -3d_4/(2d_4 + 2)$ and $c_1 = (1 + 4d_4)/(2d_4 + 2)$. But we now have the inequality $1 < c_1 < \frac{4}{3}$, so again $d_3 = d_4 = 1$, violating nullity.

(C) If $c$ faces $vu$ or $sw$ then we need $J(\bar{s}, \bar{w}) = 0$ (resp. $J(\bar{v}, \bar{u}) = 0$), which is impossible.

Example 6.8. The example of the square (S) with midpoint can be treated in essentially the same way as the parallelograms. By symmetry, we may assume that $c$ lies in the region that intersects $uw$. However, because $\frac{1}{2}(a + a')$ may now be the midpoint and hence in $\mathcal{W}$, the configuration of Theorem 6.4(i) can occur, even though all vertices are type II. We have $\mathcal{C} \cap E = \{\bar{c}, \bar{a}, \bar{a}', \bar{e}\}$ with $a = (-1, -1, 1, 1, -1, \ldots)$, $a' = (1, -1, -1, -1, 1, \ldots)$, $c = (1, -1, -1, 1, -1, \ldots)$, and $e = (-1, -1, 1, -1, 1, \ldots)$ with nullity condition $\sum_{i=1}^{5} \frac{1}{d_i} = 1$. We will be able to rule this case out in Sect. 7. On the other hand, the configuration of Theorem 6.4(ii) cannot occur, as one easily checks.

Next assume that adjacent (1B) vertices in $\Delta^{\bar{c}}$ determine a trapezium $vusw$ as shown in the diagram below:

[Diagram: the trapezium $vusw$, with $t$ marked on the side $sw$, and the surrounding part of the plane $E$ divided into regions labelled I to IX.]

where $t$ is the midpoint of $sw$ and $vu$ is parallel to $sw$. We assume that $v, u, s, w \in \mathcal{W}$ but our conclusions hold whether or not $t$ lies in $\mathcal{W}$. We will now derive constraints on the 2-face and $E \cap \mathcal{C}$ resulting from having $c$ lie in one of the regions shown above. For theoretical considerations, we need only treat the cases where $c$ lies in Regions I to


VI. In practice, for an asymmetric trapezium, we must consider $c$ lying in the remaining regions as well. In the following we will adopt the convention that $\bar{a}, \bar{a}'$ always denote null vectors in $\mathcal{C}$.

(I) $c$ in Region I. This is impossible because then $\bar{s} = \frac{1}{2}(\bar{c} + \bar{a})$ and $\bar{w} = \frac{1}{2}(\bar{c} + \bar{a}')$ for some $\bar{a}, \bar{a}'$, and so $\bar{a}\bar{a}'$ would not intersect $\mathrm{conv}(\frac{1}{2}(d + \mathcal{W}))$, a contradiction to Cor. 3.4.

(II) $c$ in Region II. Then $\bar{v} = \frac{1}{2}(\bar{c} + \bar{a})$, $\bar{u} = \frac{1}{2}(\bar{c} + \bar{a}')$ for some $\bar{a}, \bar{a}'$. We get a contradiction to Cor. 3.4 if $\bar{a}, \bar{a}'$ lie below the line $sw$. They also cannot lie on the line $sw$ since the argument in (A) in the proof of Theorem 6.4 and Cor. 3.4 imply that $\mathcal{C} \cap E = \{\bar{c}, \bar{a}, \bar{a}'\}$, and the terms corresponding to $\bar{s}, \bar{w}$ in the superpotential equation would be unaccounted for. Let $e = 2s - a$, $e' = 2w - a'$. These points lie in Region VI, and since we have a trapezium, $e \neq e'$. We may now apply Theorem 3.8 to $\bar{a}$ and $\bar{a}'$ to obtain the possibilities:
    (i) $\bar{s}, \bar{w} \in \mathcal{C}$; $J(\bar{a}, \bar{s}) = 0 = J(\bar{a}', \bar{w})$,
    (ii) $\bar{s} \in \mathcal{C}$, $J(\bar{a}, \bar{s}) = 0$; $\bar{w} \notin \mathcal{C}$, $\bar{e}' \in \mathcal{C}$ is null, $J(\bar{e}', \bar{s}) = 0$,
    (iii) $\bar{w} \in \mathcal{C}$, $J(\bar{w}, \bar{a}') = 0$; $\bar{s} \notin \mathcal{C}$, $\bar{e} \in \mathcal{C}$ is null, $J(\bar{e}, \bar{w}) = 0$.
Note that the last condition in (ii) (resp. (iii)) results from applying Theorem 3.8 to $\bar{e}'$ (resp. $\bar{e}$).

(III) $c$ in Region III. We have $\bar{v} = \frac{1}{2}(\bar{c} + \bar{a})$, $\bar{w} = \frac{1}{2}(\bar{c} + \bar{a}')$ for some $\bar{a}, \bar{a}'$ lying respectively in Regions VIII and VI (in view of Cor. 3.4). Applying Theorem 3.8 we obtain the possibilities:
    (i) $\bar{s} \in \mathcal{C}$, $J(\bar{a}, \bar{s}) = 0 = J(\bar{a}', \bar{s})$,
    (ii) $\bar{s} \notin \mathcal{C}$, $2\bar{s} = \bar{a} + \bar{a}'$ (which implies $c + s = v + w$).

(IV) $c$ in Region IV. We have $\bar{u} = \frac{1}{2}(\bar{c} + \bar{a})$, $\bar{w} = \frac{1}{2}(\bar{c} + \bar{a}')$ for some $\bar{a}, \bar{a}' \in \mathcal{C} \cap E$. If $a$ lies in Region IX, then Cor. 3.4 implies that $\bar{a}'$ lies in Region VI. Applying Theorem 3.8 to $\bar{a}$ and $\bar{a}'$ we obtain the possibilities:
    (i) $\bar{s} \in \mathcal{C}$, $J(\bar{a}, \bar{s}) = 0 = J(\bar{a}', \bar{s})$,
    (ii) $2s = a + a'$, i.e., $c + s = u + w$.
If $a$ lies on the line $sv$, then we may apply Theorem 3.8 to $\bar{a}'$. We cannot have $2\bar{s} = \bar{a}' + \bar{e}$ with $\bar{e} \in \mathcal{C}$ and null, otherwise $\bar{a}'\bar{e}$ would not intersect $\mathrm{conv}(\frac{1}{2}(d + \mathcal{W}))$. So we have
    (iii) $\bar{s} \in \mathcal{C}$ and $J(\bar{s}, \bar{a}') = 0$.
If $a$ lies in Region II, then $\bar{a}'$ lies in Region VI. Let $\bar{e} = 2\bar{v} - \bar{a}$ and $\bar{e}' = 2\bar{s} - \bar{a}'$. As we have a trapezium, $\bar{e} \neq \bar{e}'$. Now $e$ lies in Region VII or VIII while $e'$ lies in Region VIII or IX, so by Cor. 3.4 $\bar{e}$ and $\bar{e}'$ cannot both lie in $\mathcal{C}$ and hence be null. Theorem 3.8 now gives the possibilities:
    (iv) $\bar{v}, \bar{s} \in \mathcal{C}$, $J(\bar{a}, \bar{v}) = 0 = J(\bar{a}', \bar{s})$ (and by Prop. 3.7 $J(\bar{v}, \bar{s}) = 0$),
    (v) $\bar{v} \in \mathcal{C}$, $J(\bar{a}, \bar{v}) = 0$, $\bar{e}' \in \mathcal{C}$ is null, and $J(\bar{e}', \bar{v}) = 0$,
    (vi) $\bar{s} \in \mathcal{C}$, $J(\bar{a}', \bar{s}) = 0$, $\bar{e} \in \mathcal{C}$ is null, and $J(\bar{e}, \bar{s}) = 0$.

(V) $c$ in Region V. We have $\bar{u} = \frac{1}{2}(\bar{c} + \bar{a}')$, $\bar{s} = \frac{1}{2}(\bar{c} + \bar{a})$ for some $\bar{a}, \bar{a}'$ lying respectively in Regions VIII and II (by Cor. 3.4). Theorem 3.8 now gives the possibilities:
    (i) $\bar{v} \in \mathcal{C}$, $J(\bar{a}, \bar{v}) = 0 = J(\bar{a}', \bar{v})$,
    (ii) $\bar{v} \notin \mathcal{C}$, $2\bar{v} = \bar{a} + \bar{a}'$ (which implies $c + v = u + s$).

(VI) $c$ in Region VI. We have $\bar{s} = \frac{1}{2}(\bar{c} + \bar{a})$, $\bar{w} = \frac{1}{2}(\bar{c} + \bar{a}')$ for some $\bar{a}, \bar{a}'$ lying respectively in Regions VIII and IV (by Cor. 3.4). (To rule out $\bar{a}, \bar{a}'$ lying in the line $vu$, we proceed as in case (II), except that when $t \in \mathcal{W}$, we conclude instead


that $\mathcal{C} \cap E = \{\bar{c}, \bar{a}, \bar{a}', \frac{1}{2}(\bar{a} + \bar{a}')\}$. One can still check that $\bar{v}, \bar{u}$ cannot be both accounted for.) Now let $\bar{e} = 2\bar{v} - \bar{a}$ and $\bar{e}' = 2\bar{u} - \bar{a}'$. Again, having a trapezium means $\bar{e} \neq \bar{e}'$ and Theorem 3.8 now gives the possibilities:
    (i) $\bar{u}, \bar{v} \in \mathcal{C}$, $J(\bar{a}, \bar{v}) = 0 = J(\bar{a}', \bar{u})$ (and $J(\bar{u}, \bar{v}) = 0$ by Prop. 3.7),
    (ii) $\bar{v} \in \mathcal{C}$, $J(\bar{a}, \bar{v}) = 0$, $\bar{u} \notin \mathcal{C}$, $\bar{e}' \in \mathcal{C}$ is null,
    (iii) $\bar{u} \in \mathcal{C}$, $J(\bar{a}', \bar{u}) = 0$, $\bar{v} \notin \mathcal{C}$, $\bar{e} \in \mathcal{C}$ is null.

Remark 6.9. We mention a useful inequality which holds in (II) and (VI) above, as well as in parallelogram faces with the same configuration (cf. Example 6.7(A)). Let us consider (II), where we choose in $E$ coordinates such that the first coordinate axis is parallel to $\bar{s}\bar{w}$ (assumed to be horizontal) and the second coordinate axis is arbitrary, with the second coordinate increasing as we go up. As in (A) in the proof of Theorem 6.4, all points in $(\mathcal{C} \cap E) \setminus \{\bar{c}, \bar{a}, \bar{a}'\}$ must lie below the line $\bar{a}\bar{a}'$. Let $\bar{b}$ be a point among these with largest second coordinate. Since we have seen above that either $\bar{s}$ or $\bar{w}$ lies in $\mathcal{C} \cap E$, we have $s_2 \leq b_2$. Furthermore, as $\bar{b} + \bar{c}$ cannot lie in $d + \mathcal{W}$ it must be balanced by sums of elements in $\mathcal{C} \cap E$, with the limiting configuration given by $\bar{a} + \bar{a}'$. So we have $\frac{1}{2}(b_2 + c_2) \leq a_2 = a'_2 = 2v_2 - c_2$. Combining the two inequalities we get $3c_2 \leq 4v_2 - s_2$.

Equality in the above holds iff $\bar{b}$ lies in $\bar{s}\bar{w}$ and $\bar{b} + \bar{c} = \bar{a} + \bar{a}'$. In particular, $\bar{b}$ is unique, so in II(i), the inequality above is strict. Note that we only need $\bar{v}\bar{u}$ and $\bar{s}\bar{w}$ to be parallel and the presence or absence of $t$ in $\mathcal{W}$ is immaterial. Hence in Theorem 6.4(ii) we also have an analogous strict inequality, which we have already used, e.g., in (B) of Example 6.7. (For a parallelogram, there may be midpoints on the pair of non-horizontal sides lying in $\frac{1}{2}(d + \mathcal{W})$, but $\frac{1}{2}(\bar{b} + \bar{c})$ can never equal these midpoints, so we still get the inequality we want.)
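Written out, the two inequalities of the remark combine as follows; this is only a restatement of the step above, using the second coordinates as in the text:

\[
\tfrac{1}{2}(s_2 + c_2) \;\leq\; \tfrac{1}{2}(b_2 + c_2) \;\leq\; 2v_2 - c_2
\quad\Longrightarrow\quad
s_2 + c_2 \;\leq\; 4v_2 - 2c_2
\quad\Longrightarrow\quad
3c_2 \;\leq\; 4v_2 - s_2 .
\]

For instance, when the two parallel sides sit at heights $v_2 = 1$ and $s_2 = 0$ (as for (T3) in Table 4 if $x_4$ is taken as the second coordinate), the bound reads $c_2 \leq \frac{4}{3}$.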
For the configuration in (VI), we still have an analogous inequality, but since $\frac{1}{2}(\bar{a} + \bar{a}') \in \mathcal{C}$, we lose uniqueness of $\bar{b}$ and hence the strict inequality. We will also have occasion to apply the above analysis to appropriate trapezoidal regions in hexagon (H3).

The method described above together with Remark 6.9 can now be used to rule out the trapezia (T1)-(T6) as well as those mentioned in Remark 6.2.

Example 6.10. For the trapezium (T3), the vectors $v, u, s, w$ are given in Table 4, and lie in the 2-plane $\{x_1 + x_2 + x_3 + x_4 = -1,\ x_2 + 2x_4 = 1\}$. The line $vu$ is given by $x_4 = 1$ while $sw$ is given by $x_4 = 0$. The line $sv$ is given by $x_3 = 0$ and $wu$ is given by $x_1 = 0$. The vector $c$ that we are looking for has the form $(-c_3 + c_4 - 2,\ 1 - 2c_4,\ c_3,\ c_4)$. Since the trapezium is symmetric, an explicit symmetry being induced by interchanging $x_1$ and $x_3$, we need only consider $c$ lying in Regions II-VI.

(A) If $c$ lies in Region III, then $c_1 > 0$, $c_4 > 1$. Since $a = 2v - c$, we obtain $a = (c_3 - c_4,\ -3 + 2c_4,\ -c_3,\ 2 - c_4)$. Similarly, $a' = (c_3 - c_4 + 2,\ 1 + 2c_4,\ -4 - c_3,\ -c_4)$. If we are in case (ii), then $c = v + w - s = (1, -1, -2, 1)$, which violates $c_4 > 1$. So we must be in case (i). It follows from $J(\bar{a}, \bar{s}) = 0 = J(\bar{a}', \bar{s})$ that $d_1 + 3 = -2c_3 + 4c_4$ and $J(\bar{w}, \bar{s}) = J(\bar{v}, \bar{s})$. The second equality implies that $d_1 = d_2$. Using this together with the first equality and the null condition for $\bar{a}'$ (in the form $J(\bar{w}, \bar{w}) = J(\bar{w}, \bar{c})$, see Remark 3.9) we get $c_4 = d_1(d_1 - 1)/(4d_1 + 2d_3)$. Since $c_4 > 1$, we have $d_1(d_1 - 5) > 2d_3$, so $d_1 > 5$. But by Remark 3.1, $J(\bar{s}, \bar{s}) < 0$, which gives $d_1 < 5$ (since $d_1 = d_2$), a contradiction.

(B) Let $c$ lie in Region IV, so that $c_1 > 0$, $0 < c_4 < 1$. We obtain $a = (2 + c_3 - c_4,\ 2c_4 - 3,\ -2 - c_3,\ 2 - c_4)$ and $a' = (2 + c_3 - c_4,\ 1 + 2c_4,\ -4 - c_3,\ -c_4)$. We claim


that $a_3 > 0$, so that $a$ lies in Region IX. To see this, we solve for $c_3, c_4$ using the null conditions $J(\bar{u}, \bar{u}) = J(\bar{c}, \bar{u})$ and $J(\bar{w}, \bar{w}) = J(\bar{c}, \bar{w})$ for $\bar{a}, \bar{a}'$ respectively. We obtain $c_4 = (d_2 d_3 + 2d_3 d_4 - d_2 d_4)/(d_3(d_2 + 3d_4))$ and $a_3 = -2 - c_3 = (d_2 d_3 + 2d_3 d_4 - d_2 d_4)/(d_2(d_2 + 3d_4))$. Since $c_4 > 0$ we obtain our claim.

Since $a$ lies in Region IX, we first check if (ii) holds. In this case, $c = (2, -1, -3, 1)$ which contradicts $c_4 < 1$. The equations in (i) together imply the contradiction $0 = -4/d_2$.

(C) Suppose $c$ lies in Region V, so that $c_1 > 0$, $c_4 < 0$. We obtain $a = (c_3 - c_4 - 2,\ 1 + 2c_4,\ -c_3,\ -c_4)$ and $a' = (c_3 - c_4 + 2,\ 2c_4 - 3,\ -2 - c_3,\ 2 - c_4)$. If (ii) holds then $c = (-1, 1, -1, 0)$ and this contradicts $c_1 > 0$. Hence (i) must hold.

By Remark 3.9, the null condition for $\bar{a}$ is $J(\bar{s}, \bar{s}) = J(\bar{s}, \bar{c})$, which is $\frac{c_3}{d_1} - \bigl(\frac{1}{d_1} + \frac{1}{d_2}\bigr)c_4 = 0$. The two equations in (i) imply $J(\bar{u}, \bar{v}) = J(\bar{s}, \bar{v})$ and $J(\bar{v}, \bar{v}) < 0$, which in turn give $d_1 = 2$. Using this, the null condition for $\bar{a}$, and $J(\bar{a}', \bar{v}) = 0$, we obtain $c_4 = \frac{-1}{1 + d_2}$ and $c_3 = -\frac{d_2 + 2}{d_2(d_2 + 1)}$. But $c_1 = c_4 - c_3 - 2 > 0$, which simplifies to $1 > d_2(d_2 + 1)$, a contradiction.

(D) Let $c$ lie now in Region II. Then $c_1 < 0$, $c_3 < 0$, $1 < c_4 \leq \frac{4}{3}$, where the last upper bound comes from the inequality in Remark 6.9. We obtain $a = (c_3 - c_4,\ 2c_4 - 3,\ -c_3,\ 2 - c_4)$ and $a' = (2 + c_3 - c_4,\ 2c_4 - 3,\ -2 - c_3,\ 2 - c_4)$. The null conditions for $\bar{a}, \bar{a}'$ then give

\[
c_3 = -\frac{2d_1 d_4 + d_1 d_2 - d_2 d_4}{(d_1 + d_3)(d_2 + 2d_4) - d_2 d_4}, \qquad
c_4 = \frac{(d_1 + d_3)(d_2 + 2d_4)}{(d_1 + d_3)(d_2 + 2d_4) - d_2 d_4}.
\]

Suppose we are in case (i). The two equations and the above values of $c_3, c_4$ combine to give $(d_1 - d_3)\bigl((d_1 + d_3)(d_2 + 2d_4) - d_2 d_4\bigr) = 0$. However, the upper bound $c_4 \leq \frac{4}{3}$ translates into $(d_1 + d_3)(d_2 + 2d_4) \geq 4d_2 d_4$. So the second factor is positive and we have $d_1 = d_3$. Putting this information into the equation $J(\bar{a}, \bar{s}) = 0$, we get
\[
d_1 d_2 d_4 (d_2 + 15) = 2d_1^2 d_2 (d_2 + 1) - 6d_1 d_2^2 + 2d_4 (2d_1^2 d_2 + 2d_1^2 + d_2^2).
\]
By Remark 3.1 we also have $J(\bar{s}, \bar{s}) < 0$, i.e., $1 < \frac{4}{d_1} + \frac{1}{d_2}$, so either $d_2 = 1$ or $d_1 < 8$. Substituting these values into the equation above and using $c_4 \leq \frac{4}{3}$ we obtain in each instance a contradiction.

If we are in case (ii), then by adding the equations $J(\bar{a}, \bar{s}) = 0$ and $2J(\bar{w}, \bar{s}) - J(\bar{a}', \bar{s}) = 0$ (equivalent to $J(\bar{e}', \bar{s}) = 0$), we obtain $1 = \frac{2}{d_1} + \frac{1}{d_2}$. Hence $(d_1, d_2) = (4, 2)$ or $(3, 3)$. One then checks that these values are incompatible with the null condition for $\bar{e}'$, $J(\bar{a}, \bar{s}) = 0$, and the bound $c_3 < 0$. An analogous argument works to eliminate case (iii), where we now need the bound $c_4 \leq \frac{4}{3}$ instead.

(E) Lastly suppose $c$ lies in Region VI, so $c_1, c_3 < 0$ and $-\frac{1}{3} \leq c_4 < 0$, where the lower bound for $c_4$ results from Remark 6.9. We have $a = (c_3 - c_4 - 2,\ 1 + 2c_4,\ -c_3,\ -c_4)$ and $a' = (2 + c_3 - c_4,\ 1 + 2c_4,\ -4 - c_3,\ -c_4)$. Using the null conditions for $\bar{a}, \bar{a}'$, we obtain

\[
c_1 = -\frac{2(d_2 + d_3)}{d_1 + d_2 + d_3}, \quad
c_2 = \frac{d_1 + 5d_2 + d_3}{d_1 + d_2 + d_3}, \quad
c_3 = \frac{-2(d_1 + d_2)}{d_1 + d_2 + d_3}, \quad
c_4 = \frac{-2d_2}{d_1 + d_2 + d_3}.
\]
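The Region VI values above can be sanity-checked numerically. The sketch below makes one assumption that should be read as such: it takes the null condition for $2\bar{x} - \bar{c}$, i.e. $J(\bar{x}, \bar{x}) = J(\bar{x}, \bar{c})$, to be equivalent to $\sum_i x_i (x_i - c_i)/d_i = 0$, which is the form that reproduces the explicit equation written out for $\bar{s}$ in part (C) of this example. The function names `region_vi_c`, `null_defect` and `check` are illustrative only.

```python
from fractions import Fraction
from itertools import product

S_VEC = (-2, 1, 0, 0)  # vertex s of (T3), from Table 4
W_VEC = (0, 1, -2, 0)  # vertex w of (T3), from Table 4


def region_vi_c(d1, d2, d3):
    """The values (c1, c2, c3, c4) quoted in the text for c in Region VI."""
    total = d1 + d2 + d3
    return (
        Fraction(-2 * (d2 + d3), total),
        Fraction(d1 + 5 * d2 + d3, total),
        Fraction(-2 * (d1 + d2), total),
        Fraction(-2 * d2, total),
    )


def null_defect(x, c, d):
    """sum_i x_i (x_i - c_i) / d_i; ASSUMED to vanish iff 2x - c is null."""
    return sum(Fraction(xi) * (xi - ci) / di for xi, ci, di in zip(x, c, d))


def check(d):
    d1, d2, d3, _d4 = d
    c = region_vi_c(d1, d2, d3)
    # c lies in the plane of (T3): entries sum to -1 and x2 + 2 x4 = 1.
    in_plane = sum(c) == -1 and c[1] + 2 * c[3] == 1
    # Null conditions for a = 2s - c and a' = 2w - c.
    nulls = null_defect(S_VEC, c, d) == 0 and null_defect(W_VEC, c, d) == 0
    # The bound -1/3 <= c4 is the same as d1 + d2 + d3 >= 6 d2.
    bound_agrees = (c[3] >= Fraction(-1, 3)) == (d1 + d2 + d3 >= 6 * d2)
    return in_plane and nulls and bound_agrees


all_ok = all(check(d) for d in product(range(1, 7), repeat=4))
print(all_ok)
```

The search range for the dimensions is arbitrary; the identities being tested are rational identities in $d_1, d_2, d_3$, so a small grid is enough to catch a transcription slip.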

If we are in case (i), $J(\bar{u}, \bar{v}) = 0$ gives $d_2 = d_4 = 2$. The other two equations and the above values of $c_3, c_4$ then give $3(d_1 + d_3 + 2)(d_1 + d_3 - 4) = 4(3d_1 + 3d_3 - 2)$.


The lower bound $-\frac{1}{3} \leq c_4$ becomes $d_1 + d_2 + d_3 \geq 6d_2$. Using this inequality in the above Diophantine relation leads to a contradiction. (Alternatively, observe the relation is a quadratic in $d_1 + d_3$ with no rational roots.)

For case (ii), using the two equations and the above values for $c_3, c_4$, we arrive at the relation
\[
(d_1 + d_2 + d_3)\bigl((d_1 - 5)d_2 d_4 + d_1 d_4 + d_2 d_3 + 2d_3 d_4\bigr) = 2d_2 (d_1 d_2 + 2d_1 d_4 - d_2 d_4 + d_2 d_3 + 2d_3 d_4).
\]
Using the lower bound $-\frac{1}{3} \leq c_4$ in the above relation we see that $d_1 \leq 3$. By direct substitution, we further obtain $d_1 \neq 3$. Finally, if $d_1 = 2$, the null condition for $\bar{c}$ gives $1 > \frac{1}{2}c_1^2$ and so $d_2 + d_3 \leq 4$. The lower bound on $c_4$ now implies $d_2 = 1$. Since $c_2 > 1$, the null condition for $\bar{c}$ is violated.

Case (iii) reduces to case (ii) upon interchanging the first and third summands. Therefore, the trapezium (T3) has been eliminated.

We discuss next the hexagons (H1)-(H3). As the three cases are similar, we will focus on (H3) and refer to the following (schematic) diagram:

[Schematic diagram: the hexagon (H3) with vertices $u, v, w, x, y, z$ and centre $t$, and the surrounding regions labelled I to VII together with IV′ and VII′.]

Example 6.11. The hexagon (H3) lies in the 2-plane given by $\{x_2 = -1,\ x_1 + x_3 + x_4 = 0\}$. So $c$ has the form $(-c_3 - c_4,\ -1,\ c_3,\ c_4)$. The lines $vw$ and $zy$ are given respectively by $x_1 + x_3 = 1$ and $x_1 + x_3 = -1$. Similarly, the lines $uv$ and $yx$ are given by $x_3 = 1$ and $x_3 = -1$ respectively. The lines $uz$ and $wx$ are given by $x_1 = -1$ and $x_1 = 1$ respectively. Interchanging $x_1$ and $x_3$ induces the reflection about the perpendicular bisector of $vw$, while $(x_1, x_2, x_3, x_4) \mapsto (-x_3, x_2, -x_1, -x_4)$ induces the reflection about $ux$. These symmetries reduce our consideration to those $c$ lying in Regions I-VI. Moreover, (H3) is actually a regular hexagon. The symmetry $(x_1, x_2, x_3, x_4) \mapsto (-x_4, x_2, -x_3, -x_1)$ induces the reflection about $zw$, which swaps Region II with Region IV and Region I with Region VI. Finally, the symmetry $(x_1, x_2, x_3, x_4) \mapsto (-x_3, x_2, -x_4, -x_1)$ induces


the rotation in E about t taking x to w, and maps Region V to Region III. Therefore, we need only consider c lying in Regions I, II, and V. In the discussion below we again adopt the convention that ā, ā′ always denote null vectors in C.

If c lies in Region I, then ū = ½(c̄ + ā), x̄ = ½(c̄ + ā′) for some ā, ā′, and we immediately see that āā′ cannot meet conv(½(d + W)), a contradiction to Cor. 3.4.

c lying in Region II. We have c1, c3 < 1 and c1 + c3 > 1. The assumption of adjacent (1B) vertices means that v̄ = ½(c̄ + ā) and w̄ = ½(c̄ + ā′) for some ā, ā′ ∈ E ∩ C. Hence a = (c3 + c4, −1, 2 − c3, −2 − c4) and a′ = (2 + c3 + c4, −1, −c3, −2 − c4). One checks easily that ā lies in Region IV and ā′ lies in Region IV′. Moreover, the null conditions for these vectors yield

c1 = (d3 + d4)/(d1 + d3 + d4), c3 = (d1 + d4)/(d1 + d3 + d4), c4 = −(d1 + d3 + 2d4)/(d1 + d3 + d4).
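The Region II values c1 = (d3 + d4)/(d1 + d3 + d4), c3 = (d1 + d4)/(d1 + d3 + d4), c4 = −(d1 + d3 + 2d4)/(d1 + d3 + d4) can be sanity-checked in exact arithmetic. The following is only a sketch under stated assumptions: it takes the null condition for a vector to be of the form Σ v_i²/d_i = 1 (the shape of the explicit null conditions quoted in this section), uses v = (0, −1, 1, −1) and w = (1, −1, 0, −1) as read off from the line equations of Example 6.11, and the function names are illustrative rather than taken from the paper. Since c, a and a′ all have second entry −1, it suffices to compare the contributions of coordinates 1, 3, 4.

```python
from fractions import Fraction as F

def region_ii_c(d1, d3, d4):
    """(c1, c3, c4) from the Region II formulas."""
    D = d1 + d3 + d4
    return F(d3 + d4, D), F(d1 + d4, D), F(-(d1 + d3 + 2 * d4), D)

def weighted_square(v, d):
    """Sum of v_i^2 / d_i over coordinates 1, 3, 4 (assumed form of the null condition)."""
    return sum(x * x / di for x, di in zip(v, d))

def check_region_ii(d1, d3, d4):
    c1, c3, c4 = region_ii_c(d1, d3, d4)
    a = (c3 + c4, 2 - c3, -2 - c4)          # a  = 2v - c, coordinates 1, 3, 4
    a_prime = (2 + c3 + c4, -c3, -2 - c4)   # a' = 2w - c, coordinates 1, 3, 4
    d = (d1, d3, d4)
    in_plane = c1 + c3 + c4 == 0            # c lies in x1 + x3 + x4 = 0
    in_region = c1 < 1 and c3 < 1 and c1 + c3 > 1
    same_norm = (weighted_square((c1, c3, c4), d)
                 == weighted_square(a, d)
                 == weighted_square(a_prime, d))
    return in_plane and in_region and same_norm

results = [check_region_ii(d1, d3, d4)
           for d1 in range(1, 6) for d3 in range(1, 6) for d4 in range(1, 6)]
print(all(results))
```

Equality of the three weighted sums means that c̄, ā, ā′ are null simultaneously once d2 is fixed by any one of the three conditions.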

Let e := 2u − a and e′ := 2x − a′. These lie respectively in Regions VII and VII′. We can now apply Theorem 3.8 to ā and ā′ to obtain the following possibilities:

(i) ū, x̄ ∈ C and J(ā, ū) = 0 = J(ā′, x̄);
(ii) ū ∈ C, J(ā, ū) = 0, x̄ ∉ C, ē′ ∈ C is null;
(iii) x̄ ∈ C, J(ā′, x̄) = 0, ū ∉ C, ē ∈ C is null;
(iv) ū, x̄ ∉ C, ē, ē′ are both null.

We can eliminate (i)-(iii) by noting that the two equations in each case together with the values of c3, c4 above imply that 1 = 1/d1 + 1/d2 + 1/d3. Using this relation (and the values of c3, c4) in the null condition for c̄ then leads to a contradiction. For case (iv) we can again apply Theorem 3.8 to the null vertices ē and ē′. The conditions J(ē, z̄) = 0 and J(ē′, ȳ) = 0 lead, as above, to 1 = 1/d1 + 1/d2 + 1/d4 and 1 = 1/d2 + 1/d3 + 1/d4 respectively. Using this in the null condition for c̄ again leads to a contradiction. Hence z̄, ȳ ∉ C and q̄ := 2z̄ − ē and q̄′ := 2ȳ − ē′ are null vectors in E ∩ C. In fact we now find that q = q′, so caeqe′a′ is a hexagon circumscribing (H3).

Let us consider the pair of null vertices c̄, q̄. We apply the argument in (A) of the proof of Theorem 6.4 to the wedge with vertex c̄ bounded by the rays c̄ā and c̄ā′. All elements of (C ∩ E)\{ā, ā′, c̄} lie below the line āā′. Let b̄ be a highest (with respect to x1 + x3) element among these. Since ē ∈ C, b1 + b3 > −1 and so c̄ + b̄ cannot equal 2ū, 2t̄, 2x̄. Hence c̄ + b̄ = ā + ā′, and we compute that b1 + b3 = (d1 + d3 − 2d4)/(d1 + d3 + d4). The analogous argument applied to the wedge bounded by the rays q̄ē and q̄ē′ gives a lowest element b̄′ of (C ∩ E)\{q̄, ē, ē′} satisfying b̄′ + q̄ = ē + ē′ and b′1 + b′3 = (2d4 − d1 − d3)/(d1 + d3 + d4). To avoid a contradiction, we must have d1 + d3 ≥ 2d4. We can repeat the above argument with the null vertex pairs {ē, ā′} and {ē′, ā}, obtaining the inequalities d3 + d4 ≥ 2d1 and d1 + d4 ≥ 2d3 respectively. The three inequalities then imply that in fact d1 = d3 = d4 and c = (2/3, −1, 2/3, −4/3). Furthermore, C ∩ E = {ā, ā′, c̄, ē, ē′, t̄, q̄} and the null condition for c̄ gives (d1, d2) = (3, 9) or (4, 3).
By looking at the terms in the superpotential equation corresponding to the vertices (all of type II), we find that the coefficients Fc̄, Fē, Fē′ have the same sign, which is opposite to that of Fā, Fā′, Fq̄. Next we note that the only ways to write d + (1/3, −1, 1/3, −2/3) (resp. d + (−1/3, −1, 2/3, −1/3)) as a sum of elements of C are t̄ + c̄ = ā + ā′ (resp. t̄ + ā = c̄ + ē). The superpotential equation then gives Fā Fā′ J(ā, ā′) + Ft̄ Fc̄ J(t̄, c̄) = 0 and Fc̄ Fē J(c̄, ē) + Ft̄ Fā J(t̄, ā) = 0. Since J(ā, ā′), J(c̄, ē), J(t̄, c̄) and J(t̄, ā) are all positive, the above equations and facts imply that Fc̄ and Fā have the same sign, a contradiction. So c cannot lie in Region II.

c lying in Region V. We have c3 < −1 < −c4 < 1 < c1. The adjacent (1B) vertices assumption implies that w̄ = ½(c̄ + ā) and ȳ = ½(c̄ + ā′) for some ā, ā′ ∈ C ∩ E. It follows that a = (2 + c3 + c4, −1, −c3, −2 − c4) and a′ = (c3 + c4, −1, −2 − c3, 2 − c4). The null conditions on these vectors give

c1 =

(2d1 + d4)(d3 + d4)/(d4(d1 + d3 + d4)), c4 = (d3 − d1)/(d1 + d3 + d4), c3 = −(1 + d1/d4)(2d3 + d4)/(d1 + d3 + d4).
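For Region V the values read c1 = (2d1 + d4)(d3 + d4)/(d4(d1 + d3 + d4)), c4 = (d3 − d1)/(d1 + d3 + d4), c3 = −(1 + d1/d4)(2d3 + d4)/(d1 + d3 + d4). A quick exact-arithmetic check is sketched below, again assuming that nullity means Σ v_i²/d_i = 1 and using the expressions for a and a′ given in the text; the helper names are hypothetical. The check confirms that c lies in the plane x1 + x3 + x4 = 0, that c, a, a′ have equal weighted square sums on coordinates 1, 3, 4, and that d1 = d3 = d4 gives c = (2, −1, −2, 0).

```python
from fractions import Fraction as F

def region_v_c(d1, d3, d4):
    """(c1, c3, c4) from the Region V formulas."""
    D = d1 + d3 + d4
    c1 = F((2 * d1 + d4) * (d3 + d4), d4 * D)
    c4 = F(d3 - d1, D)
    c3 = -(1 + F(d1, d4)) * F(2 * d3 + d4, D)
    return c1, c3, c4

def norm134(v, d):
    """Sum of v_i^2 / d_i over coordinates 1, 3, 4 (assumed form of the null condition)."""
    return sum(x * x / di for x, di in zip(v, d))

def consistent(d1, d3, d4):
    c1, c3, c4 = region_v_c(d1, d3, d4)
    a = (2 + c3 + c4, -c3, -2 - c4)        # coordinates 1, 3, 4 of a
    a_prime = (c3 + c4, -2 - c3, 2 - c4)   # coordinates 1, 3, 4 of a'
    d = (d1, d3, d4)
    return (c1 + c3 + c4 == 0
            and norm134((c1, c3, c4), d) == norm134(a, d) == norm134(a_prime, d))

all_consistent = all(consistent(d1, d3, d4)
                     for d1 in range(1, 7) for d3 in range(1, 7) for d4 in range(1, 7))
print(all_consistent, region_v_c(3, 3, 3))
```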

Since a3 = −c3 > 1, a lies above the line uv. Also, a′1 = c3 + c4 = −c1 < −1, so a′ lies below the line uz. We can therefore apply Theorem 3.8 to ā and ā′ to get the following possibilities:

(i) ū ∈ C, J(ā, ū) = 0 = J(ā′, ū);
(ii) ū ∉ C, ē := 2ū − ā, ē′ := 2ū − ā′ lie in C ∩ E and are null.

If (i) occurs, then the two orthogonality conditions imply that d1 = d3, so c4 = 0, c1 = −c3 = 1 + d1/d4. Substituting these values of ci into J(ā′, ū) = 0 gives 1 = 2/d4 + 1/d2. But the null condition for c̄ is 1 = 1/d2 + (2/d1)(1 + d1/d4)² > 1/d2 + 2/d4 = 1, which is a contradiction. Hence (ii) must occur.

Note that if the above diagram is rotated so that the lines x1 + x3 = κ (for arbitrary constants κ) are horizontal, then the lines x1 − x3 = κ would be vertical. u is the only point in the hexagon lying on x1 − x3 = −2. Observe that a1 − a3 = a′1 − a′3 ≥ −2, otherwise āā′ would not intersect conv(½(d + W)), which contradicts Cor. 3.4. If, however, a1 − a3 > −2, then ēē′ would not intersect conv(½(d + W)). So in fact a = e′, e = a′ and u all lie on x1 − x3 = −2. In other words, the hexagon is circumscribed by the triangle caa′ with intersections at w, u and y. It follows easily from the above that c = (2, −1, −2, 0), d1 = d3 = d4, and the null condition for c̄ is 1 = 8/d1 + 1/d2. Also, we have C ∩ E = {c̄, ā, ā′, t̄}. Since w, u, y are type II, by Lemma 3.2, we see that the signs of Fā, Fc̄, and Fā′ in the superpotential equation cannot be chosen compatibly. We have thus shown that the hexagon (H3) cannot occur.

The hexagon (H2) is not regular, but has reflection symmetry about uv and the perpendicular bisector of yy′. It can be eliminated by similar arguments, but we now have to consider c lying in Regions III and IV as well. The hexagon (H1) can also be eliminated by the above methods. Here the hexagon is invariant under the symmetric group permuting the coordinates x1, x2, x3. Together with Cor. 3.4, this fact reduces our consideration to those c lying in three of the regions formed by extending the sides of the hexagon.

As mentioned in Remark 6.2, we also need to rule out subshapes of the hexagons. For (H2) and (H3) the methods used above can also be applied to rule out all the subparallelograms and trapezia except the rectangle yy′z′z of (H2) (see Lemma 8.5 and the discussion immediately before Ex 6.7). All sub-triangles will be dealt with at the end of this section.
(There is a triangle with midpoint in (H2) but that can be dealt with by similar methods.) For (H2) this leaves the pentagon yy vz z and the kite y uz v, both of which can still be eliminated using the above methods.


The possible subshapes of (H1) are rather numerous. However, if r ≥ 4 we will be able to eliminate all of them in Lemma 8.6. Without this assumption, the above methods can be used to eliminate those subshapes which do not contain all three type I vectors. Of course the following discussion will handle the sub-triangles.

Lastly, we consider triangular faces.

Theorem 6.12. Suppose we have adjacent (1B) vertices in c̄ corresponding to a triangular face x̄x̄′x̄″ of conv(½(d + W)). Let E be the affine 2-plane determined by the triangular face. So there are null vectors ā, ā′ in C ∩ E such that x = ½(a + c), x′ = ½(a′ + c). Suppose the vertices of the triangle are the only elements of W in the face. Then we are in one of the following two situations:
(i) C ∩ E = {c̄, ā, ā′, x̄″}, with c + x″ = a + a′ and J(x̄″, ā) = J(x̄″, ā′) = 0;
(ii) C ∩ E = {c̄, ā, ā′}, where ½(a + a′) = x″, one of x, x′, x″ is type I, and the others are either both type I or both type II/III.

Proof. (A) We may introduce coordinates in E so that x̄x̄′ is vertical and to the right of c̄. As āā′ must meet conv(½(d + W)), we see x̄″ is on or to the right of āā′. Let b̄ be any leftmost point of (C ∩ E)\{c̄}. As in Theorem 6.4, we see that b̄ + c̄ ∈ d + W, so all elements of C ∩ E except c̄, ā, ā′ are to the right of āā′.

(B) Considering āx̄″ and ā′x̄″ we see (using Theorem 3.8 and Cor. 3.4) that either
(1) x̄″ ∈ C and J(x̄″, ā) = 0 = J(x̄″, ā′), or
(2) x̄″ ∉ C and x″ = ½(a + a′).

In case (1), (x̄″)⊥ ∩ E is the line through āā′. By Prop. 3.3 and Cor. 3.4, observe that all elements of (C ∩ E)\{x̄″} are left of x̄″. Let b̄ be a rightmost element of (C ∩ E)\{x̄″}. So either J(b̄, x̄″) = 0 or b̄ + x̄″ ∈ d + W. Since b̄ is not to the left of āā′, the second alternative cannot hold and so b̄ must lie on āā′. Combining this with our results in (A), we see C ∩ E is as in (i). Also, as J(ā, ā′) > 0 and ā + ā′ ∉ d + W, we see a + a′ must equal c + x″.

In case (2), by Cor. 3.4 there are no elements of C ∩ E right of āā′. Hence C ∩ E is as in (ii). Now J(b̄, ē) > 0 for all b̄ ≠ ē in C ∩ E, so the last statement of (ii) follows.

Remark 6.13. We must also consider the case when some midpoints of the sides of our triangular face lie in W. (This could happen if two vertices were (1, −1, −1, . . .),


(−1, 1, −1, . . .) or (1, −2, . . .), (1, 0, −2, . . .) or (1, −2, . . .), (−1, 0, . . .).) Let us denote the midpoints of x x , x x and x x respectively by z, y, t. If z is absent, the arguments of (A) in the proof of Theorem 6.12 still hold, so we have the alternatives (1),(2) in (B). If (1) holds then, choosing b¯ as above, if b is right of aa , we have 21 (b + x ) ∈ W. This gives a contradiction since 21 (b + x ) cannot be y or t as b = x, x . Now C ∩ E = {c, ¯ a, ¯ a¯ , x¯ }, and as c + x ∈ / 2W it must equal a + a . It follows that the midpoints y, t cannot arise. If instead (2) holds, then C ∩ E = {c, ¯ a, ¯ a¯ } and again no midpoints can be present. Suppose now the midpoint z of x x is present. The argument of (A) shows that to account for z, 21 (a¯ + a¯ ) ∈ C, and all elements of (C ∩ E)\{c, ¯ a, ¯ a¯ , (a¯ + a¯ )/2} are right of a¯ a¯ . We still have the alternatives (1) and (2), but (2) immediately gives a contradiction. In (1) we see as before there are no elements of C ∩ E lying to the right of a¯ a¯ , so C ∩ E = {c, ¯ x¯ , a, ¯ a¯ , (a¯ + a¯ )/2}. Note that J (a, ¯ (a¯ + a¯ )/2) and J (a¯ , (a¯ + a¯ )/2) > 0. If c + x = a + a , we find after some algebra that a¯ + (a¯ + a¯ )/2 = 2 y¯ and also cannot be written as a different sum of elements of C, giving a contradiction. If c¯ + x¯ = a¯ + a¯ , then one sees that c¯ + x¯ ∈ / d + W, and by relabelling x and x , a and a we may assume that c¯ + x¯ = a¯ + 21 (a¯ + a¯ ) and also a¯ + a¯ = 2 y¯ and a¯ + 21 (a¯ + a¯ ) = 2t¯ = x¯ + x¯ . These relations imply a = x , a contradiction. So no triangle with any midpoints present can arise. Remark 6.14. There are also triangular faces with two points of W in the interior of an edge. This can only happen if two vertices are (−2, 1, 0, . . .) and (1, −2, 0, . . .) (up to permutation). The other sides of the triangle now have no interior points in W unless the triangle is contained in the hexagon (H1). 
We can again modify the proof of Theorem 6.12 to treat this situation. If the interior points z, w lie on x x , then (2a¯ + a¯ )/3, (a¯ + 2a¯ )/3 must be in C, and all points of C ∩ E except for these two and c, ¯ a, ¯ a¯ lie to the right of a¯ a¯ . By Prop. 3.3, alternative (1) must now hold. The usual argument shows x¯ is the only element of C ∩ E on the right of a¯ a¯ . Now again J (a, ¯ 13 (2a¯ + a)) ¯ > 0, J (a¯ , 13 (a¯ + 2a¯ )) > 0, and the sums a + (2a + a)/3 and a + (a + 2a )/3 cannot give points in 2W. Since they also cannot both be cancelled by c + x in the superpotential equation, we have a contradiction. The other possibility for two interior points is, after relabelling the vertices if necessary, when z = (2x + x )/3 and w = (2x + x)/3. As usual all elements of C ∩ E except for c, ¯ a, ¯ a¯ are on the right of a¯ a¯ . Alternative (1) must hold, or else we cannot account for z, w. The usual argument shows either x¯ is the only element of C ∩ E right of a¯ a¯ , or z ∈ C is the rightmost element of (C ∩ E)\{x¯ } (so (z + x )/2 = w). In the former case we cannot get both z and w, as (c + x )/2 can’t equal both z and w. In the latter, considering a¯ z¯ shows J (a, ¯ z¯ ) = 0. But as J (a, ¯ x¯ ) = 0, this means a¯ is orthogonal to x¯ and hence to c, ¯ a contradiction. So no triangle with points of W in the interior of an edge can arise (except possibly for a subtriangle of (H1)). Nullity of c, ¯ a, ¯ a¯ and the conditions in Theorem 6.12(i),(ii) again put severe con straints on x, x , x and the dimensions. The possible triangles for case (i) are as follows, where (Tr11)-(Tr22) occur only if K is not connected, and we have also listed the vectors c, a, a for future reference. Further details of how the following listing is arrived at can be found in [DW5].

         x″                          x                            x′
(Tr1)    (−2, 1, 0, 0, 0)            (0, 0, −2, 1, 0)             (0, 0, −2, 0, 1)
(Tr2)    (−2, 1, 0, 0)               (0, 1, −2, 0)                (0, 1, −1, −1)
(Tr3)    (0, 0, 0, −2, 1)            (−2, 1, 0, 0, 0)             (0, 1, −2, 0, 0)
(Tr4)    (−2, 1, 0, 0, 0, 0)         (0, 0, −2, 1, 0, 0)          (0, 0, 0, 1, −1, −1)
(Tr5)    (−2, 1, 0, 0, 0)            (0, 1, −1, 0, −1)            (0, 1, −1, −1, 0)
(Tr6)    (−2, 1, 0, 0, 0, 0)         (0, 0, 1, −1, −1, 0)         (0, 0, 1, −1, 0, −1)
(Tr7)    (−2, 1, 0, 0, 0, 0)         (0, 0, 1, −1, −1, 0)         (0, 0, −1, −1, 0, 1)
(Tr8)    (−2, 1, 0, 0, 0, 0)         (0, 0, −1, −1, 1, 0)         (0, 0, −1, −1, 0, 1)
(Tr9)    (−2, 1, 0, 0, 0, 0, 0)      (0, 0, 1, −1, −1, 0, 0)      (0, 0, 1, 0, 0, −1, −1)
(Tr10)   (−2, 1, 0, 0, 0, 0, 0)      (0, 0, −1, 1, −1, 0, 0)      (0, 0, −1, 0, 0, 1, −1)
(Tr11)   (0, 0, 0, 1, −1, −1)        (−2, 1, 0, 0, 0, 0)          (−2, 0, 1, 0, 0, 0)
(Tr12)   (0, 1, 0, −1, −1)           (−2, 1, 0, 0, 0)             (−1, 1, −1, 0, 0)
(Tr13)   (0, 0, 0, −1, −1, 1)        (−2, 1, 0, 0, 0, 0)          (0, 1, −2, 0, 0, 0)
(Tr14)   (0, 0, 0, 0, −1, −1, 1)     (0, 1, −1, −1, 0, 0, 0)      (−2, 1, 0, 0, 0, 0, 0)
(Tr15)   (0, 0, 0, 1, −1, −1, 0)     (1, −1, −1, 0, 0, 0, 0)      (1, −1, 0, 0, 0, 0, −1)
(Tr16)   (0, 0, 0, 1, −1, −1, 0)     (1, −1, −1, 0, 0, 0, 0)      (0, −1, 1, 0, 0, 0, −1)
(Tr17)   (0, 0, 0, 1, −1, −1, 0)     (1, −1, −1, 0, 0, 0, 0)      (0, −1, −1, 0, 0, 0, 1)
(Tr18)   (0, 0, 0, 1, −1, −1, 0, 0)  (1, −1, −1, 0, 0, 0, 0, 0)   (1, 0, 0, 0, 0, 0, −1, −1)
(Tr19)   (0, 0, 0, 1, −1, −1, 0, 0)  (1, −1, −1, 0, 0, 0, 0, 0)   (0, −1, 0, 0, 0, 0, −1, 1)
(Tr20)   (−1, −1, 1, 0, 0, 0)        (0, 0, 1, −1, −1, 0)         (0, 0, 1, −1, 0, −1)
(Tr21)   (−1, 1, −1, 0, 0, 0)        (0, 0, −1, 1, −1, 0)         (0, 0, −1, 1, 0, −1)
(Tr22)   (−1, 1, −1, 0, 0, 0)        (0, 0, −1, −1, 1, 0)         (0, 0, −1, −1, 0, 1)

         3c                               3a                                3a′
(Tr1)    (2, −1, −8, 2, 2)                (−2, 1, −4, 4, −2)                (−2, 1, −4, −2, 4)
(Tr2)    (2, 3, −6, −2)                   (−2, 3, −6, 2)                    (−2, 3, 0, −4)
(Tr3)    (−4, 4, −4, 2, −1)               (−8, 2, 4, −2, 1)                 (4, 2, −8, −2, 1)
(Tr4)    (2, −1, −4, 4, −2, −2)           (−2, 1, −8, 2, 2, 2)              (−2, 1, 4, 2, −4, −4)
(Tr5)    (2, 3, −4, −2, −2)               (−2, 3, −2, 2, −4)                (−2, 3, −2, −4, 2)
(Tr6)    (2, −1, 4, −4, −2, −2)           (−2, 1, 2, −2, −4, 2)             (−2, 1, 2, −2, 2, −4)
(Tr7)    (2, −1, 0, −4, −2, 2)            (−2, 1, 6, −2, −4, −2)            (−2, 1, −6, −2, 2, 4)
(Tr8)    (2, −1, −4, −4, 2, 2)            (−2, 1, −2, −2, 4, −2)            (−2, 1, −2, −2, −2, 4)
(Tr9)    (2, −1, 4, −2, −2, −2, −2)       (−2, 1, 2, −4, −4, 2, 2)          (−2, 1, 2, 2, 2, −4, −4)
(Tr10)   (2, −1, −4, 2, −2, 2, −2)        (−2, 1, −2, 4, −4, −2, 2)         (−2, 1, −2, −2, 2, 4, −4)
(Tr11)   (−8, 2, 2, −1, 1, 1)             (−4, 4, −2, 1, −1, −1)            (−4, −2, 4, 1, −1, −1)
(Tr12)   (−6, 3, −2, 1, 1)                (−6, 3, 2, −1, −1)                (0, 3, −4, −1, −1)
(Tr13)   (−4, 4, −4, 1, 1, −1)            (−8, 2, 4, −1, −1, 1)             (4, 2, −8, −1, −1, 1)
(Tr14)   (−4, 4, −2, −2, 1, 1, −1)        (4, 2, −4, −4, −1, −1, 1)         (−8, 2, 2, 2, −1, −1, 1)
(Tr15)   (4, −4, −2, −1, 1, 1, −2)        (2, −2, −4, 1, −1, −1, 2)         (2, −2, 2, 1, −1, −1, −4)
(Tr16)   (2, −4, 0, −1, 1, 1, −2)         (4, −2, −6, 1, −1, −1, 2)         (−2, −2, 6, 1, −1, −1, −4)
(Tr17)   (2, −4, −4, −1, 1, 1, 2)         (4, −2, −2, 1, −1, −1, −2)        (−2, −2, −2, 1, −1, −1, 4)
(Tr18)   (4, −2, −2, −1, 1, 1, −2, −2)    (2, −4, −4, 1, −1, −1, 2, 2)      (2, 2, 2, 1, −1, −1, −4, −4)
(Tr19)   (2, −4, −2, −1, 1, 1, −2, 2)     (4, −2, −4, 1, −1, −1, 2, −2)     (−2, −2, 2, 1, −1, −1, −4, 4)
(Tr20)   (1, 1, 3, −4, −2, −2)            (−1, −1, 3, −2, −4, 2)            (−1, −1, 3, −2, 2, −4)
(Tr21)   (1, −1, −3, 4, −2, −2)           (−1, 1, −3, 2, −4, 2)             (−1, 1, −3, 2, 2, −4)
(Tr22)   (1, −1, −3, −4, 2, 2)            (−1, 1, −3, −2, 4, −2)            (−1, 1, −3, −2, −2, 4)
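The rows of the listing can be cross-checked mechanically. In case (i) of Theorem 6.12 the relations 2x = a + c, 2x′ = a′ + c and c + x″ = a + a′ give 3c = 2x + 2x′ − x″, 3a = 4x − 2x′ + x″ and 3a′ = −2x + 4x′ + x″ (the same identities are used in Example 6.17). The sketch below applies them to three sample rows, (Tr1), (Tr3) and (Tr12); the dictionary layout and function name are illustrative choices, not notation from the paper.

```python
def tripled(x, xp, xpp):
    """Return (3c, 3a, 3a') computed from the triangle vertices x, x', x''."""
    three_c = tuple(2 * p + 2 * q - r for p, q, r in zip(x, xp, xpp))
    three_a = tuple(4 * p - 2 * q + r for p, q, r in zip(x, xp, xpp))
    three_ap = tuple(-2 * p + 4 * q + r for p, q, r in zip(x, xp, xpp))
    return three_c, three_a, three_ap

# Sample rows: vertices (x'', x, x') and the listed (3c, 3a, 3a').
ROWS = {
    "Tr1": ((-2, 1, 0, 0, 0), (0, 0, -2, 1, 0), (0, 0, -2, 0, 1),
            ((2, -1, -8, 2, 2), (-2, 1, -4, 4, -2), (-2, 1, -4, -2, 4))),
    "Tr3": ((0, 0, 0, -2, 1), (-2, 1, 0, 0, 0), (0, 1, -2, 0, 0),
            ((-4, 4, -4, 2, -1), (-8, 2, 4, -2, 1), (4, 2, -8, -2, 1))),
    "Tr12": ((0, 1, 0, -1, -1), (-2, 1, 0, 0, 0), (-1, 1, -1, 0, 0),
             ((-6, 3, -2, 1, 1), (-6, 3, 2, -1, -1), (0, 3, -4, -1, -1))),
}

row_ok = {name: tripled(x, xp, xpp) == listed
          for name, (xpp, x, xp, listed) in ROWS.items()}
print(row_ok)
```

The same function can be applied to any other row once its three vertices are entered in the order x″, x, x′.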

Remark 6.15. In making the above table, it is useful to observe from the nullity and orthogonality conditions that x″ cannot be type I, and that if x″ is type III, say, (−2^i, 1^j), then xi = x′i iff xj = x′j.

The possibilities for Theorem 6.12(ii) are as follows (up to permutation of x, x′, x″ and the corresponding permutation of c, a, a′):

         x″                        x                          x′
(Tr23)   (−1, 0, 0, 0, 0)          (0, −2, 1, 0, 0)           (0, 0, 0, −2, 1)
(Tr24)   (−1, 0, 0, 0)             (0, 1, −2, 0)              (0, −1, −1, 1)
(Tr25)   (−1, 0, 0, 0, 0, 0)       (0, 1, −2, 0, 0, 0)        (0, 0, 0, 1, −1, −1)
(Tr26)   (−1, 0, 0, 0, 0)          (0, 1, −1, −1, 0)          (0, −1, −1, 0, 1)
(Tr27)   (−1, 0, 0, 0, 0, 0, 0)    (0, 1, −1, −1, 0, 0, 0)    (0, 0, 0, 0, 1, −1, −1)
(Tr28)   (−1, 0, 0)                (0, −1, 0)                 (0, 0, −1)

         c                            a                              a′
(Tr23)   (1, −2, 1, −2, 1)            (−1, −2, 1, 2, −1)             (−1, 2, −1, −2, 1)
(Tr24)   (1, 0, −3, 1)                (−1, 2, −1, −1)                (−1, −2, 1, 1)
(Tr25)   (1, 1, −2, 1, −1, −1)        (−1, 1, −2, −1, 1, 1)          (−1, −1, 2, 1, −1, −1)
(Tr26)   (1, 0, −2, −1, 1)            (−1, 2, 0, −1, −1)             (−1, −2, 0, 1, 1)
(Tr27)   (1, 1, −1, −1, 1, −1, −1)    (−1, 1, −1, −1, −1, 1, 1)      (−1, −1, 1, 1, 1, −1, −1)
(Tr28)   (1, −1, −1)                  (−1, −1, 1)                    (−1, 1, −1)

Remark 6.16. In drawing up the above listing, recall from Theorem 6.12 that one of the vectors, without loss of generality x″, is of type I. We write x″ = (−1, 0, 0, . . .). It now easily follows from nullity and the relations between x, x′, x″ and c, a, a′ that x1 = x′1. Also, observe that as x″ is a vertex of W, no type II vector may have a nonzero entry in the first position. In contrast to the earlier listing of non-triangular faces, the above lists result from examining all triangular faces, including ones which arise from other faces because certain vertices are absent from W.

The restrictions on the dimensions of the corresponding summands are as follows:

(Tr1)       (2, 1, 16, 4, 4, . . .)
(Tr2)       (2, 3, 12, 4, . . .)
(Tr3)       (16, 4, 16, 2, 1, . . .)
(Tr4)       (2, 1, 16, 4, d5, d6, . . .), 1/d5 + 1/d6 = 1/4
(Tr5)       (2, 3, 6, 6, 6, . . .)
(Tr6)       (2, 1, d3, d4, 4, 4, . . .), 1/d3 + 1/d4 = 1/4
(Tr7)       (2, 1, 12, 3, 12, 12, . . .)
(Tr8)       (2, 1, d3, d4, 4, 4, . . .), 1/d3 + 1/d4 = 1/4
(Tr9)       (2, 1, 4, d4, d5, d6, d7, . . .), 1/d4 + 1/d5 = 1/d6 + 1/d7 = 1/4
(Tr10)      (2, 1, 4, d4, d5, d6, d7, . . .), 1/d4 + 1/d5 = 1/d6 + 1/d7 = 1/4
(Tr11)      (16, 4, 4, 1, 1, 1, . . .)
(Tr12)      (12, 3, 4, 1, 1, . . .)
(Tr13)      (16, 4, 16, 1, 1, 1, . . .)
(Tr14)      (16, 4, d3, d4, 1, 1, 1, . . .), 1/d3 + 1/d4 = 1/4
(Tr15)      (d1, d2, 4, 1, 1, 1, 4, . . .), 1/d1 + 1/d2 = 1/4
(Tr16)      (12, 3, 12, 1, 1, 1, 12, . . .)
(Tr17)      (4, d2, d3, 1, 1, 1, 4, . . .), 1/d2 + 1/d3 = 1/4
(Tr18)      (4, d2, d3, 1, 1, 1, d7, d8, . . .), 1/d2 + 1/d3 = 1/4 = 1/d7 + 1/d8
(Tr19)      (d1, 4, d3, 1, 1, 1, d7, d8, . . .), 1/d1 + 1/d3 = 1/4 = 1/d7 + 1/d8
(Tr20-22)   (1, 1, 3, 6, 6, 6, . . .), (1, 2, 2, 8, 8, 8, . . .), or (2, 1, 2, 8, 8, 8, . . .)
(Tr23)      1/d1 + 4/d2 + 1/d3 + 4/d4 + 1/d5 = 1
(Tr24)      d3 = 2d2: 1/d1 + 9/d3 + 1/d4 = 1
(Tr25)      1/d1 + 1/d2 + 4/d3 + 1/d4 + 1/d5 + 1/d6 = 1
(Tr26)      d2 = d3: 1/d1 + 4/d3 + 1/d4 + 1/d5 = 1
(Tr27)      1/d1 + 1/d2 + · · · + 1/d7 = 1
(Tr28)      1/d1 + 1/d2 + 1/d3 = 1.
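For the configurations whose dimensions are completely determined, the restrictions can be confirmed against the null condition for c̄. The sketch below assumes the null condition reads Σ c_i²/d_i = 1 over the listed coordinates (all further entries of c being zero), so that for the tabulated vector 3c one needs Σ (3c_i)²/d_i = 9. The selection of cases and the name `null_defect` are illustrative only.

```python
from fractions import Fraction as F

# (3c, dimensions) for some configurations with fully determined dimensions.
CASES = {
    "Tr1": ((2, -1, -8, 2, 2), (2, 1, 16, 4, 4)),
    "Tr2": ((2, 3, -6, -2), (2, 3, 12, 4)),
    "Tr3": ((-4, 4, -4, 2, -1), (16, 4, 16, 2, 1)),
    "Tr5": ((2, 3, -4, -2, -2), (2, 3, 6, 6, 6)),
    "Tr7": ((2, -1, 0, -4, -2, 2), (2, 1, 12, 3, 12, 12)),
    "Tr12": ((-6, 3, -2, 1, 1), (12, 3, 4, 1, 1)),
}

def null_defect(three_c, d):
    """sum_i c_i^2 / d_i - 1, computed from 3c; zero exactly when c is null (assumed form)."""
    return sum(F(x * x, 9 * di) for x, di in zip(three_c, d)) - 1

defects = {name: null_defect(three_c, d) for name, (three_c, d) in CASES.items()}
print(defects)
```

A non-zero defect for any entry would indicate a transcription error in either the vector or the dimension list.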

Note that (Tr28) is a subtriangle of (H1), (Tr2) is a subtriangle of a triangle with midpoints of all sides in W, and (Tr12) is a subtriangle of a triangle with the midpoint of one side. Let us now illustrate by an example how one arrives at the above tables.

Example 6.17. One possible triangle has vertices V1 = (0, 0, 0, −2, 1), V2 = (−2, 1, 0, 0, 0), V3 = (0, 1, −2, 0, 0) with the midpoint V4 = (−1, 1, −1, 0, 0) of V2V3 in W. The triangle has a symmetry given by interchanging the first and third entries. It therefore suffices to consider V1, V2, V4 as possibilities for x″. Of course, by Remark 6.13 the full triangle cannot occur. The possible subtriangles xx′x″ are V2V3V1, V2V4V1, V4V1V2, V3V1V2, and V3V1V4. Now 3c = 2x + 2x′ − x″, 3a = 4x − 2x′ + x″, and 3a′ = −2x + 4x′ + x″ can be used to compute these vectors in each case. For V2V4V1 one gets 3c = (−6, 4, −2, 2, −1), 3a = (−6, 2, 2, −2, 1) and so c̄ and ā cannot be both null. Similarly, for the last three possibilities, ā and c̄ cannot be both null. That leaves the first case, which gives (Tr3). The condition J(ā, x̄″) = 0 is 3 = 4/d4 + 1/d5, which implies (d4, d5) = (2, 1). Putting this into the null conditions for c̄, ā, ā′ gives the equations

6 = 16/d1 + 16/d2 + 16/d3 = 64/d1 + 4/d2 + 16/d3 = 16/d1 + 4/d2 + 64/d3.
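The dimension count in this example can be reproduced by a short search. The sketch below is an illustration under assumptions: it takes the null condition to be Σ v_i²/d_i = 1, uses the (Tr3) vectors 3c = (−4, 4, −4, 2, −1), 3a = (−8, 2, 4, −2, 1), 3a′ = (4, 2, −8, −2, 1) with (d4, d5) = (2, 1), and scans d1, d2, d3 over an arbitrarily chosen range 1, ..., 40. Denominators are cleared so that only integer arithmetic is used.

```python
from itertools import product

THREE_C = (-4, 4, -4, 2, -1)
THREE_A = (-8, 2, 4, -2, 1)
THREE_A_PRIME = (4, 2, -8, -2, 1)

def is_null(three_v, d):
    """Test sum_i (v_i)^2 / d_i == 1 for v = three_v / 3, with denominators cleared."""
    prod_d = 1
    for di in d:
        prod_d *= di
    return sum(x * x * (prod_d // di) for x, di in zip(three_v, d)) == 9 * prod_d

solutions = [
    (d1, d2, d3)
    for d1, d2, d3 in product(range(1, 41), repeat=3)
    if all(is_null(v, (d1, d2, d3, 2, 1)) for v in (THREE_C, THREE_A, THREE_A_PRIME))
]
print(solutions)  # [(16, 4, 16)]
```

Because the three null conditions are linear in 1/d1, 1/d2, 1/d3 with a unique solution, the search range only needs to contain 16.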

The last two equations imply that d1 = d3 and the first two equations give d1 = 4d2. These in turn give (d1, d2, d3) = (16, 4, 16), as in the tables above.

Putting all the results in this section together we obtain

Theorem 6.18. If we have two adjacent (1B) vertices, then the associated 2-face of W is given by a triangle in the list (Tr1)-(Tr27), the square with midpoint (S), a proper subshape of the hexagonal face (H1) containing all three type I vectors, or the sub-rectangle yy′z′z of (H2).

We note for future reference the following properties of the c vector of the nontriangular faces appearing in the above theorem: for (S), all nonzero entries have the same absolute value, and there are only 3 (resp. 2) nonzero entries for the subfaces of (H1) (resp. (H2)).

7. More than One Type (2) Vertex

In this section we shall now show there is at most one type (2) vertex in c̄, except in the situation of Theorem 3.14 and one other possible case. Suppose we have two type (2) vertices of V. Then we have elements v, w, v′, w′ of W with c, v, w collinear and c, v′, w′ collinear. So we have four coplanar elements v, w, v′, w′ of W where vw and v′w′ are edges. Moreover, the edges vw and v′w′ meet at c outside conv(W). Hence vwv′w′ do not form a parallelogram or a triangle. From our listing of polygons in Sect. 6 and considering their sub-polygons we see that the possibilities for further analysis are the following:


• Trapezia (T1)-(T6): We must have c = 2v − s = 2u − w. Also, we note for future reference that sw is always an edge of conv(W) in (T3) and (T5), regardless of whether or not the whole trapezium is a face, since sw can be cut out by {x2 = 1, x1 + x3 = −2} (cf. 1.2(e)).
• Hexagons (H1)-(H3).
• Rectangle with midpoints (P17): While the rectangle itself cannot occur, we need to consider the trapezia obtained by omitting one vertex, so that the edges are a side of the rectangle and the segment joining the remaining vertex to the opposite midpoint. As above, note that the longer of the two parallel sides of the trapezium is always an edge of conv(W).
• Parallelogram with midpoints (P16): This case is similar to (P17). The sub-polygons to consider are the trapezia obtained by omitting one vertex of the parallelogram. By symmetry, we are reduced to omitting either u or w. But since s occurs, w cannot be omitted.
• Triangle with midpoints of all sides: We need to consider the trapezia obtained by omitting a vertex. By symmetry all three trapezia are equivalent. This triangle is always a face of conv(W) as it is cut out by {x2 = 1, x1 + x3 + x4 = −2} (cf. 1.2(e)).
• Trapezia (T*1), (T*2), (T*3): By Rmk. 6.3 these cannot be faces of conv(W), so cannot come from adjacent type (2) vertices of V. For (T*1), besides the full trapezium, we need to consider the two trapezia obtained by omitting either s or w. By symmetry these are equivalent.

For (T1), (T2), (T4), (T6), (T*2), (T*3) we must have c = 2v − s = 2u − w, and so Lemma 3.13 applied to vs gives a contradiction. The same argument works for (P16), as up to permutations, c = 2v − s = 2y − w. For (T*1), since Theorem 3.11 rules out c = (3v − s)/2 = (3u − w)/2 (corresponding to the full trapezium), the only other possible c is 2v − s = 2u − r, and again Lemma 3.13 rules this out.

For (T3), (T5) and the trapezium coming from the triangle with midpoints, we need more information from the superpotential equation.
Since J (c, ¯ s¯ ), J (c, ¯ w) ¯ > 0, while Av , Au < 0, Fs¯ , Fw¯ must have the same sign, which must be opposite to that of Fc¯ . Since stw is always an edge by earlier remarks, the nullity of c¯ implies J (¯s , w) ¯ > 0, contradicting Prop. 3.7(ii). Essentially the same argument works for (P17), as up to permutations c = 2s − v = 2z − u = (−2, −1, 2, 0). For (H3) most quadruples cannot give pairs of edges. For we observe that u (resp. v, w) is present iff x (resp. y, z) is. Thus, if u is missing, so is x, and v, w, y, z must all be present (otherwise we do not have a 2-dimensional polygon). But we now get a rectangle, which is not admissible. Hence all vertices are present and by symmetry we may assume that one of our edges is uz or uv. From this we quickly find that the two possible c (up to permutations) are (−1, −1, −1, 2) = 2y − x = 2z − u and (1, −1, 1, −2) = 2v − u = 2w − x. Both cases are ruled out by Lemma 3.13. For (H2), observe that y (resp. y ) is present iff z (resp. z ) is. As these four vectors cannot all be absent (otherwise we do not have a 2-dim polygon), by the symmetries of (H2), we can assume y is present. If v is present, then all possibilities are eliminated by Theorem 3.11. (Note that although 2α − y = 2z − z it is impossible for αy and zz to both be edges.) On the other hand, if v is absent, then y z is an edge. Since the polygon cannot be a parallelogram or a triangle, it follows that u is present and the polygon is a pentagon. In this case, the only possibility compatible with Theorem 3.11 is c = (0, −1, 2, −2) = 2y − u = 23 y − 21 z . (This is not a priori ruled out by Theorem 3.11 as y z has an interior point β).


To discuss (H1), we write u = (−2, 1, 0), p = (0, 1, −2), v = (1, 0, −2), w = (0, −2, 1), s = (1, −2, 0), q = (−2, 0, 1) for the vertices, x = (−1, 1, −1), y = (−1, −1, 1), z = (1, −1, −1) for midpoints of the longer sides, and α = (−1, 0, 0), β = (0, 0, −1), γ = (0, −1, 0) for the interior points, with the understanding that the rest of the components of the above vectors are zero. As before we consider pairs of vectors which can form edges of an admissible polygon. We then compute the possibilities for c and apply Theorem 3.11. This will eliminate most possibilities. (For many quadruples of points we can see, as in (H3), that they cannot all be vertices.) So up to permutations, the remaining possibilities are as follows:

If no type II is present:
(1) c = 2α − u = 2β − p = (0, −1, 0, . . .),
(2) c = 2u − v = 2q − s = (−5, 2, 2, . . .),
(3) c = 2u − p = 2q − γ = (−4, 1, 2, . . .),
(4) c = 2q − u = 2α − p = (−2, −1, 2, . . .),

If all type II are present:
(5) c = 2u − x = 2q − y = (−3, 1, 1, . . .),
(6) c = 2u − y = 2x − z = (−3, 3, −1, . . .),
(7) c = (3y − z)/2 = 2q − u = (−2, −1, 2, . . .),
(8) c = (3p − u)/2 = (3v − s)/2 = (1, 1, −3, . . .).
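Each of the eight expressions for c is an affine combination of two of the listed points of (H1), so the stated values can be recomputed directly. The following sketch does this in exact arithmetic on the first three coordinates (the remaining ones being zero); the dictionary keys `alpha`, `beta`, `gamma` stand for α, β, γ, and the data layout is an illustrative choice.

```python
from fractions import Fraction as F

PTS = {
    "u": (-2, 1, 0), "p": (0, 1, -2), "v": (1, 0, -2),
    "w": (0, -2, 1), "s": (1, -2, 0), "q": (-2, 0, 1),
    "x": (-1, 1, -1), "y": (-1, -1, 1), "z": (1, -1, -1),
    "alpha": (-1, 0, 0), "beta": (0, 0, -1), "gamma": (0, -1, 0),
}

def combine(coeffs):
    """Linear combination sum_k t_k * PTS[k] for coeffs = {name: t_k}."""
    return tuple(sum(F(t) * PTS[name][i] for name, t in coeffs.items())
                 for i in range(3))

half3, mhalf = F(3, 2), F(-1, 2)
CASES = {
    1: ([{"alpha": 2, "u": -1}, {"beta": 2, "p": -1}], (0, -1, 0)),
    2: ([{"u": 2, "v": -1}, {"q": 2, "s": -1}], (-5, 2, 2)),
    3: ([{"u": 2, "p": -1}, {"q": 2, "gamma": -1}], (-4, 1, 2)),
    4: ([{"q": 2, "u": -1}, {"alpha": 2, "p": -1}], (-2, -1, 2)),
    5: ([{"u": 2, "x": -1}, {"q": 2, "y": -1}], (-3, 1, 1)),
    6: ([{"u": 2, "y": -1}, {"x": 2, "z": -1}], (-3, 3, -1)),
    7: ([{"y": half3, "z": mhalf}, {"q": 2, "u": -1}], (-2, -1, 2)),
    8: ([{"p": half3, "u": mhalf}, {"v": half3, "s": mhalf}], (1, 1, -3)),
}

verified = {k: all(combine(c) == expected for c in combos)
            for k, (combos, expected) in CASES.items()}
print(verified)
```

In every case the coefficients sum to 1, so c lies on the line through the two points, outside the segment joining them.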

Again, we cannot immediately rule out (7) and (8) using 3.11 because of the presence of interior points. However for (8) we easily see using the arguments of 3.11 that the elements of C on the line through v, s are c, ¯ c¯1 = (v¯ + s¯ )/2 = z¯ and c¯2 = (3¯s − v)/2. ¯ Now as s, v are type III we need Fc¯ and Fc¯2 to have the same sign, which is the opposite sign to Fc¯1 . But the superpotential equation now gives a contradiction to the fact that A z < 0. In (2)-(7) Lemma 3.13 applied respectively to uv, up, qu, ux, uy, qu gives a contradiction. Note that case (4) only occurs when K is not connected, as the vectors β, γ are absent (cf. 1.2(b)). We are left with (1), which is precisely the situation of Theorem 3.14. We have therefore proved Theorem 7.1. Apart from the situation of Theorem 3.14, the only other possible case where we can have more than one type (2) vertex is, up to permutation of summands, when two type (2) vertices are adjacent and the 2-plane determined by them and c¯ intersects conv( 21 (d + W)) in the pentagon with vertices uyy zz contained in the hexagon (H2). We will be able to rule this case out in Sect. 8.


8. Adjacent (1B) Vertices Revisited

We now return to our classification of when adjacent (1B) vertices can occur. The idea is as follows: each of the configurations of Sect. 6 involves, as well as the null vector c̄, two new null vectors ā, ā′. Hence the arguments of earlier sections also apply to ā, ā′. That is, we may consider the associated polytopes ā and ā′. The following lemma is useful when applied to c̄, ā and ā′.

Lemma 8.1. Suppose we have a (1B) vertex with exactly k adjacent (1B) vertices. Then r ≤ #((1A) vertices) + #((2) vertices) + k + 2. Suppose we have a (1B) vertex with no adjacent (1B) vertices. Then r ≤ #((1A) vertices) + #((2) vertices) + 2. If there are no (1B) vertices then r ≤ #((1A) vertices) + #((2) vertices) + 1.

Proof. By our assumption that dim conv(W) = r − 1, it follows that c̄ is a polytope of dimension r − 2. Any vertex in it has at least r − 2 adjacent vertices. So for a (1B) vertex, the first two statements follow immediately. If there are no (1B) vertices the third inequality follows because c̄ has at least r − 1 vertices.

Lemma 8.2. Configurations (Tr1)-(Tr22) cannot arise from adjacent (1B) vertices.

Proof. The strategy is to count the number of type (1A), (2) and adjacent (1B) vertices in ā or c̄ and apply Lemma 8.1 to get a contradiction.

(i) We first observe that for these configurations c and a have at least four nonzero entries (at least five except for (Tr2)), so they cannot be collinear with an edge vw with points of W in the interior of vw (see Table 3 in Sect. 3). So if c̄ or ā has a type (2) vertex, by Theorem 3.11, c or a must equal 2v − w or (4v − w)/3. It is easy to check that this is impossible except for c in (Tr3), using the forms of c in Tables 1, 2 in Sect. 3.

(ii) Next we consider (1A) vertices. For (Tr1)-(Tr10) we have |ai/di| ≤ 1/3 for all i. For (Tr1), (Tr6), (Tr8) there are three i where equality holds. In these cases one of the associated di equals 1. Moreover for (Tr1) and (Tr8) two of these ai/di equal 1/3 and the third is −1/3, whereas for (Tr6) it is the other way round. For (Tr2)-(Tr5), (Tr7) and (Tr9)-(Tr10) there are only two i where equality holds. Further, for (Tr2) and (Tr5) |ci/di| ≤ 1/3 for all i, with equality for just two i, and here ci/di = 1/3. It follows that for (Tr1), (Tr8) there are at most two (1A) vertices in ā, while for (Tr2)-(Tr5), (Tr7) and (Tr9)-(Tr10) there is at most one. In the case of (Tr6) there are at most three (1A) vertices in general but at most two if K is connected. For (Tr2) and (Tr5) there are no (1A) vertices in c̄. By means of similar considerations, we find that there is one (1A) vertex (corresponding to x̄) in ā for (Tr12), (Tr13), (Tr14), (Tr16), (Tr18), (Tr19) and at most two (1A) vertices for (Tr11), (Tr17), (Tr20) and the d = (1, 1, 3, 6, 6, 6, . . .) case of (Tr21), (Tr22). For (Tr15) there are at most four (1A) vertices in ā, and for the remaining cases of (Tr21) and (Tr22) there are at most r − 4 (1A) vertices (r − 6 of those correspond to (−2^3, 1^j), where j > 6).
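The ratio bounds used in step (ii) can be read off from the listed vectors and dimensions. As an illustration (not part of the original proof), the sketch below computes ai/di for (Tr1) from 3a = (−2, 1, −4, 4, −2) with d = (2, 1, 16, 4, 4), and ci/di for (Tr2) from 3c = (2, 3, −6, −2) with d = (2, 3, 12, 4); the function name is hypothetical.

```python
from fractions import Fraction as F

def ratios(three_v, d):
    """Entries v_i / d_i for the vector v = three_v / 3."""
    return [F(x, 3 * di) for x, di in zip(three_v, d)]

third = F(1, 3)

# (Tr1): a_i / d_i, expected to satisfy |a_i/d_i| <= 1/3 with equality three times.
tr1_a = ratios((-2, 1, -4, 4, -2), (2, 1, 16, 4, 4))
# (Tr2): c_i / d_i, expected to satisfy |c_i/d_i| <= 1/3 with equality twice.
tr2_c = ratios((2, 3, -6, -2), (2, 3, 12, 4))

tr1_extremal = [r for r in tr1_a if abs(r) == third]
tr2_extremal = [r for r in tr2_c if abs(r) == third]
print(tr1_a, tr1_extremal)
print(tr2_c, tr2_extremal)
```

For (Tr1) the extremal ratios are −1/3, 1/3, 1/3, and for (Tr2) both extremal ratios equal 1/3, in line with the statements in (ii).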


(iii) Finally, consider the (1B) vertex ξ¯ in a¯ corresponding to x¯ in each of the triangles. In order for there to be an adjacent (1B) vertex, a¯ must be (up to permutation) of the form of the null vector c¯ in the 2-faces in Theorem 6.18. Now observe that for examples (Tr1), (Tr3), (Tr4), (Tr6)- (Tr20) the null vector a does not appear in the list of possible c. Hence ξ¯ has no adjacent (1B) vertices. From above, type (2) vertices cannot occur, so combining the bounds for (1A) vertices in a¯ with Lemma 8.1 gives an upper bound for r less than the minimum required by each configuration, a contradiction. (iv) Let us now consider (Tr21) and (Tr22). The vector a of (Tr21) has the same form as c in (Tr22) and vice versa. An adjacent 2-face containing aξ c of (Tr21) can only be a triangle of type (Tr22) containing 13 (1, −1, −3, −2, −2, 4, 0, . . .). Thus ξ has at most one adjacent (1B) vertex, and we get the bound r − 2 ≤ 2 + 1 + 0 in the d = (1, 1, 3, 6, 6, 6, . . .) subcase and r − 2 ≤ (r − 4) + 1 + 0 in the other two subcases, a contradiction. (v) For the remaining two triangles (Tr5) and (Tr2) we consider c¯ instead. For (Tr5), observe that c determines the plane x x x and does not occur as a possible null vector for any other configurations. So we have at most one adjacent pair of (1B) vertices in c¯ . From above, there are no type (1A) or (2) vertices. But r ≥ 5, giving a contradiction. For (Tr2), consider the vertex ξ¯ of c¯ corresponding to x . If there is a (1B) vertex adjacent to it, we have a 2-dim face including c, x . By Theorem 6.18 the only one is the face including x, so there is at most one (1B) vertex of c¯ adjacent to ξ¯ . Also, type (1A) and (2) vertices cannot occur, so r ≤ 3, a contradiction. Lemma 8.3. Configurations (Tr23)-(Tr27) cannot arise from adjacent (1B) vertices. Proof. 
Note first that all entries of c, a, a are integers, so Lemma 4.4 shows in each case there is at most one (1A) vertex, and for (Tr23),(Tr24) one checks that there are no (1A) vertices in c¯ . Note also that for all these configurations, as x is a vertex, there are no type II vectors with nonzero entry in place 1. Observe as in Lemma 8.2 that there are no type (2) vertices in c¯ , a¯ or a¯ . (For (Tr24), we need to rule out the possibility that c has the form (4) in Table 3 in Sect. 3 with λ = 3/2. This follows since the interior point in that case would be a type II vector with nonzero entry in place 1.) So in all cases if we have a (1B) vertex with exactly k (1B) vertices adjacent to it, then by Lemma 8.1 we have r ≤ k + 3. For (Tr23), (Tr24) (using c¯ ), we have r ≤ k + 2. We will work with c¯ below. First consider (Tr23) and look for (1B) vertices adjacent to ξ¯ where ξ¯ corresponds to the vertex x. We need a 2-face including c, x. By Theorem 6.18 such a face must be of type (Tr23), and having fixed c and a, the only freedom lies in assigning 1 in the third null vector to the first or fifth place. So k ≤ 2, which gives r ≤ 4, a contradiction. Similarly, for (Tr24), since a type (H1) face cannot contain c and x, we need only consider faces of type (Tr24), for which there are again two possibilities. However, as mentioned above, in one of these possibilities the vector “x ” has a 1 in place 1 and hence cannot occur. So there is at most one (1B) vertex adjacent to ξ , and we deduce r ≤ 3, a contradiction. For (Tr25),(Tr27) we similarly deduce that the only 2-face containing x, c is itself because as above we cannot have any type II vectors with nonzero entry in place 1. So r ≤ 4, a contradiction.


Finally, for (Tr26) the above argument still works since (Tr24) has been ruled out (the vectors a, a of (Tr24) are of the same form as c, a of (Tr26), so a priori (Tr24) could be an adjacent 2-face).

Lemma 8.4. Configuration (S) (square with midpoint) cannot arise from adjacent (1B) vertices.

Proof. We refer to Sect. 6 for the expressions for the vertices v, u, s, w of the square. The null vertex c¯ corresponds to (1, −1, −1, 1, −1, 0, . . .) and the 2-dimensional face is cut out by x2 = −1, x1 + x3 = 0 = x4 + x5, and xk = 0 for k > 5. Lemma 4.4(b) shows that there is at most one (1A) vertex in Δ^{c¯}. As r ≥ 5 and all the nonzero entries of c have the same absolute value, it follows that there are no type (2) vertices.

Let ξ¯ denote the vertex of Δ^{c¯} so that ξ is collinear with u and a = 2u − c = (−1, −1, 1, 1, −1, 0, . . .). A (1B) vertex adjacent to ξ¯ gives a 2-dimensional face including c¯, u¯. By what we have analysed so far about 2-faces given by adjacent (1B) vertices, this face must again be a face of type (S), and the only possibilities are itself or the face obtained from this by swapping indices 2 and 5. Hence there are at most two (1B) vertices adjacent to ξ¯, and at vertex ξ¯ we have 3 ≤ r − 2 ≤ 1 + 2. Thus r = 5 is the remaining possibility, in which case ξ¯ has exactly two adjacent (1B) vertices and one adjacent (1A) vertex.

Let us denote by ξ¯′ the (1B) vertex such that ξ′ is collinear with w and a′ := 2w − c = (1, −1, −1, −1, 1). Let η¯ denote the other (1B) vertex adjacent to ξ¯. Then the 2-face determined by c, ξ, η is cut out by x5 = −1, x1 + x3 = 0 = x2 + x4. The ray cη intersects conv(W) at z = (1, 0, −1, 0, −1) and b := 2z − c = (1, 1, −1, −1, −1) corresponds to a null vertex. Similarly, there is a (1B) vertex η¯′ (besides ξ¯) adjacent to ξ¯′, and the corresponding 2-face (also of type (S)) is cut out by x3 = −1, x1 + x2 = 0 = x4 + x5. The ray cη′ intersects conv(W) at z′ = (0, 0, −1, 1, −1).
The vector b′ := 2z′ − c = (−1, 1, −1, 1, −1) corresponds to a null vertex.

Let us examine the (1A) vertex in Δ^{c¯} more closely. Let y ∈ W be such that J(y¯, c¯) = 0. As r = 5, the null condition for c¯ implies that di ≥ 2 with at most one equal to 2. Also, for some j ∈ {2, 3, 5} (i.e., j is an index for which the corresponding entry of c is −1) we must have yj = −2, so y is type III. Let i be the index such that yi = 1. Then i ∈ {1, 4} (i.e., i is an index for which the corresponding entry of c is 1), and the orthogonality condition implies (di, dj) = (2, 4) or (3, 3). There are thus six possibilities for y, but only one can actually occur.

With the possible exception of the existence of the (1A) vertex, the above arguments apply equally to the projected polytopes Δ^{a¯}, Δ^{b¯}, Δ^{a¯′} and Δ^{b¯′}, as the entries of a, b, a′ and b′ are just permutations of those of c. We claim that whichever possibility for y occurs in Δ^{c¯}, there is another projected polytope with no (1A) vertex. Applying the above arguments to this polytope would result in the contradiction r − 2 ≤ 2 and complete our proof.

We can use Δ^{a¯} (with a = (−1, −1, 1, 1, −1)) for the contradiction if d1 = 2 or if any of (d1, d2), (d3, d4), (d1, d5) = (3, 3). If d4 = 2 or if (d2, d4) = (3, 3) we can use Δ^{a¯′} (with a′ = (1, −1, −1, −1, 1)) instead. Finally, if (d1, d3) = (3, 3) we can use Δ^{b¯′} (with b′ = (−1, 1, −1, 1, −1)) and if (d4, d5) = (3, 3) we can use Δ^{b¯} (with b = (1, 1, −1, −1, −1)).

For example, when (d1, d3) = (3, 3) (so y = (1^1, −2^3)), the null condition for c¯ implies that d2, d4 in particular cannot equal 2 or 3. In order to have a (1A) vertex in Δ^{b¯′}, we must have a type III vector (1^i, −2^j) with i ∈ {2, 4}. But this requires one of d2, d4 to be 2 or 3. When (d4, d5) = (3, 3), then d1, d2 cannot be 2 or 3. But in Δ^{b¯} a (1A) vertex corresponds to (1^i, −2^j) with i ∈ {1, 2}, which implies that one of d1, d2 is 2 or 3. The remaining cases are handled similarly.
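The arithmetic in the proof of Lemma 8.4 can be checked mechanically. The sketch below recomputes the two null vectors obtained from z = (1, 0, −1, 0, −1) and (0, 0, −1, 1, −1), and enumerates the dimension pairs allowed for a type III vector y = (1^i, −2^j) with ci = 1, cj = −1. It assumes (this is not restated in the proof itself) that the orthogonality condition J(y¯, c¯) = 0 takes the form Σ yk ck/dk = 1, the same shape as relation (10.1); the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Null vector of configuration (S), restricted to the five relevant places.
c = (1, -1, -1, 1, -1)


def reflect(z, c):
    """Return 2z - c, i.e. the reflection of c through the point z."""
    return tuple(2 * zi - ci for zi, ci in zip(z, c))


# The two points where the rays from c meet conv(W), as quoted in the proof.
b = reflect((1, 0, -1, 0, -1), c)
b_prime = reflect((0, 0, -1, 1, -1), c)


def orthogonal_dims(max_d=50):
    """Pairs (d_i, d_j) with 1/d_i + 2/d_j = 1.

    ASSUMPTION: orthogonality reads sum_k y_k c_k / d_k = 1; for
    y = (1^i, -2^j) with c_i = 1 and c_j = -1 this is 1/d_i + 2/d_j = 1.
    """
    return [
        (di, dj)
        for di, dj in product(range(1, max_d + 1), repeat=2)
        if Fraction(1, di) + Fraction(2, dj) == 1
    ]


if __name__ == "__main__":
    print(b, b_prime)
    print(orthogonal_dims())
```

Under the stated assumption the enumeration returns only (2, 4) and (3, 3), in line with the proof, and both reflected vectors are permutations of the entries of c.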


Lemma 8.5. The subrectangle yy′zz′ of (H2) cannot arise from adjacent (1B) vertices.

Proof. Recall c = (−2, 1, 0, . . .), so by Lemma 4.4 there are no (1A) vertices of Δ^{c¯}. Moreover, using Tables 1-3 in Sect. 3, one may check that there are no type (2) vertices either. (Note that type II vectors other than y, z with a nonzero entry in place 1 cannot occur as then the subrectangle cannot be a face. Similarly the line through c, α, β and (1, −2, 0, . . .) will not give a type (2) vertex as this line cannot be an edge.) Let η¯ denote the vertex of Δ^{c¯} collinear with c and y. Any (1B) vertex adjacent to η¯ will give rise to a face containing c and y, which cannot be of type (H1), and must therefore be of type (H2), since we have eliminated all other possibilities. In fact, it must be the face we started with. So there is just one (1B) vertex adjacent to η¯, and from above there are no (1A) or (2) vertices. As r ≥ 4 for (H2), this contradicts Lemma 8.1.

Lemma 8.6. If r ≥ 4, configuration (H1) or subshapes cannot arise from adjacent (1B) vertices.

Proof. We first note some special properties of W. Since (H1) is a face, there can be no type II vectors in W with nonzero entry in a place ∈ {1, 2, 3} and in a place ∉ {1, 2, 3}. Also, if (−2^i, 1^k) ∈ W with i ∈ {1, 2, 3}, k ∉ {1, 2, 3}, then (−1^k) must be absent, which has strong implications, as noted in Remark 1.2(b).

Let ξ¯ be a (1B) vertex in the plane. We have at most one (1B) vertex adjacent to ξ¯, as the associated face must again be of type (H1) and is now determined by c¯, ξ¯. It also readily follows that c cannot be collinear with an edge of W not in the face (assuming as usual we are not in the situation of Theorem 3.14). Now the special properties of W in the first paragraph imply that (1A) vertices in Δ^{c¯} can correspond only to type III vectors in W which overlap with c.
A straightforward check using the null condition for c¯ shows that the possible type IIIs have form (−2^i, 1^k) with i ∈ {1, 2, 3}, k ∉ {1, 2, 3} and ci/di = −1/2. It follows that di = 2 or 3 and hence, by nullity, the index i is unique. So there are at most r − 3 (1A) vertices. By Lemma 8.1, all r − 3 (1A) vertices must occur. Applying Cor. 4.3 we conclude that r ≤ 4 (as di ≠ 4 and r > 4 forces i ∈ Ŝ_{≥2}).

We will now improve this estimate to r ≤ 3. Let the vertices of (H1) be as in Sect. 7. If r = 4 then a (1A) vertex does exist and we can take it to come from t = (−2, 0, 0, 1) with (−1^4) absent. It follows that besides t the only other possible members of W lying outside the 2-plane containing (H1) are (0, −2, 0, 1) and (0, 0, −2, 1). As noted just before Theorem 6.12 we may assume the type I vectors α, β, γ are all present. (If K is connected, d4 = 1 and so this last fact follows without having first to eliminate those subshapes not containing one of the type I vectors.) As noted above, d1 = 2 or 3, and c1 = −1 or −3/2 respectively.

First consider c1 = −1, so d1 = 2. Now c = (−1, c2, −c2, 0), and by swapping the 2, 3 coordinates if necessary, we may take c2 > 0. Observe u, q are absent, as if u is present or if u is absent but q is present, then u (resp. q) gives a (1B) vertex, which contradicts nullity as the associated a would have a1 = −3. Now the type II vectors x, y, z are absent, as if one is present they all are, and we have a type (2) vertex. We deduce α gives a (1B) vertex so a = (−1, −c2, c2, 0). The other (1B) vertex cannot correspond to w since β, γ are present. It also cannot be given by v, s as this violates nullity, so must correspond to p or β.

If it is p, we have a′ = (1, 2 − c2, c2 − 4, 0). Now Remark 3.9 implies c2 = (d3 + 4d2)/(d3 + 2d2) so 1 < c2 < 2. But now no entry of a′ equals −1 or −3/2. We can now check that there are no (1A) or (2) vertices with respect to a′, so there is at most one vertex of Δ^{a¯′} adjacent to p, a contradiction.

If it is β then p, v must be absent. Now a′ = (1, −c2, c2 − 2) and Remark 3.9 implies c2 = 1. Hence c = (−1, 1, −1), a = (−1, −1, 1), a′ = (1, −1, −1), and nullity implies 1/d2 + 1/d3 = 1/2. It is easy to check by considering the vertices of Δ^{a¯} that w, s must also be absent, so W just contains the three type I vectors, t and possibly one or both of (0, −2, 0, 1), (0, 0, −2, 1). But we can check that, if present, these three latter vectors give respectively vertices with respect to a, a′ which cannot satisfy any of the conditions (1A), (1B) or (2). So in fact we have r = 3. Similar arguments rule out the case c1 = −3/2.

Lemmas 8.2-8.6 give the following improvement of Theorem 6.18.

Theorem 8.7. It is impossible to have adjacent (1B) vertices except possibly when r = 3, in which case conv(½(d + W)) is a proper subface of (H1) containing all three type I vectors (e.g., the tri-warped example (Tr28)).

We are now in a position to strengthen Theorem 7.1 by eliminating the remaining case of the pentagon.

Theorem 8.8. Let c¯ be a null vertex of conv(C) such that Δ^{c¯} contains more than one type (2) vertex. Then we are in the situation of Theorem 3.14.

Proof. We just have to eliminate the case of the pentagon uyy′zz′ in Theorem 7.1. Recall r ≥ 4 for this configuration, and c is (0, −1, 2, −2, . . .). Using the nullity of c¯ we check that the only elements of W which can give an element of c¯⊥ are (−2^2, 1^i) where i > 4 and we have d2 = 2. Note that (1^1, −2^2) cannot be present as then y′z′ is not an edge. By Cor. 4.3, at most one such vector can arise. So there is at most one (1A) vertex, which occurs only if r ≥ 5.
If we can show there are no (1B) vertices, then we are done because if we look at the adjacent vertices of the type (2) vertex associated to y (in the pentagon), besides one (1A) possibility, the other possibility is the type (2) vertex associated to y (by Theorem 7.1). As there must be at least r − 2 adjacent vertices, we deduce r − 2 ≤ 2, so r = 4. But now, from above there is no (1A) vertex so in fact we get r − 2 ≤ 1, a contradiction.

We now use Remark 3.9 to make a list of the possible x ∈ W associated to (1B) vertices of Δ^{c¯}. These are (0, 1, −1, −1, 0, . . .), (0, −2, 1, 0, . . .), (1^3, −1^i, −1^j), (−1^4, 1^i, −1^j), (1^3, −1^4, −1^i), (−1^3, −1^4, 1^i) and (1^3, −2^i), where i, j ≠ 2, 3, 4. Note that type II vectors with nonzero entries in places 2, k, m cannot occur except for y′, z′ as then y′z′ is not an edge.

For each x in this list, we consider the projected polytope Δ^{a¯}, where a = 2x − c. By looking at the form of a, we see from Theorem 7.1 that there is at most one type (2) vertex in Δ^{a¯}. Also, the nonzero components of a are either ≥ 1 or ≤ −2. By Lemma 4.4(a), there are no vertices of type (1A) in Δ^{a¯}. Since r ≥ 4, by Theorem 8.7, the type (1B) vertex in Δ^{a¯} corresponding to x¯ has no adjacent (1B) vertices. So we have a contradiction to Lemma 8.1.

The above result together with Theorem 8.7 and Lemma 8.1 gives us lower bounds on the number of (1A) vertices.

Theorem 8.9. Let c¯ be a null vertex of conv(C) and Δ^{c¯} be the corresponding projected polytope. Suppose further that c is not type I, i.e., we are not in the case of Theorem 3.14.

(i) If there are no (1B) vertices in Δ^{c¯}, then there are at least r − 2 type (1A) vertices.

(ii) If either there is a type (2) vertex or r ≥ 4, then there are at least r − 3 type (1A) vertices in Δ^{c¯}. Hence there are at least r − 3 elements of ½(d + W) orthogonal to c¯.

9. Type (2) Vertices

In this section we consider again type (2) vertices of Δ^{c¯}. In view of Theorem 8.8, it remains to deal with the case of a unique type (2) vertex in Δ^{c¯}. By Theorem 8.7 there are no adjacent (1B) vertices in this situation. Let c be collinear with an edge vw of conv(W). We first consider the situation where there are no interior points of vw lying in W. By Theorem 3.11, we have the two possibilities c = 2v − w and c = (4v − w)/3. Moreover, a preliminary listing of the cases appears in Tables 1 and 2 of Sect. 3.

Case (i). c = 2v − w. We have to analyse cases (1)-(7) in Table 1 of Sect. 3. The idea is to determine the number of (1A) and (1B) vertices using respectively Lemma 4.4 and Remark 3.9, and then get a contradiction (sometimes using Theorem 8.9). Note that J(w¯, w¯) < 0 for (1)-(3).

In (1), (2) and (4)-(7), Lemma 4.4 shows that there are no elements of ½(d + W) orthogonal to c¯ (recall c ∉ W), so Theorem 8.9 shows that r ≤ 3. This already gives a contradiction in case (7). (Note that when r = 3 and w is type I, since w is a vertex there are no type II vectors in W.)

In (1) the only x ∈ W that could satisfy Eq.(3.1) and give a (1B) vertex with respect to c¯ is (1, −1, −1). But the associated a = 2x − c is (2, −3, 0) and it easily follows that a¯, c¯ cannot both be null. For (2), the possible x ∈ W which correspond to (1B) vertices are (1, −2, 0) and (1, −1, −1) respectively. In each case we find the nullity of c¯ and Remark 3.9 imply J(w¯, w¯) > 0, a contradiction to F_{w¯}^2 J(w¯, w¯) = A_w < 0 (as w is type III). In (5) with r = 3, one checks that the only possible x ∈ W corresponding to a (1B) vertex is (0, 1, −2).
Let us consider the distribution of points of W in the plane x1 + x2 + x3 = −1. The point (0, 1, −2), if present, would lie on one side of the line vw while the point (−1, 0, 0) lies on the other side. Now (−1, 0, 0) must lie in W as otherwise v cannot be present by Remark 1.2(b). So since vw is an edge by assumption, (0, 1, −2) cannot lie in W, which gives a contradiction to Theorem 8.9(i). Hence in (1), (2), (5) Theorem 8.9 shows r ≤ 2, which is a contradiction.

In case (4) the nullity of c¯ translates into 1 = 9/d1 + 4/d2. Hence d2 ≠ 1, so if K is connected (0, −1, 0, . . .) is present and w is not a vertex, which is a contradiction. If r = 3 and K is not connected, by Remark 1.2(b), (1, −2, 0) and (0, −2, 1) must be absent, and, from Remark 3.9, the possibilities for x ∈ W associated to the (1B) vertex are x = (−2, 0, 1) and y = (0, 1, −2). In the first case, conv(W) is the triangle with vertices v, w, x and a = 2x − c = (−1, −2, 2). Now J(a¯, w¯) > 0, contradicting the superpotential equation. In the second case, a = (3, 0, −4) with J(a¯, y¯) > 0, and aw intersects conv(W) in an edge. By Theorem 3.11, t = (1, 0, −2) ∈ W and conv(W) is a parallelogram with vertices v, y, w, t. Moreover, Remark 3.12 implies that a and w are the only elements of C in aw. But then the midpoint (0, 0, −1) of wt is unaccounted for in the superpotential equation.

For (6) with r = 3, there should be at least two vertices in Δ^{c¯}. But we find there are no (1B) vertices, a contradiction. So r = 2, and we are in the situation of the double warped product Example 8.2 of [DW4].
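The nullity relation 1 = 9/d1 + 4/d2 quoted for case (4) has only finitely many solutions in positive integers, and they are easy to list. The following sketch enumerates them; the search bound of 100 is an illustrative choice (the relation is equivalent to (d1 − 9)(d2 − 4) = 36, which forces d1 ≤ 45 and d2 ≤ 40), and the function name is hypothetical.

```python
def nullity_solutions(limit=100):
    """Positive integer pairs (d1, d2) with 9/d1 + 4/d2 = 1.

    Cleared of denominators, the relation reads 9*d2 + 4*d1 = d1*d2,
    i.e. (d1 - 9) * (d2 - 4) = 36, so every solution lies below `limit`.
    """
    return [
        (d1, d2)
        for d1 in range(1, limit + 1)
        for d2 in range(1, limit + 1)
        if 9 * d2 + 4 * d1 == d1 * d2
    ]


if __name__ == "__main__":
    for d1, d2 in nullity_solutions():
        print(d1, d2)
```

Every solution has d2 ≥ 5, so the summand with dimension d2 is never one-dimensional.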


In case (3), Lemma 4.4 shows ½(d + W) ∩ c¯⊥ has at most one element. Hence Δ^{c¯}, which has dimension ≥ 2 since r ≥ 4, must contain at least one (1B) vertex. By Theorem 8.7, such a (1B) vertex has at most 2 adjacent vertices. It follows that r = 4 and (1, −2, 0, 0) corresponds to the (1A) vertex; also d2 = 2. Also, since (−1, 0, 0, 0) ∈ W, (0, −1, 0, 0) cannot be a vertex of conv(W). But now routine computations using Eq.(3.1) show there are no (1B) vertices, a contradiction.

So the only possible case if K is connected is that giving Example 8.2 of [DW4]. If K is disconnected there is the further possibility of (4) with r = 2, i.e., W = {(−2, 1), (−1, 0)}. This is discussed in the third paragraph of Example 8.3 of [DW4]. An example in the inhomogeneous setting is treated there and in [DW2]. An example where the hypersurface is a homogeneous space G/K is discussed in the concluding remarks at the end of Sect. 10.

Case (ii). c = (4v − w)/3. For clarity of exposition let us assume K is connected, using the assumption as indicated in Remark 5.9. We examine the cases (1)-(11) in Table 2 of Sect. 3.

Some of these cases can be immediately eliminated. In (3), Eq.(3.2) implies (d1, d2) = (3, 3) or (4, 1). In neither case is c¯ null. In (11) Eq.(3.2) and J(v¯, w¯) > 0 imply (d1, d2, d3) = (2, 4, 4), (2, 5, 2) or (3, 3, 3), and again c¯ is not null. In (4) and (6) Eq.(3.2) implies (d1, d2) = (2, 1) and (d1, d2) = (3, 9) or (4, 3) respectively. In neither case does the nullity condition have an integral solution in d3.

Further cases can be eliminated by finding the possible (1A) vertices (using Lemmas 3.10 and 4.4) for the given value of c and using Theorem 8.9. In particular, we get a contradiction whenever r ≥ 4 and there are no (1A) vertices.

In (1), Eq.(3.2) implies (d1, d2) = (2, 1), and nullity of c¯ implies 8/d3 + 32/d4 = 3. But we now find that ½(d + W) ∩ c¯⊥ is empty, giving a contradiction as r ≥ 4.
In (5) Eq.(3.2) and nullity imply (d1, d2) = (3, 3) and {d3, d4} = {3, 8}. One can now check that ½(d + W) ∩ c¯⊥ is empty, which is a contradiction as r ≥ 4.

In (7), Eq.(3.2) implies (d1, d2) = (2, 1) and nullity implies 1/d3 + 1/d4 + 1/d5 = 3/8. One can now check that the only possible elements u¯ orthogonal to c¯ correspond to u = (1, 0, 0, −2, . . .) if d4 = 4 and (1, 0, 0, 0, −2, . . .) if d5 = 4. The nullity condition means that at most one of these can occur. This is a contradiction as r ≥ 5.

In (8) Eq.(3.2) gives (d1, d2) = (2, 3) and nullity of c¯ gives 1/d3 + 1/d4 = 1/4. Again one can check that ½(d + W) ∩ c¯⊥ is empty, a contradiction as r ≥ 4.

In (9), Eq.(3.2) and the nullity of c¯ give (d1, d4) = (2, 16) and {d2, d3} = {2, 3}. The only u which can give u¯ ∈ c¯⊥ are (0, −2, 0, 0, 1^i, . . .) if d2 = 2 or (0, 0, −2, 0, 1^i, . . .) if d3 = 2, where i ≥ 5. In each case, i is unique since d2 (resp. d3) ≠ 4. Since r ≥ 4, Theorem 8.9 now implies r = 4. But now these u are not present (as i ≥ 5). Therefore there are actually no (1A) vertices, a contradiction to r = 4.

In (10) we have (d3, d4) = (2, 16) and {d1, d2} = {2, 3}. The only u which can give an element of c¯⊥ is (0, −2, 0, 0, 1^i, . . .) (for i unique and ≥ 5) if d2 = 2. The final argument in (9) now applies equally here.

Finally, we can eliminate (2) by an analysis of both the (1A) and (1B) vertices. First, Eq.(3.2) and nullity of c¯ force (d1, d2, d3) = (6, 1, 8). Next we check that ½(d + W) ∩ c¯⊥ is empty, so r = 3. Using Remark 3.9 we then find there can be no (1B) vertices, giving a contradiction. So case (ii) cannot occur if K is connected.

Remark 9.1. Case (ii) is the only part of this section that relies on the connectedness of K. In fact, the analysis of the cases where w is type III does not use this assumption. If K is not connected, using the same methods and with more computation we obtain the following additional possibilities (all of which are associated to a w of type II):

case | v | w | c = (4v − w)/3 | d | r
(9∗) | (0, −1, −1, 1) | (1, −1, −1, 0) | (−1/3, −1, −1, 4/3) | (1, 2, 6, 8) | 4, 5
 | | | | (1, 6, 2, 8) | 4, 5
(10∗) | (1, −1, 0, −1) | (1, −1, −1, 0) | (1, −1, 1/3, −4/3) | (3, 3, 1, 8) | 4
 | | | | (6, 2, 1, 8) | 4, 5
(14) | (0, −1, 0, 1, −1) | (1, −1, −1, 0, 0) | (−1/3, −1, 1/3, 4/3, −4/3) | (1, 3, 1, d4, d5) | 5
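The entries listed in Remark 9.1 can be sanity-checked with exact rational arithmetic. The sketch below recomputes c = (4v − w)/3 for cases (9∗) and (10∗), evaluates Σ ci²/di for each listed d, and enumerates the pairs {d4, d5} with 1/d4 + 1/d5 = 1/4 that the remark requires in case (14). It assumes that the null condition on c¯ reads Σ ci²/di = 1, a normalisation inferred from the tabulated values rather than restated in this section; all function and variable names are illustrative.

```python
from fractions import Fraction as F


def c_from(v, w):
    """The vector c = (4v - w)/3 of Case (ii)."""
    return tuple(F(4 * vi - wi, 3) for vi, wi in zip(v, w))


def null_sum(c, d):
    """Sum of c_i^2 / d_i (ASSUMED form of the null condition: equals 1)."""
    return sum(ci * ci / di for ci, di in zip(c, d))


# (v, w, list of dimension vectors d) for the two four-summand cases.
rows = {
    "9*": ((0, -1, -1, 1), (1, -1, -1, 0), [(1, 2, 6, 8), (1, 6, 2, 8)]),
    "10*": ((1, -1, 0, -1), (1, -1, -1, 0), [(3, 3, 1, 8), (6, 2, 1, 8)]),
}

null_values = {
    label: [null_sum(c_from(v, w), d) for d in ds]
    for label, (v, w, ds) in rows.items()
}


def case_14_pairs(limit=200):
    """Unordered pairs (d4, d5), d4 <= d5, with 1/d4 + 1/d5 = 1/4."""
    return [
        (a, b)
        for a in range(1, limit + 1)
        for b in range(a, limit + 1)
        if F(1, a) + F(1, b) == F(1, 4)
    ]


if __name__ == "__main__":
    print(null_values)
    print(case_14_pairs())
```

With the assumed normalisation every listed d gives the value 1, and the enumeration reproduces the three pairs {5, 20}, {6, 12}, {8, 8}.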

In (9∗) and (10∗) there is always a (1B) vertex in Δ^{c¯}, and r = 4 or 5 according to whether the cardinality of c¯⊥ ∩ ½(d + W) is 1 or 2. The dimensions d4, d5 in (14) must satisfy 1/4 = 1/d4 + 1/d5 (i.e., {d4, d5} = {5, 20}, {6, 12}, {8, 8}) and again there is always a (1B) vertex in Δ^{c¯}.

Interior points. Finally, we must consider the cases, listed in Table 3, Sect. 3, when there may be points of W in the interior of vw. As in the earlier cases, we analyse the possible (1A) and (1B) vertices for these c.

For (1) and (2), as 1 < λ ≤ 2, the nonzero entries of c are either < −2 or > 1. Hence by Lemma 4.4 there are no (1A) vertices. By Theorem 8.9 we have r = 2 or 3.

In case (3) the nullity of c¯ implies that a vector u ∈ W not collinear with vw and with u¯ ∈ c¯⊥ must be of the form (−2, 0, 1^j). So 2λ − 1 = d1/2 and c = (−d1/2, −1 + d1/2, 0, . . .). From the range for λ and the nullity of c¯, we have d1 = 3, d2 = 1. But d2 ≠ 1 since w ∈ W. So again there are no (1A) vertices and by Theorem 8.9 we have r ≤ 3.

In case (4), a straightforward preliminary analysis reduces the possibilities of u ∈ W such that u¯ ∈ c¯⊥ to the choices u = (−2, 0, 1, . . .), (0, −2, 1, . . .) or (−1, −1, 1, . . .). Note that c1 < −2 and c2 = 1, so by Lemma 4.4(a) we see c3 < 1, i.e. λ < 3/2. Now the second vector cannot occur because the orthogonality equation and λ ≤ 2 imply that d3 = 1, contradicting the presence of w. Since the three vectors are collinear, if two satisfy the orthogonality equation then all do. So there is at most one (1A) vertex and so r ≤ 4 by Theorem 8.9. This can be improved to r ≤ 3 as follows. If the third vector (−1, −1, 1, . . .) occurs then the orthogonality relation, the bound on λ, and nullity imply that d1 = 5, d3 = 2 and λ = (5/7)(2 + 1/d2). Now the nullity equation may be written as a quadratic in 2 + 1/d2 with no rational root. If the first vector (−2, 0, 1, . . .) occurs then orthogonality implies λ = d1(d3 + 2)/(4d3 + 2d1), and the bound on λ gives 6/d1 + 1/d3 > 1. Nullity implies d1 ≥ 5 and d1 > d3^2. We can deduce d3 = 2 and λ = 2d1/(d1 + 4), and one can check that nullity fails.

For case (5), again a straightforward analysis of the orthogonality condition with the help of the nullity of c¯ gives the following u ∈ W as possibilities such that J(c¯, u¯) = 0:

(a) (−2^3, 1^i), i ≥ 4 and d3 = 2,
(b) (1, −2, 0, . . .),
(c) (−2^2, 1^i), i ≥ 4 and d2 = 3,
(d) (0, −2, 1, 0, . . .).

Note (1, 0, −2, . . .) cannot be in W as then vw is not an edge. It follows from Cor. 4.3 that among (a) only one vector can occur and among (b), (c), (d) also only one vector can occur. (The orthogonality conditions of (b) and (d) are incompatible with 1 < λ ≤ 2.) So c¯⊥ ∩ ½(d + W) contains at most two elements. If it has two elements, one must then come from (a) and the other from (b)-(d). Together they give an edge of c¯⊥ ∩ conv(½(d + W)) with no interior points in ½(d + W). Using Cor. 4.3 and the null condition, we find that none of these two-element cases can occur. Hence r ≤ 4.

If r = 4 then the possible u with J(c¯, u¯) = 0 are given by (a)-(d) with i = 4. Now, we can show using techniques similar to those of Theorem 3.11 that c∗ = (1 − 2λ, 2λ − 1, −1, 0) also gives a null element of C. The possible vectors orthogonal to this element come from (a) and the vectors (b∗), (c∗), (d∗) obtained from (b), (c), (d) by swapping places 1 and 2.

If (a) does not give an element in c¯⊥ ∩ ½(d + W), it is straightforward to show, using the orthogonality and nullity conditions for c¯ and c∗ together, that the (1A) vertices for c¯ and c∗ are given by (b) and (b∗) respectively. Also we must have c = (4/3, −4/3, −1, 0) and d = (4, 4, 9, d4). We need a (1B) vertex outside x4 = 0. From Remark 3.9, the only possible (1B) vertices for c¯ correspond to (1, 0, 0, −2) and (1, −1, 0, −1). In particular there can be no vertices, and hence no elements of W, with x4 > 0. Hence (−1, −1, 0, 1) and therefore (1, −1, 0, −1) are not in W. So the (1B) vertices for c¯ and c∗ are given by (1, 0, 0, −2) and (0, 1, 0, −2) respectively. Now the line joining the corresponding null vectors a, a∗ misses conv(W), a contradiction.

The remaining case is when (a) gives the element in c¯⊥ ∩ ½(d + W) and in (c∗)⊥ ∩ ½(d + W). Now for vw to be an edge we need (−1^4) absent, so by Remark 1.2(b) the only possible members of W lying outside {x4 = 0} are the three type IIIs with x4 = 1. In particular, all vectors in conv(W) have x4 ≥ 0. As (−1, −1, 1, 0) ∈ W, there must be (1B) vertices lying in {x4 = 0} for both c¯ and c∗. We then find that the only possibilities for such a (1B) vertex are given by (b), (b∗) respectively. It follows that d1 = d2, but now nullity is violated.
We conclude that there are no (1A) vertices, so r ≤ 3.

Theorem 9.2. Let c¯ be a null vertex in C such that Δ^{c¯} contains a type (2) vertex corresponding to an edge vw of conv(W). Suppose we are not in the situation of Theorem 3.14.

(i) If there are no points of W in the interior of vw, then either we are in the situation of Example 8.2 of [DW4] or K is not connected and we are in one of the cases in the table of Remark 9.1 or in the situation of the third paragraph of Example 8.3 of [DW4].

(ii) If there are interior points of vw in W then r ≤ 3.

For further remarks about the r = 2 case see the concluding remarks at the end of Sect. 10.

10. Completing the Classification

Throughout this section we will assume that K is connected (and we are not in the situation of Theorem 3.14). Theorems 8.7 and 9.2 then tell us that if r ≥ 4 there are no type (2) vertices and no adjacent (1B) vertices in Δ^{c¯}, for any null vector c¯ ∈ C. Since all (1B) vertices lie in the half-space {J(c¯, ·) > 0} bounded by the hyperplane c¯⊥ containing the (1A) vertices, we must therefore have in each Δ^{c¯} exactly one (1B) vertex, with the remaining vertices all of type (1A). So if r ≥ 4 the only remaining task is to analyse such a situation.


As dim Δ^{c¯} = r − 2, we see dim(Δ^{c¯} ∩ c¯⊥) = r − 3. In particular, there must be at least r − 2 elements of W giving elements of c¯⊥ ∩ ½(d + W). Theorem 5.18 lists the possible configurations of such elements when r ≥ 3. The above discussion, together with Remark 5.19, shows that in cases (1), (2), (3) we can take m = r − 1, in cases (5)(ii), (5)(iii), (6)(i), (6)(ii) we can take m = r, and in (6)(iii) we have r = 5. Finally, since the vectors in (4) are collinear, so that dim(Δ^{c¯} ∩ c¯⊥) = 1, we have r = 3 or 4. If r = 3, it follows that c¯ and the edge in conv(½(d + W)) determined by the vectors in (4) are collinear. This contradicts the orthogonality condition for the configuration of vectors in (4) and we conclude that r = 4.

For each configuration we can consider the possible vectors u ∈ W giving the (1B) vertex. Besides the nullity condition on c¯ and the condition that c¯ should be orthogonal to the (1A) vectors, we have a further relation coming from the null condition in Remark 3.9. In most cases, routine (but occasionally tedious) computations show that these relations have no solution. As a result, we obtain the following possibilities for u (up to obvious permutations):

Table 7. Unique 1B cases

case | (1A) vectors | possible (1B) vector
(1) | (−2^1, 1^i), 2 ≤ i ≤ r − 1 | (−1^r), (1^2, −2^r), (−1^1, −1^2, 1^r)
(2) | (1^1, −2^i), 2 ≤ i ≤ r − 1 | (−1^2), (−1^2, −1^3, 1^r)
(3) | (1^1, −2^2); (1^1, −2^3); possibly (1^1, −1^2, −1^3); (−1^2, −1^3, 1^i), 4 ≤ i ≤ r − 1 | (−1^1, 1^2, −1^3), (−1^1, 1^2, −1^r), (1^4, −2^r), (1^1, −2^r), (−1^r)
(4) | (1^1, −2^2), (1^1, −2^3), (1^1, −1^2, −1^3) | (−1^4), (1^1, −2^4), (1^1, −1^2, −1^4)
(5)(i) | (−2^1, 1^2), (−1^1, 1^2, −1^3) | (−1^1), (−1^4)
(5)(ii) | (−2^1, 1^2); (−1^1, 1^3, −1^i), 4 ≤ i ≤ r | (1^3, −2^4), (−1^1, 1^2, −1^4), (1^3, −1^4, −1^5)
(5)(iii) | (−2^1, 1^2); (−1^1, −1^3, 1^i), 4 ≤ i ≤ r | (−2^2, 1^4), (−1^1, −1^2, 1^4), (−1^3, 1^4, −1^5)
(6)(i) | (−1^1, −1^2, 1^i), 3 ≤ i ≤ r | (−1^1, −1^3, 1^4)
(6)(ii) | (1^1, −1^2, −1^i), 3 ≤ i ≤ r | (1^1, −1^3, −1^4), (−1^2, 1^3, −1^4), (1^1, −2^3)
(6)(iii) | (1^1, −1^2, −1^i), i = 3, 4; (1^1, −1^3, −1^4) | (−1^5)

Remark 10.1. The possibilities for the (1B) vertex in cases (5)(ii) and (5)(iii) only apply to the r ≥ 5 situation. When r = 4, the two cases become the same if we switch the third and fourth summands and the possibilities are discussed in Lemma 10.14 below.

Remark 10.2. Note that u such as (−1^2), (−1^3) in (4), (−1^4) in (5)(ii) or (−1^i) with i > 2 in (6)(ii) cannot arise because they will not be vertices, due to the presence of the type II vectors in W.

Remark 10.3. The dimensions must satisfy certain constraints in each case. Some such constraints were stated in Theorem 5.18 and Remark 5.19. We also have constraints coming from the nullity conditions for c¯ and a¯. These typically involve the requirement that some expression in the di is a perfect square. The following is a summary of general constraints in each case:

case (1): d1 = 4;
case (2): d1 = 1;
case (3): d2 = d3 = 2;
case (4): d2 + d3 ≤ 4d1/(d1 − 1) and d2, d3 ≥ 2;
case (5)(i): (d1, d2) = (4, 2), (3, 3);
case (5)(ii,iii): d1 = 2 and if r ≥ 5 also d3 = 2;
case (6)(i,ii): d1 = d2 = 2;
case (6)(iii): d1 = d2 = d3 = d4 = 2, d5 = 25.


Our strategy now is reminiscent of that in Sect. 8. We have a (1B) vertex corresponding to u¯ and a second null vector a¯ satisfying a = 2u − c. Now we may apply our arguments to a, ¯ and conclude that the vectors in a¯ ⊥ ∩ 21 (d + W) are also of the form given in the above table, up to permutation. The resulting constraints will allow us to finish our classification. In some cases we can actually show that a¯ ⊥ ∩ 21 (d + W) is empty and we have a contradiction. A simple example when this happens is case 6(iii), where we now have 1 ). Other cases are treated c = ( 65 , − 25 , − 25 , − 25 , −1) and (ai /di ) = (− 35 , 15 , 15 , 15 , − 25 in Lemma 10.4 below. Next we shall show that cases (6)(i)(ii) cannot arise (cf. Lemma 10.6), so in all remaining cases there must be at least one type III vector w with w¯ in a¯ ⊥ . We now use our explicit formulae for c and a to derive inequalities on the entries of a and find when there can be such a type III vector orthogonal to a. ¯ For each such instance we then check whether a¯ ⊥ ∩ 21 (d + W) forms a configuration equivalent to one of those in Table 4. This turns out to be possible in only two situations (cf. Lemmas 10.7, 10.11 and Lemma 10.9). These have such distinctive features that W can be completely determined and judicious applications of Prop. 3.7 lead to contradictions. This yields our main classification theorem. Lemma 10.4. The following cases cannot arise: case (4) with u = (1, −1, 0, −1), case (6) (ii) with u = (1, 0, −2, . . .), case (4) with u = (0, 0, 0, −1) except for the case c = ( 43 , − 23 , − 23 , −1) with d = (4, 2, 2, 9). Proof. (α) For case (4) with u = (1, −1, 0, −1) we find that the nullity and orthogonality conditions and Remark 3.9 leave us with the following possibilities: d

c

(2, 5, 3, 20)

(1, − 45 , − 43 , 0) (1, − 23 , − 21 , 0) (1, − 43 , − 23 , 0) (1, − 65 , − 45 , 0)

(2, 6, 2, 12) (3, 4, 2, 12) (5, 3, 2, 15)

(ai /di ) 1 3 1 1 ( 2 , − 20 , 4 , − 10 ) 1 1 1 1 ( 2 , − 12 , 4 , − 6 ) ( 13 , − 16 , 13 , − 16 ) 4 2 2 ( 15 , − 15 , 5 , − 15 )

It is easy to see that we can never have $\sum_i \frac{w_i a_i}{d_i} = 1$ for $w \in W$, so $\bar a^\perp \cap \frac12(d+W)$ is empty and we have a contradiction.

(β) Similarly, nullity, orthogonality and Remark 3.9 give:

$$\begin{array}{lll}
d & c & (a_i/d_i) \\
\hline
(2, 2, 225, d_4, \ldots, d_r) & (\tfrac{232}{225}, -\tfrac{29}{30}, -\tfrac14, -\tfrac{d_4}{900}, \ldots, -\tfrac{d_r}{900}) & (\tfrac{109}{225}, \tfrac{29}{60}, -\tfrac1{60}, \tfrac1{900}, \ldots, \tfrac1{900}) \\
(2, 2, 98, d_4, \ldots, d_r) & (\tfrac{52}{49}, -\tfrac{13}{14}, -\tfrac12, -\tfrac{d_4}{196}, \ldots, -\tfrac{d_r}{196}) & (\tfrac{23}{49}, \tfrac{13}{28}, -\tfrac1{28}, \tfrac1{196}, \ldots, \tfrac1{196}) \\
(2, 2, 36, d_4, \ldots, d_r) & (\tfrac{10}9, -\tfrac56, -1, -\tfrac{d_4}{36}, \ldots, -\tfrac{d_r}{36}) & (\tfrac49, \tfrac5{12}, -\tfrac1{12}, \tfrac1{36}, \ldots, \tfrac1{36})
\end{array}$$

Moreover $n$ equals 962, 226, 50 respectively. It is easy to see that we can never have $\sum_i \frac{w_i a_i}{d_i} = 1$ for $w \in W$, so we have a contradiction.
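The table in part (α) can be cross-checked mechanically. The sketch below recomputes $a = 2u - c$ for each row and compares it with the printed $(a_i/d_i)$. It also tests the normalisations $\sum_i c_i = -1$ and $\sum_i c_i^2/d_i = \sum_i a_i^2/d_i = 1$; reading nullity in this way is an assumption inferred from the examples in this section, not a definition quoted from the paper. The names `rows` and `check_row` are illustrative only.

```python
from fractions import Fraction as F

u = (1, -1, 0, -1)  # the vertex used in part (alpha)

# (d, c, a_i/d_i) as listed in the table for part (alpha)
rows = [
    ((2, 5, 3, 20), (F(1), F(-5, 4), F(-3, 4), F(0)),
     (F(1, 2), F(-3, 20), F(1, 4), F(-1, 10))),
    ((2, 6, 2, 12), (F(1), F(-3, 2), F(-1, 2), F(0)),
     (F(1, 2), F(-1, 12), F(1, 4), F(-1, 6))),
    ((3, 4, 2, 12), (F(1), F(-4, 3), F(-2, 3), F(0)),
     (F(1, 3), F(-1, 6), F(1, 3), F(-1, 6))),
    ((5, 3, 2, 15), (F(1), F(-6, 5), F(-4, 5), F(0)),
     (F(1, 5), F(-4, 15), F(2, 5), F(-2, 15))),
]

def check_row(d, c, a_over_d):
    """Recompute a = 2u - c and the (assumed) normalisation sums for one row."""
    a = [2 * ui - ci for ui, ci in zip(u, c)]
    return {
        "a_matches": tuple(ai / di for ai, di in zip(a, d)) == tuple(a_over_d),
        "sum_c": sum(c),
        "norm_c": sum(ci * ci / di for ci, di in zip(c, d)),
        "norm_a": sum(ai * ai / di for ai, di in zip(a, d)),
    }

results = [check_row(*row) for row in rows]
for (d, _, _), res in zip(rows, results):
    print(d, res)
```

Exact rational arithmetic is used so that equality tests are not affected by rounding.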


(γ) For case (4) with $u = (0, 0, 0, -1)$ we find that the nullity and orthogonality conditions and Remark 3.9 leave us with the following possibilities, up to swapping places 2 and 3:

$$\begin{array}{lll}
d & c & (a_i/d_i) \\
\hline
(2, 2, 4, 25) & (\tfrac65, -\tfrac25, -\tfrac45, -1) & (-\tfrac35, \tfrac15, \tfrac15, -\tfrac1{25}) \\
(2, 3, 3, 25) & (\tfrac65, -\tfrac35, -\tfrac35, -1) & (-\tfrac35, \tfrac15, \tfrac15, -\tfrac1{25}) \\
(3, 2, 3, 121) & (\tfrac{15}{11}, -\tfrac6{11}, -\tfrac9{11}, -1) & (-\tfrac5{11}, \tfrac3{11}, \tfrac3{11}, -\tfrac1{121}) \\
(2m-2, 2, 2, m^2) & (\tfrac{2(m-1)}{m}, \tfrac{1-m}{m}, \tfrac{1-m}{m}, -1) & (-\tfrac1m, \tfrac{m-1}{2m}, \tfrac{m-1}{2m}, -\tfrac1{m^2})
\end{array}$$

It is now straightforward to see that we cannot have $w \in W$ with $\sum_i \frac{w_i a_i}{d_i} = 1$, except in two cases (both associated to the last entry of the table). One is the case stated in the lemma. The other occurs if $m = 2$, so $a = (-1, \frac12, \frac12, -1)$, which is orthogonal to $(-1, 0, 1, -1)$, $(-1, 1, 0, -1)$. But as $(0, 0, 0, -1)$ is a vertex, neither of these vectors can be in $W$. So $\bar a^\perp \cap \frac12(d+W)$ is still empty, giving the desired contradiction.

As discussed above, we now turn to showing that case (6) cannot occur. The following remark is useful in finding when type III vectors can give elements of $\bar a^\perp \cap \frac12(d+W)$.

Lemma 10.5. If $w = (-2^i, 1^j)$ and $\bar w \in \bar a^\perp$, then $\frac{a_i}{d_i} < 0$ (assuming we are not in the situation of Theorem 3.14).

Proof. We need
$$\frac{a_j}{d_j} = 1 + \frac{2a_i}{d_i}, \qquad (10.1)$$
so if $\frac{a_i}{d_i} \ge 0$ then $\frac{a_j}{d_j} \ge 1$. Hence $\frac{a_j^2}{d_j} = d_j \left(\frac{a_j}{d_j}\right)^2 \ge d_j \ge 1$. As $\bar a$ is null this means $a = (-1^j)$ and we are in the situation of Theorem 3.14.
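For orientation, Eq. (10.1) is the orthogonality criterion used in the proof of Lemma 10.4, written out for a single type III vector. The display below is a sketch of that step, assuming (as in that proof) that $\bar w \in \bar a^\perp$ amounts to $\sum_k w_k a_k/d_k = 1$; the numerical illustration uses the data $d = (4,2,2,9)$, $a = (-\frac43, \frac23, \frac23, -1)$ appearing in Lemma 10.7.

```latex
\begin{align*}
w = (-2^i, 1^j): \qquad
\sum_k \frac{w_k a_k}{d_k} \;=\; -\frac{2a_i}{d_i} + \frac{a_j}{d_j} \;=\; 1
\quad\Longleftrightarrow\quad
\frac{a_j}{d_j} \;=\; 1 + \frac{2a_i}{d_i}. \\[4pt]
d = (4,2,2,9),\ (a_k/d_k) = \bigl(-\tfrac13, \tfrac13, \tfrac13, -\tfrac19\bigr),\ i = 1: \qquad
1 + 2\cdot\bigl(-\tfrac13\bigr) \;=\; \tfrac13 \;=\; \frac{a_2}{d_2} \;=\; \frac{a_3}{d_3}.
\end{align*}
```

So in this example only $(-2, 1, 0, 0)$ and $(-2, 0, 1, 0)$ satisfy Eq. (10.1) with $i = 1$, in agreement with the statement of Lemma 10.7.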

Lemma 10.6. Configurations of type (6) cannot arise.

Proof. Recall that we have dealt with (6)(iii) and we have $d_1 = d_2 = 2$.

Case (6)(i): We have $u = (-1, 0, -1, 1, \ldots)$, and from the nullity and orthogonality conditions we deduce that
$$(c_i/d_i) = \left(-\frac{m+2}{2(m+1)},\ -\frac{m-2}{2(m-1)},\ \frac{1}{m^2-1},\ \ldots,\ \frac{1}{m^2-1}\right),$$
where $\frac1{d_3} + \frac1{d_4} = \frac{1}{2(m+1)}$ and $n - 1 = m^2$ for some positive integer $m$. We have, therefore,
$$(a_i/d_i) = \left(\frac{-m}{2(m+1)},\ \frac{m-2}{2(m-1)},\ -\frac{2}{d_3} - \frac{1}{m^2-1},\ \frac{2}{d_4} - \frac{1}{m^2-1},\ \frac{-1}{m^2-1},\ \ldots,\ \frac{-1}{m^2-1}\right).$$

Let us estimate the size of the entries in $(a_i/d_i)$. First observe that, as $d_3, d_4 > 2(m+1)$ from above, we have $4(m+1) < d_3 + d_4 \le n - d_1 - d_2 = n - 4 = m^2 - 3$


so we deduce $m \ge 6$. Hence we have $\frac37 \le |\frac{a_1}{d_1}| < \frac12$, $\frac25 \le |\frac{a_2}{d_2}| < \frac12$, $|\frac{a_3}{d_3}| \le \frac6{35}$, $|\frac{a_i}{d_i}| \le \frac1{35}$ for $i \ge 5$. Also note that $\frac{a_4}{d_4} < \frac2{d_4} < \frac1{m+1} \le \frac17$. Finally, $\frac{a_4}{d_4} > 0$, else we would have $2n - 4 = 2m^2 - 2 \le d_4$, which is impossible.

Consider now a type III vector $w = (-2^i, 1^j)$ with $\bar w \in \bar a^\perp$. By Lemma 10.5, we need $i = 1, 3$ or $\ge 5$. If $i = 1$ then, by Eq. (10.1), we have $\frac{a_j}{d_j} = \frac1{m+1} \in (0, \frac17]$. So we must have $j = 4$ and $\frac1{m+1} < \frac2{d_4}$, contradicting our above remarks. If $i = 3$ then $\frac{a_j}{d_j} \ge \frac{23}{35}$, which is impossible. Similarly, if $i \ge 5$ then $\frac{a_j}{d_j} \ge \frac{33}{35}$, which is impossible.

Hence there are no such type III vectors, so we are in case (6) with respect to $\bar a$. We cannot be in 6(ii) as then the null vector has exactly one positive entry (see below), but $a$ has two positive entries. For 6(i), the null vector has exactly two negative entries. Now $a$ has $r - 2$ negative entries, so $r = 4$. But for 6(i) the negative entries have modulus $< 2$, while $a_3 = -2 - \frac{d_3}{m^2-1}$, a contradiction.

Case 6(ii). Here there are two possibilities:

Subcase (α). $u = (0, -1, 1, -1, \ldots)$. Then, as above, the null condition for $\bar c$ gives

$$(c_i/d_i) = \left(\frac{(m-1)(m+2)}{2(m+1)^2},\ -\frac{m+2}{2(m+1)},\ \frac{-1}{(m+1)^2},\ \ldots,\ \frac{-1}{(m+1)^2}\right),$$
where $\frac1{d_3} + \frac1{d_4} = \frac1{2(m+1)}$ and $n - 1 = m^2$. So the vector $(a_i/d_i)$ is given by
$$\left(-\frac{(m-1)(m+2)}{2(m+1)^2},\ \frac{-m}{2(m+1)},\ \frac{2}{d_3} + \frac{1}{(m+1)^2},\ -\frac{2}{d_4} + \frac{1}{(m+1)^2},\ \frac{1}{(m+1)^2},\ \ldots,\ \frac{1}{(m+1)^2}\right).$$

As before, we have $m \ge 6$. We deduce $|\frac{a_1}{d_1}| \le \frac12$, $\frac37 \le |\frac{a_2}{d_2}| < \frac12$, $|\frac{a_3}{d_3}| \le \frac8{49}$, $|\frac{a_4}{d_4}| \le \frac17$, $|\frac{a_i}{d_i}| \le \frac1{49}$ for $i \ge 5$. Also $\frac{a_4}{d_4} < 0$, else $d_4 \ge 2(m+1)^2 > 2n$, which is impossible.

We look for vectors $w = (-2^i, 1^j)$ with $\bar w \in \bar a^\perp$. By Lemma 10.5 we have $i = 1, 2$ or $4$. If $i = 1$ then $\frac{a_j}{d_j} = \frac{m+3}{(m+1)^2}$, so $j = 3$, but now Eq. (10.1) contradicts $\frac2{d_3} < \frac1{m+1}$. A similar argument works if $i = 2$, while if $i = 4$, Eq. (10.1) implies $\frac{a_j}{d_j} > \frac57$, a contradiction.

So 6(ii) must hold for $\bar a$, as we have already ruled out 6(i). But now we need $a$ to have exactly one positive entry, which has modulus $< 2$. So $r = 4$ and this positive entry is $a_3$, but we have $a_3 > 2$, a contradiction.

Subcase (β). $u = (1, 0, -1, -1, \ldots)$. We similarly have

$$(c_i/d_i) = \left(\frac{(m-2)(m+1)}{2(m-1)^2},\ -\frac{m-2}{2(m-1)},\ \frac{-1}{(m-1)^2},\ \ldots,\ \frac{-1}{(m-1)^2}\right),$$
$$(a_i/d_i) = \left(\frac{m^2-3m+4}{2(m-1)^2},\ \frac{m-2}{2(m-1)},\ \frac{1}{(m-1)^2} - \frac{2}{d_3},\ \frac{1}{(m-1)^2} - \frac{2}{d_4},\ \frac{1}{(m-1)^2},\ \ldots,\ \frac{1}{(m-1)^2}\right),$$
where $n - 1 = m^2$ and $\frac1{d_3} + \frac1{d_4} = \frac{m+1}{2(m-1)^2}$. The last two equations easily imply that $m \ge 6$, and so $|\frac{a_i}{d_i}| < \frac12$ for all $i$.


A type III $(-2^i, 1^j)$ giving an element of $\bar a^\perp$ must have $i = 3$ or $4$, by Lemma 10.5. In both cases we find from Eq. (10.1) that $\frac{a_j}{d_j} > \frac12$, which is impossible. So 6(ii) holds for $\bar a$, which is impossible as $a$ has at least two positive entries.
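The closed formulas of Case (6)(i) lend themselves to a quick consistency check. The sketch below is an illustration, not part of the proof: it evaluates $\sum_i c_i$ and $\sum_i c_i^2/d_i$ from the printed $(c_i/d_i)$ with $d_1 = d_2 = 2$ and $d_3 + \cdots + d_r = n - 4 = m^2 - 3$, and finds the smallest $m$ compatible with $4(m+1) < m^2 - 3$. Treating nullity as $\sum_i c_i^2/d_i = 1$ for vectors with $\sum_i c_i = -1$ is an assumption inferred from the examples of this section; the function name `case_6i` is hypothetical.

```python
from fractions import Fraction as F

def case_6i(m):
    """Return (sum of c_i, sum of c_i^2/d_i) for Case (6)(i), with d_1 = d_2 = 2."""
    c1 = 2 * F(-(m + 2), 2 * (m + 1))   # c_1 = d_1 * (c_1/d_1)
    c2 = 2 * F(-(m - 2), 2 * (m - 1))   # c_2 = d_2 * (c_2/d_2)
    rest_dim = m * m - 3                # d_3 + ... + d_r = n - 4, with n - 1 = m^2
    rest_ratio = F(1, m * m - 1)        # common value of c_i/d_i for i >= 3
    sum_c = c1 + c2 + rest_dim * rest_ratio
    norm_c = c1 * c1 / 2 + c2 * c2 / 2 + rest_dim * rest_ratio ** 2
    return sum_c, norm_c

# Smallest m allowed by 4(m+1) < d_3 + d_4 <= m^2 - 3
smallest_m = next(m for m in range(2, 100) if 4 * (m + 1) < m * m - 3)

checks = {m: case_6i(m) for m in range(6, 21)}
print(smallest_m, checks[6])
```

Only the sum $d_3 + \cdots + d_r$ enters, because all $c_i/d_i$ with $i \ge 3$ coincide; the individual values of $d_3, d_4$ are constrained separately by $\frac1{d_3} + \frac1{d_4} = \frac1{2(m+1)}$.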

Lemma 10.7. The only possible example in case (4) is when $c = (\frac43, -\frac23, -\frac23, -1)$, $u = (0, 0, 0, -1)$, and $d = (4, 2, 2, 9)$. $\bar a$ is then in case (1) with $a = (-\frac43, \frac23, \frac23, -1)$ and $\bar a^\perp \cap \frac12(d+W)$ consists of $(-2, 1, 0, 0)$, $(-2, 0, 1, 0)$.

Proof. By Lemma 10.4 we just have to eliminate the possibility $u = (1, 0, 0, -2)$. Now
$$(a_i/d_i) = \left(\frac{8 - 2d_1 + d_4}{d_1(4 + 2d_1 + d_4)},\ \frac{(d_1-1)(2d_1+d_4)}{2d_1(4 + 2d_1 + d_4)},\ \frac{(d_1-1)(2d_1+d_4)}{2d_1(4 + 2d_1 + d_4)},\ -\frac{2}{d_4} - \frac{2(d_1-1)}{d_1(2d_1 + d_4 + 4)}\right).$$
The null condition is
$$-(d_1-1)\,d_4^3 - 4(d_1^2 - d_1 - 1)\,d_4^2 + 4d_1(3d_1^2 + d_1 + 8)\,d_4 + 16d_1^2(d_1+2)^2 = 0.$$

For $w = (-2^i, 1^j)$ with $\bar w \in \bar a^\perp$ we need, by Lemma 10.5, $i = 1$ or $4$. If $i = 1$, then for $j = 2$ or $3$, Eq. (10.1) can be rewritten as $2d_1^2 + d_1d_4 + 2d_1 = -5d_4 - 32$, which is absurd. For $j = 4$ it can be rewritten as $-1 - \frac2{d_4} = \frac{14 + 2d_4 - 2d_1}{d_1(2d_1 + d_4 + 4)}$. So the right hand side is $< -1$, which on clearing denominators is easily seen to be false.

If $i = 4$, then for $j = 1$ Eq. (10.1) becomes $\frac4{d_4} + \frac1{d_1} = 1$. So $(d_1, d_4) = (2, 8), (3, 6)$ or $(5, 5)$, all of which violate the null condition. For $j = 2, 3$ we obtain from Eq. (10.1) the equation $4d_4(d_1 - 1) = (2d_1 + d_4 + 4)(d_1d_4 + d_4 - 8d_1)$, which can only have solutions if $d_4 \le 9$. On the other hand, the null condition has no integer solutions if $d_4 \le 9$. So no such type III exists, contradicting Lemma 10.6.

Lemma 10.8. Configurations of type (5)(i) cannot occur.

Proof. It is useful to note that the null condition for $\bar c$ implies that $d_3 \le 4$ when $(d_1, d_2) = (4, 2)$ and $d_3 \le 3$ when $(d_1, d_2) = (3, 3)$. One further finds the following possibilities:

$$\begin{array}{llll}
u & d & c & (a_i/d_i) \\
\hline
(-1, 0, 0, 0) & (3, 3, 1, 2) & (-1, 1, -\tfrac13, -\tfrac23) & (-\tfrac13, -\tfrac13, \tfrac13, \tfrac13) \\
 & (3, 3, 2, 1) & (-1, 1, -\tfrac23, -\tfrac13) & (-\tfrac13, -\tfrac13, \tfrac13, \tfrac13) \\
 & (4, 2, 1, 3) & (-1, 1, -\tfrac14, -\tfrac34) & (-\tfrac14, -\tfrac12, \tfrac14, \tfrac14) \\
 & (4, 2, 2, 2) & (-1, 1, -\tfrac12, -\tfrac12) & (-\tfrac14, -\tfrac12, \tfrac14, \tfrac14) \\
 & (4, 2, 3, 1) & (-1, 1, -\tfrac34, -\tfrac14) & (-\tfrac14, -\tfrac12, \tfrac14, \tfrac14) \\
(0, 0, 0, -1) & (3, 3, 2, 121) & (-\tfrac9{11}, \tfrac{15}{11}, -\tfrac6{11}, -1) & (\tfrac3{11}, -\tfrac5{11}, \tfrac3{11}, -\tfrac1{121}) \\
 & (4, 2, 2, 25) & (-\tfrac45, \tfrac65, -\tfrac25, -1) & (\tfrac15, -\tfrac35, \tfrac15, -\tfrac1{25})
\end{array}$$

One easily checks that $\bar a^\perp \cap \frac12(d+W)$ is empty in the last two cases, and consists only of type II vectors in the third to fifth cases, giving a contradiction to Lemma 10.6. For the first two cases, note that $\bar a^\perp \cap \frac12(d+W)$ contains $\frac12(d + (-1, -1, 1, 0))$ since by hypothesis for 5(i) $(-1, 1, -1, 0)$ is in $W$. Hence (1),(2) cannot hold with respect to $\bar a$. Also the vector $d$ of dimensions rules out (3),(4) and (5), so we have a contradiction.
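Two of the finite checks in the proofs of Lemmas 10.7 and 10.8 can be replayed by machine. The sketch below first solves $\frac4{d_4} + \frac1{d_1} = 1$ in positive integers and evaluates the left-hand side of the null condition of Lemma 10.7 at the solutions; it then lists, for each row of the table of Lemma 10.8, the index pairs $(i, j)$ for which Eq. (10.1) holds. The helper names `null_poly` and `type_iii_candidates` are hypothetical, and the scan is purely arithmetic: whether a candidate vector actually lies in $W$ is not tested.

```python
from fractions import Fraction as F

def null_poly(d1, d4):
    """Left-hand side of the null condition in the proof of Lemma 10.7."""
    return (-(d1 - 1) * d4 ** 3 - 4 * (d1 * d1 - d1 - 1) * d4 ** 2
            + 4 * d1 * (3 * d1 * d1 + d1 + 8) * d4 + 16 * d1 * d1 * (d1 + 2) ** 2)

# 4/d4 + 1/d1 = 1  <=>  d4 = 4*d1/(d1 - 1), so d1 - 1 must divide 4*d1
solutions = [(d1, 4 * d1 // (d1 - 1)) for d1 in range(2, 200)
             if (4 * d1) % (d1 - 1) == 0]
null_values = {pair: null_poly(*pair) for pair in solutions}

def type_iii_candidates(a_over_d):
    """1-based pairs (i, j), i != j, with a_j/d_j == 1 + 2*a_i/d_i (Eq. (10.1))."""
    r = len(a_over_d)
    return [(i + 1, j + 1) for i in range(r) for j in range(r)
            if i != j and a_over_d[j] == 1 + 2 * a_over_d[i]]

# Rows of the table in Lemma 10.8, keyed by d
table = {
    (3, 3, 1, 2): (F(-1, 3), F(-1, 3), F(1, 3), F(1, 3)),
    (3, 3, 2, 1): (F(-1, 3), F(-1, 3), F(1, 3), F(1, 3)),
    (4, 2, 1, 3): (F(-1, 4), F(-1, 2), F(1, 4), F(1, 4)),
    (4, 2, 2, 2): (F(-1, 4), F(-1, 2), F(1, 4), F(1, 4)),
    (4, 2, 3, 1): (F(-1, 4), F(-1, 2), F(1, 4), F(1, 4)),
    (3, 3, 2, 121): (F(3, 11), F(-5, 11), F(3, 11), F(-1, 121)),
    (4, 2, 2, 25): (F(1, 5), F(-3, 5), F(1, 5), F(-1, 25)),
}
candidates = {d: type_iii_candidates(v) for d, v in table.items()}
print(solutions, null_values)
print(candidates)
```

In this scan the last five rows admit no pair $(i, j)$ at all, which is consistent with the text's statement that no type III vectors occur there; for the first two rows the arithmetic alone does not exclude type III vectors, and the proof instead argues via the type II vector $(-1, -1, 1, 0)$ and the dimension vector $d$.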


Lemma 10.9. The only possible example for case (3) is when $c = (\frac23, -\frac23, -\frac23, \frac23, -1)$, $u = (0, 0, 0, 0, -1)$, $d = (2, 2, 2, 2, 9)$. Then $a = (-\frac23, \frac23, \frac23, -\frac23, -1)$ and $\bar a$ is again in case (3).

Proof. (A) Let $u = (-1^r)$. The null condition for $\bar c$ gives $4d_r = (\delta + d_1 + 2)^2$, where $\delta = d_4 + \cdots + d_{r-1}$. In particular, $d_r$ is a square. Also,
$$(a_i/d_i) = \left(\frac{-1}{\sqrt{d_r}},\ \frac12\left(1 - \frac1{\sqrt{d_r}}\right),\ \frac12\left(1 - \frac1{\sqrt{d_r}}\right),\ \frac{-1}{\sqrt{d_r}},\ \ldots,\ \frac{-1}{\sqrt{d_r}},\ \frac{-1}{d_r}\right).$$
If $d_r = 4$, then we find there are no type III vectors in $\bar a^\perp$, a contradiction. So $d_r \ge 9$ and we have $|\frac{a_i}{d_i}| \le \frac13$ for $i = 1, 4, \ldots, r-1$, $\le \frac19$ for $i = r$ and $< \frac12$ for $i = 2, 3$. Lemma 10.5 shows $i \ne 2, 3$. From Eq. (10.1) and the above estimates, we first get $i \ne r$, and for the remaining values of $i$, we have $\frac{a_j}{d_j} > 0$, so that $j = 2$ or $3$. Also, $d_r = 9$ (and hence $d_1 + d_4 + \cdots + d_{r-1} = 4$), and $(a_i/d_i) = (-\frac13, \frac13, \frac13, -\frac13, \ldots, -\frac13, -\frac19)$. Upon applying Theorem 5.18 to $\bar a$ together with Theorem 8.9 and the above lemmas, we deduce that we are in case (3)(ii) with $r = 5$ and $d_1 = d_4 = 2$, giving the example in the statement of the lemma.

(B) Next let $u = (1, 0, \ldots, -2)$. Now
$$(a_i/d_i) = \left(\frac2{d_1} - \alpha,\ \frac{1-\alpha}{2},\ \frac{1-\alpha}{2},\ -\alpha,\ \ldots,\ -\alpha,\ \frac{(n-2-d_r)\alpha - 5}{d_r}\right),$$
where, as a consequence of the null condition for $\bar c$, we have
$$\alpha = \frac{c_1}{d_1} = \frac{1}{n-2-m}, \qquad m^2 = d_r(n-1). \qquad (10.2)$$
Next we get the identity $(n-2)^2 - m^2 = (n-1)(d_1 + 1 + \delta) + 1 = \frac{m^2(d_1 + 1 + \delta)}{d_r} + 1$, where $\delta$ is as given in (A) above. We deduce $m^2 < \frac{d_r}{n-3}(n-2)^2$, and hence $\alpha < \frac{2}{d_1 + 1 + \delta} = \frac{2}{n - d_r - 3} \le \min(\frac12, \frac2{d_1})$. So $\frac{a_i}{d_i}$ is positive for $i \le 3$ and negative for $3 < i \le r-1$. Note also that $(n-2-d_r)\alpha < \frac{2(n-2-d_r)}{n-3-d_r} = 2\left(1 + \frac{1}{n-3-d_r}\right) \le \frac52$. In particular, $\frac{a_r}{d_r} < -\frac{5}{2d_r} < 0$.

As usual, we look for $(-2^i, 1^j)$ giving an element of $\bar a^\perp$. By Lemma 10.5, $i \ne 1, 2, 3$.

If $4 \le i \le r-1$ then Eq. (10.1) says $\frac{a_j}{d_j} = 1 - 2\alpha > 0$, so $j = 1, 2$ or $3$. If $j = 1$ we obtain $\alpha = 1 - \frac2{d_1}$. Comparing this with Eq. (10.2) shows $d_1 = 3$ and $\alpha = \frac13$, but now $\bar a$ is not null. If $j = 2$ or $3$, we obtain $\alpha = \frac13$; we deduce from Eq. (10.2) that $d_1 \le 5$, and again one can check that all possibilities violate nullity.

So all type III $(-2^i, 1^j)$ have $i = r$. Therefore we must be in case (1) or (5) with respect to $\bar a$, and $d_r = 4$ or $2$ respectively.

If $j = 1, 2, 3$ then $\frac{a_j}{d_j} > 0$. Now Eq. (10.1) combined with the estimate above for $\frac{a_r}{d_r}$ show that $-\frac52 > a_r > -\frac{d_r}{2}$, so $d_r > 5$, a contradiction.

If $4 \le j \le r-1$, then in the case $d_r = 4$, we find Eq. (10.1) gives $\alpha = \frac{3}{n-4}$. Combining with Eq. (10.2) we get $n = 10$, which is incompatible with $d_r = 4$ and $r \ge 5$. If $d_r = 2$ we find similarly that $\alpha = \frac{4}{n-3}$ and $m$ satisfies $3m^2 - 8m - 4 = 0$; but this has no integral roots. So $u = (1, 0, \ldots, -2)$ cannot occur.

(C) For $u = (1^4, -2^r)$, we have
$$(a_i/d_i) = \left(-\alpha,\ \frac{1-\alpha}{2},\ \frac{1-\alpha}{2},\ \frac2{d_4} - \alpha,\ -\alpha,\ \ldots,\ -\alpha,\ \frac{(n-2-d_r)\alpha - 5}{d_r}\right)$$
and Eq. (10.2) still holds. The arguments of case (B) carry over to this case, on swapping indices 1, 4.

Lemma 10.10. Case (2) cannot occur.


Proof. (A) Consider $u = (-1^2)$. Now
$$(a_i/d_i) = \left(\frac2{d_2} - 1,\ \frac{-1}{d_2},\ \frac1{d_2},\ \ldots,\ \frac1{d_2},\ \frac1{d_r}\left(2 - \frac{n+1-d_r}{d_2}\right)\right).$$
Nullity implies $d_2 \ge 3$, so $-1 < \frac{a_1}{d_1} \le -\frac13$ and $|\frac{a_i}{d_i}| \le \frac13$ for $2 \le i \le r-1$. Also we have $d_r(n-1) = m^2$, and for this choice of $u$ we have $m = n + 1 - 2d_2$, so $\frac{a_r}{d_r} = \frac{d_r - m}{d_2 d_r}$, which is positive if $m < 0$ and negative if $m > 0$.

By Lemma 10.5 and the fact that $d_1 = 1$, we only have to consider $(-2^i, 1^j)$ with $i = 2$ or $r$.

If $i = 2$, then Eq. (10.1) says $\frac{a_j}{d_j} = 1 - \frac2{d_2} > 0$ so $j \ge 3$. If $3 \le j \le r-1$ then Eq. (10.1) shows $d_2 = 3$, so $m = n - 5$ and $d_r = \frac{(n-5)^2}{n-1} = n - 9 + \frac{16}{n-1}$. As $n \ge 7$ we must then have $(d_r, n) = (9, 17)$ or $(2, 9)$. Imposing the nullity condition on $\bar a$ shows there are only three possibilities, corresponding to $d = (1, 3, 2, 2, 9), (1, 3, 4, 9), (1, 3, 3, 2)$. In the first two there is only one type III in $\bar a^\perp$, as $d_1 = 1$ and $d_2 \ne 4$, so we are in case (5) with respect to $\bar a$, contradicting the fact that $d_2 \ne 2$. In the last case we must be in case (4) with respect to $\bar a$, but now $(0, -1, 1, -1)$ is present, so $u$ is not a vertex. If $j = r$ then Eq. (10.1) becomes $d_2 = \frac{3d_r - n - 1}{d_r - 2}$, which is less than 3, a contradiction. (We cannot have $d_r = 2$ and $3d_r = n + 1$ as $n \ge 7$.)

Hence all such $(-2^i, 1^j)$ have $i = r$ and we are in case (1) or (5) with respect to $\bar a$. For case (1) we need $r - 2$ of the $\frac{a_j}{d_j}$ ($j < r$) equal. This can only happen for our $a$ if $d_2 = 3$, which is ruled out as in the previous paragraph. For case (5) we have $d_r = 2$ and $\frac{a_j}{d_j} = 3 - \frac{n-1}{d_2}$. The possibilities on the left-hand side are $\frac2{d_2} - 1$, $-\frac1{d_2}$, $\frac1{d_2}$ respectively. On using our relations for $m, n, d_r$ we find that only the third possibility can occur, and $d_2 = 3$. The argument in the previous paragraph again eliminates this case.

(B) Consider $u = (0, -1, -1, \ldots, 1)$. Now
$$(a_i/d_i) = \left(2\beta - 1,\ \beta - \frac2{d_2},\ \beta - \frac2{d_3},\ \beta,\ \ldots,\ \beta,\ \frac{4 - (n+1-d_r)\beta}{d_r}\right),$$
where again from the null condition of $\bar c$ we have
$$\beta := \frac12(1 - c_1) = \frac{2}{n+1-m} = \frac{3 + d_r\left(\frac1{d_2} + \frac1{d_3}\right)}{n + d_r + 1}, \qquad d_r(n-1) = m^2 \ (m > 0). \qquad (10.3)$$
Now $0 < \beta < \frac12$ (the case $\beta = \frac12$ leads to $r = 4$, $d_2 = d_3 = 2$ and $a = (0, -1, -1, 1)$ which violates nullity). So for $\bar w \in \bar a^\perp$ we just have to consider $w = (-2^i, 1^j)$ with $i = 2, r$ (as $d_1 = 1$ we can't have $i = 1$; also by symmetry the case $i = 3$ is treated the same way as $i = 2$).

If $i = 2$ then Eq. (10.1) says $\frac{a_j}{d_j} = 2\beta + 1 - \frac4{d_2}$. If $4 \le j \le r-1$ we get $\beta = \frac4{d_2} - 1$; the only possibility consistent with our bounds on $\beta$ is $d_2 = 3$, $\beta = \frac13$ and it is straightforward to check this is incompatible with the null condition for $\bar a$. If $i = 2$ and $j = 1$ then Eq. (10.1) implies $d_2 = 2$ and again one checks that nullity for $\bar c$ fails. If $j = 3$ Eq. (10.1) says $\beta = \frac4{d_2} - \frac2{d_3} - 1$, so as $\beta > 0$ either $d_2 = 2$ and $\beta = 1 - \frac2{d_3}$ or $d_2 = 3$ and $\beta = \frac13 - \frac2{d_3}$. In the former case the bound $\beta < \frac12$ shows $d_3 = 3$ and $\beta = \frac13$, and now nullity for $\bar a$ fails. In the latter case the bound $\beta > 0$ shows $d_3 > 6$. Substituting this into the quadratic which must vanish for nullity of $\bar c$, we see $\delta = d_4 + \cdots + d_{r-1}$ is $< 4$. Checking the resulting short list of cases yields no examples where nullity holds. If $j = r$ then Eq. (10.1) and Eq. (10.3) imply $\frac1{d_r} + \frac3{d_2} = 1 + \frac1{d_3}$ and one checks that the possible $(d_2, d_r)$ yield no examples where nullity of $\bar c$ holds.

So all such type III have $i = r$, and case (1) or (5) holds for $\bar a$. For case (1), then, as in (A), $r - 2$ of the $\frac{a_j}{d_j}$ ($j \le r-1$) must be equal. So either $r = 5$ and $d_2 = d_3$ with


$\beta = 1 - \frac2{d_2}$ or $r = 4$ and one of the preceding equalities holds. If $\beta = 1 - \frac2{d_2}$ holds, then the bounds on $\beta$ show $d_2 = 3$, $\beta = \frac13$ and as usual nullity for $\bar a$ fails. If $d_2 = d_3$ holds, then using our formulae for $\beta$ and substituting into the null condition for $\bar a$ gives a quadratic with no integer roots.

If case (5) holds, then $d_r = 2$. Now Eq. (10.1) gives $\frac{a_j}{d_j} = 5 - (n-1)\beta$. If $j = 1$, $j = 2$, or $4 \le j \le r-1$ we get $\beta = \frac{6}{n+1}$, $\frac{5 + (2/d_2)}{n}$, $\frac5n$ respectively. (As usual, the case $j = 3$ is treated in just the same way as $j = 2$.) Now using the equations in Eq. (10.3) relating $n, m$ in each case gives a quadratic with no real roots.
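Two of the small Diophantine steps in the proof of Lemma 10.10 are easy to replay. The sketch below enumerates the integers $n \ge 7$ for which $d_r = (n-5)^2/(n-1)$ is an integer (part (A)), and then searches, for $d_r = 2$ and hence $m^2 = 2(n-1)$, for values of $m$ at which $\beta = \frac{2}{n+1-m}$ from Eq. (10.3) equals $\frac6{n+1}$ or $\frac5n$ (part (B), case (5), $j = 1$ and $4 \le j \le r-1$). The search bounds and the name `case5_solutions` are arbitrary choices for illustration; the case $j = 2$ depends on $d_2$ and is not covered.

```python
from fractions import Fraction as F

# Part (A): d_2 = 3, m = n - 5, and d_r = (n-5)^2/(n-1) must be a positive integer.
pairs = [((n - 5) ** 2 // (n - 1), n) for n in range(7, 2000)
         if (n - 5) ** 2 % (n - 1) == 0]

def case5_solutions(limit=2000):
    """Part (B), case (5): look for m with beta = 2/(n+1-m) matching 6/(n+1) or 5/n."""
    hits = []
    for m in range(2, limit, 2):       # d_r = 2 forces m^2 = 2(n-1), so m is even
        n = m * m // 2 + 1
        beta = F(2, n + 1 - m)         # Eq. (10.3)
        if beta == F(6, n + 1):
            hits.append(("j = 1", m, n))
        if beta == F(5, n):
            hits.append(("4 <= j <= r-1", m, n))
    return hits

print(pairs)
print(case5_solutions())
```

Since $(n-5)^2 = (n-9)(n-1) + 16$, the first list is finite, and the enumeration is a check of the two pairs named in the text rather than a new result.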

Lemma 10.11. For case (1) the only possibility is when $c = (-\frac43, \frac23, \frac23, -1)$ with $u = (0, 0, 0, -1)$ and $d = (4, 2, 2, 9)$. $\bar a$ is then in case (4).

Proof. (A) Consider $u = (0, \ldots, 0, -1)$. From the null condition for $\bar c$ we see that $d_r = k^2$, $n - 1 = (k+1)^2$ for some positive integer $k$ and $(a_i/d_i) = (\frac12(1 - \frac1k), -\frac1k, \ldots, -\frac1k, \frac{-1}{k^2})$. Note that since $d_1 = 4$, $n > 5$ and so $k \ne 1$.

We must consider solutions of Eq. (10.1). By Lemma 10.5, $i \ne 1$. If $i = r$ we have $\frac{a_j}{d_j} = 1 - \frac2{k^2}$. The resulting equation has no solution in integer $k > 1$ for any choice of $j$. If $2 \le i \le r-1$, we need $\frac{a_j}{d_j} = 1 - \frac2k$. We only obtain a solution $k > 1$ if $j = 1$; in this case $k = 3$, so $n = 17$, $d_r = 9$ and we see $r = 4$ with $\{d_2, d_3\} = \{2, 2\}$ or $\{3, 1\}$. The former case is that in the statement of the lemma. In the latter case we can have just one type III and one type II in $\bar a^\perp$ (since $d_2$ or $d_3$ is 1, one potential type III is missing), so we must be in case (5) with respect to $\bar a$; but no $d_i$ is 2, a contradiction.

(B) Consider $u = (0, 1, 0, \ldots, 0, -2)$. Now
$$(a_i/d_i) = \left(\frac12(1 - \beta),\ \frac2{d_2} - \beta,\ -\beta,\ \ldots,\ -\beta,\ \frac{(n - d_r - 2)\beta - 5}{d_r}\right), \quad \text{where } \beta = \frac{d_r + 6d_2}{d_2(2n - d_r - 4)}.$$
The nullity condition for $\bar c$ implies $d_2 \ge 3$, $d_2 > \delta$ and $d_r > 2d_2 + 4$, where $\delta$ now denotes $d_3 + \cdots + d_{r-1}$. We can then deduce that $\beta < 1$, $0 < \frac{a_1}{d_1} < \frac12$, $0 < \frac{a_2}{d_2} < \frac13$. In particular $\beta < \frac2{d_2}$. By Lemma 10.5, we must consider elements of $\bar a^\perp$ coming from vectors $(-2^i, 1^j)$ with $i \ge 3$.

If $3 \le i \le r-1$, Eq. (10.1) says $\frac{a_j}{d_j} = 1 - 2\beta$. As $\beta < 1$, this immediately rules out $3 \le j \le r-1$. If $j = 1$ we get $\beta = \frac13$. Combining this with our formula above for $\beta$ we get $2d_2(d_2 + \delta - 7) + (d_2 - 3)d_r = 0$. The only possibilities are $d_2 = 3$, $\delta = 4$ which violates the null condition, or $d = (4, 4, 2, 8)$ which violates the condition that $d_r(n-1)$ should be a square. If $j = 2$, we get $\beta = 1 - \frac2{d_2}$. Since we saw above that $\beta < \frac2{d_2}$ we get $d_2 = 3$ and $\beta = \frac13$, which is ruled out as above. If $j = r$, Eq. (10.1) implies $\beta = \frac{d_r + 5}{2 + d_2 + \delta + 2d_r}$. Comparing this with the formula for $\beta$ above leads to a contradiction.

The remaining possibility is for $i = r$. So we are in case (1) or (5) with respect to $\bar a$, and $d_r = 4$ or $2$ respectively. But $d_r > 2d_2 + 4$, so this is impossible.

(C) Let $u = (-1, -1, 0, \ldots, 0, 1)$. Now
$$(a_i/d_i) = \left(-\frac12\beta,\ -\frac2{d_2} - \beta,\ -\beta,\ \ldots,\ -\beta,\ \frac{1 + (n - d_r - 2)\beta}{d_r}\right), \quad \text{where } \beta = \frac{d_r(d_2 - 4)}{2d_2(2n + d_r - 4)},$$
so $0 < \beta < \frac16$ (noting that the nullity condition for $\bar c$ implies $d_2 \ge 5$).

We look for vectors $(-2^i, 1^j)$ giving elements of $\bar a^\perp$. Now Lemma 10.5 rules out $i = r$, while if $3 \le i \le r-1$ we need $\frac{a_j}{d_j} = 1 - 2\beta > \frac23$. So $j = r$, and Eq. (10.1) yields $\beta = \frac{d_r - 1}{n + d_r - 2}$. Equating this to the expression above for $\beta$ gives an equation which may be rearranged so that it says a sum of positive terms is zero, which is absurd.

If $i = 1$ then Eq. (10.1) says $\frac{a_j}{d_j} = 1 - \beta$. Clearly this can only possibly hold if $j = r$. The equation then gives $\beta = \frac{d_r - 1}{n - 2}$, and equating this with the earlier expression for $\beta$ leads, as in the previous paragraph, to a contradiction.


So the only possibility is $i = 2$, and we are therefore in case (1) or (5) with respect to $\bar a$. But $d_2 \ge 5$ so this is impossible.

(D) Let $u = (-1, 1, -1, 0, \ldots)$. Now
$$(a_i/d_i) = \left(-\frac12\beta,\ \frac2{d_2} - \beta,\ -\frac2{d_3} - \beta,\ -\beta,\ \ldots,\ -\beta,\ \frac{(n - d_r - 2)\beta - 1}{d_r}\right),$$
and $\beta = \frac12 - 2(\frac1{d_2} + \frac1{d_3})$. An analysis of the nullity condition for $\bar c$ shows that it can only be satisfied if $\frac15 < \frac1{d_2} + \frac1{d_3} < \frac14$, so $d_2, d_3 \ge 5$ and $0 < \beta < \frac1{10}$.

Let us now consider solutions to Eq. (10.1). If $i = r$, we have $\frac{a_j}{d_j} = 1 - \frac2{d_r} + \frac{2(n - d_r - 2)\beta}{d_r}$. If $j \ne 2$, this equation implies that the positive quantity $\left(1 + \frac{2(n - d_r - 2)}{d_r}\right)\beta$ (or $\left(\frac12 + \frac{2(n - d_r - 2)}{d_r}\right)\beta$ if $j = 1$) equals a nonpositive quantity (recall $d_r > 1$ as $i = r$). If $j = 2$, we get that it equals $\frac2{d_2} + \frac2{d_r} - 1$. But $d_2 \ge 5$ so $d_r = 2$ or $3$, and in each case we find the nullity condition for $\bar c$ is violated.

If $i = 1$, Eq. (10.1) says $\frac{a_j}{d_j} = 1 - \beta > \frac9{10}$, so $j = 2$ or $r$. But for $j = 2$ we get $d_2 = 2$, which is impossible as we know $d_2 \ge 5$, so in fact $j = r$.

If $i = 2$, Eq. (10.1) is $\frac{a_j}{d_j} = 1 + \frac4{d_2} - 2\beta$. We cannot then have $j = 1, 3$ or $4 \le j \le r-1$ as they lead to $\beta > \frac23$, $> 1$, $> 1$ respectively. So we must have $j = r$.

If $i = 3$, we see $\frac{a_j}{d_j} = 1 - \frac4{d_3} - 2\beta$. If $j = 1, 2$ or $4 \le j \le r-1$, we see in all cases (using our bounds on $d_2, d_3$) that $\beta > \frac15$, a contradiction. Hence again $j = r$.

If $4 \le i \le r-1$, then $\frac{a_j}{d_j} = 1 - 2\beta > \frac45$ so $j = 2$ or $r$. If $j = 2$ we obtain $\beta = 1 - \frac2{d_2} \ge \frac35$, contradicting our earlier inequality for $\beta$; so again we have $j = r$.

We have shown that any $(-2^i, 1^j)$ giving an element of $\bar a^\perp$ has $j = r$, so we are in case (3), (4) or (5) with respect to $\bar a$. It cannot be case (3) as we know from Lemma 10.9 that then each $d_i$ is 2 or 9, and we have $d_1 = 4$. If we are in case (4), then Lemma 10.7 tells us that $d = (4, 2, 2, 9)$. Moreover, as $(-2, 0, 0, 1)$, $(0, -2, 0, 1)$ are the elements of $\bar a^\perp \cap \frac12(d+W)$, we must have $\beta = \frac4{d_2}$; but now $\beta > \frac1{10}$, a contradiction. If it is case (5), then we have $d_i = 2$ for some $i$, which we can take to be 4. Now $\bar a$ must be orthogonal to vectors associated to $(-1^4, 1^5, -1^k)$ or $(-1^4, -1^5, 1^k)$, and either case is incompatible with our expressions for $a_i/d_i$.

(E) Consider $u = (-1, 1, 0, \ldots, 0, -1)$. Now
$$(a_i/d_i) = \left(-\frac12\beta,\ \frac2{d_2} - \beta,\ -\beta,\ \ldots,\ -\beta,\ \frac{(n - d_r - 2)\beta - 3}{d_r}\right) \quad \text{and} \quad \beta = \frac{8d_2 - d_r(d_2 - 4)}{2d_2(2n - d_r - 4)}.$$
It is easy to check that $\beta < \frac2{d_2}$. Also, the nullity condition for $\bar c$ implies $d_2 \ge 3$ and $(\frac4{d_2} - 1)d_r + 8 > 0$; hence $\beta > 0$. The analysis is similar to that in (D).

If $i = r$ then Eq. (10.1) implies that a positive quantity times $\beta$ equals a positive linear combination of reciprocals of $d_i$, minus 1. This sum of reciprocals is therefore $> 1$, which gives us upper bounds on $d_r$. The only case where Eq. (10.1) and the null condition can hold is if $j = 2$ and $d_2 = 7$, $d_r = 4$, $d_3 + \cdots + d_{r-1} = 11$.

If $i = 1$ then Eq. (10.1) says $\frac{a_j}{d_j} = 1 - \beta > 0$, so $j = 2$ or $r$. But $j = 2$ implies $d_2 = 2$, which from above cannot hold, so $j = r$.

Now Lemma 10.5 rules out $i = 2$. If $3 \le i \le r-1$, we have $\frac{a_j}{d_j} = 1 - 2\beta$. If $j = 1$ then we get $\beta = \frac23$, which cannot hold. If $j = 2$ then $\beta = 1 - \frac2{d_2}$, and as $\beta < \frac2{d_2}$ we deduce $d_2 = 3$ and $\beta = \frac13$, which violates the null condition for $\bar c$. If $3 \le j \le r-1$, then $\beta = 1$, which is impossible. So we have $j = r$.

So in all cases we have $j = r$, except in the exceptional case discussed above where we can have $i = r$ and $j = 2$. But our list (1)-(6) of possible configurations in $\bar a^\perp$ shows that if the $(i, j) = (r, 2)$ case occurs then no other type III can be in $\bar a^\perp$. So we are in


case (5), which is impossible as $d_r = 4 \ne 2$ for this example. Hence the exceptional case cannot arise.

We see therefore that $j = r$ in all cases. So, as in (D), we must be in case (3), (4) or (5) with respect to $\bar a$. As before, the fact that $d_1 = 4$ rules out case (3). For case (4) we need the $d_k$ to be 4, 2, 2, 9 and $d_r$ to be 4 (as the $(-2^i, 1^j)$ in $\bar a^\perp$ have $j = r$), but this contradicts $d_1 = 4$. So we are in case (5). Now the orthogonality condition for the family of type II vectors leads to $\beta > \frac23$, which is impossible.

Lemma 10.12. Case (5)(ii) cannot occur if $r \ge 5$.

Proof. (A) Consider $u = (0, 0, 1, -2, 0, \ldots)$. We have
$$(a_i/d_i) = \left(1 - \beta,\ 1 - 2\beta,\ \frac{n - 2d_2 - 3}{n - d_2 - 2} - \frac{n - 6 - 3d_2}{n - d_2 - 2}\,\beta,\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2} - \frac4{d_4},\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2},\ \ldots\right),$$
where all terms from the fifth onwards are equal and where
$$\beta := 1 + \frac{c_1}{2} = \frac{8(n - d_2 - 2) + d_4(n + d_2)}{2d_4(n + d_2 + 2)}.$$
The nullity condition for $\bar c$ implies $d_4 > 52$, so $n > 56$ and we deduce $0 < \beta < \frac{15}{26} < \frac35$. Hence $\frac{a_1}{d_1} > 0$. It is also easy to show that $\frac{a_i}{d_i} > 0$ for $i \ge 5$ and $\frac{a_3}{d_3} > 0$. So if $(-2^i, 1^j)$ gives an element of $\bar a^\perp$ we need $i = 2$ or $4$.

As $d_4 > 52$, Lemmas 10.6 - 10.11 show that case (5) must hold with respect to $\bar a$. In particular $d_i = 2$, so we cannot have $i = 4$. Hence $i = 2$ and $d_2 = 2$. Now Eq. (10.1) implies $\beta = \frac23$, $\frac{2n-5}{3n-4}$, $\frac{3n-9}{4(n-2)} + \frac{n-4}{d_4(n-2)}$ or $\frac{3n-9}{4(n-2)}$, depending on whether $j = 1, 3, 4$ or $\ge 5$. In all cases this contradicts the bound $\beta < \frac35$ and $n > 56$.

(B) Consider $u = (-1, 1, 0, -1, 0, \ldots)$. We have
$$(a_i/d_i) = \left(-\beta,\ \frac2{d_2} + 1 - 2\beta,\ -\frac{d_2 + 1}{n - d_2 - 2} - \frac{n - 6 - 3d_2}{n - d_2 - 2}\,\beta,\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2} - \frac2{d_4},\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2},\ \ldots\right),$$
where all terms from the fifth onwards are equal and
$$\beta := 1 + \frac{c_1}{2} = \frac{n + d_2 - 2}{2(n + d_2 + 2)} + \frac{n - 2}{d_2(n + d_2 + 2)} + \frac{n - d_2 - 2}{d_4(n + d_2 + 2)}.$$
The nullity condition for $\bar c$ implies $d_2 \ge 9$ and $d_4 \ge 4$. It is now easy to check that $\frac37 < \beta < \frac{31}{36}$, and that $\frac{a_i}{d_i} > 0$ for $i \ge 5$.

As in (A), case (5) must hold with respect to $\bar a$. Now if $(-2^i, 1^j)$ gives an element of $\bar a^\perp$ we need $d_i = 2$. This, combined with Lemma 10.5, means $i = 1$ or $3$.

If $i = 1$, Eq. (10.1) immediately shows $j$ cannot be 2. Moreover, if $j = 3$ or $\ge 5$, Eq. (10.1) yields a value for $\beta$ that violates the nullity condition for $\bar c$. If $j = 4$ we obtain


$\beta = \frac{n-1}{2n} + \frac{n - d_2 - 2}{d_4 n}$. As we are in case (5) with respect to $\bar a$, we need to consider the elements in $\bar a^\perp \cap \frac12(d+W)$ corresponding to type II vectors. Their number and pattern, as stipulated by Theorem 5.18, together with orthogonality to $\bar a$, imply further linear relations among the components of $(a_i/d_i)$ and small upper bounds for $r$ (usually of the form $r = 5, 6$). In all cases these additional constraints can be shown to be incompatible with the above values of $\beta$.

As an illustration of the above method, note that our type III vector is $(-2, 0, 0, 1, 0, \ldots)$. If $\bar a$ is in case (5)(iii), Theorem 5.18 says that the possible type IIs must have a $-1$ in place 1 and a 0 in place 4. Since $r \ge 5$, the remaining $-1$ must be in a place whose corresponding dimension is 2. As $d_3 = 2$ we can have $(-1, *, -1, 0, *, \ldots)$ where $*$ indicates a possible location of the 1 in the type II. The other possibility is for $-1$ to be in place $k$ for some $k \ge 5$. After a permutation we can assume $k = 5$, and $d_5 = 2$ must hold. The type II is then of the form $(-1, *, *, 0, -1, *, \ldots)$ where $*$ again indicates possible positions for the 1. In the first case, the orthogonality conditions imply $\frac{a_2}{d_2} = \frac{a_5}{d_5}$, which gives $\beta = \frac{n-1}{2n} + \frac{n - d_2 - 2}{d_2 n}$. Comparing with the value of $\beta$ from Eq. (10.1), we get $d_2 = d_4$. Using this in the first value of $\beta$ in (B) gives a contradiction after some manipulation. In the second case, the argument we just gave implies that we can only have $r = 5$ and the orthogonality condition implies $\frac{a_2}{d_2} = \frac{a_3}{d_3}$, which gives $\beta = \frac{n-1}{n + d_2 + 2} + \frac{2(n - d_2 - 2)}{d_2(n + d_2 + 2)}$. After a short computation, one sees that the two values of $\beta$ are again incompatible.

If $\bar a$ is in case (5)(ii), the argument is essentially the same, as we only have to switch the places of the second $-1$ and the 1 in the type IIs.

Let us now take $i = 3$. If $j = 1$, Eq. (10.1) implies $\beta = \frac{3d_2 + 4 - n}{5d_2 + 10 - n}$. If the denominator is negative, then $\beta > 1$, which is a contradiction. If it is positive we find that this is incompatible with the inequality $\beta > \frac{n + d_2 - 2}{2(n + d_2 + 2)}$ which comes from the displayed expression for $\beta$ above.

If $j = 2$, we get $\beta = \frac{d_2 + 1}{2(d_2 + 2)} + \frac{n - d_2 - 2}{2d_2(d_2 + 2)}$. As above, we can rule this out by considering the vectors in $\bar a^\perp \cap \frac12(d+W)$ associated to type II vectors. A similar argument works for $j \ge 5$, where we find $\beta = \frac{n - 2d_2 - 3}{2(n - 2d_2 - 4)}$, and for $j = 4$, where we have $\beta = \frac{n - d_2 - 2}{d_4(n - 2d_2 - 4)} + \frac{n - 2d_2 - 3}{2(n - 2d_2 - 4)}$.

(C) Next let $u = (0, 0, 1, -1, -1, 0, \ldots)$. We have
$$(a_i/d_i) = \left(1 - \beta,\ 1 - 2\beta,\ \frac{n - 2d_2 - 3}{n - d_2 - 2} - \frac{n - 6 - 3d_2}{n - d_2 - 2}\,\beta,\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2} - \frac2{d_4},\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2} - \frac2{d_5},\ \frac{2(d_2+2)\beta}{n - d_2 - 2} - \frac{d_2 + 1}{n - d_2 - 2},\ \ldots\right),$$
where all terms from the sixth on are equal and
$$\beta := 1 + \frac{c_1}{2} = \frac{n + d_2}{2(n + d_2 + 2)} + \left(\frac1{d_4} + \frac1{d_5}\right)\frac{n - d_2 - 2}{n + d_2 + 2}.$$
The nullity condition for $\bar c$ implies $d_4$ and $d_5 \ge 13$. It is now easy to check that $0 < \beta < \frac23$, so $\frac{a_1}{d_1} > 0$. As in (A), we find also that $\frac{a_i}{d_i} > 0$ for $i \ge 6$, and that $\frac{a_3}{d_3} > 0$.

As in (A) again, case (5) must hold with respect to $\bar a$, so if $(-2^i, 1^j)$ gives an element of $\bar a^\perp$ we need $d_i = 2$. This, combined with Lemma 10.5, means $i = 2$. In this situation in all cases Eq. (10.1) gives a value of $\beta$ incompatible with our bounds on $n$ and $\beta$.

Lemma 10.13. Case (5)(iii) cannot arise if $r \ge 5$.

Proof. This is similar to the proof of the previous Lemma so we will be brief.


(A) Consider $u = (0, -2, 0, 1, 0, \ldots)$. Now $(a_i/d_i)$ is given by
$$\left(1 - \beta,\ -\frac4{d_2} + 1 - 2\beta,\ \frac{(n - 3) + (n + d_2 - 2)(\beta - 1)}{n - d_2 - 2},\ \frac2{d_4} + \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \ldots,\ \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2}\right),$$
where
$$\beta := 1 + \frac{c_1}{2} = \frac12 - \frac{(d_2 + 4d_4)(n - 2 - d_2 - d_4) + 4d_4^2}{2d_2d_4(2n - d_2 - 4)}.$$
The nullity condition for $\bar c$ implies $d_4 \ge 8$, $d_2 \ge 27$ and $d_2 > 2d_4$, and it readily follows that $\frac14 < \beta < \frac12$. In particular, $\frac{a_1}{d_1} > 0$.

As before, we see that case (5) holds with respect to $\bar a$, so for $\bar a^\perp$ we have to consider type III vectors $(-2^i, 1^j)$, where $d_i = 2$. So we need only consider $i = 3$ or $i \ge 5$. In either situation, we proceed as in part (B) of the proof of Lemma 10.12, and obtain inconsistencies in the equations involving $\beta$ or contradiction to the bounds on $\beta$ or the dimensions.

(B) Let $u = (0, 0, -1, 1, -1, 0, \ldots)$. The nullity condition on $\bar c$, which has a symmetry in $d_4$ and $d_5$, now implies $d_4, d_5 \ge 46$ and $> 28d_2$. Now $(a_i/d_i)$ is given by
$$\left(1 - \beta,\ 1 - 2\beta,\ \frac{n + d_2 - 2}{n - d_2 - 2}\,\beta - \frac{n - 1}{n - d_2 - 2},\ \frac2{d_4} + \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \frac{-2}{d_5} + \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \ldots\right),$$
where all terms from the sixth on are equal and
$$\beta := 1 + \frac{c_1}{2} = \frac12 + \frac{1}{n + d_2 - 2} + \left(\frac1{d_4} + \frac1{d_5}\right)\left(\frac{n - d_2 - 2}{n + d_2 - 2}\right).$$
We deduce that $\frac12 < \beta < \frac35$ and so $\frac12 > \frac{a_1}{d_1} > 0$. $\bar a$ must be in case (5), and for $(-2^i, 1^j)$ associated to elements of $\bar a^\perp \cap \frac12(d+W)$, we have $d_i = 2$ and we need only consider $i = 2, 3$ or $\ge 6$.

If $i = 2$, Eq. (10.1) and the upper bound on $\beta$ imply $\frac{a_j}{d_j} > \frac35$. As $d_2 = 2$, one checks that this never holds.

If $i \ge 6$, Eq. (10.1) implies $\frac{a_j}{d_j} > \frac{47}{48}$. The bound $d_4, d_5 > 28d_2$ can be used to show that this never happens.

If $i = 3$, Eq. (10.1) and the expression for $\beta$ above are seen to be incompatible if we use the bounds on $d_4, d_5$ and $\beta$.

(C) Consider $u = (-1, -1, 0, 1, 0, \ldots)$. The nullity condition on $\bar c$ gives $d_4 \ge 4$, $d_2 \ge 5$. The vector $(a_i/d_i)$ is
$$\left(-\beta,\ 1 - 2\beta - \frac2{d_2},\ \frac{(n + d_2 - 2)\beta - (d_2 + 1)}{n - d_2 - 2},\ \frac2{d_4} + \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \frac{2d_2\beta - (d_2 + 1)}{n - d_2 - 2},\ \ldots\right),$$


where all terms from the fifth on are equal and
$$\beta := 1 + \frac{c_1}{2} = \frac12 - \frac{d_4^2 + (d_2 + d_4)(n - d_2 - d_4 - 2)}{d_2d_4(3n - d_2 - 6)}.$$
Now as $\frac1{d_2} + \frac1{d_4} < \frac12$, we see that $\frac13 < \beta < \frac12$. Again $\bar a$ is in case (5) and we consider vectors $(-2^i, 1^j)$ associated to elements of $\bar a^\perp \cap \frac12(d+W)$, where we must have $d_i = 2$, so $i \ne 2, 4$.

If $i = 1$, Eq. (10.1) becomes $\frac{a_j}{d_j} = 1 - 2\beta > 0$. This immediately means $j \ne 2, 5, \ldots, r$. If $j = 3$, the value of $\beta$ from Eq. (10.1) and the above expression for $\beta$ lead to $d_4 \le 10/3$. For $j = 4$, we obtain a contradiction by the method of part (B) in the proof of Lemma 10.12.

If $i \ge 5$, Eq. (10.1) says $\frac{a_j}{d_j} = \frac{4d_2\beta + (n - 3d_2 - 4)}{n - d_2 - 2}$. If $i = 3$, Eq. (10.1) says $\frac{a_j}{d_j} = 1 - \frac{2(d_2 + 1)}{n - d_2 - 2} + \frac{2(n + d_2 - 2)\beta}{n - d_2 - 2}$. In both situations, we can again apply the method of part (B) in the proof of Lemma 10.12 to obtain contradictions.
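The right-hand sides of Eq. (10.1) quoted in part (C) follow from the displayed vector $(a_i/d_i)$ by direct substitution. As a sanity check on that algebra, the sketch below evaluates both sides exactly on a few sample values. The sample values of $n$, $d_2$, $d_4$ and $\beta$ are arbitrary test inputs, not solutions of the nullity condition, and the function names are hypothetical.

```python
from fractions import Fraction as F

def a_over_d(n, d2, d4, beta):
    """First five entries of (a_i/d_i) for u = (-1, -1, 0, 1, 0, ...) in Lemma 10.13 (C)."""
    N = n - d2 - 2
    tail = (2 * d2 * beta - (d2 + 1)) / N   # common value for all i >= 5
    return [-beta,
            1 - 2 * beta - F(2, d2),
            ((n + d2 - 2) * beta - (d2 + 1)) / N,
            F(2, d4) + tail,
            tail]

def rhs_10_1(entry):
    """Right-hand side 1 + 2*a_i/d_i of Eq. (10.1) for a given entry a_i/d_i."""
    return 1 + 2 * entry

samples = [(n, d2, d4, beta)
           for n in (20, 35, 50) for d2 in (5, 7) for d4 in (4, 6)
           for beta in (F(2, 5), F(5, 12))]

identities_hold = []
for n, d2, d4, beta in samples:
    N = n - d2 - 2
    v = a_over_d(n, d2, d4, beta)
    identities_hold.append(
        rhs_10_1(v[0]) == 1 - 2 * beta                                   # i = 1
        and rhs_10_1(v[4]) == (4 * d2 * beta + (n - 3 * d2 - 4)) / N     # i >= 5
        and rhs_10_1(v[2]) == 1 - F(2 * (d2 + 1), N) + 2 * (n + d2 - 2) * beta / N  # i = 3
    )
print(all(identities_hold))
```

The check only confirms that the three quoted expressions agree with the displayed entries; it says nothing about which $(i, j)$ actually occur.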

The last case to consider is case (5)(ii) with $r = 4$, which is the same as case (5)(iii) with $r = 4$ if we interchange the third and fourth summands.

Lemma 10.14. No configurations for case (5)(ii) with $r = 4$ can occur.

Proof. When $r = 4$ we no longer have $d_3 = 2$, but the nullity condition for $\bar c$ implies that $\frac1{d_3} + \frac1{d_4} \ge \frac12$. Hence either $\{d_3, d_4\}$ is one of $\{3, 3\}, \{3, 4\}, \{3, 5\}, \{3, 6\}, \{4, 4\}$ or one of $d_3$ or $d_4$ is 2. Using this together with the nullity conditions for $\bar a$, $\bar c$ and the orthogonality conditions, we see that we only need to consider $u = (0, -2, 1, 0)$, $(0, 0, 1, -2)$, $(-1, 1, 0, -1)$ and $(-1, -1, 1, 0)$.

(A) Let $u = (0, -2, 1, 0)$. From the nullity condition for $\bar c$ we deduce that $d_4 = 2$, $d_2 \ge 13$, and $d_3 \ge 3$. Now
$$(a_i/d_i) = \left(1 - \beta,\ 1 - 2\beta - \frac4{d_2},\ \frac2{d_3} + \frac{2d_2\beta - (d_2 + 1)}{d_3 + 2},\ \frac{(2d_2 + d_3 + 2)\beta - (d_2 + 1)}{n - d_2 - 2}\right),$$
where $\beta := 1 + \frac{c_1}{2} = \frac12 - \frac{d_2 + 4d_3 + 2d_3^2}{d_2d_3(d_2 + 2d_3 + 4)}$. We find that $\frac{11}{26} < \beta < \frac12$ and so $\frac{a_1}{d_1} > 0$. The above facts imply that $\bar a$ is again in case (5)(ii), and for $(-2^i, 1^j)$ associated to an element of $\bar a^\perp$ we must have $i = 4$. If $j = 1, 2$, Eq. (10.1) leads to a contradiction to the above dimension restrictions. The case $j = 3$ can be eliminated using the method of part (B) in the proof of Lemma 10.12.

(B) Next let $u = (0, 0, 1, -2)$. The nullity condition for $\bar c$ implies that $d_3 = 2$ and $d_4 > 32d_2 + 14 \ge 46$. Now $(a_i/d_i)$ is given by
$$\left(1 - \beta,\ 1 - 2\beta,\ \frac{d_4 - d_2 + 1}{d_4 + 2} + \left(\frac{2d_2 + 2 - d_4}{d_4 + 2}\right)\beta,\ -\frac4{d_4} + \frac{2(d_2 + 2)\beta - (d_2 + 1)}{d_4 + 2}\right),$$
where $\beta := 1 + \frac{c_1}{2} = 1 - \frac{d_4^2 + 2d_2d_4 - 16}{2d_4(2d_2 + d_4 + 6)}$. One easily sees that $\frac12 < \beta < 1$, so that $0 < \frac{a_1}{d_1} < \frac12$. Since $d_4 > 2d_2 + 2$ we obtain $0 < \frac{a_3}{d_3} < \frac35$. Therefore, $\bar a$ is in case (5)(ii) and for $(-2^i, 1^j)$ associated to an element of $\bar a^\perp$ we must have (by Lemma 10.5) $i = 2$ and so $d_2 = 2$. Putting this value of $d_2$ into the nullity condition for $\bar c$ gives a cubic equation in $d_4$ with no integral roots, a contradiction.


(C) Consider now u = (−1, 1, 0, −1). From the nullity condition for c¯ we deduce that d2 ≥ 4, d3 = 2, d4 ≥ 3 and 4d2 > 3d4 . Also, (ai /di ) = (−β,

2 (2d2 + 2−d4 )β −(d2 + 1) 2 2(d2 + 2)β −(d2 + 1) +1−2β, , − + ), d2 d4 + 2 d4 d4 + 2 2d +2d +d 2

4 4 . It follows that 21 < β < 34 and ad22 > 0. where β := 1 + c21 = 21 + d2 d42(2d2 +d 4 +6) Now we see that a¯ is either in case (1) or (4) or (5)(ii). In the first two instances, by Lemmas 10.11, 10.7 d is a permutation of (4, 2, 2, 9). Since 3d4 < 4d2 we have d2 = 9, d4 = 4. But then the null condition for c¯ is violated. So we are in case (5)(ii). For (−2i , 1 j ) associated to an element of a¯ ⊥ , as di = 2, we have i = 1 or 3. a If i = 1, then Eq.(10.1) becomes d jj = 1 − 2β < 0, so j = 3, 4. When j = 3 the value of β given above together with Eq.(10.1) imply that d4 = 3 or 4. But then the null condition for c¯ is violated. For j = 4 we may use the argument of part (B) of the proof of Lemma 10.12. If i = 3, using β > 21 in Eq.(10.1), we see that j = 2, 4. In either case, applying our bounds for the dimensions in Eq.(10.1) lead to contradictions. (D) Let u = (−1, −1, 1, 0). The null condition for c¯ implies that d4 = 2, d3 ≥ 3, d2 ≥ 5. With β := 1 + c21 , we have

(ai /di ) = (−β, 1 − 2β −

2 2 2d2 β − (d2 + 1) (2d2 + d3 + 2)β − (d2 + 1) , ). , + d2 d3 d3 + 2 d3 + 2

One computes that
$$\beta = \frac{1}{2} - \frac{2d_2 + d_3^2 + 2d_3}{d_2(3d_3^2 + 2d_2 d_3 + 6d_3)},$$
and from the dimension bounds one gets $\frac{5}{12} \le \beta < \frac{1}{2}$. Now $\bar{a}$ cannot be in case (1) or (4), otherwise as $d_2 \ge 5$, we must have $d_2 = 9$, $d_3 = 4$, and the null condition for $\bar{c}$ is violated. So $\bar{a}$ is in case (5)(ii). For $(-2^i, 1^j)$ associated to an element of $\bar{a}^\perp$, we must then have i = 1, 4.

If i = 1, then Eq.(10.1) is $\frac{a_j}{d_j} = 1 - 2\beta > 0$, so j = 3 or 4. In either situation, we may apply the argument of part (B) of the proof of Lemma 10.12 to get a contradiction. If i = 4, Eq.(10.1) together with the dimension bounds above show first that we can only have j = 3. In that case, a more detailed look at Eq.(10.1) leads to a contradiction.

We can summarise our discussions thus far by

Theorem 10.15. Let r ≥ 4 and K be connected. Suppose that we are not in the situation of Theorem 3.14. Assume that c̄ ∈ C is a null vector such that c̄ has the property that there is a unique vertex of type (1B) and all other vertices are of type (1A). Then the only possibilities are given by Lemmas 10.7/10.11 and 10.9, up to interchanging ā and c̄ and a permutation of the irreducible summands.

We will now sharpen the above theorem using Proposition 3.7.

Corollary 10.16. Let r ≥ 4. Assume that K is connected and we are not in the situation of Theorem 3.14. Then the possibilities given by Lemmas 10.7 and 10.9 (and hence Lemma 10.11) cannot occur.

Proof. We will discuss the r = 4 case (i.e. that in Lemmas 10.7, 10.11) in detail and leave the details of the r = 5 case (from Lemma 10.9) to the reader, as the arguments are very similar.


In the r = 4 case, first observe that C has exactly two null vectors, c¯ and a¯ in the notation of Lemma 10.7, as the entries of a, c are determined by the vector d of dimensions. Hence, by Prop. 3.3, these are the only elements of C outside conv( 21 (d + W)). Since (−14 ) is a vertex of W, all type II vectors in W must be zero in place 4. As (1, −1, −1, 0) is associated to an element of c¯⊥ , it, together with (−1, 1, −1, 0), (−1, −1, 1, 0) are the only type II vectors in W. Next we analyse vectors in W and see if they are associated to elements of C, this last property being important for applying Prop. 3.7. Recall that (1, −2, 0, 0), (1, 0, −2, 0), (−2, 1, 0, 0), (−2, 0, 1, 0) must be in W. The first two give elements of c¯⊥ , the last two give elements of a¯ ⊥ . First consider v = (1, −2, 0, 0). Now v¯ is a vertex of conv( 21 (d + W)). By the superpotential equation, d + v = 2v¯ can be written as c¯(α) + c¯(β) , with c¯(α) , c¯(β) ∈ C. Since v¯ is a vertex, every such expression must involve a¯ or c, ¯ unless it is the trivial expression v¯ + v¯ and v¯ ∈ C. By computing 2v − a, 2v − c we find that these cannot lie in conv(W) (it is enough to exhibit one component < −2 or > 1). Thus v¯ ∈ C. Now an analogous argument shows that if w = (0, −2, 1, 0) is in W then w¯ also lies in C. But vw is an edge of conv(W) with no interior points in W. So Prop. 3.7 gives 1 − d42 = 4J (v, ¯ w) ¯ = 0, a contradiction to d2 = 2. Hence w ∈ / W. Similarly we see (0, −2, 0, 1) ∈ / W. Next consider z = (−1, −1, 1, 0). By Remark 1.2(e) and the above, z is a vertex of conv(W). As above we can show that z¯ ∈ C. Now v, z are the only elements of the face {x1 + 2x2 = −3} ∩ conv(W) (cf. proof of Prop. 4.3 in [DW4]). So applying Prop. 3.7 to vz we obtain 0 = 4J (v, ¯ z¯ ) = 14 , a contradiction. To handle the r = 5 case (from Lemma 10.9), first observe that null elements of C must have entries 23 in two of the places 1, . . . , 4 and − 23 in the other two places. 
For a, c as in Lemma 10.9, we can take (1, −2, 0, 0, 0), (1, 0, −2, 0, 0), (0, 1, 0, −2, 0), (−2, 1, 0, 0, 0) to lie in W. The argument above to show that v¯ is in C still works for such type III vectors v. As above, we can use Prop. 3.7 to show the other type III vectors (−2i , 1 j ) (i ≤ 4) do not lie in W; hence the a, ¯ c¯ of Lemma 10.9 are the only null elements of C. Let z = (−1, 1, −1, 0, 0) (it lies in W since (1, −1, −1, 0, 0) is associated to an element of c¯⊥ ) and v = (1, 0, −2, 0, 0). As above we find z¯ is in C, and the arguments of Prop. 4.3 in [DW4] show vz is an edge of conv(W). A contradiction results as above from applying Prop. 3.7 to vz. The discussion at the beginning of this section now tells us that if K is connected and r ≥ 4 the only case when we have a superpotential of the kind under discussion is that of Theorem 3.14. The proof of Theorem 2.1 is now complete. Concluding remarks 1. When r = 2, then c is collinear with the elements of W. In other words, the projected polytope c¯ reduces to a single vertex, which must be of type (2). The possible elements of W are (−2, 1), (−1, 0), (0, −1), (1, −2). If W has just two elements then Theorem 9.2 tells us we are either in the situation of Theorem 3.14 (the Bérard Bergery examples), or in Example 8.2 or the third case of Example 8.3 in [DW4]. In fact one can show that this last possibility can be realised in the class of homogeneous hypersurfaces exactly when (d1 , d2 ) = (8, 18). An example for these dimensions is provided by G = SU (2)9 Sym(9) (where Sym(9) acts on SU (2)9 by permutation) and K is the product of the diagonal U (1) in SU (2)9 with Sym(9). The arguments of [DW2] show that this in fact gives an example where the cohomogeneity one Ricci-flat equations are fully integrable.


If W has three elements, we may adapt the proof of Theorem 3.11 to derive a contradiction. Here the essential point is that whenever we had to check that a sum of two elements of C does not lie in d + W, such a fact remains true because the interior point of vw is the midpoint.

If W contains all four possible elements, then k ⊂ g is a maximal subalgebra (with respect to inclusion). We suspect that this case also does not occur. In any event, it is of less interest because the only way to obtain a complete cohomogeneity one example is by adding a Z/2-quotient of the principal orbit as a special orbit.

2. The only parts of this paper which depend on K being connected (or slightly more generally, on the condition in Remark 2.4) are parts of Sect. 5, Case (ii) of Sect. 9, and all of Sect. 10. To remove this condition, the main task would be generalizing Theorem 5.18 by getting a better handle on the type II vectors associated to (1A) vertices (cf. Lemmas 5.6-5.8).

References

[BB] Bérard Bergery, L.: Sur des nouvelles variétés riemanniennes d’Einstein. Publications de l’Institut Elie Cartan, Nancy, 1982
[BGGG] Brandhuber, A., Gomis, J., Gubser, S., Gukov, S.: Gauge theory at large N and new G_2 holonomy metrics. Nucl. Phys. B 611, 179–204 (2001)
[CGLP1] Cvetič, M., Gibbons, G.W., Lü, H., Pope, C.N.: Hyperkähler Calabi metrics, L^2 harmonic forms, resolved M2-branes, and AdS_4/CFT_3 correspondence. Nucl. Phys. B 617, 151–197 (2001)
[CGLP2] Cvetič, M., Gibbons, G.W., Lü, H., Pope, C.N.: Cohomogeneity one manifolds of Spin(7) and G_2 holonomy. Ann. Phys. 300, 139–184 (2002)
[CGLP3] Cvetič, M., Gibbons, G.W., Lü, H., Pope, C.N.: Ricci-flat metrics, harmonic forms and brane resolutions. Commun. Math. Phys. 232, 457–500 (2003)
[DW1] Dancer, A., Wang, M.: Kähler-Einstein metrics of cohomogeneity one. Math. Ann. 312, 503–526 (1998)
[DW2] Dancer, A., Wang, M.: Integrable cases of the Einstein equations. Commun. Math. Phys. 208, 225–244 (1999)
[DW3] Dancer, A., Wang, M.: The cohomogeneity one Einstein equations from the Hamiltonian viewpoint. J. Reine Angew. Math. 524, 97–128 (2000)
[DW4] Dancer, A., Wang, M.: Superpotentials and the cohomogeneity one Einstein equations. Commun. Math. Phys. 260, 75–115 (2005)
[DW5] Dancer, A., Wang, M.: Notes on Face-listings for “Classification of superpotentials”. Posted at http://www.math.mcmaster.ca/mckenzie/newfaces.html, 2007
[GGK] Ginzburg, V., Guillemin, V., Karshon, Y.: Moment maps, cobordisms, and Hamiltonian group actions. AMS Mathematical Surveys and Monographs, Vol. 98, Providence, RI: Amer. Math. Soc., 2002
[EW] Eschenburg, J., Wang, M.: The initial value problem for cohomogeneity one Einstein metrics. J. Geom. Anal. 10, 109–137 (2000)
[WW] Wang, J., Wang, M.: Einstein metrics on S^2-bundles. Math. Ann. 310, 497–526 (1998)
[WZ1] Wang, M., Ziller, W.: Existence and non-existence of homogeneous Einstein metrics. Invent. Math. 84, 177–194 (1986)
[Zi] Ziegler, G.M.: Lectures on Polytopes. Graduate Texts in Mathematics, Vol. 152, Berlin-Heidelberg-New York: Springer-Verlag, 1995

Communicated by G. W. Gibbons

Commun. Math. Phys. 284, 649–673 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0642-y

Communications in

Mathematical Physics

Hydrodynamic Turbulence and Intermittent Random Fields Raoul Robert1 , Vincent Vargas2 1 Institut Fourier, Université Grenoble 1, UMR CNRS 5582, 100, rue des Mathématiques,

BP 74, 38402 Saint-Martin d’Hères cedex, France. E-mail: [email protected]

2 CNRS, UMR 7534, Université Paris-Dauphine, Ceremade, F-75016 Paris, France.

E-mail: [email protected] Received: 30 March 2007 / Accepted: 29 June 2008 Published online: 11 October 2008 – © Springer-Verlag 2008

Abstract: In this article, we construct two families of multifractal random vector fields with non-symmetrical increments. We discuss the use of such families to model the velocity field of turbulent flows. 1. Introduction Roughly observed, some random phenomena seem scale invariant. This is the case for the velocity field of turbulent flows or the (logarithm of) evolution in time of the price of a financial asset. However, a more precise empirical study of these phenomena displays in fact a weakened form of scale invariance commonly called multifractal scale invariance or intermittency (the exponent which governs the power law scaling of the process or field is no longer linear). An important question is therefore to construct intermittent random fields which exhibit the observed characteristics. Following the work of Kolmogorov and Obukhov ([9,12]) on the energy dissipation in turbulent flows, Mandelbrot introduced in [10] a “limit-lognormal” model to describe turbulent dissipation or the volatility of a financial asset. This model was rigorously defined and studied in a mathematical framework by Kahane in [8]; more precisely, Kahane constructed a random measure called Gaussian multiplicative chaos. A natural extension of this work is to use Gaussian multiplicative chaos to construct a field (or a process in the financial case) which describes the whole phenomenon: the velocity field in turbulent flows (the price of an asset on a financial market). This extension was first performed by Mandelbrot himself who proposed to model the price of a financial asset with a time changed Brownian motion, the time change being random and independent of the Brownian motion. In [2], the authors proposed for the time change to take the primitive of multiplicative chaos: this gives the so-called multifractal random walk model (MRW) (Bacry and Muzy later generalized the construction of the MRW model in [3]). 
The obtained process accounts for many observed properties of financial assets. Partially supported by CNRS (UMR 7599 “Probabilités et Modèles Aléatoires”).


The drawback of the above construction and of the MRW model is that the laws of the increments are symmetrical. In the case of finance, this is in contradiction with the skewness property observed for certain asset prices. In the case of turbulence, the laws of the increments must be nonsymmetrical: it is a theoretical necessity and stems from the dissipation of the kinetic energy ([7]). In light of these observations, we are naturally led to construct random fields which generalize to any dimension such a process and which present multifractal scale invariance as well as nonsymmetrical increments. We will answer a very natural question: how can one obtain a family of multifractal fields with nonsymmetrical increments by perturbing a given scale invariant Gaussian random field on Rd ? Finally, in the last part we will mention the difficulties which arise in trying to construct an incompressible multifractal velocity field that verifies the 4/5-law of Kolmogorov with positive dissipation.

2. Notations and Preliminary Results

2.1. The underlying Gaussian field. Let $dW_0(x)$ denote the Gaussian white noise on $\mathbb{R}^d$ and let $\varphi : \mathbb{R}^d \to [0, 1]$ be a $C^\infty$, radially symmetric function equal to 1 for $|x| \le 1$ and to 0 for $|x| > 2$. We also introduce a fixed correlation scale $R > 0$ and a number $\alpha$ which satisfies
$$d/2 < \alpha < d/2 + 1. \tag{2.1}$$
We define the Gaussian field $X_g$ by the following formula:
$$X_g(x) = \int_{\mathbb{R}^d} \varphi_R(x - y)\, \frac{x - y}{|x - y|^{d-\alpha+1}}\, dW_0(y), \tag{2.2}$$
where we set the following notation:
$$\varphi_R(x) = R^{d/2-\alpha}\, \varphi\Big(\frac{x}{R}\Big).$$
Using Kolmogorov’s continuity criterion (see the standard book [13]), it is easy to show that (2.2) defines a homogeneous, isotropic Gaussian field which is almost surely Hölder continuous of any order $< \alpha - d/2$. Note that condition (2.1) implies that the integrand in (2.2) is square integrable, and the $R^{d/2-\alpha}$ factor ensures that the field is dimensionless.

2.1.1. Scaling property. Let $e$ be a unit vector and $\lambda > 0$. We have the following identity in law:
$$X_g(x + \lambda e) - X_g(x) \overset{(law)}{=} \int_{\mathbb{R}^d} \Big( \frac{\varphi_R(y)\, y}{|y|^{d-\alpha+1}} - \frac{\varphi_R(y - \lambda e)(y - \lambda e)}{|y - \lambda e|^{d-\alpha+1}} \Big)\, dW_0(y).$$
From the Gaussianity of the above law, we deduce that for all $q > 0$, there exists $c_q > 0$ such that:
$$E(|X_g(x + \lambda e) - X_g(x)|^q) = \sigma_{\lambda e}^q\, c_q,$$

with
$$\sigma_{\lambda e}^2 = \int_{\mathbb{R}^d} \Big( \frac{\varphi_R(y)\, y}{|y|^{d-\alpha+1}} - \frac{\varphi_R(y - \lambda e)(y - \lambda e)}{|y - \lambda e|^{d-\alpha+1}} \Big)^2 dy$$
$$= \lambda^{2\alpha-d} \int_{\mathbb{R}^d} \Big( \frac{\varphi_R(\lambda z)\, z}{|z|^{d-\alpha+1}} - \frac{\varphi_R(\lambda z - \lambda e)(z - e)}{|z - e|^{d-\alpha+1}} \Big)^2 dz$$
$$\underset{\lambda \to 0}{\sim} \Big(\frac{\lambda}{R}\Big)^{2\alpha-d} \int_{\mathbb{R}^d} \Big| \frac{z}{|z|^{d-\alpha+1}} - \frac{z - e}{|z - e|^{d-\alpha+1}} \Big|^2 dz.$$
We thus derive the following scaling:
$$E(|X_g(x + \lambda e) - X_g(x)|^q) \underset{\lambda \to 0}{\sim} \Big(\frac{\lambda}{R}\Big)^{q(\alpha - d/2)} C_q,$$
where the constant $C_q$ is independent of $e$. One says that $(X_g(x))_{x \in \mathbb{R}^d}$ is at small scales monofractal with scaling exponent $\alpha - d/2$. A homogeneous and isotropic field $(X(x))_{x \in \mathbb{R}^d}$ is multifractal if there exists a nonlinear function $\zeta_q$ such that:
$$E(|X(x + \lambda e) - X(x)|^q) \underset{\lambda \to 0}{\sim} \Big(\frac{\lambda}{R}\Big)^{\zeta_q} C_q.$$
We call $\zeta_q$ the structural function of the field $(X(x))_{x \in \mathbb{R}^d}$.

2.2. Outline of the construction of multifractal vector fields from the field $X_g$. Our construction is inspired by the work of Kahane in [8]. Let $\epsilon > 0$ and $X^\epsilon(y)$ be a regular family of scalar Gaussian fields (not necessarily independent of $dW_0$). We consider a family of fields $X_\epsilon$ (with scalar components $X_\epsilon^j$ in the canonical basis) defined by:
$$X_\epsilon(x) = \int_{\mathbb{R}^d} \varphi_R(x - y)\, \frac{x - y}{|x - y|_\epsilon^{d-\alpha+1}}\, e^{X^\epsilon(y) - C_\epsilon}\, dW_0(y) \tag{2.3}$$
($|x - y|_\epsilon$ is defined in the next subsection and is given by a standard convolution). For an appropriate family $X^\epsilon$, we show that it is possible to find constants $C_\epsilon$ such that $X_\epsilon$ tends to a non-trivial field $X$ (with scalar components $X^j$ in the canonical basis) as $\epsilon$ tends to 0. If one chooses $X^\epsilon$ independent of $dW_0$, we will see that this leads to a field $X$ that extends the model introduced by Bacry in [2] and that has symmetrical increments. Thus, to obtain nonsymmetrical increments, we must introduce a correlation between $X^\epsilon$ and $dW_0$.

2.3. Notations and construction of the family $X^\epsilon$. Let $k^R$ be the function
$$k^R(x) = \begin{cases} \dfrac{1}{|x|^{d/2}} & \text{for } |x| \le R, \\ 0 & \text{otherwise.} \end{cases}$$
Let $\theta(x)$ be a $C^\infty$, non-negative and radially symmetric function with compact support in $|x| \le 1$ such that
$$\int_{\mathbb{R}^d} \theta(x)\, dx = 1.$$

652

R. Robert, V. Vargas

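We define
$$\theta_\epsilon = \frac{1}{\epsilon^d}\, \theta\Big(\frac{\cdot}{\epsilon}\Big)$$
and the corresponding convolutions:
$$k_\epsilon^R = \theta_\epsilon * k^R, \qquad |\cdot|_\epsilon = \theta_\epsilon * |\cdot|.$$
Let $\gamma$ be a strictly positive parameter and $dW$ be a Gaussian white noise on $\mathbb{R}^d$. We consider the following Gaussian field:
$$X^\epsilon(y) = \gamma \int_{\mathbb{R}^d} k_\epsilon^R(y - \sigma)\, dW(\sigma).$$
Its correlation kernel is given by:
$$E(X^\epsilon(x) X^\epsilon(y)) = \gamma^2 \rho_{\epsilon/R}\Big(\frac{x - y}{R}\Big),$$
where $\rho = k^1 * k^1$ and $\rho_\epsilon = \theta_\epsilon * \theta_\epsilon * \rho$. One can prove the following expansion:
$$\rho(x) = \omega_d \ln^+ \frac{1}{|x|} + \phi(x),$$
where $\omega_d$ denotes the surface of the unit sphere in $\mathbb{R}^d$ and $\phi$ is a continuous function that vanishes for $|x| \ge 2$. We will write $|\cdot|_* = \inf(1, |\cdot|)$ and, with this definition, the previous expansion is equivalent to:
$$e^{\rho(x)} = \frac{e^{\phi(x)}}{|x|_*^{\omega_d}}.$$
One can also prove the following expansions with respect to $\epsilon$ for $\epsilon < R$:
$$k_\epsilon^R(0) = \frac{C_0}{\epsilon^{d/2}} \tag{2.4}$$
with $C_0 = \int_{|u| \le 1} \frac{\theta(u)}{|u|^{d/2}}\, du$, and there exists a constant $C_1$ such that
$$\rho_{\epsilon/R}(0) = \omega_d \ln \frac{R}{\epsilon} + C_1 + o(\epsilon). \tag{2.5}$$
In the sequel, we will consider the case $\gamma\, dW = \gamma_0(\epsilon)\, dW_0 + \gamma_1\, dW_1$, where $dW_1$ is a white noise independent of $dW_0$ and $\gamma_0(\epsilon)$ is a function of $\epsilon$ that will be defined later. Note that the integral in formula (2.3) has a meaning since $dW_0$ can be viewed as a random distribution.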
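The equivalence noted above between the expansion of ρ (a ln⁺ term plus φ) and the product form of e^ρ (e^φ divided by a power of |x|_*) can be checked numerically. The following is a minimal sketch under stated assumptions: the choice d = 2 (so that ω_d = 2π) and the sample values standing in for φ(x) are illustrative only and are not taken from the paper.

```python
import math

def ln_plus(t):
    """Positive part of the logarithm: ln^+(t) = max(ln t, 0), for t > 0."""
    return max(math.log(t), 0.0)

def rho_from_expansion(r, phi_value, omega_d):
    """rho(x) = omega_d * ln^+(1/|x|) + phi(x), where r = |x| > 0."""
    return omega_d * ln_plus(1.0 / r) + phi_value

def exp_rho_product_form(r, phi_value, omega_d):
    """e^{rho(x)} = e^{phi(x)} / |x|_*^{omega_d}, with |x|_* = min(1, |x|)."""
    return math.exp(phi_value) / min(1.0, r) ** omega_d

# Assumption: d = 2, so omega_d is the length of the unit circle.
omega_d = 2.0 * math.pi

# Hypothetical (|x|, phi(x)) pairs; phi is only required to be continuous.
samples = [(0.01, 0.3), (0.5, -0.2), (1.0, 0.1), (1.7, 0.05), (3.0, 0.0)]

# Each entry compares exp(rho) computed both ways.
comparisons = [
    (math.exp(rho_from_expansion(r, p, omega_d)), exp_rho_product_form(r, p, omega_d))
    for r, p in samples
]
```

For |x| ≥ 1 both forms reduce to e^{φ(x)}, which is why the truncated modulus |x|_* appears rather than |x|.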


2.4. Preliminary technical results. We recall the following integration by parts formula for Gaussian vectors (cf. Lemma 1.2.1 in [11]):

Lemma 2.1. Let $(g, g_1, \ldots, g_n)$ be a centered Gaussian vector and $G : \mathbb{R}^n \to \mathbb{R}$ a $C^1$ function such that its partial derivatives have at most exponential growth. Then we have:
$$E(g\, G(g_1, \ldots, g_n)) = \sum_{i=1}^{n} E(g g_i)\, E\Big(\frac{\partial G}{\partial x_i}(g_1, \ldots, g_n)\Big). \tag{2.6}$$

From the above formula, one can easily deduce by induction the following lemma which will be frequently used in the sequel:

Lemma 2.2. Let $l \in \mathbb{N}^*$ be some positive integer and $(g, g_1, \ldots, g_{2l})$ a centered Gaussian vector. Then:
$$E(g_1 \cdots g_{2l}\, e^{g}) = \Big(\sum_{k=0}^{l} S_{k,l}\Big)\, e^{\frac{1}{2} E(g^2)},$$
where
$$S_{k,l} = \sum_{\{i_1, \ldots, i_{2k}\} \subset \{1, \ldots, 2l\}} \sum E(g g_{i_1}) \cdots E(g g_{i_{2k}})\, E(g_{i_{2k+1}} g_{i_{2k+2}}) \cdots E(g_{i_{2l-1}} g_{i_{2l}}),$$
where the second sum is taken over all partitions of $\{1, \ldots, 2l\} \setminus \{i_1, \ldots, i_{2k}\}$ in subsets of two elements $\{i_{2p+1}, i_{2p+2}\}$. Similarly, we get the following formula:
$$E(g_1 \cdots g_{2l+1}\, e^{g}) = \Big(\sum_{k=0}^{l} \tilde{S}_{k,l}\Big)\, e^{\frac{1}{2} E(g^2)},$$
where
$$\tilde{S}_{k,l} = \sum_{\{i_1, \ldots, i_{2k+1}\} \subset \{1, \ldots, 2l+1\}} \sum E(g g_{i_1}) \cdots E(g g_{i_{2k+1}})\, E(g_{i_{2k+2}} g_{i_{2k+3}}) \cdots E(g_{i_{2l}} g_{i_{2l+1}}).$$

Remark 2.3. In $S_{k,l}$ ($\tilde{S}_{k,l}$), the summation is made of $\frac{(2l)!}{(2k)!\, 2^{l-k}\, (l-k)!}$ $\Big(\frac{(2l+1)!}{(2k+1)!\, 2^{l-k}\, (l-k)!}\Big)$ terms, a number we will denote by $\alpha_{k,l}$ ($\tilde{\alpha}_{k,l}$).
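The term counts in Remark 2.3 can be confirmed by brute-force enumeration: choose the indices that are paired with g, then split the remaining indices into pairs. The sketch below is an illustration only; the helper names `pairings`, `count_terms`, `alpha` and `alpha_tilde` are hypothetical and not from the paper.

```python
from itertools import combinations
from math import factorial

def pairings(items):
    """Yield every partition of `items` (a list of even length) into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def count_terms(n, m):
    """Count the terms obtained by choosing m of the indices 1..n (those paired
    with g) and partitioning the other n - m indices into pairs."""
    indices = list(range(1, n + 1))
    total = 0
    for chosen in combinations(indices, m):
        rest = [i for i in indices if i not in chosen]
        total += sum(1 for _ in pairings(rest))
    return total

def alpha(k, l):
    """Closed form for the number of terms in S_{k,l} (even case)."""
    return factorial(2 * l) // (factorial(2 * k) * 2 ** (l - k) * factorial(l - k))

def alpha_tilde(k, l):
    """Closed form for the number of terms in the odd-case sum."""
    return factorial(2 * l + 1) // (factorial(2 * k + 1) * 2 ** (l - k) * factorial(l - k))
```

The closed forms are the product of a binomial coefficient (choice of the indices paired with g) and the number of perfect matchings of the remaining 2(l − k) indices.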

We will also use the following lemma essentially due to Kahane ([8]). Lemma 2.4. Let (T, d) be a metric space and σ a finite positive measure on T equipped with the borelian σ -field induced by d. Let q : T × T :→ R+ a symmetric application and m a positive integer. Then we have the following inequalities: q(t j ,tk ) 1 j
s∈T

e

T

(2.7) 1 j
T 2m+1

dσ (t1 ) . . . dσ (t2m+1 ) eq(s,t) dσ (t))( emq(s,t) eq(s,t) dσ (t))2m−1 .

σ (T ) sup( s, s

T

T

(2.8)


Proof. The proof of (2.7) can be found in [8]. Thus we just prove how to derive inequality (2.8) from (2.7). By integrating with respect to the first 2m variables and applying (2.7) with the measure eq(t,t2m+1 ) dσ (t), we get: e 1 j

=

dσ (t2m+1 )

T

(2.7)

e

eq(t,t2m+1 ) dσ (t))(sup T

s, s

2m

eq(t j ,t2m+1 ) dσ (t1 ) . . . dσ (t2m )

j=1

dσ (t2m+1 )(

σ (T ) sup(

q(t j ,tk )

T 2m

T

1 j
s

eq(s,t) dσ (t))( T

emq(s,t) eq(t,t2m+1 ) dσ (t))2m−1 T

emq(s,t) eq(s,t) dσ (t))2m−1 . T

3. Construction of a Four Parameter Family of Multifractal, Homogeneous, Isotropic Vector Fields with Non-Symmetrical Increments

In this section, we will suppose that $d/2 < \alpha < (d/2 + 1) \wedge d$ and $\omega_d \gamma_1^2 < d$. We consider the field $X_\epsilon$ defined by formula (2.3) with
$$X^\epsilon(y) = \gamma_0(\epsilon) X_0^\epsilon(y) + \gamma_1 X_1^\epsilon(y),$$
where
$$X_i^\epsilon(y) = \int_{\mathbb{R}^d} k_\epsilon^R(y - \sigma)\, dW_i(\sigma), \qquad i = 0, 1.$$
We set also
$$C_\epsilon = ((\gamma_0(\epsilon))^2 + \gamma_1^2)\, \rho_{\epsilon/R}(0),$$
and
$$\gamma_0(\epsilon) = \gamma_0^* \Big(\frac{\epsilon}{R}\Big)^{\frac{d - \omega_d \gamma_1^2}{2}}.$$
Therefore, we introduce a slight correlation between $X^\epsilon$ and $dW_0$ ($\gamma_0(\epsilon)$ tends to 0 as $\epsilon$ goes to 0).

3.1. Multiplicative chaos in dimension d. Multiplicative chaos, or the “limit-lognormal” model introduced by Mandelbrot, is a generalization of the exponential of a Gaussian process. As mentioned in the introduction, it was defined rigorously by Kahane in [8]. The construction of Kahane was based on the theory of martingales, and thus the generalized correlation kernel (here $\rho(t - s)$) had to satisfy a condition that is hard to check in practice (the σ-positivity condition). Our construction is based on $L^2$-theory and can be carried out without this condition.


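We will construct the multiplicative chaos associated to the generalized correlation kernel $\rho(\frac{x-y}{R})$ defined in 2.3 and to some (positive) intermittency parameter $\gamma_1$ such that $\gamma_1^2 \omega_d < d$. Let $\epsilon$ be a positive number. Let $\mathcal{B}(\mathbb{R}^d)$ denote the standard Borel σ-field; we want to consider the limit as $\epsilon$ goes to 0 of the random measures $Q^{\epsilon,\gamma_1}$ defined by:
$$Q^{\epsilon,\gamma_1}(dy) = e^{\gamma_1 X_1^\epsilon(y) - \frac{\gamma_1^2}{2} E((X_1^\epsilon(y))^2)}\, dy = e^{\gamma_1 X_1^\epsilon(y) - \frac{1}{2} \gamma_1^2 \rho_{\epsilon/R}(0)}\, dy. \tag{3.1}$$
This leads us to state the following proposition:

Proposition 3.1 (Multiplicative chaos of order $\gamma_1$). There exists a positive random measure $Q^{\gamma_1}(dy)$ independent of the regularizing function $\theta$ such that:
(1) for all $A$ bounded in $\mathcal{B}(\mathbb{R}^d)$, $E(Q^{\gamma_1}(A)) = |A|$.
(2) $Q^{\gamma_1}$ has almost surely no atoms.
(3) Almost surely, $Q^{\gamma_1}$ is singular with respect to the Lebesgue measure on any set $A$ (with positive measure).
If $q$ is some positive integer and $f : \mathbb{R}^d \to \mathbb{R}$ a deterministic function that satisfies the following condition:
$$\int_{(\mathbb{R}^d)^{2q}} |f(y_1)| \cdots |f(y_{2q})| \prod_{1 \le i < j \le 2q} \frac{1}{|\frac{y_i - y_j}{R}|_*^{\gamma_1^2 \omega_d}}\, dy_1 \ldots dy_{2q} < \infty, \tag{3.2}$$
then we have the following convergence:
$$\int_{\mathbb{R}^d} f(y)\, Q^{\epsilon,\gamma_1}(dy) \underset{\epsilon \to 0}{\overset{L^{2q}}{\longrightarrow}} \int_{\mathbb{R}^d} f(y)\, Q^{\gamma_1}(dy).$$
We also have the following expression for the moments of $\int_{\mathbb{R}^d} f(y)\, Q^{\gamma_1}(dy)$:
$$\forall k \le 2q, \quad E\Big(\Big(\int_{\mathbb{R}^d} f(y)\, Q^{\gamma_1}(dy)\Big)^k\Big) = \int_{(\mathbb{R}^d)^k} f(y_1) \cdots f(y_k) \prod_{1 \le i < j \le k} \frac{e^{\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_1^2 \omega_d}}\, dy_1 \ldots dy_k. \tag{3.3}$$
Moreover, the above formula (3.3) extends straightforwardly to the case of two functions $f, g$ and two intermittency parameters $\gamma_1, \gamma_2$, giving:
$$E\Big(\Big(\int_{\mathbb{R}^d} f(y)\, Q^{\gamma_1}(dy)\Big)^k \Big(\int_{\mathbb{R}^d} g(y)\, Q^{\gamma_2}(dy)\Big)^l\Big) = \int_{(\mathbb{R}^d)^{k+l}} f(y_1) \cdots f(y_k)\, g(y_{k+1}) \cdots g(y_{k+l})$$
$$\times \prod_{1 \le i < j \le k} \frac{e^{\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_1^2 \omega_d}} \prod_{1 \le i \le k,\ j > k} \frac{e^{\gamma_1 \gamma_2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_1 \gamma_2 \omega_d}} \prod_{k+1 \le i < j \le k+l} \frac{e^{\gamma_2^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_2^2 \omega_d}}\, dy_1 \ldots dy_{k+l}.$$
We will call $Q^{\gamma_1}(dy)$ multiplicative chaos of order $\gamma_1$.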


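Proof. We first start by considering a positive integer $q$ and a function $f$ that satisfies the corresponding integrability condition (3.2). Let $\epsilon, \epsilon'$ be two positive numbers. By using Fubini, we get for all $j \le 2q$:
$$E\Big(\Big(\int_{\mathbb{R}^d} f(y)\, Q^{\epsilon,\gamma_1}(dy)\Big)^j \Big(\int_{\mathbb{R}^d} f(y)\, Q^{\epsilon',\gamma_1}(dy)\Big)^{2q-j}\Big)$$
$$= e^{-\frac{j}{2} \gamma_1^2 \rho_{\epsilon/R}(0) - \frac{2q-j}{2} \gamma_1^2 \rho_{\epsilon'/R}(0)} \int_{(\mathbb{R}^d)^{2q}} f(y_1) \cdots f(y_{2q})\, e^{\frac{1}{2} \gamma_1^2 E((\sum_{i=1}^{j} X_1^\epsilon(y_i) + \sum_{i=j+1}^{2q} X_1^{\epsilon'}(y_i))^2)}\, dy_1 \ldots dy_{2q}$$
$$\underset{\epsilon, \epsilon' \to 0}{\longrightarrow} \int_{(\mathbb{R}^d)^{2q}} f(y_1) \cdots f(y_{2q}) \prod_{1 \le i < j \le 2q} \frac{e^{\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_1^2 \omega_d}}\, dy_1 \ldots dy_{2q},$$
since $\rho_{\epsilon/R}(x) \to \rho(x)$ as $\epsilon$ goes to 0. From this, we deduce that:
$$E\Big(\Big(\int_{\mathbb{R}^d} f(y)\, Q^{\epsilon,\gamma_1}(dy) - \int_{\mathbb{R}^d} f(y)\, Q^{\epsilon',\gamma_1}(dy)\Big)^{2q}\Big) \underset{\epsilon, \epsilon' \to 0}{\longrightarrow} 0,$$
and therefore that $\int_{\mathbb{R}^d} f(y)\, Q^{\epsilon,\gamma_1}(dy)$ is a Cauchy sequence in $L^{2q}$ that converges to some random variable $\tilde{Q}^{\gamma_1}(f)$. For $k \le 2q$, the moment $E((\tilde{Q}^{\gamma_1}(f))^k)$ is the limit as $\epsilon$ goes to 0 of $E((Q^{\epsilon,\gamma_1}(f))^k)$; from this one can deduce that the moments of $\tilde{Q}^{\gamma_1}(f)$ are given by formula (3.3).

For any bounded set $A$ in $\mathcal{B}(\mathbb{R}^d)$, consider $f = 1_A$ and $q = 1$. Since $\gamma_1^2 \omega_d < d$, we deduce from Lemma 2.4 that the integrability condition (3.2) is satisfied. Thus it follows from the proof above that $Q^{\epsilon,\gamma_1}(A)$ converges in $L^2$ to some random variable $\tilde{Q}^{\gamma_1}(A)$. This defines a family of random variables (indexed by the bounded Borel sets) that satisfies the following properties:
(1) For all disjoint and bounded sets $A_1, A_2$ in $\mathcal{B}(\mathbb{R}^d)$,
$$\tilde{Q}^{\gamma_1}(A_1 \cup A_2) = \tilde{Q}^{\gamma_1}(A_1) + \tilde{Q}^{\gamma_1}(A_2) \quad \text{a.s.}$$
(2) For any bounded sequence $(A_n)_{n \ge 1}$ decreasing to $\emptyset$:
$$\tilde{Q}^{\gamma_1}(A_n) \underset{n \to \infty}{\longrightarrow} 0 \quad \text{a.s.}$$
By Theorem 6.1.VI. in [5], there exists a random measure $Q^{\gamma_1}$ such that for all bounded $A$ in $\mathcal{B}(\mathbb{R}^d)$ we have:
$$Q^{\gamma_1}(A) = \tilde{Q}^{\gamma_1}(A) \quad \text{a.s.}$$
Finally, one can easily show that the limit random variable $\tilde{Q}^{\gamma_1}(f)$ is almost surely equal to $\int_{\mathbb{R}^d} f(y)\, Q^{\gamma_1}(dy)$.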


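3.2. Convergence of $X_\epsilon$ towards a field $X$. In the sequel, $(e_j)_j$ will denote the canonical basis (whereas $(e^j)_j$ denotes the components of a vector $e$). In this subsection, we will prove the following proposition:

Proposition 3.2. Let $\alpha$ be such that $d/2 < \alpha < (d/2 + 1) \wedge d$ and $\gamma_1$ such that $2\gamma_1^2 \omega_d < \alpha - d/2$. There exists a field $(X(x))_{x \in \mathbb{R}^d}$ such that for all $k$ and $x_1, \ldots, x_k \in \mathbb{R}^d$, the following convergence in law holds:
$$(X_\epsilon(x_1), \ldots, X_\epsilon(x_k)) \underset{\epsilon \to 0}{\Rightarrow} (X(x_1), \ldots, X(x_k)). \tag{3.4}$$
Let $l$ be an integer such that one of the following conditions holds:
(1) $l$ is even and $l \gamma_1^2 \omega_d < \alpha - d/2$.
(2) $l$ is odd and $(l + 1) \gamma_1^2 \omega_d < \alpha - d/2$.
Let $F_R^j$ be defined by $F_R^j(y) = \varphi_R(y) \frac{y^j}{|y|^{d-\alpha+1}}$. Then there exists $C$ such that, for all $x$ in $\mathbb{R}^d$, the random variables $X^j(x)$ have a moment of order $2l$ given by the following expression:
$$E((X^j(x))^{2l}) = \sum_{k=0}^{l} \alpha_{k,l}\, C^{2k} \int_{(\mathbb{R}^d)^{k+l}} F_R^j(y_1) \cdots F_R^j(y_{2k})\, (F_R^j(y_{2k+1}))^2 \cdots (F_R^j(y_{k+l}))^2$$
$$\times \prod_{1 \le i < j \le 2k} \frac{e^{\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_1^2 \omega_d}} \prod_{1 \le i \le 2k,\ j > 2k} \frac{e^{2\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{2\gamma_1^2 \omega_d}} \prod_{2k+1 \le i < j \le k+l} \frac{e^{4\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{4\gamma_1^2 \omega_d}}\, dy_1 \ldots dy_{k+l}. \tag{3.5}$$
We also have:
$$E((X^j(x + h) - X^j(x))^{2l}) = \sum_{k=0}^{l} \alpha_{k,l}\, C^{2k} \int_{(\mathbb{R}^d)^{k+l}} (F_R^j(y_1) - F_R^j(y_1 - h)) \cdots (F_R^j(y_{2k}) - F_R^j(y_{2k} - h))$$
$$\times (F_R^j(y_{2k+1}) - F_R^j(y_{2k+1} - h))^2 \cdots (F_R^j(y_{k+l}) - F_R^j(y_{k+l} - h))^2$$
$$\times \prod_{1 \le i < j \le 2k} \frac{e^{\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{\gamma_1^2 \omega_d}} \prod_{1 \le i \le 2k,\ j > 2k} \frac{e^{2\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{2\gamma_1^2 \omega_d}} \prod_{2k+1 \le i < j \le k+l} \frac{e^{4\gamma_1^2 \phi(\frac{y_i - y_j}{R})}}{|\frac{y_i - y_j}{R}|_*^{4\gamma_1^2 \omega_d}}\, dy_1 \ldots dy_{k+l}. \tag{3.6}$$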

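Proof. Let $\gamma_1$ be such that $2\gamma_1^2 \omega_d < \alpha - d/2$. We set
$$C = \frac{\gamma_0^* C_0\, e^{-\frac{1}{2} \gamma_1^2 C_1}}{R^{d/2}},$$
and define two auxiliary fields $Y_\epsilon$, $Z_\epsilon$ by the following expressions:
$$Y_\epsilon(x) = \int_{\mathbb{R}^d} \varphi_R(x - y)\, \frac{x - y}{|x - y|^{d-\alpha+1}}\, Q^{\epsilon,\gamma_1}(dy) \tag{3.7}$$
and
$$Z_\epsilon(x) = \int_{\mathbb{R}^d} \varphi_R(x - y)\, \frac{x - y}{|x - y|^{d-\alpha+1}}\, e^{\gamma_1 X_1^\epsilon(y) - C_\epsilon}\, dW_0(y). \tag{3.8}$$
Note that $Z_\epsilon(x)$ exists since $X_1^\epsilon$ and $dW_0$ are independent with:
$$E\Big(\int_{\mathbb{R}^d} \frac{\varphi_R(x - y)^2}{|x - y|^{2(d-\alpha)}}\, e^{2\gamma_1 X_1^\epsilon(y) - 2C_\epsilon}\, dy\Big) < \infty. \tag{3.9}$$
We can compute, for all $x$ in $\mathbb{R}^d$, $E(|X_\epsilon(x) - C Y_\epsilon(x) - Z_\epsilon(x)|^2)$ (cf. the more complicated computations in the proof of Proposition 3.7) and derive the following limit:
$$X_\epsilon(x) - (C Y_\epsilon(x) + Z_\epsilon(x)) \underset{\epsilon \to 0}{\overset{L^2}{\longrightarrow}} 0.$$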
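One way to see why the exponent (d − ω_d γ_1²)/2 in γ_0(ε) goes together with the constant C set at the start of this proof is to note that, with the expansions (2.4) and (2.5) kept at leading order, the product γ_0(ε) · k_ε^R(0) · exp(−γ_1² ρ_{ε/R}(0)/2) no longer depends on ε and equals C. Reading this product as the prefactor that multiplies the chaos measure in the drift part of the field is our interpretation, not a statement quoted from the paper, and all numerical values below are arbitrary assumptions chosen only to satisfy ω_d γ_1² < d.

```python
import math

def gamma0(eps, gamma0_star, gamma1, omega_d, d, R):
    """gamma_0(eps) = gamma_0^* (eps/R)^((d - omega_d * gamma_1^2) / 2)."""
    return gamma0_star * (eps / R) ** ((d - omega_d * gamma1 ** 2) / 2.0)

def prefactor(eps, gamma0_star, gamma1, omega_d, d, R, C0, C1):
    """gamma_0(eps) * k_eps^R(0) * exp(-gamma_1^2 * rho_{eps/R}(0) / 2),
    using (2.4) exactly and (2.5) without its vanishing remainder."""
    k_at_zero = C0 / eps ** (d / 2.0)                 # expansion (2.4)
    rho_at_zero = omega_d * math.log(R / eps) + C1    # expansion (2.5), leading order
    g0 = gamma0(eps, gamma0_star, gamma1, omega_d, d, R)
    return g0 * k_at_zero * math.exp(-0.5 * gamma1 ** 2 * rho_at_zero)

def limit_constant(gamma0_star, gamma1, d, R, C0, C1):
    """C = gamma_0^* C_0 exp(-gamma_1^2 C_1 / 2) / R^(d/2)."""
    return gamma0_star * C0 * math.exp(-0.5 * gamma1 ** 2 * C1) / R ** (d / 2.0)

# Hypothetical parameters: d = 3, so omega_d = 4*pi; C0 and C1 are placeholders.
P = dict(gamma0_star=0.7, gamma1=0.1, omega_d=4.0 * math.pi, d=3, R=2.0, C0=1.3, C1=-0.4)

C_value = limit_constant(P["gamma0_star"], P["gamma1"], P["d"], P["R"], P["C0"], P["C1"])
prefactors = [prefactor(eps, **P) for eps in (1e-1, 1e-3, 1e-6)]
```

The powers of ε cancel exactly: (d − ω_d γ_1²)/2 from γ_0(ε), −d/2 from k_ε^R(0), and ω_d γ_1²/2 from the exponential of the logarithm.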

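Thus, we must show that the finite dimensional distributions of the field $C Y_\epsilon + Z_\epsilon$ converge in law. Let $k$ be some positive integer and $x_1, \ldots, x_k$ points in $\mathbb{R}^d$. For all $\xi = (\xi_1, \ldots, \xi_k)$ in $(\mathbb{R}^d)^k$, we compute the characteristic function of $(C Y_\epsilon(x_1) + Z_\epsilon(x_1), \ldots, C Y_\epsilon(x_k) + Z_\epsilon(x_k))$:
$$C_\epsilon(\xi) = E\big(e^{i \sum_{j=1}^{k} \xi_j \cdot (C Y_\epsilon(x_j) + Z_\epsilon(x_j))}\big).$$
By conditioning on the field generated by the white noise $dW_1$, we get:
$$C_\epsilon(\xi) = E\Big(e^{iC \sum_{j=1}^{k} \xi_j \cdot Y_\epsilon(x_j)}\, e^{-\frac{1}{2} \int (\sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y))^2\, e^{2\gamma_1 X_1^\epsilon(y) - 2C_\epsilon}\, dy}\Big)$$
$$= E\Big(e^{iC \int \sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y)\, Q^{\epsilon,\gamma_1}(dy)}\, e^{-\frac{1}{2} e^{-2\gamma_0(\epsilon)^2 \rho_{\epsilon/R}(0)} \int (\sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y))^2\, Q^{\epsilon,2\gamma_1}(dy)}\Big).$$
Now, using Proposition 3.1, we have:
$$\int \sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y)\, Q^{\epsilon,\gamma_1}(dy) \overset{L^2}{\longrightarrow} \int \sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y)\, Q^{\gamma_1}(dy),$$
$$\int \Big(\sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y)\Big)^2 Q^{\epsilon,2\gamma_1}(dy) \overset{L^2}{\longrightarrow} \int \Big(\sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y)\Big)^2 Q^{2\gamma_1}(dy),$$
from where:
$$C_\epsilon(\xi) \underset{\epsilon \to 0}{\longrightarrow} C(\xi) = E\Big(e^{iC \int \sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y)\, Q^{\gamma_1}(dy)}\, e^{-\frac{1}{2} \int (\sum_{j=1}^{k} \xi_j \cdot F_R(x_j - y))^2\, Q^{2\gamma_1}(dy)}\Big).$$
Thus, by applying Lévy’s theorem, we conclude that the finite dimensional distributions of the field $C Y_\epsilon + Z_\epsilon$ converge in law to those of a field $X$ whose finite dimensional distributions are given by:
$$E\big(e^{i \sum_{j=1}^{k} \xi_j \cdot X(x_j)}\big) = C(\xi).$$
Suppose that $l$ is a positive integer that satisfies the condition of the proposition. For all $\xi$ in $\mathbb{R}^d$, we have:
$$E(e^{i \xi \cdot X(x)}) = E\Big(e^{-iC \int \xi \cdot F_R(y)\, Q^{\gamma_1}(dy)}\, e^{-\frac{1}{2} \int (\xi \cdot F_R(y))^2\, Q^{2\gamma_1}(dy)}\Big).$$
We derive expression (3.5) by computing $\frac{\partial^{2l}}{(\partial \xi^j)^{2l}} E(e^{i \xi \cdot X(x)})|_{\xi = 0}$ using Proposition 3.1. We derive (3.6) similarly.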


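3.3. Scaling of $X$. The purpose of this subsection is to show that the field $(X(x))_{x \in \mathbb{R}^d}$ satisfies the multifractal scaling relation (this is what Propositions 3.5 and 3.6 below assert). We first state two preliminary lemmas we will use in the rest of the paper.

Lemma 3.3. Let $\delta$ be some real number such that $0 \le \delta < \alpha$ and $\delta \ne \alpha - 1$. There exists $C = C(\delta)$ such that we have the following inequality for $|h| \le R$:
$$\sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} |F_R(y) - F_R(y - h)|\, \frac{1}{|\frac{x - y}{R}|_*^{\delta}}\, dy \le R^{d/2}\, C\, \Big|\frac{h}{R}\Big|^{(\alpha - \delta) \wedge 1}. \tag{3.10}$$

Proof. By homogeneity, we suppose that $R = 1$ and, for simplicity, we suppose $d \ge 2$. Since $\frac{1}{|x|_*^\delta} \le \frac{1}{|x|^\delta} + 1$ and the right-hand side of (3.10) increases with $\delta$, we have to show that for $\delta \in [0, \alpha[$ and $|h| \le 1$:
$$\sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} |F_1(y) - F_1(y - h)|\, \frac{1}{|x - y|^\delta}\, dy \le C |h|^{(\alpha - \delta) \wedge 1}.$$
Indeed, this would imply that for $|h| \le 1$:
$$\sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} |F_1(y) - F_1(y - h)|\, \frac{1}{|x - y|_*^\delta}\, dy \le C |h|^{(\alpha - \delta) \wedge 1} + C |h|^{\alpha \wedge 1} \le 2C |h|^{(\alpha - \delta) \wedge 1}.$$
There exists $C$ such that for all $y$ and $h$, we have: $|\varphi(y - h) - \varphi(y)| \le C|h|$ and $\varphi(y) \le C 1_{|y| \le 2}$. We set
$$I(x) = \int_{\mathbb{R}^d} |F_1(y) - F_1(y - h)|\, \frac{1}{|x - y|^\delta}\, dy.$$
Therefore we get
$$I(x) \le C|h| \int_{|y| \le 3} \frac{1}{|y|^{d-\alpha}} \frac{1}{|x - y|^\delta}\, dy + C \int_{|y| \le 3} \Big| \frac{y}{|y|^{d-\alpha+1}} - \frac{y - h}{|y - h|^{d-\alpha+1}} \Big| \frac{1}{|x - y|^\delta}\, dy$$
$$\le C|h| + C \int_{|y| \le 3} \Big| \frac{y}{|y|^{d-\alpha+1}} - \frac{y - h}{|y - h|^{d-\alpha+1}} \Big| \frac{1}{|x - y|^\delta}\, dy, \tag{3.11}$$
where we denote by $C$ different constants.

First case: $\delta < \alpha - 1$. Plugging the inequality
$$\Big| \frac{y}{|y|^{d-\alpha+1}} - \frac{y - h}{|y - h|^{d-\alpha+1}} \Big| \le \frac{(d - \alpha + 1)|h|}{|y - h|^{d-\alpha+1} \wedge |y|^{d-\alpha+1}} \tag{3.12}$$
into (3.11), we get
$$I(x) \le C|h| \int_{|y| \le 3} \frac{1}{|y - h|^{d-\alpha+1} \wedge |y|^{d-\alpha+1}} \frac{1}{|x - y|^\delta}\, dy. \tag{3.13}$$
We have:
$$\int_{|y| \le 3} \frac{1}{|y - h|^{d-\alpha+1} \wedge |y|^{d-\alpha+1}} \frac{1}{|x - y|^\delta}\, dy \le \int_{|y| \le 3} \frac{1}{|y - h|^{d-\alpha+1}} \frac{1}{|x - y|^\delta}\, dy + \int_{|y| \le 3} \frac{1}{|y|^{d-\alpha+1}} \frac{1}{|x - y|^\delta}\, dy$$
$$\le 2 \sup_x \int_{|y| \le 4} \frac{1}{|y|^{d-\alpha+1}} \frac{1}{|x - y|^\delta}\, dy,$$
which concludes the proof.

Second case: $\delta > \alpha - 1$. By the change of variable $y = |h|u$ and setting $h = |h|e$ with $|e| = 1$, we get:
$$\int_{|y| \le 3} \Big| \frac{y - h}{|y - h|^{d-\alpha+1}} - \frac{y}{|y|^{d-\alpha+1}} \Big| \frac{1}{|x - y|^\delta}\, dy = |h|^{\alpha - \delta} \int_{|u| \le \frac{3}{|h|}} \Big| \frac{u - e}{|u - e|^{d-\alpha+1}} - \frac{u}{|u|^{d-\alpha+1}} \Big| \frac{1}{|x/|h| - u|^\delta}\, du$$
$$\le |h|^{\alpha - \delta} \sup_{a \in \mathbb{R}^d} \int_{\mathbb{R}^d} \Big| \frac{u - e}{|u - e|^{d-\alpha+1}} - \frac{u}{|u|^{d-\alpha+1}} \Big| \frac{1}{|a - u|^\delta}\, du.$$

Lemma 3.4. Let $\delta$ be some real number such that $0 \le \delta < 2\alpha - d$. There exists $C = C(\delta)$ such that we have the following inequality for $|h| \le R$:
$$\sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} |F_R(y) - F_R(y - h)|^2\, \frac{1}{|\frac{x - y}{R}|_*^{\delta}}\, dy \le C\, \Big|\frac{h}{R}\Big|^{2\alpha - d - \delta}. \tag{3.14}$$

Proof. As in the proof above, we can replace $|\cdot|_*$ by $|\cdot|$ and suppose that $R = 1$; thus we have to show inequality (3.14) with $J(x)$, where we set:
$$J(x) = \int_{\mathbb{R}^d} |F_R(y) - F_R(y - h)|^2\, \frac{1}{|x - y|^\delta}\, dy.$$
Using inequality (3.11), we get
$$J(x) \le C|h|^2 \int_{|y| \le 3} \frac{1}{|y|^{2(d-\alpha)}} \frac{1}{|x - y|^\delta}\, dy + C \int_{|y| \le 3} \Big| \frac{y - h}{|y - h|^{d-\alpha+1}} - \frac{y}{|y|^{d-\alpha+1}} \Big|^2 \frac{1}{|x - y|^\delta}\, dy$$
$$\le C|h|^2 + C \int_{|y| \le 3} \Big| \frac{y - h}{|y - h|^{d-\alpha+1}} - \frac{y}{|y|^{d-\alpha+1}} \Big|^2 \frac{1}{|x - y|^\delta}\, dy. \tag{3.15}$$
Since $2 > 2\alpha - d - \delta$, we only have to consider the second term in inequality (3.15). By the change of variable $y = |h|u$ and setting $h = |h|e$ with $|e| = 1$, we get:
$$\int_{|y| \le 3} \Big| \frac{y - h}{|y - h|^{d-\alpha+1}} - \frac{y}{|y|^{d-\alpha+1}} \Big|^2 \frac{1}{|x - y|^\delta}\, dy = |h|^{2\alpha - d - \delta} \int_{|u| \le \frac{3}{|h|}} \Big| \frac{u - e}{|u - e|^{d-\alpha+1}} - \frac{u}{|u|^{d-\alpha+1}} \Big|^2 \frac{1}{|x/|h| - u|^\delta}\, du$$
$$\le |h|^{2\alpha - d - \delta} \sup_{a \in \mathbb{R}^d} \int_{\mathbb{R}^d} \Big| \frac{u - e}{|u - e|^{d-\alpha+1}} - \frac{u}{|u|^{d-\alpha+1}} \Big|^2 \frac{1}{|a - u|^\delta}\, du.$$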

Proposition 3.5 (Scaling along the even integers). Let $l$ be an integer such that one of the following conditions holds:
(1) $l$ is even and $l\gamma_1^2\omega_d < \alpha - d/2$,
(2) $l$ is odd and $(l+1)\gamma_1^2\omega_d < \alpha - d/2$.
Let $e$ be a unit vector ($|e| = 1$). Then there exists $C_l^j(e) > 0$ such that the following scaling relation holds:

\[ E\big((X^j(x+\lambda e) - X^j(x))^{2l}\big) \underset{\lambda\to 0}{\sim} C_l^j(e) \Big(\frac{\lambda}{R}\Big)^{\zeta_{2l}}, \tag{3.16} \]

where we have

\[ \zeta_{2l} = l(2\alpha - d) - 2\gamma_1^2\omega_d\, l(l-1). \tag{3.17} \]

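Proof. For simplicity, we will suppose that $l$ is even and that $l\gamma_1^2\omega_d < \alpha - d/2$. We introduce the following notation:

\[ f_h(y) = F_R^j(y) - F_R^j(y-h). \]

We shall see that the scaling at small scale of the sum (3.6) is given by the term $k = 0$. Indeed, for all $k \ge 1$ let us consider the integral

\[ \int_{(\mathbb{R}^d)^{k+l}} f_h(y_1)\cdots f_h(y_{2k})\,(f_h(y_{2k+1}))^2\cdots(f_h(y_{k+l}))^2 \prod_{1\le i<j\le 2k} \frac{e^{\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{\gamma_1^2\omega_d}} \prod_{\substack{1\le i\le 2k \\ j>2k}} \frac{e^{2\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{2\gamma_1^2\omega_d}} \prod_{2k+1\le i<j\le k+l} \frac{e^{4\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{4\gamma_1^2\omega_d}} \, dy_1\cdots dy_{k+l} \le I_k J_{k,l}, \tag{3.18} \]

where we set

\[ I_k = \sup_{y_{2k+1},\ldots,y_{k+l}} \int_{(\mathbb{R}^d)^{2k}} |f_h(y_1)|\cdots|f_h(y_{2k})| \prod_{1\le i<j\le 2k} \frac{e^{\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{\gamma_1^2\omega_d}} \prod_{\substack{1\le i\le 2k \\ j>2k}} \frac{e^{2\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{2\gamma_1^2\omega_d}} \, dy_1\cdots dy_{2k} \]

and

\[ J_{k,l} = \int_{(\mathbb{R}^d)^{l-k}} (f_h(y_{2k+1}))^2\cdots(f_h(y_{k+l}))^2 \prod_{2k+1\le i<j\le k+l} \frac{e^{4\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{4\gamma_1^2\omega_d}} \, dy_{2k+1}\cdots dy_{k+l}. \]

By using the estimates (3.10), (3.14) and the inequalities (2.7), (2.8), one can show that for all $k \ge 1$, we have

\[ I_k J_{k,l} \le C R^{dk} \Big|\frac{h}{R}\Big|^{c_{k,l}}, \]

with

\[ c_{k,l} = (\alpha - 2(l-k)\gamma_1^2\omega_d)\wedge 1 + \big((\alpha - (2l-k)\gamma_1^2\omega_d)\wedge 1\big)(2k-1) + (2\alpha-d)(l-k) - 2\gamma_1^2\omega_d(l-k)(l-k-1). \]

If $\alpha - 2(l-k)\gamma_1^2\omega_d < 1$, then $c_{k,l} = \zeta_{2l} + k(d - \gamma_1^2\omega_d)$; if $\alpha - 2(l-k)\gamma_1^2\omega_d \ge 1$ and $\alpha - (2l-k)\gamma_1^2\omega_d < 1$, then $c_{k,l} = \zeta_{2l} + 1 - \alpha + dk + (2l-3k)\gamma_1^2\omega_d$; otherwise $c_{k,l} = 2k + (2\alpha-d)(l-k) - 2\gamma_1^2\omega_d(l-k)(l-k-1)$. In all cases, it is easy to show that $c_{k,l} > \zeta_{2l}$ under the conditions of the proposition.

Finally, we study the term where $k = 0$. We get for $h = \lambda e$ with $|e| = 1$:

\[ \int_{(\mathbb{R}^d)^l} (f_h(y_1))^2\cdots(f_h(y_l))^2 \prod_{1\le i<n\le l} \frac{e^{4\gamma_1^2\phi(\frac{y_i-y_n}{R})}}{|\frac{y_i-y_n}{R}|_*^{4\gamma_1^2\omega_d}} \, dy_1\cdots dy_l \]
\[ \underset{y_i=\lambda u_i}{=} \Big(\frac{\lambda}{R}\Big)^{l(2\alpha-d)} \int_{(\mathbb{R}^d)^l} \Big(\varphi\big(\tfrac{\lambda}{R}(u_1-e)\big)\frac{u_1^j-e^j}{|u_1-e|^{d-\alpha+1}} - \varphi\big(\tfrac{\lambda}{R}u_1\big)\frac{u_1^j}{|u_1|^{d-\alpha+1}}\Big)^2 \cdots \Big(\varphi\big(\tfrac{\lambda}{R}(u_l-e)\big)\frac{u_l^j-e^j}{|u_l-e|^{d-\alpha+1}} - \varphi\big(\tfrac{\lambda}{R}u_l\big)\frac{u_l^j}{|u_l|^{d-\alpha+1}}\Big)^2 \prod_{1\le i<n\le l} \frac{e^{4\gamma_1^2\phi(\frac{\lambda(u_i-u_n)}{R})}}{|\frac{\lambda(u_i-u_n)}{R}|_*^{4\gamma_1^2\omega_d}} \, du_1\cdots du_l \]
\[ \underset{\lambda\to 0}{\sim} e^{2l(l-1)\gamma_1^2\phi(0)} \Big(\frac{\lambda}{R}\Big)^{\zeta_{2l}} \int_{(\mathbb{R}^d)^l} \Big(\frac{u_1^j-e^j}{|u_1-e|^{d-\alpha+1}} - \frac{u_1^j}{|u_1|^{d-\alpha+1}}\Big)^2 \cdots \Big(\frac{u_l^j-e^j}{|u_l-e|^{d-\alpha+1}} - \frac{u_l^j}{|u_l|^{d-\alpha+1}}\Big)^2 \prod_{1\le i<n\le l} \frac{1}{|u_i-u_n|^{4\gamma_1^2\omega_d}} \, du_1\cdots du_l, \]

and inequality (2.7) shows that this integral is finite when $l\gamma_1^2\omega_d < \alpha - d/2$.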
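The comparison of exponents used in this proof can be checked mechanically. The following is only a sketch in exact rational arithmetic: the function names are ours, `g` stands for the product $\gamma_1^2\omega_d$, and the sample values ($d = 3$, $\alpha = 12/5$, $\gamma_1^2\omega_d = 89/1000$, $l = 10$) are illustrative assumptions chosen so that all three regimes of $c_{k,l}$ occur.

```python
from fractions import Fraction


def zeta_even(l, alpha, d, g):
    """Exponent zeta_{2l} of (3.17); g stands for gamma_1^2 * omega_d."""
    return l * (2 * alpha - d) - 2 * g * l * (l - 1)


def c_kl(k, l, alpha, d, g):
    """Exponent c_{k,l} bounding the term of index k >= 1 in the proof."""
    first = min(alpha - 2 * (l - k) * g, 1)
    second = min(alpha - (2 * l - k) * g, 1)
    return (first + second * (2 * k - 1)
            + (2 * alpha - d) * (l - k)
            - 2 * g * (l - k) * (l - k - 1))


def dominated_by_k0(l, alpha, d, g):
    """True if c_{k,l} > zeta_{2l} for every k = 1, ..., l."""
    z = zeta_even(l, alpha, d, g)
    return all(c_kl(k, l, alpha, d, g) > z for k in range(1, l + 1))


# Illustrative parameters (assumptions, not taken from the paper).
d, alpha, g, l = 3, Fraction(12, 5), Fraction(89, 1000), 10
print(zeta_even(l, alpha, d, g), dominated_by_k0(l, alpha, d, g))
```

For these values the terms $k = 1, 2$ fall in the first regime, $k = 3, 4$ in the second and $k \ge 5$ in the third, so the three closed forms for $c_{k,l}$ stated above are each exercised.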

In the next proposition, we state the scaling relations of $X$ along the odd integers. We define $I_l^j(e)$ by:

\[ I_l^j(e) = \int_{(\mathbb{R}^d)^l} \Big(\frac{u_1^j-e^j}{|u_1-e|^{d-\alpha+1}} - \frac{u_1^j}{|u_1|^{d-\alpha+1}}\Big)^2 \cdots \Big(\frac{u_l^j-e^j}{|u_l-e|^{d-\alpha+1}} - \frac{u_l^j}{|u_l|^{d-\alpha+1}}\Big)^2 \prod_{1\le i<n\le l} \frac{1}{|u_i-u_n|^{4\gamma_1^2\omega_d}} \, du_1\cdots du_l. \]

Proposition 3.6 (Scaling along the odd integers). Let $l$ be an integer satisfying the conditions in Proposition 3.5 and $1 + 2l\gamma_1^2\omega_d < \alpha$. Let $e$ be a unit vector ($|e| = 1$). Then we have the following scaling relation:

\[ E\big((X^j(x+\lambda e) - X^j(x))^{2l+1}\big) \underset{\lambda\to 0}{\sim} \gamma_0^*\, C(l,\gamma_1)\, I_l^j(e)\, e^j \Big(\frac{\lambda}{R}\Big)^{\zeta_{2l+1}}, \tag{3.19} \]

where we have

\[ \zeta_{2l+1} = l(2\alpha - d) - 2\gamma_1^2\omega_d\, l(l-1) + 1, \tag{3.20} \]

and $C(l,\gamma_1) > 0$.

2

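Proof. As in Proposition 3.2, setting $C = \frac{\gamma_0^* C_0 e^{-\frac{1}{2}\gamma_1^2 C_1}}{R^{d/2}}$, it is possible to show that:

\[ E\big((X^j(x+h) - X^j(x))^{2l+1}\big) = \sum_{k=0}^{l} \alpha_{k,l}\, C^{2k+1} \int_{(\mathbb{R}^d)^{k+l+1}} f_h(y_1)\cdots f_h(y_{2k+1})\,(f_h(y_{2k+2}))^2\cdots(f_h(y_{k+l+1}))^2 \prod_{1\le i<j\le 2k+1} \frac{e^{\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{\gamma_1^2\omega_d}} \prod_{\substack{1\le i\le 2k+1 \\ j>2k+1}} \frac{e^{2\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{2\gamma_1^2\omega_d}} \prod_{2k+2\le i<j\le k+l+1} \frac{e^{4\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{4\gamma_1^2\omega_d}} \, dy_1\cdots dy_{k+l+1}, \]

where, as usual, we set:

\[ f_h(y) = F_R^j(y) - F_R^j(y-h). \]

Similarly to Proposition 3.5, to get the main contribution as $|h|$ goes to 0, we examine the term $k = 0$. We introduce $I$:

\[ I = \int_{(\mathbb{R}^d)^{l+1}} f_h(y_1)\,(f_h(y_2))^2\cdots(f_h(y_{l+1}))^2 \prod_{j\ge 2} e^{2\gamma_1^2\rho(\frac{y_1-y_j}{R})} \prod_{2\le i<j\le l+1} \frac{e^{4\gamma_1^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{4\gamma_1^2\omega_d}} \, dy_1\cdots dy_{l+1}. \]

Putting $h = \lambda e$ ($|e| = 1$), $y_1 = Ru_1$, $y_i = \lambda u_i$ ($i \ge 2$), we get:

\[ I \underset{\lambda\to 0}{\sim} R^{d/2}\, e^{2l(l-1)\gamma_1^2\phi(0)} \Big(\frac{\lambda}{R}\Big)^{\zeta_{2l+1}} I_l^j(e)\; e\,.\!\int \nabla\psi^j(y)\, e^{2l\gamma_1^2\rho(y)} \, dy, \]

where $\psi^j(y) = \varphi(y)\frac{y^j}{|y|^{d-\alpha+1}}$. Now a calculation gives:

\[ \int \nabla\psi^j(y)\, e^{2l\gamma_1^2\rho(y)} \, dy = -\frac{2l\gamma_1^2\omega_d}{d} \Big( \int_0^{\infty} r^{\alpha-1}\varphi(r)\, e^{2l\gamma_1^2\rho(r)} \frac{d\rho}{dr} \, dr \Big) e_j, \]

the last integral being negative since $\rho(r)$ is a strictly decreasing function on the interval $]0, 2[$: the result follows. Notice that the condition $1 + 2l\gamma_1^2\omega_d < \alpha$ implies that this integral is finite while the other conditions imply that $I_l^j(e)$ is finite.

From Proposition 3.6, we deduce readily that for $\lambda$ small the law of $X(x+\lambda e) - X(x)$ is nonsymmetrical (for $\gamma_0^* \neq 0$). Indeed, by isotropy, we have:

\[ (X(x+\lambda e) - X(x)).e \overset{\text{law}}{=} X^j(x+\lambda e_j) - X^j(x) \]

and the coefficient in (3.19) for $e = e_j$ equals $\gamma_0^*\, C(l,\gamma_1)\, I_l^j(e_j) > 0$.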
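The odd exponent (3.20) and the hypotheses of Proposition 3.6 can be evaluated directly. The sketch below is ours (function names and sample values are assumptions, with `g` again denoting $\gamma_1^2\omega_d$); it only encodes the formulas stated above.

```python
from fractions import Fraction


def zeta_odd(l, alpha, d, g):
    """Exponent zeta_{2l+1} of (3.20); g stands for gamma_1^2 * omega_d."""
    return l * (2 * alpha - d) - 2 * g * l * (l - 1) + 1


def admissible_odd(l, alpha, d, g):
    """Hypotheses of Proposition 3.6: those of Proposition 3.5
    together with 1 + 2 l g < alpha."""
    m = l if l % 2 == 0 else l + 1
    return m * g < alpha - Fraction(d, 2) and 1 + 2 * l * g < alpha


# Illustrative values (assumptions): d = 3, alpha = 2, g = 1/10, l = 1.
print(zeta_odd(1, Fraction(2), 3, Fraction(1, 10)))
print(admissible_odd(1, Fraction(2), 3, Fraction(1, 10)))
```

With $l = 1$ the intermittency correction $2\gamma_1^2\omega_d\, l(l-1)$ vanishes, so in dimension $d = 3$ the third-order exponent reduces to $2\alpha - 2$ whatever the value of $\gamma_1$.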

3.4. Tightness of $X_\varepsilon$ and regularity of $X$. In this section, we prove that the convergence in law of $X_\varepsilon$ towards $X$ given by Proposition 3.2 holds in a functional sense and that the field $X$ is locally Hölderian. The straight way to do so is to prove the tightness of the sequence $X_\varepsilon$ by means of a Kolmogorov estimate (cf. Chapter 13 of [13]).

Proposition 3.7 (Tightness). Let $l$ be some positive integer that satisfies the condition of Proposition 3.5 and $\gamma$ a positive parameter such that $\gamma_1^2 < \gamma^2$. Then there exist $\varepsilon_0 > 0$ and $C$ independent of $\varepsilon$ such that for $\varepsilon < \varepsilon_0$ and $|h| \le R$:

\[ \forall x, \quad E\big((X_\varepsilon(x+h) - X_\varepsilon(x))^{2l}\big) \le C|h|^{l(2\alpha-d) - 2\gamma^2\omega_d\, l(l-1)}, \tag{3.21} \]

and

\[ E\big((X_\varepsilon(0))^{2l}\big) \le C. \tag{3.22} \]

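Proof. We only prove (3.21) (the proof of (3.22) is similar). We are going to compute the moment

\[ E\big((X^j_\varepsilon(x+h) - X^j_\varepsilon(x))^{2l}\big) = E\Big(\Big(\int_{\mathbb{R}^d} f_{\varepsilon,h}(y)\, e^{X_\varepsilon(y) - C_\varepsilon} \, dW_0(y)\Big)^{2l}\Big), \]

where we set:

\[ f_{\varepsilon,h}(y) = \frac{\varphi_R(y)\, y^j}{|y|_\varepsilon^{d-\alpha+1}} - \frac{\varphi_R(y-h)(y^j - h^j)}{|y-h|_\varepsilon^{d-\alpha+1}}. \]

We get:

\[ E\Big(\Big(\int_{\mathbb{R}^d} f_{\varepsilon,h}(y)\, e^{X_\varepsilon(y) - C_\varepsilon} \, dW_0(y)\Big)^{2l}\Big) = e^{-2lC_\varepsilon} \int f_{\varepsilon,h}(y_1)\cdots f_{\varepsilon,h}(y_{2l})\, E\big(e^{\hat X_\varepsilon}\, dW_0(y_1)\cdots dW_0(y_{2l})\big), \tag{3.23} \]

where $\hat X_\varepsilon = X_\varepsilon(y_1) + \cdots + X_\varepsilon(y_{2l})$. The rest of the computation can be performed rigorously by regularizing the white noise $dW_0$, using Lemma 2.2 and going to the limit. It is easy to see that we obtain the same result by introducing the following formal rules:

\[ E\big(dW_0(y)\, dW_0(y')\big) = \delta_{y-y'} \, dy \tag{3.24} \]

and

\[ E\big(dW_0(y)\, X_\varepsilon(y')\big) = \gamma_0(\varepsilon)\, k_R^\varepsilon(y' - y) \, dy. \tag{3.25} \]

As a consequence of Lemma 2.2, $E(e^{\hat X_\varepsilon}\, dW_0(y_1)\cdots dW_0(y_{2l}))$ is the sum of terms of the form

\[ E\big(dW_0(y_1)\hat X_\varepsilon\big)\cdots E\big(dW_0(y_k)\hat X_\varepsilon\big)\, E\big(dW_0(y_{k+1})\, dW_0(y_{k+2})\big)\cdots E\big(dW_0(y_{2l-1})\, dW_0(y_{2l})\big)\, e^{\frac{1}{2}E((\hat X_\varepsilon)^2)}. \tag{3.26} \]

We will compute the limit of each one of these terms. By using (3.25), we get

\[ E\big(dW_0(y_k)\hat X_\varepsilon\big) = \gamma_0(\varepsilon)\Big(\sum_{i=1}^{2l} k_R^\varepsilon(y_i - y_k)\Big) dy_k = \gamma_0(\varepsilon)\, k_R^\varepsilon(0)\,(1 + Q_k^\varepsilon)\, dy_k, \]

where $Q_k^\varepsilon = \frac{1}{k_R^\varepsilon(0)}\big(\sum_{i\neq k} k_R^\varepsilon(y_i - y_k)\big)$. We also have from the definition of $X_\varepsilon$:

\[ e^{\frac{1}{2}E((\hat X_\varepsilon)^2)} = e^{\big(l\rho_{\varepsilon/R}(0) + \sum_{i<j}\rho_{\varepsilon/R}(\frac{y_i-y_j}{R})\big)\,(\gamma_0(\varepsilon)^2 + \gamma_1^2)}. \]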

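By using Lemma 2.2, expression (3.23) and the rules above, we get:

\[ E\Big(\Big(\int_{\mathbb{R}^d} f_{\varepsilon,h}(y)\, e^{X_\varepsilon(y) - C_\varepsilon} \, dW_0(y)\Big)^{2l}\Big) = \sum_{k=0}^{l} \alpha_{k,l}\,(\gamma_0(\varepsilon))^{2k}\,(k_R^\varepsilon(0))^{2k}\, e^{(2l-k)((\gamma_0(\varepsilon))^2 + \gamma_1^2)\rho_{\varepsilon/R}(0) - 2lC_\varepsilon} \int_{(\mathbb{R}^d)^{k+l}} f_{\varepsilon,h}(y_1)\cdots f_{\varepsilon,h}(y_{2k})\,(f_{\varepsilon,h}(y_{2k+1}))^2\cdots(f_{\varepsilon,h}(y_{k+l}))^2 \prod_{i=1}^{2k}(1 + Q^\varepsilon_{i,k,l})\, e^{S^\varepsilon_{k,l}} \, dy_1\cdots dy_{k+l}, \]

where

\[ Q^\varepsilon_{i,k,l} = \frac{1}{k_R^\varepsilon(0)}\Big(\sum_{\substack{1\le j\le 2k \\ j\neq i}} k_R^\varepsilon(y_i - y_j) + 2\sum_{j>2k} k_R^\varepsilon(y_i - y_j)\Big) \]

and

\[ S^\varepsilon_{k,l} = \big((\gamma_0(\varepsilon))^2 + \gamma_1^2\big)\Big(\sum_{1\le i<j\le 2k} \rho_{\varepsilon/R}\big(\tfrac{y_i-y_j}{R}\big) + 2\sum_{\substack{1\le i\le 2k \\ j>2k}} \rho_{\varepsilon/R}\big(\tfrac{y_i-y_j}{R}\big) + 4\sum_{2k+1\le i<j\le k+l} \rho_{\varepsilon/R}\big(\tfrac{y_i-y_j}{R}\big)\Big). \]

We first take care of the normalizing constant outside each integral:

\[ \big(\gamma_0(\varepsilon)\, k_R^\varepsilon(0)\, e^{-\frac{1}{2}((\gamma_0(\varepsilon))^2 + \gamma_1^2)\rho_{\varepsilon/R}(0)}\big)^{2k}\, e^{2l((\gamma_0(\varepsilon))^2 + \gamma_1^2)\rho_{\varepsilon/R}(0) - 2lC_\varepsilon}. \]

By the choice of $C_\varepsilon$, we have $e^{2l((\gamma_0(\varepsilon))^2 + \gamma_1^2)\rho_{\varepsilon/R}(0) - 2lC_\varepsilon} = 1$. Using expansions (2.4) and (2.5), we derive the following limit:

\[ \gamma_0(\varepsilon)\, k_R^\varepsilon(0)\, e^{-\frac{1}{2}((\gamma_0(\varepsilon))^2 + \gamma_1^2)\rho_{\varepsilon/R}(0)} \underset{\varepsilon\to 0}{\to} \frac{\gamma_0^* C_0 e^{-\frac{1}{2}\gamma_1^2 C_1}}{R^{d/2}}. \]

In conclusion, the constant outside the integral of term $k$ in the above sum converges to $\alpha_{k,l}\big(\frac{\gamma_0^* C_0 e^{-\frac{1}{2}\gamma_1^2 C_1}}{R^{d/2}}\big)^{2k}$. Let $\gamma$ be such that $\gamma_1^2 < \gamma^2$. One can choose $\varepsilon_0 > 0$ such that $(\gamma_0(\varepsilon_0))^2 + \gamma_1^2 < \gamma^2$. Using the fact that, for all $y$, $\rho_{\varepsilon/R}(y/R) \le \omega_d \ln^+\frac{R}{|y|} + C$ with $C$ independent of $\varepsilon$, we get:

\[ e^{S^\varepsilon_{k,l}} \le C \prod_{1\le i<j\le 2k} \frac{1}{|\frac{y_i-y_j}{R}|_*^{\gamma^2\omega_d}} \prod_{\substack{1\le i\le 2k \\ j>2k}} \frac{1}{|\frac{y_i-y_j}{R}|_*^{2\gamma^2\omega_d}} \prod_{2k+1\le i<j\le k+l} \frac{1}{|\frac{y_i-y_j}{R}|_*^{4\gamma^2\omega_d}}. \tag{3.27} \]

Finally, we conclude by using the fact that $|Q^\varepsilon_{i,k,l}|$ is bounded by a constant independent of $\varepsilon$, and inequalities (3.10) and (3.14), similarly as in the proof of Proposition 3.5.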

Corollary 3.8. One can easily deduce from this that for $\gamma_1^2$ sufficiently small, by Kolmogorov's compactness theorem ([13]), $X_\varepsilon$ tends to $X$ in the functional sense and that $X$ is locally Hölderian.

Comment 3.9. Starting with a two parameter $(R, \alpha)$ monofractal Gaussian field, we constructed a four parameter $(R, \alpha, \gamma_1, \gamma_0^*)$ multifractal field with nonsymmetrical increments. This family has its own interest. As we shall see in the next section, this family is too restricted to take into account all the constraints needed for a satisfactory model of turbulent flows. In the case where $\gamma_0^* = 0$, we obtain symmetrical random fields which extend to higher dimensions the model introduced in [2]. In the next section, we will study a multifractal field which is not in this family but that can be seen as a limit case where $\gamma_1 = 0$ and $\gamma_0$ is constant (independent of $\varepsilon$). As we will see, this family will be compatible with the 4/5-law.

4. A Step Towards a Model of the Velocity Field of Turbulent Flows

An acceptable solution to the problem of hydrodynamical turbulence in dimension 3 would be to construct a random velocity field $U$ solution to the dynamics (Euler or Navier-Stokes typically) that is stationary, incompressible, space-homogeneous, isotropic and that satisfies the main statistical properties of the velocity field of turbulent flows. Two main properties are:


(1) The 4/5-law of Kolmogorov that links the energy dissipation of the turbulent flow to the statistics of the increments of the velocity. This law is widely accepted since it is the only one that can be proven with the dynamics ([6,7,14]). More precisely, this law states:

\[ E\Big(\Big((U(x+\xi) - U(x)).\frac{\xi}{|\xi|}\Big)^3\Big) = -\frac{4}{5} D|\xi|. \tag{4.1} \]

In the above formula, $D$ denotes the average dissipation of the kinetic energy per unit mass in the fluid.

Remark 4.1. To obtain this law, it is sufficient to suppose that the field $U$ is space homogeneous and isotropic.

(2) The intermittency of the field $U$:

\[ E\Big(\Big((U(x+\xi) - U(x)).\frac{\xi}{|\xi|}\Big)^q\Big) \underset{|\xi|\to 0}{\sim} C_q|\xi|^{\zeta_q}, \tag{4.2} \]

where $\zeta_q$ is a well known concave structure function (cf. [7]).

It is a very challenging task to construct a field with all the aforementioned properties, especially because this field must be invariant by the Euler or Navier-Stokes equation. Nevertheless, one can in the first place forget the invariance by the dynamics and simply try to construct a field that satisfies the other properties. The 4/5-law shows that the nonsymmetry of the increments is an essential feature. Let us consider the family $X$ constructed in the previous section ($d = 3$). By Proposition 3.6, we have:

\[ E\big(((X(x+\lambda e) - X(x)).e)^3\big) \underset{\lambda\to 0}{\sim} C_3 \Big(\frac{\lambda}{R}\Big)^{\zeta_3}, \quad C_3 > 0, \]

with $\zeta_3 = 2\alpha - 2$. To satisfy the 4/5 law one should have $\zeta_3 = 1$, which gives $\alpha = 3/2$. This is incompatible with the constraint $3/2 < \alpha < 5/2$. Thus we now have to modify the family $X$ to reach the limit case $\zeta_3 = 1$. To this end, we will construct a new (three parameter) family $X_0$ corresponding to the limit case $\gamma_1 = 0$, $\gamma_0$ constant $> 0$.

4.1. Construction of the field $X_0$. In this section, we only outline the main steps of the construction of $X_0$. The field $X_{0,\varepsilon}$ is given by formula (2.3) where $X_\varepsilon$ is now defined by:

\[ X_\varepsilon(y) = \gamma_0 \int_{\mathbb{R}^d} k_R^\varepsilon(y - \sigma) \, dW_0(\sigma). \]

We suppose that $\alpha$ is in the interval $]0, 1[$. We choose the normalizing constant $C_\varepsilon$ such that:

\[ \gamma_0\, k_R^\varepsilon(0)\, e^{-C_\varepsilon + \frac{1}{2}\gamma_0^2\rho_{\varepsilon/R}(0)} = 1. \]

We start by stating a lemma we will use in the proof of the proposition below:

Lemma 4.2. Let $\delta$ be some real number different from $d$. Then there exists $C = C(\delta) > 0$ with:

\[ \int_{|u|\le R} \frac{du}{|u|_\varepsilon^{\delta}} \le C\varepsilon^{(d-\delta)\wedge 0}. \tag{4.3} \]

Proof. We suppose $\delta > d$, the other case being obvious. After the change of variable $u \to \varepsilon u$, we have:

\[ \int_{|u|\le R} \frac{du}{|u|_\varepsilon^{\delta}} = \varepsilon^{d-\delta} \int_{|u|\le R/\varepsilon} \frac{du}{\big(\int_{|v|\le 1} \theta(v)|v+u| \, dv\big)^{\delta}} \le \varepsilon^{d-\delta} \int_{\mathbb{R}^d} \frac{du}{\big(\int_{|v|\le 1} \theta(v)|v+u| \, dv\big)^{\delta}}. \]

We can now state the following proposition:

Proposition 4.3. Let $l$ be an integer $\ge 1$ and $\gamma_0$ such that:
(1) $\gamma_0^2\omega_d < \alpha$ if $l = 1$.
(2) $(2l - 3/2)\gamma_0^2\omega_d < \alpha \wedge d/2$ if $l > 1$.
Then for all $x$, $X_{0,\varepsilon}(x)$ converges in $L^{2l}$ to a random vector $X_0(x)$ (i.e. $E((X_{0,\varepsilon}(x) - X_0(x))^{2l}) \to 0$). The random vector field $X_0(x)$ satisfies the following scaling: for $e$ ($|e| = 1$), and for all $q \le 2l$:

\[ E\big((X_0^j(x+\lambda e) - X_0^j(x))^q\big) \underset{\lambda\to 0}{\sim} C_q^j(e) \Big(\frac{\lambda}{R}\Big)^{\zeta_q}, \tag{4.4} \]

where $\zeta_q = q\alpha - \frac{1}{2}q(q-1)\gamma_0^2\omega_d$ and

\[ C_q^j(e) = R^{qd/2}\, e^{\frac{q(q-1)}{2}\gamma_0^2\phi(0)} \int_{(\mathbb{R}^d)^q} \prod_{1\le i<j\le q} \frac{1}{|u_i - u_j|^{\gamma_0^2\omega_d}} \prod_{1\le i\le q} \Big(\frac{u_i^j}{|u_i|^{d-\alpha+1}} - \frac{u_i^j - e^j}{|u_i - e|^{d-\alpha+1}}\Big) \, du_1\cdots du_q. \tag{4.5} \]

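Proof. We will first prove that:

\[ E\big((X^j_{0,\varepsilon}(x))^{2l}\big) \underset{\varepsilon\to 0}{\to} \int_{(\mathbb{R}^d)^{2l}} \prod_{1\le i<j\le 2l} \frac{e^{\gamma_0^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{\gamma_0^2\omega_d}} \prod_{1\le i\le 2l} \frac{\varphi_R(y_i)\, y_i^j}{|y_i|^{d-\alpha+1}} \, dy_1\cdots dy_{2l}. \tag{4.6} \]

We recall that the right-hand side of the above limit exists by Lemma 2.4. In order to prove the above relation, we develop $E((X^j_{0,\varepsilon}(x))^{2l})$ in $l + 1$ terms similarly as in the proof of Proposition 3.7; then, using formula (2.4) and the fact that, for all $y$, $\rho_{\varepsilon/R}(y) \le \omega_d \ln\frac{R}{\varepsilon} + C$, we are led to show that, for all $k \le l - 1$, we have the following convergence:

\[ \varepsilon^{(l-k)(d-\gamma_0^2\omega_d) - 2(l-k)(l-k-1)\gamma_0^2\omega_d - 4k(l-k)\gamma_0^2\omega_d} \int_{(\mathbb{R}^d)^{k+l}} \prod_{1\le i<j\le 2k} \frac{e^{\gamma_0^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{\gamma_0^2\omega_d}} \; \frac{\varphi_R(y_1)\, y_1^j}{|y_1|_\varepsilon^{d-\alpha+1}} \cdots \frac{\varphi_R(y_{2k})\, y_{2k}^j}{|y_{2k}|_\varepsilon^{d-\alpha+1}} \times \frac{(\varphi_R(y_{2k+1})\, y_{2k+1}^j)^2}{|y_{2k+1}|_\varepsilon^{2(d-\alpha+1)}} \cdots \frac{(\varphi_R(y_{k+l})\, y_{k+l}^j)^2}{|y_{k+l}|_\varepsilon^{2(d-\alpha+1)}} \, dy_1\cdots dy_{k+l} \underset{\varepsilon\to 0}{\to} 0. \]

We apply inequality (4.3) and obtain (if $\alpha = d/2$, one can work with $\alpha - \eta$ for $\eta > 0$ sufficiently small):

\[ \int_{\mathbb{R}^d} \frac{\varphi_R(y)^2}{|y|_\varepsilon^{2(d-\alpha)}} \, dy \le C\varepsilon^{(2\alpha-d)\wedge 0}. \]

Therefore the above convergence to 0 amounts to showing that, for all $k \le l - 1$, we have the following inequality:

\[ d + (2\alpha - d)\wedge 0 - \gamma_0^2\omega_d > 2(l-k-1)\gamma_0^2\omega_d + 4k\gamma_0^2\omega_d. \]

This is equivalent to $(2l - \frac{3}{2})\gamma_0^2\omega_d < \alpha \wedge \frac{d}{2}$. One can show, for all $x$, that $(X^j_{0,\varepsilon}(x))_{\varepsilon>0}$ is a Cauchy sequence in $L^{2l}$ by computing $E((X^j_{0,\varepsilon}(x) - X^j_{0,\varepsilon'}(x))^{2l})$ and letting $\varepsilon, \varepsilon'$ go to 0. Thus, $E((X_0^j(x))^{2l})$ is given by the limit in (4.6). To show the scaling (4.4), observe that we can prove the following analogue to (4.6) for any $q \le 2l$:

\[ E\big((X_0^j(x+\lambda e) - X_0^j(x))^q\big) = \int_{(\mathbb{R}^d)^q} \prod_{1\le i<j\le q} \frac{e^{\gamma_0^2\phi(\frac{y_i-y_j}{R})}}{|\frac{y_i-y_j}{R}|_*^{\gamma_0^2\omega_d}} \prod_{1\le i\le q} f_{\lambda e}(y_i) \, dy_1\cdots dy_q, \tag{4.7} \]

where

\[ f_{\lambda e}(y) = \frac{\varphi_R(y)\, y^j}{|y|^{d-\alpha+1}} - \frac{\varphi_R(y - \lambda e)(y^j - \lambda e^j)}{|y - \lambda e|^{d-\alpha+1}}. \tag{4.8} \]

By setting $y_i = \lambda u_i$ in the integral of (4.7), we deduce easily (4.4).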
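The reduction of the family of inequalities indexed by $k \le l - 1$ to the single condition $(2l - \frac{3}{2})\gamma_0^2\omega_d < \alpha \wedge \frac{d}{2}$ can be confirmed on a grid of rational parameters. This is a sketch with hypothetical function names; `g` stands for $\gamma_0^2\omega_d$ and the grid is an arbitrary illustrative choice.

```python
from fractions import Fraction


def zeta_q(q, alpha, g):
    """Exponent zeta_q = q*alpha - q(q-1)g/2 of Proposition 4.3."""
    return q * alpha - Fraction(1, 2) * q * (q - 1) * g


def inequalities_hold(l, alpha, d, g):
    """The inequality of the proof, for every k = 0, ..., l - 1."""
    lhs = d + min(2 * alpha - d, 0) - g
    return all(lhs > 2 * (l - k - 1) * g + 4 * k * g for k in range(l))


def closed_form_holds(l, alpha, d, g):
    """The equivalent condition (2l - 3/2) g < min(alpha, d/2)."""
    return (2 * l - Fraction(3, 2)) * g < min(alpha, Fraction(d, 2))


def agree_on_grid(d=3, max_l=4):
    """Compare both conditions for alpha in ]0,1[ and g in [0,2]."""
    return all(
        inequalities_hold(l, Fraction(i, 10), d, Fraction(j, 20))
        == closed_form_holds(l, Fraction(i, 10), d, Fraction(j, 20))
        for l in range(1, max_l + 1)
        for i in range(1, 10)
        for j in range(0, 41)
    )
```

The binding case is $k = l - 1$, where the right-hand side equals $(4l - 4)\gamma_0^2\omega_d$; dividing the resulting inequality by 2 gives the closed form.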

Remark 4.4. It is not obvious why in the above proposition the coefficients $C_q^j(e)$ are different from 0 (cf. the Appendix).

Remark 4.5. Similarly as in the previous section, for $\gamma_0$ sufficiently small, $X_{0,\varepsilon}$ converges in law to $X_0$ in the space of continuous fields.

4.2. Nonsymmetry of the increments of $X_0$. By isotropy, we have:

\[ (X_0(x+\lambda e) - X_0(x)).e \overset{\text{law}}{=} X_0^j(x+\lambda e_j) - X_0^j(x). \]

One can show that for $\lambda$ small the law is nonsymmetrical by showing that the third moment is $\neq 0$, that is $C_3^j(e_j) \neq 0$ (see the Appendix).

4.3. Towards a model of the turbulent velocity field. In dimension 3, for the field $X_0$, we have: $\zeta_q = q\alpha - 2\pi q(q-1)\gamma_0^2$. Thus, for $\alpha = 1/3 + 4\pi\gamma_0^2$, we have $\zeta_3 = 1$, which means that for this choice the associated fields $X_0$ satisfy at small scale the 4/5 law with a non zero finite dissipation. Unfortunately, the fields in this family are not incompressible. The incompressible case (at small scale) would correspond to the choice $\alpha = 1$, a limit case which is excluded by the constraint $0 < \alpha < 1$ needed for the validity of the scaling of Proposition 4.3. There is another severe obstacle for the choice $\alpha = 1$. Indeed, for the field $X_0$ above, we have: $\zeta_q = (1/3 + 6\pi\gamma_0^2)q - 2\pi\gamma_0^2 q^2$. One can easily identify the intermittency parameter $4\pi\gamma_0^2$ using the experimental curve given in [1] (cf. Fig. 8.8, p. 132 in [7]). With this data, we find $4\pi\gamma_0^2 = 0.023$. With this small intermittency parameter, we would get $\alpha \sim 0.35$, which is not close to the incompressible value $\alpha = 1$. So, in spite of its qualitative interest, this model cannot reach quantitative adequacy.

Another natural way to get incompressible fields is to use a Biot-Savart like formula and take the limit as $\varepsilon$ goes to 0 of fields obtained by integrating the kernel $\varphi_R(x-y)\frac{x-y}{|x-y|_\varepsilon^{d-\alpha+1}}$, through a vector product, against an isotropic random field. For example, we can take:

\[ U_\varepsilon(x) = \int_{\mathbb{R}^3} \varphi_R(x-y)\, \frac{x-y}{|x-y|_\varepsilon^{d-\alpha+1}} \wedge e^{X_\varepsilon(y) - C_\varepsilon} \, dW(y), \]

where $dW(y) = (dW_1(y), dW_2(y), dW_3(y))$ denotes a three dimensional white noise and $X_\varepsilon$ is defined by the following formula:

\[ X_\varepsilon(y) = \gamma \int_{\mathbb{R}^3} K_R^\varepsilon(y - \sigma).dW(\sigma), \]

with $K_R^\varepsilon(x) = \frac{x}{|x|_\varepsilon^{1+d/2}}\, 1_{|x|\le R}$. As for the case of $X_0$, we choose the constant $C_\varepsilon$ such that $U_\varepsilon$ converges to a non trivial field $U$ as $\varepsilon$ goes to zero. The vector field $U$ we obtain is incompressible, homogeneous, isotropic and intermittent with structural exponents $\zeta_q$ given by:

\[ \zeta_q = q\alpha - 2\pi\gamma^2 q(q-1). \]

Unfortunately, since the field $e^{X_\varepsilon(y) - C_\varepsilon}\, dW(y)$ is isotropic with respect to all unitary transformations (and not just the rotations) we get for $U$ the symmetry:

\[ U(-x) - U(0) \overset{\text{law}}{=} U(x) - U(0), \]

so that the dissipation is equal to 0. Thus the construction of a homogeneous, isotropic, intermittent and incompressible vector field with positive finite dissipation remains an open question.
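The numbers quoted in this subsection for the family $X_0$ in dimension 3 follow from the formula $\zeta_q = q\alpha - 2\pi q(q-1)\gamma_0^2$. A minimal numerical sketch (function names are ours; the value 0.023 for $4\pi\gamma_0^2$ is the one read off the experimental curve as stated in the text):

```python
import math


def zeta_q_3d(q, alpha, gamma0_sq):
    """zeta_q = q*alpha - 2*pi*q*(q-1)*gamma_0^2 for X_0 in dimension 3."""
    return q * alpha - 2 * math.pi * q * (q - 1) * gamma0_sq


def alpha_for_four_fifths_law(gamma0_sq):
    """Value alpha = 1/3 + 4*pi*gamma_0^2 for which zeta_3 = 1."""
    return 1 / 3 + 4 * math.pi * gamma0_sq


# Intermittency parameter 4*pi*gamma_0^2 = 0.023 quoted in the text.
gamma0_sq = 0.023 / (4 * math.pi)
alpha = alpha_for_four_fifths_law(gamma0_sq)
print(alpha, zeta_q_3d(3, alpha, gamma0_sq))
```

The resulting $\alpha$ lies well inside $]0, 1[$, far from the incompressible limit $\alpha = 1$, which is the quantitative obstruction described above.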


Comment 4.6. In our approach, we perturb a Gaussian field to get multifractality and we further introduce some dependency to obtain also dissymmetry. This can make one think that in turbulence dissipation is linked to intermittency. This is a rather intricate issue. On one hand, only dissymmetry seems to be needed to get energy dissipation (see the 4/5 law). On the other hand, it is well known experimentally that dissipation is not homogeneously distributed in the fluid but rather follows the lognormal distribution described by Kolmogorov and Obukhov by which it appears linked to intermittency. In dimension $d = 1$, our model displays some kind of (non causal) leverage effect. To get a realistic model for finance, with causal leverage effect, we have to make some specific change in the construction. This issue will be addressed in a forthcoming paper.

5. APPENDIX

In this Appendix, we prove that for $q$ even $C_q^1(e)$, given by Eq. (4.5), is different from 0 outside a countable set and in the neighbourhood of 0. We also show the same result for $C_3^j(e_j)$. Consider first the case $q$ even; we set $q = 2l$ with $l$ greater than or equal to 1 and we introduce the following function $F$:

\[ F(\gamma) = \int_{(\mathbb{R}^d)^{2l}} \prod_{1\le i<j\le 2l} \frac{1}{|u_i - u_j|^{\gamma}} \prod_{1\le i\le 2l} f(u_i) \, du_1\cdots du_{2l}, \]

where $f$ is the real function defined by:

\[ f(u) = \frac{u^j + e^j/2}{|u + e/2|^{d-\alpha+1}} - \frac{u^j - e^j/2}{|u - e/2|^{d-\alpha+1}}. \]

The function $F$ is analytical in a neighbourhood of 0; therefore, in order to obtain the desired result, we have to prove that $F$ is not identically equal to 0. One can show that, for all $i < l$, $F^{(i)}(0) = 0$ and that:

\[ F^{(l)}(0) = \frac{(2l)!}{2^l} \Big( \int_{(\mathbb{R}^d)^2} \ln\Big(\frac{1}{|u_1 - u_2|}\Big) f(u_1) f(u_2) \, du_1 du_2 \Big)^l. \]

The Fourier transform of $\ln(\frac{1}{|u|})$ is $a_d\, \mathrm{Pf}(\frac{1}{|\xi|^d}) + b_d\delta_0$, where $a_d > 0$ and $b_d$ are two constants that depend only on the dimension and $\mathrm{Pf}$ is Hadamard's finite part (see p. 258 in [15]). Since $\int_{\mathbb{R}^d} f(u) \, du = 0$, we get:

\[ \int_{(\mathbb{R}^d)^2} \ln\Big(\frac{1}{|u_1 - u_2|}\Big) f(u_1) f(u_2) \, du_1 du_2 = a_d \int_{\mathbb{R}^d} \frac{\hat f(\xi)^2}{|\xi|^d} \, d\xi, \]

thus $F^{(l)}(0) > 0$.

Let us now consider $C_3^j(e_j)$ and the corresponding function:

\[ F(\gamma) = \int_{(\mathbb{R}^d)^3} \frac{1}{|u_1 - u_2|^{\gamma} |u_1 - u_3|^{\gamma} |u_2 - u_3|^{\gamma}} f(u_1) f(u_2) f(u_3) \, du_1 du_2 du_3. \]

We obviously have $F(0) = 0$, $F'(0) = 0$ and:

\[ \frac{1}{6} F''(0) = I = \int_{(\mathbb{R}^d)^3} \ln(|u_1 - u_2|) \ln(|u_1 - u_3|) f(u_1) f(u_2) f(u_3) \, du_1 du_2 du_3, \]

so that $I = \int_{\mathbb{R}^d} f(x)\, \big((\ln|\cdot| * f)(x)\big)^2 \, dx$, where

\[ (\ln|\cdot| * f)(x) = \int_{\mathbb{R}^d} \ln(|x - y|) f(y) \, dy. \]

Now we prove that there exists some real constant $c$ different from 0 such that:

\[ (\ln|\cdot| * f)(x) = c \Big( \frac{x^j + 1/2}{|x + e_j/2|^{1-\alpha}} - \frac{x^j - 1/2}{|x - e_j/2|^{1-\alpha}} \Big). \tag{5.1} \]

Indeed, we have (in what follows, $c$ denotes different real constants that are not equal to 0):

\[ \widehat{\ln|\cdot| * f}(\xi) = c\, \frac{\hat f(\xi)}{|\xi|^d} \]

and $\hat f(\xi) = c \sin(\pi\xi_j) \frac{\xi_j}{|\xi|^{\alpha+1}}$, thus

\[ \widehat{\ln|\cdot| * f}(\xi) = c \sin(\pi\xi_j) \frac{\xi_j}{|\xi|^{d+\alpha+1}}. \]

The above expression (5.1) now follows from:

\[ \widehat{\Big(\frac{x_j}{|x|^{1-\alpha}}\Big)}(\xi) = c\, i\, \frac{\xi_j}{|\xi|^{d+\alpha+1}} \]

and $2i \sin(\pi\xi_j) = \hat\delta_{-e_j/2}(\xi) - \hat\delta_{e_j/2}(\xi)$. Now let us denote $x = x^j e_j + \bar x$ and

\[ \phi(p, y, a) = \frac{y + 1/2}{((y + 1/2)^2 + a)^p} - \frac{y - 1/2}{((y - 1/2)^2 + a)^p}. \]

If we set:

\[ \varphi(x^j) = \phi\Big(\frac{d - \alpha + 1}{2}, x^j, |\bar x|^2\Big) \]

and

\[ \psi(x^j) = \phi\Big(\frac{1 - \alpha}{2}, x^j, |\bar x|^2\Big), \]

we get:

\[ I = c^2 \int_{\mathbb{R}^{d-1}} d\bar x \int_{\mathbb{R}} \varphi(x^j)\, \psi(x^j)^2 \, dx^j. \]

Since $0 < \alpha < 1$, it is easy to check that $\psi(z)$ is a positive function of $z$, decreasing on $[0, \infty[$. One can also check that there exists some $z^* > 1/2$ such that $\varphi(z)$ is positive on $[0, z^*[$ and negative on $]z^*, \infty[$. Since $\varphi$ and $\psi$ are even and $\int_0^\infty \varphi(z) \, dz = 0$, one can derive the following:

\[ \int_{\mathbb{R}} \varphi(z)\psi(z)^2 \, dz = 2\int_0^\infty \varphi(z)\psi(z)^2 \, dz = 2\int_0^{z^*} \varphi(z)\psi(z)^2 \, dz + 2\int_{z^*}^\infty \varphi(z)\psi(z)^2 \, dz \ge 2\int_0^{z^*} \varphi(z)\psi(z)^2 \, dz + 2\psi(z^*)^2 \int_{z^*}^\infty \varphi(z) \, dz = 2\int_0^{z^*} \varphi(z)\big(\psi(z)^2 - \psi(z^*)^2\big) \, dz > 0. \]

It follows that $F''(0) > 0$.
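The one-dimensional facts used at the end of the Appendix (evenness, the sign change of the first profile, the vanishing of its integral over the half-line, and the positivity of the weighted integral) can be explored numerically. The sketch below is only illustrative: the values $d = 3$, $\alpha = 0.5$ and transverse squared norm $a = 1$, the cutoff at $z = 50$ and the midpoint rule are all our assumptions, not part of the proof.

```python
def phi(p, y, a):
    """The function phi(p, y, a) of the Appendix."""
    return ((y + 0.5) / ((y + 0.5) ** 2 + a) ** p
            - (y - 0.5) / ((y - 0.5) ** 2 + a) ** p)


def midpoint(g, lo, hi, n=100_000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


# Illustrative parameters (assumptions): dimension, exponent, and the
# squared norm of the transverse part of x.
D, ALPHA, A = 3, 0.5, 1.0


def varphi(z):
    """Profile with p = (d - alpha + 1) / 2; changes sign once on z > 0."""
    return phi((D - ALPHA + 1) / 2, z, A)


def psi(z):
    """Profile with p = (1 - alpha) / 2; positive and decreasing on z >= 0."""
    return phi((1 - ALPHA) / 2, z, A)


def half_line_integrals(cutoff=50.0):
    """Integrals over [0, cutoff] of varphi and of varphi * psi^2."""
    mass = midpoint(varphi, 0.0, cutoff)
    weighted = midpoint(lambda z: varphi(z) * psi(z) ** 2, 0.0, cutoff)
    return mass, weighted
```

Calling `half_line_integrals()` returns a first value that is small (it tends to 0 as the cutoff grows) and a second value that is positive, in line with the chain of inequalities above.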

References
1. Anselmet, F., Antonia, R.A., Gagne, Y., Hopfinger, E.J.: High-order velocity structure functions in turbulent shear flow. J. Fluid Mech. 140, 63–89 (1984)
2. Bacry, E., Delour, J., Muzy, J.-F.: Modelling financial time series using multifractal random walks. Physica A 299, 84–92 (2001)
3. Bacry, E., Muzy, J.-F.: Log-infinitely divisible multifractal processes. Commun. Math. Phys. 236, 449–475 (2003)
4. Bacry, E., Kozhemyak, A., Muzy, J.-F.: Continuous cascade models for asset returns. J. Econ. Dyn. Cont. 32(1), 156–199 (2008)
5. Daley, D.J., Vere-Jones, D.: An Introduction to the Theory of Point Processes. Berlin-Heidelberg-New York: Springer-Verlag, 1988
6. Duchon, J., Robert, R.: Inertial energy dissipation for weak solutions of incompressible Euler and Navier-Stokes equations. Nonlinearity 13(1), 249–255 (2000)
7. Frisch, U.: Turbulence. Cambridge: Cambridge University Press, 1995
8. Kahane, J.-P.: Sur le chaos multiplicatif. Ann. Sci. Math. Québec 9(2), 105–150 (1985)
9. Kolmogorov, A.N.: A refinement of previous hypotheses concerning the local structure of turbulence. J. Fluid Mech. 13, 83–85 (1962)
10. Mandelbrot, B.B.: A possible refinement of the lognormal hypothesis concerning the distribution of energy in intermittent turbulence. In: Statistical Models and Turbulence (La Jolla, CA), Lecture Notes in Phys. no. 12, Berlin-Heidelberg-New York: Springer, pp. 333–335, 1972
11. Nualart, D.: The Malliavin Calculus and Related Topics. Berlin-Heidelberg-New York: Springer-Verlag, 1995
12. Obukhov, A.M.: Some specific features of atmospheric turbulence. J. Fluid Mech. 13, 77–81 (1962)
13. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Berlin-Heidelberg-New York: Springer, 2005
14. Robert, R.: Mathématiques et turbulence. Images des mathématiques, CNRS, 91–100 (2004)
15. Schwartz, L.: Théorie des distributions. Paris: Hermann, 1997

Communicated by A. Kupiainen

Commun. Math. Phys. 284, 675–712 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0601-7

Communications in

Mathematical Physics

Monopoles and Clusters Roger Bielawski1,2 1 School of Mathematics, University of Leeds, Leeds LS2 9JT, UK 2 Mathematisches Institut, Universität Göttingen, Göttingen 37073, Germany.

E-mail: [email protected] Received: 1 May 2007 / Accepted: 7 May 2008 Published online: 26 September 2008 – © Springer-Verlag 2008

Abstract: We define and study certain hyperkähler manifolds which capture the asymptotic behaviour of the SU(2)-monopole metric in regions where monopoles break down into monopoles of lower charges. The rate at which these new metrics approximate the monopole metric is exponential, as for the Gibbons-Manton metric.

1. Introduction

The moduli space $M_n$ of framed SU(2)-monopoles of charge $n$ on $\mathbb{R}^3$ is a complete Riemannian manifold, topological infinity of which corresponds to monopoles of charge $n$ breaking down into monopoles of lower charges. This asymptotic picture is given in Proposition (3.8) in [3] which we restate here:

Proposition 1.1. Given an infinite sequence of points of $M_n$, there exists a subsequence $m_r$, a partition $n = \sum_{i=1}^{s} n_i$ with $n_i > 0$, a sequence of points $x_r^i \in \mathbb{R}^3$, $i = 1, \ldots, s$, such that
(i) the sequence $m_r^i$ of monopoles translated by $-x_r^i$ converges weakly to a monopole of charge $n_i$ with centre at the origin;
(ii) as $r \to \infty$, the distances between any pair of points $x_r^i$, $x_r^j$ tend to $\infty$ and the direction of the line $x_r^i x_r^j$ converges to a fixed direction.

We can think of clusters of charge $n_i$ with centres at $x_r^i$ receding from one another in definite directions. The aim of this paper is to capture this asymptotic picture in metric terms. Observe that the above description, which leads to the asymptotic metric being the product metric on $\prod_i M_{n_i}$, is valid only at infinity. It ignores the interaction of clusters at finite distance from each other, e.g. the relative electric charges arising from their motion. A physically meaningful description of the asymptotic metric should take into consideration the contributions made by this interaction. Such an asymptotic metric, governing the motion of dyons, was found by Gibbons and Manton [16] in the case when all $n_i$ are 1, i.e. a monopole breaks down into particles. It was then shown in [8] that this metric is an exponentially good approximation to the monopole metric in the corresponding asymptotic region.

Our aim is to generalise this to clusters of arbitrary charges. For any partition $n = \sum_{i=1}^{s} n_i$ with $n_i > 0$ we define a space of (framed) clusters $M_{n_1,\ldots,n_s}$ with a natural (pseudo)-hyperkähler metric. The picture is that as long as the size of clusters is bounded, say by $K$, and the distances between their centres $x^i$ are larger than some $R_0 = R_0(K)$, then there are constants $C = C(K)$, $\alpha = \alpha(K)$ such that the cluster metric in this region of $M_{n_1,\ldots,n_s}$ is $Ce^{-\alpha R}$-close to the monopole metric in the corresponding region of $M_n$, where $R = \min\{|x^i - x^j|;\ i, j = 1, \ldots, s,\ i \neq j\}$.

The definition of the cluster metric is given in terms of spectral curves and sections of the line bundle $L^2$, analogous to one of the definitions of the monopole metric (cf. [3]). Essentially, a framed cluster in $M_{n_1,\ldots,n_s}$ corresponds to $s$ real spectral curves $S_i$ of degrees $n_i$ together with meromorphic sections of $L^2$ on each $S_i$, such that the zeros and poles of the sections occur only at the intersection points of different curves (together with certain nonsingularity conditions).

Let us say at once that we deal here almost exclusively with the case of two clusters. Apart from notational complications when $s > 2$, the chief difficulty (also for $s = 2$) is that unlike in the case of the Gibbons-Manton metric, we have not found a description of $M_{n_1,\ldots,n_s}$ as a moduli space of Nahm's equations. For $s = 2$ we have such a description of the smooth (and complex) structure of $M_{n_1,n_2}$ but not of its metric nor of the hypercomplex structure.
The fact that our spaces of clusters $M_{n_1,\ldots,n_s}$ are defined in terms of spectral curves satisfying certain transcendental conditions makes them quite hard to deal with. In particular, for $s > 2$ we do not have a proof that such curves exist (although we are certain that they do). For $s = 2$ we do have existence, since the spectral curves in this case turn out to be spectral curves of SU(2)-calorons of charge $(n_1, n_2)$.

Contents
1. Introduction
2. Line Bundles and Flows on Spectral Curves
   2.1 Line bundles and matricial polynomials
   2.2 Real structure
   2.3 Hermitian metrics
   2.4 Flows
3. The Monopole Moduli Space
4. The Moduli Space of Two Clusters
5. The Complex Structure of $N_{k,l}$
6. The Hyperkähler Structure of $M_{k,l}$
7. $M_{k,l}$ as a Hyperkähler Quotient
8. Spaces of Curves and Divisors
   8.1 The Douady space of $\mathbb{C}^2$
   8.2 The Douady space of $T\mathbb{P}^1$
   8.3 Curves and divisors
   8.4 Line bundles
   8.5 Translations
9. Asymptotics of Curves
10. Asymptotics of Matricial Polynomials
11. The Asymptotic Region of $M_{k,l}$ and Nahm's Equations
12. Comparison of Metrics
13. Concluding Remarks
References

2. Line Bundles and Flows on Spectral Curves We recall here essential facts about spectral curves and line bundles. For a more detailed overview we refer to [10]. 2.1. Line bundles and matricial polynomials. In what follows T denotes the total space of the line bundle O(2) on P1 (T T P1 ), π : T → P1 is the projection, ζ is the affine coordinate on P1 and η is the fibre coordinate on T . In other words T is obtained by gluing two copies of C2 with coordinates (ζ, η) and (ζ˜ , η) ˜ via: ζ˜ = ζ −1 , η˜ = η/ζ 2 . We denote the corresponding two open subsets of T by U0 and U∞ . Let S be an algebraic curve in the linear system O(2n), i.e. over ζ = ∞, S is defined by the equation P(ζ, η) = ηn + a1 (ζ )ηn−1 + · · · + an−1 (ζ )η + an (ζ ) = 0,

(2.1)

where $a_i(\zeta)$ is a polynomial of degree $2i$. $S$ can be singular or non-reduced (although spectral curves corresponding to monopoles, or to the clusters considered here, are always reduced). We recall the following facts (see, e.g., [17,1]):

Proposition 2.1. The group $H^1(T, \mathcal{O}_T)$ (i.e. line bundles on $T$ with zero first Chern class) is generated by $\eta^i\zeta^{-j}$, $i > 0$, $0 < j < 2i$. The corresponding line bundles have transition functions $\exp(\eta^i\zeta^{-j})$ from $U_0$ to $U_\infty$.

Proposition 2.2. The natural map $H^1(T, \mathcal{O}_T) \to H^1(S, \mathcal{O}_S)$ is a surjection, i.e. $H^1(S, \mathcal{O}_S)$ is generated by $\eta^i\zeta^{-j}$, $0 < i \le n - 1$, $0 < j < 2i$.

Thus, the (arithmetic) genus of $S$ is $g = (n - 1)^2$. For a smooth $S$, the last proposition describes line bundles of degree 0 on $S$. In general, by a line bundle we mean an invertible sheaf and by a divisor we mean a Cartier divisor. The degree of a line bundle is defined as its Euler characteristic plus $g - 1$. The theta divisor $\Theta$ is the set of line bundles of degree $g - 1$ which have a non-zero section. Let $\mathcal{O}_T(i)$ denote the pull-back of $\mathcal{O}(i)$ to $T$ via $\pi : T \to \mathbb{P}^1$. If $E$ is a sheaf on $T$ we denote by $E(i)$ the sheaf $E \otimes \mathcal{O}_T(i)$, and similarly for sheaves on $S$. In particular, $\pi^*\mathcal{O}$ is identified with $\mathcal{O}_S$. If $F$ is a line bundle of degree 0 on $S$, determined by a cocycle $q \in H^1(T, \mathcal{O}_T)$, and $s \in H^0(S, F(i))$, then we denote by $s_0, s_\infty$ the representation of $s$ in the trivialisation $U_0, U_\infty$, i.e.:
$$s_\infty(\zeta, \eta) = \frac{e^q}{\zeta^i}\, s_0(\zeta, \eta). \tag{2.2}$$

We recall the following theorem of Beauville [4]:


R. Bielawski

Theorem 2.3. There is a 1-1 correspondence between the affine Jacobian $J^{g-1} - \Theta$ of line bundles of degree $g - 1$ on $S$ and $GL(n, \mathbb{C})$-conjugacy classes of $\mathfrak{gl}(n, \mathbb{C})$-valued polynomials $A(\zeta) = A_0 + A_1\zeta + A_2\zeta^2$ such that $A(\zeta)$ is regular for every $\zeta$ and the characteristic polynomial of $A(\zeta)$ is (2.1).

The correspondence is given by associating to a line bundle $E$ on $S$ its direct image $V = \pi_*E$, which has a structure of a $\pi_*\mathcal{O}$-module. This is the same as a homomorphism $A : V \to V(2)$ which satisfies (2.1). The condition $E \in J^{g-1} - \Theta$ is equivalent to $H^0(S, E) = H^1(S, E) = 0$ and, hence, to $H^0(\mathbb{P}^1, V) = H^1(\mathbb{P}^1, V) = 0$, i.e. $V = \bigoplus \mathcal{O}(-1)$. Thus, we can interpret $A$ as a matricial polynomial precisely when $E \in J^{g-1} - \Theta$. Somewhat more explicitly, the correspondence is seen from the exact sequence
$$0 \to \mathcal{O}_T(-2)^{\oplus n} \to \mathcal{O}_T^{\oplus n} \to E(1) \to 0, \tag{2.3}$$

where the first map is given by $\eta \cdot 1 - A(\zeta)$ and $E(1)$ is viewed as a sheaf on $T$ supported on $S$. The inverse map is defined by the commuting diagram
$$\begin{array}{ccc}
H^0(S, E(1)) & \longrightarrow & H^0(D_\zeta, E(1)) \\
\big\downarrow {\scriptstyle \tilde A(\zeta)} & & \big\downarrow {\scriptstyle \cdot\,\eta} \\
H^0(S, E(1)) & \longrightarrow & H^0(D_\zeta, E(1)),
\end{array} \tag{2.4}$$
where $D_\zeta$ is the divisor consisting of points of $S$ which lie above $\zeta$ (counting multiplicities). That the endomorphism $\tilde A(\zeta)$ is quadratic in $\zeta$ is proved e.g. in [1]. Observe that if $D_{\zeta_0}$ consists of $n$ distinct points $p_1, \dots, p_n$ and if $\psi^1, \dots, \psi^n$ is a basis of $H^0(S, E(1))$, then $\tilde A(\zeta_0)$ in this basis is
$$\tilde A(\zeta_0) = \left[\psi^j(p_i)\right]^{-1} \operatorname{diag}\big(\eta(p_1), \dots, \eta(p_n)\big) \left[\psi^j(p_i)\right], \tag{2.5}$$
where $[\psi^j(p_i)]$ is a matrix with rows labelled by $i$ and columns by $j$.

Remark 2.4. For a singular curve $S$, Beauville's correspondence most likely extends to $\overline{J^{g-1}} - \Theta$, where $\overline{J^{g-1}}$ is the compactified Jacobian in the sense of [2]. It seems to us that this is essentially proved in [1].

Let $K$ be the canonical (or dualising) sheaf of $S$. We have $K \simeq \mathcal{O}_S(2n - 4)$. If $E$ belongs to $J^{g-1} - \Theta$, then so does $E^* \otimes K$ and:

Proposition 2.5. Let $A(\zeta)$ be the quadratic matricial polynomial corresponding to $E \in J^{g-1} - \Theta$. Then $A(\zeta)^T$ corresponds to $E^* \otimes K$. In particular, theta-characteristics outside $\Theta$ correspond to symmetric matricial polynomials.

For a proof, see [10].
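Formula (2.5) can be tried out on a toy example. The sketch below is only an illustration for $n = 2$: the numbers standing in for $\eta(p_i)$ and $\psi^j(p_i)$ are made up, and the helper names are hypothetical. It builds $\tilde A(\zeta_0)$ from (2.5) and lets one check that its characteristic polynomial has the $\eta(p_i)$ as roots, as Theorem 2.3 requires.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)] for i in range(n)]


def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix with exact rational entries."""
    det = Fraction(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def endomorphism(etas, psi):
    """A~(zeta_0) = Psi^{-1} diag(eta) Psi as in (2.5), for n = 2.

    psi[i][j] plays the role of psi^j(p_i): rows are points, columns are basis sections.
    """
    diag = [[Fraction(etas[0]), Fraction(0)], [Fraction(0), Fraction(etas[1])]]
    return mat_mul(inverse_2x2(psi), mat_mul(diag, psi))


# Made-up data: two points over zeta_0 with eta-values 2 and 5.
A = endomorphism([2, 5], [[1, 2], [3, 4]])
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
```

Here `trace` and `det` are the sum and product of the chosen $\eta$-values, so the characteristic polynomial of `A` is $(\eta - 2)(\eta - 5)$.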

Monopoles and Clusters


2.2. Real structure. The space $T$ is equipped with a real structure (i.e. an antiholomorphic involution) $\tau$ defined by
$$\zeta \mapsto -\frac{1}{\bar\zeta}, \qquad \eta \mapsto -\frac{\bar\eta}{\bar\zeta^2}. \tag{2.6}$$
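As a quick numerical sanity check of (2.6), the following sketch (function names are ours, not the paper's) confirms on a sample point that $\tau$ squares to the identity and that it takes the same form in the coordinates $(\tilde\zeta, \tilde\eta) = (\zeta^{-1}, \eta/\zeta^2)$ of the second chart.

```python
def tau(zeta, eta):
    """Real structure (2.6), valid away from zeta = 0 and zeta = infinity."""
    zb = complex(zeta).conjugate()
    return -1 / zb, -complex(eta).conjugate() / zb**2


def to_other_chart(zeta, eta):
    """Change of coordinates between the two charts of T: (zeta, eta) -> (1/zeta, eta/zeta^2)."""
    return 1 / zeta, eta / zeta**2


def close(p, q, tol=1e-12):
    """Compare two points of C^2 up to floating-point error."""
    return all(abs(a - b) < tol for a, b in zip(p, q))


point = (0.3 + 0.8j, -1.2 + 0.5j)           # arbitrary sample point
is_involution = close(tau(*tau(*point)), point)
same_form_in_both_charts = close(to_other_chart(*tau(*point)), tau(*to_other_chart(*point)))
```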

Suppose that $S$ is real, i.e. invariant under $\tau$. Then $\tau$ induces an antiholomorphic involution $\sigma$ on $\operatorname{Pic} S$ as follows. Let $E$ be a line bundle on $S$ trivialised in a cover $\{U_\alpha\}_{\alpha \in A}$ with transition functions $g_{\alpha\beta}(\zeta, \eta)$ from $U_\alpha$ to $U_\beta$. Then $\sigma(E)$ is trivialised in the cover $\{\tau(U_\alpha)\}_{\alpha \in A}$ with transition functions $\overline{g_{\alpha\beta}(\tau(\zeta, \eta))}$ from $\tau(U_\alpha)$ to $\tau(U_\beta)$. Observe that $\sigma(E) = \overline{\tau^*E}$, where "bar" means taking the opposite complex structure. This map does not change the degree of $E$ and preserves the line bundles $\mathcal{O}_S(i)$. As there is a corresponding map on sections
$$\sigma : s \mapsto \overline{\tau^*s}, \tag{2.7}$$
it is clear that $J^{g-1} - \Theta$ is invariant under this map. The $\sigma$-invariant line bundles are called real. Real line bundles of degree 0 have [10] transition functions $\exp q(\zeta, \eta)$, where $q$ satisfies $\overline{q(\tau(\zeta, \eta))} = q(\zeta, \eta)$. On the other hand, a line bundle $E$ of degree $d = in$, $i \in \mathbb{Z}$, on $S$ is real if and only if it is of the form $E = F(i)$, where $F$ is a real line bundle of degree 0. For bundles of degree $g - 1$ we conclude (see [10] for a proof):

Proposition 2.6. There is a 1-1 correspondence between $J^{g-1}_{\mathbb{R}} - \Theta_{\mathbb{R}}$ and conjugacy classes of matrix-valued polynomials $A(\zeta)$ as in Theorem 2.3 such that there exists a hermitian $h \in GL(n, \mathbb{C})$ with
$$hA_0h^{-1} = -A_2^*, \qquad hA_1h^{-1} = A_1^*, \qquad hA_2h^{-1} = -A_0^*. \tag{2.8}$$
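To see why (2.8) is the matrix counterpart of the real structure (2.6), one can specialise to $h = 1$ (an assumption made only for this illustration). Then $A_2 = -A_0^*$ and $A_1 = A_1^*$, and a short computation gives $A(-1/\bar\zeta) = -A(\zeta)^*/\bar\zeta^2$, so the eigenvalues of $A$ are exchanged exactly as $\eta \mapsto -\bar\eta/\bar\zeta^2$. The sketch below checks this identity numerically for made-up $2 \times 2$ data; all names are hypothetical.

```python
def dagger(m):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def A(zeta, a0, a1):
    """A(zeta) = A0 + A1 zeta + A2 zeta^2 with A2 = -A0^* (condition (2.8) with h = 1)."""
    n = len(a0)
    a0_star = dagger(a0)
    return [[a0[i][j] + a1[i][j] * zeta - a0_star[i][j] * zeta**2 for j in range(n)]
            for i in range(n)]


def reality_defect(zeta, a0, a1):
    """Largest entry of A(-1/conj(zeta)) + A(zeta)^* / conj(zeta)^2.

    It vanishes (up to rounding) when a1 is hermitian.
    """
    zb = complex(zeta).conjugate()
    lhs = A(-1 / zb, a0, a1)
    rhs = dagger(A(zeta, a0, a1))
    n = len(a0)
    return max(abs(lhs[i][j] + rhs[i][j] / zb**2) for i in range(n) for j in range(n))


a0 = [[1 + 2j, 0.5j], [3, -1j]]             # arbitrary
a1 = [[2, 1 - 1j], [1 + 1j, -3]]            # hermitian
a1_bad = [[2, 1], [5, -3]]                  # not hermitian
```

With these inputs, `reality_defect(0.7 + 0.3j, a0, a1)` is zero up to rounding, while replacing `a1` by `a1_bad` gives a visibly non-zero defect.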

2.3. Hermitian metrics. Let $S$ be a real curve.

Definition 2.7. A line bundle of degree $g - 1$ on $S$ is called definite if it is in $J^{g-1}_{\mathbb{R}} - \Theta_{\mathbb{R}}$ and the matrix $h$ in (2.8) can be chosen to be positive-definite. The subset of definite line bundles is denoted by $J^{g-1}_+$.

We easily conclude that there is a 1-1 correspondence between $J^{g-1}_+$ and $U(n)$-conjugacy classes of matrix-valued polynomials $A(\zeta)$ as in Theorem 2.3 which in addition satisfy
$$A_2 = -A_0^*, \qquad A_1 = A_1^*. \tag{2.9}$$
Definite line bundles also have the following interpretation (cf. [17]):


For $E = F(n - 2) \in J^{g-1}_{\mathbb{R}}$ the real structure induces an antiholomorphic isomorphism
$$\sigma : H^0(S, F(n - 1)) \longrightarrow H^0(S, F^*(n - 1)), \tag{2.10}$$
via the map (2.7). Thus, for $v, w \in H^0(S, F(n - 1))$, $v\sigma(w)$ is a section of $\mathcal{O}_S(2n - 2)$ and so it can be uniquely written [17,1] as
$$c_0\eta^{n-1} + c_1(\zeta)\eta^{n-2} + \cdots + c_{n-1}(\zeta), \tag{2.11}$$
where the degree of $c_i$ is $2i$. Following Hitchin [17], we define a hermitian form on $H^0(S, F(n - 1))$ by
$$\langle v, w \rangle = c_0. \tag{2.12}$$

The following fact can be deduced from [17]:

Proposition 2.8. A line bundle $E = F(n - 2) \in J^{g-1}_{\mathbb{R}} - \Theta_{\mathbb{R}}$ is definite if and only if the above form on $H^0(S, F(n - 1))$ is definite.

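Let $s, s'$ be two sections of $F(n - 1)$ on $S$. The form $\langle s, s' \rangle$ is given by computing the section $Z = s\sigma(s')$ of $\mathcal{O}(2n - 2)$ on $S$. Writing $Z(\zeta, \eta) = c_0\eta^{n-1} + c_1(\zeta)\eta^{n-2} + \cdots + c_{n-1}(\zeta)$ on $S$, we have $\langle s, s' \rangle = c_0$. If $P(\zeta, \eta) = 0$ is the equation defining $S$, then for any $\zeta_0$ such that $S \cap \pi^{-1}(\zeta_0)$ consists of distinct points, we have
$$c_0 = \sum_{(\zeta_0, \eta) \in S} \operatorname{Res} \frac{Z(\zeta_0, \eta)}{P(\zeta_0, \eta)}.$$
Thus, if we write $(\zeta_0, \eta_1), \dots, (\zeta_0, \eta_n)$ for the points of $S$ lying over $\zeta_0$, then we have
$$\langle s, s' \rangle = \sum_{i=1}^{n} \frac{s(\zeta_0, \eta_i) \cdot \sigma(s')(\zeta_0, \eta_i)}{\prod_{j \neq i}(\eta_i - \eta_j)}. \tag{2.13}$$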
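The identity behind (2.13) is the classical fact that, for a polynomial $Z$ of degree at most $n - 1$ in $\eta$ and $n$ distinct points $\eta_i$, the sum $\sum_i Z(\eta_i)/\prod_{j \neq i}(\eta_i - \eta_j)$ equals the coefficient of $\eta^{n-1}$. A minimal sketch with exact arithmetic, using made-up coefficients and roots and a hypothetical function name:

```python
from fractions import Fraction


def leading_coeff_by_residues(z_coeffs, roots):
    """Sum of residues of Z/P over the roots of P, as in (2.13).

    z_coeffs = [c_0, ..., c_{n-1}] are the coefficients of
    Z(eta) = c_0 eta^{n-1} + ... + c_{n-1}; roots are the n distinct roots of P.
    """
    def Z(eta):
        value = Fraction(0)
        for c in z_coeffs:          # Horner evaluation, highest degree first
            value = value * eta + c
        return value

    total = Fraction(0)
    for i, eta_i in enumerate(roots):
        denom = Fraction(1)
        for j, eta_j in enumerate(roots):
            if j != i:
                denom *= eta_i - eta_j
        total += Z(eta_i) / denom
    return total


# Z(eta) = 5 eta^2 - 2 eta + 7 and P with roots 0, 1, 3: the sum recovers c_0 = 5.
c0 = leading_coeff_by_residues([5, -2, 7], [0, 1, 3])
```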

Therefore, one can compute $\langle s, s' \rangle$ from the values of the sections at two fibres of $S$ over two antipodal points of $\mathbb{P}^1$ (as long as the fibres do not have multiple points).

2.4. Flows. If we fix a tangent direction on $J^{g-1}(S)$, i.e. an element $q$ of $H^1(S, \mathcal{O}_S)$, then the linear flow of line bundles on $J^{g-1}(S)$ corresponds to a flow of matricial polynomials (modulo the action of $GL(n, \mathbb{C})$). We shall be interested only in the flow corresponding to $[\eta/\zeta] \in H^1(S, \mathcal{O}_S)$. Following the tradition, we denote by $L^t$ the line bundle on $T$ with transition function $\exp(-t\eta/\zeta)$ from $U_0$ to $U_\infty$. For any line bundle $F$ of degree 0 on $S$ we denote by $F_t$ the line bundle $F \otimes L^t$. We consider the flow $F_t(n - 2)$ on $J^{g-1}(S)$. Even if $F = F_0$ is in the theta divisor $\Theta$, this flow transports one immediately outside $\Theta$, and so we obtain a flow of endomorphisms of $V_t = H^0(S, F_t(n - 1))$. These vector spaces have dimension $n$ as long as $F_t(n - 2) \notin \Theta$. We obtain an endomorphism $\tilde A(\zeta)$ of $V_t$ as equal to multiplication by $\eta$ on $H^0(S \cap \pi^{-1}(\zeta), F_t(n - 1))$, where $\pi : T \to \mathbb{P}^1$ is the projection.


To obtain a flow of matricial polynomials one has to trivialise the vector bundle $V$ over $\mathbb{R}$ (the fibre of which at $t$ is $V_t$). This is a matter of choosing a connection. If we choose the connection $\nabla^0$ defined by evaluating sections at $\zeta = 0$ (in the trivialisation $U_0, U_\infty$), then the corresponding matricial polynomial $A(t, \zeta) = A_0(t) + A_1(t)\zeta + A_2(t)\zeta^2$ satisfies [17,1]
$$\frac{d}{dt}A(t, \zeta) = [A(t, \zeta), A_2(t)\zeta].$$
As mentioned above, if $F$ is a real bundle, then $V$ has a natural hermitian metric (2.12) (possibly indefinite). The above connection is not metric, i.e. it does not preserve the form (2.12). Hitchin [17] has shown that the connection $\nabla = \nabla^0 + \frac{1}{2}A_1(t)\,dt$ is metric and that, in a $\nabla$-parallel basis, the resulting $A(t, \zeta)$ satisfies
$$\frac{d}{dt}A(t, \zeta) = [A(t, \zeta), A_1(t)/2 + A_2(t)\zeta].$$
If the bundle $F(n - 1)$ is positive-definite, then so are all $F_t(n - 1)$. If the basis of sections is, in addition, unitary, then the polynomials $A(t, \zeta)$ satisfy the reality condition (2.9). If we write $A_0(t) = T_2(t) + iT_3(t)$ and $A_1(t) = 2iT_1(t)$ for skew-hermitian $T_i(t)$, then these matrices satisfy the Nahm equations:
$$\dot T_i + \frac{1}{2}\sum_{j,k=1,2,3} \epsilon_{ijk}[T_j, T_k] = 0, \qquad i = 1, 2, 3. \tag{2.14}$$
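The passage from the Lax equation to (2.14) amounts to comparing coefficients of powers of $\zeta$. The sketch below checks this numerically for random skew-hermitian $2 \times 2$ matrices: it takes the time derivatives prescribed by (2.14), forms $A_0 = T_2 + iT_3$, $A_1 = 2iT_1$, $A_2 = T_2 - iT_3$ (the last by (2.9)), and measures how far $\dot A$ is from $[A, A_1/2 + A_2\zeta]$. The helper names are ours and the matrix size is an arbitrary choice.

```python
import random


def mul(a, b):
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)] for i in range(n)]


def lin(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def comm(a, b):
    return lin(1, mul(a, b), -1, mul(b, a))


def random_skew_hermitian(rng):
    x, y, u, v = (rng.uniform(-1, 1) for _ in range(4))
    return [[1j * x, complex(u, v)], [complex(-u, v), 1j * y]]


def lax_defect(T1, T2, T3):
    """Largest entry of dA/dt - [A, A1/2 + A2 zeta], coefficient by coefficient in zeta."""
    # Derivatives given by the Nahm equations (2.14): dT_i/dt = -[T_j, T_k], (i, j, k) cyclic.
    d1, d2, d3 = comm(T3, T2), comm(T1, T3), comm(T2, T1)
    A0, A1, A2 = lin(1, T2, 1j, T3), lin(2j, T1, 0, T1), lin(1, T2, -1j, T3)
    dA0, dA1, dA2 = lin(1, d2, 1j, d3), lin(2j, d1, 0, d1), lin(1, d2, -1j, d3)
    half_A1 = lin(0.5, A1, 0, A1)
    residuals = [
        lin(1, dA0, -1, comm(A0, half_A1)),                               # zeta^0
        lin(1, dA1, -1, lin(1, comm(A0, A2), 1, comm(A1, half_A1))),      # zeta^1
        lin(1, dA2, -1, lin(1, comm(A1, A2), 1, comm(A2, half_A1))),      # zeta^2
        comm(A2, A2),                                                     # zeta^3
    ]
    return max(abs(x) for m in residuals for row in m for x in row)


rng = random.Random(0)
defect = lax_defect(*(random_skew_hermitian(rng) for _ in range(3)))
```

The value of `defect` is zero up to floating-point rounding, which is the coefficientwise equivalence used in the text.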

3. The Monopole Moduli Space

The moduli space of $SU(2)$-monopoles of charge $n$ has a well-known description as a moduli space of solutions to Nahm's equations [29,17]. From the point of view of Sect. 2.4, monopoles correspond to spectral curves on which the flow $L^t(n - 1)$ is periodic and does not meet the theta divisor except for the periods. We can then describe the moduli space of $SU(2)$-monopoles as the space of solutions to Nahm's equations (2.14) on $(0, 2)$ with symmetry $T_i(2 - t) = T_i(t)^T$ (cf. Proposition 2.5) and satisfying appropriate boundary conditions. If we wish to consider the moduli space $M_n$ of framed monopoles (which is a circle bundle over the moduli space of monopoles) and its natural hyperkähler metric, then it is better to allow gauge freedom and introduce a fourth $\mathfrak{u}(n)$-valued function $T_0(t)$. Thus we consider the following variant of Nahm's equations:
$$\dot T_i + [T_0, T_i] + \frac{1}{2}\sum_{j,k=1,2,3} \epsilon_{ijk}[T_j, T_k] = 0, \qquad i = 1, 2, 3. \tag{3.1}$$

The functions $T_0, T_1, T_2, T_3$ are $\mathfrak{u}(n)$-valued, defined on an interval and analytic. The space of solutions is acted upon by the gauge group $\mathcal{G}$ of $U(n)$-valued functions $g(t)$:
$$T_0 \mapsto gT_0g^{-1} - \dot g g^{-1}, \qquad T_i \mapsto gT_ig^{-1}, \quad i = 1, 2, 3. \tag{3.2}$$
To obtain $M_n$ we consider solutions analytic on $(0, 2)$ which have simple poles at $0, 2$, residues of which define a fixed irreducible representation of $\mathfrak{su}(2)$. The space $M_n$ is identified with the moduli space of solutions to (3.1) satisfying these boundary conditions and the symmetry condition $T_i(2 - t) = T_i(t)^T$, $i = 0, 1, 2, 3$, modulo the action of gauge transformations $g(t)$ which satisfy $g(0) = g(1) = 1$ and $g(2 - t)^{-1} = g^T(t)$.


The tangent space at a solution $(T_0, T_1, T_2, T_3)$ can be identified with the space of solutions to the following system of linear equations:
$$\begin{aligned}
\dot t_0 + [T_0, t_0] + [T_1, t_1] + [T_2, t_2] + [T_3, t_3] &= 0,\\
\dot t_1 + [T_0, t_1] - [T_1, t_0] + [T_2, t_3] - [T_3, t_2] &= 0,\\
\dot t_2 + [T_0, t_2] - [T_1, t_3] - [T_2, t_0] + [T_3, t_1] &= 0,\\
\dot t_3 + [T_0, t_3] + [T_1, t_2] - [T_2, t_1] - [T_3, t_0] &= 0.
\end{aligned} \tag{3.3}$$

The first equation is the condition that $(t_0, t_1, t_2, t_3)$ is orthogonal to the infinitesimal gauge transformations and the remaining three are linearisations of (3.1). Again, the symmetry condition $t_i(2 - t) = t_i(t)^T$ holds. $M_n$ carries a hyperkähler metric defined by
$$\|(t_0, t_1, t_2, t_3)\|^2 = -\sum_{i=0}^{3} \int_0^2 \operatorname{tr} t_i^2(s)\,ds. \tag{3.4}$$
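For the reader's convenience, here is how the second line of (3.3) arises from (3.1); this is a routine first-order expansion written out by us, not taken from the source. Substitute $T_i + \epsilon t_i$ into the $i = 1$ equation $\dot T_1 + [T_0, T_1] + [T_2, T_3] = 0$ and keep the terms linear in $\epsilon$:

```latex
\begin{aligned}
0 &= \dot t_1 + [t_0, T_1] + [T_0, t_1] + [t_2, T_3] + [T_2, t_3] \\
  &= \dot t_1 + [T_0, t_1] - [T_1, t_0] + [T_2, t_3] - [T_3, t_2].
\end{aligned}
```

The third and fourth lines of (3.3) follow in the same way from the $i = 2$ and $i = 3$ equations by cyclic permutation of the indices.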

We now describe $M_n$ and its metric in terms of spectral curves. $M_n$ consists of pairs $(S, \nu)$ where $S \in |\mathcal{O}(2n)|$ satisfies
$$H^0(S, L^s(n - 1)) = 0 \quad \text{for } s \in (0, 2), \tag{3.5}$$
$$L^2|_S \simeq \mathcal{O}, \tag{3.6}$$
and $\nu$ is a section of $L^2$ of norm 1 (the norm is defined by $\|\nu\|^2 = \nu\sigma(\nu) \in H^0(S, \mathcal{O}) = \mathbb{C}$, where $\sigma$ is defined as in (2.7)). This last condition guarantees in particular that $L^s(n - 1) \in J^{g-1}_+$ for $s \in [0, 2]$.

Remark 3.1. In [17] there is one more condition: that $S$ has no multiple components. This, however, follows from the other assumptions. Namely, an $S$ satisfying all other conditions produces a solution to Nahm's equations with boundary conditions of $M_n$. Thus, $S$ is a spectral curve of a monopole and cannot have multiple components.

With respect to any complex structure, $M_n$ is biholomorphic to $\mathrm{Rat}_n\,\mathbb{P}^1$, the space of based (mapping $\infty$ to 0) rational maps of degree $n$ on $\mathbb{P}^1$. If we represent an $(S, \nu) \in M_n$ in the patch $\zeta \neq \infty$ by a polynomial $P(\eta, \zeta)$ and a holomorphic function $\nu_0(\eta, \zeta)$, then, for a given $\zeta_0$, the denominator of the corresponding rational map is $P(\eta, \zeta_0)$. The numerator can be identified [20], when the denominator has distinct zeros, with the unique polynomial of degree $n - 1$ taking values $\nu_0(\eta_i, \zeta_0)$ at the zeros $\eta_i$ of the denominator. The complex symplectic form (i.e. $\omega_2 + i\omega_3$ for $\zeta = 0$) arising from the hyperkähler structure is the standard form on $\mathrm{Rat}_n\,\mathbb{P}^1$:
$$\sum_{i=1}^{n} \frac{dp(\eta_i)}{p(\eta_i)} \wedge d\eta_i, \tag{3.7}$$
where $p(z)/q(z) \in \mathrm{Rat}_n\,\mathbb{P}^1$ and $q$ has distinct roots $\eta_i$. The Kähler form $\langle I_{\zeta_0}\cdot, \cdot \rangle$, where $I_{\zeta_0}$ is the complex structure corresponding to $\zeta_0 \in \mathbb{P}^1$, is given by the linear term in the expansion of (3.7) as a power series in $\zeta - \zeta_0$.

To complete the circle of ideas we recall, after Donaldson [15] and Hurtubise [20,21], how to read off the section of $L^2$ from a solution to Nahm's equations. The Nahm


equations (3.1) can be written in the Lax form $\frac{d}{dt}A(t, \zeta) = [A(t, \zeta), A_\#(t, \zeta)]$, where $\zeta$ is an affine coordinate on $\mathbb{P}^1$ and
$$A(t, \zeta) = (T_2(t) + iT_3(t)) + 2iT_1(t)\zeta + (T_2(t) - iT_3(t))\zeta^2, \qquad A_\#(t, \zeta) = (T_0(t) + iT_1(t)) + (T_2(t) - iT_3(t))\zeta.$$
In the case of monopoles, the residues at $t = 0, 2$ of $A(t)$ and of $A_\#(t)$ define irreducible representations of $\mathfrak{sl}(2, \mathbb{C})$, which are independent of the solution. In addition, the $-(n - 1)/2$-eigenspace of the residue of $A_\#$ is independent of $\zeta$ and can be chosen to be generated by the first vector of the Euclidean basis of $\mathbb{C}^n$. There is a unique solution $w(t, \zeta)$ of $\frac{d}{dt}w + A_\# w = 0$ satisfying $t^{-(n-1)/2}w(t, \zeta) \to (1, 0, \dots, 0)^T$ as $t \to 0$. The rational map, for any $\zeta \neq \infty$, corresponding to a solution to Nahm's equations is then $w(1, \zeta)^T(z - A(1, \zeta))^{-1}w(1, \zeta)$. Thus the section of $L^2$, which is the numerator of the rational map, is (in the patch $\zeta \neq \infty$)
$$\nu_0 = w(1, \zeta)^T \big(z - A(1, \zeta)\big)_{\mathrm{adj}}\, w(1, \zeta). \tag{3.8}$$
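The step from the rational map to (3.8) uses only the standard relation between the inverse and the classical adjoint (adjugate) of a matrix, which we spell out as a reminder. The identification of the denominator with the defining polynomial of the spectral curve is our reading of Sect. 2, where the characteristic polynomial of $A(\zeta)$ is (2.1).

```latex
\big(z - A(1,\zeta)\big)^{-1}
  = \frac{\big(z - A(1,\zeta)\big)_{\mathrm{adj}}}{\det\big(z - A(1,\zeta)\big)}
\quad\Longrightarrow\quad
w(1,\zeta)^T \big(z - A(1,\zeta)\big)^{-1} w(1,\zeta)
  = \frac{\nu_0}{\det\big(z - A(1,\zeta)\big)} .
```

The denominator is the characteristic polynomial of $A(1, \zeta)$, i.e. the polynomial $P(\zeta, z)$ defining the spectral curve, and the numerator is $\nu_0$.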

4. The Moduli Space of Two Clusters

We consider the space $\mathcal{S}_{k,l}$ of pairs $(S_1, S_2)$ of compact, real curves $S_1 \in |\mathcal{O}(2k)|$, $S_2 \in |\mathcal{O}(2l)|$ such that there exists a $D \subset S_1 \cap S_2$ satisfying
(i) $D \cup \tau(D) = S_1 \cap S_2$ (as divisors).
(ii) Over $S_1$: $L^2[D - \tau(D)] \simeq \mathcal{O}$; over $S_2$: $L^2[\tau(D) - D] \simeq \mathcal{O}$.
(iii) $H^0(S_1, L^s(k + l - 2)[-\tau(D)]) = 0$ and $H^0(S_2, L^s(k + l - 2)[-D]) = 0$ for $s \in (0, 2)$. In addition the first (resp. second) cohomology group vanishes also for $s = 0$ if $k \le l$ (resp. $l \le k$).
(iv) $L^s(k + l - 2)[-\tau(D)]$ on $S_1$ and $L^s(k + l - 2)[-D]$ on $S_2$ are positive-definite in the sense of Definition 2.7 for every real $s$.

We now define the space $M_{k,l}$ as the set of quadruples $(S_1, \nu_1, S_2, \nu_2)$, where $(S_1, S_2) \in \mathcal{S}_{k,l}$, and $\nu_1$ and $\nu_2$ are sections of norm 1 of $L^2[D - \tau(D)]$ on $S_1$ and of $L^2[\tau(D) - D]$ on $S_2$, respectively. The norm of a section is defined as in the previous section (after (3.6)). We observe that $M_{k,l}$ is a $T^2$-bundle over $\mathcal{S}_{k,l}$ (this corresponds to a framing of clusters).

The space $M_{k,l}$ should be viewed as a "moduli space" of two (framed) clusters, of cardinality $k$ and $l$. We shall show that $M_{k,l}$ is equipped with a (pseudo)-hyperkähler metric. In the asymptotic region of $M_{k,l}$ the metric is positive-definite and exponentially close to the exact monopole metric in the region of $M_{k+l}$ where monopoles of charge $k + l$ separate into clusters of cardinality $k$ and $l$.

There is of course the problem of whether curves satisfying conditions (i)-(iii) above exist, and of finding enough of them to correspond to all pairs of far away clusters. Recall that $\mathrm{Rat}_m\,\mathbb{P}^1$ denotes the space of based ($\infty \to 0$) rational maps of degree $m$. We are going to show

Theorem 4.1. Let $\zeta_0 \in \mathbb{P}^1 - \{\infty\}$. There exists a diffeomorphism from $\mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$ onto an open dense subset $M^{\zeta_0}_{k,l}$ of $M_{k,l}$ with the following property. For every $\left(\frac{p_1(z)}{q_1(z)}, \frac{p_2(z)}{q_2(z)}\right) \in \mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$ there exists a unique element $(S_1, \nu_1, S_2, \nu_2)$ of $M^{\zeta_0}_{k,l}$ such that the polynomials $P_i(\zeta, \eta)$ defining the curves $S_i$, $i = 1, 2$, satisfy $P_i(\zeta_0, \eta) = q_i(\eta)$ and the values of $\nu_i$ at points of $\pi^{-1}(\zeta_0) \cap S_i$ (in the canonical trivialisation of Sect. 2.4) are the values of the numerators $p_i$ at the roots of $q_i$.
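The dictionary in Theorem 4.1 between sections and numerators rests on the fact that a polynomial of degree less than $m$ is determined by its values at the $m$ distinct roots of the denominator. The following sketch illustrates that reconstruction by Lagrange interpolation with exact arithmetic; the roots and values are made up and the function names are hypothetical.

```python
from fractions import Fraction


def poly_mul_linear(p, r):
    """Multiply the polynomial p (coefficients, lowest degree first) by (z - r)."""
    out = [Fraction(0)] * (len(p) + 1)
    for d, c in enumerate(p):
        out[d] -= r * c
        out[d + 1] += c
    return out


def numerator_from_values(roots, values):
    """Unique polynomial of degree < len(roots) with p(roots[i]) = values[i]."""
    n = len(roots)
    result = [Fraction(0)] * n
    for i in range(n):
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j in range(n):
            if j != i:
                basis = poly_mul_linear(basis, roots[j])
                denom *= roots[i] - roots[j]
        for d, c in enumerate(basis):
            result[d] += values[i] * c / denom
    return result


def evaluate(p, z):
    return sum(c * z**d for d, c in enumerate(p))


# Denominator with roots 0, 1, 3 and prescribed numerator values 7, 10, 46 there.
p = numerator_from_values([0, 1, 3], [7, 10, 46])   # coefficients of 7 - 2 z + 5 z^2
```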

A proof of this theorem will be given at the end of the next section. We can describe $M_{k,l}$ (but not its metric) as a moduli space $N_{k,l}$ of solutions to Nahm's equations:
(a) The moduli space consists of $\mathfrak{u}(k)$-valued solutions $T_i^-$ on $[-1, 0)$ and of $\mathfrak{u}(l)$-valued solutions $T_i^+$ on $(0, 1]$.
(b) If $k \ge l$, then $T_i^+$, $i = 0, 1, 2, 3$, $T_0^-$ and the $l \times l$ upper-diagonal block of $T_i^-$, $i = 1, 2, 3$, are analytic at $t = 0$. The $(k - l) \times (k - l)$ lower-diagonal blocks of $T_i^-$ have simple poles with residues defining the standard $(k - l)$-dimensional irreducible representation of $\mathfrak{su}(2)$. The off-diagonal blocks of $T_i^-$ are of the form $t^{(k-l-1)/2} \times$ (analytic in $t$). Similarly, if $l \ge k$, then $T_i^-$, $i = 0, 1, 2, 3$, $T_0^+$ and the $k \times k$ upper-diagonal block of $T_i^+$, $i = 1, 2, 3$, are analytic at $t = 0$; the $(l - k) \times (l - k)$ lower-diagonal blocks of $T_i^+$ have simple poles with residues defining the standard $(l - k)$-dimensional irreducible representation of $\mathfrak{su}(2)$, and the off-diagonal blocks of $T_i^+$ are of the form $t^{(l-k-1)/2} \times$ (analytic in $t$).
(c) We have the following matching conditions at $t = 0$: if $k < l$ (resp. $k > l$) then the limit of the $k \times k$ upper-diagonal block of $T_i^+$ (resp. $l \times l$ upper-diagonal block of $T_i^-$) at $t = 0$ is equal to the limit of $T_i^-$ (resp. $T_i^+$) for $i = 1, 2, 3$. If $k = l$, then there exists a vector $(V, W) \in \mathbb{C}^{2k}$ such that $(T_2^+ + iT_3^+)(0_+) - (T_2^- + iT_3^-)(0_-) = VW^T$ and $T_1^+(0_+) - T_1^-(0_-) = (|V|^2 - |W|^2)/2$.
(d) The solutions are symmetric at $t = -1$ and at $t = 1$.
(e) The gauge group $\mathcal{G}$ consists of gauge transformations $g(t)$ which are $U(k)$-valued on $[-1, 0]$, $U(l)$-valued on $[0, 1]$, are orthogonal at $t = \pm 1$ and satisfy the appropriate matching conditions at $t = 0$: if $k \le l$, then the upper-diagonal $k \times k$ block of $g(t)$ is continuous, the lower-diagonal block is the identity at $t = 0$ and the off-diagonal blocks vanish to order $(l - k - 1)/2$ from the left. Similarly for $l \le k$.

Remark 4.2. It is known that $N_{k,l}$ is isomorphic to the moduli space of $SU(2)$-calorons, i.e. periodic instantons [31,13]. The matching conditions at $t = 0$ are those for $SU(3)$-monopoles (cf. [22]).

Remark 4.3. If we omit the condition that the $T_i$ are symmetric at $\pm 1$ and allow only gauge transformations which are 1 at $\pm 1$, then we obtain the space $F_{k,l}(-1, 1)$ considered in [9]. Thus $N_{k,l}$ is the hyperkähler quotient of $F_{k,l}(-1, 1)$ by $O(k) \times O(l)$.

We have

Proposition 4.4. There is a natural bijection between $M_{k,l}$ and $N_{k,l}$.

Proof. According to [22] the flow $L^t(k + l - 1)[-D]$ on $S_1$ and $S_2$ corresponds to a solution to Nahm's equations (with $T_0 = 0$) satisfying the matching conditions of $N_{k,l}$ at $t = 0$. The condition (iii) in the definition of $\mathcal{S}_{k,l}$ is equivalent to regularity of the solution on $(-2, 0)$ and on $(0, 2)$. Proposition 2.5 implies that the condition that the $T_i$ are symmetric at $\pm 1$ corresponds to $L^{-1}(k + l - 1)|_{S_1}[-D]$ and $L^1(k + l - 1)|_{S_2}[-D]$ being isomorphic to $P_1(k - 1)$ and $P_2(l - 1)$, where $P_1$ and $P_2$ are elements of order two in the real Jacobians of $S_1$ and $S_2$. Hence $L^{-1}(l)|_{S_1}[-D] \simeq P_1$ and $L^1(k)|_{S_2}[-D] \simeq P_2$.


Squaring gives $L^{-2}(2l) \simeq [2D]$ on $S_1$ and $L^2(2k) \simeq [2D]$ on $S_2$. Using the relations $[D + \tau(D)] \simeq \mathcal{O}(2l)$ on $S_1$ and $[D + \tau(D)] \simeq \mathcal{O}(2k)$ on $S_2$ shows that the condition (d) in the definition of $N_{k,l}$ is equivalent to (ii) in the definition of $\mathcal{S}_{k,l}$. Therefore there is a 1-1 correspondence between $\mathcal{S}_{k,l}$ and the spectral curves arising from solutions to Nahm's equations in $N_{k,l}$. Now, a pair of spectral curves determines an element of $N_{k,l}$ only once we have chosen $\tau$-invariant isomorphisms $L^{-1}(l)|_{S_1}[-D] \simeq P_1$ and $L^1(k)|_{S_2}[-D] \simeq P_2$ or, equivalently, isomorphisms in (ii) in the definition of $\mathcal{S}_{k,l}$. Conversely, extending a solution to Nahm's equations, which belongs to $N_{k,l}$, by symmetry to $(-2, 0) \cup (0, 2)$ gives isomorphisms of (ii).

The space $N_{k,l}$ carries a natural hyperkähler metric, defined in the same way as for other moduli spaces of solutions to Nahm's equations. This is not, however, the asymptotic monopole metric, which will be defined in Sect. 6.

5. The Complex Structure of $N_{k,l}$

As remarked above (Remark 4.3), $N_{k,l}$ has a natural hyperkähler structure. We wish to describe $N_{k,l}$ as a complex manifold with respect to one of these complex structures (the $SO(3)$-action rotating $T_1, T_2, T_3$ guarantees that all complex structures are equivalent). As usual, such a proof involves identifying the hyperkähler quotient with the complex-symplectic quotient. We have not been able to show that all complex gauge orbits are stable (or equivalently, given Remark 4.3, that all $O(k, \mathbb{C}) \times O(l, \mathbb{C})$-orbits on $F_{k,l}(-1, 1)$ are stable) and so we only describe an open dense subset of $N_{k,l}$.

We set $\alpha = T_0 + iT_1$ and $\beta = T_2 + iT_3$. The Nahm equations can then be written as one complex and one real equation:
$$\frac{d\beta}{dt} = [\beta, \alpha], \tag{5.1}$$
$$\frac{d}{dt}(\alpha + \alpha^*) = [\alpha^*, \alpha] + [\beta^*, \beta]. \tag{5.2}$$
We define $A_{k,l}$ as the space of solutions $(\alpha, \beta) = \big((\alpha_-, \alpha_+), (\beta_-, \beta_+)\big)$ to the complex equation (5.1) on $[-1, 0) \cup (0, 1]$ satisfying condition (b) of the definition of $N_{k,l}$. Moreover $\beta$ (but not necessarily $\alpha$) satisfies conditions (c) and (d) of that definition. The space $A_{k,l}$ is acted upon by the complexified gauge group $\mathcal{G}^{\mathbb{C}}$, i.e. the group of complex gauge transformations satisfying the matching conditions in part (e) of the definition of $N_{k,l}$. Denote by $N^r_{k,l}$ and $A^r_{k,l}$ the subsets where $\beta(\pm 1)$ are regular matrices. We have

Proposition 5.1. $N^r_{k,l} = A^r_{k,l}/\mathcal{G}^{\mathbb{C}}$.
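As a consistency check, written out by us, the splitting into (5.1) and (5.2) can be verified directly from (3.1) using $\alpha = T_0 + iT_1$, $\beta = T_2 + iT_3$ and the skew-hermiticity of the $T_i$ (so that $\alpha^* = -T_0 + iT_1$ and $\beta^* = -T_2 + iT_3$):

```latex
\begin{aligned}
[\beta, \alpha] &= [T_2, T_0] - [T_3, T_1] + i\big([T_3, T_0] + [T_2, T_1]\big)
                 = \dot T_2 + i\,\dot T_3 = \frac{d\beta}{dt}, \\
[\alpha^*, \alpha] + [\beta^*, \beta] &= -2i\,[T_0, T_1] - 2i\,[T_2, T_3]
                 = 2i\,\dot T_1 = \frac{d}{dt}(\alpha + \alpha^*).
\end{aligned}
```

The first line combines the $i = 2, 3$ equations of (3.1) and the second is the $i = 1$ equation.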

Proof. Let $\mathcal{N}_{k,l}$ be the space of solutions to (5.1) and (5.2) satisfying the conditions (a)-(d) of the definition of $N_{k,l}$, so that $N_{k,l} = \mathcal{N}_{k,l}/\mathcal{G}$. We have to show that in every $\mathcal{G}^{\mathbb{C}}$-orbit in $A_{k,l}$ there is a unique $\mathcal{G}$-orbit of an element of $\mathcal{N}_{k,l}$. First we rephrase the problem. Denote by $\tilde A_{k,l}$ (resp. $\tilde{\mathcal{N}}_{k,l}$) the set of solutions to (5.1) (resp. to both (5.1) and (5.2)) on $(-2, 0) \cup (0, 2)$ satisfying the matching conditions of $A_{k,l}$ (resp. $\mathcal{N}_{k,l}$) at 0 and, in addition, $\alpha_\pm(\pm 2 - t) = \alpha_\pm(t)^T$, $\beta_\pm(\pm 2 - t) = \beta_\pm(t)^T$. Denote by $\tilde{\mathcal{G}}^{\mathbb{C}}$ (resp. $\tilde{\mathcal{G}}$) the group of complex (resp. unitary) gauge transformations which satisfy the matching conditions of $\mathcal{G}^{\mathbb{C}}$ (resp. $\mathcal{G}$) at 0 and, in addition, $g(t)^{-1} = g(-2 - t)^T$ if $t \le 0$ and $g(t)^{-1} = g(2 - t)^T$ if $t \ge 0$. We observe that $\tilde A_{k,l}/\tilde{\mathcal{G}}^{\mathbb{C}} = A_{k,l}/\mathcal{G}^{\mathbb{C}}$ and $\tilde{\mathcal{N}}_{k,l}/\tilde{\mathcal{G}} = \mathcal{N}_{k,l}/\mathcal{G}$.


Indeed, the maps from the left-hand to the right-hand spaces are simply restrictions to $[-1, 0) \cup (0, 1]$. To define the inverses, we can use an element of $\mathcal{G}^{\mathbb{C}}$ or $\mathcal{G}$ to make $\alpha_-(-1)$ and $\alpha_+(1)$ symmetric. We now extend the solutions to $(-2, 0) \cup (0, 2)$ by symmetry, i.e. we put $(\alpha_+(t), \beta_+(t)) = \big(\alpha_+(2 - t)^T, \beta_+(2 - t)^T\big)$ for $t \ge 1$ and similarly for $(\alpha_-, \beta_-)$.

We shall show that every $\tilde{\mathcal{G}}^{\mathbb{C}}$-orbit in $\tilde A^r_{k,l}$ contains a unique $\tilde{\mathcal{G}}$-orbit of an element of $\tilde{\mathcal{N}}_{k,l}$. We proceed along the lines of [21]. Given an element of $\tilde A_{k,l}$ and an $h \in GL(m, \mathbb{C})/U(m)$, where $m = \min(k, l)$, we can solve the real equation separately on $(-2, 0)$ and on $(0, 2)$ via a (unique up to action of $\tilde{\mathcal{G}}$) pair of complex gauge transformations $g_-$ on $[-2, 0]$ and $g_+$ on $[0, 2]$ such that
(i) $g_-$ and $g_+$ satisfy the matching condition of $\mathcal{G}^{\mathbb{C}}$ at $t = 0$;
(ii) the upper-diagonal $m \times m$ blocks of $g_-(0)$ and of $g_+(0)$ are both equal to $h$;
(iii) $g_-(-2) = g_-^T(0)^{-1}$ and $g_+(2) = g_+^T(0)^{-1}$.
This is shown exactly as in [15] and in [21]. The condition (iii) and uniqueness guarantee that $g_-(t)^{-1} = g_-(-2 - t)^T$ and $g_+(t)^{-1} = g_+(2 - t)^T$, so that $g_-$ and $g_+$ define an element of $\tilde{\mathcal{G}}^{\mathbb{C}}$. We now need to show that there is a unique $h \in GL(m, \mathbb{C})/U(m)$ for which the resulting solutions to Nahm's equations will satisfy the matching conditions at $t = 0$, i.e. that the jump $\Delta(\tilde\alpha + \tilde\alpha^*)$ of the resulting $\tilde\alpha_\pm = g_\pm\alpha_\pm g_\pm^{-1} - \dot g_\pm g_\pm^{-1}$ at $t = 0$ will vanish. To prove this we need to show two things: that the map $h \mapsto \operatorname{tr}\big(\Delta(\tilde\alpha + \tilde\alpha^*)\big)^2$ is proper and that the differential of $h \mapsto \Delta(\tilde\alpha + \tilde\alpha^*)$ is non-singular.

To prove the properness of $h \mapsto \operatorname{tr}\big(\Delta(\tilde\alpha + \tilde\alpha^*)\big)^2$ we need Lemma 2.19 in [21] in our setting. We observe that Hurtubise's argument goes through as long as we can show that the logarithms of the eigenvalues of $g_-(-1)^*g_-(-1)$ and of $g_+(1)^*g_+(1)$ have a bound independent of $h$. The next two lemmas achieve this.

Lemma 5.2. Let $B$ be a regular symmetric $n \times n$ matrix. The adjoint $O(n, \mathbb{C})$-orbit of $B$ is of the form $O(n, \mathbb{C})/\Gamma$, where $\Gamma$ is a finite subgroup of $O(n, \mathbb{R})$.

Proof. Since $B$ is regular, the stabiliser of $B$ in $GL(n, \mathbb{C})$ is the set of linear combinations of powers of $B$ and hence consists of symmetric matrices. Thus any $g$ which is orthogonal and stabilises $B$ satisfies $g^2 = 1$. Decompose $g$ as $e^{ip}A$, where $p$ is real and skew-symmetric and $A$ is real and orthogonal. Then $e^{ip}$ stabilises $ABA^{-1}$ and repeating the argument we get $p = 0$. Thus $\Gamma$ is a closed subgroup of $O(n, \mathbb{R})$ consisting of elements the square of which is 1, hence discrete, hence finite.

Lemma 5.3. Let $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ be two solutions to (real and complex) Nahm's equations on $[-a, a]$ which differ by a complex gauge transformation $g(t)$, i.e. $(\alpha_2, \beta_2) = g(\alpha_1, \beta_1)$. Suppose in addition that $g(0)$ is orthogonal and that $\beta_1(0)$ is a regular symmetric matrix. Then $1/M \le \operatorname{tr} g^*(0)g(0) \le M$, where $M \in [1, +\infty)$ depends only on $a$ and on the eigenvalues of $\beta_1(0)$.

Proof. The previous lemma shows that, if $g(0)$ tends to infinity in $O(n, \mathbb{C})$, then so does $\beta_2(0) = g(0)\beta_1(0)g(0)^{-1}$ in $\mathfrak{gl}(n, \mathbb{C})$. The proof of Proposition 1.3 in [7] shows, however, that there is a constant $C = C(a)$ such that for any solution $(\alpha, \beta)$ to Nahm's equations on $[-a, a]$, $\operatorname{tr}\beta^*(0)\beta(0) \le C + \sum |d_i|^2$, where $d_i$ are the eigenvalues of $\beta(0)$.

It remains to prove that the differential of $h \mapsto \Delta(\tilde\alpha + \tilde\alpha^*)$ is non-singular. As in [21], we choose a gauge in which $\alpha = \alpha^*$. Let $1 + \rho$ be an infinitesimal complex gauge transformation (i.e. $\rho \in \operatorname{Lie}\tilde{\mathcal{G}}^{\mathbb{C}}$) preserving the Nahm equations with $\rho$ self-adjoint. The


differential of $\Delta(\tilde\alpha + \tilde\alpha^*)$ is then $-2\Delta\dot\rho$. The fact that $\rho$ preserves the Nahm equations implies that $\rho$ satisfies, on both $(-2, 0)$ and $(0, 2)$, the equation
$$\ddot\rho = [\alpha^*, [\alpha, \rho]] + [\beta^*, [\beta, \rho]] - [[\beta^*, \beta], \rho].$$
We compute the $L^2$-norm of $(a, b) = (-\dot\rho + [\rho, \alpha], [\rho, \beta])$ on an interval $[r, s]$ contained in either $[-2, 0]$ or $[0, 2]$:
$$\int_r^s \big\langle -\dot\rho + [\rho, \alpha], -\dot\rho + [\rho, \alpha] \big\rangle + \big\langle [\rho, \beta], [\rho, \beta] \big\rangle = -\operatorname{tr}\dot\rho\rho\,\Big|_r^s. \tag{5.3}$$
Since $\rho(\pm 1)$ is skew-symmetric and $\dot\rho(\pm 1)$ is symmetric, $\operatorname{tr}\dot\rho\rho$ vanishes at $\pm 1$. Were the jump of $\dot\rho$ to vanish at 0, we would get
$$\int_{-1}^{0} \big(\|a\|^2 + \|b\|^2\big) + \int_0^1 \big(\|a\|^2 + \|b\|^2\big) = 0,$$
and hence, in particular, $[\rho, \beta] = 0$ on both $[-1, 0]$ and $[0, 1]$. Then $\rho(1)$ commutes with $\beta(1)$. As $\beta(1)$ is a regular symmetric matrix, its centraliser consists of symmetric matrices and hence $\rho(1)$ is both symmetric and skew-symmetric, hence zero. For the same reason $\rho(-1)$ vanishes. We can now finish the proof as in [21].

One can now identify $N^r_{k,l}$ as a complex affine variety. It is not however a manifold, and for our purposes it is sufficient to identify a subset of $N^r_{k,l}$. We consider sets $A^{rr}_{k,l}$ and the corresponding $N^{rr}_{k,l}$ essentially consisting of those solutions $(\alpha, \beta)$ for which $\beta_-(0)$ and $\beta_+(0)$ do not have a common eigenvector with a common eigenvalue. More precisely, if $k < l$ (resp. $k > l$) we require that there is no $(\lambda, v) \in \mathbb{C} \times \mathbb{C}^k$ (resp. $(\lambda, v) \in \mathbb{C} \times \mathbb{C}^l$) such that $\beta_-(0)v = \lambda v$ (resp. $\beta_+(0)v = \lambda v$) and $\lim_{t \to 0}(\beta_+(t) - \lambda)\tilde v = 0$ (resp. $\lim_{t \to 0}(\beta_-(t) - \lambda)\tilde v = 0$), where $\tilde v = \begin{pmatrix} v \\ 0 \end{pmatrix}$. If $k = l$ and $\beta_+(0) - \beta_-(0) = VW^T$, we only require that $W^Tv \neq 0$ for any eigenvector $v$ of $\beta_-(0)$ (if $V \neq 0$, this is equivalent to $\beta_-(0)$ and $\beta_+(0)$ not having a common eigenvector with a common eigenvalue). We have:

Proposition 5.4. $N^{rr}_{k,l}$ is biholomorphic to $\mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$.

Proof. Given Proposition 5.1, it is enough to show that $\tilde A^{rr}_{k,l}/\tilde{\mathcal{G}}^{\mathbb{C}}$ is biholomorphic to $\mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$.

The case of $k < l$. First of all, just as in [21,9], we use a singular gauge transformation to make $\beta_+(0)$ regular and of the form
$$\beta_+(0) = \begin{pmatrix}
 & & & 0 & \cdots & 0 & g_1 \\
 & \beta_-(0) & & \vdots & & \vdots & \vdots \\
 & & & 0 & \cdots & 0 & g_k \\
f_1 & \cdots & f_k & 0 & \cdots & 0 & e_1 \\
0 & \cdots & 0 & 1 & & & e_2 \\
\vdots & & \vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & & & 1 & e_{l-k}
\end{pmatrix}. \tag{5.4}$$


The quotient $\tilde A^{rr}_{k,l}/\tilde{\mathcal{G}}^{\mathbb{C}}$ becomes the quotient $\tilde B^{rr}_{k,l}/\tilde{\mathcal{G}}^{\mathbb{C}}$, where $\tilde B_{k,l}$ is defined exactly as $\tilde A_{k,l}$, except that the matching condition for $\beta$ at $t = 0$ is now given by (5.4). The superscript $rr$ means now that both $\beta_-(0)$ and $\beta_+(0)$ are regular and do not have a common eigenvector with a common eigenvalue. Since $\beta_-(0)$ is a regular matrix, we can find an element of $\tilde{\mathcal{G}}^{\mathbb{C}}$ which conjugates it to the form:
$$\begin{pmatrix}
0 & \cdots & 0 & b_1 \\
1 & \ddots & & b_2 \\
\vdots & \ddots & 0 & \vdots \\
0 & \cdots & 1 & b_k
\end{pmatrix}. \tag{5.5}$$
The remaining gauge freedoms are gauge transformations in $\tilde{\mathcal{G}}^{\mathbb{C}}$ such that their upper-diagonal block $h$ at $t = 0$ centralises (5.5). We want to use this gauge freedom to make $(f_1, \dots, f_k)$ equal to $(0, \dots, 0, 1)$.

Lemma 5.5. Let $B$ be a matrix of the form (5.5) and let $u = (u_1, \dots, u_k)$ be a covector. There exists an invertible matrix $X$ such that $XBX^{-1} = B$ and $uX^{-1} = (0, \dots, 0, 1)$ if and only if $uv \neq 0$ for any eigenvector $v$ of $B$. If such an $X$ exists, then it is unique.

Proof. Since $(0, \dots, 0, 1)$ is a cyclic covector for $B$, there exists a unique $X$ such that $[X, B] = 0$ and $u = (0, \dots, 0, 1)X$. The problem is the invertibility of $X$. We can write $X$ as $\sum_{i=0}^{k-1} c_iB^i$ for some scalars $c_i$. If we put $B$ in the Jordan form, then it is clear that $\det X \neq 0$ if and only if $\sum_{i=0}^{k-1} c_i\lambda^i \neq 0$ for any eigenvalue $\lambda$ of $B$. Let $v = (v_1, \dots, v_k)^T$ be an eigenvector for $B$ with the eigenvalue $\lambda$. We observe that $Bv = \lambda v$ and $v \neq 0$ implies that $v_k \neq 0$. Since $uv = (0, \dots, 0, 1)Xv = v_k\sum_{i=0}^{k-1} c_i\lambda^i$, we conclude that $\det X \neq 0$ precisely when $uv \neq 0$ for any eigenvector $v$.

Returning to the proof of the proposition, we observe that the condition that $\beta_-(0)$ of the form (5.5) and $\beta_+(0)$ of the form (5.4) do not have a common eigenvector with a common eigenvalue is equivalent to $(f_1, \dots, f_k)v \neq 0$ for any eigenvector $v$ of $\beta_-(0)$. Thanks to the above lemma we can now find a unique gauge transformation in $\tilde{\mathcal{G}}^{\mathbb{C}}$ such that its upper-diagonal block $h$ at $t = 0$ centralises (5.5) and which makes $(f_1, \dots, f_k)$ equal to $(0, \dots, 0, 1)$. The only gauge transformations which preserve this form of $\beta_\pm(0)$ are those which are the identity at $t = 0$ (and hence at $t = \pm 2$). We can now find a unique pair $(g_-, g_+)$ of gauge transformations on $[-2, 0]$ and $[0, 2]$ with $g_\pm(0) = 1$ which make $\alpha$ identically zero.
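The construction in the proof of Lemma 5.5 is easy to carry out explicitly. The sketch below (our own illustration with hypothetical names and made-up data) builds a matrix $B$ of the form (5.5), solves $u = (0, \dots, 0, 1)\sum_i c_iB^i$ for the scalars $c_i$ by back-substitution, and returns $X = \sum_i c_iB^i$. One can then observe that $X$ commutes with $B$, has last row $u$, and kills an eigenvector $v$ of $B$ exactly when $uv = 0$.

```python
from fractions import Fraction


def companion(b):
    """Matrix of the form (5.5): ones on the subdiagonal, last column b."""
    k = len(b)
    m = [[Fraction(0)] * k for _ in range(k)]
    for i in range(1, k):
        m[i][i - 1] = Fraction(1)
    for i in range(k):
        m[i][k - 1] = Fraction(b[i])
    return m


def mat_mul(a, b):
    k = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(k)) for j in range(k)] for i in range(k)]


def centraliser_element(b, u):
    """Return X = sum_i c_i B^i with (0, ..., 0, 1) X = u, where B = companion(b)."""
    k = len(b)
    B = companion(b)
    powers = [[[Fraction(int(i == j)) for j in range(k)] for i in range(k)]]
    for _ in range(k - 1):
        powers.append(mat_mul(powers[-1], B))
    rows = [p[k - 1] for p in powers]       # rows[i] = (0, ..., 0, 1) B^i
    residual = [Fraction(x) for x in u]
    c = [Fraction(0)] * k
    for i in range(k - 1, -1, -1):          # rows[i] has its leading 1 at index k-1-i
        c[i] = residual[k - 1 - i]
        residual = [r - c[i] * x for r, x in zip(residual, rows[i])]
    return [[sum(c[i] * powers[i][r][s] for i in range(k)) for s in range(k)]
            for r in range(k)]


b = [6, -11, 6]                             # B has eigenvalues 1, 2, 3
X = centraliser_element(b, [1, 2, 3])       # last row of X is u = (1, 2, 3)
v = [6, -5, 1]                              # eigenvector of B for the eigenvalue 1
X_singular = centraliser_element(b, [1, 1, -1])   # here u v = 0, so X_singular v = 0
```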
Therefore sending $(\alpha, \beta)$ to $(\beta_+(0), g_-(-2), g_+(2))$ gives a well-defined map from $\tilde A^{rr}_{k,l}/\tilde{\mathcal{G}}^{\mathbb{C}}$ to the set of $(B_+, g_1, g_2) \in \mathfrak{gl}(l, \mathbb{C}) \times GL(k, \mathbb{C}) \times GL(l, \mathbb{C})$, where $B_+$ is of the form (5.4) with $\beta_-(0)$ of the form (5.5), $(f_1, \dots, f_k) = (0, \dots, 0, 1)$, $g_1^{-1}\beta_-(0)g_1 = \beta_-(0)^T$, $g_2^{-1}B_+g_2 = B_+^T$. Let us write $B_-$ for $\beta_-(0)$. We observe that giving $g_1$ with $g_1^{-1}B_-g_1 = B_-^T$ is the same as giving a cyclic covector $w_1$ for $B_-$. The corresponding $g_1$ is $\big((B_-^T)^{k-1}w_1^T, \dots, B_-^Tw_1^T, w_1^T\big)$. The pair $(B_-, w_1)$ corresponds to an element of $\mathrm{Rat}_k\,\mathbb{P}^1$ via the map $(B_-, w_1) \mapsto w_1(z - B_-)^{-1}(1, \dots, 0)^T$. We claim that $(B_+, g_2)$ also corresponds to a unique element of $\mathrm{Rat}_l\,\mathbb{P}^1$. This follows from

Lemma 5.6. Let $B_+$ be a matrix of the form (5.4) with $\beta_-(0)$ of the form (5.5) and $(f_1, \dots, f_k) = (0, \dots, 0, 1)$. There exists an invertible matrix $A$, depending only on $\beta_-(0)$, which conjugates $B_+$ to an $l \times l$ matrix of the form (5.5).
Monopoles and Clusters

689

Proof. Since $B_+$ is regular we can represent it as multiplication by $z$ on $\mathbb{C}[z]/(q_+(z))$, where $q_+(z) = \det(z - B_+)$. Let $q_-(z) = \det(z - B_-)$. In the basis $1, z, \dots, z^{l-1}$, $B_+$ is of the form (5.5), while in the basis $1, z, \dots, z^{k-1}, q_-(z), zq_-(z), \dots, z^{l-k-1}q_-(z)$ it is of the form (5.4) with $\beta_-(0)$ of the form (5.5) and $(f_1, \dots, f_k) = (0, \dots, 0, 1)$.

Therefore we can consider, instead of $(B_+, g_2)$, the pairs $(AB_+A^{-1}, Ag_2A^T)$ and proceed as for $(B_-, g_1)$.

The case of $k > l$. This is exactly symmetric to the previous case.

The case of $k = l$. We have $\beta_+(0) - \beta_-(0) = VW^T$. As in the case $k < l$ we conjugate $\beta_-(0)$ to the form (5.5). By assumption $W^Tv \neq 0$ for any eigenvector $v$ of $\beta_-(0)$, so Lemma 5.5 shows that we can make $W^T$ equal to $(0, \dots, 0, 1)$ by a unique gauge transformation $g(t) \in \tilde{\mathcal{G}}^{\mathbb{C}}$ such that $g(0)$ centralises $\beta_-(0)$. It follows that $\beta_+(0)$ is also of the form (5.5). The remainder of the argument is basically the same as (but simpler than) for $k < l$.

We observe that the above proof identifies the complex symplectic form of $N^{rr}_{k,l}$. If we "double" the metric, i.e. consider solutions on $(-2, 0) \cup (0, 2)$ (just as at the beginning of the proof of Proposition 5.1), then the complex symplectic form is given by
$$\int_{-2}^{0} \operatorname{tr}(d\alpha_- \wedge d\beta_-) + \int_0^2 \operatorname{tr}(d\alpha_+ \wedge d\beta_+) + \operatorname{tr}(dV \wedge dW^T), \tag{5.6}$$
where the last term occurs only if $k = l$. Since this form is invariant under complex gauge transformations, going through the above proof on the set where $\beta_-$ and $\beta_+$ have all eigenvalues distinct (compare also [8,9]) shows that this form on $\mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$ is $-\omega_- + \omega_+$, where $\omega_\pm$ are the standard forms on $\mathrm{Rat}_k\,\mathbb{P}^1$ and $\mathrm{Rat}_l\,\mathbb{P}^1$, given on each factor by (3.7).

We can now prove the existence theorem 4.1. For this we need to consider the correspondence in Proposition 5.4 for different complex structures, i.e. for different $\zeta \in \mathbb{P}^1$. This works essentially as in [8,9] and shows that the denominators of the rational maps trace curves $S_1, S_2$ in $\mathcal{S}_{k,l}$, while the numerator of the first map gives a section $\kappa_1$ of $L^{-2}[\tau(D) - D]$ and the numerator of the second map gives a section $\kappa_2$ of $L^2[\tau(D) - D]$. Setting $\nu_1 = \sigma(\kappa_1)$ and $\nu_2 = \kappa_2$ gives us an element of $M_{k,l}$. Since we had the correspondence between (curves, sections) and rational maps for $N_{k,l}$, we have one for $M_{k,l}$.

Remark 5.7. The proofs of [9] show that a section of the twistor space of $N_{k,l}$ corresponding to $(S_1, \kappa_1, S_2, \kappa_2)$ will lie outside of $N^{rr}_{k,l}$ for $\zeta \in \pi(S_1 \cap S_2)$.

6. The Hyperkähler Structure of $M_{k,l}$

The space $M_{k,l}$ has been defined in such a way that its hypercomplex structure is quite clear: the quadruples $(S_1, \nu_1, S_2, \nu_2)$ are canonically sections of a twistor space. We can describe this twistor space by changing the real structure (and, hence, sections) of the twistor space of $N_{k,l}$. As already mentioned (Remark 4.3), the space $N_{k,l}$, being a moduli space of solutions to Nahm's equations, has a natural (singular) hyperkähler structure. Let us double the metric on $N_{k,l}$ by considering solutions on $(-2, 0) \cup (0, 2)$ just as at the beginning of the proof of Proposition 5.1. Let $p : Z(N_{k,l}) \to \mathbb{P}^1$ be the twistor space of this hyperkähler structure. The fibers of $p$ correspond to $N_{k,l}$ with different complex structures and so, by


R. Bielawski

Proposition 5.4, each fiber has an open dense subset isomorphic to $\mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$. The real sections correspond to solutions of Nahm’s equations and, by the arguments of the previous two sections, to quadruples $(S_1, \kappa_1, S_2, \kappa_2)$, where $(S_1, S_2) \in S_{k,l}$, $\kappa_1$ is a norm 1 section of $L^{-2}[\tau(D) - D]$ on $S_1$ and $\kappa_2$ a norm 1 section of $L^{2}[\tau(D) - D]$ on $S_2$ (at least on the open dense subset of $N_{k,l}$).

Consider the mapping

\[ T : Z_{N_{k,l}} \to Z_{N_{k,l}} \tag{6.1} \]

defined in the following way. Let $\chi = (S_1, \kappa_1, S_2, \kappa_2)$ be the unique real section passing through a point $n \in p^{-1}(\zeta)$ corresponding to the pair $(f_1, f_2) \in \mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$. If $\zeta \neq \infty$ and $\pi^{-1}(\zeta) \cap (S_1 \cup S_2)$ consists of distinct points, then we can identify the numerator of $f_1$ with the unique polynomial taking values $\kappa_1(\zeta, \eta_i)$ at points $\eta_i$, where $(\zeta, \eta_i) \in \pi^{-1}(\zeta) \cap S_1$ (where, once again, we think of $\kappa_1$ as a pair of analytic functions in the standard trivialisation in $U_0$, $U_\infty$). Define $T(n) \in p^{-1}(\zeta)$ as $(g_1, g_2) \in \mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$, where $g_2 = f_2$, the denominator of $g_1$ is the same as the denominator of $f_1$, and the numerator of $g_1$ is the unique polynomial taking values $\sigma(\kappa_1)(\zeta, \eta_i)$ at points $\eta_i$ ($\sigma$ is given in (2.10)). We can extend $T$ by continuity to the remaining points of the fiber $p^{-1}(\zeta)$ and, by doing the same over $U_\infty$, to $\zeta = \infty$. Observe that $T^2 = \mathrm{Id}$.

Let $\tau$ denote the real structure of $Z_{N_{k,l}}$. We define a new real structure by $\tau' = T \circ \tau \circ T^{-1}$ and define $Z$ as $Z_{N_{k,l}}$ with real structure $\tau'$. The points of $M_{k,l}$ are real sections of $Z$, since they are of the form $T(\chi)$, where $\chi = (S_1, \kappa_1, S_2, \kappa_2)$ is a real section of $Z_{N_{k,l}}$. The normal bundle of each $T(\chi)$ must be a direct sum of $\mathcal{O}(1)$’s, since through every two points in distinct fibres there passes a unique section (as this is true for the normal bundle of $\chi$). Therefore we have a hypercomplex structure on $M_{k,l}$.
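The map $T$ above relies on the fact that, once the denominator of a rational map has distinct roots $\eta_i$, its numerator is fixed by its values at those roots. The following is a minimal sketch of that interpolation step in exact arithmetic. The function names and the sample data are illustrative assumptions, and `stand_in_sigma` is a hypothetical involution chosen only to mimic $T^2 = \mathrm{Id}$; it is not the map $\sigma$ of (2.10).

```python
from fractions import Fraction


def times_linear(p, root):
    """Multiply the polynomial p (coefficients, lowest degree first) by (x - root)."""
    out = [Fraction(0)] * (len(p) + 1)
    for m, c in enumerate(p):
        out[m + 1] += c
        out[m] -= root * c
    return out


def numerator_from_values(poles, values):
    """Unique polynomial of degree < k taking values[i] at poles[i] (poles distinct)."""
    k = len(poles)
    result = [Fraction(0)] * k
    for i in range(k):
        basis = [Fraction(1)]          # Lagrange basis polynomial, built factor by factor
        scale = Fraction(values[i])
        for j in range(k):
            if j != i:
                basis = times_linear(basis, Fraction(poles[j]))
                scale /= Fraction(poles[i]) - Fraction(poles[j])
        for m, c in enumerate(basis):
            result[m] += scale * c
    return result


def evaluate(p, x):
    """Horner evaluation of p at x."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc


def stand_in_sigma(value):
    # Hypothetical involution for illustration only; NOT the sigma of (2.10).
    return 1 / Fraction(value)


poles = [0, 1, 2]    # roots of the denominator (sample data)
values = [1, 3, 7]   # prescribed values of the numerator at those roots

numerator = numerator_from_values(poles, values)
swapped = numerator_from_values(poles, [stand_in_sigma(v) for v in values])
restored = numerator_from_values(
    poles, [stand_in_sigma(evaluate(swapped, Fraction(x))) for x in poles]
)
```

Applying the value-wise involution twice returns the original numerator, which is the elementary mechanism behind $T^2 = \mathrm{Id}$ on the open set where the roots are distinct.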
Finally, we modify the fibre-wise symplectic form on $Z_{N_{k,l}}$ by taking $\omega_+ + \omega_-$ on each fiber (compare with the remark after (5.6)). This is an $\mathcal{O}(2)$-valued symplectic form $\omega$ on $Z$ and, evaluated on real sections of $TZ$, $\omega$ gives real sections of $\mathcal{O}(2)$. Thus we obtain a (pseudo)-hyperkähler metric on $M_{k,l}$ (which may be degenerate):

Theorem 6.1. The space $M_{k,l}$ carries a canonical hypercomplex structure. With respect to each complex structure an open dense subset of $M_{k,l}$ can be identified with $\mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$. In addition, there is a pseudo-hyperkähler metric (with degeneracies) on $M_{k,l}$ compatible with the hypercomplex structure. The Kähler form corresponding to $\zeta_0$ of the hyperkähler metric is given (on an open dense set, where the roots of each rational map are distinct) by the linear term in the power series expansion of

\[ \sum_{i=1}^{k} \frac{d\nu_1(\zeta, \eta_i)}{\nu_1(\zeta, \eta_i)} \wedge d\eta_i + \sum_{i=k+1}^{k+l} \frac{d\nu_2(\zeta, \eta_i)}{\nu_2(\zeta, \eta_i)} \wedge d\eta_i \]

around $\zeta_0$, where $(\zeta, \eta_1), \dots, (\zeta, \eta_k)$ are the points of $\pi^{-1}(\zeta) \cap S_1$ and $(\zeta, \eta_{k+1}), \dots, (\zeta, \eta_{k+l})$ are the points of $\pi^{-1}(\zeta) \cap S_2$.

Remark 6.2. The above construction of a hypercomplex structure via a change of real structure of the twistor space can be seen already in the twistor space description of Taub-NUT metrics in Besse [5], Sect. 13.87. There a change of real structure leads to replacing the Taub-NUT metric with a positive mass parameter by one with a negative mass parameter. It is known that the Taub-NUT metric with a negative mass parameter is the asymptotic metric of charge 2 monopoles [3,28].

Monopoles and Clusters


7. $M_{k,l}$ as a Hyperkähler Quotient

We wish to expand Remark 4.3. The moduli space $M_n$ of $SU(2)$-monopoles of charge $n$ can be obtained as a hyperkähler quotient of a moduli space $\widetilde{M}_n$ of $SU(n+1)$-monopoles with minimal symmetry breaking (see [14] for the case $n = 2$). Namely, $\widetilde{M}_n$ is defined as the space of solutions to Nahm’s equations on $(0, 1]$, which have a simple pole at $t = 0$ with residues defining the standard irreducible representation of $\mathfrak{su}(2)$, modulo gauge transformations which are identity at $t = 0, 1$. The gauge transformations which are orthogonal at $t = 1$ induce an action of $O(n, \mathbb{R})$ on $\widetilde{M}_n$, and $M_n$ is the hyperkähler quotient of $\widetilde{M}_n$ by $O(n, \mathbb{R})$.

The nice thing about $\widetilde{M}_n$ is that the spectral curves involved do not need to satisfy any transcendental or even closed conditions: $\widetilde{M}_n$ is a principal $U(n)$-bundle over an open subset of all real spectral curves. We now define an analogous space for $M_{k,l}$. It should be viewed as given by generic pairs of spectral curves with framing being $U(k) \times U(l)$.

We consider first the space $F_{k,l}$, already described in Remark 4.3. It is defined in the same way as $N_{k,l}$ (cf. Sect. 4), except that the condition (d) is removed and the orthogonality condition in (e) is replaced by $g(\pm 1) = 1$. In other words, $F_{k,l}$ consists of $\mathfrak{u}(k)$-valued solutions to Nahm’s equations on $[-1, 0)$ and of $\mathfrak{u}(l)$-valued solutions on $(0, 1]$, satisfying the matching conditions of $N_{k,l}$ at $t = 0$, but arbitrary at $t = \pm 1$, modulo gauge transformations which are identity at $t = \pm 1$ (and satisfy the matching condition of $N_{k,l}$ at $t = 0$). $F_{k,l}$ is a hyperkähler manifold [9] and $N_{k,l}$ is the hyperkähler quotient of $F_{k,l}$ by $O(k, \mathbb{R}) \times O(l, \mathbb{R})$ (the action is defined by allowing gauge transformations which are orthogonal at $t = \pm 1$). The set of spectral curves, defined by elements of $F_{k,l}$, is given by:

Definition 7.1.
We denote by $S_{k,l}$ the space of pairs $(S_1, S_2)$ of real curves $S_1 \in |\mathcal{O}(2k)|$, $S_2 \in |\mathcal{O}(2l)|$, of the form (2.1), without common components, such that $S_1 \cap S_2 = D + \tau(D)$, $\operatorname{supp} D \cap \operatorname{supp} \tau(D) = \emptyset$, so that
(i) $H^0(S_1, L^t(k+l-2)[-\tau(D)]) = 0$ and $H^0(S_2, L^t(k+l-2)[-D]) = 0$ for $t \in (0, 1]$. In addition, if $k \le l$ (resp. $l \le k$), then $H^0(S_1, \mathcal{O}(k+l-2)[-\tau(D)]) = 0$ (resp. $H^0(S_2, \mathcal{O}(k+l-2)[-D]) = 0$).
(ii) $L^t(k+l-2)[-\tau(D)]$ on $S_1$ and $L^t(k+l-2)[-D]$ on $S_2$ are positive-definite in the sense of Definition 2.7 for every $t$.

One can show that $F_{k,l}$ is a $U(k) \times U(l)$-bundle over $S_{k,l}$, but we shall not need this. What we do need is the complex structure of $F_{k,l}$ or, rather, of its open subset $F^{rr}_{k,l}$, defined in exactly the same way as $N^{rr}_{k,l}$. As in Sect. 5, we fix a complex structure and write Nahm’s equations as the complex one and the real one. According to [9], $F_{k,l}$ is biholomorphic to $W \times GL(l, \mathbb{C})$, where, for $k < l$, $W$ is the set of matrices of the form (5.4), while for $k = l$, $W$ is the set $\{(B_-, B_+, V, W) \in \mathfrak{gl}(l, \mathbb{C})^2 \times (\mathbb{C}^l)^2;\ B_+ - B_- = VW^T\}$. Thus $F_{k,l}$ is biholomorphic to $GL(l, \mathbb{C}) \times \mathfrak{gl}(k, \mathbb{C}) \times \mathbb{C}^{k+l}$. On the other hand, the proof of Proposition 5.4 furnishes a different biholomorphism for $F^{rr}_{k,l}$:

Proposition 7.2. $F^{rr}_{k,l}$ is biholomorphic to $\mathbb{C}^k \times GL(k, \mathbb{C}) \times \mathbb{C}^l \times GL(l, \mathbb{C})$.

Proof. This is the same argument as in the proof of Proposition 5.4. We can uniquely conjugate $\beta_+(0)$ to a matrix $B_+$ of the form (5.4) (resp. (5.5)) if $k < l$ (resp. $k \ge l$), with $\beta_-(0)$ being a matrix $B_-$ of the form (5.5) if $k \le l$ and of the form (5.4) if $k > l$, and $(f_1, \dots, f_k) = (0, \dots, 0, 1)$ in both cases. There is a unique pair $(g_-, g_+)$ of gauge transformations on $[-1, 0]$ and $[0, 1]$ with $g_\pm(0) = 1$ which make $\alpha$ identically zero. Thus $g_-^{-1}(-1)B_-g_-(-1) = \beta_-(-1)$ and $g_+^{-1}(1)B_+g_+(1) = \beta_+(1)$. The desired biholomorphism is given by associating to a solution $(\alpha(t), \beta(t))$ the invertible matrices $g_-(-1)$, $g_+(+1)$ and the characteristic polynomials of $B_-$ and $B_+$.

8. Spaces of Curves and Divisors

This section is largely technical, given to fix the notation and introduce certain notions needed later on.

8.1. The Douady space of $\mathbb{C}^2$. According to [30] and [12], the Douady space $(\mathbb{C}^2)^{[m]}$, parameterising 0-dimensional complex subspaces of length $m$ in $\mathbb{C}^2$, can be represented by the manifold $H_m$ of $GL(m, \mathbb{C})$-equivalence classes of

\[ \tilde{H}_m = \left\{ (A, B, v) \in \mathfrak{gl}(m, \mathbb{C})^2 \times \mathbb{C}^m;\ [A, B] = 0,\ \mathbb{C}^m = \operatorname{Span}\{A^iB^jv\}_{i,j \in \mathbb{N}} \right\}. \tag{8.1} \]

The correspondence is induced by the $GL(m, \mathbb{C})$-invariant map $\tilde{H}_m \to (\mathbb{C}^2)^{[m]}$, which assigns to $(A, B, v)$ the complex space $Z$, the support of which are the pairs of eigenvalues of $A$ and $B$ ($A$ and $B$ commute), with $\mathcal{O}_Z = \mathcal{O}(U)/I$, where $U$ is a neighbourhood of $\operatorname{supp} Z$ and $I$ is the kernel of the map

\[ \psi : \mathcal{O}(U) \to \mathbb{C}^m, \qquad \psi(f) = f(A, B)v. \tag{8.2} \]
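To make (8.1) and (8.2) concrete, here is a small sketch for $m = 2$ that evaluates $\psi(f) = f(A, B)v$ for a polynomial $f$ and tests the spanning condition of (8.1). The helper names, the encoding of $f$ as a dictionary `{(i, j): coefficient}` for $\sum c_{ij}\zeta^i\eta^j$, and the sample triples are assumptions made for illustration. By the Cayley-Hamilton theorem it suffices to test the powers $i, j < m$.

```python
from fractions import Fraction
from itertools import product


def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_vec(X, v):
    return [sum(x * y for x, y in zip(row, v)) for row in X]


def mat_pow(X, e):
    n = len(X)
    R = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = mat_mul(R, X)
    return R


def psi(f, A, B, v):
    """psi(f) = f(A, B) v, with f given as {(i, j): c} meaning sum of c * zeta^i * eta^j."""
    out = [Fraction(0)] * len(v)
    for (i, j), c in f.items():
        w = mat_vec(mat_mul(mat_pow(A, i), mat_pow(B, j)), v)
        out = [o + c * wi for o, wi in zip(out, w)]
    return out


def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def spans(A, B, v):
    """Check the condition C^m = Span{A^i B^j v} from (8.1)."""
    m = len(v)
    vectors = [psi({(i, j): 1}, A, B, v) for i, j in product(range(m), repeat=2)]
    return rank(vectors) == m


# Two reduced points (1, 3) and (2, 5): psi(f) lists the values of f at the points.
A_red, B_red, v_red = [[1, 0], [0, 2]], [[3, 0], [0, 5]], [1, 1]
# A double point at the origin, tangent to the zeta-axis.
A_nil, B_nil, v_nil = [[0, 1], [0, 0]], [[0, 0], [0, 0]], [0, 1]
```

In the first example the kernel of $\psi$ consists of the functions vanishing at the two points. In the second, $\psi(f) = f(0,0)\,v + \partial_\zeta f(0,0)\,Av$, so the kernel records the value and the $\zeta$-derivative at the origin.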

Let $Y \subset (\mathbb{C}^2)^{[m]} \times \mathbb{C}^2$ be the tautological subspace (i.e. $(Z, t) \in Y \iff t \in \operatorname{supp} Z$) and let $W_m$ be the pushdown of the structure sheaf of $Y$ onto $(\mathbb{C}^2)^{[m]}$. As a vector bundle, the fibre of $W_m$ at $Z \in (\mathbb{C}^2)^{[m]}$ is $H^0(Z, \mathcal{O}_Z)$. Following Nakajima [30], we call $W_m$ the tautological vector bundle. In the above matricial model, $W_m$ is the vector bundle associated to the principal $GL(m, \mathbb{C})$-bundle $\tilde{H}_m$ over $H_m$.

The next step is to make $W_m$ into a Hermitian vector bundle. Given the usual correspondence between the complex quotient of the set of stable points and the Kähler quotient, we can identify (cf. [30]) $H_m$ with the manifold of $U(m)$-equivalence classes of

\[ \hat{H}_m = \left\{ (A, B, v) \in \mathfrak{gl}(m, \mathbb{C})^2 \times \mathbb{C}^m;\ [A, B] = 0,\ [A, A^*] + [B, B^*] + vv^* = 1 \right\}. \]

The bundle $W_m$ is now isomorphic to $\hat{H}_m \times_{U(m)} \mathbb{C}^m$ and, hence, it inherits a Hermitian metric from the standard metric on $\mathbb{C}^m$. More explicitly, this metric is defined as follows. Let $Z \in (\mathbb{C}^2)^{[m]}$ be represented by $(A, B, v)$ satisfying both equations in the definition of $\hat{H}_m$, and let $\bar{f}, \bar{g} \in \mathcal{O}_Z = \mathcal{O}(U)/I$ be represented by $f, g \in \mathcal{O}(U)$. Then:

\[ \langle \bar{f}, \bar{g} \rangle = \langle f(A, B)v, g(A, B)v \rangle, \tag{8.3} \]

where the second metric is the standard Hermitian inner product on $\mathbb{C}^m$.


8.2. The Douady space of $T\mathbb{P}^1$. We consider now the Douady space $T^{[m]}$ of $T = T\mathbb{P}^1$, parameterising 0-dimensional complex subspaces of length $m$ in $T$. Recall that $T = T\mathbb{P}^1$ is obtained by glueing together two copies $U_0$, $U_\infty$ of $\mathbb{C}^2$. According to [12], we obtain $T^{[m]}$ by an analogous glueing of $U_0^{[m]}$ and $U_\infty^{[m]}$. We take two copies $\tilde{H}_m^0$, $\tilde{H}_m^\infty$ of (8.1), with “coordinates” $A, B, v$ and $\tilde{A}, \tilde{B}, \tilde{v}$, and glue them together over the subset $\det A \neq 0 \neq \det \tilde{A}$ by:

\[ \tilde{A} = A^{-1}, \qquad \tilde{B} = BA^{-2}, \qquad \tilde{v} = v. \]
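The glueing rule can be checked on a small example. The sketch below, for $2 \times 2$ matrices with rational entries, applies $(A, B, v) \mapsto (A^{-1}, BA^{-2}, v)$ and verifies that the image pair still commutes, that the joint eigenvalues transform as $(\zeta, \eta) \mapsto (1/\zeta, \eta/\zeta^2)$ (the usual change of coordinates between the two charts of $T\mathbb{P}^1$), and that glueing twice gives back the original triple. The function names and the sample matrices are illustrative assumptions.

```python
from fractions import Fraction


def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inverse(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    if det == 0:
        raise ValueError("the glueing is only defined where det A != 0")
    det = Fraction(det)
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]


def glue(A, B, v):
    """(A, B, v) -> (A^{-1}, B A^{-2}, v), the glueing rule of Sect. 8.2."""
    A_inv = inverse(A)
    return A_inv, mul(B, mul(A_inv, A_inv)), v


# Sample commuting pair: B = A^2 + 3A, with joint eigenvalue (zeta, eta) = (2, 10).
A = [[2, 1], [0, 2]]
B = [[10, 7], [0, 10]]
v = [0, 1]

A_t, B_t, v_t = glue(A, B, v)
A_back, B_back, v_back = glue(A_t, B_t, v_t)
```

Here the joint eigenvalue $(2, 10)$ is sent to $(1/2, 10/4)$, in agreement with $\tilde\eta = \eta/\zeta^2$.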

Call the resulting manifold $\tilde{T}_m$. The glueing is $GL(m, \mathbb{C})$-equivariant and we obtain a manifold $T_m = \tilde{T}_m/GL(m, \mathbb{C})$ which represents $T^{[m]}$. The tautological bundle $W_m$ over $T^{[m]}$ is the vector bundle associated to the principal $GL(m, \mathbb{C})$-bundle $\tilde{T}_m$ over $T_m$.

Remark 8.1. Unsurprisingly, one cannot glue together the unitary descriptions of $U_0^{[m]}$, $U_\infty^{[m]}$. In particular, we do not have a natural Hermitian metric on $W_m$ over $T^{[m]}$.

8.3. Curves and divisors. Let $C_n$ denote the space of all curves $S \in |\mathcal{O}(2n)|$, i.e. the space of polynomials of the form (2.1). Thus, $C_n \simeq \mathbb{C}^{n^2+2n}$. Let $Y_n \subset T \times C_n$ be the resulting correspondence, i.e.

\[ Y_n = \{(t, S) \in T \times C_n;\ t \in S\}. \tag{8.4} \]

We have the two projections: $p_1 : Y_n \to T$ and $p_2 : Y_n \to C_n$. We denote by $Y_{n,m}$ the relative $m$-Douady space for $p_2 : Y_n \to C_n$. It is a complex space [32] with a projection $p : Y_{n,m} \to C_n$, and its points are pairs $(S, \Delta)$, where $S \in C_n$ and $\Delta$ is an effective Cartier divisor of degree $m$ on $S$. There is a natural holomorphic map

\[ \phi : Y_{n,m} \to T^{[m]}, \tag{8.5} \]

which assigns to $(S, \Delta)$ the complex subspace $Z = (\operatorname{supp} \Delta, \mathcal{O}_\Delta)$, where $\mathcal{O}_\Delta$ is given by the ideal generated by $\Delta$ (as a Cartier divisor) and the polynomial (2.1) defining $S$. We have two canonical subsets of $Y_{n,m}$:

\[ Y^0_{n,m} = \{(S, \Delta);\ \infty \notin \pi(\operatorname{supp} \Delta)\}, \qquad Y^\infty_{n,m} = \{(S, \Delta);\ 0 \notin \pi(\operatorname{supp} \Delta)\}. \tag{8.6} \]

The map $\phi$ maps $Y^0_{n,m}$ into $U_0^{[m]}$ and $Y^\infty_{n,m}$ into $U_\infty^{[m]}$.

8.4. Line bundles. Let now $E$ be a line bundle on $T\mathbb{P}^1$, the transition function of which from $U_0$ to $U_\infty$ is $\rho(\zeta, \eta)$. We fix a trivialisation of $E$ on $U_0$, $U_\infty$ (since $H^0(T\mathbb{P}^1, \mathcal{O}) = \mathbb{C}$, such a trivialisation of $E$ on $U_0$, $U_\infty$ is determined up to a constant factor). For any $(S, \Delta) \in Y^0_{n,m}$, we obtain a map

\[ \Phi : H^0(S, E|_S) \to H^0(\operatorname{supp} \Delta, \mathcal{O}_\Delta), \tag{8.7} \]

from $H^0(S, E|_S)$ to the fibre of $W_m$ over $\phi(S, \Delta)$ by first representing a section by a pair of holomorphic functions $s_0, s_\infty$ on $U_0 \cap S$, $U_\infty \cap S$, satisfying $s_\infty = \rho s_0$ on $U_0 \cap U_\infty \cap S$, and taking an extension of $s_0$ to some neighbourhood $U$ of $U_0 \cap S$ in $U_0$.


If we denote by $\mathcal{E}$ the linear space over $Y_{n,m}$, the fibre of which over $(S, \Delta)$ is $H^0(S, E|_S)$ (i.e. $\mathcal{E}$ is the pullback of the analogously defined linear space over $C_n$), then $\Phi$ makes the following diagram commute:

\[ \begin{array}{ccc} \mathcal{E} & \xrightarrow{\ \Phi\ } & W_m \\ \downarrow & & \downarrow \\ Y^0_{n,m} & \xrightarrow{\ \phi\ } & U_0^{[m]}. \end{array} \tag{8.8} \]

Obviously the above discussion holds for $Y^\infty_{n,m}$ as well.

We now specialise to the case $E = F(n + p - 1)$, where $F$ is a line bundle on $T\mathbb{P}^1$ with $c_1(F) = 0$. Let $S \in |\mathcal{O}(2n)|$ be of the form (2.1), and let $\Delta$ be an effective divisor on $S$ of degree $pn$ such that

\[ H^0(S, F(n + p - 2)[-\Delta]) = 0. \tag{8.9} \]

Let $\zeta_0 \in \mathbb{P}^1 - \pi(\operatorname{supp} \Delta)$ and let $D_{\zeta_0} = S \cap (\zeta - \zeta_0)$ be the divisor of points lying over $\zeta_0$. We write

\[ V = H^0(S, F(n + p - 1)), \quad V_\Delta = H^0(S, F(n + p - 1)[-\Delta]), \quad V_{\zeta_0} = H^0(S, F(n + p - 1)[-D_{\zeta_0}]). \tag{8.10} \]

The condition (8.9) and the fact that $F(n + p - 2)[-\Delta]$ has degree equal to $\operatorname{genus}(S) - 1$ imply that the first cohomology of $F(n + p - 2)[-\Delta]$ vanishes. Therefore, the first cohomology of $F(n + p - 2)$ and of $F(n + p - 1)$ vanish as well. Consequently $\dim V = np + n$ and $\dim V_\Delta = n$. Since $[D_{\zeta_0}] = \mathcal{O}_S(1)$, $\dim V_{\zeta_0} = np$, and

\[ H^0(S, F(n + p - 1)[-\Delta - D_{\zeta_0}]) = H^0(S, F(n + p - 2)[-\Delta]) = 0, \]

we have that

\[ V = V_\Delta \oplus V_{\zeta_0}. \tag{8.11} \]

Moreover, we have an isomorphism:

\[ V_{\zeta_0} \longrightarrow H^0(\operatorname{supp} \Delta, F(n + p - 1)[-D_{\zeta_0}]). \tag{8.12} \]

Definition 8.2. We write $Y_{n,pn}(F)$ for the subset of $Y_{n,pn}$ on which (8.9) is satisfied. If $\zeta_0 \in \mathbb{P}^1$, then we write $Y_{n,pn}(\zeta_0)$ for the subset of $Y_{n,pn}$ on which $\zeta_0 \notin \pi(\operatorname{supp} \Delta)$. We also write $Y_{n,pn}(F, \zeta_0) = Y_{n,pn}(\zeta_0) \cap Y_{n,pn}(F)$ and we use the superscripts $0, \infty$ to denote the intersections of any of these sets with $Y^0_{n,pn}$ or $Y^\infty_{n,pn}$. We write $V$, $V_\Delta$, $V_{\zeta_0}$ for the vector bundles over $Y_{n,pn}(\zeta_0)$, the fibres of which over $(S, \Delta)$ are, respectively, the vector spaces $V$, $V_\Delta$, $V_{\zeta_0}$, given by (8.10).

If $\zeta_0 \neq \infty$, then the isomorphism (8.12) can be interpreted as the top map in (8.8) for $E = F(n + p - 1)[-D_{\zeta_0}]$. In particular, we obtain a Hermitian metric on $V_{\zeta_0}$ over $Y^0_{n,pn}(F, \zeta_0)$. Similarly, if $\zeta_0 \neq 0$, then we obtain a Hermitian metric on $V_{\zeta_0}$ over $Y^\infty_{n,pn}(F, \zeta_0)$. We finally specialise to the case $F = L^t$ and we write, for any interval $I$:

\[ Y_{n,pn}(I) = \bigcap_{t \in I} Y_{n,pn}(L^t). \tag{8.13} \]

The notation $Y_{n,pn}(I, \zeta_0)$, $Y^0_{n,pn}(I, \zeta_0)$ and $Y^\infty_{n,pn}(I, \zeta_0)$ is then self-explanatory.


8.5. Translations. Let $c(\zeta)$ be a quadratic polynomial, viewed as a section of $\pi^*\mathcal{O}(2)$ on $T$. It induces a fibrewise translation on $T$: $(\zeta, \eta) \mapsto (\zeta, \eta + c(\zeta))$, which in turn induces a translation $t_{c(\zeta)} : Y_{n,m} \to Y_{n,m}$. We have a similar map on $T^{[m]}$, given by

\[ \tilde{H}_m \ni (A, B) \mapsto (A, B + c(A)) \in \tilde{H}_m. \tag{8.14} \]

We denote this map also by $t_{c(\zeta)}$. The following diagram commutes:

\[ \begin{array}{ccc} Y_{n,m} & \xrightarrow{\ \phi\ } & T^{[m]} \\ \downarrow{\scriptstyle t_{c(\zeta)}} & & \downarrow{\scriptstyle t_{c(\zeta)}} \\ Y_{n,m} & \xrightarrow{\ \phi\ } & T^{[m]}. \end{array} \tag{8.15} \]

The formula (8.14) defines a map on the tautological bundle $W_m$ over $T^{[m]}$. In terms of $\mathcal{O}_Z$, $Z$ being a 0-dimensional subspace of length $m$, this map is given by

\[ f(\zeta, \eta) \mapsto f(\zeta, \eta + c(\zeta)). \tag{8.16} \]

We remark that this last map is not an isometry over $U_0^{[m]}$ or over $U_\infty^{[m]}$.
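As a sanity check on (8.14), the following sketch verifies on a $2 \times 2$ example that the translation preserves commutativity and is compatible with the glueing of Sect. 8.2, provided the quadratic $c(\zeta) = c_0 + c_1\zeta + c_2\zeta^2$ is rewritten in the other chart as $\tilde{c}(\tilde\zeta) = \tilde\zeta^2 c(1/\tilde\zeta) = c_2 + c_1\tilde\zeta + c_0\tilde\zeta^2$, which is how a section of $\mathcal{O}(2)$ transforms. The helper names and the sample data are assumptions made for illustration.

```python
from fractions import Fraction

IDENTITY = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]


def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def scale(s, X):
    return [[s * X[i][j] for j in range(2)] for i in range(2)]


def inverse(X):
    det = Fraction(X[0][0] * X[1][1] - X[0][1] * X[1][0])
    if det == 0:
        raise ValueError("the glueing is only defined where det A != 0")
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]


def quadratic(c, A):
    """c(A) = c0 + c1 A + c2 A^2 for c = (c0, c1, c2)."""
    c0, c1, c2 = c
    return add(add(scale(c0, IDENTITY), scale(c1, A)), scale(c2, mul(A, A)))


def translate(c, A, B):
    """The map (8.14): (A, B) -> (A, B + c(A))."""
    return A, add(B, quadratic(c, A))


def glue(A, B):
    """The chart change of Sect. 8.2 on the pair (A, B)."""
    A_inv = inverse(A)
    return A_inv, mul(B, mul(A_inv, A_inv))


A = [[2, 1], [0, 2]]
B = [[10, 7], [0, 10]]      # commutes with A; joint eigenvalue (zeta, eta) = (2, 10)
c = (1, -3, 2)              # c(zeta) = 1 - 3 zeta + 2 zeta^2, so c(2) = 3

A_t, B_t = translate(c, A, B)
```

The joint eigenvalue moves from $(2, 10)$ to $(2, 13)$, as expected from $\eta \mapsto \eta + c(\zeta)$, and translating before or after the chart change gives the same result.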

9. Asymptotics of Curves

In this section, we consider the asymptotic behaviour of two spectral curves, the centres of which move away from each other.

We define first an $SO(3)$-invariant distance function between curves in $C_n$. On $\mathbb{P}^1$ distance is measured in the standard round Riemannian metric of diameter $\pi$ on $S^2$. This induces a fibrewise inner product on $T\mathbb{P}^1$. Let $d_H$ be the induced fibrewise Hausdorff distance between sets and $\pi : T\mathbb{P}^1 \to \mathbb{P}^1$ be the projection. For two curves $S, S'$ in $|\mathcal{O}(2n)|$ we define their distance $d(S, S')$ by

\[ d(S, S') = \max\left\{ d_H\left(S \cap \pi^{-1}(w), S' \cap \pi^{-1}(w)\right);\ w \in S^2 \right\}. \tag{9.1} \]

The distance $d$ is equivalent to the supremum of the Euclidean distance between roots of the polynomials (2.1) defining $S$, $S'$ as we vary $\zeta$ over a relatively compact open set. For a curve $S \in C_n$, given in $U_0$ by the equation $\eta^n + a_1(\zeta)\eta^{n-1} + \dots + a_{n-1}(\zeta)\eta + a_n(\zeta) = 0$, we define its centre as

\[ c(\zeta) = a_1(\zeta)/n. \tag{9.2} \]

In addition, we set

\[ C(S) = \{(\zeta, \eta);\ (\eta + c(\zeta))^n = 0\}. \tag{9.3} \]


We shall consider next a pair of real curves $S_1 \in |\mathcal{O}(2k)|$ and $S_2 \in |\mathcal{O}(2l)|$. Let $c_1(\zeta)$, $c_2(\zeta)$ be their centres. These are quadratic polynomials invariant under the antipodal map, and we write $c_1(\zeta) = z_1 + 2x_1\zeta - \bar{z}_1\zeta^2$, $c_2(\zeta) = z_2 + 2x_2\zeta - \bar{z}_2\zeta^2$. Let

\[ R = R(S_1, S_2) = \sqrt{(x_1 - x_2)^2 + |z_1 - z_2|^2} \tag{9.4} \]

be the distance between the centres and let

\[ \zeta_{12} = \frac{x_1 - x_2 + R}{\bar{z}_1 - \bar{z}_2} \quad\text{and}\quad \zeta_{21} = \frac{x_1 - x_2 - R}{\bar{z}_1 - \bar{z}_2} \tag{9.5} \]

be the two intersection points of the polynomials $c_1(\zeta)$ and $c_2(\zeta)$, i.e. the two opposite directions between the centres. Recall that $S_1 \cap S_2$ denotes a complex subspace of $T$, and, in an appropriate context, a Cartier divisor on $S_1$ or $S_2$. Recall the set $S_{k,l}$ of pairs of curves (plus a choice of a divisor $D$) defined in Definition 7.1. For every $K > 0$ we define the following region of $S_{k,l}$:

\[ S_{k,l}(K) = \left\{ (S_1, S_2) \in S_{k,l};\ d(S_i, C(S_i)) \le K,\ i = 1, 2 \right\}. \tag{9.6} \]

A priori, we do not know that $S_{k,l}(K)$ has nonempty interior (it could happen that, when $R \to \infty$, then $d(S_i, C(S_i)) \to 0$). We shall prove that it is so. First of all, we have

Lemma 9.1. Let $c_1(\zeta)$ and $c_2(\zeta)$ be two quadratic polynomials, invariant under the antipodal map. Then the pair of curves defined by $(\eta + c_1(\zeta))^k = 0$ and $(\eta + c_2(\zeta))^l = 0$ belongs to $S_{k,l}$.

Proof. One needs to show that there exists a solution to Nahm’s equations on $[-1, 0) \cup (0, 1]$ with the correct matching conditions (those of $N_{k,l}$) at $t = 0$, and such that the corresponding spectral curves are the given ones. We can, in fact, find it on $(-\infty, 0) \cup (0, +\infty)$. We observe that such a solution is a point in the hyperkähler quotient of $F_{k,l}(-1, 1) \times O_k \times O_l$ by $U(k) \times U(l)$, where $O_k$ and $O_l$ are regular nilpotent adjoint orbits in $\mathfrak{gl}(k, \mathbb{C})$ and $\mathfrak{gl}(l, \mathbb{C})$ with Kronheimer’s metric [25] and $F_{k,l}(-1, 1)$ was defined in Remark 4.3. One shows, as in [9] (using nilpotent orbits, rather than the semi-simple ones), that this hyperkähler quotient is a one-point set.

The proof shows that a solution to Nahm’s equations, corresponding to this pair of curves, exists on $(-\infty, 0) \cup (0, +\infty)$. Its restriction to $[-1, 0) \cup (0, 1]$ defines an element of $F^{rr}_{k,l}$, as long as $c_1(\zeta) \neq c_2(\zeta)$. Let $(v_-^0, g_-^0, v_+^0, g_+^0)$ be the corresponding element of $\mathbb{C}^k \times GL(k, \mathbb{C}) \times \mathbb{C}^l \times GL(l, \mathbb{C})$, given by Proposition 7.2. Observe that $v_-$ and $v_+$ are the coefficients of polynomials $(\eta + c_1(0))^k$ and $(\eta + c_2(0))^l$.

Proposition 9.2.
For any $L > 0$, there exists a $K = K(L, k, l) > 0$ with the following property. Let $c_i(\zeta) = z_i + 2x_i\zeta - \bar{z}_i\zeta^2$, $i = 1, 2$, and suppose that $|z_1 - z_2| \ge 1$. Let $(v_-, g_-, v_+, g_+) \in \mathbb{C}^k \times GL(k, \mathbb{C}) \times \mathbb{C}^l \times GL(l, \mathbb{C})$ and let $q_-(z)$ and $q_+(z)$ be polynomials, the coefficients of which are given by the entries of $v_-$ and $v_+$, so that $q_-(z)$, $q_+(z)$ are the characteristic polynomials of $B_-$, $B_+$, defined in the proof of Proposition 7.2. Suppose that all roots of $q_-(z)$ (resp. roots of $q_+(z)$) satisfy $|r - c_1(0)| \le L$ (resp. $|r - c_2(0)| \le L$) and that

\[ \left\| \ln g_-^* g_- - \ln (g_-^0)^* g_-^0 \right\| \le 2L, \qquad \left\| \ln g_+^* g_+ - \ln (g_+^0)^* g_+^0 \right\| \le 2L \tag{9.7} \]

(here $\ln$ denotes the inverse to the exponential mapping restricted to hermitian matrices). Then the pair of spectral curves corresponding, via Proposition 7.2, to $(v_-, g_-, v_+, g_+)$ lies in $S_{k,l}(K)$.

Proof. Let $r^1, \dots, r^k$ (resp. $s^1, \dots, s^l$) be the roots of $q_-(z)$ (resp. $q_+(z)$). Consider a solution to Nahm’s equations on $(-\infty, 0) \cup (0, +\infty)$, with the correct matching conditions at $t = 0$, and such that the corresponding pair of spectral curves is $\prod_i \left(\eta + r^i + 2x_1\zeta - \bar{r}^i\zeta^2\right) = 0$ and $\prod_i \left(\eta + s^i + 2x_2\zeta - \bar{s}^i\zeta^2\right) = 0$. Such a solution exists just as the one in Lemma 9.1 (this follows directly from [9]). Its restriction to $[-1, 0) \cup (0, 1]$ defines an element of $F^{rr}_{k,l}$ and the proofs in [9] show that the corresponding $g_-^1$, $g_+^1$ satisfy the estimate (9.7). Let $((\alpha_-, \alpha_+), (\beta_-, \beta_+))$ be this solution to Nahm’s equations. Moreover, the estimates of Kronheimer [24] and Biquard [11] show that for $t \le -1/2$ and $t \ge 1/2$ the solution to Nahm’s equations is within some $C$ from its centre (i.e. $T_i(t)$ are within distance $C$ from their centres for $i = 1, 2, 3$). Let $h_- = g_-^{-1} g_-^1$ and $h_+ = g_+^{-1} g_+^1$ and let $h_-(t)$ (resp. $h_+(t)$) be a path in $GL(k, \mathbb{C})$ (resp. $GL(l, \mathbb{C})$) with $h_-(-1) = h_-$ and $h_-(t) = 1$ for $t \in [-1/2, 0]$ (resp. $h_+(1) = h_+$ and $h_+(t) = 1$ for $t \in [0, 1/2]$). Define a solution to the complex Nahm equation by acting on $((\alpha_-, \alpha_+), (\beta_-, \beta_+))$ with the complex gauge transformations $h_\pm(t)$. If we now solve the real Nahm equation via a complex gauge transformation $G(t)$, which is identity at $\pm 1$, then the corresponding element of $\mathbb{C}^k \times GL(k, \mathbb{C}) \times \mathbb{C}^l \times GL(l, \mathbb{C})$ is the given one. On the other hand, the left-hand side of the real Nahm equation is bounded, because $\beta_\pm(t)$ and $(\alpha_\pm(t) + \alpha_\pm(t)^*)/2$ are within $C$ from their centres for $t \in [-1, -1/2] \cup [1/2, 1]$. Then it follows from estimates of Donaldson and Hurtubise (see Sect. 2 in [21]) that the hermitian part of $\dot{G}G^{-1}$ is uniformly bounded at $t = \pm 1$, which proves the estimate ($K$ is determined by $C$ and the bound on $\dot{G}G^{-1}(\pm 1)$).

As a corollary (of the proof) we can give an estimate on spectral curves of clusters in terms of the corresponding rational map:

Corollary 9.3. For any $L > 0$, there exists a $K = K(L, k, l) > 0$ with the following property. Let $\left(\frac{p_1(z)}{q_1(z)}, \frac{p_2(z)}{q_2(z)}\right) \in \mathrm{Rat}_k\,\mathbb{P}^1 \times \mathrm{Rat}_l\,\mathbb{P}^1$ be a pair of rational maps and let $\beta_1^1, \dots, \beta_k^1$ (resp. $\beta_1^2, \dots, \beta_l^2$) be the roots of $q_1(z)$ (resp. $q_2(z)$). Suppose that the functions satisfy:
(i) $|\beta_i^1 - \beta_j^2| \ge 1$ for any $i, j$.
(ii) $|\beta_i^s - \beta_j^s| \le 2L$ for any $i, j$ and $s = 1, 2$.
(iii) $\left| \ln |p_s(\beta_i^s)| - \ln |p_s(\beta_j^s)| \right| \le 2L$ for any $i, j$ and $s = 1, 2$.
Let $(S_1, S_2) \in \Sigma_{k,l}$ correspond to the above pair of rational functions via Proposition 5.4. Then $(S_1, S_2) \in S_{k,l}(K)$. Moreover, if $b_1 = \sum \beta_i^1/k$, $b_2 = \sum \beta_i^2/l$, $a_1 = \sum \ln |p_1(\beta_i^1)|/2k$, $a_2 = \sum \ln |p_2(\beta_i^2)|/2l$, then $|b_s - z_s| \le K$, $|a_s - y_s| \le K$, $s = 1, 2$, where $z_s + 2y_s\zeta - \bar{z}_s\zeta^2$ is the centre of $S_s$.

Proof. Once again consider the solution $((\alpha_-, \alpha_+), (\beta_-, \beta_+))$ to Nahm’s equations on $[-1, 0) \cup (0, 1]$ with $r^i = \beta_i^1$, $i = 1, \dots, k$, $s^j = \beta_j^2$, $j = 1, \dots, l$, $x_s = a_s$, $s = 1, 2$. The assumption (i) and Kronheimer’s estimates [24] imply that, near $t = \pm 1$, the solution is within some constant $C$ from the diagonal one (after acting by $U(k)$ and $U(l)$), and that the derivatives of the solution are bounded by $C$. Let us act by a complex gauge transformation, which differs from the identity only near $\pm 1$ and which diagonalises


there β± . We also require that α± becomes diagonal near ±1 and that after extending this solution to the complex Nahm equation to [−2, 0) ∪ (0, 2] by symmetry, it corresponds, via Proposition 5.4 to the given pair of rational maps. The remainder of the proof follows that of the previous proposition word by word. We observe that if (S1 , S2 ) ∈ Sk,l (K ) and p ∈ supp S1 ∩ S2 , then π( p) is within b(K )/R from either ζ12 or from ζ21 for some function b(K ). We would like to argue that π( p) must lie within b(K )/R from ζ21 , but we can only prove a somewhat weaker result: Proposition 9.4. For every L > 0 and δ > 0, there is an R0 with the following property. Let (S1 , S2 ) ∈ Sk,l be obtained from a (v− , g− , v+ , g+ ) ∈ Ck × G L(k, C) × Cl × G L(l, C), which satisfies the assumptions of Proposition 9.2 and suppose, in addition, that R(S1 , S2 ) ≥ R0 . Then the divisor D ⊂ S1 ∩ S2 may be chosen so that π(supp D) is within distance δ from the point ζ21 . Proof. First of all, observe that the subset of Sk,l described in the statement is connected, since the corresponding subset of Ck × G L(k, C) × Cl × G L(l, C) is. Therefore, it is enough to show that there is (S1 , S2 ) in this subset such that π(supp D) is within some small distance, say 1, from ζ21 . For this we take again a pair of completely reducible curves and consider the corresponding Nahm flow as in [9]. The divisor D can be read off a solution to Nahm’s equations as in [22], pp. 73–76. This, together with a more explicit description of solutions for reducible curves, given in Sect. 5,6 and 7 of [9] (in particular, the formula 6.10 together with Lemma 9.6 of that paper) shows that D (which is now a Weil divisor) can be chosen as those points of S1 ∩ S2 which are closer to ζ21 than to ζ12 (a word of warning: the Nahm equations in [22] have a different sign, corresponding to the change t → −t). 
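The two directions $\zeta_{12}$, $\zeta_{21}$ of (9.5), which the preceding discussion compares, can be checked numerically. The sketch below computes $R$, $\zeta_{12}$ and $\zeta_{21}$ from the coefficients of two centres $c_i(\zeta) = z_i + 2x_i\zeta - \bar{z}_i\zeta^2$ and confirms that both points are roots of $c_1 - c_2$ and that they are antipodal, $\zeta_{21} = -1/\bar{\zeta}_{12}$. The function names and the sample values are illustrative assumptions, and floating-point arithmetic is used, so equalities hold only up to rounding.

```python
import math


def centre(z, x):
    """Return the quadratic c(zeta) = z + 2 x zeta - conj(z) zeta^2."""
    return lambda zeta: z + 2 * x * zeta - z.conjugate() * zeta ** 2


def directions(z1, x1, z2, x2):
    """Return (R, zeta12, zeta21) as in (9.4) and (9.5); requires z1 != z2."""
    dz_bar = (z1 - z2).conjugate()
    if dz_bar == 0:
        raise ValueError("formula (9.5) needs z1 != z2")
    dx = x1 - x2
    R = math.sqrt(dx ** 2 + abs(z1 - z2) ** 2)
    return R, (dx + R) / dz_bar, (dx - R) / dz_bar


# Sample centres: z1 - z2 = 3 + 4i and x1 - x2 = 2, so R = sqrt(29).
z1, x1 = 1 + 2j, 0.5
z2, x2 = -2 - 2j, -1.5

R, zeta12, zeta21 = directions(z1, x1, z2, x2)
c1, c2 = centre(z1, x1), centre(z2, x2)

residual12 = abs(c1(zeta12) - c2(zeta12))           # vanishes up to rounding
residual21 = abs(c1(zeta21) - c2(zeta21))           # vanishes up to rounding
antipodal_gap = abs(zeta21 + 1 / zeta12.conjugate())
```

The antipodal relation follows from $(x_1 - x_2)^2 - R^2 = -|z_1 - z_2|^2$, which is why $\zeta_{12}$ and $\zeta_{21}$ represent opposite directions on $\mathbb{P}^1$.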
We now give a picture of the asymptotic behaviour of curves in $\Sigma_{k,l}$, analogous to that of monopole spectral curves given in [3], Propositions 3.8 and 3.10. Before stating the result, we need to define an appropriate subset of $\Sigma_{k,l}$.

Definition 9.5. We denote by $\Sigma_{k,l}(K)$ the subset of $\Sigma_{k,l} \cap S_{k,l}(K)$ defined as follows. If $\pi(S_1 \cap S_2)$ is within distance 1 from $\{\zeta_{12}, \zeta_{21}\}$, then $(S_1, S_2) \in \Sigma_{k,l}(K)$ if and only if $D$ can be chosen so that $\pi(\operatorname{supp} D)$ is within distance 1 from the point $\zeta_{21}$.

Remark 9.6. Proposition 9.4 implies that curves corresponding to rational maps satisfying the assumption of Corollary 9.3 belong to $\Sigma_{k,l}(K)$.

Proposition 9.7. Let $(S_1^n, S_2^n)$ be a sequence of points in $\Sigma_{k,l}(K)$ such that the distances $R_n$ between the centres of $S_1^n$ and $S_2^n$ tend to infinity. Let $P_1^n(\zeta, \eta) = 0$ and $P_2^n(\zeta, \eta) = 0$ be the equations defining $S_1^n$ and $S_2^n$ and $c_1^n(\zeta)$, $c_2^n(\zeta)$ the centres of $S_1^n$ and $S_2^n$. Then the centred curves $P_1^n(\zeta, \eta - c_1^n(\zeta)) = 0$, $P_2^n(\zeta, \eta - c_2^n(\zeta)) = 0$ have a subsequence converging to spectral curves of monopoles of charge $k$ and $l$, respectively.

Proof. We prove this for $S_2^n$. The centred curves, given by the polynomials $P_2^n(\zeta, \eta - c_2^n(\zeta)) = 0$, lie in a compact subset, and so we can find a subsequence converging to some $S_2^\infty$. Let $R_n = R(S_1^n, S_2^n)$. The divisor of $P_1^n$ on $S_2^n$ is the same as that of $P_1^n/(R_n)^k$. The latter has a subsequence convergent to $c(\zeta)^k$, where $c(\zeta)$ is a quadratic polynomial. Write $\zeta_{12}$ and $\zeta_{21}$ for its roots, as in (9.5). Proposition 9.4 implies that the translated divisors $\Delta_n = \{(\zeta, \eta);\ (\zeta, \eta - c_2^n(\zeta)) \in D_n\}$ converge to $kD_{\zeta_{21}}$ on $S_2^\infty$ (recall that $D_{\zeta_0}$ denotes the divisor of $(\zeta - \zeta_0)$). Consider now the corresponding solutions to Nahm’s equations, given by Proposition 4.4. The solutions shifted by the centres will have a convergent subsequence on $(0, 2)$, thanks to Proposition 1.3 in [7]. Therefore, the sections of $L^t(k + l - 1)[-\Delta_n]$ converge to sections of a line bundle over $S_2^\infty$. This line bundle must be $L^t(k + l - 1)[-kD_{\zeta_{21}}] \simeq L^t(l - 1)$, and, hence, the limit Nahm flow corresponds to $L^t(l - 1)$. Since the limit flow is nonsingular, $H^0(S_2^\infty, L^t(l - 2)) = 0$ for $t \in (0, 2)$. In addition, if the Nahm matrices were symmetric at $t = 1$ for $S_2^n$, then they are symmetric for $S_2^\infty$, and, hence, $L^2$ is trivial on $S_2^\infty$. Finally, $S_2^\infty$ does not have a multiple component, thanks to Remark 3.1.

The proof shows that the divisors $D_n$ and $\tau(D_n)$, translated by the centres, converge as well. Observe that we have embeddings $S_{k,l} \to Y_{k,kl}((0, 2))$ and $S_{k,l} \to Y_{l,kl}((0, 2))$ (recall (8.13)), given by

\[ (S_1, S_2) \mapsto (S_1, \tau(D)) \in Y_{k,kl}, \qquad (S_1, S_2) \mapsto (S_2, D) \in Y_{l,kl}. \tag{9.8} \]

From the proof of the above proposition, we have:

Corollary 9.8. Let $\Sigma_1(K)$ (resp. $\Sigma_2(K)$) be the subset of $\Sigma_{k,l}(K)$ defined by $c_1(\zeta) = 0$ (resp. $c_2(\zeta) = 0$) and $R \ge 1$. Then $\Sigma_1(K)$ is a relatively compact subset of $Y_{k,kl}((0, 2))$ and $\Sigma_2(K)$ is a relatively compact subset of $Y_{l,kl}((0, 2))$.

We also have:

Corollary 9.9. There exists an $R_0$, such that, for all $(S_1, S_2) \in \Sigma_{k,l}(K)$ with $R(S_1, S_2) \ge R_0$, neither $S_1$ nor $S_2$ has multiple components.

Proof. If this were not the case, then the limit curves obtained in Proposition 9.7 would also have a multiple component, and could not be spectral curves of monopoles.

10. Asymptotics of Matricial Polynomials

We shall now consider the flow $L^t(k + l - 1)$ on $S_1 \cup S_2$ for $(S_1, S_2) \in \Sigma_{k,l}$ (defined in 9.5). Observe that the corresponding matricial flow $A(t, \zeta)$ has poles at $t = 0$ corresponding to the irreducible representation of dimension $k + l$, and so the boundary behaviour of $SU(2)$-monopoles. Of course, it does not have the correct boundary behaviour at $t = 2$, but we are going to show that, in the asymptotic region of $\Sigma_{k,l}(K) \subset \Sigma_{k,l} \cap S_{k,l}(K)$, the corresponding matricial flow is exponentially close to the block-diagonal matricial flow corresponding to $L^t(k + l - 1)[-\tau(D)]$ on $S_1$ and $L^t(k + l - 1)[-D]$ on $S_2$. In particular, it is exponentially close to being symmetric at $t = 1$, and so we can construct an exponentially approximate solution to Nahm’s equations with the correct (monopole-like) boundary behaviour by taking $A(2 - t, \zeta)^T$ on $[1, 2)$. We are going to prove

Theorem 10.1. For every $K > 0$, $\delta > 0$, there exist an $R_0$, $\alpha > 0$, $C > 0$ such that for any $(S_1, S_2) \in \Sigma_{k,l}(K)$ with $R(S_1, S_2) \ge R_0$ the following assertions hold:
1. The line bundle $L^t(k + l - 2)$ on $S_1 \cup S_2$ does not lie in the theta divisor for any $t \in (0, 2)$.


2. For any $t \in [\delta, 2 - \delta]$, the line bundle $L^t(k + l - 1)$ can be represented by a matricial polynomial $A(t, \zeta) = (T_2(t) + iT_3(t)) + 2T_1(t)\zeta + (T_2(t) - iT_3(t))\zeta^2$ such that the matrices are skew-hermitian and the $T_i(t)$, $i = 1, 2, 3$, are $Ce^{-\alpha R}$-close to block-diagonal skew-hermitian matrices $\hat{T}_i(t)$ with blocks defining a given matrix-polynomial representation of $L^t(k + l - 1)[-\tau(D)]$ on $S_1$ and $L^t(k + l - 1)[-D]$ on $S_2$.

The second part of the theorem can be strengthened. Let us write $\hat{A}(t, \zeta) = (\hat{T}_2(t) + i\hat{T}_3(t)) + 2\hat{T}_1(t)\zeta + (\hat{T}_2(t) - i\hat{T}_3(t))\zeta^2$.

Theorem 10.2. With the notation and assumptions of the previous theorem, there exists a map $g : [\delta, 2 - \delta] \times \mathbb{P}^1 \to SL(k + l, \mathbb{C})$, analytic in the first variable and meromorphic in the second variable, such that $g(t, \zeta)A(t, \zeta)g(t, \zeta)^{-1} = \hat{A}(t, \zeta)$, for any $(t, \zeta) \in [\delta, 2 - \delta] \times \mathbb{P}^1$. Moreover:
(i) There are constants $C, \alpha > 0$ such that, for any $t \in [\delta, 2 - \delta]$ and any $\zeta_1, \zeta_2 \in \mathbb{P}^1$ with $|\zeta_i - \zeta_{12}| \ge 1/2$, $|\zeta_i - \zeta_{21}| \ge 1/2$, $i = 1, 2$, $\|g(t, \zeta_1)g(t, \zeta_2)^{-1} - 1\| \le Ce^{-\alpha R}$ (as matrices).
(ii) If we write $g$ in the block form as $\begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}$, with $g_{11}$ being $k \times k$ and $g_{22}$ $l \times l$, then the only poles of $g_{11}(t, \zeta)$ and $g_{12}(t, \zeta)$ may occur at $\zeta \in \pi(\operatorname{supp} \tau(D))$ and the only poles of $g_{21}(t, \zeta)$ and $g_{22}(t, \zeta)$ may occur at $\zeta \in \pi(\operatorname{supp} D)$.

The remainder of the section is devoted to a proof of these theorems.

Step 1. Let $P_1(\zeta, \eta) = 0$, $P_2(\zeta, \eta) = 0$ be the equations of $S_1$ and $S_2$. Let $c_1(\zeta)$, $c_2(\zeta)$ be the centres of $S_1$, $S_2$ (defined by (9.2)). Consider the effect of shifting the curves by the “total centre” $c_{12} = \frac{k}{k+l}c_1 + \frac{l}{k+l}c_2$, i.e. curves defined by $P_1(\zeta, \eta - c_{12}(\zeta)) = 0$, $P_2(\zeta, \eta - c_{12}(\zeta)) = 0$. The effect is the same on matrices $\hat{T}_i$ and $T_i$: adding a matrix in the centre of $U(k + l)$. Thus, we can assume, without loss of generality, that $c_{12}(\zeta) = 0$, i.e. that the centres of curves $S_1$, $S_2$ satisfy

\[ kc_1(\zeta) + lc_2(\zeta) = 0. \tag{10.1} \]

We can also assume, using the $SO(3)$-action, that $\zeta_{21} = 0$ (recall (9.5)). This means that the centre of $S_1$ is $\frac{lR\zeta}{k+l}$ ($R = R(S_1, S_2)$), and the centre of $S_2$ is $-\frac{kR\zeta}{k+l}$. Finally, thanks to Proposition 9.4, we can take $R_0$ large enough, so that $\pi(\operatorname{supp} D) \subset B(0, 1/2)$. Choose now a $\zeta_0 \in \mathbb{P}^1$ with $d(\zeta_0, 0) > 1/2$ and $d(\zeta_0, \infty) > 1/2$. Following (8.10), write

\[ V^i(t) = H^0(S_i, L^t(k + l - 1)), \qquad V^i_{\zeta_0}(t) = H^0(S_i, L^t(k + l - 1)[-D_{\zeta_0}]), \quad i = 1, 2, \]
\[ V^1_\Delta(t) = H^0(S_1, L^t(k + l - 1)[-\tau(D)]), \qquad V^2_\Delta(t) = H^0(S_2, L^t(k + l - 1)[-D]). \]

For $t \in (0, 2)$, we have the decompositions (8.11): $V^i(t) = V^i_\Delta(t) \oplus V^i_{\zeta_0}(t)$, $i = 1, 2$. The idea of the proof is that sections of $V^1(t)$ and $V^2(t)$, which are, in this decomposition, of the form $s + 0$ ($s \in V^i_\Delta(t)$), are exponentially close (in a sense to be defined) to sections of $L^t(k + l - 1)$ on $S_1 \cup S_2$.


Step 2. We now consider arbitrary curves and divisors, as in Sect. 8. Recall, from Sect. 8.4, the vector bundles $V^i(t)$, $V^i_\Delta(t)$, $V^i_{\zeta_0}(t)$ over $Y_{k,kl}(\zeta_0)$ and $Y_{l,kl}(\zeta_0)$, the fibres of which at $(S_1, S_2)$ are $V^i(t)$, $V^i_\Delta(t)$, $V^i_{\zeta_0}(t)$. We denote by the same letters the corresponding vector bundles over $S_{k,l}$ or, rather, over the subset $S_{k,l}(\zeta_0)$, on which $\zeta_0 \notin \pi(\operatorname{supp} S_1 \cap S_2)$. We shall usually not write this $\zeta_0$, keeping in mind that it should be inserted wherever $V^i_{\zeta_0}(t)$ is discussed. There are embeddings $\lambda_{11}, \lambda_{21} : S_{k,l} \to Y_{k,kl}$ and $\lambda_{12}, \lambda_{22} : S_{k,l} \to Y_{l,kl}$ (cf. (9.8)):

\[ \lambda_{ij}(S_1, S_2) = \left(S_j, \tau^i(D)\right), \quad i, j = 1, 2 \tag{10.2} \]

(recall that $\tau^2 = \mathrm{Id}$). Observe that $\lambda_{11}$ maps into $Y^\infty_{k,kl}$, $\lambda_{12}$ into $Y^\infty_{l,kl}$, $\lambda_{21}$ into $Y^0_{k,kl}$ and $\lambda_{22}$ into $Y^0_{l,kl}$. We have the maps $\Phi_{ij}$, $i, j = 1, 2$, defined as follows: $\Phi_{11}$ is the top map in (8.8) over $Y^\infty_{k,kl}$ for $E = V^1_{\zeta_0}(t)$, $\Phi_{21}$ is the top map in (8.8) over $Y^0_{k,kl}$ for $E = V^1_{\zeta_0}(t)$, $\Phi_{12}$ is the top map in (8.8) over $Y^\infty_{l,kl}$ for $E = V^2_{\zeta_0}(t)$, and, finally, $\Phi_{22}$ is the top map in (8.8) over $Y^0_{l,kl}$ for $E = V^2_{\zeta_0}(t)$. We have the corresponding maps $\Phi_{ij}$ for the bundles $V^j_{\zeta_0}(t)$ over $S_{k,l}$. A section of $L^t(k + l - 1)$ on $S_1 \cup S_2$ corresponds to a pair of sections $s_1 \in H^0(S_1, L^t(k + l - 1))$, $s_2 \in H^0(S_2, L^t(k + l - 1))$ such that

\[ \Phi_{11}(s_1) = \Phi_{12}(s_2), \qquad \Phi_{21}(s_1) = \Phi_{22}(s_2). \tag{10.3} \]

We shall want to write these equations in terms of bases. Recall, from Corollary 9.8, the subsets $\Sigma_1(K)$ and $\Sigma_2(K)$ of $\Sigma_{k,l}(K)$. The argument in the proof of Proposition 9.7 shows that $\lambda_{ij}(\Sigma_j(K))$ are relatively compact sets for $i,j = 1,2$. We write $\Sigma_{ij}(K)$ for the compact sets $\lambda_{ij}(\Sigma_j(K))$. Corollary 9.8 says that $\Sigma_{11}(K)$ (resp. $\Sigma_{22}(K)$) is actually a subset of $Y^\infty_{k,kl}((0,2))$ (resp. a subset of $Y^0_{l,kl}((0,2))$). Recall, from the end of Sect. 8.4, that the bundles $V^1_{\zeta_0}(t)$ over $Y^\infty_{k,kl}((0,2))$ and $V^2_{\zeta_0}(t)$ over $Y^0_{l,kl}((0,2))$ have Hermitian metrics induced by the maps $\Lambda_{11}$ and $\Lambda_{22}$. These give us Hermitian metrics on $V^j_{\zeta_0}(t)$, $j = 1,2$, over $\Sigma_{k,l}$. In other words, we choose Hermitian metrics on these bundles which make $\Lambda_{11}$ and $\Lambda_{22}$ isometric. Since $\Sigma_{11}(K)$ and $\Sigma_{22}(K)$ are compact, there exists a constant $M = M(K,t)$, such that any vector $s_1$ of length one in the restriction of $V^1_{\zeta_0}(t)$ to $\Sigma_{11}(K)$ and any vector $s_2$ of length one in the restriction of $V^2_{\zeta_0}(t)$ to $\Sigma_{22}(K)$ satisfy:
$$|\Lambda_{21}(s_1)| \le M, \quad |\Lambda_{12}(s_2)| \le M. \tag{10.4}$$

j

For V (t), we have given bases (unitary with respect to (2.12)) u rj , r = 1, . . . , δ j1 k+δ j2 l, in which multiplication by η is represented by the chosen matricial polynomials. Again, we can assume that over 11 (K ) and 22 (K ), 21 (u r ) ≤ M, 12 (u r ) ≤ M. (10.5) 1 2 Remark 10.3. In both (10.4) and (10.5), we can replace i j with i j . Given δ > 0, we can choose an M = M(K , δ), such that (10.4) and (10.5) hold with this M for all t ∈ [δ, 2 − δ]. We now write 1 (K , R) (resp. 2 (K , R)) for the subset of k,l (K ) defined by Rζ Rζ (resp. c2 (ζ ) = − kk+l ). We define similarly sets i j (K , R) for i, j = 1, 2. c1 (ζ ) = lk+l


R. Bielawski

We observe that $\Sigma_{ij}(K,R)$ are obtained from $\Sigma_{ij}(K)$ by the map $t_{c_j(\zeta)}$ defined in Sect. 8.5. Let $W^1_m$ (resp. $W^2_m$) be the tautological bundle over $U^{[m]}_\infty$ (resp. $U^{[m]}_0$). Consider the analogous maps $t_{c_j(\zeta)}$ on $W^j_m$, given by (8.14) or (8.15) and define new Hermitian metrics on $W^j_m$ by pulling back the old metric via $t_{c_j(\zeta)}$. This induces new Hermitian

metrics on $V^j_{\zeta_0}(t)$, $j = 1,2$, over $\mathcal{S}_{k,l}$. In particular, these are the metrics we shall consider for $(S_1,S_2) \in \Sigma_1(K,R) \cap \Sigma_2(K,R)$. We need the following

Lemma 10.4. Let $S \in C_n$ be defined by $P(\zeta,\eta) = 0$ and let $c(\zeta) = z + 2x\zeta - \bar z\zeta^2$ be its centre. Define the corresponding centred curve $S^c$ by $P(\zeta, \eta - c(\zeta)) = 0$. For any $m \in \mathbb{N}$ and any $t \in \mathbb{C}$ there is a 1-1 correspondence between sections of $L^t(m)$ on $S$ and on $S^c$. The correspondence is given by
$$s_0^c(\zeta,\eta) = e^{t(x - \bar z\zeta)}\, s_0(\zeta, \eta - c(\zeta)), \quad s_\infty^c(\zeta,\eta) = e^{t(-x - z/\zeta)}\, s_\infty(\zeta, \eta - c(\zeta)),$$
where $s_0, s_\infty$ represent a section of $L^t(m)|_S$ in the trivialisation $U_0, U_\infty$.

Proof. We check that $s_0^c, s_\infty^c$ define a section of $L^t(m)$ on $S^c$:
$$e^{-t\eta/\zeta}\zeta^{-m} s_0^c(\zeta,\eta) = \zeta^{-m} e^{-tc(\zeta)/\zeta}\, e^{-t(\eta - c(\zeta))/\zeta}\, e^{t(x - \bar z\zeta)}\, s_0(\zeta, \eta - c(\zeta))$$
$$= \zeta^{-m} e^{-tc(\zeta)/\zeta}\, e^{t(x - \bar z\zeta)}\, e^{-t(\eta - c(\zeta))/\zeta}\, s_0(\zeta, \eta - c(\zeta)) = e^{t(-x - z/\zeta)}\, s_\infty(\zeta, \eta - c(\zeta)) = s_\infty^c(\zeta,\eta).$$
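The last equality in the proof of Lemma 10.4 rests on the exponent identity $-c(\zeta)/\zeta + (x - \bar z\zeta) = -x - z/\zeta$. The following is a small numerical sketch of that identity at two arbitrarily chosen sample points; the function names and sample values are hypothetical and serve only as a check.

```python
def centre(z, x, zeta):
    """c(zeta) = z + 2*x*zeta - conj(z)*zeta**2, the centre from Lemma 10.4."""
    return z + 2 * x * zeta - z.conjugate() * zeta ** 2

def exponent_gap(z, x, zeta):
    """Absolute difference between the two exponents (per unit t) compared in the proof."""
    lhs = -centre(z, x, zeta) / zeta + (x - z.conjugate() * zeta)
    rhs = -x - z / zeta
    return abs(lhs - rhs)

# Hypothetical sample data: (z, x, zeta) with z complex, x real, zeta nonzero.
samples = [(1 + 2j, 0.5, 0.3 - 0.7j), (-2.5 + 0.1j, -1.25, 2 + 1j)]
gaps = [exponent_gap(z, x, zeta) for z, x, zeta in samples]
```

Up to floating-point rounding the entries of `gaps` vanish, reflecting the cancellation of the $\bar z\zeta$ terms and the combination $-2x + x = -x$.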

Step 3. We go back to $(S_1,S_2)$ as in Step 1, i.e. $(S_1,S_2) \in \Sigma_{k,l}$ with $R(S_1,S_2) = R$ and $\zeta_{21} = 0$. We write $(S_1^1, S_2^1)$ (resp. $(S_1^2, S_2^2)$) for the translation of $S_1$ and $S_2$ by $-c_1(\zeta)$ (resp. $-c_2(\zeta)$). Thus $S_1^1$ and $S_2^2$ have null centres. Let $u_j^r$ be the basis of $V_j(t)$, in which multiplication by $\eta$ is represented by the chosen matricial polynomials. We observe that $u_j^r$ for $S_j$ is obtained, via the formula in Lemma 10.4, from $u_j^r$ for $S_j^j$. Let $v_j^p$, $p = 1,\dots,kl$, $j = 1,2$, be unitary bases of $H^0\bigl(S_j^j, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$, with respect to the metrics defined in Step 2. Lemma 10.4 gives us bases $\tilde v_j^p$ of $H^0\bigl(S_j, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$. With respect to the metrics on $H^0\bigl(S_j, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$, defined just before Lemma 10.4, we have:
$$\langle \tilde v_1^p, \tilde v_1^q\rangle = \delta_{pq}\, e^{\frac{2lRt}{k+l}}, \quad \langle \tilde v_2^p, \tilde v_2^q\rangle = \delta_{pq}\, e^{\frac{2kRt}{k+l}}. \tag{10.6}$$
For any $u_1^r$ we seek $w_1 \in H^0\bigl(S_1, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$ and $w_2 \in H^0\bigl(S_2, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$ so that (cf. (10.3))
$$\Lambda_{11}(w_1) - \Lambda_{12}(w_2) = -\Lambda_{11}(u_1^r), \quad \Lambda_{21}(w_1) - \Lambda_{22}(w_2) = -\Lambda_{21}(u_1^r), \tag{10.7}$$
and similarly for $u_2^r$. We write $w_1 = \sum_p x_1^p \tilde v_1^p$ and $w_2 = \sum_p x_2^p \tilde v_2^p$ so that (10.7) becomes the matrix equation:
$$\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}.$$

From (10.6), we know that $B_{11} = e^{\frac{lRt}{k+l}} \cdot I$ and $B_{22} = e^{\frac{kRt}{k+l}} \cdot I$. On the other hand, (10.4), (10.5), Remark 10.3 and Lemma 10.4 imply that all entries of $B_{21}$ are bounded by $Me^{-\frac{lRt}{k+l}}$, while all entries of $B_{12}$ are bounded by $Me^{-\frac{kRt}{k+l}}$. In particular, the matrix $B$ is invertible, if $Rt$ is greater than some $N = N(k,l,M) = N(k,l,K)$. This holds for $t \in [\delta, 2-\delta]$, if $R$ is sufficiently large. Similarly, if we solve (10.7) with the right-hand side given by $u_1^r$, then $C_1 = 0$ and every entry of $C_2$ is bounded by $Me^{-\frac{lRt}{k+l}}$. If we solve (10.7) with the right-hand side given by $u_2^r$, then $C_2 = 0$ and every entry of $C_1$ is bounded by $Me^{-\frac{kRt}{k+l}}$. It follows that, if $t \in [\delta, 2-\delta]$ and $R \ge R_0$, then the entries of $x_1$ and $x_2$ satisfy:
$$|x_1^p| \le Me^{-Rt}, \quad |x_2^p| \le Me^{-Rt}, \tag{10.8}$$

for a new constant $M = M(K,\delta)$.

Step 4. We show that the basis of $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$, obtained above, can be replaced by a unitary one. Let $u_1^1,\dots,u_1^k$ and $u_2^1,\dots,u_2^l$ be the (unitary) bases of $H^0\bigl(S_1, L^t(k+l-1)[-\tau(D)]\bigr)$ and $H^0\bigl(S_2, L^t(k+l-1)[-D]\bigr)$, in which the multiplication by $\eta$ gives the chosen matricial polynomials. Step 3 has given us, for $t \in [\delta, 2-\delta]$, a basis of $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$ of the form
$$(u_1^1 + w_1^1, w_2^1), \dots, (u_1^k + w_1^k, w_2^k), \quad (y_1^1, u_2^1 + y_2^1), \dots, (y_1^l, u_2^l + y_2^l), \tag{10.9}$$
where $w_i^r, y_i^s \in H^0\bigl(S_i, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$. We claim that this basis is almost orthonormal with respect to (2.12) on $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$. We use the formula (2.13) for the metric on $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$ and on $H^0\bigl(S_i, L^t(k+l-1)[-\tau^i(D)]\bigr)$, $i = 1,2$. Observe that on $S_1 \cup S_2$, this formula can be written as
$$\langle v, w\rangle = \sum_{(\eta,\zeta_1) \in S_1} \operatorname{Res} \frac{v_1\sigma(w_1)(\eta,\zeta_1)}{P(\eta,\zeta_1)} + \sum_{(\eta,\zeta_1) \in S_2} \operatorname{Res} \frac{v_2\sigma(w_2)(\eta,\zeta_1)}{P(\eta,\zeta_1)}, \tag{10.10}$$
where $P = P_1P_2$ is the polynomial defining $S = S_1 \cup S_2$ and $\zeta_1$ is an arbitrary point of $\mathbb{P}^1$. Let now $v, w$ be arbitrary sections in $H^0\bigl(S_1, L^t(k+l-1)\bigr)$. Then $v\sigma(w)$ is a section of $\mathcal{O}(2k+2l-2)$ on $S_1$, and according to [22, Lemma (2.16)], it can be written as $\sum_{i=0}^{k+l-1} \eta^i f_i(\zeta)$ with $\deg f_i = 2k+2l-2-2i$. This representation is not unique: adding any polynomial of the form $h(\zeta,\eta)P_1(\zeta,\eta)$ defines the same section. Nevertheless,
$$\sum_{(\zeta_1,\eta) \in D_{\zeta_1}} \operatorname{Res} \frac{(v\sigma(w))(\zeta_1,\eta)}{P(\zeta_1,\eta)}$$
does not depend on the representation, as long as $\zeta_1 \notin \pi(\operatorname{supp} S_1 \cap S_2)$. With our choice of $\zeta_{21}$, Proposition 9.4 implies that there is an $R_0$ such that, for $R(S_1,S_2) \ge R_0$ and $\bar B = \{\zeta;\ 1/2 \le |\zeta| \le 2\}$, we have $\bar B \cap \pi(\operatorname{supp} S_1 \cap S_2) = \emptyset$. The above discussion is valid for $v, w \in H^0\bigl(S_2, L^t(k+l-1)\bigr)$ as well, and, therefore, on the set
$$\Sigma_0 = \{(S_1,S_2) \in \Sigma_{k,l}(K);\ \zeta_{21} = 0,\ R(S_1,S_2) \ge R_0\},$$


we have well defined quantities
$$N_i(v,w) = \sup_{\zeta \in \bar B} \Bigl| \sum_{(\zeta,\eta) \in D_\zeta} \operatorname{Res} \frac{(v\sigma(w))(\zeta,\eta)}{P(\zeta,\eta)} \Bigr|, \tag{10.11}$$
for any $v, w \in H^0\bigl(S_i, L^t(k+l-1)\bigr)$, $i = 1,2$. Observe that the $N_i$ equal the corresponding $N_i$ for $v^c, w^c \in H^0\bigl(S_i^c, L^t(k+l-1)\bigr)$, obtained via Lemma 10.4. The $N_i$ are upper semi-continuous as functions on $V^1 \oplus V^2$ over $\Sigma_0$, and the compactness argument, used in Step 2, guarantees that there is a constant $N = N(k,l,\delta)$ such that
$$N_i(u_i^r, \tilde v_i^p),\ N_i(\tilde v_i^r, \tilde v_i^s) \le N, \quad i = 1,2,$$
for all $(S_1,S_2) \in \Sigma_{k,l}(K)$, $t \in [\delta, 2-\delta]$, and all $r, p, s$, where the $\tilde v_j^p$ are the bases of $H^0\bigl(S_j, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$, defined in Step 3. Now, the estimate (10.8) shows that the matrix of the form (10.10) evaluated on the basis (10.9) is $Ne^{-Rt}$-close to the identity matrix (different $N$). We can, therefore, for any $t \in [\delta, 2-\delta]$, use the Gram-Schmidt process and modify the bases $u_1^1,\dots,u_1^k$ of $H^0\bigl(S_1, L^t(k+l-1)[-\tau(D)]\bigr)$ and $u_2^1,\dots,u_2^l$ of $H^0\bigl(S_2, L^t(k+l-1)[-D]\bigr)$ by vectors of length $Ne^{-Rt}$ (relative to these bases), so that the solution of (10.7) will be unitary in $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$.

Step 5. We prove Theorem 10.2, which also proves the second statement of Theorem 10.1. We have a unitary basis of $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$ of the form (10.9). We rename $u_1^1,\dots,u_1^k,u_2^1,\dots,u_2^l$ as $\hat\psi^1,\dots,\hat\psi^{k+l}$ and we rename the basis (10.9) as $\psi^1,\dots,\psi^{k+l}$. The matricial polynomials $\hat A(t,\zeta)$ and $A(t,\zeta)$ represent multiplication by $\eta$ in the bases $\hat\psi^i$ and $\psi^i$. The formula (2.4) defines $g(t,\zeta)$ and shows that it is meromorphic in $\zeta$ with only possible singularities at points of $\pi(\operatorname{supp} S_1 \cap S_2)$. The formula (2.5) shows that, at any point $\zeta \in \mathbb{P}^1$, such that $\operatorname{supp} D_\zeta$ on $S_1 \cup S_2$ consists of $k+l$ distinct points $p_1,\dots,p_k \in S_1$, $p_{k+1},\dots,p_{k+l} \in S_2$ (such points are generic, thanks to Corollary 9.9), we have
$$g(t,\zeta) = \bigl[\hat\psi^j(p_i)\bigr]^{-1} \bigl[\psi^j(p_i)\bigr].$$
In particular, $g(t,\zeta)$ satisfies the assertion (ii) of Theorem 10.2. Moreover, since $\det\bigl[\hat\psi^j(p_i)\bigr]$ and $\det\bigl[\psi^j(p_i)\bigr]$ vanish to the same order at any point of $\pi(\operatorname{supp} S_1 \cap S_2)$, we conclude that $\det g(t,\zeta)$ is constant and can be assumed to be 1. Represent each $u_j^r$ by $(u_j^r)_0$ and $(u_j^r)_\infty$ in $U_0 \cap S_j$ and $U_\infty \cap S_j$, $j = 1,2$. Let $G$ be a compact subset of $\mathbb{P}^1 - \{\infty\}$ with a nonempty interior. Because of the compactness of $\Sigma_{11}(K)$ and $\Sigma_{22}(K)$, we have
$$N_j(G) = \sup_r \bigl\{ |(u_j^r)_0(\zeta,\eta)|;\ \zeta \in G,\ (S_j, \tau^j(D)) \in \Sigma_{jj}(K) \bigr\} < +\infty. \tag{10.12}$$
Similarly, for every vector $s$ of length one in the restriction of $V^1_{\zeta_0}(t)$ to $\Sigma_{11}(K)$ or in the restriction of $V^2_{\zeta_0}(t)$ to $\Sigma_{22}(K)$, we have
$$\sup \{|s_0(\zeta,\eta)|;\ \zeta \in G\} \le O_j(G) \tag{10.13}$$
for some finite number $O_j(G)$, $j = 1,2$.


Consider the sections $u_j^r$ of $H^0\bigl(S_j, L^t(k+l-1)[-\tau^j(D)]\bigr)$ and, as in Step 3, $\tilde v_j^p$ of $H^0\bigl(S_j, L^t(k+l-1)[-D_{\zeta_0}]\bigr)$. Let $\tilde N_j(G)$, $\tilde O_j(G)$ be the suprema applied to these sections (for $\zeta \in G$). Lemma 10.4 gives:
$$\tilde N_1(G) \le N_1(G)\, e^{-\frac{lRt}{k+l}}, \quad \tilde N_2(G) \le N_2(G)\, e^{\frac{kRt}{k+l}}, \tag{10.14}$$
$$\tilde O_1(G) \le O_1(G)\, e^{-\frac{lRt}{k+l}}, \quad \tilde O_2(G) \le O_2(G)\, e^{\frac{kRt}{k+l}}. \tag{10.15}$$
Now, our basis $\psi^j$ of $H^0\bigl(S_1 \cup S_2, L^t(k+l-1)\bigr)$ is of the form (10.9), where $w_i^r$ and $y_i^s$ are linear combinations of the $\tilde v_i^p$ with coefficients satisfying the estimates (10.8). Hence
$$\sup_{r,s} \bigl\{ |(w_1^r)_0(\zeta,\eta)|,\ |(y_1^s)_0(\zeta,\eta)|;\ \zeta \in G \bigr\} \le M O_1(G)\, e^{-(k+2l)Rt/(k+l)}, \tag{10.16}$$
$$\sup_{r,s} \bigl\{ |(w_2^r)_0(\zeta,\eta)|,\ |(y_2^s)_0(\zeta,\eta)|;\ \zeta \in G \bigr\} \le M O_2(G)\, e^{-lRt/(k+l)}. \tag{10.17}$$
Let us write $\psi(\zeta) = \bigl[\psi^j(p_i)\bigr]$ and $\hat\psi(\zeta) = \bigl[\hat\psi^j(p_i)\bigr]$. We can also write
$$\psi(\zeta) = \begin{pmatrix} e^{-lRt/(k+l)} \cdot 1 & 0 \\ 0 & e^{kRt/(k+l)} \cdot 1 \end{pmatrix} \begin{pmatrix} C_{11}(\zeta) & C_{12}(\zeta) \\ C_{21}(\zeta) & C_{22}(\zeta) \end{pmatrix},$$
where the diagonal blocks have sizes $k \times k$ and $l \times l$. The above estimates imply
$$|C_{11}(\zeta)|, |C_{22}(\zeta)| \le N, \quad |C_{12}(\zeta)|, |C_{21}(\zeta)| \le Me^{-\alpha Rt},$$
for all $\zeta \in G$ and all $(S_1,S_2) \in \Sigma_{k,l}(K)$ with $R(S_1,S_2)$ sufficiently large ($N, M, \alpha$ depend only on $k, l, \delta, K, G$). Similarly, we can write
$$\hat\psi(\zeta) = \begin{pmatrix} e^{-lRt/(k+l)} \cdot 1 & 0 \\ 0 & e^{kRt/(k+l)} \cdot 1 \end{pmatrix} \begin{pmatrix} \hat C_{11}(\zeta) & 0 \\ 0 & \hat C_{22}(\zeta) \end{pmatrix},$$
with $\hat C_{ii}(\zeta)$ bounded by $N$, and $|C_{ii}(\zeta) - \hat C_{ii}(\zeta)| \le Me^{-\alpha Rt}$. Let $C(\zeta)$ and $\hat C(\zeta)$ be the matrices with blocks $C_{ij}(\zeta)$ and $\hat C_{ij}(\zeta)$ (we omit the $t$-dependence). Then $g(t,\zeta) = \hat C(\zeta)^{-1} C(\zeta)$, and since $C(\zeta)$ is uniformly bounded on $G$ and $\det g(t,\zeta) = 1$, $\hat C(\zeta)^{-1}$ is uniformly bounded on $G$. Together with the above estimates, this proves the assertion (i) of Theorem 10.2.

Step 6. We prove the first statement of Theorem 10.1. We have to show that the Nahm flow corresponding to $L^t(k+l-1)$ on $S_1 \cup S_2$ does not have singularities for all $t \in (0,2)$. We know already, from Step 3, that there is an $N = N(k,l,K)$, such that the flow is regular on $(N/R, 2)$. Suppose that there is a sequence $(S_1^n, S_2^n) \in \Sigma_{k,l}(K)$ (with the standing assumption that the total center is zero and $\zeta_{21} = 1$) such that the flow corresponding to $L^t(k+l-2)$ on $S_1^n \cup S_2^n$ has a pole at $\epsilon_n \in (0, N/R_n)$, where $R_n = R(S_1^n, S_2^n)$. Let $P_i^n(\zeta,\eta) = 0$ be the equations of $S_i^n$, $i = 1,2$, and consider the rescaled curves $\tilde S_i^n$ given by the equations $P_i^n(\zeta, \eta/R_n) = 0$. The Nahm flow on $\tilde S_1^n \cup \tilde S_2^n$ has a pole at $R_n\epsilon_n \in (0,N)$. On the other hand, we can find a converging subsequence of $(\tilde S_1^n, \tilde S_2^n)$ and the limit $S^\infty$ is a nilpotent curve or the union of two such curves. In both cases the limit Nahm flow on $S^\infty$ is regular on $(0,+\infty)$. For any spectral curve,


the Nahm flow (without the $T_0$-component) corresponding to $L^t(k+l-1)$ is a regular singular ODE the resonances of which are determined by the coefficients of the curve. Thus, the usual lower semi-continuity of $\omega_+$, where $[0,\omega_+)$ is the maximal interval of existence of solutions to an ODE, implies that, for curves close enough to $S^\infty$, the Nahm flow is regular on $(0, N+1)$. This is a contradiction.

11. The Asymptotic Region of $M_{k,l}$ and Nahm's Equations

We consider now those elements of $\Sigma_{k,l}$ for which the flow $L^t(k+l-1)$ on $S_1 \cup S_2$ does not meet the theta divisor for $t \in (0,2)$. In other words the corresponding Nahm flow exists for $t \in (0,1]$. According to Theorem 10.1, this is true in the asymptotic region of $\Sigma_{k,l}(K)$. Recall, once again, that the flows $T_i(t)$ corresponding to $L^t(k+l-1)$ on $S_1 \cup S_2$ have poles at $t = 0$ corresponding to the irreducible representation of dimension $k+l$. Let $A(t,\zeta)$ denote the corresponding matricial polynomials, i.e.
$$A(t,\zeta) = (T_2(t) + iT_3(t)) + 2iT_1(t)\zeta + (T_2(t) - iT_3(t))\zeta^2.$$
Theorem 10.2 implies that, as long as $R(S_1,S_2)$ is large enough, there is a meromorphic map $g : \mathbb{P}^1 \to SL(k+l,\mathbb{C})$, with poles at $S_1 \cap S_2$, such that $g(\zeta)A(1,\zeta)g(\zeta)^{-1} = \hat A(\zeta)$, where $\hat A(\zeta)$ is block-diagonal with the blocks corresponding to line bundles $L^1(k+l-1)|_{S_1}[-\tau(D)]$ and $L^1(k+l-1)|_{S_2}[-D]$.

We define a space $P$ as the set of pairs $(A(t,\zeta), g(\zeta))$, where $A(t,\zeta)$, $t \in (0,1]$, is the matricial polynomial corresponding to the flow $L^t(k+l-1)$ on $S_1 \cup S_2$ ($(S_1,S_2) \in \mathcal{S}_{k,l}$) and $g : \mathbb{P}^1 \to GL(k+l,\mathbb{C})$ is meromorphic with poles at $S_1 \cap S_2$, such that $g(\zeta)A(1,\zeta)g(\zeta)^{-1} = \hat A(\zeta)$, where $\hat A(\zeta)$ is block-diagonal with the blocks symmetric, satisfying the reality condition (2.9) and corresponding to line bundles $L^1(k+l-1)|_{S_1}[-\tau(D)]$ and $L^1(k+l-1)|_{S_2}[-D]$. The map $g$ is not unique: the conditions on $\hat A(\zeta)$ are preserved by conjugation by block-diagonal matrices $H \in U(k) \times U(l)$ such that the non-central parts of the blocks are orthogonal. Let $M$ be the quotient of $P$ by $O(k) \times O(l)$.

Proposition 11.1. There is a canonical embedding of $M$ into $M_{k,l}$.

Proof. We already have an embedding on the level of spectral curves. We have to show that an element of $M$ gives also a pair of meromorphic sections of $L^2$ on $S_1$ and on $S_2$. Let $(A(t,\zeta), g(\zeta))$ represent an element of $M$. Just as at the end of Sect. 3 consider the unique solution $w(t,\zeta)$ of $\frac{d}{dt}w + A^\# w = 0$ satisfying $t^{-(k+l-1)/2}\, w(t,\zeta) \to (1,0,\dots,0)^T$ as $t \to 0$ ($(1,0,\dots,0)^T$ lies in the $-(k+l-1)/2$-eigenspace of the residue of $A^\#$). The vector $w(\zeta) = w(1,\zeta)$ is cyclic for $A(1,\zeta)$ for any $\zeta$, and similarly $w^T(\zeta)$ is a cyclic covector for $A(1,\zeta)^T$. Hence $g(\zeta)w(\zeta)$ is cyclic for $\hat A(\zeta)$, apart from singularities, and $w^T(\zeta)g^T(\zeta)$ is a cyclic covector for $\hat A(\zeta)^T = \hat A(\zeta)$. Therefore the following formula is well-defined on $M$ and associates to $(A(t,\zeta), g(\zeta))$ a meromorphic function on $(S_1 \cup S_2) - \pi^{-1}(\infty)$:
$$\nu_0(\zeta,\eta) = w(\zeta)^T g^T(\zeta)\, g(\zeta)\, (\eta - A(1,\zeta))_{\mathrm{adj}}\, w(\zeta) = w(\zeta)^T g^T(\zeta)\, \bigl(\eta - \hat A(\zeta)\bigr)_{\mathrm{adj}}\, g(\zeta) w(\zeta). \tag{11.1}$$

Arguments such as in [20] show that this defines a (meromorphic) section of L 2 on S1 ∪ S2 and Theorem 10.2 shows that ν0 restricted to S1 and to S2 have correct divisors, i.e. D − τ (D) on S1 and τ (D) − D on S2 . Finally, it is clear that g(ζ ) and H g(ζ ), where H is block-diagonal with each block central, give different ν0 unless H is orthogonal. Therefore the map is an embedding.


From the proof we obtain an interpretation of the biholomorphism $M^{\zeta_0}_{k,l} \simeq \operatorname{Rat}_k(\mathbb{P}^1) \times \operatorname{Rat}_l(\mathbb{P}^1)$ of Theorem 4.1 in terms of Nahm's equations:

Corollary 11.2. The composition of the embedding $M \to M_{k,l}$ with the biholomorphism $M^{\zeta_0}_{k,l} \simeq \operatorname{Rat}_k(\mathbb{P}^1) \times \operatorname{Rat}_l(\mathbb{P}^1)$ is given by $(A(t,\zeta), g(\zeta)) \mapsto \Bigl(\frac{p_1(z)}{q_1(z)}, \frac{p_2(z)}{q_2(z)}\Bigr)$, where $q_1, q_2$ are the equations of $S_1, S_2$ at $\zeta = \zeta_0$ and $p_1, p_2$ are defined by
$$p_1(z) \equiv \nu_0(\zeta_0, z) \mod q_1(z), \quad p_2(z) \equiv \nu_0(\zeta_0, z) \mod q_2(z),$$
with $\nu_0$ given by (11.1).

For every $\zeta_0 \in \mathbb{P}^1$ we now define a map from a subset of $M$ (i.e. from a subset of $M_{k,l}$) to the monopole moduli space $M_{k+l}$. This map is simply given by a corresponding map on the rational functions. Let $\Bigl(\frac{p_1(z)}{q_1(z)}, \frac{p_2(z)}{q_2(z)}\Bigr) \in \operatorname{Rat}_k(\mathbb{P}^1) \times \operatorname{Rat}_l(\mathbb{P}^1)$ and assume that $q_1$ and $q_2$ are relatively prime. We define a rational map $\frac{P(z)}{Q(z)}$ of degree $k+l$ by $Q(z) = q_1(z)q_2(z)$ and $P(z)$ as the unique polynomial of degree $k+l-1$ such that $P(z) \equiv p_1(z) \mod q_1(z)$ and $P(z) \equiv p_2(z) \mod q_2(z)$. The map
$$\Bigl(\frac{p_1(z)}{q_1(z)}, \frac{p_2(z)}{q_2(z)}\Bigr) \longmapsto \frac{P(z)}{Q(z)}$$
induces a map from the corresponding region of $M_{k,l}$ to $M_{k+l}$. We shall abuse the notation and write $\Phi_{\zeta_0} : M_{k,l} \longrightarrow M_{k+l}$ for this map (even though it is not defined on all of $M_{k,l}$). It is clearly holomorphic for the chosen complex structure and preserves the corresponding complex symplectic form. We also observe that generically $\Phi_{\zeta_0}$ is $\binom{k+l}{k}$ to 1.

The region on which $\Phi_{\zeta_0}$ is defined contains an open dense subset of $M$ (given by the condition $\zeta_0 \notin \pi(S_1 \cap S_2)$) and we wish to give a description of $\Phi_{\zeta_0}$ in terms of solutions to Nahm's equations. First of all, the map which associates to an $[(A(t,\zeta), g(\zeta))] \in M$ the rational function $\frac{P(z)}{Q(z)}$ is given, by the discussion above, by
$$(A(t,\zeta), g(\zeta)) \longmapsto w(\zeta_0)^T g^T(\zeta_0)\, \bigl(z - \hat A(\zeta_0)\bigr)^{-1} g(\zeta_0) w(\zeta_0), \tag{11.2}$$
where $w(\zeta)$ is defined as in the proof of Proposition 11.1. To obtain a solution to Nahm's equations, corresponding to $P(z)/Q(z)$, directly from $[(A(t,\zeta), g(\zeta))] \in M$ we proceed as follows. Thanks to the $SO(3)$-action, we can assume, without loss of generality, that $\zeta_0 = 0$. We then split the Nahm equations into a complex one and a real one, as in (5.1) and (5.2). Then $\beta(t) = A(t,\zeta_0)$ and $\alpha(t) = A^\#(t,\zeta_0)$. Since $\zeta_0 \notin \pi(S_1 \cap S_2)$, $g(\zeta_0)$ is a regular matrix which conjugates $\beta(1)$ to a symmetric and block-diagonal matrix $B$. Extend $g(\zeta_0)$ to a smooth path $g(t) \in Gl(n,\mathbb{C})$, for $t \in [0,1]$, with $g(t) = 1$ for $t \le 1/2$, $g(1) = g(\zeta_0)$ and $\tilde\alpha(t) = g(t)\alpha(t)g(t)^{-1} - \dot g(t)g(t)^{-1}$ being symmetric at $t = 1$. Let $\tilde\beta(t) = g(t)\beta(t)g(t)^{-1}$ and extend $\tilde\alpha, \tilde\beta$ to $[0,2]$ by symmetry. We obtain a smooth solution to the complex Nahm equation on $[0,2]$ with boundary conditions of an element of $M_{k+l}$. We can now find, as in [15], a


unique solution to the real equation via a complex gauge transformation $G(t)$ which is identity at $t = 0, 2$. The resulting solution is the value of $\Phi_{\zeta_0}$ at $[(A(t,\zeta), g(\zeta))]$. We are now going to show that asymptotically the map $\Phi_{\zeta_0}$ is exponentially close to the identity. For this we need to restrict the asymptotic region and define it directly in terms of rational functions, as in Corollary 9.3.

Definition 11.3. Let $\zeta_0 \in \mathbb{P}^1$ and $K > 0$. We denote by $M^{\zeta_0}_{k,l}(K)$ the subset of $M^{\zeta_0}_{k,l}$ corresponding to $\Bigl(\frac{p_1(z)}{q_1(z)}, \frac{p_2(z)}{q_2(z)}\Bigr) \in \operatorname{Rat}_k(\mathbb{P}^1) \times \operatorname{Rat}_l(\mathbb{P}^1)$ which satisfy:

(i) Any zero of $q_1(z)$ is at least distance 1 apart from any zero of $q_2(z)$.
(ii) Any two zeros of $q_1(z)$ (resp. of $q_2(z)$) are distance at most $2K$ apart.
(iii) If $\beta_1, \beta_2$ are two zeros of $q_1(z)$ (resp. of $q_2(z)$), then $\bigl|\ln|p_1(\beta_1)| - \ln|p_1(\beta_2)|\bigr| \le 2K$ (resp. $\bigl|\ln|p_2(\beta_1)| - \ln|p_2(\beta_2)|\bigr| \le 2K$).
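Conditions (i)-(iii) are straightforward to test numerically. The following sketch assumes that the zeros of $q_1$, $q_2$ are supplied as lists of (complex) numbers and the numerators $p_1$, $p_2$ as coefficient lists in descending order; the function names `poly_eval` and `in_asymptotic_region` are hypothetical and not taken from the paper.

```python
import math

def poly_eval(coeffs, z):
    """Evaluate a polynomial given by coefficients in descending order (Horner scheme)."""
    acc = 0
    for c in coeffs:
        acc = acc * z + c
    return acc

def in_asymptotic_region(roots1, p1, roots2, p2, K):
    """Check conditions (i)-(iii) of Definition 11.3 for the pair p1/q1, p2/q2.

    roots1, roots2: zeros of q1, q2; p1, p2: numerator coefficients (descending order).
    Assumes p_i does not vanish at the zeros of q_i, as for a rational map of full degree.
    """
    # (i) zeros of q1 and q2 are at least distance 1 apart
    if any(abs(a - b) < 1 for a in roots1 for b in roots2):
        return False
    for roots, p in ((roots1, p1), (roots2, p2)):
        # (ii) zeros within one cluster are at most 2K apart
        if any(abs(a - b) > 2 * K for a in roots for b in roots):
            return False
        # (iii) the values ln|p(beta)| differ by at most 2K within the cluster
        logs = [math.log(abs(poly_eval(p, b))) for b in roots]
        if max(logs) - min(logs) > 2 * K:
            return False
    return True
```

For instance, with zeros `[0, 0.5]` and numerator `z + 2` for the first cluster, and a single zero `10` with numerator `1` for the second, the pair passes the test for `K = 1` but fails condition (ii) for `K = 0.1`.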

In other words, $M^{\zeta_0}_{k,l}(K)$ corresponds to pairs of rational functions, which are within a fixed "distance" from $\Bigl(\frac{e^{a_1}}{(z-b_1)^k}, \frac{e^{a_2}}{(z-b_2)^l}\Bigr)$, where $b_1 = \sum \beta_i^1/k$, $b_2 = \sum \beta_i^2/l$, $a_1 = \sum \ln|p_1(\beta_i^1)|/k$, $a_2 = \sum \ln|p_2(\beta_i^2)|/l$ (here $\beta_1^1,\dots,\beta_k^1$ (resp. $\beta_1^2,\dots,\beta_l^2$) are the roots of $q_1(z)$ (resp. $q_2(z)$)). For an $m \in M^{\zeta_0}_{k,l}$, let us define
$$R^{\zeta_0}(m) = \min\bigl\{|\beta_i^1 - \beta_j^2|;\ i = 1,\dots,k,\ j = 1,\dots,l\bigr\}.$$
If $m = (S_1,\nu_1,S_2,\nu_2)$, then we obviously have $R(S_1,S_2) \ge R^{\zeta_0}(m)$. With these preliminaries, we have:

Theorem 11.4. For every $K > 0$, there exist positive constants $R_0, \alpha, C$ such that the map $\Phi_{\zeta_0}$ satisfies the following estimates in the region of $M^{\zeta_0}_{k,l}(K)$, where $R^{\zeta_0}(m) \ge R_0$ and $\zeta_0$ is at least distance $1/2$ from the roots of $(b_1 - b_2) + 2(a_1 - a_2)\zeta - (\bar b_1 - \bar b_2)\zeta^2$. Let $\Phi_{\zeta_0}(S_1,\nu_1,S_2,\nu_2) = (S,\nu)$. Then $d(S, S_1 \cup S_2) \le Ce^{-\alpha R}$. Moreover, the numerators $\tilde p_\zeta(z)$, $p_\zeta(z)$ of the rational functions of degree $k+l$, corresponding to $(S_1,\nu_1,S_2,\nu_2)$ and to $(S,\nu)$ and direction $\zeta$ (so that $p_{\zeta_0}(z) = \tilde p_{\zeta_0}(z)$), are also exponentially close for $\zeta$ sufficiently close to $\zeta_0$ in the sense that $|p_\zeta(\beta_i) - \tilde p_\zeta(\hat\beta_i)| \le Ce^{-\alpha R}\, |\tilde p_\zeta(\hat\beta_i)|$, where $\hat\beta_i, \beta_i$, $i = 1,\dots,k+l$, are the $\eta$-coordinates of points of $S_1 \cup S_2$ and of $S$ lying above $\zeta$.

Proof. According to Theorems 10.1 and 10.2 (and Remark 9.6), in the region under consideration, we can conjugate the flow $A(t,\zeta)$ by a unitary $u(t)$, $u(0) = 1$, so that $A(1,\zeta)$ is $Ce^{-\alpha R}$-close to a block-diagonal and symmetric $\hat A(\zeta)$ (and satisfying the reality condition (2.9)). Moreover, in the notation of Theorem 10.2, the matrix $g(\zeta)$ which conjugates $A(1,\zeta)$ to $\hat A(\zeta)$ is, for $\zeta$ close to $\zeta_0$, $Ce^{-\alpha R}$-close to identity. The solutions $\tilde\alpha, \tilde\beta$, defined before Definition 11.3, to the complex Nahm equation on $[0,2]$ are then exponentially close to satisfying the real equation, in the sense that the difference of the two sides in (5.2) is bounded by $Ce^{-\alpha R}$. It follows then, using Lemma 2.10 in [15], that the complex gauge transformation $G(t)$, $G(0) = G(2) = 1$, which solves the real equation is $Ce^{-\alpha R}$-close to a unitary gauge transformation, uniformly on $[0,2]$, and $\dot G G^{-1}$ is uniformly $Ce^{-\alpha R}$-close to a skew-hermitian matrix. The result follows.
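The map to $M_{k+l}$ defined earlier in this section on rational functions, sending $(p_1/q_1, p_2/q_2)$ with $q_1, q_2$ relatively prime to $P/Q$ with $Q = q_1q_2$ and $P \equiv p_i \bmod q_i$, is an instance of the Chinese remainder theorem for polynomials. A minimal sketch in exact rational arithmetic follows; polynomials are coefficient lists in ascending order, and all helper names (`trim`, `add`, `scale`, `mul`, `divmod_poly`, `inverse_mod`, `combine`) are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is the empty list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)])

def scale(a, c):
    return trim([c * x for x in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(a, b):
    """Return (quotient, remainder) of a divided by the nonzero polynomial b."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    r = [Fraction(x) for x in a]
    while len(r) >= len(b):
        c = r[-1] / b[-1]
        d = len(r) - len(b)
        q[d] = c
        for i, y in enumerate(b):
            r[i + d] -= c * y
        r = trim(r)
    return trim(q), r

def inverse_mod(a, m):
    """Inverse of a modulo m via the extended Euclidean algorithm (a, m relatively prime)."""
    r0, r1 = trim(m), divmod_poly(a, m)[1]
    s0, s1 = [], [Fraction(1)]          # invariant: s_i * a = r_i modulo m
    while len(r1) > 1:
        q, r2 = divmod_poly(r0, r1)
        r0, r1 = r1, r2
        s0, s1 = s1, add(s0, scale(mul(q, s1), -1))
    if not r1:
        raise ValueError("polynomials are not relatively prime")
    return divmod_poly(scale(s1, 1 / r1[0]), m)[1]

def combine(p1, q1, p2, q2):
    """Return (P, Q) with Q = q1*q2, deg P < deg Q, P = p1 mod q1 and P = p2 mod q2."""
    h = divmod_poly(mul(add(p2, scale(p1, -1)), inverse_mod(q1, q2)), q2)[1]
    return add(p1, mul(q1, h)), mul(q1, q2)

# Example: 1/z and 3/(z - 2) combine to (1 + z)/(z^2 - 2z).
P, Q = combine([1], [0, 1], [3], [-2, 1])
```

The construction $P = p_1 + q_1 h$ with $h \equiv (p_2 - p_1)q_1^{-1} \bmod q_2$ makes the degree bound $\deg P \le k+l-1$ visible, since $\deg h < l$. In the setting of the paper the coefficients would be complex rather than rational; exact fractions are used here only to keep the sketch free of rounding.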


12. Comparison of Metrics

We wish to show that the (local) biholomorphism $\Phi_{\zeta_0}$ of the previous section is very close to being an isometry when the clusters are far apart. Recall the definition 11.3 of the region $M^{\zeta_0}_{k,l}(K)$, and the notation following that definition. Then:

Theorem 12.1. Let $g$ and $\tilde g$ be the hyperkähler metrics on $M_{k+l}$ and $M_{k,l}$, respectively. For every $K > 0$, there exist positive constants $R_0, \alpha, C$ such that, in the region of $M^{\zeta_0}_{k,l}(K)$, where $R^{\zeta_0}(m) \ge R_0$ and $\zeta_0$ is at least distance $1/2$ from the roots of $(b_1 - b_2) + 2(a_1 - a_2)\zeta - (\bar b_1 - \bar b_2)\zeta^2$, the following estimate holds:
$$\bigl\|\Phi_{\zeta_0}^* g - \tilde g\bigr\| \le Ce^{-\alpha R}.$$

The remainder of the section is devoted to the proof of this theorem. The metric (3.4) on $M_{k+l}$ is given in terms of solutions to infinitesimal Nahm's equations (3.3). Things are more complicated for $M_{k,l}$. Although we have a description of $M_{k,l}$ as a space of solutions to Nahm's equations, it is not a moduli space (i.e. there is no gauge group involved). In particular, in our description of $M_{k,l}$, a tangent vector is a triple $(\tilde t_1, \tilde t_2, \tilde t_3)$ on $[0,1]$ satisfying only the last three equations in (3.3), with $\tilde t_0 = 0$ (and, of course, satisfying additional restrictions, since we allow spectral curves to vary only in special directions). Nevertheless, the first equation in (3.3) arises only by adding an infinitesimal gauge transformation, and this has no effect on the Kähler form corresponding to $\zeta_0$. This fact can be interpreted by trivialising the twistor space of any moduli space of solutions to Nahm's equations, such as $M_{k+l}$. For a solution $(T_1(t), T_2(t), T_3(t))$, set
$$A(t,\zeta) = (T_2(t) + iT_3(t)) + 2iT_1(t)\zeta + (T_2(t) - iT_3(t))\zeta^2, \quad A^\#(t,\zeta) = iT_1(t) + (T_2(t) - iT_3(t))\zeta$$
for $\zeta \ne \infty$, and
$$\tilde A(t,\zeta) = (T_2(t) + iT_3(t))/\zeta^2 + 2iT_1(t)/\zeta + (T_2(t) - iT_3(t)), \quad \tilde A^\#(t,\zeta) = -iT_1(t) - (T_2(t) + iT_3(t))/\zeta$$
for $\zeta \ne 0$. Then, over $\zeta \ne 0, \infty$, we have $\tilde A = A/\zeta^2$, $\tilde A^\# = A^\# - A/\zeta$. The fibrewise complex symplectic form, given by (3.7), on the twistor space $Z(M_{k+l})$ of $M_{k+l}$ is then equal to
$$\Omega_\zeta = \int_0^2 dA^\#(t,\zeta) \wedge dA(t,\zeta). \tag{12.1}$$
The Kähler form $\omega_1$, corresponding to the complex structure $I_0$, is then the linear term in the expansion of $\Omega_\zeta$ in $\zeta$.

We can give a similar interpretation of the complex symplectic form $\tilde\Omega_\zeta$ and the Kähler form $\tilde\omega_1$ on $M_{k,l}$. From the previous section, a solution $(T_1(t), T_2(t), T_3(t))$ to Nahm's equations on $(0,1]$, corresponding to a point of $M_{k,l}$, defines a meromorphic section of $L^2$ on $S_1 \cup S_2$ by first combining the matrices $T_i$ into a matricial polynomial $A(t,\zeta)$, as above, and then conjugating $A(1,\zeta)$ by a meromorphic $g(\zeta)$. If we extend, for $\zeta$ close to 0, $g(\zeta)$ to a path $g(\cdot,\zeta) : [0,1] \to SL(n,\mathbb{C})$ and define $A^\#(t,\zeta)$ as for $M_{k+l}$, then the form $\tilde\Omega_\zeta$ is equal to
$$2\int_0^1 d\Bigl(g(t,\zeta)A^\#(t,\zeta)g(t,\zeta)^{-1} - \frac{dg(t,\zeta)}{dt}\, g(t,\zeta)^{-1}\Bigr) \wedge d\Bigl(g(t,\zeta)A(t,\zeta)g(t,\zeta)^{-1}\Bigr). \tag{12.2}$$
Again, $\tilde\omega_1$ is the $\zeta$-coefficient of this expression. To estimate $d\Phi_{\zeta_0}$, we use the $SO(3)$-action to assume that $\zeta_0 = 0$. We observe, directly from the definitions, that $\Phi_0$ is not only a biholomorphism, but that it also respects the complex symplectic forms $\Omega_0$ and $\tilde\Omega_0$. Thus, to prove the theorem, it suffices to show that $\Phi_0^*\Omega_\zeta$, evaluated on vectors of length 1 in $\tilde g$, is exponentially close to $\tilde\Omega_\zeta$ for $\zeta$ close to 0. Equivalently, we can evaluate on tangent vectors $v$, such that $d\Phi_0(v)$ has length 1 in the metric $g$. Furthermore, the above expressions of the forms $\Omega_\zeta$ and $\tilde\Omega_\zeta$ do not depend on adding an infinitesimal gauge transformation (equal to zero at both ends of the interval) to a tangent vector. This means, in practice, that it does not matter whether we consider tangent vectors as being quadruples $(t_0,t_1,t_2,t_3)$ satisfying (3.3), or triples $(t_1,t_2,t_3)$ satisfying only the last three equations in (3.3) (with $t_0 = 0$).

We now consider a unit tangent vector to $N_{k,l}$, i.e. solutions $(\check t_0, \check t_1, \check t_2, \check t_3)$ to Eq. (3.3) on $[-1,0] \cup [0,1]$. The asymptotic region under consideration corresponds to an asymptotic region of $N_{k,l}$, and there we have $C^0$-bounds on tangent vectors, obtained as in [9, pp. 316–318]. From a tangent vector to $N_{k,l}$, we obtain a tangent vector to $M_{k,l}$, as an infinitesimal solution $(\tilde t_0, \tilde t_1, \tilde t_2, \tilde t_3)$ to Nahm's equations on $[0,1]$. This is done as an infinitesimal version of the proof of Theorems 10.1 and 10.2 (this is straightforward but rather long and we shall leave out the details), and the estimates, applied to the unit tangent bundle of the compact sets considered there, show that: (i) there is a pointwise $C^0$-bound on the $\tilde t_i$, (ii) $\tilde t_i(1)$ are exponentially close to being symmetric, and (iii) the infinitesimal variations of $g(t,\zeta)$ and $\frac{dg(t,\zeta)}{dt}$ are exponentially small for $\zeta$ close to 0. Furthermore, the following expression (which has nothing to do with the metric $\tilde g$)
$$N(\tilde t) = -2\sum_{i=0}^{3} \int_0^1 \operatorname{tr} \tilde t_i^{\,2} \tag{12.3}$$
is $O(1/R)$ close to 1 (essentially, by integrating the $O(e^{-\alpha Rs})$-difference between $\tilde t_i(s)$ and $\check t_i(s)$). Now, an infinitesimal version of the proof of Theorem 11.4 (Lemma 2.10 in [15] is now replaced by arguments on p. 152 in [6]) produces a tangent vector $(t_0,t_1,t_2,t_3)$ to $M_{k+l}$, which is pointwise exponentially close to $(\tilde t_0, \tilde t_1, \tilde t_2, \tilde t_3)$. The estimate on $N(\tilde t)$, together with a pointwise bound on $\tilde t_i$, shows that the length of $(t_0,t_1,t_2,t_3)$ in the metric $g$ is $O(1/R)$ close to 1. Hence, if we reverse the steps and assume that the $(t_0,t_1,t_2,t_3)$ thus obtained has length 1, then $(\tilde t_0, \tilde t_1, \tilde t_2, \tilde t_3)$ is still exponentially close to $(t_0,t_1,t_2,t_3)$ and the pointwise bound on $\tilde t_i(s)$ and exponential bound on the corresponding infinitesimal variations of $g(t,\zeta)$ and $\frac{dg(t,\zeta)}{dt}$ remain valid. This, together with the estimates on $A(t,\zeta)$ in the proof of Theorems 10.1 and 10.2, shows that (12.2) evaluated on two vectors $(\tilde t_0, \tilde t_1, \tilde t_2, \tilde t_3)$ is exponentially close to (12.1) evaluated on two unit vectors $(t_0,t_1,t_2,t_3)$. This completes the proof.

Remark 12.2. The spaces $N_{k,l}$ and $M_{k,l}$ are also biholomorphic for a fixed complex structure $I_{\zeta_0}$ (see the definition of the map $T$ in Sect. 6). The above proof shows that, in the asymptotic region of Theorem 12.1, this biholomorphism is $O(1/R)$-close to being an isometry. This is again (cf. Remark 6.2) analogous to the behaviour of the Taub-NUT metrics with positive and with negative mass parameter.

13. Concluding Remarks

13.1. It would be interesting to derive the hyperkähler metric on $M_{k,l}$ from physical principles, i.e. as a Lagrangian on pairs of monopoles of charges $k$ and $l$ with a relative electric charge.


13.2. The metric on $M_{k,l}$ can be constructed via the generalised Legendre transform of Lindström and Roček [26,27], analogously to the monopole metric [18,19,23]. This, and further twistor constructions, will be discussed elsewhere.

13.3. The constraints on spectral curves in $\Sigma_{k,l}$ are those for $SU(2)$-calorons of charge $(k,l)$ [13,31]. Is there any physics behind this?

13.4. As mentioned in the introduction we could not give a description of $M_{k,l}$ as a moduli space of Nahm's equations. Nevertheless there is an analogy with the description of the Gibbons-Manton metric in [8]. For $(S_1,S_2) \in \Sigma_{k,l}$ we would like to consider the flow $L^s(k+l-2)$ on $S_1 \cup S_2$ for all $s \ge 0$. The (unique) compactification (as the moduli space of semi-stable admissible sheaves) of $J^{g-1}(S_1 \cup S_2)$ has a stratum (of smallest dimension) isomorphic to $J^{g_1-1}(S_1) \times J^{g_2-1}(S_2)$. From the proof of Theorem 10.1 we know that the flow $L^s(k+l-2)$ approaches the flow $L^s(k+l-2)[-\tau(D)] \oplus L^s(k+l-2)[-D]$ on this boundary stratum as $s \to +\infty$. Can one obtain $M_{k,l}$ as a moduli space of solutions to Nahm's equations on $[0,+\infty)$ with the corresponding behaviour as $s \to +\infty$? The Nahm flow will have singularities, so this is certainly not obvious.

13.5. We defined, for every complex structure, a (finite-to-one) biholomorphism $\Phi_\zeta$ between open domains of $M_{k,l}$ and of $M_{k+l}$. On the other hand, we have, also for every complex structure, a biholomorphism $\Psi_\zeta$ between an open domain of $M_{k,l}$ and $M_k \times M_l$, namely the identity on pairs of rational functions. Given Proposition 1.1 or the arguments in the proof of Proposition 9.7 and Remark 12.2, we expect also $\Psi_\zeta$ to be an asymptotic isometry. To obtain a precise rate of approximation requires a more precise analysis of convergence in Proposition 9.7, but we expect, by analogy with the Gibbons-Manton metric, that the metrics on $M_{k,l}$ and on $M_k \times M_l$ are $O(1/R)$-close.

13.6. Finally, let us address the question of more than two clusters. As mentioned in the Introduction, it is clear how to define the "moduli space" $M_{n_1,\dots,n_s}$ of $s$ clusters with magnetic charges $n_1,\dots,n_s$, $n_1 + \dots + n_s = n$. We need $s$ spectral curves $S_i \in |\mathcal{O}(2n_i)|$ with $S_i \cap S_j = D_{ij} \cup D_{ji}$, $D_{ji} = \tau(D_{ij})$, and $s$ sections $\nu_i$ of $L^2\bigl[\sum_{j \ne i}(D_{ji} - D_{ij})\bigr]$ on every $S_i$. They need to satisfy conditions analogous to those for $M_{k,l}$. We also can define a pseudo-hyperkähler metric on $M_{n_1,\dots,n_s}$ just as for $M_{k,l}$ and even argue that a map $\Phi_\zeta$ to $M_n$ is a biholomorphism. One needs to show that the images of maps $\Phi_\zeta$ for different $\zeta$ cover the asymptotic region of $M_n$, i.e. to prove an analogue of Theorem 4.1 for $s$ clusters, and this might be hard, since we do not know what the analogue of $N_{k,l}$ should be. Nevertheless, to prove that $\Phi_\zeta$ is exponentially close to being an isometry in the asymptotic region of $M_{n_1,\dots,n_s}$ one does not need to rely on the arguments given here. In principle, one could try (also for the case of two clusters) to do everything in terms of theta functions of the spectral curves.

Acknowledgements. A Humboldt Fellowship, during which a part of this work has been carried out, is gratefully acknowledged.

References

1. Adams, M.R., Harnad, J., Hurtubise, J.: Isospectral Hamiltonian flows in finite and infinite dimensions II. Integration of flows. Commun. Math. Phys. 134, 555–585 (1990)
2. Alexeev, V.: Compactified Jacobians. http://front.math.ucdavis.edu/9608.5012,alg-geom/9608012, 1996
3. Atiyah, M.F., Hitchin, N.J.: The geometry and dynamics of magnetic monopoles. Princeton, NJ: Princeton University Press, 1988
4. Beauville, A.: Jacobiennes des courbes spectrales et systèmes hamiltoniens complètement intégrables. Acta Math. 164, 211–235 (1990)
5. Besse, A.L.: Einstein manifolds. Berlin: Springer Verlag, 1987
6. Bielawski, R.: Asymptotic behaviour of SU(2) monopole metrics. J. Reine Angew. Math. 468, 139–165 (1995)
7. Bielawski, R.: Monopoles, particles and rational functions. Ann. Glob. Anal. Geom. 14, 123–145 (1996)
8. Bielawski, R.: Monopoles and the Gibbons-Manton metric. Commun. Math. Phys. 194, 297–321 (1998)
9. Bielawski, R.: Asymptotic metrics for SU(N)-monopoles with maximal symmetry breaking. Commun. Math. Phys. 199, 297–325 (1998)
10. Bielawski, R.: Reducible spectral curves and the hyperkähler geometry of adjoint orbits. J. London Math. Soc. 76, 719–738 (2007)
11. Biquard, O.: Sur les équations de Nahm et les orbites coadjointes des groupes de Lie semi-simples complexes. Math. Ann. 304, 253–276 (1996)
12. de Cataldo, M.A., Migliorini, L.: The Douady space of a complex surface. Adv. in Math. 151, 283–312 (2000)
13. Charbonneau, B., Hurtubise, J.C.: Calorons, Nahm’s equations on S^1 and bundles over P^1 × P^1. http://arxiv.org/abs/math/0610804, 2006
14. Dancer, A.S.: Nahm’s equations and hyperkähler geometry. Commun. Math. Phys. 158, 545–568 (1993)
15. Donaldson, S.K.: Nahm’s equations and the classification of monopoles. Commun. Math. Phys. 96, 387–407 (1984)
16. Gibbons, G.W., Manton, N.S.: The moduli space metric for well-separated BPS monopoles. Phys. Lett. B 356, 32–38 (1995)
17. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
18. Houghton, C.J.: On the generalized Legendre transform and monopole metrics. J. High Energy Phys. 2 (2000)
19. Houghton, C.J., Manton, N.S., Romão, N.M.: On the constraints defining BPS monopoles. Commun. Math. Phys. 212, 219–243 (2000)
20. Hurtubise, J.C.: Monopoles and rational maps: a note on a theorem of Donaldson. Commun. Math. Phys. 100, 191–196 (1985)
21. Hurtubise, J.C.: The classification of monopoles for the classical groups. Commun. Math. Phys. 120, 613–641 (1989)
22. Hurtubise, J.C., Murray, M.K.: On the construction of monopoles for the classical groups. Commun. Math. Phys. 122, 35–89 (1989)
23. Ivanov, I.T., Roček, M.: Supersymmetric σ-models, twistors, and the Atiyah-Hitchin metric. Commun. Math. Phys. 182, 291–302 (1996)
24. Kronheimer, P.B.: A hyper-kählerian structure on coadjoint orbits of a semisimple complex group. J. London Math. Soc. 42, 193–208 (1990)
25. Kronheimer, P.B.: Instantons and the geometry of the nilpotent variety. J. Differ. Geom. 32, 473–490 (1990)
26. Lindström, U., Roček, M.: Scalar tensor duality and N = 1, 2 nonlinear σ-models. Nucl. Phys. 222B, 285–308 (1983)
27. Lindström, U., Roček, M.: New hyper-Kähler metrics and new supermultiplets. Commun. Math. Phys. 115, 21–29 (1988)
28. Manton, N.S.: Monopole interactions at long range. Phys. Lett. B 154, 397–400 (1985)
29. Nahm, W.: The construction of all self-dual monopoles by the ADHM method. In: Monopoles in quantum field theory, Singapore: World Scientific, 1982
30. Nakajima, H.: Lectures on Hilbert schemes of points on surfaces. Providence, RI: Amer. Math. Soc., 1999
31. Nye, T.M.W.: The Geometry of Calorons, Ph.D thesis, Univ. of Edinburgh, 2001, available at http://arxiv.org/list/hep-th/0311215, 2003
32. Poucin, G.: Théorème de Douady au-dessus de S. Ann. Scuola Norm. Sup. Pisa 23, 451–459 (1969)

Communicated by G.W. Gibbons

Commun. Math. Phys. 284, 713–774 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0607-1

Communications in

Mathematical Physics

Quantisation of Twistor Theory by Cocycle Twist S. J. Brain1, , S. Majid2 1 Mathematical Institute, 24-29 St. Giles’, Oxford, OX1 3LB, UK.

E-mail: [email protected]; [email protected]

2 School of Mathematical Sciences, Queen Mary, University of London, 327 Mile End Rd,

London E1 4NS, UK Received: 30 June 2007 / Accepted: 20 May 2008 Published online: 8 October 2008 – © Springer-Verlag 2008

Abstract: We present the main ingredients of twistor theory leading up to and including the Penrose-Ward transform in a coordinate algebra form which we can then ‘quantise’ by means of a functorial cocycle twist. The quantum algebras for the conformal group, twistor space $\mathbb{CP}^3$, compactified Minkowski space $\mathbb{CM}^\#$ and the twistor correspondence space are obtained along with their canonical quantum differential calculi, both in a local form and in a global $*$-algebra formulation which even in the classical commutative case provides a useful alternative to the formulation in terms of projective varieties. We outline how the Penrose-Ward transform then quantises. As an example, we show that the tautological bundle on $\mathbb{CM}^\#$ pulls back to the basic instanton on $S^4 \subset \mathbb{CM}^\#$ and that this observation quantises to obtain the Connes-Landi instanton on $\theta$-deformed $S^4$ as the pull-back of the tautological bundle on our $\theta$-deformed $\mathbb{CM}^\#$. We likewise quantise the fibration $\mathbb{CP}^3 \to S^4$ and use it to construct the bundle on $\theta$-deformed $\mathbb{CP}^3$ that maps over under the transform to the $\theta$-deformed instanton.

The work was mainly completed while S.M. was visiting July-December 2006 at the Isaac Newton Institute, Cambridge, which both authors thank for support.
Current address: SISSA International School for Advanced Studies, Via Beirut 2–4, 34014 Trieste, Italy

1. Introduction and Preliminaries

There has been a lot of interest in recent years in the ‘quantisation’ of space-time (in which the algebra of coordinates $x_\mu$ is noncommutative), including one class of examples of the Heisenberg form $[x_\mu, x_\nu] = \imath\theta_{\mu\nu}$, where the deformation parameter is an antisymmetric tensor or (when placed in canonical form) a single parameter $\theta$. One of the motivations here is from the effective theory of
the ends of open strings in a fixed D-brane [19] and in this context a lot of attention has been drawn to the existence of noncommutative instantons and other non-trivial noncommutative geometry that emerges, see [16] and references therein to a large amount of literature. One also has θ -versions of S 4 coming out of considerations of cyclic cohomology in noncommutative geometry (used to characterise what a noncommutative four-sphere should be), see notably [6,11]. In the present paper we show that underlying and bringing together these constructions is in fact a systematic theory of what could be called θ -deformed or ‘quantum’ twistor theory. Thus we introduce noncommutative versions of conformal complexified space-time CM# , of twistor space CP3 as well as of the twistor correspondence space F1,2 of 1-2-flags in C4 used in the Penrose-Ward transform [18,20]. Our approach is a general one but we do make contact for specific parameter values with some previous ideas on what should be noncommutative twistor space, notably with [9,10] even though these works approach the problem entirely differently. In our approach we canonically find not only the noncommutative coordinate algebras but also their algebras of differential forms. Indeed, because our quantisation takes the form of a ‘quantisation functor’, we find in principle the noncommutative versions of all suitably covariant constructions. Likewise, inside our θ -deformed CM# we find (again for certain parameter values) exactly the θ -deformed S 4 of [6] as well as its differential calculus. While the quantisation of twistor theory is our main motivation, most of the present paper is in fact concerned with properly setting up the classical theory from the ‘right’ point of view after which quantisation follows functorially. We provide in this paper two classical points of view, both of interest. 
The first is purely local and corresponds in physics to ordinary (complex) Minkowski space as the flat ‘affine’ part of $\mathbb{CM}^\#$. Quantisation at this level gives the kind of noncommutative space-time mentioned above, which can therefore be viewed as a local ‘patch’ of the actual noncommutative geometry. The actual varieties $\mathbb{CM}^\#$ and $\mathbb{CP}^3$ are however projective varieties and cannot therefore be described simply by generators and relations in algebraic geometry; one should rather pass to the ‘homogeneous coordinate algebras’ corresponding to the affine spaces $\widetilde{\mathbb{CM}}^\#$, $\widetilde{\mathbb{CP}}^3 = \mathbb{C}^4$. (Our notation throughout is that if $X \subseteq \mathbb{CP}^n$ is a projective variety then $\tilde X \subseteq \mathbb{C}^{n+1}$ is the affine variety which projects to $X$ upon deleting the origin and quotienting by the action of $\mathbb{C}^*$ onto one-dimensional subspaces.) Let us call this the ‘conventional approach’. We explain the classical situation in this approach in Sects. 1.1, 2 below, and quantise it (including the relevant quantum group of conformal transformations and the algebra of differential forms) in Sects. 4, 5. The classical Sects. 1.1, 2 here are not intended to be anything new but to provide a lightning introduction to the classical theory and an immediate coordinate algebra reformulation for those unfamiliar either with twistors or with algebraic groups. The quantum Sects. 4, 5 contain the new results in this stream of the paper and provide a more or less complete solution to the basic noncommutative differential geometry at the level of the quantum homogeneous coordinate algebras $\mathbb{C}_F[\widetilde{\mathbb{CM}}^\#]$, $\mathbb{C}_F[\widetilde{\mathbb{CP}}^3]$, etc. Here $F$ is a two-cocycle which is the general quantisation datum in the cocycle twisting method [12,13] that we use. Our second approach even to classical twistor theory is a novel one suggested in fact from quantum theory. We call this the ‘unitary or $*$-algebraic formulation’ of our projective varieties $\mathbb{CM}^\#$, $\mathbb{CP}^3$ as real manifolds, setting aside that they are projective varieties.
The idea is that mathematically $\mathbb{CM}^\#$ is the Grassmannian of two-planes in $\mathbb{C}^4$ and every point in it can therefore be viewed not as a two-plane but as a self-adjoint rank 2 projector $P$ that picks out the two-plane as the eigenspace of eigenvalue 1. Working
directly with such projectors as a coordinatisation of $\mathbb{CM}^\#$, its commutative coordinate $*$-algebra is therefore given by 16 generators $P^\mu{}_\nu$ with relations that $P.P = P$ as an algebra-valued matrix, $\mathrm{Tr}\,P = 2$ and the $*$-operation $(P^\mu{}_\nu)^* = P^\nu{}_\mu$. Similarly, $\mathbb{CP}^3$ is described as the commutative $*$-algebra with a matrix of generators $Q^\mu{}_\nu$, the relations $Q.Q = Q$, $\mathrm{Tr}\,Q = 1$ and the $*$-operation $(Q^\mu{}_\nu)^* = Q^\nu{}_\mu$. One may proceed similarly for all classical flag varieties. The merit of this approach is that if one forgets the $*$-structure one has algebras defined simply by generators and relations (they are the complexifications of our original projective varieties viewed as real manifolds), while the $*$-structure picks out the real forms that are $\mathbb{CM}^\#$, $\mathbb{CP}^3$ as real manifolds in our approach (these cannot themselves be described simply by generators and relations). Note that to have an actual variety one should replace the ideal corresponding to the above relations by its radical, and we will see how this works in practice. Finally, the complex structure of our projective varieties appears now in real terms as a structure on the cotangent bundle. This amounts to a new approach to projective geometry suggested by our theory for classical flag varieties and provides a second stream in the paper starting in Sect. 3. Note that there is no simple algebraic formula for the change of coordinates between describing a two-plane as a two-form and describing it as a rank 2 projector, so the projector coordinates have a very different flavour from those usually used for $\mathbb{CM}^\#$, $\mathbb{CP}^3$. For example, the tautological vector bundles in these coordinates are now immediate to write down and we find that the pull-back of the tautological bundle on $\mathbb{CM}^\#$ to a natural $S^4$ contained in it is exactly the instanton bundle given by the known projector for $S^4$ (it is the analogue of the Bott projector that gives the basic monopole bundle on $S^2$). We explain this calculation in detail in Sect. 3.1.
The Lorentzian version is also mentioned and we find that Penrose’s diamond compactification of Minkowski space arises very naturally in these coordinates. In Sect. 3.2 we explain the known fibration $\mathbb{CP}^3 \to S^4$ in our new approach, used to construct an auxiliary bundle that maps over under the Penrose-Ward transform to the basic instanton. The second merit of our approach is that just as commutative $C^*$-algebras correspond to (locally compact Hausdorff) topological spaces, quantisation has a precise meaning as a noncommutative $*$-algebra with (in principle) $C^*$-algebra completion. Moreover, one does not need to consider completions but may work at the $*$-algebra level, as has been shown amply in the last two decades in the theory of quantum groups [12]. The quantisation of all flag varieties, indeed of all varieties defined by ‘matrix’ type relations on a matrix of generators, is given in Sect. 6, with the quantum tautological bundle looked at explicitly in Sect. 6.1. Our quantum algebra $\mathbb{C}_F[\mathbb{CM}^\#]$ actually has three independent real parameters in the unitary case and takes a ‘Weyl form’ with phase factor commutation relations (see Proposition 6.3). We also show that only a one-parameter subfamily gives a natural quantum $S^4$ and in this case we recover exactly the $\theta$-deformed $S^4$ and its instanton as in [6,11], now from a different point of view as a ‘pull-back’ from our $\theta$-deformed $\mathbb{CM}^\#$. Finally, while our main results are concerned with the coordinate algebras and differential geometry behind twistor theory in the classical and quantum cases, we look in Sects. 7, 8 at enough of the deeper theory to see that our methods are compatible also with the Penrose-Ward transform and ADHM construction respectively. In these sections we concentrate on the classical theory but formulated in a manner that is then ‘quantised’ by our functorial method.
Since their formulation in noncommutative geometry is not fully developed we avoid for example the necessity of the implicit complex structures. We also expect our results to be compatible with another approach to the quantum version based on groupoid C ∗ -algebras [5]. Although we only sketch the quantum version,

we do show that our formulation includes for example the quantum basic instanton as would be expected. A full account of the quantum Penrose-Ward transform including an explicit treatment of the noncommutative complex structure is deferred to a sequel.

1.1. Conformal space-time. Classically, complex Minkowski space $\mathbb{CM}$ is the four-dimensional affine vector space $\mathbb{C}^4$ equipped with the metric
\[ ds^2 = 2(dz\,d\tilde z - dw\,d\tilde w) \]
written in double null coordinates [15]. Certain conformal transformations, such as isometries and dilations, are defined globally on $\mathbb{CM}$, whereas others, such as inversions and reflections, may map a light cone to infinity and vice versa. In order to obtain a group of globally-defined conformal transformations, we adjoin a light cone at infinity to obtain compactified Minkowski space, usually denoted $\mathbb{CM}^\#$. This compactification is achieved geometrically as follows (and is just the Plücker embedding, see for example [4,15,21]). One observes that the exterior algebra $\Lambda^2\mathbb{C}^4$ can be identified with the set of antisymmetric $4 \times 4$ matrices as
\[ x = \begin{pmatrix} 0 & s & -w & \tilde z \\ -s & 0 & -z & \tilde w \\ w & z & 0 & t \\ -\tilde z & -\tilde w & -t & 0 \end{pmatrix}, \]
the points of $\Lambda^2\mathbb{C}^4$ being identified with the six entries $x^{\mu\nu}$, $\mu < \nu$. Then $\mathrm{GL}_4 = \mathrm{GL}(4, \mathbb{C})$ acts from the left on $\Lambda^2\mathbb{C}^4$ by conjugation,
\[ x \mapsto a x a^t, \qquad a \in \mathrm{GL}_4. \]
We note that this action preserves the quadratic relation $\det x \equiv (st - z\tilde z + w\tilde w)^2 = 0$. From the point of view of $\Lambda^2\mathbb{C}^4$ this quadric, which we shall denote $\widetilde{\mathbb{CM}}^\#$, is the subset of the form
\[ \{a \wedge b : a, b \in \mathbb{C}^4\} \subset \Lambda^2\mathbb{C}^4 \]
(the antisymmetric projections of rank-one matrices, i.e. of decomposable elements of the tensor product). We exclude $x = 0$. Note that $x$ of the form
\[ x = \begin{pmatrix} 0 & a_{11}a_{22} - a_{21}a_{12} & -(a_{31}a_{12} - a_{11}a_{32}) & a_{11}a_{42} - a_{41}a_{12} \\ -(a_{11}a_{22} - a_{21}a_{12}) & 0 & -(a_{31}a_{22} - a_{21}a_{32}) & a_{21}a_{42} - a_{41}a_{22} \\ a_{31}a_{12} - a_{11}a_{32} & a_{31}a_{22} - a_{21}a_{32} & 0 & a_{31}a_{42} - a_{41}a_{32} \\ -(a_{11}a_{42} - a_{41}a_{12}) & -(a_{21}a_{42} - a_{41}a_{22}) & -(a_{31}a_{42} - a_{41}a_{32}) & 0 \end{pmatrix}, \]
or $x^{\mu\nu} = a^{[\mu}b^{\nu]}$, where $a = a_{\cdot 1}$, $b = a_{\cdot 2}$, automatically has determinant zero. Conversely, if the determinant vanishes then an antisymmetric matrix has this form over $\mathbb{C}$. To see this, we provide a short proof as follows. We have to solve
\[ a_1 b_2 - a_2 b_1 = s, \quad a_1 b_3 - a_3 b_1 = -w, \quad a_1 b_4 - a_4 b_1 = \tilde z, \]
\[ a_2 b_3 - a_3 b_2 = -z, \quad a_2 b_4 - a_4 b_2 = \tilde w, \quad a_3 b_4 - a_4 b_3 = t. \]
We refer to the first relation as the (12)-relation, the second as the (13)-relation and so forth. Now if a solution for $a_i$, $b_i$ exists, we make use of a ‘cycle’ consisting of the $(12)b_3$, $(23)b_1$, $(13)b_2$ relations (multiplied as shown) to deduce that
\[ a_1 b_2 b_3 = a_2 b_1 b_3 + s b_3 = a_3 b_1 b_2 + s b_3 - z b_1 = a_1 b_3 b_2 + s b_3 - z b_1 + w b_2, \]

hence a linear equation for $b$. The cycles consisting of the $(12)b_4$, $(24)b_1$, $(14)b_2$ relations, the $(13)b_4$, $(34)b_1$, $(14)b_3$ relations, and the $(23)b_4$, $(34)b_2$, $(24)b_3$ relations give altogether the necessary conditions
\[ \begin{pmatrix} 0 & -s & -w & z \\ s & 0 & -\tilde z & \tilde w \\ w & \tilde z & 0 & -t \\ -z & -\tilde w & t & 0 \end{pmatrix} \begin{pmatrix} b_4 \\ b_3 \\ b_2 \\ b_1 \end{pmatrix} = 0. \]
The matrix here is not the matrix $x$ above but it has the same determinant. Hence if $\det x = 0$ we know that a nonzero vector $b$ obeying these necessary conditions must exist. We now fix such a vector $b$, and we know that at least one of its entries must be non-zero. We treat each case in turn. For example, if $b_2 \neq 0$ then from the above analysis, the (12), (23) relations imply the (13) relation. Likewise (12), (24) $\Rightarrow$ (14) and (23), (24) $\Rightarrow$ (34). Hence the six original equations to be solved become the three linear equations in four unknowns $a_i$:
\[ a_1 b_2 - a_2 b_1 = s, \quad a_2 b_3 - a_3 b_2 = -z, \quad a_2 b_4 - a_4 b_2 = \tilde w \]
with the general solution
\[ a = \lambda b + b_2^{-1} \begin{pmatrix} s \\ 0 \\ z \\ -\tilde w \end{pmatrix}, \qquad \lambda \in \mathbb{C}. \]
One proceeds similarly in each of the other cases where a single $b_i \neq 0$. Clearly, adding any multiple of $b$ will not change $a \wedge b$, but we see that apart from this, $a$ is uniquely fixed by a choice of zero mode $b$ of a matrix with the same but permuted entries as $x$. It follows that every $x$ defines a two-plane in $\mathbb{C}^4$ spanned by the obtained linearly independent vectors $a$, $b$.

Such matrices $x$ with $\det x = 0$ are the orbit under $\mathrm{GL}_4$ of the point where $s = 1$, $t = z = \tilde z = w = \tilde w = 0$. It is easily verified that this point has isotropy subgroup $\tilde H$ consisting of elements of $\mathrm{GL}_4$ such that $a^3{}_\mu = a^4{}_\mu = 0$ for $\mu = 1, 2$ and $a^1{}_1 a^2{}_2 - a^2{}_1 a^1{}_2 = 1$. Thus $\widetilde{\mathbb{CM}}^\# = \mathrm{GL}_4/\tilde H$, where we quotient from the right.

Finally, we may identify conformal space-time $\mathbb{CM}^\#$ with the rays of the above quadric cone $st = z\tilde z - w\tilde w$ in $\Lambda^2\mathbb{C}^4$, identifying the finite points of space-time with the rays for which $t \neq 0$ (which have coordinates $z, \tilde z, w, \tilde w$ up to scale): the rays for which $t = 0$ give the light cone at infinity. It follows that the group $\mathrm{PGL}_4 = \mathrm{GL}_4/\mathbb{C}^*$ acts globally on $\mathbb{CM}^\#$ by conformal transformations and that every conformal transformation arises in this way. Observe that $\mathbb{CM}^\#$ is, in particular, the orbit of the point $s = 1$, $z = \tilde z = w = \tilde w = t = 0$ under the action of the conformal group $\mathrm{PGL}_4$. Moreover, by the above result we have that $\mathbb{CM}^\# = F_2(\mathbb{C}^4)$, the Grassmannian of two-planes in $\mathbb{C}^4$. We may equally identify $\mathbb{CM}^\#$ with the resulting quadric in the projective space $\mathbb{CP}^5$ by choosing homogeneous coordinates $s, z, \tilde z, w, \tilde w$ and projective representatives with $t = 0$ and $t = 1$. In doing so, there is no loss of generality in identifying the conformal group $\mathrm{PGL}_4$ with $\mathrm{SL}_4$ by representing each equivalence class with a transformation of unit determinant. Observing that $\Lambda^2\mathbb{C}^4$ has a natural metric
\[ \tilde\upsilon = 2(-ds\,dt + dz\,d\tilde z - dw\,d\tilde w), \]

we see that $\widetilde{\mathbb{CM}}^\#$ is the null cone through the origin in $\Lambda^2\mathbb{C}^4$. This metric may be restricted to this cone and moreover it descends to give a metric $\upsilon$ on $\mathbb{CM}^\#$ [18]. Indeed, choosing a projective representative $t = 1$ of the coordinate patch corresponding to the affine piece of space-time, we have
\[ \upsilon = 2(dz\,d\tilde z - dw\,d\tilde w), \]
thus recovering the original metric. Similarly, we find the metric on other coordinate patches of $\mathbb{CM}^\#$ by in turn choosing projective representatives $s = 1$, $z = 1$, $\tilde z = 1$, $w = 1$, $\tilde w = 1$.

Passing to the level of coordinate algebras, let us denote by $a^\mu{}_\nu$ the coordinate functions in $\mathbb{C}[\mathrm{GL}_4]$ (where we have now rationalised indices so that they are raised and lowered by the metric $\tilde\upsilon$) and by $s, t, z, \tilde z, w, \tilde w$ the coordinates in $\mathbb{C}[\Lambda^2\mathbb{C}^4]$. The algebra $\mathbb{C}[\Lambda^2\mathbb{C}^4]$ is the commutative polynomial algebra on these six generators with no further relations, whereas the algebra $\mathbb{C}[\widetilde{\mathbb{CM}}^\#]$ is the quotient by the further relation
\[ st - z\tilde z + w\tilde w = 0. \]
(Although we are ultimately interested in the projective geometry of the space described by this algebra, we shall put this point aside for the moment.) In the coordinate algebra (as an affine algebraic variety) we do not see the deletion of the zero point in $\widetilde{\mathbb{CM}}^\#$.
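The quadric relation and the decomposability argument of Sect. 1.1 can be checked on explicit numbers. The following is a minimal sketch in exact arithmetic, not part of the original text: the function names (`matrix_from_coords`, `wedge`, `particular_a`, and so on) are hypothetical, `zt` and `wt` stand for $\tilde z$ and $\tilde w$, and `particular_a` assumes $b_2 \neq 0$ and takes $\lambda = 0$ in the general solution.

```python
from fractions import Fraction

def matrix_from_coords(s, t, z, zt, w, wt):
    """Antisymmetric 4x4 matrix x in the layout of Sect. 1.1."""
    return [
        [0,   s,   -w,  zt],
        [-s,  0,   -z,  wt],
        [w,   z,   0,   t],
        [-zt, -wt, -t,  0],
    ]

def coords_from_matrix(x):
    """Read (s, t, z, zt, w, wt) back off the entries x^{mu nu}, mu < nu."""
    return (x[0][1], x[2][3], -x[1][2], x[0][3], -x[0][2], x[1][3])

def wedge(a, b):
    """Decomposable element x^{mu nu} = a^mu b^nu - a^nu b^mu."""
    return [[a[m] * b[n] - a[n] * b[m] for n in range(4)] for m in range(4)]

def quadric(s, t, z, zt, w, wt):
    """The polynomial st - z zt + w wt whose square is det x."""
    return s * t - z * zt + w * wt

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def particular_a(x, b):
    """Solution a = b_2^{-1} (s, 0, z, -wt) of a ^ b = x (lambda = 0).

    Assumes b[1] != 0 and that x really is of the form a ^ b.
    """
    s, t, z, zt, w, wt = coords_from_matrix(x)
    b2 = Fraction(b[1])
    return [s / b2, Fraction(0), z / b2, -wt / b2]

a, b = [1, 2, 3, 4], [2, 1, 0, 5]
x = wedge(a, b)
print("quadric on a ^ b:", quadric(*coords_from_matrix(x)))
print("det equals quadric squared:",
      det(matrix_from_coords(1, 2, 3, 4, 5, 6)) == quadric(1, 2, 3, 4, 5, 6) ** 2)
print("a recovered up to multiples of b:", wedge(particular_a(x, b), b) == x)
```

The recovered vector differs from the original `a` by a multiple of `b`, in line with the remark that adding any multiple of $b$ does not change $a \wedge b$.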

As explained, $\mathbb{C}[\widetilde{\mathbb{CM}}^\#]$ is essentially the algebra of functions on the orbit of the point $s = 1$, $t = z = \tilde z = w = \tilde w = 0$ in $\Lambda^2\mathbb{C}^4$ under the action of $\mathrm{GL}_4$. The specification of an element of $\mathrm{GL}_4/\tilde H$ that moves the base point to a point of $\widetilde{\mathbb{CM}}^\#$ becomes at the level of coordinate algebras the map
\[ \phi : \mathbb{C}[\widetilde{\mathbb{CM}}^\#] \cong \mathbb{C}[\mathrm{GL}_4]^{\mathbb{C}[\tilde H]}, \qquad \phi(x^{\mu\nu}) = a^\mu{}_1 a^\nu{}_2 - a^\nu{}_1 a^\mu{}_2. \]
As shown, the relation $st = z\tilde z - w\tilde w$ in $\mathbb{C}[\Lambda^2\mathbb{C}^4]$ automatically holds for the image of the generators, so this map is well-defined. Also in these dual terms there is a left coaction
\[ \Delta_L(x^{\mu\nu}) = a^\mu{}_\alpha a^\nu{}_\beta \otimes x^{\alpha\beta} \]
of $\mathbb{C}[\mathrm{GL}_4]$ on $\mathbb{C}[\Lambda^2\mathbb{C}^4]$. One should view the orbit base point above as a linear function on $\mathbb{C}[\Lambda^2\mathbb{C}^4]$ that sends $s$ to 1 and the rest to zero. Then applying this to $\Delta_L$ defines the above map $\phi$. By construction, and one may easily check if in doubt, the image of $\phi$ lies in the fixed subalgebra under the right coaction $\Delta_R = (\mathrm{id} \otimes \pi)\Delta$ of $\mathbb{C}[\tilde H]$ on $\mathbb{C}[\mathrm{GL}_4]$, where $\pi$ is the canonical surjection to
\[ \mathbb{C}[\tilde H] = \mathbb{C}[\mathrm{GL}_4]/\langle a^3{}_1 = a^3{}_2 = a^4{}_1 = a^4{}_2 = 0,\ a^1{}_1 a^2{}_2 - a^2{}_1 a^1{}_2 = 1 \rangle, \]
and $\Delta$ is the matrix coproduct of $\mathbb{C}[\mathrm{GL}_4]$.

Ultimately we want the same picture for the projective variety $\mathbb{CM}^\#$. In order to do this the usual route in algebraic geometry is to work with rational functions instead of polynomials in the homogeneous coordinate algebra and make the quotient by $\mathbb{C}^*$ as the subalgebra of total degree zero. Rational functions here may have poles so to be more precise, for any open set $U \subset X$ in a projective variety, we take the algebra
\[ \mathcal{O}_X(U) = \{a/b \mid a, b \in \mathbb{C}[\tilde X],\ a, b \text{ of equal degree},\ b(x) \neq 0\ \forall x \in U\}, \]

where $\mathbb{C}[\tilde X]$ denotes the homogeneous coordinate algebra of $X$ (the coordinate algebra of functions on the affine variety $\tilde X$ which projects to $X$) and $a$, $b$ are homogeneous. Doing this for any open set gives a sheaf of algebras. Of particular interest are principal open sets of the form $U_f = \{x \in X \mid f(x) \neq 0\}$ for any nonzero homogeneous $f$. Then
\[ \mathcal{O}_X(U_f) = \mathbb{C}[\tilde X][f^{-1}]_0, \]
where we adjoin $f^{-1}$ to the homogeneous coordinate algebra and the subscript 0 denotes the degree zero part. In the case of $\mathrm{PGL}_4$ we in fact have a coordinate algebra of regular functions
\[ \mathbb{C}[\mathrm{PGL}_4] := \mathbb{C}[\mathrm{GL}_4]^{\mathbb{C}[\mathbb{C}^*]} = \mathbb{C}[\mathrm{GL}_4]_0 \]
constructed as the affine algebra analogue of $\mathrm{GL}_4/\mathbb{C}^*$. It is an affine variety and not projective (yet one could view its coordinate algebra as $\mathcal{O}_{\mathbb{CP}^{15}}(U_D)$, where $D$ is the determinant). In contrast, $\mathbb{CM}^\#$ is projective and we have to work with sheaves. For example, $U_t$ is the open set where $t \neq 0$ with coordinate algebra
\[ \mathcal{O}_{\mathbb{CM}^\#}(U_t) := \mathbb{C}[\widetilde{\mathbb{CM}}^\#][t^{-1}]_0 \]
and there is a natural inclusion $\mathbb{C}[\mathbb{CM}] \to \mathcal{O}_{\mathbb{CM}^\#}(U_t)$ of the coordinate algebra of affine Minkowski space-time $\mathbb{CM}$ (polynomials in the four coordinate functions $x_1, x_2, x_3, x_4$ on $\mathbb{C}^4$ with no further relations) given by
\[ x_1 \mapsto t^{-1}z, \quad x_2 \mapsto t^{-1}\tilde z, \quad x_3 \mapsto t^{-1}w, \quad x_4 \mapsto t^{-1}\tilde w. \]
This is the coordinate algebra version of identifying the affine space-time $\mathbb{CM}$ with the patch of $\mathbb{CM}^\#$ for which $t \neq 0$.

2. Twistor Space and the Correspondence Space

Next we give the coordinate picture for twistor space $T = \mathbb{CP}^3 = F_1(\mathbb{C}^4)$, the set of lines in $\mathbb{C}^4$. As a partial flag variety this is also known to be a homogeneous space. At the non-projective level we just mean $\tilde T = \mathbb{C}^4$ with coordinates $Z = (Z^\mu)$ and the origin deleted. This is of course a homogeneous space for $\mathrm{GL}_4$ and may be identified as the orbit of the point $Z^1 = 1$, $Z^2 = Z^3 = Z^4 = 0$: the isotropy subgroup $\tilde K$ consists of elements such that $a^1{}_1 = 1$, $a^2{}_1 = a^3{}_1 = a^4{}_1 = 0$, giving the identification $\tilde T = \mathrm{GL}_4/\tilde K$ (again we quotient from the right). Again we pass to the coordinate algebra level. At this level we do not see the deletion of the origin, so we define $\mathbb{C}[\tilde T] = \mathbb{C}[\mathbb{C}^4]$. We have an isomorphism
\[ \phi : \mathbb{C}[\tilde T] \to \mathbb{C}[\mathrm{GL}_4]^{\mathbb{C}[\tilde K]}, \qquad \phi(Z^\mu) = a^\mu{}_1 \]
according to a left coaction
\[ \Delta_L(Z^\mu) = a^\mu{}_\alpha \otimes Z^\alpha. \]
One should view the principal orbit base point as a linear function on $\mathbb{C}^4$ that sends $Z^1$ to 1 and the rest to zero: as before, applying this to the coaction $\Delta_L$ defines $\phi$ as the

dual of the orbit construction. It is easily verified that the image of this isomorphism is exactly the subalgebra of $\mathbb{C}[\mathrm{GL}_4]$ fixed under
\[ \mathbb{C}[\tilde K] = \mathbb{C}[\mathrm{GL}_4]/\langle a^1{}_1 = 1,\ a^2{}_1 = a^3{}_1 = a^4{}_1 = 0 \rangle \]
by the right coaction on $\mathbb{C}[\mathrm{GL}_4]$ given by projection from the coproduct of $\mathbb{C}[\mathrm{GL}_4]$.

Finally we introduce a new space $F$, the ‘correspondence space’, as follows. For each point $Z \in T$ we define the associated ‘$\alpha$-plane’
\[ \hat Z = \{x \in \mathbb{CM}^\# \mid x \wedge Z = x^{[\mu\nu}Z^{\rho]} = 0\} \subset \mathbb{CM}^\#. \]
The condition on $x$ is independent both of the scale of $x$ and of $Z$, so constructions may be done ‘upstairs’ in terms of matrices, but we also have a well-defined map at the projective level. The $\alpha$-plane $\hat Z$ contains for example all points in the quadric of the form $W \wedge Z$ as $W \in \mathbb{C}^4$ varies. Any multiple of $Z$ does not contribute, so $\hat Z$ is a three-dimensional subspace of $\widetilde{\mathbb{CM}}^\#$ and hence a $\mathbb{CP}^2$ contained in $\mathbb{CM}^\#$ (it is the image of some two-dimensional subspace of $\mathbb{CM}$ under the conformal compactification, whence the term ‘plane’). Explicitly, the condition $x \wedge Z = 0$ in our coordinates is:
\[ \begin{aligned} \tilde z Z^3 + w Z^4 - t Z^1 &= 0, & \tilde w Z^3 + z Z^4 - t Z^2 &= 0, \\ s Z^3 + w Z^2 - z Z^1 &= 0, & s Z^4 - \tilde z Z^2 + \tilde w Z^1 &= 0. \end{aligned} \tag{1} \]
If $t \neq 0$ one can check (using the quadric relation $\det x = 0$) that the second pair of equations is implied by the first, so generically we have two equations for four unknowns as expected. Moreover, for each such plane $\hat Z$, one has in the Lorentzian case the property that the aforementioned metric $\upsilon$ vanishes upon restriction to $\hat Z$, i.e. that at each point of $\hat Z$ one has $\upsilon(A, B) = 0$ for any pair $A$, $B$ of vectors tangent to the plane (we say $\hat Z$ is totally null). One may also check that such a plane $\hat Z$ is totally null if and only if the bivector $\pi = A \wedge B$ defined at each point of the plane (uniquely determined up to scale by any linearly independent choice of $A$, $B$) is self-dual with respect to the Hodge $*$-operator. Thus one should think of twistors $Z$ as parameterising the totally null two-planes with self-dual tangent bivector. We note that one may also construct ‘$\beta$-planes’, for which the tangent bivector is anti-self-dual: these are instead parameterized by three-forms in the role of $Z$.

Conversely, given any point $x \in \mathbb{CM}^\#$ we define the ‘line’
\[ \hat x = \{Z \in T \mid x \in \hat Z\} = \{Z \in T \mid x^{[\mu\nu}Z^{\rho]} = 0\} \subset T. \]
We have seen that we may write $x = a \wedge b$, and indeed $Z = \lambda a + \mu b$ solves this equation for all $\lambda, \mu \in \mathbb{C}$. This is a two-plane in $\tilde T = \mathbb{C}^4$ which projects to a $\mathbb{CP}^1$ contained in $\mathbb{CP}^3$, thus each $\hat x$ is a projective line in twistor space $T = \mathbb{CP}^3$. We may now define $F$ as the set of pairs $(Z, x)$, where $x \in \mathbb{CM}^\#$ and $Z \in T$ are such that $x \in \hat Z$ (or equivalently $Z \in \hat x$), i.e. such that $x \wedge Z = 0$. This space fibres naturally over both space-time and twistor space via the obvious projections
\[ \mathbb{CP}^3 \xleftarrow{\;p\;} F \xrightarrow{\;q\;} \mathbb{CM}^\#. \tag{2} \]
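The incidence relation $x \wedge Z = 0$ that defines $F$ is easy to test on explicit numbers. The sketch below is illustrative only: the helper names (`wedge_coords`, `incidence`, `on_line`, `solve_first_pair`) are hypothetical, and `zt`, `wt` stand for $\tilde z$, $\tilde w$. It evaluates the four left-hand sides of (1), confirms that every $Z = \lambda a + \mu b$ lies on the line $\hat x$ of $x = a \wedge b$, and checks that for $t \neq 0$ solving the first pair of equations forces the second pair.

```python
from fractions import Fraction

def wedge_coords(a, b):
    """Coordinates (s, t, z, zt, w, wt) of x = a ^ b in the layout of Sect. 1.1."""
    x = lambda m, n: a[m] * b[n] - a[n] * b[m]
    return (x(0, 1), x(2, 3), -x(1, 2), x(0, 3), -x(0, 2), x(1, 3))

def incidence(coords, Z):
    """Left-hand sides of the four equations (1) expressing x ^ Z = 0."""
    s, t, z, zt, w, wt = coords
    Z1, Z2, Z3, Z4 = Z
    return (
        zt * Z3 + w * Z4 - t * Z1,
        wt * Z3 + z * Z4 - t * Z2,
        s * Z3 + w * Z2 - z * Z1,
        s * Z4 - zt * Z2 + wt * Z1,
    )

def on_line(a, b, lam, mu):
    """The twistor Z = lam * a + mu * b in the two-plane spanned by a and b."""
    return [lam * ai + mu * bi for ai, bi in zip(a, b)]

def solve_first_pair(coords, Z3, Z4):
    """For t != 0, solve the first two equations of (1) for Z^1 and Z^2."""
    s, t, z, zt, w, wt = coords
    t = Fraction(t)
    return [(zt * Z3 + w * Z4) / t, (wt * Z3 + z * Z4) / t,
            Fraction(Z3), Fraction(Z4)]

a, b = [1, 2, 3, 4], [2, 1, 0, 5]
coords = wedge_coords(a, b)
print("Z on the line:", incidence(coords, on_line(a, b, 2, -3)))
print("Z off the line:", incidence(coords, [0, 0, 1, 0]))
print("first pair implies second:", incidence(coords, solve_first_pair(coords, 1, 1)))
```

Exact rational arithmetic is used so that the vanishing of the four expressions can be tested with equality instead of a floating-point tolerance.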

Clearly we have by construction that
\[ \hat Z = q(p^{-1}(Z)), \qquad \hat x = p(q^{-1}(x)). \]
It is also clear that the defining relation of $F$ is preserved under the action of $\mathrm{GL}_4$. From the Grassmannian point of view, $Z \in \mathbb{CP}^3 = F_1(\mathbb{C}^4)$ is a line in $\mathbb{C}^4$ and $\hat Z \subset F_2(\mathbb{C}^4)$ is the set of two-planes in $\mathbb{C}^4$ containing this line. Moreover, $\hat x \subset F_1(\mathbb{C}^4)$ consists of all one-dimensional subspaces of $\mathbb{C}^4$ contained in $x$ viewed as a two-plane in $\mathbb{C}^4$. It follows that $F$ is the partial flag variety $F_{1,2}(\mathbb{C}^4)$ of subspaces $\mathbb{C} \subset \mathbb{C}^2 \subset \mathbb{C}^4$. Here $x \in \mathbb{CM}^\# = F_2(\mathbb{C}^4)$ is a two-plane in $\mathbb{C}^4$ and $Z \in T = \mathbb{CP}^3$ is a line in $\mathbb{C}^4$ contained in this plane. From this point of view it is known that the homology $H_4(\mathbb{CM}^\#)$ is two-dimensional and indeed one of the generators is given by any $\hat Z$ (they are all homologous and parameterized by $\mathbb{CP}^3$). The other generator is given by a similar construction of ‘$\beta$-planes’ [15,18] with correspondence space $F_{2,3}(\mathbb{C}^4)$ and with $F_3(\mathbb{C}^4) = (\mathbb{CP}^3)^*$. Likewise, the homology $H_2(\mathbb{CP}^3)$ is one-dimensional and indeed any flag $\hat x$ is a generator (they are all homologous and parameterized by $\mathbb{CM}^\#$). For more details on the geometry of this construction, see [15]. More on the algebraic description can be found in [4].

This $F$ is known to be a homogeneous space. Moreover, $\tilde F$ (the affine variety which projects to $F$) can be viewed as a quadric in $(\Lambda^2\mathbb{C}^4) \otimes \mathbb{C}^4$, and hence the orbit under the action of $\mathrm{GL}_4$ of the point in $\tilde F$ where $s = 1$, $Z^1 = 1$ and all other coordinates are zero. The isotropy subgroup $\tilde R$ of this point consists of those $a \in \mathrm{GL}_4$ such that $a^2{}_1 = 0$, $a^3{}_\mu = a^4{}_\mu = 0$ for $\mu = 1, 2$ and $a^1{}_1 = a^2{}_2 = 1$. As one should expect, $\tilde R = \tilde H \cap \tilde K$. At the level of the coordinate rings, the identification of $\tilde F$ with the quadric in $(\Lambda^2\mathbb{C}^4) \otimes \mathbb{C}^4$ gives the definition of $\mathbb{C}[\tilde F]$ as the polynomials in the coordinate functions $x^{\mu\nu}$, $Z^\alpha$, modulo the quadric relations and the relations (1). That it is an affine homogeneous space is the isomorphism
\[ \phi : \mathbb{C}[\tilde F] \to \mathbb{C}[\mathrm{GL}_4]^{\mathbb{C}[\tilde R]}, \qquad \phi(x^{\mu\nu} \otimes Z^\beta) = (a^\mu{}_1 a^\nu{}_2 - a^\nu{}_1 a^\mu{}_2)\, a^\beta{}_1, \]
according to the left coaction
\[ \Delta_L(x^{\mu\nu} \otimes Z^\sigma) = a^\mu{}_\alpha a^\nu{}_\beta a^\sigma{}_\gamma \otimes (x^{\alpha\beta} \otimes Z^\gamma). \]
The image of $\phi$ is the invariant subalgebra under the right coaction of
\[ \mathbb{C}[\tilde R] = \mathbb{C}[\mathrm{GL}_4]/\langle a^2{}_1 = 0,\ a^1{}_1 = a^2{}_2 = 1,\ a^3{}_1 = a^3{}_2 = a^4{}_1 = a^4{}_2 = 0 \rangle \]
on $\mathbb{C}[\mathrm{GL}_4]$ given by projection from the coproduct of $\mathbb{C}[\mathrm{GL}_4]$.

3. $\mathrm{SL}_4$ and Unitary Versions

As discussed, the group $\mathrm{GL}_4$ acts on $\widetilde{\mathbb{CM}}^\# = \{x \in \Lambda^2\mathbb{C}^4 \mid \det x = 0\}$ by conjugation, $x \mapsto a x a^t$, and this picture descends to an action of the projective group $\mathrm{PGL}_4$ on the quotient space $\mathbb{CM}^\#$. Our approach accordingly was to work at the non-projective level

in order for the algebraic structure to have an affine form and pass at the end to the projective spaces $\mathbb{CM}^\#$, $T$ and $F$ as rational functions of total degree zero. If one wants to work with these spaces directly as homogeneous spaces one may do this as well, so that $\mathbb{CM}^\# = \mathrm{PGL}_4/P\tilde H$, and so on. From a mathematician’s point of view one may equally well define
\[ \mathbb{CM}^\# = F_2(\mathbb{C}^4) = \mathrm{GL}_4/H, \quad T = F_1(\mathbb{C}^4) = \mathrm{GL}_4/K, \quad F = F_{1,2}(\mathbb{C}^4) = \mathrm{GL}_4/R, \]
\[ H = \begin{pmatrix} * & * & * & * \\ * & * & * & * \\ 0 & 0 & * & * \\ 0 & 0 & * & * \end{pmatrix}, \quad K = \begin{pmatrix} * & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \\ 0 & * & * & * \end{pmatrix}, \quad R = H \cap K = \begin{pmatrix} * & * & * & * \\ 0 & * & * & * \\ 0 & 0 & * & * \\ 0 & 0 & * & * \end{pmatrix}, \]
where the overall $\mathrm{GL}_4$ determinants are non-zero. Here $H$ is slightly bigger than the subgroup $\tilde H$ we had before. As homogeneous spaces, $\mathbb{CM}^\#$, $T$ and $F$ carry left actions of $\mathrm{GL}_4$, which are essentially identical to those given above at both the $\mathrm{PGL}_4$ and at the non-projective levels. One equally well has
\[ \mathbb{CM}^\# = F_2(\mathbb{C}^4) = \mathrm{SL}_4/H, \quad T = F_1(\mathbb{C}^4) = \mathrm{SL}_4/K, \quad F = F_{1,2}(\mathbb{C}^4) = \mathrm{SL}_4/R, \]
where $H$, $K$, $R$ are as above but now viewed in $\mathrm{SL}_4$, and now $\mathbb{CM}^\#$, $T$ and $F$ carry canonical left actions of $\mathrm{SL}_4$ similar to those previously described. These versions would be the more usual in algebraic geometry, but at the coordinate level one does need to then work with an appropriate construction to obtain these projective or quasi-projective varieties. For example, if one simply computes the invariant functions $\mathbb{C}[\mathrm{SL}_4]^{\mathbb{C}[K]}$ etc. as affine varieties, one will not find enough functions.

As an alternative, we mention a version where we consider all our spaces in the double fibration as real manifolds, and express this algebraically in terms of $*$-structures on our algebras. Thus for example $\mathbb{CP}^3$ is a real six-dimensional manifold which we construct by complexifying it to an affine six-dimensional variety over $\mathbb{C}$, but we remember its real form by means of a $*$-involution on the complex algebra. The $*$-algebras in this approach can then in principle be completed to an operator-algebra setting and the required quotients made sense of in this context, though we shall not carry out this last step here. In this case the most natural choice is
\[ \mathbb{CM}^\# = \mathrm{SU}_4/H, \quad T = \mathrm{SU}_4/K, \quad F = \mathrm{SU}_4/R, \]
\[ H = S(U_2 \times U_2), \quad K = S(U_1 \times U_3), \quad R = H \cap K = S(U_1 \times U_1 \times U_2), \]
embedded in the obvious diagonal way into $\mathrm{SU}_4$. As homogeneous spaces, one has canonical actions now of $\mathrm{SU}_4$ from the left on $\mathbb{CM}^\#$, $T$ and $F$. For the coordinate algebraic version one expresses $\mathrm{SU}_4$ by generators $a^\mu{}_\nu$, the determinant relation and in addition the $*$-structure
\[ a^\dagger = Sa, \]
where $\dagger$ denotes transpose and $*$ on each matrix generator entry, $(a^\mu{}_\nu)^\dagger = (a^\nu{}_\mu)^*$, and $S$ is the Hopf algebra antipode characterised by $aS(a) = (Sa)a = \mathrm{id}$. This is as for any compact group or quantum group coordinate algebra. The coordinate algebras of the subgroups are similarly defined as $*$-Hopf algebras. There is also a natural $*$-structure on twistor space. To see this let us write it in the form

\[
T = \mathbb{CP}^3 = \{\, Q \in M_4(\mathbb{C}),\quad Q = Q^\dagger,\quad Q^2 = Q,\quad \mathrm{Tr}\, Q = 1 \,\}
\]

in terms of Hermitian conjugation †. Thus $\mathbb{CP}^3$ is the space of Hermitian rank-one projectors on $\mathbb{C}^4$. Such projectors can be written explicitly in the form $Q^\mu{}_\nu = Z^\mu \bar Z^\nu$ for some complex vector $Z$ of modulus one, determined only up to a $U_1$ normalisation. Thus $\mathbb{CP}^3 = S^7/U_1$ as a real six-dimensional manifold. In this description the left action of $SU_4$ is given by conjugation in $M_4(\mathbb{C})$, i.e. by unitary transformation of $Z$ and its inverse on $\bar Z$. One can exhibit the identification with the homogeneous space picture as the orbit of the projector $\mathrm{diag}(1, 0, 0, 0)$ (i.e. $Z = (1, 0, 0, 0) = Z^*$). The isotropy group of this is the intersection of $SU_4$ with $U_1 \times U_3$, as stated. The coordinate ∗-algebra version is
\[
\mathbb{C}[\mathbb{CP}^3] = \mathbb{C}[Q^\mu{}_\nu] / \langle Q^2 = Q,\ \mathrm{Tr}\, Q = 1 \rangle, \qquad Q = Q^\dagger,
\]
with the last equation now as a definition of the ∗-algebra structure via $\dagger = (\ )^{*t}$. We can also realise this as the degree zero subalgebra,

\[
\mathbb{C}[\mathbb{CP}^3] = \mathbb{C}[S^7]_0, \qquad \mathbb{C}[S^7] = \mathbb{C}[Z^\mu, Z^{\nu *}] \Big/ \Big\langle \sum_\mu Z^{\mu *} Z^\mu = 1 \Big\rangle,
\]
where $Z^\mu$, $Z^{\nu *}$ are two sets of generators related by the ∗-involution. The grading is given by $\deg(Z) = 1$ and $\deg(Z^*) = -1$, corresponding to the $U_1$-action on $Z$ and its inverse on $Z^*$. Finally, the left coaction is
\[
\Delta_L(Z^\mu) = a^\mu{}_\alpha \otimes Z^\alpha, \qquad \Delta_L(Z^{\mu *}) = S a^\alpha{}_\mu \otimes Z^{\alpha *},
\]
as required for a unitary coaction of a Hopf ∗-algebra on a ∗-algebra, as well as to preserve the defining relation of $\mathbb{C}[S^7]$.

We have similar ∗-algebra versions of $F$ and $\mathbb{CM}^\#$ as well. Indeed,
\[
\mathbb{C}[\mathbb{CM}^\#] = \mathbb{C}[P^\mu{}_\nu] / \langle P^2 = P,\ \mathrm{Tr}\, P = 2 \rangle, \qquad P = P^\dagger
\]
in terms of a rank two projector matrix of generators, while
\[
\mathbb{C}[F] = \mathbb{C}[Q^\mu{}_\nu, P^\mu{}_\nu] / \langle Q^2 = Q,\ P^2 = P,\ \mathrm{Tr}\, Q = 1,\ \mathrm{Tr}\, P = 2,\ PQ = Q = QP \rangle, \qquad Q = Q^\dagger,\ P = P^\dagger.
\]
We see that our ∗-algebra approach to flag varieties has a ‘quantum logic’ form. In physical terms, the fact that $P, Q$ commute as matrices (or matrices of generators in the coordinate algebras) means that they may be jointly diagonalised, while $QP = Q$ acting from the right says that the 1-eigenvectors of $Q$ are a subset of the 1-eigenvectors of $P$ (equivalently, $PQ = Q$ says that the 0-eigenvectors of $P$ are a subset of the 0-eigenvectors of $Q$). Thus the line which is the image of $Q$ is contained in the plane which is the image of $P$: this is of course the defining property of pairs of projectors $(Q, P) \in F = F_{1,2}(\mathbb{C}^4)$.

Clearly this approach works for all flag varieties $F_{k_1, \ldots, k_r}(\mathbb{C}^n)$ of $k_1 < \cdots < k_r$-dimensional planes in $\mathbb{C}^n$ as the ∗-algebra with $n \times n$ matrices $P_i$ of generators
\[
\mathbb{C}[F_{k_1, \ldots, k_r}] = \mathbb{C}[P_i{}^\mu{}_\nu;\ i = 1, \ldots, r] / \langle P_i^2 = P_i,\ \mathrm{Tr}\, P_i = k_i,\ P_{i+1} P_i = P_i = P_i P_{i+1} \rangle, \qquad P_i = P_i^\dagger.
\]
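The projector picture above can be checked numerically. The following is a minimal sketch, assuming hypothetical sample vectors in $\mathbb{C}^4$ and helper names (`projector`, `inner`, `close`) that are not part of the paper: it builds a rank-one projector $Q$ from a unit vector $Z$, a rank-two projector $P$ from $Z$ and an orthogonal unit vector $W$, and tests the defining relations of $\mathbb{CP}^3$, $\mathbb{CM}^\#$ and $F$.

```python
import cmath
import math

def normalise(vec):
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def inner(u, v):
    # Hermitian inner product, antilinear in the first slot
    return sum(a.conjugate() * b for a, b in zip(u, v))

def projector(vec):
    # Rank-one projector with entries Z^m * conj(Z^n) for a unit vector Z
    return [[zm * zn.conjugate() for zn in vec] for zm in vec]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical sample data: any generic pair of vectors would do.
Z = normalise([1 + 2j, 0.5j, -1.0, 2 - 1j])
W_raw = [0.3, 1 - 1j, 2j, 1.0]
# Gram-Schmidt step: remove the component of W_raw along Z.
W = normalise([w - inner(Z, W_raw) * z for z, w in zip(Z, W_raw)])

Q = projector(Z)              # a point of CP^3
P = add(Q, projector(W))      # a point of CM^# whose plane contains the line of Q

checks = {
    "Q hermitian": close(dagger(Q), Q),
    "Q idempotent": close(matmul(Q, Q), Q),
    "Tr Q = 1": abs(trace(Q) - 1) < 1e-12,
    "P hermitian": close(dagger(P), P),
    "P idempotent": close(matmul(P, P), P),
    "Tr P = 2": abs(trace(P) - 2) < 1e-12,
    "PQ = Q": close(matmul(P, Q), Q),
    "QP = Q": close(matmul(Q, P), Q),
    # Q depends on Z only up to a U(1) phase, reflecting CP^3 = S^7 / U_1.
    "phase invariance": close(projector([cmath.exp(0.7j) * c for c in Z]), Q),
}
print(checks)
```

The last check is the numerical counterpart of the statement that $Z$ is determined only up to a $U_1$ normalisation.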


In this setting all our algebras are polynomial algebras (with ∗-structure) and we can expect to be able to work algebraically. They define complex affine varieties if they are reduced (have no nontrivial nilpotents), which can be ensured if we enlarge the ideal $I$ corresponding to the relations above to its radical $\mathrm{rad}(I)$, defined as the set of elements for which some power lies in $I$. These reduced versions are canonically defined by the above non-reduced versions but they are very hard to compute in practice. For this reason we work with the non-reduced versions as our primary objects of interest. Aside from this subtlety, one may expect for example that $\mathbb{C}[\mathbb{CP}^3] = \mathbb{C}[SU_4]^{\mathbb{C}[S(U_1 \times U_3)]}$ (similarly for other flag varieties), and indeed we may identify the above generators and relations in the relevant invariant subalgebra of $\mathbb{C}[SU_4]$. This is the same approach as was successfully used for the Hopf fibration construction of $\mathbb{CP}^1 = S^2 = SU_2/U_1$ as $\mathbb{C}[SU_2]^{\mathbb{C}[U_1]}$, namely as $\mathbb{C}[SL_2]^{\mathbb{C}[\mathbb{C}^*]}$ with suitable ∗-algebra structures [14]. Note that one should not confuse such Hopf algebra (‘GIT’) quotients with complex algebraic geometry quotients, which are more complicated to define and typically quasi-projective.

Finally, we observe that in this approach the tautological bundle of rank $k$ over a flag variety $F_k(\mathbb{C}^n)$ appears tautologically as a matrix generator viewed as a projection $P \in M_n(\mathbb{C}[F_k])$. The classical picture is that the flag variety with this tautological bundle is universal for rank $k$ vector bundles.

3.1. Tautological bundle on $\mathbb{CM}^\#$ and the instanton as its Grassmann connection.

Here we illustrate the merit of this framework with a result that is surely known to some, but apparently not well known even at the classical level, and yet drops out very naturally in our ∗-algebra approach.
We show that the tautological bundle on $\mathbb{CM}^\#$ restricts in a natural way to $S^4 \subset \mathbb{CM}^\#$, where it becomes the basic one-instanton bundle, and for which the Grassmann connection associated to the projector is the basic one-instanton [2].

We first explain the notion of Grassmann connection for a projective module $E$ over an algebra $A$. We suppose that $E = A^n e$, where $e \in M_n(A)$ is a projection matrix acting on an $A$-valued row vector. Thus every element $v \in E$ takes the form
\[
v = \tilde v \cdot e = \tilde v_j e_j = (\tilde v_k) e_{kj} e_j,
\]
where the elements $e_j = e_{j\cdot} \in A^n$ span $E$ over $A$ and $\tilde v_i \in A$. The action of the Grassmann connection is the exterior derivative on components followed by projection back down to $E$:
\[
\nabla v = \nabla(\tilde v.e) = (d(\tilde v.e))e = ((d\tilde v_j) e_{jk} + \tilde v_j\, de_{jk}) e_k = (d\tilde v + \tilde v\, de)e.
\]
One readily checks that this is both well-defined and a connection in the sense that
\[
\nabla(av) = da.v + a\nabla v, \qquad \forall a \in A,\ v \in E,
\]
and that its curvature operator $F = \nabla^2$ on sections is
\[
F(v) = F(\tilde v.e) = (\tilde v.de.de).e.
\]
As a warm-up example we compute the Grassmann connection for the tautological bundle on $A = \mathbb{C}[\mathbb{CP}^1]$. Here $e = Q$, the projection matrix of coordinates in our ∗-algebraic set-up:
\[
e = Q = \begin{pmatrix} a & z \\ z^* & 1 - a \end{pmatrix}; \qquad a(1 - a) = zz^*, \quad a \in \mathbb{R},\ z \in \mathbb{C},
\]


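where $s = a - \tfrac12$ and $z = x + \imath y$ describes a usual sphere of radius $1/2$ in Cartesian coordinates $(x, y, s)$. We note that
\[
(1 - 2a)\, da = dz.z^* + z\, dz^*
\]
allows us to eliminate $da$ in the open patch where $a \neq \tfrac12$ (i.e. if we delete the equator of $S^2$). Then
\[
de.de = \begin{pmatrix} da & dz \\ dz^* & -da \end{pmatrix} \begin{pmatrix} da & dz \\ dz^* & -da \end{pmatrix}
= dz\, dz^* \begin{pmatrix} 1 & -\frac{2z}{1 - 2a} \\ -\frac{2z^*}{1 - 2a} & -1 \end{pmatrix}
= \frac{dz\, dz^*}{1 - 2a}\, (1 - 2e),
\]
and hence
\[
F(\tilde v.e) = -\frac{dz\, dz^*}{1 - 2a}\, \tilde v.e.
\]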
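The last step uses only the algebraic identity $(1 - 2e)e = -e$, which holds for any projector. As a minimal sketch (the function name `cp1_projector` and the sample values are assumptions made for illustration), one can confirm numerically that the matrix $e$ above is a Hermitian trace-one projector whenever $a(1 - a) = zz^*$, and that $(1 - 2e)e = -e$:

```python
import cmath
import math

def cp1_projector(a, phase):
    # e = [[a, z], [conj(z), 1 - a]] with |z|^2 = a(1 - a), for 0 <= a <= 1
    z = math.sqrt(a * (1 - a)) * cmath.exp(1j * phase)
    return [[a, z], [z.conjugate(), 1 - a]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

e = cp1_projector(0.3, 1.1)  # hypothetical sample point away from the equator a = 1/2
one_minus_2e = [[(1 if i == j else 0) - 2 * e[i][j] for j in range(2)]
                for i in range(2)]
minus_e = [[-e[i][j] for j in range(2)] for i in range(2)]

is_projector = close(matmul(e, e), e)
has_trace_one = abs(e[0][0] + e[1][1] - 1) < 1e-12
reflection_identity = close(matmul(one_minus_2e, e), minus_e)
print(is_projector, has_trace_one, reflection_identity)
```

This only checks the matrix algebra; the differential-form part of the computation is as in the displayed equations.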

In other words, $F$ acts as a multiple of the identity operator on $E = A^2.e$ and this multiple has the standard form for the charge one monopole connection if one converts to usual Cartesian coordinates. We conclude that the Grassmann connection for the tautological bundle on $\mathbb{CP}^1$ is the standard monopole of charge one. This is surely well known. The $q$-deformed version of this statement can be found in [8], provided one identifies the projector introduced there as the defining projection matrix of generators for $\mathbb{C}_q[\mathbb{CP}^1] = \mathbb{C}_q[SL_2]^{\mathbb{C}[t, t^{-1}]}$ as a ∗-algebra in the $q$-version of the above picture (the projector there obeys $\mathrm{Tr}_q(e) = 1$, where we use the $q$-trace). Note that if one looks for any algebra $A$ containing potentially non-commuting elements $a, z, z^*$ and a projection $e$ of the form above with $\mathrm{Tr}(e) = 1$, one immediately finds that these elements commute and obey the sphere relation as above. If one performs the same exercise with the $q$-trace, one finds exactly the four relations of the standard $q$-sphere as a ∗-algebra.

Next, we look in detail at $A = \mathbb{C}[\mathbb{CM}^\#]$ in our projector ∗-algebra picture. This has a $4 \times 4$ matrix of generators which we write in block form,
\[
P = \begin{pmatrix} A & B \\ B^\dagger & D \end{pmatrix}, \qquad \mathrm{Tr}\, A + \mathrm{Tr}\, D = 2, \qquad A^\dagger = A, \quad D^\dagger = D,
\]
\[
A(1 - A) = BB^\dagger, \qquad D(1 - D) = B^\dagger B, \tag{3}
\]
\[
(A - \tfrac12)B + B(D - \tfrac12) = 0, \tag{4}
\]

where we have written out the requirement that $P$ be a Hermitian algebra-valued projection without making any assumptions on the ∗-algebra $\mathbb{C}[\mathbb{CM}^\#]$ (so that these formulae also apply to any noncommutative version of $\mathbb{C}[\mathbb{CM}^\#]$ in our approach). To proceed further, it is useful to write
\[
A = a + \alpha \cdot \sigma, \qquad B = t + \imath x \cdot \sigma, \qquad B^\dagger = t^* - \imath x^* \cdot \sigma, \qquad D = 1 - a + \delta \cdot \sigma
\]
in terms of usual Pauli matrices $\sigma = (\sigma_1, \sigma_2, \sigma_3)$. We recall that these are traceless and Hermitian, so $a, \alpha, \delta$ are self-adjoint, whilst $t, x_i$, $i = 1, 2, 3$, are not necessarily so and are subject to (3)–(4).
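Relations (3)–(4) are simply the block form of $P^2 = P$, and this can be illustrated on a numerical example. The sketch below assumes hypothetical sample vectors and helper names; it builds a rank-two Hermitian projector on $\mathbb{C}^4$ from two orthonormal vectors, extracts the $2 \times 2$ blocks $A$, $B$, $D$, and tests the relations.

```python
import math

def normalise(vec):
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def projector(vec):
    return [[zm * zn.conjugate() for zn in vec] for zm in vec]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def shift(m, c):
    # m - c * identity
    return [[m[i][j] - (c if i == j else 0) for j in range(len(m))]
            for i in range(len(m))]

def one_minus(m):
    return [[(1 if i == j else 0) - m[i][j] for j in range(len(m))]
            for i in range(len(m))]

def block(m, rows, cols):
    return [[m[i][j] for j in cols] for i in rows]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical sample plane in C^4 spanned by orthonormal Z, W.
Z = normalise([1 + 1j, 2.0, -0.5j, 0.7])
W_raw = [0.2j, 1.0, 1 - 2j, -1.0]
W = normalise([w - inner(Z, W_raw) * z for z, w in zip(Z, W_raw)])
P = add(projector(Z), projector(W))

A = block(P, (0, 1), (0, 1))
B = block(P, (0, 1), (2, 3))
D = block(P, (2, 3), (2, 3))
zero = [[0, 0], [0, 0]]

checks = {
    "trace": abs(A[0][0] + A[1][1] + D[0][0] + D[1][1] - 2) < 1e-12,
    "(3a) A(1-A) = B B^dagger": close(matmul(A, one_minus(A)), matmul(B, dagger(B))),
    "(3b) D(1-D) = B^dagger B": close(matmul(D, one_minus(D)), matmul(dagger(B), B)),
    "(4)": close(add(matmul(shift(A, 0.5), B), matmul(B, shift(D, 0.5))), zero),
}
print(checks)
```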


Proposition 3.1. The commutative ∗-algebra $\mathbb{C}[\mathbb{CM}^\#]$ is defined by the above generators $a = a^*$, $\alpha = \alpha^*$, $\delta = \delta^*$, $t, t^*, x, x^*$ and the relations
\[
tt^* + x.x^* = a(1 - a) - \alpha.\alpha, \qquad (1 - 2a)(\alpha - \delta) = 2\imath(t^* x - t x^*), \qquad (1 - 2a)(\alpha + \delta) = 2\imath\, x \times x^*,
\]
\[
\alpha.\alpha = \delta.\delta, \qquad (\alpha + \delta).x = 0, \qquad (\alpha + \delta)t = (\alpha - \delta) \times x.
\]

Proof. This is a direct computation of (3)–(4) under the assumption that the generators commute. Writing our matrices in the form above, Eqs. (3) become
\[
a(1 - a) - \alpha \cdot \alpha + (1 - 2a)\alpha \cdot \sigma = tt^* + x \cdot \sigma\, x^* \cdot \sigma + \imath(x t^* - t x^*) \cdot \sigma,
\]
\[
(1 - a)a - \delta \cdot \delta - (1 - 2a)\delta \cdot \sigma = t^* t + x^* \cdot \sigma\, x \cdot \sigma + \imath(t^* x - x^* t) \cdot \sigma.
\]
Taking the sum and difference of these equations and in each case the parts proportional to 1 (which are the same on both right-hand sides) and the parts proportional to $\sigma$ (where the difference of the right-hand sides is proportional to $x \times x^*$) gives four of the stated equations (all except those involving terms $(\alpha + \delta) \cdot x$ and $(\alpha + \delta)t$). We employ the key identity $\sigma_i \sigma_j = \delta_{ij} + \imath \epsilon_{ijk} \sigma_k$, where $\epsilon$ with $\epsilon_{123} = 1$ is the totally antisymmetric tensor used in the definition of the vector cross product. Meanwhile, (4) becomes
\[
\imath \alpha.\sigma\, x \cdot \sigma + \alpha t \cdot \sigma + \imath x \cdot \sigma\, \delta \cdot \sigma + t \delta \cdot \sigma = 0
\]
after cancellations, and this supplies the remaining two relations using our key identity.

We see that in the open set where $a \neq \tfrac12$ we have $\alpha, \delta$ fully determined by the second and third relations, and the second ‘auxiliary’ line of equations then holds automatically. So the main free variables are the complex generators $t, x$, with $a$ determined from the first equation (or regarded as a further variable).

Proposition 3.2. There is a natural ∗-algebra quotient $\mathbb{C}[S^4]$ of $\mathbb{C}[\mathbb{CM}^\#]$ defined by the additional relations $x^* = x$, $t^* = t$ and $\alpha = \delta = 0$. The tautological projector of $\mathbb{C}[\mathbb{CM}^\#]$ becomes
\[
e = \begin{pmatrix} a & t + \imath x \cdot \sigma \\ t - \imath x \cdot \sigma & 1 - a \end{pmatrix} \in M_2(\mathbb{C}[S^4]).
\]
The Grassmann connection on the projective module $E = \mathbb{C}[S^4]^4 e$ is the basic instanton with local form
\[
(F \wedge F)(\tilde v.e) = -4!\, \frac{dt\, d^3x}{1 - 2a}\, \tilde v.e.
\]
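As a quick numerical illustration of Proposition 3.2, the sketch below (with the hypothetical helper `s4_projector` and arbitrarily chosen sample values) assembles the $4 \times 4$ matrix $e$ from real $(a, t, x)$ using the standard Pauli matrices and confirms that it is a Hermitian projector of trace two exactly when the point lies on the sphere $t^2 + x.x = a(1 - a)$.

```python
import math

def s4_projector(a, t, x):
    # e = [[a, B], [B^dagger, 1 - a]] with B = t + i x.sigma (standard Pauli matrices)
    x1, x2, x3 = x
    B = [[t + 1j * x3, 1j * x1 + x2],
         [1j * x1 - x2, t - 1j * x3]]
    Bd = [[B[j][i].conjugate() for j in range(2)] for i in range(2)]
    return [
        [a, 0, B[0][0], B[0][1]],
        [0, a, B[1][0], B[1][1]],
        [Bd[0][0], Bd[0][1], 1 - a, 0],
        [Bd[1][0], Bd[1][1], 0, 1 - a],
    ]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical sample point: choose a, then (t, x) on the sphere of radius sqrt(a(1-a)).
a = 0.3
r = math.sqrt(a * (1 - a))
t, x = 0.5 * r, (0.5 * r, -0.5 * r, 0.5 * r)

e = s4_projector(a, t, x)
on_sphere_is_projector = close(matmul(e, e), e)
trace_is_two = abs(sum(e[i][i] for i in range(4)) - 2) < 1e-12

# Moving off the sphere (here by doubling t) breaks the projector property.
e_off = s4_projector(a, 2 * t, x)
off_sphere_is_projector = close(matmul(e_off, e_off), e_off)
print(on_sphere_is_projector, trace_is_two, off_sphere_is_projector)
```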


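Proof. All relations in Proposition 3.1 are trivially satisfied in the quotient except $tt^* + x.x^* = a(1 - a)$, which is that of a four-sphere of radius $\tfrac12$ in the usual Cartesian coordinates $(t, x, s)$ if we set $s = a - \tfrac12$. The image $e$ of the projector exhibits $S^4 \subset \mathbb{CM}^\#$ as a projective variety in our ∗-algebra projector approach. We interpret this as providing a projective module over $\mathbb{C}[S^4]$, the pull-back of the tautological bundle on $\mathbb{CM}^\#$ from a geometrical point of view. To compute the curvature of its Grassmann connection we first note that
\[
dx \cdot \sigma\, dx \cdot \sigma = \imath (dx \times dx) \cdot \sigma, \qquad (dx \times dx) \cdot (dx \times dx) = 0,
\]
\[
(dx \times dx) \times (dx \times dx) = dx \times (dx \times dx) = 0, \qquad dx \cdot (dx \times dx) = 3!\, d^3x,
\]
since one-forms anticommute and since any four products of the $dx_i$ vanish. Now we have
\[
de\, de = \begin{pmatrix} da & dt + \imath dx \cdot \sigma \\ dt - \imath dx \cdot \sigma & -da \end{pmatrix} \begin{pmatrix} da & dt + \imath dx \cdot \sigma \\ dt - \imath dx \cdot \sigma & -da \end{pmatrix}
= \begin{pmatrix} -2\imath\, dt\, dx \cdot \sigma + \imath\, dx \times dx \cdot \sigma & 2\, da\, dt + 2\imath\, da\, dx \cdot \sigma \\ -2\, da\, dt + 2\imath\, da\, dx \cdot \sigma & 2\imath\, dt\, dx \cdot \sigma + \imath\, dx \times dx \cdot \sigma \end{pmatrix},
\]
and we square this matrix to find that
\[
(de)^4 = \begin{pmatrix} 1 & -\frac{2}{1 - 2a}(t + \imath x \cdot \sigma) \\ -\frac{2}{1 - 2a}(t - \imath x \cdot \sigma) & -1 \end{pmatrix} 4!\, dt\, d^3x = \frac{1 - 2e}{1 - 2a}\, 4!\, dt\, d^3x
\]
after substantial computation. For example, since $(da)^2 = 0$, the 1-1 entry is
\[
-(dx \times dx - 2\, dt\, dx).\sigma\, (dx \times dx - 2\, dt\, dx) \cdot \sigma = 2\, dt\, dx \cdot (dx \times dx) + 2 (dx \times dx) \cdot dt\, dx = 4!\, dt\, d^3x,
\]
where only the cross-terms contribute on account of the second observation above and the fact that $(dt)^2 = 0$. For the 1-2 entry we have similarly that
\[
2\imath\, da\, (dx \times dx - 2\, dt\, dx) \cdot \sigma\, (dt + \imath dx \cdot \sigma) + 2\imath\, da\, (dt + \imath dx \cdot \sigma)(dx \times dx + 2\, dt\, dx) \cdot \sigma
\]
\[
= 12\imath\, da\, dt\, (dx \times dx) \cdot \sigma - 4\, da\, (dx \times dx) \cdot \sigma\, dx \cdot \sigma
\]
\[
= -\frac{24}{1 - 2a}\, 2\imath\, x \cdot \sigma\, dt\, d^3x - \frac{8}{1 - 2a}\, t\, 3!\, dt\, d^3x = -\frac{2}{1 - 2a}(t + \imath x \cdot \sigma)\, 4!\, dt\, d^3x,
\]
where at the end we substitute
\[
da = \frac{2(t\, dt + x \cdot dx)}{1 - 2a},
\]
and note that $x \cdot dx\, (dx \times dx)_m = x_i\, dx_i\, \epsilon_{jkm}\, dx_j\, dx_k = 2 x_m\, d^3x$, since in the sum over $i$ only $i = m$ can contribute for a non-zero three-form. The 2-2 and 2-1 entries are analogous and left to the reader. We conclude that $(de)^4 e$ acts on $\mathbb{C}[S^4]^4$ from the right as a multiple of the identity as stated.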


One may check that $F = de\, de.e$ is anti-self-dual with respect to the usual Euclidean Hodge ∗-operator. Note also that the off-diagonal corners of $e$ are precisely a general quaternion $q = t + \imath x \cdot \sigma$ and its conjugate, which relates our approach to the more conventional point of view on the basic instanton. We stress however that this is not our starting point, as we come from $\mathbb{CM}^\#$, where the top right corner is a general $2 \times 2$ matrix $B$ and the bottom left corner its adjoint.

If instead we let $B$ be an arbitrary Hermitian matrix in the form
\[
B = t + x \cdot \sigma
\]
(i.e. we replace $\imath x$ above by $x$ and let $t^* = t$, $x^* = x$) then the quotient $t^* = t$, $x^* = x$, $\alpha = -\delta = 2tx/(1 - 2a)$ gives us
\[
s^2 + t^2 + \Big(1 + \frac{t^2}{s^2}\Big) x^2 = \frac14
\]
when $s = a - \tfrac12 \neq 0$, and $t x_i = 0$, $t^2 + x^2 + \alpha^2 = \tfrac14$ when $s = 0$. We can also approach this case directly from (3)–(4). We have to find Hermitian $A, D$, or equivalently $S = A - \tfrac12$, $T = D - \tfrac12$ with $\mathrm{Tr}(S + T) = 0$ and $S^2 = T^2 = \tfrac14 - B^2$. Since $B$ is Hermitian it has real eigenvalues and indeed after conjugation we can rotate $x$ to $|x|$ times a vector in the 3-direction, i.e. $B$ has eigenvalues $t \pm |x|$. It follows that square roots $S, T$ exist precisely when
\[
|t| + |x| \leq \tfrac12 \tag{5}
\]
and are diagonal in the same basis as $B$, hence they necessarily commute with $B$. In this case (4) becomes $(S + T).B = 0$. If $B$ has two nonzero eigenvalues then $S + T = 0$. If $B$ has one nonzero eigenvalue then $S + T$ has a zero eigenvalue, but the trace condition then again implies $S + T = 0$. If $B = 0$ our equations reduce to those for two self-adjoint $2 \times 2$ projectors $A, D$ with traces summing to two. In summary, if $B \neq 0$ there exists a projector of the form required if and only if $(t, x)$ lies in the diamond region (5), with $S = -T$ and a fourfold choice (the choice of root for each eigenvalue) of $S$ in the interior.

These observations about the moduli of projectors with $B$ Hermitian mean that the corresponding quotient of $\mathbb{C}[\mathbb{CM}^\#]$ is a fourfold cover of a diamond region in affine Minkowski space-time (viewed as the space of $2 \times 2$ Hermitian matrices). The diamond is conformally equivalent to a compactification of all of usual Minkowski space (the Penrose diagram for Minkowski space), while its fourfold covering reminds us of the Penrose diagram for a black-white hole pair. It is the analogue of the disk that one obtains by projecting $S^4$ onto its first two coordinates. One may in principle compute the connection associated to the pull-back of the tautological bundle to this region as well as the four-dimensional object of which it is a projection. The most natural version of this is to slightly change the problem to two $2 \times 2$ Hermitian matrices $S, B$ with $S^2 + B^2 = \tfrac14$ (a ‘matrix circle’), an interesting variety which will be described elsewhere.

Note also that in both cases $D = 1 - A$, and if we suppose this at the outset our equations including (4) and (3) simplify to
\[
[A, B] = [B, B^\dagger] = 0, \qquad A(1 - A) = BB^\dagger, \qquad A = A^\dagger. \tag{6}
\]
In fact this is the same calculation as for any potentially noncommutative $\mathbb{CP}^1$ which (if we use the usual trace) is forced to be commutative as mentioned above.
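The diamond condition (5) and the fourfold choice of roots can be made concrete. The following is a sketch under simplifying assumptions: $x$ has already been rotated to the 3-direction so that $B = \mathrm{diag}(t + r, t - r)$ with $r = |x|$, we take $D = 1 - A$, and the function name `diamond_projectors` is hypothetical.

```python
import math

def diamond_projectors(t, r):
    """Projectors P = [[A, B], [B, 1 - A]] for Hermitian B = diag(t + r, t - r).

    A = 1/2 + S with S diagonal and S^2 = 1/4 - B^2. Returns an empty list
    when (t, r) lies outside the diamond |t| + r <= 1/2.
    """
    b = [t + r, t - r]
    if any(0.25 - bi * bi < 0 for bi in b):
        return []
    roots = [math.sqrt(0.25 - bi * bi) for bi in b]
    found = []
    for s1 in (roots[0], -roots[0]):
        for s2 in (roots[1], -roots[1]):
            a1, a2 = 0.5 + s1, 0.5 + s2
            found.append([
                [a1, 0, b[0], 0],
                [0, a2, 0, b[1]],
                [b[0], 0, 1 - a1, 0],
                [0, b[1], 0, 1 - a2],
            ])
    return found

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_projector(p, tol=1e-12):
    p2 = matmul(p, p)
    return all(abs(p2[i][j] - p[i][j]) < tol for i in range(4) for j in range(4))

inside = diamond_projectors(0.1, 0.2)    # |t| + r = 0.3, in the interior
outside = diamond_projectors(0.4, 0.2)   # |t| + r = 0.6, outside the diamond
print(len(inside), [is_projector(p) for p in inside], len(outside))
```

On the boundary of the diamond one or both square roots vanish, so some of the four sign choices coincide; the sketch does not deduplicate them.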


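Finally, returning to the general case of $\mathbb{C}[\mathbb{CM}^\#]$, we have emphasised ‘Cartesian coordinates’ with different signatures. From a twistor point of view it is more natural to work with the four matrix entries of $B$ as the natural twistor coordinates. This will also be key when we quantise. Thus equivalently to Proposition 3.1 we write
\[
B = \begin{pmatrix} z & \tilde w \\ w & \tilde z \end{pmatrix}, \qquad A = \begin{pmatrix} a + \alpha_3 & \alpha \\ \alpha^* & a - \alpha_3 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 - a + \delta_3 & \delta \\ \delta^* & 1 - a - \delta_3 \end{pmatrix}, \tag{7}
\]
where $a = a^*$, $\alpha_3 = \alpha_3^*$, $\delta_3 = \delta_3^*$ as before but all our other notations are different. In particular, $\alpha, \alpha^*, \delta, \delta^*, z, z^*, w, w^*, \tilde z, \tilde z^*, \tilde w, \tilde w^*$ are now complex generators.

Corollary 3.3. The relations of $\mathbb{C}[\mathbb{CM}^\#]$ in these new notations appear as
\[
zz^* + ww^* + \tilde z \tilde z^* + \tilde w \tilde w^* = 2(a(1 - a) - \alpha\alpha^* - \alpha_3^2),
\]
\[
(1 - 2a)\alpha = zw^* + \tilde w \tilde z^*, \qquad (1 - 2a)\delta = -z^* \tilde w - \tilde z w^*,
\]
\[
(1 - 2a)(\alpha_3 + \delta_3) = \tilde w \tilde w^* - ww^*, \qquad (1 - 2a)(\alpha_3 - \delta_3) = zz^* - \tilde z \tilde z^*,
\]
and the auxiliary relations
\[
\alpha\alpha^* + \alpha_3^2 = \delta\delta^* + \delta_3^2,
\]
\[
(\alpha_3 + \delta_3) \begin{pmatrix} z \\ \tilde z \end{pmatrix} = \begin{pmatrix} -\alpha & -\delta^* \\ \delta & \alpha^* \end{pmatrix} \begin{pmatrix} w \\ \tilde w \end{pmatrix}, \qquad
(\alpha_3 - \delta_3) \begin{pmatrix} w \\ \tilde w \end{pmatrix} = -\begin{pmatrix} -\alpha^* & -\delta^* \\ \delta & \alpha \end{pmatrix} \begin{pmatrix} z \\ \tilde z \end{pmatrix}.
\]
Here the overall sign in the last relation is the one obtained by writing out (4) entry by entry in the notation (7). Moreover, $S^4 \subset \mathbb{CM}^\#$ appears as the ∗-algebra quotient $\mathbb{C}[S^4]$ defined by $w^* = -\tilde w$, $z^* = \tilde z$ and $\alpha = \delta = 0$.

Proof. It is actually easier to recompute these, but of course this is just a change of generators from the equations in Proposition 3.1.

Note that these ∗-algebra coordinates are more similar in spirit to, but not the same as, those for $\mathbb{CM}^\#$ as a projective quadric in Sect. 1.

3.2. Twistor space $\mathbb{CP}^3$ in the ∗-algebra approach.

For completeness, we also describe $\mathbb{CP}^3$ more explicitly in our ∗-algebra approach. As a warm-up we start with $\mathbb{CP}^2$, in line with $\mathbb{CP}^1$ already covered above. Thus $\mathbb{C}[\mathbb{CP}^2]$ has a trace 1 matrix of generators
\[
Q = \begin{pmatrix} a & x & y \\ x^* & b & z \\ y^* & z^* & c \end{pmatrix}, \qquad a + b + c = 1,
\]
with $a, b, c$ self-adjoint.

Proposition 3.4. $\mathbb{C}[\mathbb{CP}^2]$ is the algebra with the above matrix of generators with $a + b + c = 1$ and the projector relations
\[
x^* x = ab, \qquad y^* y = ac, \qquad z^* z = bc,
\]
\[
cx = yz^*, \qquad by = xz, \qquad az = x^* y.
\]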
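The relations of Proposition 3.4 can be seen at work on a numerical rank-one projector. The sketch below uses a hypothetical sample vector in $\mathbb{C}^3$ and illustrative helper names; it reads off $a, b, c, x, y, z$ from $Q$ and tests each relation.

```python
import math

def normalise(vec):
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def projector(vec):
    # Hermitian rank-one projector built from a unit vector
    return [[zm * zn.conjugate() for zn in vec] for zm in vec]

def conj(q):
    return q.conjugate()

Q = projector(normalise([1 - 1j, 2j, 0.5 + 0.25j]))  # hypothetical sample point of CP^2

a, b, c = (Q[i][i].real for i in range(3))
x, y, z = Q[0][1], Q[0][2], Q[1][2]

relations = {
    "a + b + c = 1": (a + b + c, 1),
    "x* x = ab": (conj(x) * x, a * b),
    "y* y = ac": (conj(y) * y, a * c),
    "z* z = bc": (conj(z) * z, b * c),
    "cx = y z*": (c * x, y * conj(z)),
    "by = xz": (b * y, x * z),
    "az = x* y": (a * z, conj(x) * y),
}
checks = {name: abs(lhs - rhs) < 1e-12 for name, (lhs, rhs) in relations.items()}
print(checks)
```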


Proof. First of all, the ‘projector relations’ $Q^2 = Q$ come out as the second line of relations stated and the relations
\[
a(1 - a) = X + Y, \qquad b(1 - b) = X + Z, \qquad c(1 - c) = Y + Z,
\]
where we use the shorthand $X = x^* x$, $Y = y^* y$, $Z = z^* z$. We subtract these from one another to obtain
\[
X - Z = (a - c)b, \qquad Y - Z = (a - b)c, \qquad X - Y = (b - c)a
\]
(in fact there are only two independent ones here). Combining with the original relations allows us to solve for $X, Y, Z$ as stated.

Clearly, if $a \neq 0$ (say), i.e. if we look at $\mathbb{C}[\mathbb{CP}^2][a^{-1}]$, we can regard $x, y$ (and their adjoints) and $a, a^{-1}, b, c$ as generators with the relations
\[
x^* x = ab, \qquad y^* y = ac, \qquad a + b + c = 1, \tag{8}
\]
and all the other relations become empty. Thus $az = x^* y$ is simply viewed as a definition of $z$ and one may check for example that $z^* z a^2 = y^* x x^* y = XY = bca^2$, as needed. Likewise, for example, $a y z^* = y y^* x = Yx = acx$ as required. We can further regard (8) as defining $b, c$, so the localisation viewed in this way is a punctured $S^4$ with complex generators $x, y$, real invertible generator $a$ and the relations
\[
x^* x + y^* y = a(1 - a),
\]
conforming to our expectations for $\mathbb{CP}^2$ as a complex two-manifold.

We can also consider setting $a, b, c$ to be real numbers with $a + b + c = 1$ and $b, c > 0$, $b + c < 1$. The inequalities here are equivalent to $ab, ac > 0$ and $a \neq 0$ (with $a > 0$ necessarily following since if $a < 0$ we would need $b, c < 0$ and hence $a + b + c < 0$, which is not allowed). We then have
\[
\mathbb{C}[\mathbb{CP}^2]\big|_{b, c > 0,\ b + c < 1} = \mathbb{C}[S^1 \times S^1],
\]
so the passage to this quotient algebra is geometrically an inclusion $S^1 \times S^1 \subset \mathbb{CP}^2$ with (8) defining the two circles (recall that $x, y$ are complex generators). As the parameters vary, the circles vary in size, so the general case with $a$ inverted can be viewed in this sense as an inclusion $\mathbb{C}^* \times \mathbb{C}^* \subset \mathbb{CP}^2$. This holds classically as an open dense subset (since $\mathbb{CP}^2$ is a toric variety). We have the same situation for $\mathbb{C}[\mathbb{CP}^1]$, where there is only one relation $x^* x = a(1 - a)$, i.e. circles $S^1 \subset \mathbb{CP}^1$ of different size as $0 < a < 1$. They are the circles of constant latitude and as $a$ varies in this range they map out $\mathbb{C}^*$ (viewed as $S^2$ with the north and south pole removed).

We now find similar results for $\mathbb{CP}^3$ (the general $\mathbb{CP}^n$ case is analogous). We now have a matrix of generators
\[
Q = \begin{pmatrix} a & x & y & z \\ x^* & b & w & v \\ y^* & w^* & c & u \\ z^* & v^* & u^* & d \end{pmatrix}, \qquad a^* = a,\ b^* = b,\ c^* = c,\ d^* = d, \qquad a + b + c + d = 1,
\]
and make free use of the shorthand notation
\[
X = x^* x, \quad Y = y^* y, \quad Z = z^* z, \quad U = u^* u, \quad V = v^* v, \quad W = w^* w.
\]


Proposition 3.5. $\mathbb{C}[\mathbb{CP}^3]$ is the commutative ∗-algebra with generators $Q$ of the form above with $a + b + c + d = 1$ and projector relations
\[
a(1 - a) = X + Y + Z, \qquad X - U = ab - cd, \qquad Y - V = ac - bd, \qquad Z - W = ad - bc,
\]
\[
au = y^* z, \quad av = x^* z, \quad aw = x^* y, \quad bu = w^* v, \quad cv = wu, \quad dw = vu^*,
\]
\[
cx = yw^*, \quad by = xw, \quad bz = xv, \quad dx = zv^*, \quad dy = zu^*, \quad cz = yu.
\]

Proof. We first write out the relations $Q^2 = Q$ as
\[
a(1 - a) = X + Y + Z, \qquad b(1 - b) = X + V + W, \tag{9}
\]
\[
c(1 - c) = Y + U + W, \qquad d(1 - d) = Z + U + V, \tag{10}
\]
\[
yw^* + zv^* = x(c + d), \qquad xw + zu^* = y(b + d), \qquad xv + yu = z(b + c), \tag{11}
\]
\[
y^* z + w^* v = u(a + b), \qquad x^* z + wu = v(a + c), \qquad x^* y + vu^* = w(a + d). \tag{12}
\]
We add and subtract several combinations of (9)–(10) to obtain the equivalent four equations stated in the proposition. For example, subtracting (9) gives $(Y - V) + (Z - W) = (c + d)(a - b)$ while subtracting (10) gives $(Y - V) - (Z - W) = (c - d)(a + b)$, and combining these gives the $Y - V$ and $Z - W$ relations stated; similarly for the $X - U$ relation. We can also write our three equations as
\[
(a + c)(a + d) + X - U = (a + b)(a + d) + Y - V = (a + b)(a + c) + Z - W = a, \tag{13}
\]
using $a + b + c + d = 1$. Next, we compute (12) assuming (11), for example
\[
(a + b)(a + c)u = (a + c)(y^* z + w^* v) = (a + c) y^* z + w^* (x^* z + wu)
\]
\[
= (a + c) y^* z + Wu + (y^* (b + d) - uz^*) z = y^* z + (W - Z)u,
\]
which, using (13), becomes $au = y^* z$. We similarly obtain $av = x^* z$, $aw = x^* y$. Given these relations, clearly (12) is equivalent to the next three stated equations, which completes the first six equations of this type; similarly for the remaining six.

Lemma 3.6. In $\mathbb{C}[\mathbb{CP}^3]$ we have
\[
(X - ab)(Y - (ac - bd)) = 0, \qquad (X - ab)(Z - (ad - bc)) = 0,
\]
\[
(Y - ac)(X - (ab - cd)) = 0, \qquad (Y - ac)(Z - (ad - bc)) = 0,
\]
\[
(Z - ad)(X - (ab - cd)) = 0, \qquad (Z - ad)(Y - (ac - bd)) = 0,
\]
\[
(X - ab)(X - b(1 - a)) = 0, \qquad (Y - ac)(Y - c(1 - a)) = 0, \qquad (Z - ad)(Z - d(1 - a)) = 0,
\]
\[
(X - ab)(X - a(1 - b)) = 0, \qquad (Y - ac)(Y - a(1 - c)) = 0, \qquad (Z - ad)(Z - a(1 - d)) = 0.
\]
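For a numerical rank-one projector all of the relations in Proposition 3.5 hold, together with $X = ab$, $Y = ac$, $Z = ad$, so that every factor $(X - ab)$, $(Y - ac)$, $(Z - ad)$ in Lemma 3.6 vanishes. The following sketch, with a hypothetical sample vector and illustrative names, tests this entry by entry.

```python
import math

def normalise(vec):
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def projector(vec):
    # Hermitian rank-one projector built from a unit vector
    return [[zm * zn.conjugate() for zn in vec] for zm in vec]

def conj(q):
    return q.conjugate()

Q = projector(normalise([1 + 2j, 0.5j, -1.0 + 0.3j, 2 - 1j]))  # hypothetical sample point

a, b, c, d = (Q[i][i].real for i in range(4))
x, y, z = Q[0][1], Q[0][2], Q[0][3]
w, v, u = Q[1][2], Q[1][3], Q[2][3]
X, Y, Zsq, U, V, W = (abs(q) ** 2 for q in (x, y, z, u, v, w))

relations = [
    (a * (1 - a), X + Y + Zsq),
    (X - U, a * b - c * d), (Y - V, a * c - b * d), (Zsq - W, a * d - b * c),
    (a * u, conj(y) * z), (a * v, conj(x) * z), (a * w, conj(x) * y),
    (b * u, conj(w) * v), (c * v, w * u), (d * w, v * conj(u)),
    (c * x, y * conj(w)), (b * y, x * w), (b * z, x * v),
    (d * x, z * conj(v)), (d * y, z * conj(u)), (c * z, y * u),
    # The stronger numerical statement, in which the Lemma 3.6 factors vanish:
    (X, a * b), (Y, a * c), (Zsq, a * d),
]
all_hold = all(abs(lhs - rhs) < 1e-12 for lhs, rhs in relations)
print(all_hold)
```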


Proof. For example, adu = dy ∗ z = uz ∗ z = u Z . In this way one has (X − ab)v = (X − ab)w = (Y − ac)u = (Y − ac)w = (Z − ad)u = (Z − ad)v = 0. Multiplying by u ∗ , v ∗ , w ∗ and replacing U, V, W using Proposition 3.5 gives the first two lines of relations. Next, by = xw in Proposition 3.5 implies b2 Y = X W and similarly for bz gives b2 (Y + Z ) = X (V +W ). We then use (9) to obtain X 2 −bX +b2 a(1−a) which factorises to one of the quadratic equations stated. Similarly, the equations a 2 V = X Z , a 2 W = X Y imply a 2 (V + W ) = X (Y + Z ), which yields the other quadratic equation for X , similarly for the other quadratic equations. Lemma 3.7. If we consider the trace 1 projection Q as a numerical Hermitian matrix of the form above, then X = ab, Y = ac, Z = ad necessarily holds. Proof. We use the preceding lemma but regarded as for real numbers (equivalently one can assume that our algebra has no zero divisors). Suppose without loss of generality that X = ab. Then by the lemma, V = W = 0 or Y = ac − bd, Z = ad − bc. We also have v = w = 0, and hence from Proposition 3.5 that x ∗ z = x ∗ y = 0. We can also deduce from the quadratic equations of X that a = b and X = a(1 − a) or U = a(1 − 2a) + cd. We distinguish two cases: (i) x = 0 in which case a = b = 0 (since X = a(1 − a) = ab = a 2 ) and (ii) x = 0, y = z = 0, in which case a = b = 0, 1, c + d = 0. In either case since b = 1 we have Y = ac and Z = ad, hence X = ab − cd or U = 0, u = 0, and hence y ∗ z = 0 while c, d = 0. This means that at most one of x, y, z is non-zero. We can now go through all of the subcases and find a contradiction in every case. Similar arguments prove that Y = ac, Z = ad. Let us denote by C− [CP3 ] the quotient of C[CP3 ] by the relations in the lemma. We call it the ‘regular form’ of the coordinate algebra for CP3 in our ∗-algebraic approach and will work with it henceforth. 
The lemma means that there is no discernible difference between the ∗-algebras $\mathbb{C}^-[\mathbb{CP}^3]$ and $\mathbb{C}[\mathbb{CP}^3]$ in the sense that if we were looking at $\mathbb{CP}^3$ as a set of projector matrices and the above variables as real or complex numbers, we would not see any distinction. (As long as the relevant intersections are transverse, the same would then be true in the algebras also, but it is beyond our scope to prove this here.) In algebraic terms, the relations in the lemma hold in the reduced version of $\mathbb{C}[\mathbb{CP}^3]$ defined by the radical construction. Imposing them takes us towards and may now coincide with that. Moreover, if either $x, y, z$ or $u, v, w$ are made invertible then one can show that the relations in the lemma indeed hold and do not need to be imposed, i.e. $\mathbb{C}[\mathbb{CP}^3]$ and $\mathbb{C}^-[\mathbb{CP}^3]$ have the same localisations in this respect.

Proposition 3.8. $\mathbb{C}^-[\mathbb{CP}^3]$ can be viewed as having generators $a, b, c, x, y, z$ with $a + b + c + d = 1$ and the relations
\[
X = ab, \qquad Y = ac, \qquad Z = ad,
\]
as well as auxiliary generators $u, v, w$ and auxiliary relations
\[
U = cd, \qquad V = bd, \qquad W = bc,
\]
\[
au = y^* z, \quad av = x^* z, \quad aw = x^* y, \quad bu = w^* v, \quad cv = wu, \quad dw = vu^*,
\]
\[
cx = yw^*, \quad by = xw, \quad bz = xv, \quad dx = zv^*, \quad dy = zu^*, \quad cz = yu.
\]
If $a \neq 0$ these auxiliary variables and equations are redundant.


Proof. If $a \neq 0$ (i.e. if we work in the algebra with $a^{-1}$ adjoined) we regard three of the auxiliary equations as a definition of $u, v, w$. We then verify that the other equations hold automatically. The first line is clear since these equations times $a^2$ were solved in the lemma above. For example, from the next line we have $a^2 w^* v = y^* x x^* z = X y^* z = ab\, y^* z = a^2 bu$, as required. Similarly $a(yw^* + zv^*) = yy^* x + zz^* x = (Y + Z)x = a(c + d)x$, as required.

From this we see that the ‘patch’ given by inverting $a$ is described by just three independent complex variables $x, y, z$ and one invertible real variable $a$ with the single relation
\[
x^* x + y^* y + z^* z = a(1 - a)
\]
(the relations stated can be viewed as a definition of $b, c, d$, but we still need $a + b + c + d = 1$), in other words a punctured $S^6$ where the point $x = y = z = a = 0$ is deleted. This conforms to our expectations for $\mathbb{CP}^3$ as a complex three-manifold or real six-manifold. Of course, our original projector system was symmetric and we could have equally well analysed and presented our algebra in a form adapted to one of $b, c, d \neq 0$.

Finally, we also see that if we set $a, b, c, d$ to actual real values with $a + b + c + d = 1$ and $b, c, d > 0$, $b + c + d < 1$ (the inequalities here are equivalent to $a \neq 0$ and $ab, ac, ad > 0$) then
\[
\mathbb{C}^-[\mathbb{CP}^3]\big|_{b, c, d > 0,\ b + c + d < 1} = \mathbb{C}[S^1 \times S^1 \times S^1]
\]
for three circles $x^* x = ab$, $y^* y = ac$, $z^* z = ad$, and no further relations. This is the analogue in our ∗-algebra approach of inclusions $S^1 \times S^1 \times S^1 \subset \mathbb{CP}^3$, and as the circles vary in radius we have part of the fact in the usual picture that $\mathbb{CP}^3$ is a toric variety (namely that $\mathbb{C}^* \times \mathbb{C}^* \times \mathbb{C}^* \subset \mathbb{CP}^3$ is open dense).

We now relate this description of twistor space to the space-time algebras in the previous section. In particular, we note that classically there is a fibration of twistor space over the Euclidean four-sphere, $\mathbb{CP}^3 \to S^4$, whose fibre is $\mathbb{CP}^1$ (see for example [21]). This fibration arises through the observation that each $\alpha$-plane in $\mathbb{CM}^\#$ intersects $S^4$ at a unique point (essentially because there are no null lines in Euclidean signature). The double fibration (2) thus collapses to a single fibration
\[
\mathbb{CP}^3 \to S^4
\]
(making the twistor theory of the real space-time $S^4$ much easier to study than that of its complex counterpart). To see this we make use of the following nondegenerate antilinear map on $\mathbb{C}^4$,
\[
J(Z) = J(Z^1, Z^2, Z^3, Z^4) := (-\bar Z^2, \bar Z^1, -\bar Z^4, \bar Z^3).
\]
Once again we recall that points of twistor space are one-dimensional subspaces of $\mathbb{C}^4$, whereas points of $\mathbb{CM}^\#$ are two-dimensional subspaces. Of course, given a one-dimensional subspace (spanned by $Z \in \mathbb{C}^4$), there are many two-dimensional subspaces in which it lies, and these constitute exactly the set $\hat Z = \mathbb{CP}^2$. However, the involution $J$ serves to pick out a unique such two-dimensional subspace, the one spanned by $Z$ and $J(Z)$.

Now recall our ‘quantum logic’ interpretation of the correspondence space $F$, as pairs of projectors $(Q, P)$ on $\mathbb{C}^4$ with $Q$ of rank one and $P$ of rank two such that $QP = Q = PQ$. Then since we have
\[
\mathbb{CP}^3 = S^7/U_1, \qquad S^7 = \Big\{ Z^\mu, \bar Z^\nu \ \Big|\ \sum_\mu \bar Z^\mu Z^\mu = 1 \Big\},
\]


the involution $J$ extends to one on $\mathbb{CP}^3$, given by $J(Z^\mu \bar Z^\nu) = J(\bar Z^\nu) J(Z^\mu)$. At the level of the coordinate algebra $\mathbb{C}^-[\mathbb{CP}^3]$ we have the following interpretation.

Lemma 3.9. There is an antilinear involution $J : \mathbb{C}^-[\mathbb{CP}^3] \to \mathbb{C}^-[\mathbb{CP}^3]$, given in the notation of Proposition 3.8 by
\[
J(a) = b, \qquad J(b) = a, \qquad J(c) = 1 - (a + b + c),
\]
\[
J(x) = -x, \quad J(y) = v^*, \quad J(z) = -w^*, \quad J(u) = -u, \quad J(v) = y^*, \quad J(w) = -z^*.
\]

Proof. This is by direct computation, noting that if we write
\[
Q = \begin{pmatrix}
\bar Z^1 Z^1 & \bar Z^1 Z^2 & \bar Z^1 Z^3 & \bar Z^1 Z^4 \\
\bar Z^2 Z^1 & \bar Z^2 Z^2 & \bar Z^2 Z^3 & \bar Z^2 Z^4 \\
\bar Z^3 Z^1 & \bar Z^3 Z^2 & \bar Z^3 Z^3 & \bar Z^3 Z^4 \\
\bar Z^4 Z^1 & \bar Z^4 Z^2 & \bar Z^4 Z^3 & \bar Z^4 Z^4
\end{pmatrix}, \qquad \mathrm{Tr}\, Q = 1,
\]
we see that
\[
J(Q) = \begin{pmatrix}
\bar Z^2 Z^2 & -\bar Z^1 Z^2 & \bar Z^4 Z^2 & -\bar Z^3 Z^2 \\
-\bar Z^2 Z^1 & \bar Z^1 Z^1 & -\bar Z^4 Z^1 & \bar Z^3 Z^1 \\
\bar Z^2 Z^4 & -\bar Z^1 Z^4 & \bar Z^4 Z^4 & -\bar Z^3 Z^4 \\
-\bar Z^2 Z^3 & \bar Z^1 Z^3 & -\bar Z^4 Z^3 & \bar Z^3 Z^3
\end{pmatrix}, \qquad \mathrm{Tr}\, J(Q) = \mathrm{Tr}\, Q = 1,
\]
and the result follows by comparing with the notation of Proposition 3.8. In particular we see that $J(X) = X$ and $J(U) = U$. The relations of Proposition 3.5 indicate that $J$ extends to the full algebra as an antialgebra map (indeed this needs to be the case for $J$ to be well-defined), since then we have
\[
J(au) = J(u)J(a) = -ub = -bu = -w^* v = J(z) J(y^*) = J(y^* z),
\]
and similarly for the remaining relations. Note that $J$ commutes with ∗.
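The table of values in Lemma 3.9 can be cross-checked numerically. The sketch below assumes the convention $Q_{\mu\nu} = \bar Z^\mu Z^\nu$ used in the proof, a hypothetical sample vector, and illustrative helper names; it compares the projector of $J(Z)$ with the stated images of the generators and also confirms $J^2 = -1$ on vectors and the orthogonality of $Z$ and $J(Z)$.

```python
import math

def normalise(vec):
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def J(vec):
    z1, z2, z3, z4 = vec
    return [-z2.conjugate(), z1.conjugate(), -z4.conjugate(), z3.conjugate()]

def projector(vec):
    # Convention of the proof of Lemma 3.9: Q[m][n] = conj(Z^m) * Z^n
    return [[zm.conjugate() * zn for zn in vec] for zm in vec]

def conj(q):
    return q.conjugate()

Z = normalise([1 + 2j, 0.5j, -1.0 + 0.3j, 2 - 1j])  # hypothetical sample vector
Q, JQ = projector(Z), projector(J(Z))

a, b, c = Q[0][0], Q[1][1], Q[2][2]
x, y, z = Q[0][1], Q[0][2], Q[0][3]
w, v, u = Q[1][2], Q[1][3], Q[2][3]

expected = [
    (JQ[0][0], b), (JQ[1][1], a), (JQ[2][2], 1 - (a + b + c)),
    (JQ[0][1], -x), (JQ[0][2], conj(v)), (JQ[0][3], -conj(w)),
    (JQ[2][3], -u), (JQ[1][3], conj(y)), (JQ[1][2], -conj(z)),
]
table_matches = all(abs(lhs - rhs) < 1e-12 for lhs, rhs in expected)

# J squares to -1 on vectors, and Z is orthogonal to J(Z).
j_squared_is_minus_one = all(abs(p + q) < 1e-12 for p, q in zip(J(J(Z)), Z))
orthogonal = abs(sum(conj(p) * q for p, q in zip(Z, J(Z)))) < 1e-12
print(table_matches, j_squared_is_minus_one, orthogonal)
```

Since $J^2 = -1$ on vectors but projectors are insensitive to an overall sign, $J$ is an honest involution on projectors.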

We remark that since the algebra is commutative here, we may treat $J$ as an algebra map rather than as an antialgebra map as required in the notion of an antilinear involution. This will no longer be the case when we come to quantise, when it is the notion of antilinear involution that will survive.

The map $J$ extends further to an involution on $\mathbb{CM}^\#$: given $P \in \mathbb{CM}^\#$, we write $P = Q + Q'$ for $Q, Q' \in \mathbb{CP}^3$ in the fibre of $F$ above $P$ and define $J(P) := J(Q) + J(Q')$. Note that if the image of $Q$ is spanned by some vector $Z$ in the image of $P$, and likewise the image of $Q' = P - Q$ is spanned by $W$ in the image of $P$ (together they span the image of $P$), then $J(P)$ is defined with image spanned by $J(Z)$ and $J(W)$. If we chose a different $Q$ in the fibre of $F$ above $P$ then $Z, W$ will change to some linear combination. Hence $J(Z), J(W)$ will change to some other (conjugate) linear combination but the 2-dimensional space that they define will not change. Hence $J(P)$ is independent of the choice of $Q$ in the decomposition of $P$.
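The independence of $J(P)$ from the chosen decomposition can be illustrated numerically. In this sketch (hypothetical sample vectors, angles and helper names; projector convention $\bar Z^\mu Z^\nu$) a plane is described by two different orthonormal bases related by a unitary rotation, and the resulting projectors $J(P)$ are compared. It also checks that a projector of the form $Q + J(Q)$ is fixed by $J$.

```python
import cmath
import math

def normalise(vec):
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def inner(u, v):
    return sum(p.conjugate() * q for p, q in zip(u, v))

def J(vec):
    z1, z2, z3, z4 = vec
    return [-z2.conjugate(), z1.conjugate(), -z4.conjugate(), z3.conjugate()]

def projector(vec):
    return [[zm.conjugate() * zn for zn in vec] for zm in vec]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def plane(basis):
    # Rank-two projector onto the span of an orthonormal pair
    return add(projector(basis[0]), projector(basis[1]))

def J_of_plane(basis):
    # J(P) := J(Q) + J(Q') for the decomposition given by the basis
    return add(projector(J(basis[0])), projector(J(basis[1])))

# Hypothetical orthonormal pair Z, W spanning a plane in C^4.
Z = normalise([1 + 2j, 0.5j, -1.0 + 0.3j, 2 - 1j])
W_raw = [0.3 + 0j, 1 - 1j, 2j, 1.0 + 0j]
W = normalise([w - inner(Z, W_raw) * z for z, w in zip(Z, W_raw)])

# A second orthonormal basis of the same plane (unitary rotation of the first).
cs, sn, ph = math.cos(0.4), math.sin(0.4), cmath.exp(1.1j)
Z2 = [cs * z + sn * ph * w for z, w in zip(Z, W)]
W2 = [-sn * ph.conjugate() * z + cs * w for z, w in zip(Z, W)]

same_plane = close(plane((Z, W)), plane((Z2, W2)))
same_image = close(J_of_plane((Z, W)), J_of_plane((Z2, W2)))

# A plane spanned by Z and J(Z) is mapped to itself.
fixed = close(J_of_plane((Z, J(Z))), plane((Z, J(Z))))
print(same_plane, same_image, fixed)
```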


Proposition 3.10. $P \in \mathbb{CM}^\#$ is fixed under $J$ if and only if $P \in S^4$.

Proof. Writing $Q' = (\bar W^\mu W^\nu)$, we have that $P = Q + Q' = (\bar Z^\mu Z^\nu + \bar W^\mu W^\nu)$, and that this is supposed to be identified with the $2 \times 2$ block decomposition
\[
P = \begin{pmatrix} A & B \\ B^\dagger & D \end{pmatrix}, \qquad A = A^\dagger, \quad D = D^\dagger, \quad \mathrm{Tr}\, A + \mathrm{Tr}\, D = 2.
\]
Here we shall use the notation of Proposition 3.1. Examining $A$, we have

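\[
A = a + \alpha \cdot \sigma = \begin{pmatrix} \bar Z^1 Z^1 + \bar W^1 W^1 & \bar Z^1 Z^2 + \bar W^1 W^2 \\ \bar Z^2 Z^1 + \bar W^2 W^1 & \bar Z^2 Z^2 + \bar W^2 W^2 \end{pmatrix},
\]
and hence an identification
\[
a = \tfrac12 (\bar Z^1 Z^1 + \bar Z^2 Z^2 + \bar W^1 W^1 + \bar W^2 W^2), \qquad
\alpha_3 = \tfrac12 (\bar Z^1 Z^1 - \bar Z^2 Z^2 + \bar W^1 W^1 - \bar W^2 W^2),
\]
as well as, from the off-diagonal entries $\alpha_1 \mp \imath \alpha_2$ (with the standard Pauli matrices),
\[
\alpha_1 = \tfrac12 (\bar Z^1 Z^2 + \bar Z^2 Z^1 + \bar W^1 W^2 + \bar W^2 W^1), \qquad
\alpha_2 = \tfrac{1}{2\imath} (\bar Z^2 Z^1 - \bar Z^1 Z^2 + \bar W^2 W^1 - \bar W^1 W^2).
\]
Clearly the relations $a = a^*$, $\alpha = \alpha^*$ hold under this identification. Under the involution $J$ we calculate that
\[
J(a) = a, \qquad J(\alpha) = -\alpha.
\]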

Similarly we look at the block $D$,
\[
D = d + \delta \cdot \sigma = \begin{pmatrix} \bar Z^3 Z^3 + \bar W^3 W^3 & \bar Z^3 Z^4 + \bar W^3 W^4 \\ \bar Z^4 Z^3 + \bar W^4 W^3 & \bar Z^4 Z^4 + \bar W^4 W^4 \end{pmatrix}.
\]
The same computation as above shows that the relations $d = d^*$, $\delta = \delta^*$ hold here, and moreover the trace relation implies that $d = 1 - a$, in agreement with Sect. 3.1. Under the involution $J$ we also see that
\[
J(d) = d, \qquad J(\delta) = -\delta.
\]

Finally we look at the matrix B,

B = t + ıx · σ =

Z¯ 1 Z 3 + W¯ 1 W 3 Z¯ 2 Z 3 + W¯ 2 W 3

Z¯ 1 Z 4 + W¯ 1 W 4 . Z¯ 2 Z 4 + W¯ 2 W 4


S. J. Brain, S. Majid

Solving, we have the identification of generators,

\[ t = \tfrac{1}{2}(\bar Z^1 Z^3 + \bar Z^2 Z^4 + \bar W^1 W^3 + \bar W^2 W^4), \qquad x_3 = \tfrac{1}{2\imath}(\bar Z^1 Z^3 - \bar Z^2 Z^4 + \bar W^1 W^3 - \bar W^2 W^4) \]

for the diagonal entries, as well as

\[ x_1 = \tfrac{1}{2\imath}(\bar Z^2 Z^3 + \bar Z^1 Z^4 + \bar W^2 W^3 + \bar W^1 W^4), \qquad x_2 = \tfrac{1}{2}(\bar Z^2 Z^3 - \bar Z^1 Z^4 + \bar W^2 W^3 - \bar W^1 W^4) \]

on the off-diagonal. This is in agreement with the fact as in Proposition 3.1 that the generators t, x are not necessarily Hermitian. Moreover, it is a simple matter to compute that under the involution J we have

\[ J(t) = t^*, \qquad J(x) = x^*. \]

Overall, we see that J has fixed points in CM# consisting of those with coordinates subject to the additional constraints α = δ = 0, t = t ∗ , x = x ∗ . Thus (upon verification of the extra relations) the fixed points of CM# under J are exactly those lying in S 4 in accordance with Proposition 3.2. In the notation of Corollary 3.3, the action of J on C[CM# ] is to map J (a) = a,

J (α3 ) = α3 ,

J (w) = −w˜ ∗ ,

J (δ3 ) = δ3 ,

J (z) = z˜ ∗ ,

J (α) = −α,

J (w) ˜ = −w ∗ ,

J (δ) = −δ,

J (˜z ) = z ∗ .

This may either be recomputed, or obtained simply by making the same change of variables as was made in going from Proposition 3.1 to Corollary 3.3. The fixed points in these coordinates are those with α = δ = 0, w∗ = −w̃, z∗ = z̃, in agreement with Corollary 3.3.

Proposition 3.11. For each P ∈ CM# we have P ∈ S4 if and only if there exists Q ∈ CP3 in the fibre of F above P such that P = Q + J(Q).

Proof. By the previous proposition, P ∈ S4 if and only if J(P) = P. Of course, the reverse direction of the claim is easy, since if P = Q + J(Q), we have J(P) = J(Q) + J²(Q) = P. Conversely, let J(P) = P for a 2d subspace P of C4 and suppose that P contains a non-zero vector v such that v, J(v) are linearly independent. In fact v, J(v) are necessarily orthogonal due to the properties of J. J(v) also lies in P, by assumption, hence v, J(v) define two distinct subspaces of P, interchanged by J. We define Q to be the rank 1 projector defined by one of our distinct subspaces. Then J(Q) is the rank 1 projector defined by the other. By construction we have P = Q + J(Q) in terms of the corresponding projectors.

As promised, there is a fibration of CP3 over S4 given at the coordinate algebra level by an inclusion C[S4] → C−[CP3]. In terms of the C−[CP3] coordinates a, b, c, x, y, z used in Proposition 3.8 and the C[S4] coordinates z, w, a of Corollary 3.3 we have the following (we note the overlap in notation between these propositions and rely on the context for clarity).


Proposition 3.12. There is an algebra inclusion η : C[S 4 ] → C− [CP3 ] given by η(a) = a + b, η(z) = y + v ∗ , η(w) = w − z ∗ . Proof. That this is an algebra map is a matter of rewriting the previous proposition in our explicit coordinates, for example that η(z) = y + v ∗ = y + J (y). The sole relation to investigate is the image of the sphere relation zz ∗ + ww ∗ = a(1 − a). Applying η to the left-hand side, we obtain η(zz ∗ + ww ∗ ) = yy ∗ + yv + v ∗ y ∗ + v ∗ v + ww ∗ − wz − z ∗ w ∗ + z ∗ z. Now using the relations of Proposition 3.5 we compute that ayv = yav = yx ∗ z = x ∗ yz = awz, where we have relied upon the commutativity of the algebra. Similarly one computes that byv = bwz, cyv = cwz, dyv = dwz, so that adding these four relations yields that yv = wz in C− [CP3 ]. Then finally using the relations in Proposition 3.8 we see that η(zz ∗ + ww ∗ ) = Y + V + W + Z = (a + b)(c + d) = (a + b)(1 − (a + b)) = η(a(1 − a)), as required.
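Returning to Propositions 3.10 and 3.11, the statement that P = Q + J(Q) has α = δ = 0 can be illustrated numerically. The sketch below is only a sanity check under a stated assumption: the explicit formula used for J on C4 is a hypothetical convention chosen for illustration (the definition of J lies outside this passage), and the helper names are not from the text. It builds the rank one projector Q with entries conj(v^µ)v^ν/|v|², adds J(Q), and measures how far the diagonal blocks are from multiples of the identity.

```python
import random

def J(v):
    # ASSUMPTION: J(Z) = (-conj(Z2), conj(Z1), -conj(Z4), conj(Z3)); a hypothetical
    # convention for the antilinear map, used here only for illustration.
    z1, z2, z3, z4 = v
    return [-z2.conjugate(), z1.conjugate(), -z4.conjugate(), z3.conjugate()]

def inner(u, v):
    # Hermitian inner product, antilinear in the first argument.
    return sum(a.conjugate() * b for a, b in zip(u, v))

def projector(v):
    # Rank-one projector with entries conj(v^mu) v^nu / |v|^2, as in Q = (Zbar^mu Z^nu).
    norm2 = inner(v, v).real
    return [[v[m].conjugate() * v[n] / norm2 for n in range(4)] for m in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

random.seed(1)
v = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
Q, JQ = projector(v), projector(J(v))
P = [[Q[i][j] + JQ[i][j] for j in range(4)] for i in range(4)]

orthogonality = abs(inner(v, J(v)))           # v and J(v) should be orthogonal
idempotency = max_diff(matmul(P, P), P)       # P should be a projector
trace = sum(P[i][i] for i in range(4)).real   # of rank two
# Diagonal blocks A and D should be multiples of the identity (alpha = delta = 0).
alpha_size = max(abs(P[0][1]), abs(P[1][0]), abs(P[0][0] - P[1][1]))
delta_size = max(abs(P[2][3]), abs(P[3][2]), abs(P[2][2] - P[3][3]))
print(orthogonality, idempotency, trace, alpha_size, delta_size)
```

With this choice of J the vector J(J(v)) equals −v, so J squares to the identity on projectors, which is the property used in the proof of Proposition 3.11.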

We now look at the typical fibre CP1 of the fibration CP3 → S4, but now in the coordinate algebra picture.

Proposition 3.13. The quotient of the algebra C−[CP3] obtained by setting η(a), η(z), η(w) to be constant numerical values is isomorphic to the coordinate algebra of a CP1.

Proof. If we suppose that we are in the patch where a ≠ 0 in C−[CP3] then we can view x, y, z as the variables and X = ab, Y = ac, Z = ad as the relations. The generators u, v, w are defined by the equations in Proposition 3.8 and the rest are redundant. Now suppose that a + b = A, a fixed real number, and y + v∗ = B, w − z∗ = C, fixed complex numbers, such that BB∗ + CC∗ = A(1 − A) (an element of S4). Then we have just one equation X = a(A − a) = (A/2)² − s² if we set s = a − A/2. This is a CP1 of radius A/2 in place of the usual radius 1/2. The equation Y = ac is viewed as a definition of c. The equation Z = ad is then equivalent to Y + Z = a(1 − A). We will see that this is automatic and that y, z are uniquely determined by x, a and our fixed parameters A, B, C, so are not in fact free variables. Indeed, av = x∗z and aw = x∗y determine v and w as mentioned above, so our quotient is

\[ aB = ay + z^* x, \qquad aC = x^* y - a z^*, \]


which implies that a²(BB∗ + CC∗) = (a² + X)(Y + Z) = aA(Y + Z), so Y + Z = a(1 − A) necessarily holds if A, B, C lie in S4. We also combine the equations to find ax∗B = a²C + z∗aA and aCx = aAy − a²B so that at least if A ≠ 0 we have z, y determined. (In fact one has By∗ − z∗C = a(1 − A) from the above, so if z is determined then so is y if B is not zero, etc.) Thus C[CP1] is viewed inside C−[CP3] in this patch as

\[ \begin{pmatrix} a & x & * & * \\ x^* & A - a & * & * \\ * & * & * & * \\ * & * & * & * \end{pmatrix}, \qquad x^* x = a(A - a), \tag{14} \]

where the unspecified entries are determined as above using the relations in terms of x and a. Similar analysis holds in the other coordinate patches, although we shall not check this here as this is a well-known classical result. In other patches we would see the various copies of C[CP1] appearing elsewhere in the above matrix.

This situation now provides us with yet another way to view the instanton bundle. Let M be a finite rank projective C[CM#]-module. Then J induces a module map J : M → M whose fixed point submodule is a finite rank projective C[S4]-module. In particular, if we take M to be the C[CM#]-module given by the defining tautological projector (7), then as explained above as well as in that section, the fixed point submodule is precisely the tautological bundle E = C[S4]⁴e of Proposition 3.2 which defines the instanton bundle over S4.

Now the map η : C[S4] → C−[CP3] induces the ‘push-out’ of the C[S4]-module E along η to obtain an ‘auxiliary’ C−[CP3]-module Ẽ, given by viewing the projector e ∈ M4(C[S4]) as a projector ẽ ∈ M4(C−[CP3]), so that Ẽ := C−[CP3]⁴ẽ, giving a bundle over twistor space. Explicitly, we have

\[ \tilde e = \begin{pmatrix} \eta(a) & 0 & \eta(z) & \eta(-w^*) \\ 0 & \eta(a) & \eta(w) & \eta(z^*) \\ \eta(z^*) & \eta(w^*) & \eta(1-a) & 0 \\ \eta(-w) & \eta(z) & 0 & \eta(1-a) \end{pmatrix} = \begin{pmatrix} a+b & 0 & y+v^* & z-w^* \\ 0 & a+b & w-z^* & y^*+v \\ y^*+v & w^*-z & 1-(a+b) & 0 \\ z^*-w & y+v^* & 0 & 1-(a+b) \end{pmatrix} \in M_4(C^-[CP^3]). \]

Moreover, if one sets a + b = A, y + v∗ = B, w − z∗ = C for fixed real A and complex B, C as in Proposition 3.13, then we have

\[ \tilde e = \begin{pmatrix} A & 0 & B & -C^* \\ 0 & A & C & B^* \\ B^* & C^* & 1-A & 0 \\ -C & B & 0 & 1-A \end{pmatrix}, \]

a constant projector of rank two. Then viewing the fibre C[CP1] as a subset of C[CP3] as in (14), it is easily seen that C[CP1]⁴ẽ is a free C[CP1]-module of rank two. This is just the coordinate algebra version of saying that for all x = (A, B, C) ∈ S4 the instanton bundle pulled back from S4 to CP3 is trivial upon restriction to each fibre x̂ = CP1, and


we may thus see the instanton bundle E over C[S4] as coming from the bundle Ẽ over twistor space. This is an easy example of the Penrose-Ward transform, which we shall discuss in more detail later.

4. The Quantum Conformal Group

The advantage of writing space-time and twistor space as homogeneous spaces in the language of coordinate functions is that we are now free to apply the standard theory of quantisation by a cocycle twist. To this end, we recall that if H is a Hopf algebra with coproduct Δ : H → H ⊗ H, counit ε : H → C and antipode S : H → H, then a two-cocycle F on H means F : H ⊗ H → C which is convolution invertible and unital (i.e. a 2-cochain) in the sense

\[ F(h_{(1)}, g_{(1)})\,F^{-1}(h_{(2)}, g_{(2)}) = F^{-1}(h_{(1)}, g_{(1)})\,F(h_{(2)}, g_{(2)}) = \epsilon(h)\epsilon(g) \]

(for some map F⁻¹) and obeys ∂F = 1 in the sense

\[ F(g_{(1)}, f_{(1)})\,F(h_{(1)}, g_{(2)} f_{(2)})\,F^{-1}(h_{(2)} g_{(3)}, f_{(3)})\,F^{-1}(h_{(3)}, g_{(4)}) = \epsilon(f)\epsilon(h)\epsilon(g). \]

We have used Sweedler notation $\Delta(h) = h_{(1)} \otimes h_{(2)}$ and suppressed the summation. In this case there is a ‘cotwisted’ Hopf algebra $H_F$ with the same coalgebra structure and counit as H but with modified product • and antipode $S_F$ [12],

\[ h \bullet g = F(h_{(1)} \otimes g_{(1)})\, h_{(2)} g_{(2)}\, F^{-1}(h_{(3)} \otimes g_{(3)}), \tag{15} \]
\[ S_F(h) = U(h_{(1)})\, S h_{(2)}\, U^{-1}(h_{(3)}), \qquad U(h) = F(h_{(1)}, S h_{(2)}), \]

for h, g ∈ H, where we use the product and antipode of H on the right-hand sides and $U^{-1}(h_{(1)})U(h_{(2)}) = \epsilon(h) = U(h_{(1)})U^{-1}(h_{(2)})$ defines the inverse functional. If H is a coquasitriangular Hopf algebra then so is $H_F$. In particular, if H is commutative then $H_F$ is cotriangular with ‘universal R-matrix’ and induced (symmetric) braiding given by

\[ R(h, g) = F(g_{(1)}, h_{(1)})\,F^{-1}(h_{(2)}, g_{(2)}), \qquad \Psi_{V,W}(v \otimes w) = R(w_{(1)}, v_{(1)})\, w_{(2)} \otimes v_{(2)} \]

for any two left comodules V, W. We use the Sweedler notation for the left coactions as well. In the cotriangular case one has Ψ² = id, so every object in the category of $H_F$-comodules inherits non-trivial statistics in which transposition is replaced by this non-standard transposition. The nice property of this construction is that the category of H-comodules is actually equivalent to that of $H_F$-comodules, so there is an invertible functor which ‘functorially quantises’ any construction in the first category (any H-covariant construction) to give an $H_F$-covariant one. So not only is the classical Hopf algebra H quantised but also any H-covariant construction as well. This is a particularly easy example of the ‘braid statistics approach’ to quantisation, whereby deformation is achieved by deforming the category of vector spaces to a braided one [12]. In particular, if A is a left H-comodule algebra, we automatically obtain a left $H_F$-comodule algebra $A_F$ which as a vector space is the same as A, but has the modified product

\[ a \bullet b = F(a_{(1)}, b_{(1)})\, a_{(2)} b_{(2)}, \tag{16} \]


for a, b ∈ A, where we have again used the Sweedler notation $\Delta_L(a) = a_{(1)} \otimes a_{(2)}$ for the coaction $\Delta_L : A \to H \otimes A$. The same applies to any other covariant algebra. For example if Ω(A) is an H-covariant differential calculus (see later) then this functorially quantises as $\Omega(A_F) := \Omega(A)_F$ by this same construction.

Finally, if H′ → H is a homomorphism of Hopf algebras then any cocycle F on H pulls back to one on H′ and as a result one has a homomorphism $H'_F \to H_F$. In what follows we take H = C[C4] (the translation group of C4) and H′ variously the coordinate algebras of K̃, H̃, GL4. In particular, since the group GL4 acts on the quadric $\widetilde{CM}^\#$, we have (as in Sect. 1.1) a coaction $\Delta_L$ of the coordinate ring C[GL4] on $C[\widetilde{CM}^\#]$, and we shall first deform this picture.

In order to do this we note first that the conformal transformations of CM# break down into compositions of translations, rotations, dilations and inversions. Written with respect to the aforementioned double null coordinates, GL4 decomposes into 2 × 2 blocks

\[ \begin{pmatrix} \gamma & \tau \\ \sigma & \tilde\gamma \end{pmatrix} \tag{17} \]

with overall non-zero determinant, where the entries of τ constitute the translations and the entries of σ contain the inversions. The diagonal blocks γ × γ̃ constitute the space-time rotations as well as the dilations. Writing M2 := M2(C), GL4 decomposes as the subset of nonzero determinant

\[ GL_4 \subset C^4 \rtimes (M_2 \times M_2) \ltimes C^4, \]

where the outer factors denote σ, τ and γ × γ̃ ∈ M2 × M2. In practice it is convenient to work in a ‘patch’ $GL_4^-$, where γ is assumed invertible. Then by factorising the matrix we deduce that

\[ \det\begin{pmatrix} \gamma & \tau \\ \sigma & \tilde\gamma \end{pmatrix} = \det(\gamma)\,\det(\tilde\gamma - \sigma\gamma^{-1}\tau), \]

which is actually a part of a universal formula for determinants of matrices with entries in a noncommutative algebra (here the algebra is M2 and we compose with the determinant map on this algebra). We see that as a set, $GL_4^-$ is $C^4 \times GL_2 \times GL_2 \times C^4$, where the two copies of GL2 refer to γ and $\tilde\gamma - \sigma\gamma^{-1}\tau$. There is of course another patch $GL_4^+$, where we similarly assume γ̃ invertible.

In terms of coordinate functions for C[GL4] we therefore have four matrix generators τ, σ, γ, γ̃ organised as above. These together have a matrix form of coproduct

\[ \Delta\begin{pmatrix} \gamma & \tau \\ \sigma & \tilde\gamma \end{pmatrix} = \begin{pmatrix} \gamma & \tau \\ \sigma & \tilde\gamma \end{pmatrix} \otimes \begin{pmatrix} \gamma & \tau \\ \sigma & \tilde\gamma \end{pmatrix}. \]

In the classical case the generators commute and an invertible element D obeying D = det a is adjoined. For $C[GL_4^-]$ we instead adjoin inverses to $d = \det(\gamma)$ and $\tilde d = \det(\tilde\gamma - \sigma\gamma^{-1}\tau)$.

We focus next on the translation sector H = C[C4] generated by some $t^{A'}_A$, where A ∈ {3, 4} and A′ ∈ {1, 2} to line up with our conventions for GL4. These generators have a standard additive coproduct. We let $\partial^A_{A'}$ be the Lie algebra of translation generators dual to this, so

\[ \langle \partial^A_{A'}, t^{B'}_B \rangle = \delta^A_B\, \delta^{B'}_{A'} \]


which extends to the action on products of the $t^{A'}_A$ by differentiation and evaluation at zero (hence the notation). In this notation we use the cocycle

\[ F(h, g) = \Big\langle \exp\Big(\frac{\imath}{2}\,\theta^{A'B'}_{AB}\, \partial^A_{A'} \otimes \partial^B_{B'}\Big),\, h \otimes g \Big\rangle. \]

Cotwisting here does not change H itself, $H = H_F$, because its coproduct is cocommutative (the group C4 is Abelian) but it twists A = C[C4] as a comodule algebra into the Moyal plane. This is by now well-known both in the module form and the above comodule form.

We now pull this cocycle back to C[GL4], where it takes the same form as above on the generators $\tau^{A'}_A$ (which project onto $t^{A'}_A$). The pairing extends as zero on the other generators. One can view the $\partial^A_{A'}$ in the Lie algebra of GL4 as the nilpotent 4 × 4 matrices with entry 1 in the A′, A position for some A′ = 1, 2, A = 3, 4 and zeros elsewhere, extending the above picture. Either way, one computes

\begin{align*}
F^{\mu\alpha}_{\nu\beta} = F(a^\mu_\nu, a^\alpha_\beta) &= \Big\langle \exp\Big(\frac{\imath}{2}\,\theta^{A'B'}_{AB}\, \partial^A_{A'} \otimes \partial^B_{B'}\Big),\, a^\mu_\nu \otimes a^\alpha_\beta \Big\rangle \\
&= \delta^\mu_\nu \delta^\alpha_\beta + \frac{\imath}{2}\,\theta^{A'B'}_{AB}\, \delta^\mu_{A'} \delta^A_\nu \delta^\alpha_{B'} \delta^B_\beta \\
&= \delta^\mu_\nu \delta^\alpha_\beta + \frac{\imath}{2}\,\theta^{\mu\alpha}_{\nu\beta},
\end{align*}

where it is understood that $\theta^{\mu\alpha}_{\nu\beta}$ is zero when µ, α ∉ {1, 2} or ν, β ∉ {3, 4}. We also compute

\[ U(a^\mu_\nu) = F(a^\mu_p, S a^p_\nu) = \Big\langle \exp\Big(-\frac{\imath}{2}\,\theta^{A'B'}_{AB}\, \partial^A_{A'} \otimes \partial^B_{B'}\Big),\, a^\mu_p \otimes a^p_\nu \Big\rangle = \delta^\mu_\nu - \frac{\imath}{2}\,\theta^{\mu p}_{p\nu} = \delta^\mu_\nu. \]

Then following Eqs. (15), the deformed coordinate algebra $C_F[GL_4]$ has undeformed antipode on the generators and deformed product

\[ a^\mu_\nu \bullet a^\alpha_\beta = F^{\mu\alpha}_{mn}\, a^m_p a^n_q\, (F^{-1})^{pq}_{\nu\beta}, \]

where $a^\mu_\nu \in C[GL_4]$ are the generators of the classical algebra. The commutation relations can be written in R-matrix form (as for any matrix coquasitriangular Hopf algebra) as

\[ R^{\mu\nu}_{\alpha\beta}\, a^\alpha_\gamma \bullet a^\beta_\delta = a^\nu_\beta \bullet a^\mu_\alpha\, R^{\alpha\beta}_{\gamma\delta}, \qquad R^{\mu\nu}_{\alpha\beta} = F^{\nu\mu}_{\delta\gamma}\, (F^{-1})^{\gamma\delta}_{\alpha\beta}, \]

where in our particular case

\[ R^{\mu\alpha}_{\nu\beta} = \delta^\mu_\nu \delta^\alpha_\beta - \imath\,\theta^{-\mu\alpha}_{\nu\beta}, \qquad \theta^{-\mu\alpha}_{\nu\beta} = \frac{1}{2}\big(\theta^{\mu\alpha}_{\nu\beta} - \theta^{\alpha\mu}_{\beta\nu}\big) \]

has the same form but now with only the antisymmetric part of θ in the sense shown.

We give the resulting relations explicitly in the γ, γ̃, σ, τ block form (17). These are in fact all 2 × 2 matrix relations with indices A, A′ etc. as explained, but when no confusion can arise we write the indices in an apparently GL4 form. For example, in writing $\gamma^\mu_\nu$ it is implicit that µ, ν ∈ {1, 2}, whereas for $\sigma^\mu_\nu$ it is understood that µ ∈ {3, 4} and ν ∈ {1, 2}.
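The passage from F to R above can be checked numerically. The following sketch is an illustration only: it fills θ with arbitrary random real numbers on its allowed index range (upper indices in {1, 2}, lower indices in {3, 4}), stores tensors as dictionaries keyed by (upper, upper, lower, lower), and uses helper names (`compose`, `flip`) that are not from the text. Because θ composed with itself vanishes on this index range, the exponential truncates and F = 1 + (ı/2)θ exactly.

```python
import itertools
import random

N = range(4)
UP, LOW = (0, 1), (2, 3)   # 0-based labels for the index sets {1,2} and {3,4}

random.seed(0)
# theta[(mu, alpha, nu, beta)] stands for theta^{mu alpha}_{nu beta}.
theta = {idx: 0.0 for idx in itertools.product(N, repeat=4)}
for idx in itertools.product(UP, UP, LOW, LOW):
    theta[idx] = random.uniform(-1, 1)   # arbitrary illustrative values

def delta(m, a, n, b):
    return 1.0 if (m == n and a == b) else 0.0

identity = {i: delta(*i) for i in theta}
F = {i: delta(*i) + 0.5j * theta[i] for i in theta}
Finv = {i: delta(*i) - 0.5j * theta[i] for i in theta}

def compose(X, Y):
    # (XY)^{mu alpha}_{nu beta} = X^{mu alpha}_{p q} Y^{p q}_{nu beta}
    return {(m, a, n, b): sum(X[(m, a, p, q)] * Y[(p, q, n, b)] for p in N for q in N)
            for (m, a, n, b) in itertools.product(N, repeat=4)}

def flip(X):
    # Swap the two tensor factors: (X_21)^{mu alpha}_{nu beta} = X^{alpha mu}_{beta nu}
    return {(m, a, n, b): X[(a, m, b, n)] for (m, a, n, b) in X}

def distance(X, Y):
    return max(abs(X[i] - Y[i]) for i in X)

# R^{mu nu}_{alpha beta} = F^{nu mu}_{delta gamma} (F^{-1})^{gamma delta}_{alpha beta}
R = compose(flip(F), Finv)
theta_minus = {(m, a, n, b): 0.5 * (theta[(m, a, n, b)] - theta[(a, m, b, n)])
               for (m, a, n, b) in theta}
R_expected = {i: delta(*i) - 1j * theta_minus[i] for i in theta}

err_inverse = distance(compose(F, Finv), identity)     # F^{-1} really inverts F
err_R = distance(R, R_expected)                        # R = 1 - i theta^-
err_triangular = distance(compose(flip(R), R), identity)  # R_21 R = 1
print(err_inverse, err_R, err_triangular)
```

The last quantity checks the cotriangularity property R⁻¹ = R₂₁ mentioned in the text.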


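Theorem 4.1. The quantum group coordinate algebra $C_F[GL_4]$ has deformed product

\begin{align*}
\tau^\mu_\nu \bullet \tau^\alpha_\beta &= \tau^\mu_\nu \tau^\alpha_\beta + \tfrac{\imath}{2}\,\theta^{\mu\alpha}_{cd}\,\tilde\gamma^c_\nu \tilde\gamma^d_\beta - \tfrac{\imath}{2}\,\gamma^\mu_c \gamma^\alpha_d\,\theta^{cd}_{\nu\beta} + \tfrac{1}{4}\,\theta^{\mu\alpha}_{ab}\,\sigma^a_c \sigma^b_d\,\theta^{cd}_{\nu\beta}, \\
\gamma^\mu_\nu \bullet \tau^\alpha_\beta &= \gamma^\mu_\nu \tau^\alpha_\beta + \tfrac{\imath}{2}\,\theta^{\mu\alpha}_{cd}\,\sigma^c_\nu \tilde\gamma^d_\beta, \\
\tilde\gamma^\mu_\nu \bullet \tau^\alpha_\beta &= \tilde\gamma^\mu_\nu \tau^\alpha_\beta - \tfrac{\imath}{2}\,\sigma^\mu_c \gamma^\alpha_d\,\theta^{cd}_{\nu\beta}, \\
\gamma^\mu_\nu \bullet \gamma^\alpha_\beta &= \gamma^\mu_\nu \gamma^\alpha_\beta + \tfrac{\imath}{2}\,\theta^{\mu\alpha}_{cd}\,\sigma^c_\nu \sigma^d_\beta, \\
\tau^\mu_\nu \bullet \gamma^\alpha_\beta &= \tau^\mu_\nu \gamma^\alpha_\beta + \tfrac{\imath}{2}\,\theta^{\mu\alpha}_{cd}\,\tilde\gamma^c_\nu \sigma^d_\beta, \\
\tau^\mu_\nu \bullet \tilde\gamma^\alpha_\beta &= \tau^\mu_\nu \tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}\,\gamma^\mu_c \sigma^\alpha_d\,\theta^{cd}_{\nu\beta}, \\
\tilde\gamma^\mu_\nu \bullet \tilde\gamma^\alpha_\beta &= \tilde\gamma^\mu_\nu \tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}\,\sigma^\mu_c \sigma^\alpha_d\,\theta^{cd}_{\nu\beta},
\end{align*}

with the remaining relations, antipode and coproduct undeformed on the generators. The quantum group is generated by matrices γ, γ̃, τ, σ of generators with commutation relations

\begin{align*}
[\gamma^\mu_\nu, \gamma^\alpha_\beta]_\bullet &= \imath\,\theta^{-\mu\alpha}_{cd}\,\sigma^c_\nu \bullet \sigma^d_\beta, &
[\tilde\gamma^\mu_\nu, \tilde\gamma^\alpha_\beta]_\bullet &= -\imath\,\sigma^\alpha_d \bullet \sigma^\mu_c\,\theta^{-cd}_{\nu\beta}, \\
[\gamma^\mu_\nu, \tau^\alpha_\beta]_\bullet &= \imath\,\theta^{-\mu\alpha}_{cd}\,\sigma^c_\nu \bullet \tilde\gamma^d_\beta, &
[\tilde\gamma^\mu_\nu, \tau^\alpha_\beta]_\bullet &= -\imath\,\gamma^\alpha_d \bullet \sigma^\mu_c\,\theta^{-cd}_{\nu\beta}, \\
[\tau^\mu_\nu, \tau^\alpha_\beta]_\bullet &= \imath\,\theta^{-\mu\alpha}_{cd}\,\tilde\gamma^c_\nu \bullet \tilde\gamma^d_\beta - \imath\,\gamma^\alpha_d \bullet \gamma^\mu_c\,\theta^{-cd}_{\nu\beta}
\end{align*}

and a certain determinant inverted.

Proof. Finishing the computations above with the explicit form of F we have

\[ a^\mu_\nu \bullet a^\alpha_\beta = a^\mu_\nu a^\alpha_\beta + \frac{\imath}{2}\,\theta^{\mu\alpha}_{cd}\, a^c_\nu a^d_\beta - \frac{\imath}{2}\, a^\mu_c a^\alpha_d\,\theta^{cd}_{\nu\beta} + \frac{1}{4}\,\theta^{\mu\alpha}_{ab}\, a^a_c a^b_d\,\theta^{cd}_{\nu\beta}. \]

Noting that $\theta^{\mu\alpha}_{\nu\beta} = 0$ unless µ, α ∈ {1, 2} and ν, β ∈ {3, 4} we can write these for the 2 × 2 blocks as shown. For the commutation relations we have similarly

\[ [a^\mu_\nu, a^\alpha_\beta]_\bullet = \imath\,\theta^{-\mu\alpha}_{cd}\, a^c_\nu \bullet a^d_\beta - \imath\, a^\alpha_d \bullet a^\mu_c\,\theta^{-cd}_{\nu\beta}, \]

which we similarly decompose as stated. Note that different terms here drop out due to the range of the indices for nonzero θ⁻, which are the same as for θ.

There is in principle a formula also for the determinant written in terms of the • product. It can be obtained via braided ‘antisymmetric tensors’ from the R-matrix and will necessarily be a product of 2 × 2 determinants in the ‘patches’ where γ or γ̃ are invertible in the noncommutative algebra. One may proceed to compute these more explicitly, for example

\[ [\gamma^\mu_\nu, \gamma^\alpha_\beta]_\bullet = \imath\,\theta^{-\mu\alpha}_{34}\,\sigma^3_\nu \sigma^4_\beta + \imath\,\theta^{-\mu\alpha}_{43}\,\sigma^3_\beta \sigma^4_\nu + \imath\,\theta^{-\mu\alpha}_{33}\,\sigma^3_\nu \sigma^3_\beta + \imath\,\theta^{-\mu\alpha}_{44}\,\sigma^4_\nu \sigma^4_\beta \]

and so forth.

We similarly calculate the resulting products on the coordinate algebras of the deformed homogeneous spaces. Indeed, using Eq. (16), we have the following results. We recall our notation that if X is a projective variety then X̃ is the affine variety which projects to X, so that in particular $\widetilde{CM}^\#$ and T̃ are respectively the homogeneous versions of conformal space-time CM# and twistor space T.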
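As a sanity check on the proof of Theorem 4.1, the four-term product formula and the general commutation relation can be compared at a numerical point. The sketch below rests on two assumptions made purely for illustration: θ is filled with random real numbers on its allowed index range, and the classical commuting generators $a^\mu_\nu$ are replaced by the entries of a random complex matrix, so that every classical polynomial becomes a number. Agreement at a random point is evidence for, not a proof of, the polynomial identity; the function names are hypothetical.

```python
import itertools
import random

N = range(4)
UP, LOW = (0, 1), (2, 3)   # 0-based labels for {1,2} and {3,4}

random.seed(2)
# theta[(mu, alpha, nu, beta)] stands for theta^{mu alpha}_{nu beta}.
theta = {idx: 0.0 for idx in itertools.product(N, repeat=4)}
for idx in itertools.product(UP, UP, LOW, LOW):
    theta[idx] = random.uniform(-1, 1)

def theta_minus(m, al, n, be):
    return 0.5 * (theta[(m, al, n, be)] - theta[(al, m, be, n)])

# a[mu][nu] is the value of the classical generator a^mu_nu at a random point.
a = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in N] for _ in N]

def bullet(m, n, al, be):
    """Value of a^m_n . a^al_be from the four-term formula in the proof.
    Sums are restricted to the index ranges on which theta can be nonzero."""
    val = a[m][n] * a[al][be]
    val += 0.5j * sum(theta[(m, al, c, d)] * a[c][n] * a[d][be] for c in LOW for d in LOW)
    val -= 0.5j * sum(a[m][c] * a[al][d] * theta[(c, d, n, be)] for c in UP for d in UP)
    val += 0.25 * sum(theta[(m, al, x, y)] * a[x][c] * a[y][d] * theta[(c, d, n, be)]
                      for x in LOW for y in LOW for c in UP for d in UP)
    return val

def commutator_lhs(m, n, al, be):
    return bullet(m, n, al, be) - bullet(al, be, m, n)

def commutator_rhs(m, n, al, be):
    # i theta^-{m al}_{cd} a^c_n . a^d_be  -  i a^al_d . a^m_c theta^-{cd}_{n be}
    first = sum(theta_minus(m, al, c, d) * bullet(c, n, d, be) for c in LOW for d in LOW)
    second = sum(bullet(al, d, m, c) * theta_minus(c, d, n, be) for c in UP for d in UP)
    return 1j * first - 1j * second

worst = max(abs(commutator_lhs(*i) - commutator_rhs(*i))
            for i in itertools.product(N, repeat=4))

# Block example: gamma^1_2 . tau^2_3 keeps only the sigma * gamma-tilde correction.
gamma_tau = a[0][1] * a[1][2] + 0.5j * sum(theta[(0, 1, c, d)] * a[c][1] * a[d][2]
                                           for c in LOW for d in LOW)
block_error = abs(bullet(0, 1, 1, 2) - gamma_tau)
print(worst, block_error)
```

The block example shows how the index ranges make terms drop out: with µ, ν, α ∈ {1, 2} and β ∈ {3, 4}, only the second term of the general formula survives.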


Proposition 4.2. The covariantly twisted algebra $C_F[\widetilde{CM}^\#]$ has the deformed product

\begin{align*}
x^{\mu\nu} \bullet x^{\alpha\beta} = x^{\mu\nu} x^{\alpha\beta} &+ \frac{\imath}{2}\big(\theta^{\mu\beta}_{ad}\, x^{a\nu} x^{\alpha d} + \theta^{\mu\alpha}_{ac}\, x^{a\nu} x^{c\beta} + \theta^{\nu\beta}_{bd}\, x^{\mu b} x^{\alpha d} + \theta^{\nu\alpha}_{bc}\, x^{\mu b} x^{c\beta}\big) \\
&- \frac{1}{4}\big(\theta^{\nu\alpha}_{bc}\theta^{\mu\beta}_{ad} + \theta^{\nu\beta}_{bd}\theta^{\mu\alpha}_{ac}\big)\, x^{ab} x^{cd}
\end{align*}

and is isomorphic to the subalgebra $C_F[GL_4]^{C_F[\tilde H]}$, where F is pulled back to C[H̃]. Products of generators with $t = x^{34}$ are undeformed.

Proof. The isomorphism $C_F[\widetilde{CM}^\#] \cong C_F[GL_4]^{C_F[\tilde H]}$ is a consequence of the functoriality of the cocycle twist. The deformed product is simply a matter of calculating the twisted product on $C_F[\widetilde{CM}^\#]$. The coaction $\Delta_L(x^{\mu\nu}) = a^\mu_a a^\nu_b \otimes x^{ab}$ combined with the formula (16) yields

\[ x^{\mu\nu} \bullet x^{\alpha\beta} = F(a^\mu_a a^\nu_b,\, a^\alpha_c a^\beta_d)\, x^{ab} x^{cd} = F^{\mu\alpha}_{mn} F^{m\beta}_{ap} F^{\nu n}_{qc} F^{qp}_{bd}\, x^{ab} x^{cd}, \]

using that F in our particular case is multiplicative (a Hopf algebra bicharacter on C[C4] and hence when pulled back to C[GL4]). Alternatively, one may compute it directly from the original definition as an exponentiated operator, going out to ∂∂ ⊗ ∂∂ terms before evaluating at zero in C4. Either way we have the result stated when we recall that $\theta^{\mu\alpha}_{bd}$ is understood to be zero unless µ, α ∈ {1, 2}, b, d ∈ {3, 4}. We note also the commutation relations

\[ x^{\alpha\beta} \bullet x^{\mu\nu} = R^{\mu\alpha}_{mn} R^{m\beta}_{ap} R^{\nu n}_{qc} R^{qp}_{bd}\, x^{ab} \bullet x^{cd} \]

following from $b \bullet a = R(a_{(1)}, b_{(1)})\, a_{(2)} \bullet b_{(2)}$ computed in the same way as above but now with R in place of F. Since $R^{-1} = R_{21}$ these relations may be written in a ‘reflection’ form on regarding x as a matrix. Finally, since θ is zero when µ, α ∉ {1, 2} we see that the • product of the generator $t = x^{34}$ with any other generator is undeformed, which also implies that t is central in the deformed algebra.

Examining the resulting relations associated to the twisted product more closely, one finds

\begin{align*}
[z, \tilde z]_\bullet &= [-x^{23}, x^{14}]_\bullet = -\frac{\imath}{2}\,\theta^{21}_{43}\, x^{43} x^{34} + \frac{\imath}{2}\,\theta^{12}_{34}\, x^{34} x^{43} = \imath\,\theta^{-21}_{43}\, t^2, \\
[w, \tilde w]_\bullet &= [-x^{13}, x^{24}]_\bullet = -\frac{\imath}{2}\,\theta^{12}_{43}\, x^{43} x^{34} + \frac{\imath}{2}\,\theta^{21}_{34}\, x^{34} x^{43} = \imath\,\theta^{-12}_{43}\, t^2,
\end{align*}

with the remaining commutators amongst these affine Minkowski space generators undeformed. Since products with t are undeformed, the above can be viewed as the commutation relations among the affine generators t, w, w̃, z, z̃ of $C_F[\widetilde{CM}^\#]$. The commutation relations for the s generator are

\begin{align*}
[s, z]_\bullet = [x^{12}, -x^{23}]_\bullet &= -\frac{\imath}{2}\,\theta^{12}_{ac}\, x^{a2} x^{c3} + \frac{\imath}{2}\,\theta^{21}_{ac}\, x^{a3} x^{c2} \\
&= -\frac{\imath}{2}\,\theta^{12}_{34}\, x^{32} x^{43} - \frac{\imath}{2}\,\theta^{12}_{44}\, x^{42} x^{43} + \frac{\imath}{2}\,\theta^{21}_{43}\, x^{43} x^{32} + \frac{\imath}{2}\,\theta^{21}_{44}\, x^{43} x^{42} \\
&= \imath\,\theta^{-12}_{34}\, z t + \imath\,\theta^{-21}_{44}\, \tilde w t,
\end{align*}


for example, as well as

\begin{align*}
[s, \tilde z]_\bullet &= \imath\,\theta^{-12}_{33}\, w t + \imath\,\theta^{-21}_{43}\, \tilde z t, \\
[s, w]_\bullet &= \imath\,\theta^{-12}_{43}\, w t + \imath\,\theta^{-21}_{44}\, \tilde z t, \\
[s, \tilde w]_\bullet &= \imath\,\theta^{-12}_{33}\, z t + \imath\,\theta^{-21}_{34}\, \tilde w t.
\end{align*}

Again, we may equally well use the • product on the right hand side of each equation.

Proposition 4.3. The twisted algebra $C_F[\tilde T]$ has deformed product

\[ Z^\mu \bullet Z^\nu = Z^\mu Z^\nu + \frac{\imath}{2}\,\theta^{\mu\nu}_{ab}\, Z^a Z^b. \]

It is isomorphic to the subalgebra $C_F[GL_4]^{C_F[\tilde K]}$, where F is pulled back to C[K̃]. Products of generators with Z³, Z⁴ are undeformed.

Proof. The isomorphism $C_F[\tilde T] \cong C_F[GL_4]^{C_F[\tilde K]}$ is again a consequence of the theory of cocycle twisting. An application of Eq. (16) gives the new product

\[ Z^\nu \bullet Z^\mu = F^{\nu\mu}_{ab}\, Z^a Z^b = F^{\nu\mu}_{ab}\, (F^{-1})^{ba}_{cd}\, Z^c \bullet Z^d = R^{\mu\nu}_{cd}\, Z^c \bullet Z^d, \]

and we compute the first of these explicitly. The remaining relations tell us that this is a ‘braided vector space’ associated to the R-matrix, see [12, Ch. 10].

As before, the form of θ implies that products with Z³, Z⁴ are undeformed. Hence these are central. We conclude that the only non-trivial commutation relation is the Z¹-Z² one, which we compute explicitly as

\[ [Z^1, Z^2]_\bullet = -\imath\,\theta^{-21}_{ab}\, Z^a \bullet Z^b = \imath\big(\theta^{-12}_{34} + \theta^{-12}_{43}\big) Z^3 Z^4 + \imath\,\theta^{-12}_{33}\, Z^3 Z^3 + \imath\,\theta^{-12}_{44}\, Z^4 Z^4, \]

where we could as well use the • on the right.

We remark that although our deformation of the conformal group is different from and more general than that previously obtained in [9] (which insists on only first order terms in the deformation parameter), it is of note that the deformed commutation relations associated to twistor space and conformal space-time are in agreement with those proposed in recent literature [9,10,16] provided one supposes that the four parameters $\theta^{-12}_{33} = \theta^{-12}_{44} = \theta^{-11}_{34} = \theta^{-22}_{34} = 0$. This says that $\theta^{-A'B'}_{AB}$ as a 4 × 4 matrix with rows AA′ and columns BB′ (the usual presentation) has the form

\[ \theta^- = \begin{pmatrix} 0 & 0 & 0 & \theta_1 \\ 0 & 0 & -\theta_2 & 0 \\ 0 & \theta_2 & 0 & 0 \\ -\theta_1 & 0 & 0 & 0 \end{pmatrix}, \qquad \theta_1 = \theta^{-12}_{34}, \quad \theta_2 = \theta^{-21}_{34}. \]

The quantum group deformation we propose then agrees with [9] on the generators $\gamma^\mu_\nu$, $\sigma^\alpha_\beta$ and this in fact is the reason that the space-time and twistor algebras then agree,


since their generators may be viewed as living in the subalgebra generated by the first two columns of a via the isomorphisms given in Propositions 4.2 and 4.3.

Finally we give the commutation relations in the twisted coordinate algebra $C_F[\tilde F]$ of the correspondence space, which may be computed either by viewing it as a twisted comodule algebra for $C_F[GL_4]$ or by identification with the appropriate subalgebra of $C_F[GL_4]$ and calculating there. Either way, one obtains

\begin{align*}
[s, Z^1]_\bullet &= -\imath\,\theta^{-21}_{33}\, w Z^3 - \imath\,\theta^{-21}_{34}\, w Z^4 + \imath\,\theta^{-21}_{43}\, \tilde z Z^3 + \imath\,\theta^{-21}_{44}\, \tilde z Z^4 + \imath\,\theta^{-11}_{34}\, t Z^2, \tag{18} \\
[s, Z^2]_\bullet &= \imath\,\theta^{-12}_{33}\, z Z^3 + \imath\,\theta^{-12}_{34}\, z Z^4 - \imath\,\theta^{-12}_{43}\, \tilde w Z^3 - \imath\,\theta^{-12}_{44}\, \tilde w Z^4 + \imath\,\theta^{-22}_{34}\, t Z^1, \\
[z, Z^1]_\bullet &= \imath\,\theta^{-21}_{43}\, t Z^3 + \imath\,\theta^{-21}_{44}\, t Z^4, & [z, Z^2]_\bullet &= \imath\,\theta^{-22}_{43}\, t Z^3, \\
[\tilde z, Z^1]_\bullet &= \imath\,\theta^{-11}_{34}\, t Z^4, & [\tilde z, Z^2]_\bullet &= \imath\,\theta^{-12}_{33}\, t Z^3 + \imath\,\theta^{-12}_{34}\, t Z^4, \\
[w, Z^1]_\bullet &= \imath\,\theta^{-11}_{43}\, t Z^3, & [w, Z^2]_\bullet &= \imath\,\theta^{-12}_{43}\, t Z^3 + \imath\,\theta^{-12}_{44}\, t Z^4, \\
[\tilde w, Z^1]_\bullet &= \imath\,\theta^{-21}_{33}\, t Z^3 + \imath\,\theta^{-21}_{34}\, t Z^4, & [\tilde w, Z^2]_\bullet &= \imath\,\theta^{-22}_{34}\, t Z^4,
\end{align*}

where we may equally write • on the right hand side of each relation. The generators t, Z³, Z⁴ are of course central. The relations (1) twist simply by replacing the old product by •.

5. Quantum Differential Calculi on $C_F[GL_4]$, $C_F[\widetilde{CM}^\#]$ and $C_F[\tilde T]$

We recall that a differential calculus on an algebra A consists of an A-A-bimodule Ω¹A and a map d : A → Ω¹A obeying the Leibniz rule such that Ω¹A is spanned by elements of the form a db. Every unital algebra has a universal calculus $\Omega^1_{un} = \ker \mu$, where µ is the product map of A. The differential is $d_{un}(a) = 1 \otimes a - a \otimes 1$. Any other calculus is a quotient of $\Omega^1_{un}$ by a sub-bimodule $N_A$.

When A is a Hopf algebra, it coacts on itself by left and right translation via the coproduct Δ: we say a calculus on A is left covariant if this coaction extends to a left coaction $\Delta_L : \Omega^1 A \to A \otimes \Omega^1 A$ such that d is an intertwiner and $\Delta_L$ is a bimodule map, that is

\[ \Delta_L(da) = (\mathrm{id} \otimes d)\Delta(a), \qquad a \cdot \Delta_L(\omega) = \Delta_L(a \cdot \omega), \qquad \Delta_L(\omega) \cdot b = \Delta_L(\omega \cdot b) \]

for all a, b ∈ A, ω ∈ Ω¹A, where A acts on A ⊗ Ω¹A in the tensor product representation. Equivalently, $\Delta_L(a \cdot \omega) = \Delta(a) \cdot \Delta_L(\omega)$ etc., with the second product as an A ⊗ A-module. We then say that a one-form ω ∈ Ω¹A is left invariant if it is invariant under left translation by $\Delta_L$. Of course, similar definitions may be made with ‘left’ replaced by ‘right’. It is bicovariant if both definitions hold and the left and right coactions commute. We similarly have the notion of the calculus on an H-comodule algebra A being H-covariant, namely that the coaction extends to the calculus such that it commutes with d and is multiplicative with respect to the bimodule product.

We now quantise the differential structures on our spaces and groups by the same covariant twist method. For groups the important thing to know is that the classical exterior algebra of differential forms Ω(GL4) (in our case) is a super-Hopf algebra where


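the coproduct on degree zero elements is that of C[GL4] while on degree one it is $\Delta_L + \Delta_R$ for the classical coactions induced by left and right translation (so $\Delta_L\, da = a_{(1)} \otimes da_{(2)}$, etc.). We view F as a cocycle on this super algebra by extending it as zero, and make a cotwist in the super-algebra version of the cotwist of C[GL4]. Then $\Omega(C_F[GL_4])$ has the bimodule and wedge products

\begin{align*}
a^\mu_\nu \bullet da^\alpha_\beta &= F^{\mu\alpha}_{mn}\, a^m_p\, da^n_q\, (F^{-1})^{pq}_{\nu\beta}, \qquad da^\mu_\nu \bullet a^\alpha_\beta = F^{\mu\alpha}_{mn}\, (da^m_p)\, a^n_q\, (F^{-1})^{pq}_{\nu\beta}, \\
da^\mu_\nu \bullet da^\alpha_\beta &= F^{\mu\alpha}_{mn}\, da^m_p \wedge da^n_q\, (F^{-1})^{pq}_{\nu\beta},
\end{align*}

while d itself is not deformed. The commutation relations are

\[ R^{\mu\alpha}_{ab}\, a^a_\nu \bullet da^b_\beta = da^\alpha_b \bullet a^\mu_a\, R^{ab}_{\nu\beta}, \qquad R^{\mu\alpha}_{ab}\, da^a_\nu \bullet da^b_\beta = -da^\alpha_b \bullet da^\mu_a\, R^{ab}_{\nu\beta}. \]

In terms of the decomposition (17) the deformed products come out as

\begin{align*}
\gamma^\mu_\nu \bullet d\gamma^\alpha_\beta &= \gamma^\mu_\nu\, d\gamma^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, \sigma^a_\nu\, d\sigma^b_\beta, &
d\gamma^\mu_\nu \bullet \gamma^\alpha_\beta &= (d\gamma^\mu_\nu)\gamma^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, (d\sigma^a_\nu)\sigma^b_\beta, \\
d\gamma^\mu_\nu \bullet d\gamma^\alpha_\beta &= d\gamma^\mu_\nu \wedge d\gamma^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, d\sigma^a_\nu \wedge d\sigma^b_\beta, \\
\tilde\gamma^\mu_\nu \bullet d\tilde\gamma^\alpha_\beta &= \tilde\gamma^\mu_\nu\, d\tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}\sigma^\mu_c\, d\sigma^\alpha_d\, \theta^{cd}_{\nu\beta}, &
d\tilde\gamma^\mu_\nu \bullet \tilde\gamma^\alpha_\beta &= (d\tilde\gamma^\mu_\nu)\tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}(d\sigma^\mu_c)\sigma^\alpha_d\, \theta^{cd}_{\nu\beta}, \\
d\tilde\gamma^\mu_\nu \bullet d\tilde\gamma^\alpha_\beta &= d\tilde\gamma^\mu_\nu \wedge d\tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}\, d\sigma^\mu_c \wedge d\sigma^\alpha_d\, \theta^{cd}_{\nu\beta}, \\
\gamma^\mu_\nu \bullet d\tau^\alpha_\beta &= \gamma^\mu_\nu\, d\tau^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, \sigma^a_\nu\, d\tilde\gamma^b_\beta, &
d\gamma^\mu_\nu \bullet \tau^\alpha_\beta &= (d\gamma^\mu_\nu)\tau^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, (d\sigma^a_\nu)\tilde\gamma^b_\beta, \\
d\gamma^\mu_\nu \bullet d\tau^\alpha_\beta &= d\gamma^\mu_\nu \wedge d\tau^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, d\sigma^a_\nu \wedge d\tilde\gamma^b_\beta, \\
\tau^\mu_\nu \bullet d\gamma^\alpha_\beta &= \tau^\mu_\nu\, d\gamma^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, \tilde\gamma^a_\nu\, d\sigma^b_\beta, &
d\tau^\mu_\nu \bullet \gamma^\alpha_\beta &= (d\tau^\mu_\nu)\gamma^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, (d\tilde\gamma^a_\nu)\sigma^b_\beta, \\
d\tau^\mu_\nu \bullet d\gamma^\alpha_\beta &= d\tau^\mu_\nu \wedge d\gamma^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, d\tilde\gamma^a_\nu \wedge d\sigma^b_\beta, \\
\tilde\gamma^\mu_\nu \bullet d\tau^\alpha_\beta &= \tilde\gamma^\mu_\nu\, d\tau^\alpha_\beta - \tfrac{\imath}{2}\sigma^\mu_c\, d\gamma^\alpha_d\, \theta^{cd}_{\nu\beta}, &
d\tilde\gamma^\mu_\nu \bullet \tau^\alpha_\beta &= (d\tilde\gamma^\mu_\nu)\tau^\alpha_\beta - \tfrac{\imath}{2}(d\sigma^\mu_c)\gamma^\alpha_d\, \theta^{cd}_{\nu\beta}, \\
d\tilde\gamma^\mu_\nu \bullet d\tau^\alpha_\beta &= d\tilde\gamma^\mu_\nu \wedge d\tau^\alpha_\beta - \tfrac{\imath}{2}\, d\sigma^\mu_c \wedge d\gamma^\alpha_d\, \theta^{cd}_{\nu\beta}, \\
\tau^\mu_\nu \bullet d\tilde\gamma^\alpha_\beta &= \tau^\mu_\nu\, d\tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}\gamma^\mu_c\, d\sigma^\alpha_d\, \theta^{cd}_{\nu\beta}, &
d\tau^\mu_\nu \bullet \tilde\gamma^\alpha_\beta &= (d\tau^\mu_\nu)\tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}(d\gamma^\mu_c)\sigma^\alpha_d\, \theta^{cd}_{\nu\beta}, \\
d\tau^\mu_\nu \bullet d\tilde\gamma^\alpha_\beta &= d\tau^\mu_\nu \wedge d\tilde\gamma^\alpha_\beta - \tfrac{\imath}{2}\, d\gamma^\mu_c \wedge d\sigma^\alpha_d\, \theta^{cd}_{\nu\beta},
\end{align*}

\begin{align*}
\tau^\mu_\nu \bullet d\tau^\alpha_\beta &= \tau^\mu_\nu\, d\tau^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, \tilde\gamma^a_\nu\, d\tilde\gamma^b_\beta - \tfrac{\imath}{2}\gamma^\mu_c\, d\gamma^\alpha_d\, \theta^{cd}_{\nu\beta} + \tfrac{1}{4}\theta^{\mu\alpha}_{ab}\, \sigma^a_c\, d\sigma^b_d\, \theta^{cd}_{\nu\beta}, \\
d\tau^\mu_\nu \bullet \tau^\alpha_\beta &= (d\tau^\mu_\nu)\tau^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, (d\tilde\gamma^a_\nu)\tilde\gamma^b_\beta - \tfrac{\imath}{2}(d\gamma^\mu_c)\gamma^\alpha_d\, \theta^{cd}_{\nu\beta} + \tfrac{1}{4}\theta^{\mu\alpha}_{ab}\, (d\sigma^a_c)\sigma^b_d\, \theta^{cd}_{\nu\beta}, \\
d\tau^\mu_\nu \bullet d\tau^\alpha_\beta &= d\tau^\mu_\nu \wedge d\tau^\alpha_\beta + \tfrac{\imath}{2}\theta^{\mu\alpha}_{ab}\, d\tilde\gamma^a_\nu \wedge d\tilde\gamma^b_\beta - \tfrac{\imath}{2}\, d\gamma^\mu_c \wedge d\gamma^\alpha_d\, \theta^{cd}_{\nu\beta} + \tfrac{1}{4}\theta^{\mu\alpha}_{ab}\, d\sigma^a_c \wedge d\sigma^b_d\, \theta^{cd}_{\nu\beta},
\end{align*}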


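with remaining relations undeformed. As before we adopt the convention that in each set of equations, the indices α, β, µ, ν lie in the appropriate ranges for each 2 × 2 block. One may also calculate the explicit commutation relations in closed form; they will be similar to the above but with θ⁻ in place of θ.

Similarly, since the classical differential structures on $\widetilde{CM}^\#$, T̃ are covariant under GL4, we have coactions on their classical exterior algebras induced from the coactions on the spaces themselves, such that d is equivariant. We can hence covariantly twist these in the same way as the algebras themselves. Thus $\Omega(C_F[\tilde T])$ has structure

\begin{align*}
Z^\nu \bullet dZ^\mu &= F^{\nu\mu}_{ab}\, Z^a\, dZ^b, \qquad dZ^\nu \bullet Z^\mu = F^{\nu\mu}_{ab}\, (dZ^a) Z^b, \tag{19} \\
dZ^\nu \bullet dZ^\mu &= F^{\nu\mu}_{ab}\, dZ^a \wedge dZ^b.
\end{align*}

The commutation relations are similarly

\[ Z^\nu \bullet dZ^\mu = R^{\mu\nu}_{ab}\, dZ^a \bullet Z^b, \qquad dZ^\nu \bullet dZ^\mu = -R^{\mu\nu}_{ab}\, dZ^a \bullet dZ^b. \]

These formulae are essentially as for the coordinate algebra, but now with d inserted, and are (in some form) standard for braided linear spaces defined by an R-matrix. More explicitly,

\begin{align*}
Z^\mu \bullet dZ^\nu &= Z^\mu\, dZ^\nu + \frac{\imath}{2}\theta^{\mu\nu}_{ab}\, Z^a\, dZ^b, \qquad dZ^\mu \bullet Z^\nu = (dZ^\mu) Z^\nu + \frac{\imath}{2}\theta^{\mu\nu}_{ab}\, dZ^a\, Z^b, \\
dZ^\mu \bullet dZ^\nu &= dZ^\mu \wedge dZ^\nu + \frac{\imath}{2}\theta^{\mu\nu}_{ab}\, dZ^a \wedge dZ^b,
\end{align*}

so that the Z³, Z⁴, dZ³, dZ⁴ products are undeformed. In terms of commutation relations

\[ [Z^\mu, dZ^\nu]_\bullet = \imath\,\theta^{-\mu\nu}_{ab}\, Z^a\, dZ^b, \qquad \{dZ^\mu, dZ^\nu\}_\bullet = \imath\,\theta^{-\mu\nu}_{ab}\, dZ^a \wedge dZ^b, \]

where the right hand sides are for a, b ∈ {3, 4} and could be written with the bullet product equally well. The only non-classical commutation relations here are those with µ, ν ∈ {1, 2}.

Similarly, $\Omega(C_F[\widetilde{CM}^\#])$ has structure

\begin{align*}
x^{\mu\nu} \bullet dx^{\alpha\beta} &= F^{\mu\alpha}_{mn} F^{m\beta}_{ap} F^{\nu n}_{qc} F^{qp}_{bd}\, x^{ab}\, dx^{cd}, \qquad dx^{\mu\nu} \bullet x^{\alpha\beta} = F^{\mu\alpha}_{mn} F^{m\beta}_{ap} F^{\nu n}_{qc} F^{qp}_{bd}\, (dx^{ab})\, x^{cd}, \\
dx^{\mu\nu} \bullet dx^{\alpha\beta} &= F^{\mu\alpha}_{mn} F^{m\beta}_{ap} F^{\nu n}_{qc} F^{qp}_{bd}\, dx^{ab} \wedge dx^{cd}.
\end{align*}

On the affine Minkowski generators and t we find explicitly:

\begin{align*}
z \bullet d\tilde z &= z\, d\tilde z + \frac{\imath}{2}\theta^{21}_{43}\, t\, dt, & dz \bullet \tilde z &= (dz)\tilde z + \frac{\imath}{2}\theta^{21}_{43}\, (dt)\, t, \\
\tilde z \bullet dz &= \tilde z\, dz + \frac{\imath}{2}\theta^{12}_{34}\, t\, dt, & d\tilde z \bullet z &= (d\tilde z) z + \frac{\imath}{2}\theta^{12}_{34}\, (dt)\, t, \\
dz \bullet d\tilde z &= dz \wedge d\tilde z + \frac{\imath}{2}\theta^{21}_{43}\, dt \wedge dt = dz \wedge d\tilde z, \tag{20} \\
w \bullet d\tilde w &= w\, d\tilde w + \frac{\imath}{2}\theta^{12}_{43}\, t\, dt, & dw \bullet \tilde w &= (dw)\tilde w + \frac{\imath}{2}\theta^{12}_{43}\, (dt)\, t,
\end{align*}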


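\begin{align*}
\tilde w \bullet dw &= \tilde w\, dw + \frac{\imath}{2}\theta^{21}_{34}\, t\, dt, & d\tilde w \bullet w &= (d\tilde w) w + \frac{\imath}{2}\theta^{21}_{34}\, (dt)\, t, \\
dw \bullet d\tilde w &= dw \wedge d\tilde w + \frac{\imath}{2}\theta^{12}_{43}\, dt \wedge dt = dw \wedge d\tilde w,
\end{align*}

with other relations amongst these generators undeformed, as are the relations involving dt, whence we may equally use the • product in terms which involve t, dt. The relations in the calculus involving s, ds are more complicated. We write just the final commutation relations for these:

\begin{align*}
[s, dz]_\bullet &= -\tfrac{\imath}{2}\theta^{12}_{34}\, z\, dt + \tfrac{\imath}{2}\theta^{21}_{43}\, t\, dz - \tfrac{\imath}{2}\theta^{12}_{44}\, \tilde w\, dt + \tfrac{\imath}{2}\theta^{21}_{44}\, t\, d\tilde w; \\
[ds, z]_\bullet &= -\tfrac{\imath}{2}\theta^{12}_{34}\, (dz)\, t + \tfrac{\imath}{2}\theta^{21}_{43}\, (dt)\, z - \tfrac{\imath}{2}\theta^{12}_{44}\, (d\tilde w)\, t + \tfrac{\imath}{2}\theta^{21}_{44}\, (dt)\, \tilde w; \\
[ds, dz]_\bullet &= -\tfrac{\imath}{2}\theta^{12}_{34}\, dz \wedge dt + \tfrac{\imath}{2}\theta^{21}_{43}\, dt \wedge dz - \tfrac{\imath}{2}\theta^{12}_{44}\, d\tilde w \wedge dt + \tfrac{\imath}{2}\theta^{21}_{44}\, dt \wedge d\tilde w; \\
[s, d\tilde z]_\bullet &= -\tfrac{\imath}{2}\theta^{21}_{33}\, w\, dt + \tfrac{\imath}{2}\theta^{12}_{33}\, t\, dw + \tfrac{\imath}{2}\theta^{21}_{43}\, \tilde z\, dt - \tfrac{\imath}{2}\theta^{12}_{34}\, t\, d\tilde z; \\
[ds, \tilde z]_\bullet &= -\tfrac{\imath}{2}\theta^{21}_{33}\, (dw)\, t + \tfrac{\imath}{2}\theta^{12}_{33}\, (dt)\, w + \tfrac{\imath}{2}\theta^{21}_{43}\, (d\tilde z)\, t - \tfrac{\imath}{2}\theta^{12}_{34}\, (dt)\, \tilde z; \\
[ds, d\tilde z]_\bullet &= -\tfrac{\imath}{2}\theta^{21}_{33}\, dw \wedge dt + \tfrac{\imath}{2}\theta^{12}_{33}\, dt \wedge dw + \tfrac{\imath}{2}\theta^{21}_{43}\, d\tilde z \wedge dt - \tfrac{\imath}{2}\theta^{12}_{34}\, dt \wedge d\tilde z; \\
[s, dw]_\bullet &= -\tfrac{\imath}{2}\theta^{21}_{34}\, w\, dt + \tfrac{\imath}{2}\theta^{12}_{43}\, t\, dw + \tfrac{\imath}{2}\theta^{21}_{44}\, \tilde z\, dt - \tfrac{\imath}{2}\theta^{12}_{44}\, t\, d\tilde z; \\
[ds, w]_\bullet &= -\tfrac{\imath}{2}\theta^{21}_{34}\, (dw)\, t + \tfrac{\imath}{2}\theta^{12}_{43}\, (dt)\, w + \tfrac{\imath}{2}\theta^{21}_{44}\, (d\tilde z)\, t - \tfrac{\imath}{2}\theta^{12}_{44}\, (dt)\, \tilde z; \\
[ds, dw]_\bullet &= -\tfrac{\imath}{2}\theta^{21}_{34}\, dw \wedge dt + \tfrac{\imath}{2}\theta^{12}_{43}\, dt \wedge dw + \tfrac{\imath}{2}\theta^{21}_{44}\, d\tilde z \wedge dt - \tfrac{\imath}{2}\theta^{12}_{44}\, dt \wedge d\tilde z; \\
[s, d\tilde w]_\bullet &= \tfrac{\imath}{2}\theta^{12}_{33}\, z\, dt - \tfrac{\imath}{2}\theta^{21}_{33}\, t\, dz - \tfrac{\imath}{2}\theta^{12}_{43}\, \tilde w\, dt + \tfrac{\imath}{2}\theta^{21}_{34}\, t\, d\tilde w; \\
[ds, \tilde w]_\bullet &= \tfrac{\imath}{2}\theta^{12}_{33}\, (dz)\, t - \tfrac{\imath}{2}\theta^{21}_{33}\, (dt)\, z - \tfrac{\imath}{2}\theta^{12}_{43}\, (d\tilde w)\, t + \tfrac{\imath}{2}\theta^{21}_{34}\, (dt)\, \tilde w; \\
[ds, d\tilde w]_\bullet &= \tfrac{\imath}{2}\theta^{12}_{33}\, dz \wedge dt - \tfrac{\imath}{2}\theta^{21}_{33}\, dt \wedge dz - \tfrac{\imath}{2}\theta^{12}_{43}\, d\tilde w \wedge dt + \tfrac{\imath}{2}\theta^{21}_{34}\, dt \wedge d\tilde w.
\end{align*}

The calculus of the correspondence space algebra $C_F[\tilde F]$ is generated by dZ^µ, ds, dt, dz, dz̃, dw, dw̃ with twisted relations given by (18) with d inserted where appropriate. Explicitly we have

\begin{align*}
[z, dZ^1]_\bullet &= \imath\,\theta^{-21}_{43}\, t\, dZ^3 + \imath\,\theta^{-21}_{44}\, t\, dZ^4, & [z, dZ^2]_\bullet &= \imath\,\theta^{-22}_{43}\, t\, dZ^3, \\
[dz, Z^1]_\bullet &= \imath\,\theta^{-21}_{43}\, (dt) Z^3 + \imath\,\theta^{-21}_{44}\, (dt) Z^4, & [dz, Z^2]_\bullet &= \imath\,\theta^{-22}_{43}\, (dt) Z^3, \tag{21} \\
[dz, dZ^1]_\bullet &= \imath\,\theta^{-21}_{43}\, dt \wedge dZ^3 + \imath\,\theta^{-21}_{44}\, dt \wedge dZ^4, & [dz, dZ^2]_\bullet &= \imath\,\theta^{-22}_{43}\, dt \wedge dZ^3,
\end{align*}

for example, where we may equally use the • product on the right-hand sides. The other relations are obtained similarly, hence we refrain from writing them out explicitly, although we stress that dZ³, dZ⁴, dt are central in the calculus.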
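The twistor-space relation [Z^µ, dZ^ν]• = ıθ⁻^{µν}_{ab} Z^a dZ^b in this section follows in one line from the explicit bimodule products, and the bookkeeping can be sketched in code. The representation below is an assumption made for illustration: a one-form Σ c_{ab} Z^a dZ^b (classical product) is stored as its 4 × 4 coefficient table, θ is filled with random values on its allowed index range, and the function names are hypothetical.

```python
import itertools
import random

N = range(4)
UP, LOW = (0, 1), (2, 3)   # 0-based labels for {1,2} and {3,4}

random.seed(3)
# theta[(mu, nu, a, b)] stands for theta^{mu nu}_{ab}.
theta = {idx: 0.0 for idx in itertools.product(N, repeat=4)}
for idx in itertools.product(UP, UP, LOW, LOW):
    theta[idx] = random.uniform(-1, 1)

def theta_minus(mu, nu, a, b):
    return 0.5 * (theta[(mu, nu, a, b)] - theta[(nu, mu, b, a)])

def Z_bullet_dZ(mu, nu):
    # Z^mu . dZ^nu = Z^mu dZ^nu + (i/2) theta^{mu nu}_{ab} Z^a dZ^b
    c = [[0j] * 4 for _ in N]
    c[mu][nu] += 1
    for x, y in itertools.product(N, N):
        c[x][y] += 0.5j * theta[(mu, nu, x, y)]
    return c

def dZ_bullet_Z(nu, mu):
    # dZ^nu . Z^mu = (dZ^nu) Z^mu + (i/2) theta^{nu mu}_{ab} (dZ^a) Z^b,
    # rewritten classically as coefficients of Z^b dZ^a.
    c = [[0j] * 4 for _ in N]
    c[mu][nu] += 1
    for x, y in itertools.product(N, N):
        c[y][x] += 0.5j * theta[(nu, mu, x, y)]
    return c

def commutator(mu, nu):
    left, right = Z_bullet_dZ(mu, nu), dZ_bullet_Z(nu, mu)
    return [[left[x][y] - right[x][y] for y in N] for x in N]

def expected(mu, nu):
    return [[1j * theta_minus(mu, nu, x, y) for y in N] for x in N]

worst = max(abs(commutator(m, n)[x][y] - expected(m, n)[x][y])
            for m, n, x, y in itertools.product(N, repeat=4))
# Z^3 (index 2) should commute with every dZ^nu.
central = max(abs(commutator(2, n)[x][y]) for n, x, y in itertools.product(N, repeat=3))
print(worst, central)
```

The second printed number illustrates the remark that products involving Z³, Z⁴ are undeformed: all corrections carry upper indices in {1, 2} only.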

Quantisation of Twistor Theory by Cocycle Twist


6. Quantisation by Cotwists in the $SU_n$ ∗-Algebra Version

Our second setting is to work with twistor space and space-time as real manifolds in a ∗-algebra context. To this end we gave a 'projector' description of our spaces $\mathbb{CP}^3$, $\mathbb{CM}^\#$, $\mathcal F$ as well as their realisation as $SU_4$ homogeneous spaces. We show here that this setting too quantises nicely by a cochain twist. This approach is directly compatible with $C^*$-algebra methods, although we shall not perform the $C^*$-algebra completions here. We shall however simultaneously quantise all real manifolds defined by $n \times n$ matrices of generators with $SU_n$-covariant conditions, a class which (as we have seen) includes all partial flag varieties based on $\mathbb{C}^n$, with $n = 4$ the case relevant for the paper.

The main new ingredient is that in the general theory of cotwisting, we should add that $H$ is a Hopf ∗-algebra in the sense that $\Delta$ is a ∗-algebra map and $(S \circ *)^2 = \mathrm{id}$, and that the cocycle is real in the sense [12]
\[ \overline{F(h, g)} = F\big((S^2 g)^*, (S^2 h)^*\big). \]
In this case the twisted Hopf algebra acquires a new ∗-structure
\[ h^{*_F} = V^{-1}(S^{-1}h_{(1)})\,(h_{(2)})^*\,V(S^{-1}h_{(3)}), \qquad V(h) = U^{-1}(h_{(1)})\,U(S^{-1}h_{(2)}). \]
(We note the small correction to the formula stated in [12].) Also, if $A$ is a left comodule algebra and a ∗-algebra, we require the coaction $\Delta_L$ to be a ∗-algebra map. Then $A_F$ has a new ∗-structure
\[ a^{*_F} = V^{-1}(S^{-1}a_{(1)})\,(a_{(2)})^*. \]
The twisting subgroup in our application will be $S(U_1 \times U_1 \times U_1 \times U_1)$ since this is contained in all relevant subgroups of $SU_4$, or rather the larger group $SU_n$ with twisting subgroup appearing in the Hopf ∗-algebra picture as
\[ H = \mathbb{C}[S(U_1^n)] = \mathbb{C}[t_\mu, t_\mu^{-1};\ \mu = 1, \ldots, n]/\langle t_1 \cdots t_n = 1 \rangle, \]
\[ t_\mu^* = t_\mu^{-1}, \qquad \Delta t_\mu = t_\mu \otimes t_\mu, \qquad \epsilon(t_\mu) = 1, \qquad S t_\mu = t_\mu^{-1}. \]
This is also the group algebra of the Abelian group $\mathbb{Z}^n/\mathbb{Z}(1, 1, \ldots, 1)$ (where the vector here has $n$ entries, all equal to 1). We define a basis of $H$ by
\[ t^a = t_1^{a_1} \cdots t_n^{a_n}, \qquad a \in \mathbb{Z}^n/\mathbb{Z}(1, 1, \ldots, 1). \]
Note that we could of course eliminate one of the $U_1$ factors and identify the group with $(U_1)^{n-1}$ and the dual group with $\mathbb{Z}^{n-1}$, and this would be entirely equivalent in what follows but not canonical. We prefer to keep manifest the natural inclusion on the diagonal of $SU_n$. This inclusion appears now as the ∗-Hopf algebra surjection
\[ \pi : \mathbb{C}[SU_n] \to H, \qquad \pi(a^\mu_\nu) = \delta^\mu_\nu\, t_\mu. \]
Next, we define a cocycle $F : H \otimes H \to \mathbb{C}$ by

\[ F(t^a, t^b) = e^{\imath\, a \cdot \theta \cdot b} \]

where $\theta \in M_n(\mathbb{C})$ is any matrix for which every row and every column adds up to zero (so that $(1, 1, \ldots, 1)$ is in the null space from either side). Such a matrix is fully determined by an arbitrary choice of (say) a lower $(n-1) \times (n-1)$ diagonal block. Thus


S. J. Brain, S. Majid

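the data here is an arbitrary $(n-1) \times (n-1)$ matrix just as we would have if we had eliminated $t_1$ in the first place. The reality condition and the functional $V$ work out as
\[ \theta^\dagger = -\theta, \qquad U(t^a) = e^{-\imath\, a \cdot \theta \cdot a}, \qquad V(t^a) = 1. \]
The latter means that the ∗-structures do not deform. We now pull back this cocycle under $\pi$ to a cocycle on $\mathbb{C}[SU_n]$ with matrix
\[ F^{\mu\alpha}{}_{\nu\beta} = F(a^\mu_\nu, a^\alpha_\beta) = \delta^\mu_\nu \delta^\alpha_\beta\, F(t_\nu, t_\beta) = \delta^\mu_\nu \delta^\alpha_\beta\, e^{\imath\theta_{\nu\beta}}. \]
We then use these new $F$-matrices in place of those in Sects. 4, 5 since the general formulae in terms of $F$-matrices are identical, being determined by the coactions and coproducts, with the additional twist of the ∗-operation computed in the same way. Thus we find easily that $\mathbb{C}_F[SU_n]$ has the deformed product and antipode (and undeformed ∗-structure):
\[ a^\mu_\nu \bullet a^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\beta})}\, a^\mu_\nu a^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\beta} - \theta_{\alpha\mu} + \theta_{\beta\nu})}\, a^\alpha_\beta \bullet a^\mu_\nu, \qquad S_F a^\mu_\nu = e^{-\imath(\theta_{\mu\mu} - \theta_{\nu\nu})}\, S a^\mu_\nu, \]
which means that $\mathbb{C}_F[SU_n]$ has a compact form if we use new generators
\[ \hat a^\mu_\nu = e^{\frac{\imath}{2}(\theta_{\mu\mu} - \theta_{\nu\nu})}\, a^\mu_\nu \]
in the sense
\[ \Delta \hat a^\mu_\nu = \hat a^\mu_\alpha \otimes \hat a^\alpha_\nu, \qquad S_F \hat a^\mu_\alpha \bullet \hat a^\alpha_\nu = \delta^\mu_\nu = \hat a^\mu_\alpha \bullet S_F \hat a^\alpha_\nu, \qquad (\hat a^\mu_\nu)^* = S_F \hat a^\nu_\mu. \]
They have the same form of commutation relations as the $a^\mu_\nu$ with respect to the $\bullet$ product. Note that for a $C^*$-algebra treatment we will certainly want the commutation relations in 'Weyl form' with a purely phase factor and hence $\theta$ to be real-valued, hence antisymmetric and hence with zeros on the diagonal. So in this natural case there will be no difference between the $\hat a$ and the $a$ generators.

Similarly we find by comodule cotwist that
\[ Z^\mu \bullet Z^\nu = e^{\imath\theta_{\mu\nu}} Z^\mu Z^\nu, \qquad Z^{\mu *} \bullet Z^\nu = e^{-\imath\theta_{\mu\nu}} Z^{\mu *} Z^\nu, \qquad Z^\mu \bullet Z^{\nu *} = e^{-\imath\theta_{\mu\nu}} Z^\mu Z^{\nu *}, \]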

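or directly from the unitary transformation of any projectors
\[ P^\mu_\nu \bullet P^\alpha_\beta = F(a^\mu_a S a^c_\nu,\ a^\alpha_b S a^d_\beta)\, P^a_c P^b_d = F(t_\mu t_\nu^{-1},\ t_\alpha t_\beta^{-1})\, P^\mu_\nu P^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\alpha} + \theta_{\nu\beta} - \theta_{\mu\beta})}\, P^\mu_\nu P^\alpha_\beta. \]
The commutation relations for the entries of $Z$, $P$ respectively have the same form as the deformation relations but with $\theta_{\mu\nu}$ replaced by $2\theta^-_{\mu\nu} \equiv \theta_{\mu\nu} - \theta_{\nu\mu}$.

The $P^\mu_\nu$ are no longer projectors with respect to the $\bullet$ product, but the new generators
\[ \hat P^\mu_\nu = e^{-\imath\theta_{\mu\nu} + \frac{\imath}{2}(\theta_{\mu\mu} + \theta_{\nu\nu})}\, P^\mu_\nu \]
are. They enjoy the same commutation relations as the $P^\mu_\nu$ with respect to the bullet product. Moreover
\[ \mathrm{Tr}\, \hat P = \mathrm{Tr}\, P, \qquad (\hat P^\mu_\nu)^* = \hat P^\nu_\mu. \]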
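The claim that the rescaled generators are again projectors rests on a cancellation of phases, which can be checked mechanically. The following is a minimal sketch, not part of the original text: it works only with the exponents (each divided by $\imath$), uses exact rational arithmetic, and the function names and the sample matrix are illustrative assumptions. It checks that the rescaling exponents of $\hat P^\mu_\alpha$ and $\hat P^\alpha_\nu$, together with the twist exponent of $P^\mu_\alpha \bullet P^\alpha_\nu$, add up to the rescaling exponent of $\hat P^\mu_\nu$ for every intermediate index $\alpha$.

```python
from fractions import Fraction
from itertools import product


def twist_exponent(theta, mu, nu, alpha, beta):
    """Exponent (divided by i) of the phase in P^mu_nu . P^alpha_beta."""
    return theta[mu][alpha] - theta[nu][alpha] + theta[nu][beta] - theta[mu][beta]


def hat_exponent(theta, mu, nu):
    """Exponent (divided by i) of the rescaling P^mu_nu -> P-hat^mu_nu."""
    return -theta[mu][nu] + Fraction(1, 2) * (theta[mu][mu] + theta[nu][nu])


def hat_rescaling_restores_matrix_product(theta):
    """True if the phases cancel for every (mu, alpha, nu), so that the
    bullet matrix product of hatted entries is the hatted classical product."""
    n = len(theta)
    return all(
        hat_exponent(theta, mu, al)
        + hat_exponent(theta, al, nu)
        + twist_exponent(theta, mu, al, al, nu)
        == hat_exponent(theta, mu, nu)
        for mu, al, nu in product(range(n), repeat=3)
    )


# Arbitrary illustrative matrix (an assumption, not data from the paper).
theta_example = [
    [Fraction(v) for v in row]
    for row in [[3, -1, 4, 1], [5, 9, -2, 6], [-5, 3, 5, 8], [9, 7, -9, 3]]
]
print(hat_rescaling_restores_matrix_product(theta_example))
```

The cancellation is an identity in the entries of $\theta$, so the check does not need the zero row and column sums; those conditions are needed for the cocycle to descend to the quotient group, not for this step.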

Thus we see for example that $\mathbb{C}_F[\mathbb{CP}^3]$ has quantised commutation relations for the matrix entries of generators $\hat Q$ with further relations $\mathrm{Tr}\, \hat Q = 1$ and ∗-structure $\hat Q^\dagger = \hat Q$, i.e. exactly the same form for the matrix-$\bullet$ relations as in the classical case. Applying these computations but now with $\mathrm{Tr}\, P = 2$ for the matrix generator $P$, we obtain $\mathbb{C}_F[\mathbb{CM}^\#]$ in exactly the same way but with $\mathrm{Tr}\, \hat P = 2$, so that in the projector picture we cover both cases at the same time but with different values for the trace. For $\mathbb{C}_F[\mathcal F]$ we have projectors $\hat Q$ of trace 1 and $\hat P$ of trace 2. Their products are deformed in the same manner as the $P$-$P$ relations, leading to
\[ \hat P^\mu_\nu \bullet \hat Q^\alpha_\beta = e^{2\imath(\theta^-_{\mu\alpha} - \theta^-_{\nu\alpha} + \theta^-_{\nu\beta} - \theta^-_{\mu\beta})}\, \hat Q^\alpha_\beta \bullet \hat P^\mu_\nu \]
for the quantised commutation relations between entries. Moreover, $\hat P \bullet \hat Q = \hat Q = \hat Q \bullet \hat P$ by a similar computation as for $\hat P^2 = \hat P$. In particular, we see that $\hat Q \in M_4(\mathbb{C}_F[\mathbb{CP}^3])$ and $\hat P \in M_4(\mathbb{C}_F[\mathbb{CM}^\#])$ are projectors which define tautological quantum vector bundles over these quantum spaces and their pull-backs to $\mathbb{C}_F[\mathcal F]$.

Rather than proving all these facts for each algebra, let us prove them for the quantisation of any real manifold $X \subset M_n(\mathbb{C})^r$ defined as the set of $r$-tuples of matrices $P_1, \ldots, P_r$ obeying relations defined by the operations of: (a) matrix product; (b) trace; (c) the $(\ )^\dagger$ operation of Hermitian conjugation. We say that $X$ is defined by 'matrix relations'. Clearly any such $X$ has on it an action of $SU_n$ acting by conjugation. More precisely, we define $\mathbb{C}[X]$ to be the (possibly ∗-) algebra defined by treating the matrix entries $P_i{}^\mu{}_\nu$ as polynomial generators, the matrix relations as relations in the algebra, and $P_i^\dagger$ (when specified) as a definition of $(P_i{}^\nu{}_\mu)^*$. The coaction of $\mathbb{C}[SU_n]$ is
\[ \Delta_L P_i{}^\mu{}_\nu = a^\mu_a\, S a^c_\nu \otimes P_i{}^a{}_c. \]

We have already seen several examples of such coordinate algebras with matrix relations.

Proposition 6.1. Let $\mathbb{C}[X]$ be a ∗-algebra defined by 'matrix relations' among matrices of generators $P_i$, $i = 1, \ldots, r$. Its quantisation $\mathbb{C}_F[X]$ by cocycle cotwist using the cocycle above is the free associative algebra with matrices of generators $\hat P_i$ modulo the commutation relations
\[ \hat P_i{}^\mu{}_\nu \bullet \hat P_j{}^\alpha{}_\beta = e^{2\imath(\theta^-_{\mu\alpha} - \theta^-_{\nu\alpha} + \theta^-_{\nu\beta} - \theta^-_{\mu\beta})}\, \hat P_j{}^\alpha{}_\beta \bullet \hat P_i{}^\mu{}_\nu, \]

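and the matrix relations of $\mathbb{C}[X]$ with $P_i$ replaced by $\hat P_i$.

Proof. All the $P_i$ have the same coaction, hence for the deformed product for any $P, Q \in \{P_1, \ldots, P_r\}$ we have
\[ P^\mu_\nu \bullet Q^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\alpha} + \theta_{\nu\beta} - \theta_{\mu\beta})}\, P^\mu_\nu Q^\alpha_\beta \]
by the same computation as for the $P$-$P$ relations above. This implies the commutation relations stated for the entries of $P_i$ and hence of the $\hat P_i$. Again motivated by the example we define
\[ \hat P_i{}^\mu{}_\nu = e^{-\imath\theta_{\mu\nu} + \frac{\imath}{2}(\theta_{\mu\mu} + \theta_{\nu\nu})}\, P_i{}^\mu{}_\nu \]
and verify that
\[ \hat P \bullet \hat Q = \widehat{(PQ)}, \qquad (\hat P)^\dagger = \widehat{(P^\dagger)}, \qquad \mathrm{Tr}\, \hat P = \mathrm{Tr}\, P \]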


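for any $P$, $Q$ taken from our collection. Thus
\[ (\hat P \bullet \hat Q)^\mu{}_\nu = \hat P^\mu_\alpha \bullet \hat Q^\alpha_\nu = e^{-\imath\theta_{\mu\alpha} + \frac{\imath}{2}(\theta_{\mu\mu} + \theta_{\alpha\alpha}) - \imath\theta_{\alpha\nu} + \frac{\imath}{2}(\theta_{\alpha\alpha} + \theta_{\nu\nu})}\, P^\mu_\alpha \bullet Q^\alpha_\nu \]
\[ = e^{-\imath\theta_{\mu\alpha} + \frac{\imath}{2}(\theta_{\mu\mu} + \theta_{\alpha\alpha}) - \imath\theta_{\alpha\nu} + \frac{\imath}{2}(\theta_{\alpha\alpha} + \theta_{\nu\nu})}\, e^{\imath(\theta_{\mu\alpha} - \theta_{\alpha\alpha} + \theta_{\alpha\nu} - \theta_{\mu\nu})}\, P^\mu_\alpha Q^\alpha_\nu \]
\[ = e^{-\imath\theta_{\mu\nu} + \frac{\imath}{2}(\theta_{\mu\mu} + \theta_{\nu\nu})}\, P^\mu_\alpha Q^\alpha_\nu = \widehat{(PQ)}{}^\mu{}_\nu, \]
\[ (\hat P^\nu_\mu)^* = e^{\imath\bar\theta_{\nu\mu} - \frac{\imath}{2}(\bar\theta_{\nu\nu} + \bar\theta_{\mu\mu})}\, (P^\nu_\mu)^* = e^{-\imath\theta_{\mu\nu} + \frac{\imath}{2}(\theta_{\mu\mu} + \theta_{\nu\nu})}\, (P^\dagger)^\mu{}_\nu = \widehat{(P^\dagger)}{}^\mu{}_\nu, \]
where the second computation uses the reality condition $\theta^\dagger = -\theta$.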

The proof for the trace is immediate from the definition. We also note that $\hat\delta^\mu_\nu = \delta^\mu_\nu$ as the quantisation of the constant identity projector (the identity for matrix multiplication).

For example, we now have the quantisation $\mathbb{C}_F[\mathcal F_{k_1, \ldots, k_r}(\mathbb{C}^n)]$ of all flag varieties with projectors $\hat P_i$ having this new form of commutation relations for their matrix entries, but with matrix-$\bullet$ products having the same form as in the classical case given in Sect. 3. Again, the $\hat P_i$ define $r$ tautological projectors with values in the quantum algebra and hence $r$ tautological classes in the noncommutative $K$-theory, strictly quantising the commutative situation.

Let us also make some immediate observations from the relations in the proposition. We see that diagonal elements $\hat P_i{}^\mu{}_\mu$ (no sum) are central. We also see that $\hat P_i{}^\mu{}_\nu$ and $\hat P_i{}^\nu{}_\mu = (\hat P_i{}^\mu{}_\nu)^*$ commute (so all matrix entry generators are normal in the ∗-algebra sense). On the other hand, it is clear that non-trivial commutation relations arise when three of the four indices are different, and that if we take the adjoint of a generator on both sides of a commutation relation, we should also invert the commutation factor. This means that elements of the form $(\hat P_i{}^\mu{}_\nu)^* \hat P_i{}^\mu{}_\nu$ (no summation) are always central.

These observations mean that $\mathbb{C}_F[\mathbb{CP}^1]$ is necessarily undeformed in the new generators. For a non-trivial deformation the smallest example is then $\mathbb{C}_F[\mathbb{CP}^2]$. Writing its matrix generator as
\[ \hat Q = \begin{pmatrix} a & x & y \\ x^* & b & z \\ y^* & z^* & c \end{pmatrix}, \]
one has $a$, $b$, $c$ self-adjoint and central, with $a + b + c = 1$ and
\[ xy = e^{\imath\theta} yx, \qquad yz = e^{\imath\theta} zy, \qquad xz = e^{\imath\theta} zx, \qquad \theta = 2(\theta^-_{12} + \theta^-_{23} + \theta^-_{31}) \]

and the projection relations exactly as stated in Proposition 3.4 (whose statement and proof assumed only that $a$, $b$, $c$, $x^* x$, $y^* y$, $z^* z$ are central; no other commutation relations were actually needed). Also note that since $a$, $b$, $c$ are central it is natural to set them to constants even in the quantum case. In the quotient $a = b = c = \tfrac{1}{3}$ we can define $U = 3x$, $W = 3y$, $V = 3z$. Then we have the algebra
\[ UV = W, \qquad U^* = U^{-1}, \quad V^* = V^{-1}, \quad W^* = W^{-1}, \]
\[ UW = e^{\imath\theta} WU, \qquad WV = e^{\imath\theta} VW, \qquad UV = e^{\imath\theta} VU. \]
Actually this is just the usual noncommutative torus $\mathbb{C}_\theta[S^1 \times S^1]$ with $W$ defined by the above relations and no additional constraints. Similarly in general, for any actual values with $b, c > 0$ and $b + c < 1$, we will have the same result but with different rescaling


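factors for $U$, $V$, $W$, so that once again we have noncommutative tori as quantum versions of a family of inclusions $S^1 \times S^1 \subset \mathbb{CP}^2$. We can consider this family as a quantum analogue of $\mathbb{C}^* \times \mathbb{C}^* \subset \mathbb{CP}^2$. This conforms to our expectation of $\mathbb{C}_F[\mathbb{CP}^2]$ as a 'quantum toric variety'. Moreover, by arguments analogous to the classical case given in Sect. 3.2, we can view the localisation $\mathbb{C}_F[\mathbb{CP}^2][a^{-1}]$ as a punctured quantum $S^4$ with generators $x$, $y$, $a$, $a^{-1}$ and the relations
\[ xy = e^{\imath\theta} yx, \qquad x^* x + y^* y = a(1 - a), \]
with $a$ central.

One can also check that the cotriangular Hopf ∗-algebra $\mathbb{C}_F[SU_n]$ coacts on $\mathbb{C}_F[X]$ now as the quantum version of our classical coactions, as is required by the general theory, and that these quantised spaces may be realised as quantum homogeneous spaces if one wishes. Again from general theory, these spaces are commutative with respect to the induced involutive braiding built from $\theta^-$ and appearing in the commutation relations for the matrix entries.

Finally, the differential calculi are twisted by the same methods as in Sect. 5. The formulae are similar to the deformation of the coordinate algebras, with the insertion of $d$ just as before. Thus $\Omega(\mathbb{C}_F[SU_n])$ has structure
\[ a^\mu_\nu \bullet da^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\beta})}\, a^\mu_\nu\, da^\alpha_\beta, \qquad da^\mu_\nu \bullet a^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\beta})}\, da^\mu_\nu\, a^\alpha_\beta, \]
\[ da^\mu_\nu \bullet da^\alpha_\beta = e^{\imath(\theta_{\mu\alpha} - \theta_{\nu\beta})}\, da^\mu_\nu \wedge da^\alpha_\beta, \]
and commutation relations
\[ \hat a^\mu_\nu \bullet d\hat a^\alpha_\beta = e^{2\imath(\theta^-_{\mu\alpha} - \theta^-_{\nu\beta})}\, d\hat a^\alpha_\beta \bullet \hat a^\mu_\nu, \qquad d\hat a^\mu_\nu \bullet d\hat a^\alpha_\beta = -e^{2\imath(\theta^-_{\mu\alpha} - \theta^-_{\nu\beta})}\, d\hat a^\alpha_\beta \bullet d\hat a^\mu_\nu. \]
Similarly, for the quantisation $\mathbb{C}_F[X]$ above of an algebra with matrix relations, $\Omega(\mathbb{C}_F[X])$ has commutation relations
\[ \hat P_i{}^\mu{}_\nu \bullet d\hat P_j{}^\alpha{}_\beta = e^{2\imath(\theta^-_{\mu\alpha} - \theta^-_{\nu\alpha} + \theta^-_{\nu\beta} - \theta^-_{\mu\beta})}\, d\hat P_j{}^\alpha{}_\beta \bullet \hat P_i{}^\mu{}_\nu, \]
\[ d\hat P_i{}^\mu{}_\nu \bullet d\hat P_j{}^\alpha{}_\beta = -e^{2\imath(\theta^-_{\mu\alpha} - \theta^-_{\nu\alpha} + \theta^-_{\nu\beta} - \theta^-_{\mu\beta})}\, d\hat P_j{}^\alpha{}_\beta \bullet d\hat P_i{}^\mu{}_\nu. \]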
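The same phase exponent governs the coordinate relations and the relations in the calculus, so its elementary symmetries are worth making explicit. The sketch below is illustrative only and not part of the original text: function names and the sample matrix are assumptions, exponents are divided by $\imath$, and arithmetic is exact. It checks three properties used in the discussion after Proposition 6.1: exchanging the two generators inverts the factor, diagonal entries are central, and an entry commutes with its transposed partner (hence with its adjoint).

```python
from fractions import Fraction
from itertools import product


def antisymmetrise(theta):
    """theta^-_{mu nu} = (theta_{mu nu} - theta_{nu mu}) / 2."""
    n = len(theta)
    return [
        [Fraction(theta[m][k] - theta[k][m], 2) for k in range(n)]
        for m in range(n)
    ]


def commutation_exponent(theta_minus, mu, nu, alpha, beta):
    """Exponent (divided by i) of the factor relating
    P-hat^mu_nu . P-hat^alpha_beta to the product in reversed order."""
    t = theta_minus
    return 2 * (t[mu][alpha] - t[nu][alpha] + t[nu][beta] - t[mu][beta])


def check_symmetries(theta):
    """Return three booleans: (swap inverts factor, diagonal entries
    central, entry commutes with its transposed partner)."""
    tm = antisymmetrise(theta)
    idx = range(len(theta))
    swap = all(
        commutation_exponent(tm, al, be, mu, nu)
        == -commutation_exponent(tm, mu, nu, al, be)
        for mu, nu, al, be in product(idx, repeat=4)
    )
    diagonal_central = all(
        commutation_exponent(tm, mu, mu, al, be) == 0
        for mu, al, be in product(idx, repeat=3)
    )
    normal = all(
        commutation_exponent(tm, mu, nu, nu, mu) == 0
        for mu, nu in product(idx, repeat=2)
    )
    return swap, diagonal_central, normal


# Arbitrary illustrative integer matrix (an assumption, not from the paper).
theta_example = [[3, -1, 4, 1], [5, 9, -2, 6], [-5, 3, 5, 8], [9, 7, -9, 3]]
print(check_symmetries(theta_example))
```

Only the antisymmetric part of $\theta$ enters, which is why the sketch antisymmetrises first; the symmetric part is absorbed by the passage to the hatted generators.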

6.1. Quantum $\mathbb{C}_F[\mathbb{CM}^\#]$, $\mathbb{C}_F[S^4]$ and the quantum instanton. In this section we specialise the above general theory to $\mathbb{C}[\mathbb{CM}^\#]$ and its cocycle twist quantisation. We also find a natural one-parameter family of the $\theta_{\mu\nu}$-parameters for which $\mathbb{C}_F[\mathbb{CM}^\#]$ has a ∗-algebra quotient $\mathbb{C}_F[S^4]$. We find that this recovers the $S^4_\theta$ previously introduced by Connes and Landi [6] and that the quantum tautological bundle (as a projective module) on $\mathbb{C}_F[\mathbb{CM}^\#]$ pulls back in this case to a bundle with Grassmann connection equal to the Landi-van Suijlekom noncommutative basic instanton found in [11]. This is very different from the approach in [11].

We start with some notations. Since here $n = 4$, $\theta_{\mu\nu}$ is a $4 \times 4$ matrix with all rows and columns summing to zero. For convenience we limit ourselves to the case where $\theta$ is real and hence (see above) antisymmetric (only the antisymmetric part enters into the commutation relations so this is no real loss). As a result it is equivalent to giving a $3 \times 3$ real antisymmetric matrix, i.e. it has within it only three independent parameters.


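Lemma 6.2. The parameters
\[ \theta_A = \theta_{12} + \theta_{23} + \theta_{31}, \qquad \theta_B = \theta_{23} + \theta_{34} + \theta_{42}, \qquad \theta = \theta_{13} - \theta_{23} + \theta_{24} - \theta_{14} \]
determine any antisymmetric theta completely.

Proof. We write out the three independent equations $\sum_j \theta_{ij} = 0$ for $i = 1, 2, 3$ as
\[ \theta_{12} + \theta_{13} + \theta_{14} = 0, \qquad -\theta_{12} + \theta_{23} + \theta_{24} = 0, \qquad -\theta_{13} - \theta_{23} + \theta_{34} = 0, \]
then sum the first two and sum the last two to give
\[ \theta_{13} + \theta_{23} + \theta_{14} + \theta_{24} = 0, \qquad \theta_{24} + \theta_{34} - \theta_{12} - \theta_{13} = 0. \]
Adding the first equation to the one for $\theta$ tells us that $\theta = 2(\theta_{13} + \theta_{24})$, while adding the second equation to $\theta_A - \theta_B$ tells us that $\theta_A - \theta_B = 2(\theta_{24} - \theta_{13})$. Hence, knowing $(\theta, \theta_A - \theta_B)$ is equivalent to knowing $\theta_{13}$, $\theta_{24}$. Finally,
\[ \theta_A + \theta_B = \theta_{12} + 2\theta_{23} + \theta_{34} - \theta_{13} - \theta_{24} = \theta_{12} + 3\theta_{23} - \theta_{24} = 4(\theta_{12} - \theta_{24}), \]
using the third of our original three equations to identify $\theta_{34} - \theta_{13}$ with $\theta_{23}$ and then the second of our original three equations to replace $\theta_{23}$ by $\theta_{12} - \theta_{24}$. Hence knowing $\theta$, $\theta_A - \theta_B$, we see that knowing $\theta_A + \theta_B$ is equivalent to knowing $\theta_{12}$. The remaining $\theta_{14}$, $\theta_{23}$, $\theta_{34}$ are determined from our original three equations. This completes the proof, which also provides the explicit formulae:
\[ \theta_{24} = \tfrac{1}{4}(\theta + \theta_A - \theta_B), \qquad \theta_{13} = \tfrac{1}{4}(\theta - \theta_A + \theta_B), \qquad \theta_{12} = \tfrac{1}{4}\theta + \tfrac{1}{2}\theta_A, \]
\[ \theta_{14} = -\tfrac{1}{2}\theta - \tfrac{1}{4}\theta_A - \tfrac{1}{4}\theta_B, \qquad \theta_{23} = \tfrac{1}{4}\theta_A + \tfrac{1}{4}\theta_B, \qquad \theta_{34} = \tfrac{1}{4}\theta + \tfrac{1}{2}\theta_B. \]

We are now ready to compute the commutation relations between the entries of
\[ \hat P = \begin{pmatrix} \hat A & \hat B \\ \hat B^\dagger & \hat D \end{pmatrix}, \qquad \hat B = \begin{pmatrix} z & \tilde w \\ w & \tilde z \end{pmatrix} \]
as in (7), except that now the matrix entry generators are for the quantum algebra $\mathbb{C}_F[\mathbb{CM}^\#]$ (we omit their hats). From the general remarks after Proposition 6.1 we know that the generators along the diagonal, i.e. $a$, $\alpha_3$, $\delta_3$, are central (and self-conjugate under ∗). Also from general remarks we know that all matrix entry generators of $\hat P$ are normal (they commute with their own conjugate under ∗). Moreover, it is easy to see that if $xy = \lambda yx$ is a commutation relation between any two matrix entries then so is $x y^* = \bar\lambda y^* x$, again due to the form of the factors in Proposition 6.1.

Proposition 6.3. The non-trivial commutation relations of $\mathbb{C}_F[\mathbb{CM}^\#]$ are
\[ \alpha\delta = e^{2\imath\theta}\delta\alpha, \]
\[ \alpha z = e^{2\imath\theta_A} z\alpha, \qquad \alpha w = e^{2\imath\theta_A} w\alpha, \qquad \alpha\tilde z = e^{2\imath(\theta + \theta_A)}\tilde z\alpha, \qquad \alpha\tilde w = e^{2\imath(\theta + \theta_A)}\tilde w\alpha, \]
\[ \delta z = e^{-2\imath(\theta + \theta_B)} z\delta, \qquad \delta w = e^{-2\imath\theta_B} w\delta, \qquad \delta\tilde z = e^{-2\imath\theta_B}\tilde z\delta, \qquad \delta\tilde w = e^{-2\imath(\theta + \theta_B)}\tilde w\delta, \]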


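\[ z\tilde w = e^{2\imath(\theta + \theta_B)}\tilde w z, \qquad zw = e^{2\imath\theta_A} wz, \qquad z\tilde z = e^{2\imath(\theta + \theta_A + \theta_B)}\tilde z z, \]
\[ \tilde w w = e^{2\imath(\theta_A - \theta_B)} w\tilde w, \qquad \tilde w\tilde z = e^{2\imath(\theta + \theta_A)}\tilde z\tilde w, \qquad w\tilde z = e^{2\imath\theta_B}\tilde z w, \]
and similar relations with inverse coefficient when a generator is replaced by its conjugate under ∗. The further (projector) relations of $\mathbb{C}_F[\mathbb{CM}^\#]$ are exactly the same as stated in Corollary 3.3 except for the last two auxiliary relations:
\[ (\alpha_3 + \delta_3) \begin{pmatrix} z \\ \tilde z \end{pmatrix} = \begin{pmatrix} -\alpha & -e^{-2\imath(\theta + \theta_B)}\delta^* \\ e^{2\imath\theta_B}\delta & \alpha^* \end{pmatrix} \begin{pmatrix} w \\ \tilde w \end{pmatrix}, \]
\[ (\alpha_3 - \delta_3) \begin{pmatrix} w \\ \tilde w \end{pmatrix} = \begin{pmatrix} -\alpha^* & -e^{-2\imath\theta_B}\delta^* \\ e^{2\imath(\theta + \theta_B)}\delta & \alpha \end{pmatrix} \begin{pmatrix} z \\ \tilde z \end{pmatrix}. \]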

Proof. Here the product is the twisted $\bullet$ product which we do not denote explicitly. We use the commutation relations in Proposition 6.1, computing the various instances of $\theta_{ijkl} = \theta_{ik} - \theta_{jk} + \theta_{jl} - \theta_{il}$ in terms of the combinations in Lemma 6.2. This gives
\[ \alpha\hat B^i{}_1 = e^{2\imath\theta_A}\hat B^i{}_1\alpha, \qquad \alpha\hat B^i{}_2 = e^{2\imath(\theta + \theta_A)}\hat B^i{}_2\alpha, \qquad i = 1, 2, \]
\[ \delta\hat B^1{}_i = e^{-2\imath(\theta + \theta_B)}\hat B^1{}_i\delta, \qquad \delta\hat B^2{}_i = e^{-2\imath\theta_B}\hat B^2{}_i\delta, \qquad i = 1, 2, \]
\[ \hat B^1{}_1\hat B^1{}_2 = e^{2\imath(\theta + \theta_B)}\hat B^1{}_2\hat B^1{}_1, \qquad \hat B^1{}_1\hat B^2{}_1 = e^{2\imath\theta_A}\hat B^2{}_1\hat B^1{}_1, \]
\[ \hat B^1{}_1\hat B^2{}_2 = e^{2\imath(\theta + \theta_A + \theta_B)}\hat B^2{}_2\hat B^1{}_1, \qquad \hat B^1{}_2\hat B^2{}_1 = e^{2\imath(\theta_A - \theta_B)}\hat B^2{}_1\hat B^1{}_2, \]
\[ \hat B^1{}_2\hat B^2{}_2 = e^{2\imath(\theta + \theta_A)}\hat B^2{}_2\hat B^1{}_2, \qquad \hat B^2{}_1\hat B^2{}_2 = e^{2\imath\theta_B}\hat B^2{}_2\hat B^2{}_1, \]
which we write out more explicitly as stated. As explained above, the diagonal elements of $A$, $D$ are central and for general reasons the conjugate relations are as stated. Finally, we explicitly recompute the content of the noncommutative versions of (3)–(4) to find the relations required for $\hat P$ to be a projector (this is equivalent to computing the bullet product from the classical relations). The $a$, $\alpha$, $\alpha^*$ form a commutative subalgebra, as do $a$, $\delta$, $\delta^*$, so the calculations of $A(1 - A)$ and $(1 - D)D$ are not affected. We can compute $BB^\dagger$ without any commutativity assumptions, and in fact we stated all results from (3) in Corollary 3.3 carefully so as to still be correct without such assumptions. Being similarly careful for (4) gives the remaining two auxiliary equations (without any commutativity assumptions) as
\[ (\alpha_3 + \delta_3) \begin{pmatrix} z \\ \tilde z \end{pmatrix} = \begin{pmatrix} -\tilde w\delta^* - \alpha w \\ w\delta + \alpha^*\tilde w \end{pmatrix}, \qquad (\alpha_3 - \delta_3) \begin{pmatrix} w \\ \tilde w \end{pmatrix} = \begin{pmatrix} \tilde z\delta^* + \alpha^* z \\ -z\delta - \alpha\tilde z \end{pmatrix}, \]
which we write in 'matrix' form using the above deformed commutation relations.
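The bookkeeping in Lemma 6.2 and Proposition 6.3 is easy to get wrong by hand, so a small exact-arithmetic check may be useful. The sketch below is not from the original text. It assumes that $\alpha$ and $\delta$ sit at positions $(1,2)$ and $(3,4)$ of $\hat P$, and that $z$, $\tilde w$, $w$, $\tilde z$ sit at $(1,3)$, $(1,4)$, $(2,3)$, $(2,4)$, as the block form suggests; all function and variable names are illustrative. It rebuilds $\theta_{\mu\nu}$ from $(\theta, \theta_A, \theta_B)$ using the explicit formulae of the lemma, confirms that all rows sum to zero, and compares $2\theta_{ijkl}$ with the exponents listed in the proposition.

```python
from fractions import Fraction


def theta_matrix(theta, theta_a, theta_b):
    """Antisymmetric matrix from Lemma 6.2 (1-based; index 0 unused)."""
    t, a, b = Fraction(theta), Fraction(theta_a), Fraction(theta_b)
    upper = {
        (1, 2): t / 4 + a / 2,
        (1, 3): (t - a + b) / 4,
        (1, 4): -t / 2 - a / 4 - b / 4,
        (2, 3): (a + b) / 4,
        (2, 4): (t + a - b) / 4,
        (3, 4): t / 4 + b / 2,
    }
    m = [[Fraction(0)] * 5 for _ in range(5)]
    for (i, j), value in upper.items():
        m[i][j] = value
        m[j][i] = -value
    return m


# Assumed positions of the generators inside the 4x4 matrix P-hat.
POSITION = {"alpha": (1, 2), "delta": (3, 4),
            "z": (1, 3), "wt": (1, 4), "w": (2, 3), "zt": (2, 4)}


def relation_exponent(m, g, h):
    """Exponent (divided by i) in g h = exp(i * exponent) h g."""
    (mu, nu), (al, be) = POSITION[g], POSITION[h]
    return 2 * (m[mu][al] - m[nu][al] + m[nu][be] - m[mu][be])


def verify(theta, theta_a, theta_b):
    """Check zero row sums and every exponent of Proposition 6.3."""
    t, a, b = Fraction(theta), Fraction(theta_a), Fraction(theta_b)
    m = theta_matrix(t, a, b)
    expected = {
        ("alpha", "delta"): 2 * t,
        ("alpha", "z"): 2 * a, ("alpha", "w"): 2 * a,
        ("alpha", "zt"): 2 * (t + a), ("alpha", "wt"): 2 * (t + a),
        ("delta", "z"): -2 * (t + b), ("delta", "w"): -2 * b,
        ("delta", "zt"): -2 * b, ("delta", "wt"): -2 * (t + b),
        ("z", "wt"): 2 * (t + b), ("z", "w"): 2 * a,
        ("z", "zt"): 2 * (t + a + b), ("wt", "w"): 2 * (a - b),
        ("wt", "zt"): 2 * (t + a), ("w", "zt"): 2 * b,
    }
    rows_ok = all(sum(m[i][1:]) == 0 for i in range(1, 5))
    relations_ok = all(relation_exponent(m, g, h) == e
                       for (g, h), e in expected.items())
    return rows_ok and relations_ok


print(verify(7, 3, -5))
```

Since every quantity is linear in $(\theta, \theta_A, \theta_B)$, agreement on a few generic rational triples is strong evidence for the table, though it is no substitute for the proof.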


Note that the 'Cartesian' decomposition $\hat B = t + \imath\, x \cdot \sigma$ may also be computed but it involves sin and cos factors, whereas the 'twistor' coordinates, where we work with $\hat B^i{}_j$ directly as generators, have simple phase factors as above.

Next, we look at the possible cases where $\mathbb{C}_F[\mathbb{CM}^\#]$ has a quotient analogous to $\mathbb{C}[S^4]$ in the classical case. We saw in the classical case that $\alpha = \delta = 0$ and $t$, $x$ are Hermitian, or equivalently that
\[ z^* = \tilde z, \qquad w^* = -\tilde w. \tag{22} \]
Now in the quantum case the ∗-operations on the entries of $\hat P$ are given by a multiple of the undeformed ∗-operations (as shown in the proof of Proposition 6.1). Hence the analogous relations in $\mathbb{C}_F[S^4]$, if it exists as a ∗-algebra quotient, will have the same form as (22) but with some twisting factors.

Proposition 6.4. The twisting quantisation $\mathbb{C}_F[\mathbb{CM}^\#]$ is compatible with the ∗-algebra quotient $\mathbb{C}[\mathbb{CM}^\#] \to \mathbb{C}[S^4]$ if and only if $\theta_A = \theta_B = -\tfrac{1}{2}\theta$. In this case
\[ z^* = e^{\frac{\imath}{2}\theta}\,\tilde z, \qquad w^* = -e^{-\frac{\imath}{2}\theta}\,\tilde w. \]

Proof. If the twisting quantisation is compatible with the ∗-algebra quotient, we have
\[ \hat B^1{}_1 = e^{-\imath\theta_{13}} B^1{}_1, \qquad \hat B^1{}_2 = e^{-\imath\theta_{14}} B^1{}_2, \qquad \hat B^2{}_1 = e^{-\imath\theta_{23}} B^2{}_1, \qquad \hat B^2{}_2 = e^{-\imath\theta_{24}} B^2{}_2, \]
from which we deduce the required ∗-operations for the quotient. For example, $(\hat B^1{}_1)^* = e^{\imath\theta_{13} + \imath\theta_{24}}\,\hat B^2{}_2$, and we use the above lemma to identify the factor here as $e^{\imath\theta/2}$. Therefore we obtain the formulae as stated for the ∗-structure necessarily in the quotient. Next, working out $\mathbb{C}_F[\mathbb{CM}^\#]$ using Proposition 6.3 we have on the one hand $z\tilde w^* = e^{-2\imath(\theta + \theta_B)}\tilde w^* z$, and on the other hand $zw = e^{2\imath\theta_A} wz$. For these to coincide as needed by any relation of the form (22) (independently of any deformation factors there) we need $-(\theta + \theta_B) = \theta_A$. Similarly for compatibility of the $z^* w$ relation with the $\tilde z w$ relation, we need $\theta_A = \theta_B$. This determines $\theta_A$, $\theta_B$ as stated for the required quotient to be a ∗-algebra quotient. These are also sufficient as far as the commutation relations are concerned. The precise form of ∗-structure stated allows one to verify the other relations in the quotient as well.

We see that while $\mathbb{C}_F[\mathbb{CM}^\#]$ has a three-parameter deformation, there is only a one-parameter deformation that pulls back to $\mathbb{C}_F[S^4]$. The latter has only $a$, $z$, $w$, $z^*$, $w^*$ as generators with relations
\[ [a, z] = [a, w] = 0, \qquad zw = e^{-\imath\theta} wz, \qquad zw^* = e^{\imath\theta} w^* z, \qquad z^* z + w^* w = a(1 - a), \]
which after a minor change of variables is exactly the $S^4_\theta$ in [6]. The 'pull-back' of the projector $\hat P$ to $\mathbb{C}_F[S^4]$ is
\[ \hat e = \begin{pmatrix} a & \hat B \\ \hat B^\dagger & 1 - a \end{pmatrix}, \qquad a^* = a, \qquad \hat B = \begin{pmatrix} z & -e^{\frac{\imath\theta}{2}} w^* \\ w & e^{-\frac{\imath\theta}{2}} z^* \end{pmatrix}, \]


which up to the change of notations is the 'defining projector' in the Connes-Landi approach to $S^4$. Whereas it is obtained in [6] from considerations of cyclic cohomology, we obtain it by a straightforward twisting-quantisation. In view of Proposition 3.2 we define the noncommutative basic one-instanton to be the Grassmann connection for the projector $\hat e$ on $E = \mathbb{C}_F[S^4]^4\,\hat e$. This should not come as any surprise since the whole point in [6] was to define the noncommutative $S^4$ by a projector generating the K-theory as the one-instanton bundle does classically. However, we now obtain $\hat e$ not by this requirement but by twisting-quantisation and as a 'pull-back' of the tautological bundle on $\mathbb{C}_F[\mathbb{CM}^\#]$.

Finally, our approach also canonically constructs $\Omega(\mathbb{C}_F[\mathbb{CM}^\#])$ and one may check that this quotients in the one-parameter case to $\Omega(\mathbb{C}_F[S^4])$, coinciding with the calculus used in [6]. As explained above, the classical (anti)commutation relations are modified by the same phase factors as in the commutation relations above. One may then obtain explicit formulae for the instanton connection and for the Grassmann connection on $\mathbb{C}_F[\mathbb{CM}^\#]^4\,\hat P$.

6.2. Quantum twistor space $\mathbb{C}_F[\mathbb{CP}^3]$. In our ∗-algebra approach the classical algebra $\mathbb{C}[\mathbb{CP}^3]$ has a matrix of generators $Q^\mu{}_\nu$ with exactly the same form as for $\mathbb{C}[\mathbb{CM}^\#]$, with the only difference being now $\mathrm{Tr}\, Q = 1$, which significantly affects the content of the 'projector' relations of the ∗-algebra. However, the commutation relations in the quantum case $\mathbb{C}_F[\mathbb{CP}^3]$ according to Proposition 6.1 have exactly the same form as $\mathbb{C}_F[\mathbb{CM}^\#]$ if we use the same cocycle $F$. Hence the commutation relations between different matrix entries in the quantum case can be read off from Proposition 6.3. We describe them in the special one-parameter case found in the previous section where $\theta_A = \theta_B = -\tfrac{1}{2}\theta$. Using the same notations as in Sect. 3.2 but now with potentially noncommutative generators, $\mathbb{C}_F[\mathbb{CP}^3]$ has a matrix of generators
\[ \hat Q = \begin{pmatrix} a & x & y & z \\ x^* & b & w & v \\ y^* & w^* & c & u \\ z^* & v^* & u^* & d \end{pmatrix}, \qquad a^* = a,\ b^* = b,\ c^* = c,\ d^* = d, \qquad a + b + c + d = 1 \]
(we omit hats on the generators). We will use the same shorthand $X = x^* x$ etc. as before. As we know on general grounds above, any twisting quantisation $\mathbb{C}_F[\mathbb{CP}^3]$ has all entries of the quantum matrix $\hat Q$ normal (commuting with their adjoints), and the diagonal elements and quantum versions of $X, Y, Z, U, V, W$ central. Moreover, all proofs and statements in Sect. 3.2 were given whilst being careful not to assume that $x, y, z, u, v, w$ mutually commute, only that these elements are normal and $X, Y, Z, U, V, W$ central. Hence the relations stated there are also exactly the projector relations for this algebra:

Proposition 6.5. For the one-parameter family of cocycles $\theta_A = \theta_B = -\tfrac{1}{2}\theta$ the quantisations $\mathbb{C}_F[\mathbb{CP}^3]$ and $\mathbb{C}^-_F[\mathbb{CP}^3]$ have exactly the projection relations as in Propositions 3.5 and 3.7 but now with the commutation relations
\[ xz = e^{\imath\theta} zx, \qquad yx = e^{\imath\theta} xy, \qquad yz = e^{\imath\theta} zy, \]
and the auxiliary commutation relations
\[ uv = e^{\imath\theta} vu, \qquad uw = e^{\imath\theta} wu, \qquad vw = e^{\imath\theta} wv, \]


\[ x(u, v, w) = (e^{2\imath\theta} u,\ e^{\imath\theta} v,\ e^{-\imath\theta} w)\,x, \qquad y(u, v, w) = (e^{\imath\theta} u,\ v,\ e^{-\imath\theta} w)\,y, \qquad z(u, v, w) = (e^{\imath\theta} u,\ e^{\imath\theta} v,\ w)\,z, \]
and similar relations with inverse factor if any generator in a relation is replaced by its adjoint under ∗.

Proof. As explained, the commutation relations are the same as for the entries of $\hat P$ in $\mathbb{C}_F[\mathbb{CM}^\#]$ with a different notation of the matrix entries. We read them off and specialise to the one-parameter case of interest. The 'auxiliary' set of relations is deduced from those among the $x, y, z$ if $a \neq 0$, since in this case $u, v, w$ are given in terms of these and their adjoints.

If we localise by inverting $a$, then by analogous arguments to the classical case, the resulting 'patch' of $\mathbb{C}^-_F[\mathbb{CP}^3]$ becomes a quantum punctured $S^6$ with complex generators $x, y, z$, invertible central self-adjoint generator $a$ and commutation relations as above, and the relation
\[ x^* x + y^* y + z^* z = a(1 - a). \]
Also by the same arguments as in the classical case, if we set $a, b, c, d$ to actual fixed numbers (which still makes sense since they are central) then
\[ \mathbb{C}^-_F[\mathbb{CP}^3]\big|_{b, c, d > 0,\ b + c + d < 1} = \mathbb{C}_\theta[S^1 \times S^1 \times S^1], \]

where the right hand side has relations as above for three circles but now with commutation relations between the $x, y, z$ circle generators as stated in the proposition. For each set of values of $b, c, d$ we have a quantum analogue of $S^1 \times S^1 \times S^1 \subset \mathbb{CP}^3$, and if we leave them undetermined then in some sense a quantum version of $\mathbb{C}^* \times \mathbb{C}^* \times \mathbb{C}^* \subset \mathbb{CP}^3$, so that $\mathbb{C}^-_F[\mathbb{CP}^3]$ is in this sense a 'quantum toric variety'. We have seen the same pattern of results already for $\mathbb{C}_F[\mathbb{CP}^2]$ and $\mathbb{C}[\mathbb{CP}^1]$.

Having established the quantum versions of $\mathbb{C}[S^4]$ and twistor space $\mathbb{C}^-[\mathbb{CP}^3]$, we now investigate the quantum version of the fibration $\mathbb{CP}^3 \to S^4$. In terms of coordinate algebras one has an antilinear involution $J : \mathbb{C}^-_F[\mathbb{CP}^3] \to \mathbb{C}^-_F[\mathbb{CP}^3]$ analogous to Lemma 3.9. The form of $J$, however, has to be modified by some phase factors to fit the commutation relations of Proposition 6.5, and is now given by
\[ J(y) = e^{\imath\theta} v^*, \qquad J(y^*) = e^{-\imath\theta} v, \qquad J(v) = e^{-\imath\theta} y^*, \qquad J(v^*) = e^{\imath\theta} y, \]
\[ J(w) = -z^*, \qquad J(w^*) = -z, \qquad J(z) = -w^*, \qquad J(z^*) = -w, \]
\[ J(x) = -x, \qquad J(u) = -u, \qquad J(a) = b, \qquad J(b) = a. \]

The map $J$ then extends to $\mathbb{C}_F[\mathbb{CM}^\#]$ and by arguments analogous to those given in the previous section and in Sect. 3.2, the fixed point subalgebra under $J$ is once again precisely $\mathbb{C}_F[S^4]$. We arrive in this way at the analogous main conclusion, which we verify directly.

Proposition 6.6. There is an algebra inclusion
\[ \eta : \mathbb{C}_F[S^4] \to \mathbb{C}^-_F[\mathbb{CP}^3] \]
given by
\[ \eta(a) = a + b, \qquad \eta(z) = e^{\imath\theta} y + v^*, \qquad \eta(w) = w - z^*. \]


Proof. Once again the main relation to investigate is the image of the sphere relation $zz^* + ww^* = a(1 - a)$. Applying $\eta$ to the left hand side, we obtain
\[ \eta(zz^* + ww^*) = yy^* + e^{\imath\theta} yv + e^{-\imath\theta} v^* y^* + v^* v + ww^* - wz - z^* w^* + z^* z. \]
We now compute that
\[ a\,yv = y\,av = y x^* z = e^{-\imath\theta} x^* y z = e^{-\imath\theta} a\,wz, \]
where the first equality uses centrality of $a$, the second uses the projector relation $av = x^* z$ and the third uses Proposition 6.5. Similarly one obtains that
\[ b\,yv = e^{-\imath\theta} b\,wz, \qquad c\,yv = e^{-\imath\theta} c\,wz, \qquad d\,yv = e^{-\imath\theta} d\,wz, \]
so that adding these four relations now reveals that $yv = e^{-\imath\theta} wz$ in $\mathbb{C}^-_F[\mathbb{CP}^3]$. Finally using the relations in Proposition 3.8 (which are still valid as the projector relations in our noncommutative case) we see that
\[ \eta(zz^* + ww^*) = Y + V + W + Z = (a + b)(c + d) = (a + b)(1 - (a + b)) = \eta(a(1 - a)). \]
To verify the preservation of the algebra structure of $\mathbb{C}_F[S^4]$ under $\eta$ we also have to check the commutation relations, of which the non-trivial one is $zw = e^{-\imath\theta} wz$. Indeed,
\[ \eta(zw) = (e^{\imath\theta} y + v^*)(w - z^*) = e^{-\imath\theta}(w - z^*)(e^{\imath\theta} y + v^*) = \eta(e^{-\imath\theta} wz) \]
using the commutation relations in Proposition 6.5.

Just as in Sect. 3.2 we may compute the 'push-out' of the quantum instanton bundle along $\eta$, given by viewing the tautological projector $\hat e$ as an element $\tilde e \in M_4(\mathbb{C}^-_F[\mathbb{CP}^3])$. Explicitly, we have (following the method of Sect. 3.2)
\[ \tilde e = \begin{pmatrix} a + b & M \\ M^\dagger & 1 - (a + b) \end{pmatrix} \in M_4(\mathbb{C}^-_F[\mathbb{CP}^3]), \qquad M = \begin{pmatrix} y + v^* & -e^{\frac{\imath\theta}{2}}(w^* - z) \\ w - z^* & e^{-\frac{\imath\theta}{2}}(y^* + v) \end{pmatrix}, \]
and the auxiliary bundle over twistor space is then $\tilde E = \mathbb{C}^-_F[\mathbb{CP}^3]^4\,\tilde e$. In this way the quantum instanton may be thought of as coming from a bundle over quantum twistor space, just as in the classical case.
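The commutation relations of Proposition 6.5, on which the computations in this section rely, can be cross-checked against the general formula of Proposition 6.1. The sketch below is illustrative and not part of the original text. It assumes the positions $x = \hat Q^1{}_2$, $y = \hat Q^1{}_3$, $z = \hat Q^1{}_4$, $w = \hat Q^2{}_3$, $v = \hat Q^2{}_4$, $u = \hat Q^3{}_4$ read off from the matrix of generators, and it measures every exponent in units of $\theta$ (so $\theta = 1$). Substituting $\theta_A = \theta_B = -\tfrac{1}{2}\theta$ into the formulae of Lemma 6.2 gives $\theta_{13} = \theta_{24} = \tfrac{1}{4}\theta$, $\theta_{14} = \theta_{23} = -\tfrac{1}{4}\theta$ and $\theta_{12} = \theta_{34} = 0$, which is the matrix used here.

```python
from fractions import Fraction

Q = Fraction(1, 4)

# One-parameter theta matrix in units of theta (1-based; index 0 unused).
THETA = [[Fraction(0)] * 5 for _ in range(5)]
for (i, j), value in {(1, 3): Q, (2, 4): Q, (1, 4): -Q, (2, 3): -Q}.items():
    THETA[i][j] = value
    THETA[j][i] = -value

# Assumed positions of the off-diagonal generators of Q-hat.
POSITION = {"x": (1, 2), "y": (1, 3), "z": (1, 4),
            "w": (2, 3), "v": (2, 4), "u": (3, 4)}


def exponent(g, h):
    """Multiple of theta in the relation g h = exp(i * multiple * theta) h g."""
    (mu, nu), (al, be) = POSITION[g], POSITION[h]
    return 2 * (THETA[mu][al] - THETA[nu][al] + THETA[nu][be] - THETA[mu][be])


# Multiples of theta as listed in Proposition 6.5.
EXPECTED = {
    ("x", "z"): 1, ("y", "x"): 1, ("y", "z"): 1,
    ("u", "v"): 1, ("u", "w"): 1, ("v", "w"): 1,
    ("x", "u"): 2, ("x", "v"): 1, ("x", "w"): -1,
    ("y", "u"): 1, ("y", "v"): 0, ("y", "w"): -1,
    ("z", "u"): 1, ("z", "v"): 1, ("z", "w"): 0,
}


def proposition_6_5_holds():
    """True if every listed exponent agrees with the general formula."""
    return all(exponent(g, h) == e for (g, h), e in EXPECTED.items())


print(proposition_6_5_holds())
```

In particular the check shows that $y$ commutes with $v$ and $z$ commutes with $w$, which is what allows the cross terms in $\eta(zz^* + ww^*)$ to be compared directly.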

7. The Penrose-Ward Transform

The main application of the double fibration (2) is to study the Penrose-Ward transform between vector bundles over twistor space and vector bundles over space-time. The goal is to reformulate certain geometric data on space-time (anti-self-dual Yang-Mills fields) in terms of holomorphic data on twistor space. In the classical case, this correspondence gives rise to the well-known ADHM construction of instantons [1].

The idea is to use the double fibration (2) to transform vector bundles $\tilde E$ over twistor space $\mathbb{CP}^3$ into vector bundles $E$ over space-time $\mathbb{CM}^\#$, by pull-back along $p$ followed by direct image along $q$. As we shall see, if the bundle $\tilde E$ is holomorphic, this structure may be used to equip the resulting bundle $E$ with a certain connection, constructed in a canonical way from $\tilde E$. It turns out that the curvature of this connection satisfies the anti-self-dual Yang-Mills equations.

These fields do not in general admit an explicit global description, hence it is usual to work locally on some given subset $U \subset \mathbb{CM}^\#$, assumed to be connected and simply connected. We then write $W = q^{-1}(U)$ and $\hat U = p(W)$, so that the relevant picture is now
\[ \hat U \;\xleftarrow{\ p\ }\; W \;\xrightarrow{\ q\ }\; U. \tag{23} \]
Moreover, we assume that the intersection of $U$ with each α-plane is connected and simply connected (so that the fibres of the map $p$ have these properties also).

In fact we have already seen this transform in action, albeit in the simplified case where $U = S^4$, so that the double fibration collapses to a single fibration with $\hat U = \mathbb{CP}^3$. In Sect. 3.2 we gave a coordinate algebra description of this fibration, with the analogous quantum version computed in Sect. 6.2, cotwisted by the diagonal subgroup of $SU_4$. In this section we outline how the transform works in a different setting, namely between the affine piece $U = \mathbb{CM} \subset \mathbb{CM}^\#$ of space-time and the corresponding subset $\hat U = \mathbb{CP}^3 - \mathbb{CP}^1$ of twistor space. We firstly consider the transform applied to differential forms, before considering the general case. We also outline how the transform quantises, although we emphasise that the cotwist appropriate here is the one given by the translation subgroup, described in Sect. 4.

Although we restrict our attention to the special case of (2), our remarks are not specific to this example and one should keep in mind the picture of a pair of fibrations of homogeneous spaces
\[ G/H \;\longleftarrow\; G/R \;\longrightarrow\; G/K, \tag{24} \]
where $G$ is a Lie group with closed subgroups $H$, $K$, $R$ such that $R = H \cap K$ [7]. Here we should also add the assumption that the fibres of $q$ should be connected and compact, so as to ensure that the notion of direct image along $q$ makes sense (this is obviously the case for the twistor example), and that the fibres of $p$, $q$ are 'mutually transverse' in some suitable sense to be determined.

7.1. Localised coordinate algebras. As remarked in Sect. 1, of particular interest are the principal open sets $U_f \subset \mathbb{CM}^\#$, for which one can explicitly write down the coordinate algebra $\mathcal{O}_{\mathbb{CM}^\#}(U_f)$. In particular, since our intention here is to provide only an outline of the Penrose-Ward transform, we choose to be explicit and restrict our attention to the case of
\[ \mathcal{O}_{\mathbb{CM}^\#}(U_t) = \mathbb{C}[\mathbb{CM}^\#](t^{-1})_0, \]
the 'affine' piece of space-time, for which we have the inclusion of algebras $\mathbb{C}[\mathbb{CM}] \to \mathbb{C}[\mathbb{CM}^\#](t^{-1})_0$ given by
\[ x_1 \mapsto t^{-1} z, \qquad x_2 \mapsto t^{-1}\tilde z, \qquad x_3 \mapsto t^{-1} w, \qquad x_4 \mapsto t^{-1}\tilde w. \]
Let us also write $\tilde s := t^{-1} s$, so that the quadric relation in $\mathbb{C}[\mathbb{CM}^\#]$ becomes $\tilde s = x_1 x_2 - x_3 x_4$. Thus in the algebra $\mathbb{C}[\mathbb{CM}^\#](t^{-1})_0$ the generator $\tilde s$ is redundant and we may as well work with $\mathbb{C}[\mathbb{CM}]$.


In the language of Sect. 1, then, the ordinary points of space-time correspond to the region in $\mathbb{CM}^{\#}$ where $t \neq 0$, whereas the additional 'light cone at infinity' is given by the region where $t = 0$. Under the twistor correspondence of the double fibration (2), the set of twistors in $T$ which define α-planes through infinity is the region of $T$ whose homogeneous $Z$-coordinates have $Z^3 = Z^4 = 0$ (this is easily computed by setting $t = 0$ in Eqs. (1)). Projectively, it is a $\mathbb{CP}^1$ in $T$ described by the homogeneous coordinates $Z^1, Z^2$. We shall call the complement $T - \mathbb{CP}^1$ of this region the 'twistor space of $\mathbb{CM}$' and denote it $T_{\mathbb{CM}}$, referring to its homogeneous version $\tilde T_{\mathbb{CM}}$ as the homogeneous twistor space. The latter space has coordinate algebra given by $\mathbb{C}[\tilde T]$ with the added condition that $Z^3$ and $Z^4$ are not both zero. The twistor space $T_{\mathbb{CM}}$ is thus covered by two coordinate patches where $Z^3 \neq 0$ and where $Z^4 \neq 0$. These patches have coordinate algebras

$$\mathbb{C}[T_{Z^3}] := \mathbb{C}[\tilde T]((Z^3)^{-1})_0, \qquad \mathbb{C}[T_{Z^4}] := \mathbb{C}[\tilde T]((Z^4)^{-1})_0$$

respectively (so that the first algebra is generated by $Z^i/Z^3$, $i = 1, 2, 4$ and the second by $Z^j/Z^4$, $j = 1, 2, 3$). Note that when both $Z^3$ and $Z^4$ are non-zero the two algebras are isomorphic (even in the twisted case, since $Z^3, Z^4$ remain central in the algebra), with 'transition functions' $Z^\mu (Z^3)^{-1} \to Z^\mu (Z^4)^{-1}$, $\mu = 1, 2$. This isomorphism simply says that both coordinate patches look like $\mathbb{C}^3$, in agreement with our expectation that twistor space is a complex three-manifold. In passing from $\mathbb{CM}^{\#}$ to $\mathbb{CM}$ we delete the 'region at infinity' where $t = 0$, and similarly we obtain the corresponding twistor space $T_{\mathbb{CM}}$ by deleting the region where $Z^3 = Z^4 = 0$. At the homogeneous level this region has coordinate algebra

$$\mathbb{C}[\tilde T]/\langle Z^3 = Z^4 = 0\rangle \cong \mathbb{C}[Z^1, Z^2],$$

describing a $\mathbb{CP}^1$ at the projective level. In the twisted framework, the generators $Z^3, Z^4$ are central so the quotient still makes sense and we have $\mathbb{C}_F[\tilde T]/\langle Z^3 = Z^4 = 0\rangle \cong \mathbb{C}[Z^1, Z^2]$, which is also projectively a $\mathbb{CP}^1$.

At the level of the homogeneous correspondence space we also have $t^{-1}$ adjoined and $Z^3, Z^4$ not both zero, and we write $\mathbb{C}[\tilde F_t] := \mathbb{C}[\tilde F](t^{-1})_0$ for the corresponding coordinate algebra. We note that in terms of the $\mathbb{C}[\mathbb{CM}]$ generators, the relations (1) now read

$$Z^1 = x_2 Z^3 + x_3 Z^4, \qquad Z^2 = x_4 Z^3 + x_1 Z^4. \tag{25}$$

At the projective level, the correspondence space is also covered by two coordinate patches. Thus when $Z^3 \neq 0$ and when $Z^4 \neq 0$ we respectively mean

$$\mathbb{C}[F_{Z^3}] := \mathbb{C}[\tilde F_t]((Z^3)^{-1})_0, \qquad \mathbb{C}[F_{Z^4}] := \mathbb{C}[\tilde F_t]((Z^4)^{-1})_0.$$

Again when $Z^3$ and $Z^4$ are both non-zero these algebras are seen to be isomorphic under appropriate transition functions. Thus at the homogeneous level, the coordinate algebra $\mathbb{C}[\tilde F_t]$ describes a local trivialisation of the correspondence space in the form


$$\tilde F_t \cong \mathbb{C}^2 \times \mathbb{CM}.$$

The two coordinate patches $F_{Z^3}$ and $F_{Z^4}$ together give a trivialisation of the projective correspondence space in the form $F_t = \mathbb{CP}^1 \times \mathbb{CM}$.

Regarding the differential calculus of $\mathbb{C}[\mathbb{CM}^{\#}](t^{-1})_0$, it is easy to see that since $\mathrm{d}(t^{-1}t) = 0$, from the Leibniz rule we have $\mathrm{d}(t^{-1}) = -t^{-2}\mathrm{d}t$. Since $\mathrm{d}t$ is central even in the twisted calculus, adjoining this extra generator $t^{-1}$ causes no problems, and it remains to check that the calculus is well-defined in the degree zero subalgebra $\mathbb{C}[\mathbb{CM}^{\#}](t^{-1})_0$. Indeed, we see that for example

$$\mathrm{d}x_1 = \mathrm{d}(t^{-1}z) = t^{-1}\mathrm{d}z - (t^{-1}z)(t^{-1}\mathrm{d}t),$$

which is again of overall degree zero. Similar statements hold regarding the differential calculi of $\mathbb{C}[T_{Z^3}]$ and $\mathbb{C}[T_{Z^4}]$ as well as those of $\mathbb{C}[F_{Z^3}]$ and $\mathbb{C}[F_{Z^4}]$.

7.2. Pull-back and direct image. As discussed, the two main components of the transform involve the pull-back and direct image of vector bundles along the projections $p, q$. We briefly recall how these operations behave in the coordinate algebra setting. A holomorphic vector bundle $\tilde E$ over homogeneous twistor space $\tilde T$ is given by a finite rank $\mathbb{C}[\tilde T]$-module $\tilde E$, the space of holomorphic sections of $\tilde E$ (note that $\tilde E$ is necessarily trivial). The pull-back bundle $p^*\tilde E$ of $\tilde E$ along the projection $p : \tilde F_t \to \tilde T$ is given by the finite rank $\mathbb{C}[\tilde F_t]$-module

$$p^*\tilde E := \mathbb{C}[\tilde F_t] \otimes_{\mathbb{C}[\tilde T]} \tilde E.$$

Given a holomorphic vector bundle $E'$ over the correspondence space $F_t$, the direct image along $q$ is by definition the vector bundle $E := q_*E'$ whose fibre at $x \in \mathbb{CM}$ is the space $H^0(q^{-1}(x), E')$ of holomorphic sections of $E'$ restricted to the fibre $q^{-1}(x)$. Since each fibre is compact and connected, this space is finite-dimensional (hence the need to assume this in the general case (24)). Direct images are in our case easy to compute, since the correspondence space has the form $F_t = \mathbb{CP}^1 \times \mathbb{CM}$, as noted above. We therefore have that

$$H^0(F_t, \mathbb{C}) = H^0(\mathbb{CP}^1, \mathbb{C}) \otimes H^0(\mathbb{CM}, \mathbb{C}) \cong H^0(\mathbb{CM}, \mathbb{C}) = \mathbb{C}[\mathbb{CM}].$$

This simply says that holomorphic sections of the direct image bundle $E$ cannot depend on the $\mathbb{CP}^1$-coordinate (by Liouville's theorem) and may hence be identified with the holomorphic sections of $E'$ which are independent of the $\mathbb{CP}^1$-coordinate. Such a bundle $E'$ is described at the homogeneous level by a corresponding finite rank $\mathbb{C}[\tilde F_t]$-module $E'$. The above argument thus gives an easy way of computing the $\mathbb{C}[\mathbb{CM}]$-module $E = q_*E'$ of sections of the direct image $E$, by picking out the $\mathbb{C}[\mathbb{CM}]$-module of elements of $E'$ which are independent of the coordinates $Z^3, Z^4$ (the homogeneous coordinates for the $\mathbb{CP}^1$ fibre).
Of course, these arguments can be made rigorous in the usual way, although since our purpose here is to illustrate the compatibility of these methods with the theory of cotwisting, we defer the details to a sequel.


7.3. Differential aspects of the double fibration. We now consider the pull-back and direct image of one-forms on our algebras. Indeed, we shall examine how the differential calculi occurring in the fibration (2) are related and derive the promised 'transversality condition' required in order to transfer data from one side of the fibration to the other. For now, we consider only the classical (i.e. untwisted) situation. Initially we work at the homogeneous level, with $\mathbb{C}[\tilde T]$ and $\mathbb{C}[\tilde F_t]$. One may later pass to local coordinates by adjoining an inverse for either $Z^3$ or $Z^4$ as described above, although we shall not do this here as it involves issues of coordinate patching. We define

$$\Omega^1_p := \Omega^1\mathbb{C}[\tilde F_t]/p^*\Omega^1\mathbb{C}[\tilde T] \tag{26}$$

to be the set of relative one-forms (the one-forms which are dual to those vectors which are tangent to the fibres of $p$). Note that $\Omega^1_p$ is just the sub-bimodule of $\Omega^1\mathbb{C}[\tilde F_t]$ spanned by $\mathrm{d}\tilde s, \mathrm{d}x_1, \mathrm{d}x_2, \mathrm{d}x_3, \mathrm{d}x_4$, so that

$$\Omega^1\mathbb{C}[\tilde F_t] = p^*\Omega^1\mathbb{C}[\tilde T] \oplus \Omega^1_p. \tag{27}$$

There is of course an associated projection $\pi_p : \Omega^1\mathbb{C}[\tilde F_t] \to \Omega^1_p$, and hence an associated relative exterior derivative $\mathrm{d}_p$ given by composition of $\mathrm{d}$ with this projection,

$$\mathrm{d}_p : \mathbb{C}[\tilde F_t] \to \Omega^1_p, \qquad \mathrm{d}_p = \pi_p \circ \mathrm{d}. \tag{28}$$

Similarly, we define the relative two-forms by

$$\Omega^2_p := \Omega^2\mathbb{C}[\tilde F_t]/(p^*\Omega^1\mathbb{C}[\tilde T] \wedge \Omega^1\mathbb{C}[\tilde F_t])$$

so that $\mathrm{d}_p$ extends to a map $\mathrm{d}_p : \Omega^1_p \to \Omega^2_p$ by composing $\mathrm{d} : \Omega^1\mathbb{C}[\tilde F_t] \to \Omega^2\mathbb{C}[\tilde F_t]$ with the projection $\Omega^2\mathbb{C}[\tilde F_t] \to \Omega^2_p$. We see that $\mathrm{d}_p$ obeys the relative Leibniz rule,

$$\mathrm{d}_p(fg) = (\mathrm{d}_p f)g + f(\mathrm{d}_p g), \qquad f, g \in \mathbb{C}[\tilde F_t]. \tag{29}$$

It is clear by construction that the kernel of $\mathrm{d}_p$ consists precisely of the functions in $\mathbb{C}[\tilde F_t]$ which are constant on the fibres of $p$, whence we recover $\mathbb{C}[\tilde T]$ by means of the functions in $\mathbb{C}[\tilde F_t]$ which are covariantly constant with respect to $\mathrm{d}_p$ (since functions in $\mathbb{C}[\tilde T]$ may be identified with those functions in $\mathbb{C}[\tilde F_t]$ which are constant on the fibres of $p$). Moreover, the derivative $\mathrm{d}_p$ is relatively flat, i.e. its curvature $\mathrm{d}_p^2$ is zero. The next stage is to compute the direct image of differential forms along the projection $q$.

Proposition 7.1. There is an isomorphism $q_*q^*\Omega^1\mathbb{C}[\mathbb{CM}] \cong \Omega^1\mathbb{C}[\mathbb{CM}]$.


Proof. The generators $\mathrm{d}x_j$, $j = 1, \ldots, 4$ of $\Omega^1\mathbb{C}[\mathbb{CM}]$ pull back to their counterparts $\mathrm{d}x_i$, $i = 1, \ldots, 4$ in $\mathbb{C}[\tilde F_t]$ and these span $q^*\Omega^1\mathbb{C}[\mathbb{CM}]$ as a $\mathbb{C}[\tilde F_t]$-bimodule. Taking the direct image involves computing $q^*\Omega^1\mathbb{C}[\mathbb{CM}]$ as a $\mathbb{C}[\mathbb{CM}]$-bimodule, and as such it is spanned by elements of the form $\mathrm{d}\tilde s$, $\mathrm{d}x_i$, $Z^i\mathrm{d}\tilde s$, $Z^i\mathrm{d}x_j$, for $i, j = 1, \ldots, 4$. As already observed, the generator $\tilde s$ is essentially redundant, hence so are the elements involving $\mathrm{d}\tilde s$. Moreover, the relations (25) allow us to write $Z^1$ and $Z^2$ in terms of $Z^3, Z^4$, whence we are left with elements of the form $\mathrm{d}x_j$, $Z^3\mathrm{d}x_j$ and $Z^4\mathrm{d}x_j$ for $j = 1, \ldots, 4$. As explained above, the direct image is found by taking the elements of the calculus which are independent of the twistor coordinates, whence it is clear that the resulting differential calculus $q_*q^*\Omega^1\mathbb{C}[\mathbb{CM}]$ of $\mathbb{C}[\mathbb{CM}]$ is isomorphic to the one we first thought of.

We remark that we have used the same symbol $\mathrm{d}$ to denote the exterior derivative in the different calculi $\Omega^1\mathbb{C}[\mathbb{CM}]$, $q^*\Omega^1\mathbb{C}[\mathbb{CM}]$ and $q_*q^*\Omega^1\mathbb{C}[\mathbb{CM}]$. Although they are in principle different, our notation causes no confusion here.

Proposition 7.2. There are isomorphisms $q_*\Omega^1_p \cong \Omega^1\mathbb{C}[\mathbb{CM}]$, $q_*\Omega^2_p \cong \Omega^2_+\mathbb{C}[\mathbb{CM}]$, where $\Omega^2_+\mathbb{C}[\mathbb{CM}]$ denotes the space of two-forms in $\Omega^2\mathbb{C}[\mathbb{CM}]$ which are self-dual with respect to the Hodge $*$-operator defined by the metric $\eta = 2(\mathrm{d}x_1\mathrm{d}x_2 - \mathrm{d}x_3\mathrm{d}x_4)$.

Proof. We first consider the $\mathbb{C}[\tilde F_t]$-bimodule $\Omega^1_p$. Quotienting $\Omega^1\mathbb{C}[\tilde F_t]$ by the one-forms pulled back from $\mathbb{C}[\tilde T]$ means that $\Omega^1_p$ is spanned as a $\mathbb{C}[\tilde F_t]$-bimodule by $\mathrm{d}\tilde s$ and $\mathrm{d}x_i$, $i = 1, \ldots, 4$. The direct image is computed just as before, so $q_*\Omega^1_p$ is as a vector space the $Z$-independent part of $\Omega^1_p$, now considered as a $\mathbb{C}[\mathbb{CM}]$-bimodule. In what follows we shall write $\mathrm{d}$ for the exterior derivatives in the calculi $\Omega^1\mathbb{C}[\mathbb{CM}]$ and $\Omega^1\mathbb{C}[\tilde F_t]$ as they are the usual operators (the ones which we wrote down and quantised in Sect. 5).
However, in calculating the direct image $q_*\Omega^1_p$ of the calculus $\Omega^1_p$, we must introduce different notation for the image of the operator $\mathrm{d}_p$ under $q_*$. To this end, we write $q_*\mathrm{d}_p =: \underline{\mathrm{d}}$, so that as a $\mathbb{C}[\mathbb{CM}]$-bimodule the calculus $q_*\Omega^1_p$ is spanned by elements of the form $\underline{\mathrm{d}}\tilde s$ and $\underline{\mathrm{d}}x_j$, $j = 1, \ldots, 4$. As already observed, the generator $\tilde s$ is essentially redundant, hence so is the generator $\underline{\mathrm{d}}\tilde s$. The identity (29) becomes a Leibniz rule for $\underline{\mathrm{d}}$ upon taking the direct image $q_*$, whence $(q_*\Omega^1_p, \underline{\mathrm{d}})$ is a first order differential calculus of $\mathbb{C}[\mathbb{CM}]$. We must investigate its relationship with the calculus $\Omega^1\mathbb{C}[\mathbb{CM}]$. Differentiating the relations (25) and quotienting by generators $\mathrm{d}Z^i$ yields the relations

$$Z^3\,\mathrm{d}_p x_2 + Z^4\,\mathrm{d}_p x_3 = 0, \qquad Z^3\,\mathrm{d}_p x_4 + Z^4\,\mathrm{d}_p x_1 = 0 \tag{30}$$

in the relative calculus $\Omega^1_p$. Thus as a $\mathbb{C}[\tilde F_t]$-bimodule, $\Omega^1_p$ has rank two, since the basis elements $\mathrm{d}_p x_j$ are not independent. However, in the direct image $q_*\Omega^1_p$, these basis elements (now written $\underline{\mathrm{d}}x_j$) are independent. Thus it is clear that the calculus $(q_*\Omega^1_p, \underline{\mathrm{d}})$ is isomorphic to $(\Omega^1\mathbb{C}[\mathbb{CM}], \mathrm{d})$ in the sense that as bimodules they are isomorphic, and that this isomorphism is an intertwiner for the derivatives $\mathrm{d}$ and $\underline{\mathrm{d}}$. Using the relations (30) it is easy to see that in the direct image bimodule $q_*\Omega^2_p := \Omega^2(q_*\Omega^1_p)$ we have

$$\underline{\mathrm{d}}x_1 \wedge \underline{\mathrm{d}}x_4 = \underline{\mathrm{d}}x_2 \wedge \underline{\mathrm{d}}x_3 = \underline{\mathrm{d}}x_1 \wedge \underline{\mathrm{d}}x_2 + \underline{\mathrm{d}}x_3 \wedge \underline{\mathrm{d}}x_4 = 0,$$


which we recognise (since we are in double null coordinates) as the anti-self-dual two-forms, whence it is the self-dual two-forms which survive under the direct image. It is evident that these arguments are valid upon passing to either coordinate patch at the projective level, i.e. upon adjoining either $(Z^3)^{-1}$ or $(Z^4)^{-1}$ and taking the degree zero part of the resulting calculus.

It is clear that the composition of maps $q^*\Omega^1\mathbb{C}[\mathbb{CM}] \to \Omega^1\mathbb{C}[\tilde F_t] \to \Omega^1_p$ determines the isomorphism $\Omega^1\mathbb{C}[\mathbb{CM}] \to q_*\Omega^1_p$ by taking direct images (this is also true in each of the coordinate patches). We note that this is the required transversality condition for the general case (24). The sequence

$$\mathbb{C}[\tilde F_t] \to \Omega^1_p \to \Omega^2_p, \tag{31}$$

where the two maps are just $\mathrm{d}_p$, becomes the sequence

$$\mathbb{C}[\mathbb{CM}] \to \Omega^1\mathbb{C}[\mathbb{CM}] \to \Omega^2_+\mathbb{C}[\mathbb{CM}]$$

upon taking direct images (the maps become $\underline{\mathrm{d}}$). The condition that $\mathrm{d}_p^2 = 0$ is then equivalent to the statement that the curvature $\underline{\mathrm{d}}^2$ is annihilated by the map $\Omega^2\mathbb{C}[\mathbb{CM}] \to q_*\Omega^2_p = \Omega^2_+$, i.e. that the curvature $\underline{\mathrm{d}}^2$ is anti-self-dual, so although the connection $\mathrm{d}_p$ is flat, its image under $q_*$ is not. The derivatives $\mathrm{d}$ and $\underline{\mathrm{d}}$ agree as maps $\mathbb{C}[\mathbb{CM}] \to \Omega^1\mathbb{C}[\mathbb{CM}]$, but they do not agree beyond the one-forms. At the level of the correspondence space, the reason for this is that the calculus $\Omega^1_p$ comes equipped with relations (30), whereas the pull-back $q^*\Omega^1\mathbb{C}[\mathbb{CM}]$ has no such relations.

We are now in a position to investigate how this construction behaves under the twisting discussed in Sects. 4 and 5. Of course, we need only check that the various steps of the procedure remain valid under the quantisation functor. To this end, our first observation is that although the relations in the first order differential calculus of $\mathbb{C}[\mathbb{CM}]$ are deformed, Eqs. (20) show that the two-forms are undeformed, as is the metric $\upsilon$, hence the Hodge $*$-operator is also undeformed in this case. It follows that the notion of anti-self-duality of two-forms is the same as in the classical case. Furthermore, the definition of relative one-forms (26) still makes sense since the decomposition (27) is clearly unaffected by the twisting. Since we are working with the affine piece of (now noncommutative) space-time, the relevant relations in the correspondence space algebra $\mathbb{C}_F[\tilde F_t]$ are given by (25), which are unchanged under twisting since $t, Z^3, Z^4$ remain central in the algebra. Finally we observe that the proofs of Propositions 7.2 and 7.1 go through unchanged. The key steps use the fact that when the generators $Z^3, Z^4$ and $t$ are invertible, one may adjoin their inverses to the coordinate algebras and differential calculi and take the degree zero parts. Since these generators remain central under twisting, this argument remains valid and we have the following twisted consequence of Proposition 7.2.


Proposition 7.3. Let $\mathrm{d}_p : \mathbb{C}_F[\tilde F_t] \to \Omega^1_p$ be the differential operator defined by the composition of maps $\mathrm{d}_p = \pi \circ \mathrm{d}$,

$$\mathrm{d}_p : \mathbb{C}_F[\tilde F_t] \to \Omega^1\mathbb{C}_F[\tilde F_t] \to \Omega^1_p.$$

Then there is an isomorphism of differential calculi $q_*\Omega^1_p \cong \Omega^1\mathbb{C}_F[\mathbb{CM}]$, and the direct image $\underline{\mathrm{d}} = q_*\mathrm{d}_p$ is a differential operator whose curvature $\underline{\mathrm{d}}^2$ takes values in the anti-self-dual two-forms $\Omega^2_-\mathbb{C}_F[\mathbb{CM}]$.

We remark that this is no coincidence: the transform between one-forms on $\tilde T_{\mathbb{CM}}$ and the operator $\underline{\mathrm{d}}$ on the corresponding affine patch $\mathbb{CM}$ of space-time goes through to this noncommutative picture precisely because of the choice of cocycle made in Sect. 4. Indeed, we reiterate that any construction which is covariant under a chosen symmetry group will also be covariant after applying the quantisation functor. In this case, the symmetry group is the subgroup generated by conformal translations (see Sect. 4), which clearly acts covariantly on affine space-time $\mathbb{CM}$. By the very nature of the twistor double fibration, this translation group also acts covariantly on the corresponding subsets $\tilde T_{\mathbb{CM}}$ and $\tilde F_t$ of (homogeneous) twistor space and the correspondence space respectively, and it is therefore no surprise that the transform outlined above works in the quantum case as well. Indeed, we remark that the analogous statement is true of the instanton bundle in Sect. 6.2, in which the algebra inclusion $\mathbb{C}[S^4] \to \mathbb{C}[\mathbb{CP}^3]$ is quantised by the diagonal subgroup of $SU_4$, a symmetry group which preserves the relevant fibration in that case.

As discussed, it is true classically that one can expect such a transform between subsets of $\mathbb{CM}^{\#}$ and the corresponding subsets of twistor space $T$ provided the required topological properties (such as connectedness and simple-connectedness of the fibres) are met. We now see, however, that the same is not necessarily true in the quantum case.
For a given subset U of CM# , we expect the transform between U and its twistor counterpart Uˆ = p(q −1 (U )) ⊂ T to carry over to the quantum case provided the twisting group of symmetries is chosen in a way so as to preserve U and Uˆ .
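The wedge identities used in the proof of Proposition 7.2 can be checked pointwise on a fibre. The following is a minimal sketch, assuming a fixed point of the fibre with $Z^3 \neq 0$ and rational sample values; the function names and the choice of $(\mathrm{d}_p x_1, \mathrm{d}_p x_3)$ as a basis of the rank two relative calculus are illustrative assumptions, not notation from the text. It solves the relations (30) for $\mathrm{d}_p x_2$ and $\mathrm{d}_p x_4$ and evaluates the three two-form combinations that are claimed to vanish.

```python
from fractions import Fraction


def wedge(u, v):
    """Wedge product of two covectors in a 2-dimensional space (a 2x2 determinant)."""
    return u[0] * v[1] - u[1] * v[0]


def relative_forms(z3, z4):
    """Components of d_p x1..d_p x4 in the assumed basis (d_p x1, d_p x3).

    The relations (30) are solved for d_p x2 and d_p x4; this requires z3 != 0.
    """
    r = Fraction(z4, z3)
    dx1 = (Fraction(1), Fraction(0))
    dx3 = (Fraction(0), Fraction(1))
    dx2 = (Fraction(0), -r)  # from Z3 d_p x2 + Z4 d_p x3 = 0
    dx4 = (-r, Fraction(0))  # from Z3 d_p x4 + Z4 d_p x1 = 0
    return dx1, dx2, dx3, dx4


def vanishing_combinations(z3, z4):
    """The three two-form combinations that the text says are killed by (30)."""
    dx1, dx2, dx3, dx4 = relative_forms(z3, z4)
    return (
        wedge(dx1, dx4),
        wedge(dx2, dx3),
        wedge(dx1, dx2) + wedge(dx3, dx4),
    )


def surviving_form(z3, z4):
    """dx1 ^ dx3, which is not forced to vanish by (30)."""
    dx1, _, dx3, _ = relative_forms(z3, z4)
    return wedge(dx1, dx3)
```

Since the relative calculus has rank two at each point, its two-forms are one-dimensional there, which is why a single determinant suffices for each wedge product in this sketch.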

7.4. Outline of the Penrose-Ward transform for vector bundles. The previous section described how the one-forms on twistor space give rise to a differential operator on forms over space-time having anti-self-dual curvature. The main feature of this relationship is that bundle data on twistor space correspond to differential data on space-time. The idea of the full Penrose-Ward transform is to generalise this construction from differential forms to sections of more general vector bundles. We begin with a finite rank $\mathbb{C}[\tilde T]$-module $\tilde E$ describing the holomorphic sections of a holomorphic vector bundle $\tilde E$ over homogeneous twistor space $\tilde T$. Recall that the pull-back $p^*\tilde E$ is the $\mathbb{C}[\tilde F_t]$-module

$$\tilde E' := p^*\tilde E = \mathbb{C}[\tilde F_t] \otimes_{\mathbb{C}[\tilde T]} \tilde E. \tag{32}$$

The key observation is then that there is a relative connection $\nabla_p$ on $p^*\tilde E$ defined by

$$\nabla_p = \mathrm{d}_p \otimes 1$$

with respect to the decomposition (32). Again there is a relative Leibniz rule

$$\nabla_p(f\xi) = f(\nabla_p\xi) + (\mathrm{d}_p f) \otimes \xi \tag{33}$$

for $f \in \mathbb{C}[\tilde F_t]$ and $\xi \in p^*\tilde E$. Moreover, $\nabla_p$ extends to $\mathbb{C}[\tilde F_t]$-valued $k$-forms by defining

$$\Omega^k_p\tilde E' := \Omega^k_p \otimes_{\mathbb{C}[\tilde F_t]} \tilde E'$$

for $k \geq 0$ and extending $\nabla_p = \mathrm{d}_p \otimes 1$ with respect to this decomposition. It is clear that the curvature satisfies $\nabla_p^2 = 0$ and we say that $\nabla_p$ is relatively flat. Conversely, if $\tilde E'$ is a finite rank $\mathbb{C}[\tilde F_t]$-module admitting a flat relative connection $\nabla'$ (that is, a complex-linear map $\nabla' : \tilde E' \to \Omega^1_p \otimes_{\mathbb{C}[\tilde F_t]} \tilde E'$ satisfying (33) and $(\nabla')^2 = 0$), we may recover a finite rank $\mathbb{C}[\tilde T]$-module $\tilde E$ by means of the covariantly constant sections, namely

$$\tilde E := \{\xi \in \tilde E' \mid \nabla'\xi = 0\}.$$

This argument gives rise to the following result.

Proposition 7.4. There is a one-to-one correspondence between finite rank $\mathbb{C}[\tilde T]$-modules $\tilde E$ and finite rank $\mathbb{C}[\tilde F_t]$-modules $\tilde E'$ admitting a flat relative connection $\nabla_p$,

$$\nabla_p(f\xi) = f(\nabla_p\xi) + (\mathrm{d}_p f) \otimes \xi, \qquad \nabla_p^2 = 0$$

for all $f \in \mathbb{C}[\tilde F_t]$ and $\xi \in \tilde E'$.

We remark that here we do not see the non-trivial structure of the bundles involved (all of our modules describing vector bundles are free) as in our local picture all bundles are trivial. However, given these local formulae it will be possible at a later stage to patch together what happens at the global level. The Penrose-Ward transform arises by considering what happens to a relative connection $\nabla_p$ under direct image along $q$. At the projective level, one should in general impose here the additional assumption that $\tilde E'$ is also the pull-back of a bundle on space-time, so $\tilde E' = p^*\tilde E = q^*E$ for some finite rank $\mathbb{C}[\mathbb{CM}]$-module $E$. This is equivalent to assuming that the bundle is trivial upon restriction to each of the fibres of the map $q : F_t \to \mathbb{CM}$. At the homogeneous level it is of course obvious. The direct image of $\tilde E'$ is computed exactly as described in the previous section and one obtains

$$q_*\tilde E' = q_*q^*E \cong E.$$

Just as in Proposition 7.2, Eq. (33) for $\nabla_p$ becomes a Leibniz rule for $\nabla := q_*\nabla_p$, whence $\nabla_p$ maps onto a genuine connection on $E$. The sequence

$$\tilde E' \to \Omega^1_p \otimes_{\mathbb{C}[\tilde F_t]} \tilde E' \to \Omega^2_p \otimes_{\mathbb{C}[\tilde F_t]} \tilde E',$$

where the two maps are $\nabla_p$, becomes the sequence

$$E \to \Omega^1\mathbb{C}[\mathbb{CM}] \otimes_{\mathbb{C}[\mathbb{CM}]} E \to \Omega^2_+ \otimes_{\mathbb{C}[\mathbb{CM}]} E,$$

where the maps here are $\nabla$. Moreover, just as in the previous section it follows that under direct image the condition that $\nabla_p^2 = 0$ is equivalent to the condition that the curvature $\nabla^2$ is annihilated by the mapping

$$\Omega^2\mathbb{C}[\mathbb{CM}] \otimes_{\mathbb{C}[\mathbb{CM}]} E \to q_*\Omega^2_p \otimes_{\mathbb{C}[\mathbb{CM}]} E = \Omega^2_+ \otimes_{\mathbb{C}[\mathbb{CM}]} E,$$

so that $\nabla$ has anti-self-dual curvature.


7.5. Tautological bundle on $\mathbb{CM}$ and its Ward transform. We remark that in the previous section we began with a bundle over homogeneous twistor space $\tilde T$, whereas it is usual to work with bundles over the projective version $T$. As such, we implicitly assume that in doing so we obtain from $\tilde E$ corresponding $\mathbb{C}[T_{Z^3}]$- and $\mathbb{C}[T_{Z^4}]$-modules which are compatible in the patch where $Z^3$ and $Z^4$ are both non-zero, as was the case for the coordinate algebras $\mathbb{C}[T_{Z^3}]$ and $\mathbb{C}[T_{Z^4}]$, $\mathbb{C}[F_{Z^3}]$ and $\mathbb{C}[F_{Z^4}]$ via the transition functions $Z^\mu (Z^3)^{-1} \to Z^\mu (Z^4)^{-1}$. We similarly assume this for the corresponding calculi on these coordinate patches. In order to neatly capture these issues of coordinate patching, the general construction really belongs in the language of cohomology: as explained, the details will be addressed elsewhere. For the time being we give an illustration of the transform in the coordinate algebra framework, as well as an indication of what happens under twisting, through the tautological example introduced in Sect. 3.1.

We recall the identification of conformal space-time and the correspondence space as flag varieties $\mathbb{CM}^{\#} = F_2(\mathbb{C}^4)$ and $F = F_{1,2}(\mathbb{C}^4)$ respectively, and the resulting fibration

$$q : F_{1,2}(\mathbb{C}^4) \to F_2(\mathbb{C}^4),$$

where the fibre over a point $x \in F_2(\mathbb{C}^4)$ is the set of all one-dimensional subspaces of $\mathbb{C}^4$ contained in the two-plane $x \subset \mathbb{C}^4$, so is precisely a projective line $\mathbb{CP}^1$. We also have a fibration at the homogeneous level, $\tilde F \to F_2(\mathbb{C}^4)$, where this time the fibre over $x \in F_2(\mathbb{C}^4)$ is the set of all vectors which lie in the two-plane $x$, i.e. the tautological bundle over $\mathbb{CM}^{\#}$. We identify $\mathbb{C}^4$ with its dual and take the basis $(Z^1, Z^2, Z^3, Z^4)$. In the patch $\tilde F_t$ the relations in $\mathbb{C}[\tilde F_t]$ are

$$Z^1 = x_2 Z^3 + x_3 Z^4, \qquad Z^2 = x_4 Z^3 + x_1 Z^4, \tag{34}$$

which may be seen as giving a trivialisation of the tautological bundle in the patch $\mathbb{CM}$. In this trivialisation the space $E$ of sections of the bundle is just the free module over $\mathbb{C}[\mathbb{CM}]$ of rank two, spanned by $Z^3$ and $Z^4$, i.e. $E \cong \mathbb{C}[\mathbb{CM}] \otimes \mathbb{C}^2$. We equip this module with the anti-self-dual connection $\underline{\mathrm{d}} \otimes 1$ constructed in Sect. 7.3. It is now easy to see that the pull-back $\tilde E'$ of $E$ along $q$ is just the free $\mathbb{C}[\tilde F_t]$-module of rank two. By construction, the connection $\underline{\mathrm{d}}$ on $E$ pulls back to the relative connection $\mathrm{d}_p \otimes 1$ on $\tilde E' = \mathbb{C}[\tilde F_t] \otimes \mathbb{C}^2$. As discussed earlier the corresponding $\mathbb{C}[\tilde T]$-module $\tilde E$ is obtained as the kernel of the partial connection $\nabla_p$, which is precisely the free rank two $\mathbb{C}[\tilde T]$-module $\tilde E = \mathbb{C}[\tilde T] \otimes \mathbb{C}^2$. It is also clear that these modules satisfy the condition that $\tilde E' = p^*\tilde E = q^*E$ (the corresponding vector bundles are trivial in this case, whence they are automatically trivial when restricted to each fibre of $p$ and of $q$).

Proceeding any further involves giving a description of the bundles over projective twistor space more precisely, in terms of their patching data between the two coordinate charts, rather than simply working at the homogeneous level as we have done here. Such patching data are given in terms of the transition functions between the coordinate charts, and the bundle data on the twistor side of the transform are then given in terms of cohomology classes of its sections. The above construction when given at the projective level must incorporate these patching issues, and classically this is usually dealt with


using sheaves and sheaf cohomology, a different approach to the coordinate algebra framework used here. As explained, we postpone a more explicit discussion of the details to a sequel. The description given here is, however, enough to see that the Penrose-Ward transform in this example quantises in exactly the same way as the rank one case of Sect. 7.3. Both the transform and its quantisation are evidently very different in flavour to the transform of the instanton bundle given in Sects. 3.2 and 6.2.

8. The ADHM Construction

8.1. The classical ADHM construction. We begin this section with a brief summary of the ADHM construction for connections with anti-self-dual curvature on vector bundles over Minkowski space $\mathbb{CM}$ [1,15], with a view to dualising and then twisting the construction. A monad over $\tilde T = \mathbb{C}^4$ is a sequence of linear maps

$$A \xrightarrow{\ \rho_Z\ } B \xrightarrow{\ \tau_Z\ } C$$

between complex vector spaces $A, B, C$ of dimensions $k, 2k + n, k$ respectively, such that for all $Z \in \mathbb{C}^4$, $\tau_Z\rho_Z : A \to C$ is zero and for all $Z \in \mathbb{C}^4$, $\rho_Z$ is injective and $\tau_Z$ is surjective. Moreover, we insist that $\rho_Z, \tau_Z$ each depend linearly on $Z \in \tilde T$. The spaces $A, B, C$ should be thought of as typical fibres of trivial vector bundles over $\mathbb{CP}^3$ of ranks $k, 2k + n, k$, respectively. A monad determines a rank $n$ holomorphic vector bundle on $T = \mathbb{CP}^3$ whose fibre at $[Z]$ is $\mathrm{Ker}\,\tau_Z/\mathrm{Im}\,\rho_Z$ (where $[Z]$ denotes the projective equivalence class of $Z \in \mathbb{C}^4$). Moreover, any holomorphic vector bundle on $\mathbb{CP}^3$ trivial on each projective line comes from such a monad, unique up to the action of $GL(A) \times GL(B) \times GL(C)$. For a proof of this we refer to [17], although we note that the condition $\tau_Z\rho_Z = 0$ implies $\mathrm{Im}\,\rho_Z \subset \mathrm{Ker}\,\tau_Z$ for all $Z \in \mathbb{C}^4$ so the cohomology makes sense, and the fact that $\rho_Z, \tau_Z$ have maximal rank at every $Z$ implies that each fibre has dimension $n$.

The idea behind the ADHM construction is to use the same monad data to construct a rank $n$ vector bundle over $\mathbb{CM}^{\#}$ with second Chern class $c_2 = k$ (in the physics literature this is usually called the topological charge of the bundle). For each $W, Z \in \tilde T$ we write $x = W \wedge Z$ for the corresponding element $x$ of (homogeneous) conformal space-time $\widetilde{\mathbb{CM}}^{\#} \subset \Lambda^2\tilde T$. Then define

$$E_x = \mathrm{Ker}\,\tau_Z \cap \mathrm{Ker}\,\tau_W, \qquad F_x = (\rho_Z A) \cap (\rho_W A), \qquad \Delta_x = \tau_Z\rho_W.$$

Proposition 8.1. [15] The vector spaces $E_x$, $F_x$ and the map $\Delta_x$ depend on $x$, rather than on $Z, W$ individually.

Proof. We first consider $\Delta_x$ and suppose that $x = Z \wedge W = Z \wedge W'$ for some $W' = W + \lambda Z$, $\lambda \in \mathbb{C}$. Then we have

$$\tau_Z\rho_{W'} - \tau_Z\rho_W = \tau_Z\rho_{W'-W} = \tau_Z\rho_{\lambda Z} = \lambda\tau_Z\rho_Z = 0.$$

Now writing $Z' = Z + \lambda W$ we see that

$$\tau_{Z'}\rho_W - \tau_Z\rho_W = (\tau_Z + \lambda\tau_W)\rho_W - \tau_Z\rho_W = \lambda\tau_W\rho_W = 0$$

by the monad condition, proving the claim for $\Delta_x$.


Next we consider $b \in \mathrm{Ker}\,\tau_Z \cap \mathrm{Ker}\,\tau_W$ and suppose $x = Z \wedge W = Z \wedge W'$ where $W' = W + \lambda Z$. Then

$$\tau_{W'}b = (\tau_W + \lambda\tau_Z)b = 0,$$

so that $\mathrm{Ker}\,\tau_Z \cap \mathrm{Ker}\,\tau_W = \mathrm{Ker}\,\tau_Z \cap \mathrm{Ker}\,\tau_{W'}$. Moreover, if $Z' = Z + \lambda W$ we see that $\tau_{Z'}b = (\tau_Z + \lambda\tau_W)b = 0$, so that $\mathrm{Ker}\,\tau_Z \cap \mathrm{Ker}\,\tau_W = \mathrm{Ker}\,\tau_{Z'} \cap \mathrm{Ker}\,\tau_W$, establishing the second claim. The third follows similarly.

As in [15], we write $U$ for the set of $x \in \mathbb{CM}^{\#}$ on which $\Delta_x$ is invertible.

Proposition 8.2. For all $x \in U$ we have the decomposition

$$B = E_x \oplus \mathrm{Im}\,\rho_Z \oplus \mathrm{Im}\,\rho_W. \tag{35}$$

In particular, for all $x \in \mathbb{CM}^{\#}$ we have $F_x = 0$.

Proof. For $x \in U$, define

$$P_x = 1 - \rho_W\Delta_x^{-1}\tau_Z + \rho_Z\Delta_x^{-1}\tau_W : B \to B.$$

Now $P_x$ is linear in $x$ and is dependent only on $x$ and not on $Z, W$ individually. It is easily shown that $P_x^2 = P_x$ and $P_xB = E_x$, whence $P_x$ is the projection onto $E_x$. Moreover,

$$-\rho_W\Delta_x^{-1}\tau_Z, \qquad \rho_Z\Delta_x^{-1}\tau_W,$$

are the projections onto the second and third summands respectively. Hence we have proven the claim provided we can show that $F_x = 0$, since the sum $\mathrm{Im}\,\rho_Z \oplus \mathrm{Im}\,\rho_W$ is then direct. Suppose that $F_x = (\rho_Z A) \cap (\rho_W A)$ is not zero for some $x \in \mathbb{CM}^{\#}$, so there exists a non-zero $b \in \rho_{W'}A$ for all $W' \in \tilde T$ such that $x = Z \wedge W = Z \wedge W'$. Then in particular $a := \rho_{W'}^{-1}b$ is non-zero and defines a holomorphic section of a vector bundle over the two-dimensional subspace of $\tilde T$ spanned by all such $W'$ (whose typical fibre is just $A$), and hence (at the projective level) a non-zero holomorphic section of the bundle $\mathcal{O}(-1) \otimes A$ over $\hat x = \mathbb{CP}^1$, where $\mathcal{O}(-1)$ denotes the tautological line bundle over $\mathbb{CP}^1$. It is, however, well known that this bundle has no non-zero global sections, whence we must in fact have $F_x = 0$.

This procedure has thus constructed a rank $n$ vector bundle over $U$ whose fibre over $x \in U$ is $E_x$ (again noting that the construction is independent of the scaling of $x \in \widetilde{\mathbb{CM}}^{\#}$). The bundle $E$ is obtained as a sub-bundle of the trivial bundle $U \times B$: the projection $P_x$ identifies the fibre of $E$ at each $x \in \mathbb{CM}^{\#}$ as well as defining a connection on $E$ by orthogonal projection of the trivial connection on $U \times B$.
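The claims of Propositions 8.1 and 8.2 can be tested on a small example. The following is a minimal sketch, assuming a toy monad with $k = 1$, $n = 2$ and $B = \mathbb{C}^4$, in which $\rho_Z(a) = aZ$ and $\tau_Z(b) = \omega(Z, b)$ for a fixed antisymmetric form $\omega$. This particular monad, the function names and the rational sample vectors are assumptions made for illustration and are not taken from the construction above. The sketch evaluates $\Delta_x = \tau_Z\rho_W$ and applies the operator $P_x = 1 - \rho_W\Delta_x^{-1}\tau_Z + \rho_Z\Delta_x^{-1}\tau_W$ to a vector.

```python
from fractions import Fraction


def omega(z, b):
    """An antisymmetric bilinear form on a 4-dimensional space (assumed for this toy monad)."""
    return z[0] * b[1] - z[1] * b[0] + z[2] * b[3] - z[3] * b[2]


def tau(z, b):
    """tau_Z : B -> C, linear in Z; here C is one-dimensional."""
    return omega(z, b)


def rho(z, a):
    """rho_Z : A -> B, linear in Z; here A is one-dimensional."""
    return [a * zi for zi in z]


def delta(z, w):
    """Delta_x = tau_Z rho_W for x = Z ^ W (a scalar, since k = 1)."""
    return tau(z, rho(w, 1))


def project(z, w, b):
    """Apply P_x = 1 - rho_W Delta^{-1} tau_Z + rho_Z Delta^{-1} tau_W to b.

    Requires delta(z, w) != 0, i.e. x lies in the set U of the text.
    """
    d = Fraction(delta(z, w))
    onto_w = rho(w, tau(z, b) / d)
    onto_z = rho(z, tau(w, b) / d)
    return [bi - pw + pz for bi, pw, pz in zip(b, onto_w, onto_z)]


# Sample data: two twistors spanning a plane, and a shifted representative W' = W + 2Z.
Z = [1, 0, 2, 1]
W = [0, 1, 1, 3]
W_SHIFTED = [wi + 2 * zi for wi, zi in zip(W, Z)]
B_VECTOR = [1, 2, 3, 4]
```

In this example `tau(Z, rho(Z, 1))` vanishes by antisymmetry of `omega`, which plays the role of the monad condition $\tau_Z\rho_Z = 0$. One can then compare `delta(Z, W)` with `delta(Z, W_SHIFTED)`, and check that `project` is idempotent, annihilates `Z` and `W`, and lands in the common kernel of `tau(Z, .)` and `tau(W, .)`.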


8.2. ADHM in the ∗-algebra picture. In this section we mention how the ADHM construction ought to operate in our SU4 ∗-algebra framework. In passing to the affine variety description of our manifolds, we encode them as real rather than complex manifolds, obtaining a global coordinate algebra description. Thus we expect that the ADHM construction ought to go through at some global level in our ∗-algebra picture. For now we suppress the underlying holomorphic structure of the bundles we construct, with the complex structure to be added elsewhere. Indeed, we observe that the key ingredient in the ADHM construction is the decomposition (35), B = E x ⊕ Im ρ Z ⊕ Im ρW , which identifies the required bundle over space-time as a sub-bundle of the trivial bundle with fibre B. We wish to give a version of this decomposition labelled by points at the projective level, that is in terms of points of T = CP3 and CM# = F2 (C4 ), rather than # used above. We shall in terms of the homogeneous representatives Z ∈ T˜ and x ∈ CM

do this as before by identifying points of CP3 with rank one projectors Q in M4 (C), similarly points of CM# with rank two projectors P. Points of the correspondence space are identified with pairs of such projectors (Q, P) such that Q P = Q = P Q. Recall that in the previous description, given x ∈ 2 T˜ and Z ∈ xˆ there are many W such that x = Z ∧ W . In the alternative description, given a projection P ∈ CM# , the corresponding picture is that there are Q, Q ∈ T with Q P = Q = P Q and Q P = Q = P Q such that Im P = Im Q ⊕ Im Q . Given P ∈ CM# , Q ∈ T with (Q, P) ∈ F, there is a canonical choice for Q , namely Q := P−Q. Indeed, P identifies a two-dimensional subspace of C4 and Q picks out a one-dimensional subspace of this plane. The projector Q = P − Q picks out a line in C4 in the orthogonal complement of the line determined by Q. As in the monad description of the construction outlined earlier, the idea is to begin with a trivial bundle of rank 2k + n with typical fibre B = C2k+n and to present sufficient information to canonically identify a decomposition of B in the form (35). In the monad description this was done by assuming that τ Z ρ Z = 0 and that the maps τ, ρ were linearly dependent on Z ∈ T˜ . Of course, this makes full use of the additive structure on T˜ , a property which we do not have in the projector version: here we suggest an alternative approach. In order to obtain such a decomposition it is necessary to determine the reason for each assumption in the monad construction and to then translate this assumption into the projector picture. The first observation is that the effect of the map ρ is to identify a k-dimensional subspace of B for each point in twistor space (for each Q ∈ T the map ρ Q is simply the associated embedding of A into B). We note that this may be achieved directly by specifying for each Q ∈ T a rank k projection ρ Q : B → B. The construction then requires us to decompose each point of space-time in terms of a pair of twistors. 
As observed above, given P ∈ CM# and any Q ∈ T such that P Q = Q = Q P we have P = Q + Q , where Q = P − Q (it is easy to check that Q is indeed another projection of rank one). The claim is then that the corresponding k-dimensional subspaces of B have zero intersection and that their direct sum is independent of the choice of decomposition of P. In the monad construction this was obtained using the assumed linear dependence of ρ on points Z ∈ T˜ , although in terms of projectors this is instead achieved by assuming that the projections ρ Q ,ρ Q are orthogonal whenever

772

S. J. Brain, S. Majid

Q, Q are orthogonal (clearly if Q + Q = P then Q,Q are orthogonal), i.e. ρQ ρQ = ρQ ρQ = 0 for all Q, Q such that Q Q = Q Q = 0. Moreover, we impose that ρ Q + ρ Q depends only on Q + Q , so that the direct sum of the images of these projections depends only on the sum of the projections. Then for each P ∈ CM# we define a subspace B P of B of dimension n, B = B P ⊕ Im ρ Q ⊕ Im ρ Q by constructing the projection e P := 1 − ρ Q − ρ Q on C4 , which is well-defined since the assumptions we have made on the family ρ Q imply that the projectors e P , ρ Q and ρ Q are pairwise orthogonal. This constructs a rank n vector bundle over CM# whose fibre over P ∈ CM# is Im e P = B P . 8.3. The tautological bundle on CM# and its corresponding monad. We illustrate these ideas by constructing a specific example in the ∗-algebra picture. We construct a ‘tautological monad’ which appears extremely naturally in the projector version and turns out to correspond to the basic one-instanton bundle of Sect. 3.2. We once again recall that compactified space-time may be identified with the flag variety F2 (C4 ) of two-planes in C4 and this space has its associated tautological bundle whose fibre at x ∈ F2 (C4 ) is the two-plane in C4 which defines x (it is precisely this observation which gave rise to the projector description of space-time in the first place). Then we take B = C4 in the ADHM construction: note that we expect to take n = 2, k = 1, which agrees with the fact that dim B = 2k + n = 4. Then for each point P ∈ CM# we are required to decompose it as the sum of a pair Q, Q of rank one projectors (each representing a twistor). Here this is easy to do: we simply choose any one-dimensional subspace of the image of P and take Q to be the rank one projector whose image is this line. As discussed, the canonical choice for Q is just Q := P − Q. 
In doing so, we have tautologically specified the one-dimensional subspace of $B = \mathbb{C}^4$ associated to $Q \in T$ (recalling that $k = 1$ here), simply defining $\rho_Q := Q$. We now check that these data satisfy the conditions outlined in the previous section. It is tautologically clear that for all $Q_1, Q_2 \in T$ we have that $\rho_{Q_1}$ and $\rho_{Q_2}$ are orthogonal projections if and only if $Q_1$ and $Q_2$ are orthogonal. Moreover, if we fix $P \in CM^\#$, $Q \in T$ such that $PQ = Q = QP$ and take $Q' = P - Q$, then $\rho_Q + \rho_{Q'} = Q + Q' = P$, which of course depends only on $Q + Q'$ rather than on $Q, Q'$ individually. Thus we construct the subspace $B_P$ as the complement of the direct sum of the images of $\rho_Q$ and $\rho_{Q'}$. As explained, this is done by constructing the projection $e_P = 1 - \rho_Q - \rho_{Q'} = 1 - P$.


Thus as $P$ varies we get a rank two vector bundle over $CM^\#$, which is easily seen to be the complement in $\mathbb{C}^4$ of the tautological vector bundle over $CM^\#$. We equip this bundle with the Grassmann connection obtained by orthogonal projection of the trivial connection on the trivial bundle $CM^\# \times \mathbb{C}^4$, as discussed earlier. This gives a monad description of the tautological bundle over $CM^\# = F_2(\mathbb{C}^4)$. As explained, the instanton bundle over $S^4$ is obtained by restriction of this bundle to the two-planes $x \in F_2(\mathbb{C}^4)$ which are invariant under the map $J$ defined in Sect. 3.2. Hence we consider this construction only for $x \in S^4$, and by Proposition 3.11 we have that $Q' = J(Q)$ in the above. We now wish to give the monad version of the corresponding bundle over twistor space, for which we need the map $\tau_Q$. We recall that $\tau$ is meant to satisfy $\tau_Q \rho_Q = 0$ for all $Q \in \mathbb{CP}^3$, and we use this property to construct $\tau$ by putting $\tau_Q := \rho_{J(Q)} = J(Q)$, so that in this tautological example we have $\tau_Q \rho_Q = \rho_{J(Q)} \rho_Q = J(Q)Q = 0$. The bundle over twistor space corresponding to the instanton then appears as the vector bundle whose fibre over $Q \in \mathbb{CP}^3$ is the cohomology $\tilde{E} = \operatorname{Ker}\tau_Q / \operatorname{Im}\rho_Q = \operatorname{Ker}\rho_{J(Q)} / \operatorname{Im}\rho_Q$, which in the case of the basic one-instanton is the rank two bundle $\tilde{E} = \operatorname{Ker} J(Q) / \operatorname{Im} Q$, and one may easily check that this bundle over twistor space agrees with the one computed in Sect. 3.2 using the Penrose-Ward transform. Indeed, the crucial property is that it is trivial upon restriction to $\hat{P} = p(q^{-1}(P))$ for all $P \in S^4$, which is straightforward to see through the observation that as $Q$ varies with $P$ fixed, $Q$ and $J(Q)$ always span the same plane (the one defined by $P$), and the fibre of $\tilde{E}$ over all such $Q$ is precisely the orthogonal complement to this plane.

Acknowledgements. We would like to thank G. Landi and S.P. Smith for helpful comments on versions of the manuscript during their respective visits to the Noncommutative Geometry Programme at the Newton Institute.

Communicated by A. Connes

Commun. Math. Phys. 284, 775–802 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0644-9


A Negative Mass Theorem for the 2-Torus

K. Okikiolu

Department of Mathematics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0112, USA. E-mail: [email protected]

Received: 19 July 2007 / Accepted: 14 July 2008
Published online: 8 October 2008 – © Springer-Verlag 2008

Abstract: Let $M$ be a closed surface. For a metric $g$ on $M$, denote the area element by $dA$ and the Laplace-Beltrami operator by $\Delta = \Delta_g$. We define the Robin mass $m(p)$ at the point $p \in M$ to be the value of the Green function $G(p, q)$ at $q = p$ after the logarithmic singularity has been subtracted off, and we define $\operatorname{trace}\Delta^{-1} = \int_M m(p)\,dA$. This regularized trace can also be obtained by regularization of the spectral zeta function and is hence a spectral invariant which heuristically measures the total wavelength of the surface. We define the $\Delta$-mass of $(M, g)$ to equal $(\operatorname{trace}\Delta_g^{-1} - \operatorname{trace}\Delta_{S^2,A}^{-1})/A$, where $\Delta_{S^2,A}$ is the Laplacian on the round sphere of area $A$. This scale invariant quantity is a non-trivial analog for closed surfaces of the ADM mass for higher dimensional asymptotically flat manifolds. In this paper we show that in each conformal class $C$ for the 2-torus, there exists a metric with negative $\Delta$-mass. From this it follows that the minimum of the $\Delta$-mass on $C$ is negative and attained by some metric $g \in C$. For this minimizing metric $g$, one gets a sharp logarithmic Hardy-Littlewood-Sobolev inequality and an Onofri-type inequality. We remark that if the flat metric in $C$ is sufficiently long and thin then the minimizing metric $g$ is non-flat. The proof of our result depends on analyzing the ordinary differential equation $\phi'' = 1 - e^{\phi}$, which is equivalent to $h'' = 1 - 1/h$. The solutions are periodic and we need to establish quite delicate, asymptotically sharp inequalities relating the period to the maximum value.

1. Introduction, Main Results and Summary of the Proof

Let $M$ be a smooth, closed, compact surface with a (Riemannian) metric $g$. Denote the area element of $g$ by $dA$ and the area by $A$. Let $\Delta = \Delta_g$ denote the Laplace-Beltrami operator for $g$, given in local coordinates $(x_1, \dots, x_n)$ by
\[ \Delta = -\frac{1}{\sqrt{\det g}} \sum_{i,j} \frac{\partial}{\partial x_i}\left( \sqrt{\det g}\; g^{ij}\, \frac{\partial}{\partial x_j} \right). \tag{1.1} \]

The author was supported by the National Science Foundation #DMS-0302647.


The kernel of $\Delta$ is the constants. Let $\Delta^{-1}$ denote the inverse operator
\[ \Delta^{-1}\Delta f = f - \frac{1}{A}\int_M f\,dA. \]
The Green function $G(p, q)$ for $\Delta$ is the smooth function on $M \times M \setminus \{(p, p) : p \in M\}$ which satisfies
\[ \Delta^{-1} f(p) = \int_M G(p, q)\, f(q)\,dA(q). \]
Denoting the distance from $p$ to $q$ in the metric $g$ by $d(p, q)$, the function $G(p, q)$ is smooth away from the diagonal and has an expansion at the diagonal of the form
\[ G(p, q) = -\frac{1}{2\pi}\log d(p, q) + m(p) + o(d(p, q)). \tag{1.2} \]
We call the value $m(p) = m_g(p)$ the Robin mass at the point $p$. For a smooth function $\phi$ on $M$, write $A_\phi$ for the area of $M$ in the metric $e^{\phi} g$, so
\[ A_\phi = \int_M e^{\phi}\,dA. \]

Conformal change of the Robin mass. If $\phi$ is a smooth function on $M$ then
\[ m_{e^{\phi} g}(p) = m_g(p) + \frac{\phi(p)}{4\pi} - \frac{2}{A_\phi}\,(\Delta_g^{-1} e^{\phi})(p) + \frac{1}{A_\phi^2}\int_M e^{\phi}\,\Delta_g^{-1} e^{\phi}\,dA. \tag{1.3} \]
For the proof, see for example [S1,S2,M2] or [O2]. We define
\[ \operatorname{trace}\Delta_g^{-1} = \int_M m_g(p)\,dA(p). \]
This is a spectral invariant for $\Delta$, since it can be obtained from the spectral zeta function associated to $\Delta$, see [S1,S2,M3], or [O2].

Remark. Writing $K(p)$ for the Gaussian curvature of $g$ at $p$, it is shown in [S1,S2] that for any metric $g$ on the 2-sphere, we have
\[ m_g(p) - \frac{1}{2\pi}\,\Delta^{-1} K(p) = \frac{1}{A}\operatorname{trace}\Delta_g^{-1}. \tag{1.4} \]
The left-hand side (and hence the right-hand side) is a 2-sphere analog of the ADM mass from general relativity. Indeed, the (Riemannian) ADM mass is defined for asymptotically flat manifolds. However, if $M$ is a compact Riemannian manifold of dimension greater than 2, with positive conformal Laplacian, then given a point $p \in M$ we can define a mass at $p$ by blowing up the metric around $p$ using the Green function for the conformal Laplacian, and taking the ADM mass of the resulting asymptotically flat metric. This amounts to taking the constant term in the asymptotic expansion of the Green function for the conformal Laplacian around the point $p$. The left-hand side of (1.4) is the natural non-trivial analog of this for the 2-sphere. Formula (1.4) does not hold for surfaces of higher genus. The left-hand side is no longer pointwise constant and its fluctuation does not have obvious geometric significance. Therefore we consider the right-hand side of (1.4) as a natural non-trivial analog of the ADM mass for compact surfaces. Now (1.3) immediately gives the following formula, see also [M1].


Conformal change of $\operatorname{trace}\Delta^{-1}$ (Morpurgo's Formula). If $\phi$ is a smooth function on $M$, then
\[ \operatorname{trace}\Delta_{e^{\phi} g}^{-1} = \int_M m_g\, e^{\phi}\,dA + \frac{1}{4\pi}\int_M \phi\, e^{\phi}\,dA - \frac{1}{A_\phi}\int_M e^{\phi}\,\Delta_g^{-1} e^{\phi}\,dA. \tag{1.5} \]
On the round sphere, the right-hand side of (1.5) occurs in the logarithmic Hardy-Littlewood-Sobolev inequality.

Sharp logarithmic Hardy-Littlewood-Sobolev inequality on $S^2$. If $g$ is a round metric on $S^2$ of area $A$,
\[ \frac{1}{4\pi}\int_{S^2} \phi\, e^{\phi}\,dA - \frac{1}{A}\int_{S^2} e^{\phi}\,\Delta^{-1} e^{\phi}\,dA \ge 0 \]
holds for all functions $\phi : S^2 \to \mathbb{R}$ with $\int_{S^2} e^{\phi}\,dA = A$ such that $\int_{S^2} \phi\, e^{\phi}\,dA$ is finite. Moreover equality is attained exactly when $e^{\phi}$ is the Jacobian of a conformal transformation of $S^2$. For the proof, see [On,CL,B]. Combining this with (1.5), Morpurgo obtained the following.

Spectral interpretation of the logarithmic HLS inequality. Among all metrics on the 2-sphere of area $A$, the round metric attains the minimum value of $\operatorname{trace}\Delta^{-1}$.

The behavior of $\operatorname{trace}\Delta^{-1}$ for non-flat metrics on the torus was first considered in [M1]. Suppose $g_0$ is any flat metric of unit area on the 2-torus, and let $\lambda_1(g_0)$ denote the lowest eigenvalue of the Laplace-Beltrami operator for $g_0$. Let $C_1$ denote the class of metrics conformal to $g_0$ having unit area. It was shown in [M1] that if $\lambda_1(g_0) > 8\pi$, then $g_0$ is a local minimum for $\operatorname{trace}\Delta^{-1}$ on $C_1$. In [LL1,LL2], this was improved to a global result in most cases. Indeed, it was shown that $g_0$ minimizes $\operatorname{trace}\Delta^{-1}$ on $C_1$ provided $\lambda_1(g_0) \ge \pi^3$, or $g_0$ is rectangular and $\lambda_1 \ge 8\pi$. It is well understood that $g_0$ cannot minimize $\operatorname{trace}\Delta^{-1}$ on $C_1$ when $\lambda_1(g_0)$ is small. Indeed, it can be observed from the Kronecker limit formula that when $\lambda_1(g_0)$ is small, the value of $\operatorname{trace}\Delta^{-1}$ for $g_0$ is greater than the value for the round sphere of unit area, as was pointed out in [DS2]. However, by blowing a spherical bubble, one can construct a family of metrics in $C_1$ for which $\operatorname{trace}\Delta^{-1}$ approaches the value for the round sphere (see [O2,DS2] for different approaches to this).

In this paper, we show that if $T$ is a flat torus of unit area with $\lambda_1(T) < 8\pi$, then the minimum value of $\operatorname{trace}\Delta^{-1}$ among conformal metrics of unit area is attained by a non-flat metric. Although we do not identify this minimizing metric explicitly, we do construct a candidate, which is approximately spherical except for a short wormhole joining the poles.

Theorem 1. Let $T$ be a 2-dimensional torus with metric $g_0$. Then there exists a metric $g$ in the same conformal class as $g_0$ and having the same area $A$, such that the Robin mass $m(x)$ for $g$ is constant, and strictly less than the Robin mass for the round sphere of area $A$.

This leads to the following result.

Theorem 2. Let $T$ be a 2-dimensional torus with metric $g_0$. Then among metrics in the same conformal class as $g_0$ and having the same area $A$, there exists a metric $g$ which attains the minimum value of $\operatorname{trace}\Delta^{-1}$. Moreover $g$ has constant Robin mass $m(x)$, and this is less than the Robin mass of the round sphere of area $A$.


We remark that if $g_0$ is flat with $\lambda_1(g_0) < 8\pi$, then the metric $g$ is not flat and the Robin mass for $g$ is less than that for $g_0$.

Corollary 3. (Analogs of Logarithmic HLS inequality and Onofri's Inequality for the torus.) For the minimizing metric $g$ of Theorem 2, we have
\[ \frac{1}{4\pi}\int_T \phi\, e^{\phi}\,dA - \frac{1}{A}\int_T e^{\phi}\,\Delta^{-1} e^{\phi}\,dA \ge 0 \tag{1.6} \]
for all functions $\phi : T \to \mathbb{R}$ with $\int_T e^{\phi}\,dA = A$ such that $\int_T \phi\, e^{\phi}\,dA$ is finite. Here, $dA$ and $\Delta$ are associated to $g$. Moreover, for $\phi \in C^\infty(T)$,
\[ \frac{1}{16\pi}\int_T \phi\,\Delta\phi\,dA - \log\left(\frac{1}{A}\int_T e^{\phi}\,dA\right) + \frac{1}{A}\int_T \phi\,dA \ge 0. \]

To deduce Theorem 2 from Theorem 1, we appeal to Theorem 1 of [O2], which states that the minimum value of $\operatorname{trace}\Delta^{-1}$ among metrics conformal to $g_0$ having the same area is attained, provided there exists a metric conformal to $g_0$ for which the value of $\operatorname{trace}\Delta^{-1}$ is lower than the value for the round sphere of the same area. The proof of that result is a variational argument very similar in spirit to the proof of the Yamabe theorem in the non-positive case. One is trying to find $\phi$ to minimize (1.5). First one modifies the equation to break the lack of compactness by replacing $\Delta^{-1}$ in the integral on the right by $\Delta^{-1-\varepsilon}$. One can construct a minimizer for the resulting functional, and one wants this minimizer to converge to a limit as $\varepsilon \to 0$. It is here that one uses the fact that the value of $\operatorname{trace}\Delta^{-1}$ is lower than that for the round sphere, which is what prevents bubbles from forming and ensures the existence of a convergent subsequence as $\varepsilon \to 0$. To deduce Corollary 3 from Theorem 2, we appeal to Theorem 3 in [O2], which is just an explicit formulation of the duality between the logarithmic Sobolev inequality and the Onofri inequality. For some related results, see [Ch,M2,M3,O1,OsPS,S2]. For a probabilistic interpretation of $\operatorname{trace}\Delta^{-1}$, see [DS1].

Proof of Theorem 1. We will quickly show that our result is related to the problem of establishing somewhat delicate inequalities between the period and the maximum value of solutions to the ordinary differential equation $\phi'' = 1 - e^{\phi}$. These inequalities are established by making just the right Taylor expansion of the integral formula for the period. We first remark that under scaling by a constant $e^{\lambda}$, the Robin mass scales as
\[ m_{e^{\lambda} g}(p) = m_g(p) + \frac{\lambda}{4\pi}. \]
Hence if we can prove the theorem for area $A = 1$, it follows for arbitrary values of $A$. Furthermore, by the classical Uniformization Theorem we can assume that $g_0$ is a flat metric on $T$ with area 1, and we seek the metric $g = e^{\phi} g_0$ of area 1. From (1.3), the condition that the mass $m_{e^{\phi} g_0}(p)$ is constant is
\[ \phi - 8\pi\,(\Delta_0^{-1} e^{\phi}) \ \text{is constant}, \]
where $\Delta_0$ is the Laplacian for $g_0$. Applying $\Delta_0$ we find that this is equivalent to
\[ \Delta_0 \phi = 8\pi\,(e^{\phi} - 1). \tag{1.7} \]


We remark that if $\phi$ satisfies this condition then the metric $e^{\phi} g_0$ automatically has area 1, since
\[ 0 = \int_T \Delta_0 \phi\,dA_0 = 8\pi \int_T (e^{\phi} - 1)\,dA_0, \]
where $dA_0$ is the area element for $g_0$. We assume that $\phi$ satisfies (1.7). Then (1.5) gives
\[ \operatorname{trace}\Delta_{e^{\phi} g_0}^{-1} = \operatorname{trace}\Delta_0^{-1} + \frac{1}{8\pi}\int_T \phi\,(1 + e^{\phi})\,dA_0. \tag{1.8} \]
Now we work on a torus with flat metric $g$ of area 1, given by $\mathbb{C}/\Gamma$, where $\Gamma$ is the lattice generated by $1/b$ and $a + ib$. A fundamental domain for the torus is given by
\[ \left\{ x + iy : 0 \le y \le b, \ \frac{ay}{b} \le x \le \frac{ay + 1}{b} \right\}. \tag{1.9} \]
It is a fact that every metric on the torus is conformal to such a flat metric, with
\[ b \ge \left(\frac{3}{4}\right)^{1/4} = 0.9306\ldots \tag{1.10} \]
For the flat metric $g$ on this torus, we compute in the Appendix using the first Kronecker limit formula that setting
\[ \beta = \sqrt{\pi}\, b, \tag{1.11} \]
we have
\[ \operatorname{trace}\Delta_0^{-1} - \operatorname{trace}\Delta_{S^2,1}^{-1} = \frac{1}{4\pi}\left( \frac{\beta^2}{3} - \log(4\beta^2) + 1 - 4\sum_{n=1}^{\infty} \log\left| 1 - e^{-2n(\beta^2 - i\sqrt{\pi}\,\beta a)} \right| \right), \tag{1.12} \]
where $\Delta_{S^2,1}$ is the Laplacian on the round 2-sphere of area 1, see also [Chiu,S1,S2]. From this we see that
\[ \operatorname{trace}\Delta_0^{-1} - \operatorname{trace}\Delta_{S^2,1}^{-1} \le \frac{1}{4\pi}\left( \frac{\beta^2}{3} - \log(4\beta^2) + 1 - 4\sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) \right). \tag{1.13} \]
From this point, the proof involves some simple numerical evaluations as well as exact formulas and asymptotic estimates. It is a fact first pointed out in [DS2] that the left-hand side of (1.13) is negative when $\beta$ is small. To see this, note that
\[ -4\sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) \]
is decreasing in $\beta$ and is thus bounded by the value at the endpoint $\beta = \pi^{1/2}(3/4)^{1/4}$, which is
\[ -4\sum_{n=1}^{\infty} \log\left( 1 - e^{-3^{1/2}\pi n} \right) < 0.02. \]
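The numerical claim above is easy to reproduce. The following is a minimal sketch, not part of the original argument: the function names `theta_tail` and `poly_log_term` are illustrative, and the infinite series is truncated after 200 terms on the assumption that the remaining terms are below double precision. It evaluates the series at the endpoint $\beta = \pi^{1/2}(3/4)^{1/4}$ and, for comparison, the elementary term $\beta^2/3 - \log(4\beta^2) + 1$ of (1.13) at the two ends of the interval $[\pi^{1/2}(3/4)^{1/4}, 2.6]$.

```python
import math

def theta_tail(beta, terms=200):
    """-4 * sum_{n=1}^{terms} log(1 - exp(-2 n beta^2)); truncation of the series in (1.13)."""
    return -4.0 * sum(math.log1p(-math.exp(-2.0 * n * beta * beta))
                      for n in range(1, terms + 1))

def poly_log_term(beta):
    """beta^2/3 - log(4 beta^2) + 1, the elementary part of the bracket in (1.13)."""
    return beta * beta / 3.0 - math.log(4.0 * beta * beta) + 1.0

# Left endpoint pi^{1/2} (3/4)^{1/4}, corresponding to b = (3/4)^{1/4} in (1.10).
beta_min = math.sqrt(math.pi) * 0.75 ** 0.25
beta_max = 2.6

tail_at_endpoint = theta_tail(beta_min)
elementary_at_ends = (poly_log_term(beta_min), poly_log_term(beta_max))

print("series at endpoint:", tail_at_endpoint)
print("elementary term at the two ends:", elementary_at_ends)
print("upper bound for the bracket on the interval:",
      max(elementary_at_ends) + tail_at_endpoint)
```

Since the elementary term is convex on the interval, its maximum is attained at an end, so the last printed number bounds the whole bracket in (1.13) there.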


On the other hand,
\[ \frac{\beta^2}{3} - \log(4\beta^2) + 1 \]
is convex on the interval $[\pi^{1/2}(3/4)^{1/4}, 2.6]$, and hence is bounded above there by $-0.04$. Adding these terms, we find that the right-hand side of (1.13) is negative when $\beta \le 2.6$. We see then that in this case the flat metric $g = g_0$ satisfies the conclusion of Theorem 1. We only need prove Theorem 1 when $\beta > 2.6$. Noting that $2.6 > \pi/\sqrt{2}$, we now complete the proof of Theorem 1 by explaining how to find $g$ in the case $\beta > \pi/\sqrt{2}$.

Remark. If $b > 1$, then the length of the shortest geodesic is $1/b$ and the lowest eigenvalue of the Laplace-Beltrami operator is $\lambda_1 = 4\pi^2/b^2 = 4\pi^3/\beta^2$, so the value $\beta = \pi/\sqrt{2}$ corresponds to $\lambda_1 = 8\pi$. The value $\beta = 2$ corresponds to $\lambda_1 = \pi^3$. We remark that when $\beta \le 2$, it is shown in [LL1] that the flat metric minimizes $\operatorname{trace}\Delta^{-1}$. Since the minimum must beat the round sphere, this again confirms for the case $\beta \le 2$ that (1.13) is negative.

Assuming $\phi$ satisfies (1.7), combining (1.8) and (1.13) gives
\[ \operatorname{trace}\Delta_{e^{\phi} g_0}^{-1} - \operatorname{trace}\Delta_{S^2,1}^{-1} \le \frac{1}{4\pi}\left( \frac{1}{2}\int_T \phi\,(1 + e^{\phi})\,dA_0 + \frac{\beta^2}{3} - \log(4\beta^2) + 1 - 4\sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) \right). \tag{1.14} \]
We will find $\phi \in C^\infty(T)$ satisfying (1.7) such that $\phi(x + iy)$ is a function of $y$ alone, and the right-hand side of (1.14) is negative. We can recast (1.7) and (1.14) in terms of the single variable $y$ so that Theorem 1 follows from the following:

Theorem 1′. For each $b > (\pi/2)^{1/2}$, there exists a smooth function $\phi \in C^\infty(\mathbb{R})$ satisfying
\[ \frac{d^2\phi}{dy^2} = 8\pi\,(1 - e^{\phi}), \tag{1.15} \]
\[ \phi(y + b) = \phi(y) \ \text{for every } y \in \mathbb{R}, \tag{1.16} \]
\[ \phi \ \text{attains its maximum value } \phi_0 \text{ at } y = 0, \tag{1.17} \]
and such that writing $\beta = \pi^{1/2} b$, we have
\[ \frac{1}{2b}\int_0^b \phi\,(1 + e^{\phi})\,dy + \frac{\beta^2}{3} - \log(4\beta^2) + 1 - 4\sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) < 0. \tag{1.18} \]

Remarks. 1. The condition (1.17) is just thrown in to eliminate the degree of freedom given by translation invariance. In fact we choose $\phi$ to have smallest period $b$, which together with (1.15) and (1.17) determines $\phi$ uniquely.


2. In proving Theorem 1′, we will establish a relationship between the maximum value $\phi_0$ of $\phi$ and the period $b$. A simplified version is that there exist $\varepsilon_1, \varepsilon_2 > 0$ such that
\[ e^{\phi_0} + \log 4 + \varepsilon_1 e^{-\phi_0} \le \pi b^2 \le e^{\phi_0} + \log 4 + \varepsilon_2 e^{-\phi_0}, \quad \text{for } b \ge \left(\frac{\pi}{2}\right)^{1/2}. \]
The precise version is that there exist $\varepsilon_1, \varepsilon_2 > 0$ such that
\[ e^{\phi_0} - \phi_0 + \log 4 + \varepsilon_1 e^{-\phi_0} \le \pi b^2 - \log(\pi b^2) \le e^{\phi_0} - \phi_0 + \log 4 + \varepsilon_2 e^{-\phi_0}, \quad \text{for } b \ge \left(\frac{\pi}{2}\right)^{1/2}. \tag{1.19} \]
3. In [DS2], conformal factors were chosen for long skinny flat tori of area 1, so that as the length of the flat torus tends to infinity, the Robin mass of the new metric converges to that of the round sphere. From [O2], one sees this can easily be accomplished by conformal factors which concentrate at a point, but the conformal factors in [DS2] depend only on the length variable $y$. In this paper we choose conformal factors which minimize the Robin mass among one-variable candidates, yielding optimal metrics which beat the mass of the sphere on every torus. It is unknown whether our conformal factors give the true minimizer in any case.

The rest of the paper is dedicated to proving Theorem 1′. We begin by giving a summary of the proof, and then supply the details.

Outline of the proof of Theorem 1′. In Proposition 2.1, we will show that for $b > \sqrt{\pi/2}$, there exists a unique function $\phi$ satisfying (1.15)–(1.17) and having smallest period $b$. Moreover, the initial condition $\phi_0$ increases with $b$. Next write
\[ \beta = \sqrt{\pi}\, b, \qquad f_0 = e^{\phi_0} - \phi_0, \qquad M = \frac{1}{2b}\int_0^b \phi\,(1 + e^{\phi})\,dy. \tag{1.20} \]
Let us emphasize that although we are now using 4 variables, $b$, $\beta$, $\phi_0$, $f_0$, each one is an increasing function of any of the others. The non-trivial relationship between them is the differential equation which relates $b$ to $\phi_0$. We are trying to prove inequality (1.18), which we write as
\[ M + \frac{\beta^2}{3} - \log(4\beta^2) + 1 - 4\sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) < 0. \tag{1.21} \]
In Proposition 2.4 we show that
\[ \frac{d(\beta M)}{d\beta} = 1 - f_0. \tag{1.22} \]
We then investigate how $f_0$ behaves as a function of $\beta$, so that we can estimate the left-hand side of (1.21). Set
\[ \varepsilon(\beta) = \beta^2 - \log(4\beta^2) - f_0 = \frac{d(\beta M)}{d\beta} + \beta^2 - \log(4\beta^2) - 1. \tag{1.23} \]


We will prove the three key estimates, (1.24)–(1.26). Set $\beta_1$ to be the value of $\beta$ corresponding to the initial value $\phi_0 = \log 5$. Then
\[ \varepsilon(\beta) > 0, \quad \text{for } \frac{\pi}{2^{1/2}} < \beta \le \beta_1, \tag{1.24} \]
\[ \varepsilon(\beta) > \frac{0.03}{\beta^2}, \quad \text{for } \beta_1 \le \beta. \tag{1.25} \]
For some $\gamma > 0$, we have
\[ \varepsilon(\beta) < \frac{\gamma}{\beta^2}, \quad \text{for } \frac{\pi}{2^{1/2}} < \beta. \tag{1.26} \]
Thus $\varepsilon(\beta)$ is integrable. For the proof of (1.24), see Proposition 2.6–Corollary 2.8. For the other two inequalities, see Lemma 2.9 and Proposition 2.10. In Corollary 2.5, we obtain a simple upper bound on $\beta$ in terms of $\phi_0$ which yields
\[ \beta_1 \le 3.8, \quad \text{for } \phi_0 \le \log 5. \]
Hence integrating (1.24), (1.25) from $\beta$ to infinity yields
\[ \frac{1}{\beta}\int_{\beta}^{\infty} \varepsilon(\tilde{\beta})\,d\tilde{\beta} > \frac{0.01}{\beta^2}, \quad \text{for } \beta > \frac{\pi}{2^{1/2}}. \tag{1.27} \]
Now integrating (1.23) gives
\[ M + \frac{\beta^2}{3} - \log(4\beta^2) + 1 = \frac{C}{\beta} - \frac{1}{\beta}\int_{\beta}^{\infty} \varepsilon(\tilde{\beta})\,d\tilde{\beta}, \tag{1.28} \]
where $C$ is the constant of integration. In Proposition 2.11 we rework some of the asymptotic formulas required in the proof of (1.25)–(1.26) to show that $C = 0$. Hence combining this with (1.27) gives
\[ M + \frac{\beta^2}{3} - \log(4\beta^2) + 1 \le -\frac{0.01}{\beta^2}, \quad \text{for } \beta > \frac{\pi}{2^{1/2}}. \tag{1.29} \]
Finally, one can check with a simple numerical calculation that
\[ -4\sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) < \frac{0.002}{\beta^2} \tag{1.30} \]
holds at the value $\beta = \pi/2^{1/2}$. But then in Lemma 2.12 we see that (1.30) must hold at all values $\beta > \pi/2^{1/2}$. Adding (1.29) and (1.30) gives (1.21), thus completing the proof of Theorem 1′. Now we fill in the results stated in the outline to complete the proof.
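Two of the numbers quoted in the outline are elementary to recompute. The sketch below is only a sanity check under stated assumptions: the helper name `theta_tail` is illustrative, the series in (1.30) is truncated after 200 terms, and the bound on $\beta_1$ uses the inequality $\beta \le \pi/2^{1/2} + (f_0 - 1)^{1/2}$ of Corollary 2.5 evaluated at $\phi_0 = \log 5$.

```python
import math

def theta_tail(beta, terms=200):
    """-4 * sum_{n=1}^{terms} log(1 - exp(-2 n beta^2)); truncation of the series in (1.30)."""
    return -4.0 * sum(math.log1p(-math.exp(-2.0 * n * beta * beta))
                      for n in range(1, terms + 1))

beta_c = math.pi / math.sqrt(2.0)          # the threshold value pi / 2^{1/2}

# Inequality (1.30) at beta = pi / 2^{1/2}.
lhs_130 = theta_tail(beta_c)
rhs_130 = 0.002 / beta_c ** 2

# Bound on beta_1: Corollary 2.5 with f_0 = e^{phi_0} - phi_0 at phi_0 = log 5.
f0_at_log5 = 5.0 - math.log(5.0)
beta1_upper = beta_c + math.sqrt(f0_at_log5 - 1.0)

print("(1.30) at the threshold:", lhs_130, "<", rhs_130)
print("upper bound for beta_1:", beta1_upper)
```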


2. Auxiliary Results and Proofs

Proposition 2.1. There exists a smooth function $\psi : (\sqrt{\pi/2}, \infty) \times \mathbb{R} \to \mathbb{R}$ such that for each fixed $b \in (\sqrt{\pi/2}, \infty)$ the function
\[ \phi(y) = \psi(b, y) \]
satisfies (1.15)–(1.17), has smallest period $b$, and attains its minimum value at $y = b/2$. Moreover, writing $f(\phi) = e^{\phi} - \phi$, $\phi$ is also characterized by having period $b$ and satisfying the following two conditions:
\[ \phi(-y) = \phi(y), \tag{2.1} \]
\[ y = \frac{1}{4\sqrt{\pi}}\int_{\phi(y)}^{\phi_0} \frac{d\phi}{\sqrt{f_0 - f(\phi)}}, \quad y \in (0, b/2). \tag{2.2} \]
Furthermore, the map $b \mapsto \phi_0 = \psi(b, 0)$ is smooth from the interval $(\sqrt{\pi/2}, \infty)$ onto the interval $(0, \infty)$, and
\[ \frac{db}{d\phi_0} > 0. \]

Remarks. 1. Every solution of (1.15)–(1.17) has the form $\phi(y) = \psi(b/n, y)$, for some $n \in \mathbb{N}$.
2. By making the change of variables $h = e^{\phi}$, and $d\alpha = e^{\phi}\,dy$, we can transform Eq. (1.15) to
\[ \frac{d^2 h}{d\alpha^2} = 8\pi\left( 1 - \frac{1}{h} \right). \tag{2.3} \]
Now $d\alpha$ is a measure of the change in area, and in some respects it turns out to be more natural to analyze (2.3) than (1.15). However, we will require a delicate estimate on the relationship between $b$ and $\phi_0$, and although we work with the variable $h$ at some points, there are places where it is better to work with (1.15) (for example Proposition 2.4).

Proof of Proposition 2.1. This result is part of the standard theory of ordinary differential equations, see for example [A] and [Chi]. We give the proof here to set up notation for later. For $\phi \in \mathbb{R}$, set $f(\phi) = e^{\phi} - \phi$. We start by constructing the inverse of $f$. Indeed, $f$ maps $\mathbb{R}$ onto $[1, \infty)$, and for each $f_1 \in [1, \infty)$ there exist at most two solutions of the equation $f(\phi) = f_1$, given by $\phi = \phi_*(f_1)$ and $\phi = \phi^*(f_1)$, where
\[ \phi_*(f_1) \le 0, \qquad \phi^*(f_1) \ge 0. \tag{2.4} \]


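For $\phi_0 > 0$, we consider the initial value problem
\[ \frac{d^2\phi}{dy^2} = 8\pi\left( 1 - e^{\phi} \right), \tag{2.5} \]
\[ \phi(0) = \phi_0, \tag{2.6} \]
\[ \frac{d\phi}{dy}(0) = 0. \tag{2.7} \]
Set
\[ f_0 = f(\phi_0). \tag{2.8} \]
Multiplying (2.5) by $d\phi/dy$ and integrating from $y = 0$ gives
\[ \left( \frac{d\phi}{dy} \right)^2 = 16\pi\left( e^{\phi_0} - \phi_0 - (e^{\phi} - \phi) \right) = 16\pi\,( f_0 - f(\phi) ). \tag{2.9} \]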
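The first integral (2.9) can be watched numerically. The sketch below is an illustration only, not the method of the paper: it integrates (2.5)–(2.7) with a classical fourth-order Runge-Kutta scheme (the step size and the choice $\phi_0 = \log 5$ are assumptions made for the example), records the largest deviation of $(d\phi/dy)^2/(16\pi) + f(\phi)$ from $f_0$, and stops at the first minimum of $\phi$, which by symmetry is half of the smallest period.

```python
import math

def f(phi):
    """f(phi) = e^phi - phi, as in the text."""
    return math.exp(phi) - phi

def acceleration(phi):
    """Right-hand side of (2.5)."""
    return 8.0 * math.pi * (1.0 - math.exp(phi))

def rk4_step(phi, v, dy):
    """One classical RK4 step for the system phi' = v, v' = acceleration(phi)."""
    k1p, k1v = v, acceleration(phi)
    k2p, k2v = v + 0.5 * dy * k1v, acceleration(phi + 0.5 * dy * k1p)
    k3p, k3v = v + 0.5 * dy * k2v, acceleration(phi + 0.5 * dy * k2p)
    k4p, k4v = v + dy * k3v, acceleration(phi + dy * k3p)
    phi_new = phi + dy * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    v_new = v + dy * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return phi_new, v_new

def integrate_to_minimum(phi0, dy=1e-4, max_steps=1_000_000):
    """Integrate from the maximum (phi0, 0) until d(phi)/dy changes sign from - to +."""
    f0 = f(phi0)
    phi, v, y = phi0, 0.0, 0.0
    max_defect = 0.0
    for _ in range(max_steps):
        phi_new, v_new = rk4_step(phi, v, dy)
        y += dy
        # Deviation from the conserved quantity in (2.9).
        defect = abs(v_new * v_new / (16.0 * math.pi) + f(phi_new) - f0)
        max_defect = max(max_defect, defect)
        if v < 0.0 and v_new >= 0.0:
            # Linear interpolation for the zero of v inside the last step.
            y_min = y - dy * v_new / (v_new - v)
            return y_min, phi_new, max_defect
        phi, v = phi_new, v_new
    raise RuntimeError("no minimum found; increase max_steps")

phi0 = math.log(5.0)
half_period, phi_min, max_defect = integrate_to_minimum(phi0)
period = 2.0 * half_period
print("period b:", period, " minimum of phi:", phi_min, " max defect in (2.9):", max_defect)
```

The printed period can be compared with the bounds stated in the text: it should exceed $\sqrt{\pi/2}$, and $\sqrt{\pi}\,b$ should not exceed $\pi/2^{1/2} + (f_0 - 1)^{1/2}$.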

Hence
\[ \frac{dy}{d\phi} = \frac{\pm 1}{4\sqrt{\pi}}\,\frac{1}{\sqrt{f_0 - f(\phi)}}. \tag{2.10} \]
Set
\[ \Lambda = \Lambda(\phi_0) := \frac{1}{4\sqrt{\pi}}\int_{\phi_*(f_0)}^{\phi_0} \frac{d\phi}{\sqrt{f_0 - f(\phi)}}. \tag{2.11} \]
Then the function $\phi(y)$, assuming it exists, satisfies
\[ y = I(\phi(y)) \quad \text{for } 0 \le y \le \Lambda, \quad \text{where} \quad I(z) = \frac{1}{4\sqrt{\pi}}\int_{z}^{\phi_0} \frac{d\phi}{\sqrt{f_0 - f(\phi)}}. \tag{2.12} \]
Defining $\phi$ to be the inverse of the function $I$, we find that $\phi$ is decreasing and smooth on $(0, \Lambda)$ and it extends to be continuously differentiable on $[0, \Lambda]$, and satisfies
\[ \phi(0) = \phi_0, \qquad \phi(\Lambda) = \phi_*(f(\phi_0)), \qquad \frac{d\phi}{dy}(0) = \frac{d\phi}{dy}(\Lambda) = 0. \]
We now extend $\phi$ to $[-\Lambda, \Lambda]$ by requiring that it is even, that is $\phi(-y) = \phi(y)$, and then we extend it to $\mathbb{R}$ by requiring that it is periodic with period $2\Lambda$. The result is an even, continuously differentiable, periodic function on $\mathbb{R}$ whose smallest period is $2\Lambda$, and which is smooth on $\mathbb{R} \setminus 2\Lambda\mathbb{Z}$ and satisfies (2.5) there, and which attains its maximum value at $y = 0$ and its minimum value at $y = \Lambda$. Now by the general theorem on the uniqueness and smoothness of solutions to ordinary differential equations, this solution $\phi$ is smooth and satisfies (2.5) everywhere on $\mathbb{R}$. Moreover by the smooth dependence of solutions to ordinary differential equations on the initial conditions, we see that defining
\[ \eta(\phi_0, y) = \phi(y), \quad \text{where } \phi \text{ satisfies (2.5)–(2.7)}, \]
then $\eta \in C^\infty((0, \infty) \times \mathbb{R})$. The final step is to show that the function
\[ \phi_0 \mapsto b = 2\Lambda(\phi_0) \]


is smooth and bijective from $(0, \infty)$ to $(\sqrt{\pi/2}, \infty)$, with
\[ \frac{db}{d\phi_0} > 0, \]
so the inverse function $b \mapsto \phi_0(b)$ is smooth and bijective from $(\sqrt{\pi/2}, \infty)$ to $(0, \infty)$. We then define the function $\psi$ by $\psi(b, y) = \eta(\phi_0(b), y)$. Proposition 2.1 is thus reduced to the following.

Proposition 2.2. The function $\beta : [1, \infty) \to [0, \infty)$ defined by
\[ \beta(f_0) := \frac{1}{2}\int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{d\phi}{\sqrt{f_0 - f(\phi)}} \tag{2.13} \]
is a smooth function mapping $(1, \infty)$ bijectively onto $(\pi/\sqrt{2}, \infty)$, with
\[ \frac{d\beta}{df_0} > 0, \quad \text{on } (1, \infty). \]

Proof. See [Chi] for a general proof of this result. See also [ChiJ]. We include the proof here to develop properties of the variable $J = j^* + j_*$ which will be useful later on. To reduce the need for notation, it is convenient to work with physical variables rather than functions. (To be more precise, we suppose that there is a fixed underlying "physical" space which we don't need to specify. A variable is then a continuous function defined on this space.) We suppose then that $\phi$ is a variable taking values in $\mathbb{R}$, and $f$ and $h$ are variables related to $\phi$ by
\[ f = e^{\phi} - \phi, \qquad h = e^{\phi}, \qquad \phi = \log h, \qquad f = h - \log h. \tag{2.14} \]
The variables $f$ and $h$ take values in $[1, \infty)$ and $(0, \infty)$ respectively. Given a value for $f$, we write $\phi^* \ge 0$ and $\phi_* \le 0$ for the two corresponding values for $\phi$ and set
\[ h^* = e^{\phi^*}, \qquad h_* = e^{\phi_*}. \tag{2.15} \]
When $f = 1$ we have $\phi^* = \phi_* = 0$ and $h^* = h_* = 1$. For other values of $f$ the values of $\phi^*$ and $\phi_*$ are distinct. Then making a change of variables,
\[ \frac{1}{2}\int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{d\phi}{(f_0 - f)^{1/2}} = \frac{1}{2}\int_1^{f_0} \frac{1}{(f_0 - f)^{1/2}}\left( \frac{1}{e^{\phi^*(f)} - 1} + \frac{1}{1 - e^{\phi_*(f)}} \right) df = \frac{1}{2}\int_1^{f_0} \frac{1}{(f_0 - f)^{1/2}}\left( \frac{1}{h^*(f) - 1} + \frac{1}{1 - h_*(f)} \right) df. \tag{2.16} \]
We will now analyze the Jacobian factor in (2.16) and modify it to obtain a positive monotonically increasing function of $f$.


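Lemma 2.3. Define variables $j^*$ and $j_*$ by
\[ j^* = \frac{1}{h^* - 1} - \frac{1}{(2(f - 1))^{1/2}}, \qquad j_* = \frac{1}{1 - h_*} - \frac{1}{(2(f - 1))^{1/2}}. \tag{2.17} \]
Then
(a) As $f \to 1$,
\[ j^* \to -\frac{1}{3}, \qquad j_* \to \frac{1}{3}. \]
(b) The variables $j^*$ and $j_*$ are increasing with $f$, indeed
\[ \frac{dj^*}{df} > 0, \qquad \frac{dj_*}{df} > 0, \quad \text{for } f > 1, \]
and
\[ \frac{dj^*}{df} = O((f - 1)^{-1/2}), \qquad \frac{dj_*}{df} = O((f - 1)^{-1/2}), \quad \text{as } f \to 1. \]
(c) As functions of the variable $f$, the variables $j^*$ and $j_*$ are concave. More precisely,
\[ \frac{d^2 j^*}{df^2} < 0, \qquad \frac{d^2 j_*}{df^2} < 0, \quad \text{for } f > 1. \]
(d) The variable
\[ j^* + j_* = \frac{1}{h^* - 1} + \frac{1}{1 - h_*} - \left( \frac{2}{f - 1} \right)^{1/2} \tag{2.18} \]
satisfies
\[ \frac{d(j^* + j_*)}{df} > 0, \qquad \frac{d^2(j^* + j_*)}{df^2} < 0, \quad \text{for } f > 1, \]
and
\[ (j^* + j_*) \to 0, \qquad \frac{d(j^* + j_*)}{df} = O((f - 1)^{-1/2}), \quad \text{as } f \to 1. \]
(e)
\[ 0 < j^* + j_* < 1, \quad \text{when } f > 1. \]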
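The statements of Lemma 2.3 lend themselves to a quick numerical spot check. The following sketch is an illustration under stated assumptions: the two roots $h^*$, $h_*$ of $h - \log h = f$ are found by plain bisection in the variable $u = h - 1$ (the bracketing intervals and the helper names `g`, `solve_u`, `j_pair`, `J` are choices made for this example, not taken from the paper), and $j^*$, $j_*$ are then evaluated from (2.17).

```python
import math

def g(u):
    """f - 1 expressed through u = h - 1, i.e. u - log(1 + u)."""
    return u - math.log1p(u)

def solve_u(c, lo, hi):
    """Bisection for g(u) = c on [lo, hi]; g(lo) - c and g(hi) - c must differ in sign."""
    sign_lo = g(lo) - c > 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (g(mid) - c > 0.0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def j_pair(f):
    """Return (j^*, j_*) from (2.17) for a given f > 1."""
    c = f - 1.0
    u_up = solve_u(c, 0.0, 2.0 * f + 2.0)            # u = h^* - 1 > 0
    u_dn = solve_u(c, -1.0 + math.exp(-f), 0.0)      # u = h_* - 1 in (-1, 0)
    root = math.sqrt(2.0 * c)
    return 1.0 / u_up - 1.0 / root, 1.0 / (-u_dn) - 1.0 / root

def J(f):
    """The sum j^* + j_* appearing in (2.18)."""
    j_up, j_dn = j_pair(f)
    return j_up + j_dn

j_up_near_1, j_dn_near_1 = j_pair(1.0001)
samples = [J(x) for x in (1.5, 2.0, 2.5, 6.0)]
print("j^*, j_* near f = 1:", j_up_near_1, j_dn_near_1)
print("J at f = 1.5, 2, 2.5, 6:", samples)
```

The printed values can be compared with parts (a), (d) and (e): the first pair should be close to $-1/3$ and $1/3$, and the samples of $J$ should be increasing, concave in $f$, and lie strictly between 0 and 1.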

Proof of Lemma 2.3. Clearly (d) follows from (a), (b) and (c). Moreover, we see from (2.18) that $j^* + j_* \to 1$ as $f \to \infty$, so (e) follows from (d).
(a) Dealing with the variables $j^*$ and $j_*$ simultaneously, note that as $h \to 1$, we have
\[ \frac{1}{|h - 1|} - \frac{1}{(2(f - 1))^{1/2}} = \frac{1}{|h - 1|} - \frac{1}{\left( 2(h - 1 - \log(1 - (1 - h))) \right)^{1/2}} = \frac{1}{|h - 1|} - \frac{1}{\left( (1 - h)^2 + 2(1 - h)^3/3 + 2(1 - h)^4/4 + \cdots \right)^{1/2}} \to \begin{cases} -1/3 & \text{as } h \downarrow 1, \\ 1/3 & \text{as } h \uparrow 1. \end{cases} \tag{2.19} \]


(b) We need to show that
\[ \frac{d}{df}\left( \frac{1}{|h - 1|} - \frac{1}{(2(f - 1))^{1/2}} \right) > 0, \quad \text{when } f > 1. \tag{2.20} \]
Note that
\[ \frac{dh}{df} = \frac{h}{h - 1}. \tag{2.21} \]
We thus compute the sign of the derivative
\[ \frac{d}{df}\left( \frac{1}{|h - 1|} - \frac{1}{(2(f - 1))^{1/2}} \right) = -\frac{\operatorname{sign}(h - 1)}{(h - 1)^2}\,\frac{dh}{df} + \frac{1}{(2(f - 1))^{3/2}} = \frac{-h}{|h - 1|^3} + \frac{1}{(2(f - 1))^{3/2}}. \]
Hence (2.20) will follow if we can show that
\[ \frac{|h - 1|^3}{h} > (2(f - 1))^{3/2}, \quad \text{for } h \ne 1, \]
equivalently
\[ h^{-2/3}(h - 1)^2 > 2(f - 1), \quad \text{for } h \ne 1. \tag{2.22} \]
But this indeed holds, since
\[ h^{-2/3}(h - 1)^2 - 2(f - 1) \tag{2.23} \]
equals zero at $h = 1$, and
\[ \frac{d}{df}\left( h^{-2/3}(h - 1)^2 - 2(f - 1) \right) = \frac{2h^{-2/3}(2h + 1)}{3} - 2 > 0 \quad \text{for } f > 1. \]
Indeed,
\[ h^{-2/3}(2h + 1) > 3 \quad \text{for } f > 1, \]
as one can easily check by cubing both sides or differentiating once more with respect to $f$. The behavior of the derivative as $f \to 1$ is obtained with a Taylor expansion as in (2.19).
(c) We compute
\[ \frac{d^2}{df^2}\left( \frac{1}{|h - 1|} - \frac{1}{(2(f - 1))^{1/2}} \right) = \frac{d}{df}\left( \frac{-h}{|h - 1|^3} + \frac{1}{(2(f - 1))^{3/2}} \right) = \frac{h(2h + 1)}{|h - 1|^5} - \frac{3}{(2(f - 1))^{5/2}}. \]
In order to show that this is negative, we need to show
\[ \frac{(2(f - 1))^{5/2}}{3} < \frac{|h - 1|^5}{h(2h + 1)} \quad \text{for } f > 1, \]


or equivalently we need to show
\[ 2(f - 1) < 3^{2/5}\,(h(2h + 1))^{-2/5}\,(h - 1)^2 \quad \text{for } f > 1. \tag{2.24} \]
Now defining
\[ \tau = 3^{2/5}\,(h(2h + 1))^{-2/5}\,(h - 1)^2 - 2(f - 1), \tag{2.25} \]
we see that $\tau$ vanishes at $f = 1$. Differentiating with respect to $f$ we get
\[ \frac{d\tau}{df} = \frac{2 \cdot 3^{2/5}}{5}\,(2h^2 + h)^{-7/5}\, h\,(6h^2 + 8h + 1) - 2, \tag{2.26} \]
which also vanishes at $f = 1$. To show that this is positive, we compute
\[ \frac{5}{2 \cdot 3^{2/5}}\,\frac{d^2\tau}{df^2} = \frac{2h^2(6h^2 - 2h + 1)}{5(2h^2 + h)^{12/5}} > 0 \quad \text{for } h > 0. \]

Now we can complete the proof of Proposition 2.2. Introduce the function $J : [1, \infty) \to \mathbb{R}$ such that $j^* + j_* = J(f)$. We see that $\beta$ is smooth by fixing $c$ with $1 < c < f_0$ and writing
\[ \beta(f_0) = \frac{1}{2}\int_1^{f_0} \frac{1}{(f_0 - f)^{1/2}}\left( \frac{2}{(2(f - 1))^{1/2}} + J(f) \right) df = \frac{\pi}{2^{1/2}} + \frac{1}{2}\int_1^{f_0} \frac{J(f)}{(f_0 - f)^{1/2}}\,df = \frac{\pi}{2^{1/2}} + \frac{1}{2}\int_1^{c} \frac{J(f)}{(f_0 - f)^{1/2}}\,df + \frac{1}{2}\int_0^{f_0 - c} \frac{J(f_0 - f)}{f^{1/2}}\,df. \tag{2.27} \]
Since $J$ is smooth away from 1, both integrals on the right can be differentiated repeatedly in $f_0$, and we see $\beta$ is smooth in $f_0$. Differentiating and letting $c \to 1$ gives
\[ \frac{d\beta(f_0)}{df_0} = \frac{1}{2}\int_0^{f_0 - 1} \frac{J'(f_0 - f)}{f^{1/2}}\,df > 0. \]

Our mission is to compute the quantity $M$ in terms of $\beta$, and we will prove (1.22) relating $M$ to $f_0$. We rescale the function $\phi$ to have period 2, by taking the solution $\psi$ from Proposition 2.1, and setting $\rho(b, s) = \psi(b, bs/2)$, so that for $b$ fixed, the function $s \mapsto \rho(b, s)$ is even, and attains its maximum value at $s = 0$, and
\[ \frac{\partial^2 \rho}{\partial s^2} = 2\beta^2\,(1 - e^{\rho}). \tag{2.28} \]


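The solution $\rho$ is a smooth function of $(\beta, s)$, and we are interested in the quantity $M$, defined in (1.20). Setting $f_0 = f(\phi_0) = e^{\phi_0} - \phi_0$, we have from the definition (1.20), the symmetry of $\phi$, and (2.10),
\[ M = \frac{1}{b}\int_0^{b/2} \phi\,(1 + e^{\phi})\,dy = \frac{1}{2}\int_0^1 \rho\,(1 + e^{\rho})\,ds = \frac{1}{4\beta}\int_{\phi_*(f_0)}^{\phi_0} \frac{\phi\,(1 + e^{\phi})}{(f_0 - f(\phi))^{1/2}}\,d\phi. \tag{2.29} \]

Proposition 2.4. (a)
\[ \frac{dM}{d\beta} = \frac{1}{\beta}\int_0^1 \rho\,(1 - e^{\rho})\,ds. \]
(b)
\[ \frac{d(\beta M)}{d\beta} = \frac{1}{2}\int_0^1 \rho\,(3 - e^{\rho})\,ds = 1 - f_0. \]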

Proof. (a) We differentiate (2.28) to obtain ∂ρ ∂ 2 ∂ρ = 4β(1 − eρ ) − 2β 2 eρ . ∂s 2 ∂β ∂β

(2.30)

Integrating (2.30) we get

1 0

Hence 1 dM = dβ 2

1

0

∂ρ ρ e ds = 0. ∂β

dρ 1 (1 + eρ + ρeρ ) ds = dβ 2

(2.31)

1

0

dρ (1 − eρ + ρeρ ) ds. dβ

(2.32)

However, integrating (2.30) against ρ, we get 1 1 1 ∂ρ ∂ 2 ρ ∂ρ ρ ρ 2 ρe ds. ds = 4β ρ(1 − e ) ds − 2β 2 ∂β ds ∂β 0 0 0 Hence using Eq. (2.28), we get 1 1 1 ∂ρ ∂ρ ρ (1 − eρ ) ds = −2β 2 ρe ds + 4β ρ(1 − eρ ) ds. 2β 2 0 ∂β 0 ∂β 0 Hence 1 2

0

1

∂ρ 1 (1 − eρ + ρeρ ) ds = ∂β β

1

ρ(1 − eρ ) ds.

0

Combining this with (2.32) gives (a). (b) The first equality follows directly from (a). For the second, we multiply (2.28) by dρ/ds and integrating as in (2.9), to get 2 ∂ρ = 4β 2 ( f 0 + ρ − eρ ). ∂s


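But then
\[
\frac12 \int_0^1 \rho(1 - e^{\rho})\, ds = \frac{1}{4\beta^2} \int_0^1 \rho\, \frac{\partial^2 \rho}{\partial s^2}\, ds = -\frac{1}{4\beta^2} \int_0^1 \left( \frac{\partial \rho}{\partial s} \right)^2 ds = -\int_0^1 (f_0 + \rho - e^{\rho})\, ds = 1 - f_0 - \int_0^1 \rho\, ds.
\]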

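Corollary 2.5.
\[
\beta \le \frac{\pi}{2^{1/2}} + (f_0 - 1)^{1/2}.
\]

Proof. From (2.27) and Lemma 2.3 (e), we have
\[
\beta(f_0) = \frac{\pi}{2^{1/2}} + \frac12 \int_1^{f_0} \frac{J(f)}{(f_0 - f)^{1/2}}\, df \le \frac{\pi}{2^{1/2}} + \frac12 \int_1^{f_0} \frac{1}{(f_0 - f)^{1/2}}\, df = \frac{\pi}{2^{1/2}} + (f_0 - 1)^{1/2}.
\]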

Our task now is to work towards the estimate in (1.24). This inequality can be checked quite carefully using Mathematica, but we give a concise analytic proof with minimal computation.

Proposition 2.6. Given a constant $\lambda > 0$, define functions $V, W : [1, \infty) \to \mathbb{R}$ by
\[
V(f) = \frac{\pi}{2^{1/2}} + \lambda (f - 1)^{3/2}, \qquad W(f) = V(f)^2 - \log(4V(f)^2) - f.
\]
Suppose that for $f_1 > 1$ fixed, there exists $\lambda$ such that
\[
0 < \lambda < \frac{2J(f_1)}{3(f_1 - 1)}, \tag{a}
\]
\[
W(f_1) > 0, \tag{b}
\]
\[
W'(f_1) < 0. \tag{c}
\]
Then writing $\beta = \beta(f_0)$ for the function defined in (2.13), we have
\[
\beta^2 - \log(4\beta^2) - f_0 > 0, \qquad 1 < f_0 < f_1.
\]

Proof. First we show that $W''(f) > 0$ for $f \ge 1$. Indeed, note that $V(f) > 1$ and $V''(f) > 0$, and
\[
W'(f) = 2\left( V(f) - \frac{1}{V(f)} \right) V'(f) - 1,
\]
\[
W''(f) = 2\left( 1 + \frac{1}{V(f)^2} \right) (V'(f))^2 + 2\left( V(f) - \frac{1}{V(f)} \right) V''(f) > 0.
\]
Next note that $W'' > 0$ combined with (c) shows that $W$ is decreasing on $[1, f_1]$, and this combined with (b) shows that $W(f_0) > 0$ for $1 < f_0 < f_1$.


Now we show that for $1 < f_0 < f_1$ we have $\beta(f_0) > V(f_0)$. Indeed, comparing the concave function $J(f)$ with the linear function, we get
\[
J(f) > \frac{J(f_1)}{f_1 - 1}\,(f - 1), \qquad 1 < f < f_1.
\]
Substituting $t = (f - 1)/(f_0 - 1)$ we get
\[
\begin{aligned}
\beta(f_0) &= \frac{\pi}{2^{1/2}} + \frac12 \int_1^{f_0} \frac{J(f)}{(f_0 - f)^{1/2}}\, df
\ge \frac{\pi}{2^{1/2}} + \frac{J(f_1)}{2(f_1 - 1)} \int_1^{f_0} \frac{f - 1}{(f_0 - f)^{1/2}}\, df \\
&= \frac{\pi}{2^{1/2}} + \frac{J(f_1)(f_0 - 1)^{3/2}}{2(f_1 - 1)} \int_0^1 \frac{t}{(1 - t)^{1/2}}\, dt
= \frac{\pi}{2^{1/2}} + \frac{2J(f_1)(f_0 - 1)^{3/2}}{3(f_1 - 1)} \\
&> \frac{\pi}{2^{1/2}} + \lambda (f_0 - 1)^{3/2} = V(f_0).
\end{aligned}
\]
Hence we have
\[
\beta^2 - \log(4\beta^2) - f_0 > V(f_0)^2 - \log(4V(f_0)^2) - f_0 = W(f_0) > 0.
\]

Remark. It will be useful to know the formula
\[
h_* = \sum_{j=0}^{\infty} \frac{(j + 1)^{j-1}}{j!}\, e^{-(j+1) f},
\]
although we will not prove it or depend on it.

Lemma 2.7. For $\lambda = 0.098$, the conditions of Proposition 2.6 are satisfied for $f_1 = 5 - \log 5$.

Proof of Lemma 2.7. Step 1. For $h^* = 5$, write $5_* = h_*$. Then numerical calculation shows that
\[
5 - \log(5) = f(5) < f(0.034) = 0.034 - \log(0.034).
\]
Hence $5_* > 0.034$, and
\[
(j^* + j_*)(f(5)) = \frac{1}{5 - 1} + \frac{1}{1 - 5_*} - \left( \frac{2}{5 - \log 5 - 1} \right)^{1/2} \ge \frac14 + \frac{1}{1 - 0.034} - \left( \frac{2}{4 - \log 5} \right)^{1/2} = 0.3705\ldots,
\]
and the right-hand term in Proposition 2.6 (a) is
\[
\frac{2 (j^* + j_*)(f(5))}{3(4 - \log 5)} = 0.1033\ldots > 0.098.
\]


Step 2.
\[
V(f(5)) = \frac{\pi}{2^{1/2}} + \lambda (4 - \log 5)^{3/2} = 2.583664\ldots
\]
so $2.58366 < V(f(5)) < 2.584$. Hence at the value $f_1 = 5 - \log 5$ we have
\[
W = V^2 - \log(4V^2) - (5 - \log 5) \ge 2.58366^2 - \log(4 \times 2.58366^2) - (5 - \log 5) = 0.00002\ldots > 0,
\]
while
\[
W' = 3\lambda \left( V - \frac1V \right) (4 - \log 5)^{1/2} - 1 < 0.294 \left( 2.584 - \frac{1}{2.584} \right) (4 - \log 5)^{1/2} - 1 = -0.001\ldots < 0.
\]

Corollary 2.8. Set $\beta_1 = \beta(5 - \log 5)$. For $\beta = \beta(f_0)$, we set $\varepsilon(\beta) = \beta^2 - \log(4\beta^2) - f_0$. Then
\[
\varepsilon(\beta) > 0, \quad \text{if } \frac{\pi}{2^{1/2}} < \beta < \beta_1.
\]
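The numerical claims in Steps 1 and 2 of the proof of Lemma 2.7 can be reproduced with a few lines of floating-point arithmetic. The sketch below is only a sanity check under stated assumptions: it takes $J = j^* + j_*$ in the form read off from the display in Step 1, finds $5_*$ by bisection (an illustrative choice of root finder), and evaluates $V$, $W$ and $W'$ at $f_1 = 5 - \log 5$. All function names are hypothetical.

```python
import math

def f(h):
    """f = h - log h, the quantity called f(h) in the text."""
    return h - math.log(h)

def lower_branch(target, lo=1e-12, hi=1.0, iters=200):
    """Solve h - log h = target for h in (0, 1) by bisection (f is decreasing there)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid      # f too large means h is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

def check_lemma_2_7(lam=0.098, h_star=5.0):
    """Evaluate the three conditions of Proposition 2.6 at f1 = f(h_star)."""
    f1 = f(h_star)
    h_low = lower_branch(f1)                       # the value written 5_* in the text
    J = 1 / (h_star - 1) + 1 / (1 - h_low) - math.sqrt(2 / (f1 - 1))
    lam_bound = 2 * J / (3 * (f1 - 1))             # right-hand side of condition (a)
    V = math.pi / math.sqrt(2) + lam * (f1 - 1) ** 1.5
    W = V * V - math.log(4 * V * V) - f1           # condition (b) needs W > 0
    dW = 3 * lam * (V - 1 / V) * math.sqrt(f1 - 1) - 1   # condition (c) needs W' < 0
    return {"h_low": h_low, "J": J, "lam_bound": lam_bound, "V": V, "W": W, "dW": dW}
```

Because the margin in condition (b) is small, floating-point evaluation is only indicative here; the proof itself relies on the explicit bounds $2.58366 < V < 2.584$.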

Proof. By Lemma 2.7, if $1 < f_0 < f_1 = 5 - \log 5$, then $\varepsilon(\beta) > 0$.

Now we will investigate more precisely how $b(f_0)$ depends on $f_0$ as $f_0 \to \infty$. We will use the fact that we only have positive Taylor coefficients in the expansion
\[
(1 - x)^{-1/2} = \sum_{k=0}^{\infty} \gamma_k x^k, \qquad \gamma_k = \frac{(2k)!}{2^{2k} (k!)^2} \sim \frac{1}{\sqrt{\pi k}}.
\]
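As a quick illustration of this expansion, the coefficients $\gamma_k$ can be computed exactly as ratios of integers. The sketch below (function names are illustrative) evaluates a partial sum at a sample point and compares $\gamma_k$ with the stated asymptotics.

```python
import math

def gamma(k):
    """Taylor coefficient of (1 - x)^(-1/2): (2k)! / (2^(2k) (k!)^2)."""
    return math.comb(2 * k, k) / 4 ** k

def partial_sum(x, terms):
    """Sum of gamma_k x^k for k = 0, ..., terms - 1."""
    return sum(gamma(k) * x ** k for k in range(terms))

def asymptotic_ratio(k):
    """gamma_k * sqrt(pi k), which should tend to 1 as k grows."""
    return gamma(k) * math.sqrt(math.pi * k)
```

All coefficients are positive, which is what allows the term-by-term integration in (2.35) to be justified by monotone convergence.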

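From (2.5) and the fact that $\phi$ is even and periodic with period $b$, we have
\[
\int_0^{b/2} (1 - e^{\phi})\, dy = 0.
\]
Hence using (2.10) and setting $h = e^{\phi}$, we get
\[
\beta = \frac{\sqrt{\pi}\, b}{2} = \sqrt{\pi} \int_0^{b/2} e^{\phi}\, dy = \frac12 \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{e^{\phi}\, d\phi}{\sqrt{f_0 - f(\phi)}} = \frac12 \int_{h_*(f_0)}^{h^*(f_0)} \frac{dh}{\sqrt{f_0 - (h - \log h)}}. \tag{2.33}
\]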


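Using the notation of (2.14), (2.15) and writing $h_0 = h^*(f_0)$ and $h_{0*} = h_*(f_0)$, so $f_0 = h_0 - \log h_0 = h_{0*} - \log h_{0*}$, and setting $t = 1 - h/h_0$, we get
\[
\begin{aligned}
\beta &= \frac12 \int_{h_{0*}}^{h_0} \left( h_0 - h + \log \frac{h}{h_0} \right)^{-1/2} dh
= \frac{h_0}{2} \int_0^{1 - h_{0*}/h_0} \left( h_0 t + \log(1 - t) \right)^{-1/2} dt \\
&= \frac{h_0}{2(h_0 - 1)^{1/2}} \int_0^{1 - h_{0*}/h_0} t^{-1/2} \left( 1 - \frac{-\log(1 - t) - t}{(h_0 - 1)\, t} \right)^{-1/2} dt
\end{aligned}
\tag{2.34}
\]
\[
\begin{aligned}
\phantom{\beta} &= \frac{h_0}{2(h_0 - 1)^{1/2}} \sum_{k=0}^{\infty} \frac{\gamma_k}{(h_0 - 1)^k} \int_0^{1 - h_{0*}/h_0} t^{-1/2} \left( \frac{-\log(1 - t) - t}{t} \right)^{k} dt \\
&= \frac{h_0}{(h_0 - 1)^{1/2}} \sum_{k=0}^{\infty} \frac{\gamma_k\, \mu_k(1 - h_{0*}/h_0)}{(h_0 - 1)^k},
\end{aligned}
\tag{2.35}
\]
where the series converges by monotone convergence, and
\[
\mu_k(\tau) = \frac12 \int_0^{\tau} t^{-1/2} \left( \frac{-\log(1 - t) - t}{t} \right)^{k} dt. \tag{2.36}
\]
Clearly $\mu_k(\tau)$ is an increasing function of $\tau$ which is strictly positive for $\tau \in (0, 1]$. Now
\[
\beta^2 = \frac{h_0^2}{h_0 - 1} \sum_{k=0}^{\infty} \frac{\nu_k(1 - h_{0*}/h_0)}{(h_0 - 1)^k}, \qquad \nu_k(\tau) = \sum_{j=0}^{k} \gamma_j \gamma_{k-j}\, \mu_j(\tau)\, \mu_{k-j}(\tau). \tag{2.37}
\]
Clearly $\nu_j(\tau)$ is also positive. It is easy to compute
\[
\mu_0(\tau) = \tau^{1/2}, \qquad \nu_0(\tau) = \tau.
\]
Hence just taking the first term in (2.37) gives
\[
\beta^2 > \frac{h_0^2\, \nu_0(1 - h_{0*}/h_0)}{h_0 - 1} = \frac{h_0 (h_0 - h_{0*})}{h_0 - 1} > h_0. \tag{2.38}
\]
Now applying the Mean Value Theorem to the function $w \mapsto w - \log w$, we have
\[
\begin{aligned}
\beta^2 - \log(\beta^2) - (h_0 - \log h_0) &\ge \frac{h_0 - 1}{h_0} (\beta^2 - h_0) = \sum_{k=0}^{\infty} \frac{h_0\, \nu_k(1 - h_{0*}/h_0)}{(h_0 - 1)^k} - h_0 + 1 \\
&\ge 1 - h_{0*} + \frac{h_0\, \nu_1(1 - h_{0*}/h_0)}{h_0 - 1} + \frac{h_0\, \nu_2(1 - h_{0*}/h_0)}{(h_0 - 1)^2}.
\end{aligned}
\tag{2.39}
\]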


Lemma 2.9. If $h_0 \ge 5$ then
\[
1 - h_{0*} + \frac{h_0\, \nu_1(1 - h_{0*}/h_0)}{h_0 - 1} > 2 \log 2, \tag{a}
\]
and
\[
\frac{h_0\, \nu_2(1 - h_{0*}/h_0)}{(h_0 - 1)^2} > \frac{0.03}{\beta^2}, \tag{b}
\]
so
\[
\varepsilon(\beta) = \beta^2 - \log(4\beta^2) - f_0 > \frac{0.03}{\beta^2}. \tag{c}
\]

Proof. (a) Now evaluating (2.36) for $k = 1$,
\[
\mu_1(\tau) = 2 \log(1 + \tau^{1/2}) - \tau^{1/2} + (\tau^{-1/2} - 1) \log(1 - \tau), \tag{2.40}
\]
so we have
\[
\nu_1(\tau) = \mu_0(\tau)\mu_1(\tau) = 2\tau^{1/2} \log(1 + \tau^{1/2}) - \tau + (1 - \tau^{1/2}) \log(1 - \tau). \tag{2.41}
\]
We will estimate the terms on the right hand side. Since $\log(1 - \tau) < 0$, we have
\[
(1 - \tau^{1/2}) \log(1 - \tau) > (1 - \tau) \log(1 - \tau),
\]
and by the concavity of the logarithm we have $\log(1 + x) > x \log 2$ for $0 < x < 1$, and so
\[
\tau^{1/2} \log(1 + \tau^{1/2}) > \tau \log 2.
\]
Hence substituting these inequalities into (2.41),
\[
\nu_1(\tau) > \tau (2 \log 2 - 1) + (1 - \tau) \log(1 - \tau),
\]
and writing $1 - \tau = h_{0*}/h_0$ we have
\[
\begin{aligned}
1 - h_{0*} + \frac{h_0\, \nu_1(1 - h_{0*}/h_0)}{h_0 - 1}
&> 1 - h_{0*} + \frac{h_0}{h_0 - 1} \left( (2 \log 2 - 1) \left( 1 - \frac{h_{0*}}{h_0} \right) + \frac{h_{0*}}{h_0} \log \frac{h_{0*}}{h_0} \right) \\
&= 2 \log 2 + \frac{(2 \log 2 - 1) + 2h_{0*}(1 - \log 2) - h_{0*}\left( h_0 + \log(h_0/h_{0*}) \right)}{h_0 - 1},
\end{aligned}
\]
and so (a) holds provided
\[
2 \log 2 - 1 + 2h_{0*}(1 - \log 2) - h_{0*}\left( h_0 + \log(h_0/h_{0*}) \right) \ge 0, \quad \text{for } h_0 > 5, \tag{2.42}
\]
which certainly follows if we can show
\[
h_{0*}\left( h_0 + \log h_0 + \log 1/h_{0*} \right) < (2 \log 2 - 1), \quad \text{for } h_0 > 5. \tag{2.43}
\]


We first remark that $0.035 - \log 0.035 < 5 - \log 5$, and hence if $h_0 > 5$, then $h_{0*} < 0.035 < \exp(-1)$. But then for $h_{0*} < 0.035$, we have that $h_{0*} \log(1/h_{0*})$ increases with $h_{0*}$ and hence decreases with $h_0$. Moreover, $h_0 - 1 > 4 > 1 - h_{0*}$, so the functions $h_0 \mapsto h_{0*} h_0$ and $h_0 \mapsto h_{0*} \log h_0$ are also decreasing with $h_0$, as can be checked by differentiating with respect to $f_0$. For example
\[
\frac{1}{h_{0*} h_0} \frac{d(h_{0*} h_0)}{df_0} = \frac{1}{h_0 - 1} - \frac{1}{1 - h_{0*}} < 0.
\]
Hence the left hand side of (2.43) is decreasing with $h_0$, and so bounded above by
\[
0.035\,(5 + \log 5 + \log 1/0.035) = 0.34866\ldots < 0.386294\ldots = 2 \log 2 - 1,
\]
and (2.43) holds, so (a) holds.

(b) Now for $\tau > 0$,
\[
\nu_2(\tau) = \frac{(\mu_1(\tau))^2}{4} + \frac{3\mu_0(\tau)\mu_2(\tau)}{4} > \frac{(\mu_1(\tau))^2}{4}.
\]
Hence for $h_0 \ge 5$, we have $h_{0*} < 0.035$ and
\[
\frac{(\mu_1(1 - h_{0*}/h_0))^2}{4} \ge \frac{(\mu_1(1 - 0.035/5))^2}{4} = 0.0340\ldots > 0.03.
\]
Hence
\[
\frac{h_0\, \nu_2(1 - h_{0*}/h_0)}{(h_0 - 1)^2} > \frac{0.03}{h_0 - 1} \ge \frac{0.03}{\beta^2}.
\]

(c) Follows by substituting (a) and (b) into (2.39).
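The closed form (2.40) and the two numerical constants used in the proof of Lemma 2.9 can be checked independently. The sketch below is illustrative only: it evaluates $\mu_k$ from (2.36) by a midpoint rule after the substitution $t = u^2$ (a choice made here to remove the $t^{-1/2}$ singularity), and compares the result with (2.40). Function names and the number of quadrature nodes are assumptions of this example.

```python
import math

def mu(k, tau, n=20000):
    """mu_k(tau) from (2.36), after substituting t = u^2 so the integrand is smooth."""
    b = math.sqrt(tau)
    step = b / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * step                      # midpoint rule avoids u = 0
        t = u * u
        total += ((-math.log1p(-t) - t) / t) ** k
    return total * step

def mu1_closed(tau):
    """Closed form (2.40) for mu_1."""
    r = math.sqrt(tau)
    return 2 * math.log(1 + r) - r + (1 / r - 1) * math.log(1 - tau)

def lhs_2_43_bound():
    """The upper bound 0.035 (5 + log 5 + log 1/0.035) used for (2.43)."""
    return 0.035 * (5 + math.log(5) + math.log(1 / 0.035))
```

With these helpers, `mu1_closed(1 - 0.035 / 5) ** 2 / 4` gives the constant compared with 0.03 in part (b), and `lhs_2_43_bound()` gives the quantity compared with $2 \log 2 - 1$ in part (a).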


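Proposition 2.10.
\[
\varepsilon(\beta) = O(\beta^{-2}), \quad \text{as } \beta \to \infty.
\]

Proof. We will prove this by bounding the error when we approximate the series in (2.35) by the partial sums. Indeed, we show that there exists a constant $C(K)$ independent of $h_0$ such that
\[
\left| \beta - \frac{h_0}{(h_0 - 1)^{1/2}} \sum_{k=0}^{K} \frac{\gamma_k\, \mu_k(1)}{(h_0 - 1)^k} \right| \le \frac{C(K)}{h_0^{K + 1/2}}, \quad \text{for } h_0 > 2. \tag{2.44}
\]
In fact, what we show is
\[
\left| \beta - \frac{h_0}{(h_0 - 1)^{1/2}} \sum_{k=0}^{K} \frac{\gamma_k\, \mu_k(1)}{(h_0 - 1)^k} \right| \le \frac{C(K) (\log h_0)^K}{h_0^{K - 1/2}}, \quad \text{for } h_0 > 2. \tag{2.45}
\]
By applying (2.45) with $K$ replaced by $K + 2$, we get (2.44).

Notation. Suppose $h = (h_1, \ldots, h_p)$ and $k = (k_1, \ldots, k_q)$ are variables taking values in $U \subset \mathbb{R}^p$ and $V \subset \mathbb{R}^q$ respectively, and suppose that $F_1$ and $F_2$ are two functions of $(h, k)$. Then we write
\[
F_1 \le_k F_2,
\]
if for every $k \in V$, there exists a constant $C(k) < \infty$, such that
\[
F_1(h, k) \le C(k)\, F_2(h, k) \quad \text{for all } h \in U.
\]

Now we prove (2.45). We first remark that
\[
\left| (1 - x)^{-1/2} - \sum_{k=0}^{K-1} \gamma_k x^k \right| \le_k (1 - x)^{-1/2} x^K, \quad \text{for } 0 \le x < 1.
\]
Hence from (2.34), writing $t = 1 - h/h_0$, we have that for $h_0 > 2$,
\[
\begin{aligned}
\left| \beta - \frac{h_0}{(h_0 - 1)^{1/2}} \sum_{k=0}^{K-1} \frac{\gamma_k\, \mu_k(1 - h_{0*}/h_0)}{(h_0 - 1)^k} \right|
&\le_k \frac{h_0}{(h_0 - 1)^{K + 1/2}} \int_0^{1 - h_{0*}/h_0} \left( 1 - \frac{-\log(1 - t) - t}{(h_0 - 1)\, t} \right)^{-1/2} t^{-1/2} \left( \frac{-\log(1 - t) - t}{t} \right)^{K} dt \\
&= \frac{1}{(h_0 - 1)^K} \int_{h_{0*}}^{h_0} \left( h_0 - \log h_0 - (h - \log h) \right)^{-1/2} \left( \frac{-\log(1 - t) - t}{t} \right)^{K} dh.
\end{aligned}
\tag{2.46}
\]
We split into two cases. The function
\[
\frac{-\log(1 - t) - t}{t} = \sum_{k=1}^{\infty} \frac{t^k}{k + 1}, \quad 0 < t < 1,
\]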


is increasing with $t$, so decreasing with $h$. Hence for $h_0 > h$ and $h_0 > 2$, we have
\[
\frac{-\log(1 - t) - t}{t} = \frac{\log(h_0/h)}{1 - h/h_0} - 1 <
\begin{cases}
\frac83 \log h_0, & h > 1/h_0, \\[2pt]
-\frac83 \log h, & h \le 1/h_0.
\end{cases}
\]
Hence the right-hand side of (2.46) is bounded up to a constant $C(K)$ by
\[
\frac{(\log h_0)^K}{(h_0 - 1)^K} \int_{h_{0*}}^{h_0} \left( h_0 - \log h_0 - (h - \log h) \right)^{-1/2} dh
+ \frac{1}{(h_0 - 1)^K} \int_{h_{0*}}^{1/h_0} \left( h_0 - \log h_0 - (h - \log h) \right)^{-1/2} (-\log h)^K\, dh. \tag{2.47}
\]
Using Corollary 2.5, for $h_0 > 2$, the first term in (2.47) is equal to
\[
\frac{2 (\log h_0)^K \beta}{(h_0 - 1)^K} \le_k \frac{(\log h_0)^K}{(h_0 - 1)^K} \left( \frac{\pi}{2^{1/2}} + (f_0 - 1)^{1/2} \right) \le_k \frac{(\log h_0)^K}{(h_0 - 1)^{K - 1/2}}.
\]
To bound the second term in (2.47), we change variables to $f = h - \log h$ to get the bound
\[
\frac{1}{(h_0 - 1)^K} \int_{\log h_0 + 1/h_0}^{f_0} (f_0 - f)^{-1/2} \left( -\log h_*(f) \right)^K \frac{h_*(f)}{1 - h_*(f)}\, df. \tag{2.48}
\]
But $-\log h_*(f) = f - h_*(f) < f + 1$, and for $f > \log(2/\log 2)$ we have $h_*(f) \le 2e^{-f}$. Hence (2.48) is bounded up to a constant $C(K)$ by
\[
\frac{1}{(h_0 - 1)^K} \int_0^{f_0} (f_0 - f)^{-1/2} e^{-f} f^K\, df.
\]
But the integral here is uniformly bounded in $f_0$, so the second term in (2.47) is bounded up to $C(K)$ by
\[
\frac{1}{(h_0 - 1)^K}.
\]
So far we have bounded the left hand side of (2.46) by the right hand side of (2.45). To complete the proof of (2.45) we just have to show that for $h_0 > 2$,
\[
\left| \mu_k(1 - h_{0*}/h_0) - \mu_k(1) \right| \le_{k,K} \frac{1}{h_0^K}.
\]
However, the left hand side equals
\[
\frac12 \int_{1 - h_{0*}/h_0}^{1} t^{-1/2} \left( \frac{-\log(1 - t) - t}{t} \right)^{k} dt \le_k \int_0^{h_{0*}/h_0} |\log s|^k\, ds \le_k \frac{h_{0*}}{h_0} \left| \log h_0 - \log h_{0*} \right|^k = \frac{h_{0*}}{h_0} (h_0 - h_{0*})^k \le_k e^{-f_0} h_0^{k-1} \le_{k,K} \frac{1}{h_0^K}.
\]


This completes the proof of (2.45). From this we get the asymptotic formula
\[
\beta^2 \sim \frac{h_0^2}{h_0 - 1} \sum_{k=0}^{\infty} \frac{\nu_k(1)}{(h_0 - 1)^k},
\]
where $\nu_k$ is defined in (2.37), in the sense that for $h_0 > 2$,
\[
\left| \beta^2 - \frac{h_0^2}{h_0 - 1} \sum_{k=0}^{K} \frac{\nu_k(1)}{(h_0 - 1)^k} \right| \le_k \frac{1}{h_0^K}.
\]
Thus
\[
\beta^2 = \frac{h_0^2}{h_0 - 1} \left( 1 + \frac{2 \log 2 - 1}{h_0 - 1} \right) + O(h_0^{-1}) = h_0 + 2 \log 2 + O(h_0^{-1}). \tag{2.49}
\]
From this we see that
\[
\beta^2 - \log(\beta^2) - (h_0 - \log h_0) - 2 \log 2 = O(h_0^{-1}) = O(\beta^{-2}).
\]
This completes the proof of Proposition 2.10.

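Proposition 2.11.
\[
M = -\frac{\beta^2}{3} + \log(4\beta^2) - 1 + O(\beta^{-2}) \quad \text{as } \beta \to \infty.
\]

Proof. From (2.29), we have
\[
\begin{aligned}
M &= \frac{1}{4\beta} \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{\phi (e^{\phi} + 1)}{\sqrt{f_0 - f(\phi)}}\, d\phi \\
&= \frac{1}{3\beta} \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{(\phi - \log h_0) e^{\phi}}{\sqrt{f_0 - f(\phi)}}\, d\phi
+ \frac{\log h_0}{3\beta} \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{e^{\phi}}{\sqrt{f_0 - f(\phi)}}\, d\phi
+ \frac{1}{12\beta} \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{\phi (3 - e^{\phi})}{\sqrt{f_0 - f(\phi)}}\, d\phi \\
&= \frac{1}{3\beta} \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{(\phi - \log h_0) e^{\phi}}{\sqrt{f_0 - f(\phi)}}\, d\phi + \frac23 \log h_0 + \frac{1 - f_0}{3}.
\end{aligned}
\tag{2.50}
\]
The third line here follows from (2.33) and the second equality in Proposition 2.4(b). Now we change variables to $h = e^{\phi}$ so $f = e^{\phi} - \phi = h - \log h$, and set $h_0 = h^*(f_0)$ and $h_{0*} = h_*(f_0)$. Then define
\[
N := \frac12 \int_{\phi_*(f_0)}^{\phi^*(f_0)} \frac{(\phi - \log h_0) e^{\phi}}{\sqrt{f_0 - f(\phi)}}\, d\phi = \frac12 \int_{h_{0*}}^{h_0} \frac{\log h - \log h_0}{\sqrt{f_0 - f}}\, dh. \tag{2.51}
\]
We follow the argument of (2.34)-(2.35) with $\beta$ replaced by (2.51) to get
\[
\begin{aligned}
N &= \frac{h_0}{2(h_0 - 1)^{1/2}} \int_0^{1 - h_{0*}/h_0} (\log(1 - t))\, t^{-1/2} \left( 1 - \frac{-\log(1 - t) - t}{(h_0 - 1)\, t} \right)^{-1/2} dt \\
&= \frac{h_0}{(h_0 - 1)^{1/2}} \sum_{k=0}^{\infty} \frac{\gamma_k\, \kappa_k(1 - h_{0*}/h_0)}{(h_0 - 1)^k},
\end{aligned}
\]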

where
\[
\kappa_k(\tau) = \frac12 \int_0^{\tau} (\log(1 - t))\, t^{-1/2} \left( \frac{-\log(1 - t) - t}{t} \right)^{k} dt.
\]

Moreover, following the proof of (2.44)-(2.45), we conclude that for $h_0 > 2$,
\[
\left| N - \frac{h_0}{(h_0 - 1)^{1/2}} \sum_{k=0}^{K-1} \frac{\gamma_k\, \kappa_k(1)}{(h_0 - 1)^k} \right| \le_k \frac{1}{h_0^{K - 1/2}}.
\]
Now $\kappa_0(1) = 2 \log 2 - 2$, and so in particular, using (2.49),
\[
N = \kappa_0 h_0^{1/2} + O(h_0^{-1/2}) = (2 \log 2 - 2)\beta + O(\beta^{-1}), \quad \text{as } \beta \to \infty.
\]

Substituting this into (2.50) and using (2.49), we see that as $\beta \to \infty$ we have
\[
M = \frac{-\beta^2 + \log(4\beta^2) + 1}{3} + \frac{2(\log 4 - 2)}{3} + \frac{2 \log(\beta^2)}{3} + O(\beta^{-2}) = -\frac{\beta^2}{3} + \log(4\beta^2) - 1 + O(\beta^{-2}).
\]

Lemma 2.12. Suppose that $C > 0$ and $\beta_1 > 1/\sqrt{2}$ are constants and that the formula
\[
-4 \sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right) < \frac{C}{\beta^2}, \tag{2.52}
\]

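holds for $\beta = \beta_1$. Then it holds for all $\beta \ge \beta_1$.

Proof. Define
\[
\omega(\beta) = -4 \sum_{n=1}^{\infty} \log\left( 1 - e^{-2n\beta^2} \right), \quad \beta > 0,
\]
and
\[
\psi(\beta) = \frac{C}{\beta^2} - \omega(\beta).
\]
Then $\omega$ is positive and smooth, and
\[
-\omega'(\beta) = 16\beta \sum_{n=1}^{\infty} \frac{n\, e^{-2n\beta^2}}{1 - e^{-2n\beta^2}} > 16\beta \sum_{n=1}^{\infty} \frac{e^{-2n\beta^2}}{1 - e^{-2n\beta^2}} > 4\beta\, \omega(\beta).
\]
Suppose that (2.52) fails, that is $\psi(\beta) \le 0$, for some $\beta_2 > \beta_1$. Then we can choose $\beta_2 > \beta_1$ minimal such that this is the case, and clearly $\psi(\beta_2) = 0$. But then
\[
\psi'(\beta_2) = -\frac{2C}{\beta_2^3} - \omega'(\beta_2) = -\frac{2\omega(\beta_2)}{\beta_2} - \omega'(\beta_2) \ge \omega(\beta_2) \left( -\frac{2}{\beta_2} + 4\beta_2 \right).
\]
But $\beta_2 > 1/\sqrt{2}$, so the right-hand side is positive and so $\psi(\beta) < 0$ for some $\beta$ with $\beta_1 < \beta < \beta_2$, which is a contradiction.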
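The differential inequality $-\omega'(\beta) > 4\beta\,\omega(\beta)$ that drives the proof of Lemma 2.12 is easy to test numerically. The sketch below truncates both series after a fixed number of terms (an assumption of this example, harmless because the terms decay exponentially) and also compares the derivative series with a finite difference. Function names are illustrative.

```python
import math

def omega(beta, terms=200):
    """omega(beta) = -4 * sum_{n>=1} log(1 - exp(-2 n beta^2)), truncated."""
    return -4 * sum(math.log1p(-math.exp(-2 * n * beta * beta)) for n in range(1, terms + 1))

def minus_omega_prime(beta, terms=200):
    """-omega'(beta) = 16 beta * sum n x_n / (1 - x_n), with x_n = exp(-2 n beta^2)."""
    total = 0.0
    for n in range(1, terms + 1):
        x = math.exp(-2 * n * beta * beta)
        total += n * x / (1 - x)
    return 16 * beta * total

def finite_difference(beta, eps=1e-6):
    """Central-difference approximation of omega'(beta)."""
    return (omega(beta + eps) - omega(beta - eps)) / (2 * eps)
```

The second inequality in the chain uses $-\log(1 - x) < x/(1 - x)$ for $0 < x < 1$, applied termwise with $x = e^{-2n\beta^2}$.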


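Appendix. Explicit formulas for the Flat Torus and the Round Sphere

Lemma A.1. Let $T = \mathbb{C}/\Lambda$ be a torus of area 1, where $\Lambda$ is a lattice, and let $u$ and $v$ be the generators of the dual lattice $\Lambda^*$ and set $z = v/u$. Then for the flat metric $g_0$ on $T$,
\[
\operatorname{trace} \Delta^{-1}_{g_0} = -\frac{\log 2\pi}{2\pi} - \frac{\log(|\eta(z)|^4 / |u|^2)}{4\pi}, \tag{A.1}
\]
where the Dedekind eta function $\eta$ is defined by
\[
\eta(z) = e^{\pi i z/12} \prod_{n=1}^{\infty} (1 - e^{2\pi i n z}). \tag{A.2}
\]
On the other hand,
\[
\operatorname{trace} \Delta^{-1}_{S^2, 1} = -\frac{1}{4\pi} - \frac{\log \pi}{4\pi}, \tag{A.3}
\]
and so
\[
\operatorname{trace} \Delta^{-1}_{g_0} - \operatorname{trace} \Delta^{-1}_{S^2, 1} = \frac{1}{4\pi} \left( -\log(|\eta(z)|^4 / |u|^2) - \log 4\pi + 1 \right). \tag{A.4}
\]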

When $\Lambda$ has generators $(1/b, a + bi)$ with $a, b \in \mathbb{R}$, we can choose $(u, v) = (-i/b, b - ai)$ and then (A.4) becomes (1.12).

Remark. The quantity $\log(|\eta(z)|^4 / |u|^2)$ was shown in [OsPS] to be maximized at the hexagonal torus, for which $\log(|\eta(z)|^4 / |u|^2) = -1.0335\ldots$. Hence the hexagonal torus minimizes $\operatorname{trace} \Delta^{-1}$ among flat tori of a given area.

Proof. Now
\[
\Lambda^* = \{ \mu \in \mathbb{C} : \operatorname{Re}(\mu \bar\lambda) \in \mathbb{Z} \text{ for all } \lambda \in \Lambda \}.
\]
The eigenfunctions of the Laplacian on $T = \mathbb{C}/\Lambda$ have the form
\[
f(z) = e^{2\pi i \operatorname{Re}(\mu \bar z)}, \quad \text{for } \mu \in \Lambda^*.
\]
The corresponding eigenvalue is $(2\pi)^2 |\mu|^2$. Consider the Epstein zeta function
\[
Z_T(s) = \sum_{\mu \in \Lambda^* - 0} \frac{1}{(2\pi |\mu|)^{2s}}.
\]
Kronecker’s First Limit Formula states that
\[
(2\pi)^{2s} Z_T(s) = \frac{\pi}{s - 1} + 2\pi \left( -\Gamma'(1) - \log 2 - \log |\eta(z)|^2 \right) + O(s - 1).
\]
Hence
\[
Z_T(s) = \frac{1}{4\pi (s - 1)} + Z_T^1 + O(s - 1), \qquad Z_T^1 = \frac{1}{2\pi} \left( -\Gamma'(1) - \log(4\pi) - \log |\eta(z)|^2 \right).
\]


But $Z_T^1$ is a different regularization of the trace of $\Delta^{-1}$, and it can be shown that this differs from our Green function regularization $\operatorname{trace} \Delta^{-1}$ by a universal constant:
\[
\operatorname{trace} \Delta^{-1}_{g_0} = Z_T^1 + \frac{\log 2}{2\pi} + \frac{\Gamma'(1)}{2\pi}, \tag{A.5}
\]
see [M2,S1,S2], or [O2] (A.6). Evaluating (A.5) we get (A.1). Formula (A.3) is well known. Indeed, on the round 2-sphere of area $4\pi$ given by $x^2 + y^2 + z^2 = 1$, the Green function $G(p, q)$ can be written in terms of the distance $r$ from $p$ to $q$, as
\[
G(p, q) = -\frac{1}{2\pi} \log |\sin r/2| - \frac{1}{4\pi}.
\]
This gives the Robin mass
\[
m_{S^2, 4\pi} = \frac{\log 2}{2\pi} - \frac{1}{4\pi},
\]
and combining this with (1.3) gives
\[
m_{S^2, 1} = m_{S^2, 4\pi} - \frac{\log 4\pi}{4\pi} = -\frac{1}{4\pi} - \frac{\log \pi}{4\pi}.
\]

Acknowledgement. The author is extremely grateful to the referee for providing helpful comments and pointing out several results related to this work, in particular [LL1,LL2] which cite [DJLW] and [NT]. The author would like to thank the University of Pennsylvania for their hospitality.

References

[A] Arnold, V.: Ordinary differential equations. Universitext. Berlin: Springer-Verlag, 2006
[B] Beckner, W.: Sharp Sobolev inequalities on the sphere and the Moser-Trudinger inequality. Ann. Math. 138, 213–242 (1993)
[CL] Carlen, E., Loss, M.: Competing symmetries, the logarithmic HLS inequality and Onofri’s inequality on $S^n$. Geom. Funct. Anal. 2, 90–104 (1992)
[Ch] Chang, S.-Y.A.: Conformal invariants and partial differential equations. Bull. Amer. Math. Soc. 42, 365–393 (2005)
[Chi] Chicone, C.: The monotonicity of the period function for planar Hamiltonian vector fields. J. Diff. Eqs. 69(3), 310–321 (1987)
[ChiJ] Chicone, C., Jacobs, M.: Bifurcation of limit cycles from quadratic isochrones. J. Diff. Eqs. 91(2), 268–326 (1991)
[Chiu] Chiu, P.: Height of flat tori. Proc. Amer. Math. Soc. 125, 723–730 (1997)
[DJLW] Ding, W., Jost, J., Li, J., Wang, G.: The differential equation $\Delta u = 8\pi - 8\pi h e^u$ on a compact Riemann surface. Asian J. Math. 1, 230–248 (1997)
[DS1] Doyle, P., Steiner, J.: Spectral invariants and playing hide and seek on surfaces. Preprint, available at http://www.cims.ngu.edu/nsteiner/hideandseet.pdf
[DS2] Doyle, P., Steiner, J.: Blowing bubbles on the torus. Preprint, available at http://www.cims.ngu.edu/ steiner/torus.pdf
[LL1] Lin, C.-S., Lucia, M.: Uniqueness of solutions for a mean field equation on torus. J. Diff. Eqs. 229(1), 172–185 (2006)
[LL2] Lin, C.-S., Lucia, M.: One-dimensional symmetry of periodic minimizers for a mean field equation. Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 6(2), 269–290 (2007)
[M1] Morpurgo, C.: The logarithmic Hardy-Littlewood-Sobolev inequality and extremals of zeta functions on $S^n$. Geom. Funct. Anal. 6, 146–171 (1996)
[M2] Morpurgo, C.: Zeta functions on $S^2$. Extremal Riemann surfaces (San Francisco, 1995), Contemp. Math. 201, Providence, RI: Amer. Math. Soc., 1997, pp. 213–225
[M3] Morpurgo, C.: Sharp inequalities for functional integrals and traces of conformally invariant operators. Duke Math. J. 114, 477–553 (2002)
[NT] Nolasco, M., Tarantello, G.: On a sharp Sobolev-type inequality on two-dimensional compact manifolds. Arch. Ration. Mech. Anal. 145, 161–195 (1998)
[O1] Okikiolu, K.: Hessians of spectral zeta functions. Duke Math. J. 124, 517–570 (2004)
[O2] Okikiolu, K.: Extremals for logarithmic HLS inequalities on compact manifolds. GAFA 107(5), 1655–1684 (2008)
[OW] Okikiolu, K., Wang, C.: Hessian of the zeta function for the Laplacian on forms. Forum Math. 17, 105–131 (2005)
[On] Onofri, E.: On the positivity of the effective action in a theory of random surfaces. Comm. Math. Phys. 86, 321–326 (1982)
[OsPS] Osgood, B., Phillips, R., Sarnak, P.: Extremals of determinants of Laplacians. J. Funct. Anal. 80, 148–211 (1988)
[S1] Steiner, J.: Green’s Functions, Spectral Invariants, and a Positive Mass on Spheres. Ph. D. Dissertation, University of California San Diego, June 2003
[S2] Steiner, J.: A geometrical mass and its extremal properties for metrics on $S^2$. Duke Math. J. 129, 63–86 (2005)
[T] Terras, A.: Harmonic analysis and symmetric spaces and applications I. Berlin: Springer-Verlag, 1988

Communicated by P. Sarnak

Commun. Math. Phys. 284, 803–831 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0640-0

Communications in

Mathematical Physics

Justification of the Lattice Equation for a Nonlinear Elliptic Problem with a Periodic Potential Dmitry Pelinovsky1 , Guido Schneider2 , Robert S. MacKay3 1 Department of Mathematics, McMaster University, Hamilton, ON L8S 4K1, Canada.

E-mail: [email protected]

2 Institut für Analysis, Dynamik und Modellierung, Universität Stuttgart, Pfaffenwaldring 57,

D-70569 Stuttgart, Germany

3 Mathematics Institute, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK

Received: 29 July 2007 / Accepted: 16 June 2008 Published online: 8 October 2008 – © Springer-Verlag 2008

Abstract: We justify the use of the lattice equation (the discrete nonlinear Schrödinger equation) for the tight-binding approximation of stationary localized solutions in the context of a continuous nonlinear elliptic problem with a periodic potential. We rely on properties of the Floquet band-gap spectrum and the Fourier–Bloch decomposition for a linear Schrödinger operator with a periodic potential. Solutions of the nonlinear elliptic problem are represented in terms of Wannier functions and the problem is reduced, using elliptic theory, to a set of nonlinear algebraic equations solvable with the Implicit Function Theorem. Our analysis is developed for a class of piecewise-constant periodic potentials with disjoint spectral bands, which reduce, in a singular limit, to a periodic sequence of infinite walls of a non-zero width. The discrete nonlinear Schrödinger equation is applied to classify localized solutions of the Gross–Pitaevskii equation with a periodic potential.

1. Introduction

Recent experimental and theoretical works on Bose–Einstein condensates in optical lattices [18], coherent structures in photorefractive lattices [3], and gap solitons in photonic crystals [22] have stimulated a new wave of interest in localized solutions of the nonlinear elliptic problem with a periodic potential
\[
-\phi''(x) + V(x)\phi(x) + \sigma |\phi(x)|^2 \phi(x) = \omega \phi(x), \quad \forall x \in \mathbb{R}, \tag{1.1}
\]

where φ : R → C decays to zero sufficiently fast as |x| → ∞, V : R → R is a bounded 2π -periodic function, σ = ±1 is normalized, and ω is a free parameter. Solutions of the nonlinear elliptic problem (1.1) correspond to stationary (time-periodic) solutions of Hamiltonian dynamical systems such as the Gross–Pitaevskii equation. Localized solutions φ(x) of the elliptic problem (1.1) in energy space H 1 (R) have been proved to exist for ω in every bounded gap in the spectrum of the linear Schrödinger


operator $L = -\partial_x^2 + V(x)$, as well as in the semi-infinite gap for $\sigma = -1$ [13]. It is, however, desired to obtain more precise information on classification and properties of localized solutions. The number of branches of localized solutions, counted modulo a discrete group of translations with a $2\pi$-multiple period, is infinite even in one dimension, and the solutions can be classified, for instance, by the number of peaks in different wells of the periodic potentials. To approximate the solution shape by analytic functions or numerically, various asymptotic reductions of the nonlinear elliptic problem (1.1) have been used [14]. The asymptotic reductions are formally derived for bifurcations of small-amplitude localized solutions from the zero solution $\phi = 0$. Recent rigorous results on these bifurcations include justification of the time-dependent nonlinear Schrödinger equation for pulses near a band edge of the spectrum of $L$ [2], analysis of the coupled nonlinear Schrödinger equations for a finite-amplitude two-dimensional separable potential [6], and justification of the coupled-mode system for a small-amplitude one-dimensional potential [17]. This paper addresses the tight-binding approximation of localized solutions outside a narrow band in the spectrum of $L$. Although the tight-binding approximation has been used by physicists for a long time, it was only recently that this approximation was formalized by means of Wannier function decompositions [1]. The rigorous analysis of the “averaged procedure” announced in [1] as Ref. [15] appears not to have been written. Moreover, as we show in this paper, one of the claims of [1] (the coupled lattice equations (12) for inter-band interactions) cannot be verified in the context of the nonlinear elliptic problem (1.1).
Our main goal here is to prove that, when the potential $V(x)$ is represented by a periodic piecewise-constant sequence of large walls of a non-zero width, a localized solution $\phi(x)$ of the nonlinear elliptic problem (1.1) on $x \in \mathbb{R}$ is a linear transform of a localized sequence $\{\phi_n\}$ on $n \in \mathbb{Z}$ satisfying the lattice equation
\[
\alpha (\phi_{n+1} + \phi_{n-1}) + \sigma |\phi_n|^2 \phi_n = \Omega \phi_n, \quad \forall n \in \mathbb{Z}, \tag{1.2}
\]
where $\alpha$ is constant and $\Omega$ is a new parameter related to the parameter $\omega$. The sequence $\{\phi_n\}_{n \in \mathbb{Z}}$ represents a small-amplitude solution $\phi(x)$ of the nonlinear elliptic problem (1.1) in the sense that $\phi_n$ corresponds to $\phi(x)$ for the value of $x$ in the $n$th well of the periodic potential $V(x)$. The precise statement of our main theorem can be found in Sect. 4 of our article. Besides the formal analysis in [1], justification of the lattice equation (1.2) for the nonlinear elliptic problem (1.1) seems not to have been carried out in the literature. Nevertheless, our work has two recent counterparts in the theory of nonlinear parabolic systems. These works are relevant as time-independent solutions of nonlinear parabolic systems satisfy nonlinear elliptic equations. In particular, the stationary solutions of a nonlinear heat equation satisfy the second-order elliptic problem (1.1). The scalar nonlinear heat equation with a periodic diffusive term was considered in [21] and the convergence of the global solutions of the continuous partial differential equation to the global solutions of a lattice differential equation is proven with the Fourier–Bloch analysis. Although the Wannier functions are never mentioned in [21], modifications of these functions (suitable for the Shannon sampling and interpolation theory [24]) are implicitly constructed in Lemma 2.5 of [21]. The lattice differential equation describes dynamics on the invariant infinite-dimensional manifold of the nonlinear heat equation. If the dynamics is stationary (time-independent), this center manifold corresponds directly to the nonlinear elliptic problem (1.1), which we consider here. Unfortunately,

Justification of the Lattice Equation for a Nonlinear Elliptic Problem

805

our methods can not be extended to Hamiltonian dynamical systems such as the Gross– Pitaevskii equation, since the center manifold of nonlinear dispersive wave equations does not give typically any reduction of the problem. This is the main reason why we limit our consideration to the stationary solutions of the Gross–Pitaevskii equation. A more general system of reaction–diffusion equations was considered in [25] and the lattice differential equations were derived to describe dynamics of an infinite sequence of interacting pulses which are located far apart from each other (see also [5] for analysis of a periodic sequence of interacting pulses). The infinite sequence of equally spaced pulses introduces an effective periodic potential in the linearization of the reaction–diffusion equations, which explains the similarity between the two problems. However, the Fourier–Bloch theory cannot be used for strongly nonlinear non-equally spaced pulses. As a result, direct methods of projections in exponentially weighted spaces are applied in [25] to catch a weak tail–tail interaction of neighboring pulses. Similar analysis of the pulse tail–tail interaction was earlier developed for a finite sequence of nonlinear pulses in the reaction–diffusion systems [20]. Our paper is structured as follows. The spectral theory of operators with periodic potentials and the related Fourier–Bloch decomposition are reviewed in Sect. 2. The Wannier functions are introduced and studied in Sect. 3. The main theorem is formulated in Sect. 4 after the analysis of piecewise-constant potentials, which reduce, in a singular limit, to a periodic sequence of infinite walls of a non-zero width. The main theorem is proved in Sect. 5 using elliptic theory and the Wannier decomposition. Examples of localized solutions of the lattice equation (1.2) are given in Sect. 6. Other examples of periodic potentials with similar properties are discussed in Sect. 7. 
Extensions of analysis for multi-dimensional elliptic problems with a separable periodic potential are developed in Sect. 8. Appendix A reviews the Shannon decomposition which is an alternative to the Wannier decomposition. Appendices B and C give proofs of important technical lemmas about the spectrum of the Schrödinger operator with a periodic piecewise-constant potential. Appendix D describes a relationship between the lattice equation (1.2) and the Poincaré map for the second-order equation (1.1).

Notations and basic facts. In what follows, we consider scalar complex-valued functions $u$ on $\mathbb{R}$ in the Sobolev space $H^m(\mathbb{R})$ for an integer $m \ge 0$ equipped with the squared norm
\[
\|u\|^2_{H^m(\mathbb{R})} = \sum_{k=0}^{m} \int_{\mathbb{R}} |\partial_x^k u(x)|^2\, dx, \tag{1.3}
\]
and complex-valued vectors $\mathbf{u}$ for sequences $\{u_n\}_{n \in \mathbb{Z}}$ in the weighted spaces $l^1_q(\mathbb{Z})$ and $l^2_s(\mathbb{Z})$ for $q, s \ge 0$ equipped with the norms
\[
\|\mathbf{u}\|_{l^1_q(\mathbb{Z})} = \sum_{n \in \mathbb{Z}} (1 + n^2)^{q/2} |u_n|, \qquad \|\mathbf{u}\|^2_{l^2_s(\mathbb{Z})} = \sum_{n \in \mathbb{Z}} (1 + n^2)^{s} |u_n|^2. \tag{1.4}
\]
Furthermore, we use the space of bounded continuous functions $C_b^0(\mathbb{R})$ and recall Sobolev’s embeddings
\[
\|u\|_{C_b^0(\mathbb{R})} \le C_1(s) \|u\|_{H^s(\mathbb{R})}, \qquad \|\mathbf{u}\|_{l^1_q(\mathbb{Z})} \le C_2(s, q) \|\mathbf{u}\|_{l^2_{s+q}(\mathbb{Z})}, \qquad \forall s > \tfrac12,\ \forall q \ge 0, \tag{1.5}
\]
for some constants $C_1(s), C_2(s, q) > 0$. We also recall that the Sobolev space $H^s(\mathbb{R})$ forms a Banach algebra for $s > \frac12$ such that


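\[
\forall u, v \in H^s(\mathbb{R}) : \quad \|uv\|_{H^s(\mathbb{R})} \le C(s) \|u\|_{H^s(\mathbb{R})} \|v\|_{H^s(\mathbb{R})}, \quad \forall s > \tfrac12, \tag{1.6}
\]
for some constant $C(s) > 0$. We denote $\mathbb{N} = \{1, 2, 3, \ldots\} \subset \mathbb{Z}$ and $\mathbb{T} = \left[ -\frac12, \frac12 \right] \subset \mathbb{R}$.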
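To make the weighted sequence norms in (1.4) and the second embedding in (1.5) concrete, here is a minimal sketch for finitely supported sequences. The constant used below is the one obtained from the Cauchy–Schwarz inequality, $\left(\sum_n (1+n^2)^{-s}\right)^{1/2}$, which is an assumption of this example and not necessarily the constant $C_2(s, q)$ intended in the text; the sample sequence is hypothetical.

```python
import math

def l1_weighted(u, q):
    """Norm of l^1_q: sum of (1 + n^2)^(q/2) |u_n| over the stored indices."""
    return sum((1 + n * n) ** (q / 2) * abs(v) for n, v in u.items())

def l2_weighted(u, s):
    """Norm of l^2_s: square root of the sum of (1 + n^2)^s |u_n|^2."""
    return math.sqrt(sum((1 + n * n) ** s * abs(v) ** 2 for n, v in u.items()))

def embedding_constant(s, cutoff=10000):
    """Cauchy-Schwarz constant (sum of (1 + n^2)^(-s))^(1/2), truncated at |n| <= cutoff."""
    tail = sum((1 + n * n) ** (-s) for n in range(1, cutoff + 1))
    return math.sqrt(1 + 2 * tail)

# A hypothetical localized sequence u_n = exp(-|n|) * exp(i n), stored as {n: u_n}.
u = {n: math.exp(-abs(n)) * complex(math.cos(n), math.sin(n)) for n in range(-30, 31)}
```

For a sequence supported in $|n| \le 30$, the truncated constant already dominates the exact Cauchy–Schwarz constant on that index set, so the inequality can be checked directly, for example with $s = 1$ and $q = 1$.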

2. Review of Spectral Theory

We consider the Schrödinger operator $L = -\partial_x^2 + V(x)$ with a real-valued, bounded and $2\pi$-periodic function $V$ with respect to $x \in \mathbb{R}$. The operator $L$ is defined for functions in $C_0^{\infty}(\mathbb{R})$. It is extended to a self-adjoint operator which maps continuously $H^2(\mathbb{R})$ to $L^2(\mathbb{R})$. By Theorem XIII.100 on p. 309 in [19], if $V \in L^2_{\rm per}(\mathbb{R})$, then the spectrum of $L = -\partial_x^2 + V(x)$ in $L^2(\mathbb{R})$, denoted by $\sigma(L)$, is real, purely continuous, and consists of the union of spectral bands. According to Floquet analysis (see Sects. 6.4, 6.6 and 6.7 in [7]), for each fixed $k \in \mathbb{T}$, where $\mathbb{T} = \left[ -\frac12, \frac12 \right]$, there exists a Bloch function $u_l(x; k) = e^{ikx} w_l(x; k)$ corresponding to the eigenvalue $\omega_l(k)$, such that $w_l(x + 2\pi; k) = w_l(x; k)$. The band function $\omega_l(k)$ and the periodic function $w_l(x; k)$ for a fixed $k \in \mathbb{T}$ correspond to the $l$th eigenvalue–eigenvector pair of the operator $L_k = -\partial_x^2 - 2ik\partial_x + k^2 + V(x)$, so that $L_k w_l(x; k) = \omega_l(k) w_l(x; k)$, or, equivalently,
\[
L u_l(x; k) = \omega_l(k) u_l(x; k). \tag{2.1}
\]
The Bloch functions are uniquely defined up to a scalar multiplication factor. We shall assume that the amplitude factors of the Bloch functions are normalized by the orthogonality relations
\[
\int_{\mathbb{R}} u_l(x, k)\, \bar u_{l'}(x, k')\, dx = \delta_{l, l'}\, \delta(k - k'), \quad \forall l, l' \in \mathbb{N},\ \forall k, k' \in \mathbb{T}, \tag{2.2}
\]
where $\delta_{l, l'}$ is the Kronecker symbol and $\delta(k)$ is the Dirac delta function in the sense of distributions. In addition, if $u_l(x; k)$ is a Bloch function for $\omega_l(k)$, then $u_l(x; -k) = \bar u_l(x; k)$ can be chosen as a Bloch function for $\omega_l(-k) = \bar\omega_l(k) = \omega_l(k)$, to normalize uniquely the phase factors of the Bloch functions. Similar normalization was used recently for construction of Bloch functions in [12].

Proposition 1. If $V \in L^2_{\rm per}(\mathbb{R})$, then there exists a unitary Fourier–Bloch transformation $\mathcal{T} : L^2(\mathbb{R}) \to l^2(\mathbb{N}, L^2(\mathbb{T}))$ given by
\[
\forall \phi \in L^2(\mathbb{R}) : \quad \hat\phi(k) = \mathcal{T}\phi, \quad \hat\phi_l(k) = \int_{\mathbb{R}} \phi(y)\, \bar u_l(y; k)\, dy, \quad \forall l \in \mathbb{N},\ \forall k \in \mathbb{T}. \tag{2.3}
\]
The inverse transformation is given by
\[
\forall \hat\phi \in l^2(\mathbb{N}, L^2(\mathbb{T})) : \quad \phi(x) = \mathcal{T}^{-1}\hat\phi = \sum_{l \in \mathbb{N}} \int_{\mathbb{T}} \hat\phi_l(k)\, u_l(x; k)\, dk, \quad \forall x \in \mathbb{R}. \tag{2.4}
\]

Proof. The statement follows by Theorems XIII.97 and XIII.98 on pp. 303–304 in [19], which prove orthogonality and completeness of the set of Bloch functions $\{u_l(x; k)\}$ on $l \in \mathbb{N}$ and $k \in \mathbb{T}$. The orthogonality relations are given by (2.2), while the completeness relation can be written in the form
\[
\sum_{l \in \mathbb{N}} \int_{\mathbb{T}} u_l(x, k)\, \bar u_l(x'; k)\, dk = \delta(x - x'), \quad \forall x, x' \in \mathbb{R}, \tag{2.5}
\]
where $\delta(x)$ is again the Dirac delta function in the sense of distributions.


Let $E_l$ for a fixed $l \in \mathbb{N}$ be the invariant closed subspace of $L^2(\mathbb{R})$ associated with the $l$-th spectral band of $\sigma(L)$. Then,
\[ \forall \phi \in E_l \subset L^2(\mathbb{R}): \quad \phi(x) = \int_{\mathbb{T}} \hat\phi_l(k)\, u_l(x;k)\, dk, \tag{2.6} \]
where $\hat\phi_l(k)$ is defined by the integral in (2.3). According to the Fourier–Bloch decomposition (2.3)–(2.4), the space $L^2(\mathbb{R})$ is decomposed into a direct sum of invariant closed bounded subspaces $\oplus_{l \in \mathbb{N}} E_l$. The Fourier–Bloch decomposition can be used for representation of classical solutions of partial differential equations with periodic coefficients [2,6]. This decomposition is, however, inconvenient for a reduction of a continuous PDE problem to a lattice problem. Other decompositions, such as the Wannier and Shannon decompositions, are found to be more useful in the recent works [1] and [21], respectively. Properties of the Wannier functions are described in the next section, while the Shannon functions are reviewed in Appendix A.

3. Properties of the Wannier Functions

Since the band function $\omega_l(k)$ and the Bloch function $u_l(x;k)$ are periodic with respect to $k \in \mathbb{T}$ for any $l \in \mathbb{N}$, we represent them by the Fourier series
\[ \omega_l(k) = \sum_{n \in \mathbb{Z}} \hat\omega_{l,n}\, e^{i2\pi nk}, \quad u_l(x;k) = \sum_{n \in \mathbb{Z}} \hat u_{l,n}(x)\, e^{i2\pi nk}, \quad \forall l \in \mathbb{N},\ \forall k \in \mathbb{T}, \tag{3.1} \]
where the inverse transformation is
\[ \hat\omega_{l,n} = \int_{\mathbb{T}} \omega_l(k)\, e^{-i2\pi nk}\, dk, \quad \hat u_{l,n}(x) = \int_{\mathbb{T}} u_l(x;k)\, e^{-i2\pi nk}\, dk, \quad \forall l \in \mathbb{N},\ \forall n \in \mathbb{Z}. \tag{3.2} \]
Since $\omega_l(k) = \bar\omega_l(k) = \omega_l(-k)$ and $u_l(x;k) = \bar u_l(x;-k)$ for any $k \in \mathbb{T}$ and any $l \in \mathbb{N}$, the coefficients of the Fourier series (3.1) satisfy the constraints
\[ \hat\omega_{l,n} = \bar{\hat\omega}_{l,-n} = \hat\omega_{l,-n}, \quad \hat u_{l,n}(x) = \bar{\hat u}_{l,n}(x), \quad \forall n \in \mathbb{Z},\ \forall l \in \mathbb{N},\ \forall x \in \mathbb{R}. \tag{3.3} \]
In particular, the functions $\hat u_{l,n}(x)$ are always real-valued. Since $u_l(x+2\pi;k) = u_l(x;k)\, e^{i2\pi k}$ for any $x \in \mathbb{R}$, $k \in \mathbb{T}$, and $l \in \mathbb{N}$, we obtain another constraint on the functions $\hat u_{l,n}(x)$:
\[ \hat u_{l,n}(x) = \hat u_{l,n-1}(x-2\pi) = \hat u_{l,0}(x-2\pi n), \quad \forall n \in \mathbb{Z},\ \forall l \in \mathbb{N},\ \forall x \in \mathbb{R}. \tag{3.4} \]
By substituting the Fourier series representation (3.1) into the linear problem (2.1), we obtain a system of equations for the set of functions $\{\hat u_{l,n}(x)\}_{n \in \mathbb{Z}}$ and coefficients $\{\hat\omega_{l,n}\}_{n \in \mathbb{Z}}$ for a fixed $l \in \mathbb{N}$:
\[ -\hat u_{l,n}''(x) + V(x)\, \hat u_{l,n}(x) = \sum_{n' \in \mathbb{Z}} \hat\omega_{l,n-n'}\, \hat u_{l,n'}(x), \quad \forall n \in \mathbb{Z}. \tag{3.5} \]
We shall now make rigorous the formal representations above.

Definition 1. The functions in the set $\{\hat u_{l,n}(x)\}$ for $n \in \mathbb{Z}$ and $l \in \mathbb{N}$ are called the Wannier functions.
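The system (3.5) is a convolution identity for Fourier coefficients. The following is a short worked sketch of that step, using only (2.1) and (3.1) and assuming that the series in (3.1) may be differentiated term by term in $x$ (the summation index $m$ is introduced here only for illustration):

```latex
\begin{align*}
L u_l(x;k)
  &= \sum_{n \in \mathbb{Z}} \bigl( -\hat u_{l,n}''(x) + V(x)\,\hat u_{l,n}(x) \bigr)\, e^{i 2\pi n k}, \\
\omega_l(k)\, u_l(x;k)
  &= \sum_{m \in \mathbb{Z}} \sum_{n' \in \mathbb{Z}}
     \hat\omega_{l,m}\, \hat u_{l,n'}(x)\, e^{i 2\pi (m + n') k}
   = \sum_{n \in \mathbb{Z}} \Bigl( \sum_{n' \in \mathbb{Z}}
     \hat\omega_{l,n-n'}\, \hat u_{l,n'}(x) \Bigr) e^{i 2\pi n k}.
\end{align*}
```

Equating the coefficients of $e^{i2\pi nk}$ on both sides of (2.1) gives (3.5) for each $n \in \mathbb{Z}$.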


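Assumption 1. Let $V$ be a real-valued, piecewise-continuous and $2\pi$-periodic function with respect to $x \in \mathbb{R}$. Assume that the spectrum of $L = -\partial_x^2 + V(x)$ consists of the union of disjoint spectral bands.

Proposition 2. Let $V$ satisfy Assumption 1. There exists a unitary transformation $\mathcal{W} : L^2(\mathbb{R}) \to l^2(\mathbb{N} \times \mathbb{Z})$ given by
\[ \forall \phi \in L^2(\mathbb{R}): \quad \vec\phi = \mathcal{W}\phi, \quad \phi_{l,n} = \int_{\mathbb{R}} \hat u_{l,n}(x)\, \phi(x)\, dx, \quad \forall l \in \mathbb{N},\ \forall n \in \mathbb{Z}. \tag{3.6} \]
The inverse transformation is given by
\[ \forall \vec\phi \in l^2(\mathbb{N} \times \mathbb{Z}): \quad \phi(x) = \mathcal{W}^{-1}\vec\phi = \sum_{l \in \mathbb{N}} \sum_{n \in \mathbb{Z}} \phi_{l,n}\, \hat u_{l,n}(x), \quad \forall x \in \mathbb{R}. \tag{3.7} \]
Moreover, there exist $\eta_l > 0$ and $C_l > 0$ for a fixed $l \in \mathbb{N}$, such that
\[ |\hat u_{l,n}(x)| \le C_l\, e^{-\eta_l |x - 2\pi n|}, \quad \forall n \in \mathbb{Z},\ \forall x \in \mathbb{R}. \tag{3.8} \]

Proof. We need to prove that the set of Wannier functions of Definition 1 forms an orthonormal basis in $L^2(\mathbb{R})$ according to the orthogonality relation
\[ \int_{\mathbb{R}} \hat u_{l,n}(x)\, \hat u_{l',n'}(x)\, dx = \delta_{l,l'}\, \delta_{n,n'}, \quad \forall l,l' \in \mathbb{N},\ \forall n,n' \in \mathbb{Z} \tag{3.9} \]
and the completeness relation
\[ \sum_{l \in \mathbb{N}} \sum_{n \in \mathbb{Z}} \hat u_{l,n}(x)\, \hat u_{l,n}(x') = \delta(x-x'), \quad \forall x,x' \in \mathbb{R}. \tag{3.10} \]
The orthogonality relation (3.9) for the Wannier functions follows from the orthogonality relation (2.2) for the Bloch functions,
\begin{align*}
\int_{\mathbb{R}} \hat u_{l,n}(x)\, \hat u_{l',n'}(x)\, dx
&= \int_{\mathbb{R}} \int_{\mathbb{T}} \int_{\mathbb{T}} u_l(x;k)\, \bar u_{l'}(x;k')\, e^{i2\pi(k'n' - kn)}\, dk\, dk'\, dx \\
&= \delta_{l',l} \int_{\mathbb{T}} \int_{\mathbb{T}} \delta(k'-k)\, e^{i2\pi k(n'-n)}\, dk\, dk' \\
&= \delta_{l',l} \int_{\mathbb{T}} e^{i2\pi k(n'-n)}\, dk = \delta_{l',l}\, \delta_{n',n},
\end{align*}
after the integrations in $x \in \mathbb{R}$ and $k,k' \in \mathbb{T}$ are interchanged. Similarly, the completeness relation (3.10) for the Wannier functions follows from the completeness relation (2.5) for the Bloch functions,
\begin{align*}
\sum_{l \in \mathbb{N}} \sum_{n \in \mathbb{Z}} \hat u_{l,n}(x)\, \hat u_{l,n}(x')
&= \sum_{l \in \mathbb{N}} \sum_{n \in \mathbb{Z}} \int_{\mathbb{T}} \int_{\mathbb{T}} u_l(x;k)\, \bar u_l(x';k')\, e^{i2\pi(k'-k)n}\, dk\, dk' \\
&= \sum_{l \in \mathbb{N}} \int_{\mathbb{T}} \int_{\mathbb{T}} u_l(x;k)\, \bar u_l(x';k') \Bigl( \sum_{n \in \mathbb{Z}} e^{i2\pi(k'-k)n} \Bigr) dk\, dk' \\
&= \sum_{l \in \mathbb{N}} \int_{\mathbb{T}} u_l(x;k)\, \bar u_l(x';k)\, dk = \delta(x-x'),
\end{align*}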


where we have used the well-known orthogonality relation
\[ \sum_{n \in \mathbb{Z}} e^{i2\pi n(k-k')} = \delta(k-k'), \quad \forall k,k' \in \mathbb{T}. \]
The Wannier decomposition (3.6)–(3.7) follows by the standard theory of orthonormal bases in $L^2(\mathbb{R})$. It remains to show that the function $\hat u_{l,n}(x)$ for any $l \in \mathbb{N}$ and $n \in \mathbb{Z}$ is uniformly and absolutely bounded with respect to $x \in \mathbb{R}$ by an exponentially decaying function centered at $x = 2\pi n$. By Theorem XIII.95 on p. 301 in [19], if the potential $V$ satisfies Assumption 1, the band function $\omega_l(k)$ and the Bloch function $u_l(x;k) = w_l(x;k)\, e^{ikx}$ are analytic on $k \in \left(-\frac12, 0\right) \cup \left(0, \frac12\right)$ and are continued analytically along a Riemann surface near $k = 0$ and $k = \pm\frac12$. Let us consider a rectangle $D_{\eta_l}$ with vertices $\left(-\frac12, 0\right)$, $\left(\frac12, 0\right)$, $\left(\frac12, i\eta_l\right)$, $\left(-\frac12, i\eta_l\right)$ in the domain of analyticity of $u_l(x;k)$. By using the Cauchy complex integration, we obtain the identity
\[ 0 = \oint_{\partial D_{\eta_l}} u_l(x;k)\, dk = \int_{\mathbb{T}} u_l(x;k)\, dk - \int_{\mathbb{T}} u_l(x;k+i\eta_l)\, dk + \int_{\frac12}^{\frac12 + i\eta_l} \bigl( u_l(x;k) - u_l(x;k-1) \bigr)\, dk. \]
The last integral is zero due to the periodicity of $u_l(x;k)$ with period 1 in $k$. As a result, we obtain a uniform upper bound for the Wannier function $\hat u_{l,0}(x)$:
\[ |\hat u_{l,0}(x)| = \left| \int_{\mathbb{T}} u_l(x;k)\, dk \right| = \left| \int_{\mathbb{T}} u_l(x;k+i\eta_l)\, dk \right| = \left| \int_{\mathbb{T}} w_l(x;k+i\eta_l)\, e^{ikx - \eta_l x}\, dk \right| \le C_l\, e^{-\eta_l x}, \quad \forall x \ge 0, \]
where $C_l = \sup_{k \in D_{\eta_l}} |w_l(x;k)|$. A similar computation extends the bound for $x \le 0$. The decay bound (3.8) follows from the relation (3.4).

Remark 1. The class of piecewise-continuous potentials with disjoint spectral bands provides a sufficient condition for existence of the unitary transformation (3.6)–(3.7) and the exponential decay (3.8). More general potentials are expected to exist for which Proposition 2 remains valid. Moreover, we do not need in our analysis the assumption that all spectral bands are disjoint. It is sufficient for the exponential decay of functions $\{\hat u_{l,n}(x)\}_{n \in \mathbb{Z}}$ for a fixed $l \in \mathbb{N}$ that the particular $l$-th band is disjoint from the rest of the spectrum of $L$.

Remark 2. Since the transformation $\omega \to \omega + \omega_0$, $V(x) \to V(x) + \omega_0$ leaves the elliptic problem (1.1) invariant, we will assume without loss of generality that $V(x)$ is bounded from below. For convenience, we choose $V(x) \ge 0$, $\forall x \in \mathbb{R}$. Then, $\sigma(L) \ge 0$.

Lemma 1. Let $\vec\phi$ be represented by the set of vectors $\{\vec\phi_l\}_{l \in \mathbb{N}}$, where $\vec\phi_l$ for a fixed $l \in \mathbb{N}$ is represented by the set of elements $\{\phi_{l,n}\}_{n \in \mathbb{Z}}$. If $\vec\phi \in l^1_1(\mathbb{N}, l^1(\mathbb{Z}))$, then $\phi = \mathcal{W}^{-1}\vec\phi \in H^1(\mathbb{R})$.


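Proof. We use the triangle inequality
\[ \|\phi\|_{H^1(\mathbb{R})} \le \sum_{l \in \mathbb{N}} \sum_{n \in \mathbb{Z}} |\phi_{l,n}|\, \|\hat u_{l,n}\|_{H^1(\mathbb{R})} = \sum_{l \in \mathbb{N}} \|\hat u_{l,0}\|_{H^1(\mathbb{R})} \sum_{n \in \mathbb{Z}} |\phi_{l,n}| \]
and the fact that $\|f\|_{H^1(\mathbb{R})} \le \|(1+L)^{1/2} f\|_{L^2(\mathbb{R})}$ for any $f \in {\rm Dom}(L)$, since $L = -\partial_x^2 + V(x)$ and $V(x) \ge 0$. By using the integral representation (3.2) and the orthogonality relations (3.9), we obtain
\begin{align*}
\|(1+L)^{1/2}\, \hat u_{l,0}\|^2_{L^2(\mathbb{R})}
&= \int_{\mathbb{R}} \hat u_{l,0}(x)\, (1+L)\, \hat u_{l,0}(x)\, dx \\
&= \int_{\mathbb{R}} \int_{\mathbb{T}} \int_{\mathbb{T}} \bar u_l(x;k')\, (1+L)\, u_l(x;k)\, dk\, dk'\, dx = \int_{\mathbb{T}} (1 + \omega_l(k))\, dk.
\end{align*}
By Theorem 4.2.3 on p. 57 of [7], there are $k$-independent constants $C_\pm > 0$ such that
\[ C_-\, l^2 \le |\omega_l(k)| \le C_+\, l^2, \quad \forall l \in \mathbb{N},\ \forall k \in \mathbb{T}. \tag{3.11} \]
As a result,
\[ \|\phi\|_{H^1(\mathbb{R})} \le C \sum_{l \in \mathbb{N}} (1 + l^2)^{1/2} \sum_{n \in \mathbb{Z}} |\phi_{l,n}| = C\, \|\vec\phi\|_{l^1_1(\mathbb{N}, l^1(\mathbb{Z}))}, \]
for some $C > 0$.

Lemma 2. If $\vec\phi_l \in l^1(\mathbb{Z})$ for a fixed $l \in \mathbb{N}$ and $\phi(x) = \sum_{n \in \mathbb{Z}} \phi_{l,n}\, \hat u_{l,n}(x)$, then $\phi$ belongs to the invariant closed subspace $E_l \subset L^2(\mathbb{R})$. Moreover, $\phi \in H^1(\mathbb{R})$, such that the function $\phi$ is bounded, continuous, and decaying to zero as $|x| \to \infty$.

Proof. By Lemma 1, we observe that if $\vec\phi_l \in l^1(\mathbb{Z})$ and $\phi(x) = \sum_{n \in \mathbb{Z}} \phi_{l,n}\, \hat u_{l,n}(x)$, then $\phi \in H^1(\mathbb{R})$. By Sobolev's embedding (1.5), we obtain that $\phi \in C_b^0(\mathbb{R})$ and $\phi(x)$ decays to zero as $|x| \to \infty$. By the orthogonality property (3.9) and since $\|\vec\phi_l\|_{l^2(\mathbb{Z})} \le \|\vec\phi_l\|_{l^1(\mathbb{Z})}$, it follows that $\phi \in E_l \subset L^2(\mathbb{R})$.

Remark 3. If $\hat u_{l,n}(x)$ satisfies the exponential decay (3.8) for a fixed $l \in \mathbb{N}$, direct computations show that $\phi \in C_b^0(\mathbb{R})$, i.e.
\[ |\phi(x)| \le \sum_{n \in \mathbb{Z}} |\phi_{l,n}|\, |\hat u_{l,n}(x)| \le C_l \sum_{n \in \mathbb{Z}} |\phi_{l,n}|\, e^{-\eta_l |x - 2\pi n|} \le C_l\, \|\vec\phi_l\|_{l^1(\mathbb{Z})}, \quad \forall x \in \mathbb{R}. \]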

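Lemma 3. If $\hat u_{l,n}(x)$ satisfies the exponential decay (3.8) for a fixed $l \in \mathbb{N}$ and $|\phi_{l,n}| \le C r^{|n|}$ uniformly on $n \in \mathbb{Z}$ for some $C > 0$ and $0 < r < 1$, then $\phi(x) = \sum_{n \in \mathbb{Z}} \phi_{l,n}\, \hat u_{l,n}(x)$ decays to zero exponentially fast as $|x| \to \infty$.

Proof. It is sufficient to prove that $|\phi(2\pi m)| \le C q^m$ uniformly on $m \ge 0$ for some $C > 0$ and $0 < q < 1$. A similar analysis applies to $m \le 0$. Using the decay bound (3.8) with $C_l \equiv C$ and $\eta_l \equiv \eta$, we obtain that
\[ |\phi(2\pi m)| \le C \left( \sum_{n=1}^{\infty} |\phi_{l,m+n}|\, e^{-2\pi\eta n} + \sum_{n=0}^{m} |\phi_{l,m-n}|\, e^{-2\pi\eta n} + e^{-2\pi\eta m} \sum_{n=1}^{\infty} |\phi_{l,-n}|\, e^{-2\pi\eta n} \right), \quad \forall m \ge 0. \]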


Since $r < 1$, $e^{-2\pi\eta} < 1$ and $r e^{-2\pi\eta} < 1$, the first sum is bounded by $C_1 r^m$, while the third sum is bounded by $C_3 e^{-2\pi\eta m}$. The second sum is bounded by
\[ r^m + r^{m-1} e^{-2\pi\eta} + \cdots + e^{-2\pi\eta m} = \begin{cases} r^m\, \dfrac{1 - p^{m+1}}{1 - p}, & p < 1, \\[2mm] r^m p^m\, \dfrac{1 - p^{-m-1}}{1 - p^{-1}}, & p > 1, \end{cases} \]
where $pr = e^{-2\pi\eta}$. If $p \le 1$, the sum is bounded by $C_2 r^m$. If $p \ge 1$, the sum is bounded by $C_2 e^{-2\pi\eta m}$. All three terms decay to zero exponentially fast as $m \to \infty$.
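The closed form used for the second sum can be checked numerically. The following Python sketch is only an illustration: the function names and the sample values of $r$ and $\eta$ are assumptions made here, not quantities from the paper. It compares the direct sum with the two-case formula, where $p$ is defined by $pr = e^{-2\pi\eta}$.

```python
import math


def second_sum_direct(r, eta, m):
    """Direct evaluation of sum_{n=0}^{m} r^(m-n) * exp(-2*pi*eta*n)."""
    q = math.exp(-2.0 * math.pi * eta)
    return sum(r ** (m - n) * q ** n for n in range(m + 1))


def second_sum_closed(r, eta, m):
    """Two-case closed form from the proof, with p defined by p * r = exp(-2*pi*eta)."""
    p = math.exp(-2.0 * math.pi * eta) / r
    if p < 1.0:
        return r ** m * (1.0 - p ** (m + 1)) / (1.0 - p)
    if p > 1.0:
        return (r * p) ** m * (1.0 - p ** (-m - 1)) / (1.0 - 1.0 / p)
    # Borderline case p = 1: all m + 1 terms are equal to r^m.
    return (m + 1) * r ** m


if __name__ == "__main__":
    # Hypothetical sample values: the first pair gives p < 1, the second p > 1.
    for r, eta in [(0.9, 0.1), (0.3, 0.1)]:
        for m in (0, 5, 20):
            print(r, eta, m, second_sum_direct(r, eta, m), second_sum_closed(r, eta, m))
```

For $p < 1$ the sum is dominated by $r^m/(1-p)$, and for $p > 1$ by $e^{-2\pi\eta m}/(1-p^{-1})$, which is the exponential decay claimed in the proof.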

Remark 4. The Fourier series (3.1) for the Bloch function $u_l(x;k)$ is an example of the Wannier decomposition over the basis $\{\hat u_{l,n}(x)\}_{n \in \mathbb{Z}}$ for a fixed $l \in \mathbb{N}$ with the explicit representation $\phi(x) = u_l(x;k)$, $\phi_{l,n} = e^{i2\pi nk}$. This decomposition corresponds to the case when $\vec\phi_l \notin l^1(\mathbb{Z})$ and $\phi \notin E_l \subset L^2(\mathbb{R})$.

4. Main Results

We shall now describe our main example of the potential function $V(x)$ which enables us to reduce the continuous elliptic problem (1.1) to the lattice equation (1.2). The potential function transforms, in a singular limit, to a sequence of infinite walls of a non-zero width. Since Proposition 2 is established only for bounded potentials, we need to show that the main properties of the Wannier functions such as the exponential decay (3.8) hold also in the singular limit. To do so, we develop analysis of the one-dimensional Schrödinger operator in Appendices B and C.

Assumption 2. Let $V$ be given by a piecewise-constant function $V(x) = b$ on $x \in (0,a)$ and $V(x) = 0$ on $x \in (a,2\pi)$ for fixed $0 < a < 2\pi$ and $b = 1/\varepsilon^2 > 0$, periodically continued with period $2\pi$.

Figure 1 shows the potential function $V(x)$ defined by Assumption 2 with $a = \pi$ and $b = 4$ ($\varepsilon = \frac12$).

Lemma 4. Let $V$ satisfy Assumption 2. For any fixed $l_0 \in \mathbb{N}$, there exist $\varepsilon_0, \zeta_0, \omega_0, c_1^\pm, c_2 > 0$, such that, for any $\varepsilon \in [0, \varepsilon_0)$, the band functions of the operator $L = -\partial_x^2 + V(x)$ satisfy the properties
\begin{align*}
&\text{(i) (band separation)} & & \min_{\forall l \in \mathbb{N} \setminus \{l_0\}}\ \inf_{\forall k \in \mathbb{T}} |\omega_l(k) - \hat\omega_{l_0,0}| \ge \zeta_0, & (4.1) \\
&\text{(ii) (band boundedness)} & & |\hat\omega_{l_0,0}| \le \omega_0, & (4.2) \\
&\text{(iii) (tight-binding approximation)} & & c_1^-\, \varepsilon e^{-\frac{a}{\varepsilon}} \le |\hat\omega_{l_0,1}| \le c_1^+\, \varepsilon e^{-\frac{a}{\varepsilon}}, \quad |\hat\omega_{l_0,n}| \le c_2\, \varepsilon^2 e^{-\frac{2a}{\varepsilon}}, & (4.3)
\end{align*}
where $n \ge 2$.

Proof. The proof of the lemma is given in Appendix B.

Fig. 1. The potential function $V(x)$ with $a = \pi$ and $b = 4$

Fig. 2. Left: The trace of the monodromy matrix $A$ versus $\omega$ for the potential function $V(x)$ with $a = \pi$ and $b = 4$. Right: The corresponding band-gap structure of the spectrum $\sigma(L)$
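The band-gap picture of Fig. 2 can be reproduced qualitatively with a few lines of Python. The sketch below is not the formula (B.3) of Appendix B, which is not shown in this section. It assumes instead the standard transfer-matrix construction for $-u'' + V u = \omega u$ on the two intervals where $V$ is constant, and the function names (`transfer`, `trace_monodromy`, `spectral_bands`) are hypothetical. Spectral bands are the intervals of $\omega$ where $|{\rm tr}(A)| \le 2$.

```python
import math


def transfer(q, length):
    """Transfer matrix of u'' = -q*u acting on (u, u') across an interval of given length."""
    if q > 0.0:
        k = math.sqrt(q)
        c, s = math.cos(k * length), math.sin(k * length)
        return ((c, s / k), (-k * s, c))
    if q < 0.0:
        k = math.sqrt(-q)
        c, s = math.cosh(k * length), math.sinh(k * length)
        return ((c, s / k), (k * s, c))
    return ((1.0, length), (0.0, 1.0))


def trace_monodromy(omega, a=math.pi, b=4.0, period=2.0 * math.pi):
    """Trace of the monodromy matrix over one period for V = b on (0, a), V = 0 on (a, period)."""
    m1 = transfer(omega - b, a)           # wall region, V = b
    m2 = transfer(omega, period - a)      # well region, V = 0
    # Trace of the matrix product m2 * m1.
    return (m2[0][0] * m1[0][0] + m2[0][1] * m1[1][0]
            + m2[1][0] * m1[0][1] + m2[1][1] * m1[1][1])


def spectral_bands(omega_max=15.0, steps=60000, a=math.pi, b=4.0):
    """Scan omega on a uniform grid and return the intervals where |tr(A)| <= 2."""
    bands, start = [], None
    for i in range(steps + 1):
        w = omega_max * i / steps
        inside = abs(trace_monodromy(w, a=a, b=b)) <= 2.0
        if inside and start is None:
            start = w
        elif not inside and start is not None:
            bands.append((start, w))
            start = None
    if start is not None:
        bands.append((start, omega_max))
    return bands


if __name__ == "__main__":
    for lo, hi in spectral_bands():
        print("band: [%.4f, %.4f], width %.4f" % (lo, hi, hi - lo))
```

The grid scan is a deliberately simple choice. Because the lowest bands are exponentially narrow in $1/\varepsilon$, the grid step has to be smaller than the band width, so larger $b$ calls for a finer grid or a root finder on ${\rm tr}(A) = \pm 2$.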

Figure 2 illustrates properties (4.1)–(4.3) of Lemma 4 for the potential function $V(x)$ with $a = \pi$ and $b = 4$. The left panel shows the behavior of the trace of the monodromy matrix $A$ versus $\omega$. The right panel shows the spectral bands defined by the intervals with $|{\rm tr}(A)| \le 2$. The first two bands are narrow for this value of $b = 1/\varepsilon^2$, according to the tight-binding approximation.

Lemma 5. Let $V(x)$ satisfy Assumption 2. For any fixed $l_0 \in \mathbb{N}$, there exist $\varepsilon_0, C_0, C > 0$, such that, for any $\varepsilon \in [0, \varepsilon_0)$, the Wannier functions of the operator $L = -\partial_x^2 + V(x)$ satisfy the properties:
\begin{align*}
&\text{(i) (compact support)} & & |\hat u_{l_0,0}(x) - \hat u_0(x)| \le C_0\, \varepsilon, \quad \forall x \in [0, 2\pi], & (4.4) \\
&\text{(ii) (exponential decay)} & & |\hat u_{l_0,0}(x)| \le C \varepsilon^n e^{-\frac{na}{\varepsilon}}, \quad \forall x \in [-2\pi n, -2\pi(n-1)] \cup [2\pi n, 2\pi(n+1)],\ n \in \mathbb{N}, & (4.5)
\end{align*}
where
\[ \hat u_0(x) = \begin{cases} 0, & \forall x \in [0, a], \\[1mm] \dfrac{\sqrt{2}}{\sqrt{2\pi - a}}\, \sin \dfrac{\pi l_0 (2\pi - x)}{2\pi - a}, & \forall x \in [a, 2\pi]. \end{cases} \tag{4.6} \]

Proof. The proof of the lemma is given in Appendix C.

Fig. 3. The Wannier functions $\hat u_{l,0}(x)$ for $l = 1$ (left) and $l = 2$ (right). The solid lines show the Wannier functions for $b = 4$ and $b = 16$. The dashed lines show the limiting function (4.6)

Figure 3 illustrates properties (4.4)–(4.5) of Lemma 5 for the potential function $V(x)$ with $a = \pi$ and $b = 4, 16$. The left and right panels show the Wannier functions $\hat u_{1,0}(x)$ and $\hat u_{2,0}(x)$, respectively. The two functions were computed by using the integral representation (3.2) and the numerical approximations of the corresponding Bloch functions. The dashed line shows the limiting function (4.6).

Let us sketch a formal derivation of the lattice equation (1.2) from the continuous nonlinear problem (1.1) by using the Wannier function decomposition for a particular $l_0$-th band. Let $V$ satisfy Assumption 2 and denote $\mu = \varepsilon e^{-\frac{a}{\varepsilon}}$. Fix $l_0 \in \mathbb{N}$, let $\omega = \hat\omega_{l_0,0} + \mu\Omega$, and consider the substitution
\[ \phi(x) = \left( \frac{\mu}{\beta} \right)^{1/2} \bigl( \varphi(x) + \mu\psi(x) \bigr), \quad \varphi(x) = \sum_{n \in \mathbb{Z}} \phi_n\, \hat u_{l_0,n}(x), \tag{4.7} \]
where $\beta = \|\hat u_{l_0,0}\|^4_{L^4(\mathbb{R})}$ and $\psi$ is orthogonal to $E_{l_0} \subset L^2(\mathbb{R})$. Using the ODE system (3.5), we find that $\psi(x)$ satisfies the inhomogeneous system
\begin{align*}
-\psi''(x) + V(x)\psi(x) - \hat\omega_{l_0,0}\, \psi(x) = {}& -\frac{1}{\mu} \sum_{n \in \mathbb{Z}} \sum_{m \in \mathbb{N}} \hat\omega_{l_0,m} \left( \phi_{n+m} + \phi_{n-m} \right) \hat u_{l_0,n}(x) \\
& + \Omega \bigl( \varphi(x) + \mu\psi(x) \bigr) - \frac{\sigma}{\beta}\, |\varphi(x) + \mu\psi(x)|^2 \bigl( \varphi(x) + \mu\psi(x) \bigr). \tag{4.8}
\end{align*}
Since $\psi \in {\rm Dom}(L)$ and $\psi \perp E_{l_0}$, where $L = -\partial_x^2 + V(x)$, then $(\hat u_{l_0,n}, L\psi) = 0$ for all $n \in \mathbb{Z}$. As a result, the projection equations for components of the vector $\vec\phi = (\ldots, \phi_{-2}, \phi_{-1}, \phi_0, \phi_1, \phi_2, \ldots)$ satisfy the system
\[ \frac{1}{\mu} \sum_{m \in \mathbb{N}} \hat\omega_{l_0,m} \left( \phi_{n+m} + \phi_{n-m} \right) + \frac{\sigma}{\beta} \sum_{(n_1,n_2,n_3)} K_{n,n_1,n_2,n_3}\, \phi_{n_1} \bar\phi_{n_2} \phi_{n_3} = \Omega\phi_n - \frac{\sigma}{\beta}\, R_n(\vec\phi, \psi), \quad \forall n \in \mathbb{Z}, \tag{4.9} \]
where
\[ K_{n,n_1,n_2,n_3} = \int_{\mathbb{R}} \hat u_{l_0,n}(x)\, \hat u_{l_0,n_1}(x)\, \hat u_{l_0,n_2}(x)\, \hat u_{l_0,n_3}(x)\, dx, \quad \forall n, n_1, n_2, n_3 \in \mathbb{Z}, \tag{4.10} \]
and
\[ R_n(\vec\phi, \psi) = \int_{\mathbb{R}} \hat u_{l_0,n}(x) \Bigl( |\varphi(x) + \mu\psi(x)|^2 \bigl( \varphi(x) + \mu\psi(x) \bigr) - |\varphi(x)|^2 \varphi(x) \Bigr) dx, \quad \forall n \in \mathbb{Z}. \tag{4.11} \]
By Lemma 4(iii), $\alpha = \hat\omega_{l_0,1}/\mu$ is uniformly bounded and nonzero for small $\mu > 0$, while $\hat\omega_{l_0,m}/\mu = O(\mu)$ for all $m \ge 2$. By Lemma 5(i), $K_{n,n,n,n} = \beta = \|\hat u_{l_0,0}\|^4_{L^4(\mathbb{R})}$ is uniformly bounded and nonzero for small $\mu > 0$. By Lemma 5(ii), $K_{n,n_1,n_2,n_3} = O\left( \mu^{|n_1-n| + |n_2-n| + |n_3-n| + |n_2-n_1| + |n_3-n_1| + |n_3-n_2|} \right)$ for all $n_1, n_2, n_3 \ne n$. If system (4.9) is formally truncated at the leading-order terms as $\mu \to 0$, it becomes the lattice equation (1.2). We can now formulate the main result of our article.

Theorem 1. Let $V$ satisfy Assumption 2. Fix $l_0 \in \mathbb{N}$ and denote $\mu = \varepsilon e^{-\frac{a}{\varepsilon}}$. Assume that there exists a solution $\vec\phi_0 \in l^1(\mathbb{Z})$ of the lattice equation (1.2) with $\alpha = \hat\omega_{l_0,1}/\mu$ and a fixed $\Omega$ such that the linearized equation at $\vec\phi_0$ has a one-dimensional kernel in $l^1(\mathbb{Z}) \subset l^2(\mathbb{Z})$ spanned by $\{i\vec\phi_0\}$ and the rest of the spectrum is bounded away from zero. There exist $\mu_0, C > 0$, such that the nonlinear elliptic problem (1.1) with $\omega = \hat\omega_{l_0,0} + \mu\Omega$ has a solution $\phi(x)$ in $H^1(\mathbb{R})$ with
\[ \forall 0 < \mu < \mu_0: \quad \left\| \phi - \left( \frac{\mu}{\beta} \right)^{1/2} \sum_{n \in \mathbb{Z}} \phi_n\, \hat u_{l_0,n} \right\|_{H^1(\mathbb{R})} \le C \mu^{3/2}, \tag{4.12} \]
where $\beta = \|\hat u_{l_0,0}\|^4_{L^4(\mathbb{R})}$. Moreover, $\phi(x)$ decays to zero exponentially fast as $|x| \to \infty$ if $\{\phi_n\}$ decays to zero exponentially fast as $|n| \to \infty$.

Remark 5. According to Theorem 1 in [13], there exists a bounded, continuous and exponentially decaying solution $\phi$ in $H^1(\mathbb{R})$ if $\omega$ is in a finite gap of the spectrum of $L$. We not only recover this result but also specify the asymptotic correspondence between exponentially decaying solutions of the elliptic problem (1.1) and those of the lattice equation (1.2). The correspondence can be used to classify the localized solutions of these models by the number of pulses in different wells of the periodic potential $V(x)$ modulo a discrete group of translations with a $2\pi$-multiple period. This classification is explained in Remark 9 of Sect. 6.

Remark 6. We show in Appendix D that the lattice equation (1.2) occurs naturally as the Poincaré map for the second-order equation (1.1) with a periodic coefficient. However, the map we used to turn a sequence $\{\phi_n\}_{n \in \mathbb{Z}}$ into a function $\phi(x)$ on $x \in \mathbb{R}$ involves the Wannier functions, whereas the map that turns a sequence of Poincaré map iterations into a function $\phi(x)$ involves evolution of the differential equation on $[0, 2\pi]$. There is a map between these two types of sequences but it is not a sitewise map.

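5. Proof of Theorem 1

Using the same scaling as in the previous section, namely $\mu = \varepsilon e^{-\frac{a}{\varepsilon}}$, $\omega = \hat\omega_{l_0,0} + \mu\Omega$ and $\phi = \frac{\sqrt{\mu}}{\sqrt{\beta}}\, \tilde\phi(x)$, we rewrite the nonlinear elliptic problem (1.1) in the equivalent form
\[ L_\mu \tilde\phi = \mu \left( \Omega\tilde\phi - \frac{\sigma}{\beta}\, |\tilde\phi|^2 \tilde\phi \right), \quad L_\mu = -\partial_x^2 + V(x) - \hat\omega_{l_0,0}, \tag{5.1} \]
where both $V$ and $\hat\omega_{l_0,0}$ depend on $\varepsilon$ and thus on $\mu$. Let $E_{l_0}$ be an invariant subspace of $L^2(\mathbb{R})$ associated with the $l_0$-th spectral band of $\sigma(L)$. Using the Lyapunov–Schmidt reduction theory, we decompose the solution in the form $\tilde\phi(x) = \varphi(x) + \mu\psi(x)$, where $\varphi \in E_{l_0} \subset L^2(\mathbb{R})$ and $\psi \in E_{l_0}^\perp = L^2(\mathbb{R}) \setminus E_{l_0}$. Denote projection operators $P : L^2(\mathbb{R}) \to E_{l_0}$ and $Q = I - P : L^2(\mathbb{R}) \to E_{l_0}^\perp$. Then, the bifurcation problem (5.1) splits into a system of two equations
\begin{align*}
P L_\mu P \varphi &= \mu \left( \Omega\varphi - \frac{\sigma}{\beta}\, P |\varphi + \mu\psi|^2 (\varphi + \mu\psi) \right), \tag{5.2} \\
Q L_\mu Q \psi &= \mu\Omega\psi - \frac{\sigma}{\beta}\, Q |\varphi + \mu\psi|^2 (\varphi + \mu\psi). \tag{5.3}
\end{align*}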

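The proof of Theorem 1 is based on the following two lemmas that describe solutions of system (5.2)–(5.3).

Lemma 6. Let $D_{\delta_0} \subset H^1(\mathbb{R})$ be a ball of finite radius $\delta_0$ centered at $0 \in H^1(\mathbb{R})$ and let $R_{\mu_0} \subset \mathbb{R}$ be an interval of small radius $\mu_0$ centered at $0 \in \mathbb{R}$. There exists a unique smooth map $\psi_\mu : D_{\delta_0} \times R_{\mu_0} \to H^1(\mathbb{R})$, such that $\psi(x) = \psi_\mu(\varphi(x))$ solves Eq. (5.3) and
\[ \forall 0 < \mu < \mu_0,\ \forall \|\varphi\|_{H^1(\mathbb{R})} < \delta_0: \quad \|\psi\|_{H^1(\mathbb{R})} \le C_0\, \|\varphi\|^3_{H^1(\mathbb{R})}, \tag{5.4} \]
for some constant $C_0 > 0$. Moreover, $\psi(x)$ decays exponentially as $|x| \to \infty$.

Proof. Let $\omega \equiv \hat\omega_{l_0,0}$. Since $\omega \notin \sigma(E_{l_0}^\perp)$ by Lemma 4(i), solutions $\phi(x)$ of the linear inhomogeneous problem $Q L_\mu Q \phi = f(x)$ with $f \in L^2(\mathbb{R})$ belong to $L^2(\mathbb{R})$ uniformly in $\mu \in \mathbb{R}$ because of the Fourier–Bloch decomposition (see Proposition 1)
\[ \phi(x) = \sum_{l \in \mathbb{N} \setminus \{l_0\}} \int_{\mathbb{T}} \frac{\hat f_l(k)}{\omega_l(k) - \omega}\, u_l(x;k)\, dk, \quad \forall x \in \mathbb{R} \tag{5.5} \]
and the Parseval identity
\[ \|\phi\|^2_{L^2(\mathbb{R})} = \sum_{l \in \mathbb{N} \setminus \{l_0\}} \int_{\mathbb{T}} \frac{|\hat f_l(k)|^2}{(\omega_l(k) - \omega)^2}\, dk \le \frac{1}{\zeta_0^2} \sum_{l \in \mathbb{N} \setminus \{l_0\}} \int_{\mathbb{T}} |\hat f_l(k)|^2\, dk \le \frac{1}{\zeta_0^2}\, \|f\|^2_{L^2(\mathbb{R})}, \tag{5.6} \]
where the bound (4.1) has been used. Since $V(x) \ge 0$ and $\omega > 0$ according to Remark 2, we multiply $Q L_\mu Q \phi = f(x)$ by the function $\phi$ and integrate it with respect to $x \in \mathbb{R}$ to obtain
\[ \|\phi'(x)\|^2_{L^2(\mathbb{R})} + \|V^{1/2}\phi\|^2_{L^2(\mathbb{R})} \le \omega\|\phi\|^2_{L^2(\mathbb{R})} + |(\phi, f)| \le \omega\|\phi\|^2_{L^2(\mathbb{R})} + \|\phi\|_{L^2(\mathbb{R})} \|f\|_{L^2(\mathbb{R})}, \tag{5.7} \]
where the Cauchy–Schwarz inequality has been used. Using the bound (5.6), we find that
\[ \|\phi\|_{H^1(\mathbb{R})} \le C\, \|f\|_{L^2(\mathbb{R})}, \tag{5.8} \]
where $C > 0$ is $\varepsilon$-independent. Therefore, the operator $Q L_\mu Q$ is continuously invertible and the inverse operator $(Q L_\mu Q)^{-1}$ provides a continuous map from $L^2(\mathbb{R})$ to $H^1(\mathbb{R})$ uniformly in $\mu$, such that
\[ \forall 0 < \mu < \mu_0: \quad \left\| \left( Q L_\mu Q - \mu\Omega \right)^{-1} \right\|_{L^2(\mathbb{R}) \to H^1(\mathbb{R})} \le \tilde C, \tag{5.9} \]
where $\tilde C > 0$ is $\varepsilon$-independent. Therefore, system (5.3) can be rewritten in the form
\[ \left( Q L_\mu Q - \mu\Omega \right) \psi = -\frac{\sigma}{\beta}\, Q |\varphi + \mu\psi|^2 (\varphi + \mu\psi). \tag{5.10} \]
Since $H^1(\mathbb{R})$ is a Banach algebra, the nonlinear operator acting on $\psi$ and given by the right-hand side of system (5.10) maps an element of $H^1(\mathbb{R})$ to itself if $\varphi \in H^1(\mathbb{R})$. The existence of the map $\psi(x) = \psi_\mu(\varphi(x))$ with the desired bound (5.4) follows by the Implicit Function Theorem. By elliptic theory, the solution $\psi(x)$ of system (5.10) in $H^1(\mathbb{R})$ decays exponentially as $|x| \to \infty$.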

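Remark 7. One might have hoped that the operator $(Q L_\mu Q)^{-1}$ provides a continuous map from $L^2(\mathbb{R})$ to $H^2(\mathbb{R})$ uniformly in $\mu$, but we suspect this is false. Nevertheless, we do obtain a bound
\[ \|\phi\|_{H^2(\mathbb{R})} \le \frac{C}{\varepsilon}\, \|f\|_{L^2(\mathbb{R})} \le C(\nu)\, \mu^{-\nu}\, \|f\|_{L^2(\mathbb{R})}, \]
for a fixed $\nu > 0$ and some $C(\nu) > 0$, which may not be sharp. Indeed, this bound follows from the bounds $\|\phi''(x)\|_{L^2(\mathbb{R})} \le \|V\phi\|_{L^2(\mathbb{R})} + \omega\|\phi\|_{L^2(\mathbb{R})} + \|f\|_{L^2(\mathbb{R})}$ and $\|V\phi\|_{L^2(\mathbb{R})} \le \frac{1}{\varepsilon}\, \|V^{1/2}\phi\|_{L^2(\mathbb{R})}$, where $\|V^{1/2}\phi\|_{L^2(\mathbb{R})}$ is uniformly bounded in $\varepsilon$ by $\|f\|_{L^2(\mathbb{R})}$, thanks to the bounds (5.6) and (5.7).

Using the map in Lemma 6, we rewrite system (5.2) as a bifurcation equation for $\varphi(x)$,
\[ P L_\mu P \varphi = \mu \left( \Omega\varphi - \frac{\sigma}{\beta}\, P |\varphi + \mu\psi_\mu(\varphi)|^2 \bigl( \varphi + \mu\psi_\mu(\varphi) \bigr) \right). \tag{5.11} \]
Using the decomposition in Lemma 2, we represent solutions of (5.11) in the form
\[ \forall \varphi \in E_{l_0} \subset L^2(\mathbb{R}): \quad \varphi(x) = \sum_{n \in \mathbb{Z}} \phi_n\, \hat u_{l_0,n}(x). \tag{5.12} \]
We recall that $\varphi \in H^1(\mathbb{R})$ if $\vec\phi \in l^1(\mathbb{Z})$. By Proposition 2, the orthogonal projections of the bifurcation equation (5.11) result in the lattice equation (4.9), where $\psi(x) = \psi_\mu(\varphi(x))$ is represented by the map with the bound (5.4) and $\varphi(x)$ is given by the Wannier function decomposition (5.12).

Lemma 7. The lattice system (4.9) for vectors $\vec\phi$ is closed in the vector space $l^1(\mathbb{Z})$. Moreover,
\[ \forall 0 < \mu < \mu_0,\ \forall \|\vec\phi\|_{l^1(\mathbb{Z})} \le \delta_0: \quad \|\vec R(\vec\phi, \psi)\|_{l^1(\mathbb{Z})} \le \mu D_0\, \|\vec\phi\|^5_{l^1(\mathbb{Z})}, \tag{5.13} \]
for some constant $D_0 > 0$.

Proof. We shall prove that every term of the lattice system (4.9) maps $l^1(\mathbb{Z})$ to itself. The first term is estimated as follows:
\[ \frac{1}{\mu}\, \|\vec\omega(\vec\phi)\|_{l^1(\mathbb{Z})} \le \frac{1}{\mu} \sum_{n \in \mathbb{Z}} \sum_{n' \in \mathbb{Z} \setminus \{0\}} |\hat\omega_{l_0,n'}|\, |\phi_{n+n'}| \le \frac{1}{\mu}\, \|\vec\omega_{l_0}\|_{l^1(\mathbb{Z})}\, \|\vec\phi\|_{l^1(\mathbb{Z})}, \]
where $\vec\omega_{l_0}$ is the vector of elements $\{\hat\omega_{l_0,n}\}$ on $n \in \mathbb{Z} \setminus \{0\}$. Since $\omega_{l_0}(k)$ is analytically extended along the Riemann surface on $k \in \mathbb{T}$ (by Theorem XIII.95 on p. 301 in [19]), we obtain that $\omega_{l_0} \in H^s(\mathbb{T})$ for any $s \ge 0$ and, hence, $\vec\omega_{l_0} \in l^1(\mathbb{Z})$. By Lemma 4(iii), we have $\|\vec\omega_{l_0}\|_{l^1(\mathbb{Z})} \le C\mu$ for some $C > 0$ and small $\mu > 0$. The second term of the lattice system (4.9) is estimated as follows:
\[ \frac{\sigma}{\beta}\, \|\vec K(\vec\phi)\|_{l^1(\mathbb{Z})} \le \frac{\sigma}{\beta} \sum_{n \in \mathbb{Z}} \sum_{(n_1,n_2,n_3)} |K_{n,n_1,n_2,n_3}|\, |\phi_{n_1}|\, |\phi_{n_2}|\, |\phi_{n_3}| \le \frac{\sigma}{\beta}\, K_0\, \|\vec\phi\|^3_{l^1(\mathbb{Z})}, \]
where $K_0 = \sup_{(n_1,n_2,n_3)} \|\vec K_{n_1,n_2,n_3}\|_{l^1(\mathbb{Z})}$ and $\vec K_{n_1,n_2,n_3}$ is the vector of elements $\{K_{n,n_1,n_2,n_3}\}$ on $n \in \mathbb{Z}$. Because of the exponential decay (3.8) justified in Lemma 5(ii), there exists a uniform bound
\[ \sum_{n \in \mathbb{Z}} |\hat u_{l_0,n}(x)| \le C_{l_0} \sum_{n \in \mathbb{Z}} e^{-\eta_{l_0} |x - 2\pi n|} \le A_0, \quad \forall x \in \mathbb{R}, \]
for some $A_0 > 0$. As a result, we obtain
\[ \|\vec K_{n_1,n_2,n_3}\|_{l^1(\mathbb{Z})} \le A_0 \int_{\mathbb{R}} |\hat u_{l_0,n_1}(x)|\, |\hat u_{l_0,n_2}(x)|\, |\hat u_{l_0,n_3}(x)|\, dx \le A_0\, \|\hat u_{l_0,0}\|_{L^\infty(\mathbb{R})}\, \|\hat u_{l_0,0}\|^2_{L^2(\mathbb{R})}, \]
uniformly in $(n_1, n_2, n_3)$. Since $\|\hat u_{l_0,0}\|_{H^1(\mathbb{R})} \le \|(1+L)^{1/2}\, \hat u_{l_0,0}\|_{L^2(\mathbb{R})} \le (1 + \hat\omega_{l_0,0})^{1/2}$, we have $K_0 \le C$ for some $C > 0$ and small $\mu > 0$, thanks to Lemma 4(ii) and Sobolev's embedding. Finally, the vector field $\vec R(\vec\phi, \psi)$ of the lattice system (4.9) is estimated as follows:
\begin{align*}
\|\vec R(\vec\phi, \psi)\|_{l^1(\mathbb{Z})}
&\le A_0 \int_{\mathbb{R}} \Bigl| |\varphi(x) + \mu\psi_\mu(\varphi(x))|^2 \bigl( \varphi(x) + \mu\psi_\mu(\varphi(x)) \bigr) - |\varphi(x)|^2 \varphi(x) \Bigr|\, dx \\
&\le \mu A_0 B_0\, \|\varphi\|^2_{H^1(\mathbb{R})}\, \|\psi_\mu(\varphi)\|_{H^1(\mathbb{R})} \le \mu A_0 B_0 C_0\, \|\varphi\|^5_{H^1(\mathbb{R})} \le \mu A_0 B_0 C_0\, \|\vec\phi\|^5_{l^1(\mathbb{Z})},
\end{align*}
for some $B_0 > 0$, where we have used the property (1.6), the bound (5.4), and Lemma 2. The last computation proves the desired bound (5.13) with $D_0 = A_0 B_0 C_0$.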


Proof of Theorem 1. By Lemmas 4(iii) and 7, the first term of the lattice system (4.9) can be rewritten in the form
\[ \frac{1}{\mu} \sum_{m \in \mathbb{N}} \hat\omega_{l_0,m} \left( \phi_{n+m} + \phi_{n-m} \right) = \alpha \left( \phi_{n+1} + \phi_{n-1} \right) + \mu L_n(\vec\phi, \mu), \]
where $\alpha = \hat\omega_{l_0,1}/\mu$ is uniformly bounded and nonzero for $\mu > 0$ and
\[ \forall 0 < \mu < \mu_0,\ \forall \|\vec\phi\|_{l^1(\mathbb{Z})} \le \delta_0: \quad \|\vec L(\vec\phi, \mu)\|_{l^1(\mathbb{Z})} \le D_1\, \|\vec\phi\|_{l^1(\mathbb{Z})}, \tag{5.14} \]
for some constant $D_1 > 0$. By Lemmas 5(ii) and 7, the second term of the lattice system (4.9) can be rewritten in the form
\[ \frac{\sigma}{\beta} \sum_{(n_1,n_2,n_3)} K_{n,n_1,n_2,n_3}\, \phi_{n_1} \bar\phi_{n_2} \phi_{n_3} = \sigma |\phi_n|^2 \phi_n + \mu Q_n(\vec\phi, \mu), \]
where
\[ \forall 0 < \mu < \mu_0,\ \forall \|\vec\phi\|_{l^1(\mathbb{Z})} \le \delta_0: \quad \|\vec Q(\vec\phi, \mu)\|_{l^1(\mathbb{Z})} \le D_2\, \|\vec\phi\|^3_{l^1(\mathbb{Z})}, \tag{5.15} \]
for some constant $D_2 > 0$. As a result, we obtain the perturbed lattice system
\[ \alpha \left( \phi_{n+1} + \phi_{n-1} \right) + \sigma |\phi_n|^2 \phi_n - \Omega\phi_n = \mu N_n(\vec\phi, \mu), \quad \forall n \in \mathbb{Z}, \tag{5.16} \]
where the perturbation term satisfies the bound
\[ \forall 0 < \mu < \mu_0,\ \forall \|\vec\phi\|_{l^1(\mathbb{Z})} \le \delta_0: \quad \|\vec N(\vec\phi, \mu)\|_{l^1(\mathbb{Z})} \le D\, \|\vec\phi\|_{l^1(\mathbb{Z})}, \tag{5.17} \]
for some constant $D > 0$. Assume that there exists a solution $\vec\phi_0 \in l^1(\mathbb{Z})$ of the lattice equation (1.2) with $\alpha = \hat\omega_{l_0,1}/\mu$ and a fixed $\Omega$ such that the linearized equation at $\vec\phi_0$ has a one-dimensional kernel in $l^1(\mathbb{Z}) \subset l^2(\mathbb{Z})$ spanned by $\{i\vec\phi_0\}$ and the rest of the spectrum is bounded away from zero. This eigenmode is always present owing to the invariance of the lattice equation (1.2) with respect to the gauge transformation $\vec\phi \to \vec\phi\, e^{i\theta}$, $\forall \theta \in \mathbb{R}$. The perturbed lattice equation (5.16) is also invariant with respect to this transformation, since it is inherited from the properties of the nonlinear elliptic problem (1.1). Fix $\theta$ uniquely by picking an $n_0 \in \mathbb{Z}$ such that $|(\vec\phi_0)_{n_0}| \ne 0$ and requiring that ${\rm Im}(\vec\phi_0)_{n_0} = 0$, ${\rm Im}(\vec\phi)_{n_0} = 0$. The vector field of the perturbed lattice equation (5.16) preserves the constraint ${\rm Im}(\vec\phi)_{n_0} = 0$ by symmetry and it is closed in $l^1(\mathbb{Z}) \subset l^2(\mathbb{Z})$, while the linearized operator is continuously invertible under the constraint. By the Implicit Function Theorem, there exists a smooth continuation of the solution $\vec\phi_0$ due to the perturbation terms of the lattice equation (5.16) such that ${\rm Im}(\vec\phi)_{n_0} = 0$ and
\[ \forall 0 < \mu < \mu_0: \quad \|\vec\phi - \vec\phi_0\|_{l^1(\mathbb{Z})} \le C\mu, \tag{5.18} \]
for some $C > 0$. By Lemma 2, if $\vec\phi \in l^1(\mathbb{Z})$, then $\varphi(x)$ in the representation (5.12) is a continuous bounded function of $x \in \mathbb{R}$, which decays to zero as $|x| \to \infty$. By Lemma 3, it decays to zero exponentially fast as $|x| \to \infty$ if $\vec\phi$ decays to zero exponentially fast as $|n| \to \infty$. The same properties hold for $\psi(x) = \psi_\mu(\varphi(x))$, by Lemma 6, and thus for the full solution $\phi(x)$. These arguments finish the proof of Theorem 1.


Remark 8. For the proof of Theorem 1, we used a different approach compared to [6, 17]. In these papers, we first formulated the elliptic problem with a periodic potential in the Bloch (Fourier) space and then reduced a closed system of equations for the Bloch–Fourier transform by using the Lyapunov–Schmidt reductions. One can think of using the same strategy here, when a full (double-series) Wannier function decomposition is used to transform the elliptic problem to an algebraic system for coefficients of the double series. We have avoided this approach since we do not know how to show that the system of the nonlinear algebraic equations is closed in the space $l^1_1(\mathbb{N}, l^1(\mathbb{Z}))$, which would ensure that $\phi \in H^1(\mathbb{R})$.

6. Examples of Localized Solutions

We review examples of localized solutions of the lattice equation (1.2) which satisfy the assumptions of Theorem 1. Following paper [11], these solutions can be efficiently characterized in the anti-continuum limit where $\alpha$ is sufficiently small. We recall that all localized solutions $\{\phi_n\}_{n \in \mathbb{Z}}$ of the lattice equation (1.2) are real-valued (see, e.g., [15]). Let $\vec\phi$ be a real-valued vector on $n \in \mathbb{Z}$ and rewrite the lattice equation (1.2) in the form
\[ \left( \Omega - \sigma\phi_n^2 \right) \phi_n = \alpha \left( \phi_{n+1} + \phi_{n-1} \right), \quad \forall n \in \mathbb{Z}. \tag{6.1} \]
The linearized equation at the real-valued solution $\vec\phi$ perturbed with the real-valued vector $\vec\psi$ is written in the form
\[ \left( L_\alpha \vec\psi \right)_n = \left( \Omega - 3\sigma\phi_n^2 \right) \psi_n - \alpha \left( \psi_{n+1} + \psi_{n-1} \right), \quad \forall n \in \mathbb{Z}. \tag{6.2} \]
The nonlinear vector field of the lattice equation (6.1) maps $l^1(\mathbb{Z})$ to itself for any $\alpha \in \mathbb{R}$. If $\alpha = 0$ and $\sigma = {\rm sign}(\Omega)$, there exists a limiting solution of the lattice equation (6.1) in the form
\[ \phi_n = \begin{cases} 0, & \forall n \in U_0, \\ \pm\sqrt{|\Omega|}, & \forall n \in U_\pm, \end{cases} \tag{6.3} \]
where $U_+ \cup U_- \cup U_0 = \mathbb{Z}$. The spectrum of the linearized operator $L_0$ evaluated at the limiting solution (6.3) consists of two points $\sigma(L_0) = \{-2\Omega, \Omega\}$, where the eigenvalue $\Omega$ has multiplicity $\dim(U_0)$ and the eigenvalue $-2\Omega$ has multiplicity $\dim(U_+) + \dim(U_-)$. If $\dim(U_+) + \dim(U_-) < \infty$, the limiting solution (6.3) is in $l^1(\mathbb{Z})$ and the linearized operator $L_\alpha$ is continuously invertible in $l^1(\mathbb{Z})$ for any $\Omega \ne 0$. By the Implicit Function Theorem, there exists a unique smooth solution $\vec\phi_\alpha \in l^1(\mathbb{Z})$ of the lattice equation (1.2) with $|\alpha| < \alpha_0$ and $\sigma = {\rm sign}(\Omega)$, where $\alpha_0 > 0$ is sufficiently small, and
\[ \|\vec\phi_\alpha - \vec\phi_0\|_{l^1(\mathbb{Z})} \le C|\alpha|, \]
for some $\alpha$-independent constant $C > 0$. Since the kernel of the linearization operator (6.2) is empty for sufficiently small $\alpha$, the assumptions of Theorem 1 are satisfied for real-valued solutions $\vec\phi \in l^1(\mathbb{Z})$.
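The continuation from the anti-continuum limit can be illustrated numerically. The sketch below is a minimal illustration under assumptions made here and not taken from the paper: the lattice is truncated to finitely many sites with zero values outside, the lattice equation is taken in the form $(\Omega - \sigma\phi_n^2)\phi_n = \alpha(\phi_{n+1} + \phi_{n-1})$, and Newton's method is applied with the tridiagonal Jacobian given by the linearized operator $L_\alpha$. The function names and the sample values $\Omega = \sigma = 1$, $\alpha = 0.05$ are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in row i."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def residual(phi, alpha, omega=1.0, sigma=1.0):
    """F_n = (omega - sigma*phi_n^2)*phi_n - alpha*(phi_{n+1} + phi_{n-1}), zero outside the lattice."""
    n = len(phi)
    out = []
    for i in range(n):
        left = phi[i - 1] if i > 0 else 0.0
        right = phi[i + 1] if i < n - 1 else 0.0
        out.append((omega - sigma * phi[i] ** 2) * phi[i] - alpha * (left + right))
    return out


def continue_from_anticontinuum(phi0, alpha, omega=1.0, sigma=1.0, iters=30):
    """Newton iterations started from the alpha = 0 configuration phi0."""
    phi = list(phi0)
    n = len(phi)
    for _ in range(iters):
        f = residual(phi, alpha, omega, sigma)
        # Jacobian of F: diagonal omega - 3*sigma*phi_n^2, off-diagonals -alpha.
        diag = [omega - 3.0 * sigma * p ** 2 for p in phi]
        off = [-alpha] * n
        step = solve_tridiagonal(off, diag, off, f)
        phi = [p - s for p, s in zip(phi, step)]
    return phi


if __name__ == "__main__":
    start = [0.0] * 21
    start[10] = 1.0  # single "up" pulse: U_+ = {10}, U_- empty
    phi = continue_from_anticontinuum(start, alpha=0.05)
    print(phi[8:13])
```

Other limiting configurations (several "up" and "down" pulses) are obtained by changing the starting vector; the truncation to 21 sites is harmless here only because the solution decays quickly away from the excited site.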


Remark 9. For sufficiently small values of $\alpha$, all localized solutions of the lattice equation (6.1) can be classified by the configurations $U_+$ and $U_-$ in the limiting solution (6.3). Simply speaking, the limiting configuration indicates a finite number of nodes on $\mathbb{Z}$, where "up" and "down" pulses are placed. Using the bound (4.12) of Theorem 1, we can transfer this information to the localized solution of the elliptic problem (1.1) since each $n \in \mathbb{Z}$ with $\phi_n \ne 0$ corresponds to the Wannier function $\hat u_{l_0,n}(x) = \hat u_{l_0,0}(x - 2\pi n)$, which is centered at the $n$-th potential well of $V(x)$ and is exponentially decaying as $|x| \to \infty$. Therefore, the limiting configuration (6.3) indicates the finite number of "up" and "down" pulses placed in the corresponding wells of the periodic potential $V$.

Remark 10. Bifurcations of localized real-valued solutions may occur for larger values of $\alpha$, when the linearized operator (6.2) may admit a nontrivial kernel in $l^1(\mathbb{Z}) \subset l^2(\mathbb{Z})$. Theorem 1 does not hold at the bifurcation point but can be used to prove persistence of solutions before and after the bifurcation point, provided that the linearized operator is again invertible.

7. Examples of Potential Functions V

The main example of the potential function $V$ in Assumption 2 can be extended to other potential functions using semi-classical techniques [4,8]. This extension will not be presented here. We shall, however, review other examples of the potential function $V$, for which the analysis of our paper can be applied immediately.

Example 1. Let $V(x) = c\,\delta(x)$ on $[-\pi, \pi]$, periodically continued with the period $2\pi$. The band function $\omega_l(k)$ for this example can be obtained from the analysis of Appendix B if $a \to 0$ and $c = ab$ is fixed. The expression (B.3) for ${\rm tr}(A)$ simplifies in the limit $a \to 0$ to the form
\[ {\rm tr}(A) = 2\cos(2\pi\sqrt{\omega}) + \frac{c}{\sqrt{\omega}}\, \sin(2\pi\sqrt{\omega}), \quad 0 < \omega < \infty. \tag{7.1} \]
All bands have non-zero widths if $c$ is finite. Therefore, the delta-function potential (with infinitesimal thickness of the walls) does not satisfy the tight-binding property (4.3) of Lemma 4. In addition, the periodic potential $V(x)$ is unbounded at $x = 2\pi n$, $\forall n \in \mathbb{Z}$ for any $c > 0$.

Example 2. Let $V(x + L) = V(x)$ be $L$-periodic, such that $V(x) = b$ on $x \in (0, a)$ and $V(x) = 0$ on $x \in (a, L)$. We show that this function is equivalent, in the limit $L \to \infty$, to the potential function of Assumption 2 in the limit $\varepsilon \to 0$. Let
\[ \tilde x = \frac{2\pi}{L}\, x, \quad V(x) = \left( \frac{2\pi}{L} \right)^2 \tilde V(\tilde x), \quad \phi(x) = \tilde\phi(\tilde x), \quad \omega = \left( \frac{2\pi}{L} \right)^2 \tilde\omega. \]
Then, $\phi(x)$ and $\tilde\phi(\tilde x)$ solve
\[ \left( -\partial_x^2 + V(x) \right) \phi(x) = \omega\phi(x), \quad \left( -\partial_{\tilde x}^2 + \tilde V(\tilde x) \right) \tilde\phi(\tilde x) = \tilde\omega\, \tilde\phi(\tilde x), \]
respectively, while $\tilde V(\tilde x) = \tilde b$ on $\tilde x \in (0, \tilde a)$ and $\tilde V(\tilde x) = 0$ on $\tilde x \in (\tilde a, 2\pi)$ with $\tilde a = \frac{2\pi}{L}\, a$ and $\tilde b = \left( \frac{L}{2\pi} \right)^2 b$. For fixed $\frac{a}{L}$ and $b$, the function $\tilde V(\tilde x)$ is equivalent to the one in Assumption 2. We note that the band separation property (4.1) of Lemma 4 is not satisfied for the function $V(x)$ in the limit $L \to \infty$, since the distance between spectral bands reduces as $O\left( \frac{1}{L^2} \right)$. However, this property is satisfied for the rescaled function $\tilde V(\tilde x)$.

Example 3. Let $V(x) = \frac{1}{2\varepsilon^2}\, (1 - \cos x)$ be such that $V(x) = \frac{x^2}{4\varepsilon^2} + O(x^4)$ near $x = 0$. According to the asymptotic analysis in [23], the tight-binding property (4.3) of Lemma 4 is satisfied in the limit $\varepsilon \to 0$, while the exponentially narrow spectral bands $\omega_l(k)$ converge to eigenvalues of the parabolic potential $\frac{x^2}{4\varepsilon^2}$ at $\omega = \frac{2l-1}{2\varepsilon}$ for all $l \in \mathbb{N}$. The distance between different bands satisfies the band separation property (4.1) and, in fact, diverges as $\varepsilon \to 0$. We note, however, that the band-boundedness property (4.2) fails for this example. This failure only affects the bound (4.12) and does not affect the lattice equation (1.2), where the coefficient of the nonlinear term is scaled to unity.

We finish this section with remarks on the related works [21,25]. • The periodic piecewise-constant function a(x) in the operator L = −∂x a(x)∂x considered in [21] is similar to the main example of the potential function V (x) used in Assumption 2. It provides both the separation of bands and the tight-banding approximation with some modifications: (i) the lowest-order band does not satisfy the property |ωˆ l0 ,1 | ωˆ l0 ,0 but does satisfy the property |ωˆ l0 ,m | ωˆ l0 ,0 for any |m| ≥ 2 and (ii) the lowest-order band is separated from all other bands by a distance diverging as → 0. • Complicated projection analysis in the problem involving nonlinear pulses located far away from each other [25] is partly explained in Example 2: the bands are not separated from each other in the limit L → ∞ unless a rescaling to tilded variables is applied. 8. Lattice Equations in Two and Three Dimensions The results of our analysis were restricted to the space of one dimension since we have used the Banach algebra property of H 1 (R) and the fact that (Q L µ Q)−1 provides a bounded map from L 2 (R) to H 1 (R) uniformly in µ > 0. According to Remark 7, no uniform bound may exist from L 2 (R) to H 2 (R). Nevertheless, thanks to the exponential smallness of bounds (4.3) and (4.5) in Lemmas 4 and 5, we are still able to extend results of our analysis to the nonlinear elliptic problem with a multi-dimensional separable potential in the form − ∇ 2 φ + W (x)φ + σ |φ|2 φ = ωφ, ∀x ∈ Rd , (8.1) where ∇ 2 is the continuous d-dimensional Laplacian and W = dj=1 V (x j ) is a separable potential with a bounded 2π -periodic function V : R → R. The Laplacian ∇ 2 can be replaced by ∇ M∇ with an arbitrary positive-definite matrix M and the results will remain the same. Equivalently, the period parallelogram of W can be arbitrary. For the sake of simplicity, we restrict our attention to the case when M is the identity matrix and W has period 2π in each coordinate. 
The lattice equation (1.2) is generalized in the multi-dimensional setting in the form

$$\sum_{j=1}^{d} \alpha_j \left( \phi_{n+e_j} + \phi_{n-e_j} \right) + \sigma |\phi_n|^2 \phi_n = \Omega \phi_n, \quad \forall n \in \mathbb{Z}^d, \tag{8.2}$$

where (e1 , e2 , . . . , ed ) is a standard basis in Zd and (α1 , α2 , . . . , αd ) are constants. Our main result is generalized as follows.
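As an illustration only, the residual of the lattice equation (8.2) can be evaluated on a finitely supported sequence. The sketch below is not taken from the paper: the function name `lattice_residual`, the dictionary representation of the sequence, and the sample parameter values are assumptions made for this example, and `Omega` stands for the constant multiplying $\phi_n$ on the right-hand side of (8.2). With all $\alpha_j = 0$, a single excited site with $\sigma |\phi_n|^2 = \Omega$ solves the equation exactly, which the last lines check in $d = 2$.

```python
import cmath
from itertools import product


def lattice_residual(phi, sites, alphas, sigma, Omega):
    """Return the residual of the lattice equation at each site in `sites`.

    phi    : dict mapping integer tuples n in Z^d to complex amplitudes
             (absent sites are treated as zero, i.e. finite support)
    alphas : coupling constants (alpha_1, ..., alpha_d)
    sigma  : coefficient of the cubic term
    Omega  : coefficient of phi_n on the right-hand side (assumed name)
    """
    d = len(alphas)
    residual = {}
    for n in sites:
        value = phi.get(n, 0.0)
        coupling = 0.0
        for j in range(d):
            # Nearest neighbours of n along the j-th basis vector e_j.
            up = n[:j] + (n[j] + 1,) + n[j + 1:]
            down = n[:j] + (n[j] - 1,) + n[j + 1:]
            coupling += alphas[j] * (phi.get(up, 0.0) + phi.get(down, 0.0))
        residual[n] = coupling + sigma * abs(value) ** 2 * value - Omega * value
    return residual


# Uncoupled check in d = 2: alpha_j = 0 and sigma * |phi_0|^2 = Omega.
box = list(product(range(-2, 3), repeat=2))
single_site = {(0, 0): cmath.exp(0.7j)}
res = lattice_residual(single_site, box, (0.0, 0.0), sigma=-1.0, Omega=-1.0)
max_residual = max(abs(r) for r in res.values())
```

The residual is also equivariant under the gauge rotation $\phi_n \mapsto \phi_n e^{i\theta}$, so a solution stays a solution after multiplication by a constant phase.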


Theorem 2. Let $W = \sum_{j=1}^{d} V(x_j)$ be a separable potential, where $V$ satisfies Assumption 2. Fix $l_0 \in \mathbb{N}^d$ and denote $\mu = \varepsilon e^{-a/\varepsilon}$. Assume that there exists a solution $\vec{\phi}_0 \in l^1(\mathbb{Z}^d)$ of the lattice equation (8.2) for $d = 1, 2, 3$ with $\alpha_j = \hat{\omega}_{(l_0)_j,1}/\mu$, $j = 1, 2, \ldots, d$, and a fixed $\Omega$ such that the linearized equation at $\vec{\phi}_0$ has a one-dimensional kernel in $l^1(\mathbb{Z}^d)$ spanned by $\{i \vec{\phi}_0\}$ and the rest of the spectrum is bounded away from zero. Fix $\nu \in (0, 1)$. There exist $\mu_0, C(\nu) > 0$ such that the nonlinear elliptic problem (8.1) with $\omega = \omega_0 + \mu \Omega$ has a solution $\phi(x)$ in $H^2(\mathbb{R}^d)$ for $d = 1, 2, 3$ satisfying

$$\forall\, 0 < \mu < \mu_0 : \quad \Bigl\| \phi - \Bigl(\frac{\mu}{\beta}\Bigr)^{1/2} \sum_{n \in \mathbb{Z}^d} \phi_n \hat{u}_{l_0,n} \Bigr\|_{H^2(\mathbb{R}^d)} \leq C(\nu)\, \mu^{3/2-\nu}, \tag{8.3}$$

where $\omega_0 = \sum_{j=1}^{d} \hat{\omega}_{(l_0)_j,0}$, $\hat{u}_{l_0,n}(x) = \prod_{j=1}^{d} \hat{u}_{(l_0)_j,n_j}(x_j)$ and $\beta = \|\hat{u}_{l_0,0}\|^4_{L^4(\mathbb{R}^d)}$. Moreover, $\phi(x)$ decays to zero exponentially fast as $|x| \to \infty$ if $\{\phi_n\}$ decays to zero exponentially fast as $|n| \to \infty$.

Proof. We recall that the band and Bloch functions for the $l$-th spectral band of $L_d = -\nabla^2 + \sum_{j=1}^{d} V(x_j)$ with $l = (l_1, l_2, \ldots, l_d) \in \mathbb{N}^d$ are represented by

$$\omega = \sum_{j=1}^{d} \omega_{l_j}(k_j), \qquad u = \prod_{j=1}^{d} u_{l_j}(x_j; k_j), \tag{8.4}$$

where $\omega_l(k)$ and $u_l(x; k)$ are the band and Bloch functions of the operator $L = -\partial_x^2 + V(x)$ on $x \in \mathbb{R}$. By using the same scaling of variables, we derive system (5.1) and split it into system (5.2)–(5.3) by using the orthogonal projections. Since $\mu$ is exponentially small in $\varepsilon$, while the operator $(Q L_\mu Q)^{-1}$ provides a map from $L^2(\mathbb{R}^d)$ to $H^2(\mathbb{R}^d)$ that diverges only algebraically in $\varepsilon$ (see Remark 7), there exists a unique map $\psi_\mu : H^2(\mathbb{R}^d) \times (0, \mu_0) \to H^2(\mathbb{R}^d)$, such that $\psi(x) = \psi_\mu(\varphi(x))$ and

$$\forall\, 0 < \mu < \mu_0, \ \forall\, \|\varphi\|_{H^2(\mathbb{R}^d)} < \delta_0 : \quad \|\psi\|_{H^2(\mathbb{R}^d)} \leq \mu^{-\nu/6} C_0(\nu) \|\varphi\|^3_{H^2(\mathbb{R}^d)}, \tag{8.5}$$

for a fixed $\nu \in (0, 1)$ and some constant $C_0(\nu) > 0$. Therefore, we close the bifurcation equation (5.11) using the Wannier function decomposition

$$\forall \varphi \in E_{l_0} \subset L^2(\mathbb{R}^d) : \quad \varphi(x) = \sum_{n \in \mathbb{Z}^d} \phi_n \hat{u}_{l_0,n}(x), \qquad \hat{u}_{l_0,n}(x) = \prod_{j=1}^{d} \hat{u}_{(l_0)_j,n_j}(x_j).$$

As a result, we obtain the lattice equation (4.9) in $\mathbb{Z}^d$. Since

$$\|\hat{u}_{l_0,n}\|_{H^2(\mathbb{R}^d)} \leq \frac{C(d)}{\varepsilon} \|\hat{u}_{l_0,n}\|_{H^1(\mathbb{R}^d)}$$

for some $C(d) > 0$, we have

$$\|\varphi\|_{H^2(\mathbb{R}^d)} \leq \frac{C(d)}{\varepsilon} \|\vec{\phi}\|_{l^1(\mathbb{Z}^d)} \leq \mu^{-\nu/6} \tilde{C}(\nu) \|\vec{\phi}\|_{l^1(\mathbb{Z}^d)} \tag{8.6}$$

for the same $\nu \in (0, 1)$ and some $\tilde{C}(\nu) > 0$. The lattice system (4.9) is closed in $l^1(\mathbb{Z}^d)$ by the same analysis as in Lemma 7 and, since $H^2(\mathbb{R}^d)$ is a Banach algebra for $d = 1, 2, 3$, we obtain

$$\forall\, 0 < \mu < \mu_0, \ \forall\, \|\vec{\phi}\|_{l^1(\mathbb{Z}^d)} \leq \delta_0 : \quad \|R(\vec{\phi}, \psi)\|_{l^1(\mathbb{Z}^d)} \leq \mu^{1-\nu} D(\nu) \|\vec{\phi}\|^5_{l^1(\mathbb{Z}^d)}$$

for some $D(\nu) > 0$. The rest of the proof repeats the proof of Theorem 1.

Example 4. Complex-valued localized solutions of the lattice equation (8.2) in l 1 (Zd ) were constructed in the anti-continuum limit in [16] and [10] for d = 2 and d = 3 respectively. The method of Lyapunov–Schmidt reductions was used and all examples considered in these papers were represented by isolated families of solutions with the only free parameter due to the gauge invariance of the lattice equation (8.2). As a consequence, the linearized equation at the complex-valued solution φ perturbed with the complex-valued vector ψ,

$$(L_\alpha \psi)_n = \left( \Omega - 2\sigma |\phi_n|^2 \right) \psi_n - \sigma \phi_n^2 \bar{\psi}_n - \sum_{j=1}^{d} \alpha_j \left( \psi_{n+e_j} + \psi_{n-e_j} \right), \quad \forall n \in \mathbb{Z}^d, \tag{8.7}$$

was shown to have a one-dimensional kernel spanned by $\{i \vec{\phi}\}$ for any sufficiently small $0 < \sum_{j=1}^{d} |\alpha_j| < \alpha_0$. Therefore, the assumptions of Theorem 2 are satisfied and all solutions of the lattice equation (8.2) obtained in [10,16] persist as solutions of the nonlinear elliptic problem (8.1).

A. Shannon Decomposition

We review here the Shannon decomposition, which is different from the Wannier decomposition of Proposition 2. Fix $l \in \mathbb{N}$ and assume that $u_l(0; k) \neq 0$ for all $k \in \mathbb{T}$. Let us define the set of functions $\{g_n(x)\}_{n \in \mathbb{Z}}$ according to the integrals

$$g_n(x) = \int_{\mathbb{T}} \frac{u_l(x; k)}{u_l(0; k)}\, e^{-i 2\pi k n}\, dk, \quad \forall x \in \mathbb{R}. \tag{A.1}$$

Since $u_l(x; k) = w_l(x; k) e^{ikx}$, where $w_l(x; k)$ is a $2\pi$-periodic function in $x$, it follows from the integrals (A.1) that

$$g_n(2\pi n') = \delta_{n,n'}, \quad \forall n, n' \in \mathbb{Z}.$$

Therefore, the set $\{g_n(x)\}_{n \in \mathbb{Z}}$ can be used for interpolation of a continuous complex-valued function $u(x)$ from its values $\{u_n\}_{n \in \mathbb{Z}}$ at the points $x = 2\pi n$. This construction reminds us of the Shannon theory of sampling and interpolation (see review in [24]).

Definition 2. The functions of the set $\{g_n(x)\}_{n \in \mathbb{Z}}$ are called the Shannon functions.

Proposition 3. Let $V$ satisfy Assumption 1. Fix $l \in \mathbb{N}$ and let $E_l$ be an invariant closed subspace of $L^2(\mathbb{R})$ associated to the $l$-th spectral band of $\sigma(L)$. There exists an isomorphism $S : E_l \subset L^2(\mathbb{R}) \to l^2(\mathbb{Z})$ given by the sampling

$$\forall \phi \in E_l \subset L^2(\mathbb{R}) : \quad \vec{\phi} = S\phi, \quad \phi_n = \phi(2\pi n), \ \forall n \in \mathbb{Z}. \tag{A.2}$$

The inverse transformation is given by the interpolation

$$\forall \vec{\phi} \in l^2(\mathbb{Z}) : \quad \phi(x) = S^{-1} \vec{\phi} = \sum_{n \in \mathbb{Z}} \phi_n g_n(x), \quad \forall x \in \mathbb{R}. \tag{A.3}$$


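Proof. By Sobolev's embeddings (1.5), there exists an $n$-independent constant $C > 0$ such that

$$|\phi(2\pi n)| \leq C \|\phi\|_{H^1([2\pi n - \pi,\, 2\pi n + \pi])}, \quad \forall n \in \mathbb{Z}.$$

Therefore,

$$\|\vec{\phi}\|^2_{l^2(\mathbb{Z})} = \sum_{n \in \mathbb{Z}} |\phi(2\pi n)|^2 \leq C^2 \sum_{n \in \mathbb{Z}} \|\phi\|^2_{H^1([2\pi n - \pi,\, 2\pi n + \pi])} \leq C^2 \|\phi\|^2_{H^1(\mathbb{R})}.$$

If $\phi \in E_l \subset L^2(\mathbb{R})$, then $\phi \in H^1(\mathbb{R})$, so that the map $S$ is uniformly bounded on $E_l \subset L^2(\mathbb{R})$. On the other hand, it follows from the Fourier–Bloch decomposition (2.6) that

$$\forall \phi \in E_l : \quad \phi_n = \phi(2\pi n) = \int_{\mathbb{T}} \hat{\phi}_l(k)\, u_l(2\pi n; k)\, dk = \int_{\mathbb{T}} \hat{\phi}_l(k)\, u_l(0; k)\, e^{i 2\pi n k}\, dk, \quad \forall n \in \mathbb{Z}.$$

Inverting this representation by the Fourier series theory, we obtain that

$$\hat{\phi}_l(k)\, u_l(0; k) = \sum_{n \in \mathbb{Z}} e^{-i 2\pi n k} \phi_n,$$

where $u_l(0; k) \neq 0$ for all $k \in \mathbb{T}$ is assumed. As a result,

$$\forall \phi \in E_l : \quad \phi(x) = \int_{\mathbb{T}} \hat{\phi}_l(k)\, u_l(x; k)\, dk = \sum_{n \in \mathbb{Z}} \phi_n g_n(x),$$

provided that the integrals (A.1) for the Shannon functions $\{g_n(x)\}_{n \in \mathbb{Z}}$ converge absolutely and uniformly on $x \in \mathbb{R}$. This property follows from the exponential decay of the Shannon functions

$$|g_n(x)| \leq C e^{-\eta |x - 2\pi n|}, \quad \forall x \in \mathbb{R},$$

for some $C > 0$ and $\eta > 0$, which is proved similarly to the decay property (3.8) for the Wannier functions.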
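The interpolation property $g_n(2\pi n') = \delta_{n,n'}$ used in this proof can be illustrated on a toy example. The sketch below is an illustration under strong assumptions, not the construction of the paper: it takes $\mathbb{T} = [-1/2, 1/2]$ and a hypothetical Bloch function $u(x; k) = w(x) e^{ikx}$ whose periodic part `w` does not depend on $k$. In that case the integral (A.1) reduces to $\frac{w(x)}{w(0)}$ times the classical sinc kernel of Shannon sampling. The names `w`, `shannon_g` and `interpolate` are invented for this example.

```python
import math


def w(x):
    # Hypothetical 2*pi-periodic profile with w(0) != 0 (an assumption).
    return 1.0 + 0.5 * math.cos(x)


def shannon_g(n, x):
    """g_n(x) of (A.1) for the toy Bloch function u(x; k) = w(x) * exp(i k x)."""
    s = x - 2.0 * math.pi * n
    # Integral of exp(i k s) over k in [-1/2, 1/2].
    kernel = 1.0 if s == 0.0 else math.sin(s / 2.0) / (s / 2.0)
    return w(x) / w(0.0) * kernel


def interpolate(samples, x):
    """Evaluate sum_n phi_n g_n(x) for samples given as {n: phi_n}."""
    return sum(value * shannon_g(n, x) for n, value in samples.items())


samples = {-1: 0.2, 0: 1.0, 1: -0.4}
# Sampling the interpolant at x = 2*pi*n should return the original values.
values_at_nodes = [interpolate(samples, 2.0 * math.pi * n) for n in (-1, 0, 1)]
```

Sampling the interpolant at the nodes $x = 2\pi n$ recovers the prescribed values up to rounding, which is the content of $S S^{-1} = \mathrm{Id}$ in this toy setting.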

By Sturm–Liouville theory, the assumption u l (0; k) > 0 for all k ∈ T is satisfied for the lowest spectral band with l = 1. This assumption, however, may fail for some higher-order spectral bands with l > 1. Since our analysis is expected to work for any l ∈ N, we have avoided the Shannon decomposition and have used an equivalent Wannier decomposition, which does not rely on the assumption above. Shannon functions were applied to the justification of lattice equations for the lowest spectral band in [21].


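B. Proof of Lemma 4

Let $L = -\partial_x^2 + V(x)$, where $V$ is given by Assumption 2. For any $0 < \omega < b$, the solution $\phi(x)$ of the ODE $L\phi = \omega\phi$ on $[0, 2\pi]$ is obtained explicitly in the form

$$\phi(x) = \begin{cases} \phi(0) \cosh\bigl(\sqrt{b - \omega}\, x\bigr) + \dfrac{\phi'(0)}{\sqrt{b - \omega}} \sinh\bigl(\sqrt{b - \omega}\, x\bigr), & \forall x \in [0, a], \\[2mm] \phi(2\pi) \cos\bigl(\sqrt{\omega}(x - 2\pi)\bigr) + \dfrac{\phi'(2\pi)}{\sqrt{\omega}} \sin\bigl(\sqrt{\omega}(x - 2\pi)\bigr), & \forall x \in [a, 2\pi]. \end{cases} \tag{B.1}$$

The continuity of $\phi(x)$ and $\phi'(x)$ across the jump point $x = a$ leads to the 2-by-2 transfer matrix

$$\phi(2\pi) = a_{11} \phi(0) + a_{12} \phi'(0), \qquad \phi'(2\pi) = a_{21} \phi(0) + a_{22} \phi'(0), \tag{B.2}$$

where the explicit expressions for $\{a_{ij}\}_{1 \leq i,j \leq 2}$ show that $a_{11} a_{22} - a_{12} a_{21} = \det(A) = 1$ and $a_{11} + a_{22} = \mathrm{tr}(A)$ is given explicitly by

$$\mathrm{tr}(A) = 2 \cosh\bigl(a\sqrt{b - \omega}\bigr) \cos\bigl((2\pi - a)\sqrt{\omega}\bigr) + \frac{b - 2\omega}{\sqrt{\omega(b - \omega)}} \sinh\bigl(a\sqrt{b - \omega}\bigr) \sin\bigl((2\pi - a)\sqrt{\omega}\bigr). \tag{B.3}$$

This equation is valid for $0 < \omega < b$ and extends for $\omega > b$ to the equation

$$\mathrm{tr}(A) = 2 \cos\bigl(a\sqrt{\omega - b}\bigr) \cos\bigl((2\pi - a)\sqrt{\omega}\bigr) + \frac{b - 2\omega}{\sqrt{\omega(\omega - b)}} \sin\bigl(a\sqrt{\omega - b}\bigr) \sin\bigl((2\pi - a)\sqrt{\omega}\bigr). \tag{B.4}$$

The band functions $\omega_l(k)$ enumerated by $l \in \mathbb{N}$ and parameterized by $k \in \mathbb{T}$ correspond to the values of $\omega$ in the interval $|\mathrm{tr}(A)| \leq 2$. They are defined by the equation $\mathrm{tr}(A) = 2\cos(2\pi k)$. In the limit $\varepsilon \to 0$, where $b = \frac{1}{\varepsilon^2}$, $|\mathrm{tr}(A)|$ is bounded near the particular values $\omega = \left(\frac{\pi l}{2\pi - a}\right)^2$ for any $l \in \mathbb{N}$, such that the distance between two consecutive values of $\omega$ is finite. We shall rewrite the algebraic equation $\mathrm{tr}(A) = 2\cos(2\pi k)$, where $\mathrm{tr}(A)$ is given by (B.3) with $b = \frac{1}{\varepsilon^2}$, in the equivalent form:

$$\begin{aligned} \sin\bigl((2\pi - a)\sqrt{\omega}\bigr) &+ \frac{2\varepsilon\sqrt{\omega(1 - \varepsilon^2\omega)}}{1 - 2\varepsilon^2\omega} \cos\bigl((2\pi - a)\sqrt{\omega}\bigr) = \frac{4\varepsilon\sqrt{\omega(1 - \varepsilon^2\omega)}}{1 - 2\varepsilon^2\omega}\, e^{-\frac{a\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}} \cos(2\pi k) \\ &+ \left[ \sin\bigl((2\pi - a)\sqrt{\omega}\bigr) - \frac{2\varepsilon\sqrt{\omega(1 - \varepsilon^2\omega)}}{1 - 2\varepsilon^2\omega} \cos\bigl((2\pi - a)\sqrt{\omega}\bigr) \right] e^{-\frac{2a\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}}. \end{aligned} \tag{B.5}$$

At $\varepsilon = 0$, all roots of the algebraic equation (B.5) are simple. By the Lyapunov–Schmidt theory, the simple roots persist, and owing to the analyticity of the trigonometric functions, they persist in the form

$$\omega_l(k) = \hat{\omega}_{l,0}(\varepsilon) + \sum_{n \in \mathbb{N}} \varepsilon^n e^{-\frac{na}{\varepsilon}}\, \hat{\omega}_{l,n}(\varepsilon) \cos^n(2\pi k), \tag{B.6}$$

where all parameters $\hat{\omega}_{l,n}$ are continuous functions of $\varepsilon$, uniformly bounded in the limit $\varepsilon \to 0$. In particular,

$$\hat{\omega}_{l,0}(\varepsilon) = \frac{(\pi l)^2}{(2\pi - a)^2} + O(\varepsilon), \qquad \hat{\omega}_{l,1}(\varepsilon) = \frac{8(-1)^l (\pi l)^2}{(2\pi - a)^3} + O(\varepsilon), \quad \text{etc.}$$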
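The trace formula (B.3) can be cross-checked numerically by multiplying the two elementary transfer matrices. The following sketch is only an illustration: it assumes, in line with (B.1), that $V = b = 1/\varepsilon^2$ on $[0, a]$ and $V = 0$ on $[a, 2\pi]$, and the function names and sample values of $\omega$, $a$ and $\varepsilon$ are chosen for this example rather than taken from the paper.

```python
import math


def trace_formula(omega, a, eps):
    """tr(A) from the closed-form expression (B.3), for 0 < omega < b = 1/eps**2."""
    b = 1.0 / eps**2
    kappa = math.sqrt(b - omega)   # decay rate inside the barrier [0, a]
    q = math.sqrt(omega)           # wave number on the free part [a, 2*pi]
    free_len = 2.0 * math.pi - a
    return (2.0 * math.cosh(a * kappa) * math.cos(free_len * q)
            + (b - 2.0 * omega) / (kappa * q)
            * math.sinh(a * kappa) * math.sin(free_len * q))


def monodromy(omega, a, eps):
    """Matrix mapping (phi, phi') at x = 0 to (phi, phi') at x = 2*pi."""
    b = 1.0 / eps**2
    kappa = math.sqrt(b - omega)
    q = math.sqrt(omega)
    free_len = 2.0 * math.pi - a
    barrier = [[math.cosh(a * kappa), math.sinh(a * kappa) / kappa],
               [kappa * math.sinh(a * kappa), math.cosh(a * kappa)]]
    free = [[math.cos(free_len * q), math.sin(free_len * q) / q],
            [-q * math.sin(free_len * q), math.cos(free_len * q)]]
    # The barrier is crossed first, so it sits on the right of the product.
    return [[sum(free[i][m] * barrier[m][j] for m in range(2))
             for j in range(2)] for i in range(2)]


A = monodromy(1.3, math.pi, 0.5)
trace_from_matrix = A[0][0] + A[1][1]
determinant = A[0][0] * A[1][1] - A[0][1] * A[1][0]
trace_closed_form = trace_formula(1.3, math.pi, 0.5)
```

Both traces agree up to rounding and the determinant equals one, as stated after (B.2).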

Properties (4.1)–(4.3) follow from the representation (B.6) for sufficiently small $\varepsilon > 0$.

C. Proof of Lemma 5

Let us first rewrite the system of Eq. (3.5) in the form

$$-\hat{u}''_{l,0}(x) + V(x) \hat{u}_{l,0}(x) = \sum_{n \in \mathbb{Z}} \hat{\omega}_{l,n} \hat{u}_{l,n}(x), \quad \forall x \in \mathbb{R}. \tag{C.1}$$

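Consider the ODE $(L - \omega)\phi = f(x)$ for $L = -\partial_x^2 + V(x)$ on $[0, 2\pi]$. The explicit solution is

$$\phi(x) = \begin{cases} A e^{\frac{\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}(x - a)} + B e^{-\frac{\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}(x - a)} - \dfrac{\varepsilon}{2\sqrt{1 - \varepsilon^2\omega}} \displaystyle\int_0^a e^{-\frac{\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}|x - \xi|} f(\xi)\, d\xi, & \forall x \in [0, a], \\[3mm] C \cos\bigl(\sqrt{\omega}(x - a)\bigr) + D \sin\bigl(\sqrt{\omega}(x - a)\bigr) - \displaystyle\int_a^{2\pi} \frac{\sin\bigl(\sqrt{\omega}|x - \xi|\bigr)}{2\sqrt{\omega}} f(\xi)\, d\xi, & \forall x \in [a, 2\pi], \end{cases} \tag{C.2}$$

where $(A, B, C, D)$ are arbitrary constants. Because of the property (3.4), the ODE above corresponds to system (C.1) for $\omega = \hat{\omega}_{l,0}$, $\phi = \hat{u}_{l,0}$ and $f = \sum_{n \in \mathbb{Z} \setminus \{0\}} \hat{\omega}_{l,n} \hat{u}_{l,0}(x - 2\pi n)$. Because $\hat{\omega}_{l,n} = O(\varepsilon^n e^{-\frac{na}{\varepsilon}})$ by the expansion (B.6), whereas $\hat{u}_{l,0}(x)$ are uniformly bounded in $\varepsilon$, we have

$$\sup_{x \in [0, 2\pi]} |f(x)| \leq \varepsilon e^{-\frac{a}{\varepsilon}} (F_+ + F_-), \qquad F_+ = \sup_{x \in [0, 2\pi]} |\hat{u}_{l,0}(x + 2\pi)|, \quad F_- = \sup_{x \in [0, 2\pi]} |\hat{u}_{l,0}(x - 2\pi)|. \tag{C.3}$$

Matching the two solutions at $x = a$ for $\phi(x)$ and $\phi'(x)$, we find that

$$A = \frac{1}{2}\left( C + \frac{\varepsilon\sqrt{\omega}}{\sqrt{1 - \varepsilon^2\omega}} D \right) + \varepsilon e^{-\frac{a}{\varepsilon}} \tilde{A}, \qquad B = \frac{1}{2}\left( C - \frac{\varepsilon\sqrt{\omega}}{\sqrt{1 - \varepsilon^2\omega}} D \right) + \varepsilon e^{-\frac{a}{\varepsilon}} \tilde{B},$$

where $\tilde{A}$ and $\tilde{B}$ are uniformly bounded in $|\varepsilon| < \varepsilon_0$. As $\varepsilon \to 0$, the homogeneous solution is bounded only if $B = \varepsilon e^{-\frac{a}{\varepsilon}} B'$, where $B'$ is a new parameter. This constraint results in the relation

$$C = \frac{\varepsilon\sqrt{\omega}}{\sqrt{1 - \varepsilon^2\omega}} D + \varepsilon e^{-\frac{a}{\varepsilon}} \tilde{C},$$

where $\tilde{C}$ is uniformly bounded in $|\varepsilon| < \varepsilon_0$. As a result, we rewrite the solution (C.2) in the form

$$\hat{u}_{l,0} = \begin{cases} D \dfrac{\varepsilon\sqrt{\hat{\omega}_{l,0}}}{\sqrt{1 - \varepsilon^2\hat{\omega}_{l,0}}}\, e^{\frac{\sqrt{1 - \varepsilon^2\hat{\omega}_{l,0}}}{\varepsilon}(x - a)} + B' \varepsilon\, e^{-\frac{\sqrt{1 - \varepsilon^2\hat{\omega}_{l,0}}}{\varepsilon} x}, & \forall x \in [0, a], \\[3mm] D \sin\bigl(\sqrt{\hat{\omega}_{l,0}}(x - a)\bigr) + D \dfrac{\varepsilon\sqrt{\hat{\omega}_{l,0}}}{\sqrt{1 - \varepsilon^2\hat{\omega}_{l,0}}} \cos\bigl(\sqrt{\hat{\omega}_{l,0}}(x - a)\bigr), & \forall x \in [a, 2\pi], \end{cases} \; + \; O\bigl(\varepsilon e^{-\frac{a}{\varepsilon}}\bigr). \tag{C.4}$$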


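The parameter $D$ is fixed by the normalization condition $\|\hat{u}_{l,0}\|_{L^2(\mathbb{R})} = 1$, from which the property (4.4) on $[0, 2\pi]$ is proved. Consider now the same ODE $(L - \omega)\phi = f(x)$ on $[2\pi, 4\pi]$. The explicit solution $\phi \equiv \phi_1$ is now written in the form

$$\phi_1(x) = \begin{cases} A_1 e^{\frac{\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}(x - 2\pi - a)} + B_1 e^{-\frac{\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}(x - 2\pi - a)} - \dfrac{\varepsilon}{2\sqrt{1 - \varepsilon^2\omega}} \displaystyle\int_{2\pi}^{a + 2\pi} e^{-\frac{\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}|x - \xi|} f(\xi)\, d\xi, & \forall x \in [2\pi, 2\pi + a], \\[3mm] C_1 \cos\bigl(\sqrt{\omega}(x - 2\pi - a)\bigr) + D_1 \sin\bigl(\sqrt{\omega}(x - 2\pi - a)\bigr) - \displaystyle\int_{a + 2\pi}^{4\pi} \frac{\sin\bigl(\sqrt{\omega}|x - \xi|\bigr)}{2\sqrt{\omega}} f(\xi)\, d\xi, & \forall x \in [2\pi + a, 4\pi]. \end{cases}$$

We have

$$\sup_{x \in [2\pi, 4\pi]} |f(x)| \leq \varepsilon e^{-\frac{a}{\varepsilon}} \Bigl( 1 + \sup_{x \in [0, 2\pi]} |\hat{u}_{l,0}(x + 4\pi)| \Bigr),$$

where we have used the bound (4.4). By matching the solution at $x = 2\pi + a$, we obtain the constraints on parameters of the solution:

$$A_1 = \frac{1}{2}\left( C_1 + \frac{\varepsilon\sqrt{\omega}}{\sqrt{1 - \varepsilon^2\omega}} D_1 \right) + \varepsilon e^{-\frac{a}{\varepsilon}} \tilde{A}_1, \qquad B_1 = \frac{1}{2}\left( C_1 - \frac{\varepsilon\sqrt{\omega}}{\sqrt{1 - \varepsilon^2\omega}} D_1 \right) + \varepsilon e^{-\frac{a}{\varepsilon}} \tilde{B}_1,$$

where $\tilde{A}_1$ and $\tilde{B}_1$ are uniformly bounded in $|\varepsilon| < \varepsilon_0$. Now we apply the continuity conditions $\phi(2\pi) = \phi_1(2\pi)$ and $\phi'(2\pi) = \phi_1'(2\pi)$, which relate coefficients $A_1$ and $B_1$ to $D$:

$$A_1 e^{-\frac{a\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}} = \frac{1 - 2\varepsilon^2\omega}{2(1 - \varepsilon^2\omega)} D \left[ \sin\bigl((2\pi - a)\sqrt{\omega}\bigr) + \frac{2\varepsilon\sqrt{\omega(1 - \varepsilon^2\omega)}}{1 - 2\varepsilon^2\omega} \cos\bigl((2\pi - a)\sqrt{\omega}\bigr) \right] + \varepsilon e^{-\frac{a}{\varepsilon}} F_1,$$

$$B_1 e^{\frac{a\sqrt{1 - \varepsilon^2\omega}}{\varepsilon}} = \frac{1}{2(1 - \varepsilon^2\omega)} D \sin\bigl((2\pi - a)\sqrt{\omega}\bigr) + \varepsilon e^{-\frac{a}{\varepsilon}} G_1,$$

where $F_1$ and $G_1$ are uniformly bounded in $|\varepsilon| < \varepsilon_0$. If $\omega = \hat{\omega}_{l,0}$, the second equation implies that $B_1 = \varepsilon e^{-\frac{a}{\varepsilon}} B_1'$, such that

$$C_1 = \frac{\varepsilon\sqrt{\omega}}{\sqrt{1 - \varepsilon^2\omega}} D_1 + \varepsilon e^{-\frac{a}{\varepsilon}} \tilde{C}_1,$$

where $B_1'$ and $\tilde{C}_1$ are uniformly bounded in $|\varepsilon| < \varepsilon_0$. Substituting the expansion (B.6) into the algebraic equation (B.5), we find that, if $\omega = \hat{\omega}_{l,0}$, then $A_1 = \varepsilon F_1 + O(\varepsilon^2 e^{-\frac{a}{\varepsilon}})$. Since $F_1$ is linear with respect to $\tilde{C}$ and $\tilde{C}$ is linear with respect to $B'$, one can choose $B'$ so that $F_1 = O\bigl(\varepsilon e^{-\frac{a}{\varepsilon}}\bigr)$, after which

$$A_1 = O\bigl(\varepsilon^2 e^{-\frac{a}{\varepsilon}}\bigr), \qquad C_1 = O\bigl(\varepsilon^2 e^{-\frac{a}{\varepsilon}}\bigr), \qquad D_1 = O\bigl(\varepsilon e^{-\frac{a}{\varepsilon}}\bigr).$$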


Moreover, since

$$F_+ = \sup_{x \in [0, 2\pi]} |\hat{u}_{l,0}(x + 2\pi)| = O\bigl(\varepsilon e^{-\frac{a}{\varepsilon}}\bigr),$$

it is also clear that $B' = O\bigl(\varepsilon e^{-\frac{a}{\varepsilon}}\bigr)$. This construction proves the bound (4.5) on $[2\pi, 4\pi]$. Continuing the ODE analysis on the intervals $[2\pi n, 2\pi(n + 1)]$ with $n \geq 2$, the bound (4.5) is extended to any $x \in [2\pi, 2\pi(n + 1)]$. It is left as an exercise for the reader to use the same method to prove the bound (4.5) for $x < 0$.

D. Poincaré Mappings

We review here the Poincaré map for the second-order equation (1.1) with a $2\pi$-periodic coefficient $V$, in comparison with the second-order difference equation (1.2). Denote $\phi(2\pi n) = \phi_n$ and $\phi'(2\pi n) = \psi_n$, $\forall n \in \mathbb{Z}$, and consider the initial value problem for the second-order equation (1.1) on the interval $[2\pi n, 2\pi(n + 1)]$ for a fixed $n \in \mathbb{Z}$. By the theorem on local existence and smoothness of solutions of the initial-value problem, there exists a continuously differentiable solution $\phi(x)$ on $[2\pi n, 2\pi(n + 1)]$ if the function $V$ is piecewise continuous and $\delta_0 := |\phi_n| + |\psi_n|$ is sufficiently small. The Poincaré map is then defined in the form

$$u_{n+1} = P(u_n), \qquad u_n = \begin{pmatrix} \phi_n \\ \psi_n \end{pmatrix}, \quad \forall n \in \mathbb{Z}, \tag{D.1}$$

where $P : \mathbb{C}^2 \to \mathbb{C}^2$ is a continuously differentiable function. On one hand, the difference map (D.1) is exactly equivalent to the second-order equation (1.1) with the periodic function $V$. On the other hand, the Poincaré map $P(u_n)$ is generally different from the second-order difference equation (1.2). We will show that the Poincaré map (D.1) reduces to the scalar equation (1.2) in the near-linear limit, when the cubic term $|\phi_n|^2 \phi_n$ is small compared to the second-order difference term $\phi_{n+1} + \phi_{n-1}$. This limit differs from the domain of applicability of the lattice equation (1.2) justified in Theorem 1, where all terms are considered to be of the same order.

In the linear theory, $P(u_n) = A u_n$, where $A$ is a monodromy matrix with the elements $a_{ij}$ for $1 \leq i, j \leq 2$. Since the Wronskian determinant of the second-order ODE (1.1) is constant in $x$, the coefficients satisfy the constraint $\det(A) = a_{11} a_{22} - a_{12} a_{21} = 1$. Eliminating $\psi_n$ from the system (D.1) with $P(u_n) = A u_n$, we obtain that

$$\phi_{n+1} + \phi_{n-1} = \Delta(\omega) \phi_n, \quad \forall n \in \mathbb{Z}, \tag{D.2}$$

where $\Delta(\omega) = \mathrm{tr}(A) = a_{11} + a_{22}$. The Floquet theory follows immediately from the linear second-order map (D.2) since the spectral bands are found from solutions of the equation $\Delta(\omega) = 2\cos(2\pi k)$ for all $k \in \mathbb{T}$. It is proved in [7] that this equation admits infinitely many solutions, which can be enumerated by the index $l \in \mathbb{N}$ and ordered as follows: $\omega_1(k) \leq \omega_2(k) \leq \cdots \leq \omega_l(k) \leq \cdots$ for any $k \in \mathbb{T}$.

Consider the nonlinear Poincaré map (D.1) in the limit where the cubic terms are small, i.e. when $\delta := \sup_{x \in [2\pi n, 2\pi(n+1)]} \bigl( |\phi(x)| + |\phi'(x)| \bigr)$ is sufficiently small. Expanding $P(u_n)$ into the Taylor series in $u_n$ and eliminating $\psi_n$ from the second equation by using the near-identity transformations, we obtain the perturbed second-order difference map,

$$\begin{aligned} \phi_{n+1} + \phi_{n-1} - \Delta(\omega)\phi_n = \sigma \Bigl[ & \alpha_1 |\phi_n|^2 \phi_n + \alpha_2 |\phi_n|^2 (\phi_{n+1} + \phi_{n-1}) + \alpha_3 \phi_n^2 (\bar{\phi}_{n+1} + \bar{\phi}_{n-1}) \\ & + \alpha_4 (|\phi_{n+1}|^2 + |\phi_{n-1}|^2) \phi_n + \alpha_5 (\bar{\phi}_{n+1} \phi_{n-1} + \phi_{n+1} \bar{\phi}_{n-1}) \phi_n \\ & + \alpha_6 (\phi_{n+1}^2 + \phi_{n-1}^2) \bar{\phi}_n + \alpha_7 \phi_{n+1} \phi_{n-1} \bar{\phi}_n \\ & + \alpha_8 (|\phi_{n+1}|^2 \phi_{n+1} + |\phi_{n-1}|^2 \phi_{n-1}) + \alpha_9 (\phi_{n+1}^2 \bar{\phi}_{n-1} + \phi_{n-1}^2 \bar{\phi}_{n+1}) \\ & + \alpha_{10} (|\phi_{n+1}|^2 \phi_{n-1} + |\phi_{n-1}|^2 \phi_{n+1}) \Bigr], \quad \forall n \in \mathbb{Z}, \end{aligned} \tag{D.3}$$

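where $(\alpha_1, \alpha_2, \ldots, \alpha_{10})$ are some coefficients. The perturbed equation (D.3) contains all cubic terms which preserve the gauge invariance and reversibility of the original ODE (1.1), such that if $\{\phi_n\}_{n \in \mathbb{Z}}$ is a solution, then $\{\phi_n e^{i\theta}\}_{n \in \mathbb{Z}}$ and $\{\phi_{-n}\}_{n \in \mathbb{Z}}$ are also solutions for any $\theta \in \mathbb{R}$. The actual values of the coefficients of the cubic terms depend on the potential function $V$. See [15] for analysis of localized solutions of the second-order difference equation (D.3).

Proposition 4. Let $\{\phi_n\}_{n \in \mathbb{Z}}$ be a real-valued solution of the lattice equation (D.3) such that $\|\vec{\phi}\|_{l^2(\mathbb{Z})}$ is small. There exists a near-identity transformation

$$\phi_n = \varphi_n + \sigma B(\omega) \varphi_n^3 + O\bigl( \|\vec{\varphi}\|^5_{l^2(\mathbb{Z})} \bigr), \tag{D.4}$$

which transforms the lattice equation (D.3) to the canonical form

$$\varphi_{n+1} + \varphi_{n-1} - \Delta(\omega) \varphi_n = \sigma A(\omega) \varphi_n^3 + O\bigl( \|\vec{\varphi}\|^5_{l^2(\mathbb{Z})} \bigr), \tag{D.5}$$

for some constants $A(\omega)$ and $B(\omega)$.

Proof. Let $\{\phi_n\}_{n \in \mathbb{Z}}$ be a real-valued solution of the lattice equation (D.3), such that the right-hand side can be rewritten in the form

$$\beta_1 \phi_n^3 + \beta_2 \phi_n^2 (\phi_{n+1} + \phi_{n-1}) + \beta_3 (\phi_{n+1}^2 + \phi_{n-1}^2) \phi_n + \beta_4 \phi_{n+1} \phi_{n-1} \phi_n + \beta_5 (\phi_{n+1}^3 + \phi_{n-1}^3) + \beta_6 \phi_{n+1} \phi_{n-1} (\phi_{n+1} + \phi_{n-1}),$$

where $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2 + \alpha_3$, $\beta_3 = \alpha_4 + \alpha_6$, $\beta_4 = 2\alpha_5 + \alpha_7$, $\beta_5 = \alpha_8$ and $\beta_6 = \alpha_9 + \alpha_{10}$. Substituting the leading-order equation (D.2) into the terms above, we obtain

$$\left[ \beta_1 + \Delta \beta_2 + \frac{1}{3} \Delta^2 (\beta_3 + \beta_4) + \frac{1}{3} \Delta^3 \beta_6 \right] \phi_n^3 + \left[ \beta_5 - \frac{1}{3} \beta_6 - \frac{1}{3\Delta} (\beta_4 - 2\beta_3) \right] (\phi_{n+1}^3 + \phi_{n-1}^3).$$

Using the near-identity transformation (D.4) with $B(\omega) = \beta_5 - \frac{1}{3}\beta_6 - \frac{1}{3\Delta}(\beta_4 - 2\beta_3)$, we arrive at the canonical form (D.5) with $A(\omega) = \beta_1 + \Delta \beta_2 + \frac{1}{3}\Delta^2(\beta_3 + \beta_4) + \frac{1}{3}\Delta^3 \beta_6 + \Delta B(\omega)$.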
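The algebra behind the substitution step of this proof can be checked numerically. The sketch below is an illustration, not part of the paper: it writes `Delta` for the coefficient of $\phi_n$ in the linear relation (D.2), draws random real values, enforces $\phi_{n+1} + \phi_{n-1} = \Delta \phi_n$, and compares the six-term cubic form with its two-term reduction. The function names and the random sampling are assumptions made for this example.

```python
import random


def cubic_form(phi_minus, phi_0, phi_plus, beta):
    """Real-valued cubic terms written with the coefficients beta_1..beta_6."""
    b1, b2, b3, b4, b5, b6 = beta
    return (b1 * phi_0**3
            + b2 * phi_0**2 * (phi_plus + phi_minus)
            + b3 * (phi_plus**2 + phi_minus**2) * phi_0
            + b4 * phi_plus * phi_minus * phi_0
            + b5 * (phi_plus**3 + phi_minus**3)
            + b6 * phi_plus * phi_minus * (phi_plus + phi_minus))


def reduced_form(phi_minus, phi_0, phi_plus, beta, Delta):
    """Same terms after using phi_plus + phi_minus = Delta * phi_0."""
    b1, b2, b3, b4, b5, b6 = beta
    coeff_center = b1 + Delta * b2 + Delta**2 * (b3 + b4) / 3.0 + Delta**3 * b6 / 3.0
    coeff_outer = b5 - b6 / 3.0 - (b4 - 2.0 * b3) / (3.0 * Delta)
    return coeff_center * phi_0**3 + coeff_outer * (phi_plus**3 + phi_minus**3)


random.seed(0)
max_gap = 0.0
for _ in range(100):
    Delta = random.uniform(0.5, 3.0)
    phi_plus = random.uniform(-1.0, 1.0)
    phi_minus = random.uniform(-1.0, 1.0)
    phi_0 = (phi_plus + phi_minus) / Delta  # enforce the linear relation
    beta = [random.uniform(-1.0, 1.0) for _ in range(6)]
    gap = abs(cubic_form(phi_minus, phi_0, phi_plus, beta)
              - reduced_form(phi_minus, phi_0, phi_plus, beta, Delta))
    max_gap = max(max_gap, gap)
```

The two expressions coincide up to rounding, which rests on the identities $(\phi_{n+1} + \phi_{n-1})^2 = \phi_{n+1}^2 + \phi_{n-1}^2 + 2\phi_{n+1}\phi_{n-1}$ and $(\phi_{n+1} + \phi_{n-1})^3 = \phi_{n+1}^3 + \phi_{n-1}^3 + 3\phi_{n+1}\phi_{n-1}(\phi_{n+1} + \phi_{n-1})$.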


The canonical form (D.5) does not hold when the potential function $V$ is defined by Assumption 2 for sufficiently small $\varepsilon$. In the lattice equation (1.2) justified by Theorem 1, the cubic term $|\phi_n|^2 \phi_n$ cannot be considered to be small compared to the second-order difference term $\phi_{n+1} + \phi_{n-1}$, which implies that $\Delta(\omega)$ in the canonical form (D.5) is large. Compared to the Poincaré map (D.1), the Wannier decomposition replaces the second-order equation by the lattice system with infinite coupling between lattice sites, which is reduced asymptotically to the second-order difference map. Notice that the analysis above implies that the lattice system for coefficients of the Wannier functions must be satisfied exactly by the second-order Poincaré map (D.1) after a nonlocal transformation.

Acknowledgements. The work of D. Pelinovsky is supported by the EPSRC and Humboldt Research Fellowships. The work of G. Schneider is supported by the Graduiertenkolleg 1294 “Analysis, simulation and design of nano-technological processes” granted by the Deutsche Forschungsgemeinschaft (DFG) and the Land Baden-Württemberg. The work of R.S. MacKay is supported by EPSRC grant EP/D069513/1.

References

1. Alfimov, G.L., Kevrekidis, P.G., Konotop, V.V., Salerno, M.: Wannier functions analysis of the nonlinear Schrödinger equation with a periodic potential. Phys. Rev. E 66, 046608 (2002)
2. Busch, K., Schneider, G., Tkeshelashvili, L., Uecker, H.: Justification of the nonlinear Schrödinger equation in spatially periodic media. Z. Angew. Math. Phys. 57, 905–939 (2006)
3. Desyatnikov, A.S., Kivshar, Yu.S., Torner, L.: Optical vortices and vortex solitons. In: Progress in Optics, Vol. 47, Ed. E. Wolf, Amsterdam: Elsevier, 2005, pp. 219–319
4. Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Mathematical Society Lecture Notes 268, Cambridge: Cambridge University Press, 1999
5. Doelman, A., Sandstede, B., Scheel, A., Schneider, G.: The dynamics of modulated wave trains. Memoirs of the AMS, to appear (2008)
6. Dohnal, T., Pelinovsky, D., Schneider, G.: Coupled-mode equations and gap solitons in a two-dimensional nonlinear elliptic problem with a periodic potential. J. Nonlin. Sci., to appear (2008)
7. Eastham, M.S.: The Spectral Theory of Periodic Differential Equations. Edinburgh: Scottish Academic Press, 1973
8. Helffer, B.: Semi-classical analysis for the Schrödinger operator and applications. Lecture Notes in Mathematics 1336, New York: Springer, 1988
9. Kohn, W.: Analytic properties of Bloch waves and Wannier functions. Phys. Rev. 115, 809–821 (1959)
10. Lukas, M., Pelinovsky, D., Kevrekidis, P.G.: Lyapunov–Schmidt reduction algorithm for three-dimensional discrete vortices. Physica D 237, 339–350 (2008)
11. MacKay, R.S., Aubry, S.: Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity 7, 1623–1643 (1994)
12. Panati, G.: Triviality of Bloch and Bloch–Dirac bundles. Ann. Henri Poincaré 8, 995–1011 (2007)
13. Pankov, A.: Periodic nonlinear Schrödinger equation with application to photonic crystals. Milan J. Math. 73, 259–287 (2005)
14. Pelinovsky, D.E.: “Asymptotic reductions of the Gross–Pitaevskii equation”. In: Emergent Nonlinear Phenomena in Bose–Einstein Condensates. Eds. Kevrekidis, P.G., Frantzeskakis, D.J., Carretero-Gonzalez, R., New York: Springer-Verlag, 2008, pp. 377–398
15. Pelinovsky, D.E.: Translationally invariant nonlinear Schrödinger lattices. Nonlinearity 19, 2695–2716 (2006)
16. Pelinovsky, D.E., Kevrekidis, P.G., Frantzeskakis, D.J.: Persistence and stability of discrete vortices in nonlinear Schrödinger lattices. Physica D 212, 20–53 (2005)
17. Pelinovsky, D., Schneider, G.: Justification of the coupled-mode approximation for a nonlinear elliptic problem with a periodic potential. Applic. Anal. 86, 1017–1036 (2007)
18. Pitaevskii, L., Stringari, S.: Bose–Einstein Condensation. Oxford: Oxford University Press, 2003
19. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. IV. Analysis of Operators. New York: Academic Press, 1978
20. Sandstede, B.: Stability of multiple-pulse solutions. Trans. Am. Math. Soc. 350, 429–472 (1998)
21. Scheel, A., Van Vleck, E.S.: Lattice differential equations embedded into reaction–diffusion systems. Proc. Royal Soc. of Edinburgh, to appear (2008)
22. Slusher, R.E., Eggleton, B.J. (eds): Nonlinear photonic crystals. Berlin: Springer, 2003
23. Slater, J.C.: A soluble problem in energy bands. Phys. Rev. 87, 807–835 (1952)
24. Unser, M.: Sampling - 50 years after Shannon. Proc. IEEE 88, 569–587 (2000)
25. Zelik, S., Mielke, A.: Multi-pulse evolution and space–time chaos in dissipative systems. Memoirs of the AMS, to appear (2008)

Communicated by P. Constantin

Commun. Math. Phys. 284, 833–865 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0646-7

Communications in

Mathematical Physics

The Jancovici–Lebowitz–Manificat Law for Large Fluctuations of Random Complex Zeroes F. Nazarov1, , M. Sodin2, , A. Volberg3, 1 Mathematics Department, University of Wisconsin-Madison, 480 Lincoln Dr., Madison,

WI 53706, USA. E-mail: [email protected]

2 School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel.

E-mail: [email protected]

3 Department of Mathematics, Michigan State University, East Lansing,

MI 48824, USA. E-mail: [email protected] Received: 3 September 2007 / Accepted: 25 June 2008 Published online: 15 October 2008 – © Springer-Verlag 2008

Abstract: Consider a Gaussian Entire Function

$$f(z) = \sum_{k=0}^{\infty} \zeta_k \frac{z^k}{\sqrt{k!}},$$

where $\zeta_0, \zeta_1, \ldots$ are Gaussian i.i.d. complex random variables. The zero set of this function is distribution invariant with respect to the isometries of the complex plane. Let $n(R)$ be the number of zeroes of $f$ in the disk of radius $R$. It is easy to see that $\mathbb{E}\, n(R) = R^2$, and it is known that the variance of $n(R)$ grows linearly with $R$ (Forrester and Honner). We prove that, for every $\alpha > 1/2$, the tail probability $\mathbb{P}\{ |n(R) - R^2| > R^\alpha \}$ behaves as $\exp\bigl(-R^{\varphi(\alpha)}\bigr)$ with some explicit piecewise linear function $\varphi(\alpha)$. For some special values of the parameter $\alpha$, this law was found earlier by Sodin and Tsirelson, and by Krishnapur. In the context of charge fluctuations of a one-component Coulomb system of particles of one sign embedded into a uniform background of another sign, a similar law was discovered some time ago by Jancovici, Lebowitz and Manificat.

1. Introduction

Consider the Fock-Bargmann space of the entire functions of one complex variable that are square integrable with respect to the measure $\frac{1}{\pi} e^{-|z|^2}\, dm(z)$, where $m$ is the Lebesgue measure on $\mathbb{C}$. Let $f$ be a Gaussian function associated with this space; i.e.,

$$f(z) = \sum_{k \geq 0} \zeta_k e_k(z),$$

Partially supported by the National Science Foundation, DMS grant 0501067. Partially supported by the Israel Science Foundation of the Israel Academy of Sciences and Humanities, grants 357/04 and 171/07.


where $\zeta_k$ are independent standard complex Gaussian random variables (that is, the density of $\zeta_k$ on the complex plane $\mathbb{C}$ is $\frac{1}{\pi} e^{-|w|^2}$), and $\{e_k\}$ is an orthonormal basis in the Fock-Bargmann space. The Gaussian function $f$ does not depend on the choice of the basis $\{e_k\}$, so usually one takes the standard basis $e_k(z) = \frac{z^k}{\sqrt{k!}}$, $k \in \mathbb{Z}_+$. In what follows, we call $f$ a Gaussian Entire Function (G.E.F., for short). G.E.F. together with other similar models were introduced in the 90's in the works of Bogomolny, Bohigas, Leboeuf [1], and Hannay [5].

A remarkable feature of the zero set $Z_f = f^{-1}\{0\}$ of a G.E.F. is its distribution invariance with respect to the isometries of $\mathbb{C}$. The rotation invariance is obvious since the distribution of the function $f$ is rotation invariant. The translation invariance follows, for instance, from the fact that the operators $(T_w g)(z) = g(w + z)\, e^{-z\bar{w}}\, e^{-|w|^2/2}$, $w \in \mathbb{C}$, are unitary operators in the Fock-Bargmann space, and therefore, if $f$ is a G.E.F., then $T_w f$ is a G.E.F. as well (see Sect. 2.2 below). It is worth mentioning that by Calabi's rigidity [11, Sect. 3], $f(z)$ together with its scalings $f(tz)$, $t > 0$, are the only Gaussian functions analytic in $\mathbb{C}$ with the distribution of zeroes invariant with respect to the isometries of $\mathbb{C}$. See [12, Part I] for further discussion.

Let $n(R) = \operatorname{Card}\bigl( Z_f \cap R\mathbb{D} \bigr)$ be the number of zeroes of $f$ in the disk of radius $R$. It is not hard to check that the mean number of points of $Z_f$ per unit area equals $\frac{1}{\pi}$ (cf. Sect. 2.3). Therefore, $\mathbb{E}\, n(R) = R^2$. The asymptotics of the variance of $n(R)$ was computed by Forrester and Honner in [2]:

$$\mathbb{E}\bigl( n(R) - R^2 \bigr)^2 = cR + o(R), \quad R \to \infty,$$

with an explicitly computed positive $c$. In [10], Shiffman and Zelditch gave a different computation of the asymptotics of the variance valid in a more general context. The normalized random variables $\dfrac{n(R) - R^2}{\sqrt{\operatorname{Var} n(R)}}$ converge in distribution to the standard Gaussian random variable.
This can be proven, for instance, by a suitable modification of the argument used in [12, Part I]. In this work, we describe the probabilities of large fluctuations of the random variable $n(R) - R^2$.

Theorem 1. For every $\alpha \geq \frac{1}{2}$ and every $\varepsilon > 0$,

$$e^{-R^{\varphi(\alpha) + \varepsilon}} < \mathbb{P}\bigl\{ |n(R) - R^2| > R^\alpha \bigr\} < e^{-R^{\varphi(\alpha) - \varepsilon}} \tag{1.1}$$

for all sufficiently large $R > R_0(\alpha, \varepsilon)$, where

$$\varphi(\alpha) = \begin{cases} 2\alpha - 1, & \frac{1}{2} \leq \alpha \leq 1; \\ 3\alpha - 2, & 1 \leq \alpha \leq 2; \\ 2\alpha, & \alpha \geq 2. \end{cases}$$

In a different context of charge fluctuations of a one-component Coulomb system of particles of one sign embedded into a uniform background of the opposite sign, a similar law was discovered by Jancovici, Lebowitz and Manificat in their physical paper [4]. Let us mention that it is known since Ginibre's classical paper [3] that the class of point processes considered by Jancovici, Lebowitz and Manificat contains as a special case the $N \to \infty$ limit of the eigenvalue point process of the ensemble of $N \times N$ random matrices with independent standard complex Gaussian entries. The resemblance between the zeroes of G.E.F. and the eigenvalues of Ginibre's ensemble was discussed both in the physical and the mathematical literature.

Now, let us return to the zeroes of G.E.F. In some cases, the estimate (1.1) is known. As we have already mentioned, it is known for $\alpha = \frac{1}{2}$, when it follows from the asymptotics of the variance and the asymptotic normality. In the case $\alpha = 2$ it follows from a result of Sodin and Tsirelson [12, Part III], which says that for each $R \geq 1$,

$$e^{-CR^4} \leq \mathbb{P}\bigl\{ |n(R) - R^2| > R^2 \bigr\} \leq e^{-cR^4}$$

with some positive numerical constants $c$ and $C$. In [7], Krishnapur considered the case $\alpha > 2$ and proved that in that case

$$\mathbb{P}\bigl\{ n(R) > R^\alpha \bigr\} = e^{-\left( \frac{\alpha}{2} - 1 \right)(1 + o(1)) R^{2\alpha} \log R}, \quad R \to \infty.$$
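For orientation, the exponent of Theorem 1 is easy to tabulate. The following sketch simply encodes the three linear pieces $2\alpha - 1$, $3\alpha - 2$ and $2\alpha$; the function name `phi` and the sample values of $\alpha$ are choices made for this illustration.

```python
def phi(alpha):
    """Exponent of Theorem 1: piecewise linear in alpha for alpha >= 1/2."""
    if alpha < 0.5:
        raise ValueError("Theorem 1 covers alpha >= 1/2 only")
    if alpha <= 1.0:
        return 2.0 * alpha - 1.0
    if alpha <= 2.0:
        return 3.0 * alpha - 2.0
    return 2.0 * alpha


# The pieces match at the break points alpha = 1 and alpha = 2.
table = [(alpha, phi(alpha)) for alpha in (0.5, 0.75, 1.0, 1.5, 2.0, 3.0)]
```

The value $\varphi(2) = 4$ agrees with the Sodin–Tsirelson bounds of order $e^{-cR^4}$, and $\varphi(\alpha) = 2\alpha$ for $\alpha > 2$ agrees with the power $R^{2\alpha}$ in Krishnapur's asymptotics up to the logarithmic factor.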

In the same work, he also proved the lower bound in the case $1 < \alpha < 2$:

$$\mathbb{P}\bigl\{ |n(R) - R^2| > R^\alpha \bigr\} \geq e^{-CR^{3\alpha - 2}}.$$

Using a certain development of his method, we'll get the lower bound

$$\mathbb{P}\bigl\{ |n(R) - R^2| > R^\alpha \bigr\} \geq e^{-CR^{2\alpha - 1}}, \quad \tfrac{1}{2} < \alpha < 1.$$

Apparently, in the case $\frac{1}{2} < \alpha < 2$, the technique used in [12, Part III] and [7] does not allow one to treat the upper bounds in the law (1.1), which require new ideas.

Outline of the proof. Let us sketch the main ideas we use in the proof of Theorem 1.

1. We denote by $\Delta_I \arg f$ the increment of the argument of a G.E.F. $f$ over an arc $I \subset R\mathbb{T}$ oriented counterclockwise, and set $\delta(f, I) = \Delta_I \arg f - \mathbb{E}\, \Delta_I \arg f$. Then by the argument principle, $2\pi\bigl( n(R) - R^2 \bigr) = \delta(f, R\mathbb{T})$. Note that the random variable $\delta(f, I)$ is set-additive and split the circumference $R\mathbb{T}$ into $N = \frac{2\pi R}{r}$ disjoint arcs $I_j$ of length $r$. Thus we need to estimate the probability of the event

$$\Omega_\alpha(R) = \Bigl\{ \Bigl| \sum_{j=1}^{N} \delta(f, I_j) \Bigr| > 2\pi R^\alpha \Bigr\}.$$

2. Let us fix an arc $I$ of length $r$ and look more closely at the tails of the random variable $\delta(f, I)$. It is not hard to check that $\delta(f, I) = \delta(T_w f, I - w)$, where $w$ is the midpoint of the arc $I$ and $T_w f(z) = f(w + z)\, e^{-z\bar{w}}\, e^{-|w|^2/2}$. A classical complex analysis argument shows that for any analytic function $g$ in the disk $2r\mathbb{D}$ and any “good” arc $\gamma \subset r\mathbb{D}$ of length at most $r$, one has

$$|\Delta_\gamma \arg g| \leq C \log \frac{\max_{2r\mathbb{D}} |g|}{\max_{r\mathbb{D}} |g|},$$

see Lemma 9. Then, estimating the probability that, for a G.E.F. $g = T_w f$, the doubling exponent $\log \frac{\max_{2r\mathbb{D}} |g|}{\max_{r\mathbb{D}} |g|}$ is large, we come up with the tail estimate

$$\mathbb{P}\bigl\{ |\delta(f, I)| > M r^2 \bigr\} \leq \exp\Bigl( -\frac{C M^2}{\log M}\, r^4 \Bigr), \quad M \gg 1.$$

3. Now, let us come back to the sum $\sum_{j=1}^{N} \delta(f, I_j)$. The random variables $\delta(f, I_j)$ are not independent, however in [9, Theorem 3.2] we've introduced an “almost independence device” that allows us to think about these random variables as of independent ones, provided that the arcs $I_j$ are well-separated from each other. Here we'll need a certain extension of that result (Lemma 5 below).

4. To see how the almost independence and the tail estimate work, first, consider the case $1 < \alpha < 2$. We split the circumference $R\mathbb{T}$ into $N$ disjoint arcs $\{I_j\}$ of length $r$. In view of the tail estimate in Item 2, we need to distribute the total deviation $R^\alpha$ between these arcs in such a way that the “deviation per arc” $R^\alpha / N$ is bigger than $r^2$. Since $N \simeq \frac{R}{r}$, this leads to the choice of $r$ comparable to $R^{\alpha - 1}$. Then we consider the event that for a fixed subset $J \subset \{1, 2, \ldots, N\}$ and for every $j \in J$, one has $|\delta(f, I_j)| \geq m_j r^2$, where $m_j$ are some big positive integer powers of 2 that satisfy

$$\sum_{j \in J} m_j r^2 \gtrsim R^\alpha. \tag{1.2}$$

Then we choose a well-separated sub-collection of arcs $J' \subset J$ that falls under the assumptions of the almost independence Lemma 5. This step weakens condition (1.2) to

$$\sum_{j \in J'} m_j^{3/2} r^2 \gtrsim R^\alpha,$$

which still suffices for our purposes. Then regarding the random variables $\delta(f, I_j)$, $j \in J'$, as independent ones and using the tail estimate for these variables, we see that the probability of this event does not exceed

$$\exp\Bigl( -c \sum_{j \in J'} \frac{m_j^2 r^4}{\log m_j} \Bigr) \leq \exp\Bigl( -r^2 \sum_{j \in J'} m_j^{3/2} r^2 \Bigr) \leq \exp\bigl( -c r^2 R^\alpha \bigr) \leq \exp\bigl( -c_1 R^{3\alpha - 2} \bigr).$$

To get the upper bound for the probability of the event $\Omega_\alpha$, we need to take into account the number of possible choices of the subset $J$ and of the numbers $m_j$. This factor does not exceed $2^N (\log R)^N < e^{CR \log\log R}$, which is not big enough to destroy our estimate.

5. Now, let us turn to the upper bound in the case $\frac{1}{2} < \alpha < 1$. We choose the arcs $I_j$ of length 1. To separate them from each other, we choose from this collection $\simeq R^{1 - \varepsilon}$ arcs $\{I_j\}_{j \in J}$ separated by $R^\varepsilon$ and such that


$$\Bigl| \sum_{j \in J} \delta(f, I_j) \Bigr| > R^{\alpha - \varepsilon}.$$

For these arcs, the random variables $\delta(f, I_j)$ behave like independent ones, and since their tails have a fast decay, we can apply to them the classical Bernstein inequality (Lemma 3), which yields

$$\mathbb{P}\Bigl\{ \Bigl| \sum_{j \in J} \delta(f, I_j) \Bigr| > R^{\alpha - \varepsilon} \Bigr\} \leq C \exp\Bigl( -\frac{c (R^{\alpha - \varepsilon})^2}{\operatorname{Card} J} \Bigr) = C \exp\bigl( -c R^{2\alpha - 1 - \varepsilon} \bigr).$$

6. To get the lower bound for the probability of $\Omega_\alpha$ in the case $\frac{1}{2} < \alpha < 1$, we introduce an auxiliary Gaussian Taylor series

$$g(z) = \sum_{k=0}^{\infty} \zeta_k a_k \frac{z^k}{\sqrt{k!}},$$

where $\zeta_k$ are independent standard complex Gaussian random variables, and

$$a_k = \begin{cases} \sqrt{1 - R^{\alpha - 1}}, & R^2 + R < k < R^2 + 2R; \\ \sqrt{1 + R^{\alpha - 1}}, & R^2 - 2R < k < R^2 - R; \\ 1, & \text{otherwise}. \end{cases}$$

It is not difficult to check that for some absolute $c > 0$, the probability that the function $g$ has at most $R^2 - cR^\alpha$ zeroes in the disk $R\mathbb{D}$ is not exponentially small (more precisely, it cannot be less than $cR^{-2 + \alpha}$). Now, let $\gamma$ be the standard Gaussian measure in the space $\mathbb{C}^\infty$; i.e., the product of countably many copies of standard complex Gaussian measures on $\mathbb{C}$, and let $\gamma_a$ be another Gaussian measure on $\mathbb{C}^\infty$ which is the product of complex Gaussian measures $\gamma_{a_k}$ on $\mathbb{C}$ with variances $a_k^2$. Let $E \subset \mathbb{C}^\infty$ be the set of coefficients $\eta_k$ such that the Taylor series $\sum_{k \geq 0} \eta_k \frac{z^k}{\sqrt{k!}}$ converges in $\mathbb{C}$ and has at most $R^2 - cR^\alpha$ zeroes in $R\mathbb{D}$. Then $\gamma_a(E) \geq cR^{-2 + \alpha}$, while the quantity $\mathbb{P}\bigl\{ n(R) \leq R^2 - cR^\alpha \bigr\}$ we are interested in equals $\gamma(E)$. Thus, it remains to compare $\gamma(E)$ with $\gamma_a(E)$, and a more or less straightforward computation finishes the job.

Following [12, Parts I and II], we compare the zero point process $Z_f$ with random independent perturbations of the lattice points. We fix the parameter $\nu > 0$, and consider the random point set $\{\omega + \zeta_\omega\}_{\omega \in \mathbb{Z}^2}$, where $\zeta_\omega$ are independent, identical, radially distributed random variables with the tails $\mathbb{P}\{ |\zeta_\omega| > t \}$ decaying as $\exp(-t^\nu)$ for $t \to \infty$. Set $n(R) = \operatorname{Card}\{ \omega \in \mathbb{Z}^2 : |\omega + \zeta_\omega| \leq R \}$. Then one can see that, for every $\alpha > \frac{1}{2}$ and every $\varepsilon > 0$, the probability that $n(R)$ deviates from its mean by more than $R^\alpha$ lies strictly between $e^{-R^{\varphi(\alpha, \nu) + \varepsilon}}$ and $e^{-R^{\varphi(\alpha, \nu) - \varepsilon}}$


F. Nazarov, M. Sodin, A. Volberg

for all sufficiently large $R > R_0(\alpha, \varepsilon)$ with
\[ \varphi(\alpha, \nu) = \begin{cases} 2\alpha - 1, & \tfrac12 \le \alpha \le 1; \\ (\nu + 1)\alpha - \nu, & 1 \le \alpha \le 2; \\ (\nu/2 + 1)\alpha, & \alpha \ge 2. \end{cases} \]
In the range $\tfrac12 \le \alpha \le 1$, the exponent $\varphi(\alpha) = 2\alpha - 1$ seems to be determined by the asymptotic normality at the endpoint $\alpha = \tfrac12$. In the range $\alpha > 1$, the Jancovici-Lebowitz-Manificat law (1.1) corresponds to the case $\nu = 2$, i.e., to the lattice perturbation with the Gaussian decay of the tails.

Convention about the constants. By $c$ and $C$ we denote positive numerical constants that appear in the proofs. The constants denoted by $c$ are supposed to be small (in particular, they are always less than 1), while the constants denoted by $C$ are supposed to be big (they are always larger than 1). Within the proof of each lemma, we start a new sequence of indices for these constants, and we never refer to these constants after the corresponding proof is completed.

Notation. $A \lesssim B$ and $A \gtrsim B$ mean that there exist positive numerical constants $C$ and $c$ such that $A \le C \cdot B$ and $A \ge c \cdot B$, respectively. If $A \lesssim B$ and $A \gtrsim B$ simultaneously, then we write $A \simeq B$. The notation $A \ll B$ stands for “much less” and means that $A \le c \cdot B$ with a very small positive $c$; similarly, $A \gg B$ stands for “much larger” and means that $A \ge C \cdot B$ with a very large positive $C$.

2. Preliminaries

2.1. A combinatorial lemma. For $j, k \in \{1, \dots, N\}$, we set
\[ |j - k|_* = \min\{|i - k| : i \equiv j \bmod N\} = \min\{|j - k|,\ |j - k + N|,\ |j - k - N|\}. \]

Lemma 1. Let $m_1, \dots, m_N$ be non-negative integers. Then, given $Q \ge 1$, there exists a subset $J \subset \{1, \dots, N\}$ such that
\[ |j - k|_* \ge Q\bigl(\sqrt{m_j} + \sqrt{m_k}\bigr) \quad\text{for } j, k \in J,\ j \ne k, \]
and
\[ \sum_{j=1}^{N} m_j \le 5Q \sum_{j\in J} m_j^{3/2}. \]

Proof of Lemma 1. We build the set $J$ by an inductive construction. Choose $j_1 \in \{1, \dots, N\}$ such that $m_{j_1} = \max\{m_j : j \in \{1, \dots, N\}\}$. Set
\[ J_1 = \{j_1\}, \qquad J_1' = \bigl\{j : 0 < |j - j_1|_* < 2Q\sqrt{m_{j_1}}\bigr\}, \qquad \bar J_1 = J_1 \cup J_1', \]
and note that
\[ \sum_{j\in \bar J_1} m_j \le \bigl(4Q\sqrt{m_{j_1}} + 1\bigr) m_{j_1} \le 5Q \sum_{j\in J_1} m_j^{3/2}. \]
Now, suppose that we have made $k$ steps of this construction. If $\bar J_k = \{1, \dots, N\}$, then we are done with $J = J_k$. If $\{1, \dots, N\} \setminus \bar J_k \ne \emptyset$, we choose $j_{k+1} \in \{1, \dots, N\} \setminus \bar J_k$ such that $m_{j_{k+1}} = \max\{m_j : j \in \{1, \dots, N\} \setminus \bar J_k\}$, and define the sets
\[ J_{k+1} = J_k \cup \{j_{k+1}\}, \qquad J_{k+1}' = J_k' \cup \bigl\{j \in \{1, \dots, N\} \setminus \bar J_k : 0 < |j - j_{k+1}|_* < 2Q\sqrt{m_{j_{k+1}}}\bigr\}, \]
and $\bar J_{k+1} = J_{k+1} \cup J_{k+1}'$. Then, as above,
\[ \sum_{j\in \bar J_{k+1}\setminus \bar J_k} m_j \le 5Q\, m_{j_{k+1}}^{3/2}, \]
whence
\[ \sum_{j\in \bar J_{k+1}} m_j \le 5Q \sum_{j\in J_{k+1}} m_j^{3/2}. \]
We are done.
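The inductive construction in the proof of Lemma 1 is a greedy selection, and it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: indices are 0-based, and the names `cyclic_distance`, `select_separated` and `check_lemma1` are hypothetical helpers introduced here, not notation from the paper.

```python
import math

def cyclic_distance(j, k, n):
    """Distance |j - k|_* between two indices on a cycle of length n."""
    d = abs(j - k) % n
    return min(d, n - d)

def select_separated(m, q):
    """Greedy construction from the proof of Lemma 1 (0-based indices).

    Repeatedly pick an uncovered index with the largest m_j, then mark as
    covered every index at cyclic distance < 2*q*sqrt(m_j) from it.
    """
    n = len(m)
    covered = [False] * n
    chosen = []
    # Scanning indices by decreasing m_j and skipping covered ones is the
    # same as always taking the maximum over the still uncovered indices.
    for j in sorted(range(n), key=lambda i: -m[i]):
        if covered[j]:
            continue
        chosen.append(j)
        radius = 2 * q * math.sqrt(m[j])
        for i in range(n):
            if i == j or cyclic_distance(i, j, n) < radius:
                covered[i] = True
    return chosen

def check_lemma1(m, q, chosen):
    """Check both conclusions of Lemma 1 for the selected index set."""
    n = len(m)
    separated = all(
        cyclic_distance(j, k, n) >= q * (math.sqrt(m[j]) + math.sqrt(m[k]))
        for a, j in enumerate(chosen)
        for k in chosen[a + 1:]
    )
    mass_ok = sum(m) <= 5 * q * sum(m[j] ** 1.5 for j in chosen)
    return separated and mass_ok
```

For instance, `select_separated([0, 4, 1, 9, 0, 2, 16, 1, 0, 3, 5, 0], 1.0)` keeps only the index carrying the value 16, because its exclusion radius already covers the whole cycle.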

2.2. Probabilistic preliminaries.

Lemma 2 [9, Lemma 2.1]. Let $\eta_k$ be standard complex Gaussian random variables (not necessarily independent). Let $a_k > 0$, $S = \sum_k a_k$. Then, for every $t > 0$,
\[ P\Bigl\{\sum_k a_k |\eta_k| > t\Bigr\} \le 2e^{-\frac12 (t/S)^2}. \]

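We also need the following classical Bernstein estimate:

Lemma 3. Let $\psi_k$, $k = 1, 2, \dots, n$, be independent random variables with zero mean such that, for some $K > 0$ and every $t > 0$,
\[ P\{|\psi_k| > t\} \le K e^{-t}. \]
Then, for $0 < t \le 5Kn$,
\[ P\Bigl\{\Bigl|\sum_k \psi_k\Bigr| > t\Bigr\} \le 2\exp\Bigl(-\frac{t^2}{16Kn}\Bigr). \]

Proof. Set $S_n = \sum_{k=1}^{n} \psi_k$. Then $E e^{\lambda S_n} = \prod_{k=1}^{n} E e^{\lambda\psi_k}$. Note that
\[ \begin{aligned} E e^{\lambda\psi_k} &= E\bigl[1 + \lambda\psi_k + \bigl(e^{\lambda\psi_k} - 1 - \lambda\psi_k\bigr)\bigr] \\ &= 1 + \lambda\Bigl[\int_0^\infty P\{\psi_k > t\}\bigl(e^{\lambda t} - 1\bigr)\,dt + \int_0^\infty P\{\psi_k < -t\}\bigl(1 - e^{-\lambda t}\bigr)\,dt\Bigr] \\ &\le 1 + K\lambda\Bigl[\int_0^\infty e^{-t}\bigl(e^{\lambda t} - 1\bigr)\,dt + \int_0^\infty e^{-t}\bigl(1 - e^{-\lambda t}\bigr)\,dt\Bigr] \\ &= 1 + \frac{2K\lambda^2}{1 - \lambda^2} \le 1 + 4K\lambda^2 \le e^{4K\lambda^2}, \end{aligned} \]
provided that $\lambda \le \tfrac23$. Hence, we get
\[ P\{S_n > t\} \le e^{-\lambda t} E e^{\lambda S_n} \le e^{4Kn\lambda^2 - \lambda t}. \]
Similarly, $P\{S_n < -t\} \le e^{4Kn\lambda^2 - \lambda t}$, and therefore
\[ P\{|S_n| > t\} \le 2e^{4Kn\lambda^2 - \lambda t}. \]
Taking $\lambda = \frac{t}{8Kn}$, we get the lemma.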
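The last step of the proof is a one-line optimization, and it can be checked numerically. The sketch below is an illustration only; the function names are hypothetical. It evaluates the Chernoff exponent $4Kn\lambda^2 - \lambda t$ at $\lambda = t/(8Kn)$ and the closed form of the two elementary integrals used in the proof.

```python
import math

def chernoff_exponent(lam, t, K, n):
    """Exponent 4*K*n*lam^2 - lam*t from the proof of Lemma 3."""
    return 4 * K * n * lam ** 2 - lam * t

def bernstein_bound(t, K, n):
    """Right-hand side 2*exp(-t^2/(16*K*n)) of Lemma 3, via lam = t/(8*K*n)."""
    if not 0 < t <= 5 * K * n:
        raise ValueError("Lemma 3 is stated only for 0 < t <= 5*K*n")
    lam = t / (8 * K * n)  # this choice satisfies lam <= 5/8 <= 2/3
    return 2 * math.exp(chernoff_exponent(lam, t, K, n))

def mgf_integrals(lam):
    """Sum of the two integrals in the proof, valid for 0 < lam < 1:
    int_0^inf e^{-t}(e^{lam t} - 1) dt + int_0^inf e^{-t}(1 - e^{-lam t}) dt.
    """
    return lam / (1 - lam) + lam / (1 + lam)
```

Multiplying `mgf_integrals(lam)` by $K\lambda$ reproduces the term $2K\lambda^2/(1-\lambda^2)$ in the proof, which stays below $4K\lambda^2$ as long as $\lambda \le 2/3$.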

2.3. Mean number of zeroes of a Gaussian Taylor series. Consider a Gaussian Taylor series
\[ g(z) = \sum_{k=0}^{\infty} \zeta_k a_k z^k \]
with non-negative $a_k$ such that $\lim_{k\to\infty} \sqrt[k]{a_k} = 0$ and with independent standard complex Gaussian random variables $\zeta_k$. Then almost surely, the series on the right-hand side has infinite radius of convergence, and hence $g$ is an entire function. By $n_g(r)$ we denote the number of zeroes of the function $g$ in the disk of radius $r$.

Lemma 4.
\[ E\, n_g(r) = \frac12\, \frac{r\, C_g'(r)}{C_g(r)}, \qquad\text{where}\qquad C_g(r) = \sum_{k=0}^{\infty} a_k^2 r^{2k}. \]

Lemma 4 readily follows from the Edelman-Kostlan formula for the density of the mean counting measure of zeroes of an arbitrary Gaussian analytic function, see [11, Sect. 2]. Alternatively, one can obtain this formula using the argument principle, see [6, p. 195, Exercise 5].

2.4. Operators $T_w$ and shift invariance. For a function $g : \mathbb{C} \to \mathbb{C}$ and a complex number $w \in \mathbb{C}$, we define
\[ T_w g(z) = g(w + z)\, e^{-z\bar w}\, e^{-\frac12 |w|^2}. \]
In what follows, we use some simple properties of these operators.

(a) $T_w$ are unitary operators in the Fock-Bargmann space of entire functions that are square integrable with respect to the measure $\frac{1}{\pi} e^{-|z|^2}\, dm(z)$:
\[ \|T_w f\|^2 = \frac{1}{\pi}\int_{\mathbb{C}} |f(w + z)|^2 e^{-2\operatorname{Re}(z\bar w) - |w|^2 - |z|^2}\, dm(z) = \frac{1}{\pi}\int_{\mathbb{C}} |f(w + z)|^2 e^{-|w + z|^2}\, dm(z) = \|f\|^2. \]


(b) If $f$ is a G.E.F., then $T_w f$ is a G.E.F. as well. In particular, the distribution of the random zero set $Z_f = f^{-1}\{0\}$ is translation invariant. Property (b) also yields the distribution invariance of the function $f^*(z) = |f(z)| e^{-|z|^2/2}$ with respect to the isometries of $\mathbb{C}$. Indeed, a straightforward inspection shows that $(T_w f)^*(z) = f^*(w + z)$.

(c) By (b), if $f$ is a G.E.F., then
\[ T_w f = \sum_{k\ge 0} \zeta_k(w) \frac{z^k}{\sqrt{k!}}, \]
where $\zeta_k(w)$ are independent standard complex Gaussian random variables. Recalling that $\bigl\{\frac{z^k}{\sqrt{k!}}\bigr\}_{k\ge 0}$ is an orthonormal basis in the Fock-Bargmann space, and using that $T_w$ is a unitary operator and $T_w T_{-w}$ is the identity operator in that space, we get
\[ \zeta_k(w) = \Bigl\langle T_w f, \frac{z^k}{\sqrt{k!}} \Bigr\rangle = \Bigl\langle f, T_{-w}\frac{z^k}{\sqrt{k!}} \Bigr\rangle. \]
Note that for $w \ne w'$, the Gaussian variables $\zeta_k(w)$ and $\zeta_k(w')$ are correlated and
\[ E\bigl\{\zeta_k(w)\, \overline{\zeta_k(w')}\bigr\} = \Bigl\langle T_{-w'}\frac{z^k}{\sqrt{k!}},\ T_{-w}\frac{z^k}{\sqrt{k!}} \Bigr\rangle. \]

(d) Let $\gamma \subset \mathbb{C}$ be an oriented curve. Note that if $f$ does not vanish on the curve $\gamma$, then
\[ \Delta_{\gamma - w} \arg T_w f = \Delta_\gamma \arg f - \Delta_\gamma \operatorname{Im}\bigl((z - w)\bar w\bigr) = \Delta_\gamma \arg f - \Delta_\gamma \operatorname{Im}(z\bar w), \]
where $\Delta_\gamma \operatorname{Im}(z\bar w)$ is the increment of the function $\operatorname{Im}(z\bar w)$ over $\gamma$, and $\gamma - w$ denotes the translation of the curve $\gamma$ by $-w$.

(e) Set $\delta(f, \gamma) = \Delta_\gamma \arg f - E\Delta_\gamma \arg f$. Then $\delta(T_w f, \gamma - w) = \delta(f, \gamma)$. If $I \subset R\mathbb{T}$ is a counterclockwise oriented arc with the midpoint at $w$, then using rotation invariance and the argument principle, we get
\[ E\Delta_I \arg f = \frac{|I|}{2\pi R}\, E\Delta_{R\mathbb{T}} \arg f = \frac{|I|}{2\pi R}\, E\, 2\pi n(R) = \frac{|I|}{R}\, R^2 = |I| R \]
and
\[ \delta(T_w f, I - w) = \Delta_I \arg f - |I| R. \]


2.5. Almost independence. Our approach is based on the almost independence property introduced in [9]. It says that if $\{w_j\} \subset \mathbb{C}$ is a “well-separated” set, then the G.E.F. $T_{w_j} f$ can be simultaneously approximated by independent G.E.F. The following lemma somewhat extends Theorem 3.2 from [9].

Lemma 5. There exists a numerical constant $A > 1$ such that for every family of pairwise disjoint disks $D(w_j, r_j + A\rho_j)$ with
\[ w_j \in \mathbb{C}, \qquad r_j \ge 1, \qquad \rho_j \ge \max\bigl(1, \sqrt{\log r_j}\bigr), \]
one can represent the family of G.E.F. $T_{w_j} f$ as
\[ T_{w_j} f = f_j + h_j, \]
where $f_j$ are independent G.E.F. and
\[ P\Bigl\{\max_{z\in r_j\mathbb{D}} |h_j(z)|\, e^{-|z|^2/2} \ge e^{-\rho_j^2}\Bigr\} \le 2\exp\bigl(-\tfrac12 e^{\rho_j^2}\bigr). \]
Theorem 3.2 in [9] corresponds to the case when $r_j = r \ge 1$ and $\rho_j = Nr$ with $N \gg 1$. We prove Lemma 5 in the Appendix.

2.6. Bounds for G.E.F. Our first lemma estimates the probability that the function $f$ is very large:

Lemma 6 (cf. [9, Lemma 4.1]). Let $f$ be a G.E.F. Then, for each $r \ge 1$ and $M \ge 1$,
\[ P\Bigl\{\max_{z\in r\mathbb{D}} |f(z)|\, e^{-|z|^2/2} \ge M\Bigr\} \le 18 r^2 e^{-\frac{1}{32} M^2}. \]

Proof. We cover the disk $r\mathbb{D}$ by at most $(2r + 1)^2 \le 9r^2$ disks $D_j$ of radius 1 and show that for each $j$,
\[ P\Bigl\{\max_{z\in D_j} |f(z)|\, e^{-|z|^2/2} \ge M\Bigr\} \le 2e^{-\frac{1}{32} M^2}. \]
By the translation invariance of the distribution of the random function $|f(z)| e^{-|z|^2/2}$, it suffices to prove this estimate in the unit disk $\mathbb{D}$. Clearly,
\[ P\Bigl\{\max_{z\in\mathbb{D}} |f(z)|\, e^{-|z|^2/2} \ge M\Bigr\} \le P\Bigl\{\max_{z\in\mathbb{D}} |f(z)| \ge M\Bigr\} \le P\Bigl\{\sum_{k\ge 0} \frac{|\zeta_k|}{\sqrt{k!}} \ge M\Bigr\} \overset{\text{Lemma 2}}{\le} 2e^{-\frac12 (M/S)^2} \]
with $S = \sum_{k\ge 0} \frac{1}{\sqrt{k!}} < 4$. Hence, the lemma.
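The only numerical input in this proof is the bound $S < 4$, which turns $2e^{-(M/S)^2/2}$ into $2e^{-M^2/32}$. A minimal Python check is sketched below; the function names are hypothetical, and the series is truncated at 60 terms, which is an assumption that costs nothing at double precision because the terms decay super-exponentially.

```python
import math

def fock_coefficient_sum(terms=60):
    """Truncated value of S = sum_{k>=0} 1/sqrt(k!) from the proof of Lemma 6."""
    return sum(1.0 / math.sqrt(math.factorial(k)) for k in range(terms))

def large_value_bound(r, M):
    """Right-hand side 18*r^2*exp(-M^2/32) of Lemma 6."""
    return 18.0 * r ** 2 * math.exp(-M ** 2 / 32.0)

S = fock_coefficient_sum()
```

Since $S$ is roughly 3.47, the constant $\tfrac{1}{32}$ in the exponent leaves some room; the lemma simply rounds $S$ up to 4.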

The following lemma estimates the probability that the function f is very small:


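Lemma 7 (cf. Lemma 8 in [7] and Lemma 4.2 in [9]). Let $f$ be a G.E.F. Let $r \ge 1$ and $m \ge 3$. Then
\[ P\Bigl\{\max_{r\mathbb{D}} |f| \le e^{-mr^2}\Bigr\} \le \exp\Bigl(-\frac{m^2}{\log m}\, r^4\Bigr). \]

Proof. Suppose that $|f| \le e^{-mr^2}$ everywhere in $r\mathbb{D}$. Then by Cauchy's inequalities,
\[ |\zeta_n| \le \frac{\sqrt{n!}}{r^n} \max_{r\mathbb{D}} |f| \le \frac{n^{n/2}}{r^n}\, e^{-mr^2}, \qquad n = 0, 1, 2, \dots. \]
For $0 \le n \le \frac{m}{\log m} r^2$, the probabilities of these events do not exceed
\[ \bigl(n r^{-2}\bigr)^n e^{-2mr^2} \le \Bigl(\frac{m}{\log m}\Bigr)^{\frac{m}{\log m} r^2} e^{-2mr^2} < e^{-mr^2}. \]
Since these events are independent, the probability we are estimating is bounded by
\[ \exp\Bigl(-mr^2 \cdot \frac{m}{\log m}\, r^2\Bigr) = \exp\Bigl(-\frac{m^2}{\log m}\, r^4\Bigr). \]
We are done.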
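The middle inequality in the proof, $(m/\log m)^{(m/\log m) r^2} e^{-2mr^2} < e^{-mr^2}$, is easiest to see after taking logarithms. The sketch below checks it on a few sample values; the helper names are hypothetical and the sample grid is an arbitrary choice made for illustration.

```python
import math

def log_single_coefficient_bound(m, r):
    """Logarithm of (m/log m)^((m/log m) r^2) * exp(-2 m r^2)."""
    ratio = m / math.log(m)
    return ratio * r ** 2 * math.log(ratio) - 2 * m * r ** 2

def log_small_ball_bound(m, r):
    """Logarithm of the bound exp(-(m^2/log m) r^4) in Lemma 7."""
    return -(m ** 2 / math.log(m)) * r ** 4

# For m >= 3 we have log m > 1, hence log(m/log m) < log m and the first
# term above is smaller than m r^2; this is what the loop confirms.
samples_ok = all(
    log_single_coefficient_bound(m, r) < -m * r ** 2
    for m in (3, 10, 100, 1000)
    for r in (1.0, 2.5, 10.0)
)
```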

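The next lemma bounds the probability that a G.E.F. is small on a given curve of a given length.

Lemma 8. Let $f$ be a G.E.F., and let $\gamma$ be a curve of length at most $r \ge 1$. Then, for any positive $\varepsilon \le \tfrac14$,
\[ P\Bigl\{\min_{z\in\gamma} |f(z)|\, e^{-|z|^2/2} < \varepsilon\Bigr\} < 100\, r\, \varepsilon \sqrt{\log\frac{1}{\varepsilon}}. \]

Proof. We split the curve $\gamma$ into $\lceil r \rceil$ arcs $\gamma_j$ of length at most 1, and fix a collection of disks $D_j$ of radius 1 such that $\gamma_j \subset D_j$. We'll show that for each $j$,
\[ P\Bigl\{\min_{z\in\gamma_j} |f(z)|\, e^{-|z|^2/2} < \varepsilon\Bigr\} < 50\, \varepsilon \sqrt{\log\frac{1}{\varepsilon}}. \]
Clearly, this will yield the lemma.

By the shift invariance of the distribution of the random function $|f(z)| e^{-|z|^2/2}$, we assume without loss of generality that $D_j$ is the unit disk $\mathbb{D}$. Taking into account that $e^{-|z|^2/2} > \tfrac12$ everywhere in the unit disk, we have
\[ P\Bigl\{\min_{z\in\gamma_j} |f(z)|\, e^{-|z|^2/2} < \varepsilon\Bigr\} \le P\Bigl\{\min_{z\in\gamma_j} |f(z)| < 2\varepsilon\Bigr\}. \]
We choose points $\{z_m\} \subset \gamma_j$ and disks $D_m = \{|z - z_m| \le \kappa\varepsilon\}$ such that
\[ \gamma_j \subset \bigcup_m D_m \qquad\text{and}\qquad \operatorname{Card}\{z_m\} \le \Bigl\lceil \frac{1}{2\kappa\varepsilon} \Bigr\rceil, \]
with the parameter $\kappa$ to be specified later. Then, for $z \in D_m$,
\[ |f(z)| \ge |f(z_m)| - |z - z_m| \max_{\mathbb{D}} |f'| \ge |f(z_m)| - \kappa\varepsilon \max_{\mathbb{D}} |f'|. \]
Hence, we need to estimate the probability of the events
\[ \Omega_1 = \Bigl\{\min_m |f(z_m)| \le 3\varepsilon\Bigr\} \qquad\text{and}\qquad \Omega_2 = \Bigl\{\max_{\mathbb{D}} |f'| \ge \frac{1}{\kappa}\Bigr\}. \]
If neither of these events holds, then $|f(z)| > 3\varepsilon - \varepsilon = 2\varepsilon$ everywhere on $\gamma_j$.

Recall that for any standard complex Gaussian random variable $\zeta$ and for any $t > 0$, we have $P\{|\zeta| \le t\} < t^2$; also recall that $f(z_m) e^{-|z_m|^2/2}$ is a standard complex Gaussian random variable. Hence, for any fixed $m$, we have
\[ P\{|f(z_m)| \le 3\varepsilon\} \le P\bigl\{|f(z_m)|\, e^{-|z_m|^2/2} \le 3\varepsilon\bigr\} < 9\varepsilon^2. \]
Therefore,
\[ P\{\Omega_1\} < \Bigl\lceil \frac{1}{2\kappa\varepsilon} \Bigr\rceil \cdot 9\varepsilon^2 \le \frac92\, \varepsilon\kappa^{-1} + 9\varepsilon^2. \]
Next,
\[ P\{\Omega_2\} \le P\Bigl\{\sum_{k\ge 1} \frac{k}{\sqrt{k!}}\, |\zeta_k| \ge \frac{1}{\kappa}\Bigr\} \overset{\text{Lemma 2}}{\le} 2e^{-\frac12 (\kappa S)^{-2}} \]
with $S = \sum_{k\ge 1} \frac{k}{\sqrt{k!}} < 6$. Therefore, $P\{\Omega_2\} \le 2e^{-\frac{1}{72}\kappa^{-2}}$, and
\[ P\{\Omega_1\} + P\{\Omega_2\} < \frac92\, \varepsilon\kappa^{-1} + 2e^{-\frac{1}{72}\kappa^{-2}} + 9\varepsilon^2. \]
Choosing here $\kappa^{-1} = \sqrt{72 \log\frac{1}{\varepsilon}}$, we get
\[ P\Bigl\{\min_{z\in\gamma_j} |f(z)| < 2\varepsilon\Bigr\} \le P\{\Omega_1\} + P\{\Omega_2\} < 27\sqrt{2}\, \varepsilon \sqrt{\log\frac{1}{\varepsilon}} + 2\varepsilon + 9\varepsilon^2 < 50\, \varepsilon \sqrt{\log\frac{1}{\varepsilon}}, \]
proving the lemma.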
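Two numerical facts are used without comment in the proof of Lemma 8: the bound $\sum_{k\ge1} k/\sqrt{k!} < 6$ and the final inequality with the constant 50 for $\varepsilon \le \tfrac14$. The following sketch checks both; the function names and the grid of $\varepsilon$ values are assumptions made here for illustration.

```python
import math

def derivative_coefficient_sum(terms=60):
    """Truncated value of S = sum_{k>=1} k/sqrt(k!) from the proof of Lemma 8."""
    return sum(k / math.sqrt(math.factorial(k)) for k in range(1, terms))

def lemma8_left(eps):
    """27*sqrt(2)*eps*sqrt(log(1/eps)) + 2*eps + 9*eps^2."""
    root = math.sqrt(math.log(1.0 / eps))
    return 27 * math.sqrt(2) * eps * root + 2 * eps + 9 * eps ** 2

def lemma8_right(eps):
    """50*eps*sqrt(log(1/eps))."""
    return 50 * eps * math.sqrt(math.log(1.0 / eps))

# Grid of eps values in (0, 1/4]; the inequality is tightest at eps = 1/4.
grid = [0.25 / (1.5 ** i) for i in range(60)]
constants_ok = all(lemma8_left(e) < lemma8_right(e) for e in grid)
```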

2.7. Upper bounds for the increment of the argument. We say that a piecewise $C^1$ curve $\gamma \subset r\mathbb{D}$ is good if its length does not exceed $r$ and, for any $\zeta \in \mathbb{C} \setminus \gamma$, we have $|\Delta_\gamma \arg(z - \zeta)| \le 2\pi$. The following lemma is classical (cf. [8, Lemma 6, Chapter VI]):

Lemma 9. There exists a numerical constant $B > 1$ with the following property. Let $g$ be an analytic function in the disk $2r\mathbb{D}$ such that $\sup_{2r\mathbb{D}} |g| \le 1$. If $\max_{r\bar{\mathbb{D}}} |g| \ge e^{-\beta}$, then for any good curve $\gamma \subset r\mathbb{D}$, we have
\[ |\Delta_\gamma \arg g| \le B\beta. \]


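Proof. By scale invariance, it suffices to prove the lemma for $r = 1$. Choose $z_0 \in r\mathbb{T}$ such that $|g(z_0)| = \max_{r\bar{\mathbb{D}}} |g| \ge e^{-\beta}$, and denote by $\varphi$ a Möbius transformation $\varphi : 2\mathbb{D} \to 2\mathbb{D}$ with $\varphi(0) = z_0$. Denote by $n_g(t)$ and $n_{g\circ\varphi}(t)$ the number of zeroes of the functions $g$ and $g \circ \varphi$ in the disk $t\mathbb{D}$, and choose $\rho < 2$ such that $\varphi^{-1}\bigl(\tfrac32\mathbb{D}\bigr) \subset \rho\mathbb{D}$. Then by Jensen's formula
\[ \begin{aligned} 0 \ge \int_0^{2\pi} \log|(g\circ\varphi)(2e^{i\theta})|\, \frac{d\theta}{2\pi} &= \log|g\circ\varphi(0)| + \int_0^2 \frac{n_{g\circ\varphi}(t)}{t}\, dt \\ &\ge -\beta + \int_\rho^2 \frac{n_{g\circ\varphi}(t)}{t}\, dt \ge -\beta + n_{g\circ\varphi}(\rho) \log\frac{2}{\rho} \\ &\ge -\beta + n_g\bigl(\tfrac32\bigr) \log\frac{2}{\rho}. \end{aligned} \]
Thus the number of zeroes of $g$ in the disk $\tfrac32\mathbb{D}$ does not exceed $C_1\beta$. Hence, $g = p g_1$, where $p$ is a polynomial of degree $N \le C_1\beta$ with zeroes in $\tfrac32\mathbb{D}$ and a unimodular leading coefficient, and $g_1$ does not vanish in $\tfrac32\mathbb{D}$, $g_1(0) > 0$.

Claim 9-1.
\[ \int_0^{2\pi} \bigl|\log|g_1(\tfrac32 e^{i\theta})|\bigr|\, \frac{d\theta}{2\pi} \le C_2\beta. \]

Proof of Claim 9-1. Indeed,
\[ \begin{aligned} \int_0^{2\pi} \bigl|\log|g_1(\tfrac32 e^{i\theta})|\bigr|\, \frac{d\theta}{2\pi} &\le \int_0^{2\pi} \bigl|\log|g(\tfrac32 e^{i\theta})|\bigr|\, \frac{d\theta}{2\pi} + \int_0^{2\pi} \bigl|\log|p(\tfrac32 e^{i\theta})|\bigr|\, \frac{d\theta}{2\pi} \\ &= \int_0^{2\pi} \log^-|g(\tfrac32 e^{i\theta})|\, \frac{d\theta}{2\pi} + \int_0^{2\pi} \bigl|\log|p(\tfrac32 e^{i\theta})|\bigr|\, \frac{d\theta}{2\pi}. \end{aligned} \]
To estimate the first integral on the right-hand side, we note that
\[ \int_0^{2\pi} \log|g(\tfrac32 e^{i\theta})|\, \frac{(\tfrac32)^2 - |z_0|^2}{|\tfrac32 e^{i\theta} - z_0|^2}\, \frac{d\theta}{2\pi} \ge \log|g(z_0)| \ge -\beta, \]
whence
\[ \int_0^{2\pi} \log^-|g(\tfrac32 e^{i\theta})|\, \frac{d\theta}{2\pi} \le C_3\beta. \]
The estimate of the second integral on the right-hand side is also straightforward: since $p(z) = \prod_{j=1}^{N} (z - \lambda_j)$ up to a unimodular factor, with $\lambda_j \in \tfrac32\mathbb{D}$, we have
\[ \int_0^{2\pi} \bigl|\log|p(\tfrac32 e^{i\theta})|\bigr|\, \frac{d\theta}{2\pi} \le N \cdot \sup_{\lambda\in\frac32\mathbb{D}} \int_0^{2\pi} \bigl|\log|\tfrac32 e^{i\theta} - \lambda|\bigr|\, \frac{d\theta}{2\pi} \overset{N \le C_1\beta}{\le} C_4\beta. \]
Hence, the claim.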


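Now, $|\Delta_\gamma \arg g| \le |\Delta_\gamma \arg p| + |\Delta_\gamma \arg g_1|$, and since the curve $\gamma$ is good, we have $|\Delta_\gamma \arg p| \le 2\pi N \le C_5\beta$. Fix a branch $h$ of $\arg g_1$. Then
\[ |\Delta_\gamma h| \le 2\max_{\bar{\mathbb{D}}} |h - h(0)|. \]
Since $h$ is harmonic in $\tfrac32\mathbb{D}$, we have
\[ h(z) = \int_0^{2\pi} \log|g_1(\tfrac32 e^{i\theta})|\ \operatorname{Im}\Bigl[\frac{\tfrac32 e^{i\theta} + z}{\tfrac32 e^{i\theta} - z}\Bigr]\, \frac{d\theta}{2\pi} + h(0), \qquad |z| \le 1, \]
and
\[ |h(z) - h(0)| \le C_6 \int_0^{2\pi} \bigl|\log|g_1(\tfrac32 e^{i\theta})|\bigr|\, d\theta \overset{\text{Claim 9-1}}{\le} C_7\beta, \qquad |z| \le 1. \]
This proves the lemma.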

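Lemma 10. Let $r \ge 1$, let $\gamma \subset r\mathbb{D}$ be a good curve, let $m \ge 25B$, and let $f$ be a G.E.F. Consider the event $\Omega = \{|\delta(f, \gamma)| \ge mr^2\}$. Then
\[ \Omega \subset \Omega' \cup \Bigl\{\max_{r\mathbb{D}} |f| < e^{-\frac{1}{4B} mr^2}\Bigr\} \qquad\text{with}\qquad P\{\Omega'\} \le \exp\bigl(-e^{\frac{1}{6B} mr^2}\bigr). \]
In particular,
\[ P\{\Omega\} \le 2\exp\Bigl(-\frac{1}{16B^2}\, \frac{m^2 r^4}{\log m}\Bigr). \]

Proof. Introduce the events
\[ \Omega_1(m) = \bigl\{|\Delta_\gamma \arg f| \ge mr^2\bigr\} \qquad\text{and}\qquad \Omega'(m) = \Bigl\{\max_{z\in 2r\mathbb{D}} |f(z)|\, e^{-|z|^2/2} > e^{\frac{1}{3B} mr^2}\Bigr\}. \]

Claim 10-1. For $m \ge 12B$, $\Omega_1(m) \subset \Omega'(m) \cup \bigl\{\max_{r\mathbb{D}} |f| < e^{-\frac{1}{2B} mr^2}\bigr\}$.

Proof of Claim 10-1. Suppose that the event $\Omega'(m)$ does not occur. Then
\[ \max_{2r\mathbb{D}} |f| \le e^{\frac{1}{3B} mr^2 + 2r^2} = e^{\left(\frac{1}{3B} + \frac{2}{m}\right) mr^2} \overset{m \ge 12B}{\le} e^{\frac{1}{2B} mr^2}. \]
If the event $\Omega_1(m)$ occurs, then by Lemma 9,
\[ mr^2 \le |\Delta_\gamma \arg f| \le B \log\frac{\max_{2r\mathbb{D}} |f|}{\max_{r\mathbb{D}} |f|}, \]
whence
\[ \max_{r\mathbb{D}} |f| \le e^{-\frac{1}{B} mr^2} \max_{2r\mathbb{D}} |f| \le e^{-\frac{1}{2B} mr^2}, \]
proving the claim.

Claim 10-2. For $m \ge 12B$, $P\{\Omega'(m)\} \le \exp\bigl(-e^{\frac{1}{3B} mr^2}\bigr)$.

Proof of Claim 10-2. We have
\[ P\{\Omega'(m)\} \overset{\text{Lemma 6}}{\le} 72 r^2 \exp\bigl(-\tfrac{1}{32} e^{\frac{2}{3B} mr^2}\bigr) \le \tfrac{6}{B}\, mr^2 \exp\bigl(-\tfrac{1}{32} e^{\frac{2}{3B} mr^2}\bigr). \]
It is easy to see that for $t = \frac{1}{3B} mr^2 \ge 4$, one has
\[ 18t \exp\bigl(-\tfrac{1}{32} e^{2t}\bigr) \le 18t \exp\bigl(-\tfrac{e^4}{32} e^{t}\bigr) < 18t \exp\bigl(-\tfrac32 e^{t}\bigr) = \underbrace{18t \exp\bigl(-\tfrac12 e^{t}\bigr)}_{<1} \exp\bigl(-e^{t}\bigr) < \exp\bigl(-e^{t}\bigr). \]
Hence, the claim.

Claim 10-3. For $m \ge 12B$, $P\{\Omega_1(m)\} \le 2\exp\Bigl(-\frac{1}{4B^2}\, \frac{(mr^2)^2}{\log(m/2B)}\Bigr)$.

Proof of Claim 10-3. By Lemma 7,
\[ P\Bigl\{\max_{r\mathbb{D}} |f| < e^{-\frac{1}{2B} mr^2}\Bigr\} \le \exp\Bigl(-\frac{1}{4B^2}\, \frac{(mr^2)^2}{\log(m/2B)}\Bigr). \]
It is easy to check that for $t = \frac{1}{B} mr^2 \ge 12$, one has $e^{t/3} > \frac{t^2}{4}$. Therefore,
\[ e^{\frac{1}{3B} mr^2} > \frac{1}{4B^2} (mr^2)^2 > \frac{1}{4B^2}\, \frac{(mr^2)^2}{\log(m/2B)}. \]
Thus $P\{\Omega'(m)\}$ also does not exceed $\exp\Bigl(-\frac{1}{4B^2}\, \frac{(mr^2)^2}{\log(m/2B)}\Bigr)$. We are done.

Claim 10-4. $E|\Delta_\gamma \arg f| \le 12.5 B r^2$.

Proof of Claim 10-4. By Claim 10-3, for $s \ge 12Br^2$, we have
\[ P\bigl\{|\Delta_\gamma \arg f| \ge s\bigr\} \le 2\exp\Bigl(-\frac{(s/2B)^2}{\log\bigl(s/(2Br^2)\bigr)}\Bigr) \le 2\exp\Bigl(-\frac{(s/2B)^2}{\log(s/2B)}\Bigr). \]
Therefore,
\[ E|\Delta_\gamma \arg f| \le 12Br^2 + 2\int_{12Br^2}^{\infty} \exp\Bigl(-\frac{(s/2B)^2}{\log(s/2B)}\Bigr)\, ds = 12Br^2 + 4B \underbrace{\int_{6r^2}^{\infty} e^{-s^2/\log s}\, ds}_{<1/8} < 12.5 B r^2, \]
proving the claim.

Now, we readily finish the proof of Lemma 10. Suppose that the event $\Omega$ occurs, i.e., $|\delta(f, \gamma)| \ge mr^2$ with $m \ge 25B$. Then
\[ |\Delta_\gamma \arg f| \ge |\delta(f, \gamma)| - E|\Delta_\gamma \arg f| \ge (m - 12.5B) r^2 \ge \tfrac12 mr^2. \]
That is, $\Omega \subset \Omega_1(\tfrac12 m)$, and the lemma follows from Claims 10-1, 10-2 and 10-3 applied with $\tfrac12 m$ instead of $m$.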
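The proof of Lemma 10 relies on several elementary numerical inequalities that are stated as "easy to see". They can be spot-checked as sketched below. This is an illustration, not part of the argument: the helper names, the sample grids and the truncation of the tail integral at $s = 30$ are assumptions made here (the integrand beyond that point is far below double precision).

```python
import math

def claim_10_2_factor(t):
    """The factor 18*t*exp(-e^t/2), claimed to be < 1 for t >= 4."""
    return 18 * t * math.exp(-0.5 * math.exp(t))

def tail_integral(lower=6.0, upper=30.0, steps=20000):
    """Trapezoid approximation of the integral of exp(-s^2/log s) over
    [lower, upper]; used for the '< 1/8' remark in Claim 10-4 with r = 1."""
    f = lambda s: math.exp(-s * s / math.log(s))
    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * h)
    return total * h

ts_small = [4 + 0.25 * i for i in range(40)]   # samples of t >= 4
ts_large = [12 + 0.5 * i for i in range(80)]   # samples of t >= 12

# Claim 10-2: exp(2t)/32 >= (3/2) exp(t) and 18 t exp(-e^t/2) < 1 for t >= 4.
claim_10_2_ok = all(
    math.exp(2 * t) / 32 >= 1.5 * math.exp(t) and claim_10_2_factor(t) < 1
    for t in ts_small
)
# Claim 10-3: exp(t/3) > t^2/4 for t >= 12.
claim_10_3_ok = all(math.exp(t / 3) > t * t / 4 for t in ts_large)
# Claim 10-1: 1/(3B) + 2/m <= 1/(2B) at the threshold m = 12B (here B = 1).
claim_10_1_ok = 1 / 3 + 2 / 12 <= 1 / 2 + 1e-15
```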


Remark. One can get a better estimate Eγ arg f r 2 than the one given in Claim 10 -4 in the following way. If γ : [0, 1] → C is a good curve, then γ arg f = Im

f (γ (t)) γ (t) dt. f

1

0

Taking into account that E ff (z) = z, we get Eγ arg f = Im

1

γ (t)γ (t) dt,

0

whence Eγ arg f r · Length(γ ) r 2 . 3. The Upper Bound for 1 < α < 2 3.1. Few arcs with large increments of the argument. Given r 1, we fix a collection of N 2π Rr disjoint arcs I j 1 j N of length r on the circumference RT. Then, given 1 and a positive integer L, we introduce two events. The first event 1 (r, R, , L) is that the collection I j 1 j N contains a sub-collection of L disjoint arcs {I j } j∈J such that δ( f, I j ) . j∈J

To define the second event, we fix N independent G.E.F. f j . Then the event 2 (r, R, , L) is that the collection I j 1 j N contains a sub-collection of L disjoint arcs {I j } j∈J such that, δ( f j , 1 I j ) . j∈J

Here, 1 I j = I j − w j , where w j are the centers of the arcs I j . Lemma 11. Suppose that R is sufficiently big. Suppose also that R 1/2 R 2

and

1
r2

b + log R

with a sufficiently small positive numerical constant b. Then the probabilities of the 2 events i , i = 1, 2, do not exceed e−b1 r with a positive numerical constant b1 . Proof of Lemma 11. First, we estimate the probability of the event 2 ; this is a simpler part of the job. Suppose that the event 2 (r, R, , L) occurs. We choose M j r −2 δ( f j , 1 I j ) such that j∈J M j r 2 = . Let B be the constant from Lemma 9. Note that the arcs 1 I j with M j < 50B can contribute at most 50B L < 50Bb < 21 to the 1 total sum, provided that b < 100B . We discard the arcs 1 I j with M j < 50B and denote by J the collection of remaining arcs.


Now, let m j be the largest positive integer power of 2 such that m j M j , j ∈ J . Then 1 m jr2 and m j 25B , (3.1.1) 4 j∈J

and P { 2 }

J {m j }

⎧ ⎫ ⎨2 ⎬ |δ( f j , 1 P I j )| m j r 2 , ⎩ ⎭ j∈J

where the first sum is taken over all subsets J ⊂ {1, ..., N } of cardinality at most L, and the second sum is taken over all possible choices of m j , j ∈ J , that are positive integer powers of 2 satisfying restrictions (3.1.1). Since f j are independent, we have ⎧ ⎫ ⎨2 ⎬ P |δ( f j , 1 I j )| m j r 2 P |δ( f j , 1 I j )| m j r 2 . = ⎩ ⎭ j∈J

j∈J

The probabilities of the events on the right-hand side were estimated in Lemma 10: $ % m 2j r 4 1 2 1 P |δ( f j , I j )| m j r 2 exp − . 16B 2 log m j Therefore,

⎧ ⎫ ⎛ ⎞ ⎨2 ⎬ m 2j r 2 1 ⎠ P |δ( f j , 1 I j )| m j r 2 r2 2 L exp ⎝− ⎩ ⎭ 16B 2 log m j j∈J j∈J ⎞ ⎛ 1 < 2 L exp ⎝− r2 m j r 2⎠ 16B 2 j∈J 1 L 2 2 exp − r , 64B 2

and

1 2 r 1. P { 2 } < 2 L exp − 64B 2 J {m j }

To get rid of the sums on the right-hand side, we need to estimate the number of different ways to choose the “data” J, {m j } j∈J . Since m j is an integer power of 2 and m j , for each j ∈ J , there are at most 2 log ways to choose the integer m j . Hence, given a set J of cardinality at most L, we have at most (2 log ) L ways to choose the collection {m j } j∈J . Also there are at most N N 2π R < (N + 1) L < eC L log R 0L


ways to choose the subset J ⊂ {1, 2, ... , N } of cardinality at most L. Therefore, 2L 1 (4 log ) L eC L log R J {m j }

< eC

L(log

R+log log )

< eC

L

< eC

b

, 1 2 −2 . which is a negligible factor with respect to exp − 64B 2 r , provided that b B This completes the estimate of P { 2 } . log R

Theestimate of the probability of the event 1 follows a similar pattern. Now, the events |δ( f, I j )| m j r 2 , j ∈ J , are not independent. To get around this obstacle, we’ll use the almost independence lemma, which brings in some awkward technicalities. We split the proof into several steps. R, , L) occurs. As above, we choose M j r −2 that theevent 1 (r, (i) Suppose 2 δ( f, I j ) such that j∈J M j r = . Then we fix a sufficiently large positive numerical constant C1 25B and note that the arcs I j with M j < 2C1 (1 + r −2 log ) can contribute to the total deviation at most 2C1 L(r 2 + log ) < 2C1

b(r 2 + 2 log R) 4bC1 , r 2 + log R

which is much smaller than provided that the constant b is sufficiently small. We choose b < 8C1 1 and conclude that at least half of the deviation must come from the arcs I j with sufficiently large M j . From now on, we discard the arcs I j with M j < 2C1 (1+r −2 log ) and denote by J the set of the remaining arcs. Now, let m j be the largest positive integer power of 2 such that m j M j , j ∈ J . Then 1 m jr2 , and m j r 2 C1 r 2 + log (3.1.2) 4 j∈J

and P { 1 }

J {m j }

⎧ ⎫ ⎨2 ⎬ |δ( f, I j )| m j r 2 P , ⎩ ⎭

(3.1.3)

j∈J

where the first sum is taken over all subsets J ⊂ {1, ..., N } of cardinality at most L, and the second sum is taken over all possible choices of m j , j ∈ J , that are positive integer powers of 2 satisfying restrictions (3.1.2). As in the previous case, it suffices to show that, for a fixed subset J ⊂ {1, 2, . . . , N } with Card J L, and for fixed m j , j ∈ J , that are integer powers of 2 and satisfy conditions (3.1.2), one has ⎧ ⎫ ⎨2 ⎬ 2 P |δ( f, I j )| m j r 2 (3.1.4) e−cr . ⎩ ⎭ j∈J

Since we have at most eC L log R < eC b possible combinations of the “data” J and {m j } j∈J , the two sums on the right-hand side of (3.1.3) contribute by a negligible factor 2 with respect to e−cr , provided that b < 2Cc .


(ii) From now on, we fix a set J of cardinality at most L, and m j , j∈J , that are integer powers of 2 and satisfy conditions (3.1.2). Let w j be the centers of the arcs I j = I j − w j , and let Ij, 1 def I j )| m j r 2 . j = |δ( f, I j )| m j r 2 = |δ(Tw j f, 1 By Lemma 10 applied to the G.E.F. Tw j f with γ = 1 I j and m = m j , we have 1 2 j ⊂ j ∪ max |Tw j f | < e− 4B m j r rD

with P

j

1 − 6B m j r 2 , whence, < exp −e

⎛ ⎞ ) 2 2 1 2 ⎠ δ( f, I j ) m j r 2 ⊂ ⎝ max |Tw j f | < e− 4B m j r j ∪ j∈J

with P

⎧ ⎨) ⎩

j∈J

j∈J

j

⎫ ⎬ ⎭

m j r 2 C1 log

j∈J

rD

C1 25B 1 4 L< 4 L exp −e 6B C1 log Le− < e− .

−r when R 1. Since r 2 < , this is much )less than e Discarding the event j , we need to estimate the probability of the event 2

j∈J

2 1 max |Tw j f | exp − 4B m jr2 . j∈J

rD

(iii) Combinatorial Lemma 1 applied with m j = 0 for j ∈ / J and with the constant Q = π2 (A + 1), gives us a subset J ⊂ J such that √ √ j, k ∈ J , j = k , | j − k|∗ Q( m j + m k ) , and

j∈J

3/2

mj

1 mj . 5Q

(3.1.5)

j∈J

Hence, the centers w j of the arcs from J are well-separated: |w j − wk | = 2R sin

√ √ 2 | j − k|∗r | j − k|∗ r (A + 1) m j r + m k r 2R π

for j, k ∈ J , j = k. By the almost independence Lemma 5 applied with r j = ρ j = √ m j r , we have Tw j f = f j + h j , j ∈ J , where f j are independent G.E.F., and 2 2 2 2 exp − 21 em j r P max |h j (z)|e−|z| /2 e−m j r z∈r D

m j r 2 C1 log

2 exp − 21 C1 .


Introduce the event

) F= max |h j | > exp − 21 m j r 2 . rD

j∈J

Claim 11 -1. For R 1, P { F } e−r

2

.

Proof of Claim 11 -1. If for some j ∈ J ,

max |h j | > exp − 21 m j r 2 , rD

then max |h j (z)|e−|z|

2 /2

z∈r D

m j 25 2 > exp − 21 (m j + 1)r 2 > e−m j r .

Therefore, P {F }

P

2 /2

z∈r D

j∈J

L · 2 exp − 21 C1

max |h j (z)|e−|z|

> exp − 21 m j r 2

2 exp − 21 C1 .

L

Since r 2 < , this is much less than e−r

2

, provided that R 1.

If the event F does not occur, then for each j ∈

J ,

max | f j | max |Tw j f | + max |h j | rD

rD

rjD

1

1

e− 4B m j r + e− 2 m j r 2

2

B>1

1

< 2e− 4B m j r

2

m j r 2 25B

<

1

e− 6B m j r . 2

We conclude that if R is sufficiently big, then outside of an event of probability less than exp(−r 2 ), we have 1 max | f j | < exp − 6B m jr2 rD

J .

for each j ∈ (iv) Our problem boils down to the estimate of the probability that the independent events 1 2 max | f j | < e− 6B m j r , j ∈ J , rD

occur. By Lemma 7 the logarithm of the probability of each of these events doesn’t c2 m 2

1 exceed − log mj j r 4 with c2 = 36B 2 . Therefore, the logarithm of the probability that all these events happen doesn’t exceed

−c2

j∈J

(3.1.5)

m 2j log m j

< −

r 4 < −c2 r 2

c2 r2 5 π2 (A + 1)

j∈J

j∈J

3/2

m j r2 (3.1.2)

m j r 2 −cr 2


with c=

2c2 1 1 = . 4 5π(A + 1) 360π B 2 (A + 1)

This completes the proof of (3.1.4) and, thereby, of the lemma.

3.2. Proof of Theorem 1: the upper bound for 1 < α < 2. We need to estimate the probability of the event α = |n(R) − R 2 | > R α . Let b be the constant from the pre2−α vious lemma. We fix a small positive δ ∈ ( 41 b, 21 b) such that the number N = 2π δ R α−1 is an integer, take r = δ R , and split the circumference RT into N disjoint arcs α {I j } of length r . By the argument principle, α = j δ( f, I j ) > 2π R . In the case 1 < α < 2, the cancelations between different random variables δ( f, I j ) are not important, upper bound for the probability of the bigger event so we are after the α α = j δ( f, I j ) > 2π R . We take = 2π R α , and check that Lemma 11 can be applied to the whole collection of arcs {I j }; i.e., with L = N . If R is big enough then log R r 2 , and 1L=

1 b · 2π R α 1 b b 2π 2−α < = < 2 R . δ 2 δ 2 R 2α−2 2 r2 r + log R

Therefore, the assumptions of Lemma 11 are fulfilled, and we get 2 α 3α−2 . P α e−2π b1 r R < e−c R Done!

4. The Upper Bound for

1 2

<α<1

4.1. Approximating the total increment of arg f by the sum of increments of arguments of independent G.E.F. Lemma 12. Suppose that R is sufficientlybig, that 1 r 2, and that 3R 1/2 R. Then, given a collection of disjoint arcs I j of length r of the circumference RT that are separated by arcs of length at least log R, there exists a collection of independent G.E.F. { f j } such that ⎫ ⎧ ⎬ ⎨ P δ( f, I j ) − δ( f j , 1 I j ) e−b2 , ⎭ ⎩ j j where 1 I j = I j − w j and b2 is a positive numerical constant. √ Proof of Lemma 12. Set ρ = C1 log R with C1 1. Let A be the constant from the almost independence lemma. If R is big enough, then by our assumptions, the disks D(w j , r + Aρ) are disjoint. So the almost independence Lemma 5 yields a decomposition Tw j f = f j + h j with independent G.E.F. { f j } and 2 P max |h j (z)|e−|z| /2 R −C1 2 exp − 21 R C1 . rD


In what follows, we assume that max max |h j (z)|e−|z|

2 /2

R −C1 .

rD

j

For this, we throw away an event of probability at most 1

2π R · 2e− 2 R

C1

e− .

Since δ( f, I j ) = δ(Tw j f, 1 I j ), we need to estimate the probability of the event ⎫ ⎧ ⎬ ⎨ , 1 1 δ(T f, I ) − δ( f , I ) w j j j j ⎭ ⎩ j introduce the events

j = min | f j (z)|e

−|z|2 /2

z∈ I1j

R

−C1 /2

,

and note that if j does not occur, then

δ(Tw f, 1 1 = I ) − δ( f , I ) arg T f − arg f j j j wj j j I1j I1j h j R −C1 /2 = I1j arg 1 + fj

(we have used that E I1j arg Tw j f = E I1j arg f j ), whence

δ(Tw f, 1 I j ) − δ( f j , 1 I j ) 2π R · R −C1 /2 1 . j

j : j doesn t occur

Therefore, we conclude that 1 1 δ(Tw j f, I j ) − δ( f j , I j ) j δ(Tw f, 1 I j ) + j j : j occurs

=

j : j occurs

δ( f, I j ) +

δ( f j , 1 I j ) + 1

j : j occurs

δ( f j , 1 I j ) + 1.

j : j occurs

To estimate the size on the right-hand side, we introduce the (random) of the two sums counter L = Card j : j occurs . Lemma 11 (applied to 13 instead of ) handles the case b b < for R 1 . L 6 log R 3(r 2 + log R) It yields that outside of some event of probability at most 2e−b1 r two sums does not exceed 13 .

2

, each of these


Now, consider the second case when L > b6 logR . Denote by Q the integer part of b . Then at least Q independent events j1 , ... , j Q must occur. By Lemma 8 6 log R applied with γ = 1 I j and ε = R −C1 /2 , we have , P j 100r R −C1 /2 21 C1 log R R −C1 /3 , provided that R is sufficiently big. Therefore, Card{I j } −C1 /3 Q b R P L 16 log R Q ./ 0 (2π R) Q 1

< e− 4 C1 Q log R e−c2 . Thereby, ⎫ ⎧ ⎬ ⎨ P δ( f, I j ) − δ( f j , 1 I j ) ⎭ ⎩ j j 2 b P + P L 16 log < 2e−b1 r + e−c2 < e−c3 , R and we are done.

4.2. Proof of Theorem 1: the upper bound in the case 21 < α < 1. We split the circumference RT into N = 2π R disjoint arcs {I j } of equal length r , 1 r 2. We fix a positive ε < 1−α 4 and suppose that N α δ( f, I ) j > 2π R . j=1 Then we split the set {1, ... , N } into n = 2R ε disjoint arithmetic progressions J1 , ..., Jn . If R is sufficiently big, then the cardinality of each of these arithmetic progressions cannot be less than N 2π R − 1 −1 − 1 > 2R 1−ε , n 2R ε and cannot be larger than N 2π R +1 + 1 < 4R 1−ε . n 2R ε − 1 For at least one of these progressions, say for Jl , we have Rα > 2R α−ε . δ( f, I j ) > 2π n j∈Jl


Given a collection {I j } j∈J with 2R 1−ε < Card J < 4R 1−ε of R ε -separated arcs of length r , we show that ⎧ ⎫ ⎨ ⎬ 2α−1−ε P δ( f, I j ) > 2R α−ε C1 e−c2 R . ⎩ ⎭ j∈J Since we have n R such collections I j , this will prove the upper bound in the case 1 2 < α < 1. Now, suppose that j∈J δ( f, I j ) > 2R α−ε . By Lemma 12 applied with = R α−ε , we see that there is a collection of independent G.E.F. { f j } such that throwing away an event of probability at most e−b2 = e−b2 R

α−ε

ε<1−α

e−R

2α−1

,

we have

δ( f j , 1 I j ) > 2R α−ε − = R α−ε . j∈J To estimate the probability of the event P j∈J δ( f j , 1 I j ) > R α−ε , we apply Bernstein’s estimate (Lemma 3) to the independent identically distributed random variables ψ j = δ( f j , 1 I j ). By Lemma 10, the tails of these random variables decay superexponentially: c3 t 2 P ψ j t exp − log t for t 1. The number of the random variables ψ j is bigger than 2R 1−ε . Hence, the Bernstein estimate can be applied with t = R α−ε . We see that the probability we are interested in does not exceed 2 exp −c4 t 2 / Card J < exp −c5 R 2α−1−ε , completing the argument.

5. Proof of Theorem 1: The Lower Bound for

1 2

<α<1

( 21 , 1)

and show that, for some positive numerical constant c0 and for each We fix α ∈ R > R0 (α), one has 2α−1 . P n(R) R 2 − c0 R α e−3R Everywhere below, we assume that R > 2. Let N = R. Let J− be a set consisting of N integers between R 2 − 2R and R 2 − R, and let J+ be a set consisting of N integers between R 2 + R and R 2 + 2R. Let ⎧√ ⎪ k ∈ J+ ; ⎨√1 − R α−1 , α−1 ak = 1+ R , k ∈ J− ; ⎪ ⎩1, k∈ / J+ ∪ J− .


Consider the Gaussian Taylor series g(z) =

∞

zk ζk a k √ , k! k=0

and denote by n g (R) the number of its zeroes in the disk RD. Claim 5.1. For R 1, we have En g (R) R 2 − c1 R α . Proof of Claim 5.1. By Lemma 4, R 2k−1 R 2k 2 1 R k 0 ak2 · 2k · k! k 0 ak · k · k! En g (R) = = . 2k 2k 2 a2 · R a2 · R k 0 k

k 0 k

k!

The ratio on the right-hand side can be written as 2 2 k 0 ak · (k − R ) · 2 R + 2 R 2k k 0 ak · k!

R 2k k!

k!

.

Note that

(k − R 2 ) ·

k 0

R 2k = 0, k!

so the numerator in the second term equals

R α−1 · (k − R 2 ) ·

k∈J−

R 2k R 2k + (−R α−1 ) · (k − R 2 ) · k! k! k∈J+

−R α

R 2k

k∈J− ∪J+

k!

. 2

Since R 1, we have ak2 2, and the denominator cannot be bigger than 2e R . Hence, En g (R) R 2 −

1 α −R 2 R e 2

k∈J− ∪J+

R 2k . k!

Now, observe that k∈J− ∪J+

R 2k 2 ce R k! 2k

with some absolute c > 0. To see this, note that the function k → Rk! decreases for k ∈ J+ and increases for k ∈ J− . We set K = R 2 + 2R. Applying Stirling’s formula, we get 2 K R 2k R 2K 1 eR √ k! K! K K 2

2

2

e R +2R−1 e R +2R eR R 2 +2R Re2R+4 R R 1 + R2


for k ∈ J+ . A similar estimate holds for k ∈ J− . Therefore, En g (R) R 2 −

1 α −R 2 R e 2

k∈J− ∪J+

R 2k 1 2 2 R 2 − R α e−R · ce R , k! 2

proving the claim.
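As a numerical sanity check of Claim 5.1 (not part of the original argument), the following Python sketch evaluates the ratio from Lemma 4 for one choice of parameters. The specific sets $J_\pm$ (the first $N$ integers in each interval), the truncation of the series, and the name `expected_zero_count` are assumptions made for illustration only.

```python
import math

def expected_zero_count(R, alpha, tail=20):
    """Evaluate sum_k a_k^2 k R^{2k}/k!  /  sum_k a_k^2 R^{2k}/k! (truncated)."""
    N = int(R)                                   # assumed: N = integer part of R
    lo_minus = math.ceil(R * R - 2 * R)          # J_- starts at R^2 - 2R
    lo_plus = math.ceil(R * R + R)               # J_+ starts at R^2 + R
    J_minus = set(range(lo_minus, lo_minus + N))
    J_plus = set(range(lo_plus, lo_plus + N))
    shift = R ** (alpha - 1)
    num = den = 0.0
    k_max = int(R * R + tail * R)                # truncation point (assumption)
    for k in range(k_max + 1):
        # weight e^{-R^2} R^{2k}/k!, computed in log space to avoid overflow
        w = math.exp(2 * k * math.log(R) - math.lgamma(k + 1) - R * R)
        if k in J_minus:
            a2 = 1 + shift
        elif k in J_plus:
            a2 = 1 - shift
        else:
            a2 = 1.0
        num += a2 * k * w
        den += a2 * w
    return num / den

value = expected_zero_count(10.0, 0.75)
print(value, "vs R^2 =", 100.0)
```

The common factor $e^{-R^2}$ cancels in the ratio; it is included only to keep the weights in floating-point range. The printed value should lie strictly below $R^2$, in line with the claim.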

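Claim 5.2. For $R \ge 1$, we have
\[ P\Big\{ n_g(R) \le R^2 - \frac{c_1}{2} R^\alpha \Big\} \ge \frac{c_1}{2} R^{-2+\alpha} . \]
Proof of Claim 5.2. We have
\[ c_1 R^\alpha \le E\big(R^2 - n_g(R)\big) \le \frac{c_1}{2} R^\alpha + R^2\, P\Big\{ n_g(R) \le R^2 - \frac{c_1}{2} R^\alpha \Big\} , \]
whence
\[ P\Big\{ n_g(R) \le R^2 - \frac{c_1}{2} R^\alpha \Big\} \ge R^{-2} \cdot \frac{c_1}{2} R^\alpha = \frac{c_1}{2} R^{-2+\alpha} . \]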

Claim 5.3. Let $0 \le t \le N$. Then
\[ P\Big\{ \sum_{k\in J_-} |\zeta_k|^2 - \sum_{k\in J_+} |\zeta_k|^2 \ge t \Big\} \le 2 \exp\Big( - \frac{t^2}{16(e+1)N} \Big) . \]
Proof of Claim 5.3. Note first of all that $P\{ |\zeta_k|^2 \ge t \} = e^{-t}$ and $E|\zeta_k|^2 = 1$, whence, for $t > 0$, $P\{ |\zeta_k|^2 - 1 > t \} < e^{-t}$ and
\[ P\{ |\zeta_k|^2 - 1 < -t \} = \max\{ 1 - e^{t-1}, 0 \} < e^{1-t} . \]
Thus we can apply Bernstein's Lemma 3 with $K = e + 1$ to the random variables $\pm(|\zeta_k|^2 - 1)$, which yields the desired conclusion. In particular,
\[ P\Big\{ \sum_{k\in J_-} |\zeta_k|^2 - \sum_{k\in J_+} |\zeta_k|^2 \ge R^{1/2} \log R \Big\} \le 2 \exp\big( -c_2 \log^2 R \big) \le \frac{c_1}{4} R^{-2+\alpha} , \]

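provided that $R > R_0(\alpha)$.

Now everything is ready to make the final estimate. Let $\gamma$ be the standard Gaussian measure on the space $\mathbb{C}^\infty$; i.e., the product of countably many copies of the measures $\frac1\pi e^{-|\eta_k|^2}\, dm(\eta_k)$, and let $\gamma_a$ be another Gaussian measure on $\mathbb{C}^\infty$ that is the product of the Gaussian measures $\frac{1}{\pi a_k^2} e^{-|\eta_k|^2/a_k^2}\, dm(\eta_k)$. Let $E \subset \mathbb{C}^\infty$ be the set of coefficients $\eta_k$ such that the Taylor series $\sum_{k\ge 0} \eta_k \frac{z^k}{\sqrt{k!}}$ converges in $\mathbb{C}$ and has at most $R^2 - \frac{c_1}{2} R^\alpha$ zeroes in $R\mathbb{D}$. Then Claim 5.2 can be rewritten as
\[ \gamma_a(E) \ge \frac{c_1}{2} R^{-2+\alpha} , \]
while the quantity $P\{ n(R) \le R^2 - \frac{c_1}{2} R^\alpha \}$ we are interested in equals $\gamma(E)$. Thus, it remains to compare $\gamma(E)$ with $\gamma_a(E)$. Let
\[ U = \Big\{ \sum_{k\in J_-\cup J_+} |\eta_k|^2 \ge \sum_{k\in J_-\cup J_+} \frac{|\eta_k|^2}{a_k^2} + R^{\alpha-\frac12} \log R \Big\} , \]
and
\[ \widehat U = \Big\{ \sum_{k\in J_-\cup J_+} a_k^2 |\eta_k|^2 \ge \sum_{k\in J_-\cup J_+} |\eta_k|^2 + R^{\alpha-\frac12} \log R \Big\} . \]
Note that
\[ \gamma_a(U) = \gamma(\widehat U) = P\Big\{ \sum_{k\in J_-} |\zeta_k|^2 - \sum_{k\in J_+} |\zeta_k|^2 \ge R^{1/2} \log R \Big\} \le \frac{c_1}{4} R^{-2+\alpha} . \]
Hence,
\[ \gamma_a(E \setminus U) \ge \frac{c_1}{4} R^{-2+\alpha} . \]
But on $E \setminus U$, we can bound the density of $\gamma_a$ with respect to $\gamma$:
\[ \frac{d\gamma_a}{d\gamma} \le e^{R^{\alpha-\frac12} \log R}\, (1 - R^{2\alpha-2})^{-N} < e^{2R^{2\alpha-1}} \]
for $R > R_0(\alpha)$. The rest is obvious:
\[ \gamma(E) \ge \gamma(E \setminus U) \ge e^{-2R^{2\alpha-1}} \gamma_a(E \setminus U) \ge \frac{c_1}{4} R^{-2+\alpha} e^{-2R^{2\alpha-1}} \ge e^{-3R^{2\alpha-1}} , \]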

provided that $R > R_0(\alpha)$. This proves the lower bound in Theorem 1.

Appendix: Asymptotic Almost Independence. Proof of Lemma 5

A-1. Elementary inequalities.

Claim A-1.1. For all positive $k$ and $t$,
\[ k \log t - t \le k \log k - k - (\sqrt t - \sqrt k)^2 . \]


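Proof. The function $\varphi(\tau) = k \log(\tau^2) - \tau^2$ attains its maximum at $\tau = \sqrt k$, and
\[ \varphi''(\tau) = -\frac{2k}{\tau^2} - 2 \le -2 \qquad \text{for all } \tau > 0 . \]
Hence,
\[ \varphi(\tau) \le \varphi(\sqrt k) - (\tau - \sqrt k)^2 \qquad \text{for all } \tau > 0 . \]
Replacing $\tau^2$ by $t$, we get the claim.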
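The inequality of Claim A-1.1 is easy to probe numerically. The sketch below is an illustration only: the grid of values and the helper names `gap` and `min_gap` are assumptions. It computes the difference between the right-hand and left-hand sides, which should be non-negative up to rounding and vanish at $t = k$.

```python
import math

def gap(k, t):
    """Right-hand side minus left-hand side of Claim A-1.1."""
    lhs = k * math.log(t) - t
    rhs = k * math.log(k) - k - (math.sqrt(t) - math.sqrt(k)) ** 2
    return rhs - lhs

def min_gap(ks, ts):
    """Smallest gap over a finite grid of (k, t) pairs."""
    return min(gap(k, t) for k in ks for t in ts)

ks = [0.5, 1, 2, 5, 10, 50]                  # sample values of k > 0
ts = [0.01 * j for j in range(1, 20001)]     # t ranging over (0, 200]
worst = min_gap(ks, ts)
print("smallest gap on the grid:", worst)
```

Near $t = k$ the gap behaves like $(\sqrt t - \sqrt k)^2$, so the minimum over the grid is attained there and is zero up to floating-point error.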

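Claim A-1.2. Let $k$ be a positive integer and $u \ge k$. Then
\[ \int_u^\infty \frac{t^k e^{-t}}{k!}\, dt \le e^{-(\sqrt u - \sqrt k)^2} . \]
Proof.
\[ \int_u^\infty \frac{t^k e^{-t}}{k!}\, dt = \int_k^\infty \frac{[t + (u-k)]^k e^{-t-(u-k)}}{k!}\, dt = \int_k^\infty \Big( 1 + \frac{u-k}{t} \Big)^k e^{-(u-k)}\, \frac{t^k e^{-t}}{k!}\, dt \]
\[ \le \Big( 1 + \frac{u-k}{k} \Big)^k e^{-(u-k)} \underbrace{\int_k^\infty \frac{t^k e^{-t}}{k!}\, dt}_{\le 1} \le \exp\big\{ [k \log u - u] - [k \log k - k] \big\} \overset{\text{Claim A-1.1}}{\le} e^{-(\sqrt u - \sqrt k)^2} , \]
proving the claim.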
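Claim A-1.2 can also be checked numerically. The sketch below relies on the standard identity $\int_u^\infty t^k e^{-t}/k!\, dt = e^{-u} \sum_{j=0}^{k} u^j/j!$ for integer $k \ge 0$ (repeated integration by parts), which is not stated in the text. The helper names and the grid are assumptions made for illustration.

```python
import math

def gamma_tail(k, u):
    """int_u^inf t^k e^{-t}/k! dt, via e^{-u} * sum_{j=0}^{k} u^j / j!."""
    term = total = math.exp(-u)
    for j in range(1, k + 1):
        term *= u / j            # term is now e^{-u} u^j / j!
        total += term
    return total

def bound(k, u):
    """Right-hand side of Claim A-1.2."""
    return math.exp(-(math.sqrt(u) - math.sqrt(k)) ** 2)

# Scan k = 1..40 and u from k to 4k in steps of 1/2.
violations = [
    (k, k + 0.5 * m)
    for k in range(1, 41)
    for m in range(0, 6 * k + 1)
    if gamma_tail(k, k + 0.5 * m) > bound(k, k + 0.5 * m)
]
print("violations found:", len(violations))
```

No violations are expected on this grid, since the claim holds for every integer $k \ge 1$ and every $u \ge k$.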

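Corollary A-1.3.
\[ \frac1\pi \int_{|z| \ge \sqrt k + d} \frac{|z|^{2k}}{k!} e^{-|z|^2}\, dm_2(z) = \int_{(\sqrt k + d)^2}^\infty \frac{t^k e^{-t}}{k!}\, dt \le e^{-d^2} . \]

Claim A-1.4. Let $w'$, $w''$ be points in $\mathbb{C}$ and let $k'$, $k''$ be non-negative integers. Then
\[ \big| E\, \xi_{k'}(w')\, \overline{\xi_{k''}(w'')} \big| \le 2 e^{-\frac{d^2}{8}} , \]
provided that $|w' - w''| \ge \sqrt{k'} + \sqrt{k''} + d$, $d > 0$.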

Proof. By Sect. 2.4(d), E ξk (w )ξk (w ) # $ % $ %& zk zk = T−w √ , T−w √ k ! k ! 1 (z − w )k (z − w )k −zw − 1 |w |2 −zw − 1 |w |2 −|z|2 2 2 = e e e dm 2 (z). √ √ π C k ! k !

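Therefore,
\[ \big| E\, \xi_{k'}(w')\, \overline{\xi_{k''}(w'')} \big| \le \frac1\pi \int_{\mathbb{C}} \frac{|z - w'|^{k'}}{\sqrt{k'!}} e^{-\frac12 |z-w'|^2}\, \frac{|z - w''|^{k''}}{\sqrt{k''!}} e^{-\frac12 |z-w''|^2}\, dm_2(z) \le \frac1\pi \int_{|z-w'| \ge \sqrt{k'} + \frac d2} + \frac1\pi \int_{|z-w''| \ge \sqrt{k''} + \frac d2} = I' + I'' . \]
By the Cauchy-Schwarz inequality,
\[ I' \le \Big( \frac1\pi \int_{|z-w'| \ge \sqrt{k'} + \frac d2} \frac{|z - w'|^{2k'}}{k'!} e^{-|z-w'|^2}\, dm_2(z) \Big)^{1/2} \times \Big( \frac1\pi \int_{\mathbb{C}} \frac{|z - w''|^{2k''}}{k''!} e^{-|z-w''|^2}\, dm_2(z) \Big)^{1/2} \overset{\text{Corollary A-1.3}}{\le} e^{-\frac{d^2}{8}} \cdot 1 = e^{-\frac{d^2}{8}} . \]
Similarly, $I'' \le e^{-\frac{d^2}{8}}$. Hence, $I' + I'' \le 2e^{-\frac{d^2}{8}}$, and we are done.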

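Claim A-1.5. Assume that the disks $D(w_j, R_j + 8\sigma_j)$ are pairwise disjoint and $R_j \ge 1$, $\sigma_j \ge \max\{1, \sqrt{\log R_j}\}$. Let $D_{ij} = |w_i - w_j| - R_i - R_j$ be the distance between the disks $D(w_i, R_i)$ and $D(w_j, R_j)$. Then, for each $i$,
\[ \sum_{j \colon j \ne i} 2(1 + R_j^2)\, e^{-\frac18 D_{ij}^2} \le e^{-2\sigma_i^2} . \]
Proof. Indeed, since $D_{ij} \ge 8\sigma_j$, we have
\[ \frac{1}{16} D_{ij}^2 \ge 4\sigma_j^2 \ge 2\sigma_j^2 + 2 \ge \log(e^2 R_j^2) \ge \log(4R_j^2) \ge \log\big[ 2(1 + R_j^2) \big] . \]
Thus, it suffices to estimate the sum $\sum_{j \colon j \ne i} e^{-\frac{1}{16} D_{ij}^2}$. For each $j \ne i$, consider the disk $D_j \subset D(w_j, R_j + 8\sigma_j)$ of radius 4 closest to $w_i$. For each $z \in D_j$, we have $|z - w_i| \le D_{ij} + R_i$. Also, the disks $D_j$ are disjoint and $\bigcup_j D_j \subset \mathbb{C} \setminus D(w_i, R_i + 8\sigma_i)$. Hence,
\[ \sum_{j \colon j \ne i} e^{-\frac{1}{16} D_{ij}^2} \le \frac{1}{16\pi} \int_{\{|z - w_i| \ge R_i + 8\sigma_i\}} e^{-\frac{1}{16}(|z - w_i| - R_i)^2}\, dm_2(z) = \frac{1}{16\pi} \int_{\{|z| \ge R_i + 8\sigma_i\}} e^{-\frac{1}{16}(|z| - R_i)^2}\, dm_2(z) \]
\[ = \frac18 \int_{8\sigma_i}^\infty (R_i + t)\, e^{-\frac{1}{16} t^2}\, dt \le \Big( 1 + \frac18 R_i \Big) \int_{8\sigma_i}^\infty \frac t8\, e^{-\frac{1}{16} t^2}\, dt = \Big( 1 + \frac18 R_i \Big) e^{-4\sigma_i^2} \]
\[ \le \frac{1 + \frac18 R_i}{e^{\sigma_i^2 + 1}}\, e^{-2\sigma_i^2} \le \frac{8 + R_i}{8e R_i}\, e^{-2\sigma_i^2} \le \frac{9}{8e}\, e^{-2\sigma_i^2} < e^{-2\sigma_i^2} , \]
proving the claim.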
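Two elementary steps in the proof of Claim A-1.5 can be verified numerically: the closed form $\int_{8\sigma}^\infty \frac t8 e^{-t^2/16}\, dt = e^{-4\sigma^2}$, and the final inequality $(1 + \frac18 R)\, e^{-4\sigma^2} < e^{-2\sigma^2}$ under the hypotheses $R \ge 1$, $\sigma \ge \max\{1, \sqrt{\log R}\}$. The sketch below is illustrative only. The midpoint quadrature, its step and cutoff, and the function names are assumptions, and $\sigma$ is set to the smallest admissible value.

```python
import math

def tail_integral(sigma, step=1e-3, length=60.0):
    """Midpoint-rule value of int_{8 sigma}^{inf} (t/8) exp(-t^2/16) dt.

    The integral is cut off at 8*sigma + length, where the integrand is negligible.
    """
    a = 8.0 * sigma
    n = int(length / step)
    total = 0.0
    for m in range(n):
        t = a + (m + 0.5) * step
        total += (t / 8.0) * math.exp(-t * t / 16.0)
    return total * step

def final_chain_holds(R):
    """Check (1 + R/8) e^{-4 sigma^2} < e^{-2 sigma^2} at the smallest allowed sigma."""
    sigma = max(1.0, math.sqrt(math.log(R)))
    return (1.0 + R / 8.0) * math.exp(-4.0 * sigma ** 2) < math.exp(-2.0 * sigma ** 2)

print(tail_integral(1.0), math.exp(-4.0))
print(all(final_chain_holds(R) for R in (1.0, 2.0, 10.0, 1e3, 1e6)))
```

The first line should print two nearly equal numbers; the second should print `True`.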


Fig. 1. The disks D(wi , Ri ), D(w j , R j ) and D j

A-2. Almost orthogonal standard Gaussian random variables are almost independent. Claim A-2.1. Let ξ j be standard complex Gaussian random variables such that their covariance matrix i j = E ξi ξ j satisfies

|i j | δi

j : j =i

1 . 3

Then ξ j = ζ j + s j η j , where ζ j are independent standard complex Gaussian random variables, η j are standard complex Gaussian random variables, and s j ∈ [0, δ j ]. Proof. Let = I − , where I is the identity matrix. Put ζi = ( −1/2 )i j ξ j . j

1 = Then ζi are independent variables. We set standard complex Gaussian random −1/2 1i j ξ j , and estimate the sum 1i j |. I − and si ηi = | j

j

We have

1 αk k −1/2 = I + + 2 k 2

with |αk | 1 for all k 2. Then 1i j | |

1 |i j | + |(k )i j |, 2 k 2

whence

j

1i j | |

1 δi |i j | + |(k )i j | + |(k )i j | . 2 2 j

k 2

j

k 2

j


To estimate the sum on the right-hand side, we note that for any two square matrices A and B of the same size, we have |(AB)i j | |Ai | |Bj | j

j,

=

⎡ ⎣|Ai | ·

⎤

⎛

|Bj |⎦ ⎝

j

⎞ |Ai j |⎠ · sup

j

|Bj | .

j

Applying this observation to the matrices k = · k−1 (with k 1), we conclude by induction that ⎛ ⎞ δi |(k )i j | ⎝ |i j |⎠ 3−(k−1) k−1 . 3 j

j

Thus j

and we are done.

1i j | |

δi δi = δi , + 2 3k−1 k 2

A-3. Proof of the lemma. We fix two big constants A a 1. Let R j = r j + aρ j , σ j = A−a 8 ρ j . Clearly, R j 1, σ j 1. Also, A−a σ j = 2ρ j + − 2 ρj 8 ' A − a − 16 log(1 + aρ j ) 2 log r j + 8a , ' 2 log r j + 2 log(1 + aρ j ) , ' 2 log r j (1 + aρ j ) 2 log R j , provided that a 2 and A 17a + 16. We consider now the family of standard Gaussian random variables ζk (w j ), k R 2j . Applying to this family Claim A-1.4, we get − 1 D2 E ζk (wi )ζ (w j ) 2e 8 i j where, as before, Di j = |wi −w j |− Ri − R j is the distance between the disks D(wi , Ri ) and D(w j , R j ). Now, Claim A-1.5 implies that the sum of absolute values of the covari1 2 ances of ζk (wi ) with all other ζl (w j ) in our family does not exceed e−2σi e−2 < . 3 Claim A-2.1 then allows us to write ζk (wi ) = ζik + sik ηik ,

k Ri2 ,


where ζik are independent standard Gaussian complex random variables, ηik are standard 2 Gaussian complex random variables, and sik ∈ [0, e−2σi ]. 2 Next, we choose ζik , k > Ri , in such a way that the whole family ζik of standard Gaussian complex random variables is independent and put fi =

k

hi =

zk ζik √ , k! zk zk sik ηik √ + [ζk (wi ) − ζik ] √ . k! k! 2 2

k Ri

k>Ri

By construction, Twi f = f i + h i . To estimate the probability 2 1 2 P max |h i (z)|e− 2 |z| > e−ρ j , z∈ri D

it suffices to estimate the expression

|z|k 1 2 |z|k 1 2 sik max √ e− 2 |z| + 2 max √ e− 2 |z| . z∈ri D k! z∈ri D k! 2 2

k Ri

k>Ri

If this expression is less than e−2ρ j , then by Lemma 2, we get what Lemma 5 asserts: 1 2 −ρ 2j −|z|2 /2 P max |h j (z)|e e 2 exp − e2ρ j . 2 z∈r j D 2

|z|k 2 For every k 1, we have √ e−|z| /2 1 and thereby, k!

|z|k 2 sik max √ e−|z| /2 sik z∈ri D k! 2 2

k Ri

k Ri

(1 + Ri2 )e−2σi 2

1 + Ri2 Ri4

e

−σi2

1 + Ri2 e

2e−

σi2

e−σi

(A−a)2 2 64 ρi

provided that A > a + 16. For k > Ri2 , Claim A-1.1 implies that √ √ |z|2k −|z|2 kk 2 2 e e−k e−( k−|z|) e−( k−ri ) k! k!

for all z ∈ ri D. Hence, √ 1 |z|k 2 2 max √ e−|z| /2 e− 2 ( k−ri ) , z∈ri D k!

k > Ri2 ,

2

1 −2ρ 2 e i , 2


and it suffices to show that 2

1


e− 2 (

√ k−ri )2

k>Ri2

Now,

=

Ri2
k>Ri2

1 −2ρ 2 e i . 2

+

k>max(Ri2 ,4ri2 )

with the usual convention that the sum taken over the empty set equals zero. The first sum does not exceed 5ri2

ρi2 log ri

5 −2ρ 2 1 2 e i < e−2ρi , 4 e 8 e provided that a 4. At last, the remaining sum does not exceed 1 1 2 2 1 e− 8 k e− 8 a ρi −1/8 1 − e 2 1 2 2 ρi

(1 + 4ri2 )e− 2 a

1 2 −6)ρi2

e(− 2 a 2

4+2ρi

k aρi

9 −( 1 a 2 −6)ρ 2 1 −2ρ 2 i e i , e 8 e6 8 provided that a 8. This finishes off the proof of Lemma 5. 1 2 2 ρi

9e− 8 a

Acknowledgement. We thank Manjunath Krishnapur, Yuval Peres, and Boris Tsirelson for very helpful discussions.

References
1. Bogomolny, E., Bohigas, O., Leboeuf, P.: Distribution of roots of random polynomials. Phys. Rev. Lett. 68, 2726–2729 (1992); Quantum chaotic dynamics and random polynomials. J. Stat. Phys. 85, 639–679 (1996)
2. Forrester, P.J., Honner, G.: Exact statistical properties of the zeros of complex random polynomials. J. Phys. A 32, 2961–2981 (1999)
3. Ginibre, J.: Statistical Ensembles of Complex, Quaternion, and Real Matrices. J. Math. Phys. 6, 440–449 (1965)
4. Jancovici, B., Lebowitz, J.L., Manificat, G.: Large charge fluctuations in classical Coulomb systems. J. Statist. Phys. 72, 773–787 (1993)
5. Hannay, J.H.: Chaotic analytic zero points: exact statistics for those of a random spin state. J. Phys. A 29, L101–L105 (1996); The chaotic analytic function. J. Phys. A 31, L755–L761 (1998)
6. Kahane, J.-P.: Some random series of functions. 2nd edition. Cambridge: Cambridge University Press, 1985
7. Krishnapur, M.: Overcrowding estimates for zeroes of Planar and Hyperbolic Gaussian analytic functions. J. Statist. Phys. 124, 1399–1423 (2006)
8. Levin, B.Ya.: Distribution of zeros of entire functions. Revised edition. Translations of Mathematical Monographs, 5. Providence, R.I.: Amer. Math. Soc., 1980
9. Nazarov, F., Sodin, M., Volberg, A.: Transportation to random zeroes by the gradient flow. Geom. and Funct. Anal. 17, 887–935 (2007)
10. Shiffman, B., Zelditch, S.: Number variance of random zeros on complex manifold. Geom. and Funct. Anal., to appear. Available at http://arXiv.org/list/math.CV/0608743, 2006
11. Sodin, M.: Zeros of Gaussian analytic functions. Math. Res. Lett. 7, 371–381 (2000)
12. Sodin, M., Tsirelson, B.: Random complex zeroes. I. Asymptotic normality. Israel J. Math. 144, 125–149 (2004); II. Perturbed Lattice, ibid 152, 105–124 (2006); III. Decay of the hole probability, ibid 147, 371–379 (2005)

Communicated by S. Zelditch

Commun. Math. Phys. 284, 867–896 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0547-9

Communications in

Mathematical Physics

Baxter Operator and Archimedean Hecke Algebra A. Gerasimov1,2,3 , D. Lebedev1,4 , S. Oblezin1,4 1 Institute for Theoretical and Experimental Physics, 117259, Moscow, Russia.

E-mail: [email protected]; [email protected]

2 School of Mathematics, Trinity College, Dublin 2, Ireland 3 Hamilton Mathematics Institute, TCD, Dublin 2, Ireland. E-mail: [email protected] 4 Max-Planck-Institut für Mathematik, Vivatsgasse 7, D-53111 Bonn, Germany

Received: 20 October 2007 / Accepted: 13 February 2008 Published online: 19 August 2008 – © Springer-Verlag 2008

Abstract: In this paper we introduce Baxter integral Q-operators for the finite-dimensional Lie algebras $\mathfrak{gl}_{\ell+1}$ and $\mathfrak{so}_{2\ell+1}$. Whittaker functions corresponding to these algebras are eigenfunctions of the Q-operators with the eigenvalues expressed in terms of Gamma-functions. The appearance of the Gamma-functions is one of the manifestations of an interesting connection between Mellin-Barnes and Givental integral representations of Whittaker functions, which are in a sense dual to each other. We define a dual Baxter operator and derive a family of mixed Mellin-Barnes-Givental integral representations. Givental and Mellin-Barnes integral representations are used to provide a short proof of the Friedberg-Bump and Bump conjectures for $G = GL(\ell+1)$ proved earlier by Stade. We also identify eigenvalues of the Baxter Q-operator acting on Whittaker functions with local Archimedean L-factors. The Baxter Q-operator introduced in this paper is then described as a particular realization of the explicitly defined universal Baxter operator in the spherical Hecke algebra $\mathcal{H}(G(\mathbb{R}), K)$, $K$ being a maximal compact subgroup of $G$. Finally we stress an analogy between Q-operators and certain elements of the non-Archimedean Hecke algebra $\mathcal{H}(G(\mathbb{Q}_p), G(\mathbb{Z}_p))$.

1. Introduction

The notion of the Q-operator was introduced by Baxter as an important tool to solve quantum integrable systems [Ba]. These operators were initially constructed for a particular class of quantum integrable systems associated with the affine Lie algebras $\widehat{\mathfrak{gl}}_{\ell+1}$ and their quantum/elliptic generalizations. A new class of integral Q-operators corresponding to the $\widehat{\mathfrak{gl}}_{\ell+1}$-Toda chain was later proposed by Pasquier and Gaudin [PG]. Its generalization to Toda chains for other classical affine Lie algebras was proposed recently in [GLO1,GLO2,GLO3]. In this paper we introduce integral Baxter Q-operators for Toda chains corresponding to the finite-dimensional classical Lie algebras $\mathfrak{gl}_{\ell+1}$ and $\mathfrak{so}_{2\ell+1}$.
These integral operators are closely related with the recursion operators in the Givental integral representation of


Whittaker functions (see [Gi,JK] for $\mathfrak{gl}_{\ell+1}$ and [GLO3] for other classical Lie algebras). It is well known that $\mathfrak{g}$-Whittaker functions are common eigenfunctions of the complete set of mutually commuting $\mathfrak{g}$-Toda chain quantum Hamiltonians. The quantum Hamiltonians arise as projections of the generators of the center $\mathcal{Z}(\mathfrak{g})$ of the universal enveloping algebra $\mathcal{U}(\mathfrak{g})$. One of the characteristic properties of the introduced Baxter integral operators for a finite-dimensional classical Lie algebra $\mathfrak{g}$ is that the corresponding $\mathfrak{g}$-Whittaker functions are their eigenfunctions. Moreover, integral Q-operators provide a complete set of integral equations defining $\mathfrak{g}$-Whittaker functions. Similarly to the relation of the Hamiltonians with the generators of the center $\mathcal{Z}$, we construct universal Baxter operators in a spherical Hecke algebra whose projection gives the Baxter operator for Toda chains. Other projections provide Baxter operators for other quantum integrable systems (e.g. Sutherland models).

The eigenvalues of the Baxter operators acting on Whittaker functions are expressed in terms of a product of Gamma-functions. The appearance of the Gamma-functions implies a close connection between Givental and Mellin-Barnes integral representations [KL1] for $\mathfrak{gl}_{\ell+1}$-Whittaker functions. We discuss this relation in some detail. Note that the representation theory interpretation [GKL] of the Mellin-Barnes integral representation uses the Gelfand-Zetlin construction of the maximal commutative subalgebra of $\mathcal{U}(\mathfrak{gl}_{\ell+1})$. One can guess a connection between Mellin-Barnes and Givental representations on general grounds by noticing that Givental diagrams for classical Lie algebras [GLO3] are identical to Gelfand-Zetlin patterns [BZ]. Moreover both constructions are most natural for classical Lie algebras. In this note we discuss a duality relation between recursive structures of Givental and Mellin-Barnes integral representations.
We construct a dual version of the Baxter Q-operator and derive a set of relations between recursive/Baxter operators and their duals. We also propose a family of mixed Mellin-Barnes-Givental integral representations interpolating between Mellin-Barnes and Givental integral representations of Whittaker functions.

We use the Mellin-Barnes integral representation to give simple proofs of the Bump-Friedberg and Bump conjectures on Archimedean factors arising in the application of the Rankin-Selberg method to analytic continuations of $GL(\ell+1) \times GL(\ell+1)$ and $GL(\ell+1) \times GL(\ell)$ automorphic L-functions. We also discuss a relation with the proofs given by Stade [St1,St2]. The proof in [St1,St2] is based on a recursive construction of $\mathfrak{gl}_{\ell+1}$-Whittaker functions generalizing the construction due to Vinogradov and Takhtajan [VT]. As was noticed in [GKLO] and is explicitly demonstrated below, the Stade recursion basically coincides with the Givental recursion (see also the recent detailed discussion in [St3]). We also show that the Bump-Friedberg and Bump conjectures are simple consequences of the Mellin-Barnes integral representation of the $\mathfrak{gl}_{\ell+1}$-Whittaker function.

The Rankin-Selberg method is a powerful tool for studying analytic properties of automorphic L-functions. The application of the Baxter Q-operators and closely related recursive operators to a derivation of analytic properties of L-functions using the Rankin-Selberg method is not accidental. We remark that the eigenvalues of the Q-operators acting on $\mathfrak{g}$-Whittaker functions are given by Archimedean local L-factors and the integral Q-operators should be naturally considered as elements of the Archimedean Hecke algebra $\mathcal{H}(G(\mathbb{R}), K)$, $K$ being a maximal compact subgroup of $G$. We construct the corresponding universal Baxter operator as an element of the spherical Hecke algebra $\mathcal{H}(G(\mathbb{R}), K)$.
We also describe non-Archimedean counterparts of the universal Baxter operators as elements of the non-Archimedean Hecke algebras $\mathcal{H}(GL(\ell+1, \mathbb{Q}_p), GL(\ell+1, \mathbb{Z}_p))$. The consideration of Archimedean and non-Archimedean universal Q-operators


on an equal footing provides a uniform description of the automorphic forms as their common eigenfunctions (replacing the more traditional approach based on the algebra of the invariant differential operators as a substitute for $\mathcal{H}(G(\mathbb{R}), K)$). Let us note that the connection of the Baxter operators with Archimedean L-factors implies in particular that there is a hidden parameter in the Q-operator corresponding to a choice of a finite-dimensional representation of the Langlands dual Lie algebra. In this sense, the Q-operators considered in this paper correspond to standard representations of the classical Lie algebras. We are going to consider the Q-operators corresponding to more general representations in a separate publication.

One should stress that there are various Hecke algebras relevant to the study of the quantum Toda chains. For example, for $B \subset G$ being a Borel subgroup, the Hecke algebra $\mathcal{H}(G, B)$ of $B$-biinvariant functions is closely related to the scattering data of quantum Toda chains [STS]. The Hecke algebra $\mathcal{H}(G, N)$, $N$ being the unipotent radical of $B$, also deserves consideration. Note that the representations of $\mathcal{H}(G, N)$ contain certain information about the scattering data of the theory and its center is isomorphic to $\mathcal{H}(G, K)$.

Finally let us remark that the constructions of affine integral Q-operators and their eigenvalues for the action on Whittaker functions [PG] together with the considerations of this paper imply an intriguing possibility to interpret the eigenvalues of affine Q-operators as a kind of local Archimedean L-factors. It is natural to expect that these L-factors should be connected with 2-dimensional local fields in the sense of Parshin [Pa]. We are going to discuss this fascinating possibility elsewhere.

The plan of this paper is as follows. In Sect. 2 we recall the Givental integral representation for $\mathfrak{gl}_{\ell+1}$ and introduce the Baxter Q-operator for $\mathfrak{gl}_{\ell+1}$. In Sect. 3 we consider the relation between Givental and Mellin-Barnes integral representations of $\mathfrak{gl}_{\ell+1}$-Whittaker functions and introduce the dual Baxter operator. In Sect. 4 we use the Mellin-Barnes integral representation to prove the Bump-Friedberg and Bump conjectures and discuss the relation with [St1,St2]. In Sect. 5 we identify eigenvalues of the Baxter Q-operator with local Archimedean L-factors and construct universal Baxter operators as elements of the spherical Hecke algebra $\mathcal{H}(G(\mathbb{R}), K)$. The main result of this paper is given in Theorem 5.1. We also discuss an analogy between Q-operators and certain elements of the non-Archimedean Hecke algebra $\mathcal{H}(GL(\ell+1, \mathbb{Q}_p), GL(\ell+1, \mathbb{Z}_p))$. Finally in Sect. 6 a generalization to $\mathfrak{so}_{2\ell+1}$ is given.

2. Baxter Operator for $\mathfrak{gl}_{\ell+1}$

2.1. Whittaker functions as matrix elements. Let us recall two constructions of $\mathfrak{g}$-Whittaker functions as matrix elements of infinite-dimensional representations of $\mathcal{U}(\mathfrak{g})$ and a relation of $\mathfrak{g}$-Whittaker functions with eigenfunctions of $\mathfrak{g}$-Toda quantum chains. Let us first describe the construction based on the Gauss decomposition. According to Kostant [Ko1,Ko2], the $\mathfrak{gl}_{\ell+1}$-Whittaker function can be defined as a certain matrix element in a principal series representation of $G = GL(\ell+1, \mathbb{R})$. Let $\mathcal{U}(\mathfrak{g})$ be the universal enveloping algebra of $\mathfrak{g} = \mathfrak{gl}_{\ell+1}$ and let $V$, $V'$ be $\mathcal{U}(\mathfrak{g})$-modules, dual with respect to a non-degenerate invariant pairing $\langle \cdot\,, \cdot \rangle : V' \times V \to \mathbb{C}$, $\langle v', X v \rangle = -\langle X v', v \rangle$ for all $v \in V$, $v' \in V'$ and $X \in \mathfrak{g}$. Let $B_- = N_- A M$ and $B_+ = A M N_+$ be Langlands decompositions of opposite Borel subgroups. Here $N_\pm$ are unipotent radicals of $B_\pm$, $A$ is the identity component of the vector Cartan subgroup and $M$ is the intersection of the centralizer of the vector Cartan subalgebra with the maximal compact subgroup $K \subset G$. We will assume that the actions of the Borel subalgebras $\mathfrak{b}_+ = \mathrm{Lie}(B_+)$ on $V$ and $\mathfrak{b}_- = \mathrm{Lie}(B_-)$


on $V'$ are integrated to the actions of the corresponding subgroups. Let $\chi_\pm : \mathfrak{n}_\pm \to \mathbb{C}$ be the characters of $\mathfrak{n}_\pm$ defined by $\chi_+(e_i) := -1$ and $\chi_-(f_i) := -1$ for all $i = 1, \ldots, \ell$, where $e_i$, $f_i$ are generators of $\mathfrak{n}_+$ and $\mathfrak{n}_-$ which correspond to the simple roots. A vector $\psi_R \in V$ is called a Whittaker vector with respect to $\chi_+$ if
\[ e_i \psi_R = -\psi_R , \qquad i = 1, \ldots, \ell , \tag{2.1} \]
and a vector $\psi_L \in V'$ is called a Whittaker vector with respect to $\chi_-$ if
\[ f_i \psi_L = -\psi_L , \qquad i = 1, \ldots, \ell . \tag{2.2} \]

One defines a Whittaker model V as a space of functions on G such that f (ng) = χ N+ (n) f (g), n ∈ N+ , χ N+ (n) = χ+ (log n). The U(g)-module admits a Whittaker model with respect to the character χ if it is equivalent to a sub-representation of V. Let Vλ = Ind G B χλ be a principal series representation of G induced from the generic character χλ of B trivial on N ⊂ B with λ = (λ1 , . . . , λ+1 ). It is realized in the space of functions f ∈ C ∞ (G) satisfying equation f (bg) = χλ (b) f (g), where b ∈ B. The action of G is given by the right action πλ (g) f (x) = f (xg −1 ). We U(g) will be interested in the infinitesimal form IndU(b) χλ of this representation given by (X f )(g) =

d f (ge−t X )|t→0 . dt

Define the (g, B)-module as a g-module such that the action of the Borel subalgebra b ⊂ g is integrated to the action of the Borel subgroup B, b = Lie(B). Consider an (0) irreducible (g, B)-submodule Vλ of Vλ given by the Schwartz space S(N− ) of functions on N− exponentially decreasing at infinity with all their derivatives. This (g, B)-module always admits a Whittaker model. Below we will denote by ψ L , ψ R the Whittaker vectors (0) in Vλ and its dual. Following Kostant [Ko1,Ko2] ( see also [Et] for a recent discussion) we define a g-Whittaker function in terms of the invariant pairing of Whittaker modules as follows: gl+1

λ

(x) = e−ρ,x ψ L , πλ (eh x ) ψ R ,

x ∈ h,

(2.3)

+1 where h x := i=1 ωi , x h i , ωi is a basis of fundamental weights of g, ρ = 1/2 α>0 α and πλ (eh x ) is an action of eh x in the representation Vλ . It was shown in [Ko1] that g-Whittaker function is a common eigenfunction of a complete set of commuting Hamiltonians of the g-Toda chain. A complete set of commuting Hamiltonians of the g-Toda chain is generated by the differential operators Hk ∈ Diff(h), k = 1, . . . , +1 on the Cartan subalgebra h defined in terms of the generators {ck } of the center Z(g) ⊂ U(g) as follows: gl+1

Hk λ

(x) = e−ρ,x ψ L , πλ (eh x ) ck ψ R .

(2.4)

More explicitly one has gl+1

λ

(x) = e−

+1

i=1 xi ρi

ψ L , πλ (e

+1

i=1 xi E i,i

) ψ R ,

(2.5)


where ρ j = 2 + 1 − j, j = 1, . . . , + 1 are the components of ρ in the standard basis of R+1 , x = (x1 , . . . , x+1 ) and E i, j are the standard generators of U(gl+1 ). The linear and quadratic Hamiltonians in this case are given by +1 ∂ , ∂ xi

(2.6)

+1 ∂2 + e xi −xi+1 . 2 ∂ x i i=1 i=1

(2.7)

gl

H1 +1 = −ı

i=1

1 gl H˜ 2 +1 = − 2

Let us introduce a generating function for gl+1 -Toda chain Hamiltonians as t gl+1 (λ) =

+1

gl+1

(−1) j λ+1− j H j

(x, ∂x ),

(2.8)

j=1

gl

gl

gl

where H˜ 2 +1 = 21 (H1 +1 )2 − H2 +1 . Then the gl+1 -Whittaker function satisfies the following equation gl+1

t gl+1 (λ) λ

(x) =

+1

gl+1

(λ − λ j ) λ

(x),

(2.9)

j=1

where λ = (λ1 , . . . , λ+1 ) and x = (x1 , . . . , x+1 ). The appropriately normalized gl+1 -Whittaker function (2.3) is a solution of Eqs. (2.9) invariant with respect to the actions of the Weyl group W = S+1 given by s : λi → λs(i) , s ∈ W . The W -invariant gl+1 - Whittaker functions provide a basis of W -invariant functions in R+1 (see e.g. [STS,KL2]). Theorem 2.1. For the properly normalized W -invariant gl+1 -Whittaker functions the following orthogonality and completeness relations hold

gl

R+1

=

gl+1

λ +1 (x) ν

(x)

+1

dx j

j=1

1 δ (+1) (λ − w(ν)), (+1) ( + 1)! µ (λ)

(2.10)

w∈W

R+1

+1 gl+1 gl+1 (+1) λ (x) λ (y)µ (λ) dλ j j=1

= δ (+1) (x − y),

(2.11)

where µ(+1) (λ) =

1 1 .

(ıλk − ıλ j ) (2π )+1 ( + 1)! j=k

(2.12)


There exists another construction of gl+1 -Whittaker functions that uses a pairing of the spherical vector (i.e. a vector invariant with respect to the maximal compact subgroup K = S O( + 1, R) of G L( + 1, R)) and a Whittaker vector (see e.g. [J,Ha]). Consider the following function: gl λ +1 (g) = e−ρ(g) φ K , πλ (g) ψ R ,

(2.13)

where ρ(g) is given by ρ(kan) = ρ, log a, φ K is a spherical vector in Vλ , φ K (bgk) = χλ (b)φ K (g), k ∈ K , b ∈ B+ .

(2.14)

gl The function λ +1 (g) defined by (2.13) satisfies the functional equation gl ˜ gl+1 (g), k ∈ K , n ∈ N− , λ +1 (kgn) = χ˜ N− (n) λ

(2.15)

where χ˜ N− (n) = exp(2 j=1 n j+1, j ). Thus (2.13) descends to a function on the space A of the diagonal matrices a = diag(e x˜1 , . . . , e x˜+1 ) entering the Iwasawa decomposition K AN− → G L( + 1, R). We fix a normalization of the matrix element so that the function (2.13) is W -invariant. The resulting function on A is related to the gl+1 -Whittaker function (2.3) by a simple redefinition of the variables. gl+1

Lemma 2.1. The following relation between λ

gl (x) and ˜ +1 (x) ˜ holds: λ

gl gl+1 (x) ˜ = λ +1 (x), ˜

(2.16)

λ

where x˜ = (x˜1 , . . . , x˜+1 ), λ˜ = (λ˜ 1 , . . . , λ˜ +1 ) are expressed through x = (x1 , . . . , x+1 ), λ = (λ1 , . . . , λ+1 ) as follows x˜ j =

1 xj, 2

λ˜ j = 2λ j .

2.2. Recursive and Baxter operators. The following integral representation for gl+1 Whittaker function was introduced by Givental [Gi] (see also [JK]). Theorem 2.2. gl+1 -Whittaker functions (2.5) admit an integral representation

gl

λ1+1 ,...,λ+1 (x 1 , . . . , x +1 ) = where F

gl+1

(x) = ı

+1

λk

(+1) R 2

k

k=1

−

k

i=1

k k=1 i=1

and xi := x+1,i , i = 1, . . . , + 1.

gl +1 (x) d xk,i eF ,

(2.17)

k=1 i=1

xk,i −

k−1

xk−1,i

i=1

e xk+1,i −xk,i + e xk,i −xk+1,i+1 ,

(2.18)


The interpretation of the Givental integral formula as a matrix element (2.5) was first obtained in [GKLO], where it was also noted that the integral representation (2.17) of the gl+1 -Whittaker function has a recursive structure over the rank of the Lie algebra gl+1 . gl

Corollary 2.1. The following integral operators Q glk+1 provide a recursive construction k of gl+1 -Whittaker functions:

gl

λ1+1 ,...,λ+1 (x +1 ) =

R

i=1

gl

gl

d x,i Q gl+1 (x +1 , x |λ+1 )λ1,...,λ (x ),

(2.19)

gl

Q gl+1 (x +1 , x |λ+1 ) +1

x+1,i −x,i x,i −x+1,i+1 = exp ıλ+1 e x+1,i − x,i − +e , i=1

i=1

(2.20)

i=1

gl

where x k = (xk,1 , . . . , xk,k ) and we assume that Q gl1 (x11 |λ1 ) = eıλ1 x1,1 . 0

Definition 2.1. nel

Baxter operator Qgl+1 (λ) for gl+1

is an integral operator with the ker-

Qgl+1 (x, y| λ) +1

= exp ıλ e xi −yi + e yi −xi+1 − e x+1 −y+1 , (xi − yi ) − i=1

(2.21)

i=1

where we assume xi := x+1,i and yi := y+1,i . Note that the Baxter operator defined above is non-trivial even for gl1 . Theorem 2.3. The Baxter operator Qgl+1 (λ) satisfies the following identities: Qgl+1 (λ) · Qgl+1 (λ ) = Qgl+1 (λ ) · Qgl+1 (λ), gl

gl

(2.22)

Qgl+1 (γ )Q gl+1 (λ) = (ıγ − ıλ) Q gl+1 (λ)Qgl (γ ),

(2.23)

Qgl+1 (λ) · T gl+1 (λ ) = T gl+1 (λ ) · Qgl+1 (λ),

(2.24)

Qgl+1 (λ − ı) = ı +1 T gl+1 (λ) Qgl+1 (λ),

(2.25)

T gl+1 (x, y|λ) = t gl+1 (x, ∂x |λ)δ +1 (x − y),

(2.26)

where

t gl+1 (x, ∂x |λ) =

+1 gl (−1) j λ+1− j H j +1 (x, ∂x ). j=1

(2.27)


Proof. The commutativity of Q-operators

Qgl+1 (y, x|λ) Qgl+1 (x, z|λ )

R+1

=

R+1

+1

dx j

(2.28)

dx j,

(2.29)

j=1

Qgl+1 (y, x|λ ) Qgl+1 (x, z|λ)

+1 j=1

is proved using the following change of variables xi :

x1 −→ −x1 + z 1 + ln e y1 + e z 2 ,

xi −→ −xi − ln e−yi−1 + e−zi + ln e yi + e zi+1 ,

x+1 −→ −x+1 + y+1 − ln e−y + e−z +1 .

1 < i ≤ ,

The proof of (2.23) is similar to the proof of the commutativity (2.22). The commutation relations (2.24) and the difference equation (2.25) then easily follow from (2.23) and (2.10), (2.11). Corollary 2.2. The following relation holds:

+1

R+1

gl+1

d xi Qgl+1 (y, x| γ ) λ

(x) =

i=1

+1

gl+1

(ıγ − ıλi ) λ

(y),

(2.30)

i=1

where x = (x1 , . . . , x+1 ), y = (y1 , . . . , y+1 ) and λ = (λ1 , . . . , λ+1 ). Finally let us provide an expression for the kernel of the Baxter Q-operator in the parametrization naturally arising in the construction of gl+1 -Whittaker functions using Iwasawa decomposition (see (2.13) and Lemma 2.1). Let Q˜ gl+1 (x˜ , y˜ |λ˜ ) be defined by Q˜ gl+1 (x˜ , y˜ |λ˜ ) +1

= 2+1 exp ı λ˜ e2(x˜k − y˜k ) + e2( y˜k −x˜k+1 ) − e2(x˜+1 − y˜+1 ) . (x˜i − y˜i ) − i=1

k=1

Proposition 2.1. The following relation holds:

+1

R+1 i=1

gl+1 ( y˜ , x˜ | γ˜ ) ˜ gl+1 (x) d x˜i Q ˜ = ˜ λ

+1 ı γ˜ − ı λ˜i gl+1 ˜

( y˜ ). λ˜ 2

(2.31)

i=1

3. Givental versus Mellin-Barnes Integral Representations An important property of the Givental integral representation is its recursive structure with respect to the rank of the Lie algebra. There is another integral representation [KL1] for gl+1 -Whittaker functions generalizing the Mellin-Barnes integral representation for low ranks. This representation also has a recursive structure. Its interpretation in terms of representation theory uses the Gelfand-Zetlin construction of a maximal commutative


subalgebra in $U(gl_{\ell+1})$ [GKL]. In this section we compare the recursive structures of the Givental and Mellin-Barnes representations and demonstrate that these two integral representations should be considered as dual to each other. We propose the construction of the dual Baxter operator based on Mellin-Barnes integral representations. We also construct a family of new integral representations interpolating between the Givental and Mellin-Barnes representations. Finally we introduce a symmetric recursive construction of $gl_{\ell+1}$-Whittaker functions such that the corresponding recursive operator is expressed through the Baxter and dual Baxter operators. The Givental and Mellin-Barnes integral representations are then obtained from the symmetric integral representations by simple manipulations.

Let us first recall the Mellin-Barnes integral representation of $gl_{\ell+1}$-Whittaker functions [KL1].

Theorem 3.1. The following integral representation of the $gl_{\ell+1}$-Whittaker function holds:

$$\Psi^{gl_{\ell+1}}_{\lambda}(x) = \int_{S}\prod_{n=1}^{\ell}\frac{\prod_{k=1}^{n}\prod_{m=1}^{n+1}\Gamma(\imath\gamma_{n+1,m}-\imath\gamma_{nk})}{(2\pi)^n\,n!\,\prod_{s\neq p}\Gamma(\imath\gamma_{ns}-\imath\gamma_{np})}\; e^{\imath\sum_{n=1}^{\ell+1}\sum_{j=1}^{n}(\gamma_{nj}-\gamma_{n-1,j})x_n}\prod_{n=1}^{\ell}\prod_{j\le n}d\gamma_{nj}, \tag{3.1}$$

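where $\lambda = (\lambda_1,\dots,\lambda_{\ell+1}) := (\gamma_{\ell+1,1},\dots,\gamma_{\ell+1,\ell+1})$, $x = (x_1,\dots,x_{\ell+1})$ and the domain of integration $S$ is defined by the conditions $\min_j\{\mathrm{Im}\,\gamma_{kj}\} > \max_m\{\mathrm{Im}\,\gamma_{k+1,m}\}$ for all $k = 1,\dots,\ell$. Recall that we assume $\gamma_{nj} = 0$ for $j > n$.

Corollary 3.1. The following recursive relation holds:

$$\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(x_1,\dots,x_{\ell+1}) = \int_{S_\ell}\widehat Q^{gl_{\ell+1}}_{gl_\ell}(\underline\gamma_{\ell+1},\underline\gamma_\ell|x_{\ell+1})\,\Psi^{gl_\ell}_{\underline\gamma_\ell}(x_1,\dots,x_\ell)\,\mu^{(\ell)}(\underline\gamma_\ell)\prod_{j=1}^{\ell}d\gamma_{\ell,j}, \tag{3.2}$$

where

$$\widehat Q^{gl_{\ell+1}}_{gl_\ell}(\underline\gamma_{\ell+1},\underline\gamma_\ell|x_{\ell+1}) = e^{\imath\big(\sum_{j=1}^{\ell+1}\gamma_{\ell+1,j}-\sum_{k=1}^{\ell}\gamma_{\ell,k}\big)x_{\ell+1}}\prod_{k=1}^{\ell}\prod_{m=1}^{\ell+1}\Gamma(\imath\gamma_{\ell+1,m}-\imath\gamma_{\ell,k}), \tag{3.3}$$

the measure $\mu^{(\ell)}(\underline\gamma_\ell)$ is defined by (2.12) and $\underline\gamma_k = (\gamma_{k,1},\dots,\gamma_{k,k})$. We set $\Psi^{gl_1}_{\gamma_{1,1}}(x_1) = e^{\imath\gamma_{1,1}x_1}$. The domain of integration $S_\ell$ is defined by the conditions $\min_j\{\mathrm{Im}\,\gamma_{\ell,j}\} > \max_m\{\mathrm{Im}\,\gamma_{\ell+1,m}\}$.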

We call the integral operator in Corollary 3.1 the Mellin-Barnes recursive operator. Let us stress that the recursive structure of the Mellin-Barnes integral representation of $gl_{\ell+1}$-Whittaker functions is dual to that of the Givental integral representation. Indeed, the Givental recursive operator $Q^{gl_{\ell+1}}_{gl_\ell}$ depends on an additional “spectral” variable $\lambda_{\ell+1}$ and acts in the space of functions of the “coordinate” variables $x$, while the dual Mellin-Barnes recursive operator $\widehat Q^{gl_{\ell+1}}_{gl_\ell}$ depends on the additional “coordinate” variable $x_{\ell+1}$ and acts in the space of functions of the “spectral” variables $\gamma$. Using the orthogonality and completeness relations (2.10), (2.11) one can show that these two operators are related by conjugation by the integral operator with the kernel $\Psi^{gl_\ell}_{\underline\gamma_\ell}(\underline x_\ell)$.


A. Gerasimov, D. Lebedev, S. Oblezin

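Proposition 3.1. The following integral representation for the kernel of the recursive operator $\widehat Q^{gl_{\ell+1}}_{gl_\ell}$ holds:

$$\widehat Q^{gl_{\ell+1}}_{gl_\ell}(\underline\gamma_{\ell+1},\underline\gamma_\ell|x_{\ell+1,\ell+1}) = \int_{\mathbb R^{\ell}}\prod_{j=1}^{\ell}dx_{\ell+1,j}\;\overline{\Psi}^{gl_\ell}_{\underline\gamma_\ell}(\underline x'_{\ell+1})\,\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1})$$
$$= \int_{\mathbb R^{\ell}}\prod_{j=1}^{\ell}dx_{\ell+1,j}\int_{\mathbb R^{\ell}}\prod_{k=1}^{\ell}dx_{\ell,k}\;\overline{\Psi}^{gl_\ell}_{\underline\gamma_\ell}(\underline x'_{\ell+1})\,Q^{gl_{\ell+1}}_{gl_\ell}(\underline x_{\ell+1},\underline x_\ell|\gamma_{\ell+1,\ell+1})\,\Psi^{gl_\ell}_{\underline\gamma'_{\ell+1}}(\underline x_\ell), \tag{3.4}$$

where $\underline x_k = (x_{k,1},\dots,x_{k,k})$, $\underline x'_k = (x_{k,1},\dots,x_{k,k-1})$.

In view of the above duality for the recursive operators it is natural to introduce an operator dual to the Baxter Q-operator.

Definition 3.1. The dual Baxter operator $\widehat{\mathcal Q}^{gl_{\ell+1}}(z)$ is an integral operator with the kernel

$$\widehat{\mathcal Q}^{gl_{\ell+1}}(\underline\gamma_{\ell+1},\underline\beta_{\ell+1}|z) = e^{\imath z\big(\sum_{i=1}^{\ell+1}\gamma_{\ell+1,i}-\sum_{j=1}^{\ell+1}\beta_{\ell+1,j}\big)}\prod_{i=1}^{\ell+1}\prod_{j=1}^{\ell+1}\Gamma(\imath\gamma_{\ell+1,i}-\imath\beta_{\ell+1,j}), \tag{3.5}$$

acting on the space of functions of $\underline\gamma = (\gamma_1,\dots,\gamma_{\ell+1})$ as

$$\widehat{\mathcal Q}^{gl_{\ell+1}}(z)\cdot F(\underline\gamma) = \int_{S_{\ell+1}}\widehat{\mathcal Q}^{gl_{\ell+1}}(\underline\gamma,\underline{\tilde\gamma}|z)\,F(\underline{\tilde\gamma})\,\mu^{(\ell+1)}(\underline{\tilde\gamma})\prod_{j=1}^{\ell+1}d\tilde\gamma_j. \tag{3.6}$$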

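Proposition 3.2. The $gl_{\ell+1}$-Whittaker function satisfies the following relation:

$$\widehat{\mathcal Q}^{gl_{\ell+1}}(z)\cdot\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1}) = e^{-e^{(x_{\ell+1,\ell+1}-z)}}\,\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1}). \tag{3.7}$$

Proof. We should prove that

$$\int_{S_{\ell+1}}e^{\imath z\sum_{i=1}^{\ell+1}(\lambda_{\ell+1,i}-\gamma_{\ell+1,i})}\,\frac{\prod_{i,j=1}^{\ell+1}\Gamma(\imath\lambda_{\ell+1,i}-\imath\gamma_{\ell+1,j})}{\prod_{i\neq j}\Gamma(\imath\gamma_{\ell+1,j}-\imath\gamma_{\ell+1,i})}\,\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1})\prod_{j=1}^{\ell+1}d\gamma_{\ell+1,j} = (2\pi)^{\ell+1}(\ell+1)!\;e^{-e^{(x_{\ell+1,\ell+1}-z)}}\,\Psi^{gl_{\ell+1}}_{\underline\lambda_{\ell+1}}(\underline x_{\ell+1}). \tag{3.8}$$

Due to the orthogonality condition (2.10) this is equivalent to the following:

$$\int_{\mathbb R^{\ell+1}}e^{-e^{(x_{\ell+1,\ell+1}-z)}}\,\overline{\Psi}^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1})\,\Psi^{gl_{\ell+1}}_{\underline\lambda_{\ell+1}}(\underline x_{\ell+1})\prod_{j=1}^{\ell+1}dx_{\ell+1,j} = e^{\imath z\sum_{i=1}^{\ell+1}(\lambda_{\ell+1,i}-\gamma_{\ell+1,i})}\prod_{i=1}^{\ell+1}\prod_{j=1}^{\ell+1}\Gamma(\imath\lambda_{\ell+1,i}-\imath\gamma_{\ell+1,j}). \tag{3.9}$$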
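As a quick plausibility check of (3.9), one can look at the rank-zero case $\ell = 0$, where $\Psi^{gl_1}_{\gamma}(x) = e^{\imath\gamma x}$ and the identity reduces to a Mellin-type integral for the Gamma function. The sketch below assumes the continuation $s = \imath(\lambda-\gamma) > 0$ (real), so that the identity reads $\int_{\mathbb R} e^{sx - e^{x-z}}dx = e^{sz}\Gamma(s)$; the function names and the quadrature parameters are illustrative choices, not part of the paper.

```python
import math

def mellin_lhs(s, z, n=200_000):
    """Trapezoid estimate of the integral of exp(s*x - exp(x - z)) over the real line.

    Assumption: s > 0 is real (the continuation s = i*(lambda - gamma)),
    so the integrand decays at both ends of the truncated interval.
    """
    lo, hi = z - 60.0, z + 6.0           # truncation window around x = z
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(s * x - math.exp(x - z))
    return total * h

def mellin_rhs(s, z):
    """Right-hand side of the rank-zero identity: exp(s*z) * Gamma(s)."""
    return math.exp(s * z) * math.gamma(s)

if __name__ == "__main__":
    for s, z in [(0.7, 0.3), (1.5, -1.0), (2.0, 0.0)]:
        print(s, z, mellin_lhs(s, z), mellin_rhs(s, z))
```

For higher rank the same mechanism is at work: the only place where $z$ and $x_{\ell+1,\ell+1}$ enter together is a one-dimensional integral of this type.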


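Using the recursive relation (3.2) one can rewrite this as

$$\int_{\mathbb R^{\ell+1}\times S_\ell\times S_\ell}\prod_{j=1}^{\ell+1}dx_{\ell+1,j}\prod_{j=1}^{\ell}d\lambda_{\ell,j}\prod_{j=1}^{\ell}d\gamma_{\ell,j}\;e^{\imath x_{\ell+1,\ell+1}\big(\sum_{i=1}^{\ell+1}(\lambda_{\ell+1,i}-\gamma_{\ell+1,i})-\sum_{k=1}^{\ell}(\lambda_{\ell,k}-\gamma_{\ell,k})\big)-e^{(x_{\ell+1,\ell+1}-z)}} \tag{3.10}$$
$$\times\;\frac{\prod_{i=1}^{\ell+1}\prod_{k=1}^{\ell}\Gamma(\imath\lambda_{\ell+1,i}-\imath\lambda_{\ell,k})\,\Gamma(\imath\gamma_{\ell,k}-\imath\gamma_{\ell+1,i})}{(2\pi)^{2\ell}(\ell!)^2\prod_{k\neq l}\Gamma(\imath\lambda_{\ell,l}-\imath\lambda_{\ell,k})\,\Gamma(\imath\gamma_{\ell,l}-\imath\gamma_{\ell,k})}\;\overline{\Psi}^{gl_\ell}_{\underline\gamma_\ell}(\underline x'_{\ell+1})\,\Psi^{gl_\ell}_{\underline\lambda_\ell}(\underline x'_{\ell+1})$$
$$= e^{\imath z\sum_{i=1}^{\ell+1}(\lambda_{\ell+1,i}-\gamma_{\ell+1,i})}\prod_{i=1}^{\ell+1}\prod_{j=1}^{\ell+1}\Gamma(\imath\lambda_{\ell+1,i}-\imath\gamma_{\ell+1,j}), \tag{3.11}$$

where $\underline x'_{\ell+1} = (x_{\ell+1,1},\dots,x_{\ell+1,\ell})$. Using the orthogonality condition (2.10) with respect to $\underline x'_{\ell+1}$ and integrating over $\underline\gamma_\ell$ we see that (3.7) is equivalent to the following:

$$\frac{1}{(2\pi)^{\ell}\,\ell!}\int_{-\infty}^{\infty}dx_{\ell+1,\ell+1}\;e^{\imath(x_{\ell+1,\ell+1}-z)\sum_{i=1}^{\ell+1}(\lambda_{\ell+1,i}-\gamma_{\ell+1,i})-e^{(x_{\ell+1,\ell+1}-z)}}\;\times\int_{S_\ell}\prod_{j=1}^{\ell}d\lambda_{\ell,j}\;\frac{\prod_{i=1}^{\ell+1}\prod_{k=1}^{\ell}\Gamma(\imath\lambda_{\ell+1,i}-\imath\lambda_{\ell,k})\,\Gamma(\imath\lambda_{\ell,k}-\imath\gamma_{\ell+1,i})}{\prod_{k\neq l}\Gamma(\imath\lambda_{\ell,l}-\imath\lambda_{\ell,k})} = \prod_{i=1}^{\ell+1}\prod_{j=1}^{\ell+1}\Gamma(\imath\lambda_{\ell+1,i}-\imath\gamma_{\ell+1,j}), \tag{3.12}$$

where the contour of integration $S_\ell$ in the above formulas is deformed so as to separate the sequences of poles going down $\{\gamma_{\ell+1,j}-\imath k,\ j = 1,\dots,\ell+1,\ k = 0,\dots,\infty\}$ from the sequences of poles going up $\{\lambda_{\ell+1,j}+\imath k,\ j = 1,\dots,\ell+1,\ k = 0,\dots,\infty\}$. We assume also that $\gamma_{\ell+1,j}\neq\lambda_{\ell+1,k}$ for any $j, k$. The sign of the exponent in (3.11) and (3.12) is fixed here by consistency with (3.9). The last identity is a simple consequence of the following integral formula due to Gustafson (see [Gu], Theorem 5.1, p. 81):

$$\frac{1}{(2\pi)^{\ell}}\int_{S_\ell}\prod_{j=1}^{\ell}d\lambda_{\ell,j}\;\frac{\prod_{i=1}^{\ell+1}\prod_{k=1}^{\ell}\Gamma(\imath\lambda_{\ell+1,i}-\imath\lambda_{\ell,k})\,\Gamma(\imath\lambda_{\ell,k}-\imath\gamma_{\ell+1,i})}{\prod_{k\neq l}\Gamma(\imath\lambda_{\ell,l}-\imath\lambda_{\ell,k})} = \ell!\;\frac{\prod_{i=1}^{\ell+1}\prod_{j=1}^{\ell+1}\Gamma(\imath\lambda_{\ell+1,i}-\imath\gamma_{\ell+1,j})}{\Gamma\Big(\sum_{i=1}^{\ell+1}\imath\lambda_{\ell+1,i}-\sum_{i=1}^{\ell+1}\imath\gamma_{\ell+1,i}\Big)}.$$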

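Proposition 3.3. The following symmetric recursive relation for $gl_{\ell+1}$-Whittaker functions holds:

$$\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1}) = e^{\imath\gamma_{\ell+1,\ell+1}x_{\ell+1,\ell+1}}\;\widehat{\mathcal Q}^{gl_\ell}(x_{\ell+1,\ell+1})\cdot\mathcal Q^{gl_\ell}(\gamma_{\ell+1,\ell+1})\,\Psi^{gl_\ell}_{\underline\gamma'_{\ell+1}}(\underline x'_{\ell+1}), \tag{3.13}$$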


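where $\underline x'_{\ell+1} = (x_{\ell+1,1},\dots,x_{\ell+1,\ell})$, $\underline\gamma'_{\ell+1} = (\gamma_{\ell+1,1},\dots,\gamma_{\ell+1,\ell})$ and the action of the Baxter operator $\mathcal Q^{gl_\ell}$ and its dual $\widehat{\mathcal Q}^{gl_\ell}$ is given by

$$\widehat{\mathcal Q}^{gl_\ell}(x_{\ell+1,\ell+1})\cdot\mathcal Q^{gl_\ell}(\gamma_{\ell+1,\ell+1})\,\Psi^{gl_\ell}_{\underline\gamma'_{\ell+1}}(\underline x'_{\ell+1}) = \int\prod_{j=1}^{\ell}d\gamma_{\ell,j}\prod_{j=1}^{\ell}dx_{\ell,j}\;\mu^{(\ell)}(\underline\gamma_\ell)\;\widehat{\mathcal Q}^{gl_\ell}(\underline\gamma'_{\ell+1},\underline\gamma_\ell|x_{\ell+1,\ell+1})\;\mathcal Q^{gl_\ell}(\underline x'_{\ell+1},\underline x_\ell|\gamma_{\ell+1,\ell+1})\,\Psi^{gl_\ell}_{\underline\gamma_\ell}(\underline x_\ell). \tag{3.14}$$

Proof. Let us start with the Mellin-Barnes recursive relation

$$\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1}) = \int\prod_{j=1}^{\ell}d\gamma_{\ell,j}\;\mu^{(\ell)}(\underline\gamma_\ell)\prod_{i=1}^{\ell+1}\prod_{j=1}^{\ell}\Gamma(\imath\gamma_{\ell+1,i}-\imath\gamma_{\ell,j})\;e^{\imath x_{\ell+1,\ell+1}\big(\sum_{i=1}^{\ell+1}\gamma_{\ell+1,i}-\sum_{i=1}^{\ell}\gamma_{\ell,i}\big)}\,\Psi^{gl_\ell}_{\underline\gamma_\ell}(\underline x'_{\ell+1}).$$

Using the properties of the Baxter operator we have

$$\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1}) = e^{\imath\gamma_{\ell+1,\ell+1}x_{\ell+1,\ell+1}}\int\prod_{j=1}^{\ell}dx_{\ell,j}\;\mathcal Q^{gl_\ell}(\underline x'_{\ell+1},\underline x_\ell|\gamma_{\ell+1,\ell+1})\int\prod_{j=1}^{\ell}d\gamma_{\ell,j}\;\mu^{(\ell)}(\underline\gamma_\ell)\prod_{i=1}^{\ell}\prod_{j=1}^{\ell}\Gamma(\imath\gamma_{\ell+1,i}-\imath\gamma_{\ell,j})\;e^{\imath x_{\ell+1,\ell+1}\big(\sum_{i=1}^{\ell}\gamma_{\ell+1,i}-\sum_{i=1}^{\ell}\gamma_{\ell,i}\big)}\,\Psi^{gl_\ell}_{\underline\gamma_\ell}(\underline x_\ell)$$
$$= e^{\imath\gamma_{\ell+1,\ell+1}x_{\ell+1,\ell+1}}\;\widehat{\mathcal Q}^{gl_\ell}(x_{\ell+1,\ell+1})\cdot\mathcal Q^{gl_\ell}(\gamma_{\ell+1,\ell+1})\,\Psi^{gl_\ell}_{\underline\gamma'_{\ell+1}}(\underline x'_{\ell+1}).$$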

Note that one can equally start with the Givental recursive relation and use the eigenvalue property (3.7) of the dual Baxter operator.

The Givental and Mellin-Barnes recursions are easily obtained from the symmetric recursion (3.13). This provides a direct and inverse transformation of the Givental representation into the Mellin-Barnes one. Moreover, this leads to a family of intermediate Givental-Mellin-Barnes representations. Indeed, to obtain $\Psi^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1})$ from $\Psi^{gl_\ell}_{\underline\gamma_\ell}(\underline x_\ell)$ one can either use the integral operator $Q^{gl_{\ell+1}}_{gl_\ell}(\underline x_{\ell+1},\underline x_\ell|\gamma_{\ell+1,\ell+1})$ or the integral operator $\widehat Q^{gl_{\ell+1}}_{gl_\ell}(\underline\gamma_{\ell+1},\underline\gamma_\ell|x_{\ell+1,\ell+1})$. This leads to the following family of mixed Mellin-Barnes-Givental integral representations of the $gl_{\ell+1}$-Whittaker function:

$$\Psi^{gl_{\ell+1}} = Q^{(\epsilon_1)}\cdot Q^{(\epsilon_2)}\cdots Q^{(\epsilon_\ell)}\,\Psi^{gl_1},\qquad \epsilon_i = L, R, \tag{3.15}$$

where $Q^{(L)}$ is the integral operator with the integral kernel $Q^{gl_{k+1}}_{gl_k}$, $Q^{(R)}$ is the integral operator with the integral kernel $\widehat Q^{gl_{k+1}}_{gl_k}$, and the integral operators act on $\gamma$- or $x$-variables depending on $\epsilon_i$. Various choices of $\{\epsilon_i\}$ in (3.15) provide various integral representations of the $gl_{\ell+1}$-Whittaker function.


4. Archimedean Factors in Rankin-Selberg Method

In this section we apply the dual recursion operator and Baxter operators discussed in the previous section to simplify calculations of the correction factors arising in the Rankin-Selberg method applied to $GL(\ell+1)\times GL(\ell+1)$ and $GL(\ell+1)\times GL(\ell)$. Note that these calculations are an important step in the proof of the functional equations for the corresponding automorphic L-functions using the Rankin-Selberg approach. Explicit expressions for these correction factors in terms of Gamma-functions were conjectured by Friedberg-Bump and Bump and proved later by Stade [St1,St2]. The proofs in [St1,St2] are based on a recursive generalization of the integral representation of $gl_{\ell+1}$-Whittaker functions, $\ell = 2$, first derived by Vinogradov and Takhtajan [VT]. The recursion in [St1,St2] changes the rank by two, $\ell-1\to\ell+1$. It was noted in [GKLO] that this recursion is basically the Givental recursion applied twice. In this section we will demonstrate that using the recursive properties of the Mellin-Barnes representation and the dual Baxter operator one can give a one-line proof of the Friedberg-Bump and Bump conjectures.

We start with a brief description of the relevant facts about automorphic L-functions, the Rankin-Selberg method and the Bump-Friedberg and Bump conjectures. For more details see e.g. [Bu,Go].

Let $\mathbb A$ be the adele ring of $\mathbb Q$ and $G$ be a reductive Lie group. An automorphic representation $\pi$ of $G(\mathbb A)$ can be characterized by an automorphic form $\phi_\pi$ such that it is an eigenfunction of any element of the global Hecke algebra $\mathcal H(G(\mathbb A))$. The global Hecke algebra can be represented as a product $\mathcal H(G(\mathbb A)) = (\otimes_p\mathcal H_p)\otimes\mathcal H_\infty$ of the local non-Archimedean Hecke algebras $\mathcal H_p = \mathcal H(G(\mathbb Q_p), G(\mathbb Z_p))$ for each prime $p$ and an Archimedean Hecke algebra $\mathcal H_\infty = \mathcal H(G(\mathbb R), K)$, where $K$ is a maximal compact subgroup in $G(\mathbb R)$.

The local Hecke algebra $\mathcal H_p$ is isomorphic to a representation ring of a simply connected complex Lie group ${}^LG_0$, Langlands dual to $G$ (e.g. $A_\ell$, $B_\ell$, $C_\ell$, $D_\ell$ are dual to $A_\ell$, $C_\ell$, $B_\ell$, $D_\ell$ respectively). For each unramified representation of $G(\mathbb Q_p)$ one can define an action of $\mathcal H_p$ such that an automorphic form $\phi_\pi$ is a common eigenfunction of all elements of $\mathcal H_p$ for all primes $p$ and thus defines a set of homomorphisms $\mathcal H_p\to\mathbb C$. Identifying local Hecke algebras with the representation ring of ${}^LG_0$ one can describe this set of homomorphisms as a set of conjugacy classes $g_p$ in ${}^LG_0$. Given a finite-dimensional representation $\rho_V: {}^LG_0\to GL(V,\mathbb C)$ one can construct an L-function corresponding to an automorphic form $\phi$ in the form of the Euler product as follows:

$$L(s,\phi,\rho_V) = {\prod_p}'\,L_p(s,\phi,\rho_V) = {\prod_p}'\,{\det}_V\big(1-\rho_V(g_p)\,p^{-s}\big)^{-1}, \tag{4.1}$$

where ${\prod_p}'$ is a product over primes $p$ such that the corresponding representation of $G(\mathbb Q_p)$ is not ramified. It is natural to complete the product by including local L-factors corresponding to Archimedean and ramified places. L-factors for ramified representations can be taken trivial. For the Archimedean place the Hecke eigenfunction property is usually replaced by the eigenfunction property with respect to the ring of invariant differential operators on $G(\mathbb R)$. The corresponding eigenvalues are described by a conjugacy class $t_\infty$ in the Lie algebra ${}^L\mathfrak g_0 = \mathrm{Lie}({}^LG_0)$. The Archimedean L-factor is given by [Se]

$$L_\infty(s,\phi,\rho_V) = \prod_{j=1}^{\ell+1}\pi^{-\frac{s-\alpha_j}{2}}\,\Gamma\Big(\frac{s-\alpha_j}{2}\Big) = {\det}_V\,\pi^{-\frac{s-\rho_V(t_\infty)}{2}}\,\Gamma\Big(\frac{s-\rho_V(t_\infty)}{2}\Big), \tag{4.2}$$


where $\rho_V(t_\infty) = \mathrm{diag}(\alpha_1,\dots,\alpha_{\ell+1})$. The complete L-function

$$\Lambda(s,\phi,\rho) = L(s,\phi,\rho)\,L_\infty(s,\phi,\rho) \tag{4.3}$$

should satisfy the functional equation of the form

$$\Lambda(1-s,\phi,\rho) = \epsilon(s,\phi,\rho)\,\Lambda(s,\phi_{\pi^\vee},\rho^\vee),$$

where the $\epsilon$-factor is of the exponential form $\epsilon(s,\phi,\rho) = AB^s$ and $\pi^\vee$, $\rho^\vee$ are dual to $\pi$, $\rho$.

In the Rankin-Selberg method one considers automorphic L-functions associated with automorphic representations of the products $G\times\tilde G$ of reductive groups. Let $\rho_V: {}^LG_0\to\mathrm{End}(V)$, $\tilde\rho_{\tilde V}: {}^L\tilde G_0\to\mathrm{End}(\tilde V)$ be finite-dimensional representations of dual groups and let $g_p\in{}^LG_0$, $\tilde g_p\in{}^L\tilde G_0$ be representatives of the conjugacy classes corresponding to automorphic forms $\phi$ and $\tilde\phi$. One defines the L-function $L(s,\pi\times\tilde\pi,\rho\times\tilde\rho)$ as follows:

$$L(s,\phi\times\tilde\phi,\rho\times\tilde\rho) = \prod_p{\det}_{V\otimes\tilde V}\big(1-\rho_V(g_p)\otimes\tilde\rho_{\tilde V}(\tilde g_p)\,p^{-s}\big)^{-1}. \tag{4.4}$$
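To make the Euler product (4.1), the Archimedean factor (4.2) and the functional equation for (4.3) concrete, the following sketch treats the simplest possible instance, which is an assumption made here for illustration and is not discussed in the text: the trivial one-dimensional representation with $\alpha = 0$, where $L(s)$ is the Riemann zeta-function, $L_\infty(s) = \pi^{-s/2}\Gamma(s/2)$ and $\epsilon = 1$. The closed-form values $\zeta(2) = \pi^2/6$ and $\zeta(-1) = -1/12$ are standard facts that are hard-coded rather than computed.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # clear all multiples of p starting from p*p
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def euler_product(s, bound):
    """Truncated Euler product over primes p <= bound of (1 - p^{-s})^{-1}."""
    value = 1.0
    for p in primes_up_to(bound):
        value *= 1.0 / (1.0 - p ** (-s))
    return value

def archimedean_factor(s, alpha=0.0):
    """One factor of (4.2): pi^{-(s - alpha)/2} * Gamma((s - alpha)/2), real s only."""
    return math.pi ** (-(s - alpha) / 2) * math.gamma((s - alpha) / 2)

# Known special values of the Riemann zeta-function (hard-coded assumptions).
zeta_2 = math.pi ** 2 / 6
zeta_minus_1 = -1.0 / 12

# Completed L-function at s = 2 and at 1 - s = -1; the functional
# equation with epsilon = 1 predicts that the two values agree.
completed_at_2 = archimedean_factor(2.0) * zeta_2
completed_at_minus_1 = archimedean_factor(-1.0) * zeta_minus_1

if __name__ == "__main__":
    print(euler_product(2.0, 100_000), zeta_2)
    print(completed_at_2, completed_at_minus_1)
```

The first printed pair compares the truncated Euler product with $\zeta(2)$; the second pair compares the completed function at $s$ and $1-s$.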

The L-function (4.4) up to a correction factor can be naturally written as an integral of the product of automorphic forms $\phi$ and $\tilde\phi$ with a simple kernel function. Given an explicit expression for the correction factor, this integral representation can be an important tool for studying analytic properties of $L(s,\phi\times\tilde\phi)$ as a function of $s$.

In the following we consider the Rankin-Selberg method in the case of $G\times\tilde G$ being either $GL(\ell+1)\times GL(\ell+1)$ or $GL(\ell+1)\times GL(\ell)$ with $\rho$ and $\tilde\rho$ being standard representations. We start with the case of $GL(\ell+1)\times GL(\ell+1)$. Consider the following zeta-integral:

$$Z(s,\phi\times\tilde\phi) = \int_{GL(\ell+1,\mathbb Q)Z^{(\ell+1)}_{\mathbb A}\backslash GL(\ell+1,\mathbb A)}\phi(g)\,\tilde\phi(g)\,E(g,s)\,dg, \tag{4.5}$$

where the Eisenstein series is

$$E(g,s) = \zeta((\ell+1)s)\sum_{\gamma\in P(\ell+1,\ell,\mathbb Z)\backslash GL(\ell+1,\mathbb Z)}f_s(\gamma g). \tag{4.6}$$

Here $Z^{(\ell+1)}_{\mathbb A}$ is the center of $GL(\ell+1,\mathbb A)$, $\zeta(s) = \sum_{n=1}^{\infty}n^{-s}$ is the Riemann zeta-function and $f_s\in\mathrm{Ind}^{GL(\ell+1,\mathbb A)}_{P(\ell+1,\ell,\mathbb A)}\delta_P^s$, where $\delta_P$ denotes the modular function of the parabolic subgroup $P(\ell+1,\ell,\mathbb A)$ of $GL(\ell+1,\mathbb A)$ with the Levi factor $GL(\ell,\mathbb A)\times GL(1,\mathbb A)$. Using the Rankin-Selberg unfolding technique, (4.5) can be represented in the form

$$Z(s,\phi\times\tilde\phi) = L(s,\phi\times\tilde\phi)\,\Psi(s,\phi\times\tilde\phi),$$

where the correction factor $\Psi(s,\phi\times\tilde\phi)$ is a convolution of two $gl_{\ell+1}$-Whittaker functions. The Bump-Friedberg conjecture proved in [St1] claims that $\Psi(s,\phi\times\tilde\phi)$ is equal to the Archimedean local L-factor.


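Theorem 4.1 (Bump-Friedberg-Stade).

$$\Psi(s,\phi\times\tilde\phi) = L_\infty(s,\phi\times\tilde\phi) = \prod_{j=1}^{\ell+1}\prod_{k=1}^{\ell+1}\pi^{-\frac{s-\alpha_j-\tilde\alpha_k}{2}}\,\Gamma\Big(\frac{s-\alpha_j-\tilde\alpha_k}{2}\Big), \tag{4.7}$$

where $\rho_V(t_\infty) = \mathrm{diag}(\alpha_1,\dots,\alpha_{\ell+1})$ and $\tilde\rho_{\tilde V}(\tilde t_\infty) = \mathrm{diag}(\tilde\alpha_1,\dots,\tilde\alpha_{\ell+1})$ correspond to the automorphic representations $\phi$ and $\tilde\phi$ as in (4.2).

The proof of the theorem can be reduced to the following identity proved by Stade (we rewrite Theorem 1.1, [St2] in our notations).

Lemma 4.1. The following integral relation holds:

$$\int_{\mathbb R^{\ell+1}}\prod_{j=1}^{\ell+1}dx_{\ell+1,j}\;e^{-e^{x_{\ell+1,\ell+1}}}\;\overline{\Psi}^{gl_{\ell+1}}_{\underline\gamma_{\ell+1}}(\underline x_{\ell+1})\,\Psi^{gl_{\ell+1}}_{\underline\lambda_{\ell+1}+\underline t}(\underline x_{\ell+1}) = \prod_{k=1}^{\ell+1}\prod_{j=1}^{\ell+1}\Gamma(\imath t+\imath\lambda_{\ell+1,k}-\imath\gamma_{\ell+1,j}), \tag{4.8}$$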

where $\underline t = (t,\dots,t)\in\mathbb R^{\ell+1}$.

Proof. The proof readily follows from the proof of Proposition 3.2.

Next we consider the Rankin-Selberg method for $GL(\ell+1)\times GL(\ell)$, $\rho$ and $\tilde\rho$ being standard representations of $GL(\ell+1)$ and $GL(\ell)$. In this case one has to study the following integral:

$$Z(s,\phi\times\tilde\phi) = \int_{GL(\ell,\mathbb Z)Z^{(\ell)}_{\mathbb A}\backslash GL(\ell,\mathbb A)}\phi\begin{pmatrix}g & \\ & 1\end{pmatrix}\tilde\phi(g)\,|\det(g)|^{s-1/2}\,dg, \tag{4.9}$$

where $Z^{(\ell)}_{\mathbb A}$ is the center of $GL(\ell,\mathbb A)$. Using the Rankin-Selberg unfolding technique, the integral (4.9) can be represented in the form

$$Z(s,\phi\times\tilde\phi) = L(s,\phi\times\tilde\phi)\,\Psi(s,\phi\times\tilde\phi),$$

where the correction factor $\Psi(s,\phi\times\tilde\phi)$ is a convolution of $gl_{\ell+1}$- and $gl_\ell$-Whittaker functions. The Bump conjecture proved in [St1] claims that $\Psi(s,\phi\times\tilde\phi)$ is equal to the Archimedean local L-factor.

Theorem 4.2 (Bump-Stade).

$$\Psi(s,\phi\times\tilde\phi) = L_\infty(s,\phi\times\tilde\phi) = \prod_{j=1}^{\ell+1}\prod_{k=1}^{\ell}\pi^{-\frac{s-\alpha_j-\tilde\alpha_k}{2}}\,\Gamma\Big(\frac{s-\alpha_j-\tilde\alpha_k}{2}\Big), \tag{4.10}$$

where $\rho_V(t_\infty) = \mathrm{diag}(\alpha_1,\dots,\alpha_{\ell+1})$ and $\tilde\rho_{\tilde V}(\tilde t_\infty) = \mathrm{diag}(\tilde\alpha_1,\dots,\tilde\alpha_\ell)$ correspond to the automorphic representations $\phi$ and $\tilde\phi$ as in (4.2).

The proof of the theorem is equivalent to the proof of the following integral identity (we rewrite Theorem 3.4, [St2] using our notations):


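Lemma 4.2.

$$\int_{\mathbb R^{\ell+1}}\prod_{j=1}^{\ell+1}dx_{\ell+1,j}\;\overline{\Psi}^{gl_\ell}_{\underline\gamma_\ell}(\underline x'_{\ell+1})\,\Psi^{gl_{\ell+1}}_{\underline\lambda_{\ell+1}+\underline t}(\underline x_{\ell+1}) = \delta\Big(\imath(\ell+1)t+\imath\sum_{i=1}^{\ell+1}\lambda_{\ell+1,i}-\imath\sum_{k=1}^{\ell}\gamma_{\ell,k}\Big)\prod_{i=1}^{\ell+1}\prod_{k=1}^{\ell}\Gamma(\imath t+\imath\lambda_{\ell+1,i}-\imath\gamma_{\ell,k}), \tag{4.11}$$

where $\underline t = (t,\dots,t)\in\mathbb R^{\ell+1}$, $\underline x'_{\ell+1} = (x_{\ell+1,1},\dots,x_{\ell+1,\ell})$ and $\delta(x)$ is the Dirac $\delta$-function.

Proof. To verify this statement we substitute into the l.h.s. of (4.11) the following recursive relation:

$$\Psi^{gl_{\ell+1}}_{\underline\lambda_{\ell+1}+\underline t}(\underline x_{\ell+1}) = \widehat Q(x_{\ell+1,\ell+1})\cdot\Psi^{gl_\ell}_{\underline\lambda_\ell}(\underline x'_{\ell+1}).$$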

Then applying the orthogonality relation from Theorem 2.1 and integrating over $x_{\ell+1,\ell+1}$ we obtain the r.h.s. of (4.11).

Let us stress that one should not expect to have expressions for $\Psi(s,\phi\times\tilde\phi)$ as products of Gamma-functions for more general cases $GL(\ell+n)\times GL(\ell)$, $n > 1$. From the point of view of the Mellin-Barnes recursive construction, $\Psi(s,\phi\times\tilde\phi)$ are the kernels of recursive operators corresponding to the change of rank $\ell\to\ell+n$ and thus are given by compositions of elementary recursive operators. This leads to general expressions for $\Psi(s,\phi\times\tilde\phi)$ in terms of the integrals of the products of Gamma-functions. Let us remark that in this paper we consider the Rankin-Selberg method as a method for studying properties of matrix elements of the natural (recursive) operators acting in the space of automorphic forms. One can expect that this point of view might be useful in the investigation of other properties of automorphic L-functions.

Let us comment on Stade's proof of Theorems 4.1, 4.2. The proof in [St1,St2] is based on the recursive relation connecting $gl_{\ell+1}$- and $gl_{\ell-1}$-Whittaker functions. Below we derive this recursion from the following form of the Givental recursion.

Proposition 4.1. The following recursive relation for $gl_{\ell+1}$-Whittaker functions holds:

$$\Psi^{gl_{\ell+1}}_{\lambda_1,\dots,\lambda_{\ell+1}}(\underline x_{\ell+1}) = \int_{\mathbb R^{\ell-1}}\prod_{i=1}^{\ell-1}dx_{\ell-1,i}\;Q^{gl_{\ell+1}}_{gl_{\ell-1}}(\underline x_{\ell+1},\underline x_{\ell-1}|\lambda_{\ell+1},\lambda_\ell)\,\Psi^{gl_{\ell-1}}_{\lambda_1,\dots,\lambda_{\ell-1}}(\underline x_{\ell-1}), \tag{4.12}$$

$$Q^{gl_{\ell+1}}_{gl_{\ell-1}}(\underline x_{\ell+1},\underline x_{\ell-1}|\lambda_{\ell+1},\lambda_\ell) = \int_{\mathbb R^{\ell}}\prod_{j=1}^{\ell}dx_{\ell,j}\,\exp\Big\{\imath\lambda_{\ell+1}\Big(\sum_{i=1}^{\ell+1}x_{\ell+1,i}-\sum_{k=1}^{\ell}x_{\ell,k}\Big)-\sum_{k=1}^{\ell}\Big(e^{x_{\ell+1,k}-x_{\ell,k}}+e^{x_{\ell,k}-x_{\ell+1,k+1}}\Big)$$
$$+\,\imath\lambda_\ell\Big(\sum_{k=1}^{\ell}x_{\ell,k}-\sum_{j=1}^{\ell-1}x_{\ell-1,j}\Big)-\sum_{k=1}^{\ell-1}\Big(e^{x_{\ell,k}-x_{\ell-1,k}}+e^{x_{\ell-1,k}-x_{\ell,k+1}}\Big)\Big\}. \tag{4.13}$$


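Proof. The recursive relation (4.12) is the Givental recursive relation (2.19) applied twice.

Theorem 4.3 (Stade). The following recursion relation for $gl_{\ell+1}$-Whittaker functions holds:

$$\Psi^{gl_{\ell+1}}_{\lambda_1,\dots,\lambda_{\ell+1}}(\underline x_{\ell+1}) = \int_{\mathbb R^{\ell-1}}\prod_{j=1}^{\ell-1}dx_{\ell-1,j}\;K_{\ell+1,\ell-1}(\underline x_{\ell+1},\underline x_{\ell-1}|\lambda_{\ell+1},\lambda_\ell)\,\Psi^{gl_{\ell-1}}_{\lambda_1,\dots,\lambda_{\ell-1}}(\underline x_{\ell-1}), \tag{4.14}$$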

where K +1, −1 (x +1 , x −1 | λ) is given by the following explicit formula: K +1, −1 (x +1 , x −1 | λ) =2

1−

+1 −1

ı(λ + λ ) +1 exp x+1,i − x−1, j 2 i=1

×

j=1

K ı(λ −λ+1 ) 2 e x+1,i + e x−1,i−1 e−x+1,i+1 + e−x−1,i .

(4.15)

i=1

Here we use the following integral representation for the Macdonald function:

$$K_\nu(y) = \int_0^\infty\frac{dt}{t}\,t^{\nu}\,e^{-y(t+t^{-1})/2}.$$

Proof. First we substitute into the expression for $K_{\ell+1,\ell-1}$ the integral representation with integration variables $t_i$ for the Macdonald functions $K_{\imath(\lambda_\ell-\lambda_{\ell+1})}$. Then we make the following change of variables $t_i$:

e−x+1,2 + e−x−1,1 , e x+1,1 −x+1,k+1 + e−x−1,k e−x+1,+1 x,k e x, tk = e , t = e , e x+1,k + e x−1,k−1 e x+1, + e x−1,−1

t1 = e

x,1

(4.16)

for k = 1, . . . , and j = 1, . . . , − 1. Thus we obtain the following identity between the kernels: gl

K +1, −1 (x +1 , x −1 | λ) = Q gl+1 (x +1 , x −1 | λ). −1

(4.17)

This reduces the Stade recursion to the Givental recursive procedure.

The appearance of the Gamma-functions both in the Mellin-Barnes integral representation of the $gl_{\ell+1}$-Whittaker functions and in the expressions for the Archimedean L-factors is not accidental. In the next section we explain this connection by relating the constructed Baxter operator with a universal Baxter operator considered as an element of the Archimedean Hecke algebra $\mathcal H(G(\mathbb R),K)$, where $K$ is a maximal compact subgroup of $G(\mathbb R)$.


5. Universal Baxter Operator

5.1. Universal Baxter operator in $\mathcal H(G(\mathbb R),K)$. In this section we will argue that the Baxter Q-operator for the $gl_{\ell+1}$-Toda chain can (and should) be considered as a realization of the universal Baxter operator considered as an element of the spherical Hecke algebra $\mathcal H(GL(\ell+1,\mathbb R),K)$, $K$ being a maximal compact subgroup of $GL(\ell+1,\mathbb R)$. We also consider non-Archimedean analogs of the universal Baxter operator as an element of a local Hecke algebra $\mathcal H(GL(\ell+1,\mathbb Q_p),GL(\ell+1,\mathbb Z_p))$. Both in the Archimedean and non-Archimedean cases the eigenvalues of the Baxter Q-operators acting on $gl_{\ell+1}$-Whittaker functions are given by the corresponding local L-factors.

Let us start with the definition of the spherical Hecke algebra $\mathcal H_\infty = \mathcal H(G(\mathbb R),K)$, where $K$ is a maximal compact subgroup of $G(\mathbb R)$. The algebra $\mathcal H_\infty$ is defined as an algebra of $K$-biinvariant functions on $G$, $\phi(g) = \phi(k_1gk_2)$, $k_1,k_2\in K$, acting by a convolution

$$\phi*f(g) = \int_G\phi(g\tilde g^{-1})\,f(\tilde g)\,d\tilde g. \tag{5.1}$$

To ensure the convergence of the integrals one usually imposes the condition of compact support on $K$-biinvariant functions. We will consider a slightly more general class of exponentially decaying functions.1

By the multiplicity one theorem [Sha], there is a unique smooth spherical vector $\langle\phi_K|$ in a principal series irreducible representation $V_\gamma = \mathrm{Ind}^G_{B_-}\chi_\gamma$. The action of a $K$-biinvariant function $\phi$ on the spherical vector $\langle\phi_K|$ in $V_\gamma$ is given by the multiplication by a character $\Lambda_\phi$ of the Hecke algebra:

$$\phi*\langle\phi_K| \equiv \int_G dg\,\phi(g^{-1})\,\langle\phi_K|\pi_\gamma(g) = \Lambda_\phi(\gamma)\,\langle\phi_K|. \tag{5.2}$$

In particular, the elements $\phi$ of the Hecke algebra should act by convolution on the Whittaker function as follows:

$$\phi*\Phi^{gl_{\ell+1}}_\gamma(g) = \Lambda_\phi(\gamma)\,\Phi^{gl_{\ell+1}}_\gamma(g),\qquad\phi\in\mathcal H_\infty. \tag{5.3}$$

Here the Whittaker function $\Phi^{gl_{\ell+1}}_\gamma$ is considered as a function on $G$ such that

$$\Phi^{gl_{\ell+1}}_\gamma(kan) = \chi_{N_-}(n)\,\Phi^{gl_{\ell+1}}_\gamma(a), \tag{5.4}$$

where $kan\in KAN_-\to G$ is the Iwasawa decomposition.

In the previous section we constructed the Baxter integral operator acting on the $gl_{\ell+1}$-Whittaker function (considered as a function on the subspace $A$ of the diagonal matrices) as

$$\mathcal Q^{gl_{\ell+1}}(\lambda)\cdot\Psi^{gl_{\ell+1}}_\gamma(x) = \prod_{j=1}^{\ell+1}\pi^{-\frac{\imath\lambda-\imath\gamma_j}{2}}\,\Gamma\Big(\frac{\imath\lambda-\imath\gamma_j}{2}\Big)\,\Psi^{gl_{\ell+1}}_\gamma(x), \tag{5.5}$$

1 This should be compared with the use of exponentially decreasing functions instead of functions with compact support in the Mathai-Quillen construction of the representative of the Thom class.


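where the kernel of the operator $\mathcal Q^{gl_{\ell+1}}(\lambda)$ is given by

$$\mathcal Q^{gl_{\ell+1}}(x,y|\lambda) = 2^{\ell+1}\exp\Big\{\imath\lambda\sum_{i=1}^{\ell+1}(x_i-y_i)-\pi\sum_{k=1}^{\ell}\Big(e^{2(x_k-y_k)}+e^{2(y_k-x_{k+1})}\Big)-\pi e^{2(x_{\ell+1}-y_{\ell+1})}\Big\}.$$

Note that here we use a parametrization of the Baxter operator naturally arising in the description of Whittaker functions in terms of the Iwasawa decomposition. In this section we will use only this type of parametrization and drop the tildes in the corresponding notations (see (2.13) and Lemma 2.1). We also take the coupling constants in the Toda chain $g_i = \pi^2$ to agree with the standard normalizations in representation theory.

Let us recall that we introduced the $gl_{\ell+1}$-Whittaker function $\Psi^{gl_{\ell+1}}_\gamma(x)$ as a matrix element multiplied by the factor $\exp(-\langle\rho,x\rangle)$ (see (2.5), (2.13)). In the construction of the universal Baxter operator it is more natural to consider a modified Whittaker function $\Phi^{gl_{\ell+1}}_\gamma$ equal to the matrix element itself:

$$\Phi^{gl_{\ell+1}}_\gamma(x) = e^{\langle\rho,x\rangle}\,\Psi^{gl_{\ell+1}}_\gamma(x). \tag{5.6}$$

Define a modified Baxter Q-operator:

$$\mathcal Q_0^{gl_{\ell+1}}(\lambda) = e^{\langle\rho,x\rangle}\,\mathcal Q^{gl_{\ell+1}}(\lambda)\,e^{-\langle\rho,x\rangle}.$$

It has the kernel

$$\mathcal Q_0^{gl_{\ell+1}}(x,y|\lambda) = 2^{\ell+1}\exp\Big\{\sum_{j=1}^{\ell+1}(\imath\lambda+\rho_j)(x_j-y_j)-\pi\sum_{k=1}^{\ell}\Big(e^{2(x_k-y_k)}+e^{2(y_k-x_{k+1})}\Big)-\pi e^{2(x_{\ell+1}-y_{\ell+1})}\Big\},$$

where $\rho\in\mathbb R^{\ell+1}$, with $\rho_j = \frac{\ell}{2}+1-j$, $j = 1,\dots,\ell+1$, and it acts on the modified Whittaker functions as follows:

$$\mathcal Q_0^{gl_{\ell+1}}(\lambda)\cdot\Phi^{gl_{\ell+1}}_\gamma(x) = \prod_{j=1}^{\ell+1}\pi^{-\frac{\imath\lambda-\imath\gamma_j}{2}}\,\Gamma\Big(\frac{\imath\lambda-\imath\gamma_j}{2}\Big)\,\Phi^{gl_{\ell+1}}_\gamma(x). \tag{5.7}$$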
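The eigenvalue in (5.5) and (5.7) can be checked by hand in the rank-zero case. The sketch below assumes $\ell = 0$, where $\rho_1 = 0$, the kernel reduces to $2\exp\{\imath\lambda(x-y)-\pi e^{2(x-y)}\}$ and the Whittaker function is $e^{\imath\gamma x}$; substituting $u = x-y$ leaves the one-dimensional integral $\int_{\mathbb R}2\,e^{su-\pi e^{2u}}du$ with $s = \imath(\lambda-\gamma)$. For the numerical comparison $s$ is taken real and positive, which is an assumption about the continuation in $\lambda-\gamma$; the function names are hypothetical.

```python
import math

def baxter_eigenvalue_numeric(s, n=200_000):
    """Trapezoid estimate of the integral of 2*exp(s*u - pi*exp(2u)) over the real line.

    Assumption: s > 0 real, standing in for s = i*(lambda - gamma).
    """
    lo, hi = -60.0, 3.0                  # the integrand is negligible outside this window
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * 2.0 * math.exp(s * u - math.pi * math.exp(2.0 * u))
    return total * h

def baxter_eigenvalue_closed(s):
    """Closed form pi^{-s/2} * Gamma(s/2), the single factor of the product in (5.5)."""
    return math.pi ** (-s / 2) * math.gamma(s / 2)

if __name__ == "__main__":
    for s in (0.8, 1.0, 2.5):
        print(s, baxter_eigenvalue_numeric(s), baxter_eigenvalue_closed(s))
```

The factor $\pi^{-s/2}$ comes entirely from the coupling constant $\pi$ in the exponent, which is why this normalization reproduces the Archimedean L-factor of (4.2).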

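We would like to find an element $\phi_{\mathcal Q_0(\lambda)}$ in $\mathcal H_\infty$ such that the following relation holds:

$$\phi_{\mathcal Q_0(\lambda)}*\Phi^{gl_{\ell+1}}_\gamma(g) = \prod_{j=1}^{\ell+1}\pi^{-\frac{\imath\lambda-\imath\gamma_j}{2}}\,\Gamma\Big(\frac{\imath\lambda-\imath\gamma_j}{2}\Big)\,\Phi^{gl_{\ell+1}}_\gamma(g), \tag{5.8}$$

and the restriction of $\phi_{\mathcal Q_0(\lambda)}$ to the subspace of functions satisfying (5.4) coincides with the operator $\mathcal Q_0^{gl_{\ell+1}}(\lambda)$. We shall call such $\phi_{\mathcal Q_0(\lambda)}$ a universal Baxter operator.

Theorem 5.1. Let $\phi_{\mathcal Q_0(\lambda)}(g)$ be a $K$-biinvariant function on $G = GL(\ell+1,\mathbb R)$ given by

$$\phi_{\mathcal Q_0(\lambda)}(g) = 2^{\ell+1}\,|\det g|^{\imath\lambda+\frac{\ell}{2}}\,e^{-\pi\,\mathrm{Tr}\,g^tg}. \tag{5.9}$$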


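i) Then the action of $\phi_{\mathcal Q_0(\lambda)}$ on the functions satisfying (5.4) descends to the action of $\mathcal Q_0^{gl_{\ell+1}}(\lambda)$ defined by (5.7);

ii) The action of $\phi_{\mathcal Q_0(\lambda)}$ on the modified Whittaker function $\Phi^{gl_{\ell+1}}_\gamma(g)$ by a convolution is given by

$$\phi_{\mathcal Q_0(\lambda)}*\Phi^{gl_{\ell+1}}_\gamma(g) = L_\infty(\lambda)\,\Phi^{gl_{\ell+1}}_\gamma(g), \tag{5.10}$$

where $L_\infty(\lambda)$ is the local Archimedean L-factor,

$$L_\infty(\lambda) = \prod_{j=1}^{\ell+1}\pi^{-\frac{\imath\lambda-\imath\gamma_j}{2}}\,\Gamma\Big(\frac{\imath\lambda-\imath\gamma_j}{2}\Big). \tag{5.11}$$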

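Proof. i) The action of the $K$-biinvariant function on $gl_{\ell+1}$-Whittaker functions is given by

$$\phi*\Phi^{gl_{\ell+1}}_\gamma(g) = \int_G d\tilde g\,\phi(g\tilde g^{-1})\,\Phi^{gl_{\ell+1}}_\gamma(\tilde g) = \int_G d\tilde g\,\phi(g\tilde g^{-1})\,\langle k|\pi_\gamma(\tilde g)|\psi_R\rangle. \tag{5.12}$$

Fix the Iwasawa decomposition $\tilde g = \tilde k\tilde a\tilde n$, $\tilde k\in K$, $\tilde a\in A$, $\tilde n\in N_-$ of a generic element $\tilde g\in G$ and let $\delta_{B_-}(\tilde a) = \det_{\mathfrak n_-}\mathrm{Ad}_{\tilde a}$. We shall use the notation $d^\times a = da\cdot\det(a)^{-1}$ for $a\in A$. We have for $a\in A$,

$$\phi*\Phi^{gl_{\ell+1}}_\gamma(a) = \int_{AN_-}d^\times\tilde a\,d\tilde n\;\delta_{B_-}(\tilde a)\,\phi(a\tilde n^{-1}\tilde a^{-1})\,\chi_{N_-}(\tilde n)\,\Phi^{gl_{\ell+1}}_\gamma(\tilde a) = \int_A d^\times\tilde a\;K_\phi(a,\tilde a)\,\Phi^{gl_{\ell+1}}_\gamma(\tilde a) \tag{5.13}$$

with

$$K_\phi(a,\tilde a) = \int_{N_-}d\tilde n\;\delta_{B_-}(\tilde a)\,\phi(a\tilde n^{-1}\tilde a^{-1})\,\chi_{N_-}(\tilde n),\qquad\chi_{N_-}(\tilde n) = \exp\Big(2\pi\imath\sum_{i=1}^{\ell}\tilde n_{i+1,i}\Big).$$

Thus to prove the first statement of the theorem we should prove the following:

$$\mathcal Q_0^{gl_{\ell+1}}(x,y|\lambda) = \int_{N_-}d\tilde n\;\delta_{B_-}(\tilde a)\,\phi_{\mathcal Q_0(\lambda)}(a\tilde n^{-1}\tilde a^{-1}|\lambda)\,\chi_{N_-}(\tilde n), \tag{5.14}$$

where

$$a = \mathrm{diag}(e^{x_1},\dots,e^{x_{\ell+1}}),\qquad\tilde a = \mathrm{diag}(e^{y_1},\dots,e^{y_{\ell+1}}),\qquad\delta_{B_-}(\tilde a) = e^{-2\langle\rho,\log\tilde a\rangle} = e^{\sum_{i>j}(y_i-y_j)}. \tag{5.15}$$

For $g = a\tilde n^{-1}\tilde a^{-1}$ we have

$$\det g = e^{\sum_{i=1}^{\ell+1}(x_i-y_i)},\qquad\mathrm{Tr}\,g^tg = \sum_{i=1}^{\ell+1}e^{2(x_i-y_i)}+\sum_{i>j}u_{ij}^2\,e^{2(x_i-y_j)}, \tag{5.16}$$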


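where $u = \tilde n^{-1}\in N_-$. Taking into account that $\chi_{N_-}(\tilde n) = \chi_{N_-}(u^{-1}) = \exp\big(-2\pi\imath\sum_{i=1}^{\ell}u_{i+1,i}\big)$, we obtain

$$\mathcal Q_0^{gl_{\ell+1}}(x,y|\lambda) = 2^{\ell+1}\int_{N_-}du\;e^{\sum_{i>j}(y_i-y_j)}\,e^{-2\pi\imath\sum_{i=1}^{\ell}u_{i+1,i}}\exp\Big\{\Big(\imath\lambda+\frac{\ell}{2}\Big)\sum_{i=1}^{\ell+1}(x_i-y_i)-\pi\sum_{i=1}^{\ell+1}e^{2(x_i-y_i)}-\pi\sum_{i>j}u_{ij}^2\,e^{2(x_i-y_j)}\Big\} \tag{5.17}$$

$$= 2^{\ell+1}\exp\Big\{\Big(\imath\lambda+\frac{\ell}{2}\Big)\sum_{i=1}^{\ell+1}(x_i-y_i)-\pi\sum_{i=1}^{\ell+1}e^{2(x_i-y_i)}\Big\}\,e^{\sum_{i>j}(y_i-y_j)}$$
$$\times\int_{\mathbb R^{\ell}}\prod_{i=1}^{\ell}du_{i+1,i}\exp\Big\{-2\pi\imath\sum_{i=1}^{\ell}u_{i+1,i}-\pi\sum_{i=1}^{\ell}u_{i+1,i}^2\,e^{2(x_{i+1}-y_i)}\Big\}\times\int\prod_{i>j+1}du_{ij}\exp\Big\{-\pi\sum_{i>j+1}u_{ij}^2\,e^{2(x_i-y_j)}\Big\}. \tag{5.18}$$

Computing the integrals by using the formula

$$\int_{-\infty}^{\infty}e^{-\imath\omega x-px^2}\,dx = \sqrt{\frac{\pi}{p}}\;e^{-\frac{\omega^2}{4p}}, \tag{5.19}$$

we readily obtain that

$$\mathcal Q_0^{gl_{\ell+1}}(x,y|\lambda) = 2^{\ell+1}\exp\Big\{\sum_{i=1}^{\ell+1}(\imath\lambda+\rho_i)(x_i-y_i)-\pi\sum_{i=1}^{\ell}\Big(e^{2(x_i-y_i)}+e^{2(y_i-x_{i+1})}\Big)-\pi e^{2(x_{\ell+1}-y_{\ell+1})}\Big\}, \tag{5.20}$$

where $\rho_j = \frac{\ell}{2}+1-j$, $j = 1,\dots,\ell+1$. This completes the proof of the first statement of the theorem.

ii) The proof of (5.10) follows from the results of Sect. 2. It is instructive to provide a direct proof of (5.10). To do so let us first recall standard facts in the theory of spherical functions (see [HC] for details). There is a general integral expression for the $K$-biinvariant function in terms of eigenvalues $\Lambda_\phi(\gamma)$ (5.3). Consider the action on the spherical functions

$$\varphi_\gamma(g) = \langle\phi_K|\pi_\gamma(g)|\phi_K\rangle, \tag{5.21}$$

normalized by the condition $\varphi_\gamma(e) = 1$. The explicit integral representation for $\varphi_\gamma(g)$ is

$$\varphi_\gamma(g) = \int_K dk\;e^{\langle h(gk),\,\imath\gamma+\rho\rangle}, \tag{5.22}$$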


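where $\int_K dk = 1$ and $h(g) = \log a$, where $g = kan\in KAN_-\to G$ is the Iwasawa decomposition. Then we have

$$\phi*\varphi_\gamma(g) = \Lambda_\phi(\gamma)\,\varphi_\gamma(g),\qquad\Lambda_\phi(\gamma) = \phi*\varphi_\gamma(e). \tag{5.23}$$

Thus the eigenvalues can be written in terms of the spherical transform as follows:

$$\Lambda_\phi(\gamma) = \int_G dg\,\phi(g^{-1})\,\varphi_\gamma(g) = 2^{-(\ell+1)}\int_{A^+}d^\times a\;\phi(a^{-1})\,\varphi_\gamma(a), \tag{5.24}$$

where we have used the Cartan decomposition $G = KA^+(M\backslash K)$ to represent the first integral as an integral over diagonal matrices. Here we define $A^+ = \exp\mathfrak a^+$, where $\mathfrak a^+$ consists of the diagonal matrices of the form $\mathrm{diag}(e^{x_1},\dots,e^{x_{\ell+1}})$, $x_1\le x_2\le\dots\le x_{\ell+1}$, and $M$ is the normalizer of $\mathfrak a$ in $K$. Notice that $2^{\ell+1} = |M|$.

Proposition 5.1. The following integral relation holds:

$$\Lambda_{\phi_{\mathcal Q_0(\lambda)}}(\gamma) = 2^{-(\ell+1)}\int_{A^+}d^\times a\;\phi_{\mathcal Q_0(\lambda)}(a^{-1})\,\varphi_\gamma(a) = \prod_{j=1}^{\ell+1}\pi^{-\frac{\imath\lambda-\imath\gamma_j}{2}}\,\Gamma\Big(\frac{\imath\lambda-\imath\gamma_j}{2}\Big), \tag{5.25}$$

where $\rho_j = \frac{\ell}{2}+1-j$, $j = 1,\dots,\ell+1$.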

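Proof. Using the integral representation (5.24), the l.h.s. of (5.25) is given by

$$\int_{K\times A^+}dk\,d^\times a\;|\det a|^{-\imath\lambda-\frac{\ell}{2}}\,e^{-\pi\,\mathrm{Tr}\,(a^ta)^{-1}}\,e^{\langle h(ak),\,\imath\gamma+\rho\rangle}. \tag{5.26}$$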

Using Cartan and Iwasawa decompositions we have t −1 dk d × a | det a|−ıλ− 2 e−πTr (a a) e K ×A+ t −1 = dk d × a dk | det k ak|−ıλ− 2 e−πTr ((k ak) (kak )) e (5.27) + K ×A ×K t −1 = 2+1 dk d × a dk | det k ak|−ıλ− 2 e−πTr ((k ak) (kak )) e K ×A+ ×M\K t −1 = 2+1 dg | det g|−ıλ− 2 e−πTr (g g) e G t 2 −1 = 2+1 dn d × a dk δ B− (a)| det a|−ıλ− 2 e−πTr (n a n) elog(a),ıγ +ρ K ×A×N− t 2 −1 = 2+1 dn d × a δ B− (a)| det a|−ıλ− 2 e−πTr (n a n) e A×N−

=

+1 j=1

π−

ıλ−γ j 2

ıλ − ıγ

j , 2

where the formula

$$\int_{-\infty}^{+\infty}dx\;e^{\nu x}\,e^{-ae^{-2x}} = \frac{1}{2}\,a^{\frac{\nu}{2}}\,\Gamma\Big(-\frac{\nu}{2}\Big)$$

was used.

The integral operator constructed above can be considered as a universal Baxter operator on matrix elements between the spherical vector and any other vector in the representation space. In particular it is easy to describe explicitly an action of the Baxter operators on the space of zonal spherical functions. In this case one obtains the Baxter operator for the Sutherland model at a particular value of the coupling constant.
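The one-dimensional integral formula used at the end of the proof of Proposition 5.1, $\int_{\mathbb R}e^{\nu x}e^{-ae^{-2x}}dx = \tfrac12a^{\nu/2}\Gamma(-\nu/2)$, follows from the substitution $u = ae^{-2x}$ and can be confirmed numerically. The sketch below assumes real parameters with $\nu < 0$ and $a > 0$ (the convergent range for real arguments); the helper names and the truncation window are illustrative.

```python
import math

def exp_integral_numeric(nu, a, n=200_000):
    """Trapezoid estimate of the integral of exp(nu*x - a*exp(-2x)) over the real line.

    Assumption: nu < 0 and a > 0, so the integrand decays at both ends.
    """
    lo, hi = -4.0, 60.0                  # double-exponential decay on the left, exp(nu*x) on the right
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(nu * x - a * math.exp(-2.0 * x))
    return total * h

def exp_integral_closed(nu, a):
    """Closed form (1/2) * a^{nu/2} * Gamma(-nu/2)."""
    return 0.5 * a ** (nu / 2) * math.gamma(-nu / 2)

if __name__ == "__main__":
    for nu, a in [(-1.5, 2.0), (-1.0, math.pi), (-3.0, 0.5)]:
        print(nu, a, exp_integral_numeric(nu, a), exp_integral_closed(nu, a))
```

With $a = \pi$ this is the mechanism that produces each factor $\pi^{-(\cdot)/2}\Gamma((\cdot)/2)$ of the Archimedean L-factor in (5.25).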

5.2. Non-Archimedean analog of Baxter operator. Let us construct a non-Archimedean analog of the universal Baxter Q-operator introduced above. In the non-Archimedean case the local Hecke algebra $\mathcal H_p = \mathcal H(GL(\ell+1,\mathbb Q_p),K_p)$, $K_p = GL(\ell+1,\mathbb Z_p)$, is defined as an algebra of the compactly supported $K_p$-biinvariant functions on $GL(\ell+1,\mathbb Q_p)$. Note that $K_p$ is a maximal compact subgroup of $GL(\ell+1,\mathbb Q_p)$. Consider a set $\{T_p^{(i)}\}$, $i = 1,\dots,(\ell+1)$, of generators of $\mathcal H(GL(\ell+1,\mathbb Q_p),K_p)$ given by the characteristic functions of the following subsets:

$$O_i = K_p\cdot\mathrm{diag}(\underbrace{p,\dots,p}_{i},1,\dots,1)\cdot K_p\subset GL(\ell+1,\mathbb Q_p). \tag{5.28}$$

The action of $T_p^{(i)}$ on functions $f\in C(G/K)$ is then given by the following integral formula:

$$(T_p^{(i)}f)(g) = \int_{O_i}f(gh)\,dh. \tag{5.29}$$

This can be considered as a convolution with the characteristic function $T_p^{(i)}$ of $O_i$. For an appropriately defined non-Archimedean $gl_{\ell+1}$-Whittaker function $W_\sigma$ [Sh,CS] one has

$$T_p^{(i)}\,W_\sigma = \mathrm{Tr}_{V_{\omega_i}}\rho_i(\sigma)\,W_\sigma, \tag{5.30}$$

where $\rho_i: GL(\ell+1,\mathbb C)\to\mathrm{End}(V_{\omega_i},\mathbb C)$, $V_{\omega_i} = \wedge^i\mathbb C^{\ell+1}$ is a representation of $GL(\ell+1,\mathbb C)$ corresponding to the fundamental weight $\omega_i$ and $\sigma$ is a conjugacy class in $GL(\ell+1,\mathbb C)$ corresponding to a non-Archimedean Whittaker function $W_\sigma$. Note that, in contrast with (5.30), the standard normalization of $T_p^{(i)}$ includes an additional factor $p^{-i(i-1)/2}$. More generally, one considers Hecke operators $T_p^{(V)}$ associated to arbitrary finite-dimensional representations $\rho_V: GL(\ell+1,\mathbb C)\to\mathrm{End}(V,\mathbb C)$ satisfying

$$T_p^{(V)}\,W_\sigma = \mathrm{Tr}_V\rho_V(\sigma)\,W_\sigma. \tag{5.31}$$

It is natural to arrange the generators of $\mathcal H_p$ into the following generating function:

$$T_p(\lambda) = \sum_{j=1}^{\ell+1}(-1)^j\,p^{-(\ell+1-j)\lambda}\,T_p^{(j)}. \tag{5.32}$$


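We introduce another generating function

$$\mathcal Q_p^{gl_{\ell+1}}(\lambda) = \sum_{n=0}^{\infty}p^{-n\lambda}\,T_p^{(S^nV)}, \tag{5.33}$$

where $V = \mathbb C^{\ell+1}$ is the standard representation of $gl_{\ell+1}(\mathbb C)$. The generating functions (5.32), (5.33) satisfy the following relations:

$$\mathcal Q_p^{gl_{\ell+1}}(\lambda)\cdot\mathcal Q_p^{gl_{\ell+1}}(\lambda') = \mathcal Q_p^{gl_{\ell+1}}(\lambda')\cdot\mathcal Q_p^{gl_{\ell+1}}(\lambda), \tag{5.34}$$
$$\mathcal Q_p^{gl_{\ell+1}}(\lambda)\cdot T_p(\lambda') = T_p(\lambda')\cdot\mathcal Q_p^{gl_{\ell+1}}(\lambda), \tag{5.35}$$
$$1 = T_p(\lambda)\cdot\mathcal Q_p^{gl_{\ell+1}}(\lambda), \tag{5.36}$$

and the operators $T_p(\lambda)$ and $\mathcal Q_p^{gl_{\ell+1}}(\lambda)$ act on the non-Archimedean analog of the Whittaker function as

$$T_p(\lambda)\,W_\sigma = {\det}_V\big(1-p^{-\lambda}\rho_V(\sigma)\big)\,W_\sigma, \tag{5.37}$$
$$\mathcal Q_p^{gl_{\ell+1}}(\lambda)\,W_\sigma = {\det}_V\big(1-p^{-\lambda}\rho_V(\sigma)\big)^{-1}\,W_\sigma. \tag{5.38}$$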
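At the level of eigenvalues, (5.36)-(5.38) amount to the classical identity between elementary and complete homogeneous symmetric polynomials: the trace of $\sigma$ on $\wedge^jV$ is $e_j$ of its eigenvalues, the trace on $S^nV$ is $h_n$, and $\sum_j(-t)^je_j\cdot\sum_nt^nh_n = 1$ with $t = p^{-\lambda}$. The sketch below illustrates this for a hypothetical diagonal $\sigma$ with real eigenvalues chosen so that the series converges quickly; the names and sample values are not from the paper.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def trace_exterior_power(eigs, j):
    """Trace of sigma on the j-th exterior power: elementary symmetric polynomial e_j."""
    return sum(prod(c) for c in combinations(eigs, j))

def trace_symmetric_power(eigs, n):
    """Trace of sigma on the n-th symmetric power: complete homogeneous polynomial h_n."""
    return sum(prod(c) for c in combinations_with_replacement(eigs, n))

def t_eigenvalue(eigs, t):
    """det(1 - t*sigma), as in (5.37), with t = p^{-lambda}."""
    return sum((-t) ** j * trace_exterior_power(eigs, j) for j in range(len(eigs) + 1))

def q_eigenvalue(eigs, t, terms=60):
    """Truncated series sum_n t^n Tr_{S^n V}(sigma), approximating det(1 - t*sigma)^{-1} as in (5.38)."""
    return sum(t ** n * trace_symmetric_power(eigs, n) for n in range(terms))

if __name__ == "__main__":
    sigma_eigs = (0.5, -0.3, 0.8)        # hypothetical eigenvalues of sigma in GL(3, C)
    t = 2.0 ** -1                        # t = p^{-lambda} with p = 2, lambda = 1
    print(t_eigenvalue(sigma_eigs, t) * q_eigenvalue(sigma_eigs, t))
    print(q_eigenvalue(sigma_eigs, t), prod(1.0 / (1.0 - t * x) for x in sigma_eigs))
```

The truncation at 60 terms is adequate here because $|t\,x_i|\le 0.4$ for the sample eigenvalues; a slowly converging choice would need more terms or the closed-form product.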

Thus the eigenvalues of $\mathcal Q_p^{gl_{\ell+1}}(\lambda)$ are given by the local non-Archimedean L-factors

$$L_p(s) = {\det}_V\big(1-p^{-s}\rho_V(\sigma)\big)^{-1}, \tag{5.39}$$

where we use a more traditional notation $s := \lambda$.

Comparing (5.34), (5.35), (5.36) with (2.22), (2.24), (2.25) one can see that the $gl_{\ell+1}$ Baxter Q-operator appears quite similar to the generating function $\mathcal Q_p^{gl_{\ell+1}}(\lambda)$ in the Hecke algebra $\mathcal H(GL(\ell+1,\mathbb Q_p),K_p)$ and the analog of $T_p(\lambda)$ is given by (2.26). In particular both operators share the property that their eigenvalues are given by local L-factors.

One can represent Archimedean and non-Archimedean Baxter operators in a unified form. Let us rewrite (5.33) as

$$\mathcal Q_p^{gl_{\ell+1}}(\lambda)(g) = \sum_{(n_1,\dots,n_{\ell+1})\in\mathbb Z_+^{\ell+1}}\big(p^{n_1}\cdots p^{n_{\ell+1}}\big)^{\imath\lambda}\,\delta_n(g), \tag{5.40}$$

where $n = (n_1,\dots,n_{\ell+1})$, $\delta_n(g)$ is a characteristic function of $O_n\subset GL(\ell+1,\mathbb Q_p)$,

$$O_n = K_p\cdot\mathrm{diag}(p^{n_1},\dots,p^{n_{\ell+1}})\cdot K_p. \tag{5.41}$$

On the other hand the (universal) Archimedean Baxter Q-operator (5.9) can be written in the following form:

$$\phi_{\mathcal Q_0(\lambda)}(g) = \int dt_1\cdots dt_{\ell+1}\;(t_1\cdots t_{\ell+1})^{\imath\lambda}\,e^{-\pi\sum_{i=1}^{\ell+1}t_i^2}\,\delta_t(g), \tag{5.42}$$

where $\delta_t(g)$, $t = (t_1,\dots,t_{\ell+1})$, is an appropriately defined function with the support at $O_t\subset GL(\ell+1,\mathbb R)$,

$$O_t = K\cdot\mathrm{diag}(t_1,\dots,t_{\ell+1})\cdot K. \tag{5.43}$$

The integral formulas (5.40) and (5.42) are compatible in the sense of the standard correspondence between Archimedean and non-Archimedean integrals (see e.g. [W]).


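6. Baxter Operator for $\mathfrak{so}_{2\ell+1}$

In this section we define a Baxter Q-operator for $\mathfrak{g} = \mathfrak{so}_{2\ell+1}$ and demonstrate that the relation between local L-factors and eigenvalues of Q-operators holds in this case. A more systematic discussion of the general case will be given elsewhere.

According to [Ko1], the $\mathfrak{so}_{2\ell+1}$-Whittaker function can be written in terms of the invariant pairing of Whittaker modules as follows:
$$ \Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x) = e^{-\langle\rho,x\rangle}\,\langle\psi_L,\ \pi_\lambda(e^{h_x})\,\psi_R\rangle, \qquad x\in\mathfrak{h}, \tag{6.1} $$
where $h_x := \sum_{i=1}^{\ell}\langle\omega_i,x\rangle h_i$ and $\omega_i$ is a basis of the fundamental weights of $\mathfrak{so}_{2\ell+1}$. Note that $\mathfrak{so}_{2\ell+1}$-Whittaker functions are common eigenfunctions of the complete set of the commuting $\mathfrak{so}_{2\ell+1}$-Toda chain Hamiltonians $H_{2k}^{\mathfrak{so}_{2\ell+1}} \in \mathrm{Diff}(\mathfrak{h})$, $k = 1,\ldots,\ell$, defined by
$$ H_{2k}^{\mathfrak{so}_{2\ell+1}}\,\Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x) = e^{-\langle\rho,x\rangle}\,\langle\psi_L,\ \pi_\lambda(e^{h_x})\,c_{2k}\,\psi_R\rangle, \tag{6.2} $$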

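where $\{c_{2k}\}$ are generators of the center $\mathcal{Z}(\mathfrak{so}_{2\ell+1}) \subset \mathcal{U}(\mathfrak{so}_{2\ell+1})$. For the quadratic Hamiltonian we have
$$ H_2^{\mathfrak{so}_{2\ell+1}} = -\frac{1}{2}\sum_{i=1}^{\ell}\frac{\partial^2}{\partial x_i^2} + \frac{1}{2}e^{x_1} + \sum_{i=1}^{\ell-1} e^{x_{i+1}-x_i}. \tag{6.3} $$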

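Let us introduce a generating function for the $\mathfrak{so}_{2\ell+1}$-Toda chain Hamiltonians as
$$ t^{\mathfrak{so}_{2\ell+1}}(\lambda) = \sum_{j=1}^{\ell+1}(-1)^j\,\lambda^{2\ell+1-2j}\,H_{2j}^{\mathfrak{so}_{2\ell+1}}(x). \tag{6.4} $$
Then the $\mathfrak{so}_{2\ell+1}$-Whittaker function satisfies the following equation:
$$ t^{\mathfrak{so}_{2\ell+1}}(\lambda)\,\Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x) = \lambda\prod_{j=1}^{\ell}(\lambda^2-\lambda_j^2)\ \Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x), \tag{6.5} $$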

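where $\lambda = (\lambda_1,\ldots,\lambda_{\ell+1})$ and $x = (x_1,\ldots,x_{\ell+1})$.

Theorem 6.1. Eigenfunctions of the $\mathfrak{so}_{2\ell+1}$-Toda chain admit the integral representation:
$$ \Psi^{\mathfrak{so}_{2\ell+1}}_{\lambda_1,\ldots,\lambda_\ell}(x_{\ell,1},\ldots,x_{\ell,\ell}) = \int_{\mathbb{R}^{\ell^2}} \prod_{k=1}^{\ell-1}\prod_{i=1}^{k} dx_{k,i}\ \prod_{k=1}^{\ell}\prod_{i=1}^{k} dz_{k,i}\ e^{\mathcal{F}^{\mathfrak{so}_{2\ell+1}}(x,z)}, $$
where
$$ \begin{aligned} \mathcal{F}^{\mathfrak{so}_{2\ell+1}}(x,z) ={}& -\imath\lambda_1(x_{1,1}-2z_{1,1}) - \imath\sum_{n=2}^{\ell}\lambda_n\Big(\sum_{i=1}^{n}x_{n,i} - 2\sum_{i=1}^{n}z_{n,i} + \sum_{i=1}^{n-1}x_{n-1,i}\Big) \\ & - \sum_{n=1}^{\ell}\Big(e^{z_{n,1}} + e^{x_{n,n}-z_{n,n}}\Big) - \sum_{k=2}^{\ell}\sum_{n=k}^{\ell}\Big(e^{z_{n,k}-x_{n-1,k-1}} + e^{z_{n,k}-x_{n,k-1}}\Big) \\ & - \sum_{k=1}^{\ell-1}\sum_{n=k+1}^{\ell}\Big(e^{x_{n-1,k}-z_{n,k}} + e^{x_{n,k}-z_{n,k}}\Big), \end{aligned} \tag{6.6} $$
where we set $x_i := x_{\ell,i}$, $1\le i\le\ell$.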


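This integral representation was proposed in [GLO3] (we made an additional change of variables $z_{\ell,1} \to -z_{\ell,1} + \ln\big(e^{x_{\ell,1}} + e^{x_{\ell-1,1}}\big)$ in the integral representation given in [GLO3]).

Corollary 6.1. The following integral operators $Q^{\mathfrak{so}_{2\ell+1}}_{\mathfrak{so}_{2\ell-1}}$ provide a recursive construction of the $\mathfrak{so}_{2\ell+1}$-Whittaker function:
$$ \Psi^{\mathfrak{so}_{2\ell+1}}_{\lambda_1,\ldots,\lambda_\ell}(x_\ell) = \int_{\mathbb{R}^{\ell-1}}\prod_{i=1}^{\ell-1}dx_{\ell-1,i}\ Q^{\mathfrak{so}_{2\ell+1}}_{\mathfrak{so}_{2\ell-1}}(x_\ell,x_{\ell-1}|\lambda_\ell)\ \Psi^{\mathfrak{so}_{2\ell-1}}_{\lambda_1,\ldots,\lambda_{\ell-1}}(x_{\ell-1}), \tag{6.7} $$
where
$$ \begin{aligned} Q^{\mathfrak{so}_{2\ell+1}}_{\mathfrak{so}_{2\ell-1}}(x_\ell,x_{\ell-1}|\lambda_\ell) ={}& \int_{\mathbb{R}^{\ell}}\prod_{i=1}^{\ell}dz_{\ell,i}\ \exp\Big\{-\imath\lambda_\ell\Big(\sum_{i=1}^{\ell}x_{\ell,i} - 2\sum_{i=1}^{\ell}z_{\ell,i} + \sum_{i=1}^{\ell-1}x_{\ell-1,i}\Big)\Big\} \\ & \times\exp\Big\{-\Big(e^{z_{\ell,1}} + \sum_{i=1}^{\ell-1}\big(e^{x_{\ell-1,i}-z_{\ell,i}} + e^{z_{\ell,i+1}-x_{\ell-1,i}}\big) \\ & \qquad\quad + \sum_{i=1}^{\ell-1}\big(e^{x_{\ell,i}-z_{\ell,i}} + e^{z_{\ell,i+1}-x_{\ell,i}}\big) + e^{x_{\ell,\ell}-z_{\ell,\ell}}\Big)\Big\}. \end{aligned} \tag{6.8} $$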

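For $\ell = 1$ we set
$$ Q^{\mathfrak{so}_3}_{\mathfrak{so}_1}(x_{1,1}|\lambda_1) = \int_{\mathbb{R}}dz_{1,1}\ e^{\imath\lambda_1 x_{1,1} - 2\imath\lambda_1 z_{1,1}}\exp\Big\{-\big(e^{z_{1,1}} + e^{x_{1,1}-z_{1,1}}\big)\Big\}. \tag{6.9} $$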

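Below $\Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x)$ will always denote the unique W-invariant solution of (6.5) (class one principal series Whittaker function). Note that the space of W-invariant Whittaker functions $\Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x)$ provides a basis in the space of W-invariant functions on $\mathbb{R}^{\ell}$.

Definition 6.1. The Baxter Q-operator for $\mathfrak{so}_{2\ell+1}$ is given by
$$ \begin{aligned} Q^{\mathfrak{so}_{2\ell+1}}(y,x|\lambda) ={}& \int_{\mathbb{R}^{\ell+1}}\prod_{i=1}^{\ell+1}dz_i\ \exp\Big\{-\imath\lambda\Big(\sum_{i=1}^{\ell}y_i - 2\sum_{i=1}^{\ell+1}z_i + \sum_{i=1}^{\ell}x_i\Big)\Big\} \\ & \times\exp\Big\{-e^{z_1} - \sum_{i=1}^{\ell}\big(e^{y_i-z_i} + e^{z_{i+1}-y_i} + e^{x_i-z_i} + e^{z_{i+1}-x_i}\big)\Big\}, \end{aligned} \tag{6.10} $$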

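where $y = (y_1,\ldots,y_\ell)$ and $x = (x_1,\ldots,x_\ell)$.

Theorem 6.2. The operator $Q^{\mathfrak{so}_{2\ell+1}}(\lambda)$ satisfies the following identities:
$$ Q^{\mathfrak{so}_{2\ell+1}}(\lambda)\,Q^{\mathfrak{so}_{2\ell+1}}(\lambda') = Q^{\mathfrak{so}_{2\ell+1}}(\lambda')\,Q^{\mathfrak{so}_{2\ell+1}}(\lambda), \tag{6.11} $$
$$ Q^{\mathfrak{so}_{2\ell+1}}(\lambda)\cdot Q^{\mathfrak{so}_{2\ell+1}}_{\mathfrak{so}_{2\ell-1}}(\lambda') = \Gamma(\imath\lambda'-\imath\lambda)\,\Gamma(-\imath\lambda'-\imath\lambda)\ Q^{\mathfrak{so}_{2\ell+1}}_{\mathfrak{so}_{2\ell-1}}(\lambda')\cdot Q^{\mathfrak{so}_{2\ell-1}}(\lambda), \tag{6.12} $$
$$ Q^{\mathfrak{so}_{2\ell+1}}(\lambda)\,T^{\mathfrak{so}_{2\ell+1}}(\lambda') = T^{\mathfrak{so}_{2\ell+1}}(\lambda')\,Q^{\mathfrak{so}_{2\ell+1}}(\lambda), \tag{6.13} $$
$$ \lambda\,Q^{\mathfrak{so}_{2\ell+1}}(\lambda+\imath) = \imath^{2\ell}\,Q^{\mathfrak{so}_{2\ell+1}}(\lambda)\,T^{\mathfrak{so}_{2\ell+1}}(\lambda), \tag{6.14} $$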


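where
$$ T^{\mathfrak{so}_{2\ell+1}}(x,y|\lambda) = t^{\mathfrak{so}_{2\ell+1}}(x,\partial_x|\lambda)\,\delta^{(\ell)}(x-y), \tag{6.15} $$
$$ t^{\mathfrak{so}_{2\ell+1}}(x,\partial_x|\lambda) = \sum_{j=1}^{\ell+1}(-1)^j\,\lambda^{2\ell+1-2j}\,H_{2j}^{\mathfrak{so}_{2\ell+1}}(x,\partial_x). \tag{6.16} $$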

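Proof. We will prove the commutativity of Q-operators (6.11). The relation (6.12) can be proved using a similar approach. The other identities then easily follow. To prove (6.11) we should verify the following identity between the kernels:
$$ \int_{\mathbb{R}^{\ell+1}} Q^{\mathfrak{so}_{2\ell+1}}(y,x|\lambda)\,Q^{\mathfrak{so}_{2\ell+1}}(x,z|\lambda')\prod_{j=1}^{\ell+1}dx_j \tag{6.17} $$
$$ = \int_{\mathbb{R}^{\ell+1}} Q^{\mathfrak{so}_{2\ell+1}}(y,x|\lambda')\,Q^{\mathfrak{so}_{2\ell+1}}(x,z|\lambda)\prod_{j=1}^{\ell+1}dx_j, \tag{6.18} $$
where
$$ \begin{aligned} Q^{\mathfrak{so}_{2\ell+1}}(y,x|\lambda) ={}& \int_{\mathbb{R}^{\ell+1}}\prod_{i=1}^{\ell+1}du_i\ \exp\Big\{-\imath\lambda\Big(\sum_{i=1}^{\ell}y_i - 2\sum_{i=1}^{\ell+1}u_i + \sum_{i=1}^{\ell}x_i\Big)\Big\} \\ & \times\exp\Big\{-e^{u_1} - \sum_{i=1}^{\ell}\big(e^{y_i-u_i} + e^{u_{i+1}-y_i} + e^{x_i-u_i} + e^{u_{i+1}-x_i}\big)\Big\}, \end{aligned} \tag{6.19} $$
$$ \begin{aligned} Q^{\mathfrak{so}_{2\ell+1}}(x,z|\lambda') ={}& \int_{\mathbb{R}^{\ell+1}}\prod_{i=1}^{\ell+1}dv_i\ \exp\Big\{-\imath\lambda'\Big(\sum_{i=1}^{\ell}x_i - 2\sum_{i=1}^{\ell+1}v_i + \sum_{i=1}^{\ell}z_i\Big)\Big\} \\ & \times\exp\Big\{-e^{v_1} - \sum_{i=1}^{\ell}\big(e^{x_i-v_i} + e^{v_{i+1}-x_i} + e^{z_i-v_i} + e^{v_{i+1}-z_i}\big)\Big\}. \end{aligned} \tag{6.20} $$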

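The proof is given by the following sequence of elementary transformations. Let us first make a change of variables $u_i$ and $v_i$ in (6.17):
$$ \begin{aligned} u_1 &\to -u_1 + \ln\big(e^{y_1}+e^{x_1}\big), \\ u_i &\to -u_i - \ln\big(e^{y_{i-1}}+e^{x_{i-1}}\big) + \ln\big(e^{y_i}+e^{x_i}\big), \qquad 1<i\le\ell, \\ v_1 &\to -v_1 + \ln\big(e^{x_1}+e^{z_1}\big), \\ v_i &\to -v_i - \ln\big(e^{x_{i-1}}+e^{z_{i-1}}\big) + \ln\big(e^{x_i}+e^{z_i}\big), \qquad 1<i\le\ell. \end{aligned} $$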


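We introduce additional integration variables $u_{\ell+1}$ and $v_{\ell+1}$ in (6.17) using the integral formulas:
$$ \begin{aligned} \big(e^{-y_\ell}+e^{-x_\ell}\big)^{-2\imath\lambda'} &= \Gamma(2\imath\lambda')^{-1}\int_{\mathbb{R}}du_{\ell+1}\ \exp\Big\{2\imath\lambda' u_{\ell+1} - e^{u_{\ell+1}-y_\ell} - e^{u_{\ell+1}-x_\ell}\Big\}, \\ \big(e^{-x_\ell}+e^{-z_\ell}\big)^{-2\imath\lambda} &= \Gamma(2\imath\lambda)^{-1}\int_{\mathbb{R}}dv_{\ell+1}\ \exp\Big\{2\imath\lambda v_{\ell+1} - e^{v_{\ell+1}-x_\ell} - e^{v_{\ell+1}-z_\ell}\Big\}. \end{aligned} \tag{6.21} $$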
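Both formulas in (6.21) are instances of Euler's integral for the Gamma function. The following short derivation is a sketch under the assumptions $\mathrm{Re}\,s > 0$ and $a > 0$; for purely imaginary $s$, as needed here, the identity has to be understood by analytic continuation or a suitable regularisation, which is an assumption of this sketch and not a statement taken from the text. The symbols $s$, $a$ and $t$ are auxiliary names introduced only for this illustration.

```latex
% Substitute t = a e^{u}, so that du = dt/t and e^{su} = (t/a)^{s}:
\int_{\mathbb{R}} du\ \exp\{ s\,u - a\,e^{u} \}
  = \int_{0}^{\infty} \frac{dt}{t}\,\Big(\frac{t}{a}\Big)^{s} e^{-t}
  = a^{-s}\,\Gamma(s).
% Choosing a = e^{-y_\ell} + e^{-x_\ell} and s = 2\imath\lambda' gives
\big(e^{-y_\ell}+e^{-x_\ell}\big)^{-2\imath\lambda'}
  = \Gamma(2\imath\lambda')^{-1}
    \int_{\mathbb{R}} du_{\ell+1}\
    \exp\{ 2\imath\lambda' u_{\ell+1} - e^{u_{\ell+1}-y_\ell} - e^{u_{\ell+1}-x_\ell} \}.
```

The second formula follows in the same way with $a = e^{-x_\ell}+e^{-z_\ell}$ and $s = 2\imath\lambda$.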

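Then let us modify the variables $x_i$, $i = 1,\ldots,\ell$, as
$$ x_i \to -x_i - \ln\big(e^{-u_i}+e^{-z_i}\big) + \ln\big(e^{u_{i+1}}+e^{z_{i+1}}\big), \tag{6.22} $$
and use the following integral representations to introduce the additional variables $x_0$ and $x_{\ell+1}$:
$$ \begin{aligned} \big(e^{u_1}+e^{v_1}\big)^{-\imath(\lambda+\lambda')} &= \Gamma\big(\imath(\lambda+\lambda')\big)^{-1}\int_{\mathbb{R}}dx_0\ \exp\Big\{-\imath(\lambda+\lambda')x_0 - e^{u_1-x_0} - e^{z_1-x_0}\Big\}, \\ \big(e^{-u_{\ell+1}}+e^{-v_{\ell+1}}\big)^{\imath(\lambda+\lambda')} &= \Gamma\big(-\imath(\lambda+\lambda')\big)^{-1}\int_{\mathbb{R}}dx_{\ell+1}\ \exp\Big\{-\imath(\lambda+\lambda')x_{\ell+1} - e^{x_{\ell+1}-u_{\ell+1}} - e^{x_{\ell+1}-v_{\ell+1}}\Big\}. \end{aligned} $$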

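Now we make the following sequence of changes of the variables:
$$ \begin{aligned} u_1 &\to -u_1 - \ln\big(1+e^{-x_0}\big) + \ln\big(e^{y_1}+e^{x_1}\big), \\ u_i &\to -u_i - \ln\big(e^{y_{i-1}}+e^{x_{i-1}}\big) + \ln\big(e^{y_i}+e^{x_i}\big), \qquad 1<i\le\ell, \\ u_{\ell+1} &\to -u_{\ell+1} + x_{\ell+1} - \ln\big(e^{-y_\ell}+e^{-x_\ell}\big), \\ v_1 &\to -v_1 - \ln\big(1+e^{-x_0}\big) + \ln\big(e^{x_1}+e^{z_1}\big), \\ v_i &\to -v_i - \ln\big(e^{x_{i-1}}+e^{z_{i-1}}\big) + \ln\big(e^{x_i}+e^{z_i}\big), \qquad 1<i\le\ell, \\ v_{\ell+1} &\to -v_{\ell+1} + x_{\ell+1} - \ln\big(e^{-x_\ell}+e^{-z_\ell}\big), \\ x_0 &\to -x_0 + \ln\big(e^{u_1}+e^{z_1}\big), \\ x_i &\to -x_i - \ln\big(e^{-u_i}+e^{-z_i}\big) + \ln\big(e^{u_{i+1}}+e^{z_{i+1}}\big), \qquad 1\le i\le\ell, \\ x_{\ell+1} &\to -x_{\ell+1} - \ln\big(e^{-u_{\ell+1}}+e^{-z_{\ell+1}}\big). \end{aligned} \tag{6.23} $$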

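One integrates out the variables $x_0$ and $x_{\ell+1}$ and modifies the variables $u_i$ and $v_i$ as follows:
$$ \begin{aligned} u_1 &\to -u_1 + \ln\big(e^{y_1}+e^{x_1}\big), \\ u_i &\to -u_i - \ln\big(e^{y_{i-1}}+e^{x_{i-1}}\big) + \ln\big(e^{y_i}+e^{x_i}\big), \qquad 1<i<\ell, \\ u_\ell &\to -u_\ell - \ln\big(e^{-y_{\ell-1}}+e^{-x_{\ell-1}}\big), \\ v_1 &\to -v_1 + \ln\big(e^{x_1}+e^{z_1}\big), \\ v_i &\to -v_i - \ln\big(e^{x_{i-1}}+e^{z_{i-1}}\big) + \ln\big(e^{x_i}+e^{z_i}\big), \qquad 1<i<\ell, \\ v_\ell &\to -v_\ell - \ln\big(e^{-x_{\ell-1}}+e^{-z_{\ell-1}}\big). \end{aligned} $$
Integrating out $u_{\ell+1}$ and $v_{\ell+1}$, one completes the proof of (6.11).

Corollary 6.2. The following identity holds:
$$ \int_{\mathbb{R}^{\ell}}\prod_{i=1}^{\ell}dx_{\ell,i}\ Q^{\mathfrak{so}_{2\ell+1}}(y,x|\gamma)\ \Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(x) = \prod_{i=1}^{\ell}\Gamma(\imath\lambda_i-\imath\gamma)\,\Gamma(-\imath\lambda_i-\imath\gamma)\ \Psi_\lambda^{\mathfrak{so}_{2\ell+1}}(y). \tag{6.24} $$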

Finally let us note that this result is in agreement with the interpretation of the eigenvalues of Q-operators as local Archimedean L-functions corresponding to automorphic representations of reductive Lie groups discussed above.

Acknowledgements. The research of AG was partly supported by Science Foundation Ireland grant and the research of SO was partially supported by RF President Grant MK-134.2007.1.

References

[Ba] Baxter, R.J.: Exactly solved models in statistical mechanics. London: Academic Press, 1982
[BZ] Berenstein, A., Zelevinsky, A.: Tensor product multiplicities and convex polytopes in partition space. J. Geom. Phys. 5, 453–472 (1989)
[Bu] Bump, D.: The Rankin-Selberg method: A survey. In: Number Theory, Trace Formulas and Discrete Groups: Symposium in Honor of Atle Selberg (Oslo, Norway, July 14–21, 1987), London: Academic Press, 1989
[CS] Casselman, W., Shalika, J.: The unramified principal series of p-adic groups II. The Whittaker Function. Comp. Math. 41, 207–231 (1980)
[Et] Etingof, P.: Whittaker functions on quantum groups and q-deformed Toda operators. Amer. Math. Soc. Transl. Ser. 2, 194, Providence, RI: Amer. Math. Soc., 1999, pp. 9–25
[GKL] Gerasimov, A., Kharchev, S., Lebedev, D.: Representation Theory and Quantum Inverse Scattering Method: Open Toda Chain and Hyperbolic Sutherland Model. Int. Math. Res. Notices 17, 823–854 (2004)
[GKLO] Gerasimov, A., Kharchev, S., Lebedev, D., Oblezin, S.: On a Gauss-Givental representation for quantum Toda chain wave function. Int. Math. Res. Notices, Volume 2006, Article ID 96489, 23 p.
[GLO1] Gerasimov, A., Lebedev, D., Oblezin, S.: Givental representation for classical groups. http://arxiv.org/list/math.RT/0608152, 2006
[GLO2] Gerasimov, A., Lebedev, D., Oblezin, S.: Baxter Q-operator and Givental integral representation for C_n and D_n. http://arxiv.org/list/math.RT/0609082, 2006
[GLO3] Gerasimov, A., Lebedev, D., Oblezin, S.: New Integral Representations of Whittaker Functions for Classical Lie Groups. http://arxiv.org/abs/0705.2886, 2007
[Gi] Givental, A.: Stationary Phase Integrals, Quantum Toda Lattices, Flag Manifolds and the Mirror Conjecture. In: Topics in Singularity Theory, Amer. Math. Soc. Transl. Ser. 2, 180, Providence, RI: Amer. Math. Soc., 1997, pp. 103–115
[Go] Goldfeld, D.: Automorphic forms and L-functions for the group GL(n, R). Cambridge Studies in Adv. Math., Cambridge: Cambridge Univ. Press, 2006
[Gu] Gustafson, R.A.: Some q-beta and Mellin-Barnes integrals on compact Lie groups and Lie algebras. Trans. Amer. Math. Soc. 341:1, 69–119 (1994)
[HC] Harish-Chandra: Spherical functions on a semisimple Lie group I, II. Amer. J. Math. 80, 241–310, 553–613 (1958)
[Ha] Hashizume, M.: Whittaker functions on semi-simple Lie groups. Hiroshima Math. J. 12, 259–293 (1982)
[J] Jacquet, H.: Fonctions de Whittaker associées aux groupes de Chevalley. Bull. Soc. Math. France 95, 243–309 (1967)
[JPSS] Jacquet, H., Piatetski-Shapiro, I.I., Shalika, J.: Rankin-Selberg convolutions. Amer. J. Math. 105, 367–464 (1983)
[JS1] Jacquet, H., Shalika, J.: Rankin-Selberg convolution: the Archimedean theory. Festschrift in Honor of Piatetski-Shapiro, Part I, Providence, RI: Amer. Math. Soc., 1990, pp. 125–207
[JK] Joe, D., Kim, B.: Equivariant mirrors and the Virasoro conjecture for flag manifolds. Int. Math. Res. Notices 15, 859–882 (2003)
[K] Kac, V.: Infinite-dimensional Lie algebras. Cambridge: Cambridge University Press, 1990
[KL1] Kharchev, S., Lebedev, D.: Eigenfunctions of GL(N, R) Toda chain: The Mellin-Barnes representation. JETP Lett. 71, 235–238 (2000)
[KL2] Kharchev, S., Lebedev, D.: Integral representations for the eigenfunctions of quantum open and periodic Toda chains from QISM formalism. J. Phys. A 34, 2247–2258 (2001)
[Ko1] Kostant, B.: Quantization and representation theory. In: Representation theory of Lie groups. 34, London Math. Soc. Lecture Notes Series, London Math. Soc., 1979, pp. 287–316
[Ko2] Kostant, B.: On Whittaker vectors and representation theory. Invent. Math. 48(2), 101–184 (1978)
[Pa] Parshin, A.N.: On the arithmetic of 2-dimensional schemes. I, repartitions and residues. Russ. Math. Izv. 40, 736–773 (1976)
[PG] Pasquier, V., Gaudin, M.: The periodic Toda chain and a matrix generalization of the Bessel function recursion relation. J. Phys. A 25, 5243–5252 (1992)
[RSTS] Reyman, A.G., Semenov-Tian-Shansky, M.A.: Integrable Systems. Group theory approach. Modern Mathematics, Moscow-Igevsk: Institute Computer Sciences, 2003
[Se] Serre, J.-P.: Facteurs locaux des fonctions zêta des variétés algébriques (définitions et conjectures). Sém. Delange-Pisot-Poitou, exp. 19, 1969/70
[Sha] Shalika, J.A.: The multiplicity one theorem for GL_n. Ann. Math. 100:1, 171–193 (1974)
[Sh] Shintani, T.: On an explicit formula for class 1 Whittaker functions on GL_n over p-adic fields. Proc. Japan Acad. 52, 180–182 (1976)
[St] Stade, E.: On explicit integral formulas for GL(n, R)-Whittaker functions. Duke Math. J. 60(2), 313–362 (1990)
[St1] Stade, E.: Mellin transforms of GL(n, R) Whittaker functions. Amer. J. Math. 123, 121–161 (2001)
[St2] Stade, E.: Archimedean L-functions and Barnes integrals. Israel J. Math. 127, 201–219 (2002)
[St3] Ishii, T., Stade, E.: New formulas for Whittaker functions on GL(N, R). J. Funct. Anal. 244, 289–314 (2007)
[STS] Semenov-Tian-Shansky, M.: Quantum Toda lattices. Spectral theory and scattering. Preprint LOMI 3-84, 1984: Quantization of open Toda lattices. In "Encyclopaedia of Mathematical Sciences" 16. Dynamical systems VII. Berlin: Springer, 1994, pp. 226–259
[VT] Vinogradov, I., Takhtadzhyan, L.: Theory of Eisenstein series for the group SL(3, R) and its application to a binary problem. J. Soviet. Math. 18, 293–324 (1982)
[W] Weil, A.: Basic Number theory. Berlin: Springer, 1967

Communicated by L. Takhtajan

Commun. Math. Phys. 284, 897–918 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0575-5

Communications in

Mathematical Physics

Geometries with Killing Spinors and Supersymmetric AdS Solutions

Jerome P. Gauntlett¹,², Nakwoo Kim³

¹ Theoretical Physics Group, Blackett Laboratory, Imperial College, London SW7 2AZ, UK. E-mail: [email protected]
² The Institute for Mathematical Sciences, Imperial College, London SW7 2PE, UK
³ Department of Physics and Research Institute of Basic Science, Kyung Hee University, Seoul 130-701, Korea

Received: 23 October 2007 / Accepted: 27 March 2008 Published online: 18 July 2008 – © Springer-Verlag 2008

Abstract: The seven and nine dimensional geometries associated with certain classes of supersymmetric $AdS_3$ and $AdS_2$ solutions of type IIB and D = 11 supergravity, respectively, have many similarities with Sasaki-Einstein geometry. We further elucidate their properties and also generalise them to higher odd dimensions by introducing a new class of complex geometries in 2n + 2 dimensions, specified by a Riemannian metric, a scalar field and a closed three-form, which admit a particular kind of Killing spinor. In particular, for n ≥ 3, we show that when the geometry in 2n + 2 dimensions is a cone we obtain a class of geometries in 2n + 1 dimensions, specified by a Riemannian metric, a scalar field and a closed two-form, which includes the seven and nine-dimensional geometries mentioned above when n = 3, 4, respectively. We also consider various ansätze for the geometries and construct infinite classes of explicit examples for all n.

1. Introduction

An interesting class of geometries in seven and nine dimensions was recently discovered in [1,2] and further explored in [3]. The geometries are specified by a Riemannian metric, a scalar field B and a closed two-form F and they admit Killing spinors of a certain type. The seven dimensional geometries give rise to supersymmetric solutions of type IIB supergravity with a three dimensional anti-de-Sitter space ($AdS_3$) factor and these are dual to supersymmetric conformal field theories (SCFTs) with (0, 2) supersymmetry in two dimensions. Similarly, the nine-dimensional geometries give rise to supersymmetric solutions of D = 11 supergravity with $AdS_2$ factors and these are dual to superconformal quantum mechanics with two supercharges.

This geometry in 2n + 1 dimensions, with n = 3, 4, is strikingly similar to Sasaki-Einstein geometry. In particular, they both have a Killing Reeb vector of constant norm and define a U(n) or metric contact structure. The Killing vector defines a natural foliation and in the Sasaki-Einstein case the metric transverse to these orbits is Kähler and Einstein, while for the geometry considered in [1,2] it is Kähler and in addition satisfies¹
$$ \Box R - \tfrac{1}{2}R^2 + R_{ij}R^{ij} = 0, \tag{1.1} $$

where $R$ and $R_{ij}$ are the Ricci-scalar and Ricci-tensor, respectively, of the transverse metric, $\Box$ is the Laplacian and we also demand² that R > 0. Moreover, locally, the whole geometry can be reconstructed from a local Kähler metric in 2n dimensions satisfying (1.1).

Recall that a succinct definition of a Sasaki-Einstein metric in 2n + 1 dimensions is that the corresponding cone metric in 2n + 2 dimensions, with base given by the Sasaki-Einstein metric, is Ricci-flat and Kähler, or equivalently has SU(n + 1) holonomy. A metric with SU(n + 1) holonomy has an SU(n + 1) structure, specified by a fundamental two-form $J$ and an (n + 1, 0)-form $\Omega$ (which together define a metric), with vanishing intrinsic torsion,
$$ dJ = d\Omega = 0. \tag{1.2} $$

Equivalently it can be characterised as admitting covariantly constant spinors.

In this paper we will determine the analogous statements for the geometries studied in [1,2] and furthermore generalise the geometry from n = 3, 4 to all n ≥ 3. We find that the analogue of a metric with SU(n + 1) holonomy is a geometry specified by a metric, a scalar field $\phi$ and a closed three-form $f$ with an SU(n + 1) structure $(J, \Omega)$ satisfying
$$ d[e^{n\phi}\Omega] = 0, \qquad d[e^{2(n-1)\phi}J^n] = 0, \qquad d[e^{2\phi}J] = f, \tag{1.3} $$
and in addition
$$ d\big[e^{2(n-3)\phi} *_{2n+2} f\big] = 0, \tag{1.4} $$

where $*_{2n+2}$ is the Hodge dual using the (2n + 2)-dimensional metric defined by the SU(n + 1) structure. Note that (1.3) imply that the almost complex structure associated with the SU(n + 1) structure is integrable. We will show that complex geometries satisfying (1.3) are equivalent to geometries that admit a certain type of Killing spinor. Furthermore, by analysing the integrability conditions for the Killing spinor equations, and in addition imposing (1.4), we will also determine the equations of motion satisfied by the metric, the scalar and the three-form, which are the analogue of the property of Ricci-flatness in the case of SU(n + 1) holonomy.

If we demand that the geometry in 2n + 2 dimensions satisfying (1.3), (1.4) is a metric cone and with the scalar and the three-form having a specific scaling:
$$ ds^2_{2n+2} = dr^2 + r^2 ds^2_{2n+1}, \qquad e^{-2\phi} = r^{\frac{2(n-1)}{n-2}}e^{B}, \qquad f = r^{\frac{n}{2-n}}\,dr\wedge F, \tag{1.5} $$

¹ This equation first appeared, for n = 2, in a different context in [4]. We also note that this condition is similar in form to that of the vanishing of Branson Q-curvature [5], but it is in fact a different condition.
² Note that when R < 0 we can construct Lorentzian geometries which for n = 3, 4 give rise to solutions of type IIB supergravity and D = 11 supergravity with an $S^3$ and $S^2$ factor, respectively, as described in [3].


we will show that we obtain a geometry in 2n + 1 dimensions specified by a metric, with line element $ds^2_{2n+1}$, a scalar field, B, and a closed two-form, F, all independent of the co-ordinate r, which for n = 3, 4 is precisely equivalent to the geometry of [1,2]. We show that for all n, the 2n + 1 dimensional metric has a Killing vector of constant norm and that the transverse metric is Kähler and satisfies (1.1). For all n, locally, the whole geometry can be reconstructed from a local Kähler metric in 2n dimensions satisfying (1.1). We determine the kind of Killing spinors that the geometry in 2n + 1 dimensions admits and also the kind of equations of motion that are satisfied by the metric, the scalar field B and the two-form F, which are the analogue of the Einstein condition in the case of Sasaki-Einstein geometry.

Note that in 2n + 2 dimensions we are generalising the notion of SU(n + 1) holonomy in the sense that if we set f = φ = 0 in (1.3), (1.4) we clearly return to the case of SU(n + 1) holonomy. However, on the base of the cone in 2n + 1 dimensions, defined by (1.5), we are not generalising the notion of Sasaki-Einstein geometry: for example (1.1) is not satisfied for Einstein metrics for n ≥ 3.

When n = 3 an eight dimensional geometry satisfying (1.3), (1.4) gives rise to a supersymmetric solution of type IIB supergravity which is a warped product of $\mathbb{R}^{1,1}$ with the eight-dimensional geometry. Assuming that the eight dimensional geometry is a cone as in (1.5) we recover the type IIB $AdS_3$ solutions of [1]. Similarly, as discussed in [6], when n = 4, a ten-dimensional geometry satisfying (1.3), (1.4) gives rise to a supersymmetric solution of D = 11 supergravity which is a warped product of $\mathbb{R}$ with the ten-dimensional geometry. If the ten dimensional geometry is a cone as in (1.5) we recover the D = 11 $AdS_2$ solutions of [2]. We do not know of any physical application for the geometry when n ≥ 5.
However, it is possible that the geometries in 2n + 1 dimensions, with n ≥ 5, inherit some properties dictated by physics for the seven and/or nine dimensional cases. This is by analogy with the Sasaki-Einstein case. Recall that five-dimensional Sasaki-Einstein geometries, $SE_5$, give rise to $AdS_5 \times SE_5$ solutions of type IIB supergravity. These solutions are dual to N = 1 supersymmetric conformal field theories (SCFTs) in four spacetime dimensions and such SCFTs exhibit the phenomenon of a-maximisation [7]. Motivated by this observation, it was proven in [8,9] that the volume of Sasaki-Einstein manifolds in any dimension satisfies a variational principle.

Sections 2 and 3 of this paper will be devoted to expanding on the above discussion. In the subsequent sections we will then consider various ansätze in order to find explicit examples. In Sect. 4 we will construct explicit examples of the geometries in 2n + 1 dimensions. This is a direct analogue of the explicit construction of Sasaki-Einstein metrics that was carried out in [10,11] and generalises the analysis of [3] from n = 3, 4 to all n ≥ 3. More specifically, we construct explicit local Kähler metrics in 2n dimensions satisfying (1.1), by considering local metrics on line bundles over positively curved Kähler-Einstein manifolds in 2n − 2 dimensions. We then argue that for each choice of Kähler-Einstein manifold these lead to countably infinite classes of smooth, compact and simply connected globally defined geometries in 2n + 1 dimensions.

Section 5 will present an ansatz for geometries in 2n + 2 dimensions that depend on a number of functions of one variable. We show that the ansatz includes the simple case of a cone over a 2n + 1 dimensional geometry with the corresponding 2n-dimensional Kähler manifold satisfying (1.1) being a product of Kähler-Einstein spaces. Such (2n + 1)-dimensional geometries, for n = 3, 4, were studied in [3].
We also show that the ansatz includes singular non-compact Calabi-Yau geometries, some of which were discussed in [12]. For n = 3 the ansatz also incorporates a known solution in type IIB supergravity that describes an interpolation between a solution with an $AdS_5$ factor and a solution with an $AdS_3 \times H_2/\Gamma$ factor, where $H_2$ is the hyperbolic plane and $\Gamma$ is a discrete group of isometries [13,14]. Similarly, for n = 4 the ansatz covers a known solution in D = 11 supergravity that describes an interpolation between a solution with an $AdS_4$ factor and a solution with an $AdS_2 \times H_2/\Gamma$ factor [15,16].

In Sect. 6 we will consider an ansatz for the geometries in 2n + 1 dimensions which is inspired by the work of [17]. The resulting system boils down to solving a differential equation for a function D of three variables, $x_1, x_2, z$. For n = 3 the equation is linear, as in [17], and can be explicitly solved. For n ≥ 4 the equation is
$$ \Box D + z^{\frac{n-4}{n-3}}\,\partial_z^2 e^{D} = 0, \tag{1.6} $$
where $\Box = \partial_1^2 + \partial_2^2$. For n = 4 this is equivalent to the continuous Toda equation as in [17]. We do not know whether the equation is an integrable system for n ≥ 5. Section 7 briefly concludes.

2. Geometry in 2n + 2 Dimensions

The geometry in 2n + 2 dimensions that we will be interested in is specified by a Riemannian metric, $g$, a scalar field, $\phi$, and a closed three-form, $f$:
$$ df = 0. \tag{2.1} $$

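We are interested in such geometries that admit a solution to the Killing spinor equations:
$$ \Big[\gamma^\alpha\nabla_\alpha\phi + \frac{i}{12}e^{-2\phi}f_{\sigma_1\sigma_2\sigma_3}\gamma^{\sigma_1\sigma_2\sigma_3}\Big]\epsilon = 0, \qquad \Big[\nabla_\alpha - \frac{i}{24}e^{-2\phi}f_{\sigma_1\sigma_2\sigma_3}\gamma_\alpha{}^{\sigma_1\sigma_2\sigma_3}\Big]\epsilon = 0, \tag{2.2} $$
where $\epsilon$ is a Spin(2n + 2) spinor, the gamma-matrices, $\gamma^\alpha$, generate the Clifford algebra Cliff(2n + 2): $\{\gamma_\alpha,\gamma_\beta\} = 2g_{\alpha\beta}$ and the indices $\alpha, \sigma, \ldots$ run from 1 to 2n + 2. We will be particularly interested in geometries admitting such Killing spinors that in addition satisfy the following equation of motion for $f$:
$$ d\big[e^{2(n-3)\phi} *_{2n+2} f\big] = 0. \tag{2.3} $$
We now argue that any geometry satisfying (2.1), (2.2), (2.3) is also a solution to the following equations:
$$ E_{\alpha\beta} = 0, \qquad \nabla^2\phi + 2(n-1)(\nabla\phi)^2 - \frac{1}{2}e^{-4\phi}f^2 = 0, \tag{2.4} $$
where we have defined
$$ E_{\alpha\beta} \equiv R_{\alpha\beta} - 2(n-1)\nabla_{\alpha\beta}\phi + 2(n-2)\nabla_\alpha\phi\nabla_\beta\phi + \frac{1}{4}e^{-4\phi}f_{\alpha\sigma_1\sigma_2}f_\beta{}^{\sigma_1\sigma_2} - \frac{1}{2}g_{\alpha\beta}e^{-4\phi}f^2. \tag{2.5} $$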


To see this we follow an argument of [18]. Specifically, the integrability conditions for the Killing spinor equations can be used to show that
$$ E_{\beta\sigma}\gamma^\sigma\epsilon = -\frac{i}{48}e^{-2\phi}\,df_{\sigma_1\sigma_2\sigma_3\sigma_4}\gamma_\beta{}^{\sigma_1\sigma_2\sigma_3\sigma_4}\epsilon - \frac{i}{4}e^{2(2-n)\phi}\nabla_\alpha\big(e^{-2(3-n)\phi}f^\alpha{}_{\sigma_1\sigma_2}\big)\gamma_\beta{}^{\sigma_1\sigma_2}\epsilon \tag{2.6} $$
and
$$ \Big[\nabla^2\phi + 2(n-1)(\nabla\phi)^2 - \frac{1}{2}e^{-4\phi}f^2\Big]\epsilon = -\frac{i}{48}e^{-2\phi}\,df_{\sigma_1\sigma_2\sigma_3\sigma_4}\gamma^{\sigma_1\sigma_2\sigma_3\sigma_4}\epsilon - \frac{1}{4}ie^{2(2-n)\phi}\nabla_\alpha\big(e^{-2(3-n)\phi}f^\alpha{}_{\sigma_1\sigma_2}\big)\gamma^{\sigma_1\sigma_2}\epsilon. \tag{2.7} $$
If we now impose (2.1), (2.3), we immediately deduce from (2.7) that the scalar equation of motion in (2.4) is satisfied. From (2.6) we similarly deduce that $E_{\alpha\beta}\gamma^\beta\epsilon = 0$, but on a Riemannian manifold this implies that $E_{\alpha\beta} = 0$.

Observe that (2.3), (2.4) are equations of motion that can be derived by varying an action with Lagrangian density given by
$$ \mathcal{L}_{2n+2} = e^{2(n-1)\phi}\Big[R + 2n(2n-3)(\nabla\phi)^2 + \frac{1}{2}e^{-4\phi}f^2\Big]. \tag{2.8} $$
Here we have defined $f^2 \equiv (1/3!)\,f_{\alpha_1\alpha_2\alpha_3}f^{\alpha_1\alpha_2\alpha_3}$ and we are thinking of the action as being a functional of the metric, the scalar $\phi$ and a two-form potential $b$ with $f = db$.

We next observe that the only compact solutions to the equations of motion (2.3), (2.4) are Ricci-flat manifolds. To see this note that the scalar equation of motion implies that
$$ \nabla^2\big[e^{2(n-1)\phi}\big] = 2(n-1)e^{2(n-3)\phi}f^2. \tag{2.9} $$
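The step from the scalar equation of motion in (2.4) to (2.9) is an application of the chain rule. The sketch below spells it out, using only the form of (2.4) quoted above; the overall positive numerical constant is deliberately not tracked (hence the proportionality sign), since the compactness argument only needs the sign of the right-hand side.

```latex
\begin{align*}
\nabla^2\big[e^{2(n-1)\phi}\big]
  &= \nabla^\alpha\big[\,2(n-1)\,e^{2(n-1)\phi}\,\nabla_\alpha\phi\,\big] \\
  &= 2(n-1)\,e^{2(n-1)\phi}\,\big[\nabla^2\phi + 2(n-1)(\nabla\phi)^2\big] \\
  % the bracket equals a positive multiple of e^{-4\phi} f^2 by (2.4)
  &\propto (n-1)\,e^{2(n-1)\phi}\,e^{-4\phi}f^2
   \;=\; (n-1)\,e^{2(n-3)\phi}f^2 \;\ge\; 0 .
\end{align*}
```

On a compact manifold without boundary the left-hand side integrates to zero, while for n ≥ 2 the right-hand side is pointwise non-negative on a Riemannian manifold, so it must vanish identically.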

Integrating this over a compact manifold we deduce for n ≥ 2 that f = 0. The scalar equation of motion in (2.4) then implies that φ = 0. A similar argument works for n = 1 also. In Sect. 3 we will focus on non-compact cone geometries which can have compact base spaces.

2.1. SU(n + 1) structure. We now restrict our considerations to solutions of the Killing spinor equations (2.2) where the Killing spinor is a Weyl spinor. More specifically, we demand that $\epsilon$ is nowhere vanishing and has isotropy group SU(n + 1) ⊂ Spin(2n + 2). In other words we demand that the Killing spinor fixes a globally defined SU(n + 1)-structure. We first observe that the Killing spinor equations (2.2) imply that $\bar\epsilon\epsilon$ is a constant. We will fix the normalisation by imposing $\bar\epsilon\epsilon = 1$. The SU(n + 1) structure is specified by a fundamental two-form $J$ and an (n + 1, 0)-form $\Omega$, both of which can be constructed as bi-linears in $\epsilon$:
$$ J_{\alpha\beta} = -i\,\bar\epsilon\,\gamma_{\alpha\beta}\,\epsilon, \qquad \Omega_{\alpha_1\ldots\alpha_{n+1}} = \bar\epsilon^{\,c}\,\gamma_{\alpha_1\cdots\alpha_{n+1}}\,\epsilon, \tag{2.10} $$
where $\epsilon^c$ is the spinor conjugate to $\epsilon$. Recall that $(J, \Omega)$ define a metric and an almost complex structure³. After some detailed calculations, we find that the Killing spinor equations (2.2) imply that the SU(n + 1) structure must satisfy
$$ d[e^{n\phi}\Omega] = 0, \qquad d[e^{2(n-1)\phi}J^n] = 0, \qquad d[e^{2\phi}J] = f. \tag{2.11} $$

³ For example, $J^2 = -1$ can be shown by using a Fierz rearrangement.

These equations account for all of the intrinsic torsion modules of the SU(n + 1) structure. In particular, using the notation of [19], the first equation in (2.11) says that the torsion modules $W_1 = W_2 = 0$, which implies that the manifold is complex (i.e. that the almost complex structure is integrable), and that the Lee form $W_5 \propto d\phi$. The second equation in (2.11) says that the Lee form $W_4 \propto d\phi$. The third equation in (2.11) relates $W_3$ and $W_4$ to the three-form $f$.

We have argued that (2.11) are necessary conditions for solutions of the Killing spinor equations (2.2) with spinors that define an SU(n + 1) structure. They are also sufficient. In particular given an SU(n + 1) structure satisfying the first two conditions in (2.11), one can extract $d\phi$ from the torsion modules $W_4$ or $W_5$ and obtain a three-form $f$ via the last equation. Following the same type of argument as that discussed after Eq. (4.23) of [18] we conclude that there will be an SU(n + 1) invariant Weyl spinor that solves the Killing spinor equations (2.2).

Clearly the Bianchi identity for $f$, (2.1), is automatically implied by (2.11). Thus in light of the integrability argument made in the previous subsection, if we also impose the equation of motion for $f$, (2.3), then we deduce that all of the equations of motion (2.4) are satisfied.

Also observe that we are describing a generalisation of manifolds with special holonomy SU(n + 1). In particular, if φ = f = 0, we are demanding the existence of SU(n + 1) invariant covariantly constant spinors, in other words geometries with SU(n + 1) holonomy, and (2.11) reduces to the usual conditions $dJ = d\Omega = 0$.

The geometries with Killing spinors that we are describing generalise a certain class of supersymmetric solutions of type IIB and D = 11 supergravity. Specifically, we have checked⁴ that the geometry with n = 3 satisfying (2.11) and (2.3) gives rise to a supersymmetric solution of type IIB supergravity of the form:
$$ ds^2 = e^{\phi}\big[ds^2(\mathbb{R}^{1,1}) + ds_8^2\big], \qquad F_5 = -\frac{1}{4}\big[\mathrm{Vol}(\mathbb{R}^{1,1})\wedge f - *_8 f\big], \tag{2.12} $$
where $F_5$ is the self-dual five form. These solutions preserve (0, 2) supersymmetry with respect to $\mathbb{R}^{1,1}$. Similarly, the n = 4 geometry satisfying (2.11) and (2.3) gives [6] the following supersymmetric solution of D = 11 supergravity:
$$ ds^2 = e^{4\phi/3}\big[-dt^2 + ds_{10}^2\big], \qquad G_4 = dt\wedge f. \tag{2.13} $$

These solutions preserve two supercharges. For both of these cases flux quantisation in the supergravity theory implies that the periods of $f$ should be rational⁵. One might consider demanding that this condition holds for general n.

⁴ It was recently shown in [20] that this result can also be obtained by considering a restricted class of solutions analysed in [21].
⁵ Actually, to be more precise, the quantisation condition on the four-form is slightly different: see [22].

In the next section we will assume that the metric in 2n + 2 dimensions is a metric cone, as well as imposing additional assumptions on $\phi$, $f$, and study the corresponding geometry on the (2n + 1)-dimensional base of the cone. Before doing that, let us conclude with two comments which will play no role in the sequel. Firstly, we observe that the Killing spinor equations in (2.2), for arbitrary spinor, can be equivalently written:
$$ \Big[\gamma^\alpha\nabla_\alpha\phi + \frac{i}{12}e^{-2\phi}f_{\sigma_1\sigma_2\sigma_3}\gamma^{\sigma_1\sigma_2\sigma_3}\Big]\epsilon = 0, \qquad \Big[\nabla_\alpha + \frac{1}{2}\nabla_\alpha\phi + \frac{1}{2}\nabla_\beta\phi\,\gamma_\alpha{}^{\beta} + \frac{i}{8}e^{-2\phi}f_{\alpha\sigma_1\sigma_2}\gamma^{\sigma_1\sigma_2}\Big]\epsilon = 0. \tag{2.14} $$
If we now introduce the rescaled metric $\tilde g = e^{2\phi}g$ and the rescaled spinor $\tilde\epsilon = e^{\phi/2}\epsilon$, the Killing spinor equations become (dropping tildes)
$$ \Big[\gamma^\alpha\nabla_\alpha\phi + \frac{i}{12}f_{\sigma_1\sigma_2\sigma_3}\gamma^{\sigma_1\sigma_2\sigma_3}\Big]\epsilon = 0, \qquad \Big[\nabla_\alpha + \frac{i}{8}f_{\alpha\sigma_1\sigma_2}\gamma^{\sigma_1\sigma_2}\Big]\epsilon = 0. \tag{2.15} $$
Interestingly these are just the Killing spinor equations that arise in the common NS-NS sector of supergravity (see [23,24] and e.g. [25,19]) with imaginary 3-form flux $H = if$ and dilaton $\Phi = \phi$.

Although we will be focussing on n ≥ 3 in the remainder, the second comment concerns the n = 2 case. If we let $\epsilon$ be a chiral spinor, $i\gamma_7\epsilon = \epsilon$ where $\gamma_7 = \gamma_{123456}$, and define $H = e^{-2\phi} *_6 f$ and the dilaton $\Phi = -\phi$, then we find the Killing spinor equations are exactly the same as in the common NS-NS sector in six dimensions.
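The equivalence between (2.2) and (2.14) stated in the first comment can be checked with the standard product rule for gamma matrices. The following is a sketch, assuming the Clifford algebra convention $\{\gamma_\alpha,\gamma_\beta\} = 2g_{\alpha\beta}$ quoted after (2.2) and writing $\epsilon$ for the Killing spinor.

```latex
% Product rules (f is totally antisymmetric):
\gamma_\alpha\,\gamma^{\sigma_1\sigma_2\sigma_3} f_{\sigma_1\sigma_2\sigma_3}
  = \gamma_\alpha{}^{\sigma_1\sigma_2\sigma_3} f_{\sigma_1\sigma_2\sigma_3}
  + 3\, f_{\alpha\sigma_1\sigma_2}\,\gamma^{\sigma_1\sigma_2},
\qquad
\gamma_\alpha\gamma^{\beta}\nabla_\beta\phi
  = \nabla_\alpha\phi + \gamma_\alpha{}^{\beta}\nabla_\beta\phi .

% Multiply the first equation of (2.2) by \gamma_\alpha / 2 and solve for the four-index term:
\frac{i}{24}\,e^{-2\phi} f_{\sigma_1\sigma_2\sigma_3}\gamma_\alpha{}^{\sigma_1\sigma_2\sigma_3}\,\epsilon
  = -\frac{1}{2}\big(\nabla_\alpha\phi + \gamma_\alpha{}^{\beta}\nabla_\beta\phi\big)\epsilon
    - \frac{i}{8}\,e^{-2\phi} f_{\alpha\sigma_1\sigma_2}\gamma^{\sigma_1\sigma_2}\,\epsilon .

% Substitute into the second equation of (2.2):
\Big[\nabla_\alpha + \frac{1}{2}\nabla_\alpha\phi
   + \frac{1}{2}\nabla_\beta\phi\,\gamma_\alpha{}^{\beta}
   + \frac{i}{8}\,e^{-2\phi} f_{\alpha\sigma_1\sigma_2}\gamma^{\sigma_1\sigma_2}\Big]\epsilon = 0 .
```

The last line is the second equation of (2.14); the first equation is common to both systems, and the steps can be reversed, so the two systems are equivalent.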

3. Geometry in 2n + 1 Dimensions

We now restrict our considerations to n ≥ 3 and take the 2n + 2 dimensional metric of the last section to be a cone metric:
$$ ds^2_{2n+2} = dr^2 + r^2 ds^2_{2n+1}, \tag{3.1} $$
where $ds^2_{2n+1}$ is independent of $r$. We also demand that the scalar field $e^{2\phi}$ and the three-form $f$ have the following dependence on $r$:
$$ e^{-2\phi} = r^{\frac{2(n-1)}{n-2}}e^{B}, \qquad f = r^{\frac{n}{2-n}}\,dr\wedge F, \tag{3.2} $$
where the scalar field $B$ and the closed two-form $F$ are independent of $r$. We are interested in the geometry in 2n + 1 dimensions on the link L, defined to be the surface r = 1 on the cone, with metric whose line element is $ds^2_{2n+1}$, scalar field $B$ and closed two-form $F$.

We first observe that the equations of motion in 2n + 2 dimensions given in (2.1), (2.4) give rise to the following equations of motion in 2n + 1 dimensions:
$$ \begin{aligned} & R_{ab} + (n-1)\nabla_{ab}B + \frac{(n-2)}{2}\nabla_a B\nabla_b B + \frac{2}{n-2}g_{ab} + \frac{1}{2}e^{2B}F_{ac}F_b{}^{c} - \frac{1}{4}e^{2B}g_{ab}F^2 = 0, \\ & \nabla^2 B - (n-1)(\nabla B)^2 - \frac{4(n-1)}{(n-2)^2} + \frac{e^{2B}}{2}F^2 = 0, \\ & d\big[e^{(3-n)B} *_{2n+1} F\big] = 0, \end{aligned} \tag{3.3} $$


where F 2 ≡ Fab F ab . These equations of motion can be derived from an action with Lagrangian given by6 1 2n n(2n − 3) L2n+1 = e(1−n)B R + . (3.4) (∇ B)2 + e2B F 2 − 2 4 (n − 2)2 Here we are thinking of the action as being a functional of the metric, the scalar B and a one-form potential A with F = d A. The Killing spinor equations in 2n + 2 dimensions (2.2) give rise to Killing spinor equations in 2n + 1 dimensions. The generators γα of Cli f f (2n + 2) can be written γa = a ⊗ σ1 a = 1, . . . , 2n + 1, γr = 1 ⊗ σ2 ,

(3.5)

where a generate Cli f f (2n + 1) and σ1 , σ2 are Pauli matrices. For definiteness, when n is odd we take 1 . . . 2n+1 = −i and the chirality operator in 2n + 2 dimensions as γ1 . . . γ2n+1 γr = 1 ⊗ σ3 . When n is even we take 1 . . . 2n+1 = −1 and the chirality operator in 2n + 2 dimensions as iγ1 . . . γ2n+1 γr = 1 ⊗ σ3 . In both cases, then, a positive chirality spinor in 2n + 2 dimensions can be written as = (η, 0), where η is a spinor in 2n + 1 dimensions. We then find that substituting (3.1) and (3.2) into (2.2) leads to 2(n − 1) 1 B a ∇a B + i + e Fab ab η = 0, n−2 2 i 1 ∇c + c + e B Fab c ab η = 0. (3.6) 2 8 Using a result of the last section, we also conclude that if we have a solution to these Killing spinor equations and in addition we impose the Bianchi identity, d F = 0, and the equation of motion for the two-form d e(3−n)B ∗2n+1 F = 0, (3.7) then all of the equations of motion in (3.3) will be satisfied. For the n = 3 case a solution to the Killing spinor equations (3.6) and (3.7) give rise to a supersymmetric type IIB solution with an Ad S3 factor of the form ds 2 = e−B/2 [ds 2 (Ad S3 ) + ds72 ], 1 F5 = − [V ol(Ad S3 ) ∧ F − ∗7 F], 4

(3.8)

while for the $n = 4$ case we can obtain the following solution of $D = 11$ supergravity with an $AdS_2$ factor,

$$ds^2 = e^{-2B/3}\,[ds^2(AdS_2) + ds_9^2], \qquad G_4 = Vol(AdS_2) \wedge F. \tag{3.9}$$

In making the comparison with [1–3] one should identify $B = \frac{2(n-1)}{2-n} A$ and in the IIB case we have $F^{here} = -4F^{there}$.

6 Note for n = 3 that if we change the sign of the last two terms in this Lagrangian, we obtain a Lagrangian equivalent to one considered in [27].


We can analyse the geometries in $2n+1$ dimensions that admit solutions to the Killing spinor equations (2.2) with $SU(n+1)$ invariant spinors in two ways. We can either directly analyse the Killing spinor equations (3.6), generalising the analysis of [1,2], or equivalently, we can reduce the $SU(n+1)$ structure in $2n+2$ dimensions satisfying (2.11) that we discussed in the last section. Let us consider the latter approach. We first stay on the cone. We introduce the Reeb vector field $\xi$ defined by

$$\xi^\alpha = J^\alpha{}_\beta\, (r\partial_r)^\beta, \tag{3.10}$$

which has norm squared given by $r^2$. In this expression $J$ refers to the integrable complex structure on the cone obtained by raising an index on the two-form $J$; we hope that using the same letter for both does not cause confusion. We also define the one-form $\eta$ on the cone:

$$\eta_\alpha = J_\alpha{}^\beta \left(\frac{dr}{r}\right)_\beta, \tag{3.11}$$

i.e. $\eta = \frac{1}{r^2}\, g_{2n+1}(\xi, \cdot)$. We will restrict our considerations to nowhere vanishing $e^B$. Compatible with the cone metric (3.1), we can decompose the $SU(n+1)$ structure $J$, $\Omega$ via

$$J = r\,\eta \wedge dr + r^2 e^B J_T, \qquad \Omega = r^n e^{nB/2} (dr - i r \eta) \wedge \bar\Omega_T, \tag{3.12}$$

where $J_T$ is a two-form and $\bar\Omega_T$ is an $n$-form, both orthogonal to $\xi$ and $\partial_r$. The conditions on the $SU(n+1)$ structure (2.11) now become

$$dJ_T = 0, \qquad (J_T)^n \wedge de^B = 0, \qquad -\frac{1}{c}\, e^B (J_T)^n + n (J_T)^{n-1}\, d\eta = 0, \qquad d\bar\Omega_T = \frac{i}{c}\, \eta \wedge \bar\Omega_T, \tag{3.13}$$

with the two-form $F$ given by

$$F = -\frac{1}{c} J_T + d(e^{-B}\eta), \tag{3.14}$$

and we have introduced the constant $c = (n-2)/2$. The Bianchi identity for $F$ is automatically satisfied and so we just need to impose the equation of motion for $F$, (3.7), to ensure that all equations of motion (3.3) are satisfied. From these conditions one can show that

$$\mathcal{L}_\xi e^B = \mathcal{L}_\xi J_T = \mathcal{L}_\xi \eta = 0, \qquad \mathcal{L}_\xi \bar\Omega_T = \frac{i}{c}\, \bar\Omega_T. \tag{3.15}$$

From this we deduce that

$$\mathcal{L}_\xi J = 0, \qquad \mathcal{L}_\xi \Omega = \frac{i}{c}\, \Omega, \tag{3.16}$$

and hence that $\xi$ is Killing and holomorphic. One can also show that $r\partial_r$ is holomorphic.


Let us now consider the link $L$, defined as $r = 1$. The vector field $\xi$ restricts to a vector field on $L$, which we denote by the same letter. Similarly, we find that $\eta$ and $J$ pull back to well defined forms on $L$. One has to be a little careful about $\bar\Omega_T$: it does not give rise to an $n$-form but rather a section of $\Lambda^n(T^*X)$ twisted by the complex line bundle defined by $dr - ir\eta$. For this reason we have a $U(n)$ structure on $L$ (or a metric contact structure, see [26] Def. 5). We now introduce local coordinates on $L$ so that we can write $\xi = (1/c)\partial_z$, $\eta = c(dz + P)$ and

$$ds^2_{2n+1} = c^2 (dz + P)^2 + e^B ds^2_{2n}. \tag{3.17}$$

Using (3.13) we deduce that $ds^2_{2n}$ is a local Kähler metric with Kähler-form $J_T$. Furthermore, defining $\Omega_T = e^{-iz}\bar\Omega_T$, we have that $\Omega_T$ is the local $(n,0)$-form on the Kähler manifold satisfying $d\Omega_T = iP \wedge \Omega_T$ and $dP = \rho$, where $\rho$ is the Ricci-form of the Kähler metric. We also write the Ricci tensor of this Kähler metric as $R_{ij}$ and the Ricci scalar as $R$. The scalar field and the two-form then take the form

$$e^B = c^2\, \frac{R}{2}, \qquad F = -\frac{1}{c} J_T + c\, d[e^{-B}(dz + P)]. \tag{3.18}$$

The equation of motion of the two-form in (3.7) implies that the Kähler metric must satisfy

$$\Box R + R_{ij} R^{ij} - \frac{1}{2} R^2 = 0. \tag{3.19}$$

At this point it is worth pausing to emphasise that we have shown that if we have a local Kähler metric in $2n$ dimensions that solves this master equation we can reconstruct a local $(2n+1)$-dimensional geometry via (3.17), (3.18) which admits solutions to the Killing spinor equations (3.6) and solves the equations of motion (3.3). Returning to the cone, in terms of this local description the original $SU(n+1)$ structure $J$, $\Omega$ is given by

$$J = -c\, r\, dr \wedge (dz + P) + r^2 e^B J_T, \qquad \Omega = e^{iz} (e^{B/2} r)^n\, [dr - i r c (dz + P)] \wedge \Omega_T, \tag{3.20}$$

and one can directly check that this $SU(n+1)$ structure satisfies (2.11). One can also directly check that the equation of motion (2.3) is satisfied: to do so observe that

$$e^{2(n-3)\phi} *_{2n+2} d[e^{2\phi} J] = -c^2 \left[ \frac{(J_T)^{n-2}}{(n-2)!} \wedge \rho \wedge (dz + P) + *_{2n}\, d\,\frac{R}{2} \right]. \tag{3.21}$$

Note that the natural orientation on the cone is given by

$$\frac{(J)^{n+1}}{(n+1)!} = -c\, r^{2n+1} e^{nB}\, dr \wedge (dz + P) \wedge \frac{(J_T)^n}{n!}, \tag{3.22}$$

and hence we take $\epsilon_{r z i_1 \ldots i_{2n}} = -1$ and we also take $\epsilon_{z i_1 \ldots i_{2n}} = +1$. Since the Killing vector $\xi$ is nowhere vanishing it defines a foliation. Just as in the case of Sasaki structures there are three cases to consider. The regular case is when the orbits of $\xi$ close and the circle action is free. In this case, the local description above is


globally defined. In particular, it is characterised by Kähler manifolds satisfying (3.19). The quasi-regular case is when the orbits of $\xi$ close but the action is only locally free. In this case the orbit space is an orbifold and $L$ is the total space of an orbifold circle bundle over a Kähler orbifold satisfying (3.19). The irregular case is when the orbits generically do not close and there is no globally defined Kähler geometry. In the remaining sections we will illustrate the geometry we have introduced both in $2n+1$ dimensions and $2n+2$ dimensions by discussing several ansätze, some of which lead to new explicit examples.

4. Fibration Construction Using $KE^+_{2n-2}$ Spaces

In this section we will construct explicit examples of the geometries in $2n+1$ dimensions. For each Kähler-Einstein manifold with positive curvature in $2n-2$ dimensions, the construction gives countably infinite classes of simply connected, compact geometries. The strategy is to first find local Kähler metrics satisfying (3.19) and then afterwards show that they lead to globally defined complete geometries in $2n+1$ dimensions. The approach is the analogue of the construction of Sasaki-Einstein manifolds in [10,11] and generalises the analysis of [3] from $n = 3, 4$ to all $n \geq 3$. In order to find explicit examples of local Kähler metrics in $2n$ dimensions satisfying (3.19), following [28], we consider the ansatz

$$ds^2_{2n} = \frac{d\rho^2}{U} + U\rho^2 (D\phi)^2 + \rho^2 ds^2(KE^+_{2n-2}) \tag{4.1}$$

with

$$D\phi = d\phi + C \tag{4.2}$$

and $U$ is a function of $\rho$. Here $ds^2(KE^+_{2n-2})$ is a $(2n-2)$-dimensional Kähler-Einstein metric of positive curvature. It is normalised so that $\mathcal{R}_{KE} = 2n J_{KE}$ and the one-form $C$ satisfies $dC = 2J_{KE}$. Note that $nC$ is then the connection on the canonical bundle of the Kähler-Einstein space. Let $\Omega_{KE}$ denote a local $(n-1, 0)$-form, unique up to rescaling by a complex function. To show that $ds^2_{2n}$ is a Kähler metric observe that the Kähler form, defined by

$$J_T = \rho\, d\rho \wedge D\phi + \rho^2 J_{KE}, \tag{4.3}$$

is closed, and that the holomorphic $(n, 0)$-form

$$\Omega_T = e^{in\phi} \left( \frac{d\rho}{\sqrt{U}} + i\rho\sqrt{U}\, D\phi \right) \wedge \rho^{n-1} \Omega_{KE} \tag{4.4}$$

satisfies

$$d\Omega_T = i f D\phi \wedge \Omega_T, \tag{4.5}$$

with

$$f = n(1 - U) - \frac{\rho}{2} \frac{dU}{d\rho}. \tag{4.6}$$


This implies, in particular, that the complex structure defined by $\Omega_T$ is integrable. In addition (4.5) allows us to obtain the Ricci tensor of $ds^2_{2n}$:

$$\mathcal{R} = dP, \qquad P = f D\phi. \tag{4.7}$$

The Ricci-scalar is then obtained via $R = \mathcal{R}_{ij} J^{ij}$. We would like to find the conditions on $U$ such that $ds^2_{2n}$ satisfies Eq. (3.19). It is convenient to introduce the new coordinate $x = 1/\rho^2$ so that

$$ds^2_{2n} = \frac{1}{x} \left[ \frac{dx^2}{4x^2 U} + U (D\phi)^2 + ds^2(KE^+_{2n-2}) \right] \tag{4.8}$$

and

$$f = n(1 - U) + x \frac{dU}{dx}, \tag{4.9}$$

$$R = 4(n-1)\, x f - 4x^2 \frac{df}{dx}. \tag{4.10}$$

We can now show that (3.19) can be integrated once to give

$$2(n-1) f^2 + U \frac{dR}{dx} = (\text{constant}) \times x^{n-2}. \tag{4.11}$$

It is now straightforward to obtain polynomial solutions of (4.11). For simplicity we will restrict our considerations$^7$ to solutions of the form $U = 1 - \alpha x^{n-2}(x - \beta)^2$. Note that if we scale $x \to kx$, we obtain the same $(2n+1)$-dimensional metric (see below) providing that $\alpha \to k^n \alpha$, $\beta \to \beta/k$. For reasons that will become clear soon, we are interested in $U$ having two distinct roots. If $n$ is even then we must have $\alpha > 0$. If $n$ is odd, by rescaling $x$ if necessary, we can also take $\alpha > 0$. We will also use this scaling to set $\beta = 1$. Thus we will focus on solutions with

$$U = 1 - \alpha x^{n-2}(x - 1)^2. \tag{4.12}$$

Observing that $U$ has turning points at $x = (n-2)/n$ and at $x = 1$, we will choose $\alpha \in (\alpha_0, \infty)$, where $\alpha_0 = n^n / (4(n-2)^{n-2})$, so that $U$ has two positive roots $x_1$ and $x_2$. We now consider the local metrics in $2n+1$ dimensions that can be constructed from these local $2n$-dimensional Kähler metrics:

$$ds^2_{2n+1} = c^2 \left[ (dz + P)^2 + \frac{R}{2}\, ds^2_{2n} \right], \tag{4.13}$$

where $R$ is the Ricci-scalar of the $2n$-dimensional metric given in (4.10):

$$R = 8\alpha x^{n-1}. \tag{4.14}$$
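The step from (4.12) to (4.14) can be checked mechanically. The sketch below is not part of the original analysis; it represents polynomials in $x$ as coefficient lists (helper names such as `verify_profile` are ours) and evaluates (4.9), (4.10) and the left-hand side of (4.11) exactly for the profile (4.12). Under these assumptions it reproduces $R = 8\alpha x^{n-1}$ and indicates that the constant in (4.11) is $8\alpha(n-1)$ for this profile.

```python
from fractions import Fraction

# Polynomials in x are stored as coefficient lists, lowest degree first.

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    size = max(len(a), len(b))
    a = a + [Fraction(0)] * (size - len(a))
    b = b + [Fraction(0)] * (size - len(b))
    return [p + q for p, q in zip(a, b)]

def poly_scale(a, s):
    return [s * p for p in a]

def poly_diff(a):
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def poly_trim(a):
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

X = [Fraction(0), Fraction(1)]  # the monomial x

def verify_profile(n, alpha):
    """Return (f, R, lhs of (4.11)) for U = 1 - alpha x^(n-2) (x-1)^2."""
    alpha = Fraction(alpha)
    x_pow = [Fraction(0)] * (n - 2) + [Fraction(1)]            # x^(n-2)
    x_minus_1_sq = [Fraction(1), Fraction(-2), Fraction(1)]    # (x-1)^2
    one_minus_U = poly_scale(poly_mul(x_pow, x_minus_1_sq), alpha)
    U = poly_add([Fraction(1)], poly_scale(one_minus_U, -1))
    # (4.9): f = n (1 - U) + x dU/dx
    f = poly_add(poly_scale(one_minus_U, n), poly_mul(X, poly_diff(U)))
    # (4.10): R = 4 (n-1) x f - 4 x^2 df/dx
    R = poly_add(
        poly_scale(poly_mul(X, f), 4 * (n - 1)),
        poly_scale(poly_mul(poly_mul(X, X), poly_diff(f)), -4),
    )
    # left-hand side of (4.11): 2 (n-1) f^2 + U dR/dx
    lhs = poly_add(poly_scale(poly_mul(f, f), 2 * (n - 1)),
                   poly_mul(U, poly_diff(R)))
    return poly_trim(f), poly_trim(R), poly_trim(lhs)
```

For example, `verify_profile(3, 7)` returns `R` with the single non-zero coefficient 56 on $x^2$, matching $8\alpha x^{n-1}$ at $n = 3$, $\alpha = 7$.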

The scalar $B$ and the two-form $F$ can be obtained from (3.18).

$^7$ For $n = 3$ and $n = 4$, see [3] for some discussion of other solutions.

It will be very convenient to employ the coordinate transformation

$$\phi = \frac{1}{n}(\psi - z) \tag{4.15}$$


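so that the metric can be written

$$\frac{1}{c^2}\, ds^2_{2n+1} = w\, Dz^2 + \frac{RU}{2n^2 w x}\, D\psi^2 + \frac{R}{8x^3 U}\, dx^2 + \frac{R}{2x}\, ds^2(KE^+_{2n-2}), \tag{4.16}$$

where $D\psi = d\psi + nC$, $Dz = dz + g\, D\psi$ with

$$w = \left(1 - \frac{f}{n}\right)^2 + \frac{RU}{2n^2 x}, \qquad g = \frac{1}{n^2 w} \left( n f - f^2 - \frac{RU}{2x} \right). \tag{4.17}$$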
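The passage from (4.13) to (4.16) is a completion of the square in the $(dz, D\psi)$ directions. As a sanity check, the following sketch treats $dz$ and $D\psi$ as formal commuting variables and compares the two quadratic forms exactly. It is an illustration under our own naming (`profile`, `fibre_form_before`, `fibre_form_after` are hypothetical helpers), using the profile (4.12), for which (4.9) gives $f = 2\alpha x^{n-2}(1-x)$.

```python
from fractions import Fraction

def profile(n, alpha, x):
    """U, f, R for the profile (4.12); f follows from (4.9), R is (4.14)."""
    U = 1 - alpha * x**(n - 2) * (x - 1)**2
    f = 2 * alpha * x**(n - 2) * (1 - x)
    R = 8 * alpha * x**(n - 1)
    return U, f, R

def w_and_g(n, x, U, f, R):
    """The functions w and g of (4.17)."""
    w = (1 - f / n)**2 + R * U / (2 * n**2 * x)
    g = (n * f - f**2 - R * U / (2 * x)) / (n**2 * w)
    return w, g

def fibre_form_before(n, x, U, f, R, dz, dpsi):
    """(dz + P)^2 + (R/2)(U/x) Dphi^2, with P = f Dphi and n Dphi = Dpsi - dz."""
    dphi = (dpsi - dz) / n
    return (dz + f * dphi)**2 + (R / 2) * (U / x) * dphi**2

def fibre_form_after(n, x, U, f, R, dz, dpsi):
    """w (dz + g Dpsi)^2 + RU/(2 n^2 w x) Dpsi^2, as in (4.16)."""
    w, g = w_and_g(n, x, U, f, R)
    return w * (dz + g * dpsi)**2 + R * U / (2 * n**2 * w * x) * dpsi**2
```

With exact rational inputs the two functions agree identically, which supports reading $D\psi$ as $d\psi + nC$ (since $n D\phi = d\psi - dz + nC$).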

We demand that $w > 0$, $R > 0$ and $U \geq 0$. We will achieve this by demanding that $x \in [x_1, x_2]$, where $x_i$ are two positive roots of $U(x)$. We now want to argue that this local metric, for countably infinite values of $\alpha \in (\alpha_0, \infty)$ and for any positively curved Kähler-Einstein manifold, globally extends to give a complete, compact metric on a $(2n+1)$-dimensional manifold. We will argue this in two steps. We first study the $2n$-dimensional metric transverse to the $z$ direction in (4.16) and then consider the $U(1)$ fibration, with fibre parametrised by the coordinate $z$, which we will take to be periodic with a suitably chosen period.

We start by analysing the $2n$-dimensional metric transverse to the $z$ direction in (4.16). As we have already mentioned we take $x \in [x_1, x_2]$ and we take $\psi$ to be a periodic coordinate with period $2\pi$. Then the above $2n$-dimensional metric extends to a smooth complete metric on the total space of an $S^2$ bundle over the original Kähler-Einstein space, $KE^+_{2n-2}$, and this is true for any value of $\alpha \in (\alpha_0, \infty)$. To see this, we observe that the two-sphere is parametrised by $\psi$, $x$. The key issue is to ensure that the metric has no conical singularities at the poles of the two-sphere which are located at $x = x_1, x_2$. A small calculation shows that since at any root $x_i$ of $U$ we have

$$\left. \frac{(U' x)^2}{w} \right|_{x = x_i} = n^2, \tag{4.18}$$

there will be no conical singularities if we take $\psi$ to have period $2\pi$. Let us call this $2n$-dimensional manifold $Y_{2n}$.

We now turn to the $(2n+1)$-dimensional metric. The idea is to choose $z$ to have period $2\pi l$, for suitable $l$, so that the metric is that of the total space of a $U(1)$ fibration over the globally defined $2n$-dimensional base manifold $Y_{2n}$, with connection one-form given by $l^{-1} g\, D\psi$. The key point here is to ensure that the periods of $(1/2\pi l)\, d(g\, D\psi)$ over a basis for the free part of the second homology group of $Y_{2n}$ are integers. This is an almost identical set up to that discussed in [11] and we refer to that paper for more details. The result, in order to obtain a simply connected manifold, is that we need to choose

$$\frac{g(x_2)}{g(x_1)} = \frac{p}{q}, \tag{4.19}$$

where $p$, $q$ are relatively prime integers. We also choose

$$l = \frac{h\, g(x_1)}{q}, \tag{4.20}$$

where $h$ is the highest common factor of $p - q$ and $q c_i$, where $c_i$ are the Chern numbers of the Kähler-Einstein manifold $KE^+_{2n-2}$. Finally, we will show below that as $\alpha$ ranges from $\alpha_0$ to $\infty$, $g(x_2)/g(x_1)$ monotonically increases from 0 to 1. Hence there will be countably infinite values of $\alpha$ that satisfy (4.19) and hence countably infinite metrics on complete $(2n+1)$-dimensional manifolds. To examine the behaviour of $g(x_2)/g(x_1)$ as a function of $\alpha$ we first note that using $U(x_i) = 0$ we obtain the simple expression

$$g(x_i) = -\frac{2}{n(x_i - 1) + 2}. \tag{4.21}$$
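To make the quantisation condition (4.19) concrete, the sketch below locates the two roots $x_1 < 1 < x_2$ of (4.12) by bisection, evaluates the ratio $g(x_2)/g(x_1)$ from (4.21), and then bisects in $\alpha$ for a given rational $p/q$. This is a numerical illustration only; the function names and the choice of bisection are ours, and the search relies on the monotonicity stated in the text.

```python
def U(x, n, alpha):
    """The profile (4.12)."""
    return 1 - alpha * x**(n - 2) * (x - 1)**2

def bisect(func, lo, hi):
    """Root of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def roots(n, alpha):
    """Roots with (n-2)/n < x1 < 1 < x2; requires alpha > alpha_0."""
    x1 = bisect(lambda x: U(x, n, alpha), (n - 2) / n, 1.0)
    hi = 2.0
    while U(hi, n, alpha) > 0:   # enlarge the bracket until U is negative
        hi *= 2
    x2 = bisect(lambda x: U(x, n, alpha), 1.0, hi)
    return x1, x2

def g_at_root(x, n):
    """(4.21), valid only at a root of U."""
    return -2.0 / (n * (x - 1) + 2)

def ratio(n, alpha):
    x1, x2 = roots(n, alpha)
    return g_at_root(x2, n) / g_at_root(x1, n)

def alpha_for_ratio(n, p, q):
    """Value of alpha > alpha_0 with g(x2)/g(x1) = p/q, for 0 < p/q < 1."""
    alpha0 = n**n / (4 * (n - 2)**(n - 2))
    target = p / q
    lo, hi = alpha0 * (1 + 1e-9), 2 * alpha0
    while ratio(n, hi) < target:
        hi *= 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(n, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `alpha_for_ratio(3, 1, 2)` returns the value of $\alpha$ above $\alpha_0 = 27/4$ for which $p/q = 1/2$ in the $n = 3$ case.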

To proceed, we recall that the turning points of $U(x)$ are at $x = (n-2)/n$ and at $x = 1$, and hence $(n-2)/n < x_1 < 1 < x_2$. We thus conclude that $g(x_i)$ is negative for both $x_1$ and $x_2$. We now observe that

$$\frac{d}{d\alpha}\, g(x_i) = \frac{2n}{(n(x_i - 1) + 2)^2}\, \frac{dx_i}{d\alpha}. \tag{4.22}$$

Next, since $U(x_i) = 0$, we have

$$\alpha = \frac{1}{x_i^{n-2} (x_i - 1)^2} \tag{4.23}$$

and we can compute

$$\frac{dx_i}{d\alpha} = -\frac{x_i^{n-1} (x_i - 1)^3}{n(x_i - 1) + 2}. \tag{4.24}$$

This is negative for $x_2$, and positive for $x_1$, and hence we deduce the required monotonicity property of $g(x_2)/g(x_1)$.

5. An Ansatz for the Geometry in $2n+2$ Dimensions

Geometries in $2n+1$ dimensions that can be constructed from a $2n$-dimensional Kähler manifold consisting of a product of Kähler-Einstein manifolds satisfying (3.19) were studied in [3]. As we have discussed these give rise to conical geometries in $2n+2$ dimensions with an $SU(n+1)$ structure satisfying (1.3), (1.4). In this section we will consider an ansatz for the geometry in $2n+2$ dimensions that generalises these conical geometries. We will derive the ordinary differential equations that need to be solved and while we have not been able to find the most general solution, we do recover some known solutions. Our metric ansatz is given by

$$ds^2_{2n+2} = \alpha^2 dr^2 + \beta^2 (dz + P)^2 + \sum_{i=1}^{n} \gamma_i^2\, ds_i^2(KE_2), \tag{5.1}$$

where $ds_i^2(KE_2)$ denotes the metric of the $i$th two-dimensional Kähler-Einstein space. We will let $J_i$, $\Omega_i$ and $\mathcal{R}_i$ denote the Kähler-form, the $(1,0)$-form and the Ricci-form of the corresponding Kähler-Einstein space, respectively. We also have $P = \sum_i P_i$, where

$$dP_i = \mathcal{R}_i = l_i J_i \tag{5.2}$$


and $l_i$ are constant. Note that if we set some of the $l_i$'s to the same value, and also set the corresponding $\gamma_i$'s to be equal, one can also replace the relevant product of the two-dimensional Kähler-Einstein spaces with higher-dimensional Kähler-Einstein spaces. The $SU(n+1)$ structure $J$, $\Omega$ is given by

$$J = -\alpha\beta\, dr \wedge (dz + P) + \sum_i \gamma_i^2 J_i, \tag{5.3}$$

$$\Omega = [\alpha\, dr - i\beta (dz + P)] \wedge \bigwedge_i \gamma_i \Omega_i. \tag{5.4}$$

We will demand that the $SU(n+1)$ structure satisfies (1.3), (1.4) which we write again here:

$$d(e^{n\phi}\, \Omega) = 0, \tag{5.5}$$

$$d(e^{2(n-1)\phi} J^n) = 0, \tag{5.6}$$

$$d\left[ e^{2(n-3)\phi} *_{2n+2} d(e^{2\phi} J) \right] = 0. \tag{5.7}$$

If we set $e^\phi = \lambda$ and assume that $\alpha$, $\beta$, $\gamma_i$ and $\lambda$ are all functions of $r$, we obtain the following set of coupled nonlinear ordinary differential equations:

$$\alpha\gamma\lambda^n + (\beta\gamma\lambda^n)' = 0, \tag{5.8}$$

$$\alpha\beta\gamma^2\lambda^{2(n-1)} \sum_i \frac{l_i}{\gamma_i^2} + (\gamma^2 \lambda^{2(n-1)})' = 0, \tag{5.9}$$

$$\frac{\beta\gamma^2\lambda^{2(n-3)}}{\alpha\gamma_i^4} \left[ \alpha\beta\lambda^2 l_i + (\gamma_i^2 \lambda^2)' \right] = k_i, \tag{5.10}$$

$$\sum_i k_i l_i = 0, \tag{5.11}$$

where $k_i$ are constants and $\gamma = \prod_i \gamma_i$. To recover the simple metric cone geometries whose $2n$-dimensional base geometries were discussed in [3], we start with

$$\alpha = 1, \qquad \beta = \gamma_i = cr, \qquad \lambda = r^k. \tag{5.12}$$

The above equations then give

$$c = \frac{n-2}{2}, \qquad k = -\frac{n-1}{n-2}, \qquad k_i = c^{2(n-1)} (l_i - 1) \tag{5.13}$$

and

$$\sum_i l_i = \sum_i l_i^2 = 1. \tag{5.14}$$

The base of this metric cone is the $(2n+1)$-dimensional geometry built from a $2n$-dimensional Kähler base consisting of a product of Kähler-Einstein spaces. The last equation, which up to a scaling was noted in [3], is the algebraic form of the master equation (3.19) for the Kähler base consisting of a product of Kähler-Einstein spaces. Let us now construct some other solutions for $n = 3$ and $n = 4$ which give rise to supersymmetric solutions of type IIB supergravity and $D = 11$ supergravity via (2.12)


and (2.13), respectively. We first consider the $n = 3$ case. The solution describing the $AdS_3$ limit of D3-branes wrapped on $H^2/\Gamma$, a Riemann surface with genus $g > 1$, in a Calabi-Yau four-fold [13,14,29,30], which was discussed from the present point of view in [3], can be recovered by setting

$$l_1 = -1/3, \quad l_2 = l_3 = 2/3, \quad k_1 = -1/12, \quad k_2 = k_3 = -1/48, \tag{5.15}$$

and

$$\alpha = 1, \qquad \beta = \gamma_i = r/2, \qquad \lambda = 1/r^2. \tag{5.16}$$
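The parameter values in (5.15) and (5.16) follow from the cone relations (5.13) and (5.14). The short sketch below (our own illustration, with hypothetical helper names) evaluates those relations with exact rational arithmetic, so the quoted constants can be compared directly; the same helper also applies to the $n = 4$ values $l_1 = -1/2$, $l_2 = l_3 = l_4 = 1/2$ used later in this section.

```python
from fractions import Fraction as F

def cone_constants(n, l):
    """c, k and the k_i of (5.13) for given l_i."""
    c = F(n - 2, 2)
    k = F(-(n - 1), n - 2)
    k_i = [c**(2 * (n - 1)) * (l_j - 1) for l_j in l]
    return c, k, k_i

def satisfies_5_14(l):
    """The algebraic condition (5.14): sum l_i = sum l_i^2 = 1."""
    return sum(l) == 1 and sum(l_j * l_j for l_j in l) == 1

def satisfies_5_11(l, k_i):
    """The constraint (5.11): sum k_i l_i = 0."""
    return sum(a * b for a, b in zip(k_i, l)) == 0
```

For $n = 3$ this gives $c = 1/2$ and $k = -2$, consistent with $\beta = \gamma_i = r/2$ and $\lambda = 1/r^2$ in (5.16).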

Since $l_2 = l_3$ and $\gamma_2 = \gamma_3$ we can replace the corresponding $KE_2 \times KE_2$ with a $KE_4$ space. For simplicity we will just discuss the case when $KE_4 = CP^2$. We would like to know whether there exists a more general solution to the above coupled nonlinear differential equations, with the same parameters $l_i$, $k_i$. One can indeed find that

$$\alpha^2 = \left(1 - \frac{m}{r^{4/3}}\right)^{-1}, \tag{5.17}$$

$$\beta^2 = \frac{r^2}{4} \left(1 - \frac{m}{r^{4/3}}\right), \tag{5.18}$$

$$\gamma_1^2 = \frac{r^2}{4}, \tag{5.19}$$

$$\gamma_2^2 = \gamma_3^2 = \frac{r^2}{4} \left(1 - \frac{m}{r^{4/3}}\right), \tag{5.20}$$

$$\lambda^{-1} = r^2 \left(1 - \frac{m}{r^{4/3}}\right), \tag{5.21}$$

satisfies the equations, where $m$ is a constant. Using (2.12) we find that the ten-dimensional type IIB metric takes the form

$$ds^2 = \frac{1}{r^2 \left(1 - \frac{m}{r^{4/3}}\right)}\, ds^2(\mathbb{R}^{1,1}) + \frac{1}{4\left(1 - \frac{m}{r^{4/3}}\right)}\, ds^2(H^2/\Gamma) + \frac{dr^2}{r^2 \left(1 - \frac{m}{r^{4/3}}\right)^2} + \frac{1}{4} \left[ (dz + P)^2 + ds^2(CP^2) \right]. \tag{5.22}$$
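The claim that (5.17)–(5.21) solve the system (5.8)–(5.10) can be tested numerically. The sketch below is an illustration rather than a proof: it approximates the $r$-derivatives by central differences and returns the residuals of the three differential equations at a chosen radius. The function names are ours, and the second helper encodes the analogous $n = 4$ solution (5.27)–(5.31) with $H = 1 - m/r$.

```python
import math

def residuals(n, l, k, alpha, beta, gammas, lam, r, h=1e-5):
    """Residuals of (5.8), (5.9) and (5.10) at radius r (central differences)."""
    def gamma_prod(s):
        return math.prod(g(s) for g in gammas)

    def ddr(func):
        return (func(r + h) - func(r - h)) / (2 * h)

    a, b, G, L = alpha(r), beta(r), gamma_prod(r), lam(r)
    e8 = a * G * L**n + ddr(lambda s: beta(s) * gamma_prod(s) * lam(s)**n)
    e9 = (a * b * G**2 * L**(2 * (n - 1))
          * sum(l_i / g(r)**2 for l_i, g in zip(l, gammas))
          + ddr(lambda s: gamma_prod(s)**2 * lam(s)**(2 * (n - 1))))
    e10 = [
        b * G**2 * L**(2 * (n - 3)) / (a * g(r)**4)
        * (a * b * L**2 * l_i + ddr(lambda s, g=g: g(s)**2 * lam(s)**2)) - k_i
        for l_i, k_i, g in zip(l, k, gammas)
    ]
    return e8, e9, e10

def iib_solution(m):
    """(5.17)-(5.21) for n = 3, with H(r) = 1 - m / r^(4/3)."""
    H = lambda r: 1 - m / r**(4 / 3)
    alpha = lambda r: H(r)**-0.5
    beta = lambda r: 0.5 * r * math.sqrt(H(r))
    g1 = lambda r: 0.5 * r
    lam = lambda r: 1 / (r**2 * H(r))
    return alpha, beta, [g1, beta, beta], lam   # gamma_2 = gamma_3 = beta

def m2_solution(m):
    """(5.27)-(5.31) for n = 4, with H(r) = 1 - m / r."""
    H = lambda r: 1 - m / r
    alpha = lambda r: H(r)**-0.5
    beta = lambda r: r * math.sqrt(H(r))
    g1 = lambda r: r
    lam = lambda r: (r**2 * H(r))**-0.75
    return alpha, beta, [g1, beta, beta, beta], lam
```

Evaluating `residuals(3, [-1/3, 2/3, 2/3], [-1/12, -1/48, -1/48], *iib_solution(0.5), r=2.0)` gives residuals that vanish to within the finite-difference error.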

If we take $m > 0$, $m^{3/4} < r < \infty$ we essentially have the solution of [13,14]. Note that, as discussed in Sect. 6.1 of [30], we can choose the range of $z$ to be $6\pi$ if $g - 1$ is divisible by three and $2\pi$ otherwise. In the former case, the solution interpolates between a locally $AdS_5 \times S^5$ region and the $AdS_3 \times H^2/\Gamma \times S^5$ solution$^8$ given above in (5.16). In the latter case the $S^5$ is replaced with $S^5/\mathbb{Z}_3$. The above solution can be interpreted [14] as describing the near horizon limit of D3-branes wrapping a holomorphic $H^2/\Gamma$ inside a Calabi-Yau four-fold ($CY_4$): via the AdS/CFT correspondence it describes a renormalisation group flow “across dimensions” [14]. It would be interesting if we could find a solution that interpolated from an asymptotic $CY_4$ region to this solution or perhaps just to the $AdS_3 \times H^2/\Gamma$ solution (5.16).

$^8$ Note that the $S^5$ is still fibred over the $H^2/\Gamma$.


We have not been able to construct such a solution, but we observe that our ansatz does include the following Calabi-Yau four-fold metric:

$$ds^2 = \frac{1}{U}\, dr^2 + \frac{9}{16}\, U r^2 (dz + P_1 + P_2)^2 + l_1 \frac{3}{8} (r^2 + c)\, ds_1^2(KE_2) + \frac{9}{4}\, r^2 ds_2^2(CP^2), \tag{5.23}$$

where

$$U = \frac{3r^2 + 4c}{9(r^2 + c)}, \tag{5.24}$$

we have normalised $ds^2(KE_2)$ so that $l_1 = \pm 1$, and we have taken $KE_4$ to be $CP^2$ with $l_2 = 6$. When $l_1 = 1$, $KE_2$ is a unit radius $S^2$. Choosing $c > 0$ and $0 \leq r < \infty$, we have the metric of [12] (Eq. (5.34)). In particular the range of $z$ is $2\pi$ and as $r \to \infty$ the metric is asymptotically a cone over a regular Sasaki-Einstein space, while at $r = 0$ we have an $S^2$ bolt: note that since the period of $z$ is $2\pi$ at $r = 0$, the $CP^2$ and the $z$ fibre combine to give $S^5/\mathbb{Z}_3$ and so there is a conical singularity. On the other hand when $l_1 = -1$, the case of more relevance here, we choose $KE_2 = H^2/\Gamma$, again a Riemann surface with genus $g > 1$. We also take $c < 0$ and $0 \leq r^2 < -c$. We can choose the range of $z$ to be $6\pi$ if $g - 1$ is divisible by three and $2\pi$ otherwise. In the former case, there is no conical singularity at the $H^2/\Gamma$ bolt at $r = 0$, while in the latter case there is. Note that this metric is singular as $r^2 \to -c$. This metric provides a natural local model of a holomorphic $H^2/\Gamma$ in a $CY_4$ on which D3-branes can wrap.

Now let us consider the M-theory case with $n = 4$, which is very similar. Setting

$$l_1 = -1/2, \quad l_2 = l_3 = l_4 = 1/2, \quad k_1 = -3/2, \quad k_2 = k_3 = k_4 = -1/2, \tag{5.25}$$

we obtain the solution

$$\alpha = 1, \qquad \beta = \gamma_i = r, \qquad \lambda = r^{-3/2}, \tag{5.26}$$

which corresponds to the $AdS_2$ limit of M2-branes wrapping a holomorphic $H^2/\Gamma$, again a Riemann surface of genus $g > 1$, in a Calabi-Yau five-fold [15,16]. We can also find a more general solution, with

$$\alpha^2 = \frac{1}{1 - \frac{m}{r}}, \tag{5.27}$$

$$\beta^2 = r^2 \left(1 - \frac{m}{r}\right), \tag{5.28}$$

$$\gamma_1^2 = r^2, \tag{5.29}$$

$$\gamma_2^2 = \gamma_3^2 = \gamma_4^2 = r^2 \left(1 - \frac{m}{r}\right), \tag{5.30}$$

$$\lambda^{-4/3} = r^2 \left(1 - \frac{m}{r}\right). \tag{5.31}$$

The corresponding $D = 11$ metric can be easily constructed from (2.13) and is given by

$$ds^2 = -\frac{1}{r^2 (1 - m/r)}\, dt^2 + \frac{1}{1 - m/r}\, ds^2(H^2/\Gamma) + \frac{dr^2}{r^2 (1 - m/r)^2} \tag{5.32}$$

$$\qquad + (dz + P)^2 + ds^2(CP^3), \tag{5.33}$$


where, for simplicity, we have restricted attention to the case of $KE_6 = CP^3$. If we take $m > 0$, $m < r < \infty$ we essentially have the solution of [15,16]. Note that we can choose the range of $z$ to be $8\pi$ if $g$ is odd and $4\pi$ if $g$ is even. In the former case, the solution interpolates between a locally $AdS_4 \times S^7$ region and the $AdS_2 \times H^2/\Gamma \times S^7$ solution given above (5.26). In the latter case the $S^7$ is replaced with $S^7/\mathbb{Z}_2$. The above solution [16] can be interpreted as describing the near horizon limit of M2-branes wrapping a holomorphic $H^2/\Gamma$ inside a Calabi-Yau five-fold ($CY_5$): via the AdS/CFT correspondence it describes a renormalisation group flow “across dimensions”. Again it would be interesting if we could find a solution that interpolated from an asymptotic $CY_5$ region to this solution or perhaps just to the $AdS_2 \times H^2/\Gamma$ solution (5.26). We have not been able to construct such a solution, but we observe that our ansatz does include the following Calabi-Yau five-fold metric:

$$ds^2 = \frac{80}{U}\, dr^2 + \frac{U}{5}\, r^2 (dz + P_1 + P_2)^2 + l_1\, 2(r^2 + c)\, ds_1^2(H^2/\Gamma) + 16 r^2 ds_2^2(CP^3), \tag{5.34}$$

where

$$U = \frac{4r^2 + 5c}{r^2 + c}, \tag{5.35}$$

we have normalised $ds^2(KE_2)$ so that $l_1 = \pm 1$, and we have taken $KE_6$ to be $CP^3$ with $l_2 = 8$. When $l_1 = 1$, $KE_2$ is a unit radius $S^2$. Choosing $c > 0$ and $0 \leq r < \infty$ we have the metric in the general class of [12]. The range of $z$ is $4\pi$ and as $r \to \infty$ the metric is asymptotically a cone over a regular Sasaki-Einstein space, while at $r = 0$ we have an $S^2$ bolt: note that since the period of $z$ is $4\pi$ at $r = 0$ the $CP^3$ and the $z$ fibre combine to give $S^7/\mathbb{Z}_2$ and so there is a conical singularity. On the other hand when $l_1 = -1$, the case of more relevance here, we take $KE_2 = H^2/\Gamma$, $c < 0$ and $0 \leq r^2 < -c$. We can now choose the range of $z$ to be $8\pi$ if $g$ is odd and $4\pi$ if $g$ is even. In the former case, there is no conical singularity at the $H^2/\Gamma$ bolt at $r = 0$, while in the latter case there is. Note that this metric is singular as $r^2 \to -c$, but nevertheless provides a good local model of a holomorphic $H^2/\Gamma$ embedded in a $CY_5$, on which membranes can wrap.

6. LLM Inspired Ansatz

In this section we consider an ansatz that is motivated by the results of Lin, Lunin and Maldacena (LLM) [17]. It was shown in [1] and [2] how one can recast the results of LLM in terms of a local Kähler geometry in $2n$ dimensions satisfying (1.1), for $n = 3$ and $n = 4$. Here we extend this by constructing an ansatz for general $n$. We start with the following ansatz for the local $2n$-dimensional Kähler metric:

$$ds^2 = \frac{dy^2}{U} + y^2 U (D\psi)^2 + \frac{f}{U} (dx_1^2 + dx_2^2) + y^2 ds^2(KE_{2n-4}), \tag{6.1}$$

where $ds^2(KE_{2n-4})$ is a Kähler-Einstein metric (possibly local), $D\psi = d\psi + \sigma + V$, $\sigma$ is a one-form on the Kähler-Einstein space, $V$ is a one-form on the two-dimensional space spanned by $x_i$, $i = 1, 2$, and $U$, $f$, $V$ all depend on the three coordinates $y$, $x_i$. We take the Kähler form to be given by

$$J_T = y\, dy \wedge D\psi + \frac{f}{U}\, dx_1 \wedge dx_2 + y^2 J_{KE}, \tag{6.2}$$


where $J_{KE}$ is the Kähler form on the Kähler-Einstein space. Then, demanding that $dJ_T = 0$ we obtain the following conditions:

$$d\sigma = 2J_{KE}, \tag{6.3}$$

$$d_2 V = \frac{1}{y}\, \partial_y \left( \frac{f}{U} \right) dx_1 \wedge dx_2, \tag{6.4}$$

where $d_2 \equiv dx^i \wedge \partial_i$. In order to see if the complex structure is indeed integrable we need to compute the derivative of the $(n, 0)$-form $\Omega_T$ given by

$$\Omega_T = e^{ik\psi} \left( \frac{dy}{\sqrt{U}} + i y \sqrt{U}\, D\psi \right) \wedge \sqrt{\frac{f}{U}}\, (dx_1 + i\, dx_2) \wedge y^{n-2}\, \Omega_{KE}, \tag{6.5}$$

where $\Omega_{KE}$ is the $(n-2, 0)$-form on $KE_{2n-4}$ which satisfies

$$d\Omega_{KE} = ik\sigma \wedge \Omega_{KE}. \tag{6.6}$$

The constant $k$ determines the normalisation of the $KE_{2n-4}$ space. In particular, we have $\mathcal{R}_{KE} = 2k J_{KE}$, where $\mathcal{R}_{KE}$ is the Ricci-form of the KE space. We now find that

$$d\Omega_T = iP \wedge \Omega_T, \tag{6.7}$$

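with the Ricci potential given by

$$P = \frac{1}{2} *_2 d_2 (\ln f) - \frac{y^{2-n} U}{\sqrt{f}}\, \partial_y \left( y^{n-1} \sqrt{f} \right) D\psi + k (d\psi + \sigma), \tag{6.8}$$

provided that we impose

$$\partial_y V = \frac{1}{y} *_2 d_2 \left( \frac{1}{U} \right). \tag{6.9}$$

The compatibility of (6.4) and (6.9) leads to the following equation:

$$\Delta \left( \frac{1}{U} \right) + y\, \partial_y \left[ \frac{1}{y}\, \partial_y \left( \frac{f}{U} \right) \right] = 0, \tag{6.10}$$

where $\Delta = \partial_1^2 + \partial_2^2$. Having obtained the Ricci potential, the next step would be to compute the Ricci tensor and see how the master equation (1.1) can be satisfied. To simplify things, we first introduce a function $D$ defined via

$$\frac{1}{U} = \frac{y}{2}\, \partial_y D. \tag{6.11}$$

We can now readily integrate (6.9) to get

$$V_i = \frac{1}{2}\, \epsilon_{ij}\, \partial_j D, \qquad i, j = 1, 2. \tag{6.12}$$

In terms of $D$, (6.4) is now expressed as

$$\Delta D + \frac{1}{y}\, \partial_y \left( f\, y\, \partial_y D \right) = 0. \tag{6.13}$$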


Furthermore, based on hints from [17] in the $n = 3, 4$ case, we now make the assumption that there exists a relation

$$f = y^{2p} e^{qD}, \tag{6.14}$$

which makes the master equation (1.1) identically satisfied. Here $p$, $q$ are constants that are to be fixed in terms of the other parameters $n$, $k$ that we have already introduced. Noting that now

$$\frac{1}{2} *_2 d_2 \ln f = qV, \tag{6.15}$$

we can rewrite the Ricci potential $P$ as

$$P = [k - q - (n + p - 1) U]\, D\psi + (q - k) V. \tag{6.16}$$
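For readers who want to see how (6.16) arises, the intermediate steps can be written out as follows. This is our own working, not text from the original; it uses only (6.8), (6.11), (6.14), (6.15) and the definition $D\psi = d\psi + \sigma + V$.

$$\begin{aligned}
\sqrt{f} &= y^{p} e^{qD/2}, \\
\frac{y^{2-n} U}{\sqrt{f}}\, \partial_y \left( y^{n-1} \sqrt{f} \right) &= U \left[ (n + p - 1) + \frac{q}{2}\, y\, \partial_y D \right] = (n + p - 1) U + q, \\
k (d\psi + \sigma) &= k (D\psi - V), \\
P &= qV - \left[ (n + p - 1) U + q \right] D\psi + k (D\psi - V) \\
  &= [k - q - (n + p - 1) U]\, D\psi + (q - k) V.
\end{aligned}$$

The second line uses (6.11) in the form $\frac{y}{2}\, \partial_y D = 1/U$, and the first term of $P$ is replaced using (6.15).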

It is now straightforward to obtain the Ricci-form $dP$ and from that the Ricci scalar which takes the simple form

$$R = \frac{4}{y^2} \left[ (k - q)(n - 2) - q(n + p - 1) - (n + p - 1)(n + p - 2) U \right]. \tag{6.17}$$

After some computation one can now check that (1.1) is satisfied, if we demand that

$$p = 3 - n \tag{6.18}$$

and

$$(n - 2)(q - k)[q(n - 1) - k(n - 3)] = 0. \tag{6.19}$$

Let us first discuss the special case when $n = 3$. In this case we can solve the equations by taking $p = q = 0$. Then the only equation that needs to be solved is the linear equation

$$\Delta D + \frac{1}{y}\, \partial_y (y\, \partial_y D) = 0, \tag{6.20}$$

and we have recovered the equation of LLM for the type IIB case.$^9$ For generic values of $n \geq 3$, the second equation is satisfied if $q = k$. Since $k$ is related to the scalar curvature of the Kähler-Einstein base we can rescale it to $0, \pm 1$ without losing generality. Here let us assume that $k \neq 0$. Then the entire solution is governed by the equation (when $k = -1$ we redefine $D \to -D$)

$$\Delta D + \frac{1}{y}\, \partial_y \left( y^{7-2n}\, \partial_y e^D \right) = 0. \tag{6.21}$$

After the coordinate change $x = (2n - 6)^{2(3-n)/(n-2)}\, y^{2n-6}$, this equation becomes

$$\Delta D + x^{\frac{n-4}{n-3}}\, \partial_x^2 e^D = 0. \tag{6.22}$$
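The coordinate change can be verified in a few lines. The derivation below is our own check, valid for $n > 3$; the shorthand $a$ for the constant prefactor is introduced here for convenience and is not notation from the original.

$$\begin{aligned}
x &= a\, y^{2n-6}, \qquad \partial_y = a (2n - 6)\, y^{2n-7}\, \partial_x, \\
\frac{1}{y}\, \partial_y \left( y^{7-2n}\, \partial_y e^D \right)
  &= \frac{1}{y}\, \partial_y \left( a (2n - 6)\, \partial_x e^D \right)
   = a^2 (2n - 6)^2\, y^{2n-8}\, \partial_x^2 e^D, \\
y^{2n-8} &= \left( \frac{x}{a} \right)^{\frac{n-4}{n-3}}
  \;\Longrightarrow\;
  a^2 (2n - 6)^2\, y^{2n-8} = a^{\frac{n-2}{n-3}} (2n - 6)^2\, x^{\frac{n-4}{n-3}}.
\end{aligned}$$

Requiring the overall coefficient to equal one gives $a = (2n - 6)^{2(3-n)/(n-2)}$, which is the prefactor quoted in the text. For $n = 4$ the exponent of $x$ vanishes and the equation reduces to $\Delta D + \partial_x^2 e^D = 0$.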

When $n = 4$ this equation is the continuous Toda equation just as LLM discovered in the context of $D = 11$ supergravity [17]. We do not know whether or not (6.21) is also an integrable system for $n > 4$. But it is at least clear that (6.21) still enjoys the 2d conformal symmetry, i.e. the equation is invariant under the transformation

$$x_1 + i x_2 \to g(x_1 + i x_2), \qquad D \to D - \log |\partial g|^2 \tag{6.23}$$

for an analytic function $g$.

$^9$ In order to recover the LLM result, one should add in the extra coordinate $z$ to obtain a seven manifold, and also shift the $\psi$ coordinate $\psi \to \psi + \alpha z$ for some constant $\alpha$.


7. Conclusions

In this paper we have introduced a new class of geometries in $2n+2$ dimensions that are specified by a metric, a scalar and a three-form. The geometries admit a specific kind of Killing spinor or, equivalently, a specific kind of $SU(n+1)$ structure. For $n = 3$ and $n = 4$ these give rise to supersymmetric solutions of type IIB and $D = 11$ supergravity, with $\mathbb{R}^{1,1}$ and $\mathbb{R}$ factors, respectively. We also showed that if these geometries in $2n+2$ dimensions are a certain kind of metric cone, then we obtain a new class of metric contact geometries in $2n+1$ dimensions, on the base of the cone, that are specified by a metric, a scalar and a two-form. For $n = 3$ and $n = 4$ these give rise to supersymmetric solutions of type IIB and $D = 11$ supergravity, with $AdS_3$ and $AdS_2$ factors, that were discussed in [1] and [2], respectively. We have noted the strong similarities with Ricci-flat Kähler cones and Sasaki-Einstein manifolds.

We also constructed some specific examples of these geometries in Sects. 4-6. The constructions in Sect. 4 can be straightforwardly extended by generalising the construction of Sect. 5 of [3]. It should also be possible to extend this construction further using the results of [31,27]. In Sects. 5 and 6 of this paper our constructions boiled down to solving some differential equations and we think it would be worthwhile to try and find additional solutions.

Acknowledgements. We would like to thank Jaume Gomis, Jan Gutowski, Oisin Mac Conamhna and Daniel Waldram for helpful discussions. We thank the Perimeter Institute, where this work was completed, for hospitality. NK would also like to thank the Institute for Mathematical Sciences at Imperial College for hospitality. NK is supported by the Science Research Center Program of the Korea Science and Engineering Foundation (KOSEF) through the Center for Quantum Spacetime (CQUeST) of Sogang University with grant number R11-2005-021, and by the Korean Research Foundation (KRF) with grant number KRF-2004-042-C00023. JPG is supported by an EPSRC Senior Fellowship and a Royal Society Wolfson Award.

References

1. Kim, N.: AdS(3) solutions of IIB supergravity from D3-branes. JHEP 0601, 094 (2006)
2. Kim, N., Park, J.D.: Comments on AdS(2) solutions of D = 11 supergravity. JHEP 0609, 041 (2006)
3. Gauntlett, J.P., Kim, N., Waldram, D.: Supersymmetric AdS(3), AdS(2) and bubble solutions. JHEP 0704, 005 (2007)
4. Cariglia, M., Mac Conamhna, O.A.P.: The general form of supersymmetric solutions of N = (1,0) U(1) and SU(2) gauged supergravities in six dimensions. Class. Quant. Grav. 21, 3171 (2004)
5. Branson, T.: Sharp inequalities, the functional determinant, and the complementary series. Trans. AMS 347, 3671–3742 (1995)
6. Mac Conamhna, O.A.P., Colgain, E.O: Supersymmetric wrapped membranes, AdS(2) spaces, and bubbling geometries. JHEP 0703, 115 (2007)
7. Intriligator, K., Wecht, B.: The exact superconformal R-symmetry maximizes a. Nucl. Phys. B 667, 183 (2003)
8. Martelli, D., Sparks, J., Yau, S.-T.: The geometric dual of a-maximisation for toric Sasaki-Einstein manifolds. Commun. Math. Phys. 268, 39 (2006)
9. Martelli, D., Sparks, J., Yau, S.-T.: Sasaki-Einstein manifolds and volume minimisation. Commun. Math. Phys. 280, 611–673 (2008)
10. Gauntlett, J.P., Martelli, D., Sparks, J., Waldram, D.: Sasaki-Einstein metrics on S(2) x S(3). Adv. Theor. Math. Phys. 8, 711 (2004)
11. Gauntlett, J.P., Martelli, D., Sparks, J.F., Waldram, D.: A new infinite class of Sasaki-Einstein manifolds. Adv. Theor. Math. Phys. 8, 987 (2006)
12. Cvetic, M., Gibbons, G.W., Lu, H., Pope, C.N.: Ricci-flat metrics, harmonic forms and brane resolutions. Commun. Math. Phys. 232, 457 (2003)
13. Klemm, D., Sabra, W.A.: Supersymmetry of black strings in D = 5 gauged supergravities. Phys. Rev. D 62, 024003 (2000)
14. Maldacena, J.M., Nunez, C.: Supergravity description of field theories on curved manifolds and a no go theorem. Int. J. Mod. Phys. A 16, 822 (2001)
15. Caldarelli, M.M., Klemm, D.: Supersymmetry of anti-de Sitter black holes. Nucl. Phys. B 545, 434 (1999)
16. Gauntlett, J.P., Kim, N., Pakis, S., Waldram, D.: Membranes wrapped on holomorphic curves. Phys. Rev. D 65, 026003 (2002)
17. Lin, H., Lunin, O., Maldacena, J.M.: Bubbling AdS space and 1/2 BPS geometries. JHEP 0410, 025 (2004)
18. Gauntlett, J.P., Pakis, S.: The geometry of D = 11 Killing spinors. JHEP 0304, 039 (2003)
19. Gauntlett, J.P., Martelli, D., Waldram, D.: Superstrings with intrinsic torsion. Phys. Rev. D 69, 086002 (2004)
20. Gauntlett, J.P., Mac Conamhna, O.A.P.: AdS spacetimes from wrapped D3-branes. Class. Quant. Grav. 24, 6267 (2007)
21. Gran, U., Gutowski, J., Papadopoulos, G.: IIB backgrounds with five-form flux. http://arxiv.org/abs/ (2007)
22. Witten, E.: On flux quantization in M-theory and the effective action. J. Geom. Phys. 22, 1 (1997)
23. Strominger, A.: Superstrings with Torsion. Nucl. Phys. B 274, 253 (1986)
24. Hull, C.M.: Compactifications of the Heterotic Superstring. Phys. Lett. B 178, 357 (1986)
25. Gauntlett, J.P., Martelli, D., Pakis, S., Waldram, D.: G-structures and wrapped NS5-branes. Commun. Math. Phys. 247, 421 (2004)
26. Boyer, C.P., Galicki, K.: Sasakian Geometry, Hypersurface Singularities, and Einstein Metrics. Supplemento Ai Rendiconti del Circolo Matematico di Palermo Serie II. Suppl 75, 57–87 (2005)
27. Chen, B., et al.: Bubbling AdS and droplet descriptions of BPS geometries in IIB supergravity. JHEP 0710, 003 (2007)
28. Page, D.N., Pope, C.N.: Inhomogeneous Einstein Metrics On Complex Line Bundles. Class. Quant. Grav. 4, 213 (1987)
29. Naka, M.: Various wrapped branes from gauged supergravities. http://arxiv.org/list/hep-th/0206141, (2002)
30. Gauntlett, J.P., Mac Conamhna, O.A.P., Mateos, T., Waldram, D.: New supersymmetric AdS(3) solutions. Phys. Rev. D 74, 106007 (2006)
31. Chong, Z.W., Lu, H., Pope, C.N.: BPS geometries and AdS bubbles. Phys. Lett. B 614, 96 (2005)

Communicated by G.W. Gibbons

Commun. Math. Phys. 284, 919–930 (2008) Digital Object Identifier (DOI) 10.1007/s00220-008-0545-y


On the Regularity Criterion of Weak Solution for the 3D Viscous Magneto-Hydrodynamics Equations

Qionglei Chen1, Changxing Miao1, Zhifei Zhang2

1 Institute of Applied Physics and Computational Mathematics, P.O. Box 8009, Beijing 100088, P. R. China.
E-mail: [email protected]; [email protected]

2 School of Mathematical Science, Peking University, Beijing 100871, P. R. China.
E-mail: [email protected]

Received: 24 October 2007 / Accepted: 27 January 2008
Published online: 11 July 2008 – © Springer-Verlag 2008

Abstract: We improve and extend some known regularity criteria for weak solutions of the 3D viscous magneto-hydrodynamics equations by means of the Fourier localization technique and Bony’s para-product decomposition.

1. Introduction

In this paper, we consider the 3D incompressible magneto-hydrodynamics (MHD) equations

\[
\text{(MHD)}\quad
\begin{cases}
\dfrac{\partial u}{\partial t} - \nu\Delta u + u\cdot\nabla u = -\nabla p - \dfrac{1}{2}\nabla b^2 + b\cdot\nabla b,\\[4pt]
\dfrac{\partial b}{\partial t} - \eta\Delta b + u\cdot\nabla b = b\cdot\nabla u,\\[4pt]
\nabla\cdot u = \nabla\cdot b = 0,\\[2pt]
u(0,x) = u_0(x),\quad b(0,x) = b_0(x).
\end{cases}
\tag{1.1}
\]

Here $u$, $b$ describe the flow velocity vector and the magnetic field vector respectively, $p$ is a scalar pressure, $\nu > 0$ is the kinematic viscosity, $\eta > 0$ is the magnetic diffusivity, while $u_0$ and $b_0$ are the given initial velocity and initial magnetic field with $\nabla\cdot u_0 = \nabla\cdot b_0 = 0$. If $\nu = \eta = 0$, (1.1) is called the ideal MHD equations.

As for the 3D Navier-Stokes equations, the regularity of weak solutions for the 3D MHD equations remains open [17]. For the 3D Navier-Stokes equations, the Serrin-type criterion states that a Leray-Hopf weak solution $u$ is regular, provided one of the following conditions holds [1,9,11,16,19]:

\[
u \in L^q(0,T;L^p(\mathbb{R}^3)), \quad\text{for}\quad \frac{2}{q} + \frac{3}{p} \le 1,\quad 3 \le p \le \infty,
\tag{1.2}
\]

or

\[
\nabla u \in L^q(0,T;L^p(\mathbb{R}^3)), \quad\text{for}\quad \frac{2}{q} + \frac{3}{p} \le 2,\quad \frac{3}{2} < p \le \infty.
\tag{1.3}
\]

Recently, Chen and Zhang [3] have refined the above conditions as follows: if there exists a small $\varepsilon_0$ such that for any $t \in (0,T)$, $u$ satisfies

\[
\lim_{\varepsilon\to 0}\,\sup_{j}\, 2^{jsq}\int_{t-\varepsilon}^{t} \|\Delta_j u(\tau)\|_p^q\, d\tau \le \varepsilon_0,
\tag{1.4}
\]

with $\frac{2}{q} + \frac{3}{p} = 1 + s$, $\frac{3}{1+s} < p \le \infty$, $-1 < s \le 1$, and $(p,s) \neq (\infty,1)$, then $u$ is regular in $(0,T] \times \mathbb{R}^3$, where $\Delta_j$ denotes the frequency localization operator. For the marginal case ($p = 3$, $q = \infty$), Cheskidov and Shvydkoy [6] have refined (1.2) to

\[
u \in C([0,T]; B^{-1}_{\infty,\infty}).
\tag{1.5}
\]

Here $B^{-1}_{\infty,\infty}$ stands for the inhomogeneous Besov space; see Sect. 2 for the definitions.

Wu [20,21] extended some Serrin-type criteria for the Navier-Stokes equations to the MHD equations, imposing conditions on both the velocity field $u$ and the magnetic field $b$. However, some numerical experiments [15] seem to indicate that the velocity field plays a more important role than the magnetic field in the regularity theory of solutions to the MHD equations. Recently, He and Xin [12] and Zhou [24] have proved some regularity criteria for the MHD equations which do not impose any condition on the magnetic field $b$. Precisely, they showed that the weak solution remains smooth on $(0,T] \times \mathbb{R}^3$ if the velocity $u$ satisfies one of the following conditions:

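\[
u \in L^q(0,T;L^p(\mathbb{R}^3)), \qquad \frac{2}{q} + \frac{3}{p} \le 1,\quad 3 < p \le \infty;
\tag{1.6}
\]

\[
u \in C([0,T]; L^3(\mathbb{R}^3));
\tag{1.7}
\]

\[
\nabla u \in L^q(0,T;L^p(\mathbb{R}^3)), \qquad \frac{2}{q} + \frac{3}{p} \le 2,\quad \frac{3}{2} < p \le \infty.
\tag{1.8}
\]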

Meanwhile, inspired by the pioneering work of Constantin and Fefferman [7], where a regularity condition on the direction of the vorticity was used to describe the regularity criterion for the Navier-Stokes equations, He and Xin [12] showed that the weak solution remains smooth on $(0,T] \times \mathbb{R}^3$ if the vorticity of the velocity $w = \nabla \times u$ satisfies the following condition:

\[
|w(x+y,t) - w(x,t)| \le K\,|w(x+y,t)|\,|y|^{\frac{1}{2}} \quad\text{if}\quad |y| \le \rho,\ \ |w(x+y,t)| \ge \Omega,
\tag{1.9}
\]

for $t \in [0,T]$ and three positive constants $K$, $\rho$, $\Omega$. For the marginal case $p = \infty$ in (1.8), Chen, Miao and Zhang [4] proved a Beale-Kato-Majda criterion in terms of the vorticity of the velocity $u$ only, by means of the Littlewood-Paley decomposition. For the generalized MHD equations with fractional dissipative effect, Wu [22,23] established some regularity results in terms of the velocity only.

The purpose of this paper is to improve and extend some known regularity criteria for weak solutions of the MHD equations by means of the Fourier localization technique and Bony’s para-product decomposition [2,5]. Let us first recall the definition of a weak solution.


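Definition 1.1. The vector-valued function $(u,b)$ is called a weak solution of (1.1) on $(0,T) \times \mathbb{R}^3$ if it satisfies the following conditions:

(1) $(u,b) \in L^\infty(0,T;L^2(\mathbb{R}^3)) \cap L^2(0,T;H^1(\mathbb{R}^3))$;

(2) $\operatorname{div} u = \operatorname{div} b = 0$ in the sense of distributions;

(3) For any function $\psi(t,x) \in C_0^\infty((0,T)\times\mathbb{R}^3)$ with $\operatorname{div}\psi = 0$, there hold

\[
\int_0^T\!\!\int_{\mathbb{R}^3} \big\{ u\cdot\psi_t - \nu\nabla u\cdot\nabla\psi + \nabla\psi : (u\otimes u - b\otimes b) \big\}\, dx\,dt = 0,
\]

and

\[
\int_0^T\!\!\int_{\mathbb{R}^3} \big\{ b\cdot\psi_t - \eta\nabla b\cdot\nabla\psi + \nabla\psi : (u\otimes b - b\otimes u) \big\}\, dx\,dt = 0.
\]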

As for the Navier-Stokes equations, the global existence of weak solutions to the MHD equations can be proved by using Galerkin’s method and a compactness argument, see [8]. Now we state our main result as follows.

Theorem 1.1. Let $(u_0,b_0) \in L^2(\mathbb{R}^3)$ with $\nabla\cdot u_0 = \nabla\cdot b_0 = 0$. Assume that $(u,b)$ is a weak solution to (1.1) on $(0,T)\times\mathbb{R}^3$ with $0 \le T \le \infty$. If the velocity $u(t)$ satisfies

\[
u(t) \in L^q(0,T; B^s_{p,\infty}),
\tag{1.10}
\]

with $\frac{2}{q} + \frac{3}{p} = 1 + s$, $\frac{3}{1+s} < p \le \infty$, $-1 < s \le 1$, and $(p,s) \neq (\infty,1)$, then the solution $(u,b)$ is regular on $(0,T]\times\mathbb{R}^3$.
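The relation $\frac{2}{q}+\frac{3}{p}=1+s$ in (1.10) can be read as a scaling condition. The following sketch records the computation; it is only an illustration and assumes the homogeneous norm $\dot B^{s}_{p,\infty}$ (for which the dilation identity holds up to constants) and a rescaling parameter $\lambda>0$, neither of which appears in the statement of Theorem 1.1.

```latex
% Scaling symmetry of (1.1) with fixed \nu, \eta: for \lambda > 0,
%   u_\lambda(t,x) = \lambda u(\lambda^2 t, \lambda x),
%   b_\lambda(t,x) = \lambda b(\lambda^2 t, \lambda x),
%   p_\lambda(t,x) = \lambda^2 p(\lambda^2 t, \lambda x)
% solve (1.1) on (0, T/\lambda^2) whenever (u, b, p) solves it on (0, T).
\begin{align*}
\|f(\lambda\,\cdot)\|_{\dot B^{s}_{p,\infty}}
  &\approx \lambda^{s-\frac{3}{p}}\,\|f\|_{\dot B^{s}_{p,\infty}},\\
\|u_\lambda\|_{L^q(0,T/\lambda^2;\,\dot B^{s}_{p,\infty})}
  &\approx \lambda^{1+s-\frac{3}{p}}
     \Big(\int_0^{T/\lambda^2}\|u(\lambda^2 t)\|_{\dot B^{s}_{p,\infty}}^q\,dt\Big)^{\frac{1}{q}}
   = \lambda^{1+s-\frac{3}{p}-\frac{2}{q}}\,
     \|u\|_{L^q(0,T;\,\dot B^{s}_{p,\infty})}.
\end{align*}
% The norm is scale invariant exactly when 2/q + 3/p = 1 + s.
```

In this sense the class in (1.10) sits at the same scaling level as the Serrin classes (1.2) and (1.3), which correspond to $s=0$ and $s=1$.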

Remark 1.1. By the embedding $L^p \hookrightarrow B^0_{p,\infty}$, we see that our result is an improvement of (1.6) and (1.8). In addition, we establish the regularity criterion of the weak solution for the MHD equations in the framework of Besov spaces with negative index in terms of the velocity only. On the other hand, the method in this paper can be applied to the generalized MHD equations; we refer to [22,23] for details.

Remark 1.2. In the case of $s = 0$ or $s = 1$, Kozono, Ogawa and Taniuchi [14] proved similar results for the Navier-Stokes equations by using the logarithmic Sobolev inequality in the Besov spaces. However, if we try to use their method in our case, we can only obtain the regularity criterion in terms of both the velocity field $u$ and the magnetic field $b$.

Remark 1.3. Chen, Miao and Zhang [4] proved the marginal case $(p,s) = (\infty,1)$ by using a different argument. However, the method of [4] cannot be applied to the present case either.

Remark 1.4. The regularity of the weak solution $(u,b)$ under the condition

\[
u \in C(0,T; B^{-1}_{\infty,\infty})
\tag{1.11}
\]

remains unknown. One easily checks that it is the special case of the endpoint case of (1.10) in Theorem 1.1 with $s = -1$.

Notation. Throughout the paper, $C$ stands for a generic constant. We will use the notation $A \lesssim B$ to denote the relation $A \le CB$ and the notation $A \approx B$ to denote the relations $A \lesssim B$ and $B \lesssim A$. Further, $\|\cdot\|_p$ denotes the norm of the Lebesgue space $L^p$.


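2. Preliminaries

In this section, we recall some basic facts on Littlewood-Paley theory; one may check [5] for more details. Let $\mathcal{S}(\mathbb{R}^3)$ be the Schwartz class of rapidly decreasing functions. Given $f \in \mathcal{S}(\mathbb{R}^3)$, its Fourier transform $\mathcal{F}f = \hat f$ is defined by

\[
\hat f(\xi) = (2\pi)^{-\frac{3}{2}} \int_{\mathbb{R}^3} e^{-ix\cdot\xi} f(x)\,dx.
\]

Choose two nonnegative radial functions $\chi, \varphi \in \mathcal{S}(\mathbb{R}^3)$ supported respectively in $\mathcal{B} = \{\xi\in\mathbb{R}^3,\ |\xi| \le \frac{4}{3}\}$ and $\mathcal{C} = \{\xi\in\mathbb{R}^3,\ \frac{3}{4} \le |\xi| \le \frac{8}{3}\}$ such that

\[
\chi(\xi) + \sum_{j\ge 0} \varphi(2^{-j}\xi) = 1, \qquad \xi\in\mathbb{R}^3.
\]

Set $\varphi_j(\xi) = \varphi(2^{-j}\xi)$ and let $h = \mathcal{F}^{-1}\varphi$ and $\tilde h = \mathcal{F}^{-1}\chi$. Define the frequency localization operators:

\[
\begin{aligned}
\Delta_j f &= \varphi(2^{-j}D) f = 2^{3j}\int_{\mathbb{R}^3} h(2^j y) f(x-y)\,dy, \quad\text{for } j \ge 0,\\
S_j f &= \chi(2^{-j}D) f = \sum_{-1\le k\le j-1} \Delta_k f = 2^{3j}\int_{\mathbb{R}^3} \tilde h(2^j y) f(x-y)\,dy, \quad\text{and}\\
\Delta_{-1} f &= S_0 f, \qquad \Delta_j f = 0 \ \text{ for } j \le -2.
\end{aligned}
\tag{2.1}
\]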

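Formally, $\Delta_j = S_{j+1} - S_j$ is a frequency projection into the annulus $\{|\xi| \approx 2^j\}$, and $S_j$ is a frequency projection into the ball $\{|\xi| \lesssim 2^j\}$. One easily verifies that with the above choice of $\varphi$,

\[
\Delta_j\Delta_k f \equiv 0 \ \text{ if } |j-k| \ge 2 \quad\text{and}\quad \Delta_j(S_{k-1}f\,\Delta_k f) \equiv 0 \ \text{ if } |j-k| \ge 5.
\tag{2.2}
\]

We now introduce the following definition of inhomogeneous Besov spaces by means of the Littlewood-Paley projections $\Delta_j$ and $S_j$:

Definition 2.1. Let $s\in\mathbb{R}$, $1 \le p, q \le \infty$. The inhomogeneous Besov space $B^s_{p,q}$ is defined by

\[
B^s_{p,q} = \{ f \in \mathcal{S}'(\mathbb{R}^3);\ \|f\|_{B^s_{p,q}} < \infty \},
\]

where

\[
\|f\|_{B^s_{p,q}} =
\begin{cases}
\Big( \displaystyle\sum_{j=-1}^{\infty} 2^{jsq}\,\|\Delta_j f\|_{L^p}^q \Big)^{\frac{1}{q}}, & \text{for } q < \infty,\\[8pt]
\displaystyle\sup_{j\ge -1} 2^{js}\,\|\Delta_j f\|_{L^p}, & \text{for } q = \infty.
\end{cases}
\]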


Let us point out that $B^s_{2,2}$ is the usual Sobolev space $H^s$ and that $B^s_{\infty,\infty}$ is the usual Hölder space $C^s$ for $s \in \mathbb{R}\setminus\mathbb{Z}$. We refer to [18] for more details.

We now recall the para-differential calculus which enables us to define a generalized product between distributions, which is continuous in many functional spaces where the usual product does not make sense (see [2]). The paraproduct between $u$ and $v$ is defined by

\[
T_u v \triangleq \sum_{j} S_{j-1}u\,\Delta_j v.
\tag{2.3}
\]

Formally, we have the following Bony’s decomposition:

\[
uv = T_u v + T_v u + R(u,v),
\tag{2.4}
\]

with

\[
R(u,v) = \sum_{|j'-j|\le 1} \Delta_j u\,\Delta_{j'} v,
\]

and we also denote $T'_u v \triangleq T_u v + R(u,v)$.

Let us conclude this section by recalling Bernstein’s inequality, which will be frequently used in the proof of Theorem 1.1.

Lemma 2.1. [5] Let $1 \le p \le q \le \infty$. Assume that $f \in L^p$; then there exists a constant $C$ independent of $f$, $j$ such that

\[
\operatorname{supp}\hat f \subset \{\xi : |\xi| \lesssim 2^j\} \ \Longrightarrow\ \|\partial^\alpha f\|_{L^q} \le C\,2^{j|\alpha| + 3j(\frac{1}{p}-\frac{1}{q})}\,\|f\|_{L^p},
\tag{2.5}
\]

\[
\operatorname{supp}\hat f \subset \{\xi : |\xi| \approx 2^j\} \ \Longrightarrow\ \|f\|_{L^p} \le C\,2^{-j|\alpha|} \sup_{|\beta|=|\alpha|} \|\partial^\beta f\|_{L^p}.
\tag{2.6}
\]
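Two instances of (2.5) are used repeatedly in Section 3. The sketch below spells them out as an illustration; the constant $C$ is generic and the summation index $k''$ is a label introduced here. The second line is the triangle inequality combined with (2.5), using $S_{k'-1} = \sum_{-1\le k''\le k'-2}\Delta_{k''}$ from (2.1).

```latex
\begin{align*}
\|\Delta_j f\|_{L^\infty}
  &\le C\,2^{\frac{3j}{p}}\,\|\Delta_j f\|_{L^p}
  && (\alpha = 0,\ q = \infty \text{ in } (2.5)),\\
\|\nabla S_{k'-1}u\|_{L^\infty}
  &\le \sum_{k''=-1}^{k'-2}\|\nabla\Delta_{k''}u\|_{L^\infty}
   \le C\sum_{k''=-1}^{k'-2} 2^{k''}\,\|\Delta_{k''}u\|_{L^\infty}
  && (|\alpha| = 1,\ p = q = \infty \text{ in } (2.5)).
\end{align*}
```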

3. Proof of Theorem 1.1

Since the weak solution $(u(t),b(t)) \in L^2(0,T;H^1(\mathbb{R}^3))$, for any time interval $(0,\delta)$ there exists an $\varepsilon\in(0,\delta)$ such that $(u(\varepsilon),b(\varepsilon)) \in H^1(\mathbb{R}^3)$. It is well known that there exist a maximal existence time $T_0 > 0$ and a unique strong solution

\[
(\tilde u(t),\tilde b(t)) \in X(\varepsilon,T_0) \triangleq C([\varepsilon,T_0);H^1(\mathbb{R}^3)) \cap C^1((\varepsilon,T_0);H^1(\mathbb{R}^3)) \cap C((\varepsilon,T_0);H^3(\mathbb{R}^3)),
\]

which is the same as the weak solution $(u,b)$ on $(\varepsilon,T_0)$ [8,17]. In order to complete the proof of Theorem 1.1, it suffices to show that the strong solution $(u(t),b(t))$ can be extended after $t = T_0$ in the class $X(\varepsilon,T_0)$ under the condition of Theorem 1.1. For convenience, we set $\nu = \eta = 1$ and $\varepsilon = 0$ in what follows.

We denote $u_k = \Delta_k u$, $b_k = \Delta_k b$, $\pi_k = \Delta_k\pi$, where $\pi = p + \frac{1}{2}b^2$. Applying the operator $\Delta_k$ to both sides of (1.1), we get

\[
\begin{cases}
\partial_t u_k - \Delta u_k + \Delta_k(u\cdot\nabla u) - \Delta_k(b\cdot\nabla b) = -\nabla\pi_k,\\
\partial_t b_k - \Delta b_k + \Delta_k(u\cdot\nabla b) - \Delta_k(b\cdot\nabla u) = 0.
\end{cases}
\tag{3.1}
\]


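Multiplying the first equation of (3.1) by $u_k$ and the second one of (3.1) by $b_k$, we obtain by Lemma 2.1 for $k \ge 0$ that

\[
\frac{1}{2}\frac{d}{dt}\|u_k(t)\|_2^2 + c\,2^{2k}\|u_k(t)\|_2^2 = -\langle \Delta_k(u\cdot\nabla u), u_k\rangle + \langle \Delta_k(b\cdot\nabla b), u_k\rangle,
\tag{3.2}
\]

\[
\frac{1}{2}\frac{d}{dt}\|b_k(t)\|_2^2 + c\,2^{2k}\|b_k(t)\|_2^2 = -\langle \Delta_k(u\cdot\nabla b), b_k\rangle + \langle \Delta_k(b\cdot\nabla u), b_k\rangle.
\tag{3.3}
\]

Set

\[
F_k(t) \triangleq \big( \|u_k(t)\|_2^2 + \|b_k(t)\|_2^2 \big)^{\frac{1}{2}}.
\]

Then we get by adding (3.2) and (3.3) that

\[
\frac{1}{2}\frac{d}{dt}F_k(t)^2 + c\,2^{2k}F_k(t)^2 = -\langle \Delta_k(u\cdot\nabla u), u_k\rangle + \langle \Delta_k(b\cdot\nabla b), u_k\rangle - \langle \Delta_k(u\cdot\nabla b), b_k\rangle + \langle \Delta_k(b\cdot\nabla u), b_k\rangle.
\tag{3.4}
\]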
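The right-hand side of (3.4) is rewritten next in commutator form. The cancellations behind that step can be checked directly; the following is a short sketch, assuming enough decay at infinity to integrate by parts without boundary terms, with $\langle\cdot,\cdot\rangle$ denoting the $L^2(\mathbb{R}^3)$ inner product.

```latex
\begin{align*}
\langle u\cdot\nabla u_k,\,u_k\rangle
  &= \tfrac{1}{2}\int_{\mathbb{R}^3} u\cdot\nabla|u_k|^2\,dx
   = -\tfrac{1}{2}\int_{\mathbb{R}^3} (\nabla\cdot u)\,|u_k|^2\,dx = 0,\\
\langle b\cdot\nabla b_k,\,u_k\rangle + \langle b\cdot\nabla u_k,\,b_k\rangle
  &= \int_{\mathbb{R}^3} b\cdot\nabla(b_k\cdot u_k)\,dx
   = -\int_{\mathbb{R}^3} (\nabla\cdot b)\,(b_k\cdot u_k)\,dx = 0.
\end{align*}
% Consequently, with [u,\Delta_k]\cdot\nabla u = u\cdot\nabla\Delta_k u - \Delta_k(u\cdot\nabla u),
\begin{align*}
-\langle \Delta_k(u\cdot\nabla u),\,u_k\rangle
  = \langle [u,\Delta_k]\cdot\nabla u,\,u_k\rangle - \langle u\cdot\nabla u_k,\,u_k\rangle
  = \langle [u,\Delta_k]\cdot\nabla u,\,u_k\rangle .
\end{align*}
```

The same computation with $u$ replaced by $b$ in the appropriate slots handles the three remaining terms, where the two mixed terms cancel only after being added together.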

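Noting that

\[
\langle u\cdot\nabla u_k, u_k\rangle = \langle u\cdot\nabla b_k, b_k\rangle = 0, \qquad \langle b\cdot\nabla b_k, u_k\rangle + \langle b\cdot\nabla u_k, b_k\rangle = 0,
\]

the right-hand side of (3.4) can be written as

\[
\langle [u,\Delta_k]\cdot\nabla u, u_k\rangle - \langle [b,\Delta_k]\cdot\nabla b, u_k\rangle + \langle [u,\Delta_k]\cdot\nabla b, b_k\rangle - \langle [b,\Delta_k]\cdot\nabla u, b_k\rangle \triangleq I + II + III + IV,
\]

where $[A,B] \triangleq AB - BA$. Using Bony’s decomposition (2.4), we rewrite $I$ as

\[
I = \langle [T_{u^i},\Delta_k]\partial_i u, u_k\rangle + \langle T'_{\Delta_k\partial_i u}u^i, u_k\rangle - \langle \Delta_k T_{\partial_i u}u^i, u_k\rangle - \langle \Delta_k R(u^i,\partial_i u), u_k\rangle \triangleq I_1 + I_2 + I_3 + I_4.
\]

In view of the support of the Fourier transform of the term $T_{\partial_i u}u^i$, we have

\[
\Delta_k T_{\partial_i u}u^i = \sum_{|k'-k|\le 4} \Delta_k\big( S_{k'-1}(\partial_i u)\,u^i_{k'} \big).
\]

This helps us to get, using Lemma 2.1,

\[
|I_3| \lesssim \|u_k\|_2 \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}u\|_\infty \|u_{k'}\|_2.
\tag{3.5}
\]

Since $\operatorname{div} u = 0$, we have

\[
\Delta_k R(u^i,\partial_i u) = \sum_{k',k''\ge k-2;\ |k'-k''|\le 1} \partial_i\Delta_k\big( \Delta_{k'}u^i\,\Delta_{k''}u \big).
\]

This together with Lemma 2.1 yields

\[
|I_4| \lesssim 2^k \|u_k\|_\infty \sum_{k'\ge k-2} \|u_{k'}\|_2^2.
\tag{3.6}
\]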


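Using the definition of $T'_{\Delta_k\partial_i u}u^i$, we have

\[
T'_{\Delta_k\partial_i u}u^i = \sum_{k'\ge k-2} S_{k'+2}\Delta_k\partial_i u\,\Delta_{k'}u^i.
\]

Noting that $S_{k'+2}\Delta_k u = \Delta_k u$ for $k' > k$, we get

\[
I_2 = \sum_{k-2\le k'\le k} \langle S_{k'+2}\Delta_k\partial_i u\,\Delta_{k'}u^i, u_k\rangle,
\]

from which and Lemma 2.1, it follows that

\[
I_2 \lesssim \|u_k\|_2 \sum_{|k'-k|\le 2} \|\nabla S_{k'-1}u\|_\infty \|u_{k'}\|_2 + 2^k\|u_k\|_\infty \sum_{k'\ge k-2} \|u_{k'}\|_2^2.
\tag{3.7}
\]

Making use of the definition of $\Delta_k$, we have

\[
\begin{aligned}
[T_{u^i},\Delta_k]\partial_i u &= \sum_{|k'-k|\le 4} [S_{k'-1}u^i,\Delta_k]\partial_i u_{k'}\\
&= \sum_{|k'-k|\le 4} 2^{3k}\int_{\mathbb{R}^3} h(2^k(x-y))\big( S_{k'-1}u^i(x) - S_{k'-1}u^i(y) \big)\partial_i u_{k'}(y)\,dy\\
&= \sum_{|k'-k|\le 4} 2^{4k}\int_{\mathbb{R}^3}\!\int_0^1 y\cdot\nabla S_{k'-1}u^i(x-\tau y)\,d\tau\ \partial_i h(2^k y)\,u_{k'}(x-y)\,dy,
\end{aligned}
\]

from which and the Minkowski inequality, we infer that

\[
|I_1| \lesssim \|u_k\|_2 \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}u\|_\infty \|u_{k'}\|_2.
\tag{3.8}
\]

By summing up (3.5)–(3.8), we obtain

\[
|I| \lesssim \|u_k\|_2 \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}u\|_\infty \|u_{k'}\|_2 + 2^k\|u_k\|_\infty \sum_{k'\ge k-2} \|u_{k'}\|_2^2.
\tag{3.9}
\]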

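Arguments similar to those used in deriving (3.9) give

\[
\begin{aligned}
|II + IV| \lesssim{}& \|u_k\|_\infty \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}b\|_2 \|b_{k'}\|_2 + 2^k\|u_k\|_\infty \sum_{k'\ge k-2} \|b_{k'}\|_2^2\\
&+ \|b_k\|_2 \sum_{|k'-k|\le 4} \big( \|\nabla S_{k'-1}u\|_\infty \|b_{k'}\|_2 + \|\nabla S_{k'-1}b\|_2 \|u_{k'}\|_\infty \big)\\
&+ 2^k\|b_k\|_2 \sum_{k',k''\ge k-2;\ |k'-k''|\le 1} \|u_{k'}\|_\infty \|b_{k''}\|_2,
\end{aligned}
\tag{3.10}
\]

and

\[
\begin{aligned}
|III| \lesssim{}& \|b_k\|_2 \sum_{|k'-k|\le 4} \big( \|\nabla S_{k'-1}u\|_\infty \|b_{k'}\|_2 + \|\nabla S_{k'-1}b\|_2 \|u_{k'}\|_\infty \big)\\
&+ 2^k\|b_k\|_2 \sum_{k',k''\ge k-2;\ |k'-k''|\le 1} \|u_{k'}\|_\infty \|b_{k''}\|_2.
\end{aligned}
\tag{3.11}
\]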


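Combining (3.9)–(3.11) with (3.4), we easily get for $k \ge 0$ that

\[
\begin{aligned}
\frac{d}{dt}F_k(t)^2 + 2^{2k}F_k(t)^2 \lesssim{}& (\|u_k\|_2 + \|b_k\|_2) \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}u\|_\infty (\|u_{k'}\|_2 + \|b_{k'}\|_2)\\
&+ \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}b\|_2 \big( \|b_k\|_2\|u_{k'}\|_\infty + \|u_k\|_\infty\|b_{k'}\|_2 \big)\\
&+ 2^k\|b_k\|_2 \sum_{k',k''\ge k-2;\ |k'-k''|\le 1} \|u_{k'}\|_\infty \|b_{k''}\|_2\\
&+ 2^k\|u_k\|_\infty \sum_{k'\ge k-2} \big( \|b_{k'}\|_2^2 + \|u_{k'}\|_2^2 \big).
\end{aligned}
\tag{3.12}
\]

Making use of $B^{\frac{2}{q}+\frac{3}{p}-1}_{p,\infty}(\mathbb{R}^3) \hookrightarrow B^{\frac{2}{q}-1}_{\infty,\infty}(\mathbb{R}^3)$, we only need to deal with the case when $p = +\infty$, since the other cases can be deduced from it by the above Sobolev embedding. Here we omit the details. By the restrictions on $p$, $q$, $s$, we see that $s = \frac{2}{q} - 1$ and $q \in (1,+\infty)$.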
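The embedding used to reduce to $p = +\infty$ is a direct consequence of Bernstein's inequality (2.5). A minimal sketch of the omitted detail, with $f$ a generic tempered distribution and $C$ a generic constant:

```latex
\begin{align*}
2^{j(s-\frac{3}{p})}\,\|\Delta_j f\|_{L^\infty}
  \le C\,2^{j(s-\frac{3}{p})}\,2^{\frac{3j}{p}}\,\|\Delta_j f\|_{L^p}
  = C\,2^{js}\,\|\Delta_j f\|_{L^p}
  \quad (j \ge -1),
  \qquad\text{hence}\qquad
  \|f\|_{B^{s-\frac{3}{p}}_{\infty,\infty}} \le C\,\|f\|_{B^{s}_{p,\infty}} .
\end{align*}
% With 2/q + 3/p = 1 + s one has s - 3/p = 2/q - 1, which is the index appearing above.
```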

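Case 1. $q \in (1,2]$. Integrating (3.12) with respect to $t$, we deduce that

\[
\begin{aligned}
F_k(t)^2 - F_k(0)^2 + 2^{2k}\int_0^t F_k(\tau)^2\,d\tau \lesssim{}& \int_0^t (\|u_k\|_2 + \|b_k\|_2) \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}u\|_\infty (\|u_{k'}\|_2 + \|b_{k'}\|_2)\,d\tau\\
&+ \int_0^t \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}b\|_2 \big( \|b_k\|_2\|u_{k'}\|_\infty + \|u_k\|_\infty\|b_{k'}\|_2 \big)\,d\tau\\
&+ \int_0^t 2^k\|b_k\|_2 \sum_{k',k''\ge k-2;\ |k'-k''|\le 1} \|u_{k'}\|_\infty \|b_{k''}\|_2\,d\tau\\
&+ \int_0^t 2^k\|u_k\|_\infty \sum_{k'\ge k-2} \big( \|b_{k'}\|_2^2 + \|u_{k'}\|_2^2 \big)\,d\tau\\
\triangleq{}& \Pi_1 + \Pi_2 + \Pi_3 + \Pi_4.
\end{aligned}
\tag{3.13}
\]

Take $\rho \in (\frac{1}{2},1)$ and set

\[
A(t) \triangleq \sup_{k\ge -1} 2^{k\rho}F_k(t), \qquad B(t) = \sup_{k\ge -1} 2^{2k(\rho+1)}\int_0^t F_k(\tau)^2\,d\tau.
\]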

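We get by using Lemma 2.1 that

\[
\begin{aligned}
2^{2k\rho}\,\Pi_1 &\lesssim 2^{2k\rho}\int_0^t (\|u_k\|_2 + \|b_k\|_2) \sum_{|k'-k|\le 4}\ \sum_{k''=-1}^{k'-2} \|u_{k''}\|_\infty 2^{k''} (\|u_{k'}\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)\,2^{k\rho}(\|u_k\|_2 + \|b_k\|_2) \sum_{|k'-k|\le 4} 2^{-k'\rho} \sum_{k''=-1}^{k'-2} 2^{k''(2-\frac{2}{q})}\,d\tau\\
&\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)\,2^{k(\rho+2-\frac{2}{q})} F_k(\tau)\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^2\,d\tau \Big)^{\frac{1}{q}} B(t)^{1-\frac{1}{q}},
\end{aligned}
\tag{3.14}
\]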
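The last inequality in (3.14) combines an interpolation between the weights defining $A$ and $B$ with Hölder's inequality in time. The following sketch makes the exponents explicit; it is one way to fill in the step, under the standing assumptions $1 < q \le 2$ and $s = \frac{2}{q}-1$, and is not quoted from the argument above.

```latex
\begin{align*}
2^{k(\rho+2-\frac{2}{q})}F_k
  &= \big(2^{k\rho}F_k\big)^{\frac{2}{q}-1}\,\big(2^{k(\rho+1)}F_k\big)^{2-\frac{2}{q}}
   \le A(\tau)^{\frac{2}{q}-1}\,\big(2^{k(\rho+1)}F_k\big)^{2-\frac{2}{q}},\\
\int_0^t \|u\|_{B^s_{\infty,\infty}}\,A^{\frac{2}{q}}\,
     \big(2^{k(\rho+1)}F_k\big)^{2(1-\frac{1}{q})}\,d\tau
  &\le \Big(\int_0^t \|u\|_{B^s_{\infty,\infty}}^q A^{2}\,d\tau\Big)^{\frac{1}{q}}
       \Big(\int_0^t 2^{2k(\rho+1)}F_k^{2}\,d\tau\Big)^{1-\frac{1}{q}}
   \le \Big(\int_0^t \|u\|_{B^s_{\infty,\infty}}^q A^{2}\,d\tau\Big)^{\frac{1}{q}} B(t)^{1-\frac{1}{q}}.
\end{align*}
% Exponent check: \rho(2/q - 1) + (\rho + 1)(2 - 2/q) = \rho + 2 - 2/q,
% and 2/q - 1 \ge 0 is exactly the restriction q \le 2.
```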

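where we used the fact that $1 < q \le 2$ in the last two inequalities. Similarly, we get by using Lemma 2.1 and the fact that $\rho < 1$ and $1 < q \le 2$ that

\[
\begin{aligned}
2^{2k\rho}\,\Pi_2 &\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}\,2^{k(1-\frac{2}{q}+2\rho)} \sum_{|k'-k|\le 4}\ \sum_{k''=-1}^{k'-2} \|b_{k''}\|_2\,2^{k''} (\|b_k\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)\,2^{k(1-\frac{2}{q}+2\rho)} \sum_{|k'-k|\le 4} 2^{k'(1-\rho)} (\|b_k\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^2\,d\tau \Big)^{\frac{1}{q}} B(t)^{1-\frac{1}{q}},
\end{aligned}
\tag{3.15}
\]

\[
\begin{aligned}
2^{2k\rho}\,\Pi_3 &\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}\,2^{k(2\rho+1)}\|b_k\|_2 \sum_{k'\ge k-2} \|b_{k'}\|_2\,2^{k'(1-\frac{2}{q})}\,d\tau\\
&\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)\,2^{k(2\rho+1)}\|b_k\|_2 \sum_{k'\ge k-2} 2^{k'(1-\frac{2}{q}-\rho)}\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^2\,d\tau \Big)^{\frac{1}{q}} B(t)^{1-\frac{1}{q}},
\end{aligned}
\tag{3.16}
\]

\[
\begin{aligned}
2^{2k\rho}\,\Pi_4 &\lesssim \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}\,2^{k(2\rho+2-\frac{2}{q})} \sum_{k'\ge k-2} \big( \|b_{k'}\|_2^2 + \|u_{k'}\|_2^2 \big)\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^2\,d\tau \Big)^{\frac{1}{q}} B(t)^{1-\frac{1}{q}}.
\end{aligned}
\tag{3.17}
\]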

On the other hand, the strong solution $(u,b)$ also satisfies the energy equality

\[
\|u(t)\|_2^2 + \|b(t)\|_2^2 + 2\int_0^t \big( \|\nabla u(\tau)\|_2^2 + \|\nabla b(\tau)\|_2^2 \big)\,d\tau = \|u_0\|_2^2 + \|b_0\|_2^2,
\]

hence, we have

\[
\|\Delta_{-1}u(t)\|_2 + \|\Delta_{-1}b(t)\|_2 \le C(\|u_0\|_2 + \|b_0\|_2).
\tag{3.18}
\]

Thus, summing up (3.13)–(3.18), we get by Young’s inequality that

\[
A(t)^2 \le C\big( A(0)^2 + \|u_0\|_2^2 + \|b_0\|_2^2 \big) + C\int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^2\,d\tau,
\]

which together with the Gronwall inequality yields that

\[
A(t)^2 \le C\big( A(0)^2 + \|u_0\|_2^2 + \|b_0\|_2^2 \big) \exp\Big( C\int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q\,d\tau \Big).
\tag{3.19}
\]
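The passage from the estimates of $\Pi_1,\dots,\Pi_4$ to (3.19) uses two standard tools. They are recorded below as a sketch in generic form; the auxiliary symbols $X(t)$, $\delta$, $C_\delta$, $y$, $y_0$ and $g$ are introduced here for illustration only.

```latex
% Young's inequality with a small parameter \delta > 0, applied to the common
% right-hand side of (3.14)-(3.17):
\begin{align*}
C\,X(t)^{\frac{1}{q}}\,B(t)^{1-\frac{1}{q}} \le C_\delta\,X(t) + \delta\,B(t),
\qquad X(t) := \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^2\,d\tau ,
\end{align*}
% so that the term \delta B(t) can be absorbed by the dissipative term on the left of (3.13).
% Gronwall's inequality in integral form, for g \ge 0 integrable and a constant y_0 \ge 0:
\begin{align*}
y(t) \le y_0 + C\int_0^t g(\tau)\,y(\tau)\,d\tau
\ \Longrightarrow\
y(t) \le y_0\,\exp\Big(C\int_0^t g(\tau)\,d\tau\Big).
\end{align*}
% Here y = A^2, g = \|u\|_{B^s_{\infty,\infty}}^q and y_0 = C(A(0)^2 + \|u_0\|_2^2 + \|b_0\|_2^2).
```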

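Case 2. $q \in (2,+\infty)$. Multiplying (3.12) by $2^{2k(q-1)}F_k(t)^{2(q-1)}$, then integrating the resulting equation with respect to $t$, leads to the following for $k \ge 0$:

\[
\begin{aligned}
&2^{2k(q-1)}F_k(t)^{2q} - 2^{2k(q-1)}F_k(0)^{2q} + 2^{2kq}\int_0^t F_k(\tau)^{2q}\,d\tau\\
&\quad\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)} (\|u_k\|_2 + \|b_k\|_2) \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}u\|_\infty (\|u_{k'}\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\qquad+ \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)} \sum_{|k'-k|\le 4} \|\nabla S_{k'-1}b\|_2 \big( \|b_k\|_2\|u_{k'}\|_\infty + \|u_k\|_\infty\|b_{k'}\|_2 \big)\,d\tau\\
&\qquad+ \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,2^k\|b_k\|_2 \sum_{k',k''\ge k-2;\ |k'-k''|\le 1} \|u_{k'}\|_\infty \|b_{k''}\|_2\,d\tau\\
&\qquad+ \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,2^k\|u_k\|_\infty \sum_{k'\ge k-2} \big( \|b_{k'}\|_2^2 + \|u_{k'}\|_2^2 \big)\,d\tau\\
&\quad\triangleq \Pi_1 + \Pi_2 + \Pi_3 + \Pi_4.
\end{aligned}
\tag{3.20}
\]

Set

\[
A(t) \triangleq \sup_{k\ge -1} 2^{(1-\frac{1}{q})k}F_k(t), \qquad B(t) \triangleq \sup_{k\ge -1} 2^{2kq}\int_0^t F_k(\tau)^{2q}\,d\tau.
\]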

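We obtain by Lemma 2.1 that

\[
\begin{aligned}
\Pi_1 &\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)} (\|u_k\|_2 + \|b_k\|_2) \sum_{|k'-k|\le 4}\ \sum_{k''=-1}^{k'-2} \|u_{k''}\|_\infty 2^{k''} (\|u_{k'}\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)^2\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^{2q}\,d\tau \Big)^{\frac{1}{q}} B(t)^{\frac{q-1}{q}}.
\end{aligned}
\tag{3.21}
\]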
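The last step of (3.21) is Hölder's inequality in time with exponents $q$ and $\frac{q}{q-1}$, after rewriting the weight in terms of the integrand of $B(t)$. A short sketch of that step, stated for a fixed $k \ge 0$:

```latex
\begin{align*}
2^{2k(q-1)}F_k^{2(q-1)} &= \big(2^{2kq}F_k^{2q}\big)^{\frac{q-1}{q}},\\
\int_0^t 2^{2k(q-1)}F_k^{2(q-1)}\,\|u\|_{B^s_{\infty,\infty}}\,A^{2}\,d\tau
  &\le \Big(\int_0^t \|u\|_{B^s_{\infty,\infty}}^q A^{2q}\,d\tau\Big)^{\frac{1}{q}}
       \Big(\int_0^t 2^{2kq}F_k^{2q}\,d\tau\Big)^{\frac{q-1}{q}}
   \le \Big(\int_0^t \|u\|_{B^s_{\infty,\infty}}^q A^{2q}\,d\tau\Big)^{\frac{1}{q}} B(t)^{\frac{q-1}{q}}.
\end{align*}
```

The same two lines close each of the remaining estimates in Case 2 once the integrand has been reduced to the form $2^{2k(q-1)}F_k^{2(q-1)}\|u\|_{B^s_{\infty,\infty}}A^2$.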

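Similarly, we get by using Lemma 2.1 and the fact that $q < +\infty$ that

\[
\begin{aligned}
\Pi_2 &\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)\,2^{k(1-\frac{2}{q})} \sum_{|k'-k|\le 4}\ \sum_{k''=-1}^{k'-2} 2^{\frac{k''}{q}} (\|b_k\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau) \sum_{|k'-k|\le 4} 2^{k(1-\frac{1}{q})} (\|b_k\|_2 + \|b_{k'}\|_2)\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^{2q}\,d\tau \Big)^{\frac{1}{q}} B(t)^{\frac{q-1}{q}},
\end{aligned}
\tag{3.22}
\]

\[
\begin{aligned}
\Pi_3 &\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} \sum_{k'\ge k-2} \|b_{k'}\|_2\,2^{k'(1-\frac{2}{q})}\,2^k\|b_k\|_2\,d\tau\\
&\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau) \sum_{k'\ge k-2} 2^{-\frac{k'}{q}}\,2^k\|b_k\|_2\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^{2q}\,d\tau \Big)^{\frac{1}{q}} B(t)^{\frac{q-1}{q}},
\end{aligned}
\tag{3.23}
\]

\[
\begin{aligned}
\Pi_4 &\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} \times \sum_{k'\ge k-2} \big( \|b_{k'}\|_2^2 + \|u_{k'}\|_2^2 \big)\,2^{k'(2-\frac{2}{q})}\,2^{(k-k')(2-\frac{2}{q})}\,d\tau\\
&\lesssim \int_0^t 2^{2k(q-1)}F_k(\tau)^{2(q-1)}\,\|u(\tau)\|_{B^s_{\infty,\infty}} A(\tau)^2\,d\tau\\
&\lesssim \Big( \int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^{2q}\,d\tau \Big)^{\frac{1}{q}} B(t)^{\frac{q-1}{q}}.
\end{aligned}
\tag{3.24}
\]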

Thus, combining (3.20)–(3.24) with (3.18) and using Young’s inequality, we obtain

\[
A(t)^{2q} \le C\big( A(0)^{2q} + \|u_0\|_2^{2q} + \|b_0\|_2^{2q} \big) + C\int_0^t \|u(\tau)\|_{B^s_{\infty,\infty}}^q A(\tau)^{2q}\,d\tau.
\]

This together with the Gronwall inequality yields that

\[
\sup_{t\in[0,T]} A(t)^{2q} \le C\big( A(0)^{2q} + \|u_0\|_2^{2q} + \|b_0\|_2^{2q} \big) \exp\Big( C\int_0^T \|u(\tau)\|_{B^s_{\infty,\infty}}^q\,d\tau \Big).
\tag{3.25}
\]

By means of (3.19) and (3.25), it follows that there exists $\tilde\rho > \frac{1}{2}$ such that

\[
\sup_{t\in[0,T_0]} \big( \|u(t)\|_{H^{\tilde\rho}} + \|b(t)\|_{H^{\tilde\rho}} \big) < +\infty,
\]

by the Sobolev embeddings $B^{\rho}_{2,\infty}(\mathbb{R}^3) \hookrightarrow H^{\tilde\rho}(\mathbb{R}^3)$ with $\rho > \tilde\rho$ and $B^{1-\frac{1}{q}}_{2,\infty}(\mathbb{R}^3) \hookrightarrow H^{\tilde\rho}(\mathbb{R}^3)$ with $q > 2$. Thus, the standard Picard’s method [10,13] ensures that the solution $(u,b)$ can be extended after $t = T_0$ in the class $X(0,T_0)$. This completes the proof of Theorem 1.1.

Acknowledgements. The authors thank the referees and the associate editor for their invaluable comments and suggestions which helped improve the paper greatly. Q. Chen, C. Miao and Z. Zhang were supported by the NSF of China under grant No. 10701012, No. 10725102 and No. 10601002.


References

1. Beirão da Veiga, H.: A new regularity class for the Navier-Stokes equations in $\mathbb{R}^n$. Chinese Ann. of Math. 16B, 407–412 (1995)
2. Bony, J.-M.: Calcul symbolique et propagation des singularités pour les équations aux dérivées partielles non linéaires. Ann. Sci. École Norm. Sup. 14, 209–246 (1981)
3. Chen, Q., Zhang, Z.: Space-time estimates in the Besov spaces and the Navier-Stokes equations. Meth. Appl. Anal. 13, 107–122 (2006)
4. Chen, Q., Miao, C., Zhang, Z.: The Beale-Kato-Majda criterion to the 3D Magneto-hydrodynamics equations. Commun. Math. Phys. 275, 861–872 (2007)
5. Chemin, J.-Y.: Perfect Incompressible Fluids. New York: Oxford University Press, 1998
6. Cheskidov, A., Shvydkoy, R.: On the regularity of weak solutions of the 3D Navier-Stokes equations in $B^{-1}_{\infty,\infty}$. http://arXiv.org/list/0708.3067v2, 2007
7. Constantin, P., Fefferman, C.: Direction of vorticity and the problem of global regularity for the Navier-Stokes equations. Indiana Univ. Math. J. 42, 775–788 (1993)
8. Duvaut, G., Lions, J.-L.: Inéquations en thermoélasticité et magnétohydrodynamique. Arch. Ratl. Mech. Anal. 46, 241–279 (1972)
9. Escauriaza, L., Seregin, G., Šverák, V.: $L_{3,\infty}$-solutions to the Navier-Stokes equations and backward uniqueness. Russ. Math. Surv. 58, 211–250 (2003)
10. Fujita, H., Kato, T.: On the Navier-Stokes initial value problem I. Archiv. Rat. Mech. Anal. 16, 269–315 (1964)
11. Giga, Y.: Solutions for semilinear parabolic equations in $L^p$ and regularity of weak solutions of the Navier-Stokes system. J. Differ. Eq. 62, 186–212 (1986)
12. He, C., Xin, Z.: On the regularity of weak solutions to the magnetohydrodynamic equations. J. Diff. Eqs. 213, 235–254 (2005)
13. Kato, T.: Nonstationary flows of viscous and ideal fluids in $\mathbb{R}^3$. J. Funct. Anal. 9, 296–305 (1972)
14. Kozono, H., Ogawa, T., Taniuchi, Y.: The critical Sobolev inequalities in Besov spaces and regularity criterion to some semi-linear evolution equations. Math. Z. 242, 251–278 (2002)
15. Politano, H., Pouquet, A., Sulem, P.L.: Current and vorticity dynamics in three-dimensional magnetohydrodynamic turbulence. Phys. Plasmas 2, 2931–2939 (1995)
16. Serrin, J.: On the interior regularity of weak solutions of the Navier-Stokes equations. Arch. Rat. Mech. Anal. 9, 187–195 (1962)
17. Sermange, M., Temam, R.: Some mathematical questions related to the MHD equations. Comm. Pure Appl. Math. 36, 635–664 (1983)
18. Triebel, H.: Theory of Function Spaces. Monograph in Mathematics, Vol. 78, Basel: Birkhäuser Verlag, 1983
19. Von Wahl, W.: Regularity of weak solutions of the Navier-Stokes equations. Proc. Symp. Pure Appl. Math. 45, 497–503 (1986)
20. Wu, J.: Bounds and new approaches for the 3D MHD equations. J. Nonlinear Sci. 12, 395–413 (2002)
21. Wu, J.: Regularity results for weak solutions of the 3D MHD equations. Discrete. Contin. Dynam. Syst. 10, 543–556 (2004)
22. Wu, J.: Generalized MHD equations. J. Differ. Eqs. 195, 284–312 (2003)
23. Wu, J.: Regularity criteria for the generalized MHD equations. Comm. PDE 33(2), 285–306 (2008)
24. Zhou, Y.: Remarks on regularities for the 3D MHD equations. Discrete. Contin. Dynam. Syst. 12, 881–886 (2005)

Communicated by P. Constantin